Linda began. "As you know, we are in the midst of collating and analyzing the remaining DNA sequence data for human chromosomes four and five, which would complete our portion of the preliminary sequencing of the human genome. The meeting this morning was called to brainstorm over the current data, and to map out a plan of action." Looking directly at Adam, Linda continued, "Dr. Adam Dove will be joining us in a consulting capacity."

The announcement went largely without reaction, with several heads turning to look at Adam. He returned the curious glances with an innocent but studious façade as he pondered the confidence with which Linda assumed he would join the team. The meeting proceeded to detail various quirks and challenges surrounding chromatographic separation and identification procedures, sequencing options, purification issues, and other such technical bits typical of the project in general. As the presentations proceeded, Adam began reviewing what he actually knew about DNA.

The acronym was short for deoxyribonucleic acid, a polymer made up of four types of bases. The sequence of these four bases in a DNA strand represents all the instructions needed to build and maintain any biological organism. A set of three bases, a triplet, generally encodes a single amino acid, a building block for a protein that can be hundreds of amino acids long. The sequence of these triplets along the DNA constitutes the genetic code. Each chromosome holds one DNA chain, and a chunk of DNA that codes for a specific protein is called a gene.

That's definitely all I know.

He guessed that the total amount of DNA that describes an organism is called a genome.

"Adam, now that you have an appreciation of the challenges we face, I would like your opinion on something." 

Adam blinked, mentally filing away his DNA musings, and saw that the meeting room was empty.

It must have been a short meeting.

Linda was standing in front of him. With a smirk, she turned and waved for Adam to follow. They entered her laboratory together, where she stopped just inside, letting the door close behind them.

"As you may know, the Human Genome Project was designed to identify each and every gene, and in so doing, it would provide the world with a complete description, basically a roadmap of a human being. The idea was to determine the sequences of approximately three billion bases that make up our DNA, and then to figure out how many genes were present and what proteins they encoded. Information obtained from this project would be invaluable to us, especially in tackling genetic disorders and developing new ways to design medicines tailored to the individual's biochemistry."

Pausing for a moment to gather her thoughts, Linda furrowed her brow as she caught Adam closing his eyes. She cleared her throat and raised her hands to emphasize a point. "However, we have run into a few surprises. Originally we assumed that hundreds of thousands of genes made up the human genome. Earlier observations indicated that simpler organisms had a smaller number of genes. For example, bacteria and fungi range from about two to eight thousand genes. Fruit flies get up to about fourteen thousand, and mice are at twenty-five thousand. Although the Human Genome Project is not quite complete yet, our data indicate the gene count for humans to be fewer than thirty thousand."

"Not much more than in mice. Sounds a bit disappointing. What gives?"

"No one knows for sure, but it would appear that we may be using the same basic machinery present in lower species, but in a more complex way."

"Is that the issue that you'd like me to look at?"

Linda shook her head. "Not exactly. There's something else that has emerged which may be much more of a puzzle. It turns out that our genome is largely unused. That is, not only are we limited to thirty thousand genes, but recently we have discovered that a fairly large portion of our genome appears to be nonsense."

Linda took a step closer to Adam. When he spoke, his voice cracked. "Exactly how much of the genome are we talking about?"

"Our analysis suggests that only about two percent of all the sequences in our DNA code for protein. This leaves the rest of our genome with no identifiable purpose. Some scientists have called these sequences 'junk DNA.' I prefer the term 'non-coding DNA.' And that's where you come in."

Adam's eyebrows involuntarily arched as he asked, "And why would that be?"

"Well, experimentalists have been working on the possible function of non-coding DNA, and the most likely theories suggest that it may be responsible for the regulation of all the processes which go into creating and maintaining us, basically acting to make sure each step in the process of our growth and development takes place at the right time and in the right way. Personally, I think that's a neat explanation. However, there's no concrete proof … and, besides, the non-coding sequences are a bit strange."

"Strange? In what way exactly?"

"There are many instances where apparently random sequences repeat themselves in many locations. Actually, they seem to be both random and organized. One other thing … as I mentioned, lower, simpler organisms have fewer genes. However, lower organisms also have less non-coding DNA. The trend that we are seeing is that as an organism becomes more complex, that organism has an increased amount of non-coding DNA."

"Whew. So, how do I fit into the picture?"

"You've done a great deal of work in the area of structure-activity correlations, mathematical procedures designed to detect relationships between molecular structure and biological activity."

Adam nodded as Linda continued. "I have a feeling that there may be more to the non-coding DNA issue than just a random set of sequences coupled to vague theories of cellular regulation."

"You want me to look for patterns within these sequences?  Patterns that may correlate with some type of function?"

"You got it."

"Just how much non-coding sequence data do you have?"

"We are just now completing the sequencing for two chromosomes. The non-coding sections consist of about two hundred fifty million bases. We can get the sequence data covering the rest of the human genome from other labs. Altogether the mystery sequences contain close to three billion bases."

***

The cylinder turned and its engines fired in a programmed series of short bursts to begin a controlled process of deceleration. The G-forces needed to be carefully regulated. Excessive forces would not damage the cocooned occupants, but could pose problems for the few awake and on duty.

