I have been a biologist for many years, but my early college and graduate school training was in physics. One of the hardest parts of my transition from physics to the life sciences was the realization that it is not possible to deduce everything in biology from first principles. Instead, evolution involves a quite random series of changes in the genome of an organism, followed by selection of the phenotypic (apparent) changes that work best.
It took me some time to recover from the implication that one can’t use theory to predict how living organisms work. And maybe, in my musing about biological systems, I am still a recovering physicist. So I can’t help thinking that eukaryotic cells, in their own transition from primordial soup to now, somehow missed an elegant solution to a biochemical problem.
A process central to the functioning of any type of cell is information flow: the cell’s use of the information in DNA to produce its macromolecular products such as proteins and other building blocks. There are three well-known examples of this information flow. First, we have replication, where the DNA sequences of our chromosomes are employed for self-copying (in other words, making more DNA). Then there is transcription, in which the DNA sequence of a protein-coding gene is used as a template to produce messenger RNA (mRNA). And finally, there’s translation, when the mRNA is in turn used to direct the amino acid sequence of the final protein product.
In both prokaryotic cells (such as bacteria) and eukaryotic cells (such as our own), major steps in the types of information flow described above are transmitted via highly specific hybridization (sticking) between a nucleic acid template, and either a nucleic acid sequence or a nucleotide building block. In either cell type, specific hybridization permits DNA to act as a template for both its own replication and for production of a complementary mRNA, while synthesis of proteins is directed by hybridization of specific transfer RNAs (tRNAs) to specific sequences in the mRNA.
So far so good. However, in eukaryotic cells, the flow of information from the chromosomal DNA to a protein product involves an extra step. The initial product of gene transcription is not the final mRNA, as occurs in bacteria. Instead, our cells produce a larger pre-mRNA made up of informational stretches separated by junk sequences, rather like the dashed yellow line on a highway. The informational stretches (the yellow paint, known as exons) are bits of genes, periodically interrupted by filler sequences (the asphalt in between, known as introns) which don’t code for anything. In eukaryotes, production of the final mRNA thus requires further processing of this long pre-mRNA: by processing, I mean the removal of the filler introns followed by the splicing together of the informational exon sequences.
This intron removal step is amazingly accurate, even for genes that encode a pre-mRNA containing tens or even hundreds of introns. This requisite conversion of pre-mRNA to processed mRNA is clearly an essential component of information flow in eukaryotic cells, involving as it does the selection of a specific subset (exon sequences) of the total sequence information in the pre-mRNA molecule for inclusion in the final mRNA product. It seems unnecessarily complicated, but the process probably evolved so that cells could be flexible in the proteins they make, as alternative splicing of different exons can make different modular proteins from the same initial DNA sequences – and that gives a cell flexibility in responding to its environment.
Okay, so if I, the recovering physicist, were using first principles to design a eukaryotic cell, how would I ensure the precise specificity of the removal of the introns from a pre-mRNA molecule to yield the final mRNA product? Clearly (to me, anyway), I would employ the great specificity provided by precise hybridization of complementary nucleic acids, just as replication and transcription do. This would involve using a cellular nucleic acid molecule, precisely complementary to the mRNA, as a template that would guide the specific selection of the exons in the pre-mRNA for inclusion in the final mRNA product. This additional RNA processing step in the synthesis of proteins by eukaryotes would thus take advantage of the same powerful and elegant principle of specific hybridization employed at the other stages of cellular information flow.
But eukaryotic cells apparently employ a rather less elegant biochemical route to process their pre-mRNA to mRNA. This processing, both intron excision and exon splicing, involves the action of a monstrously large multi-component cellular structure called a “spliceosome,” bristling with an array of proteins and RNAs. The spliceosome seems to require recognition of common and thus fairly general “consensus” sequences at the ends of introns in pre-mRNAs to correctly identify intron-exon splice junctions.
However, it was recognized some time ago that, at least in the cells of organisms that have a backbone, the limited information in these consensus sequences is not sufficient to specifically identify intron-exon splice junctions amid the sea of DNA in all the chromosomes. As far as I know, this central informational problem in pre-mRNA splicing specificity has not been solved – how are all these needles found in the haystack? And I also don’t think that there have been any previous suggestions, like mine, that precise spliceosome-mediated processing of a specific pre-mRNA might involve a nucleic acid template complementary to the final mRNA product.
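A rough, purely illustrative way to size that haystack is sketched below in Python. It rests on assumptions that are not part of the argument above: a random sequence in which all four bases are equally likely, a hypothetical 100,000-base pre-mRNA, and two motif lengths picked only for illustration (2 bases, like the GU that begins most introns, and a fully specified 9-base motif, which is more specific than a real, degenerate consensus would be). The function name is made up for this sketch.

```python
def expected_chance_matches(motif_length, sequence_length, alphabet_size=4):
    """Expected number of chance occurrences of one fixed motif
    in a random sequence with equally likely bases (an assumption)."""
    positions = max(sequence_length - motif_length + 1, 0)  # possible start sites
    return positions / alphabet_size ** motif_length

PRE_MRNA_LENGTH = 100_000  # hypothetical transcript length, in bases

two_base = expected_chance_matches(2, PRE_MRNA_LENGTH)
nine_base = expected_chance_matches(9, PRE_MRNA_LENGTH)
print(f"2-base motif: about {two_base:.0f} chance matches")
print(f"9-base motif: about {nine_base:.2f} chance matches")
```

Under these assumptions a two-base signal turns up by chance thousands of times in a single long transcript, while a fully specified nine-base motif would turn up less than once. A real consensus sequence tolerates variation at several positions, so its specificity lies somewhere between these two extremes.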
But my proposed scheme would solve the splicing specificity problem so neatly! Picture this as yet unknown perpetrator, which I’ve dubbed “cNA” – a nucleic acid (made of either DNA or RNA) which is complementary to a processed mRNA, in other words the mRNA that has had all its introns removed and exons stitched up. When this cNA hybridizes specifically to the unprocessed pre-mRNA, the resulting structure would be a partially double-stranded molecule, in which the pre-mRNA exon sequences are hybridized to the cNA, zipped up like a zipper, while the pre-mRNA introns would form single-stranded loops bulging off the zipper. To correctly form the final mRNA product, the eukaryotic cell would then just have to employ a single-strand-specific RNA cutter (RNase) to clip away all the intron bulges, then an RNA stitcher (ligase) to sew together all the exons.
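The zipper-and-bulge picture can be mimicked in a toy Python sketch. Everything in it is hypothetical: the function names (reverse_complement, duplex_blocks, splice_with_cna), the min_exon parameter and the short sequences are invented for illustration. It also assumes that the cNA is made of RNA, that pairing is perfect, and that the pre-mRNA starts and ends with an exon. It shows only the bookkeeping of the idea, not any real enzyme or known cellular mechanism.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}  # RNA base pairing

def reverse_complement(rna):
    """Return the strand that would hybridize to `rna`."""
    return "".join(PAIR[base] for base in reversed(rna))

def duplex_blocks(pre_mrna, target, start=0, min_exon=4):
    """Find (start, end) stretches of pre_mrna that, joined in order,
    spell `target`. Returns None if no such set of stretches exists."""
    if not target:
        return []
    seed = target[:min(min_exon, len(target))]
    pos = pre_mrna.find(seed, start)
    while pos != -1:
        # Zip up the duplex for as long as the bases keep matching.
        length = 0
        while (length < len(target) and pos + length < len(pre_mrna)
               and pre_mrna[pos + length] == target[length]):
            length += 1
        # Try the longest zipped stretch first, then shorter ones.
        for n in range(length, len(seed) - 1, -1):
            rest = duplex_blocks(pre_mrna, target[n:], pos + n, min_exon)
            if rest is not None:
                return [(pos, pos + n)] + rest
        pos = pre_mrna.find(seed, pos + 1)
    return None

def splice_with_cna(pre_mrna, cna, min_exon=4):
    """Clip the unpaired loops and join the paired stretches."""
    target = reverse_complement(cna)  # the sequence the cNA pairs with
    blocks = duplex_blocks(pre_mrna, target, 0, min_exon)
    if blocks is None:
        raise ValueError("cNA does not pair with this pre-mRNA")
    # Unpaired stretches between zipped blocks: the 'RNase' step removes these.
    loops = [pre_mrna[end:nxt] for (_, end), (nxt, _) in zip(blocks, blocks[1:])]
    # Paired stretches joined end to end: the 'ligase' step.
    mrna = "".join(pre_mrna[s:e] for s, e in blocks)
    return mrna, loops

# Invented toy sequences: three exons separated by two introns.
EXONS = ["AUGGCC", "CAUUAC", "UCGUAA"]
INTRONS = ["GUAAGUCUAG", "GUGAGUUCAG"]
pre_mrna = EXONS[0] + INTRONS[0] + EXONS[1] + INTRONS[1] + EXONS[2]
cna = reverse_complement("".join(EXONS))

mrna, loops = splice_with_cna(pre_mrna, cna)
print(mrna)
print(loops)
```

In this toy case the clipped loops are exactly the two invented introns and the joined product is the three exons in order. Note that all of the specificity comes from the cNA itself; no consensus sequence at the intron ends is consulted.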
So, what are the problems with my model, and how might I counter them? Here are some of the problems – I am pretty confident that I have not yet thought of them all.
1. There is no evidence whatsoever to support the existence of my proposed class of “cNA” molecules. However, I would point out that we are in an exciting era of discovery, and new classes of RNAs with novel cellular roles are being discovered all the time. In fact, the 2006 Nobel Prize in Physiology or Medicine was recently awarded to Fire and Mello for their characterization in 1998 of RNA interference (“RNAi”), in which double-stranded RNA, processed into a novel class of small RNA molecules termed small interfering RNAs (siRNAs), silences matching genes. So my proposed “cNA” molecules might also represent a novel class of complementary RNAs that just haven’t been discovered yet. So get to work, all you graduate students!
2. That single-strand-specific nuclear RNase cutter I mentioned, needed to chew up the spliced-out intron bulges, would have to be fastidious enough to leave intact all other single-stranded RNA, such as the final mRNA itself. And a related biochemical problem: following the splicing out and ligation step, the mature mRNA would still be (perfectly) hybridized to the cNA template. How would the mRNA be released from this hybrid structure, permitting it to direct specific protein synthesis? Who would unzip it? I don’t have a ready solution for either of these problems. There is yet another related problem: part of the RNAi mechanism I mentioned above involves a “Dicer” enzyme that cleaves double-stranded RNA. So, if my proposed cNA is a cRNA, Dicer could be a distinct threat – one could imagine that it would be hard to keep this protein from cutting up the double-stranded regions of the pre-mRNA:cRNA complex, because this is exactly the snack that Dicer favors. I believe the answer to this problem is that Dicer is located in the cytoplasm, while my proposed cNA would be in the nucleus – these two compartments are safely separated most of the time.
3. My scheme would require that a eukaryotic cell contain a specific cNA for each mature mRNA produced by the cell – that’s a lot of extra stuff to make, and the cell is very busy at the best of times. This does seem at first like a strong objection. But it seems likely that my proposed cNAs would in fact be cRNAs (complementary RNA). And since these cRNAs would be complementary to mRNA, they would of course be antisense RNAs. They might thus represent a subset of the products of the widespread antisense transcription of the human genome that we already know about – we still don’t know what a lot of these antisense RNAs actually do.
4. One might also ask how this whole process could have arisen during evolution. That is, if a template were needed for production of each type of mRNA in a eukaryotic cell, how would the first mature mRNAs ever have been made? I don’t have a good solution to this “chicken and egg” problem.
Finally, let us suppose a fairly likely proposition: that eukaryotic cells actually don’t employ the model I have described here. But if my model represents such an elegant mechanism for a eukaryotic cell to guide the flow of information through the pre-mRNA processing step, why didn’t this type of cell develop it during the evolution of multi-cellular organisms? It is of course difficult (perhaps even dangerous) to try to answer questions about why evolution has taken a specific path since, as noted at the beginning of this article, temporal genomic changes (mutations, etc.) are blind and random, and it is only environmental selection among the phenotypic results of these changes that is at all guided.
But the best answer I presently have for this question was suggested by Jennifer Rohn, the editor of this magazine. Simply put, eukaryotic cells may have just plain missed the boat. Cellular information flow based on specific hybridization originally developed in non-nucleated single-celled organisms such as bacteria. But since the transcripts of these primitive organisms lack introns, the organisms had no need for a mechanism to process pre-mRNAs to mRNAs via splicing out of introns. So at the time eukaryotic cells evolved from prokaryotes, the latter might have lacked key specialized biochemical mechanisms required for my proposed scheme. Instead, eukaryotic cells apparently had to go another route, and develop a complex, less elegant spliceosome-based mechanism for this step in specific information flow.
Ah, well, as the poet e.e. cummings said: “listen: there's a hell of a good universe next door; let's go”. Maybe over there eukaryotes (if there are any!) have managed to go the more elegant route, and employ the beautiful theme of information flow via specific nucleic acid hybridization for all of the steps in the transmission of information from the genome to protein synthesis.
[Reprinted by permission from www.lablit.com/article/174]
(sciencequandaries.blogspot.com)