Text Assignment: Chapter 3, pages 75-87 (includes end-of-chapter material for the entire chapter). Chapter 4, pages 88-95.
Important concepts
Molecular organization of eukaryotic genes: We have already seen that a typical eukaryotic gene has an upstream promoter sequence, as well as various enhancer and silencer sequences associated with it. The region that is transcribed is also quite complex, consisting of far more than a simple protein coding sequence. The messenger RNA that is derived from the transcript always begins with a 5'-untranslated sequence, which, among other things, provides sequences needed for ribosomal attachment in preparation for translation. This is followed by the actual coding sequence and a relatively long 3'-untranslated region, which may contain specific signal sequences related to such things as message stability. The initial transcript and the DNA that it is transcribed from also contain numerous intervening sequences that interrupt not only the coding sequence, but also the 5'- and 3'-untranslated sequences. These are called introns. The segments of mRNA sequence located between the introns are called exons. The term hnRNA (heterogeneous nuclear RNA) is sometimes used to describe the original products of transcription of eukaryotic genes before they have been processed into messenger RNA. You may want to look ahead to "The Anatomy of a Gene" (pages 158 to 161 and Figure 6.2) for a more complete overview of the structure of a typical eukaryotic gene.
Processing of the mRNA precursor: The initial transcript of a typical eukaryotic gene is not capable of functioning as a messenger RNA. In most cases, both ends of the transcript must be modified (5'-capping and 3'-polyadenylation), and introns must be "spliced out" during the process of converting the initial transcript (hnRNA) into a functional mRNA.
Capping: Addition of a "cap" structure on the 5'-end of the message transcript occurs soon after transcription has been initiated, when the growing RNA chain is only about 50 nucleotides long. The reaction occurs in two steps. First, a GTP is added to the 5'-end, which already has a 5'-triphosphate structure left over from the first NTP that initiated growth of the transcript in a 3'-direction. An unusual linkage is formed in which the GTP is joined to the first nucleotide of the RNA in a 5'- to 5'-triphosphate bond (see figure 3.19). The second step is methylation of the guanine at the 7-position to generate the mature 7-methylguanosine-5',5'-triphosphate cap structure. In addition to protecting the mRNA from degradation, the cap structure is also needed to initiate ribosomal binding onto the message.
Polyadenylation: Most (but not all) eukaryotic mRNAs have a polyadenylic acid tail of 150 to 250 nucleotides added to their 3'-ends before they are exported to the cytoplasm. This is done by an enzyme complex that clips off the 3'-end of the original transcript slightly downstream from a specific recognition signal (AAUAAA) and then polymerizes ATP to generate the poly(A) tail. Although polyadenylation is a very important phenomenon, please be aware that a subset of mRNAs, including those for the histones, do not have poly(A) tails. Also, please be fully aware that the poly(A) tail is generated by a simple polymerization process that does not require a template.
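As a rough illustration of the cleavage-and-tailing step described above, the following Python sketch finds the AAUAAA signal, trims the transcript a short distance downstream, and appends a run of A residues without consulting any template. The function name, the 15-nucleotide cleavage offset, and the 200-nucleotide tail are assumptions chosen for illustration; the notes say only that cleavage is "slightly downstream" and that tails run 150 to 250 nucleotides.

```python
POLY_A_SIGNAL = "AAUAAA"


def polyadenylate(transcript, tail_length=200, cleavage_offset=15):
    """Cleave downstream of the AAUAAA signal and add a poly(A) tail.

    tail_length and cleavage_offset are illustrative assumptions,
    not measured values.
    """
    signal_start = transcript.find(POLY_A_SIGNAL)
    if signal_start == -1:
        # No signal found: leave the transcript untailed
        # (as with the histone mRNAs mentioned in the notes).
        return transcript
    # Cut a fixed distance past the end of the signal, or at the
    # end of the transcript if it is shorter than that.
    cut = min(signal_start + len(POLY_A_SIGNAL) + cleavage_offset,
              len(transcript))
    # The tail is built by simple repetition: no template is read.
    return transcript[:cut] + "A" * tail_length


if __name__ == "__main__":
    demo = "GGCGCC" + POLY_A_SIGNAL + "C" * 30  # made-up transcript
    print(polyadenylate(demo, tail_length=20))
```

Note that the tail is produced by `"A" * tail_length` alone, which mirrors the point that poly(A) addition is template-independent.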
Removal of introns: Nearly all eukaryotic protein-coding genes contain introns that must be removed to generate a functional messenger RNA that is capable of being translated. There are specific recognition signals at the beginning and end of each intron that allow an enzyme complex in the nucleus to recognize the presence of an intron and to remove it, coupling the exons on either side with sufficient precision so that the codons (nucleotide triplets coding for individual amino acids) continue to be read correctly. Note that in many cases, parts of a single codon may be in two exons.
Signals that identify introns: Removal of introns is done in the nucleus before the mRNA is exported to the cytoplasm for translation. The mechanisms that are involved are described in considerable detail in the textbook (pages 76-79 and figure 3.21). The 5'-end of the intron is marked with a consensus sequence of GUAAGU, with the first two nucleotides being invariant. The 3'-end has a sequence consisting of 6 pyrimidines in a row followed by an unspecified nucleotide and then CAG (6PyNCAG), with the AG being invariant. A third sequence 18-40 nucleotides upstream from the 3'-end also plays a major role. It has a consensus sequence PyNPyPyPuAPy (Py = pyrimidine and Pu = purine).
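The three signals can be written as simple pattern matches. The sketch below is only an illustration of the consensus sequences quoted above: the function names are hypothetical, the test intron is invented, and the way distance from the 3'-end is counted (intron length minus the 0-based position of the branch A) is an assumption. Real introns often deviate from the consensus at every position except the invariant GU and AG.

```python
import re

PY = "[CU]"    # pyrimidine
PU = "[AG]"    # purine
N = "[ACGU]"   # any nucleotide

DONOR = re.compile(r"^GUAAGU")                      # 5'-end consensus
ACCEPTOR = re.compile(rf"{PY}{{6}}{N}CAG$")         # 3'-end: 6PyNCAG
# Lookahead so that overlapping branch-site candidates are all found.
BRANCH = re.compile(rf"(?=({PY}{N}{PY}{PY}{PU}A{PY}))")


def has_invariant_ends(intron):
    """True if the intron starts with GU and ends with AG."""
    return intron.startswith("GU") and intron.endswith("AG")


def matches_consensus_ends(intron):
    """True if both ends match the full consensus sequences."""
    return bool(DONOR.match(intron)) and bool(ACCEPTOR.search(intron))


def find_branch_points(intron, min_up=18, max_up=40):
    """Return 0-based positions of candidate branch-point A residues
    lying 18-40 nucleotides upstream of the 3'-end (assumed counting)."""
    hits = []
    for match in BRANCH.finditer(intron):
        a_pos = match.start() + 5          # A is the 6th base of the motif
        distance = len(intron) - a_pos
        if min_up <= distance <= max_up:
            hits.append(a_pos)
    return hits


if __name__ == "__main__":
    # Invented intron: donor + filler + branch site + filler + acceptor.
    intron = "GUAAGU" + "G" * 20 + "CUCUGAC" + "G" * 15 + "CUCUCUGCAG"
    print(has_invariant_ends(intron), matches_consensus_ends(intron))
    print(find_branch_points(intron))
```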
Removal of introns: A multicomponent "spliceosome" composed of several small nuclear ribonucleoproteins (snRNPs) carries out the actual message splicing operations. The RNAs in the snRNPs range in size from 100 to 250 nucleotides; most are transcribed by RNA polymerase II, with the U6 snRNA being a product of RNA polymerase III. If one regards the hnRNA as reading left to right from 5' to 3', the 5'-end of the intron is first detached from the exon to the left and bent into a lariat loop, with attachment to the A in the internal consensus site. This involves the formation of an unusual 2',5'-phosphodiester bond between the guanine nucleotide from the 5'-end of the intron and the adenine nucleotide in the internal consensus sequence. The exon from the left side is held by the complex and joined to the exon from the right side after the 3'-end of the intron has been detached. This joining produces a continuous mRNA sequence with the intron removed and no loss of nucleotides from either exon. This entire process is referred to as "splicing out" the intron (see figure 3.21 for details).
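The net result of splicing (though not its lariat mechanism) can be sketched in a few lines of Python. The example below assumes the intron boundaries are already known and given as 0-based, half-open coordinates; the function name and the sequences are invented for illustration. It also shows a codon (GCU) whose first two bases sit in one exon and whose third base sits in the next.

```python
def splice(pre_mrna, introns):
    """Remove introns and join the flanking exons.

    introns: list of (start, end) pairs, 0-based and half-open,
    assumed not to overlap.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[cursor:start])  # exon before this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # final exon
    return "".join(exons)


if __name__ == "__main__":
    exon1 = "AUGGC"               # ends partway through a codon
    intron = "GUAAGUCCCCCCCCAG"   # invented intron, GU...AG
    exon2 = "UUAA"                # supplies the last base of that codon
    pre_mrna = exon1 + intron + exon2
    mrna = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
    # Reading frame after splicing: AUG GCU UAA
    print([mrna[i:i + 3] for i in range(0, len(mrna), 3)])
```

Because the join is exact, the spliced length equals the sum of the exon lengths; a boundary that is off by even one nucleotide would shift every codon downstream.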
Electron microscopy of introns: If double stranded DNA is dissociated into single strands and allowed to re-form double strands in the presence of mRNA, the mRNA hybridizes readily with the antisense strand. Wherever the mRNA pairs with the antisense strand (the exon regions), the sense strand is displaced and left as single stranded DNA. Where the DNA codes for an intron, however, there is no RNA sequence to hybridize, because the intron has been spliced out of the mRNA. In those areas, the two DNA strands pair with each other and form a double helical loop that is clearly visible in electron micrographs (Figure 3.22).
Alternative splicing and RNA editing: In certain unusual cases, splicing may remove a larger segment of the transcript that contains a potential exon. This allows different combinations of exons to be assembled together to produce alternative forms of the mRNA that can give rise to proteins with alternative amino acid sequences. (This phenomenon is described briefly on page 244 and illustrated in figure 8.29.) There are also rare cases in which "editing" of messenger RNA occurs after its transcription. These are unusual situations in which insertions, deletions or base changes may occur. RNA editing is an important but highly specialized process, which will not be analyzed in detail in this class.
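To make the idea of "different combinations of exons" concrete, here is a minimal Python sketch that enumerates the mRNAs obtainable when some exons may be skipped. The function name, the exon sequences, and the choice of which exon is optional are all hypothetical; the sketch ignores how the cell actually decides which exons to include.

```python
from itertools import combinations


def isoforms(exons, optional):
    """List every mRNA obtainable by skipping any subset of the
    optional exons. Exon order is always preserved.

    exons: exon sequences in 5' to 3' order.
    optional: indices of exons that may be skipped.
    """
    skippable = sorted(optional)
    results = []
    for k in range(len(skippable) + 1):
        for skipped in combinations(skippable, k):
            kept = [seq for i, seq in enumerate(exons) if i not in skipped]
            results.append("".join(kept))
    return results


if __name__ == "__main__":
    exons = ["AUGGCU", "GAAGAA", "UGCUAA"]  # invented exons
    for mrna in isoforms(exons, optional={1}):
        print(mrna)
```

With one optional exon there are two possible messages; with n independent optional exons this simple model gives 2 to the power n.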
Transport to cytoplasm: After it is fully modified, the mRNA must be transported to the cytoplasm before it can be translated. Our textbook does not appear to have much to say about this phenomenon, at least in the current chapters (if anyone finds a description of it, please let me know). Overall, the transport process is still not well understood. However, it is important to recognize that the mRNA must pass through relatively small nuclear pores and that this process is somehow facilitated by poly(A) and various special proteins, including a poly(A)-binding protein.
Processing of other types of RNA: Introns are also found in ribosomal and transfer RNAs, as well as RNAs transcribed from mitochondrial and chloroplast genomes. A variety of splicing mechanisms are involved in removal of these introns. In some cases, the RNA is capable of self-splicing, with no requirement for the involvement of protein enzymes. Dr. Thomas Cech of the Department of Chemistry and Biochemistry was awarded a Nobel Prize for discovery of this unexpected phenomenon. In addition to the removal of introns, the processing of ribosomal RNAs involves cleaving a very large precursor into the smaller pieces that become the final functional RNAs in the ribosomes (figures 3.23 and 3.24). Transfer RNAs undergo extensive modification of their nucleotide bases (figure 3.26), and in some cases also have a CCA sequence added to their 3'-ends.
Amino acids and proteins: The second half of the central dogma deals with the translation of coded information contained in a linear RNA molecule into a linear sequence of amino acids in a protein molecule. This requires converting a message that is written with only four different characters, A, C, G, and U, into a sequence with 20 possible alternative amino acids at each position. The protein amino acids all have amino groups attached to their alpha carbon atom (the one next to their carboxyl group). The alpha carbon also carries an attached side chain, ranging from a single hydrogen in glycine to a complex double ring structure in tryptophan (see figure 4.1 for details). You do not need to memorize the exact structures of the protein amino acids, but you should be familiar with the names of all 20 and have a general understanding of their properties, including the ways in which the chemical properties of their side groups influence the properties of the proteins that contain them (positive and negative charges, polar and non-polar properties, presence of hydroxyl groups capable of phosphorylation, presence of -SH groups that can form disulfide crosslinks, etc.).
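The arithmetic behind converting a four-letter alphabet into 20 amino acids can be checked directly. The short Python sketch below (function names are hypothetical) counts how many distinct "words" of a given length can be written with four bases and finds the shortest word length that can cover 20 amino acids.

```python
def coding_capacity(codon_length, alphabet_size=4):
    """Number of distinct codons of the given length."""
    return alphabet_size ** codon_length


def shortest_codon_length(n_amino_acids=20, alphabet_size=4):
    """Smallest codon length whose capacity covers all amino acids."""
    length = 1
    while coding_capacity(length, alphabet_size) < n_amino_acids:
        length += 1
    return length


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "base(s):", coding_capacity(n), "possible codons")
    print("shortest workable codon length:", shortest_codon_length())
```

One base gives 4 possibilities and two bases give 16, both fewer than 20, so three bases (64 possibilities) is the minimum; the surplus is why most amino acids are specified by more than one codon.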
The genetic code: Our textbook provides a fairly extended discussion of the history of how the genetic code was deciphered on pages 103-106. For the current lecture, we will only deal with those aspects of the code that are summarized in outline form below (see pages 91 - 95 of the textbook for additional details).
DNA code: You will need to be able to move back and forth freely between DNA coding in the sense strands of genes and RNA coding in messenger RNA molecules. Thus, you need to be equally comfortable with TTA or UUA as a codon specifying leucine. Also, you need to remember when to use T and when to use U. For example, if you were asked to write the DNA code for the protein in figure 4.7, you would need to replace each U in the code with T.
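Moving between the two notations is a one-character substitution, which the following Python sketch makes explicit. The codon table is the standard genetic code written compactly with bases in the order U, C, A, G and amino acids in one-letter code (`*` marks a stop codon); the function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, first base varying slowest, in UCAG order.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def to_mrna(dna_sense):
    """Sense-strand DNA to mRNA notation: replace every T with U."""
    return dna_sense.upper().replace("T", "U")


def to_dna_sense(mrna):
    """mRNA to sense-strand DNA notation: replace every U with T."""
    return mrna.upper().replace("U", "T")


def decode_codon(codon):
    """Look up a codon written in either DNA or RNA notation."""
    return CODON_TABLE[to_mrna(codon)]


if __name__ == "__main__":
    print(decode_codon("TTA"), decode_codon("UUA"))  # both leucine (L)
    print(to_dna_sense("AUGGCUUUAUAA"))
```

Normalizing to RNA notation inside `decode_codon` is just one design choice; it lets a single table serve both TTA and UUA.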
Full length mRNA sequence: In addition to nicely demonstrating how a coding sequence is organized, figure 4.7 also provides an example of a full-length mRNA sequence that includes the 5'-cap, the 5'-untranslated sequence, the coding sequence from start codon to stop codon, the 3'-untranslated sequence (including the AAUAAA polyadenylation signal), and the poly(A) tail.
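The layout of a full-length message can be sketched as a simple parser. The Python example below uses an invented sequence (not the one in figure 4.7) and makes several simplifying assumptions that are labeled in the comments: translation starts at the first AUG, the poly(A) tail is taken to be the run of A residues at the very end, and the cap is not represented because it is not part of the nucleotide sequence. The function name is hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def dissect_mrna(mrna):
    """Split an mRNA into 5'-UTR, coding sequence, 3'-UTR, poly(A) tail."""
    # Assumption: the first AUG in the message is the start codon.
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Read codon by codon, in frame, until the first stop codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            stop_end = i + 3
            break
    else:
        raise ValueError("no in-frame stop codon found")
    # Assumption: the trailing run of A's is the poly(A) tail. Any A's
    # that really belong to the end of the 3'-UTR are miscounted as tail.
    tail_start = max(len(mrna.rstrip("A")), stop_end)
    return {
        "five_prime_utr": mrna[:start],
        "coding_sequence": mrna[start:stop_end],
        "three_prime_utr": mrna[stop_end:tail_start],
        "poly_a_tail": mrna[tail_start:],
    }


if __name__ == "__main__":
    # Invented message: 5'-UTR + AUG GCU UUA UAA + 3'-UTR + poly(A).
    mrna = "GGACUC" + "AUGGCUUUAUAA" + "GCCAAUAAAGCUGC" + "A" * 20
    for name, seq in dissect_mrna(mrna).items():
        print(name, seq)
```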
One letter code for amino acids: The tables of codons also present the one letter codes for protein amino acids. You will need to become familiar with these abbreviations. For the most part, they are the first letters of the names of the amino acids. However, in cases where several amino acids start with the same letter, they are sometimes the nearest available letter of the alphabet (K for lysine). In other cases they attempt to mimic the sound of the name (F for phenylalanine, N for asparagine, R for arginine, Y for tyrosine, D for aspartic ("asparDic") acid). The use of W for tryptophan invokes images of Elmer Fudd ("tWyptophan"). The use of E for glutamic acid probably reflects its chemical similarity to aspartic acid (D). The use of Q for glutamine is said to be related to its sound, but that one is a bit of a stretch of the imagination.
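For practice, the one-letter codes can be kept in a small lookup table. The Python sketch below lists the 20 standard assignments and separates the codes that are simply the first letter of the name from those that need one of the mnemonics described above; the function names are hypothetical.

```python
ONE_LETTER = {
    "alanine": "A", "arginine": "R", "asparagine": "N",
    "aspartic acid": "D", "cysteine": "C", "glutamine": "Q",
    "glutamic acid": "E", "glycine": "G", "histidine": "H",
    "isoleucine": "I", "leucine": "L", "lysine": "K",
    "methionine": "M", "phenylalanine": "F", "proline": "P",
    "serine": "S", "threonine": "T", "tryptophan": "W",
    "tyrosine": "Y", "valine": "V",
}


def abbreviate(names):
    """Convert a list of amino acid names to a one-letter string."""
    return "".join(ONE_LETTER[name.lower()] for name in names)


def first_letter_codes():
    """Amino acids whose code is just the first letter of the name."""
    return sorted(n for n, c in ONE_LETTER.items() if c == n[0].upper())


def mnemonic_codes():
    """Amino acids whose code is NOT the first letter of the name."""
    return sorted(n for n, c in ONE_LETTER.items() if c != n[0].upper())


if __name__ == "__main__":
    print(abbreviate(["methionine", "alanine", "leucine"]))
    print("need a mnemonic:", mnemonic_codes())
```

The second list is the one worth drilling, since it contains exactly the cases (K, F, N, R, Y, D, W, E, Q) that the notes single out.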
Gene-protein colinearity: One of the earliest lines of experimental evidence supporting the concept that genetic information was in a linear array corresponding to the amino acid sequence of a protein was provided by studies on the A subunit of tryptophan synthetase from E. coli in the laboratory of Charles Yanofsky. These studies verified that the relative map position of each mutation analyzed corresponded to the relative position within the protein of the resulting amino acid substitution (figures 4.5 and 4.6).