Revised October 20, 1998

This is old Lecture 22

MCDB 2150 Lecture 23

Review of Genetic Code and Transcription

Text Assignment: Chapter 12, pages 324-347.

This is the third of four lectures reviewing basic concepts of molecular biology and the central dogma that are covered in MCDB 1150. Because of the large amount of review material to be covered, the portion of these notes on the genetic code and on prokaryotic transcription consists only of rather brief outlines. Somewhat more detailed information is provided on eukaryotic transcription and message processing, which may not have been covered as thoroughly in previous courses.

The genetic code: Our textbook includes a fairly extended discussion of the history of how the genetic code was deciphered (pages 325-332). This is material that you should read for perspective, but it will not be covered in detail in the lecture or any of our course examinations. One part that will become important later when we discuss the nature of mutation is the section on frameshift mutations on pages 326-327. The material of primary importance for this lecture is the overview of the code on page 325, and the discussion of various aspects of how the code works on pages 332-335.

Important concepts about the code:

Transcription: major concepts.

Prokaryotic transcription


Material beyond this point will not be on the third examination, scheduled for Wednesday, October 28, 1998. However, please remember that we did do a generalized comparison of prokaryotic and eukaryotic systems in Lecture 21.

Eukaryotic transcription

Important concepts:

Separation of transcription and translation: Eukaryotic cells are characterized by the presence of a membrane-bound nucleus. This results in segregation of the genetic DNA and the enzymatic machinery for transcription and message processing into a separate subcellular compartment. The nucleus has only limited communication to the cytoplasm, which contains the systems needed for translation of the mRNA and post-translational modification of the proteins, as well as targeting proteins to appropriate subcellular or extracellular locations.

Selective gene expression: Eukaryotic cells also exhibit selective gene expression. In "simple" cells, such as yeast, alternative mating types are displayed and proteins associated with progression around the mitotic cell cycle are expressed at different times. In addition, the various cell types in a multicellular organism exhibit highly selective gene expression. This allows a complex organism to be composed of many diverse types of differentiated cells. Such cells display widely different biochemical and structural properties despite having identical genomes in nearly all cases.

Large size of eukaryotic genomes: One of the intriguing aspects of eukaryotic DNA is how so much can be packed into a structure as small as a typical cell nucleus. The human haploid genome contains about 2.8 x 10^9 base pairs of DNA. Each base pair of double helical DNA is about 0.34 nm in thickness (3.4 x 10^-10 meters). Thus, a complete haploid human genome is about 95 cm in length. This means that each diploid human nucleus contains 1.9 m of double helical DNA (6 feet, 3 inches). The average diameter of a nucleus is typically in the range of 3 - 10 micrometers. Thus, depending on how compact the nucleus is in a particular type of cell, the length of the double helical DNA that it contains is roughly 200,000 to 600,000 times the diameter of the nucleus. The packaging of that much DNA in such a small space in a manner that eliminates tangling and permits easy replication and transcription is a marvelous feat.
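
The arithmetic in the preceding paragraph can be checked with a short Python sketch. The genome size and the spacing per base pair are the figures quoted in these notes; the variable and function names are illustrative only.

```python
# Figures quoted in the notes above.
BASE_PAIRS_HAPLOID = 2.8e9   # base pairs in the haploid human genome
RISE_PER_BP_M = 0.34e-9      # 0.34 nm per base pair, expressed in meters

# Total contour length of the double helix.
haploid_length_m = BASE_PAIRS_HAPLOID * RISE_PER_BP_M
diploid_length_m = 2 * haploid_length_m


def length_to_diameter_ratio(nucleus_diameter_um):
    """Ratio of diploid DNA length to nuclear diameter (given in micrometers)."""
    return diploid_length_m / (nucleus_diameter_um * 1e-6)


print(f"haploid genome: {haploid_length_m * 100:.0f} cm")
print(f"diploid genome: {diploid_length_m:.2f} m")
for diameter_um in (3, 10):
    ratio = length_to_diameter_ratio(diameter_um)
    print(f"{diameter_um} micrometer nucleus: DNA is {ratio:,.0f} times the diameter")
```

For nuclei of 10 and 3 micrometers the ratio comes out near 190,000 and 630,000 respectively, which is the rounded range of 200,000 to 600,000 given above.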

Eukaryotic RNA polymerases: Eukaryotic cells contain three different types of RNA polymerases in their nuclei, each of which has a distinctly different role.

RNA polymerase I transcribes only ribosomal RNAs (18S, 28S, and 5.8S). The initial transcript of RNA polymerase I is a large precursor of all three of these rRNAs, which is then processed to yield the final rRNAs.

RNA polymerase II transcribes all protein-coding sequences in eukaryotic cells, and is the only one of the three polymerases that we will have time to analyze in any detail. RNA polymerase II is a complex molecular machine whose structural details are not yet fully understood. It is capable of initiating transcription selectively from a variety of types of promoters, working in conjunction with complex sets of transcription factors and influenced by cis-acting regulatory sequences that may be either adjacent to the promoter or relatively distant from it.

RNA polymerase III transcribes a number of small RNA species that do not have protein coding functions, including transfer RNAs, 5S ribosomal RNA (distinct from 5.8S), as well as a number of small RNAs that are involved in nuclear functions, such as splicing of mRNA.

Eukaryotic promoter sites: Eukaryotic promoters are so diverse that it is not yet possible to draw any clear generalizations about them. Many, but not all, have a consensus sequence called the TATA box located about 30 bp upstream from the transcriptional start site (commonly referred to as -30). The consensus sequence is TATAAA, but there is a lot of variation. In addition, many promoters have no recognizable TATA box. In such cases, the transcriptional initiation site is usually less precisely defined, with starts occurring at several different locations. There is a sequence similarity to the prokaryotic -10 box (TAtAaT), but the eukaryotic TATA box is located substantially further upstream.

A second upstream site that is often encountered in eukaryotic promoters is the CCAAT box, commonly referred to as the "cat box". It has a consensus sequence of GGCCAATCT, again with substantial variation, particularly at the ends. When present, it is typically located near the -75 position, relative to the transcriptional start site. There are also a number of other consensus sequences that frequently occur in eukaryotic promoters, such as the GC box, which has a consensus sequence of GGGCGG. Each of the consensus sequences serves as a binding site for one or more types of proteins known as transcription factors.
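
As a rough illustration of how the consensus sequences above are located relative to the start site, the following Python sketch scans an upstream region for exact matches to TATAAA, CCAAT, and GGGCGG. Exact matching is a deliberate simplification, since real promoter elements vary considerably from the consensus. The function name, the filler helper, and the 100-base toy sequence are all hypothetical and were constructed only for this example.

```python
# Consensus sequences named in the notes (coding strand, written 5' to 3').
CONSENSUS = {"TATA box": "TATAAA", "CCAAT box": "CCAAT", "GC box": "GGGCGG"}


def find_promoter_elements(upstream, elements=CONSENSUS):
    """Return (name, position) pairs for exact consensus matches.

    `upstream` is the coding-strand sequence immediately 5' of the start
    site, so its last base is position -1 and its first base is -len(upstream).
    """
    n = len(upstream)
    hits = []
    for name, motif in elements.items():
        start = upstream.find(motif)
        while start != -1:
            hits.append((name, start - n))  # convert index to upstream position
            start = upstream.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


def filler(k):
    """Hypothetical spacer sequence of length k that contains no consensus match."""
    return ("AGTC" * k)[:k]


# Toy 100-base upstream region (not a real promoter), built so that the
# elements fall near the typical positions described in the notes.
upstream = (
    filler(10) + "GGGCGG"        # GC box
    + filler(7) + "GGCCAATCT"    # extended CCAAT box consensus
    + filler(38) + "TATAAA"      # TATA box
    + filler(24)
)

for name, position in find_promoter_elements(upstream):
    print(f"{name} at {position}")
```

By construction, the toy sequence places the GC box at -90, the CCAAT box at -75, and the TATA box at -30. A more realistic search would score partial matches rather than require the exact consensus.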

Enhancers and Silencers: In addition to the cis-acting sequences that are generally considered to be part of the promoter itself, a number of other cis-acting sequences can also influence the extent of transcription of a particular gene. Members of one interesting subclass are referred to as enhancers. Although they may in some cases be located immediately adjacent to promoter sequences, enhancers have two additional properties that distinguish them from true components of the promoter. The first is that they can be moved to quite distant locations, sometimes thousands of base pairs away (including "downstream" locations, sometimes within introns of the gene to be transcribed) without loss of activity. The second is that they can be inverted without altering their effects. Enhancers appear to function as binding sites for gene-specific transcription factors that are capable of interaction with the overall transcriptional complex, probably through a bending of the DNA, even when they are located quite distant from the promoter itself. Silencers are very similar, except that their function is to reduce or stop transcription of a particular gene, rather than to activate it.

Eukaryotic transcription factors: The transcription factors for RNA polymerase II fall into two overall categories, commonly referred to as general transcription factors and gene-specific transcription factors. The general transcription factors interact to form a preinitiation complex, which allows the RNA polymerase to attach to the DNA and initiate transcription, but only weakly in the case of RNA polymerase II. Gene-specific transcription factors are also needed to achieve a high level of transcription. They are believed to function primarily through interaction with enhancers and silencers. We will examine the preinitiation complex, the nature of eukaryotic transcription factors, and the initiation of eukaryotic transcription in greater detail in lectures 37 and 38. Eukaryotic transcription factors are briefly described on page 344 of our textbook and in greater detail on pages 544-554.

Message processing: The initial transcript of a typical eukaryotic gene is not a functional messenger RNA. In most cases three different types of processing must be completed before the mRNA is ready to be transported to the cytoplasm for translation.

Capping: Addition of a "cap" structure on the 5'-end of the message transcript occurs soon after transcription has been initiated, when the growing RNA chain is only about 50 nucleotides long. The reaction occurs in two steps. First, a GTP is added to the 5'-end, which already has a 5'-triphosphate structure left over from the first NTP that initiated growth of the transcript in a 3'-direction. An unusual linkage is formed in which the GTP is joined to the first nucleotide of the RNA in a 5'-to-5' triphosphate bond. The second step is methylation of the guanine at the 7-position to generate the mature 7-methylguanosine-5',5'-triphosphate cap structure. In addition to protecting the mRNA from degradation, the cap structure is also needed to initiate ribosomal binding onto the message.

Removal of introns: Nearly all eukaryotic genes contain introns, which must be spliced out of the transcript to generate a messenger RNA containing coding sequences that can be translated. There are specific recognition signals at the beginning and end of each intron that allow an enzyme complex in the nucleus to recognize the presence of an intron and to remove it, coupling the exons on either side with sufficient precision so that the codons continue to be read "in frame". Note that in many cases, parts of a single codon may be in two exons. We will not examine the details of the splicing process, which is quite complex, in this class. For those who are curious, it is summarized briefly on pages 345-346 of the textbook. A detailed explanation would take more than a full lecture.
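
The importance of precise splicing can be illustrated with a toy Python sketch. It is not a model of the actual splicing machinery. The sequences are invented, and the boundary check assumes the common rule (not stated in these notes) that nuclear introns begin with GU and end with AG. The function and parameter names are hypothetical.

```python
def splice(pre_mrna, introns, check_boundaries=True):
    """Remove introns, given as (start, end) half-open index pairs, and join exons."""
    exons = []
    pos = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Assumed rule: introns start with GU and end with AG.
        if check_boundaries and not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[pos:start])
        pos = end
    exons.append(pre_mrna[pos:])
    return "".join(exons)


def codons(mrna):
    """Split a message into complete triplets, dropping any leftover bases."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Invented example: the codon GAA is split, with G in exon 1 and AA in exon 2.
exon1 = "AUGGCUG"
intron = "GUAAGUCUCUUACAG"
exon2 = "AAUUCUAA"
pre_mrna = exon1 + intron + exon2
intron_start = len(exon1)
intron_end = intron_start + len(intron)

mature = splice(pre_mrna, [(intron_start, intron_end)])
print(codons(mature))

# Removing one base too many shifts the reading frame downstream of the junction.
shifted = splice(pre_mrna, [(intron_start, intron_end + 1)], check_boundaries=False)
print(codons(shifted))
```

With the correct boundaries the message reads AUG GCU GAA UUC UAA. With the splice site off by a single base it reads AUG GCU GAU UCU, and the stop codon is no longer in frame.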

Polyadenylation: Most (but not all) eukaryotic mRNAs have a polyadenylic acid tail of 150 to 250 nucleotides added to their 3'-ends before they are exported to the cytoplasm. This is done by an enzyme complex that clips off the 3' end of the original transcript slightly downstream from a specific recognition signal (AAUAAA) and initiates polymerization of ATP to generate the poly(A) tail.
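
The cleavage and tailing steps just described can be sketched in Python as simple string operations. The AAUAAA signal and the tail length of 150 to 250 nucleotides are from the notes. The cleavage offset of 15 nucleotides, the use of the first AAUAAA found, the function name, and the toy transcript are all assumptions made only for illustration.

```python
POLY_A_SIGNAL = "AAUAAA"  # recognition signal named in the notes


def polyadenylate(transcript, offset=15, tail_length=200):
    """Cleave slightly downstream of the first AAUAAA and append a poly(A) tail.

    `offset` (bases between the signal and the cut) is an assumed value;
    `tail_length` is chosen from the 150-250 range given in the notes.
    """
    signal_start = transcript.find(POLY_A_SIGNAL)
    if signal_start == -1:
        return transcript  # no signal found: leave the transcript unchanged
    cut = min(signal_start + len(POLY_A_SIGNAL) + offset, len(transcript))
    return transcript[:cut] + "A" * tail_length


# Invented primary transcript: short message, signal, then 30 downstream bases.
primary = "AUGGCUGAAUUCUAA" + "GCGC" + POLY_A_SIGNAL + "CU" * 15
mature = polyadenylate(primary)

print(len(primary), len(mature))
print(mature[:45] + "...")
```

In this toy case the signal begins at index 19, the cut falls 15 bases past its end (index 40), and the remaining 15 downstream bases of the original transcript are discarded before the 200-residue tail is added.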

RNA editing: In certain unusual cases, "editing" of the messenger RNA can occur after its transcription, introducing insertions, deletions, or base changes. RNA editing is an important but highly specialized process, which will not be analyzed in detail in this class.

Transport to cytoplasm: After it is fully modified, the mRNA must be transported to the cytoplasm before it can be translated. Other than a mention of the importance of poly(A) for this process, the textbook does not focus on any of the details, and we will not either because of time limitations. However, it is important to recognize that the mRNA must pass through relatively small nuclear pores and that this process is facilitated by special proteins, such as a poly(A)-binding protein.