Revised September 7, 2000
Lecture date: Friday, September 8, 2000
This lecture combines parts of 1999 lectures 5 and 6.
The section on A and P sites was rearranged for clarity without any change in content on September 11.
Please see class notices and updates page for details.

Correction: Under energy consumed during protein synthesis, fourth bulleted item, the parenthetical statement in the second sentence should read (ATP --> AMP and 2 GTP --> 2 GDP). The product of GTP hydrolysis in both cases is GDP and not GMP. (Corrected September 18, 2000)

Lecture 5, MCDB 2150 Fall 2000

Genetic Code, Ribosomes, tRNA, Translation, Protein Structure, Prions

Textbook Assignment: Chapter 4, Pages 89-123.

Important concepts

This is the last lecture in a series of four reviewing basic concepts of molecular biology and the central dogma that are covered in MCDB 1150.

Overview of translation

Amino acids and proteins: This lecture deals with the translation of coded information contained in a linear RNA molecule into a linear sequence of amino acids in a protein molecule. This requires converting a message that is written with only four different characters, A, C, G, and U, into a sequence with 20 possible alternative amino acids at each position. The protein amino acids all have amino groups attached to their alpha carbon atoms (the ones next to their carboxyl groups). The alpha carbon also carries an attached side chain, ranging from a single hydrogen in glycine to a complex double ring structure in tryptophan (see figure 4.1 for details). You do not need to memorize the exact structures of the protein amino acids, but you should be familiar with the names of all 20 and have a general understanding of their properties, including the ways in which the chemical properties of their side groups influence the properties of the proteins that contain them (positive and negative charges, polar and non-polar properties, presence of hydroxyl groups capable of phosphorylation, presence of -SH groups that can form disulfide crosslinks, etc.).

Gene-protein colinearity: One of the earliest lines of experimental evidence that genetic information is arranged in a linear array corresponding to the amino acid sequence of a protein came from studies on the A subunit of tryptophan synthetase from E. coli in the laboratory of Charles Yanofsky. These studies verified that the relative map position of each mutation analyzed corresponded accurately to the relative position within the protein of the resulting amino acid substitution (figures 4.5 and 4.6).

The genetic code: Our textbook provides a fairly extended discussion of the history of how the genetic code was deciphered on pages 103-106. For the current lecture, we will only deal with those aspects of the code that are summarized in outline form below (see pages 91 - 95 of the textbook for additional details).

Code tables: Table 4.4 identifies the "standard" coding functions of all 64 possible codons. A larger and easier-to-read version of that table can be found inside the front cover of the textbook. It is accompanied by a table showing the codons for all 20 amino acids. Notice that the number of codons for individual amino acids varies from one to six.

One letter code for amino acids: The tables of codons also present the one letter codes for protein amino acids. You will need to become familiar with these abbreviations. For the most part, they are the first letters of the names of the amino acids. However, in cases where several amino acids start with the same letter, they are sometimes the nearest available letter of the alphabet (K for lysine). In other cases they attempt to mimic the sound of the name (F for phenylalanine, N for asparagine, R for arginine, Y for tyrosine, D for aspartic ("asparDic") acid). The use of W for tryptophan invokes images of Elmer Fudd ("tWyptophan"). The use of E for glutamic acid probably reflects its chemical similarity to aspartic acid (D). The use of Q for glutamine is said to be related to its sound, but that one is a bit of a stretch of the imagination.

Compressed representation of code: The format reproduced below is probably the most compact way to represent the code (in this case the DNA code). You will encounter it on some web sites, including the one referenced below for alternative codes. This is the standard code. Each vertical column is one codon (Base1, Base2, Base3) with its amino acid in the AAs row. * = stop codon. The symbol "M" in the Starts row marks a possible start codon and indicates that synthesis begins with methionine (carried on the initiator tRNA) even when a start codon other than ATG is used. (Note: if the vertical columns of type do not line up typewriter style on your screen, forget about this table and use the one in the book. Not all web browsers support the HTML "pre" tag.)

    AAs  = FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG
  Starts = ---M---------------M---------------M----------------------------
  Base1  = TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG
  Base2  = TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG
  Base3  = TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG
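The compressed table can be read mechanically, one column at a time. The following is a minimal Python sketch of that idea, not part of the textbook material: it rebuilds a codon dictionary from the five rows above, collects the codons marked "M" in the Starts row, and counts how many codons specify each amino acid. The function name translate and the short test sequence are made up for illustration.

```python
from collections import Counter

# The five rows of the compressed table, copied from the text above.
AAS    = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STARTS = "---M---------------M---------------M----------------------------"
BASE1  = "TTTTTTTTTTTTTTTTCCCCCCCCCCCCCCCCAAAAAAAAAAAAAAAAGGGGGGGGGGGGGGGG"
BASE2  = "TTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGGTTTTCCCCAAAAGGGG"
BASE3  = "TCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAGTCAG"

# Each vertical column is one codon plus its meaning.
CODON_TABLE = {
    b1 + b2 + b3: aa
    for b1, b2, b3, aa in zip(BASE1, BASE2, BASE3, AAS)
}

# Codons flagged "M" in the Starts row.
START_CODONS = {
    b1 + b2 + b3
    for b1, b2, b3, s in zip(BASE1, BASE2, BASE3, STARTS)
    if s == "M"
}

def translate(dna):
    """Translate a sense-strand DNA coding sequence (hypothetical helper).

    Reads codons from the first base and stops at the first stop codon.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Number of codons per amino acid (stop codons excluded).
codons_per_aa = Counter(aa for aa in AAS if aa != "*")

print(translate("ATGTTTAAATGA"))   # made-up sequence: ATG TTT AAA TGA
print(sorted(START_CODONS))
print(min(codons_per_aa.values()), max(codons_per_aa.values()))
```

Counting the entries in codons_per_aa is a quick way to check the statement that the number of codons per amino acid ranges from one (M, W) to six (L, R, S).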

Alternative codes: Minor variations from the "standard" code (usually differing by just a few codons) have been found in the mitochondrial genomes of many species, including vertebrates, and also in the primary genomes of some unicellular organisms such as mycoplasma and ciliated protozoa. The most frequent change is use of UGA (one of the stop codons in the "standard" code) as a second codon for tryptophan, but there are also numerous other differences. For a compilation of alternative codes, click here.
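To see what such a variation means in practice, here is a small Python sketch that translates the same mRNA with the standard code and with a variant in which UGA codes for tryptophan. The variant table is a simplified illustration containing only that one change; it is not the complete code of any particular organism or organelle, and the mRNA sequence is invented.

```python
from itertools import product

# Amino acids for the 64 codons in the order UUU, UUC, UUA, UUG, UCU, ...
# (first base changes slowest, third base fastest, base order U, C, A, G).
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

STANDARD = {"".join(c): aa for c, aa in zip(product("UCAG", repeat=3), AAS)}

# Illustrative variant: identical to the standard code except UGA = Trp (W).
UGA_AS_TRP = dict(STANDARD, UGA="W")

def translate(mrna, table):
    """Translate an mRNA from its first base until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = table[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

mrna = "AUGUGAUUUUAA"  # made-up message: AUG UGA UUU UAA

print(translate(mrna, STANDARD))    # UGA read as stop
print(translate(mrna, UGA_AS_TRP))  # UGA read as tryptophan
```

With the standard table the message ends at UGA after a single methionine; with the variant table translation continues through UGA and stops at UAA.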

DNA and RNA codes: You will need to be able to move back and forth freely between the DNA code in the sense strands of genes and the RNA code in messenger RNA molecules. Thus, you need to be equally comfortable with ATG or AUG as the start codon and TTA or UUA as a codon specifying leucine. Also, you need to remember when to use T and when to use U. For example, if you were asked to write the DNA code for the protein in figure 4.7, you would need to replace each U in the code with T.
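The conversion between the two notations is a simple letter substitution, which the following Python sketch makes explicit. The function names are hypothetical, and the sketch assumes the DNA sequence given is the sense strand (the strand with the same sequence as the mRNA), not the template strand.

```python
def sense_dna_to_mrna(dna):
    """Write a sense-strand DNA sequence in RNA letters (T becomes U)."""
    return dna.upper().replace("T", "U")

def mrna_to_sense_dna(mrna):
    """Write an mRNA sequence in DNA letters (U becomes T)."""
    return mrna.upper().replace("U", "T")

print(sense_dna_to_mrna("ATG"))  # start codon in RNA letters
print(sense_dna_to_mrna("TTA"))  # a leucine codon in RNA letters
print(mrna_to_sense_dna("AUGUUAUAA"))
```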

Full length mRNA sequence: In addition to nicely demonstrating how a coding sequence is organized, figure 4.7 also provides an example of a full-length mRNA sequence that includes the 5'-cap, the 5'-untranslated sequence, the coding sequence from start codon to stop codon, the 3'-untranslated sequence (including the AAUAAA polyadenylation signal), and the poly (A) tail.
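The organization just described can be illustrated with a short Python sketch that splits an mRNA into its 5'-untranslated sequence, coding sequence, and 3'-untranslated sequence. This is a simplification under stated assumptions: the coding sequence is taken to begin at the first AUG and to end at the first in-frame stop codon, the 5'-cap is not represented (it is not part of the letter sequence), and the example sequence is invented rather than taken from figure 4.7.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Return (5' UTR, coding sequence, 3' UTR), or None if no complete
    coding sequence is found. Assumes translation starts at the first AUG."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Step through codons in the reading frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3  # the stop codon is counted as part of the coding sequence
            return mrna[:start], mrna[start:end], mrna[end:]
    return None

# Made-up message: 5' UTR + coding sequence + 3' UTR (with AAUAAA) + poly(A) tail
mrna = "GGCACC" + "AUGGCUUGGUAA" + "CUAAUAAAGC" + "A" * 10

five_utr, cds, three_utr = split_mrna(mrna)
print(five_utr)
print(cds)
print(three_utr)
print("AAUAAA" in three_utr)  # polyadenylation signal lies in the 3' region
```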

Ribosomes: Ribosomes are complex molecular aggregates, composed of large and small subunits, which assemble only for the purpose of protein synthesis and separate again as soon as the process is completed. The ribosomes of E. coli consist of a small subunit containing one 16S RNA and 21 different proteins plus a large subunit containing two RNAs (23S and 5S) plus 31 different proteins (figure 4.8). Mammalian ribosomes consist of a small subunit containing an 18S RNA and 30-35 proteins, plus a large subunit with 28S, 5.8S, and 5S RNAs and 45-50 different proteins (figure 4.9). The abbreviation "S" refers to Svedberg units, a measurement of the rate of sedimentation during high speed centrifugation, which reflects the relative sizes of the RNA molecules, but not in a strictly linear relationship. As shown in figure 4.10, ribosomes attach to messenger RNA and synthesize proteins as they progress from one end of the coding sequence to the other. Many ribosomes can be seen attached to a single actively translated mRNA, forming a complex known as a polysome.

Major steps in translation: Details of translation and the major molecular species involved are summarized below in outline form, arranged in approximately the same sequence as in the textbook (see figures 4.8, 4.14, 4.24, 4.25 and 4.26 for a visual summary of the main steps). Except where specified otherwise, the descriptions are for E. coli, which is representative of typical prokaryotic cells. Eukaryotic translation is similar in principle, but differs in many of its details.

Linking amino acids to the appropriate tRNAs

Formation of the prokaryotic initiation complex. (Figure 4.22)

Formation of the 70S ribosome complex (Figure 4.25d)

A and P sites, recruitment of aminoacyl-tRNAs, elongation (Figure 4.25)

Termination of translation

Eukaryotic translation

Energy consumed during protein synthesis.

Protein structure: Section 4.9 on protein structure and function should be read as additional background information that is essentially a review of material from MCDB 1150. Four levels of structural information are commonly recognized.

Post-translational modification: Proteins are subject to a variety of post-translational modifications, including frequent removal of N-terminal methionine, removal of other N- or C- terminal sequences, removal of internal sequences, removal of signal or targeting sequences, modification of specific amino acids (such as conversion of proline to hydroxyproline), phosphorylation of hydroxyl groups, addition of carbohydrate side chains (glycosylation), complexing with metals or other prosthetic groups, and a long list of other possibilities that are not discussed particularly well in the textbook. Some of these modifications are illustrated in a section on enzymes and enzyme activity at the end of the chapter.

Prions: A boxed section at the end of chapter 13 of Klug and Cummings, Concepts of Genetics, 5th edition (the previous text for this course, available at the Norlin reserve desk) discusses an unusual pathogenic unit called a prion (proteinaceous infective agent). Although the prion theory remains controversial, a very large amount of evidence has accumulated showing that prion proteins are coded by the host and subsequently modified to function as pathogens. The modified proteins accumulate in aggregates that cause degenerative diseases of the brain. The best available evidence seems to indicate that a conformational modification of the normal host protein gives it pathogenic properties plus the ability to catalyze similar modification of additional normal proteins, such that the pathology is infectious. The prion disease that has received the most attention recently is mad cow disease, which apparently got its start when proteins derived from sheep infected with a similar disease, scrapie, were used in cattle feed. Ordinary sterilization techniques do not inactivate the prion infectivity. Similar diseases are known in humans, including kuru and Creutzfeldt-Jakob disease. There is also considerable evidence suggesting that some "atypical" cases of Creutzfeldt-Jakob disease may have been caused by animal to human transmission of mad cow disease through meat from infected animals. Stanley Prusiner was awarded the Nobel Prize in 1997 for his extensive study of prions (which included coining the term "prion").

New material ahead! This lecture marks the end of the brief "review" of material from MCDB 1150. Starting with the next lecture we will begin to examine "new" material in greater detail than has been possible in these "review" lectures.