Revised September 8, 2000
Lecture date: Monday, September 11, 2000
An error was corrected September 15.
See course notices and updates page for details.

Lecture 6, MCDB 2150, Fall 2000

Molecular Basis for Mutation

Textbook Assignment: Chapter 5, Pages 125-142.

Textbook error: There is a typographic error on page 142, three lines from the end of the first paragraph in boxed example 5.3. "G-->A" should read "G-->T". Changing G to A would not be a transversion, and the context makes it clear that what should have been specified is a G to T change.

Major concepts

Introduction: This lecture provides a description of the molecular basis for various types of mutation in terms of changes in the information encoded in DNA. Because the lecture on mutational mechanisms occurred later in the semester prior to adopting our current textbook, it was able to cover various manifestations of mutation in greater detail and thus served as a unifying lecture for many previously introduced concepts. Much of that material has been removed from the current notes, but some items that are not covered in the current chapter of the textbook still remain. For those who may be interested, I am providing a link to the unedited version of 1998 Lecture 25.

Phenotype and genotype: Please note that I have used the terms "phenotype" and "genotype" rather freely in these notes. Phenotype refers to "the observable outward appearance of an organism, which is controlled by the genotype and its interaction with the environment" (from the glossary of the textbook, page 816). Genotype refers to "the genetic makeup of an individual organism" (page 813). In diploid organisms, phenotype may be determined entirely by a dominant allele, with no outward expression of the corresponding recessive allele at the genetic locus in question. The textbook does not fully introduce these terms until page 157, although phenotype is used without definition a few times in Chapter 1. These terms were also defined from a slightly different perspective in my Lecture 1 notes.

Mutation: A mutation is any change in genetic information relative to a reference "wild-type" genome, including changes that affect expression of genes without altering their coding sequences and changes that do not cause any detectable phenotypic difference (silent mutations). In a complex organism, mutation can occur at many different structural levels and can be classified in many different ways. We will begin exploring this subject, mostly at the level of the central dogma, in this lecture, with many additional concepts to be introduced later in the semester.

Point mutations: In classical genetics, a point mutation was originally defined as a change in an inherited trait that was not accompanied by any chromosomal change that could be seen with a light microscope (even in the giant polytene chromosomes of Drosophila larval salivary glands). In current usage, point mutations are usually understood to involve only one base pair, but to include both substitutions (transitions and transversions), and the addition or deletion of a single base pair. A point mutation can result in missense (amino acid substitution), nonsense (insertion of a stop codon), or frameshift (either positive or negative) as described later in this lecture.

Larger scale mutations: Later in the semester, we will examine mutations that involve larger changes in chromosomes, including deletions, duplications, inversions, translocations from one chromosome to another, and extra or missing chromosomes (lectures 33 and 34, chapter 17 of the textbook). Mutations caused by transposable elements (also known as mobile genetic elements) are covered in chapter 22 of our textbook, and are examined in MCDB 3500.

The importance of mutation: Genes are stable repositories of the information needed for synthesis of all of the RNA and proteins in a living organism. Survival and stability of each species is dependent on faithful replication of genetic information for use by each new generation. However, a low level of mutational change is highly desirable. Over an extended period of time, mutational changes provide the ability for species to adapt to changing conditions and challenges, and thus serve as the raw material for selective survival and the evolution of more advanced and efficient species, as well as the development of biological diversity.

Somatic and germ-line mutations: The mutations that we normally deal with in genetics are those that occur in the germ-line and are thus passed on to subsequent generations. However, mutations can also occur in somatic cells. Those mutations affect only the immediate progeny of the cells they occur in and are not inherited. Colored spots in Indian corn are caused by back mutation of a relatively unstable mutation that is responsible for loss of pigmentation. Cancer is caused by somatic mutations that alter normal cellular growth regulatory mechanisms in a single cell and its direct progeny.

Molecular nature of point mutations: Point mutations can occur in a variety of ways (including frameshift mutations, which are discussed separately below). Simple substitutions that replace one base pair with another alter only one codon in the coding sequence, and thus generate missense, nonsense, or silent mutations. If one purine is replaced by another purine or if one pyrimidine is replaced by another pyrimidine in the sense strand base sequence (with complementary changes in the antisense strand), the substitution is called a transition. If the substitution involves replacement of a purine with a pyrimidine or a pyrimidine with a purine, it is called a transversion.
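The transition/transversion rule can be written as a short check. The following is a minimal Python sketch, not part of the textbook; the function name classify_substitution is hypothetical and chosen only for illustration. It compares the chemical class (purine or pyrimidine) of the original and replacement bases on the sense strand.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(original: str, mutant: str) -> str:
    """Classify a single-base substitution on the sense strand."""
    original, mutant = original.upper(), mutant.upper()
    valid = PURINES | PYRIMIDINES
    if original not in valid or mutant not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if original == mutant:
        raise ValueError("no substitution: the two bases are identical")
    # Same chemical class on both sides means a transition.
    same_class = (original in PURINES) == (mutant in PURINES)
    return "transition" if same_class else "transversion"


print(classify_substitution("G", "A"))  # transition (purine to purine)
print(classify_substitution("G", "T"))  # transversion (purine to pyrimidine)
```

The two printed cases are the ones involved in the textbook error noted at the top of these notes: G to A is a transition, while G to T is a transversion.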

Missense mutations: Most base pair substitutions change the amino acid specified by the codon in which they occur. Such mutations are described as missense mutations because they cause an amino acid substitution in the coded protein. Depending on the nature of the amino acid substitution and its location within the protein, missense mutations may have a variety of effects, ranging from complete loss of biological activity to reduced activity or temperature-sensitive activity or no functional effect at all.

Open reading frames: Before discussing nonsense and frameshift mutations, it will be helpful to introduce the concept of an open reading frame. In DNA with equal numbers of AT and GC base pairs and a truly random nucleotide sequence, the probability of finding any given nucleotide at any given position is 1/4. The probability of encountering any one of the 64 possible codons is 1/64 (1/4 x 1/4 x 1/4). Since there are 3 different stop codons in the standard code (UAA, UAG, UGA), the chance that any random group of three adjacent nucleotides will form a stop codon is 3/64, or approximately 1/21. Thus, truly random DNA would not be expected to contain coding sequences for proteins that are hundreds or even thousands of amino acids in length. An extended nucleotide sequence with no stop codons is called an open reading frame, often abbreviated "ORF". The presence of an open reading frame provides strong suggestive evidence that a protein coding sequence may be present.
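The arithmetic behind this argument can be checked with a few lines of Python. This is only a sketch under the same assumptions as the paragraph above (random sequence, all four bases equally likely, one fixed reading frame, each codon independent of the others); the name prob_open is hypothetical.

```python
STOP_CODONS = 3
TOTAL_CODONS = 64

# Chance that one random codon is a stop codon: 3/64.
p_stop = STOP_CODONS / TOTAL_CODONS

# Average number of random codons read before a stop codon appears.
mean_codons_to_stop = 1 / p_stop


def prob_open(n_codons: int) -> float:
    """Probability that n consecutive random codons contain no stop codon."""
    return (1 - p_stop) ** n_codons


print(f"P(stop) per codon = {p_stop:.4f}")
print(f"mean codons before a stop = {mean_codons_to_stop:.1f}")
for n in (20, 100, 300, 1000):
    print(f"{n:5d} codons: P(no stop) = {prob_open(n):.3g}")
```

Under these assumptions the chance of reading 100 random codons without meeting a stop is already below 1 percent, which is why a long open reading frame is unlikely to arise by chance.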

Nonsense mutations: Base pair substitutions that generate an in-frame stop codon within a previously functional protein coding sequence cause premature termination of translation of the protein in question and are referred to as nonsense mutations. In some cases, the effects of nonsense mutations can be suppressed by modified tRNA molecules that insert an amino acid with a low efficiency when a stop codon is encountered. Bacterial strains that contain such tRNAs are referred to as suppressor strains.

Silent mutations: In some cases, base pair substitutions generate a different codon for the same amino acid, with no biological effect whatsoever. This is most likely to happen in the third position (wobble base) of redundant codons for the same amino acid. Such changes are considered to be mutations because they alter the DNA sequence. However, because they have no phenotypic effect, even at the level of protein amino acid sequence, they are called silent mutations. As we will see later in the semester, silent mutations can alter the ability of DNA to be cut by sequence-specific endonucleases known as restriction endonucleases, which play a major role in recombinant DNA technology and DNA fingerprinting.
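The three outcomes of a base pair substitution described above (missense, nonsense, silent) can be told apart by looking up the codon before and after the change. The Python sketch below is an illustration only. It builds the standard genetic code from the usual T, C, A, G ordering of codon tables; the function name classify_codon_change is hypothetical, and the function assumes that the original codon is a sense codon and that exactly one base differs.

```python
from itertools import product

# Standard genetic code written with DNA letters (T in place of U).
# Codons are listed in T, C, A, G order, with the first base varying slowest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(original: str, mutant: str) -> str:
    """Classify a single-base substitution within one codon."""
    original = original.upper().replace("U", "T")
    mutant = mutant.upper().replace("U", "T")
    if original not in CODON_TABLE or mutant not in CODON_TABLE:
        raise ValueError("codons must be three letters drawn from A, C, G, T/U")
    if sum(a != b for a, b in zip(original, mutant)) != 1:
        raise ValueError("expected exactly one base difference")
    before, after = CODON_TABLE[original], CODON_TABLE[mutant]
    if before == "*":
        raise ValueError("original codon is a stop codon (not handled here)")
    if after == before:
        return "silent"      # same amino acid
    if after == "*":
        return "nonsense"    # new in-frame stop codon
    return "missense"        # different amino acid


print(classify_codon_change("GAG", "GTG"))  # missense (Glu to Val)
print(classify_codon_change("GAG", "GAA"))  # silent (both Glu)
print(classify_codon_change("UAC", "UAG"))  # nonsense (Tyr to amber stop)
```

Note that the silent example changes only the third (wobble) position of the codon.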

Frameshift mutations: The genetic code is translated three nucleotide bases (one codon) at a time, with no punctuation between the codons. Addition or deletion of a single base pair in the middle of a coding sequence will result in out-of-frame translation of all of the downstream codons, and thus result in a completely different amino acid sequence, which is often prematurely truncated by stop codons (UAG, UAA, UGA) generated by reading the coding sequence out-of-frame. Such mutations, which are a special subclass of point mutations, are referred to as frameshift mutations. Deletion of a single base pair results in moving ahead one base in all of the downstream codons, and is often referred to as a positive frameshift. Addition of one base pair (or loss of two base pairs) shifts the reading frame behind by one base, and is often referred to as a negative frameshift. Note that deletion or addition of three base pairs (or multiples of three) does not cause a frameshift, but instead results in deletion or addition of one or more amino acids in the coded protein.
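The effect of a frameshift can be seen by translating a short made-up coding sequence before and after removing or adding bases. The sketch below is only an illustration: the sequence wild_type is invented for this example, the helper names are hypothetical, and translation simply starts at the first base and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code with DNA letters, codons in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base; stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_bases(dna: str, index: int, count: int = 1) -> str:
    return dna[:index] + dna[index + count:]


def insert_base(dna: str, index: int, base: str) -> str:
    return dna[:index] + base + dna[index:]


# Invented example: ATG GCT CTG ATT GAA TTT GGG TAA
wild_type = "".join(["ATG", "GCT", "CTG", "ATT", "GAA", "TTT", "GGG", "TAA"])
minus_one = delete_bases(wild_type, 3)        # one base deleted
plus_one = insert_base(wild_type, 3, "A")     # one base added
minus_three = delete_bases(wild_type, 3, 3)   # one whole codon deleted

print(translate(wild_type))    # MALIEFG
print(translate(minus_one))    # ML      (out-of-frame TGA ends translation)
print(translate(plus_one))     # MSSD    (out-of-frame TGA ends translation)
print(translate(minus_three))  # MLIEFG  (in frame, one amino acid missing)
```

Both single-base changes scramble the downstream codons and run into a stop codon almost immediately, while the three-base deletion leaves the reading frame intact.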

Expanded triplet repeats: One of the genomic oddities that has been discovered in recent years is the triplet repeat (pages 128-131). These are three-base repeating sequences that can be in either untranslated or translated regions of the mRNA for a particular gene. If the number of repeats exceeds a critical threshold that is characteristic for each gene, there is an increased probability of pathology, often involving either neurological or muscular defects. Fragile X syndrome involves a CGG repeat in the 5'-nontranslated part of the mRNA. Affected individuals may have as many as 700 to 1000 repeats. The long distance from the cap to the start codon reduces the efficiency of translational starts. In addition, the CGG repeats become methylated, reducing the efficiency of transcriptional starts (Fig 5.3 and pages 128-129). Unaffected individuals generally have fewer than 50 repeats, with 29 being the most frequent. Individuals with 50 - 230 repeats are usually unaffected, but their children are at higher risk. They are said to have a "premutation". Huntington disease involves a CAG repeat, which translates into repeated glutamine residues in a coded protein. The number of repeats in any of the triplet repeat diseases can be increased by mispairing during replication (figure 5.6). Mispairing and slippage during replication also appear to be responsible for classical mutational hotspots, which need only a very short repeated sequence (not involving triplets) to occur (see boxed example 5.2, page 134).
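Counting the repeats in a sequence is a simple pattern-matching exercise. The Python sketch below is for illustration only and is not a diagnostic tool. The function names are hypothetical, the test sequence is invented, and the category boundaries are just the CGG repeat numbers quoted in the paragraph above (fewer than 50, and 50 to 230). Because these notes do not give an exact cutoff for affected individuals, anything larger is only labeled as being above the premutation range.

```python
import re


def longest_repeat_run(dna: str, unit: str = "CGG") -> int:
    """Number of copies of `unit` in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{re.escape(unit)})+", dna.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def cgg_repeat_category(n_repeats: int) -> str:
    """Label a CGG repeat count using the ranges quoted in these notes."""
    if n_repeats < 50:
        return "typical range"
    if n_repeats <= 230:
        return "premutation range"
    return "above the premutation range"


# Invented sequence: 29 CGG repeats plus a separate short run of 2.
example = "AT" + "CGG" * 29 + "TT" + "CGGCGG" + "A"
n = longest_repeat_run(example)
print(n, cgg_repeat_category(n))  # 29 typical range
```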

Reversion and suppression: There are many cases where an additional mutational event can reverse the phenotypic effects of a mutation, either by restoring the original sequence, or by causing a second change that compensates for the first. The term "reversion" should only be used to describe a change that restores the original function of the affected gene, either by exact reversal of the original damage (back mutation), or by a compensating change within the same gene (for example, a negative frameshift close to the site of a previous positive frameshift, such that only a few non-critical amino acids in the coded protein are altered). In many cases, a mutation in a second gene can reverse the effect of the original mutation -- this is called suppression, rather than reversion, because the original mutation is still present. (Note that some authors prefer to refer to a second mutation within the original gene as intragenic suppression, with the term "reversion" being reserved entirely for a direct reversal of the original damage.)
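The compensating frameshift example can be made concrete by splitting a sequence into codons and seeing which codons differ from the original. The sketch below uses an invented sequence and hypothetical helper names. A single base is deleted (a positive frameshift in the terminology of these notes) and then a single base is added a few codons downstream (a negative frameshift).

```python
def codons(dna: str) -> list[str]:
    """Split a sequence into complete codons, starting at the first base."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def changed_codons(reference: str, mutant: str) -> list[int]:
    """Positions (0-based) at which the mutant codon differs from the reference."""
    pairs = zip(codons(reference), codons(mutant))
    return [i for i, (ref, mut) in enumerate(pairs) if ref != mut]


# Invented example: ATG GCA GCA GAA GGT CCA TTT AAG TAA
wild_type = "".join(["ATG", "GCA", "GCA", "GAA", "GGT", "CCA", "TTT", "AAG", "TAA"])

# First mutation: delete one base inside the second codon.
single = wild_type[:4] + wild_type[5:]

# Second mutation in the same gene: add one base a few codons downstream.
double = single[:9] + "C" + single[9:]

print(changed_codons(wild_type, single))  # [1, 2, 3, 4, 5, 6, 7]
print(changed_codons(wild_type, double))  # [1, 2, 3]
print(codons(double))
```

With the first mutation alone, every downstream codon is altered. After the second mutation only the three codons between the two sites differ from the original, and the reading frame is restored from that point on.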

Non-reverting mutations: Mutations that are capable of reversion usually involve only minimal changes in the original DNA sequence, such as transitions, transversions, or frame shifts. Major deletions, as well as most other large changes in the DNA coding sequence, disrupt gene function severely enough so that there is no spontaneous reversion. In many experimental situations, it is important to have a strictly non-reverting mutation so that revertants will not mask experimental results. In these cases, researchers generally use strains that have large segments of the gene in question deleted. For example, if an investigator is trying to identify a cloned sequence that will restore a function that has been lost through mutation, it is important to know that no reversion is occurring in the mutant strain.

Conditional mutations: Some types of mutations exert their phenotypic effects only under certain environmental conditions. Such mutations are called conditional mutations.

Temperature-sensitive (ts) mutations are missense mutations that do not seriously affect the biological activity of the coded proteins at reasonably low temperatures, but cause them to have a reduced thermal stability. Such proteins become denatured at temperatures that do not affect the corresponding wild-type proteins. However, when the mutant strains are maintained at a lower temperature, the proteins are still able to function reasonably well, and no mutant phenotype is observed. Temperature-sensitive mutations are particularly useful for studying vital functions, such as progression through the cell division cycle. In order to maintain stock cultures of organisms carrying such mutations, it is necessary to be able to expand populations under conditions where the mutations are not expressed phenotypically. Growth at low temperature and analysis of the mutant phenotype at a higher temperature provides such a system.

Nonsense suppression: Another approach to conditional mutation that is used extensively in studies on bacterial viruses is to generate nonsense mutations involving the amber codon (UAG). Viruses bearing such mutations can often be maintained in amber suppressor strains of bacteria and then transferred to regular strains to study the phenotypic effects of the mutations. The amber suppressor strains contain an altered transfer RNA that inefficiently reads the UAG codon as coding for an amino acid (figure 5.4). If the protein is able to function with that particular amino acid inserted at the location of the amber mutation, the virus is able to replicate, although often with reduced efficiency, in the amber suppressor strain.

Permissive and nonpermissive conditions: The conditions that allow growth or function without phenotypic expression of conditional mutations are referred to as permissive. The conditions that cause phenotypic expression of the mutation to occur are referred to as nonpermissive. This nomenclature refers primarily to conditions that permit growth or do not permit growth, but can also be used for other types of conditional mutations, such as loss of pigmentation in body parts that experience higher temperatures in Siamese cats and Himalayan rabbits (Figure 13.25, page 402 in the textbook). The nomenclature can sometimes be confusing. Permissive conditions PERMIT the mutant organism to display a non-mutant phenotype (such as growth at low temperature of a temperature-sensitive mutant strain of bacteria that is unable to grow at higher temperatures). Nonpermissive conditions result in full expression of the mutant phenotype. They are nonpermissive because they do NOT permit the organism to overcome the effects of the mutation it carries.

Mechanisms of mutation: This portion of the lecture deals primarily with the mechanisms responsible for point mutations and their reversion or suppression (pages 133 - 142).

Tautomerization: Spontaneous mutations that involve base pair substitutions are caused primarily by configurational changes within the individual bases that result in mispairing. These changes, which are called tautomeric shifts, involve momentary expression of rare alternative molecular configurations that exist in equilibrium with the more common forms. Specifically, proton shifts can convert the amino groups in adenine and cytosine to imino groups, and the keto groups in guanine and thymine to enol groups (Figure 5.5).

Transitions: A tautomeric shift in any of the four DNA bases can lead to mispairing of A to C or G to T. The tautomeric state can occur either in the template base or the incoming base. During the next round of DNA synthesis, the mispaired base will pair with its normal partner, resulting in a transition, in which an AT base pair replaces a GC or a GC replaces an AT, with no change in the purine:pyrimidine polarity of the base pair. Transitions are the most common type of mutation resulting from spontaneous mispairing due to tautomerization.
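The two rounds of replication described above can be traced step by step. The following Python sketch is a simplified bookkeeping exercise, not a model of the chemistry: it assumes the rare tautomer occurs in the template base during the first round only, and it uses the mispairings named in the paragraph above (A with C, G with T). The function name is hypothetical.

```python
NORMAL_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Partner chosen when a base is momentarily in its rare tautomeric form.
TAUTOMER_PARTNER = {"A": "C", "C": "A", "G": "T", "T": "G"}
PURINES = {"A", "G"}


def replicate_with_template_tautomer(template_base: str) -> tuple[str, str]:
    """Return (original base pair, mutant base pair) after two replications."""
    original_pair = template_base + NORMAL_PARTNER[template_base]
    # Round 1: the template base is in its rare form and mispairs.
    mispaired = TAUTOMER_PARTNER[template_base]
    # Round 2: the mispaired base serves as a template and pairs normally.
    new_base = NORMAL_PARTNER[mispaired]
    mutant_pair = new_base + mispaired
    return original_pair, mutant_pair


for base in "ATGC":
    original, mutant = replicate_with_template_tautomer(base)
    # A transition keeps a purine where there was a purine (or a pyrimidine
    # where there was a pyrimidine) on the strand being followed.
    is_transition = (original[0] in PURINES) == (mutant[0] in PURINES)
    print(f"{original} -> {mutant}  transition: {is_transition}")
```

In every case the purine stays on the same strand, so an AT pair is replaced by GC or a GC pair by AT, as expected for a transition.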

Transversions: To achieve a transversion, in which the positions of purine and pyrimidine are reversed in the DNA double helix, two events are thought to be involved, tautomerization of one of the bases and rotation of the other to yield a purine:purine pairing. Based on information from a previous textbook for this course, the frequency of spontaneous transversions, which is lower than that of transitions, appears to be consistent with this interpretation. However, that book also warns that recent studies suggest that the overall picture may be more complex. Our current text does not discuss transversions in much detail. A second possible mechanism for transversions is the formation of an apurinic site (described below), which can result in replacement of the original purine with any of the four bases.

Frameshifts, triplet expansion, and mutational hotspots: Spontaneous frameshift mutations are believed to arise primarily from slippage and misalignment during replication within a sequence containing several repeats of the same base in a coding sequence. A similar phenomenon appears to be involved in the expansion of triplet repeats. So-called mutational hotspots within certain genes may also be caused by misalignment during replication in relatively short runs of a repeated base (boxed example 5.2, page 134).

Deamination: When a cytosine undergoes oxidative deamination, it becomes uracil, which is capable of pairing with adenine and causing a transition. The presence of uracil in DNA tends to be detected as an anomaly and may trigger repair mechanisms. However, if cytosine has become methylated to form 5-methylcytosine, as often occurs in DNA, deamination converts it to thymine, which is a normal DNA base that is not detected by repair systems (other than proofreading of GT mispairing during DNA synthesis). Because of selective methylation of CG sequences in many DNAs, there is a tendency for all non-essential CG sequences to be converted to TG sequences over time (or to CA sequences when the change occurs on the complementary strand). Methylated CG sequences are thus hot spots for mutation, such that in DNA in general, CG sequences tend to be far less frequent than TG sequences. (Remember that a sequence is always described in 5' to 3' terms, such that CG means 5'-CG-3').

Spontaneous mutation rate: For single-celled organisms ranging from bacteria to cultured mammalian cells, mutation rate is usually measured as the probability of mutation within a specific gene per cell division. For higher animals, the rate is measured in terms of the probability per gamete per generation (remember that each new individual contains the contributions from two separate gametes). Bacterial rates are typically in the range of 10^-8 to 10^-6 per generation. Mammalian (including human) rates for individual easily observed mutations tend to be on the order of 10^-5 per generation (data from the previous text for this course).

Chemical mutagenesis: A variety of chemical mutagens have been discovered that act in several distinctly different ways. Many chemicals that are used in modern industry and technology are potentially mutagenic, which includes their ability to cause cancer as a result of somatic mutation, as discussed in chapter 24 of our textbook (which is not covered in this course due to lack of time and the complexity of the growth regulatory genes that are involved).

Base analogues: The base analogues used in mutagenesis are substances that are sufficiently similar to naturally occurring DNA bases that their deoxyribonucleotide triphosphates can be incorporated into DNA in place of the normal bases. However, they also have anomalous base-pairing properties, leading to an increased rate of mutagenesis. For example, 5-bromouracil (figure 5.13) pairs like thymine (5-methyluracil), but undergoes more enol tautomerization, leading to more frequent mispairing with guanine. Similarly, 2-aminopurine (figure 5.12) normally pairs with thymine, but can also pair with cytosine. These mispairings lead to an increase in the frequency of transitions.

Nitrous acid: Treatment of DNA with nitrous acid leads to deamination of cytosine and adenine, again resulting in transitions, as described above for spontaneous deamination (Figure 5.15). Note the use of nitrous acid as a base-specific mutagenic agent in boxed example 4.4, pages 104-105.

Hydroxylating agents: Hydroxylamine adds a hydroxyl group to the amino group at position 4 of cytosine, causing it to pair with A instead of G. Hydroxylamine thus causes a very specific CG to TA transition (figure 5.15).

Alkylating agents: Certain alkylating agents, such as ethyl methane sulfonate (EMS) and ethyl ethane sulfonate (EES), add alkyl groups to purines. This can cause mispairing and can also destabilize the bond between the purine and deoxyribose, leaving apurinic sites. The absence of a base-pairing partner at an apurinic site allows any base to be inserted during the next round of DNA synthesis. This frequently leads to transversions (as well as transitions).

Intercalation: Certain flat aromatic molecules, such as acridine orange and proflavin, become inserted between base pairs in DNA, which can lead to misalignment during replication and the occurrence of frameshift mutations (fig 5.14).