MCDB 2150
Answers to Fourth Examination, November 18, 1998

Answers adequate for full credit are shown in bold-face type. Alternative answers may be possible in some cases. Additional comments about the question or about answers that were not acceptable for full credit have been added in regular type in some cases.


1. (25 points, 5 for each part) Briefly explain the importance of each of the following in a manner that also makes it clear that you know what the item is.

a. Enhancer

An enhancer is a DNA sequence whose presence enhances transcription from a nearby promoter. The enhancer is not part of the promoter and can be located upstream or downstream, or even in an intron of the gene whose transcription it increases. It can be reversed in direction without losing activity. The enhancer appears to act as a binding site for regulatory proteins that also interact with components of the transcriptional initiation complex at the promoter site.

b. Tautomerization

Tautomerism is a change in charge distribution that alters the pattern of pairing of the nucleotide bases in double helical DNA. There are two common types, keto to enol, and amino to imino. Any of the four bases can be affected, and either type of change can result in GT or AC base pairing. Tautomerism appears to be the source of most spontaneous transition mutations.

c. Reporter gene

A reporter gene is a gene that produces an easy-to-assay product and is capable of being activated by whatever promoter is joined to it. Reporter genes are used to determine the conditions needed for a particular promoter to be active and to determine where and when specific genes are normally expressed. This topic was beyond the cutoff point in lecture 30 and should not have been included in this examination. Answers were marked to show whether they were right or wrong, but because of the error, everyone received a full five points for this question, even if they did not answer at all.

d. Photoreactivation

Photoreactivation is a DNA repair process that removes pyrimidine dimers that have been caused by ultraviolet irradiation. It uses energy from visible light, together with a special photoreactivation enzyme, to directly reverse the dimerization process. It is the first line of defense against ultraviolet damage to DNA.

e. Aminoacyl-tRNA synthetase

Aminoacyl-tRNA synthetase enzymes link specific amino acids to their corresponding tRNAs. They are responsible for associating specific amino acids with the tRNAs that carry the anticodons that recognize the mRNA codons for those amino acids. They can be viewed as the only bilingual molecules in the process of translation.



2. (25 points, 5 for each part). Restriction endonucleases are widely used in recombinant DNA research.

a. Distinguish between exonuclease and endonuclease in a manner that makes it clear you understand what each is and how they differ.

An exonuclease starts at one end of a nucleic acid and degrades it by removing one nucleotide at a time. An endonuclease is capable of cleaving internal bonds in a nucleic acid, cutting it into shorter pieces. Depending on the type, exonucleases may start from either the 5'-end or the 3'-end of a single stranded nucleic acid, or from either end of a double helical DNA. Restriction endonucleases are a special class of endonucleases that cut only at highly specific target sites, as described in part b of this question.

b. What are the characteristic features of a target site for cutting by a restriction endonuclease?

The sites are palindromic, such that the double-stranded DNA reads the same 5'- to 3'- on both strands. An example is the Eco RI cut site:

5'-GAATTC-3'
3'-CTTAAG-5'

The cut sites usually consist of 4, 6, or 8 base pairs, but some have odd numbers with a single base in the middle.
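The palindrome property can be checked mechanically by comparing a site with its own reverse complement. The short Python sketch below does this for the Eco RI site. The function names are illustrative assumptions, and the sketch handles only the four standard bases, so it does not cover the odd-length sites with an unspecified middle base.

```python
# Complement lookup for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_palindromic_site(seq):
    """True when both strands read the same in the 5' to 3' direction."""
    return seq.upper() == reverse_complement(seq)

print(is_palindromic_site("GAATTC"))  # Eco RI site: True
print(is_palindromic_site("GAATTA"))  # hypothetical non-site: False
```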

c. What range of frequencies of cutting is encountered with the various restriction endonucleases that are currently in widespread use?

For 4 base sites, (1/4)^4 = 1/256

For 6 base sites, (1/4)^6 = 1/4096

For 8 base sites, (1/4)^8 = 1/65536

These frequencies are for DNA that is 50% GC. For DNAs that have different base compositions, the frequency of cutting is determined by the product of relative frequencies for each of the individual nucleotides in the recognition site.
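The product rule for other base compositions can be worked through with a few lines of Python. This is only a sketch: the function name and the `gc_content` parameter are assumptions made for illustration, and it assumes a random sequence in which G and C are equally frequent, as are A and T.

```python
def site_frequency(site, gc_content=0.5):
    """Expected frequency of `site` at any given position in random DNA.

    Assumes G and C each occur with probability gc_content / 2,
    and A and T each with probability (1 - gc_content) / 2.
    """
    p_gc = gc_content / 2.0
    p_at = (1.0 - gc_content) / 2.0
    freq = 1.0
    for base in site.upper():
        if base in "GC":
            freq *= p_gc
        elif base in "AT":
            freq *= p_at
        else:
            raise ValueError(f"unexpected base: {base!r}")
    return freq

# Average spacing between sites is the reciprocal of the frequency.
print(round(1 / site_frequency("GAATTC")))        # 50% GC: 4096
print(round(1 / site_frequency("GAATTC", 0.4)))   # 40% GC: 3086
```

In this model the AT-rich Eco RI site is expected more often in DNA that is only 40% GC, because four of its six positions are A or T.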

d. What is the size of the restriction endonuclease restriction site that is most useful for routine gene cloning operations? Explain the reasoning behind your answer.

A six base site yields fragments that average just over 4000 bp. This is large enough for most coding sequences and small enough to work with in most common vectors. The fragments from a four base cutter are too small to contain complete coding sequences, and the fragments from an eight base cutter are so large that they are difficult to work with.

e. What is a sticky end and why is it considered useful in gene cloning?

Many restriction endonucleases make uneven cuts resulting in 2- or 4-base single-stranded overhangs. For example, Eco RI leaves a 5'- overhang of AATT. When two identical overhangs encounter each other, there is enough base-pairing to make them associate temporarily. This "stickiness" holds the overlapping ends together long enough for ligase to form covalent bonds between them. This greatly facilitates cloning in vectors that have been cut with the same nucleases as the DNA that is being cloned.



3. (25 points, 5 for each part). DNA sequencing techniques are widely used in molecular genetic studies.

a. How is selective chain termination achieved in the most widely used DNA sequencing procedures?

A 2',3'-dideoxynucleotide is added at the 3'-end of a growing chain. Because it lacks a 3'-OH for adding another dNTP, it terminates chain growth.

b. What prevents all growing chains from being terminated at the same length?

A small amount of the dideoxynucleotide triphosphate is present together with a larger amount of the normal dNTP. There is only a small chance of termination each time a new base of that type is added to the growing chain. This results in some termination every time that particular nucleotide occurs in the sequence, while allowing most of the chains to continue growing so that additional terminations can occur further downstream.

c. Explain the role played by electrophoresis in determining DNA sequences.

Electrophoresis is used to separate the terminated fragments by size. A band appears on the gel for each fragment length produced. By reading four parallel lanes from the four chain stoppers, one can read the sequence as a series of successively longer fragment lengths (seen as progressively more slowly migrating bands) in the four lanes.

d. What is an open reading frame and why is its identification important during DNA sequencing?

An open reading frame (ORF) is a sequence that begins with a start codon and continues for a significant number of codons with no stop codons. It reflects a probable coding sequence for a protein. Stop codons are expected roughly every 20 codons in truly random sequence DNA. Thus a longer sequence with no stop codons is likely to be part of an ORF. The presence of introns in eukaryotic genomic sequences can make the detection of an ORF somewhat more complicated.
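The "roughly every 20 codons" figure follows from 3 of the 64 codons being stop codons. The Python sketch below shows that arithmetic together with a deliberately simple ORF scan. The function name and test sequence are illustrative assumptions, and the scan covers only the three forward reading frames and ignores introns.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# 3 of the 64 possible codons are stops, so random sequence
# is expected to hit one about every 64 / 3 codons.
expected_codons_between_stops = 64 / 3
print(round(expected_codons_between_stops, 1))  # 21.3

def longest_orf(seq):
    """Return the longest ATG...stop stretch in the three forward frames."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None
    return best

# Hypothetical sequence with a short ORF in the third frame.
print(longest_orf("CCATGGCTTGTTAAGG"))  # ATGGCTTGTTAA
```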

e. You are sequencing a gene that contains a mutation causing a mouse to lack a functional enzyme. The coding sequence is identical to wild-type except for one missing nucleotide near the 5'-end. What type of mutation does the mouse have and how does it prevent the formation of a functional enzyme?

Deletion of a single nucleotide near the 5'-end of a coding sequence would result in a frameshift mutation near the N-terminus of the protein. The amino acid sequence would be drastically altered beyond the point where the mutation occurred. It is also likely that a stop codon would be encountered relatively soon as a result of reading the code out-of-frame.
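The effect of a single-base deletion can be seen by translating a short sequence before and after the deletion. The Python sketch below uses the standard genetic code. The coding sequence is a made-up example, not the mouse gene in the question, and the function name is an assumption.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino = CODON_TABLE[seq[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)

wild_type = "ATGGCTTACGATAGCCTGAAATAA"   # hypothetical coding sequence
mutant = wild_type[:3] + wild_type[4:]   # delete the fourth nucleotide

print(translate(wild_type))  # MAYDSLK
print(translate(mutant))     # MLTIA
```

In this example every amino acid after the initial methionine changes, and the shifted frame runs into a TGA stop codon two codons earlier than the normal stop.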



4. (25 points, 5 for each part). You have a cDNA for a relatively small mouse protein. You use primers based on the 5'-end of the coding sequence and the reverse complement of the 3'-end of the coding sequence to do PCR amplification of DNA from a mouse.

a. What special property must be possessed by the DNA polymerase used in PCR and why is this so important for doing PCR?

The polymerase must be sufficiently heat stable to survive repeated cycles of raising the temperature sufficiently to separate the strands of DNA in order to initiate new cycles of replication. Many answers addressed properties that all DNA polymerases must have, such as the ability to initiate synthesis at the 3'-ends of primers, or to initiate repeated cycles of replication. Such properties are not unique to PCR.

b. When you do sequence analysis on the PCR product, you find sequences in the PCR product that are not present in the cDNA. Explain how this can happen.

By far the easiest answer is that the mouse genomic DNA contains introns that have been spliced out of the mRNA used to generate the cDNA. Priming at the ends of the coding sequence will amplify all of the exons coding for the protein plus all introns that are located between the ends of the coding sequence. Many answers suggested that the additional sequences might have come from contamination. This has two problems. First, in order to be amplified, the contaminant must be primed by the primers that are being used. Second, if such priming does occur, the contaminant sequence would be superimposed on top of the coding sequence (and introns), rather than existing as a separate cleanly read sequence. As pointed out in a later lecture (Nov. 23), contamination is most likely to be a problem in procedures such as DNA fingerprinting where the initial sample is very small and contaminating DNA is as likely as the sample to be primed. Several answers suggested that polymerase errors might be involved. These would be expected to insert single base mistakes rather than totally different sequences.

c. You also find that the PCR product does not contain all of the sequences that are present in the cDNA. Explain why this is so.

The cDNA and the mRNA it was prepared from extend beyond the ends of the coding sequence (which was all that was primed) in both directions. These 5'- and 3'-untranslated sequences would not be amplified. In addition, most cDNAs contain at least some of the poly(A) sequence from the 3'-end of the mRNA, which is not represented at all in the genomic DNA that is being amplified. Some answers suggested that synthesis failed to proceed over the entire distance between the primers. If this happened, the process would totally stop. PCR can continue working only when each cycle generates strands long enough so they can be primed in the opposite direction in the next cycle. Deletions are also unlikely to occur during the amplification process, which typically generates a product of a highly uniform length.

d. The first 60 base pairs at the 5'-end of the sense strand of the PCR product are identical to those of the cDNA coding sequence, except at one position where the sequencing gel for the PCR product contains two bands of equal intensity in the C and T lanes. What is the most likely explanation?

The most likely answer here is that the mouse is heterozygous for a point mutation, such that one chromosome contains a C and the other contains a T at this particular position in the coding sequence. Many answers suggested that this must be a wobble base that would have no effect on coding. While a silent mutation of this type is certainly possible, it is also possible that the mouse could be heterozygous for a missense mutation in the protein under discussion. Another interesting possibility that was suggested was a replication error during an early PCR cycle, which then got amplified along with the normal sequence. This could happen, but it would not be expected to generate equal amounts of C and T. Even if the error occurred in the first cycle, only one of the four strands at the end of that cycle would be mutant, and one would expect a final ratio of 3/4 wild-type sequences and 1/4 mutant sequence.
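The 3/4 to 1/4 ratio can be reproduced with a small strand-counting sketch in Python. It rests on strong simplifying assumptions that are not part of the question: a single starting duplex, perfect doubling in every cycle, and exactly one erroneous strand made in the chosen cycle. The function name and parameters are illustrative.

```python
from fractions import Fraction

def mutant_fraction(error_cycle, total_cycles):
    """Fraction of strands carrying the error after `total_cycles` cycles."""
    strands = 2   # one starting duplex
    mutant = 0
    for cycle in range(1, total_cycles + 1):
        new_strands = strands      # every strand is copied once per cycle
        new_mutant = mutant        # copies of mutant strands are mutant
        if cycle == error_cycle:
            new_mutant += 1        # one fresh copy picks up the error
        strands += new_strands
        mutant += new_mutant
    return Fraction(mutant, strands)

print(mutant_fraction(error_cycle=1, total_cycles=30))  # 1/4
print(mutant_fraction(error_cycle=2, total_cycles=30))  # 1/8
```

Under these assumptions the mutant fraction is fixed once the error has occurred and never reaches 1/2, which is why equal band intensities point to heterozygosity rather than a polymerase error.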

e. You clone the PCR amplification product into a cloning vector and sequence a randomly selected clone. Would you expect to see the double bands for C and T in the sequence of the clone? Explain the reasoning behind your answer.

No. Any one clone would be expected to contain only one double-stranded DNA sequence. Thus a randomly selected clone would be either C or T, but not both. However, if a population of cloned sequences were examined, one would expect to see roughly equal numbers of C and T clones, and if they were pooled for sequencing, one would then see the double bands for C and T at the position of the mutational difference.



Grade distribution:

90-99, 11
80-89, 8
70-79, 13
60-69, 10
50-59, 6
40-49, 4
30-39, 1
20-29, 1

Number of examinations: 54; Average grade = 71.7