This is Old Lecture 12 (Tetrad analysis has been moved to Lecture 15)
Major concepts
Introduction: This lecture begins with a discussion of linkage mapping in humans that goes somewhat beyond the scope of the material at the end of Chapter 5. These notes will serve as the primary text material for those topics. The remainder of the lecture deals with material from Chapter 6 on recombination and mapping in bacteria. In order to avoid unnecessary overlap with MCDB 3500, we will consider only a few highlights from the portion of Chapter 6 that deals with bacterial genetics, as indicated by the items described in these notes. I recommend that you read all of the assigned section of Chapter 6, but you will only be held responsible for those items that have been included in these notes. We will delay our discussion of viral genetics (pages 168-179) until we are ready to examine complementation at a molecular level later in the semester.
Human chromosomal mapping: Because of small numbers of progeny and lack of ability to do controlled matings, alternative methods must be used to construct linkage maps and make chromosomal assignments of genes in humans. Direct sequencing of the entire human genome, which is now being undertaken, will ultimately provide all of the needed information. However, a variety of other approaches have also provided valuable information. We will discuss a few of them here and more in the molecular portion of this course.
Human linkage groups: The phenomenon of sex linkage makes it relatively easy to identify human genes that are carried on the X-chromosome. In addition, examination of the male progeny of women who are known to be heterozygous for two different X-linked genes can provide at least a crude estimate of crossover frequency and map distance. Because humans have 22 autosomal linkage groups and because many linked genes are too far apart to be seen as linked in simple pedigrees, it is much more difficult to verify autosomal linkage. One technique that is widely used to test the hypothesis that two genes may be linked is the use of lod scores (log of odds), as described below.
Lod scores: (This topic is not covered in the textbook. Copies of pages 125-126 from Tamarin, Principles of Genetics, 5th Edition, W.C. Brown, 1996, will be distributed to the class.) In order to provide statistical evidence that two genetic loci appear to be linked, one must demonstrate the apparent absence of independent assortment. However, the analysis must also take into account the expected degree of recombination of the two loci, based on their proposed map distance.
Initial estimate of recombination frequency: It is usually necessary to combine data from many separate pedigrees to obtain a large enough sample size to conclude that linkage is highly likely. However, for purposes of illustration, an example can be based on a single pedigree. A pair of genes that is suspected of being linked is selected, based on an extended family pedigree or several independent pedigrees. A rough estimate of the recombination frequency is then made, based on the available data. In the example used in the Tamarin excerpt, the genes that are suspected of being linked produce phenotypes (dominant Nail-Patella syndrome, and codominant A and B blood types) that make it possible to see which alleles are present at both of the loci in each child studied. One of the eight children in the pedigree being studied exhibits apparent recombination, leading to an initial estimate of 12.5 map units between the two loci.
Ordered probability of observed births based on assumed linkage: Each recombination event generates two recombinant gametes. Since the probability of recombination is 0.125, the probability of a child receiving a particular one of the two recombinant chromosomes is 1/2 the recombination frequency (0.0625). Similarly, the probability that recombination will not occur (based on the initial estimate) is 0.875, making the probability of a child receiving a particular one of the two non-recombinant chromosomes 0.4375. Since each birth is an independent event, the product rule applies. The ordered probability for all of the births, based on the assumed degree of linkage, is calculated by multiplying all of the individual probabilities together.
Ordered probability based on assumption of no linkage: To determine whether the proposed linkage is a better fit than random probability, the ordered probability of the observed births is also calculated, based on the assumption that there is no linkage (0.5 probability of receiving a particular allele of each locus and thus 0.25 probability for any particular pair).
Calculation of lod score: The ordered probability of obtaining the observed births based on the assumption of linkage divided by the ordered probability based on the assumption of no linkage gives a measure of how much greater (if any) the probability based on linkage is relative to the probability without assuming linkage. Because the numbers that are obtained sometimes become very large, the results are usually reported as the logarithm to the base 10 of the ratio, commonly referred to as the "lod" (log of odds) score. In the example we have been analyzing (seven non-recombinant children and one recombinant child), the lod score (Z) is calculated as follows: Z = log10 [(0.4375)^7 x (0.0625)^1 / (0.25)^8] = log10 (12.57) = 1.10.
Confirmation of linkage: A positive lod score indicates the odds are greater than 1:1 that there is linkage. A lod score of 3.0 (1000:1 odds) or more is considered to be strong confirmation of linkage, while lower positive values are considered suggestive of linkage. Negative values suggest that the hypothesis being tested is wrong. In the example above, a lod score of just over 1.0 suggests slightly greater than 10:1 odds that there is linkage.
Most likely recombination frequency: In a more sophisticated computerized analysis, it is possible to vary the proposed recombination frequency (theta) over a wide range of values. The value of theta that yields the highest lod score is considered to be the most likely recombination frequency, as shown in table I of the Tamarin handout. Note that an assumption of absolute linkage (theta = 0) will give a lod score of minus infinity if a single recombination occurs. (If one of the births whose probabilities are multiplied together to form the numerator has a zero probability, multiplying that zero into the numerator causes the entire numerator to become a zero and the log of zero is minus infinity).
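The lod calculation and the search for the most likely theta can be sketched in a few lines of Python. This is only an illustration, not the program used for the Tamarin table: the function name lod_score and the grid of theta values are assumptions made here, while the counts (one recombinant and seven non-recombinant children) come from the example above.

```python
import math

def lod_score(theta, n_recombinant, n_nonrecombinant):
    """Lod score Z for a proposed recombination frequency theta.

    Under linkage, each recombinant child has probability theta/2 and each
    non-recombinant child (1 - theta)/2. Under no linkage, every child has
    probability 0.25.
    """
    n_total = n_recombinant + n_nonrecombinant
    p_linked = (theta / 2) ** n_recombinant * ((1 - theta) / 2) ** n_nonrecombinant
    p_unlinked = 0.25 ** n_total
    if p_linked == 0:
        # A zero numerator (e.g. theta = 0 with an observed recombinant)
        # corresponds to a lod score of minus infinity.
        return float("-inf")
    return math.log10(p_linked / p_unlinked)

# Lod score at the initial estimate of theta = 0.125 (about 1.099).
z_initial = lod_score(0.125, 1, 7)

# Assumed grid: theta from 0 to 0.5 in steps of 0.025.
thetas = [i / 40 for i in range(21)]
best_theta = max(thetas, key=lambda t: lod_score(t, 1, 7))

print(round(z_initial, 3), best_theta)
```

On this grid the highest lod score falls at theta = 0.125, the same value as the initial estimate, and theta = 0.5 (no linkage) gives a lod score of zero.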
Chromosome banding: Several techniques are available that produce banded patterns of staining of human chromosomes (see pages 502-503 for detail). Through use of these techniques, it is possible to produce unique patterns of staining that permit individual human chromosomes to be identified cytologically (Fig. 17.13). This ability is important for chromosomal identification in the cell hybridization techniques described below.
Cell hybrids: Mammalian cells from various species can be fused in ways that generate viable hybrid cells whose genomes are essentially the summation of the two parental genotypes. When normal human cells are fused with rapidly growing mouse or Chinese hamster cell lines, human chromosomes are preferentially lost from the hybrids, often leaving lines with only one or just a few human chromosomes. In cases where these cells can be shown to produce a specific human protein, the gene coding for that protein can be assigned to one of the human chromosomes in the hybrid cell. In many cases, it is necessary to examine several lines that each contain a few human chromosomes to establish a correlation between a specific human protein and a specific human chromosome (Fig. 5.21).
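The logic of correlating a protein with a chromosome across several hybrid lines can be sketched as follows. The panel of hybrid lines below is entirely hypothetical (invented for illustration, not taken from Fig. 5.21), and the function name candidate_chromosomes is an assumption. A chromosome remains a candidate only if it is present in every line that makes the protein and absent from every line that does not.

```python
def candidate_chromosomes(hybrid_lines):
    """Return chromosomes whose presence matches protein detection in every line.

    hybrid_lines: list of (set of human chromosomes retained, protein detected?)
    """
    all_chromosomes = set().union(*(chroms for chroms, _ in hybrid_lines))
    return sorted(
        c for c in all_chromosomes
        if all((c in chroms) == detected for chroms, detected in hybrid_lines)
    )

# Hypothetical panel of four hybrid lines (illustrative data only).
panel = [
    ({"2", "7", "17"}, True),
    ({"7", "11"}, False),
    ({"2", "17", "X"}, True),
    ({"2", "11"}, False),
]

candidates = candidate_chromosomes(panel)
print(candidates)
```

In this made-up panel only chromosome 17 is concordant with the protein in all four lines. Chromosome 2, for example, is ruled out because the fourth line retains it but does not make the protein.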
Linkage studies: Once a few genes have been assigned to a particular chromosome, linkage studies can often be used to assign other genes to the same chromosome, even if specific gene products cannot be demonstrated in the cell hybrids. Examples include disease genes for which the gene products may not yet be known. Thus, if the inheritance of the disease can be shown through the use of lod scores to be linked to a marker known to be on a certain chromosome, the genetic locus responsible for the disease can also be assigned to that chromosome.
Introduction: This section of the lecture, which is based on Chapter 6, seeks only to provide a general overall understanding of some of the methodology of bacterial genetics. We will examine a few examples of the regulation of gene expression in bacteria at the molecular level in substantial detail later in the semester (lectures 34-35).
Genetic markers: Most of the genetic markers used in bacterial genetics are based on ability to multiply under a particular set of conditions or ability to carry out a particular metabolic process. Wild type bacteria are often able to grow on minimal media that consist only of an appropriate reduced carbon energy source and an adequate mixture of inorganic ions. The common intestinal bacterium, Escherichia coli (E. coli), will be used as an example here. (Please note that despite extensive recent publicity about certain relatively rare strains of E. coli that are highly pathogenic, typical laboratory strains are relatively harmless).
Prototrophs and auxotrophs: A wild type strain that has minimal requirements for exogenously supplied nutrients is referred to as a prototroph. A mutant strain that has lost the ability to synthesize its own supply of a particular nutrient, such as histidine or adenine or thiamine, is called an auxotroph. While the prototroph can be grown on a minimal medium, a histidine auxotroph will not grow on that medium unless histidine is added. Similarly, an adenine auxotroph must have adenine added to its culture medium, etc. In certain cases, loss of any of several enzymes involved in the biosynthesis of a particular nutrient can result in the same auxotrophic phenotype. Thus, for example, a strain of E. coli that is isolated as an auxotroph unable to multiply in the absence of tryptophan might have a loss-of-function mutation in any of five different enzymes that are involved in the biosynthesis of tryptophan.
Ability to utilize substrates: The ability to utilize carbon and energy sources other than glucose often requires induced expression of enzymes that are not otherwise present. Mutations either in the genes coding for those enzymes or in the regulatory systems that are involved in their activation can result in loss of ability to multiply on the alternative substrates. Such mutations (for example, loss of ability to use lactose as an alternative to glucose) are widely used as genetic markers. Other characteristics, such as the size and shape of colonies developed from a single bacterial cell and the ability to resist infection by certain kinds of bacterial viruses, can also be used as markers.
Transformation: Transformation refers to the ability of extracellular DNA to enter bacterial cells and recombine with the bacterial genome, thereby giving the cells new genetic properties. Transformation of bacteria by non-living material derived from other bacteria was discovered in 1928. Demonstration by Avery et al. that the transforming material was DNA, which was published in 1944, was the first direct evidence that DNA carried genetic information. A crude form of genetic mapping can be done by determining the frequency with which two separate mutations are simultaneously reversed by transformation. Loci that are closer together are more likely to be co-transformed.
Conjugation: Bacterial conjugation can be viewed as a primitive form of sex, in which a cell from a donor strain injects DNA into a recipient cell, where it can undergo recombination and become part of the recipient's genome. Donor strains contain an additional genetic element, called the fertility factor (F). The F factor is usually in the form of a plasmid (a small independently replicating circle of DNA not directly associated with the host chromosome). In such cases, the frequency of transfer of any genes other than the F factor is very low (such transfer depends on occasional integration of the F factor into the chromosome, as described below).
Hfr strains: In certain F+ strains, the F factor has become integrated into the bacterial chromosome. When this happens, transfer of chromosomal DNA from the donor to the recipient begins adjacent to the integrated F factor and progresses around the entire bacterial chromosome if the process is not interrupted. Bacterial strains with integrated F factors are referred to as Hfr strains (for "high frequency of recombination").
Mechanisms of DNA transfer: The DNA transfer that occurs in conjugation begins as a single strand break in the donor chromosome (or plasmid), with only one strand transferred through the F-pilus to the recipient cell. The single strands in the donor and recipient are both converted into double strands, restoring the donor chromosome and generating double stranded donor DNA in the recipient. As a result of receiving new DNA from the donor, the recipient is temporarily partially diploid. Recombination can then integrate parts of the transferred DNA into the recipient genome. Any DNA that is not integrated is soon destroyed.
Chromosomal mapping: Map distances in E. coli and other types of bacteria are based on the order and rate of transfer of genes from an Hfr donor to an F- recipient. Because the F factor can integrate into the bacterial chromosome at different sites, transfer can start at different genes. In addition, it can proceed around the E. coli genome in either direction, but the order of transfer of the genes is always the same (or the exact reverse). This provides relative map locations. The relative timing of the transfer of individual genes can be determined by interrupting the process at various times after it is started. Map distances on the circular map of the E. coli genome (Figures 6.8 and 6.9) are expressed as units of time, relative to transfer of the entire genome, which requires approximately 100 minutes under standard conditions.
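How interrupted-mating data translate into a map can be sketched as below. The marker names (geneA to geneD) and their times of first entry are hypothetical values chosen only for illustration, and transfer_map is an assumed helper name. Sorting markers by time of entry gives the gene order for that Hfr strain, and the differences between successive entry times give the map distances in minutes.

```python
def transfer_map(entry_times):
    """Order markers by time of first entry and compute intervals in minutes.

    entry_times: dict mapping marker name -> minutes after mating began at
    which the marker first appears in recipients.
    """
    ordered = sorted(entry_times.items(), key=lambda item: item[1])
    order = [gene for gene, _ in ordered]
    intervals = [
        (g1, g2, t2 - t1)
        for (g1, t1), (g2, t2) in zip(ordered, ordered[1:])
    ]
    return order, intervals

# Hypothetical times of entry for one Hfr strain (illustrative data only).
times = {"geneA": 8, "geneB": 17, "geneC": 25, "geneD": 11}

order, intervals = transfer_map(times)
print(order)
print(intervals)
```

A second Hfr strain with the F factor integrated elsewhere, or in the opposite orientation, would give different entry times, but the same circular order of markers or its exact reverse.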
Merozygotes: A merozygote is a partially diploid cell generated by introducing a partial second genome into a cell with its own complete genome. Although the extra DNA is normally destroyed quite rapidly, a stabilized merozygote can be generated by integrating the partial second genome into a plasmid. One example is integration of a limited number of genes from a bacterial chromosome into the F factor plasmid to generate a stabilized partial diploid called an F' merozygote (Figure 6.11).
Sexduction: The bacterial genes carried on the F' plasmid in an F' merozygote are easily transmitted into F- cells by conjugation, and are maintained as independent plasmids in the recipient cells. This allows the formation of various types of stable F' merozygotes and studies of interactions of genes that are either cis- (both genes on the same "chromosome") or trans- (each gene on a separate "chromosome"). Such interactions will be very important in our study of the coordinated control of clusters of related genes known as operons, later in the semester.
Transduction: Bacterial viruses normally contain viral DNA surrounded by a coat of viral protein. Sometimes fragments of host DNA are incorporated, rather than viral DNA. When a defective virus of this type "infects" a new host cell, that DNA is injected into the host and may recombine with the host genome. Through the use of selective media in which the host cells survive only if they have been transduced with an appropriate gene, it is possible to identify bacteria that have undergone successful transduction and to study patterns of linkage to other nearby genes in their progeny.