Text Assignment: Chapter 8, Pages 216-229.
Major Concepts
Symbol substitution: Standard HTML (the computer language used for web pages) does not have a symbol for the Greek letter beta. Because of this, "ß" is used in these notes as a substitute for beta. This character is actually a German double s. If it does not come out looking at least vaguely like a beta, make sure that your browser is set for the iso8859-1 character set (called Western [Latin-1] on some browsers).
Operons: An operon consists of a group of coordinately controlled genes that are all related to a particular cellular function. Operons occur primarily in prokaryotic organisms. Typically, the genes are physically adjacent to one another and all controlled from a single upstream promoter. A single transcript initiated from the promoter serves as the mRNA for all of the genes controlled from that promoter. Such a transcript is referred to as polycistronic because it contains several protein coding sequences (cistrons) in a tandem arrangement on a single RNA molecule. A "cistron" is the coding unit for one polypeptide chain, defined by complementation testing.
Operator locus: An operator locus, located immediately adjacent to the promoter, controls whether or not the operon is transcribed. A regulatory protein, coded from a gene located outside of the operon, binds to the operator to either repress or induce transcription.
Metabolic control of operons: The activities of the regulatory proteins are altered by the presence or absence of various types of small molecules. For example, E. coli preferentially uses glucose as an energy source when it is available. Under conditions where the supply of glucose is inadequate, genes coding for enzymes that support the utilization of a variety of alternative energy sources can be activated. However, the genes coding for the enzymes needed to use a specific energy source, such as lactose, are only transcribed when that substance is present. Similarly, as we will see in the next lecture, genes coding for enzymes needed to synthesize an essential nutrient, such as the amino acid tryptophan, are only transcribed when the nutrient in question is not available in adequate amounts from the culture medium.
lac operon: This lecture focuses on the lac operon, whose gene products are required for utilization of lactose as an energy source. The lac operon consists of three structural (protein coding) genes whose products are involved in the utilization of lactose.
Dual control: The lac operon is under dual control, repressed by the presence of glucose and induced by the presence of lactose when glucose is absent. A low level of expression is maintained constitutively (always present), thus providing a priming effect for induction. When induction is achieved, there is a sharp increase in transcription of the polycistronic message, which codes for all three proteins. The paragraphs that follow examine the induction of the lac operon by lactose (via allolactose) in the absence of glucose, followed by a description of the mechanisms that prevent induction when an adequate amount of glucose is present.
Operator: The primary operator site (O1) is located right at the transcriptional start site (figure 8.3). A related site (O3) is located just upstream, and a third site (O2) is located downstream, embedded within the ß-galactosidase coding sequence. In the absence of protein interactions, the operator sites have no effect on transcription. However, the lac repressor protein, coded by the nearby lacI locus, forms a tetramer that attaches firmly to the O1 site and somewhat less strongly to the O3 site. This causes the DNA to form a loop, which prevents the initiation of transcription (figure 8.5a).
Induction by allolactose: The tetrameric lac repressor protein has four binding sites for allolactose. When those sites become occupied, the protein undergoes a change in shape (allosteric modification), which causes it to dissociate from the operator site, allowing transcription to proceed (figure 8.5b). Thus allolactose induces the lac operon by a process of de-repression. In order for lactose to induce the operon, there must already be present a low level of permease to get the lactose into the cell and a low level of ß-galactosidase to convert the lactose to allolactose. Mutants that totally lack either the permease or ß-galactosidase cannot be induced by lactose.
Non-metabolized inducer: One of the problems encountered when using lactose or allolactose to induce the lac operon is that as soon as the enzymes of the operon are induced, they begin to destroy the inducer. One way around this problem is to use isopropyl-thio-ß-galactoside (IPTG), which is a potent inducer that is not metabolized. In addition, IPTG passes readily through cell membranes, eliminating the need for active permease in order to obtain induction.
Gene interactions in merozygotes: It is possible to construct merozygotes that carry a complete second copy of the lac operon on an F' plasmid. The use of merozygotes with various mutations on one or both of their lac operons makes it possible to study cis- and trans- interactions among the various genes in the chromosomal and plasmid copies of the operon.
Operator Mutation: The primary operator (O1) is a specific palindromic DNA sequence that is recognized by the lac repressor protein. If this sequence changes (mutates) to a sufficient degree, the repressor can no longer bind to it. This type of operator mutation is denoted by Oc. The "c" stands for constitutive, since this type of operator mutation allows the operon to be constitutively transcribed (always on). Operator mutations act in a cis-dominant mode: they affect only the genes on the same DNA molecule, so they are not overcome by introducing a normal copy of the operator carried on a separate DNA, such as the plasmid in a merozygote.
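To make "palindromic" concrete: a DNA sequence is palindromic when it reads the same, 5' to 3', on both strands, that is, when it equals its own reverse complement. The sketch below checks this and counts how far a mutated sequence departs from that symmetry. The two sequences are short illustrative strings chosen for this example; they are not the lac operator sequence, and the function names are hypothetical.

```python
# Watson-Crick pairing table used to build the complementary strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read in its own 5' to 3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True when both strands read the same in the 5' to 3' direction."""
    return seq.upper() == reverse_complement(seq)


def mismatches_from_palindrome(seq: str) -> int:
    """Count positions where the sequence differs from its reverse complement."""
    return sum(a != b for a, b in zip(seq.upper(), reverse_complement(seq)))


# Illustrative sequences only (NOT the lac operator).
perfect = "GAATTC"
mutated = "GAGTTC"  # a single base substitution in the sequence above

print(perfect, is_palindromic(perfect), mismatches_from_palindrome(perfect))
print(mutated, is_palindromic(mutated), mismatches_from_palindrome(mutated))
```

A single substitution breaks the symmetry at two positions, because the changed base no longer pairs with its partner on the other half of the site.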
Repressor mutations: Mutations in the lacI gene that cause the repressor protein to be unable to bind the operator (denoted I-) will always be recessive when in trans to a wild-type copy of the gene, whose product will bind to the operators of both copies. Mutant repressor proteins that are unable to bind allolactose (denoted IS) will remain bound to the operator locus, and will be dominant in trans, preventing transcription of both host and plasmid operons.
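The cis and trans rules for operator and repressor mutations can be worked through as a small logic model. The following is a minimal sketch, not a quantitative model: it assumes glucose is absent, ignores the low basal level of expression, and treats the inducer as reaching the repressor directly. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LacCopy:
    """One copy of the lac region (chromosome or F' plasmid)."""
    repressor: str = "I+"  # "I+", "I-" (cannot bind operator), "IS" (cannot bind inducer)
    operator: str = "O+"   # "O+" or "Oc" (repressor cannot bind)
    z_gene: str = "Z+"     # "Z+" or "Z-" (no functional beta-galactosidase)


def active_repressor_present(copies, inducer: bool) -> bool:
    """The repressor is a diffusible protein, so any copy can supply it (trans)."""
    for copy in copies:
        if copy.repressor == "IS":
            return True  # never released by inducer
        if copy.repressor == "I+" and not inducer:
            return True  # wild type stays bound until inducer arrives
    return False


def copy_is_transcribed(copy, copies, inducer: bool) -> bool:
    """An operator controls only the genes on its own DNA molecule (cis)."""
    if copy.operator == "Oc":
        return True
    return not active_repressor_present(copies, inducer)


def makes_beta_gal(copies, inducer: bool) -> bool:
    """Functional enzyme requires a transcribed copy that carries Z+."""
    return any(
        c.z_gene == "Z+" and copy_is_transcribed(c, copies, inducer)
        for c in copies
    )


merozygotes = {
    "I- O+ Z+ / F' I+ O+ Z-": (LacCopy("I-", "O+", "Z+"), LacCopy("I+", "O+", "Z-")),
    "I+ Oc Z+ / F' I+ O+ Z-": (LacCopy("I+", "Oc", "Z+"), LacCopy("I+", "O+", "Z-")),
    "I+ Oc Z- / F' I+ O+ Z+": (LacCopy("I+", "Oc", "Z-"), LacCopy("I+", "O+", "Z+")),
    "I+ O+ Z+ / F' IS O+ Z+": (LacCopy("I+", "O+", "Z+"), LacCopy("IS", "O+", "Z+")),
}

for genotype, copies in merozygotes.items():
    print(genotype,
          "| uninduced:", makes_beta_gal(copies, inducer=False),
          "| induced:", makes_beta_gal(copies, inducer=True))
```

In this sketch the first genotype behaves like wild type (I- is recessive), the second is constitutive (Oc is dominant in cis), the third is inducible because Oc sits next to a defective Z gene, and the fourth cannot be induced at all (IS is dominant in trans).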
Structural gene mutations: A mutation that blocks the activity of ß-galactosidase will prevent the conversion of lactose to allolactose. Induction of the operon by lactose will be blocked, but externally-supplied allolactose will still serve as an inducer. The induced operon will not generate a functional ß-galactosidase, but the other two enzymes will be induced normally. A mutation that blocks the activity of the permease will block (or greatly impair) all induction unless artificial means are employed to get lactose into the cell.
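The requirements for each inducer can be summarized as a truth table. This is a rough sketch under stated assumptions: `permease` and `beta_gal` mean "at least a basal level of functional protein is present", and the rule that externally supplied allolactose still needs permease to enter the cell is an assumption made for this example, not something stated in these notes. The function name is hypothetical.

```python
def inducer_reaches_repressor(compound: str, permease: bool, beta_gal: bool) -> bool:
    """Can the supplied compound act on the repressor inside the cell?"""
    if compound == "IPTG":
        # Crosses the membrane unaided and is not metabolized.
        return True
    if compound == "allolactose":
        # ASSUMPTION: uptake still depends on permease; no conversion needed.
        return permease
    if compound == "lactose":
        # Must be imported by permease, then converted by beta-galactosidase.
        return permease and beta_gal
    raise ValueError(f"unknown compound: {compound}")


for compound in ("lactose", "allolactose", "IPTG"):
    for permease in (True, False):
        for beta_gal in (True, False):
            print(compound, "permease:", permease, "beta-gal:", beta_gal, "->",
                  inducer_reaches_repressor(compound, permease, beta_gal))
```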
Control of lac operon by glucose: E. coli preferentially uses glucose as its primary energy supply whenever adequate amounts are available. Negative control of the lac operon by glucose is achieved by a somewhat indirect mechanism. As glucose levels fall, intracellular levels of cyclic adenosine monophosphate (cAMP) begin to rise. When this happens, an intracellular protein called Catabolite Activator Protein (CAP) binds to the cAMP and becomes a positive-acting regulator of the lac operon, binding upstream from the -35 promoter sequence (see figures 8.13 and 8.7). This process was named catabolite repression before the detailed mechanism was well understood. Products of the metabolism of glucose (catabolites) suppress cAMP levels, which in turn prevents binding of the CAP-cAMP complex and thus "represses" transcription of the operon through failure to stimulate that transcription.
Dual control of lac operon: In order to achieve a high level of induction it is necessary 1) for the CAP site to be occupied by a CAP/cAMP complex, and 2) for the lac operator site to not be occupied by the repressor protein. The description of these controls can easily become confusing. Glucose indirectly represses the lac operon by keeping cAMP levels low and blocking active induction by the CAP/cAMP complex. Allolactose induces the operon by binding to the repressor protein, causing it to release the operator, resulting in de-repression of the operon.
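Because the two controls are easy to mix up, it can help to write them as two independent yes/no conditions. The sketch below is a deliberately simplified model: it treats glucose and lactose as simply present or absent, and it labels every state other than full induction as "basal", which glosses over differences among those states. The function name is hypothetical.

```python
def lac_expression(glucose: bool, lactose: bool) -> str:
    """Return 'high' or 'basal' for a wild-type lac operon."""
    # Low glucose -> cAMP rises -> the CAP/cAMP complex occupies the CAP site.
    cap_site_occupied = not glucose
    # Lactose -> allolactose -> repressor releases the operator.
    operator_occupied = not lactose
    if cap_site_occupied and not operator_occupied:
        return "high"
    return "basal"


for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> "
              f"{lac_expression(glucose, lactose)}")
```

Only one of the four combinations (no glucose, lactose present) satisfies both conditions.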
The trp operon, which we will be examining in the next lecture, is even more complex, so be sure to gain a full understanding of the lac operon as soon as possible as a starting point.
APPENDIX
The material that follows is optional. For those who are interested, it offers an opportunity to explore a bit further some of the detailed information that is currently available about the lac operon and other E. coli genes.
Mapping the lac operon: The detailed genetic map of E. coli shown in figure 7.22 places the lac operon at approximately 8 minutes, with a reverse (counterclockwise) orientation relative to the standard circular map. A more precise map, based on the DNA sequence of the entire E. coli genome, places the start codon of the lacI gene at nucleotide 366,734 and the stop codon for the lacA gene at nucleotide 360,473. The details are available online in tabular form from the E. coli genome site (this is a long table that may load slowly -- after it comes on screen, scroll down to 7.8 minutes to find the genes of the lac operon).
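The two coordinates can be tied back to the minute map with a little arithmetic. This sketch assumes a genome of roughly 4,639,221 base pairs, a 100-minute map, and a simple linear conversion between base pairs and minutes; those three values are assumptions for this example and are not taken from these notes. The helper names are hypothetical.

```python
LACI_START = 366_734    # start codon of lacI (from the text)
LACA_STOP = 360_473     # stop codon of lacA (from the text)
GENOME_BP = 4_639_221   # ASSUMPTION: approximate size of the sequenced genome
MAP_MINUTES = 100       # ASSUMPTION: length of the standard circular map


def span_bp(a: int, b: int) -> int:
    """Length of the region between two coordinates, counting both ends."""
    return abs(a - b) + 1


def to_minutes(position: int) -> float:
    """Linear conversion from a nucleotide coordinate to map minutes."""
    return position / GENOME_BP * MAP_MINUTES


print("lacI through lacA spans", span_bp(LACI_START, LACA_STOP), "bp")
print(f"lacA end  ~ {to_minutes(LACA_STOP):.2f} min")
print(f"lacI start ~ {to_minutes(LACI_START):.2f} min")

# The start coordinate is larger than the stop coordinate, so the genes run
# toward decreasing coordinates, i.e. the reverse orientation.
print("reverse orientation:", LACI_START > LACA_STOP)
```

Under these assumptions the region falls a little below 8 minutes, in line with both the "approximately 8 minutes" and "7.8 minutes" figures quoted above.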
Alternatively, you can go to a web page maintained by the Entrez Genome service of the National Center for Biotechnology Information that provides access to the E. coli genome in a searchable form. When you arrive at that page, you will see a heading that says "Feature table". One of the items under it is "Protein-coding genes: 4289", with the number in the form of a hyperlink. If you have a fast connection and an adequate computer, you can click on that link and see a table that lists all 4289 potentially protein-coding genes, including a brief description of each and links to more detailed information. Note that many of the sequences are described as "putative", meaning that they look like they should have a particular function, but it has not been proven. In addition, a substantial number are labeled "orf, hypothetical protein", meaning that they are known only as open reading frames that have the general properties of protein-coding genes.
Alternatively, you can enter 350,000 in the "start from" box on the Entrez Genome opening page for the E. coli genome to obtain a display that includes the entire lac operon and its neighbors on both sides. When that display comes up, you can click on individual genes within the lac operon and obtain a display (in a second browser window) that includes the complete amino acid sequence (in one letter code) for the protein coded by that gene. The display will include information on similar protein sequences reported from other sources.
If you would like to see the nucleotide sequence for the region around the lac operon, click on the "GenBank" box near the top of the protein sequence page described above. This will bring up an extended web page with information on a segment of nearly 11,000 base pairs that includes the lac operon. As you scroll down, information is presented for each gene in that DNA segment. The entire sequence is at the bottom of the page. At the left, opposite the descriptive material for each gene, click on "CDS" for coding sequence or "gene" for the complete gene sequence (they are the same for prokaryotic cells). This will give you the complete coding sequence in a forward direction relative to translation. The big overall sequence at the bottom of the page contains the reverse complements for the genes in the lac operon, which are oriented in the opposite direction from that used for the complete genomic sequence for E. coli.
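If you copy a stretch of the overall genomic sequence and want to read a lac gene in its coding direction, you need its reverse complement. The sketch below shows the operation on a short toy fragment invented for this example (it is not real lac sequence); the function name is hypothetical.

```python
# Pairing table: each base maps to its Watson-Crick partner.
PAIRS = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse, to read the other strand 5' to 3'."""
    return seq.translate(PAIRS)[::-1]


# Toy fragment as it would appear in the genomic (forward) listing.
genomic_fragment = "TTACATGGTCAT"

coding_strand = reverse_complement(genomic_fragment)
print(coding_strand)

# Read in the coding direction, the toy gene begins with a start codon
# and ends with a stop codon.
print(coding_strand.startswith("ATG"), coding_strand.endswith("TAA"))
```

Applying the function twice returns the original fragment, which is a quick sanity check when working with longer sequences.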