Chapter 21

© 2012 Pearson Education, Inc.

Lectures by Kathleen Fitzpatrick

Simon Fraser University

Gene Expression I: The Genetic Code and Transcription


The Genetic Code and Transcription

• The coded information of DNA is used to guide RNA production and the subsequent translation into protein

• The synthesis of RNA molecules is called transcription


The Directional Flow of Genetic Information

• DNA serves as a template for the synthesis of an RNA molecule which then directs the synthesis of a protein product

• Sometimes the RNA itself is the final product

• The principle of directional information flow from DNA to RNA to protein is the central dogma of molecular biology


Transcription and translation

• Transcription refers to RNA synthesis using DNA as a template

• Translation is the synthesis of protein using the information in the RNA

• Messenger RNA, mRNA, is RNA that is translated into protein


Additional types of RNA

• Ribosomal RNA, rRNA, is an integral component of the ribosome

• Transfer RNA, tRNA, molecules serve as intermediaries, bringing amino acids to the ribosome

• Both function during translation


Figure 21-1


Refinements of the central dogma

• There are exceptions to the central dogma

• For example, there are RNA viruses that carry out reverse transcription, using RNA as a template for DNA synthesis

• Other viruses produce RNAs from an RNA template


Figure 21A-1


The Genetic Code

• The relationship between the DNA base sequence and the linear order of amino acids in the protein products is based on a set of rules known as the genetic code

• Beadle and Tatum, studying the bread mold Neurospora crassa, detected a link between gene mutations and proteins


Mutants and metabolic pathways

• Beadle and Tatum grew mutants on minimal medium with metabolic precursors of a particular amino acid or vitamin

• They determined which precursors allowed the growth of each mutant

• They were able to infer that each mutation disabled a single enzymatic step of a metabolic pathway, the one-gene-one-enzyme hypothesis


Most Genes Code for the Amino Acid Sequences of Polypeptide Chains

• Linus Pauling studied the inherited disease sickle-cell anemia, in which the red blood cells assume a sickle shape

• He analyzed hemoglobin using electrophoresis and found that hemoglobin of sickle cells migrated differently from normal hemoglobin

• Vernon Ingram used the protease trypsin to cleave hemoglobin into fragments and then examined the peptides


Figure 21-2


Figure 21-3


Sickle-cell hemoglobin differs from normal hemoglobin

• Ingram found just one amino acid difference between normal and sickle-cell hemoglobin

• The sickle-cell hemoglobin has a valine instead of a glutamic acid; a neutral amino acid instead of a negatively charged one

• This finding required a change to the one-gene-one-enzyme hypothesis, because hemoglobin is not an enzyme


A refined hypothesis

• The hypothesis was refined to the one-gene-one-polypeptide theory: the nucleotide sequence of a gene determines the amino acid sequence of a polypeptide chain

• Charles Yanofsky showed that mutations in the bacterial tryptophan synthase gene corresponded to changed amino acids in the polypeptide


Gene function is complicated

• Most eukaryotic genes contain noncoding sequences among the coding regions of the gene

• Coding sequences can be read in various combinations, each coding for a unique polypeptide chain; this is called alternative splicing

• Some types of genes encode functional RNAs


The Genetic Code Is a Triplet Code

• There are four DNA bases and 20 amino acids

• A doublet code, in which two bases specify a single amino acid, is inadequate as only 16 combinations are possible

• A triplet code, in which combinations of three bases specify amino acids, would have 64 possible combinations, more than enough for all 20 amino acids
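The counting argument above can be checked with a short sketch. The function name `code_capacity` is an illustrative assumption and does not come from the lecture.

```python
from itertools import product

BASES = "ACGT"      # the four DNA bases
AMINO_ACIDS = 20    # number of amino acids that must be encoded


def code_capacity(word_length: int) -> int:
    """Count the distinct code words of a given length built from four bases."""
    return len(list(product(BASES, repeat=word_length)))


for n in (1, 2, 3):
    capacity = code_capacity(n)
    verdict = "enough" if capacity >= AMINO_ACIDS else "too few"
    print(f"{n}-base words: {capacity} combinations ({verdict} for {AMINO_ACIDS} amino acids)")
```

A doublet code gives 4 x 4 = 16 words and a triplet code gives 4 x 4 x 4 = 64, which is why three bases are the minimum word length.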


Frameshift mutations

• The gene is written in a language of three-letter words

• Inserting or deleting a nucleotide causes the rest of the sequence to be read out of phase—this is a shift in the reading frame

• Mutations that cause insertion or deletion of a nucleotide are thus called frameshift mutations
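A minimal sketch of the reading-frame idea, using an English sentence of three-letter words as a stand-in for a gene. The sentence and the helper name `read_in_triplets` are illustrative assumptions.

```python
def read_in_triplets(sequence: str) -> list[str]:
    """Split a sequence into consecutive three-letter words, dropping an incomplete tail."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]


original = "THEFATCATATETHERAT"
deletion = original[:3] + original[4:]          # delete the fourth letter (F)
insertion = original[:3] + "X" + original[3:]   # insert one extra letter

print(read_in_triplets(original))   # every word is intact
print(read_in_triplets(deletion))   # every word after the deletion is shifted
print(read_in_triplets(insertion))  # every word after the insertion is shifted
```

Only the first word survives in the two mutants: everything downstream of the change is read out of phase.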


Figure 21-4


The Genetic Code Is Degenerate and Nonoverlapping

• There are 64 combinations of nucleotide triplets and only 20 amino acids

• The genetic code is therefore degenerate: a particular amino acid can be specified by more than one triplet

• It is also nonoverlapping; the reading frame advances three nucleotides at a time


Figure 21-5


Figure 21-5A


Figure 21-5B


The genetic code

• Although the genetic code is always nonoverlapping, there are cases where a segment of DNA is translated in more than one reading frame

• E.g., some viruses with very small genomes have overlapping genes, and some bacteria have genes that slightly overlap


Messenger RNA Guides the Synthesis of Polypeptide Chains

• The genetic code refers to the order of nucleotides in the mRNA molecules that direct protein synthesis

• mRNA is transcribed from DNA similarly to how DNA is replicated, but with two differences


Differences between mRNA synthesis and DNA replication

• In mRNA synthesis, only one DNA strand is copied, called the template strand; the other strand is called the coding strand because it is similar to the mRNA sequence

• In mRNA synthesis, a uracil base (U) is used instead of thymine
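The relationship between the template strand, the coding strand, and the mRNA can be sketched as follows. The example sequence and the function names are illustrative assumptions; the template strand is written 3′ to 5′ so that the resulting mRNA reads 5′ to 3′.

```python
# Base-pairing rules for building RNA on a DNA template (A in DNA pairs with U in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base-pairing rules between the two DNA strands.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5' to 3') complementary to a template strand written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def coding_strand_of(template_3_to_5: str) -> str:
    """Return the coding strand (5' to 3') that pairs with the template strand."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5)


template = "TACGGATTC"  # hypothetical template strand, written 3' to 5'
mrna = transcribe(template)
coding = coding_strand_of(template)

print("template:", template)
print("coding:  ", coding)
print("mRNA:    ", mrna)
```

The mRNA matches the coding strand base for base, except that U appears wherever the coding strand has T.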


Cell-free systems

• Nirenberg and Matthaei pioneered the use of cell-free systems for studying protein synthesis

• They decided to add synthetic RNAs of known sequence to the cell-free system

• They used polynucleotide phosphorylase to make synthetic RNA molecules of predictable base composition


Working out the genetic code

• RNA made from a single type of ribonucleotide is called a homopolymer

• When poly(U), but not other homopolymers, was added to the cell-free system, a large amount of phenylalanine was incorporated, suggesting that UUU specifies phenylalanine


The Codon Dictionary Was Established Using Synthetic RNA Polymers and Triplets

• RNA triplets, called codons, are read by the translational machinery

• Further homopolymer experiments showed AAA codes for lysine, and CCC codes for proline

• Copolymers were tested (containing a mixture of two nucleotides) but it was difficult to be sure which codon specified each amino acid


A different approach

• Khorana used an approach with one important difference—he synthesized the RNA molecules in an alternating sequence

• This sort of copolymer has only two codons; e.g., UAUAUAUA... contains only UAU and AUA, so Khorana could narrow the assignment of each codon to either tyrosine or isoleucine

• Eventually, these experiments allowed assignment of all the codons
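The property that makes an alternating copolymer useful can be verified with a short sketch: whichever reading frame is used, only two codons appear. The helper name `codons_in_frame` and the chosen polymer length are illustrative assumptions.

```python
def codons_in_frame(rna: str, frame: int) -> set[str]:
    """Collect the distinct complete codons read from a given starting offset."""
    return {rna[i:i + 3] for i in range(frame, len(rna) - 2, 3)}


copolymer = "UA" * 12  # alternating UAUAUA... sequence, 24 nucleotides long

for frame in range(3):
    print(f"frame {frame}: {sorted(codons_in_frame(copolymer, frame))}")
```

Because every frame yields the same two codons, the resulting polypeptide can contain at most two kinds of amino acid, which narrows the possible assignments.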


Of the 64 Possible Codons in Messenger RNA, 61 Code for Amino Acids

• All 64 codons are used in the translation of mRNA

• 61 of them specify the addition of specific amino acids to a growing polypeptide chain

• One of them, AUG, plays a role as a start codon

• The remaining 3 (UAA, UAG, UGA) are stop codons, which terminate polypeptide synthesis
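The codon bookkeeping above can be sketched in a few lines. The table below is the standard genetic code in compact form (one-letter amino acid symbols, `*` for stop). The function `translate` and the example mRNA are illustrative assumptions, and starting at the first AUG is a simplification of how real start sites are chosen.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed in UCAG order for each codon position; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

STOP_CODONS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}
SENSE_CODONS = {codon for codon, aa in CODON_TABLE.items() if aa != "*"}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon or the end of the sequence."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(len(SENSE_CODONS), "codons specify amino acids")
print("stop codons:", sorted(STOP_CODONS))
print(translate("GGAUGUUUAAACCCUAAGG"))  # AUG UUU AAA CCC, then the stop codon UAA
```

The example mRNA uses the homopolymer codons from the experiments described earlier: UUU (phenylalanine, F), AAA (lysine, K), and CCC (proline, P), preceded by the start codon AUG (methionine, M).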


Figure 21-6


The genetic code is unambiguous and degenerate

• Every codon has one meaning only; the genetic code is unambiguous

• It is also degenerate—many of the amino acids are specified by more than one codon

• With a degenerate code, most mutations cause codon changes and a changed amino acid
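A rough sketch of what degeneracy means in practice: it counts how many codons specify each amino acid, then classifies every possible single-base substitution in a sense codon by its effect. The standard codon table is used; the function name `classify_substitutions` and the outcome labels are illustrative assumptions.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degeneracy: number of codons that specify each amino acid.
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def classify_substitutions() -> Counter:
    """Tally the effect of every single-base change in every sense codon."""
    outcomes = Counter()
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                mutant = codon[:position] + base + codon[position + 1:]
                new_amino_acid = CODON_TABLE[mutant]
                if new_amino_acid == amino_acid:
                    outcomes["same amino acid"] += 1
                elif new_amino_acid == "*":
                    outcomes["stop codon"] += 1
                else:
                    outcomes["different amino acid"] += 1
    return outcomes


print(sorted(codons_per_amino_acid.items(), key=lambda item: -item[1]))
print(classify_substitutions())
```

Because each codon is a dictionary key with a single value, the table is unambiguous by construction, while the counts show that most amino acids have more than one codon.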


The Genetic Code Is (Nearly) Universal

• Except for a few cases, all organisms use the same basic genetic code

• In the case of mitochondria, and a few bacteria, the genetic code differs in several ways

• E.g., AGA is a stop codon in mammalian mitochondria, and in some organisms certain codons specify nonstandard amino acids


Transcription in Bacterial Cells

• The fundamental principles of transcription were first elucidated in bacteria, where molecules and mechanisms are relatively simple


Transcription Is Catalyzed by RNA Polymerase, Which Synthesizes RNA Using DNA as a Template

• Transcription is carried out by the enzyme RNA polymerase

• Bacteria have a single kind of RNA polymerase to synthesize all three classes of RNA—mRNA, tRNA, and rRNA

• The RNA polymerase of E. coli has two α subunits, two β subunits that differ from each other (β and β′), and a dissociable sigma (σ) factor


Transcription Involves Four Stages: Binding, Initiation, Elongation, and Termination

• The DNA that gives rise to one RNA molecule is called the transcription unit

• Transcription begins when RNA polymerase binds to a promoter sequence (1) triggering local unwinding of the double helix

• RNA polymerase then initiates synthesis of RNA using one DNA strand as a template (2)


Figure 21-7


Steps of RNA synthesis (continued)

• After initiation the RNA polymerase moves along the DNA template, unwinding the helix and elongating the RNA (3)

• Eventually the enzyme transcribes a termination signal which stops RNA synthesis and causes release of the RNA and dissociation of the polymerase (4)


Binding of RNA Polymerase to a Promoter Sequence

• RNA polymerase binds to a DNA promoter site, a sequence of several dozen base pairs that determines where RNA synthesis will start

• The terms upstream and downstream refer to sequences located toward the 5′ or 3′ end of the transcription unit, respectively

• The promoter is upstream of the transcribed sequence


Initiation of RNA Synthesis

• Initiation of RNA synthesis takes place once the DNA is unwound

• One of the DNA strands serves as a template for RNA synthesis, using incoming NTPs that are complementary to the template strand

• RNA polymerase catalyzes the formation of a phosphodiester bond between the NTPs


Elongation of the RNA Chain

• Chain elongation continues as RNA polymerase moves along the DNA molecule

• The RNA is elongated in the 5′ to 3′ direction, with each new nucleotide added to the 3′ end

• As the polymerase moves along the DNA strand, the double helix ahead of the polymerase is unwound and the DNA behind it is rewound into a double helix


Figure 21-9


RNA polymerases have exonuclease activity

• When an incorrect nucleotide is incorporated, the polymerase backs up slightly and the incorrect nucleotide and the previous one are removed

• This is RNA proofreading; occasional errors in RNA molecules are not as critical as errors in DNA replication


Termination of RNA Synthesis

• Elongation of the RNA chain proceeds until the RNA polymerase copies a sequence called the termination signal

• There are two types of termination signals based on whether or not they require a protein called the rho factor

• RNA molecules that terminate without the rho factor contain a short GC-rich sequence, which folds into a hairpin, followed by several Us


Types of termination signal (continued)

• RNA molecules that don’t form the GC-rich hairpin require the rho factor for termination

• The rho factor is an ATP-dependent unwinding enzyme that moves along the RNA molecule toward the 3′ end, unwinding it from the DNA template as it proceeds


Transcription in Eukaryotic Cells

• Eukaryotic transcription involves the same four stages as prokaryotic but there are several important differences

– Each of three different RNA polymerases transcribes one or more different classes of RNA

– Eukaryotic promoters are more varied than bacterial ones; some are even located downstream of the transcription startpoint


Eukaryotic transcription

• Eukaryotic transcription differs from that of prokaryotes

– RNA polymerases in eukaryotes require additional proteins called transcription factors, some of which must bind before the RNA polymerase can bind

– Protein-protein interactions play a prominent role in eukaryotic transcription


Eukaryotic transcription (continued)

• Eukaryotic transcription differs from that of prokaryotes

– RNA cleavage is more important than termination of transcription in determining the 3′ end of the transcript

– Newly forming RNA molecules undergo RNA processing, chemical modification during and after transcription


RNA Polymerases I, II, and III Carry Out Transcription in the Eukaryotic Nucleus

• There are three RNA polymerases in the nucleus designated RNA polymerases I, II, and III

• These differ in their location in the nucleus and the types of RNA they synthesize


Table 21-1


The RNA polymerases

• RNA polymerase I, in the nucleolus, synthesizes an RNA molecule that is a precursor for three types of rRNA

• RNA polymerase II is found in nucleoplasm and synthesizes mRNA; the molecules are found in clusters called transcription factories, where active genes congregate to be transcribed

• RNA polymerase II is very sensitive to α-amanitin, unlike polymerase I


The RNA polymerases (continued)

• RNA polymerase III, in the nucleoplasm, synthesizes a variety of small RNAs including tRNA, and the 5S rRNA

• It is sensitive to α-amanitin but only at higher levels than polymerase II

• All three polymerases are large, and composed of multiple polypeptide subunits


Three Classes of Promoters Are Found in Eukaryotic Nuclear Genes, One for Each Type of RNA Polymerase

• Eukaryotic promoters are varied, but can be grouped into three categories

• The promoter used by RNA polymerase I has two parts

• The core promoter is the smallest set of DNA sequences that initiates transcription


The upstream control element

• The core promoter is sufficient for initiation of transcription

• However, transcription occurs more efficiently in the presence of an upstream control element, a fairly long sequence similar to the core promoter


Figure 21-11A


The promoter for RNA polymerase II

• At least four types of DNA sequences are involved in core promoter function

• 1. A short initiator sequence surrounds the transcription startpoint

• 2. The TATA box, a consensus sequence of TATA followed by 2-3 As, is located about 25 nucleotides upstream of the startpoint


The promoter for RNA polymerase II (continued)

• Four types of DNA sequences are involved in core promoter function (continued)

• 3. The TFIIB recognition element (BRE) is located slightly upstream of the TATA box

• 4. The downstream promoter element (DPE) is located about 30 nucleotides downstream from the startpoint


Figure 21-11B


Additional control elements

• Core promoters are only capable of driving a basal (low) level of transcription

• Additional short sequences upstream (upstream control elements) improve the promoter’s efficiency

• Some are common to many different genes, e.g., the CAAT box and the GC box


Upstream control elements

• The location of upstream control elements varies from gene to gene

• Those within 100–200 nucleotides of the startpoint are called proximal control elements

• Those farther away are called enhancer elements


Promoters for RNA polymerase III

• RNA polymerase III uses promoters that are entirely downstream of the startpoint

• The promoters of 5S rRNA and tRNA genes differ, but in both cases the consensus sequences fall into two blocks of about 10 bp each


Figure 21-11C


General Transcription Factors Are Involved in the Transcription of All Nuclear Genes

• A general transcription factor is always required for RNA polymerase binding to promoters

• Eukaryotes have many such factors, called TFs, that bind the promoter in a defined order starting with TFIID

• Eventually a large complex of proteins forms a preinitiation complex on the promoter


Elongation, Termination, and RNA Cleavage Are Involved in Completing Eukaryotic RNA Synthesis

• After initiation RNA polymerases move along the DNA and synthesize a complementary RNA

• Termination is governed by signals that differ for each type of RNA polymerase

• Transcription by polymerase I is terminated by a protein that recognizes an 18-nucleotide signal in the growing RNA chain


Termination of transcription

• For RNA polymerase III, termination signals include a short run of Us and no protein factors are required for their recognition

• For RNA polymerase II, transcripts are cleaved at a specific site before transcription ceases

• The cleavage site is 10–35 nucleotides downstream of an AAUAAA sequence in the RNA
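As a rough illustration of the polymerase II cleavage rule, the sketch below locates the first AAUAAA signal in a transcript and reports the stretch 10 to 35 nucleotides beyond it. The function name `cleavage_window`, the example transcript, and the choice to measure the offset from the end of the signal are assumptions made for illustration; real cleavage-site selection also depends on other sequence elements.

```python
from typing import Optional


def cleavage_window(
    rna: str,
    signal: str = "AAUAAA",
    min_offset: int = 10,
    max_offset: int = 35,
) -> Optional[tuple[int, int]]:
    """Return the half-open index range (start, end) where cleavage is expected.

    Offsets are counted from the position just after the signal (an assumed
    convention). Returns None if the signal is absent or the transcript is too short.
    """
    position = rna.find(signal)
    if position == -1:
        return None
    signal_end = position + len(signal)
    start = signal_end + min_offset
    end = min(signal_end + max_offset, len(rna))
    if start >= end:
        return None
    return start, end


# Hypothetical transcript: 20 nt, then the signal, then 50 nt of downstream sequence.
transcript = "G" * 20 + "AAUAAA" + "C" * 50
print(cleavage_window(transcript))
```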


Polyadenylation

• The cleavage site of polymerase II transcripts is also the site for addition of a poly(A) tail

• This is a string of adenine nucleotides added to the 3′ end of most eukaryotic mRNAs


RNA Processing

• A newly produced RNA molecule is called the primary transcript

• It must undergo RNA processing (chemical modification) before it can function in the cell


Ribosomal RNA Processing Involves Cleavage of Multiple rRNAs from a Common Precursor

• rRNA is the most abundant and stable form of RNA in cells

• Four types of rRNA are distinguished by their different sedimentation rates during centrifugation

• The small ribosomal subunit has one 18S rRNA molecule, whereas the larger has three (28S, 5.8S, and 5S)


Table 21-2


Processing of rRNAs

• The three larger eukaryotic rRNAs are encoded by a single transcription unit, which produces a primary transcript called the pre-rRNA

• The three rRNAs are separated by transcribed spacers

• A series of cleavage reactions remove the spacers, and methyl groups are added to the pre-rRNA


Figure 21-14


Ribosome assembly in the nucleolus

• Processing of pre-rRNA is accompanied by assembly of the RNA with proteins to form the ribosomal subunits

• 5S rRNA is transcribed by RNA polymerase III in a separate transcription unit with multiple copies in long tandem arrays

• 5S rRNA transcripts require little or no processing


Transfer RNA Processing Involves Removal, Addition, and Chemical Modification of Nucleotides

• Cells synthesize several dozen kinds of tRNA molecules

• They fold into a secondary structure, most containing four hairpin loops; but some have a fifth region called a variable loop

• tRNAs have a cloverleaf structure, and are synthesized as pre-tRNAs, followed by processing


Figure 21-15


The events of processing the pre-tRNA

• At the 5′ end a short leader sequence (16 nucleotides) is removed (1)

• At the 3′ end, the two terminal nucleotides are removed and replaced with CCA (2)

• About 10–15% of the nucleotides are chemically modified (3)


Pre-tRNA processing (continued)

• Types of chemical modifications include methylation and creation of unusual bases (dihydrouracil, ribothymine, pseudouridine, inosine)

• An internal 14-nucleotide sequence is removed, though only for a few tRNAs (4)


Messenger RNA Processing in Eukaryotes Involves Capping, Addition of Poly(A), and Removal of Introns

• Most bacterial RNA is synthesized in a form that is ready for translation with no need for processing

• Because there is no nuclear membrane, bacterial transcripts are translated as they are transcribed


Transcription and translation in eukaryotes

• Eukaryotic transcripts must be exported from the nucleus to be translated

• Substantial processing occurs in the nucleus before export

• Primary transcripts are often very long, 2,000–20,000 nucleotides, referred to as heterogeneous nuclear RNA (hnRNA)


Eukaryotic transcripts

• Pre-mRNAs are processed by removal of sequences and addition of 5′ caps and 3′ tails

• The C-terminal domain of one of the subunits of RNA polymerase II acts as a platform for protein complexes involved in processing


5′ Caps and 3′ Poly(A) Tails

• Eukaryotic mRNAs have a modified nucleotide called the 5′ cap, and the 3′ ends have a long stretch of adenines called the poly(A) tail

• The 5′ cap is a guanosine that is methylated at position 7 of the purine ring

• It is bound to the RNA molecule by a 5′–5′ linkage rather than the usual 3′–5′ bond


Figure 21-17


Roles of the 5′ cap

• The 5′ cap is added soon after transcription is initiated

• The cap contributes to mRNA stability by protecting the RNA from nucleases

• The cap also helps position the RNA on the ribosome for initiation of translation


The poly(A) tail

• The poly(A) tail ranges from 50 to 250 nucleotides long and is added by the enzyme poly(A) polymerase

• A signal, AAUAAA, is located just upstream of the polyadenylation site, and a GU- or U-rich element is located downstream of it


Figure 21-18


Function of the poly(A) tail

• The poly(A) tail protects the mRNA from nuclease attack; the length of the tail influences stability

• It is also required for export of the transcript to the cytoplasm

• It may also help ribosomes recognize and bind mRNAs


The Discovery of Introns

• The precursors for most mRNAs and some rRNAs and tRNAs contain introns, sequences within the primary transcript that are removed

• Experiments demonstrated that eukaryotic gene sequences contain extra DNA that does not appear in the mature RNA


Exons and introns

• Sequences that appear in the final mRNA were called exons

• Introns are present in most protein coding genes of multicellular eukaryotes

• The size and number of introns vary considerably


Table 21-3


Spliceosomes Remove Introns from Pre-mRNA

• The process of removing introns and joining the exons is RNA splicing

• About 15% of inherited human diseases involve splicing errors; such errors lead to incorrect protein products

• Sequences commonly found at the intron-exon boundaries likely determine the 5′ and 3′ splice sites


Figure 21-20


Splice sites

• Analysis of base sequences of hundreds of different introns revealed that the 5′ end of an intron typically starts with GU and terminates with AG at the 3′ end

• The sequences immediately adjacent to the 3′ and 5′ ends of the intron tend to be similar

• One additional sequence near the 3′ end of the intron is called the branch point
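The GU...AG boundary rule can be sketched as a simple check applied before an intron is removed. The function names, the example sequences, and the use of index pairs to mark introns are illustrative assumptions; the sketch ignores the branch point and the spliceosome itself.

```python
def follows_gu_ag_rule(intron: str) -> bool:
    """Check that an intron starts with GU at its 5' end and ends with AG at its 3' end."""
    return len(intron) >= 4 and intron.startswith("GU") and intron.endswith("AG")


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as half-open (start, end) index pairs and join the exons."""
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        if not follows_gu_ag_rule(pre_mrna[start:end]):
            raise ValueError(f"segment {start}-{end} does not follow the GU...AG rule")
        exons.append(pre_mrna[cursor:start])
        cursor = end
    exons.append(pre_mrna[cursor:])
    return "".join(exons)


# Hypothetical pre-mRNA: two exons separated by one short intron.
exon_1, intron, exon_2 = "AUGGCC", "GUAAGUUUCAG", "UUUUAA"
pre_mrna = exon_1 + intron + exon_2

print(splice(pre_mrna, [(len(exon_1), len(exon_1) + len(intron))]))
```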


Figure 21-22


The Existence of Introns Permits Alternative Splicing and Exon Shuffling

• In some cases introns are processed to yield functional products

• In a few cases introns are translated into proteins

• However most introns are destroyed without serving any obvious function


Alternative splicing

• The presence of introns allows each gene’s pre-mRNA molecule to be spliced in multiple ways, leading to production of multiple protein products

• This alternative splicing is possible via mechanisms allowing certain splice sites to be activated or skipped

• Regulatory proteins and snoRNAs bind to splicing enhancer or silencer sequences
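A toy sketch of how skipping splice sites multiplies the products of one gene. It assumes a simple exon-skipping model in which some exons are optional; the exon names and the function `splice_variants` are hypothetical, and real alternative splicing includes other patterns as well.

```python
from itertools import combinations


def splice_variants(exons: list[str], optional: set[int]) -> list[str]:
    """List every mature mRNA obtainable by skipping any subset of the optional exons.

    `optional` holds the indices of exons that may be skipped; all other exons
    are always included, and exon order is preserved.
    """
    variants = []
    optional_indices = sorted(optional)
    for count in range(len(optional_indices) + 1):
        for skipped in combinations(optional_indices, count):
            kept = [exon for i, exon in enumerate(exons) if i not in skipped]
            variants.append("-".join(kept))
    return variants


# Hypothetical gene with four exons; the two middle exons can be skipped.
for variant in splice_variants(["E1", "E2", "E3", "E4"], optional={1, 2}):
    print(variant)
```

With two optional exons a single pre-mRNA can give four mature mRNAs; each additional optional exon doubles the number in this simplified model.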


Figure 21-23


Intron functions

• Besides alternative splicing, introns allow the evolution of new protein-coding genes through recombination events

• Recombination between introns produces new combinations of exons—exon shuffling

• It can also produce duplicate copies of exons within a gene, one of which could mutate to a new sequence


RNA Editing Allows mRNA Coding Sequences to Be Altered

• Another type of RNA processing is RNA editing

• Anything from a single nucleotide to hundreds may be inserted, removed, or altered in the mRNA

• Some of the best-studied examples occur in mitochondria of trypanosomes, where editing inserts or deletes uracil nucleotides (Us)

• Small guide RNAs, encoded by different mitochondrial genes, determine the location for the placement of the Us


Key Aspects of mRNA Metabolism

• Two key aspects of mRNA metabolism are important to understanding mRNA behavior in cells

• mRNAs have a short life span

• mRNAs have the ability to amplify genetic information

• mRNA can be synthesized again and again from a piece of template DNA, providing an opportunity for amplification of genetic information


Most mRNA Molecules Have a Relatively Short Life Span

• Most mRNA molecules have a high turnover rate (rate at which molecules are degraded and replaced)

• It is measured in terms of half-life, the time required for 50% of the molecules to degrade

• mRNA molecules of eukaryotes have half-lives of several hours to a few days; in bacteria, the half-lives are usually only a few minutes
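The definition of half-life implies simple exponential decay, which the following sketch makes concrete. The function name `fraction_remaining` and the specific half-life values are illustrative assumptions chosen to fall within the ranges given above.

```python
def fraction_remaining(elapsed: float, half_life: float) -> float:
    """Fraction of an mRNA population still intact after `elapsed` time.

    Both arguments must use the same time unit. Assumes first-order decay,
    so the population halves once per half-life.
    """
    return 0.5 ** (elapsed / half_life)


# Hypothetical bacterial mRNA with a 3-minute half-life, followed for 9 minutes.
print(fraction_remaining(elapsed=9, half_life=3))

# Hypothetical eukaryotic mRNA with a 10-hour half-life, followed for 24 hours.
print(fraction_remaining(elapsed=24, half_life=10))
```

After three half-lives only one-eighth of the original molecules remain, which is why short-lived bacterial mRNAs must be resynthesized continually.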
