Transcription and Regulation of Gene Expression
Outline
• Transcription in Prokaryotes
• Transcription in Eukaryotes
• Regulation of Transcription in Prokaryotes
• Transcription Regulation in Eukaryotes
• Structural Motifs in DNA-Binding proteins
• Post-Transcriptional Processing of mRNA
RNA
• Usually single-stranded
• Has uracil as a base
• Ribose as the sugar
• Carries protein-encoding information
• Can be catalytic
DNA
• Usually double-stranded
• Has thymine as a base
• Deoxyribose as the sugar
• Carries RNA-encoding information
• Not catalytic
Types of RNA
• mRNA, tRNA and rRNA
• These three kinds of RNA are present in
both eukaryotic and prokaryotic cells as
well as in mitochondria
The secondary structure of RNA
• is a random coil, or is determined by its interaction
with proteins
• occasionally there are complementary regions
within the same RNA molecule
• Large differences in physical properties between
RNA and DNA
• RNA does not exist as a large double helix, so it
does not show hyperchromicity under denaturation
• does not have a well-defined melting point
• does not show large changes in viscosity
Messenger RNAs (mRNA)
• mRNA carries the genetic information
coded in DNA into the cytoplasm
• The order of nucleotides in the mRNA
determines the order of amino acids in
the protein translated from it.
Transfer RNAs (tRNAs)
• This class of small RNAs
transfers amino acids to the protein-
synthesizing machinery and translates
the nucleic acid “language” into amino acid
“language”
Ribosomal RNAs (rRNAs)
• This class of RNAs plus a large number of
ribosomal proteins are assembled and make
up ribosomes, the enzymatic machinery
on which protein synthesis takes place.
• Ribosomes engage the mRNAs and form a
catalytic domain into which the tRNAs
enter with their attached amino acids.
Other Forms of RNA
rRNA and tRNA only appreciated later
• All three forms participate in protein synthesis
• All made by DNA-dependent RNA polymerases
• This process is called transcription
• Not all genes encode proteins! Some encode
rRNAs or tRNAs
• Transcription is tightly regulated. Only 0.01% of
genes in a typical eukaryotic cell are undergoing
transcription at any given moment
Transcription
The new RNA molecule is formed by incorporating
nucleotides that are complementary to the template strand.
DNA coding strand    5' G T C A T T C G G 3'
DNA template strand  3' C A G T A A G C C 5'
RNA                  5' G U C A U U C G G 3'
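The pairing rule in the figure can be sketched in a few lines of Python. This is only an illustration: the function name transcribe and the lookup table are assumptions made for the example, and the two strands are the ones shown on the slide.

```python
# Base pairing applied when RNA polymerase reads a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())

template = "CAGTAAGCC"  # 3'->5', template strand from the slide
coding = "GTCATTCGG"    # 5'->3', coding strand from the slide

rna = transcribe(template)
print(rna)  # GUCAUUCGG

# The RNA has the same sequence as the coding strand, with U in place of T.
print(rna == coding.replace("T", "U"))  # True
```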
RNA synthesis and processing
All 3 major classes of RNA (rRNA, tRNA and mRNA):
• are synthesized in the nucleus (by copying of
DNA)
• are modified after synthesis and before being
transported into the cytoplasm for protein
synthesis
• The selection of the segment to be copied is the
major control point in this process!
• As synthesis proceeds, the RNA is released and
the DNA reforms a normal helical structure
Transcription in Prokaryotes
Only a single RNA polymerase
• In E. coli, RNA polymerase is a 465 kD complex,
with 2 α, 1 β, 1 β', 1 σ
• β' binds DNA
• β binds NTPs and interacts with σ
• σ recognizes promoter sequences on DNA
• α subunits appear to be essential for assembly
and for activation of enzyme by regulatory
proteins
• RNA polymerases contain no nuclease activity
Stages of Transcription
See next Figure
• binding of RNA polymerase holoenzyme
at promoter sites
• initiation of polymerization
• chain elongation
• chain termination
Properties of Promoters
See Figure
• Promoters typically consist of a 40-bp region
on the 5'-side of the transcription start site
• Two consensus sequence elements:
• The "-35 region", with consensus TTGACA
- sigma subunit appears to bind here
• The Pribnow box near -10, with consensus
TATAAT - this region is ideal for unwinding
- why?
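A rough sketch of how the two consensus elements could be located in a sequence. The function name find_promoters is hypothetical, and the allowed spacer of 15 to 19 bp between the two hexamers is an assumption for the example, not a figure from the slides. Real promoters rarely match the consensus exactly, so exact matching is a deliberate simplification.

```python
MINUS_35 = "TTGACA"  # consensus of the -35 region (from the slide)
MINUS_10 = "TATAAT"  # consensus of the Pribnow box (from the slide)

def find_promoters(seq, min_spacer=15, max_spacer=19):
    """Return (pos35, pos10) index pairs for exact consensus matches.

    min_spacer/max_spacer are assumed values for the gap between hexamers.
    """
    seq = seq.upper()
    hits = []
    start = seq.find(MINUS_35)
    while start != -1:
        for spacer in range(min_spacer, max_spacer + 1):
            pos10 = start + len(MINUS_35) + spacer
            if seq[pos10:pos10 + len(MINUS_10)] == MINUS_10:
                hits.append((start, pos10))
        start = seq.find(MINUS_35, start + 1)
    return hits

# Made-up test sequence: both hexamers separated by a 17-bp spacer.
toy = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GCGCGCA"
print(find_promoters(toy))  # [(2, 25)]
```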
Initiation of Polymerization
• RNA polymerase has two binding sites for NTPs
• Initiation site prefers to bind ATP and GTP (most RNAs begin with a purine at 5'-end)
• Elongation site binds the second incoming NTP
• 3'-OH of first attacks alpha-P of second to form a new phosphoester bond (eliminating PPi)
• When 6-10 unit oligonucleotide has been made, sigma subunit dissociates, completing "initiation"
• Note rifamycin and rifampicin and their different modes of action
Chain Elongation
Core polymerase - no sigma
• Polymerase is accurate - only about 1 error
in 10,000 bases
• Even this error rate is OK, since many
transcripts are made from each gene
• Elongation rate is 20-50 bases per second -
slower in G/C-rich regions (why??) and
faster elsewhere
• Topoisomerases precede and follow
polymerase to relieve supercoiling
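To get a feel for these numbers, here is a small back-of-the-envelope calculation. The error rate and elongation rates are the ones quoted above; the 1,000-nucleotide transcript length is a hypothetical value chosen only for the example.

```python
ERROR_RATE = 1 / 10_000  # errors per base (from the slide)
RATE_RANGE = (20, 50)    # bases per second, slow and fast (from the slide)

def transcript_stats(length_nt):
    """Return (expected errors, fastest time in s, slowest time in s)."""
    expected_errors = length_nt * ERROR_RATE
    fastest = length_nt / RATE_RANGE[1]
    slowest = length_nt / RATE_RANGE[0]
    return expected_errors, fastest, slowest

# Hypothetical 1,000-nt transcript.
errors, fastest, slowest = transcript_stats(1000)
print(f"expected errors per transcript: {errors:.2f}")
print(f"time to transcribe: {fastest:.0f} to {slowest:.0f} s")
```

On these assumptions roughly one transcript in ten carries a single error, which fits the point that the error rate is tolerable when many transcripts are made from each gene.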
Chain Termination
Two mechanisms
• Rho - the termination factor protein
– rho is an ATP-dependent helicase
– it moves along RNA transcript, finds the "bubble", unwinds it and releases RNA chain
• Specific sequences - termination sites in DNA
– inverted repeat, rich in G:C, which forms a stem-loop in RNA transcript
– 6-8 As in DNA coding for Us in transcript
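The second mechanism (a G:C-rich inverted repeat followed by a run of Us in the transcript) can be sketched as a naive pattern search. Everything here is an assumption made for illustration: the function name find_terminators, the fixed stem length of 6, the loop size of 3 to 8, the 60% G+C threshold and the minimum of 6 Us. It requires perfect stem pairing and is not a real terminator predictor.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Reverse complement of an RNA string."""
    return "".join(RNA_COMPLEMENT[b] for b in reversed(rna))

def find_terminators(rna, stem=6, loop_range=(3, 8), min_gc=0.6, min_u=6):
    """Return (stem_start, loop_length) for each stem-loop followed by a U run.

    All thresholds are assumed values for this sketch.
    """
    rna = rna.upper()
    hits = []
    for i in range(len(rna) - 2 * stem - loop_range[0] - min_u + 1):
        left = rna[i:i + stem]
        if sum(b in "GC" for b in left) / stem < min_gc:
            continue  # stem is not G:C rich enough
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop                      # start of the right arm
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_u]    # bases right after the stem
            if right == reverse_complement(left) and tail == "U" * min_u:
                hits.append((i, loop))
    return hits

# Made-up transcript: GC-rich stem, 4-nt loop, then seven Us.
toy_rna = "CC" + "GCCGCG" + "AUUA" + "CGCGGC" + "UUUUUUU" + "A"
print(find_terminators(toy_rna))  # [(2, 4)]
```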
Transcription in Eukaryotes
• RNA polymerases I, II and III transcribe rRNA,
mRNA and tRNA genes, respectively
• Pol III transcribes a few other RNAs as well
• All 3 are big, multimeric proteins (500-700 kD)
• All have 2 large subunits with sequences similar
to β and β' in E. coli RNA polymerase, so
catalytic site may be conserved
• Pol II is most sensitive to α-amanitin, an
octapeptide from Amanita phalloides
("death cap mushroom")
Transcription Factors
More on this later, but a short note now
• The three polymerases (I, II and III) interact
with their promoters via so-called
transcription factors
• Transcription factors recognize and initiate
transcription at specific promoter sequences
• Some transcription factors (TFIIIA and TFIIIC
for RNA polymerase III) bind to specific
recognition sequences within the coding
region
RNA Polymerase II
Most interesting because it regulates
synthesis of mRNA
• Yeast Pol II consists of 10 different peptides
(RPB1 - RPB10)
• RPB1 and RPB2 are homologous to E. coli RNA
polymerase β' and β
• RPB1 has DNA-binding site; RPB2 binds NTP
• RPB1 has C-terminal domain (CTD) of PTSPSYS repeats
• 5 of these 7 residues have -OH, so this is a hydrophilic and
phosphorylatable site
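The "5 of 7" count can be checked directly. This small sketch treats Ser, Thr and Tyr as the hydroxyl-bearing residues; the variable names are illustrative only.

```python
# Amino acids whose side chains carry a hydroxyl group: Ser, Thr, Tyr.
HYDROXYL_RESIDUES = set("STY")

heptad = "PTSPSYS"  # CTD repeat as written on the slide

# 1-based positions of the residues that could be phosphorylated.
hydroxyl_positions = [
    (i + 1, aa) for i, aa in enumerate(heptad) if aa in HYDROXYL_RESIDUES
]
count = len(hydroxyl_positions)

print(hydroxyl_positions)  # [(2, 'T'), (3, 'S'), (5, 'S'), (6, 'Y'), (7, 'S')]
print(f"{count} of {len(heptad)} residues have -OH")  # 5 of 7 residues have -OH
```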
More RNA Polymerase II
• CTD is essential and this domain may project away from the globular portion of the enzyme (up to 50 nm!)
• Only RNA Pol II whose CTD is NOT phosphorylated can initiate transcription
• TATA box (TATAAA) is a consensus promoter
• 7 general transcription factors are required
• See TFIID bound to TATA
Transcription Regulation in
Prokaryotes
• Genes for enzymes for pathways are
grouped in clusters on the chromosome
- called operons
• This allows coordinated expression
• A regulatory sequence adjacent to such
a unit determines whether it is
transcribed - this is the ‘operator’
• Regulatory proteins work with operators
to control transcription of the genes
Induction and Repression
• Increased synthesis of enzymes in response to a
metabolite is ‘induction’
• Decreased synthesis in response to a metabolite is ‘repression’
• Some substrates induce enzyme synthesis even though the enzymes can’t metabolize the substrate - these are ‘gratuitous inducers’, such as IPTG
Structural Motifs in DNA-Binding Regulatory Proteins
• Crucial feature must be atomic contacts between
protein residues and bases and sugar-phosphate
backbone of DNA
• Most contacts are in the major groove of DNA
• 80% of regulatory proteins can be assigned to
one of three classes: helix-turn-helix (HTH),
zinc finger (Zn-finger) and leucine zipper
(bZIP)
• In addition to DNA-binding domains, these
proteins usually possess other domains that
interact with other proteins
Alpha Helices and DNA
A perfect fit!
• A recurring feature of DNA-binding proteins
is the presence of α-helical segments that fit
directly into the major groove of B-form DNA
• Diameter of helix is 1.2 nm
• Major groove of DNA is about 1.2 nm wide
and 0.6 to 0.8 nm deep
• Proteins can recognize specific sites in DNA
The Helix-Turn-Helix Motif
First identified in 3 prokaryotic proteins
• two repressor proteins (Cro and cI) and the E.
coli catabolite activator protein (CAP)
• All these bind as dimers to dyad-symmetric
sites on DNA (see Figure)
• All contain two alpha helices separated by a
loop with a beta turn
• The C-terminal helix fits in major groove of
DNA; N-terminal helix stabilizes by
hydrophobic interactions with C-terminal helix
Helix-Turn-Helix II
See next Figures
• Residues 1-7 of the motif are the first helix
(but called "helix 2")
• Residue 9 is the turn maker - a Gly, of course
• Residues 12-20 are the second helix (called
"helix 3")
• Recognition of DNA sequence involves the
sides of base pairs that face the major groove
The Zn-Finger Motif
First discovered in TFIIIA from Xenopus laevis, the
African clawed toad
• Now known to exist in nearly all organisms
• Two main classes: C2H2 and Cx
• C2H2 domains consist of Cys-x2-Cys and His-x3-
His domains separated by at least 7-8 aas
• Cx domains consist of 4, 5 or 6 Cys residues
separated by various numbers of other residues
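The C2H2 description above translates fairly directly into a regular expression. This is a sketch only: the pattern follows the slide's wording (Cys-x2-Cys, a spacer of at least 7 residues, His-x3-His), the upper spacer limit of 20 is an assumption added to keep matches local, and the test peptide is a made-up string used purely to exercise the pattern.

```python
import re

# Cys-x2-Cys, spacer of 7 to 20 residues (upper bound assumed), His-x3-His.
C2H2 = re.compile(r"C.{2}C.{7,20}?H.{3}H")

# Hypothetical test peptide (one-letter amino acid codes).
peptide = "PYKCPECGKSFSQKSDLVKHQRTHTG"

match = C2H2.search(peptide)
if match:
    print(match.start(), match.group())  # 3 CPECGKSFSQKSDLVKHQRTH
```

Many real C2H2 fingers deviate from this exact spacing, so a match or a miss here is only a first hint.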
More Zn-Fingers
Their secondary and tertiary structures
• C2H2 -type Zn fingers form a folded beta
strand and an alpha helix that fits into the
DNA major groove
• Cx-type Zn fingers consist of two mini-
domains of four Cys ligands to Zn followed
by an alpha helix: the first helix is the DNA
recognition helix, second helix packs
against the first
The Leucine Zipper Motif
First found in C/EBP, a DNA-binding protein in
rat liver nuclei
• Now found in nearly all organisms
• Characteristic features: a 28-residue sequence
with Leu every 7th position and a "basic
region"
• (What do you know by now about 7-residue
repeats?)
• This suggests amphipathic alpha helix and a
coiled-coil dimer
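The "Leu every 7th position" rule is easy to test on a sequence. In this sketch the function name find_leucine_heptads is hypothetical, a 28-residue zipper is taken to mean four leucines spaced 7 apart, and the toy sequence is invented for the example. The basic region is not checked.

```python
def find_leucine_heptads(seq, repeats=4):
    """Return start indices where Leu occurs at every 7th position.

    repeats=4 corresponds to the 28-residue stretch described on the slide.
    """
    seq = seq.upper()
    span = 7 * (repeats - 1)  # distance from the first to the last Leu
    return [
        i for i in range(len(seq) - span)
        if all(seq[i + 7 * k] == "L" for k in range(repeats))
    ]

# Invented sequence: four heptads, each starting with Leu, after a 4-residue lead.
toy_zipper = "AAAA" + "LAAAAAA" * 4
print(find_leucine_heptads(toy_zipper))  # [4]
```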
The Structure of the Zipper
and its DNA complex
• Leucine zipper proteins (aka bZIP proteins)
dimerize, either as homo- or hetero-dimers
• The basic region is the DNA-recognition site
• Basic region is often modelled as a pair of
helices that can wrap around the major groove
• Homodimers recognize dyad-symmetric DNA
• Heterodimers recognize non-symmetric DNA
• Fos and Jun are classic bZIPs
Post-transcriptional Processing
of mRNA in Eukaryotes
• Translation closely follows transcription
in prokaryotes
• In eukaryotes, these processes are
separated - transcription in nucleus,
translation in cytoplasm
• On the way from nucleus to cytoplasm,
the mRNA is converted from "primary
transcript" to "mature mRNA"
Eukaryotic Genes are Split
• Introns intervene between exons
• Examples: actin gene has a 309-bp intron that
separates the first three amino acids from the other
350 or so
• But chicken pro-alpha-2 collagen gene is 40 kbp
long, with 51 exons of only 5 kbp total.
• The exons range in size from 45 to 249 bases
• Mechanism by which introns are excised and
exons are spliced together is complex and
must be precise
Capping and Methylation
• Primary transcripts (pre-mRNAs or
heterogeneous nuclear RNA) are usually first
"capped" by a guanylyl group
• The reaction is catalyzed by guanylyl
transferase
• Capping G residue is methylated at 7-position
• Additional methylations occur at 2'-O positions
of next two residues and at 6-amino of the first
adenine
3'-Polyadenylylation
• Termination of transcription occurs only after
RNA polymerase has transcribed past a
consensus AAUAAA sequence - the poly(A)+
addition site
• 10-30 nucleotides past this site, a string of
100 to 200 adenine residues is added to
the mRNA transcript - the poly(A)+ tail
• poly(A) polymerase adds these A residues
• Function not known for sure, but poly(A) tail
may govern stability of the mRNA
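A toy model of this step, for illustration only. The function name polyadenylate is hypothetical; the default offset of 20 nt and tail length of 150 are arbitrary picks from within the 10-30 and 100-200 ranges given above, and using the first AAUAAA found is a simplification.

```python
SIGNAL = "AAUAAA"  # consensus poly(A) addition signal (from the slide)

def polyadenylate(pre_mrna, offset=20, tail_length=150):
    """Cut `offset` nt past the first AAUAAA and append a poly(A) tail.

    Returns None if no signal is found or the transcript is too short.
    """
    if not 10 <= offset <= 30:
        raise ValueError("offset should be 10-30 nt past the signal")
    if not 100 <= tail_length <= 200:
        raise ValueError("tail length should be 100-200 residues")
    pre_mrna = pre_mrna.upper()
    start = pre_mrna.find(SIGNAL)
    if start == -1:
        return None
    cut = start + len(SIGNAL) + offset
    if cut > len(pre_mrna):
        return None
    return pre_mrna[:cut] + "A" * tail_length

# Made-up primary transcript: 4 nt, the signal, then 40 nt of downstream sequence.
transcript = "GGCC" + SIGNAL + "C" * 40
mature = polyadenylate(transcript)
print(len(mature))  # 180
```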
Splicing of Pre-mRNA
Capped, polyadenylated RNA, in the form of an RNP
complex, is the substrate for splicing
• In "splicing", the introns are excised and the
exons are sewn together to form mature mRNA
• Splicing occurs only in the nucleus
• The 5'-end of an intron in higher eukaryotes is
always GU and the 3'-end is always AG
• All introns have a "branch site" 18 to 40
nucleotides upstream from 3'-splice site
• Branch site is essential to splicing
The Branch site and Lariat
• Branch site is usually YNYRAY, where Y =
pyrimidine, R = purine and N is anything
• The "lariat", a covalently closed loop of RNA, is
formed by attachment of the 5'-P of the intron's
invariant 5'-G to the 2'-OH at the branch A site
• The exons then join, excising the lariat.
• The lariat is unstable; the 2'-5' phosphodiester is
quickly cleaved and intron is degraded in the
nucleus.
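The sequence rules for an intron (GU at the 5'-end, AG at the 3'-end, a YNYRAY branch site 18 to 40 nucleotides upstream of the 3'-splice site) can be checked with a short script. This is a sketch under stated assumptions: the function name check_intron is hypothetical, the distance is counted from the branch A to the last base of the intron (one possible convention), and the test intron is invented.

```python
import re

# YNYRAY with Y = C/U, R = A/G, N = any base; lookahead allows overlapping hits.
BRANCH_SITE = re.compile(r"(?=[CU][ACGU][CU][AG]A[CU])")

def check_intron(intron, min_dist=18, max_dist=40):
    """Return (ends_ok, [(branch_A_index, distance_to_3prime_end), ...])."""
    intron = intron.upper()
    ends_ok = intron.startswith("GU") and intron.endswith("AG")
    branch_points = []
    for match in BRANCH_SITE.finditer(intron):
        a_index = match.start() + 4             # the A is the 5th base of YNYRAY
        distance = len(intron) - 1 - a_index    # assumed counting convention
        if min_dist <= distance <= max_dist:
            branch_points.append((a_index, distance))
    return ends_ok, branch_points

# Invented intron: GU ... UUCAAC (branch site) ... AG
toy_intron = "GU" + "G" * 10 + "UUCAAC" + "G" * 20 + "AG"
print(check_intron(toy_intron))  # (True, [(16, 23)])
```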
The Importance of snRNP
• Small nuclear ribonucleoprotein particles -
snRNPs, pronounced "snurps" - are involved in
splicing
• A snRNP consists of a small RNA (100-200
bases long) and about 10 different proteins
• Some of the 10 proteins are general, some are
specific.
• snRNPs and pre-mRNA form the spliceosome
• Spliceosome is the size of ribosomes, and its
assembly requires ATP