Molecular Biology and Biological Chemistry
The Fundamentals of Bioinformatics Chapter 1
Introduction
• The Scale Spectrum
• The Genetic Material
• Gene Structure and Information Content
• Protein Structure and Function
• The Nature of Chemical Bonds
• Molecular Biology Tools
• Genomic Information Content
The Scale Spectrum
• Nano– Genes, proteins, genetic networks
• Micro– Organ physiology, pharmacokinetics
• Macro– Whole body, multi-organism
DNA structure.
DNA: Deoxyribonucleic Acid
History:
• 1868 Miescher – discovered nuclein
• 1944 Avery – experimental evidence that DNA is the constituent of genes.
• 1953 Watson & Crick – double helical nature of DNA.
  “We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest.”
• 1980 X-ray structure of more than a full turn of DNA.
The Genetic Material
• Genes:
  – The basis of inheritance
  – A specific sequence of nucleotides (nt)
• Nucleotide bases
  – 4 types: Guanine (G), Adenine (A), Thymine (T), & Cytosine (C)
  – Nucleotides differ only in their ‘nitrogenous base’
  – Alphabet of the ‘Language of Genes’
Five types of bases (the four DNA bases plus uracil, which is found in RNA).
Base Pairings
• DNA is highly redundant
  – Strands are complementary
  – Permits replication
• Base pairings are stable and robust
  – Only G-C or A-T combinations possible
Complementarity of nucleotide bases for double-stranded helical structure.
Double helical structure of DNA.
Antiparallel Nature of DNA
• 5’ end of one strand matches 3’ end of other
If one strand is 5’-GTATCC-3’
Then other is 3’-CATAGG-5’
Most processes go from 5’ to 3’, so write as:
5’-GGATAC-3’
• Strands are reverse complements
• 5’ is ‘upstream’, and 3’ is ‘downstream’
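The reverse-complement rule above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function name `reverse_complement` is made up for this example, and it assumes the input contains only the four DNA letters A, C, G and T.

```python
# Watson-Crick pairing for DNA: A pairs with T, G pairs with C.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return seq.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("GTATCC"))  # GGATAC, matching the slide's example
```

Applying the function twice returns the original strand, which is one quick way to sanity-check it.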
The Genome
• Full complement of genes
• Set of chromosomes
  – DNA chains
The Central Dogma
• DNA makes RNA makes Protein
  – General, not universal
• Enzymes
  – Proteins that make things happen, but are not used up
  – Naming convention: X-ase
RNA-polymerase ribosomes
The Central Dogma (2)
• Transcription
  – RNA construction mediated by RNA-polymerase
  – One-to-one correspondence with DNA
  – RNA alphabet: G, C, A, and U (Uracil, in place of Thymine)
• Translation
  – Conversion of nucleotides to amino acids
  – Ribosomes: complex structure of RNA & protein
  – Ribosomes mediate protein synthesis
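The one-to-one correspondence in transcription can be illustrated with a small Python sketch. It assumes the input is the coding (sense) strand written 5’ to 3’, so the mRNA has the same sequence with U in place of T; the function name `transcribe` is chosen for this example only.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T becomes U)."""
    # Every DNA base maps to exactly one RNA base; only T changes.
    return coding_strand.upper().replace("T", "U")


print(transcribe("ATGGCATAA"))  # AUGGCAUAA
```

In the cell, RNA-polymerase actually reads the opposite (template) strand; starting from the coding strand simply avoids the extra complement step in the sketch.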
The Central Dogma (3)
Gene Structure and Information Content
• Information formatting and interpretation is very important
  – Alphabet and punctuation
• Same ‘language’ used for both:
  – Prokaryotes (bacteria)
  – Eukaryotes (more complex life forms)
Promoter Sequences
• Gene Expression
  – Process of using information in DNA to make an RNA molecule, then a corresponding protein
• Expressing the right quantity of protein is essential for survival
• Two crucial distinctions
  – Which part of the genome is the start of a gene
  – Which genes code for proteins needed at a particular time
• Responsibility falls to RNA-polymerase
Promoter sequences (2)
• Can’t look for a single nucleotide
  – 1 in 4 chance of appearing at random
  – General probability of a specific sequence of n nts = (1/4)^n
• Prokaryotes: 13 nt promoter sequences
  – About 1 in 70 million chance of random appearance (4^13 ≈ 67 million)
  – Genome a few million nts long
  – The 13 nts: 1 nt at the transcription start, 6 that are 10 nts upstream & 6 that are 35 nts upstream
• Eukaryotic genomes are several orders of magnitude bigger
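The promoter arithmetic above can be checked with a short Python sketch. It assumes all four nucleotides are equally likely and independent, which real genomes only approximate. The genome length of 4,000,000 nt is a hypothetical stand-in for “a few million”, and both function names are invented for this example.

```python
def random_match_probability(n: int) -> float:
    """Chance that one specific n-nt sequence appears at a given position."""
    return 0.25 ** n


def expected_random_hits(n: int, genome_length: int) -> float:
    """Approximate number of chance matches on one strand of a genome."""
    # Roughly one candidate start position per nucleotide.
    return genome_length * random_match_probability(n)


p = random_match_probability(13)
print(round(1 / p))                          # 67108864, i.e. about 1 in 70 million
print(expected_random_hits(13, 4_000_000))   # well below one chance match
```

Because the expected number of chance matches is well under one, a 13 nt signal is specific enough for a genome of that size, while the same signal would be expected to turn up by chance many times in a genome that is orders of magnitude larger.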
Promoter Sequences (3)
• Two types of Genes:
  1. Structural
     • Cell structure or metabolism
  2. Regulatory
     • Production control
     • Positive regulation
     • Negative regulation
The Genetic Code
• Need a way to robustly translate from DNA to Protein
  – 4 nt alphabet
  – 20 amino acid (aa) alphabet
  – Mismatch
• Codon (triplet code)
  – 1 & 2 nts give < 20 combinations (4 and 16); 3 nts give 64
  – Each aa coded by a codon
  – Degeneracy: more than 1 codon per aa = robustness
  – Stop codon: full stop
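A compact way to see the triplet code and its degeneracy is to build the standard codon table and translate a short message. The sketch below uses the standard genetic code with one-letter amino acid symbols and `*` for stop; the names `CODON_TABLE` and `translate` are illustrative, and the reading frame is assumed to start at the first nucleotide of the input.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code, ordered UUU, UUC, UUA, UUG, UCU, ... GGG; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(len(CODON_TABLE))             # 64 codons for 20 amino acids plus stop
print(Counter(AMINO_ACIDS)["L"])    # 6 codons all code for leucine
print(translate("AUGGCAUAA"))       # MA
```

The counts make the degeneracy concrete: 61 codons share 20 amino acids, and the remaining 3 are stops.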
The Genetic Code (2)
Open Reading Frames (ORFs)
• Start codon: AUG (and methionine)
• Reading frame
  – Established by start codon
  – Necessary for accurate translation
  – Mistakes lead to wrong proteins (& premature stops)
• Open Reading Frame
  – Inordinately long reading frame with no stop codon
  – Proteins are 100s of aa long
  – Random stop: about 1 codon in 20 (3 of the 64 codons are stops)
  – Distinguishing feature of many prokaryotic and eukaryotic genes
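The idea of an open reading frame can be sketched as a simple scan. This is a simplified illustration, not a production gene finder: it looks at one strand only, works on DNA letters (ATG as start; TAA, TAG, TGA as stops), and the name `find_orfs` and the `min_codons` threshold are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 100):
    """Return (start, end) index pairs of ORFs on the given strand.

    start is the index of the ATG; end is the index just past the stop codon.
    Only ORFs with at least min_codons codons before the stop are reported.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):                      # three possible reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                       # open a candidate frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                    # frame closed, keep scanning
    return orfs


print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=3))  # [(2, 17)]
```

With the default threshold of 100 codons the toy sequence yields nothing. That mirrors the slide's point: random sequence hits a stop about every 20 codons, so only unusually long stop-free frames are likely to be genes.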
Introns and Exons
• Messenger RNA: a perfect copy of the DNA in prokaryotes; edited before translation in eukaryotes
• Introns: locally uninformative sequences in the unspliced mRNA
• Exons: locally informative sequences in the mRNA
• Splicing: removal of introns, rejoining of exons
• Spliceosomes: enzyme complexes that do the splicing
  – GT-AG rule: introns start with GT and end with AG (on its own, potentially too common)
  – Checks 6 extra nts
  – Allows subtle nuances
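Splicing can be pictured as cutting out known intervals and joining what is left. The sketch below assumes the intron positions are already known and given as half-open `(start, end)` index pairs, and it checks only the GT-AG rule, not the extra flanking nucleotides mentioned above. The function name `splice` and the toy sequence are made up for illustration.

```python
def splice(sequence: str, introns):
    """Remove introns and return the joined exons.

    introns is a list of (start, end) half-open index pairs.
    Raises ValueError if an intron does not start with GT and end with AG.
    """
    seq = sequence.upper()
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = seq[start:end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} breaks the GT-AG rule")
        exons.append(seq[cursor:start])   # keep the exon before this intron
        cursor = end                      # skip over the intron
    exons.append(seq[cursor:])            # keep the final exon
    return "".join(exons)


# Toy gene: exon ATGGCC, intron GTAAGTTTAG, exon GATTAA.
print(splice("ATGGCCGTAAGTTTAGGATTAA", [(6, 16)]))  # ATGGCCGATTAA
```

Because GT and AG each occur by chance about once every 16 positions, the two dinucleotides alone cannot locate introns, which is why the slide calls the rule potentially too common.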
Introns and Exons (2)
Protein Structure and Function
• Proteins are molecular machinery that performs most work in cells
• Vast array of tasks
  – Structure, catalysis, transportation, signalling, metabolism …
• Highly complex compounds
  – Primary, secondary, tertiary, quaternary structure.
Primary & Secondary Structure
• Primary structure = the linear sequence of amino acids comprising a protein:
  AGVGTVPMTAYGNDIQYYGQVT…
• Secondary structure
  – Regular patterns of hydrogen bonding in proteins result in two patterns that emerge in nearly every protein structure known: the α-helix and the β-sheet
  – The location and direction of these periodic, repeating structures is known as the secondary structure of the protein
Planarity of the peptide bond
Phi (φ) – the angle of rotation about the N–Cα bond.
Psi (ψ) – the angle of rotation about the Cα–C bond.
The planar bond angles and bond lengths are fixed.
Phi and psi
φ = ψ = 180° is the extended conformation
φ: Cα to N–H; ψ: C=O to Cα
The alpha helix: φ ≈ ψ ≈ −60°
Properties of the alpha helix: φ ≈ ψ ≈ −60°
• Hydrogen bonds between C=O of residue n and NH of residue n+4
• 3.6 residues/turn
• 1.5 Å/residue rise
• 100°/residue turn
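The three helix figures above are linked by simple arithmetic, which a short Python sketch makes explicit. The constants come from the slide; the function names are invented for this example, and the length estimate treats every residue as contributing one full rise, which is only an approximation for short helices.

```python
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5  # ångströms along the helix axis


def turn_per_residue() -> float:
    """Rotation about the helix axis contributed by one residue, in degrees."""
    return 360.0 / RESIDUES_PER_TURN


def helix_pitch() -> float:
    """Distance along the axis covered by one full turn, in ångströms."""
    return RESIDUES_PER_TURN * RISE_PER_RESIDUE


def helix_length(n_residues: int) -> float:
    """Approximate axial length of a helix with n_residues residues."""
    return n_residues * RISE_PER_RESIDUE


print(f"{turn_per_residue():.1f} degrees per residue")   # 100.0
print(f"{helix_pitch():.1f} A per turn")                 # 5.4
print(f"{helix_length(20):.1f} A for 20 residues")       # 30.0
```

So the 100°/residue figure is just 360° divided by 3.6 residues per turn, and one turn advances about 5.4 Å along the axis.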
The beta strand (& sheet): φ ≈ −135°, ψ ≈ +135°
Properties of beta sheets
• Formed of stretches of 5-10 residues in extended conformation
• Pleated – each Cα a bit above or below the previous
• Parallel/antiparallel, contiguous/non-contiguous
Parallel and anti-parallel β-sheets
• Anti-parallel is slightly energetically favored
Molecular Biology Tools
• Restriction enzyme digests
• Gel electrophoresis
• Blotting and hybridization
• Cloning
• Polymerase chain reaction
• DNA sequencing
Genomic Information Content
• C-value paradox
  – No correlation between organism complexity and DNA size
• Reassociation Kinetics
  – Denaturing/renaturing
  – C0t equation: t0.5, the time at which half of the DNA has reassociated
  – Junk DNA
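The slide names the C0t equation without writing it out. The sketch below assumes the standard second-order form for reassociation, C/C0 = 1 / (1 + k·C0·t), where C0 is the starting concentration of single-stranded DNA, t is time and k is a rate constant. The function names and the numeric values are hypothetical, chosen only to show where the half-reassociation point falls.

```python
def fraction_single_stranded(c0: float, t: float, k: float) -> float:
    """Fraction of DNA still single-stranded after time t (assumed second-order kinetics)."""
    return 1.0 / (1.0 + k * c0 * t)


def half_cot(k: float) -> float:
    """Value of C0 * t at which half of the DNA has reassociated."""
    # Setting C/C0 = 1/2 in the equation gives k * C0 * t = 1.
    return 1.0 / k


# Hypothetical values for illustration only.
k, c0 = 1.0, 2.0
t_half = half_cot(k) / c0
print(fraction_single_stranded(c0, t_half, k))  # 0.5
```

Under this form, sequences present in many copies find partners faster and so reach the half-way point at a lower C0t value than single-copy sequences, which is what makes the measurement informative about repeated DNA.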
… & Finally
“There are only 10 types of people in the world: those that understand binary and those that do not”
Pete Smith (or Anon)