A super quick introduction to molecular biology and genomics · 2018-01-31 · A super quick...
A super quick introduction to molecular biology and genomics
Héctor Corrada Bravo
Dept. of Computer Science
Center for Bioinformatics and Computational Biology
University of Maryland
Key terms
• Genotype/Phenotype
• Cell
• Proteins
• Evolution: inheritance, selection, variation
• DNA/RNA
• Chromosome
• Gene
• Genome
• Replication
• Transcription
• Exon/Intron
• Translation
• Codon
• Central Dogma
• Gene Expression
• Regulation
• Epigenetics
Why are my children such pigs?
Why am I such a pig?
Phenotype, cells, metabolism, protein
Proteins
• phenotype: characteristics (traits) of an organism
• characteristics are due to cellular structures and activities
  – mostly carried out by proteins
• Examples:
alpha-keratin               component of hair
insulin                     regulates blood glucose level
actin & myosin              muscle contraction
hemoglobin                  oxygen transport
DNA polymerase              synthesis of DNA
DNA glycosylases            DNA repair
matrix metalloproteinase    extracellular matrix degradation
Genetics
• gene: in classical genetics it was an abstract concept
  – a unit of inheritance passed from parent to offspring
  – genes specify proteins
• genome refers to the complete set of genes
• genotype: genetic characteristics of an individual
What is Genomics?
• Study the molecular basis of variation in development and disease
• Using high-throughput experimental methods
• algorithms
• ML
• data management
• modeling
[Figure with panels labeled "cancer" and "healthy"]
What is Genomics?
• Each cell contains a complete copy of an organism's genome, or blueprint for all cellular structures and activities.
• The genome is distributed along chromosomes, which are made of compressed and entwined DNA.
• Cells are of many different types (e.g. blood, skin, nerve cells), but all can be traced back to a single cell, the fertilized egg.
Chromosomes
The chromosomes shown are human, from a patient with Down syndrome.
DNA
Watson and Crick, 1953
DNAs (deoxyribonucleic acids) are molecules that store the genetic information of a living organism.
DNA consists of two polymers made from four types of nucleotides: adenine (A), guanine (G), cytosine (C) and thymine (T).
Purines: A, G; Pyrimidines: C, T
The two polymers are complementary to each other and form a double-helix structure.
5’-ACCGTTCGACGGTAA-3’
   |||||||||||||||
3’-TGGCAAGCTGCCATT-5’
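The base pairing in the duplex above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides; the names `PAIRS`, `complement` and `reverse_complement` are chosen here for illustration.

```python
# Watson-Crick pairing: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand)

def reverse_complement(strand):
    """Return the complementary strand written in its own 5'-to-3' direction."""
    return complement(strand)[::-1]

top = "ACCGTTCGACGGTAA"           # upper strand of the duplex, read 5' to 3'
print(complement(top))            # lower strand as drawn, read 3' to 5'
print(reverse_complement(top))    # the same lower strand, read 5' to 3'
```

The first printed string should match the lower strand of the duplex as drawn; the second is the same strand written in the conventional 5'-to-3' direction.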
chromatin
Measurement
• For a small enough piece, we can measure the sequence of bases, referred to as sequencing
• Human Genome Project
Genome
TCAGTTGGAGCTGCTCCCCCACGGCCTCTCCTCACATTCCACGTCCTGTAGCTCTATGACCTCCACCTTTGAGTCCCTCCTCTCACACCTGACATGAAAAGGCACATGAGGATCCTCAAATACCCCGTGATCAGTCTCAGGGTAGCTCTCATAGCCTGGACAGGGCCCCCCTCGGGGGTTGCGCCCAGGTCCAGGCGGGGGATGCACAGCAACAGTCACCGAAGCAGAAGCCGTCACAGTGGTGATGGGCTGGCAGTAGCTGGGCACAGAGCTGCCCATGGCGGTGGACGTTGGGTTCCGAGGGTTGTGAGAACGGGCCCCACGGGGCCCTGAGCGGTCCCTATTGCTAGGGCCAGAATGCCCTTCAGTAGAAATTTCAAAAGCGTCTCTGCGCGGTCTGTAGGGGGGTGGCCGCAAGCCTTCTCTAGGGGGATCCCTTCGAGGCTGCTGGCCTTGCCGTCCAGGGGACAAGGAGCCAGAGTCCAGGTGGGGCTGTTGCCGAGGGGTCAAGGGAGGCTGATGTCTGGAGTCCGGATGGACCACCTGCAGAGGAGAGACATAGGTCAACACAGGGAGGTAGGATGGTGGTGATGTTCCACCCACAAAAGAAAACCTATTCCTTTAGAAACCTCCAGGATGTGAATCCTGCCTGCACCTGCACAGCTGGCTGGAGGCATATAGCCACTGCCCATAGATCTCAACTTACCCTCACAACCAACTGCCCCCAGGCCTAAGTTCTCTGCCTCAAAACTGCCAAGGCCTGGATAGCCAAGAGCCTGGGTGTCTTGGAAATATGCAACCATAAATAGTAGCTTTTAGAAGTATAAGGCTCCTGTTTCTGGGTCATATTAGTGTTGTTTTCACCTGTCCCCAGCCCTAAGCCAGGTGTGGCCAGAAGCAAATGTACTGTAAGAGCAGAGCAAAAACTTCCACACAGATAGTTCTGTTAGGCAATACATCTCTGCCTGACTATTAGGAATCTGGTTTCTGGGTCCTCTGTACAAAGCTCGGAGCAACACAGTGGCCACATCAATCAAAAGGACCGTGACCAACTTCAAAGTCGGTGAGCTTGTACCTATTTTTAGGCTCCTGCTGAACAGAACCAGATTCACACTACAGCTCAGCAGGGCATCGTCACGGGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTTGGGGGGGGGGGGTGGACAGAGGACGGGGACACAATTCACTGGCCAGCCCTTCTCTCCTTCAAGGAAGGCTGCTCTAGCCTGGGACTGGAATACACATTTCCTGTAAACATGGTGGGGGCCTCAGGCAAGCCAGAGTTTTGGAGCCTTCCTTAACTCTTCAAGGTGAGCATCTTGACTTGGAGGGTGGGGGTGCGGGTAAGGAAGGAACCTGTGGACTCCTCCCTACAAGACAGAAAAGGAATAAGCCACGAAGACAATAACGATTTTTGTATCAAGCGTCCTCTCCCATTTCAGCTTACCTGACAATGAAATCAAATTCGGACCCTGCAAGCATCAGTACACCCAGCAGAGTGGACACAGCACCGTCCAGAACGGGAGCAAACATGTGCTCCAGAGCGAGCATAGCCCTGTGGTTCTTGTCCCCAATGGCTGTCAGAAAGGCCTGAACAAAGGAGAAAATTGACACGGTCACATTCTGGGTGTGGTAAAGTGCTCAGCTGTGTCTATACTTGGGTTTTGTAT…
Total amount of DNA in the human genome: 3 × 10^9 base pairs (bp)
Replication
[Figure: the double-stranded fragment
TTCGATTACGA
AAGCTAATGCT
is copied into identical double-stranded fragments, using the nucleotides available in cells.]
Genes
[Figure: five regions, each labeled "Gene"]
Central Dogma
DNA → RNA → Proteins
Genes encode proteins: a gene is transcribed into mRNA, which is then translated into protein.
Transcription
DNA:
C T A G C G C T C
| | | | | | | | |
G A T C G C G A G
RNA polymerase
mRNA: C U A G C G
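The transcription diagram can be mimicked in code. The sketch below assumes that the lower DNA strand is the template and that it is written in the same left-to-right order as on the slide; `DNA_TO_RNA` and `transcribe` are illustrative names, not part of the original material.

```python
# Pairing used when RNA is built against a DNA template.
# RNA contains uracil (U) where DNA would have thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template):
    """Return the mRNA complementary to a DNA template strand,
    written in the same left-to-right order as the template."""
    return "".join(DNA_TO_RNA[base] for base in template)

template = "GATCGCGAG"    # lower strand of the diagram (assumed template)
mrna = transcribe(template)
print(mrna)               # full transcript of the fragment
print(mrna[:6])           # the part already synthesized in the diagram
```

Note that the resulting mRNA has the same sequence as the upper (non-template) strand, with U in place of T.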
http://www.uniprot.org/uniprot/P09238
Translation
http://gel.ym.edu.tw/~ycl6/sc2005/images/translation.gif
The genetic code
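A rough sketch of how the genetic code maps mRNA codons to amino acids is given below. It builds the standard codon table from a compact 64-letter string (one-letter amino acid codes, `*` for stop) and assumes translation begins at the first AUG; the function and variable names are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":      # stop codon: UAA, UAG or UGA
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGGCCUGGUAA"))   # codons: AUG GCC UGG UAA
```

Here AUG codes for methionine (M), GCC for alanine (A), UGG for tryptophan (W), and UAA is a stop codon.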
gene regulation
http://string-db.org/version_9_0/newstring_cgi/show_network_section.pl?identifier=9606.ENSP00000279441&all_channels_on=1&network_flavor=evidence&targetmode=proteins
What makes them different?
Much human variation is due to differences in ~6 million base pairs (0.1% of genome), referred to as SNPs
Genomic DNA: TACATAGCCATCGGTANGTACTCAATGATGATA
SNP: the position marked N is either A or G
Single Nucleotide Polymorphism (SNP)
Three genotypes

AA
Mother: TACATAGCCATCGGTAAGTACTCAATGATGATA
        ATGTATCGGTAGCCATTCATGAGTTACTACTAT
Father: TACATAGCCATCGGTAAGTACTCAATGATGATA
        ATGTATCGGTAGCCATTCATGAGTTACTACTAT

AG
Mother: TACATAGCCATCGGTAAGTACTCAATGATGATA
        ATGTATCGGTAGCCATTCATGAGTTACTACTAT
Father: TACATAGCCATCGGTAGGTACTCAATGATGATA
        ATGTATCGGTAGCCATCCATGAGTTACTACTAT

GG
Mother: TACATAGCCATCGGTAGGTACTCAATGATGATA
        ATGTATCGGTAGCCATCCATGAGTTACTACTAT
Father: TACATAGCCATCGGTAGGTACTCAATGATGATA
        ATGTATCGGTAGCCATCCATGAGTTACTACTAT
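The three genotypes follow from choosing two alleles, one inherited from each parent, without regard to order. A small sketch, assuming the SNP has exactly the two alleles A and G and using the genomic sequence from the SNP slide (with N marking the variable position); all names are illustrative.

```python
from itertools import combinations_with_replacement

REFERENCE = "TACATAGCCATCGGTANGTACTCAATGATGATA"   # N marks the SNP position
ALLELES = "AG"                                     # assumed: two alleles only

def with_allele(allele):
    """Return the genomic sequence with the SNP position set to one allele."""
    return REFERENCE.replace("N", allele)

# Unordered pairs of alleles: one copy from each parent.
genotypes = ["".join(pair) for pair in combinations_with_replacement(ALLELES, 2)]

for genotype in genotypes:
    first_copy, second_copy = genotype
    print(genotype)
    print("  ", with_allele(first_copy))
    print("  ", with_allele(second_copy))
```

With two alleles there are three unordered pairs, which is why a SNP like this one gives the genotypes AA, AG and GG.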