Research in Computational Genomics Mar Albà
![Page 1: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/1.jpg)
Research in Computational Genomics
Mar Albà
Evolutionary Genomics Group
Research Unit on Biomedical Informatics
Universitat Pompeu Fabra
UPC, April 1 2005
![Page 2: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/2.jpg)
1. The genetic information
2. The human genome project
3. Genomics: techniques and research
![Page 3: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/3.jpg)
1. The genetic information
![Page 4: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/4.jpg)
1865 – Mendel
The genetic information: inheritance
![Page 5: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/5.jpg)
1928 – Griffith: transforming principle
Infection of mice with pneumonia bacteria:
- deadly bacteria: mice die
- non-deadly bacteria: mice live
- boiled deadly bacteria: mice live
- boiled deadly bacteria + non-deadly bacteria: mice die
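1944 – Avery, MacLeod, McCarty: DNA is the transforming principle
- boiled deadly bacteria + non-deadly bacteria + DNAse: mice live
- boiled deadly bacteria + non-deadly bacteria + protease: mice die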
DNA is the hereditary material
![Page 6: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/6.jpg)
DNA structure
1953 – Watson and Crick discover the structure of DNA
1953 – Rosalind Franklin: X-ray diffraction image of DNA
![Page 7: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/7.jpg)
DNA structure: antiparallel double helix
nucleotides:
A: adenine
G: guanine
C: cytosine
T: thymine
base pairing: C-G, A-T
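Because the two strands are antiparallel and pair A-T and C-G, one strand fully determines the other. A minimal sketch in Python, assuming strands are plain uppercase A/C/G/T strings; the function name `reverse_complement` and the example sequence are illustrative and not part of the slides:

```python
# Watson-Crick pairing from the slide: A-T and C-G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own 5'->3' direction."""
    # Complement each base, and reverse because the strands are antiparallel.
    return "".join(PAIRS[base] for base in reversed(strand))


print(reverse_complement("ATGGCACAACCA"))
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.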
![Page 8: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/8.jpg)
RNA:
- single strand
- uracil instead of thymine
- contains ribose instead of deoxyribose
A-U
C-G
![Page 9: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/9.jpg)
Proteins
QIKDLLVSSSTDLDTTLVLVNAIYFKGMWKTAFNAEDTREMPFHVTKQESKPVQMMCMNNSFNVATLPAEKMKILELPFASGDLSMLVLLPDEVSDLERIEKTINFEKLTEWTNPNTMEKRRVKVYLPQMKIEEKYNLTSVLMALGMTDLFIPSANLTGISSAESLKISQAVHGAFMELSEDGIEMAGSTGVIEDIKHSPESEQFRADHPFLFLIKHNPTNTIVYFGRYWSP
![Page 10: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/10.jpg)
Proteins are made of amino acids
amino acid
![Page 11: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/11.jpg)
20 amino acids
![Page 12: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/12.jpg)
Peptide bond
Proteins: amino acid chain
![Page 13: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/13.jpg)
![Page 14: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/14.jpg)
DNA replication
![Page 15: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/15.jpg)
Transcription
The transcription of a gene may be off or on, depending on the cell type and conditions.
![Page 16: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/16.jpg)
Translation
![Page 17: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/17.jpg)
Translation
![Page 18: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/18.jpg)
Genetic code
coding DNA: nucleotides 1 2 3 code for amino acid 1 (AA 1), nucleotides 4 5 6 for amino acid 2 (AA 2)
coding DNA: ATG GCA CAA CCA …
protein: Met Ala Gln Pro ..
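The triplet reading shown on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code and an input that starts in frame; the names `CODON_TABLE` and `translate` are illustrative and not from the slides:

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code: amino acids for codons in the order TTT, TTC, TTA, ... GGG.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate coding DNA into one-letter amino acids, stopping at a stop codon."""
    protein = []
    # Read the sequence three nucleotides at a time; a trailing partial codon is ignored.
    for i in range(0, len(coding_dna) - 2, 3):
        aa = CODON_TABLE[coding_dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCACAACCA"))  # MAQP, i.e. Met-Ala-Gln-Pro
```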
![Page 19: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/19.jpg)
DNA cloning
DNA fragments + vectors (replicating DNA)
+ DNA ligase
vector with insert
transformation of bacteria
amplification
extraction
![Page 20: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/20.jpg)
DNA sequencing
DNA polymerase
DNA synthesis
resulting partial labelled fragments
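A toy sketch of why partial labelled fragments reveal the sequence, assuming the slide depicts chain-termination (Sanger-style) sequencing in which synthesis stops at every possible length and each fragment carries a label for its last base. The function names and the example sequence are illustrative only:

```python
import random


def labelled_fragments(synthesized: str) -> list[tuple[int, str]]:
    """Every partial synthesis product as (length, label of its last base)."""
    return [(n, synthesized[n - 1]) for n in range(1, len(synthesized) + 1)]


def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Order fragments by length (as size separation would) and read the labels."""
    return "".join(label for _, label in sorted(fragments))


fragments = labelled_fragments("ATGGCACAACCA")
random.shuffle(fragments)  # fragments come out of the reaction unordered
print(read_sequence(fragments))
```

Sorting by length plays the role of the size separation step: the label on the shortest fragment gives the first base, the next shortest gives the second, and so on.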
![Page 21: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/21.jpg)
DNA sequencing
![Page 22: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/22.jpg)
2. The human genome project
![Page 23: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/23.jpg)
The human genome project
1953 - Discovery of the DNA double helix by Watson and Crick
1995 - Haemophilus influenzae genome
2001 - The first draft of the human genome is published, covering approximately 94% of the genome (Public Consortium + Celera)
2003 – Human genome sequence complete
![Page 24: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/24.jpg)
2001 – Draft of the human genome
15 February 2001
![Page 25: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/25.jpg)
Josep Abril and Roderic Guigó
IMIM (Institut Municipal d’Investigacions Mèdiques, Barcelona) participates in the annotation of the human genome
![Page 26: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/26.jpg)
Human genome: 3,000,000,000 nucleotides
![Page 27: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/27.jpg)
Human chromosomes
![Page 28: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/28.jpg)
What’s in the human genome?
gene non-coding part
gene coding part (2%)
“parasitic” repetitive elements
microsatellites
DNA long repeats
![Page 29: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/29.jpg)
EXONS
INTRONS
‘UPSTREAM’ REGULATORY ELEMENT
‘DOWNSTREAM’ REGULATORY ELEMENT
PROMOTER
PROTEIN
Gene structure
![Page 30: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/30.jpg)
Comparison with other genomes

| Organism | Genome size (bases) | Estimated genes |
|---|---|---|
| Human (Homo sapiens) | 3 billion | 30,000 |
| Laboratory mouse (M. musculus) | 2.6 billion | 30,000 |
| Mustard weed (A. thaliana) | 100 million | 25,000 |
| Roundworm (C. elegans) | 97 million | 19,000 |
| Fruit fly (D. melanogaster) | 137 million | 13,000 |
| Yeast (S. cerevisiae) | 12.1 million | 6,000 |
| Bacterium (E. coli) | 4.6 million | 3,200 |
| Human immunodeficiency virus (HIV) | 9700 | 9 |
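One way to read this comparison is to divide genome size by gene count. The short Python sketch below does that arithmetic with the figures from the slide; the dictionary layout and the name `bases_per_gene` are illustrative choices, not part of the original deck:

```python
# (genome size in bases, estimated genes), copied from the comparison table.
GENOMES = {
    "Human": (3_000_000_000, 30_000),
    "Laboratory mouse": (2_600_000_000, 30_000),
    "Mustard weed": (100_000_000, 25_000),
    "Roundworm": (97_000_000, 19_000),
    "Fruit fly": (137_000_000, 13_000),
    "Yeast": (12_100_000, 6_000),
    "E. coli": (4_600_000, 3_200),
    "HIV": (9_700, 9),
}


def bases_per_gene(size: int, genes: int) -> float:
    """Average number of genome bases per estimated gene."""
    return size / genes


for name, (size, genes) in GENOMES.items():
    print(f"{name:18s} {bases_per_gene(size, genes):>10,.0f} bases per gene")
```

With these figures the human genome has about 100,000 bases per gene, while E. coli has roughly 1,400, so genome size grows much faster than gene number.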
![Page 31: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/31.jpg)
Gene catalogue
~ 30,000 genes
~ 10,000 already known (cDNA)
- Gene prediction programmes
- Homology to other species
- ESTs (expressed sequence tags)
- the functions of approximately half of the genes are not known!
![Page 32: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/32.jpg)
“Parasitic” repetitive elements
Nature, Feb. 15, 2001
![Page 33: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/33.jpg)
“Parasitic” repetitive elements
Retrotransposition
genome
LINE
RNA
transcription (pol II)
translation
translocation of the complex
LINE copy
cytoplasm
![Page 34: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/34.jpg)
3. Genomics: techniques and research
![Page 35: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/35.jpg)
- bioinformatics
- genome sequencing and annotation
- functional genomics
- systems biology
Genomics
![Page 36: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/36.jpg)
![Page 37: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/37.jpg)
Genome sequencing and annotation
![Page 38: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/38.jpg)
Exponential growth of DNA sequences
![Page 39: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/39.jpg)
How many genomes?
[Figure: Genome Sequencing Projects on GOLD ©. Number of incomplete and complete projects (axis 0 to 1200), quarterly from Dec-97 to Mar-04]
![Page 40: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/40.jpg)
Recently sequenced eukaryotic genomes
T.rubripes
C.intestinalis
A.gossypii
A.mellifera
R.norvegicus
A.gambiae
![Page 41: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/41.jpg)
How long does it take to sequence a genome?
bacteria: 1 day
fungus: 1 week
insect: 1-2 months
mammal: 1-2 years
![Page 42: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/42.jpg)
Gene prediction
- DNA coding for protein sequences (exons) only accounts for 2% of the human genome
- Information we can use:
  - splice site signals
  - statistics of coding sequences
EXONS
PROTEIN
gene
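As an illustration of a splice site signal, the sketch below scans a sequence for spans that start with GT and end with AG, the dinucleotides that most commonly mark intron boundaries. It is a toy under stated assumptions: the sequence is invented, the name `candidate_introns` and the `min_len` parameter are hypothetical, and real gene prediction programmes combine such signals with coding statistics rather than listing every GT...AG pair:

```python
def candidate_introns(dna: str, min_len: int = 4) -> list[tuple[int, int]]:
    """Return (start, end) spans that begin with GT and end with AG.

    Spans use Python slice conventions, so dna[start:end] is the candidate.
    """
    # Donor signal: position of each "GT"; acceptor signal: position just past each "AG".
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    acceptors = [i + 2 for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    return [
        (start, end)
        for start in donors
        for end in acceptors
        if end - start >= min_len
    ]


toy_gene = "ATGGCAGTAAGTCCCTTTAGCAACCA"  # invented sequence, for illustration only
for start, end in candidate_introns(toy_gene):
    print(start, end, toy_gene[start:end])
```

Even this short sequence yields several overlapping candidates, which is why signals alone are not enough and are combined with other evidence.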
![Page 43: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/43.jpg)
Sequence similarity
- To predict genes we can also use sequence similarity searches against known proteins
alignment of protein sequences
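To make the idea of aligning protein sequences concrete, here is a minimal Needleman-Wunsch global alignment in Python. It is a sketch, not the method used in the slides: the scoring (match +1, mismatch -1, gap -1) is an assumption chosen for simplicity, whereas real similarity searches use substitution matrices and faster heuristic tools. The second example sequence is an invented variant of the start of the protein shown earlier in the deck:

```python
def global_alignment(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch: return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def substitution(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming table row by row.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


best, top, bottom = global_alignment("QIKDLLVSSS", "QIKDLVSSS")
print(best)
print(top)
print(bottom)
```

A high score relative to sequence length suggests the two sequences are related, which is the basis for using known proteins to support a gene prediction.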
![Page 44: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/44.jpg)
Microbial Genomes at NCBI
http://www.ncbi.nlm.nih.gov/genomes/MICROBES/Complete.html
National Center for Biotechnology Information, National Institutes of Health
![Page 45: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/45.jpg)
Functional annotation of all genes in a genome
![Page 46: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/46.jpg)
Ensembl Genome Browser
http://www.ensembl.org
European Bioinformatics Institute
![Page 47: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/47.jpg)
Ensembl Genome Browser
![Page 48: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/48.jpg)
![Page 49: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/49.jpg)
ENCODE (NIH)
Encyclopedia Of DNA Elements
- exhaustive analysis of 1% of the human genome
- identification of functional elements
- development and comparison of different computational methods
http://www.genome.gov/Pages/Research/ENCODE/
2003-
![Page 50: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/50.jpg)
HapMap (Haplotype Map)
http://www.hapmap.org/
2002-
Variability map (single nucleotide polymorphisms, SNPs) in populations from Africa, Asia and the USA.
It will help identify genes involved in complex disease, by association with particular haplotypes.
haplotype variants
SNPs
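The relation between SNPs and haplotype variants can be illustrated with a small Python sketch. It assumes the haplotypes are already aligned, equal-length strings; the sequences are invented and the name `snp_positions` is hypothetical:

```python
def snp_positions(haplotypes: list[str]) -> list[int]:
    """Return 0-based positions where the aligned haplotypes are not all identical."""
    return [
        pos
        for pos, column in enumerate(zip(*haplotypes))
        if len(set(column)) > 1
    ]


# Invented, equal-length haplotypes for illustration only.
haplotypes = [
    "AACGTTAGC",
    "AACGTTAGC",
    "AATGTTGGC",
    "AATGTCGGC",
]
positions = snp_positions(haplotypes)
print(positions)
# Each haplotype can then be summarised by its alleles at the SNP positions.
print({"".join(h[p] for p in positions) for h in haplotypes})
```

Summarising each chromosome by its alleles at the variable positions is what makes association with a particular haplotype tractable.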
![Page 51: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/51.jpg)
Environmental Genome Shotgun Sequencing of the Sargasso Sea
J. Craig Venter et al., Science, Vol 304, Issue 5667, 66-74, 2 April 2004
1.045 billion base pairs
1800 genomic species
148 previously unknown bacterial phylotypes
![Page 52: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/52.jpg)
Functional genomics
![Page 53: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/53.jpg)
DNA microarrays: high-throughput analysis of gene transcription
![Page 54: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/54.jpg)
ChIP-chip: analysis of protein-bound DNA fragments
cross-link protein and DNA
immunoprecipitation
eliminate protein
hybridize with DNA
![Page 55: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/55.jpg)
Protein-protein interactions: yeast two-hybrid
![Page 56: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/56.jpg)
Protein interaction networks
![Page 57: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/57.jpg)
![Page 58: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/58.jpg)
Systems biology
- Development of mathematical methods to model the behaviour of biological systems, including all elements in the system and their interactions.
![Page 59: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/59.jpg)
Founded in 2000 by Leroy Hood, Seattle
Masaru Tomita, Keio University, Japan
![Page 60: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/60.jpg)
National Center for Biotechnology Information (USA):
http://www.ncbi.nlm.nih.gov
European Bioinformatics Institute (UK):
http://www.ebi.ac.uk
![Page 61: Research in Computational Genomics Mar Albà](https://reader033.fdocuments.in/reader033/viewer/2022060108/554ea2fdb4c9055f7b8b4869/html5/thumbnails/61.jpg)
Acknowledgements:
Grup de Recerca en Informàtica Biomèdica – Ferran Sanz
Grup de Genòmica Computacional – Roderic Guigó
Universitat Pompeu Fabra
www.imim.es/grib