Professional Development Course 1 – Molecular Medicine Genome Biology June 12, 2012

Post on 29-Jan-2016

Professional Development Course 1 – Molecular Medicine

Genome Biology
June 12, 2012

Ansuman Chattopadhyay, PhD
Head, Molecular Biology Information Services
Health Sciences Library System
University of Pittsburgh
ansuman@pitt.edu

http://www.hsls.pitt.edu/guides/genetics

Genomic achievements since the Human Genome Project

http://www.hsls.pitt.edu/molbio

Objective

Organism Whole Genome Sequence Databases

Genome Browsers

Topics

Genome Sequencing Projects
• NCBI Genome resources
• Integrated Microbial Genomes
• UCSC Genome Bioinformatics

Genome Browsers
• UCSC Genome Browser
• UCSC Table Browser
• NCBI Map Viewer
• Generic Genome Browser (GBrowse)

Genome Biology

Human Genome Project Video

Chromosome Structure

Genome Biology: Karyotype

Adapted from NHGRI

Trisomy 21

Monosomy X

Genome Biology: Karyotype

NHGRI

Genome Biology: Molecular Cloning

p53

CFTR

NFkB

8 September, 1989

Genome Biology: Timeline

1976: RNA bacteriophage MS2
1995: Haemophilus influenzae
1996: Yeast
1998: C. elegans
2001: Human genome draft sequence
2002: Drosophila
2003: Published complete human reference genome
2007: Diploid genome sequence of an individual human
2008: Jim Watson genome
2011: Published complete genomes: 1863 organisms

DNA Sequencing Cost

Oxford Nanopore

A 20-node installation, using 8,000-nanopore cartridges, is expected to deliver a complete human genome at 50-fold coverage in 15 minutes, according to the company, or 3 terabases of data per day, based on a sequencing speed of 300 bases per second. For that setup, the cost per gigabase is expected to be under $10.
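The quoted figures can be sanity-checked with a little arithmetic. The sketch below is only a back-of-the-envelope estimate, not the company's own calculation. It assumes one 8,000-nanopore cartridge per node, every pore reading continuously, and a human genome of about 3.1 gigabases; none of these assumptions comes from the slide. Under them the idealised ceiling comes out at roughly 4.1 terabases per day, which is in the same range as the quoted 3 terabases if not every pore is productive all the time.

```python
# Figures quoted on the slide.
NODES = 20
BASES_PER_SECOND_PER_PORE = 300
COVERAGE = 50
QUOTED_BASES_PER_DAY = 3_000_000_000_000   # 3 terabases
QUOTED_MAX_COST_PER_GIGABASE = 10          # US dollars ("under $10")

# Assumptions made only for this illustration (not stated on the slide).
PORES_PER_NODE = 8_000                     # one 8,000-nanopore cartridge per node
HUMAN_GENOME_BASES = 3_100_000_000         # approximate haploid genome size
SECONDS_PER_DAY = 24 * 60 * 60


def ideal_bases_per_day() -> int:
    """Upper bound on output if every pore reads continuously for 24 hours."""
    return NODES * PORES_PER_NODE * BASES_PER_SECOND_PER_PORE * SECONDS_PER_DAY


def cost_ceiling_for_genome() -> float:
    """Cost ceiling in dollars for one genome at the quoted coverage."""
    gigabases = HUMAN_GENOME_BASES * COVERAGE / 1e9
    return gigabases * QUOTED_MAX_COST_PER_GIGABASE


if __name__ == "__main__":
    ideal = ideal_bases_per_day()
    print(f"Ideal throughput: {ideal / 1e12:.2f} terabases per day")
    print(f"Quoted / ideal:   {QUOTED_BASES_PER_DAY / ideal:.0%}")
    print(f"50x genome cost ceiling: ${cost_ceiling_for_genome():,.0f}")
```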

Organism Whole Genome Sequences

2001 2012

Organism Whole Genome Sequences

Human

Mouse

Rat

Dog

Cow

Chimp

Rabbit

……..

Genomes OnLine Database (GOLD) http://www.genomesonline.org/index.htm

Global comprehensive access to information regarding complete and ongoing genome projects, as well as metagenomes & metadata

Search for an organism’s whole genome sequence

Genome Resources

NCBI: Genome Resources

Genome: http://www.ncbi.nlm.nih.gov/sites/entrez?db=genome

JGI: Integrated Microbial Genomes (IMG)

NCBI Genome

NCBI BioProject

Query: Check the status of genome sequencing for an organism, such as the honey bee.

Answer:
• Enter the search term under BioProject
• Select the appropriate organism
• The BioProject summary page provides information on available projects and their sequencing status
• Click on Project Type for more detailed information
• Explore Related Resources
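The same BioProject lookup can also be scripted rather than clicked through. The sketch below is a minimal illustration, assuming NCBI's public E-utilities ESearch endpoint, the database name bioproject, and the [Organism] search field; the function names and the choice of "Apis mellifera" for honey bee are illustrative and not part of the course material. Only the URL builder runs without network access; the fetch helper is included to show the shape of the call.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: NCBI E-utilities ESearch.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def bioproject_search_url(organism: str, retmax: int = 20) -> str:
    """Build an ESearch URL that looks for BioProject records for an organism."""
    params = {
        "db": "bioproject",                 # assumed database name
        "term": f"{organism}[Organism]",    # assumed search field
        "retmode": "json",
        "retmax": retmax,
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def fetch_bioproject_ids(organism: str, retmax: int = 20) -> list:
    """Run the search (needs network access) and return the matching record IDs."""
    with urlopen(bioproject_search_url(organism, retmax)) as response:
        payload = json.load(response)
    # ESearch JSON responses list the hits under esearchresult -> idlist.
    return payload["esearchresult"]["idlist"]
```

A call such as fetch_bioproject_ids("Apis mellifera") would return a list of BioProject identifiers; project type and sequencing status would still need to be read from each record's summary page, as in the steps above.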

Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/rabbit.swf

Resources

• NCBI Genome Project: http://www.ncbi.nlm.nih.gov/genomeprj
• NCBI Genome: http://www.ncbi.nlm.nih.gov/sites/genome

Find the genomic sequence for an organism, such as rabbit.

NCBI Genome Project: A collection of complete and in-progress large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. The database is organized into organism-specific overviews that function as portals for browsing and retrieving projects pertaining to each organism.

Click: Rabbit

http://www.ncbi.nlm.nih.gov/genomeprj

NCBI Genome Project: Rabbit Genome

Link to the video tutorial: http://media.hsls.pitt.edu/media/molbiovideos/img.swf

Resources

Integrated Microbial Genomes (IMG): http://img.jgi.doe.gov/cgi-bin/w/main.cgi

Find the genomic sequence for a bacterium, such as Salmonella enterica.

Human genome sequence

Genomic achievements since the Human Genome Project

http://goo.gl/bsZdN

Genome Biology: Structural Variations

Genome Reference Consortium

Link to the PLoS Biology paper on the GRC: http://goo.gl/30Xun

NCBI Genome Resources: http://www.ncbi.nlm.nih.gov/guide/genomes/

What is a Genome Browser?

Genome Browsers enable researchers to visualize & browse entire genomes with annotated data including:

• gene prediction and structure
• proteins
• expression
• regulation
• variation
• comparative analysis
• etc.

Annotated data is usually from multiple diverse sources.

Genome Browsers

The Big Three

• NCBI MapViewer
• UCSC Genome Browser
• EBI Ensembl

Generic Genome Browser (GBrowse)

Display: Vertical

Display: Horizontal

UCSC Genome Browser

UCSC Genome Browser Default Tracks

UCSC Genome Browser Page

Groups of data (Tracks)

• mRNA and EST Tracks
• Expression (such as microarray)
• Comparative Genomics: as a group, or for individual species
• Variation and Repeats (including SNPs, copy number variation)
• ENCODE Tracks
• Phenotype and Disease Tracks
• Regulation (including TFBS)

Navigating the Human Genome

Browse the region of human chromosome 7 between 54,318,043 and 55,974,438 bp (chr7:54,318,043-55,974,438)
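A position string in this chrN:start-end form is easy to handle in a script. The sketch below is a minimal illustration of parsing it, computing the region length, and building a direct browser link. The function names are hypothetical, and the hg19 assembly and the db and position parameters of the hgTracks page are assumptions about how the browser is addressed, not details given in the slides. Commas are simply stripped, so a mistyped grouping such as 54,318043 still parses to the same coordinate.

```python
import re
from urllib.parse import urlencode

# chromosome name, then start and end with optional thousands separators
REGION_RE = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$")


def parse_region(position: str):
    """Split 'chr7:54,318,043-55,974,438' into (chrom, start, end) integers."""
    match = REGION_RE.match(position.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end position: {position!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def region_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


def hgtracks_url(chrom: str, start: int, end: int, db: str = "hg19") -> str:
    """Browser link for the region; the assembly name is an assumption."""
    query = urlencode({"db": db, "position": f"{chrom}:{start}-{end}"})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


if __name__ == "__main__":
    chrom, start, end = parse_region("chr7:54,318,043-55,974,438")
    print(chrom, start, end, region_length(start, end))
    print(hgtracks_url(chrom, start, end))
```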

Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/ucsc_genes.swf

Resource

UCSC Genome Browser: http://genome.ucsc.edu/

Browse the region of human chromosome 7 between 54,318,043 and 55,974,438 bp.

What genes are present in this region?

UCSC Genome Browser: Navigating a Genomic Region

Bioinformatics Institutions

http://www.ebi.ac.uk/

http://www.ncbi.nlm.nih.gov/

UCSC Genome Browser: Navigating a Genomic Region

What is RefSeq?

NCBI Sequence Databases

GenBank: archival database of nucleotide sequences from >160,000 organisms

RefSeq: non-redundant, expert-verified database of reference sequences, based on GenBank records

International Nucleotide Sequence Database Collaboration

Primary vs. Derivative Databases

RefSeq Scope & Accessions

Genomic DNA
• NC_123456: complete genome, complete chromosome, complete plasmid
• NG_123456: genomic region
• NT_123456: genomic contig

mRNA: NM_123456

Protein: NP_123456

more about RefSeq scope and accessions...
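Because the prefix of a RefSeq accession encodes the molecule type, a record can be classified from its accession alone. The sketch below is a minimal illustration covering only the five prefixes listed on this slide; the function name is hypothetical, and RefSeq defines further prefixes that are not handled here.

```python
# Prefix-to-scope mapping, restricted to the prefixes shown on the slide.
REFSEQ_PREFIXES = {
    "NC": "genomic DNA: complete genome, chromosome, or plasmid",
    "NG": "genomic DNA: genomic region",
    "NT": "genomic DNA: genomic contig",
    "NM": "mRNA",
    "NP": "protein",
}


def classify_refseq(accession: str) -> str:
    """Return the scope implied by the accession prefix, or 'unknown'."""
    prefix, separator, _ = accession.strip().upper().partition("_")
    if not separator:
        return "unknown"          # no underscore, so not in PREFIX_NUMBER form
    return REFSEQ_PREFIXES.get(prefix, "unknown")


if __name__ == "__main__":
    # The placeholder accessions from the slide.
    for acc in ("NC_123456", "NG_123456", "NT_123456", "NM_123456", "NP_123456"):
        print(acc, "->", classify_refseq(acc))
```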

RefSeq Status Codes

• Provisional
• Reviewed
• Predicted
• Genome Annotation

more about RefSeq status codes

UCSC Genome Browser: Navigating a Genomic Region

Display Options

Hide: removes a track from view

Dense: all items collapsed into a single line

Squish: each item = separate line, but 50% height + packed

Pack: each item separate, but efficiently stacked (full height)

Full: each item on separate line

UCSC Genome Browser: Navigating a Genomic Region

Gene Description

Informative description

other resource links

microarray data

mRNA secondary structure

links to sequences

protein domains/structure

orthologs in other species

Gene Ontology™ descriptions

mRNA descriptions

pathways

genetic association studies

comparative toxicology

gene model

UCSC Genome Browser: Navigating a Genomic Region

Find SNPs present in this region

Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/ucsc_snp.swf (File: UCSC_part2.swf)

Resource

UCSC Genome Browser: http://genome.ucsc.edu/

Browse the region of human chromosome 7 between 55,033,691 and 55,282,150 bp.

What genetic variations are present in this region?

Retrieve the DNA sequence of this genomic region showing SNPs in red and all gene exons in blue.
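When the same region is fetched by script instead of through the browser's DNA view, the coordinate convention matters: positions typed into the browser are 1-based and inclusive, whereas UCSC data tables and BED files use 0-based, half-open intervals. The sketch below illustrates the conversion and builds a request URL. The getData/sequence endpoint of the UCSC REST API, its genome, chrom, start and end parameters, and the hg19 assembly are assumptions here and should be checked against the current UCSC documentation. The plain sequence returned this way carries no colouring; SNP and exon highlighting is a feature of the browser's own DNA view.

```python
from urllib.parse import urlencode


def to_zero_based_half_open(start_1based: int, end_1based: int):
    """Convert a 1-based inclusive interval to a 0-based half-open interval."""
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    return start_1based - 1, end_1based


def sequence_api_url(chrom: str, start_1based: int, end_1based: int,
                     genome: str = "hg19") -> str:
    """Build a sequence request URL (endpoint and parameters are assumptions)."""
    start0, end0 = to_zero_based_half_open(start_1based, end_1based)
    query = urlencode({"genome": genome, "chrom": chrom,
                       "start": start0, "end": end0})
    return f"https://api.genome.ucsc.edu/getData/sequence?{query}"


if __name__ == "__main__":
    start0, end0 = to_zero_based_half_open(55_033_691, 55_282_150)
    print("0-based half-open:", start0, end0, "length:", end0 - start0)
    print(sequence_api_url("chr7", 55_033_691, 55_282_150))
```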

UCSC Genome Browser: Navigating a Genomic Region

BLAT: Map a protein sequence to the genome

UCSC BLAT: Place a Peptide Sequence into the Genome

Peptide seq: NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET

Nucleotide seq: AAATCCTCACATTTTTACTCAAATGTTGGACTTCAAATTCAGACATATGAACTTCAGGAAAGCAATGTTCA
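Before submitting either sequence to BLAT, it is worth confirming that the two describe the same region. The sketch below is a minimal check that translates the nucleotide sequence in its first reading frame with the standard genetic code and looks for the result inside the peptide. The helper names are illustrative, and reading frame 1 is an assumption that happens to hold for this example. The nucleotide sequence as shown translates to residues 2 through 24 of the peptide, so it covers only part of the peptide and starts one residue in.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

PEPTIDE = "NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET"
NUCLEOTIDE = ("AAATCCTCACATTTTTACTCAAATGTTGGACTTCAAATTCAGACATATGAACTTCAGGAAAGC"
              "AATGTTCA")


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string; trailing partial codons are ignored."""
    clean = "".join(dna.split()).upper()
    usable = len(clean) - len(clean) % 3
    return "".join(CODON_TABLE[clean[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    translated = translate(NUCLEOTIDE)
    offset = PEPTIDE.find(translated)   # 0-based position within the peptide
    print("Translation:", translated)
    print("0-based offset within the peptide:", offset)
```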

Link to the video tutorial: http://media.hsls.pitt.edu/media/clres2705/blat.swf (File: Blat.swf)

Resource

UCSC BLAT: http://genome.ucsc.edu/cgi-bin/hgBlat?command=start

Place an mRNA or peptide sequence into the human genome

UCSC BLAT: http://genome.ucsc.edu/cgi-bin/hgBlat

UCSC Blat

Peptide seq: NKSSHFYSNVGLQIQTYELQESNVQLKLTVVET

Thank you! Any questions?

Carrie Iwema: iwema@pitt.edu, 412-383-6887
Ansuman Chattopadhyay: ansuman@pitt.edu, 412-648-1297
