Introduction to BioComputing: Biology in silico, 3rd February 2010

Carrie Iwema, PhD, MLS
Molecular Biology Information Specialist
Health Sciences Library System
University of Pittsburgh
iwema@pitt.edu

http://www.hsls.pitt.edu/guides/genetics

General Topics

Information Overload

Genome Gene Protein

Specific Topics

Information Overload
  PubMed
  Alternatives to PubMed: GoPubMed, Novoseek, PubGet
  Molecular Databases
  HSLS Molecular Biology Information Service

Genome, Gene, Protein
  Genome Biology
  Genome Browsers: UCSC Genome Browser, NCBI MapViewer
  Entrez Gene
  UniProt

Information Overload

Breast Cancer: 209K
Colon Cancer: 84K
p53: 52K
STAT1: 4K

5,394 Journals

1.3 billion searches in 2009

Growth of Molecular Databases

Source: Nodal Point Blog

2008: 1075
2009: 1170
2010: 1230

Molecular Databases

Nucleic Acids Research: Oxford Journals
  Annual Database Issue
  Annual Web Server Issue

Journals
  Bioinformatics: Oxford Journals
  BMC Bioinformatics: BioMed Central
  Database: Oxford Journals *new in 2009*

Articles on “genetic databases”
  PubMed: 21,851 results
  MeSH: 16,398 results
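The two counts above come from two kinds of PubMed query: a free-text keyword search and a search restricted to a MeSH heading. A minimal sketch of how such counts could be requested through NCBI's E-utilities ESearch service follows. The helper name `pubmed_count_url` is hypothetical, and the MeSH heading "Databases, Genetic" is an assumption about which heading the slide used. The code only builds the URLs; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_count_url(term: str) -> str:
    """Build an ESearch URL that asks PubMed only for the number of hits."""
    params = {"db": "pubmed", "term": term, "rettype": "count"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Free-text keyword search.
keyword_url = pubmed_count_url("genetic databases")
# Search restricted to a MeSH heading (assumed heading, see lead-in).
mesh_url = pubmed_count_url('"Databases, Genetic"[MeSH Terms]')

print(keyword_url)
print(mesh_url)
```

Counts returned today will differ from the 2010 figures on the slide, since PubMed keeps growing.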

HSLS Molecular Biology Information Service

Workshops

Website

Software Licensing

Bioinformatics Consultations

HSLS OBRC (Online Bioinformatics Resources Collection)

HSLS OBRC in Science

2441 links to databases and software

~3000 hits/day

search.HSLS.MolBio

Integrated search system
  Databases & Software
  Articles on Databases & Software
  Genes/Proteins
  Pathways
  Protocols
  Videos
  Recommended Articles

Tabbed browsing
Clustered search results

Hands-on exercises

Locate databases on: natural antisense, UTR, copy number variation

Retrieve gene information for: your favorite gene, BRCA1, STAT1

Find a suitable protocol for: methylation PCR, in situ hybridization, primer design

Identify videos on: protein structure prediction, human genome project

Genome Biology

From Cell to Gene

Human Genome Project Video

Genome Biology Time Line

1976: RNA Bacteriophage MS2
1995: Haemophilus influenzae
2001: Human Genome Draft Sequence
2003: Published Complete Human Reference Genome
2007: Diploid Genome Sequence of an Individual Human
2008: Jim Watson Genome
2010: Published Complete Genomes: 1191 organisms

Genome Resources

NCBI Genome Resources
  Genome Project
  Genome: 6108 species

Genomes OnLine Database (GOLD)

JGI: Integrated Microbial Genomes

NCBI Genome Resources

Practice Question

Query: Check the status of genome sequencing for an organism, such as rabbit.

Answer:
  Pick an organism or metagenome project name.
  Search the Genome Project database. To get the most precise results, specify the organism field when searching with an organism name, for example: human[orgn].
  Click on the desired Genome Project if there is more than one result.
  The Genome Project summary page will provide information on available projects and sequencing status.
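The organism-field tip can also be applied when querying Entrez programmatically. The sketch below composes a field-restricted term such as rabbit[orgn] and places it in an E-utilities ESearch URL. The function names are hypothetical, and the database name "bioproject" is an assumption: it is the name NCBI later gave to the Genome Project database, so it was not the name in use when these slides were written.

```python
from urllib.parse import urlencode


def organism_term(name: str) -> str:
    """Restrict an Entrez query to the organism field, as in human[orgn]."""
    return f"{name.strip()}[orgn]"


def esearch_url(db: str, term: str) -> str:
    """Build an E-utilities ESearch URL for the given database and term."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return f"{base}?{urlencode({'db': db, 'term': term})}"


term = organism_term("rabbit")
# "bioproject" is assumed here as the successor of the Genome Project database.
url = esearch_url("bioproject", term)

print(term)
print(url)
```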

NCBI Genome Project

A collection of complete and in-progress large-scale sequencing, assembly, annotation, and mapping projects for cellular organisms. The database is organized into organism-specific overviews that function as portals for browsing and retrieving projects pertaining to each organism.

Click: Rabbit

NCBI Genome Project: Rabbit Genome

NCBI Entrez Genome:

Genomes Online Database (GOLD) http://genomesonline.org/index2.htm

Global resource for comprehensive access to information regarding complete and ongoing genome projects, metagenomes, and metadata.

“Genome sequencing has come of age, and genomics will become central to microbiology's future. It may appear at the moment that the human genome is the main focus and primary goal of genome sequencing, but do not be deceived. The real justification, in the long run, is microbial genomics.”

Carl Woese, 1998

Genome Browsers

Genome Browsers: What are they?

Genome Browsers enable researchers to visualize and browse entire genomes with annotated data including gene prediction and structure, proteins, expression, regulation, variation, comparative analysis, etc.

Genome Browsers

The Big Three
  NCBI MapViewer
  UCSC Genome Browser
  EBI Ensembl

Generic Genome Browser (GBrowse)
JBrowse (Ajax-based, like Google Maps)

Display: Vertical

Display: Horizontal

UCSC Genome Browser

Navigating the Human Genome

Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438

UCSC Genome Browser

UCSC Genome Browser: navigating a genomic region

Set up basic browser parameters
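The basic parameters for this exercise are an assembly and a position. As a rough sketch, the chromosome 7 region used in these exercises can be written as a UCSC position string and placed in an hgTracks link. The function names are hypothetical, and the assembly name "hg19" is an assumption, since the slides do not say which human assembly the coordinates refer to.

```python
from urllib.parse import urlencode


def ucsc_position(chrom: str, start: int, end: int) -> str:
    """Format a region as a UCSC position string: chrom:start-end."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return f"{chrom}:{start}-{end}"


def region_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive interval."""
    return end - start + 1


def hgtracks_url(db: str, position: str) -> str:
    """Build a link that opens the UCSC Genome Browser at a position."""
    base = "https://genome.ucsc.edu/cgi-bin/hgTracks"
    return f"{base}?{urlencode({'db': db, 'position': position})}"


pos = ucsc_position("chr7", 54_318_043, 55_974_438)
length = region_length(54_318_043, 55_974_438)
# "hg19" is an assumed assembly name; substitute the assembly you are using.
url = hgtracks_url("hg19", pos)

print(pos)     # chr7:54318043-55974438
print(length)  # 1656396
print(url)
```

The same position string, with or without thousands separators, can be typed directly into the browser's position box.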

Start fresh

Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438

What genes are present in this region?

NCBI sequence databases

RefSeq: based on GenBank records; non-redundant, expert-verified databases of reference sequences

GenBank: archival database of nucleotide sequences from >160,000 organisms

International Nucleotide Sequence Database Collaboration

Primary vs. Derivative Databases

RefSeq Scope & Accessions

Genomic DNA
  NC_123456 - complete genome, chromosome, plasmid
  NG_123456 - genomic region
  NT_123456 - genomic contig

mRNA
  NM_123456

Protein
  NP_123456

more about RefSeq scope and accessions...
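The accession prefixes listed above can be used to tell at a glance what kind of molecule a RefSeq record describes. A minimal sketch, covering only the five prefixes shown on the slide (RefSeq defines others, which this hypothetical helper reports as unknown):

```python
# Prefix meanings copied from the slide; other RefSeq prefixes are not covered.
REFSEQ_PREFIXES = {
    "NC": "genomic DNA: complete genome, chromosome, plasmid",
    "NG": "genomic DNA: genomic region",
    "NT": "genomic DNA: genomic contig",
    "NM": "mRNA",
    "NP": "protein",
}


def classify_refseq(accession: str) -> str:
    """Describe a RefSeq accession by the two letters before the underscore."""
    prefix, sep, _ = accession.strip().upper().partition("_")
    if not sep or prefix not in REFSEQ_PREFIXES:
        return "unknown (not one of the prefixes listed on the slide)"
    return REFSEQ_PREFIXES[prefix]


# NC_, NM_ and NP_ use the slide's placeholder numbers; XM_ is an unlisted prefix.
for acc in ("NC_123456", "NM_123456", "NP_123456", "XM_123456"):
    print(acc, "->", classify_refseq(acc))
```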

Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438

Zoom in and display only the EGFR gene

Select the gene region from the “Scale” track to zoom in

Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438

Display all single nucleotide polymorphisms (SNPs) present in this gene

Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438

Retrieve the nucleotide sequence of this genomic region, showing all exons in blue and SNPs in red, bold, and underlined.

UCSC Genome Browser: navigating a genomic region: sequence view

Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438

Look in the probable promoter region and see if there’s anything interesting…

Zoom out

Browse the region of human chromosome 7 between bp 54,318,043 and 55,974,438

What transcription factors bind in this region?

Discovery Tool…

NCBI MapViewer

NCBI MapViewer How To: View/download features around an object or between two objects on a chromosome

Starting with... CHROMOSOMAL COORDINATES

Begin on the Map Viewer home page. Click the "R" icon under Tools for the desired organism and build.

Select the chromosome, enter the coordinates in the From and To boxes, and click Go. Use either exact coordinates, e.g., 61551076, or values such as 61M or 61551K.

If necessary, use the Maps & Options dialog box to change displayed maps; the maps and region displayed determine the data available.
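The coordinate shorthand mentioned above (61M, 61551K) is easy to expand to exact base-pair positions. A small sketch, assuming K means thousand and M means million base pairs; the function name is hypothetical and this is not Map Viewer's own parsing code.

```python
def parse_coordinate(value: str) -> int:
    """Convert '61551076', '61551K' or '61M' to a base-pair position."""
    text = value.strip().replace(",", "").upper()
    multipliers = {"K": 1_000, "M": 1_000_000}
    if text and text[-1] in multipliers:
        # Whole-number prefix followed by a K or M suffix.
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)


for raw in ("61551076", "61551K", "61M"):
    print(raw, "->", parse_coordinate(raw))
```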

Entrez Gene

Common Questions

What is its function?

What are its neighboring genes?

What is its genomic sequence?

How many splice variants are there?

What is its intron-exon architecture?

What diseases are associated with it?

In which tissues is it expressed?

How can I get its cDNA clone?

SNP

Genomic Sequence

Expression Profile

Interacting Partners

3D Structure

mRNA Sequence

Chromosomal Localization

Disease

Amino acid Sequence

Homologous Sequences

NCBI: Entrez Gene

Find:
  gene symbols and aliases
  sequences: genomic, mRNA, protein
  intron-exon architecture
  genomic context: neighboring and antisense genes
  interacting partners
  associated gene ontology terms: function, cellular component and biological process

Entrez Gene

a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer

each record represents a single gene from a given organism

Entrez Gene Sequences

mRNA Seq

Protein Seq

Genomic Seq

Entrez Gene Links

Gene Ontology (GO)

Controlled vocabulary tagging

Function
Biological Processes
Cellular Component

Entrez Gene: Gene Table

Introns/Exons

Try it!

Find mRNA sequence for your gene of interest

Find mRNA Sequence for Reelin Gene

FASTA vs GenBank records
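A FASTA record is the simpler of the two formats: one header line beginning with ">" followed by lines of sequence, whereas a GenBank flat file also carries annotation such as the LOCUS line, the FEATURES table, and the ORIGIN sequence block. The sketch below reads FASTA text into a dictionary. The example record is made up for illustration and is not a real sequence, and `parse_fasta` is a hypothetical helper rather than part of any NCBI tool.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each FASTA header (without '>') to its concatenated sequence."""
    records: dict[str, str] = {}
    header = None
    chunks: list[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Made-up record, for illustration only.
example = """>demo_mRNA made-up record for illustration
ATGGCGTACC
GGTTAA
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq), "bases")
```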

UCSC Genome Browser: find a gene in the genome

Bioinformatics Databases & Software Providers

NCBI: Home page, Site map, Resource Guide

EBI: Home page, Databases, Software

UniProt

world's most comprehensive catalog of information on proteins

a central repository of protein sequence and function created by joining the information contained in Swiss-Prot, TrEMBL, and PIR
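The Swiss-Prot and TrEMBL origins remain visible in UniProtKB FASTA headers, which begin with "sp" for reviewed Swiss-Prot entries and "tr" for unreviewed TrEMBL entries. A minimal sketch of reading such a header follows. The helper name is hypothetical, and the example header for human p53 (accession P04637) is abridged for illustration.

```python
import re

# >db|accession|entry_name description...
HEADER = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)\s+(.*)$")
SOURCES = {
    "sp": "UniProtKB/Swiss-Prot (reviewed)",
    "tr": "UniProtKB/TrEMBL (unreviewed)",
}


def parse_uniprot_header(line: str) -> dict[str, str]:
    """Split a UniProtKB FASTA header into its main parts."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError("not a UniProtKB FASTA header")
    db, accession, entry_name, description = match.groups()
    return {
        "source": SOURCES[db],
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
    }


# Abridged header, for illustration.
info = parse_uniprot_header(
    ">sp|P04637|P53_HUMAN Cellular tumor antigen p53 OS=Homo sapiens OX=9606 GN=TP53"
)
for key, value in info.items():
    print(f"{key}: {value}")
```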

Thank you! Any questions?

Carrie Iwema: iwema@pitt.edu, 412-383-6887
Ansuman Chattopadhyay: ansuman@pitt.edu, 412-648-1297
