Beyond PubMed and BLAST: Exploring NCBI tools and databases Kate Bronstad David Flynn Alumni Medical Library




Alumni Medical Library

• Location
− 12th Floor, Instructional Bldg
− www.medlib.bu.edu

• Services
− Electronic resources: full text access through PubMed, Google Scholar, Web of Science
− Reference: drop in or by reservation
− Instruction: request class sessions or creation of web tutorial
− Learning resource center: lab space, hands-on instruction


NCBI

• National Center for Biotechnology Information

• Built on Entrez System
• Original database was Nucleotide
• PubMed built upon this original structure
• PubMed, GENE, other molecular databases interconnected
• Gene discovery, related data options in PubMed
• MyNCBI works with multiple databases
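The links between Entrez databases can also be followed programmatically through NCBI's E-utilities. As a rough sketch (not part of the original slides), the snippet below builds an ELink request URL asking for PubMed records linked to a Gene record. The helper name `build_elink_url` and the example Gene ID 7157 (human TP53) are illustrative assumptions. The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_elink_url(dbfrom: str, db: str, uid: str) -> str:
    """Build an ELink URL for records in `db` linked to record `uid` in `dbfrom`.

    Hypothetical helper for illustration; it only assembles the query string.
    """
    params = {"dbfrom": dbfrom, "db": db, "id": uid}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


# Example: PubMed citations linked to Gene ID 7157 (assumed example record).
elink_url = build_elink_url("gene", "pubmed", "7157")
print(elink_url)
```

Opening the printed URL in a browser returns an XML list of linked PubMed IDs. For anything beyond occasional use, NCBI's usage guidelines for E-utilities apply.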


GENE

• Gives sequence, expression, information about protein structure and function.

• Doesn't list all known and predicted genes
• Focuses on completely sequenced genomes or ones where research communities are actively contributing genetic information
• Information from RefSeq and collaborating model organism databases
• Mix of curated and automatically updated information
• Pulls in, links out to resources outside of NCBI
• 4.6 million records for 5,588 taxa


GENE Record

• Summary
− Official full name, gene type, lineage, summary, AKA (also known as)
• Genomic regions, transcripts: structure, exon-intron boundaries
− Gene table for fuller display
• Bibliography: GeneRIF (Gene Reference into Function)
− Summary of gene functions with specific references to related PubMed articles about the function of the gene or its proteins; put together by people at NCBI
− Not comprehensive, but will give you the most relevant papers regarding function
− Authors can contact NCBI to submit their citations


RefSeq

• Reference Sequences
− Nucleotide sequences and protein translation
− Curated by NCBI or NCBI-approved programs

• Difference between GenBank and RefSeq
− GenBank has raw data and duplicated records
− Metadata in GenBank can be incomplete
− RefSeq is annotated, curated and non-redundant
− NCBI takes best sequences from GenBank and curates for RefSeq records


RefSeq Record Numbers

mRNAs and Proteins
NM_123456    Curated mRNA
NP_123456    Curated protein
NR_123456    Curated non-coding RNA
XM_123456    Predicted mRNA
XP_123456    Predicted protein
XR_123456    Predicted non-coding RNA

Gene Records
NG_123456    Reference genomic sequence

Chromosome
NC_123455    Microbial replicons, organelle genomes, human chromosomes
AC_123455    Alternate assemblies

Assemblies
NT_123456    Contig
NW_123456    WGS supercontig
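Because the two-letter prefix encodes the record type, an accession can be classified with a simple lookup. The following is a minimal sketch covering only the prefixes listed on this slide; the function name `describe_refseq` and the fallback message are hypothetical, and RefSeq uses additional prefixes that this table does not include.

```python
import re

# Prefix meanings as listed on the slide; not an exhaustive list of RefSeq prefixes.
REFSEQ_PREFIXES = {
    "NM": "Curated mRNA",
    "NP": "Curated protein",
    "NR": "Curated non-coding RNA",
    "XM": "Predicted mRNA",
    "XP": "Predicted protein",
    "XR": "Predicted non-coding RNA",
    "NG": "Reference genomic sequence",
    "NC": "Microbial replicons, organelle genomes, human chromosomes",
    "AC": "Alternate assemblies",
    "NT": "Contig",
    "NW": "WGS supercontig",
}

# Two uppercase letters, an underscore, digits, and an optional ".version" suffix.
ACCESSION_RE = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def describe_refseq(accession: str) -> str:
    """Return the slide's description for a RefSeq-style accession."""
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        return "Not recognized by this sketch"
    return REFSEQ_PREFIXES.get(match.group(1), "Not recognized by this sketch")


for acc in ("NM_123456", "XP_123456.2", "U12345"):
    print(acc, "->", describe_refseq(acc))
```

The last example has no underscore-separated prefix, so the sketch reports it as unrecognized rather than guessing its source.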


OMIM

• Online Mendelian Inheritance in Man
• Previously in print, 10 volumes, updated every 2 years
• Contains all the known genes in humans
• Gives referenced explanations of cloning, allelic variations, inheritance, mapping, molecular genetics
• Links to clinical and testing information
• OMIA (Online Mendelian Inheritance in Animals): a separate database for information in animals


Databases for Evidence

• GEO Profiles: public microarray data repository
- Archives and freely distributes microarray, next-generation sequencing, and other high-throughput functional genomic data
- Data are submitted by researchers
- Offers data storage, web-based interfaces and applications to query and download content

• Evidence Viewer: graphical display of evidence supporting a gene model


Genome

• Sequence and map data from the whole genomes of over 1000 organisms
- Represent organisms that are completely sequenced and those that are in progress
• Graphical overviews of complete genomes/chromosomes
• Specialized genome BLAST search to see alignments in context of genome
• Good for microbial genomes


HomoloGene

• May want to use instead of BLAST if looking for a model organism with the same function or if looking at an evolutionary comparison
• Allows downloads of genomic information
- Can capture the regulatory region by including bases upstream or downstream
• Multiple and pairwise alignment
• Protein alignment scores
- Substitution rates, synonymous vs. nonsynonymous, conservative vs. radical
• Polymorphisms in GeneView dbSNP link
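Capturing a regulatory region by including flanking bases amounts to widening the gene's coordinates, keeping in mind that which side counts as upstream depends on the strand. Here is a minimal sketch of that arithmetic, assuming 1-based inclusive coordinates; the function name, parameters, and example numbers are hypothetical and are not taken from HomoloGene.

```python
def region_with_flanks(start, end, strand, upstream=0, downstream=0, chrom_length=None):
    """Widen a 1-based inclusive gene interval by flanking bases.

    On the "+" strand, upstream bases lie before `start`; on the "-" strand,
    the gene reads in the opposite direction, so upstream bases lie after `end`.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if start > end:
        raise ValueError("start must not exceed end")

    if strand == "+":
        new_start, new_end = start - upstream, end + downstream
    else:
        new_start, new_end = start - downstream, end + upstream

    # Clamp to the sequence boundaries.
    new_start = max(1, new_start)
    if chrom_length is not None:
        new_end = min(chrom_length, new_end)
    return new_start, new_end


# Hypothetical gene at 10,000-12,000 with 2 kb upstream and 500 bp downstream.
print(region_with_flanks(10_000, 12_000, "+", upstream=2_000, downstream=500))
print(region_with_flanks(10_000, 12_000, "-", upstream=2_000, downstream=500))
```

The same gene coordinates give different download windows on the two strands, which is an easy detail to miss when requesting a promoter region by hand.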


Structure and Models

• Structure, MMDB (Molecular Modeling Database)
- Access from Protein link, Related Structure

• Cn3D: application for viewing a structure at different angles and highlighting sequence in the structure

• VAST (Vector Alignment Search Tool) searches by geometric criteria


BLink

• BLAST Link
- Pre-run BLAST results
- NCBI runs weekly searches for every new protein sequence

• Can use instead of running BLAST search
- More information than in default BLAST: taxonomy report, view multiple alignments, search data against different


Links to Outside Databases

• MGI
• Ensembl
• KEGG: Kyoto Encyclopedia of Genes and Genomes
- Integrated databases
- Pathway, disease, drug
- Good for quick pathway and protein graphics
• UCSC Genome Browser
- Visualize tracks to compare information like gene predictions, ESTs, conserved regions
- BLAT (BLAST-like alignment tool): quicker but not as sensitive as BLAST


Gene Information from GO

• Gene expression information from Gene Ontology (GO)
- Lists what has been assigned to the gene in:
  Molecular Function
  Biological Processes
  Cellular Component
• Level of evidence and references linked when available
• Links into AmiGO browser for more ontology or evidence information
• Can search GENE for GO information by placing the [GO] suffix at the end of the search
  Ex: “vasodilation [GO]”
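The same kind of search can be expressed as an E-utilities ESearch request against the Gene database. This is a small sketch that only builds the request URL and makes no network call. The helper name `build_gene_go_search_url` and the `retmax` default are assumptions for illustration, and the `[GO]` field tag is taken from the slide's example.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_gene_go_search_url(go_term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that searches the Gene database by GO term.

    Hypothetical helper; appends the [GO] field tag shown on the slide.
    """
    params = {"db": "gene", "term": f"{go_term}[GO]", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


go_url = build_gene_go_search_url("vasodilation")
print(go_url)

# Decode the query string again to confirm what will be sent.
print(parse_qs(urlparse(go_url).query)["term"])
```

`urlencode` percent-encodes the square brackets, so the tag survives intact when the URL is pasted into a browser or passed to an HTTP client.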


BU Resources

• Biostatistics

- Dr. Mayetri Gupta: created statistical software for discovering transcription factor binding sites (motifs) and regulatory modules, gene regulatory networks, and phylogenetic inference.

- Dr. Paola Sebastiani: created software for network modeling called Bayesware Discoverer, also CAGED, BAGED for analysis of gene expression data.


Library Support

• Contact the library with any suggestions or recommendations that we can list or promote for the BU community

• Software and datasets can be archived in BU’s Digital Common

• If there are resources we don’t have, we may be able to procure them for you.

• Hands-on BLAST workshop offered.