BINF 6211 Design and Implementation of …bioinformatics.gmu.edu/weller/BINF8211/Course...
Winter 2008 UNCC CS/Bioinformatics Instructors: Weller, Carr
BINF 6211
Design and Implementation of Bioinformatics Databases
Lecture 22
April 21st, 2008
Dr. Jennifer W. Weller
Dr. Andrew Carr
Agenda
• Genetic Databases
  – OMIM
  – dbSNP
Genetic Information Databases
• From the phenotype perspective
  – A mutation may be inferred from the way it tracks in crosses (inheritance)
  – Given enough crosses, the relative location may be inferred
    • Recombination frequency with respect to linked phenotypes
  – The physical map location provides an absolute position
  – Mutant alleles have the sequence changes leading to the range of phenotypes associated with the disease
  – The frame of reference is within the set of samples having the phenotype
• From the physical location perspective
  – Not all sequence changes (alleles) lead to different phenotypes
    • The changes may be synonymous
    • The changes may lead to subtle, multi-genic effects
    • Variants are defined with respect to the gene/chromosomal location
    • The frame of reference is the genome sequence representative of a population.
Online Mendelian Inheritance in Man
• A catalog of human genes and genetic disorders
  – Expert curation: Dr. Victor A. McKusick (JHU) and colleagues
  – Developed for the World Wide Web and hosted by NCBI.
• Contents: textual information and references, a federated database
  – Links to MEDLINE
  – Links to sequence records in Entrez
  – Links to additional related resources at NCBI and elsewhere, as curators deem relevant.
  – http://www.ncbi.nlm.nih.gov/sites/entrez?db=omim
• List is not comprehensive – contents must fit the categories defined.
• Two strategies to get subsets of interest
  – Via phenotype
    • Use the Limits page to retrieve records that have the prefixes (+, #, %, or no symbol) by checking the box in front of each, then click GO
  – Via clinical synopsis
    • Don’t select the above boxes, then click GO
    • Not all disease-related records have such a synopsis
  – Use the History page to combine the two searches
http://www.ncbi.nlm.nih.gov/Omim/mimstats.html
Y-linked is 4xxxxx OID
Record Types
• "Mendelian inheritance" refers to the transmission of inherited characters
  – Via reproductive transmission of genes.
• Character types have keys in certain ranges, as below
  – (100000- 200000- ) Autosomal loci or phenotypes from before May 15, 1994.
  – (300000- ) X-linked loci or phenotypes
  – (400000- ) Y-linked loci or phenotypes
  – (500000- ) Mitochondrial loci or phenotypes
  – (600000- ) Autosomal loci or phenotypes from after May 15, 1994
• Allelic variants have the MIM number of the parent entry, a period, then a unique 4-digit number.
  – Example: Factor IX (hemophilia B) locus is 306900
    • Alleles are 306900.0001 to 306900.0101.
  – The beta-globin locus (HBB) is 141900
    • Sickle hemoglobin allele is 141900.0243.
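The numbering rules above can be turned into a small lookup. The following is a minimal sketch, not part of OMIM itself: the function name `parse_mim` and the returned dictionary layout are assumptions made for illustration, and the category labels simply restate the ranges listed on the slide.

```python
import re

# Leading digit of a six-digit MIM number -> category, restating the slide's ranges.
MIM_CATEGORIES = {
    "1": "Autosomal (before May 15, 1994)",
    "2": "Autosomal (before May 15, 1994)",
    "3": "X-linked",
    "4": "Y-linked",
    "5": "Mitochondrial",
    "6": "Autosomal (after May 15, 1994)",
}

# Six digits for the parent entry, optionally a period and a 4-digit allele number.
MIM_PATTERN = re.compile(r"^(\d{6})(?:\.(\d{4}))?$")


def parse_mim(identifier):
    """Split a MIM identifier into parent entry, allele suffix, and category.

    Hypothetical helper for illustration only.
    """
    match = MIM_PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"not a MIM number: {identifier!r}")
    parent, allele = match.groups()
    category = MIM_CATEGORIES.get(parent[0])
    if category is None:
        raise ValueError(f"leading digit outside the listed ranges: {identifier!r}")
    return {"parent": parent, "allele": allele, "category": category}


if __name__ == "__main__":
    # The two allelic variants named on the slide.
    print(parse_mim("306900.0001"))
    print(parse_mim("141900.0243"))
```

An identifier without a suffix, such as 141900, comes back with `allele` set to `None`, which separates parent entries from allelic variants.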
MIM special characters
• Symbols preceding an entry number mean:
  – An asterisk (*) indicates a gene of known sequence.
  – A number symbol (#) indicates that it is a descriptive entry
    • This is usually of a phenotype, and will be explained in the first paragraph; discussion of related genes is included in the Gene entry
    • This does not represent a unique locus.
  – A plus sign (+) indicates a description of a gene of known sequence and a phenotype.
  – A percent sign (%) indicates a confirmed Mendelian phenotype or phenotypic locus for which the molecular basis is not known.
  – No symbol indicates a description of a phenotype where the Mendelian basis is not proven, or there may be phenotypic crossover
  – A caret symbol (^) indicates the entry no longer exists.
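The symbol list above maps naturally onto a dictionary. This is a rough sketch under stated assumptions: the helper names `split_prefix` and `describe` are invented here, the descriptions paraphrase the slide, and the entry numbers in the usage lines are placeholders rather than real OMIM records.

```python
# Prefix symbol -> meaning, paraphrased from the slide. "" stands for "no symbol".
MIM_PREFIXES = {
    "*": "gene of known sequence",
    "#": "descriptive entry, usually a phenotype; not a unique locus",
    "+": "gene of known sequence and a phenotype",
    "%": "confirmed Mendelian phenotype or locus, molecular basis unknown",
    "^": "entry no longer exists",
    "": "phenotype whose Mendelian basis is not proven",
}


def split_prefix(entry):
    """Return (symbol, number) for an entry such as '#123456' (hypothetical helper)."""
    entry = entry.strip()
    if entry and entry[0] in "*#+%^":
        return entry[0], entry[1:].strip()
    return "", entry


def describe(entry):
    """Return (number, meaning of its prefix symbol)."""
    symbol, number = split_prefix(entry)
    return number, MIM_PREFIXES[symbol]


if __name__ == "__main__":
    # Placeholder entry numbers, chosen only to exercise each branch.
    for example in ["*123456", "#123456", "%123456", "123456", "^123456"]:
        print(describe(example))
```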
Other limitations
• There is no aggregation function to track how many inherited diseases have a known sequence
• There is no aggregation function for inherited diseases with a known phenotype but no corresponding sequence (those with % prefix)
  – Entrez Gene will let you retrieve human genes for which there is no sequence data or for which only a phenotype is known, BUT there is no keyword here for disease genes
    • human[orgn] NOT gene_nucleotide[filter]
    • human[orgn] AND phenotype_only[Properties]
    • Note: these lists will overlap
    • To get those with a phenotype and an OMIM record use human[orgn] AND phenotype_only[Properties] AND gene_omim[filter]
• While there is an emphasis on inheritance and cytogenetics, there is very little information on chromosomal aberrations (these are often NOT inherited).
  – For this the genome-wide map of chromosomal break points elsewhere is a better source (although this does not include monoploid or polyploid examples)
• An OMIM record may link to a gene that is not actually the primary locus, if it was included in the discussion by the authors of a paper. These are the links at the top and bottom of the record, while those in the side bar are limited to the specific locus.
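The Entrez Gene queries listed above can also be submitted programmatically. The sketch below only builds request URLs for the NCBI E-utilities `esearch` endpoint and does not contact the server. The dictionary keys and the function name `esearch_url` are assumptions made here, and whether the 2008-era filter names are still accepted by Entrez has not been checked.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query strings copied from the slide; the dictionary keys are made-up labels.
QUERIES = {
    "no_sequence": "human[orgn] NOT gene_nucleotide[filter]",
    "phenotype_only": "human[orgn] AND phenotype_only[Properties]",
    "phenotype_with_omim": (
        "human[orgn] AND phenotype_only[Properties] AND gene_omim[filter]"
    ),
}


def esearch_url(term, db="gene", retmax=20):
    """Build an esearch URL for the given Entrez query (no network access)."""
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


if __name__ == "__main__":
    for label, term in QUERIES.items():
        print(label, esearch_url(term))
```

Because the first two result lists overlap, counting "disease genes" would still require combining the returned ID sets on the client side, which is the missing aggregation the slide points out.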
Gene Map
• This section has information on the cytogenetic locations of genes – a single tabular file, with chromosomes in order, each from p-tel to q-tel
  – Limited to those with demonstrated cytogenetic location
    • if only mapped to a chromosome, given at the end of the list for that chromosome
  – The Web version is searchable by gene symbol, chromosomal location or keyword
  – There is an associated file called the GeneMapKey to describe the column headings in the file, and special characters used
Morbid Map
• Alphabetical list of diseases described in OMIM, with location as known
  – Searchable by gene name, location and keyword
  – There is a graphical view of this data, which can be visualized in the Entrez Map Viewer
    • You need to select the correct display settings in order to interpret the input correctly
• Special symbols
  – [ ] information for molecular aberrations that don’t lead to something classified as a disease
  – { } indicates a variant that leads to pathogen susceptibility
  – ? means the mapping status is unresolved
• After the name of the disorder there is a (number) that indicates the method of mapping
  • With respect to the WT gene (1)
  • With respect to the mutant allele (2)
  • Both (3)
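The special symbols and mapping-method numbers above can be decoded with a short parser. This is a minimal sketch that assumes a simplified entry layout (an optional leading `?`, an optional enclosing `[ ]` or `{ }`, and a trailing `(number)`); the real Morbid Map file may differ, and the function name, field names, and sample strings are invented for illustration.

```python
import re

# Mapping-method codes as described on the slide.
MAPPING_METHODS = {
    1: "with respect to the wild-type gene",
    2: "with respect to the mutant allele",
    3: "both",
}

# Disorder text followed by a single digit in parentheses at the end of the string.
ENTRY = re.compile(r"^(?P<name>.*?)\s*\((?P<method>\d)\)\s*$")


def parse_disorder(text):
    """Decode one disorder string under the simplified layout assumed above."""
    text = text.strip()
    method = None
    match = ENTRY.match(text)
    if match:
        text = match.group("name")
        method = int(match.group("method"))
    # A leading "?" marks an unresolved mapping status.
    unresolved = text.startswith("?")
    if unresolved:
        text = text[1:].strip()
    if text.startswith("[") and text.endswith("]"):
        kind, text = "non-disease", text[1:-1]
    elif text.startswith("{") and text.endswith("}"):
        kind, text = "susceptibility", text[1:-1]
    else:
        kind = "disorder"
    return {
        "name": text.strip(),
        "kind": kind,
        "unresolved": unresolved,
        "method": MAPPING_METHODS.get(method),
    }


if __name__ == "__main__":
    # Made-up sample strings, not taken from the Morbid Map.
    print(parse_disorder("{Example susceptibility} (2)"))
    print(parse_disorder("?[Example trait] (1)"))
```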
Local hosting
• OMIM is not relational but some of the information is tracked in a relational system:
  – MIM number, create date, update dates
• It uses the ASN.1 format
  – You can download selected files (matrices of commonly requested data – think data hypercubes):
    • The complete text of OMIM
    • The OMIM Gene table, either from the ftp site or from the directory of the NCBI Web site as an alphabetical list of gene symbols and their MIM numbers.
    • The OMIM Gene Map key and columns in the GeneMap file
    • The OMIM Morbid Map
OMIM Case Study
• I want to design an array that will capture the mutations known to be associated with the Collagen I-A1 gene.
• I want to know what other genes I might need to assay for patients with this phenotype
• I want to know what patient data I should collect to do a good job on clinical signs and symptoms.
Nearby genes on the chromosome
Go back to the Collagen DB
Lists of specific types of mutations, but not in a text file
For SNP-specific assays
Circular
NIA database, but no data on bone
Pathway data rather than chromosomal location data
Signs and Symptoms
SNPs
• SNPs are single nucleotide polymorphisms, f = 1:300 nt (roughly one per 300 nucleotides)
  – Sequence variants that do not change the number of bases in a gene
    • Can still cause early truncation of a gene product
    • Most are biallelic
• Several large-scale international and commercial projects have undertaken to assess the level of polymorphism in the human genome, in various populations
• Why useful: the collection of such markers is unique for an individual
  – Mapping
  – Defining population structure
  – Performing functional studies
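The point that a SNP keeps the base count unchanged yet can truncate the gene product is easy to demonstrate. The sketch below uses the standard genetic code and a made-up nine-base coding sequence; the function names and the example sequence are illustrative assumptions, not data from the lecture.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def substitute(dna, position, base):
    """Return a copy of dna with one base replaced (0-based position)."""
    return dna[:position] + base + dna[position + 1:]


if __name__ == "__main__":
    reference = "ATGTGGAAA"                  # made-up sequence: Met-Trp-Lys
    variant = substitute(reference, 5, "A")  # TGG (Trp) becomes TGA (stop)
    print(len(reference) == len(variant))    # same number of bases
    print(translate(reference), translate(variant))
```

The substitution leaves the sequence length at nine bases, but translation of the variant stops after the first codon.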
dbSNP
• dbSNP is a public database of single nucleotide polymorphisms (SNPs) and a bit more
  – Any species is allowed and from any part of a particular genome.
  – SNPs linked to known genes or expressed DNA segments (ESTs) are most useful.
    • Thus SNPs from these regions are prioritized for integration with other NCBI databases/views/tools.
• dbSNP includes several types of simple genetic polymorphisms
  – single-base nucleotide substitutions
  – small-scale multi-base deletions or insertions
  – retroposable element insertions
  – microsatellite repeat variation.
• Experimental information is also included:
  – the sequence information around the polymorphism
  – specific experimental conditions necessary to perform an experiment (such as PCR of the locus)
  – frequency information by population or individual genotype.
Integration map
dbSNP Schemas
• dbSNP Schema
  – > 100 tables and many relationships among tables.
  – No single ER diagram with all dbSNP tables is available
• Sub-schemas are available in which tables are grouped according to subject areas:
  – Batch Submission
  – Submitted SNP
  – Submitted SNP, population frequency and individual genotype
  – Frequency calculation by submitted SNP and population
  – SNP Mapping and Annotation
• Version control: b125_SNPContigLoc_b34_3 is the mapping data for b125 SNPs that are mapped to NCBI genome build 34 version 3.
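The versioned table name above packs four pieces of information into one string. As a minimal sketch, assuming every versioned table follows the same `b<SNP build>_<table>_b<genome build>_<version>` pattern as the single example on the slide, a regular expression can pull the parts back out. The function and field names are invented for illustration.

```python
import re

# Pattern inferred from the one example on the slide; other tables may differ.
TABLE_NAME = re.compile(
    r"^b(?P<snp_build>\d+)_(?P<table>[A-Za-z]\w*?)"
    r"_b(?P<genome_build>\d+)_(?P<version>\d+)$"
)


def parse_table_name(name):
    """Split a versioned dbSNP table name into its components."""
    match = TABLE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized table name: {name!r}")
    return {
        "snp_build": int(match.group("snp_build")),
        "table": match.group("table"),
        "genome_build": int(match.group("genome_build")),
        "version": int(match.group("version")),
    }


if __name__ == "__main__":
    print(parse_table_name("b125_SNPContigLoc_b34_3"))
```

Encoding both builds in the table name lets mapping data for several genome builds sit side by side in one database, at the cost of queries having to name the build explicitly.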
• SNPs are indexed by two different accession numbers
  – the HANDLE | ID / NCBI | ssASSAY ID forms, which refer to an individual submission record
  – the NCBI | rsSNP ID form, which refers to the abstracted SNP and all associated records.
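The relationship between the two accession types is one-to-many: several submission (ss) records can be abstracted into one reference (rs) SNP. The sketch below groups submissions under their reference SNP. The function names and all accession numbers are made up for illustration and do not refer to real dbSNP records.

```python
from collections import defaultdict


def accession_kind(accession):
    """Classify an accession as a submission (ss) or reference SNP (rs) ID."""
    acc = accession.strip().lower()
    if acc.startswith("ss") and acc[2:].isdigit():
        return "submission"
    if acc.startswith("rs") and acc[2:].isdigit():
        return "refsnp"
    raise ValueError(f"not an ss or rs accession: {accession!r}")


def group_by_refsnp(pairs):
    """Group (ss_id, rs_id) pairs into {rs_id: [ss_id, ...]}."""
    clusters = defaultdict(list)
    for ss_id, rs_id in pairs:
        if accession_kind(ss_id) != "submission" or accession_kind(rs_id) != "refsnp":
            raise ValueError(f"expected an (ss, rs) pair, got {(ss_id, rs_id)!r}")
        clusters[rs_id].append(ss_id)
    return dict(clusters)


if __name__ == "__main__":
    # Made-up accession numbers.
    pairs = [("ss101", "rs1"), ("ss102", "rs1"), ("ss200", "rs2")]
    print(group_by_refsnp(pairs))
```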
No data
Try next
What should the record have