Essential Bioinformatics and Biocomputing Module (Tutorial) Biological Databases Lecturer: Chen...


Page 1: Essential Bioinformatics and Biocomputing Module (Tutorial) Biological Databases Lecturer: Chen Yuzong Jan 2003 TAs: Cao Zhiwei Lee Teckkwong, Bernett.

Essential Bioinformatics and Biocomputing Module

(Tutorial)

Biological Databases Lecturer: Chen Yuzong

Jan 2003

TAs: Cao Zhiwei Lee Teckkwong, Bernett


Biological databases

Tutorial outline

Purpose: Extraction of relevant information for a particular macromolecule

Case study: HLA-DRB1 molecule


Biological databases

Case study: HLA-DR1 molecule

Characterization of HLA-DR1 beta chain

1. Gene sequence

2. Gene structure: Introns, exons, promoter region

3. Protein sequence

4. Protein Sequence features and domains

5. Protein 3-D Structure

6. Function

7. Variation – closest variants in humans and other species

8. Clinical features


Clarification of Genetic information


Biological databases Case study: HLA-DR1 molecule

HLA class II histocompatibility antigen, DR-1 beta chain

• Possible starting points

a) GenBank

b) SWISS-PROT

http://www.expasy.org (SWISS-PROT)

Search by keyword “HLA DR-1”

• Pick P13758


HLA-DRB1: Accession No. in Swiss-prot


HLA-DRB1: Swiss-prot entry


HLA-DRB1: Cross references


HLA-DRB1: Keywords
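The accession number, cross references and keywords shown on these slides sit on separate line types of the SWISS-PROT flat file, each marked by a two-letter code (AC, DE, DR, KW). The sketch below groups lines by that code. The sample record is a simplified placeholder, not a copy of the real entry: only the accession P13758, the description, and the X03069 and 1DLH identifiers come from this tutorial; the keywords and the trailing fields are invented for illustration, and the helper names are hypothetical.

```python
from collections import defaultdict

# Simplified, illustrative record. The KW values and the "-" fields are
# placeholders, NOT taken from the real SWISS-PROT entry.
SAMPLE_ENTRY = """\
AC   P13758;
DE   HLA class II histocompatibility antigen, DR-1 beta chain.
DR   EMBL; X03069; -.
DR   PDB; 1DLH; -.
KW   MHC II; Polymorphism.
//
"""


def group_by_line_code(text):
    """Group flat-file lines by their two-letter line code."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-entry marker
            break
        code, data = line[:2], line[5:].strip()
        if code.strip():
            fields[code].append(data)
    return dict(fields)


def cross_references(fields):
    """Return (database, identifier) pairs from the DR lines."""
    refs = []
    for line in fields.get("DR", []):
        parts = [p.strip() for p in line.rstrip(".").split(";")]
        refs.append((parts[0], parts[1]))
    return refs


def keywords(fields):
    """Return the keywords from the KW lines as a flat list."""
    words = []
    for line in fields.get("KW", []):
        words.extend(w.strip() for w in line.rstrip(".").split(";") if w.strip())
    return words


fields = group_by_line_code(SAMPLE_ENTRY)
print(fields["AC"])
print(cross_references(fields))
print(keywords(fields))
```

The DR pairs are what the "follow the link" steps later in the tutorial rely on: each one names another database and the identifier to look up there.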


HLA-DR1 molecule: Function

From SWISS-PROT

Protein of the major histocompatibility complex (MHC) class II, which is involved in the induction of a strong immune reaction. MHC II is involved in controlling the expression of surface structures on lymphocytes and macrophages.


HLA-DRB1: protein sequence features


HLA-DRB1: Protein sequence


Biological databases: Case study: HLA-DRB1 molecule

By now we have studied SWISS-PROT entries. From these we can extract the specific information – colored red:

1. Gene sequence

2. Gene structure: Introns, exons, promoter region

3. Protein sequence

4. Protein Sequence features and domains

5. Protein 3-D Structure

6. Function

7. Variation – closest variants in humans and other species

8. Clinical features


HLA-DRB1 molecule (GenBank, follow the link)

http://www3.ncbi.nlm.nih.gov/htbin-post/Entrez/query?db=n&form=6&dopt=g&uid=X03069


HLA-DR1 (Genbank): Gene structure (mRNA level)


HLA-DR1 (Genbank): Gene sequence (mRNA)


Map gene feature to RNA sequence

tagttctccc tgagtgagac ttgcctgctt ctctggcccc tggtcctgtc ctgttctcca 60
gcatggtgtg tctgaagctc cctggaggct cctgcatgac agcgctgaca gtgacactga 120
tggtgctgag ctccccactg gctttggctg gggacacccg accacgtttc ttgtggcagc 180
ttaagtttga atgtcatttc ttcaatggga cggagcgggt gcggttgctg gaaagatgca 240
tctataacca agaggagtcc gtgcgcttcg acagcgacgt gggggagtac cgggcggtga 300
cggagctggg gcggcctgat gccgagtact ggaacagcca gaaggacctc ctggagcaga 360
ggcgggccgc ggtggacacc tactgcagac acaactacgg ggttggtgag agcttcacag 420
tgcagcggcg agttgagcct aaggtgactg tgtatccttc aaagacccag cccctgcagc 480
accacaacct cctggtctgc tctgtgagtg gtttctatcc aggcagcatt gaagtcaggt 540
ggttccggaa cggccaggaa gagaaggctg gggtggtgtc cacaggcctg atccagaatg 600
gagattggac cttccagacc ctggtgatgc tggaaacagt tcctcggagt ggagaggttt 660
acacctgcca agtggagcac ccaagtgtga cgagccctct cacagtggaa tggagagcac 720
ggtctgaatc tgcacagagc aagatgctga gtggagtcgg gggcttcgtg ctgggcctgc 780
tcttccttgg ggccgggctg ttcatctact tcaggaatca gaaaggacac tctggacttc 840
agccaacagg attcctgagc tgaaatgcag atgaccacat tcaaggaaga accttctgtc 900
ccagctttgc agaatgaaaa gctttcctgc ttggcagtta ttcttccaca agagagggct 960
ttctcaggac ctggttgcta ctggttcggc aactgcagaa aatgtcctcc cttgtggctt 1020
cctcagctcc tgcccttggc ctgaagtccc agcattgatg acagcgcctc atcttcaact 1080
tttgtgctcc cctttgccta aaccgtatgg cctcccgtgc atctgtactc accctgtacg 1140
acaaacacat tacattatta aatgtttctc aaagatggag tt 1182

Sig: 63-149; exon1/2: 162/163; exon2/3: 432/433; exon3/4: 714/715; exon4/5: 825/826; exon5/6: 849/850
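Mapping a feature onto the sequence amounts to slicing by its 1-based, inclusive coordinates. The sketch below does this for the signal peptide (63-149) using the first 180 bases of the mRNA above, translates it with the standard genetic code, and turns the exon junctions into exon ranges. Two assumptions are made here: the exon 3/4 junction is taken as 714/715, since every junction is a pair of consecutive positions, and exon 1 is taken to start at base 1 of the record. The function names are illustrative.

```python
# First 180 bases of the mRNA shown above (spaces are removed below).
MRNA_PREFIX = (
    "tagttctccc tgagtgagac ttgcctgctt ctctggcccc tggtcctgtc ctgttctcca"
    "gcatggtgtg tctgaagctc cctggaggct cctgcatgac agcgctgaca gtgacactga"
    "tggtgctgag ctccccactg gctttggctg gggacacccg accacgtttc ttgtggcagc"
).replace(" ", "")

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def feature_slice(seq, start, end):
    """Return seq[start..end] for 1-based, inclusive coordinates."""
    return seq[start - 1:end]


def translate(nt):
    """Translate complete codons of a lowercase nucleotide string."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def exon_ranges(last_bases, length):
    """Turn the last base of each exon into (start, end) ranges."""
    starts = [1] + [b + 1 for b in last_bases]
    ends = list(last_bases) + [length]
    return list(zip(starts, ends))


signal = feature_slice(MRNA_PREFIX, 63, 149)
print(translate(signal))  # MVCLKLPGGSCMTALTVTLMVLSSPLALA

# Last base of exons 1 to 5; 714 is the assumed exon 3/4 junction.
for number, (start, end) in enumerate(
        exon_ranges([162, 432, 714, 825, 849], 1182), start=1):
    print(f"exon {number}: {start}..{end} ({end - start + 1} bp)")
```

The signal peptide region is 87 bases, i.e. 29 codons, starting at the atg at position 63.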


Searching for introns in GenBank

E.g. keywords: HLA-DRB1*01 intron

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=nucleotide&list_uids=1568491&dopt=GenBank

X88793. H.sapiens HLA-DRB...[gi:1568491]

LOCUS       HSDRB101G    1332 bp    DNA    linear    PRI 21-MAY-1996
DEFINITION  H.sapiens HLA-DRB1*01 gene.
ACCESSION   X88793
VERSION     X88793.1  GI:1568491
KEYWORDS    HLA class II DR-beta chain; HLA-DRB1*01 gene.
SOURCE      human.
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1332)
  AUTHORS   Svensson,A.C., Setterblad,N., Pihlgren,U., Rask,L. and Andersson,G.
  TITLE     Evolutionary relationship between human major histocompatibility
            complex HLA-DR haplotypes
  JOURNAL   Immunogenetics 43 (5), 304-314 (1996)
  MEDLINE   96175156
   PUBMED   9110934
REFERENCE   2  (bases 1 to 1332)
  AUTHORS   Svensson,A.C.
  TITLE     Direct Submission
  JOURNAL   Submitted (19-JUN-1995) A.C. Svensson, Swedish University of Agric.
            Sciences, Dept of Cell Research, Genetic Center, Box 7055, 750 07
            Uppsala, SWEDEN
FEATURES             Location/Qualifiers
     source          1..1332
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="6"
                     /haplotype="DR1/1"


X88793. H.sapiens HLA-DRB...[gi:1568491] (continued)

FEATURES             Location/Qualifiers
     source          1..1332
                     /organism="Homo sapiens"
                     /db_xref="taxon:9606"
                     /chromosome="6"
                     /haplotype="DR1/1"
     intron          <1..469
                     /number=4
     gene            470..493
                     /gene="HLA-DRB1*01"
     CDS             <470..>493
                     /gene="HLA-DRB1*01"
                     /codon_start=1
                     /product="HLA class II DR-beta chain"
                     /protein_id="CAA61272.1"
                     /db_xref="GI:1568492"
                     /db_xref="SPTREMBL:Q29797"
                     /translation="DTLDFSQQ"
     exon            470..493
                     /gene="HLA-DRB1*01"
                     /number=5
     intron          494..>1332
                     /number=5
     LTR             540..1053
                     /note="ERV9 LTR"
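In the feature table, a location such as <1..469 or 494..>1332 gives the 1-based start and end of a feature; a leading < or > marks a feature that continues beyond the sequenced region. The sketch below parses the simple start..end locations of this record and checks two things that can be read off the table: exon 5 is 24 bases long, matching the 8-residue translation, and the ERV9 LTR lies inside intron 5. It handles only this simple form, not join(...) or complement(...) locations, and the names Location and parse_location are illustrative.

```python
import re
from typing import NamedTuple


class Location(NamedTuple):
    start: int
    end: int
    partial_start: bool  # "<": feature begins before the first listed base
    partial_end: bool    # ">": feature continues past the last listed base

    @property
    def length(self):
        return self.end - self.start + 1

    def contains(self, other):
        return self.start <= other.start and other.end <= self.end


_SIMPLE_LOCATION = re.compile(r"^(<?)(\d+)\.\.(>?)(\d+)$")


def parse_location(text):
    """Parse a simple 'start..end' location with optional < and > marks."""
    match = _SIMPLE_LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported location: {text!r}")
    lt, start, gt, end = match.groups()
    return Location(int(start), int(end), lt == "<", gt == ">")


# Locations copied from the X88793 feature table above.
intron4 = parse_location("<1..469")
exon5 = parse_location("470..493")
intron5 = parse_location("494..>1332")
ltr = parse_location("540..1053")

print(exon5.length)           # 24 bases = 8 codons, as in "DTLDFSQQ"
print(intron5.contains(ltr))  # True: the ERV9 LTR lies within intron 5
print(intron4.partial_start, intron5.partial_end)  # True True
```

The partial marks explain why this record cannot give the full lengths of introns 4 and 5: only the parts flanking exon 5 were sequenced.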


Searching for the promoter in GenBank

E.g. keywords: HLA-DRB1 promoter


Biological databases: Case study: HLA-DRB1 molecule

By now we have studied SWISS-PROT and Genbank entries. From these we can extract the specific information – colored red:

1. Gene sequence

2. Gene structure: Introns, exons, promoter region

3. Protein sequence

4. Protein Sequence features and domains

5. Protein Structure

6. Function

7. Variation – closest variants in humans and other species

8. Clinical features


Protein structure: follow the SWISS-PROT link to PDB (e.g. PDB ID 1DLH)

http://www.rcsb.org/pdb/cgi/explore.cgi?pid=2031041686850&page=0&pdbId=1DLH

Crystal structure of the human class II MHC protein HLA-DR1 complexed with an influenza virus peptide.


HLA-DRB1: Variation – closest variants in humans

Keyword search for "HLA-DRB1 variant" against GenBank or SWISS-PROT

• 68 hits in GenBank on 04/01/2003 at 21:43.

• 1 hit in SWISS-PROT at the same time. http://tw.expasy.org/cgi-bin/niceprot.pl?Q30112

• The number of matches may vary because the databases are updated.
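The same keyword search can be scripted rather than typed into the web form. The sketch below only builds the request URL for NCBI's E-utilities esearch endpoint, which is an assumption on our part since the tutorial itself uses the Entrez web pages; it does not contact the server. The function name esearch_url and the retmax default are illustrative, and any hit count returned today will differ from the 2003 figures above.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (not used in the original tutorial).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(db, term, retmax=20):
    """Build an esearch URL for a keyword query against one Entrez database."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_ESEARCH}?{query}"


# "nuccore" is the nucleotide (GenBank) database, "pubmed" the literature one.
print(esearch_url("nuccore", "HLA-DRB1 variant"))
print(esearch_url("pubmed", "HLA-DRB1 variant"))
```

Keeping the database name as a parameter lets one keyword be tried against sequence and literature databases in turn.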


HLA-DRB1:

Variation – closest variants in other species

• Try keywords

• Use sequence similarity search


HLA-DRB1 molecule: Clinical features

This entry has no links to the OMIM database, so we resort to searching literature databases.

http://www3.ncbi.nlm.nih.gov/Entrez/


Summary

• GenBank

• SWISS-PROT

• PDB

• PubMed

• Others