Bioinformatics: introduction, molecular biology, ...

  • date post

    25-Dec-2015

Transcript of Bioinformatics: introduction, molecular biology, ...

  • Slide 1
  • bioinformatics
  • Slide 2
  • course layout: introduction; molecular biology; biotechnology; bioMEMS; bioinformatics; bio-modeling; cells and e-cells; transcription and regulation; cell communication; neural networks; DNA computing; fractals and patterns; the birds and the bees .. and ants
  • Slide 3
  • book: Introduction to Computational Molecular Biology
  • Slide 4
  • introduction
  • Slide 5
  • DNA
  • Slide 6
  • Slide 7
  • central dogma
  • Slide 8
  • definitions: Informatics: the science of information management. Bioinformatics: the science of biological information management.
  • Slide 9
  • what is bioinformatics?
  • Slide 10
  • Bioinformatics is multidisciplinary (interdisciplinary): Computer Science; Math; Statistics; Structural Biology; Phylogenetics; Drug Design; Genomics; Molecular Biology
  • Slide 11
  • increasing levels of complexity: Genome (DNA); Transcriptome (RNA); Proteome (proteins); Metabolome (metabolic pathways)
  • Slide 12
  • growth of biological databases: GenBank base-pair growth (source: GenBank)
  • Slide 13
  • 3D structures growth: http://www.rcsb.org/pdb/holdings.html
  • Slide 14
  • DNA/RNA symbols
    symbol  meaning         explanation
    G       G               Guanine
    A       A               Adenine
    T       T               Thymine
    C       C               Cytosine
    R       A or G          puRine
    Y       C or T          pYrimidine
    N       A, C, G or T    aNy base
    U       U               Uracil
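The symbol table on this slide maps directly onto small lookup tables. The following is a minimal sketch, assuming only the symbols listed on the slide (G, A, T, C, R, Y, N, U); the function names `expand`, `reverse_complement` and `transcribe` are illustrative and do not come from the slides.

```python
# Lookup tables for the nucleotide symbols listed on the slide.
# Function names are illustrative, not taken from the slides.

# Which concrete bases each symbol can stand for.
EXPANSION = {
    "G": "G", "A": "A", "T": "T", "C": "C", "U": "U",
    "R": "AG",      # puRine
    "Y": "CT",      # pYrimidine
    "N": "ACGT",    # aNy base
}

# DNA complements; R (A or G) pairs with Y (C or T), N stays N.
COMPLEMENT = {
    "A": "T", "T": "A", "G": "C", "C": "G",
    "R": "Y", "Y": "R", "N": "N",
}


def expand(symbol):
    """Return the set of bases a symbol can represent."""
    return set(EXPANSION[symbol.upper()])


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def transcribe(dna):
    """Rewrite a DNA string as RNA by replacing T with U."""
    return dna.upper().replace("T", "U")
```

For example, `expand("R")` gives the two purines, and `reverse_complement` keeps ambiguity symbols meaningful because the complement of a purine is a pyrimidine.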
  • Slide 15
  • some definitions of bioinformatics
    use of computers to catalog and organize molecular life science information into meaningful entities
    subset of computational biology
    methods to analyse, store, search, retrieve and represent biological data by computers / in computers
    massive amounts of data: databases
    extracting information and knowledge from "raw" data
    for most bioscientists, all they need in bioinformatics is sequence analysis
  • Slide 16
  • what does it do?
    bioinformatics is not just the storage of data in a computer.
    bioinformatics is the use of computers to test a biological hypothesis prior to performing the experiment in the laboratory.
    bioinformatics is the design of software programs that analyse data.
  • Slide 17
  • bioinformatics databases
    nucleotide and protein sequences
    protein structures
    all sorts of functional data related to genes, proteins and their regulation, interactions etc.
    curated and non-curated databases
  • Slide 18
  • some goals
    sequence searching and sequence alignments
    looking at properties that can be analyzed/predicted from sequence data
    protein structures and their analysis
    structural classification
    visualisation of macromolecules
    system-wide understanding of the biology of a given organism
  • Slide 19
  • genomes and their annotation
    complete genomes of many organisms are available
    seeing parts lists of everything an organism needs and figuring out how they work together
    annotation: looking at the DNA sequence
  • Slide 20
  • genomes and their annotation
    gene finding is not always straightforward
    problem: rare gene products, for which you cannot find corresponding mRNA or protein sequences in databanks
    additional complication: alternative splicing, many transcripts per gene
  • Slide 21
  • genomes and their annotation
    if you intend to analyze or just use data from a databank, it is useful to know both the goals and the reality of its annotation level
    inconsistencies, missing data
    even well-annotated databanks provide only a fraction of all biologically relevant information about a gene or a molecule (compared to literature)
  • Slide 22
  • annotation: a vision
    databank content: all knowledge on functions of a gene product
    add structural information: insights into structure-function relationships
    add data on expression patterns and regulation: understanding cell differentiation and other big questions in biology at the molecular level
  • Slide 23
  • current -omics
  • Slide 24
  • metabolomics: to identify, measure and interpret the complex time-related concentration, activity and flux of metabolites in cells, tissues, and other bio-samples such as blood, urine, and saliva.
  • Slide 25
  • systems biology: integrated view of biology at multiple levels; generation of quantitative, predictive models of the behavior of biological systems, such as organisms
  • Slide 26
  • bioinformatics in short very short
  • Slide 27
  • common genes?
  • Slide 28
  • what is bioinformatics? Application of information technology to the storage, management and analysis of biological information. Facilitated by the use of computers.
  • Slide 29
  • Sequence analysis: geneticists/molecular biologists analyse genome sequence information to understand disease processes; Molecular modeling: crystallographers/biochemists design drugs using computer-aided tools; Phylogeny/evolution: geneticists obtain information about the evolution of organisms by looking for similarities in gene sequences; Ecology and population studies: bioinformatics is used to handle large amounts of data obtained in population studies; Medical informatics: personalised medicine
  • Slide 30
  • sequence analysis: overview (diagram labels): Nucleotide sequence file; Search databases for similar sequences; Sequence comparison; Multiple sequence analysis; Design further experiments; Restriction mapping; PCR planning; Translate into protein; Search for known motifs; RNA structure prediction; non-coding; coding; Protein sequence analysis; Search for protein coding regions; Sequencing project management; Protein sequence file; Sequence comparison; Search for known motifs; Predict secondary structure; Predict tertiary structure; Create a multiple sequence alignment; Edit the alignment; Format the alignment for publication; Molecular phylogeny; Protein family analysis; Nucleotide sequence analysis; Sequence entry; Manual sequence entry; Sequence database browsing; Search databases for similar sequences
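One step in this overview, "Translate into protein", is easy to illustrate. The sketch below assumes the standard genetic code and a single reading frame starting at the first base; the names `CODON_TABLE` and `translate` are illustrative, and unknown codons (for example ones containing N) are shown as X by choice.

```python
from itertools import product

# Standard genetic code, written in the usual T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence, to_stop=True):
    """Translate a nucleotide string in frame 1 into one-letter amino acids.

    RNA input is accepted (U is read as T). Codons that are not in the
    table are reported as 'X'. A trailing partial codon is ignored.
    """
    dna = sequence.upper().replace("U", "T")
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[start:start + 3], "X")
        if amino_acid == "*" and to_stop:
            break  # stop codon ends the protein
        protein.append(amino_acid)
    return "".join(protein)
```

Searching for protein coding regions would repeat this over all three frames of both strands; the sketch covers only the translation step itself.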
  • Slide 31
  • gene sequencing: automated chemical sequencing methods allow rapid generation of large data banks of gene sequences
  • Slide 32
  • database similarity searching
    The BLAST program has been written to allow rapid comparison of a new gene sequence with the hundreds of thousands of gene sequences in databases.

    Sequences producing significant alignments:                       (bits)  Value
    gnl|PID|e252316 (Z74911) ORF YOR003w [Saccharomyces cerevisiae]     112  7e-26
    gi|603258 (U18795) Prb1p: vacuolar protease B [Saccharomyces ce...  106  5e-24
    gnl|PID|e264388 (X59720) YCR045c, len:491 [Saccharomyces cerevi...   69  7e-13
    gnl|PID|e239708 (Z71514) ORF YNL238w [Saccharomyces cerevisiae]      30  0.66
    gnl|PID|e239572 (Z71603) ORF YNL327w [Saccharomyces cerevisiae]      29  1.1
    gnl|PID|e239737 (Z71554) ORF YNL278w [Saccharomyces cerevisiae]      29  1.5

    gnl|PID|e252316 (Z74911) ORF YOR003w [Saccharomyces cerevisiae]
    Length = 478
    Score = 112 bits (278), Expect = 7e-26
    Identities = 85/259 (32%), Positives = 117/259 (44%), Gaps = 32/259 (12%)

    Query: 2   QSVPWGISRVQAPAAHNRG---------LTGSGVKVAVLDTGIST-HPDLNIRGG-ASFV 50
               +  PWG+ RV        G           G GV   VLDTGI T H D   R    + +
    Sbjct: 174 EEAPWGLHRVSHREKPKYGQDLEYLYEDAAGKGVTSYVLDTGIDTEHEDFEGRAEWGAVI 233

    Query: 51  PGEPSTQDGNGHGTHVAGTIAALNNSIGVLGVAPSAELYXXXXXXXXXXXXXXXXXQGLE 110
               P      D NGHGTH AG I + +      GVA + ++                  +G+E
    Sbjct: 234 PANDEASDLNGHGTHCAGIIGSKH-----FGVAKNTKIVAVKVLRSNGEGTVSDVIKGIE 288
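The idea of ranking database entries by similarity to a query can be shown with a toy example. The sketch below is not BLAST: it only counts short words (k-mers) shared between the query and each database sequence, with no alignment extension and no statistics such as bit scores or Expect values. The function names and the word size `k=3` are assumptions made for illustration.

```python
from collections import defaultdict


def kmers(sequence, k):
    """Return the set of all length-k words in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_index(database, k=3):
    """Map each k-mer to the names of the database sequences containing it.

    `database` is a dict of {name: sequence}.
    """
    index = defaultdict(set)
    for name, sequence in database.items():
        for word in kmers(sequence, k):
            index[word].add(name)
    return index


def search(query, index, k=3):
    """Rank database sequences by the number of distinct k-mers shared with the query.

    Returns (name, shared_word_count) pairs, best hit first.
    Sequences sharing no word with the query are not reported.
    """
    hits = defaultdict(int)
    for word in kmers(query, k):
        for name in index.get(word, ()):
            hits[name] += 1
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))
```

Building the index once and reusing it for many queries is what makes a word-based lookup fast compared with aligning the query against every entry in full.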
  • Slide 33
  • sequence comparison
    Gene sequences can be aligned to see similarities between genes from different sources.

    768 TT....TGTGTGCATTTAAGGGTGATAGTGTATTTGCTCTTTAAGAGCTG 813
        ||     ||   ||  |  | ||| |  |||| |||||    |||  |||
     87 TTGACAGGTACCCAACTGTGTGTGCTGATGTA.TTGCTGGCCAAGGACTG 135

    814 AGTGTTTGAGCCTCTGTTTGTGTGTAATTGAGTGTGCATGTGTGGGAGTG 863
        |  | |              |  |||||| |   ||||  | || |   |
    136 AAGGATC.............TCAGTAATTAATCATGCACCTATGTGGCGG 172

    864 AAATTGTGGAATGTGTATGCTCATAGCACTGAGTGAAAATAAAAGATTGT 913
        ||| | ||| || || |||   |   |||||||||  ||   |||||| |
    173 AAA.TATGGGATATGCATGTCGA...CACTGAGTG..AAGGCAAGATTAT 216
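An alignment like the one on this slide can be produced by dynamic programming. The sketch below implements Needleman-Wunsch global alignment with a simple linear gap penalty. The scoring values (match 1, mismatch -1, gap -1) are assumptions chosen for illustration, and the slides do not say which program or scoring scheme produced the alignment shown.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    rows, cols = len(a) + 1, len(b) + 1

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))
```

The table has (len(a)+1) x (len(b)+1) cells, so this approach suits pairs of sequences rather than whole-database searches.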
  • Slide 34
  • restriction mapping
    Genes can be analysed to detect gene sequences that can be cleaved with restriction enzymes.
    AceIII    1  CAGCTCnnnnnnnnnn...
    AluI      2  AGCT
    AlwI      1  GGATCnnnnn_
    ApoI      2  rAATT_y
    BanII     1  G_rGCyC
    BfaI      2  CTA_G
    BfiI      1  ACTGGG
    BsaXI     1  ACnnnnnCTCC
    BsgI      1  GTGCAGnnnnnnnnnnn...
    BsiHKAI   1  G_wGCwC
    Bsp1286I  1  G_dGChC
    BsrI      2  ACTG_Gn
    BsrFI     1  rCCGG_y
    CjeI      2  CCAnnnnnnGTnnnnnn...
    CviJI     4  rGCy
    CviRI     1  TGCA
    DdeI      2  CTnA_G
    DpnI      2  GATC
    EcoRI     1  GAATT_C
    HinfI     2  GAnT_C
    MaeIII    1  GTnAC_
    MnlI      1  CCTCnnnnnn_n
    MseI      2  TTA_A
    MspI      1  CCG_G
    NdeI      1  CATA_TG
    Sau3AI    2  GATC_
    SstI      1  G_AGCTC
    TfiI      2  GAwT_C
    Tsp45I    1  GTsAC_
    Tsp509I   3  AATT_
    TspRI     1  CAGTGnn
    (map scale: 50 100 150 200 250)
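Recognition sequences such as those on the restriction-mapping slide can be turned into regular expressions. The sketch below assumes the lower-case letters are standard IUPAC ambiguity codes (r, y, w, s, d, h, n) and that `_` marks the cut position, which it simply drops. The names `site_to_regex` and `find_sites` are illustrative, only the given strand is searched, and entries truncated with `...` on the slide are not handled.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "D": "[AGT]", "H": "[ACT]", "N": "[ACGT]",
}


def site_to_regex(site):
    """Compile a recognition sequence such as 'G_rGCyC' into a regex.

    The '_' cut marker is removed. A lookahead is used so that
    overlapping occurrences are all reported.
    """
    letters = site.replace("_", "").upper()
    pattern = "".join(IUPAC[letter] for letter in letters)
    return re.compile("(?=(" + pattern + "))")


def find_sites(dna, site):
    """Return 1-based start positions of a recognition sequence in dna."""
    regex = site_to_regex(site)
    return [match.start() + 1 for match in regex.finditer(dna.upper())]
```

For sites that are not palindromic, a fuller tool would also search the reverse complement strand; that step is left out here to keep the sketch short.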
  • Slide 35
  • PCR primer design
    Oligonucleotides for use in the polymerase chain reaction can be designed using computer-based programs.
    OPTIMAL primer length --> 20
    MINIMUM primer length --> 18
    MAXIMUM primer length --> 22
    OPTIMAL primer melting temperature --> 60.000
    MINIMUM acceptable melting temp --> 57.000
    MAXIMUM acceptable melting temp --> 63.000
    MINIMUM acceptable primer GC% --> 20.000
    MAXIMUM acceptable primer GC% --> 80.000
    Salt concentration (mM) --> 50.000
    DNA concentration (nM) --> 50.000
    MAX no. unknown bases (Ns) allowed --> 0
    MAX acceptable self-complementarity --> 12
    MAXIMUM
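The length, melting temperature, GC% and unknown-base limits on this slide can be checked with a few lines of code. The sketch below uses the slide's values as defaults, but the melting temperature is estimated with the simple Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius. That rule is a stand-in: the slide's program takes salt and DNA concentrations, which suggests it uses a different Tm model. Function names are illustrative, and self-complementarity is not checked.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a non-empty primer."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C).

    A stand-in estimate for short oligonucleotides; it ignores salt and
    DNA concentration.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2.0 * at + 4.0 * gc


def check_primer(primer, min_len=18, max_len=22, min_tm=57.0, max_tm=63.0,
                 min_gc=20.0, max_gc=80.0, max_n=0):
    """Return a list of problems; an empty list means all checks passed.

    Default limits are the values shown on the slide.
    """
    primer = primer.upper()
    if not primer:
        return ["empty primer"]
    problems = []
    if not min_len <= len(primer) <= max_len:
        problems.append(f"length {len(primer)} outside {min_len}-{max_len}")
    if primer.count("N") > max_n:
        problems.append(f"{primer.count('N')} unknown bases, max {max_n}")
    gc = gc_percent(primer)
    if not min_gc <= gc <= max_gc:
        problems.append(f"GC% {gc:.1f} outside {min_gc}-{max_gc}")
    tm = wallace_tm(primer)
    if not min_tm <= tm <= max_tm:
        problems.append(f"Tm {tm:.1f} outside {min_tm}-{max_tm}")
    return problems
```

Returning a list of problems rather than a single pass/fail flag makes it easier to see which constraint a candidate primer violates.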