Introduction to Bioinformatics
What is Bioinformatics?
Easy Answer: Using computers to solve molecular biology problems; the intersection of molecular biology and computer science
Hard Answer: Computational techniques (e.g. algorithms, artificial intelligence, databases) for the management and analysis of biological data and knowledge
Bioinformatics
Bioinformatics = Biology + Information
Biology is becoming an information science
Computational methods are necessary to analyze the massive amount of information coming out of the genome projects
Bioinformatics is Another Revolution in Biology
Three concepts, which remain central to Bioinformatics
Data representation
A complex, dynamic, three-dimensional molecule is represented as a simple string of characters
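As a minimal sketch of this idea, a DNA molecule can be held as an ordinary Python string and queried with plain string operations. The sequence and the helper name `base_counts` are made up for illustration.

```python
from collections import Counter

def base_counts(seq: str) -> dict:
    """Count how often each base letter occurs in a sequence string."""
    return dict(Counter(seq.upper()))

# Hypothetical DNA fragment, written 5' to 3'.
dna = "ATGGCCATTGTAATGGGCCGC"

print(len(dna))          # length in bases
print(base_counts(dna))  # composition of the string
```

Everything about the molecule's shape and dynamics is lost in this representation; what remains is the order of the letters, which is what sequence algorithms work on.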
Three concepts, which remain central to Bioinformatics
The concept of similarity
– Evolution has operated on every sequence
– In biomolecular sequences (DNA, RNA or amino acid sequences), high sequence similarity usually implies significant functional or structural similarity
– The opposite is not true
– Algorithms for comparing sequences and finding similar regions are at the heart of bioinformatics
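One very simple similarity measure is percent identity between two sequences that are already aligned and of equal length. The sketch below is only that: the function name `percent_identity` and the example sequences are illustrative, and real comparison tools also handle insertions and deletions (gaps), which this ignores.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length, pre-aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Two made-up 7-base sequences that differ at two positions.
seq1 = "GATTACA"
seq2 = "GACTATA"
print(round(percent_identity(seq1, seq2), 1))
```

Here five of seven positions agree, so the score is about 71%.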
Three concepts, which remain central to Bioinformatics
Bioinformatics is not a theoretical science; it is driven by the data, which in turn is driven by the needs of biology.
Sequences
Microarray technologies
…
GenBank Growth
Moore’s Law
What do you need to know?
It all depends on your background
Are you a …?
– Biologist with some computer knowledge, or
– Computer scientist with some biology background
Few do both well
Background
Biology for Computer Scientists
Computer Science for Biologists
Biological Information Flow
Genome
Introns/Exons
Gene Sequence
Protein Sequence
Protein Functions
Protein Structure
Cellular Pathways
Bioinformatics attempts to model this pathway
Living Things
Entropy (the tendency toward disorder) always increases
Living organisms have low entropy compared with things like soil
They are relatively orderly…
The most critical task is to maintain the distinction between inside and outside
Living Things
In order to maintain low entropy, living organisms must expend energy to keep things orderly.
They figured out how to do this 4 billion years ago
The functions of life, therefore, are meant to facilitate the acquisition and orderly expenditure of energy
Living Things
The compartments with low entropy are separated from “the world.”
Cells are the smallest unit of such compartments.
Bacteria are single-cell organisms
Humans are multi-cell organisms
The “living things” have the following tasks:
Gather energy from environment
Use energy to maintain inside/outside distinction
Use extra energy to reproduce
Develop strategies for being successful and efficient at the above tasks
– Develop ways to move around
– Develop signal transduction capabilities (e.g. vision)
– Develop methods for efficient energy capture (e.g. digestion)
– Develop ways to reproduce effectively
How to accomplish…?
Living compartments on earth have developed three basic technologies
– Ability to separate inside from outside (lipids)
– Ability to build three-dimensional molecules that assist in the critical functions of life (Protein, RNA)
– Ability to compress the information about how (and when) to build these molecules in linear code (DNA)
Bioinformatics Schematic of a Cell
Lipids
Made of a hydrophilic (water-loving) molecular fragment connected to hydrophobic fragments
Spontaneously form sheets (lipid membranes) in which all the hydrophilic ends align on the outside, and hydrophobic ends align on the inside
Creates a very stable separation, not easy to pass through except for water and a few other small atoms/molecules
What is a Nucleotide?
Pentose, base, phosphate group
Pentose: RNA and DNA
Base
Adenine (A), Cytosine (C), Guanine (G), Thymine (T), Uracil (U).
Nucleic Acid Chain
Condensation reaction
Orientation: from 5’ to 3’
In DNA or RNA, a nucleic acid chain is called a “strand”
– DNA: double-stranded
– RNA: single-stranded
The number of bases
– Base pair (bp) in DNA
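Because both strands are read 5’ to 3’ and pair base by base (A with T, C with G), the second strand of a DNA double helix can be derived from the first. A minimal sketch, with a made-up sequence and an illustrative function name `reverse_complement`:

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand: str) -> str:
    """Return the partner strand of a DNA sequence, itself written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))

strand = "ATGCGT"                  # hypothetical 6-base strand
partner = reverse_complement(strand)
print(partner)
print(len(strand), "bp")           # the double-stranded length is counted in base pairs
```

Applying the function twice returns the original strand, which is a quick sanity check on the pairing table.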
DNA Structure
RNA Structure and Function
• The major role of RNA is to participate in protein synthesis
• Messenger RNA (mRNA)
• Transfer RNA (tRNA)
• Ribosomal RNA (rRNA)
mRNA
The Genetic Code
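The genetic code maps each three-base codon of an mRNA to one amino acid or to a stop signal. The sketch below builds the standard code table from a compact 64-letter string (one-letter amino acid codes, codons enumerated in U, C, A, G order) and translates a made-up reading frame. The names `CODON_TABLE` and `translate` are illustrative, and the example assumes the sequence starts in frame.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate an mRNA reading frame codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Hypothetical mRNA: AUG GCC AUU GUA UAA
print(translate("AUGGCCAUUGUAUAA"))  # prints MAIV
```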
What is a gene?
A gene includes the entire nucleic acid sequence necessary for the expression of its product.
Such a sequence may be divided into
– Regulatory region
– Transcriptional region: exons and introns
Exons encode a peptide or functional RNA
Introns are removed after transcription
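Removing introns amounts to keeping the exon segments of the transcript and joining them in order. A minimal sketch, assuming the exon positions are already known; the transcript, the coordinates, and the function name `splice` are all hypothetical.

```python
def splice(pre_mrna: str, exons: list) -> str:
    """Join the exon segments of a transcript, dropping the introns between them.

    Each exon is a (start, end) pair of 0-based, end-exclusive positions.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))

# Made-up transcript: exons in upper case, introns in lower case for readability.
pre_mrna = "AUGGCC" + "guaaguuuag" + "AUUGUA" + "guacccag" + "UAA"
exons = [(0, 6), (16, 22), (30, 33)]

print(splice(pre_mrna, exons))  # prints AUGGCCAUUGUAUAA
```

In practice the hard part is finding the exon boundaries in the first place; this sketch only shows what is done with them once they are known.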
Gene
Genome
The total genetic information of an organism.
For most organisms, it is the complete DNA sequence
For RNA viruses, the genome is the complete RNA sequence
Genes and Control
The human genome has 3,000,000,000 bps divided into 23 linear segments (chromosomes)
A gene has on average 1340 DNA bps, thus specifying a protein of about ? (how many) amino acids
Humans have about 35,000 genes ≈ 47,000,000 DNA bps ≈ 1.6% of total DNA in the genome
Humans have roughly another 2,950,000,000 bps for control information (e.g. when, where, how long, etc.)
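Taking the slide's three inputs as given (a 3,000,000,000 bp genome, 1340 bp per gene, about 35,000 genes), the derived quantities can be recomputed directly. The sketch assumes the whole 1340 bp is read as codons, three bases per amino acid, and ignores stop codons and untranslated regions; the variable names are illustrative.

```python
GENOME_BP = 3_000_000_000   # total base pairs, from the slide
GENE_BP = 1340              # average gene length in bp, from the slide
GENE_COUNT = 35_000         # approximate gene count, from the slide

# Three bases (one codon) specify one amino acid.
amino_acids_per_gene = GENE_BP // 3

# Total bp covered by genes, and their share of the genome.
genic_bp = GENE_COUNT * GENE_BP
genic_fraction = genic_bp / GENOME_BP

print(amino_acids_per_gene)        # prints 446
print(genic_bp)                    # prints 46900000
print(f"{genic_fraction:.1%}")     # prints 1.6%
print(GENOME_BP - genic_bp)        # remaining bp outside genes
```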
Gene Expression
An organism may contain many types of cells, each with distinct shape and function
However, they all have the same genome
The genes in a genome do not have any effect on cellular functions until they are “expressed”
Different types of cells express different sets of genes, thereby exhibiting various shapes and functions
Gene Expression
The production of a protein or a functional RNA from its gene
Several steps are required
– Transcription
– RNA processing
– Nuclear transport
– Protein synthesis
Central Dogma
DNA → RNA → Protein
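The first arrow, transcription, can be sketched in one line if the input is taken to be the coding (sense) strand of the gene: the mRNA then has the same sequence with U in place of T. That choice of strand is an assumption of this example (the cell copies from the complementary template strand), and the function name `transcribe` and the sequence are illustrative.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (5' to 3'): same sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")

dna = "ATGGCCATTGTATAA"   # hypothetical coding strand
print(transcribe(dna))    # prints AUGGCCAUUGUAUAA
```

The second arrow, translation, then reads the mRNA three bases at a time using the genetic code.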
Next …
Protein Structure and Function
An Amino Acid
An amino acid is defined as a molecule containing an amino group (NH2), a carboxyl group (COOH) and an R group.
R-CH(NH2)-COOH
The R group differs among the various amino acids. In a protein, the R group is also called a side chain.
The Twenty Amino Acids of Proteins
Protein
Peptide: a chain of amino acids linked together by peptide bonds.
Polypeptides: long peptides
Oligopeptides: short peptides (< 10 amino acids)
Proteins are made up of one or more polypeptides with more than 50 amino acids
Protein Structure
Primary Structure
– Refers to the protein's amino acid sequence
Secondary Structure
Regular, repeated patterns of folding of the protein backbone.
Two most common folding patterns
– Alpha helix
– Beta sheet
Tertiary Structure
The overall folding of the entire polypeptide chain into a specific 3D shape
Quaternary Structure
Many proteins are formed from more than one polypeptide chain
Describes the way in which the different subunits are packed together to form the overall structure of the protein
Hemoglobin molecule
Evolution
Mutation: rare events, sometimes single base changes, sometimes larger events
Recombination: how your genome was constructed as a mixture of your two parents
Through Natural Selection
Homology (similarity): different species are assumed to have common ancestors
The genetic variation between different people is …(surprisingly ..)
References
http://www.biology.arizona.edu/biochemistry/problem_sets/large_molecules/
http://helix-web.stanford.edu/bmi214/index2004.html
http://www.web-books.com/MoBio/
http://www.cs.sunysb.edu/~skiena/549/