Introduction to Bioinformatics Yana Kortsarts References: An Introduction to Bioinformatics Algorithms bioalgorithms.info


Page 1

Introduction to Bioinformatics

Yana Kortsarts

References: An Introduction to Bioinformatics Algorithms

bioalgorithms.info

Page 2

What is Bioinformatics?

Bioinformatics is a relatively new interdisciplinary field that integrates computer science, mathematics, biology, and information technology to manage, analyze, and understand biological, biochemical, and biophysical information.

Bioinformatics is a computational science and a subset of the larger field of computational biology.

Page 3

What is Bioinformatics?

Bioinformatics is the use of computers to study biology.

Bioinformatics is the science of using information to understand biology.

Bioinformatics is the integration of information technology (IT) and biology.

Bioinformatics is the development of computational methods for studying the structure, function, and evolution of genes, proteins, and whole genomes.

Page 4

Course Curriculum

Ethics, Computing and Genomics

Review of Molecular Biology and Biochemistry Concepts
    DNA and protein structure
    Gene expression (transcription and translation)
    Molecular Biology Central Dogma

Biological Research on the Web
    Public Biological Databases and Data Formats
    Searching Biological Databases

Page 5

Course Curriculum

Introduction to Bioinformatics Algorithms
    Sequence alignments, scoring, gaps
    Algorithm Design Techniques: Exhaustive Search, Dynamic Programming
    The Needleman-Wunsch Algorithm
    The Smith-Waterman Algorithm
    Introduction to BLAST
    Multiple Sequence Alignment
    Phylogenetic Trees

Introduction to Python and Biopython in a UNIX environment

Page 6

Some Terminology

The cell is the primary unit of life.

A cell consists of molecules, chemical reactions, and a copy of the genome for that organism.

All life on this planet depends on three types of molecules: DNA, RNA, and proteins.

Page 7

Some Terminology

DNA
    Holds information on how the cell works

RNA
    Acts to transfer short pieces of information to different parts of the cell
    Provides templates for protein synthesis

Proteins
    Form enzymes that send signals to other cells and regulate gene activity
    Form the body’s major components (e.g. hair, skin)

Page 8

DNA - Deoxyribonucleic Acid

Genetic material

Consists of two long strands

Each strand is made of:
    Phosphates
    Sugar
    Nucleotide bases
        A (adenine)
        G (guanine)
        C (cytosine)
        T (thymine)

Page 9

DNA – Double Helix Structure

Page 10

Discovery of DNA

DNA Sequences
    Chargaff and Vischer, 1949
        DNA consisting of A, T, G, C (Adenine, Thymine, Guanine, Cytosine)
    Chargaff Rule
        Noticing #A ≈ #T and #G ≈ #C
        A “strange but possibly meaningless” phenomenon.

Wow!! A Double Helix
    Watson and Crick, Nature, April 25, 1953
        1 Biologist, 1 Physics Ph.D. Student, 900 words, Nobel Prize
    Rich, 1973
        Structural biologist at MIT.
        DNA’s structure in atomic resolution.

Pictured: Crick, Watson

Page 11

Watson & Crick – “…the secret of life”

Watson: a zoologist, Crick: a physicist

“In 1947 Crick knew no biology and practically no organic chemistry or crystallography..” – www.nobel.se

Applying Chargaff’s rules and the X-ray image from Rosalind Franklin, they constructed a “tinkertoy” model showing the double helix.

Their 1953 Nature paper: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”

Watson & Crick with DNA model

Rosalind Franklin with X-ray image of DNA

Page 12

DNA: The Basis of Life

Deoxyribonucleic Acid (DNA)
    Double stranded with complementary strands A-T, C-G

DNA is a polymer
    Sugar-Phosphate-Base

Bases are held together by H bonding to the opposite strand
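The A-T, C-G pairing explains the Chargaff rule noted earlier: every A on one strand faces a T on the other, so across the whole double-stranded molecule the counts must match. A minimal Python sketch of that check follows; the example sequence and the helper names `opposite_strand` and `duplex_base_counts` are made up for illustration.

```python
from collections import Counter

# Watson-Crick pairing rules: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def opposite_strand(strand: str) -> str:
    """Return the base that pairs with each base of `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand)

def duplex_base_counts(strand: str) -> Counter:
    """Count bases over both strands of the double-stranded molecule."""
    return Counter(strand) + Counter(opposite_strand(strand))

strand = "AATCGCAATGGC"  # made-up example sequence
counts = duplex_base_counts(strand)
print(counts["A"] == counts["T"], counts["G"] == counts["C"])  # True True
```

The equality holds for any input strand, because each base contributes exactly one partner on the opposite strand. On a single strand taken alone, the counts need not match.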

Page 13

DNA, continued

Diagram (http://www.bio.miami.edu/dana/104/DNA2.jpg) with labels: Phosphate, Sugar, Base (A, T, C or G)

Page 14

DNA, continued

DNA has a double helix structure. However, it is not symmetric. It has a “forward” and a “backward” direction. The ends are labeled 5’ and 3’ after the carbon atoms in the sugar component.

5’ AATCGCAAT 3’

3’ TTAGCGTTA 5’

DNA sequences are conventionally written 5’ to 3’, and new strands are always synthesized in the 5’ to 3’ direction during transcription and replication.

Page 15

Double helix of DNA

Page 16

The Central Dogma of Molecular Biology

Information is transferred from DNA (information storage molecule) to RNA (information transfer molecule) to a specific protein (a functional, non-coding product).

DNA → RNA → Protein

DNA → RNA is transcription; RNA → Protein is translation.

Page 17

DNA, RNA, and the Flow of Information

Figure labels: Replication, Transcription, Translation

Page 18

Central Dogma (DNA → RNA → protein)
    The paradigm that DNA directs its transcription to RNA, which is then translated into a protein.

Transcription (DNA → RNA)
    The process which transfers genetic information from the DNA to the RNA.

Translation (RNA → protein)
    The process of transforming RNA to protein as specified by the genetic code.
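Transcription can be sketched on strings. The sketch relies on two facts: RNA uses U (uracil) where DNA uses T, and the RNA is built by base pairing against one DNA strand (the template), so it carries the same sequence as the other (coding) strand. The function names and the example sequence below are hypothetical, and the sketch ignores everything a real cell does around this step (promoters, splicing, and so on).

```python
# Pairing used when RNA is built against a DNA template: A->U, T->A, C->G, G->C.
_DNA_TO_RNA_PAIRING = str.maketrans("ACGT", "UGCA")

def transcribe(coding_strand: str) -> str:
    """mRNA for a DNA coding strand written 5' to 3': same letters, T replaced by U."""
    return coding_strand.upper().replace("T", "U")

def transcribe_from_template(template_5_to_3: str) -> str:
    """mRNA built by pairing against the template strand, returned 5' to 3'."""
    paired = template_5_to_3.upper().translate(_DNA_TO_RNA_PAIRING)
    return paired[::-1]  # the RNA runs antiparallel to its template

print(transcribe("ATGGCCTAA"))                # AUGGCCUAA
print(transcribe_from_template("TTAGGCCAT"))  # AUGGCCUAA
```

Both calls give the same mRNA because `TTAGGCCAT` is the strand opposite `ATGGCCTAA`, written 5’ to 3’.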

Page 19

RNA

RNA is similar to DNA chemically. It is usually only a single strand. T(hymine) is replaced by U(racil).

Some forms of RNA can form secondary structures by “pairing up” with themselves. This can change their properties dramatically.

DNA and RNA can pair with each other.

tRNA linear and 3D view: http://www.cgl.ucsf.edu/home/glasfeld/tutorial/trna/trna.gif

Page 20

More Terminology

Transcription of DNA
    DNA is transcribed into RNA
    RNA exists as a single-strand unit and as a double helix as well
    RNA consists of A, C, G and U (uracil)

Types of RNA
    Messenger RNA – mRNA
    Transfer RNA – tRNA
    Ribosomal RNA – rRNA

Page 21

More Terminology

Translation of Messenger RNA (mRNA):
    mRNA is translated into protein

Proteins:
    linear polymers built from amino acids

The transfer of information from DNA to a specific protein via RNA takes place according to the genetic code.
    The RNA sequence is divided into blocks of three letters
    Each block is called a CODON
    Each codon corresponds to a specific amino acid
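Dividing an RNA string into blocks of three can be sketched in a few lines of Python. The function name `split_codons` and the example string are illustrative, and dropping leftover bases at the end is a simplifying choice made for this sketch.

```python
def split_codons(rna: str) -> list[str]:
    """Split an RNA string into consecutive, non-overlapping codons.

    Trailing bases that do not fill a complete codon are dropped
    (a simplifying choice for this sketch).
    """
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]

print(split_codons("AUGGCCUAAGG"))  # ['AUG', 'GCC', 'UAA']
```

Where the first block starts matters: shifting the starting position by one base gives a completely different list of codons.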

Page 22

More Terminology

Four different nucleotides are used to build DNA and RNA molecules: A, G, C, T for DNA and A, G, C, U for RNA

20 different amino acids are used in protein synthesis

Four nucleotides can be arranged in 64 different combinations of three. There are 64 = 4*4*4 different codons

Some codons are redundant and some have a special function: to terminate the translation process
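The count 64 = 4*4*4 can be checked by listing every ordered triple of bases. The short Python sketch below does that with `itertools.product`; the variable names are illustrative. It also shows why blocks of two would not be enough: 4*4 = 16 is fewer than the 20 amino acids, while 64 is more than 20, which is why some codons have to be redundant.

```python
from itertools import product

RNA_BASES = "ACGU"

# Every ordered triple of bases is a codon.
codons = ["".join(triple) for triple in product(RNA_BASES, repeat=3)]

print(len(codons))          # 64
print(codons[:4])           # ['AAA', 'AAC', 'AAG', 'AAU']
print(len(RNA_BASES) ** 2)  # 16: blocks of two would be too few for 20 amino acids
```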

Page 23

Translation

The process of going from RNA to polypeptide.

Three bases of RNA (called a codon) correspond to one amino acid based on a fixed table.

Always starts with methionine (codon AUG) and ends at a stop codon.
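The fixed table and the start/stop rule can be put together in a short Python sketch. It assumes the standard genetic code, written compactly in the conventional U, C, A, G order, and uses one-letter amino acid abbreviations (M is methionine). The names `CODON_TABLE` and `translate` and the example mRNA are made up for illustration, and the sketch ignores biological exceptions such as alternative start codons or non-standard codes.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G order.
# '*' marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("GGAUGGCCCUGUAAGG"))  # MAL
```

In the example, translation begins at AUG (M), reads GCC (A) and CUG (L), and stops at UAA. The bases before the start codon and after the stop codon are not translated.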

Page 24

Cell Information: Instruction book of Life

DNA, RNA, and proteins are examples of strings written in either the four-letter nucleotide alphabet of DNA and RNA (A C G T/U) or the twenty-letter amino acid alphabet of proteins.

Each amino acid (Leu, Arg, Met, etc.) is coded by 3 nucleotides, called a codon.

Page 25

Protein Synthesis: Summary

There are twenty amino acids, each coded by three-base sequences in DNA, called “codons”
    This code is degenerate (several codons can code for the same amino acid)

The central dogma describes how proteins derive from DNA
    DNA → mRNA → (splicing?) → protein

The protein adopts a 3D structure specific to its amino acid arrangement and function

Page 26

Proteins

Complex organic molecules made up of amino acid subunits

20 different kinds of amino acids.* Each has a one-letter and a three-letter abbreviation.

http://www.ncbi.nlm.nih.gov/Class/MLACourse/Modules/MolBioReview/iupac_aa_abbreviations.html

Proteins are often enzymes that catalyze reactions.

Proteins are also called “polypeptides”.

*Some other amino acids exist but not in humans.