Basic Examples from Genomics - unimi.it Examples from Genomi… · The Plate. • Usually made...

Basic Examples from Genomics Scientific Computing 2013-2014


Basic Examples from Genomics

Scientific Computing 2013-2014


TIMELINE

Genetics as a set of principles and analytical procedures did not begin until 1866, when an Augustinian monk named Gregor Mendel performed a set of experiments that revealed the basic mathematics of inheritance (how information is carried between generations).

Until 1944, it was generally assumed that chromosomal proteins carry genetic information, and that DNA plays a secondary role. This view was shattered by Avery, MacLeod, and McCarty, who demonstrated that the molecule deoxyribonucleic acid (DNA) is the major carrier of genetic information in living organisms, i.e., that it is responsible for inheritance. The basic biological units responsible for possessing and passing on a single characteristic are called genes.

In 1953 James Watson and Francis Crick deduced the three-dimensional double-helix structure of DNA and immediately inferred its method of replication.

In February 2001, the first draft of the human genome was published.


Two kinds of Cells

• Prokaryotes – no nucleus (bacteria)
  • Their genomes are typically circular
• Eukaryotes – have a nucleus (animals, plants)
  • Linear genomes with multiple chromosomes in pairs.


DNA

• Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms.

• Backbone: sugars and phosphate groups

• DNA is a long polymer of simple units called nucleotides

• Bases: A: adenine, C: cytosine, G: guanine, T: thymine


CTGCTGTACTAGGATGCTGGTGGAGAGAGCTGCATATAAATCTTTGAGAGATGCACCAAG AATCACCATCATGGTTTCCGCCATAGGGGCTTCTTTTTTTATTCAAAATCTTGCCATTGT TTTATTTGGTGGTAGACCGAAAACTGTTCCAACGGTGGAGGTATTGTCCGGGGTGATAAA GCTGGGGTCCGTATCTCTACAAAGGCTGACCTTAGTGATTCCAGTAGTAACCATACTGCT ATTATTTCTTTTGATGTTTTTAGTGAACCAAACGAAAACTGGAATGGCAATGCGTGCCGT ATCCAAGGACTATGAAACCGCGCGGCTTATGGGAATTGACGTCAATAAAATTATTACCAT AACCTTTGGTATTGGCTCTGCTCTGGCAGCTATTGGTGGCATCATGTGGGGCGCAAAATT TCCTAAAATAGACCCTTTTGTTGGGACTATGCCGGGTATTAAATGCTTTATTGCTGCAGT TCTAGGTGGAATCGGAAACATTCCCGGTGCAGTAATCGGGGGGTTCATCTTAGGGATTGG AGAGATTATGCTCATTGCTTTTCTACCGAGCCTAACTGGCTATCGAGATGCCTTTGCTTT CATACTACTGATTATCATTCTACTGTTTAAGCCAACAGGAATCATGGGTGAAAAAATTGC GGAGAAGGTGTAGACGATGAAAAAGAAAAATACCATATTAACTGGATTAGCAGTATTGCT TTTATTGATTTATTTGATTTATGCAAATAAGAATTATGATTCTTATAAAATTAGAGTTCT AAATCTATGTGCAATTTATGCTGTATTGGGACTCAGTATGAATTTGATCAATGGATTTAC AGGTTTATTTTCCCTTGGACATGCAGGTTTTATGGCAGTAGGTGCCTATACTACCGCTCT TCTGACCATGACACCGCAAAGTAAGGAGGCAACATTCTTCTTAGTGCCCATTGTAGAGCC TTTGGCTAAAATTCAGCTTCCTTTTTTTGTGGCACTGATCATCGGTGGACTACTTTCAGC AATGGTGGCATTTTTAATCGGTGCACCGACTTTAAGGCTGAAGGGCGATTATTTAGCCAT

Complementary Base Pairing: A-T, C-G
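The pairing rule can be sketched in a few lines of Python. This is only an illustration: the function names `complement` and `reverse_complement` are hypothetical, and the input is assumed to contain only the letters A, C, G, T.

```python
# Watson-Crick pairing as stated on the slide: A pairs with T, C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA string."""
    return "".join(PAIRING[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read in the opposite direction."""
    return complement(seq)[::-1]


fragment = "CTGCTGTACTAG"  # first 12 bases of the sequence shown above
print(complement(fragment))          # GACGACATGATC
print(reverse_complement(fragment))  # CTAGTACAGCAG
```

The reverse complement is included because the two strands of the double helix run in opposite directions, so the partner strand is conventionally read backwards.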


Sizes of Genomes


Protein Structure and Function

Views of a protein: Wireframe, Ball and stick


Protein Structure and Function

Views of a protein: Spacefill, Cartoon

CPK colors:

Carbon = green, black, or grey

Nitrogen = blue

Oxygen = red

Sulfur = yellow

Hydrogen = white


VIDEO:

- DNA is Packaged

- Central Dogma (of Biology)

- Transcription

- Translation


Clustering

Clustering is the process of grouping data objects into a set of disjoint classes, called clusters, so that objects within a class have high similarity to each other, while objects in separate classes are more dissimilar.

Clustering is an example of unsupervised classification.

“Classification” refers to a procedure that assigns data objects to a set of classes.

“Unsupervised” means that clustering does not rely on predefined classes or training examples while classifying the data objects. Thus, clustering is distinguished from pattern recognition and from the areas of statistics known as discriminant analysis and decision analysis, which seek to find rules for classifying objects from a given set of preclassified objects.
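The idea of "high similarity within a class, more dissimilar across classes" can be checked numerically. The sketch below is an illustration only: it assumes Euclidean distance as the dissimilarity measure, and the function names and toy points are hypothetical.

```python
from itertools import combinations
from statistics import mean


def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def within_and_between(points, labels):
    """Return (mean within-cluster distance, mean between-cluster distance)."""
    within, between = [], []
    for i, j in combinations(range(len(points)), 2):
        d = euclidean(points[i], points[j])
        (within if labels[i] == labels[j] else between).append(d)
    return mean(within), mean(between)


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good = [0, 0, 1, 1]  # groups nearby points together
bad = [0, 1, 0, 1]   # mixes distant points in the same cluster

print(within_and_between(points, good))
print(within_and_between(points, bad))
```

For the `good` labelling the within-cluster distance is smaller than the between-cluster distance; for the `bad` labelling the relation is reversed. No class labels were given in advance, which is what makes the setting unsupervised.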


What is DNA Microarray?

• Scientists used to be able to perform genetic analyses of only a few genes at once. DNA microarray allows us to analyze thousands of genes in one experiment!


Purposes.

• So why do we use DNA microarray?

• To measure changes in gene expression levels – gene expression can be compared between two samples, such as cells at different stages of mitosis.

• To observe genomic gains and losses: Microarray Comparative Genomic Hybridization (CGH).

• To observe mutations in DNA.


The Plate.

• Usually made commercially.

• Made of glass, silicon, or nylon.

• Each plate contains thousands of spots, and each spot contains a probe for a different gene.

• A probe can be a cDNA fragment, a synthetic oligonucleotide, or a larger clone such as a BAC (bacterial artificial chromosome).

• Probes can be attached either by robotic means, where a needle applies the cDNA to the plate, or by a method similar to making silicon chips for computers. The latter is called a Gene Chip.


Let’s perform a microarray!

1) Collect Samples.

2) Isolate mRNA.

3) Create Labelled DNA.

4) Hybridization.

5) Microarray Scanner.

6) Analyze Data.


STEP 1: Collect Samples.

This can be from a variety of organisms. We’ll use two samples – cancerous human skin tissue & healthy human skin tissue


STEP 2: Isolate mRNA.

• Extract the RNA from the samples, using either a column or a solvent such as phenol-chloroform.

• After isolating the RNA, we need to separate the mRNA from the rRNA and tRNA. mRNA has a poly-A tail, so we can use a column containing beads with poly-T tails to bind the mRNA.

• Rinse with buffer to release the mRNA from the beads. The buffer changes the pH, which disrupts the bonds holding the poly-A/poly-T hybrid together.


STEP 3: Create Labelled DNA.

Add a labelling mix to the RNA. The labelling mix contains poly-T (oligo dT) primers, reverse transcriptase (to make cDNA), and fluorescently dyed nucleotides.

We will add cyanine 3 (fluoresces green) to the healthy cells and cyanine 5 (fluoresces red) to the cancerous cells.

The primer and the reverse transcriptase (RT) bind to the mRNA first; the RT then incorporates the fluorescently dyed nucleotides, creating a complementary strand of DNA (cDNA).


STEP 4: Hybridization.

• Apply the cDNA we have just created to a microarray plate.

• When comparing two samples, apply both samples to the same plate.

• The labelled single-stranded cDNA will bind to the complementary probe DNA already present on the plate.


STEP 5: Microarray Scanner.

The scanner has a laser, a computer, and a camera.

The laser excites the fluorescent dyes on the hybridized cDNA, causing the spots to fluoresce.

The camera records the images produced when the laser scans the plate.

The computer allows us to immediately view our results and it also stores our data.


STEP 6: Analyze the Data.

GREEN – the healthy sample hybridized more than the diseased sample.

RED – the diseased/cancerous sample hybridized more than the nondiseased sample.

YELLOW - both samples hybridized equally to the target DNA.

BLACK - areas where neither sample hybridized to the target DNA.

By comparing the differences in gene expression between the two samples, we can understand more about the genomics of a disease.
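The colour reading above can be sketched as a rule on the two channel intensities of a spot. This is a rough illustration, not a scanner's actual procedure: the function name `spot_colour`, the background level of 100, the two-fold threshold, and the pseudocount of 1 are all assumptions chosen for the example.

```python
import math


def spot_colour(cy3: float, cy5: float,
                background: float = 100.0, fold: float = 2.0) -> str:
    """Classify one spot from its Cy3 (healthy, green) and Cy5 (cancerous, red) signals.

    `background` and `fold` are hypothetical cut-offs, not values from the slides.
    """
    # Neither sample hybridized noticeably: the spot stays dark.
    if cy3 <= background and cy5 <= background:
        return "BLACK"
    # log2 ratio of red over green; the +1 pseudocount avoids division by zero.
    log_ratio = math.log2((cy5 + 1.0) / (cy3 + 1.0))
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "RED"      # cancerous sample hybridized more
    if log_ratio <= -threshold:
        return "GREEN"    # healthy sample hybridized more
    return "YELLOW"       # both samples hybridized about equally


for cy3, cy5 in [(50, 60), (4000, 500), (500, 4000), (2000, 2100)]:
    print(cy3, cy5, spot_colour(cy3, cy5))
```

Working with the log ratio makes a two-fold increase and a two-fold decrease symmetric around zero, which is why it is a common choice for two-colour data.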


Benefits.

• Relatively affordable (for some people!), about $60,000 for an arrayer and scanner setup.

• The plates are convenient to work with because they are small.

• Fast - Thousands of genes can be analyzed at once.


Problems.

• Oligonucleotide libraries – redundancy and contamination.

• DNA Microarray only detects whether a gene is turned on or off.

• Massive amounts of data.


EXAMPLES (only two experiments)


A GENE EXPRESSION MATRIX

The original gene expression matrix obtained from a scanning process contains noise, missing values, and systematic variations arising from the experimental procedure. Data preprocessing is indispensable before any cluster analysis can be performed.
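One possible preprocessing pipeline is sketched below. The slides do not say which steps are used, so the three steps here (log2 transform, filling missing values with the gene's mean, standardizing each gene) are assumptions, as is the function name `preprocess`. Rows are genes, columns are experiments, `None` marks a missing value, and every gene is assumed to have at least one positive observed value.

```python
import math
from statistics import mean, pstdev


def preprocess(matrix):
    """Log-transform, impute missing values, and standardize each gene (row)."""
    cleaned = []
    for row in matrix:
        # 1) log2-transform observed values; keep None for missing entries.
        logged = [math.log2(v) if v is not None else None for v in row]
        observed = [v for v in logged if v is not None]
        # 2) replace missing entries by the mean of the gene's observed values.
        row_mean = mean(observed)
        filled = [v if v is not None else row_mean for v in logged]
        # 3) standardize to mean 0 and standard deviation 1 (flat genes become 0).
        mu, sigma = mean(filled), pstdev(filled)
        cleaned.append([(v - mu) / sigma if sigma > 0 else 0.0 for v in filled])
    return cleaned


raw = [
    [1.0, 2.0, 4.0],
    [8.0, None, 8.0],
    [2.0, 8.0, None],
]
for gene in preprocess(raw):
    print([round(v, 3) for v in gene])
```

Standardizing each gene puts all expression profiles on a common scale, so a later distance computation compares the shape of the profiles and not their absolute intensity.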


Hierarchical clustering is a technique that organizes elements into a tree, rather than forming an explicit partitioning of the elements into clusters. In this case, the genes are represented as the leaves of a tree. The edges of the tree are assigned lengths, and the distances between leaves (that is, the length of the path in the tree that connects two leaves) correlate with entries in the distance matrix. Such trees are used both in the analysis of expression data and in studies of molecular evolution.

Example.


The HIERARCHICALCLUSTERING algorithm below takes an n×n distance matrix d as an input, and progressively generates n different partitions of the data as the tree it outputs. The largest partition has n single-element clusters, with every element forming its own cluster. The second-largest partition combines the two closest clusters from the largest partition, and thus has n − 1 clusters. In general, the ith partition combines the two closest clusters from the (i − 1)th partition and has n − i + 1 clusters.
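The merging scheme described above can be sketched as follows. The transcript does not include the pseudocode itself, so this is a minimal reconstruction under stated assumptions: the distance between two clusters is taken to be the smallest element-to-element distance (single linkage, via the hypothetical `linkage` parameter), and the output is the list of n partitions, from which the tree can be read off.

```python
from itertools import combinations


def hierarchical_clustering(d, linkage=min):
    """Agglomerative clustering on an n x n distance matrix d.

    Returns the n partitions, from n singletons down to a single cluster.
    `linkage` reduces the element-to-element distances between two clusters
    to one number; min (single linkage) is an assumption, max also works.
    """
    n = len(d)
    clusters = [frozenset([i]) for i in range(n)]
    partitions = [list(clusters)]

    def cluster_distance(a, b):
        return linkage(d[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the two closest clusters in the current partition.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        # Merge them to form the next partition.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        partitions.append(list(clusters))
    return partitions


d = [
    [0, 1, 6, 7],
    [1, 0, 5, 8],
    [6, 5, 0, 2],
    [7, 8, 2, 0],
]
for i, partition in enumerate(hierarchical_clustering(d), start=1):
    print(i, [sorted(c) for c in partition])
```

With this toy matrix, elements 0 and 1 merge first, then 2 and 3, then the two pairs, so the ith partition printed has n − i + 1 clusters, as stated above.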
