Bioinformatics MEDC601 Lecture by Brad Windle Ph# 628-1956 email: bwindle@vcu.edu Office: Massey...

Post on 13-Dec-2015

Bioinformatics

MEDC601

Lecture by Brad Windle
Ph# 628-1956
email: bwindle@vcu.edu
Office: Massey Cancer Center, Goodwin Labs, Room 319

Web site for lecture:

http://www.people.vcu.edu/~bwindle/Courses/MEDC601

Profile

A set of data or characteristics pertaining to an item

Profiles are sometimes referred to as Signatures or Fingerprints

Cellular Profiles

Gene Expression
Protein Expression
Misc Data
SNPs
DNA Methylation
Cell State
Drug Response
Metabolomics
Structural Genomic
Protein States
Disease
Gene/Protein Sequence
Protein Structure
Drug Structure

Cellular Profiles

Gene Expression
Protein Expression
Misc Data
SNPs
DNA Methylation
Drug Response
Metabolomics
Structural Genomic
Protein States
Gene/Protein Sequence
Protein Structure
Drug Structure

Profiles Have Two Sides

Genes / Samples   Sample 1   Sample 2   Sample 3   Sample 4
Gene 1                   1          2          3         32
Gene 2                   5          3         17         22
Gene 3                  23         65         21         23
Gene 4                   2          1          3          3

A gene profile across samples and a sample profile across genes
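The two sides of a profile can be sketched in a few lines of Python. This is only an illustration using the values from the table on this slide; the function names `gene_profile` and `sample_profile` are made up for the example.

```python
# Expression table from the slide: rows are genes, columns are samples.
genes = ["Gene 1", "Gene 2", "Gene 3", "Gene 4"]
samples = ["Sample 1", "Sample 2", "Sample 3", "Sample 4"]
table = [
    [1, 2, 3, 32],
    [5, 3, 17, 22],
    [23, 65, 21, 23],
    [2, 1, 3, 3],
]


def gene_profile(gene):
    """Return one gene's values across all samples (a row of the table)."""
    return table[genes.index(gene)]


def sample_profile(sample):
    """Return one sample's values across all genes (a column of the table)."""
    col = samples.index(sample)
    return [row[col] for row in table]


print(gene_profile("Gene 2"))      # [5, 3, 17, 22]
print(sample_profile("Sample 3"))  # [3, 17, 21, 3]
```

The same table gives both views: reading along a row gives a gene profile, reading down a column gives a sample profile.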

Bioinformatics uses tools for learning from the Profiles

There are two basic forms of learning

Unsupervised Learning

Supervised Learning

Unsupervised Learning

Definition: Learning from observation
Exploration
Let the data reveal what you learn
You can learn what you did not expect
Allows you to formulate relevant hypotheses
It’s a hypothesis generator

Supervised Learning

Definition: Learning from example
It’s focused
Allows you to test relevant hypotheses, but it doesn’t usually allow you to prove the hypothesis
It often involves statistical or computational modeling
Usually requires experimental validation of what you learn

Supervised Learning

Examples of methods

QSAR Modeling, Prediction model

Classification Model, e.g., Is a patient a good candidate for a particular drug treatment?

Simulation Modeling, e.g., Cell simulation
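As a rough illustration of a classification model learned from examples, here is a minimal nearest-centroid sketch. The method is chosen only for brevity and is not named in the lecture; the labels, profile values, and function names are all hypothetical.

```python
import math


def centroid(profiles):
    """Average a list of equal-length profiles position by position."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def train(examples):
    """examples maps a label to a list of profiles; returns label -> centroid."""
    return {label: centroid(profiles) for label, profiles in examples.items()}


def classify(model, profile):
    """Assign the label whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], profile))


# Hypothetical training examples: expression of three genes per patient.
training = {
    "responder": [[8.0, 1.0, 3.0], [9.0, 2.0, 2.5]],
    "non-responder": [[2.0, 7.0, 3.0], [1.0, 8.0, 3.5]],
}
model = train(training)
print(classify(model, [7.5, 2.0, 3.0]))  # responder
```

A prediction like this is a hypothesis about the new patient, so it would still need experimental or clinical validation.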

Why can’t we observe the patterns unaided?

The patterns are too complex or abstract.

There’s too much data.

There’s too much noise.

Drug-related profiles

[Figure: drug1 (R groups) against e1, e2, e3, e4, e5, e6]

Drug profile based on structure

[Figure: e1 against drug 1, drug 2, drug 3, drug 4]

Structure profile based on drugs

How physical/chemical properties relate to biological or biochemical properties, such as cell killing or enzyme activity, is within the realm of QSAR
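A very small QSAR-style sketch, assuming a single physicochemical descriptor and a straight-line relationship to activity. The descriptor (logP), the activity values, and the function name `fit_line` are hypothetical; real QSAR models typically use many descriptors and more careful validation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one descriptor (logP) and a measured activity per compound.
logp = [1.0, 2.0, 3.0, 4.0]
activity = [4.1, 5.0, 5.9, 7.0]

slope, intercept = fit_line(logp, activity)

# Predict the activity of an untested compound from its descriptor value.
predicted = slope * 2.5 + intercept
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```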

[Figure: cell1 against drug1, drug2, drug3, drug4, drug5, drug6]

Cell profile based on various drug sensitivities

[Figure: drug 1 against cell1, cell2, cell3, cell4, cell5, cell6]

Drug profile based on cell sensitivities

COMPARE

NCI 60 cell lines profiled for sensitivity to drugs

How do drugs relate to each other based on cellular response?

How do cells relate to each other based on drug response?

A major tool in unsupervised learning is

Cluster Analysis

It evaluates how similar items are to each other
Do they have similar patterns within their profiles?
How relatively close the items are to each other
There are various ways to measure closeness
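The points above can be sketched with a tiny agglomerative clustering routine that accepts any closeness measure. Average linkage, the two distance functions, and the drug profiles below are assumptions made for the example, not the specific method used in the studies cited in this lecture.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_distance(a, b):
    """Close to 0 when two profiles have the same pattern, up to 2 when opposite."""
    return 1.0 - pearson(a, b)


def cluster(profiles, k, distance):
    """Average-linkage agglomerative clustering, merging until k clusters remain."""
    clusters = [[name] for name in profiles]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(distance(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > k:
        # Find the closest pair of clusters (i < j) and merge j into i.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters


# Hypothetical drug profiles across four cell lines.
profiles = {
    "drugA": [1, 2, 3, 4],
    "drugB": [2, 4, 6, 8],
    "drugC": [4, 3, 2, 1],
    "drugD": [8, 6, 4, 3],
}
print(cluster(profiles, 2, correlation_distance))
# [['drugA', 'drugB'], ['drugC', 'drugD']]
print(cluster(profiles, 2, math.dist))  # Euclidean distance groups them differently
```

Correlation distance groups profiles by shape, while Euclidean distance groups them by magnitude, so the choice of closeness measure changes which items end up together.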

Cell response profile

Monks et al. Anti-Cancer Drug Design 12:553 (1997)

Cell line clusters based on drug response

Cell clusters correspond to cell type to a limited extent

Scherf et al., Nature Genetics 24:236 (2000)

Drug clusters correspond to drug targets or mechanisms of action, not necessarily drug structure.

Scherf et al., Nature Genetics 24:236 (2000)

COMPARE is a resource for exploring targets and mechanisms

Your compound of interest can be profiled and compared to the profiles for >70,000 compounds

Compounds with good matches may have known characteristics, such as target and mechanism, thus revealing a possible target and mechanism for your compound
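The matching idea can be sketched as ranking database compounds by how well their cell-line profiles correlate with the profile of a seed compound. Pearson correlation is assumed here as the match score, which these slides do not specify; the compound names, profile values, and `rank_matches` function are hypothetical and are not the COMPARE interface.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rank_matches(seed, database, top=3):
    """Return the best-matching (name, correlation) pairs, highest first."""
    scored = [(name, pearson(seed, profile)) for name, profile in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top]


# Hypothetical sensitivity profiles across five cell lines.
seed_profile = [1, 2, 3, 4, 5]
database = {
    "compound_A": [2, 4, 6, 8, 10],
    "compound_B": [5, 4, 3, 2, 1],
    "compound_C": [1, 3, 2, 5, 4],
}

for name, score in rank_matches(seed_profile, database):
    print(name, round(score, 2))
```

If the top-ranked compounds have a known target or mechanism, that suggests a hypothesis for the seed compound, which still has to be tested.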

http://dtp.nci.nih.gov/docs/compare/compare.html

Wallqvist Study

Wallqvist et al, Molecular Cancer Therapeutics 1:311-320 (2002)

Found genes that correlated with drug sensitivity

Hypothesized that some of those genes’ proteins are the targets of the drugs that correlated with gene expression

Drugs (1, 2, 3): Start with drug response, identify drugs based on correlation with genes

Genes (1, 2, 3): Identify genes based on correlation with drugs

Proteins (1, 2, 3): Identify corresponding protein for each gene from structural protein database (PDB)

Ligands (1, 2, 3): Identify small compounds (ligands) that have been fitted to proteins using 3D modeling

Identify ligands with structural correlation with drugs

[Figure: five linked columns of numbered items, labeled as follows]

Compounds screened in NCI 60 cell lines (Compound 1 to Compound 20)

Genes that correlate with compounds in NCI 60 cell lines (Gene 1 to Gene 27)

Corresponding proteins in 3D structural database (Protein 1 to Protein 24)

Compounds that bind in silico to proteins (Ligand 1 to Ligand 20)

Compounds with structural similarity to ligands (Compound 1 to Compound 20)

The similarity between the ligand specific for the protein and the drug that correlates with expression of the protein’s gene suggests that the drug targets (interacts with) the protein

calcium/calmodulin-dependent protein kinase I

protein kinase C

Wallqvist et al, Molecular Cancer Therapeutics 1:311-320 (2002)

alcohol dehydrogenase 5

Wallqvist et al, Molecular Cancer Therapeutics 1:311-320 (2002)

4 sets of data: DR, GEP, PDB, Ligand

Gene-Drug correlation
Gene to Protein translation
Protein-Ligand prediction
Ligand-Drug correlation
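The four links listed above can be chained as simple lookups. The sketch below only illustrates the flow of the analysis; every table, name, similarity score, and the 0.7 threshold is hypothetical and not taken from the Wallqvist study.

```python
# Hypothetical lookup tables, one per link in the chain.
gene_drug_correlation = {          # drug -> genes whose expression correlates
    "drug1": ["geneA", "geneB"],
    "drug2": ["geneC"],
}
gene_to_protein = {                # gene -> protein with a known 3D structure
    "geneA": "proteinA",
    "geneC": "proteinC",           # geneB has no structure in this toy example
}
protein_ligands = {                # protein -> ligands fitted to it
    "proteinA": ["ligand1"],
    "proteinC": ["ligand2"],
}
ligand_drug_similarity = {         # (ligand, drug) -> structural similarity score
    ("ligand1", "drug1"): 0.9,
    ("ligand2", "drug2"): 0.2,
}


def candidate_targets(threshold=0.7):
    """Follow drug -> gene -> protein -> ligand and keep chains where the
    ligand is structurally similar to the drug that started the chain."""
    hits = []
    for drug, genes in gene_drug_correlation.items():
        for gene in genes:
            protein = gene_to_protein.get(gene)
            if protein is None:
                continue  # no structure available, so the chain stops here
            for ligand in protein_ligands.get(protein, []):
                similarity = ligand_drug_similarity.get((ligand, drug), 0.0)
                if similarity >= threshold:
                    hits.append((drug, gene, protein, ligand))
    return hits


print(candidate_targets())
# [('drug1', 'geneA', 'proteinA', 'ligand1')]
```

Each surviving chain is a hypothesis that the drug interacts with that protein, to be checked experimentally.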