Post on 13-Dec-2015
Bioinformatics
MEDC601
Lecture by Brad Windle
Ph# 628-1956
email: bwindle@vcu.edu
Office: Massey Cancer Center, Goodwin Labs
Room 319
Web site for lecture:
http://www.people.vcu.edu/~bwindle/Courses/MEDC601
Profile
A set of data or characteristics pertaining to an item
Profiles are sometimes referred to as Signatures or Fingerprints
Cellular Profiles
Gene Expression
Protein Expression
Misc Data
SNPs
DNA Methylation
Cell State
Drug Response
Metabolomics
Structural Genomic
Protein States
Disease
Gene/Protein Sequence
Protein Structure
Drug Structure
Profiles Have Two Sides
Genes / Samples   Sample 1   Sample 2   Sample 3   Sample 4
Gene 1            1          2          3          32
Gene 2            5          3          17         22
Gene 3            23         65         21         23
Gene 4            2          1          3          3
Each row is a gene profile across samples, and each column is a sample profile across genes
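The two sides of a profile can be sketched in a few lines of Python. This is only an illustration using the numbers from the table above; the function names gene_profile and sample_profile are made up for this example.

```python
# Expression table from the slide: rows are genes, columns are samples.
samples = ["Sample 1", "Sample 2", "Sample 3", "Sample 4"]
table = {
    "Gene 1": [1, 2, 3, 32],
    "Gene 2": [5, 3, 17, 22],
    "Gene 3": [23, 65, 21, 23],
    "Gene 4": [2, 1, 3, 3],
}


def gene_profile(gene):
    """One row of the table: a gene's values across all samples."""
    return dict(zip(samples, table[gene]))


def sample_profile(sample):
    """One column of the table: a sample's values across all genes."""
    col = samples.index(sample)
    return {gene: values[col] for gene, values in table.items()}


print(gene_profile("Gene 3"))
print(sample_profile("Sample 4"))
```

The same table answers two different questions depending on whether it is read by row or by column.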
Bioinformatics uses tools for learning from the Profiles
There are two basic forms of learning
Unsupervised Learning
Supervised Learning
Unsupervised Learning
Definition: Learning from observation
Exploration
Let the data reveal what you learn
You can learn what you did not expect
Allows you to formulate relevant hypotheses
It’s a hypothesis generator
Supervised Learning
Definition: Learning from example
It’s focused
Allows you to test relevant hypotheses, but it doesn’t usually allow you to prove the hypothesis
It often involves statistical or computational modeling
Usually requires experimental validation of what you learn
Supervised Learning
Examples of methods
QSAR Modeling, Prediction model
Classification Model, e.g., Is a patient a good candidate for a particular drug treatment?
Simulation Modeling, e.g., Cell simulation
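As a rough illustration of a classification model that learns from example, here is a minimal nearest-centroid sketch. The lecture does not name a specific classifier, so the method, the labels "responder" and "non-responder", and all numbers are assumptions chosen only for illustration.

```python
import math


def centroid(profiles):
    """Average profile: the mean of each position across the given profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def train(examples):
    """Learn from labeled examples: one centroid per label."""
    by_label = {}
    for profile, label in examples:
        by_label.setdefault(label, []).append(profile)
    return {label: centroid(group) for label, group in by_label.items()}


def classify(model, profile):
    """Assign the label whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], profile))


# Hypothetical training profiles (e.g. three expression values per patient).
training = [
    ([9.0, 1.0, 8.0], "responder"),
    ([8.0, 2.0, 9.0], "responder"),
    ([1.0, 9.0, 2.0], "non-responder"),
    ([2.0, 8.0, 1.0], "non-responder"),
]

model = train(training)
print(classify(model, [7.5, 2.5, 8.0]))
```

A prediction from a model like this is a hypothesis about a new patient, which is why experimental validation is still needed.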
Why can’t we observe the patterns unaided?
The patterns are too complex or abstract.
There’s too much data.
There’s too much noise.
Drug-related profiles
[Figure: drug1 with R positions labeled e1, e2, e3, e4, e5, e6]
Drug profile based on structure
[Figure: e1 across drug 1, drug 2, drug 3, drug 4]
Structure profile based on drugs
How physical/chemical properties relate to biological or biochemical properties, such as cell killing or enzyme activity, is within the realm of QSAR
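One very simple form of such a relationship is a straight-line fit of an activity against a single property. The sketch below uses ordinary least squares; the descriptor values and activities are invented toy numbers, and real QSAR models typically use many descriptors and more careful validation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one phys/chem descriptor per compound and a measured
# biological activity (for example, a cell-killing potency score).
descriptor = [1.0, 2.0, 3.0, 4.0]
activity = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(descriptor, activity)


def predict(value):
    """Predicted activity for an untested compound with this descriptor value."""
    return slope * value + intercept


print(f"activity = {slope:.2f} * descriptor + {intercept:.2f}")
print(f"predicted activity at descriptor 5.0: {predict(5.0):.2f}")
```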
[Figure: cell1 across drug1, drug2, drug3, drug4, drug5, drug6]
Cell profile based on various drug sensitivities
[Figure: drug 1 across cell1, cell2, cell3, cell4, cell5, cell6]
Drug profile based on cell sensitivities
COMPARE
NCI 60 cell lines profiled for sensitivity to drugs
How do drugs relate to each other based on cellular response?
How do cells relate to each other based on drug response?
A major tool in unsupervised learning is
Cluster Analysis
It evaluates how similar items are to each other
Do they have similar patterns within their profiles?
How relatively close the items are to each other
There are various ways to measure closeness
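To make "various ways to measure closeness" concrete, the sketch below compares two common measures, Euclidean distance and Pearson correlation, on three made-up profiles. The lecture does not say which measures it has in mind, so both are assumptions. The point is that the choice of measure changes which items count as closest.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation: compares the pattern (shape) of two profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical profiles.
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [10.0, 20.0, 30.0, 40.0],  # same pattern as A, much larger scale
    "C": [4.0, 3.0, 2.0, 1.5],      # opposite trend to A, similar scale
}

pairs = list(combinations(profiles, 2))
for x, y in pairs:
    d = math.dist(profiles[x], profiles[y])
    r = pearson(profiles[x], profiles[y])
    print(f"{x}-{y}: euclidean distance = {d:.2f}, pearson r = {r:.2f}")

# Closest pair under each measure (small distance vs. high correlation).
closest_by_distance = min(pairs, key=lambda p: math.dist(profiles[p[0]], profiles[p[1]]))
closest_by_correlation = max(pairs, key=lambda p: pearson(profiles[p[0]], profiles[p[1]]))
print(closest_by_distance, closest_by_correlation)
```

Here A and C are closest by distance, while A and B are closest by correlation because they share the same pattern. A clustering method then groups items using whichever measure was chosen.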
Cell response profile
Monks et al. Anti-Cancer Drug Design 12:553 (1997)
Cell line clusters based on drug response
Cell clusters correspond to cell type to a limited extent
Scherf et al., Nature Genetics 24:236 (2000)
Drug clusters correspond to drug targets or mechanisms of action, not necessarily drug structure.
Scherf et al., Nature Genetics 24:236 (2000)
COMPARE is a resource for exploring targets and mechanisms
Your compound of interest can be profiled and compared to the profiles for >70,000 compounds
Compounds with good matches may have known characteristics, such as target and mechanism, thus revealing a possible target and mechanism for your compound
http://dtp.nci.nih.gov/docs/compare/compare.html
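The idea of comparing one compound's profile against a database of profiles can be sketched as a ranking by correlation. This is not the COMPARE implementation; it is a minimal illustration that assumes Pearson correlation as the matching score, with invented compound names and values.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long response profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rank_matches(seed_profile, database):
    """Return (name, correlation) pairs, best match first."""
    scores = [(name, pearson(seed_profile, profile)) for name, profile in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical sensitivity values across six cell lines (the real screen uses 60).
database = {
    "compound_A": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "compound_B": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    "compound_C": [1.0, 3.0, 2.0, 5.0, 4.0, 6.0],
}
your_compound = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

ranking = rank_matches(your_compound, database)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

If the top-ranked compounds have a known target or mechanism, that becomes a hypothesis for the query compound.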
Wallqvist Study
Wallqvist et al., Molecular Cancer Therapeutics 1:311-320 (2002)
Found genes that correlated with drug sensitivity
Hypothesized that some of those genes’ proteins are the targets of the drugs that correlated with gene expression
Drugs (1, 2, 3): Start with drug response, identify drugs based on correlation with genes
Genes (1, 2, 3): Identify genes based on correlation with drugs
Proteins (1, 2, 3): Identify the corresponding protein for each gene from the structural protein database (PDB)
Ligands (1, 2, 3): Identify small compounds (ligands) that have been fitted to proteins using 3D modeling
Identify ligands with structural correlation with drugs
[Figure: five columns of numbered items, with a subset of each column set apart: Compounds 1 to 20, Genes 1 to 27, Proteins 1 to 24, Ligands 1 to 20, and Compounds 1 to 20 again]
Items set apart: Compounds 1, 2, 5, 6, 9, 11, 15, 18, 20; Genes 4, 5, 7, 8, 14, 16, 23, 24; Proteins 2, 7, 9, 17, 23; Ligands 4, 8, 9, 16; Compounds 6 and 11 in the final column
Compounds screened in NCI 60 cell lines
Genes that correlate with compounds in NCI 60 cell lines
Corresponding proteins in 3D structural database
Compounds that bind in silico to proteins
Compounds with structural similarity to ligands
The similarity between the ligand specific for the protein and the drug that correlates with expression of the protein (gene) suggests that the drug is targeting (interacting with) the protein.
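Structural similarity between a ligand and a drug can be scored in many ways. The sketch below uses the Tanimoto coefficient on sets of structural features, a common choice in cheminformatics; it is swapped in here for illustration and is not necessarily the measure used in the Wallqvist study. The feature names and molecules are hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by all distinct features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(a | b)


# Hypothetical structural feature sets.
ligand = {"aromatic_ring", "amide", "hydroxyl", "chlorine"}
drug_x = {"aromatic_ring", "amide", "hydroxyl", "methyl"}
drug_y = {"nitro", "methyl"}

print(f"ligand vs drug_x: {tanimoto(ligand, drug_x):.2f}")
print(f"ligand vs drug_y: {tanimoto(ligand, drug_y):.2f}")
```

A score near 1 means the two structures share most features; a score of 0 means they share none.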
calcium/calmodulin-dependent protein kinase I
protein kinase C
Wallqvist et al., Molecular Cancer Therapeutics 1:311-320 (2002)
alcohol dehydrogenase 5
Wallqvist et al., Molecular Cancer Therapeutics 1:311-320 (2002)
4 sets of data
DR
GEP
PDB
Ligand
Gene-Drug correlation
Gene to Protein translation
Protein-Ligand prediction
Ligand-Drug correlation
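The four links can be chained to go from a drug to candidate protein targets. The sketch below assumes each link has already been computed and stored as a simple lookup table; all identifiers and table contents are hypothetical placeholders, not data from the study.

```python
# Hypothetical precomputed results for each of the four links.
drug_to_genes = {                      # gene-drug correlation
    "drug_1": ["gene_4", "gene_7"],
    "drug_2": ["gene_14"],
}
gene_to_protein = {                    # gene to protein translation
    "gene_4": "protein_2",
    "gene_7": "protein_7",             # gene_14 has no structure in this toy table
}
protein_to_ligands = {                 # protein-ligand prediction
    "protein_2": ["ligand_4"],
    "protein_7": ["ligand_8", "ligand_9"],
}
similar_ligand_drug = {                # ligand-drug structural correlation
    ("ligand_4", "drug_1"),
    ("ligand_9", "drug_2"),
}


def candidate_targets(drug):
    """Follow all four links; keep (gene, protein, ligand) chains that close on the drug."""
    hits = []
    for gene in drug_to_genes.get(drug, []):
        protein = gene_to_protein.get(gene)
        if protein is None:
            continue  # no protein structure available for this gene
        for ligand in protein_to_ligands.get(protein, []):
            if (ligand, drug) in similar_ligand_drug:
                hits.append((gene, protein, ligand))
    return hits


print(candidate_targets("drug_1"))
print(candidate_targets("drug_2"))
```

A chain that survives every link is a hypothesis that the drug targets that protein, and it still needs experimental validation.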