cDNA Microarray analysis of an invasive brain tumor

description

cDNA Microarray analysis of an invasive brain tumor, OR More answers than you can handle. Dominique B Hoelzinger. Overview: Introduction, Generating data, Analyzing data, Interpreting data. The biological problem: Glioblastoma multiforme, the deadliest brain cancer.

Transcript of cDNA Microarray analysis of an invasive brain tumor

  • cDNA Microarray analysis of an invasive brain tumor, OR More answers than you can handle. Dominique B Hoelzinger

  • Overview: Introduction, Generating data, Analyzing data, Interpreting data

  • The biological problem: Glioblastoma multiforme, the deadliest brain cancer. Current treatments: Surgery, Chemotherapy, Radiotherapy, Stem cells, Gene therapy

  • SPREAD OF GLIOBLASTOMA MULTIFORME: 1) Corpus callosum 2) Fornix 3) Optic radiation 4) Association pathways 5) Anterior commissure

  • Glioma motility: What makes these cells move? What switches them from dividing to motile?

  • The ones that got away: Highly invasive. Surgeon can't reach them. Chemotherapy and radiotherapy can't reach them. They are not dividing.

    [Figure labels: core, rim]

  • Laser Capture Microdissection

  • 1) Prepare: Follow routine protocols for preparing a tissue on a plain, uncovered microscope slide.

    2) Locate: Visualize the sample through the video monitor or the microscope. Position the CapSure film carrier over the cell(s) of interest.

    3) Capture: Press the button to pulse the low power infrared laser. The desired cell(s) adhere to the CapSure film carrier.

    4) Microdissect: Lift the CapSure film carrier, with the desired cell(s) attached to the film surface. The surrounding tissue remains intact.

    5) Analyze: Place the CapSure film carrier directly onto a standard microcentrifuge tube (Eppendorf) containing the extraction buffer. The cell contents, DNA, RNA or protein, are ready for subsequent molecular analysis.

  • Microdissection of single cells: Identify invading glioma cells on cryostat sections. Using 20x magnification, laser-capture tumor cells. Retrieve captured cells on LCM Cap. Verify cell capture by inspection of Cap.

    10mm

  • About RNA

  • Overview: Introduction, Generating data, Analyzing data, Interpreting data

  • Robotic Array Assembly

  • cDNA microarray technology: http://research.nhgri.nih.gov/microarray/image_analysis.html

  • Really raw data

  • Overview: Introduction, Generating data, Analyzing data, Interpreting data

  • GeneSpring: Normalizes the calculated data. Selects genes more than two-fold over or under the ratio of 1 (a ratio of 1 means equally expressed in both populations).

    Cluster analysis

    Principal Components Analysis
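The two-fold selection on this slide can be sketched in a few lines of Python. This is only an illustration of the rule as stated (ratio above 2 or below 1/2), not GeneSpring's own implementation; the function name `select_two_fold` and the example gene names and ratios are made up.

```python
def select_two_fold(ratios, fold=2.0):
    """Split genes into over- and under-expressed lists by expression ratio.

    ratios: dict mapping a gene name to its normalized ratio, where a ratio
    of 1 means the gene is equally expressed in both populations.
    """
    # More than `fold` over the ratio of 1.
    over = {gene: r for gene, r in ratios.items() if r > fold}
    # More than `fold` under the ratio of 1 (ratios must be positive).
    under = {gene: r for gene, r in ratios.items() if 0 < r < 1.0 / fold}
    return over, under


# Hypothetical ratios, for illustration only.
example_ratios = {"geneA": 4.0, "geneB": 1.1, "geneC": 0.25, "geneD": 0.6}
over, under = select_two_fold(example_ratios)
```

Genes with ratios between 1/2 and 2 (geneB and geneD here) are dropped from both lists.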

  • Genes down-regulated in migrating cells

    C/R  Name      Description

    Extracellular
    33   IGFBP5    insulin-like growth factor binding protein 5
    12   IGFBP2    insulin-like growth factor binding protein 2
    11   DEPP      decidual protein induced by progesterone
    11   ABCC3     ATP-binding cassette, C (CFTR/MRP) 3
    10   TNC       tenascin C (hexabrachion)
    7    SRPX      sushi-repeat-containing protein, X chrom
    5    SFRP4     secreted frizzled-related protein 4
    4    SERPINB2  serine (or cysteine) proteinase inhibitor, 2 (P
    4    SERPINH2  serine (or cysteine) proteinase inhibit
    3    MUC1      mucin 1
    3    EGFR-RS   Likely ortholog of mouse EGF

    Vascular Involvement/Angiogenesis
    43   FCGR3A    Fc fragment of IgG, low affinity IIIa,
    42   PTGER4    prostaglandin E receptor 4 (subtype
    17   HLA-DRA   major histocompatibility complex, class II,
    6    CD163     CD163 antigen
    5    VEGF      vascular endothelial growth factor
    5    VCAM1     vascular cell adhesion molecule 1
    4    LMO2      LIM domain only 2 (rhombotin-like 1)
    4    CD68      CD68 antigen

    Signal Transduction
    6    IQGAP     IQ motif containing GTPase activating
    8    RDC1      G protein-coupled receptor
    4    RGS16     Regulator of G-protein signaling 16
    3    NFKBIA    NFKB inhibitor, alpha
    3    PLD2      phospholipase D 2
    3    TK2       thymidine kinase 2, mitochondrial
    3    ABL1      abelson murine leukemia viral oncogene homolog 1

    Cytoskeleton
    12   VIM       vimentin
    7    PLEK      pleckstrin
    5    MSN       moesin
    4    CAPG      Capping protein (actin filament), gelsolin-like
    3    KANK      kidney ankyrin repeat-containing protein

    Apoptosis
    4    CASP4     caspase 4
    4    PIG3      p53 induced gene 3

    Transcription
    14   FP36L1    zinc finger protein 36, C3H type-like 1 (ERF-1)
    7    ID4       inhibitor of DNA binding 4, dominant neg helix-loop-helix protein
    3    BTF3      basic transcription factor 3
    6    EYA2      eyes absent (Drosophila) homolog 2
    4    EGR1      Early growth response 1
    4    JUNB      Jun B proto-oncogene
    4    CEBPB     CCAAT/enhancer binding protein (C/EBP), beta
    3    NFKBIA    nuclear factor kappa-B inhibitor alpha
    3    FOXM1     forkhead box M1

    Proliferation
    3    CKS2      CDC28 protein kinase regulatory subunit 2
    3    CDC20     cell division cycle 20

    Unknown function
    5    H47315    EST
    7    MT1L      metallothionein 1L
    6    CLIC1     chloride intracellular channel 1
    6    MT2A      metallothionein 2A
    4    HNRPH1    heterogeneous nuclear ribonucleoprotein H1
    4    R68464    EST
    4    APOE      apolipoprotein E
    3    KIAA0630  KIAA0630 protein
    3    MSI2      Musashi homolog 2

  • Overview: Introduction, Generating data, Analyzing data, Interpreting data

  • BioHavasu project

  • Unusual Suspects: Cataloging Cancer Related Proteins, Genes using Biomedical Literature

    Pathway involvement (activity of protein): Determine the cellular pathway(s) in which the protein is involved: apoptosis, proliferation, or migration.

    Interaction (protein/protein, protein/nucleic acids or protein/fatty acids): Determine protein binding. Swissprot, Entrez protein or Expasy.

    Disease (protein/disease, protein/tissue type): Determine the types of cancer that the protein is related to.

    Protein Action (protein/function): Determine the diverse activation and inhibition relationships between proteins as well as sub-cellular localization.

  • Understanding relationships

  • Sub-cellular localization

  • Proposed Ontology-Directed Extraction Methodology

    Model Medical Terminology: Identify existing medical ontologies such as UMLS for modeling the domain knowledge.

    Text Classifier Module: Build a classifier for identifying interesting sentences in MEDLINE abstracts.

    Natural Language Processing: Identify pre-processing steps for structuring free text. Such steps involve part-of-speech tagging, noun and verb phrase chunking and shallow parsing.

    Relationship Extractor Module: Build an extractor system using machine-learning techniques, such as ILP, for learning rules that combine the medical ontologies with learned patterns on sentences to extract relationships among proteins.

    Usability, Performance and Scalability: Determine if the system is usable by biologists, if it can be easily trained to extract new types of relationships, and if its recall and precision are at acceptable levels.

  • So that I don't have to spend hours finding diagrams myself. Mef 2C, HB-EGF, LPA, GCR, G proteins

  • Promoter Analysis: Find the promoter region (Genome browser). Find transcription factor binding sites (TESS, Genomatix, Biobase, etc.). Align several promoters to find common patterns.

  • The ones that got away: Highly invasive. Surgeon can't reach them. Chemotherapy and radiotherapy can't reach them. They are not dividing.

    [Figure labels: core, rim]

  • Genetics again!

  • Transcription: Core promoter, Transcription factors, Co-activators, Enhancers

  • Transcription factors

  • Consensus binding sites: Position weight matrices. Define variation in promoter consensus sequences.
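As a rough illustration of how a position weight matrix captures variation around a consensus, here is a minimal Python sketch. It builds a log-odds matrix from a handful of aligned sites and slides it along a sequence. The aligned sites, the uniform background of 0.25 per base, the pseudocount and all function names are assumptions chosen for the example; tools such as TESS or Genomatix use their own curated matrices and scoring.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds scores for one window of sequence."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    hits = [(score(pwm, sequence[i:i + width]), i)
            for i in range(len(sequence) - width + 1)]
    return max(hits)


# Made-up aligned sites resembling an AP-1-like motif, for illustration only.
example_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTAA"]
example_pwm = build_pwm(example_sites)
example_score, example_start = best_hit(example_pwm, "CCGGTGACTCAGGCC")
```

Positions where the aligned sites disagree (the fourth base here) get lower weights than the invariant positions, which is how the matrix tolerates variation that a single consensus string cannot express.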

  • The sequenced human genome

  • Finding the Promoter

  • Genome Browser: Human Genome Browser Gateway

  • TESS

  • TESS Job W0793006061 : Tabulated Results

  • Promoter structure (figure labels: 1, 2, 3, 4)

  • Promoter Alignment

  • Genomatix

  • The next step, biological significance

    Proof of transcriptional regulation = proof of protein. Cellular specificity. Subcellular localization. Activity. Tissue micro-array. TissueInformatics.

  • Conclusion: cDNA microarray technology has opened a flood gate of information. Biologists need HELP. Expedite the interpretation of data. Ideas wanted.

    My name is DH. I am a molecular biologist at TGen. Today I'll be talking about cDNA analysis.

    A biological question is answered by an experiment that brings insight into the multiple signaling pathways necessary to coordinate a complex cellular behaviour. In the process this experiment generates so much data that the researcher would require months to begin to understand the important relationships before them. Automated data mining would speed up this enormous task so that the information can be translated into experiments designed to prove these relationships. Please stop me and ask any questions that arise.

    1. As an introduction I am going to set the stage by refreshing your memory of genetics and introducing you to the type of cancer we study. Generating data covers the technical aspects of cDNA microarray analysis. Analyzing covers the transformation of fluorescent signals to expression ratios of the genes, which yield lists of genes with potential biological significance. Interpretation involves identifying the genes, finding information on them, and identifying relationships and other correlations between the genes on one list or between opposite gene lists.

    Normal scan on left. Three months later the scan showed a large tumor, enhanced in blue. See surgical resection. Patient died two months later.

    Gene therapy involves infecting only the cancer cells with engineered viruses that deliver a gene that would then kill those cells. It is in its infancy. Even more in its infancy is stem cell therapy: engineered stem cells virtually chase down the moving tumor cells and deliver the treatment themselves. The first phases of clinical trials are starting.

    This tumor is the most aggressive of the glial tumors. Glia cells are the infrastructure and support around the neurons. The glioma cells preferentially travel along established structures such as blood vessels or

    Find out what genes are responsible for their invasive nature. Compare the genetic activity of the moving cells to those that remain stationary. Compare genes expressed in the tumor core versus those expressed in the invasive rim. We think that the moving cells are a different clinical target than the cells remaining in the tumor core.

    = a different clinical target. But to be able to implement a different therapy we must first find out how to target this special sub-population of cells. Analyze them separately. How to carve out these minute cells, 6 microns, from their context in the brain? Enter LCM.

    Misshapen raisins = nuclei. Need a pathologist!!!! Harvest cells to isolate the genetic material.

    Every cell of your body carries all the information needed to make a complete human being out of it. The information is carried in the form of a code. The code of life = DNA. DNA is like an encyclopedia of parts necessary to be human. These parts are the proteins that make up each cell. Not every cell needs every protein to be made. Cells in your eyes don't need to be making toe nails. Therefore every cell carefully makes only those proteins needed, by copying only those parts of the genome that will be used in that cell type at that time in development. These copies are mRNA. More about that later.

    But we can analyze what RNA a cell produces, which reflects what proteins are necessary for that cell at that point in place and time. Much of it will be the same, since the cells we are comparing are the same type of cell from the same patient. But we are looking for the differences.

    Generating data covers the technical aspects of cDNA microarray analysis.

    Each slide has 15k dots. Each dot is one gene: 100k copies of one gene. They are arrayed in a grid; the position of each gene is known. Red and green fluorescences are measured. You can see that there will be lots of information coming from this type of experiment.

    Robotics are required to reproducibly print slides with up to 20K genes on them. Pins take up a nanoliter-scale DNA suspension and accurately deposit it on a glass slide, where it is immobilized.

    1. Superimposed fluorescence looks like this. Now it is time to measure relative signal intensity.

    2. Array Target Segmentation: Since each element of an array is printed automatically to a pre-defined location, we can safely assume that the detectable signals form a regular array which can be automatically aligned to a predefined grid overlay.

    3. Target Detection: It is important that the final signal intensity be measured over regions corresponding to the probe-hybridized-to-target area, and also to define the background, for signal noise extraction.

    4. Target Intensity Extraction: Once the area of the target is defined, the fluorescence within that area (its median and its mean) and the median and mean of the background can be calculated for each spot on the array, oh, and the standard deviation etc., you'll see.

    5. Normalization and Ratio Analysis: We have used the expression ratio to determine whether a gene's expression differs significantly between the red and green channels. Such an approach is intuitive because two similar samples lead to an R/G ratio close to 1, i.e. they are equally expressed in both populations.

    Formula used to calculate the ratio of the red over the green channel, normalized to the total red and green fluorescence in the entire slide (experiment). For more info on the mathematical processing of the fluorescence values please go to this web site.

    Accession numbers are gene descriptors. There are 37 data columns per gene. Arrays have 5k to 15k genes. This is the sushi of array analysis. 5K = 185,000; 15K = 555,000.

    5K = 185,000; 15K = 555,000. That times 10 experiments = 5,550,000. That times 100 experiments = 55,500,000. Why so much information for just one spot? To be certain that the difference in expression is statistically significant. Clearly this data stream will be reduced! Import only certain columns of this data into an array analysis software program.
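The data-volume figures above follow from 37 data columns per gene. A short Python check of that arithmetic (the function name is made up for the example):

```python
COLUMNS_PER_GENE = 37  # data columns reported per spot, as stated in the notes


def data_points(genes, experiments=1):
    """Number of raw values produced by `experiments` arrays of `genes` spots."""
    return COLUMNS_PER_GENE * genes * experiments


small_array = data_points(5_000)               # one 5K array
large_array = data_points(15_000)              # one 15K array
ten_experiments = data_points(15_000, 10)      # ten 15K arrays
hundred_experiments = data_points(15_000, 100) # one hundred 15K arrays
```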

    Analyzing covers the transformation of fluorescent signals to expression ratios of the genes, which yield lists of genes with potential biological significance.

    Load the median of the green channel minus the median of the green channel background (the mean could misrepresent some spots that have a high spike of signal in a small spot area). Same for the red channel. Also load the flag column: alerts for bad spot size/shape.

    Normalizes the calculated ratios across all experiments (if all experiments have a robust and equal fluorescence level) or within one gene chip, etc. Normalization procedures vary with the type of experiment and the intensity of fluorescence. In every type of normalization you may lose data points and also obtain some false positives.
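A small Python sketch of the loading step described above: subtract the background median from the spot median in each channel and skip flagged spots. The field names (`red_median`, `flag`, and so on) are hypothetical stand-ins for whatever the scanner output calls its columns, and the pixel values are invented to show why the median is preferred over the mean.

```python
import statistics


def corrected_signal(spot):
    """Return background-subtracted (red, green), or None for unusable spots."""
    if spot["flag"]:
        # Flagged for bad spot size or shape.
        return None
    red = spot["red_median"] - spot["red_bg_median"]
    green = spot["green_median"] - spot["green_bg_median"]
    if red <= 0 or green <= 0:
        # Signal does not rise above background in at least one channel.
        return None
    return red, green


# Invented pixel intensities for one spot with a single bright spike.
spike_pixels = [100, 102, 98, 101, 5000]
spot_median = statistics.median(spike_pixels)  # barely moved by the spike
spot_mean = statistics.mean(spike_pixels)      # dragged far upward by it

good_spot = {"red_median": 1000, "red_bg_median": 100,
             "green_median": 500, "green_bg_median": 50, "flag": False}
bad_spot = dict(good_spot, flag=True)
```

Here the median of the spiked spot stays near 100 while the mean exceeds 1000, which is the misrepresentation the notes warn about.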

    Finds clusters of experiments with the same gene expression profile and groups them together. For example, the result of this experiment would be that the cancers in blue show a concordant gene expression profile.

    Now what? We have a number and maybe a gene name. What do all the gene numbers and accession numbers mean?
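To make the idea of grouping experiments by concordant expression profile concrete, here is a minimal Python sketch. It uses Pearson correlation and a simple greedy grouping with an arbitrary threshold of 0.9; this is an assumption made for illustration, not the clustering algorithm GeneSpring applies, and the sample names and values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def group_experiments(profiles, threshold=0.9):
    """Greedily place each experiment in the first group it correlates with."""
    groups = []
    for name, profile in profiles.items():
        for group in groups:
            if any(pearson(profile, profiles[member]) >= threshold
                   for member in group):
                group.append(name)
                break
        else:
            # No sufficiently similar group exists yet, so start a new one.
            groups.append([name])
    return groups


# Invented log-ratio profiles over the same four genes.
example_profiles = {
    "tumor1": [2.0, -1.0, 0.5, 1.5],
    "tumor2": [1.8, -0.9, 0.4, 1.6],
    "tumor3": [-1.0, 2.0, 0.1, -1.5],
}
example_groups = group_experiments(example_profiles)
```

The first two samples rise and fall together across the genes and end up in one group, while the third, with an opposite profile, forms its own.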

    Multiple names: this brings up the need to define the true identity of the gene.

    Online Mendelian Inheritance in Man: pseudonyms and short synopsis.

    Enter names of genes of interest to find the collated entries for all the pieces of DNA recorded in GenBank (Hs, Mm, Rn): pseudonyms, a catalog of all entries for parts of sequences or the entire sequence of the gene, and homology to other animals.

    Entrez Nucleotide: find some entries. Published material on this gene.

    Ullas Nambiar is developing a program to get us from accession number to publication records, navigating through all the previous steps. This will result in incredible time savings.

    LocusLink: a hub of background information pertaining to this gene. I'm showing you all these sites to illustrate the many ways to get at the complete complement of information available pertaining to one spot or accession number.

    LocusLink: what functions or cellular localizations is this protein linked with? Click on G-coupled receptors to find out other genes that might be involved in similar signaling pathways.

    LocusLink: chromosomal location, protein functions. See what other proteins are involved in the same pathways. Need to go farther down. Other genes involved in signaling through G-coupled receptors, for example rho. Then it would be nice to go back to PubMed to find references that include rho and autotaxin, and a number of other combinations.

    This would take unimaginable man- (or in this case woman-) hours to get a thorough link between two proteins or to really find out its role in a complex signaling cascade. Hasan Dalvucu is helping to solve just that problem: PubMed and the protein databases that list known functional groups in proteins (Swissprot, Expasy and Entrez protein). Train the system to recognize the names of proteins, domains, drugs, to get the full picture.

    1- autotaxin and migration, apoptosis, proliferation, cell cycle, etc.: positive or negative correlations. Autotaxin and GCPR, autotaxin and some other candidate, arp2/3.

    You have this protein name and don't know what it does: find out domains. Train the system to recognize the names of proteins, domains, drugs. Autotaxin and glioma, autotaxin and phosphorylation, or autotaxin and membrane. Positive and negative interactions; relationships between basic events.

    The output of this analysis would end up like this: a diagram of the inter-relationships between the proteins of interest and the other players in cell signaling cascades. E.g. a scheme illustrating gene candidates and the pathways they are involved in during glioma invasion.
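The pairing of a protein with each topic of interest can at least be enumerated mechanically. The Python sketch below only builds query strings joined with the boolean AND operator; it does not contact PubMed or any other database, and the function name and term lists are chosen for illustration from the examples in the notes.

```python
from itertools import product


def build_queries(proteins, topics):
    """Return one 'protein AND topic' search string per combination."""
    return [f"{protein} AND {topic}" for protein, topic in product(proteins, topics)]


example_queries = build_queries(
    ["autotaxin", "rho"],
    ["migration", "apoptosis", "proliferation", "glioma"],
)
```

Two proteins and four topics already give eight searches to read through, which is the manual workload the text-mining system is meant to remove.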

    Pak negatively affects MLCK and activates LIMK. Net result: two sub-pathways are modified so that actin is not broken down; rather it is polymerized (synthesized) at the leading edge of the cell. This illustrates the last point of Unusual Suspects: subcellular localization. See next slide.

    Illustrating sub-cellular localization: figure showing the change in morphology of a keratinocyte as it starts moving. Polarization of the actin cytoskeleton. Blue is actin.

    This is how Dr. Dalvucu suggests to accomplish this feat! I think this will save weeks of research time!

    Exactly. Are genes that are transcribed at the same time in the same cell regulated by the same elements?

    = a different clinical target. But to be able to implement a different therapy we must first find out how to target this special sub-population of cells. Analyze them separately. How to carve out these minute cells, 6 microns, from their context in the brain? Enter LCM.

    Transcription and then Translation. Every cell of your body carries all the information needed to make a complete human being out of it. The information is carried in the form of a code. The code of life = DNA. DNA is like an encyclopedia of parts necessary to be human. These parts are the proteins that make up each cell. Not every cell needs every protein to be made. Cells in your cornea don't need to be making toe nails. Therefore every cell carefully makes only those proteins needed, by copying only those parts of the genome that will be used in that cell type at that time in development. These copies are mRNA. More about that later.

    But we can analyze what RNA a cell produces and from that loosely infer what its function is.

    Transcription: no accident! Specificity: the core promoter ensures transcription but does not confer tissue or temporal specificity; it recruits RNA polymerase II. Some elements bind DNA, others bind to them (charge interactions). Transcription factors confer specificity and amplify the signal, and may recruit other transcription factors. Enhancers step up the rate.

    AP-1: hetero- or homodimers, leucine zipper? Helix-turn-helix. Zn finger.

    Add a position weight matrix. Show genome browser and selection of promoter area; cross-species conservation. Use hyperlink.

    TESS: Transcription Element Search System. Find transcription factor binding sites. Problem: this shows binding sites for 100 bp in a very stringent screen. Promoters can be very long. We chose to input 2000 bp. Mission: tabulate all putative transcription factors on these 2000 bp. Do that for 30 genes.

    1- Promoter organization overview: Vertebrate pol II promoters usually consist of multiple binding sites for transcription factors which are necessary for promoter function. However, individual promoter elements require a specific order to constitute a functional promoter. This organization can be dissected into at least three different levels with distinct functionality encoded at each level.

    2- Transcription factor binding sites (TF-sites)Individual TF-sites build the basis of the promoter. These are relatively short stretches of DNA (10 - 20 nucleotides), sufficiently conserved in sequence to allow specific recognition by the corresponding transcription factor.

    3- Promoter modulesThe next higher level of promoter organization is the one of promoter modules which are composed of two or more TF-sites in a defined distance range. In contrast to isolated binding sites these sites allow synergistic or antagonistic effects.

    4- Promoter modelsThe highest and most complex level of organization of a promoter is the complete promoter itself. Functionally related promoters often exhibit a clearly defined core organization of binding sites conserved both in orientation as well as in distances (with some variability). This is true even when the promoter sequences show no significant overall sequence similarity precluding alignment-based detection also for whole promoters (except phylogenetic footprints of evolutionary related promoters).
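The notion of a promoter module, two TF-sites within a defined distance range, can be sketched in Python as a filter over tabulated binding-site hits. The hit list, the positions and the distance limits below are invented for illustration, and the function only considers the second site downstream of the first; real tools such as Genomatix also take strand orientation and matrix scores into account.

```python
def find_modules(hits, first, second, min_gap, max_gap):
    """Return (first_pos, second_pos) pairs whose spacing falls in the range.

    hits: list of (factor_name, position) tuples from a binding-site scan.
    Only pairs with `second` downstream of `first` are reported.
    """
    first_positions = [pos for name, pos in hits if name == first]
    second_positions = [pos for name, pos in hits if name == second]
    return [
        (a, b)
        for a in first_positions
        for b in second_positions
        if min_gap <= b - a <= max_gap
    ]


# Invented hits along a 2000 bp promoter region.
example_hits = [("AP-1", 120), ("Mef2C", 150), ("AP-1", 900), ("Mef2C", 1500)]
example_modules = find_modules(example_hits, "AP-1", "Mef2C", min_gap=10, max_gap=50)
```

Of the four possible AP-1/Mef2C pairings in this toy region only the pair at positions 120 and 150 is close enough to count as a module.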

    The promoter model shown below describes the general framework of TF-sites which is common to all mammalian actin promoters, even across species! Promoter modules: green is AP-1, red is Mef2C. Show aligned sequences and promoters bound to them.

    Will also need computer-assisted interpretation. Quantitating the staining signal intensity and percent of area. Correlate this with gene expression profile.