DNA Microarray analysis using Hierarchical clustering


  • 7/21/2019 DNA Microarray analysis using Hierarchical clustering


    Experiment 4

    1. Experiment: DNA Microarray analysis using Hierarchical clustering

    Equipment Required: Computer with internet connection.

    Material Required: Matlab

    2. Learning Objectives:

    To acquaint students with the clustering of large data.

    To acquaint with Matlab

    3. Theory:

    Clustering methods

    One of the goals of microarray data analysis is to cluster genes or samples with similar expression profiles together, to make meaningful biological inference about the set of genes or samples. The idea is that coregulated and functionally related genes are likely to be expressed (go up or down) together, so they can be grouped into clusters. Clustering, also known as class discovery, allows easier management of the data set.

    Clustering techniques can help:

    * to identify groups of coregulated genes,
    * to identify spatial or temporal expression patterns,
    * to reduce redundancy in prediction models,
    * to identify new biological classes (e.g. new tumor classes),
    * to detect experimental artifacts,
    * or for display purposes.

    Clustering is one of the unsupervised approaches to classify data.

    What is MATLAB?

    * MATLAB is short for Matrix Laboratory.
    * MATLAB is a tool for doing numerical computations with matrices and vectors.
    * It is very powerful and easy to use.
    * It integrates computation, visualization and programming.

    USES:

    1. Teaching - Bioinformatics graduate and undergraduate courses
       * MIT, Harvard, Stanford, Cornell, Carnegie Mellon, ...

    2. Research - recent papers use MATLAB for:
       a. Sequencing
          * Base calling algorithm design
       b. Microarray analysis
          * Statistical modeling of microarrays, image analysis
       c. Proteomics
          * Mass spectrometry data classification
       d. Systems Biology
          * Flux Analysis, Simulation of Metabolic Pathways, Interaction Network Identification

    Clustering algorithms

    The traditional algorithms for clustering are:

    1. Hierarchical clustering.
    2. K-means clustering.
    3. Self-organizing feature maps (a variant of self-organizing maps).
    4. Binning

    Hierarchical clustering

    Hierarchical clustering typically uses a progressive combination of elements that are most similar. The result is plotted as a dendrogram that represents the clusters and relations between the clusters.

    Repeat steps 3 and 4 for the most high-level clusters.

    The top-down algorithm works as follows:

    1. All the genes or experiments are considered to be in one super-cluster.
    2. Divide each cluster into 2 clusters by using k-means clustering with k=2.
    3. Repeat step 2 until all clusters contain a single gene or experiment.

    This algorithm tends to be faster than the bottom-up approach.
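
    The top-down steps above can be sketched in MATLAB as follows. This is a minimal illustration, not the procedure used later in this experiment. It assumes the Statistics Toolbox function kmeans is available, reuses the five-point matrix X from Example 1, and fixes the random seed with rng only so that repeated runs give the same splits. The names clusters, next and level are illustrative.

    ```matlab
    % Top-down (divisive) sketch: split every cluster in two with k-means (k = 2)
    % until each cluster holds a single row. Assumes the Statistics Toolbox.
    rng(1);                                  % fixed seed so the run is repeatable
    X = [1 2; 2.5 4.5; 2 2; 4 1.5; 4 2.5];   % five objects, two measurements each
    clusters = {1:size(X,1)};                % step 1: one super-cluster
    level = 0;
    while any(cellfun(@numel, clusters) > 1)
        next = {};
        for c = 1:numel(clusters)
            members = clusters{c};
            if numel(members) == 1
                next{end+1} = members;            % singleton: nothing left to split
            else
                idx = kmeans(X(members,:), 2);    % step 2: k-means with k = 2
                next{end+1} = members(idx == 1);
                next{end+1} = members(idx == 2);
            end
        end
        clusters = next;                     % step 3: repeat on the new clusters
        level = level + 1;
        fprintf('level %d: %d clusters\n', level, numel(clusters));
    end
    ```

    The loop stops when every cell of clusters contains exactly one row index, which mirrors step 3 of the list. Because k-means starts from random centers, the intermediate splits can differ between seeds.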

    4. Procedure:

    Install Matlab on the system

    Use cluster analysis tool:

    Hierarchical Clustering in MATLAB

    To perform hierarchical cluster analysis on a data set using the Statistics Toolbox functions, follow this procedure:

    * Step 1: Find the similarity or dissimilarity between every pair of objects in the data set:
    * In this step, you calculate the distance between objects using the pdist function. The pdist function supports many different ways to compute this measurement.

    * Step 2: Group the objects into a binary, hierarchical cluster tree:
    * In this step, you link pairs of objects that are in close proximity using the linkage function, which is the main function to implement the hierarchical clustering method. The linkage function uses the distance information generated in step 1 to determine the proximity of objects to each other. As objects are paired into binary clusters, the newly formed clusters are grouped into larger clusters until a hierarchical tree is formed.
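
    As a rough sketch of steps 1 and 2, the lines below compute distances with two different metrics and build trees with two different linkage rules. The data matrix X is the one from Example 1. The choice of 'cityblock' and 'average' is only for illustration and is not prescribed by this experiment.

    ```matlab
    X = [1 2; 2.5 4.5; 2 2; 4 1.5; 4 2.5];   % 5 objects (rows), 2 variables (columns)

    % Step 1: pairwise distances; the second argument selects the metric.
    Yeuc  = pdist(X);                 % Euclidean distance (the default)
    Ycity = pdist(X, 'cityblock');    % sum of absolute coordinate differences

    % Step 2: build the binary tree; the second argument selects the linkage rule.
    Zsingle  = linkage(Yeuc);             % single linkage (the default)
    Zaverage = linkage(Yeuc, 'average');  % average distance between clusters

    % Each row of Z is one merge: [cluster A, cluster B, merge distance].
    disp(size(Zsingle))               % 4 merges for 5 objects, 3 columns each
    ```

    For m objects, pdist returns m(m-1)/2 distances and linkage returns m-1 rows, one per merge.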

    * Step 3: Determine where to cut the hierarchical tree into clusters:
    * In this step, you use the cluster function to prune branches off the bottom of the hierarchical tree, and assign all the objects below each cut to a single cluster. This creates a partition of the data. The cluster function can create these clusters by detecting natural groupings in the hierarchical tree or by cutting off the hierarchical tree at an arbitrary point.
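
    Two ways of cutting the tree at a chosen point can be sketched as below, again on the Example 1 data. The cluster count (2) and the distance threshold (1.5) are arbitrary values picked for illustration.

    ```matlab
    X = [1 2; 2.5 4.5; 2 2; 4 1.5; 4 2.5];
    Z = linkage(pdist(X));                 % single-linkage tree from Euclidean distances

    % Cut so that exactly 2 clusters remain.
    T2 = cluster(Z, 'maxclust', 2);

    % Cut at a distance threshold: only merges below height 1.5 are kept.
    Td = cluster(Z, 'cutoff', 1.5, 'criterion', 'distance');

    disp([T2 Td])                          % one row per object, one column per cut
    ```

    With this data the first cut separates object 2 from the other four, and the second cut keeps only the two merges made at distance 1, leaving three clusters: objects 1 and 3, objects 4 and 5, and object 2 alone. The cluster numbers themselves are arbitrary labels.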

    The MATLAB Statistics Toolbox includes a convenience function, clusterdata, which performs all these steps. There is no need to execute the pdist, linkage, or cluster functions separately.
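
    A short sketch of the shortcut, assuming the default settings of clusterdata (Euclidean distance and single linkage) and an illustrative limit of 2 clusters:

    ```matlab
    X = [1 2; 2.5 4.5; 2 2; 4 1.5; 4 2.5];

    % One call: distances, tree and cut into at most 2 clusters.
    T1 = clusterdata(X, 'maxclust', 2);

    % The same partition written out step by step.
    T2 = cluster(linkage(pdist(X)), 'maxclust', 2);

    disp(isequal(T1, T2))
    ```

    The separate functions remain useful when the intermediate results are needed, for example to plot the tree with dendrogram.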

    Commands used:

    1. pdist function: D = pdist(X) computes the Euclidean distance between pairs of objects in the m-by-n data matrix X.
    2. linkage function: Z = linkage(Y) creates an agglomerative hierarchical cluster tree from the distances in Y, a distance vector such as the output of pdist.
    3. squareform function: Z = squareform(y), where y is a vector as created by the pdist function, converts y into a square, symmetric format Z, in which Z(i,j) denotes the distance between the ith and jth objects in the original data.
    4. Dendrogram: H = dendrogram(Z) generates a dendrogram plot of the hierarchical, binary cluster tree represented by Z.

    Example: 1

    X = [1 2;2.5 4.5;2 2;4 1.5;4 2.5]

    X =

        1.0000    2.0000
        2.5000    4.5000
        2.0000    2.0000
        4.0000    1.5000
        4.0000    2.5000

    Y = pdist(X)

    Y =

        2.9155    1.0000    3.0414    3.0414    2.5495    3.3541    2.5000    2.0616    2.0616    1.0000

    squareform(Y)

    ans =

             0    2.9155    1.0000    3.0414    3.0414
        2.9155         0    2.5495    3.3541    2.5000
        1.0000    2.5495         0    2.0616    2.0616
        3.0414    3.3541    2.0616         0    1.0000
        3.0414    2.5000    2.0616    1.0000         0

    Z = linkage(Y)

    Z =

        4.0000    5.0000    1.0000
        1.0000    3.0000    1.0000
        6.0000    7.0000    2.0616
        2.0000    8.0000    2.5000

    dendrogram(Z)
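
    As a check on Example 1, the distance between objects 1 and 2 can be computed directly from the Euclidean definition and compared with the first entry of the pdist vector and with element (1,2) of the square matrix. The variable names d12 and S are illustrative.

    ```matlab
    X = [1 2; 2.5 4.5; 2 2; 4 1.5; 4 2.5];

    % Distance between objects 1 and 2, computed directly from the definition.
    d12 = sqrt(sum((X(1,:) - X(2,:)).^2));   % sqrt(1.5^2 + 2.5^2) = sqrt(8.5)

    Y = pdist(X);
    S = squareform(Y);
    fprintf('%.4f %.4f %.4f\n', d12, Y(1), S(1,2));   % prints 2.9155 2.9155 2.9155
    ```

    In the linkage output, indices 1 to 5 refer to the original objects, and each merge creates a new cluster numbered 6, 7, 8 in order. The third row therefore joins cluster 6 (objects 4 and 5) with cluster 7 (objects 1 and 3), and the last row joins object 2 with the resulting cluster 8.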

    Example: 2

    load mn.dat

    mnvector = squareform(mn)

    mnvector =

    Columns 1 through 9

    I.II I.;II I.LII I.KLII I.K;II I.L>II I.KII I.>II

    I.6II

    Columns 10 through 18

    I.6JII I.;>II I.>II I.>LII I.L>II I.LKII I.66II I.>III.>3II

    Columns 19 through 27

    I.>LII I.L6II I.L;II I.;>II I.>III I.>KII I.KJII I.L3II

    I.II

    Columns 28 through 36

    I.6JII I.;KII I.;KII I.6II I.;6II I.;3II I.;II I.6III.63II

    mnclustering = linkage(mnvector,‘complete’)

    ??? mnclustering = linkage(mnvector,‘complete’)
    |
    Error: The input character is not valid in MATLAB statements or expressions.

    mnclustering = linkage(mnvector,'complete')

    mnclustering =

    .IIII J.IIII I.63II

    .IIII ;.IIII I.66II K.IIII L.IIII I.6II

    6.IIII 33.IIII I.6JII

    >.IIII 36.IIII I.II

    3.IIII 3.IIII I.II 3I.IIII 3;.IIII I.;KII

    3>.IIII 3K.IIII I.KII

    dendrogram (mnclustering)
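
    The same workflow can be tried without the mn.dat file. The sketch below uses a hypothetical 4-by-4 distance matrix D whose values are invented for illustration and are not taken from mn.dat. It shows squareform converting a square matrix to the vector form that linkage expects, and the method name written with straight single quotes.

    ```matlab
    % Hypothetical 4-by-4 distance matrix standing in for mn.dat (values invented).
    D = [0    0.21 0.60 0.70
         0.21 0    0.50 0.65
         0.60 0.50 0    0.30
         0.70 0.65 0.30 0   ];

    v = squareform(D);               % square matrix -> vector, in pdist order
    Z = linkage(v, 'complete');      % complete linkage: distance of the farthest pair
    dendrogram(Z)
    ```

    With complete linkage, objects 1 and 2 merge at 0.21, objects 3 and 4 merge at 0.30, and the two pairs join at 0.70, the largest distance between any member of one pair and any member of the other.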

    Desired Results:

    The student should be able to identify clusters.

    Parameters: None

    Relationships to be determined: None

    5. Cautions:

    None

    6. Learning outcomes: