The Johns Hopkins University Advanced Genomics & Genetics Analysis AS410.713.81.SU12


Analysis of Experiment E-GEOD-29989: Alternative Splicing of Exons in Human Hematopoietic Stem Cells

The Johns Hopkins University

Advanced Genomics & Genetics Analysis AS410.713.81.SU12

Phillip Woolwine

Outline

Alternative splicing of exons (ASE)

ASE probed w/ Affymetrix GeneChip Exon array

Data analyzed w/ Affymetrix Power Tools (APT), R & Bioconductor

ASE during lineage-specific hematopoietic differentiation

Background

ASE leads to mRNAs that can have similar or different functions/products and is a basis for functional diversity in gene expression

Includes exon skipping, mutually exclusive exons, alternative 5' donor sites, alternative 3' acceptor sites, and/or intron retention

ASE is believed to be a major player in lineage-specific differentiation of blood cells

Aberrant ASE can lead to leukemias, lymphomas, etc.

Understanding of ASE events and the exome profile will benefit understanding of disease pathogenesis and aid in therapies

Methods

Data retrieved from Experiment E-GEOD-29989 by Liu et al (2011)

• Affymetrix GeneChip Human Exon 1.0 ST array

• Transcriptional profile of lineage-specific differentiation of CD34 cells into Erythropoietic (E), Granulopoietic (G), and Megakaryopoietic (M) cells

Data normalized with APT 1.14.2

PCA outlier analysis in R and Bioconductor

Probes filtered out when DABG p-val > 0.05 in >50% of classes

T-test used to determine exon enrichment/depletion by p-val & fold change (FC)

• Evidence for alternative splicing verified in Ensembl

Top exon probes mapped to genes and differential expression plotted

• Green plots are CD34, red plots are lineage-specific

DAVID pathway analysis
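
The DABG filtering step listed above can be sketched as follows. This is a minimal Python illustration, not the APT/R code used in the analysis. The function name `keep_probe`, the data layout (one summary DABG p-value per class), and the reading that a probe is dropped when it is undetected (p > 0.05) in more than half of the classes are all assumptions.

```python
from typing import Dict

DABG_CUTOFF = 0.05      # detection p-value threshold from the slide
MAX_UNDETECTED = 0.5    # drop a probe if undetected in more than 50% of classes


def keep_probe(class_pvals: Dict[str, float]) -> bool:
    """Return True if the probe is detected in enough classes to be kept.

    class_pvals maps a class label (e.g. "CD34", "E", "G", "M") to one
    summary DABG p-value for that class (assumed layout, not from the source).
    """
    undetected = sum(1 for p in class_pvals.values() if p > DABG_CUTOFF)
    return undetected / len(class_pvals) <= MAX_UNDETECTED


# Hypothetical probes, for illustration only.
probes = {
    "probe_a": {"CD34": 0.001, "E": 0.02, "G": 0.30, "M": 0.01},  # undetected in 1/4
    "probe_b": {"CD34": 0.40, "E": 0.20, "G": 0.30, "M": 0.01},   # undetected in 3/4
    "probe_c": {"CD34": 0.40, "E": 0.20, "G": 0.01, "M": 0.01},   # undetected in 2/4
}
kept = [name for name, pvals in probes.items() if keep_probe(pvals)]
```

Under these assumptions `probe_b` is the only probe removed, because it is undetected in three of four classes.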

Results

PCA of exon probes

Exon arrays cluster along cell lineages; no outliers

G & E lineages are more similar to each other than to the M lineage

Results

Scree plot of exon probes

Approx. 70% of variability is explained in the first two eigenvectors
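
The figure quoted from the scree plot is the cumulative share of variance carried by the leading components. A minimal Python sketch of that arithmetic follows; the eigenvalues are hypothetical and chosen only so that the first two components sum to 70%, and the function names are illustrative.

```python
from typing import List


def explained_variance(eigenvalues: List[float]) -> List[float]:
    """Fraction of total variance carried by each principal component."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def cumulative(fractions: List[float]) -> List[float]:
    """Running total of the per-component fractions."""
    out, running = [], 0.0
    for f in fractions:
        running += f
        out.append(running)
    return out


# Hypothetical eigenvalues (not taken from the experiment).
eigenvalues = [5.0, 2.0, 1.5, 1.0, 0.5]
fractions = explained_variance(eigenvalues)
first_two = cumulative(fractions)[1]  # share explained by components 1 and 2
```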

Results

Filtering and Testing Correction

Filter on DABG of all class types removed ~83,000 low-intensity probes

• mRNA filters may be more sensitive in some cases (Table 1, Della Beffa et al, 2008)

Other filtering included using core probesets in APT (core.ps, core.mps)

197,245 probes remained after statistical tests

Potential for false positives (FP) because multiple testing correction was not performed

At p-val < 0.01 & |FC| > 1.5 there were 6413 in E, 3316 in G, 9638 in M; possible high FP

It can be argued that FWER is too conservative for the high dimensionality of exon data; the tests are not necessarily independent, nor are non-significant p-values necessarily uniformly distributed, so FDR may not be appropriate (Della Beffa et al, 2008)

Proper pre-filtering of probes and true splice sites is a better strategy to limit FP
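
To give a sense of scale for the false-positive concern, the sketch below computes the number of hits expected by chance at p < 0.01 across 197,245 tests, the Bonferroni cutoff that would control FWER at 0.05, and Benjamini-Hochberg adjusted p-values for a toy list. None of these corrections was applied in the analysis, and the slide argues that neither may suit exon data, so this is illustration only. The example p-values are hypothetical, and the expected count assumes every null hypothesis is true and ignores the fold-change filter.

```python
from typing import List

N_TESTS = 197_245   # probes remaining, from the slide
ALPHA = 0.01        # per-test p-value cutoff used on the slide

# Expected false positives if every null hypothesis were true: N * alpha.
expected_fp = N_TESTS * ALPHA

# Bonferroni per-test threshold that would hold FWER at 0.05.
bonferroni_cutoff = 0.05 / N_TESTS


def benjamini_hochberg(pvals: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, for illustration only.
example = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(example)
```

With these inputs roughly 1,970 probes would pass p < 0.01 by chance alone, which is the same order of magnitude as the per-lineage hit counts reported above.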

Results

One Significant ASE Transcript Cluster Common to All 3 Lineages

Dimensionality reduction at p-val < 0.00001 & |FC| > 2

IGHA2: immunoglobulin heavy constant alpha 2
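
The shared cluster can be recovered by intersecting the transcript cluster IDs reported for each lineage. The Python sketch below uses the transcript_id values from the three lineage result tables in this deck; the variable names are illustrative.

```python
# Transcript cluster IDs from the Erythropoietic, Granulopoietic and
# Megakaryopoietic result tables (p-val < 0.00001, fold change above 2).
erythro = {
    3764680, 2522509, 2949622, 3362263, 3834502, 2350922, 3081707,
    2500275, 3932131, 2340433, 3841862, 3568534, 3724545, 3145149,
    2333136, 3841574, 3581637, 2340350, 3079202, 3326950,
}
granulo = {
    3662808, 3835035, 3837132, 3581637, 2874371, 3304624, 3815223,
    3959350, 3761291,
}
megakaryo = {
    2883440, 3996306, 2902593, 3726691, 3619773, 3595441, 3604287,
    3229338, 3580498, 2527747, 3959862, 3321361, 2367963, 3870361,
    3308489, 3581637, 3538087, 3442150, 3922444, 3927226, 2664332,
    2712632, 2692816, 3432514, 2361279, 3838094, 3716783, 3024025,
    3661684, 2320727, 3654956, 3778504, 3738842, 2639734, 2402601,
    2476510, 2376799,
}

# Clusters significant in every lineage.
common = erythro & granulo & megakaryo
```

The intersection contains a single cluster, 3581637. The per-lineage gene tables label this cluster "immunoglobulin heavy variable 3-30".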

Results

Significant ASE in Erythropoietic vs CD34

Dimensionality reduction at p-val < 0.00001 & |FC| > 2

20 unique transcript clusters ordered by p-val

exon probe  test_statistic  pv        lower_ci   upper_ci   log2_fc    transcript_id
3764725     -94.66201       1.85E-07  -1.3478    -1.2689    -1.30835   3764680
2522514     -89.20705       3.30E-07  -2.17621   -2.03966   -2.10793   2522509
2949661     45.40627        1.41E-06  2.53127    2.86105    2.69616    2949622
3362331     45.16613        1.70E-06  2.10366    2.38106    2.24236    3362263
3834504     59.227          1.97E-06  2.530274   2.793166   2.66172    3834502
2350929     -41.35524       2.12E-06  -1.83529   -1.60407   -1.71968   2350922
3081719     -65.99954       2.88E-06  -2.81293   -2.56643   -2.68968   3081707
2500292     -43.56399       2.97E-06  -1.2957    -1.13712   -1.21641   2500275
3932139     -52.20334       3.03E-06  -2.43312   -2.17509   -2.3041    3932131
2340510     -38.16035       3.16E-06  -2.35865   -2.03735   -2.198     2340433
3841868     -40.8034        3.36E-06  -2.29577   -1.99845   -2.14711   3841862
3568540     -33.24723       6.05E-06  -2.36948   -2.00122   -2.18535   3568534
3724580     -31.96947       6.19E-06  -3.6351    -3.05234   -3.34372   3724545
3145161     46.62158        6.92E-06  2.344684   2.665209   2.504947   3145149
2333140     -34.85999       7.10E-06  -3.23725   -2.74898   -2.99311   2333136
3841587     -36.89025       7.10E-06  -3.64063   -3.11492   -3.37777   3841574
3581927     -52.92897       7.97E-06  -2.68494   -2.39052   -2.53773   3581637
2340357     -32.72331       8.60E-06  -3.21971   -2.7059    -2.9628    2340350
3079216     33.57286        8.97E-06  1.691841   2.006399   1.84912    3079202
3326951     -62.76521       9.98E-06  -2.025     -1.8283    -1.92665   3326950
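
The columns of the table above are related in a simple way: log2_fc is a difference of group means on the log2 scale, test_statistic is that difference divided by its standard error, and the confidence interval is symmetric about log2_fc. The Python sketch below illustrates this. It assumes a Welch two-sample t-test (the default of R's `t.test`) and a lineage-minus-CD34 direction; the slides do not state which variant was used. The replicate intensities and the helper name `welch_t` are hypothetical; only the last three values are taken from the first table row.

```python
import math
from statistics import mean, variance
from typing import List, Tuple


def welch_t(group_a: List[float], group_b: List[float]) -> Tuple[float, float, float]:
    """Return (log2 fold change, t statistic, Welch degrees of freedom).

    Inputs are log2 intensities, so the difference of means is the log2 FC.
    """
    va = variance(group_a) / len(group_a)   # squared standard error, group a
    vb = variance(group_b) / len(group_b)   # squared standard error, group b
    log2_fc = mean(group_a) - mean(group_b)
    t_stat = log2_fc / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return log2_fc, t_stat, df


# Hypothetical log2 intensities for one probe (lineage vs CD34).
lineage = [6.0, 6.2, 6.1]
cd34 = [7.4, 7.5, 7.3]
log2_fc, t_stat, df = welch_t(lineage, cd34)

# First row of the table: the interval midpoint equals the reported log2_fc.
lower_ci, upper_ci, reported_fc = -1.3478, -1.2689, -1.30835
midpoint = (lower_ci + upper_ci) / 2
linear_fc = 2 ** reported_fc  # about 0.40, i.e. roughly 2.5-fold lower
```

The p-value itself would come from the t distribution with `df` degrees of freedom and is not computed here.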

Results

Significant ASE in Erythropoietic vs CD34 Genes and Pathways

DAVID Functional Annotation reveals enrichment for alternative splicing

ID       Gene Name
2500275  BCL2-like 11
3834502  CD79a molecule, immunoglob-assoc a
3362263  DENN/MADD domain containing 5A
2340350  DnaJ (Hsp40) homolog, subfam C, mem 6
3841862  Fc fragment of IgA, receptor for
2522509  NIF3 NGG1 interacting factor 3-like 1
2333136  cell division cycle 20 homolog
2350922  glutathione S-transferase mu 1
3581637  immunoglobulin heavy variable 3-30
3724545  integrin, beta 3
2340433  leptin receptor
3841574  leukocyte immunoglob-like receptor, B, 1
3326950  ld lipoprotein receptor class A dom con 3
3081707  motor neuron and pancreas homeobox 1
3079202  K+ voltage-gated channel, subfam H, 2
3932131  proteasome assembly chaperone 1
3568534  spectrin, beta, erythrocytic
2949622  tenascin XB; tenascin XA pseudogene
3764680  tripartite motif-containing 37
3145149  tumor protein p53 induc nuclear prot 1

Results

Significant ASE in Granulopoietic vs CD34

Dimensionality reduction at p-val < 0.00001 & |FC| > 2

9 unique transcript clusters ordered by p-val

exon probe  test_statistic  pv        lower_ci   upper_ci   log2_fc    transcript_id
3662846     71.18055        2.92E-07  2.480959   2.683834   2.582397   3662808
3835043     -43.71178       1.69E-06  -6.9581    -6.12603   -6.54206   3835035
3837162     39.51809        2.52E-06  4.016632   4.624342   4.320487   3837132
3582003     37.10581        3.39E-06  0.961212   1.117175   1.039193   3581637
2874427     36.01181        3.62E-06  1.731912   2.021508   1.87671    2874371
3304629     32.80535        8.14E-06  2.687296   3.195017   2.941157   3304624
3815226     -29.11174       8.47E-06  -3.5762    -2.95292   -3.26456   3815223
3959353     -30.37461       9.42E-06  -1.98261   -1.64655   -1.81458   3959350
3761306     55.43858        9.59E-06  1.783076   1.996257   1.889667   3761291

Results

Significant ASE in Granulopoietic vs CD34 Genes & Pathways

DAVID Functional Annotation reveals significant enrichment for signaling

Clusters include alternative splicing

ID       Gene Name
3304624  5'-nucleotidase, cytosolic II
3835035  CD177 molecule
3662808  G protein-coupled receptor 56
3837132  SUMO1 activating enzyme subunit 1
3959350  apolipoprotein L, 3
2874371  fibrillin 2
3761291  homeobox B2
3581637  immunoglobulin heavy variable 3-30
3815223  proteinase 3

Results

Significant ASE in Megakaryopoietic vs CD34

Dimensionality reduction at p-val < 0.00001 & |FC| > 2

37 unique transcript clusters ordered by p-val

exon probe  test_statistic  pv        lower_ci   upper_ci   log2_fc    transcript_id
2883456     -84.51229       2.73E-07  -3.50093   -3.27244   -3.38669   2883440
3996325     -109.04446      3.42E-07  -1.61352   -1.52799   -1.57075   3996306
2902594     -56.26256       5.98E-07  -3.2605    -2.95384   -3.10717   2902593
3726702     -57.64931       6.76E-07  -4.37159   -3.96699   -4.16929   3726691
3619789     54.18277        7.45E-07  1.385162   1.535165   1.460163   3619773
3595504     -53.69291       9.06E-07  -4.53231   -4.08325   -4.30778   3595441
3604294     56.16099        1.14E-06  1.188597   1.31523    1.251913   3604287
3229353     47.77881        1.15E-06  4.543631   5.104329   4.82398    3229338
3580603     -54.68796       1.31E-06  -2.31754   -2.08843   -2.20298   3580498
2527758     44.34504        1.58E-06  2.367172   2.683661   2.525417   2527747
3959870     45.13276        1.72E-06  1.265458   1.432502   1.34898    3959862
3321394     -42.07574       1.95E-06  -3.01379   -2.64041   -2.8271    3321361
2368174     43.60399        1.97E-06  4.125353   4.69036    4.407857   2367963
3870412     -41.48994       2.04E-06  -1.85525   -1.62244   -1.73885   3870361
3308528     42.7371         2.12E-06  1.375135   1.567531   1.471333   3308489
3582276     -52.45079       2.15E-06  -4.868     -4.36081   -4.61441   3581637
3538094     -41.90661       2.16E-06  -2.25268   -1.97169   -2.11218   3538087
3442152     40.56574        2.36E-06  3.300372   3.786662   3.543517   3442150
3922533     -39.85965       2.66E-06  -2.60946   -2.26819   -2.43883   3922444
3927281     38.26922        2.79E-06  1.090847   1.26152    1.176183   3927226
2664340     40.73583        2.96E-06  2.27312    2.609973   2.441547   2664332
2712681     -37.69503       3.17E-06  -1.5477    -1.33481   -1.44125   2712632
2692866     -49.65337       3.61E-06  -5.39051   -4.7912    -5.09085   2692816
3581932     -44.69765       4.07E-06  -2.52521   -2.21855   -2.37188   3581637
3595502     -37.43227       4.13E-06  -4.1035    -3.53029   -3.81689   3595441
3595493     -34.27633       4.39E-06  -4.33439   -3.68445   -4.00942   3595441
3432535     33.46881        5.03E-06  1.562017   1.845336   1.703677   3432514
2361299     -33.85212       5.11E-06  -2.29419   -1.94484   -2.11951   2361279
3838099     -32.24923       5.75E-06  -2.79873   -2.3543    -2.57652   3838094
3716864     -32.60492       6.01E-06  -3.26711   -2.7518    -3.00946   3716783
3024034     34.14316        6.13E-06  2.361409   2.785817   2.573613   3024025
3661695     -31.31969       6.21E-06  -1.57125   -1.31533   -1.44329   3661684
3726707     -31.22488       6.32E-06  -5.70935   -4.77663   -5.24299   3726691
2320744     -30.79117       6.86E-06  -2.61282   -2.18002   -2.39642   2320727
3654966     -30.52836       7.17E-06  -3.2377    -2.69696   -2.96733   3654956
3778506     32.03566        7.32E-06  2.483458   2.960449   2.721953   3778504
3738849     -29.4231        7.97E-06  -2.51354   -2.08001   -2.29677   3738842
2639809     -37.8211        8.21E-06  -3.54434   -3.03911   -3.29172   2639734
2402609     29.28454        8.25E-06  1.636783   1.979964   1.808373   2402601
2368177     38.67097        8.41E-06  3.27527    3.80971    3.54249    2367963
2476518     -32.72997       8.74E-06  -7.47151   -6.27858   -6.87505   2476510
2376831     -28.84159       9.00E-06  -2.24658   -1.85133   -2.04895   2376799
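
The count of 37 unique transcript clusters follows from collapsing exon probes that map to the same transcript_id. A small Python sketch of that bookkeeping, using the (exon probe, transcript_id) pairs from the Megakaryopoietic table above; the variable names are illustrative.

```python
from collections import Counter

# (exon probe, transcript_id) pairs from the Megakaryopoietic table.
pairs = [
    (2883456, 2883440), (3996325, 3996306), (2902594, 2902593), (3726702, 3726691),
    (3619789, 3619773), (3595504, 3595441), (3604294, 3604287), (3229353, 3229338),
    (3580603, 3580498), (2527758, 2527747), (3959870, 3959862), (3321394, 3321361),
    (2368174, 2367963), (3870412, 3870361), (3308528, 3308489), (3582276, 3581637),
    (3538094, 3538087), (3442152, 3442150), (3922533, 3922444), (3927281, 3927226),
    (2664340, 2664332), (2712681, 2712632), (2692866, 2692816), (3581932, 3581637),
    (3595502, 3595441), (3595493, 3595441), (3432535, 3432514), (2361299, 2361279),
    (3838099, 3838094), (3716864, 3716783), (3024034, 3024025), (3661695, 3661684),
    (3726707, 3726691), (2320744, 2320727), (3654966, 3654956), (3778506, 3778504),
    (3738849, 3738842), (2639809, 2639734), (2402609, 2402601), (2368177, 2367963),
    (2476518, 2476510), (2376831, 2376799),
]

# Number of significant exon probes per transcript cluster.
per_cluster = Counter(cluster for _, cluster in pairs)
n_probes = len({probe for probe, _ in pairs})
n_clusters = len(per_cluster)

# Clusters hit by more than one significant exon probe.
multi = {cluster: n for cluster, n in per_cluster.items() if n > 1}
```

Forty-two distinct probes collapse to 37 clusters; four clusters (3726691, 3595441, 2367963, 3581637) carry more than one significant probe.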

Results

Significant ASE in Megakaryopoietic vs CD34 Genes & Pathways

DAVID Functional Annotation reveals enrichment for alternative splicing & signaling

ID       Gene Name
3432514  2'-5'-oligoadenylate synthetase 2, 69/71kDa
2883440  ADAM metallopeptidase domain 19
3726691  ATP-binding cassette, sub-family C, member 3
3922444  ATP-binding cassette, sub-family G, member 1
3580498  CDC42 binding protein kinase beta
3619773  INO80 homolog
3308489  KIAA1598
3870361  NLR family, pyrin domain containing 12
2367963  RAB GTPase activating protein 1-like
3716783  RAB11 family interacting protein 4 (class II)
3778504  RAB31, member RAS oncogene family
2402601  UBX domain protein 11
3442150  acrosin binding protein
3927226  amyloid beta (A4) precursor protein
2664332  collagen-like tail subunit of acetylcholinesterase
3538087  dapper, antagonist of beta-catenin, homolog 1
3229338  ficolin (collagen/fibrinogen domain containing) 1
3595441  glutamate receptor, N-methyl D-aspartate-like 1B
3738842  hexosaminidase containing
3581637  immunoglobulin heavy variable 3-30
2376799  inhibitor of kappa light pp enhancer in B-cells, kinase epsilon
2692816  integrin, beta 5
3604287  interleukin 16
2639734  kalirin, RhoGEF kinase
2361279  lamin A/C
2476510  latent transforming gf beta binding protein 1
2902593  lymphocyte antigen 6 complex, locus G6F
3661684  matrix metallopeptidase 2
3024025  mesoderm specific transcript homolog
3654956  nuclear pore complex interacting protein-like 2
3959862  parvalbumin
3996306  ribosomal protein L10; ribosomal protein L10 pseudogene 15
3838094  similar to ferritin, light polypeptide; ferritin, light polypeptide
2527747  solute carrier family 11, member 1
3321361  spondin 1, extracellular matrix protein
2712632  transferrin receptor (p90, CD71)
2320727  tumor necrosis factor receptor superfamily, member 1B

(Several categories not shown but include those for immune system development)

Results

Top Significantly Upregulated ASE in Erythropoietic

P-val < 0.01 & FC > 2 in Erythropoietic; P-val < 0.01 & FC < 1.5 in G & M lineages

Top upregulated exon probe 2527682; cluster id 2527672; gene PKND

Significantly downregulated in Megakaryopoietic lineage
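
The lineage-specific selection rule on this slide (significant and FC > 2 in the target lineage, significant and FC < 1.5 in the other two) can be written as a simple filter. The Python sketch below is an illustration under assumptions: FC is read as a linear ratio versus CD34, and the function name, data layout and probe entries are hypothetical.

```python
from typing import Dict, List, Tuple

Stats = Tuple[float, float]  # (p-value, linear fold change vs CD34)


def lineage_specific_up(
    probe_stats: Dict[str, Dict[str, Stats]], target: str, others: List[str]
) -> List[str]:
    """Probes with p < 0.01 and FC > 2 in target, and p < 0.01 and FC < 1.5 in others."""
    hits = []
    for probe, by_lineage in probe_stats.items():
        p_target, fc_target = by_lineage[target]
        if not (p_target < 0.01 and fc_target > 2):
            continue
        if all(by_lineage[o][0] < 0.01 and by_lineage[o][1] < 1.5 for o in others):
            hits.append(probe)
    return hits


# Hypothetical per-lineage statistics, for illustration only.
probe_stats = {
    "probe_1": {"E": (0.001, 3.2), "G": (0.004, 1.1), "M": (0.002, 0.6)},  # passes
    "probe_2": {"E": (0.001, 2.8), "G": (0.004, 1.8), "M": (0.002, 0.9)},  # G FC too high
    "probe_3": {"E": (0.030, 4.0), "G": (0.004, 1.0), "M": (0.002, 1.0)},  # E p-value too high
}
hits = lineage_specific_up(probe_stats, "E", ["G", "M"])
```

The same function applies to the Granulopoietic and Megakaryopoietic selections by changing `target` and `others`.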

Results

Top Significantly Upregulated ASE in Granulopoietic

P-val < 0.01 & FC > 2 in Granulopoietic; P-val < 0.01 & FC < 1.5 in E & M lineages

Top upregulated exon probes 4016430, 4016431; cluster id 4016428; gene BEX2

Significantly reduced expression versus CD34 in Megakaryopoietic lineage

Significantly reduced expression versus CD34 in Erythropoietic lineage

Results

Top Significantly Upregulated ASE in Megakaryopoietic

P-val < 0.01 & FC > 2 in Megakaryopoietic; P-val < 0.01 & FC < 1.5 in E & G lineages

Top upregulated exon probe 3275248; cluster id 3275132; gene GDI2

Plot panel labels: downregulated, downregulated, upregulated

Discussion

ASE occurs during lineage-specific hematopoietic differentiation of CD34 cells into Erythropoietic, Granulopoietic, and Megakaryopoietic cells

Pathway terms are significantly enriched in alternative splicing and signaling, including those for immune system development, consistent with known biology

Relatively increased ASE in megakaryopoietic differentiation may suggest increased transcriptional complexity during development

Comparison with the original results of Liu et al (2011) shows a few shared top hits and similar pathway enrichment

• However, most top genes were not identical, probably because of their use of the ExonSVD model for statistical assessment of exon enrichment/depletion

High number of significant hits at p < 0.01 may indicate high FDR and may warrant further filtering and dimensionality reduction

• May be interesting to combine MiDAS and Rank Product for testing and correction
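
For reference, the rank product statistic mentioned above is the geometric mean of a probe's ranks across replicates. The Python sketch below shows only that statistic, on hypothetical fold changes. It does not include the permutation-based significance estimate that normally accompanies it, does not handle ties specially, and does not cover MiDAS; the function name and data layout are illustrative.

```python
import math
from typing import Dict, List


def rank_product(fold_changes: Dict[str, List[float]]) -> Dict[str, float]:
    """Geometric mean of each probe's rank across replicates.

    Rank 1 is the largest fold change within a replicate, so a small rank
    product marks probes that are consistently near the top.
    """
    probes = list(fold_changes)
    n_reps = len(next(iter(fold_changes.values())))
    log_sum = {p: 0.0 for p in probes}
    for rep in range(n_reps):
        ordered = sorted(probes, key=lambda p: fold_changes[p][rep], reverse=True)
        for rank, probe in enumerate(ordered, start=1):
            log_sum[probe] += math.log(rank)
    return {p: math.exp(log_sum[p] / n_reps) for p in probes}


# Hypothetical fold changes for three probes in three replicates.
fold_changes = {
    "probe_a": [3.0, 2.5, 2.8],  # ranks 1, 2, 1
    "probe_b": [1.0, 2.6, 1.1],  # ranks 2, 1, 2
    "probe_c": [0.5, 0.4, 0.6],  # ranks 3, 3, 3
}
rp = rank_product(fold_changes)
```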

References

Della Beffa et al (2008) Dissecting an alternative splicing analysis workflow for GeneChip Exon 1.0 ST Affymetrix arrays. BMC Genomics 9:571, PMID: 19040723

EBI (2012) Ensembl Genome Browser, release 68. <online> Available at: <http://useast.ensembl.org/index.html> [Accessed 8/20/12]

Higgs, B (2012) Advanced Genomics & Genetics Analysis, Lecture 2: Analysis and interpretation of splice variants. Johns Hopkins University, unpublished

Liu et al (2011) Transcriptome Profiling and Sequencing of Differentiated Human Hematopoietic Stem Cells Reveal Lineage Specific Expression and Alternative Splicing of Genes. Physiol Genomics 43(20):1117-34, PMID: 21828245

NIAID/NIH (2012) DAVID Bioinformatics Resources 6.7: Functional Annotation Tool. <online> Available at: <http://david.abcc.ncifcrf.gov/tools.jsp> [Accessed 8/20/12]