Analysis of Experiment E-GEOD-29989: Alternative Splicing of Exons in Human Hematopoietic Stem Cells
The Johns Hopkins University
Advanced Genomics & Genetics Analysis, AS410.713.81.SU12
Phillip Woolwine
Slide 2
Outline
Alternative splicing of exons (ASE)
ASE probed w/ Affymetrix GeneChip Exon array
Data analyzed w/ Affymetrix Power Tools (APT), R & Bioconductor
ASE during lineage-specific hematopoietic differentiation
Slide 3
Background
ASE leads to mRNAs that can have similar or different functions/products and is a basis for functional diversity in gene expression
Includes exon skipping, mutually exclusive exons, alternative 5' donor sites, alternative 3' acceptor sites, and/or intron retention
ASE is believed to be a major player in lineage-specific differentiation of blood cells
Aberrant ASE can lead to leukemias, lymphomas, etc.
Understanding ASE events and the exome profile will improve understanding of disease pathogenesis and aid in therapies
Slide 4
Methods
Data retrieved from Experiment E-GEOD-29989 by Liu et al (2011)
Affymetrix GeneChip Human Exon 1.0 ST array
Transcriptional profile of lineage-specific differentiation of CD34 cells into Erythropoietic (E), Granulopoietic (G), and Megakaryopoietic (M) cells
Data normalized with APT 1.14.2
PCA outlier analysis in R and Bioconductor
Probes removed when DABG p-val > 0.05 in >50% of classes
T-test used to determine exon enrichment/depletion by p-val & fold change (FC)
Evidence for alternative splicing verified in Ensembl
Top exon probes mapped to genes and differential expression plotted (green plots are CD34, red plots are lineage-specific)
DAVID pathway analysis
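The filtering and testing steps listed above can be sketched roughly as follows. This is a minimal Python illustration, not the APT/R code used in the analysis. The function names and toy values are hypothetical, the DABG rule is read as "drop a probe when its DABG p-value exceeds 0.05 in more than half of the classes", and Welch's variant of the t statistic is an assumption (the slide only says "T-test").

```python
import math
from statistics import mean, variance


def passes_dabg(dabg_by_class, alpha=0.05, max_fraction=0.5):
    """Keep a probe unless its DABG p-value exceeds alpha in more than
    max_fraction of the classes (assumed reading of the filter)."""
    undetected = sum(1 for p in dabg_by_class.values() if p > alpha)
    return undetected / len(dabg_by_class) <= max_fraction


def welch_t(a, b):
    """Welch's t statistic for two groups of log2 intensities."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def log2_fc(lineage, cd34):
    """Difference of mean log2 intensities (lineage minus CD34)."""
    return mean(lineage) - mean(cd34)


# Hypothetical probe: DABG p-value per class and log2 intensities per array.
dabg = {"CD34": 0.01, "E": 0.02, "G": 0.20, "M": 0.03}
cd34 = [8.1, 8.0, 8.2]
ery = [6.0, 5.9, 6.1]

keep = passes_dabg(dabg)     # detected in 3 of 4 classes, so the probe is kept
t_stat = welch_t(ery, cd34)  # negative: lower signal in E than in CD34
fc = log2_fc(ery, cd34)      # log2 fold change, E versus CD34
```

A negative statistic and negative log2 FC here would correspond to exon depletion in the lineage relative to CD34; a p-value would additionally require a t-distribution, which is left out of this sketch.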
Slide 5
Results
PCA of exon probes
Exon arrays are clustered along cell lineages; no outliers
G & E lineages are more similar to each other than to the M lineage
Slide 6
Results
Scree plot of exon probes
Approx. 70% of variability is explained by the first two eigenvectors
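For readers unfamiliar with how a scree plot is read, the fraction of variability explained by each eigenvector is its eigenvalue divided by the sum of all eigenvalues. The sketch below illustrates that arithmetic; the eigenvalues are hypothetical and were chosen only so that the first two add up to 70%, they are not the values from this experiment.

```python
def explained_variance(eigenvalues):
    """Fraction of total variance carried by each eigenvector."""
    total = sum(eigenvalues)
    return [ev / total for ev in eigenvalues]


def cumulative(fractions):
    """Running sum of the per-component fractions."""
    out, running = [], 0.0
    for f in fractions:
        running += f
        out.append(running)
    return out


# Hypothetical eigenvalues, largest first (not taken from the slides).
eigenvalues = [45.0, 25.0, 12.0, 8.0, 6.0, 4.0]
fractions = explained_variance(eigenvalues)
cum = cumulative(fractions)

for i, (f, c) in enumerate(zip(fractions, cum), start=1):
    print(f"PC{i}: {f:.0%} of variance, {c:.0%} cumulative")
```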
Slide 7
Results
Filtering and Testing Correction
Filter on DABG of all class types removed ~83,000 low-intensity probes
mRNA filters may be more sensitive in some cases (Table 1, Della Beffa et al, 2008)
Other filtering included using core probesets in APT (core.ps, core.mps)
197,245 probes remained after statistical tests
Potential for false positives (FP) because multiple testing correction was not performed
At the p-val cutoff and FC > |1.5| there were 6413 hits in E, 3316 in G, 9638 in M; possibly high FP
It can be argued that FWER is too conservative for the high dimensionality of exon data; the tests are not necessarily independent, and non-significant p-values are not necessarily uniformly distributed, so FDR may not be appropriate (Della Beffa et al, 2008)
Proper pre-filtering of probes and true splice sites is a better strategy to limit FP
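To give a sense of scale for the false-positive concern, the sketch below works through the arithmetic for 197,245 tests. It assumes the p < 0.01 cutoff mentioned in the Discussion, and it shows what a Bonferroni (FWER) threshold and a Benjamini-Hochberg (FDR) step-up procedure would look like. Neither correction was applied in this analysis; the function names and the toy p-values are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of null tests passing an uncorrected cutoff."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test cutoff that controls the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(pvalues, q=0.05):
    """Indices of hypotheses rejected by the BH step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its BH line.
        if pvalues[idx] <= rank / m * q:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


n_probes = 197245                               # probes remaining (from the slide)
fp = expected_false_positives(n_probes, 0.01)   # roughly 1972 if every test were null
bonf = bonferroni_threshold(n_probes)           # roughly 2.5e-07

# Toy p-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(pvals)
```

The expected count is an upper-bound style estimate under the assumption that all tests are null and independent, which the slide itself notes may not hold for exon data.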
Slide 8
Results
One Significant ASE Transcript Cluster Common to All 3 Lineages
Dimensionality reduction at the p-val cutoff and FC > |2|
IGHA2: immunoglobulin heavy constant alpha 2
Slide 9
Results
Significant ASE in Erythropoietic vs CD34
Dimensionality reduction at the p-val cutoff and FC > |2|
20 unique transcript clusters ordered by p-val

exon probe  test_statistic  pv        lower_ci   upper_ci   log2_fc    transcript_id
3764725     -94.66201       1.85E-07  -1.3478    -1.2689    -1.30835   3764680
2522514     -89.20705       3.30E-07  -2.17621   -2.03966   -2.10793   2522509
2949661     45.40627        1.41E-06  2.53127    2.86105    2.69616    2949622
3362331     45.16613        1.70E-06  2.10366    2.38106    2.24236    3362263
3834504     59.227          1.97E-06  2.530274   2.793166   2.66172    3834502
2350929     -41.35524       2.12E-06  -1.83529   -1.60407   -1.71968   2350922
3081719     -65.99954       2.88E-06  -2.81293   -2.56643   -2.68968   3081707
2500292     -43.56399       2.97E-06  -1.2957    -1.13712   -1.21641   2500275
3932139     -52.20334       3.03E-06  -2.43312   -2.17509   -2.3041    3932131
2340510     -38.16035       3.16E-06  -2.35865   -2.03735   -2.198     2340433
3841868     -40.8034        3.36E-06  -2.29577   -1.99845   -2.14711   3841862
3568540     -33.24723       6.05E-06  -2.36948   -2.00122   -2.18535   3568534
3724580     -31.96947       6.19E-06  -3.6351    -3.05234   -3.34372   3724545
3145161     46.62158        6.92E-06  2.344684   2.665209   2.504947   3145149
2333140     -34.85999       7.10E-06  -3.23725   -2.74898   -2.99311   2333136
3841587     -36.89025       7.10E-06  -3.64063   -3.11492   -3.37777   3841574
3581927     -52.92897       7.97E-06  -2.68494   -2.39052   -2.53773   3581637
2340357     -32.72331       8.60E-06  -3.21971   -2.7059    -2.9628    2340350
3079216     33.57286        8.97E-06  1.691841   2.006399   1.84912    3079202
3326951     -62.76521       9.98E-06  -2.025     -1.8283    -1.92665   3326950
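The log2_fc column is on a log2 scale, so a reader may want the corresponding linear fold change. The sketch below converts two values taken from the table above; the helper name and the signed-fold-change convention (negative sign for depletion) are assumptions for illustration, not something the slides specify.

```python
def signed_fold_change(log2_fc):
    """Linear fold change with the sign kept to mark direction."""
    linear = 2 ** abs(log2_fc)
    return linear if log2_fc >= 0 else -linear


# Two exon probes and their log2_fc values from the table.
log2_values = {3764725: -1.30835, 2949661: 2.69616}
linear_values = {probe: signed_fold_change(v) for probe, v in log2_values.items()}

for probe, fc in linear_values.items():
    print(probe, round(fc, 2))
```

Under this convention a log2_fc of about -1.31 is roughly a 2.5-fold decrease and a log2_fc of about 2.70 is roughly a 6.5-fold increase, both beyond a linear cutoff of 2.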
Slide 10
Results
Significant ASE in Erythropoietic vs CD34: Genes and Pathways
DAVID Functional Annotation reveals enrichment for alternative splicing

ID       Gene Name
2500275  BCL2-like 11
3834502  CD79a molecule, immunoglob-assoc a
3362263  DENN/MADD domain containing 5A
2340350  DnaJ (Hsp40) homolog, subfam C, mem 6
3841862  Fc fragment of IgA, receptor for
2522509  NIF3 NGG1 interacting factor 3-like 1
2333136  cell division cycle 20 homolog
2350922  glutathione S-transferase mu 1
3581637  immunoglobulin heavy variable 3-30
3724545  integrin, beta 3
2340433  leptin receptor
3841574  leukocyte immunoglob-like receptor, B, 1
3326950  ld lipoprotein receptor class A dom con 3
3081707  motor neuron and pancreas homeobox 1
3079202  K+ voltage-gated channel, subfam H, 2
3932131  proteasome assembly chaperone 1
3568534  spectrin, beta, erythrocytic
2949622  tenascin XB; tenascin XA pseudogene
3764680  tripartite motif-containing 37
3145149  tumor protein p53 induc nuclear prot 1
Slide 11
Results
Significant ASE in Granulopoietic vs CD34
Dimensionality reduction at the p-val cutoff and FC > |2|
9 unique transcript clusters ordered by p-val

exon probe  test_statistic  pv        lower_ci   upper_ci   log2_fc    transcript_id
3662846     71.18055        2.92E-07  2.480959   2.683834   2.582397   3662808
3835043     -43.71178       1.69E-06  -6.9581    -6.12603   -6.54206   3835035
3837162     39.51809        2.52E-06  4.016632   4.624342   4.320487   3837132
3582003     37.10581        3.39E-06  0.961212   1.117175   1.039193   3581637
2874427     36.01181        3.62E-06  1.731912   2.021508   1.87671    2874371
3304629     32.80535        8.14E-06  2.687296   3.195017   2.941157   3304624
3815226     -29.11174       8.47E-06  -3.5762    -2.95292   -3.26456   3815223
3959353     -30.37461       9.42E-06  -1.98261   -1.64655   -1.81458   3959350
3761306     55.43858        9.59E-06  1.783076   1.996257   1.889667   3761291
Slide 12
Results
Significant ASE in Granulopoietic vs CD34: Genes & Pathways
DAVID Functional Annotation reveals significant enrichment for signaling
Clusters include alternative splicing

ID       Gene Name
3304624  5'-nucleotidase, cytosolic II
3835035  CD177 molecule
3662808  G protein-coupled receptor 56
3837132  SUMO1 activating enzyme subunit 1
3959350  apolipoprotein L, 3
2874371  fibrillin 2
3761291  homeobox B2
3581637  immunoglobulin heavy variable 3-30
3815223  proteinase 3
Slide 14
Results
Significant ASE in Megakaryopoietic vs CD34: Genes & Pathways
DAVID Functional Annotation reveals enrichment for alternative splicing & signaling
(Several categories not shown but include those for immune system development)

ID       Gene Name
3432514  2'-5'-oligoadenylate synthetase 2, 69/71kDa
2883440  ADAM metallopeptidase domain 19
3726691  ATP-binding cassette, sub-family C, member 3
3922444  ATP-binding cassette, sub-family G, member 1
3580498  CDC42 binding protein kinase beta
3619773  INO80 homolog
3308489  KIAA1598
3870361  NLR family, pyrin domain containing 12
2367963  RAB GTPase activating protein 1-like
3716783  RAB11 family interacting protein 4 (class II)
3778504  RAB31, member RAS oncogene family
2402601  UBX domain protein 11
3442150  acrosin binding protein
3927226  amyloid beta (A4) precursor protein
2664332  collagen-like tail subunit of acetylcholinesterase
3538087  dapper, antagonist of beta-catenin, homolog 1
3229338  ficolin (collagen/fibrinogen domain containing) 1
3595441  glutamate receptor, N-methyl D-aspartate-like 1B
3738842  hexosaminidase containing
3581637  immunoglobulin heavy variable 3-30
2376799  inhibitor of kappa light pp enhancer in B-cells, kinase epsilon
2692816  integrin, beta 5
3604287  interleukin 16
2639734  kalirin, RhoGEF kinase
2361279  lamin A/C
2476510  latent transforming gf beta binding protein 1
2902593  lymphocyte antigen 6 complex, locus G6F
3661684  matrix metallopeptidase 2
3024025  mesoderm specific transcript homolog
3654956  nuclear pore complex interacting protein-like 2
3959862  parvalbumin
3996306  ribosomal protein L10; ribosomal protein L10 pseudogene 15
3838094  similar to ferritin, light polypeptide; ferritin, light polypeptide
2527747  solute carrier family 11, member 1
3321361  spondin 1, extracellular matrix protein
2712632  transferrin receptor (p90, CD71)
2320727  tumor necrosis factor receptor superfamily, member 1B
Slide 15
Results
Top Significantly Upregulated ASE in Erythropoietic
Cutoffs: p-val cutoff and FC > 2 in Erythropoietic; p-val < 0.01 & FC < 1.5 in G & M lineages
Top upregulated exon probe 2527682; cluster id 2527672; gene PKND
Significantly downregulated in Megakaryopoietic lineage
Slide 16
Results
Top Significantly Upregulated ASE in Granulopoietic
Cutoffs: p-val cutoff and FC > 2 in Granulopoietic; p-val < 0.01 & FC < 1.5 in E & M lineages
Top upregulated exon probes 4016430, 4016431; cluster id 4016428; gene BEX2
Significantly reduced expression versus CD34 in Megakaryopoietic lineage
Significantly reduced expression versus CD34 in Erythropoietic lineage
Slide 17
Results
Top Significantly Upregulated ASE in Megakaryopoietic
Cutoffs: p-val cutoff and FC > 2 in Megakaryopoietic; p-val < 0.01 & FC < 1.5 in E & G lineages
Top upregulated exon probe 3275248; cluster id 3275132; gene GDI2
Plot annotations: downregulated, upregulated
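The lineage-specific selection used on the last three slides can be sketched as a simple two-part test per exon probe: strongly upregulated in the target lineage, and significant with FC < 1.5 in the other two. The code below is an illustration only. The data layout, the probe names, and the function name are hypothetical, fold changes are assumed to be signed linear values, and the target-lineage p-value cutoff (`p_target`) is a placeholder because that value is not legible in the slides.

```python
def is_lineage_specific(probe_stats, target, others, p_target,
                        fc_target=2.0, p_other=0.01, fc_other=1.5):
    """True if a probe is up in the target lineage (p < p_target, FC > fc_target)
    and significant with FC < fc_other in every other lineage."""
    p, fc = probe_stats[target]
    if not (p < p_target and fc > fc_target):
        return False
    return all(
        probe_stats[o][0] < p_other and probe_stats[o][1] < fc_other
        for o in others
    )


# Hypothetical probes: lineage -> (p-value, signed linear FC versus CD34).
probes = {
    "probe_A": {"E": (1e-6, 3.2), "G": (0.004, 1.1), "M": (0.002, -1.8)},
    "probe_B": {"E": (1e-6, 2.9), "G": (0.003, 2.4), "M": (0.2, 1.0)},
}

# p_target is a placeholder value, not the cutoff used in the analysis.
hits = [
    name for name, stats in probes.items()
    if is_lineage_specific(stats, "E", ["G", "M"], p_target=1e-5)
]
```

Here only probe_A is selected: probe_B is also upregulated in G, so it is not specific to the erythropoietic lineage.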
Slide 18
Discussion
ASE occurs during lineage-specific hematopoietic differentiation of CD34 cells into Erythropoietic, Granulopoietic, and Megakaryopoietic cells
Annotation terms for alternative splicing and signaling are significantly enriched, including those for immune system development, consistent with known biology
Relatively increased ASE in megakaryopoietic differentiation may suggest increased transcriptional complexity during development
This analysis shares a few top hits and similar pathway enrichment with the original results of Liu et al (2011)
However, most top genes were not identical, probably because Liu et al used the ExonSVD model for statistical assessment of exon enrichment/depletion
The high number of significant hits at p < 0.01 may indicate a high FDR and may warrant further filtering and dimensionality reduction
It may be interesting to combine MiDAS and Rank Product for testing and correction
Slide 19
References
Della Beffa et al (2008) Dissecting an alternative splicing analysis workflow for GeneChip Exon 1.0 ST Affymetrix arrays. BMC Genomics 9:571, PMID: 19040723
EBI (2012) Ensembl Genome Browser, release 68. Available at: [Accessed 8/20/12]
Higgs, B (2012) Advanced Genomics & Genetics Analysis, Lecture 2: Analysis and interpretation of splice variants. Johns Hopkins University, unpublished
Liu et al (2011) Transcriptome Profiling and Sequencing of Differentiated Human Hematopoietic Stem Cells Reveal Lineage Specific Expression and Alternative Splicing of Genes. Physiol Genomics 43(20):1117-34, PMID: 21828245
NIAID/NIH (2012) DAVID Bioinformatics Resources 6.7: Functional Annotation Tool. Available at: [Accessed 8/20/12]