Paper Review on Cross-species Microarray Comparison, Hong Lu, 2008-10-14

Post on 18-Jan-2016


Paper Review on Cross-species Microarray Comparison

Hong Lu

2008-10-14

Title: Conservation of Regional Gene Expression in Mouse and Human Brain

Authors: Strand AD, Olson JM, et al.

Year: 2007

Journal: PLoS Genetics

Purpose

In-species comparison: to find the expression differences that distinguish resistant from sensitive tissues and cell types.

Cross-species comparison: to provide a framework for exploring how well the mouse can model diseases of the human brain.

Human                 Group I                                 Group II                  Total
Tissues               3: caudate, cerebellum, motor cortex    2: caudate, cerebellum
Persons     Men       8                                       7                         15
            Women     4                                       2                         6
            Total     12                                      9                         21
Total slides          12 x 3 = 36                             9 x 2 = 18                54
Age (years) Range     36 ~ 77                                 22 ~ 72                   22 ~ 77
            Mean      58                                      49                        54
Affymetrix array      HG-U133A
Probesets #           22,283

Materials

Species               Human                                   Mouse (C57BL)
Tissues               3: caudate, cerebellum, motor cortex    3: caudate, cerebellum, motor cortex
Sample      Male      8                                       1
            Female    4                                       5
            Total     12                                      6
Total slides          12 x 3 = 36                             6 x 3 = 18
Age         Range     36 ~ 77 (years)                         35 (days)
            Mean      58 (years)                              35 (days)
Affymetrix array      HG-U133A                                MOE_430A_2
Probesets #           22,283                                  22,690

Microarray analysis

1) Normalize the CEL files with Robust Multi-array Average (RMA).

2) Fit a linear model for each of the three tissue pairs with LIMMA (Bioconductor package):

gene expression ≈ donor + tissue type

• Caudate/Cerebellum
• BA4 Cortex/Cerebellum
• BA4 Cortex/Caudate

3) Get log ratios, paired t-statistics and p-values.
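For a comparison of two tissues taken from the same donors, the donor + tissue model amounts to a paired test on the per-donor log2 differences. The following is a minimal Python sketch of step 3 for a single probeset, assuming an ordinary (unmoderated) paired t-statistic; it is not LIMMA itself, and the function name and signal values are hypothetical. The p-value would then come from a t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def paired_log_ratio_t(region_a, region_b):
    """Return (mean log ratio, paired t-statistic) for one probeset.

    region_a and region_b hold log2 RMA signals measured in the same
    donors, listed in the same donor order.
    """
    if len(region_a) != len(region_b) or len(region_a) < 2:
        raise ValueError("need at least two matched donors")
    # Per-donor differences of log2 signals, i.e. log2(A/B) per donor
    diffs = [a - b for a, b in zip(region_a, region_b)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err


# Hypothetical log2 signals for three donors (not data from the paper)
caudate = [10.0, 11.0, 12.5]
cerebellum = [8.0, 8.5, 10.0]
log_ratio, t_stat = paired_log_ratio_t(caudate, cerebellum)
```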

Sample result (human)

Score                                                   Caudate/Cerebellum                …
Caudate    Cerebellum    Motor cortex    Probeset ID    Log Ratio    t       P.value      …
106.05     -89.15        -16.9           215241_at      6.08         65.1    1.65E-21     …
103.2      -62.01        -41.19          220313_at      5.95         71.9    3.13E-22     …
93.7       -51.66        -42.04          207307_at      5.04         71.9    3.16E-22     …

Caudate score = t-score(Caudate/Cerebellum) + t-score(Caudate/BA4 Cortex)
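As a worked check of the score definition, the sketch below sums the two relevant t-statistics for each region. It assumes the cerebellum and motor cortex scores are defined by analogy with the caudate score and that reversing a comparison flips the sign of its t-statistic. Only t(Caudate/Cerebellum) = 65.1 is given for probeset 215241_at; the other two t-values used here are back-calculated from the listed scores under those assumptions and are not reported in the slides.

```python
def regional_scores(t_cau_cb, t_ba4_cb, t_ba4_cau):
    """Sum the two relevant paired t-statistics for each region.

    Each argument is the t-statistic of the named comparison
    (numerator/denominator); the reversed comparison is its negative.
    """
    return {
        "caudate": t_cau_cb + (-t_ba4_cau),        # C/Cb + C/BA4
        "cerebellum": (-t_cau_cb) + (-t_ba4_cb),   # Cb/C + Cb/BA4
        "ba4_cortex": t_ba4_cb + t_ba4_cau,        # BA4/Cb + BA4/C
    }


# 65.1 is from the sample table; 24.05 and -40.95 are back-calculated
# assumptions, chosen so that the three listed scores are reproduced.
scores = regional_scores(t_cau_cb=65.1, t_ba4_cb=24.05, t_ba4_cau=-40.95)
```

With these inputs the three scores agree with the first table row (106.05, -89.15, -16.9), and by construction the three regional scores always sum to zero.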

Different Regions of the Brain Show Many Statistically Significant Differentially Expressed Genes

To select sets of genes whose expression was highly enriched in one of the three regions:

1) Require p < 0.001 and log ratio ≥ 1 in both relevant pair-wise comparisons.

2) Sum the log ratios of the two relevant comparisons; for example, probesets with a high log2(BA4/caudate) + log2(BA4/cerebellum) are candidate BA4 genes.

3) Order the probesets by the sum of log ratios.

4) Cull probesets whose summed regional score is > 2 in more than one region.

[Figure: the three regions compared: Caudate, Cerebellum, BA4 Cortex]
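The four selection steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the dictionary layout, the function names and the example values are hypothetical, and the log ratio of a reversed comparison is taken to be the negative of the original.

```python
REGIONS = ("caudate", "cerebellum", "ba4")


def log_ratio(lr, num, den):
    """Look up log2(num/den); a reversed pair is the negated value."""
    return lr[(num, den)] if (num, den) in lr else -lr[(den, num)]


def p_value(pv, a, b):
    """Look up the p-value of a comparison regardless of its direction."""
    return pv[(a, b)] if (a, b) in pv else pv[(b, a)]


def enriched_region(lr, pv, p_cut=0.001, lr_cut=1.0, cull_cut=2.0):
    """Return (region, summed log ratio) for an enriched probeset, else None."""
    sums, passing = {}, []
    for region in REGIONS:
        others = [r for r in REGIONS if r != region]
        ratios = [log_ratio(lr, region, o) for o in others]
        sums[region] = sum(ratios)                      # step 2
        significant = all(p_value(pv, region, o) < p_cut for o in others)
        if significant and all(x >= lr_cut for x in ratios):
            passing.append(region)                      # step 1
    if sum(s > cull_cut for s in sums.values()) > 1:    # step 4
        return None
    if not passing:
        return None
    return passing[0], sums[passing[0]]


# Hypothetical probesets (values are not from the paper)
pv = {("caudate", "cerebellum"): 1e-9, ("ba4", "cerebellum"): 1e-6,
      ("ba4", "caudate"): 1e-8}
kept = enriched_region({("caudate", "cerebellum"): 6.08,
                        ("ba4", "cerebellum"): 2.0,
                        ("ba4", "caudate"): -4.08}, pv)
culled = enriched_region({("caudate", "cerebellum"): 5.0,
                          ("ba4", "cerebellum"): 4.0,
                          ("ba4", "caudate"): -1.0}, pv)
```

Step 3 would then be a descending sort of the kept probesets by their summed log ratio.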

Table 3: Selected Regionally Enriched Genes in Human and Mouse Brain Tissues

Gene Expression Variation between Tissues and Individuals

gene expression ≈ donor + tissue type

Within-tissue variance VS Between-tissue variance

The variance for a probeset was calculated across n samples from the values x_i, where x_i is the RMA signal for the probeset on array i (i = 1, ..., n).
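The variance formula itself is not preserved in this transcript. Assuming the usual unbiased sample variance was meant (the n - 1 denominator is an assumption), it would read:

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2,
\qquad
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
```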

The between-tissue variability was greater for 89% of the human probesets and 85% of the mouse probesets.
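One way to make this comparison concrete for a single probeset is sketched below. The slides do not say exactly how the two variances were defined, so this assumes that within-tissue variance is the mean of the per-tissue sample variances and that between-tissue variance is the sample variance of the tissue means. The function name and the signals are hypothetical.

```python
import statistics


def within_between_variance(signals_by_tissue):
    """Return (within-tissue, between-tissue) variance for one probeset.

    signals_by_tissue maps a tissue name to the list of RMA signals
    measured for that probeset in that tissue.
    """
    # Within: average of the sample variances computed inside each tissue
    within = statistics.mean(
        [statistics.variance(v) for v in signals_by_tissue.values()]
    )
    # Between: sample variance of the per-tissue mean signals
    tissue_means = [statistics.mean(v) for v in signals_by_tissue.values()]
    between = statistics.variance(tissue_means)
    return within, between


# Hypothetical RMA signals (not data from the paper)
within, between = within_between_variance({
    "caudate": [10.0, 11.0, 12.0],
    "cerebellum": [4.0, 5.0, 6.0],
    "ba4_cortex": [7.0, 8.0, 9.0],
})
```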

Conclusion: Compared to expression dictated by regional identity, age and gender appear to have either effects of small magnitude or effects of large magnitude on only a small fraction of genes, even in humans.

Cross-Species Comparison of Regional Gene Expression

What is the relationship between mouse probesets and human probesets?

Probesets of the two species are linked through ENSEMBL:

Mouse probesets (example: 1415688_at) → mouse ENSEMBL identities

Human probesets (example: 209141_at) → human ENSEMBL identities

dN/dS

dN: number of nonsynonymous substitutions / number of nonsynonymous sites

dS: number of synonymous substitutions / number of synonymous sites

dN/dS was estimated with codeml (PAML package, pair-wise maximum likelihood method) using the F3×4 codon model.
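The two definitions can be illustrated with a small Python sketch. Note the assumptions: this uses raw counts and applies no correction for multiple substitutions at a site, whereas codeml estimates dN and dS by maximum likelihood, so the numbers would generally differ. The function name and the counts are hypothetical.

```python
def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Return (dN, dS, dN/dS) from raw substitution and site counts."""
    d_n = nonsyn_subs / nonsyn_sites   # nonsynonymous rate per site
    d_s = syn_subs / syn_sites         # synonymous rate per site
    if d_s == 0:
        raise ValueError("dS is zero, so dN/dS is undefined")
    return d_n, d_s, d_n / d_s


# Hypothetical counts for one ortholog pair (not data from the paper)
d_n, d_s, omega = dn_ds(nonsyn_subs=6, nonsyn_sites=600,
                        syn_subs=20, syn_sites=200)
```

A ratio well below 1, as in this example, is the pattern expected for a coding sequence under strong purifying selection.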

Select 2,998 one-to-one orthologous pairs.
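Keeping only one-to-one pairs means dropping any gene that has more than one partner in the other species. A minimal Python sketch of that filter is given below, assuming the orthology links are already available as (mouse gene, human gene) tuples; the identifiers are hypothetical placeholders, and the probeset-to-gene mapping step is omitted.

```python
from collections import defaultdict


def one_to_one_pairs(ortholog_links):
    """Return sorted (mouse, human) pairs in which each gene has one partner."""
    mouse_to_human = defaultdict(set)
    human_to_mouse = defaultdict(set)
    for mouse_gene, human_gene in ortholog_links:
        mouse_to_human[mouse_gene].add(human_gene)
        human_to_mouse[human_gene].add(mouse_gene)
    pairs = []
    for mouse_gene, humans in mouse_to_human.items():
        if len(humans) != 1:
            continue                     # mouse gene has several partners
        (human_gene,) = humans
        if len(human_to_mouse[human_gene]) == 1:
            pairs.append((mouse_gene, human_gene))
    return sorted(pairs)


# Hypothetical identifiers, for illustration only
links = [("mA", "hA"), ("mB", "hB"), ("mB", "hC"), ("mD", "hE"), ("mF", "hE")]
pairs = one_to_one_pairs(links)
```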

Compute the normalized Euclidean distance between all possible non-self pairs of tissues, where there are g probesets and x and y are any two mouse or human samples. Euclidean distances between regions were calculated using the mean RMA probeset signals for each tissue.
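The distance formula is not preserved in this transcript. The Python sketch below assumes that "normalized" means dividing the sum of squared differences by the number of probesets g before taking the square root; the exact normalization used in the paper may differ. The function name and the values are hypothetical.

```python
import math


def normalized_euclidean(x, y):
    """Distance between two samples, each a list of g probeset signals.

    Assumption: the sum of squared differences is divided by g, so the
    result is the root-mean-square difference per probeset.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    g = len(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / g)


# Hypothetical mean RMA signals for two tissues over four probesets
dist = normalized_euclidean([0.0, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0])
```

Dividing by g keeps distances comparable when different numbers of probesets are used.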

Conclusion: Orthologous Brain Regions between Species Are More Similar to Each Other than to Different Regions within a Species

Analysis of GO categories

Human: 70.6% of the probesets had an assigned GO category.

Mouse: 66.2% of the probesets had an assigned GO category.

For each GO category, compare

(a) the total number of probes in that category

VS

(b) the number of probes in that category appearing on a list of differentially expressed probes (p < 0.05)

Test: Fisher's exact test if a or b < 10; Pearson chi-square otherwise.

Goal: to detect which categories are over-represented.
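For the small-count case, an over-representation test can be sketched in Python with the hypergeometric distribution, which is what a one-sided Fisher's exact test evaluates. The way the 2 x 2 table is built here (all annotated probes, probes in the category, probes on the list, and their overlap) and the one-sided alternative are assumptions, since the slides only give the counts (a) and (b). The function name and the numbers are hypothetical.

```python
from math import comb


def fisher_overrepresentation_p(total, in_category, on_list, overlap):
    """One-sided Fisher's exact p-value, P(X >= overlap).

    total       : all annotated probes
    in_category : probes annotated to the GO category
    on_list     : probes on the differentially expressed list
    overlap     : probes that are both in the category and on the list
    """
    upper = min(in_category, on_list)
    denominator = comb(total, on_list)
    numerator = sum(
        comb(in_category, k) * comb(total - in_category, on_list - k)
        for k in range(overlap, upper + 1)
    )
    return numerator / denominator


# Hypothetical counts: 10 probes, 5 in the category, 5 on the list,
# and all 5 listed probes fall in the category
p_small = fisher_overrepresentation_p(10, 5, 5, 5)
```

A small p-value indicates that the category holds more differentially expressed probes than expected by chance.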

Conclusion: Mouse and Human Brain Regions Share a Higher Number of Overrepresented Functional Groups than Would Be Expected by Chance

Relationships between Tissue-Specific Expression, Conservation of Sequence, and Conservation of Expression

(A) X-axis: dN/dS ratios, least conserved (left) to most conserved (right). Y-axis: Correlation coefficient between human and mouse log ratios.

(B) X-axis: The percent nucleotide identity, low (left) to high (right). Y-axis: Correlation coefficient between human and mouse log ratios.
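The Y-axis quantity in both panels can be sketched as a correlation over orthologous probeset pairs. The slides say only "correlation coefficient", so Pearson's r is an assumption here, and the grouping of genes along the X-axis is not shown. The function name and the values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of log ratios."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log ratios for three orthologous pairs
human_log_ratios = [1.0, 2.0, 3.0]
mouse_log_ratios = [2.0, 4.0, 6.0]
r = pearson_r(human_log_ratios, mouse_log_ratios)
```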

Conclusion: Genes with High Variance across Tissues Have Greater Conservation of Nucleotide Sequence

Conclusion

1) In-species comparison: Different brain regions have distinctly different expression profiles.

2) Cross-species comparison: Region-specific genes are conserved at both the sequence and gene expression levels (positively correlated).

Advantages and shortcomings?

Thanks