Functional Genomics of Fern Gametophytes: Transcriptome Sequencing in Pteridium aquilinum.

Joshua Der, Michael Barker, Norman Wickett, Claude dePamphilis and Paul Wolf


http://www.intl-pag.org/18/abstracts/W54_PAGXVIII_396.html

As the sister lineage to seed plants, ferns (i.e. monilophytes) are an important clade for comparative evolutionary studies of land plants. Additionally, with the evolution and maintenance of free-living and photosynthetic gametophyte and sporophyte life stages, ferns are an ideal group for studies of both life-cycle evolution in land plants and genome function in haploid and diploid phases. The development of genomic resources in ferns lags far behind that in other plants, due primarily to large genome sizes and the absence of economic crop species. High-throughput sequencing technologies have now enabled genome-scale studies in non-model organisms. We present an analysis of the gametophyte transcriptome of the bracken fern, Pteridium aquilinum. A full-length enriched, normalized cDNA library was generated with RNA derived from a pool of sexually mature male, female, and hermaphroditic gametophytes and sequenced with the Roche 454 GS FLX Titanium chemistry. A total of 681,722 reads with a mean length of 372.6 bp remained after quality filtering, repeat masking, and primer/vector screening. Cleaned reads were assembled de novo, resulting in 50,658 assembled unigenes with a mean length of 637.65 bp and a total length of 32.65 MB (5.49X unigene read-depth coverage). Unigenes were BLASTed against the inferred proteins of ten complete plant genomes and pseudo-annotated with the GO-slim vocabulary. 34,254 unigenes (68%) had a BLAST best hit and were assigned a tentative functional annotation. We also present an assessment of transcriptome coverage and explore the utility of these data for comparative evolutionary and functional genomic studies in land plants.

Authors: Joshua P. Der (1), Michael S. Barker (2,3), Norman Wickett (4), Claude W. dePamphilis (4) and Paul Wolf (1,5)

(1) Department of Biology, Utah State University, Logan, UT 84322, USA
(2) The Biodiversity Research Centre and Department of Botany, University of British Columbia, Vancouver BC V6T 1Z4, CANADA
(3) Department of Biology and Center for Genomics and Bioinformatics, Indiana University, Bloomington, IN 47405, USA
(4) Department of Biology, Institute of Molecular Evolutionary Genetics, and The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA
(5) Ecology Center, Utah State University, Logan, UT 84322, USA

Transcript of Functional Genomics of Fern Gametophytes: Transcriptome Sequencing in Pteridium aquilinum.

Page 1: Functional Genomics of Fern Gametophytes: Transcriptome Sequencing in Pteridium aquilinum.

Functional Genomics of Fern Gametophytes:

Transcriptome Sequencing in Pteridium aquilinum

Joshua Der, Michael Barker, Norman Wickett, Claude dePamphilis and Paul Wolf

Page 2

Acknowledgments

Coauthors:
• Michael Barker (U. British Columbia) - project design and transcriptome assembly
• Norman Wickett and Claude dePamphilis (Penn State U.) - transcriptome annotation and interpretation of results
• Paul Wolf (Utah State U.) - project design, funding, interpretation of results, and general support

Utah State University:
• Aaron Duffy - tissue culture & bioinformatics help
• Mike Pfrender - RNA lab space & equipment
• VP for Research & Center for Integrated BioSystems - research funds
• Dept. of Biology, Center for Integrated BioSystems, & Ecology Center - travel funds

Indiana University:
• Keithanne Mockaitis - cDNA library preparation & 454 sequencing

University of British Columbia:
• Katrina Dlugosch - sequence cleaning script

Penn State University:
• Eric Wafula - general scripting help

Page 3

Fern Evolution

[Figure: phylogeny of land plants with lineages labeled Ferns, Lycophytes, Bryophytes, and Seed Plants]

Sister to seed plants

Ancient lineage (Devonian)

~11000 extant species

High diversity in morphology, geography, and ecology

Evolved and maintain independent gametophyte and sporophyte generations

Page 4

Fern Evolution

[Figure: Fern life cycle, with labels meiosis, haploid spores (n), sperm (n), egg (n), syngamy, and zygote (2n)]

Sister to seed plants

Ancient lineage (Devonian)

~11000 extant species

High diversity in morphology, geography, and ecology

Evolved and maintain independent gametophyte and sporophyte generations

Page 5

Fern Genetics

Recessive alleles are not masked in haploid gametophytes

Gametic phase segregation and recombination can be directly observed

Controlled crosses can be performed to produce doubled haploid sporophytes (i.e. complete homozygotes)

Apogamy and apospory can be induced, unlinking ploidy and life stage

Klekowski 1971

Page 6

Challenges In Fern Genetics

Limited agronomic importance

Large genome sizes (avg. 10 Gb)

High chromosome numbers (avg. n = 57)

Extensive history of hybridization and polyploidy

Photo credit: Mike Windham

Page 7

Fern Genomics

Genomic resource development in ferns has lagged far behind that in flowering plants (but wait for Mike's talk next)

No fern genome sequencing projects have been funded

New high-throughput sequencing technologies have started to bring the power of genomics to non-model organisms

www.454.com

Page 8

Bracken Fern: Pteridium aquilinum

Worldwide distribution

Toxic to livestock and weedy in pastures, so it has been extensively studied

Highly adaptable and phenotypically plastic

Established culture techniques

Model system for understanding the fern life cycle, gametophyte development, and sex determination

Phylogeny is well characterized

Paleopolyploid with diploid gene expression

Genome size: 1C = 9.8 Gb

Lindman. 1917-1926. Bilder ur Nordens Flora-508

Page 9

The Fern Gametophyte Transcriptome

How has the fern life cycle influenced genome evolution?

What genes are active in the gametophyte generation?

What is the functional profile of these genes?

Do gametophyte-specific genes experience purifying selection?

Do reproductive proteins have a signature of positive selection or is their rate of molecular evolution elevated?

What is the function of "flowering" gene homologues in fern gametophytes?

Page 10

Sequence Pre-processing: Cleaned ESTs

RNA from whole gametophytes: male, female, and bisexual

cDNA library normalized and enriched for full-length mRNA

Reads were quality and length filtered, adapter and polyA/T trimmed

Cleaned reads: 681,722
Mean length: 372.60 bp
Total bases: 254 Mb

[Figure: Histogram of cleaned reads. x-axis: cleaned read length in bp (ticks 0 to 600; maximum = 624); y-axis: number of sequences (ticks 0 to 15,000)]
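The cleaning and summary steps on this slide can be sketched roughly as follows. This is a minimal illustration and not the cleaning script used for these data: the function names, the 50 bp minimum length, and the 10-base homopolymer threshold are assumptions, and quality and adapter filtering are left out. The last line only checks that the reported read count and mean length are consistent with the reported total of about 254 Mb.

```python
import re

# Hypothetical thresholds; the values used for the real data are not stated.
MIN_LENGTH = 50      # discard reads shorter than this after trimming
MIN_POLY_RUN = 10    # minimum homopolymer run treated as a polyA/T tail

def trim_poly_at(seq, min_run=MIN_POLY_RUN):
    """Strip a leading polyT run and a trailing polyA run from a read."""
    seq = re.sub(r"^T{%d,}" % min_run, "", seq)
    seq = re.sub(r"A{%d,}$" % min_run, "", seq)
    return seq

def clean_reads(reads, min_length=MIN_LENGTH):
    """Trim polyA/T runs and drop reads that end up too short."""
    trimmed = (trim_poly_at(r.upper()) for r in reads)
    return [r for r in trimmed if len(r) >= min_length]

def summarize(reads):
    """Return (number of reads, mean length, total bases)."""
    total = sum(len(r) for r in reads)
    mean = total / len(reads) if reads else 0.0
    return len(reads), mean, total

# Toy input, not real data.
toy = ["T" * 12 + "ACGT" * 20, "ACGT" * 5, "GATTACA" * 10 + "A" * 15]
n, mean_len, total_bases = summarize(clean_reads(toy))

# Consistency check on the slide's figures: reads x mean length ~ total bases.
reported_total_mb = 681_722 * 372.60 / 1e6
```

In the toy input the second read is dropped for length, and the trailing A of the last GATTACA repeat is trimmed together with the polyA tail, which is a known side effect of simple homopolymer trimming.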

Page 11

EST Assembly: Unigenes

Two-step strategy for EST assembly to reduce redundancy in the unigene set:

1. ESTs were first assembled in MIRA
2. Assembly passed to CAP3 to join additional contigs

[Figure: Histogram of transcriptome unigenes (CAP3). x-axis: unigene length in bp (ticks 0 to 2,500; largest transcript = 4,897 bp); y-axis: number of sequences (ticks 0 to 6,000)]

Total unigenes = 38,889

Mean length = 685.76 bp

Total bases = 26.67 Mb

Assembly:                 MIRA (1º)    CAP3 (2º)
# singletons:             638          183
# 1º contigs:             50,020       32,801
# 2º contigs:             0            5,905
# unigenes:               50,658       38,889
mean unigene length:      637.7 bp     685.8 bp
largest unigene length:   4,489 bp     4,897 bp
total consensus:          32.30 Mb     26.67 Mb
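The figures in the assembly table can be cross-checked with a little arithmetic. The sketch below assumes that the unigene count is the sum of singletons, primary contigs, and secondary contigs, and that total consensus is roughly the unigene count times the mean unigene length; the dictionary layout and function name are illustrative only.

```python
# Figures copied from the assembly table; the dictionary layout is illustrative.
assemblies = {
    "MIRA": {"singletons": 638, "contigs_1": 50_020, "contigs_2": 0,
             "unigenes": 50_658, "mean_len_bp": 637.7, "total_mb": 32.30},
    "CAP3": {"singletons": 183, "contigs_1": 32_801, "contigs_2": 5_905,
             "unigenes": 38_889, "mean_len_bp": 685.8, "total_mb": 26.67},
}

def check_assembly(a):
    """Recompute unigene count and total consensus from the other columns."""
    unigenes = a["singletons"] + a["contigs_1"] + a["contigs_2"]
    total_mb = unigenes * a["mean_len_bp"] / 1e6
    return unigenes, total_mb

results = {name: check_assembly(a) for name, a in assemblies.items()}
for name, (unigenes, total_mb) in results.items():
    print(f"{name}: {unigenes:,} unigenes, about {total_mb:.2f} Mb consensus")

# Redundancy removed by the second (CAP3) pass, as a fraction of MIRA unigenes.
reduction = 1 - assemblies["CAP3"]["unigenes"] / assemblies["MIRA"]["unigenes"]
```

Under these assumptions both columns are internally consistent, and the CAP3 step reduces the unigene set by roughly 23 percent.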

Page 12

Transcriptome Coverage

To assess the depth and breadth of transcriptome coverage, we compared our assembly with the predictions from a simulation model using ESTcalc

Wall et al., 2009. BMC Genomics 10:347

Parameters                                 ESTcalc      Actual (CAP3)
Technology                                 454 GSFLX    454 GSFLX (Titanium)
Library type                               normalized   normalized
Reads/plate                                681,722      681,722
Read length                                372.6 bp     372.6 bp

Output                                     ESTcalc      Actual (CAP3)
Total sequence amount                      254 MB       254.0076 MB
Total assembled sequence                   26.2 MB      26.67 MB
Percent transcriptome (A)                  87 %         ?
Percent of genes tagged (B)                100 %        ?
Unigene count (C)                          32,044       38,889
Mean unigene length (D)                    819 bp       685.8 bp
Singleton yield (E)                        19 %         0.47 %
Percent of genes with 90% coverage         69.8 %       ?
Percent of genes with 100% coverage (F)    23.7 %       ?
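A short calculation makes the gap between the ESTcalc prediction and the CAP3 assembly easier to read. The sketch below uses only the figures from the comparison and two assumptions of its own: that differences are best expressed relative to the predicted value, and that singleton yield means singletons divided by unigenes (183 of 38,889, a fraction of 0.0047, or about 0.47 percent).

```python
# Values copied from the ESTcalc comparison; names are illustrative.
predicted = {"assembled_mb": 26.2, "unigenes": 32_044, "mean_len_bp": 819.0}
observed = {"assembled_mb": 26.67, "unigenes": 38_889, "mean_len_bp": 685.8}

def relative_difference(obs, pred):
    """Observed minus predicted, as a fraction of the predicted value."""
    return (obs - pred) / pred

diffs = {k: relative_difference(observed[k], predicted[k]) for k in predicted}

# Assumed definition: singleton yield = singletons / unigenes.
singleton_fraction = 183 / 38_889
singleton_percent = 100 * singleton_fraction

for key, value in diffs.items():
    print(f"{key}: {value:+.1%}")
print(f"singleton yield: {singleton_percent:.2f} %")
```

The assembly has about 21 percent more unigenes than predicted and a mean length about 16 percent shorter, while total assembled sequence is within 2 percent. One possible reading, offered tentatively, is that some transcripts are still split across several unigenes.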

Page 13

Transcriptome Annotation

Two complementary strategies for functional annotation

1. BLAST unigenes against the NCBI nr protein database

GO annotation using Blast2GO

Broad functional perspective with a rich objective GO annotation

2. BLAST to inferred proteomes of 10 complete plant genomes

Pseudo-annotated based on MCL cluster membership in PlantTribes2.0

Tribe and OrthoGroup assignment, GO-slim function, and Arabidopsis gene id & description

Plant gene family classification with detailed information from well-curated reference genomes
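The idea behind the second strategy, pseudo-annotation by cluster membership, can be illustrated with plain dictionary lookups: a unigene inherits the annotation of the cluster that contains its best-hit reference protein. Everything in the sketch below is hypothetical, including the identifiers, the annotation text, and the data layout; it does not reflect the PlantTribes file formats or any PlantTribes API.

```python
# All identifiers and annotations below are made up for illustration.
best_hit = {            # unigene -> best-hit protein in a reference proteome
    "unigene_0001": "refprot_A",
    "unigene_0002": "refprot_B",
    "unigene_0003": None,       # no BLAST hit
}
cluster_of = {          # reference protein -> cluster (tribe) identifier
    "refprot_A": "tribe_12",
    "refprot_B": "tribe_12",
}
cluster_annotation = {  # cluster -> annotation shared by its members
    "tribe_12": {"go_slim": "transcription factor activity"},
}

def pseudo_annotate(unigene):
    """Inherit the annotation of the cluster containing the best-hit protein."""
    hit = best_hit.get(unigene)
    cluster = cluster_of.get(hit)
    if cluster is None:
        return None
    return {"cluster": cluster, **cluster_annotation.get(cluster, {})}

annotations = {u: pseudo_annotate(u) for u in best_hit}
```

Because the annotation is inherited from a cluster rather than assigned directly, it is only as reliable as the best hit itself, which is why such assignments are described as tentative.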

Page 14

Transcriptome Annotation: nr BLASTx

[Figure: Top-hit species distribution, bar chart. x-axis: BLAST hits (ticks 0 to 5,000). Species labels in the order they appear:]

Physcomitrella patens
Vitis vinifera
Picea sitchensis
Ricinus communis
Populus trichocarpa
Arabidopsis thaliana
Oryza sativa
Sorghum bicolor
Glycine max
Zea mays
Gossypium hirsutum
Medicago truncatula
unknown
Adiantum capillus-veneris
Ceratopteris richardii
Nicotiana tabacum
Marchantia polymorpha
Solanum tuberosum
Chlamydomonas reinhardtii
Alsophila spinulosa
Ginkgo biloba
Micromonas sp.
Pteris vittata
Elaeis guineensis
Pinus taeda
Solanum lycopersicum
Micromonas pusilla
Triticum aestivum
Gossypium barbadense
others

[Figure: pie chart. Positive BLAST hit: 54%; No BLAST hit: 46%]

21,097 of 38,889 unigenes with positive hit (e-value cutoff 1e-10)
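Selecting one best hit per unigene under an e-value cutoff can be sketched as below. The slides do not say which output format was parsed, so this assumes the standard 12-column tabular BLAST output (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, e-value, bit score). The function name and the choice of bit score as the tie-breaker are assumptions for illustration.

```python
import csv
import io

EVALUE_CUTOFF = 1e-10   # cutoff stated on the slide

def best_hits(handle, cutoff=EVALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} keeping the top-scoring hit."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > cutoff:
            continue  # hit is weaker than the cutoff
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best

# Toy tabular output, not real results.
toy = io.StringIO(
    "u1\tprotA\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-30\t120\n"
    "u1\tprotB\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-40\t150\n"
    "u2\tprotC\t40.0\t50\t30\t0\t1\t150\t1\t50\t1e-3\t35\n"
)
hits = best_hits(toy)

# Fraction of unigenes with a hit, using the counts reported on the slide.
fraction_with_hit = 21_097 / 38_889
```

In the toy data only `u1` passes the cutoff and `protB` wins on bit score. The reported counts give a hit fraction of about 54 percent.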

Page 15

Transcriptome Annotation: Blast2GO

Localization of genes is predominantly in the nucleus, mitochondria, and plastids

[Figure: pie chart, Cellular Component - GO level 5. Categories and counts:]

endoplasmic reticulum (317)
nucleoplasm (376)
vacuole (274)
Golgi apparatus (212)
microbody (119)
plastid (3,613)
cytoskeleton (238)
nucleus (1,325)
endosome (10)
nucleolus (197)
nuclear lumen (555)
cytosol (448)
mitochondrion (1,967)
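The claim about predominant localization can be checked by ranking the cellular component counts from the chart. This is only a quick tally: a unigene can carry several GO terms and some terms are nested (the nucleolus lies within the nucleus), so the percentages are shares of the listed annotations and not shares of genes.

```python
# Counts copied from the cellular component chart (GO level 5).
cellular_component = {
    "endoplasmic reticulum": 317, "nucleoplasm": 376, "vacuole": 274,
    "Golgi apparatus": 212, "microbody": 119, "plastid": 3613,
    "cytoskeleton": 238, "nucleus": 1325, "endosome": 10,
    "nucleolus": 197, "nuclear lumen": 555, "cytosol": 448,
    "mitochondrion": 1967,
}

# Sort terms from most to least frequent.
ranked = sorted(cellular_component.items(), key=lambda kv: kv[1], reverse=True)
total = sum(cellular_component.values())

for term, count in ranked[:3]:
    print(f"{term}: {count:,} ({count / total:.1%} of listed annotations)")
```

The three largest categories are plastid, mitochondrion, and nucleus, in that order.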

Page 16

Transcriptome Annotation: Blast2GO

Two main biological processes involve metabolism and cellular machinery

[Figure: pie chart, Biological Process - GO level 2. Categories and counts:]

multicellular organismal process (166)
localization (1,713)
multi-organism process (15)
growth (41)
establishment of localization (1,713)
reproduction (73)
biological regulation (853)
developmental process (194)
reproductive process (29)
cellular process (7,432)
regulation of biological process (716)
response to stimulus (908)
metabolic process (7,641)

Page 17

Transcriptome Annotation: Blast2GO

Two main molecular functions are binding (DNA, RNA, and protein) and catalytic activity (hydrolase and transferase activity)

[Figure: pie chart, Molecular Function - GO level 2. Categories and counts:]

enzyme regulator activity (106)
binding (8,120)
transcription regulator activity (409)
structural molecule activity (542)
translation regulator activity (1)
transporter activity (908)
molecular transducer activity (357)
catalytic activity (7,915)

Page 18

Transcriptome Annotation: PlantTribes2.0

25,172 of 38,889 unigenes with positive hit, e-value cutoff 1e-5

Unigenes classified into 7,126 Tribes and 9,548 OrthoGroups

[Figure: pie chart. Positive BLAST hit: 65%; No BLAST hit: 35%]

Page 19

Transcriptome Annotation: PlantTribes2.0

Some interesting results:

Single unigene similar to LEAFY

one copy found in seed plants, two in Physcomitrella and Selaginella

Single unigene similar to SEPALLATA3

a gene family absent from gymnosperms, thought to have originated with flowers and required by B and C floral organ identity genes to function

Single unigene similar to PISTILLATA and two unigenes similar to CAULIFLOWER

not known in gymnosperms, Physcomitrella, or Selaginella

WARNING: these annotations are based on BLAST which may return distant homologues. A detailed phylogenetic examination is needed!

Page 20

Future Work

Sequence the sporophyte transcriptome
Transcriptome profiling in various life stages/tissues (RNA-seq)
Examine gene family evolution in land plants
RNA editing in the chloroplast genome
Population genomics (with mined SSR and SNP loci)
Linkage mapping

Page 21

Thank You!

Collecting bracken in the Rocky Mountains with my field assistant