7 species at high (8.4 - 11X) coverage



Page 2: 7 species at high (8.4 - 11X) coverage

7 species at high (8.4 - 11X) coverage

2 species (D. sechellia and D. persimilis) at intermediate (4.9X and 4.1X) coverage
why these two?

7 inbred lines of D. simulans at low coverage (1 - 3X)
why these, and why at low coverage?

Page 3: 7 species at high (8.4 - 11X) coverage

Quality score: Q20 = 99% accurate, Q40 = 99.99% accurate
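The Q20 and Q40 figures follow from the standard Phred definition, Q = -10 * log10(P), where P is the probability that a base call is wrong. A minimal sketch of the conversion is below; the function names are illustrative and not taken from the slides.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def phred_to_accuracy(q):
    """Probability that a base call with Phred score q is correct."""
    return 1 - phred_to_error_prob(q)


def error_prob_to_phred(p):
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (20, 40):
        print(f"Q{q}: accuracy = {phred_to_accuracy(q):.2%}")
```

Each 10-point increase in Q lowers the error probability tenfold, so Q20 corresponds to 1 error in 100 bases and Q40 to 1 error in 10,000.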

N50 score: 50% of assembled basepairs are in contigs of this length or greater
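The N50 definition can be made concrete with a short sketch: sort contigs from longest to shortest and walk down the list until the running total reaches half of the assembled bases. The contig lengths below are made-up numbers for illustration, not values from any of the assemblies.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    contain at least 50% of all assembled base pairs."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down, accumulating base pairs.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Hypothetical assembly: 290 bp in total, so the halfway point is 145 bp.
    lengths = [80, 70, 50, 40, 30, 20]
    print(n50(lengths))
```

In this toy assembly the two longest contigs (80 + 70 = 150 bp) already cover more than half of the 290 bp, so the N50 is 70. A higher N50 indicates a more contiguous assembly, which is why it tends to rise with sequencing coverage.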

Page 4: 7 species at high (8.4 - 11X) coverage

ORF annotation by:
- 4 de novo gene prediction models
- 3 homology-based predictors
- 1 combo method

Used the GLEAN program to analyze the combined set and predict optimal start, stop, and intron splice sites

Page 5: 7 species at high (8.4 - 11X) coverage

Why look at “TE-derived expression”?

Page 6: 7 species at high (8.4 - 11X) coverage

Homology assessment

Orthology by: fuzzy reciprocal BLAST (FRB) & Synpipe (incorporates synteny)

8,563 genes with 1:1 orthologs in melanogaster group

6,698 genes with 1:1 orthologs in all 12 species

** Caveats: missing genes
- “plausible homologs” for 61% of genes ‘without’ orthologs
- “confirmed” 20.7% gene losses
- unable to resolve ~18%

“probably underestimating the number of genuine absences”
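To illustrate how 1:1 ortholog calls can arise, and why some genes end up "without" orthologs, here is a minimal sketch of a plain reciprocal-best-hit filter. It is an assumption-laden simplification: it is not the fuzzy variant (FRB) or Synpipe, it ignores synteny, and the gene names and scores are hypothetical.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    scores: dict mapping (query, subject) -> similarity score.
    On ties, the first hit encountered is kept.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Hypothetical similarity scores between genes of two species.
    a_vs_b = {
        ("melA1", "simB1"): 950,
        ("melA1", "simB2"): 400,
        ("melA2", "simB2"): 700,
        ("melA3", "simB2"): 650,
    }
    b_vs_a = {
        ("simB1", "melA1"): 940,
        ("simB2", "melA2"): 710,
        ("simB2", "melA3"): 640,
    }
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here melA3 has a reasonable hit (simB2) but is not that gene's best match in return, so a strict filter leaves it without an ortholog. Cases like this are where a "plausible homolog" can exist even though no 1:1 ortholog is called.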

Page 8: 7 species at high (8.4 - 11X) coverage

Red = same direction
Green = inversion
Blue = translocation