PICODIV will amass large amount of data –cultures –sequences –environmental data Databases...

Page 1: PICODIV databases

• PICODIV will amass a large amount of data
– cultures
– sequences
– environmental data

• Databases
– keep track of data produced
– verify the data
– avoid errors
– make data quickly available to all

• EU requirement

Page 2: Databases

• Taxonomy
• Cultures
• SSU rRNA sequences
• Probes
• Environmental data
• Other?
– Pigments
– TEM pictures

Page 3: Web site

Page 4: Web data interface

Page 5: Taxonomy

Page 6: Taxonomy: pigments

Page 7: Cultures: RCC catalog

Page 8: Cultures: additional information

• Picture
• Spectrum
• Pigments
• Flow cytometry, RFLP

[Figure: spectrum of Synechococcus MAX 42; x-axis: wavelength, 400 to 550; y-axis: 0 to 1]

Page 9: Cultures

• Starter cultures --> Environmental database
• Unialgal --> RCC catalog (not released)

Page 10: Sequence databases: input

[Diagram elements: EMBL; PICODIV (environmental, cultures); Access database; automatic query; email as FASTA file; VB program]

• SSU vs LSU
• Full length?
• Taxonomy?
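The "email as FASTA file" step implies that sequence records are serialized as FASTA text at some point. The following is a minimal Python sketch of that serialization only. The function name `to_fasta`, the record layout (identifier, description, sequence) and the line width are assumptions made for illustration; they do not describe the VB program or the Access database named on the slide.

```python
def to_fasta(records, width=60):
    """Serialize (identifier, description, sequence) tuples as FASTA text.

    Hypothetical helper: the record layout is an assumption, not the
    PICODIV schema.
    """
    lines = []
    for identifier, description, sequence in records:
        # FASTA header line: '>' followed by an identifier and free text.
        lines.append(f">{identifier} {description}".rstrip())
        # Drop internal whitespace, then wrap to fixed-width lines.
        cleaned = "".join(sequence.split()).upper()
        lines.extend(cleaned[i:i + width] for i in range(0, len(cleaned), width))
    return "".join(line + "\n" for line in lines)


# Made-up record for illustration, not real PICODIV data.
example = [("seq001", "hypothetical SSU rRNA clone", "acgt acgt acgt")]
print(to_fasta(example, width=8), end="")
```

Wrapping sequence lines at a fixed width keeps the file readable in an email body; the width is a free parameter here.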

Page 11

Page 12: Filter

Page 13: Sequence databases: output

• ARB aligned
– phylogeny (trees)
– probe design
• Raw sequences
– BLAST
• Access database
• Full sequences
• All sequences
• Web: periodic update

Page 14: ARB processing

• Import files under EMBL format

• Mark all new sequences
– aligned: date + person (e.g. 20-jun-2000 DV)
– pub: n or PICODIV
– author: e.g. K Valentin

• Fast align by finding the closest relative with the PT-server SSU_RNA

• Quick add marked species to existing tree (use a sub-tree rather than the full tree)

• If the tree is incorrect, remove from the tree and align again to the closest relative (either known or from a BLAST search)

• Save only changes (not whole database)

• Update PT-Server
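The marking convention for new sequences (aligned, pub, author) can be made concrete with a small helper. This is only a sketch in Python of how the three field values could be produced consistently; it is not ARB code, and the function name `make_marks`, the dictionary layout and the zero-padded day are assumptions. The accepted `pub` values are taken from the slide ("n or PICODIV") without interpreting them further.

```python
from datetime import date

# Lowercase month abbreviations, spelled out to avoid locale-dependent output.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

# Values listed on the slide for the 'pub' field.
ALLOWED_PUB = {"n", "PICODIV"}


def make_marks(aligned_on, person, pub, author):
    """Build the three field values for a newly aligned sequence.

    Hypothetical helper: field names follow the slide, everything else
    is an assumption.
    """
    if pub not in ALLOWED_PUB:
        raise ValueError(f"pub must be one of {sorted(ALLOWED_PUB)}")
    stamp = f"{aligned_on.day:02d}-{MONTHS[aligned_on.month - 1]}-{aligned_on.year}"
    return {"aligned": f"{stamp} {person}", "pub": pub, "author": author}


marks = make_marks(date(2000, 6, 20), "DV", "PICODIV", "K Valentin")
print(marks["aligned"])  # 20-jun-2000 DV
```

Generating the stamp from a date object rather than typing it by hand keeps the format uniform across the people doing the alignment.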

Page 15: ARB trees

Novel sequences have not been added to the full tree (tree_all_dec98), except for mitochondrial sequences. Two subtrees have been extracted and new sequences added to them:

Tree name            Method      Sequences   Type of sequences added
tree_all_dec98       Parsimony   13804       mitochondrial
tree_euk_algae       Parsimony   1695        nuclear: only lower eukaryotes
tree_cyano_plastid   Parsimony   341         cyanobacteria and plastids

Page 16: Probe database

Page 17: Environmental databases

• One per site
• Sampling code
• Hydrological and meteorological data
• Sampling information (volumes, protocols, etc.)
• Culture isolation data
• Measurement data
– flow cytometry
– pigments
– TEM
– probes
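The list above reads like the outline of one record per sample. A possible shape for such a record is sketched below in Python, purely as an illustration: the class name `Sample`, all attribute names and the helper `add_measurement` are hypothetical, and the actual PICODIV environmental databases may be organized differently. Only the four measurement categories come from the slide.

```python
from dataclasses import dataclass, field

# Measurement categories listed on the slide.
MEASUREMENT_TYPES = {"flow cytometry", "pigments", "TEM", "probes"}


@dataclass
class Sample:
    """One sample in a per-site environmental database (hypothetical layout)."""
    sampling_code: str
    site: str
    hydrology_meteorology: dict = field(default_factory=dict)
    sampling_info: dict = field(default_factory=dict)      # volumes, protocols, ...
    isolated_cultures: list = field(default_factory=list)  # culture isolation data
    measurements: dict = field(default_factory=dict)


def add_measurement(sample, kind, value):
    """Attach a measurement, rejecting categories not listed on the slide."""
    if kind not in MEASUREMENT_TYPES:
        raise ValueError(f"unknown measurement type: {kind!r}")
    sample.measurements.setdefault(kind, []).append(value)


# Made-up values for illustration only.
sample = Sample(sampling_code="S-001", site="site-A")
add_measurement(sample, "pigments", {"note": "placeholder value"})
print(sorted(sample.measurements))
```

Keying every record on the sampling code is what would let culture and measurement data be traced back to the same sampling event.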

Page 18: Interacting databases

• Cultures
• Sequence
• Probes
• Taxonomy
• Environment
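The slide only names the five databases and says that they interact; it does not say how. One way to picture such interaction is through shared keys, sketched here in Python with an in-memory SQLite database. Every table name, column name, key relationship and row below is an assumption made for illustration, and SQLite stands in for whatever system the project actually uses.

```python
import sqlite3

# Hypothetical schema: the links between tables are assumptions.
SCHEMA = """
CREATE TABLE taxonomy    (taxon_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE environment (sampling_code TEXT PRIMARY KEY, site TEXT);
CREATE TABLE cultures    (culture_id TEXT PRIMARY KEY,
                          taxon_id INTEGER REFERENCES taxonomy(taxon_id),
                          sampling_code TEXT REFERENCES environment(sampling_code));
CREATE TABLE sequences   (sequence_id TEXT PRIMARY KEY,
                          culture_id TEXT REFERENCES cultures(culture_id));
CREATE TABLE probes      (probe_id TEXT PRIMARY KEY,
                          taxon_id INTEGER REFERENCES taxonomy(taxon_id));
"""


def build_demo():
    """Create the in-memory database with one made-up row per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO taxonomy VALUES (1, 'Synechococcus')")
    conn.execute("INSERT INTO environment VALUES ('S-001', 'site-A')")
    conn.execute("INSERT INTO cultures VALUES ('culture-1', 1, 'S-001')")
    conn.execute("INSERT INTO sequences VALUES ('seq-1', 'culture-1')")
    conn.execute("INSERT INTO probes VALUES ('probe-1', 1)")
    return conn


def sequences_with_context(conn):
    """Join each sequence to its culture's taxon and sampling site."""
    query = """
        SELECT s.sequence_id, t.name, e.site
        FROM sequences s
        JOIN cultures c    ON c.culture_id = s.culture_id
        JOIN taxonomy t    ON t.taxon_id = c.taxon_id
        JOIN environment e ON e.sampling_code = c.sampling_code
        ORDER BY s.sequence_id
    """
    return conn.execute(query).fetchall()


print(sequences_with_context(build_demo()))
```

With keys shared in this way, a question that spans several databases (which site did this sequence's culture come from?) becomes a single join.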

Page 19: It is our responsibility to keep the PICODIV databases updated for the benefit of all.