Presentation describing ALE, a suite of programs for gene tree reconstruction in the presence of...


description

ALE is a suite of programs for inferring high-quality gene trees in the presence of gene duplication, transfer, and loss. It is based on Amalgamated Likelihood Estimation and a probabilistic model of gene duplication, transfer, and loss.

Transcript of Presentation describing ALE, a suite of programs for gene tree reconstruction in the presence of...

Page 1: Presentation describing ALE, a suite of programs for gene tree reconstruction in the presence of gene duplication, transfer and loss

Using ALE to reconstruct gene trees

ALE is developed by Gergely Szöllősi [email protected] @sllsi

Mistakes due to Bastien Boussau @Bastounette

Presentation available on slideshare: http://www.slideshare.net/

Page 2: References

• Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations. Szöllősi GJ, Boussau B, Abby SS, Tannier E, Daubin V. Proc Natl Acad Sci U S A. 2012 109(43):17513-8. doi: 10.1073/pnas.1202997109.

• Lateral gene transfer from the dead. Szöllősi GJ, Tannier E, Lartillot N, Daubin V. Syst Biol. 2013 62(3):386-97. doi: 10.1093/sysbio/syt003.

• Efficient exploration of the space of reconciled gene trees. Szöllősi GJ, Rosikiewicz W, Boussau B, Tannier E, Daubin V. Syst Biol. 2013 62(6):901-12. doi: 10.1093/sysbio/syt054.

Page 3: Model and ALE program

• Birth-death probabilistic model including:

  • gene duplication (D)

  • gene loss (L)

  • gene transfer (T)

• Given (one or more) gene trees and an ultrametric species tree, returns:

  • a reconciled gene tree, annotated with D, T, L events

  • its likelihood

  • optimized rates of D, T, L events

• Can return the ML estimate, or a sample drawn according to the likelihood
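The duplication and loss part of the birth-death model listed above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not ALE's implementation: ALE computes likelihoods rather than simulating, and transfer (T) is left out here because it needs several lineages. The function name and the parameters `duplication_rate`, `loss_rate` and `branch_length` are hypothetical.

```python
import random


def simulate_copy_number(duplication_rate, loss_rate, branch_length,
                         start_copies=1, rng=None):
    """Simulate the gene copy number along one branch under a linear
    birth-death process (Gillespie algorithm). Illustrative only."""
    rng = rng or random.Random()
    copies = start_copies
    time_left = branch_length
    per_copy_rate = duplication_rate + loss_rate
    while copies > 0 and per_copy_rate > 0:
        # The waiting time to the next event is exponential in the total rate.
        waiting_time = rng.expovariate(copies * per_copy_rate)
        if waiting_time > time_left:
            break  # no further event before the end of the branch
        time_left -= waiting_time
        # Pick duplication or loss in proportion to its rate.
        if rng.random() < duplication_rate / per_copy_rate:
            copies += 1  # duplication (D)
        else:
            copies -= 1  # loss (L)
    return copies


if __name__ == "__main__":
    rng = random.Random(42)
    counts = [simulate_copy_number(0.3, 0.5, 1.0, rng=rng) for _ in range(1000)]
    print("fraction of simulated families lost:", counts.count(0) / len(counts))
```

With both rates set to zero the copy number never changes, and with a loss rate well above the duplication rate most simulated families go extinct before the end of the branch.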

Page 4: Transfers tread beyond the represented species phylogeny

[Figure; axis label: TIME]

Lateral gene transfer from the dead. Szöllősi GJ, Tannier E, Lartillot N, Daubin V. Syst Biol. 2013 62(3):386-97. doi: 10.1093/sysbio/syt003.

Page 5: Transfers tread beyond the represented species phylogeny

[Figure; axis label: TIME]

Lateral gene transfer from the dead. Szöllősi GJ, Tannier E, Lartillot N, Daubin V. Syst Biol. 2013 62(3):386-97. doi: 10.1093/sysbio/syt003.

Page 6: Transfers tread beyond the represented species phylogeny

[Figure; axis label: TIME]

Mathematically, most transfers involve unrepresented lineages

Lateral gene transfer from the dead. Szöllősi GJ, Tannier E, Lartillot N, Daubin V. Syst Biol. 2013 62(3):386-97. doi: 10.1093/sysbio/syt003.

Page 7: ALE download and installation

• 2 ways:

  1. Use a virtual image through VirtualBox:

    • download ftp://www.prabi.fr/pub/ancestrome/ALE-demo.ova

    • while downloading, if necessary, install both VirtualBox and the VirtualBox additions from the VirtualBox website (https://www.virtualbox.org/); be careful to choose the right version for your system

    • run VirtualBox and choose File > Import Appliance, then select the downloaded file

  2. Install it in full:

    • download it from https://github.com/ssolo/ALE

    • follow the instructions in the README.md file

Page 8: ALE pipeline

• Examples can be found in the example_data folder

• Generate a gene tree distribution (e.g. with PhyloBayes, MrBayes, RevBayes…), then

• use ALEobserve on the tree distribution:

    ./ALEobserve_LINUX HBG745965_real.1.treelist 1000

• use ALEml if you want to get the ML estimate of the reconciled gene tree:

    ./ALEml S.tree HBG745965_real.1.treelist.ale

• use ALEsample if you want to get a sample of reconciled gene trees:

    ./ALEsample S.tree HBG284202_real.ale

• Look at the annotations, which are written as bootstrap values (e.g. with NJplot)
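The pipeline steps above can be chained from a script. The following is a minimal sketch, assuming the binaries sit in the current directory under the names used on the slide and that ALEobserve writes its output next to the input with an added `.ale` suffix (inferred from the file names on the slide, not verified). The helper names `ale_commands` and `run_pipeline` are hypothetical, and the trailing `1000` is passed through unchanged because the slide does not say what it means.

```python
import subprocess


def ale_commands(species_tree, treelist, observe_arg="1000",
                 observe_bin="./ALEobserve_LINUX", ml_bin="./ALEml",
                 sample_bin="./ALEsample"):
    """Build the command lines shown on the slide (hypothetical helper)."""
    # Assumption: ALEobserve writes "<treelist>.ale", as the slide's names suggest.
    ale_file = treelist + ".ale"
    return {
        "observe": [observe_bin, treelist, observe_arg],
        "ml": [ml_bin, species_tree, ale_file],
        "sample": [sample_bin, species_tree, ale_file],
    }


def run_pipeline(species_tree, treelist, mode="ml"):
    """Run ALEobserve, then ALEml (mode="ml") or ALEsample (mode="sample")."""
    commands = ale_commands(species_tree, treelist)
    for step in ("observe", mode):
        # check=True stops the pipeline as soon as one step fails.
        subprocess.run(commands[step], check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so this works without ALE.
    for name, argv in ale_commands("S.tree", "HBG745965_real.1.treelist").items():
        print(name, " ".join(argv))
```

Building the argument lists separately from running them makes it easy to inspect the exact commands before launching them on many gene families.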

Page 9: Species tree

Page 10: Reconciled PhyML gene tree

Page 11: Reconciled PhyML gene tree

Page 12: Reconciled PhyML gene tree

Page 13: Reconciled PhyML gene tree

Page 14: Reconciled PhyML gene tree

Many (8) transfers in the PhyML tree

Page 15: Reconciled amalgamated gene tree

[Figure: reconciled amalgamated gene tree; tip labels include SYNY3_4_PE2642 and PROMM_1_PE495; node annotations include T@26|26, Tb@23|23 and T@23|SYNY3; scale bar: 0.2]

Page 16: Reconciled amalgamated gene tree

[Figure: reconciled amalgamated gene tree; tip labels include SYNY3_4_PE2642 and PROMM_1_PE495; node annotations include T@26|26, Tb@23|23 and T@23|SYNY3; scale bar: 0.2]

Amalgamation of gene trees leads to fewer transfers (3)

Page 17: In progress…

• STRALE:

  • A Bayesian probabilistic method that can interpret thousands of gene trees with DTL events and reconstruct a time-ordered species tree

  • Currently undergoing tests

  • Can run on thousands of gene families (parallel architecture)

  • Will be open access

  • Can run on dozens of species