A1 Bertrand



EXPLORING SHALLOW SPLITS IN COSTA RICAN BARCODED LEPIDOPTERA

Cloning and Sanger sequencing

| Target species | Number of specimens per species | Number of screened clones | Number of successful sequences per specimen |
|---|---|---|---|
| Aellopos ceculus | 2 | 24 / specimen | 16, 10 |
| Pachylia ficus | 2 | 24 / specimen | 17, 10 |
| Carathis byblis | 8 | 100 / specimen | 93, 94, 95, 92, 87, 88, 90, 92 |

Table 1: Summary of the screening depth and cloning success rate for the target taxa.
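The cloning success rate that Table 1 refers to can be worked out directly from its counts. The sketch below is only an illustration: it assumes the success rate is the number of successful sequences divided by the number of screened clones for each specimen, and the dictionary layout and function names are hypothetical, not part of the original analysis.

```python
# Counts transcribed from Table 1; the data layout is an illustrative assumption.
table1 = {
    "Aellopos ceculus": {"screened": 24, "successful": [16, 10]},
    "Pachylia ficus": {"screened": 24, "successful": [17, 10]},
    "Carathis byblis": {
        "screened": 100,
        "successful": [93, 94, 95, 92, 87, 88, 90, 92],
    },
}


def success_rates(entry):
    """Per-specimen fraction of screened clones that gave a usable sequence."""
    return [n / entry["screened"] for n in entry["successful"]]


def mean_success_rate(entry):
    """Average of the per-specimen success rates for one species."""
    rates = success_rates(entry)
    return sum(rates) / len(rates)


for species, entry in table1.items():
    print(f"{species}: mean success rate {mean_success_rate(entry):.1%}")
```

Under this assumption, the two Sphingidae species recover a usable sequence from roughly half of the screened clones, while Carathis byblis recovers one from about nine in ten.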

• We will extend this analysis to shallow splits in species of numerous families of Lepidoptera.
• We will continue our deep sequencing analysis using 454 pyrosequencing to examine a much higher number of sequences from each specimen used in cloning.
• Our next workflow will compare nuclear markers with COI to explore any changes in clustering patterns.
• RNA extraction from fresh samples followed by RT-PCR is a prospective analysis to reveal solely the sequence of the expressed copy of COI.

Fig 3. NJ tree showing sample MHMXQ166-08 clustering with its 92 successful clones and with representative sequences in BOLD (outlined in red), separately from the atypical branch.

[Figure 3 tree: numbered clones are interspersed among the 21 CbDHJ02 reference barcodes (COI-5P), including MHMXQ166-08; the 8 CbDHJ01 reference barcodes form a separate branch. Scale bar: 0.001.]

TARGET TAXA

• All samples analyzed were collected from the Area de Conservación Guanacaste in northwestern Costa Rica.
• NJ trees were constructed from COI barcodes for the families Sphingidae and Arctiidae.

Target 1: Aellopos ceculus and Pachylia ficus were chosen from the Sphingidae family. Two specimens of each species from the atypical branch were analyzed.

Target 2: Carathis byblis was chosen from the Arctiidae family. Eight specimens were chosen, five from the representative and three from the atypical branch. One individual was used for preliminary 454 analysis.

Figure 1: Neighbor-joining trees of the selected species (panels: Aellopos ceculus, Pachylia ficus and Carathis byblis, each annotated "NUMT?") showing shallow splits in their DNA barcode sequences. We examined two individuals from atypical groups in Pachylia ficus and Aellopos ceculus. Carathis byblis underwent deeper analysis with five individuals from the representative and three individuals from the atypical branch (shown with dashed rectangles).

Burns, J. M., D. H. Janzen, M. Hajibabaei, W. Hallwachs & P. D. N. Hebert (2008) DNA barcodes and cryptic species of skipper butterflies in the genus Perichares in Area de Conservacion Guanacaste, Costa Rica. Proceedings of the National Academy of Sciences of the United States of America, 105, 6350-6355.

Hebert, P. D. N., S. Ratnasingham & J. R. deWaard (2003) Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proceedings of the Royal Society of London Series B-Biological Sciences, 270, S96-S99.

Janzen, D. H., M. Hajibabaei, J. M. Burns, W. Hallwachs, E. Remigio & P. D. N. Hebert (2005) Wedding biodiversity inventory of a large and complex Lepidoptera fauna with DNA barcoding. Philosophical Transactions of the Royal Society B-Biological Sciences, 360, 1835-1845.

Song, H., J. E. Buhay, M. F. Whiting & K. A. Crandall (2008) Many species in one: DNA barcoding overestimates the number of species when nuclear mitochondrial pseudogenes are coamplified. Proceedings of the National Academy of Sciences of the United States of America, 105, 13486-13491.

• Table 1 summarizes the results of the cloning experiments for each species.
• All sequences with insertions, deletions or stop codons were removed, as were any sequences with more than 6 SNPs, following traditional DNA barcoding guidelines.
• In both Target 1 and Target 2 species, each group of cloned sequences, when aligned and compared with the original data from BOLD (Barcode of Life Data Systems), clustered with the parent sequences and separately from their sister lineages, as illustrated in Fig 3.
• Massive 454 sequences of Target 2, Carathis byblis, were aligned and compared with the original data from BOLD; the results showed clustering with both the parent sequences and the secondary sister lineage, corresponding to the original dichotomy.
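The screening rule in the bullets above (discard clones with insertions, deletions or stop codons, and clones with more than 6 SNPs) can be sketched as a small filter. This is a minimal illustration, not the authors' pipeline. It assumes each clone has already been aligned and trimmed to the same region as its parent barcode, so that any gap character or length difference signals an indel; it assumes stop codons are called with the invertebrate mitochondrial code (NCBI translation table 5, where TAA and TAG are stops and TGA codes for tryptophan); and the reading-frame offset and all function names are hypothetical.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI table 5).
STOP_CODONS = {"TAA", "TAG"}


def has_stop_codon(seq, frame=0):
    """True if any complete codon in the given reading frame is a stop."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return True
    return False


def snp_count(seq, reference):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(1 for a, b in zip(seq.upper(), reference.upper()) if a != b)


def passes_filters(seq, reference, max_snps=6, frame=0):
    """Apply the indel, stop-codon and SNP-count filters to one clone."""
    # Assumption: clone and reference cover the same aligned region, so a
    # gap or a length difference means an insertion or deletion.
    if "-" in seq or len(seq) != len(reference):
        return False
    if has_stop_codon(seq, frame):
        return False
    return snp_count(seq, reference) <= max_snps
```

A call such as `passes_filters(clone, parent_barcode, frame=1)` would keep a clone only if it clears all three checks; the correct `frame` value depends on where the amplicon starts relative to the COI reading frame and has to be set for the data at hand.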


Bertrand, C.(1), Taidi, S.(1), Janzen, D.H.(2), Hallwachs, W.(2), Hajibabaei, M.(1)

(1) Biodiversity Institute of Ontario, Guelph, Ontario, Canada

(2) University of Pennsylvania, Philadelphia, Pennsylvania, United States

Our initial analysis suggests that the shallow splits within these three species are not a result of NUMTs but are true mitochondrial COI sequences that may represent cryptic species. However, the 454 analysis detected two sequence types, and we propose that deeper screening is required to reveal the dichotomous branching patterns. Massive sequencing using 454 pyrosequencing is the most effective technology available for this purpose.

We thank ACG parataxonomists for collecting Lepidoptera specimens and Tanya Dapkey for sorting, sub-sampling and shipping specimens for analysis.

We thank CCDB staff for barcode analysis and Shadi Shokralla for 454 pyrosequencing.

MOLECULAR ANALYSIS

Workflow A: DNA was extracted using a standard silica-based approach. COI (658 bp) was amplified by PCR using standard LepF/R primers (Hebert et al. 2003), followed by gel purification. The purified products were cloned into TOPO vectors and transformed into competent E. coli cells. Colonies were directly amplified and Sanger sequenced. This was performed for both Target 1 and Target 2 species.

Workflow B: Massive sequencing of one individual of Target 2 (Carathis byblis) was performed by high-throughput sequencing by synthesis (pyrosequencing) using an in-house 454 FLX genome sequencer. The 454 sequencing results were subjected to all quality filters. Sequences were compared with Workflow A results to detect any variation in the barcoding region.
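One simple way to compare 454 reads against Workflow A results is to assign each read to the reference sequence type it matches most closely and then count reads per type. The sketch below illustrates that idea only; it is not the comparison actually used for the poster. It assumes reads and references are already aligned and trimmed to the same length, and the reference names ("parent", "sister") and function names are hypothetical.

```python
from collections import Counter


def mismatches(a, b):
    """Count mismatching positions between two aligned, equal-length sequences."""
    return sum(1 for x, y in zip(a, b) if x != y)


def assign_reads(reads, references):
    """Tally reads by the reference sequence type they are closest to.

    `references` maps a type name to its sequence. Reads that are equally
    close to more than one reference are tallied as "ambiguous".
    """
    counts = Counter()
    for read in reads:
        dists = {name: mismatches(read, ref) for name, ref in references.items()}
        best = min(dists.values())
        winners = [name for name, d in dists.items() if d == best]
        counts[winners[0] if len(winners) == 1 else "ambiguous"] += 1
    return counts
```

If a single specimen yields a substantial number of reads for both types, that would correspond to the two sequence types reported for the 454 run; how many reads count as substantial is a judgment this sketch does not make.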


An ongoing total inventory of 10,000 species of Lepidoptera of the Area de Conservación Guanacaste (ACG) in northwestern Costa Rica has integrated DNA barcoding to assist in identification and in the discovery of new and cryptic species. Over 100,000 individuals from this inventory have been barcoded in this densely sampled regional DNA barcode campaign (Janzen et al. 2005). Barcoding has revealed numerous cases where a morphologically defined species contains two or more sympatric barcode clusters in a neighbour-joining (NJ) tree. At least half of these (often shallow but consistent) cases are found to be supported by morphological and/or ecological differences, reinforcing their status as different (usually undescribed) species (Burns et al. 2008). However, a significant number of these cases lack these correlates and could therefore be cases where nuclear pseudogene copies of mitochondrial genes (NUMTs) have been amplified from some conspecific individuals while the true barcode has been amplified from others, producing two adjacent clusters in an NJ tree (Song et al. 2008). Most NUMTs show obvious signs of being pseudogenes, such as stop codons and frameshift mutations, which allows their rejection as barcodes. Deeper genetic exploration can bring clarity where this rejection is not possible. We are currently cloning and Sanger sequencing the 658 bp standard barcoding region of cytochrome c oxidase subunit 1 (COI) and comparing the results with deep amplicon sequencing using next-generation 454 pyrosequencing. In a later phase of this project we will use nuclear markers to augment our current results.

Fig 2. Experimental design showing samples going into the cloning and 454 Pyrosequencing workflows