The Kangaroo Genome − Australia's Secret Weapon · 2016-07-13



The Kangaroo Genome − Australia's Secret Weapon

Jennifer Marshall Graves 1* and Elizabeth Kuczek 2

1 Research School of Biological Sciences, Australian National University, ACT 2601
2 Australian Genome Research Facility, St Lucia, QLD 4072

*Corresponding author: [email protected]

Page 4 AUSTRALIAN BIOCHEMIST Vol 37 No 2 August 2006

SHOWCASE ON RESEARCH

Fig. 1. Phylogeny of vertebrates, emphasising the big gap that marsupials fill between eutherian mammal radiation ~100 MYA and the divergence of mammals from birds and reptiles ~300 MYA.

[Figure 1 labels. Geological periods: Silurian, Devonian, Carboniferous, Permian, Triassic, Jurassic, Cretaceous, Tertiary. Time scale: 410, 360, 290, 250, 210, 140 and 67 MYA. Lineages shown: fish (cartilaginous, bony), amphibians (frogs, salamanders), reptiles (anapsids, diapsids: turtles, snakes, crocodiles), birds, and mammals (synapsids, therians), comprising monotremes (platypus, echidna), marsupials (ameridelphids, dasyurids, macropodids) and eutherians (insectivores, rodents, ungulates, carnivores, lemurs, OW monkeys, NW monkeys, apes, human).]

Australia's unique mammals offer an untapped genomic resource for comparative genomics. The ARC Centre for Kangaroo Genomics, together with the Australian Genome Research Facility (AGRF), is applying high throughput methods and developing new molecular and cytology techniques to map and sequence the genome of our model kangaroo, the tammar wallaby. Marsupial genetics and cytology have already made significant contributions to the understanding of gene function and evolution, and increasing availability of kangaroo DNA sequence information will provide these benefits on a genomic scale. Here we explore the benefits of the kangaroo genome project, describe the genomic resources now available and those being developed, and summarise the contributions from cytogenetic and genetic studies of marsupials.

Mammalian Genomics − Why Marsupials?

The human genome has been fully sequenced, but much remains to be learned about human genes and the small signals that control them, as well as about the 98% of the genome that does not code for protein. Comparative genomics has tremendous power to identify genes and regulatory sequences, and to explore how complex systems of genetic control evolved and how they work. This has great practical, as well as far-reaching theoretical, benefits to our understanding of the organisation, function and evolution of the mammalian (including human) genome.

The power of comparative genome analysis depends on the richness and evolutionary depth of the species that are compared. The chimp, mouse, rat and dog genomes are complete; bovine and, of all things, elephant are on the way. Chicken and three fish genomes are done. However, mouse and even elephant genomes are so closely related to human, having shared a common ancestor a mere 100 million years ago (MYA), that it is hard to identify conserved sequence against the background noise. At the other extreme, birds and fish are so distantly related to humans (>300 and 450 MYA respectively) that it is often impossible to align sequence; besides, they are not much use for studying mammalian characteristics like milk, or mammalian regulatory systems like X inactivation and imprinting.

Marsupials fill this gap. They last shared a common ancestor with humans about 180 MYA (1), so they occupy the evolutionary sweet spot that is close enough for meaningful comparison, but distant enough to provide major genetic variation (2). Importantly, too, they are mammals, and share many mammalian characteristics (see Fig. 1).


The Kangaroo Genome Project

Until two years ago, no marsupial genome was even on the drawing boards. In 2002 a consortium of (largely Australian) marsupial geneticists and biologists, genome scientists and bioinformatics experts submitted a proposal to the National Institutes of Health (NIH), USA, to sequence the genome of a model kangaroo. This ultimately lost out to a hurriedly commissioned bid to sequence an 'American marsupial', the short-tailed gray opossum Monodelphis domestica, a rainforest species from Brazil. This project, a whole genome shotgun with 6-fold coverage, is now in its assembly stage. Happily, the Broad Genome Centre in Boston has involved several Australian scientists with the annotation and interpretation of the opossum sequence. The first large-scale exploration of opossum sequence, establishing how the major histocompatibility complex evolved, was led by an Australian group (3).

The arguments we made for sequencing marsupials propelled our efforts toward securing funding for an Australian-led onslaught on the kangaroo genome (4). Major funding is being provided by the Victorian Government, and, in an unprecedented gesture of goodwill, this is being matched by the NIH. The availability of the two marsupial genome sequences will provide an important comparison, given that the degree of divergence between these two marsupials (~80 MY) is the same as that between mouse and humans.

Whole genome shotgun sequencing is a joint effort between the AGRF (Fig. 2) and Baylor Genome Centre in Houston, headed by Richard Gibbs (an Aussie expatriate). Further funding is required for the project; the aim is to have two times coverage available for this species by the end of 2006.

We are using a small member of the kangaroo family, the tammar wallaby (Macropus eugenii), see Fig. 3. Tammar was endorsed as a model 15 years ago by the marsupial genetics and reproduction communities (5) because it is small and cheap, can be bred in captivity and is relatively easy to handle. The species is captive bred at three sites in Australia, and has been the subject of many classic genetic and genomic, as well as physiological, developmental and ecological studies.

The tammar wallaby genome project is the only major genome project being undertaken in Australia. Already trace archives contain sequence that can be used for many studies of marsupial genes. Comparison of tammar wallaby and/or opossum sequences with sequence from eutherian (placental) mammals such as human and mouse has already borne fruit, and sometimes upset the applecart.


Fig. 2. Sequencing of the wallaby genome to generate the necessary approximately six million genomic reads (one times coverage) requires high throughput processes coupled with automation. Matthew Johnson (AGRF) prepares plates for processing on a robotic liquid handling system.

Fig. 3. Tammar wallabies are relatively easy to breed in captivity. A joey is pictured here at the Melbourne breeding colony with (from left) Dr Matthew Wakefield, Professor Marilyn Renfree and Professor Jenny Graves.

The Kangaroo Genome

The marsupial genome is about 3.3 billion base pairs in size, similar to eutherian mammal genomes. However, it is packaged into a few large chromosomes that are a cytologist's dream because they are all easily identified. Marsupial karyotypes are very stable across even the most diverged lineages, whose karyotypes are separated by just a few simple rearrangements. This was demonstrated by G-banding and, more recently, chromosome painting between marsupial groups using fluorescence-tagged DNA from isolated chromosomes as a probe for in situ hybridisation (Fig. 4) (6). This has confirmed the original deduction that there is an ancestral marsupial karyotype with 12 autosomes and a pair of sex chromosomes.

The kangaroo family has more variable chromosome numbers (10-22) that arose from a macropodid ancestor with 22 chromosomes. The tammar wallaby has 16 chromosomes that are a cytologist's dream (7).
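The coverage figures quoted in this article can be tied together with a little arithmetic. The Python sketch below assumes that "coverage" means the usual average depth (number of reads times read length, divided by genome size); the function names are hypothetical, and the read length is not stated in the article but is only inferred from the two quoted numbers (about six million reads for one times coverage of a genome of about 3.3 billion base pairs).

```python
# Sketch only: coverage is assumed to mean average depth,
# i.e. (number of reads * read length) / genome size.

GENOME_SIZE_BP = 3_300_000_000  # "about 3.3 billion base pairs" (from the text)
READS_PER_1X = 6_000_000        # "approximately six million genomic reads" for 1x (Fig. 2)


def implied_read_length(genome_size_bp: int, reads_per_1x: int) -> float:
    """Average read length implied by the number of reads quoted for 1x coverage."""
    return genome_size_bp / reads_per_1x


def coverage(num_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth of coverage for a given number of reads."""
    return num_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage: int, read_length_bp: int, genome_size_bp: int) -> int:
    """Number of reads required to reach a target coverage (rounded up)."""
    total_bases = target_coverage * genome_size_bp
    return -(-total_bases // read_length_bp)  # integer ceiling division


if __name__ == "__main__":
    read_len = implied_read_length(GENOME_SIZE_BP, READS_PER_1X)
    print(f"Implied average read length: {read_len:.0f} bp")
    print(f"Reads for 2x coverage: {reads_needed(2, int(read_len), GENOME_SIZE_BP):,}")
```

Under these assumptions the quoted numbers imply reads of about 550 bases, and the two times coverage targeted for the tammar would need roughly 12 million reads.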

Because marsupials and monotremes (platypus and echidna) diverged from eutherian mammals 180 and 210 MYA, respectively, sequences have diverged much more than between, say, human and mouse (80 MYA). Approximately 34% of the marsupial sequence already available in the trace archives was alignable with the human genome in a large study (8), compared with 45%-75% for other eutherian mammals.

Gene Finding and Annotation

Comparisons between tammar and human sequences have been used to find (or have inadvertently turned up) several novel human genes, and have established their evolutionary history and offered clues to their function.

There have been many accidental discoveries of new genes from comparisons between human and marsupial genomes. Attempts to discover the tammar homologue of a testis-specific Y chromosome gene RBMY involved in human spermatogenesis unexpectedly revealed a homologue on the human X chromosome. RBMX is expressed widely in the body, and appears to be critical for brain development (9, 10). As well as discovering an important new human gene, this work overturned a popular hypothesis that male-specific genes on the Y chromosome were kidnapped from autosomes. Rather, it established that most sex and spermatogenesis genes on the Y chromosome are relics of genes with general functions on the X, from which the Y evolved (11). This turns out to be true of most of the sex-specific genes on the Y, even including the sex-determining gene itself, SRY, which evolved from an X homologue SOX3 that is also involved in brain development (is there a message here?).

Between-species comparisons of protein coding regions can identify parts of a protein that are particularly strongly conserved, and therefore likely to be most important for function (12). In the first large-scale comparison of marsupial and eutherian sequences, Chapman et al. (13) analysed the genomic region surrounding the lymphoblastic leukaemia derived sequence 1 (LYL1) gene. Comparisons between mouse and human sequences showed high conservation over the region. When the marsupial sequence was compared with human and mouse, non-coding homology was reduced to the extent that all promoters and exons could be readily identified, as well as putative transcription factor-binding sites.

A spectacular example is the sex determining gene SRY. Alignment of human, mouse and kangaroo sequence first revealed that this important gene was not, as might have been expected, highly conserved (14). In fact, the only region that is recognisable between these species is a 240 base pair region that codes for an 80 amino acid domain that binds to DNA and bends it through a specific angle. Regions outside this high mobility group (HMG) box area cannot even be aligned. This implies that binding of DNA by the HMG box is all that is required of SRY, and makes us wonder whether this 'master gene' is an activator of testis genes, or, rather, an inhibitor of testis-inhibiting genes. More recently, comparison of coding sequence in the ATRX gene (alpha thalassemia/mental retardation on the X) between tammar wallaby and human identified conserved protein binding sites that offer insights into the function of this large and complex multifunctional protein (15).

Non-Coding Regions

Alignments of sequence between tammar and placental mammals can also reveal functional signals buried in non-coding DNA. Because marsupials and monotremes diverged from eutherian mammals such a long time ago, non-functional sequences are expected to have diverged beyond recognition, making conserved sequences easier to spot. The smaller proportion of alignable sequence improves the selectivity of the analysis, permitting rapid identification of the most conserved (therefore the most important) functional non-coding regions.

Comparisons of sequence between four placental mammals were effective in defining regulatory regions in the untranslated regions of genes (16), but the addition of marsupial sequence makes this comparison many times more stringent. For instance, lining up the prion gene PRNP and flanking regions in tammar against mouse, sheep and human identified, for the first time, conserved transcription factor binding sites in the 5' region and the first intron (17).

This phylogenetic footprinting strategy is very powerful, as demonstrated by the comparison of a 1.9 Mb region from three marsupials (the North American opossum Didelphis virginiana, the Brazilian opossum Monodelphis domestica and the tammar wallaby Macropus eugenii) and a monotreme (the platypus Ornithorhynchus anatinus). The results clearly confirm our longstanding prediction that marsupials make a unique contribution to the power of comparative analysis. Non-eutherian sequence can therefore make a strong contribution to comprehensive functional annotation of non-coding DNA, such as is being undertaken by the ENCODE (ENCyclopedia Of DNA Elements) project (18).
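The footprinting idea can be illustrated with a toy calculation. The Python sketch below assumes the sequences have already been aligned column by column, and simply reports stretches where every species carries the same base. The three short sequences are invented for illustration and are not real PRNP or other genomic data; the function name conserved_windows and its parameters are hypothetical, and real analyses use proper alignment and scoring tools rather than exact identity.

```python
# Toy phylogenetic footprinting on a pre-aligned set of sequences.
# All sequences below are made up for illustration only.

def conserved_windows(alignment, window=6, min_identity=1.0):
    """Return half-open (start, end) column intervals in which at least
    `min_identity` of the columns in every `window`-wide stretch are
    identical (and ungapped) across all aligned sequences."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")

    # A column counts as conserved if every species has the same non-gap base.
    identical = [len(set(col)) == 1 and col[0] != "-" for col in zip(*seqs)]

    intervals = []
    for start in range(length - window + 1):
        fraction = sum(identical[start:start + window]) / window
        if fraction < min_identity:
            continue
        end = start + window
        if intervals and start <= intervals[-1][1]:
            # Merge with the previous overlapping or adjacent window.
            intervals[-1] = (intervals[-1][0], max(intervals[-1][1], end))
        else:
            intervals.append((start, end))
    return intervals


HUMAN = "ACGTTGACGTCAAGCTTAGC"   # hypothetical
MOUSE = "ACGATGACGTCAAGGTTAGC"   # hypothetical
TAMMAR = "TCGCTGACGTCATGATCAGA"  # hypothetical

eutherian_only = conserved_windows({"human": HUMAN, "mouse": MOUSE})
with_marsupial = conserved_windows({"human": HUMAN, "mouse": MOUSE, "tammar": TAMMAR})

if __name__ == "__main__":
    print("human + mouse:", eutherian_only)
    print("human + mouse + tammar:", with_marsupial)
```

With these made-up sequences, the human-mouse comparison reports a conserved block spanning columns 4 to 13, whereas adding the tammar sequence trims it to columns 4 to 11, mirroring how a more distant species narrows the candidate functional region.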

Complex Genetic Systems

Of particular interest are comparisons of complex genetic systems between distantly related mammals. X chromosome inactivation in female mammals, representing large scale transcriptional inhibition of a whole chromosome, has been exhaustively studied in humans and mice without getting to the bottom of what seems to be a very complex regulatory system, involving layer upon layer of repressive devices. X inactivation in kangaroos appears to be a simpler and less stable system (19), so molecular comparisons of the sequences that control inactivation can provide unique insight into how the human system evolved, and how it works.

Fig. 4. Members of the kangaroo family have wonderful large chromosomes. Chromosome painting shows that the chromosomes of different species are all very similar. Here tammar wallaby chromosomes have been physically separated by flow sorting. DNA from tammar chromosomes two (green), seven (pink) and the X (white) has been tagged with different fluorochromes and hybridised back to the chromosomes of another species (the swamp wallaby Wallabia bicolor), whose enormous chromosomes are shown to be fusions of tammar chromosomes.

Likewise, there is tremendous interest in comparing genomic imprinting between marsupials, mice and humans. This parent-specific gene repression, again, appears to be complex at the molecular level. Worse, as a system that eschews the benefits of diploidy, it doesn't seem to make sense. Exploring the expression of imprinted genes in marsupials has provided a unique insight, not only into its molecular mechanism, but also into the reasons why it has been selected (20). Recent studies again pour cold water on the favoured hypothesis, that imprinting resulted from a war between the interests of the paternal and maternal genomes. The hypothesis that the paternal genome is favoured by rapid growth of the foetus, even at the expense of the mother, whereas the maternal genome does better if the mother saves herself to have other children (not necessarily by the same father) predicts that imprinting arose at the same time that viviparity evolved. However, we find that some imprinted domains were not even assembled until well after the divergence of marsupials and eutherians (Rapkins et al., unpublished data).

Unique Characters − Medical and Commercial Opportunities?

Analysing the genomes of marsupials is particularly interesting because of the unique anatomical, physiological and genetic features of this group of mammals (21). Marsupials are famous for their striking differences in reproduction. For a start, many marsupials have the ability to turn off and on the development of the early embryo; after mating the fertilised egg divides a few times, then the blastocyst passes into a state of suspended animation (diapause) that can last for up to 11 months before the cessation of suckling of another joey triggers resumption of development. Knowing how marsupials turn blastocyst development off and on could have enormous implications for discovery of new contraceptives and new treatments for infertility.

Most famously, marsupials give birth to very underdeveloped (altricial) young, which have yet to complete hindlimb and organ development (for instance, the gonad is not differentiated until after birth). The young are essentially a mouth and a gut, and forelimbs with which to climb through the mother's fur to reach a teat (often but not always protected in a pouch). The tiny newborn attaches to the teat and suckles continuously for some months while development continues. The different developmental stages demand quite different milk formulations, and the constituents of marsupial milk change radically over the months of lactation.

Some of these unique features provide opportunities to discover new products that could be useful to medicine or agriculture, or even be of commercial interest. For instance, we could learn new tricks in the care of premature infants from marsupial genes that control the sophisticated milk producing system. Recently, a Melbourne group announced that tammar wallaby milk contains a novel antibiotic that could be used to solve increasingly serious infection problems. The pouch is also of great interest, since it somehow manages to inhibit the growth of microorganisms.

Thus knowledge and understanding of kangaroo genome sequence could lead to new treatments for premature births, better milk production in cows, as well as novel antibiotics.

Conclusion

The scale and power of mammalian comparative genome analysis have taken a big leap forward in the last few years, so that evolutionary genomics was nominated as 'the breakthrough of the year' on the cover of Science magazine in December 2005. Happily, Australia is now entering this burgeoning field with the contribution of unique sequence data from the kangaroo. The comparative-genomics firepower arising from mammal sequence datasets is escalating with the inclusion of sequence from Australia's iconic mammals.

References

1. Woodburne, M.O., Rich, T.H., and Springer, M.S. (2003) Mol. Phylogenet. Evol. 28, 360-385
2. Wakefield, M.J., and Graves, J.A.M. (2003) EMBO Rep. 4, 143-147
3. Belov, K., Deakin, J.E., Papenfuss, A.T., Baker, M.L., et al. (2006) PLOS Biology in press
4. Graves, J.A.M., Wakefield, M.J., Renfree, M.B., Cooper, D.W., Speed, T., Lindblad-Toh, K., Lander, E.S., and Wilson, R.K. (2002) http://www.genome.gov/Pages/Research/Sequencing/SeqProposals/WallabySEQ.pdf
5. Hinds, L.A., Poole, W.E., Tyndale-Biscoe, C.H., van Oorschot, R.A., and Cooper, D.W. (1990) Aust. J. Zool. 37, 223-234
6. Rens, W., O'Brien, P.C., Yang, F., Graves, J.A.M., and Ferguson-Smith, M.A. (1999) Chromosome Res. 7, 461-474
7. Alsop, A.E., Miethke, P., Rofe, R., Koina, E., Sankovic, N., Deakin, J., Haines, H., Rapkins, R., and Graves, J.A.M. (2005) Chromosome Res. 13, 627-636
8. Margulies, E.H., Maduro, V.V., Thomas, P.J., Tomkins, J.P., Amemiya, C.T., Luo, M., and Green, E.D. (2005) Proc. Natl. Acad. Sci. USA 102, 3354-3359
9. Delbridge, M.L., Lingenfelter, P.A., Disteche, C.M., and Graves, J.A.M. (1999) Nature Genet. 22, 223-224
10. Tsend-Ayush, E., O'Sullivan, L.A., Grützner, F., Onnebo, S.M., Lewis, R.S., Delbridge, M.L., Graves, J.A.M., and Ward, A.C. (2005) Dev. Dyn. 234, 682-688
11. Graves, J.A.M. (2006) Cell 124, 901-914
12. Yang, Z. (2005) Proc. Natl. Acad. Sci. USA 102, 3179-3180
13. Chapman, M.A., Charchar, F.J., Kinston, S., Bird, C.P., Grafham, D., Rogers, J., Grutzner, F., Graves, J.A.M., Green, A.R., and Gottgens, B. (2003) Genomics 81, 249-259
14. Foster, J.W., and Graves, J.A.M. (1994) Proc. Natl. Acad. Sci. USA 91, 1927-1931
15. Park, D.J., Pask, A.J., Huynh, K., Renfree, M.B., Harley, V.R., and Graves, J.A.M. (2004) Gene 339, 39-48
16. Xie, X., Lu, J., Kulbokas, E.J., Golub, T.R., Mootha, V., Lindblad-Toh, K., Lander, E.S., and Kellis, M. (2005) Nature 434, 338-345
17. Premzl, M., Delbridge, M.L., Gready, J.E., Wilson, P., Johnson, M., Davis, J., Kuczek, E., and Graves, J.A.M. (2005) Gene 349, 121-134
18. ENCODE Project Consortium (2004) Science 306, 636-640
19. Cooper, D.W., Johnston, P.G., Graves, J.A.M., and Watson, J.M. (1993) Sem. Dev. Biol. 4, 117-128
20. O'Neill, M.J., Ingram, R.S., Vrana, P.B., and Tilghman, S.M. (2000) Dev. Genes Evol. 210, 18-20
21. Tyndale-Biscoe, C.H. (2005) Life of Marsupials, CSIRO Publishing
