The Evolution of Alternative Splicing in the Pax Family: The View from the Basal Chordate Amphioxus


Stephen Short · Linda Z. Holland

Received: 28 January 2008 / Accepted: 22 April 2008 / Published online: 14 May 2008

© Springer Science+Business Media, LLC 2008

Abstract Pax genes encode transcription factors critical for metazoan development. Large-scale gene duplication with subsequent gene losses during vertebrate evolution has resulted in two human genes for each of the Pax1/9, Pax3/7, and Pax4/6 subfamilies and three for the Pax2/5/8 subfamily, compared to one each in the cephalochordate amphioxus. In addition, alternative splicing occurs in vertebrate Pax transcripts from all four subfamilies, and many splice forms are known to have functional importance. To better understand the evolution of alternative splicing within the Pax family, we systematically surveyed transcripts of the four amphioxus Pax genes. We found alternative splicing in every gene. Comparisons with vertebrates suggest that the number of alternative splicing events per gene has not decreased following duplication; levels in the four amphioxus Pax genes are comparable to those in each gene of the equivalent vertebrate families. Thus, the total number of isoforms for the nine vertebrate genes is considerably higher than for the four amphioxus genes. Most alternative splicing events appear to have arisen since the divergence of the amphioxus and vertebrate lineages, suggesting that differences in alternative splicing could account for divergent functions of the highly conserved Pax genes in the two lineages. However, several events predicted to dramatically alter known functional domains are conserved between amphioxus and vertebrates, suggestive of a common chordate function. Our results, together with previous studies of vertebrate Pax genes, support the theory that alternative splicing impacts functional motifs more than does gene duplication followed by divergence.

Keywords Pax · Alternative splicing · Amphioxus · Branchiostoma · Gene duplication

Introduction

The Pax genes encode transcription factors that are vital for many developmental processes and play important roles in a diverse range of diseases (Chi and Epstein 2002; Robson et al. 2006). They are defined by a 128-amino acid DNA binding domain, termed the “paired domain,” that folds as two subdomains, termed the PAI (N-terminal) and RED (C-terminal) domains (Xu et al. 1995). The PAI subdomain cooperates with the RED subdomain and is required for binding to target sequences (Czerny et al. 1993; Pellizzari et al. 1999; Zwollo et al. 1997). The genes are subdivided into four classes based on the presence or absence of additional motifs (Robson et al. 2006). Class I contains Pax1 and Pax9, which also encode an octapeptide sequence that interacts with the Groucho corepressor (Eberhard et al. 2000; Kreslova et al. 2002), but they lack a homeodomain. The class II Pax genes, Pax2, Pax5, and Pax8, encode the octapeptide and also a partial homeodomain, whereas the class III genes, Pax3 and Pax7, encode the octapeptide plus a full homeodomain. Finally, the class IV Pax genes, Pax4 and Pax6, encode the full homeodomain but lack the octapeptide. In addition, Pax proteins contain a transactivation domain located in their C-terminal regions (Chalepakis et al. 1994; Dorfler and Busslinger 1996; Kalousova et al. 1999; Nornes et al. 1996; Schafer et al. 1994; Tang et al. 1998). The C-terminal regions of Pax2, Pax5, Pax8, and Pax4 also contain inhibitory domains (Dorfler and Busslinger 1996; Fujitani et al. 1999). Such inhibitory domains have not yet been clearly demonstrated for the other Pax genes.

Electronic supplementary material The online version of this article (doi:10.1007/s00239-008-9113-5) contains supplementary material, which is available to authorized users.

S. Short · L. Z. Holland (✉)

Marine Biology Research Division, Scripps Institution of Oceanography, La Jolla, CA 92093-0202, USA

e-mail: [email protected]

J Mol Evol (2008) 66:605–620

DOI 10.1007/s00239-008-9113-5
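The four-class scheme described in the Introduction amounts to a lookup on two features: whether the octapeptide is present, and whether the homeodomain is absent, partial, or full. The following is a minimal sketch of that lookup, not part of the original study; the names `Homeodomain` and `pax_class` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Homeodomain(Enum):
    """Extent of the homeodomain encoded by a Pax gene (illustrative type)."""
    NONE = "none"
    PARTIAL = "partial"
    FULL = "full"


def pax_class(has_octapeptide: bool, homeodomain: Homeodomain) -> str:
    """Return the Pax class (I-IV) for a motif combination described in the text."""
    table = {
        (True, Homeodomain.NONE): "I",      # Pax1, Pax9
        (True, Homeodomain.PARTIAL): "II",  # Pax2, Pax5, Pax8
        (True, Homeodomain.FULL): "III",    # Pax3, Pax7
        (False, Homeodomain.FULL): "IV",    # Pax4, Pax6
    }
    key = (has_octapeptide, homeodomain)
    if key not in table:
        # Combinations the text does not describe are rejected rather than guessed.
        raise ValueError("motif combination not described in the text")
    return table[key]


# Amphioxus has one gene per class, e.g. AmphiPax3/7 falls in class III.
print(pax_class(True, Homeodomain.FULL))
```

All members of all four classes additionally share the paired domain, so it is not needed to tell the classes apart.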

Despite two apparent rounds of whole-genome duplication in the vertebrate lineage, the human and mouse genomes have surprisingly few genes (Lander et al. 2001; Putnam et al. 2008; Waterston et al. 2002), apparently due to losses of many duplicates. It has been suggested that alternative splicing may help compensate for such gene loss by allowing a greatly expanded and diversified proteome from a relatively small number of genes (Graveley 2001). All four classes of vertebrate Pax genes have alternatively spliced transcripts, with many isoforms having distinct activities (Anspach et al. 2001; Azuma et al. 2005; Kozmik et al. 1993; Lamey et al. 2004; Miyamoto et al. 2001; Nornes et al. 1996; Ritz-Laser et al. 2000; Robichaud et al. 2004; Wang et al. 2007).

Cephalochordates (amphioxus), which diverged from vertebrates about half a billion years ago (Shu et al. 1999), represent basal chordates (Blair and Hedges 2005; Bourlat et al. 2006; Philippe et al. 2005). Amphioxus shares a fundamental body plan with the vertebrates, but is much simpler both structurally and genomically, and therefore is useful as a model for the ancestral vertebrate before vertebrates apparently underwent several whole-genome duplications (Holland et al. 2004; Holland 2003; Putnam et al. 2007, 2008). Consequently, since the amphioxus genome has very little gene duplication or gene loss, it has only a single gene in each of the four Pax classes (Glardon et al. 1998; Holland et al. 1999, 1995; Kozmik et al. 1999). The functional domains in each class of vertebrate Pax genes are conserved in amphioxus and vertebrates and in all the vertebrate duplicates, suggesting strong evolutionary constraints for maintaining the particular domain combinations even after gene duplication (Glardon et al. 1998; Holland et al. 1995, 1999; Kozmik et al. 1999) (Fig. 1A–D). To date, alternative splicing of amphioxus Pax genes has been described only in AmphiPax4/6 (Glardon et al. 1998) and AmphiPax2/5/8 (Kozmik et al. 1999), with two isoforms of AmphiPax2/5/8 displaying functional equivalency to isoforms of human Pax8 (Kreslova et al. 2002). However, the extent of alternative splicing in amphioxus Pax transcripts has not been systematically investigated.

Recent studies have found an inverse correlation between gene duplication and alternative splicing (Kopelman et al. 2005; Su et al. 2006), suggesting that the two mechanisms could be interchangeable sources of functional diversification. However, it was recently shown that the two processes have different effects on the proteome, with alternative splicing having a greater impact on protein sequence and structure than does duplication followed by divergence (Talavera et al. 2007). To contribute to our understanding of the relationship between alternative splicing and gene duplication and investigate the evolution of alternative splicing within the chordate Pax genes, we systematically surveyed the alternative splicing of the amphioxus Pax transcripts. The nested PCR approach we used is more sensitive and more comprehensive in terms of tissue types and developmental stages surveyed than any study carried out to date for vertebrate Pax genes.

Our results showed that there are alternative splicing events in all four amphioxus Pax transcripts but at differing levels. Compared to vertebrates, amphioxus has approximately the same or even fewer splice forms per Pax gene, indicating not only that the number of alternative splicing events has not decreased subsequent to gene duplication but that the total number of alternatively spliced Pax isoforms for the nine vertebrate Pax genes is probably considerably higher than for the four amphioxus ones. Alternative splicing of amphioxus, as well as of vertebrate, Pax genes is predicted to dramatically alter known functional domains, creating much greater differences than among the duplicates of a given vertebrate Pax gene. Moreover, although most alternative splicing events are divergent between the amphioxus and the vertebrate Pax homologues, several events are conserved; a notable example is one that removes most of the paired domain of AmphiPax2/5/8 and vertebrate Pax5 and is known to alter DNA binding (Zwollo et al. 1997). This conservation of mRNA splice forms over a wide phylogenetic distance implies conservation of protein function and suggests that comparison of alternative splice forms over large phylogenetic distances may be a useful strategy for distinguishing functionally important isoforms.

Fig. 1 Graphical representations of ClustalW alignments. The functional domains are conserved in vertebrate and amphioxus Pax genes and are maintained in all vertebrate duplicates. Exon numbers and functional domains are included to allow comparison of previously reported vertebrate alternative splicing events with the amphioxus events reported in this study. There is less sequence conservation in the C-terminal regions, and therefore, the alignment of corresponding exons is less certain (introns not to scale). (A) Alignment of AmphiPax1/9 (Bf) (accession no. AJ238974) and mouse (Mm) Pax1 (accession no. NM008780) and Pax9 (accession no. NM001041). The coding sequence of AmphiPax1/9 is spread over five exons (see Results). (B) Alignment of AmphiPax2/5/8 α (Bf) (accession no. AF053762), with exon 7, used in the β form (accession no. AF053763), also included, and mouse Pax2 (accession no. NP035167), Pax5 (accession no. CAM23221), and Pax8 (accession no. NM011040). The coding sequence of AmphiPax2/5/8 is spread over 11 exons. The region encoded by exon 4 in amphioxus is split into multiple exons in the vertebrate genes. An exon equivalent to exon 8 in human Pax8 is also found in Xenopus Pax2 but is labeled exon 9 (Heller and Brandli 1997). (C) Alignment of AmphiPax3/7 (Bf) (accession no. AF165886), mouse Pax3 (accession no. AK044985), and mouse Pax7 (accession no. AF254422). The coding region of AmphiPax3/7 is spread over six exons. The paired domain and octapeptide are contained within exon 1 in amphioxus; however, they are spread over multiple exons in vertebrates. (D) Alignment of AmphiPax4/6 (Bf) (accession no. AJ223444), mouse Pax4 (accession no. AF031150), and mouse Pax6 (accession no. CAA453380). The coding sequence of AmphiPax4/6 is spread over 13 exons (due to the use of multiple start codons, exons 1 and 2 are missing from this cDNA sequence)

Materials and Methods

Identification and Characterization of Alternatively Spliced Transcripts

The technique used to isolate isoforms has been described previously (Gorlov and Saunders 2002). It involves two rounds of PCR. In brief, a first round of RT-PCR with primers spanning a given region of the transcript (see below) is followed by electrophoretic separation of the RT-PCR product and DNA extraction from sections of the agarose gel (QIAquick Gel Extraction Kit; Qiagen, Valencia, CA, USA) surrounding the expected band. These sections potentially contain the PCR product of uncharacterized isoforms. Extractions were performed regardless of whether an additional band was evident following the initial RT-PCR reaction. Finally, the contents of the extracted sections were used as templates for a second round of PCR with nested primers. Identification of isoforms more than ~150 bp greater or less than the major isoform was performed using nested primers spanning the entire transcript. The isolation of splice variants that differ only slightly from the major isoform/s (~25 bp) was performed using the same technique but with nested primers flanking each exon. As an example, to analyze the potential alternative splicing of all, or part of, AmphiPax2/5/8 exon 2, the above method was used but with nested primers targeted against exons 1 and 3 instead of across the entire transcript. The same approach was used for exons along all four AmphiPax transcripts. Potential isoforms evident from either the first or the second round of PCR reactions were eluted, cloned directly into a TA vector (Invitrogen, Carlsbad, CA, USA), and characterized by automated sequencing (Seqxcel Inc., La Jolla, CA, USA). In some cases, additional PCR reactions were performed using various combinations of primers to gain information regarding the context of splicing events or to check for intron retention. The sequences of all primers and their exon locations are listed in the supplementary materials. The use of standard splice donor and/or acceptor sites was confirmed using the Branchiostoma floridae v.1.0 genome sequence (http://www.genome.jgi-psf.org/Brafl1/Brafl1.home.html).
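The exon-by-exon survey described above places nested primers in the exons on either side of the exon under study (exons 1 and 3 for exon 2). A minimal sketch of that bookkeeping follows. The function names are hypothetical, and the restriction to internal exons is an assumption of this sketch, since the text does not say how the first and last exons were handled in the per-exon survey.

```python
from typing import Dict, Tuple


def flanking_exons(target_exon: int, n_exons: int) -> Tuple[int, int]:
    """Exons in which nested primers would sit to survey `target_exon`."""
    if not 1 < target_exon < n_exons:
        # Assumption: only internal exons have a flanking exon on both sides.
        raise ValueError("target exon must be internal")
    return target_exon - 1, target_exon + 1


def exon_survey_plan(n_exons: int) -> Dict[int, Tuple[int, int]]:
    """Map every internal exon to its pair of flanking primer exons."""
    return {exon: flanking_exons(exon, n_exons) for exon in range(2, n_exons)}


# The AmphiPax2/5/8 coding region is spread over 11 exons according to the text.
plan = exon_survey_plan(11)
print(plan[2])  # exon 2 is surveyed with primers in exons 1 and 3
```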

Animal Collection and RT-PCR Analysis

Branchiostoma floridae adults and developmental stages were obtained as previously described (Holland and Yu 2004) and stored in 4 M guanidinium thiocyanate, 25 mM sodium citrate, 0.5% Sarcosyl, 0.1 M β-mercaptoethanol. Total nucleic acid was isolated via multiple rounds of pH 4.7 phenol–chloroform (5:1) extractions, followed by ethanol precipitation. The DNA was removed with RNase-free DNase (New England Biolabs, Ipswich, MA, USA). RNA (5 µg) was reverse transcribed into cDNA with Superscript II (Invitrogen, Carlsbad, CA, USA) and stored at −20°C. The cDNA was subjected to PCR for 36 cycles: 1 min at 94°C, 40 s at 56°C, and 2 min at 72°C for surveys across entire transcripts or 1 min at 72°C for surveys across individual exons. The second round of PCR reactions was identical to the first round except that reactions were performed for 32 cycles.
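As a rough guide to what the cycling conditions above imply, the total time spent at the programmed temperatures can be worked out from the stated step lengths and cycle numbers. The sketch below is illustrative only: the function name is hypothetical, and it ignores ramp times and any initial denaturation or final extension step, none of which are given in the text.

```python
def cycling_hold_seconds(cycles: int, extension_s: int,
                         denaturation_s: int = 60, annealing_s: int = 40) -> int:
    """Total programmed hold time: cycles x (denaturation + annealing + extension)."""
    return cycles * (denaturation_s + annealing_s + extension_s)


# Step lengths and cycle numbers are taken from the text:
# 36 cycles in the first round and 32 in the second;
# 2 min extension across entire transcripts, 1 min across individual exons.
for survey, extension_s in (("entire transcript", 120), ("individual exon", 60)):
    for round_name, cycles in (("first", 36), ("second", 32)):
        minutes = cycling_hold_seconds(cycles, extension_s) / 60
        print(f"{survey}, {round_name} round: {minutes:.1f} min")
```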

Results

Experimental Design

To identify a maximum number of Pax splice forms and determine developmental stage specificity, we surveyed splicing with a nested RT-PCR based technique (Gorlov and Saunders 2002) using RNA isolated from amphioxus neurulae, early larvae and adults. Each of these stages had been shown by semiquantitative RT-PCR (data not shown) and in situ hybridization to express all four Pax genes (Glardon et al. 1998; Hetzer-Egger et al. 2000; Holland et al. 1999, 1995; Kozmik et al. 1999). Comparisons of the sequences of PCR products with the Branchiostoma floridae v.1.0 genome sequence (http://www.genome.jgi-psf.org/Brafl1/Brafl1.home.html) confirmed the use of standard splice donor and/or acceptor sites (GT and AG, respectively) in all the splicing events we found.
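The splice-site check mentioned above reduces to asking whether the genomic sequence removed by a splicing event begins with GT (donor) and ends with AG (acceptor). A minimal sketch of such a check is shown here; the function name and the example sequences are invented for illustration and are not taken from the amphioxus genome or from the authors' analysis.

```python
def has_canonical_splice_sites(intron: str) -> bool:
    """True if the intron starts with a GT donor and ends with an AG acceptor."""
    seq = intron.upper()
    # An intron needs at least the two dinucleotides themselves.
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Invented toy sequences, for illustration only.
print(has_canonical_splice_sites("gtaagtcttttctcag"))  # GT ... AG
print(has_canonical_splice_sites("gcaagtcttttctcag"))  # GC donor, not canonical
```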

Although our survey is as comprehensive as possible, we could not identify isoforms produced by alternative promoters because the PCR method requires the sequence of the first and last exons. The alternative splicing events are, therefore, restricted to alternative use of splice donors, acceptors, exon cassettes, and retained introns. An analysis of alternative splicing events conserved between mouse and human suggests that approximately 70% of all alternative splicing events fall within one of these four categories (Sugnet et al. 2004). However, even by conducting PCR across each exon as well as across the entire transcript, we could only detect differences in length of ~25 bp or more. Smaller events would therefore have escaped detection, even though alternative splicing of trinucleotide and hexanucleotide sequences is known to occur in vertebrate Pax genes and to be functionally important (Kozmik et al. 1997; Lamey et al. 2004). Despite these limitations, this method is more sensitive and, in terms of the number of tissue types and developmental stages surveyed, more comprehensive than any survey of alternative splicing in vertebrate Pax genes to date and provides a lower limit for the amount of alternative splicing of amphioxus Pax transcripts. We consider an alternative splicing event between amphioxus and vertebrates to be conserved only if the exon or retained intron in question is located in the equivalent position of an alternatively spliced exon or retained intron within the vertebrate genes (Fig. 1). The comparisons use alternative splicing events found in the Pax transcripts of a wide range of vertebrate species. However, most events have been isolated in human and/or mouse and these events are used where possible. Notable exceptions include Pax2, for which the most comprehensive survey has been performed in Xenopus (Heller and Brandli 1997), and Pax6, for which a survey performed in pigeon has made an important contribution to our knowledge of splicing events (Bandah et al. 2007). To give an indication of expression levels, we have stated which variants could only be isolated via nested PCR. All new sequences isolated, as well as splice donors and acceptors used, are provided in the supplementary material.

Alternative Splicing Creates Two Isoforms of AmphiPax1/9

Although the AmphiPax1/9 coding region was thought to include four exons (Hetzer-Egger et al. 2000), the B. floridae v.1.0 genome sequence reveals an additional intron disrupting what was designated exon 4, resulting in a total of five exons (Figs. 1A and 2A). In addition to the single isoform of AmphiPax1/9 previously characterized, which we term 5a(-) (Hetzer-Egger et al. 2000; Holland et al. 1995), our survey revealed a longer transcript [5a(+)], resulting from the use of an alternative upstream splice acceptor in exon 5 (Fig. 2B). However, this transcript has an altered reading frame which would code for a truncated protein, missing the C-terminal end of the likely transactivation domain. The transactivation domain also undergoes alternative splicing in vertebrates (Nornes et al. 1996), but the exons and splice sites involved differ from those in amphioxus, suggesting independent evolution.

Semiquantitative RT-PCR with primers flanking exon 5a showed that the 5a(-) isoform is minor and is developmentally regulated relative to the 5a(+) isoform. When volumes loaded on a gel were adjusted to contain equal amounts of the 5a(+) isoform, the 5a(-) form could only be detected at the early larval stage (Fig. 2C).

AmphiPax2/5/8 Transcripts Undergo Considerable Alternative Splicing

The AmphiPax2/5/8 coding region is spread across 11 exons (Figs. 3 and 4A) and is extensively alternatively spliced. Two isoforms, α and β, which result from the skipping or inclusion of exon 7 (Fig. 4B) and possess different transactivation properties, were previously described (Kozmik et al. 1999; Kreslova et al. 2002). We found four different splice events involving exons 1–5, all of which would create isoforms lacking portions of the paired domain and would presumably have altered DNA binding properties (Fig. 3). All events involving exons 1–5, with the exception of exon 4a alternative splicing, were isolated using nested PCR at neurula and larval stages. The skipping of exon 2 would remove most of the PAI subdomain of the paired domain and cause the reading frame to shift, leading to a premature termination codon (PTC) in exon 3. This event has also been found in the Pax5 transcripts of humans, mice, frogs, and zebrafish (Borson et al. 2002; Heller and Brandli 1999; Kwak et al. 2006; Zwollo et al. 1997), and it was suggested that an internal ATG site, which is conserved with AmphiPax2/5/8, may serve as an alternate start codon (Zwollo et al. 1997). Western blots of cell lines expressing human Pax5 showed that for a small percentage of full-length transcripts, the internal ATG can serve as a start codon (Zwollo et al. 1997). If this is true, then splicing-out of exon 2 may regulate the relative proportions of full-length transcripts and of those lacking the paired domain.

Fig. 2 Alternative splicing creates two isoforms of AmphiPax1/9. (A) Use of the B. floridae v.1.0 genome sequence suggests that the coding sequence of AmphiPax1/9 is distributed over five exons. On the basis of vertebrate evidence it is expected that the transactivation domain would be located within, or distributed over, exons 3–5. (B) In addition to the previously characterized isoform, termed 5a(-), another isoform that uses an alternative upstream splice acceptor was found. This results in a longer transcript, termed 5a(+), but is predicted to code for a truncated protein, with an altered potential transactivation domain. (C) Semiquantitative RT-PCR performed with primers flanking exon 5a suggests that the exon 5a splicing event is developmentally regulated. PCR product was loaded to give equal quantities of the 5a(+) form for each stage, and under these conditions, the 5a(-) form could only be detected at the early larval stage. sm, size marker; gas, gastrula; en, early neurula; el, early larval; ad, adult

Skipping of exon 3 does not alter the reading frame and would result in the deletion of nearly all the RED subdomain of the paired domain. The use of an upstream alternative splice acceptor at the 5′ end of exon 4 would result in an extra eight amino acids (termed 4a) at the C-terminal end of the RED subdomain. In addition to deleting the C-terminal portion of the paired box, skipping of exon 4 would also remove the octapeptide and alter the reading frame, causing a PTC within exon 5. Thus, this isoform, which would include most of the paired domain plus an additional 13 amino acids, may bind to DNA and, if so, could act as a dominant-negative.
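The different outcomes described above for skipping exons 2, 3, and 4 follow from a simple rule: removing an exon whose length is a multiple of three leaves the downstream reading frame intact, whereas any other length shifts the frame and usually exposes a premature termination codon. The sketch below illustrates this with an invented three-exon toy transcript. The sequences, exon lengths, and function names are hypothetical and do not correspond to AmphiPax2/5/8.

```python
from typing import List, Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def skipping_preserves_frame(exon_length: int) -> bool:
    """Skipping an exon keeps the downstream frame only if its length is a multiple of 3."""
    return exon_length % 3 == 0


def splice(exons: List[str], skip: Optional[int] = None) -> str:
    """Join exons in order, optionally omitting the exon with 1-based number `skip`."""
    return "".join(e for n, e in enumerate(exons, start=1) if n != skip)


def first_stop_index(cds: str) -> Optional[int]:
    """Nucleotide index of the first in-frame stop codon, reading from position 0."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3].upper() in STOP_CODONS:
            return i
    return None


# Invented toy exons (lengths 6, 4, and 11 nt); exon 2 is not a multiple of 3.
TOY_EXONS = ["ATGGCT", "GGTC", "AGCTGACGTAA"]

full = splice(TOY_EXONS)             # ATG GCT GGT CAG CTG ACG TAA
skipped = splice(TOY_EXONS, skip=2)  # ATG GCT AGC TGA ...

print(skipping_preserves_frame(len(TOY_EXONS[1])))
print(first_stop_index(full), first_stop_index(skipped))
```

In the toy transcript the stop codon moves from the normal end of the coding sequence to a position inside the third exon once exon 2 is skipped, which is the same kind of event as the PTC in exon 3 predicted for exon 2 skipping in the text.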

In addition to the four alternative splicing events in the 5′ half of AmphiPax2/5/8, we found seven involving exons 6–11 (Fig. 4C). These include isoforms that would lack the transactivation domains and possess altered inhibitory domains and, therefore, probably have differing transactivation properties, as previously shown for the α and β isoforms (Kreslova et al. 2002). The α isoform, which skips exon 10 and uses an upstream exon 11 splice acceptor resulting in what we term the α reading frame (Fig. 4B), has both the C-terminal activation domain and the adjacent inhibitory domain. The β isoform includes exon 7, which has a stop codon within the exon (Fig. 4B), and codes for a serine/threonine-rich C-terminal region that cannot transactivate in in vitro experiments (Kreslova et al. 2002). We have numbered the newly discovered C-terminal splice variants I–VII (Fig. 4C). Variant I is like the α isoform but has an altered reading frame due to the use of an alternative splice acceptor within exon 11, which would create a C-terminal domain, termed a “paired-type homeodomain tail” (PHT), a sequence previously suggested to be the true Pax2/5/8 C-terminus (Vorobyov and Horst 2006). Our studies confirm the use of this reading frame, but we suggest that its use, and, therefore, the presence or absence of the PHT, results from alternative splicing. Two splice forms (II and III) skip exons 7–10, and both lack the entire transactivation domain. Variant II uses the downstream splice acceptor within exon 11 resulting in the PHT frame and appears to be conserved in vertebrate Pax8 (Poleev et al. 1995), while variant III, isolated via nested PCR, uses the upstream splice acceptor resulting in the α reading frame. Splice variants IV, V, and VI include all or most of exon 10 and use different combinations of alternative splice donors and acceptors in exons 10 and 11 resulting in variant inhibitory and/or transactivation domains. Both variant IV and variant VI were isolated via nested PCR. Variant IV uses an internal splice donor within exon 10 and the standard exon 11 splice acceptor and changes the reading frame for exon 11. Variant V uses the splice donor within exon 10 in combination with the internal exon 11 acceptor, resulting in the α reading frame minus 6 amino acids of the inhibitory domain. Variant VI uses a downstream splice donor that results in an extra four amino acids at the C-terminal end of exon 10, termed 10b, together with the internal exon 11 splice acceptor, resulting in the PHT reading frame. Finally, splice form VII was isolated via nested PCR and, like the β isoform, includes exon 7. However, rather than continuing to the stop codon used by the β form (Kreslova et al. 2002), it uses a different splice donor in combination with the standard splice acceptor 5′ of exon 8. This exon has been termed 7a. Its inclusion results in an isoform with the serine/threonine-rich region as in the β form, but with the downstream C-terminal sequences described above. This insertion is comparable to exon 8 of human Pax8 (Fig. 1B) in that the human exon is also alternatively spliced and contains a comparable serine/threonine region (Kozmik et al. 1993).

Fig. 3 Alternative splicing in the region that encodes the AmphiPax2/5/8 N-terminus (exons 1–5) would create isoforms lacking portions of the paired domain and the octapeptide sequence. Exon 2 skipping would remove almost all the PAI subdomain of the paired domain, causing a frame shift, resulting in a premature termination codon (PTC) in exon 3. A conserved internal ATG site in exon 3 is suggested as an alternate start codon in vertebrates (Lowen et al. 2001; Zwollo et al. 1997). Skipping of exon 3 does not alter the reading frame and would result in the deletion of almost all the RED subdomain. An upstream alternative splice acceptor at the 5′ end of exon 4 would cause the inclusion of eight amino acids, termed 4a, at the C-terminal end of the RED subdomain. Exon 4 exclusion not only would remove 4a but also would delete the octapeptide sequence, resulting in an altered reading frame and a PTC within exon 5. Translated from the standard start codon, this isoform would include most of the paired domain plus an additional 13 amino acids. On the basis of known splice donor-acceptor combinations, and assuming that the downstream start codon is conserved, there are potentially eight N-terminal isoforms, five of which would be expected to also encode a C-terminal region

AmphiPax3/7 Has Multiple Isoforms

The coding region of AmphiPax3/7 is spread across 6

exons (Figs. 1C and 5A). In addition to the published

sequence, which is the 3(+) isoform (Fig. 5B) (Holland

et al. 1999), nested PCR revealed an isoform lacking exon

3 (Fig. 5B; 3-). This splicing event eliminates 42 bases at

the 30 end of the region encoding the homeobox, alters the

reading frame of exon 4, and results in a stop codon 5

nucleotides into exon 5. This region is equivalent to exon 6

in human and mouse Pax3 and 7 (Fig. 1C). Consequently,

this isoform would lack most of the third helix of the

Fig. 4 Alternative splicing in the region that encodes the Amphi-Pax2/5/8 C-terminal would create isoforms with altered

transactivation and inhibitory domains. (A) The region encoding the

C-terminal (exons 6–11) including an alternatively spliced exon,

termed 10a and 10b. The previously described exon 7 has been split

into 7a and 7b (see below). (B) The previously described AmphiPax2/

5/8 isoforms. The a form (accession no. AF053762) skips exons 7 and

10 and uses the upstream exon 11 stop codon. The b form (accession

no. AF053763) includes exons 7a and 7b and uses a stop codon within

exon 7, creating an altered, serine/threonine-rich, C-terminal

sequence. (C) This survey isolated a further seven alternative splicing

events. We have numbered the C-terminal splice variants I–VII.

Variant I uses a downstream splice acceptor within exon 11, resulting

in an altered reading frame. Two splice forms (II and III) skip exons

7–10 and lack the entire transactivation domain. Variant II uses the

downstream exon 11 splice acceptor, resulting in the PHT reading

frame (Vorobyov and Horst 2006), while variant III uses the upstream

acceptor, resulting in the a reading frame. Splice variants IV, V, and

VI include the novel exon 10, disrupting the previously characterized

inhibitory domain, but use different combinations of alternative splice

donors and acceptors. Splice form VII includes exon 7a but uses a

splice donor, prior to the exon 7 stop codon, in combination with the

standard splice acceptor 5′ of exon 8. This is predicted to result in

isoforms containing a serine/threonine-rich region but potentially

possessing any of the above downstream C-terminal sequences. On

the basis of known splice donor-acceptor combinations, there are

potentially 13 C-terminal isoforms

Fig. 5 Alternative splicing of AmphiPax3/7. (A) The protein is

encoded by six exons and includes a paired domain, an octapeptide,

and a complete homeodomain. The transactivation domain would be

encoded by exons 4–6. (B) In addition to the published sequence (3+)

(Holland et al. 1999), the survey revealed an isoform lacking exon 3

as well as isoforms retaining introns 2, 4, and 5. The removal of exon

3 would remove 14 amino acids from the C-terminal of the

homeodomain and alter the reading frame of exon 4, resulting in a

stop codon within exon 5. The retention of intron 2 would remove the

same 14 amino acids but create an altered C-terminal sequence. The

retention of introns 4 and 5 would both create truncated transactivation

domains. On the basis of known splice donor-acceptor combinations,

there are potentially six AmphiPax3/7 isoforms

homeodomain, which, for other Pax genes, has been shown

to mediate both DNA and protein interactions (Bruun et al.

2005) and would have a shortened potential transactivation

domain with an altered sequence. A possible homologous

splicing event has been described for mouse Pax3, termed

3f (Barber et al. 1999; Seo et al. 1998). Because several

isoforms of vertebrate Pax3 and Pax7 are the result of

retained introns, we actively checked the equivalent

AmphiPax3/7 introns. We found that introns 2, 4, and 5

were retained in amphioxus transcripts (Fig. 5B; In2+,

In4+, In5+). In all cases, the retention causes a stop codon

(predicted using the B. floridae genome) within the retained

intron. Also, in each case, if a splice donor existed down-

stream of the intron primer site but upstream of the

predicted stop codon, we would expect the isoforms to be

evident in the nested PCR exon-exon survey, suggesting

that these events are due to retained introns rather than

alternative 3′ splice donors. Figure 1C shows that the

amphioxus introns are equivalent to introns 5, 7, and 8,

respectively, in vertebrate Pax3 and -7. The retention of

introns 5 and 7 was reported for zebrafish Pax7 (Seo et al.

1998), while the retention of intron 8 has been reported in

zebrafish Pax7 and mouse and human Pax3 and Pax7

(Barber et al. 1999; Seo et al. 1998; Vorobyov and Horst

2004), suggesting these events may be highly conserved.
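
The way in which exon skipping can shift the reading frame and introduce a premature stop codon, as described above for the exon 3(-) form, can be illustrated with a minimal Python sketch. The three exon sequences below are invented for illustration and are not AmphiPax3/7 sequence; the only assumption carried over is that the skipped exon has a length that is not a multiple of three.

```python
# Toy illustration only: these exon sequences are hypothetical,
# not taken from AmphiPax3/7 or any other Pax gene.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop(cds):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i
    return None

def splice(exons, skip=()):
    """Join exon sequences in order, omitting exons whose name is in `skip`."""
    return "".join(seq for name, seq in exons if name not in skip)

# Hypothetical exons; exon "B" is 5 nt long, so skipping it shifts the frame.
exons = [
    ("A", "ATGGCTGCA"),      # ATG GCT GCA
    ("B", "GGTCA"),          # GGT CA-
    ("C", "TGCTGAACGTTAA"),  # -T GCT GAA CGT TAA when read after exon B
]

full = splice(exons)
skipped = splice(exons, skip={"B"})

# Position of the first stop codon, counted from the start of exon "C".
offset_c_full = first_stop(full) - len("ATGGCTGCA") - len("GGTCA")
offset_c_skipped = first_stop(skipped) - len("ATGGCTGCA")

print("full transcript: stop", offset_c_full, "nt into exon C")
print("exon B skipped:  stop", offset_c_skipped, "nt into exon C")
```

In this toy case the normal stop codon lies 10 nucleotides into the last exon, whereas skipping the middle exon moves exon C into a different frame and exposes a stop codon only 3 nucleotides into it.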

Alternative Splicing of AmphiPax4/6 Transcripts

Creates an Isoform Lacking the PAI Subdomain

Previous work identified five isoforms of amphioxus

Pax4/6 (Glardon et al. 1998). Genomic analysis indicates

that three of these (J2, 4.1, and 12.2) use an alternative

downstream promoter, while the other two use the

upstream promoter (Fig. 6A). Our survey confirmed these

splicing events (Fig. 6A and B) and, using nested PCR,

revealed several more in the region encoding the N-ter-

minal half, all of which would use the upstream promoter.

One of these, which would require the use of the

upstream promoter and/or start codon to create an in-

frame protein, involves a new exon (exon 2.1) and would

change the sequence on the N-terminal side of the paired

domain. Another involves alternative splicing of exon 4

(Fig. 6A). The exon 4(+/-) event is analogous to the

alternative splicing of exon 2 found in AmphiPax2/5/8. It

is predicted to remove almost the entire PAI subdomain

of the paired domain and alter the reading frame, leading

to a PTC within the sequence normally coding for exon 7.

However, the use of alternative downstream start codons

or promoters resulting in isoforms lacking the paired

domain is well documented (Bandah et al. 2007; Carriere

et al. 1993; Jaworski et al. 1997; Zhang and Emmons

1995) and, as suggested for Pax2/5/8 exon 2, the alternative

splicing of exon 4 may offer a mechanism to

regulate the relative proportion of both ‘paired’ and

‘paired-less’ forms of AmphiPax4/6.

Discussion

Alternative splicing of primary transcripts is one means for

proteome expansion in metazoans (Blencowe 2006) and is

known to be functionally important in both vertebrate and

amphioxus Pax genes (Epstein et al. 1994; Kreslova et al.

2002). Our analyses point to both conserved and divergent

splicing events impacting known functional domains and

suggest that levels of alternative splicing in the four

amphioxus Pax genes are comparable to those in each gene

of the equivalent vertebrate family. Thus, the total number

of isoforms for the nine vertebrate genes is considerably

higher than for the four amphioxus genes.

Alternative Splicing in the N-terminal Encoding Region

Suggests Functional Conservation and Divergence

Following gene duplication, alternatively spliced isoforms

of the ancestral gene can be subfunctionalized either by

being split between the duplicates and, thus, becoming

Fig. 6 Alternative splicing of AmphiPax4/6. (A) Previous splicing

events (Glardon et al. 1998) are shown. Genomic analysis suggests

that the previously described transcripts (J2, 4.1, and 12.2) use an

alternative promoter (downstream arrow), while the remaining

transcripts use an upstream promoter (upstream arrow). This survey

revealed new isoforms in transcripts driven from the upstream

promoter. This includes the inclusion of exon 2.1, predicted to create

a new sequence on the N-terminal side of the paired domain, but

would require a novel upstream start codon to produce an in-frame

protein. The alternative splicing of exon 4 is predicted to remove

almost the entire PAI subdomain of the paired domain, altering the

reading frame and causing a PTC within the sequence normally

coding for exon 7. (B) This survey confirms previous splice sites,

including an event (13a+/-), predicted to remove highly conserved

residues, one of which is a target for MAP kinase-mediated regulation

in vertebrates (Mikkola et al. 1999). On the basis of known splice

donor-acceptor combinations, and not assuming the existence of

uncharacterized start codons, there are potentially 18 AmphiPax4/6

isoforms

encoded by distinct genes (MacLean et al. 1997) or by

losing some duplicate splice-forms. Neofunctionalization

of splice forms, defined as an alternative splicing event that

evolved in any of the postduplication genes but is not

present in the ancestral form, can also occur. Vertebrate

Pax genes appear to have undergone both subfunctional-

ization and neofunctionalization of alternatively spliced

forms. An example of the former is a splicing event that

skips exon 2 (Fig. 3), which is conserved between Amph-

iPax2/5/8 and human, mouse, frog, and zebrafish Pax5

(Borson et al. 2002; Heller and Brandli 1999; Kwak et al.

2006; Zwollo et al. 1997), but which has apparently been

lost from vertebrate Pax2 and Pax8; there are no published

reports of this splice form for Pax2 and Pax8, and we could

find no evidence in mammalian EST sequences (data not

shown). As noted above, this event removes most of the

PAI subdomain of the paired domain. If the accepted ATG

is used as the start codon, isoforms lacking exon 2 would

have a premature stop codon within the sequence normally

encoding exon 3, and would be predicted to produce a

truncated and out-of-frame protein, containing no part of

any of the functional domains and, therefore, would likely

be nonfunctional. However, it has been shown that trans-

lation of transcripts can initiate from a downstream ATG

within exon 3 (see Fig. 3) that is conserved in AmphiPax2/

5/8 and vertebrate Pax5, resulting in isoforms (e.g., Pax5b

and 5e) that lack most of the RED domain as well as the

PAI domain but are in-frame (Lowen et al. 2001; Zwollo

et al. 1997). Although such isoforms would bind paired

domain binding sites poorly, if at all (Zwollo et al. 1997),

there is evidence that they function to increase the trans-

activation activity of other Pax5 isoforms (Lowen et al.

2001). Theoretically, transcripts lacking exon 2 could use

either ATG as a start codon. If the downstream ATG were

used, the resulting protein would be the same as that

translated from the downstream ATG of transcripts

including exon 2. However, in exon 2(-) forms, transcripts

from the upstream ATG would encode an out-of-frame and

extremely truncated protein. Therefore, the increased

skipping of exon 2 would likely skew the relative propor-

tion of functional isoforms toward those initiating from the

second ATG. The conservation of isoforms lacking the

same region of the paired domain in amphioxus Pax2/5/8

and human Pax5 suggests not only that these forms are

functional, but also that they have important roles in early

development.
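
The argument above about the two possible start codons can be made concrete with a small Python sketch. The sequences are invented for illustration (they are not AmphiPax2/5/8 or Pax5 sequence), and the position of the downstream ATG within the third toy exon is an assumption of the example.

```python
# Toy illustration only: hypothetical exons, not real Pax sequence.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orf(transcript, start):
    """Return the coding sequence read from `start` up to, but not
    including, the first in-frame stop codon (or the last full codon)."""
    codons = []
    for i in range(start, len(transcript) - 2, 3):
        codon = transcript[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return "".join(codons)

exon1 = "ATGGCT"            # carries the upstream ATG at position 0
exon2 = "GGTCA"             # 5 nt, so skipping it shifts the frame
exon3 = "TGACATGGAACGTTAA"  # carries the downstream ATG at offset 4

DOWNSTREAM_ATG_OFFSET = 4   # assumed position of the ATG inside exon3
assert exon3[DOWNSTREAM_ATG_OFFSET:DOWNSTREAM_ATG_OFFSET + 3] == "ATG"

with_exon2 = exon1 + exon2 + exon3
without_exon2 = exon1 + exon3

# Translation from the upstream ATG.
up_plus = orf(with_exon2, 0)
up_minus = orf(without_exon2, 0)

# Translation from the downstream ATG in exon3.
down_plus = orf(with_exon2, len(exon1) + len(exon2) + DOWNSTREAM_ATG_OFFSET)
down_minus = orf(without_exon2, len(exon1) + DOWNSTREAM_ATG_OFFSET)

print("upstream ATG, exon 2(+):  ", len(up_plus) // 3, "codons")
print("upstream ATG, exon 2(-):  ", len(up_minus) // 3, "codons")
print("downstream ATG identical: ", down_plus == down_minus)
```

In the toy transcript lacking the middle exon, reading from the upstream ATG terminates after two codons, whereas reading from the downstream ATG gives the same product whether or not the middle exon is present. This mirrors the reasoning that increased exon skipping would shift the balance toward products initiated at the second ATG.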

We found comparable alternative splicing in Amphi-

Pax4/6, where skipping of exon 4 also removes most of the

PAI subdomain of the paired domain, altering the reading

frame and resulting in a premature stop codon. Although an

identical isoform has not been reported in vertebrate Pax4

or Pax6, events removing the paired domain have been

reported (Gorlov and Saunders 2002). In addition, the use of

downstream promoters and start codons resulting in iso-

forms lacking the paired domain (often termed paired-less)

occurs in the Pax6 genes of C. elegans (Zhang and Emmons

1995) and several vertebrates (Bandah et al. 2007; Carriere

et al. 1993; Jaworski et al. 1997). Similarly, products of

the two Drosophila paralogues, eyg and toe, lack the PAI

subdomain and bind only via their RED and homeodomains

(Jun et al. 1998). Possible functions are suggested by a study

demonstrating that a paired-less form of Pax6, which

interacts with the full-length Pax6 via the homeodomain,

confers increased transactivation from paired domain

binding sites (Bruun et al. 2005; Mikkola et al. 2001).

Further potential functions are suggested by cooperative

interactions that occur between the paired and homeodo-

mains (Jun and Desplan 1996). For example, Pax6 isoforms

with altered paired domains affect transactivation mediated

by a reporter construct containing homeodomain binding

sites (Mishra et al. 2002). The use of a highly conserved

methionine within amphioxus exon 6 may result in

amphioxus paired-less forms, and as suggested for the

AmphiPax2/5/8 exon 2(-) form, this event may regulate

relative proportions of paired and paired-less forms.

Comparison of our findings with those reported for

vertebrates provides evidence for neofunctionalization of

Pax splice variants subsequent to gene duplication in ver-

tebrates. For example, in addition to the exon 2(-) form of

human Pax5, there are at least six events that skip multiple

exons within the N-terminal encoding region (e.g., an exon

2,3,4,5[-] form) (Borson et al. 2002). However, we found

no evidence for any of these forms in AmphiPax2/5/8 at the

stages we analyzed. Equivalent variants have not been

reported in vertebrate Pax2 and Pax8, although in the

absence of a systematic survey of isoforms, the possibility

remains that such isoforms might exist. However, we

cannot rule out the possibility that these six splice variants

could predate the amphioxus-vertebrate divergence but

have been lost in the amphioxus lineage. Even so, since the

percentage of genes and exons undergoing alternative

splicing appears higher in vertebrates compared to inver-

tebrates (Kim et al. 2007), the simplest explanation is that

these isoforms represent neofunctionalization of Pax5

within the vertebrate lineage.

Another example of likely neofunctionalization is the

alternative splicing of a functionally important 42-base pair

insertion (exon 5a) (Epstein et al. 1994; Kozmik et al. 1997)

in all vertebrate Pax6 genes investigated to date, including

those of fish (Puschel et al. 1992). The absence of exon 5a in

the Pax4/6 genes of both amphioxus (Glardon et al. 1998)

(Fig. 6A) and sea urchin (Czerny and Busslinger 1995)

suggests that it evolved within the vertebrate lineage.

Several alternative splice forms involving the N-termi-

nal coding region of amphioxus Pax transcripts appear to

have no clear counterparts in vertebrates, for example, the

alternative splicing of exons 3, 4, and 4b in AmphiPax2/5/8

and all events in AmphiPax4/6 (Figs. 3 and 7). These

splicing events may represent examples of neofunctional-

ization within the amphioxus lineage. However, the

possibility remains that comparable splice forms exist but

have not yet been detected in orthologues of vertebrates

and/or other invertebrates such as sea urchin. Although

many splice forms of Pax genes, especially in vertebrates,

have been described (e.g., Bandah et al. 2007; Borson et al.

2002), more comprehensive analyses are clearly needed.

We found no alternative splicing in the N-terminal half

of AmphiPax1/9 (Fig. 2) and none has been reported in the

comparable region of vertebrate Pax1 or Pax9. The alter-

native splicing events found in AmphiPax3/7 (Fig. 5)

predominantly influence the C-terminal and so are dis-

cussed below. However, it is worth noting that alternative

splicing in vertebrates results in insertion of a functionally

important glutamine into the paired domain of both ver-

tebrate Pax3 and Pax7 and of a glycine-leucine dipeptide

into the Pax7 paired domain (Lamey et al. 2004). These

events can happen, as the paired domain of vertebrate Pax3

and Pax7 is split over three exons (Fig. 1C); however,

since the paired domain of AmphiPax3/7 is encoded by a

single exon, the same alternative splicing is not possible.

These events could represent neofunctionalization within

the vertebrates or loss within the amphioxus lineage following

the amphioxus-vertebrate divergence.

Alternative Splicing in the C-terminal Encoding Region

is Widespread in the Transcripts of Pax Genes

Alternative splicing affecting the C-terminal transactivation

domains occurs in all classes of vertebrate and amphioxus

Pax genes. However, the evolutionary conservation of

splicing events affecting this region is more difficult to

ascertain because the sequence and intron/exon organization

downstream of the homeodomains are not as well conserved,

and many events appear to be lineage specific. Even so, some

of these isoforms, such as Pax2/5/8 C-terminal II (Fig. 4C),

human Pax8e (Poleev et al. 1995), and Pax5Δ789 (Robichaud

et al. 2004), do appear to be homologous. All three

isoforms are predicted to lack the entire transactivation

domain but include the region normally encoding the

inhibitory domain, albeit in an altered reading frame. Iso-

forms that lack, or have dramatically altered, transactivation

domains (e.g., AmphiPax2/5/8 b, which lacks the transacti-

vation domain due to inclusion of exon 7) may act as

competitive inhibitors of other isoforms (Kreslova et al.

2002). The removal of 19 bp at the 5′ end of exon 11 creates

the PHT reading frame (Vorobyov and Horst 2006) in

AmphiPax2/5/8 but also acts to remove a SSYPYYS

sequence (C-terminal I, II, and VI). This event appears

highly conserved, as a 19-bp deletion is also found in the final

exon of frog and human Pax2 (Heller and Brandli 1997;

Tavassoli et al. 1997) and acts to remove the homologous

SSPYYYS sequence in both. In addition, the insertion of the

serine/threonine-rich exon 7a (Fig. 4, C-terminal VII)

appears homologous to the alternatively spliced, serine/

threonine-rich, exon 8 of human Pax8 (Kozmik et al. 1993).

Another possibly conserved splicing event is the skipping of

exon 3 in Pax3/7 genes. The exon 3(-) form of AmphiPax3/7

(Fig. 5B) would remove 14 amino acids from the C-terminal

of the homeodomain and cause a frame shift and premature

stop codon affecting the presumed transactivation domain. A

homologous splice form occurs in mouse Pax3, termed

Pax3f (Barber et al. 1999). As mentioned above, the reten-

tion of introns in AmphiPax3/7 is homologous to several

events in vertebrates and would truncate the transactivation

domain to varying extents (Barber et al. 1999; Seo et al.

1998; Vorobyov and Horst 2004). The apparent conservation

of isoforms over such a wide phylogenetic distance suggests

they share a function common to all chordates.

Transcripts of AmphiPax4/6 also undergo alternative

splicing in the C-terminal encoding regions. However, the

use of alternative splice sites in exon 10 or 13, as in

AmphiPax4/6, has not to date been described in vertebrate

Pax6. Isoforms of mammalian Pax4 with altered transac-

tivation domains have been isolated (Miyamoto et al. 2001;

Tokuyama et al. 1998), but comparison of the exons

involved suggests that they do not represent conserved

events. Use of the downstream splice acceptor within exon

13 of AmphiPax4/6 results in the inclusion of exon 13b.

This includes a conserved serine residue (Fig. 6B), phos-

phorylation of which by mitogen-activated protein kinase

(MAPK) in vertebrate Pax6 alters the transactivation abil-

ity (Mikkola et al. 1999). Use of the upstream splice

acceptor 5′ of exon 13 leads to the inclusion of exon 13a,

resulting in a premature stop codon and a truncated protein

lacking the conserved serine. In human Pax6 there are

numerous missense mutations that alter the transactivation

domain. Patients with such mutations typically suffer from

aniridia due to haploinsufficiency (Hanson et al. 1993;

Mikkola et al. 1999; Singh et al. 2001). One such muta-

tion, of a conserved residue in exon 13 of human Pax6,

which is removed by alternative splicing in AmphiPax4/6,

alters the binding affinity of the homeodomain (Singh et al.

2001). Whether the ability of the C-terminal to influence

the DNA binding domains is a general property of Pax

proteins is still uncertain, but it is supported by changed

DNA binding properties of human Pax8 isoforms with

altered C-terminal regions (Poleev et al. 1995). However,

such a vast array of often quite divergent alternative

splicing events in these regions would allow for lineage

specific repertoires of Pax proteins, each possessing a range

of DNA binding specificities. More complete investigations

of 3′ splicing in vertebrate Pax genes are needed.

Isoforms with Premature Termination Codons (PTCs)

As discussed above, we found several Pax2/5/8, 3/7, and 4/

6 alternative splicing events that would introduce a PTC

and, in some cases, would appear to encode a nonfunc-

tional protein unless translated from a downstream start

codon (Figs. 3 and 6A). Nonsense-mediated decay (NMD)

is a eukaryotic mRNA surveillance pathway ensuring

degradation of PTC-containing transcripts (Conti and Iza-

urralde 2005). A link has been suggested between NMD

and mRNA splicing in mammalian cells, such that the

introduction of a PTC via an alternative splicing event

provides a mechanism to regulate protein levels (Lejeune

and Maquat 2005). The extent of this link is still unclear, as

the majority of PTC-containing transcripts are present at

uniformly low levels, apparently independent of NMD

(Pan et al. 2006). However, this mechanism may be highly

conserved and it is possible that the PTC-containing tran-

scripts we found could be part of a mechanism regulating

the level of functional Pax proteins. Alternatively, PTC-

containing transcripts may be the result of splicing errors.

A comparison of human and mouse ESTs has suggested

that a certain amount of all splicing is aberrant, resulting in

truncated nonfunctional proteins (Sorek et al. 2004). Even

so, we doubt that the alternative splicing events we found

in amphioxus Pax genes, although evidently present only at

low levels (i.e., isolated via nested PCR), represent random

mistakes in splicing. As noted above, some of these rare

splice forms are conserved between amphioxus and verte-

brates, suggesting that they are functional. For example, the

conservation of exon 2-skipping in AmphiPax2/5/8 and

vertebrate Pax5 suggests that, although this event can only

be isolated using nested PCR, it generates functional pro-

teins. Also, apart from the isoforms conserved across the

chordate phylum, there are many PTC-containing Pax

transcripts conserved, to varying extents, within the ver-

tebrate subphylum. Indeed, since alternatively spliced

exons, as well as retained introns, both of which alter the

reading frame and thereby introduce PTCs, are common in

the vertebrate Pax genes (e.g., Barber et al. 1999; Kozmik

et al. 1993; Zwollo et al. 1997), their appearance in

amphioxus is not surprising. Additionally, given the sen-

sitivity offered by two rounds of PCR, the number of

independent primer sets used for the complete screen of

each Pax gene transcript (see supplementary materials) and

the assumption of no bias in the primer efficiencies for any

single Pax gene, if we were isolating only low-level

aberrant splicing, we might expect the number of alterna-

tively spliced transcripts to be similar for each of the four

amphioxus Pax genes. Instead, the numbers are dissimilar,

with much lower numbers for Pax1/9 than for Pax2/5/8.

One explanation for the low level of expression of some of

these isoforms may be that they occur in only a small

population of cells. For example, at the neurula stage,

AmphiPax2/5/8 is expressed in the few pigment cells of the

frontal eye, a slightly larger number of cells in the devel-

oping kidney, and more in the central nervous system and

the developing gill slits (Kozmik et al. 1999). Thus,

although we cannot rule out aberrant splicing, a high

degree of conserved gene-specific aberrant splicing within

the Pax family, presumably due to the conservation of

alternative splice sites for reasons other than the production

of altered Pax proteins, would be a phenomenon worthy of

further study.

Evolution of the Pax Family and Alternative Splicing

Although there is some uncertainty regarding the duplica-

tion history of the Pax genes, it seems likely that the

duplication of a single Proto-Pax gene in the urmetazoan

ancestor prior to the divergence of the cnidarian and bilaterian

lineages gave rise to the two precursors of Pax1/9/

3/7 and Pax2/5/8/4/6 lineages and that further duplications

resulted in all four classes of Pax genes in amphioxus,

plus another termed Pox-neuro, that was lost in chordates

(Balczarek et al. 1997; Hoshiyama et al. 2007; Matus et al.

2007; Vorobyov and Horst 2006). Within the lineage

leading to vertebrates, it is thought that further whole-

genome duplications followed by gene loss have resulted in

the nine Pax genes in most vertebrates (Holland et al.

2004; Holland 2003; Putnam et al. 2007, 2008). Our results

suggest that, in addition to the duplicates, the number of

alternative splicing events per Pax gene appears to be at

least equivalent in amphioxus and vertebrates and, in some

cases, greater in the latter. The numbers of alternative

splicing events with implications for the common ancestor

genes are summarized in Fig. 7. It should be noted that in

some cases the equivalent exon undergoes alternative

splicing in all or some of the vertebrate paralogues, sug-

gesting that the event occurred in the ancestor gene and

was maintained following a duplication event. However,

for this comparison these events are considered separate

because, in all cases, the amino acid sequence of the exon

has diverged and, therefore, no longer creates an identical

isoform. Both vertebrate Pax9 (Nornes et al. 1996) and

AmphiPax1/9 have two known isoforms (Fig. 7A). The

presence of more isoforms in the former is suggested by

analyses of human and mouse ESTs (de la Grange et al.

2005; Stamm et al. 2006; Thanaraj et al. 2004). In addi-

tion, multiple isoforms of Pax1/9 have been found in the

tunicate Halocynthia roretzi (Ogasawara et al. 1999),

suggesting independent expansion of Pax1/9 splice-forms

in this fast-evolving group. Similarly, the levels of alter-

native splicing in AmphiPax2/5/8 appear to be comparable

to those reported in human and mouse Pax2, 5, and 8,

revealing an overall expansion of isoforms available to

vertebrates (Fig. 7C) (e.g., Borson et al. 2002; Heller and

Brandli 1997, 1999; Kozmik et al. 1993; Mackereth et al.

2005; Pellizzari et al. 2006; Poleev et al. 1995; Robichaud

et al. 2004; Sekine et al. 2007; Tavassoli et al. 1997; Ward

et al. 1994; Zwollo et al. 1997). For Pax4 and 6 the amount

of alternative splicing in vertebrates is broadly equivalent

Fig. 7 Alternative splicing of amphioxus and vertebrate Pax genes

with implications for the common ancestor genes assuming no large-

scale loss of alternative splicing (see Discussion for references). (A)

The single and probably nonconserved alternative splicing event

found in amphioxus and vertebrate Pax9 suggests that little or no

alternative splicing was present in the common ancestor. (B) The

number of alternative splicing events appears to have undergone a

moderate expansion in the vertebrate Pax3 and 7. However, the

events in amphioxus do have counterparts in the vertebrate genes,

suggesting a common ancestor with multiple events. †One event

included in this number may not be evident using our survey method.

(C) The alternative splicing of amphioxus Pax2/5/8 is comparable to

each of the vertebrate genes. Some of these events appear to be

conserved, suggesting a common ancestor gene containing multiple

alternative splicing events, with many other events being particular to

the amphioxus and vertebrate lineages. (D) The number of events in

amphioxus is at least comparable to that found in the vertebrate genes.

No event appears clearly conserved between vertebrates and amphi-

oxus, suggesting that they have arisen independently following the

divergence of the two lineages. The status of the ancestor gene is

therefore completely unknown. *Two of the events included in this

number involve the use of alternative promoters. Such events would

not be isolated using the techniques used; however, exclusion of these

events does not alter the overall conclusion

to, or greater than, that in AmphiPax4/6, although all events

appear to be lineage specific (Fig. 7D) (Bandah et al. 2007;

Carriere et al. 1993; Epstein et al. 1994; Gorlov and

Saunders 2002; Inoue et al. 1998; Mishra et al. 2002;

Miyamoto et al. 2001; Tao et al. 1998; Tokuyama et al.

1998). For Pax3/7, with the exceptions described above

(Fig. 5), we see no evidence in amphioxus for many of the

isoforms previously described in vertebrates (Barr et al.

1999; Lamey et al. 2004; Parker et al. 2004; Tsukamoto

et al. 1994; Vorobyov and Horst 2004) and conclude that

the repertoire of splice variants has probably expanded in

the vertebrate lineage (Fig. 7B).

The method we employed to isolate amphioxus Pax

isoforms, which uses multiple rounds of PCR flanking

single exons, as well as across the entire transcript (Gorlov

and Saunders 2002), is probably more sensitive than that

used in any previous survey of splicing in vertebrate Pax

genes. Moreover, because we used whole embryos and

adults, our survey of tissue types is all-inclusive. Conse-

quently, in the absence of equally comprehensive studies of

vertebrate Pax splice forms, it seems likely that more

isoforms of vertebrate Pax genes remain to be discovered.

However, just on the basis of previously reported alterna-

tive splicing, it seems that the total number of alternatively

spliced Pax isoforms for the nine vertebrate Pax genes is

considerably higher than for the four amphioxus ones. This

conclusion is consistent with the recent finding that, in

general, the percentage of genes and exons undergoing

alternative splicing is higher in vertebrates compared to

invertebrates (Kim et al. 2007).

It has been demonstrated that, in general, gene dupli-

cation and alternative splicing have an inverse relationship

(Kopelman et al. 2005; Su et al. 2006), suggesting that

alternative splicing and gene duplication are interchange-

able mechanisms of proteome diversification. However,

this does not hold for amphioxus and vertebrate Pax genes.

The number of alternatively spliced isoforms per Pax gene

appears to be at least equivalent in amphioxus and verte-

brates and, in some cases, greater in the latter (Fig. 7).

Although our results contradict the finding of an inverse

relationship, the duplication of the Pax genes at the base of

the vertebrate lineage is thought to be quite ancient, per-

haps 520–650 million years ago (Panopoulou et al. 2003;

Robinson-Rechavi et al. 2004; Shu et al. 1999), while the

inverse correlation is much more pronounced for recent

duplicates (less than ~80–90 million years ago) (Kopelman

et al. 2005; Su et al. 2006). It is possible that this

period of time has given an opportunity for the evolution of

a large amount of neofunctional alternative splicing, fol-

lowing what may have been initial rounds of

subfunctionalization subsequent to the duplication events, a

pattern that may be more common in anciently duplicated

gene families.

A comparison of amphioxus and vertebrate splicing

events that impact domains of known function suggests

that the difference between splice variants is considerably

more dramatic between Pax isoforms than between the

vertebrate duplicates, in which all the functional domains

have remained intact (Glardon et al. 1998; Holland et al.

1999, 1995; Kozmik et al. 1999) (Fig. 1A–D). This is

consistent with a study demonstrating that gene duplication

and alternative splicing are not interchangeable mecha-

nisms of proteome diversification (Talavera et al. 2007).

This same study also suggested that the inverse correlation

between gene duplication and alternative splicing might be

due to the negative selection of alternatively spliced

duplicates because of the necessity for a multiple, simul-

taneous dosage balance of regulating factors. The

discovery of alternative splicing events in AmphiPax2/5/8

and 3/7 that are apparently conserved with vertebrates

suggests that there were considerable levels of alternative

splicing in the common ancestor and offers two examples

of alternatively spliced genes being duplicated and

maintained.

Possible Role of Expanded Pax Alternative Splicing

in Vertebrates

Given the apparent expansion of alternative splicing within

the vertebrate Pax lineage it is interesting to consider the

possible roles of these isoforms. As described above,

functional studies demonstrate that the Pax isoforms have

altered DNA binding and transactivation capacities, sug-

gesting that they may bind different gene promoters and/or

cause different levels of transcription from the same pro-

moter. Microarray analysis supports this idea by showing

that different isoforms of Pax3 regulate distinct but over-

lapping sets of genes (Wang et al. 2007). It could be that

vertebrate Pax genes can influence a far wider range of

genes in a much more subtle manner than can the amphi-

oxus Pax genes, with their more limited repertoire of splice

variants. The developmental role of these additional splice

forms in vertebrates is incompletely understood. However,

the additional isoforms could have played a part in the

acquisition of new roles for Pax3-expressing cells at the

edges of the neural plate in connection with the evolution

of neural crest. Pax3 is required for the normal migration

and differentiation of the neural crest (Robson et al. 2006),

which evolved after the split between amphioxus and tu-

nicates plus vertebrates (Shimeld and Holland 2000).

Interestingly, the transfection of Pax3 splice variants that

are not conserved with amphioxus into melanocytes, which

derive from neural crest, has isoform-specific effects on

cell growth, migration, proliferation, and apoptosis (Wang

et al. 2006). Possible insight into the importance of lineage-

specific alternative splicing events in vertebrate Pax4

and Pax6 is provided by the alternative splicing of exon 5a

in vertebrate Pax6. It has been shown that this event plays a

distinct role in postnatal iris formation and is important for

the structural integrity of the cornea, lens, and retina (Singh

et al. 2002). Our study (Fig. 6A), along with previous

investigations (Czerny and Busslinger 1995; Glardon et al.

1998), suggests that this event does not occur in inverte-

brates, which is entirely consistent with a role in the

development of advanced features of the vertebrate eye.

The developmental roles of the expanded alternative

splicing seen in vertebrate Pax2, 5, and 8 are largely

unknown. However, the alternative splicing of exon 8 in

human Pax5, an event that does not occur in the nearest

equivalent exon of amphioxus, has been implicated in the

altered regulation of genes in human lymphocytic leukemia

B cells (Oppezzo et al. 2005). The Pax2, 5, and 8 genes are

involved in several developmental processes that have

become highly elaborated within the vertebrate lineage

(Chi and Epstein 2002), and further investigation into the

functions of specific isoforms is clearly in order.

In summary, our comparative study of alternative

splicing in amphioxus and vertebrate Pax genes has shown

that, for this gene family, there is not an inverse relation-

ship between alternative splicing and gene duplication. We

find that many events appear to be lineage specific but also

find conservation of splice forms that dramatically impact

functional motifs. Such evolutionary conservation suggests

that these isoforms are not simply a by-product of aberrant

splicing and points to the necessity of future experiments to

test their function.

Acknowledgments We would like to thank John Lawrence for

his hospitality at the University of South Florida. We also thank

Zbynek Kozmik, Christine Beardsley, and Colin Sharpe for helpful

criticism and comments on the manuscript. This work was sup-

ported by Grant MCB06-20019 from the National Science

Foundation to L.Z.H.

References

Anspach J, Poulsen G, Kaattari I, Pollock R, Zwollo P (2001)

Reduction in DNA binding activity of the transcription factor

Pax-5a in B lymphocytes of aged mice. J Immunol 166:2617–

2626

Azuma N, Tadokoro K, Asaka A, Yamada M, Yamaguchi Y, Handa

H, Matsushima S, Watanabe T, Kohsaka S, Kida Y, Shiraishi T,

Ogura T, Shimamura K, Nakafuku M (2005) The Pax6 isoform

bearing an alternative spliced exon promotes the development of

the neural retinal structure. Hum Mol Genet 14:735–745

Balczarek KA, Lai ZC, Kumar S (1997) Evolution of functional

diversification of the paired box (Pax) DNA-binding domains.

Mol Biol Evol 14:829–842

Bandah D, Swissa T, Ben-Shlomo G, Banin E, Ofri R, Sharon D

(2007) A complex expression pattern of Pax6 in the pigeon

retina. Invest Ophthalmol Vis Sci 48:2503–2509

Barber TD, Barber MC, Cloutier TE, Friedman TB (1999) PAX3 gene

structure, alternative splicing and evolution. Gene 237:311–319

Barr FG, Fitzgerald JC, Ginsberg JP, Vanella ML, Davis RJ,

Bennicelli JL (1999) Predominant expression of alternative

PAX3 and PAX7 forms in myogenic and neural tumor cell lines.

Cancer Res 59:5443–5448

Blair JE, Hedges SB (2005) Molecular phylogeny and divergence

times of deuterostome animals. Mol Biol Evol 22:2275–2284

Blencowe BJ (2006) Alternative splicing: new insights from global

analyses. Cell 126:37–47

Borson ND, Lacy MQ, Wettstein PJ (2002) Altered mRNA expres-

sion of Pax5 and Blimp-1 in B cells in multiple myeloma. Blood

100:4629–4639

Bourlat SJ, Juliusdottir T, Lowe CJ, Freeman R, Aronowicz J,

Kirschner M, Lander ES, Thorndyke M, Nakano H, Kohn AB,

Heyland A, Moroz LL, Copley RR, Telford MJ (2006)

Deuterostome phylogeny reveals monophyletic chordates and

the new phylum Xenoturbellida. Nature 444:85–88

Bruun JA, Thomassen EI, Kristiansen K, Tylden G, Holm T, Mikkola

I, Bjorkoy G, Johansen T (2005) The third helix of the

homeodomain of paired class homeodomain proteins acts as a

recognition helix both for DNA and protein interactions. Nucleic

Acids Res 33:2661–2675

Carriere C, Plaza S, Martin P, Quatannens B, Bailly M, Stehelin D,

Saule S (1993) Characterization of quail Pax-6 (Pax-QNR)

proteins expressed in the neuroretina. Mol Cell Biol 13:7257–7266

Chalepakis G, Jones FS, Edelman GM, Gruss P (1994) Pax-3 contains

domains for transcription activation and transcription inhibition.

Proc Natl Acad Sci USA 91:12745–12749

Chi N, Epstein JA (2002) Getting your Pax straight: Pax proteins in

development and disease. Trends Genet 18:41–47

Conti E, Izaurralde E (2005) Nonsense-mediated mRNA decay:

molecular insights and mechanistic variations across species.

Curr Opin Cell Biol 17:316–325

Czerny T, Busslinger M (1995) DNA-binding and transactivation

properties of Pax-6: three amino acids in the paired domain are

responsible for the different sequence recognition of Pax-6 and

BSAP (Pax-5). Mol Cell Biol 15:2858–2871

Czerny T, Schaffner G, Busslinger M (1993) DNA sequence

recognition by Pax proteins: bipartite structure of the paired

domain and its binding site. Genes Dev 7:2048–2061

de la Grange P, Dutertre M, Martin N, Auboeuf D (2005) FAST DB: a

website resource for the study of the expression regulation of

human gene products. Nucleic Acids Res 33:4276–4284

Dorfler P, Busslinger M (1996) C-terminal activating and inhibitory

domains determine the transactivation potential of BSAP (Pax-

5), Pax-2 and Pax-8. EMBO J 15:1971–1982

Eberhard D, Jimenez G, Heavey B, Busslinger M (2000) Transcrip-

tional repression by Pax5 (BSAP) through interaction with

corepressors of the Groucho family. EMBO J 19:2292–2303

Epstein J, Cai J, Glaser T, Jepeal L, Maas R (1994) Identification of a

Pax paired domain recognition sequence and evidence for DNA-

dependent conformational changes. J Biol Chem 269:8355–8361

Fujitani Y, Kajimoto Y, Yasuda T, Matsuoka TA, Kaneto H,

Umayahara Y, Fujita N, Watada H, Miyazaki JI, Yamasaki Y,

Hori M (1999) Identification of a portable repression domain and

an E1A-responsive activation domain in Pax4: a possible role of

Pax4 as a transcriptional repressor in the pancreas. Mol Cell Biol

19:8281–8291

Glardon S, Holland LZ, Gehring WJ, Holland ND (1998) Isolation

and developmental expression of the amphioxus Pax-6 gene

(AmphiPax-6): insights into eye and photoreceptor evolution.

Development 125:2701–2710

Gorlov IP, Saunders GF (2002) A method for isolating alternatively

spliced isoforms: isolation of murine Pax6 isoforms. Anal

Biochem 308:401–404

Graveley BR (2001) Alternative splicing: increasing diversity in the

proteomic world. Trends Genet 17:100–107

Hanson IM, Seawright A, Hardman K, Hodgson S, Zaletayev D,

Fekete G, van Heyningen V (1993) PAX6 mutations in aniridia.

Hum Mol Genet 2:915–920

Heller N, Brandli AW (1997) Xenopus Pax-2 displays multiple splice

forms during embryogenesis and pronephric kidney develop-

ment. Mech Dev 69:83–104

Heller N, Brandli AW (1999) Xenopus Pax-2/5/8 orthologues: novel

insights into Pax gene evolution and identification of Pax-8 as

the earliest marker for otic and pronephric cell lineages. Dev

Genet 24:208–219

Hetzer-Egger C, Schorpp M, Boehm T (2000) Evolutionary conser-

vation of gene structures of the Pax1/9 gene family. Biochim

Biophys Acta 1492:517–521

Holland PW (2003) More genes in vertebrates? J Struct Funct

Genomics 3:75–84

Holland LZ, Yu JK (2004) Cephalochordate (amphioxus) embryos:

procurement, culture, and basic methods. Methods Cell Biol

74:195–215

Holland ND, Holland LZ, Kozmik Z (1995) An amphioxus Pax gene,

AmphiPax-1, expressed in embryonic endoderm, but not in

mesoderm: implications for the evolution of class I paired box

genes. Mol Mar Biol Biotechnol 4:206–214

Holland LZ, Schubert M, Kozmik Z, Holland ND (1999) AmphiPax3/7,

an amphioxus paired box gene: insights into chordate

myogenesis, neurogenesis, and the possible evolutionary precur-

sor of definitive vertebrate neural crest. Evol Dev 1:153–165

Holland LZ, Laudet V, Schubert M (2004) The chordate amphioxus:

an emerging model organism for developmental biology. Cell

Mol Life Sci 61:2290–2308

Hoshiyama D, Iwabe N, Miyata T (2007) Evolution of the gene

families forming the Pax/Six regulatory network: isolation of

genes from primitive animals and molecular phylogenetic

analyses. FEBS Lett 581:1639–1643

Inoue H, Nomiyama J, Nakai K, Matsutani A, Tanizawa Y, Oka Y

(1998) Isolation of full-length cDNA of mouse PAX4 gene and

identification of its human homologue. Biochem Biophys Res

Commun 243:628–633

Jaworski C, Sperbeck S, Graham C, Wistow G (1997) Alternative

splicing of Pax6 in bovine eye and evolutionary conservation of

intron sequences. Biochem Biophys Res Commun 240:196–202

Jun S, Desplan C (1996) Cooperative interactions between paired

domain and homeodomain. Development 122:2639–2650

Jun S, Wallen RV, Goriely A, Kalionis B, Desplan C (1998) Lune/eye

gone, a Pax-like protein, uses a partial paired domain and a

homeodomain for DNA recognition. Proc Natl Acad Sci USA

95:13720–13725

Kalousova A, Benes V, Paces J, Paces V, Kozmik Z (1999) DNA

binding and transactivating properties of the paired and homeo-

box protein Pax4. Biochem Biophys Res Commun 259:510–518

Kim E, Magen A, Ast G (2007) Different levels of alternative splicing

among eukaryotes. Nucleic Acids Res 35:125–131

Kopelman NM, Lancet D, Yanai I (2005) Alternative splicing and

gene duplication are inversely correlated evolutionary mecha-

nisms. Nat Genet 37:588–589

Kozmik Z, Kurzbauer R, Dorfler P, Busslinger M (1993) Alternative

splicing of Pax-8 gene transcripts is developmentally regulated

and generates isoforms with different transactivation properties.

Mol Cell Biol 13:6024–6035

Kozmik Z, Czerny T, Busslinger M (1997) Alternatively spliced

insertions in the paired domain restrict the DNA sequence

specificity of Pax6 and Pax8. EMBO J 16:6793–6803

Kozmik Z, Holland ND, Kalousova A, Paces J, Schubert M, Holland

LZ (1999) Characterization of an amphioxus paired box gene,

AmphiPax2/5/8: developmental expression patterns in optic

support cells, nephridium, thyroid-like structures and pharyngeal

gill slits, but not in the midbrain-hindbrain boundary region.

Development 126:1295–1304

Kreslova J, Holland LZ, Schubert M, Burgtorf C, Benes V, Kozmik Z

(2002) Functional equivalency of amphioxus and vertebrate

Pax258 transcription factors suggests that the activation of mid-

hindbrain specific genes in vertebrates occurs via the recruitment

of Pax regulatory elements. Gene 282:143–150

Kwak SJ, Vemaraju S, Moorman SJ, Zeddies D, Popper AN, Riley

BB (2006) Zebrafish pax5 regulates development of the utricular

macula and vestibular function. Dev Dyn 235:3026–3038

Lamey TM, Koenders A, Ziman M (2004) Pax genes in myogenesis:

alternate transcripts add complexity. Histol Histopathol

19:1289–1300

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J,

Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D,

Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine

R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C,

Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan

A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian

A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley

D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas

P, Dunham A, Dunham I, Durbin R, French L, Grafham D,

Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C,

McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC,

Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston

RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis

ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL,

Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer

JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW,

Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S,

Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C,

Uberbacher E, Frazier M et al (2001) Initial sequencing and

analysis of the human genome. Nature 409:860–921

Lejeune F, Maquat LE (2005) Mechanistic links between nonsense-

mediated mRNA decay and pre-mRNA splicing in mammalian

cells. Curr Opin Cell Biol 17:309–315

Lowen M, Scott G, Zwollo P (2001) Functional analyses of two

alternative isoforms of the transcription factor Pax-5. J Biol

Chem 276:42565–42574

Mackereth MD, Kwak SJ, Fritz A, Riley BB (2005) Zebrafish pax8 is

required for otic placode induction and plays a redundant role

with Pax2 genes in the maintenance of the otic placode.

Development 132:371–382

MacLean DW, Meedel TH, Hastings KE (1997) Tissue-specific

alternative splicing of ascidian troponin I isoforms. Redesign of

a protein isoform-generating mechanism during chordate evolu-

tion. J Biol Chem 272:32115–32120

Matus DQ, Pang K, Daly M, Martindale MQ (2007) Expression of

Pax gene family members in the anthozoan cnidarian, Nematostella

vectensis. Evol Dev 9:25–38

Mikkola I, Bruun JA, Bjorkoy G, Holm T, Johansen T (1999)

Phosphorylation of the transactivation domain of Pax6 by

extracellular signal-regulated kinase and p38 mitogen-activated

protein kinase. J Biol Chem 274:15115–15126

Mikkola I, Bruun JA, Holm T, Johansen T (2001) Superactivation of

Pax6-mediated transactivation from paired domain-binding sites

by DNA-independent recruitment of different homeodomain

proteins. J Biol Chem 276:4109–4118

Mishra R, Gorlov IP, Chao LY, Singh S, Saunders GF (2002) PAX6,

paired domain influences sequence recognition by the homeo-

domain. J Biol Chem 277:49488–49494

Miyamoto T, Kakizawa T, Ichikawa K, Nishio S, Kajikawa S,

Hashizume K (2001) Expression of dominant negative form of

PAX4 in human insulinoma. Biochem Biophys Res Commun

282:34–40

Nornes S, Mikkola I, Krauss S, Delghandi M, Perander M, Johansen T

(1996) Zebrafish Pax9 encodes two proteins with distinct C-

terminal transactivating domains of different potency negatively

regulated by adjacent N-terminal sequences. J Biol Chem

271:26914–26923

Ogasawara M, Wada H, Peters H, Satoh N (1999) Developmental

expression of Pax1/9 genes in urochordate and hemichordate

gills: insight into function and evolution of the pharyngeal

epithelium. Development 126:2539–2550

Oppezzo P, Dumas G, Lalanne AI, Payelle-Brogard B, Magnac C,

Pritsch O, Dighiero G, Vuillier F (2005) Different isoforms of

BSAP regulate expression of AID in normal and chronic

lymphocytic leukemia B cells. Blood 105:2495–2503

Pan Q, Saltzman AL, Kim YK, Misquitta C, Shai O, Maquat LE, Frey

BJ, Blencowe BJ (2006) Quantitative microarray profiling

provides evidence against widespread coupling of alternative

splicing with nonsense-mediated mRNA decay to control gene

expression. Genes Dev 20:153–158

Panopoulou G, Hennig S, Groth D, Krause A, Poustka AJ, Herwig R,

Vingron M, Lehrach H (2003) New evidence for genome-wide

duplications at the origin of vertebrates using an amphioxus gene

set and completed animal genomes. Genome Res 13:1056–1066

Parker CJ, Shawcross SG, Li H, Wang QY, Herrington CS, Kumar S,

MacKie RM, Prime W, Rennie IG, Sisley K, Kumar P (2004)

Expression of PAX 3 alternatively spliced transcripts and

identification of two new isoforms in human tumors of neural

crest origin. Int J Cancer 108:314–320

Pellizzari L, Tell G, Damante G (1999) Co-operation between the PAI

and RED subdomains of Pax-8 in the interaction with the

thyroglobulin promoter. Biochem J 337(Pt 2):253–262

Pellizzari L, Puppin C, Mariuzzi L, Saro F, Pandolfi M, Di Lauro R,

Beltrami CA, Damante G (2006) PAX8 expression in human

bladder cancer. Oncol Rep 16:1015–1020

Philippe H, Lartillot N, Brinkmann H (2005) Multigene analyses of

bilaterian animals corroborate the monophyly of Ecdysozoa,

Lophotrochozoa, and Protostomia. Mol Biol Evol 22:1246–1253

Poleev A, Wendler F, Fickenscher H, Zannini MS, Yaginuma K,

Abbott C, Plachov D (1995) Distinct functional properties of

three human paired-box-protein, PAX8, isoforms generated by

alternative splicing in thyroid, kidney and Wilms’ tumors. Eur J

Biochem 228:899–911

Puschel AW, Gruss P, Westerfield M (1992) Sequence and expression

pattern of pax-6 are highly conserved between zebrafish and

mice. Development 114:643–651

Putnam NH, Srivastava M, Hellsten U, Dirks B, Chapman J, Salamov

A, Terry A, Shapiro H, Lindquist E, Kapitonov VV, Jurka J,

Genikhovich G, Grigoriev IV, Lucas SM, Steele RE, Finnerty

JR, Technau U, Martindale MQ, Rokhsar DS (2007) Sea

anemone genome reveals ancestral eumetazoan gene repertoire

and genomic organization. Science 317:86–94

Putnam NH, Butts T, Ferrier DEK, Furlong RF, Hellsten U,

Kawashima T, Robinson-Rechavi M, Shoguchi E, Terry A, Yu

JK, Benito-Gutierrez E, Dubchak I, Garcia-Fernandez J, Grigo-

riev IV, Horton AC, de Jong PJ, Jurka J, Kapitonov V, Kohara Y,

Kuroki Y, Lindquist E, Lucas S, Osoegawa K, Pennacchio LA,

Salamov AA, Satou Y, Sauka-Spengler T, Schmutz J, Shin-I T,

Toyoda A, Gibson-Brown JJ, Bronner-Fraser M, Fujiyama A,

Holland LZ, Holland PWH, Satoh N, Rokhsar DS (2008) The

amphioxus genome and the evolution of the chordate karyotype.

Nature (in press)

Ritz-Laser B, Estreicher A, Gauthier B, Philippe J (2000) The paired

homeodomain transcription factor Pax-2 is expressed in the

endocrine pancreas and transactivates the glucagon gene

promoter. J Biol Chem 275:32708–32715

Robichaud GA, Nardini M, Laflamme M, Cuperlovic-Culf M,

Ouellette RJ (2004) Human Pax-5 C-terminal isoforms possess

distinct transactivation properties and are differentially modu-

lated in normal and malignant B cells. J Biol Chem 279:49956–

49963

Robinson-Rechavi M, Boussau B, Laudet V (2004) Phylogenetic

dating and characterization of gene duplications in vertebrates:

the cartilaginous fish reference. Mol Biol Evol 21:580–586

Robson EJ, He SJ, Eccles MR (2006) A PANorama of PAX genes in

cancer and development. Nat Rev Cancer 6:52–62

Schafer BW, Czerny T, Bernasconi M, Genini M, Busslinger M

(1994) Molecular cloning and characterization of a human PAX-

7 cDNA expressed in normal and neoplastic myocytes. Nucleic

Acids Res 22:4574–4582

Sekine R, Kitamura T, Tsuji T, Tojo A (2007) Identification and

comparative analysis of Pax5 C-terminal isoforms expressed in

human cord blood-derived B cell progenitors. Immunol Lett

111:21–25

Seo HC, Saetre BO, Havik B, Ellingsen S, Fjose A (1998) The

zebrafish Pax3 and Pax7 homologues are highly conserved,

encode multiple isoforms and show dynamic segment-like

expression in the developing brain. Mech Dev 70:49–63

Shimeld SM, Holland PW (2000) Vertebrate innovations. Proc Natl

Acad Sci USA 97:4449–4452

Shu DG, Luo HL, Morris SC, Zhang XL, Hu SX, Chen L, Han J, Zhu

M, Li Y, Chen LZ (1999) Lower Cambrian vertebrates from

South China. Nature 402:42–46

Singh S, Chao LY, Mishra R, Davies J, Saunders GF (2001) Missense

mutation at the C-terminus of PAX6 negatively modulates

homeodomain function. Hum Mol Genet 10:911–918

Singh S, Mishra R, Arango NA, Deng JM, Behringer RR, Saunders

GF (2002) Iris hypoplasia in mice that lack the alternatively

spliced Pax6(5a) isoform. Proc Natl Acad Sci USA 99:6812–

6815

Sorek R, Shamir R, Ast G (2004) How prevalent is functional

alternative splicing in the human genome? Trends Genet 20:68–71

Stamm S, Riethoven JJ, Le Texier V, Gopalakrishnan C, Kumanduri

V, Tang Y, Barbosa-Morais NL, Thanaraj TA (2006) ASD: a

bioinformatics resource on alternative splicing. Nucleic Acids

Res 34:D46–D55

Su Z, Wang J, Yu J, Huang X, Gu X (2006) Evolution of alternative

splicing after gene duplication. Genome Res 16:182–189

Sugnet CW, Kent WJ, Ares M, Jr, Haussler D (2004) Transcriptome

and genome conservation of alternative splicing events in

humans and mice. Pacif Symp Biocomput 9:66–77

Talavera D, Vogel C, Orozco M, Teichmann SA, de la Cruz X (2007)

The (in)dependence of alternative splicing and gene duplication.

PLoS Comput Biol 3:e33

Tang HK, Singh S, Saunders GF (1998) Dissection of the transac-

tivation function of the transcription factor encoded by the eye

developmental gene PAX6. J Biol Chem 273:7210–7221

Tao T, Wasson J, Bernal-Mizrachi E, Behn PS, Chayen S, Duprat L,

Meyer J, Glaser B, Permutt MA (1998) Isolation and character-

ization of the human PAX4 gene. Diabetes 47:1650–1653

Tavassoli K, Ruger W, Horst J (1997) Alternative splicing in PAX2

generates a new reading frame and an extended conserved

coding region at the carboxy terminus. Hum Genet 101:371–375

Thanaraj TA, Stamm S, Clark F, Riethoven JJ, Le Texier V, Muilu J

(2004) ASD: the alternative splicing database. Nucleic Acids

Res 32:D64–D69

Tokuyama Y, Yagui K, Sakurai K, Hashimoto N, Saito Y, Kanatsuka

A (1998) Molecular cloning of rat Pax4: identification of four

isoforms in rat insulinoma cells. Biochem Biophys Res Commun

248:153–156

Tsukamoto K, Nakamura Y, Niikawa N (1994) Isolation of two

isoforms of the PAX3 gene transcripts and their tissue-specific

alternative expression in human adult tissues. Hum Genet

93:270–274

Vorobyov E, Horst J (2004) Expression of two protein isoforms of

PAX7 is controlled by competing cleavage-polyadenylation and

splicing. Gene 342:107–112

Vorobyov E, Horst J (2006) Getting the proto-Pax by the tail. J Mol

Evol 63:153–164

Wang Q, Kumar S, Slevin M, Kumar P (2006) Functional analysis of

alternative isoforms of the transcription factor PAX3 in mela-

nocytes in vitro. Cancer Res 66:8574–8580

Wang Q, Kumar S, Mitsios N, Slevin M, Kumar P (2007)

Investigation of downstream target genes of PAX3c, PAX3e

and PAX3g isoforms in melanocytes by microarray analysis. Int

J Cancer 120:1223–1231

Ward TA, Nebel A, Reeve AE, Eccles MR (1994) Alternative

messenger RNA forms and open reading frames within an

additional conserved region of the human PAX-2 gene. Cell

Growth Differ 5:1015–1021

Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF,

Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P,

Antonarakis SE, Attwood J, Baertsch R, Bailey J, Barlow K,

Beck S, Berry E, Birren B, Bloom T, Bork P, Botcherby M, Bray

N, Brent MR, Brown DG, Brown SD, Bult C, Burton J, Butler J,

Campbell RD, Carninci P, Cawley S, Chiaromonte F, Chinwalla

AT, Church DM, Clamp M, Clee C, Collins FS, Cook LL,

Copley RR, Coulson A, Couronne O, Cuff J, Curwen V, Cutts T,

Daly M, David R, Davies J, Delehaunty KD, Deri J, Dermitzakis

ET, Dewey C, Dickens NJ, Diekhans M, Dodge S, Dubchak I,

Dunn DM, Eddy SR, Elnitski L, Emes RD, Eswara P, Eyras E,

Felsenfeld A, Fewell GA, Flicek P, Foley K, Frankel WN, Fulton

LA, Fulton RS, Furey TS, Gage D, Gibbs RA, Glusman G,

Gnerre S, Goldman N, Goodstadt L, Grafham D, Graves TA,

Green ED, Gregory S, Guigo R, Guyer M, Hardison RC,

Haussler D, Hayashizaki Y, Hillier LW, Hinrichs A, Hlavina W,

Holzer T, Hsu F, Hua A, Hubbard T, Hunt A, Jackson I, Jaffe

DB, Johnson LS, Jones M, Jones TA, Joy A, Kamal M, Karlsson

EK et al (2002) Initial sequencing and comparative analysis of

the mouse genome. Nature 420:520–562

Xu W, Rould MA, Jun S, Desplan C, Pabo CO (1995) Crystal

structure of a paired domain-DNA complex at 2.5 Å resolution

reveals structural basis for Pax developmental mutations. Cell

80:639–650

Zhang Y, Emmons SW (1995) Specification of sense-organ identity

by a Caenorhabditis elegans Pax-6 homologue. Nature 377:55–

59

Zwollo P, Arrieta H, Ede K, Molinder K, Desiderio S, Pollock R

(1997) The Pax-5 gene is alternatively spliced during B-cell

development. J Biol Chem 272:10160–10168
