Discovery of fiber-active enzymes in Populus wood

64
Discovery of fiber-active enzymes in Populus wood Henrik Aspeborg Royal Institute of Technology Department of Biotechnology Stockholm 2004

Transcript of Discovery of fiber-active enzymes in Populus wood

Page 1: Discovery of fiber-active enzymes in Populus wood

Discovery of fiber-active enzymesin Populus wood

Henrik Aspeborg

Royal Institute of TechnologyDepartment of Biotechnology

Stockholm 2004

Page 2: Discovery of fiber-active enzymes in Populus wood

Henrik Aspeborg

Department of BiotechnologyRoyal Institute of TechnologyAlbaNova University CenterSE-106 91 StockholmSweden

Printed at Universitetsservice US ABBox 700 14Sweden

ISBN 91-7283-785-3

Page 3: Discovery of fiber-active enzymes in Populus wood

Henrik Aspeborg (2004). Discovery of fiber-active enzymes in Populus wood. Doctoral dissertationfrom the Department of Biotechnology, Royal Institute of Technology, KTH, AlbaNova UniversityCenter, Stockholm, Sweden.

ISBN 91-7283-785-3

Abstract

Renewable fibers produced by forest trees provide excellent raw material of high economic value forindustrial applications. Despite this, the genes and corresponding enzymes involved in wood fiberbiosynthesis in trees are poorly characterized. This thesis describes a functional genomics approach forthe identification of carbohydrate-active enzymes involved in secondary cell wall (wood) formation inhybrid aspen. First, a 3’ target amplification method was developed to enable microarray-based gene expressionanalysis on minute amounts of RNA. The amplification method was evaluated using both a smallermicroarray containing 192 cDNA clones and a larger microarray containing 2995 cDNA clones thatwere hybridized with targets isolated from xylem and phloem. Moreover, a gene expression study ofphloem differentiation was performed to show the usefulness of the amplification method. A microarray containing 2995 cDNA clones representing a unigene set of a cambial region ESTlibrary was used to study gene expression during wood formation. Transcript populations from thintissue sections representing different stages of xylem development were hybridized onto themicroarrays. It was demonstrated that genes encoding lignin and cellulose biosynthetic enzymes, aswell as a number of genes without assigned function, were differentially expressed across thedevelopmental gradient. Microarrays were also used to track changes in gene expression in the developing xylem oftransgenic, GA-20 oxidase overexpressing hybrid aspens that had increased secondary growth. Thestudy revealed that a number of genes encoding cell wall related enzymes were upregulated in thetransgenic trees. Moreover, most genes with high transcript changes could be assigned a role in theearly events of xylogenesis. Ten genes encoding putative cellulose synthases (CesAs) were identified in our own Populus EST-database. Full length cDNA sequences were obtained for five of them. Expression analyses performedwith real-time PCR and microarrays in normal wood undergoing xylogenesis and in tension woodrevealed xylem specific expression of four putative CesA isoenzymes. Finally, an approach combining expression profiling, bioinformatics as well as EST and full lengthsequencing was adopted to identify secondary cell wall related genes encoding carbohydrate-activeenzymes, such as glycosyltransferases and glycoside hydrolases. As expected, glycosyltransferasesinvolved in the carbohydrate biosynthesis dominated the collection of the secondary cell wall relatedenzymes that were identified.

Key words: Populus, xylogenesis, secondary cell wall, cellulose, hemicellulose, microarrays, transcriptprofiling, carbohydrate-active enzyme, glycosyltransferase, glycoside hydrolase

Henrik Aspeborg

Page 4: Discovery of fiber-active enzymes in Populus wood

Sammanfattning

Skogsträdens förnyelsebara fibrer är en unik råvara av betydande ekonomiskt värde förskogsindustrin. Trots detta är de gener och motsvarande enzymer som är involverade ivedfiberbiosyntesen inte särskilt väl karaktäriserade. Den här avhandlingen beskriver hurfunktionsgenomik kan användas i syfte att identifiera kolhydrataktiva enzymer som är involverade ibildningen av den sekundära cellväggen i hybridasp. Först utvecklades en 3’ amplifieringsmetod för mål-RNA. Denna metod möjliggörexpressionsanalyser med hjälp av mikromatriser på små mängder RNA. För att utvärderaamplifieringsmetoden hybridiserades en mindre mikromatris bestående av 192 cDNA-kloner och enstörre mikromatris bestående av 2995 cDNA-kloner med isolerat mål-RNA från xylem och floem.Dessutom genomfördes en genexpressionsstudie på differentierande floem för att visaamplifieringsmetodens användbarhet. En mikromatris, bestående av 2995 cDNA-kloner som samtliga representerade unika gener från ettEST-bibliotek isolerat från kambieregionen, användes för att studera genuttrycket undervedbildningens gång. Transkriptpopulationer från tunna vävnadssnitt, som representerade olika stadierav xylemutveckling, hybridiserades till mikromatriserna. Gener som kodade för lignin ochcellulosabiosyntesenzymer, samt ett antal gener utan klarlagd funktion, befanns vara differentielltuttryckta under vedbildningen. Mikromatriser användes också för att påvisa genuttrycksförändringar i xylemet hos transgenahybridaspar med förhöjd sekundär tillväxt. Dessa hybridaspar överuttrycker GA-20 oxidas. Studienvisade att ett antal gener, kodande för cellväggsrelaterade enzymer, var uppreglerade i de transgenaträden. Dessutom noterades att gener med de största transkriptförändringarna var de gener som äraktiva i vedbildningens tidiga stadier. Tio gener kodande för förmodade cellulosasyntaser (CesAs) identifierades i den egna Populus EST-databasen. För fem av dem erhölls cDNA-fullängdssekvenser. Expressionanalysresultat av pågåendexylogenes i normalved och dragved, som utfördes med realtids-PCR och mikromatriser, visade att fyraförmodade CesA-isoenzymer var xylemspecifikt uttryckta. Slutligen kombinerades expressionsanlys, bioinformatik, EST-sekvensering ochfullängdssekvensering i ett försök att identifiera sekundära cellväggsrelaterade gener kodande förkolhydrataktiva enzymer som t.ex. glykosyltransferaser och glykosidhydrolaser. Som förväntatdominerade glykosyltransferaser involverade i kolhydratbiosyntes gruppen av identifierade sekundäracellväggsrelaterade enzymer

Page 5: Discovery of fiber-active enzymes in Populus wood

To my parents and my family

What's so funny 'bout peace, love & understandingNick Lowe

Page 6: Discovery of fiber-active enzymes in Populus wood
Page 7: Discovery of fiber-active enzymes in Populus wood

LIST OF PUBLICATIONSThis thesis is based on the following papers and are referred to in the text by theirRoman numerals:

I. Hertzberg M, Sievertzon M, Aspeborg H, Nilsson P, Sandberg G, LundebergJ. cDNA microarray analysis of small plant tissue samples using a cDNA tagtarget amplification protocol. Plant J. 2001 Mar;25(5):585-91.

II. Hertzberg M*, Aspeborg H*, Schrader J, Andersson A, Erlandsson R,Blomqvist K, Bhalerao R, Uhlen M, Teeri TT, Lundeberg J, Sundberg B,Nilsson P, Sandberg G. A transcriptional roadmap to wood formation. ProcNatl Acad Sci USA. 2001 Dec 4;98(25):14732-7.

III. Israelsson M, Eriksson ME, Hertzberg M, Aspeborg H, Nilsson P, Moritz T.Changes in gene expression in the wood-forming tissue of transgenic hybridaspen with increased secondary growth. Plant Mol Biol. 2003 Jul; 52(4): 893-903.

IV. Djerbi S, Aspeborg H, Nilsson P, Sundberg B, Mellerowicz E, Blomqvist K,Teeri TT. Identification and expression analysis of genes encoding putativecellulose synthases (CesA) in the hybrid aspen, Populus tremula (L.) x P.tremuloides (Michx.). Accepted for publication in Cellulose.

V. Aspeborg H, Schrader J, Coutinho PM, Stam M, Kallas Å, Nilsson P,Denman S, Amini B, Sterky F, Master E, Sandberg G, Mellerowicz E,Sundberg B, Henrissat B, Teeri TT. Carbohydrate-active enzymes involved inthe secondary cell wall biogenesis in hybrid aspen. Manuscript.

* M.H. and H.A. contributed equally to this work and should be considered as jointfirst authors.

RELATED PAPERS

(a) Andersson A, Keskitalo J, Sjodin A, Bhalerao R, Sterky F, Wissel K, Tandre K,Aspeborg H, Moyle R, Ohmiya Y, Bhalerao R, Brunner A, Gustafsson P,Karlsson J, Lundeberg J, Nilsson O, Sandberg G, Strauss S, Sundberg B, UhlenM, Jansson S, Nilsson P. A transcriptional timetable of autumn senescence.Genome Biol. 2004;5(4):R24. Epub 2004 Mar 10.

(b) Blomqvist K, Djerbi S, Aspeborg H, Teeri TT. Cellulose biosynthesis in trees.Submitted as a book chapter.

Page 8: Discovery of fiber-active enzymes in Populus wood
Page 9: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

TABLE OF CONTENTS

1 INTRODUCTION ___________________________________________________ 2 1.1 From genes to genomes _________________________________________________ 2

1.2 EST sequencing _______________________________________________________ 3

1.3 Global gene expression analysis __________________________________________ 5 1.3.1 Background ______________________________________________________________ 5 1.3.2 Array fabrication __________________________________________________________ 6 1.3.3 Target preparation and hybridization___________________________________________ 8 1.3.4 Image analysis ____________________________________________________________ 9 1.3.5 Data analysis ____________________________________________________________ 11 1.3.6 Differentially expressed genes_______________________________________________ 13 1.3.7 Validation ______________________________________________________________ 13

1.4 Searching for a function _______________________________________________ 14

1.5 Wood formation ______________________________________________________ 15 1.5.1 Economic importance _____________________________________________________ 15 1.5.2 Studies of wood formation - poplar and other model organisms _____________________ 17 1.5.3 Xylem differentiation______________________________________________________ 21 1.5.4 The plant cell wall ________________________________________________________ 23 1.5.5 Carbohydrate biosynthesis and turnover _______________________________________ 24

1.5.5.1 Carbohydrate-active enzymes involved in cell wall biosynthesis __________________ 24 1.5.5.2 Cellulose biosynthesis ___________________________________________________ 25 1.5.5.3 Hemicellulose biosynthesis _______________________________________________ 26 1.5.5.4 Pectin biosynthesis______________________________________________________ 27

1.5.6 Other cell wall components _________________________________________________ 28 1.5.6.1 Lignin _______________________________________________________________ 28 1.5.6.2 Cell wall proteins_______________________________________________________ 28 1.5.6.3 Extractives ____________________________________________________________ 29

2 OBJECTIVES OF THE PRESENT INVESTIGATION ___________________ 30

3 RESULTS AND DISCUSSION _______________________________________ 31 3.1 Transcript profiling of woody tissues_____________________________________ 31

3.1.1 Development of a target amplification method (I)________________________________ 31 3.1.2 Transcript profiling during xylogenesis (II) (III) (V) _____________________________ 33

3.1.2.1 A transcriptional roadmap to wood formation (II)______________________________ 33 3.1.2.2 A transcriptional roadmap to wood formation revisited (V) ______________________ 35 3.1.2.3 Transcript profiling of transgenic trees with increased secondary growth (III)________ 36

3.1.3 Transcript profiling of cellulose synthases (IV) _________________________________ 38

3.2 Identification of carbohydrate active enzymes involved in xylogenesis (V) ______ 39

3.3 Unpublished data _____________________________________________________ 41 3.3.1 Full length sequencing of genes with unknown function___________________________ 41 3.3.2 PUNK1 ________________________________________________________________ 43

4 Concluding remarks and future prospects _______________________________ 46

5 ABBREVIATIONS _________________________________________________ 47

6 ACKNOWLEDGEMENTS ___________________________________________ 49

7 REFERENCES ____________________________________________________ 51

1

Page 10: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

1 INTRODUCTION

1.1 From genes to genomes

The late 1990s and the first years of the new millennium will be remembered as the

dawn of genomics in the history of biology. During the 1980s the improvement and

automation of sequencing methods made the complete sequencing of viruses and

organelles possible, but it was not until 1995, that the first complete genome sequence

of a free-living organism, the bacterium Haemophilus influenzae, was published

(Fleischmann et al., 1995). Since then, more than a hundred genomes have been

sequenced from the main domains of life: bacteria, archaea and eukarya. Model

organisms for genome sequencing have been chosen to cover a variety of life forms.

The budding yeast Saccharomyces cerevisiae (1997), human Homo sapiens (2001),

mouse Mus musculus (2002), the nematode Caenorhabditis elegans (1998), the fruit

fly Drosophila melanogaster (2000), the mosquito Anopheles gambiae 2002, and the

plant Arabidopsis thaliana (2000), are examples of completely sequenced eukaryotes

(http://www.ncbi.nih.gov). Other genomes such as the rice genome have been

presented in near-completed drafts, and sequencing projects of several hundred

genomes are on-going at various stages of completion. The genome is defined as the

complete (haploid) DNA content of an organism and the new field of genome studies

has been named genomics. Note that the terminology varies, sometimes related

“omic” topics like, transcriptomics, proteomics, metabolomics, glycomics are

included in the term genomics. Preferably the narrow definition of genomics as the

discipline of mapping, sequencing and analyzing genomes should be used, while the

commonly used term functional genomics is more suitable for the different “omics”,

which make use of high-throughput and large-scale experimental methodologies to

determine the biological function of the genes and their products. In the age of

genomics and functional genomics, the old school of molecular biology, which

followed a gene-at-a-time approach, has been replaced by global approaches for

identifying and understanding gene function, and this has given scientists a new

perspective. With this new global view we can seek answers to questions like: How

and which genes operate together? How and which proteins interact with genes and

other proteins? How do the networks or circuits of genes and proteins interact with

other networks to determine a certain cell type? This new approach to biology, which

2

Page 11: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

studies the interrelationships of all of the elements in a system rather than studying

them one at a time, is termed systems biology (Hood, 2003). As a consequence of the

enormous quantity of data that has been produced by high-throughput experimental

technologies, bioinformatics, i.e. computational and algorithmic approaches to handle

and study biological data, has emerged as a new, essential discipline for genomics and

functional genomics.

1.2 EST sequencing One of the goals of genome sequencing projects is to identify the complete set of

genes of an organism, translate them into proteins and then annotate them on the basis

of their similarity to known proteins in the public databases. However, the coding

sequences of genes which represent the vast majority of the information content, only

represent a tiny fraction of the total DNA. Only 1.5 % of the human genome has been

estimated to consist of coding sequence (Lander et al., 2001). Instead of looking at the

complete genome, an efficient alternative approach is to only look at the expressed

part of the genome, the transcriptome. The expressed genes, i.e. those that are

transcribed into mRNA, in a cell/tissue at a specific moment will determine the

current metabolic status of that particular cell/tissue. For instance, the mRNA pool of

roots will be different from the mRNA pool in leaves. The most widely used method

for gene identification of the transcribed portion of the genome is expressed sequence

tag (EST) sequencing. The term expressed sequence tag (EST) is used to describe

partial sequences of cDNA, reversely transcribed from mRNA. The reason for the

conversion of RNA to cDNA is the higher stability of DNA. The first application of

high-throughput sequencing of cDNA clones was described in 1991, where a brain

specific cDNA library was exploited (Adams et al., 1991). Since then more than 20

million ESTs have been sequenced from more than 600 distinctly annotated species,

representing a wide taxonomic variety of fungi, plants and animals (dbEST 19 March

2004, http://www.ncbi.nih.gov) (Boguski et al., 1993). Several plant species also have

large EST collections (Table 1). Since the 5’ sequences are more likely to contain

protein coding sequence than the 3’ends, early EST-projects favored sequencing the

5’ end of directionally cloned cDNAs. Currently, there is preference for sequencing

the 3’ end of the cDNA clone because it is likely to offer more unique sequence,

including the untranslated region (UTR), and can be used to distinguish between

3

Page 12: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

closely related gene family members. The strategy of sequencing both ends of the

cDNA clone are also becoming widespread (Rudd, 2003). A high quality cDNA

library with mostly full length genes is valuable for further investigation of candidate

genes (especially for organisms without their genome sequenced), since laborious

methods such as RACE (Rapid Amplification of cDNA Ends), which would

otherwise be necessary to obtain the full length gene, are not needed. Despite the fact

that ESTs are limited in size (e.g. typically 400-500 bp), assembly of overlapping

ESTs into clusters, can result in much longer sequences, and even full length cDNAs.

Table 1. Top 16 plant species ranked by the number of dbEST entries April 2, 2004

Species Common name Number of entries

Triticum aestivum Wheat 555472 Zea mays Maize 395951 Hordeum vulgare + subsp. vulgare Barley 356856 Glycine max Soybean 346582 Oryza sativa Rice 283989 Saccharum officinarum Noble cane 246301 Arabidopsis thaliana Thale cress 204396 Sorghum bicolor Sorghum 190864 Medicago truncatula Barrel medic 187763 Populus spp.* Poplars 155454 Lycopersicon esculentum Tomato 150519 Solanum tuberosum Potato 144730 Vitis vinifera Grape 137660 Pinus taeda Loblolly pine 110622 Lotus corniculatus var. japonicus Birdsfoot trefoil 110563 Lactuca sativa Garden lettuce 68188 * Populus species and hybrids Populus tremula x Populus tremuloides Hybrid aspen 65981 Populus tremula (European) aspen 31288 Populus balsamifera subsp. trichocarpa Black cottonwood 26825 Populus tremuloides Quaking aspen 12813 Populus x canescens (Populus alba x Populus tremula)

Gray poplar 10446

Populus balsamifera subsp. trichocarpa x Populus deltoides

Interamerican poplar

7524

Populus alba x Populus glandulosa 519 Populus euphratica Euphrates poplar 58

4

Page 13: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

1.3 Global gene expression analysis

1.3.1 Background

Knowing when, where and to what extent a gene is expressed provides clues to gene

function. Preferably, gene expression information is obtained from a variety of

physiological and developmental conditions. Gene expression tools for a single gene

or a small set of genes have been available for a long time to provide an insight into

gene function. During the last decade sequence-based methods such as serial analysis

of gene expression (SAGE) (Velculescu et al., 1995) and EST sequencing have been

used to measure mRNA abundance for a large number of genes. For instance, large

EST datasets that originate from multiple tissue specific cDNA libraries provide

information on gene expression, simply by observing the relative abundance of the

individual ESTs in the different libraries, a procedure called electronic or digital

Northerns (Ohlrogge and Benning, 2000). However, the sequence-based methods are

expensive and slow ways to study gene expression. New parallel hybridization-based

approaches that allow gene expression patterns for thousands of genes to be obtained

simultaneously have instead become increasingly popular. Spotted microarrays

(Schena et al., 1995) and in situ oligonucleotide arrays (Lockhart et al., 1996) are two

such high-throughput methods that monitor global gene expression, and since they

were first described they have had a tremendous impact on the molecular biology

field. A supplement in Nature genetics was dedicated to these emerging technologies

in 1999, and was followed by another supplement in 2002 (Nature Gen. 1999. Vol 21

(1 suppl) pp.1-60, Nature Gen 2002. Vol 32 (1 suppl) pp 461-552). For a

comprehensive review of various microarray related issues these two collections of

reviews are highly recommended.

Array technology allows the precise positioning of DNA fragments at high density

onto a solid support, creating a two-dimensional array of DNA spots that can act as

molecular detectors. There are three main types of microarrays: filter arrays, spotted

glass slide arrays, and in situ synthesized oligonucleotide arrays (Holloway et al.,

2002). Filter arrays, often called macroarrays, cannot be spotted with the same density

as glass slide arrays, and the target is usually labeled radioactively. These are

relatively cheap to produce, but to be able to compare different samples they have to

be hybridized to individual separate arrays. The Affymetrix GeneChipsTM are the

5

Page 14: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

most widely used arrays with in situ synthesized oligonucleotides. Short

oligonucleotides, only 20-25 bases long, are synthesized by photolithography. The

advantages of using Affymetrix chips are the highly standardized setup, the complete

control over the sequences and the possibility to detect splice variants and single

nucleotide polymorphisms (SNPs). The weakness of this technology is the relatively

high cost, the limited organisms available, and that two chips are required to compare

two different samples because the Affymetrix technology can only handle one

fluorescent dye. In spotted arrays either oligonucleotides or cDNA fragments can be

used as probes. Although cDNA spotted microarrays have been the favored type,

spotting oligonucleotides 50-70 bases long is becoming increasingly popular. A

unique and powerful feature of the two-color spotted microarrays is the ability to

make a direct comparison between two samples on the same microarray (Fig. 1).

Since spotted cDNA microarrays have been successfully used in the present thesis,

this technology will be described in further detail. In the following description, the

term probe will refer to the immobilized DNA, while the labeled, mobile DNA that is

hybridized to the array is called target as suggested in the Nature Genetics supplement

Chipping forecast (Phimister, 1999).

1.3.2 Array fabrication

When creating spotted cDNA microarrays, a large collection of cDNAs representing

as many unique transcripts as possible (e.g. a subset of genes of interest or preferably

all genes in the genome), are PCR-amplified and subsequently spotted onto glass

slides using a high-speed robotic system. The glass slides are coated with poly-lysine,

amino silanes or amino-reactive silanes, which enhance the adherence of the printed

DNA. After printing, the DNA is usually cross-linked to the matrix by ultraviolet

irradiation. Alternatively, baking the arrays in 80°C for two to four hours works

equally well. To reduce background fluorescence, which may interfere with spot

identification and subsequent data analysis, the free reactive groups on the unprinted

area of the glass slides can be deactivated with chemicals like succinic anhydride or

by treatment with biomolecules such as bovine serum albumin (BSA). Finally, the

DNA molecules are denatured in water at high temperature before hybridization.

Other critical factors for the spot quality and morphology are the concentration of the

spotted DNA, the composition of the printing solution, temperature and relative

humidity.

6

Page 15: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

Microarray / DNA chip

DNA clones (genes)

PCR amplification Purification

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . . . . . . . . . .

Robotic printing

Probe Laser 1 Laser 2

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

“green” raw data“red” raw data

Sample Reference

Labelled cDNA

Hybridize to microarray

mRNA

Target

Image analysis

Ratio Cy3

Cy5/

Figure 1. Schematic overview of spotted cDNA microarrays.

By spotting the cDNA clones in replicates, an estimation of the intra-slide variations

can be obtained. Because of array space limitations, duplicates are most widely

prepared. Another option is to print many replicates of a few selected genes.

Preferably, the replicates should be distributed over the slide surface. In addition to

the spotted cDNA clones of the examined organism, both positive and negative

element controls can be spotted to evaluate the quality and performance of the

microarray. Positive controls are used to examine the dynamic range for calibration or

as ratio controls (i.e. is the expected ratio obtained). These spiking controls are made

of two components. PCR products from a different organism are printed on the array,

and in vitro transcribed RNAs from these corresponding PCR products are added to

the hybridization mix in known amounts. Examples of negative controls are: poly(dA)

oligos, Cot-1 DNA, printing buffer and non-crossreactive PCR products from

unrelated species. These are included to check for non-specific hybridizations,

7

Page 16: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

blocking of repetitive elements, and pin washing efficiency. There are currently

several commercial control kits available.

1.3.3 Target preparation and hybridization

Target production begins with RNA isolation from the tissues of interest. Either

highly purified mRNA or total RNA can be used in subsequent reverse transcription

of mRNA. For direct labeling methods, targets are fluorescently labeled during the

reverse transcription step, using fluorescently labeled nucleotides. The alternative is

indirect labeling, whereby an amino allyl modified dUTP is used instead of a

prelabeled nucleotide. In an extra step after reverse transcription, the free amine group

on the amino allyl modified dUTP can be coupled to a reactive N-hydroxysuccinimydl

ester fluorescent dye. Although this technique is more time-consuming than direct

labeling, it has become widely used because it improves sensitivity, removes dye

biases and can be performed at decreased cost. The green cyanine-3 (Cy3) and the red

cyanine-5 (Cy5) have been the most popular dyes used in experiments with cDNA

spotted microarrays, but alternative dyes are available. A substantial amount of RNA

is required for labeling and hybridization, amounts typically above 10 µg of total

RNA are recommended. In order to perform microarray experiments with tissue

samples where the RNA amount is limited, various RNA amplification methods have

been developed. As the preservation of the relative expression levels from the original

cDNA/mRNA is critical, the aim is to develop unbiased amplification methods with

high reproducibility. After labeling and purification of the samples to remove

unincorporated dyes, the two targets, one typically labeled with Cy3 and the other

with Cy5, are mixed together in a hybridization buffer and hybridized to a single

microarray slide. During the period of hybridization, which extends several hours or

overnight, the target nucleic acids are allowed to form stable duplexes with the

corresponding DNA molecules immobilized on the array. Manual hybridizations are

performed underneath cover slips in small hybridization cassettes. As an alternative,

automated hybridization stations are available. The stringency of hybridization is

controlled by factors such as the hybridization buffer composition and temperature.

Choosing the washing conditions after hybridization is also critical for the outcome of

the microarray experiment.

8

Page 17: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

1.3.4 Image analysis

Both confocal scanning devices and CCD cameras are used for fluorescence

detection. The scanner produces separate digital images for each of the Cy3 (green)

and Cy5 (red) channels. Generally the Cy5-channel is scanned first, as this dye is

more susceptible to photobleaching. The overlay of the two output images offers a

quick visualization of the microarray experiment and information about the

uniformity of the hybridization, spot morphology, background and artifacts, such as

dust particles, are revealed. Next, image analysis software is needed to quantify each

individual array element. First, a spot finding method is required to locate all signal

spots in the image and estimate the size of the of the spot. The spot-finding technique

must be robust and able to manage uneven spot sizes and positional deviations. Then

the foreground and background intensities for each spot are calculated. The

foreground signal is often measured as the mean or the median of the intensities of the

pixels identified inside the spot. There are several algorithms used to calculate the

background, and they all try to estimate the background noise in the area of the spot.

Most often the background signal is subtracted from the foreground, but this may not

always be the best choice. Finally the data can be filtered and spots exhibiting a low

signal to background ratio are removed from the data set (e.g. to exclude any spot

with intensity lower than the background plus two standard deviations). Typically, the

intensity ratios are also log-transformed (logarithm base 2 is most widely used). The

advantage of this transformation is that upregulated and downregulated values are

comparable and of the same scale (Quackenbush, 2002; Leung and Cavalieri, 2003).

This means that an upregulated gene with intensity ratio = 2 has a log2 (ratio) = 1,

while a downregulated gene with intensity ratio = 0.5 has a log2 (ratio) = -1, and a

constantly expressed gene with intensity ratio = 1 has a log2 (ratio) = 0.

After the image processing, but before the downstream analysis of the gene

expression profiles, it is necessary to normalize the relative fluorescence intensities in

each of the two scanned channels. The goals of normalization are to identify and

overcome as much of the technical variation as possible such that the observed ratios

will most accurately reflect the biological variance. The most common sources of

technical fluctuations are differences in labeling, detection efficiencies between the

used fluorescent labels, unequal quantities of initial RNA for the two samples

examined and systematic biases in the measured expression levels (Quackenbush,

9

Page 18: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

2002). By performing dye-swap replications, i.e. reversing the labeling of targets in

the second hybridization, a reduction of systematic biases can be achieved, but

normalization is still required. Over the years, the normalization methods have been

improved and become more sophisticated. The normalization factor can be calculated

using all the genes, a set of constantly expressed housekeeping genes or spiking

controls. It has been observed that housekeeping genes are not as constantly expressed

as previously assumed, and therefore the expression of selected housekeeping genes

must be carefully inspected before used for normalization.

There are basically two main techniques used for the normalization of gene

expression data from a single array hybridization (i.e. within-slide normalization).

They share the assumption that most of the genes in the array, some subset of genes,

or a set of spiking controls should have an average expression equal to one. Total

intensity normalization assumes that the initial quantity of mRNA is the same for the

two compared samples and that the change of expression is symmetrical, i.e. the

number of upregulated genes is equal to the number of downregulated genes. Thus,

the total intensity for all the array elements should be the same in the two scanned

channels (i.e. Cy3 and Cy5) and the normalization factor can be calculated after

summing the intensities in both channels. The second main type of normalization uses

regression techniques. Here one assumes that a significant fraction of the genes will

display no change in expression in the two samples compared. In a scatter plot these

genes would cluster along a straight line with a slope of one. Normalization is

performed by computing the best-fit slope and adjusting the data so that the slope is

one. The regression techniques also include non-linear methods such as lowess

(locally weighted scatterplot smoothing) regression (Quackenbush, 2001). Other

alternative approaches to normalize microarray data are log centering, rank invariant

methods and Chen’s ratio statistics (Quackenbush, 2002).

It is now evident that the dye bias in most microarrays is both intensity and position

dependent. The intensity dependent effects are easily visualized with a MA-plot

(Dudoit et al., 2002), which is a scatterplot of the M-values [M = log2 (R/G)] against

the A-values [A = 0.5log2(RG)] of an array (See Fig. 2 for details). For detecting

position dependent effects spatial plots or boxplots, which plot the distribution of the

log-ratios for each print-tip group, can be used (Fig. 2) (Smyth and Speed, 2003). The

10

Page 19: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

non-linear lowess normalization has emerged as the dominating normalization method

to remove these systematic biases. If there are no evident spatial effects global lowess

normalization is a good choice for normalization. However, if there are possible

spatial effects print-tip lowess or scaled print tip lowess normalization should be

considered (Leung and Cavalieri, 2003). In addition to within-slide normalization,

between array normalization is useful if there are substantial scale differences

between microarray slides. This will prevent any particular replicate to become too

dominant and skew the data (Leung and Cavalieri, 2003; Smyth and Speed, 2003).

1.3.5 Data analysis

Cluster analysis methods are used to more systematically group related gene

expression patterns across experiments. By exploring the large data sets generated by

microarrays, this type of analysis can provide novel perspectives on cellular

regulatory mechanisms and can associate the expression of unknown genes with a

putative function. There are basically two main categories of clustering methods,

those that are unsupervised and those that are supervised. In unsupervised methods,

existing knowledge about the functional roles of different genes is not considered

prior to the analysis. In contrast, the supervised methods require the existence of a

training set of genes that are known to be related. The unsupervised hierarchical

clustering analysis presented in 1998 is still one of the most widely used unsupervised

methods for clustering microarray data (Eisen et al., 1998). This technique produces a

tree called a dendrogram. Other unsupervised clustering techniques, such as self-

organizing maps and k-means clustering, produce a fixed number of categories

instead of a hierarchy. Principal component analysis (PCA) and singular value

decomposition (SVD) are two other frequently used unsupervised methods

(Quackenbush, 2001; Holzman and Kolker, 2004). Examples of supervised methods

include: support vector machines, linear discriminant analysis, neural networks,

decision trees and k-nearest neighbors (Valdivia Granda, 2003).

11

Page 20: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

aa

Print tip

A = 1/2log2(RG)

a

b

Figure 2. (a) is showing a MA-plot. M = log2 (R/G) A = 0.5log2(RG) R=Cy5 channel R=Cy3 channel.

(b) is showing a boxplot.

12

Page 21: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

1.3.6 Differentially expressed genes

One of the goals of microarray experiments is to identify which genes are

differentially expressed between pairs of samples. In the early days of microarrays,

differentially expressed genes were inferred by a fixed threshold cut off method (i.e. a

two-fold increase or decrease), but to reliably conclude that a particular gene is

significantly differentially expressed, statistical methods based on replicate

microarray experiments are needed. To increase data reliability, it has been suggested

that every microarray experiment should be performed in triplicate (Lee et al., 2000).

There are two types of replication, biological and technical. Technical replicates refer

to replication of the same extracted RNA samples, whereas biological replication

usually refers to the analysis of RNA from different individuals. In general, biological

replicates are preferred since technical replicates will not solve the problem of

biological variation (Yang and Speed, 2002). The procedure of selecting differentially

expressed genes can be divided into two steps. First, all genes are ranked in the order

of evidence for differential expression, then a cut-off value is chosen for the ranking

statistic, and all genes with a value above the cut-off are considered to be significantly

differentially expressed (Smyth et al., 2003). Commonly used statistical methods for

identifying differentially expressed genes are the student’s t-test and its variants,

ANOVA, Bayesian methods and the Mann-Whitney test (Leung and Cavalieri, 2003).

1.3.7 Validation

The difficulties with clone management constitute one of the drawbacks with spotted

cDNA microarrays. Handling large numbers of clones is time consuming and there is

a risk of clone mix up and contamination. Early reports discovered high error rates of

clone sets (both commercial and academic), and this important realization led to the

requirement that all sequences of a clone set must be checked before using them for

microarray printing (Knight, 2001). When the microarray data has been processed and

is finally analyzed, there are two general approaches to independently validate the

data: in silico analysis and experimental analysis. The in silico method compares the

microarray results with other gene expression data found in the literature or that is

deposited in public or private databases. Common experimental validation approaches

include: reverse transcriptase PCR (RT-PCR), real-time RT-PCR, northern blot and in

situ hybridization (Chuaqui et al., 2002). However, these follow-up experiments can

only include a small subset of the genes represented on the microarray. Finally, it is

13

Page 22: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

important to stress that the mRNA and protein levels do not always correlate, so

further characterization of the gene product is generally required.

1.4 Searching for a function

In the present thesis, gene function means the biochemical function of the end

product, the protein, but sometimes gene function is referred to as a cellular function

(e.g. a role in signal transduction pathway) or a developmental function (e.g. a role in

organogenesis) (Bouchez and Hofte, 1998). Although knowing the DNA sequence of

a gene and its expression pattern is valuable, these data do not provide direct

information about gene function. In many cases, information about the function of the

deduced protein sequence can be inferred on the basis of their similarity to better-

characterized proteins from other organisms. Although specific terms have been

developed to discuss related gene sequences (Table 2), many predicted open reading

frames cannot be assigned a function simply by similarity to other known proteins.

For instance, more than 30% of the ORFs in Arabidopsis are classified as unknowns

(The Arabidopsis genome initiative, 2000). Moreover, knowing that a gene encodes a

transcriptional factor or is involved in signal transduction, does not reveal the specific

cellular processes the gene is involved in (Somerville and Somerville, 1999). Many of

the “known” genes have been assigned a putative function based only on their

similarities to other genes, but annotating genes in this way produces considerable

uncertainty (Osterlund and Paterson, 2002). Gene expression data is also insufficient

for assigning function. In fact, there is probably no single technology or approach that

can provide that information. Instead, different approaches provide pieces of the

complicated puzzle. Thus, in addition to the DNA sequence and transcript profiles, in

situ hybridization, protein localization, heterologous protein expression studies,

protein-protein interaction studies, post-translational modification studies, 3-D

structure studies, and mutant analyses are important methods for deciphering gene

function.

One of the most efficient approaches to determine gene function is the so-called

reverse genetics approach, where a loss-of-function mutation is created in the gene of

interest and the phenotype of the resulting mutant is studied. This is in contrast to the

classical forward genetics approach, which starts with the observation of an

14

Page 23: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

interesting phenotype and ends with tracking down the gene responsible for it. There

are a number of reverse genetics tools available such as RNA interference (RNAi),

virus-induced gene silencing and the use of systematically generated transfer DNA

(T-DNA) or transposon populations. But the reverse genetics approach also has

drawbacks. Many proteins are members of large protein-families, where the

individual members share high sequence similarity and redundant functions. In such

cases, there is a risk that disruption of a single gene may not produce an organism

with a visible or a molecular phenotype. If the gene family is not too large, one could

attempt to isolate mutants in each member of the family and then cross them to

generate double or multiple mutants to overcome this problem. Disruption of an

essential gene causing lethality is another limitation of reverse genetic approaches. To

overcome this problem it is possible to generate leaky alleles or conditional mutants

(Perrin et al., 2001; Tissier and Bourgeois, 2001). Still, knowing which environmental

condition or developmental stage will manifest an interesting phenotype may require

additional experiments.

Table 2. Classification of the different types of gene homology

Homolog Related genes that evolved from a single common ancestor

Paralog Related genes that evolved due to gene duplication within a species

Ortholog Related genes that evolved after speciation

1.5 Wood formation

1.5.1 Economic importance

Wood formation (xylogenesis) is a fundamental biological process of significant

economic and commercial value. Trees are responsible for the majority of land-borne

biomass production and constitute an important industrial raw material. Wood is

ranked as the fifth most important product in world trade, and provides biofuel, fibers,

solid wood products, but also chemical products, plastics, food and textile products

(Plomion et al., 2001). The world’s forest area was estimated to be 3.87 billion

hectares in 2001, which is about one-third of the earth’s land surface, and the total of

15

Page 24: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

above-ground woody biomass in forests was estimated to 420 billion tones (FAO,

2001). Global demand for wood is growing at 1.7% annually, so there is a need to

develop forestry in a long-term sustainable way to meet this demand (Fenning and

Gershenzon, 2002). The development of high-yielding short-rotation plantation

forests is one approach to supply the world with more wood without degrading the

remaining natural forests of the world. Biotechnology has significant potential to

contribute to this development. To date the public concern about the risk of using

genetically modified organisms (GMOs) has slowed down the development of

genetically modified (GM) trees. However, plenty of benefits can be achieved using

biotechnology without performing genetic modifications until the risk of GM tree

plantations has been thoroughly investigated, e.g. by utilizing molecular marker

assisted breeding.

Sweden is located in the northern coniferous zone, with spruce and pine being the

most common tree species in forests. The most common Swedish hardwood

(angiosperm) species is birch, followed by aspen and alder. Forestry has for a long

time been a vital part of Swedish economy. Although the proportion of the population

employed in the forestry sector has decreased, forestry is still one of the most

important industrial sectors in Sweden. For instance, in 2001, forest products worth

110 billion SEK were exported (approximately 15% of the total exports), while

imports of forestry products amounted to 22 billion SEK. Thus, the surplus on the

trade in forest products was 88 billion SEK. Sweden is the world’s fourth largest

exporter of both pulp and paper and the world’s second largest exporter of sawn

timber products (http://www.skogsindustrierna.se).

Forest and forest product biotechnology research was initiated in the pulp and paper

industry in the early 1980’s. The idea that microorganisms and their enzymes can

successfully replace or supplement conventional chemical processes in the pulp and

paper industry has now gained acceptance. Biotechnology has the potential to perform

more specific reactions in more environmentally friendly processes with significant

energy savings. Current biotechnological methods in the pulp and paper industry

include bio-pulping, enzymatic bleaching, pitch control and purification of waste

water. Microbial xylanases, peroxidases, lipases and laccases are examples of

enzymes that have been used successfully to improve pulp processing (Mansfield and

16

Page 25: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

Esteghlalian, 2003). Today, the role of forest product biotechnology is expanding

beyond the pulp and paper industry. Newer application areas include genetic

engineering of forest trees to improve forest health, e.g. disease resistance, cold

tolerance, rooting ability, and enhanced properties, e.g. accelerated growth, improved

fiber quality, lignin content/composition, phytoremediation, enzymatic hydrolysis of

cellulose and hemicellulose for bioethanol, production of bio-glued board materials,

utilization of antagonistic organisms for wood protection and the development of new

biocomposites. During the past decade the main focus of pulp and paper

biotechnology has been screening for new potentially interesting enzymes from

microbial sources. Plants provide an alternative and largely unexplored source of

enzymes. While microbial enzymes are wood degrading, plants contain wood

synthesizing enzymes, thus offering us a new set of target enzymes for fiber

modification. These biosynthetic enzymes could be used to add functionality onto

fiber surfaces. In summary, forest product biotechnology is a promising research field,

which could be integrated in almost any step of the process that starts with the birth of

a tree and ends with a finished product.

1.5.2 Studies of wood formation - poplar and other model organisms

It might seem evident that for the study of wood formation, trees should be the model

choice, but in fact weeds and ornamentals can also contribute to the knowledge of this

process. The weed thale cress (Arabidopsis thaliana) was the first plant to have its

genome sequenced, and this model organism has been extremely valuable for plant

biology. Under appropriate growth conditions Arabidopsis undergoes secondary

growth in hypocotyls, and can therefore be used to identify genes involved in wood

formation. The fibers and vessel elements of Arabidopsis are also morphologically

and ultrastructurally similar to those of trees like poplar (Chaffey et al., 2002;

Bhalerao et al., 2003). The ornamental garden plant, Zinnia (Zinnia elegans), can be

used to study certain aspects of wood formation in vitro. Isolated Zinnia mesophyll

cells maintained in liquid culture can be induced to redifferentiate synchronously into

tracheary elements. During this process genes and proteins can be identified at

different stages of xylem formation. The Zinnia treachery elements form vessel-like

structures in vitro, resembling vessels formed in planta (Fukuda, 1997; McCann,

1997; Milioni et al., 2001; Demura et al., 2002; Milioni et al., 2002; Pesquet et al.,

2003). However, there are a number of wood related processes that cannot be studied

17

Page 26: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

in annual plants (e.g. heartwood formation, the seasonal cycle of cambial

dormancy/activity, wood maturation, the seasonal variation of earlywood and

latewood production). Moreover, there are limited amounts of xylem tissue in both

Arabidopsis and Zinnia. Thus, although other plant models can contribute to our

understanding of certain aspects of wood formation and other important processes of a

tree, most tree-specific questions are best answered using tree models.

Woods are classified as hardwoods (angiosperms) or softwoods (gymnosperms).

There are basic structural differences between the two kinds of wood, but the terms

“hardwood” and “softwood” do not necessarily reflect the relative density or hardness

of wood. Poplars and pines were the two first tree genera to be studied at a molecular

level, and these are still the two most extensively studied tree genera with the largest

published EST-libraries, representing the angiosperms and the gymnosperms,

respectively (Allona et al., 1998; Sterky et al., 1998). There are other on-going public

EST-projects for birch, spruce, black locust and eucalyptus, and EST-collections from

pine and eucalyptus have also been reported by industrial laboratories (Bhalerao et al.,

2003; Campbell et al., 2003); Campbell et al, 2003; Yang et al 2003) (Table 3). The

first tree genome sequencing project was initiated in 2002, when The DOE Joint

Genome Institute started sequencing a member of the Populus genus, the black

cottonwood Populus trichocarpa. With this genomic platform, which will be public in

May 2004, Populus will be the most important model tree genus for the next coming

years. Though gymnosperms are generally more commercially important, a genome

sequencing effort for a representative gymnosperm is not expected in the near future,

mainly because of the cost of sequencing their large-sized genomes. Furthermore, a

lot of the knowledge obtained by studying Populus will be applicable to gymnosperm

tree species. Poplar has been gradually established as the model tree and the model

hardwood over the years in part because its genome is relatively small. The 500-550

Mbp poplar genome is similar in size to the rice genome, and only 4 times larger than

the genome of Arabidopsis, yet 40 to 50 times smaller than the huge pine genome

(Wullschleger et al., 2002; Brunner et al., 2004). Poplars are also easy to transform,

they were the first trees to be genetically transformed, and they are among the most

fast-growing trees found in the temperate regions. The first EST-project in poplar

reported about 5000 EST sequences from Populus tremula x tremuloides and Populus

trichocarpa (Sterky et al., 1998). Since then, over 130000 ESTs from Populus

18

Page 27: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

tremula x tremuloides, Populus trichocarpa and Populus tremula have been

sequenced in this project (http://www.populus.db.umu.se). Furthermore, 7000 root ESTs

from Populus trichocarpa x deltoides (Kohler et al., 2003), 10000 wood related ESTs

from Populus tremula x alba (Dejardin et al., 2004) and 12000 ESTs from different

tissues of Populus tremuloides (Wullschleger et al., 2002) have also been reported. In

addition to the EST and genomic sequencing efforts, several other resources are

already or will soon be available, including microarrays, a bacterial artificial

chromosome (BAC) physical map, a complete chloroplast DNA sequence, and several

hundred mapped simple sequence repeat (SSR) markers (Bhalerao et al., 2003). In

addition, Populus is closely related to Arabidopsis, which facilitates comparative

functional studies between the two model systems. All together this makes poplar an

ideal model system for studying phenomena such as shoot dormancy, adaptions to

deep frost, the presence of juvenile and mature phases, and extensive secondary

growth in forest trees (Mellerowicz et al., 2001).

Table 3. Sequencing projects in trees

Species Worldwide-web URL

Genome Populus trichocarpa http://genome.jgi-psf.org/poplar0/poplar0.home.html

ESTs Populus tremula, Populus trichocarpa, Populus tremula x tremuloides

http://www.populus.db.umu.se

Populus trichocarpa x deltoides, Populus tremula x alba

http://mycor.nancy.inra.fr/poplardb/aims.html

Populus trichocarpa http://www.arborea.ulaval.ca/ Populus tremuloides http://aspendb.mtu.edu/ Betula pendula http://www.biocenter.helsinki.fi/bi/PLANT/Helariutta/ Eucalyptus globulus http://ccgb.umn.edu/biodata/ Eucalyptus camaldulensis http://ccgb.umn.edu/biodata/ Robinia pseudoacacia http://ccgb.umn.edu/biodata/ Pinus taeda http://pinetree.ccgb.umn.edu/ Picea Glauca http://www.arborea.ulaval.ca/

The genus Populus is a member of the Salicaceae (willow) family and the order

Salicales. Populus species are naturally widely distributed over the Northern

Hemisphere with a small representation in tropical Africa (Bradshaw et al., 2000).

Although various classifications have been suggested, a recent classification

recognize 29 species that are grouped under six separate sections (Eckenwalder,

19

Page 28: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

1996). The only naturally occurring poplar species in Sweden, the European aspen

(Populus tremula), belongs to the section Leuce (aspens) (Fig. 3a). Populus tremula is

a broadly distributed tree, covering almost the whole of Europe and parts of northern

Asia, China and Japan. The closest relatives are White poplar (Populus alba), the

American (quaking) aspen (Populus tremuloides), Siebold aspen (Populus sieboldii),

and the Chinese aspen (Populus adenopoda) (Almgren 1990). The hybrid aspen

Populus tremula x tremuloides is a crossing between European aspen (Populus

tremula) and American aspen (Populus tremuloides) (Fig. 3b). The first crossing in

Sweden was made in 1939, and it was found that the hybrid showed higher growth

rate than either of the parental species (Almgren and Skogsstyrelsen, 1990). From

1940 to 1965, efforts were made to develop hybrid aspen forestry, but due to the

decline of the Swedish match industry (caused by the introduction of the cigarette

lighter), which was the most important supporter of the research, the interest in hybrid

aspen waned. Since the 1980s, the Swedish forestry has shown renewed interest in

hybrid aspen as an alternative for short rotation forestry on former agricultural land.

However, the dominating alternative crop today in Sweden are willows (Salix spp)

planted on approximately 20000 ha, whereas the area with poplar plantations is only

300 ha. Recent studies have shown that poplars and hybrid aspen are superior biomass

producers compared with tree species commonly grown on agricultural land at these

latitudes (Norway spruce and birch), thus poplar cultivation has a strong potential to

expand in Sweden (Christersson, 1996; Rytter, 2002; Karacic et al., 2003). Although

hybrid aspen or other poplars are not yet planted over large areas in Sweden, the

importance and the knowledge of intensive poplar cultivation for bioenergy and fiber

production in Europe and North America is well documented (Tuskan, 1998; Pellis et

al., 2004).

20

Page 29: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

Figure 3. (a) Populus tremula growing in Lohärad, Sweden. (b) Populus tremula x tremuloides in the

greenhouse. Photo: Henrik Aspeborg.

1.5.3 Xylem differentiation

Wood is secondary xylem, that is produced by the vascular cambium through a series

of developmental steps. Cells formed by the vascular cambium give rise to phloem on

the outside and xylem towards the inside of the tree trunk. As the cambium adds cells

to the secondary xylem, the cambium is deposited outwards, and the circumference

and the stem diameter are increased. The increasing diameter growth is caused by

periclinal divisions and the increasing circumference is accounted for by occasional

anticlinal (radial) divisions. The major steps of wood development are cell division,

expansion, secondary cell wall formation, lignification and programmed cell death. In

angiosperms fusiform cambial cells differentiate inwards into three xylem cell types:

vessel elements, fibers and axial parenchyma; whereas on the outer side they give rise

to the phloem cell variants: sieve tubes, companion cells, axial parenchyma and fibers

(Fig. 4). Xylem has two main functions: water transport and mechanical support,

whereas phloem conducts nutrients synthesized during photosynthesis. Ray cambial

21

Page 30: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

cells give rise to contact and isolation ray cells, which play a role in radial

conductance and storage of nutrients such as starch, proteins and lipids (Mellerowicz

et al., 2001; Plomion et al., 2001). Wood fiber morphology, structure and composition

vary greatly between tree species, but differences

Figure 4. Macerated wood of Populus tremula x tremuloides showing major cell types: vessel elements, fibers and ray parenchyma cells (image kindly provided by Ewa Mellerowicz).

within species are also found. Fibers formed in the spring (earlywood) are

characterized by thin walls, while fibers created in the end of summer (latewood) have

thick walls. Juvenile wood formed within the first 10-20 years of a tree’s life, and in

the crown of older trees, has a lower density and a higher cellulose microfibril angle

compared to mature wood. Another important variant of wood formation in a tree is

reaction wood. This type of wood is produced to compensate for exposure to wind or

other types of bending pressure. Softwoods form reaction wood on the compressed

side of a bending trunk (compression wood), whereas reaction wood formed by

hardwoods is induced on the upper side of a leaning stem (tension wood).

22

Page 31: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

1.5.4 The plant cell wall

Properties of the wood fiber depend on the biosynthesis of the plant cell wall. The

plant cell wall is an intricate biochemical network that plays a vital role in almost

every aspect of plant physiology. The many different functions of the plant cell wall

include provision of mechanical strength, definition of cell shape, participation in cell-

cell communication, protection against pathogen attack and interaction with

symbionts. The cell wall is composed of several layers that are produced at different

periods during cell differentiation. The middle lamella is the first layer produced

during xylogenesis and it is developed after cell division. The middle lamella ensures

the adhesion of a cell with its neighbors, and consists mainly of pectin and small

amounts of lignin, which are deposited later on. The primary wall is formed at the

beginning of cell differentiation. Populus and other dicots have a Type 1 primary wall

consisting of cellulose microfibrils associated with hemicellulose, which interacts

with a structurally independent pectin network. A third network consists of structural

proteins or a phenylpropanoid network. The primary wall is strong but flexible and

capable of expansion (increase in cell size in two or three dimensions) and elongation

(expansion in only one dimension). Proteins like xyloglucan endotransglycosylases

(XETs) and expansins have possible wall-loosening activities during cell expansion

(Fry et al., 1992; Mcqueen-Mason et al., 1992). When the cells have reached their

final size, a secondary cell wall is synthesized on the inner side of the primary wall.

The secondary cell wall of wood fibers is divided into three different sublayers, S1,

S2, and S3 (Fig 5). Cellulose, hemicellulose and lignin are present in each of these

layers. The S1 layer is the thinnest of the S layers, representing only 5% to 10% of the

total thickness of the cell wall. This layer can be regarded as an intermediate between

the primary wall and the inner layers S2 and S3. The thick middle layer S2 accounts

for 75% to 85% of the cell wall, and thereby is the most important regarding wood

fiber properties. S3 is the innermost layer, and is also thin. Because the cellulose

microfibrils are arranged in different directions in these layers, it is possible to

distinguish them visually. Xylem vessel elements and ray cells also form secondary

walls with three different S-layers (Mellerowicz et al., 2001).

23

Page 32: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

Figure 5. A schematic cross section of the fiber cell wall that illustrates the different cell wall layers

and the structure of the secondary cell wall. The orientation of the cellulose microfibrils differ between

the cell wall layers.

1.5.5 Carbohydrate biosynthesis and turnover

1.5.5.1 Carbohydrate-active enzymes involved in cell wall biosynthesis

Before discussing the biosynthesis of different polysaccharides in the plant cell walls,

a description of key players in this process is needed. There is a vast number of

carbohydrate-active enzymes (CAZymes) involved in the synthesis, modification and

degradation of oligosaccharides and polysaccharides. The synthesis of glycosides is

performed by glycosyltransferases (GTs), modification is a task for e.g. carbohydrate

esterases (Ces), and the breakdown of glycosidic linkages is performed by glycoside

hydrolases (GHs) and polysaccharide lyases (PLs). Genes encoding glycoside

24

Page 33: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

hydrolases and glycosyltransferases have been classified in sequence- and structure-

based families (Henrissat and Davies, 1997; Coutinho et al., 2003). All of the

glycoside hydrolases and glycosyltransferases families, as well as families of

polysaccharide lyases, carbohydrate esterases and carbohydrate-binding modules

(CBMs), can be found at the continuously updated Carbohydrate-Active enZyme web

server (http:/afmb.cnrs-mrs.fr/CAZY/). In March 2004, there were entries for 93

families of glycoside hydrolases and transglycosidases, 70 families of

glycosyltransferases, 14 families of polysaccharide lyases, 14 families of carbohydrate

esterases and 34 families of carbohydrate-binding domains. While the mechanism and

3-D structure are known for many glycoside hydrolases, characterization of the

glycosyltransferases has been much slower (Davies and Henrissat, 2002). The power

of the sequence-based family classification is that once the mechanism is established

for a member of a family, this mechanism is apparently shared by all other members

of the family (Henrissat et al., 2001). For instance, an unknown glycosyltransferase

can always be annotated as a putative inverting (or retaining) glycosyltransferase from

family GTxx. An alternative enzyme classification is provided by the International

Union of Biochemistry and Molecular Biology (IUBMB), which assigns an enzyme

commission (EC) number to an enzyme based on donor, acceptor and product

specificity. However, issuing an EC number is problematic when dealing with

enzymes that act on several distinct substrates, and furthermore, the EC numbers do

not consider the structural and mechanistic features of the enzymes (Coutinho et al.,

2003). From the analysis of completed genomes, it is clear that plants (Arabidopsis)

feature far more glycoside hydrolases and glycosyltransferases in relation to total

number of putative genes than sequenced genomes from any other organism

(Henrissat et al., 2001). One explanation for this is that a vast number of

carbohydrate-active enzymes is required for the biosynthesis and modification of the

complex polysaccharides found in the plant cell wall (Coutinho et al., 2003).

1.5.5.2 Cellulose biosynthesis

Cellulose is the most abundant polysaccharide in nature with approximately 180

billion tons produced each year (Delmer, 1999). Cellulose is composed of (1→4)-β-

D-glucan chains, which are synthesized in plants by large plasma membrane-bound

protein complexes. These enzyme complexes are organized in hexagonal structures

25

Page 34: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

called rosettes. As recently as 1995 not a single gene encoding a cellulose synthase

had been cloned or identified. Since the first CesA gene was identified in the bacteria

Acetobacter xylinum and Agrobacterium tumefaciens (Saxena et al., 1990; Matthysse

et al., 1995), which produce extracellular ribbons of cellulose, many probable CesAs

have been identified in plants. The most extensive studies have so far been conducted

in Arabidopsis and cotton. For instance, at least 10 different CesA genes that encode

catalytic subunits of the cellulose synthase complex have been found in Arabidopsis,

and have been shown by mutant analyses to play distinct roles in the cellulose

biosynthesis (Doblin et al., 2002). Like the bacterial CesAs, the plant CesAs have

eight predicted transmembrane domains, two at the N-terminus and six at the C-

terminus. In addition, plant CesAs contain two unique domains, a “plant conserved

region” (CR-P) and a “hypervariable region” (HVR) (Pear et al., 1996; Delmer,

1999). Ancillary proteins or enzymes are likely required for the extrusion of cellulosic

chains and the assembly of microfibrils. Korrigan, a membrane-bound plant

endoglucanase, is the only other protein that has been strongly linked to the process of

synthesizing cellulose. Other proteins including sucrose synthase, cytoskeleton

components, Rac13, redox proteins and a lipid transfer protein have also been

implicated in cellulose biosynthesis. Enzymes encoded by CesA genes belong to the

large glycosyltransferase family GT2 (Henrissat et al., 2001). Expression analyses

have suggested that some CesAs have a role in secondary cell wall synthesis, while

others are important for the primary wall synthesis (Pear et al., 1996; Gardiner et al.,

2003; Tanaka et al., 2003; Taylor et al., 2003; Burton et al., 2004; Liang and Joshi,

2004).

1.5.5.3 Hemicellulose biosynthesis

Hemicellulose is the second most abundant polysaccharide in nature. Whereas

cellulose polymerization occurs at the plasma membrane, hemicellulose biosynthesis

takes place in the Golgi apparatus, followed by deposition in the cell wall by

exocytosis. While the chemical structure of cellulose is the same for all plants,

hemicelluloses are a heterogeneous group of carbohydrate polymers, which vary

between different cell walls and plant taxa. The major hemicellulose of hardwoods

(angiosperms) is O-acetyl-(4-O-methylglucurono)-xylan, but a smaller portion of

glucomannan is also present. The dominating softwood (gymnosperm) hemicellulose

is an O-acetyl-galactomannan accompanied with a minor amount of arabino-(4-O-

26

Page 35: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

methylglucurono)-xylan. Xyloglucan is the major primary wall hemicellulose of Type

1 cell walls found in most dicots, conifers and in selected monocots except grasses

(Brett and Waldron, 1996). The hemicellulose biosynthetic enzymes identified so far

are mostly glycosyltransferases involved in the biosynthesis of the primary cell wall

hemicellulose xyloglucan (Perrin et al., 1999; Faik et al., 2000; Faik et al., 2002;

Madson et al., 2003). However, the first identified and characterized enzyme involved

in cell wall biosynthesis was an alpha-galactosyltransferase from fenugreek

(Trigonella foenum-graecum), which was involved in the biosynthesis of the storage

carbohydrate galactomannan (Edwards et al., 1999). All the above mentioned

glycosyltransferases are responsible for decorating the hemicellulose backbones with

side chains. A glycosyltransferase involved in synthesizing the backbone of

galactomannan in guar (Cyamopsis tetragonoloba) was only recently identified.

Interestingly, this β-(1→4)-mannan synthase is related to CesA genes (Dhugga et al,

2004).

1.5.5.4 Pectin biosynthesis

Pectic substances constitute a highly complex family of polysaccharides that are rich

in D-galacturonic acid (GalA). Since pectins are found mainly in the plant primary

cell walls and middle lamella, they are a minor component in wood. Still pectins play

an important role during wood formation. Three classes of pectins can be

distinguished based on two different backbone configurations: homogalacturonan

(HG), rhamnogalacturonan I (RGI) and rhamnogalacturonan II (RGII). HG and RGII

have backbones of α(1→4)-linked galacturonic acid, while RGI contains a backbone

of alternating rhamnose and galacturonic acid. In a recent model of the dicot primary

wall, HG and other pectins are portrayed as side-chains of RGI (Vincken et al., 2003).

It has been estimated that more than 50 glycosyltransferases are required for the

biosynthesis of the pectic polysaccharides (Ridley et al., 2001). However, only three

glycosyltransferases involved in pectin biosynthesis have been identified so far: a

family GT8 glycosyltransferase from Arabidopsis was linked to HG synthesis

(Bouton et al., 2002), a mutation in another family GT8 glycosyltransferase affected

RGI (Lao et al., 2003), and a family GT47 from Nicotiana plumbaginifolia was

reported to be involved in constructing RGII (Iwai et al., 2002). Two genes involved

in the nucleotide sugar interconversion pathway were also shown to be involved in

27

Page 36: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

pectin biosynthesis. A GDP-Man-4,6-dehydratase (MUR1) involved in RGII (Bonin

et al., 1997), and recently a NDP-rhamnose synthase was shown to influence RG1

synthesis (Usadel et al., 2004).

1.5.6 Other cell wall components

1.5.6.1 Lignin

Lignin is the third major component of wood and second only to cellulose as the most

abundant organic compound on earth. Lignin is a highly branched phenolic polymer

consisting of three different phenylpropane monomers: p-courmaryl alcohol, coniferyl

alcohol, and sinapyl alcohol. The relative amount of each monomer differs depending

on whether the lignin is from gymnosperms or angiosperms. Lignin is the last

secondary cell wall polymer synthesized, and it is believed to act as ‘glue’ between

the cell wall carbohydrates. Every lignin deposition event is preceded by carbohydrate

deposition (i.e. when S1 formation is initiated, lignification occurs in the middle

lamella and primary wall, after the S2 layer is completed lignin is deposited in the

secondary cell wall, while the bulk of lignin is deposited after the carbohydrates have

been laid down in the S3 layer). Lignin deposition is influenced by the chemical

nature of the cell wall carbohydrates and the orientation of the cellulose microfibrils

(Boerjan et al., 2003). Although lignin biosynthetic pathways have been re-evaluated

recently, most of the enzymes known to participate in monolignol biosynthesis have

been cloned and the overall process is relatively well understood (Grima-Pettenati and

Goffner, 1999; Dixon et al., 2001; Humphreys and Chapple, 2002; Baucher et al.,

2003; Boerjan et al., 2003)

1.5.6.2 Cell wall proteins

Although the cell wall is mainly composed of polysaccharide polymers, structural

proteins are also important components of the cell wall. Four major classes of

structural proteins have been recognized: hydroxy-proline-rich glycoproteins

(HRGPs), the proline-rich proteins (PRPs), glycine-rich proteins (GRPs) and

arabinogalactan proteins (AGPs). Most of the cell wall proteins are glycosylated. The

AGPs are so heavily glycosylated, more than 95% might be carbohydrates, that

28

Page 37: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

proteoglycans is a more suitable name. Other proteins found in the cell wall are wall-

associated kinases, lecitins, expansins and cell wall modifying enzymes.

1.5.6.3 Extractives

These compounds are named extractives because they can be removed from wood by

extraction with solvent. Wood contains 0,4 to 8,3% extractives by dry weight

depending on the species. The extractives can generally be divided into three types,

i.e., terpenes, resins and phenols. The presence of extractives affects the color of

wood. The wood resins, lipophilic compounds such as free fatty acids, resin acids,

waxes, fatty alcohols and sterols, form colloidal pitch during wood pulping. Pitch

deposition results in low quality pulp and cause problems in paper machine efficiency

(Gutierrez et al., 2001).

29

Page 38: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

2 OBJECTIVES OF THE PRESENT INVESTIGATION

The present thesis has employed a functional genomics approach to identify enzymes

involved in wood and fiber formation (i.e. the secondary cell wall formation) in

hybrid aspen. The specific aims were:

• To set up a microarray-based transcript profiling system for poplar

• To use transcript profiling as a tool to screen for genes upregulated during

secondary cell wall formation

• To use bioinformatic tools to identify genes encoding carbohydrate-active

enzymes with special emphasis on those upregulated during secondary cell wall

formation

• To obtain full length cDNA sequences for a selected number of carbohydrate-

active enzymes and genes with unknown function

30

Page 39: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

3 RESULTS AND DISCUSSION

3.1 Transcript profiling of woody tissues

3.1.1 Development of a target amplification method (I)

Although the large-scale EST-sequencing initiative of Populus, founded by the

Swedish Center for Tree Functional Genomics (SCTFG), is an excellent resource for

gene discovery, it lacks the high resolution needed to analyze specific cell types.

Microarray analysis of wood formation could provide information about stage specific

gene expression, but microarrays are generally restricted by the amount of available

RNA. In order to perform microarray-based gene expression analysis on sampled thin

30 µm wide sections of the wood forming zone, a target amplifying method was

developed (paper I). PCR is a powerful method that allows amplification of mRNA

populations, but the amplification introduces a bias for shorter fragments due to the

more efficient amplification of these transcripts compared to longer ones. To

overcome this bias, the transcriptome was randomly fragmented by sonication prior to

the PCR step. Subsequently, 3’ cDNA tags for each transcript were isolated and

amplified, followed by labeling using asymmetric PCR (Fig. 6). A small microarray

containing 192 hybrid aspen clones printed in triplicate, as well as a microarray

containing 2995 hybrid aspen clones printed in duplicate, were hybridized with targets

isolated from xylem and phloem to assess the accuracy of the 3’ cDNA tag

amplification method. By comparing microarray data resulting from the amplification

protocol and 0.1 µg of total RNA with microarray data resulting from standard target

production techniques and 1 µg mRNA, it was estimated that a 2-fold difference in

expression could be distinguished with 99% confidence using the new method. To

further demonstrate the usefulness of the amplification technique, the transcript

profiles of two phloem sections representing different stages of development were

compared using the small array containing 192 clones. Sample A consisted of

differentiating phloem directly adjacent to the cambial cells, while sample B consisted

of further differentiated and more mature phloem cells (e.g. cells with secondary cell

walls). This experiment identified a few cell wall related genes. For instance, an

expansin gene was upregulated in sample A, consistent with the rapid expansion of

these cells. A gene upregulated in sample A that was annotated as similar to the

31

Page 40: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

Arabidopsis F1N21.7 protein has been reannotated as a glycosyltransferase from

family GT13. Among the upregulated genes found in sample B, two lignin related

genes were identified: a phenylalanine ammonia-lyase (PAL) gene, and a blue copper

protein gene. At present, only two additional comprehensive and large-scale studies of

phloem gene expression have been published in plants (Oh et al., 2003; Vilaine et al.,

2003).

Immobilization onto streptavidin coated beads Repair (T4 DNA polymerase)

Release of 3' tags by restriction

Blunt end----------TTTTTTTTTTT----------AAAAAAAAAAA 5'

3'NotI

Ligate to adaptors

-------TTTTTTTTTTT-------AAAAAAAAAAA

NotI BamHI

PCR amplification (20-25 cycles)and labelling by assymetric PCR (10 cycles)

mRNA isolation from 0.1 - 1 µg of total RNA

nucleus AAAAA(A)n

AAAAA(A)n

1st and 2nd strand cDNA synthesis

AAAAA(A)n

AAAAA(A)nAAAAA(A)n

AAAAAAAAAAAAA(A)nTTTTTTTTTTTTT-NotI-biotin-5’

Fragmentation by sonication

Hybridize to microarray

……………………………….……………………………….……………………………….……………………………….……………………………….……………………………….……………………………….……………………………….……………………………….……………………………….……………………………….……………………………….

AAAAA(A)n

AAAAA(A)nAAAAA(A)n

Figure 6. A schematic overview of the 3’ target amplification method.

32

Page 41: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

3.1.2 Transcript profiling during xylogenesis (II) (III) (V)

3.1.2.1 A transcriptional roadmap to wood formation (II)

A large number of genes potentially involved in wood formation was identified in the

first poplar EST-library (Sterky et al., 1998). To gain insight into how expression of

these genes is regulated during development and differentiation of wood, a large scale

gene expression analysis tool, such as that provided by cDNA microarray technology

was needed. Therefore, the EST sequences originating from the cambial region and

described in Sterky et al (1998) were clustered to create a unigene set of 2995 genes,

and this collection was PCR-amplified and spotted in duplicate onto microarrays.

Development of secondary xylem is a highly organized process and recognizable

boundaries exist between different developmental stages. The ordered nature of

xylogenesis was exploited when obtaining tissue sections by tangential cryo

sectioning (Uggla et al., 1996) that represented (Ph) phloem, (A) cambium, (B) early

expansion, (C) late expansion, (D) secondary call wall formation and (E) late cell

maturation (e.g. programmed cell death). Target preparation from these samples

integrated the non-biased RNA amplification method described in paper I.

An experiment spanning a wide developmental gradient (i.e. from meristematic cells

to cells undergoing programmed cell death) using an array consisting of relatively few

genes requires careful experimental design. Normalization and choice of reference

were two critical factors that warranted considerable thought. Since the majority of

the genes were expected to be differentially expressed across the developmental

gradient, commonly used normalization methods could not be applied. Therefore,

instead of using all of the genes on the array, spiked human controls were used for

normalization. Because there is no obvious reference tissue section in developing

xylem, a mixture of equal amounts of the individual samples (A+B+C+D+E) was

used as a common reference. This ensures that genes expressed in only one tissue will

be represented in the reference, thereby decreasing the risk of loosing such genes in

the quality filter process. However, the transcript profiles of genes that are highly

expressed in one sample will be truncated because that sample comprises 1/5 of the

reference.

33

Page 42: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

The results showed that a large proportion of the examined genes were under strict

developmental stage-specific transcriptional regulation. More than 1200 genes of the

almost 3000 represented on the microarray showed 4-fold differential expression

between the different stages (max ratio/min ratio > 4) and over 800 showed a greater

than 8-fold alteration. A cluster analysis of the latter group revealed gene classes with

developmentally regulated transcript profiles (paper II: Fig. 1 C and D, I-X). For

instance, genes induced during secondary cell wall formation were discriminated,

indicating a role for these genes in this particular process. A significant number of the

genes represented on the microarray showed no or weak similarity to any deposited

sequences in the public databases. Another interesting group contained genes showing

high similarity to Arabidopsis proteins with unknown function or annotated as

hypothetical proteins. More than 200 of the latter group were differentially expressed

during xylem development. Transcript profiles of 72 hybrid aspen genes that were not

assigned function through similarity searches are shown in Fig. 7.

-4

-3

-2

-1

0

1

2

3

4

5

Ph A B C D E

Log2

rat

io

Figure 7. 72 D-upregulated genes with unknown function.

For numerous genes, the transcript profiles obtained across the developing xylem

matched the theoretical patterns predicted by the known function of these genes.

Characteristic cellular processes and corresponding genes predicted for each

34

Page 43: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

developmental stage are summarized in Fig. 8. A high proportion of differentially

expressed genes were either cell wall biosynthesis related or possibly involved in

cytoskeleton formation or trafficking. Cellulose is the major component of wood and

in accordance with this fact, two out of the three CesA genes represented on the array

were upregulated during secondary cell wall formation (D). As expected most of the

genes encoding lignin biosynthetic enzymes were upregulated in the later stages of

xylogenesis (D and E). Cortical microtubules (MTs) have been suggested to facilitate

the alignment of cellulose microfibrils as they are deposited in the cell wall (Baskin,

2001). Notably, there were 14 different tubulin genes present in the poplar unigene set

and ten were strongly upregulated during late expansion (C) and secondary wall

biosynthesis (D).

Figure 8. A schematic representation of the different tissue sections.

3.1.2.2 A transcriptional roadmap to wood formation revisited (V)

For this experiment a larger Populus microarray was used consisting of 13490 genes

spotted in duplicate. This unigene set was based on 36354 ESTs originating from

seven different tissue specific cDNA libraries: cambial region, active cambium,

dormant cambium, tension wood, mature leaf, senescent leaf, and flower bud

35

Page 44: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

(Andersson et al, 2004). In addition, a number of control elements were spotted

including human and archaeal DNA fragments as well the commercial Lucidea

Scorecard kit (Amersham Biosciences, Uppsala, Sweden). Two genes, which were

represented in all of the seven different cDNA libraries, were also spotted in 2 x 36

copies to help evaluate the performance of the microarrays. The target samples were

the same as in paper II, except that the phloem section was excluded. The

amplification, labeling and hybridization steps were performed as previously

described. The data was normalized using intensity dependent locally weighted

scatterplot smoothing (lowess). While the earlier experiment described in paper II

studied the complete developmental gradient from A-E, this study focused only on

genes related to secondary cell wall formation, i.e. those that were upregulated in zone

D. This experiment identified 1140 secondary cell wall related genes, whereas

approximately 300 secondary cell wall related genes were identified in paper II. The

proportion of secondary cell wall related genes is similar between the two

experiments: 8.5% and 10%, respectively. However, more than 50% of the secondary

cell wall related carbohydrate-active enzymes were only discovered using the smaller

array, showing that the cambial region cDNA clones spotted on those arrays were

enriched with gene fragments encoding these types of enzymes. Out of the 50 most

upregulated genes in zone D that were identified using the smaller array, 90% were

also found upregulated in zone D using the 13490-element microarray. Explanations

for the differences observed using the different microarrays include: genes not passing

the quality filter, genes having a ratio close to the cut-off value and difference in

normalization between the two experiments.

3.1.2.3 Transcript profiling of transgenic trees with increased secondary growth (III)

Numerous studies have stressed the importance of the plant hormone auxin for wood

formation. However, gibberellins (GAs) are also important regulators of secondary

growth in the tree trunk (Ridoutt et al., 1996; Eriksson et al., 2000). The “wood”

microarray described in paper II was used to investigate how GA 20-oxidase

overexpressing in a transgenic hybrid aspen (Eriksson et al., 2000) affected gene

expression in the developing xylem. The phenotype of transgenic lines overexpressing

GA-oxidase is distinguished by great increases in secondary growth, i.e. the number

of xylem fibers and fiber length is increased compared to wild-type (wt) plants. The

36

Page 45: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

transgenic plants contain higher levels of both gibberellins and auxin, which suggest

that the increased secondary growth might be the outcome of the combined effect of

elevated auxin and GA levels.

A large proportion of the genes upregulated in GA-oxidase overexpressing plants

encoded proteins involved in cell wall formation and cell wall expansion. Several

pectin lyases from family PL1, which might have a role in remodeling the primary

wall during wall extension, were induced. A family PL1 gene from Salix gilgiana is

expressed in elongating and differentiating tissues (Futamura et al., 2002). However,

the pectin methyl esterases discussed in paper III, were wrongly annotated. They are

now annotated as multi-copper oxidases, which are distantly related to ascorbate

oxidases and laccases. The Arabidopsis ortholog SKU5, which corresponds to

AI164970 and AI163151, is most strongly expressed in expanding tissues (Sedbrook

et al., 2002). One pectin methyl esterase from family CE8 was downregulated in the

transgenic trees. Other cell wall related genes that were induced in the GA 20-oxidase

overexpressing trees included an endoglucanase (cellulase) from family GH9, two

glycoside hydrolases from family GH17, a putative xylosidase from family GH3 and a

few expansins. The upregulated endoglucanase was named PttCel9B and is soluble in

contrast to other plant endoglucanases, including ones found in poplar, that are

membrane-bound (Rudsander et al., 2003). Orthologs of PttCel9B that have been

identified so far act on cellulose as well as on xyloglucan (Ohmiya et al., 2000;

Woolley et al., 2001). The GH17 family of glycosidases contain enzymes able to

degrade β(1→3)- or β(1→3)-(1→4)-glucans. One of the GH17 genes contains an

additional module of unknown function called X8 (Henrissat and Davies, 2000). The

putative beta-xylosidase is most likely involved in pectin remodeling. Expansins have

been extensively studied and are cell wall proteins that promote cell wall expansion

by a largely unknown cell wall loosening mechanism. Interestingly, rice expansins are

induced by gibberellin (Lee and Kende, 2001). Cellulose biosynthesis is important for

cell expansion of the primary cell wall, thus induced expression of sucrose synthase

(family GT4) is consistent with a role in channeling sucrose into UDP-glucose, a

precursor of cellulose.

37

Page 46: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

A number of flavonoid and lignin biosynthesis related genes were also upregulated in

the transgenic plants, including a O-methyltransferase, a putative sinapyl

alcoholdehydrogenase (SAD), a dirigent protein, and a isoflavone reductase. The

obtained full length sequence of SAD was similar to the Populus tremula PtSAD gene,

which might be essential for the biosynthesis of syringyl (S) lignin in angiosperms (Li

et al., 2001). Two other S lignin biosynthesis genes were also upregulated, while the

expression of a G lignin specific cinnamylalcohol dehydrogenase (CAD) was

constant. Notably, a higher S/G lignin ratio was observed in the transgenic plants,

which is in accordance with our expression data. To identify the regions of the

developing xylem that are most strongly affected by GA overproduction, a

comparison was made with the tissue-specific expression data originating from paper

II. The majority of the genes induced in the GA 20-oxidase overexpressing trees had

highest expression in zones A-C, i.e. the sites of cell proliferation and cell expansion.

This finding fitted well with the known function of these genes in cell expansion.

3.1.3 Transcript profiling of cellulose synthases (IV)

Among the over 100000 ESTs found in the Populus EST database, 10 genes encoding

cellulose synthases (CesAs) were identified by manual bioinformatic analyses. Full

length cDNA sequences were obtained for five of the poplar CesA genes. A

comparison of the number of EST sequences that corresponded to each of the poplar

CesA genes in the different tissue-specific EST libraries, suggested that they were

differentially regulated. To learn more about the specific roles of the individual CesA

genes in hybrid aspen, transcript profiling experiments were performed for the five

full length genes using real time RT-PCR, and for all identified CesAs using cDNA

microarrays, comparing xylem with phloem and tension wood with opposite side

wood. The microarrays used in this study consisted of 13490 genes spotted in

duplicate, i.e. the microarray also used in paper V. From the microarray experiments it

was observed that the genes encoding four of the putative CesA isoenzymes,

PttCesA1, PttCesA3-1, PttCesA3-2, and PttCesA9, were clearly upregulated in xylem.

The expression patterns of these four genes are consistent with their phylogenetic

relationship to xylem specific Arabidopsis CesA isoenzymes (paper IV: Fig 2). The

expression of the remaining CesA genes were more or less constant. Real-time RT-

PCR confirmed the results of the microarray experiments, showing upregulation of

38

Page 47: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

PttCesA1, PttCesA3-1 and PttCesA3-2 in xylem, while PttCesA2 and PttCesA4 were

found to be more evenly expressed in the studied tissues.

3.2 Identification of carbohydrate active enzymes involved in

xylogenesis (V)

Despite the importance of xylem secondary cell wall processes in trees for producing

renewable fibers for the industry, the identification of enzymes involved in

synthesizing secondary walls is only in its infancy. A major goal of the present thesis

was to identify carbohydrate-active enzymes involved in the secondary cell wall

biosynthesis in xylem. Fragments of genes encoding cell wall related enzymes such as

cellulose synthases, cellulases, xyloglucan endotransglycosylases (XETs), and

xylanases were discovered in the cambial region EST library of hybrid aspen (Sterky

et al., 1998), and transcript profiling techniques were used to identify genes encoding

enzymes involved in secondary cell wall formation (paper II). By sequencing in total

19 different tissue specific cDNA libraries from Populus tremula, Populus tremula x

tremuloides and Populus trichocarpa , the EST collection has reached over 100000

sequences (Sterky et al., 2004). The ESTs have also been assembled into clusters and

contigs to form consensus sequences that will facilitate the annotation and thus also

enzyme discovery.

In order to get unique insight into which enzymes are important for constructing the

xylem secondary cell wall, we mined the microarray datasets from papers II, IV and V

to identify genes that are specifically upregulated during this process (zone D). In

addition, all Populus ESTs, contig sequences and full length sequences were screened

for carbohydrate-active enzymes, with the bioinformatic tools routinely used to assign

proteins to the Carbohydrate-active enzyme database (CAZy) (Henrissat and Davies,

2000). The advantage of these tools is that each gene is compared against a collection

of more than 45500 individual catalytic and ancillary associated modules separately,

so the family assignment is not disturbed by the frequent modularity of carbohydrate-

active enzymes (Fig 9). This way incorrect annotations commonly introduced by

BLAST-based automated annotation approaches are avoided.

39

Page 48: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

CBM22 CBM22 CBM22 GH10PttXyn10

GH17 X8 X8At2g39640

CBM18 GH19At3g12500

GT37 GT37UNKAt1g14070

Figure 9. Examples of modular carbohydrate-active enzymes from plants.

29 glycosyltransferases from 8 different families and 9 glycoside hydrolases from 8

different families were identified as secondary cell wall related (Table 4). Genes

encoding either polysaccharide lyases or carbohydrate esterases were not identified

among the secondary wall specific genes. Although putative functions can be assigned

to a few of the identified genes (e.g. cellulose synthases in family GT2, the β-

galactosidases in the family GH35), the specific function of the majority of the genes

remains unknown. However, possible roles include: remodeling the primary wall,

hemicellulose biosynthesis or protein glycosylation. The results of the bioinformatic

analysis can also be used to reannotate the datasets from previous microarray

experiments, thereby identify new candidate genes with interesting transcript profiles

that encode carbohydrate-active enzymes.

Table 4. Identified secondary cell wall related CAZy-families

Glycosyltransferases

GT1 GT2 GT8 GT14

GT31 GT43 GT47 GT61

Glycoside hydrolases

GH9 GH10 GH16 GH17

GH19 GH28 GH35 GH51

40

Page 49: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

Hemicelluloses such as xyloglucan (Pauly and Scheller, 2000) glucuronoxylan

(Timell, 1967) glucomannan (Teleman et al., 2003) and pectins (Pauly and Scheller,

2000) are often O-acetylated, but enzymes responsible for the O-acetylation of these

polysaccharides have not been identified. The biological role of the O-acetyl

substituents is not known, although they might minimize enzymatic polysaccharide

breakdown. It has been shown that Acetyl Co-A is the donor substrate for O-

acetylation of cell wall polysaccharides (Pauly and Scheller, 2000). Recent results

provide evidence that there is a positive correlation between xyloglucan O-acetylation

and fucosylation, suggesting that a suitable substrate for at least one Arabidopsis O-

acetyltransferase is fucosylated xyloglucan (Perrin et al., 2003). Cas1p is a membrane

protein required for the O-acetylation of glucuronoxylomannan in the fungal pathogen

Cryptococcus neoformans, and four orthologs were found in Arabidopsis (Perrin et

al., 2003). By performing a TBLASTN search with the Cas1p sequence against the

PopulusDB, a gene coding for a putative O-acetyltransferase was identified, which

previously was annotated as having unknown function. The transcript of the Populus

putative O-acetyltransferase is upregulated during secondary cell wall formation,

consistent with a role in O-acetylation of the secondary wall polysaccharides

glucomannan and glucuronoxylan. The functional characterization of this gene and its

products is required to verify or discard its hypothesized role in O-acetylation.

3.3 Unpublished data

3.3.1 Full length sequencing of genes with unknown function

Since the beginning of this thesis work, particular interest was directed towards genes

represented by ESTs that shared no similarity to any sequence in the public databases

“no-hits”, and genes that were similar to plant hypothetical proteins (i.e. these genes

have only been predicted by gene-finding algorithms) or plant genes with unknown

function. Together, these two groups of ESTs corresponded to more than 500 genes.

The hypothesis was that among the genes found in these two categories, new fiber-

active enzymes could be identified. Due to additional sequences deposited in public

and internal databases, increased lengths of poplar sequences provided by full length

sequencing and cluster/contig analyses and the availability of new advanced

41

Page 50: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

bioinformatic tools, initially annotated “no hits” and “unknown” poplar genes have

indeed been classified as carbohydrate active genes.

Table 5. Functional classification of unknown genes

Function or Functional Group Type No.

Auxin-Inducible 2 Germin-like 1

Arp2/3 complex 1

Protein-Protein Interaction LIM Domain 1 RING Zn Finger 2 Tpr-repeat 1 Coiled-coil 4

Glu/Asp binding domain 1 Protein-DNA Interaction Homeodomain 1

Signaling Rho p21 1

Possible Ser/Thr Kinase 1

Membrane Localized GPI-anchor 1 GPI-anchor and Fasciclin Domain 1 GPI-anchor, Cu-binding, phytocyanin-like 1 GPI-anchor, TM, extensin-like 1 TM and Signal Sequence 3 TM 8 Signal sequence 1

DUF231 3 (Domain with Unknown Function in At.)

No Domains or Function Found 20

TOTAL 55

72 genes that were upregulated during secondary cell wall formation (paper II) but

whose function remained unknown, were selected for full length sequencing (Fig 7).

This was motivated by the fact that increased sequence length and quality increases

the chance of finding meaningful sequence similarities, which improves annotation

accuracy. For instance, the closest Arabidopsis homolog in large gene families can be

more easily identified, which is necessary for generating Arabidopsis knockout lines

that can be used to infer roles of poplar enzymes. In addition, full length cDNA

sequences are required to clone and heterologously express enzymes for subsequent in

42

Page 51: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

vitro characterization. The approach followed was to first sequence the complete

insert of the clones from which the EST-sequences were derived. Almost 30 % of the

sequenced cDNA inserts were classified as full length genes by aligning the open

reading frames (ORFs) to best Arabidopsis BLASTX hits. With the additional DNA

sequence data, new carbohydrate-active enzymes were identified. In fact, of the 72

sequenced genes 14 glycosyltransferases from families GT2, GT8, GT14, GT34,

GT43 and GT47 were identified. Full length cDNA sequences of glycosyltransferases,

whose corresponding cDNA clones did not contain the full length gene, were obtained

by rapid amplification of cDNA ends (RACE) as described in paper V. When possible

the remaining novel genes were annotated by using the SMART (Simple Modular

Architecture Research) tool (Schultz et al., 1998; Letunic et al., 2004). A large

proportion of the genes contained either transmembrane regions or GPI-anchors,

showing the importance of membrane proteins in secondary cell wall biosynthesis

(Table 5). The full length sequences obtained in this project will also be used to

annotate the Populus genome.

3.3.2 PUNK1

A short summary of the work on the Populus gene PUNK1 (Poplar unknown 1) will

serve as an example of the difficulties of assigning a function to a gene that is not

significantly similar to other genes/proteins. Based on the relatively high number of

ESTs corresponding to this gene that were found in the first cambial region EST-

library (Sterky et al., 1998), this gene was selected for full length sequencing. The

longest open reading frame (ORF) corresponded to a protein with 176 amino acids.

Although these data were obtained before the Arabidopsis genome was completed, a

large number of Arabidopsis sequences had been deposited in the public databases.

Despite this, no match to PUNK1 was found in Arabidopsis until the genome was

released. The Arabidopsis proteins with sequence similarity to PUNK1 were all

annotated as hypothetical proteins and were much larger than PUNK1. However, data

from a northern blot confirmed that the full length cDNA sequence of PUNK1 had

been obtained. From the microarray experiment presented in paper II it was observed

that the PUNK1 transcript was highly upregulated in zone D (i.e. secondary cell wall

formation), which was confirmed in paper V. PUNK1 was cloned and the

recombinant protein was expressed in E. coli. Antibodies were raised against the

43

Page 52: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

recombinant PUNK1 protein and Western blots using the antibody as a probe showed

the existence of a protein of the predicted size in protein extractions from xylem and

phloem. Preliminary results of immunolocalization studies were consistent with data

from the Western blot analysis and indicated that PUNK1 is present in secondary cell

wall forming and radially expanding fibers and vessels, as well as in sieve tubes,

companion cells and fibers of the phloem. Thus, the PUNK1 transcript is highly

abundant in the secondary cell wall forming zone of xylem, and the protein is found in

both xylem and phloem. However, these data alone do not suggest a putative function

for PUNK1.

Table 6. Identified ESTs corresponding to the PUNK1 cDNA

Populus species/hybrid Number of ESTs Tissue

In PopulusDB Populus tremula x tremuloides 4 Developing xylem Populus tremula x tremuloides 1 Tension wood Populus balsamifera subsp. trichocarpa 1 Female catkins

Other poplar EST projects Populus balsamifera subsp. trichocarpa 1 Developing xylem Populus alba x Populus tremula 3 Tension wood Populus alba x Populus tremula 1 Opposite side xylem Populus balsamifera subsp. trichocarpa x Populus deltoides

1 Shoot tips + mature leaves + woody stems

Over the years, as additional sequences have been added to the databases, new hits to

other plant ESTs have been found. Xylem-specific ESTs corresponding to PUNK1

were found in other poplar species and other EST-projects (Table 6). A poplar cDNA

fragment corresponding to a PUNK1 ortholog was isolated by a differential display

experiment that compared poplars grown either with limited or luxurious amounts of

available nitrogen. However, in the published experiment this PUNK1 ortholog was

not discussed. Very recently (March 2004), a new Arabidopsis gene was deposited

that became the new best hit to PUNK1. This expressed protein only contains 96

amino acids. For a long time the only identifiable motif in the PUNK1 protein was a

coiled coil region. Recently (December 2003) PUNK1 was found to be similar to the

new domain pfam06886.1 TPX2, which was deposited in the conserved domain

database (Marchler-Bauer et al., 2003). This domain represents a conserved region of

approximately 60 amino acid residues within the eukaryotic targeting protein Xklp2

44

Page 53: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

(TPX2). TPX2 is a microtubule-associated protein that mediates the binding of the C-

terminal domain of Xklp2 to microtubules (Wittmann et al., 2000). 17 Arabidopsis

proteins, including previously identified proteins with similarity to PUNK1 contain

this domain. Several microtubule-related genes such as tubulins and kinesin-like

genes have transcript profiles that are very similar to PUNK1 (i.e. with a peak in zone

D). Therefore, it is plausible that PUNK1 also associates with microtubules, which is

a hypothesis that can now be tested to help decipher the true function of this xylem

related protein.

45

Page 54: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

4 Concluding remarks and future prospects

The objective of this thesis was to identify genes that encode fiber-active enzymes. A

large number of such genes were identified through gene expression analyses, full

length sequencing and bioinformatics, but the activity and biological function of the

majority of the corresponding enzymes remains to be investigated. The first steps

towards the characterization of these enzymes have since been taken. Attempts are

underway to express a number of the enzyme targets in Pichia pastoris for subsequent

enzymatic characterizations, and protein fragments have been expressed in E. coli for

antibody production and subsequent immunolocalization. Phenotypic studies in planta

are being performed in Populus using RNAi, and gene knockouts of the poplar

homologs in Arabidopsis are being carried out. Fourier transform infrared (FTIR)

microspectroscopy will be used to track changes in cell wall composition or

architecture. Furthermore, the sequenced Populus genome will provide additional

information about the already identified carbohydrate-active enzymes, as well as

information of the total arsenal of carbohydrate-active enzymes found in poplar.

The effort of characterizing these enzymes will eventually lead to an improved

understanding of the biochemistry of cell wall biosynthesis, which will provide the

basis for engineering fiber chemistry and structure in the future. The use of the poplar

XET to develop a novel chemo-enzymatic approach for cellulosic fiber modification

shows the potential of using plant enzymes in forest product biotechnology by a

biomimic approach (Brumer et al., 2004). Most of the enzymes in the present study,

however, are involved in the biosynthesis of the fiber carbohydrates, suggesting that

these should be used to engineer the fiber in planta. The present dataset of

carbohydrate-active enzymes constitutes an excellent source of new enzymes for

future application in forest industry processes.

46

Page 55: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

5 ABBREVIATIONS

AGP arabinogalactan protein

BAC bacterial artificial chromosome

BLAST basic local alignment tool

BSA bovine serum albumin

bp base pair

CAD cinnamylalcohol dehydrogenase

CAZyme carbohydrate-active enzyme

CBM carbohydrate-binding module

cDNA complementary DNA

CE carbohydrate esterase

CesA cellulose synthase catalytic subunit

CR-P plant conserved region

Cy3/Cy5 cyanine 5/ cyanine 3

DNA deoxyribonucleic acid

EC enzyme commission

EST expressed sequence tag

FTIR Fourier transform infrared microspectroscopy

GA gibberellin

GalA galacturonic acid

GH glycoside hydrolase

GM genetically modified

GMO genetically modified organism

GRP glycine-rich protein

GT glycosyltransferase

HG homogalacturonan

HRGP hydroxy-proline-rich glycoprotein

HVR hypervariable region

lowess locally weighted scatterplot smoothing

Mbp mega base pairs

mRNA messenger RNA

MT microtubule

ORF open reading frame

47

Page 56: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

PAL phenylalanine ammonia-lyase

PCR polymerase chain reaction

PL polysaccharide lyase

PRP proline-rich proteins

PUNK1 poplar unknown 1

RACE rapid amplification of cDNA ends

RGI rhamnogalacturonan I

RGII rhamnogalacturonan II

RNA ribonucleic acid

RNAi ribonucleic acid interference

RT-PCR reverse transcriptase polymerase chain reaction

S syringyl

SAD sinapyl alcoholdehydrogenase

SAGE serial analysis of gene expression

SNP single nucleotide polymorphism

SSR simple sequence repeat

SVD singular value decomposition

T-DNA transfer DNA

UTR untranslated region

wt wild-type

XET xyloglucan endotransglycosylase

48

Page 57: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

6 ACKNOWLEDGEMENTS A-wop-bop-a-loo-lop a-lop bam boo, as Little Richard would have said, I made it. I would not have accomplished this without the help and contribution from a number of people. Here we go, I would especially like to thank: Professor Tuula Teeri, my supervisor who accepted me as a PhD student when the wood biotech team was still just a tiny group. I am very grateful for your support and encouragement. You have outstanding energy and it has been great to be a part of your successful and expanding group. Kristina Blomqvist, who supervised me in the lab and who immediately and enthusiastically put me to work. I have appreciated that you always had the time to answer both scientific and other questions, when I came running. I have also enjoyed your positive attitude. We also shared the hassle of commuting with SJ, and I had enough first. Professor Joakim Lundeberg and Peter Nilsson for welcoming me over to the other house/floor and letting me work with the microarrays. Although it was pretty cold to walk back and forth between the different houses during winter, I have really enjoyed working with you Peter. Thanks also for your critical comments on this thesis and all my papers. Emma Master, for efficient supervision and for critical reading of this thesis. All the collaborators and co-authors up north in Umeå: Professor Göran Sandberg, Professor Björn Sundberg, Professor Thomas Moritz, Ewa Mellerowicz, Magnus Hertzberg, Rupali Bhalerao and Jarmo Schrader. All the European Union EDEN collaborators and especially Professor Bernard Henrissat and Professor Malcolm Bennett and their groups. All the other microarrayers and collaborators at floor 3: ärke-Anders Andersson, Valtteri Wirta, Tove Andersson, Maria Sievertzon, Cecilia Laurell, Anneli Johansson + others. The guys involved in the poplar EST-sequence project: Fredrik Sterky, Rikard Erlandsson, Per Unneberg and Bahram and his sequencing crew for all the help over the years. My room-mates and colleagues, the wonderful ladies, Soraya Djerbi, Åsa Kallas and Ulla Rudsander. I have had a great time in BH, with gossip, candy, baby chat, fika, jokes and fruitful research discussions. Too bad my dust collecting piles of paper soon will be gone. The next-generation of GT-explorers antibody-Alex and Anders W for collaborations in the lab. Janne L, for taking care of me as a newcomer in Tuulas group but also for several hard-working hours out on the tennis court. Martin, Harry and Fredrik V for help with computers. Martin also for your time as a BH stand in, Harry also for the obvious – punk, and Fredrik V also for bird discussions. Anna and Totte for collaboration, wild

49

Page 58: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

scientific discussions, but maybe even more for the chats about rockabilly and old bluesmen. Hongbin for the trying to bind those PUNK1 antibodies. Nader for your good mood and the week as protein purification teachers. Alphonsa and Lotta, the dominating administrators. Kathleen and Wald for bringing the Belgian goodies. All the lunchtåg-participants through the years (Joke, Stuart, Per B, Didier, Magnus, Park, Jenny........) for sharing the mostly downs and occasional ups at midday. Henrik W, Olof (also for our week as teachers), Anders M and all the other innebandy players that kept the old man busy with the stick. All the past and present people in the Wood Biotechnology lab as well as the Department of Biotechnology for a memorable and enjoyable time. Fantastic friends outside the lab, that I hope to see more now when this is over: Jeppa, Nordin + JW:s vänner, Angetun, Mr Mart, Johan S, Speedway-Fredrik, Ahlin mfl. My first idols, my brothers Micke and Håkan. My parents in law Marta and Curt for all the help and babysitting (especially the last busy weeks). Finally, I would like to thank my parents Bengt and Ella for believing in me and all their support through the years. I wrote in first grade that I wanted to become a scientist. I would not have come this far without you. Of course I also have to thank my own little family: Anna-Karin for all your love, your patience with this work and for making our home such a wonderful place, my son Arvid, who always gives me a smile when I get home, and who keeps me busy with cars, sharks and Byggare Bob, and the little newcomer of the family that will arrive soon.

50

Page 59: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

7 REFERENCES Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu

A, Olde B, Moreno RF, Kerlavage AR, Mccombie WR, Venter JC (1991) Complementary-DNA Sequencing - Expressed Sequence Tags and Human Genome Project. Science 252: 1651-1656

Allona I, Quinn M, Shoop E, Swope K, St Cyr S, Carlis J, Riedl J, Retzel E, Campbell MM, Sederoff R, Whetten RW (1998) Analysis of xylem formation in pine by cDNA sequencing. Proceedings of the National Academy of Sciences of the United States of America 95: 9693-9698

Almgren G, Skogsstyrelsen (1990) Lövskog : björk, asp och al i skogsbruk och naturvård. Skogsstyr., Jönköping

Baskin TI (2001) On the alignment of cellulose microfibrils by cortical microtubules: a review and a model. Protoplasma 215: 150-171

Baucher M, Halpin C, Petit-Conil M, Boerjan W (2003) Lignin: Genetic engineering and impact on pulping. Critical Reviews in Biochemistry and Molecular Biology 38: 305-350

Bhalerao R, Nilsson O, Sandberg G (2003) Out of the woods: forest biotechnology enters the genomic era. Current Opinion in Biotechnology 14: 206-213

Boerjan W, Ralph J, Baucher M (2003) Lignin biosynthesis. Annual Review of Plant Biology 54: 519-546

Boguski MS, Lowe TMJ, Tolstoshev CM (1993) Dbest - Database for Expressed Sequence Tags. Nature Genetics 4: 332-333

Bonin CP, Potter I, Vanzin GF, Reiter WD (1997) The MUR1 gene of Arabidopsis thaliana encodes an isoform of GDP-D-mannose-4,6-dehydratase, catalyzing the first step in the de novo synthesis of GDP-L-fucose. Proceedings of the National Academy of Sciences of the United States of America 94: 2085-2090

Bouchez D, Hofte H (1998) Functional genomics in plants. Plant Physiology 118: 725-732 Bouton S, Leboeuf E, Mouille G, Leydecker MT, Talbotec J, Granier F, Lahaye M, Hofte H,

Truong HN (2002) Quasimodo1 encodes a putative membrane-bound glycosyltransferase required for normal pectin synthesis and cell adhesion in Arabidopsis. Plant Cell 14: 2577-2590

Bradshaw HD, Ceulemans R, Davis J, Stettler R (2000) Emerging model systems in plant biology: Poplar (Populus) as a model forest tree. Journal of Plant Growth Regulation 19: 306-313

Brett C, Waldron K (1996) Physiology and biochemistry of plant cell walls. Chapman and Hall, London, UK

Brumer H, Zhou Q, Bauman M, Carlson K, Teeri TT (2004) Activation of crystalline cellulose surfaces through the chemo-enzymatic modification of xyloglucan. J. Am. Chem. Soc.: in press

Brunner AM, Busov VB, Strauss SH (2004) Poplar genome sequence: functional genomics in an ecologically dominant plant species. Trends in Plant Science 9: 49-56

Burton RA, Shirley NJ, King BJ, Harvey AJ, Fincher GB (2004) The CesA gene family of barley. Quantitative analysis of transcripts reveals two groups of co-expressed genes. Plant Physiology 134: 224-236

Campbell MM, Brunner AM, Jones HM, Strauss SH (2003) Forestry's fertile crescent: the application of biotechnology to forest trees. Plant Biotechnology Journal 1: 141-154

Chaffey N, Cholewa E, Regan S, Sundberg B (2002) Secondary xylem development in Arabidopsis: a model for wood formation. Physiologia Plantarum 114: 594-600

Christersson L (1996) Future research on hybrid aspen and hybrid poplar cultivation in Sweden. Biomass & Bioenergy 11: 109-113

Chuaqui RF, Bonner RF, Best CJM, Gillespie JW, Flaig MJ, Hewitt SM, Phillips JL, Krizman DB, Tangrea MA, Ahram M, Linehan WM, Knezevic V, Emmert-Buck MR (2002) Post-analysis follow-up and validation of microarray experiments. Nature Genetics 32: 509-514

Coutinho PM, Deleury E, Davies GJ, Henrissat B (2003) An evolving hierarchical family classification for glycosyltransferases. Journal of Molecular Biology 328: 307-317

Coutinho PM, Starn M, Blanc E, Henrissat B (2003) Why are there so many carbohydrate-active enzyme-related genes in plants? Trends in Plant Science 8: 563-565

Davies GJ, Henrissat B (2002) Structural enzymology of carbohydrate-active enzymes: implications for the post-genomic era. Biochemical Society Transactions 30: 291-297

51

Page 60: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

Dejardin A, Leple JC, Lesage-Descauses MC, Costa G, Pilate G (2004) Expressed sequence tags from poplar wood tissues - A comparative analysis from multiple libraries. Plant Biology 6: 55-64

Delmer DP (1999) Cellulose biosynthesis: Exciting times for a difficult field of study. Annual Review of Plant Physiology and Plant Molecular Biology 50: 245-276

Demura T, Tashiro G, Horiguchi G, Kishimoto N, Kubo M, Matsuoka N, Minami A, Nagata-Hiwatashi M, Nakamura K, Okamura Y, Sassa N, Suzuki S, Yazaki J, Kikuchi S, Fukuda H (2002) Visualization by comprehensive microarray analysis of gene expression programs during transdifferentiation of mesophyll cells into xylem cells. Proceedings of the National Academy of Sciences of the United States of America 99: 15794-15799

Dixon RA, Chen F, Guo DJ, Parvathi K (2001) The biosynthesis of monolignols: a "metabolic grid", or independent pathways to guaiacyl and syringyl units? Phytochemistry 57: 1069-1084

Doblin MS, Kurek I, Jacob-Wilk D, Delmer DP (2002) Cellulose biosynthesis in plants: from genes to rosettes. Plant and Cell Physiology 43: 1407-1420

Dudoit S, Yang YH, Callow MJ, Speed TP (2002) Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica 12: 111-139

Eckenwalder JE (1996) Systematics and evolution of Populus. In RF Stettler, HD Bradshaw, PE Heilman, TM Hinckley, eds, Biology of Populus and its implications for management and conservation. NRC rearch press, Ottawa, Canada, pp 7-32

Edwards ME, Dickson CA, Chengappa S, Sidebottom C, Gidley MJ, Reid JSG (1999) Molecular characterisation of a membrane-bound galactosyltransferase of plant cell wall matrix polysaccharide biosynthesis. Plant Journal 19: 691-697

Eisen MB, Spellman PT, Brown PO, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America 95: 14863-14868

Eriksson ME, Israelsson M, Olsson O, Moritz T (2000) Increased gibberellin biosynthesis in transgenic trees promotes growth, biomass production and xylem fiber length. Nature Biotechnology 18: 784-788

Faik A, Bar-Peled M, DeRocher AE, Zeng WQ, Perrin RM, Wilkerson C, Raikhel NV, Keegstra K (2000) Biochemical characterization and molecular cloning of an alpha-1,2-fucosyltransferase that catalyzes the last step of cell wall xyloglucan biosynthesis in pea. Journal of Biological Chemistry 275: 15082-15089

Faik A, Price NJ, Raikhel NV, Keegstra K (2002) An Arabidopsis gene encoding an alpha-xylosyltransferase involved in xyloglucan biosynthesis. Proceedings of the National Academy of Sciences of the United States of America 99: 7797-7802

Fenning TM, Gershenzon J (2002) Where will the wood come from? Plantation forests and the role of biotechnology. Trends in Biotechnology 20: 291-296

Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, Mckenney K, Sutton G, Fitzhugh W, Fields C, Gocayne JD, Scott J, Shirley R, Liu LI, Glodek A, Kelley JM, Weidman JF, Phillips CA, Spriggs T, Hedblom E, Cotton MD, Utterback TR, Hanna MC, Nguyen DT, Saudek DM, Brandon RC, Fine LD, Fritchman JL, Fuhrmann JL, Geoghagen NSM, Gnehm CL, Mcdonald LA, Small KV, Fraser CM, Smith HO, Venter JC (1995) Whole-Genome Random Sequencing and Assembly of Haemophilus-Influenzae Rd. Science 269: 496-512

Fry SC, Smith RC, Renwick KF, Martin DJ, Hodge SK, Matthews KJ (1992) Xyloglucan Endotransglycosylase, a New Wall-Loosening Enzyme-Activity from Plants. Biochemical Journal 282: 821-828

Fukuda H (1997) Tracheary element differentiation. Plant Cell 9: 1147-1156 Futamura N, Kouchi H, Shinohara K (2002) A gene for pectate lyase expressed in elongating and

differentiating tissues of a Japanese willow (Salix gilgiana). Journal of Plant Physiology 159: 1123-1130

Gardiner JC, Taylor NG, Turner SR (2003) Control of cellulose synthase complex localization in developing xylem. Plant Cell 15: 1740-1748

Grima-Pettenati J, Goffner D (1999) Lignin genetic engineering revisited. Plant Science 145: 51-65 Gutierrez A, del Rio JC, Martinez MJ, Martinez AT (2001) The biotechnological control of pitch in

paper pulp manufacturing. Trends in Biotechnology 19: 340-348 Henrissat B, Coutinho PM, Davies GJ (2001) A census of carbohydrate-active enzymes in the

genome of Arabidopsis thaliana. Plant Molecular Biology 47: 55-72 Henrissat B, Davies G (1997) Structural and sequence-based classification of glycoside hydrolases.

Current Opinion in Structural Biology 7: 637-644

52

Page 61: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

Henrissat B, Davies GJ (2000) Glycoside hydrolases and glycosyltransferases. Families, modules, and implications for genomics. Plant Physiology 124: 1515-1519

Holloway AJ, van Laar RK, Tothill RW, Bowtell DDL (2002) Options available - from start to finish - for obtaining data from DNA microarrays. Nature Genetics 32: 481-489

Holzman T, Kolker E (2004) Statistical analysis of global gene expression data: some practical considerations. Current Opinion in Biotechnology 15: 52-57

Hood L (2003) Systems biology: integrating technology, biology, and computation. Mechanisms of Ageing and Development 124: 9-16

Humphreys JM, Chapple C (2002) Rewriting the lignin roadmap. Current Opinion in Plant Biology 5: 224-229

Iwai H, Masaoka N, Ishii T, Satoh S (2002) A pectin glucuronyltransferase gene is essential for intercellular attachment in the plant meristem. Proceedings of the National Academy of Sciences of the United States of America 99: 16319-16324

Karacic A, Verwijst T, Weih M (2003) Above-ground woody biomass production of short-rotation populus plantations on agricultural land in Sweden. Scandinavian Journal of Forest Research 18: 427-437

Knight J (2001) When the chips are down. Nature 410: 860-861 Kohler A, Delaruelle C, Martin D, Encelot N, Martin F (2003) The poplar root transcriptome:

analysis of 7000 expressed sequence tags. Febs Letters 542: 37-41 Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle

M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang HM, Yu J, Wang J, Huang GY, Gu J, Hood L, Rowen L, Madan A, Qin SZ, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan HQ, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blocker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JGR, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang WH, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz JR, Slater G, Smit AFA, Stupka E, Szustakowki J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ (2001) Initial sequencing and analysis of the human genome. Nature 409: 860-921

Lao NT, Long D, Kiang S, Coupland G, Shoue DA, Carpita NC, Kavanagh TA (2003) Mutation of a family 8 glycosyltransferase gene alters cell wall carbohydrate composition and causes a humidity-sensitive semi-sterile dwarf phenotype in Arabidopsis. Plant Molecular Biology 53: 687-701

Lee MLT, Kuo FC, Whitmore GA, Sklar J (2000) Importance of replication in microarray gene expression studies: Statistical methods and evidence from repetitive cDNA hybridizations.

53

Page 62: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

Proceedings of the National Academy of Sciences of the United States of America 97: 9834-9839

Lee Y, Kende H (2001) Expression of beta-expansins is correlated with internodal elongation in deepwater rice. Plant Physiology 127: 645-654

Letunic I, Copley RR, Schmidt S, Ciccarelli FD, Doerks T, Schultz J, Ponting CP, Bork P (2004) SMART 4.0: towards genomic data integration. Nucleic Acids Research 32: D142-D144

Leung YF, Cavalieri D (2003) Fundamentals of cDNA microarray data analysis. Trends in Genetics 19: 649-659

Li LG, Cheng XF, Leshkevich J, Umezawa T, Harding SA, Chiang VL (2001) The last step of syringyl monolignol biosynthesis in angiosperms is regulated by a novel gene encoding sinapyl alcohol dehydrogenase. Plant Cell 13: 1567-1585

Liang X, Joshi CP (2004) Molecular cloning of ten distinct hypervariable regions from the cellulose synthase superfamily in aspen trees. Tree Physiology 24: 543-550

Lockhart DJ, Dong HL, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang CW, Kobayashi M, Horton H, Brown EL (1996) Expression monitoring by hybridization to high-density oligonucleotide arrays. Nature Biotechnology 14: 1675-1680

Madson M, Dunand C, Li XM, Verma R, Vanzin GF, Calplan J, Shoue DA, Carpita NC, Reiter WD (2003) The MUR3 gene of Arabidopsis encodes a xyloglucan galactosyltransferase that is evolutionarily related to animal exostosins. Plant Cell 15: 1662-1670

Mansfield SD, Esteghlalian AR (2003) Applications of biotechnology in the forest products industry. Applications of Enzymes to Lignocellulosics 855: 2-29

Marchler-Bauer A, Anderson JB, DeWeese-Scott C, Fedorova ND, Geer LY, He SQ, Hurwitz DI, Jackson JD, Jacobs AR, Lanczycki CJ, Liebert CA, Liu CL, Madej T, Marchler GH, Mazumder R, Nikolskaya AN, Panchenko AR, Rao BS, Shoemaker BA, Simonyan V, Song JS, Thiessen PA, Vasudevan S, Wang YL, Yamashita RA, Yin JJ, Bryant SH (2003) CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Research 31: 383-387

Matthysse AG, White S, Lightfoot R (1995) Genes Required for Cellulose Synthesis in Agrobacterium-Tumefaciens. Journal of Bacteriology 177: 1069-1075

McCann M (1997) Tracheary element formation: building up to a dead end. Trends in Plant Science 2: 333-338

Mcqueen-Mason S, Durachko DM, Cosgrove DJ (1992) 2 Endogenous Proteins That Induce Cell-Wall Extension in Plants. Plant Cell 4: 1425-1433

Mellerowicz EJ, Baucher M, Sundberg B, Boerjan W (2001) Unravelling cell wall formation in the woody dicot stem. Plant Molecular Biology 47: 239-274

Milioni D, Sado PE, Stacey NJ, Domingo C, Roberts K, McCann MC (2001) Differential expression of cell-wall-related genes during the formation of tracheary elements in the Zinnia mesophyll cell system. Plant Molecular Biology 47: 221-238

Milioni D, Sado PE, Stacey NJ, Roberts K, McCann MC (2002) Early gene expression associated with the commitment and differentiation of a plant tracheary element is revealed by cDNA-amplified fragment length polymorphism analysis. Plant Cell 14: 2813-2824

Oh S, Park S, Han KH (2003) Transcriptional regulation of secondary growth in Arabidopsis thaliana. Journal of Experimental Botany 54: 2709-2722

Ohlrogge J, Benning C (2000) Unraveling plant metabolism by EST analysis. Current Opinion in Plant Biology 3: 224-228

Ohmiya Y, Samejima M, Shiroishi M, Amano Y, Kanda T, Sakai F, Hayashi T (2000) Evidence that endo-1,4-beta-glucanases act on cellulose in suspension-cultured poplar cells. Plant Journal 24: 147-158

Osterlund MT, Paterson AH (2002) Applied plant genomics: the secret is integration. Current Opinion in Plant Biology 5: 141-145

Pauly M, Scheller HV (2000) O-Acetylation of plant cell wall polysaccharides: identification and partial characterization of a rhamnogalacturonan O-acetyl-transferase from potato suspension-cultured cells. Planta 210: 659-667

Pear JR, Kawagoe Y, Schreckengost WE, Delmer DP, Stalker DM (1996) Higher plants contain homologs of the bacterial celA genes encoding the catalytic subunit of cellulose synthase. Proceedings of the National Academy of Sciences of the United States of America 93: 12637-12642

Pellis A, Laureysens I, Ceulemans R (2004) Growth and production of a short rotation coppice culture of poplar I. Clonal differences in leaf characteristics in relation to biomass production. Biomass and Bioenergy 27: 9-19

54

Page 63: Discovery of fiber-active enzymes in Populus wood

DISCOVERY OF FIBER-ACTIVE ENZYMES IN POPULUS WOOD

Perrin R, Wilkerson C, Keegstra K (2001) Golgi enzymes that synthesize plant cell wall polysaccharides: finding and evaluating candidates in the genomic era. Plant Molecular Biology 47: 115-130

Perrin RM, DeRocher AE, Bar-Peled M, Zeng WQ, Norambuena L, Orellana A, Raikhel NV, Keegstra K (1999) Xyloglucan fucosyltransferase, an enzyme involved in plant cell wall biosynthesis. Science 284: 1976-1979

Perrin RM, Jia ZH, Wagner TA, O'Neill MA, Sarria R, York WS, Raikhel NV, Keegstra K (2003) Analysis of xyloglucan fucosylation in Arabidopsis. Plant Physiology 132: 768-778

Pesquet E, Jauneau A, Digonnet C, Boudet AM, Pichon M, Goffner D (2003) Zinnia elegans: the missing link from in vitro tracheary elements to xylem. Physiologia Plantarum 119: 463-468

Phimister B (1999) Going global. Nature Genetics 21: 1-1 Plomion C, Leprovost G, Stokes A (2001) Wood formation in trees. Plant Physiology 127: 1513-

1523 Quackenbush J (2001) Computational analysis of microarray data. Nature Reviews Genetics 2: 418-

427 Quackenbush J (2002) Microarray data normalization and transformation. Nature Genetics 32: 496-

501 Ridley BL, O'Neill MA, Mohnen DA (2001) Pectins: structure, biosynthesis, and oligogalacturonide-

related signaling. Phytochemistry 57: 929-967 Ridoutt BG, Pharis RP, Sands R (1996) Fibre length and gibberellins A(1) and A(20) are decreased

in Eucalyptus globulus by acylcyclohexanedione injected into the stem. Physiologia Plantarum 96: 559-566

Rudd S (2003) Expressed sequence tags: alternative or complement to whole genome sequences? Trends in Plant Science 8: 321-329

Rudsander U, Denman S, Raza S, Teeri TT (2003) Molecular features of family GH9 cellulases in hybrid aspen and the filamentous fungus Phanerochaete chrysosporium. J. Appl. Glycobiol.: 253-256

Rytter L (2002) Nutrient content in stems of hybrid aspen as affected by tree age and tree size, and nutrient removal with harvest. Biomass & Bioenergy 23: 13-25

Saxena IM, Lin FC, Brown RM (1990) Cloning and Sequencing of the Cellulose Synthase Catalytic Subunit Gene of Acetobacter-Xylinum. Plant Molecular Biology 15: 673-683

Schena M, Shalon D, Davis RW, Brown PO (1995) Quantitative Monitoring of Gene-Expression Patterns with a Complementary-DNA Microarray. Science 270: 467-470

Schultz J, Milpetz F, Bork P, Ponting CP (1998) SMART, a simple modular architecture research tool: Identification of signaling domains. Proceedings of the National Academy of Sciences of the United States of America 95: 5857-5864

Sedbrook JC, Carroll KL, Hung KF, Masson PH, Somerville CR (2002) The Arabidopsis SKU5 gene encodes an extracellular glycosyl phosphatidylinositol-anchored glycoprotein involved in directional root growth. Plant Cell 14: 1635-1648

Smyth GK, Speed T (2003) Normalization of cDNA microarray data. Methods 31: 265-273 Smyth GK, Yang YH, Speed T (2003) Statistical issues in cdna microarray data analysis. In MJ

Brownstein, AB Khodursky, eds, Functional genomics: Methods and protocols, Vol 224. Humana press, Totowa, NJ, pp 111-136

Somerville C, Somerville S (1999) Plant functional genomics. Science 285: 380-383 Sterky F, Bhalerao RR, Unneberg P, Segerman B, Nilsson P, Brunner AM, Campaa L, Jonsson

Lindvall J, Tandre K, Strauss SH, Sundberg B, Gustafsson P, Uhlen M, Bhalerao RP, Nilsson O, Sandberg G, Karlsson J, Lundeberg J, Jansson S (2004) A Populus EST resource for functional genomics. Proceedings of the National Academy of Sciences of the United States of America: in press

Sterky F, Regan S, Karlsson J, Hertzberg M, Rohde A, Holmberg A, Amini B, Bhalerao R, Larsson M, Villarroel R, Van Montagu M, Sandberg G, Olsson O, Teeri TT, Boerjan W, Gustafsson P, Uhlen M, Sundberg B, Lundeberg J (1998) Gene discovery in the wood-forming tissues of poplar: Analysis of 5,692 expressed sequence tags. Proceedings of the National Academy of Sciences of the United States of America 95: 13330-13335

Tanaka K, Murata K, Yamazaki M, Onosato K, Miyao A, Hirochika H (2003) Three distinct rice cellulose synthase catalytic subunit genes required for cellulose synthesis in the secondary wall. Plant Physiology 133: 73-83

Taylor NG, Howells RM, Huttly AK, Vickers K, Turner SR (2003) Interactions among three distinct CesA proteins essential for cellulose synthesis. Proceedings of the National Academy of Sciences of the United States of America 100: 1450-1455

55

Page 64: Discovery of fiber-active enzymes in Populus wood

HENRIK ASPEBORG

56

Teleman A, Nordstrom M, Tenkanen M, Jacobs A, Dahlman O (2003) Isolation and characterization of O-acetylated glucomannans from aspen and birch wood. Carbohydrate Research 338: 525-534

The Arabidopsis genome initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815

Timell TE (1967) Recent progress in the chemistry of wood hemicelluloses. Wood. Sci. Techn. 1: 45-70

Tissier T, Bourgeois P (2001) Reverse genetics in plants. Current Genomics: 269-284 Tuskan GA (1998) Short-rotation woody crop supply systems in the United States: What do we know

and what do we need to know? Biomass & Bioenergy 14: 307-315 Uggla C, Moritz T, Sandberg G, Sundberg B (1996) Auxin as a positional signal in pattern

formation in plants. Proceedings of the National Academy of Sciences of the United States of America 93: 9282-9286

Usadel B, Kuschinsky AM, Rosso MG, Eckermann N, Pauly M (2004) RHM2 is involved in mucilage pectin synthesis and is required for the development of the seed coat in Arabidopsis. Plant Physiology 134: 286-295

Valdivia Granda W (2003) Strategies for clustering, classifying, integrating, standardizing and visualizing microarray gene expression data. In E Blalock, ed, A beginner's guide to microarrays. Kluwer Academic Publishers, pp 277-340

Velculescu VE, Zhang L, Vogelstein B, Kinzler KW (1995) Serial Analysis of Gene-Expression. Science 270: 484-487

Vilaine F, Palauqui JC, Amselem J, Kusiak C, Lemoine R, Dinant S (2003) Towards deciphering phloem: a transcriptome analysis of the phloem of Apium graveolens. Plant Journal 36: 67-81

Vincken JP, Schols HA, Oomen RJFJ, McCann MC, Ulvskov P, Voragen AGJ, Visser RGF (2003) If homogalacturonan were a side chain of rhamnogalacturonan I. Implications for cell wall architecture. Plant Physiology 132: 1781-1789

Wittmann T, Wilm M, Karsenti E, Vernos I (2000) TPX2, a novel Xenopus MAP involved in spindle pole organization. Journal of Cell Biology 149: 1405-1418

Woolley LC, James DJ, Manning K (2001) Purification and properties of an endo-beta-1,4-glucanase from strawberry and down-regulation of the corresponding gene, cel1. Planta 214: 11-21

Wullschleger SD, Jansson S, Taylor G (2002) Genomics and forest biology: Populus emerges as the perennial favorite. Plant Cell 14: 2651-2655.

Yang YH, Speed T (2002) Design issues for cDNA microarray experiments. Nature Reviews Genetics 3: 579-588