Discovery of fiber-active enzymes in Populus wood
Henrik Aspeborg
Royal Institute of Technology
Department of Biotechnology
Stockholm 2004
Henrik Aspeborg
Department of Biotechnology
Royal Institute of Technology
AlbaNova University Center
SE-106 91 Stockholm
Sweden
Printed at Universitetsservice US AB
Box 700 14
Sweden
ISBN 91-7283-785-3
Henrik Aspeborg (2004). Discovery of fiber-active enzymes in Populus wood. Doctoral dissertation from the Department of Biotechnology, Royal Institute of Technology, KTH, AlbaNova University Center, Stockholm, Sweden.
Abstract
Renewable fibers produced by forest trees provide excellent raw material of high economic value for industrial applications. Despite this, the genes and corresponding enzymes involved in wood fiber biosynthesis in trees are poorly characterized. This thesis describes a functional genomics approach for the identification of carbohydrate-active enzymes involved in secondary cell wall (wood) formation in hybrid aspen.

First, a 3’ target amplification method was developed to enable microarray-based gene expression analysis on minute amounts of RNA. The amplification method was evaluated using both a smaller microarray containing 192 cDNA clones and a larger microarray containing 2995 cDNA clones that were hybridized with targets isolated from xylem and phloem. Moreover, a gene expression study of phloem differentiation was performed to show the usefulness of the amplification method.

A microarray containing 2995 cDNA clones representing a unigene set of a cambial region EST library was used to study gene expression during wood formation. Transcript populations from thin tissue sections representing different stages of xylem development were hybridized onto the microarrays. It was demonstrated that genes encoding lignin and cellulose biosynthetic enzymes, as well as a number of genes without assigned function, were differentially expressed across the developmental gradient.

Microarrays were also used to track changes in gene expression in the developing xylem of transgenic, GA-20 oxidase overexpressing hybrid aspens that had increased secondary growth. The study revealed that a number of genes encoding cell wall related enzymes were upregulated in the transgenic trees. Moreover, most genes with high transcript changes could be assigned a role in the early events of xylogenesis.

Ten genes encoding putative cellulose synthases (CesAs) were identified in our own Populus EST-database. Full length cDNA sequences were obtained for five of them. Expression analyses performed with real-time PCR and microarrays in normal wood undergoing xylogenesis and in tension wood revealed xylem specific expression of four putative CesA isoenzymes.

Finally, an approach combining expression profiling, bioinformatics as well as EST and full length sequencing was adopted to identify secondary cell wall related genes encoding carbohydrate-active enzymes, such as glycosyltransferases and glycoside hydrolases. As expected, glycosyltransferases involved in the carbohydrate biosynthesis dominated the collection of the secondary cell wall related enzymes that were identified.
Key words: Populus, xylogenesis, secondary cell wall, cellulose, hemicellulose, microarrays, transcript profiling, carbohydrate-active enzyme, glycosyltransferase, glycoside hydrolase
Summary (Sammanfattning)
The renewable fibers of forest trees are a unique raw material of considerable economic value to the forest industry. Despite this, the genes and corresponding enzymes involved in wood fiber biosynthesis are not particularly well characterized. This thesis describes how functional genomics can be used to identify carbohydrate-active enzymes involved in the formation of the secondary cell wall in hybrid aspen.

First, a 3’ amplification method for target RNA was developed. This method enables microarray-based expression analyses on small amounts of RNA. To evaluate the amplification method, a smaller microarray consisting of 192 cDNA clones and a larger microarray consisting of 2995 cDNA clones were hybridized with target RNA isolated from xylem and phloem. In addition, a gene expression study of differentiating phloem was carried out to demonstrate the usefulness of the amplification method.

A microarray consisting of 2995 cDNA clones, all representing unique genes from an EST library isolated from the cambial region, was used to study gene expression during wood formation. Transcript populations from thin tissue sections, representing different stages of xylem development, were hybridized to the microarrays. Genes encoding lignin and cellulose biosynthetic enzymes, as well as a number of genes without established function, were found to be differentially expressed during wood formation.

Microarrays were also used to detect changes in gene expression in the xylem of transgenic hybrid aspens with increased secondary growth. These hybrid aspens overexpress GA-20 oxidase. The study showed that a number of genes encoding cell wall related enzymes were upregulated in the transgenic trees. It was also noted that the genes with the largest transcript changes were those active in the early stages of wood formation.

Ten genes encoding putative cellulose synthases (CesAs) were identified in our own Populus EST-database. Full length cDNA sequences were obtained for five of them. Expression analyses of ongoing xylogenesis in normal wood and tension wood, performed with real-time PCR and microarrays, showed that four putative CesA isoenzymes were xylem-specifically expressed.

Finally, expression analysis, bioinformatics, EST sequencing and full length sequencing were combined in an attempt to identify secondary cell wall related genes encoding carbohydrate-active enzymes such as glycosyltransferases and glycoside hydrolases. As expected, glycosyltransferases involved in carbohydrate biosynthesis dominated the group of identified secondary cell wall related enzymes.
To my parents and my family
What's so funny 'bout peace, love & understanding
Nick Lowe
LIST OF PUBLICATIONS
This thesis is based on the following papers, which are referred to in the text by their Roman numerals:
I. Hertzberg M, Sievertzon M, Aspeborg H, Nilsson P, Sandberg G, Lundeberg J. cDNA microarray analysis of small plant tissue samples using a cDNA tag target amplification protocol. Plant J. 2001 Mar;25(5):585-91.
II. Hertzberg M*, Aspeborg H*, Schrader J, Andersson A, Erlandsson R, Blomqvist K, Bhalerao R, Uhlen M, Teeri TT, Lundeberg J, Sundberg B, Nilsson P, Sandberg G. A transcriptional roadmap to wood formation. Proc Natl Acad Sci USA. 2001 Dec 4;98(25):14732-7.
III. Israelsson M, Eriksson ME, Hertzberg M, Aspeborg H, Nilsson P, Moritz T. Changes in gene expression in the wood-forming tissue of transgenic hybrid aspen with increased secondary growth. Plant Mol Biol. 2003 Jul;52(4):893-903.
IV. Djerbi S, Aspeborg H, Nilsson P, Sundberg B, Mellerowicz E, Blomqvist K, Teeri TT. Identification and expression analysis of genes encoding putative cellulose synthases (CesA) in the hybrid aspen, Populus tremula (L.) x P. tremuloides (Michx.). Accepted for publication in Cellulose.
V. Aspeborg H, Schrader J, Coutinho PM, Stam M, Kallas Å, Nilsson P, Denman S, Amini B, Sterky F, Master E, Sandberg G, Mellerowicz E, Sundberg B, Henrissat B, Teeri TT. Carbohydrate-active enzymes involved in the secondary cell wall biogenesis in hybrid aspen. Manuscript.
* M.H. and H.A. contributed equally to this work and should be considered as joint first authors.
RELATED PAPERS
(a) Andersson A, Keskitalo J, Sjodin A, Bhalerao R, Sterky F, Wissel K, Tandre K, Aspeborg H, Moyle R, Ohmiya Y, Bhalerao R, Brunner A, Gustafsson P, Karlsson J, Lundeberg J, Nilsson O, Sandberg G, Strauss S, Sundberg B, Uhlen M, Jansson S, Nilsson P. A transcriptional timetable of autumn senescence. Genome Biol. 2004;5(4):R24. Epub 2004 Mar 10.
(b) Blomqvist K, Djerbi S, Aspeborg H, Teeri TT. Cellulose biosynthesis in trees. Submitted as a book chapter.
TABLE OF CONTENTS
1 INTRODUCTION
  1.1 From genes to genomes
  1.2 EST sequencing
  1.3 Global gene expression analysis
    1.3.1 Background
    1.3.2 Array fabrication
    1.3.3 Target preparation and hybridization
    1.3.4 Image analysis
    1.3.5 Data analysis
    1.3.6 Differentially expressed genes
    1.3.7 Validation
  1.4 Searching for a function
  1.5 Wood formation
    1.5.1 Economic importance
    1.5.2 Studies of wood formation - poplar and other model organisms
    1.5.3 Xylem differentiation
    1.5.4 The plant cell wall
    1.5.5 Carbohydrate biosynthesis and turnover
      1.5.5.1 Carbohydrate-active enzymes involved in cell wall biosynthesis
      1.5.5.2 Cellulose biosynthesis
      1.5.5.3 Hemicellulose biosynthesis
      1.5.5.4 Pectin biosynthesis
    1.5.6 Other cell wall components
      1.5.6.1 Lignin
      1.5.6.2 Cell wall proteins
      1.5.6.3 Extractives
2 OBJECTIVES OF THE PRESENT INVESTIGATION
3 RESULTS AND DISCUSSION
  3.1 Transcript profiling of woody tissues
    3.1.1 Development of a target amplification method (I)
    3.1.2 Transcript profiling during xylogenesis (II) (III) (V)
      3.1.2.1 A transcriptional roadmap to wood formation (II)
      3.1.2.2 A transcriptional roadmap to wood formation revisited (V)
      3.1.2.3 Transcript profiling of transgenic trees with increased secondary growth (III)
    3.1.3 Transcript profiling of cellulose synthases (IV)
  3.2 Identification of carbohydrate active enzymes involved in xylogenesis (V)
  3.3 Unpublished data
    3.3.1 Full length sequencing of genes with unknown function
    3.3.2 PUNK1
4 Concluding remarks and future prospects
5 ABBREVIATIONS
6 ACKNOWLEDGEMENTS
7 REFERENCES
1 INTRODUCTION
1.1 From genes to genomes
The late 1990s and the first years of the new millennium will be remembered as the
dawn of genomics in the history of biology. During the 1980s the improvement and
automation of sequencing methods made the complete sequencing of viruses and
organelles possible, but it was not until 1995 that the first complete genome sequence
of a free-living organism, the bacterium Haemophilus influenzae, was published
(Fleischmann et al., 1995). Since then, more than a hundred genomes have been
sequenced from the main domains of life: bacteria, archaea and eukarya. Model
organisms for genome sequencing have been chosen to cover a variety of life forms.
The budding yeast Saccharomyces cerevisiae (1997), human Homo sapiens (2001),
mouse Mus musculus (2002), the nematode Caenorhabditis elegans (1998), the fruit
fly Drosophila melanogaster (2000), the mosquito Anopheles gambiae (2002), and the
plant Arabidopsis thaliana (2000), are examples of completely sequenced eukaryotes
(http://www.ncbi.nih.gov). Other genomes such as the rice genome have been
presented in near-completed drafts, and sequencing projects of several hundred
genomes are on-going at various stages of completion. The genome is defined as the
complete (haploid) DNA content of an organism and the new field of genome studies
has been named genomics. Note that the terminology varies: sometimes related
“omic” topics such as transcriptomics, proteomics, metabolomics and glycomics are
included in the term genomics. Preferably the narrow definition of genomics as the
discipline of mapping, sequencing and analyzing genomes should be used, while the
commonly used term functional genomics is more suitable for the different “omics”,
which make use of high-throughput and large-scale experimental methodologies to
determine the biological function of the genes and their products. In the age of
genomics and functional genomics, the old school of molecular biology, which
followed a gene-at-a-time approach, has been replaced by global approaches for
identifying and understanding gene function, and this has given scientists a new
perspective. With this new global view we can seek answers to questions like: How
and which genes operate together? How and which proteins interact with genes and
other proteins? How do the networks or circuits of genes and proteins interact with
other networks to determine a certain cell type? This new approach to biology, which
studies the interrelationships of all of the elements in a system rather than studying
them one at a time, is termed systems biology (Hood, 2003). As a consequence of the
enormous quantity of data that has been produced by high-throughput experimental
technologies, bioinformatics, i.e. computational and algorithmic approaches to handle
and study biological data, has emerged as a new, essential discipline for genomics and
functional genomics.
1.2 EST sequencing
One of the goals of genome sequencing projects is to identify the complete set of
genes of an organism, translate them into proteins and then annotate them on the basis
of their similarity to known proteins in the public databases. However, the coding
sequences of genes, which carry the vast majority of the information content,
represent only a tiny fraction of the total DNA. Only 1.5 % of the human genome has been
estimated to consist of coding sequence (Lander et al., 2001). Instead of looking at the
complete genome, an efficient alternative approach is to only look at the expressed
part of the genome, the transcriptome. The expressed genes, i.e. those that are
transcribed into mRNA, in a cell/tissue at a specific moment will determine the
current metabolic status of that particular cell/tissue. For instance, the mRNA pool of
roots will be different from the mRNA pool in leaves. The most widely used method
for gene identification of the transcribed portion of the genome is expressed sequence
tag (EST) sequencing. The term expressed sequence tag (EST) is used to describe
partial sequences of cDNA, reverse transcribed from mRNA. The reason for the
conversion of RNA to cDNA is the higher stability of DNA. The first application of
high-throughput sequencing of cDNA clones was described in 1991, where a brain
specific cDNA library was exploited (Adams et al., 1991). Since then more than 20
million ESTs have been sequenced from more than 600 distinctly annotated species,
representing a wide taxonomic variety of fungi, plants and animals (dbEST 19 March
2004, http://www.ncbi.nih.gov) (Boguski et al., 1993). Several plant species also have
large EST collections (Table 1). Since the 5’ sequences are more likely to contain
protein coding sequence than the 3’ends, early EST-projects favored sequencing the
5’ end of directionally cloned cDNAs. Currently, there is preference for sequencing
the 3’ end of the cDNA clone because it is likely to offer more unique sequence,
including the untranslated region (UTR), and can be used to distinguish between
closely related gene family members. The strategy of sequencing both ends of the
cDNA clone is also becoming widespread (Rudd, 2003). A high quality cDNA
library with mostly full length genes is valuable for further investigation of candidate
genes (especially for organisms without their genome sequenced), since laborious
methods such as RACE (Rapid Amplification of cDNA Ends), which would
otherwise be necessary to obtain the full length gene, are not needed. Despite the fact
that ESTs are limited in size (typically 400-500 bp), assembly of overlapping
ESTs into clusters can result in much longer sequences, and even full length cDNAs.
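
The idea of joining overlapping ESTs into a longer sequence can be illustrated with a
minimal sketch. It assumes error-free reads on the same strand and an exact
suffix-to-prefix overlap; real EST clustering software has to tolerate sequencing errors,
which this toy version does not. The function names, the `min_overlap` parameter and the
sequences are invented for illustration only.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`.

    Returns 0 when the longest exact overlap is shorter than `min_overlap`.
    """
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if size > 0 and left.endswith(right[:size]):
            return size
    return 0


def merge_ests(left: str, right: str, min_overlap: int = 20) -> Optional[str]:
    """Join two ESTs into one longer sequence, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


# Toy reads (invented, far shorter than real 400-500 bp ESTs).
est_a = "ATGGCGTACGTTAGC"
est_b = "GTTAGCCTAGGATCA"
contig = merge_ests(est_a, est_b, min_overlap=4)
print(contig)       # ATGGCGTACGTTAGCCTAGGATCA
print(len(contig))  # 24
```

Two 15-base reads sharing six bases give a 24-base contig; repeating the merge over many
overlapping reads is what allows clusters to approach full length cDNAs.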
Table 1. Top 16 plant species ranked by the number of dbEST entries (April 2, 2004)

Species                             Common name         Number of entries
Triticum aestivum                   Wheat               555472
Zea mays                            Maize               395951
Hordeum vulgare + subsp. vulgare    Barley              356856
Glycine max                         Soybean             346582
Oryza sativa                        Rice                283989
Saccharum officinarum               Noble cane          246301
Arabidopsis thaliana                Thale cress         204396
Sorghum bicolor                     Sorghum             190864
Medicago truncatula                 Barrel medic        187763
Populus spp.*                       Poplars             155454
Lycopersicon esculentum             Tomato              150519
Solanum tuberosum                   Potato              144730
Vitis vinifera                      Grape               137660
Pinus taeda                         Loblolly pine       110622
Lotus corniculatus var. japonicus   Birdsfoot trefoil   110563
Lactuca sativa                      Garden lettuce       68188

* Populus species and hybrids
Populus tremula x Populus tremuloides                       Hybrid aspen          65981
Populus tremula                                             (European) aspen      31288
Populus balsamifera subsp. trichocarpa                      Black cottonwood      26825
Populus tremuloides                                         Quaking aspen         12813
Populus x canescens (Populus alba x Populus tremula)        Gray poplar           10446
Populus balsamifera subsp. trichocarpa x Populus deltoides  Interamerican poplar   7524
Populus alba x Populus glandulosa                                                   519
Populus euphratica                                          Euphrates poplar         58
1.3 Global gene expression analysis
1.3.1 Background
Knowing when, where and to what extent a gene is expressed provides clues to gene
function. Preferably, gene expression information is obtained from a variety of
physiological and developmental conditions. Gene expression tools for a single gene
or a small set of genes have been available for a long time to provide an insight into
gene function. During the last decade sequence-based methods such as serial analysis
of gene expression (SAGE) (Velculescu et al., 1995) and EST sequencing have been
used to measure mRNA abundance for a large number of genes. For instance, large
EST datasets that originate from multiple tissue specific cDNA libraries provide
information on gene expression, simply by observing the relative abundance of the
individual ESTs in the different libraries, a procedure called electronic or digital
Northerns (Ohlrogge and Benning, 2000). However, the sequence-based methods are
expensive and slow ways to study gene expression. New parallel hybridization-based
approaches that allow gene expression patterns for thousands of genes to be obtained
simultaneously have instead become increasingly popular. Spotted microarrays
(Schena et al., 1995) and in situ oligonucleotide arrays (Lockhart et al., 1996) are two
such high-throughput methods that monitor global gene expression, and since they
were first described they have had a tremendous impact on the molecular biology
field. A supplement in Nature Genetics was dedicated to these emerging technologies
in 1999, and was followed by another supplement in 2002 (Nature Gen. 1999. Vol 21
(1 suppl) pp. 1-60, Nature Gen. 2002. Vol 32 (1 suppl) pp. 461-552). For a
comprehensive review of various microarray related issues these two collections of
reviews are highly recommended.
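
The digital Northern mentioned above amounts to comparing how often ESTs from one gene
turn up in each library, scaled by library size. A minimal sketch follows; the library
names, counts and the scaling to 10,000 clones are invented for illustration and are not
taken from any of the EST collections discussed here.

```python
def digital_northern(est_counts, library_sizes, per=10_000):
    """Relative abundance of one transcript in each cDNA library.

    est_counts:    library name -> number of ESTs assigned to the gene
    library_sizes: library name -> total number of ESTs sequenced from that library
    Returns library name -> ESTs per `per` sequenced clones.
    """
    return {
        library: per * est_counts.get(library, 0) / total
        for library, total in library_sizes.items()
    }


# Hypothetical libraries and counts for a single gene.
library_sizes = {"cambial_zone": 5000, "leaf": 4000}
gene_counts = {"cambial_zone": 10, "leaf": 1}

profile = digital_northern(gene_counts, library_sizes)
print(profile)  # {'cambial_zone': 20.0, 'leaf': 2.5}
```

With so few ESTs per gene the estimates are coarse, which is one reason why
hybridization-based methods are attractive for expression profiling.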
Array technology allows the precise positioning of DNA fragments at high density
onto a solid support, creating a two-dimensional array of DNA spots that can act as
molecular detectors. There are three main types of microarrays: filter arrays, spotted
glass slide arrays, and in situ synthesized oligonucleotide arrays (Holloway et al.,
2002). Filter arrays, often called macroarrays, cannot be spotted with the same density
as glass slide arrays, and the target is usually labeled radioactively. These are
relatively cheap to produce, but to be able to compare different samples they have to
be hybridized to individual separate arrays. The Affymetrix GeneChips™ are the
most widely used arrays with in situ synthesized oligonucleotides. Short
oligonucleotides, only 20-25 bases long, are synthesized by photolithography. The
advantages of using Affymetrix chips are the highly standardized setup, the complete
control over the sequences and the possibility to detect splice variants and single
nucleotide polymorphisms (SNPs). The weaknesses of this technology are the relatively
high cost, the limited number of organisms available, and that two chips are required to compare
two different samples because the Affymetrix technology can only handle one
fluorescent dye. In spotted arrays either oligonucleotides or cDNA fragments can be
used as probes. Although cDNA spotted microarrays have been the favored type,
spotting oligonucleotides 50-70 bases long is becoming increasingly popular. A
unique and powerful feature of the two-color spotted microarrays is the ability to
make a direct comparison between two samples on the same microarray (Fig. 1).
Since spotted cDNA microarrays have been successfully used in the present thesis,
this technology will be described in further detail. In the following description, the
term probe will refer to the immobilized DNA, while the labeled, mobile DNA that is
hybridized to the array is called target, as suggested in the Nature Genetics supplement
The Chipping Forecast (Phimister, 1999).
1.3.2 Array fabrication
When creating spotted cDNA microarrays, a large collection of cDNAs representing
as many unique transcripts as possible (e.g. a subset of genes of interest or preferably
all genes in the genome), are PCR-amplified and subsequently spotted onto glass
slides using a high-speed robotic system. The glass slides are coated with poly-lysine,
amino silanes or amino-reactive silanes, which enhance the adherence of the printed
DNA. After printing, the DNA is usually cross-linked to the matrix by ultraviolet
irradiation. Alternatively, baking the arrays at 80°C for two to four hours works
equally well. To reduce background fluorescence, which may interfere with spot
identification and subsequent data analysis, the free reactive groups on the unprinted
area of the glass slides can be deactivated with chemicals like succinic anhydride or
by treatment with biomolecules such as bovine serum albumin (BSA). Finally, the
DNA molecules are denatured in water at high temperature before hybridization.
Other critical factors for the spot quality and morphology are the concentration of the
spotted DNA, the composition of the printing solution, temperature and relative
humidity.
[Figure 1: workflow diagram. DNA clones (genes) are PCR-amplified, purified and robotically printed as probes on the microarray. mRNA from the sample and the reference is converted into labelled cDNA (target) and hybridized to the microarray. Scanning with two lasers gives “green” and “red” raw data, and image analysis yields the ratio between the Cy5 and Cy3 signals.]
Figure 1. Schematic overview of spotted cDNA microarrays.
By spotting the cDNA clones in replicates, an estimate of the intra-slide variation
can be obtained. Because of array space limitations, duplicates are most commonly
used. Another option is to print many replicates of a few selected genes.
Preferably, the replicates should be distributed over the slide surface. In addition to
the spotted cDNA clones of the examined organism, both positive and negative
element controls can be spotted to evaluate the quality and performance of the
microarray. Positive controls are used to examine the dynamic range for calibration or
as ratio controls (i.e. to check whether the expected ratio is obtained). These spiking controls are made
of two components. PCR products from a different organism are printed on the array,
and in vitro transcribed RNAs from these corresponding PCR products are added to
the hybridization mix in known amounts. Examples of negative controls are: poly(dA)
oligos, Cot-1 DNA, printing buffer and non-crossreactive PCR products from
unrelated species. These are included to check for non-specific hybridizations,
blocking of repetitive elements, and pin washing efficiency. There are currently
several commercial control kits available.
1.3.3 Target preparation and hybridization
Target production begins with RNA isolation from the tissues of interest. Either
highly purified mRNA or total RNA can be used in subsequent reverse transcription
of mRNA. For direct labeling methods, targets are fluorescently labeled during the
reverse transcription step, using fluorescently labeled nucleotides. The alternative is
indirect labeling, whereby an amino allyl modified dUTP is used instead of a
prelabeled nucleotide. In an extra step after reverse transcription, the free amine group
on the amino allyl modified dUTP can be coupled to a reactive N-hydroxysuccinimidyl
ester fluorescent dye. Although this technique is more time-consuming than direct
labeling, it has become widely used because it improves sensitivity, removes dye
biases and can be performed at decreased cost. The green cyanine-3 (Cy3) and the red
cyanine-5 (Cy5) have been the most popular dyes used in experiments with cDNA
spotted microarrays, but alternative dyes are available. A substantial amount of RNA
is required for labeling and hybridization; amounts above 10 µg of total
RNA are typically recommended. In order to perform microarray experiments with tissue
samples where the RNA amount is limited, various RNA amplification methods have
been developed. As the preservation of the relative expression levels from the original
cDNA/mRNA is critical, the aim is to develop unbiased amplification methods with
high reproducibility. After labeling and purification of the samples to remove
unincorporated dyes, the two targets, one typically labeled with Cy3 and the other
with Cy5, are mixed together in a hybridization buffer and hybridized to a single
microarray slide. During the period of hybridization, which extends several hours or
overnight, the target nucleic acids are allowed to form stable duplexes with the
corresponding DNA molecules immobilized on the array. Manual hybridizations are
performed underneath cover slips in small hybridization cassettes. As an alternative,
automated hybridization stations are available. The stringency of hybridization is
controlled by factors such as the hybridization buffer composition and temperature.
Choosing the washing conditions after hybridization is also critical for the outcome of
the microarray experiment.
1.3.4 Image analysis
Both confocal scanning devices and CCD cameras are used for fluorescence
detection. The scanner produces separate digital images for each of the Cy3 (green)
and Cy5 (red) channels. Generally the Cy5-channel is scanned first, as this dye is
more susceptible to photobleaching. The overlay of the two output images offers a
quick visualization of the microarray experiment and reveals information about the
uniformity of the hybridization, spot morphology, background and artifacts such as
dust particles. Next, image analysis software is needed to quantify each
individual array element. First, a spot-finding method is required to locate all signal
spots in the image and estimate the size of each spot. The spot-finding technique
must be robust and able to manage uneven spot sizes and positional deviations. Then
the foreground and background intensities for each spot are calculated. The
foreground signal is often measured as the mean or the median of the intensities of the
pixels identified inside the spot. There are several algorithms used to calculate the
background, and they all try to estimate the background noise in the area of the spot.
Most often the background signal is subtracted from the foreground, but this may not
always be the best choice. Finally, the data can be filtered and spots exhibiting a low
signal to background ratio are removed from the data set (e.g. by excluding any spot
with intensity lower than the background plus two standard deviations). Typically, the
intensity ratios are also log-transformed (logarithm base 2 is most widely used). The
advantage of this transformation is that upregulated and downregulated values are
comparable and of the same scale (Quackenbush, 2002; Leung and Cavalieri, 2003).
This means that an upregulated gene with intensity ratio = 2 has a log2 (ratio) = 1,
while a downregulated gene with intensity ratio = 0.5 has a log2 (ratio) = -1, and a
constantly expressed gene with intensity ratio = 1 has a log2 (ratio) = 0.
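
The filtering and log transformation described above can be sketched in a few lines.
This is a minimal illustration, not the procedure of any particular image analysis
package: the `Spot` fields, the use of the Cy5 (red) signal as numerator and the
requirement that both channels pass the filter are assumptions made for the example, and
the intensities are invented.

```python
import math
from dataclasses import dataclass


@dataclass
class Spot:
    gene: str
    red_fg: float       # Cy5 foreground intensity
    red_bg: float       # Cy5 local background
    red_bg_sd: float    # standard deviation of the Cy5 background pixels
    green_fg: float     # Cy3 foreground intensity
    green_bg: float     # Cy3 local background
    green_bg_sd: float  # standard deviation of the Cy3 background pixels


def passes_filter(spot: Spot, n_sd: float = 2.0) -> bool:
    """Keep a spot only if both channels exceed background + n_sd standard deviations."""
    red_ok = spot.red_fg > spot.red_bg + n_sd * spot.red_bg_sd
    green_ok = spot.green_fg > spot.green_bg + n_sd * spot.green_bg_sd
    return red_ok and green_ok


def log2_ratio(spot: Spot) -> float:
    """log2 of the background-subtracted red/green intensity ratio."""
    red = spot.red_fg - spot.red_bg
    green = spot.green_fg - spot.green_bg
    return math.log2(red / green)


spots = [
    Spot("up", 2100, 100, 20, 1100, 100, 20),   # ratio 2   -> log2 = 1.0
    Spot("down", 600, 100, 20, 1100, 100, 20),  # ratio 0.5 -> log2 = -1.0
    Spot("weak", 130, 100, 20, 125, 100, 20),   # below background + 2 SD, removed
]

for spot in spots:
    if passes_filter(spot):
        print(spot.gene, log2_ratio(spot))
```

The two-fold upregulated and two-fold downregulated spots end up at +1 and -1, the same
distance from zero, which is the symmetry the log transformation is meant to provide.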
After the image processing, but before the downstream analysis of the gene
expression profiles, it is necessary to normalize the relative fluorescence intensities in
each of the two scanned channels. The goal of normalization is to identify and
remove as much of the technical variation as possible, so that the observed ratios
most accurately reflect the biological variance. The most common sources of
technical fluctuations are differences in labeling or detection efficiencies between the
fluorescent labels used, unequal quantities of initial RNA for the two samples
examined and systematic biases in the measured expression levels (Quackenbush,
2002). By performing dye-swap replications, i.e. reversing the labeling of targets in
the second hybridization, a reduction of systematic biases can be achieved, but
normalization is still required. Over the years, the normalization methods have been
improved and become more sophisticated. The normalization factor can be calculated
using all the genes, a set of constantly expressed housekeeping genes or spiking
controls. It has been observed that housekeeping genes are not as constantly expressed
as previously assumed, and therefore the expression of selected housekeeping genes
must be carefully inspected before being used for normalization.
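
How a dye-swap pair reduces systematic bias can be shown with a small sketch. It rests
on a simplifying assumption that is not stated in the text: each gene carries the same
additive dye bias (on the log2 scale) in both hybridizations. Under that assumption half
the difference of the two log ratios recovers the expression change and half their sum
estimates the bias. The function name and the numbers are invented for illustration.

```python
def combine_dye_swap(m_forward, m_swapped):
    """Combine log2 ratios from a dye-swap pair of hybridizations.

    m_forward: log2(Cy5/Cy3) per gene, sample labeled with Cy5
    m_swapped: log2(Cy5/Cy3) per gene, sample labeled with Cy3 (labels reversed)
    Returns (log2 sample/reference ratio per gene, estimated dye bias per gene).
    """
    log_ratio = [(f - s) / 2 for f, s in zip(m_forward, m_swapped)]
    dye_bias = [(f + s) / 2 for f, s in zip(m_forward, m_swapped)]
    return log_ratio, dye_bias


# Two hypothetical genes with true log2 ratios of 1.0 and -2.0 and a dye bias of 0.25.
m_forward = [1.25, -1.75]  # true ratio + bias
m_swapped = [-0.75, 2.25]  # reversed ratio + bias

log_ratio, dye_bias = combine_dye_swap(m_forward, m_swapped)
print(log_ratio)  # [1.0, -2.0]
print(dye_bias)   # [0.25, 0.25]
```

Bias that differs between the two slides is not cancelled this way, which is consistent
with normalization still being required.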
There are two main techniques used for the normalization of gene
expression data from a single array hybridization (i.e. within-slide normalization).
They share the assumption that most of the genes in the array, some subset of genes,
or a set of spiking controls should have an average expression ratio equal to one. Total
intensity normalization assumes that the initial quantity of mRNA is the same for the
two compared samples and that the change of expression is symmetrical, i.e. the
number of upregulated genes is equal to the number of downregulated genes. Thus,
the total intensity for all the array elements should be the same in the two scanned
channels (i.e. Cy3 and Cy5) and the normalization factor can be calculated after
summing the intensities in both channels. The second main type of normalization uses
regression techniques. Here one assumes that a significant fraction of the genes will
display no change in expression in the two samples compared. In a scatter plot these
genes would cluster along a straight line with a slope of one. Normalization is
performed by computing the best-fit slope and adjusting the data so that the slope is
one. The regression techniques also include non-linear methods such as lowess
(locally weighted scatterplot smoothing) regression (Quackenbush, 2001). Other
alternative approaches to normalize microarray data are log centering, rank invariant
methods and Chen’s ratio statistics (Quackenbush, 2002).
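The two linear approaches described above can be sketched as follows. This is a simplified illustration, not the procedure of any particular software package: the regression line is assumed to pass through the origin, and the function names and intensity values are hypothetical.

```python
import math

def total_intensity_factor(red, green):
    """Normalization factor: summed Cy5 intensity over summed Cy3 intensity."""
    return sum(red) / sum(green)

def normalize_total_intensity(red, green):
    """Rescale the green channel so both channels have the same total
    intensity, then return the normalized log2 ratios."""
    k = total_intensity_factor(red, green)
    return [math.log2(r / (g * k)) for r, g in zip(red, green)]

def regression_slope(green, red):
    """Least-squares slope of red against green for a line through the
    origin (assumption: no intercept term)."""
    return sum(g * r for g, r in zip(green, red)) / sum(g * g for g in green)

def normalize_by_regression(red, green):
    """Adjust the data so that the best-fit slope becomes one."""
    slope = regression_slope(green, red)
    return [math.log2(r / (g * slope)) for r, g in zip(red, green)]

# Hypothetical spots where the Cy5 channel is uniformly twice as bright
red = [200.0, 400.0, 800.0]
green = [100.0, 200.0, 400.0]
print(normalize_total_intensity(red, green))
print(normalize_by_regression(red, green))
```

For these idealized data both methods return log-ratios of zero for every spot, because the only difference between the channels is a constant scale factor.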
It is now evident that the dye bias in most microarrays is both intensity and position
dependent. The intensity dependent effects are easily visualized with an MA-plot
(Dudoit et al., 2002), which is a scatterplot of the M-values [M = log2(R/G)] against
the A-values [A = 0.5 log2(RG)] of an array (see Fig. 2 for details). To detect
position dependent effects, spatial plots or boxplots, which plot the distribution of the
log-ratios for each print-tip group, can be used (Fig. 2) (Smyth and Speed, 2003). The
non-linear lowess normalization has emerged as the dominant normalization method
for removing these systematic biases. If there are no evident spatial effects, global lowess
normalization is a good choice. However, if there are possible
spatial effects, print-tip lowess or scaled print-tip lowess normalization should be
considered (Leung and Cavalieri, 2003). In addition to within-slide normalization,
between-array normalization is useful if there are substantial scale differences
between microarray slides. This prevents any particular replicate from becoming too
dominant and skewing the data (Leung and Cavalieri, 2003; Smyth and Speed, 2003).
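The idea behind intensity dependent normalization can be sketched in a few lines. The smoother below is a simplified stand-in for lowess: a single pass of tricube-weighted local linear regression, without the robustness iterations of the full algorithm. The function names, the `frac` parameter and the example values are assumptions made for illustration.

```python
import math

def ma_values(red, green):
    """Return (M, A) with M = log2(R/G) and A = 0.5*log2(R*G)."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    return m, a

def local_linear_fit(x, y, frac=0.5):
    """Lowess-style smoother: tricube-weighted linear fit around each x[i]."""
    n = len(x)
    k = min(n, max(3, math.ceil(frac * n)))  # neighbours used per fit
    fitted = []
    for xi in x:
        dists = sorted(abs(xj - xi) for xj in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to k-th nearest point
        w = [max(0.0, 1 - (abs(xj - xi) / h) ** 3) ** 3 for xj in x]
        sw = sum(w)
        mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
        my = sum(wj * yj for wj, yj in zip(w, y)) / sw
        sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
        sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (xi - mx))
    return fitted

def normalize(m, a, groups=None, frac=0.5):
    """Subtract the fitted curve from M. If print-tip group labels are
    given, a separate curve is fitted within each group."""
    if groups is None:
        groups = [0] * len(m)
    out = list(m)
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        fit = local_linear_fit([a[i] for i in idx], [m[i] for i in idx], frac)
        for i, f in zip(idx, fit):
            out[i] = m[i] - f
    return out

# Hypothetical array with a purely intensity dependent dye bias
a = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
m = [0.2 * x - 1.0 for x in a]
normalized = normalize(m, a)
```

Calling `normalize` without group labels corresponds to a global fit, whereas passing one label per spot corresponds to the print-tip variant; scaling between print-tip groups is not covered by this sketch.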
1.3.5 Data analysis
Cluster analysis methods are used to more systematically group related gene
expression patterns across experiments. By exploring the large data sets generated by
microarrays, this type of analysis can provide novel perspectives on cellular
regulatory mechanisms and can associate the expression of unknown genes with a
putative function. There are basically two main categories of clustering methods,
those that are unsupervised and those that are supervised. In unsupervised methods,
existing knowledge about the functional roles of different genes is not considered
prior to the analysis. In contrast, the supervised methods require the existence of a
training set of genes that are known to be related. The unsupervised hierarchical
clustering analysis presented in 1998 is still one of the most widely used unsupervised
methods for clustering microarray data (Eisen et al., 1998). This technique produces a
tree called a dendrogram. Other unsupervised clustering techniques, such as self-
organizing maps and k-means clustering, produce a fixed number of categories
instead of a hierarchy. Principal component analysis (PCA) and singular value
decomposition (SVD) are two other frequently used unsupervised methods
(Quackenbush, 2001; Holzman and Kolker, 2004). Examples of supervised methods
include: support vector machines, linear discriminant analysis, neural networks,
decision trees and k-nearest neighbors (Valdivia Granda, 2003).
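As an illustration of the unsupervised hierarchical approach, the following minimal sketch performs average-linkage agglomerative clustering with a distance of one minus the Pearson correlation. It is not the implementation of Eisen et al. (1998); the gene names and expression profiles are hypothetical, and profiles with constant values are not handled because their correlation is undefined.

```python
import math

def pearson_distance(u, v):
    """1 - Pearson correlation between two expression profiles."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return 1 - cov / (su * sv)

def hierarchical_clustering(profiles):
    """Average-linkage agglomerative clustering.

    profiles: dict mapping gene name -> list of log-ratios.
    Returns the dendrogram as nested tuples of gene names.
    """
    # Each cluster is (subtree, list of member gene names)
    clusters = [(name, [name]) for name in profiles]

    def linkage(c1, c2):
        ds = [pearson_distance(profiles[a], profiles[b])
              for a in c1[1] for b in c2[1]]
        return sum(ds) / len(ds)

    while len(clusters) > 1:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical profiles: two genes rising and two genes falling over four samples
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.5, 2.0],
}
tree = hierarchical_clustering(profiles)
```

The nested tuples correspond to the branches of the dendrogram: genes with similar profiles are joined first, and the two resulting groups are joined last.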
Figure 2. (a) MA-plot: M = log2(R/G) plotted against A = 0.5 log2(RG), where R = Cy5 channel and G = Cy3 channel.
(b) Boxplot of the M-values for each print-tip group.
1.3.6 Differentially expressed genes
One of the goals of microarray experiments is to identify which genes are
differentially expressed between pairs of samples. In the early days of microarrays,
differentially expressed genes were inferred by a fixed threshold cut off method (i.e. a
two-fold increase or decrease), but to reliably conclude that a particular gene is
significantly differentially expressed, statistical methods based on replicate
microarray experiments are needed. To increase data reliability, it has been suggested
that every microarray experiment should be performed in triplicate (Lee et al., 2000).
There are two types of replication, biological and technical. Technical replicates refer
to replication of the same extracted RNA samples, whereas biological replication
usually refers to the analysis of RNA from different individuals. In general, biological
replicates are preferred since technical replicates will not solve the problem of
biological variation (Yang and Speed, 2002). The procedure of selecting differentially
expressed genes can be divided into two steps. First, all genes are ranked in the order
of evidence for differential expression, then a cut-off value is chosen for the ranking
statistic, and all genes with a value above the cut-off are considered to be significantly
differentially expressed (Smyth et al., 2003). Commonly used statistical methods for
identifying differentially expressed genes are Student’s t-test and its variants,
ANOVA, Bayesian methods and the Mann-Whitney test (Leung and Cavalieri, 2003).
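The two-step selection procedure can be sketched with an ordinary one-sample t-statistic on replicate log2 ratios. This is only one of the statistics mentioned above, chosen here for simplicity; the gene names and values are hypothetical, and genes with identical replicate values are not handled because their standard deviation is zero.

```python
import math
import statistics

def t_statistic(log_ratios):
    """One-sample t-statistic testing whether the mean log2 ratio differs from 0."""
    n = len(log_ratios)
    mean = statistics.mean(log_ratios)
    sd = statistics.stdev(log_ratios)
    return mean / (sd / math.sqrt(n))

def rank_genes(replicates):
    """Step 1: rank genes by |t|, strongest evidence first."""
    scores = {gene: t_statistic(values) for gene, values in replicates.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

def select(ranked, cutoff):
    """Step 2: keep genes whose ranking statistic exceeds the cut-off."""
    return [gene for gene, t in ranked if abs(t) > cutoff]

# Hypothetical triplicate log2 ratios
replicates = {
    "gene1": [2.1, 1.9, 2.0],     # consistent four-fold change
    "gene2": [0.1, -0.2, 0.1],    # no change
    "gene3": [-1.5, -0.2, -2.9],  # large but inconsistent change
}
ranked = rank_genes(replicates)
# 4.303 is the two-sided 5% critical value of t with 2 degrees of freedom
selected = select(ranked, cutoff=4.303)
```

In this example gene3 has a mean change larger than two-fold but is not selected, because its replicates disagree. A fixed fold-change threshold would not make this distinction.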
1.3.7 Validation
The difficulties with clone management constitute one of the drawbacks of spotted
cDNA microarrays. Handling large numbers of clones is time consuming and there is
a risk of clone mix-up and contamination. Early reports revealed high error rates in
clone sets (both commercial and academic), and this important realization led to the
requirement that all sequences of a clone set must be checked before using them for
microarray printing (Knight, 2001). When the microarray data has been processed and
is finally analyzed, there are two general approaches to independently validate the
data: in silico analysis and experimental analysis. The in silico method compares the
microarray results with other gene expression data found in the literature or that is
deposited in public or private databases. Common experimental validation approaches
include: reverse transcriptase PCR (RT-PCR), real-time RT-PCR, northern blot and in
situ hybridization (Chuaqui et al., 2002). However, these follow-up experiments can
only include a small subset of the genes represented on the microarray. Finally, it is
important to stress that the mRNA and protein levels do not always correlate, so
further characterization of the gene product is generally required.
1.4 Searching for a function
In the present thesis, gene function means the biochemical function of the end
product, the protein, but sometimes gene function is referred to as a cellular function
(e.g. a role in a signal transduction pathway) or a developmental function (e.g. a role in
organogenesis) (Bouchez and Hofte, 1998). Although knowing the DNA sequence of
a gene and its expression pattern is valuable, these data do not provide direct
information about gene function. In many cases, information about the function of the
deduced protein sequence can be inferred on the basis of its similarity to
better-characterized proteins from other organisms. Although specific terms have been
developed to discuss related gene sequences (Table 2), many predicted open reading
frames cannot be assigned a function simply by similarity to other known proteins.
For instance, more than 30% of the ORFs in Arabidopsis are classified as unknowns
(The Arabidopsis Genome Initiative, 2000). Moreover, knowing that a gene encodes a
transcription factor or is involved in signal transduction does not reveal the specific
cellular processes the gene is involved in (Somerville and Somerville, 1999). Many of
the “known” genes have been assigned a putative function based only on their
similarities to other genes, but annotating genes in this way produces considerable
uncertainty (Osterlund and Paterson, 2002). Gene expression data is also insufficient
for assigning function. In fact, there is probably no single technology or approach that
can provide that information. Instead, different approaches provide pieces of the
complicated puzzle. Thus, in addition to the DNA sequence and transcript profiles, in
situ hybridization, protein localization, heterologous protein expression studies,
protein-protein interaction studies, post-translational modification studies, 3-D
structure studies, and mutant analyses are important methods for deciphering gene
function.
One of the most efficient approaches to determine gene function is the so-called
reverse genetics approach, where a loss-of-function mutation is created in the gene of
interest and the phenotype of the resulting mutant is studied. This is in contrast to the
classical forward genetics approach, which starts with the observation of an
interesting phenotype and ends with tracking down the gene responsible for it. There
are a number of reverse genetics tools available such as RNA interference (RNAi),
virus-induced gene silencing and the use of systematically generated transfer DNA
(T-DNA) or transposon populations. But the reverse genetics approach also has
drawbacks. Many proteins are members of large protein-families, where the
individual members share high sequence similarity and redundant functions. In such
cases, there is a risk that disruption of a single gene may not produce an organism
with a visible or a molecular phenotype. If the gene family is not too large, one could
attempt to isolate mutants in each member of the family and then cross them to
generate double or multiple mutants to overcome this problem. Disruption of an
essential gene causing lethality is another limitation of reverse genetic approaches. To
overcome this problem it is possible to generate leaky alleles or conditional mutants
(Perrin et al., 2001; Tissier and Bourgeois, 2001). Still, knowing which environmental
condition or developmental stage will manifest an interesting phenotype may require
additional experiments.
Table 2. Classification of the different types of gene homology
Term        Definition
Homolog     Related genes that evolved from a single common ancestor
Paralog     Related genes that evolved due to gene duplication within a species
Ortholog    Related genes that evolved after speciation
1.5 Wood formation
1.5.1 Economic importance
Wood formation (xylogenesis) is a fundamental biological process of significant
economic and commercial value. Trees are responsible for the majority of terrestrial
biomass production and constitute an important industrial raw material. Wood is
ranked as the fifth most important product in world trade, and provides biofuel, fibers
and solid wood products, as well as chemical products, plastics, food and textile products
(Plomion et al., 2001). The world’s forest area was estimated to be 3.87 billion
hectares in 2001, which is about one-third of the earth’s land surface, and the total of
above-ground woody biomass in forests was estimated at 420 billion tonnes (FAO,
2001). Global demand for wood is growing at 1.7% annually, so there is a need to
develop forestry in a long-term sustainable way to meet this demand (Fenning and
Gershenzon, 2002). The development of high-yielding short-rotation plantation
forests is one approach to supply the world with more wood without degrading the
remaining natural forests of the world. Biotechnology has significant potential to
contribute to this development. To date, public concern about the risks of using
genetically modified organisms (GMOs) has slowed the development of
genetically modified (GM) trees. However, until the risks of GM tree plantations have
been thoroughly investigated, many benefits can be achieved using biotechnology
without genetic modification, e.g. by utilizing molecular marker-assisted breeding.
Sweden is located in the northern coniferous zone, with spruce and pine being the
most common tree species in forests. The most common Swedish hardwood
(angiosperm) species is birch, followed by aspen and alder. Forestry has long
been a vital part of the Swedish economy. Although the proportion of the population
employed in the forestry sector has decreased, forestry is still one of the most
important industrial sectors in Sweden. For instance, in 2001, forest products worth
110 billion SEK were exported (approximately 15% of the total exports), while
imports of forestry products amounted to 22 billion SEK. Thus, the surplus on the
trade in forest products was 88 billion SEK. Sweden is the world’s fourth largest
exporter of both pulp and paper and the world’s second largest exporter of sawn
timber products (http://www.skogsindustrierna.se).
Forest and forest product biotechnology research was initiated in the pulp and paper
industry in the early 1980s. The idea that microorganisms and their enzymes can
successfully replace or supplement conventional chemical processes in the pulp and
paper industry has now gained acceptance. Biotechnology has the potential to perform
more specific reactions in more environmentally friendly processes with significant
energy savings. Current biotechnological methods in the pulp and paper industry
include bio-pulping, enzymatic bleaching, pitch control and purification of waste
water. Microbial xylanases, peroxidases, lipases and laccases are examples of
enzymes that have been used successfully to improve pulp processing (Mansfield and
Esteghlalian, 2003). Today, the role of forest product biotechnology is expanding
beyond the pulp and paper industry. Newer application areas include genetic
engineering of forest trees to improve forest health (e.g. disease resistance, cold
tolerance and rooting ability) and to enhance properties (e.g. accelerated growth, improved
fiber quality and lignin content/composition), as well as phytoremediation, enzymatic hydrolysis of
cellulose and hemicellulose for bioethanol, production of bio-glued board materials,
utilization of antagonistic organisms for wood protection and the development of new
biocomposites. During the past decade the main focus of pulp and paper
biotechnology has been screening for new potentially interesting enzymes from
microbial sources. Plants provide an alternative and largely unexplored source of
enzymes. While microbial enzymes are wood degrading, plants contain wood
synthesizing enzymes, thus offering us a new set of target enzymes for fiber
modification. These biosynthetic enzymes could be used to add functionality onto
fiber surfaces. In summary, forest product biotechnology is a promising research field,
which could be integrated in almost any step of the process that starts with the birth of
a tree and ends with a finished product.
1.5.2 Studies of wood formation - poplar and other model organisms
It might seem evident that for the study of wood formation, trees should be the model
choice, but in fact weeds and ornamentals can also contribute to the knowledge of this
process. The weed thale cress (Arabidopsis thaliana) was the first plant to have its
genome sequenced, and this model organism has been extremely valuable for plant
biology. Under appropriate growth conditions Arabidopsis undergoes secondary
growth in hypocotyls, and can therefore be used to identify genes involved in wood
formation. The fibers and vessel elements of Arabidopsis are also morphologically
and ultrastructurally similar to those of trees like poplar (Chaffey et al., 2002;
Bhalerao et al., 2003). The ornamental garden plant, Zinnia (Zinnia elegans), can be
used to study certain aspects of wood formation in vitro. Isolated Zinnia mesophyll
cells maintained in liquid culture can be induced to redifferentiate synchronously into
tracheary elements. During this process genes and proteins can be identified at
different stages of xylem formation. The Zinnia tracheary elements form vessel-like
structures in vitro, resembling vessels formed in planta (Fukuda, 1997; McCann,
1997; Milioni et al., 2001; Demura et al., 2002; Milioni et al., 2002; Pesquet et al.,
2003). However, there are a number of wood related processes that cannot be studied
in annual plants (e.g. heartwood formation, the seasonal cycle of cambial
dormancy/activity, wood maturation, the seasonal variation of earlywood and
latewood production). Moreover, there are limited amounts of xylem tissue in both
Arabidopsis and Zinnia. Thus, although other plant models can contribute to our
understanding of certain aspects of wood formation and other important processes of a
tree, most tree-specific questions are best answered using tree models.
Woods are classified as hardwoods (angiosperms) or softwoods (gymnosperms).
There are basic structural differences between the two kinds of wood, but the terms
“hardwood” and “softwood” do not necessarily reflect the relative density or hardness
of wood. Poplars and pines were the two first tree genera to be studied at a molecular
level, and these are still the two most extensively studied tree genera with the largest
published EST-libraries, representing the angiosperms and the gymnosperms,
respectively (Allona et al., 1998; Sterky et al., 1998). There are other on-going public
EST-projects for birch, spruce, black locust and eucalyptus, and EST-collections from
pine and eucalyptus have also been reported by industrial laboratories (Bhalerao et al.,
2003; Campbell et al., 2003; Yang et al., 2003) (Table 3). The
first tree genome sequencing project was initiated in 2002, when The DOE Joint
Genome Institute started sequencing a member of the Populus genus, the black
cottonwood Populus trichocarpa. With this genomic platform, which will be public in
May 2004, Populus will be the most important model tree genus for the coming
years. Though gymnosperms are generally more commercially important, a genome
sequencing effort for a representative gymnosperm is not expected in the near future,
mainly because of the cost of sequencing their large-sized genomes. Furthermore, a
lot of the knowledge obtained by studying Populus will be applicable to gymnosperm
tree species. Poplar has been gradually established as the model tree and the model
hardwood over the years in part because its genome is relatively small. The 500-550
Mbp poplar genome is similar in size to the rice genome, and only 4 times larger than
the genome of Arabidopsis, yet 40 to 50 times smaller than the huge pine genome
(Wullschleger et al., 2002; Brunner et al., 2004). Poplars are also easy to transform;
they were the first trees to be genetically transformed, and they are among the
fastest-growing trees found in the temperate regions. The first EST-project in poplar
reported about 5000 EST sequences from Populus tremula x tremuloides and Populus
trichocarpa (Sterky et al., 1998). Since then, over 130000 ESTs from Populus
tremula x tremuloides, Populus trichocarpa and Populus tremula have been
sequenced in this project (http://www.populus.db.umu.se). Furthermore, 7000 root ESTs
from Populus trichocarpa x deltoides (Kohler et al., 2003), 10000 wood related ESTs
from Populus tremula x alba (Dejardin et al., 2004) and 12000 ESTs from different
tissues of Populus tremuloides (Wullschleger et al., 2002) have also been reported. In
addition to the EST and genomic sequencing efforts, several other resources are
already or will soon be available, including microarrays, a bacterial artificial
chromosome (BAC) physical map, a complete chloroplast DNA sequence, and several
hundred mapped simple sequence repeat (SSR) markers (Bhalerao et al., 2003). In
addition, Populus is closely related to Arabidopsis, which facilitates comparative
functional studies between the two model systems. Altogether this makes poplar an
ideal model system for studying phenomena such as shoot dormancy, adaptations to
deep frost, the presence of juvenile and mature phases, and extensive secondary
growth in forest trees (Mellerowicz et al., 2001).
Table 3. Sequencing projects in trees
Type    Species                           World Wide Web URL
Genome  Populus trichocarpa               http://genome.jgi-psf.org/poplar0/poplar0.home.html
ESTs    Populus tremula,                  http://www.populus.db.umu.se
        Populus trichocarpa,
        Populus tremula x tremuloides
        Populus trichocarpa x deltoides,  http://mycor.nancy.inra.fr/poplardb/aims.html
        Populus tremula x alba
        Populus trichocarpa               http://www.arborea.ulaval.ca/
        Populus tremuloides               http://aspendb.mtu.edu/
        Betula pendula                    http://www.biocenter.helsinki.fi/bi/PLANT/Helariutta/
        Eucalyptus globulus               http://ccgb.umn.edu/biodata/
        Eucalyptus camaldulensis          http://ccgb.umn.edu/biodata/
        Robinia pseudoacacia              http://ccgb.umn.edu/biodata/
        Pinus taeda                       http://pinetree.ccgb.umn.edu/
        Picea glauca                      http://www.arborea.ulaval.ca/
The genus Populus is a member of the Salicaceae (willow) family and the order
Salicales. Populus species are naturally widely distributed over the Northern
Hemisphere with a small representation in tropical Africa (Bradshaw et al., 2000).
Although various classifications have been suggested, a recent classification
recognizes 29 species that are grouped under six separate sections (Eckenwalder,
1996). The only naturally occurring poplar species in Sweden, the European aspen
(Populus tremula), belongs to the section Leuce (aspens) (Fig. 3a). Populus tremula is
a broadly distributed tree, covering almost the whole of Europe and parts of northern
Asia, China and Japan. The closest relatives are White poplar (Populus alba), the
American (quaking) aspen (Populus tremuloides), Siebold aspen (Populus sieboldii),
and the Chinese aspen (Populus adenopoda) (Almgren, 1990). The hybrid aspen
Populus tremula x tremuloides is a cross between European aspen (Populus
tremula) and American aspen (Populus tremuloides) (Fig. 3b). The first cross in
Sweden was made in 1939, and it was found that the hybrid showed a higher growth
rate than either of the parental species (Almgren and Skogsstyrelsen, 1990). From
1940 to 1965, efforts were made to develop hybrid aspen forestry, but due to the
decline of the Swedish match industry (caused by the introduction of the cigarette
lighter), which was the most important supporter of the research, the interest in hybrid
aspen waned. Since the 1980s, Swedish forestry has shown renewed interest in
hybrid aspen as an alternative for short rotation forestry on former agricultural land.
However, the dominant alternative crop in Sweden today is willow (Salix spp.),
planted on approximately 20000 ha, whereas the area with poplar plantations is only
300 ha. Recent studies have shown that poplars and hybrid aspen are superior biomass
producers compared with tree species commonly grown on agricultural land at these
latitudes (Norway spruce and birch); thus, poplar cultivation has a strong potential to
expand in Sweden (Christersson, 1996; Rytter, 2002; Karacic et al., 2003). Although
hybrid aspen or other poplars are not yet planted over large areas in Sweden, the
importance and the knowledge of intensive poplar cultivation for bioenergy and fiber
production in Europe and North America is well documented (Tuskan, 1998; Pellis et
al., 2004).
Figure 3. (a) Populus tremula growing in Lohärad, Sweden. (b) Populus tremula x tremuloides in the
greenhouse. Photo: Henrik Aspeborg.
1.5.3 Xylem differentiation
Wood is secondary xylem, which is produced by the vascular cambium through a series
of developmental steps. Cells formed by the vascular cambium give rise to phloem on
the outside and xylem towards the inside of the tree trunk. As the cambium adds cells
to the secondary xylem, the cambium is displaced outwards, and the circumference
and the stem diameter are increased. The increasing diameter growth is caused by
periclinal divisions and the increasing circumference is accounted for by occasional
anticlinal (radial) divisions. The major steps of wood development are cell division,
expansion, secondary cell wall formation, lignification and programmed cell death. In
angiosperms fusiform cambial cells differentiate inwards into three xylem cell types:
vessel elements, fibers and axial parenchyma; whereas on the outer side they give rise
to the phloem cell variants: sieve tubes, companion cells, axial parenchyma and fibers
(Fig. 4). Xylem has two main functions: water transport and mechanical support,
whereas phloem conducts nutrients synthesized during photosynthesis. Ray cambial
cells give rise to contact and isolation ray cells, which play a role in radial
conductance and storage of nutrients such as starch, proteins and lipids (Mellerowicz
et al., 2001; Plomion et al., 2001). Wood fiber morphology, structure and composition
vary greatly between tree species, but differences within species are also found.
Fibers formed in the spring (earlywood) are
characterized by thin walls, while fibers formed at the end of summer (latewood) have
thick walls. Juvenile wood, formed within the first 10-20 years of a tree’s life and in
the crown of older trees, has a lower density and a higher cellulose microfibril angle
compared to mature wood. Another important variant of wood formation in a tree is
reaction wood. This type of wood is produced to compensate for exposure to wind or
other types of bending pressure. Softwoods form reaction wood on the compressed
side of a bending trunk (compression wood), whereas reaction wood formed by
hardwoods is induced on the upper side of a leaning stem (tension wood).
Figure 4. Macerated wood of Populus tremula x tremuloides showing major cell types: vessel elements, fibers and ray parenchyma cells (image kindly provided by Ewa Mellerowicz).
1.5.4 The plant cell wall
Properties of the wood fiber depend on the biosynthesis of the plant cell wall. The
plant cell wall is an intricate biochemical network that plays a vital role in almost
every aspect of plant physiology. The many different functions of the plant cell wall
include provision of mechanical strength, definition of cell shape, participation in cell-
cell communication, protection against pathogen attack and interaction with
symbionts. The cell wall is composed of several layers that are produced at different
periods during cell differentiation. The middle lamella is the first layer produced
during xylogenesis and it is developed after cell division. The middle lamella ensures
the adhesion of a cell with its neighbors, and consists mainly of pectin and small
amounts of lignin, which are deposited later on. The primary wall is formed at the
beginning of cell differentiation. Populus and other dicots have a Type 1 primary wall
consisting of cellulose microfibrils associated with hemicellulose, which interacts
with a structurally independent pectin network. A third network consists of structural
proteins or a phenylpropanoid network. The primary wall is strong but flexible and
capable of expansion (increase in cell size in two or three dimensions) and elongation
(expansion in only one dimension). Proteins like xyloglucan endotransglycosylases
(XETs) and expansins have possible wall-loosening activities during cell expansion
(Fry et al., 1992; McQueen-Mason et al., 1992). When the cells have reached their
final size, a secondary cell wall is synthesized on the inner side of the primary wall.
The secondary cell wall of wood fibers is divided into three different sublayers, S1,
S2, and S3 (Fig. 5). Cellulose, hemicellulose and lignin are present in each of these
layers. The S1 layer is the thinnest of the S layers, representing only 5% to 10% of the
total thickness of the cell wall. This layer can be regarded as an intermediate between
the primary wall and the inner layers S2 and S3. The thick middle layer S2 accounts
for 75% to 85% of the cell wall, and is therefore the most important layer with regard to wood
fiber properties. S3 is the innermost layer, and is also thin. Because the cellulose
microfibrils are arranged in different directions in these layers, it is possible to
distinguish them visually. Xylem vessel elements and ray cells also form secondary
walls with three different S-layers (Mellerowicz et al., 2001).
Figure 5. A schematic cross section of the fiber cell wall that illustrates the different cell wall layers
and the structure of the secondary cell wall. The orientation of the cellulose microfibrils differs between
the cell wall layers.
1.5.5 Carbohydrate biosynthesis and turnover
1.5.5.1 Carbohydrate-active enzymes involved in cell wall biosynthesis
Before discussing the biosynthesis of different polysaccharides in the plant cell walls,
a description of key players in this process is needed. There is a vast number of
carbohydrate-active enzymes (CAZymes) involved in the synthesis, modification and
degradation of oligosaccharides and polysaccharides. The synthesis of glycosides is
performed by glycosyltransferases (GTs), modification is a task for e.g. carbohydrate
esterases (CEs), and the breakdown of glycosidic linkages is performed by glycoside
hydrolases (GHs) and polysaccharide lyases (PLs). Genes encoding glycoside
hydrolases and glycosyltransferases have been classified in sequence- and structure-
based families (Henrissat and Davies, 1997; Coutinho et al., 2003). All of the
glycoside hydrolase and glycosyltransferase families, as well as families of
polysaccharide lyases, carbohydrate esterases and carbohydrate-binding modules
(CBMs), can be found at the continuously updated Carbohydrate-Active enZyme web
server (http://afmb.cnrs-mrs.fr/CAZY/). In March 2004, there were entries for 93
families of glycoside hydrolases and transglycosidases, 70 families of
glycosyltransferases, 14 families of polysaccharide lyases, 14 families of carbohydrate
esterases and 34 families of carbohydrate-binding modules. While the mechanism and
3-D structure are known for many glycoside hydrolases, characterization of the
glycosyltransferases has been much slower (Davies and Henrissat, 2002). The power
of the sequence-based family classification is that once the mechanism is established
for a member of a family, this mechanism is apparently shared by all other members
of the family (Henrissat et al., 2001). For instance, an unknown glycosyltransferase
can always be annotated as a putative inverting (or retaining) glycosyltransferase from
family GTxx. An alternative enzyme classification is provided by the International
Union of Biochemistry and Molecular Biology (IUBMB), which assigns an enzyme
commission (EC) number to an enzyme based on donor, acceptor and product
specificity. However, issuing an EC number is problematic when dealing with
enzymes that act on several distinct substrates, and furthermore, the EC numbers do
not consider the structural and mechanistic features of the enzymes (Coutinho et al.,
2003). From the analysis of completed genomes, it is clear that plants (Arabidopsis)
feature far more glycoside hydrolases and glycosyltransferases in relation to total
number of putative genes than sequenced genomes from any other organism
(Henrissat et al., 2001). One explanation for this is that a vast number of
carbohydrate-active enzymes is required for the biosynthesis and modification of the
complex polysaccharides found in the plant cell wall (Coutinho et al., 2003).
1.5.5.2 Cellulose biosynthesis
Cellulose is the most abundant polysaccharide in nature with approximately 180
billion tons produced each year (Delmer, 1999). Cellulose is composed of (1→4)-β-
D-glucan chains, which are synthesized in plants by large plasma membrane-bound
protein complexes. These enzyme complexes are organized in hexagonal structures
called rosettes. As recently as 1995 not a single plant gene encoding a cellulose synthase
had been cloned or identified. Since the first CesA genes were identified in the bacteria
Acetobacter xylinum and Agrobacterium tumefaciens (Saxena et al., 1990; Matthysse
et al., 1995), which produce extracellular ribbons of cellulose, many probable CesAs
have been identified in plants. The most extensive studies have so far been conducted
in Arabidopsis and cotton. For instance, at least 10 different CesA genes that encode
catalytic subunits of the cellulose synthase complex have been found in Arabidopsis,
and have been shown by mutant analyses to play distinct roles in cellulose
biosynthesis (Doblin et al., 2002). Like the bacterial CesAs, the plant CesAs have
eight predicted transmembrane domains, two at the N-terminus and six at the C-
terminus. In addition, plant CesAs contain two unique domains, a “plant conserved
region” (CR-P) and a “hypervariable region” (HVR) (Pear et al., 1996; Delmer,
1999). Ancillary proteins or enzymes are likely required for the extrusion of cellulosic
chains and the assembly of microfibrils. Korrigan, a membrane-bound plant
endoglucanase, is the only other protein that has been strongly linked to the process of
synthesizing cellulose. Other proteins including sucrose synthase, cytoskeleton
components, Rac13, redox proteins and a lipid transfer protein have also been
implicated in cellulose biosynthesis. Enzymes encoded by CesA genes belong to the
large glycosyltransferase family GT2 (Henrissat et al., 2001). Expression analyses
have suggested that some CesAs have a role in secondary cell wall synthesis, while
others are important for the primary wall synthesis (Pear et al., 1996; Gardiner et al.,
2003; Tanaka et al., 2003; Taylor et al., 2003; Burton et al., 2004; Liang and Joshi,
2004).
1.5.5.3 Hemicellulose biosynthesis
Hemicellulose is the second most abundant polysaccharide in nature. Whereas
cellulose polymerization occurs at the plasma membrane, hemicellulose biosynthesis
takes place in the Golgi apparatus, followed by deposition in the cell wall by
exocytosis. While the chemical structure of cellulose is the same for all plants,
hemicelluloses are a heterogeneous group of carbohydrate polymers, which vary
between different cell walls and plant taxa. The major hemicellulose of hardwoods
(angiosperms) is O-acetyl-(4-O-methylglucurono)-xylan, but a smaller portion of
glucomannan is also present. The dominating softwood (gymnosperm) hemicellulose
is an O-acetyl-galactomannan accompanied with a minor amount of arabino-(4-O-
methylglucurono)-xylan. Xyloglucan is the major primary wall hemicellulose of Type
1 cell walls found in most dicots, conifers and in selected monocots except grasses
(Brett and Waldron, 1996). The hemicellulose biosynthetic enzymes identified so far
are mostly glycosyltransferases involved in the biosynthesis of the primary cell wall
hemicellulose xyloglucan (Perrin et al., 1999; Faik et al., 2000; Faik et al., 2002;
Madson et al., 2003). However, the first identified and characterized enzyme involved
in cell wall biosynthesis was an alpha-galactosyltransferase from fenugreek
(Trigonella foenum-graecum), which was involved in the biosynthesis of the storage
carbohydrate galactomannan (Edwards et al., 1999). All the above mentioned
glycosyltransferases are responsible for decorating the hemicellulose backbones with
side chains. A glycosyltransferase involved in synthesizing the backbone of
galactomannan in guar (Cyamopsis tetragonoloba) was only recently identified.
Interestingly, this β-(1→4)-mannan synthase is related to CesA genes (Dhugga et al.,
2004).
1.5.5.4 Pectin biosynthesis
Pectic substances constitute a highly complex family of polysaccharides that are rich
in D-galacturonic acid (GalA). Since pectins are found mainly in the plant primary
cell walls and middle lamella, they are a minor component in wood. Still, pectins play
an important role during wood formation. Three classes of pectins can be
distinguished based on two different backbone configurations: homogalacturonan
(HG), rhamnogalacturonan I (RGI) and rhamnogalacturonan II (RGII). HG and RGII
have backbones of α(1→4)-linked galacturonic acid, while RGI contains a backbone
of alternating rhamnose and galacturonic acid. In a recent model of the dicot primary
wall, HG and other pectins are portrayed as side-chains of RGI (Vincken et al., 2003).
It has been estimated that more than 50 glycosyltransferases are required for the
biosynthesis of the pectic polysaccharides (Ridley et al., 2001). However, only three
glycosyltransferases involved in pectin biosynthesis have been identified so far: a
family GT8 glycosyltransferase from Arabidopsis was linked to HG synthesis
(Bouton et al., 2002), a mutation in another family GT8 glycosyltransferase affected
RGI (Lao et al., 2003), and a family GT47 from Nicotiana plumbaginifolia was
reported to be involved in constructing RGII (Iwai et al., 2002). Two genes involved
in the nucleotide sugar interconversion pathway were also shown to be involved in
pectin biosynthesis: a GDP-Man-4,6-dehydratase (MUR1) is involved in RGII synthesis (Bonin
et al., 1997), and recently an NDP-rhamnose synthase was shown to influence RGI
synthesis (Usadel et al., 2004).
1.5.6 Other cell wall components
1.5.6.1 Lignin
Lignin is the third major component of wood and second only to cellulose as the most
abundant organic compound on earth. Lignin is a highly branched phenolic polymer
consisting of three different phenylpropane monomers: p-coumaryl alcohol, coniferyl
alcohol, and sinapyl alcohol. The relative amount of each monomer differs depending
on whether the lignin is from gymnosperms or angiosperms. Lignin is the last
secondary cell wall polymer synthesized, and it is believed to act as ‘glue’ between
the cell wall carbohydrates. Every lignin deposition event is preceded by carbohydrate
deposition (i.e. when S1 formation is initiated, lignification occurs in the middle
lamella and primary wall, after the S2 layer is completed lignin is deposited in the
secondary cell wall, while the bulk of lignin is deposited after the carbohydrates have
been laid down in the S3 layer). Lignin deposition is influenced by the chemical
nature of the cell wall carbohydrates and the orientation of the cellulose microfibrils
(Boerjan et al., 2003). Although lignin biosynthetic pathways have been re-evaluated
recently, most of the enzymes known to participate in monolignol biosynthesis have
been cloned and the overall process is relatively well understood (Grima-Pettenati and
Goffner, 1999; Dixon et al., 2001; Humphreys and Chapple, 2002; Baucher et al.,
2003; Boerjan et al., 2003).
1.5.6.2 Cell wall proteins
Although the cell wall is mainly composed of polysaccharide polymers, structural
proteins are also important components of the cell wall. Four major classes of
structural proteins have been recognized: hydroxyproline-rich glycoproteins
(HRGPs), the proline-rich proteins (PRPs), glycine-rich proteins (GRPs) and
arabinogalactan proteins (AGPs). Most of the cell wall proteins are glycosylated. The
AGPs are so heavily glycosylated (more than 95% might be carbohydrate) that
proteoglycans is a more suitable name. Other proteins found in the cell wall are wall-
associated kinases, lectins, expansins and cell wall modifying enzymes.
1.5.6.3 Extractives
These compounds are named extractives because they can be removed from wood by
extraction with solvent. Wood contains 0.4 to 8.3% extractives by dry weight
depending on the species. The extractives can generally be divided into three types,
i.e., terpenes, resins and phenols. The presence of extractives affects the color of
wood. The wood resins, lipophilic compounds such as free fatty acids, resin acids,
waxes, fatty alcohols and sterols, form colloidal pitch during wood pulping. Pitch
deposition results in low quality pulp and causes problems in paper machine efficiency
(Gutierrez et al., 2001).
2 OBJECTIVES OF THE PRESENT INVESTIGATION
The present thesis has employed a functional genomics approach to identify enzymes
involved in wood and fiber formation (i.e. the secondary cell wall formation) in
hybrid aspen. The specific aims were:
• To set up a microarray-based transcript profiling system for poplar
• To use transcript profiling as a tool to screen for genes upregulated during
secondary cell wall formation
• To use bioinformatic tools to identify genes encoding carbohydrate-active
enzymes with special emphasis on those upregulated during secondary cell wall
formation
• To obtain full length cDNA sequences for a selected number of carbohydrate-
active enzymes and genes with unknown function
3 RESULTS AND DISCUSSION
3.1 Transcript profiling of woody tissues
3.1.1 Development of a target amplification method (I)
Although the large-scale EST-sequencing initiative of Populus, founded by the
Swedish Center for Tree Functional Genomics (SCTFG), is an excellent resource for
gene discovery, it lacks the high resolution needed to analyze specific cell types.
Microarray analysis of wood formation could provide information about stage specific
gene expression, but microarrays are generally restricted by the amount of available
RNA. In order to perform microarray-based gene expression analysis on sampled thin
30 µm wide sections of the wood forming zone, a target amplifying method was
developed (paper I). PCR is a powerful method that allows amplification of mRNA
populations, but the amplification introduces a bias for shorter fragments due to the
more efficient amplification of these transcripts compared to longer ones. To
overcome this bias, the transcriptome was randomly fragmented by sonication prior to
the PCR step. Subsequently, 3’ cDNA tags for each transcript were isolated and
amplified, followed by labeling using asymmetric PCR (Fig. 6). A small microarray
containing 192 hybrid aspen clones printed in triplicate, as well as a microarray
containing 2995 hybrid aspen clones printed in duplicate, were hybridized with targets
isolated from xylem and phloem to assess the accuracy of the 3’ cDNA tag
amplification method. By comparing microarray data resulting from the amplification
protocol and 0.1 µg of total RNA with microarray data resulting from standard target
production techniques and 1 µg mRNA, it was estimated that a 2-fold difference in
expression could be distinguished with 99% confidence using the new method. To
further demonstrate the usefulness of the amplification technique, the transcript
profiles of two phloem sections representing different stages of development were
compared using the small array containing 192 clones. Sample A consisted of
differentiating phloem directly adjacent to the cambial cells, while sample B consisted
of further differentiated and more mature phloem cells (e.g. cells with secondary cell
walls). This experiment identified a few cell wall related genes. For instance, an
expansin gene was upregulated in sample A, consistent with the rapid expansion of
these cells. A gene upregulated in sample A that was annotated as similar to the
Arabidopsis F1N21.7 protein has been reannotated as a glycosyltransferase from
family GT13. Among the upregulated genes found in sample B, two lignin related
genes were identified: a phenylalanine ammonia-lyase (PAL) gene, and a blue copper
protein gene. At present, only two additional comprehensive and large-scale studies of
phloem gene expression have been published in plants (Oh et al., 2003; Vilaine et al.,
2003).
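The comparison between the amplified and the standard protocol described above can be illustrated with a small sketch. It is not the procedure of paper I, only one plausible way to ask the same question: for each clone, take the difference between the log2 ratios (xylem/phloem) obtained with the two protocols and report the fraction of clones that agree to within a factor of two. All values and names (`amplified`, `standard`, `fraction_within_fold`) are invented for illustration.

```python
import math

def fraction_within_fold(log2_a, log2_b, fold=2.0):
    """Fraction of clones whose two log2 ratios differ by less than `fold`."""
    limit = math.log2(fold)
    diffs = [abs(a - b) for a, b in zip(log2_a, log2_b)]
    return sum(d < limit for d in diffs) / len(diffs)

# Hypothetical log2(xylem/phloem) ratios for the same five clones.
amplified = [1.8, -0.4, 0.1, 2.6, -1.9]  # 3' tag amplification, 0.1 µg total RNA
standard = [2.1, -0.1, 0.0, 1.4, -2.2]   # standard target production, 1 µg mRNA

# Share of clones for which the two protocols agree within 2-fold.
print(fraction_within_fold(amplified, standard))
```

In a real evaluation the fraction would be computed over all clones that pass the quality filter, and a value of 0.99 or higher would correspond to the 99% figure quoted in the text.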
Figure 6. A schematic overview of the 3’ target amplification method. Labels in the figure include: mRNA isolation from 0.1 - 1 µg of total RNA; 1st and 2nd strand cDNA synthesis (oligo(dT)-NotI-biotin primer); fragmentation by sonication; immobilization onto streptavidin coated beads; repair (T4 DNA polymerase) to blunt ends; ligation to adaptors; release of 3’ tags by restriction (NotI and BamHI sites are marked); PCR amplification (20-25 cycles) and labelling by asymmetric PCR (10 cycles); hybridization to microarray.
3.1.2 Transcript profiling during xylogenesis (II) (III) (V)
3.1.2.1 A transcriptional roadmap to wood formation (II)
A large number of genes potentially involved in wood formation was identified in the
first poplar EST-library (Sterky et al., 1998). To gain insight into how expression of
these genes is regulated during development and differentiation of wood, a large scale
gene expression analysis tool, such as that provided by cDNA microarray technology,
was needed. Therefore, the EST sequences originating from the cambial region and
described in Sterky et al. (1998) were clustered to create a unigene set of 2995 genes,
and this collection was PCR-amplified and spotted in duplicate onto microarrays.
Development of secondary xylem is a highly organized process and recognizable
boundaries exist between different developmental stages. The ordered nature of
xylogenesis was exploited when obtaining tissue sections by tangential
cryosectioning (Uggla et al., 1996) that represented (Ph) phloem, (A) cambium, (B) early
expansion, (C) late expansion, (D) secondary cell wall formation and (E) late cell
maturation (e.g. programmed cell death). Target preparation from these samples
integrated the non-biased RNA amplification method described in paper I.
An experiment spanning a wide developmental gradient (i.e. from meristematic cells
to cells undergoing programmed cell death) using an array consisting of relatively few
genes requires careful experimental design. Normalization and choice of reference
were two critical factors that warranted considerable thought. Since the majority of
the genes were expected to be differentially expressed across the developmental
gradient, commonly used normalization methods could not be applied. Therefore,
instead of using all of the genes on the array, spiked human controls were used for
normalization. Because there is no obvious reference tissue section in developing
xylem, a mixture of equal amounts of the individual samples (A+B+C+D+E) was
used as a common reference. This ensures that genes expressed in only one tissue will
be represented in the reference, thereby decreasing the risk of losing such genes in
the quality filter process. However, the transcript profiles of genes that are highly
expressed in one sample will be truncated because that sample comprises 1/5 of the
reference.
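The truncation follows directly from the composition of the reference. A minimal sketch, with made-up transcript levels and a hypothetical helper name (`ratios_vs_pooled_reference`), shows the ceiling for a gene that is expressed in only one of the five pooled samples:

```python
import math

def ratios_vs_pooled_reference(levels):
    """Sample/reference ratios when the reference is an equal-parts pool."""
    reference = sum(levels) / len(levels)
    return [level / reference for level in levels]

# Hypothetical transcript levels in sections A, B, C, D, E:
# the gene is expressed only in section D.
only_in_d = [0.0, 0.0, 0.0, 100.0, 0.0]
ratios = ratios_vs_pooled_reference(only_in_d)

print(ratios[3])             # ratio of section D against the pooled reference
print(math.log2(ratios[3]))  # the same ratio on the log2 scale
```

With n samples pooled in equal amounts the ratio can never exceed n, so in this design no gene can show more than a 5-fold (about 2.3 log2 units) induction relative to the reference, however specific its expression is.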
The results showed that a large proportion of the examined genes were under strict
developmental stage-specific transcriptional regulation. More than 1200 genes of the
almost 3000 represented on the microarray showed 4-fold differential expression
between the different stages (max ratio/min ratio > 4) and over 800 showed a greater
than 8-fold alteration. A cluster analysis of the latter group revealed gene classes with
developmentally regulated transcript profiles (paper II: Fig. 1 C and D, I-X). For
instance, genes induced during secondary cell wall formation were discriminated,
indicating a role for these genes in this particular process. A significant number of the
genes represented on the microarray showed no or weak similarity to any deposited
sequences in the public databases. Another interesting group contained genes showing
high similarity to Arabidopsis proteins with unknown function or annotated as
hypothetical proteins. More than 200 of the latter group were differentially expressed
during xylem development. Transcript profiles of 72 hybrid aspen genes that were not
assigned function through similarity searches are shown in Fig. 7.
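The filter criterion used above (max ratio/min ratio > 4) is easy to express on log2 ratios, where a 4-fold range corresponds to a difference of 2 and an 8-fold range to a difference of 3. The sketch below uses invented profiles and gene names across sections A to E; it illustrates the criterion only and is not the analysis code of paper II.

```python
def fold_range(log2_ratios):
    """Max ratio divided by min ratio, computed from log2 ratios."""
    return 2 ** (max(log2_ratios) - min(log2_ratios))

# Hypothetical log2(sample/reference) profiles for sections A, B, C, D, E.
profiles = {
    "gene_x": [-1.5, -1.0, 0.0, 1.0, 2.0],
    "gene_y": [-0.2, 0.1, 0.3, 0.0, -0.1],
    "gene_z": [-1.0, -0.5, 0.0, 1.2, 0.5],
}

over_4_fold = [g for g, p in profiles.items() if fold_range(p) > 4]
over_8_fold = [g for g, p in profiles.items() if fold_range(p) > 8]
print(over_4_fold, over_8_fold)
```

Only profiles that pass the stricter 8-fold cut would enter a cluster analysis such as the one described for the group of more than 800 genes.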
Figure 7. 72 D-upregulated genes with unknown function. Transcript profiles are plotted as log2 ratio (y-axis, -4 to 5) across the sections Ph, A, B, C, D and E (x-axis).
For numerous genes, the transcript profiles obtained across the developing xylem
matched the theoretical patterns predicted by the known function of these genes.
Characteristic cellular processes and corresponding genes predicted for each
developmental stage are summarized in Fig. 8. A high proportion of differentially
expressed genes were either cell wall biosynthesis related or possibly involved in
cytoskeleton formation or trafficking. Cellulose is the major component of wood and
in accordance with this fact, two out of the three CesA genes represented on the array
were upregulated during secondary cell wall formation (D). As expected most of the
genes encoding lignin biosynthetic enzymes were upregulated in the later stages of
xylogenesis (D and E). Cortical microtubules (MTs) have been suggested to facilitate
the alignment of cellulose microfibrils as they are deposited in the cell wall (Baskin,
2001). Notably, there were 14 different tubulin genes present in the poplar unigene set
and ten were strongly upregulated during late expansion (C) and secondary wall
biosynthesis (D).
Figure 8. A schematic representation of the different tissue sections.
3.1.2.2 A transcriptional roadmap to wood formation revisited (V)
For this experiment a larger Populus microarray was used consisting of 13490 genes
spotted in duplicate. This unigene set was based on 36354 ESTs originating from
seven different tissue specific cDNA libraries: cambial region, active cambium,
dormant cambium, tension wood, mature leaf, senescent leaf, and flower bud
(Andersson et al., 2004). In addition, a number of control elements were spotted
including human and archaeal DNA fragments as well as the commercial Lucidea
Scorecard kit (Amersham Biosciences, Uppsala, Sweden). Two genes, which were
represented in all of the seven different cDNA libraries, were also spotted in 2 x 36
copies to help evaluate the performance of the microarrays. The target samples were
the same as in paper II, except that the phloem section was excluded. The
amplification, labeling and hybridization steps were performed as previously
described. The data were normalized using intensity-dependent locally weighted
scatterplot smoothing (lowess). While the earlier experiment described in paper II
studied the complete developmental gradient from A-E, this study focused only on
genes related to secondary cell wall formation, i.e. those that were upregulated in zone
D. This experiment identified 1140 secondary cell wall related genes, whereas
approximately 300 secondary cell wall related genes were identified in paper II. The
proportion of secondary cell wall related genes is similar between the two
experiments: 8.5% and 10%, respectively. However, more than 50% of the secondary
cell wall related carbohydrate-active enzymes were only discovered using the smaller
array, showing that the cambial region cDNA clones spotted on those arrays were
enriched with gene fragments encoding these types of enzymes. Out of the 50 most
upregulated genes in zone D that were identified using the smaller array, 90% were
also found upregulated in zone D using the 13490-element microarray. Explanations
for the differences observed using the different microarrays include: genes not passing
the quality filter, genes having a ratio close to the cut-off value and difference in
normalization between the two experiments.
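The intensity-dependent normalization mentioned above can be sketched as follows. This is a simplified illustration, not the software used in paper V: it fits the log ratio M as a smooth function of the mean log intensity A by locally weighted linear regression with tricube weights and subtracts the fit, but it omits the robustness iterations of full lowess. The function names, the `frac` default and the example data are assumptions made for the example.

```python
def tricube(u):
    """Tricube kernel: weight falls smoothly to zero at |u| = 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0

def local_fit(a_values, m_values, frac=0.5):
    """Locally weighted linear fit of M on A, evaluated at every spot."""
    n = len(a_values)
    k = min(n - 1, max(2, int(frac * n)))  # neighbours per local window
    fitted = []
    for a0 in a_values:
        dists = sorted(abs(a - a0) for a in a_values)
        h = dists[k] or 1.0  # window half-width; fallback if many identical A
        sw = sx = sxx = sm = sxm = 0.0
        for a, m in zip(a_values, m_values):
            x = a - a0  # centre on the spot being fitted
            w = tricube(x / h)
            sw += w
            sx += w * x
            sxx += w * x * x
            sm += w * m
            sxm += w * x * m
        denom = sw * sxx - sx * sx
        if abs(denom) < 1e-12:
            fitted.append(sm / sw)  # degenerate window: weighted mean
        else:
            slope = (sw * sxm - sx * sm) / denom
            fitted.append((sm - slope * sx) / sw)  # intercept = fit at a0
    return fitted

def lowess_normalize(a_values, m_values, frac=0.5):
    """Subtract the intensity-dependent trend from the log ratios."""
    fit = local_fit(a_values, m_values, frac)
    return [m - f for m, f in zip(m_values, fit)]

# Hypothetical spots with no true differential expression, only a dye
# bias that drifts with intensity A.
a_values = [6 + 0.5 * i for i in range(20)]
m_values = [0.8 - 0.05 * a for a in a_values]
normalized = lowess_normalize(a_values, m_values)
print(max(abs(m) for m in normalized))
```

Because the trend is estimated from all spots, this kind of normalization relies on most genes being unchanged at any given intensity, which is one reason the two experiments could differ in their normalization strategy.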
3.1.2.3 Transcript profiling of transgenic trees with increased secondary growth (III)
Numerous studies have stressed the importance of the plant hormone auxin for wood
formation. However, gibberellins (GAs) are also important regulators of secondary
growth in the tree trunk (Ridoutt et al., 1996; Eriksson et al., 2000). The “wood”
microarray described in paper II was used to investigate how GA 20-oxidase
overexpression in transgenic hybrid aspen (Eriksson et al., 2000) affected gene
expression in the developing xylem. The phenotype of transgenic lines overexpressing
GA 20-oxidase is distinguished by large increases in secondary growth, i.e. the number
of xylem fibers and the fiber length are increased compared to wild-type (wt) plants. The
transgenic plants contain higher levels of both gibberellins and auxin, which suggests
that the increased secondary growth might be the outcome of the combined effect of
elevated auxin and GA levels.
A large proportion of the genes upregulated in GA 20-oxidase overexpressing plants
encoded proteins involved in cell wall formation and cell wall expansion. Several
pectin lyases from family PL1, which might have a role in remodeling the primary
wall during wall extension, were induced. A family PL1 gene from Salix gilgiana is
expressed in elongating and differentiating tissues (Futamura et al., 2002). However,
the pectin methyl esterases discussed in paper III were wrongly annotated. They are
now annotated as multi-copper oxidases, which are distantly related to ascorbate
oxidases and laccases. The Arabidopsis ortholog SKU5, which corresponds to
AI164970 and AI163151, is most strongly expressed in expanding tissues (Sedbrook
et al., 2002). One pectin methyl esterase from family CE8 was downregulated in the
transgenic trees. Other cell wall related genes that were induced in the GA 20-oxidase
overexpressing trees included an endoglucanase (cellulase) from family GH9, two
glycoside hydrolases from family GH17, a putative xylosidase from family GH3 and a
few expansins. The upregulated endoglucanase was named PttCel9B and is soluble in
contrast to other plant endoglucanases, including ones found in poplar, that are
membrane-bound (Rudsander et al., 2003). Orthologs of PttCel9B that have been
identified so far act on cellulose as well as on xyloglucan (Ohmiya et al., 2000;
Woolley et al., 2001). The GH17 family of glycosidases contains enzymes able to
degrade β(1→3)- or β(1→3)-(1→4)-glucans. One of the GH17 genes contains an
additional module of unknown function called X8 (Henrissat and Davies, 2000). The
putative beta-xylosidase is most likely involved in pectin remodeling. Expansins have
been extensively studied and are cell wall proteins that promote cell wall expansion
by a largely unknown cell wall loosening mechanism. Interestingly, rice expansins are
induced by gibberellin (Lee and Kende, 2001). Cellulose biosynthesis is important for
cell expansion of the primary cell wall, thus induced expression of sucrose synthase
(family GT4) is consistent with a role in channeling sucrose into UDP-glucose, a
precursor of cellulose.
A number of flavonoid and lignin biosynthesis related genes were also upregulated in
the transgenic plants, including an O-methyltransferase, a putative sinapyl
alcohol dehydrogenase (SAD), a dirigent protein, and an isoflavone reductase. The
obtained full length sequence of SAD was similar to the Populus tremula PtSAD gene,
which might be essential for the biosynthesis of syringyl (S) lignin in angiosperms (Li
et al., 2001). Two other S lignin biosynthesis genes were also upregulated, while the
expression of a G lignin specific cinnamyl alcohol dehydrogenase (CAD) was
constant. Notably, a higher S/G lignin ratio was observed in the transgenic plants,
which is in accordance with our expression data. To identify the regions of the
developing xylem that are most strongly affected by GA overproduction, a
comparison was made with the tissue-specific expression data originating from paper
II. The majority of the genes induced in the GA 20-oxidase overexpressing trees had
highest expression in zones A-C, i.e. the sites of cell proliferation and cell expansion.
This finding fitted well with the known function of these genes in cell expansion.
3.1.3 Transcript profiling of cellulose synthases (IV)
Among the over 100000 ESTs found in the Populus EST database, 10 genes encoding
cellulose synthases (CesAs) were identified by manual bioinformatic analyses. Full
length cDNA sequences were obtained for five of the poplar CesA genes. A
comparison of the number of EST sequences that corresponded to each of the poplar
CesA genes in the different tissue-specific EST libraries, suggested that they were
differentially regulated. To learn more about the specific roles of the individual CesA
genes in hybrid aspen, transcript profiling experiments were performed for the five
full length genes using real time RT-PCR, and for all identified CesAs using cDNA
microarrays, comparing xylem with phloem and tension wood with opposite side
wood. The microarrays used in this study consisted of 13490 genes spotted in
duplicate, i.e. the microarray also used in paper V. From the microarray experiments it
was observed that the genes encoding four of the putative CesA isoenzymes,
PttCesA1, PttCesA3-1, PttCesA3-2, and PttCesA9, were clearly upregulated in xylem.
The expression patterns of these four genes are consistent with their phylogenetic
relationship to xylem specific Arabidopsis CesA isoenzymes (paper IV: Fig. 2). The
expression of the remaining CesA genes was more or less constant. Real-time RT-
PCR confirmed the results of the microarray experiments, showing upregulation of
PttCesA1, PttCesA3-1 and PttCesA3-2 in xylem, while PttCesA2 and PttCesA4 were
found to be more evenly expressed in the studied tissues.
3.2 Identification of carbohydrate active enzymes involved in xylogenesis (V)
Despite the importance of xylem secondary cell wall processes in trees for producing
renewable fibers for the industry, the identification of enzymes involved in
synthesizing secondary walls is only in its infancy. A major goal of the present thesis
was to identify carbohydrate-active enzymes involved in the secondary cell wall
biosynthesis in xylem. Fragments of genes encoding cell wall related enzymes such as
cellulose synthases, cellulases, xyloglucan endotransglycosylases (XETs), and
xylanases were discovered in the cambial region EST library of hybrid aspen (Sterky
et al., 1998), and transcript profiling techniques were used to identify genes encoding
enzymes involved in secondary cell wall formation (paper II). By sequencing in total
19 different tissue specific cDNA libraries from Populus tremula, Populus tremula x
tremuloides and Populus trichocarpa, the EST collection has reached over 100000
sequences (Sterky et al., 2004). The ESTs have also been assembled into clusters and
contigs to form consensus sequences that will facilitate the annotation and thus also
enzyme discovery.
In order to get unique insight into which enzymes are important for constructing the
xylem secondary cell wall, we mined the microarray datasets from papers II, IV and V
to identify genes that are specifically upregulated during this process (zone D). In
addition, all Populus ESTs, contig sequences and full length sequences were screened
for carbohydrate-active enzymes, with the bioinformatic tools routinely used to assign
proteins to the Carbohydrate-active enzyme database (CAZy) (Henrissat and Davies,
2000). The advantage of these tools is that each gene is compared against a collection
of more than 45500 individual catalytic and ancillary associated modules separately,
so the family assignment is not disturbed by the frequent modularity of carbohydrate-
active enzymes (Fig. 9). In this way, incorrect annotations commonly introduced by
BLAST-based automated annotation approaches are avoided.
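The benefit of comparing modules separately can be illustrated with some of the modular architectures shown in Fig. 9. The sketch below is not the CAZy annotation pipeline; it only shows, with a hypothetical `classify` helper, why recording every module keeps the catalytic family visible when a protein also carries several ancillary modules. Grouping by name prefix (GH, GT, PL and CE as catalytic, CBM as carbohydrate-binding) is a simplification made for this example.

```python
CATALYTIC_PREFIXES = ("GH", "GT", "PL", "CE")

def classify(modules):
    """Split a list of module names into catalytic, binding and other."""
    catalytic = [m for m in modules if m.startswith(CATALYTIC_PREFIXES)]
    binding = [m for m in modules if m.startswith("CBM")]
    other = [m for m in modules if m not in catalytic and m not in binding]
    return {"catalytic": catalytic, "binding": binding, "other": other}

# Module content as listed in Fig. 9.
architectures = {
    "PttXyn10": ["CBM22", "CBM22", "CBM22", "GH10"],
    "At2g39640": ["GH17", "X8", "X8"],
    "At3g12500": ["CBM18", "GH19"],
}

for name, modules in architectures.items():
    print(name, classify(modules))
```

For PttXyn10, a whole-sequence best hit could be dominated by the three CBM22 modules, whereas the module-wise view still reports GH10 as the single catalytic family.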
Figure 9. Examples of modular carbohydrate-active enzymes from plants. Modules shown: PttXyn10 (CBM22, CBM22, CBM22, GH10); At2g39640 (GH17, X8, X8); At3g12500 (CBM18, GH19); At1g14070 (GT37, GT37, UNK).
29 glycosyltransferases from 8 different families and 9 glycoside hydrolases from 8
different families were identified as secondary cell wall related (Table 4). Genes
encoding either polysaccharide lyases or carbohydrate esterases were not identified
among the secondary wall specific genes. Although putative functions can be assigned
to a few of the identified genes (e.g. cellulose synthases in family GT2, the β-
galactosidases in the family GH35), the specific function of the majority of the genes
remains unknown. However, possible roles include: remodeling the primary wall,
hemicellulose biosynthesis or protein glycosylation. The results of the bioinformatic
analysis can also be used to reannotate the datasets from previous microarray
experiments, thereby identifying new candidate genes with interesting transcript profiles
that encode carbohydrate-active enzymes.
Table 4. Identified secondary cell wall related CAZy-families
Glycosyltransferases: GT1, GT2, GT8, GT14, GT31, GT43, GT47, GT61
Glycoside hydrolases: GH9, GH10, GH16, GH17, GH19, GH28, GH35, GH51
Hemicelluloses such as xyloglucan (Pauly and Scheller, 2000), glucuronoxylan
(Timell, 1967) and glucomannan (Teleman et al., 2003), as well as pectins (Pauly and Scheller,
2000), are often O-acetylated, but enzymes responsible for the O-acetylation of these
polysaccharides have not been identified. The biological role of the O-acetyl
substituents is not known, although they might minimize enzymatic polysaccharide
breakdown. It has been shown that acetyl-CoA is the donor substrate for O-
acetylation of cell wall polysaccharides (Pauly and Scheller, 2000). Recent results
provide evidence that there is a positive correlation between xyloglucan O-acetylation
and fucosylation, suggesting that a suitable substrate for at least one Arabidopsis O-
acetyltransferase is fucosylated xyloglucan (Perrin et al., 2003). Cas1p is a membrane
protein required for the O-acetylation of glucuronoxylomannan in the fungal pathogen
Cryptococcus neoformans, and four orthologs were found in Arabidopsis (Perrin et
al., 2003). By performing a TBLASTN search with the Cas1p sequence against the
PopulusDB, a gene coding for a putative O-acetyltransferase was identified, which
previously was annotated as having unknown function. The transcript of the Populus
putative O-acetyltransferase is upregulated during secondary cell wall formation,
consistent with a role in O-acetylation of the secondary wall polysaccharides
glucomannan and glucuronoxylan. The functional characterization of this gene and its
products is required to verify or discard its hypothesized role in O-acetylation.
3.3 Unpublished data
3.3.1 Full length sequencing of genes with unknown function
Since the beginning of this thesis work, particular interest was directed towards genes
represented by ESTs that shared no similarity to any sequence in the public databases
(“no-hits”), and genes that were similar to plant hypothetical proteins (i.e. these genes
have only been predicted by gene-finding algorithms) or plant genes with unknown
function. Together, these two groups of ESTs corresponded to more than 500 genes.
The hypothesis was that among the genes found in these two categories, new fiber-
active enzymes could be identified. Due to additional sequences deposited in public
and internal databases, increased lengths of poplar sequences provided by full length
sequencing and cluster/contig analyses and the availability of new advanced
bioinformatic tools, initially annotated “no hits” and “unknown” poplar genes have
indeed been classified as genes encoding carbohydrate-active enzymes.
Table 5. Functional classification of unknown genes
Function or Functional Group    Type                                        No.
Auxin-Inducible                                                             2
Germin-like                                                                 1
Arp2/3 complex                                                              1
Protein-Protein Interaction     LIM Domain                                  1
                                RING Zn Finger                              2
                                Tpr-repeat                                  1
                                Coiled-coil                                 4
                                Glu/Asp binding domain                      1
Protein-DNA Interaction         Homeodomain                                 1
Signaling                       Rho p21                                     1
                                Possible Ser/Thr Kinase                     1
Membrane Localized              GPI-anchor                                  1
                                GPI-anchor and Fasciclin Domain             1
                                GPI-anchor, Cu-binding, phytocyanin-like    1
                                GPI-anchor, TM, extensin-like               1
                                TM and Signal Sequence                      3
                                TM                                          8
                                Signal sequence                             1
DUF231 (Domain with Unknown Function in At.)                                3
No Domains or Function Found                                                20
TOTAL                                                                       55
Seventy-two genes that were upregulated during secondary cell wall formation (paper II),
but whose function remained unknown, were selected for full length sequencing (Fig 7).
The motivation was that increased sequence length and quality increase
the chance of finding meaningful sequence similarities, which improves annotation
accuracy. For instance, the closest Arabidopsis homolog in large gene families can be
more easily identified, which is necessary for generating Arabidopsis knockout lines
that can be used to infer roles of poplar enzymes. In addition, full length cDNA
sequences are required to clone and heterologously express enzymes for subsequent in
vitro characterization. The approach followed was to first sequence the complete
insert of the clones from which the EST-sequences were derived. Almost 30 % of the
sequenced cDNA inserts were classified as full length genes by aligning the open
reading frames (ORFs) to the best Arabidopsis BLASTX hits. With the additional DNA
sequence data, new carbohydrate-active enzymes were identified. In fact, 14 of the 72
sequenced genes were identified as glycosyltransferases from families GT2, GT8, GT14,
GT34, GT43 and GT47. For glycosyltransferases whose corresponding cDNA clones did not
contain the full length gene, full length cDNA sequences were obtained
by rapid amplification of cDNA ends (RACE) as described in paper V. When possible,
the remaining novel genes were annotated using SMART (Simple Modular
Architecture Research Tool) (Schultz et al., 1998; Letunic et al., 2004). A large
proportion of the genes contained either transmembrane regions or GPI-anchors,
which points to the importance of membrane proteins in secondary cell wall biosynthesis
(Table 5). The full length sequences obtained in this project will also be used to
annotate the Populus genome.
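The full length assessment described above depends on locating the open reading frame in each sequenced insert. The following Python sketch shows one way this could be done. It is a simplified illustration and not the procedure used in this work: the function names are hypothetical, only the forward strand is scanned, and the criterion in looks_full_length (the alignment must start within a few residues of the N-terminus of the homolog) is an assumption.

```python
from typing import Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> Tuple[int, int]:
    """Return (start, end) of the longest ATG..stop ORF on the forward strand.

    Coordinates are 0-based and end-exclusive, and include the stop codon.
    Returns (0, 0) if no complete ORF is found.
    """
    seq = seq.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon since the last stop in this frame
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start > best[1] - best[0]:
                    best = (start, i + 3)
                start = None
    return best


def protein_length(orf: Tuple[int, int]) -> int:
    """Number of amino acids encoded by the ORF, excluding the stop codon."""
    return max((orf[1] - orf[0]) // 3 - 1, 0)


def looks_full_length(subject_start: int, max_missing: int = 5) -> bool:
    """Assumed criterion: the alignment to the homolog begins near its N-terminus.

    subject_start is the 1-based position in the homologous protein at which
    the alignment starts; max_missing is a hypothetical tolerance.
    """
    return subject_start <= 1 + max_missing
```

For example, an ORF of 531 nucleotides including the stop codon encodes 176 amino acids, since protein_length((0, 531)) evaluates to 176.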
3.3.2 PUNK1
A short summary of the work on the Populus gene PUNK1 (Poplar unknown 1) will
serve as an example of the difficulties of assigning a function to a gene that is not
significantly similar to other genes/proteins. Based on the relatively high number of
ESTs corresponding to this gene that were found in the first cambial region EST-
library (Sterky et al., 1998), this gene was selected for full length sequencing. The
longest open reading frame (ORF) corresponded to a protein with 176 amino acids.
Although these data were obtained before the Arabidopsis genome was completed, a
large number of Arabidopsis sequences had been deposited in the public databases.
Despite this, no match to PUNK1 was found in Arabidopsis until the genome was
released. The Arabidopsis proteins with sequence similarity to PUNK1 were all
annotated as hypothetical proteins and were much larger than PUNK1. However, data
from a northern blot confirmed that the full length cDNA sequence of PUNK1 had
been obtained. From the microarray experiment presented in paper II it was observed
that the PUNK1 transcript was highly upregulated in zone D (i.e. secondary cell wall
formation), which was confirmed in paper V. PUNK1 was cloned and the
recombinant protein was expressed in E. coli. Antibodies were raised against the
recombinant PUNK1 protein, and Western blots using the antibody as a probe showed
the existence of a protein of the predicted size in protein extracts from xylem and
phloem. Preliminary results of immunolocalization studies were consistent with data
from the Western blot analysis and indicated that PUNK1 is present in secondary cell
wall forming and radially expanding fibers and vessels, as well as in sieve tubes,
companion cells and fibers of the phloem. Thus, the PUNK1 transcript is highly
abundant in the secondary cell wall forming zone of xylem, and the protein is found in
both xylem and phloem. However, these data alone do not suggest a putative function
for PUNK1.
Table 6. Identified ESTs corresponding to the PUNK1 cDNA
Populus species/hybrid                                        Number of ESTs  Tissue
In PopulusDB
  Populus tremula x tremuloides                               4               Developing xylem
  Populus tremula x tremuloides                               1               Tension wood
  Populus balsamifera subsp. trichocarpa                      1               Female catkins
Other poplar EST projects
  Populus balsamifera subsp. trichocarpa                      1               Developing xylem
  Populus alba x Populus tremula                              3               Tension wood
  Populus alba x Populus tremula                              1               Opposite side xylem
  Populus balsamifera subsp. trichocarpa x Populus deltoides  1               Shoot tips + mature leaves + woody stems
Over the years, as additional sequences have been added to the databases, new hits to
other plant ESTs have been found. Xylem-specific ESTs corresponding to PUNK1
were found in other poplar species and other EST-projects (Table 6). A poplar cDNA
fragment corresponding to a PUNK1 ortholog was isolated in a differential display
experiment that compared poplars grown with either limited or abundant
available nitrogen. However, this PUNK1 ortholog was not discussed in the published
experiment. Very recently (March 2004), a new Arabidopsis gene was deposited
that became the new best hit to PUNK1. This expressed protein contains only 96
amino acids. For a long time the only identifiable motif in the PUNK1 protein was a
coiled coil region. Recently (December 2003) PUNK1 was found to be similar to the
new domain pfam06886.1 TPX2, which was deposited in the conserved domain
database (Marchler-Bauer et al., 2003). This domain represents a conserved region of
approximately 60 amino acid residues within the eukaryotic targeting protein for Xklp2
(TPX2). TPX2 is a microtubule-associated protein that mediates the binding of the
C-terminal domain of Xklp2 to microtubules (Wittmann et al., 2000). Seventeen Arabidopsis
proteins, including previously identified proteins with similarity to PUNK1, contain
this domain. Several microtubule-related genes such as tubulins and kinesin-like
genes have transcript profiles that are very similar to PUNK1 (i.e. with a peak in zone
D). Therefore, it is plausible that PUNK1 also associates with microtubules, which is
a hypothesis that can now be tested to help decipher the true function of this
xylem-related protein.
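The similarity between transcript profiles mentioned above can be quantified, for instance with the Pearson correlation coefficient. The Python sketch below is illustrative only. The choice of Pearson correlation, the threshold of 0.9, the five zones labelled A to E and all expression values are assumptions made for this example; the numbers are invented and are not measurements from paper II.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def coexpressed(query: Sequence[float],
                profiles: Dict[str, Sequence[float]],
                threshold: float = 0.9) -> List[str]:
    """Names of genes whose profile correlates with the query at or above threshold."""
    return sorted(name for name, profile in profiles.items()
                  if pearson(query, profile) >= threshold)


# Invented relative expression values across zones A to E (peak in zone D).
punk1_like = [0.1, 0.2, 0.5, 2.0, 0.8]
example_profiles = {
    "tubulin_like": [0.0, 0.3, 0.6, 1.8, 0.7],  # also peaks in zone D
    "early_gene": [1.5, 1.0, 0.4, 0.1, 0.0],    # highest in zone A
}
similar = coexpressed(punk1_like, example_profiles)
```

With these invented values, only the profile that also peaks in zone D passes the threshold. A shared profile is of course no proof of a shared function; it only helps to prioritize hypotheses for testing.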
4 Concluding remarks and future prospects
The objective of this thesis was to identify genes that encode fiber-active enzymes. A
large number of such genes were identified through gene expression analyses, full
length sequencing and bioinformatics, but the activity and biological function of the
majority of the corresponding enzymes remain to be investigated. The first steps
towards the characterization of these enzymes have now been taken. Attempts are
underway to express a number of the enzyme targets in Pichia pastoris for subsequent
enzymatic characterization, and protein fragments have been expressed in E. coli for
antibody production and subsequent immunolocalization. Phenotypic studies in planta
are being performed in Populus using RNAi, and knockouts of the Arabidopsis
homologs of the poplar genes are being generated. Fourier transform infrared (FTIR)
microspectroscopy will be used to track changes in cell wall composition or
architecture. Furthermore, the sequenced Populus genome will provide additional
information about the already identified carbohydrate-active enzymes, as well as
information on the full complement of carbohydrate-active enzymes found in poplar.
The effort of characterizing these enzymes will eventually lead to an improved
understanding of the biochemistry of cell wall biosynthesis, which will provide the
basis for engineering fiber chemistry and structure in the future. The use of the poplar
XET to develop a novel chemo-enzymatic approach for cellulosic fiber modification
shows the potential of using plant enzymes in forest product biotechnology through a
biomimetic approach (Brumer et al., 2004). Most of the enzymes in the present study,
however, are involved in the biosynthesis of the fiber carbohydrates, suggesting that
these should be used to engineer the fiber in planta. The present dataset of
carbohydrate-active enzymes constitutes an excellent source of new enzymes for
future application in forest industry processes.
5 ABBREVIATIONS
AGP arabinogalactan protein
BAC bacterial artificial chromosome
BLAST basic local alignment search tool
BSA bovine serum albumin
bp base pair
CAD cinnamyl alcohol dehydrogenase
CAZyme carbohydrate-active enzyme
CBM carbohydrate-binding module
cDNA complementary DNA
CE carbohydrate esterase
CesA cellulose synthase catalytic subunit
CR-P plant conserved region
Cy3/Cy5 cyanine 3/cyanine 5
DNA deoxyribonucleic acid
EC enzyme commission
EST expressed sequence tag
FTIR Fourier transform infrared microspectroscopy
GA gibberellin
GalA galacturonic acid
GH glycoside hydrolase
GM genetically modified
GMO genetically modified organism
GRP glycine-rich protein
GT glycosyltransferase
HG homogalacturonan
HRGP hydroxyproline-rich glycoprotein
HVR hypervariable region
lowess locally weighted scatterplot smoothing
Mbp mega base pairs
mRNA messenger RNA
MT microtubule
ORF open reading frame
PAL phenylalanine ammonia-lyase
PCR polymerase chain reaction
PL polysaccharide lyase
PRP proline-rich protein
PUNK1 poplar unknown 1
RACE rapid amplification of cDNA ends
RGI rhamnogalacturonan I
RGII rhamnogalacturonan II
RNA ribonucleic acid
RNAi ribonucleic acid interference
RT-PCR reverse transcriptase polymerase chain reaction
S syringyl
SAD sinapyl alcohol dehydrogenase
SAGE serial analysis of gene expression
SNP single nucleotide polymorphism
SSR simple sequence repeat
SVD singular value decomposition
T-DNA transfer DNA
UTR untranslated region
wt wild-type
XET xyloglucan endotransglycosylase
6 ACKNOWLEDGEMENTS
A-wop-bop-a-loo-lop a-lop bam boo, as Little Richard would have said, I made it. I would not have accomplished this without the help and contribution from a number of people. Here we go, I would especially like to thank:
Professor Tuula Teeri, my supervisor who accepted me as a PhD student when the wood biotech team was still just a tiny group. I am very grateful for your support and encouragement. You have outstanding energy and it has been great to be a part of your successful and expanding group.
Kristina Blomqvist, who supervised me in the lab and who immediately and enthusiastically put me to work. I have appreciated that you always had the time to answer both scientific and other questions, when I came running. I have also enjoyed your positive attitude. We also shared the hassle of commuting with SJ, and I had enough first.
Professor Joakim Lundeberg and Peter Nilsson for welcoming me over to the other house/floor and letting me work with the microarrays. Although it was pretty cold to walk back and forth between the different houses during winter, I have really enjoyed working with you Peter. Thanks also for your critical comments on this thesis and all my papers.
Emma Master, for efficient supervision and for critical reading of this thesis.
All the collaborators and co-authors up north in Umeå: Professor Göran Sandberg, Professor Björn Sundberg, Professor Thomas Moritz, Ewa Mellerowicz, Magnus Hertzberg, Rupali Bhalerao and Jarmo Schrader.
All the European Union EDEN collaborators and especially Professor Bernard Henrissat and Professor Malcolm Bennett and their groups.
All the other microarrayers and collaborators at floor 3: ärke-Anders Andersson, Valtteri Wirta, Tove Andersson, Maria Sievertzon, Cecilia Laurell, Anneli Johansson + others.
The guys involved in the poplar EST-sequence project: Fredrik Sterky, Rikard Erlandsson, Per Unneberg and Bahram and his sequencing crew for all the help over the years.
My room-mates and colleagues, the wonderful ladies, Soraya Djerbi, Åsa Kallas and Ulla Rudsander. I have had a great time in BH, with gossip, candy, baby chat, fika, jokes and fruitful research discussions. Too bad my dust collecting piles of paper soon will be gone.
The next-generation of GT-explorers antibody-Alex and Anders W for collaborations in the lab.
Janne L, for taking care of me as a newcomer in Tuula's group but also for several hard-working hours out on the tennis court.
Martin, Harry and Fredrik V for help with computers. Martin also for your time as a BH stand in, Harry also for the obvious – punk, and Fredrik V also for bird discussions.
Anna and Totte for collaboration, wild scientific discussions, but maybe even more for the chats about rockabilly and old bluesmen.
Hongbin for trying to bind those PUNK1 antibodies.
Nader for your good mood and the week as protein purification teachers.
Alphonsa and Lotta, the dominating administrators.
Kathleen and Wald for bringing the Belgian goodies.
All the lunchtåg-participants through the years (Joke, Stuart, Per B, Didier, Magnus, Park, Jenny........) for sharing the mostly downs and occasional ups at midday.
Henrik W, Olof (also for our week as teachers), Anders M and all the other innebandy players that kept the old man busy with the stick.
All the past and present people in the Wood Biotechnology lab as well as the Department of Biotechnology for a memorable and enjoyable time.
Fantastic friends outside the lab, that I hope to see more now when this is over: Jeppa, Nordin + JW:s vänner, Angetun, Mr Mart, Johan S, Speedway-Fredrik, Ahlin mfl.
My first idols, my brothers Micke and Håkan.
My parents in law Marta and Curt for all the help and babysitting (especially the last busy weeks).
Finally, I would like to thank my parents Bengt and Ella for believing in me and all their support through the years. I wrote in first grade that I wanted to become a scientist. I would not have come this far without you.
Of course I also have to thank my own little family: Anna-Karin for all your love, your patience with this work and for making our home such a wonderful place, my son Arvid, who always gives me a smile when I get home, and who keeps me busy with cars, sharks and Byggare Bob, and the little newcomer of the family that will arrive soon.
7 REFERENCES
Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, Kerlavage AR, McCombie WR, Venter JC (1991) Complementary-DNA Sequencing - Expressed Sequence Tags and Human Genome Project. Science 252: 1651-1656
Allona I, Quinn M, Shoop E, Swope K, St Cyr S, Carlis J, Riedl J, Retzel E, Campbell MM, Sederoff R, Whetten RW (1998) Analysis of xylem formation in pine by cDNA sequencing. Proceedings of the National Academy of Sciences of the United States of America 95: 9693-9698
Almgren G, Skogsstyrelsen (1990) Lövskog : björk, asp och al i skogsbruk och naturvård. Skogsstyr., Jönköping
Baskin TI (2001) On the alignment of cellulose microfibrils by cortical microtubules: a review and a model. Protoplasma 215: 150-171
Baucher M, Halpin C, Petit-Conil M, Boerjan W (2003) Lignin: Genetic engineering and impact on pulping. Critical Reviews in Biochemistry and Molecular Biology 38: 305-350
Bhalerao R, Nilsson O, Sandberg G (2003) Out of the woods: forest biotechnology enters the genomic era. Current Opinion in Biotechnology 14: 206-213
Boerjan W, Ralph J, Baucher M (2003) Lignin biosynthesis. Annual Review of Plant Biology 54: 519-546
Boguski MS, Lowe TMJ, Tolstoshev CM (1993) dbEST - Database for Expressed Sequence Tags. Nature Genetics 4: 332-333
Bonin CP, Potter I, Vanzin GF, Reiter WD (1997) The MUR1 gene of Arabidopsis thaliana encodes an isoform of GDP-D-mannose-4,6-dehydratase, catalyzing the first step in the de novo synthesis of GDP-L-fucose. Proceedings of the National Academy of Sciences of the United States of America 94: 2085-2090
Bouchez D, Hofte H (1998) Functional genomics in plants. Plant Physiology 118: 725-732
Bouton S, Leboeuf E, Mouille G, Leydecker MT, Talbotec J, Granier F, Lahaye M, Hofte H, Truong HN (2002) Quasimodo1 encodes a putative membrane-bound glycosyltransferase required for normal pectin synthesis and cell adhesion in Arabidopsis. Plant Cell 14: 2577-2590
Bradshaw HD, Ceulemans R, Davis J, Stettler R (2000) Emerging model systems in plant biology: Poplar (Populus) as a model forest tree. Journal of Plant Growth Regulation 19: 306-313
Brett C, Waldron K (1996) Physiology and biochemistry of plant cell walls. Chapman and Hall, London, UK
Brumer H, Zhou Q, Bauman M, Carlson K, Teeri TT (2004) Activation of crystalline cellulose surfaces through the chemo-enzymatic modification of xyloglucan. J. Am. Chem. Soc.: in press
Brunner AM, Busov VB, Strauss SH (2004) Poplar genome sequence: functional genomics in an ecologically dominant plant species. Trends in Plant Science 9: 49-56
Burton RA, Shirley NJ, King BJ, Harvey AJ, Fincher GB (2004) The CesA gene family of barley. Quantitative analysis of transcripts reveals two groups of co-expressed genes. Plant Physiology 134: 224-236
Campbell MM, Brunner AM, Jones HM, Strauss SH (2003) Forestry's fertile crescent: the application of biotechnology to forest trees. Plant Biotechnology Journal 1: 141-154
Chaffey N, Cholewa E, Regan S, Sundberg B (2002) Secondary xylem development in Arabidopsis: a model for wood formation. Physiologia Plantarum 114: 594-600
Christersson L (1996) Future research on hybrid aspen and hybrid poplar cultivation in Sweden. Biomass & Bioenergy 11: 109-113
Chuaqui RF, Bonner RF, Best CJM, Gillespie JW, Flaig MJ, Hewitt SM, Phillips JL, Krizman DB, Tangrea MA, Ahram M, Linehan WM, Knezevic V, Emmert-Buck MR (2002) Post-analysis follow-up and validation of microarray experiments. Nature Genetics 32: 509-514
Coutinho PM, Deleury E, Davies GJ, Henrissat B (2003) An evolving hierarchical family classification for glycosyltransferases. Journal of Molecular Biology 328: 307-317
Coutinho PM, Stam M, Blanc E, Henrissat B (2003) Why are there so many carbohydrate-active enzyme-related genes in plants? Trends in Plant Science 8: 563-565
Davies GJ, Henrissat B (2002) Structural enzymology of carbohydrate-active enzymes: implications for the post-genomic era. Biochemical Society Transactions 30: 291-297
Dejardin A, Leple JC, Lesage-Descauses MC, Costa G, Pilate G (2004) Expressed sequence tags from poplar wood tissues - A comparative analysis from multiple libraries. Plant Biology 6: 55-64
Delmer DP (1999) Cellulose biosynthesis: Exciting times for a difficult field of study. Annual Review of Plant Physiology and Plant Molecular Biology 50: 245-276
Demura T, Tashiro G, Horiguchi G, Kishimoto N, Kubo M, Matsuoka N, Minami A, Nagata-Hiwatashi M, Nakamura K, Okamura Y, Sassa N, Suzuki S, Yazaki J, Kikuchi S, Fukuda H (2002) Visualization by comprehensive microarray analysis of gene expression programs during transdifferentiation of mesophyll cells into xylem cells. Proceedings of the National Academy of Sciences of the United States of America 99: 15794-15799
Dixon RA, Chen F, Guo DJ, Parvathi K (2001) The biosynthesis of monolignols: a "metabolic grid", or independent pathways to guaiacyl and syringyl units? Phytochemistry 57: 1069-1084
Doblin MS, Kurek I, Jacob-Wilk D, Delmer DP (2002) Cellulose biosynthesis in plants: from genes to rosettes. Plant and Cell Physiology 43: 1407-1420
Dudoit S, Yang YH, Callow MJ, Speed TP (2002) Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica 12: 111-139
Eckenwalder JE (1996) Systematics and evolution of Populus. In RF Stettler, HD Bradshaw, PE Heilman, TM Hinckley, eds, Biology of Populus and its implications for management and conservation. NRC Research Press, Ottawa, Canada, pp 7-32
Edwards ME, Dickson CA, Chengappa S, Sidebottom C, Gidley MJ, Reid JSG (1999) Molecular characterisation of a membrane-bound galactosyltransferase of plant cell wall matrix polysaccharide biosynthesis. Plant Journal 19: 691-697
Eisen MB, Spellman PT, Brown PO, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America 95: 14863-14868
Eriksson ME, Israelsson M, Olsson O, Moritz T (2000) Increased gibberellin biosynthesis in transgenic trees promotes growth, biomass production and xylem fiber length. Nature Biotechnology 18: 784-788
Faik A, Bar-Peled M, DeRocher AE, Zeng WQ, Perrin RM, Wilkerson C, Raikhel NV, Keegstra K (2000) Biochemical characterization and molecular cloning of an alpha-1,2-fucosyltransferase that catalyzes the last step of cell wall xyloglucan biosynthesis in pea. Journal of Biological Chemistry 275: 15082-15089
Faik A, Price NJ, Raikhel NV, Keegstra K (2002) An Arabidopsis gene encoding an alpha-xylosyltransferase involved in xyloglucan biosynthesis. Proceedings of the National Academy of Sciences of the United States of America 99: 7797-7802
Fenning TM, Gershenzon J (2002) Where will the wood come from? Plantation forests and the role of biotechnology. Trends in Biotechnology 20: 291-296
Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, Mckenney K, Sutton G, Fitzhugh W, Fields C, Gocayne JD, Scott J, Shirley R, Liu LI, Glodek A, Kelley JM, Weidman JF, Phillips CA, Spriggs T, Hedblom E, Cotton MD, Utterback TR, Hanna MC, Nguyen DT, Saudek DM, Brandon RC, Fine LD, Fritchman JL, Fuhrmann JL, Geoghagen NSM, Gnehm CL, Mcdonald LA, Small KV, Fraser CM, Smith HO, Venter JC (1995) Whole-Genome Random Sequencing and Assembly of Haemophilus-Influenzae Rd. Science 269: 496-512
Fry SC, Smith RC, Renwick KF, Martin DJ, Hodge SK, Matthews KJ (1992) Xyloglucan Endotransglycosylase, a New Wall-Loosening Enzyme-Activity from Plants. Biochemical Journal 282: 821-828
Fukuda H (1997) Tracheary element differentiation. Plant Cell 9: 1147-1156
Futamura N, Kouchi H, Shinohara K (2002) A gene for pectate lyase expressed in elongating and differentiating tissues of a Japanese willow (Salix gilgiana). Journal of Plant Physiology 159: 1123-1130
Gardiner JC, Taylor NG, Turner SR (2003) Control of cellulose synthase complex localization in developing xylem. Plant Cell 15: 1740-1748
Grima-Pettenati J, Goffner D (1999) Lignin genetic engineering revisited. Plant Science 145: 51-65
Gutierrez A, del Rio JC, Martinez MJ, Martinez AT (2001) The biotechnological control of pitch in paper pulp manufacturing. Trends in Biotechnology 19: 340-348
Henrissat B, Coutinho PM, Davies GJ (2001) A census of carbohydrate-active enzymes in the genome of Arabidopsis thaliana. Plant Molecular Biology 47: 55-72
Henrissat B, Davies G (1997) Structural and sequence-based classification of glycoside hydrolases. Current Opinion in Structural Biology 7: 637-644
Henrissat B, Davies GJ (2000) Glycoside hydrolases and glycosyltransferases. Families, modules, and implications for genomics. Plant Physiology 124: 1515-1519
Holloway AJ, van Laar RK, Tothill RW, Bowtell DDL (2002) Options available - from start to finish - for obtaining data from DNA microarrays. Nature Genetics 32: 481-489
Holzman T, Kolker E (2004) Statistical analysis of global gene expression data: some practical considerations. Current Opinion in Biotechnology 15: 52-57
Hood L (2003) Systems biology: integrating technology, biology, and computation. Mechanisms of Ageing and Development 124: 9-16
Humphreys JM, Chapple C (2002) Rewriting the lignin roadmap. Current Opinion in Plant Biology 5: 224-229
Iwai H, Masaoka N, Ishii T, Satoh S (2002) A pectin glucuronyltransferase gene is essential for intercellular attachment in the plant meristem. Proceedings of the National Academy of Sciences of the United States of America 99: 16319-16324
Karacic A, Verwijst T, Weih M (2003) Above-ground woody biomass production of short-rotation Populus plantations on agricultural land in Sweden. Scandinavian Journal of Forest Research 18: 427-437
Knight J (2001) When the chips are down. Nature 410: 860-861
Kohler A, Delaruelle C, Martin D, Encelot N, Martin F (2003) The poplar root transcriptome: analysis of 7000 expressed sequence tags. FEBS Letters 542: 37-41
Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle
M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang HM, Yu J, Wang J, Huang GY, Gu J, Hood L, Rowen L, Madan A, Qin SZ, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Raymond C, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan HQ, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blocker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, 
Furey TS, Galagan J, Gilbert JGR, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang WH, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz JR, Slater G, Smit AFA, Stupka E, Szustakowki J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ (2001) Initial sequencing and analysis of the human genome. Nature 409: 860-921
Lao NT, Long D, Kiang S, Coupland G, Shoue DA, Carpita NC, Kavanagh TA (2003) Mutation of a family 8 glycosyltransferase gene alters cell wall carbohydrate composition and causes a humidity-sensitive semi-sterile dwarf phenotype in Arabidopsis. Plant Molecular Biology 53: 687-701
Lee MLT, Kuo FC, Whitmore GA, Sklar J (2000) Importance of replication in microarray gene expression studies: Statistical methods and evidence from repetitive cDNA hybridizations. Proceedings of the National Academy of Sciences of the United States of America 97: 9834-9839
Lee Y, Kende H (2001) Expression of beta-expansins is correlated with internodal elongation in deepwater rice. Plant Physiology 127: 645-654
Letunic I, Copley RR, Schmidt S, Ciccarelli FD, Doerks T, Schultz J, Ponting CP, Bork P (2004) SMART 4.0: towards genomic data integration. Nucleic Acids Research 32: D142-D144
Leung YF, Cavalieri D (2003) Fundamentals of cDNA microarray data analysis. Trends in Genetics 19: 649-659
Li LG, Cheng XF, Leshkevich J, Umezawa T, Harding SA, Chiang VL (2001) The last step of syringyl monolignol biosynthesis in angiosperms is regulated by a novel gene encoding sinapyl alcohol dehydrogenase. Plant Cell 13: 1567-1585
Liang X, Joshi CP (2004) Molecular cloning of ten distinct hypervariable regions from the cellulose synthase superfamily in aspen trees. Tree Physiology 24: 543-550
Lockhart DJ, Dong HL, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang CW, Kobayashi M, Horton H, Brown EL (1996) Expression monitoring by hybridization to high-density oligonucleotide arrays. Nature Biotechnology 14: 1675-1680
Madson M, Dunand C, Li XM, Verma R, Vanzin GF, Caplan J, Shoue DA, Carpita NC, Reiter WD (2003) The MUR3 gene of Arabidopsis encodes a xyloglucan galactosyltransferase that is evolutionarily related to animal exostosins. Plant Cell 15: 1662-1670
Mansfield SD, Esteghlalian AR (2003) Applications of biotechnology in the forest products industry. Applications of Enzymes to Lignocellulosics 855: 2-29
Marchler-Bauer A, Anderson JB, DeWeese-Scott C, Fedorova ND, Geer LY, He SQ, Hurwitz DI, Jackson JD, Jacobs AR, Lanczycki CJ, Liebert CA, Liu CL, Madej T, Marchler GH, Mazumder R, Nikolskaya AN, Panchenko AR, Rao BS, Shoemaker BA, Simonyan V, Song JS, Thiessen PA, Vasudevan S, Wang YL, Yamashita RA, Yin JJ, Bryant SH (2003) CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Research 31: 383-387
Matthysse AG, White S, Lightfoot R (1995) Genes Required for Cellulose Synthesis in Agrobacterium-Tumefaciens. Journal of Bacteriology 177: 1069-1075
McCann M (1997) Tracheary element formation: building up to a dead end. Trends in Plant Science 2: 333-338
McQueen-Mason S, Durachko DM, Cosgrove DJ (1992) Two Endogenous Proteins That Induce Cell-Wall Extension in Plants. Plant Cell 4: 1425-1433
Mellerowicz EJ, Baucher M, Sundberg B, Boerjan W (2001) Unravelling cell wall formation in the woody dicot stem. Plant Molecular Biology 47: 239-274
Milioni D, Sado PE, Stacey NJ, Domingo C, Roberts K, McCann MC (2001) Differential expression of cell-wall-related genes during the formation of tracheary elements in the Zinnia mesophyll cell system. Plant Molecular Biology 47: 221-238
Milioni D, Sado PE, Stacey NJ, Roberts K, McCann MC (2002) Early gene expression associated with the commitment and differentiation of a plant tracheary element is revealed by cDNA-amplified fragment length polymorphism analysis. Plant Cell 14: 2813-2824
Oh S, Park S, Han KH (2003) Transcriptional regulation of secondary growth in Arabidopsis thaliana. Journal of Experimental Botany 54: 2709-2722
Ohlrogge J, Benning C (2000) Unraveling plant metabolism by EST analysis. Current Opinion in Plant Biology 3: 224-228
Ohmiya Y, Samejima M, Shiroishi M, Amano Y, Kanda T, Sakai F, Hayashi T (2000) Evidence that endo-1,4-beta-glucanases act on cellulose in suspension-cultured poplar cells. Plant Journal 24: 147-158
Osterlund MT, Paterson AH (2002) Applied plant genomics: the secret is integration. Current Opinion in Plant Biology 5: 141-145
Pauly M, Scheller HV (2000) O-Acetylation of plant cell wall polysaccharides: identification and partial characterization of a rhamnogalacturonan O-acetyl-transferase from potato suspension-cultured cells. Planta 210: 659-667
Pear JR, Kawagoe Y, Schreckengost WE, Delmer DP, Stalker DM (1996) Higher plants contain homologs of the bacterial celA genes encoding the catalytic subunit of cellulose synthase. Proceedings of the National Academy of Sciences of the United States of America 93: 12637-12642
Pellis A, Laureysens I, Ceulemans R (2004) Growth and production of a short rotation coppice culture of poplar I. Clonal differences in leaf characteristics in relation to biomass production. Biomass and Bioenergy 27: 9-19
Perrin R, Wilkerson C, Keegstra K (2001) Golgi enzymes that synthesize plant cell wall polysaccharides: finding and evaluating candidates in the genomic era. Plant Molecular Biology 47: 115-130
Perrin RM, DeRocher AE, Bar-Peled M, Zeng WQ, Norambuena L, Orellana A, Raikhel NV, Keegstra K (1999) Xyloglucan fucosyltransferase, an enzyme involved in plant cell wall biosynthesis. Science 284: 1976-1979
Perrin RM, Jia ZH, Wagner TA, O'Neill MA, Sarria R, York WS, Raikhel NV, Keegstra K (2003) Analysis of xyloglucan fucosylation in Arabidopsis. Plant Physiology 132: 768-778
Pesquet E, Jauneau A, Digonnet C, Boudet AM, Pichon M, Goffner D (2003) Zinnia elegans: the missing link from in vitro tracheary elements to xylem. Physiologia Plantarum 119: 463-468
Phimister B (1999) Going global. Nature Genetics 21: 1-1
Plomion C, Leprovost G, Stokes A (2001) Wood formation in trees. Plant Physiology 127: 1513-1523
Quackenbush J (2001) Computational analysis of microarray data. Nature Reviews Genetics 2: 418-427
Quackenbush J (2002) Microarray data normalization and transformation. Nature Genetics 32: 496-501
Ridley BL, O'Neill MA, Mohnen DA (2001) Pectins: structure, biosynthesis, and oligogalacturonide-related signaling. Phytochemistry 57: 929-967
Ridoutt BG, Pharis RP, Sands R (1996) Fibre length and gibberellins A(1) and A(20) are decreased in Eucalyptus globulus by acylcyclohexanedione injected into the stem. Physiologia Plantarum 96: 559-566
Rudd S (2003) Expressed sequence tags: alternative or complement to whole genome sequences? Trends in Plant Science 8: 321-329
Rudsander U, Denman S, Raza S, Teeri TT (2003) Molecular features of family GH9 cellulases in hybrid aspen and the filamentous fungus Phanerochaete chrysosporium. J. Appl. Glycobiol.: 253-256
Rytter L (2002) Nutrient content in stems of hybrid aspen as affected by tree age and tree size, and nutrient removal with harvest. Biomass & Bioenergy 23: 13-25
Saxena IM, Lin FC, Brown RM (1990) Cloning and Sequencing of the Cellulose Synthase Catalytic Subunit Gene of Acetobacter-Xylinum. Plant Molecular Biology 15: 673-683
Schena M, Shalon D, Davis RW, Brown PO (1995) Quantitative Monitoring of Gene-Expression Patterns with a Complementary-DNA Microarray. Science 270: 467-470
Schultz J, Milpetz F, Bork P, Ponting CP (1998) SMART, a simple modular architecture research tool: Identification of signaling domains. Proceedings of the National Academy of Sciences of the United States of America 95: 5857-5864
Sedbrook JC, Carroll KL, Hung KF, Masson PH, Somerville CR (2002) The Arabidopsis SKU5 gene encodes an extracellular glycosyl phosphatidylinositol-anchored glycoprotein involved in directional root growth. Plant Cell 14: 1635-1648
Smyth GK, Speed T (2003) Normalization of cDNA microarray data. Methods 31: 265-273
Smyth GK, Yang YH, Speed T (2003) Statistical issues in cDNA microarray data analysis. In MJ Brownstein, AB Khodursky, eds, Functional genomics: Methods and protocols, Vol 224. Humana Press, Totowa, NJ, pp 111-136
Somerville C, Somerville S (1999) Plant functional genomics. Science 285: 380-383
Sterky F, Bhalerao RR, Unneberg P, Segerman B, Nilsson P, Brunner AM, Campaa L, Jonsson Lindvall J, Tandre K, Strauss SH, Sundberg B, Gustafsson P, Uhlen M, Bhalerao RP, Nilsson O, Sandberg G, Karlsson J, Lundeberg J, Jansson S (2004) A Populus EST resource for functional genomics. Proceedings of the National Academy of Sciences of the United States of America: in press
Sterky F, Regan S, Karlsson J, Hertzberg M, Rohde A, Holmberg A, Amini B, Bhalerao R, Larsson M, Villarroel R, Van Montagu M, Sandberg G, Olsson O, Teeri TT, Boerjan W, Gustafsson P, Uhlen M, Sundberg B, Lundeberg J (1998) Gene discovery in the wood-forming tissues of poplar: Analysis of 5,692 expressed sequence tags. Proceedings of the National Academy of Sciences of the United States of America 95: 13330-13335
Tanaka K, Murata K, Yamazaki M, Onosato K, Miyao A, Hirochika H (2003) Three distinct rice cellulose synthase catalytic subunit genes required for cellulose synthesis in the secondary wall. Plant Physiology 133: 73-83
Taylor NG, Howells RM, Huttly AK, Vickers K, Turner SR (2003) Interactions among three distinct CesA proteins essential for cellulose synthesis. Proceedings of the National Academy of Sciences of the United States of America 100: 1450-1455
Teleman A, Nordstrom M, Tenkanen M, Jacobs A, Dahlman O (2003) Isolation and characterization of O-acetylated glucomannans from aspen and birch wood. Carbohydrate Research 338: 525-534
The Arabidopsis genome initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815
Timell TE (1967) Recent progress in the chemistry of wood hemicelluloses. Wood. Sci. Techn. 1: 45-70
Tissier T, Bourgeois P (2001) Reverse genetics in plants. Current Genomics: 269-284
Tuskan GA (1998) Short-rotation woody crop supply systems in the United States: What do we know and what do we need to know? Biomass & Bioenergy 14: 307-315
Uggla C, Moritz T, Sandberg G, Sundberg B (1996) Auxin as a positional signal in pattern formation in plants. Proceedings of the National Academy of Sciences of the United States of America 93: 9282-9286
Usadel B, Kuschinsky AM, Rosso MG, Eckermann N, Pauly M (2004) RHM2 is involved in mucilage pectin synthesis and is required for the development of the seed coat in Arabidopsis. Plant Physiology 134: 286-295
Valdivia Granda W (2003) Strategies for clustering, classifying, integrating, standardizing and visualizing microarray gene expression data. In E Blalock, ed, A beginner's guide to microarrays. Kluwer Academic Publishers, pp 277-340
Velculescu VE, Zhang L, Vogelstein B, Kinzler KW (1995) Serial Analysis of Gene-Expression. Science 270: 484-487
Vilaine F, Palauqui JC, Amselem J, Kusiak C, Lemoine R, Dinant S (2003) Towards deciphering phloem: a transcriptome analysis of the phloem of Apium graveolens. Plant Journal 36: 67-81
Vincken JP, Schols HA, Oomen RJFJ, McCann MC, Ulvskov P, Voragen AGJ, Visser RGF (2003) If homogalacturonan were a side chain of rhamnogalacturonan I. Implications for cell wall architecture. Plant Physiology 132: 1781-1789
Wittmann T, Wilm M, Karsenti E, Vernos I (2000) TPX2, a novel Xenopus MAP involved in spindle pole organization. Journal of Cell Biology 149: 1405-1418
Woolley LC, James DJ, Manning K (2001) Purification and properties of an endo-beta-1,4-glucanase from strawberry and down-regulation of the corresponding gene, cel1. Planta 214: 11-21
Wullschleger SD, Jansson S, Taylor G (2002) Genomics and forest biology: Populus emerges as the perennial favorite. Plant Cell 14: 2651-2655
Yang YH, Speed T (2002) Design issues for cDNA microarray experiments. Nature Reviews Genetics 3: 579-588