Clarke 1993


Australian Journal of Ecology (1993) 18, 117-143

Non-parametric multivariate analyses of changes in community structure

K. R. CLARKE

Plymouth Marine Laboratory, Prospect Place, West Hoe, Plymouth, PL1 3DH, United Kingdom

Abstract In the early 1980s, a strategy for graphical representation of multivariate (multi-species) abundance data was introduced into marine ecology by, among others, Field et al. (1982). A decade on, it is instructive to: (i) identify which elements of this often-quoted strategy have proved most useful in practical assessment of community change resulting from pollution impact; and (ii) ask to what extent evolution of techniques in the intervening years has added self-consistency and comprehensiveness to the approach. The pivotal concept has proved to be that of a biologically-relevant definition of similarity of two samples, and its utilization mainly in simple rank form, for example 'sample A is more similar to sample B than it is to sample C'. Statistical assumptions about the data are thus minimized and the resulting non-parametric techniques will be of very general applicability. From such a starting point, a unified framework needs to encompass: (i) the display of community patterns through clustering and ordination of samples; (ii) identification of species principally responsible for determining sample groupings; (iii) statistical tests for differences in space and time (multivariate analogues of analysis of variance, based on rank similarities); and (iv) the linking of community differences to patterns in the physical and chemical environment (the latter also dictated by rank similarities between samples). Techniques are described that bring such a framework into place, and areas in which problems remain are identified. Accumulated practical experience with these methods is discussed, in particular applications to marine benthos, and it is concluded that they have much to offer practitioners of environmental impact studies on communities.

Accepted for publication October 1992.

INTRODUCTION AND RATIONALE

Strategy of Field et al. (1982)

Field et al. (1982) outlined a strategy for the analysis of data on community structure, that is, an abundance (or biomass) array whose columns represent separate samples and whose rows are the full set of species present in those samples. (Note that all such assemblage information is referred to, rather loosely, throughout the current paper as 'community' data.) Their approach has the following components.

(1) The biotic relationship between any two samples is distilled into a coefficient measuring similarity (or dissimilarity) in species composition.

(2) The resulting triangular matrix of similarities between every pair of samples is used to classify the samples into groups, by hierarchical agglomerative clustering with group-average linking (e.g. Clifford & Stephenson 1975), or to map the sample inter-relationships in an ordination, by non-metric multi-dimensional scaling (MDS, e.g. Kruskal & Wish 1978).

(3) Relationships between the species are examined by, in effect, transposing the data matrix and repeating the classification and ordination on a similarity matrix computed between every pair of species. Species which are indicative of particular groups of samples are determined by so-called information statistic (I-) tests (Field 1969).

(4) Having allowed the community data to 'tell its own story', its relationship to matching environmental data is examined by superimposing the values of each abiotic variable separately onto the biotic ordination.

The above strategy has been adopted in a sizeable number of published studies, for example papers in Bayne et al. (1988), Addison and Clarke (1990), Warwick et al. (1991), Agard et al. (1993) and many others. (There are about 80 non-self citations to Field et al. 1982 in the Science Citation Index.) Many of these studies are concerned with the effects of pollutants. For example, samples might consist of a set of replicate sediment cores at different sites or times, chosen with the intention of displaying pollution-induced change in benthic community structure.

Fundamental role of rank similarities among samples

The effectiveness of this strategy has depended on the flexibility inherent in the use of non-metric MDS as an ordination technique. This is based only on the similarity matrix between samples, as defined by the biologist to reflect the particular aspects of community structure that are biologically meaningful for that study. The ordination technique should not force a specific definition of similarity (either explicitly or implicitly) onto the practitioner. He or she should control the following initial stages, by answering the questions posed.

(1) Selection of community attribute, for example 'is the biological hypothesis best examined by data on species abundance or biomass, or by some combination of these reflecting production (as in Warwick & Clarke 1993a; Warwick 1993)?'

(2) Standardization of data to relative rather than absolute values, for example 'are differences in total abundance between samples of any biological significance?'

(3) Transformation of the data matrix (after standardization, if appropriate), for example 'which end of the spectrum, of common to rare species, should the analysis chiefly reflect?' Untransformed data will typically lead to a shallow interpretation in which only the pattern of a few, very common species is represented (although ordinations will then fit more readily into two dimensions). The transformation sequence of y^0.5, y^0.25, log y and ultimately simple presence/absence, allows progressively greater contribution from the rarer species (e.g. Clarke & Green 1988).

(4) Choice of similarity coefficient, for example 'should the similarity between two samples depend in any way on those species that are absent from both (but present in other samples in the data set)?'

Ecologists generally seem to feel that so-called 'joint absences' are not germane, ruling out some common coefficients. Other desirable features include invariance (even under power transformation) to a scale change, for example of biomass measurements, and some form of standardization ensuring that the extreme values of 100 and 0 correspond, respectively, to a complete match of the measurements and to a complete lack of species in common. These properties are all satisfied by a number of coefficients, including the Bray-Curtis similarity (Bray & Curtis 1957), widely used in ecology. Faith et al. (1987) discuss the robustness of several such coefficients, in a simulation study of a range of (non-linear) ecological response models; the Bray-Curtis coefficient is seen to be one of the most reliable performers.
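
The properties listed above can be checked on a small example. The following is a minimal sketch, assuming the standard form of the Bray-Curtis coefficient, S = 100 (1 - sum|y_i1 - y_i2| / sum(y_i1 + y_i2)), which the text does not write out. The function names and the abundance counts are hypothetical and serve only as illustration.

```python
def transform(counts, power=0.25):
    """Power-transform abundances; power=0.5 is root, 0.25 is double-root."""
    return [c ** power for c in counts]


def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity (0-100) between two samples.

    a and b are equal-length lists of (transformed) abundances, one entry
    per species. A species absent from both samples adds zero to both sums,
    so joint absences cannot affect the result.
    """
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    if denominator == 0:
        raise ValueError("both samples are empty; similarity is undefined")
    return 100.0 * (1.0 - numerator / denominator)


# Hypothetical counts for five species in two samples.
sample_a = [200, 10, 3, 1, 0]
sample_b = [50, 8, 2, 1, 0]

raw = bray_curtis_similarity(sample_a, sample_b)
double_root = bray_curtis_similarity(transform(sample_a), transform(sample_b))
print(f"untransformed: {raw:.1f}, double-root: {double_root:.1f}")
```

In this made-up case the untransformed value is dominated by the single abundant species, while the double-root value gives the rarer species more say; other data could behave differently.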

Some care will therefore have been taken in defining similarity to reflect biological reality, but it is still true that no particular meaning can be attached to an isolated value, say a similarity of 64.3 between samples A and B. The absolute levels will depend markedly on the chosen coefficient and transformation (typically Bray-Curtis similarities will tend to increase with increasing severity of transformation). It is relative levels which have a natural interpretation, in particular the ranks of the similarity matrix, which summarize the data through statements such as 'sample A is more similar to sample B than it is to sample C'. This is an intuitively appealing and very generally applicable base from which to build a graphical representation of the sample patterns and, in effect, this is the only information used by a successful non-metric MDS ordination. The rank similarity matrix thus plays a fundamental role in defining and visualizing the community pattern. It is then natural to demand consistency in answering subsequent questions concerning that pattern, by utilizing only these same rank similarities.
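
The reduction of a similarity matrix to ranks can be illustrated as follows. This is a sketch only: the four-sample matrix is invented, the helper name `rank_similarities` is hypothetical, and the convention that rank 1 denotes the most similar pair (with ties sharing an average rank) is an assumption made for the example.

```python
from itertools import combinations


def rank_similarities(sim):
    """Rank the pairwise similarities held in a symmetric matrix.

    Returns a dict mapping each pair (j, k), j < k, to its rank, where
    rank 1 is the MOST similar pair. Tied values share their average rank.
    """
    def value(pair):
        return sim[pair[0]][pair[1]]

    pairs = list(combinations(range(len(sim)), 2))
    ordered = sorted(pairs, key=lambda p: -value(p))
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j to cover the run of pairs tied with ordered[i].
        while j + 1 < len(ordered) and value(ordered[j + 1]) == value(ordered[i]):
            j += 1
        for p in ordered[i:j + 1]:
            ranks[p] = (i + j) / 2 + 1
        i = j + 1
    return ranks


# Hypothetical similarities (0-100) among samples A, B, C, D.
labels = "ABCD"
sim = [
    [100.0, 64.3, 21.0, 35.5],
    [64.3, 100.0, 40.2, 12.8],
    [21.0, 40.2, 100.0, 58.1],
    [35.5, 12.8, 58.1, 100.0],
]
ranks = rank_similarities(sim)
for (j, k), r in sorted(ranks.items(), key=lambda item: item[1]):
    print(f"{labels[j]}-{labels[k]}: similarity {sim[j][k]}, rank {r:g}")

# A strictly increasing rescaling of the similarities leaves the ranks alone.
rescaled = [[(s / 100.0) ** 3 for s in row] for row in sim]
same_ranks = rank_similarities(rescaled) == ranks
```

The final comparison makes the point of the paragraph: the value 64.3 could be rescaled arbitrarily, provided the ordering of pairs is kept, without changing what a rank-based method sees.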

Weaknesses in the earlier approach and new requirements

Some of the components of the Field et al. (1982) approach fail this consistency test.

(1) The I-statistic method for identifying indicator species is a function only of the presence or absence of species and thus has a rather tenuous link with the Bray-Curtis similarity matrix on quantitative data, used elsewhere in the strategy. Its provenance as a hypothesis testing procedure does not stand up to scrutiny in this case, as Field et al. (1982) pointed out. At a more pragmatic level, it has an inevitable tendency to select indicator species that are rare, particularly for small clusters where chance occurrence of single individuals in one group can make a rare species look like a 'perfect' indicator. It shares this property with correspondence analysis and two-way indicator species analysis (Hill 1973; 1979; Greenacre 1984), as these methods are deliberately designed to identify clusters with mutually exclusive species sets.

(2) The species analysis, based on a matrix of similarities between species in their disposition across samples, has also been found to have limited practical use. This is principally because of the high degree of noise in individual species patterns, causing difficulty in representing the among-species relationships in a low-dimensional ordination. The approach tends to be informative only in strongly clustered situations in which most species divide into mutually exclusive groups. (For further discussion of why ordination of species might be expected to cause more problems than ordination of samples see Faith 1991.) In practice, questions concerning species identities often turn into queries about what effect individual species have in determining the among-sample relationships seen in a cluster dendrogram or MDS plot. There is thus a requirement to identify individual species contributions to the underlying sample similarity matrix.

(3) Hierarchical cluster analysis of samples, using group-average linking, is not a function only of the rank similarities, though single-linkage clustering would be. Use of single-linkage is not advised, however, because it has a tendency to give rise to unhelpful dendrogram plots, with chain linking of samples rather than clearly defined groups (Everitt 1980), even in situations where the latter are to be expected. The requirement here is to reconcile the minor inconsistency between the information used in a non-metric MDS and that exploited by group-average clustering, particularly where the point of doing a cluster analysis is to check the adequacy of an MDS ordination, by superimposition of the clusters (see later discussion). This is not a requirement of any great theoretical or practical import but can be simply achieved.

(4) A cluster analysis or MDS ordination of the full set of samples deliberately makes no use of the way the samples are structured (e.g. replicates within different sites, times, etc.). The overall sample pattern is displayed and one can then judge visually whether, for example, different sites appear to have differing community composition, based on the variation among replicates within a site. This visual comparison may be rather inadequate in pollution impact studies for which there may often be a priori hypotheses about (lack of) differences before and after an impact, or between control and impacted sites. There is also increasing interest in designed studies of pollution or disturbance effects on communities, involving field manipulations or experimental mesocosms (e.g. Gee et al. 1985). These designs are handled in classical statistics by multivariate analysis of variance (MANOVA, e.g. Mardia et al. 1979) but the required assumptions of multivariate normality are impossible to meet for many benthic community data sets. There is a clear requirement, not addressed by Field et al. (1982), to extend the ad hoc multivariate methods to encompass formal hypothesis-testing situations, without sacrificing the strongly non-parametric (distribution-free) nature of analyses based on rank similarities.

(5) Finally, in many impact studies, the biotic samples will be supplemented by matched environmental data, both on levels of pollutants and underlying physical variables that could be structuring the community (such as sediment grain size or depth of the water column). Field et al. (1982) illustrated the linking of environmental information to the biological analysis, by superimposing the values of the abiotic variables, one at a time, on to the respective sample positions in the biotic MDS. While this can be a powerful visual tool in simple situations, it gives no basis for answering such questions as 'how well does the full set of recorded environmental data explain the observed community pattern?' and 'is there a subset of the environmental variables that explains the pattern equally well, or better?' These questions are answered in classical multivariate statistics by techniques such as canonical correlation (e.g. Mardia et al. 1979) but, as remarked above, there will be few impact studies on complete communities for which classic multivariate assumptions are valid. What is needed here is an analogue of canonical analysis based only on rank similarities.
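
The remark in point (3), that single-linkage clustering depends only on rank order whereas group-average linking does not, can be made concrete with a naive agglomerative routine. This is an illustrative sketch, not an implementation from the paper: the function names and the five-sample dissimilarity matrix are hypothetical, and `to_ranks` assumes there are no tied dissimilarities.

```python
from itertools import combinations


def agglomerate(dissim, linkage="average"):
    """Naive hierarchical agglomerative clustering.

    dissim is a symmetric matrix of dissimilarities. Returns the merge
    sequence as a list of (cluster_1, cluster_2) pairs of frozensets.
    linkage is "single" (nearest neighbour) or "average" (group-average).
    """
    clusters = [frozenset([i]) for i in range(len(dissim))]
    merges = []

    def between(c1, c2):
        values = [dissim[i][j] for i in c1 for j in c2]
        if linkage == "single":
            return min(values)
        return sum(values) / len(values)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((c1, c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges


def to_ranks(dissim):
    """Replace each off-diagonal dissimilarity by its rank (1 = smallest)."""
    n = len(dissim)
    values = sorted({dissim[i][j] for i, j in combinations(range(n), 2)})
    rank_of = {v: r for r, v in enumerate(values, start=1)}
    return [[0 if i == j else rank_of[dissim[i][j]] for j in range(n)]
            for i in range(n)]


# Hypothetical dissimilarities among five samples (all values distinct).
dissim = [
    [0, 12, 45, 50, 80],
    [12, 0, 40, 55, 78],
    [45, 40, 0, 20, 60],
    [50, 55, 20, 0, 65],
    [80, 78, 60, 65, 0],
]
# Squaring is a monotone change of scale: ranks are kept, values are not.
squared = [[d * d for d in row] for row in dissim]

single_raw = agglomerate(dissim, "single")
single_squared = agglomerate(squared, "single")
average_on_ranks = agglomerate(to_ranks(dissim), "average")
```

Single linkage gives the same merge sequence before and after squaring. Group-average linking averages the values themselves and so carries no such guarantee; feeding it the rank matrix, as in the last line, is one simple way of making it a function of the ranks alone.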


Outline of proposed strategy

This paper therefore attempts to erect a coherent framework for non-parametric multivariate analysis of community data, that is, a combination of techniques that acknowledge the primacy of the among-sample similarity matrix and whose inferences are drawn only from its ranks (or at least consistently with that starting point). The consequent lack of model assumptions will confer a general validity of application which it would be hard to improve upon, though it should be recognized that there will be a price to pay: some features of classical multivariate analysis have no obvious analogue in this similarity-based setting.

The main sections of this paper deal with the following components of this strategy.

(1) Display of community pattern through ordination and clustering. The rationale and utility of MDS are illustrated with several examples, including accumulated practical experiences of the algorithm's behaviour. The possibility of applying group-average clustering to the rank similarities, rather than the absolute values, is discussed in passing.

(2) Determining the species responsible for sample groupings observed in a cluster analysis. Naturally, because information identifying individual species has been entirely lost from the among-sample similarity matrix, some return to an earlier phase of the analysis is essential for this objective. In line with the underlying principle, however, the only information exploited is the contribution each species makes to the chosen similarity coefficient (after the chosen standardization, transformation etc.). The mean contribution of each species to the dissimilarity of two clusters is defined as an average over all cross-group pairs of samples. This yields an assessment of which species are good discriminators of these two groups. A subtly different question is to ask which species are typical of specific groups, in the sense of making a large contribution to the average similarity between every pair of samples within a group. (Species can be typical of more than one group and thus poor discriminators between groups.)

(3) Testing for spatial and temporal differences in community structure when samples are adequately replicated and hypotheses defined a priori. For example, with suitable replicate samples from each of a number of sites, the hypothesis of 'no site-to-site differences' can be tested by permutations of the rank similarity matrix. This 'analysis of similarities' (ANOSIM) test is a distribution-free analogue of one-way ANOVA. Higher-way analogues also occur in practice. For example, adding another level of structure, in which sites are selected to represent impacted or control locations, produces a two-way nested design. Both this and a two-way crossed layout can be handled, at least in part, by a further permutation or randomization procedure.

(4) Linking community patterns to environmental variables, where a suite of abiotic data has been collected to match each biotic sample. The questions posed earlier, about the extent to which the abiotic data 'explains' the community structure, are answered by an indirect analysis, again only involving rank similarities. It is based on the premise that pairs of samples which are rather similar in their values for a set of environmental data would be expected to have rather similar species composition, provided the relevant variables determining community structure have been identified correctly. If this is so, separate ordinations of biotic and abiotic data would be expected to show a close match. This gives rise to a simple optimization routine (the BIO-ENV procedure), which selects that subset of environmental variables maximizing a (modified) rank correlation between the biotic and abiotic similarity matrices. Clarke and Ainsworth (1993) describe the procedure in detail so the technique is only outlined here and illustrated by a new example.
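
The subset search described in component (4) might be sketched as below. Several simplifications are assumed and should not be read as the published procedure: an ordinary Spearman coefficient stands in for the modified rank correlation of Clarke and Ainsworth (1993), abiotic dissimilarity is taken as Euclidean distance on variables assumed to be already on comparable scales, and every non-empty subset is tried exhaustively. All names and data are hypothetical; the biotic matrix is synthetic, built from the first variable alone, so that the expected answer is known.

```python
from itertools import combinations
from math import exp, sqrt


def average_ranks(values):
    """Ranks (1 = smallest) with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def env_distances(env, subset):
    """Euclidean distances between samples using only the chosen variables."""
    return [sqrt(sum((env[j][v] - env[k][v]) ** 2 for v in subset))
            for j, k in combinations(range(len(env)), 2)]


def bio_env(biotic_dissim, env):
    """Return (best_correlation, best_subset) over all non-empty subsets."""
    n = len(biotic_dissim)
    biotic = [biotic_dissim[j][k] for j, k in combinations(range(n), 2)]
    best = None
    for size in range(1, len(env[0]) + 1):
        for subset in combinations(range(len(env[0])), size):
            rho = spearman(biotic, env_distances(env, subset))
            if best is None or rho > best[0]:
                best = (rho, subset)
    return best


# Five samples, two hypothetical environmental variables per sample.
env = [[0.0, 0.9], [1.0, 0.1], [2.0, 0.8], [3.0, 0.2], [4.5, 0.5]]
# Synthetic biotic dissimilarities driven by variable 0 only.
biotic_dissim = [[100 * (1 - exp(-abs(a[0] - b[0]))) for b in env] for a in env]

rho, subset = bio_env(biotic_dissim, env)
print(f"best subset {subset}, rank correlation {rho:.3f}")
```

Because only the rank order of the two sets of dissimilarities enters the correlation, the non-linear link between the gradient and the synthetic biotic values does not prevent the driving variable from being recovered.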

Throughout the paper, there is a degree of promiscuity in the use of illustrations, which are all re-analyses of published data. While several do not refer specifically to biological effects of pollutants, they are all chosen to exemplify analyses with obvious parallels in environmental impact studies.
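
As a concrete reading of component (2) in the outline above, the sketch below averages each species' share of the dissimilarity over all cross-group pairs of samples. It assumes the Bray-Curtis coefficient in its standard form, under which the dissimilarity between samples j and k splits into per-species terms 100 |y_ij - y_ik| / sum_i(y_ij + y_ik). The function name and the abundance values are hypothetical.

```python
def species_contributions(group_1, group_2):
    """Average contribution of each species to between-group dissimilarity.

    group_1 and group_2 are lists of samples; each sample is a list of
    (already transformed) abundances, one entry per species. The returned
    list sums to the average Bray-Curtis dissimilarity (0-100) over all
    cross-group pairs of samples.
    """
    n_species = len(group_1[0])
    totals = [0.0] * n_species
    n_pairs = 0
    for a in group_1:
        for b in group_2:
            denominator = sum(x + y for x, y in zip(a, b))
            for i, (x, y) in enumerate(zip(a, b)):
                totals[i] += 100.0 * abs(x - y) / denominator
            n_pairs += 1
    return [t / n_pairs for t in totals]


# Hypothetical transformed abundances: rows are samples, columns species.
species = ["sp1", "sp2", "sp3", "sp4"]
control = [[3.0, 2.0, 1.0, 0.0], [3.2, 1.8, 1.0, 0.0]]
impacted = [[0.5, 2.1, 1.0, 2.0], [0.0, 1.9, 0.0, 2.4]]

contrib = species_contributions(control, impacted)
for name, c in sorted(zip(species, contrib), key=lambda item: -item[1]):
    print(f"{name}: {c:.1f}")
print(f"average dissimilarity: {sum(contrib):.1f}")
```

In this invented example the species that is abundant in one group and scarce in the other contributes most, while a species at similar levels in both groups contributes little and would be a poor discriminator.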

DISPLAY OF COMMUNITY PATTERN

Non-metric multi-dimensional scaling

As implied earlier, non-metric MDS is often the method of choice for graphical representation of community relationships (e.g. Everitt 1978; Kenkel & Orloci 1986), principally because of the flexibility and generality bestowed by:

(1) its dependence only on a biologically meaningful view of the data, that is, choice of standardization, transformation and similarity coefficient appropriate to the hypotheses under investigation;

(2) its distance-preserving properties, that is, preservation of the rank order of among-sample dissimilarities in the rank order of distances.

The computational algorithm, an iterative optimization procedure (Kruskal & Wish 1978), is fairly complex but the principle is very simple; indeed its conceptual clarity is perhaps MDS's most important asset when communicating results of impact studies to environmental managers.

Analogy of reconstructing a map of the world

The main features of MDS are well illustrated by the following question. Starting from the triangular matrix of (great-circle) distances between every pair of major cities in the world, can one reconstruct the map of the globe, that is, place the cities in their correct location? This is what non-metric MDS sets out to do except that, in effect, it attempts to solve a harder problem, that of reconstructing the world map from only the rank order of inter-city distances (e.g. statements of the form 'Sydney is closer to Canberra than London is to Paris'). Somewhat surprisingly, it succeeds; the result is a near-perfect map of the relative locations of the cities in three dimensions. The algorithm actually works in a brute force manner by initially placing the cities in three-dimensional space at entirely arbitrary locations, and then gradually refining their relative positions in an iterative cycle (involving a combination of the numerical techniques of monotonic regression and steepest descent). The intention, though not necessarily the detail, is clear throughout: to move cities into positions in which the rank order of the inter-city distances becomes ever closer to the rank order in the original triangular matrix. The extent to which the two disagree is reflected in a stress coefficient, where stress tends to zero when the rank orders reach perfect agreement.
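
The input to this analogy is easily sketched. The example below computes great-circle distances for four of the cities and then keeps only their rank order, which is all that non-metric MDS would use. The haversine formula, the mean Earth radius and the approximate coordinates are assumptions brought in for the illustration; none of them comes from the paper.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius; an approximation


def great_circle_km(p, q):
    """Haversine great-circle distance between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


# Approximate city coordinates (degrees north, degrees east).
cities = {
    "Sydney": (-33.87, 151.21),
    "Canberra": (-35.28, 149.13),
    "London": (51.51, -0.13),
    "Paris": (48.86, 2.35),
}

distances = {
    (a, b): great_circle_km(cities[a], cities[b])
    for a, b in combinations(cities, 2)
}
# Non-metric MDS sees only this ordering, never the kilometre values.
ranked_pairs = sorted(distances, key=distances.get)
for rank, pair in enumerate(ranked_pairs, start=1):
    print(rank, pair)
```

The first two entries of `ranked_pairs` reproduce the statement quoted in the text: Sydney is closer to Canberra than London is to Paris.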

Perfect agreement will not always be possible. For example, starting with the same triangular distance matrix, suppose we now wish to construct a map of the cities in two dimensions; some distortion of the true relative positions is inevitable. Figure 1 shows the result of two-dimensional MDS applied to all great-circle distances between 39 cities (Reader's Digest Great World Atlas 1968). The reconstruction is still remarkably good, with a seemingly accurate placement of cities in local regions. The distortion is most evident for San Francisco; the MDS is attempting to reconcile the shorter (cross-Pacific) distance from San Francisco to Tokyo with the cumulatively longer distance expected from the intermediate steps of, say, San Francisco to New York to London to Delhi to Tokyo.

Fig. 1. Two-dimensional ordination of 39 world cities from non-metric multi-dimensional scaling (MDS), applied to a triangular matrix of 'great-circle' distances between every pair of cities. Stress = 0.13.

Fig. 2. Scatter plot ('Shepard diagram') of all inter-city distances from the MDS of Fig. 1 (y axis) against the corresponding true great-circle distances (x axis). The line denotes the best-fit monotonic (increasing) regression of y on x; scatter about this line defines the MDS 'stress'.

The resulting stress is usefully displayed in a Shepard diagram (Fig. 2), a simple scatter plot of the distances in the original triangular matrix against the corresponding distances between cities in the final MDS. There is little scatter at low distances in Fig. 2, bearing out the observed accuracy of the map for local regions. The continuous line denotes the fitted (monotonic increasing) regression, MDS's best estimate of the relationship between original and final distance. By definition, if rank order distances agree then there is no stress and no scatter around this line. Stress is therefore defined in terms of total scatter, here taking the value 0.13 (Kruskal's stress formula 1). This stress is by no means small in relation to other quoted values in this paper, yet the final plot gives a rather accurate representation in two dimensions. The overall picture can certainly be interpreted more easily than the original distance matrix, and is unlikely to mislead. This is reassuring for interpretation of the later ecological examples.
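
The definition of stress as scatter about a monotonic regression can be sketched in a few lines. The version below assumes the usual statement of Kruskal's stress formula 1, sqrt(sum (d - d_hat)^2 / sum d^2), where d are the distances in the ordination and d_hat the fitted monotone values; the formula is not written out in the text, and the pool-adjacent-violators routine is a standard way of obtaining the fit rather than a detail taken from the paper. Function names and the four-point configurations are hypothetical, and tied dissimilarities are simply left in the order given by a stable sort.

```python
from itertools import combinations
from math import dist, sqrt


def monotone_fit(values):
    """Least-squares non-decreasing fit (pool-adjacent-violators)."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while adjacent block means are out of order.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def kruskal_stress(dissim, config):
    """Stress (formula 1) of a configuration against original dissimilarities.

    dissim is a symmetric matrix and config a list of coordinate tuples,
    one per sample.
    """
    pairs = list(combinations(range(len(config)), 2))
    # Order pairs by original dissimilarity: the x axis of a Shepard diagram.
    pairs.sort(key=lambda p: dissim[p[0]][p[1]])
    d = [dist(config[j], config[k]) for j, k in pairs]
    d_hat = monotone_fit(d)
    return sqrt(sum((a - b) ** 2 for a, b in zip(d, d_hat))
                / sum(a ** 2 for a in d))


# A hypothetical 'true' layout and dissimilarities that are a monotone
# (squared) function of its distances.
true_config = [(0, 0), (1, 0), (3, 0), (7, 1)]
dissim = [[dist(p, q) ** 2 for q in true_config] for p in true_config]
# The same points with two of them interchanged.
swapped_config = [(0, 0), (3, 0), (1, 0), (7, 1)]

print(kruskal_stress(dissim, true_config))
print(kruskal_stress(dissim, swapped_config))
```

The first configuration agrees in rank order with the dissimilarities even though the values were squared, so its stress is zero; interchanging two points breaks the rank agreement and the stress becomes positive.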

Community changes following the Amoco Cadiz oil spill (Morlaix)

The above example may have little to do with community data but it is a helpful analogy. The triangular matrix of similarity (or dissimilarity) in species composition between every pair of samples provides rank-order statements such as 'sample A is more similar to sample B than it is to sample C'; these are entirely equivalent to the previous example's 'city A is closer to city B than to city C'. One can thus use non-metric MDS to reconstruct a map of the samples in two or more dimensions, in which relative distance apart of the samples reflects relative similarity in species composition. Just as for the world cities, there is no guarantee that the rank similarities can be accurately preserved in (say) a two-dimensional layout of the samples, and it is important to note the level of stress involved.

Figure 3 gives one example of the use of MDS in a temporal study of environmental impact. Samples of subtidal macrobenthic communities were taken by Dauvin (1984) at a single station in the Bay of Morlaix, Brittany, France, at roughly 3-monthly intervals between April 1977 and February 1982, covering the period of the Amoco Cadiz oil-tanker disaster. This took place in March 1978 at a distance of some 50 km from the Morlaix site, but the resultant oil slick was dispersed quite widely along the Brittany coast. Abundances for a total of 257 species were recorded in the 21 samples; after double-root transformation, Bray-Curtis similarities gave rise to the MDS ordination in Fig. 3. The 3-monthly samples are coded A, B, C, ..., with the oil spill taking place between samples E and F. The stress coefficient is low, and the resulting pattern invites ready interpretation: seasonal changes in community structure (e.g. A to D) were rather small in relation to the large perturbation evident following the oil spill. After 2 years the community had begun to settle down to a more stable mix, with comparable seasonal variation, but with a somewhat different species composition than originally found.

Fig. 3. MDS of approximately quarterly samples (A, B, C, ...) of macrobenthic communities in the Bay of Morlaix, Brittany, covering the period of the Amoco Cadiz oil spill. Stress = 0.09.

Sampling design

Several caveats to these conclusions are necessary, from the perspective of experimental design. The absence of replicates on each sampling occasion makes it largely impossible to infer the presence of a seasonal signal at all: the variation in quarterly samples in the pre-impact phase could simply be a reflection of sampling variability from replicate cores on a single occasion. More importantly for the main conclusions of the study, there is no spatial control (Green 1979), that is, a comparable site sampled over the same period but which remains unimpacted (or preferably several such sites, Underwood 1992). It is at least possible that all such sites along the Brittany coast, whether subject to the impact or not, exhibited a similar pattern in which marked change in community structure was forced at the same time (e.g. by local climatic factors). Observational studies are prone to such caveats, of course, and even the addition of good spatial as well as temporal controls does not guarantee immunity from criticism. It is inevitable in a purely observational study that there could be unrecorded environmental factors which co-vary closely with the contaminant signal and are themselves the cause of any observed community change (e.g. Clarke & Green 1988). Such factors can only be fully controlled by 'randomizing them out', to use the jargon of statistical experimental design. This would involve experimental protocols in which treatments (e.g. pollution impacts) are randomly allocated to experimental units (e.g. different sites), rarely a credible option for environmental impact studies in general! Nonetheless, careful design can do much to reduce the likelihood of drawing misleading inferences from biological effects studies (e.g. Underwood & Peterson 1988).

Community pattern around a North Sea oil field (Ekofisk)

An example of a spatial, rather than temporal, study is that of macrobenthic communities sampled in July 1987, at 39 locations around the Ekofisk oil field in the Norwegian sector of the North Sea (Gray et al. 1990). The design consisted of several radial transects stretching up to 4 km away from the field, with more concentrated sampling within 1 km of the main drilling activity in preceding years. A single site 30 km distant was also sampled. Figure 4 shows the sampling locations; note that to aid the later displays the sites have been renumbered from Gray et al. (1990), the new site numbers being in order of distance from the centre of the oil field (here defined as the active 2/4B&K rig complex).

Fig. 4. Sampling design for a macrobenthic community study around the Ekofisk oil field, North Sea. Note that sites have been renumbered from Gray et al. (1990), in order of distances from the current centre of drilling activity.

The data consist of abundance for 209 species for three replicate grabs from each of the 39 locations. Totalling across the replicates, root-transforming counts and computing Bray-Curtis similarities leads to the MDS ordination of Fig. 5. The MDS stress is acceptably low (see the later discussion) and the interpretation is very clear. There is a gradation of community change from the distant to central sites with, for example, all sites at a distance of around 3 km or more from the centre (including the 30 km reference) falling together at the left hand side of the plot, and clearly distinguishable from sites at intermediate distances (ca 1-3 km). This consistency of pattern is all the more remarkable because it is not at all evident from viewing the original data matrix and is not detectable in simple summary measures such as diversity indices. This is a good illustration of the sensitivity of multivariate analyses (e.g. Warwick & Clarke 1991).
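
The data preparation described at the start of this paragraph (totalling replicates, root-transforming, computing Bray-Curtis similarities) can be outlined as follows. The counts, site labels and function names are invented for illustration and bear no relation to the Ekofisk data. The coefficient is written in its 'shared abundance' form, 100 * 2 sum(min) / sum(a + b), which for non-negative data is algebraically the same as 100 (1 - sum|a - b| / sum(a + b)).

```python
from math import sqrt


def pool_replicates(replicates):
    """Sum species counts across the replicate grabs from one site."""
    return [sum(counts) for counts in zip(*replicates)]


def bray_curtis(a, b):
    """Bray-Curtis similarity (0-100) between two abundance vectors."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 100.0 * 2.0 * shared / (sum(a) + sum(b))


def similarity_matrix(sites):
    """Pool replicates, root-transform and compare every pair of sites.

    sites maps a site label to its list of replicate samples; each
    replicate is a list of counts, one entry per species.
    """
    labels = sorted(sites)
    pooled = {s: [sqrt(c) for c in pool_replicates(sites[s])] for s in labels}
    matrix = [[bray_curtis(pooled[r], pooled[c]) for c in labels]
              for r in labels]
    return labels, matrix


# Hypothetical counts: three replicate grabs of four species per site.
sites = {
    "near_rig": [[0, 12, 1, 40], [1, 9, 0, 55], [0, 15, 2, 38]],
    "mid": [[6, 10, 3, 9], [4, 14, 5, 12], [7, 8, 2, 6]],
    "reference": [[15, 3, 8, 0], [11, 5, 9, 1], [18, 2, 6, 0]],
}
labels, matrix = similarity_matrix(sites)
for label, row in zip(labels, matrix):
    print(label, [round(v, 1) for v in row])
```

The resulting square matrix (or its lower triangle) is the only input an MDS ordination of the sites would need.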

The caveats in the above discussion on sampling design also apply here. There is no proof that the drilling activity is causal to the change in the benthic community but the good design allows a strong prima facie case to be made. To deny such causality, one has to invoke less credible alternative hypotheses to explain such 'concentric circles' of differing community pattern surrounding the oil field, with the most geographically dispersed sites showing no more community variation (and arguably less variation, Warwick & Clarke 1993b) than the closely located samples at the centre.

Such alternative hypotheses exist, of course, and are most satisfactorily refuted by repeating the 'experiment' in an independent setting, that is, repeating the treatment. This is an unlikely scenario for many environmental impact studies (Underwood 1992) but does pertain to oil field monitoring in the North Sea. Gray et al. (1990) described a parallel community analysis for the Eldfisk drilling complex, a distance of 10 km from Ekofisk. This was a smaller study involving 'cross-hair' transects supplemented by additional samples near to the centre of activity; the 20 sites showed a similar pattern of community change with decreasing distance from the rigs, albeit with contours which were slightly more elliptical. The same pattern is also apparent from recent analysis of other North Sea fields (J. S. Gray, pers. comm.), and this is clearly a case where a series of purely observational spatial studies is capable of building up decisive evidence of causality.

This conclusion needs to be circumscribed in two ways. First, there is no statement here about the causative mechanism. Several features of oil field activity could be responsible for a community impact, for example toxic effects from contaminants in the drilling muds, or simply their modification of the grain size structure of sediments. Such variables are certain to be confounded with each other, to some extent. (The question of whether more progress can be made in differentiating correlative links with particular contaminant variables is returned to later.) One should distinguish between use of the term 'causality' to imply that a particular event or activity is responsible for an observed change, and its use in defining the mechanism by which this change is engineered. Second, and perhaps tritely, while the sensitivity of the multivariate approach has here provided solid evidence of biological change, there can be no imputation from this analysis alone that the effect is deleterious. Such judgements are outside the scope of this paper, though the issues are touched on in the companion paper (Warwick 1993).

Fig. 5. MDS plot for macrobenthos at 39 sampling sites (Fig. 4) around the Ekofisk oil field. Note the strong gradation of community pattern with increasing distance (site number) from the centre of drilling activity. Stress = 0.12.

Practical experiences with the MDS algorithm

Experience in ecological application of non-metric MDS over the last decade has highlighted a number of technical issues that are sometimes overlooked.

(1) An MDS configuration can be arbitrarily rotated, reflected or expanded, explaining the absence of axis annotation in plots such as Figs 3 and 5. This is clear from the world map example; exploiting only knowledge of which two cities are closest, which two are next closest et cetera, there can be no information on orientation or units of measurement. What is not arbitrary is the relative position of points in the final plot. The arbitrariness of orientation can be a practical nuisance when comparing different ordinations with the same label set, and it can be helpful to rotate an MDS configuration so that its direction of maximal variation always lies along the x axis. (This is simply achieved by applying principal component analysis [PCA] to the two-dimensional MDS configuration; this is not the same thing as applying PCA to the original data matrix of course.)

(2) The nature of the iterative algorithm, successively refining an initially arbitrary configuration of samples, will often generate solutions which are sub-optimal. It is imperative to repeat the iterative process a reasonable number of times (typically eight or nine would be advisable) and to check that the same (lowest) stress is achieved in several of the repeats. (Configurations with the same stress value, to three decimal places, are almost always identical.)

(3) Degenerate solutions can occur, in which groups of samples collapse into single points on the MDS plot. Sometimes this can be a genuine artefact and will not be found in repeat iterations; more often however it is repeatable and results from a total disjunction in the data. If the data divide into two groups, which have absolutely no species in common, then there is clearly no yardstick for determining how far apart the groups should be placed in the MDS plot. They are infinitely far apart, in effect, and it is not surprising to find that the samples in each group then collapse to a point. The solution is to split the data and carry out an ordination separately on each group. In fact, this sequential approach could be taken beneficially in less extreme situations. If an initial MDS shows the samples to be strongly clustered, and the individual groups are quite large, then separate ordinations for each group are likely to reveal the fine structure more accurately. Such an approach may be essential when the total number of samples is very large. MDS on much more than 100 samples is not only rather computation intensive (though ever-increasing computing power is rendering this a less important consideration than it once was) but more importantly it is unlikely to reveal a clear and reliable pattern. In general, the greater the number of samples the harder it will be to reflect the complexity of their inter-relationship in a two-dimensional plot, whatever the ordination technique employed. An initial cluster analysis might then form the basis for separate ordinations of two or three major groups.

(4) All ordination methods are a compromise; inherently high-dimensional data are being viewed in a lower-dimensional (often two-dimensional) plot. It can be claimed that non-metric MDS makes the best possible job of preserving among-sample relationships accurately in a low-dimensional picture; that is, after all, its raison d'être. Nonetheless, it is important to assess how well it succeeds in any specific case and modify interpretation accordingly. The simplest indicator is the stress value. Traditionally (Kruskal & Wish 1978), this is examined for MDS solutions in a range of dimensions; as the dimensionality increases, a sudden drop in the stress value indicates that a valid configuration has been found. In practice, a clear shoulder in the graph of stress versus dimensionality is rarely seen for ecological data, and practical experience suggests the following rule of thumb for interpreting Kruskal's stress formula 1.

Stress <0.05 gives an excellent representation with no prospect of misinterpretation.

Stress <0.1 corresponds to a good ordination with no real risk of drawing false inferences. A higher-dimensional plot is unlikely to add to the overall picture, though in a strongly clustered situation the fine structure of individual groups might bear separate examination.

Stress <0.2 can still lead to a usable picture, although for values at the upper end of this range there is potential to mislead; too much reliance should not be placed on the details of the plot, since a higher-dimensional solution could show a somewhat different picture.

Stress >0.2 is likely to yield plots which could be dangerous to interpret. Certainly by the time stress reaches 0.35-0.4 the samples are effectively randomly placed, bearing little relation to the original similarity ranks. (Such large values can also be generated by the user inputting similarities to a routine that expects dissimilarities, or vice versa, and the MDS plot will then tend to string the samples around the circumference of a circle.)

These guidelines are over-simplistic. For example, stress tends to increase with increasing numbers of samples. Also, it makes a difference to interpretation if contributions to the stress arise roughly evenly over all points or if most of the stress results from difficulty in placing a single point in the two-dimensional picture (as in Fig. 1). The latter problem can be identified by closer study of the Shepard diagram (in Fig. 2 many of the points furthest from the monotonic regression line are from distances involving San Francisco) and by noting that, in repeated runs of the algorithm, several of the near-optimal solutions are identical to the optimal configuration, except that one point has moved to a quite different position.

(5) A useful approach in cases of non-negligible stress is to check reliability of interpretation by superimposing underlying or complementary information onto the ordination. For example, one could connect all points whose corresponding similarities are ranked in the top 10 or 20%. A less-than-faithful preservation of these rank similarities in the MDS would be evidenced by unnatural connections. A more sophisticated variant of this is to superimpose the minimum spanning tree (Gower & Ross 1969). A third possibility, recommended by Field et al. (1982), is to indicate groupings from a cluster analysis in the MDS plot. Again, unnatural groupings suggest that among-sample relationships are more complex than can be accurately portrayed in a two-dimensional configuration. Warwick et al. (1988) gave a detailed example of this, contrasting the relative success of MDS and PCA in representing meiobenthic community response to contaminant dosing in a mesocosm experiment.

(6) If such techniques show that a two-dimensional picture is an inadequate summary, one may occasionally be able to divide the data into subsets that are capable of accurate portrayal; it is more likely though that a three- or higher-dimensional solution must be sought. Good software for three-dimensional plots is now much more widely available and could beneficially be used more often for ordinations, even where stress in two dimensions is not particularly large (Warwick & Clarke 1993a give a recent example).
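Two of the points in the list above lend themselves to a short sketch: the rotation of a two-dimensional configuration onto its principal axis (point 1) and the stress rule of thumb (point 4). The code below is illustrative only; the function names are invented here, the rotation uses the closed-form principal-axis angle for two dimensions rather than a general PCA routine, and the handling of a stress of exactly 0.2 (which the guidelines leave open) is an assumption.

```python
import math

def rotate_to_principal_axis(points):
    """Centre a 2-D configuration and rotate it so that the direction of
    maximal variation lies along the x axis (PCA of the configuration)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centred = [(x - mx, y - my) for x, y in points]
    sxx = sum(x * x for x, _ in centred)
    syy = sum(y * y for _, y in centred)
    sxy = sum(x * y for x, y in centred)
    # Angle of the first principal axis of a 2 x 2 scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * y, -s * x + c * y) for x, y in centred]

def interpret_stress(stress):
    """Map a stress value (Kruskal's formula 1) onto the rule of thumb."""
    if stress < 0.05:
        return "excellent"
    if stress < 0.1:
        return "good"
    if stress < 0.2:
        return "usable, but treat details with caution"
    return "potentially misleading"  # assumption: 0.2 itself is placed here

# Hypothetical configuration lying along the diagonal y = x.
rotated = rotate_to_principal_axis([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)])
print(rotated)
print(interpret_stress(0.12))  # the stress quoted for Fig. 5
```

The rotation leaves the relative positions of the points unchanged, which is the only non-arbitrary feature of the plot; a reflection may still be needed to align two ordinations.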

Clustering on rank similarities: Estuarine nematode communities (Exe)

To maximize the effectiveness of superimposing clusters on an MDS, one requires the cluster analysis to exploit the same information as the ordination; inconsistencies between the two displays can then be attributed unambiguously to the inadequacy of a two-dimensional description. In keeping with the earlier rationale, this suggests that group-average clustering be performed on the ranks of the similarities. For n samples, the ranks are just the integers 1, 2, ..., n(n − 1)/2, although these will usually need some adjustment for ties (by simple averaging). Depending on the clustering software used, they also need initial rescaling to lie in the interval 0-100%. There is no practical difficulty with all of this. (Although theoretically, as D. P. Faith, pers. comm. 1992, has pointed out, a closer parallel still to the advocated ordination technique would be a clustering algorithm which minimized a stress value computed between the original dissimilarities and the metric relations represented by the dendrogram, the iterative algorithm being constrained only by the rank ordering of the dissimilarities. The simpler idea of performing standard clustering on the rank similarities can be thought of as an approximation to such a procedure.)

Fig. 6. Hierarchical agglomerative clustering of nematode communities from 19 intertidal sites in the Exe estuary, UK, using group-average linking on: (a) the Bray-Curtis similarities themselves and (b) only the ranks of the Bray-Curtis similarities (rescaled).
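The ranking step described in the paragraph above (ranks 1 to n(n − 1)/2, ties averaged, then rescaled to 0-100%) might be sketched as follows. The function names and the example values are hypothetical, and two conventions are assumed here rather than taken from the text: rank 1 is given to the highest similarity, and the rescaling maps the most similar pair to 100 so that the result can be passed to clustering software that expects similarities.

```python
def average_ranks(values):
    """Rank values with 1 = largest, sharing the mean rank among ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

def rescale_ranks(ranks):
    """Map ranks 1..M onto 100..0, so the most similar pair scores 100."""
    m = len(ranks)
    if m < 2:
        raise ValueError("need at least two similarities to rescale")
    return [100.0 * (m - r) / (m - 1) for r in ranks]

# Hypothetical lower-triangle similarities for n = 4 samples (6 pairs).
similarities = [80.0, 35.0, 35.0, 60.0, 10.0, 90.0]
ranks = average_ranks(similarities)
rank_similarities = rescale_ranks(ranks)
print(ranks)              # the two tied values share rank 4.5
print(rank_similarities)
```

Group-average clustering applied to the rescaled values then uses exactly the rank information that non-metric MDS exploits.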

The practical consequences of replacing similarities by their ranks, prior to group-average clustering, are not expected to be great. An example is shown in Fig. 6, for the data used as an illustration by Field et al. (1982). Warwick (1971) sampled nematode communities in intertidal sediments at 19 locations in the Exe estuary, UK; this was not a pollution study but a basic investigation of meiofaunal community pattern in relation to differing environmental conditions. Figure 6a is the original group-average cluster analysis, based on Bray-Curtis similarities from fourth-root transformed abundances of 182 species, and Fig. 6b is the modified dendrogram using only ranks of these similarities. Note that in order to concentrate attention on the significant feature of these two figures, viz. the group structure, the rank similarities have been rescaled to the same range as the original Bray-Curtis values (cluster groupings are invariant to a scale or location change in the similarity matrix). As expected, the dendrograms are similar though not identical, the (5, 10) and (6, 11) groups now joining together at a late stage in the analysis, instead of their previous attachment to separate groups. Rather than signalling any significant change in interpretation, this is more a reflection of the arbitrariness of cluster analysis, and its necessity to form hierarchical clusters come what may. The MDS for these data (see the later Fig. 13a) confirms the structure of tight groups (1-4), (7-9), (5, 10), (6, 11) et cetera but shows these groups to be fairly evenly spaced. It is then something of a hair-line decision as to how these groups combine. This is precisely the reason why the continuum of an MDS ordination is preferred to the discreteness of a cluster analysis, provided the MDS plot has low enough stress to be reliable, as is the case here.

DETERMINING SPECIES RESPONSIBLE FOR SAMPLE GROUPINGS

Discriminating between two groups using Bray-Curtis dissimilarities

In keeping with the rationale of the opening section, it is consistent to assess the role of individual species only through their contributions to the among-sample similarity coefficients. When a cluster analysis divides a set of samples into (say) two clear-cut groups, it may be important to know which species are contributing principally to this division.

The Bray-Curtis dissimilarity $\delta_{jk}$ between any two samples j and k can be defined as

$$\delta_{jk} = \sum_{i=1}^{p} \delta_{jk}(i)$$

where

$$\delta_{jk}(i) = 100 \, |y_{ij} - y_{ik}| \Big/ \sum_{i=1}^{p} (y_{ij} + y_{ik}) \qquad (2)$$

$y_{ij}$ is the (transformed) abundance of the ith species in the jth sample and p is the number of species. There is no unambiguous partition of $\delta_{jk}$ into contributions from each species, since the standardizing term in the denominator of equation (2) is a function of all species values, but one natural definition would be to take $\delta_{jk}(i)$ as the 'contribution of the ith species' to $\delta_{jk}$. Averaging $\delta_{jk}$ over all sample pairs (j, k), with j in the first group and k in the second, gives the overall average dissimilarity, $\bar{\delta}$, between groups 1 and 2. The same averaging taken over each $\delta_{jk}(i)$ gives the average contribution, $\bar{\delta}_i$, from the ith species to this overall dissimilarity.

Typically there are many pairs of samples (j, k) making up the average $\bar{\delta}_i$, and a useful measure of how consistently a species contributes to $\bar{\delta}_i$ is the standard deviation, SD($\delta_i$), of the $\delta_{jk}(i)$ values. If $\bar{\delta}_i$ is large and SD($\delta_i$) small (and thus the ratio $\bar{\delta}_i$/SD($\delta_i$) is large), then the ith species not only contributes much to the dissimilarity between groups 1 and 2 but also does so consistently; it is thus a good discriminating species. (Note that, while SD($\delta_i$) is a permissible and convenient measure of variation here, the $\delta_{jk}(i)$ values are not independent and one cannot, for example, use a conventional mean-to-SD ratio to test if the average contribution of the ith species is effectively zero.)
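The breakdown of between-group dissimilarity described above can be sketched as follows. This is not the program used for the paper: the function names and the small abundance vectors are hypothetical, and the sample (n − 1) standard deviation is an assumption, since the text does not say which form of SD is intended.

```python
import math
from statistics import mean, stdev

def dissimilarity_terms(a, b):
    """Per-species terms of the Bray-Curtis dissimilarity between two samples:
    100 |y_ij - y_ik| divided by the total abundance of both samples."""
    total = sum(a) + sum(b)
    return [100.0 * abs(x - y) / total for x, y in zip(a, b)]

def simper_between(group1, group2):
    """Average contribution, SD and ratio for each species, over all pairs
    with one sample from each group; sorted by decreasing contribution."""
    pair_terms = [dissimilarity_terms(a, b) for a in group1 for b in group2]
    results = []
    for species, terms in enumerate(zip(*pair_terms)):
        avg = mean(terms)
        sd = stdev(terms) if len(terms) > 1 else 0.0  # assumption: sample SD
        if sd > 0:
            ratio = avg / sd
        else:
            ratio = math.inf if avg > 0 else 0.0
        results.append({"species": species, "avg": avg, "sd": sd, "ratio": ratio})
    overall = sum(r["avg"] for r in results)
    results.sort(key=lambda r: r["avg"], reverse=True)
    return overall, results

# Hypothetical transformed abundances: 2 samples per group, 3 species.
group1 = [[4.0, 0.0, 1.0], [3.0, 0.0, 2.0]]
group2 = [[0.0, 5.0, 1.0], [1.0, 4.0, 1.0]]
overall, results = simper_between(group1, group2)
print(round(overall, 1))
for r in results:
    print(r["species"], round(r["avg"], 1), round(r["sd"], 1), round(r["ratio"], 1))
```

Because the per-species terms sum to the pairwise dissimilarity, the species averages sum to the overall average dissimilarity between the groups, which is what allows them to be expressed as cumulative percentages.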

For the Exe nematode data of Fig. 6, Table 1 shows which species contribute most to the dissimilarities between two of the groups, the clusters termed 1A (sites 1-4) and 1B (sites 7-9) by Field et al. (1982). The first two columns give the abundances for each species, averaged across the sites making up groups 1A and 1B (although note that this is an average of untransformed values and the dissimilarity computations are based on fourth-root transformed data). Table 1 is then ordered by the values in the third column, the decreasing contribution, $\bar{\delta}_i$, to the total dissimilarity $\bar{\delta}$ (= 72.6).

It can be seen that many species play some part in determining the dissimilarity between the two groups, and this is typical of such analyses. In this tightly clustered situation, it is no surprise to find that the principal contributions come from species that are abundant in one group and largely absent (though not necessarily totally absent) from the other; the balance of contributions in this case is from species that are numerous in 1B but rare in 1A. Note the inconsistent contribution of certain species, such as Oxystomina elongata (high SD($\delta_i$) and low mean-to-SD ratio), implying that this would not be a very useful discriminating species for the two groups.

The final column in Table 1 cumulates the contributions, having rescaled the $\bar{\delta}_i$ to percentages of $\bar{\delta}$. It can be seen that only part of the analysis is tabled, up to the point where 70% of the total dissimilarity is accounted for. This has involved 22 species and there are a further 27 species which contribute the remaining 30% of the dissimilarity (the other 133 species are absent from both groups so contribute nothing).

Table 1. Average abundance ($\bar{y}$) of important nematode species in groups 1A (= 1-4) and 1B (= 7-9) of Exe estuary sites. Species are listed in order of their contribution ($\bar{\delta}_i$) to the average dissimilarity $\bar{\delta}$ (= 72.6) between the two groups, with a cut-off when the cumulative per cent contribution ($\Sigma\bar{\delta}_i$%) to $\bar{\delta}$ reaches 70%

| Species | $\bar{y}_{1B}$ | $\bar{y}_{1A}$ | $\bar{\delta}_i$ | SD($\delta_i$) | $\bar{\delta}_i$/SD($\delta_i$) | $\Sigma\bar{\delta}_i$% |
|---|---|---|---|---|---|---|
| Hypodontolaimus ponticus | 140.0 | 0.5 | 3.9 | 1.2 | 3.2 | 5.3 |
| Axonolaimus spinosus | 0.0 | 60.8 | 3.7 | 0.8 | 5.0 | 10.4 |
| Adoncholaimus fuscus | 55.0 | 0.0 | 3.4 | 0.6 | 6.3 | 15.2 |
| Viscosia viscosa | 84.0 | 1.0 | 3.3 | 1.0 | 3.2 | 19.7 |
| Sphaerolaimus balticus | 38.0 | 0.0 | 3.2 | 0.9 | 3.7 | 24.2 |
| Axonolaimus paraspinosus | 58.0 | 0.3 | 3.0 | 1.0 | 2.9 | 28.3 |
| Oxystomina elongata | 145.0 | 0.0 | 2.8 | 3.2 | 0.9 | 32.2 |
| Hypodontolaimus geophila | 0.0 | 38.8 | 2.7 | 1.1 | 2.6 | 36.0 |
| Tripyloides gracilis | 58.7 | 0.0 | 2.5 | 1.9 | 1.3 | 39.4 |
| Daptonema oxycerca | 96.1 | 3.5 | 2.5 | 0.5 | 5.2 | 42.8 |
| Monhystera sp | 8.0 | 0.0 | 2.2 | 0.2 | 9.4 | 45.8 |
| Praeacanthonchus punctatus | 15.0 | 0.0 | 2.1 | 0.9 | 2.5 | 48.7 |
| Adoncholaimus thalassophygas | 0.0 | 7.3 | 2.0 | 0.6 | 3.3 | 51.5 |
| Anoplostoma viviparum | 11.3 | 122.3 | 2.0 | 1.0 | 1.9 | 54.2 |
| Microlaimus robustidens | 3.0 | 0.0 | 1.8 | 0.3 | 5.6 | 56.7 |
| Enoploides spiculohamatus | 2.3 | 0.0 | 1.7 | 0.4 | 4.0 | 59.0 |
| Ascolaimus elongatus | 12.0 | 0.0 | 1.6 | 1.3 | 1.2 | 61.1 |
| Leptolaimus papilliger | 0.7 | 7.5 | 1.5 | 0.9 | 1.7 | 63.3 |
| Daptonema flevensis | 0.0 | 3.5 | 1.4 | 1.0 | 1.4 | 65.2 |
| Sabatieria pulchra | 128.7 | 47.5 | 1.4 | 1.1 | 1.3 | 67.1 |
| Rhabditid sp a | 0.0 | 3.0 | 1.3 | 0.9 | 1.5 | 68.9 |
| Halichoanolaimus robustus | 2.7 | 0.0 | 1.3 | 1.0 | 1.3 | 70.7 |

Typicality of species within a group

In much the same way, though with less practical significance, one can examine the contribution each species makes to the average similarity within a group, $\bar{S}$. The average contribution of the ith species, $\bar{S}_i$, is defined by taking the average, over all pairs of samples (j, k) within a group, of the ith term $S_{jk}(i)$ in the alternative definition of Bray-Curtis similarity:

$$S_{jk} = \sum_{i=1}^{p} S_{jk}(i) \qquad (3)$$

where

$$S_{jk}(i) = 200 \, \min(y_{ij}, y_{ik}) \Big/ \sum_{i=1}^{p} (y_{ij} + y_{ik}) \qquad (4)$$

(It may not be immediately apparent, but can be simply demonstrated, that $S_{jk} = 100 - \delta_{jk}$, that is, similarity and dissimilarity add to 100, the common convention.)
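The identity noted above follows in a few steps from the elementary relation min(a, b) = (a + b − |a − b|)/2. The derivation below is a sketch; the shorthand $T_{jk}$ for the shared denominator is introduced here for convenience and is not notation used elsewhere in the paper, and the dissimilarity is taken to be $\delta_{jk} = 100 \sum_i |y_{ij} - y_{ik}| / T_{jk}$.

```latex
\begin{align*}
T_{jk} &= \sum_{i=1}^{p} (y_{ij} + y_{ik}), \\
S_{jk} &= \frac{200 \sum_{i=1}^{p} \min(y_{ij}, y_{ik})}{T_{jk}}
        = \frac{100 \sum_{i=1}^{p} \bigl[ (y_{ij} + y_{ik}) - |y_{ij} - y_{ik}| \bigr]}{T_{jk}} \\
       &= 100 - \frac{100 \sum_{i=1}^{p} |y_{ij} - y_{ik}|}{T_{jk}}
        = 100 - \delta_{jk}.
\end{align*}
```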

The more abundant a species is within a group the more it will tend to contribute to the intra-group similarities. It typifies that group if it is found at a consistent abundance throughout, so the standard deviation SD($S_i$) of its contribution is low, and the ratio $\bar{S}_i$/SD($S_i$) high.
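The within-group ('typicality') breakdown can be sketched in the same spirit, this time averaging the per-species similarity terms over all pairs of samples inside one group. As before, the function names and abundance values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import math
from itertools import combinations
from statistics import mean, stdev

def similarity_terms(a, b):
    """Per-species terms of the Bray-Curtis similarity between two samples:
    200 min(y_ij, y_ik) divided by the total abundance of both samples."""
    total = sum(a) + sum(b)
    return [200.0 * min(x, y) / total for x, y in zip(a, b)]

def simper_within(group):
    """Average contribution, SD and ratio for each species, over all pairs
    of samples within the group; sorted by decreasing contribution."""
    pair_terms = [similarity_terms(a, b) for a, b in combinations(group, 2)]
    results = []
    for species, terms in enumerate(zip(*pair_terms)):
        avg = mean(terms)
        sd = stdev(terms) if len(terms) > 1 else 0.0  # assumption: sample SD
        if sd > 0:
            ratio = avg / sd
        else:
            ratio = math.inf if avg > 0 else 0.0
        results.append({"species": species, "avg": avg, "sd": sd, "ratio": ratio})
    overall = sum(r["avg"] for r in results)
    results.sort(key=lambda r: r["avg"], reverse=True)
    return overall, results

# Hypothetical transformed abundances: 3 samples in one group, 3 species.
group = [[4.0, 0.0, 1.0], [3.0, 0.0, 2.0], [4.0, 1.0, 1.0]]
overall_similarity, typical = simper_within(group)
print(round(overall_similarity, 1))
for r in typical:
    print(r["species"], round(r["avg"], 1), round(r["sd"], 1))
```

A species found in only one sample of the group (species 1 in this toy example) contributes nothing to the average similarity, however abundant it is there, which mirrors the remark about consistency above.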

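Table 2. Average abundance ($\bar{y}$) of important nematode species in group 1A of Exe estuary sites, and their contribution $\bar{S}_i$ to the average similarity $\bar{S}$ (= 76.8) within the group

| Species | $\bar{y}_{1A}$ | $\bar{S}_i$ | SD($S_i$) | $\bar{S}_i$/SD($S_i$) | $\Sigma\bar{S}_i$% |
|---|---|---|---|---|---|
| Anoplostoma viviparum | 122.3 | 12.0 | 1.5 | 8.1 | 15.6 |
| Daptonema setosa | 60.8 | 11.5 | 1.4 | 8.3 | 30.6 |
| Axonolaimus spinosus | 60.8 | 10.8 | 2.5 | 4.3 | 44.6 |
| Sabatieria pulchra | 47.5 | 8.6 | 1.5 | 5.7 | 55.8 |
| Desmolaimus fennicus | 16.8 | 6.9 | 0.9 | 7.4 | 64.7 |
| Hypodontolaimus geophila | 38.8 | 6.0 | 1.6 | 3.9 | 72.6 |
| Adoncholaimus thalassophygas | 7.3 | 5.1 | 1.3 | 3.8 | 79.2 |
| Daptonema oxycerca | 3.5 | 5.1 | 0.5 | 10.4 | 85.8 |
| Leptolaimus papilliger | 7.5 | 5.0 | 0.7 | 7.4 | 92.3 |

Table 3. Species contributions to similarity within group 1B of Exe estuary sites

| Species | $\bar{y}_{1B}$ | $\bar{S}_i$ | SD($S_i$) | $\bar{S}_i$/SD($S_i$) | $\Sigma\bar{S}_i$% |
|---|---|---|---|---|---|
| Daptonema oxycerca | 96.7 | 5.9 | 0.3 | 21.4 | 9.4 |
| Viscosia viscosa | 84.0 | 5.6 | 0.4 | 13.9 | 18.4 |
| Sabatieria pulchra | 128.7 | 5.5 | 1.3 | 3.7 | 27.2 |
| Hypodontolaimus ponticus | 140.0 | 4.9 | 1.1 | 4.4 | 35.1 |
| Adoncholaimus fuscus | 35.0 | 4.2 | 0.8 | 5.5 | 41.8 |
| Sphaerolaimus balticus | 38.0 | 4.0 | 1.3 | 3.1 | 48.1 |
| Axonolaimus paraspinosus | 58.0 | 3.7 | 0.5 | 7.6 | 54.1 |
| Daptonema setosa | 20.7 | 2.8 | 0.6 | 4.9 | 58.6 |
| Monhystera sp | 8.0 | 2.8 | 0.2 | 15.0 | 63.1 |
| Anoplostoma viviparum | 11.3 | 2.8 | 0.6 | 4.7 | 67.6 |
| Microlaimus robustidens | 3.0 | 2.4 | 0.3 | 9.0 | 71.5 |
| Desmolaimus fennicus | 7.3 | 2.2 | 0.3 | 7.8 | 75.0 |
| Praeacanthonchus punctatus | 15.0 | 2.1 | 0.3 | 6.8 | 78.3 |
| Enoploides spiculohamatus | 2.3 | 2.1 | 0.3 | 6.8 | 81.7 |
| Tripyloides gracilis | 58.7 | 1.5 | 2.5 | 0.6 | 84.0 |
| Metachromadora remanei | 3.7 | 0.9 | 1.6 | 0.6 | 85.5 |
| Halichoanolaimus robustus | 2.7 | 0.9 | 1.5 | 0.6 | 86.9 |
| Paracanthonchus multitubifer | 2.7 | 0.8 | 1.4 | 0.6 | 88.2 |
| Oxystomina elongata | 145.0 | 0.8 | 1.4 | 0.6 | 89.5 |
| Trefusia longicaudata | 1.3 | 0.8 | 1.4 | 0.6 | 90.8 |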


Table 2 gives the species breakdown of similarities for group 1A of the Exe estuary nematode study. The first column is again a simple mean abundance within the group and the table is ordered by the decreasing values of its second column, the contributions $\bar{S}_i$ to the total similarity $\bar{S}$ = 76.8. The table is again incomplete, the final column showing that only nine species contributed 90% of this similarity. Table 3 is the equivalent information for group 1B, where there are now more species contributing to the total similarity $\bar{S}$ = 62.6. It is worth noting again that the species with the highest average abundance, Oxystomina elongata, contributes little because it is not found consistently within the group. Additionally, note that typicality within a group does not guarantee that a species is a good discriminator between two groups. One example is Sabatieria pulchra which features quite high in both Tables 2 and 3 but less so in Table 1; it is abundant in both groups 1A and 1B and will therefore distinguish poorly between them.

These similarity/dissimilarity breakdowns (termed the 'similarity percentages' or SIMPER procedure) have been carried out for the other groups in the Exe nematode study, including comparisons made for these data in Field et al. (1982), by the 'I-test' procedure. (The latter does not lend itself to small numbers of samples, so the comparison between groups 1A and 1B was not performed by Field.) The conclusions bear out the remarks in the Introduction, that the I-test approach is biased towards rarer species and does not highlight the range of species responsible for defining the clustering pattern, as seen in the SIMPER analyses.

TESTING FOR SPATIAL AND TEMPORAL DIFFERENCES

Impact of coral mining on reef-fish communities (Maldives)

In the previous section, the emphasis was on a posteriori grouping of the samples in examining which species are principally responsible for an observed clustering. In other situations the grouping may be an a priori one and the need is for a formal statistical framework within which to test hypotheses about differences in community structure between groups. The simplest examples are of 'one-way layouts', to use the terminology of analysis of variance (ANOVA).

Fig. 7. MDS ordination showing a clear distinction between reef-flat fish communities from 11 mined (M) and 12 control (C) sites in the Maldive Islands. Stress = 0.08.

Figure 7 is an MDS plot based on abundance data for reef-flat fish communities (152 species) recorded at 23 sites in the Maldives (Dawson-Shepherd et al. 1992). The purpose of the study was to examine if there were detectable impacts on the fish communities from the widespread mining of reef corals, for building materials, that takes place around certain of the Maldive Islands. A number of reef sites were therefore selected from mined areas (denoted M in Fig. 7) and a roughly equal number of non-mined sites were chosen to act as controls (C in Fig. 7). The effect of coral mining is to reduce the complexity of the reef habitat yet, surprisingly, the mean Shannon diversity of the fish communities does not differ between mined and control sites (Dawson-Shepherd et al. 1992). By contrast, the multivariate analysis (Fig. 7) shows a clear-cut difference in community structure with a near-total separation of mined and control sites. A statistical test is obviously unnecessary in this case but there will be many analogous situations in which a formal test of the null hypothesis of 'no impact' would be desirable.

As discussed in the Rationale, such a test would be an analogue of the one-way ANOVA used for testing diversity indices, but the classical multivariate equivalent (MANOVA) applied to the full species matrix is inappropriate for many community data sets and is also not in keeping with the distribution-free stance taken by this paper. (An alternative which is sometimes suggested is inference on the ordination co-ordinates, e.g. a two-dimensional MANOVA, Faith 1990; this also has a number of problems, such as the formal lack of independence of samples in the ordination space, the usual difficulties with making MANOVA assumptions of equal variance-covariance matrices and the arbitrariness of the choice of ordination dimensionality within which to carry out the test; see below.) A simple non-parametric procedure which avoids these problems is possible, however, and the principle is outlined for a 'biological effects' study by Clarke and Green (1988). The history of such permutation tests can be traced to epidemiological work by Mantel (1967) and the randomization principle employed to generate significance levels is due to Hope (1968). It is convenient to illustrate the details by a more borderline case than the Maldives study.

A permutation test for the one-way layout: Frierfjord macrobenthos

Gray et al. (1988) described a study of subtidal benthic communities at several sites in Frierfjord/Langesundfjord, in the context of the IOC Oslo Workshop on Biological Effects of Pollutants (Bayne et al. 1988). Figure 8a is from an extract of these data, and shows the MDS for a 12 sample × 110 species abundance matrix, in which samples have the structure of four replicates at each of three sites (denoted B, C, D). This example can be thought of as representative of many other situations in which it is desirable to establish site-to-site differences in community structure before going on to interpret those differences in terms of the biology or the environmental conditions. The null hypothesis is therefore that of 'no differences between sites'.

To examine this hypothesis, one first needs to construct a test statistic reflecting observed differences between sites, contrasted with differences among replicates within sites. From Fig. 8a, a natural choice might be to compute the average distance between every pair of replicates within a site and contrast this with the average distance apart of all pairs of samples corresponding to replicates from different sites. A test could certainly be constructed from these distances but has a number of drawbacks.

(1) Such a statistic could only apply to a situation in which the method of display was an MDS rather than, say, a cluster analysis.

(2) The result would depend on whether the MDS was constructed in two, three or higher dimensions. The earlier discussion showed that there is often no correct dimensionality and one may end up viewing the picture in several different dimensions; it would be unsatisfactory to generate different test statistics in this way.

(3) The configuration of B, C and D replicates in an MDS would also differ slightly if the MDS included additional samples, for example the full set of sites A-E, G in fig. 2a of Gray et al. (1988). It is again undesirable that a test statistic for comparing only B, C and D depends on what happens at other sites.

Fig. 8. Testing for benthic community differences between three sites (B, C, D) in Frierfjord, Norway, based on four replicate cores per site. (a) MDS of the 12 samples, stress = 0.12. (b) Simulated distribution of the test statistic R, a difference of rank similarities between and within sites, under the null hypothesis of 'no site-to-site changes' in community structure. The true value of R, 0.45, falls above the simulated range.


These three difficulties disappear if the test is based not on distances between samples in an MDS but on the corresponding (rank) similarities between samples in the underlying triangular similarity matrix. If $\bar{r}_W$ is defined as the average of all rank similarities among replicates within sites, and $\bar{r}_B$ is the average of rank similarities arising from all pairs of replicates between different sites, then a suitable test statistic is

$$R = \frac{\bar{r}_B - \bar{r}_W}{M/2} \qquad (5)$$

where M = n(n − 1)/2 and n is the total number of samples under consideration. Note that the highest similarity corresponds to a rank of 1 (the lowest value), following the usual mathematical convention for assigning ranks.

The denominator constant in equation (5) has been chosen so that: (i) R can never technically lie outside the range (−1, 1); (ii) R = 1 only if all replicates within sites are more similar to each other than any replicates from different sites; and (iii) R is approximately zero if the null hypothesis is true, so that similarities between and within sites will be the same on average.

R will usually fall between 0 and 1, indicating some degree of discrimination between the sites. For the Oslo Workshop data of Fig. 8a, n = 12, M = 66, $\bar{r}_B$ = 37.5, $\bar{r}_W$ = 22.7, so that R takes the value 0.45. (This is based on the similarities only for the sites B, C, D, extracted from the matrix for all sites and re-ranked.)
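As a check on the arithmetic, the quoted value of R can be reproduced from the figures given above, taking M = n(n − 1)/2 with n = 12 samples and the statistic as the difference of the two average rank similarities divided by M/2:

```latex
\begin{align*}
M &= \frac{n(n-1)}{2} = \frac{12 \times 11}{2} = 66, \\
R &= \frac{\bar{r}_B - \bar{r}_W}{M/2} = \frac{37.5 - 22.7}{33} = \frac{14.8}{33} \approx 0.45.
\end{align*}
```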

R substantially less than zero is an unlikely contingency since it would correspond to similarities across different sites being higher than those within sites; such an occurrence is more likely to indicate an incorrect labelling of samples. The R statistic itself is a useful comparative measure of the degree of separation of sites, though one is often initially concerned with the simple question of whether it is significantly different from zero. (It should not be forgotten though that, as with standard univariate tests, it is perfectly possible for R to be significantly different from zero yet inconsequentially small, if there are many replicates at each site.)

Under the null hypothesis H0: 'no differences between sites', there will be little effect on average to the value of R if the labels identifying which replicates belong to which sites are arbitrarily rearranged; the 12 samples are just replicates from a single site if H0 is true. This is the rationale for a permutation test of H0; all possible allocations of four B, four C and four D labels to the 12 samples are examined and the R statistic recalculated for each. In general there are

$$\frac{(kn)!}{(n!)^{k} \, k!} \qquad (6)$$

distinct ways of permuting the labels for n replicates at each of k sites, and the equation gives 5775 permutations in this case. It is computationally possible to examine such a number of re-labellings but the scale of calculation can quickly get out of hand with modest increases in replication, so the full set of permutations is randomly sampled (usually with replacement) to give the null distribution of R. In other words, the labels in Fig. 8a are randomly reshuffled, R recalculated and the process repeated, say, 1000 times. In this case, the resulting spread of R values is shown in the histogram of Fig. 8b; they range from about −0.3 to just over 0.4, with a right-skewed frequency distribution. This is the range of likely values of R if H0 is correct. The true value of R, at 0.45, is therefore seen to be an unlikely event, with less than a one in 1000 chance, so the null hypothesis can be rejected with significance level P < 0.1%. This is the randomization test principle of Hope (1968). (Note that this approach should not be confused with the more approximate technique of bootstrapping, Efron 1979, in which the replicate data would be re-sampled, with replacement. Here we are only sampling from the permutation distribution, and are doing this only to save unnecessary computation in evaluating the full set of permutations. By increasing the number of randomizations, significance levels can be determined as accurately as is necessary to demonstrate, or fail to demonstrate, rejection of H0. In that sense it is an exact test.)
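The count of distinct re-labellings, (kn)!/[(n!)^k k!] for n replicates at each of k sites, is easy to evaluate directly, which also shows how quickly it grows with replication. The function name below is invented for this sketch.

```python
from math import factorial

def distinct_relabellings(n_replicates, k_sites):
    """Number of distinct ways of allocating site labels to samples,
    for n replicates at each of k sites: (kn)! / ((n!)^k k!)."""
    numerator = factorial(n_replicates * k_sites)
    denominator = factorial(n_replicates) ** k_sites * factorial(k_sites)
    return numerator // denominator

print(distinct_relabellings(4, 3))  # four replicates at each of three sites
print(distinct_relabellings(5, 3))  # one extra replicate per site
print(distinct_relabellings(6, 3))
```

With four replicates at three sites this gives the 5775 permutations quoted above; a single extra replicate per site already raises the count to 126126, which is why random sampling of the permutations is the practical route.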

It is possible to derive some 'large sample' results (Mantel 1967) for the null frequency distribution of statistics such as equation (5), and for the approximate behaviour of their variance (Clarke 1990). In practice though, the simplest and safest approach is to test R by evaluating a large number of random rearrangements, as above. This was carried out here by a specially written FORTRAN program (ANOSIM) but would be relatively easy to implement in macro-languages of standard statistical packages.


Generality of application: Coral communities (South Tikus Island)

Though the above exposition has been in terms of testing for site differences, by referral to replicate samples within a site, it clearly is equally applicable to other one-way layouts. A more practically important case is represented by the earlier Maldives example (Fig. 7), where the replicates were the different sites, which divide into two treatment levels: mining impact and control conditions. A test of the null hypothesis of 'no effect of mining on fish communities' gives a large, positive value for R of 0.74 and, not surprisingly, the hypothesis is rejected at virtually any significance level one cares to nominate. In drawing conclusions, however, one should bear in mind that this is an observational study and not an experiment in which mining impact has been allocated to the sampling sites at random. Mining activity is concentrated in a more local region than the full geographic range covered by the control sites (Dawson-Shepherd et al. 1992); other factors could be responsible for the different communities in that region. This is mitigated to some extent by the observations that: (i) there are one or two control sites within the general area of mining damage, and these group with the other control sites; and (ii) the community variation among control sites appears to be smaller than among mined sites, in spite of the control sites' greater geographic spread. A simple model in which community differences were a function of spatial separation and not mining activity could generate fallacious discrimination of mined and non-mined sites but would reverse the pattern of variability observed in Fig. 7.

Temporal rather than spatial studies also generate one-way layouts, which are amenable to testing in the above fashion. A further example of coral-reef work is reported by Warwick et al. (1990b), this time a study of the coral communities themselves. Figure 9 displays the MDS of replicate transects across a single reef site in South Tikus Island, Thousand Islands, Indonesia, with 10 replicates from each of the two years 1981 and 1983. (The data were percentage cover of the transect by each of 58 coral species and it was not considered necessary to transform the observations prior to calculating Bray-Curtis similarities.) A change in the community pattern between the two years is evident, reflecting a coral-bleaching episode putatively linked to the 1982-83 El Niño. It is interesting to ask whether the permutation/randomization test described above has any validity or power to detect this change. This question is by no means an 'Aunt Sally'; the conventional univariate ANOVA or t-test does not detect a change which corresponds to a variance increase rather than a location shift. Furthermore, the standard test is invalid if the variance differs substantially between the two groups. Similar caveats apply to the 'classical' multivariate analysis of variance (MANOVA) tests, such as Wilks' lambda (Mardia et al. 1979), which make an assumption of equal variance-covariance matrices for the two groups, in addition to approximate normality of the species data.

By contrast, no such assumptions have been spelt out for the randomization test, and they are not required. The strength of this test is undoubtedly its simplicity and validity in almost any situation. It will have some power to detect the sort of change seen in Fig. 9; the R statistic is 0.43 and this is significant at the P < 0.1% level. Naturally, in cases where the strong assumptions of multivariate normality are justified, and the change is only a shift in location not variance, a classical test would be expected to have greater power and should be used: much richer inference is possible. In many practical situations, however, the distribution-free approach is likely to be more appropriate, provided sufficient attention is paid to the replication level to generate a reasonable number of possible permutations. If only two replicates are taken for each

Fig. 9. MDS ordination indicating a difference in variability of coral community structure at a single site from South Tikus Island, Indonesia. There were 10 replicate transects in each of 1981 (1) and 1983 (3). Stress = 0.11.


of two groups then there are only three distinct permutations and a 5 or 10% significance level test could not be constructed. Four replicates from each of two groups (35 permutations) are needed for a 5% level test, though the number of possibilities rises steeply once there are more than two groups; for example there are 280 permutations of three groups of three replicates and thus a potential P < 0.5% level test.
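The permutation counts quoted here are easily checked. The sketch below counts distinct relabellings for a list of group sizes; for k groups of n replicates it evaluates (kn)!/((n!)^k k!). The extension to unequal group sizes, in which only groups of equal size are treated as interchangeable, is an assumption of this sketch, and the function names are hypothetical.

```python
from collections import Counter
from math import factorial


def distinct_relabellings(group_sizes):
    """Count distinct ways of allocating group labels to the samples.

    Balanced case (k groups of n replicates): (kn)! / ((n!)^k k!).
    Unbalanced case: only groups of equal size are treated as
    interchangeable (an assumption made for this sketch).
    """
    count = factorial(sum(group_sizes))
    for size in group_sizes:
        count //= factorial(size)      # order within a group does not matter
    for repeats in Counter(group_sizes).values():
        count //= factorial(repeats)   # swapping equal-sized groups changes nothing
    return count


def smallest_p_percent(group_sizes):
    """Smallest attainable significance level, as a percentage."""
    return 100.0 / distinct_relabellings(group_sizes)


for sizes in ([2, 2], [4, 4], [3, 3, 3], [4, 4, 4]):
    print(sizes, distinct_relabellings(sizes), round(smallest_p_percent(sizes), 2))
```

The four designs listed give 3, 35, 280 and 5775 relabellings respectively, in agreement with the figures in the text.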

Before leaving the South Tikus Island data, it is worth noting that Warwick and Clarke (1993b) discuss this example in the context of increased variability of community pattern as an indicator of pollution or disturbance impact. (Increased variability in population numbers through space or time has previously been adduced as a consequence of disturbance, e.g. by Underwood 1991.) Warwick and Clarke (1993b) propose a comparative index of multivariate dispersion which follows the rationale of this present paper, being only a function of the rank similarity matrix.

Many impact studies are not confined to one-way layouts; some recommended designs mix both spatial and temporal components and have hierarchies of spatial sampling (e.g. Green 1979, 1993; Underwood 1992). These designs have been evolved in the context of normality-based ANOVA, applied to abundance data for a single species or to a diversity measure. In the attempt to reduce the likelihood of observed changes not being causally related to the impact, some of these designs can become very complex (Underwood 1992). It is a major challenge to even begin to translate such structures into the distribution-free multivariate context of the present paper, and one that could only ever be partially successful. A first step is possible, namely some results for the two basic types of two-way layout.

Two-way nested layout: Impacts on nematode communities (Clyde)

Simple nested designs arise in spatial studies, where two levels of spatial replication are involved. For example, Lambshead (1986) analysed meiobenthic communities from three putatively impacted areas of the Firth of Clyde and three control sites, taking three replicate samples from most of the sites. The resulting MDS, based on fourth-root transformed abundances of the 113 species in the 16 samples, is given in Fig. 10a. Note that the third

Fig. 10. Testing for community differences in a two-way nested layout of 16 nematode samples from the Firth of Clyde. (a) MDS for three 'polluted' (P1, P2, P3) and three 'control' sites (C1, C2, C3), with three replicate samples at most sites (stress = 0.09). (b) Simulated distribution of the R test statistic, under the null hypothesis of 'no difference between sites' within either condition (C or P).

control site has only one replicate. The sites are numbered 1 to 3 for both conditions but the numbering is arbitrary; for example, there is nothing in common between P1 and C1. This is what is meant by sites being 'nested' within conditions. Two questions are then appropriate.

(Q1) Can we reject the null hypothesis of no difference among sites within each treatment (control or polluted conditions)?

(Q2) Can we reject the null hypothesis of no difference between control and polluted conditions? The approach to question 2 might depend on the outcome of question 1.

Question 1 can be answered by extending the one-way permutation test of the last section to a constrained randomization procedure. The presumption under question 1 is that there may be a difference between general location of C and P


samples in the MDS plot (Fig. 10a) but within each condition there cannot be any pattern in allocation of replicates to the three sites. This is patently false in Fig. 10a, so one would expect to reject the null hypothesis here. Treating the two conditions entirely separately, one therefore has two separate one-way permutation analyses of exactly the same type as for the Frierfjord macrobenthic samples (Fig. 8a). These generate test statistics R_C and R_P, say, computed from equation (5). These may be combined to produce an average statistic, R, which can be tested by comparing it with R values from all possible permutations of sample labels permitted under the null hypothesis. This does not mean that all 16 sample labels may be arbitrarily permuted; the randomization is constrained to take place only within the separate conditions: P and C labels may not be switched. Even so, the number of possible permutations is large; for a balanced design it would be given by the square of equation (6). Here the design is slightly unbalanced by the loss of two replicates, so equation (6) needs some modification, but there are around 20 000 distinct permutations. Notice incidentally that there is nothing inherent in the randomization procedure, for either one-way or two-way cases, that restricts its use to balanced designs. It may even, as here, cope with some sites that are represented by a single sample, provided there are enough replicates elsewhere to generate sufficient permutations. There must be a sense in which the power of the test is weakened by the lack of balance but this is a difficult area to examine in any formal way: any definition of power requires a precise alternative hypothesis to be nominated and it is hard to see how this could be specified. The lack of balance also causes a minor complication in the definition of the average R of R_P and R_C; some minor efficiency gain will be possible if R is a weighted average in the unbalanced case. It is relatively straightforward to exploit the variance approximation given by Clarke (1990) to effect an optimal weighting, though this will not be pursued here. The present example uses a straight average of R_P and R_C.
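The constrained randomization can be sketched as follows. Several points are assumptions of this illustration rather than statements about the ANOSIM program: the R statistic is taken to be (mean between-group rank - mean within-group rank) / (n(n-1)/4), similarities are re-ranked separately within each condition, the strata are combined by a straight (unweighted) average, and the observed labelling is counted as one permutation. All names and the toy data are hypothetical.

```python
import itertools
import random


def rank_values(table):
    """Map each key to the average rank of its value (rank 1 = smallest)."""
    keys = sorted(table, key=lambda k: table[k])
    ranks, start = {}, 0
    while start < len(keys):
        end = start
        while end + 1 < len(keys) and table[keys[end + 1]] == table[keys[start]]:
            end += 1
        for key in keys[start:end + 1]:
            ranks[key] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def anosim_r(ranks, labels):
    """Assumed form of equation (5) for one stratum."""
    n = len(labels)
    within = [r for (i, j), r in ranks.items() if labels[i] == labels[j]]
    between = [r for (i, j), r in ranks.items() if labels[i] != labels[j]]
    return (sum(between) / len(between) - sum(within) / len(within)) / (n * (n - 1) / 4)


def stratum_ranks(dissim, members):
    """Rank the dissimilarities among the samples of one stratum only."""
    pairs = {(a, b): dissim[members[a]][members[b]]
             for a, b in itertools.combinations(range(len(members)), 2)}
    return rank_values(pairs)


def two_way_anosim(dissim, labels, strata, n_perm=999, seed=0):
    """Average R over strata; labels are shuffled only within each stratum."""
    groups = {}
    for index, stratum in enumerate(strata):
        groups.setdefault(stratum, []).append(index)
    blocks = [(stratum_ranks(dissim, m), [labels[i] for i in m])
              for m in groups.values()]

    def mean_r(label_sets):
        total = sum(anosim_r(ranks, labs)
                    for (ranks, _), labs in zip(blocks, label_sets))
        return total / len(blocks)

    observed = mean_r([labs for _, labs in blocks])
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        # Constrained relabelling: a separate shuffle inside every stratum.
        shuffled = [rng.sample(labs, len(labs)) for _, labs in blocks]
        if mean_r(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): two conditions, two sites of two replicates in each.
coords = [0, 1, 10, 11, 100, 101, 110, 111]
dissim = [[abs(a - b) for b in coords] for a in coords]
sites = ["1", "1", "2", "2", "1", "1", "2", "2"]
conditions = ["C"] * 4 + ["P"] * 4
r_avg, p_val = two_way_anosim(dissim, sites, conditions)
```

For the Clyde layout, taking the group sizes as (3, 3, 3) for the polluted sites and (3, 3, 1) for the control sites and treating equal-sized sites as interchangeable, the count is 9!/((3!)^3 3!) = 280 for P and 7!/(3! 3! 1! 2!) = 70 for C, a product of 19 600, consistent with the 'around 20 000' quoted above.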

Figure 10b displays the results of simulating the full permutation distribution for R under the null hypothesis. That is, R is computed for 1000 different (constrained) re-labellings of the points in Fig. 10a. Possible values range from -0.3 to 0.6, although 95% of the values are <0.27 and 99% are <0.46. The true value of R at 0.75 thus provides a strongly significant rejection of the null hypothesis that there are no site-to-site differences within a specific condition.

Question 2, which will usually be the more interesting of the two hypotheses, can now be examined. The test of question 1 demonstrated that there are, in effect, only three replicates (the sites 1-3) at each of the two conditions (C and P). This is a one-way layout, and the null hypothesis that there is no pollution impact can be tested by the one-way ANOSIM procedure of the previous subsection. (This is exactly homologous to the more familiar univariate nested analysis of variance, where treatments would now be tested by an F statistic on (1, 4) degrees of freedom, having discovered significant variations among sites within treatments.)

One initial decision still needs to be made, namely how best to combine the information from the three replicates at each site, so that a similarity matrix can be defined for the six new 'replicates' (sites C1-C3, P1-P3). One possibility is simply to pool the original data across the initial replicates and compute an entirely new similarity matrix. More satisfactory, however, certainly within the rationale of this paper, would be to retain dependence only on the rank similarities in the original triangular matrix, by averaging over the appropriate ranks to obtain a reduced matrix. For example, the similarity between the three P1 and three P2 replicates is defined as the average of the nine inter-group rank similarities; this is placed into the new similarity matrix along with the 14 other averages (C1 with C2, P1 with C1 etc.) and all 15 values are then re-ranked. Applying the one-way test to this re-ranked matrix gives R = 0.74. There are only 10 distinct permutations (6!/3!3!2!) in this case so that, although the true R of 0.74 is actually the most extreme value possible, the null hypothesis of no difference between control and polluted conditions is only able to be rejected at a P ≤ 10% significance level.
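The averaging and re-ranking step can be illustrated with a short sketch. It assumes that rank 1 denotes the most similar pair and that ties receive average ranks; the function names and the six-sample toy layout are hypothetical and much smaller than the Clyde data.

```python
import itertools


def rank_values(table):
    """Map each key to the average rank of its value (rank 1 = smallest)."""
    keys = sorted(table, key=lambda k: table[k])
    ranks, start = {}, 0
    while start < len(keys):
        end = start
        while end + 1 < len(keys) and table[keys[end + 1]] == table[keys[start]]:
            end += 1
        for key in keys[start:end + 1]:
            ranks[key] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def site_level_ranks(dissim, sites):
    """Average sample-level ranks between each pair of sites, then re-rank."""
    n = len(sites)
    pairs = {(i, j): dissim[i][j] for i, j in itertools.combinations(range(n), 2)}
    ranks = rank_values(pairs)
    totals, counts = {}, {}
    for (i, j), rank in ranks.items():
        if sites[i] == sites[j]:
            continue  # within-site pairs play no part in the reduced matrix
        key = tuple(sorted((sites[i], sites[j])))
        totals[key] = totals.get(key, 0.0) + rank
        counts[key] = counts.get(key, 0) + 1
    averaged = {key: totals[key] / counts[key] for key in totals}
    return rank_values(averaged)  # re-rank the site-by-site averages


# Toy data (hypothetical): three sites with two replicates each.
coords = [0, 1, 3, 4, 20, 21]
dissim = [[abs(a - b) for b in coords] for a in coords]
reduced = site_level_ranks(dissim, ["P1", "P1", "P2", "P2", "C1", "C1"])
```

Here `reduced` holds one rank per pair of sites, so the one-way test can be applied to it with the condition (C or P) of each site as the label.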

These data perhaps provide a rather weak exemplar of the methodology (though to be fair to the original author, they were probably never intended to be exploited in this way). The number of sites within each of the conditions is too limited for strong inference and the spatial distribution of the sites themselves, and the rather ill-defined nature of the pollution status, are not really satisfactory for


the hypotheses here erected, so the main purpose of this example is to illustrate a methodological possibility. One scenario is not evidenced by these data: what if question 1 returns a 'no' verdict, that is we cannot reject the null hypothesis of 'no difference among sites within each condition'? There are then two possibilities for answering question 2.

(1) Proceed with the average ranking and re-ranking exactly as above, on the assumption that even if it cannot be proved that there are no differences between sites it would be unwise to assume that this is so; the test may have had rather little power to detect such a difference.

(2) Infer from the test results that there are no differences between sites, and treat all replicates as if they were separate sites, for example there would be seven replicates for control and nine replicates for polluted conditions in the example of Fig. 10a. A one-way ANOSIM procedure is then carried out on these 16 samples.

Which of these two possibilities should be pursued is to some extent an open question. Option (2) will certainly have greater power but runs a real risk of being invalid; option (1) is the conservative test and it is certainly unwise to design the study with anything other than option (1) in mind. There is little that is new, or specific to this multivariate approach, about these options; they are paralleled by the similar dilemma over whether 'to pool or not to pool' in construction of residuals in standard univariate ANOVA (e.g. Winer 1971).

Two-way crossed layout: A natural disturbance 'experiment' (Eaglehawk Neck)

An example of a two-way crossed design is given in Fig. 11a. This is not a study of anthropogenic impact but of natural disturbance to meiobenthic communities by continual reworking of the sediment by soldier crabs (Warwick et al. 1990a). Two replicate samples were taken from each of four disturbed patches of sediment, and from adjacent undisturbed areas, stretching across a sand flat at Eaglehawk Neck, Tasmania; Fig. 11a is a schematic representation (rather than a detailed map) of the 16 sample locations. There are thus two factors: the presence or absence of disturbance by the crabs and the 'block effect' of the four different disturbance patches. It might be anticipated that the community will change naturally across the sand flat, that is from block to block, and it is important to be able to separate this effect from any changes associated with the disturbance itself. There are obvious parallels here with environmental impact studies in which (say) sewage pollution or human disturbance affects sections of several embayments, so that matched control and polluted conditions can be compared against a background of changing community structure across a wide spatial scale. In another scenario, the blocks would be sampling occasions in a time series, each point in time having replicate observations from control and polluted conditions; the objective would again be to separate the effect of natural fluctuations through time from differences associated with the pollution impact.

Fig. 11. A two-way crossed layout arising from disturbance of meiobenthic communities by soldier-crab burrowing, Eaglehawk Neck, Tasmania. (a) Schematic sampling design of four cores from each of four 'blocks' (disturbed patches, shaded). (b) MDS of the 16 meiofauna samples, with the x axis separating the blocks and the y axis discriminating disturbed from undisturbed communities. Stress = 0.11.


A feature of these designs is that there are replicate samples from all blocks for both conditions: blocks and treatments are said to be 'crossed' and the stages in this two-way layout are handled slightly differently from the previous nested case. Returning to the Eaglehawk Neck example, Figure 11b displays the MDS for the 16 core samples (2 treatments × 4 blocks × 2 replicates), based on Bray-Curtis similarities from fourth-root transformed abundances of 59 meiofaunal species. The pattern is remarkably clear and a classic analogue of what, in univariate two-way ANOVA, would be called an additive model. The meiobenthic community is seen to change from area to area across the sand flat (separation of symbol types on the x axis) but also differs between disturbed and undisturbed conditions (separation of closed and open symbols on the y axis). An immediate corollary is the importance of the block design in eliciting the disturbance effects; the spatial changes appear to be as great as community differences associated with disturbance.

Statistical tests to support the above conclusions are probably desirable, even in an apparently clear-cut case such as this, because of the minimal number of replicates for each block-treatment combination. A test of the null hypothesis that there are no disturbance effects, allowing for the fact that there may be block effects, can be carried out using precisely the same two-way ANOSIM procedure as initially employed in the two-way nested layout (the Question 1 test). For each separate block an R statistic is calculated from equation (5), as if for a simple one-way test for a disturbance effect. The resulting values R_1 to R_4 are then averaged to give the test statistic R. Its permutation distribution under the null hypothesis is generated by examining all re-orderings of the four labels (two disturbed, two undisturbed) within each block; labels are not switched between blocks. There is no necessity to sample from the permutation distribution in this case; there are only three distinct permutations in each block, giving a total of 3^4 = 81 combinations overall. The observed value of R for the similarity matrix underlying Fig. 11b is 0.94 and, not unexpectedly, this is the highest value attained in the 81 permutations. The null hypothesis is therefore rejected at close to a significance level of 1%.

The strict conclusion is not that soldier crab disturbance is necessarily causal to the meiobenthic community change, though this is a very plausible mechanistic explanation, but that there is statistical evidence of an association between the disturbance and community differences. So-called 'natural experiments' are still observational studies and not experiments at all, in the strict statistical sense of randomly allocating treatments to experimental units (see the previous discussion). Nonetheless, this example does demonstrate that better observational design can lead to more powerful inference.

There is a symmetry in the crossed design that is not present in the nested case. One can now test the null hypothesis that there are no block effects, allowing for the fact that there are treatment (disturbance) differences, by simply reversing the roles of treatments and blocks in the above test. R is now an average of two R statistics, separately calculated for disturbed and undisturbed samples, and there are 8!/[(2!)^4 4!] = 105 permutations of the eight labels for each treatment. One must therefore randomly select from the 11 025 possible combinations. In 1000 simulations the true value of R (= 0.85) is again the most extreme and is almost certainly the largest in the full set; the null hypothesis is decisively rejected. In this case the test is inherently uninteresting but one can envisage many situations where tests for both factors in a crossed design are of practical significance.
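As a check on the arithmetic, the permutation counts for the two tests in the crossed design follow from applying the one-way count within each stratum and multiplying across strata. The symbol N for the number of distinct relabellings is introduced here only for illustration.

```latex
\begin{align*}
N_{\text{disturbance test}} &= \left[\frac{4!}{(2!)^{2}\,2!}\right]^{4} = 3^{4} = 81,
  \qquad \frac{1}{81} \approx 1.2\%, \\
N_{\text{block test}} &= \left[\frac{8!}{(2!)^{4}\,4!}\right]^{2} = 105^{2} = 11\,025 .
\end{align*}
```

The first line shows why the disturbance test can reach a significance level only close to 1%, whereas the block test has ample permutations and is therefore sampled at random.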

Interaction in a two-way layout: A microcosm experiment

Notice that the above two-way ANOSIM procedure is not the analogue of a test for treatment main effects in a univariate two-factor (treatments × blocks) ANOVA. Rather it is equivalent to pooling the sums of squares for main effects and interactions, and comparing this with the residual to give an overall test for presence of a treatment effect. In the current context, this is saying that the two-way ANOSIM test has the potential to reject the null hypothesis of 'no disturbance effect' either when (as above) there is a consistent treatment difference across all blocks or when this difference is strongly present in some blocks but not others. An example of the latter is seen in Fig. 12, for data on nematode community structure in a microcosm experiment performed by Austen (1989). Again the details and objectives of this work are of less importance here than the structure of the data and the close parallels


Fig. 12. A two-way crossed layout demonstrating 'interaction effects'; an MDS of nematode communities from a microcosm experiment with six independently replicated treatments, involving food limitation and osmotic stress. Stress = 0.13. (A) 25‰ (control); (o) 15‰; (•) 5‰; closed symbols indicate food limited.

that can be drawn with designs for environmental impact studies.

As part of a larger series of experiments, Austen (1989) examined the effects on estuarine meiobenthic communities of prolonged food limitation (factor 1 at two levels: control and food-limited) and osmotic stress (factor 2 at three levels: control and two reduced salinity levels). Figure 12 displays the MDS of nematode communities at the termination of the experiment, based on square-root transformed abundances and Bray-Curtis similarities (Austen 1989). It is apparent that salinity had a strong effect on final community type; the two-way ANOSIM test of 'no salinity differences', allowing for possible differences from food limitation, gives R = 0.72 (P < 0.1%). There was also an overall effect of food limitation, allowing for the salinity differences, although the R is much smaller, at 0.32 (P < 0.1%). In fact, it is clear from the plot that food limitation only had a marked effect on those communities which were subject to the greatest osmotic stress; differences were slight or negligible at the 25‰ and 15‰ salinity levels. This, of course, is an interaction effect between the two factors. Its presence can be formally established, indirectly, by noting that the one-way ANOSIM of control versus food-limited conditions, separately for each salinity level, gives: R = -0.05 for 25‰, R = 0.20 for 15‰ and R = 0.81 for 5‰. On an overall 5% significance level for the three tests, the first two R values are not significant but the last is, and highly so.

This indirect approach to examining interactions will be inadequate in some important practical scenarios for environmental impact studies. Green (1979, 1993), Underwood (1991, 1992) and others have discussed a class of spatial and temporal layouts known as BACI designs (Before/After, Control/Impact). A putatively impacted area and a matching control site (or sites) are monitored in a time series straddling the impact event. Pollution effects will then show up as interactions between the temporal and spatial factors. At present, such a design could be formally tested by the ANOSIM procedure only if there were no differences between control and (putatively) polluted sites prior to the impact. A general interaction effect could, of course, arise as a significant difference between control and impacted sites before the impact and a larger difference after the impact. The difficulty in extending the above methodology to cover this case is not so much in defining an interaction statistic based only on the rank similarity matrix (though that is not trivial), as in testing such a statistic using an appropriate permutation of sample labels. This would appear to defy development within the similarity-based framework of this paper and must be accepted as a limitation of the current methodology, though there is clearly scope for further study here.

LINKING COMMUNITY PATTERNS TO ENVIRONMENTAL VARIABLES

'Explaining' Exe nematode communities

The final major area of practical community studies is the attempted explanation of community patterns by linking the biotic analysis to physical or contaminant data from the same set of samples. The Introduction and Rationale outlined an approach which is designed to fit the minimalist assumptions of this paper: it uses only rank similarities between samples and attempts to avoid explicit assumptions about the form of biota-environment relationships. The basic premise appears straightforward: if the suite of physico-chemical data responsible for structuring the community were known, then samples having rather similar values for these variables would be expected to have rather similar species composition,


and an ordination based on this abiotic information would group sites in the same way as for the biotic plot. If key environmental variables are omitted, the match between the two plots will deteriorate. By the same token, the match will also worsen if abiotic data which are irrelevant to the community structure are included. An example is given in Fig. 13, based on the Exe nematode data used by Field et al. (1982).

Figure 13a shows the MDS for nematode communities at 19 intertidal sites in the Exe estuary, based on the similarity matrix employed in the earlier cluster analysis (Fig. 6), with which there is very close agreement. The remaining plots in Fig. 13 are of specific combinations of the six sediment variables recorded for each sample: the depth of the H2S layer (H2S), the interstitial salinity, the median particle diameter (MPD), % organics, depth of the water table and height up the shore. For consistency, these plots are MDS ordinations based on a Euclidean distance matrix from the normalized variables, though they will be virtually identical to configurations from principal component analysis (PCA, e.g. Everitt 1978) in this instance. (Fig. 13b is effectively just a scatter plot, since it involves only two variables.) In contrast to biotic data, quantitative environmental data is almost always well-handled by PCA, possibly after transformation.

The point to note here is the remarkable degree of concordance between biotic and abiotic plots, particularly Figs. 13a and c: both group the samples in very similar fashion. Leaving out MPD (Fig. 13b), the (7-9) group is less clearly distinguished from (6, 11) and one also loses some matching structure in the (12-19) group. Adding variables such as depth of the water table and height up the shore (Fig. 13d), the (1-4) group becomes more widely spaced than is in keeping with the biotic plot, sample 9 is separated from 7 and 8 et cetera and the fit again deteriorates. In fact, Fig. 13c represents the 'best fitting' environmental combination, in the sense defined below, and therefore best 'explains' the community pattern.

Quantifying the match between any two plots could be accomplished by a Procrustes analysis (Gower 1971), in which one plot is rotated, scaled

Fig. 13. MDS ordinations for the 19 Exe estuary sites, based on: (a) nematode counts; (b) sediment variables recording depth of the H2S layer and interstitial salinity; (c) H2S, salinity and MPD, the environmental combination 'best matching' Fig. 13a (i.e. maximizing the rank correlation of the respective similarity matrices); (d) all six recorded abiotic variables. Stress values are: (a) 0.05, (b) 0, (c) 0.04, (d) 0.06.


or reflected to fit the other in such a way as to minimize a sum of squared distances between the superimposed configurations. This is unsatisfactory, however, for exactly the same reasons as advanced earlier in deriving the ANOSIM statistic: the 'best match' should not be a function of the dimensionality in which one chooses to view the two patterns. The fundamental constructs are, as usual, the similarity matrices underlying both biotic and abiotic ordinations. These are chosen differently to match the respective form of the data (e.g. Bray-Curtis for biota, Euclidean distance for environmental variables) and will not be scaled in the same way. Their ranks, however, can be compared through a rank correlation coefficient and this is a very natural measure to adopt within the framework of this paper.

Clarke and Ainsworth (1993) describe this whole approach in detail, including the systematic search of all variable combinations to find the optimal match. (This uses a FORTRAN routine, BIO-ENV, though again it would not be difficult to implement the procedure in macro-languages of other systems.) They also contrast the use of simple Spearman rank correlation (Kendall 1970) of the two similarity matrices with a weighted Spearman coefficient, the latter placing more emphasis on matching the local rather than global structure of the biotic and abiotic patterns. For this latter coefficient (ρw), the combination of variables in Fig. 13c is optimal, with ρw = 0.80. It is perhaps unnecessary to repeat the warning that, as this is an observational study, one cannot infer that a combination of depth of H2S layer, interstitial salinity and median particle diameter are directly causal in shaping the community pattern at these sites. They may, for example, be highly correlated with unrecorded variables which are causal, although in this instance they form a very plausible 'explanation'.
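The matching procedure can be sketched as an exhaustive search over subsets of environmental variables. This is an illustration only, not the BIO-ENV routine: it uses the simple Spearman coefficient rather than the weighted coefficient ρw, normalizes each variable to zero mean and unit standard deviation before computing Euclidean distances, and assumes no variable is constant. The function names and the toy data are hypothetical.

```python
import itertools
import math


def average_ranks(values):
    """Ranks of a list of numbers, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks, start = [0.0] * len(values), 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for i in order[start:end + 1]:
            ranks[i] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))


def normalise(column):
    """Centre to zero mean and scale to unit standard deviation."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd for v in column]


def best_matching_subsets(biotic_dissim, env):
    """env maps variable name -> list of values, one per sample.

    Returns (rho, subset) tuples sorted from best to worst match.
    """
    names = sorted(env)
    z = {name: normalise(env[name]) for name in names}
    pairs = list(itertools.combinations(range(len(biotic_dissim)), 2))
    biotic = [biotic_dissim[i][j] for i, j in pairs]
    results = []
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            # Euclidean distance between samples on the chosen variables.
            dist = [math.sqrt(sum((z[v][i] - z[v][j]) ** 2 for v in subset))
                    for i, j in pairs]
            results.append((spearman(biotic, dist), subset))
    return sorted(results, reverse=True)


# Toy data (hypothetical): the 'biota' respond to variable a only.
env = {"a": [0, 1, 3, 7, 15], "b": [5, 1, 4, 2, 3]}
biotic = [[abs(p - q) for q in env["a"]] for p in env["a"]]
ranking = best_matching_subsets(biotic, env)
best_rho, best_subset = ranking[0]
```

The number of subsets doubles with each added variable, so a full search of this kind is practical only for a modest number of environmental variables.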

The Ekofisk oil field study

The above example was of a fundamental study in a largely unpolluted estuary, but these ideas clearly have potential for use in the monitoring of environmental impact. Clarke and Ainsworth (1993) discuss their application to macrobenthic samples from the Clyde sewage-sludge dumping ground (Pearson 1987), for which the BIO-ENV results are encouragingly informative.

A further example of an impact study, not discussed by Clarke and Ainsworth (1993), is the Ekofisk oil field data of Gray et al. (1990), analysed earlier (Figs 4, 5). Figure 14 shows an MDS of the 39 sites based on just the three sediment variables quoted by Gray et al. (1990): 'total hydrocarbon concentration (THC)', 'barium concentration' and '% mud'. Barium is present in (and therefore a good tracer for) drilling muds, although it is not known to be toxic to the marine benthos; the percentage mud fraction might also be expected to reflect effects of the finer drilling muds. Barium and THC levels were initially log-transformed and all three variables normalized to improve the appropriateness of Euclidean distance as a measure of among-sample dissimilarity (Clarke & Ainsworth

Fig. 14. MDS ordination of three environmental variables (% mud, log total hydrocarbons and log barium) for the 39 Ekofisk sites, Fig. 4. Note the fair match with the biotic ordination, Fig. 5. Stress = 0.05.


1993). The Euclidean distance matrix was then used to construct the MDS of Fig. 14 and also input to the BIO-ENV procedure. This showed that it was necessary to retain all three variables to optimize the correlation (ρw = 0.59) with the rank similarities underlying the biotic MDS (Fig. 5). The detailed patterns are fairly well matched and certainly the broad sweep of community change moving away from the oil field is well mirrored in the contaminant plot, including the marked division between intermediate and distant sites.
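The preparation of the abiotic data and the search over variable combinations described above can be sketched as follows. This is an illustration under stated assumptions rather than the BIO-ENV program itself: 'normalized' is taken to mean centring on the mean and scaling to unit standard deviation, plain Spearman correlation stands in for the weighted coefficient ρw, the search is exhaustive, and all site values and names (log_THC, log_Ba, pct_mud, best_subset) are hypothetical.

```python
import math
from itertools import combinations


def normalise(values):
    """Subtract the mean and divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def euclidean_pairs(columns):
    """Euclidean distance between every pair of sites over the given
    variables, in the pair order (0,1), (0,2), ..., (n-2,n-1)."""
    n = len(columns[0])
    return [math.sqrt(sum((c[i] - c[j]) ** 2 for c in columns))
            for i, j in combinations(range(n), 2)]


def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share their mean rank."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    r, s = average_ranks(x), average_ranks(y)
    n = len(r)
    mean_r, mean_s = sum(r) / n, sum(s) / n
    cov = sum((a - mean_r) * (b - mean_s) for a, b in zip(r, s))
    var_r = sum((a - mean_r) ** 2 for a in r)
    var_s = sum((b - mean_s) ** 2 for b in s)
    return cov / math.sqrt(var_r * var_s)


def best_subset(biotic_pairs, env):
    """Score every non-empty combination of variables and return
    (rho, variable names) for the one that best matches the biotic pattern."""
    scored = []
    for size in range(1, len(env) + 1):
        for subset in combinations(sorted(env), size):
            dist = euclidean_pairs([env[name] for name in subset])
            scored.append((spearman(biotic_pairs, dist), subset))
    return max(scored)


# Hypothetical measurements at four sites (not the Ekofisk data).
# THC and barium are log-transformed first; all variables are then normalised.
env = {
    "log_THC": normalise([math.log(v) for v in [1.0, 2.7, 20.0, 1100.0]]),
    "log_Ba": normalise([math.log(v) for v in [20.0, 1.0, 7.4, 2.7]]),
    "pct_mud": normalise([5.0, 1.0, 4.0, 2.0]),
}
# Hypothetical biotic dissimilarities, in the same pair order as euclidean_pairs.
biotic_pairs = [10.0, 30.0, 70.0, 20.0, 60.0, 40.0]

rho, chosen = best_subset(biotic_pairs, env)
print(chosen, round(rho, 2))
```

The toy values are built so that a single variable reproduces the biotic rank order exactly. For the Ekofisk data the text reports the opposite outcome, with all three variables retained. An exhaustive search is only practical for a small number of variables, since the number of combinations doubles with each variable added.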

It must be emphasised that the BIO-ENV procedure has so far only been applied to a limited number of data sets, although with some apparent success, and needs more rigorous examination, both practically and theoretically. For example, no explicit assumptions appear to have been made about the form of the relationship between species abundance and environmental gradient, yet there could be functional forms for which this matching procedure is unsound. Simulation studies are likely to be informative here.

CONCLUDING REMARKS

The methods described here have demonstrated their utility in a variety of applications, particularly in pollution studies, where multivariate analyses have been shown to be both sensitive in their elicitation of community change and robust to a substantial degree of taxonomic aggregation (see the companion paper, Warwick 1993). The examples given here are exclusively marine, and predominantly involve soft-sediment benthos, but application of these techniques is by no means confined to marine communities. Certainly, the methods are well suited to some common features of soft-sediment data, such as a large species set (always in excess of the number of samples) and a sparse, highly-skewed abundance matrix, arising from marked spatial heterogeneities (clumping) in species distributions. These characteristics, however, are shared by many other fields of application.

The reader should be warned that not all of the methods of this paper represent received wisdom in the multivariate area. Earlier studies emphasised the role in ordination of PCA, with its assumption that sample dissimilarities are well-represented by Euclidean distances for both environmental and species data. Abundances are also assumed to be linearly related to environmental gradients. (The genesis for this is the classic statistical model of multivariate normality.) For a canonical analysis, that is the linking of community pattern to an environmental matrix, PCA leads naturally to canonical correlation, or a variant of it known as redundancy analysis (Rao 1964), but the assumptions underlying such techniques rarely seem to be satisfied for field community data. In contrast, ecologists now more often use a form of correspondence analysis for ordination, usually detrended correspondence analysis (DCA, Hill & Gauch 1980) with its implicit definition of 'χ² distance' as the measure of sample dissimilarity. (This has its genesis in a standard categorical data model, in which abundances are the multinomial frequencies that would be expected from sampling spatial distributions of organisms which are locally homogeneous, known technically as 'homogeneous planar Poisson processes'; Diggle 1983.) The matching canonical analysis, predicated on unimodal response models of species abundances across measured environmental variables, is provided by the more recent, important development of Ter Braak (1986): canonical correspondence analysis. A good description of these competing ordination and canonical methods can be found in Jongman et al. (1987).

Detrended correspondence analysis has attracted statistical criticism for some time, mainly for the somewhat arbitrary and 'overzealous' nature of its detrending process, but also because of the sometimes inappropriate imposition and non-robust behaviour of an underlying χ² distance measure (Pielou 1984; Faith et al. 1987; Gower 1992). This paper takes the stance that the choice of dissimilarity measure should be dictated by relevant ecological assumptions and not the mechanics of the ordination method. The fact that these measures (and even the ordination technique) may be very different for biotic and abiotic variables, and an unwillingness to constrain species-environment relationships to linear, monotonic, unimodal or even multimodal forms (in practice all four combinations may be present), have led to the proposed 'matching approach' to canonical analysis. This contrasts strongly with the direct gradient methods, in which species-environment relationships are embedded at an early stage of the analysis and will influence the observed biotic pattern. The matching procedure avoids this (arguably) undesirable feature, and possesses a seductive simplicity, but how well it stands up to further detailed scrutiny remains to be seen.

In conclusion, it should be stressed that the driving force for all four main sections of this paper is one of simplicity. The advocated non-parametric techniques may lack some of the sophistication of other multivariate methods but it is suggested that this is more than compensated for by their widespread validity and the comparative ease with which they can be understood. The latter is a major asset in communicating results from the large data sets typically arising in environmental impact studies.

ACKNOWLEDGEMENTS

The evolution of the components of this framework owes much to discussion and input from several colleagues, in particular R. Warwick and M. Carr of the Plymouth Marine Laboratory (PML). Three students, J. Hall, M. Ainsworth and C. Green, were responsible for coding FORTRAN versions of the ANOSIM and BIO-ENV procedures, to supplement the PRIMER programs (Plymouth Routines In Multivariate Ecological Research) developed at PML principally under the supervision of M. Carr. These latter programs were used for most of the analyses of this paper. Credit for the SIMPER procedure belongs to M. Carr and J. Hall. M. Austen (PML) also contributed significantly to initial discussions on several of the paper's topics. This work forms part of the Community Ecology project of the PML. My thanks are also due to A. Underwood (Institute of Marine Ecology, University of Sydney) for the opportunity to participate in the Workshop and contribute to this special issue. His suggestions, and those of D. Faith (CSIRO, Canberra), have significantly improved the paper but there remain a number of idiosyncrasies that are entirely my own.

REFERENCES

Addison R. F. & Clarke K. R. ed. (1990) Biological Effects of Pollutants in a Subtropical Environment. J. Exp. Mar. Biol. Ecol. 138, 1-166.

Agard J. B. R., Gobin J. & Warwick R. M. (1993) Analysis of marine macrobenthic community structure in relation to natural and man induced perturbations in a tropical environment. Mar. Ecol. Prog. Ser. (in press).

Austen M. C. (1989) Factors affecting estuarine meiobenthic assemblage structure: A multifactorial microcosm experiment. J. Exp. Mar. Biol. Ecol. 130, 167-87.

Bayne B. L., Clarke K. R. & Gray J. S. ed. (1988) Biological Effects of Pollutants: Results of a Practical Workshop. Mar. Ecol. Prog. Ser. 46, 1-278.

Bray J. R. & Curtis J. T. (1957) An ordination of the upland forest communities of Southern Wisconsin. Ecol. Monogr. 27, 325-49.

Clarke K. R. (1990) Detecting change in benthic community structure. In: Proceedings XIVth International Biometric Conference. Namur: Invited Papers, pp. 131-42. Societe Adolphe Quetelet, Gembloux, Belgium.

Clarke K. R. & Ainsworth M. (1993) A method of linking multivariate community structure to environmental variables. Mar. Ecol. Prog. Ser. (in press).

Clarke K. R. & Green R. H. (1988) Statistical design and analysis for a 'biological effects' study. Mar. Ecol. Prog. Ser. 46, 213-26.

Clifford D. H. T. & Stephenson W. (1975) An Introduction to Numerical Classification. Academic Press, New York.

Dauvin J-C. (1984) Dynamique d'ecosystemes macrobenthiques des fonds sedimentaires de la Baie de Morlaix et leur perturbation par les hydrocarbures de l'Amoco-Cadiz. PhD thesis, Universite Pierre et Marie Curie.

Dawson-Shepherd A., Warwick R. M., Clarke K. R. & Brown B. E. (1992) An analysis of fish community responses to coral mining in the Maldives. Envir. Biol. Fishes 33, 367-80.

Diggle P. J. (1983) Statistical Analysis of Spatial Point Patterns. Academic Press, London.

Efron B. (1979) Bootstrap methods: Another look at the jackknife. Ann. Stat. 7, 1-26.

Everitt B. (1978) Graphical Techniques for Multivariate Data. Heinemann, London.

Everitt B. (1980) Cluster Analysis. 2nd edn. Heinemann, London.

Faith D. P. (1990) Multivariate methods for biological monitoring based on community structure. In: The Australian Society of Limnology 29th Congress, p. 17 (abstract). Alligator Rivers Research Institute.

Faith D. P. (1991) Effective pattern analysis methods for nature conservation. In: Cost Effective Survey Methods for Nature Conservation (ed. C. R. Margules and M. P. Austin) pp. 47-53. CSIRO and NSW NPWS, Canberra.

Faith D. P., Minchin P. R. & Belbin L. (1987) Compositional dissimilarity as a robust measure of ecological distance. Vegetatio 69, 57-68.

Field J. G. (1969) The use of the information statistic in the numerical classification of heterogeneous systems. J. Ecol. 57, 565-9.

Field J. G., Clarke K. R. & Warwick R. M. (1982) A practical strategy for analysing multispecies distribution patterns. Mar. Ecol. Prog. Ser. 8, 37-52.

Gee J. M., Warwick R. M., Schaanning M. et al. (1985) Effects of organic enrichment on meiofaunal abundance and community structure in sublittoral soft sediments. J. Exp. Mar. Biol. Ecol. 91, 247-62.

Gower J. C. (1971) Statistical methods of comparing different multivariate analyses of the same data. In: Mathematics in the Archaeological and Historical Sciences (ed. F. R. Hodson, D. G. Kendall and P. Tautu) pp. 138-49. Edinburgh University Press, Edinburgh.

Gower J. C. (1992) Generalized biplots. Biometrika 79, 475-93.

Gower J. C. & Ross G. J. S. (1969) Minimum spanning trees and single linkage cluster analysis. Appl. Stat. 18, 54-64.

Gray J. S., Aschan M., Carr M. R. et al. (1988) Analysis of community attributes of the benthic macrofauna of Frierfjord/Langesundfjord and in a mesocosm experiment. Mar. Ecol. Prog. Ser. 46, 151-65.


Gray J. S., Clarke K. R., Warwick R. M. & Hobbs G. (1990) Detection of initial effects of pollution on marine benthos: An example from the Ekofisk and Eldfisk oilfields, North Sea. Mar. Ecol. Prog. Ser. 66, 285-99.

Green R. H. (1979) Sampling Design and Statistical Methods for Environmental Biologists. Wiley, New York.

Green R. H. (1993) Application of repeated measures designs in environmental impact and monitoring studies. Aust. J. Ecol. 18, 81-98.

Greenacre M. J. (1984) Theory and Applications of Correspondence Analysis. Academic Press, London.

Hill M. O. (1973) Reciprocal averaging: An eigenvector method of ordination. J. Ecol. 61, 237-49.

Hill M. O. (1979) TWINSPAN - A FORTRAN program for arranging multivariate data in an ordered two-way table by classification of individuals and attributes. Cornell University, Ithaca, New York.

Hill M. O. & Gauch H. G. (1980) Detrended correspondence analysis, an improved ordination technique. Vegetatio 42, 47-58.

Hope A. C. A. (1968) A simplified Monte Carlo significance test procedure. J. Roy. Stat. Soc. Ser. B 30, 582-98.

Jongman R. H. G., Ter Braak C. F. J. & van Tongeren O. F. R. (1987) Data Analysis in Community and Landscape Ecology. Pudoc, Wageningen.

Kendall M. G. (1970) Rank Correlation Methods. Griffin, London.

Kenkel N. C. & Orloci L. (1986) Applying metric and nonmetric multidimensional scaling to some ecological studies: some new results. Ecology 67, 919-28.

Kruskal J. B. & Wish M. (1978) Multidimensional Scaling. Sage Publications, Beverly Hills, California.

Lambshead P. J. D. (1986) Sub-catastrophic sewage and industrial waste contamination as revealed by marine nematode faunal analysis. Mar. Ecol. Prog. Ser. 29, 247-59.

Mantel N. (1967) The detection of disease clustering and a generalized regression approach. Cancer Res. 27, 209-20.

Mardia K. V., Kent J. T. & Bibby J. M. (1979) Multivariate Analysis. Academic Press, London.

Pearson T. H. (1987) The benthic biology of an accumulating sludge disposal ground. In: Biological Processes and Wastes in the Ocean (ed. J. Capuzzo and D. Kester) pp. 195-220. Krieger, Melbourne.

Pielou E. C. (1984) The Interpretation of Ecological Data. A Primer on Classification and Ordination. Wiley, New York.

Rao C. R. (1964) The use and interpretation of principal component analysis in applied research. Sankhya A 26, 329-58.

Reader's Digest Great World Atlas, 2nd edn (1968) The Reader's Digest Association Ltd, London.

Ter Braak C. F. J. (1986) Canonical correspondence analysis: A new eigenvector technique for multivariate direct gradient analysis. Ecology 67, 1167-79.

Underwood A. J. (1991) Beyond BACI: Experimental designs for detecting human environmental impacts on temporal variations in natural populations. Aust. J. Mar. Freshwat. Res. 42, 569-87.

Underwood A. J. (1992) Beyond BACI: The detection of environmental impacts on populations in the real, but variable, world. J. Exp. Mar. Biol. Ecol. 161, 145-78.

Underwood A. J. & Peterson C. H. (1988) Towards an ecological framework for investigating pollution. Mar. Ecol. Prog. Ser. 46, 227-34.

Warwick R. M. (1971) Nematode associations in the Exe estuary. J. Mar. Biol. Ass. UK 51, 439-54.

Warwick R. M. (1993) Environmental impact studies on marine communities: Pragmatical considerations. Aust. J. Ecol. 18, 63-80.

Warwick R. M., Carr M. R., Clarke K. R., Gee J. M. & Green R. H. (1988) A mesocosm experiment on the effects of hydrocarbon and copper pollution on a sublittoral soft-sediment meiobenthic community. Mar. Ecol. Prog. Ser. 46, 181-91.

Warwick R. M. & Clarke K. R. (1991) A comparison of some methods for analysing changes in benthic community structure. J. Mar. Biol. Ass. UK 71, 225-44.

Warwick R. M. & Clarke K. R. (1993a) Comparing the severity of disturbance: A meta-analysis of marine macrobenthic community data. Mar. Ecol. Prog. Ser. (in press).

Warwick R. M. & Clarke K. R. (1993b) Increased variability as a symptom of stress in marine communities. J. Exp. Mar. Biol. Ecol. (in press).

Warwick R. M., Clarke K. R. & Gee J. M. (1990a) The effect of disturbance by soldier crabs, Mictyris platycheles H. Milne Edwards, on meiobenthic community structure. J. Exp. Mar. Biol. Ecol. 135, 19-33.

Warwick R. M., Clarke K. R. & Suharsono (1990b) A statistical analysis of coral community responses to the 1982-83 El Nino in the Thousand Islands, Indonesia. Coral Reefs 8, 171-9.

Warwick R. M., Goss-Custard J. D., Kirby R., George C. L., Pope N. D. & Rowden A. A. (1991) Static and dynamic environmental factors determining the community structure of estuarine macrobenthos in SW Britain: Why is the Severn Estuary different? J. Appl. Ecol. 28, 329-45.

Winer B. J. (1971) Statistical Principles in Experimental Design, 2nd edn. McGraw-Hill, Kogakusha, Tokyo.
