Urban vegetation mapping using sub-pixel analysis and expert system rules: A critical approach
S. W. MYINT*
Department of Geography, Arizona State University, 600 E. Orange Street, SCOB
Building, Room 330, Tempe, AZ 85287-0104
(Received 29 September 2004; in final form 1 December 2005 )
Since the traditional hard classifier can label each pixel only with one class, urban
vegetation (e.g. trees) can only be recorded as either present or absent. The sub-
pixel analysis that can provide the relative abundance of surface materials within
a pixel may be a potential solution for effectively identifying urban vegetation
distribution. This study examines the effectiveness of a sub-pixel classifier with
the use of expert system rules to estimate varying distributions of different
vegetation types in urban areas. The Spearman's rank order correlations between
the vegetation outputs and reference data for wild grass, man-made grass, riparian
vegetation, tree, and agriculture were 0.791, 0.869, 0.628, 0.743, and 0.840
respectively. Results from this study demonstrated that the expert system rule
using an NDVI threshold procedure is reliable and that the sub-pixel processor picked
the signatures relatively well. This study reports a checklist of the sources of
limitation in the application of sub-pixel approaches.
1. Introduction
Vegetation influences urban environmental conditions and energy fluxes by selective
reflection and absorption of solar radiation (Gallo et al. 1993) and through
evapotranspiration (Owen et al. 1998). The presence and abundance of vegetation in
urban areas has long been recognized as a strong influence on energy demand and
development of the urban heat island (Oke 1982, Huang et al. 1987). Urban
vegetation abundance may also influence air quality and human health (Wagrowski
and Hites 1997) because trees make their own food from carbon dioxide in the
atmosphere, sunlight, water, and small amounts of soil elements, and release oxygen
in the process. They also provide surface area for sequestration of particulate matter
and ozone. The loss of trees in our urban areas not only intensifies the urban heat
island effect through the loss of shade and evaporation, but also removes a principal
absorber of carbon dioxide and a trap for other pollutants. A noticeable
phenomenon that has arisen as a result of urbanization is that urban climates are
warmer and more polluted than their rural environments (Lo and Quattrochi 2003).
Urban development increases the amount of impervious surfaces in watersheds as
farmland, forests, and meadows are converted into buildings, driveways, pavements,
roads, and car parks with virtually no ability to absorb storm water. The environments
surrounding urban areas (i.e. forests, grasslands, agriculture, water, etc.) are also very
important because the decisions that need to be made regarding planning
*Corresponding author. Email: [email protected]
International Journal of Remote Sensing
Vol. 27, Nos. 12–14, July 2006, 2645–2665
ISSN 0143-1161 print/ISSN 1366-5901 online © 2006 Taylor & Francis
http://www.tandf.co.uk/journals    DOI: 10.1080/01431160500534630
community growth include where to locate new residential areas, transportation
infrastructure, new retailers, school catchment zones (Mesev 2003), emergency
management systems, industrial zones, public offices, commercial areas and how to
reduce and monitor air pollution, noise pollution, water pollution, soil erosion,
deforestation, land degradation, urban heat island effects, crime pattern and rate,
disaster risk, and traffic congestion. Urbanization alters the natural ways energy
flows through the atmosphere, land, and water systems. The modification of the
urban landscape influences the local (microscale), mesoscale, and even the
macroscale climate (NASA/GHCC Project Atlanta). Hence, the spatiotemporal
distribution of vegetation is generally considered a key component of the urban
environment.
Identification of urban land use and land covers from remotely sensed images has
usually been based on the hard classification of spectral response from image pixels.
The brightness value of each pixel represents either one homogeneous land cover or
the combination of a number of different land-cover classes. Since the traditional
hard classifier can label each pixel only with one class, urban vegetation (e.g. trees)
can only be recorded as either present or absent. Information on the percentage
distribution of spatially mixed spectral signatures from different ground-cover
features is not possible with the per-pixel classifiers. Hence, the traditional
classification of mixed pixels may lead to information loss (Wang 1990),
degradation of classification accuracy, and degradation of modelling quality in
successive applications (Ji and Jensen 1996, 1999).
The sub-pixel analysis that can provide the relative abundance of surface
materials within a pixel may be a potential alternative to per-pixel classifiers, especially
when dealing with medium to coarse resolution satellite images (e.g. Landsat TM,
MODIS, AVHRR). There have been several approaches to sub-pixel analysis—
linear mixture models (Smith et al. 1990, Settle and Drake 1993, Van der Meer 1997,
Wu and Murray 2003, Rashed et al. 2003), Bayesian probabilities (Wang 1990a,
Wang 1990b, Foody et al. 1992, Eastman and Laney 2002, Hung and Ridd 2002),
neural networks (Foody and Arora 1996, Zhang and Foody 2001), fuzzy c-means
methods (Fisher and Pathirana 1990, Foody and Cox 1994, Foody 2000), and fuzzy
set possibilities (Eastman 1999).
This study aims to determine the effectiveness of the IMAGINE sub-pixel
classifier with the use of expert system rules to quantify varying amounts and
distributions of different vegetation types in urban and suburban areas using
Landsat TM data.
2. IMAGINE sub-pixel analysis
The sub-pixel processing tool used in this study is a sub-pixel classifier, an add-on
module to ERDAS IMAGINE geographic imaging package. The tool is intended to
quantify materials that are smaller than the image resolution. The IMAGINE sub-pixel
processor is based on the concept reported by Schowengerdt (1995) that the spectral
reflectance of the majority of the pixels in remotely sensed image data is assumed to
be a spatial average of spectral signatures from two or more surface categories.
Hence, the brightness value of a pixel in an urban image can be considered a
combination of spectral response from multiple materials, such as grass, trees,
shrubs, cement roads, tarmac roads, metal roofs, wooden roofs, pavements,
driveways, and car parks.
The sub-pixel processor is designed to classify each pixel in an image by the
fraction of the material of interest (MOI) present. For example, if the MOI is trees, each
pixel in the image will hold a number from 0 to 1.0 representing the fraction of trees
within the pixel. The procedure to obtain the proportion of trees in each pixel is
explained below.
Following Huguenin et al. (1997), it is assumed that the total spectral response of
each urban image pixel, Am, can be separated into a component of trees, C, and a
background component, Bm, of all other materials (e.g. grass, cement parking). In
figure 1, if trees were the material of interest (C) and if the fraction of C, fm, is equal
to 33%, then the fraction of the rest of the materials (Bm) would be (1 − fm) = 0.67. It
should be noted that in figure 1, C represents a single specified material of interest
(e.g. trees), whereas Bm refers to all other materials in the pixel, representing a single
set of combined materials of the background.
Am = fm C + (1 − fm) Bm        (1)
Assuming C and Bm are optically thick, meaning there is no transmittance through the
material in at least one of the spectral bands, the radiance contributions from C and
Bm can be considered linearly additive in each spectral band, n.
Xm[n] = gm[n] Z[n] + (1 − gm[n]) Ym[n]        (2)
where Xm[n], Z[n], and Ym[n] are the radiances from Am, C, and Bm in pixel m and band
n, respectively. The fraction of spectral radiance contributed by Z[n] (MOI) in pixel
m and band n is gm[n]. It should be noted that the radiant fraction, gm[n], can vary
from band to band, because the spectral signatures from the MOI and the
background material can vary from band to band since different spectral bands have
different spectral sensitivity to different materials. If there is negligible difference in
spectral response between the material of interest and the background for all
multispectral bands, gm[n] will be approximately equal to fm.
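As a worked sketch of equations (1) and (2), the short example below mixes an assumed material-of-interest spectrum with an assumed background spectrum in two bands. All numbers are illustrative inventions, not values from the study:

```python
# Hypothetical two-band sketch of the linear mixing model in equations (1)-(2).
# Numbers are illustrative, not taken from the study.

def mixed_radiance(g, z, y):
    """Equation (2): X_m[n] = g_m[n]*Z[n] + (1 - g_m[n])*Y_m[n], band by band."""
    return [g_n * z_n + (1.0 - g_n) * y_n for g_n, z_n, y_n in zip(g, z, y)]

# Assumed material-of-interest (trees) radiances Z[n] and background Y_m[n].
tree = [19.0, 51.0]        # illustrative band 3 / band 4 radiances
background = [40.0, 30.0]  # illustrative composite background

# A pixel with 33% trees (f_m = 0.33); here g_m[n] = f_m in every band, the
# special case noted in the text where spectral contrast is negligible.
g = [0.33, 0.33]
x = mixed_radiance(g, tree, background)
print([round(v, 2) for v in x])  # -> [33.07, 36.93]
```

The mixed pixel radiance falls between the tree and background values in each band, weighted by the tree fraction.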
The sub-pixel processor detects the MOI (e.g. trees) in each urban pixel by
iteratively subtracting fractions of candidate background spectra. The set of
candidate backgrounds, Ym[n], is unique for each pixel in the image and is
independently selected for each pixel based on the assumption that the background
for the pixel under investigation can be represented by other pixels in the same
scene. Then, the processor identifies the background and its fraction that gives the
residual spectrum that most closely matches the spectrum for the material of
Figure 1. Material of interest C (trees) and a background component Bm (all other materials) in a pixel of an urban image.
interest. The residual, Zm[n], is obtained by using the expression
Zm[n] = {Xm[n] − (1 − gm[n]) Ym[n]} / gm[n]        (3)
where gm[n] is the fraction of the MOI and (12gm[n]) is the fraction of the
background Ym[n], subtracted from the total radiant spectrum Xm[n]. The level of
spectral match between the residual Zm[n] and the signature spectrum Z[n] is
computed by the expression
f = sqrt( Σ_{n=1}^{N} (Zm[n] − Z[n])² / N )        (4)
where N is the number of image layers. Finally, the radiant fraction of the best
matched residual spectrum, gm[n], is recorded as percent tree (percent crown cover)
in each pixel of the output map. The IMAGINE sub-pixel classifier is capable of
detecting and identifying materials covering an area as small as 20% of a pixel. The
algorithm reports classification results for each signature in two, four, or eight
classes. In this study, the eight-class option was used. Results reported for class
number 1 with a 0.20–0.29 material pixel fraction indicate that those detections
contain 20–29% of the MOI. Class numbers 1 to 8 represent 0.20–0.29, 0.30–0.39,
0.40–0.49, 0.50–0.59, 0.60–0.69, 0.70–0.79, 0.80–0.89, and 0.90–1.0 respectively.
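The residual subtraction of equation (3), the spectral-match measure of equation (4), and the binning of output fractions into eight classes can be sketched as follows. This is a minimal illustration with made-up numbers, not the IMAGINE implementation:

```python
# Sketch of equations (3)-(4) and the eight-class fraction binning.
# All radiance values are illustrative.
import math

def residual(x, y, g):
    """Equation (3): Z_m[n] = (X_m[n] - (1 - g_m[n])*Y_m[n]) / g_m[n]."""
    return [(x_n - (1.0 - g_n) * y_n) / g_n for x_n, y_n, g_n in zip(x, y, g)]

def rms_mismatch(zm, z):
    """Equation (4): root-mean-square difference between residual and signature."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(zm, z)) / len(z))

def fraction_class(f):
    """Bin a material pixel fraction into the eight 0.1-wide output classes
    (0.20-0.29 -> 1, ..., 0.90-1.00 -> 8); below 0.20 counts as undetected (0)."""
    if f < 0.20:
        return 0
    return min(int((f - 0.20) / 0.10) + 1, 8)

# Illustrative check: a perfectly modelled mixed pixel yields zero mismatch.
tree = [19.0, 51.0]
background = [40.0, 30.0]
g = [0.33, 0.33]
x = [g_n * t + (1.0 - g_n) * b for g_n, t, b in zip(g, tree, background)]
zm = residual(x, background, g)
print(round(rms_mismatch(zm, tree), 6))  # -> 0.0
print(fraction_class(0.33))              # -> 2
```

In the real classifier the candidate background is searched per pixel and the background giving the smallest mismatch f is retained; the sketch only verifies that an exact mixture inverts cleanly.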
3. Data and study area
Landsat ETM+ image data at 30 m spatial resolution with six channels was used to
quantify varying amounts and distributions of vegetation in urban and suburban
areas. We did not use a thermal channel due to its coarser resolution. The image
data was acquired over the city of Norman, Oklahoma, on 22 May 2000. The study
area is shown in figure 2. The selected study area covers most of the urban/suburban
land-use and land-cover classes: high-density residential, low-density residential,
commercial, wild grass, woodlands, man-made grass, riparian vegetation, river,
sandbars, and exposed soil. To assess the accuracy of the levels of vegetation
distribution from sub-pixel analysis and the expert system rules employed in this
approach, IKONOS 4-m resolution multispectral image with four channels—blue
Figure 2. Norman, Oklahoma, metropolitan area, displayed using channel 3 (0.63–0.69 µm).
(0.45–0.52 µm), green (0.52–0.60 µm), red (0.63–0.69 µm), and near infrared (0.76–
0.90 µm) acquired over Norman, Oklahoma, on 20 March 2000 with the aid of
1 : 50,000 scale aerial photographs were used. Both IKONOS and Landsat ETM+
data were ortho-rectified. The aerial photographs were not very useful in identifying
accurate ground covers in comparison with the Landsat TM since they were
acquired in 1985. However, field verification was also carried out to supplement the
identification of land cover classes accurately. Several field trips were conducted to
identify uncertain features and classes.
4. Environmental correction
The environmental correction function first converts the brightness value of each pixel
into the true radiance of ground materials. This is done through environmental
correction using the pseudo-calibration materials (e.g. the darkest regions—deep clear
water or terrain shadow) indigenous to the scene. The environmental correction tool
calculates a set of factors to compensate for variations in environmental and
atmospheric conditions during satellite data acquisition. These correction parameters
are output to a file, and used during signature derivation and classification. This step
is required because in order to use equation (2) to search for the materials of interest,
the raw digital numbers for pixel m, DNm[n], need to be corrected to remove the
atmospherically scattered solar radiance component and the sensor offset factors. An
environmental correction tool in the IMAGINE sub-pixel classifier uses sampled
pixels from the scene being processed to derive a correction factor that is subtracted
from DNm[n] to provide the requisite proportionality to Xm[n].
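A minimal sketch of this scene-derived correction is given below. The actual blending of water and shadow spectra in the IMAGINE tool is proprietary; here the per-band offset is simply estimated from the darkest pixels in each band, and all digital numbers are illustrative:

```python
# Dark-object subtraction sketch in the spirit of the environmental correction.
# The offset estimator (mean of the k darkest DNs per band) is an assumption of
# this sketch, not the IMAGINE algorithm; all values are illustrative.

def dark_object_offsets(bands, k=10):
    """Estimate a per-band correction factor from the k darkest DNs."""
    return [sum(sorted(band)[:k]) / k for band in bands]

def correct(bands, offsets):
    """Subtract the offsets so DN_m[n] becomes proportional to X_m[n]."""
    return [[max(dn - off, 0.0) for dn in band]
            for band, off in zip(bands, offsets)]

# Two tiny illustrative "bands" of raw digital numbers.
raw = [[12.0, 14.0, 60.0, 80.0], [8.0, 9.0, 40.0, 70.0]]
offsets = dark_object_offsets(raw, k=2)
print(offsets)                   # -> [13.0, 8.5]
print(correct(raw, offsets)[0])  # -> [0.0, 1.0, 47.0, 67.0]
```

The clipping at zero mirrors the fact that corrected radiances cannot be negative; the brightest pixels are shifted down by the scene-wide atmospheric component.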
The use of dark pixels, representing deep clear water and shadowed terrain, to
remove atmospherically scattered solar radiance from scene pixels is not at all
unusual. However, the success of the approach is questionable because water pixels
and shadowed pixels generally contain significant unwanted signatures from surface
features, such as reflected sky radiance and sun glints in water pixels and solar
illuminated terrain in shadowed pixels (Huguenin et al. 1997). The sub-pixel
procedure allows a more accurate atmospheric spectrum to be derived by blending
spectra from both reflected sky-radiated and sun-glinted water pixels and solar
illuminated terrain shadow pixels (Applied Analysis 2003). It is expected that the
unwanted glints and illuminated terrains are effectively suppressed, generating a
more accurate atmospheric spectrum.
5. Signature derivation
Even though the use of laboratory-based measurement of pure signatures would be
the optimal approach to sub-pixel classifiers (Wu and Murray 2003), a common
approach for determining pure signatures is to select representative pixels from
homogeneous land covers from satellite images (Rashed et al. 2001, Small 2001,
Eastman and Laney 2002, Hung and Ridd 2002, Wu and Murray 2003). One of the
reasons for the selection of pure signatures from images is to overcome the
substantial problems that exist in correcting atmospheric absorption and scattering
(Settle and Drake 1993). Other reasons may be due to unavailability of laboratory
based reference data, the nature of the landscape under study, and/or classification
specificity.
The signature of the material of interest (MOI) consists of a signature spectrum
and a non-parametric feature space. The signature derivation function generates the
signature spectrum and feature space from user-defined training samples and their
parameters. The function allows us to select signatures by either a whole-pixel or
sub-pixel training set. Whole-pixel signatures are signatures derived from training
sample pixels that contain more than 90% of the MOI. It was suggested that a sub-
pixel training approach should be applied only when a whole pixel signature cannot
provide satisfactory accuracy (Applied Analysis 2003). The whole-pixel signature
derivation strategy was employed in this study to identify vegetation distribution in
Norman. The user-specified parameters include the approximate fraction of the
MOI in the training pixels (material pixel fraction) and the estimated probability
that any specified training pixel actually contains the MOI (confidence level). For
all signature samples, we used 0.90 for the material fraction and 0.80 for the confidence
level. The signatures that we selected as training samples for the materials of interest
include shrubs (Gr1), tall wild grass (Gr2), short wild grass (Gr3), agriculture (Agr),
man-made grass (Gol), riparian vegetation (Rip), and trees (For). By referring to
IKONOS image data and aerial photos with the help of field checks, the above seven
vegetation covers were selected from the TM image and statistics of the samples
were computed using ERDAS IMAGINE software. Figure 3 is a line chart of the
mean brightness values per band for the seven vegetation types. The mean and
standard deviation brightness values by band for the selected covers are shown in
table 1, and figure 4 shows the band 3 vs. band 4 scatter plot of the mean brightness
values of the classes.
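The per-band statistics and the mean NDVI reported in table 1 for each signature can be reproduced from training-sample pixels along the following lines. The sample pixel values below are hypothetical, not the study's training data:

```python
# Sketch of deriving signature statistics and mean NDVI from training pixels.
# The red/NIR sample values are illustrative stand-ins for a tree-like class.
import statistics

def signature_stats(red, nir):
    """Per-band mean/std plus mean NDVI = (NIR - red) / (NIR + red)."""
    ndvi = [(n - r) / (n + r) for r, n in zip(red, nir)]
    return {
        "red_mean": statistics.mean(red),
        "red_std": statistics.stdev(red),
        "nir_mean": statistics.mean(nir),
        "nir_std": statistics.stdev(nir),
        "ndvi_mean": statistics.mean(ndvi),
    }

# Illustrative band 3 (red) and band 4 (NIR) training pixels.
red = [19, 20, 18, 19]
nir = [51, 53, 49, 51]
stats = signature_stats(red, nir)
print(round(stats["ndvi_mean"], 3))  # -> 0.457
```

A low standard deviation in every band is what table 1 uses as evidence that the selected training samples are pure.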
Since we found three visually and statistically different signatures in the wild grass
category under investigation (table 1), we identified three different training samples
for wild grass. It should be noted that the grass 2 sample was selected from land
where the natural vegetation is purely tall grass. The grass 1 category sample was
selected from an area of grass-like plants, forbs, or bushes, whereas the grass 3 class
sample was selected from a short wild grass area. We treated man-made grass
vegetation as a separate category instead of combining it with wild grass in this
study, since the spectral response from man-made grass is significantly different
from other wild grasslands. This information could be useful for some specific urban
and environmental planning of the study area.
It is important to note that the sub-pixel tool is designed to identify one material
of interest at a time. In other words, each material of interest was identified in an
Figure 3. Mean brightness values of seven vegetation types. Gr1, shrubs; Gr2, tall wild grass; Gr3, short wild grass; Agr, agriculture; Gol, man-made grass; Rip, riparian vegetation; For, trees.
image independently, i.e. in one analysis MOI was grass, and in another analysis
MOI was trees. However, in some cases, there may be two or more signatures that
represent a material of interest (e.g. signatures of grasses from wild grassland, man-
made grass, and dry rangeland). It was anticipated that classification accuracy could
be improved by using more than one signature of the same class because different
signatures of the same class could produce more complete identification of the
material of interest. This is partly because the sub-pixel processor uses pure
signatures (>90% of the MOI) and does not consider the variance of the training
samples.
6. Classification and expert system rules
The signature combiner tool in the IMAGINE sub-pixel classifier can be used to
combine signatures of the same category. This tool allows us to combine different
signatures to form a signature family (e.g. grass) as well as signatures of different
materials such that they are not in the same family (e.g. grass and trees). However, it
should be noted that signatures of the same family are treated separately during the
classification process. They do not compete with each other as in linear mixture
modelling approaches (Smith et al. 1990, Settle and Drake 1993). For example,
Table 1. Mean and standard deviation of the brightness values for the seven vegetation
covers. The last column is the mean of the normalized difference vegetation index.

Training   Band 1       Band 2       Band 3       Band 4       Band 5       Band 7
samples    Mean  Std    Mean  Std    Mean  Std    Mean  Std    Mean  Std    Mean  Std    NDVI

Gr1        53   1.52    23   0.92    21   1.60    80   4.91    67   4.00    20   2.08    0.587
Gol        58   2.03    27   1.68    25   2.58    91   7.76    92   7.37    30   3.21    0.566
Rip        47   1.60    18   1.18    15   1.50    48   6.66    46   6.19    14   2.73    0.521
Tree       49   1.56    20   1.27    19   1.82    51   5.04    55   6.71    18   2.77    0.462
Gr2        51   1.60    22   1.65    20   1.36    53   7.35    65   7.56    22   3.38    0.449
Gr3        60   2.59    27   1.66    26   2.19    62   4.27    79   2.89    30   1.64    0.402
Agr        55   1.49    23   1.24    26   2.07    31   3.04    63   6.08    28   3.58    0.083

Gr1, shrubs; Gr2, tall wild grass; Gr3, short wild grass; Agr, agriculture; Gol, man-made
grass; Rip, riparian vegetation; For, trees.
Figure 4. Mean brightness values of the seven vegetation types in a band 3 vs. band 4 feature space plot. Gr1, shrubs; Gr2, tall wild grass; Gr3, short wild grass; Agr, agriculture; Gol, man-made grass; Rip, riparian vegetation; For, trees.
when using the grass signature family, a given pixel may be classified as containing
60% of the MOI using signature 1, as containing 70% using signature 2, and as
containing 70% using signature 3. The total of these three fractions is well over
100%, but the processor records the average fraction (66.7%). The classification output for
each pixel will consist of four layers, one for each signature and the fourth layer
containing the average material fraction of all signatures. It may be acceptable to
take the average of two or more signatures of the same family, but it may not
provide useful information or it may lead to information loss if we take the average
of different family members. For example, an average fraction value of 45% is
assigned as vegetation for a candidate pixel with signature responses from 10%
grass, 80% trees, and 10% background. Apparently the multiple signature approach
(Huguenin et al. 1997) employed in this study falls short in handling multiple family
members (multiple MOIs). This does not necessarily mean that the processor is
incapable of accurately identifying the material of interest. To overcome this
limitation, a set of expert system rules based on the signatures of the selected
materials and their green vegetation biomass, as expressed by the normalized
difference vegetation index (NDVI), was developed to identify vegetation
distribution in Norman. This was done essentially to prepare a single output map
showing all vegetation types over the study area. In the sub-pixel classification
stage, each of the signatures was used to produce a vegetation map of eight levels.
Table 1 lists the mean brightness and standard deviation value by band for the
seven signature samples. This shows the purity of the selected training signatures.
The expert system rules developed in this study are based on the assumption that
there is a dominant vegetation cover type related to its normalized difference
vegetation index within the candidate pixel; any other vegetation type that might
occur in that candidate pixel is assumed to be negligible. Hence, the output map contains one
dominant vegetation cover type at a time with its fraction value for each pixel under
investigation. The rules basically determine the dominant vegetation component
within the candidate pixel. Hung and Ridd (2002) used threshold values derived
from the ratio of band 4 over band 3 to adjust pixels with invalid percentages, a
situation that the training sample statistics do not handle well. In this study, the
mid-value of each pair of successive NDVI values was used to determine the
dominant vegetation type. For example, 0.5765 was used as the first threshold value
to give priority to the Gr1 vegetation type, since this value is the mid-value between
the two highest successive NDVI values (0.587 and 0.566). Hence, the six threshold
values separating the Gr1, Gol, Rip, Tree, Gr2, Gr3, and Agr classes were 0.5765,
0.5435, 0.4915, 0.4555, 0.4255, and 0.2425 respectively. The index was used to
distinguish all selected vegetation cover types. Before applying the expert system
rules, we recoded the eight levels of fraction classes for all seven class outputs to avoid
confusion (e.g. 1–8 for Gr1, 9–16 for Gol, 17–24 for Rip). The rules are explained by
the following procedure.
If NDVI >= threshold value then
    Gr1 is dominant (take the fraction of Gr1)
Else if NDVI >= threshold value and not Gr1 then
    Gol is dominant (take the fraction of Gol)
Else if NDVI >= threshold value and not Gr1 and not Gol then
    Rip is dominant (take the fraction of Rip)
Else if NDVI >= threshold value and not Gr1 and not Gol and not Rip then
    For is dominant (take the fraction of For)
Else if NDVI >= threshold value and not Gr1 and not Gol and not Rip and not For then
    Gr2 is dominant (take the fraction of Gr2)
Else if NDVI >= threshold value and not Gr1 and not Gol and not Rip and not For and not Gr2 then
    Gr3 is dominant (take the fraction of Gr3)
Else if NDVI >= threshold value and not Gr1 and not Gol and not Rip and not For and not Gr2 and not Gr3 then
    Agr is dominant (take the fraction of Agr)
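The mid-value thresholds and the priority cascade described above can be sketched as follows. The NDVI means are those of table 1; the dictionary-based implementation and the fall-back ordering within an NDVI band are assumptions of this sketch, not the exact procedure used in the image-processing system:

```python
# Sketch of the mid-value thresholds and the expert-rule priority cascade.
# NDVI means come from table 1; everything else is an illustrative assumption.

ndvi_means = {"Gr1": 0.587, "Gol": 0.566, "Rip": 0.521, "Tree": 0.462,
              "Gr2": 0.449, "Gr3": 0.402, "Agr": 0.083}

# Classes ordered by decreasing mean NDVI.
classes = sorted(ndvi_means, key=ndvi_means.get, reverse=True)

# Mid-values between successive NDVI means: 0.5765, 0.5435, 0.4915, ...
thresholds = [(ndvi_means[a] + ndvi_means[b]) / 2
              for a, b in zip(classes, classes[1:])]

def dominant_class(ndvi, fractions):
    """Pick the class whose NDVI band the pixel falls in; if its sub-pixel
    fraction is zero, fall back to the next class in priority order."""
    for i, cls in enumerate(classes):
        lower = thresholds[i] if i < len(thresholds) else float("-inf")
        if ndvi >= lower:
            for candidate in classes[i:] + classes[:i]:
                if fractions.get(candidate, 0.0) > 0.0:
                    return candidate, fractions[candidate]
            return None, 0.0
    return None, 0.0

print([round(t, 4) for t in thresholds])
# -> [0.5765, 0.5435, 0.4915, 0.4555, 0.4255, 0.2425]
print(dominant_class(0.60, {"Gr1": 0.45}))             # -> ('Gr1', 0.45)
print(dominant_class(0.55, {"Gr1": 0.0, "Gol": 0.30})) # -> ('Gol', 0.3)
```

The second call illustrates the "leftover pixel" case described below: the pixel sits in the Gr1 band's neighbouring range, Gr1 was not detected, so the fraction of the next class (Gol) is taken.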
After completion of this process, those pixels with NDVI above the first threshold
value will be filled with fraction values of the dominant class (Gr1) and all other
possible classes in the output map. Experiments from this study showed that there
were only a very few pixels left to be assigned to other classes after Gr1 was given
priority. It was also found that the leftover pixels were, in most cases, assigned to
one or two sample classes next to the dominant signature (e.g. Gr1). This implied
that the NDVI threshold procedure is reliable and the sub-pixel processor picked the
signatures effectively. It could also be inferred that the signatures selected in this
study are pure and appropriate. The next step is to develop another set of
procedures to give priority to Gol using NDVI values between 0.5765 and 0.5435 (the
mid-value between 0.566 and 0.521). The second set of rules can be described as
If NDVI < first threshold value and NDVI >= second threshold value then
    Gol is dominant (take the fraction of Gol)
Else if NDVI < first threshold value and NDVI >= second threshold value and not Gol then
    Gr1 is dominant (take the fraction of Gr1)
Else if NDVI < first threshold value and NDVI >= second threshold value and not Gol and not Gr1 then
    Rip is dominant (take the fraction of Rip)
Then follow the same procedure until the last class is obtained.
After completion of the second process, those pixels with NDVI values between
the first and the second thresholds were filled with fraction values of the second
dominant class (Gol) and all other possible classes in the output map.
Five more expert system rules following the procedures described above were
developed for the rest of the training samples (i.e. For, Rip, Gr2, Gr3, Agr) using
their respective NDVI threshold values to complete the whole study area for all
dominant training samples.
7. Results and accuracy assessment
Figure 6 (a) to (e) show the single category percentage images with grey level display.
Brighter areas in output maps represent a higher percentage, and darker areas
indicate a lower percentage. For comparison purposes, the normalized difference
vegetation index (NDVI) of the study area is provided in figure 5. Figure 6 (a)
illustrates the grass coverage of Norman. We combined three different grass
signature classes to show total grass distribution in Norman. It should be noted that
the grass coverage is not the average fraction of all three grass layers generated by
the three grass signatures. It was formed by integrating the values of the outputs
produced by the expert system rules described above. The rest of the classes are
shown separately. Figure 7 shows the final map with all five vegetation distributions
found in Norman.
Some interesting observations were found in the study when examining the
number of pixels quantified into each of the seven vegetation classes in the interim
maps, as well as the final map after expert rules. Apparently, there were some
overlapping categories found in the classification. It was anticipated that there were
some overlaps among the three different grass categories and man-made grass
vegetation because samples from the same family tend to possess similar signatures.
We believe that this situation is obvious and understandable. If there were some
overlaps between two different families, it would have been a limitation in the sub-
pixel processor and consequently would have led to certain errors. However, it was
possible that there were some situations where those two different classes co-existed
in some pixels. It was difficult to trace which overlaps were acceptable and which
were not. Some categories had a reduction in the number of pixels during the
integration. This may be due to the effect of overlap among signatures from the
same family or signatures from the different families. In general, without looking at
the classification accuracy or correlation of the referenced classes and the classes
generated by the sub-pixel processor, the percentage for each category looks
reasonable and seems to conform to the vegetation distribution in the city of
Norman.
By overlaying the final vegetation distribution map and the Landsat TM image, it
was found that there was some confusion among the three types of grassland and
man-made grass vegetation. We anticipated this situation since they all were
basically from the same family. We also found that there was a little confusion
between agriculture and grass since their signatures were also similar. However, our
main concern was the signature confusion between completely different families (e.g.
grass and tree or riparian vegetation). This is because they are not only different in
terms of the amount of vegetation biomass but also the level of importance with
regards to assessment and monitoring of the urban heat island, air pollution, and
environmental degradation. On the other hand, the percentage of trees (crown
closure) or percentage of riparian vegetation in urban land-use classes (e.g.
residential, commercial) may be crucial in the planning and management of the
urban environment (McPherson 1994, Lo et al. 1997). Hence, we observed the
Figure 5. NDVI image of the study area.
overlap of riparian vegetation classes and three different grass categories (figure 8 (a)
to (c)) by using the Matrix function in the IMAGINE software. It was found that
the highest overlap between riparian vegetation and Gr1, Gr2, and Gr3 were
0.095%, 0.187%, and 0.125% respectively. We believe that the percent overlaps are
low and acceptable. This implies that the classifier identified the classes effectively
and the selected samples contained pure signatures. This also indicates that our
expert system rules developed for overlapping classes of different families as well as
Figure 6. Vegetation distribution maps derived from the IMAGINE sub-pixel analysis: (a) wild grass; (b) man-made grass; (c) riparian vegetation; (d) tree; (e) agriculture. Brighter areas represent a higher-percent category of a certain class.
same families were justifiable and reliable. It can be observed from figure 8 (a) to (c)
that the majority of the pixels for the overlap between Rip and Gr1 and Rip and Gr3
belonged to lower-level classes, whereas Rip and Gr2 belonged to higher-level
classes. It is understandable that the overlap of higher-level classes from different
families is more important to consider than lower-level classes. It was anticipated
that there could be some confusion between Rip or For and Gr2 (table 1) since their
signatures were somewhat similar. However, the percentage of total overlapping
pixels was less than 1.5% and it was not as high as we expected. This is one of the
reasons why we developed the expert system rules to identify classes more accurately
Figure 6. (Continued.)
by giving priority to different classes according to their NDVI threshold values. We
believe that the above problem was effectively taken care of by the expert system rules.
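The overlap check performed with the Matrix function amounts to a cross-tabulation of two classified layers. A minimal sketch, with two tiny illustrative fraction maps standing in for full rasters, is:

```python
# Cross-tabulation sketch of the overlap check done with the IMAGINE Matrix
# function: count pixels detected in both maps and express that as a
# percentage of the scene. The tiny "rasters" are illustrative.

def percent_overlap(map_a, map_b):
    """Percentage of pixels with a non-zero fraction in both maps."""
    both = sum(1 for a, b in zip(map_a, map_b) if a > 0 and b > 0)
    return 100.0 * both / len(map_a)

rip = [0.0, 0.3, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]  # riparian fractions
gr1 = [0.2, 0.4, 0.0, 0.0, 0.0, 0.6, 0.0, 0.0]  # Gr1 fractions
print(percent_overlap(rip, gr1))  # -> 12.5
```

Overlap percentages of a fraction of one percent, as reported above for Rip vs. Gr1/Gr2/Gr3, correspond to very few co-detected pixels relative to the scene.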
As mentioned earlier, the IKONOS 4-m resolution multispectral image of
Norman, Oklahoma, with the aid of 1 : 50,000 scale aerial photographs, was used for
assessing the accuracy of the levels of vegetation distribution from sub-pixel analysis
Figure 6. (Continued.)
Figure 7. Final map with all five vegetation distributions derived from the IMAGINE sub-pixel analysis and expert system rules. Note: the original map contains forty levels of classes (8 levels × 5 vegetation classes = 40) and only eight levels of distributions for all classes are shown for better visualization and interpretation.
and the expert system rules employed in this approach. Field verification was also
carried out to identify the classes accurately. Remote sensing accuracy generally refers to thematic accuracy, that is, the difference between referenced and classified data. There is some uncertainty in accurately determining the percentage
of vegetation types in each pixel under study due to the rectification accuracy and
the date of acquisition of aerial photographs. It should be noted that even though
Figure 8. Percent overlap of riparian vegetation classes and three different grass categories: (a) Rp vs. Gr1; (b) Rp vs. Gr2; (c) Rp vs. Gr3.
we kept the root mean square (rms) error for the rectification of the Landsat TM image below one, there could still have been some significantly high locational errors in image geometric accuracy. This is often referred to as spatial accuracy, due to the position shift between coarse-resolution and finer-resolution data pointed out by Singh (1989); for example, an rms error of one means that the referenced pixel is 30 m away from the transformed pixel. This could have put the result more than 49 pixels (>7 × 7 pixels) off for the IKONOS image in this study. In some cases, there is
also some uncertainty in identifying some features and objects in aerial photographs
as well as in IKONOS images. This limitation could be referred to as spectral
limitation. The differences in the dates of the acquisition of Landsat TM, IKONOS,
and aerial photographs could also be considered a limitation in effectively assessing
the accuracy. This could be referred to as temporal accuracy. We believed that it was
impossible to accurately determine the percentage of different vegetation types in
each pixel. On the other hand, unlike per-pixel classification approaches, the classes
we identified were not completely spectrally different classes, but rather the discrete
levels representing a percentage of material of interest in each pixel.
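The locational-error arithmetic above can be expressed directly: an rms error of one on a 30 m Landsat TM pixel implies a 30 m positional shift, which spans 7.5 IKONOS 4 m pixels linearly, or more than 49 (7 × 7) pixels in area. A minimal sketch:

```python
def reference_pixel_offset(rms_error, primary_res_m, reference_res_m):
    """Shift implied by an rms rectification error, expressed in
    reference-image pixels (linear extent and areal extent)."""
    shift_m = rms_error * primary_res_m      # positional shift in metres
    linear = shift_m / reference_res_m       # shift in reference pixels
    return linear, linear ** 2               # linear and areal extents

# Landsat TM (30 m) against IKONOS (4 m) with an rms error of 1.
linear, area = reference_pixel_offset(1.0, 30.0, 4.0)
# linear = 7.5 reference pixels; area = 56.25, i.e. more than 49 pixels
```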
For most linear spectral unmixing approaches, a regression analysis is generally performed to report the correlation coefficient between two sets of ground-component percentages from sample points (Smith et al. 1990, Foody and Cox 1994, Bastin 1997, Small 2001, Hung and Ridd 2002). One set of data is generated
by the classifier, and the other set is obtained from the reference data (e.g. aerial
photograph). Unfortunately, the output of the IMAGINE sub-pixel classifier in its
current form does not support linear regression analysis (Ji and Jensen 1999), since regression requires data on an interval or ratio scale. Hence, we decided to assess the accuracy
by showing the Spearman’s correlation between referenced classes (actual) and
output classes (classified) of wild grass, riparian vegetation, tree, man-made grass, and agriculture. In assessing the effectiveness of the sub-pixel classifier
with the use of expert system rule, the 250 stratified random sample pixels selected
from the final vegetation distribution map were overlaid with the IKONOS 4 m
resolution data. A pixel of Landsat (30 × 30 m) covers a little more than 56 IKONOS pixels (4 × 4 m). The Spearman's rank order correlation between the vegetation
output map and reference data for wild grassland, man-made grass, riparian
vegetation, tree, and agriculture were 0.791, 0.869, 0.628, 0.743, and 0.840
respectively (table 2). The correlation coefficients for all classes were significant at
the 0.01 level.
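Because the classifier output is ordinal (discrete levels rather than continuous fractions), a rank-based statistic is required. The following is a self-contained sketch of Spearman's rank-order correlation with midranks for ties; the sample values are illustrative, not the study's data:

```python
def ranks(values):
    """1-based ranks, with ties assigned the midrank of the tied run."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = midrank
        i = j + 1
    return r

def spearman_rho(x, y):
    """Spearman's rho = Pearson's correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Illustrative ordinal class levels for ten sample pixels.
classified = [1, 2, 2, 3, 5, 6, 7, 8, 4, 3]
reference = [1, 2, 3, 3, 5, 5, 7, 8, 4, 2]
rho = spearman_rho(classified, reference)   # about 0.95 for these values
```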
A general conclusion can be drawn from table 2 that man-made grass and agriculture were the most reliable categories, since they both gave the highest correlations. The correlation between classified data and reference data for
Table 2. Spearman’s rank order correlation between classification results and reference data.
Classes Spearman’s rho
Wild grass            0.791*
Man-made grass        0.869*
Riparian vegetation   0.628*
Tree                  0.743*
Agriculture           0.840*
*Correlation is significant at the 0.01 level (two-tailed).
riparian vegetation was found to be the lowest. This may be due to signature confusion between riparian vegetation and other classes, especially
tree (table 1). We reported earlier the overlap of riparian vegetation classes and three
different grass categories (figure 8 (a) to (c)).
The correlations between the percentage classes in the output and the ground truth were in general not very strong, since the study attempted to identify the percentage distribution of spectrally close vegetation types in a complex urban-suburban environment (table 1). This may also be due to the fact that there is no guarantee that signatures derived from training-set pixels contain more than 90% of the MOI in every situation. The mixing of the MOI and the background within pixels may not be linear in some cases.
From the preceding discussion and conclusion, a checklist of the sources of
limitation or uncertainty in the application of sub-pixel (mixed pixel) approaches in
general may be identified as follows.
(a) An optimal approach for choosing pure signatures (end-members for the
linear spectral unmixing and material of interest for the IMAGINE sub-pixel
classifier) may be to use laboratory-based measurement of pure signatures
(reference data). However, a common approach for determining pure
signatures is to select representative homogeneous pixels from images.
There are several important reasons for selecting pure signatures from
images: to overcome the substantial problems that exist in correcting atmospheric absorption and scattering, the unavailability of laboratory-based reference data, the nature of the landscape under study, and/or classification
specificity. In a real world situation, it is understood that the selection of pure
signatures (100% certain for 100% homogeneous land cover type) in remotely
sensed images is practically impossible. This uncertainty could lead to
substantial errors in classification regardless of the effectiveness of the
classifier used.
(b) The limitation of the linear spectral unmixing, fuzzy c-means, and Bayesian probability approaches is that it is almost impossible to identify all possible
end-members in a study area under investigation and classification accuracy
may be significantly degraded by the potential presence of unknown classes.
This is because the classifier is based on the assumption that the sum of the
fractional proportions of all potential end-members in a pixel is equal to one.
This is a limitation for the linear mixture models whereas the IMAGINE
sub-pixel classifier does not require the identification of all potential land
covers since it employs the innovative background removal process. The sub-
pixel classifier assumes that the background spectrum for each pixel is unique and
can be represented by other pixels in the image. It could be a limitation in
achieving satisfactory accuracy if there are some complex forms of mixed
pixels involved in obtaining the background signatures. However, this
limitation will have less of an impact than the previously mentioned
constraint on classification accuracy.
(c) More end-members may better explain the spectral variation in a scene and hence increase model fitness. However, the linear spectral unmixing classifier
does not permit a number of representative materials greater than the
number of spectral bands. This constraint will have a significant impact on
classification accuracy. This is not a limitation for the IMAGINE sub-pixel
analysis since it uses the background removal approach.
(d) The mixing of pure signatures within pixels may not always be linear. Most
sub-pixel procedures are based on the assumption that a linear relation exists
between pixel brightness value and component land cover types.
(e) All mixed pixel classifiers may produce significant errors when dealing with
the same cover type but having completely different spectral responses (e.g.
red tile roof, wood shingle roof, light-grey metal roof, green-grey metal roof,
green metal roof, light-grey tar roof, dark-grey tar roof, red-grey tar roof,
glass roof, plastic roof, light-grey asphalt roof) since all sub-pixel approaches
require the identification of a representative sample of each selected land-cover type (material of interest or end-member). This situation is not at all unusual and holds true for many land cover types in real-world phenomena, especially when dealing with urban-suburban images.
(f ) The atmospheric condition is assumed to be uniform across the image since all
sub-pixel classifiers take the selected homogeneous pixels (end-members or the
material of interest) as the representative signatures and identify percentage
distribution of the selected features in each pixel regardless of the differences in
atmospheric conditions over the scene. No matter how accurate the signature derivation procedure is at determining pure signatures (100% homogeneous), the classifier will not be effective if the above assumption is not met. In a real-world situation, the assumption is clearly not true for most remotely sensed images, and a 100% atmospheric correction is impossible.
(g) Different training procedures for selecting pure signatures, or different image analysts, may result in significantly different classification accuracies for the same image over the same study area, since there are unavoidable problems and uncertainty in the identification of pure signatures, and the accuracy of all sub-pixel classifiers depends largely on the purity of these signatures. This is also partly due to the fact that substantial problems exist in ground truthing, atmospheric correction, and signature variation within one land cover type.
(h) The IMAGINE sub-pixel classifier does not permit regression analysis since
the output is in the form of ordinal data. The accuracy assessment may be
achieved by following the same procedure normally employed in traditional
image classification approaches. The classification accuracy for most sub-
pixel approaches that can produce a continuous percentage distribution of
component cover type is generally achieved by performing a regression
analysis between the estimated percent ground cover and reference data (e.g.
visual interpretation of aerial photograph). However, in both cases, the
results cannot be referred to as a thematic accuracy and are not explicit
indicators of classification accuracy (e.g. user’s accuracy, producer’s
accuracy, overall accuracy).
(i) There are some important limitations in ground truthing and verification for
the selection of pure signatures from images and the accuracy assessment for
sub-pixel approaches. A common approach is to use finer resolution images
or large scale aerial photographs in comparison to the primary data (coarse
resolution image) for verifying results (estimated fraction values or classified
outputs) from sub-pixel analysis since ground truthing with the use of a
global positioning system (GPS) or a topographic map is impractical. This is
because there is an unavoidable problem in identifying the actual area coverage of a pixel on the ground, since the location of a pixel is expressed by a single pair of x and y co-ordinates. In other words, it is impossible to identify the locations of the four corners of a pixel (four pairs of x and y co-ordinates) to determine the exact coverage of that pixel on the ground. Even if this were possible, it would still be extremely difficult to measure the percentage distribution of different land cover types within that pixel. One
alternative to overcome the limitation in actual ground truthing is to
interpret or identify features in higher resolution images or larger scale
aerial photographs. However, there are still some substantial problems
involved in using higher detailed information from images, and they are
listed as follows.
• Temporal error or accuracy needs to be considered carefully, since differences in acquisition dates between the primary image (e.g. Landsat TM) and reference image (e.g. IKONOS panchromatic image) could lead to significant
errors even in a situation in which both images are acquired in the same month
and year. For example, an agricultural area or grassland could have been
converted to a barren land (exposed soil) in a month for a new residential/
commercial development. There may be substantial changes if both images are
acquired in different months, and the situation will be worse if they are
acquired in different years. It should be noted that the reference data and
primary data for accuracy assessment, in the real world situation, are normally
acquired in different years.
• Another limitation to be considered is spatial accuracy related to images' geometric correction. For example, using 50 cm resolution aerial photograph data (reference data) to evaluate the accuracy of an output map derived from Landsat TM imagery (primary data) with an RMS error of 0.5 may lead to a locational error of 900 reference-data pixels per pixel of the primary data. The
error could be higher than 900 pixels if the positional error in geometric
correction of the reference data is taken into consideration. It should be noted
that 100% rectification accuracy of any data is practically impossible.
• Image-to-image registration does not overcome the above spatial accuracy
problem since the registration procedure involves the identification of ground
control points (GCP) on both images to make one image conform to the other
image in the database. In other words, we are comparing 30 × 30 m sized GCPs from the Landsat TM data with 50 × 50 cm sized GCPs from the aerial photographs. Hence, co-registration of two images with significantly different
spatial resolutions could be no better than two separate rectifications for
comparison purpose (e.g. accuracy assessment).
• The finer-resolution images and larger-scale aerial photographs commonly used for accuracy assessment and selection of pure signatures are usually grey-scale images and do not always permit accurate identification of features, especially
when dealing with different land covers having similar spectral responses (e.g.
grassland vs. agriculture, shrubs/scrubs vs. trees, sandy soil vs. cement car
park). This may be referred to as spectral limitation. This limitation could
potentially lead to uncertainty in accuracy assessment and signature derivation
for sub-pixel analysis.
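The linear mixture model and its sum-to-one constraint discussed in items (b) to (d) can be made concrete. The sketch below solves the constrained least-squares problem by substitution; the end-member spectra are invented for illustration, and no non-negativity constraint or noise term is included:

```python
import numpy as np

def unmix(pixel, endmembers):
    """Fractions f minimising ||E f - pixel|| subject to sum(f) = 1,
    with E holding one end-member spectrum per column."""
    E = np.asarray(endmembers, dtype=float)
    p = np.asarray(pixel, dtype=float)
    # Substitute f_k = 1 - sum(f_1..f_{k-1}) to fold in the constraint.
    A = E[:, :-1] - E[:, -1:]
    b = p - E[:, -1]
    g, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(g, 1.0 - g.sum())

# Two invented end-members observed in three bands; the pixel is an
# exact 70/30 mixture, so unmixing recovers fractions [0.7, 0.3].
E = np.array([[0.10, 0.50],
              [0.20, 0.60],
              [0.05, 0.40]])
pixel = 0.7 * E[:, 0] + 0.3 * E[:, 1]
fractions = unmix(pixel, E)   # sums to 1 by construction
```

The substitution also makes the band-count constraint of item (c) visible: with k end-members the reduced system has k − 1 unknowns against one equation per band, which is why the number of representative materials cannot usefully exceed the number of spectral bands.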
8. Conclusion
This research investigated the effectiveness of the IMAGINE sub-pixel classifier
with the use of an expert system rule in quantifying varying amounts and
distributions of different vegetation types in urban and suburban areas using
Landsat TM data. Results from this study demonstrated that the expert system rule
using the NDVI threshold procedure is reliable, and the sub-pixel processor picked
the signatures relatively well. The linear spectral mixing model, a usual technique for
mixed pixel classification, requires careful selection of representative land covers to
completely characterize the heterogeneity of an area under investigation. However,
the spectral unmixing classifier does not permit a number of representative materials
greater than the number of spectral bands. It was found that the IMAGINE sub-
pixel classifier is not limited in the number of end-members it can analyze since it
uses the innovative background removal process. The shortcoming found in the
IMAGINE sub-pixel analysis is that the discrete values for the output thematic map
are limited to two, four, or eight. Hence, the minimum possible range available with
the classifier is 10 (e.g. 20–29%). Continuous percentage values may be more
desirable in cases where the exact percentage of MOI is required for some detailed
analysis and modelling. Generating a continuous output would also allow applications in which the occurrence of an MOI at less than 20% cover is the information required (Flanagan and Civco 2001), since the classifier is incapable of quantifying an MOI covering less than 20% of a pixel.
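The discrete output scheme described above can be sketched as a simple binning function, assuming the eight levels divide the 20–100% range into equal 10% steps (consistent with the 20–29% example):

```python
def moi_level(fraction, levels=8):
    """Map a continuous MOI fraction (0-1) onto the classifier's
    discrete output levels; 0 means below the 20% reporting floor."""
    if fraction < 0.20:
        return 0                    # MOI under 20% is not quantified
    width = 0.80 / levels           # 10% bins when levels == 8
    level = int((fraction - 0.20) // width) + 1
    return min(level, levels)       # 100% falls in the top level

# A pixel that is 25% tree maps to level 1 (the 20-29% bin); a pixel
# that is 12% tree is below the floor and is reported as absent.
```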
It can be concluded that, under real-world conditions, the success of this
approach for sub-pixel classification can be highly variable and may not be easily
controlled. The approach needs to be handled carefully with the awareness of the
limitations discussed earlier. However, it should be noted that all or most of the
problems and uncertainties reported in this manuscript apply to all sub-pixel
classification techniques.
References
APPLIED ANALYSIS, 2003, Imagine Subpixel Classifier User's Guide, 174 pp. (Billerica, MA
01821: Applied Analysis Inc.).
BASTIN, L., 1997, Comparison of fuzzy c-means classification, linear mixture modeling, and
MLC probabilities as tools for unmixing coarse pixels. International Journal of
Remote Sensing, 18, pp. 3629–3648.
EASTMAN, 1999, IDRISI32, Volume 2, 170 pp. (Worcester, MA: Clark University).
EASTMAN, J.R. and LANEY, R.M., 2002, Bayesian soft classification for sub-pixel analysis: a
critical evaluation. Photogrammetric Engineering and Remote Sensing, 68, pp.
1149–1154.
FISHER, P.F. and PATHIRANA, S., 1990, The evaluation of fuzzy membership of land cover
classes in the suburban zone. Remote Sensing of Environment, 34, pp. 121–132.
FLANAGAN, M. and CIVCO, D.L., 2001, Software Review, IMAGINE Subpixel Classifier 8.4.
Photogrammetric Engineering and Remote Sensing, 67, pp. 23–28.
FOODY, G.M., 2000, Estimation of sub-pixel land cover composition in the presence of
untrained classes. Computers and Geosciences, 26, pp. 469–478.
FOODY, G.M. and ARORA, M.K., 1996, Incorporating mixed pixels in the training,
allocation and testing of supervised classification. Pattern Recognition Letters, 17, pp.
1389–1398.
FOODY, G.M., CAMPBELL, N.A., TRODD, N.M. and WOOD, T.F., 1992, Derivation and
applications of probabilistic measures of class membership from the maximum-
likelihood classification. Photogrammetric Engineering and Remote Sensing, 58, pp.
1335–1341.
FOODY, G.M. and COX, D.P., 1994, Sub-pixel land cover composition estimation using a
linear mixture model and fuzzy membership functions. International Journal of
Remote Sensing, 15, pp. 619–631.
GALLO, K.P., MCNAB, A.L., KARL, T.R., BROWN, J.F., HOOD, J.J. and TARPLEY, J.D., 1993,
The use of a vegetation index for assessment of the urban heat island effect.
International Journal of Remote Sensing, 14, pp. 2223–2230.
HUANG, Y.J., AKBARI, H., TAHA, H. and ROSENFELD, A.H., 1987, The potential of vegetation
in reducing summer cooling loads in residential buildings. Journal of Climate and
Applied Meteorology, 26, pp. 1103–1116.
HUGUENIN, R.L., KARASKA, M.A., BLARICOM, D.V. and JENSEN, J.R., 1997, Subpixel
classification of bald cypress and tupelo gum trees in Landsat Thematic Mapper
imagery. Photogrammetric Engineering and Remote Sensing, 63, pp. 717–725.
HUNG, M. and RIDD, M.K., 2002, A subpixel classifier for urban land-cover mapping based
on a maximum-likelihood approach and expert system rules. Photogrammetric
Engineering and Remote Sensing, 68, pp. 1173–1180.
JI, M. and JENSEN, J.R., 1996, Fuzzy training in supervised image classification. Geographic
Information Sciences, 2, pp. 1–12.
JI, M. and JENSEN, J.R., 1999, Effectiveness of subpixel analysis in detecting and quantifying
urban impervious from Landsat Thematic Mapper Imagery. Geocarto International,
14(4), pp. 33–41.
LO, C.P. and QUATTROCHI, D., 2003, Land-use and land-cover change, urban heat island
phenomenon, and health implications, a remote sensing approach. Photogrammetric
Engineering and Remote Sensing, 69, pp. 1053–1063.
LO, C.P., QUATTROCHI, D.A. and LUVALL, J.C., 1997, Application of high-resolution thermal
infrared remote sensing and GIS to assess the urban heat island effect. International
Journal of Remote Sensing, 18, pp. 287–304.
MCPHERSON, E.G., 1994, Cooling urban heat islands with sustainable landscapes. In
Ecological City: Preserving and Restoring Urban Biodiversity, R.H. Platt, R.A.
Rowntree and P.C. Muick (Eds), pp. 151–171 (Amherst, MA: The University of
Massachusetts Press).
MESEV, V., 2003, Remotely sensed cities: an introduction. In Remotely Sensed Cities, V.
Mesev (Ed.), pp. 1–19 (London: Taylor and Francis).
NASA/GHCC PROJECT ATLANTA, 2004, Urban climatology and air quality. Available online
at: http://www.ghcc.msfc.nasa.gov/urban/urban_heat_island.html (accessed 30 June
2004).
OKE, T.R., 1982, The energetic basis of the urban heat island. Quarterly Journal of the Royal
Meteorological Society, 108, pp. 1–24.
OWEN, T.W., CARLSON, T.N. and GILLIES, R.R., 1998, An assessment of satellite remotely
sensed landcover parameters in quantitatively describing the climate effect of
urbanization. International Journal of Remote Sensing, 19, pp. 1663–1681.
RASHED, T., WEEKS, J.R. and GADALLA, M.S., 2001, Revealing the anatomy of cities through
spectral mixture analysis of multispectral satellite imagery: a case study of the greater
Cairo region, Egypt. Geocarto International, 12(3), pp. 27–40.
RASHED, T., WEEKS, J.R., ROBERTS, D., ROGAN, J. and POWELL, R., 2003, Measuring the
physical composition of urban morphology using multiple endmember spectral
mixture models. Photogrammetric Engineering and Remote Sensing, 69, pp.
1011–1020.
SCHOWENGERDT, R.A., 1995, Soft classification and spatial-spectral mixing. In Proceedings of
International Workshop on Soft Computing in Remote Sensing Data Analysis, 4–5
December 1995, Milan, Italy, 1, pp. 1–6.
SETTLE, J.J. and DRAKE, N.A., 1993, Linear mixing and the estimation of ground cover
proportions. International Journal of Remote Sensing, 14, pp. 1159–1177.
SINGH, A., 1989, Digital change detection techniques using remotely sensed data.
International Journal of Remote Sensing, 10, pp. 989–1003.
SMALL, C., 2001, Estimation of urban vegetation abundance by spectral mixture analysis.
International Journal of Remote Sensing, 22, pp. 1305–1334.
SMITH, M.O., USTIN, S.L., ADAMS, J.B. and GILLESPIE, A.R., 1990, Vegetation in deserts: I. A
regional measure of abundance from multispectral images. Remote Sensing of
Environment, 31, pp. 1–26.
VAN DER MEER, F., 1997, Mineral mapping and Landsat Thematic Mapper image
classification using spectral unmixing. Geocarto International, 12, pp. 27–40.
WAGROWSKI, D.M. and HITES, R.A., 1997, Polycyclic aromatic hydrocarbon accumulation in
urban, suburban and rural vegetation. Environmental Science and Technology, 31, pp.
279–282.
WANG, F., 1990a, Fuzzy supervised classification of remote sensing images. IEEE
Transactions on Geoscience and Remote Sensing, 28, pp. 194–201.
WANG, F., 1990b, Improving remote sensing image analysis through fuzzy information
representation. Photogrammetric Engineering and Remote Sensing, 56, pp. 1163–1168.
WU, C. and MURRAY, A., 2003, Estimating impervious surface distribution by spectral
mixture analysis. Remote Sensing of Environment, 84, pp. 493–505.
ZHANG, J. and FOODY, G.M., 2001, Fully-fuzzy supervised classification of sub-urban land
cover from remotely sensed imagery: Statistical and neural network approaches.
International Journal of Remote Sensing, 22, pp. 615–628.