Remote Sensing of Environment 91 (2004) 237–242

Cloud detection in Landsat imagery of ice sheets using shadow matching technique and automatic normalized difference snow index threshold value decision

Hyeungu Choi a,b,*, Robert Bindschadler b

a Science Applications International Corporation (SAIC), San Diego, CA, USA
b Oceans and Ice Branch (Code 971), NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA

Received in revised form 24 March 2004; accepted 30 March 2004

Abstract

This work presents a new algorithm designed to detect clouds in satellite visible and infrared (IR) imagery of ice sheets. The approach identifies possible cloud pixels through the use of the normalized difference snow index (NDSI). Possible cloud pixels are grown into regions and edges are determined. Possible cloud edges are then matched with possible cloud shadow regions using knowledge of the solar illumination azimuth. A scoring index quantifies the quality of each match resulting in a classified image. The best value of the NDSI threshold is shown to vary significantly, forcing the algorithm to be iterated through many threshold values. Computational efficiency is achieved by using sub-sampled images with only minor degradation in cloud-detection performance. The algorithm detects all clouds in each of eight test Landsat-7 images and makes no incorrect cloud classifications.

© 2004 Elsevier Inc. All rights reserved.

Keywords: Landsat; ETM+; Clouds; Shadow; Classification; Ice sheet; NDSI; Automatic cloud cover assessment (ACCA)

1. Introduction

Automated procedures for detecting cloud have multiple uses. A major application is to assist in searches of optical imagery archives. Cloudier images can usually be ignored in favor of less cloudy images, unless the target is small or the date is an essential search parameter. Accurate cloud assessment also serves a critical role in the scheduling of high-resolution imagers such as the Enhanced Thematic Mapper Plus (ETM+) on Landsat-7 (Arvidson et al., 2001). Cloud cover of ETM+ images is used to determine if a desired image collection was successful and, if not, the image request is returned to the imaging queue for reacquisition. An incorrect cloud assessment can lead to poor utilization of imaging resources and effort.

0034-4257/$ - see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.rse.2004.03.007

* Corresponding author. Oceans and Ice Branch, Science Applications International Corporation, Code 971, NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA. Fax: +1-301-614-5644.
E-mail addresses: [email protected] (H. Choi), [email protected] (R. Bindschadler).

Over most of the earth’s surface, clouds can be detected by their high albedo in the visible spectrum and by their cold temperatures. However, either approach has difficulty in discriminating between clouds and ice sheets because both targets are bright and temperature inversions in the atmosphere above ice sheets are common, leaving the surface colder than the clouds. Cloud formations are usually distinct and mappable in ice sheet imagery, but their automatic classification as cloud rather than as a formation of the ice sheet is the crux of the difficulty.

The approach examined here utilizes the characteristic that clouds thick enough to mask the surface also cast shadows on the surface. Shadows are much darker than either the ice sheet surface or the clouds, and are easily identified. However, ice sheets do contain limited areas of mountains and bare rock that are also dark. Knowledge of the sun azimuth allows potential cloud features to be matched with potential cloud shadow features to better determine what features are actually clouds. A quantitative index of matching is used to optimize the algorithm, and multiple iterations are necessary to search the image for the set of cloud features that are best matched by the set of potential cloud shadows.

2. Data

Eight Level-1G Landsat-7 ETM+ images of the Antarctic or Greenland ice sheet were used for this research. The selection was made to provide a variety of cloud types and coverage amounts. The test images were converted into calibrated reflectance images using ENVI (RSINC) software. No atmospheric correction was applied. Sun elevation angle and azimuth angle were read from the metadata file.

3. The ACCA algorithm

The automatic cloud cover assessment (ACCA) algorithm (Irish, 2000) was developed for the Landsat processing system (LPS) and is the starting point for the approach discussed here. The LPS retrieves and processes the raw image data and generates Level-0R data with an associated cloud assessment. The ACCA algorithm embedded in LPS generates a cloud cover score for each quarter of each scene. The ACCA algorithm is a two-pass processing scheme. Pass one applies eight separate filters, while pass two involves thermal channel analysis.

The ACCA gives good results over most of the planet with the exception of ice sheets because ACCA operates on the premise that clouds are colder than the land surface they cover. Only one of the eight filters for pass one processing is effective for ice sheet images: the normalized difference snow index (NDSI) (Hall et al., 1995). The NDSI was designed to distinguish snow from most other features. Other filters are designed for classifying highly reflective vegetation, rock, and sand. The NDSI filter is expressed as:

$$\mathrm{NDSI} = \frac{\text{band 2} - \text{band 5}}{\text{band 2} + \text{band 5}} \tag{1}$$

and produces an image of NDSI values. A threshold applied to the NDSI image is used to separate cloud pixels from non-cloud pixels. The principle behind the NDSI filter is that while snow and cloud are both highly reflective in band 2 (0.52–0.6 μm), the reflectance of clouds in the near-infrared band 5 (1.55–1.75 μm) decreases less than the snow reflectance.
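As a minimal sketch of Eq. (1) and the threshold test, the Python below computes the NDSI per pixel and flags possible cloud pixels. The function names and the sample reflectances are hypothetical, and the comparison direction (NDSI below the threshold means possible cloud) is an assumption inferred from the principle stated above, since cloud keeps more of its band 5 reflectance than snow does.

```python
def ndsi(band2, band5):
    """Eq. (1) for a single pixel; inputs are calibrated reflectances."""
    total = band2 + band5
    if total == 0:
        return 0.0  # assumption: treat an all-zero (fill) pixel as NDSI 0
    return (band2 - band5) / total


def possible_cloud_mask(band2_img, band5_img, threshold=0.7):
    """True where NDSI < threshold (assumed to be the cloud side of the test)."""
    return [
        [ndsi(b2, b5) < threshold for b2, b5 in zip(row2, row5)]
        for row2, row5 in zip(band2_img, band5_img)
    ]


# Hypothetical one-row image: a snow-like pixel and a cloud-like pixel.
band2 = [[0.90, 0.80]]
band5 = [[0.10, 0.40]]
mask = possible_cloud_mask(band2, band5)
```

With these made-up values the first pixel has an NDSI of about 0.8 and is left as not-cloud, while the second has an NDSI of about 0.33 and is flagged as possible cloud.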

Ice sheets are primarily covered in snow, so a snow versus cloud discriminator is expected to be effective. ACCA takes 0.7 for its NDSI threshold value. For different threshold values, the pixels identified as cloud varied. The examination of the NDSI filter indicated that the best NDSI threshold value was variable from one image to another, depending on factors such as sun elevation angle, atmospheric condition, and season.

Another factor reducing the effectiveness of the NDSI as an ice sheet versus cloud discriminator is that snow’s near-infrared reflectance increases with snow grain size (Dozier, 1989). Ice sheets are generally covered by older, larger grained snow, a result of infrequent snowfall, wind transport of snow that breaks off the grains’ delicate dendritic arms, and large temperature gradients that enhance snow metamorphism into rounder shapes. The next two sections show how the optimal NDSI threshold for each image is decided.

4. Cloud detection using shadow matching (CDSM) algorithm

The basic concept of our cloud detection using shadow matching (CDSM) algorithm is to detect clouds by matching them with their corresponding shadows. Dark features are easily identified and shadows comprise a subset of all dark features. Potential clouds are identified through the application of the NDSI although the set of cloud candidates varies with the threshold used with the NDSI. Knowing the sun azimuth limits the searching necessary to match possible clouds with possible shadows.

The CDSM algorithm uses bands 2 through 5 of the calibrated reflectance images. Fig. 1 shows the flow chart of the CDSM algorithm. Preliminary steps, which are not shown here, are the removal of non-image edge pixels around the perimeter of the image and the detection of water pixels.

Fig. 1. Brief flow chart of CDSM algorithm.


Water pixels are much darker than cloud shadow pixels in the visible spectrum. Each pixel in bands 3 and 4 is compared to a water threshold set at 0.07. Pixels with values below this threshold are classified as water and are not considered further in the cloud detection scheme.
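The water test above can be sketched in a few lines of Python. The text does not say whether a pixel must fall below 0.07 in both bands or in either one; the sketch assumes both, and the function name and sample values are hypothetical.

```python
WATER_THRESHOLD = 0.07  # reflectance threshold quoted in the text


def water_mask(band3_img, band4_img, threshold=WATER_THRESHOLD):
    """True where the pixel is below the threshold in band 3 AND band 4.

    Requiring both bands is an assumption; the text only states that each
    pixel in bands 3 and 4 is compared to the threshold.
    """
    return [
        [b3 < threshold and b4 < threshold for b3, b4 in zip(row3, row4)]
        for row3, row4 in zip(band3_img, band4_img)
    ]


# Hypothetical reflectances: open water, cloud shadow, snow.
band3 = [[0.03, 0.30, 0.85]]
band4 = [[0.02, 0.25, 0.80]]
water = water_mask(band3, band4)
```

Pixels flagged here would simply be excluded from the later cloud and shadow steps.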

An NDSI threshold is set and, by application of the NDSI formula (Eq. (1)), potential cloud pixels are identified. The default threshold value is 0.7, but, as described below, this value is later varied to optimize the amount of cloud and shadow matching possible for any image. The result is a binary image with each pixel labeled either “possible cloud” or “not-cloud”.

A morphological closing operator (Castleman, 1996) that removes small holes and narrow gaps is then applied to the binary map. This operation simplifies the shapes of possible clouds and reduces their number. This dramatically reduces the processing times of the remaining steps of the algorithm. Pixels identified as possible cloud are isolated into regions with a “region-labeling” algorithm. A region is a set of possible cloud pixels within a neighborhood around the pixel under examination. This labeling operation (Pavlidis, 1982) also tags each potential cloud region with a unique identifier. The CDSM algorithm then tests whether each possible cloud region has a matching shadow.

Next, an edge detection procedure extracts the edges of the possible cloud regions. After a thinning operation, all edges are a single pixel wide.

Fig. 2. Landsat-7 ETM+ images (color composite image from bands 3, 4, and 5) and corresponding cloud mask results. The numbers on each image are Path/Row. Identified clouds (light gray), detected shadows (dark gray), detected water pixels (grid) and the rejected non-cloud pixels (black).

Bands 3 and 4 are also used for shadow detection. Cloud shadow is brighter than water and darker than both cloud and snow. The brightness of cloud shadow varies depending on sun elevation angle and cloud thickness. When the sun elevation angle is lower than 15 degrees, a maximum reflectance threshold of 0.6 is used. The maximum threshold is increased to 0.7 when the sun elevation angle exceeds 15 degrees. Minimum thresholds are 0.15 and 0.1 for band 3 and band 4, respectively. Pixels in the range between the minimum and maximum thresholds are classified as “possible cloud shadow”.
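The brightness window for possible cloud shadow can be sketched as follows. Several details are assumptions rather than statements from the text: both bands must fall inside their windows, the window bounds are inclusive, and a sun elevation of exactly 15 degrees uses the higher maximum. The function names are hypothetical.

```python
BAND3_MIN = 0.15  # minimum shadow reflectance, band 3
BAND4_MIN = 0.10  # minimum shadow reflectance, band 4


def shadow_max_threshold(sun_elevation_deg):
    """Maximum shadow reflectance: 0.6 below 15 degrees, otherwise 0.7.

    Treating exactly 15 degrees as the higher case is an assumption.
    """
    return 0.6 if sun_elevation_deg < 15 else 0.7


def is_possible_shadow(band3, band4, sun_elevation_deg):
    """True when both bands lie between their minimum and the maximum."""
    upper = shadow_max_threshold(sun_elevation_deg)
    return BAND3_MIN <= band3 <= upper and BAND4_MIN <= band4 <= upper


# Hypothetical pixels at a sun elevation of 20 degrees.
dim_pixel = is_possible_shadow(0.30, 0.25, 20)     # inside the window
bright_pixel = is_possible_shadow(0.90, 0.85, 20)  # too bright: snow or cloud
```

A pixel that passes this test is only a candidate; dark rock or terrain shadow can pass it too, which is why the later matching step is needed.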

It is recognized that there may be other classes within this range of brightness, such as bare rock or snow shadowed by steep mountains. The matching of possible clouds with possible cloud shadows is how the actual cloud shadow pixels are separated from the other classes of intermediate brightness.

The matching procedure works with the sets of edges of possible clouds, the possible cloud shadows, and the water regions. Starting with any cloud edge, the edge is translated across the image in the direction of solar illumination, searching for cloud shadow. When the edge pixels of a cloud cluster meet shadow, water, or image edge pixels, the shadow, water, and image edge ratios (the number of cloud edge pixels meeting shadow, water, and image edge, respectively, divided by the total number of edge pixels of the cloud cluster) are recorded. For the case of the shadow ratio, the extreme situation is when a small cloud and its complete cloud shadow are identified. In this case, every edge pixel matches a cloud shadow pixel and the shadow ratio is 1. For larger clouds or low clouds with distinct shadows, the cloud could obscure a portion of the shadow and the shadow ratio would decrease toward 0.5.
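A simplified sketch of the shadow ratio is given below. It is not the authors' implementation: it steps each edge pixel independently instead of translating the whole edge, ignores water and image edge pixels, and assumes a north-up image with rows increasing southward, columns increasing eastward, and the sun azimuth measured clockwise from north. The function name and the `max_steps` search limit are hypothetical.

```python
import math


def shadow_ratio(edge_pixels, shadow_pixels, sun_azimuth_deg, max_steps):
    """Fraction of cloud edge pixels that reach a possible-shadow pixel.

    edge_pixels   : iterable of (row, col) cloud edge pixels
    shadow_pixels : set of (row, col) possible cloud shadow pixels
    """
    az = math.radians(sun_azimuth_deg)
    # Light travels away from the sun, so shadows are displaced toward
    # azimuth + 180 degrees. In (row, col) terms that unit step is:
    d_row, d_col = math.cos(az), -math.sin(az)

    edge_pixels = list(edge_pixels)
    hits = 0
    for row, col in edge_pixels:
        for k in range(1, max_steps + 1):
            probe = (round(row + k * d_row), round(col + k * d_col))
            if probe in shadow_pixels:
                hits += 1
                break
    return hits / len(edge_pixels)


# Hypothetical 2 x 2 cloud with the sun due east (azimuth 90 degrees),
# so the shadow lies to the west (smaller column numbers).
edge = [(2, 2), (2, 3), (3, 2), (3, 3)]
shadow = {(2, 0), (3, 0)}
ratio_short = shadow_ratio(edge, shadow, 90.0, max_steps=2)
ratio_long = shadow_ratio(edge, shadow, 90.0, max_steps=3)
```

In this toy case a two-pixel search reaches the shadow from only the two western edge pixels (ratio 0.5), while a three-pixel search reaches it from all four (ratio 1.0), which illustrates why the search distance matters.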

If the water ratio is greater than 0.25 or if the image edge ratio is greater than 0.2, then the possible cloud cluster is classified as cloud without testing shadow matching. Otherwise, if the shadow ratio is greater than 0.2, the cluster is classified as cloud. These thresholds were determined empirically based on the test images available.
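The decision rule above reduces to a short function. This is a sketch using the thresholds quoted in the text; the function name and the `"rejected"` label for clusters that fail every test are hypothetical.

```python
def classify_cluster(shadow_ratio, water_ratio, image_edge_ratio):
    """Apply the empirical ratio thresholds to one possible cloud cluster."""
    # Clusters that mostly run into water or the image edge are accepted
    # as cloud without a shadow test.
    if water_ratio > 0.25 or image_edge_ratio > 0.2:
        return "cloud"
    # Otherwise a sufficient fraction of the edge must meet shadow.
    if shadow_ratio > 0.2:
        return "cloud"
    return "rejected"  # hypothetical label for unmatched clusters


label_matched = classify_cluster(shadow_ratio=0.6, water_ratio=0.0, image_edge_ratio=0.0)
label_unmatched = classify_cluster(shadow_ratio=0.05, water_ratio=0.0, image_edge_ratio=0.0)
```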

The output of the cloud detection algorithm is an image map classified into cloud, water, shadow, and the remaining possible cloud clusters that failed to be classified as cloud. Fig. 2 shows our test Landsat-7 images and the corresponding classified images resulting from the CDSM algorithm showing identified clouds (light gray), detected shadows (dark gray), detected water pixels (grid), rejected non-cloud pixels (black), and snow-covered ice sheet (white).

Fig. 3. Flow chart of ANTD algorithm.

5. Automatic NDSI threshold decision (ANTD) algorithm

As discussed earlier, the NDSI threshold value used in the CDSM algorithm cannot be fixed due to the variability of image conditions: specifically sun elevation angles, atmospheric conditions, and seasonal conditions. Significant errors occurred for any constant value of the NDSI threshold. From visual inspection of clouds in our eight test images and the performance with various values of the NDSI threshold, the proper NDSI values ranged from 0.56 to 0.79 (average = 0.675, standard deviation = 0.069).
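As a consistency check, the quoted average and standard deviation can be reproduced from the eight optimal thresholds listed in Table 1. The short Python sketch below uses the standard `statistics` module; the variable names are illustrative only.

```python
import statistics

# Optimal NDSI thresholds for the eight test images, as listed in Table 1.
thresholds = [0.70, 0.56, 0.67, 0.79, 0.63, 0.71, 0.63, 0.71]

mean_threshold = statistics.mean(thresholds)
sample_std = statistics.stdev(thresholds)       # n - 1 in the denominator
population_std = statistics.pstdev(thresholds)  # n in the denominator
```

The mean comes out at 0.675, and the sample form rounds to 0.069 while the population form rounds to 0.065, so the quoted standard deviation appears to be the sample (n − 1) value.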

We introduced an automatic NDSI threshold decision (ANTD) method to deal with this condition (Fig. 3). The ANTD method requires that the full CDSM algorithm be applied for a series of NDSI threshold values. For each iteration in the series, a single value of the NDSI threshold is used and the results of the CDSM algorithm are used to derive a “cloud score”, defined as:

$$\text{Cloud score} = S_1 \times S_{\mathrm{Ratio}} - 0.5 \times R \tag{2}$$

where $S_1$ is the number of cloud edge pixels matching cloud shadow, $S_{\mathrm{Ratio}} = S_1/(\text{total number of cloud edge pixels})$, and $R$ is the number of cloud edge pixels not matched by cloud shadow.

The preferred value of the NDSI threshold occurs when the cloud score is a maximum.
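A minimal sketch of the cloud score follows. It reads Eq. (2) as the matched-pixel term minus 0.5 times the unmatched count, and it assumes that every cloud edge pixel is either matched or unmatched, so that R equals the total minus S1. The function name and the sample counts are hypothetical.

```python
def cloud_score(s1, total_edge_pixels):
    """Cloud score of Eq. (2) from matched and total cloud edge pixel counts."""
    if total_edge_pixels <= 0:
        raise ValueError("at least one cloud edge pixel is required")
    s_ratio = s1 / total_edge_pixels
    unmatched = total_edge_pixels - s1  # R, assuming total = S1 + R
    return s1 * s_ratio - 0.5 * unmatched


# Hypothetical counts: 80 of 100 edge pixels matched, then 40 of 100.
score_good = cloud_score(80, 100)
score_poor = cloud_score(40, 100)
```

Because the first term grows with the square of the matched count, halving the matches in this made-up case drops the score from about 54 to about −14, which shows how strongly the score favors well-matched, large cloud edges.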

For the iterations of the ANTD, the NDSI threshold is set to an initial value of 0.6 (0.56 for the image with the lowest sun-elevation angle of 8°) and increased by 0.01 for each iteration. Fig. 4 shows that there is always a maximum cloud score for each image, but that the corresponding preferred value of the NDSI threshold varies from image to image.

Fig. 4. Normalized cloud score for each ANTD iteration step. Key gives path and row of each Landsat-7 ETM+ image.

Fig. 4 also shows that the cloud scores decrease sharply for NDSI thresholds above the preferred value. Higher NDSI thresholds add incorrect pixels to the possible cloud regions. As a result, fewer possible cloud regions match with possible cloud shadow, lowering the value of the first term in Eq. (2). In addition, the increased number of cloud edge pixels that fail to match increases the value of the second term in Eq. (2), which also serves to lower the cloud score. Eq. (2) is weighted to give preference to the larger cloud regions. Our bias was to ensure that the largest clouds had the greatest certainty of detection because smaller clouds have a lesser impact on the utility of an image. Our choice of this weighting and of the coefficient value of 0.5 for the second term was the result of extensive evaluations, even though the number of images examined was limited to eight.

To truncate the iteration process and save processor time, a test is included based on a shadow–cloud ratio (total number of shadow pixels/total number of cloud pixels). We found that if this ratio is less than 0.15 at the end of an iteration, subsequent iterations for other values of the NDSI threshold need not be completed. A small shadow–cloud ratio means that the cloud clusters have grown too much as a result of the NDSI threshold value being too high.
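The ANTD iteration with its early-termination test can be sketched as a simple loop. This is an illustration under stated assumptions, not the authors' code: `run_cdsm` is a hypothetical callback standing in for one full CDSM pass, the upper limit of 0.85 on the threshold is invented because the text gives none, and the synthetic callback exists only to make the example runnable.

```python
def cloud_score(s1, total_edge):
    """Eq. (2), assuming unmatched edge pixels R = total_edge - s1."""
    return s1 * (s1 / total_edge) - 0.5 * (total_edge - s1)


def antd(run_cdsm, start=0.60, stop=0.85, step=0.01, min_shadow_cloud_ratio=0.15):
    """Return (best_threshold, best_score) over a series of NDSI thresholds.

    run_cdsm(threshold) is a hypothetical callback returning
    (s1, total_edge, n_shadow_pixels, n_cloud_pixels) for one CDSM pass.
    """
    best_threshold, best_score = None, float("-inf")
    n_steps = int(round((stop - start) / step)) + 1
    for i in range(n_steps):
        threshold = round(start + i * step, 2)
        s1, total_edge, n_shadow, n_cloud = run_cdsm(threshold)
        if total_edge > 0:
            score = cloud_score(s1, total_edge)
            if score > best_score:
                best_threshold, best_score = threshold, score
        # Early termination: cloud clusters have grown too large.
        if n_cloud > 0 and n_shadow / n_cloud < min_shadow_cloud_ratio:
            break
    return best_threshold, best_score


calls = []


def synthetic_cdsm(threshold):
    """Made-up stand-in whose matching peaks at a threshold of 0.67."""
    calls.append(threshold)
    idx = round(threshold * 100)
    s1 = max(0, 90 - 10 * abs(idx - 67))
    n_cloud = 1000 + 100 * (idx - 60)  # clusters grow with the threshold
    return s1, 100, 300, n_cloud


best_threshold, best_score = antd(synthetic_cdsm)
```

With the synthetic callback the loop selects 0.67 and stops after the pass at 0.71, where the shadow–cloud ratio first drops below 0.15, so only 12 of the 26 candidate thresholds are evaluated.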

Table 1
Cloud percentage detected by the CDSM algorithms with full, quarter, and 1/16 size images and optimal NDSI threshold values reported by the ANTD algorithm

| Path, row | Month/day/year | Cloud% (full size) | Cloud% (1/4 size) | Cloud% (1/16 size) | NDSI threshold |
|---|---|---|---|---|---|
| 227, 117 | 01/17/00 | 15.68 | 15.60 | 15.21 | 0.7 |
| 34, 119 | 02/26/00 | 1.05 | 0.94 | 0.92 | 0.56 |
| 12, 115 | 01/15/00 | 14.91 | 14.96 | 14.60 | 0.67 |
| 7, 121 | 12/27/99 | 14.47 | 14.52 | 14.24 | 0.79 |
| 229, 118 | 12/14/99 | 11.84 | 11.94 | 11.74 | 0.63 |
| 229, 119 | 12/14/99 | 5.92 | 5.96 | 5.82 | 0.71 |
| 53, 115 | 01/16/01 | 16.90 | 16.71 | 16.30 | 0.63 |
| 29, 117 | 12/21/99 | 7.47 | 7.37 | 7.13 | 0.71 |

6. Performance

The performances of the CDSM and ANTD algorithms were evaluated by comparing their results to an independent classification of each image based on visual inspection. Even though a cloud may appear very similar to snow in the visible and near-infrared parts of the spectrum, a person can often use cloud shape and shadow to unambiguously distinguish the cloud from snow-covered ice.

In all eight cases, the combination of the CDSM and ANTD algorithms found all clouds and made no incorrect cloud classifications. The detected cloud percentage and the optimal NDSI threshold values returned by the ANTD algorithm are shown in Table 1. We feel the excellent results represent a great improvement over ACCA, which uses NDSI with a fixed threshold.

Although our test data set was chosen randomly, from images already on hand for other studies, in half of the images the near-infrared reflectance of the snow-covered regions is so close to that of clouds that the NDSI often identified those regions as cloud. However, the CDSM algorithm correctly reclassified these regions because no matching shadows could be found.

The CDSM and ANTD algorithms attempt to automate some of the procedures a human employs in cloud identification, but the automated procedures necessarily involve many calculations. Each cloud cluster must be tested for shadow matching and the ANTD algorithm involves an iteration scheme of over 20 CDSM processes. Nearest-neighbor sub-sampled images were created to examine how sub-sampling reduces the CDSM/ANTD processing times and how it affects the accuracy of cloud detection. Results for identical CDSM/ANTD processing of the 2 × 2 sub-sampled and 4 × 4 sub-sampled images are given in Table 1 and illustrated in Fig. 5. The calculated cloud-detection results suffer only minor degradation, while the processing time decreases exponentially. The results also show that images with more cloud clusters take more time for the CDSM/ANTD procedure.
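Nearest-neighbor sub-sampling of this kind amounts to keeping one pixel from each block. The sketch below keeps the top-left pixel of every block, which is an assumption, since the text does not say which pixel is retained; the function name and the toy image are hypothetical.

```python
def subsample(image, factor):
    """Keep every factor-th pixel in both directions (top-left of each block)."""
    return [row[::factor] for row in image[::factor]]


# Hypothetical 4 x 4 image whose pixel values are just running indices.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
quarter = subsample(image, 2)    # 2 x 2 sub-sampling: 1/4 of the pixels
sixteenth = subsample(image, 4)  # 4 x 4 sub-sampling: 1/16 of the pixels
```

Each pass of the algorithm then visits a quarter or a sixteenth of the original pixels, which corresponds to the 1/4 size and 1/16 size columns of Table 1.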

Fig. 5. ANTD algorithm processing times depending on sampled image size. Key gives path and row of each Landsat-7 ETM+ image.

Situations are known to occur, such as ground fog, where clouds are at such low elevations that shadows are displaced too short a distance to be resolved in an image. Our approach will fail in such situations. If a shadow is required to be a minimum of 2 pixels wide (60 meters for ETM+) and the sun elevation to be a minimum of 10 degrees, only clouds with upper surfaces lower than about 10 meters will be missed. We do not consider this restriction to severely limit the application of our approach.
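The 10 meter figure follows from simple shadow geometry: over a flat surface, a cloud top at height h casts a shadow displaced by h / tan(sun elevation), so the smallest resolvable height is the minimum shadow length times tan(sun elevation). The sketch below evaluates that relation; the flat-surface assumption and the function name are mine, not the authors'.

```python
import math


def min_detectable_cloud_height(shadow_length_m, sun_elevation_deg):
    """Cloud-top height that displaces a shadow by shadow_length_m.

    Assumes a flat, horizontal surface beneath the cloud.
    """
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))


# Two 30 m ETM+ pixels of shadow at a 10 degree sun elevation.
height_m = min_detectable_cloud_height(60.0, 10.0)
```

The result is about 10.6 m, consistent with the roughly 10 meter limit quoted above. The same relation, applied to a measured shadow displacement, gives an estimate of cloud-top height.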

7. Conclusion

Automated cloud detection in space-borne visible, near-infrared, and short-wave infrared imagery of ice sheets has proven to be a challenging problem for many years. We believe that the shadow detection and matching approach of our CDSM algorithm is a novel means that uses more information within the image (i.e., darker regions as possible cloud shadows) and about the image (i.e., metadata of solar azimuth) to provide an improved solution to this problem.

Operational adoption of the CDSM/ANTD approach is more likely given the much-reduced running times on sub-sampled images with little impact on cloud-detection performance. There appear to be further computational savings possible with greater sub-sampling of the raw imagery; however, this aspect has not been fully explored in this paper.

Further reductions of computational requirements could be achieved if some other means is found to determine the appropriate NDSI threshold. Our data set was not robust enough to indicate the specific conditions that might independently determine this threshold; however, it is possible that the environmental history of a site is so important as to make independent methods untrustworthy.

Finally, an additional advantage of shadow matching is that it is easy to calculate the elevation of each cloud top. We have not included these calculations in our results, but they might prove useful for some scientific studies.

References

Arvidson, T., Gasch, J., & Goward, S. N. (2001). Landsat-7’s long-term acquisition plan – an innovative approach to building a global imagery archive. Remote Sensing of Environment, 78 (1–2), 13–26.

Castleman, K. R. (1996). Digital image processing. Prentice Hall, NJ.

Dozier, J. (1989). Remote sensing of snow in visible and near-infrared wavelengths. In G. Asrar (Ed.), Theory and applications of optical remote sensing (pp. 527–547). New York: Wiley.

Hall, D. K., Riggs, G. A., & Salomonson, V. V. (1995). Development of methods for mapping global snow cover using moderate resolution imaging spectroradiometer data. Remote Sensing of Environment, 54, 127–140.

Irish, R. (2000). Landsat-7 automatic cloud cover assessment algorithms for multispectral, hyperspectral, and ultraspectral imagery. SPIE, 4049, 348–355.

Pavlidis, T. (1982). Algorithms for graphics and image processing. Computer Science Press, MD.

RSINC (Research Systems), www.rsinc.com.