Reference Data Collection & Accuracy Assessment
Dr. Russ Congalton & Kamini Yadav
GFSAD30 Meeting, Wisconsin, 14th-16th July, 2015



Outline

• Reference Data: Africa, Australia, North America, Europe & South America
• VHRI data collection
• RHSeg Work
• eCognition/R Program
• Accuracy Assessment (Australia)
• Future work


Discussion with each Group on Reference Data

• Africa (Jun): Received reference data form. Had first conference call on April 16, 2015.
• Australia (Pardha): Received reference data form. Conference call on May 28, 2015; follow-up call on June 30, 2015.
• North America (Richard/Teki): Received reference data form. Conference call on June 3, 2015.
• Europe (Aparna/Mutlu): Received reference data form. Unable to schedule the call.
• South America (Ying/Chandra): Not approached/received.


Outcome of Reference Data Calls

• These calls proved excellent for coordinating efforts between the mapping team and the accuracy team to determine mutually integrated needs and analysis.
• Resulted in detailed knowledge about each mapping team and how they are generating their product. This will help us perform validation for the respective products.
• Determined action items specific to each team.
• Compiled all the possible sources of reference data information.
• Discussed how to collect or build the necessary independent reference data for each continent.


Ground Data Sources

• Ground data (collected by our team, including Murali)
  Received shapefiles for Ethiopia, Tanzania, Malawi, Rwanda, Burundi, and India (South India, Rajasthan)
• Ground data sourced from other projects (e.g., CORINE)
  Curt Reynolds's field data from USDA/FAS
  2015 corn map for South Africa and 2014 cotton/rice map for Australia
  GDA Corp
• Ground data from global collection (e.g., Mutlu's)
  To be used purely for accuracy assessment
• Ground data from literature
  Authors contacted to obtain the reference data they used or the map they produced for their project (if willing to share)
• Reference data from other work (e.g., USDA CDL, Agriculture and Agri-Food Canada)

Diagram labels: Statistical Data (FAOSTAT), Ground Data, Existing Cropland Data, Very High Resolution Imagery


Africa Reference data


Source: With the help of Jim Tilton, received VHRI from NGA


Australia Reference Data

• The statistical data from FAOSTAT are used only to evaluate the sub-pixel/actual crop area for the year 2014
• Before using the ground data, we need to define the crosswalk used to compare its classification scheme with that of the classified map
• The Dynamic Land cover map of Australia (DLDC) is used for crop area comparison only. It is available for the years 2000-2008
• Create random samples from the DLDC map for accuracy assessment
• Decide the strategy and perform accuracy assessment for the GCE v2.0 250m map
• Select and perform reference classification of VHRI to generate ground data

Agriculture Ecological Zones: Zone 3-9


North America Reference Data

Type | Number of Pixels (250m)
Total Crop (Original) | 26,444,974
Total Crop (Buffered) | 6,415,797
Total Unused Pixels | 20,029,177

The CDL 56m cropland layer from USDA-NASS has been resampled to 250m and used to build pure pixels (6,415,797 of the 26,444,974 total crop pixels). The remaining 20,029,177 unused pixels are candidates from which homogeneous samples can be chosen to create a validation dataset.
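The pixel counts and the idea of a "pure" coarse pixel can be sketched in a few lines of Python. This is only an illustration: the slide does not state the purity criterion, so the `min_fraction` threshold, the integer block `factor`, and the function name `pure_crop_blocks` are assumptions, and a real 56m-to-250m resampling would not align on whole blocks.

```python
# Pixel counts taken from the slide.
TOTAL_CROP = 26_444_974
BUFFERED_CROP = 6_415_797
unused = TOTAL_CROP - BUFFERED_CROP  # crop pixels not retained as pure pixels


def pure_crop_blocks(mask, factor, min_fraction=1.0):
    """Aggregate a fine crop mask (1 = crop, 0 = other) into coarse cells.

    A coarse cell is True only when at least `min_fraction` of the
    factor x factor fine cells inside it are crop (assumed criterion).
    """
    rows, cols = len(mask), len(mask[0])
    coarse = []
    for r0 in range(0, rows - factor + 1, factor):
        coarse_row = []
        for c0 in range(0, cols - factor + 1, factor):
            crop_cells = sum(
                mask[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            )
            coarse_row.append(crop_cells / (factor * factor) >= min_fraction)
        coarse.append(coarse_row)
    return coarse


# Toy 4x4 fine mask aggregated into 2x2 coarse cells.
fine = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
coarse = pure_crop_blocks(fine, factor=2)
```

Lowering `min_fraction` would keep more coarse pixels at the cost of mixed content, which is the trade-off behind the large number of unused pixels.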

NLCD 30m cropland layer can be a possible source to create a crop/No-crop mask

Also, could the ground data used for NLCD validation and accuracy assessment be a possible source?


Canada Reference Data

Generate reference data from the AAFC cropland layer as homogeneous samples for 250m mapping (in the same way as for the USDA-NASS CDL layer)

Based on the field size, decide the homogeneity criteria for labeling a 3x3 block of 30m pixels or a single 250m pixel

Process VHRI in some areas and use it to test the accuracy of the cropland layer

Agriculture Ecological Zones: Zone 3-6, 12 & 13


AAFC Cropland Map Labels

Label | Code | Definition
Agriculture (undifferentiated) | 120 | Agricultural land, including annual and perennial crops; excludes grassland. This class is mapped only if the distinction of sub-agricultural covers (classes 132-199) is not possible.
Pasture / Forages | 122 | Periodically cultivated. Includes tame grasses and other perennial crops such as alfalfa and clover grown alone or as mixtures for hay, pasture or seed.
Too Wet to be Seeded | 130 | Agricultural fields that are normally seeded that remain unseeded due to excess spring moisture.
Fallow | 131 | Plowed and harrowed fields that are left unsown for the growing season.
Cereals | 132 | This class is mapped only if the distinction of sub-cereal covers (classes 133-146) is not possible.
Barley | 133 |
Other Grains | 134 |
Millet | 135 |
Oats | 136 |
Rye | 137 |
Spelt | 138 |
Triticale | 139 |
Wheat | 140 | This sub-cereal class is mapped only if the distinction of sub-wheat covers (classes 145-146) is not possible.
Switch grass | 141 |
Winter Wheat | 145 |
Spring Wheat | 146 |
Corn | 147 |
Tobacco | 148 |
Ginseng | 149 |
Oilseeds | 150 | This class is mapped only if the distinction of sub-oilseed covers (classes 151-158) is not possible.
Borage | 151 |
Camelina | 152 |
Canola / Rapeseed | 153 |
Flaxseed | 154 |
Mustard | 155 |
Safflower | 156 |
Sunflower | 157 |
Soybeans | 158 |
Pulses | 160 | This class is mapped only if the distinction of sub-pulse covers (classes 162-174) is not possible.
Peas | 162 |
Beans | 167 |
Lentils | 174 |
Vegetables | 175 | This class is mapped only if the distinction of sub-vegetable covers (classes 176-179) is not possible.
Tomatoes | 176 |
Potatoes | 177 |
Sugar beets | 178 |
Other Vegetables | 179 |
Fruits | 180 | This class is mapped only if the distinction of sub-fruit covers (classes 181-190) is not possible.
Berries | 181 |
Cranberry | 183 |
Orchards | 188 |
Other Fruits | 189 |
Vineyards | 190 |
Hops | 191 |
Sod | 192 |
Herbs | 193 |
Nursery | 194 |
Buckwheat | 195 |
Canary Seed | 196 |
Hemp | 197 |
Vetch | 198 |
Other Crops | 199 |

Issue: crosswalking from the AAFC labels to the map classification. Are there standard criteria?

AAFC: Agriculture & Agri-Food Canada
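One way to make the hierarchy in the AAFC definitions explicit before any crosswalk is to encode the parent class of each code range. The sketch below uses only the ranges stated in the definitions (for example, classes 133-146 roll up to Cereals 132); the names `AAFC_PARENT_RANGES`, `aafc_parent`, and `aafc_top_level` are hypothetical, and the level at which labels should finally be compared with the map classes remains the open question raised on the slide.

```python
# (low code, high code, parent code), taken from the class definitions.
# The narrower sub-wheat range is listed before the sub-cereal range
# so that it matches first.
AAFC_PARENT_RANGES = [
    (145, 146, 140),  # sub-wheat covers -> Wheat
    (133, 146, 132),  # sub-cereal covers -> Cereals
    (151, 158, 150),  # sub-oilseed covers -> Oilseeds
    (162, 174, 160),  # sub-pulse covers -> Pulses
    (176, 179, 175),  # sub-vegetable covers -> Vegetables
    (181, 190, 180),  # sub-fruit covers -> Fruits
]


def aafc_parent(code):
    """Return the immediate parent class, or the code itself if none is stated."""
    for low, high, parent in AAFC_PARENT_RANGES:
        if low <= code <= high:
            return parent
    return code


def aafc_top_level(code):
    """Follow parents until the code no longer changes."""
    parent = aafc_parent(code)
    while parent != code:
        code, parent = parent, aafc_parent(parent)
    return code


winter_wheat_parent = aafc_parent(145)   # Wheat
winter_wheat_top = aafc_top_level(145)   # Cereals
canola_top = aafc_top_level(153)         # Oilseeds
corn_top = aafc_top_level(147)           # Corn has no stated parent group
```

Grouping at the parent level loses crop-type detail but reduces the number of label pairs that need an agreed crosswalk rule.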


Reference Data from Literature

1. Crop area mapping in West Africa using landscape stratification of MODIS time series and comparison with existing global land products
   Journal: International Journal of Applied Earth Observation and Geoinformation, Volume 14, Issue 1, February 2012, Pages 83–93
   Contact: [email protected], [email protected]
   Data: A ground data set collected during the 2009 and 2010 cropping seasons (744 GPS waypoints at the validation sites)

2. Generating plausible crop distribution maps for Sub-Saharan Africa using a spatially disaggregated data fusion and optimization approach
   Journal: Agricultural Systems, Volume 99, Issues 2–3, February 2009, Pages 126–140
   Contact: [email protected]
   Data: Crop distribution map of sub-Saharan Africa

3. Generating global crop distribution maps: From census to grid
   Journal: Agricultural Systems, Volume 127, May 2014, Pages 53–60
   Contact: [email protected]
   Data: Global Rainfed/Irrigated crop map

4. Disaggregating and mapping crop statistics using hyper temporal remote sensing
   Journal: International Journal of Applied Earth Observation and Geoinformation, Volume 12, Issue 1, February 2010, Pages 36–46
   Contact / Data: [email protected], sunflower, Barley crop maps of southern Spain

5. Global rain-fed, irrigated, and paddy croplands (GRIPC)
   Authors: J. Meghan Salmon, Mark A. Friedl, Steve Frolking, Dominik Wisser, Ellen M. Douglas
   Source: https://dl.dropboxusercontent.com/u/12683052/GRIPCmap.zip
   Data: Irrigated/Rainfed Map

6. Finer resolution observation and monitoring of global land cover: first mapping results with Landsat TM and ETM+ data
   Journal: International Journal of Remote Sensing, Volume 34, Issue 7, 2013
   Contact: [email protected]
   Data: Landsat/MODIS Mapping; 91,000 training samples; 38,000 test samples


Reference Data from Literature (continued)

7. Data Mining, A Promising Tool for Large-Area Cropland Mapping
   Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 6, no. 5, October 2013
   Contact: [email protected]
   Data: The field surveys were conducted in Mali during the 2009 and 2010 crop seasons (980 waypoints)

8. GlobeLand30 (http://www.globallandcover.com/GLC30Download/index.aspx)
   Journal: ISPRS Journal of Photogrammetry and Remote Sensing 103 (2015) 7–27
   Contact: [email protected]
   Data: 154,587 pixel samples (year 2010)

9. Mapping and discrimination of soybean and corn crops using spectrotemporal profiles of vegetation indices
   Journal: International Journal of Remote Sensing, 2015, Vol. 36, No. 7, 1809–1824
   Contact: [email protected]
   Data: Field data from 19 different croplands (state of Paraná, located in the South of Brazil, between)

10. Improving Crop Area Estimation in West Africa Using Multiresolution Satellite Data
    Journal: Proceedings of Global Geospatial Conference 2013
    Contact: [email protected]
    Data: Field survey conducted between May and July 2012

11. Impact of feature selection on the accuracy and spatial uncertainty of per-field crop classification using Support Vector Machines
    Journal: ISPRS Journal of Photogrammetry and Remote Sensing 85 (2013) 102–119
    Contact: [email protected]
    Data: Extensive field survey conducted in four test sites in Middle Asia

12. MODIS Collection 5 global land cover: Algorithm refinements and characterization of new datasets
    Journal: Remote Sensing of Environment 114 (2010) 168–182
    Contact: [email protected]
    Data: Training sites globally

13. Cropland for sub-Saharan Africa: A synergistic approach using five land cover data sets
    Source: Calibrated synergy map for Africa (http://onlinelibrary.wiley.com/doi/10.1029/2010GL046213/abstract)
    Contact: [email protected]
    Data: 2553 samples distributed over Africa


RHSeg Algorithm: Segmentation

• The NTF-to-TIFF conversion done with GDAL does not match the one done with Erdas/ArcMap (e.g., the FCC (GDAL) and NCC (Erdas) satellite images on the left have different extents)
• The output from RHSeg does not overlay on the image in eCognition
• Converting the RHSeg results from raster to vector format is an open issue; segmentation results are represented efficiently in vector format

Figure: RHSeg result overlaid on WorldView-1 imagery (FCC, NCC)


Adding Thematic Layers to RHSeg Result

• A number of parameters need to be appended to the object labels from the RHSeg segmentation
• Jim Tilton has worked on an NDVI layer, which can be computed either before processing with RHSeg (pre-processing) or on the RHSeg output (post-processing)
• Other parameters: Border Index, Homogeneity Contrast, Band Values, Rectangular Fit, etc.

Figure: Example of eCognition segmentation output


Classification with Random Forest R Program

RHSeg segmentation result is used for Random forest classification in R

Either the variables can be used in raster thematic form or attached to the region objects

Allows integration of the RHSeg output into the Random forest classification

Useful variables will be selected based on accuracy metrics predicted in R

eCognition Software

A random selection of the objects created by multi-resolution segmentation is used as training data for the random forest classification


North America Reference Data

Figure: The study area showing WorldView-1 imagery and the Cropland Data Layer (CDL), Yolo, California

CDL class labels (legend): Speltz, Misc. Vegetables & Fruits, Watermelon, Hops, Sod, Switchgrass, Wildflowers, Other Tree Crops, Pistachio, Carrot, Garlic, Cantaloupes, Pumpkin, Broccoli, Caneberries, Cranberries

Crosswalk between CDL labels and classification

Methodology Development


North America Reference Data

Random distribution of samples: vegetation samples are more numerous because a larger proportion of this particular scene is vegetation

The Error Matrix
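For readers unfamiliar with the error matrix, the following is a minimal Python sketch of how one is tallied from paired map and reference labels and how overall, user's, and producer's accuracies follow from it. It assumes rows are map labels and columns are reference labels; the function names and the six toy samples are illustrative and are not data from the study.

```python
from collections import Counter


def error_matrix(map_labels, ref_labels, classes):
    """Tally samples into a matrix with map classes as rows, reference as columns."""
    counts = Counter(zip(map_labels, ref_labels))
    return [[counts[(m, r)] for r in classes] for m in classes]


def accuracy_summary(matrix):
    """Return overall accuracy plus per-class user's and producer's accuracies."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(k)) / total
    users, producers = [], []
    for i in range(k):
        row_total = sum(matrix[i])                       # samples mapped as class i
        col_total = sum(matrix[r][i] for r in range(k))  # samples truly class i
        users.append(matrix[i][i] / row_total if row_total else None)
        producers.append(matrix[i][i] / col_total if col_total else None)
    return overall, users, producers


# Illustrative labels only.
classes = ["crop", "no-crop"]
map_labels = ["crop", "crop", "crop", "crop", "no-crop", "no-crop"]
ref_labels = ["crop", "crop", "crop", "no-crop", "no-crop", "no-crop"]

matrix = error_matrix(map_labels, ref_labels, classes)
overall, users, producers = accuracy_summary(matrix)
```

User's accuracy reflects commission error (how often a mapped class is right), while producer's accuracy reflects omission error (how much of a reference class the map captured).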


Accuracy Assessment of GCE v.2 Australia Map

• 1/3 validation set received: 1,118 ground reference data locations
• The ground data were collected from Zone 3 to Zone 9. Some of the locations are not on the GCE map
• Both scale 1 (90m x 90m) and scale 2 (250m x 250m) labels are part of the ground reference data observations
• Ground data label "Land prepared for season 2": fallow or rainfed cropland?


Crosswalk between Ground and Map Labels

GCE v.2 Classification Scheme | Ground data Classification
1. Rainfed Single Crop, all crops |
2. Rainfed Single Crop, Pastures | Rainfed Grazing; Rainfed Continuous Orchard; Rainfed Single flowers; Rainfed Plantation
3. Irrigated Single Crop, Double Crop, all crops |
4. Irrigated Single Crop Pastures | Irrigated single vegetables
5. Irrigated Continuous, Orchards |
6. Fallow | Land prepared for season 2
Season 2 Crop | Rainfed/Irrigated Single; Land prepared for season 2
 | No-Crop

Final 6 classes to generate the error matrix:
1. Rainfed croplands
2. Rainfed pastures
3. Irrigated croplands
4. Irrigated pastures
5. Cropland, Irrigated, Continuous, Orchard??
6. Fallow

Group11 | Class Name | Category
1 | Alfalfa | Pasture
2 | Barley | Crops
3 | Beans | Crops
4 | Canola | Crops
5 | Lentils | Crops
6 | Lupin | Crops
7 | Oats | Crops
8 | Peas | Crops
9 | Wheat | Crops
10 | Cropland, Irrigated, Continuous, Orchard | Orchard / Continuous crops
11 | Cropland single, sown-pasture | Pasture
12 | Cropland, single, land prepared for S2 | Crops
13 | Crop harvested | Crops
14 | Rainfed vegetables | Crops
15 | Plantation | Crops
16 | Cropland, RF, single, Crop | Crops
20 | RF, Grazing | Pasture
30 | No crop | Non crop


Accuracy Assessment for SMT and ACCA algorithm generated maps

Crop/No-Crop Accuracy

The error matrix for 6 classes after truncating the samples (panels: ACCA, SMT)


More Possible Error Matrices

• 6 Classes + No-Crop: ACCA
• 6 Classes + No-Crop: SMT
• 4 Classes (merge rainfed & irrigated): ACCA
• 4 Classes (merge rainfed & irrigated): SMT

The irrigation map has been used as a mask to label ground samples


90 m Samples Verification over Google Earth

90m samples are not good for the 250m map


250m Samples Verification from Google Earth

May result in spatial autocorrelation

• Some samples are very near the road and need to be placed closer to the center of the field

• Most of the 250m samples are valid, a few of them need to be revised

• Most of the 90m samples are good for 30m Map validation but not for 250m Map


Consider Scale 2: 250m Samples only

The error matrix for 7 classes after removing 90m samples (panels: ACCA, SMT)


Conclusion

• The ground samples with a 250m homogeneous area are mostly good, but not all of the 90m samples are homogeneous over a 250m area
• A number of error matrices can be generated to present the accuracy. The rows and columns can be reduced or expanded in detail, as necessary
• The objective is to generate a valid, balanced, statistically sound error matrix with a proportionally representative number of samples for each class
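Reducing the rows and columns of an error matrix amounts to summing the cells of the classes being merged, for example combining rainfed and irrigated classes. The Python sketch below shows this under stated assumptions: the function name `collapse_matrix`, the class names, and all counts are illustrative and do not come from the Australia assessment.

```python
def collapse_matrix(matrix, classes, merge):
    """Merge classes of a square error matrix.

    `merge` maps an original class name to the merged class name;
    classes missing from `merge` are kept unchanged.
    """
    merged = []
    for name in classes:
        target = merge.get(name, name)
        if target not in merged:
            merged.append(target)
    index = {name: i for i, name in enumerate(merged)}
    out = [[0] * len(merged) for _ in merged]
    for i, row_class in enumerate(classes):
        for j, col_class in enumerate(classes):
            r = index[merge.get(row_class, row_class)]
            c = index[merge.get(col_class, col_class)]
            out[r][c] += matrix[i][j]
    return merged, out


# Illustrative counts only.
classes = ["rainfed crop", "irrigated crop", "no-crop"]
matrix = [
    [10, 3, 1],
    [2, 8, 0],
    [1, 1, 14],
]
merged_classes, merged_matrix = collapse_matrix(
    matrix, classes, {"rainfed crop": "crop", "irrigated crop": "crop"}
)
```

Confusion between the merged classes moves onto the diagonal, so a collapsed matrix generally reports higher accuracy than the detailed one for the same samples.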


REMEMBER

Anyone can generate an error matrix with any data. Just because there is an error matrix does not mean that there is a valid accuracy assessment.

We have already provided you with a number of resources including a full reference data collection document to help you.

Our goal is to work with each team to make sure that you are thinking about all the requirements now so that our accuracy assessments are valid.


Some Key Topics

• Classification Scheme
• Sample Unit
• Sample Size
• Sampling Scheme
• Spatial Autocorrelation


1. Classification Scheme

• Key to any mapping project. Must be done at the beginning of the project. We have done this for our project, BUT…
• Requirements of the classification scheme:
  Meets the user's needs
  Consists of both labels and rules (definitions) that are mutually exclusive, totally exhaustive, and hierarchical
  Includes a minimum mapping unit
• The issue for us is crosswalking all the different classification schemes used for the various reference data sets all of us are using to our map classification scheme. This can introduce serious error!


2. Sample Unit

Must consider positional accuracy and the minimum mapping unit (MMU)

For the Landsat accuracy assessment, we have selected a homogeneous sample unit of 3x3 TM pixels (90m x 90m).

For the MODIS accuracy assessment, we need at least a single homogeneous MODIS pixel (250m x 250m).
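A homogeneity test for a 3x3 sample unit can be sketched as follows in Python. It assumes "homogeneous" means that all nine pixels carry the same label, which is a strict reading that the slides do not spell out; the function name `window_label` and the toy grid are hypothetical.

```python
def window_label(grid, row, col, size=3):
    """Return the single label of the size x size window centered on (row, col).

    Returns None when the window falls off the grid or contains
    more than one label (i.e. it is not homogeneous).
    """
    half = size // 2
    if row - half < 0 or col - half < 0:
        return None
    if row + half >= len(grid) or col + half >= len(grid[0]):
        return None
    labels = {
        grid[r][c]
        for r in range(row - half, row + half + 1)
        for c in range(col - half, col + half + 1)
    }
    return labels.pop() if len(labels) == 1 else None


# Toy 4x4 grid of 30m pixel labels.
landsat = [
    ["crop", "crop", "crop", "water"],
    ["crop", "crop", "crop", "crop"],
    ["crop", "crop", "crop", "crop"],
    ["fallow", "crop", "crop", "crop"],
]

center_ok = window_label(landsat, 1, 1)     # all nine pixels are crop
touches_water = window_label(landsat, 1, 2)  # window includes the water pixel
on_edge = window_label(landsat, 0, 0)        # window extends past the grid
```

A strict all-pixels rule also gives some tolerance to positional error, since a small shift of the sample location still lands on the same class.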


3. Sample Size

• A useful rule of thumb: 50 sample units per map class
• Need to balance the proportion of samples in each map class with ensuring that enough samples are taken per map class to know the accuracy of each map class
• Need enough samples to ensure good distribution across the map (avoid spatial autocorrelation)
• Samples MUST BE INDEPENDENT of training data
  We all must keep this in mind. This is the #1 reason for our coordination calls with all the mapping teams. Need Justin's help here.
• If we assess the map of the continent, the results are for the entire continent. If we need an eco-region or country estimate, we need to do the assessment at that level.
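The balance between proportional representation and a per-class minimum can be illustrated with a small Python allocation sketch. The rule used here (proportional to class area, raised to at least 50 units per class) is one possible reading of the rule of thumb, not a procedure stated in the slides; the function name and the class areas are made up for illustration.

```python
import math


def allocate_samples(class_areas, total, minimum=50):
    """Allocate sample units per map class.

    Each class gets a share of `total` proportional to its area,
    raised to `minimum` when the proportional share is smaller.
    The allocated sum can therefore exceed `total`.
    """
    area_sum = sum(class_areas.values())
    return {
        name: max(minimum, math.ceil(total * area / area_sum))
        for name, area in class_areas.items()
    }


# Illustrative class areas in arbitrary units.
areas = {"cropland": 700, "pasture": 250, "fallow": 50}
allocation = allocate_samples(areas, total=500)
```

In this toy case the rare class is lifted from a proportional share of 25 to the minimum of 50, so its accuracy can still be estimated.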


4. Sampling Scheme

Figure: 1:250,000 map sheet with the classes Water, Sand, Commercial, Residential, Industrial


6. Spatial Autocorrelation

Spatial autocorrelation occurs when the presence, absence, or degree of a certain characteristic affects the presence, absence, or degree of the same characteristic in neighboring units (Cliff and Ord 1973)

Samples must be adequately spaced apart or they will be spatially autocorrelated. This is true whether we are collecting reference data on the ground or from very high resolution imagery.
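One simple way to enforce spacing is to thin candidate samples greedily so that no two retained samples are closer than a chosen distance. The Python sketch below assumes projected coordinates in metres and a hypothetical 500m spacing; the slides do not give a spacing value, and the function name `thin_samples` is illustrative.

```python
import math


def thin_samples(points, min_distance):
    """Keep points in order, dropping any closer than min_distance to a kept one."""
    kept = []
    for x, y in points:
        if all(math.hypot(x - kx, y - ky) >= min_distance for kx, ky in kept):
            kept.append((x, y))
    return kept


# Illustrative candidate sample locations (metres).
candidates = [(0, 0), (100, 0), (600, 0), (650, 50), (1200, 0)]
spaced = thin_samples(candidates, min_distance=500)
```

The result depends on the order of the candidates, so shuffling them first avoids favouring whichever samples happen to be listed first.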


Future Work

• Implement object statistics on RHSeg results
• Resolve the issue of different extents of satellite imagery
• Perform random forest on the RHSeg result in R
• Generate pure reference samples from the AAFC cropland layer for Canada
• Generate reference samples from CDL for North America
• Continue our coordination calls with each mapping team
