Joint Research Centre - GSARSgsars.org/wp-content/uploads/2016/10/Gallego-Rome-20161024-MS… ·...

J. Gallego,

Rome, October 2016

Use of technology to develop sampling frames.

Which technologies are we talking about?

• Geographic Information Systems

• Global Navigation Satellite Systems (GNSS), e.g. GPS

• Remote sensing (images from satellites, aircraft, etc.)

Geographic Information Systems

Software and the information managed by it

• Software:

– Commercial

– Free

• Information:

– Polygons, lines, points: geometrically more precise.

– Raster (array of pixels): often more efficient for complex computations.

• Cartographic projections:

– Often automatically managed.

– But be careful with on-the-fly projection.
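
A minimal sketch of why projection handling matters, assuming a spherical Earth of radius 6371 km (an assumption, not from the slides). A 1° x 1° cell covers roughly half the ground area at 60° latitude that it covers at the equator, so areas computed directly on unprojected coordinates are not comparable. The function name `cell_area_km2` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth, mean radius


def cell_area_km2(lat_south, lat_north, lon_west, lon_east):
    """Ground area of a latitude/longitude cell on a sphere (degrees in, km2 out)."""
    band = math.sin(math.radians(lat_north)) - math.sin(math.radians(lat_south))
    width = math.radians(lon_east - lon_west)
    return EARTH_RADIUS_KM ** 2 * band * width


# The same 1 x 1 degree cell at two latitudes
equator = cell_area_km2(0, 1, 0, 1)
north = cell_area_km2(60, 61, 0, 1)
print(round(equator), round(north), round(north / equator, 2))
```

For real work an equal-area projection handled by the GIS is the natural choice; the sketch only shows the size of the effect.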

Example of projection mismanagement

Georeferencing elements in a list frame

• Plots, parcels, fields:

– OK if existing register

– Ad-hoc: too expensive.

• Farms:

– Coordinates of the farmer’s dwelling?

– Stocking place?

– All plots?

• Small administrative units (Enumeration Areas?):

– It is worth making an effort to improve a layer of EAs.

Global Navigation Satellite Systems (GNSS-GPS)

• Limited use to build the sampling frame.

• More important to support field work.

• Approaching a sampled point with known coordinates

– But for the final decision on the location, the graphic document has priority.

• Measuring the area of a plot.

• The availability of cheap and reasonably accurate GPS has changed the cost-efficiency comparison in favor of point frames.

• Quality control: ensuring that the surveyor has reached the sampled point

• Sampling points for crop cutting experiments
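
The quality-control use can be sketched as a distance check between the sampled point and the GPS fix recorded by the surveyor. This is only an illustration: the spherical-Earth haversine formula, the 30 m tolerance and the function names are assumptions, not part of the slides.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumption: spherical Earth, mean radius


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres between two lat/lon positions."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def reached_point(sample, fix, tolerance_m=30.0):
    """True if the recorded fix lies within tolerance_m of the sampled point.

    sample and fix are (lat, lon) tuples; the tolerance is a hypothetical value.
    """
    return distance_m(sample[0], sample[1], fix[0], fix[1]) <= tolerance_m


print(reached_point((41.9, 12.5), (41.9001, 12.5)))  # fix about 11 m away
print(reached_point((41.9, 12.5), (41.901, 12.5)))   # fix about 111 m away
```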

Satellite images

• Mainly optical images. Main characteristics:

– Spatial resolution (0.3 m to 5 km)

– Spectral resolution (how many different wavelengths?)

– Swath (how many passes are necessary to cover a study area?)

– Price

• Public access image layers (Google Earth, Bing):

– Useful when the date and spectrum are not critical.

– Some problems with the geometric correction.

• Google Earth Engine (not to be confused with Google Earth):

– Tools for image analysis in the cloud.

Area Sampling Frames (AFS)

• The units of an AFS have a geographic nature and are geo-referenced:

• Points

• Transects (more for environmental surveys)

• Segments: patches of terrain

• Clusters of points (= incomplete observation of a segment)

Area Frames of segments with physical boundaries.

• Building the sampling frame involves heavy photo-interpretation work if the field size is not very large.

AFS defined by a regular grid
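
A frame of this kind can be sketched as square segments placed on a regular grid in a projected coordinate system (metres). The step, segment size and bounding box below are hypothetical values chosen only for illustration.

```python
def grid_segments(x_min, y_min, x_max, y_max, step_m, size_m):
    """Square segments of side size_m with lower-left corners on a regular grid.

    Returns (x_left, y_bottom, x_right, y_top) tuples; segments that would
    extend beyond the bounding box are skipped.
    """
    segments = []
    y = y_min
    while y + size_m <= y_max:
        x = x_min
        while x + size_m <= x_max:
            segments.append((x, y, x + size_m, y + size_m))
            x += step_m
        y += step_m
    return segments


# Hypothetical 10 km x 10 km region, 1 km segments every 5 km
segs = grid_segments(0, 0, 10_000, 10_000, step_m=5_000, size_m=1_000)
print(len(segs), segs[0])
```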

Graphic support to survey a square segment

Stratification in an AFS

• Typical definition of strata:

– Segments with > 60% arable land

– Segments where certain types of crops are dominant (permanent crops, irrigated, paddy rice, etc.)

– Non-agricultural (or non-cropland) areas.

• Reminder: an imperfect stratification reduces the efficiency (higher variance), but does not introduce bias, unless:

– The “non-agricultural” stratum is excluded, but it still contains agriculture.

– You redefine the strata a posteriori (e.g. segment i was in stratum h1, but after the field survey it should have been in stratum h2, therefore I relabel it as stratum h2).

– Etc.

Stratification in an AFS

• Ad-hoc photo-interpretation if nothing else is available:

– Heavy operation unless you do a very coarse photo-interpretation

• Exploiting existing products:

– Polygon land cover maps (Corine Land Cover, Africover, …)

– Detailed administrative layers (cadaster, register)

– Image classification products for large regions (or global). Often too coarse resolution.

Stratification of square segments from a register

• Compute the cropland area in each square or the area of the main types of crops (GIS tool).

• Apply your stratification rule.
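
The two steps above can be sketched as a rule applied to the areas already computed per square. Only the 60% arable threshold comes from the slides; the other conditions, the stratum labels and the function name are illustrative assumptions.

```python
def assign_stratum(square_area, arable_area, permanent_area):
    """Apply an illustrative stratification rule to one square segment.

    Areas are in the same unit (e.g. hectares), as computed by a GIS overlay
    of the square grid with the register.
    """
    arable_share = arable_area / square_area
    permanent_share = permanent_area / square_area
    if arable_share > 0.60:                 # threshold quoted in the slides
        return "arable"
    if permanent_share > arable_share:      # assumption: dominance rule
        return "permanent crops"
    if arable_share + permanent_share == 0.0:
        return "non-agricultural"
    return "mixed"                          # assumption: residual stratum


# Hypothetical 9 ha squares (300 m x 300 m)
print(assign_stratum(9.0, 6.0, 0.0))
print(assign_stratum(9.0, 1.0, 3.0))
print(assign_stratum(9.0, 0.0, 0.0))
```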

Global cropland maps

• A possible source for stratification if nothing else is available.

• Coarse resolution (~ 800 m in this example)

Regional/national land cover maps

• Not perfect, but usually a good compromise

• Cost efficient?

– Doubtful if it is produced only for stratification.

– Better if multipurpose

Classified image from a previous year

• Not often available with a good quality.

Crowdsourcing with public images

• Tools available

• Cross-comparison to assess the reliability/consistency of volunteers

• Heterogeneous image dates

• Stimulating participants: rewards (e.g. a smartphone) for the best scores (number of images/reliability)

• Still heavy for a full stratification.

• Possible alternative: two-phase sampling

– Stratification of a large sample

– Subsample for field survey

Area frames of unclustered points

• Two-phase sampling

– Large sample photo-interpreted as strata

– Sub-sample for field survey

• Example: the Eurostat LUCAS survey

• Crowdsourcing can be explored
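
The two-phase scheme can be sketched with the standard double-sampling-for-stratification estimator. Each stratum mean from the field sub-sample is weighted by that stratum's share of the photo-interpreted first-phase sample. The data and names below are hypothetical, and nothing here is claimed to reproduce the LUCAS estimators.

```python
from collections import Counter
from statistics import mean


def two_phase_mean(first_phase_strata, field_sample):
    """Two-phase estimate of a mean (e.g. the proportion of a crop).

    first_phase_strata: stratum label of every photo-interpreted point.
    field_sample: (stratum, observed value) pairs for the field sub-sample.
    """
    counts = Counter(first_phase_strata)
    n_first = len(first_phase_strata)
    values = {}
    for stratum, y in field_sample:
        values.setdefault(stratum, []).append(y)
    missing = set(counts) - set(values)
    if missing:
        raise ValueError(f"no field observations in strata: {sorted(missing)}")
    # Stratum weight = share of the first-phase sample falling in the stratum
    return sum(counts[h] / n_first * mean(values[h]) for h in counts)


# Hypothetical data: 10 photo-interpreted points, 5 visited in the field
first = ["crop"] * 6 + ["other"] * 4
field = [("crop", 1), ("crop", 1), ("crop", 0), ("other", 0), ("other", 0)]
estimate = two_phase_mean(first, field)
print(estimate)
```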

Sampling regular clusters of points

Two-stage area frame sampling

It can be seen as an incomplete observation of the segment

Sampling transects in stripe-shaped landscapes

Sampling points

Using a transect to estimate the % of each crop in a “super-plot”

Sampling stripes

• Particular case of segments with a long and thin shape

• Adapted to low altitude flights

• Sometimes used to estimate nomadic livestock

Sampling farms in an area frame

• Traditional approaches:

– Open segment

– Closed segment

– Weighted segment

• Less traditional (variants of the weighted segment):

– Sampling through unclustered points

– Sampling farms through clustered points

Open segment

• A farm is selected if its headquarters are inside the segment.

Closed segment

• Only the parts of plots inside the segment (tracts) are considered.

• Often for direct estimation (without interviewing the farmer)

Weighted segment

• The additive data coming from the interview with the farmer are attributed to the tract by applying a coefficient (tract area/farm area).

• The tract area can be computed by a GIS if it does not include full plots.
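
A minimal sketch of the weighted-segment attribution, assuming the tract boundary is available as a polygon in projected coordinates (metres). The shoelace formula stands in for the GIS area computation, and all numbers and names are hypothetical.

```python
def polygon_area(vertices):
    """Shoelace formula: area of a simple polygon given as (x, y) vertices."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def tract_value(farm_value, tract_area, farm_area):
    """Share of an additive farm variable attributed to the tract."""
    return farm_value * tract_area / farm_area


# Hypothetical tract: 200 m x 100 m rectangle inside the segment (2 ha)
tract = polygon_area([(0, 0), (200, 0), (200, 100), (0, 100)])
# Hypothetical farm: 10 ha (100 000 m2) producing 50 t according to the interview
print(tract, tract_value(50.0, tract, 100_000.0))
```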

Sampling farms through points

• Estimators similar to the weighted segment, but the tract is not needed.

• If the points are sampled inside a segment, the tract area disappears from the formulas with the Horvitz-Thompson estimator.
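
A sketch of an estimator of this kind for the simplest case, assuming n points drawn independently and uniformly in a region of area D (an assumption, not stated in the slides). A point then selects a farm with probability proportional to the farm area, and the total is estimated as (D/n) times the sum of farm value over farm area; only the farm area is needed, not the tract. Under that assumption this is the with-replacement (Hansen-Hurwitz) form of the expansion. All names and figures are hypothetical.

```python
def total_from_points(region_area, point_farms):
    """Estimate a regional total from farms selected through sampled points.

    point_farms holds one entry per sampled point: None if the point falls
    on no farm, otherwise (farm_total, farm_area) from the interview.
    region_area and farm_area must use the same unit.
    """
    n = len(point_farms)
    s = 0.0
    for hit in point_farms:
        if hit is not None:
            farm_total, farm_area = hit
            s += farm_total / farm_area  # value per unit of farm area
    return region_area * s / n


# Hypothetical region of 1000 ha and 4 points; one farm is hit twice
hits = [(20.0, 10.0), None, (30.0, 60.0), (20.0, 10.0)]
print(total_from_points(1000.0, hits))
```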

Sampling enumeration areas (EA) with a probability proportional to the geographical area.

• Random sample of points and selection of the EAs that contain them

One-dimensional systematic sampling of enumeration areas (EA), ordered in zig-zag

Computing the partial area sums from 1 to k.

Selecting one EA every M km2

Very limited improvement compared to random sampling.
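
The steps above can be sketched as follows, assuming the EAs are already listed in zig-zag order with their areas in km2. Function and parameter names are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def systematic_pps(areas, interval, start=None, rng=random):
    """Select one EA every `interval` km2 along the cumulated area.

    areas: EA areas in the chosen (e.g. zig-zag) order.
    Returns the indices of the selected EAs; an EA larger than the interval
    can be selected more than once.
    """
    cumulative = list(accumulate(areas))  # partial area sums from 1 to k
    if start is None:
        start = rng.uniform(0, interval)  # random start in the first interval
    picks = []
    position = start
    while position < cumulative[-1]:
        # Index of the EA whose cumulated-area range contains `position`
        picks.append(bisect_right(cumulative, position))
        position += interval
    return picks


# Hypothetical areas (km2); a fixed start makes the example reproducible
print(systematic_pps([10, 30, 20, 40], interval=50, start=20))
```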

Two-dimensional systematic sampling of enumeration areas (EA)

More substantial improvement if the spatial correlation decreases with the distance.

Two-dimensional systematic sampling of enumeration areas (EA) with probability proportional to arable area

• Sample of points

• Overlay on:

– Land cover map

– Images (point photo-interpretation): usually more accurate

• Keeping only EAs corresponding to points on arable land

Using Enumeration areas as PSU

Sampling satellite images

• In the 70’s the USDA started by cutting Landsat MSS images into pieces of 6x8 miles.

– Reason: classifying a full image needed a “super-computer”.

• In the 80’s cutting images into pieces became meaningless: a mini-computer or a workstation could classify a Landsat TM image in a reasonable time with the same accuracy as classifying a piece.

• Late 80’s: sampling SPOT images made sense because a full coverage of a large region was too expensive (EU-MARS “rapid estimates”).

• Sampling errors were OK, but non-sampling errors and subjectivity were too high because the system was based on direct pixel counting on classified images.

Sampling satellite images

• In the 90’s several projects sampled full or quarter images for global forest (and change) area estimates.

– A lot of visual photo-interpretation. The cost of a sampling unit depended very much on the size of the unit.

Sampling satellite images

• 2000’s: Very high resolution images. A full coverage is very expensive.

• Sampling makes sense but the cost of images and processing should go down to become efficient:

– Max 600-800 $ per image for crop area estimation in the EU.

– Assuming the identification accuracy is comparable to a field survey

Some sources of bias

• Area frames are usually better protected against non-sampling errors (bias) than list frames, but bias can be introduced in a number of cases.

• Example 1: “extended square segment”

– If a plot is partly in the segment, the whole plot is considered.

Stratification of square segments by conversion of a register to raster

• We have a detailed polygon layer from a previous year

• We want a stratification of a square grid of 300 m.

• (Wrong) solution: we ask our GIS to convert the polygon layer to a 300 m raster

• And we exclude the “non-agricultural” areas.

• The bias can be around 20-30%
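
A toy illustration of the mechanism, assuming the GIS labels each 300 m cell with its dominant class (an assumption; other assignment rules exist, such as the class at the cell centre). Cropland sitting in cells labelled non-agricultural is lost when that stratum is excluded. The cell fractions are invented for illustration and are not data from the slides.

```python
def excluded_cropland_share(crop_fractions):
    """Share of the true cropland lost when non-agricultural cells are excluded.

    crop_fractions: true cropland share (0..1) of each grid cell.
    A cell is labelled agricultural only if cropland is dominant (> 0.5).
    """
    total = sum(crop_fractions)
    lost = sum(f for f in crop_fractions if f <= 0.5)
    return lost / total


# Hypothetical cells: three mostly cropland, three mostly something else
cells = [0.9, 0.8, 0.7, 0.4, 0.2, 0.0]
print(round(excluded_cropland_share(cells), 2))
```

Computing the cropland area inside each square from the polygon layer, rather than from the converted raster, avoids dropping these minority shares.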
