Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Transcript of Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Page 1: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Museum of Vertebrate Zoology (UC Berkeley), VertNet, & Tulane University

Georeferencing Introduction: Collaboration to Automation

Page 2: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

1. Georeferencing

2. Collaborations

3. Automation

Page 3: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

What is a georeference?

A numerical description in a coordinate system of a place.

Page 4: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

What we have: Localities we can read

ID  Species         Locality
1   Lynx rufus      Dawson Rd. N Whitehorse
2   Pudu puda       cerca de Valdivia
3   Canis lupus     20 mi NW Duluth
4   Felis concolor  Pichi Trafúl
5   Lama alpaca     near Cuzco
6   Panthera leo    San Diego Zoo
7   Sorex lyelli    Lyell Canyon, Yosemite
8   Orcinus orca    1 mi W San Juan Island
9   Ursus arctos    Bear Flat, Haines Junction

Page 5: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

What we want: Localities we can map

Page 6: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Darwin Core

•Metadata standards initially built off of Dublin Core Metadata Standards

•Collections of any kind of biological objects or data.

•Terminology associated with biological collection data.

•Striving for compatibility with other biodiversity-related standards.

•Facilitating the addition of components and attributes of biological data.

Page 7: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Darwin Core Location Terms

higherGeography

waterbody, island, islandGroup

continent, country, countryCode, stateProvince, county, municipality

locality

minimumElevationInMeters, maximumElevationInMeters, minimumDepthInMeters, maximumDepthInMeters

Page 8: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Darwin Core Georeference Terms

decimalLatitude, decimalLongitude

geodeticDatum

coordinateUncertaintyInMeters

coordinatePrecision

pointRadiusSpatialFit

footprintWKT, footprintSRS, footprintSpatialFit

georeferencedBy, georeferenceProtocol

georeferenceSources

georeferenceVerificationStatus

georeferenceRemarks

Page 9: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

What is a georeference?

A numerical description of a place that can be mapped.

Page 10: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

“Davis, Yolo County, California”

“point method”

Coordinates: 38.5463 -121.7425

Horizontal Geodetic Datum: NAD27

Page 11: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Data Quality

•Data have the potential to be used in ways unforeseen when collected.

•The value of the data is directly related to the fitness for a variety of uses.

•“As data become more accessible many more uses become apparent” – Chapman 2005

•The MaNIS/HerpNET/ORNIS guidelines follow best practices (Chapman and Wieczorek 2006) to enhance data quality and value.

Page 12: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

What is an acceptable georeference?

A numerical description of a place that can be mapped

and that describes the spatial extent of a locality

and its associated uncertainties.

Page 13: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

“Davis, Yolo County, California”

“bounding-box method”

Coordinates: 38.5486 -121.7542; 38.545 -121.7394

Horizontal Geodetic Datum: NAD27
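A bounding box can be reduced to a center point and a radius, which is the link between this method and the point-radius method. The sketch below is one way to do that, not a procedure taken from the slides: it takes the center as the midpoint of the two corners and the radius as the great-circle distance from the center to the farthest corner. The function names, the spherical Earth radius, and the assumption that the box does not cross the antimeridian are all illustrative choices.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bbox_to_point_radius(lat_a, lon_a, lat_b, lon_b):
    """Return (center_lat, center_lon, radius_m) for a box with opposite corners a and b.

    Hypothetical helper: assumes the box does not cross the antimeridian.
    """
    center_lat = (lat_a + lat_b) / 2
    center_lon = (lon_a + lon_b) / 2
    # The radius must reach the farthest of the four corners.
    radius_m = max(
        haversine_m(center_lat, center_lon, lat, lon)
        for lat in (lat_a, lat_b)
        for lon in (lon_a, lon_b)
    )
    return center_lat, center_lon, radius_m


# Corner coordinates from the Davis bounding-box slide.
center_lat, center_lon, radius_m = bbox_to_point_radius(38.5486, -121.7542, 38.545, -121.7394)
print(round(center_lat, 4), round(center_lon, 4), round(radius_m))
```

The radius computed this way covers only the extent of the box. It does not include datum, map scale, or other sources of uncertainty, so it should be read as a lower bound on a point-radius uncertainty rather than a replacement for one.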

Page 14: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

“Davis, Yolo County, California”

“point-radius method”

Coordinates: 38.5468 -121.7469

Horizontal Geodetic Datum: NAD27

Maximum Uncertainty: 8325 m
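One practical consequence of the point-radius form is that "could this position be the described locality?" becomes a single distance comparison. The sketch below illustrates that check with the Davis values from this slide. The dictionary layout and function names are hypothetical, distances use a spherical approximation, and both points are assumed to share the same datum.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation (assumption)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_uncertainty(georef, lat, lon):
    """True if (lat, lon) falls inside the georeference's uncertainty circle."""
    return haversine_m(georef["lat"], georef["lon"], lat, lon) <= georef["uncertainty_m"]


# Point-radius georeference for "Davis, Yolo County, California" from the slide.
davis = {"lat": 38.5468, "lon": -121.7469, "uncertainty_m": 8325}

# The point-method coordinate for the same locality lies well inside the circle.
print(within_uncertainty(davis, 38.5463, -121.7425))
# A position one degree of latitude to the north does not.
print(within_uncertainty(davis, 39.5468, -121.7469))
```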

Page 15: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

What is an ideal georeference?

A numerical description of a place that can be mapped

and that describes the spatial extent of a locality

and its associated uncertainties as well as all possibilities.

Page 16: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

“Davis, Yolo County, California”

“shape method”

Page 17: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

“20 mi E Hayfork, California”

“probability method”

Page 18: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Method Comparison

Method        Strength                 Weakness
point         easy to produce          no data quality
bounding-box  simple spatial queries   difficult quality assessment
point-radius  easy quality assessment  difficult spatial queries
shape         accurate representation  complex, uniform
probability   accurate representation  complex, non-uniform

Page 19: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Georeferencing Using MaNIS/HerpNET/ORNIS Guidelines

[Figure: Parallels of Latitude, Meridians of Longitude, Graticular Network]

Page 20: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

MaNIS/HerpNET/ORNIS (MHO) Guidelines

http://manisnet.org/GeorefGuide.html

► uses point-radius representation of georeferences

► circle encompasses all sources of uncertainty about the location

► methodology formalizes assumptions, algorithms, and documentation standards that promote reproducible results

► methods are universally applicable

Page 21: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

MHO Guidelines

► Think of georeferencing as a “many-stepped process”: MHO projects produced a first pass. Then validation and refinement should be done using itineraries, field notes, collector verification, and by mapping the localities and making these maps available on-line.

Page 22: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Data Quality

► “Fitness for use” of the data

► As a collector, you may have an intended use for the data you collect, but data have the potential to be used in unforeseen ways. The value of your data is directly related to the fitness for a variety of uses.

► As data become more accessible many more uses become apparent. – Chapman 2005, Chapman and Wieczorek 2006

► We are using the MHO methods as a tool to enhance data quality

Page 23: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Maximum Error Distance from Uncertainties:

► Uncertainty is a “measure of the incompleteness of one’s knowledge or information about an unknown quantity whose true value can be established if a perfect measuring device were available.” (Cullen & Frey 1999)

► In MHO Guidelines, this is defined as the numerical value for the upper limit of the distance from the coordinates of a locality to the outer extremity of the area within which the whole of the described locality must lie (i.e., what can be mistaken for that locality based on the description given).

Page 24: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Extents:

Extent: the geographic range, magnitude or distance that a location may actually represent. (With a town, the extent is the polygon that encompasses the area inside the town’s boundaries.)

Linear extent: what we use for the Point-Radius Method. Defined as the distance from the geographic center of the location to the furthest point of the geographic extent of the location.

Page 25: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Precision and Accuracy:

► Always use as many decimal places as given by the coordinate source.

► A measurement in decimal degrees given to five decimal places is more precise than a measurement in degrees, minutes, seconds.

► False precision will result if data are recorded with a greater number of decimal places than the source supports (e.g., when converting from DMS to decimal degrees).

► Always record the accuracy of your GPS readings (how well the GPS measures the true value of the location). The accuracy is given at the same time as the coordinate, but usually will not be recorded with the coordinates when you output them on recreational GPS units. If no accuracy is recorded, a default of 30 m is assumed, so stating your accuracy is better.
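The false-precision point can be made concrete with a small conversion sketch. Converting DMS to decimal degrees produces as many digits as the arithmetic allows, so the precision of the source has to be carried alongside the value. The sketch below stores it as a decimal fraction of a degree in the spirit of the Darwin Core coordinatePrecision term. The function names and the "smallest unit" lookup are hypothetical, and the example coordinate is the 20° 30’ N 112° 36’ W value that appears later in this deck, assumed here to be given to the nearest minute.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, hemisphere="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal degrees.
    return -value if hemisphere.upper() in ("S", "W") else value


def source_precision_deg(smallest_unit):
    """Precision of the source, in degrees, given the smallest unit it reported."""
    return {"degree": 1.0, "minute": 1 / 60, "second": 1 / 3600}[smallest_unit]


lat = dms_to_decimal(20, 30, hemisphere="N")
lon = dms_to_decimal(112, 36, hemisphere="W")
precision = source_precision_deg("minute")

# Keep the converted values together with the precision they actually have.
print(lat, lon, round(precision, 5))
```

Recording the precision separately means the decimal values can keep enough digits to avoid rounding error without implying that the source was measured that finely.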

Page 26: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Geographical Concepts:

Geodetic Datum: defines the position of the origin, scale, shape, and orientation of a 3-dimensional model of the earth. Example: WGS84.

Coordinate System: defines the “units of measure” of position with respect to the datum. Example: latitude, longitude in degrees, minutes, seconds.

Page 27: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Map Projections:

Mathematical transformations of the 3-D model of the surface of the earth onto a 2-D map.

Many different kinds (e.g., conical, cylindrical, azimuthal): all are compromises in distortions (either area, shape, distance, or direction), but some preserve areas or distances.

When measuring distances on paper maps, use an equal distance projection, if available, otherwise understand the implications.

Page 28: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Georeferencing Concepts

Named place: a place of reference in a locality description. Example: “Davis” in “5 mi N of Davis”

Areal extent: the geographic area covered by a named place (feature). Example: the area inside the boundaries of a town.

Linear extent: the distance from the geographic center to the furthest point of the areal extent of a named place.

Page 29: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Georeferencing Concepts

► Offset: the distance from a named place. Example: “5 mi” in “5 mi NE of Beatty”.

► Heading: the direction from a named place. Example: “NE” in “5 mi NE of Beatty”.
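Named place, offset, and heading are the three pieces needed to turn a description such as “5 mi NE of Beatty” into a coordinate. The sketch below is a simplified illustration, not the MHO procedure: a hypothetical regular expression pulls the three parts out of the text, and a standard spherical destination-point formula projects the offset along the heading. The pattern handles only the simple "distance unit heading [of] place" form, and the starting coordinate is the Davis point-method value from earlier in the deck.

```python
import math
import re

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; a spherical approximation (assumption)
UNIT_TO_M = {"mi": 1609.344, "km": 1000.0}
HEADING_DEG = {"N": 0, "NE": 45, "E": 90, "SE": 135, "S": 180, "SW": 225, "W": 270, "NW": 315}

# Hypothetical pattern for "<offset> <unit> <heading> [of] <named place>".
PATTERN = re.compile(
    r"^\s*(?P<offset>\d+(?:\.\d+)?)\s*(?P<unit>mi|km)\s+"
    r"(?P<heading>NE|NW|SE|SW|N|E|S|W)\s+(?:of\s+)?(?P<place>.+?)\s*$"
)


def parse_locality(text):
    """Split a simple offset-and-heading description into its parts, or return None."""
    m = PATTERN.match(text)
    if m is None:
        return None
    return {
        "offset_m": float(m["offset"]) * UNIT_TO_M[m["unit"]],
        "heading_deg": HEADING_DEG[m["heading"]],
        "named_place": m["place"],
    }


def destination(lat, lon, heading_deg, distance_m):
    """Point reached by travelling distance_m from (lat, lon) along heading_deg on a sphere."""
    phi1, lmb1 = math.radians(lat), math.radians(lon)
    theta = math.radians(heading_deg)
    delta = distance_m / EARTH_RADIUS_M  # angular distance
    phi2 = math.asin(
        math.sin(phi1) * math.cos(delta) + math.cos(phi1) * math.sin(delta) * math.cos(theta)
    )
    lmb2 = lmb1 + math.atan2(
        math.sin(theta) * math.sin(delta) * math.cos(phi1),
        math.cos(delta) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(phi2), math.degrees(lmb2)


print(parse_locality("5 mi NE of Beatty"))

# "5 mi N of Davis", starting from the Davis point-method coordinate.
parts = parse_locality("5 mi N of Davis")
lat, lon = destination(38.5463, -121.7425, parts["heading_deg"], parts["offset_m"])
print(round(lat, 4), round(lon, 4))
```

This yields only the center of the georeference. The extent of the named place and the imprecision in the stated distance and direction still have to be folded into the uncertainty radius.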

Page 30: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Georeferencing Concepts

► coordinateUncertaintyInMeters: “The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. Leave the value empty if the uncertainty is unknown, cannot be estimated, or is not applicable (i.e., there are no coordinates). Zero is not a valid value for this term.” (from Darwin Core)

► Maximum Error Distance: same as coordinateUncertaintyInMeters, except the units are the same as in the locality description, not necessarily meters.
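The relationship between the two terms amounts to a unit conversion plus the rule that zero is never valid. The sketch below shows one way to express that, under stated assumptions: the helper name and unit table are hypothetical, an unknown uncertainty is represented as None so the field can be left empty, and the record uses the Davis point-radius values from earlier in the deck.

```python
UNIT_TO_M = {"m": 1.0, "km": 1000.0, "ft": 0.3048, "mi": 1609.344}


def to_coordinate_uncertainty_in_meters(max_error_distance, unit):
    """Convert a Maximum Error Distance to meters; None means 'leave the term empty'."""
    if max_error_distance is None:
        return None
    if max_error_distance <= 0:
        # Darwin Core states that zero is not a valid value for this term.
        raise ValueError("uncertainty must be greater than zero; leave it empty if unknown")
    return max_error_distance * UNIT_TO_M[unit]


# Davis point-radius georeference expressed with Darwin Core term names.
record = {
    "decimalLatitude": 38.5468,
    "decimalLongitude": -121.7469,
    "geodeticDatum": "NAD27",
    "coordinateUncertaintyInMeters": to_coordinate_uncertainty_in_meters(8325, "m"),
}
print(record)

# A Maximum Error Distance stated in miles, as in a "5 mi" locality description.
print(to_coordinate_uncertainty_in_meters(5, "mi"))
```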

Page 31: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Sources of uncertainty:

Coordinate Uncertainty

Map scale

The extent of the locality

GPS accuracy

Unknown datum

Imprecision in direction measurements

Imprecision in distance measurements (e.g., 1 km vs. 1.1 km)

20° 30’ N 112° 36’ W

Scale      Uncertainty (ft)  Uncertainty (m)
1:1,200    3.3 ft            1.0 m
1:2,400    6.7 ft            2.0 m
1:4,800    13.3 ft           4.1 m
1:10,000   27.8 ft           8.5 m
1:12,000   33.3 ft           10.2 m
1:24,000   40.0 ft           12.2 m
1:25,000   41.8 ft           12.8 m
1:63,360   106 ft            32.2 m
1:100,000  167 ft            50.9 m
1:250,000  417 ft            127 m
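The map-scale values in the table appear to follow a simple rule: an error of 1/30 inch on the map for scales larger than 1:20,000 and 1/50 inch for 1:20,000 and smaller, scaled up to ground distance. That rule matches the US National Map Accuracy Standards, but the attribution is an inference from the numbers, not something the slide states. The sketch below, with a hypothetical function name, recomputes the table under that assumption.

```python
FT_TO_M = 0.3048
INCHES_PER_FOOT = 12


def map_scale_uncertainty_ft(scale_denominator):
    """Horizontal uncertainty in feet for a map at 1:scale_denominator.

    Assumed thresholds (inferred, not stated in the source): 1/30 inch on the
    map for scales larger than 1:20,000, and 1/50 inch otherwise.
    """
    map_error_in = 1 / 30 if scale_denominator < 20000 else 1 / 50
    return map_error_in * scale_denominator / INCHES_PER_FOOT


for scale in (1200, 2400, 4800, 10000, 12000, 24000, 25000, 63360, 100000, 250000):
    ft = map_scale_uncertainty_ft(scale)
    print(f"1:{scale:,}  {ft:.1f} ft  {ft * FT_TO_M:.1f} m")
```

Most rows reproduce the table after rounding. The 1:25,000 and 1:100,000 rows come out about 0.1 lower than the slide's values, so the slide may have used slightly different rounding or inputs for those two.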

Page 32: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

1. Georeferencing

2. Collaborations

3. Automation

Page 33: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Collaborative Distributed Database Portals for Vertebrates

Page 34: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Collaborations

Page 35: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

MaNIS Localities Georeferenced

n = 326k localities (1.4M specimens)

r = 14 localities/hr (point-radius method)

t = 3 yrs (~40 georeferencers)

Page 36: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

ORNIS Localities Georeferenced

n = 267k localities (1.4M specimens)

r = 30 localities/hr (point-radius method)

t = 2 yrs (~30 georeferencers)

Page 37: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

HerpNET Localities Georeferenced

n = 646k localities (1.8M specimens)

r = 15 localities/hr (point-radius method)

t = 5 yrs (111 georeferencers)

Page 38: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Scope of the Problem for Natural History Collections

~2.5 billion (10^9)

~6 records per locality*

~14 (30) localities per hour*

~15,500 (7,233) years

* based on the MaNIS (ORNIS) Project
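The year totals on this slide can be reproduced with a short calculation. The sketch below assumes the ~2.5 billion figure counts records and that one year means about 1,920 working hours (40 hours a week for 48 weeks). Neither assumption is stated on the slide; the hours figure was chosen because it recovers the slide's numbers.

```python
TOTAL_RECORDS = 2.5e9        # "~2.5 billion" from the slide, assumed to be records
RECORDS_PER_LOCALITY = 6     # from the slide
HOURS_PER_WORK_YEAR = 1920   # assumption: 40 h/week x 48 weeks; not stated on the slide


def person_years(localities_per_hour):
    """Working years for one georeferencer to cover every locality at the given rate."""
    localities = TOTAL_RECORDS / RECORDS_PER_LOCALITY
    hours = localities / localities_per_hour
    return hours / HOURS_PER_WORK_YEAR


# MaNIS rate (14 localities/hr) and ORNIS rate (30 localities/hr).
print(round(person_years(14)), round(person_years(30)))
```

Under these assumptions the results land within a year of the slide's ~15,500 and 7,233 figures, which is the scale of effort that motivates the automation section that follows.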

Page 39: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

The Collaboration continues…

Page 40: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

1. Georeferencing

2. Collaborations

3. Automation

Page 41: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

Automation

Tools for Georeferencing

GeoLocate

DIVA-GIS

Georeferencing Calculator

BioGeomancer Classic

Page 42: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios

http://www.biogeomancer.org

Page 43: Michelle Koo, Carol Spencer, David Bloom, Nelson Rios