Geospatial Big Data: Software Architectures and the Role of APIs in ...

Post on 14-Feb-2017

222 views 0 download

Transcript of Geospatial Big Data: Software Architectures and the Role of APIs in ...

®

Geospatial Big Data: Software Architectures and the Role of APIs in

Standardized EnvironmentsApache Big Data Europe 2016

IngoSimonisOpenGeospatialConsortiumisimonis@opengeospatial.org

slideset containsmaterialfromG.Percivall (OGC),R.Winterton (PitneyBowes),J.Sanyal (ORNL),A. Asahara (Hitachi),J.Spinney(PitneyBowes),T.Kolbe(TUM),RobEmanuele(LocationTech),ESA

GeospatialData

• Spatial data is big data

• Apache projects are implementing geospatial functionalities

• Coordination of spatial implementations across Apache projects

• Open standards to increase interoperability and code reuse

• Architectures integrating Big Data Services and Geospatial Services

EarthObservations

• BigEarthDataInitiative(BEDI)- Standardizingandoptimizingcollection,deliveryofU.S.Government’scivilEarthobservationdata.

• SentinelsatellitesoperatedbyESAintheframeworkoftheCopernicusprogramme fundedandmanagedbytheEuropeanCommission.

3

0.00

2.00

4.00

6.00

8.00

10.00

12.00

FY00 FY01 FY02 FY03 FY04 FY05 FY06 FY07 FY08 FY09 FY10 FY11 FY12 FY13

Volum

e(PB

s)

Multi-yearTotalArchiveVolume(PBs)Trend

TOWARDS A JRC EARTH OBSERVATION DATA AND PROCESSING PLATFORM

P. Soille1, A. Burger2, D. Rodriguez1, V. Syrris1, and V. Vasilev2

European Commission, Joint Research Centre (JRC)1Institute for the Protection and Security of the Citizens, Global Security and Crisis Management Unit

2Institute for Environment and Sustainability, Digital Earth and Reference Data Unit

ABSTRACTThe Copernicus programme of the European Union with itsfleet of Sentinel satellites operated by the European SpaceAgency are effectively making Earth Observation (EO) enter-ing the big data era. Consequently, most application projectsat continental or global scale cannot be addressed with con-ventional techniques. That is, the EO data revolution broughtin by Copernicus needs to be matched by a processing revolu-tion. Existing approaches such as those based on the process-ing of massive archives of Landsat data are reviewed and theconcept of the Joint Research Centre Earth Observation Dataand Processing platform is briefly presented.

Index Terms— Earth Observation, Sentinel, Copernicus,Infrastructure

1. INTRODUCTION

To date, the United States (U.S.) Government is the largestprovider of environmental and Earth system data in theworld1. A first data revolution happened in 2008 when theU.S. Geological Survey decided to release for free to the pub-lic its Landsat archive which is the worlds largest collectionof Earth imagery [11]. Still, the European Commission, withits ambitious Copernicus programme and associated Sentinelmissions (S1 to S6 satellite series) operated by the EuropeanSpace Agency and complemented by a range of contribut-ing missions, is on the way to become the main provider ofglobal EO data with a free, full, and open access data policy.With expected data volumes of 10 TB per day (when all Sen-tinel series will reach full operational capacity), data velocityhighlighted by the production of global coverage with repeattime as short as 2 days for Sentinel-3, and data variety result-ing from sensors in the optical and radar ranges at variousspatial, spectral, and temporal resolutions, the Copernicusprogramme is a game changer making EO data effectivelyentering the big data era [10]. Figure 1 shows the overallestimated data throughput for the Sentinel 1–3 missions com-pared to those delivered by the Landsat 8/MODIS satellites.

1http://dels.nas.edu/resources/static-assets/besr/miscellaneous/Stryker.pdf

Fig. 1. Yearly data flow estimates from Sentinel 1–3 (as-suming full operational capacity) compared to MODIS andLandsat 8 data flows.

Whilst the European Union (EU) is making EO enteringthe big data era in terms of data production, innovative de-velopments need to be pursued to fully exploit the potentialof the generated data whether for academic, institutional, orcommercial applications. This also applies to the Joint Re-search Centre where the current fragmented approach of EOdata storage and processing is no longer sustainable.

2. THE SENTINELS AND THE BIG EO DATA ERA

The evolution of the cumulative data produced by the Land-sat missions and Sentinel 1-2-3 with estimations until sum-mer 2018 is shown in Fig. 2. The underlying calculations arebased on the following assumptions and considerations:

1. The data volume of Landsat 1-6 missions is ⇠120 TB2;

2. The data volume of Landsat 7 and 8 is estimated fromthe relating metadata files provided by USGS3;

3. MODIS Terra and Aqua generate 70GB/day each [7];

2http://academic.emporia.edu/aberjame/remote/landsat/landsat.htm

3http://landsat.usgs.gov/metadatalist.php

Exploitation Platforms 2

Proc. of the 2016 conference onBig Data from Space (BiDS’16) doi: 10.2788/854791

65 Santa Cruz de Tenerife, Spain15–17 March 2016

CommercialCloudHosting

• DigitalGlobe– EntireDGarchiveincloudin2016- 45PB- largestEOarchive– Harris/ENVIprocessing

• GoogleEarthEngine– 5PBstorage(Landsatandothers) - 800+LibraryFunctions– Limitingfactoristheabilitytopulldatafromanothercloudtosupportlocalprocessing.

• HexagonGeospatial– Cloudhosteddynamicinformationservice– AirBus archiveinAmazoncloudwithHexagonservices

Geo-Enrichment

Allowsawidevarietyofdatasetstobeappendedtoadatarecord

oWhatarethepropertyattributesofthisinsuredproperty?oWhatdemographicgroupdoesthiscustomerbelongto?oWhatbusinessesareconnectedwiththisareaofpoornetworkcoverage?

Geo-Analytics

Reducethecomplexityofbillionsoftransactionalrecordsbyassigningdatatogeographicbinsandaggregatingresults.

o Istheaverage4Gnetworkcoverageinthisareabetterthanacompetitor?

o Isthisdatapointinsideoroutsideofageo-fence?

NetworkCoverageandPerformance

Connectedcarswillsend25gigabytesofdatatothecloudeveryhourimageby:http://barrachd.co.uk

NetworkCoverageandPerformance

• Singleviewonnetwork:• Improvequalityofservice• Increasenetpromoterscores

• Enableacquisition• Reducechurn

PitneyBowes|Increasingthevalueofdatathroughlocationinsights|09/2016

LayeredInformation

9

Demographics

LayeredInformation

10

Demographics

Eachlayeroflocationdataaddsnewdetailsandadditionalinsight

Demographics

PointsofInterest

BusinessData

BuildingAttributes

AreaBoundaries

StreetNetworks

AllDomains:Profitfromapreciseperspective

11

• Singleviewofrisk• Usagebasedinsurance• Frauddetection

Insurance

PopulationDensityTables

RapidSettlementIdentificationandCharacterization

LandScanGlobalAmbientPopulationDistribution(~1km)

SettlementMapping

FacilityLevelPopulationDensity

AmbientandDay/NightPopulationDistribution(~90m)

Increaseinsp

atialand

tempo

ralresolution

LandScanHD

TopDo

wnandBo

ttomUp

Approach

PopulationDistributionandDynamicsModeling

AddisAbaba,Ethiopia

• 2 Xeon Quad core 2.4GHz CPUs + 4 Tesla GPUs + 48GB

• Image analyzed (0.3m)• 40,000x40,000 pixels (800 sq. km)• RGB bands

• Overall accuracy 93%• Settlement class 89%• Non-settlement class 94%

• Total processing time• 27 seconds

FireStation

MLDAddressFabric

WaterBodies

IncorporatedPlaces

MCDPointinPolygonPointinPolygon

FireStationWithMCDinformation

PointinPolygon

FireProtectionDataBundle

Complexspatialprocessing

Routing&othercomplexspatialprocessing

MLDAddressFabricWithMCD,Incorporated

Places,Nearest3FireStation,NearestWaterbodyinformation

MCD

MLDAddressFabricWithMCDinformation

MLDAddressFabricWithMCD&IncorporatedPlacesinformation

172MillionAddressestoClosest3FireStations&NearestWaterBoundary- 6hoursUsing10NodeElasticMapReduce

ComplexWorkFlows

EcologyMapping

• 1 kmsq gridofUSeachwithninevariables,e.g.,daysbelowfreezing,amountofprecipitationingrowingseason

• Unsupervisedstatisticalmultivariateclustering

• Domains:tundra,prairie,alpine,andsoutheasternforest

A Groundbreaking Observatory to Monitor the Environment -- Pennisi 328 (5977): 418 -- Science

http://www.sciencemag.org/cgi/content/full/328/5977/418[5/10/2010 11:09:49 AM]

The united domains of America.Scientists divided the United Statesinto 20 ecological domains. Three siteswithin each domain will beinstrumented.

SOURCE: NEON

Sign In

More Information

More in CollectionsEcology

Scientific Community

Science and Policy

Related Jobs fromScienceCareers

Ecology

Environmental Science

to practice "ecoinformatics": the use of computers and software tools to integrate different types of information frommany locations. They will need to think about trends across a whole country instead of a single ecosystem. Thesuccess of NEON will depend in large part on whether they embrace or reject that new model.

Not everyone is pleased with how the project is set up. Some, like ecologist David Tilman of the University ofMinnesota, Twin Cities, lament the excision of an experiment to test the effects of global change. They say such anexperiment, deemed too expensive, is essential to obtaining timely answers about climate change. Others complainthat NEON won't be investing enough in the field sites that will host its instruments. There's also some concern thatecologists, untrained in the approach NEON is taking, won't use NEON's data. "Everybody still has some questionsbecause it's a new thing," says John Porter, an ecologist at the University of Virginia in Charlottesville who is notpart of NEON.

LTER on steroids

Monitoring a patch of land over time isn't a new idea for NSF. In 1980, it set up five U.S. sites, including one atNiwot Ridge, under the Long Term Ecological Research (LTER) Network that has grown to 26 sites, including twoin Antarctica and one off the Fiji Islands in the South Pacific. The $30-million-a-year program is widely considereda success, with findings on the effect of global warming on plant diversity, how forests could be overloaded byanthropogenic nitrogen, and the greater stability of diverse ecosystems.

But by the late 1990s, says Williams, "we realized there were limits to the LTER model." Each LTER was designedto answer questions posed by an individual investigator or a small team. Core activities, such as measuring primaryproductivity, were not a high priority, Williams acknowledges. "It was hard to integrate data [from different sites] andto do synthesis," he adds, because investigators followed different timetables and used different instruments.

At about the same time, Williams says, the community began to ask itself, "How do we grow ecology, and how dowe tap additional resources?" For NSF program managers, the goal was to fund construction of a large-scalebiology project without devouring their annual budgets, which nurture thousands of individual investigators(Science, 20 June 2003, p. 1869). Their models were the astronomy and geosciences communities, which havemanaged for decades to build costly instruments such as telescopes and ships without bankrupting their bread-and-butter programs. NSF already had a mechanism: Its budget included a special facilities account to financeconstruction of half a dozen projects at a time, with the understanding that NSF's research directorates would payfor operations and maintenance of those facilities from their annual budgets.

A series of workshops yielded a vision of NEON hailed by then-newly arrived NSF Director Rita Colwell, whoinserted the project into NSF's 2001 budget request to Congress. But the larger ecological community hadreservations. Congress also balked, wondering what particular scientific question NEON would be addressing.

In response to that resistance, NSF asked the American Institute of Biological Sciences to hold three town meetingsin 2002 and 2003. The resulting white paper called for a network of 17 sites in different biomes that, in turn, wouldbe linked to other research sites nearby. Each site was projected to cost $20 million to set up and $3 million a yearto operate.

Again, however, the community was divided. Although some people were excited, others wondered if ecologists,known for being independent, would take full advantage of NEON. To many, the program looked like "LTER onsteroids," says Williams. "It was not a good-enough plan."

Next up was an evaluation by the U.S. National Academies. Theresulting National Research Council (NRC) report endorsed NEON inprinciple but urged that the program be reoriented around six specificresearch questions, including biodiversity and land use. Eachquestion would be the focus of one observatory (Science, 26September 2003, p. 1828). "It forced us to look at large-scaleecological processes and large-scale drivers of change," says NSF'sElizabeth Blood. Nonetheless, Congress chose not to give NSFmoney in 2004 to begin construction, the third time in 4 years it hadpassed on funding NEON.

A plan takes shape

For NSF, the flaw in the NRC proposal was that the observatorieswere too independent to be considered a single entity. That featurewould preclude NEON from being funded by the agency's majorresearch equipment account. For NEON's supporters, the solution

Faculty Positions -Skeletal Muscle Biolog…University of IowaIowa City-IA-United States

FACULTY POSITION INMICROBIOLOGYEast Tennessee StateUniversity-United States

CHAIRUniversity of Florida-United States

POSTDOCTORALFELLOWSHIPPOSITIONSUniversity of MarylandSchool of Medicine-United States

FACULTY POSITION INMICROBIOLOGYUniversity of MinnesotaMinneapolis-MN-UnitedStates

POSTDOCTORALPOSITIONUniversity of Texas, HoustonHouston-TX-United States

INNATE IMMUNITYTENURE-TRACKFACULTY POSIT…University of Texas MedicalBranch

Science23April2010:Vol.328.no.5977,pp.418- 420DOI:

10.1126/science.328.5977.418

NSFNEONEcologicalDomains

Transportation• Toreducetrafficcongestion,tripdemanddatacollectedusingtransportationsurveys• GPSbaseddatacollectionoftripinformationisapplicable,withthebroadavailability

oflocationenabledmobiledevices• TheGPStracksareencodedbyMovingFeaturestoenablesharingbymany

stakeholderssuchaslocalgovernments,buscompanies,andsoon.

Transportationsurvey

12343456

23454567

12343456

12hrtotal24hrtotal

123234

34564567

34567890

TrafficDemandsTrafficCongestionsSmartphones

Peopleinthecity

TracksmeasuredbyGPS(encodedbyMovingFeatures)

Contexts &Possibilities

PRESENT

Behaviors &Actuals

PAST

Predictions &Potentials

FUTURE

LocationBasedMarketing

CityModelsforSmartCities

• Berlin• >500,000 buildings upto

Level of Detail 4• Modeled according to

CityGML• Basis for real estate• Integration of sensors

• New York• 1M buildings plus roads at

LoD 1• NYC Open data • Next - Underground critical

infrastructure www.virtual-berlin.de

GeospatialStandards

• Location• Geometry• Features• Coverages• SensorsandObservations• Processing,Analytics• WebServices

PowerofLocation

• 1stlawofgeography:"Everythingisrelatedtoeverythingelse,butnearthingsaremorerelatedthandistantthings.”

– WaldoTobler

• Bymeasuringentropyofindividual’strajectory,wefind 93%potentialpredictabilityinusermobility

– LimitsofPredictabilityinHumanMobility,Science2010

SomePeculiaritiesaboutSpatial

SomePeculiaritiesaboutSpatial

Latitudeisnotunique!

norisLongitude!

CoordinateReferenceSystems

• Coordinate– oneofasequenceofNnumbersdesignating

thepositionofapointinN-dimensionalspace• CoordinateSystems

– Cartesian2Dand3D– Spherical(3D),Polar(2D)– Cylindrical– Linear- alongapath– Ellipsoidal

• CoordinateReferenceSystem– coordinatesystemrelatedto

realworldbyadatum• Examples

– Geographic– Geocentric– Vertical– Engineering– Image– Temporal– DerivedCRS,e.g.,projections

ReferenceISO19111andOGCAbstractSpecTopic2

Mercatorprojection

Globularprojection

Orthographicprojection

Stereographicprojection

Afamiliarlyshaped‘continent’ indifferentmapprojections

MapProjections

Mercator TransverseMercator

Whaterrorscanyouexpect?

Deviations(undulations)betweentheGeoidandtheWGS84ellipsoid

SeaLevel

Localverticaldatum

SeaLevel

TheNetherlandstoBelgium:-2.34m!

NoMetadata– NoInterpretation

• Nogeodeticmetadataà coordinatescannotbeinterpreted– datum– ellipsoid– primemeridian– mapprojection

HidingGeospatialComplexity

MartinDesruisseaux,Geomatys,presentationtodayaboutApacheSISProject• ItistemptingtoignorethecomplexityofgeospatialinternationalstandardsontheassumptionthateveryonetodayusescoordinatesgivenbyGPS.

• ApacheSISmethodshandlealotofthiscomplexity• Martinwillshowexampleofwhathappenunderthehoodduringacubetransformation,fordemonstratingwhatthedevelopersgainwithSIS.

GeospatialInformation

Feature Data Coverage Data

Metadata Maps

SimpleGeometriesforSimpleFeature

OGC simple features (ISO 1923) geometries are restricted to 0, 1 and 2-dimensional geometric objects that exist in 2-dimensional coordinate space (R2).

A/B A B A B A B

A B A B ABA

Equals Touches Overlaps Contains

Within Disjoint Intersects Crosses

TopologicalRelationsbetweenSpatialObjects

TopologicalRelationsbetweenSpatialObjects

– ogcf:relate(geom1: ogc:WKTLiteral, geom2: ogc:WKTLiteral,patternMatrix: xsd:string): xsd:boolean

– ogcf:sfEquals(geom1: ogc:WKTLiteral,geom2: ogcf:WKTLiteral): xsd:boolean

– ogcf:sfDisjoint(geom1: ogc:WKTLiteral,geom2: ogcf:WKTLiteral): xsd:boolean

– ogcf:sfIntersects(geom1: ogc:WKTLiteral,geom2: ogcf:WKTLiteral): xsd:boolean

– ogcf:sfTouches(geom1: ogc:WKTLiteral,geom2: ogcf:WKTLiteral): xsd:boolean

– ogcf:sfCrosses(geom1: ogc:WKTLiteral,geom2: ogcf:WKTLiteral): xsd:boolean

– ogcf:sfWithin(geom1: ogc:WKTLiteral,geom2: ogcf:WKTLiteral): xsd:boolean

– ogcf:sfContains(geom1: ogc:WKTLiteral,geom2: ogcf:WKTLiteral): xsd:boolean

– ogcf:sfOverlaps(geom1: ogc:WKTLiteral,geom2: ogcf:WKTLiteral): xsd:boolean

GeoSPARQL for Topological Query Functions

GeographicFeatures

Encodings

Access

ImplementationSpecifications

Concepts Vocabulary

StructureAbstractModels

<MultiGeometrygid="c731"srsName="http://www.opengis.net/gml/srs/epsg.xml#4326"><geometryMember><Pointgid="P6776"> <Coord><x>50.0</x><y>50.0</y></Coord> </Point></geometryMember><geometryMember> <LineStringgid="L21216"> <Coord><x>0.0</x><y>0.0</y></Coord> <Coord><x>0.0</x><y>50.0</y></Coord> <Coord><x>100.0</x><y>50.0</y></Coord> </LineString> </geometryMember> <geometryMember></MultiGeometry>

OGCGeographyMarkup Language

TwoDifferentUsagePatterns• Thematiccommunitiesdescribe

spatialdatasets: Cadastre,Topography,Geology,Hydrography,Meteorology,Aviation,CityModels,etc.

• EmbedlocationinotherXMLgrammars: GeoRSS,GeoSPARQL(OGC),Geopriv (IETF),POI(W3C),SensorWeb(OGC),etc.

GML: Geometry, Time, Features,

Reference Systems

Map

ping

XML Technologies (W3C)

Cit

y M

odel

s

Cad

astr

e

Geo

logy

Web

Fee

ds

...

CityGML – GeometryandSemantics

– Geometric entities knowWHAT they are– Semantic entities knowWHERE they are and what their spatial extentsare

CityGML: (Up to) Complex objects with structured geometrySemantics Geometry

CityGML and IndoorGML

1st layer: Topographic space model– building structure– geometric-topological model– network for route planning

2nd layer: Sensor space model– Radio/Beacon footprints– coverage of sensor areas– transition between sensor areas

OGC Moving Features

• "Moving features" - vehicles, pedestrians, airplanes, ships.– This is Big Data – high volume, high velocity.

• CSV and XML encodings

Spatial Temporal Geometry

time

Spatialplane

1prism=1leaf+1sweep(&attribute)

Endleafoftracks

id=1

Id=2

11:11:20.835 11:11:26.215 11:11:28.021 11:11:30.127

(C)

(B)(D)

(A)

OGCMovingFeaturesStandardimplementsISO19141

Social Media in Geospatial Analysis

Social MediaAPIsSilos

GeoSPARQL Linked Data REST API

Web AccessLayer

Human-orientedClients

. . .

OGC Interfaces for Social MediaSocial Media Analysis WPS

Geospatial Coverages

• Pixelgrid(e.g.,visiblebrightness)

Geospatial Coverages

• Pixelgrid(landuse/landcover)

Geospatial Coverages

• Pointgrid(e.g.,windspeed&direction)

Geospatial Coverages

• Triangulatedirregularnetwork(TIN)

OGCPointClouds

• WGestablishedin2015• Focusonalltypesofpointclouds:LiDAR/laser,bathymetric,meteorologic,photogrammetric…

WebCoverageProcessingService• Query Language for nD sensor, image, simulation, statistics data

– Syntax close to XQuery (WCPS 2.0: integration)• Ex: "From MODIS scenes M1, M2, and M3, the difference between red and nir, as TIFF

where nir exceeds 127 somewhere”for $c in ( M1, M2, M3 )where some( $c.nir > 127 )return encode( $c.red - $c.nir, “image/tiff“ )

(tiff1,tiff2)

GeospatialAnalytics

• Analyticexploitationofthespace-timefeatureswillusherinadvancesinhigh-qualitypredictionsystems.– Spacetimefeatures:thehighestorderbits- Jonas,Tucker

• UsingalgorithmicextractionandbigdatagraphstocreateandrelateentitiesontheWeb,organising themthroughasemantictaxonomyandenablingnaturalaccess– Thefutureis‘Where’"- S.Lawler,Bing

SpatialPartitioningTechniques

SpatialindexstoredperfileonHDFS

Zorder(2Dand3D),Hilbert(N-Dimensional)

Zorder(2Dand3D)Binnedperweekforspatiotemporal

N-DimensionalHilbertwitharbitrarybinningandtieredindexing

SpatialIndexing

GEOJINNI(formerly SPATIALHADOOP)

DiscreteGlobalGridSystems

• “…aspatialreferencesystemthatusesahierarchicaltessellationofcells topartitionandaddresstheglobe.DGGSarecharacterizedbythepropertiesoftheircellstructure,geo-encoding,quantizationstrategyandassociatedmathematicalfunctions.”

– OGCDGGSCandidateStandard

StandardizingDiscreteGlobalGridSystemsDifferent Cell Shapes

Square = Familiar Triangular = Fast Hexagonal = Fineness of Fit

00 01 02 03 10 11 12 13 20 21 22 23 30 31 32 33

nD Spatial Analyses ¯

1D Array Processes

Unique Cell Indices• Hierarchy-based, Space-filling Curve, Axes-based or Encoded Address

SpaceFillingCurvesA few different choices…

Sensors Everywhere(Things or Devices)

50 billions Internet-connected things by 2020

OGCSensorWebEnablement

• Quickly discoversensorsandsensordata (secureorpublic)thatcanmeetmyneeds– location,observables,quality,abilitytotask

• Obtainsensorinformation inastandardencodingthatisunderstandablebymeandmysoftware

• Readilyaccesssensorobservations inacommonmanner,andinaformspecifictomyneeds

• Tasksensors,whenpossible,tomeetmyspecificneeds• Subscribetoandreceivealerts whenasensormeasuresaparticularphenomenon

OGCSensorThings forIoT

• Builds on OGC Sensor Web Enablement (SWE) standards that are operational around the world

• Builds on Web protocols; easy-to-use RESTful style • OGC candidate standard for open access to IoT devices

http://ogc-iot.github.io/ogc-iot-api/datamodel.html

OGCEssentials

• SimpleFeaturesforSQL:FundamentalgeometriesandoperationswhichunderlieallOGCstandards.

• WellKnownText:TextencodingofSimpleFeaturesgeometries• WellKnownBinary:binaryencodingofWellKnownText.• CQL/Filter: CommonQueryLanguage andFilterlanguage• GeoPackage:SQLlite forgeospatial• WMTSSimpleTileMatrix

OGCBigGeoDataWhitePaper

BigGeoDataApplications

UseCasesforBigGeoData

OpenStandards

OpenSourceProjects

UseCasesReuseacrossApplications

Codereusebasedonstandards

Context forusecases Implementation

ofUseCases

UseCasesforBigGeoData

HighVelocityIngest

GeoAnalytics,MachineLearning

GeospatialDatabases

SpatialModeling

ObservationSources

Users and consuming

apps

UseCasesforBigGeoData

HighVelocityIngest

GeoAnalytics,MachineLearning

GeospatialDatabases

SpatialModeling

ObservationSources

Users and consuming

apps

IoTMessage Streaming

Social Media Message

Processing

ETL Stream processing using

RDF

Wide Area Motion

Imagery

Entity-oriented Spatial-temporal

analytics

Grid-oriented Spatial-temporal

analytics

Feature Fusion

Remote sensed data processing

Machine Learning

Array databases

NoSQLdatabases

Graphdatabases

Built environment

models

Integrated environmental

models

Modeling and simulation

HighVelocityIngest- UseCases

• OpenSourceProjects– ApacheKafka,ApacheNiFi,ApacheJena,– SensorHub,SensorUp

• OpenStandards– IoT:MQTT,COAP,IPSO,– OGCSensorWebEnablement(SWE),SensorThings– RDF,OWL,GeoSPARQL,– WebProcessingService(WPS)– WideAreaMotionImagery(WAMI)

HighVelocityIngest

ObservationSources

IoTMessage Streaming

Social Media Message

Processing

ETL Stream processing using

RDF

Wide Area Motion

Imagery

DRAFT

GeoAnalytics,MachineLearningUseCases

• OpenSourceProjects– Apache:Accumulo,Storm,Lucene,Hadoop,SIS,Magellan,Marmotta,Mahout,Spark

– LocationTech:GeoWave,GeoTrellis,GeoMesa,GeoJinni,JTSTopologySuite

– OSGeo:GDAL/OGR,OSSIM,pycsw– Others:MrGeo,MonetDB

• OpenStandards– OGCSimpleFeatures,DGGS– GeoTIFF,NetCDF,HDFencodings– WebProcessingService(WPS)

GeoAnalytics,MachineLearning

Entity-oriented Spatial-temporal

analytics

Grid-oriented Spatial-temporal

analytics

Feature Fusion

Remote sensed data processing

Machine Learning

DRAFT

GeospatialDatabasesUseCases

• OpenSourceProjects– Apache:Accumulo,Lucene/Solr,Cassandra,SIS,

Marmotta– OSGeo:degree,GeoServer,OpenLayers,QGIS– EarthServer,THREDDS,RasterStorageArchive– MonetDB

• OpenStandards– WebFeatureService(WFS)– WebCoverageService(WCS)– WebMapService(WMS)– GeographyMarkupLanguage(GML)

GeospatialDatabases

Array databases

NoSQLdatabases

Graphdatabases DRAFT

SpatialModelingUseCase

• OpenSourceProjects– ApacheSIS– CityDB– Cesium

• OpenStandards– CityGML– OpenMI– OGCCDB

SpatialModeling

Built environment

models

Integrated environmental

models

Modeling and simulation DRAFT

OpenSourceandOpenStandards

• Importanceofcoordination– “Havingjustoneimplementationofsomethingisrisky”- TomHardie,IETF– Needtodefinestableinterfaceswithstablestandardreference– Protocols,Interfacesandencodingsdocumentedinopenstandards

• OpenStandardsuseofOpenSource– ReferenceImplementationsofOpenStandards– CodesnippetsinOpenStandards.

Commercial39%

Government27%

NGO8%

Research6%

University20%

TheOpenGeospatialConsortium

Not-for-profit, international voluntary consensus standards organization; leading development of geospatial standards• Founded in 1994• 515+ member organizations• 48 standards• Thousands of implementations • Broad user community

implementation worldwide• Alliances and collaborative activities

with ISO and many other SDO’s

Africa…

AsiaPacific…

Europe209Middle

East34

NorthAmerica…

SouthAmerica

3

ApacheBDUSAMay2016- GeospatialTrack

• Open Geospatial Standards and Open Source – George Percivall, Open Geospatial Consortium (OGC)

• Magellan: Spark as a Geospatial Analytics Engine – Ram Sriharsha

• Applying Geospatial Analytics Using Apache Spark Running on Apache Mesos– Adam Mollenkopf, Esri

• SciSpark: MapReduce in Atmospheric Sciences – Kim Whitehall, NASA Jet Propulsion Laboratory

• Geospatially Enable Your Hadoop, Accumulo, and Spark Applications with LocationTech Projects

– Robert Emanuele, Azavea

© 2016 Open Geospatial Consortium

ApacheBDUSAMay2016- GeospatialTrackII

• Hiding Some of Geospatial Complexity– Martin Desruisseaux, Geomatys

• Geospatial Querying in Apache Marmotta– Sergio Fernandez, Redlink GmbH

• Spatial Data Based People/Vehicles Trails Analysis to Support Precision Urban Planning – Yonghua (Henry) Zeng, IBM

• Crowd Learning for Indoor Positioning – Thomas Burgess, indoo.rs GmbH

© 2016 Open Geospatial Consortium

OGCTestbed-13

ESAEO(Exploitation)Platform

BigDataIntegrationArchitectures

TheOpenGeospatialConsortium

OpenGeospatialConsortiumwww.opengeospatial.org

OGCStandards- freelyavailablewww.opengeospatial.org/standards

OGConYouTubehttp://www.youtube.com/user/ogcvideo

Dr.IngoSimonisisimonis@opengeospatial.org