An Open Source GIS Architecture for
Connected and Linked Data
Jerry Hayes
Frank Hardisty (Advisor)
Problem: Relational database abstraction impedes performance and scalability.
Connected Data in GIS Today
Many use cases in GIS
Creates logical model to represent topology.
Storage of logical model is not optimal.
Problem: Vast potential of the Semantic Web is unrealized in GIS!
Linked Data in GIS Today
Links described by semantic relationships.
Semantics enable data discovery.
Early stages of adoption in GIS.
Having trouble “visualizing” the network? … so do machines!
How GIS Stores Connected Data
Uses relational database tables.
Abstraction introduces unnecessary overhead.
Bad for large datasets!
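As a rough illustration of that overhead, the sketch below stores a small network as rows in a relational edge table and finds everything reachable from one junction. The table name `edge`, its columns, and the sample junction IDs are assumptions made for this example, and SQLite stands in for whatever RDBMS a GIS actually uses. Each additional hop is resolved by another join against the table.

```python
import sqlite3

# Hypothetical logical network: each row is one directed connection.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "E")]


def reachable(start):
    """Return all junctions reachable from `start` (including itself).

    The recursive query joins the edge table against the set of
    junctions found so far, once per hop, until nothing new appears.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", EDGES)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION
            SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(reachable("B"))
```

The connectivity is never stored directly; it is rebuilt from table rows each time the question is asked.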
Much easier to visualize network … machines are happier too!
Graph Databases for Connected Data
Stores connected data in its native format.
Removes unnecessary overhead.
Good for large datasets!
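A minimal sketch of the "native format" idea, assuming a plain in-memory adjacency list rather than any particular graph database: each junction keeps direct references to its neighbours, so a traversal follows those references instead of joining tables. The junction IDs and the `traverse` helper are hypothetical.

```python
from collections import deque

# Hypothetical network: each junction lists its directly connected neighbours.
GRAPH = {"A": ["B"], "B": ["C", "E"], "C": ["D"], "D": [], "E": []}


def traverse(graph, start):
    """Breadth-first traversal that follows neighbour references hop by hop."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


print(traverse(GRAPH, "A"))
```

The cost of each hop depends on how many neighbours a junction has, not on the total size of the dataset.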
Performance comparisons are difficult. … how “connected” is the connected data?
Preprocessing data helps mitigate issues. … ESRI’s preprocessed logical network model.
In general …
i) RDBMSs are optimized for aggregation queries.
ii) Graph databases are optimized for traversals.
Database Performance Comparisons
Two basic properties define graph databases.
Graph Database Characteristics
Native Graph Storage
Native Graph Processing
Connects data to data on the Web
Uses Resource Description Framework (RDF).
Creating quality linked data is challenging!
Linked Data … the Next Frontier
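To make the RDF idea concrete, here is a small sketch that models statements as (subject, predicate, object) triples in plain Python, with no RDF library. The `http://example.org/` IRIs, the `locatedIn` predicate, and the `match` helper are made up for illustration; only the `owl:sameAs` IRI is a real vocabulary term. The `sameAs` triple is what links a local feature to a resource in another dataset on the Web.

```python
# Hypothetical namespace for local data; OWL_SAME_AS is the real owl:sameAs IRI.
EX = "http://example.org/"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

TRIPLES = [
    (EX + "station/1", EX + "locatedIn", EX + "city/1"),
    (EX + "station/2", EX + "locatedIn", EX + "city/1"),
    # Link from a local resource to a (made-up) external dataset.
    (EX + "city/1", OWL_SAME_AS, "http://other.example.net/place/42"),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Which resources are linked to something outside the local dataset?
for subject, _, target in match(TRIPLES, p=OWL_SAME_AS):
    print(subject, "->", target)
```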
Only useful in sufficient quality and quantity.
Many RDF datasets are now available
Data quality, availability and stability concerns.
Tools are available for accessing RDF models.
LinkedGeoData for GIS applications.
Accessing Linked Data
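One common way to access RDF datasets is a SPARQL endpoint queried over HTTP. The sketch below only builds the request URL and sends nothing. The endpoint address and the class IRI are assumptions used for illustration and should be checked against the dataset's own documentation; the `query` parameter name follows the SPARQL 1.1 Protocol.

```python
from urllib.parse import urlencode

# Assumed endpoint address; verify before use.
ENDPOINT = "http://linkedgeodata.org/sparql"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def build_query(class_iri, limit=10):
    """Build a SPARQL query listing resources of one class with their labels."""
    return (
        "SELECT ?s ?label WHERE { "
        f"?s a <{class_iri}> ; "
        f"<{RDFS_LABEL}> ?label . "
        f"}} LIMIT {int(limit)}"
    )


def build_request_url(endpoint, query):
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical class IRI, chosen only to show the shape of the request.
query = build_query("http://linkedgeodata.org/ontology/Pub", limit=5)
print(build_request_url(ENDPOINT, query))
```

Because public endpoints can be slow or unavailable, a real client would also need timeouts and error handling.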
Server side is stateless.
PostGIS used for …
• Storing physical model.
• Data visualization.
Neo4j used for …
• Storing logical model.
• Graph traversals.
Open Source System Architecture
Implemented in the IBM Cloud
Provides RESTful API.
Enables spatial analytics
Enables “data” discovery.
Integrates physical and logical model processing.
Servlet Architecture
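The following is a sketch, not the poster's servlet code, of how a stateless request handler might combine the two models: the logical model answers "what is connected?" and the physical model supplies geometry for display. Plain dictionaries stand in for Neo4j and PostGIS, and every name, ID, and coordinate here is hypothetical.

```python
import json

# Stand-in for the logical model (Neo4j in the architecture): connectivity only.
LOGICAL = {"J1": ["J2", "J4"], "J2": ["J3"], "J3": [], "J4": []}

# Stand-in for the physical model (PostGIS in the architecture): lon/lat per junction.
PHYSICAL = {
    "J1": (-77.86, 40.79),
    "J2": (-77.85, 40.80),
    "J3": (-77.84, 40.81),
    "J4": (-77.87, 40.78),
}


def downstream(start):
    """Depth-first traversal of the logical model from `start`."""
    seen, stack = [], [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.append(node)
        stack.extend(reversed(LOGICAL.get(node, [])))
    return seen


def handle_trace(params):
    """Stateless handler: the request parameters carry everything needed."""
    start = params.get("start")
    if start not in LOGICAL:
        return 404, json.dumps({"error": "unknown junction"})
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": list(PHYSICAL[j])},
            "properties": {"id": j},
        }
        for j in downstream(start)
    ]
    return 200, json.dumps({"type": "FeatureCollection", "features": features})


status, body = handle_trace({"start": "J2"})
print(status, body)
```

Because the handler keeps no session data between calls, any server instance can answer any request.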