NEW APPLICATIONS OF CIM TO DATA ANALYTICS
European CIM User Group, Amsterdam.
EDF Research & Development Division.
Friday, June 3rd 2016.
THE OBJECTIVE
How can Data Analytics improve the business of a DSO?
What is Data Analytics?
The Data Science Process (« Doing Data Science » - Cathy O’Neil & Rachel Schutt).
New applications of CIM to Data Analytics | June 2016
THE PROJECT
Explore the potential of Data Analytics solutions that are relevant for DSO.
For data collection, storage & cleaning.
For statistical models.
For visualization.
Initiation of a Data Analytics Platform to examine these solutions.
The main components of the platform, so far.
A triplestore (Stardog) to store & process the knowledge of the electrical network.
A graph database (Neo4j) to store & process the topology of the network.
A data historian (OpenTSDB) to store & process the measurements of the network.
Some visualization modules.
And one common language for the whole platform : the CIM.
Partnership with IREQ (Research Institute of Hydro-Québec) on the semantic web.
THE DATA ANALYTICS PLATFORM
[Figure: platform architecture. Sources (real world & DSO reference systems): GIS, Network Operations, Asset Management, Work Management, Metering, Weather Forecast, Customer Support, …, plus OMS, SCADA, WMS, MMS, … Steps: Collect, Process, Clean, Statistical models, Visualize. Components: Triplestore (Stardog), Graph Database (Neo4j), Data Historian (OpenTSDB), Data Visualization.]
THE TRIPLESTORE
THE TRIPLESTORE - PRINCIPLES
Stores & processes the knowledge of the electrical network.
A technology from the Resource Description Framework (RDF) & the semantic web.
Stores the complete network: equipment, assets, locations, organisations...
No data model, just triples: subject, predicate, object.
But a vocabulary (or ontology): we use the CIM.
1/30th of the French MV network in CIM ~ 7 million triples.
Stardog, like most triplestores, can process billions of triples.
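To make the "just triples" idea concrete, here is a minimal sketch in plain Python. It is not the Stardog API: the `ex:` identifiers and the `match` helper are hypothetical, and only the two CIM property names (`IdentifiedObject.name`, `Equipment.EquipmentContainer`) come from the CIM vocabulary.

```python
# Illustrative triples only: the ex: identifiers and values are hypothetical.
triples = [
    ("ex:sub1", "rdf:type", "cim:Substation"),
    ("ex:sub1", "cim:IdentifiedObject.name", "Substation A"),
    ("ex:tr1", "rdf:type", "cim:PowerTransformer"),
    ("ex:tr1", "cim:Equipment.EquipmentContainer", "ex:sub1"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a (subject, predicate, object) pattern.

    None acts as a wildcard, in the spirit of a variable in a SPARQL pattern.
    """
    return [
        (ts, tp, to)
        for ts, tp, to in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# Which equipment is contained in ex:sub1?
contained = [
    s for s, _, _ in match(triples, p="cim:Equipment.EquipmentContainer", o="ex:sub1")
]
print(contained)
```

There is no schema to declare up front: adding a new kind of fact is just adding a triple whose predicate comes from the vocabulary.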
THE TRIPLESTORE - BENEFITS
Many data references in the real world.
Various data models & formats.
The same equipment in the real world can be referenced more than once.
A unique data reference in our platform.
A unique vocabulary (CIM) & a unique format (triples, in Turtle for import).
Triplestores have facilities to clean the data (the sameAs property).
Triplestores have facilities to manage the upgrades of the network (named graphs).
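The effect of sameAs can be sketched outside any triplestore as a grouping of identifiers into equivalence classes. The snippet below is only an illustration of that idea with a union-find structure; it is not how Stardog implements sameAs reasoning, and the identifiers (including the `tr` short name) are hypothetical.

```python
def merge_same_as(same_as_pairs):
    """Map every identifier to one representative of its sameAs group."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smallest identifier as representative.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    return {x: find(x) for x in list(parent)}


# The same HV/MV transformer referenced by three source systems (hypothetical ids).
pairs = [
    ("gis_ids/tr/T42", "scada_ids/tr/9351"),
    ("scada_ids/tr/9351", "gdo_codes/tr/GC7"),
]
canonical = merge_same_as(pairs)
print(canonical)
```

All three references end up pointing to a single representative, which is the "unique data reference" the platform aims for.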
[Figure: a HV/MV transformer in the real world is referenced both in the SCADA (MV network) and in the GIS (MV & LV networks). Each source is converted to CIM (MV network in CIM, LV network in CIM) and loaded into the triplestore (Stardog), where sameAs links the duplicate references.]
THE TRIPLESTORE – CIM BENEFITS
An example of the CIM benefits : equipment naming.
In the real world, there are various representations of the network.
The same equipment can be identified by multiple identifiers & names.
The CIM classes Name & NameType are particularly relevant to assign multiple
identifiers & names to an IdentifiedObject.
We build the mRID of each IdentifiedObject from the NameType & the Name.
[Figure: naming classes.]
Name instances: name=[SITR], name=[NOM], name=[GCO], name=[SIG].
NameType instances:
• name = « scada_names », description = « SCADA names »
• name = « scada_ids », description = « SCADA identifiers »
• name = « gdo_codes », description = « GDO codes »
• name = « gis_ids », description = « GIS identifiers »
IdentifiedObject: mrid = [NameType.name]/[CIM short name]/[Name.name]
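The mRID pattern above can be sketched as a pair of small helpers. The function names are hypothetical; the example value reuses the Substation mRID that appears in the Cypher query later in this presentation, and it is an assumption that only the last part (the Name) may itself contain a slash.

```python
def build_mrid(name_type: str, cim_short_name: str, name: str) -> str:
    """Compose an mRID as [NameType.name]/[CIM short name]/[Name.name]."""
    return f"{name_type}/{cim_short_name}/{name}"


def parse_mrid(mrid: str):
    """Split an mRID back into its three parts.

    maxsplit=2 keeps any '/' inside the Name part intact (assumption).
    """
    name_type, cim_short_name, name = mrid.split("/", 2)
    return name_type, cim_short_name, name


mrid = build_mrid("scada_ids", "sub", "9351_0_20608036")
print(mrid)
print(parse_mrid(mrid))
```

Because the NameType is part of the identifier, the same physical equipment keeps distinct, traceable mRIDs per source system until sameAs links them.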
THE GRAPH DATABASE
THE GRAPH DATABASE - PRINCIPLE
Stores & processes the topology of the electrical network.
A technology from the NoSQL & Big Data trend.
No data model, just nodes & relations.
Nodes can have labels to describe their roles: we use the CIM.
Nodes and relations can have properties: we use the CIM.
1/100th of the French MV & LV network in CIM ~ 300,000 nodes.
THE GRAPH DATABASE - IMPORT
The import of the network in the graph database.
Nodes represent CIM equipment (ConductingEquipment & EquipmentContainer), but not exclusively…
The connectivity of the CIM (ConnectivityNode & Terminal) is replaced by relations.
Only relevant classes & attributes of the CIM are used to label & detail the nodes.
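One possible reading of "the connectivity is replaced by relations" is sketched below: equipment sharing a ConnectivityNode through their Terminals become directly related. This is an assumption about the import, not the actual import code, and the equipment and node identifiers are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical CIM-like input: each Terminal ties one equipment to one ConnectivityNode.
terminals = [
    ("switch_1", "cn_A"),
    ("line_1", "cn_A"),
    ("line_1", "cn_B"),
    ("consumer_1", "cn_B"),
]


def collapse_connectivity(terminals):
    """Turn (equipment, connectivity node) pairs into equipment-to-equipment relations."""
    by_node = defaultdict(set)
    for equipment, connectivity_node in terminals:
        by_node[connectivity_node].add(equipment)

    edges = set()
    for equipment_set in by_node.values():
        # Every pair of equipment attached to the same node becomes one relation.
        for a, b in combinations(sorted(equipment_set), 2):
            edges.add((a, b))
    return sorted(edges)


edges = collapse_connectivity(terminals)
print(edges)
```

The resulting graph has fewer nodes than the CIM model, which keeps topology queries short.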
[Figure: example graph with nodes labeled Substation, Switch, EnergyConsumer, Switch, Junction.]
THE GRAPH DATABASE - BENEFITS
Neo4j can process the graph of the electrical network.
With the Cypher query language.
• A Cypher query that returns the EnergyConsumers contained in a Substation.
• MATCH (s:Substation)-[:CONTAINS]->(e:EnergyConsumer) WHERE s.mrid = 'scada_ids/sub/9351_0_20608036' RETURN e
With traversals in Java for more complex & specific queries.
• To get the equipment powered by a feeder.
• To get the transfer feeders of a feeder.
• To find topological islands in the network.
• To simulate contingencies.
• To evaluate load-flows.
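Two of the traversals listed above (equipment powered by a feeder, topological islands) can be illustrated with a breadth-first search. The platform uses Java traversals in Neo4j; the sketch below is a plain-Python stand-in, not that code, and the network, the equipment names and the treatment of open switches as blocked nodes are all assumptions.

```python
from collections import deque

# Hypothetical network: adjacency lists between equipment.
graph = {
    "feeder_1": ["switch_1"],
    "switch_1": ["feeder_1", "consumer_1", "switch_2"],
    "consumer_1": ["switch_1"],
    "switch_2": ["switch_1", "consumer_2"],
    "consumer_2": ["switch_2"],
    "consumer_3": [],
}
open_switches = {"switch_2"}  # assumed open, so they block the traversal


def reachable(graph, start, blocked=frozenset()):
    """Breadth-first search from start that never enters a blocked node."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen and neighbour not in blocked:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def islands(graph, blocked=frozenset()):
    """Connected components of the network once blocked nodes are removed."""
    remaining = set(graph) - set(blocked)
    components = []
    while remaining:
        component = reachable(graph, min(remaining), blocked)
        components.append(sorted(component))
        remaining -= component
    return components


powered = reachable(graph, "feeder_1", open_switches) - {"feeder_1"}
print(sorted(powered))
print(islands(graph, open_switches))
```

Simulating a contingency then amounts to adding a node to the blocked set and comparing the powered equipment before and after.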
THE DATA HISTORIAN
THE DATA HISTORIAN - PRINCIPLE
Stores & processes time series measured in the electrical network.
OpenTSDB is a data historian layer on top of Hadoop HBase.
Imports raw files, reads & writes through a REST API.
But no real time series were available.
We needed to simulate time series with contingency simulation in Neo4j.
Our simulation tool generates CIM DiscreteMeasurement messages.
Converted & written through the REST API.
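The conversion step can be sketched as follows. The input dictionary is a hypothetical, simplified stand-in for a CIM DiscreteMeasurement message (the real message format is not shown in this presentation), and the metric name and the `mrid` tag are assumptions. The output follows the data point shape accepted by the OpenTSDB HTTP API on `/api/put` (metric, timestamp, value, tags); the snippet only builds the JSON body and does not send it.

```python
import json


def to_opentsdb_datapoint(measurement: dict) -> dict:
    """Convert a simplified measurement (hypothetical format) to an OpenTSDB data point."""
    return {
        "metric": measurement["metric"],
        "timestamp": int(measurement["timestamp"]),  # seconds since the Unix epoch
        "value": measurement["value"],
        "tags": {"mrid": measurement["mrid"]},  # OpenTSDB requires at least one tag
    }


# Hypothetical simulated measurement for a switch position.
sample = {
    "metric": "switch.position",
    "timestamp": 1464912000,  # 2016-06-03 00:00:00 UTC
    "value": 1,
    "mrid": "scada_ids/sub/9351_0_20608036",
}

# /api/put accepts a JSON array of data points as the request body.
payload = json.dumps([to_opentsdb_datapoint(sample)])
print(payload)
```

Tagging each point with the mRID keeps the time series joinable with the equipment stored in the triplestore and the graph database.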
[Figure: the simulation tool, connected to the Graph Database (Neo4j) and the Triplestore (Stardog), sends CIM DiscreteMeasurements to the Data Historian (OpenTSDB).]
DATA VISUALIZATION
DATA VISUALIZATION
Visualization modules for operational projects of the French DSO.
CONCLUSION, LIMITATIONS & NEXT STEPS
CONCLUSION
The CIM is relevant for data collection.
Data Analytics requires the use of many heterogeneous data sources (real world).
The CIM ensures a single & consistent representation of heterogeneous data.
The CIM is relevant for data processing.
The model is flexible enough to be adapted to different kinds of storage.
• For the complete description of the network in the triplestore.
• Just for the description of the topology in the graph database.
The CIM proposes a single & consistent representation of the MV and LV network.
The CIM is relevant for data cleaning.
The CIM naming classes (Name & NameType) are necessary to activate the sameAs property in the triplestore.
But no real time series were available for our platform.
Is the CIM also relevant for time series processing?
NEXT STEPS
The Data Science Profile (« Doing Data Science » - Cathy O’Neil & Rachel Schutt).
So far, we mainly addressed Computer Science, Data Viz & Domain Expertise.
In the next steps, we need to address Statistics & Machine Learning.
Is the CIM also relevant for these domains?