Big Data Access, Analytics and...
Transcript of Big Data Access, Analytics and...
![Page 1: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/1.jpg)
Zhenyu (Henry) Huang, Ph.D., P.E., F.IEEE
Laboratory Fellow/Technical Group Manager
Pacific Northwest National Laboratory (PNNL)
Big Data Access, Analytics and
Sense-Making
Workshop on Data Analytics for the Smart Grid Pullman, WA August 28, 2017
![Page 2: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/2.jpg)
Deployment of a vast new phasor network is
generating unprecedented real-time data
2
April 2007 March 2015
Today – SCADA data Emerging – phasor data Improvement
Variety voltage + current + phase angle, … more information
Velocity 1 sample / 4 seconds 30-120 samples / second ~200x faster
Volume 8 terabytes / year 1.5 petabytes / year ~200x more data
Veracity unseen ms-oscillations oscillations seen at 10ms greater accuracy
![Page 3: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/3.jpg)
Smart devices and 2-way communication
offer new opportunities, greater complexity
3
60,000 Smart Meters
![Page 4: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/4.jpg)
More diverse data add to the complexity
Weather/climate data
(e.g. PNNL ARM* Data)
300 instruments, 2000 data streams 24/7
500 GB/day rising to multiple TBs/day
Curating 20 years’ data
Market/business data
Cyber/communication data
Simulated data
Each contingency scenario generates 0.5M bytes data, adding up to TB
scale
4
Contingency Analysis
Number of scenarios
Serial computing on 1
processor
Parallel computing on
512 processors
Parallel computing on 10,000
processors
WECC N-1 (full) 20,000 4 hours ~30 seconds
469x speed up
WECC N-2 (partial) 153,600 26 hours
~3 minutes 492x speed up
~12 seconds 7877x speed up
*ARM: Atmospheric Radiation Measurement
![Page 5: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/5.jpg)
Data volume comparison:
grid vs. big science data
100 103
KB 106
MB 109
GB 1012
TB 1015
PB
![Page 6: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/6.jpg)
Making data accessible is a big challenge
Organizing and converting data to application specific formats
SQL Tables
NoSQL
CSV
Text files
PI
Time series
Operational or Planning Applications
PTI
1. Connect to n data sources 2. Get Data * n 3. Combine * n 4. Transform * n 5. Analyze
JSON
XML
Redundancy: Underlined steps has to be performed by every application for each type of data
![Page 7: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/7.jpg)
Making data accessible is a big challenge
Operational or Planning Applications
GOSS
1. Connect to n data sources 2. Get Data 3. Combine 4. Transform
1. Connect to GOSS 2. Request Data 3. Analyze
Organizing and converting data to application specific formats
SQL Tables
NoSQL
CSV
Text files
PI
Time series
PTI
JSON
XML
GOSS = GridOPTICS Software System, https://github.com/GridOPTICS/GOSS
![Page 8: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/8.jpg)
GOSSTM: link data to applications
8
https://github.com/GridOPTICS/GOSS
![Page 9: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/9.jpg)
Analytical Challenges in multi-domain data-
driven reasoning
9
![Page 10: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/10.jpg)
Computational challenges in keeping up
with data cycles
Dynamic state estimation: computational performance achieved
30ms target for regional systems (1000s buses).
WECC-size system (16,000 buses) – a 100 TF problem with 48ms
performance on 3,000 cores. Remaining bottleneck is communication.
Memory bandwidth advancements expect to meet the 30ms target.
10
0.0001
0.001
0.01
0.1
1
10
100
1000
10000
0.003 0.030 0.300 3.000 30.000
Tera
FLO
PS
Seconds
Mar 2017
30 ms
Oct 2012
June 2014
Oct 2012
June 2014
Computational time per estimation step
![Page 11: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/11.jpg)
Mathematical challenges in handling non-
Gaussian noise in power grid measurement
Current data applications assume Gaussian noise.
New mathematics needs to be developed and adapted for handling
non-Gaussian noise, such as Particle Filters, Gaussian Mixture
Methods. 11
-0.22 -0.2 -0.18 -0.16 -0.14 -0.12 -0.1
0
20
40
60
80
100
120
Histogram (pValue=0.00%)
Fre
quen
cyMeasurements
pValue=0.00%
Noise extracted from PMU Noise property analysis
![Page 12: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/12.jpg)
Advanced visualization for improving hydro
state awareness (Hydromap)
Modernize displays for
hydro planning and
operations
Develop new, novel
visualization techniques
and paradigms for
analyzing dynamic data
Develop modular
framework for
deploying and
integrating new data
visualizations
12
Current data display in need of modernization
Interactive hydro map as a new visualization paradigm
![Page 13: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/13.jpg)
Multi-dimensional wind visualization (Glyphs)
13
Wind visualization
showing multi-
dimensional data in
glyphs
Wind speed (length
of tail)
Wind direction
(angle of tail)
Generation (size of
head)
Uncertainty (color of
tail)
Forecast variability
Wholesale price
Capacity
Last hour
generation
SCE error code
Generation
difference from
forecast
Wind Visualization/Wind Forecast Visualization
![Page 14: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/14.jpg)
Historical hydro view using radial visualization
14
Compares
hydropower
generation across
different projects
along Columbia
and Snake rivers
Alternative view
shows generation
across different
sources such as
hydropower,
nuclear,
renewables, and
miscellaneous
sources
![Page 15: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/15.jpg)
Data repository for public hosting
15
Data Tools
Web Portal Data Repository
Download
Processing
Generation
Processing
User
Use
existing
datasets
Generate
new datasets
Submit
datasets
Submission
Processing
Validated models & scenarios User-generated
models & scenarios
Existing models & scenarios
Dataset
Generation/
Anonymization
Methods
Dataset
Metrics/
Validation
Download
request
Case
configuration
3rd-party
datasets
Private Datasets
Data Repository
Anonymized Data
Case Published New Case Request
Data Generation
![Page 16: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/16.jpg)
Summary
Grid data complexity is increasing with big volumes,
diverse types, and various attributes.
Such complexity poses significant challenges in data
access, transformation, analytics, sense making.
Math, computing and visualization technologies need to
be developed to meet these challenges.
GOSS as a big data platform.
Multi-domain data reasoning and high performance computing.
Modular visualization for information presentation.
16
![Page 17: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/17.jpg)
Acknowledgement
PNNL Researchers: (Data and Computing) Bora Akyol, Poorva
Sharma, Steve Elbert, Shuangshuang Jin, Bruce Palmer, George
Chin; (Power Engineering) Ruisheng Diao, Yousu Chen, Mark Rice,
Shaobu Wang, Karen Studarus
Former PNNL Researchers: Terrence Critchlow, Ning Zhou, Ning Lu,
Pengwei Du
Funding support by:
PNNL Future Power Grid Initiative (FPGI)
DOE Office of Electricity Delivery and Energy Reliability (OE) Advanced
Grid Modeling Program
DOE Grid Modernization Initiative
DOE Advanced Scientific Computing Research (ASCR) Applied Math
Program
DOE Advanced Research Program Agency – Energy (ARPA-E)
Bonneville Power Administration (BPA)
17
![Page 18: Big Data Access, Analytics and Sense-Makingsgdril.eecs.wsu.edu/wp-content/uploads/2018/02/HenryHuang-Big-Da… · Deployment of a vast new phasor network is generating unprecedented](https://reader033.fdocuments.in/reader033/viewer/2022042322/5f0c759b7e708231d4358226/html5/thumbnails/18.jpg)
Questions?
18
Zhenyu (Henry) Huang, Ph.D., P.E., F.IEEE
Laboratory Fellow/Technical Group Manager
Pacific Northwest National Laboratory
Further Information: GridOPTICS: http://gridoptics.pnnl.gov/ GridOPTICS™ Software System (GOSS): https://github.com/GridOPTICS/GOSS Interactive Visualization and Demo Center: http://vis.pnnl.gov/