HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower
cloudera-inc -
Transcript of HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower
![Page 2: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/2.jpg)
My life with HBase
Drawn to Scale, Opower, Cloudera, Factset
![Page 3: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/3.jpg)
About Opower
Opower is a customer engagement platform for the utility industry
![Page 4: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/4.jpg)
About Opower
Home energy reports
Customized utility bills
Energy efficiency programs for utilities
![Page 5: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/5.jpg)
About Opower
Opower runs on analytics
Analytics run on Hadoop + HBase
![Page 6: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/6.jpg)
Opower analysis relies on data from a variety of sources
» Electric Utility Usage Data
» Gas Utility Usage Data
» Thermostat data
» Weather data
OPOWER Platform: Data Storage & Processing, Disaggregation Algorithms, Shared Energy Signature Repository
![Page 7: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/7.jpg)
Opower’s first architecture could not support their analytic vision
MySQL
Scalability? Performance? Data integration?
![Page 8: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/8.jpg)
Opower’s first architecture could not support their analytic vision
Analytic workflow instead of analytic apps:
SQL -> CSV -> R -> too little, too slow
![Page 9: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/9.jpg)
Problem #1 Data Lake Cost
Usage AMI Regional AMI Sensor Data Data Lake
![Page 10: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/10.jpg)
Problem #2 Slower and slower queries
Smart-grid-scale data
Lots of supporting data: weather, demographics, etc.
![Page 11: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/11.jpg)
Problem #3 It was taking lots of “magic”
Intense analytics
Strange schemas
Segmented queries
![Page 12: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/12.jpg)
Hadoop + HBase at Opower
Opower determined that they needed an entirely new data architecture
![Page 13: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/13.jpg)
NexGen Architecture @ Opower
![Page 14: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/14.jpg)
Hadoop + HBase at Opower
Early success: HBase AMI
![Page 15: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/15.jpg)
What rocked
Endless, cheap scalability
![Page 16: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/16.jpg)
What rocked
The analytics team loved it!
![Page 17: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/17.jpg)
What sucked
Hard on the ops team – still trying to grok it
![Page 18: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/18.jpg)
What sucked
NoSchema p1.
Creating Schema
Managing MetaData
Schema <=> Performance
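To make the "Schema <=> Performance" point concrete: in a store that keeps rows sorted by key, the row-key layout largely decides which reads are cheap. The sketch below is only an illustration and is not taken from the talk. The `SortedStore` class is a toy in-memory stand-in for a sorted table, and the key layout (`meter_id#zero-padded timestamp`), the meter IDs, and the readings are all assumptions made up for the example.

```python
import bisect


def row_key(meter_id: str, epoch_seconds: int) -> bytes:
    # Hypothetical key layout: meter id, then a fixed-width timestamp,
    # so that byte order matches time order within one meter.
    return f"{meter_id}#{epoch_seconds:010d}".encode("ascii")


class SortedStore:
    """Toy stand-in for a table whose rows are kept sorted by key."""

    def __init__(self):
        self._keys = []    # sorted list of row keys
        self._values = {}  # row key -> reading

    def put(self, key: bytes, value: float) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan(self, start: bytes, stop: bytes):
        # Range scan over [start, stop): only touches the matching slice.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


store = SortedStore()
for meter, ts, kwh in [("m2", 1000, 0.4), ("m1", 3600, 1.2),
                       ("m1", 0, 0.9), ("m1", 7200, 1.1)]:
    store.put(row_key(meter, ts), kwh)

# All readings for meter m1 in the first two hours, as one contiguous scan.
readings = store.scan(row_key("m1", 0), row_key("m1", 7200))
```

With this layout, "one meter over a time range" is a contiguous slice of keys, while "all meters at one instant" would have to touch every meter's range. Putting the timestamp first would reverse that trade-off, which is one way a schema choice turns into a performance choice.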
![Page 19: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/19.jpg)
What sucked
HA
Failover
Snapshots
![Page 20: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/20.jpg)
What sucked
No secondary index
Aggregation is slow (Rollup/OLAP)
Poor Client Performance
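One common workaround when a store has no secondary index is for the application to maintain its own index table, keyed by the indexed value followed by the primary key. The sketch below shows that general pattern only; it is not described in the talk and is not a claim about what Opower built. The `Table` class, the table names, and the customer and ZIP values are hypothetical, in-memory stand-ins.

```python
import bisect


class Table:
    """Toy sorted key-value table (in-memory stand-in, not a real client)."""

    def __init__(self):
        self._keys = []
        self._rows = {}

    def put(self, key: str, row: dict) -> None:
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = row

    def get(self, key: str):
        return self._rows.get(key)

    def prefix_scan(self, prefix: str):
        # Return every key that starts with the prefix, in sorted order.
        i = bisect.bisect_left(self._keys, prefix)
        out = []
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            out.append(self._keys[i])
            i += 1
        return out


customers = Table()  # primary table, keyed by customer id
by_zip = Table()     # hand-maintained index, keyed by "zip#customer_id"


def put_customer(cid: str, zip_code: str) -> None:
    # The application writes both rows; nothing keeps them in sync for us.
    customers.put(cid, {"zip": zip_code})
    by_zip.put(f"{zip_code}#{cid}", {})


def customers_in_zip(zip_code: str):
    prefix = f"{zip_code}#"
    return [key[len(prefix):] for key in by_zip.prefix_scan(prefix)]


put_customer("c3", "22201")
put_customer("c1", "22201")
put_customer("c2", "94105")
```

The cost is that every write becomes two writes, and this sketch never deletes an old index entry when a customer's ZIP changes, so a real version would also have to handle stale entries and partial failures.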
![Page 21: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/21.jpg)
It would be better if only …
Developers were not forced to know how the data is stored, indexed, etc.
![Page 22: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/22.jpg)
It would be better if only …
There were nicer APIs and better query languages (SQL?)
![Page 23: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/23.jpg)
It would be better if only …
Version migrations were easy
Hierarchical Tables
![Page 24: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/24.jpg)
It would be better if only …
Real-time tuning
![Page 25: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/25.jpg)
It would be better if only …
Did I mention HA?
![Page 26: HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environment - OPower](https://reader036.fdocuments.in/reader036/viewer/2022070319/5581d34cd8b42ae06c8b5401/html5/thumbnails/26.jpg)
In summary
HBase has helped Opower achieve their analytic vision
But they’ve still got a long way to go
HBase still has a long way to go