Self-Service Analytics on Hadoop: Lessons Learned

Self-Service Analytics on Hadoop: Lessons Learned June 29, 2016 Drew Leamon Director – Advanced Technology Solutions

Transcript of Self-Service Analytics on Hadoop: Lessons Learned

Page 1: Self-Service Analytics on Hadoop: Lessons Learned

Self-Service Analytics on Hadoop: Lessons Learned

June 29, 2016

Drew Leamon

Director – Advanced Technology Solutions

Page 2: Self-Service Analytics on Hadoop: Lessons Learned

Comcast: Shaping the Future of Media and Technology

High Speed Internet

Page 3: Self-Service Analytics on Hadoop: Lessons Learned

Forecast

Engineering Design

Budget

Engineering Analysis: Global Central Analysis Team

Page 4: Self-Service Analytics on Hadoop: Lessons Learned

Animals are Best Suited in Their Native Habitat

Page 5: Self-Service Analytics on Hadoop: Lessons Learned

Spreadsheets: The Natural Habitat of Analysts

Page 6: Self-Service Analytics on Hadoop: Lessons Learned

Evolution of Self Service Analytics

SSRS

Page 7: Self-Service Analytics on Hadoop: Lessons Learned

Self Service: Native Habitat

Limitations of the Spreadsheet Native Habitat

• 1 Million Row Max

Self Service
• Not Even Medium Data
• Not Collaborative
• No Automation
• Not Repeatable

IT Analyst

Page 8: Self-Service Analytics on Hadoop: Lessons Learned

Self Service: How We Started

Analyst goes to IT, makes a request, and waits weeks for results

SSRS

• 10 TB Storage
• 1 Compute Node

Not Self Service
• 10 TB (Medium Data)
• Limited Compute
• IT Hand-off
• Consultative service
• Not self service

IT Analysts

Page 9: Self-Service Analytics on Hadoop: Lessons Learned

Bigger database still meant building dashboards for team

IT Analysts

Still Not Self Service
• 100s of TBs (Large Data)
• Data silos
• IT Hand-off
• Consultative service
• Analysts not SQL experts

Graduated to Specialized Databases

• Clustered Storage
• Columnar Compression
• Clustered Compute

Page 10: Self-Service Analytics on Hadoop: Lessons Learned

Datameer, native on Hadoop, enables self-service for big data

Analysts

True Self Service
• PB == Big Data
• Data Lake
• Excel-like UI
• No more waiting for IT

Self Service: The New Way

• Clustered Storage
• Columnar Compression
• Clustered Compute
• Liberated Data

Page 11: Self-Service Analytics on Hadoop: Lessons Learned


Multiple Configurations for Big Data

Page 12: Self-Service Analytics on Hadoop: Lessons Learned


Engineering Analysis

IP Telephony

Video Research

IP Video Engineering

X1 Operations

Advanced Advertising

Web Analytics

Enterprise Business Intelligence

Network Engineering

Mature

Evolving

On-Boarded

On-Deck

Expanding Use Cases with Datameer

Page 13: Self-Service Analytics on Hadoop: Lessons Learned

Use Case #1: Comcast Digital Voice

Page 14: Self-Service Analytics on Hadoop: Lessons Learned

One Of The Largest IP Telephony Networks

Page 15: Self-Service Analytics on Hadoop: Lessons Learned

Anonymized Call Detail Records (CDR) Data Set

Data complexity from network

Data size: TBs/month

Page 16: Self-Service Analytics on Hadoop: Lessons Learned

Discovered Unusual Patterns

Noticed large spikes for high cost areas
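The slide does not say how the spikes were spotted. As an illustration only, here is a minimal Python sketch of one simple approach: flag any period whose call volume exceeds a multiple of the median. The function name `find_spikes`, the `factor` threshold, and the sample figures are all hypothetical and are not taken from the talk.

```python
from statistics import median

def find_spikes(volume_by_period, factor=3.0):
    """Return the periods whose volume exceeds factor * median volume.

    volume_by_period: dict mapping a period label to a call volume.
    factor: hypothetical threshold multiplier, chosen for illustration.
    """
    if not volume_by_period:
        return []
    baseline = median(volume_by_period.values())
    return [p for p, v in volume_by_period.items() if v > factor * baseline]

# Made-up daily call minutes to one destination area.
sample = {"day1": 100, "day2": 110, "day3": 95, "day4": 900, "day5": 105}
spikes = find_spikes(sample)
```

The median is used as the baseline here because a single large spike barely moves it, whereas it would inflate a mean.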

Page 17: Self-Service Analytics on Hadoop: Lessons Learned

Hypothesis: Network Abuse

Page 18: Self-Service Analytics on Hadoop: Lessons Learned

Analysis Shows Traffic Concentrated in a Few Accounts

30% of this traffic was coming from three accounts.
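A concentration figure like this can be computed by totaling traffic per account and dividing the top few accounts' total by the overall total. The Python sketch below assumes records of the form (anonymized account id, call minutes); that record layout, the function name `top_account_share`, and the sample values are hypothetical and do not reflect the actual CDR schema or the analysis tool used in the talk.

```python
from collections import Counter

def top_account_share(records, n=3):
    """Return (top_accounts, share), where share is the fraction of all
    minutes attributable to the n highest-volume accounts."""
    minutes_by_account = Counter()
    for account_id, minutes in records:
        minutes_by_account[account_id] += minutes
    total = sum(minutes_by_account.values())
    if total == 0:
        return [], 0.0
    top = minutes_by_account.most_common(n)
    share = sum(minutes for _, minutes in top) / total
    return [account for account, _ in top], share

# Made-up records: (anonymized account id, call minutes).
sample = [("a1", 150), ("a2", 100), ("a3", 50), ("a4", 300), ("a5", 400)]
accounts, share = top_account_share(sample, n=3)
```

With the made-up sample, the three largest accounts hold 850 of 1,000 minutes, so `share` is 0.85.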

Page 19: Self-Service Analytics on Hadoop: Lessons Learned

Ongoing Monitoring of Future Abuse

Analyst scheduled a Tableau Data Extract and built a Tableau dashboard. Now the business can keep an eye out for further abuse.

Page 20: Self-Service Analytics on Hadoop: Lessons Learned

Result: Future Abuse Prevented and More

• Abuse detected
• Analysts empowered
• Resources saved
• No IT hand-off
• Value to organization
• Automated and repeatable

Page 21: Self-Service Analytics on Hadoop: Lessons Learned


Engineering Analysis

IP Telephony

Video Research

IP Video Engineering

X1 Operations

Advanced Advertising

Web Analytics

Enterprise Business Intelligence

Network Engineering

Mature

Evolving

On-Boarded

On-Deck

Expanding Use Cases with Datameer

Page 22: Self-Service Analytics on Hadoop: Lessons Learned


Use Case #2: Customer Perspective

How to measure customer experience from the customer perspective

Page 23: Self-Service Analytics on Hadoop: Lessons Learned


Millions of Viewing Experiences

Page 24: Self-Service Analytics on Hadoop: Lessons Learned


Improved Customer Experience through Data Analytics

Findings / Analysis

Best Practices

Improved Customer Experience

Data-driven scheduling

Dataflow Automation

Page 25: Self-Service Analytics on Hadoop: Lessons Learned

Solution:


- Build views quickly & aggregate large datasets.

- Early visibility of data in Hadoop

Analyze

- Create repeatable processes through automated workflow

• Aggregations of large datasets from disparate data sources: RDBMS, HDFS, APIs

• Data Joins / Data Quality Checks / Pipeline between clusters

Blend

Share

Insights
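The "Data Joins / Data Quality Checks" step above is done in the talk through a spreadsheet-style UI rather than hand-written code. Purely as an illustration of the idea, the Python sketch below joins two record sets on a shared key and reports rows that fail basic quality checks (missing key, no matching row). The function name, field names, and sample rows are hypothetical.

```python
def join_with_quality_check(left_rows, right_rows, key):
    """Inner-join two lists of dicts on `key`.

    Returns (joined, unmatched, missing_key):
      joined      - merged rows where the key exists on both sides
      unmatched   - left rows whose key has no match on the right
      missing_key - left rows that lack the key entirely
    """
    right_index = {r[key]: r for r in right_rows if r.get(key) is not None}
    joined, unmatched, missing_key = [], [], []
    for row in left_rows:
        k = row.get(key)
        if k is None:
            missing_key.append(row)
        elif k in right_index:
            # Left-side values win if both sides share a field name.
            joined.append({**right_index[k], **row})
        else:
            unmatched.append(row)
    return joined, unmatched, missing_key

# Made-up rows standing in for two different sources.
sessions = [
    {"device_id": "d1", "errors": 2},
    {"device_id": "d9", "errors": 0},
    {"errors": 5},
]
devices = [{"device_id": "d1", "model": "m-a"}]
joined, unmatched, missing_key = join_with_quality_check(
    sessions, devices, "device_id"
)
```

Returning the rejected rows alongside the joined ones, instead of silently dropping them, is what makes the check repeatable: the same counts can be compared from run to run.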

Page 26: Self-Service Analytics on Hadoop: Lessons Learned


Result: Data-driven Customer Viewing Experience Enhancements

Customer Experience Improved

• Analysts empowered
• Capital Spend Directed Intelligently
• No IT hand-off
• Value to organization
• Automated and repeatable

Page 27: Self-Service Analytics on Hadoop: Lessons Learned