Big Data Case Study: Fortune 100 Telco
bluedata-inc
Transcript of Big Data Case Study: Fortune 100 Telco
Big Data Case Study Example
Fortune 100 Media / Telco Company
Business Goal
• Big Data analytics to improve customer experience
• Provide daily insights to internal and external teams
• Sandbox environment to support ad-hoc analysis
• Isolated environments for external content providers
Key Challenges
• Limited IT resources and skill sets in Hadoop and Spark
• Administrative overhead managing existing Big Data environments
• Onboarding multiple internal and external user groups

Big Data Infrastructure = Complex and Expensive
[Diagram: IT supporting External Content Providers, Other Internal Teams, and Data Scientists and Developers]
• Management complexity
• Duplication of data
• Cluster sprawl
• < 30% utilization
Going Forward: Two Options Considered
Expand on-premises Hadoop infrastructure
• Ongoing management of physical servers
• Multi-tenancy required for external providers
• Significant IT overhead
Move to AWS Elastic MapReduce
• Hadoop-as-a-Service offers simplicity and agility
• Internal security policies are a barrier
• Ongoing TCO of AWS cloud services
• Data is on-premises, difficult to copy or move

Option 1: Expand On-Premises Infrastructure
[Diagram, three layers]
Big Data Applications & Users
• Marketing, Sales, Support, Data Scientists, Developers, External Content Providers
• Hue Console (Hadoop jobs)
• BI Tool(s)
• Custom Web App ($$) (security, access control & onboarding)
• User administration (AD/LDAP), shown twice in the diagram
• Advanced administration: groups/queues/schedulers
Big Data Infrastructure
• Hadoop Cluster (~15 nodes), converted to production from pilot
• New Physical Nodes ($$) to increase performance & capacity
• New Physical Nodes ($) for BI/ETL tools
• Dev/Test Cluster: New Physical Nodes ($)
• Utilization < 20%
Existing Data
• NFS, Database, Other
• Physical data copy/duplication

Solution and Benefits
A third option: Hadoop-as-a-Service on-premises
• Infrastructure software platform (BlueData) for Hadoop and Spark
Self-service, on-demand virtual clusters
• Amazon EMR-like experience
• Agility and speed for data scientists
• IT infrastructure efficiency, higher utilization
Secure and multi-tenant architecture
• Eliminate complexities and pitfalls of multiple isolated physical clusters
• Stronger isolation and greater flexibility, no data duplication

Option 3: Deploy BlueData EPIC Software Platform
[Diagram, three layers]
Big Data Applications & Users
• Data Scientists and Developers, Other Internal Teams, External Content Providers
• Web UI: multi-tenant, role-based access control
• User administration (AD/LDAP)
Big Data Infrastructure
• EPIC Platform ($)
• Internal Team, Tenant 1: virtual Hadoop cluster, Hue console + BI tools
• Content Provider, Tenant 2: virtual Hadoop cluster, Hue console + BI tools
• Content Provider, Tenant 3: virtual Hadoop cluster, Hue console + BI tools
• Hadoop Cluster (~15 nodes), converted to production from pilot
• New Physical Nodes ($), performance optimized (CPU & memory)
Existing Data
• NFS, Gluster, Other
• In-place access
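The tenant layout in the Option 3 diagram (one internal tenant, two content-provider tenants, role-based access, in-place access to shared data) can be sketched as a small data model. This is a minimal illustration only: the class names, role names, user names, and the example URI are all hypothetical and do not reflect the actual BlueData EPIC API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class DataSource:
    """An existing enterprise data store that tenants reference in place."""
    name: str  # e.g. "NFS" (from the slide)
    uri: str   # hypothetical location; nothing is copied per tenant


@dataclass
class Tenant:
    """An isolated group of users with its own virtual cluster (hypothetical model)."""
    name: str
    members: Dict[str, str] = field(default_factory=dict)  # user -> role
    data_sources: List[DataSource] = field(default_factory=list)

    def can_launch_cluster(self, user: str) -> bool:
        # Assumed roles: only "admin" and "member" may provision clusters.
        return self.members.get(user) in {"admin", "member"}


# One shared data store, referenced by two tenants rather than duplicated.
shared = DataSource(name="NFS", uri="nfs://example.invalid/shared")
internal = Tenant("Internal Team", {"alice": "admin"}, [shared])
provider = Tenant("Content Provider 2", {"bob": "member"}, [shared])

print(internal.can_launch_cluster("alice"))  # alice has a role in her own tenant
print(provider.can_launch_cluster("alice"))  # alice has no role in the provider tenant
```

Keeping users and roles inside each tenant is what provides isolation in this sketch, while both tenants holding a reference to the same `DataSource` mirrors the "no data duplication" claim.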

Big Data Case Study – Example Benefits
• Significantly lower costs (~70%) – less hardware required for dev/test cluster and BI / analytical tools
• Reduced administrative overhead – simpler user management and administration, eliminated data copying
• Speed and self-service – on-demand provisioning of virtual Hadoop and Spark clusters
• Higher utilization – consolidation ratio of 8:1 between virtual and physical servers
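The 8:1 consolidation ratio lends itself to a back-of-the-envelope sizing estimate. The sketch below is illustrative: the function name and the example node counts are assumptions, not figures from the case study. Only the 8:1 ratio comes from the slide.

```python
import math


def physical_hosts_needed(virtual_nodes: int, consolidation_ratio: int = 8) -> int:
    """Estimate the physical servers needed to host a number of virtual nodes.

    The default ratio of 8 virtual nodes per physical server is taken from
    the slide; real sizing would also depend on CPU, memory, and I/O.
    """
    if virtual_nodes < 0 or consolidation_ratio <= 0:
        raise ValueError("virtual_nodes must be >= 0 and consolidation_ratio > 0")
    return math.ceil(virtual_nodes / consolidation_ratio)


# Hypothetical example: three tenants, each with a 15-node virtual cluster.
hosts = physical_hosts_needed(3 * 15)
print(hosts)  # 45 virtual nodes at 8:1 -> 6 physical servers
```

Rounding up matters here: a partially filled host is still a whole server to buy and manage.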

Big Data Infrastructure Made Easy
[Diagram: External Content Providers, Other Internal Teams, and Data Scientists and Developers sharing one EPIC Platform over existing storage]
EPIC Platform
• ElasticPlane™: Self-service, multi-tenant clusters
• DataTap™: In-place access to enterprise data stores
• IOBoost™: Extreme performance and scalability
Storage
• Gluster, HDFS, Swift, NFS
Results
• Utilization > 90%
• Simplified management
• No duplication of data
• No cluster sprawl