Post on 13-Dec-2014
Big Data at CallFire
Vijesh Mehta (Co-Founder and CTO)
• A little about CallFire
• CallFire’s technical challenges
• How CallFire deals with data
• Summary
Agenda
• I am one of the founders of CallFire.
  – Started in 2005 in a small apartment
  – Now 50 people
• I’ve been writing software, primarily in the Java space, for 12 years. CallFire is all Java.
  – We use: Wicket, Guice, Hibernate, MySQL, Cassandra, ActiveMQ, Xen, Puppet
Some background about myself
• We are a cloud telephony provider.
  – Outbound phone calls
  – Phone numbers
  – SMS through long and short codes
  – IVR (Interactive Voice Response)
  – Power dialing
• CallFire’s call volume can get large very quickly.
  – Hurricane Sandy: 1.9 million emergency calls
• 4 Engineers and 1 System admin managing operations and new features.
• We just hired 7 more engineers this year, and still hiring!
About CallFire
• 1.4 billion calls and texts
  – Growing exponentially
• Over 50,000 accounts
• Over 6 million campaigns
• 80 million sound files
• 14 TB in storage (NFS)
• MySQL: Over 10,000 qps at peak
Big data isn’t always a big-company problem!
Technical Challenges by Numbers
[Chart: Campaigns over Time; vertical axis from 0 to 7,000,000 campaigns]
Growing faster each day
The first challenge
• Problem: We outgrew our datacenter. New systems need access to central storage, with replication across a 1 Gb/s interconnect.
• Needed solution:
  – Must work across datacenters
  – Must scale as demand increases
  – Must be fault tolerant
  – Must deal with over 80 million sound files
  – The cheaper the better
Solutions Considered (2010)
| Criterion | NFS | Gluster | HDFS | Cassandra |
|---|---|---|---|---|
| Fault tolerant | Yes, if configured | Yes | Yes | Yes |
| Datacenter replication | Maybe. Rsync isn’t fun with lots of files. | Not at the time | Yes | Yes |
| Easy to add storage | No | Not at the time | Yes | Yes |
| No single point of failure | No | Yes | Not exactly, NameNode. | Yes |
| Data always accessible easily | No, hard to sort through file systems. | No, same as a file system | Yes | Yes |
| Notes | Not working for us. Too much management and downtime. | Looks good, tried it for a while. Easy at first because it was a file system. | Didn’t like the NameNode issue. May have been a good way to go. | Everything we need, quick to learn. We went all in! |
* Only LAN solutions were considered. Calls had too much latency in the cloud, or even across datacenters.
• Storage isn’t the best use of Cassandra.
• Do not exceed 50% of drive space.
  – Compaction needs the space. Hard lesson learned.
• Fault tolerance: Replication factor of 3.
• Result
  – 1 TB of data = 6 TB of storage needed!
  – CallFire has a 120 TB Cassandra cluster
Cassandra
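The "1 TB of data = 6 TB of storage" figure follows from the two rules on this slide: every byte is stored 3 times (replication factor 3), and drives are kept at most 50% full so compaction has room. A minimal sketch of that arithmetic is below. The class and method names are made up for illustration, and the formula is only the simple reading of the slide, not a full Cassandra sizing model.

```java
// Hypothetical helper illustrating the sizing rule from the slide:
// raw storage = data * replicationFactor / maxDiskFill.
final class CassandraCapacity {

    /** Raw disk (in TB) needed to hold dataTb of unique data. */
    static double rawStorageTb(double dataTb, int replicationFactor, double maxDiskFill) {
        checkInputs(dataTb, replicationFactor, maxDiskFill);
        return dataTb * replicationFactor / maxDiskFill;
    }

    /** Unique data (in TB) that fits into rawTb of disk under the same rule. */
    static double usableDataTb(double rawTb, int replicationFactor, double maxDiskFill) {
        checkInputs(rawTb, replicationFactor, maxDiskFill);
        return rawTb * maxDiskFill / replicationFactor;
    }

    private static void checkInputs(double tb, int replicationFactor, double maxDiskFill) {
        if (tb < 0 || replicationFactor < 1 || maxDiskFill <= 0 || maxDiskFill > 1) {
            throw new IllegalArgumentException("invalid sizing inputs");
        }
    }

    public static void main(String[] args) {
        // Values from the slide: replication factor 3, drives at most 50% full.
        System.out.println(rawStorageTb(1.0, 3, 0.5));   // prints 6.0
        System.out.println(usableDataTb(120.0, 3, 0.5)); // prints 20.0
    }
}
```

Under the same assumptions, a 120 TB cluster has room for roughly 20 TB of unique data before the 50% limit is reached.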
• We like SQL and Hibernate.
  – Pros: Easy, Flexible, Ad-Hoc Queries, Locks
  – Cons: Scaling
• Solution: Sharding with Cassandra for universal data
Extending the scope
[Diagram: Shard 1, Shard 2, Shard 3; Cassandra Cluster]
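One way to picture the split between sharded SQL data and universal data is sketched below. The slides do not say how CallFire assigns accounts to shards, so the modulo-on-account-ID rule, the `ShardRouter` class, the shard names, and the login-to-account map (standing in for a Cassandra lookup of authentication data) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical router: universal data (login -> account ID) lives in one
// shared store, while everything else lives in the account's SQL shard.
final class ShardRouter {
    private final List<String> shardNames;
    // Stand-in for the universal store; in the deck this role is played by Cassandra.
    private final Map<String, Long> universalAccounts;

    ShardRouter(List<String> shardNames, Map<String, Long> universalAccounts) {
        if (shardNames.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardNames = new ArrayList<>(shardNames);
        this.universalAccounts = new HashMap<>(universalAccounts);
    }

    /** Assumed placement rule: account ID modulo shard count (never negative). */
    int shardIndexFor(long accountId) {
        return (int) Math.floorMod(accountId, (long) shardNames.size());
    }

    /** Resolves a login through the universal store, then picks the shard. */
    String shardFor(String login) {
        Long accountId = universalAccounts.get(login);
        if (accountId == null) {
            throw new IllegalArgumentException("unknown login: " + login);
        }
        return shardNames.get(shardIndexFor(accountId));
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(
                List.of("shard-1", "shard-2", "shard-3"),
                Map.of("alice@example.com", 7L));
        System.out.println(router.shardFor("alice@example.com")); // prints shard-2
    }
}
```

The point of the sketch is that the login lookup never touches a shard: it needs to work before the shard is known, which is why that data has to live somewhere every request can reach.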
• Cassandra makes sharding easier
  – Easy to store universal data. (Authentication)
  – Performs very well
• Tungsten Replicator (Big Data with SQL)
  – Sharding makes cross-shard joins impossible, so fan your data into central places.
  – NoSQL can’t handle ad-hoc queries. No worries, you can still have SQL.
Sharding + Big Data
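The fan-in idea can be illustrated with plain collections: rows from every shard are copied into one central place, and the ad-hoc question is asked there instead of on the shards. The sketch below is a conceptual stand-in only. It does not use or imitate Tungsten Replicator's API, and the `CampaignRow` type, its fields, and the sample numbers are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual fan-in: many shards feed one central store used for ad-hoc queries.
final class FanInSketch {

    /** One row of a hypothetical per-shard campaigns table. */
    static final class CampaignRow {
        final int shardId;
        final long accountId;
        final long callsPlaced;

        CampaignRow(int shardId, long accountId, long callsPlaced) {
            this.shardId = shardId;
            this.accountId = accountId;
            this.callsPlaced = callsPlaced;
        }
    }

    /** Copies every shard's rows into one list, standing in for a central reporting database. */
    static List<CampaignRow> fanIn(List<List<CampaignRow>> shards) {
        List<CampaignRow> central = new ArrayList<>();
        for (List<CampaignRow> shard : shards) {
            central.addAll(shard);
        }
        return central;
    }

    /** An ad-hoc question that no single shard can answer alone. */
    static long totalCalls(List<CampaignRow> central) {
        long total = 0;
        for (CampaignRow row : central) {
            total += row.callsPlaced;
        }
        return total;
    }

    public static void main(String[] args) {
        List<CampaignRow> shard1 = List.of(new CampaignRow(1, 10, 500), new CampaignRow(1, 13, 250));
        List<CampaignRow> shard2 = List.of(new CampaignRow(2, 11, 1000));
        List<CampaignRow> central = fanIn(List.of(shard1, shard2));
        System.out.println(central.size());      // prints 3
        System.out.println(totalCalls(central)); // prints 1750
    }
}
```

In a real deployment the copy step would be continuous replication into a SQL database rather than a one-off in-memory merge, which is what keeps joins and ad-hoc SQL available on the central copy.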
• Not just for big companies: data grows rapidly in today’s environment.
  – Nice article about Obama’s data crunchers:
  – http://swampland.time.com/2012/11/07/inside-the-secret-world-of-quants-and-data-crunchers-who-helped-obama-win/
• NoSQL systems have easier scaling and fault tolerance mechanisms.
  – Not uncommon to see small teams with 10-20 node clusters.
• SQL is still a big part of the equation. (Tungsten)
  – Fan in information across partitions
  – Replicate across datacenters
  – Keep your ad-hoc dreams alive!
Big Data Summary
Passive / Archived Storage
http://www.protocase.com/products/index.php?e=Backblaze
Backblaze – $5,300 for empty case. Holds 45 Drives (117TB usable space)