Big Data at CallFire

Vijesh Mehta (Co-Founder and CTO)

• A little about CallFire

• CallFire’s technical challenges

• How CallFire deals with data

• Summary

Agenda

• I am one of the founders of CallFire.
  – Started in 2005 in a small apartment
  – Now 50 people

• I’ve been writing software, primarily in the Java space, for 12 years. CallFire is all Java.
  – We use: Wicket, Guice, Hibernate, MySQL, Cassandra, ActiveMQ, Xen, Puppet

Some background about myself

• We are a cloud telephony provider.
  – Outbound phone calls
  – Phone numbers
  – SMS through long and short codes
  – IVR (Interactive Voice Response)
  – Power dialing

• CallFire’s call volume can get large very quickly.
  – Hurricane Sandy: 1.9 million emergency calls

• 4 engineers and 1 system administrator manage operations and build new features.

• We hired 7 more engineers this year, and we are still hiring!

About CallFire

• 1.4 billion calls and texts
  – Growing exponentially

• Over 50,000 accounts

• Over 6 million campaigns

• 80 million sound files

• 14 TB in storage (NFS)

• MySQL: over 10,000 qps at peak

Big data isn’t always a big-company problem!

Technical Challenges by Numbers

[Chart: Campaigns over Time; axis from 1,000,000 to 7,000,000 campaigns]

Growing faster each day

The first challenge

• Problem: We outgrew our datacenter. New systems need access to central storage, with replication across a 1 Gb/s interconnect.

• Needed solution:
  – Must work across datacenters
  – Must scale as demand increases
  – Must be fault tolerant
  – Must deal with over 80 million sound files
  – The cheaper the better

Solutions Considered (2010)

| | NFS | GLUSTER | HDFS | CASSANDRA |
|---|---|---|---|---|
| Fault tolerant | Yes, if configured | Yes | Yes | Yes |
| Datacenter replication | Maybe. Rsync isn’t fun with lots of files. | Not at the time | Yes | Yes |
| Easy to add storage | No | Not at the time | Yes | Yes |
| No single point of failure | No | Yes | Not exactly, NameNode. | |
| Data always accessible easily | No, hard to sort through file systems. | No, same as a file system | Yes | Yes |
| Notes | Not working for us. Too much management and downtime. | Looks good, tried it for a while. Easy at first because it was a file system. | Didn’t like the NameNode issue. May have been a good way to go. | Everything we need, quick to learn. We went all in! |

* Only LAN solutions were considered. Calls had too much latency in the cloud, or even across datacenters.

• Storage isn’t the best use of Cassandra.

• Do not exceed 50% of drive space.
  – Compaction needs the space. Hard lesson learned.

• Fault Tolerance: Replication factor of 3.

• Result
  – 1 TB of data = 6 TB of storage needed (3 replicas, each on drives kept at most 50% full)
  – CallFire has a 120 TB Cassandra cluster

Cassandra
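The 1 TB to 6 TB figure on this slide follows from the two rules stated with it: a replication factor of 3 and a 50% ceiling on drive usage. A minimal sketch of that arithmetic follows; the function and parameter names are illustrative, not part of Cassandra or of CallFire's tooling.

```python
def raw_storage_needed_tb(data_tb, replication_factor=3, max_disk_utilization=0.5):
    """Raw disk (TB) needed to hold data_tb of data under the slide's two rules.

    replication_factor: number of copies kept of every piece of data.
    max_disk_utilization: fraction of each drive allowed to fill up
    (0.5 leaves the other half free for compaction).
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if not 0 < max_disk_utilization <= 1:
        raise ValueError("max_disk_utilization must be in (0, 1]")
    return data_tb * replication_factor / max_disk_utilization


def data_capacity_tb(raw_tb, replication_factor=3, max_disk_utilization=0.5):
    """Inverse calculation: how much data a given amount of raw disk can hold."""
    return raw_tb * max_disk_utilization / replication_factor


print(raw_storage_needed_tb(1))   # 6.0, matching "1 TB of data = 6 TB of storage"
print(data_capacity_tb(120))      # 20.0
```

The second result assumes the 120 TB figure is raw disk, which the slides do not state.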

• We like SQL and Hibernate.
  – Pros: easy, flexible, ad-hoc queries, locks
  – Cons: scaling

• Solution: Sharding with Cassandra for universal data

Extending the scope

[Diagram: Shard 1, Shard 2, Shard 3, and a Cassandra Cluster]
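The split between per-account shards and a shared store for universal data can be sketched as follows. This is an illustration under stated assumptions, not CallFire's implementation: a plain dictionary stands in for the Cassandra cluster, accounts are assumed to be assigned to shards by hashing the account ID, and every name here is hypothetical.

```python
import hashlib
import hmac

SHARD_COUNT = 3  # the three SQL shards in the diagram

# Hypothetical stand-in for the Cassandra cluster: universal data that every
# application server must be able to read regardless of shard.
universal_store = {}


def shard_for_account(account_id, shard_count=SHARD_COUNT):
    """Deterministically map an account ID to a shard index (0-based)."""
    digest = hashlib.sha256(str(account_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def register_user(username, password_hash, account_id):
    """Store login data in the universal store, along with the account's shard."""
    universal_store[username] = {
        "password_hash": password_hash,
        "account_id": account_id,
        "shard": shard_for_account(account_id),
    }


def authenticate(username, password_hash):
    """Return the shard index holding the user's account data, or None on failure."""
    record = universal_store.get(username)
    if record is None:
        return None
    matches = hmac.compare_digest(
        record["password_hash"].encode("utf-8"), password_hash.encode("utf-8")
    )
    return record["shard"] if matches else None


register_user("alice", "hash-of-secret", account_id=1001)
shard = authenticate("alice", "hash-of-secret")
print(f"alice's account data lives on shard {shard + 1} of {SHARD_COUNT}")
```

Because the shard index is recorded next to the login data, a request only touches the universal store once before going to the right SQL shard. Recording it also means an account can stay on its shard if the shard count later changes.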

• Cassandra makes sharding easier
  – Easy to store universal data (authentication)
  – Performs very well

• Tungsten Replicator (Big Data with SQL)
  – Sharding makes cross-shard joins impossible, so fan your data into central places.
  – NoSQL can’t handle ad-hoc queries. No worries, you can still have SQL.

Sharding + Big Data
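The fan-in idea can be sketched with in-memory SQLite databases standing in for the MySQL shards and the central database. This does not use Tungsten Replicator or its API: the one-shot copy below is only a stand-in for continuous replication, and the table and column names are invented for the example.

```python
import sqlite3


def make_shard(rows):
    """Create one shard with its own campaigns table (IDs are only unique per shard)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE campaigns (id INTEGER PRIMARY KEY, account_id INTEGER, calls INTEGER)"
    )
    db.executemany("INSERT INTO campaigns VALUES (?, ?, ?)", rows)
    return db


def fan_in(shards):
    """Copy every shard's rows into one central table, tagged with the source shard."""
    central = sqlite3.connect(":memory:")
    central.execute(
        "CREATE TABLE campaigns ("
        "shard INTEGER, id INTEGER, account_id INTEGER, calls INTEGER, "
        "PRIMARY KEY (shard, id))"
    )
    for shard_no, shard in enumerate(shards):
        rows = shard.execute("SELECT id, account_id, calls FROM campaigns").fetchall()
        central.executemany(
            "INSERT INTO campaigns VALUES (?, ?, ?, ?)",
            [(shard_no, *row) for row in rows],
        )
    central.commit()
    return central


# Each account lives on exactly one shard; campaign IDs repeat across shards.
shards = [
    make_shard([(1, 10, 500), (2, 10, 100), (3, 11, 200)]),
    make_shard([(1, 20, 900)]),
    make_shard([(1, 30, 50)]),
]
central = fan_in(shards)

# An ad-hoc query that no single shard could answer on its own.
top_accounts = central.execute(
    "SELECT account_id, SUM(calls) AS total FROM campaigns "
    "GROUP BY account_id ORDER BY total DESC"
).fetchall()
print(top_accounts)  # [(20, 900), (10, 600), (11, 200), (30, 50)]
```

The composite key of shard and ID is needed because the same campaign ID can exist on several shards.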

• Not just for big companies: data grows rapidly in today’s environment.
  – Nice article about Obama’s data crunchers:
  – http://swampland.time.com/2012/11/07/inside-the-secret-world-of-quants-and-data-crunchers-who-helped-obama-win/

• NoSQL systems have easier scaling and fault tolerance mechanisms.
  – Not uncommon to see small teams with 10-20 node clusters.

• SQL is still a big part of the equation. (Tungsten)
  – Fan in information across partitions
  – Replicate across datacenters
  – Keep your ad-hoc dreams alive!

Big Data Summary

Passive / Archived Storage

http://www.protocase.com/products/index.php?e=Backblaze

Backblaze – $5,300 for an empty case. Holds 45 drives (117 TB usable space)
