C* Summit 2013: (Re)-Building the Social Grid for Global Telcos @ 1/10th the Market Cost by Darshan...

Pushing Cassandra’s Boundaries Darshan Rawal VP Engineering, Openwave Messaging Inc.


Darshan Rawal leads the development of hybrid cloud-based messaging products for global Tier 1 telcos. Darshan has worked in Silicon Valley since 2000, building nimble, cost-effective products and services that handle millions of users and billions of transactions per day. Before Openwave Messaging, Darshan held engineering positions at SS8 Networks, Yahoo, DE Shaw, and yp.com, and holds an M.S. in Software Engineering from Carnegie Mellon University.


Page 1: C* Summit 2013: (Re)-Building the Social Grid for Global Telcos @ 1/10th the Market Cost by Darshan Rawal

Pushing Cassandra’s Boundaries

Darshan Rawal VP Engineering, Openwave Messaging Inc.

Page 2

2 © 2013 Openwave Messaging | Confidential #Cassandra13

Agenda

• Introduction
• Our Cassandra Journey
• Spectrum of BIG Data challenges
• Cassandra Pivots
• Typical Cassandra Instance YoY change
• Cassandra Insights
• Conclusion

Page 3

Openwave Messaging Customers

Page 4

Universal Messaging Suite

Page 5

Our Cassandra Journey – 3.5 years

Page 6

Cassandra Under Fire - A Story

• Customer Emergency
   • Where: Major North American OWM customer
   • When: Q4 2012
   • What: File system corruption in legacy platform
   • Impact: All (~800K) accounts without mail access
• Resolution: A lab system goes live
• Metrics:
   • 20 minutes to upgrade RAM per Cassandra node
   • Maintenance/compaction ran wild; solved via SSDs
   • 100% uptime

Page 7

Spectrum of BIG Data Challenges

Page 8

Cassandra Pivots

Page 9

Atomic Batches – Client Side Impact

Cassandra 1.1.x:
   getConnection()
   batch_mutate(…)
   freeConnection()
   getConnection()
   batch_mutate(…)
   freeConnection()
   getConnection()
   batch_mutate(…)
   freeConnection()

Application Optimization:
   getConnection()
   batch_mutate(…)
   batch_mutate(…)
   batch_mutate(…)
   freeConnection()

Cassandra 1.2.x:
   prepare_batch()
   getConnection()
   atomic_batch_mutate(…)
   freeConnection()
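The three call sequences on this slide differ mainly in how often the client checks out a connection and how many round trips it makes. The sketch below illustrates that difference only. It does not use a real Cassandra driver: `FakePool`, `FakeConnection`, and `run` are hypothetical in-memory stand-ins that just count calls, and treating `prepare_batch()` as a client-side merge of the pending mutations is an assumption about what the slide means.

```python
class FakeConnection:
    """Hypothetical stand-in for a client connection; it only records calls."""

    def __init__(self, log):
        self.log = log

    def batch_mutate(self, mutations):
        self.log.append(("batch_mutate", list(mutations)))

    def atomic_batch_mutate(self, mutations):
        self.log.append(("atomic_batch_mutate", list(mutations)))


class FakePool:
    """Hypothetical connection pool that counts checkouts."""

    def __init__(self):
        self.checkouts = 0
        self.log = []

    def get_connection(self):
        self.checkouts += 1
        return FakeConnection(self.log)

    def free_connection(self, conn):
        pass  # a real pool would take the connection back for reuse


def checkout_per_batch(pool, batches):
    """First pattern: one checkout and one batch_mutate per batch."""
    for mutations in batches:
        conn = pool.get_connection()
        try:
            conn.batch_mutate(mutations)
        finally:
            pool.free_connection(conn)


def shared_checkout(pool, batches):
    """Second pattern: one checkout shared by all batch_mutate calls."""
    conn = pool.get_connection()
    try:
        for mutations in batches:
            conn.batch_mutate(mutations)
    finally:
        pool.free_connection(conn)


def single_atomic_batch(pool, batches):
    """Third pattern: merge mutations first (assumed meaning of
    prepare_batch), then send a single atomic_batch_mutate."""
    merged = [m for mutations in batches for m in mutations]
    conn = pool.get_connection()
    try:
        conn.atomic_batch_mutate(merged)
    finally:
        pool.free_connection(conn)


def run(pattern, batches):
    """Return (connection checkouts, server calls) for one pattern."""
    pool = FakePool()
    pattern(pool, batches)
    return pool.checkouts, len(pool.log)
```

With three batches, `run` reports three checkouts and three calls for the first pattern, one checkout and three calls for the second, and one checkout and one call for the third.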

Page 10

Typical Cassandra Instance - YoY Change

Page 11

Cassandra Journey Insights

• It’s a new paradigm; it will take time and investment
• There is no free lunch; cool features have a price
• Sizing is all about IOPS, and not all IOPS are equal
• Eventual consistency is a double-edged sword
• Adapt paradigms that don’t fit, up front

Page 12

Cassandra Insights

Aspect | Insight
Replication Factor | Ratio of RF / ring size plays a crucial role in throughput. Linear growth as the ratio shrinks
Tombstones | Needs effective tuning for delete-heavy applications. Refactor application-level soft deletes
Sizing | Plan for the perfect storm: Compaction + N Failures + Recovery (especially for dense deployments)
Reliable Counters | Utilize client-side affinity
Super Cols | Best avoided
Client Interaction | Thundering herd issues due to backend GC
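The Replication Factor insight can be illustrated with a back-of-the-envelope model. This is a rough sketch, not a sizing tool: it assumes every client write is applied on RF replicas, that load spreads evenly across the ring, and that each node sustains the same write rate. The function name and the per-node figure used with it are hypothetical and do not come from the talk.

```python
def approx_cluster_write_throughput(ring_size, replication_factor,
                                    per_node_writes_per_sec):
    """Idealized client-visible write throughput for a ring.

    Each client write costs `replication_factor` replica writes, so the
    ring's total replica capacity is divided by RF. Compaction, hot
    spots, and failures are ignored.
    """
    if not 1 <= replication_factor <= ring_size:
        raise ValueError("replication factor must be between 1 and ring size")
    return ring_size * per_node_writes_per_sec / replication_factor
```

Under these assumptions, holding RF at 3 and growing the ring from 6 to 12 nodes halves the RF / ring size ratio and doubles the modeled throughput, which is one way to read "linear growth as the ratio shrinks".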

Page 13

In Retrospect

Page 14

Current challenges @ Openwave Messaging