Vertafore: Database Evaluation - Selecting Apache Cassandra
Transcript of Vertafore: Database Evaluation - Selecting Apache Cassandra
Database Evaluation: Selecting Apache Cassandra
© 2015. All Rights Reserved. 3
Introduction
software engineer at Vertafore, >6 years
gotten hands in almost every project in East Lansing, enhancing or maintaining
attended some talks, 2014 Cassandra Summit
first time speaking outside of the office
Prelude
first a little history
monolith farm
Big Oracle databases with VPDs, PL/SQL
WebLogic clusters, large web applications
A New Adventure
green field development
guidelines (next slides)
[Chart: 99.999% up, 0.001% down]
Modern goals for a modern system
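To put the 99.999% figure in concrete terms, here is a minimal sketch of the arithmetic. It assumes the target is measured over an average calendar year; the function name is only illustrative.

```python
# Average year length in minutes, including leap years.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime allowed by an availability target."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


budget = downtime_minutes_per_year(99.999)
print(f"99.999% uptime allows about {budget:.2f} minutes of downtime per year")
```

In other words, the 0.001% slice of the chart is only a few minutes per year, which is why failure handling gets so much attention later in the evaluation.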
scalable, get it? but easy, intuitive
maximum security, but not in a way that impedes us or our performance
“money is irrelevant to the evaluation.”
money is always relevant…
Choose Your Party
“where do we start?”
“find something to evaluate”
etc…
limited exposure to NoSQL or non-relational databases
Google “nosql”
There are so many systems, and they all excel at something
How do you choose what makes the cut?
Consider Your Data Model And Goals
evaluation guidelines: good response times, good throughput
99th percentiles should also be “good”
we care about entities and relationships
we also care about the history of these entities
document stores and key-value stores were not on our keep list.
Scary blog posts about data modelling relationships with document stores like MongoDB
column stores and data abstraction layers looked to be worth our time
from here we took a deeper dive into documentation
we eventually dropped HBase because it seemed to be for a different scale
we also eventually dropped Datomic because it was very new
The Incumbent
in a manner of speaking, everything would be measured against oracle
Level Select
Choose An Environment That Is Advantageous
as powerful a machine as this might be…
other processes, limited cores and memory
a local cluster built with Cassandra Cluster Manager can’t take advantage of the optimized write path in Cassandra due to extra disk seeks
In-house virtual machines
may or may not give you more flexibility, depending on who manages them
the cloud, e.g. AWS, Microsoft Azure
we used one called Skytap
consider: operating systems, cores, memory, network interfaces, and security
cost: pay a little now to save a lot later (migration)
Something You Can Control
can’t know all activities up front, need root to mess with stuff
Follow Documented Best Practices
Trust the Experts
Set up virtual machines: OS/kernel, JVM, utilities
Again, factor in project requirements: do you need a cluster?
there’s a reason the default Cassandra config is what it is
Save Early and Often
Just good life advice
whether saving for retirement, saving a game, saving an essay, committing code…
also this project
in OOP, developers gain efficiency through re-use. doesn’t stop at code
game developers re-use assets, like tree models, to make forests
what we did was use VM snapshots to build our world
to do this, we needed help
part of building your world is knowing what it should look like
Knowledge on the business side helps here
Create projections for customer base, data volume, and user profile variance (i.e. partition width in C* terms)
Analytics may or may not be obvious…
Keep reporting in the back of your mind
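A rough projection like the one described can be a few lines of arithmetic. The sketch below is only an illustration: the function name, parameters, and every number in the usage line are hypothetical placeholders, not figures from the evaluation, and it ignores compression and compaction overhead.

```python
def project_storage_gb(customers, rows_per_customer, bytes_per_row,
                       replication_factor, yearly_growth, years):
    """Return a list of (year, estimated GB on disk) tuples.

    Assumes a flat average row size and a constant yearly growth rate
    in the customer base; both are simplifications.
    """
    projections = []
    for year in range(years + 1):
        rows = customers * rows_per_customer
        gigabytes = rows * bytes_per_row * replication_factor / 1e9
        projections.append((year, gigabytes))
        customers = int(customers * (1 + yearly_growth))
    return projections


# Placeholder inputs only: 1,000 customers, 10,000 rows each, 500 bytes per row.
for year, gigabytes in project_storage_gb(1000, 10_000, 500, 3, 0.5, 2):
    print(f"year {year}: ~{gigabytes:.2f} GB")
```

Running the same function with a small and a large `rows_per_customer` gives a quick feel for user profile variance, that is, how wide the widest partitions might get.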
Different databases will require independent models
you’ll have to work around unique limitations
Ask for help! DataStax helped us a lot
now let’s talk a little more about snapshots
install your database
Create your schema
Build your cluster
Build your data set. maybe encrypt it
Add application servers
Add stress test servers
Add health monitoring services/servers
Do Anything Yourself At Most Once
Replace manual steps with:
Snapshots, obviously
Scripts
Small services or executables
Existing tools
Bring Your Gear
tools
Check Out The Jepsen Series by Kyle Kingsbury
https://aphyr.com/tags/jepsen
The Jepsen series has lots of info and examples about how different technologies perform in terms of the CAP theorem and data loss in failure/partition scenarios
very succinctly points out the difficulties of distributed systems
[Diagram: microservice instances]
We built a representative microservice prototype
One REST-ish API, many data layers: DataStax Java Driver, Oracle JDBC
HTTP endpoints: creating representative data sets, simulated user requests
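The "one API, many data layers" idea can be sketched as a small interface that each candidate database implements. This is not the prototype's code, which used the DataStax Java Driver and Oracle JDBC. It is a Python illustration in which `EntityStore`, `InMemoryStore`, and `EntityService` are hypothetical names and the in-memory backend stands in for a real database.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class EntityStore(ABC):
    """Data layer contract; one implementation per database under evaluation."""

    @abstractmethod
    def save(self, entity_id: str, entity: dict) -> None: ...

    @abstractmethod
    def load(self, entity_id: str) -> Optional[dict]: ...


class InMemoryStore(EntityStore):
    """Stand-in backend; a Cassandra or Oracle backend would replace this."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def save(self, entity_id: str, entity: dict) -> None:
        self._rows[entity_id] = dict(entity)

    def load(self, entity_id: str) -> Optional[dict]:
        row = self._rows.get(entity_id)
        return dict(row) if row is not None else None


class EntityService:
    """The API layer; it only knows the EntityStore contract."""

    def __init__(self, store: EntityStore) -> None:
        self._store = store

    def put(self, entity_id: str, entity: dict) -> dict:
        self._store.save(entity_id, entity)
        return {"status": 201, "id": entity_id}

    def get(self, entity_id: str) -> dict:
        entity = self._store.load(entity_id)
        if entity is None:
            return {"status": 404}
        return {"status": 200, "body": entity}


service = EntityService(InMemoryStore())
service.put("policy-1", {"holder": "example"})
print(service.get("policy-1"))
```

Because the API layer never touches a driver directly, the same simulated requests can be replayed against each backend and the results compared like for like.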
Apache JMeter
A sample service needs sample clients
Used an existing tool: JMeter (introduce JMeter: load testing and performance)
Tools/automation team member built JMeter projects/test suites
Included a variety of load types: slow, bursty, firehose, read- vs write-heavy
important: Executable via command line
important: Wrote all results to disk
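As a rough illustration of the two "important" points, the sketch below builds a non-GUI JMeter command line per load profile, each writing its own results file. The `-n`, `-t`, `-l`, and `-J` options are standard JMeter flags. The test plan name `evaluation.jmx`, the `results` directory, and the `profile` property are hypothetical, and the test plan would have to read that property itself.

```python
from pathlib import Path
from typing import List


def jmeter_command(test_plan: str, results_dir: str, profile: str) -> List[str]:
    """Build the argument list for one unattended JMeter run."""
    results_file = Path(results_dir) / f"{profile}.jtl"
    return [
        "jmeter",
        "-n",                     # non-GUI mode, so it can run from a script
        "-t", test_plan,          # test plan to execute
        "-l", str(results_file),  # write sample results to disk
        f"-Jprofile={profile}",   # JMeter property (hypothetical name)
    ]


# Load types named in the talk; each gets its own results file.
for profile in ["slow", "bursty", "firehose"]:
    print(" ".join(jmeter_command("evaluation.jmx", "results", profile)))
```

The lists can be handed to `subprocess.run` on a stress test server. Keeping one results file per profile makes the later number crunching simpler.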
microservice
OpsCenter
Generate lots of useful data… need more tools
consistent format: microservice, JMeter, OpsCenter
Do some math, write some CSVs
Find averages, mins, maxes…
Percentiles are great
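The "do some math, write some CSVs" step might look something like the sketch below. It uses the nearest-rank percentile method, which is an assumption, since the talk does not say which method was used. The latency samples are made up for illustration.

```python
import csv
import io
import math
import statistics


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = math.ceil(pct / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def summarize(latencies_ms):
    """Reduce raw latency samples to the handful of numbers worth charting."""
    ordered = sorted(latencies_ms)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "mean": statistics.mean(ordered),
        "p50": percentile(ordered, 50),
        "p99": percentile(ordered, 99),
        "max": ordered[-1],
    }


def to_csv(summary):
    """Render one summary as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(summary))
    writer.writeheader()
    writer.writerow(summary)
    return buffer.getvalue()


# Made-up samples: mostly fast responses plus one slow outlier.
samples = [12, 11, 13, 12, 14, 11, 12, 13, 12, 250]
summary = summarize(samples)
print(to_csv(summary))
```

With these samples the mean lands well above the median because of the single outlier, which is the kind of gap between average and percentile that is worth charting.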
After a few minutes with Excel…
these are some Cassandra numbers
average and percentile tell a story together
Fight Some Dragons
This is the fun part
came in on a Saturday. doesn’t happen often, outside of releases
partly because this was great fun
partly because it took so long… couldn’t run every test at once
also some anomalies… will get to that later
have to account for bad situations
what happens when a party member gets knocked out?
drop a node during a test
what happens when you resurrect the fallen comrade?
restore the node in the cluster during a test
what happens when a new fighter joins your party?
add an additional node
does it help, or just get in the way?
“Failure Is Always An Option”
- Adam Savage
The goal of every test is to achieve some sort of failure, or you’ve only learned part of the lesson it has to teach
When will this thing topple over?
How does that compare to our volume/throughput projections?
Is there room for growth? Can we easily scale?
every db works to some extent, even using a simple text file as a database
remember “the goal is always to achieve some sort of failure”
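One way to turn "when will this thing topple over?" into a number is to step the load up until a latency limit is breached, then compare that point to the projected peak. The sketch below assumes results were collected as (requests per second, p99 latency) pairs. The limit, the projected peak, and all sample values are hypothetical placeholders, not measurements from the evaluation.

```python
from typing import List, Optional, Tuple


def first_failing_load(steps: List[Tuple[int, float]],
                       p99_limit_ms: float) -> Optional[int]:
    """Return the lowest tested load whose p99 latency exceeds the limit.

    `steps` must be ordered by increasing load. Returns None when no step
    failed, meaning the test did not push hard enough to find the edge.
    """
    for requests_per_second, p99_ms in steps:
        if p99_ms > p99_limit_ms:
            return requests_per_second
    return None


def headroom(failing_rps: int, projected_peak_rps: int) -> float:
    """How many times the projected peak fits under the failure point."""
    return failing_rps / projected_peak_rps


# Placeholder results from a stepped load test.
steps = [(500, 20.0), (1000, 25.0), (2000, 40.0), (4000, 180.0)]
failing = first_failing_load(steps, p99_limit_ms=100.0)
if failing is None:
    print("no failure found: increase the load and test again")
else:
    print(f"fails at {failing} req/s, {headroom(failing, 800):.1f}x projected peak")
```

A `None` result is itself a finding: the test has not yet achieved its failure, so the growth question is still open.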
Find Where It Fails Here, Not In Production
this will probably save you time and money and stress
and those Saturdays where you don’t get to just have fun
when you’ve done this, you’ve achieved a major goal of the evaluation
Collect Your Loot
every good adventure ends with treasure
take good notes - on everything
design journal - documented our findings incrementally
ended up with tremendous volume
it’s like drawing a map
lets you retrace your steps, point out pitfalls, help out newcomers
a series of technical meetings and presentations
teaching is a great way to learn - taught about Cassandra
lots of Q&A with different people, DBAs, etc.
now you have knowledge, metrics, charts… just line that up with your goals
Those metrics and charts were real “gems”
We brought our findings to management, and…
we work with very reasonable people
Easter Eggs
During our testing, some of our VMs behaved strangely
Sporadic poor performance, slow startup, just fine later
We had the metrics to clearly illustrate this!
Hard to reproduce, but they eventually did
They pushed out a fix within a couple days
It never happened again after their fix
teamwork!
Replay
Consider Your Data Model And Goals
evaluation guidelines
Choose An Environment That Is Advantageous
Follow Documented Best Practices
Trust the Experts
Do Anything Yourself At Most Once
snapshots! tools!
“Failure Is Always An Option”
- Adam Savage
and find those failures before you deploy anything
Questions or Comments
Thanks
Go see Robert Johnson!
robert johnson, online analytics processing
We’re Hiring! tinyurl.com/VertaforeEastLansing
we’re hiring
@ChrisMonosmith [email protected] github.com/cmmonosmith
me me me
Thank you