Big Data Little Tests - Agile Alliance
Big Data, Little Tests
John Heintz
Founder, Gist Labs
Technical Consultant, Cutter Consortium
@jheintz
http://gistlabs.com
© 2012 Gist Labs, LLC
About John Heintz
• Developer since 1995
• Agilist since 1999
• Founded Gist Labs in 2008
• Developer, Mentor, Consultant
• Intuitive, Abstract, Precise
Kool-Aids I’ve drunk: Agile/Lean/Kanban, OO, TDD, REST, Mentoring, Craftsmanship, Emergent/Progressive Design, InnovationGames®, Systems and Complexity Theory
My Goals for You
• Demystify test automation for Big Data
• Provide executable examples
What you shouldn’t expect…
• More than a brief introduction to Big Data concepts
• Performance tuning
Simple Code, Config
• I went as simple and clear as possible
• Java, JUnit4
• Maven… okay maybe not simple :-\
Mostly Code
• Remember the Law of Two Feet
• If code isn’t what you were looking for, I totally respect you finding something better for your time :-)
• Everything available from http://gistlabs.com/2012/08/big-data-little-tests/
• The entire command script is there, so you can take notes knowing that it’s available
My Soapboxes…
These are topics I’ll repeat myself on
• Fast test execution
• One-click build
Big Data
• Too much
• Too fast
• Not trivially structured
Map Reduce
• Map from one input to one output
• Reduce from many inputs to one output
• Can be run in parallel
• Crude, but massive
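To make the map and reduce steps concrete, here is a minimal in-memory word count in plain Java. It is only a sketch: the class and method names are hypothetical, it does not use Hadoop's API, and a parallel stream stands in for a cluster to show that the map step can run in parallel.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration of map/reduce; this is not Hadoop's API.
class WordCountSketch {

    // Map step: one input line in, a (word, 1) pair out for each word found.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce step: all values collected for one key in, one total out.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: map every line (in parallel), group pairs by key, reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = lines.parallelStream()
                .flatMap(line -> map(line).stream())
                .collect(Collectors.toList());

        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }

        Map<String, Integer> totals = new TreeMap<>();
        grouped.forEach((word, counts) -> totals.put(word, reduce(word, counts)));
        return totals;
    }

    public static void main(String[] args) {
        // prints {big=2, data=1, little=1, tests=2}
        System.out.println(run(Arrays.asList("Big Data", "little tests", "big tests")));
    }
}
```

Because `map` and `reduce` are plain static methods, each can be tested on its own without starting any cluster.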
CAP Theorem
• Consistency
• Availability
• Partition Tolerance
Big Data Ecosystem
• Hadoop: A giant among giants
(Tons of projects on this platform!!)
• Cassandra: Feels like a weird RDBMS
• Riak: An elegant key/value/search store
• MongoDB: Document store
Let’s Run Some Code
Hadoop Tests
Riak tests
Other Frameworks
• CassandraUnit
https://github.com/jsevellec/cassandra-unit
• PigUnit, for testing Pig scripts (Pig is a query language for Hadoop)
http://pig.apache.org/docs/r0.8.1/pigunit.html
Code Questions?
• Fast test execution?
• One-click build?
What about Big Tests?
• Real test data
• Realistic cluster
Real Test Data
My favorite strategy is to:
• Develop with small, crafted data
• Build/test the same way
• Run another test on top of real prod data
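One way this strategy might look in code, as a rough sketch only: the same logic is checked exactly against a few crafted lines, then checked for sanity against a real file when one is supplied. The log format, the `countErrors` method, and the `realData` system property are all assumptions made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical example: the log format and the "realData" property are assumptions.
class ErrorCountSketch {

    // Logic under test: count the lines that start with the ERROR level.
    static long countErrors(Stream<String> lines) {
        return lines.filter(line -> line.startsWith("ERROR ")).count();
    }

    // Small, crafted data: the exact answer is known, so assert it exactly.
    static void checkCraftedData() {
        long errors = countErrors(Stream.of("INFO started", "ERROR disk full", "ERROR timeout"));
        if (errors != 2) {
            throw new AssertionError("expected 2 errors, got " + errors);
        }
    }

    // Real data: the exact answer is unknown, so only check an invariant.
    static void checkRealData(Path file) throws IOException {
        long errors;
        long total;
        try (Stream<String> lines = Files.lines(file)) {
            errors = countErrors(lines);
        }
        try (Stream<String> lines = Files.lines(file)) {
            total = lines.count();
        }
        if (errors > total) {
            throw new AssertionError("more errors than lines: " + errors + " > " + total);
        }
    }

    public static void main(String[] args) throws IOException {
        checkCraftedData();
        // Run the real-data check only when a file is supplied, e.g. -DrealData=/path/to/log
        String realData = System.getProperty("realData");
        if (realData != null) {
            checkRealData(Paths.get(realData));
        }
    }
}
```

The crafted check stays fast enough to run on every build, while the real-data check can be switched on in an environment that has access to production data.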
[Diagram: build and deployment pipeline. Labels: Developers, Developer Sandboxes, Version Control, Continuous Integration Servers, Continuous Deployment Servers, Build Cluster, Test1 Cluster, Test2 Cluster, Staging, Production. Also shown: Virtual vs Physical Servers, Network Infrastructure, Storage Infrastructure, Self-service Provisioning, Private vs Public Cloud.]
Realistic Cluster
• Use a CI/DevOps environment
• Virtualize, “X as a Service”
• Virtual Machines
• Virtual Infrastructure (Network, Storage)
Jenkins CI Server
• Master/slave clusters
• Plugins for Hadoop and VMWare
• http://jenkins-ci.org/
Big Questions?
Thank you!
• Everything available from:
http://gistlabs.com/2012/08/big-data-little-tests/
• John Heintz, @jheintz, http://gistlabs.com