Mythbusting: Understanding How We Measure the Performance of MongoDB

76
Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Mythbusting: Understanding How We Measure the Performance of MongoDB

description

 

Transcript of Mythbusting: Understanding How We Measure the Performance of MongoDB

  • 1. Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Mythbusting: Understanding How We Measure the Performance of MongoDB
  • 2. Before we start We are going to look a lot at C++ kernel code Java benchmarks JavaScript tests And lots of charts
  • 3. Measuring "Performance" https://www.youtube.com/watch?v=7wm-pZp_mi0
  • 4. Benchmarking Some common traps Performance measurement & diagnosis What's next
  • 5. Part One Some Common Traps
  • 6. The Milk Train Doesn't Stop Here Anymore Tennessee Williams "We all live in a house on fire, no fire department to call; no way out, just the upstairs window to look out of while the fire burns the house down with us trapped, locked in it."
  • 7. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  • 8. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  • 9. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  • 10. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  • 11. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); #1 Time taken to Insert x Documents
  • 12. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); So that looks ok, right?
  • 13. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management?
  • 14. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Thread contention on nextInt()? Object creation and GC management?
  • 15. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Time to synthesize data? Object creation and GC management? Thread contention on nextInt()?
  • 16. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management? Thread contention on addAndGet()? Thread contention on nextInt()? Time to synthesize data?
  • 17. long startTime = System.currentTimeMillis(); for (int roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; BasicDBObject doc = new BasicDBObject(); doc.put("_id",id); doc.put("k",rand.nextInt(numMaxInserts)+1); String cVal = "" doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i]=doc; } coll.insert(aDocs); numInserts += documentsPerInsert; globalInserts.addAndGet(documentsPerInsert); } long endTime = System.currentTimeMillis(); What are else you measuring? Object creation and GC management? Clock resolution? Thread contention on nextInt()? Time to synthesize data? Thread contention on addAndGet()?
  • 18. // Pre Create the Object outside the Loop BasicDBObject[] aDocs = new BasicDBObject[documentsPerInsert]; for (int i=0; i < documentsPerInsert; i++) { BasicDBObject doc = new BasicDBObject(); String cVal = ""; doc.put("c",cVal); String padVal = ""; doc.put("pad",padVal); aDocs[i] = doc; } Solution: Pre-Create the objects Pre-create non varying data outside the timing loop Alternative Pre-create the data in a file; load from file
  • 19. // Use ThreadLocalRandom generator or an instance of java.util.Random per thread java.util.concurrent.ThreadLocalRandom rand; for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", nextInt(rand, numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain count outside the loop globalInserts.addAndGet(documentsPerInsert * roundNum); Solution: Remove contention Remove contention nextInt() by making Thread local
  • 20. // Use ThreadLocalRandom generator or an instance of java.util.Random per thread java.util.concurrent.ThreadLocalRandom rand; for (long roundNum = 0; roundNum < numRounds; roundNum++) { for (int i = 0; i < documentsPerInsert; i++) { id++; doc = aDocs[i]; doc.put("_id",id); doc.put("k", nextInt(rand, numMaxInserts)+1); } coll.insert(aDocs); numInserts += documentsPerInsert; } // Maintain count outside the loop globalInserts.addAndGet(documentsPerInsert * roundNum); Solution: Remove contention Remove contention on addAndGet() Remove contention nextInt() by making Thread local
  • 21. long startTime = System.currentTimeMillis(); long endTime = System.currentTimeMillis(); long startTime = System.nanoTime(); long endTime = System.nanoTime() - startTime; Solution: Timer resolution "resolution is at least as good as that of currentTimeMillis()" "granularity of the value depends on the underlying operating system and may be larger" Source http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
  • 22. General Principal #1 Know what you are measuring
  • 23. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  • 24. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  • 25. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  • 26. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); #2 Response time to return all results
  • 27. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); So that looks ok, right?
  • 28. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes
  • 29. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes Unrestricted predicate?
  • 30. BasicDBObject doc = new BasicDBObject(); doc.put("v", str); // str is a 2k string for (int i=0; i < 1000; i++) { doc.put("_id",i); coll.insert(doc); } BasicDBObject predicate = new BasicDBObject(); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); What are else you measuring? Each doc is is 4080 bytes on disk with powerOf2Sizes Measuring Time to parse & execute query Time to retrieve all document But also Cost of shipping ~4MB data through network stack Unrestricted predicate?
  • 31. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Return fixed range
  • 32. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Only project _id Return fixed range
  • 33. BasicDBObject predicate = new BasicDBObject(); predicate.put("_id", new BasicDBObject("$gte", 10).append("$lte", 20)); BasicDBObject projection = new BasicDBObject(); projection.put("_id", 1); long startTime = System.currentTimeMillis(); DBCursor cur = coll.find(predicate, projection ); DBObject foundObj; while (cur.hasNext()) { foundObj = cur.next(); } long endTime = System.currentTimeMillis(); Solution: Limit the projection Only project _id Only 46k transferred through network stack Return fixed range
  • 34. General Principal #2 Measure only what you need to measure
  • 35. Part Two Performance measurement & diagnosis
  • 36. The Physical Principles of the Quantum Theory (1930) Werner Heisenberg "Every experiment destroys some of the knowledge of the system which was obtained by previous experiments."
  • 37. Broad categories Micro Benchmarks Workloads
  • 38. Micro benchmarks: mongo-perf
  • 39. mongo-perf: goals Measure commands Configure Single mongod, ReplSet size (1 -> n), Sharding Single vs. Multiple DB O/S Characterize Throughput by thread count Compare
  • 40. What do you get? Better
  • 41. What do you get? Measured improvement between rc0 and rc2 Better
  • 42. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  • 43. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  • 44. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  • 45. tests.push( { name: "Commands.CountsIntIDRange", pre: function( collection ) { collection.drop(); for ( var i = 0; i < 1000; i++ ) { collection.insert( { _id : i } ); } collection.getDB().getLastError(); }, ops: [ { op: "command", ns : "testdb", command : { count : "mycollection", query : { _id : { "$gt" : 10, "$lt" : 100 } } } } ] } ); Benchmark source code
  • 46. Code Change
  • 47. Workloads "public" workloads YCSB Sysbench "real world" simulations Inbox fan in/out Message Stores Content Management
  • 48. Example: Bulk Load Performance 16m Documents Better 55% degradation 2.6.0-rc1 vs 2.4.10
  • 49. Ouch where's the tree in the woods? 2.4.10 -> 2.6.0 4495 git commits
  • 50. git-bisect Bisect between good/bad hashes git-bisect nominates a new githash Build against githash Re-run test Confirm if this githash is good/bad Rinse and repeat
  • 51. Code Change - Bad Githash
  • 52. Code Change - Fix
  • 53. Bulk Load Performance - Fix Better 11% improvement 2.6.1 vs 2.4.10
  • 54. The problem with measurement Observability What can you observe on the system? Effect What effects can an observation cause?
  • 55. mtools
  • 56. mtools MongoDB log file analysis Filter logs for operations, events Response time, lock durations Plot https://github.com/rueckstiess/mtools
  • 57. Response Times > 100ms Bulk Insert 2.6.0-rc0 Ops/Sec Time
  • 58. Response Times > 100ms Bulk Insert 2.6.0-rc0 vs. 2.6.0-rc2 Floor raised
  • 59. Code Change Yielding Policy
  • 60. Code Change
  • 61. Response Times Bulk Insert 2.6.0 vs 2.6.1 Ceiling similar, lower floor resulting in 40% improvement in throughput
  • 62. Secondary effects of Yield policy change Write lock time reduced Order of magnitude reduction of write lock duration
  • 63. > db.serverStatus() Yes will cause a read lock to be acquired > db.serverStatus({recordStats:0}) No lock is not acquired > mongostat Yes - until SERVER-14008 resolved, uses db.serverStatus() Unexpected side effects of measurement?
  • 64. CPU sampling Get an impression of Call Graphs CPU time spent on node and called nodes
  • 65. > sudo apt-get install google-perftools > sudo apt-get install libunwind7-dev > scons --use-cpu-profiler mongod Setup & building with google- profiler
  • 66. > mongodb dbpath Note: Do not use fork > mongo > use admin > db.runCommand({_cpuProfilerStart: {profileFilename: 'foo.prof'}}) Execute some commands that you want to profile > db.runCommand({_cpuProfilerStop: 1}) Start the profiling
  • 67. Sample start vs. end of workload
  • 68. Sample start vs. end of workload
  • 69. Code change
  • 70. Public Benchmarks Not all forks are the same YCSB https://github.com/achille/YCSB sysbench-mongodb https://github.com/mdcallag/sysbench-mongodb
  • 71. Part Three And next?
  • 72. Beavis & Butthead "The future sucks. Change it." "I'm way cool Beavis, but I cannot change the future."
  • 73. What we are working on mongo-perf UI refactor Adding more micro benchmarks Workloads Adding external benchmarks Creating benchmarks for common use cases Inbox fan in/out Analytical dashboards Stream / Feeds Customers, Partners & Community
  • 74. Here's how you can help change the future! Got a great workload? Great benchmark? Want to donate it? [email protected]
  • 75. Don't be that benchmark #1 Know what you are measuring #2 Measure only what you need to measure
  • 76. [email protected] Senior Director of Performance Engineering, MongoDB Alvin Richards #MongoDBWorld Thank You