
Transcript
Page 1: Building a Scalable and Modern Infrastructure at CARFAX

A Scalable and Modern Infrastructure at CARFAX

Page 2: Building a Scalable and Modern Infrastructure at CARFAX

About Me
• Jai Hirsch – Senior Systems Architect, Data Technologies at CARFAX
• Long-time Java and Database Developer
• Data and Distributed Processing Enthusiast
• Github: https://github.com/JaiHirsch
• Twitter: @JaiHirsch
• Blog: http://jaihirsch.github.io/straw-in-a-haystack/

Page 3: Building a Scalable and Modern Infrastructure at CARFAX

“CARFAX helps millions of people buy and sell used cars with more confidence”

Page 4: Building a Scalable and Modern Infrastructure at CARFAX

CARFAX Vehicle History Report

Page 5: Building a Scalable and Modern Infrastructure at CARFAX

Documents on the Report

Page 6: Building a Scalable and Modern Infrastructure at CARFAX

NoSQL Before it Was Cool

Proprietary Key Value Store on OpenVMS Developed by CARFAX in 1984

Page 7: Building a Scalable and Modern Infrastructure at CARFAX

Never mind that sh*t! Here comes Mongo!

Page 8: Building a Scalable and Modern Infrastructure at CARFAX

Why MongoDB?
• Legacy structures mapped to documents
• High availability using replica sets
• Platform independence
• Support

Page 9: Building a Scalable and Modern Infrastructure at CARFAX

MongoDB at CARFAX
• Our Production Environment
• The Legacy Database and High Volume Loads
• High Availability Reads

Page 10: Building a Scalable and Modern Infrastructure at CARFAX

Our Production Environment

Page 11: Building a Scalable and Modern Infrastructure at CARFAX

Server Deployment

AUTOMATE

Page 12: Building a Scalable and Modern Infrastructure at CARFAX

Server Configuration
12 shards, with two spare servers racked for failover
• OS: Linux
• MongoDB 2.4.9
• 128 GB of RAM
• 1.8 TB of Drive Space
• 10K RPM SAS Drives

Page 13: Building a Scalable and Modern Infrastructure at CARFAX
Page 14: Building a Scalable and Modern Infrastructure at CARFAX

The Future

Page 15: Building a Scalable and Modern Infrastructure at CARFAX
Page 16: Building a Scalable and Modern Infrastructure at CARFAX

Extract, Transform, Load

Page 17: Building a Scalable and Modern Infrastructure at CARFAX

Loading Millions to Billions of Records per Day

AUTOMATE

Page 18: Building a Scalable and Modern Infrastructure at CARFAX

First Attempt To Load Was Completely CPU Bound

Page 19: Building a Scalable and Modern Infrastructure at CARFAX

Not Acceptable! 45 Days to Backload the Legacy Database

Page 20: Building a Scalable and Modern Infrastructure at CARFAX

Distributed Processing
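The slides do not say how the CPU-bound load was split across machines. As a minimal sketch of one common approach, assuming records are partitioned by a key so that each worker node transforms and inserts its own slice independently, the idea looks like this. The field name `vin`, the worker count, and the helper names are hypothetical, not taken from the talk.

```python
import zlib
from collections import defaultdict


def worker_for(key: str, num_workers: int) -> int:
    """Map a record key to a worker index with a stable hash (CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % num_workers


def partition(records, num_workers, key_field="vin"):
    """Group records into per-worker buckets so each bucket can be
    transformed and loaded by a separate process or machine."""
    buckets = defaultdict(list)
    for record in records:
        buckets[worker_for(record[key_field], num_workers)].append(record)
    return buckets


# Hypothetical sample input: 20 records spread over 5 distinct keys.
records = [{"vin": f"VIN{i % 5}", "seq": i} for i in range(20)]
buckets = partition(records, num_workers=4)

for worker, batch in sorted(buckets.items()):
    print(f"worker {worker}: {len(batch)} records")
```

Because the hash is stable, every record for a given key lands on the same worker, so workers never need to coordinate with each other while loading.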

Page 21: Building a Scalable and Modern Infrastructure at CARFAX

Acceptable! Billion+ inserts per Day! 9 Days to Backload
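To put the backload figures in perspective, here is a rough calculation using only the numbers on the slides (45 days before, 9 days after, a billion or more inserts per day). It assumes the load runs evenly around the clock, which the slides do not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60

inserts_per_day = 1_000_000_000      # "Billion+" from the slide, used as a lower bound
days_before, days_after = 45, 9      # backload duration before and after

# Sustained rate needed to reach a billion inserts in one day.
inserts_per_second = inserts_per_day / SECONDS_PER_DAY

# Improvement in total backload time.
speedup = days_before / days_after

print(f"at least {inserts_per_second:,.0f} inserts/second sustained")
print(f"backload finished {speedup:.0f}x faster")
```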

Page 22: Building a Scalable and Modern Infrastructure at CARFAX

The MongoDB Implementation

• 13.6 billion+ documents
• 1.5 billion+ new documents per year
• Document size: ~800 bytes
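These figures can be combined with the server configuration (12 shards, 1.8 TB of drive space per server) into a back-of-the-envelope sizing estimate. This is only a sketch: it assumes data is spread evenly across shards and ignores indexes, storage overhead, and replication, none of which the slides quantify.

```python
DOCUMENTS = 13_600_000_000          # "13.6 billion+" from the slide
NEW_DOCS_PER_YEAR = 1_500_000_000   # "1.5 billion+" from the slide
DOC_SIZE_BYTES = 800                # approximate document size
SHARDS = 12
DRIVE_TB_PER_SERVER = 1.8
TB = 10**12                         # decimal terabyte

total_tb = DOCUMENTS * DOC_SIZE_BYTES / TB
per_shard_tb = total_tb / SHARDS
growth_tb_per_year = NEW_DOCS_PER_YEAR * DOC_SIZE_BYTES / TB

print(f"raw document data: {total_tb:.2f} TB")
print(f"per shard (even split assumed): {per_shard_tb:.2f} TB "
      f"of {DRIVE_TB_PER_SERVER} TB")
print(f"growth: {growth_tb_per_year:.1f} TB/year, "
      f"{growth_tb_per_year / SHARDS:.2f} TB/year per shard")
```

Under these assumptions the raw documents alone occupy roughly half of each server's drive space.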

Page 23: Building a Scalable and Modern Infrastructure at CARFAX

VHR Uses 200+ Documents With Embedded Keys

Page 24: Building a Scalable and Modern Infrastructure at CARFAX

High Availability Reads

Page 25: Building a Scalable and Modern Infrastructure at CARFAX

Millions of Reports per Day

AUTOMATE

Page 26: Building a Scalable and Modern Infrastructure at CARFAX

Read Scalability With Tagging

Page 27: Building a Scalable and Modern Infrastructure at CARFAX

Each Data center is Tagged

Each Replica Set is Tagged
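MongoDB lets replica set members carry tags, and a client read preference can list tag sets in priority order: the first tag set that matches at least one member is used, and an empty tag set matches any member. The sketch below simulates that selection rule in plain Python so it runs without a database. The host names and the tag keys `dc` and `rs` are hypothetical; the slides do not show the actual tags CARFAX used.

```python
def matches(member_tags, tag_set):
    """A member matches when it carries every key/value pair in the tag set."""
    return all(member_tags.get(k) == v for k, v in tag_set.items())


def select_members(members, tag_sets):
    """Return the members eligible for a read: try each tag set in order and
    use the first one that matches at least one member."""
    for tag_set in tag_sets:
        eligible = [m for m in members if matches(m["tags"], tag_set)]
        if eligible:
            return eligible
    return []


# Hypothetical members of one shard's replica set, tagged by data center
# and by replica set.
members = [
    {"host": "db-a1:27017", "tags": {"dc": "east", "rs": "shard01"}},
    {"host": "db-a2:27017", "tags": {"dc": "east", "rs": "shard01"}},
    {"host": "db-b1:27017", "tags": {"dc": "west", "rs": "shard01"}},
]

# Prefer the local data center, then fall back to any member.
local_first = [{"dc": "west"}, {}]
print([m["host"] for m in select_members(members, local_first)])

# If no member carries the preferred tag, the empty tag set takes over.
missing_dc = [{"dc": "central"}, {}]
print([m["host"] for m in select_members(members, missing_dc)])
```

Routing reads to members in the caller's own data center this way spreads read traffic across the replica set instead of concentrating it on one member.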

Page 28: Building a Scalable and Modern Infrastructure at CARFAX

5X More Reports per Second

Page 29: Building a Scalable and Modern Infrastructure at CARFAX

But we can do More!

Page 30: Building a Scalable and Modern Infrastructure at CARFAX

Let’s Wrap It Up
• Don’t buy a used car without a CARFAX report
• Grok your data and working set
• Architect for your load volume
• Scale your reads to meet demand

Page 31: Building a Scalable and Modern Infrastructure at CARFAX

Keys To Success
• AUTOMATE EVERYTHING
• Test Many Configurations
• Grid Computing is Awesome
• Shard Early, Shard Often

Page 32: Building a Scalable and Modern Infrastructure at CARFAX

And Remember

Page 33: Building a Scalable and Modern Infrastructure at CARFAX

Friends Don’t Let Friends Use Default Ulimits!
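Default per-process limits on open files and processes are often too low for a busy database server. As a small sketch, the script below reads the current limits with Python's standard `resource` module (Unix only) and compares them with a threshold. The value 64000 is an assumption based on commonly cited MongoDB guidance, not a number from the slides; check the documentation for your version.

```python
import resource

# Assumption: threshold taken from commonly cited MongoDB ulimit guidance.
RECOMMENDED = 64000


def check_limit(name, limit_id, recommended=RECOMMENDED):
    """Read the soft/hard limit for one resource and flag it if too low."""
    soft, hard = resource.getrlimit(limit_id)
    ok = soft == resource.RLIM_INFINITY or soft >= recommended
    return {"name": name, "soft": soft, "hard": hard, "ok": ok}


checks = [
    check_limit("open files (nofile)", resource.RLIMIT_NOFILE),
    check_limit("processes (nproc)", resource.RLIMIT_NPROC),
]

for c in checks:
    status = "ok" if c["ok"] else f"below {RECOMMENDED}"
    print(f"{c['name']}: soft={c['soft']} hard={c['hard']} -> {status}")
```

The limits reported are those of the process running the script, so run it as the same user and in the same way the database process is started.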

Page 34: Building a Scalable and Modern Infrastructure at CARFAX

Thank You!

The migration was a success due to the incredible teams at CARFAX and MongoDB.

We are always looking for great people to join us.

www.carfax.com/careers