[@IndeedEng] Logrepo: Enabling Data-Driven Decisions
![Page 1: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/1.jpg)
go.indeed.com/IndeedEngTalks
![Page 2: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/2.jpg)
Logrepo: Enabling Data-Driven Decisions
![Page 3: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/3.jpg)
Jeff Chien
Software Engineer
Indeed Apply Team
![Page 4: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/4.jpg)
Scale
More job searches worldwide than any other employment website.
● Over 100 million unique users
● Over 3 billion searches per month
● Over 24 million jobs
● Over 50 countries
● Over 28 languages
![Page 5: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/5.jpg)
I help people get jobs.
![Page 6: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/6.jpg)
![Page 7: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/7.jpg)
![Page 8: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/8.jpg)
![Page 9: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/9.jpg)
![Page 10: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/10.jpg)
![Page 11: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/11.jpg)
Job seeker flow using Indeed Apply
1. Search
2. View job
3. Click “Apply Now”
4. Submit application
![Page 12: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/12.jpg)
Knowing how users interact with our system helps us make better products
![Page 13: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/13.jpg)
[Chart: likelihood of applying to a job, for job seekers who have to upload a resume vs. job seekers who have an Indeed Resume]
![Page 14: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/14.jpg)
We Have Questions
● What percentage of applications use Indeed resumes?
● How many searches for “java” in “Austin”?
● How often are resumes edited?
● How long does it take to aggregate jobs?
![Page 15: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/15.jpg)
Complicated Questions
How many applications … to jobs from CareerBuilder … by job seekers who searched for “java” in “Austin” … used an Indeed resume?
Is the percentage different on mobile compared to web?
How much did this change between 2011 and 2014?
![Page 16: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/16.jpg)
More Information
Better Decisions
![Page 17: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/17.jpg)
More information
Need to log events
● job searches
● clicks
● applies
![Page 18: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/18.jpg)
What to log
Client information - unique user identifier, user agent, IP address…
User behavior - clicks, alert signups…
Performance - backend request duration, memory usage...
A/B test groups - control and test groups
![Page 19: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/19.jpg)
Better decisions
Use empirical data to make decisions
Not on assumptions or the highest-paid person’s opinion!
![Page 20: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/20.jpg)
Objective
Collect data on user actions and system performance from many different applications in multiple data centers
![Page 21: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/21.jpg)
How we build systems
Simple
Fast
Resilient
Scalable
![Page 22: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/22.jpg)
Simple
Easy interface
Reuse familiar technologies
![Page 23: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/23.jpg)
Fast
No impact to runtime performance
Data available soon
![Page 24: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/24.jpg)
Resilient
Does not lose data in spite of system or network failures
![Page 25: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/25.jpg)
Scalable
Can handle large quantities of data
![Page 26: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/26.jpg)
Requirements
Powerful enough to express diverse data
![Page 27: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/27.jpg)
Requirements
Powerful enough to express diverse data
Store all data forever
![Page 28: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/28.jpg)
Requirements
Powerful enough to express diverse data
Store all data forever
Events stored at least once
![Page 29: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/29.jpg)
Requirements
Powerful enough to express diverse data
Store all data forever
Events stored at least once
Easy to add new data to logs
![Page 30: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/30.jpg)
Requirements
Powerful enough to express diverse data
Store all data forever
Events stored at least once
Easy to add new data to logs
Easy to access logs in bulk
![Page 31: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/31.jpg)
Requirements
Powerful enough to express diverse data
Store all data forever
Events stored at least once
Easy to add new data to logs
Easy to access logs in bulk
Time range based access
![Page 32: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/32.jpg)
Non-Goals
Random access to individual events
Real time access to events
Complex data types
![Page 33: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/33.jpg)
Logrepo
A distributed event logging system
Est. 2006
![Page 34: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/34.jpg)
Logrepo stores log entries
Everything is a string
Key/value pairs
URL-encoded
![Page 35: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/35.jpg)
Organic click log entry
uid=18dtbolr20nk23qh&type=orgClk&v=0&tk=18dtbnn3p0nk20g9&jobId=500&onclick=1&avgCmpRtg=2.9&url=http%3A%2F%2Fwww.indeed.com%2Frc%2Fclk&href=http%3A%2F%2Fwww.indeed.com%2Fjobs%3Fq%3D%26l%3DNewburgh%252C%2BNY%26start%3D20&agent=Mozilla%2F5.0+%28Windows+NT+6.1%3B+WOW64%3B+rv%3A26.0%29+Gecko%2F20100101+Firefox%2F26.0&raddr=173.50.255.255&ckcnt=17&cksz=1033&ctk=18dtbc6960nk20vd&ctkRcv=1&&
![Page 36: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/36.jpg)
URL-decoded organic click log entry
uid=18dtbolr20nk23qh&type=orgClk&v=0&tk=18dtbnn3p0nk20g9&jobId=500&onclick=1&avgCmpRtg=2.9&url=http://www.indeed.com/rc/clk&href=http://www.indeed.com/jobs?q=&l=Newburgh%2C+NY&start=20&agent=Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0&...
![Page 37: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/37.jpg)
URL-decoded organic click log entry
uid=18dtbolr20nk23qh&type=orgClk&v=0&tk=18dtbnn3p0nk20g9&jobId=500&onclick=1&avgCmpRtg=2.9&url=http://www.indeed.com/rc/clk&href=http://www.indeed.com/jobs?q=&l=Newburgh%2C+NY&start=20&agent=Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0&...
![Page 38: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/38.jpg)
Advantages
Human-readable
![Page 39: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/39.jpg)
Advantages
Human-readable
Arbitrary keys
![Page 40: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/40.jpg)
Advantages
Human-readable
Arbitrary keys
Low overhead to add new key/value pairs
![Page 41: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/41.jpg)
Advantages
Human-readable
Arbitrary keys
Low overhead to add new key/value pairs
Self-describing
![Page 42: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/42.jpg)
Advantages
Human-readable
Arbitrary keys
Low overhead to add new key/value pairs
Self-describing
Easy to parse in any language
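As an illustration of the "easy to parse" point, here is a minimal Java sketch that splits one entry on `&` and `=` and URL-decodes each part. The class name `LogEntryParser` is hypothetical and not part of logrepo; the sketch assumes Java 10 or later for the `URLDecoder.decode(String, Charset)` overload.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of logrepo: parses one URL-encoded log entry.
final class LogEntryParser {
    static Map<String, String> parse(String line) {
        // LinkedHashMap keeps insertion order, so uid stays the first key.
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : line.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate empty segments such as a trailing "&&"
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(URLDecoder.decode(key, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> entry =
            parse("uid=18dtbolr20nk23qh&type=orgClk&jobId=500&raddr=173.50.255.255&&");
        System.out.println(entry.get("type") + " " + entry.get("jobId"));
    }
}
```

Values are decoded exactly once, so a value that was itself URL-encoded before logging (such as the query string inside `href`) keeps its inner encoding.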
![Page 43: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/43.jpg)
Required log entry keys
Every log entry has uid and type
Type is an arbitrary string
uid=18dtbolr20nk23qh&type=orgClk&...
![Page 44: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/44.jpg)
UID format
uid=18ducm8u50nk23qh&type=jobsearch&...
UID is always the first key
Unique
16 characters
Base 32 [0-9a-v]
![Page 45: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/45.jpg)
UID breakdown
uid=18ducm8u50nk23qh
Date = 2014-01-10
Time = 09:35:24.357
Server id = 1512
App instance id = 2
UID version = 0
Random value = 3921
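The breakdown above can be reproduced with a short decoder. This is a sketch, not logrepo's actual code: the class `UidDecoder` is hypothetical, and the bit layout (45-bit millisecond timestamp, then 16-bit server id, 4-bit app instance id, 2-bit version, 13-bit random value) is inferred from this one example. Only the 13-bit random range (0 to 8191) is stated in the deck; the other widths are assumptions.

```java
import java.time.Instant;

// Hypothetical decoder; field widths other than the random value are inferred.
final class UidDecoder {
    // First 9 base-32 characters: milliseconds since the Unix epoch (45 bits).
    static long timestampMillis(String uid) {
        return Long.parseLong(uid.substring(0, 9), 32);
    }

    // Remaining 7 characters: 35 bits holding the other four fields.
    private static long tail(String uid) {
        return Long.parseLong(uid.substring(9), 32);
    }

    static int randomValue(String uid) {   // low 13 bits, 0..8191
        return (int) (tail(uid) & 0x1FFF);
    }

    static int version(String uid) {       // assumed: next 2 bits
        return (int) ((tail(uid) >>> 13) & 0x3);
    }

    static int appInstanceId(String uid) { // assumed: next 4 bits
        return (int) ((tail(uid) >>> 15) & 0xF);
    }

    static int serverId(String uid) {      // assumed: top 16 bits
        return (int) ((tail(uid) >>> 19) & 0xFFFF);
    }

    public static void main(String[] args) {
        String uid = "18ducm8u50nk23qh";
        // Prints the instant in UTC; the slide's 09:35:24.357 is six hours
        // behind it, which would match US Central time (an assumption).
        System.out.println(Instant.ofEpochMilli(timestampMillis(uid)));
        System.out.println(serverId(uid) + " " + appInstanceId(uid) + " "
            + version(uid) + " " + randomValue(uid));
    }
}
```

`Long.parseLong(s, 32)` uses the digits 0-9 and a-v, which matches the alphabet given for UIDs.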
![Page 46: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/46.jpg)
UID generation
Unique IDs are unique
Random value avoids UID collisions
Random value is between 0 and 8191
Up to 8000 events per application instance per millisecond
![Page 47: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/47.jpg)
UID format benefits
Contains useful metadata
Compact format reduces memory requirements
Easy to compare or sort events by time
![Page 48: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/48.jpg)
Job seeker events
1. Search for jobs
2. Click on job
3. Apply to job
All events are part of the same flow
![Page 49: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/49.jpg)
Parent-child relationships between events
Events can reference other events with &tk=18ducm8u50nk23qh...
Children know their parents
Parents don’t know their children
Extremely powerful model
![Page 50: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/50.jpg)
Parent-child relationships between events
An organic click points to the search it occurred on
uid=18dtbnn3p0nk20g9&type=jobsearch&v=0&...
uid=18dtbolr20nk23qh&type=orgClk&v=0&tk=18dtbnn3p0nk20g9&...
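Because children carry the parent's UID in `tk`, finding a search's clicks is a grouping step on the consumer side. A minimal sketch of that grouping follows; `ParentChildJoin` and its methods are hypothetical names, and the code works on raw entry strings without URL-decoding because UIDs contain only base-32 characters.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups child entries under the parent UID named by tk.
final class ParentChildJoin {
    // Returns the raw value of `key` in an entry, or null if the key is absent.
    static String field(String entry, String key) {
        String prefix = key + "=";
        for (String pair : entry.split("&")) {
            if (pair.startsWith(prefix)) {
                return pair.substring(prefix.length());
            }
        }
        return null;
    }

    static Map<String, List<String>> childrenByParent(List<String> children) {
        Map<String, List<String>> byParent = new HashMap<>();
        for (String child : children) {
            String parentUid = field(child, "tk");
            if (parentUid != null) {
                byParent.computeIfAbsent(parentUid, k -> new ArrayList<>()).add(child);
            }
        }
        return byParent;
    }

    public static void main(String[] args) {
        List<String> clicks = List.of(
            "uid=18dtbolr20nk23qh&type=orgClk&v=0&tk=18dtbnn3p0nk20g9&jobId=500");
        // Look up the clicks that belong to one jobsearch entry.
        System.out.println(childrenByParent(clicks).get("18dtbnn3p0nk20g9").size());
    }
}
```

The parent entry never has to be rewritten when a child arrives later, which fits the "parents don't know their children" rule.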
![Page 51: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/51.jpg)
More jobsearch child events
Sponsored job clicks
JavaScript errors
Job alert signups
And many more...
![Page 52: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/52.jpg)
Job seeker views a job
uid=18en3o3ov16r25rp&type=viewjob&...
1. job view: 18en3o3ov16r25rp
2. load IndeedApply
3. user submission
4. post to employer
![Page 53: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/53.jpg)
Indeed Apply loads
uid=18en3o3s216ph6d5&type=loadJs&vjtk=18en3o3ov16r25rp&...
1. job view: 18en3o3ov16r25rp
2. load IndeedApply: 18en3o3s216ph6d5
3. user submission
4. post to employer
![Page 54: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/54.jpg)
Prepare job application
uid=18en3qe0u16pi5ct&type=appSubmit&loadJsTk=18en3o3s216ph6d5&...
1. job view: 18en3o3ov16r25rp
2. load IndeedApply: 18en3o3s216ph6d5
3. user submission: 18en3qe0u16pi5ct
4. post to employer
![Page 55: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/55.jpg)
Submit job application
POST /apply HTTP/1.1
Host: employer.com
{
  "applicant": {
    "name": "John Doe",
    "email": "[email protected]",
    "phone": "555-555-5555"
  },
  "jobTitle": "Software Engineer" ...
uid=18en3qe2r0nji3h6&type=postApp&appSubmitTk=18en3qe0u16pi5ct&...
1. job view: 18en3o3ov16r25rp
2. load IndeedApply: 18en3o3s216ph6d5
3. user submission: 18en3qe0u16pi5ct
4. post to employer: 18en3qe2r0nji3h6
![Page 56: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/56.jpg)
JavaScript latency ping
At the start of page load, the browser executes JavaScript to ping Indeed
Server receives the ping and logs an event
![Page 57: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/57.jpg)
Parent job search and child JavaScript latency ping
uid=18dqpc3lm16pi2an&type=jobsearch&...
uid=18dqpc3s516pi566&type=lat&tk=18dqpc3lm16pi2an
![Page 58: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/58.jpg)
Subtracting UID timestamps yields duration
uid=18dqpc3s516pi566&type=lat&tk=18dqpc3lm16pi2an
uid timestamp Jan 9, 2014 00:00:05.253
tk timestamp Jan 9, 2014 00:00:05.046
Latency = 1389247205253 - 1389247205046 = 207 ms
Approximates perceived latency to job seeker
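The subtraction on this slide can be done directly on the two UIDs. The sketch below assumes the first nine base-32 characters of a UID hold the millisecond timestamp, which is consistent with the numbers shown here; `UidLatency` is a hypothetical name.

```java
// Hypothetical helper: elapsed time between a parent event and a child event.
final class UidLatency {
    // Assumes the first 9 base-32 characters encode epoch milliseconds.
    static long timestampMillis(String uid) {
        return Long.parseLong(uid.substring(0, 9), 32);
    }

    static long millisBetween(String parentUid, String childUid) {
        return timestampMillis(childUid) - timestampMillis(parentUid);
    }

    public static void main(String[] args) {
        // tk (the jobsearch) is the parent; uid (the latency ping) is the child.
        System.out.println(millisBetween("18dqpc3lm16pi2an", "18dqpc3s516pi566")); // 207
    }
}
```

Both timestamps are taken on Indeed's servers, so the difference is not affected by the clock on the job seeker's machine.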
![Page 59: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/59.jpg)
West coast perceived latency in California vs. Washington
![Page 60: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/60.jpg)
Writing log entries from apps
LogEntry entry = factory.createLogEntry("search");
entry.setProperty("q", query);
entry.setProperty("acctId", accountId);
entry.setProperty("time", elapsedMillis);
// ...
entry.commit();
![Page 61: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/61.jpg)
Creating a log entry
LogEntry entry = factory.createLogEntry("search");
Creates a log entry with UID and type set
UID timestamp tied to createLogEntry() call
![Page 62: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/62.jpg)
Populating a log entry
entry.setProperty("q", query);
entry.setProperty("acctId", accountId);
entry.setProperty("time", elapsedMillis);
// ...
![Page 63: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/63.jpg)
Lists
Separate values with commas
String groups = "foo,bar,baz";
logEntry.setProperty("grps", groups);
// uid=...&grps=foo%2Cbar%2Cbaz&...
![Page 64: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/64.jpg)
Lists of Tuples
Enclose each tuple in parentheses
Comma-separate elements within tuple
// Two jobs with (job id, score)
String jobs = "(123,1.0)(400,0.8)";
logEntry.setProperty("jobs", jobs);
// uid=...&jobs=%28123%2C1.0%29%28400%2C0.8%29&...
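On the reading side, a consumer has to turn the decoded value back into tuples. A minimal sketch of that step follows; `TupleList` is a hypothetical name, and the code assumes that elements never contain commas or parentheses, since the deck does not describe an escaping rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the "(a,b)(c,d)" list-of-tuples convention.
final class TupleList {
    private static final Pattern TUPLE = Pattern.compile("\\(([^)]*)\\)");

    // Splits "(123,1.0)(400,0.8)" into [["123","1.0"], ["400","0.8"]].
    static List<String[]> parse(String value) {
        List<String[]> tuples = new ArrayList<>();
        Matcher m = TUPLE.matcher(value);
        while (m.find()) {
            tuples.add(m.group(1).split(",", -1)); // -1 keeps empty trailing elements
        }
        return tuples;
    }

    // Inverse of parse: joins elements with commas and wraps each tuple in parentheses.
    static String format(List<String[]> tuples) {
        StringBuilder sb = new StringBuilder();
        for (String[] tuple : tuples) {
            sb.append('(').append(String.join(",", tuple)).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String[] job : parse("(123,1.0)(400,0.8)")) {
            System.out.println("jobId=" + job[0] + " score=" + job[1]);
        }
    }
}
```

Everything stays a string, in line with the "everything is a string" rule; converting the score to a number is left to the consumer.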
![Page 65: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/65.jpg)
Committing a log entry
After log entry is fully populated...
entry.commit();
![Page 66: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/66.jpg)
Jason Koppe
System Administrator
![Page 67: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/67.jpg)
I engineer systems that help people get jobs.
![Page 68: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/68.jpg)
Before logrepo
![Page 69: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/69.jpg)
Before logrepo
![Page 70: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/70.jpg)
log4j - Java logging framework
● Code - what
● Configuration - define what goes to where
● Appender - where (file, smtp)
http://logging.apache.org/log4j/1.2/
![Page 71: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/71.jpg)
Before logrepo
![Page 72: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/72.jpg)
Reusing log4j for logrepo
![Page 73: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/73.jpg)
Redundancy from the start
Write to local disk (FileAppender)
Write to remote server #1 (? Appender)
Write to remote server #2 (? Appender)
![Page 74: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/74.jpg)
Writing to a remote server
syslog
Protocol for transporting messages across an IP network
Est. 1980s
http://tools.ietf.org/html/rfc5424
![Page 75: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/75.jpg)
Using log4j with syslog
Out-of-the-box, log4j only supported UDP syslog
UDP could result in data loss
![Page 76: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/76.jpg)
Avoiding data loss
TCP acknowledges and retransmits data, so entries are not silently dropped in transit
Use TCP!
![Page 77: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/77.jpg)
Creating a reliable Appender
SyslogTcpAppender
● created by Indeed
● TCP-enabled log4j syslog Appender
● buffers messages before transport
Resilient for short network and syslog server downtimes
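The buffering idea can be illustrated without log4j. The sketch below is not Indeed's SyslogTcpAppender and does not implement syslog framing; `BufferingSender` is a hypothetical class that queues messages in memory and removes each one only after it has been written, so a short outage leaves the backlog intact for the next attempt.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of "buffer, then send"; not the real appender.
final class BufferingSender {
    private final Deque<String> pending = new ArrayDeque<>();
    private final int capacity;

    BufferingSender(int capacity) {
        this.capacity = capacity;
    }

    // Queues a message; returns false when the buffer is already full.
    synchronized boolean offer(String message) {
        if (pending.size() >= capacity) {
            return false;
        }
        pending.addLast(message);
        return true;
    }

    // Writes queued messages in order and returns how many were written.
    // A message leaves the queue only after its write succeeded.
    synchronized int flush(OutputStream out) {
        int sent = 0;
        try {
            while (!pending.isEmpty()) {
                out.write((pending.peekFirst() + "\n").getBytes(StandardCharsets.UTF_8));
                out.flush();
                pending.removeFirst();
                sent++;
            }
        } catch (IOException e) {
            // Connection is down: keep the remaining messages for the next flush.
        }
        return sent;
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

In use, `flush` would be called with the output stream of a (re)connected `java.net.Socket`. A successful write only means the data reached the local TCP stack, and a retry after a failed write can send the same message twice, so a scheme like this gives at-least-once rather than exactly-once delivery.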
![Page 78: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/78.jpg)
Choosing a syslog daemon
syslog-ng
syslog daemon which supports TCP
Est. 1998
http://www.balabit.com/network-security/syslog-ng
![Page 79: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/79.jpg)
Redundancy with log4j
Write to local disk (FileAppender)
Write to remote server #1 (SyslogTcpAppender)
Write to remote server #2 (SyslogTcpAppender)
![Page 80: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/80.jpg)
Redundancy over TCP
![Page 81: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/81.jpg)
Each syslog-ng server
receives unsorted log entries
immediately flushes entries to files on disk called raw logs
![Page 82: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/82.jpg)
Quick redundancy over TCP
![Page 83: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/83.jpg)
Optimized for redundancy
raw logs are probably out-of-order
each app writes to syslog independently
![Page 84: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/84.jpg)
Optimize for read access patterns
LogRepositoryBuilder (“Builder”)
● sort
● deduplicate
● compress
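The three Builder steps can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the LogRepositoryBuilder itself: `SegmentSketch` is a hypothetical class, duplicates are assumed to be byte-identical lines, and gzip is assumed from the `.seg.gz` file names in the archive.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch of sort + deduplicate + compress for one segment.
final class SegmentSketch {
    static byte[] buildSegment(Collection<String> rawEntries) {
        // Every entry starts with "uid=" plus a fixed-width base-32 UID, so plain
        // string order is UID order; the set also drops exact duplicates.
        TreeSet<String> sorted = new TreeSet<>(rawEntries);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            for (String entry : sorted) {
                gz.write((entry + "\n").getBytes(StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decompresses a segment back into its entries, one per line.
    static List<String> readSegment(byte[] segment) {
        List<String> entries = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(segment)),
                StandardCharsets.UTF_8))) {
            for (String line; (line = in.readLine()) != null; ) {
                entries.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }
}
```

Deduplication matters here because each application writes every entry to two syslog-ng servers, so the same line can reach the Builder more than once.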
![Page 85: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/85.jpg)
Builder architecture
![Page 86: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/86.jpg)
Builder architecture
![Page 87: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/87.jpg)
Builder architecture
![Page 88: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/88.jpg)
Builder architecture
![Page 89: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/89.jpg)
Builder creates segment files
uid=15mt000000k1&type=orgClk&v=1&k=4...
uid=15mt000010k7&type=orgClk&v=1&k=3...
uid=15mt000020k8&type=orgClk&v=1&k=2...
uid=15mt000030ss&type=orgClk&v=1&k=9...
![Page 90: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/90.jpg)
Repeated strings compress well
uid=15mt000000k1&type=orgClk&v=1&k=4...
uid=15mt000010k7&type=orgClk&v=1&k=3...
uid=15mt000020k8&type=orgClk&v=1&k=2...
uid=15mt000030ss&type=orgClk&v=1&k=9...
compresses by 85%
![Page 91: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/91.jpg)
Archive directory structure
/orgClk/15mt/0.log4181.seg.gz
logentry type
![Page 92: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/92.jpg)
Archive directory structure
/orgClk/15mt/0.log4181.seg.gz
4-char UID prefix, base 32
![Page 93: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/93.jpg)
Archive directory structure
/orgClk/15mt/0.log4181.seg.gz
4-char UID prefix, base 32
~9.3 hour time period
![Page 94: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/94.jpg)
Archive directory structure
/orgClk/15mt/0.log4181.seg.gz
5-char UID prefix, base 32
![Page 95: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/95.jpg)
Archive directory structure
/orgClk/15mt/0.log4181.seg.gz
5-char UID prefix, base 32
~17 minute time period
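The two time periods follow from the UID encoding. Each base-32 character carries 5 bits; assuming the first nine characters of a UID are a 45-bit millisecond timestamp (consistent with the worked examples in this deck), a prefix of $k$ characters fixes the top $5k$ bits and leaves $45 - 5k$ bits free:

```latex
\begin{align*}
k = 4:\quad 2^{45-20}\,\text{ms} &= 2^{25}\,\text{ms} = 33{,}554{,}432\,\text{ms} \approx 9.3\ \text{hours} \\
k = 5:\quad 2^{45-25}\,\text{ms} &= 2^{20}\,\text{ms} = 1{,}048{,}576\,\text{ms} \approx 17.5\ \text{minutes}
\end{align*}
```

So each directory such as `15mt` spans about 9.3 hours, and each segment-file prefix such as `15mt/0` spans about 17 minutes.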
![Page 96: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/96.jpg)
Archive directory structure
/orgClk/15mt/0.log4181.seg.gz
unique number
![Page 97: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/97.jpg)
Archive directory structure
/orgClk/15mt/0.log4181.seg.gz
unique number
Supports more than 1 segment file per type per 5-char UID prefix
![Page 98: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/98.jpg)
Multiple segment files
Keep Builder memory usage fixed
When Builder memory fills, it flushes to disk
Each flush creates files for 5-char UID prefix
![Page 99: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/99.jpg)
![Page 100: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/100.jpg)
![Page 101: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/101.jpg)
Builder creates the archive
![Page 102: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/102.jpg)
Redundancy
![Page 103: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/103.jpg)
Redundancy
![Page 104: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/104.jpg)
Ensure archive consistency
● Delayed Builder on second server
● Add new segment files for log entries missed by first Builder
● Causes multiple segment files for a 5-char UID prefix
![Page 105: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/105.jpg)
Providing access to logrepo
LogRepositoryReader (“Reader”)
● simple request protocol
● reads from (multiple) segment files
● provides sorted stream of entries to TCP client as quickly as possible
![Page 106: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/106.jpg)
Reader request protocol
1. Start time
2. End time
3. Logrepo type
![Page 107: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/107.jpg)
Reader request using netcat
$ echo 1295905740000 1295913600000 orgClk
start time (ms since 1970-01-01, the start of Unix time)
![Page 108: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/108.jpg)
Reader request using netcat
$ echo 1295905740000 1295913600000 orgClk
end time (ms since 1970-01-01)
![Page 109: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/109.jpg)
Reader request using netcat
$ echo 1295905740000 1295913600000 orgClk
logrepo type
![Page 110: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/110.jpg)
Reader request using netcat
$ echo 1295905740000 1295913600000 orgClk \
| nc 192.168.0.1 9999
send echo across a TCP session
![Page 111: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/111.jpg)
Reader request using netcat
$ echo 1295905740000 1295913600000 orgClk \
| nc 192.168.0.1 9999
uid=15mt00l710k3262q&type=orgClk&v=0&...
uid=15mt00l780k137d9&type=orgClk&v=0&...
...
uid=15mt7ggvj142h06k&type=orgClk&v=0&...
UID-sorted results
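The same request can be made from Java with a plain socket. This is a sketch of a client for the request line shown above: `ReaderClient` is a hypothetical class, and the code assumes the Reader answers with one entry per line and closes the connection when the stream is complete.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical client for the "start end type" request protocol.
final class ReaderClient {
    // Same bytes that `echo start end type` would produce.
    static String requestLine(long startMillis, long endMillis, String type) {
        return startMillis + " " + endMillis + " " + type + "\n";
    }

    // Sends one request and hands each returned entry to the callback.
    static void query(String host, int port, long startMillis, long endMillis,
                      String type, Consumer<String> onEntry) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(requestLine(startMillis, endMillis, type)
                .getBytes(StandardCharsets.UTF_8));
            out.flush();
            BufferedReader in = new BufferedReader(new InputStreamReader(
                socket.getInputStream(), StandardCharsets.UTF_8));
            for (String line; (line = in.readLine()) != null; ) {
                onEntry.accept(line); // entries arrive already sorted by UID
            }
        }
    }
}
```

A call such as `ReaderClient.query("192.168.0.1", 9999, 1295905740000L, 1295913600000L, "orgClk", System.out::println)` would mirror the netcat example, using the address and port from the slide.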
![Page 112: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/112.jpg)
Reading entries from archive
1. Isolate to the type directory
1295905740000 1295913600000 orgClk
![Page 113: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/113.jpg)
Reading entries from archive
2. Convert request timestamps to UID prefix
uidPrefixFromTime(1295905740000) = 15mt0
uidPrefixFromTime(1295913600000) = 15mt7
1295905740000 1295913600000 orgClk
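The slides do not show how `uidPrefixFromTime` is implemented. One encoding that reproduces both values above is to treat the timestamp as a 45-bit millisecond count, keep the top 25 bits, and write them as five base-32 characters using the digits 0-9 followed by a-v. The sketch below is written under that assumption; `to_base32` is a hypothetical helper.

```python
BASE32_DIGITS = "0123456789abcdefghijklmnopqrstuv"


def to_base32(value, width):
    """Encode a non-negative integer as fixed-width base 32 (digits 0-9, then a-v)."""
    chars = []
    for _ in range(width):
        chars.append(BASE32_DIGITS[value & 31])
        value >>= 5
    return "".join(reversed(chars))


def uid_prefix_from_time(timestamp_ms):
    """Return the 5-character (25-bit) prefix of a 45-bit millisecond timestamp."""
    # Dropping the low 20 bits leaves the top 25 bits: one bucket per 2**20 ms.
    return to_base32(timestamp_ms >> 20, 5)


print(uid_prefix_from_time(1295905740000))  # 15mt0
print(uid_prefix_from_time(1295913600000))  # 15mt7
```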
![Page 114: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/114.jpg)
Reading entries from archive
3. Find segments matching first UID prefix
ls orgClk/15mt/0*
orgClk/15mt/0.log3094.seg.gz
orgClk/15mt/0.log4181.seg.gz
1295905740000 1295913600000 orgClk
15mt0
![Page 115: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/115.jpg)
Reading entries from archive
4. Read sorted segments simultaneously, merge into a single sorted stream
/orgClk/15mt/0.log3094.seg.gz:
  uid=15mt000080g1i0j5&type=orgClk&...
  uid=15mt00l780k137d9&type=orgClk&...
/orgClk/15mt/0.log4181.seg.gz:
  uid=15mt00l710k3262q&type=orgClk&...
  uid=15mt00l790k1i2rs&type=orgClk&...
1295905740000 1295913600000 orgClk
![Page 116: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/116.jpg)
Reading entries from archive
4. Read sorted segments simultaneously, merge into a single sorted stream
/orgClk/15mt/0.log3094.seg.gz:
  (1) uid=15mt000080g1i0j5&type=orgClk&...
  (3) uid=15mt00l780k137d9&type=orgClk&...
/orgClk/15mt/0.log4181.seg.gz:
  (2) uid=15mt00l710k3262q&type=orgClk&...
  (4) uid=15mt00l790k1i2rs&type=orgClk&...
1295905740000 1295913600000 orgClk
![Page 117: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/117.jpg)
Reading entries from archive
4. Read sorted segments simultaneously, merge into a single sorted stream
(1) uid=15mt000080g1i0j5&type=orgClk&...
(2) uid=15mt00l710k3262q&type=orgClk&...
(3) uid=15mt00l780k137d9&type=orgClk&...
(4) uid=15mt00l790k1i2rs&type=orgClk&...
1295905740000 1295913600000 orgClk
![Page 118: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/118.jpg)
Reading entries from archive
5. Only return log entries between timestamps
(1) uid=15mt000080g1i0j5&type=orgClk&...
(2) uid=15mt00l710k3262q&type=orgClk&...
(3) uid=15mt00l780k137d9&type=orgClk&...
(4) uid=15mt00l790k1i2rs&type=orgClk&...
1295905740000 1295913600000 orgClk
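Steps 4 and 5 amount to a k-way merge followed by a time filter. The sketch below uses Python's `heapq.merge` and assumes that every entry begins with `uid=` followed by a 16-character UID, that the first nine UID characters are a base-32 millisecond timestamp, and that the end time is exclusive. None of these details are stated on the slides, and the function names are hypothetical.

```python
import gzip
import heapq


def uid_of(entry):
    """Extract the UID, assuming every entry starts with 'uid=<16 chars>'."""
    return entry[4:20]


def uid_timestamp_ms(uid):
    """Decode the first 9 base-32 characters as a millisecond timestamp (assumed layout)."""
    return int(uid[:9], 32)


def read_segment(path):
    """Yield the entries of one gzip-compressed, UID-sorted segment file."""
    with gzip.open(path, "rt", encoding="utf-8") as lines:
        for line in lines:
            yield line.rstrip("\n")


def merge_segments(segments, start_ms, end_ms):
    """Merge UID-sorted entry streams, keeping entries with start_ms <= time < end_ms."""
    for entry in heapq.merge(*segments, key=uid_of):
        if start_ms <= uid_timestamp_ms(uid_of(entry)) < end_ms:
            yield entry
```

Under these assumptions the first entry on the slide, `15mt000080g1i0j5`, decodes to a time about 22 seconds before the requested start, so it is dropped, which agrees with the netcat output starting at `15mt00l710k3262q`. `heapq.merge` holds only one pending entry per segment in memory, so the segments never need to be loaded in full.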
![Page 119: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/119.jpg)
Reading entries from archive
6. Read segments for each UID prefix, one prefix at a time
1295905740000 1295913600000 orgClk
15mt0 15mt7
15mt1 15mt2 15mt3 15mt4 15mt5 15mt6
![Page 120: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/120.jpg)
Reading entries from archive
7. Stop reading files when entry crosses request boundary
1295905740000 1295913600000 orgClk
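Steps 6 and 7 can be sketched as a loop over consecutive prefixes that stops as soon as an entry falls past the end of the request. This is an illustration only: it assumes the 45-bit timestamp and base-32 alphabet (0-9, a-v) that reproduce the prefixes shown above, an exclusive end time, and a hypothetical `segments_for_prefix` callable that returns the UID-sorted segment streams for one prefix.

```python
import heapq

BASE32_DIGITS = "0123456789abcdefghijklmnopqrstuv"


def to_base32(value, width):
    """Encode a non-negative integer as fixed-width base 32 (digits 0-9, then a-v)."""
    chars = []
    for _ in range(width):
        chars.append(BASE32_DIGITS[value & 31])
        value >>= 5
    return "".join(reversed(chars))


def prefixes_between(start_ms, end_ms):
    """List, in order, every 5-character prefix whose time bucket overlaps the request."""
    return [to_base32(bucket, 5) for bucket in range(start_ms >> 20, (end_ms >> 20) + 1)]


def read_range(start_ms, end_ms, segments_for_prefix):
    """Yield entries in UID order, one prefix at a time, stopping at the end boundary."""
    for prefix in prefixes_between(start_ms, end_ms):
        # Entries start with 'uid=<uid>', so plain string order matches UID order.
        for entry in heapq.merge(*segments_for_prefix(prefix)):
            timestamp_ms = int(entry[4:13], 32)
            if timestamp_ms < start_ms:
                continue
            if timestamp_ms >= end_ms:
                return  # crossed the request boundary; later entries are only newer
            yield entry
```

Reading one prefix at a time keeps the number of open files bounded by the segment count of a single time bucket rather than of the whole request.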
![Page 121: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/121.jpg)
![Page 122: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/122.jpg)
![Page 123: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/123.jpg)
The first years (2007 & 2008)
● Single datacenter
● App servers
● 2 logrepo servers
● syslog-ng
● Builder
● Reader
![Page 124: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/124.jpg)
Growth
job seekers
![Page 125: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/125.jpg)
Growth
job seekers
products
![Page 126: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/126.jpg)
Growth
job seekers
products
datacenters
![Page 127: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/127.jpg)
Growth
log entries
![Page 128: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/128.jpg)
Multi-datacenter rationale
Latency
Redundancy
![Page 129: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/129.jpg)
Multi-datacenter rationale
Job seekers
![Page 130: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/130.jpg)
Logrepo in multiple datacenters
● Single datacenter
  ● Consumers
  ● Reader
● Every datacenter
  ● Applications producing log entries
  ● 2 syslog servers
  ● Builders (minimize Internet traffic)
![Page 131: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/131.jpg)
![Page 132: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/132.jpg)
Single datacenter archival
/dc1/orgClk/15mt/0.log4181.seg.gz
event type (orgClk means organic search result click)
25-bit timestamp prefix, base 32
~17-minute time period
random number
![Page 133: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/133.jpg)
Multiple datacenter archival
/dc1/orgClk/15mt/0.log4181.seg.gz
event type (orgClk means organic search result click)
25-bit timestamp prefix, base 32
~17-minute time period
random number
datacenter
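The labeled parts of the path can be pulled apart mechanically. The sketch below is a hypothetical helper, not Indeed code; it assumes every segment path follows exactly the shape of the example, with a four-character directory plus one leading filename character forming the five-character UID prefix.

```python
import re

SEGMENT_PATH = re.compile(
    r"^/(?P<datacenter>[^/]+)/(?P<type>[^/]+)/(?P<prefix4>[0-9a-v]{4})/"
    r"(?P<prefix5>[0-9a-v])\.log(?P<random>\d+)\.seg\.gz$"
)


def parse_segment_path(path):
    """Split a segment path into datacenter, event type, UID prefix, and random number."""
    match = SEGMENT_PATH.match(path)
    if match is None:
        raise ValueError(f"not a segment path: {path!r}")
    return {
        "datacenter": match["datacenter"],
        "type": match["type"],
        "uid_prefix": match["prefix4"] + match["prefix5"],
        "random": int(match["random"]),
    }


print(parse_segment_path("/dc1/orgClk/15mt/0.log4181.seg.gz"))
```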
![Page 134: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/134.jpg)
Datacenter dirs avoid collisions
~$ ls */orgClk/15mt/0*
dc1/orgClk/15mt/0.log1481.seg.gz
dc3/orgClk/15mt/0.log1481.seg.gz
Different datacenters
![Page 135: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/135.jpg)
Datacenter dirs avoid collisions
~$ ls */orgClk/15mt/0*
dc1/orgClk/15mt/0.log1481.seg.gz
dc3/orgClk/15mt/0.log1481.seg.gz
Same segment filename
Independent Builders
![Page 136: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/136.jpg)
uid=18ducm8u50nk23qh
Date = 2014-01-10 Time = 09:35:24.357
Server id = 1512
App instance id = 2
UID Version = 0
Random value = 3921
UID breakdown
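The slide gives the decoded fields but not the bit layout. One layout that reproduces every value shown is: 16 base-32 characters (digits 0-9, then a-v) forming 80 bits, split into a 45-bit millisecond timestamp, a 16-bit server id, a 4-bit app instance id, a 3-bit version, and a 12-bit random value. The sketch below decodes under that inferred layout, which is an assumption rather than a documented format.

```python
from datetime import datetime, timezone


def decode_uid(uid):
    """Split a 16-character base-32 UID into fields, using bit widths inferred from the slide."""
    value = int(uid, 32)  # 16 characters x 5 bits = 80 bits
    return {
        "timestamp_ms": value >> 35,             # top 45 bits
        "server_id": (value >> 19) & 0xFFFF,     # next 16 bits
        "app_instance_id": (value >> 15) & 0xF,  # next 4 bits
        "version": (value >> 12) & 0x7,          # next 3 bits
        "random": value & 0xFFF,                 # low 12 bits
    }


fields = decode_uid("18ducm8u50nk23qh")
print(fields)
# The timestamp falls at 15:35:24 UTC on 2014-01-10, which is 09:35:24 at UTC-6.
print(datetime.fromtimestamp(fields["timestamp_ms"] / 1000, tz=timezone.utc))
```

Because the timestamp occupies the leading characters, sorting UIDs as plain strings also sorts them by time, which is what lets segments be merged and cut at request boundaries.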
![Page 137: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/137.jpg)
![Page 138: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/138.jpg)
Using server ID for uniqueness
Each datacenter gets 256 server IDs
1. DC #1 uses 0 - 255
2. DC #2 uses 256 - 511
3. DC #3 uses 512 - 767
4. ...
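The allocation is simple integer arithmetic. A minimal sketch, assuming the pattern of 256 consecutive IDs per datacenter continues past the three ranges listed (the function names are hypothetical):

```python
SERVER_IDS_PER_DATACENTER = 256


def server_id_range(datacenter_number):
    """Inclusive (low, high) server-ID range for a 1-based datacenter number."""
    low = (datacenter_number - 1) * SERVER_IDS_PER_DATACENTER
    return low, low + SERVER_IDS_PER_DATACENTER - 1


def datacenter_for_server_id(server_id):
    """1-based datacenter number that owns a given server ID."""
    return server_id // SERVER_IDS_PER_DATACENTER + 1


print(server_id_range(3))            # (512, 767)
print(datacenter_for_server_id(600))  # 3
```

Since no two datacenters share a server ID, UIDs generated at the same millisecond in different datacenters still differ.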
![Page 139: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/139.jpg)
The next years (2009 - 2011)
● Multiple datacenters
  ● 2 logrepo servers
  ● syslog-ng
  ● Builder
● Consumer datacenter
  ● Reader
  ● Consumers
![Page 140: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/140.jpg)
More log entries
More consumers
![Page 141: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/141.jpg)
Diverse requests
![Page 142: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/142.jpg)
Single server disk bottleneck
![Page 143: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/143.jpg)
Scaling logrepo reads
Bottleneck: single active Reader server
Goal: spread logrepo accesses across a cluster of servers
![Page 144: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/144.jpg)
Read logrepo from HDFS
Hadoop Distributed File System (HDFS)
“a distributed file-system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.”
http://hadoop.apache.org/docs/stable1/hdfs_design.html
![Page 145: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/145.jpg)
Using HDFS for logrepo access
![Page 146: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/146.jpg)
Using HDFS for logrepo access
![Page 147: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/147.jpg)
Using HDFS for logrepo access
![Page 148: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/148.jpg)
Resilient logrepo in HDFS
Store each log entry on 3 servers
![Page 149: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/149.jpg)
Push to HDFS quickly
Mirror every segment file into HDFS
![Page 150: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/150.jpg)
Push to HDFS quickly
/dc1/orgClk/15mt/0.log4181.seg.gz
5-char UID prefix, base 32
~17-minute time period
500,000+ files per day
![Page 151: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/151.jpg)
HDFS optimized for fewer files
Reducing the number of logrepo files in HDFS keeps us efficient
![Page 152: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/152.jpg)
HDFS optimized for fewer files
Reducing the number of logrepo files in HDFS keeps us efficient
HDFSArchiver
![Page 153: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/153.jpg)
Archive yesterday in HDFS
/dc1/orgClk/15mt/0.log4181.seg.gz
20-bit timestamp prefix
~9.3 hour period
2,500 files per day
type
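The two period lengths follow from how many timestamp bits the prefix keeps. A short worked calculation, assuming the UID timestamp is 45 bits of milliseconds (nine base-32 characters), which is an inference from the UID examples rather than something the slides state:

```python
TIMESTAMP_BITS = 45  # assumption: 9 base-32 characters of millisecond timestamp


def bucket_length_ms(prefix_bits):
    """Milliseconds covered by one prefix when only the top prefix_bits are kept."""
    return 2 ** (TIMESTAMP_BITS - prefix_bits)


def buckets_per_day(prefix_bits):
    """How many prefix buckets one day spans."""
    return 86_400_000 / bucket_length_ms(prefix_bits)


print(bucket_length_ms(25) / 60_000)     # minutes per 25-bit bucket, about 17.5
print(bucket_length_ms(20) / 3_600_000)  # hours per 20-bit bucket, about 9.3
print(buckets_per_day(25), buckets_per_day(20))
```

Dropping five bits from the prefix makes each bucket 32 times longer, so a day spans far fewer buckets and each archived file covers far more entries.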
![Page 154: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/154.jpg)
Scaling logrepo in HDFS
500,000+ files per day
2,500 files per day
![Page 155: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/155.jpg)
Logrepo
A distributed event logging system
Created @IndeedEng
● Application
Open source
● log4j
![Page 156: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/156.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
Open source
● log4j
Logrepo
A distributed event logging system
![Page 157: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/157.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
Open source
● log4j
● syslog-ng
Logrepo
A distributed event logging system
![Page 158: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/158.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
● Builder
Open source
● log4j
● syslog-ng
Logrepo
A distributed event logging system
![Page 159: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/159.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
● Builder
Open source
● log4j
● syslog-ng
● gzip
Logrepo
A distributed event logging system
![Page 160: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/160.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
● Builder
● Reader
Open source
● log4j
● syslog-ng
● gzip
Logrepo
A distributed event logging system
![Page 161: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/161.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
● Builder
● Reader
Open source
● log4j
● syslog-ng
● gzip
● rsync+ssh
Logrepo
A distributed event logging system
![Page 162: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/162.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
● Builder
● Reader
Open source
● log4j
● syslog-ng
● gzip
● rsync+ssh
● Hadoop
Logrepo
A distributed event logging system
![Page 163: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/163.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
● Builder
● Reader
● HDFSPusher
Open source
● log4j
● syslog-ng
● gzip
● rsync+ssh
● Hadoop
Logrepo
A distributed event logging system
![Page 164: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/164.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
● Builder
● Reader
● HDFSPusher
● HDFSReader
Open source
● log4j
● syslog-ng
● gzip
● rsync+ssh
● Hadoop
Logrepo
A distributed event logging system
![Page 165: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/165.jpg)
Created @IndeedEng
● Application
● SyslogTcpAppender
● Builder
● Reader
● HDFSPusher
● HDFSReader
● HDFSArchiver
Open source
● log4j
● syslog-ng
● gzip
● rsync+ssh
● Hadoop
Logrepo
A distributed event logging system
![Page 166: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/166.jpg)
All time logrepo = 150 TB compressed
![Page 167: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/167.jpg)
jobsearch event setabredistimeacmetimeaddltimeadscadsdelayadsibadscbadsiboostojcboostojibsjcbsjcwiabsjibsjindappliesbsjindappviewsbsjrevbsjwiackcntckszcountsctkagectkagedaysdayofweekdcpingtimedomTotalTimeds-mpo
dsmissdstimefeatempfjfreekwacfreekwarevfreesjcfreesjrevfrmtimegalatdelayiplatiplongjslatdelayjsvdelaykwackwacdelaykwaikwarevkwcntlacinsizelacsgsizelmstimempotimemprtimenavTotTimendxtime
ojcojclongojcshortojcwiaojiojindappliesojindappviewsojwiaoocscpageprcvdlatencyprimfollowcntprvwojiprvwojlatprvwojopentimeprvwojreqradscradsirecidlookupbudgetrectimeredirCountredirTimerelfollowcntrespTimereturnvisitrojc
rojirqcntrqlcntrqqcntrrsjcrrsjirrsjrevrsavailrsjcrsjirsusedrsviableserpsizesjcsjcdelaysjclongsjcntsjcshortsjcwiasjisjindappliessjindappviewssjrevsjwiasllatsllong
sqcsqisugtimesvjsvjnostarsvjstartadsctadsitimetimeofdaytotcnttotfollowcnttotrevtottimetsjctsjcwiatsjitsjindappliestsjindappviewstsjrevtsjwiaunqcntvpwacinsizewacsgsize
![Page 168: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/168.jpg)
acmepageacmereviewmodacmeserviceacmesessionadclickadcrequestadcrevadschanneladsclickadsenseclickadveadvtagghttpaggjiraaggjobaggjob_waldorfaggsherlockaggsourcehealthagstimingapiapijsvapisearcharchiveindexarchiveindex_shingled_testbincarclicksclickclickanalyticscobranddctmismatchdrawdupepairsdupepairs_minidupepairs_olddupepairsalldupepairsall_miniejcheckeremilyops
feedbridgeglobalnavgooglebot_organichomepageimpressionindeedapplyjhstjobalertjobalertorganicjobalertsearchjobalertsponsoredjobexpirationjobexpiration2jobexpiration3jobprocessedjobqueueblockjobsearchjssquerykeywordAdlocsvclucyindexermainmechanicalturkmindyopsmobhomepagemobilmobilemobileorganicmobilesponsoredmobrecjobsmobsearchmobviewjobmyindeedmyindfunnelmyindpagemyindrezcreatemyindsessionoldopsesjasx
organicorgmodelorgmodelsubsetorgmodelsubset90passportaccountpassportpagepassportsigninramsaccessrecjobsrecommendserviceresumedataresumesearchrexcontactsrexfunnelreximpressionrexsearchrezSrchSearchrezalertrezalertfunnelrezfunnelrezjserrrezsrchrequestrezviewsearchablejobsseosessionsjmodelsponsoredsysadappinfosysadapptimingtestndxtestndx1testndx2tmpusrsvccacheusrsvcrequestviewjobwebusersignin
![Page 169: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/169.jpg)
Every day at Indeed
● Create 5 billion log entries
● App spends 0.03 ms to create each log entry
● Add 500 GB to the archive
● Add 1.5 TB to HDFS
● Consumers read from HDFS at 18.5 GB/s
● 100s of consumers request 1000 different logrepo types
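A few derived figures put these daily numbers in proportion. The calculation below assumes decimal units (1 GB = 10^9 bytes) and an even spread of traffic over the day, neither of which the slide states. The ratio of 3 between HDFS and archive growth would be consistent with each log entry being stored on 3 servers, although the slide does not say that this is the cause.

```python
SECONDS_PER_DAY = 86_400

entries_per_day = 5_000_000_000
archive_bytes_per_day = 500 * 10**9  # 500 GB, assuming decimal gigabytes
hdfs_bytes_per_day = 1.5 * 10**12    # 1.5 TB, assuming decimal terabytes
create_cost_ms = 0.03                # application time per log entry

entries_per_second = entries_per_day / SECONDS_PER_DAY
archive_bytes_per_entry = archive_bytes_per_day / entries_per_day
hdfs_to_archive_ratio = hdfs_bytes_per_day / archive_bytes_per_day
logging_cpu_hours_per_day = entries_per_day * create_cost_ms / 1000 / 3600

print(round(entries_per_second))            # average entries per second, about 57,870
print(archive_bytes_per_entry)              # compressed bytes per archived entry
print(hdfs_to_archive_ratio)                # HDFS growth relative to archive growth
print(round(logging_cpu_hours_per_day, 1))  # total application time spent logging, in hours
```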
![Page 170: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/170.jpg)
Four types of consumers
Ad-hoc command line
Standard Java programs
Hadoop map/reduce
Real-time monitoring
![Page 171: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/171.jpg)
$ echo 1388556000000 1388642400000 jobsearch \
| nc logrepo 9999
uid=18d6666o916r15g3&type=jobsearch&q=VP+IT
uid=18d6666ob0mp27aa&type=jobsearch&q=Lab+Tech
uid=18d6666ob0nl15ce&type=jobsearch&q=daycare
uid=18d6666og0nk24rb&type=jobsearch&q=Chef+Upscale
...
Command line access
![Page 172: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/172.jpg)
Reuses standard unix tools and patterns
$ echo 1388556000000 1388642400000 jobsearch \
| nc logrepo 9999 \
| egrep -o '&searchTime=[^&]+' \
| egrep -o '[0-9]+' \
| sort -r -n \
| head
Slowest searches from log entries
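The same question can be answered in a program once the entries are available as lines of text. A minimal Python sketch of the pipeline's logic, assuming `searchTime` holds a plain non-negative integer (the function name is hypothetical, and unlike the shell version it ignores values that do not start with a digit):

```python
import heapq
import re

SEARCH_TIME = re.compile(r"&searchTime=(\d+)")


def slowest_searches(entries, count=10):
    """Return the `count` largest searchTime values, largest first (like `sort -r -n | head`)."""
    times = []
    for entry in entries:
        for value in SEARCH_TIME.findall(entry):
            times.append(int(value))
    return heapq.nlargest(count, times)
```

For example, `slowest_searches(open("entries.txt"))` would process a saved copy of the Reader output, where `entries.txt` is a placeholder file name.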
![Page 173: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/173.jpg)
Programmatic access is trivial
We have clients for
● java
● python
● php
● pig
![Page 174: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/174.jpg)
A typical logrepo consumer (single machine)
Reads one primary log event type
Reads a dozen child events per primary
Total size of each event set = 10KB
![Page 175: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/175.jpg)
A typical logrepo consumer (single machine)
Millions of events read per run
Thousands of consumers run each day
Tens of terabytes processed each day
![Page 176: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/176.jpg)
Efficient Parsing
Important for single machine consumers
Log entry parsing too slow
Fast
Minimize memory usage
![Page 177: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/177.jpg)
URL String Parsing (now available on GitHub)
4x faster than String.split(...), generates 50% less garbage
Parses 1 million log entries of size 0.5K each in 3 seconds
https://github.com/indeedeng
http://go.indeed.com/urlparsing
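One general way to avoid the garbage that splitting creates is to scan the entry once and report each key and value as index ranges, materializing a substring only for the fields a consumer actually needs. The Python sketch below illustrates that idea only. It is not the API of the Indeed library, the function names are hypothetical, and Python will not reproduce the performance figures quoted above.

```python
def iter_field_spans(entry):
    """Yield (key_start, key_end, value_start, value_end) for each key=value pair."""
    position = 0
    length = len(entry)
    while position < length:
        pair_end = entry.find("&", position)
        if pair_end == -1:
            pair_end = length
        equals = entry.find("=", position, pair_end)
        if equals == -1:
            # A key with no '=': report an empty value span.
            yield position, pair_end, pair_end, pair_end
        else:
            yield position, equals, equals + 1, pair_end
        position = pair_end + 1


def find_value(entry, key):
    """Return the raw (still URL-encoded) value for key, or None if the key is absent."""
    for key_start, key_end, value_start, value_end in iter_field_spans(entry):
        if key_end - key_start == len(key) and entry.startswith(key, key_start):
            return entry[value_start:value_end]
    return None


entry = "uid=18d6666o916r15g3&type=jobsearch&q=VP+IT"
print(find_value(entry, "q"))  # VP+IT
```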
![Page 178: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/178.jpg)
Hadoop clients
Reliable, scalable, distributed computing
![Page 179: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/179.jpg)
Hadoop clients
Reliable, scalable, distributed computing
Most new consumers use Hadoop
![Page 180: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/180.jpg)
Hadoop clients
Reliable, scalable, distributed computing
Most new consumers use Hadoop
Read log entries directly from HDFS
![Page 181: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/181.jpg)
Hadoop clients
Reliable, scalable, distributed computing
Most new consumers use Hadoop
Read log entries directly from HDFS
Divide and conquer to scale
![Page 182: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/182.jpg)
Monitoring
Want to monitor
● Business metrics
● Operational metrics
“Available soon” isn’t good enough
![Page 183: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/183.jpg)
Datadog
Third party monitoring service
Stream metrics to Datadog HQ
Real-time dashboards
![Page 184: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/184.jpg)
Datadog
![Page 185: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/185.jpg)
miniEPL
'jobsearch.organic_clk': "SELECT COUNT(*), 'clicks' AS unit FROM orgClk",
'jobsearch.totTime': "SELECT int(totTime), 'ms' AS unit FROM jobsearch(totTime IS NOT NULL)",
'mobile.mobsearch.oji': "SELECT tupleCount(orgRes), 'results' AS unit FROM mobsearch",
![Page 186: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/186.jpg)
Getting logs into Datadog
![Page 187: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/187.jpg)
Data redundancy
Replaying events
Click charging
![Page 188: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/188.jpg)
Replaying events
1. Job alert email sign up broke for logged in users
![Page 189: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/189.jpg)
Replaying events
1. Job alert email sign up broke for logged in users
2. Got alert parameters + jobsearch uid from access logs
![Page 190: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/190.jpg)
Replaying events
1. Job alert email sign up broke for logged in users
2. Got alert parameters + jobsearch uid from access logs
3. Got account id from jobsearch log entries
![Page 191: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/191.jpg)
Replaying events
1. Job alert email sign up broke for logged in users
2. Got alert parameters + jobsearch uid from access logs
3. Got account id from jobsearch log entries
4. Recreated job alert sign ups
![Page 192: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/192.jpg)
Click charging
1. Store sponsored click data in database
![Page 193: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/193.jpg)
Click charging
1. Store sponsored click data in database
2. Log sponsored click data to logrepo
![Page 194: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/194.jpg)
Click charging
1. Store sponsored click data in database
2. Log sponsored click data to logrepo
3. Verify logs match database
![Page 195: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/195.jpg)
Click charging
1. Store sponsored click data in database
2. Log sponsored click data to logrepo
3. Verify logs match database
4. Charge for clicks
![Page 196: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/196.jpg)
Click charging
1. Store sponsored click data in database
2. Log sponsored click data to logrepo
3. Verify logs match database
4. Charge for clicks
5. Profit!
![Page 197: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/197.jpg)
What does logrepo enable?
Answering business and operational questions
Data-driven decisions
![Page 198: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/198.jpg)
Average cover letter length inside US vs. outside US?
![Page 199: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/199.jpg)
Mobile searches per hour in JP vs. UK?
![Page 200: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/200.jpg)
Resume creation by country?
![Page 201: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/201.jpg)
Email alert opens by email domain?
![Page 202: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/202.jpg)
Percent of app downloads from iOS, Android, Windows?
![Page 203: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/203.jpg)
How quickly does a datacenter take on traffic after a failover?
![Page 204: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/204.jpg)
Q & A
https://github.com/indeedeng
http://go.indeed.com/urlparsing
![Page 205: [@IndeedEng] Logrepo: Enabling Data-Driven Decisions](https://reader033.fdocuments.in/reader033/viewer/2022051323/546e8dffb4af9fcd268b46e3/html5/thumbnails/205.jpg)
Next @IndeedEng Talk
Big Value from Big Data:
Building Decision Trees at Scale
Andrew Hudson, Indeed CTO
February 26, 2014
http://engineering.indeed.com/talks