Lean Hiring Aided by Machine Learning December 18, 2014 Presented by Vinayak Joglekar, Co-Founder...
Lean Hiring Aided by Machine Learning
December 18, 2014
Presented by
Vinayak Joglekar, Co-Founder and CTO, Synerzip
Confidential
Discussion Topics
1. The Problem
2. Lean Hiring
3. Resume Ranking - Choice of Algorithm
4. Data Acquisition and Cleaning Challenges
5. Initial Results
6. Improving Accuracy
7. The Road Ahead
Needle in the Haystack Situation
1. Waste in the process of hiring = resumes reviewed but not selected + candidates interviewed but not selected.
2. Every candidate “hypes” the resume to a certain extent. Much time is wasted reading pages of hyperbole to discover the grain of truth.
3. Hiring managers have other priorities - their precious time is wasted in interviewing unsuitable candidates whose resumes look good.
More Choice Isn’t Always Better
Can I see some more resumes?
Send some more
Some more please
Is this all you have?
Week 1
Week 2
Week 3
Feedback after 4 weeks
Myth: More choice = better selection
Reality: More choice = waste + delay + confusion
Make Them Jump through Hoops
1. A common response to the problem is a strict filtration process consisting of a series of tests and interviews.
2. Good candidates are often not actively looking for a change. They get turned off by the long evaluation process.
3. The evaluation process is often flawed, placing more stress on specific skills than on abilities. Tests are susceptible to gaming.
Kanban
• Extensively used in the automobile industry.
• Principle: Any process consisting of a workflow can’t run faster than its bottleneck.
• All sub-processes that run faster than the bottleneck produce waste.
• Kanban ensures all sub-processes march to the drum-beat set by the bottleneck.
• Kanban is pull-based. A sub-process can’t pull more work than a pre-set WIP limit.
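The bottleneck principle and the WIP limit can be illustrated with a small sketch. The stage names and weekly capacities below are made-up numbers chosen for illustration; they are not figures from the presentation.

```python
# Hypothetical weekly capacities for each stage of a hiring workflow.
stage_capacity = {
    "resume screening": 40,
    "phone interview": 15,
    "hiring-manager interview": 5,  # the bottleneck in this example
}

def bottleneck(capacities):
    """Return the (stage, capacity) pair with the lowest capacity."""
    return min(capacities.items(), key=lambda item: item[1])

def surplus_per_stage(capacities):
    """Work each stage can produce beyond what the bottleneck can absorb."""
    _, limit = bottleneck(capacities)
    return {stage: cap - limit for stage, cap in capacities.items()}

def pull(queue, in_progress, wip_limit):
    """Pull items from the queue only while the WIP limit allows it."""
    free_slots = max(0, wip_limit - len(in_progress))
    pulled, remaining = queue[:free_slots], queue[free_slots:]
    return in_progress + pulled, remaining
```

In this sketch, anything screening produces beyond the interview capacity simply waits in the queue, which is the waste the slide refers to.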
Solution: Let’s Limit the Choice!
Challenge: Ranking Resumes
Job 1 Job 2 Job 3 Job 4
Expected Pay
Experience
Education
Job Switches
Soft skills
Location
Lead Time
It took half a day for an experienced recruiter to rank a few resumes. He kept asking what was more important: Was it pay, soft skills or experience? The answer is that it depends on the job. Each job needs a different weightage to be assigned to each one of these attributes. Thus ranking reduces to assigning appropriate weightages to the attributes.
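To make the idea of job-specific weightages concrete, here is a minimal sketch. The candidates, the two jobs and every number are invented for illustration, and each attribute is assumed to have been scaled to the range 0 to 1 beforehand.

```python
# Hypothetical candidates; every attribute is assumed to be pre-scaled to 0..1.
candidates = {
    "A": {"expected_pay": 0.9, "experience": 0.8, "soft_skills": 0.4},
    "B": {"expected_pay": 0.3, "experience": 0.5, "soft_skills": 0.9},
}

# Hypothetical per-job weightages: the same attribute matters differently per job.
job_weights = {
    "Job 1": {"expected_pay": -0.5, "experience": 2.0, "soft_skills": 0.2},
    "Job 2": {"expected_pay": -0.5, "experience": 0.5, "soft_skills": 2.0},
}

def score(attributes, weights):
    """Weighted sum of a candidate's attribute values."""
    return sum(weights[name] * value for name, value in attributes.items())

def rank(candidates, weights):
    """Return candidate names ordered from highest to lowest score."""
    return sorted(candidates, key=lambda name: score(candidates[name], weights), reverse=True)
```

With these made-up weights, candidate A ranks first for Job 1 (experience dominates) while candidate B ranks first for Job 2 (soft skills dominate).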
Why Logistic Regression?
• Training a machine learning algorithm for resume ranking would require a significant number of resumes that are pre-ranked by a human expert.
• It is very difficult for a human expert to rank resumes.
• On the other hand, we have a lot of resumes that are classified as suitable or unsuitable, which can be used to train the logistic regression algorithm.
The graph here shows how the probability of a resume being suitable depends on the attributes (X) and the weights assigned to them (θ):
$h_\theta(X) = \frac{1}{1 + e^{-Z}}$, where $Z = \theta_1 X_1 + \theta_2 X_2 + \dots + \theta_n X_n$,
$X_1, X_2, \dots, X_n$ are the various attributes and $\theta_1, \theta_2, \dots, \theta_n$ are the weights assigned to them.
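The hypothesis function above can be sketched in a few lines. The function names are our own, and the branch for negative Z is an implementation detail added here only to avoid numeric overflow.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^-z), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def hypothesis(theta, x):
    """Probability that a resume is suitable, given weights theta and attributes x."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

When the weighted sum Z is zero the function returns 0.5; large positive Z pushes the probability towards 1 and large negative Z towards 0.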
What is Decision Boundary?
• x1 = experience and x2 = pay
• 1 = suitable and 0 = unsuitable
• Observation 1: More pay reduces the probability of selection, and more experience increases it
• Decision boundary is an imaginary line that separates suitable & unsuitable examples
• In this case the line is x1-x2-3=0
• Points below the line are likely to be suitable
• Points above the line are likely to be unsuitable
• Points along the line have equal probability of being suitable or unsuitable – hence $h_\theta(X) = 0.5$ and $\theta^T X = 0$ along this line
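The example boundary x1-x2-3=0 can be checked numerically. This is a sketch under one assumption: the constant -3 is treated as an intercept weight paired with a fixed input of 1, which the slide's formula for Z does not show explicitly. The function names are ours.

```python
import math

# Weights implied by the boundary x1 - x2 - 3 = 0.
# THETA[0] = -3 is an assumed intercept; THETA[1] weights experience (x1)
# and THETA[2] weights pay (x2).
THETA = [-3.0, 1.0, -1.0]

def probability_suitable(experience, pay, theta=THETA):
    """Logistic probability for one (experience, pay) point."""
    z = theta[0] + theta[1] * experience + theta[2] * pay
    return 1.0 / (1.0 + math.exp(-z))

def classify(experience, pay):
    """Return 1 (suitable) when the probability is at least 0.5, else 0."""
    return 1 if probability_suitable(experience, pay) >= 0.5 else 0
```

A point on the line, such as experience 4 and pay 1, gets probability 0.5. More experience at the same pay moves the point to the suitable side, and less experience moves it to the unsuitable side.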
Training, Test & Validation Sets
Testing measures the accuracy with which the algorithm predicts true positives and true negatives. In tests such as cancer detection, false negatives can prove fatal.
[Figure: The available data is split into a training set (60%), a validation set (20%) and a test set (20%). The training set is the input to the supervised learning algorithm; its output is checked against the validation set for over- or under-fitting and then against the test set.]
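The 60/20/20 split can be sketched as follows. Shuffling before splitting and the fixed seed are our assumptions; the slide only gives the proportions.

```python
import random

def split_60_20_20(records, seed=0):
    """Shuffle the records and split them into training, validation and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = len(shuffled) * 60 // 100
    n_validation = len(shuffled) * 20 // 100
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]  # whatever remains, about 20%
    return train, validation, test
```

The training set fits the weights, the validation set is used to compare model choices (for example, to spot over- or under-fitting), and the test set is touched only for the final accuracy figure.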
Implementation Challenges
[Figure: scatter plot of examples labelled 1 (suitable) and 0 (unsuitable), annotated “Decision Boundary”]
• It is very difficult to rank or grade resumes manually, so we can’t use standard ranking algorithms.
• Small training sets, even smaller test sets.
• There are more than 13 attributes (experience, education, pay, location, availability, stability, current job, etc.) based on which a candidate is selected or rejected. Many of these attributes are subjective and need to be quantified.
• There is no clear decision boundary.
• We addressed these challenges by using data from our ATS (applicant tracking system) on 20 job openings, for which more than 3000 resumes were considered and more than 400 candidates were found suitable to be called for interview.
• We used a sixth degree polynomial, which lends itself well to rendering a decision boundary with an irregular shape.
• We quantified the subjective attributes such as education and stability.
• We used every 4th record to test the algorithm and the others to train it.
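Two of the steps above can be sketched briefly: expanding two attributes into sixth degree polynomial terms, and holding out every 4th record for testing. The exact set of terms is an assumption (all products of the two attributes up to total degree 6, including the constant term); the presentation does not list them. The function names are ours.

```python
def map_features(x1, x2, degree=6):
    """Return all terms x1**(i-j) * x2**j for 0 <= j <= i <= degree."""
    return [
        (x1 ** (i - j)) * (x2 ** j)
        for i in range(degree + 1)
        for j in range(i + 1)
    ]

def split_every_fourth(records):
    """Hold out every 4th record for testing; train on the rest."""
    train, test = [], []
    for position, record in enumerate(records, start=1):
        (test if position % 4 == 0 else train).append(record)
    return train, test
```

For two attributes and degree 6 this produces 28 terms, so logistic regression learns 28 weights instead of 2, which is what allows the boundary to bend.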
Results and Analysis
• When we used the weightages delivered by the algorithm to predict, we correctly predicted 89% of the examples in the training set that was used to train the algorithm.
• The same algorithm correctly predicted 65% of the test examples.
• We improved the accuracy by 9% when we used the sixth degree polynomial.
• We used the weightages to assign ranks, and the ranking was well accepted and appreciated by hiring managers within Synerzip.
• We started practicing Kanban and lean hiring, as ranking enabled us to put a WIP limit on the number of resumes entering the hiring process.
• The hiring efficiency improved and we were able to fill more positions without adding any new recruiters.
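The two uses of the predicted probabilities described above, measuring accuracy and capping the number of resumes that enter the process, can be sketched like this. The 0.5 threshold, the function names and the sample values in the note are illustrative assumptions.

```python
def accuracy(predicted_probabilities, labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    predictions = [1 if p >= threshold else 0 for p in predicted_probabilities]
    correct = sum(1 for pred, label in zip(predictions, labels) if pred == label)
    return correct / len(labels)

def shortlist(probability_by_resume, wip_limit):
    """Rank resumes by predicted probability and keep only the top wip_limit."""
    ranked = sorted(probability_by_resume, key=probability_by_resume.get, reverse=True)
    return ranked[:wip_limit]
```

For example, with made-up probabilities of 0.3, 0.9 and 0.6 for three resumes and a WIP limit of 2, only the resumes scored 0.9 and 0.6 would be passed on to the hiring manager.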
Example Weightages - Analysis
The weightages for the various attributes turn out to be a fairly distributed set of values. Each job opening uses an independent assessment of resumes.
This job opening gives an extremely negative weightage to “current compensation”, which means that candidates earning well are not suitable, while it’s just the opposite for most other job openings.
This position assigns a positive weightage to total experience but a negative weightage to relevant experience. The requirement was for a broader skillset beyond just C++.
Comparing Results
                                 Before                            After
Hiring Cycle Time                6 to 8 weeks                      3 to 4 weeks
Hit Rate                         Less than 10%                     Almost 50%
Hiring Manager’s Time per hire   15 to 25 hours per position       Less than 10 hours per position
Recruiter’s time per hire        Close to 100 hours per position   Less than 40 hours per position
Hiring Mistakes                  Low confidence                    High confidence
Improving Accuracy
• We tried using regularization to avoid over-fitting. However, it did not yield any improvement in accuracy. As we have an accuracy of 89% while predicting the training set itself, it can be intuitively concluded that over-fitting doesn’t need to be addressed.
• We need to get more training data to improve accuracy. Also, the training data should pertain to a period over which the job requirements are constant. It’s very hard to find job openings where more than a hundred candidates are screened. As we plan to implement lean hiring and Kanban, the chances of having large training sets are very low.
• We tried principal component analysis (PCA) to see whether the number of attributes could be reduced to 2 or 3 so that the data could be plotted. We could not get the “retained variance” anywhere close to 99%. (In fact, it was close to 50%.)
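For reference, the two techniques mentioned above can be sketched as follows. The L2 form of the penalty (with the first weight treated as an unpenalized intercept) is an assumption, since the slides do not say which regularization was tried. The retained variance function takes the eigenvalues of the covariance matrix as given; computing them is outside this sketch.

```python
import math

def regularized_cost(theta, examples, labels, lam):
    """Logistic regression cost with an assumed L2 penalty; theta[0] is not penalized."""
    m = len(examples)
    total = 0.0
    for x, y in zip(examples, labels):
        z = sum(t * xi for t, xi in zip(theta, x))
        h = 1.0 / (1.0 + math.exp(-z))
        total += -y * math.log(h) - (1 - y) * math.log(1 - h)
    penalty = (lam / (2 * m)) * sum(t * t for t in theta[1:])
    return total / m + penalty

def retained_variance(eigenvalues, k):
    """Share of total variance kept by the k largest principal components."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)
```

A larger lam penalizes large weights more heavily. A retained variance near 50% for 2 or 3 components means a plot of those components would hide about half of the variation in the data.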
Future Roadmap
• Create a search engine app with ranking on the fly, limiting the search results to the number that fits on a smartphone screen without scrolling.
• Try using the sixth degree polynomial with more attributes. Currently we are using it only on expected compensation and relevant experience. This will most likely improve the accuracy.
• Use NLP for information extraction and more precise attribute values.
Synerzip in a Nutshell
• Software product development partner for small/mid-sized technology companies
– Exclusive focus on small/mid-sized technology companies, typically venture-backed companies in growth phase
– By definition, all Synerzip work is the IP of its respective clients
– Deep experience in full SDLC – design, dev, QA/testing, deployment
• Dedicated team of high caliber software professionals for each client
– Seamlessly extends client’s local team, offering full transparency
– Stable teams with very low turn-over
– NOT just “staff augmentation”, but provide full mgmt support
• Actually reduces risk of development/delivery
– Experienced team - uses appropriate level of engineering discipline
– Practices Agile development – responsive, yet disciplined
• Reduces cost – dual-shore team, 50% cost advantage
• Offers long term flexibility – allows (facilitates) taking offshore team captive – aka “BOT” option
Our Clients
Next Webinar
Agile Leadership: Want to change your results? Change how you lead.
Complimentary Webinar: Wednesday, January 21, 2015 @ noon CST
Presented by: Niel Nickolaisen, Chief Technology Officer at OC Tanner. He also co-authored “Stand Back and Deliver: Accelerating Business Agility”, which gives you the agile leadership tools you’ll need to achieve breakthrough levels of performance.
Call Us for a Free Consultation!
Hemant Elhence [email protected]
469.374.0500
Thanks!
@Synerzip_Agile
linkedin.com/company/synerzip
facebook.com/Synerzip