Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic Search vs. Human...

Post on 14-Jan-2015

description

A deep dive into resume and LinkedIn sourcing and matching solutions claiming to use artificial intelligence, semantic search, and NLP, including how they work, their pros, cons, and limitations, and examples of what sourcers and recruiters can do that even the most advanced automated search and match algorithms can't do. Topics covered include human capital data information retrieval and analysis (HCDIR & A), Boolean and extended Boolean, semantic search, dynamic inference, dark matter resumes and social network profiles, and what I believe to be the ideal resume search and matching solution.

Transcript of Talent Sourcing and Matching - Artificial Intelligence and Black Box Semantic Search vs. Human...

Talent Sourcing and Matching

Glen Cathey

www.linkedin.com/in/glencathey

www.booleanblackbelt.com

Artificial Intelligence & Black Box Semantic Search vs. Human Cognition & Sourcing

Sourcing = Easy!

What’s the big deal anyway?

Some people believe resume, LinkedIn and Internet sourcing is so easy that sourcing is either dying or dead or can be performed for $6/hour

The Challenge

Resume and LinkedIn sourcing appears simple and easy on the surface. However, it is deceptively difficult and complex

The Challenge

Anyone can find candidates because all searches "work" as long as they are syntactically correct

That doesn’t mean the searches are finding all of the best candidates!

People make assumptions when creating searches

Every time an assumption is made, there is room for error and you unknowingly miss and/or eliminate results!

The Challenge

No single search can return all potentially qualified people

Every search both includes some qualified people and excludes some qualified people

Some of the best people have resumes or social profiles that may not appear to be obvious or strong matches to your needs

The Challenge

People cannot effectively be reduced to and represented by a text-based document

Job seekers are NOT professional resume or LinkedIn profile writers

Most people still believe that shorter, more concise resumes and social profiles are better

This means they are removing data/info from their resumes, which can then no longer be searched for!

The Challenge

No one mentions every skill or responsibility they’ve had, nor describes every environment they’ve ever worked in

There are many ways of expressing the same skills and experience

Employers often don’t use the same job titles for the same job functions

The Challenge

People don’t create their resumes and LinkedIn profiles thinking about how you will search for them

Sometimes people don’t even use correct terminology

Anyone easy for you to find is easy for other recruiters to find = no competitive advantage!

The Challenge

In addition to the people you do find, there are Dark Matter results of people that exist to be retrieved, but can't be found through standard, direct or obvious methods

I estimate Dark Matter to be at least 50% of each source searched

So…

Finding some people is easy…

However…

Finding all of the best people IS NOT!

Access is Nothing!

“When every business has free and ubiquitous data, the ability to understand it and extract value from it becomes the complementary scarce factor. It leads to intelligence, and the intelligent business is the successful business, regardless of its size. Data is the sword of the 21st century, those who wield it well, the Samurai.”

-Jonathan Rosenberg, SVP, Product Management @ Google

The Solution & Sell

Stop wasting time trying to create difficult and complex Boolean search strings

Let "intelligent search and match applications" do the work for you

A single query will give you the results you need - no more re-querying, no more waste of time!

Matching App Claims

Understand titles, skills, and concepts

Automatically analyze and define relationships between words and concepts

Intuit and infer experience by context

Matching App Claims

Perform pattern recognition

Employ semantic search

Perform fuzzy matching

Matching Apps

How do they really work?

Parsing

Intuit experience by context = resume parsing

Parsing breaks down and extracts resume information:

Most recent title and employer

Skills and experience

Years of experience – overall, in each position, with specific skills, in management, etc.

Education

Structured Data

Parsing enables structured, fielded search

Search by:

Most recent title

Recent experience

Years of experience

Etc.
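As a rough illustration of what structured, fielded search over parsed resumes can look like, here is a minimal Python sketch. The `ParsedResume` fields and the `fielded_search` helper are hypothetical names invented for this example; real parsers and applicant tracking systems expose different fields.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedResume:
    # Hypothetical fields a parser might extract; real products differ.
    name: str
    most_recent_title: str
    most_recent_employer: str
    years_experience: float
    skills: set = field(default_factory=set)


def fielded_search(resumes, title_contains, min_years):
    """Return resumes whose most recent title contains the given text
    and whose overall years of experience meet the minimum."""
    wanted = title_contains.lower()
    return [
        r for r in resumes
        if wanted in r.most_recent_title.lower()
        and r.years_experience >= min_years
    ]


# Made-up sample data, for illustration only.
resumes = [
    ParsedResume("A", "Tax Manager", "Acme", 8, {"tax", "audit"}),
    ParsedResume("B", "Staff Accountant", "Acme", 3, {"tax"}),
    ParsedResume("C", "Manager of Tax", "Initech", 10, {"tax"}),
]
matches = fielded_search(resumes, "tax manager", 5)
print([r.name for r in matches])
```

In this sample, candidate C ("Manager of Tax") is not returned by the literal title search, which is the kind of title variation that synonym handling is meant to address.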

Semantic Search

Well developed ontologies and taxonomies

Hierarchical

Semantic Search

Synonymous terms:

Programmer, Software Engineer, Developer

Tax Manager, Manager of Tax

CSR, Customer Service Representative

Ruby on Rails, RoR, Rails, Ruby

Oracle Financials, Oracle Applications, e-Business Suite, etc.
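One simple way a taxonomy of synonymous terms can be applied is query expansion: each search term is replaced with an OR group of its known variants. The sketch below assumes a tiny hand-built dictionary using the synonym groups listed above; the `SYNONYMS` table and function names are hypothetical, and commercial products use far larger taxonomies.

```python
# Hypothetical mini-taxonomy built from the synonym groups listed above.
SYNONYMS = {
    "programmer": ["Programmer", "Software Engineer", "Developer"],
    "tax manager": ["Tax Manager", "Manager of Tax"],
    "csr": ["CSR", "Customer Service Representative"],
    "ruby on rails": ["Ruby on Rails", "RoR", "Rails", "Ruby"],
}


def expand(term):
    """Return a Boolean OR group for a term, quoting multi-word phrases."""
    variants = SYNONYMS.get(term.lower(), [term])
    quoted = [f'"{v}"' if " " in v else v for v in variants]
    if len(quoted) == 1:
        return quoted[0]
    return "(" + " OR ".join(quoted) + ")"


def build_query(terms):
    """AND together the expanded OR group for each term."""
    return " AND ".join(expand(t) for t in terms)


query = build_query(["Programmer", "Ruby on Rails"])
print(query)
```

A term that is missing from the dictionary passes through unchanged, which mirrors the limitation that a taxonomy can only expand what someone has already put into it.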

Semantic Search

Some applications use complex statistical methods in an attempt to "understand" language and the relationships between words

Example: Google Distance

Google Distance

Keywords with the same or similar meanings in a natural language sense tend to be "close" in units of Google distance, while words with dissimilar meanings tend to be farther apart

Google Distance

A measure of semantic interrelatedness derived from the number of hits returned by the Google search engine for a given set of keywords
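For readers who want to see how such a distance can be computed from hit counts, here is a minimal sketch using the commonly published Normalized Google Distance formula. The hit counts and the total page count `N` below are made-up numbers for illustration, not real search engine results.

```python
import math


def google_distance(fx, fy, fxy, n):
    """Normalized Google Distance computed from hit counts.

    fx, fy: hits for each term alone
    fxy:    hits for pages containing both terms
    n:      assumed total number of indexed pages
    """
    if fxy == 0:
        return math.inf  # the terms never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


# Made-up hit counts, for illustration only.
N = 10_000_000_000
close = google_distance(fx=9_000_000, fy=8_000_000, fxy=6_000_000, n=N)
far = google_distance(fx=9_000_000, fy=8_000_000, fxy=2_000, n=N)
print(close < far)
```

Terms that usually appear together get a small distance, and terms that rarely co-occur get a large one. The measure reflects co-occurrence statistics only, not any understanding of what the terms mean.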

Semantic Clustering

Non-interactive and unsupervised machine learning technique seeking to automatically analyze and define relationships between words and concepts

Clustering is a common technique for statistical data analysis

Machine Learning

The design and development of algorithms that allow computers to evolve behaviors based on empirical data

A major focus is to automatically learn to recognize complex patterns and make intelligent decisions and classifications based on data

Pattern Recognition

Aims to classify data (patterns) in resumes based either on a priori knowledge or on statistical information extracted from the patterns

A priori: independent of experience

Example of pattern recognition: spam filters

Fuzzy Matching

Finds approximate matches to a pattern in a string

Useful for word and phrase variations and misspellings
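A minimal sketch of approximate matching, using Python's standard-library `difflib`. The `KNOWN_TERMS` vocabulary and the `fuzzy_lookup` helper are hypothetical; matching applications may use other similarity measures (edit distance, phonetic matching, n-grams) and different thresholds.

```python
import difflib

# Hypothetical vocabulary of correctly spelled skill terms.
KNOWN_TERMS = ["PeopleSoft", "JavaScript", "Kubernetes", "Infection Control"]


def fuzzy_lookup(text, cutoff=0.8):
    """Return known terms that approximately match the input text."""
    lowered = {t.lower(): t for t in KNOWN_TERMS}
    hits = difflib.get_close_matches(
        text.lower(), list(lowered), n=3, cutoff=cutoff
    )
    return [lowered[h] for h in hits]


print(fuzzy_lookup("Javascrpit"))   # transposed letters
print(fuzzy_lookup("peoplesoft"))   # case variation
print(fuzzy_lookup("welding"))      # nothing similar in the vocabulary
```

The `cutoff` value is a judgment call: set it too low and unrelated terms match, set it too high and common misspellings are missed.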

Search/Match Apps

Pros

Reduce time to find relevant matches

Can lessen or eliminate the need for recruiters to have deep and specialized knowledge within an industry or skill set

Reduce and even eliminate time spent on research

Pros

Go beyond literal, identical lexical matching

Levels the playing field

Can make an inexperienced person look like a sourcing wizard

Good for teams with low search/sourcing capability

Pros

Work well for positions where titles effectively identify matches and where there is a low volume and variety of keywords

Good for a high volume of unchanging hiring needs

Cons

Removes thought from the talent identification and decision making process

Danger of eliminating the need for recruiters to understand what they’re searching for

Information technology, healthcare, and other sectors/verticals can pose serious challenges to matching apps

Cons

Apps find some people, bury or eliminate others

Is finding some people good enough for your organization?

Shouldn’t your goal be to find ALL of the BEST people?

Cons

Matching apps level the playing field

People from different companies using the same solution will both find and miss the same people

Competitors using the same search and match solution will have no competitive advantage over each other!

Cons

Belief that one search finds all of the best candidates is intrinsically flawed and simply not based in reality

Top talent isn't represented by what a search engine "thinks" has the best resume or profile

AI and semantic search apps favor keyword rich resumes and profiles

Keyword Rich/Poor

Keyword poor resumes and profiles may in fact represent better talent than keyword rich resumes and profiles

It’s not just a matter of keyword frequency or even keyword presence!

AI powered search & match applications can only return results that explicitly mention required keywords and their variants

Keyword Rich/Poor

Many people have skills and experience that are simply not mentioned anywhere in their resumes!

These people are the Dark Matter of databases, ATS’s, and social networks, and they exist but cannot be found via direct search/match methods – AI or otherwise!

Cons

Pre-built taxonomies are static, limited in their completeness and must be continually updated in order to stay relevant and effective

Taxonomies are only as good as who created them

Applications can only match on what’s present and cannot “think outside of the box”

Cons

Semantic clustering and NLP applications can retrieve related search terms, but that does not mean they are relevant for your need!

Cons

Match primarily on titles and skill terms

True match is at the level of role, responsibilities, environment, etc.

Some applications rank results favoring recent employment duration

Is someone who has been in their current company for 5 years really “better” than someone who has been with their current company for 2 years?

Cons

Apps don’t "know" what you’re looking for or what's the best match for your company

Apps are not and cannot be "aware" of people that were excluded from their search results

Applications are not truly intelligent – they do not actually "know" or "understand" the meaning of titles and terms

Intelligence

The ability to learn or understand or to deal with new or trying situations

The ability to apply knowledge to manipulate one’s environment or to think abstractly

REASON; the power of comprehending and inferring

Source: Merriam-Webster.com

Artificial Intelligence

The capability of a machine to imitate intelligent human behavior

Artificial = humanly contrived

Source: Merriam-Webster.com

Artificial Intelligence

Dr. Michio Kaku

Theoretical physicist and futurist specializing in string field theory

Harvard Grad (summa cum laude)

Berkeley Ph.D.

Currently working on completing Einstein's dream of a unified field theory

What are his thoughts on AI?

Artificial Intelligence

“…pattern recognition and common sense are the two most difficult, unsolved problems in artificial intelligence theory. Pattern recognition means the ability to see, hear, and to understand what you are seeing and understand what you are hearing. Common sense means your ability to make sense out of the world, which even children can perform.”

- Dr. Michio Kaku

Jobs of the Future

Dr. Michio Kaku believes the job market of the future will be “dominated by jobs involving common sense (e.g. leadership, judgment, entertainment, art, analysis, creativity) and pattern recognition (e.g. vision and non-repetitive jobs). Jobs like brokers, tellers, agents, low level accountants and jobs involving inventory and repetition will be eliminated.”

Jobs of the Future

That’s good news for sourcers and recruiters who perform sourcing!

Sourcing requires judgment, creativity, analysis, common sense and pattern recognition (instantly making sense of human capital data)

Sourcers of the future will be human capital data analysts who are experts in HCDIR & A – Human Capital Data Information Retrieval and Analysis

Static vs. Dynamic

Matching apps do not have the dynamic ability to learn, understand and instantly relate new concepts through direct experience and observation

They depend on taxonomies, statistical models, or semantic clustering to “understand” relationships and concepts

Dynamic Inference

The human mind naturally organizes its knowledge of the world, instantly relating new terms and concepts and judging their relevance

Dynamic Inference

Example: A sourcer who is completely unfamiliar with “infection control” can instantly recognize non-highlighted but related and relevant terms and incorporate them into new and improved searches

Carolinas HealthCare System, Charlotte, NC

Infection Preventionist 1997-present

Responsible for all aspects of infection prevention and control for an 800+ bed hospital. Uses science-based research to perform infection prevention. Conducts all aspects of surveillance, data analysis, and presents data to interdisciplinary teams, including the Infection Control Committee.

Dynamic Inference

Human sourcers can learn from research and search results, dynamically and adaptively identifying related and relevant search terms and incorporating them into successive searches to continuously refine and improve those searches for more relevant results

Dynamic Inference

For example, if a recruiter were sourcing for a position that required a skill they were unfamiliar with (e.g., “Cockburn Use Case Methodology”), they could quickly perform research to learn more about it

In the next slide, you will see a screen capture of such research

Dynamic Inference

From this quick research, the recruiter would be able to determine that most people would not explicitly mention “Cockburn Use Case Methodology,” let alone “Cockburn” (which the research revealed is pronounced “Co-burn”) – thus they would not include the term in their searches

Dynamic Inference

Instead, it would be a better idea to search for candidates that mention experience with Agile methodology and simply call and ask them if they have experience with using Cockburn’s use case methodology (which many likely would)

NLP

Applications using Natural Language Processing do not truly understand human language

They use complex statistical methods to resolve the many difficulties associated with making sense of human language

NLP experts admit that to computers, even simple sentences can be highly ambiguous when processed with realistic grammars, yielding thousands or millions of possible analyses

Innate

Humans effortlessly and automatically process and understand language, regardless of sentence length or complexity, ambiguity, incorrect grammar, etc.

We can udnretsnad any msseed up stnecene as lnog as the lsat and frsit lteetrs of wdros are in the crrcoet plaecs

Deduction

Human sourcers and recruiters can deduce potential experience, even in the absence of information (not explicitly mentioned in the resume/profile)

Applications can only work with what’s actually mentioned in a resume – if it's not explicitly mentioned, they can't match on it

Awareness

Applications are not aware that many of the best people have average resumes

Applications are not aware of the people their algorithms bury in results or eliminate entirely

Human sourcers can become aware of and specifically target this Dark Matter

Dark Matter

How can you target resumes and LinkedIn profiles that exist, but your searches can’t and don’t retrieve them?

AI Solution

Well developed taxonomies, semantically generated query clouds and matching algorithms can help greatly with automatically searching for and matching on synonymous terms, related words, word variants, misspellings, etc.

Human Solution

Think + Perform Research

For each keyword, phrase or title you are thinking of using in your search, realize:

1. Not everyone will explicitly mention what you think they would or should mention in their resume/profile

2. There are many different and often unexpected ways of expressing the same skills and experience

Example

Global Experience

What search terms might you use if you are looking for people with global experience?

How many can you think of off the top of your head?

Research

In a few minutes of exploratory research, a sourcer can come up with a volume of related and relevant terms

Global, international, foreign, multinational, worldwide

Europe, European, EU, EMEA, Asia, Asia-Pac, Pacific Rim, South America, Latin America, Americas, CALA (Caribbean and Latin America), Middle East

Canada, Japan, China, Russia, India, UK, United Kingdom, etc.

Countries, Offshore, Overseas

Dark Matter

How can you target results of people that your searches retrieve but the results are buried (ranked poorly or "too many" results to be reviewed) and you don’t find them?

AI/Semantic Solution?

Search and matching software powered by artificial intelligence / black box semantic search doesn't have a solution to this challenge

One of the major claims AI/semantic search applications make is that their solutions can find the "right people" in one search

AI/Semantic Solution?

However, a single search strategy is intrinsically flawed and limited: no single search can find all qualified candidates, and each search both includes and excludes qualified people

I am not aware of any search & match software that allows for successive searching via mutually exclusive filtering

Human Solution

Run Multiple Searches

Start with maximum qualifications

Use the NOT operator to systematically filter through mutually exclusive result sets

End with minimum qualifications

Example Job

Required: A,B,C

Explicitly desired: D,E

Implicitly desired: F

Max/Min

1. A and B and C and D and E and F

2. A and B and C and D and E and NOT F

3. A and B and C and D and NOT E and F

4. A and B and C and NOT D and E and F

5. A and B and C and NOT D and NOT E and F

6. A and B and C and D and NOT E and NOT F

7. A and B and C and NOT D and E and NOT F

8. A and B and C and NOT D and NOT E and NOT F
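The eight searches above can also be generated programmatically, which becomes handy as the number of desired terms grows (n desired terms yield 2^n mutually exclusive searches). This is a minimal sketch; the function name is hypothetical, and the ordering within each tier (for example, which two-of-three search comes first) is an arbitrary choice that differs slightly from the list above.

```python
from itertools import product


def max_min_searches(required, desired):
    """Return mutually exclusive Boolean searches, from maximum
    qualifications (all desired terms) down to minimum (none)."""
    searches = []
    for flags in product([True, False], repeat=len(desired)):
        parts = list(required)
        parts += [t if keep else f"NOT {t}" for t, keep in zip(desired, flags)]
        searches.append((sum(flags), " and ".join(parts)))
    # Stable sort: searches with more desired terms present come first.
    searches.sort(key=lambda s: -s[0])
    return [query for _, query in searches]


searches = max_min_searches(["A", "B", "C"], ["D", "E", "F"])
for number, query in enumerate(searches, start=1):
    print(f"{number}. {query}")
```

Because every desired term is either required or explicitly excluded with NOT in each search, no candidate can appear in more than one result set.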

Max/Min

Search #1 (maximum qualifications) through Search #8 (minimum qualifications)

Human Solution

Probability-Based and Exhaustive!

This approach allows for:

1. The specific targeting of people who theoretically have the highest probability of being a match based on information present

2. The specific targeting of people who may be the best match, but may have keyword/information poor resumes or profiles, who do not explicitly mention what you think the "right" person would or should mention

3. The ability to systematically filter through all available results via manageable and mutually exclusive result sets – never seeing the same person twice!

Ideal Solution

Ideal Solution

A mix of “man and machine,” integrating human knowledge and expertise into computer systems

Essentially - the best of both worlds:

Autopilot: An artificially intelligent semantic matching engine

Manual Override: Ability to take complete control over searches and search results

Ideal Solution

An artificial intelligence semantic matching engine coupled with taxonomies built by human SMEs that are continually modified and improved specifically for the organization

No COTS solution is customized for any specific employer, industry or discipline, nor 100% complete

Ideal Solution

Resume and LinkedIn profile parsing

Structured, contextual search

Most recent title and experience, overall years of experience, education, etc.

White Box relevance weighting

Configurable by users – no black box!

Searchable tagging for level 5 semantic search

Ideal Solution

Standard and extended Boolean in full text and field-based search

AND, OR, NOT, configurable proximity, weighting

Configurable proximity enables level 3 semantic search

Variable term weighting allows users to control which search terms are more important and thus to control true relevance

X-Boolean

Lucene is a free and open source text search engine that supports configurable proximity and term weighting, and can be integrated into some existing ATS's/databases

Some Applicant Tracking Systems already have databases powered by text search engines that allow for extended Boolean
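To make "configurable proximity and term weighting" concrete, here is a small sketch that assembles a query string in the style of Lucene's classic query parser, where `"a b"~5` asks for the words within five positions of each other and `term^3` boosts a term's weight. The helper names are hypothetical, and an ATS built on a text search engine may expose different syntax, so treat this as an assumption to check against the product's documentation.

```python
def proximity(phrase, slop):
    """Lucene-style proximity: phrase words within `slop` positions."""
    return f'"{phrase}"~{slop}'


def boost(term, weight):
    """Lucene-style boost: raise a term's contribution to relevance."""
    quoted = f'"{term}"' if " " in term else term
    return f"{quoted}^{weight}"


# Example terms borrowed from the infection control resume earlier in the deck.
query = " AND ".join([
    proximity("infection control", 5),
    boost("surveillance", 3),
    "(hospital OR clinic)",
])
print(query)
```

The user, not a black box, decides how close the words must be and which terms matter most.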

Consider

“Society has reached the point where one can push a button and immediately be deluged with…information. This is all very convenient, of course, but if one is not careful there is a danger of losing the ability to think.”

- Eiji Toyoda

Man AND Machine

Data and information require analysis to support decision making

Just as very expensive Business Intelligence and Financial Analytics software hasn't replaced the need for people to make sense of the data, there is no software solution for HR and recruiting that replaces the need for people to analyze and interpret human capital data to make appropriate decisions

Man AND Machine

Matching apps move/retrieve information, but only PEOPLE can analyze and interpret for relevance and make intelligent decisions

Relevance: the ability (as of an information retrieval system) to retrieve material that satisfies the needs of the user [1]

Only the user (sourcer/recruiter) can judge relevance!

[1] Source: Merriam-Webster.com

Man AND Machine

Sourcers and recruiters need technology that can enable their productivity

Intelligent search and match apps are not a replacement for creative, curious, investigative people

Do not seek to automate that which you do not understand and cannot accomplish manually!

Consider

“Computers move information, people do the work”

- Jeffrey Liker

Find Me & Connect!