The Implications of Big Data on Employee Referrals and Recruiting

BIG DATA: THE IMPLICATIONS FOR RECRUITING & REFERRALS

A ROLEPOINT WHITE PAPER BY @BILLBOORMAN


BIG DATA

INTRODUCTION

We have always created and stored data

in one format or another. Historians can

go back thousands of years to

understand how people lived in centuries

past thanks to the evidence and

indications of what has happened that is

stored deep in museums and archives,

historical documents and hand curated

registers such as the census. The

collation, interrogation and interpretation

of data and what it means is far from a

new concept, but what has now brought

the discussion to the fore in all aspects of

business is the incredible volume of data

that is being created every moment of

every day, and the development of data

mining and interpretation tools that are

available for anyone to use.

We are creating data and data trails at an

unequaled rate. Pretty much everything

we do is recorded somewhere. Every

keystroke, click, call or action is recorded

and searchable. Our domestic appliances, cars, games and other everyday devices leave a trail, while mobile devices leave location data trails. This data,

often called ‘multi-structured’, includes

loosely structured social media data and

content, as well as the deluge of

machine-generated data like geolocation

and data usage information coming from

networked devices and packaged goods

with embedded sensors. The data

generated from these devices offers

game-changing opportunities in

operational process optimization and

reinvention, as well as in business

analytics and intelligence. HR and

recruiting is just one of the many business

functions to benefit from this new way of

working.

In April 2012, Information Management

reported that we create 2.5 quintillion

bytes of data every day, with 90% of the

data in the world having been created in

the last two years alone. Every hour,

Wal-Mart handles 1 million transactions,

feeding a database of 2.5 petabytes,

which is almost 170 times the data in the

Library of Congress. The entire collection of junk mail delivered by the U.S. Postal Service in one year is equal to 5

petabytes, while Google processes that

amount of data in just one hour. The total

amount of information in existence is

estimated at a little over a zettabyte.

The real challenge starts when attempting

to deal with the wealth of data that is

available. What organizations really want

is to be able to navigate the data

landscape in order to find and interpret


what is useful in order to make informed

business decisions. The hottest job in

business over the past 18 months has

been data analyst, as more and more

businesses look to understand just what

this data means to them and how they

can use it to their advantage in every

aspect of their business, not least

recruiting. Internal data alone has massive

implications for all HR functions

throughout business, because decisions

based on data are based on fact.

In this paper we will not attempt to make

the case for big data in recruiting and

referral - you ignore it at your peril.

Rather, we will look at practical

applications within recruiting. The easiest way to understand big data is as a practice: it takes data from multiple sources, both structured and unstructured (such as social media channels), identifies relationships between the data sets, interprets their meaning, and delivers an output that is easy to understand without expert knowledge (data visualization). A 2013 report by the

Aberdeen Group found that of

organizations that use visual discovery

tools, 48 percent of business information

users are able to find the information they

need without the help of IT staff. Without

visual discovery, the rate drops to a mere

23 percent. When data is visual and real

time, everyone benefits from being

informed and current. It’s an old saying,

but it has never been more applicable: “In

God we trust, everyone else bring data!”



INTERNAL DATA

Companies have always recorded and

stored data. One of the challenges

companies face is that their corporate

data is kept in silos within individual

databases. For the most part, the

historical business database was built for

data storage rather than data retrieval.

Within HR, companies may be collating and retaining data in five to eight separate places with no link from one to the other.

HR systems might consist of:

Payroll System:

Contains pay data and personal data

relating to past and present

employees, earnings, benefits,

company reporting structure,

employment history etc. The payroll

system is an accurate data record of

work history, rewards and other

personal information.

Performance Management System:

Contains performance data, reviews, appraisals, reports, performance and disciplinary records etc. The

performance management system is

an accurate data record of the work

completed by employees against

management expectations, enabling

the ranking or grouping of employees

by actual performance.

Learning and Development System:

Contains training data and

assessments, skills profiles, reviews etc.

The learning and development system

is an accurate data record of

employees’ development, progress and

potential, and enables the

measurement of skills gaps and

workforce planning.

Applicant Tracking System:

Contains applicant and employee data

past and present, recording progress

through the application process. The

ATS is an accurate data record of all

applications for employment,

successful or unsuccessful, and

important recruiting metrics such as time to hire, source of hire and applicant-to-hire ratio. These metrics are important for understanding recruiting efficiency and enabling the best use of resources. However, they are often hard to track.

Candidate Relationship Management

System:

Works like the client relationship management systems commonly used in marketing. The CRM enables the indexing and tracking of candidate communication. The CRM follows the trend for talent communities and talent networks.

RolePoint Inc. © 2014 Big Data

A talent community, such as those operated by the likes of Microsoft, focuses on a single discipline or interest,

enabling members of the community

to communicate with each other whilst

creating a dynamic profile and leaving

a data trail. While this approach is

often confused with a talent network,

its popularity has exploded over recent

years fueled by social media discussion

and debate. A community can be

defined as a collection of people who

are connected by a topic, where each

member can raise or comment on a

discussion and connect directly with

each other. A good example would be

a LinkedIn group.

A talent network, by contrast, can be

compared to targeted e-mail lists of

past years, although other

communication channels such as SMS

or via a mobile app also qualify, as well

as traditional methods of

communication. Registration for a

talent network is usually as simple as

one-click connections of a potential

candidate with the company, with

profiles populated with external data

from sources such as LinkedIn profiles.

Communication lines are vertical

between candidate and company in

contrast to a talent community, where

anyone can connect and communicate.

The real benefit and appeal of the

talent network is that candidate data

can be analyzed to ensure relevance in

messaging (the same concept that

applies to referral messaging).

Relevance of topic and content is a critical factor in success: 75% of messages are opened on a mobile device, and the recipient often decides instantly whether to ditch the message or open and read it. Research indicates this decision is made within 3 seconds.

Other HR Systems:

Individual companies may well have

other systems housing additional HR

data. The problem for most

organizations is that each HR system

operates and stores data in isolation.

The benefit of utilizing internal data is

that it is owned by the organization

and is considered structured data. The

challenge is combining all of the

datasets for interrogation and

understanding how one dataset might

relate to another, enabling the

discovery of trends and relationships



for interpretation and decision-making.

What hampers this is often internal

resistance to opening up HR data

between departments and opening the

APIs between one technology and

another. An application programming

interface (API) specifies how some

software components should interact

with each other. Unlocking the potential offered by internal big data depends on:

• A single data flow in one direction

• Data conversion into a single format, so that all systems are effectively speaking the same language

• Data retrieval from all technologies, made possible by an open API

• Latent data refreshed by pulling in real-time unstructured social data, combining internal data with public data sources such as LinkedIn, Facebook or Twitter
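The first three requirements above can be illustrated with a minimal Python sketch. It assumes three hypothetical extracts (payroll, performance management and ATS) that identify the same employee in different ways; every field name, ID format and date format here is invented for illustration and does not describe any particular HR product.

```python
# Hypothetical extracts from three HR systems; field names and formats are assumptions.
payroll = [{"EmpID": "0042", "Name": "J. Smith", "StartDate": "03/01/2011"}]
performance = [{"employee_id": 42, "rating": 4}]
ats = [{"emp": "42", "source_of_hire": "referral"}]


def normalize_id(raw):
    """Convert any ID representation ('0042', 42, ' 42 ') to a plain int key."""
    return int(str(raw).strip())


def to_iso(us_date):
    """Convert an MM/DD/YYYY string to ISO 8601 (YYYY-MM-DD)."""
    month, day, year = us_date.split("/")
    return f"{year}-{month}-{day}"


def build_profiles(payroll_rows, performance_rows, ats_rows):
    """Merge the three feeds, flowing one way, into a single profile per employee."""
    profiles = {}
    for row in payroll_rows:
        key = normalize_id(row["EmpID"])
        profiles.setdefault(key, {}).update(
            name=row["Name"], start_date=to_iso(row["StartDate"])
        )
    for row in performance_rows:
        key = normalize_id(row["employee_id"])
        profiles.setdefault(key, {})["rating"] = row["rating"]
    for row in ats_rows:
        key = normalize_id(row["emp"])
        profiles.setdefault(key, {})["source_of_hire"] = row["source_of_hire"]
    return profiles


profiles = build_profiles(payroll, performance, ats)
```

Once every feed is keyed and formatted the same way, one dataset can be interrogated against another, which is the point of opening up the systems in the first place.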

LATENT DATA

Latent data is dated and becomes old the

day it is recorded. A good example of this

within talent acquisition is the vast

number of resumes that have been

submitted to a company for job openings

as they arise. A resume becomes a historical, out-of-date document from the day it is submitted, because an e-mail address or contact number may change over time.

Furthermore, candidates will add to their

experience and skills through promotions,

transfers, learning or changing employer.

None of these changes are recorded on

the resume held within the ATS, hiding

many potential matches to new

opportunities. Companies like London-based Work Digital identified the benefit of searching unstructured social data to update the latent resume data contained within the ATS, because social data is real time. Combining a historical resume with the data contained on a LinkedIn profile, a Facebook profile, an about me page and other similar sources refreshes the ATS, keeping candidate details up to date in real time, retrievable and relevant in search.
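As a rough illustration of refreshing latent data, the sketch below overwrites a stored resume field only when a public profile holds a more recent value. The record layout (field name mapped to a value and the date it was last updated) and the sample data are assumptions made for this example; it is not a description of how Work Digital or any ATS implements the idea.

```python
from datetime import date


def refresh_record(ats_record, social_record):
    """Return a copy of the ATS record with fields overwritten by newer social data.

    Each record maps field name -> (value, date last updated). A social value
    replaces the ATS value only when it is strictly more recent, and fields
    missing from the ATS record are added.
    """
    refreshed = dict(ats_record)
    for field, (value, updated) in social_record.items():
        if field not in refreshed or updated > refreshed[field][1]:
            refreshed[field] = (value, updated)
    return refreshed


# Hypothetical data: a resume submitted in 2011 and a public profile updated in 2014.
resume = {
    "email": ("jo@old-employer.example", date(2011, 5, 2)),
    "title": ("Analyst", date(2011, 5, 2)),
}
profile = {
    "title": ("Senior Analyst", date(2014, 1, 15)),
    "skills": ("SQL, Python", date(2014, 1, 15)),
}
current = refresh_record(resume, profile)
```

Keeping the update date next to each value means the original resume is never lost, and a later search can still tell which details came from the application and which from the live profile.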


THE SEARS STORY

In a presentation at SourceCon in 2013, Donna Quintall, the Senior Manager Executive Talent Acquisition at Sears Holdings, told the story of how they had implemented this methodology into their recruiting function. Quintall commented that HR teams very often have all the data recruiters need to plan recruiting, but it is hidden behind the walls built by HR and recruiting. Sears set about tearing

down the walls to create the HR data

warehouse, fed by their HR systems. Data

feeds come from each of their HR

systems including performance

management, CRM and ATS to produce

reports, dashboards, and analysis. This

means the recruiters can interrogate data

to understand what they should be

recruiting for, costs, speed of hire, loss

due to not hiring and other things like the

best companies to hire from, the best

industries and the best schools. The data

includes things like performance

management, reviews, appraisals and

financials: every aspect of HR data in

order to be proactive, predictive and to

be able to use data to influence hiring

managers.

Every new starter at Sears completes a

profile and all the HR data through their

career gets added in real time. When you

have data, you can interrogate it. How

many organizations recognize that this

type of data would be useful to recruiters,

and how many people think it should be

locked away from the hiring team? When

you have data, you can influence

decisions.

Recruiters get to know when people are high-risk (triggered by actions like performance plans), and this shapes the recruiting plan. The recruiters source

according to projected needs rather than

simply reacting to jobs as they come up

at the 11th hour. The sourcing plan covers

internal employees as well as external

targets. When you have data, succession

planning and internal mobility become a

reality.

Sears have worked hard to allow

recruiters to have conversations about

internal opportunities freely without

needing to go through layers of

permission. This takes some doing. I

remember having the same conversation

with Arie Ball at Sodexo. Companies talk

internal mobility but block the access to

it through politics and turfism. The best

recruits with the least risk, who are

already known, usually live within the

company, but many recruiters are driven



to source outside. The key in all of this is

transparency of data and trust. Sears has

profiles on over 400,000 employees.

That’s a huge data source.

Every interview is a source of competitor information that goes into the system, whether or not the candidate is hired. Recruiters are trained to

gather data in the interview (they jokingly

compare this to being interviewed by the

CIA). When I think about how much

market information recruiters could

collect to help influence sourcing and

hiring decisions, the potential is

frightening. This means building a whole

process for data collection and an

emphasis on retrieval through data

mining rather than storage.

When recruiters have the data to influence and advise hiring managers on what they need to be doing, the perception of the role changes from being reactive people-finders to strategic partners.

Sears runs their talent community

through Find.ly. They track all social

media activity to see what topics are

trending. Things like Family Guy and Eminem are massive trends. They feed this back into their thinking on content and content placement, starting with a Family Guy campaign because that is where their hires are and what they are interested in.

I love the direction Sears is going with

this thinking. Data drives decisions further than opinions, just as soon as the walls come down and recruiters get access, allowing sourcing to move from a reactive, just-in-time function to playing a more strategic part in business planning. Decisions are

driven by data and what is really

happening, and Sears are reaping the

benefits.

UNSTRUCTURED DATA

According to Wikipedia, unstructured

data is defined as “information that either

does not have a pre-defined data model

or is not organized in a predefined

manner. Unstructured information is

typically text-heavy, but may contain data

such as dates, numbers, and facts as well.

This results in irregularities and

ambiguities that make it difficult to

understand using traditional computer

programs as compared to data stored in

fielded form in databases or annotated

(semantically tagged) in documents.”

In 1998, Merrill Lynch cited a rule of

thumb that somewhere around 80-90%


of all potentially usable business

information may originate in unstructured

form. In recent years the volume of

unstructured data has exploded as a

result of the embedding of social media

sites and the adoption of mobile devices.

Let’s consider some of the stats relating to Facebook, taken from the blog Digital Marketing Ramblings (http://expandedramblings.com):

Facebook User Stats

• Total number of Facebook users: 1.26 billion

as of 10/6/13

• Total number of Facebook monthly active

users (MAU): 1.23 billion as of 01/29/14

• Total number of Facebook daily active users

(DAU): 757 million as of 1/29/14

• Daily active users in the US: 128 million as of

8/13/13

• Size of user data that Facebook stores:

more than 300 petabytes as of 11/7/13

Facebook User Activity Stats

• Number of times daily that the Facebook Like or Share buttons are viewed: 22 billion as of 11/6/13

• Number of sites that contain Facebook Like or Share buttons: 7.5 million as of 11/6/13

• Total number of Facebook friend

connections: 150 billion as of 2/1/13

• Total number of Facebook likes since launch:

1.13 trillion

• Average daily Facebook likes: 4.5 billion as

of 5/27/13

• Total number of location-tagged Facebook

posts: 17 billion

• Total number of uploaded Facebook photos:

250 billion as of 9/17/13

• Average daily uploaded Facebook photos:

350 million as of 2/1/13

• Average number of photos uploaded per

Facebook user: 217 photos as of 9/17/13

• Average number of items shared by

Facebook users daily: 4.75 billion as of

9/17/13

• Number of Facebook messages sent daily:

10 billion as of 9/17/13

• Percentage of Facebook users that login

once a day: 76% as of 7/2/13

• Percentage of users that check Facebook

multiple times a day: 40% as of 12/30/13

• Average number of page likes per Facebook

user: 40 as of 7/12/13



Twitter stats

• Unique monthly visitors to Twitter.com

(desktop only): 36 million as of 5/1/13

• Total Number of Tweets Sent: 300 billion as

of 10/3/13

• Monthly Active Twitter Users: 231.7 million

as of 10/17/13

• Percentage of Twitter MAUs located outside

U.S.: 77% as of 10/3/13

• Percentage of MAUs that accessed Twitter

via mobile device: 75% as of 10/3/13

• Daily Active Twitter Users: 100 million as of 10/3/13

• Average Number of Followers per Twitter User: 208 as of 10/11/12

• Average Number of Tweets Sent Per Day:

500 million as of 10/3/13

• Average Number of Tweets per Twitter User:

307 as of 1/11/13

• Number of tweet impressions outside

Twitter properties in Q3 2013: 48 billion as

of 10/3/13

• Percentage of Twitter Users Accessing Via Mobile: 60% as of 12/18/12

When we consider the volume of data being posted in public social media channels, documents, forums and other places, it is easy to understand why 80% of the world’s data is considered to be unstructured, and how valuable it becomes if we can sort and sift it by relevance and value.

The challenge of unstructured data is not

availability but sifting for relevance and

determining meaning. Unstructured data

is facing the same challenges that

structured information faced in its early

days. CIOs must overcome fragmentation

of information and processes; the

infamous three Vs of information –

volume, variety and velocity; security; and

governance issues. To get meaning we

need to separate the signal from the

noise and that’s not easy to do with

unstructured data.


In terms of recruiting, this means

identifying what you want the data to tell

you as an outcome, the questions that

need answering and the areas of decision

making that have historically been based

on opinion. Social data enables profiling

of candidates and audience, identifying

the best fit and the most receptive

targets. The challenge here is threefold:

• Volume of data created

• Velocity at which data is multiplying in real time. For example, Twitter recorded 58 million tweets a day, which is over a billion tweets a month. There are over 2.1 billion Twitter search engine queries each day. That’s velocity of data. Source: http://www.statisticbrain.com/twitter-statistics

• Variety of sources we can access

data from. From Twitter, Facebook,

LinkedIn or YouTube to Pinterest or

Instagram, new data sources are

being added every day.

Data is multiplying at an exponential rate

day-by-day, week-by-week, year-by-year.

Everything that we do leaves a data trail.

Our every action is recorded even if we are not connected. Your mobile leaves a trail of your location, and your credit or debit card outlines your spending habits, the products and services you like and use, and how, where and when you buy things.

In November 2012, analyst Josh Bersin of Bersin by Deloitte wrote:

“Start with the problem, not the data. We

are all flooded with data: employee data,

location data, social data, compensation

data, and much more. If you start an

analytics project by collecting all the data

you can find, you may never come to an

end. Rather you have to start with the

problem: What big decisions would you

like to be able to make? What problems

would you like to solve?

One common talent problem, for

example, may be sales productivity. What

factors contribute to a predictable high-

performing sales person? Every company

would like to understand this better. And

once you understand these

characteristics, how can you better

source, attract, and hire such people?

Another may be turnover. What factors

contribute to high turnover in your

company and in particular groups?



These questions are worth millions of

dollars to answer. Be careful you don’t

start by only looking at data. It leads to

lots of money spent, systems built, and

often little or no return.”

Source: http://www.bersin.com/blog/post.aspx?id=574b5527-ce55-4ec6-8c45-c8f05f85162e

The point Bersin is making is an

important one. Big data is, well, big. There is such a range of data sources that it is possible to just keep mining and never extract any real value. Big data projects

should begin with objectives and a

desired outcome. In the case of recruiting,

we want to achieve certain clear

objectives:

1. We want to understand what ‘good’

looks like. Who are our best

performers? Who are the best

performers in the industry? What did

their background look like? What data

trends connect them?

2. When we have identified the best

candidate profile, we want to know

who fits the profile internally, or if

internal mobility is actually the best

option. The CareerXroads Source of Hire Survey for 2013 indicates that the principal source of hire in the USA is internal hiring, with 42% of job openings in the companies surveyed filled by internal candidates:

“77,200 positions out of 185,450

positions were filled in the US by the

responding firms through internal

movement and promotion. This is

~42% of all the openings filled and

reminds us that the largest source of

hire by far is our own employees”.

Source: http://www.careerxroads.com/news/SourcesOfHire2013.pdf

3. Next we need to identify who fits the profile externally and how we are connected to them. The best

candidate may have already applied

and could be hidden in the ATS, talent

network or connected with the

company in some other way, for

example, as a fan of the company

page on Facebook or a follower on

LinkedIn. Alternatively, they may well

be connected via social networks or

e-mail with our existing employees.

4. When we have targets, we want to

understand the best way to reach

them so that we can personalize the

message whilst determining the best

method of delivery and the best time

to get a response.


5. We want to understand our

recruitment process so that we can

identify any blockage in order to

reduce time and cost of hire.
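Objective 5 lends itself to a small worked example. The Python sketch below assumes a hypothetical stage log exported from an ATS (candidate, stage, date entered, date left) and reports the average number of days spent in each stage, so that the slowest stage stands out as the likely blockage. The stage names and dates are invented for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical stage log: (candidate, stage, date entered, date left). All values are invented.
stage_log = [
    ("A", "screen", date(2014, 3, 3), date(2014, 3, 5)),
    ("A", "interview", date(2014, 3, 5), date(2014, 3, 19)),
    ("A", "offer", date(2014, 3, 19), date(2014, 3, 22)),
    ("B", "screen", date(2014, 3, 4), date(2014, 3, 8)),
    ("B", "interview", date(2014, 3, 8), date(2014, 3, 18)),
]


def average_days_per_stage(log):
    """Return the mean number of days candidates spent in each stage."""
    durations = defaultdict(list)
    for _candidate, stage, entered, left in log:
        durations[stage].append((left - entered).days)
    return {stage: sum(days) / len(days) for stage, days in durations.items()}


def slowest_stage(log):
    """Return the stage with the highest average duration."""
    averages = average_days_per_stage(log)
    return max(averages, key=averages.get)


stage_averages = average_days_per_stage(stage_log)
bottleneck = slowest_stage(stage_log)
```

In this invented data the interview stage averages 12 days against 3 days for screening and for the offer, so that is where a recruiter would look first to reduce time to hire.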

There may be many other objectives that

can be solved by taking a big data

approach to hiring. The key point Bersin

is making is to begin with the end in

mind. Understand the problem you are

trying to solve and then apply data

mining, collection and analytics to find a

solution, or at least to understand why

the problem exists.



THE GOOGLE FLU TRENDS

It might seem strange that we are

discussing this particular initiative in

relation to recruiting, but there are some

clear lessons to be learnt from this that

apply to talent acquisition and big data.

Every time you search Google, you leave

a trail as to the information you want now

and meaning can be derived from this, as

well as an understanding of the sources

of information you trust.

A controlled experiment between 2009

and 2010 validated this using search data

to forecast hospital visits for influenza

over a 21 month period. By monitoring

millions of users’ health tracking

behaviors online, the large numbers of

Google search queries gathered were

analyzed to reveal whether there was the

presence of flu-like illness in a population.

Google Flu Trends compared these

findings to a historic baseline level of

influenza activity for its corresponding

region and then reported the activity

level as minimal, low, moderate, high, or

intense. These estimates have been

generally consistent with conventional

surveillance data collected by health

agencies, both nationally and regionally.

“A study in Clinical Infectious Diseases

shows that a Google tool can predict

surges in hospital flu visits more than a

week before CDC. For the study, Johns

Hopkins School of Medicine researchers

compared Baltimore-specific data from

the Google Flu Trends website, which

estimates influenza outbreaks based on

online searches for flu information, to ED

crowding and laboratory statistics from

Johns Hopkins Hospital. Using Google Flu

Trends, researchers found that the

number of online searches for flu

information increased at the same time

that the hospital’s pediatric ED

experienced a rise in cases of children

with flu-like symptoms. The Google Flu

Trends data had a moderate correlation

with patient volume in the adult ED.

Moreover, Google Flu Trends signaled an

uptick in flu cases seven to 10 days earlier

than CDC’s U.S. Influenza Sentinel

Provider Surveillance Network. Based on

the findings, the researchers suggested

that platforms like Google Flu Trends

could help hospital administrators

anticipate flu outbreaks and make

appropriate staffing and capacity

planning decisions.”

Source: http://www.advisory.com/daily-briefing/2012/01/13/google-flu


Whilst this study was concluded four

years ago, the tracking has continued via

the Google Flu Trends report, and much

has been learnt about using search data

to forecast outcomes in a meaningful

way. There has been some variance to the

forecast from Google Flu Trends over the

past year which presents another

challenge when using data to predict

outcomes: that of context and meaning in

unstructured data. This article from the Harvard Business Review outlines the real challenge of taking data-based forecasts at face value:

“AT THE CORE OF THE ISSUE WITH FLU

MEASUREMENT (AND MOST PROJECTS

INVOLVING LARGE AMOUNTS OF DATA)

IS AMBIGUITY; BOTH IN THE INTENT OF

A SEARCH QUERY, AND IN THE SENSE

THAT THE REFERENCE RATE FROM THE

CDC MEASURES INFLUENZA-LIKE

ILLNESSES, WHICH MIGHT INCLUDE

NON-FLU AILMENTS THAT CAUSE

FEVER, COUGH, OR SORE THROAT.

SEARCH TERMS DIRECTLY RELATING

TO A FLU SYMPTOM OR COMPLICATION

ARE CONFLATED BETWEEN PEOPLE

WHO ACTUALLY HAVE THE FLU AND

THOSE THAT ARE EXPRESSING

CONCERNED AWARENESS ABOUT IT —

AND CDC MEASUREMENTS MINGLE

PEOPLE WHO ACTUALLY HAVE THE FLU

AND THOSE THAT ARE JUST

EXPRESSING SOME FLU-LIKE

SYMPTOMS. TRYING TO DETERMINE

THE ACTUAL FLU INCIDENCE REQUIRES

SOME CAREFUL DISAMBIGUATION. THIS

IS ONE PLACE WHERE SMARTER

ALGORITHMS MAY COME INTO DATA

VIGILANTISM: PULLING OUT THE

INFORMATION THAT YOU ACTUALLY

WANT TO MEASURE FROM YOUR BIG,

MESSY PILE OF DATA.”

Source: http://blogs.hbr.org/2013/07/how-google-flu-trends-is-getting-to-the-bottom/

This highlights the need to continuously

measure predicted results against actual

outcomes and to constantly measure the

variance, adjusting the algorithm

accordingly. The benefit we have in

recruiting is that we have access to plenty

of structured data in the ATS and other

HR systems to test outcomes against. By

reverse engineering the data on hires, we

can measure if the historical data trail

follows the same path. It is important to

regularly challenge our datasets and

predictions against known favorable

outcomes in order to ensure the integrity

of predictions.
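A minimal sketch of that feedback loop, in Python, might look like the following. It assumes hypothetical quarterly figures for predicted and actual hires, measures the variance as a mean error (bias) and mean absolute error, and then applies a simple proportional rescaling. The rescaling is chosen purely for illustration; it is not the adjustment used by Google Flu Trends or by any recruiting product.

```python
def forecast_variance(predicted, actual):
    """Return (bias, mean absolute error) between forecasts and recorded outcomes."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("predicted and actual must be non-empty and the same length")
    errors = [p - a for p, a in zip(predicted, actual)]
    bias = sum(errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return bias, mae


def recalibrate(predicted, actual):
    """Scale forecasts by the ratio of total actual to total predicted outcomes."""
    factor = sum(actual) / sum(predicted)
    return [p * factor for p in predicted], factor


# Hypothetical quarterly data: predicted hires vs hires recorded in the ATS.
predicted_hires = [12, 20, 16, 12]
actual_hires = [10, 15, 14, 9]

bias, mae = forecast_variance(predicted_hires, actual_hires)
adjusted, factor = recalibrate(predicted_hires, actual_hires)
```

Here the forecast runs three hires too high per quarter on average, so the correction factor comes out at roughly 0.8. Repeating the comparison each period shows whether the adjustment is holding or whether the meaning of the underlying data has shifted.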



VARIETY OF DATA SOURCES

Another area to pay attention to is the

variety of data sources available,

particularly when using unstructured data

to influence strategy. The data source

needs to be representative of the

population in order to avoid bias or a lack

of diversity. Pew Internet research highlights this problem when looking at the demographics of internet users. The report from December 2013 identifies that whilst Facebook is popular across a mix of demographic groups, other channels have developed their own unique demographic user profiles. For example, Pinterest holds particular appeal to female users (women are 4 times more likely to visit the site than men) and LinkedIn is particularly popular among college graduates and users from higher income households. Twitter and Instagram have particular appeal to younger users, urban dwellers and non-whites, and there is a strong overlap between Twitter and Instagram users.

Source: http://pewinternet.org/Reports/2013/Social-Media-Update/Main-Findings/73-of-online-adults-now-use-social-networking-sites.aspx

Data demographics can be especially useful when applying diversity to sourcing. Where a percentage of the desired workforce population is underrepresented, we can place a greater weighting on data from sources where the target population is well represented. We can

also ensure data integrity by adjusting

the weighting given to certain data to

allow for the data demographics. In order

to derive meaning from data we need to

understand the DNA of the sources we

are mining to allow for any bias.
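One common way to adjust weighting for the demographics of a source is to weight each group by the ratio of its target share to its observed share in the mined data. The Python sketch below shows the arithmetic under stated assumptions: the counts and the target mix are hypothetical, and the approach is offered as an illustration rather than as the method any particular platform uses.

```python
def representation_weights(sample_counts, target_shares):
    """Weight each group by target share / observed share so the sample mirrors the target."""
    total = sum(sample_counts.values())
    weights = {}
    for group, target in target_shares.items():
        observed = sample_counts.get(group, 0) / total
        if observed == 0:
            raise ValueError(f"no data for group {group!r}; add a source that covers it")
        weights[group] = target / observed
    return weights


# Hypothetical counts of profiles mined from one channel, and a hypothetical target mix.
sample = {"women": 200, "men": 800}
target = {"women": 0.5, "men": 0.5}

weights = representation_weights(sample, target)
weighted_counts = {group: sample[group] * weights[group] for group in sample}
```

With these invented numbers, each profile from the underrepresented group counts 2.5 times and each profile from the overrepresented group counts 0.625 times, so both groups contribute equally to any trend drawn from the channel.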

In recruiting terms, we can apply this

thinking to technology by targeting the

data we want to collect, identifying the

data source to protect against bias and

applying contextual understanding. This

involves investigating meaning in the data

being analyzed. In the case of Google Flu

Trends, the variance occurred because

the reason people were turning to Google

for advice changed from those who were

concerned about symptoms they were

experiencing to people wanting to take

precautions in advance of an epidemic

because of increased publicity and news.

When we can understand data trends, we

can apply meaning to what triggers this

reaction. One of the real benefits we

have in the recruiting space is that we

can use this thinking to identify the

internet behaviors of people who are

beginning to prepare for a job change. A


candidate in a talent network preparing

for a job change needs greater attention

in matching results against job openings,

and we know they are likely to be more

receptive to a referral message. Our

tracking has shown that on LinkedIn, for

example, a user thinking about a change

will update their profile, seek and add

new recommendations, increase their

connections, particularly with recruiters, and follow more companies. Whilst these actions in isolation might mean very little, when they are combined over a four-week period, the user is likely to be thinking about moving. The same thinking can be

found in other social channels, with

predictive behaviors identified by mining

the social media behaviors of candidates

as they apply in order to understand what

candidates look like in terms of their data

footprint so that people with a similar

footprint can be identified.
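The idea of combining individually weak signals over a four-week window can be sketched in a few lines of Python. The signal labels mirror the LinkedIn behaviors described above, but the labels themselves, the event format and the threshold of three distinct signals are all assumptions made for this example.

```python
from datetime import date, timedelta

# Signals named in the text; the label strings are hypothetical.
SIGNALS = {
    "profile_update",
    "new_recommendation",
    "new_connection",
    "recruiter_connection",
    "company_follow",
}


def likely_to_move(events, as_of, window_days=28, min_distinct=3):
    """Return True when enough different signals fall inside the window.

    `events` is a list of (signal, date) pairs. The default threshold of three
    distinct signals is an assumption, not a figure from this paper.
    """
    start = as_of - timedelta(days=window_days)
    recent = {
        signal
        for signal, when in events
        if signal in SIGNALS and start <= when <= as_of
    }
    return len(recent) >= min_distinct


# Hypothetical activity for one member of a talent network.
events = [
    ("profile_update", date(2014, 3, 3)),
    ("new_recommendation", date(2014, 3, 10)),
    ("recruiter_connection", date(2014, 3, 20)),
    ("company_follow", date(2013, 11, 1)),  # outside the window, so ignored
]
flag = likely_to_move(events, as_of=date(2014, 3, 24))
```

Counting distinct signals rather than raw events reflects the point made above: one profile edit repeated ten times means little, whereas several different behaviors in the same month suggest a pattern.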

THE SOCIAL GRAPH

Facebook CEO Mark Zuckerberg

popularized the concept of the social

graph to describe his approach to

mapping the world’s social relationships,

in the process, unlocking untold value for

people by digitizing their social networks.

The social graph has been referred to as

“the global mapping of everybody and

how they’re related”. The term was

popularized at the Facebook F8

conference on May 24, 2007, when it was

used to explain that the Facebook

Platform, which was introduced at the

same time, would benefit from the social

graph by taking advantage of the

relationships between individuals, that

Facebook provides, to offer a richer

online experience. The definition has been

expanded to refer to a social graph of all

Internet users.

When this is applied to referral networks,

mapping the social graph of the

organization and employees highlights

the best sources for distributing targeted

messages and opportunities. The average

number of LinkedIn connections per

employee is 225, with 130 Facebook

connections. This offers targeted reach

for relevant messaging, and tracking

employee activity and response to

messaging will highlight those employees

most likely to participate in employee

referral networks. Collecting ongoing

data on user behavior and outcomes

enables recruiters to identify their most

active referrers with the strongest

relationships. The strength of a

relationship can be calculated by mining

all of the channels where there is a


connection in order to understand the

nature of the relationship. Whilst we

might identify a relationship as a

connection between two people, e.g. they

are friends on Facebook or connections

on LinkedIn, we can use other factors to

identify the depth of the relationship in

order to apply a weighting to the depth

of connection.

Interactions between users in social

channels that indicate the depth of

relationship can include the measurement

of:

• Number of connection points

• Shared connections

• Frequency of interactions e.g. @

messages on Twitter, comments

and likes on Facebook, shared

pictures etc.

• Social reputation e.g. Stack Overflow

votes on answers or questions

• Shared work history

Whilst this list is by no means exhaustive,

it is easy to see how relationships

between target candidates and existing

employees can be ranked to identify who

has the closest relationship. It stands to

reason that the closer the relationship

and the more mutual respect that exists,

the greater the likelihood of eliciting a

positive response to the referral message.
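The ranking described above can be sketched as a weighted sum of the listed factors. The factor names and the weights below are assumptions chosen purely for illustration; in practice the weights would need to be tuned against observed referral outcomes.

```python
# Assumed weights for illustration only.
WEIGHTS = {
    "connection_points": 1.0,    # number of channels where the pair is connected
    "shared_connections": 0.1,
    "interactions": 0.5,         # e.g. @ messages, comments, likes
    "reputation": 0.2,           # e.g. votes on answers or questions
    "shared_work_history": 3.0,  # 1 if they have worked together, else 0
}

def relationship_strength(factors):
    """Weighted sum of relationship factors; missing factors count as 0."""
    return sum(WEIGHTS[name] * factors.get(name, 0) for name in WEIGHTS)

def rank_employees(links):
    """Sort employees by relationship strength with one candidate, strongest first."""
    return sorted(links,
                  key=lambda emp: relationship_strength(links[emp]),
                  reverse=True)

# Hypothetical employees and their measured links to a single target candidate.
links = {
    "alice": {"connection_points": 2, "shared_connections": 15,
              "interactions": 8, "shared_work_history": 1},
    "bob": {"connection_points": 1, "shared_connections": 40},
    "carol": {"connection_points": 3, "interactions": 1},
}
print(rank_employees(links))
```

A simple linear weighting is used here because it is easy to explain to recruiters; it is one possible choice rather than a prescribed method.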

A big problem that exists when sourcing

talent is over-messaging. This is a

fundamental problem with many social

recruiting platforms, where matching and

targeting are based on a limited data set

such as job title and location. Typically, an

individual will have multiple

connections in the same organization. If

each contact sends the same referral

message in the same way, the risk is that

the messages are treated as spam. In the same

way as the social graph can be used to

identify connections with a target profile,

so too the relationship graph can be

applied to prioritize who should be

reaching out and through which channel.

The relationship graph leads to the most

likely route for a successful outcome.
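As a minimal sketch of how a relationship graph could prevent over-messaging, the example below selects exactly one employee and one channel per target candidate. The scores, names and channels are hypothetical, and how the scores are calculated is deliberately left open.

```python
def pick_outreach(scores):
    """Pick the single (employee, channel) pair with the highest score.

    scores maps (employee, channel) -> relationship score.
    """
    return max(scores, key=scores.get)

def plan_messages(candidates):
    """Return one outreach per candidate so nobody receives duplicate messages."""
    return {cand: pick_outreach(scores)
            for cand, scores in candidates.items() if scores}

# Hypothetical relationship scores for two target candidates.
candidates = {
    "candidate_1": {("alice", "linkedin"): 10.5,
                    ("bob", "facebook"): 5.0,
                    ("alice", "twitter"): 4.0},
    "candidate_2": {("carol", "linkedin"): 3.5},
}
print(plan_messages(candidates))
```

However many employees are connected to a candidate, only the strongest route is used.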

PEOPLE PROFILING AND RELEVANCE OF MESSAGING

In the first section of this paper, we

identified the way in which structured

data from internal sources can be used to

profile the best performers in the

company, and how their skill sets,

aptitude and other factors can be used to

compile job specs and a sourcing plan.


When you know what good looks like,

you can plan how to find it again. Once

you have profiles of your best employees,

you have a data footprint of what you are

looking for, and a template to match

against. This can be used in a variety of

ways including identifying potential hires

now and in the future or identifying an

audience for your employer brand

content. Mapping the people you want to

hire against the social graph of your

current employees enables you to identify

your employees with the deepest

relationship with the highest likelihood of

being heard.
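One simple way to match people against a template footprint is set similarity. The sketch below uses Jaccard similarity over attribute sets; the attributes, the prospects and the 0.5 threshold are all assumptions for illustration, and a real footprint would contain far more than a handful of skills.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap divided by size of union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def match_against_template(template, prospects, threshold=0.5):
    """Return (name, similarity) for prospects at or above the threshold, best first."""
    scored = [(name, jaccard(template, attrs)) for name, attrs in prospects.items()]
    matches = [(name, score) for name, score in scored if score >= threshold]
    return sorted(matches, key=lambda item: item[1], reverse=True)

# Hypothetical template built from the profiles of top performers.
template = {"python", "sql", "statistics", "mentoring"}
prospects = {
    "prospect_1": {"python", "sql", "statistics"},
    "prospect_2": {"java", "sql"},
}
print(match_against_template(template, prospects))
```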

Unstructured social data enables targets

to be profiled by professional and

personal data, online behaviors, interests,

etc., in order to deliver a personalized,

tailored message from a trusted source.

In a world of noise, tailored marketing to

an audience of one greatly improves the

probability of success, and probability is

what big data is all about.

DATA VISUALIZATION

In the age of big data, data visualization

is becoming critical to deriving meaning

from the masses of information available

to us. Visualization is the creation and

study of the visual representation of data,

meaning information that has been

abstracted in some schematic form,

including attributes or variables for the

units of information. The vast majority of

us are not data experts or analysts, and

that’s why we use tools to do the work.

We find it hard to understand numbers in

columns and rows, but we understand

pictures, graphs, maps and symbols.

Whilst we may not be able to spot a

possible problem hidden in a row of data,

we can easily locate a possible problem if

it is indicated by a red flag. Whilst we

may find it difficult to identify the extent

of a skills shortage in a given location by

studying academic data and comparing it

with job openings, if you overlay the

same data on a map to create a heat map

of talent against openings, you can

immediately spot where there is a skills

shortage or oversupply.
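Before any map is drawn, the heat map comes down to comparing talent supply with openings in each location. The sketch below computes candidates per opening and attaches a label that a mapping tool could color. The towns, figures and thresholds are invented for illustration.

```python
def candidates_per_opening(talent, openings):
    """Ratio of available candidates to openings for each location with openings."""
    return {loc: talent.get(loc, 0) / n for loc, n in openings.items() if n > 0}

def heat_label(ratio, shortage_below=1.0, oversupply_above=3.0):
    """Turn a ratio into a label; both thresholds are assumptions."""
    if ratio < shortage_below:
        return "shortage"
    if ratio > oversupply_above:
        return "oversupply"
    return "balanced"

def heat_map(talent, openings):
    """Label every location so it can be colored on a map."""
    ratios = candidates_per_opening(talent, openings)
    return {loc: heat_label(ratio) for loc, ratio in ratios.items()}

# Hypothetical counts of qualified people and open jobs per town.
talent = {"Town A": 120, "Town B": 15, "Town C": 60}
openings = {"Town A": 20, "Town B": 30, "Town C": 30}
print(heat_map(talent, openings))
```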

Airline caterers Gate Gourmet were able

to apply this technique in San Francisco

to change their whole recruiting strategy.

Recruiting unskilled labor in Silicon Valley

is no mean feat. Historically, Gate Gourmet

had targeted all the towns surrounding

the city, taking a truck out and announcing

job openings. They did this for a number

of years, and were always


understaffed. The big problem as they

saw it was a combination of their

location, the availability of unskilled labor

and the fact that being airside and a low

margin business, they paid $2 below

minimum wage.

Seeking a solution to an on-going

problem, they overlaid where they had

experienced success in hiring with a map

of the local area. This told an interesting

story that had previously been hidden

from them: their hires came from 3 towns.

Changing strategy, rather than try to

blanket cover the whole state, they

concentrated all of their efforts on the 3

towns identified on the data map with

increased activity and investment. The

net result just 12 months later is that they

now have a waiting list for opportunities

at the airport. The decision to change

direction was made simply by making the

obvious visible in a format everyone

understood.

A multi-national company was able to

reduce their time to hire from 95+ days to

16 by mapping out their end-to-end

process from the raising of a requisition

to a hire starting. They then measured the

time taken at each stage and overlaid this

data on the end-to-end process map. This

highlighted where blockages were

occurring, allowing them to make

improvements. The process map made

every stage visible, and the addition of

data revealed where significant

improvements could be made. It is

common in organizations that poor

practice continues year on year because

it is the way things have always been

done or just the way things are, and

making data visual has the potential to

change that. Data needs to be presented

in a format that is easy to digest without

expert knowledge in order to bring about

best practice.
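Overlaying time data on a process map amounts to averaging the days spent in each stage and finding the slowest one. The sketch below shows this under stated assumptions: the stage names and durations are hypothetical and are not taken from the company described above.

```python
from statistics import mean

def stage_averages(requisitions):
    """Average number of days spent in each stage across all requisitions."""
    days_by_stage = {}
    for req in requisitions:
        for stage, days in req.items():
            days_by_stage.setdefault(stage, []).append(days)
    return {stage: mean(days) for stage, days in days_by_stage.items()}

def bottleneck(requisitions):
    """Name of the stage with the highest average duration."""
    averages = stage_averages(requisitions)
    return max(averages, key=averages.get)

# Hypothetical stage durations (in days) for two requisitions.
requisitions = [
    {"approval": 30, "sourcing": 20, "interview": 25, "offer": 20},
    {"approval": 40, "sourcing": 18, "interview": 22, "offer": 15},
]
print(stage_averages(requisitions))
print(bottleneck(requisitions))
```

Plotting these averages against the process map would give the visual view described in the text; the calculation itself is deliberately simple.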

SELF-SERVICE TOOLS ON DEMAND

More users are expecting self-service

access to data, without the need to call

on data or computer experts. This means

that the users need to be able to set their

own parameters for data interrogation

and access results in a format they can

understand. A good example of this

would be giving recruiters control to set

their own search criteria, eliminating, for

example, employees from a certain

company. This gives the recruiter control

of the data at their disposal without

being dependent on a machine-dictated

algorithm. As much control as possible

needs to be in the hands of the user, with

results delivered instantly in a format they

can understand. Making data accessible

and understandable improves decision

making, particularly when it operates in

real time on the latest data, when it is

needed most.
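A recruiter-controlled search of this kind can be sketched as a filter whose parameters are set by the user rather than fixed by the system. The profile fields, company names and skills below are hypothetical.

```python
def search(profiles, exclude_companies=(), required_skills=()):
    """Return profiles that match the recruiter's own criteria.

    exclude_companies: companies whose employees should be left out.
    required_skills: skills every returned profile must have.
    Matching is case-insensitive.
    """
    excluded = {c.lower() for c in exclude_companies}
    required = {s.lower() for s in required_skills}
    return [p for p in profiles
            if p["company"].lower() not in excluded
            and required <= {s.lower() for s in p["skills"]}]

# Hypothetical talent network records.
profiles = [
    {"name": "Ann", "company": "Acme", "skills": ["python", "sql"]},
    {"name": "Ben", "company": "Globex", "skills": ["Python", "java"]},
    {"name": "Cy", "company": "Initech", "skills": ["java"]},
]
results = search(profiles, exclude_companies=["Acme"], required_skills=["python"])
print([p["name"] for p in results])
```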

MACHINE LEARNING

Machine learning concerns the

construction and study of systems that

can learn from data. For example, a

machine learning system could be trained

on e-mail messages to learn to distinguish

between spam and non-spam messages.

After learning, it can then be used to

classify new e-mail messages into spam

and non-spam folders.
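To make the spam example concrete, here is a minimal sketch of a naive Bayes text classifier with add-one smoothing, one common way to build such a filter. The training messages are invented, the labels "spam" and "not_spam" are arbitrary names, and the code assumes both labels appear in the training data.

```python
import math
from collections import Counter

LABELS = ("spam", "not_spam")

def train(messages):
    """Count words per label. messages: list of (text, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts = model
    vocab = set(word_counts["spam"]) | set(word_counts["not_spam"])
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_messages)
        for word in text.lower().split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training data for illustration.
training = [
    ("win a free prize now", "spam"),
    ("free money win big", "spam"),
    ("meeting agenda for monday", "not_spam"),
    ("please review the interview notes", "not_spam"),
]
model = train(training)
print(classify("free prize", model))
print(classify("interview notes monday", model))
```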

The core of machine learning deals with

representation and generalization.

Representation of data instances and

functions evaluated on these instances

are part of all machine learning systems.

Generalization is the property that the

system will perform well on unseen data

instances; the conditions under which this

can be guaranteed are a key object of

study in the subfield of computational

learning theory.

There are a wide variety of machine

learning tasks and successful applications.

Optical character recognition, in which

printed characters are recognized

automatically based on previous

examples, is a classic example of machine

learning. Source: http://en.wikipedia.org/wiki/Machine_learning

When it comes to big data, machine

learning is a branch of artificial

intelligence (AI), the practice of getting

computers to think like people. Analytics

of user interpretation of data enables

technology to make decisions and rank

results based on the past reaction of

users. A good example of this might be a

recruiter who rejects the result of a

search or removes similar resumes from

those under consideration. Machine

learning interprets these actions in order

to change the way future results are

arrived at and determined.
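The recruiter example can be sketched as a ranker that adjusts feature weights each time a result is accepted or rejected. This is a deliberately simple additive scheme with hypothetical feature names and an assumed learning rate; it is not a description of how any particular recruiting product or search engine works.

```python
from collections import defaultdict

class FeedbackRanker:
    """Ranks results using weights learned from past accept/reject actions."""

    def __init__(self, learning_rate=0.5):
        self.weights = defaultdict(float)  # feature -> learned weight
        self.learning_rate = learning_rate

    def score(self, features):
        """Sum of the learned weights for the features of one result."""
        return sum(self.weights[f] for f in features)

    def record(self, features, accepted):
        """Raise weights for accepted results, lower them for rejected ones."""
        delta = self.learning_rate if accepted else -self.learning_rate
        for f in features:
            self.weights[f] += delta

    def rank(self, results):
        """results: dict of result id -> set of features. Highest score first."""
        return sorted(results, key=lambda r: self.score(results[r]), reverse=True)

# Hypothetical search results described by simple features.
results = {
    "resume_a": {"java", "agency"},
    "resume_b": {"python", "startup"},
}
ranker = FeedbackRanker()
ranker.record({"java", "agency"}, accepted=False)  # recruiter rejects a similar resume
ranker.record({"python"}, accepted=True)           # recruiter shortlists another
print(ranker.rank(results))
```

Each further action shifts the weights a little more, which is the sense in which more interaction leads to better-ordered results.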

Google uses machine learning to order

search results by past actions, ranking

search results by items the user has

shown trust in previously, like blog posts

from a particular writer or academic text.

This works on the principle that the more

the user interacts with the technology,

the better the technology gets at

understanding the user in the same way a

person would. Interaction is rewarded by

continually offering a better user


experience. Machine learning is applied to

big data for recruiting in the same way.

Modern recruiting technology facilitates

this methodology, bringing the way the

technology thinks closer to the elusive

artificial intelligence.

SUMMARY

In September 2011, super sourcer Glen

Cathey, the SVP of Talent Strategy and

Innovation at recruiters Kforce, published

a post, entitled “Moneyball Recruiting”, on

his excellent resource, the Boolean Black

Belt blog. In this post Cathey drew on the

story of Billy Beane, who used data analysis

to select the Oakland A’s baseball team. The

story of the success this team achieved is

well documented elsewhere, but Cathey

expressed the opinion that such

methodology could be applied to change

recruiting and human capital

management.

Cathey concluded with the statement:

“I AGREE WHOLEHEARTEDLY WITH

MIKE LOUKIDES THAT: “THE FUTURE

BELONGS TO THE COMPANIES WHO

FIGURE OUT HOW TO COLLECT AND

USE DATA SUCCESSFULLY.” HOWEVER,

I’D ADD THAT THE FUTURE BELONGS

MORE ACCURATELY TO THE

COMPANIES WHO FIGURE OUT HOW TO

COLLECT AND USE HUMAN CAPITAL

DATA SUCCESSFULLY. THAT’S BECAUSE

THE COMPANIES THAT CAN

CONSISTENTLY HIRE GREAT PEOPLE,

THROUGH IDENTIFYING PEOPLE AND

BASING HIRING DECISIONS ON DATA

AND NOT INTUITION AND

CONVENTIONAL WISDOM, ARE MORE

LIKELY TO DEVELOP THE BEST TEAMS.

AND THE BEST TEAMS WIN.”

See more at: http://booleanblackbelt.com/2011/09/big-data-data-science-and-moneyball-recruiting/#sthash.rcfD4Xgh.dpuf

When Cathey wrote this post, we were

still speculating about the potential

offered up by the information age. Other

areas of business like marketing and sales

were showing themselves as the early

adopters of data mining and analytics to

derive business value, and a lot of what is

being developed in the recruiting space

can be attributed to the lessons learnt at

this time, lessons we are now applying to

talent acquisition and hiring.


Josh Bersin, of Bersin Deloitte, has created a model to outline the spread of big data analytics

across business functions on the journey towards HR and Recruiting:

[Figure: "Analytics Is Definitely Coming to HR: The Evolution of Business Analytics in Other Functions" (The Waves of Business Analytics)]

• The Industrial Economy (early 1900s; steel, oil, railroads): logistics and supply chain analytics; integrated supply chain

• The Financial Economy (1950s-60s; conglomerates, financial, engineering): 1980s financial and budget analytics; integrated ERP and financial analytics

• The Customer Economy and Web (1970s-80s; customer segmentation, personalized products): customer analytics - CRM (data warehouse); customer segmentation; shopping basket; web behavior analytics; predictive customer behavior - CRM

• The Talent Economy (today; globalization, demographics, skills and leadership shortages): recruiting, learning, performance measurement; integrated talent management; workforce planning; business-driven talent analytics; predictive talent models; HR analytics

The case now in 2014 is overwhelming. We can utilize the vast volume of structured and

unstructured data in all areas of recruiting, talent acquisition, workforce planning and hiring. We

can expect big data to play an increasingly important role in identifying, pipelining and hiring the

best talent in the most effective way, and we have only just started.


NEXT STEPS

BILL BOORMAN

The author, Bill Boorman, has over 30

years’ experience in and around recruiting.

He has spent the last 3 years working with

social recruiting technology start-ups on

product and with corporate clients

including Hard Rock Café, Oracle and the

BBC to integrate social into their recruiting

practices. Bill has also hosted recruiting

events in over 30 countries worldwide.

CONTACT US TO SCHEDULE A FREE EMPLOYEE REFERRAL

CONSULTATION WITH BILL BOORMAN

ROLEPOINT

RolePoint delivers employee referral

solutions to a range of Fortune 500 and

Nasdaq clients, building the principles that

help companies generate 70%+ referral

rates into a software-as-a-service platform.

Understanding that at the core of any

successful referral program is the

employee, RolePoint focuses on providing

an engaging, transparent and frictionless

experience, making it easy to identify

talented connections to refer.

For recruitment teams, RolePoint offers a

comprehensive set of tools, enabling

tracking, automation and recruitment

intelligence for greater control and insight

into referrals within your organization.

CONTACT US TO FIND OUT MORE ABOUT ROLEPOINT AND ARRANGE

A DEMONSTRATION

WWW.ROLEPOINT.COM

INQUIRIES@ROLEPOINT.COM


