iWork: Analytics for Human Resources Management

Girish Keshav Palshikar
Tata Consultancy Services Limited
54B Hadapsar Industrial Estate, Pune 411013, India. [email protected]

Invited Talk at the Forum for Information Retrieval Evaluation (FIRE 2012), Indian Statistical Institute, Kolkata, India on 19-Dec-2012


Human Resources Management

• HR Management is a crucial function within any organization
– More so in services, IT and BPO industries
– HR is a cost center; everyone interacts with HR!
• The HR function is characterized by a large number of
– Business processes
– IT systems that automate these business processes
– Huge databases of employee activities
– Many employee contact initiatives
– Close interactions with L&D
– Associated metrics and KPIs
– Monitoring and regulatory compliance

Phases in HR Management

• Talent Acquisition
– Requirement gathering, recruitment planning
– Campus recruitment, EP interviews and recruitments
• Talent Management and Utilization
– Allocation (e.g., to project), team formation, monitoring
– Roles and tasks, utilization tracking (billing, timesheets)
– Transfer, deputation, travel
– Training, knowledge management
– Performance appraisal, promotion, salary
– Communications
– Administration (leave, medicals, …)
• Talent Retention
– Feedback, complaints, grievances, …
– Resignation handling, retention, knowledge transfer, succession

Business Goals of HR Management

• HR is responsible for maintaining a high-quality workforce
– Well-aligned and competitive for the business of the organization
– Effective in performing the business tasks and services, delivering required value and meeting client expectations
– Well-trained in the required and emerging skills
– Highly responsive to emerging business requirements
– Stable (low attrition, low impact of attrition; successful succession and knowledge transfer)
– Cost-effective (low salary and overhead costs)
– Able to evolve into leadership
– Agile and mobile (quickly form effective and distributed teams)
– Motivated, high on initiative and ownership, highly proactive
– Happy with their environment, work, roles, salaries, career paths
– Follows professional ethics, codes of conduct etc.
– Well-integrated and diverse; highly communicative

What Makes HR Management Challenging?

• The human factors!
• Large and varied backgrounds of the workforce
• Globalization, diversified and distributed workforce
• New business demands (services, products, …)
• New business models
• New customers across the globe
• Mergers and acquisitions across the globe
• Changing skills requirements
• Innovation (disruptive / incremental, technical / domain)
• Risks (ethical violations, data privacy, client confidentiality)

iWork: Analytics for HR

[Diagram labels: Data Repositories; Text Repositories; Analytics Techniques; HR Domain Knowledge; Insights, Patterns, Knowledge (Actionable, Novel)]

Vision
• Make effective use of historical databases and document repositories for solving HR domain-specific problems
• Use analytics-driven decision making to meet HR business goals
• Combine data and text mining to build innovative HR domain-specific solutions for significant enterprise
• Deliver analytics-derived insights to the right users at the right point in the HR business processes

Technology Areas
Data Mining, Machine Learning, Pattern Recognition, Statistical Analysis, Natural Language Processing, Computational Linguistics, Text Mining; Optimization

iWork: Opportunities

• Reduce effects of attrition: understanding of root causes, accurate prediction, targeted retention strategies, backup/replacement plans, …
• Improve project team formation: optimal mix of experience and expertise for all projects, optimal cost team, maximize match with associates’ interests
• Improve associate satisfaction: better understanding of drivers, identify concrete action items, cost-effective improvement plan, what-if analysis
• Talent acquisition: reduce delays and costs, maximize match with requirements, school evaluation, identify patterns of long-term stay, …
• Team profiles: in terms of backgrounds, skills, roles, domains, …
• Effective RFI responses/proposals: locate experts, experience, tools
• Available HR datasets: resumes (internal, external), timesheets, project details, in-house tool/document repositories, allocations, trainings, surveys, …

iWork Overview

• QUEST: employee/customer survey analytics
• iRetain: attrition analytics; retention analytics
• Resume Center: extract structured information from resumes; match job requirements; find experts; team skill profiles; …
• ExBOS: optimal project team formation
• iTAG: analytics for talent acquisition
• Analytics for improving effectiveness of training programs
• …

iWork Vision: Mine HR Data to Drive Improvements to Workforce Management

[Diagram labels: HR Datasets; iWork; Talent Acquisition; Talent Utilization and Management; Talent Retention. Systems shown: SPEED, iCALMS, PULSE, PEEP, mPOWER, IPMS, Visa Tracker, GESS, HRMS, GRS, ExOp, RGS, Workforce Planning, RMG Tracker, Complaints Management System]

iWork: Strategy and Offerings

• Build a set of integrated offerings that identify specific improvement opportunities for Workforce Management
• Transcend silos in HR systems and data
• Integrate the offerings with existing HR systems to deliver the actionable recommendations to the users when they need them

Talent Retention
• As-is state dashboard for attrition
• Discovering high-attrition groups
• Predictive models for attrition
• Identifying root causes of attrition
• Plan for reducing attrition impact
• Optimal retention plan

Talent Acquisition
• TA analytics: improve cost, quality, timeliness of TA
• ILP analytics

Talent Management
• QUEST: employee survey analytics
• ExBOS: optimal team formation
• Visa analytics
• Resume Center: competency extraction; find right people for positions; find experts; find misinformation; enrich RFP response (past projects, tools); talent pool profiling; find customer intelligence
• Bench analytics

ITIS workforce management: team sizing + shift planning; optimal team skill profile; service level rationalization; expert finding; training plans; DC transformation planning

Automation of survey responses tagging Nielsen

Practical Applications Need a Lot More than IR

[Diagram labels: Various databases and text repositories; Business Solution]

• Document retrieval; ranking
• Fine-grained retrieval
• Goal-directed retrieval
• Post-processing
• Information extraction
• Cross-linking and information fusion (e.g., with FB, LinkedIn)
• Classification
• Visualization; summarization
• Analytics (problem-specific)
• Learning to rank


RESUME CENTER

Effective use of resumes in all HR functions

Using Resumes in HR Functions

• Resumes: a valuable source of information about people’s work
– ~250,000 TCS employees’ resumes
– ~2 million candidates’ (applicants’) resumes
• Business goals
– Use information extraction: extract personal, job history, project details, education, training, awards etc. from given resumes
o Validate information in resumes
o Create gazettes (colleges, degrees, certifications, tools, companies)
– Update employee experience profile and skills/competencies
– Identify top K best matches for a given job requirement / position
o Learning to rank (poster in FIRE 2012)
o Improve team formation; shorten recruitment cycle
– Perform mining of data extracted from resumes to derive novel, actionable insights about the available talent pool
o Reduce bench; reduce attrition; improve utilization
o Identify training opportunities; help in career planning


Resume Center: Information Extraction

On-the-fly extraction using an IR engine


Resume Center: PowerMiner

• Given a set of resumes, provide facilities to help in filing RFI/RFP responses, form project teams etc.
• Locate relevant projects for a given project description
• Locate relevant tools for a given project description
• Identify expert persons for a given technical area
• Assign domain(s) to each resume (e.g., insurance, railways, banking, telecom etc.)
• Identify "unusually high quality" resumes in terms of a set of pre-defined quality criteria
– Special tools, niche skills, extra qualifications (e.g., domain-related), top-quality academic performance, awards, publications

Resume Center: Team Profiler

• Given a resume repository, help HR executives in building an “understanding” of their teams:
– What are the strengths and weaknesses of my team in terms of technical skills, domain knowledge, roles etc.?
– What should I do to improve the quality of my teams?
• Create a summary profile of a team, in terms of technology skills, domains, experience etc.
• Group the given resumes into clusters (from different perspectives), with a specific interpretation for each cluster
– Similar to customer segmentation?
• Document repository visualization and exploratory facilities


R. Srivastava, G.K. Palshikar, RINX: Information Extraction, Search and Insights from Resumes, Proc. TCS Technical Architects' Conf. (TACTiCS 2011), Thiruvananthapuram, India, Apr. 2011.

S. Pawar, R. Srivastava, G.K. Palshikar, Automatic Gazette Creation for Named Entity Recognition and Application to Resume Processing, Proc. ACM COMPUTE 2012 Conference, Pune, India, 24-Jan-2012.

G.K. Palshikar, R. Srivastava, S. Pawar, Delivering Value from Resume Repositories, TCS White Paper published on www.tcs.com, Feb. 2012. (c) Tata Consultancy Services Limited.

QUEST

Survey response analytics

QUEST: Overview

• Advanced analytics tool to mine survey response data and derive novel, actionable insights for improving workforce management
• Surveys are a direct and effective mechanism to gauge concerns and issues that affect satisfaction of employees or customers
• Motivation: TCS conducts an annual in-house employee survey
– 250,000 employees, ~100 questions (structured, free-form)
– 250,000 textual responses to each of ~20 questions
– Challenges: volumes; dependencies; mixed structured/text responses
• Business goals: improve satisfaction levels among employees
• Benefits: deeper insights, objective results, reduced time/efforts
• Impact: satisfaction levels affect project quality, client satisfaction
• Status: currently deployed in-house
• Vision
– QUEST should be an integral part of all HR contact and feedback programs throughout TCS (ISU, geographies, clients etc.)
– Deploy for customer / product satisfaction surveys

QUEST: Approach

• Dashboards and standard reports
• Drill-down exploratory analysis
• Visualization
• Summarize responses to specific questions/categories
• Identify specific issues, concerns and suggestions
• Characterize low-satisfaction groups (discover common characteristics of employees with high/low satisfaction)
• Identify factors (root causes) that affect satisfaction
• Design optimal plans to improve satisfaction levels
• Use survey results in team planning and other workforce management tasks

G.K. Palshikar, S. Deshpande, S. Bhat, QUEST: Discovering Insights from Survey Responses, Proc. 8th Australasian Data Mining Conf. (AusDM09), Dec. 1-4, 2009, Melbourne, Australia, P.J. Kennedy, K.-L. Ong, P. Christen (Eds.), CRPIT, vol. 101, published by Australian Computer Society, pp. 83 - 92, 2009.


QUEST: Results

QUEST: Results…

Things you don’t like about TCS

QUEST: Results…

PULSE 2008-09 Responses for TCS Mumbai

• Groups having unusually low ASI
– EXPERIENCE_RANGE = ‘4-7’ (60.4; global avg. = 73.8)
• Root causes for low ASI
– Canteen, Transportation, RMG

M. Natu, G.K. Palshikar, Interesting Subset Discovery and its Application on Service Processes, Proc. Workshop on Data Mining for Services (DMS 2010) held as part of the Int. Conference on Data Mining (ICDM 2010), Australia, 2010, pp. 1061-1068.

Interesting subset discovery: finding bumps in a large-dimensional distribution
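The idea of flagging a group whose average score sits well below the global average can be sketched in a few lines of Python. This is only a simplified stand-in and not the interesting subset discovery algorithm from the cited paper: it looks at a single attribute, uses a fixed margin, and runs on made-up records. The function name low_score_groups, the margin parameter and all data values are assumptions for illustration.

```python
from collections import defaultdict

def low_score_groups(records, attribute, score_key, margin):
    """Return the global average and the attribute values whose group
    average falls at least `margin` below it (simplified illustration)."""
    global_avg = sum(r[score_key] for r in records) / len(records)
    by_value = defaultdict(list)
    for r in records:
        by_value[r[attribute]].append(r[score_key])
    flagged = {}
    for value, scores in by_value.items():
        group_avg = sum(scores) / len(scores)
        if global_avg - group_avg >= margin:
            flagged[value] = group_avg
    return global_avg, flagged

# Hypothetical survey records; the scores are invented for this sketch.
records = [
    {"EXPERIENCE_RANGE": "0-3", "ASI": 80},
    {"EXPERIENCE_RANGE": "0-3", "ASI": 78},
    {"EXPERIENCE_RANGE": "4-7", "ASI": 60},
    {"EXPERIENCE_RANGE": "4-7", "ASI": 62},
    {"EXPERIENCE_RANGE": "8+", "ASI": 76},
    {"EXPERIENCE_RANGE": "8+", "ASI": 76},
]
global_avg, flagged = low_score_groups(records, "EXPERIENCE_RANGE", "ASI", margin=10)
```

A fuller treatment would search over combinations of attributes and use a statistical test rather than a fixed margin.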

QUEST: Results…

Actionable suggestions made by associates
• TCS can have tie ups with best Schools in the near by locations for their employee kids
• … the moment you step out there is only garbage and randomly parked autos around
• TCS can engage with lease agreement … with TATA Housing itself and provide economical accommodation.
• I don`t have any leg space...n my knees are hurting badly

S. Deshpande, G.K. Palshikar, G. Athiappan, An Unsupervised Approach to Sentence Classification, Proc. Int. Conf. on Management of Data (COMAD 2010), Nagpur, 2010, Allied Publishers Pvt. Ltd., pp. 88 - 99.


Sentence Classification

Sentence class labels are usually domain-dependent

Unsupervised classification of sentences: specific / general

Sentence Classification…

• A SPECIFIC sentence is more “on the ground”
• A GENERAL sentence is more “in the air”
• Example:
– My table is cramped and hurts my knees.
– The work environment needs improvement.
– Travel vouchers should be cleared within 2 working days.
– Accounts department is very inefficient.

Sentence Classification…

• Compute a specificity score for each sentence:
– Unsupervised (knowledge-based), without the need for any labeled training examples.
– Define a set of features and compute their values for each sentence.
– The features are lexical / semantic.
– The features are context-free: their values are computed exclusively using the words in the sentence and do not depend on any other (e.g., previous) sentences.
– Then combine the feature values for a particular sentence into its specificity score.
• Rank the sentences in terms of their specificity score.
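The combine-then-rank step can be sketched in Python as follows. The slides do not say how the feature values are combined, so the unweighted sum used here is an assumption, as are the function names and the feature values, which are invented and taken to be already on a common scale where higher means more specific.

```python
def specificity_score(features):
    """Combine feature values by an unweighted sum (an assumption of this sketch)."""
    return sum(features.values())

def rank_by_specificity(feature_table):
    """Return (sentence, score) pairs, most specific first."""
    scored = [(s, specificity_score(f)) for s, f in feature_table.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical, pre-scaled feature values for two of the example sentences.
feature_table = {
    "The work environment needs improvement.": {"ASD": 0.5, "CNE": 0.0, "LEN": 0.25},
    "My table is cramped and hurts my knees.": {"ASD": 0.75, "CNE": 0.0, "LEN": 0.5},
}
ranking = rank_by_specificity(feature_table)
```

With these made-up values the sentence about the table ranks first, matching the intuition that it is the more specific of the two.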

Sentence Classification…

• Sentence features
– Average semantic depth (ASD)
– Average semantic height (ASH)
– Total occurrence count (TOC)
– Count of Named Entities (CNE)
– Count of Proper Nouns (CPN)
– Sentence Length (LEN)

Sentence Classification…

• Semantic depth (SD): SD_T(w) of a word w is the distance (number of edges) from the root of ontology T to word w in T
– We use T = WordNet ISA ontology
– Greater semantic depth implies a more specific word
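As a rough illustration of the depth computation, the Python sketch below walks a tiny hand-made ISA hierarchy from a word up to the root and counts edges. The hierarchy, the PARENT table and the function name semantic_depth are assumptions made for this example; with the WordNet ISA ontology the actual depths would differ.

```python
# Toy ISA hierarchy (child -> parent); hypothetical data, not taken from WordNet.
PARENT = {
    "entity": None,
    "object": "entity",
    "food": "object",
    "fruit": "food",
    "apple": "fruit",
}

def semantic_depth(word, parent=PARENT):
    """Number of edges from the root of the hierarchy down to `word`."""
    depth = 0
    node = word
    while parent[node] is not None:
        node = parent[node]
        depth += 1
    return depth

depths = {w: semantic_depth(w) for w in ("food", "fruit", "apple")}
```

In this toy tree the more specific word sits deeper: "apple" is deeper than "fruit", which is deeper than "food".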

Sentence Classification…

• Semantic depth of a word changes with its POS tag and with its sense
– SD(bank) = 7 for the financial institution sense
– SD(bank) = 10 for the flight maneuver sense
• Solution:
– Apply word sense disambiguation (WSD) during pre-processing; or
– Take the average of the semantic depths of the word for the top k of its senses
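The second option, averaging over the top k senses, might look like the sketch below. It reuses the two depths quoted for "bank" and assumes, purely for illustration, that these are the two top-ranked senses in that order; the helper name average_over_top_senses is hypothetical.

```python
def average_over_top_senses(sense_values, k):
    """Average a per-sense value (e.g. semantic depth) over the first k senses.

    `sense_values` is assumed to be ordered from most to least frequent sense.
    """
    top = sense_values[:k]
    if not top:
        raise ValueError("word has no senses to average over")
    return sum(top) / len(top)

# Depths quoted on the slide for two senses of "bank"; their order is assumed.
bank_depths = [7, 10]
sd_bank_k2 = average_over_top_senses(bank_depths, k=2)
sd_bank_k1 = average_over_top_senses(bank_depths, k=1)
```

The same helper applies unchanged to any other per-sense quantity, since only the list of values differs.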

• Average semantic depth S.ASD for a sentence S = <w1 w2 . . . wn> containing n content-carrying words = the average of the semantic depths of the individual words
• My table hurts the knees.
– (8 + 2 + 6)/3 = 5.3
• The work environment needs improvement.
– (6 + 6 + 1 + 7)/4 = 5.0

• Semantic height (SH): SH_T(w) of a word w is the length of the longest path in T from word w to a leaf node
– We use T = WordNet hyponym ontology
– Lower semantic height implies a more specific word
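A minimal Python sketch of the height computation, using a small hand-made hyponym tree in place of WordNet. The CHILDREN table and the function name semantic_height are assumptions for illustration, and the recursion relies on the hierarchy being acyclic.

```python
# Toy hyponym hierarchy (word -> more specific words); hypothetical data.
CHILDREN = {
    "food": ["fruit", "bread"],
    "fruit": ["apple", "pear"],
    "apple": ["crab apple"],
    "bread": [],
    "pear": [],
    "crab apple": [],
}

def semantic_height(word, children=CHILDREN):
    """Length (in edges) of the longest path from `word` down to a leaf."""
    kids = children.get(word, [])
    if not kids:
        return 0
    return 1 + max(semantic_height(kid, children) for kid in kids)

heights = {w: semantic_height(w) for w in ("food", "fruit", "apple", "pear")}
```

Here the general word "food" has the largest height, while leaf words such as "pear" have height 0.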

• Average semantic height S.ASH for a sentence S = <w1 w2 . . . wn> containing n content-carrying words (non stop-words) = the average of the semantic heights of the individual words
• Semantic height of a word changes with its POS tag and with its sense
• Solution: use WSD or take the average of the semantic heights of the word for the top k of its senses

• Intuition: more specific sentences tend to include words which occur rarely in some reference corpus
– apple (2), fruit (14), food (34)
• The more rare words a sentence contains, the more specific it is likely to be.
• OC(w) = occurrence count of word w in WordNet
– If w has multiple senses, then OC(w) = average of the occurrence counts for the top k senses of w
• Total occurrence count S.TOC for a sentence S = <w1 w2 ... wn> containing n content words is the sum of the lowest m occurrence counts of the individual words, where m is a fixed value (e.g., m = 3).
• OC of a word changes with its POS tag and with its sense
• Solution: use WSD or take the average of the OC of the word for the top k of its senses
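The total occurrence count can be sketched in Python as the sum of the m smallest per-word counts. The counts used are the three quoted on the slide (apple 2, fruit 14, food 34), treated here as if they were the content words of one sentence, which is an assumption for illustration; the function name is also hypothetical.

```python
def total_occurrence_count(word_counts, m=3):
    """Sum of the m lowest occurrence counts among a sentence's content words.

    If the sentence has fewer than m content words, all counts are summed.
    """
    return sum(sorted(word_counts)[:m])

# Occurrence counts quoted on the slide: food (34), apple (2), fruit (14).
counts = [34, 2, 14]
toc_m3 = total_occurrence_count(counts, m=3)
toc_m2 = total_occurrence_count(counts, m=2)
```

Restricting the sum to the m rarest words keeps a few very frequent words from dominating the feature.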

• Named entities (NE) are commonly occurring groups of words which indicate specific semantic content
– Person name (e.g., Bill Gates)
– Organization name (e.g., Microsoft Inc.)
– Location (e.g., New York)
– Date, time, amount, email addresses etc.
• Since each NE refers to a particular object, an NE is a good indicator that the sentence contains specific information.
• Another feature S.CNE for a sentence S is the count of NEs occurring in S

• Proper Nouns (PN) are commonly occurring groups of words which indicate specific semantic content
– Abbreviations (IBM or kg), domain terms (oxidoreductases), words like (Apple iPhone), numbers etc.
• Since each PN may refer to a particular object, a PN is a good indicator that the sentence contains specific information.
• Another feature S.CPN for a sentence S is the count of PNs occurring in S

• Sentence length, denoted S.LEN, is a weak indicator of its specificity in the sense that more specific sentences tend to be somewhat longer than more general sentences.
• Length refers to the number of content-carrying words (not stopwords) in the sentence, including numbers, proper nouns, adjectives and adverbs
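A minimal sketch of the length feature in Python: tokenize, drop stopwords, count what remains (numbers included). The tiny stopword list and the simple regex tokenizer are assumptions for this example; a real system would use a fuller stopword list and a proper tokenizer.

```python
import re

# Small illustrative stopword list; hypothetical, not the one used in the talk.
STOPWORDS = {"a", "an", "the", "is", "are", "and", "or", "of", "to", "in",
             "my", "within", "should", "be", "very"}

def sentence_length(sentence, stopwords=STOPWORDS):
    """Count content-carrying tokens (letters or digits) that are not stopwords."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence.lower())
    return sum(1 for t in tokens if t not in stopwords)

len_specific = sentence_length("Travel vouchers should be cleared within 2 working days.")
len_general = sentence_length("Accounts department is very inefficient.")
```

On this pair of example sentences the more specific one is also the longer one, in line with the weak tendency described above.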

• Features have contradictory polarity.
– We want higher values to indicate more specificity.
– Not true for features ASH and TOC
– Lower values indicate higher specificity for these
• Scales of values for various features are not the same, because of which some features may unduly influence the overall combined score.
– E.g., ASD is usually 10, whereas TOC is a larger integer.
• Uniform scaling: map x ∈ [a, b] to y ∈ [c, d]
• Scaling + reversal of polarity
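One common way to realize uniform scaling with optional polarity reversal is a linear map, sketched below in Python. The slides do not give the exact formula, so this linear form, the function name scale and the default target range [0, 1] are assumptions.

```python
def scale(x, a, b, c=0.0, d=1.0, reverse=False):
    """Linearly map x from [a, b] to [c, d].

    With reverse=True the polarity is flipped, so x = a maps to d and x = b maps to c.
    """
    if b == a:
        raise ValueError("source range must have non-zero width")
    t = (x - a) / (b - a)      # position of x within [a, b], from 0 to 1
    if reverse:
        t = 1.0 - t            # low raw values now give high scaled values
    return c + t * (d - c)

mid_asd = scale(5, 0, 10)                        # higher stays higher
low_toc = scale(0, 0, 10, reverse=True)          # lowest raw value gets the top score
high_toc = scale(10, 0, 10, reverse=True)        # highest raw value gets the bottom score
```

Features such as ASH and TOC, where lower raw values mean higher specificity, would be passed through with reverse=True so that all scaled features point the same way.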


Sentences 6, 7, 9 as top 3 in terms of specificity score


Some specific sentences identified by our algorithm from 110,000 responses in an employee satisfaction survey


Some specific sentences identified by our algorithm from 220 sentences from 32 reviews of a hiking backpack product by Kelty.


Conclusions

Domain-driven IR = IR + text-mining of retrieved documents

Enterprise document repositories offer good scope for Domain-driven IR to deliver solutions and insights relevant for real-life business problems and decisions