H Berkers & V.Kobayashi: Small talk ORM paper 29 9-2014
jobknowledge.eu
facebook.com/jobknowledge
@Jobknowledge
Small talk
Text mining in organizational research: a review and a case study
Vladimer Kobayashi, Hannah Berkers, Stefan Mol, Gabór Kismihók & Deanne den Hartog
Overview
The case study: Extracting job information from vacancies
• The problem: Modernizing job analysis
• The data: 500,000 online vacancies
• The use of a framework: knowledge from the job analysis field
• The techniques: feature extraction
• The results: Successful automatic categorization of job information
The review: text mining techniques and tasks in organizational research
• The task: Invitation for a special issue on big data in ORM
• The paper: Our structure so far
• The question: Feedback
We are in the process of writing the paper, so any feedback, suggestions,
etc. to help us to further develop the paper are very welcome!
The case study: Extracting job information from vacancies
The problem: Modernizing job analysis
Jobs are changing, but job analysis is lagging behind
• Seen as a tedious and expensive but necessary task
• Not up to speed with the changes in work
• Accuracy of job analysis using job incumbents as a source is questioned
• Not taking advantage of the ‘big data’ opportunities
The data: 500,000 English online vacancies
• An often overlooked rich source of job information
• Could facilitate scaling up the amount of data used in job analysis
The use of a framework: knowledge from the job analysis field
• Skills can be extracted from job advertisements (Sodhi & Son, 2009; Smith & Ali, 2014)
• Studies conducted in the field of Information Technologies with a focus on the use of technologies
• Need for a more deductive approach (George, Haas, & Pentland, 2014)
• We go beyond this research by using knowledge from the job analysis field
• We categorize job information based on the basic distinction between job attributes and job activities (Sackett & Laczo, 2003)
• First step toward the extraction of finer grained job information
• Categorization into job attributes and job activities
• Manual labelling of 300 random vacancies (3,921 labelled sentences)
• Labels based on definitions of the finer grained job features (each either an attribute or an activity), such as knowledge, abilities, tasks, responsibilities, etc.
The techniques: Feature extraction
Pipeline: Job vacancies → Text preprocessing → Preprocessed vacancies → Text encoding → Feature matrix (axes: sentences and features)
Text preprocessing
• Sentence and word tokenization
• Lower case transformation
• Stopword removal, e.g. the, and, etc.
• Extra whitespace removal
• Lemmatization
Text encoding
• Linguistic preprocessing, e.g. part-of-speech (POS) tagging
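The preprocessing steps on this slide could be sketched as follows. This is a minimal illustration using only the Python standard library, not the pipeline used in the case study: the stopword list, the `LEMMAS` lookup table (a stand-in for a real lemmatizer), and the example vacancy text are all assumptions made for the example.

```python
import re

# Tiny illustrative stopword list (an assumption; the slides do not give the actual list).
STOPWORDS = {"the", "and", "a", "an", "of", "to", "in", "for", "with", "is", "are"}

# Hypothetical lookup table standing in for a real lemmatizer.
LEMMAS = {"managing": "manage", "teams": "team", "skills": "skill", "required": "require"}

def split_sentences(text):
    """Collapse extra whitespace, then split after '.', '!' or '?'."""
    text = re.sub(r"\s+", " ", text).strip()
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]

def preprocess_sentence(sentence):
    """Lower-case, tokenize into words, drop stopwords, and map tokens to lemmas."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    tokens = [t for t in tokens if t not in STOPWORDS]
    return [LEMMAS.get(t, t) for t in tokens]

# Made-up vacancy text with irregular spacing.
vacancy = "Managing   teams of analysts.  Excellent communication skills are required."
preprocessed = [preprocess_sentence(s) for s in split_sentences(vacancy)]
print(preprocessed)
```

In practice the regular expressions and the lookup table would likely be replaced by a dedicated NLP library's tokenizer and lemmatizer; the order of the steps is the point here.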
Feature list
• Sentence length (after removing certain words)
• POS of first word (job activity sentences usually start with a verb)
• First word (both kinds of sentences often start with certain words)
• Last word (job attribute sentences commonly end with certain words)
• Proportion of nouns and adjectives
• Proportion of verbs and TO
• Proportion of verbs followed by a noun, verb, adjective, or adverb
• Frequent words
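As a rough sketch of how most of the listed features might be computed for one sentence, the function below takes a sentence that has already been POS-tagged. Several things here are assumptions rather than details from the slides: the Penn Treebank tag set, the function name `sentence_features`, the two example sentences, and the reading of "verbs followed by ..." as the share of tokens that are verbs immediately followed by a noun, verb, adjective, or adverb. The "frequent words" feature is left out, and sentences are assumed to be non-empty.

```python
def sentence_features(tagged):
    """Build a feature dict from one sentence given as (word, Penn Treebank tag) pairs."""
    words = [w.lower() for w, _ in tagged]
    tags = [t for _, t in tagged]
    n = len(tagged)
    is_verb = [t.startswith("VB") for t in tags]
    # Count nouns (NN*) and adjectives (JJ*).
    noun_adj = sum(t.startswith(("NN", "JJ")) for t in tags)
    # Count verbs (VB*) and the infinitival marker TO.
    verb_to = sum(v or t == "TO" for v, t in zip(is_verb, tags))
    # Count verbs directly followed by a noun, verb, adjective, or adverb.
    verb_then_content = sum(
        is_verb[i] and tags[i + 1].startswith(("NN", "VB", "JJ", "RB"))
        for i in range(n - 1)
    )
    return {
        "length": n,
        "first_pos": tags[0],
        "first_word": words[0],
        "last_word": words[-1],
        "prop_noun_adj": noun_adj / n,
        "prop_verb_to": verb_to / n,
        "prop_verb_then_content": verb_then_content / n,
    }

# Two hand-tagged example sentences (invented for illustration).
activity = [("Manage", "VB"), ("project", "NN"), ("budgets", "NNS"), ("daily", "RB")]
attribute = [("Strong", "JJ"), ("analytical", "JJ"), ("skills", "NNS"), ("required", "VBN")]
features = [sentence_features(s) for s in (activity, attribute)]
print(features)
```

Each dict corresponds to one row of the feature matrix; categorical entries such as `first_pos` would still need to be encoded before most classifiers can use them.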
Application of Data Mining Techniques to the Feature Matrix
• Naïve Bayes
• Support Vector Machines
• Random Forest
The results: Successful automatic categorization of job information
At least 95% mean accuracy under 10-fold cross-validation, compared with a base classifier accuracy of 55%
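The evaluation scheme (mean accuracy over 10 folds, compared against a base classifier) might look like the sketch below. It is a standard-library illustration only: the Naïve Bayes here is a small categorical variant written for this example rather than the authors' implementation, the base classifier is assumed to be a majority-class predictor, and the toy data are synthetic, so the printed accuracies do not reproduce the figures on the slide.

```python
import math
import random
from collections import Counter, defaultdict

def majority_baseline(train):
    """Return a classifier that always predicts the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def naive_bayes(train):
    """Categorical Naive Bayes with add-one smoothing over dict-valued features."""
    class_counts = Counter(y for _, y in train)
    value_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    vocab = defaultdict(set)             # feature name -> values seen in training
    for x, y in train:
        for name, value in x.items():
            value_counts[(y, name)][value] += 1
            vocab[name].add(value)

    def predict(x):
        def log_score(y):
            score = math.log(class_counts[y] / len(train))
            for name, value in x.items():
                count = value_counts[(y, name)][value] + 1
                # Extra +1 reserves probability mass for values unseen in training.
                total = class_counts[y] + len(vocab[name]) + 1
                score += math.log(count / total)
            return score
        return max(class_counts, key=log_score)
    return predict

def cross_val_accuracy(data, fit, k=10, seed=0):
    """Mean accuracy over k folds; each fold is held out once for testing."""
    data = data[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        predict = fit(train)
        scores.append(sum(predict(x) == y for x, y in test) / len(test))
    return sum(scores) / k

# Synthetic toy data: one feature (POS of the first word) and a label per sentence.
toy = (
    [({"first_pos": "VB"}, "activity")] * 50
    + [({"first_pos": "NN"}, "activity")] * 10
    + [({"first_pos": "JJ"}, "attribute")] * 28
    + [({"first_pos": "NN"}, "attribute")] * 12
)
base_acc = cross_val_accuracy(toy, majority_baseline)
nb_acc = cross_val_accuracy(toy, naive_bayes)
print(f"baseline: {base_acc:.2f}, naive Bayes: {nb_acc:.2f}")
```

Reporting the baseline next to the model matters because accuracy alone says little when one class dominates; the gap between the two numbers is what indicates that the features carry signal.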
Future work
• Semi-supervised labelling
• Finer classification
• Consideration of more features
The review: Text mining techniques in organizational research
The task: Invitation for a special issue on big data in ORM
• Introduce the methods of text analysis to organizational scientists
• Review of various techniques for mining textual data
• The pros and cons of different approaches (best practices)
• Illustrations from the current project on job analysis showing how these procedures can be applied to a substantive area
The paper: Our structure so far
1. Introduction
• Text data in organizational research and issues that could be solved with text mining
• Introduce the case study on text mining in job analysis
2. Review of text mining techniques
• Definitions and terminology
• Text preprocessing
• Three tasks done in text mining: classification, feature construction, and feature selection
• Evaluating text mining results
For each task:
a) Text mining techniques applied to perform the task
b) Possibilities for applying organizational frameworks
c) Advantages and disadvantages of these techniques, illustrated with examples from Organizational Research and other fields
d) Illustration from our case study
3. Discussion of opportunities and challenges of text mining in Organizational Research
Opportunities such as extending the application of text mining to other problems in Organizational Research (input?)
Challenges such as dealing with data size, access and protection of data, language issues etc.
4. Conclusion
The question: Feedback
• What problems are you dealing with right now (or have you dealt with in the past) that make use of text data?
• What opportunities do you see for text mining?
• Which part of text mining would you like to learn more about?
• Do you have experience in submitting a manuscript to ORM?
References
George, G., Haas, M.R., & Pentland, A. (2014). From the editors: Big Data and Management. Academy of Management Journal, 57 (2), 321-326.
Sackett, P.R., & Laczo, R.M. (2003). Job and Work Analysis. In Comprehensive Handbook of Psychology: Industrial and Organizational Psychology, vol. 12, ed. W.C. Borman, D.R. Ilgen, & R.J. Klimoski, pp. 21-37. New York: Wiley.
Smith, D., & Ali, A. (2014). Analysing Computer Programming Job Trend Using Web Data Mining. Issues in Informing Science and Information Technology, 11, 203-214.
Sodhi, M.S., & Son, B-G. (2009). Content Analysis of O.R. Job Advertisements to Infer Required Skills. Journal of the Operational Research Society, 61, 1315-1327.