Transcript of Document

Page 1: Document

A Graph-Based Approach to Skill Extraction from Text

Higher School of Economics, School of Applied Mathematics and Information Science, Nizhny Novgorod, Russia

Ilkka Kivimäki 1, Alexander Panchenko 4,2, Adrien Dessy 1,2, Dries Verdegem 3, Pascal Francq 1, Cédrick Fairon 2,

Hugues Bersini 3 and Marco Saerens 1

[email protected]

1 ICTEAM, 2 CENTAL, Université catholique de Louvain, Belgium; 3 IRIDIA, Université libre de Bruxelles, Belgium;

4 Digital Society Laboratory LLC, Russia

December 18, 2013

Page 2: Document

Table of Contents

1 Expertise retrieval and skill extraction

2 The Elisit system for skill extraction
      Overview of the system
      Sample Queries
      Association with Wikipedia
      Spreading activation in Wikipedia

3 Evaluation of system

4 Conclusion and future work

Page 3: Document

Reference paper:

Kivimäki I., Panchenko A., Dessy A., Verdegem D., Francq P., Bersini H. and Saerens M. "A Graph-Based Approach to Skill Extraction from Text". In Proceedings of the 8th Workshop TextGraphs-8: Graph-based Methods for Natural Language Processing. EMNLP 2013: Conference on Empirical Methods in Natural Language Processing. Seattle, USA, October 18-21, 2013.

http://aclweb.org/anthology/W/W13/W13-5011.pdf

Page 4: Document

Expertise retrieval [Balog et al., 2012]

Expertise Retrieval vs. Expertise Seeking

Expertise retrieval: linking humans to expertise areas, and vice versa, from a system-centered perspective. Expertise retrieval has primarily focused on identifying good topical matches between a need for expertise on the one hand and the content of documents associated with candidate experts on the other hand.

Expertise seeking: linking humans to expertise areas from a human-centered perspective. Expertise seeking has been mainly investigated in the field of knowledge management, where the goal is to utilize human knowledge within an organization as well as possible.

Page 5: Document

Expertise retrieval [Balog et al., 2012]

Expertise retrieval: Expert Profiling vs. Expert Retrieval

Person: a set of (text) documents generated by an individual.

Expertise: a keyword or a keyphrase specifying a field of knowledge, e.g. "Machine Learning", "Hadoop", "NLP", etc.

Expert profiling: given a person, retrieve (profile) their expertise. Person → Expertise

Expert retrieval: given an expertise, retrieve persons with that expertise. Expertise → Person

Page 6: Document

Expertise Retrieval: Earlier Work

TREC Enterprise Track [Balog et al., 2008]
State-of-the-art overview [Balog et al., 2012]
A skill extraction system [Crow and DeSanto, 2004]
Skill extraction system [Skomoroch et al., 2012]
Expertise retrieval in universities [Balog et al., 2007]
Expert finding on DBLP data [Deng et al., 2008]
e-Human Resource Management system [Biesalski, 2003]

Page 7: Document

Expertise Retrieval: Earlier Work

Skill extraction System [Skomoroch et al., 2012]

http://www.freepatentsonline.com/20120197863.pdf

Page 8: Document

Expertise Retrieval: Applications

Expertise management systems

Knowledge management in enterprises
Employee profiling

Reviewer selection for articles

Recommendation systems for:

jobs
job applicants
websites, blog texts, articles

Page 9: Document

Expertise retrieval

Page 10: Document

Expertise retrieval

Page 11: Document

Skill extraction

We focus on skill extraction from texts, i.e. associating skills with text documents.

Page 12: Document

Table of Contents

1 Expertise retrieval and skill extraction

2 The Elisit system for skill extraction
      Overview of the system
      Sample Queries
      Association with Wikipedia
      Spreading activation in Wikipedia

3 Evaluation of system

4 Conclusion and future work

Page 13: Document

Overview of the system

Table of Contents

1 Expertise retrieval and skill extraction

2 The Elisit system for skill extraction
      Overview of the system
      Sample Queries
      Association with Wikipedia
      Spreading activation in Wikipedia

3 Evaluation of system

4 Conclusion and future work

Page 14: Document

Overview of the system

The Elisit system for skill extraction

Original goal of the system:

Associate professional skills with people, based on the texts that they produce (emails, blogs, forums, articles, etc.).

Tools:

List of skills extracted from LinkedIn.

The skills are linked to corresponding Wikipedia pages.

Method:

1 Find Wikipedia pages relevant to a target document.

2 Use spreading activation on Wikipedia's hyperlink network to find skills that are "close" or "central" to these relevant pages.

Page 15: Document

Overview of the system

Skill extraction using Wikipedia

Page 16: Document

Overview of the system

Example

Page 17: Document

Overview of the system

Example

Page 18: Document

Overview of the system

Example

Page 19: Document

Overview of the system

Size of the problem

Our current version of English Wikipedia consists of

n = 3 983 338 encyclopedia entries
m = 247 560 469 links

27 513 of the encyclopedia entries correspond to LinkedIn skills.

Page 20: Document

Overview of the system

Implementation

For computing the similarities between the target document and all Wikipedia pages, we use the Gensim library [Rehurek and Sojka, 2010].

This part of the Elisit system is called the text2wiki module. It is currently the bottleneck of the computation.

For performing spreading activation, we use the sparse matrix library of SciPy.

This part is called the wiki2skill module.

Page 21: Document

Overview of the system

The Elisit system

At the moment not fully functional...

Page 22: Document

Sample Queries

Table of Contents

1 Expertise retrieval and skill extraction

2 The Elisit system for skill extraction
      Overview of the system
      Sample Queries
      Association with Wikipedia
      Spreading activation in Wikipedia

3 Evaluation of system

4 Conclusion and future work

Page 23: Document

Sample Queries

Popular Article about Natural Language Understanding

Page 24: Document

Sample Queries

Popular Article about Natural Language Understanding

Page 25: Document

Sample Queries

Blog Article about SEO Marketing

Page 26: Document

Sample Queries

Blog Article about SEO Marketing

Page 27: Document

Sample Queries

Wikipedia Article about Geo Information Systems

Page 28: Document

Sample Queries

Wikipedia Article about Geo Information Systems

Page 29: Document

Sample Queries

Scientific Article about Graph Mining

Page 30: Document

Sample Queries

Scientific Article about Graph Mining

Page 31: Document

Sample Queries

Try it...

Elisit Web Interface: http://elisit.cental.be/

Elisit Web Service: http://elisit.cental.be:8080/

This is only a demo: not optimized for multiple-user queries, high load, fast response, etc.

Page 32: Document

Association with Wikipedia

Table of Contents

1 Expertise retrieval and skill extraction

2 The Elisit system for skill extraction
      Overview of the system
      Sample Queries
      Association with Wikipedia
      Spreading activation in Wikipedia

3 Evaluation of system

4 Conclusion and future work

Page 33: Document

Association with Wikipedia

Association with Wikipedia

1. Find Wikipedia pages relevant to a target document.

We compute the similarity between the input document and all Wikipedia pages.

We tried four different models:

1 TF-IDF (300,000 dimensions)
2 LogEntropy (300,000 dimensions)
3 LogEntropy + LSA (200 dimensions)
4 LogEntropy + LDA (200 topics)

⇒ the target document is represented as a semantic vector of size n, the number of Wikipedia pages (inspired by ESA [Gabrilovich and Markovitch, 2007]).
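To make the idea concrete, here is a minimal sketch of this step with Gensim, using TF-IDF on an invented three-page stand-in for the Wikipedia collection (illustration only, not the actual text2wiki code; the page texts and target document are toy data, and LogEntropyModel, LsiModel or LdaModel could be plugged in the same way):

    from gensim import corpora, models, similarities

    # Toy stand-in for the Wikipedia page collection (invented texts).
    wiki_pages = {
        "Machine learning":            "algorithms that learn statistical models from data",
        "Natural language processing": "computational processing of human language text",
        "Graph theory":                "the study of graphs their nodes edges and paths",
    }
    tokenized = [text.lower().split() for text in wiki_pages.values()]

    dictionary = corpora.Dictionary(tokenized)
    bow_corpus = [dictionary.doc2bow(tokens) for tokens in tokenized]

    tfidf = models.TfidfModel(bow_corpus)                      # term weighting
    index = similarities.SparseMatrixSimilarity(
        tfidf[bow_corpus], num_features=len(dictionary))       # one row per page

    # The target document becomes a semantic vector (one cosine similarity
    # per Wikipedia page); its largest entries are the relevant pages.
    target = "we trained a statistical model to process text documents"
    target_bow = dictionary.doc2bow(target.lower().split())
    semantic_vector = index[tfidf[target_bow]]

    for title, score in zip(wiki_pages, semantic_vector):
        print(title, round(float(score), 3))

In the real system the index holds one row per Wikipedia page, so the query returns a vector of length n as described above.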

Page 34: Document

Spreading activation in Wikipedia

Table of Contents

1 Expertise retrieval and skill extraction

2 The Elisit system for skill extraction
      Overview of the system
      Sample Queries
      Association with Wikipedia
      Spreading activation in Wikipedia

3 Evaluation of system

4 Conclusion and future work

Page 35: Document

Spreading activation in Wikipedia

Spreading activation in Wikipedia

2. Use Wikipedia's hyperlink network to find skills that are "close" or "central" to these relevant pages.

[Figure: activation spreading from the INITIAL PAGES towards SKILLS in the hyperlink network]

Page 41: Document

Spreading activation in Wikipedia

Spreading activation in Wikipedia

Formalization of spreading activation by Shrager et al. [1987]: if a(0) is a vector of initial activations, then after each time step t, the vector of activations is

a(t) = γ a(t − 1) + λ W^T a(t − 1) + c(t)

Parameters

T, the number of time steps
γ ∈ [0, 1] is a decay factor
λ ∈ [0, 1] is a friction factor
c(t) is an activation source vector
The link weight, element w_ij of W, determines the amount of activation that is spread from i to j.

Page 42: Document

Spreading activation in Wikipedia

Spreading activation in Wikipedia

a(t) = γ a(t − 1) + λ W^T a(t − 1) + c(t)

Thorough model selection is difficult because of the size of the problem.

We experimented with three versions of the model (a code sketch follows below):

model 1: a(t) = W^T a(t − 1)
model 2: a(t) = W^T a(t − 1) + a(t − 1)
model 3: a(t) = W^T a(t − 1) + a(0)

In addition, W is constrained to be row-stochastic.

More focus on the selection of the link weights than on the other parameters.
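A minimal sketch of these update rules with SciPy's sparse matrices, on an invented four-page toy graph (illustration only, not the actual wiki2skill code; the function name spread_activation is ours):

    import numpy as np
    import scipy.sparse as sp

    def spread_activation(W, a0, steps=3, variant=1):
        """W: row-stochastic sparse link matrix; a0: initial activation vector."""
        a = a0.copy()
        for _ in range(steps):
            spread = W.T @ a                  # activation flowing along the links
            if variant == 1:                  # model 1: pure spreading
                a = spread
            elif variant == 2:                # model 2: keep the current activation
                a = spread + a
            else:                             # model 3: re-inject the initial activation
                a = spread + a0
        return a

    # Invented toy graph with 4 pages and links 0->1, 0->2, 1->2, 2->3, 3->0.
    rows, cols = [0, 0, 1, 2, 3], [1, 2, 2, 3, 0]
    A = sp.csr_matrix((np.ones(5), (rows, cols)), shape=(4, 4))
    W = sp.diags(1.0 / np.asarray(A.sum(axis=1)).ravel()) @ A   # row-stochastic

    a0 = np.array([1.0, 0.0, 0.0, 0.0])                         # activate page 0
    print(spread_activation(W, a0, steps=3, variant=3))

In terms of the general formula above, model 1 corresponds to γ = 0, λ = 1, c = 0; model 2 to γ = 1, λ = 1, c = 0; and model 3 to γ = 0, λ = 1, c(t) = a(0).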

Page 43: Document

Spreading activation in Wikipedia

Spreading activation in Wikipedia

Observation from initial results:

Hubs get easily activated even if they are not relevant.
Common phenomenon with large graphs [Brand, 2005; von Luxburg et al., 2010].

Solution:

we bias the spreading to avoid hubs by setting (see the code sketch below)

w_ij = π_j^α / Σ_(i,k)∈E π_k^α

πj is a popularity index of j (degree / PageRank / HITS).

If α = 0, no biasing; if α < 0 popular nodes are avoided.

Biased random walks have, e.g., shorter return times than unbiased random walks [Fronczak and Fronczak, 2009].
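A small sketch of this weighting, assuming a binary adjacency matrix and in-degree as the popularity index π (illustration only; PageRank or HITS scores could be passed in instead, and the function name is ours):

    import numpy as np
    import scipy.sparse as sp

    def biased_transition_matrix(A, alpha=-0.5):
        """A: binary sparse adjacency matrix of the hyperlink graph (i -> j)."""
        pi = np.asarray(A.sum(axis=0)).ravel()       # in-degree as popularity index
        pi = np.maximum(pi, 1.0)                     # guard against 0**negative
        B = sp.csr_matrix(A.multiply(pi ** alpha))   # scale column j by pi_j**alpha
        row_sums = np.asarray(B.sum(axis=1)).ravel()
        row_sums[row_sums == 0.0] = 1.0              # leave dangling rows as zeros
        return sp.diags(1.0 / row_sums) @ B          # row-normalise: w_ij as above

    # Invented toy graph in which node 2 is a hub (three incoming links).
    rows, cols = [0, 0, 1, 1, 3, 3], [1, 2, 2, 3, 2, 0]
    A = sp.csr_matrix((np.ones(6), (rows, cols)), shape=(4, 4))

    print(biased_transition_matrix(A, alpha=0.0).toarray().round(3))   # unbiased
    print(biased_transition_matrix(A, alpha=-1.0).toarray().round(3))  # hub-avoiding

With α = 0 this reduces to the usual row-normalised transition matrix; with α < 0 the weight of links pointing to the hub (node 2 in the toy graph) shrinks.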

Page 44: Document

Table of Contents

1 Expertise retrieval and skill extraction

2 The Elisit system for skill extraction
      Overview of the system
      Sample Queries
      Association with Wikipedia
      Spreading activation in Wikipedia

3 Evaluation of system

4 Conclusion and future work

Page 45: Document

Evaluation of system

We evaluated the biasing strategy by seeing how well the system activates related skills, as defined by LinkedIn (each skill has ≤ 20 related skills).

Page 46: Document

Evaluation of system

We tested the biasing strategy by seeing how well the system activates related skills, as defined by LinkedIn.

           Pre@5               Pre@10              R-Pre               Rec@100
  α    din    PR   HITS    din    PR   HITS    din    PR   HITS    din    PR   HITS
  0   0.119 0.119 0.119   0.156 0.156 0.156   0.154 0.154 0.154   0.439 0.439 0.439
 -0.2 0.206 0.238 0.206   0.222 0.216 0.213   0.172 0.193 0.185   0.469 0.469 0.494
 -0.4 0.225 0.263 0.169   0.203 0.200 0.150   0.185 0.204 0.148   0.503 0.498 0.476
 -0.6 0.238 0.225 0.119   0.200 0.197 0.141   0.186 0.193 0.119   0.511 0.517 0.418
 -0.8 0.213 0.181 0.075   0.191 0.197 0.113   0.171 0.185 0.109   0.515 0.524 0.384
 -1   0.169 0.156 0.063   0.178 0.197 0.091   0.154 0.172 0.097   0.493 0.518 0.336

Table: The effect of the biasing parameter α and the choice of popularity index on the results in the evaluation of the module.

E.g., the top 5 most activated skills of all the ≈ 27 000 skills contain 1-2 of the ≤ 20 related skills, on average.

Also, biasing definitely improves retrieval results.
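For reference, a small self-contained sketch of the metrics reported in the table (Pre@k, R-Pre, Rec@100), computed for a single query; the ranked skills and the gold set of related skills below are invented:

    def precision_at_k(ranked, relevant, k):
        return sum(1 for s in ranked[:k] if s in relevant) / k

    def r_precision(ranked, relevant):
        r = len(relevant)
        return precision_at_k(ranked, relevant, r) if r else 0.0

    def recall_at_k(ranked, relevant, k):
        if not relevant:
            return 0.0
        return sum(1 for s in ranked[:k] if s in relevant) / len(relevant)

    # Invented example: skills ranked by activation vs. a gold set of related skills.
    ranked = ["Machine Learning", "Hadoop", "NLP", "SEO", "Data Mining"]
    relevant = {"Machine Learning", "NLP", "Statistics"}

    print(precision_at_k(ranked, relevant, 5))   # Pre@5   -> 0.4
    print(r_precision(ranked, relevant))         # R-Pre   -> 0.666...
    print(recall_at_k(ranked, relevant, 100))    # Rec@100 -> 0.666...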

Page 47: Document

Evaluation of system

We also ran a test comparing the different language models.

VSM            Pre@5   Pre@10   R-Pre   Rec@100
TF-IDF         0.231   0.214    0.190   0.516
LogEntropy     0.216   0.212    0.193   0.525
LogEnt + LSA   0.180   0.181    0.163   0.491
LogEnt + LDA   0.193   0.174    0.159   0.470

Table: Comparison of the different vector space models in terms of the performance of the whole system.

Page 48: Document

Table of Contents

1 Expertise retrieval and skill extraction

2 The Elisit system for skill extraction
      Overview of the system
      Sample Queries
      Association with Wikipedia
      Spreading activation in Wikipedia

3 Evaluation of system

4 Conclusion and future work

Page 49: Document

Conclusion

The Elisit system extracts explicit skills that are related to an arbitrary text input.

Combination of ESA-style conceptual mapping and spreading activation on the Wikipedia network.

Evaluation experiments suggest that using popularity-biased spreading activation improves retrieval results.

Page 50: Document

Future work

Improvement of link weights, e.g. by

computing content similarity of the Wikipedia pages
trying other structural similarity measures
using the category memberships of pages

Comparison with other strategies

More sophisticated (e.g. hierarchical) representation of results.

Also, the methodology could be applied to other purposes, e.g. a general topic model, by replacing skills with topics.

Page 51: Document

References

Krisztian Balog, Toine Bogers, Leif Azzopardi, Maarten de Rijke, and Antal van den Bosch. Broad expertise retrieval in sparse data environments. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 551–558. ACM, 2007.

Krisztian Balog, Paul Thomas, Nick Craswell, Ian Soboroff, Peter Bailey, and Arjen P. de Vries. Overview of the TREC 2008 Enterprise Track. Technical report, DTIC Document, 2008.

Krisztian Balog, Yi Fang, Maarten de Rijke, Pavel Serdyukov, and Luo Si. Expertise retrieval. Foundations and Trends in Information Retrieval, 6(2-3):127–256, 2012.

Ernst Biesalski. Knowledge management and e-human resource management. FGWM 2003, 2003.

M. Brand. A random walks perspective on maximizing satisfaction and profit. Proceedings of the 2005 SIAM International Conference on Data Mining, 2005.

Dan Crow and John DeSanto. A hybrid approach to concept extraction and recognition-based matching in the domain of human resources. In Tools with Artificial Intelligence, 2004. ICTAI 2004. 16th IEEE International Conference on, pages 535–541. IEEE, 2004.

Hongbo Deng, Irwin King, and Michael R. Lyu. Formal models for expert finding on DBLP bibliography data. In Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on, pages 163–172. IEEE, 2008.

Agata Fronczak and Piotr Fronczak. Biased random walks in complex networks: The role of local navigation rules. Physical Review E, 80(1):016107, 2009.

Evgeniy Gabrilovich and Shaul Markovitch. Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In IJCAI'07: Proceedings of the 20th International Joint Conference on Artificial Intelligence, pages 1606–1611, San Francisco, CA, USA, 2007. Morgan Kaufmann Publishers Inc.

Radim Rehurek and Petr Sojka. Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 45–50, Valletta, Malta, May 2010. ELRA.

Jeff Shrager, Tad Hogg, and Bernardo A. Huberman. Observation of phase transitions in spreading activation networks. Science, 236(4805):1092–1094, 1987.

U. von Luxburg, A. Radl, and M. Hein. Getting lost in space: large sample analysis of the commute distance. Proceedings of the 23rd Neural Information Processing Systems Conference (NIPS 2010), pages 2622–2630, 2010.
