Extracting Information for Context-aware Meeting Preparation

Simon Scerri, Behrang Q. Zadeh, Maciej Dabrowski, Ismael Rivera. 26.05.2014. LREC 2014, Reykjavik, Iceland.

description

People working in an office environment suffer from large volumes of information that they need to manage and access. Frequently, the problem is due to machines not being able to recognise the many implicit relationships between office artefacts, and also due to them not being aware of the context surrounding them. In order to expose these relationships and enrich artefact context, text analytics can be employed over semi-structured and unstructured content, including free text. In this paper, we explain how this strategy is applied for a specific use case: supporting the attendees of a calendar event in preparing for the meeting.

Transcript of Extracting Information for Context-aware Meeting Preparation

Page 1: Extracting Information for Context-aware Meeting Preparation

Extracting Information for Context-aware Meeting Preparation

Simon Scerri, Behrang Q. Zadeh, Maciej Dabrowski, Ismael Rivera

26.05.2014 LREC 2014. Reykjavik, Iceland.

Page 2: Extracting Information for Context-aware Meeting Preparation

General Objectives

LREC 2014, Wednesday, 28 May 2014

Page 3: Extracting Information for Context-aware Meeting Preparation

Information Extraction Targets

Information Items & their attributes: (Semi)Structured
• Email Messages
• Instant Messages
• Documents
• Calendar Events
• Folders

Item Titles, Descriptions & Content: Complex/Unstructured
• Keywords
• Action Items: Information Request & Task Request

Page 4: Extracting Information for Context-aware Meeting Preparation

Architecture

Page 5: Extracting Information for Context-aware Meeting Preparation

Keyword Extraction - Method

Keyword Extraction

General Text Processing Indexing and Storage

Page 6: Extracting Information for Context-aware Meeting Preparation

Keyword Extraction - Method

• Generic term extraction architecture
• Based on the assumption that similar terms appear in similar contexts
• Use the context of previously known terms to identify new terms
• Random Indexing for the construction of a vector space model (VSM) at reduced dimension
• Create a training set using the previously known terms
• Use a linear least-squares support vector machine (SVM)
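The steps listed on this slide can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the function names (`index_vectors`, `context_vectors`, `fit_linear_least_squares`) and all parameter values are assumptions made for illustration, and a ridge-regularised least-squares fit stands in for the linear least-squares SVM named on the slide.

```python
import numpy as np

def index_vectors(vocab, dim=64, nonzeros=4, seed=0):
    """Give every word a sparse ternary random index vector (+1/-1 in a few slots)."""
    rng = np.random.default_rng(seed)
    vectors = {}
    for word in sorted(vocab):
        vec = np.zeros(dim)
        positions = rng.choice(dim, size=nonzeros, replace=False)
        vec[positions] = rng.choice([-1.0, 1.0], size=nonzeros)
        vectors[word] = vec
    return vectors

def context_vectors(sentences, candidates, index, window=2):
    """Sum the index vectors of the words within `window` tokens of each candidate."""
    dim = len(next(iter(index.values())))
    ctx = {c: np.zeros(dim) for c in candidates}
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok not in ctx:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[tok] += index[tokens[j]]
    return ctx

def fit_linear_least_squares(X, y, reg=1.0):
    """Regularised least-squares linear scorer: w = (X^T X + reg * I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ y)

# Invented toy corpus: "tagger" is a previously known term, "apple" a known non-term.
sentences = [
    "we train the tagger on parsed text".split(),
    "we train the parser on parsed text".split(),
    "we eat the apple at lunch".split(),
    "we eat the banana at lunch".split(),
]
candidates = ["tagger", "parser", "apple", "banana"]
vocab = {tok for sent in sentences for tok in sent}

index = index_vectors(vocab)
ctx = context_vectors(sentences, candidates, index)

# Training set built from the previously known examples only.
X = np.stack([ctx["tagger"], ctx["apple"]])
y = np.array([1.0, -1.0])
w = fit_linear_least_squares(X, y)

# Score every candidate, including the unseen ones.
scores = {c: float(ctx[c] @ w) for c in candidates}
```

Because "parser" occurs in the same context as the known term "tagger", it receives the same score as "tagger" and ranks above "banana", which shares its context with the known non-term. The dimension of the space is fixed by `dim` rather than by the vocabulary size, which is the reduced-dimension property the slide refers to.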

Page 7: Extracting Information for Context-aware Meeting Preparation

MLTagger
• Mining technical terms (Expert Vocabulary) in a semi-supervised manner (minimum user intervention)
• Train or Use Pre-Trained Models
• Input: a Sentence
• Tagger based on Liblinear SVMs
• Includes POS tags, Dependency Structures
• Includes user feedback to identify relevant terms
• Output: Set of weighted terms

technology term / 1.3071887518221268
term tagger / 0.859136213710545
technology term tagger / 0.75647809808033
technology related terms / 0.38733521155619
terms / 0.3856395759054531
Dependency / 0.24820541872752222
identification of technology related terms / 0.22234662115108667
technology / 0.2218680207043609
technology / 0.20526909576693653
features including POS tags / 0.169229802088223
Dependency Structures / 0.1408195803257369
features including Part / 0.12821844123781564
Part of Speech tags / 0.10986616318102964

Page 8: Extracting Information for Context-aware Meeting Preparation

Keyword Extraction - Evaluation

Evaluation over corpora of scientific papers
• Section A of ACL Anthology Reference Corpus
• Semantic web dog food corpus
Evaluated datasets are available here: http://parsie.deri.ie/datasets/TTI/

Precision-Recall estimation
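As an illustration of how precision and recall can be estimated for a ranked list of extracted terms, here is a minimal sketch. The slide does not describe the exact procedure, so the cut-off scheme, the function names and the small gold set below are assumptions; the gold set is invented and does not come from the evaluated corpora.

```python
def precision_recall(extracted, gold):
    """Precision and recall of a set of extracted terms against a gold set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

def precision_recall_at_k(ranked_terms, gold, ks):
    """Evaluate the top-k prefix of a weight-ranked term list for each cut-off k."""
    return {k: precision_recall(ranked_terms[:k], gold) for k in ks}

# Hypothetical ranked output (highest weight first) and invented gold terms.
ranked = ["technology term", "term tagger", "terms", "Dependency"]
gold = {"technology term", "term tagger", "Dependency Structures"}

curve = precision_recall_at_k(ranked, gold, ks=[2, 4])
```

Evaluating several cut-offs in this way traces out a precision-recall curve: a small `k` keeps only highly weighted terms, while a larger `k` admits lower-weighted candidates.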

Page 9: Extracting Information for Context-aware Meeting Preparation

GATE Pipeline (English)
• Conditional Corpus

ANNIE IE System
• Tokeniser/NE Transducer/POS Tagger

Gazetteer Lookup
• Verbs (Actions, Activities, Modal verbs)
• Grammatical Person

JAPE Hand-coded Rules
• 62 rules in 16 phases
• Grammatical Person
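The general idea of combining gazetteer lookups (verbs, grammatical person) with hand-coded rules can be sketched in Python. This is not the GATE/JAPE pipeline and does not reproduce any of the 62 rules, which the slides do not list: the word lists, the three rules and the name `classify_sentence` are all assumptions made for illustration.

```python
import re

# Tiny stand-in gazetteers (assumed contents, not the authors' lists).
ACTION_VERBS = {"send", "review", "prepare", "book", "update"}
MODAL_VERBS = {"can", "could", "will", "would", "should", "must"}
SECOND_PERSON = {"you", "your"}
QUESTION_WORDS = {"what", "when", "where", "who", "which", "how", "why"}

def classify_sentence(sentence):
    """Return 'task_request', 'information_request', or None for one sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return None

    # Gazetteer lookup phase: mark which word classes occur in the sentence.
    has_action = any(t in ACTION_VERBS for t in tokens)
    has_modal = any(t in MODAL_VERBS for t in tokens)
    addressed = any(t in SECOND_PERSON for t in tokens)
    is_question = sentence.strip().endswith("?")

    # Rule 1: modal verb + second person + action verb, e.g. "Could you send ...".
    if has_modal and addressed and has_action:
        return "task_request"
    # Rule 2: imperative form, e.g. "Please update ..." or "Send ...".
    if (tokens[0] == "please" and has_action) or tokens[0] in ACTION_VERBS:
        return "task_request"
    # Rule 3: a question opening with a question word or addressed to the reader.
    if is_question and (tokens[0] in QUESTION_WORDS or addressed):
        return "information_request"
    return None
```

For example, `classify_sentence("Could you send the slides before the meeting?")` matches the first rule, while `classify_sentence("When does the meeting start?")` falls through to the third. Applying the rules in a fixed order loosely mirrors the phased application of JAPE grammars, where earlier phases take effect before later ones.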

Action Item Extraction - Method

Page 10: Extracting Information for Context-aware Meeting Preparation

Action Item Extraction - Method

Page 11: Extracting Information for Context-aware Meeting Preparation

Action Item Extraction - Evaluation

Human vs Automatic Annotation
• > 100 email messages
• > 240 chat turns
• Confirmation of Extracted Action Items
• Marking False positives & False negatives (Missed Items)

Results
• F2-measure: 0.69
• Email only: 0.71
• IM only: 0.64
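For reference, a short sketch of the general F-beta measure, assuming that "F2-measure" on this slide denotes F-beta with beta = 2, which weights recall more heavily than precision. The slides do not report the underlying precision and recall values, so the numbers in the example are purely illustrative.

```python
def f_beta(precision, recall, beta=2.0):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Illustrative values only: half of the extracted items are correct, none are missed.
f2 = f_beta(0.5, 1.0)            # beta = 2 rewards the high recall
f1 = f_beta(0.5, 1.0, beta=1.0)  # balanced F1 for comparison
```

With the same precision and recall, F2 comes out higher than F1 here because recall is the stronger of the two. A recall-weighted measure fits a setting where a missed action item costs more than a false positive that the user can dismiss.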

Page 12: Extracting Information for Context-aware Meeting Preparation

Extracted Items: Unified Representation

Page 13: Extracting Information for Context-aware Meeting Preparation

Future Work

Action Item Extraction
• Separation of pipelines
• Email & IM
• IM Pipeline: Abbreviation/TxtSpk replacement service

Keyword Extraction
• Iterative Learning Procedure (App Validation)
• Active Learning – k-nearest-neighbour Regression instead of SVM
• Chat-email histories to Extract Background Knowledge
• Application of Association Measures for Filtering Candidate terms