Software Engineering for Business Information Systems (sebis)

Department of Informatics

Technische Universität München, Germany

wwwmatthes.in.tum.de

Applying lexical knowledge to improve search quality for a German legal information database

Master thesis final presentation

Laura Altamirano Sainz - May 4th, 2014




Administration matters

• Time:

October 15th, 2014 to April 15th, 2015

• Supervisor:

Prof. Dr. Florian Matthes

• Advisor:

Bernhard Waltl

© sebis, Laura Altamirano Sainz


Agenda

1. Motivation

2. Research questions

3. Research method

4. Demonstration

5. Evaluation

6. Conclusions


Motivation

• Integration of lexical information in legal searches

Related work:

• Ontologies integrated in the foreground of the systems

• Interaction between the users and the lexical knowledge

• Other areas integrate lexical knowledge for searches: Biology

• Lexical knowledge integrated in the background of the systems

• Query expansion


Research questions

How can lexical information improve search quality for a legal database?

What search mechanisms and methods are common in legal databases?

Which search mechanisms and methods can be enhanced by lexical information, and how?

What could an implementation of a search support mechanism integrated with lexical knowledge look like?


Research method

• Needs assessment

• Interview with 6 experts in the legal domain

• Results:

• Lexical relations

• Hyponyms

• Troponyms

• Siblings / related terms

• Derivationally related terms

• Search support mechanisms in legal databases analysis

• Comparison between 5 legal databases

• Search support mechanisms categories

• Query formulation / specification

• Query reformulation

• Integration of navigation of the results with search


“Systems are built to help people work better. They cannot be built well without understanding how people work” (Holtzblatt & Beyer, 1997)


Research method

Search system

• Integration of the lexical knowledge as a search support mechanism in a search system

• GermaNet

• Lexical database for German

• Query expansion / refinement suggestions

• Query expansion with hypernyms

• Query refinement with hyponyms
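
The two suggestion types above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the tiny taxonomy is a hypothetical stand-in for GermaNet (its edges are assumed, loosely based on the node labels in the slide's figure), and the function names `expansion_suggestions` and `refinement_suggestions` are invented for this example.

```python
from collections import defaultdict

# Hypothetical toy taxonomy (child -> parent). A real system would read these
# hypernym links from GermaNet; the edges below are assumed for illustration.
HYPERNYM_OF = {
    "animal": "organism",
    "plant": "organism",
    "marine creature": "animal",
    "larva": "animal",
}

# Invert the mapping once so hyponyms (narrower terms) can be looked up too.
HYPONYMS_OF = defaultdict(list)
for child, parent in HYPERNYM_OF.items():
    HYPONYMS_OF[parent].append(child)


def expansion_suggestions(term):
    """Return broader terms (hypernyms), nearest first, to widen a query."""
    suggestions = []
    current = term
    while current in HYPERNYM_OF:
        current = HYPERNYM_OF[current]
        suggestions.append(current)
    return suggestions


def refinement_suggestions(term):
    """Return narrower terms (direct hyponyms) to narrow a query."""
    return sorted(HYPONYMS_OF.get(term, []))


print(expansion_suggestions("larva"))    # ['animal', 'organism']
print(refinement_suggestions("animal"))  # ['larva', 'marine creature']
```

Expansion walks upward in the hierarchy and so tends to widen the result set, while refinement walks one level downward and tends to narrow it. Presenting both as suggestions leaves the choice to the user.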


[Figure: example term hierarchy with the nodes Organism, Animal, Plant, Marine creature, and Larva]


Research method

System architecture


DEMONSTRATION


Evaluation

• Limitations

• Word sense disambiguation

• Search context

• Lexical database

• Evaluation results – Expert interview


System advantages

• Integration of lexical information as a search support mechanism

• More than one lexical relation implemented

• Clean and clear interface

Areas of improvement

• Search context

• Personalization

• Explanatory mechanism for highlighted words


Conclusions


• Lexical information can improve searches

• Current search support mechanisms can be improved by lexical information

• Law practitioners show interest in this area

• Users are able to interact with the lexical information

Outlook

• Context

• Improving lexical database

• Personalization features


Technische Universität München

Department of Informatics

Chair of Software Engineering for

Business Information Systems

Boltzmannstraße 3

85748 Garching bei München

Laura Altamirano Sainz

Tel +49.89.289.17124

Fax +49.89.289.17136

wwwmatthes.in.tum.de

Thank you for your attention!




Backup slides

• Lexical knowledge

• Everything we know about a word

• Relationships with other words

• Ontological categories (Relations, hyponyms, hypernyms, synonyms,…)

• Lexical databases

1. WordNet: 117 000 synsets

2. GermaNet: 93 246 synsets


[Figure: example term hierarchy with the nodes Organism, Animal, Plant, Marine creature, and Larva]


Backup slides


Search support mechanisms

• Query formulation / specification

• Boolean operators

• Autocomplete

• Query reformulation

• Spelling suggestions

• Query expansion

• Integration of navigation of the results with search

• Faceted navigation

• Table of contents


Backup slides

German legal information

• Number of German laws is increasing

• Frequently revised information

• Relevant information

• To build up cases, for resolutions, etc…

• Style of writing legal documentation

• Standard format


Backup slides

• Main objectives


• Find out common search support mechanisms in legal databases

• Assess the needs of the user

• Integrate the lexical information as a search support mechanism in a system

• The support tool must be intuitive to the user

• Evaluate the system


Backup slides


Search support mechanisms analysis in 5 legal databases

• JURION

• beck-online

• LexisNexis

• FindLaw

• LEXinform

Select which ones are common

Select which ones can be enhanced by lexical information

• For example:

1. Autocomplete

2. Faceted navigation
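
One way autocomplete could be enhanced by lexical information is to append related terms to the ordinary prefix matches. The following is a rough sketch under stated assumptions: the vocabulary, the related-terms table, and the function name `autocomplete` are hypothetical, the German terms are illustrative only, and the slides do not specify this design.

```python
# Hypothetical index vocabulary and related-terms table; a real system would
# take the vocabulary from the search index and the related terms from a
# lexical database such as GermaNet.
VOCABULARY = ["kaufvertrag", "kaution", "kündigung", "mietvertrag"]
RELATED = {"kündigung": ["vertragsbeendigung"]}


def autocomplete(prefix, limit=5):
    """Return prefix matches first, then lexically related terms of those matches."""
    prefix = prefix.lower()
    matches = [w for w in VOCABULARY if w.startswith(prefix)]
    related = [r for w in matches for r in RELATED.get(w, []) if r not in matches]
    return (matches + related)[:limit]


print(autocomplete("kü"))  # ['kündigung', 'vertragsbeendigung']
print(autocomplete("ka"))  # ['kaufvertrag', 'kaution']
```

Listing the literal prefix matches first keeps the familiar autocomplete behavior intact, and the related terms only add options the user would not have reached by typing.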

Needs assessment


Backup slides

Process


[Figure: process diagram with numbered steps 1 to 5; recoverable labels: "Lexical information analysis" and "Lexical information (synonyms, hypernyms, hyponyms, etc.)"]


Backup slides

Lexical relations


[Figure: lexical relations for "richtig" (right), drawn as synsets linked by edges labelled hypernyms, hyponyms, and antonym, with GNROOT as a node. Recoverable node labels, grouped in the order they appear:]

• richtig (right), zutreffend (applicable), korrekt (correct), wahr (true), recht (right)

• regulär (regular), vorschriftsmäßig (properly), ordnungsgemäß (properly), regelgerecht (regular), regelgemäß (regularly), vorschriftsgemäß (properly)

• angemessen (appropriate), adäquat (adequate)

• bewertungsspezifisch (assessment specific)

• klassenübergreifend (across classes)

• falsch (false), labelled as antonym of richtig (right)

• richtig, regelrecht, richtiggehend (right, proper, real)
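
The relations on this slide can be represented with a small data structure. The sketch below is an illustration under assumptions rather than the GermaNet API: the `Synset` class, the identifiers `s1` to `s3`, and the choice of which words share a synset are hypothetical, taken loosely from the node labels above. Only the antonym link between richtig and falsch is stated on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A set of words sharing one meaning, plus links to other synsets."""
    id: str
    words: list
    antonyms: list = field(default_factory=list)  # ids of opposite synsets


# Hypothetical ids and groupings, for illustration only.
SYNSETS = {
    "s1": Synset("s1", ["richtig", "zutreffend", "korrekt", "wahr"], antonyms=["s3"]),
    "s2": Synset("s2", ["richtig", "regelrecht", "richtiggehend"]),
    "s3": Synset("s3", ["falsch"], antonyms=["s1"]),
}


def senses(word):
    """All synsets containing the word; more than one means it is ambiguous."""
    return [s for s in SYNSETS.values() if word in s.words]


def synonyms(word):
    """Other words from every synset the word belongs to, without duplicates."""
    seen = []
    for s in senses(word):
        for w in s.words:
            if w != word and w not in seen:
                seen.append(w)
    return seen


print(len(senses("richtig")))  # 2
print(synonyms("richtig"))
# ['zutreffend', 'korrekt', 'wahr', 'regelrecht', 'richtiggehend']
```

Because richtig belongs to two synsets in this sketch, a system has to either choose one sense or present both, which is the word sense disambiguation limitation named in the evaluation.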


Backup slides

Sequence diagram
