A Framework for Evaluating Database Keyword Search Strategies

Joel Coffman, Alfred C. Weaver

University of Virginia

28 October 2010


Outline

Introduction

Evaluation Framework

Experiments

Conclusion


Keyword Search
- Preferred means of data exploration and retrieval online
  - > 4 billion searches daily
- Desire to extend paradigm to relational databases

Example
Who played Professor Henry Jones in Indiana Jones and the Last Crusade?

Person
id  name
10  Ford, Harrison
11  Connery, Sean

Character
id  name
7   Indiana Jones
9   Professor Henry Jones

Movie
id  title
18  Raiders of the Lost Ark
19  Indiana Jones and the Last Crusade


Cast
personId  characterId  movieId
10        7            18
10        7            19
11        9            19


Definition (query result)
A tree of tuples that is reduced with respect to the query.

Indiana Jones and the Last Crusade

Professor Henry Jones

Sean Connery


Which would you rather write?

    SELECT Person.name
    FROM Person, Character, Movie, Cast
    WHERE Person.id = Cast.personId
      AND Character.id = Cast.characterId
      AND Movie.id = Cast.movieId
      AND Character.name = 'Professor Henry Jones'
      AND Movie.title = 'Indiana Jones and the Last Crusade';

or “Henry Jones Last Crusade”
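The structured query above can be tried end to end with a small sketch. It is not part of the original slides: it assumes Python's built-in sqlite3 module, loads the four example relations (Person, Character, Movie, Cast) with the rows shown earlier, and wraps the join in two hypothetical helper functions, build_example_db and who_played. The table name Cast is quoted here only because CAST is also an SQL keyword.

```python
import sqlite3


def build_example_db():
    """Create an in-memory SQLite database holding the slide's four relations."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("CREATE TABLE Character (id INTEGER PRIMARY KEY, name TEXT)")
    cur.execute("CREATE TABLE Movie (id INTEGER PRIMARY KEY, title TEXT)")
    # "Cast" is quoted because CAST is also an SQL keyword.
    cur.execute(
        'CREATE TABLE "Cast" (personId INTEGER, characterId INTEGER, movieId INTEGER)'
    )
    cur.executemany(
        "INSERT INTO Person VALUES (?, ?)",
        [(10, "Ford, Harrison"), (11, "Connery, Sean")],
    )
    cur.executemany(
        "INSERT INTO Character VALUES (?, ?)",
        [(7, "Indiana Jones"), (9, "Professor Henry Jones")],
    )
    cur.executemany(
        "INSERT INTO Movie VALUES (?, ?)",
        [(18, "Raiders of the Lost Ark"), (19, "Indiana Jones and the Last Crusade")],
    )
    cur.executemany(
        'INSERT INTO "Cast" VALUES (?, ?, ?)',
        [(10, 7, 18), (10, 7, 19), (11, 9, 19)],
    )
    conn.commit()
    return conn


def who_played(conn, character_name, movie_title):
    """Run the slide's four-way join with the two literals passed as parameters."""
    rows = conn.execute(
        """
        SELECT Person.name
        FROM Person, Character, Movie, "Cast"
        WHERE Person.id = "Cast".personId
          AND Character.id = "Cast".characterId
          AND Movie.id = "Cast".movieId
          AND Character.name = ?
          AND Movie.title = ?
        """,
        (character_name, movie_title),
    ).fetchall()
    return [name for (name,) in rows]


conn = build_example_db()
print(who_played(conn, "Professor Henry Jones", "Indiana Jones and the Last Crusade"))
# prints ['Connery, Sean']
```

The point of the comparison stands out once the query is written down: the user must know the schema, the join path through Cast, and the exact attribute values, whereas the keyword query needs none of that.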


The Problem
Relational keyword search has been a hot topic since 2002
- Evaluations ad hoc, no standardization

Example (Search Effectiveness)
DISCOVER → Hristidis et al. → Liu et al. → SPARK → Xu et al.
- ≈ 16-fold improvement in search effectiveness during past decade
- Liu et al. claim to be better than Google
- SPARK achieves Mean Reciprocal Rank (MRR) of 1.0
  - Best systems at TREC score ≈ 0.8 (Webber, 2010)

Hypothesis: Existing evaluations overstate retrieval effectiveness


Survey of Existing Evaluations
- Existing experiments unrepeatable
  - Few details included in literature
  - Datasets, query workloads, and relevance assessments not released
- Query workloads vary widely
  - 12–1100 queries included in experiments
  - Too few representative queries
- Experiments
  - Performance-focus, less than half consider search effectiveness
  - Little system comparison


Outline

Introduction
  Background
  Motivation

Evaluation Framework
  Datasets
  Queries
  Relevance Assessments

Experiments

Conclusion


Datasets
- 3 datasets
  - Subsets of IMDb and Wikipedia used in experiments
  - Evaluate systems that assume index fits in memory

Dataset               Size (MB)  Relations  Tuples
MONDIAL                      10         28     17K
IMDb                        427          6    1.7M
Wikipedia                   378          6    0.2M
IMDb (original)            9017         44   44.3M
Wikipedia (original)        670         42    1.6M

Table: Dataset characteristics.


Query Workload
- 50 information needs (minimum for evaluating retrieval systems)
- Query statistics are similar to those submitted to Internet search engines

           Search log                      Synthesized
Dataset    |Q|         ⟦q⟧    avg ⟦q⟧     |Q|  ⟦q⟧   avg ⟦q⟧
MONDIAL                                    50   1–5   2.04
IMDb       101,903     1–96   2.71         50   1–26  3.88
Wikipedia  122,956     1–95   2.87         50   1–6   2.66
Overall    20,527,863  1–245  2.37         150  1–26  2.86

Legend
|Q|      total number of queries
⟦q⟧      range in number of query terms
avg ⟦q⟧  average number of terms per query
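The three workload statistics in the legend are simple to compute. The sketch below is illustrative only: it assumes queries are plain strings and that terms are separated by whitespace, which may differ from the tokenization used for the search logs, and the function name workload_stats and the sample queries are hypothetical.

```python
def workload_stats(queries):
    """Return (|Q|, (min terms, max terms), average terms) for a list of queries.

    Assumption: a query term is any whitespace-delimited token.
    """
    lengths = [len(query.split()) for query in queries]
    if not lengths:
        raise ValueError("workload must contain at least one query")
    return len(lengths), (min(lengths), max(lengths)), sum(lengths) / len(lengths)


# Hypothetical three-query workload; only the first query appears in the slides.
sample = ["Henry Jones Last Crusade", "Harrison Ford", "Raiders"]
total, (shortest, longest), average = workload_stats(sample)
print(total, shortest, longest, round(average, 2))
# prints 3 1 4 2.33
```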


Relevance Assessments
- Binary relevance assessments

           Results
Dataset    ⟦r⟧    avg ⟦r⟧
MONDIAL    1–35   5.90
IMDb       1–35   4.32
Wikipedia  1–13   3.26
Overall    1–35   4.49

Legend
⟦r⟧      range in number of relevant results per query
avg ⟦r⟧  average number of relevant results per query


Experiments
Objectives and Metrics
- Determine search effectiveness of different systems
  - Mean Reciprocal Rank (MRR)
  - Mean Average Precision (MAP)
- Impact the number of results retrieved has on metrics
  - Interpolated precision
- Correlation of results from different systems
  - Minimizing Kendall distance (Kmin)
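The two effectiveness metrics named above can be sketched under the standard textbook definitions with binary relevance. This is an illustration, not the authors' evaluation code: the function names and the two toy runs are hypothetical, and average precision is normalized here by the total number of relevant results for the query.

```python
def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, result in enumerate(ranked, start=1):
        if result in relevant:
            return 1.0 / rank
    return 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant result appears.

    Relevant results that are never retrieved contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, result in enumerate(ranked, start=1):
        if result in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_over_queries(metric, runs):
    """Average a per-query metric over (ranked list, relevant set) pairs."""
    scores = [metric(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores)


# Hypothetical runs: each pair is (system ranking, set of relevant results).
runs = [
    (["a", "b", "c"], {"a"}),
    (["x", "y", "z"], {"y", "z"}),
]
print(mean_over_queries(reciprocal_rank, runs))  # MRR, prints 0.75
print(round(mean_over_queries(average_precision, runs), 4))  # MAP
```

MRR only rewards the first relevant result, which suits queries with a single correct answer, while MAP also accounts for how early the remaining relevant results appear.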


Systems
- 2 major approaches to keyword search in relational databases
  - Relational
    - Specific to relational databases
    - Use IR-style ranking functions
  - Proximity Search
    - Applicable to arbitrary data graphs
    - Minimizes the total weight of result trees
- 8 systems published in major proceedings
  - . . . plus our own ranking scheme, structured cover density ranking (CD)


Single-Entity Effectiveness
- Proximity search systems handle single-entity queries well but not scalable

Figure: Mean reciprocal rank for queries targeting a single tuple.


Overall Effectiveness
- No ranking scheme outperforms all others
- IR-style ranking (excluding CD) generally not as good as proximity search

Figure: Mean average precision across all queries.


Additional Experiments

Number of results retrieved
- Precision-recall curve inaccurate above 40% recall for small k
- k should be at least double the number of relevant results
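A small sketch can show why a short result list distorts the high-recall end of the curve. It assumes the usual definition of interpolated precision (the maximum precision observed at any recall level at or above the requested one) and uses a hypothetical ranking; it does not reproduce the 40% figure reported above.

```python
def precision_recall_points(ranked, relevant):
    """(recall, precision) after each retrieved result, for binary relevance."""
    points = []
    hits = 0
    for rank, result in enumerate(ranked, start=1):
        if result in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / rank))
    return points


def interpolated_precision(ranked, relevant, recall_level):
    """Highest precision at any recall >= recall_level, or 0.0 if never reached."""
    candidates = [
        precision
        for recall, precision in precision_recall_points(ranked, relevant)
        if recall >= recall_level
    ]
    return max(candidates, default=0.0)


# Hypothetical query with four relevant results and a ranking of eight results.
relevant = {"a", "b", "c", "d"}
ranking = ["a", "x", "b", "y", "c", "d", "z", "w"]

# With k = 3 the list never reaches 75% recall, so the curve drops to zero there.
print(interpolated_precision(ranking[:3], relevant, 0.75))  # prints 0.0
# With k = 8 (double the number of relevant results) the same level is measurable.
print(round(interpolated_precision(ranking, relevant, 0.75), 3))  # prints 0.667
```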

Ranking Correlation
- Ranking functions derived from common ancestor produce similar results
- Prefer simpler ranking functions
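Ranking correlation between two systems can be sketched as a pairwise comparison of their top-k lists. The code below assumes that Kmin denotes the top-k generalization of Kendall distance with penalty parameter 0 for pairs whose relative order cannot be inferred; that reading of the slides is an assumption, and the function name and example lists are hypothetical.

```python
from itertools import combinations


def kendall_distance_topk(list1, list2, penalty=0.0):
    """Kendall distance between two top-k lists that may contain different items.

    penalty is charged for pairs that appear together in one list and not at
    all in the other; penalty=0.0 is taken here to correspond to Kmin.
    """
    pos1 = {item: index for index, item in enumerate(list1)}
    pos2 = {item: index for index, item in enumerate(list2)}
    distance = 0.0
    for a, b in combinations(set(pos1) | set(pos2), 2):
        in1 = (a in pos1, b in pos1)
        in2 = (a in pos2, b in pos2)
        if all(in1) and all(in2):
            # Both lists rank both items: count a disagreement in relative order.
            if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0:
                distance += 1
        elif all(in1) and any(in2):
            # list2 holds only one item, so it implicitly ranks the other lower.
            present, missing = (a, b) if in2[0] else (b, a)
            if pos1[missing] < pos1[present]:
                distance += 1
        elif all(in2) and any(in1):
            present, missing = (a, b) if in1[0] else (b, a)
            if pos2[missing] < pos2[present]:
                distance += 1
        elif all(in1) or all(in2):
            # Both items in one list, neither in the other: order is unknown.
            distance += penalty
        else:
            # Each item appears in a different list: the lists must disagree.
            distance += 1
    return distance


print(kendall_distance_topk(["a", "b", "c"], ["b", "a", "d"]))  # prints 2.0
```

A distance of zero means the two systems return the same results in the same order; larger values indicate that their rankings diverge.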


Conclusions
- Existing evaluations ad hoc, lacking standardization
  - Standardized evaluation critical to progress
- Our evaluation benchmark is the first designed for keyword search within relational databases
  - Datasets, queries, and relevance assessments available for other researchers
- No existing ranking scheme is most effective on all workloads
  - Improve ranking by considering additional factors


Questions?

Download the datasets, queries, and relevance assessments: http://www.cs.virginia.edu/~jmc7tp/projects/search
