Querying Web Data: The WebQA Approach
Sunny K.S. Lam and M. Tamer Özsu
Presented by E. Cem Sözgen
Outline
• Introduction
• Background and Literature
• WebQA Architecture
• Query Parser
• Summary Retriever
• Answer Extractor
• Evaluation
• References
• Comments
What do people want from a web query system?
• The ideal system for querying the web (from the authors' point of view):
  • Accepts easy-to-pose queries (possibly in natural language)
  • Searches all of the sources
  • Returns direct answers (not links)
• How about WebQA?
WebQA
• Input: factual query expressed in natural language
• Output: ranked list of short answers
e.g. Who invented the telephone?
1) Alexander Graham Bell (58.0)
2) Graham Bell (58.0)
3) Bell (58.0)
4) Alexander Graham (54.0)
Which types of questions does WebQA not handle?
• Who are the players of Toronto Raptors? (multiple results)
• Notify me whenever the temperature of Waterloo drops below zero. (continuous query)
• How do I make pancakes? (procedural query)
Which areas are involved?
• Question answering (QA) techniques
• Metasearch techniques
• Mediator/Wrapper techniques
• Information Retrieval (IR) techniques
• Extraction techniques
• Keyword Search approach
  • Search engines
  • Metasearchers
• Category Search approach
• Database view approach
• Semi-structured data querying approach
• Web Query Language approach
• Learning-based approach
• Question answering approach
Mulder
• Very similar to WebQA
• Accepts short factual questions in natural language (NL)
• Returns exact answers
• Similar main components
• Question types:
  • Nominal: Noun phrase
  • Numerical: Number
  • Temporal: Date
• Uses Google as a search engine
Differences
WebQA:
• Light NLP
• 7 categories
• Multiple sources
• More fault tolerant
• More flexible and scalable
Mulder:
• Heavy NLP
• 3 categories
• Single search engine
Client-Server Architecture
Interface
• Two types of interface
  • Textual Interface
    • Local access
    • Fast and provides debugging information
    • Needs a copy of WebQA on the local machine
  • Graphical User Interface
Home page of WebQA
Categories
• Defined to improve system accuracy
  • Name
  • Place
  • Time
  • Quantity
  • Abbreviation
  • Weather
  • Other
• Who invented the telephone? (Name)
• Who was George Washington? (Other)
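The category assignment above can be illustrated with a minimal rule-based sketch. The cue patterns, their order, and the function name `categorize` are assumptions made for illustration only; the slides do not give the rules WebQA actually uses. The sketch is tuned so that the two example questions land in the categories shown on the slide.

```python
import re

# Hypothetical cue patterns, checked in order. These are illustrative
# assumptions, not the rules used by WebQA itself.
RULES = [
    ("Weather", re.compile(r"\b(weather|temperature|forecast)\b")),
    ("Abbreviation", re.compile(r"\b(stand for|abbreviation|acronym)\b")),
    # "Who was X?" asks for a description rather than a name.
    ("Other", re.compile(r"^who (is|was|are|were) ")),
    ("Name", re.compile(r"^(who|whom|whose)\b")),
    ("Place", re.compile(r"^where\b|\b(which|what) (city|country)\b")),
    ("Time", re.compile(r"^when\b|\bwhat year\b")),
    ("Quantity", re.compile(r"^how (many|much|far|long|tall|old)\b")),
]


def categorize(question: str) -> str:
    """Return the first category whose cue pattern matches the question."""
    q = question.strip().lower()
    for category, pattern in RULES:
        if pattern.search(q):
            return category
    return "Other"  # fallback when no cue matches


if __name__ == "__main__":
    for q in ("Who invented the telephone?", "Who was George Washington?"):
        print(q, "->", categorize(q))
```

Ordering matters in this sketch: the "Who was ..." rule is checked before the general "Who ..." rule, which is how the two slide examples end up in different categories.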
Output Options
<Category> [-output <Output Option>] -keywords <Keyword List>
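A minimal sketch of building and reading a string in the format above. It assumes that the category and the output option are single tokens and that keywords are separated by whitespace; the slides do not state this. The names `ParsedQuery`, `format_query`, `parse_query`, and the output option value `year` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParsedQuery:
    category: str
    keywords: List[str]
    output_option: Optional[str] = None  # the [-output ...] part is optional


def format_query(q: ParsedQuery) -> str:
    """Render '<Category> [-output <Output Option>] -keywords <Keyword List>'."""
    parts = [q.category]
    if q.output_option is not None:
        parts += ["-output", q.output_option]
    parts += ["-keywords", *q.keywords]
    return " ".join(parts)


def parse_query(text: str) -> ParsedQuery:
    """Inverse of format_query, under the single-token assumptions above."""
    tokens = text.split()
    if "-keywords" not in tokens:
        raise ValueError("missing -keywords")
    k = tokens.index("-keywords")
    head, keywords = tokens[:k], tokens[k + 1:]
    if not head:
        raise ValueError("missing category")
    output_option = None
    if len(head) > 1:
        if len(head) != 3 or head[1] != "-output":
            raise ValueError("malformed -output clause")
        output_option = head[2]
    return ParsedQuery(head[0], keywords, output_option)


if __name__ == "__main__":
    q = ParsedQuery("Name", ["invented", "telephone"])
    print(format_query(q))
    # "year" is a made-up output option used only to show the optional clause.
    print(format_query(ParsedQuery("Time", ["telephone", "invented"], "year")))
```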
• List used by Source Ranker
• The structure of a record
Mediator/Wrapper
• For information integration
• One wrapper for each data source
• Same Wrapper API
• One centralized mediator
• Different from data warehouse: integrated data is not materialized
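The mediator/wrapper idea can be sketched as follows: every source sits behind the same wrapper API, and one mediator fans the query out at request time without storing an integrated copy. All class and method names here (`Wrapper`, `InMemoryWrapper`, `Mediator`, `search`, `query`) are hypothetical, and the in-memory source only stands in for a real search engine or web site.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class Wrapper(ABC):
    """Common API that every source wrapper implements (assumed shape)."""

    name: str

    @abstractmethod
    def search(self, keywords: List[str]) -> List[str]:
        """Return text snippets from this source that match the keywords."""


class InMemoryWrapper(Wrapper):
    """Stand-in source; a real wrapper would call out to a web source."""

    def __init__(self, name: str, documents: List[str]):
        self.name = name
        self._documents = documents

    def search(self, keywords: List[str]) -> List[str]:
        wanted = [k.lower() for k in keywords]
        return [d for d in self._documents
                if all(k in d.lower() for k in wanted)]


class Mediator:
    """Single entry point; results are computed per query, never stored."""

    def __init__(self, wrappers: List[Wrapper]):
        self._wrappers = wrappers

    def query(self, keywords: List[str]) -> List[Tuple[str, str]]:
        results = []
        for wrapper in self._wrappers:
            try:
                snippets = wrapper.search(keywords)
            except Exception:
                continue  # assumption: skip a failing source, keep the rest
            results.extend((wrapper.name, s) for s in snippets)
        return results


if __name__ == "__main__":
    a = InMemoryWrapper("source_a", ["Alexander Graham Bell invented the telephone.",
                                     "Waterloo is in Ontario."])
    b = InMemoryWrapper("source_b", ["The telephone was patented by Bell in 1876."])
    for source, snippet in Mediator([a, b]).query(["telephone", "bell"]):
        print(source, "|", snippet)
```

Adding a source in this design means writing one more `Wrapper` subclass; the mediator itself does not change.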
Candidate Identifier
• Candidate list: list of candidates
• Four sub-identifiers
  • Country sub-identifier
  • Abbreviation sub-identifier
  • Weather sub-identifier
  • Search engine sub-identifier
Structure of a Candidate
Rearranger
Before:
1) Bell (58.0)
2) Alexander Graham Bell (50.0)
After:
1) Alexander Graham Bell (58.0)
2) Bell (58.0)
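One rule that reproduces the reordering on this slide: a candidate that contains a higher-scoring candidate as a contiguous phrase inherits that score, and ties are broken in favour of the longer candidate. This is an assumption chosen to match the example, not necessarily the rule WebQA implements; `contains_phrase` and `rearrange` are hypothetical names.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (answer text, score)


def contains_phrase(longer: str, shorter: str) -> bool:
    """True if the words of `shorter` appear contiguously inside `longer`."""
    lw, sw = longer.lower().split(), shorter.lower().split()
    return any(lw[i:i + len(sw)] == sw for i in range(len(lw) - len(sw) + 1))


def rearrange(candidates: List[Candidate]) -> List[Candidate]:
    """Boost candidates that contain a better-scored phrase, then re-rank."""
    boosted = []
    for text, score in candidates:
        inherited = [s for t, s in candidates
                     if t != text and contains_phrase(text, t)]
        boosted.append((text, max([score] + inherited)))
    # Highest score first; among equal scores, the longer answer first.
    return sorted(boosted, key=lambda c: (-c[1], -len(c[0].split())))


if __name__ == "__main__":
    print(rearrange([("Bell", 58.0), ("Alexander Graham Bell", 50.0)]))
```

Under this rule the more complete answer moves to the top without the shorter answer losing its score.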
Experiment 1
• Goal: measure how accurately questions are categorized
• TREC 9: 686 of 693 questions categorized correctly (98.99%)
• TREC 10: 461 of 500 questions categorized correctly (92.2%)
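The percentages follow directly from the counts on the slide; a short check, with `accuracy_percent` as a hypothetical helper name:

```python
# (correct, total) counts as reported on the slide.
RESULTS = {"TREC 9": (686, 693), "TREC 10": (461, 500)}


def accuracy_percent(correct: int, total: int) -> float:
    """Share of correctly categorized questions, as a percentage."""
    return round(100 * correct / total, 2)


if __name__ == "__main__":
    for name, (correct, total) in RESULTS.items():
        print(f"{name}: {correct}/{total} = {accuracy_percent(correct, total)}%")
```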
Experiment 2
• To determine the best source ranking for each category
Experiment 3
• To see how using secondary sources affects the results
Experiment 4
• Comparison of WebQA with other systems
References
1) M. T. Özsu and P. Valduriez. Principles of Distributed Database Systems, 2nd edition. Prentice-Hall, Inc., 1999. ISBN 0-13-659707-6.
2) S. K. S. Lam and M. T. Özsu. "Querying Web Data – The WebQA Approach." In Proc. 3rd International Conference on Web Information Systems Engineering, Singapore, December 2002, pages 139-148.
3) S. K. S. Lam. WebQA: A web querying system using the QA approach. Master's thesis, University of Waterloo, School of Computer Science, Waterloo, Canada, Spring 2002.
4) http://www.viz.co.nz/internet-facts.htm
5) C. C. T. Kwok, O. Etzioni, and D. S. Weld. Scaling question answering to the Web. In Proceedings of 10th International World Wide Web Conference, pages 150–161, 2001.
Comments…