A Concept Space Approach to Semantic Exchange
The University of Arizona, Management Information Systems
A Concept Space Approach to Semantic Exchange
Tobun Dorbin Ng
Dissertation Defense
April 19, 2000
Outline
• Introduction
• Literature Review
• Research Questions & Methodologies
• Concept Space Consultation
• Concept Space Generation
• Conclusions
Objective
• To investigate information technologies that clarify semantic meaning and help users elaborate their information needs by providing library-specific knowledge during the information-seeking process.
Introduction
Questions & Problems
[Diagram labels: Users; Query; Document Set; Information Retrieval Systems (Keyword Search, Inverted Index, Summarization, Visualization); Search for Documents; Browsing Classifications; Knowledge Spaces (Concept Spaces, Category Spaces); Knowledge Discovery (Concept Association, Cluster Analysis); Distributed, Heterogeneous Database Collections; Text, Image, Video]
• Does a query truly represent the user’s information need?
• Can these knowledge sources adequately serve users’ information needs?
Goal
• To adopt a user-centric, interactive approach that helps users elaborate their information needs with library-specific knowledge while gaining insight into the library’s offerings related to those needs.
Research Issues
• Interactive Consultation with Knowledge Sources
• Automatic Generation of Semantic-bearing Knowledge Sources from Corresponding Libraries
Static Nature of Knowledge in Library Collection
• Characterizing Document Objects
• Characterizing Global Knowledge in Document Collections
– Grand Coverage
– Knowledge of Knowledge
• Revealing Knowledge in Neighborhood
– Contextual Information
Literature Review
Dynamic Nature of User Information Need
• Expressing User Need
– Information Need
  • Dynamic, not directly observable or symbolized
– Indeterminism
– Opportunism
– Vocabulary Problem
– Recognition with Contextual Information
  • Key Word In Context, Relevance Feedback
Perceiving Knowledge
• What is the user’s perspective of knowledge?
• How does a user perceive retrieved or derived knowledge?
• Computing Relevance?
Structure & Context: Aids To Perceive Knowledge
• Structureless and Contextless
– Document List
• Structural but Contextless
– Dynamic Clustering
• Structural and Contextual
– Path to the Knowledge
Research Questions
[Diagram labels: Users; Information Need; Vocabulary & Context; Context-rich Query; Context-coherent Document Set; Concept Consultation Systems (Concept Exploration: Branch-and-bound Search, Hopfield Net Activation); Search for Related Concepts; Information Retrieval Systems (Keyword Search, Inverted Index, Summarization, Visualization); Search for Documents; Browsing Classifications; Knowledge Spaces (Concept Spaces, Category Spaces); Knowledge Discovery (Concept Association, Cluster Analysis); Distributed, Heterogeneous Database Collections; Text, Image, Video]
• Can knowledge sources be used to help users express their information needs?
Research Questions & Methodologies
Research Methodologies
• Systems Development Approach
• Experimental Design
Concept Space Consultation
• Algorithmic Concept Exploration
• Large Networks of Knowledge
– Man-made Thesauri: LCSH & ACM CRCS
– Concept Spaces
• Spreading Activation
– Traversing a set of Knowledge Networks automatically and suggesting a set of most relevant concepts
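One way to picture automatic traversal of a knowledge network is a best-first search that always expands the most strongly activated concept next. The sketch below is written in that branch-and-bound spirit and is not the dissertation's implementation: the function name, the dictionary layout of the network, the rule that activation decays as the product of link weights along a path, and the toy weights are all assumptions made for illustration.

```python
import heapq


def spread_best_first(graph, seeds, max_concepts=5):
    """Suggest concepts by best-first spreading activation.

    graph: dict mapping concept -> {neighbor: link weight in (0, 1]} (hypothetical layout)
    seeds: concepts supplied by the user; each starts with activation 1.0
    Returns up to max_concepts (concept, activation) pairs, strongest first.
    """
    seed_set = set(seeds)
    best = {s: 1.0 for s in seed_set}
    # heapq is a min-heap, so activations are stored negated.
    frontier = [(-1.0, s) for s in seed_set]
    heapq.heapify(frontier)
    done = set()
    suggested = []
    while frontier and len(suggested) < max_concepts:
        neg_act, concept = heapq.heappop(frontier)
        if concept in done:
            continue  # a stronger path to this concept was already expanded
        done.add(concept)
        if concept not in seed_set:
            suggested.append((concept, -neg_act))
        for neighbor, weight in graph.get(concept, {}).items():
            activation = -neg_act * weight  # assumed decay rule
            if activation > best.get(neighbor, 0.0):
                best[neighbor] = activation
                heapq.heappush(frontier, (-activation, neighbor))
    return suggested


# Toy network with invented weights.
toy_graph = {
    "information retrieval": {"indexing": 0.8, "thesaurus": 0.6},
    "indexing": {"inverted index": 0.9, "information retrieval": 0.5},
    "thesaurus": {"vocabulary problem": 0.7},
}
toy_suggestions = spread_best_first(toy_graph, ["information retrieval"], max_concepts=3)
```

Because link weights are at most 1, the strongest remaining concept is always expanded first, so the search can stop as soon as enough concepts have been suggested.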
Research Questions 1&2
• Would the automatic concept exploration process be able to help users identify more relevant concepts?
• Would such a process be able to perform more efficient exploration of a concept space than the conventional manual browsing method?
Research Question 3
• If so, which algorithmic method (the symbolic branch-and-bound algorithm or the neural-network-based Hopfield net algorithm) is better at gathering relevant concepts from knowledge sources?
Research Questions 4&5
• Would the concept space consultation process provide a semantic medium that reduces the cognitive demand on users when they elaborate their information needs?
• Would the concept exploration process be able to help users find more relevant documents?
Two Algorithms for Spreading Activation
• Branch-and-bound Algorithm
– Semantic Net Based: “Optimal” Search
• Hopfield Net Algorithm
– Neural Net Based: Parallel Relaxation Search
• Spreading Activation Process
– Activation, Weight Computation, Iteration
– Stopping Condition
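The activation, weight computation, iteration, and stopping steps can be sketched for the Hopfield-style variant as a parallel relaxation loop: every concept recomputes its activation from its neighbors at once, and iteration stops when the total change falls below a threshold. This is a minimal sketch under stated assumptions, not the dissertation's code: the sigmoid transfer function, the parameters `theta_j`, `theta_0`, and `epsilon`, the choice to clamp seed concepts at 1.0, and the toy weights are all illustrative.

```python
import math


def hopfield_activation(weights, seeds, theta_j=0.5, theta_0=0.1,
                        epsilon=0.01, max_iters=50):
    """Parallel-relaxation spreading activation over a weighted concept network.

    weights: dict mapping source concept -> {target concept: link weight}
    seeds:   user-supplied concepts, clamped to activation 1.0 (an assumption)
    Returns a dict mapping every concept to its final activation in [0, 1].
    """
    seed_set = set(seeds)
    nodes = set(weights) | seed_set
    for targets in weights.values():
        nodes |= set(targets)
    # Activation: seeds start at 1, everything else at 0.
    mu = {n: (1.0 if n in seed_set else 0.0) for n in nodes}
    for _ in range(max_iters):
        # Weight computation: net input is the weighted sum of source activations.
        net = {n: 0.0 for n in nodes}
        for src, targets in weights.items():
            for tgt, w in targets.items():
                net[tgt] += w * mu[src]
        new_mu = {}
        for n in nodes:
            if n in seed_set:
                new_mu[n] = 1.0
            else:
                # Sigmoid transfer function with threshold theta_j and slope theta_0.
                new_mu[n] = 1.0 / (1.0 + math.exp(-(net[n] - theta_j) / theta_0))
        # Stopping condition: total change in activation is small enough.
        change = sum(abs(new_mu[n] - mu[n]) for n in nodes)
        mu = new_mu
        if change <= epsilon:
            break
    return mu


# Toy network with invented weights.
toy_weights = {"a": {"b": 0.9, "c": 0.1}, "b": {"d": 0.8}}
toy_activation = hopfield_activation(toy_weights, ["a"])
```

In this toy run, strongly linked concepts ("b", then "d" through "b") end with high activation while the weakly linked "c" stays near zero, which is the behavior a consultation system would use to rank suggestions.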
User Evaluation
• 3 Subjects, 6 Tasks, 3 Phases
• Phase 1: Identify subject areas
• Phase 2: Find other topics using spreading activation & manual browsing
• Phase 3: Document evaluation
Findings: Concepts
• Manual browsing achieved higher recall but lower term precision than the algorithmic systems.
• Manual browsing was also a much more laborious and cognitively demanding process.
• When using the algorithms, subjects reviewed the suggested terms more slowly and treated them more seriously and carefully than when performing manual browsing.
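For readers unfamiliar with the measures, term recall and precision can be computed as below, assuming the standard set-based definitions (the slides do not spell out the exact formulas used in the study). The function name and the example terms are hypothetical and are not data from the experiment.

```python
def recall_precision(suggested, relevant):
    """Return (recall, precision) of a suggested term set against a relevant term set."""
    suggested, relevant = set(suggested), set(relevant)
    hits = suggested & relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(suggested) if suggested else 0.0
    return recall, precision


# Invented example: 2 of 3 relevant terms found among 4 suggestions.
example_recall, example_precision = recall_precision(
    ["thesaurus", "indexing", "parsing", "hashing"],
    ["thesaurus", "indexing", "relevance feedback"],
)
```

A method that suggests many terms tends to raise recall at the cost of precision, which is the trade-off reported between manual browsing and the algorithms.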
Findings: Documents
• No significant differences (in document recall and precision) were observed between the relevant documents suggested by the algorithms and those generated via the manual browsing process.
• Each approach could contribute to a larger set of relevant documents for users.
• The essential differences between the two approaches were the time spent and the cognitive effort required.
Publications
• Chen, H., Lynch, K. J., Basu, K., and Ng, T. D. “Generating, Integrating, and Activating Thesauri for Concept-Based Document Retrieval,” IEEE Expert, Special Series on Artificial Intelligence in Text-Based Information Systems 8(2):25-34 (1993).
• Chen, H. and Ng, T. D. “An Algorithmic Approach to Concept Exploration in a Large Knowledge Network (Automatic Thesaurus Consultation): Symbolic Branch-and-bound Search vs. Connectionist Hopfield Net Activation,” Journal of the American Society for Information Science 46(5):348-369 (1995).
Concept Space Generation
• Automatic Generation of Large-scale Concept Spaces
• Feasibility and Scalability Issues of Large-scale Concept Space Generation
– Domain Knowledge
– Computing Resources
Research Question 1
• With regard to computing scalability, would automatic concept space generation be applicable to very large textual databases?
Research Question 2
• With regard to domain-specific knowledge scalability, would automatic concept space generation create satisfactory domain-specific concept associations from the corresponding textual databases?
Research Question 3
• How does the quality of concept associations in a concept space generated from very large textual databases compare with that of a man-made domain-specific thesaurus?
Concept Space Techniques
• Document & Object List Collection
• Object Filtering
• Automatic Indexing
• Co-occurrence Analysis
• Parallel Supercomputing to Laptop Computing
• Large to Small Collections
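The co-occurrence analysis step can be illustrated with a small asymmetric weighting function: the association from term Tj to term Tk grows with how often the two appear in the same documents, relative to how often Tj appears at all, and is penalized when Tk is a very general term. The sketch below is only loosely modeled on that idea. The smoothed inverse-document-frequency form, the penalty factor, the function name, and the toy documents are all assumptions, and the input is assumed to be already indexed and filtered.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def cooccurrence_weights(docs):
    """Compute asymmetric co-occurrence weights between terms.

    docs: list of term lists, one per document (already indexed and filtered)
    Returns a dict mapping (tj, tk) -> weight of the association from tj to tk.
    """
    n = len(docs)
    tfs = [Counter(doc) for doc in docs]
    df = Counter(term for tf in tfs for term in tf)
    pair_df = Counter(pair for tf in tfs for pair in permutations(tf, 2))

    def idf(doc_count):
        # Smoothed inverse document frequency (an assumption, avoids zero weights).
        return math.log10(1 + n / doc_count)

    num = defaultdict(float)  # combined weight of tj and tk occurring together
    den = defaultdict(float)  # combined weight of tj on its own
    for tf in tfs:
        for tj in tf:
            den[tj] += tf[tj] * idf(df[tj])
        for tj, tk in permutations(tf, 2):
            num[(tj, tk)] += min(tf[tj], tf[tk]) * idf(pair_df[(tj, tk)])

    max_idf = idf(1)
    # The final factor penalizes very common target terms.
    return {
        (tj, tk): num[(tj, tk)] / den[tj] * idf(df[tk]) / max_idf
        for (tj, tk) in num
    }


# Toy collection with invented terms.
toy_docs = [["gene", "protein", "gene"], ["gene", "worm"], ["protein", "gene"]]
toy_weights_cooc = cooccurrence_weights(toy_docs)
```

Because the denominator depends on the source term and the penalty on the target term, the weight from "protein" to "gene" differs from the weight from "gene" to "protein", which is what makes the resulting network directional.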
User Evaluation
• 10 Subjects, 23 Tasks
• Recall & Recognition Phases
• Findings:
– Concept space has higher concept recall
– INSPEC thesaurus has higher concept precision
– Concept space complements man-made thesaurus
Publications
• Chen, H., Schatz, B.R., Ng, T.D., Martinez, J., Kirchhoff, A., and Lin, C. “A Parallel Computing Approach to Creating Engineering Concept Spaces for Semantic Retrieval: The Illinois Digital Library Initiative Project,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Special Section on Digital Libraries: Representation and Retrieval 18(8): 771-782 (1996).
• Chen, H., Martinez, J., Ng, T. D., and Schatz, B. “A Concept Space Approach to Addressing the Vocabulary Problem in Scientific Information Retrieval: An Experiment on the Worm Community Systems,” Journal of the American Society for Information Science 48(1):17-31 (1997).
• Houston, A. L., Chen, H., Hubbard, S. M., Schatz, B. R., Ng, T. D., Sewell, R. R., and Tolle, K. M. “Medical Data Mining on the Internet: Research on a Cancer Information System,” Artificial Intelligence Review 13(5/6):437-466 (1999).
Corpuses & Applications
• INSPEC, CSQuest: http://ai.bpa.arizona.edu/cgi-bin/mcsquest
• CancerLit, Cancer Space: http://ai20.bpa.arizona.edu/cgi-bin/cancerlit/cn
• Webpages, ET-Space: http://ai.bpa.arizona.edu/cgi-bin/tng/ETSpace
• GeoRef & Petroleum Abstracts, GIS Space: http://ai10.bpa.arizona.edu/gis/
• Law Enforcement, COPLINK Concept Space
• DARPA ITO Project Summary Collection: http://ai6.bpa.arizona.edu/cgi-bin/tng/Psum
• CNN News: http://processc.inf.cs.cmu.edu/tng/inf/
Conclusions
• Context-specific Concept Space Consultation
• Concept Space As Semantic Exchange Medium
Lessons Learned
• Both concept space consultation and generation work
• “Strategic” use of knowledge sources
• Concept Space Technique is scalable conceptually and computationally
• Insight into potentially retrieved documents
Future Directions
• Performing Summarization
• Semantic Protocol for Machine Comm.
• Multimedia Concept Association
• Context Analysis with
– Metric Clusters: “distance” information
– Scalar Clusters: neighboring concepts of two target concepts to compute their similarity
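The two cluster ideas can be sketched as follows, assuming the common textbook formulations: a metric correlation that sums inverse word distances between occurrences of two terms in a document, and a scalar similarity that takes the cosine between the two terms' neighborhood vectors. Since this is listed as future work, nothing here reflects an existing implementation; the function names, data layouts, and toy inputs are hypothetical.

```python
import math


def metric_correlation(tokens, u, v):
    """Sum of inverse word distances between occurrences of u and v in one document."""
    pos_u = [i for i, t in enumerate(tokens) if t == u]
    pos_v = [i for i, t in enumerate(tokens) if t == v]
    return sum(1.0 / abs(i - j) for i in pos_u for j in pos_v if i != j)


def scalar_similarity(assoc, u, v):
    """Cosine similarity between the neighborhood vectors of concepts u and v.

    assoc: dict mapping concept -> {neighbor: association weight}
    """
    nu, nv = assoc.get(u, {}), assoc.get(v, {})
    dot = sum(w * nv[c] for c, w in nu.items() if c in nv)
    norm_u = math.sqrt(sum(w * w for w in nu.values()))
    norm_v = math.sqrt(sum(w * w for w in nv.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy inputs with invented terms and weights.
toy_tokens = ["a", "x", "b", "a"]
toy_assoc = {
    "u": {"x": 1.0, "y": 1.0},
    "v": {"x": 1.0, "y": 1.0},
    "w": {"z": 1.0},
}
toy_metric = metric_correlation(toy_tokens, "a", "b")
toy_same = scalar_similarity(toy_assoc, "u", "v")
toy_disjoint = scalar_similarity(toy_assoc, "u", "w")
```

Two concepts that never co-occur can still score high on the scalar measure if they share the same neighbors, which is the contextual signal the slide points to.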