Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
-
Upload
noel-thomas -
Category
Documents
-
view
212 -
download
0
Transcript of Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
![Page 1: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/1.jpg)
Xiaoying Gao
Computer Science
Victoria University of Wellington
Intelligent AgentsCOMP 423
![Page 2: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/2.jpg)
Next lecture
• Paper: Personalized Mobile search engine
• Some concepts• Ontology
• Cyc, WordNet, ODP(Open directory project)
• Entropy• disorder, uncertainty• the smaller, the better
• Ranking SVM (Support Vector Machines)• Supervised machine learning
• K-means clustering
![Page 3: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/3.jpg)
Review on information retrieval
• Text representation• Find all the unique words/terms in all documents,
• each term is a feature• Each document is represented as a vector
• Weight for each term• Term frequency: for one particular document• Inversed document frequency : the same for all
documents• Query is also a vector
• Comparison with Query• The similarity between query vector and each document vector
• Cosine similarity • or other distance measure (Euclidean distance)
• Ranking• The documents with higher similarity are more relevant to query• PageRank
![Page 4: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/4.jpg)
Evaluation
• Precision and RecallFor a query, If a system finds 200 results, among them 50 are relevant. The human labels ( model solutions) have 120 relevant documents.
•Precision and recall curve
• AUC: Area under curve
• Top N precision
• ARR: the average rank of the documents rated as “relevant”
![Page 5: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/5.jpg)
Personalised search
Personalised information retrieval
A related area is called Adaptive Hypermedia
![Page 6: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/6.jpg)
Information gathering
• Information gathering approach• Implicit, Explicit, Both
• Type of information• Queries, clicked documents, snippets of documents• Cashed web pages, dwell time on page, desktop
documents• Email, calendar items• Tags and bookmarks on online social applications• User supplied information• User’s categorical interests
• Source of information• Server side, Client –side, user intervention
![Page 7: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/7.jpg)
Information representation
• User model• Short-term interests, long-term interests• Static, dynamic, periodic• Terms or conceptual terms (use wordnet, ontology)• Vector-based
• Models where user’s interests are maintained in a vector of weighted keywords (concepts).
• Semantic network based• Models where user’s interests are maintained in a network
structure of terms and related terms (concepts and related concepts)
![Page 8: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/8.jpg)
Two directions• Query adaptation or result adaptation
• Query expansion/adaptation• implicit
individualised
User model
Aggregate
Usage information (search logs)
Not user-focused
Pseudo-relevance feedback
Global analysis• explicit
individual, relevance feedback, interactive query expansion
![Page 9: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/9.jpg)
Search results filtering/adaptation
• Different applications: individual, aggregate, web search or recommendation, databases search
• Typically use supervised machine learning• Relevant, not relevant: binary classification• Training data:
• Labeled data• Assume clicked docs are relevant
• Machine learning methods• KNN: K nearest neighbour• Naïve Bayes• SVM: Support vector machines• …
![Page 10: Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.](https://reader036.fdocuments.in/reader036/viewer/2022081908/56649e9f5503460f94ba0bb9/html5/thumbnails/10.jpg)
Evaluation
• In lab setting
10-500 users
• Quantitative & Qualitative
• System performance
• User evaluation, system usability
• Data sets
open web corpora, in-lab generated logs,
TREC collection, search engine query logs
subset of annotated documents from specific sites