INFORMATION RETRIEVAL: VECTOR SPACE MODEL IN-DEPTH, PART 2
Thomas Tiahrt, MA, PhD
CSC492 Advanced Text Analytics
Slide 3
Inverse Document Frequency (IDF)
Slide 4
Inverse Document Frequency
Slide 5
Slide 6
Document/Term Matrix
Slide 7
Weight Factor Computation
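As a rough illustration of how term weights might be computed over a document/term matrix, here is a minimal sketch. It assumes the common textbook weighting (raw term frequency multiplied by log10(N / df), as in Manning, Raghavan and Schütze); the exact formula used in this course may differ. The toy corpus and the function names idf and tf_idf_matrix are hypothetical.

```python
import math

# Hypothetical toy corpus: each document is a list of already-tokenized terms.
docs = [
    ["vector", "space", "model"],
    ["vector", "retrieval"],
    ["space", "space", "travel"],
]

def idf(term, docs):
    """Inverse document frequency, assumed here to be log10(N / df)."""
    df = sum(1 for d in docs if term in d)  # number of documents containing term
    return math.log10(len(docs) / df) if df else 0.0

def tf_idf_matrix(docs):
    """Build a document/term matrix whose cells are tf * idf weights."""
    terms = sorted({t for d in docs for t in d})  # one column per distinct term
    matrix = [[d.count(t) * idf(t, docs) for t in terms] for d in docs]
    return terms, matrix

terms, weights = tf_idf_matrix(docs)
for row in weights:
    print([round(w, 3) for w in row])
```

Under this weighting, a term that occurs in few documents (such as "travel" here) receives a larger weight than one that occurs in many, and a term absent from a document gets weight zero.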
Slide 8
VSM Pros and Cons
Benefits
  Documents can be ordered by importance
  Threshold display limits are easy to honor
  Documents similar to the query retrieved early can be used for relevance feedback
Drawbacks
  Orthogonal terms assumption is false
  Some vector operations have no theoretical justification
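The first two benefits can be sketched in a few lines: documents are scored against the query, sorted by score, and cut off at a display limit. This is only an illustration that assumes cosine similarity as the ranking measure; the vectors, the document names, and the functions cosine and rank are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_u * norm_v)

def rank(query_vec, doc_vecs, limit):
    """Return up to `limit` (name, score) pairs, best match first."""
    scored = [(cosine(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # score descending, then name
    return [(name, score) for score, name in scored[:limit]]

# Hypothetical weight vectors over three terms.
doc_vecs = {
    "d1": [1.0, 1.0, 0.0],
    "d2": [0.0, 1.0, 1.0],
    "d3": [1.0, 0.0, 0.0],
}
query_vec = [1.0, 0.0, 0.0]

top = rank(query_vec, doc_vecs, limit=2)
print(top)
```

The dot product in cosine treats every term as its own independent axis, which is where the orthogonal terms assumption listed under the drawbacks enters.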
Slide 9
References
Sources:
Introduction to Information Retrieval, by Christopher Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press
Automatic Text Processing, by Gerard Salton, Addison-Wesley Publishing
Slide 10
This is the end of the second in-depth slide show on the vector space model.
End of the Slides