© 2005 IBM Corporation Discovering Large Dense Subgraphs in Massive Graphs David Gibson IBM Almaden Research Center Ravi Kumar Yahoo! Research* Andrew.
1 Near Duplicate Detection Slides adapted from –Information Retrieval and Web Search, Stanford University, Christopher Manning and Prabhakar Raghavan –CS345A,
Finding Similar Items. Set Similarity Problem: Find similar sets. Motivation: Many things can be modeled/represented as sets Applications: –Face Recognition.
Finding Similar Items
Web Search