What's Really New on the Web? Identifying New Pages from a Series of Unstable Web Snapshots. Masashi Toyoda and Masaru Kitsuregawa, IIS, University of Tokyo.
Addressing Incompleteness and Noise in Evolving Web Snapshots. KJDB2007. Masashi Toyoda, IIS, University of Tokyo.
Crawling The Web For a Search Engine Or Why Crawling is Cool.