Efficient Interactive Fuzzy Keyword Search
![Page 1: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/1.jpg)
Shengyue Ji, Guoliang Li, Jianhua Feng, Chen Li
University of California, Irvine
WWW 2009
1 Dec 2011, Presentation @ IDB Lab. Seminar
Presented by Jee-bum Park
![Page 2: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/2.jpg)
Outline
– Introduction
– Indexing Methods
– Single Keyword
– Multiple Keywords
– Experiments
– Conclusions
![Page 3: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/3.jpg)
Introduction
http://searchenginewatch.com/article/2128218/Google-Searchers-Use-Autocomplete-Most-Ignore-Google-Instant-Eye-Tracking-Study
![Page 12: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/12.jpg)
Introduction
– A typical directory-search form
![Page 13: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/13.jpg)
Introduction
– Interactive fuzzy search
![Page 14: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/14.jpg)
Introduction
“Interactive, fuzzy search”
– Interactive: the system searches for the best answers on the fly as the user types in a keyword query
– Fuzzy: the system tries to find relevant records that include words similar to the keywords in the query, even if they do not match exactly
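A common way to make "similar" precise is edit distance, which the single-keyword example in this talk also uses as its threshold. The following is a minimal Python sketch, assuming the standard Levenshtein definition; the function name `edit_distance` is illustrative and does not come from the slides.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    # prev[j] holds the distance between the processed part of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

With a threshold of 2, for instance, a mistyped query such as "nlis" would still count as similar to the keyword "luis", since two substitutions turn one into the other.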
![Page 16: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/16.jpg)
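Indexing Methods
List
– Prefix query against an inverted index (keyword → record IDs)

| Keyword | Record IDs |
| --- | --- |
| li | 1 |
| lin | 3, 4 |
| liu | 5 |
| lu | 4 |
| luis | 7 |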
![Page 17: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/17.jpg)
Indexing Methods
List
– Typed “li”: the keywords li, lin, and liu have this prefix, so the answer is records 1, 3, 4, 5
![Page 18: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/18.jpg)
Indexing Methods
List
– Typed “lu”: the keywords lu and luis have this prefix, so the answer is records 4, 7
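The list-based lookup on these slides can be sketched in Python as follows. This is only an illustration under the assumption that the keyword list is kept sorted, so that all keywords sharing a prefix are contiguous; the names `INVERTED_INDEX`, `KEYWORDS`, and `prefix_search` are hypothetical.

```python
from bisect import bisect_left

# Keyword -> record IDs, copied from the inverted index on the slide.
INVERTED_INDEX = {
    "li": [1],
    "lin": [3, 4],
    "liu": [5],
    "lu": [4],
    "luis": [7],
}
KEYWORDS = sorted(INVERTED_INDEX)  # assumption: the list is kept sorted


def prefix_search(prefix: str) -> list[int]:
    """Return the sorted record IDs of all keywords starting with prefix."""
    records = set()
    # Binary search for the first keyword that is >= prefix.
    start = bisect_left(KEYWORDS, prefix)
    for keyword in KEYWORDS[start:]:
        if not keyword.startswith(prefix):
            break  # sorted order: no later keyword can match
        records.update(INVERTED_INDEX[keyword])
    return sorted(records)
```

Calling `prefix_search("li")` gathers the lists of li, lin, and liu, and `prefix_search("lu")` gathers those of lu and luis.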
![Page 19: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/19.jpg)
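Indexing Methods
Trie (node ID: character, followed by the record IDs of the keyword that ends at that node)
0: \0
└─ 10: l
   ├─ 11: i (records 1)
   │  ├─ 12: n (records 3, 4)
   │  └─ 13: u (records 5)
   └─ 14: u (records 4)
      └─ 15: i
         └─ 16: s (records 7)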
![Page 20: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/20.jpg)
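Indexing Methods
Trie
– Typed “li”: follow l (node 10) and then i (node 11) from the root; every keyword in the subtrie of node 11 (li, lin, liu) has this prefix, so the answer is records 1, 3, 4, 5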
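The trie lookup for a typed prefix can be sketched as below. This is a minimal illustration, not the authors' implementation: the class `TrieNode` and the functions `build_trie` and `trie_prefix_search` are hypothetical names, and the keyword data is copied from the inverted index shown earlier.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.records = []   # record IDs of the keyword ending at this node


def build_trie(index):
    """Build a trie from a keyword -> record IDs mapping."""
    root = TrieNode()
    for keyword, records in index.items():
        node = root
        for ch in keyword:
            node = node.children.setdefault(ch, TrieNode())
        node.records = list(records)
    return root


def trie_prefix_search(root, prefix):
    """Walk down the trie along prefix, then collect every record ID
    stored in the subtrie below the node that was reached."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return []  # no keyword has this prefix
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        found.update(current.records)
        stack.extend(current.children.values())
    return sorted(found)


INDEX = {"li": [1], "lin": [3, 4], "liu": [5], "lu": [4], "luis": [7]}
ROOT = build_trie(INDEX)
```

Compared with scanning a list, the walk touches one node per typed character before the subtrie is collected, which is what makes the structure attractive for keystroke-by-keystroke queries.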
![Page 28: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/28.jpg)
Single Keyword
![Page 29: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/29.jpg)
Single Keyword
Example
– Query = “nlis”, edit distance threshold = 2
– Trie nodes: 0 (root), 10 (l), 11 (li), 12 (lin), 13 (liu), 14 (lu), 15 (lui), 16 (luis)
– Legend: nodes are marked by edit distance 0, 1, or 2
![Page 30: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/30.jpg)
Single Keyword
Initial state: “”
– Query = “nlis”, edit distance threshold = 2
– Each entry <n, d> is an active trie node n together with its edit distance d to the text typed so far
– Φε = { <0,0>, <10,1>, <11,2>, <14,2> }
![Page 32: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/32.jpg)
Single Keyword
Typed: “n”
– Query = “nlis”, edit distance threshold = 2

| Φε | Delete | Substitute | Match | Insert |
| --- | --- | --- | --- | --- |
| <0,0> | <0,1> | <10,1> | | |
| <10,1> | <10,2> | <11,2>, <14,2> | | |
| <11,2> | | | <12,2> | |
| <14,2> | | | | |

Φn = { <0,1>, <10,1>, <11,2>, <12,2>, <14,2> }
![Page 35: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/35.jpg)
Single Keyword
Typed: “nl”
– Query = “nlis”, edit distance threshold = 2

| Φn | Delete | Substitute | Match | Insert |
| --- | --- | --- | --- | --- |
| <0,1> | <0,2> | | <10,1> | <11,2>, <14,2> |
| <10,1> | <10,2> | <11,2>, <14,2> | | |
| <11,2> | | | | |
| <12,2> | | | | |
| <14,2> | | | | |

Φnl = { <10,1>, <0,2>, <11,2>, <14,2> }
![Page 38: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/38.jpg)
Single Keyword
Typed: “nli”
– Query = “nlis”, edit distance threshold = 2

| Φnl | Delete | Substitute | Match | Insert |
| --- | --- | --- | --- | --- |
| <10,1> | <10,2> | <14,2> | <11,1> | <12,2>, <13,2> |
| <0,2> | | | | |
| <11,2> | | | | |
| <14,2> | | | <15,2> | |

Φnli = { <11,1>, <10,2>, <12,2>, <13,2>, <14,2>, <15,2> }
![Page 41: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/41.jpg)
Single Keyword
Typed: “nlis”
– Query = “nlis”, edit distance threshold = 2

| Φnli | Delete | Substitute | Match | Insert |
| --- | --- | --- | --- | --- |
| <11,1> | <11,2> | <12,2>, <13,2> | | |
| <10,2> | | | | |
| <12,2> | | | | |
| <13,2> | | | | |
| <14,2> | | | | |
| <15,2> | | | <16,2> | |

Φnlis = { <11,2>, <12,2>, <13,2>, <16,2> }
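The keystroke-by-keystroke walk-through above can be reproduced with a short Python sketch. It follows the four columns used on the slides (Delete, Substitute, Match, Insert) and keeps the smallest distance per node. The names `CHILDREN`, `descendants`, `initial_active_nodes`, `next_active_nodes`, and `active_node_trace` are hypothetical, and the code is written from the slide example rather than taken from the authors' implementation.

```python
# Trie from the slides: node ID -> {character: child node ID}.
CHILDREN = {
    0: {"l": 10},
    10: {"i": 11, "u": 14},
    11: {"n": 12, "u": 13},
    12: {},
    13: {},
    14: {"i": 15},
    15: {"s": 16},
    16: {},
}


def descendants(node, max_depth):
    """Yield (descendant ID, depth below node) for depths 1..max_depth."""
    frontier = [(node, 0)]
    while frontier:
        current, depth = frontier.pop()
        if depth == max_depth:
            continue
        for child in CHILDREN[current].values():
            yield child, depth + 1
            frontier.append((child, depth + 1))


def initial_active_nodes(tau, root=0):
    """Active nodes for the empty query: every node within depth tau."""
    active = {root: 0}
    for node, depth in descendants(root, tau):
        active[node] = depth
    return active


def next_active_nodes(active, ch, tau):
    """Compute the active nodes after typing ch from the previous set."""
    new = {}

    def add(node, dist):
        # Keep a node only within the threshold, with its smallest distance.
        if dist <= tau and dist < new.get(node, tau + 1):
            new[node] = dist

    for node, dist in active.items():
        add(node, dist + 1)                  # Delete: ch is dropped
        for label, child in CHILDREN[node].items():
            if label != ch:
                add(child, dist + 1)         # Substitute: ch replaces label
            else:
                add(child, dist)             # Match
                for desc, depth in descendants(child, tau - dist):
                    add(desc, dist + depth)  # Insert: extra trie characters
    return new


def active_node_trace(query, tau):
    """Return the list of active-node sets for each prefix of query."""
    trace = [initial_active_nodes(tau)]
    for ch in query:
        trace.append(next_active_nodes(trace[-1], ch, tau))
    return trace
```

Running `active_node_trace("nlis", 2)` yields five dictionaries (node ID → edit distance) that correspond to Φε, Φn, Φnl, Φnli, and Φnlis on the slides. The final set contains nodes 11, 12, 13, and 16, that is, the prefixes li, lin, liu, and luis.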
![Page 44: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/44.jpg)
Multiple Keywords
Challenges in multiple keywords
– Intersection of multiple lists of keywords
  – Each prefix query keyword has multiple predicted complete keywords
  – The union of the lists of predicted keywords includes potential answers
  – The union lists of multiple query keywords need to be intersected in order to compute the answers to the query
– Cache-based incremental intersection
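The union-then-intersect step described above can be sketched in Python. To keep the example short it assumes exact prefix matching (no fuzziness) and no caching; the sample records are borrowed from the document table on the HYB slide, and the names `build_inverted_index`, `union_list`, and `search` are hypothetical.

```python
from collections import defaultdict

# Sample records (ID -> content), borrowed from the HYB slide.
DOCS = {
    21: "apple iphone",
    33: "php programming",
    64: "apple juice",
    91: "iphone programming",
    172: "iphone galaxy tab",
    308: "application iphone",
    759: "difference ipv4 ipv6",
}


def build_inverted_index(docs):
    """Map every word to the set of record IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)
    return index


def union_list(index, prefix):
    """Union of the inverted lists of every keyword predicted by prefix."""
    result = set()
    for word, ids in index.items():
        if word.startswith(prefix):
            result |= ids
    return result


def search(index, query):
    """Intersect the union lists of all prefixes in the query."""
    unions = [union_list(index, prefix) for prefix in query.split()]
    if not unions:
        return []
    unions.sort(key=len)  # start from the shortest list
    answer = set(unions[0])
    for ids in unions[1:]:
        answer &= ids
    return sorted(answer)


INDEX = build_inverted_index(DOCS)
```

For the query "ap ip", the prefix "ap" predicts apple and application and the prefix "ip" predicts iphone, ipv4, and ipv6; only records 21 and 308 appear in both union lists.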
![Page 45: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/45.jpg)
Multiple Keywords
HYB (H. Bast, I. Weber. Type Less, Find More: Fast Autocompletion Search with a Succinct Index. In SIGIR 2006)
The intersections can be computed in
The union can be computed in
Total time complexity

| D.id | D.content |
| --- | --- |
| 21 | apple iphone |
| 33 | php programming |
| 64 | apple juice |
| 91 | iphone programming |
| 172 | iphone galaxy tab |
| 308 | application iphone |
| 759 | difference ipv4 ipv6 |

| W | New Data Structure (HYB) |
| --- | --- |
| ip | 5, 3000, 5123, ... |
| iph | 900(iph), 1000, ... |
| ipho | 950(ipho) |
| iphon | NULL |
| iphone | 1, 5, 21, 91, 172, 300, 308, 3000, 3001, ... |
| ipv | 5(ipv), 6, 1100, 1200, ... |
| ipv4 | 759(ipv4), 760, ... |
| ipv6 | 400, 759(ipv6), 800(ipv6), ... |
| juice | 64, 128, 256, 900(juice), 950(juice), ... |
| tab | 5(tab), 172, 272, 800(tab), ... |

W’ = { iphone, ipv4, ipv6 }
D ∩ Dw = D’ = { 21, 172, 308, 759 }
![Page 46: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/46.jpg)
Multiple Keywords
– Forward lists
![Page 48: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/48.jpg)
Experiments
DBLP
– It included about one million computer science publication records
  – Authors, title, conference or journal name, year, page numbers, URL
MEDLINE
– It had about 4 million latest publication records related to life sciences and biomedical information
  – Authors, their affiliations, article title, journal name, journal issue
![Page 49: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/49.jpg)
Experiments
– Computing prefixes similar to a keyword
![Page 50: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/50.jpg)
Experiments
– List intersection of multiple keywords
![Page 51: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/51.jpg)
Experiments
– Scalability (MEDLINE)
![Page 53: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/53.jpg)
Conclusions
– The authors proposed an efficient incremental algorithm to answer single-keyword fuzzy queries
– They studied various algorithms for computing the answers to a query with multiple keywords that are treated as fuzzy, prefix conditions
![Page 54: Efficient Interactive Fuzzy Keyword Search](https://reader035.fdocuments.in/reader035/viewer/2022062501/56816196550346895dd141b4/html5/thumbnails/54.jpg)
Thank You! Any Questions or Comments?