Multimedia Databases Text - part I Slides by C. Faloutsos, CMU.
Multimedia Databases
Text - part I
Slides by C. Faloutsos, CMU
Multi DB and D.M. Copyright: C. Faloutsos (2001) 2
Outline
Goal: ‘Find similar / interesting things’
• Intro to DB
• Indexing - similarity search
• Data Mining
Indexing - Detailed outline
• primary key indexing
• secondary key / multi-key indexing
• spatial access methods
• fractals
• text
• multimedia
• ...
Text - Detailed outline
• text
  – problem
  – full text scanning
  – inversion
  – signature files
  – clustering
  – information filtering and LSI
Problem - Motivation
• E.g., find documents containing “data”, “retrieval”
• Applications:
  – Web
  – law + patent offices
  – digital libraries
  – information filtering
Problem - Motivation
• Types of queries:
  – boolean (‘data’ AND ‘retrieval’ AND NOT ...)
  – additional features (‘data’ ADJACENT ‘retrieval’)
  – keyword queries (‘data’, ‘retrieval’)
• How to search a large collection of documents?
Full-text scanning
• Build an FSA (finite-state automaton); scan
[Figure: FSA with transitions labeled c, a, t]
Full-text scanning
• for single term:
  – (naive: O(N*M), for a text of length N and a pattern of length M)
  – Knuth, Morris and Pratt (‘77)
    • build a small FSA; visit every text letter once only, by carefully shifting more than one step
ABRACADABRA   text
CAB           pattern
Full-text scanning
ABRACADABRA   text
CAB           pattern
[Figure: successive alignments of the pattern CAB along the text]
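The Knuth, Morris and Pratt idea of reading every text letter only once can be sketched as follows. This is a minimal illustration rather than the slides' own code: the names `build_failure` and `kmp_search` are hypothetical, and the "small FSA" is represented here by the usual failure table.

```python
def build_failure(pattern):
    """failure[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it (the 'small FSA' in table form)."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text, pattern):
    """Return the start positions of all occurrences of pattern in text."""
    if not pattern:
        return []
    failure = build_failure(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]  # shift the pattern; the text index never moves back
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]
    return matches
```

For the running example, `kmp_search("ABRACADABRA", "CAB")` finds no occurrence, while `kmp_search("ABRACADABRA", "ABRA")` reports positions 0 and 7.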
Full-text scanning
• for single term:
  – (naive: O(N*M))
  – Knuth, Morris and Pratt (‘77)
  – Boyer and Moore (‘77)
    • preprocess pattern; start from right to left & skip!
ABRACADABRA   text
CAB           pattern
Full-text scanning
ABRACADABRA   text
CAB           pattern
[Figure: successive alignments of the pattern CAB along the text]
Full-text scanning
ABRACADABRA   text
OMINOUS       pattern
[Figure: the pattern OMINOUS shifted along the text]
Boyer+Moore: fastest, in practice
Sunday (‘90): some improvements
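The "start from right to left & skip" idea can be sketched with the Horspool simplification of Boyer+Moore, which keeps only a bad-character shift table. This is not the full Boyer+Moore algorithm nor Sunday's variant, and the function name `horspool_search` is hypothetical.

```python
def horspool_search(text, pattern):
    """Return the first index of pattern in text, or -1 if absent."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Preprocess the pattern: for each character except the last one,
    # record its distance from the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        j = m - 1
        # Compare right to left within the current window.
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            return pos
        # Skip according to the text character under the window's last cell;
        # characters absent from the pattern allow a full-length skip.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

With the text ABRACADABRA and the pattern OMINOUS, the first comparison sees the letter D under the last pattern cell; D does not occur in the pattern, so the window jumps by the full pattern length and the search ends after a single character comparison.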
Full-text scanning
• For multiple terms (w/o “don’t care” characters): Aho+Corasick (‘75)
  – again, build a simplified FSA in O(M) time
• Probabilistic algorithms: ‘fingerprints’ (Karp + Rabin ‘87)
• approximate match: ‘agrep’ [Wu+Manber, Baeza-Yates+, ‘92]
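The ‘fingerprints’ idea of Karp + Rabin can be sketched with a rolling hash: compare a fingerprint of the pattern against the fingerprint of each text window, and verify only when they agree. The function name, the `base` and the fixed `modulus` below are illustrative assumptions; a probabilistic version would typically draw the modulus at random.

```python
def karp_rabin_search(text, pattern, base=256, modulus=1_000_000_007):
    """Return the start positions where pattern occurs in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, modulus)  # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus
    matches = []
    for start in range(n - m + 1):
        # Equal fingerprints only mean 'maybe': verify to rule out collisions.
        if p_hash == t_hash and text[start:start + m] == pattern:
            matches.append(start)
        if start + m < n:
            # Roll the window: drop text[start], append text[start + m].
            t_hash = ((t_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % modulus
    return matches
```

Each window's fingerprint is updated in constant time, so the scan stays linear in the text length as long as fingerprint collisions are rare.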
Full-text scanning
• Approximate matching - string editing distance:
  d(‘survey’, ‘surgery’) = 2 = min # of insertions, deletions, substitutions to transform the first string into the second
  (e.g., SURVEY -> SURGEY -> SURGERY: substitute V by G, then insert R)
Full-text scanning
• string editing distance - how to compute?
• A: dynamic programming
  cost(i, j) = cost to match prefix of length i of first string s with prefix of length j of second string t
Full-text scanning
if s[i] = t[j] then
    cost(i, j) = cost(i-1, j-1)
else
    cost(i, j) = min(
        1 + cost(i, j-1),    // insertion
        1 + cost(i-1, j-1),  // substitution
        1 + cost(i-1, j)     // deletion
    )
base cases: cost(i, 0) = i and cost(0, j) = j
Full-text scanning
Complexity: O(M*N) (when using a matrix to ‘memoize’ partial results)
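The recurrence and the matrix of partial results can be sketched as below. The function name `edit_distance` is hypothetical, and the base cases (an empty prefix costs as many edits as the other prefix is long) are the standard ones, stated here as an assumption since the slide's own table setup is not shown.

```python
def edit_distance(s, t):
    """String editing distance between s and t via dynamic programming."""
    rows, cols = len(s) + 1, len(t) + 1
    # cost[i][j] = distance between the first i letters of s
    # and the first j letters of t
    cost = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        cost[i][0] = i  # delete all i letters
    for j in range(cols):
        cost[0][j] = j  # insert all j letters
    for i in range(1, rows):
        for j in range(1, cols):
            if s[i - 1] == t[j - 1]:
                cost[i][j] = cost[i - 1][j - 1]
            else:
                cost[i][j] = 1 + min(
                    cost[i][j - 1],      # insertion
                    cost[i - 1][j - 1],  # substitution
                    cost[i - 1][j],      # deletion
                )
    return cost[rows - 1][cols - 1]
```

Each of the (len(s)+1) x (len(t)+1) cells is filled once in constant time, which matches the stated complexity; `edit_distance("survey", "surgery")` evaluates to 2.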
Full-text scanning
Conclusions:
• Full text scanning needs no space overhead, but is slow for large datasets
Text - Inversion
Q: space overhead?
A: mainly, the postings lists
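A minimal sketch of inversion, assuming whitespace tokenization, no stemming, and a plain Python dict standing in for the dictionary structure. The names `build_inverted_index` and `boolean_and` are hypothetical. It shows why the postings lists dominate the space: every distinct (word, document) pair adds one entry.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """docs: dict mapping doc id -> text.
    Returns a dict mapping each word to its sorted postings list of doc ids."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            postings[word].add(doc_id)
    return {word: sorted(ids) for word, ids in postings.items()}


def boolean_and(index, *words):
    """Doc ids whose text contains every one of the given words."""
    lists = [set(index.get(w, ())) for w in words]
    return sorted(set.intersection(*lists)) if lists else []


docs = {
    1: "data retrieval systems",
    2: "data mining",
    3: "information retrieval",
}
index = build_inverted_index(docs)
print(index["data"])                          # postings list for 'data'
print(boolean_and(index, "data", "retrieval"))
```

A boolean query such as ‘data’ AND ‘retrieval’ then reduces to intersecting two postings lists, without touching the documents themselves.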
Text - Inversion
• how to organize dictionary?
  – B-tree, hashing, TRIEs, PATRICIA trees, ...
• stemming – Y/N?
• insertions?
Text – Inversion
• newer topics:
  • Parallelism [Tomasic+93]
  • Insertions [Tomasic+94], [Brown+]
    – ‘Zipf’ distributions
  • Approximate searching (‘glimpse’ [Wu+])
Text - Inversion
• postings list – more Zipf distr.: e.g., rank-frequency plot of ‘Bible’
[Figure: rank-frequency plot of the ‘Bible’; axes: log(rank), log(freq)]
freq ~ 1 / (rank * ln(1.78 V))   (V: vocabulary size)
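As a quick sanity check on the Zipf formula, the sketch below treats freq as a relative frequency and sums it over all V ranks; the function name and the choice V = 10,000 are illustrative assumptions. The factor ln(1.78 V) acts as the normalizing constant, so the sum should come out close to 1.

```python
import math


def zipf_frequency(rank, vocabulary_size):
    """Relative frequency of the word with the given rank (1 = most frequent)."""
    return 1.0 / (rank * math.log(1.78 * vocabulary_size))


V = 10_000  # assumed vocabulary size, for illustration only
total = sum(zipf_frequency(r, V) for r in range(1, V + 1))
print(f"sum of relative frequencies over {V} ranks: {total:.4f}")

# The skew that matters for postings lists: the top-ranked word is
# 100 times more frequent than the word of rank 100.
print(zipf_frequency(1, V) / zipf_frequency(100, V))
```

On a log-log plot this relation is a straight line of slope -1, which is the shape the rank-frequency plot illustrates.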
Text - Inversion
• postings lists
  – Cutting+Pedersen
    • (keep first 4 in B-tree leaves)
  – how to allocate space: [Faloutsos+92]
    • geometric progression
  – compression (Elias codes) [Zobel+] – down to 2% overhead!
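Postings-list compression with an Elias code can be sketched as follows. The slide does not say which Elias code or which gap scheme [Zobel+] use, so this example assumes the Elias gamma code applied to the gaps between consecutive document ids; all function names are hypothetical, and bits are kept as a string of '0'/'1' for readability.

```python
def gamma_encode(n):
    """Elias gamma code of a positive integer: (len - 1) zeros, then binary n."""
    if n < 1:
        raise ValueError("gamma codes are defined for n >= 1")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


def gamma_decode_all(bits):
    """Decode a concatenation of gamma codes back into a list of integers."""
    values, i = [], 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":  # count the unary length prefix
            zeros += 1
            i += 1
        values.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return values


def compress_postings(doc_ids):
    """Gamma-encode the gaps of a strictly increasing list of doc ids (>= 1)."""
    bits, previous = [], 0
    for doc_id in doc_ids:
        bits.append(gamma_encode(doc_id - previous))
        previous = doc_id
    return "".join(bits)


def decompress_postings(bits):
    """Rebuild the doc ids by summing the decoded gaps."""
    doc_ids, current = [], 0
    for gap in gamma_decode_all(bits):
        current += gap
        doc_ids.append(current)
    return doc_ids
```

Frequent words have long postings lists with small gaps, and small gaps get short codes, which is why gap coding pays off under a skewed (Zipf-like) distribution.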
Conclusions
• Inversion needs space overhead (2%-300%), but it is the fastest
Signature files
• idea: ‘quick & dirty’ filter
• then, do seq. scan on sign. file and discard ‘false alarms’
• Adv.: easy insertions; faster than seq. scan
• Disadv.: O(N) search (with small constant)
• Q: how to extract signatures?
Signature files
• A: superimposed coding!! [Mooers49], ...
m (=4 bits/word)
F (=12 bits sign. size)
Possible outcomes when a word is tested against a document signature:
• ‘data’: actual match
• ‘retrieval’: actual dismissal
• ‘nucleotic’: false alarm (‘false drop’)
‘YES’ is ‘MAYBE’; ‘NO’ is ‘NO’
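Superimposed coding can be sketched as follows, using the slide's values F = 12 and m = 4. How a word is mapped to its m bit positions is not specified in the slides; the SHA-256-based mapping and all function names below are assumptions made for illustration.

```python
import hashlib

F = 12  # signature size in bits (value from the slides)
M = 4   # bits set per word (value from the slides)


def word_signature(word, f=F, m=M):
    """Set m distinct bit positions out of f, chosen by hashing the word.
    Assumes m <= f. The hashing scheme is an illustrative assumption."""
    positions = set()
    counter = 0
    while len(positions) < m:
        digest = hashlib.sha256(f"{word}:{counter}".encode()).digest()
        positions.add(int.from_bytes(digest[:4], "big") % f)
        counter += 1
    signature = 0
    for p in positions:
        signature |= 1 << p
    return signature


def document_signature(words):
    """Superimpose (bitwise OR) the signatures of all words in the document."""
    signature = 0
    for w in words:
        signature |= word_signature(w)
    return signature


def maybe_contains(doc_signature, word):
    """True means 'maybe' (could be a false drop); False means 'definitely not'."""
    w = word_signature(word)
    return (doc_signature & w) == w
```

A word that is in the document always passes the test, because its bits were OR-ed into the document signature; a word that is not in the document passes only when all of its bits happen to be covered by other words, which is the false drop that a follow-up check on the actual text discards.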
Signature files
• Q1: How to choose F and m ?
• Q2: Why is it called ‘false drop’?
• Q3: other apps of signature files?
Signature files
• Q1: How to choose F and m ?
• A: so that doc. signature is 50% full
m (=4 bits/word)
F (=12 bits sign. size)
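The 50%-full rule can be turned into a small calculation. The sketch assumes that each of the D distinct words of a document sets m bit positions chosen independently and uniformly at random; the symbol D and the function names are not in the slides. Under that assumption the expected fraction of 1-bits is 1 - (1 - 1/F)^(m D), which equals one half when m is about F ln 2 / D.

```python
import math


def bits_per_word(signature_bits, distinct_words):
    """m that makes the document signature about half full: m = F ln 2 / D."""
    return max(1, round(signature_bits * math.log(2) / distinct_words))


def expected_fill(signature_bits, distinct_words, m):
    """Expected fraction of 1-bits after superimposing all word signatures."""
    return 1.0 - (1.0 - 1.0 / signature_bits) ** (m * distinct_words)


def false_drop_probability(signature_bits, distinct_words, m):
    """Chance that all m bits of an absent word are already set (approximate)."""
    return expected_fill(signature_bits, distinct_words, m) ** m


# Slide values F = 12, m = 4, with an assumed D = 2 distinct words per document.
print(bits_per_word(12, 2))
print(round(expected_fill(12, 2, 4), 3))
print(round(false_drop_probability(12, 2, 4), 3))
```

With F = 12 and an assumed two words per document, the rule gives m = 4, in line with the values on the slide, and the expected fill is close to 50%.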
Signature files
• Q2: Why is it called ‘false drop’?
• Old, but fascinating story [1949]
  – how to find qualifying books (by title word, and/or author, and/or keyword)
  – in O(1) time?
  – without computers
Signature files
• Solution: edge-notched cards
[Figure: card with holes numbered 1, 2, ..., 40]
• each title word is mapped to m numbers (how?)
• and the corresponding holes are cut out:
‘data’ -> #1, #39
Signature files
• Search, e.g., for ‘data’: activate needles #1 and #39, and shake the stack of cards!
• Cards notched at both positions drop out of the stack; a card that drops without actually containing ‘data’ is a ‘false drop’
Signature files
• Also known as ‘zatocoding’, from the ‘Zator’ company.
Signature files
• Q3: other apps of signature files?
• A: anything that has to do with ‘membership testing’: does ‘data’ belong to the set of words of the document?
Signature files
• UNIX’s early ‘spell’ system [McIlroy]
• Bloom-joins in System R* [Mackert+] and ‘active disks’ [Riedel99]
• differential files [Severance+Lohman]
Signature files - conclusions
• easy insertions; slower than inversion
• brilliant idea of ‘quick and dirty’ filter: quickly discard the vast majority of non-qualifying elements, and focus on the rest.
References
• Aho, A. V. and M. J. Corasick (June 1975). "Fast Pattern Matching: An Aid to Bibliographic Search." CACM 18(6): 333-340.
• Boyer, R. S. and J. S. Moore (Oct. 1977). "A Fast String Searching Algorithm." CACM 20(10): 762-772.
• Brown, E. W., J. P. Callan, et al. (March 1994). Supporting Full-Text Information Retrieval with a Persistent Object Store. Proc. of EDBT conference, Cambridge, U.K., Springer Verlag.
• Faloutsos, C. and H. V. Jagadish (Aug. 23-27, 1992). On B-tree Indices for Skewed Distributions. 18th VLDB Conference, Vancouver, British Columbia.
• Karp, R. M. and M. O. Rabin (March 1987). "Efficient Randomized Pattern-Matching Algorithms." IBM Journal of Research and Development 31(2): 249-260.
• Knuth, D. E., J. H. Morris, et al. (June 1977). "Fast Pattern Matching in Strings." SIAM J. Comput 6(2): 323-350.
• Mackert, L. M. and G. M. Lohman (August 1986). R* Optimizer Validation and Performance Evaluation for Distributed Queries. Proc. of 12th Int. Conf. on Very Large Data Bases (VLDB), Kyoto, Japan.
• Manber, U. and S. Wu (1994). GLIMPSE: A Tool to Search Through Entire File Systems. Proc. of USENIX Techn. Conf.
• McIlroy, M. D. (Jan. 1982). "Development of a Spelling List." IEEE Trans. on Communications COM-30(1): 91-99.
• Mooers, C. (1949). Application of Random Codes to the Gathering of Statistical Information. Bulletin 31. Cambridge, Mass., Zator Co.
• Cutting, D. and J. Pedersen (1990). Optimizations for Dynamic Inverted Index Maintenance. ACM SIGIR.
• Riedel, E. (1999). Active Disks: Remote Execution for Network Attached Storage. ECE, CMU. Pittsburgh, PA.
• Severance, D. G. and G. M. Lohman (Sept. 1976). "Differential Files: Their Application to the Maintenance of Large Databases." ACM TODS 1(3): 256-267.
• Tomasic, A. and H. Garcia-Molina (1993). Performance of Inverted Indices in Distributed Text Document Retrieval Systems. PDIS.
• Tomasic, A., H. Garcia-Molina, et al. (May 24-27, 1994). Incremental Updates of Inverted Lists for Text Document Retrieval. ACM SIGMOD, Minneapolis, MN.
• Wu, S. and U. Manber (1992). "AGREP - A Fast Approximate Pattern-Matching Tool."
• Zobel, J., A. Moffat, et al. (Aug. 23-27, 1992). An Efficient Indexing Technique for Full-Text Database Systems. VLDB, Vancouver, B.C., Canada.