From KWIC to Enterprise Search - M G Lindquist
![Page 1: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/1.jpg)
www.kb.se
Information retrieval from KWIC to Enterprise Search
By
Mats G. Lindquist
National Library of Sweden
![Page 2: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/2.jpg)
IR – the Technology Dimension
Information carriers:
• Punch cards (Hollerith cards)
• Magnetic tapes
• Direct access storage (disk and others)
• CD-ROM (distribution copies of db)
![Page 3: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/3.jpg)
KWIC – an example of early technology
Needed: sort program and line printer
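The slide names only the two ingredients, a sort program and a line printer. A minimal Python sketch of how a KWIC (keyword in context) index could be built along those lines follows; the function name `kwic_index`, the stopword list, and the column width are illustrative assumptions, not details from the talk.

```python
def kwic_index(titles, stopwords=frozenset({"a", "an", "and", "of", "the",
                                            "to", "in", "for", "from"}),
               width=30):
    """Return KWIC lines: one line per significant word of every title.

    Hypothetical helper; the stopword list and width are assumptions.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.lower().strip(".,;:!?")
            if not key or key in stopwords:
                continue  # insignificant words get no index line
            left = " ".join(words[:i])    # context before the keyword
            right = " ".join(words[i:])   # keyword plus context after it
            entries.append((key, left, right))

    # The "sort program": order all lines by keyword, then by context.
    entries.sort(key=lambda e: (e[0], e[2].lower(), e[1].lower()))

    # The "line printer": fixed-width output with the keywords aligned
    # in one column, so the eye can scan down the index.
    return [f"{left[-width:]:>{width}}  {right[:width]}"
            for _, left, right in entries]


lines = kwic_index([
    "Information retrieval from KWIC to Enterprise Search",
    "Text retrieval and document databases",
])
for line in lines:
    print(line)
```

Every title appears once per significant word, which is why the method needs nothing beyond sorting and printing.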
![Page 4: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/4.jpg)
Magnetic tape
Task: Arrange Q’s in core memory to give
maximum number of A’s per tape run.
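The idea behind this task can be sketched in Python: a tape can only be read from front to back, so as many queries (Q's) as possible are held in memory and all of them are tested against each record during a single pass. The query model used here (every term must occur in the record) and the name `tape_run` are assumptions made for illustration; the slide does not describe the actual matching logic.

```python
def tape_run(tape_records, queries):
    """Answer every in-memory query in one sequential pass over the tape.

    tape_records: any iterable that yields records in tape order.
    queries: dict mapping a query name to a set of required terms
             (assumed Boolean AND; hypothetical query model).
    """
    answers = {name: [] for name in queries}
    for record in tape_records:              # each record is read exactly once
        words = set(record.lower().split())
        for name, terms in queries.items():
            if terms <= words:               # all query terms occur in the record
                answers[name].append(record)
    return answers


tape = iter([
    "text retrieval and document databases",
    "magnetic tape storage",
    "full text retrieval on tape",
])
queries = {"q1": {"tape"}, "q2": {"text", "retrieval"}}
answers = tape_run(tape, queries)
```

The more queries fit in memory, the more answers each costly tape run yields.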
![Page 5: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/5.jpg)
IR – the Content Dimension
• 60’s Numbers, alphanumeric codes
• 70’s plus simple (short) texts
• 80’s plus long texts, extended character sets, office documents
• 90’s plus images, graphs
• 00’s plus moving images
![Page 6: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/6.jpg)
Wherever there is information piling up
there is a call for information retrieval.
The nature of the information determines
the approaches and methods for IR.
![Page 7: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/7.jpg)
IR – the Application Dimension
A development from structured scientific and technical (mostly) bibliographical information to information from more general information sources.
Information of greater variety in form and less control (of structure).
![Page 8: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/8.jpg)
Freetext rules …
… OA?
Word processing and computing (ADP) came together in the mid 80’s
Office Automation was hot
![Page 9: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/9.jpg)
Enter: Document Management
IR thinking and methods became accepted
as a useful approach to the management
of information from a corporate perspective
- DMS – document management systems
- IR features for searching data in general
![Page 10: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/10.jpg)
Functionality matures
Essential features:
• Total content searchable (no stoplist)
• Complete pattern searching (# * ! : ?)
• Vocabulary look-up/search
• Search history re-useable
• Relevance ranking
• Hits in context
• Incremental updating
• RDB features (incl. some math)
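Two of the listed features, pattern searching and hits in context, can be illustrated with a short Python sketch. The slide does not define what each pattern character means, so this example assumes shell-style wildcards only (`*` for any run of characters, `?` for a single character) as implemented by Python's `fnmatch` module; the name `hits_in_context` and the `window` parameter are hypothetical.

```python
import fnmatch
import re


def hits_in_context(text, pattern, window=3):
    """Return each matching word with up to `window` words on either side.

    Wildcard semantics are an assumption (shell style), not the
    operators of any particular IR system.
    """
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, tok in enumerate(tokens):
        if fnmatch.fnmatchcase(tok.lower(), pattern.lower()):
            start = max(0, i - window)
            before = tokens[start:i]
            after = tokens[i + 1:i + 1 + window]
            hits.append(" ".join(before + [f"[{tok}]"] + after))
    return hits


snippets = hits_in_context(
    "Precision retrieval is replaced by a new kind of retrieval",
    "retriev*",
    window=2,
)
```

Because no stoplist is applied, every word of the text remains searchable, in line with the first feature on the list.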
![Page 11: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/11.jpg)
and then came …
… the internet
![Page 12: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/12.jpg)
IR – the Future Dimension
In the beginning web searching was done
without using the experiences and
achievements of IR.
But information was piling up – clearly new
IR techniques were needed.
![Page 13: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/13.jpg)
New rules OK?
• Internet (intranet) is the information carrier
• Applications challenge boundaries
– Private vs. Public
– Individual vs. Institutional
• In the enterprise
– The variety of form increases
– The volume increases
– Control of structure decreases
– Control of sources decreases
![Page 14: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/14.jpg)
A new technology landscape
• Volumes and volumes of storage
• Plenty of computing power
• Huge amounts of bandwidth
![Page 15: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/15.jpg)
The end of precision retrieval?
Precision retrieval is replaced by a new kind of
retrieval, the Quick Search, which sets user
ambitions and requirements.
Search engines are different from IR-systems.
IR must rise to the challenges of Enterprise
Search – in-house search engines.
![Page 16: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/16.jpg)
What is ahead?
• The main stage for development must be the user interface for searching and presentation.
• Make searching even more simple.
Consider this headline from the Technology Guardian,
May 3, 2007:
It’s time for Amazon to turn a new leaf and make
searching for books at its site a whole lot easier.
(article by Wendy M Grossman)
![Page 17: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/17.jpg)
The big challenge
• The big challenge is to achieve enough structure to enable precision search without losing too much recall.
• And without manual intervention – it must be automatic.
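For reference, precision and recall have standard definitions in IR: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. The Python sketch below computes both for one search; the document identifiers and the function name `precision_recall` are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one search.

    retrieved: set of document ids the search returned.
    relevant:  set of document ids that actually answer the need.
    """
    hits = len(retrieved & relevant)   # relevant documents that were found
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result: 4 documents returned, 3 relevant ones exist, 2 overlap.
precision, recall = precision_recall({1, 2, 3, 4}, {2, 4, 6})
```

Narrowing a search tends to raise precision while dropping relevant documents from the result, which is the trade-off the slide refers to.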
![Page 18: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/18.jpg)
References
Ashford, John and Willett, Peter, Text retrieval and document databases, Bromley: Chartwell-Bratt, 1989 (ISBN: 0-86238-204-1)
Brooks, Terrence A. (2004), "The Nature of Meaning in the Age of Google", Information Research, 9(3) paper 180 http://InformationR.net/ir/9-3/paper180.html
Grossman, Wendy M., It’s time for Amazon to turn over a new leaf ---, The Technology Guardian, May 3, 2007 http://www.guardian.co.uk/technology/2007/may/03/comment.guardianweeklytechnologysection1
Lindquist, Mats G., "3RIP-COM: Integrating Information Retrieval and Computerized Conferencing", Proc. Am. Soc. Info. Science, vol. 17 (1980), pp. 71-73.
![Page 19: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/19.jpg)
Lindquist, Mats G., "Information Resources Management (IRM), Yesterday,
Today and Tomorrow", NORD IoD 6, Samfundet för Informationstjänst i
Finland, Helsinki, 1985, pp. 275-280.
O’Neill, Edward T., Lavoie, Brian F., Bennett, Rick (2003), "Trends in the
Evolution of the Public Web 1998-2002", D-Lib Magazine 9(4),
http://www.dlib.org/dlib/april03/lavoie/04lavoie.html
Schneider, Karen G., How OPACs Suck, Part 2: The Checklist of Shame,
Posted 04/03/2006 [ http://www.techsource.ala.org/blog/2006/04/how-opacs-suck-part-2-the-checklist-of-shame.html ]
![Page 20: From KWIC to Enterprise Search - M G Lindquist](https://reader031.fdocuments.in/reader031/viewer/2022013011/5491af91b479593f188b456b/html5/thumbnails/20.jpg)