The Future of Isite - Growing GILS. Archie Warnock, A/WWW Enterprises, warnock@clark.net
The Future of Isite - Growing GILS
Archie Warnock
A/WWW Enterprises
http://www.clark.net/pub/warnock/awww.html
What Is Isite?
Isite is a standards-based Internet toolkit for information search and retrieval (Z39.50)
Isite was developed by MCNC/CNIDR
Isite was intended as a replacement for freeWAIS
Funded by a US NSF grant
There are other good Z39.50 toolkits, too
Isite Architecture
Isite is written in C++ to utilize the usual object-oriented advantages
Major components
    Isearch - the search and retrieval engine
    SAPI - the Z39.50 search engine API
    Zdist - the Z39.50 implementation
Isite Architecture - Example Programs
Iindex, Isearch, Iutil - the search engine
Isearch-cgi - the CGI gateway to Isearch
zclient, izclient, zping, zbatch - the Z39.50 clients
zserver, zserverNT - the Z39.50 servers
zcon & zgate - the WWW-to-Z39.50 gateway
Current Status of Isite
MCNC/CNIDR funding from NSF is finished
    Successful completion of 3-year grant
    Jim Fullton, PI, is now at WIPO in Geneva
    No additional support is anticipated
Other projects are supporting customization
    FGDC, US Dept. of Commerce, US Patent & Trademark Office, CEO, STScI, World Bank, BSn
Isite Strengths
Powerful and flexible search engine
Community-based development of a reference implementation
Freely distributed and widely available for any use
Source code included
Powerful search engine interface
Ported to Windows NT with threaded Z39.50 server
Isearch Features
Full text search
Search on text fields
Search on numeric fields with appropriate relations (>, <, =)
Search on date fields with appropriate relations (before, during, after)
Search on geospatial bounding box
Boolean searches
Phrase searching
Right truncation
Proximity searching (within N characters)
Case insensitive searching, punctuation ignored
Configurable stopword list
Customizable results presentation
Relevance ranked scores
Term weighting
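A few of the text-matching features listed above can be illustrated with a small sketch. This is not Isearch code and does not use its API; it is a hypothetical Python illustration, and the function names (`tokenize`, `matches_truncated`, `within_chars`) and the stopword list are assumptions made for the example.

```python
import re

# Hypothetical stopword list; in Isearch the list is configurable.
STOPWORDS = {"the", "of", "a", "and"}

def tokenize(text):
    """Return (word, start_offset) pairs: lowercased, punctuation ignored, stopwords dropped."""
    return [(m.group().lower(), m.start())
            for m in re.finditer(r"[A-Za-z0-9]+", text)
            if m.group().lower() not in STOPWORDS]

def matches_truncated(text, prefix):
    """Right truncation: match any word that starts with `prefix`."""
    prefix = prefix.lower()
    return any(word.startswith(prefix) for word, _ in tokenize(text))

def within_chars(text, term_a, term_b, n):
    """Proximity: True if the two terms start within n characters of each other."""
    tokens = tokenize(text)
    pos_a = [p for w, p in tokens if w == term_a.lower()]
    pos_b = [p for w, p in tokens if w == term_b.lower()]
    return any(abs(a - b) <= n for a in pos_a for b in pos_b)

doc = "The Future of Isite: Growing GILS, a Z39.50 toolkit."
print(matches_truncated(doc, "grow"))               # True ("growing")
print(within_chars(doc, "isite", "gils", 20))       # True (15 characters apart)
print(within_chars(doc, "future", "toolkit", 10))   # False (40 characters apart)
```

Measuring proximity in characters rather than words follows the "within N characters" wording of the feature list; a real engine would work from an index rather than rescanning the text.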
Isearch Document Types
ASCII text
USMARC records
Electronic mail folders
Usenet news archives
US patents
IAFA templates
BIBTeX
Filenames
First line in file
SGML tagged fields
HTML
GILS templates
FGDC templates
Colon delimited fields
GCMD DIF templates
whois++ templates
Multi-file documents
Medline
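To make one of the simpler document types concrete, the sketch below parses a record of colon-delimited fields into a field-to-value mapping, the kind of structure that fielded searching relies on. The parsing rules (one `Name: value` pair per line, indented lines continuing the previous field), the function name `parse_colon_record`, and the sample record are all assumptions for illustration; they are not the rules of the actual Isearch document type.

```python
def parse_colon_record(text):
    """Parse 'Name: value' lines into a dict; indented lines continue the previous field."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and current is not None:
            # Continuation line: append to the field seen most recently.
            fields[current] += " " + line.strip()
        elif ":" in line:
            # Split on the first colon only, so values may contain colons.
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields

# Hypothetical sample record, invented for this example.
sample = """Title: The Future of Isite
Author: Archie Warnock
Abstract: Isite is a standards-based toolkit
    for information search and retrieval."""

record = parse_colon_record(sample)
print(record["Title"])   # The Future of Isite
```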
Isite Weaknesses
Modest Z39.50 implementation
    needs GRS-1
    better USMARC support
    data structures
All examples are console applications
No real end-user applications
No GUI interface
Difficult configuration
Requires programming for extensions
Needs optimization & performance enhancement
Needs more documentation
What The Future Holds For Isite
New Projects want (and will get):
    Distributed document collections
    Distributed searching
    Automated information extraction (centroids, templates)
    Searching and referrals
    Additional Z39.50 support (lots of Z39.50 details are not supported now)
GILS and the Advanced Search Facility
ASF is a US Dept. of Commerce project, to be built by Pilot Research, MCNC and A/WWW Enterprises
“GILSnet” - a network of cooperative, low-impact, distributed nodes
The basic interchange will be GILS templates
Search on full text and GILS records
GILS, Dublin Core and Everyone Else
Dublin Core is a minimal (15 fields) generic metadata scheme for virtually any kind of document
GILS represents a more detailed approach, including most of DC, providing greater interoperability
GILS is less bibliographically oriented than BIB-1
GILS is lightweight compared to GEO and CIP (which have specific functional requirements)
What GILS Means To Me - 1
Fewer fields
    More documents
    More metadata records
    Skinnier metadata records
    Easier abstraction
More fields
    Fewer documents
    Fewer metadata records
    Fatter metadata records
    Less abstraction
GILS is a good, general compromise
What GILS Means To Me - 2
Think of the GILS profile as defining a language
At some level, Z39.50 is a detail
Protocols are about communication, profiles are about abstraction and GILS is about content
Z39.50 guarantees that the user’s query can be unambiguously decoded - no guarantees about content
We could implement the profile over any protocol - HTTP, CORBA, etc.
Does GILS have to use Z39.50?
    No, but the abstraction is required
    Z39.50 already includes the abstraction model
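The idea that a profile defines a shared language, independent of the protocol that carries the query, can be sketched roughly as follows. Everything here is hypothetical: the `Node` class, the three profile field names, the local field maps and the sample records are invented for illustration and are not taken from the GILS profile or from Isite.

```python
# Hypothetical profile: the abstract field names that every node agrees on.
PROFILE_FIELDS = {"title", "originator", "abstract"}

class Node:
    """A server that maps profile fields onto its own local record layout."""

    def __init__(self, field_map, records):
        self.field_map = field_map   # profile field -> local field name
        self.records = records       # list of dicts keyed by local field names

    def search(self, field, term):
        # The query is phrased in profile terms; only the node knows its local names.
        if field not in PROFILE_FIELDS:
            raise ValueError(f"{field!r} is not defined by the profile")
        local = self.field_map[field]
        return [r for r in self.records
                if term.lower() in r.get(local, "").lower()]

# Two nodes with different local layouts (sample data invented for the example).
node_a = Node({"title": "TI", "originator": "AU", "abstract": "AB"},
              [{"TI": "Growing GILS", "AU": "Warnock", "AB": "Future of Isite"}])
node_b = Node({"title": "name", "originator": "creator", "abstract": "summary"},
              [{"name": "GILS overview", "creator": "Example Author", "summary": ""}])

# The same profile-level query could travel over Z39.50, HTTP, CORBA, and so on.
query = ("title", "gils")
hits = [r for node in (node_a, node_b) for r in node.search(*query)]
print(len(hits))   # 2
```

The transport never appears in the sketch, which is the point: the nodes only need to agree on the abstract field names, while each keeps its own internal layout.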
Related Documents
Getting Isite
    ftp://ftp.cnidr.org/pub/software/Isite
    ftp://ftp.clark.net/pub/warnock/Software (pre)
A/WWW Enterprises
    warnock@clark.net
    http://www.clark.net/pub/warnock/awww.html
    US Phone/FAX: 301-854-2987