The Future of Isite - Growing GILS Archie Warnock A/WWW Enterprises [email protected].
The Future of Isite - Growing GILS
Archie Warnock
A/WWW Enterprises
http://www.clark.net/pub/warnock/awww.html
What Is Isite?
Isite is a standards-based Internet toolkit for information search and retrieval (Z39.50)
Isite was developed by MCNC/CNIDR
Isite was intended as a replacement for freeWAIS
Funded by a US NSF grant
There are other good Z39.50 toolkits, too
Isite Architecture
Isite is written in C++ to utilize the usual object-oriented advantages
Major components
  Isearch - the search and retrieval engine
  SAPI - the Z39.50 search engine API
  Zdist - the Z39.50 implementation
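The layering above (protocol implementation, engine API, search engine) can be illustrated with a minimal sketch. This is not Isite's C++ code or its real API: the class names `SearchEngine`, `InMemoryEngine` and `ProtocolServer`, their methods, and the sample records are all hypothetical, chosen only to show how a protocol layer can depend on an engine-neutral interface rather than on one particular engine.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SearchEngine(ABC):
    """Hypothetical engine-neutral interface (the role SAPI plays in Isite)."""

    @abstractmethod
    def search(self, database: str, term: str) -> List[str]:
        """Return identifiers of documents that contain the term."""


class InMemoryEngine(SearchEngine):
    """Toy engine standing in for Isearch; it does not mimic its real behaviour."""

    def __init__(self, databases: Dict[str, Dict[str, str]]):
        self.databases = databases

    def search(self, database: str, term: str) -> List[str]:
        docs = self.databases.get(database, {})
        needle = term.lower()
        # Whole-word, case-insensitive match on whitespace-separated words.
        return sorted(doc_id for doc_id, text in docs.items()
                      if needle in text.lower().split())


class ProtocolServer:
    """Toy protocol layer (the role Zdist plays); it only knows the interface."""

    def __init__(self, engine: SearchEngine):
        self.engine = engine

    def handle_search(self, database: str, term: str) -> dict:
        hits = self.engine.search(database, term)
        return {"database": database, "count": len(hits), "records": hits}


engine = InMemoryEngine({
    "gils": {"rec1": "Water quality data", "rec2": "Air quality report"},
})
server = ProtocolServer(engine)
response = server.handle_search("gils", "quality")
```

Because `ProtocolServer` never touches the engine's internals, a different engine could be substituted by implementing the same interface, which is the point of keeping an API layer between the two.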
Isite Architecture - Example Programs
Iindex, Isearch, Iutil - the search engine
Isearch-cgi - the CGI gateway to Isearch
zclient, izclient, zping, zbatch - the Z39.50 clients
zserver, zserverNT - the Z39.50 servers
zcon & zgate - the WWW-to-Z39.50 gateway
Current Status of Isite
MCNC/CNIDR funding from NSF is finished
  Successful completion of 3 year grant
  Jim Fullton, PI, is now at WIPO in Geneva
  No additional support is anticipated
Other projects are supporting customization
  FGDC, US Dept. of Commerce, US Patent & Trademark Office, CEO, STScI, World Bank, BSn
Isite Strengths
Powerful and flexible search engine
Community-based development of a reference implementation
Freely distributed and widely available for any use
Source code included
Powerful search engine interface
Ported to Windows NT with threaded Z39.50 server
Isearch Features
Full text search
Search on text fields
Search on numeric fields with appropriate relations (>, <, =)
Search on date fields with appropriate relations (before, during, after)
Search on geospatial bounding box
Boolean searches
Phrase searching
Right truncation
Proximity searching (within N characters)
Case insensitive searching, punctuation ignored
Configurable stopword list
Customizable results presentation
Relevance ranked scores
Term weighting
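Several of the text features listed above (case-insensitive matching with punctuation ignored, a stopword list, right truncation, and proximity within N characters) can be sketched in a few lines. This is an illustration of the ideas only, not Isearch's implementation: the function names, the `*` truncation marker, the sample stopword list, and the choice to measure proximity between the starting offsets of two words are all assumptions.

```python
import re

STOPWORDS = {"the", "of", "a", "and"}  # hypothetical configurable list


def tokenize(text):
    """Return (word, start_offset) pairs, lower-cased, punctuation ignored."""
    return [(m.group().lower(), m.start())
            for m in re.finditer(r"[A-Za-z0-9]+", text)]


def index_terms(text, stopwords=STOPWORDS):
    """Drop stopwords so they are never searchable."""
    return [(word, pos) for word, pos in tokenize(text) if word not in stopwords]


def match_term(text, pattern):
    """Offsets of matching words; a trailing '*' requests right truncation."""
    pattern = pattern.lower()
    if pattern.endswith("*"):
        prefix = pattern[:-1]
        return [pos for word, pos in index_terms(text) if word.startswith(prefix)]
    return [pos for word, pos in index_terms(text) if word == pattern]


def within(text, left, right, n_chars):
    """Proximity: some pair of matches starts within n_chars of each other."""
    return any(abs(a - b) <= n_chars
               for a in match_term(text, left)
               for b in match_term(text, right))


doc = "The Government Information Locator Service (GILS) locates government records."

truncated = match_term(doc, "locat*")         # matches "Locator" and "locates"
near = within(doc, "gils", "records", 30)     # True: the words start 25 characters apart
boolean_and = bool(match_term(doc, "gils")) and bool(match_term(doc, "patents"))  # False
```

Boolean operators then reduce to combining the per-term results, as the last line shows for AND.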
Isearch Document Types
ASCII text
USMARC records
Electronic mail folders
Usenet news archives
US patents
IAFA templates
BIBTeX
Filenames
First line in file
SGML tagged fields
HTML
GILS templates
FGDC templates
Colon delimited fields
GCMD DIF templates
whois++ templates
Multi-file documents
Medline
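As an illustration of what a document type has to do, the "colon delimited fields" case can be sketched as a small parser that turns `Field: value` lines into searchable fields. This is a guess at the general shape, not Isearch's doctype code: the function name, the rule that indented lines continue the previous field, and the sample record are all hypothetical.

```python
def parse_colon_delimited(text):
    """Parse 'Field: value' lines into a dict keyed by lower-cased field name.

    Assumption: a line that starts with whitespace continues the previous field.
    A repeated field name overwrites the earlier value in this sketch.
    """
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and current is not None:
            fields[current] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a field line; ignored here
        current = name.strip().lower()
        fields[current] = value.strip()
    return fields


# Hypothetical record, for illustration only.
record = """Title: Water Quality Data
Originator: Example Agency
Abstract: Monthly measurements
  from monitoring stations.
"""
parsed = parse_colon_delimited(record)
```

Once a record is split this way, a fielded search such as "title contains water" only has to look at `parsed["title"]` instead of the full text.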
Isite Weaknesses
Modest Z39.50 implementation
  needs GRS-1
  better USMARC support
  data structures
All examples are console applications
No real end-user applications
No GUI interface
Difficult configuration
Requires programming for extensions
Needs optimization & performance enhancement
Needs more documentation
What The Future Holds For Isite
New projects want (and will get):
  Distributed document collections
  Distributed searching
  Automated information extraction (centroids, templates)
  Searching and referrals
  Additional Z39.50 support (lots of Z39.50 details are not supported now)
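The combination of centroids and referrals can be sketched as follows, assuming a centroid is simply the set of distinct words found in a node's documents (the sense used in whois++-style indexing). The node names, documents and function names are hypothetical, and nothing here reflects how Isite would actually implement it.

```python
def build_centroid(documents):
    """Centroid (assumed definition): the set of distinct lower-cased words."""
    terms = set()
    for text in documents:
        terms.update(text.lower().split())
    return terms


def refer(query_terms, centroids):
    """Return the nodes whose centroid contains every query term."""
    wanted = {term.lower() for term in query_terms}
    return sorted(name for name, centroid in centroids.items()
                  if wanted <= centroid)


# Hypothetical nodes and documents.
nodes = {
    "node-a": ["water quality data", "river flow records"],
    "node-b": ["air quality report"],
    "node-c": ["patent abstracts"],
}
centroids = {name: build_centroid(docs) for name, docs in nodes.items()}

targets = refer(["water", "quality"], centroids)  # only node-a holds both words
```

A referral of this kind is only a hint: a node is selected when its collection as a whole contains the terms, so the query still has to be run at that node to find out whether any single document matches.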
GILS and the Advanced Search Facility
ASF is a US Dept. of Commerce project, to be built by Pilot Research, MCNC and A/WWW Enterprises
“GILSnet” - a network of cooperative, low-impact, distributed nodes
The basic interchange will be GILS templates
Search on full text and GILS records
GILS, Dublin Core and Everyone Else
Dublin Core is a minimal (15 fields) generic metadata scheme for virtually any kind of document
GILS represents a more detailed approach, including most of DC, providing greater interoperability
GILS is less bibliographically oriented than BIB-1
GILS is lightweight compared to GEO and CIP (which have specific functional requirements)
What GILS Means To Me - 1
Fewer fields
  More documents
  More metadata records
  Skinnier metadata records
  Easier abstraction
More fields
  Fewer documents
  Fewer metadata records
  Fatter metadata records
  Less abstraction
GILS is a good, general compromise
What GILS Means To Me - 2
Think of the GILS profile as defining a language
At some level, Z39.50 is a detail
Protocols are about communication, profiles are about abstraction and GILS is about content
Z39.50 guarantees that the user’s query can be unambiguously decoded - no guarantees about content
We could implement the profile over any protocol - HTTP, CORBA, etc.
Does GILS have to use Z39.50?
  No, but the abstraction is required
  Z39.50 already includes the abstraction model
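The separation between profile and protocol can be made concrete with a small sketch in which one abstract query is rendered for two different transports. Everything here is illustrative: the dictionary-based query, the function names and the HTTP parameter names are hypothetical, the prefix notation is borrowed from common Z39.50 tooling rather than from Isite, and the Bib-1 Use attribute values (4 for Title, 1016 for Any) are used only as a familiar example rather than as the GILS attribute set.

```python
from urllib.parse import urlencode

# Abstract query: expressed in the profile's vocabulary, not in any protocol.
query = {"field": "title", "term": "water quality"}

# Illustrative mapping from abstract field names to Bib-1 Use attribute values.
BIB1_USE = {"title": 4, "any": 1016}


def to_z3950_prefix(q):
    """Render the query as an attribute-tagged prefix string for a Z39.50 client."""
    return '@attr 1=%d "%s"' % (BIB1_USE[q["field"]], q["term"])


def to_http_query(q):
    """Render the same query as URL parameters (parameter names are hypothetical)."""
    return urlencode({"field": q["field"], "term": q["term"]})


z3950_form = to_z3950_prefix(query)   # '@attr 1=4 "water quality"'
http_form = to_http_query(query)      # 'field=title&term=water+quality'
```

Both renderings carry the same meaning because both are derived from the same abstract field name. The agreed vocabulary is the part that cannot be dropped, whichever protocol carries it.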
Related Documents
Getting Isite
  ftp://ftp.cnidr.org/pub/software/Isite
  ftp://ftp.clark.net/pub/warnock/Software (pre)
A/WWW Enterprises
  [email protected]
  http://www.clark.net/pub/warnock/awww.html
  US Phone/FAX: 301-854-2987