Interactive graphical queries for bibliographic search


Martin Brooks* and Jennifer Campbell
Interactive Information Group, National Research Council of Canada, Institute for Information Technology, M50 Montreal Road, Ottawa, Ontario K1A 0R6, Canada. E-mail: [email protected]; ai.iit.nrc.ca/II_public

This article presents “Islands,” an interactive graphical interface for construction, modification, and management of queries during a search session on a bibliographic database. The Islands interface is compared to the Dialog Interface™ on a search of the INSPEC database.

A bibliographic database search typically requires development of multiple queries and exploration of their Boolean combinations. Traditional interfaces to bibliographic databases require typing text at a command line, and printing lists of queries with information on the sets retrieved by them. This article presents “Islands,” an interactive graphical interface for construction, modification, and management of queries. The Islands interface is intended to help the searcher by supporting the nonlinear conceptual and logical structure of the search session.

The Islands interface provides click-and-drag creation and modification of queries as Boolean combinations of search primitives and existing queries. A search session is represented and manipulated as a sea of hilly islands. Each island represents a single query, with the individual hills representing disjunctions, and the collection of hills on an island interpreted as a conjunction. Figure 1 shows the graphical and representational nature of the interface.

Visual interfaces for searching generally take one of two forms: at the front end, where the search is specified; or at the back end, where the search results are presented to the user. Visual query specification first appeared in connection with NASA’s photographic archives (Rorvig, Turner, & Moncada, 1988; Seloff, 1990) in the form of a visual thesaurus, where each search primitive was presented to the user in the form of a representative image having links to images representing more specialized and more general search terms. Recently, visual methods for geometric search have been explored (Paquet & Rioux, 1997). Search results were visualized by Wise et al. (1995) as “galaxies” and “sedimentary layers” through statistical processing of the retrieved documents’ content. Document databases have been visualized by clustering based on citation structure (Small & Griffith, 1974), providing relationships between scientific fields and the structure of scientific literature.

This article begins with the motivation behind the Islands interface and a review of the semantics of bibliographic search. We walk through a search of the INSPEC database, first using a standard interface and then with the Islands interface. This is followed by discussion of the differences between the two interfaces, followed by a survey of related work and a summary.

Motivation

Typically, complex searches on bibliographic databases are performed by library professionals. Standard textual command line interfaces are well accepted by librarians; however, one of the present authors (JC), having 10 years’ experience as a reference librarian, has noted that this interface may be the source of annoying, time-consuming, and possibly expensive errors. Bibliographic searches are often carried out under considerable time and financial pressure. Clients may be charged for the librarian’s time, for online connection time, and for database hits. Librarians are often challenged to provide small sets of highly relevant documents. A search session may require development of tens of queries, many of which are variations of each other. Consequently, technologies that improve search effectiveness and economics might be of interest to library professionals.

Many libraries provide direct user access to bibliographic databases, typically via dedicated terminals or Internet access. In the latter case the command line interface may be replaced by an HTML or Java™ interface, thus making it easier for the user to enter correctly formed queries. The Islands interface goes further than this, eliminating query syntax entirely by means of direct graphical manipulation when creating and modifying queries, and providing an overview of all queries used in the search session. This might benefit the casual user in several ways. First, syntax-free creation of queries might eliminate errors, confusion, and lost time in the case of users unfamiliar with the particular query syntax. Second, being able to modify previous queries might help the user improve a query and reduce the proliferation of queries. Third, being able to take in, at a glance, the collection of all queries created so far during a search session might help the user formulate strategies for bringing the session to a successful conclusion.

* To whom all correspondence should be addressed.

© 1999 John Wiley & Sons, Inc.

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE. 50(9):814–825, 1999 CCC 0002-8231/99/090814-12

Bibliographic Search Semantics

When faced with a search task, one first chooses the bibliographic databases to be searched. Such databases may differ from each other with respect to both content and search capabilities. Nevertheless, there exist sufficient commonalities that one may describe a generic semantics for search of an individual database, as follows.

A bibliographic database represents a document collection as a set of individual document records consisting of labeled fields. Typically, these fields include title, author, journal, publication date, indexing and classification data, language, etc. Additional fields found in some databases include the document’s full text, or lists of citing documents.

The fundamental unit of bibliographic database search is the search session, consisting of a sequence of Boolean queries, each of which results in a retrieved set. Within a single search session, a unique name is assigned to each query; the name is used synonymously to indicate the query and its retrieved set. Boolean queries are built from search primitives and previous queries, the latter represented by query name. The Boolean connectives are OR, AND, and AND NOT. Search primitives mostly perform string matching within specified fields, but for some fields might have more particular semantics, such as restriction of dates to an indicated range, or restriction of language to indicated languages.
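The session semantics just described can be sketched in a few lines of Python. This is only an illustrative model, not any database vendor's implementation: the class name `SearchSession`, its methods, and the use of plain case-insensitive substring matching as the search primitive are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SearchSession:
    """Toy model: each query gets a unique name that also denotes its retrieved set."""
    records: dict                              # record id -> {field label: text}
    sets: dict = field(default_factory=dict)   # query name -> set of record ids

    def _store(self, ids):
        # Names are assigned sequentially (S1, S2, ...), an assumption that
        # mirrors the numbering convention of command line interfaces.
        name = f"S{len(self.sets) + 1}"
        self.sets[name] = set(ids)
        return name

    def primitive(self, term, fields):
        """Search primitive: substring match of `term` within the given fields."""
        term = term.lower()
        hits = {rid for rid, rec in self.records.items()
                if any(term in rec.get(f, "").lower() for f in fields)}
        return self._store(hits)

    def combine(self, op, left, right):
        """Boolean query over two earlier queries, referenced by name."""
        a, b = self.sets[left], self.sets[right]
        if op == "OR":
            result = a | b
        elif op == "AND":
            result = a & b
        elif op == "AND NOT":
            result = a - b
        else:
            raise ValueError(f"unknown connective: {op}")
        return self._store(result)
```

Each call returns the name of the new query, so later queries can refer to earlier ones by name, as in the text.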

The mechanics of search, just described, are the tools with which the searcher executes a search strategy in pursuit of a search objective. The search objective depends on the needs of the person for whom the search is being conducted. In some cases it may be appropriate to attempt to maximize both precision and recall (Lesk & Salton, 1969); in other cases retrieval of a single relevant document may be sufficient. A typical search objective is retrieval of a bounded number, say fewer than 50, of highly relevant documents.

Search strategies vary widely according to the person and the context. Furthermore, a particular search might be a mixture of strategy execution and improvisation. The latter occurs when the intended strategy fails to produce a retrieved set meeting the search objective. The experienced searcher may use strategies that gracefully accommodate contingencies for such failure; in this sense search strategies are somewhat like game strategies. Search strategies also bear a resemblance to software design, in the sense that a record of the mechanics of a search session does not necessarily provide the reader with an immediate sense of the strategy behind it.

FIG. 1. The Islands interface.

Notwithstanding the personal and contextual nature of search strategies, there is a common practice of starting a search with a sequence of queries that define “concepts,” followed by Boolean combinations thereof. In this context, a concept is a disjunctive query having as disjuncts a set of words or phrases, perhaps restricted to certain fields, all of which are different ways of referring to the same thing. The Islands interface caters to this convention.

Standard Bibliographic Search Interface

In this section we walk through a hypothetical search session using a standard search interface. In the next section we carry out this same search using the Islands interface.

The Dialog interface to the INSPEC database may be considered representative of typical bibliographic search interfaces. Queries are entered sequentially, with names assigned automatically in the form of sequentially increasing numbers. The size of the retrieved set is returned; the searcher may subsequently sample this set, with the hope that the examined records will provide clues as to how to proceed.

FIG. 2. Summary of a hypothetical Dialog search on INSPEC.

FIG. 3. Six concepts, represented as towers of string patterns. Each concept is the disjunction of its string patterns. The name of the set of retrieved documents is shown at the top of the tower. The number of documents in the set is shown in the box below the tower.

Figure 2 shows the summary of a hypothetical search session illustrating some basic features of the interface. Each query is listed in the “description” column; the “set” column gives the name of the set of documents retrieved by the query; and the “items” column gives the number of documents in the set.

The search objective for Figure 2 is to help a researcher learn about certain recent research on information retrieval applying to bibliographic databases and the Web. This objective suggests that a relatively small set of documents should be returned, say fewer than 50. The research of interest is specified to come from two fields, machine learning and statistics. However, techniques using neural nets are to be excluded. The searcher has worked with this researcher previously, and is familiar with her professional profile as a specialist in applications of artificial intelligence to games.

The search strategy is to create six concepts, represented in queries S1–S6 in Figure 2, and then take the appropriate Boolean combination in S7, with the hope that the size of S7 will be appropriate.

FIG. 4. Clicking on an oval shows the entire string pattern.

FIG. 5. A new, empty, island created by pressing the “New” button (top center). The island has a single, empty hill.

Each of S1–S6 is a disjunction of strings to be matched in any of the title (TI), identifier (ID), or descriptor (DE) fields. The question marks ending some of the string patterns are wild cards; the string to which they match may continue with any sequence of characters. The empty set of parentheses appearing between string patterns is a proximity operator; it specifies that the two patterns must occur adjacent to each other, separated only by white space.
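To make the wild card and proximity operators concrete, the following sketch translates a string pattern into a Python regular expression. It is an approximation built only from the description above, not Dialog's actual matching engine; the function name `pattern_to_regex`, the sample pattern `machine()learn?`, and the choice to let the wild card continue only within a single word are assumptions for illustration.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a pattern such as 'machine()learn?' into a compiled regex.

    Assumed reading of the syntax described in the text:
      - a trailing '?' lets the matched string continue (here: within the word)
      - '()' between two patterns requires them to be adjacent,
        separated only by white space
    """
    parts = []
    for word in pattern.split("()"):
        word = word.strip()
        if word.endswith("?"):
            # Wild card: the literal stem followed by any further word characters.
            parts.append(re.escape(word[:-1]) + r"\w*")
        else:
            parts.append(re.escape(word))
    # Adjacent patterns may be separated by white space only.
    body = r"\s+".join(parts)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)
```

For instance, `pattern_to_regex("machine()learn?")` would match "machine learning" and "Machine learners" but not "machine-based learning", because a hyphenated gap is not white space.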

S7 is a straightforward expression of the search specification. Note that the Boolean connective NOT is an abbreviation for AND NOT.

Unfortunately, S7 results in too many (1845) documents; the searcher proceeds by trying to narrow the set in S8–S10. In S8 she restricts S7 such that the terms of S1–S6 occur in the title (TI), but there are still too many (945). In S9 she tries restricting to articles published no earlier than 1994; this has little effect, since the Web first appeared in 1993. She restricts the date further in S10, but still there are too many (413).

The searcher now explores a different approach. She figures that restriction to articles within the researcher’s field of specialty will be most relevant. She characterizes the researcher’s interests by means of S11; however, the size (3) of S12 discourages her from continuing in this direction.

At this point the searcher doubts the ultimate success of restricting documents in S7 by publication date or restriction to the researcher’s field. Glancing through some of the records in S10, she notices a document describing machine learning research applied to search on an object-oriented database, and realizes that this was supposed to be excluded by S6, which was intended to provide the restriction to bibliographic databases. S14 narrows S10 by excluding S13, giving the desired final result (38).

The Islands Interface

The Islands software is a general framework for visualizing ordered data (Rival & Zaguia, 1995). (The software is proprietary to Decision Academic Graphics Inc. The present application was developed by Decision Academic Graphics under contract from the National Research Council of Canada.) The data is stored in a Microsoft Access® database. The visualization presents the data in the form of islands floating on a dark blue background. Islands are populated with hills, and hills are populated with towers. A mouse click on a tower causes it to pop up, showing its content as a stack of ovals, each containing text. Another click, and the tower is pushed down, hiding its content, just showing the top oval.

FIG. 6. Two terms have been added to the hill on S7. At this point S7 represents the query S1 OR S2.

In the present application, the database contains the queries of the search session. Each island represents a query; the totality of islands represents the search session. Each hill represents a (possibly negated) disjunctive clause; the totality of hills on each island represents the query as a conjunction of the clauses represented by the individual hills. Each tower represents either a reference to an earlier query or a disjunction of literal terms, i.e., a concept (in the sense of the search semantics of the previous section). In the former case the tower cannot pop up, but in the latter case it may be popped up to reveal the terms in the concept.
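The island, hill, and tower structure described above maps naturally onto a small data model. The sketch below is a hypothetical rendering in Python, not the Islands software's actual schema: the class names `Hill` and `Island`, the `to_query` methods, and the rule that an island needs at least one non-negated hill are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hill:
    """A (possibly negated) disjunctive clause; each term names a tower."""
    terms: List[str] = field(default_factory=list)
    negated: bool = False

    def to_query(self) -> str:
        body = " OR ".join(self.terms)
        # Parenthesize only when there is more than one disjunct.
        return f"({body})" if len(self.terms) > 1 else body


@dataclass
class Island:
    """A query: the conjunction of the clauses on its hills."""
    name: str
    hills: List[Hill] = field(default_factory=list)

    def to_query(self) -> str:
        positive = [h for h in self.hills if not h.negated and h.terms]
        negative = [h for h in self.hills if h.negated and h.terms]
        if not positive:
            # Assumption: AND NOT needs a positive clause to subtract from.
            raise ValueError("an island needs at least one non-negated hill")
        text = " AND ".join(h.to_query() for h in positive)
        for h in negative:
            text += f" AND NOT {h.to_query()}"
        return text
```

Under these assumptions, an island with one hill holding S1 and S2 and a second, negated hill holding S3 renders as `(S1 OR S2) AND NOT S3`.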

Color is used to reflect the state of interaction; for example, the user may select an island by clicking on it, causing its color to change from light blue to red. Color is also used to represent the Boolean type of a hill; a green hill is a disjunction, and a black hill is a negated disjunction.

The content of the database is represented in the structure of the islands, including the number of islands, the number of hills on each island, the color of those hills, and the towers on each hill. However, the graphical form of the islands is automatically computed to enhance human visual access to the data. Islands are drawn so as to appear natural and three-dimensional. Their shapes are irregular, mimicking the shoreline of a real island. The outline of an island is drop-shadowed, giving the impression of height. Hills also have irregular shapes and drop-shadows. The shapes and layout of hills on an island, and the shapes and layout of the islands themselves, are computed so as to optimally use available screen space and to facilitate user interaction with towers. In particular, placement of the towers is such that every tower can be popped up without graphically colliding with other towers. This allows the user to have any desired subset of the towers popped up without impairing visual access to the towers’ contents.

The database grows during a search session, starting as empty, accumulating data representing the queries as they are entered by the user. Consequently, the visual presentation of the data also grows. Editing a query changes the number of towers and hills appearing on the island representing the query. This in turn causes the island to change shape, which may further ripple to a change of layout for all the islands. This feature will be visible in the extended example constituting the remainder of this section.

We now walk through use of the Islands interface to INSPEC for the search of the previous section, following exactly the same strategy. The net content of the queries created by the Islands interface will be identical to that of the Dialog Interface™, but the queries created will not be in one-to-one correspondence for reasons that will become clear as we proceed with the example. Rorvig et al. (1988, figure 4) present a similar comparison of textual vs. visual interface for the NASA photographic archives.

Figure 3 shows the Islands construction of the six concepts S1–S6; these correspond exactly to S1–S6 in the Dialog search of Figure 2. The Islands interface is specialized for queries of this type; they are represented on the left side of the screen as towers of ovals, each oval containing one string pattern to be matched against any or all of the TI, DE, or ID fields. The query represented by a tower is the disjunction of the string patterns in the ovals.

FIG. 7. The island is query S7, defined as (S1 OR S2) AND NOT S3. There are two hills, each representing one of the conjuncts. The negated conjunct is the black hill. The query has been entered; the number of retrieved records is shown in the box with the query’s name.

A new tower is created by pressing the “Terms” button at the top left. The query’s name is automatically assigned at the time the query is created. A dialog box pops up, into which the searcher types string patterns, using Dialog™ syntax for individual string patterns. The ovals show only the first few letters of each string pattern; clicking on an oval shows the whole pattern, as in Figure 4.

Having created the six concepts, the searcher wants to create their Boolean combination ([S1 OR S2] AND NOT S3 AND S4 AND [S5 OR S6]). She does this in two steps, first creating query S7 as ([S1 OR S2] AND NOT S3), then creating query S8 as (S7 AND S4 AND [S5 OR S6]). Note that this differs from the Dialog Interface™ shown in Figure 2, where this Boolean combination constituted a single query. This difference is discussed below.

The major portion of the Islands screen is for representation of queries as hilly islands. A new query, S7, is created by pressing the “New” button at the top center; the result is shown in Figure 5. The query’s name, automatically assigned at the time the query is created, appears on it.

Terms are added to the new query by click-and-drag. In Figure 6 we have added two terms, S1 and S2, in two steps, first adding S1, then S2. Each was added by clicking on the oval at the top of its respective tower, then, while holding the mouse button down, moving the mouse to the hill on S7 and releasing the mouse button, at which point a labeled oval appears on the hill.

Terms appearing on the same hill are disjuncts; thus in Figure 6 the query S7 is S1 OR S2. However, the query is not yet complete; no records have been retrieved. The searcher is in the process of constructing the query (S1 OR S2) AND NOT S3; when all these terms have been added the searcher will hit the “enter” key on her keyboard and records will be retrieved.

FIG. 8. A second island has been created. Its red color [see pdf file at www.interscience.wiley.com for color figure] indicates that it is the currently selected island; i.e., the island being worked upon.

A query represented as a hilly island is a conjunction of disjunctions; additionally, conjuncts may be negated, thus providing the Boolean connective AND NOT. It is this restriction of single queries to conjunctions of disjunctions that causes the user to form the query ([S1 OR S2] AND NOT S3 AND S4 AND [S5 OR S6]) as two queries. To add a conjunct to S7, the searcher presses the “And” button (top center of the screen), causing a second, empty, hill to appear on the island. To negate this conjunct, she presses the “Not” button (next to the “And” button), causing the hill to be colored black. She drags S3 onto this hill, and then hits the “enter” key on her keyboard, causing records for this query to be retrieved. The number of retrieved records is shown together with the query’s name. (See Fig. 7.)
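The effect of splitting one Boolean combination into two island queries can be checked with ordinary set operations. The sketch below is illustrative only: the function `retrieve`, the representation of a hill as a `(term_names, negated)` pair, and the tiny record-id sets are assumptions, not data from the INSPEC search.

```python
def retrieve(hills, sets):
    """Evaluate one island: a conjunction of possibly negated disjunctions.

    hills: list of (term_names, negated) pairs, one per hill
    sets:  mapping from query name to its retrieved set of record ids
    """
    positive = [set().union(*(sets[t] for t in terms))
                for terms, negated in hills if not negated]
    negative = [set().union(*(sets[t] for t in terms))
                for terms, negated in hills if negated]
    if not positive:
        raise ValueError("at least one non-negated hill is required")
    result = set.intersection(*positive)   # AND across the positive hills
    for excluded in negative:              # AND NOT for each negated hill
        result -= excluded
    return result


# Hypothetical retrieved sets for the six concepts (record ids are made up).
sets = {"S1": {1, 2, 3}, "S2": {3, 4, 5}, "S3": {2, 5},
        "S4": {1, 3, 4, 6}, "S5": {1, 7}, "S6": {4, 8}}

# Step 1: S7 = (S1 OR S2) AND NOT S3
sets["S7"] = retrieve([(["S1", "S2"], False), (["S3"], True)], sets)
# Step 2: S8 = S7 AND S4 AND (S5 OR S6)
sets["S8"] = retrieve([(["S7"], False), (["S4"], False),
                       (["S5", "S6"], False)], sets)

# The same combination written as one expression over the concept sets.
single = (((sets["S1"] | sets["S2"]) - sets["S3"])
          & sets["S4"] & (sets["S5"] | sets["S6"]))
```

With these made-up sets the two-step result `sets["S8"]` and the one-step result `single` coincide, which is the sense in which the restriction on the form of a single query costs no search power.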

Note that an island’s shape changes as its contents change. Islands have distinctive, unique, compact shapes. An island’s shape may change when another island is changed, so that they fit together optimally in the available screen space.

Working directly towards her search objective, the searcher creates a second island, query S8, formulated as S7 AND S4 AND (S5 OR S6). The term S7 is added to the query by clicking on island S7, and with the mouse button down, moving the mouse to the hill on S8. (See Fig. 8.)

The searcher now sees that S8 retrieves too many (1845) documents, and proceeds to try to narrow this. First she restricts all terms contributing to S8 to occur in the title, using the TI, DE, and ID check boxes at the upper left of the screen. These check boxes operate on whichever island is currently selected. (If the desired island was not selected, clicking on it will cause it to become the currently selected island.) (See Fig. 9.)

Unfortunately, restriction to the title field still results in too many (945) records. Clicking the DE and ID check boxes back on, she tries a date restriction instead, using the date specifiers at the top center of the screen. First she restricts to publications in 1994–1997, but there are still too many; she then tries 1995–1997, resulting in 413 records. (See Fig. 10.)

Note that the Islands interface differs from the Dialog Interface™ in that restrictions on queries are performed by directly modifying an existing query, instead of creating a new query.

Not satisfied with the result of date restriction on S8, the searcher explores a different approach, figuring that restriction to articles within the researcher’s field of specialty will be most relevant. She characterizes the researcher’s interest in computer game playing with a new concept, S9, matching on “chess,” “backgammon,” “go,” or “checkers,” and creates a new query, S10, as S9 AND S8. (See Fig. 11.)

With S10 having only three records, this approach seems hopeless, so the searcher removes S10 by clicking on the island and dragging it off to the left side of the screen, thereby removing unnecessary clutter from the search summary. The query is not lost, however; it is recorded on a log of all queries, and can be recovered from that log. (See Fig. 12.)

FIG. 9. Query S8 is restricted to matches in the title (TI) field, using the check boxes at the upper left. Compare with Figure 8 to see the reduced number of retrieved records.

Glancing through some of the records in S8 (this functionality is not currently supported by Islands), she notices a document describing machine-learning research applied to search on an object-oriented financial database, and realizes that this was supposed to be excluded by S6, which was intended to provide the restriction to bibliographic databases. She creates a concept, S11, matching on “relational” or “object,” and further restricts S8 by adding a negated conjunct containing the term S11. This gives the desired final result (38), shown in Figure 1. Note that Figure 1 summarizes the entire search session.

Discussion

The previous two sections have demonstrated execution of the same INSPEC search using a standard Dialog Interface™ and the Islands interface. The strategy was identical in both cases; the difference lay only in creation, modification, and management of queries. Table 1 summarizes how the two interfaces differ.

Although there are many differences between Dialog™ and Islands regarding query creation, the authors’ intuition is that the greatest value of Islands will come from query modification and management. Islands provides just one way that such functionality might be approached; other, different query modification and management systems may provide similar benefits. Important future research will include design and evaluation of such systems.

Query modification and management functionality in Islands is designed to help maintain simplicity and structure in the search session, by preventing query proliferation and simplifying the session summary. Comparison of Figures 1 and 2 supports this claim. The essence of this argument is that although queries are executed sequentially, the conceptual and logical structure of a search session is not linear. Query modification and management allow the human-computer interaction to reflect this nonlinear structure, whereas a strictly sequential interface such as Dialog™, and its linear session summary, force the searcher to synthesize and hold the session’s structure in her/his head.

The Islands interface is currently a prototype. All user interactions described in the previous section are functional, but the interface is not operationally connected to a search engine; the set sizes were added manually. Many details of operation were omitted from the example search. Although the practical success of this work will depend on those details, the important goal of future research is to identify what makes a search difficult and to address those issues.

FIG. 10. Query S8 is restricted to documents published in 1995–1997, using the date specifiers at top center. Compare with Figure 8 to see the reduced number of retrieved records.

Development of the Islands interface needs to be followed by evaluation of effectiveness as perceived by users. We share this state of affairs with the work of Ramsey, Hsinchun, and Bin (1999) reported in this volume. Ramsey et al. discuss the appropriateness of Web delivery of graphical interfaces using Java™; this is also a promising future direction for the Islands interface.

The Islands interface provides some interesting contrasts and similarities with visual thesauri described by Ramsey et al. (1999) and the NASA Visual Thesaurus (Rorvig et al., 1999; Seloff, 1990). Seloff (1990) describes image management technology as being “far ahead of the intellectual process required to harness its power.” In the case of bibliographic search the situation is the opposite. Librarians and professional searchers use nonlinear strategies to arrive at the desired Boolean query, but the available technology supports only linear organization of search sessions. The Islands interface directly addresses this shortcoming of current bibliographic search systems.

Another contrast between visual thesauri and the Islands interface concerns Boolean complexity and the power of search primitives. The Islands interface uses a scarce resource, screen space, to provide the user with natural and powerful access to unlimited Boolean combinations. Visual thesauri spend their screen space on presentation of the search primitives, i.e., images. The Islands interface focuses on logical structure built from standard primitives (i.e., occurrences of terms), whereas the visual thesauri focus on strong primitives (i.e., related to the graphic content), with some additional possibility for combinations thereof.

The Islands interface and the visual thesauri of Rorvig et al. (1999), Seloff (1990), and Ramsey et al. (1999) share some important characteristics. They are all visually more interesting than textual displays. Furthermore, they trade off some of the flexibility of command line interfaces for their visual aspects. This is discussed by Rorvig et al. (1999). In the case of Islands, the form of individual queries is limited to conjunctions of possibly negated disjunctions. This is not a limitation of search power, since any Boolean combination can be achieved by a combination of several queries. The limitation arises in connection with the depiction of queries as islands with hills; graphical schemes supporting arbitrary Boolean combinations within a single query would lead to unbounded graphical complexity, e.g., hills upon hills upon hills, and the intuitive power of visual presentation would be lost to the user.
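The restricted query form, and the way several queries combine to reach a more complex Boolean expression, can be illustrated with a small Python sketch. This is not the Islands implementation: the names `Hill`, `Island`, and `_lookup`, the toy term index, and the document numbers are all assumptions made up for illustration. A hill is modeled as a possibly negated disjunction whose members are terms or existing queries, and an island as a conjunction of hills.

```python
from dataclasses import dataclass

# Toy inverted index: term -> set of document ids (made-up data, for illustration only).
INDEX = {
    "graphical": {1, 2, 3},
    "visual": {2, 4},
    "query": {1, 2, 4, 5},
    "search": {3, 5, 6},
    "image": {4, 6},
}
ALL_DOCS = set().union(*INDEX.values())


def _lookup(member):
    """Resolve a hill member: either an existing query (Island) or a plain term."""
    if isinstance(member, Island):
        return member.evaluate()
    return INDEX.get(member, set())


@dataclass(frozen=True)
class Hill:
    """One conjunct of a query: a disjunction of members, optionally negated."""
    members: tuple
    negated: bool = False

    def evaluate(self):
        matched = set().union(*(_lookup(m) for m in self.members))
        return ALL_DOCS - matched if self.negated else matched


@dataclass(frozen=True)
class Island:
    """One query: the conjunction of its hills."""
    hills: tuple

    def evaluate(self):
        result = set(ALL_DOCS)
        for hill in self.hills:
            result &= hill.evaluate()
        return result


# (graphical AND query) OR (image AND NOT search) is not a single
# conjunction of possibly negated disjunctions, but three queries express it.
s1 = Island((Hill(("graphical",)), Hill(("query",))))
s2 = Island((Hill(("image",)), Hill(("search",), negated=True)))
s3 = Island((Hill((s1, s2)),))  # one hill whose members are existing queries

print(sorted(s3.evaluate()))  # [1, 2, 4]
```

In this sketch the expressive power comes from allowing existing queries as hill members, while each individual query keeps the flat island-with-hills shape.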

Related Work

The authors believe Islands is the first direct manipulation graphical interface integrating query creation, modification, and management for bibliographic search. However, there is much work on related problems, particularly database search and visualization of search results.

FIG. 11. Query S10 profiles the researcher’s interests in an unsuccessful attempt to narrow S8.

The difference between database search and bibliographic search stems from the difference between structured data and textual data. Structured data, which provides the context for database search, is built around an organization of the concepts to which the data refer. Important examples are entity–relationship models, relational models, and object-oriented databases. Textual data, on the other hand, is flat.

Structured data search utilizes the database organization. Visual query systems for structured data reflect this organization. Two special issues of the Journal of Visual Languages and Computing (1995, 1996) have been dedicated to visual query systems. There has been important work on visual queries to relational databases (Dogru et al., 1996; Papantonakis & King, 1995), with focus on the entity–relationship model (Andries & Engels, 1996; Angelaccio, Catarci, & Santucci, 1990), and object-oriented databases (Boyle, Leishman, & Gray, 1996; Leong, Choo, Kok, Lim, & Narasimhalu, 1990). An evaluation of visual vs. textual query languages is reported by Catarci and Santucci (1995).

Two recent reviews (Card, 1996; Hearst, 1997) of interfaces for searching the Web focus on visualization of search results. Search result visualization work on bibliographic databases is reported by Nowell, France, Hix, Heath, and Fox (1996), and evaluation of a search result visualization system using TREC-4 data is reported by Veerasamy and Belkin (1996).

Summary

The Islands interface demonstrates a direct manipulation approach to query creation, modification, and management. The Islands interface was compared to Dialog™ by carrying out the same INSPEC search using both approaches. We have argued that the advantages of interfaces such as Islands will be largely due to query modification and management functionality, which allows the searcher’s behavior and the search summary to reflect the nonlinear conceptual and logical structure of the search session. Important work remains to be done, including improvement of the interface and objective evaluation.

Acknowledgment

The authors thank Ivan Rival at Decision Academic Graphics Inc. (www.dag.ca) for providing the Islands software. The Islands software and the Islands interface belong exclusively to Decision Academic Graphics; this application was developed under a contract from the National Research Council of Canada in which Decision Academic Graphics provided programming and guidance in appropriate use of the interface. The Islands software was originally developed by Decision Academic Graphics for visualization of ordered structures (Rival & Zaguia, 1995); Decision Academic Graphics is pursuing applications to curriculum administration, information retrieval, and other fields. Not all features of the Islands interface have been illustrated in this article; interested readers may learn more at www.dag.ca.

FIG. 12. Query S10 has been removed, but can be recovered from the log, shown in the figure, which is activated by pressing the “Log” button at top center.

References

Andries, M., & Engels, G. (1996). A hybrid query language for an extended entity-relationship model. Journal of Visual Languages and Computing, 7, 321–352.

Angelaccio, M., Catarci, T., & Santucci, G. (1990). Query by diagram*: A fully visual query system. Journal of Visual Languages and Computing, 1, 255–273.

Boyle, J., Leishman, S., & Gray, P.M.D. (1996). From WIMPS to 3D: The development of AMAZE. Journal of Visual Languages and Computing, 7, 291–319.

Card, S.K. (1996). Visualizing retrieved information: A survey. IEEE Computer Graphics and Applications, 16, 63–67.

Catarci, T., & Santucci, G. (1995). Are visual query languages easier to use than traditional ones? An experimental proof. Proceedings of the International Conference on Human–Computer Interaction 1995 (HCI ’95) (pp. 323–338), Chapman & Hall.

Catarci, T., & Santucci, G. (1995). Diagrammatic vs textual query languages: A comparative experiment. Proceedings IFIP W.G. 2.6 Working Conference on Visual Databases, Lausanne, 27–29 March 1995 (pp. 70–83), Huddersfield, UK: Cambridge University Press.

Dogru, S., Rajan, V., Rieck, K., Slagle, J.R., Tjan, B.S., & Wang, Y. (1996). A graphical data flow language for retrieval, analysis, and visualization of a scientific database. Journal of Visual Languages and Computing, 7, 247–265.

Hearst, M.A. (1997). Interfaces for searching the web. Scientific American, 276, 60–64.

Leong Mun-Kew, Choo Boon-Siong, Kok Chun-Hong, Lim Jyh-Jang, & Narasimhalu, D. (1990). The implementation of a visual language interface for an object-oriented multimedia database system. Journal of Visual Languages and Computing, 1, 275–289.

Lesk, M.E., & Salton, G. (1969). Relevance assessments and retrieval system evaluation. Information Storage and Retrieval, 4, 343–359.

Nowell, L.T., France, R.K., Hix, D., Heath, L.S., & Fox, E.A. (1996). Visualizing search results: Some alternatives to query-document similarity. SIGIR ’96 (pp. 67–75).

Papantonakis, A., & King, P.J.H. (1995). Syntax and semantics of Gql, a graphical query language. Journal of Visual Languages and Computing, 6, 3–25.

Paquet, E., & Rioux, M. (1997). Nefertiti: A query by content software for three-dimensional models databases management. Proceedings of the International Conference on Recent Advances in 3-D Digital Imaging and Modeling (pp. 345–352), Los Alamitos, CA: IEEE Computer Society Press.

Ramsey, M., Hsinchun C., & Bin Z. (1999). A collection of visual thesauri for browsing large collections of images. Journal of the American Society for Information Science, 50, 826–834.

Rival, I., & Zaguia, N. (1995). Learning navigator: A multi-purpose visualization software tool for education, career, and employment counselling. NATCON articles, 21st National Consultation on Career Development (pp. 185–188).

Rorvig, M., Turner, C.H., & Moncada, J. (1999). The NASA image collection visual thesaurus. Presented at the American Society for Information Science 17th Mid-Year Meeting. Journal of the American Society for Information Science, 50, 794–798.

Seloff, G.A. (1990). Automated access to the NASA-JSC image archives. Library Trends, 38, 682–696.

Small, H., & Griffith, B.C. (1974). The structure of scientific literatures. I: Identifying and graphing specialities. Science Studies, 4, 17–40.

Veerasamy, A., & Belkin, N.J. (1996). Evaluation of a tool for visualization of information retrieval results. SIGIR ’96 (pp. 85–92).

Wise, J.A., Thomas, J.J., Pennock, K., Lantrip, D., Pottier, M., Schur, A., & Crow, V. (1995). Visualizing the non-visual: Spatial analysis and interaction with information from text documents. In N. Gershon & S. Eick (Eds.), Information Visualization ’95 Proceedings (pp. 51–58), Los Alamitos, CA: IEEE Computer Society Press.

TABLE 1. Comparison of the Dialog and Islands interfaces.

| | Dialog | Islands |
|---|---|---|
| Query Creation | Type characters at command line | Push button to create template; fill in template |
| | All queries created identically | Two types of query creation: concepts (disjunction of string matches), represented as towers; and conjunctions of disjunctions (conjuncts may be negated), represented as islands |
| | Searcher responsible for all syntax | Searcher responsible only for string matching syntax in concepts |
| | Arbitrary Boolean combinations in single query | Single queries restricted to common forms (disjunction of string matches for concepts; conjunctions of disjunctions [with negated conjuncts] for islands); more complex logic requires combination of queries |
| Query Modification | Previous queries cannot be changed | Previous queries modifiable; terms and date/field/language restrictions added and removed |
| Query Management | Print summary showing full sequence of queries and info on retrieved sets (see Figure 2) | Islands automatically sized and placed |
| | | Unused islands may be removed, and can be recovered later from log |
| | | Only useful queries shown |
| | Understanding search session from summary requires analysis | Search session represented visually, relationships visible |
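The query modification and management rows of Table 1 can be made concrete with a short Python sketch of a session that keeps visible queries separate from a running log. This is only a hypothetical model: the `Session` class, its method names, and the placeholder query strings are assumptions introduced here, and the article does not describe how the Islands log is actually stored.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical search session: the visible queries plus an append-only log."""
    visible: dict = field(default_factory=dict)  # query name -> query text
    log: list = field(default_factory=list)      # entries: (action, name, query text)

    def create(self, name, text):
        self.visible[name] = text
        self.log.append(("create", name, text))

    def modify(self, name, text):
        # Unlike a command-line history, an existing query is changed in place.
        if name not in self.visible:
            raise KeyError(f"no visible query named {name!r}")
        self.visible[name] = text
        self.log.append(("modify", name, text))

    def remove(self, name):
        # The query leaves the display, but its text stays in the log.
        text = self.visible.pop(name)
        self.log.append(("remove", name, text))

    def recover(self, name):
        # Restore the most recently removed version of this query from the log.
        for action, logged_name, text in reversed(self.log):
            if action == "remove" and logged_name == name:
                self.visible[name] = text
                self.log.append(("recover", name, text))
                return text
        raise KeyError(f"no removed query named {name!r} in the log")


session = Session()
session.create("S8", "placeholder text for S8")
session.modify("S8", "placeholder text for S8, restricted to 1995-1997")
session.create("S10", "placeholder text for S10")
session.remove("S10")            # only useful queries remain visible
restored = session.recover("S10")
```

The point of the sketch is the separation of concerns: the display holds only the queries the searcher still finds useful, while the log retains enough history to bring any removed query back.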
