Digital Libraries: Extending and Applying Information Science and Technology
ProLISSA, October 26-27, 2000
Edward A. Fox
[email protected] http://fox.cs.vt.edu
CS DLRL Internet TIC
Virginia Tech, Blacksburg, VA, USA
Thanks!
Theo Bothma, Petrina Bothma, Peter Ingwersen, Irene Wormell, ProLISSA and DISSAnet staff, DANIDA
Acknowledgements (Selected)
Mentors: JCR Licklider, Michael Kessler, Gerard Salton
Sponsors: Adobe, IBM, Microsoft, NLM, NSF, OCLC, SOLINET, SURA, UNESCO, US Dept. of Ed. (FIPSE), …
VT Faculty/Staff: Tony Atkins, Thomas Dunbar, Debra Dudley, John Eaton, Gwen Ewing, Peter Haggerty, Gary Hooper, Gail McMillan, Len Peters, James Powell, …
VT Students: Emilio Arce, Fernando Das Neves, Brian DeVane, Robert France, Marcos Goncalves, Scott Guyer, Robert Hall, Neill Kipp, Paul Mather, Tim McGonigle, Todd Miller, Constantinos Phanouriou, William Schweiker, Ohm Sornil, Hussein Suleman, Patrick Van Metre, Laura Weiss, …
JCDL 2001
First Joint ACM/IEEE Conference on Digital Libraries
http://www.jcdl.org
June 24-28, 2001 in Roanoke, VA
Conference Committee:
General Chair: Edward A. Fox, Virginia Tech
Program Chair: Christine Borgman, UCLA
Treasurer: Neil Rowe, Naval Postgraduate School
Posters Chair: Craig Nevill-Manning, Rutgers U.
Why this topic today?
Many users (patrons) prefer digital libraries to traditional libraries or the Web
Digital library collections often are free or less expensive, so are heavily used
Most publishers are working toward digital libraries to allow access to their content
Library and information science professionals are key players in building digital libraries
Outline
Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications
Libraries of the Future
JCR Licklider, 1965, MIT Press
World
Nation
Province
City
Community
Licklider – Unified Theory?
Not ready in 1960s
Analog – unified field theory in physics
“Mess” today – segmented field, specialities
– Database <-> Knowledge <-> Content Mgmt
– Multimedia, Hypermedia, Hypertext
– Logic, Algebra, Artificial Intelligence, …
Expensive, annoying for users
– Don’t know where to look
– Don’t know how to use services
Digital Library Content
Content Types:
Text Documents: Articles, Reports, Books
Speech, Music
Video, Audio
(Aerial) Photos
Geographic Information
Models, Simulations
Software, Programs
Genome: Human, animal, plant
Bio Information
2D, 3D, VR, CAT
Images and Graphics
Locating Digital Libraries in Computing and Communications Technology Space
[Figure: axes are Computing (flops), Digital content, and Communications (bandwidth, connectivity), each running from less to more.]
Digital Libraries technology trajectory: intellectual access to globally distributed information
Grand Challenges Can
Mobilize the community
Spur creativity
Lead to important benefits in society
Push researchers to develop relevant theories
Force people to work in teams/groups
Convince funding agencies to invest
Help bring about integration of systems, interoperability, and seamless interfaces
DLs: Why of Global Interest?
National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly
Knowledge and information are essential to economic and technological growth, education
DL - a domain for international collaboration
– wherein all can contribute and benefit
– which leverages investment in networking
– which provides useful content on Internet & WWW
– which will tie nations and peoples together more strongly and through deeper understanding
Digital Libraries --- Objectives
World Lit.: 24hr / 7day / from desktop
Integrated “super” information systems: 5S: streams, structures, spaces, scenarios, societies
Ubiquitous, Higher Quality, Lower Cost
Education, Knowledge Sharing, Discovery
Disintermediation -> Collaboration
Universities Reclaim Property
Interactive Courseware, Student Works
Scalable, Sustainable, Usable, Useful
MARIAN Layers
Database Layer
Search Engine Layer
User Information Layer
User Interface Layer
User User User User
DL Components
User Interfaces
Workflow Mgr
DBMS
Search Engines
Data, MM Info
Gateways
Repository
Rights Mgr
MM/ HT Renderer
Digital Libraries Shorten the Chain from
Editor
Publisher
A&I
Consolidator
Library
Reviewer
DLs Shorten the Chain to
Author
Reader
Digital Library
Editor
Reviewer
Teacher
Learner
Librarian
Dr.
Patient
Benefits
Ease of use
Effectiveness
“The benefits of digital libraries will not be appreciated unless they are easy to use effectively.” - IITA Workshop report
Definitions
Library ++ (library+archive+museum+…)
Distributed information system + organization + effective interface
User community + collection + services
Digital objects, repositories, IPR management, handles, indexes, federated search, hyperbase, annotation
Outline
Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications
PetaPlex Top View
4 ft. per side
PetaPlex Side View
4 ft. wide, 8 ft. high, 15 shelves
Roles: Support, Cooling, Power
PetaPlex Complex
FRONT END MACHINE: RS/6000, 1G RAM, 4 Proc.
[Diagram: front end machine, Service Machines 1-4, and an array of Nanoservers]
PetaPlex
Digital Library Machine (“super” object store): Parallel computer / storage utility
Research: inverted files, video server, …
Knowledge Systems Incorporated is supplying VT-PetaPlex-1 with 2.5 terabytes through 100 nodes:
Net connection + 25GB disk + 233 MHz Pentium + Linux
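As a quick check on the figures above, 100 nodes with 25 GB of disk each give 2.5 terabytes of raw storage. The sketch below reproduces that arithmetic and adds one possible way a "super" object store could spread objects over its nodes. The hash-based placement rule and the function names are assumptions for illustration only; the slides do not say how PetaPlex assigns objects to nodes.

```python
import hashlib

NODES = 100    # nodes in VT-PetaPlex-1 (from the slide)
DISK_GB = 25   # disk per node in GB (from the slide)


def total_capacity_tb(nodes=NODES, disk_gb=DISK_GB):
    """Raw capacity in terabytes, taking 1 TB = 1000 GB."""
    return nodes * disk_gb / 1000


def node_for(object_id, nodes=NODES):
    """Hypothetical placement rule: hash the object ID onto a node number."""
    digest = hashlib.sha256(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % nodes


if __name__ == "__main__":
    print(total_capacity_tb())      # 2.5
    print(node_for("etd-12345"))    # some node number in 0..99
```

A fixed hash keeps placement deterministic, so any front end machine could locate an object without a central directory; that is a design choice of this sketch, not a documented PetaPlex feature.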
Structured Video Browser (making video into hypermedia)
www.learn.umd.edu
IBrowse
Expository multimedia Narrative Structures
ICU: Information and Communication University
MPEG-7 Image Library Systems Tech.
[Diagram components: Users, Web Search Engines, WWW, Web Server, Servlet Engine, Servlets, Search Server, OSDB, MPEG-7 Description Module; numbered flow steps 1-5 and 3’-5’]
MPEG-7 Video Library Systems Tech.
Architecture
[Diagram components: Video Data, Description Generator, Description Schemes Design Tool, Description Scheme, MetaDatabase, Video Database, Retrieval Server Module, Player Presentation Module]
LMDS offers a LOT of bandwidth (comparison to previous auctions)
[Bar chart, 0 to 1200 MHz: Interactive & Video Data; Wireless Communications Service; PCS D-F Block; Digital Audio Radio Service; Cellular Unserved; PCS A-C Block; DBS; MMDS; LMDS]
LMDS is:
- 1300 MHz in two “Blocks” (28-31 GHz)
- Over 2X bandwidth of AM/FM radio, VHF/UHF television, and Cellular telephone combined.
- More than sum of previous 16 auctions
LMDS Hub Site at Slusher Hall
Radio Hut
Wavtrace Tower 1 Wavtrace Tower 2
Eventually LMDS could be used in combination with other wireless and wireline technologies to reach individual homes
SPIRE Visualization
CAVE-ETD
CAVE-ETD is a simulation of a library that runs in a CAVE (VR environment).
Populated with a subset of ETD records.
[Diagram: Main Foyer and four rooms]
Reading Book Abstract
Integrated CCLINC Translingual Information System
DARPA
Extraction
What is the north korean movement in the front line?
CCLINC SERVER
Info Detection
Summarization
It seems that North Korea launch a missile again. After North Korea launched a Daipodong missile last month, NK is perceived to proceed to an additional test launch. Korea, US and Japan enter into an alert state, and prepare for a joint response policy. Korea estimates that the additional launch will be on 09/05. Japan estimates that NK’s missile range is short. US information says that there is no sign of launch yet.
Translation
What is the status of nk missile launch against japan?
BugHanI IlBonE Ddo MiSaIlEul BalSaHan Deus HaDa
2-way Speech Translation
Outline
Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications
Definition: Digital Libraries are complex systems that
help satisfy info needs of users (societies)
provide info services (scenarios)
organize info in usable ways (structures)
present info in usable ways (spaces)
communicate info with users (streams)
5S Layers
Societies
Scenarios
Spaces
Structures
Streams
Definition: 5S Framework
Societies: interacting people (and computers)
Scenarios: services, functions, operations, methods
Spaces: domains + constraints (e.g., distance, adjacency): 2D, vector, probability
Structures: relations, trees, nodes and arcs
Streams: sequences of items (text, audio, video, network traffic)
(5 Element System: Fire, Wood, Earth, Metal, Water)
5S: Components
Societies: roles, rituals, reasons, relationships, artifacts
Scenarios: acquire, index, consult, administer, preserve
Spaces: physical, temporal, functional, presentational, conceptual
Structures: architectures, taxonomies, schema, grammars, links, objects
Streams: granularities, protocols, paths, flows, turbulences
5S: Combinations
Societies + Scenarios = user model
Societies + Scenarios + Spaces = user interface
Streams + Structures = markup
Streams + Structures + Scenarios = object
Structures + Scenarios = DBMS
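One of these combinations, Streams + Structures = markup, can be made concrete with a small sketch. Here a stream is a plain sequence of tokens and a structure is a set of labeled, properly nested ranges over stream positions; combining the two yields tagged text. The `Span` class, the `markup` function, and the tag syntax are hypothetical names chosen for this illustration and are not part of the 5S framework as presented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A structure element: a labeled range [start, end) over stream positions."""
    label: str
    start: int
    end: int


def markup(stream, spans):
    """Interleave tags from `spans` into `stream`, assuming spans nest properly."""
    opens, closes = {}, {}
    for s in spans:
        opens.setdefault(s.start, []).append(s)
        closes.setdefault(s.end, []).append(s)
    out = []
    for i in range(len(stream) + 1):
        # Close inner spans first: the ones that started latest.
        for s in sorted(closes.get(i, []), key=lambda s: s.start, reverse=True):
            out.append(f"</{s.label}>")
        # Open outer spans first: the ones that end latest.
        for s in sorted(opens.get(i, []), key=lambda s: s.end, reverse=True):
            out.append(f"<{s.label}>")
        if i < len(stream):
            out.append(stream[i])
    return " ".join(out)


if __name__ == "__main__":
    stream = ["Digital", "Libraries", "by", "Fox"]
    spans = [Span("record", 0, 4), Span("title", 0, 2), Span("author", 3, 4)]
    print(markup(stream, spans))
```

The same stream could carry a different structure (for example, sentence or page boundaries), which is the point of keeping the two S's separate.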
Outline
Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications
Complex to Simple
MARC ($50) Dublin Core (DC)
Author’s tools: www.physik.uni-oldenburg.de/EPS/mmm
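The move from MARC to Dublin Core can be pictured as a crosswalk: many tagged fields and subfields collapse into a few simple elements. The sketch below is a minimal illustration under stated assumptions. The five mappings shown are a small, simplified subset of commonly used MARC-to-DC correspondences, the `(tag, subfield, value)` input shape and the `marc_to_dc` name are hypothetical, and the sample record reuses the Licklider book cited earlier in the talk.

```python
# Simplified, partial crosswalk (an assumption for illustration):
# (MARC tag, subfield code) -> Dublin Core element.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("650", "a"): "subject",
}


def marc_to_dc(fields):
    """Convert (tag, subfield, value) triples into a dict of DC element -> values."""
    dc = {}
    for tag, code, value in fields:
        element = CROSSWALK.get((tag, code))
        if element is not None:
            # Drop trailing cataloguing punctuation such as " /" or ",".
            dc.setdefault(element, []).append(value.strip(" /:,"))
    return dc


if __name__ == "__main__":
    record = [
        ("100", "a", "Licklider, J. C. R."),
        ("245", "a", "Libraries of the future /"),
        ("260", "b", "MIT Press,"),
        ("260", "c", "1965"),
        ("300", "a", "(physical description, not mapped)"),
    ]
    print(marc_to_dc(record))
```

Fields with no mapping are silently dropped, which is exactly the trade the slide points to: less detail in exchange for records that authors and small archives can produce themselves.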
DL Components
User Interfaces
Workflow Mgr
DBMS
Search Engines
Data, MM Info
Gateways
Repository
Rights Mgr
MM/ HT Renderer
OAI Philosophy
Self-archiving = submission mechanism
Long-term storage system = archive
Open interface = harvesting mechanism
Data provider + service provider
Start with “gray literature”
– e-prints/pre-prints, reports, dissertations, …
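The split between data provider and service provider can be sketched in a few lines. This is an in-memory illustration only, not the OAI wire protocol: the class names, method names, and the datestamp-based selective harvest are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Record:
    identifier: str
    datestamp: date
    metadata: dict


@dataclass
class DataProvider:
    """An archive: accepts self-archived submissions and exposes an open interface."""
    records: list = field(default_factory=list)

    def submit(self, record):
        """Self-archiving: the submission mechanism."""
        self.records.append(record)

    def list_records(self, since=None):
        """Open interface: records dated on or after `since` (all if None)."""
        return [r for r in self.records if since is None or r.datestamp >= since]


@dataclass
class ServiceProvider:
    """Harvests metadata from archives and builds a service (here, title search)."""
    index: dict = field(default_factory=dict)

    def harvest(self, provider, since=None):
        """Copy metadata into the local index; return how many records were seen."""
        harvested = provider.list_records(since)
        for r in harvested:
            self.index[r.identifier] = r.metadata
        return len(harvested)

    def search(self, word):
        word = word.lower()
        return sorted(
            ident for ident, meta in self.index.items()
            if word in meta.get("title", "").lower()
        )
```

For example, a service provider that harvested yesterday can call `harvest(archive, since=yesterday)` and pick up only new submissions, while the archive itself needs no search service of its own.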
Archive of Digital Objects
Archive Access Protocol
Handle (ID)
Digital object
terms and conditions
OAI – Repository Perspective
Required: Protocol
[Diagram: a repository holding several DO items and several MDO items]
OAI – Black Box Perspective
OA 1
OA 2
OA 4
OA 3
OA 5
OA 6
OA 7
Black Box OAI-ETD Perspective
ISTEC (Ibero America)
PhysDis
NSYSU (Taiwan)
ADT (Australia)
BN.PT (Portugal)
www.theses.org
CyberTheses (Francophone)
VT
Dissert.Online (Germany)
MIT
OhioLINK
CBUC (Catalunya)
NDC (Greece)
SEALS (S. Africa)
CIC
U. Bergen (Norway)
…
Outline
Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications
Presidential Directive - 12/17/1999
Subject: Use of Information Technology to Improve Our Society
“13. The Secretary of the Smithsonian Institution, the Director of the National Science Foundation, the Director of the National Park Service, and the Director of the Institute of Museum and Library Services shall work with the private sector and cultural and educational institutions across the country to create a Digital Library of Education to house this country's cultural and educational resources.”
Programmatic History
Digital Libraries Initiative (DLI 1) - NSF/NASA/ARPA, FY 94-97
DLI 2 - NSF, et al., initiated in FY 98, continuing
DLI 2 Special Emphasis in UG Education, FY 98-99
DLs & UG Earth Systems Education, initiated FY 99, continuing
NSDL Program, NSF: FY 00-02
DL Operational: Fall, 2002
Vision
A Learning Environments and Resources Network for SMET Education (LEARNS)
Designed to meet the needs of learners, in both individual and collaborative settings
Constructed to enable dynamic use of a broad array of materials for learning, primarily in digital format
Managed actively to promote reliable anytime - anywhere access to quality collections and services, available both within and without the network
“The network is the library.”
LEARNS Connects:
Users: students, educators, life-long learners
Content: structured learning materials; large real-time or archived datasets; audio, images, animations; primary sources; digital learning objects (e.g. applets); interactive (virtual, remote) laboratories; ...
Tools: search; refer; validate; integrate; create; customize; publish; share; notify; collaborate; ...
Expectations of Tracks
Core Integration: to coordinate a distributed alliance of resource collection and service providers, and to ensure reliable and extensible access to and usability of the resulting network of learning environments and resources
Collections: to aggregate and actively manage a subset of the digital library’s content within a coherent theme or specialty
Services: to increase the impact, reach, efficiency, and value of the digital library in its fully operational form
Targeted Research: to have immediate impact on one or more of the other three tracks
CS Teaching Center (CSTC)
Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.
Learners benefit from having well-crafted modules that have been reviewed and tested.
Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built.
ACM Education Board and SIG support, new NSF grant with COLLEGIS Research Institute and others …
Browsing (1)
Browsing (2)
A Digital Library Case Study
Domain: graduate education, research
Genre: ETDs = electronic theses & dissertations
Submission: http://etd.vt.edu
Collection: http://www.theses.org
Project: Networked Digital Library of Theses & Dissertations http://www.ndltd.org
(NDLTD – remember: ND LTD / NDL TD)
(also, newer NUDL: Networked University Digital Library, with e-courseware, etc.)
Grad Program, Library, IT, Ed. (Tech)
The Networked Digital Library of Theses and Dissertations
www.NDLTD.org
Leader of the Worldwide ETD (Electronic Thesis and Dissertation) Initiative
Training Authors
Expanding Access
Preserving Knowledge
Improving Graduate Education
Enhancing Scholarly Communication
Empowering Students & Universities
What are the long term goals?
Attract all TDs/yr: 50K D-US, 25K D-Germany, 10K TD-Canada, …
>200K/yr rich hypermedia ETDs that may turn into electronic portfolios (images, video, audio, …)
Dramatic increase in knowledge sharing: literature reviews, bibliographies, …
Services providing lifelong access for students: browse, search, prior searches, citation links
Hundreds/thousands of downloads / year / work
Student Gets Committee Signatures and Submits ETD
Signed
Grad School
Graduate School Approves ETD, Student is Graduated
Ph.D.
Library Catalogs ETD, Access is Opened to the New Research
WWW
NDLTD
User Search Support (multilingual, XML)
NDLTD World Federated Search
User Interface
Virginia Tech ... (univ)
Dissertations Online (Germany)
OhioLink (lib / univ group)
Portuguese NL ... (national lib)
Australia (regional)
OAS, ISTEC (Latin America)
Note: All groups shown are connected with NDLTD.
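A federated search of this kind sends one query to every member site and merges what comes back. The sketch below shows that fan-out and merge pattern under simplifying assumptions: each site is represented by a hypothetical callable that returns `(score, title)` pairs, scores are assumed to be comparable across sites, and a site that fails is skipped. It is not the NDLTD implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(query, sites):
    """Query every site in parallel and merge hits by descending score.

    `sites` maps a site name to a callable taking the query and
    returning a list of (score, title) pairs.
    """
    def ask(item):
        name, search = item
        try:
            return [(score, title, name) for score, title in search(query)]
        except Exception:
            return []  # one unreachable site should not break the whole search

    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        batches = list(pool.map(ask, sites.items()))

    merged = [hit for batch in batches for hit in batch]
    # Highest score first; ties broken by title, then site name.
    merged.sort(key=lambda hit: (-hit[0], hit[1], hit[2]))
    return merged
```

In practice scores from different search engines are rarely comparable, so a real system would need some normalization or result fusion step before the final sort.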
Access Possibilities
Web search engines
library catalog clients
www.theses.org
www.openarchives.org
3rd Party Services (e.g., UMI)
Virginia Tech
National Library of Portugal
CBUC (Spain)
OhioLink
MIT
National Projects: AU, GE, …
Status of the Local Project
Approved by university governance Spring 1996; required starting 1/1/97
Submission & access software in place
Submission workshops for students (and faculty) occur often: beginner/adv.
Faculty training as part of Faculty Development Initiative
Over 3000 ETDs in collection – some have audio, video, large images, software, …
US University Members (44)
Air University (Alabama)
Baylor University
Brigham Young University (part, whole)
Caltech
Clemson University
College of William & Mary
Concordia University (Illinois)
East Carolina University
East Tenn. State U. – require fall 2000
Florida Institute of Technology
Florida International University
George Washington University
Louisiana State University
Marshall University (W. Va.)
Miami University of Ohio
Michigan Tech
Mississippi State University
MIT
Naval Postgraduate School (CA)
New Mexico Tech
North Carolina State University
Penn. State University
Rochester Institute of Tech.
U. of Colorado Health Science Center
U. of Florida
U. of Georgia
University of Hawaii, Manoa
U. of Iowa
U. of Kentucky
U. of Maine
U. of North Texas – required since 8/99
U. of Oklahoma
U. of South Florida
U. of Tennessee, Knoxville
U. of Tennessee, Memphis
U. of Texas at Austin – required in 2001
U. of Virginia
U. Wisconsin - Madison
Vanderbilt U.
Virginia Commonwealth U.
Virginia Tech - required since 1/97
West Virginia U. - required fall 1998
Western Michigan U.
Worcester Polytechnic Inst.
OhioLINK
Statewide Consortium
Represents 79 colleges, universities, libraries
Public Universities
Private Universities and Colleges
2-Year Colleges
Only a few (e.g., Miami U. of Ohio) are also NDLTD members on their own
National / Regional Projects
Australia
– U. New South Wales (lead)
– U. of Melbourne
– U. of Queensland
– U. of Sydney
– Australian National U.
– Curtin U. of Technology
– Griffith U.
Germany
– Humboldt University (lead)
– 3 other universities
– 5 learned societies: Math, Physics, Chemistry, Sociology, Education
– 1 computing center
– 2 major libraries
Consorci de Biblioteques Universitàries de Catalunya, as group, www.cbuc.es:
– Universitat de Barcelona
– Universitat Autònoma de Barcelona
– Universitat Politècnica de Catalunya
– Universitat Pompeu Fabra
– Universitat de Girona
– Universitat de Lleida
– Universitat Rovira i Virgili
– Universitat Oberta de Catalunya
– Biblioteca de Catalunya
India, Portugal, …
South Africa: SEALS, …
Other Countries with Members
Belgium, Brazil, Canada, Germany, Hong Kong, India, Italy, Korea, Mexico,
Netherlands, Norway, Russia, Singapore, S. Africa, S. Korea, Spain, Taiwan, UK
ECHEA / SEALS (S. Africa)
Mellon Foundation $80K
Eastern Cape Higher Education Association
South East Academic Library System
Border Technikon
Eastern Cape Technikon
University of Fort Hare
Port Elizabeth Technikon
Rhodes University (first to require outside US)
University of Port Elizabeth
University of Transkei
(and members elsewhere, e.g., University of Pretoria)
MARIAN/DEByE Mediation Middleware
[Architecture diagram]
Collections (each behind a wrapper): German PhysDis Collection, VT OAI Collection, MIT ETD Collection, Greek Hellenic Dissertations Collection, ...
Protocols: Harvest, Open Archives, Dienst, Z39.50
Metadata formats: SOIF, Dublin Core, RFC1807, MARC
Other components: 5SL Source Description, Wrapper Generator, Local Data Store, Belief Network Layer, Fusion Layer, Additional Evidential Information
Processing: Analysis, Indexing, Linking
Services: Search Services, Recommendation Services, etc.
User: NDLTD/NUDL/Digital Library User (Queries + Results)
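The role of the wrappers in this architecture can be illustrated with a small sketch: each wrapper turns one collection's native metadata into a common record form that the middleware can store and index. Everything here is an assumption for illustration. The common form, the function names, and the simplified parsing of "FIELD:: value" lines (the style used by RFC 1807, without continuation lines) are hypothetical and are not taken from MARIAN or DEByE.

```python
def wrap_rfc1807(text):
    """Parse simplified 'FIELD:: value' lines into the common record form."""
    fields = {}
    for line in text.splitlines():
        if "::" in line:
            name, value = line.split("::", 1)
            fields.setdefault(name.strip().upper(), []).append(value.strip())
    return {
        "title": " ".join(fields.get("TITLE", [])),
        "authors": fields.get("AUTHOR", []),
        "source": "rfc1807",
    }


def wrap_dublin_core(dc):
    """Map a Dublin Core dict (element -> list of values) into the common form."""
    return {
        "title": " ".join(dc.get("title", [])),
        "authors": list(dc.get("creator", [])),
        "source": "dublin_core",
    }


# One wrapper per native format; new collections plug in by adding an entry.
WRAPPERS = {"rfc1807": wrap_rfc1807, "dublin_core": wrap_dublin_core}


def mediate(raw_records):
    """Normalize (format_name, payload) pairs into common-form records."""
    return [WRAPPERS[fmt](payload) for fmt, payload in raw_records]
```

Once records share one form, the analysis, indexing, and search services only need to be written once, regardless of how many collections and protocols sit behind the wrappers.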
Build Local ETD Site
Digital Library
Policies
Inspection/Approval
Workshop/Training
ETD
ETD
In South Africa
DISSAnet papers
Library of Parliament
Howard Pim Africana Library, University of Fort Hare
Collections in 11 Languages
Cultural Heritage
…
Remember
Grand Challenge
Scaling / Technology
Framework, Theory
Simplification: DC, OAI
Example Applications
Conclusions
Consider DLs (like the poetry project/paper) in South Africa
Education is one important application
Cultural heritage and linguistic diversity are important to preserve
Technology opens up exciting opportunities
Having a framework and theory may lead to better systems and broader applicability