Post on 12-Jan-2016
Knowledge Creation for an Educational Use of Digital Libraries across Language Boundaries
US-Korea Joint Workshop on Digital Libraries, August 10-11, 2000
Sung Hyon Myaeng
Division of Information & Communication
Chungnam National University
KOREA
http://ir.cnu.ac.kr/
Myaeng @ Chungnam National University
Outline
General Goal
Project Goals
Virtual Document Concept & MIRAGE-III
Cross-Language IR Work
Project Goals Revisited
General Goal
To develop core technology for providing an educational environment where research and educational materials can be searched, shared, and used creatively through internationally interoperable digital libraries
Project Goal
Develop techniques for federated searching in a multilingual, heterogeneous digital library (DL) environment
A digital library environment for active learning with distributed, multilingual materials (OAI-compliant)
Develop a technical basis for an OAI-compliant information gathering environment using the federated searching techniques
Extend the Virtual Document work to incorporate OAI and integrate the federated searching capability
Figure labels: US & Korea; US; Korea
Virtual Document Concept & MIRAGE-III
Motivation
DL as a Dynamic Knowledge Space
  Dynamic: creation of new materials
  Knowledge: inter-connection using links
  Space: “distance” among objects (retrievable)
Virtual Document (vs. physical document)
  A document that exists virtually over existing digital resources on the Internet
  A way of sharing and exploiting existing information to create knowledge
Virtual Document: Example
Van Gogh's: Masterpieces from the Van Gogh Museum, Amsterdam, will be based entirely on the holdings of the Van Gogh Museum. The exhibition will illustrate Van Gogh's entire career, from the Potato Eaters of 1885 through Wheatfield with Crows of 1890, the year of his death. It will include such famous works as the Self Portrait as an Artist (1888), The Zouave (1888), The Bedroom (1888), and The Harvest (1888).
Embedding, Total Link
Embedding, Partial Links
The exhibition will illustrate Van Gogh's entire career, from the Potato Eaters of 1885 through Wheatfield with Crows of 1890, the year of his death. It will include such famous works as the Self Portrait as an Artist (1888), The Zouave (1888), The Bedroom (1888), and The Harvest (1888).
Van Gogh
VDoc Instantiation
Referential Link
Virtual Document: Concept
Document consisting of links only.
Types of links
  embedding / referential
  one-to-one / one-to-many / many-to-one
  specific / generic
  total / partial
  etc.
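The link dimensions listed above can be sketched as a small type model. This is only an illustration, not the MIRAGE-III data model: the class and field names are hypothetical, the comments on what each mode means are a reading of the slides, and treating the four dimensions as independent of one another is an assumption (the "etc." on the slide indicates the list is not exhaustive).

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    EMBEDDING = "embedding"      # assumed: destination content is shown inside the VDoc
    REFERENTIAL = "referential"  # assumed: destination is only pointed to


class Cardinality(Enum):
    ONE_TO_ONE = "one-to-one"
    ONE_TO_MANY = "one-to-many"
    MANY_TO_ONE = "many-to-one"


class Scope(Enum):
    SPECIFIC = "specific"
    GENERIC = "generic"


class Extent(Enum):
    TOTAL = "total"
    PARTIAL = "partial"


@dataclass(frozen=True)
class LinkType:
    """One choice along each dimension named on the slide (hypothetical model)."""
    mode: Mode
    cardinality: Cardinality
    scope: Scope
    extent: Extent

    def label(self) -> str:
        parts = (self.mode, self.cardinality, self.scope, self.extent)
        return " / ".join(part.value for part in parts)


# Example: an excerpt of a single page embedded into a virtual document.
excerpt_link = LinkType(Mode.EMBEDDING, Cardinality.ONE_TO_ONE,
                        Scope.SPECIFIC, Extent.PARTIAL)
print(excerpt_link.label())
```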
One-to-One vs One-to-Many
Depending on the cardinality of the destination
Hamlet, Prince of Denmark
1. Shakespeare
2. Hamlet - A Note on Sources The text of this play was acquired via Gopher from iretap.spies.com …………
English / Korean
Translated by C. Park
Translated by J. Lee
Translated by Y. Kim
The infant William Shakespeare may have been born on this day in 1564.
And since no other day seems a likelier candidate, ..
Generic vs Specific
Depending on the condition of the source
The infant William Shakespeare may have been born on this day in 1564. And since no other day seems a likelier candidate, ……
It is hard to believe, but once again they are new and improved. My motive in publishing these pages remains to help and stimulate others in Shakespeare studies
There are also links to the Lambs' Tales From Shakespeare (an original html edition mounted at this site). Near the bottom of the page I have placed ……
The Complete Works of William Shakespeare
Welcome to the Web's first edition of the Complete Works of William Shakespeare.
The original electronic source for this server is the Complete Moby(tm) Shakespeare, which is freely available online. There may be differences between a copy of a play ……
Generic Link
About the categories: Shakespeare's plays are often arranged in three categories: tragedy, comedy, or history. ……
Shakespeare Discussion Area: Welcome to the discussion pages to discuss Shakespeare and his work, to ask and answer questions, and to enliven the site.
Site 1
Total vs Partial - Image
Total Link
Partial Link
Total vs Partial - Video
Star Wars: Episode I
Snapshots set7
The Trade Federation army marches towards Theed
Starwars.mov
Total Link
Partial Link: Time (min, sec)
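The difference between a total and a partial link to a video can be sketched as follows. The slide only says that a partial link is given by a time in (min, sec); the start/end pair, the `MediaLink` class, and the clip times below are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

MinSec = Tuple[int, int]  # (minutes, seconds), as on the slide


@dataclass(frozen=True)
class MediaLink:
    """Link to a time-based resource; with no start/end it is a total link."""
    target: str
    start: Optional[MinSec] = None
    end: Optional[MinSec] = None

    @property
    def is_partial(self) -> bool:
        return self.start is not None or self.end is not None


def to_seconds(t: MinSec) -> int:
    minutes, seconds = t
    if minutes < 0 or not 0 <= seconds < 60:
        raise ValueError(f"invalid (min, sec) value: {t}")
    return minutes * 60 + seconds


def clip_bounds(link: MediaLink, duration_s: int) -> Tuple[int, int]:
    """Return the (start, end) offsets in seconds that the link selects."""
    start = to_seconds(link.start) if link.start is not None else 0
    end = to_seconds(link.end) if link.end is not None else duration_s
    end = min(end, duration_s)  # never run past the end of the media
    if start >= end:
        raise ValueError("link selects an empty or reversed time range")
    return start, end


total = MediaLink("Starwars.mov")
partial = MediaLink("Starwars.mov", start=(1, 30), end=(2, 15))  # made-up times
print(clip_bounds(total, 600), clip_bounds(partial, 600))
```

Because only the time range is stored, the large video file itself is never copied into the linking document.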
Virtual Document: Definition
A hub & a style sheet
A hub consists of
  R-links (referential links)
  E-links (embedding links)
  Metadata (Dublin Core + index terms)
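A minimal sketch of this definition, under stated assumptions: the hub holds only links and metadata, and the content is fetched from the linked resources when the virtual document is instantiated. The class names, the character-offset form of a partial link, the `doc:` identifiers, and the plain-text rendering that stands in for the style sheet are all hypothetical and not taken from MIRAGE-III.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ELink:
    """Embedding link: the target's content appears inside the VDoc."""
    target: str
    start: int = 0             # character offsets are an assumed way
    end: Optional[int] = None  # to express a partial link; None = to the end


@dataclass
class RLink:
    """Referential link: the VDoc only points at the target."""
    anchor: str
    target: str


@dataclass
class Hub:
    metadata: Dict[str, str]                       # Dublin Core style fields
    index_terms: List[str] = field(default_factory=list)
    e_links: List[ELink] = field(default_factory=list)
    r_links: List[RLink] = field(default_factory=list)


def instantiate(hub: Hub, fetch: Callable[[str], str]) -> str:
    """Build a readable document from the hub; stands in for the style sheet."""
    parts = [hub.metadata.get("title", "(untitled)")]
    for link in hub.e_links:
        parts.append(fetch(link.target)[link.start:link.end])
    for link in hub.r_links:
        parts.append(f"{link.anchor} -> {link.target}")
    return "\n".join(parts)


# Toy resource store standing in for documents on the Internet.
resources = {
    "doc:exhibition": "The exhibition will illustrate Van Gogh's entire career.",
}
hub = Hub(
    metadata={"title": "Van Gogh sampler", "creator": "example author"},
    index_terms=["Van Gogh", "exhibition"],
    e_links=[ELink("doc:exhibition"), ELink("doc:exhibition", 0, 14)],
    r_links=[RLink("Van Gogh", "doc:biography")],
)
rendered = instantiate(hub, resources.__getitem__)
print(rendered)
```

The hub stores no document text of its own, so the rendered result depends entirely on what `fetch` returns at instantiation time.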
Virtual Document: Benefits
Easy creation of composite documents
  links on read-only documents
  links on semantics
Retrieval of composite and component documents
Savings in storage (and network traffic)
  unnecessary to copy/store large documents
  linking a part of a large document
Virtual Document: Benefits (Cont’d)
Handling multiple versions & representations
  with one-to-many links
Annotating documents with metadata
  “community document” & support for collaborative work
Automatic reflection of changes in participating documents
The Architecture of MIRAGE-III
Figure labels: MIRAGE-REGULAR (Public DL); MIRAGE-LITE (Personal DL); Other MIRAGE-REGULAR; RetrievalServer (RS); LinkServer (LS); StorageServer (SS); MetaSearcher; Link DB; Index; VDoc; PDoc; Client; User Agent; Authoring Tool
Virtual Document Authoring Tool
Figure labels: Authoring Panel; Image Panel; Text Panel; Partial Embedding Link
Retrieval Interface (in an ordinary way)
Figure labels: Input Bar; Result List Panel; PDoc Browser; Selection; Retrieval Command
Retrieval Interface (Link-based)
Figure labels: Link Condition; Metadata Condition; Input Bar; Result List Panel; Retrieval Command; Selection; VDoc Browser
Summary
Virtual Document Concept
Link-based Retrieval
Retrieval of Composite/Component Documents
An education and knowledge management (KM) tool offering new functionality
Personal DL / Public DL
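Link-based retrieval, as summarized above, can be sketched as a search that combines a link condition with a metadata condition. This is a toy linear scan under stated assumptions: the `VDoc` record, the `vdoc:`/`pdoc:` identifiers, and the sample documents are hypothetical, and the slides do not describe how MIRAGE-III actually indexes links.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VDoc:
    vdoc_id: str
    metadata: Dict[str, str]  # e.g. Dublin Core style fields
    targets: List[str]        # identifiers of the documents it links to


def search(vdocs: List[VDoc],
           link_target: Optional[str] = None,
           metadata: Optional[Dict[str, str]] = None) -> List[str]:
    """Return ids of VDocs satisfying both conditions (a missing one always holds)."""
    hits = []
    for vdoc in vdocs:
        if link_target is not None and link_target not in vdoc.targets:
            continue  # link condition fails
        if metadata and any(vdoc.metadata.get(k) != v for k, v in metadata.items()):
            continue  # metadata condition fails
        hits.append(vdoc.vdoc_id)
    return hits


def components_of(vdocs: List[VDoc], vdoc_id: str) -> List[str]:
    """Component documents of one composite document."""
    for vdoc in vdocs:
        if vdoc.vdoc_id == vdoc_id:
            return list(vdoc.targets)
    return []


collection = [
    VDoc("vdoc:art-course", {"subject": "art"},
         ["pdoc:van-gogh-exhibit", "pdoc:van-gogh-bio"]),
    VDoc("vdoc:hamlet-notes", {"subject": "literature"},
         ["pdoc:hamlet-en", "pdoc:hamlet-ko"]),
    VDoc("vdoc:survey", {"subject": "art"}, ["pdoc:van-gogh-bio"]),
]

# Composite documents that link to a given component, optionally filtered by metadata.
print(search(collection, link_target="pdoc:van-gogh-bio"))
print(search(collection, link_target="pdoc:hamlet-ko", metadata={"subject": "art"}))
print(components_of(collection, "vdoc:hamlet-notes"))
```

For a collection of any size, an inverted index from link target to linking documents would replace the scan; the scan is used here only to keep the sketch short.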
Further Research
Multimedia: Audio & Video
Federated Searching
Diverse formats of documents
Copyright/Ownership Issues
  Rights management based on DOI (Digital Object Identifier) envisioned
Cross-Language IR (1)
Research Goal
  How far can we go with most readily available resources?
  Development of a practical CLIR system
Query translation with a bilingual dictionary
Disambiguation with co-occurrence statistics
  mutual information statistics in the target corpus only
  selection of one or more terms + term weighting
Experiments with TREC-6
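The approach outlined above can be illustrated with a toy example: each query term is looked up in a bilingual dictionary, and the candidate translations are ranked by how strongly they co-occur, in the target-language corpus only, with the candidate translations of the other query terms. This is a sketch, not the system evaluated on TREC-6. The dictionary entries, the six-sentence corpus, document-level co-occurrence, the use of positive pointwise mutual information, and the weight normalization are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Set


def pmi(a: str, b: str, docs: List[Set[str]]) -> float:
    """Positive pointwise mutual information from document co-occurrence."""
    n = len(docs)
    fa = sum(a in d for d in docs)
    fb = sum(b in d for d in docs)
    fab = sum(a in d and b in d for d in docs)
    if fab == 0:
        return 0.0  # no co-occurrence: treated as no evidence
    return max(0.0, math.log(fab * n / (fa * fb)))


def translate_query(query: List[str],
                    dictionary: Dict[str, List[str]],
                    docs: List[Set[str]],
                    top_k: int = 1) -> Dict[str, Dict[str, float]]:
    """For each source term, keep the top_k translations with normalized weights."""
    result = {}
    for term in query:
        candidates = dictionary.get(term, [])
        # Context: every candidate translation of the other query terms.
        context = [c for other in query if other != term
                   for c in dictionary.get(other, [])]
        scores = {c: sum(pmi(c, ctx, docs) for ctx in context) for c in candidates}
        best = sorted(candidates, key=lambda c: scores[c], reverse=True)[:top_k]
        total = sum(scores[c] for c in best)
        result[term] = {
            c: (scores[c] / total if total > 0 else 1.0 / len(best)) for c in best
        }
    return result


# Hypothetical Korean-English dictionary entries and a tiny English target corpus.
dictionary = {"은행": ["bank", "ginkgo"], "대출": ["loan", "lending"]}
corpus = [
    "the bank approved the loan",
    "bank loan interest rates rose",
    "ginkgo trees line the street",
    "the ginkgo leaves turned yellow",
    "lending by the bank slowed",
    "the street market opened",
]
docs = [set(text.split()) for text in corpus]

single = translate_query(["은행", "대출"], dictionary, docs, top_k=1)
multi = translate_query(["은행", "대출"], dictionary, docs, top_k=2)
print(single)
print(multi)
```

In this toy corpus "bank" co-occurs with both "loan" and "lending" while "ginkgo" co-occurs with neither, so the ambiguous term resolves to "bank". With `top_k=2`, both translations of the second term are kept with equal weight, which corresponds to selecting more than one term and weighting them.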
Cross-Language IR (2)
Intermediate conclusions:
  Using a target corpus for disambiguation can give reasonable performance.
  We haven’t reached the upper limit.
Ongoing work
  Different disambiguation & weighting methods?
  Ways to accurately translate phrases?
Project Goal - Revisited
Develop techniques for federated searching in a multilingual, heterogeneous digital library (DL) environment
A digital library environment for active learning with distributed, multilingual materials (OAI-compliant and others)
Develop a technical basis for an OAI-compliant information gathering environment using the federated searching techniques
Extend the Virtual Document work to incorporate OAI and integrate the federated searching capability
Figure labels: US & Korea; US; Korea