Object-Relational Database Applications -- The UC Berkeley Environmental Digital Library

  • Object-Relational Database
    Applications -- The UC Berkeley Environmental Digital Library

    University of California, Berkeley

    School of Information Management and Systems

    SIMS 257: Database Management

  • Today

    Object Relational Database Applications
    The Berkeley Digital Library Project
    Slides from RRL and Robert Wilensky, EECS
    Use of DBMS in DL project
  • Final Presentations and Reports

    Specifications for the final report are on the Web site under Assignments
    Presentations (1 on Nov. 28, others on Nov. 30, Dec. 5 and Dec. 7 (Full))
  • Today

    Object Relational Applications
    The UCB Digital Library
  • Overview

    What is a Digital Library?
    Overview of Ongoing Research on Information Access in Digital Libraries
  • Digital Libraries Are Like Traditional Libraries...

    Involve large repositories of information (storage, preservation, and access)
    Provide information organization and retrieval facilities (categorization, indexing)
    Provide access for communities of users (communities may be as large as the general public or as small as the employees of a particular organization)
  • Traditional Library System

    [Diagram labels: Originators, Libraries, Users]

  • But Digital Libraries Are Different From Libraries...

    Not a physical location with local copies; objects held closer to originators
    Decoupling of storage, organization, access
    Enhanced Authoring (origination, annotation, support for work groups)
    Subscription, pay-per-view supported in addition to free browsing
    Integration into user tasks
  • A Digital Library Infrastructure Model

    [Diagram labels: Originators, Repositories, Users, Index, Services, Network]

  • UC Berkeley Digital Library Project

    Focus: Work-centered digital information services
    Testbed: Digital Library for the California Environment
    Research: Technical agenda supporting user-oriented access to large distributed collections of diverse data types.
    Part of the NSF/NASA/DARPA Digital Library Initiative (Phases 1 and 2)
  • UCB Digital Library Project: Research Organizations

    UC Berkeley EECS, SIMS, CED, IS&T
    UCOP
    Xerox PARC's Document Image Decoding group and Work Practices group
    Hewlett-Packard
    NEC
    SUN Microsystems
    IBM Almaden
    Microsoft
    Ricoh California Research
    Philips Research
  • Testbed: An Environmental
    Digital Library

    Collection: Diverse material relevant to California's key habitats.
    Users: A consortium of state agencies, development corporations, private corporations, regional government alliances, educational institutions, and libraries.
    Potential: Impact on state-wide environmental system (CERES)
  • The Environmental Library -
    Users/Contributors

    California Resources Agency, California Environment Resources Evaluation System (CERES)
    California Department of Water Resources
    The California Department of Fish & Game
    SANDAG
    UC Water Resources Center Archives
    New Partners: CDL and SDSC
  • The Environmental Library - Contents

    Environmental technical reports, bulletins, etc.
    County general plans
    Aerial and ground photography
    USGS topographic maps
    Land use and other special purpose maps
    Sensor data
    Derived information
    Collection data bases for the classification and distribution of the California biota (e.g., SMASCH)
    Supporting 3-D, economic, traffic, etc. models
    Videos collected by the California Resources Agency
  • The Environmental Library - Contents

    As of late 2000, the collection represents about one terabyte of data, including over 165,000 digital images, about 300,000 pages of environmental documents, and nearly 2 million records in geographical and botanical databases.
  • Botanical Data:

    The CalFlora Database contains taxonomic and distribution information for more than 8000 native California plants. The Occurrence Database includes over 600,000 records of California plant sightings from many federal, state, and private sources. The botanical databases are linked to our CalPhotos collection of California plants, and are also linked to external collections of data, maps, and photos.
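
    The linkage described above (taxon records, sighting records, and photos tied together by a shared plant identifier) can be sketched as a small relational schema. This is only an illustration using Python's built-in sqlite3 module: the table names, column names, and sample rows are assumptions made for the example, not the project's actual schema or data.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE taxon (
        taxon_id        INTEGER PRIMARY KEY,
        scientific_name TEXT NOT NULL
    );
    CREATE TABLE occurrence (
        occurrence_id INTEGER PRIMARY KEY,
        taxon_id      INTEGER NOT NULL REFERENCES taxon(taxon_id),
        county        TEXT NOT NULL,
        source        TEXT NOT NULL   -- contributing agency or collection
    );
    CREATE TABLE photo (
        photo_id  INTEGER PRIMARY KEY,
        taxon_id  INTEGER NOT NULL REFERENCES taxon(taxon_id),
        file_name TEXT NOT NULL
    );
""")

# Made-up sample rows, for illustration only.
conn.executemany("INSERT INTO taxon VALUES (?, ?)", [
    (1, "Quercus agrifolia"),
    (2, "Eschscholzia californica"),
])
conn.executemany("INSERT INTO occurrence VALUES (?, ?, ?, ?)", [
    (10, 1, "Alameda", "example-source-a"),
    (11, 1, "Marin", "example-source-b"),
    (12, 2, "Alameda", "example-source-a"),
])
conn.executemany("INSERT INTO photo VALUES (?, ?, ?)", [
    (100, 1, "oak_0001.jpg"),
])


def taxon_summary(connection):
    """Return (name, sighting count, photo count) per taxon, ordered by name."""
    return connection.execute("""
        SELECT t.scientific_name,
               (SELECT COUNT(*) FROM occurrence o WHERE o.taxon_id = t.taxon_id),
               (SELECT COUNT(*) FROM photo p WHERE p.taxon_id = t.taxon_id)
        FROM taxon t
        ORDER BY t.scientific_name
    """).fetchall()


summary = taxon_summary(conn)
for name, sightings, photos in summary:
    print(name, sightings, photos)
```

    Counting with correlated subqueries, rather than joining all three tables at once, avoids multiplying sighting rows by photo rows for the same plant.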

  • Geographical Data:

    Much of the geographical data in our collection is being used to develop our web-based GIS Viewer. The Street Finder uses 500,000 TIGER records of S.F. Bay Area streets along with 70,000 records from the USGS GNIS database. California Dams is a database of information about the 1395 dams under state jurisdiction. An additional 11 GB of geographical data represents maps and imagery that have been processed for inclusion as layers in our GIS Viewer. This includes Digital Ortho Quads and DRG maps for the S.F. Bay Area.

  • Documents:

    Most of the 300,000 pages of digital documents are environmental reports and plans that were provided by California state agencies. This collection includes documents, maps, articles, and reports on the California environment including Environmental Impact Reports (EIRs), educational pamphlets, water usage bulletins, and county plans. Documents in this collection come from the California Department of Water Resources (DWR), California Department of Fish and Game (DFG), San Diego Association of Governments (SANDAG), and many other agencies. Among the most frequently accessed documents are County General Plans for every California county and a survey of 125 Sacramento Delta fish species.

  • Documents - cont.

    The collection also includes about 20Mb of full-text (HTML) documents from the World Conservation Digital Library. In addition to providing online access to important environmental documents, the document collection is the testbed for our Multivalent Document research.

  • Testbed Success Stories

    LUPIN: CERES Land Use Planning Information Network
      California County General Plans and other environmental documents.
      Enter at Resources Agency Server, documents stored at and retrieved from UCB DLIB server.
    California flood relief efforts
      High demand for some data sets only available on our server (created by document recognition).
    CalFlora: Creation and interoperation of repositories pertaining to plant biology.
    Cloning of services at Cal State Library, FBI
  • Research Highlights

    Documents
      Multivalent Document prototype
      Page images, structured documents, GIS data, photographs
    Intelligent Access to Content
      Document recognition
      Vision-based Image Retrieval: stuff, thing, scene retrieval
      Natural Language Processing: categorizing the web, Cheshire II, TileBar Interfaces
  • Multivalent Documents

    MVD Model
      radically distributed, open, extensible
      behaviors and layers
      behaviors conform to a protocol suite
      inter-operation via IDEG
    Applied to enlivening legacy documents
      various nice behaviors, e.g., lenses
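
    The split between layers (content about a document) and behaviors (code that conforms to a shared protocol) can be sketched roughly as follows. This is a minimal illustration in Python, not the MVD implementation: the class names, the single `render` method, and the sample layers are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class Layer:
    """A named bundle of content about one document (e.g. OCR text)."""
    name: str
    content: dict


class Behavior(Protocol):
    """Hypothetical protocol: each behavior may contribute to the output."""

    def render(self, layers: dict[str, Layer], output: list[str]) -> None: ...


class ShowText:
    """Behavior that emits the text held in the 'ocr' layer, if present."""

    def render(self, layers, output):
        ocr = layers.get("ocr")
        if ocr is not None:
            output.append(ocr.content["text"])


class ShowAnnotations:
    """Behavior that emits notes from the 'annotations' layer, if present."""

    def render(self, layers, output):
        notes = layers.get("annotations")
        if notes is not None:
            for note in notes.content["notes"]:
                output.append(f"[note] {note}")


@dataclass
class MultivalentDocument:
    """A document is just its layers plus the behaviors applied to them."""
    layers: dict[str, Layer] = field(default_factory=dict)
    behaviors: list[Behavior] = field(default_factory=list)

    def add_layer(self, layer: Layer) -> None:
        self.layers[layer.name] = layer

    def render(self) -> list[str]:
        output: list[str] = []
        for behavior in self.behaviors:
            behavior.render(self.layers, output)
        return output


doc = MultivalentDocument(behaviors=[ShowText(), ShowAnnotations()])
doc.add_layer(Layer("ocr", {"text": "History of the Classical World"}))
rendered_before = doc.render()

# A new layer can be added later without touching the existing ones.
doc.add_layer(Layer("annotations", {"notes": ["check date"]}))
rendered_after = doc.render()
```

    Because behaviors only depend on the protocol and on layer names, a third party could add a new layer or behavior without changing the others, which is the extensibility the slide points to.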
  • Document Presentation

    Problem: Digital libraries must deliver digital documents -- but in what form?
    Different forms have advantages for particular purposes
      Retrieval
      Reuse
      Content Analysis
      Storage and archiving
    Combining forms (Multivalent documents)
  • Spectrum of Digital Document Representations

    Adapted from Fox, E.A., et al. Users, User Interfaces and Objects: Envision, an Electronic Library, JASIS 44(8), 1993

  • Document Representation: Multivalent Documents

    Primary user interface/document model for UCB Digital Library (Wilensky & Phelps)
    Goal: An approach to new document representations and their authoring.
    Supports active, distributed, composable transformations of multimedia documents.
    Enables sophisticated annotations, intelligent result handling, user-modifiable interface, composite documents.
  • Multivalent Documents

    [Diagram labels: Scanned Page Image ("History of The Classical World"), Cheshire Layer, OCR Layer, OCR Mapping Layer, GIS Layer, Table Layer, Network Protocols & Resources]

    Valence: 2: The relative capacity to unite, react, or interact (as with antigens or a biological substrate). (Webster's 7th Collegiate Dictionary)

  • MVD Third Party Work

    Japanese support by NEC; application to office document management
    Printing, support for other OCR formats, by HP
    Chinese character and multilingual lens by UCB Instructional Support staff (Owen McGrath)
    Automatic enlivening of documents via Transcend proxy.
  • MVD Forthcoming

    Support for XML + style sheets
    More robust parsing
    Saving where you want
    Media adaptors for
      Continuous media
      Near image formats, word proc. formats
    Improve authoring tools
    Interoperation with paper
    Application versus applet?
    Release to community, get feedback, iterate.
  • GIS in the MVD Framework

    Layers are georeferenced data sets.
    Behaviors are
      display semi-transparently
      pan
      zoom
      issue query
      display context
      spatial hyperlinks
      annotations
    Written in Java (to be merged with MVD-1 code line?)
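
    The idea of georeferenced layers with pan and zoom behaviors can be sketched as below. The viewer itself is described as written in Java; this is a short Python illustration only. The `BBox`, `GeoLayer`, and `Viewport` names, the `opacity` field, and the sample coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in some shared map coordinate system."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BBox") -> bool:
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class GeoLayer:
    """A georeferenced data set; opacity < 1.0 means semi-transparent display."""
    name: str
    extent: BBox
    opacity: float = 1.0


class Viewport:
    """Holds the currently displayed region and applies pan/zoom to it."""

    def __init__(self, view: BBox) -> None:
        self.view = view

    def pan(self, dx: float, dy: float) -> None:
        v = self.view
        self.view = BBox(v.west + dx, v.south + dy, v.east + dx, v.north + dy)

    def zoom(self, factor: float) -> None:
        """Zoom about the view centre; factor > 1 zooms in."""
        if factor <= 0:
            raise ValueError("zoom factor must be positive")
        v = self.view
        cx, cy = (v.west + v.east) / 2, (v.south + v.north) / 2
        half_w = (v.east - v.west) / (2 * factor)
        half_h = (v.north - v.south) / (2 * factor)
        self.view = BBox(cx - half_w, cy - half_h, cx + half_w, cy + half_h)

    def visible(self, layers: list[GeoLayer]) -> list[str]:
        """Names of layers whose extent overlaps the current view."""
        return [layer.name for layer in layers
                if layer.extent.intersects(self.view)]


layers = [
    GeoLayer("topo_map", BBox(0, 0, 10, 10)),
    GeoLayer("dams", BBox(6, 6, 8, 8), opacity=0.5),
]
viewport = Viewport(BBox(0, 0, 4, 4))
first = viewport.visible(layers)    # only the layer overlapping the start view
viewport.pan(4, 4)
second = viewport.visible(layers)   # after panning, both layers overlap
viewport.zoom(2)
```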
  • GIS Viewer: Recent Developments

    Annotation and saving
      points, rectangles (w. labels and links), vectors
      saving of annotations as separate layer
    Integration with address, street finding
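
    Saving annotations as a separate layer means the base map data is never modified: the annotations live in their own structure that refers to the layer they describe. A minimal sketch of that idea, assuming a JSON file format and field names chosen purely for illustration (the viewer's real storage format is not given here):

```python
import json


def new_annotation_layer(base_layer: str) -> dict:
    """Create an empty annotation layer that refers to a base layer by name."""
    return {"annotates": base_layer, "annotations": []}


def add_point(layer: dict, x: float, y: float, label: str = "") -> None:
    layer["annotations"].append(
        {"type": "point", "x": x, "y": y, "label": label})


def add_rectangle(layer: dict, west, south, east, north,
                  label: str = "", link=None) -> None:
    layer["annotations"].append(
        {"type": "rectangle", "bounds": [west, south, east, north],
         "label": label, "link": link})


def save_layer(layer: dict) -> str:
    """Serialize the annotation layer on its own, apart from the base data."""
    return json.dumps(layer, indent=2, sort_keys=True)


def load_layer(text: str) -> dict:
    return json.loads(text)


notes = new_annotation_layer("topo_map")
add_point(notes, 3.0, 4.5, label="sample site")
add_rectangle(notes, 0, 0, 2, 2, label="study area",
              link="https://example.org/report")  # placeholder URL

saved = save_layer(notes)
restored = load_layer(saved)
```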