UVA MDST 3703 Thematic Research Collections 2012-09-18
Thematic Research Collections
Prof. Alvarado
MDST 3703/7703
18 September 2012
[XKCD]
Business
• Anyone having problems connecting to their home directory?
  – Come see me if so
• Quiz 1 will be posted on Collab today
Comments
• “It is much more than just the technology, it’s about the conscious decision made by real live humans who design the technology.”
• “I think in order for digital representation to be able to achieve a maximum functionality there needs to be a move away from simply recreating a card catalog online with attachments to the documents.”
• “… nothing can ever truly replace the experience of being in a physical library itself.”
Comments
• “As complex as the hypertext may become, it must remain user friendly to be of any value.”
• “… the first thing that sticks out is the diversity of structure of the collections.”
• “… there [are] still some drawbacks to digital collections, for example, the lack of a great system for annotating documents.”
Comments
• “There is something to be said for walking into a library and pouring over pages, without interruption from technology, for hours and having to forge the way for your own trail of connections from one document to the next, much like the work hyperlinks do for us.”
Review
• So far, we have looked at two big ideas
  – The idea of hypertext, and its realization in HTML
  – The concept of text markup, and its realization in SGML, XML, TEI, and HTML
• Remember:
  – TEI and HTML are specific markup languages
  – SGML and XML are specifications for defining markup languages
  – XML lets you create the languages on the fly
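A minimal Python sketch of "on the fly", assuming a made-up RECIPE vocabulary (the tag names and content are hypothetical, not from the lecture). No DTD declares these tags, yet a standard XML parser reads the document because it is well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary invented for this sketch; nothing declares these tags.
doc = """<RECIPE NAME="Toast">
  <INGREDIENT>Bread</INGREDIENT>
  <INGREDIENT>Butter</INGREDIENT>
  <STEP>Toast the bread.</STEP>
</RECIPE>"""

# The parser only requires well-formedness: matched, properly nested tags.
root = ET.fromstring(doc)

# Once parsed, the invented tags can be queried like any other markup.
ingredients = [e.text for e in root.findall("INGREDIENT")]
tags = [child.tag for child in root]

print(root.tag, root.get("NAME"), ingredients)
```

The parser checks only that the tags nest correctly. Whether RECIPE is a sensible language is a question for a schema, not for the parser.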
What mechanism do SGML and XML provide to define specific markup languages?
DTDs
Document Type Definitions
More generally, these are called schemas
[DTD]
<!DOCTYPE NEWSPAPER [
<!ELEMENT NEWSPAPER (ARTICLE+)>
<!ELEMENT ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)>
<!ELEMENT HEADLINE (#PCDATA)>
<!ELEMENT BYLINE (#PCDATA)>
<!ELEMENT LEAD (#PCDATA)>
<!ELEMENT BODY (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>
<!ATTLIST ARTICLE AUTHOR CDATA #REQUIRED>
<!ATTLIST ARTICLE EDITOR CDATA #IMPLIED>
<!ATTLIST ARTICLE DATE CDATA #IMPLIED>
<!ATTLIST ARTICLE EDITION CDATA #IMPLIED>
]>
For example, a DTD for a newspaper
No need to remember the syntax for DTDs, just their purpose
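To make that purpose concrete, here is a rough Python sketch that checks a document against the main rules of the newspaper DTD by hand. It is not a real DTD validator: it only checks the root element, the order of each ARTICLE's children, and the required AUTHOR attribute. The function name `check_newspaper` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Content model from the DTD: ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)
ARTICLE_CHILDREN = ["HEADLINE", "BYLINE", "LEAD", "BODY", "NOTES"]
REQUIRED_ATTRS = ["AUTHOR"]  # AUTHOR is #REQUIRED; the others are #IMPLIED


def check_newspaper(xml_text):
    """Return a list of problems; an empty list means these checks passed."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.tag != "NEWSPAPER":
        problems.append("root element must be NEWSPAPER")
    articles = list(root)
    if not articles:
        problems.append("NEWSPAPER needs at least one ARTICLE")  # (ARTICLE+)
    for i, article in enumerate(articles, start=1):
        if article.tag != "ARTICLE":
            problems.append(f"child {i}: expected ARTICLE, found {article.tag}")
            continue
        if [c.tag for c in article] != ARTICLE_CHILDREN:
            problems.append(f"article {i}: children must be {ARTICLE_CHILDREN} in order")
        for attr in REQUIRED_ATTRS:
            if attr not in article.attrib:
                problems.append(f"article {i}: missing required attribute {attr}")
    return problems


# Hypothetical sample documents, invented for this sketch.
good = """<NEWSPAPER>
  <ARTICLE AUTHOR="A. Writer" DATE="2012-09-18">
    <HEADLINE>Sample headline</HEADLINE>
    <BYLINE>By A. Writer</BYLINE>
    <LEAD>Sample lead.</LEAD>
    <BODY>Sample body.</BODY>
    <NOTES>None</NOTES>
  </ARTICLE>
</NEWSPAPER>"""

# Same article without the AUTHOR attribute and without NOTES.
bad = """<NEWSPAPER>
  <ARTICLE>
    <HEADLINE>Sample headline</HEADLINE>
    <BYLINE>By A. Writer</BYLINE>
    <LEAD>Sample lead.</LEAD>
    <BODY>Sample body.</BODY>
  </ARTICLE>
</NEWSPAPER>"""

print(check_newspaper(good))
print(check_newspaper(bad))
```

The point is the division of labor: the DTD states the rules for the document type, and software enforces them. A validating parser would do this from the DTD itself instead of from hand-written checks.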
DTDs can also be used to define genres, such as essays, poems, novels
The distinction between document type and genre is fuzzy
Genres in the Humanities
• Primary sources
  – Tax records, letters, diaries, paintings, oral history, manuscripts, first editions, etc.
• Secondary sources
  – Essays and “monographs” (books)
• Tertiary sources
  – Encyclopedias, dictionaries, etc.
Primary Sources
Essays and books are the staple secondary sources
But these can become primary sources too …
Tertiary Sources
Is this a genre?
[The Rotunda]
[Library of Babel]
Is this a portable library or a book?
[Talmud]
Are libraries and books distinct?
If not, are there schemas for libraries?
What about this?
• Trivium
  – Grammar
  – Rhetoric
  – Logic
• Quadrivium
  – Arithmetic
  – Geometry
  – Music
  – Astronomy
Does this not form the plan of a library?
[Berners-Lee’s diagram]
Hypertext blurs the distinction between documents and libraries
Instead, we have a docuverse (or a vast intertext)
The library is one big document
Every document is a little library
Overview
• Today, we consider a set of projects that are built on this premise
  – Either as attempts to fulfill it or as reactions to it (because hypertext can be scary)
• We look at specific examples of “digital collections”
• Within the framework defined by Palmer and McGann
  – The TRC as an emerging genre of digital scholarship
What is a thematic research collection?
How is it different from a traditional library?
TRCs overcome the problem that libraries scatter content
They consolidate content
Features of the TRC
• electronic
• heterogeneous datatypes
• extensive but thematically coherent
• structured but open-ended
• research oriented
• authored or multi-authored
• interdisciplinary
• collections of digital primary resources
Critical Convergences and Effects
• They coincide with the move away from theory and toward historicism
• They produce a renewed focus on the materiality of text
• They achieve “contextual mass”
• They force collaboration and inter-disciplinarity
• They become laboratories for research
McGann on Secondary Sources
• “[W]hen scholarly journals publish their work online … in electronic form, they open their materials to integration within a scholarly network whose range and power outstrip current paper-based publication. Furthermore, electronic publishing permits scholars to present their work in far greater depth and diversity. Essays can present all their documentary evidence as part of their argument (in notes and appendices, or in electronic links to the original documents).”
Contextual Mass
Instead of building large collections, “digital research libraries should be systematically collecting sources and developing tools that work together to provide a supportive context for the research process.”
Let’s look at some examples and see how they stack up
6 Questions
1. What’s in the collection?
2. How is the collection organized? Any guiding metaphors?
3. How easy is it to find things?
4. How effective is it at achieving contextual mass? How connected are things?
5. What tools does it provide for researchers?
6. How much does it involve users in a community?
Backstory: IATH
• Institute for Advanced Technology in the Humanities
  – http://www.iath.virginia.edu
• Established in 1992
• Funded by IBM
• VOTS and RA were the two founding projects
• VOTS was a demonstration project for IBM; pitched "as a research library in a box, enabling students at places without a large archive to do the same kind of research as a professional historian."
Yea, though I walk through the valley of the shadow of death, I will fear no evil: for thou art with me; thy rod and thy staff they comfort me.
(from Psalm 23)
VOTS Intro
What’s in the site?
• Focused on primary source documents relating to the US Civil War
  – Thousands of primary source documents
  – Newspapers, letters, diaries, maps, images, gov docs
  – Augusta Co, VA and Franklin Co, PA
  – 1859 to 1870
How is it organized?
The Library Metaphor
How easy is it to find things?
Quick exercise: find out if the Confederate Army ever made it to Carlisle, PA
How connected are the parts? Does it achieve contextual mass?
Not very connected
Items have few connectors to other items
(e.g. no links in the metadata)
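One rough way to put a number on "connected" is to count links between items. The Python sketch below is only an illustration: the item ids and links are invented, not taken from the Valley of the Shadow site, and link counts are a crude proxy, not Palmer's definition of contextual mass.

```python
# Hypothetical toy collection: item id -> ids of the items it links to.
links = {
    "letter-01": ["diary-03"],
    "diary-03": [],
    "map-07": [],
    "newspaper-12": ["letter-01", "map-07"],
    "image-09": [],
}


def average_links(collection):
    """Mean number of outgoing links per item."""
    return sum(len(targets) for targets in collection.values()) / len(collection)


def isolated_items(collection):
    """Items that neither link out nor are linked to by any other item."""
    linked_to = {t for targets in collection.values() for t in targets}
    return sorted(
        item
        for item, targets in collection.items()
        if not targets and item not in linked_to
    )


print(average_links(links))   # 3 links over 5 items
print(isolated_items(links))  # items a reader can only reach by search or browse
```

A collection where most items are isolated behaves like a shelf list: things can be found, but one item rarely leads to another.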
What tools does it provide to researchers?
Tools
• Search and browse
• Timelines
• Animations
  – http://valley.lib.virginia.edu/VoS/MAPDEMO/Theater/TheTheater.html
• Resources for using the site
Does the site seek to build a community?
Not internally
The Rossetti Archive
What’s in the site?
• Focused on the works of the Pre-Raphaelite poet and painter Dante Gabriel Rossetti (1828–1882)
  – Paintings, poems, letters, etc.
• Also some secondary source material
  – Art history and literary criticism
How is the collection organized?
The site is organized as a traditional database
Search, List, Display
How easy is it to find things?
Getting to Bocca Baciata
• Find the painting, Bocca Baciata
• Search [image records]
• What do you do when you get there?
• How is the site structured?
Bocca Baciata 1859
Exercise: Find a painting of Bocca Baciata
Easy, if you know what you are looking for
How connected are the parts? Does it achieve contextual mass?
Some connectivity among parts, but not much.
What tools does it provide to researchers?
Does the site seek to build a community?
The Tibetan Himalayan Digital Archive
What’s in the site?
• A vast collection of Tibetan documents
• An interactive collection of maps
• Videos and images
How is the collection organized?
Cross between a database and a library
Hybrid
How easy is it to find things?
Exercise: Find the city of Lhasa
How connected are the parts? Does it achieve contextual mass?
The site is highly connected
It can be confusing knowing where you are
What tools does it provide to researchers?
Tools
• Interactive map
• Place dictionary
• Thesaurus
• Etc.
Does the site seek to build a community?
Yes
Other IATH Examples
• The Blake Project
  – http://www.blakearchive.org/blake/
• The World of Dante
  – http://www.worldofdante.org/
• The Chaco Archive
  – http://www.chacoarchive.org/cra/
Other Examples
• Princeton Dante Project
  – http://etcweb.princeton.edu/dante/index.html
• Perseus Project
  – http://www.perseus.tufts.edu/hopper/
• A House Divided
  – http://hd.housedivided.dickinson.edu/