Roberto Rosselli Del Turco - Università di Torino
Florentina Armaselu
Di Pietro - Università di Pisa
Lars Wieneke
Masotti - Università di Pisa
1 www.cvce.eu
Europe’s Beginnings through the Looking Glass: Publishing Historical Documents on the Web Using EVT
Summary
1. Overview of the WEU-DIPLO project
2. Experiments with Web publication platforms
3. EVT adaptation
• experiments
• publication framework overview
4. Future work
5. Conclusion
6. References
Overview of the WEU-DIPLO project: document structure. ©WEU-UEO
[Figure: document page annotated with three zones: Header, Content, Footer]
Overview of the WEU-DIPLO project
1. Goal: XML-TEI encoding, corpus analysis and Web publication of institutional documents of the W.E.U. (Western European Union):
• Topics: armament production, standardization, control in the period from 1954 to 1982;
• Source: Archives nationales de Luxembourg, W.E.U. collection.
2. Initial format:
• digitized versions (JPEG) of typewritten materials (one file per page).
3. Size:

                  Number of documents              Number of pages
Category          Total   EN   FR   FR proc.*      Total   EN    FR    FR proc.*
Note                 89   43   46   37               395   191   204   155
Minutes              30   15   15   15               256   138   118   118
Memorandum            3    1    2    2                16     7     9     9
Study                 2    0    2    1                12     0    12     8
Discourse             1    0    1    0                 4     0     4     0
Draft protocol        2    1    1    0                 4     2     2     0
Total               127   60   67   55               687   338   349   290

*proc. = processed
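The size figures can be checked with a few lines of arithmetic. The sketch below copies the page counts from the table and assumes that "FR proc." counts the French pages already processed; the function names are illustrative only.

```python
# Page counts copied from the table: (EN pages, FR pages, FR pages processed).
PAGES = {
    "Note": (191, 204, 155),
    "Minutes": (138, 118, 118),
    "Memorandum": (7, 9, 9),
    "Study": (0, 12, 8),
    "Discourse": (0, 4, 0),
    "Draft protocol": (2, 2, 0),
}


def totals(pages):
    """Sum each column over all categories: (EN, FR, FR processed)."""
    en = sum(row[0] for row in pages.values())
    fr = sum(row[1] for row in pages.values())
    proc = sum(row[2] for row in pages.values())
    return en, fr, proc


def processed_share(pages):
    """Share of French pages processed per category (0.0 when there are none)."""
    return {
        category: (proc / fr if fr else 0.0)
        for category, (_en, fr, proc) in pages.items()
    }


if __name__ == "__main__":
    en, fr, proc = totals(PAGES)
    print(f"EN pages: {en}, FR pages: {fr}, FR processed: {proc}")
    for category, share in processed_share(PAGES).items():
        print(f"{category}: {share:.0%} of French pages processed")
```

The column sums reproduce the "Total" row of the table, which is a quick consistency check on the figures.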
XML-TEI Encoding: WEU-DIPLO - metadata, header. ©WEU-UEO
Markers shown on the slide: @@hAuthor, @@hArchNum, @@hStampConfid, @@hDocRef, @@hOrigDate, @@hOrigLang, @@hVersion
XML-TEI Encoding: WEU-DIPLO – Headings, paragraphs, line breaks. ©WEU-UEO
Markers shown on the slide: @@Heading2, @@Paragraph, @@LineBreak
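The slides do not show how these markers are turned into XML-TEI. As a rough illustration only, the sketch below assumes a hypothetical plain-text convention in which each line starts with @@Heading2 or @@Paragraph and @@LineBreak may appear inside a paragraph, and maps them to the TEI elements head, p and lb. The input format and the function name are assumptions, not the project's actual workflow, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET


def markers_to_tei(lines):
    """Build a TEI-like <div> from marker-prefixed lines (hypothetical input format)."""
    div = ET.Element("div")
    for line in lines:
        marker, _, text = line.partition(" ")
        if marker == "@@Heading2":
            ET.SubElement(div, "head").text = text.strip()
        elif marker == "@@Paragraph":
            p = ET.SubElement(div, "p")
            # Text before the first line break belongs to <p> itself;
            # each later segment follows an empty <lb/> element.
            parts = text.split("@@LineBreak")
            p.text = parts[0].strip()
            for part in parts[1:]:
                lb = ET.SubElement(p, "lb")
                lb.tail = part.strip()
        # Lines with unknown markers are skipped in this sketch.
    return div


if __name__ == "__main__":
    sample = [
        "@@Heading2 Agenda",
        "@@Paragraph First typed line @@LineBreak second typed line",
    ]
    print(ET.tostring(markers_to_tei(sample), encoding="unicode"))
```

Keeping the line breaks as lb elements preserves the typewritten layout, which matters when the transcription is shown next to the facsimile.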
EVT experiments
(Partial) customisation:
• General layout: folder structure, image renaming.
• EVT Transformer: builder pack (XSLT)
o added/modified templates for transforming specific patterns (headers, footers, paragraphs); layout not fully supported (e.g. sections, subsections, paragraph indentation, etc.).
• EVT Viewer: CSS
o added/modified statements to support visualisation in the browser of specific patterns (alignment, text decoration, colour of headers, footers, etc.).
• Manual modification
o XML-TEI input: page breaks linked to the facsimile images;
o transformation output: changed HTML output to support particular features (Text-Link, HotSpot); this should not occur in the real workflow.
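Linking page breaks to facsimile images by hand is one of the manual steps listed above. A minimal sketch of how it could be automated, assuming each TEI pb element should carry a facs attribute pointing at one image per page; the image naming pattern and the function name are hypothetical, and how EVT itself resolves images is not covered here:

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def link_page_breaks(tei_xml, image_pattern="images/page_{n:03d}.jpg"):
    """Number <pb> elements in document order and add a facs attribute where missing.

    image_pattern is a hypothetical naming scheme (one JPEG per page).
    """
    root = ET.fromstring(tei_xml)
    for n, pb in enumerate(root.iter(f"{{{TEI_NS}}}pb"), start=1):
        if pb.get("n") is None:
            pb.set("n", str(n))
        if pb.get("facs") is None:
            # Existing facs values are kept untouched.
            pb.set("facs", image_pattern.format(n=n))
    return root


if __name__ == "__main__":
    sample = (
        '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>'
        "<pb/><p>First page</p><pb/><p>Second page</p>"
        "</body></text></TEI>"
    )
    ET.register_namespace("", TEI_NS)
    print(ET.tostring(link_page_breaks(sample), encoding="unicode"))
```

Because existing attributes are left alone, the function can be re-run on partly annotated files without overwriting manual corrections.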
EVT experiments – facsimile/transcription page side-by-side view (title page). ©WEU-UEO
EVT adaptation – towards a TEI-based publication framework – types of documents/features
1. Goal:
• publishing on the CVCE’s Web site different types of documents on European Integration history.
2. Types of documents (for the majority, high quality multilingual transcriptions are available - TXT, RTF, SRT formats):
• treaties;
• administrative documents (minutes, notes, memoranda);
• press articles;
• handwritten notes;
• letters;
• video and audio archives.
3. Types of features to be implemented (required / optional):
• side by side facsimile/transcription (replicating the original with more or less fidelity) (r);
• multipanel alignment (r);
• text-image link (o);
• zooming (r);
• HotSpot (o), etc.
EVT adaptation – towards a TEI-based publication framework – manuscript note (Werner corpus)
EVT adaptation/combination with other tools – towards a TEI-based publication framework – general layout
EVT adaptation – towards a TEI-based publication framework – architecture, workflow
[Diagrams: General architecture; General workflow]
Future work
1. Identification of features to be implemented in the digital editions:
• visualisation;
• search.
2. Publication framework design:
• core / plugin;
• optional / project specific.
3. Implementation of the module for XML-TEI conversion (potential adaptation of OxGarage for batch processing).
4. Implementation/integration into existing CVCE architecture:
• Back End;
• Front End.
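The core / plugin split could take many forms. One possible reading, sketched here under assumptions: features marked as required on the earlier slide form the core and are always active, while optional ones are plugins that a project switches on. All class and function names are hypothetical, and the renderers are placeholders rather than real EVT components.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Feature:
    name: str
    required: bool                 # True: core feature, False: optional plugin
    render: Callable[[str], str]   # placeholder renderer taking a document id


@dataclass
class Framework:
    features: Dict[str, Feature] = field(default_factory=dict)

    def register(self, feature: Feature) -> None:
        self.features[feature.name] = feature

    def enabled(self, plugins: Iterable[str] = ()) -> List[Feature]:
        """Core features are always on; plugins only when a project requests them."""
        requested = set(plugins)
        return [f for f in self.features.values() if f.required or f.name in requested]

    def publish(self, doc_id: str, plugins: Iterable[str] = ()) -> List[str]:
        return [f.render(doc_id) for f in self.enabled(plugins)]


def make_renderer(name: str) -> Callable[[str], str]:
    # Stand-in for real rendering code: only labels the output.
    return lambda doc_id: f"{name}:{doc_id}"


def build_demo_framework() -> Framework:
    framework = Framework()
    # (r) / (o) flags taken from the feature list on the earlier slide.
    for name, required in [
        ("side_by_side", True),
        ("multipanel_alignment", True),
        ("text_image_link", False),
        ("zooming", True),
        ("hotspot", False),
    ]:
        framework.register(Feature(name, required, make_renderer(name)))
    return framework


if __name__ == "__main__":
    fw = build_demo_framework()
    print(fw.publish("doc-1"))
    print(fw.publish("doc-1", plugins=["hotspot"]))
```

A registry of this kind would let project-specific features be added without touching the core, in line with the gradual development the slides aim for.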
Conclusion
EVT framework:
• flexible enough to support different types of documents in European integration history;
• possibility to compare original / transcription (of interest for researchers in European integration studies);
• different degrees of fidelity to the original can be envisaged (balance manual / automatic processing).
EVT adaptation:
• minimise the number of manual interventions in the XML-TEI documents;
• publication framework with modular architecture to allow gradual development and customisation according to the needs of the projects.
References
• EVT (Edition Visualization Technology): http://sourceforge.net/projects/evt-project/
• KILN: http://kiln.readthedocs.org/en/latest/#
• TEIBoilerplate: http://dcl.ils.indiana.edu/teibp/
• TEI (Text Encoding Initiative): http://www.tei-c.org
• Versioning Machine: http://v-machine.org/
• XTF (eXtensible Text Framework): http://xtf.cdlib.org/about/