Linking Primary Texts to Electronic Dictionaries
COST Workshop “Connecting Textual Corpora and Dictionaries”
Samuel Läubli (1, 3), Sabine Tittel (2), Martin-Dietrich Glessgen (1)
(1) Institute of Romance Studies, University of Zurich
(2) Dictionnaire Étymologique de l’Ancien Français (DEAF), Heidelberg Academy of Sciences and Humanities
(3) Institute of Computational Linguistics, University of Zurich
April 26, 2013
Samuel Läubli | 2/23
Contents
1. Introduction
2. Concept & Requirements
3. Interface: Back-End, Front-End
4. Plan of Action
5. Conclusion
Sabine Tittel, Samuel Läubli | 4/23
Concept & Requirements
2. Concept & Requirements
Connecting Phoenix2 and DEAFél
Aim
What do we want to do?
⇒ Include references to DocLing texts in DEAFél (attestations)
DocLing chHM 130; DocLing chMe 195; ...
DocLing: Charte chMe 195
Date: Octobre 1266
Type de document: charte: affranchissement
Auteur: Jean seigneur de Joinville et sénéchal de Champagne
...
... 44 Cil de Moustier pourront amener en la vile totes fames par mariaige qui n’ a \37 veront suite ne reclain d’ autre seignour · et autre fames non fors mes fames de cors · 45 Et li home de Moustier ne porront marier lour fillies se à mes homes non de ma propre terre · ou à ceus de la juree · 46 Les genz de Moustier ne poent faire lour fyé clers se par moi non · Et cil de \38 Moustier peuvent faire mairiage aus genz de la terre mon frere de Vauquelour, selonc l’ atiremant ...
Requirements
Phoenix2:
• Adapt to DEAF lemmatization policy
• Lemmatize texts
• Serve texts / occurrences
DEAF Writing System:
• Enhance writing system with GUIs to integrate DocLing attestations
• Fetch texts / occurrences
Both sides are connected through the INTERFACE.
Interface Back-End
Back-End: SOAP Service
We decided to implement a SOAP service
• Protocol specification for exchanging data via RPC/HTTP
• Official W3C recommendation
• Uses XML as a transport format
• Fully platform independent
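To illustrate what "XML as a transport format" means in practice, the following minimal sketch wraps a single call in a SOAP envelope using only the Python standard library. It assumes the SOAP 1.1 envelope namespace (a SOAP 1.2 service would use http://www.w3.org/2003/05/soap-envelope instead); the service namespace and the parameter element name are hypothetical placeholders, since the real ones are defined by the service's WSDL.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (assumption: the service may use SOAP 1.2 instead).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace; the real one is defined by the WSDL.
SERVICE_NS = "urn:example:ph2deafel"


def build_request(operation: str, parameter: str, value: str) -> bytes:
    """Wrap a single-parameter call in a SOAP envelope and return it as UTF-8 XML."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    ET.SubElement(call, f"{{{SERVICE_NS}}}{parameter}").text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


# The element name "lemma" is an illustrative guess at the parameter name.
request = build_request("getOccurrences", "lemma", "marïage")
print(request.decode("utf-8"))
```

Such a payload would be sent as the body of an HTTP POST to the service endpoint; because it is plain XML over HTTP, any platform with an XML parser can act as a client.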
The Phoenix2 SOAP Service provides two functions:
getOccurrences ( Lemma )
getOccurrenceDetails ( OccurrenceID )
[Diagram: DEAFél calls getOccurrences(lemma="marïage") on Phoenix2; Phoenix2 returns an OccurrenceCollection (XML) containing 0..* occurrences]
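The response side of this exchange can be sketched as follows: a client parses the returned collection into a list that may be empty (the 0..* case). The XML element and attribute names and the numeric IDs are hypothetical, since the actual structure is fixed by the service's XSD; the two spellings are taken from the chMe 195 excerpt quoted earlier.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Hypothetical payload: element names, attributes, and IDs are illustrative only.
SAMPLE_RESPONSE = """\
<OccurrenceCollection lemma="marïage">
  <Occurrence id="1" text="chMe 195">mariaige</Occurrence>
  <Occurrence id="2" text="chMe 195">mairiage</Occurrence>
</OccurrenceCollection>
"""


@dataclass(frozen=True)
class Occurrence:
    occurrence_id: int  # numeric ID, usable for a later getOccurrenceDetails call
    text: str           # DocLing text in which the form is attested
    form: str           # graphical form as found in the text


def parse_occurrences(xml_payload: str) -> List[Occurrence]:
    """Turn an OccurrenceCollection document into a (possibly empty) list."""
    root = ET.fromstring(xml_payload)
    return [
        Occurrence(int(node.get("id")), node.get("text"), node.text or "")
        for node in root.findall("Occurrence")
    ]


occurrences = parse_occurrences(SAMPLE_RESPONSE)
for occurrence in occurrences:
    print(occurrence)
```

Returning an empty list for a lemma without attestations, rather than raising an error, mirrors the 0..* cardinality of the collection.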
The Phoenix2 SOAP Service thus enables the following functionality:
getOccurrences ( Lemma )
⇒ Show all occurrences, given a lemma
getOccurrenceDetails ( OccurrenceID )
⇒ Show meta information, given the ID (a numeric identifier) of an occurrence
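How the two functions might be combined on the dictionary side can be sketched with an in-memory stand-in for the remote service. Everything here is an assumption for illustration: the Python function names, the numeric IDs, and the metadata field names are hypothetical, and the sample values are taken from the chMe 195 slide.

```python
from typing import Dict, List

# In-memory stand-in for the remote service (hypothetical IDs and field names).
FAKE_OCCURRENCES: Dict[str, List[int]] = {"marïage": [1, 2]}
FAKE_DETAILS: Dict[int, Dict[str, str]] = {
    1: {"text": "DocLing chMe 195", "date": "Octobre 1266", "form": "mariaige"},
    2: {"text": "DocLing chMe 195", "date": "Octobre 1266", "form": "mairiage"},
}


def get_occurrences(lemma: str) -> List[int]:
    """Stand-in for getOccurrences: IDs of all occurrences of a lemma."""
    return FAKE_OCCURRENCES.get(lemma, [])


def get_occurrence_details(occurrence_id: int) -> Dict[str, str]:
    """Stand-in for getOccurrenceDetails: meta information for one occurrence."""
    return FAKE_DETAILS[occurrence_id]


def attestations_for(lemma: str) -> List[str]:
    """First list the occurrences of a lemma, then look up each one's details."""
    lines = []
    for occurrence_id in get_occurrences(lemma):
        details = get_occurrence_details(occurrence_id)
        lines.append(f'{details["form"]} ({details["text"]}, {details["date"]})')
    return lines


for line in attestations_for("marïage"):
    print(line)
```

Splitting the lookup into a list call and a per-ID detail call keeps the first response small; details are fetched only for the occurrences actually needed.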
Try it yourself
SOAP Endpoint (document/literal): http://sa.muel.tv/test/soap/ph2deafel.wsdl
Short WSDL Documentation: http://sa.muel.tv/test/soap/doc/wsdl.html
XML Schema Definitions (XSD): http://sa.muel.tv/test/soap/doc/xsd.html
Sabine Tittel | 16/23
Interface Front-End
Front-End
The DEAF is developing a number of graphical user interfaces (GUIs) which
• Build upon the DEAF’s electronic dictionary writing system
• Allow for an integration of DocLing material
⇒ No complete blend: DocLing material will continue to be recognizable as external material
Plan of Action
Release
Our joint work is planned for release in three steps:
1. All materials without semantic structure
2. All materials with a dedicated semantic structure for DocLing entries
3. Full integration of the DocLing entries into the DEAF article structure
Milestones
Phoenix2:
✓ Migrate old lemmata
• Implement SOAP service
• Lemmatize texts
DEAF Writing System:
• Implement new GUIs
• Adapt publication format (web edition)
⇒ First version due in autumn 2013
Conclusion
Benefits
Both DocLing and Phoenix2 benefit from our cooperation:
Phoenix2: The vocabulary of the DocLing texts is embedded into its natural context of the Old French language and, via the DEAF’s etymological discussion, into the broader context of the Romance languages.
DEAF: A considerable number of digital source texts is added to the dictionary. This new source material will strengthen the foundation of the semantic structure of the DEAF articles and enhance their quality.
Conclusion
Questions?
Feedback is always very welcome.
Thank You
These slides are available at www.cl.uzh.ch/people/team/laeubli.html