Transcript
Page 1: Exploring Citation Networks to Study Intertextuality in Classics

Exploring Citation Networks to Study Intertextuality in Classics

Matteo Romanello (DAI, KCL) @mr56k

Digital Classics Assoc. Conference, Buffalo NY, April 5-6, 2013

Matteo Romanello (DAI, KCL) @mr56k Exploring Citation Networks to Study Intertextuality in Classics

Page 2: Exploring Citation Networks to Study Intertextuality in Classics

Digital Classicist Seminar Berlin

http://de.digitalclassicist.org/berlin
http://www.youtube.com/playlist?list=PLq4Pz4R7ts0UqSn0bgAgeX1lEpkL0SDs2

Page 3: Exploring Citation Networks to Study Intertextuality in Classics

Cyberinfrastructure for Classics

(Crane, Seales, and Terras 2009, 26):

Scholarly disciplines such as classics need specialized named entity searches: we need to determine not only whether “Th. 1.38” is a citation to a primary source but also, if so, whether it designates Thucydides, book 1, chapter 38, Theocritus, Idyll 1, line 38 or some other text.

Page 4: Exploring Citation Networks to Study Intertextuality in Classics

PhD Research Project

Scope:
- modern (19th century onwards) publications in Classics

Goal:
- new/more effective means to find information for studying classical texts over (possibly) large text corpora

Methods:
- to capture and make computable canonical citations by applying Computer Science methods/tools (NLP, ontology modelling)

Page 5: Exploring Citation Networks to Study Intertextuality in Classics

Big Picture

Page 6: Exploring Citation Networks to Study Intertextuality in Classics

Canonical Citations

Why:
- references to object of research
  - not only Classical texts: Bible, Shakespeare, etc.
- pre-digital text interoperability

Challenges:
- ambiguity: “Th.” Thucydides, Theogonia or Thebaid?
- alternative forms: “Hom. Od. I 1” / “α 1”
- implicit domain knowledge (e.g. opus maximum)
- discursive form
- underspecification †
- lack of resources

Hom. Il. I 1 ; Athen. Deipn., X 412a ; Arist. Poetics 1451a35-b6
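
The ambiguity challenge can be illustrated with a small lookup sketch. The abbreviation table below is a hand-made toy dictionary and `candidates` / `is_ambiguous` are illustrative helper names, none of them taken from the talk. It only shows why “Th.” on its own cannot be resolved without further context.

```python
# Toy abbreviation dictionary (an assumption for illustration only;
# a real resource would cover far more authors and works).
ABBREVIATIONS = {
    "Th.": ["Thucydides", "Theocritus", "Theogonia (Hesiod)", "Thebaid (Statius)"],
    "Hom.": ["Homer"],
    "Aen.": ["Aeneid (Vergil)"],
}


def candidates(abbreviation):
    """Return every known expansion of an abbreviation (empty list if unknown)."""
    return ABBREVIATIONS.get(abbreviation, [])


def is_ambiguous(abbreviation):
    """An abbreviation is ambiguous when it has more than one expansion."""
    return len(candidates(abbreviation)) > 1


if __name__ == "__main__":
    for abbr in ("Th.", "Hom."):
        print(abbr, "->", candidates(abbr), "| ambiguous:", is_ambiguous(abbr))
```

Choosing among the candidates would need extra evidence, for instance whether the scope that follows fits the structure of the candidate work.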

Page 7: Exploring Citation Networks to Study Intertextuality in Classics

21st Century Classics

Going beyond:
- commentaries
- indexes
- bibliographies
- thesauri
- full-text search

Intertextuality:
- discovery of possible intertextual phenomena
  - text re-use
  - allusion detection
- track already studied parallels over time

Page 8: Exploring Citation Networks to Study Intertextuality in Classics

Citation Network (Macro)

Page 9: Exploring Citation Networks to Study Intertextuality in Classics

Citation Network (Meso)

Page 10: Exploring Citation Networks to Study Intertextuality in Classics

Citation Network (Micro)

Page 11: Exploring Citation Networks to Study Intertextuality in Classics

Creating an Annotated Corpus

Page 12: Exploring Citation Networks to Study Intertextuality in Classics

Corpora: JSTOR & APh

JSTOR
- comprehensive: ~71k papers
- (sometimes noisy) OCR
- API, license agreement

APh
- high density of canonical citations
- clean text

Page 13: Exploring Citation Networks to Study Intertextuality in Classics

L’Année Philologique as corpus

- APh volumes: 1 (1924)–80 (2009)
- my corpus
  - 7.5-8% of vol. 75
  - ~30k tokens (clearly transcribed text)
  - multilingual (de, fr, en, es, it)
- annotations
  - POS (automatic)
  - NE (manually corrected)
- CC License (BY-NC-SA)
- https://github.com/mromanello/APh_Corpus
- see Romanello (2013)

Page 14: Exploring Citation Networks to Study Intertextuality in Classics

Example: APh 75-06697

S. Braund & G. Gilbert, “An ABC of epic ira: anger, beasts, and cannibalism”, Yale Classical Studies 32: 250-285.

In Statius’ « Achilleid » (2, 96-102) Achilles describes his diet of wild animals in infancy, which rendered him fearless and may indicate another aspect of his character - a tendency toward aggression and anger.

The portrayal of angry warriors in Roman epic is effected for the most part not by direct descriptions but indirectly, by similes of wild beasts (e.g. Vergil, Aen. 12, 101-109 ; Lucan 1, 204-212 ; Statius, Th. 12, 736-740 ; Silius 5, 306-315).

These similes may be compared to two passages from Statius (Th. 1, 395-433 and 8, 383-394) that portray the onset of anger in direct narrative. Analysis of these passages demonstrates that the concept of « ira » in epic takes its moral aspect from the context.

Page 15: Exploring Citation Networks to Study Intertextuality in Classics

Annotation Scheme

AAUTHOR: In Statius’ « Achilleid » (2, 96-102) […]

AWORK: In Statius’ « Achilleid » (2, 96-102) […]

REFAUWORK: Vergil, Aen. 12, 101-109 […]

REFSCOPE: Silius 5,306-315 […] Vergil, Aen. 12, 101-109 […]
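
The scheme can be pictured as labelled character spans over the abstract text. The sketch below is only one possible encoding: the `Entity` class and `annotate` helper are hypothetical names, and which words each label covers is an assumption, because the highlighting on the original slide did not survive extraction.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    label: str  # one of AAUTHOR, AWORK, REFAUWORK, REFSCOPE
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


def annotate(text, label, surface):
    """Locate the first occurrence of `surface` in `text` and wrap it as an Entity."""
    start = text.index(surface)  # raises ValueError if the span is absent
    return Entity(label, start, start + len(surface))


sentence = "similes of wild beasts (e.g. Vergil, Aen. 12, 101-109 ; Silius 5, 306-315)"

# Assumed reading of the scheme: author/work abbreviation vs. cited scope.
entities = [
    annotate(sentence, "REFAUWORK", "Vergil, Aen."),
    annotate(sentence, "REFSCOPE", "12, 101-109"),
    annotate(sentence, "REFSCOPE", "5, 306-315"),
]

if __name__ == "__main__":
    for e in entities:
        print(e.label, repr(sentence[e.start:e.end]))
```

Character offsets keep the annotation separate from the text, so the same sentence can carry several overlapping or adjacent labels.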

Page 16: Exploring Citation Networks to Study Intertextuality in Classics

NE and Relation Annotation

Page 17: Exploring Citation Networks to Study Intertextuality in Classics

NE Resolution

Page 18: Exploring Citation Networks to Study Intertextuality in Classics

Machine Learning-based Approach

Page 19: Exploring Citation Networks to Study Intertextuality in Classics

Named Entity Features

Linguistic Features:
- POS tags
- neighboring words

Orthographic Features:
- punctuation
- brackets
- case
- number
- pattern

Semantic Features:
- matching vs dictionary of author names/abbreviations
- matching vs dictionary of author titles/abbreviations
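
A minimal sketch of how the three feature groups might be computed for one token. The function names, feature keys, mini-dictionaries and POS tags are all assumptions made for illustration; the actual feature extraction used in the project is not shown on the slide.

```python
import re

# Hypothetical mini-dictionaries; real ones would be far larger.
AUTHOR_NAMES = {"vergil", "lucan", "statius", "silius", "hom."}
WORK_TITLES = {"aen.", "th.", "achilleid", "il.", "od."}
BRACKETS = {"(", ")", "[", "]"}


def token_shape(token):
    """Collapse a token into a coarse pattern, e.g. '101-109' -> 'd-d'."""
    shape = re.sub(r"[A-Z]+", "A", token)
    shape = re.sub(r"[a-z]+", "a", shape)
    return re.sub(r"\d+", "d", shape)


def token_features(tokens, pos_tags, i):
    """Build a feature dictionary for the token at position i."""
    tok = tokens[i]
    return {
        # linguistic features
        "pos": pos_tags[i],
        "prev_word": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
        # orthographic features
        "has_punct": any(not c.isalnum() for c in tok),
        "is_bracket": tok in BRACKETS,
        "is_capitalised": tok[:1].isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        "pattern": token_shape(tok),
        # semantic features (dictionary matching)
        "in_author_dict": tok.lower() in AUTHOR_NAMES,
        "in_work_dict": tok.lower() in WORK_TITLES,
    }


if __name__ == "__main__":
    tokens = ["Vergil", ",", "Aen.", "12", ",", "101-109"]
    pos_tags = ["NNP", ",", "NNP", "CD", ",", "CD"]  # made-up tags
    print(token_features(tokens, pos_tags, 2))
```

Feature dictionaries of this kind are a common input format for sequence labellers, which is why the sketch returns one dictionary per token.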

Page 20: Exploring Citation Networks to Study Intertextuality in Classics

Active Annotation (AA)

- based on Active Learning paradigm (Ekbal et al. 2011)
- tenet: training more effective when selection is supervised vs random
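
The selection idea can be sketched as follows. Least-confidence sampling is used here as a stand-in criterion: it is an assumption, not necessarily the criterion of Ekbal et al. (2011) or of this project, and `least_confident` and the toy scores are hypothetical.

```python
def least_confident(pool, predict_proba, batch_size):
    """Pick the `batch_size` unlabelled items whose top class probability is lowest.

    `predict_proba` maps an item to a list of class probabilities; items the
    model is least sure about are handed to the annotator first.
    """
    scored = [(max(predict_proba(item)), item) for item in pool]
    scored.sort(key=lambda pair: pair[0])
    return [item for _, item in scored[:batch_size]]


# Made-up probabilities standing in for a trained classifier's output.
toy_scores = {
    "Vergil, Aen. 12, 101-109": [0.95, 0.05],
    "Th. 1, 395-433": [0.55, 0.45],
    "see p. 12": [0.70, 0.30],
}

picked = least_confident(list(toy_scores), toy_scores.get, 2)

if __name__ == "__main__":
    print(picked)
```

After the selected items are annotated they would be added to the training set, the classifier retrained, and the selection repeated.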

Page 21: Exploring Citation Networks to Study Intertextuality in Classics

NER Classifier Evaluation

- classifier: Conditional Random Fields
- 10-fold cross evaluation

Class        p               r               F
AAUTHOR      57.89 (62.75)   38.60 (40)      46.32 (48.85)
AWORK        68.11 (62.20)   78.85 (72.86)   73.09 (67.11)
REFAUWORK    71.58 (71.43)   78.16 (75)      74.73 (73.17)
REFSCOPE     72.37 (66.34)   86.14 (67.68)   78.66 (67)
Overall      69.64 (65.22)   79.73 (62.28)   74.34 (63.72)

Feature Set      r       p       F
POS              42.22   66.34   51.12
POS+ortho        63.83   78.09   69.49
POS+ortho+sem    69.07   79.85   73.44
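
For readers unfamiliar with the F column: assuming it is the usual balanced F-measure (the harmonic mean of precision and recall), it can be recomputed from p and r. The function name `f_measure` is illustrative, and the rows used are the non-parenthesised values from the per-class table.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (p, r) pairs copied from the per-class table.
rows = {
    "AAUTHOR": (57.89, 38.60),
    "AWORK": (68.11, 78.85),
    "Overall": (69.64, 79.73),
}

if __name__ == "__main__":
    for name, (p, r) in rows.items():
        print(name, round(f_measure(p, r), 2))
```

Rounded to two decimals, these three rows reproduce the F values listed in the per-class table (46.32, 73.09 and 74.34).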

Page 22: Exploring Citation Networks to Study Intertextuality in Classics

Wrap Up

New Perspectives:
- tool to search/browse large corpora of publications
  - by cited author/work/passage
  - by co-cited author/work/passage
- tool to study the history of Classics
  - track interpretations in intertextuality studies
  - combine 2 kinds of citation networks

Further work:
- citation resolution
- use cases, e.g. Pentecontaetia
- offer as an open web-service
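
Searching by co-cited passage presupposes some count of which passages are cited together. A minimal sketch, assuming citations have already been extracted per publication: `cocitation_counts` is a hypothetical helper, the first record reuses citations from the APh 75-06697 abstract, and the second record is invented purely to make the counts non-trivial.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(documents):
    """Count how often two passages are cited together in the same publication."""
    counts = Counter()
    for cited in documents.values():
        # Sort so that each unordered pair always gets the same key.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


docs = {
    "APh 75-06697": [
        "Vergil, Aen. 12, 101-109",
        "Lucan 1, 204-212",
        "Statius, Th. 12, 736-740",
    ],
    # Invented record, for illustration only.
    "hypothetical-record": [
        "Vergil, Aen. 12, 101-109",
        "Lucan 1, 204-212",
    ],
}

counts = cocitation_counts(docs)

if __name__ == "__main__":
    for pair, n in counts.most_common():
        print(n, pair)
```

The resulting pair counts could serve as edge weights in a network whose nodes are cited passages.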

Page 23: Exploring Citation Networks to Study Intertextuality in Classics

Thanks for your attention!
Questions, comments?
matteo.romanello{@dainst.de, @kcl.ac.uk}

https://github.com/mromanello/

Page 24: Exploring Citation Networks to Study Intertextuality in Classics

References

Crane, Gregory, Brent Seales, and Melissa Terras. 2009. “Cyberinfrastructure for Classical Philology.” Digital Humanities Quarterly 3. http://www.digitalhumanities.org/dhq/vol/3/1/000023/000023.html.

Ekbal, Asif, Francesca Bonin, Sriparna Saha, Egon Stemle, Eduard Barbu, Fabio Cavulli, Christian Girardi, and Massimo Poesio. 2011. “Rapid Adaptation of NE Resolvers for Humanities Domains using Active Annotation.” Journal for Language Technology and Computational Linguistics 26: 39–51.

Romanello, Matteo. 2013. “Creating an Annotated Corpus for Extracting Canonical Citations from Classics-Related Texts by Using Active Annotation.” In Computational Linguistics and Intelligent Text Processing. 14th International Conference, CICLing 2013, Samos, Greece, March 24-30, 2013, Proceedings, Part I, ed. Alexander Gelbukh, 1:60–76. Springer Berlin Heidelberg. doi:10.1007/978-3-642-37247-6_6.
