Visible Lunch slides


  • 8/7/2019 Visible Lunch slides

    1/37

    Visualizing Variation in a Re-Translation Corpus

    Case Study: William Shakespeare's Othello, The Moor of Venice

    Pilot project funded by Swansea University College of Arts and Humanities Research Initiatives Fund (Feb-July 2011)

    PI: Tom Cheesman

    David Berry, Robert S. Laramee, Andy Rothwell

    Alison Ehrmann, Zhao Geng


    2/37

    Presentation by Tom Cheesman

    at Dr Robert S. Laramee's VISIBLE LUNCH talk series

    Swansea University, Computer Science

    http://www.cs.swan.ac.uk/~csbob/visibleLunch/

    Thursday 10 February 2011


    3/37

    >40 German Othellos (1766-2009) to be digitized:

    pre-processing, i.e. copy, scan, OCR, clean, and align with the English base text (tagging)

    (and >30 French? >30 Italian? >40 Dutch, >30 Spanish? 6 Arabic)


    4/37

    Why explore Shakespeare re-translations?

    Cultural History interest: each re-translation reflects changing culture* OR expresses individual thought [the translator's]* OR both: cultural / cultural-political interventions*

    Cultural Observatory: versions currently produced (in print and/or on stage, in media) intervene in current social, cultural, political debates, such as: Hamlet or Macbeth: state politics and the individual; Othello: race/racism, gender, social hierarchy, imperialism

    Complexity: each re-translation is a reply* (a) to previous re-translations, (b) to Shakespeare's text, and (c) to received ideas about what "Shakespeare" means: "William Shakespeare as Global Icon" and specifically "William Shakespeare as a German Writer"

    * Not (only) conscious / intentional


    5/37

    Background: TC text corpus research

    Recent study on Turkish German novels: large corpus (c. 90 texts), grouped and analysed using traditional methods (manual reading)

    PhD on German ballads (1980s) in German Folksong Archive: large corpus of songs, each with >2,000 variants

    variant: the same song as sung / written down / published... by different singers / collectors / editors

    Variants grouped by common text features (e.g. presence/absence of

    words, verses, story events)

    Variation dimensions:

    Regions.

    Historical periods.

    Gender, Social Class, Religion of singer.

    Source Media: singer manuscript / collector transcript / printed book.


    6/37

    German Folksong Archive

    250,000 typescript song transcriptions (fieldwork: mainly 1912-1930)

    70,000 books with 1000s of song texts

    20,000 sound recordings (mainly 1912-1930)

    15,000 printed song sheets / pamphlets (15th - 20th c.)

    20,000 song-type files


    7/37


    8/37

    Feridun Zaimoglu

    + Günter Senkel


    9/37

    Beginning to compare German Othellos

    Zaimoglu and Senkel claim to have looked at about a dozen other translations

    before writing their own adaptation.

    In order to understand better what is special about their version, I decided to

    look at Othellos by some of their precursors and competitors.

    The Zaimoglu/Senkel version is very politically incorrect. A good example of

    this is their translation of some sexist, misogynistic, and racist jokes told by

    Iago (the villain in the play). The jokes are in rhyming couplets. Most

    translators produce rhymes here. This puts the translators under abnormal

    stress: their choices are more limited than usual. And most translators made the jokes sound relatively harmless.

    But this series of jokes makes a very complex case study.


    10/37

    For a first attempt to explore variation systematically in a small, conceptually indicative text sample, I chose the Duke of Venice's last words in the play.

    It's a rhyming couplet (puts extra stress on translators)

    It uses some of the play's key terms: virtue, delight, beauty, fair, black, and Moor (more)

    The Duke is speaking to Othello's father-in-law, Brabantio:

    If virtue no delighted beauty lack,

    Your son-in-law is far more fair than black.

    Modern English versions (student cribs):

    If valour is the measure of true beauty, your son-in-law is fairer than he's black. Shakespeare Made Easy (1989)

    If goodness is beautiful, your son-in-law is beautiful, not black.

    No Fear Shakespeare (2003)


    11/37

    Finding German Othellos

    Blinn/Schmidt: Shakespeare deutsch. Bibliographie (2003)

    www.theatertexte.de


    12/37

    German translations and adaptations of Othello consulted (NB this table is not up-to-date)

    (pre-1920: includes only those who translated the Duke's couplet differently; many in the 19th c. copied Baudissin or Schiller here).

    Yellow fill = adaptation, not faithful translation


    13/37

    The Duke's last couplet: If virtue no delighted beauty lack,

    Your son-in-law is far more fair than black.

    Wieland (1760s): wenn Tugend die glänzendeste Schönheit ist, so ist euer Tochtermann mehr weiß als schwarz.

    If virtue is the most radiant beauty, then your son-in-law is more white than black.

    Schiller (1800s): Wenn je die Tugend einen Mann verklärt, / Ist Euer Eidam schön und liebenswert.

    If ever virtue transfigured a man / your son-in-law is beautiful and lovable.

    Baudissin (1830s): Wenn man die Tugend muß als schön erkennen, / Dürft Ihr nicht häßlich Euren Eidam nennen.

    If one must recognise virtue as beautiful / you may not call your son-in-law ugly.

    Gundolf (1900s): Entbehrt die Tugend Reiz und Schönheit nicht, / Ist euer Eidam minder schwarz als licht.

    If virtue not lack charm and beauty / your son-in-law is less black than bright-lit.

    Wolff (1920s): Leiht Tugend ihre Farbe dem Gesicht, / Ist Euer Eidam weiß, ein Schwarzer nicht.

    If virtue lends its colour to the face / your son-in-law is white, not a black man.

    von Zeynek (1940s): wenn Mannesmut nicht Reiz und Glanz entbehrt, / so ist er, wenn auch schwarz, höchst schätzenswert.

    If manly courage is not without charm and radiance/glory / then he is, even if black, highly estimable.


    14/37

    If virtue no delighted beauty lack,

    Your son-in-law is far more fair than black.

    Laube (1970s): Wenn Tugend schön ist, hast du jetzt zum Lohn / 'Nen schwarzen, aber schönen Schwiegersohn.

    If virtue is beautiful, you now have as your reward/wage / a black but beautiful son-in-law

    Fried (1970s): Wenn Ihr der Tugend nicht Schönheit absprechen wollt, / Ist Euer Schwiegersohn nicht dunkel, sondern Gold!

    If you do not wish to deny beauty to virtue / your son-in-law is not dark but gold!

    Günther (1990s): Gäb's helle Haut für Edelmut als Preis, / Dann wär Ihr Schwiegersohn statt schwarz reinweiß.

    If bright skin were a prize for noble-mindedness / then your son-in-law would be pure white instead of black.

    Wachsmann (2000s): Kühnheit wirkt anziehnd, hell erstrahlt zum Lohn / Mehr schön als schwarz drum Euer Schwiegersohn.

    Boldness affects [us as] attractive, brightly shines as a reward / more beautiful than black therefore your son-in-law.

    Zaimoglu/Senkel (2000s): Solange männliche Tugend mehr zählt als Schönheitsfehler, kann man sagen, Ihr Schwiegersohn ist eher edel als schwarz.

    So long as male virtue counts more than minor blemishes [literally: beauty-failings], one can say your son-in-law is more noble than black.

    http://www.delightedbeauty.org


    15/37

    The Duke's couplet: German translators' choices for lexical and syntactic features (horizontal rows); 1760s-2000s

    "Normal" choices for source text features: found by counting.

    Many departures from the norm = a "hot" translation.

    > Temperature range: white (very normal / cold) through blue (average), to green, to orange, to red (very abnormal).

    (Pastel colours = similar features found in specific historical periods.)
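The counting idea on this slide can be sketched roughly as follows. This is a minimal illustration, not the project's method: the feature names, the hand-coded choices, and the colour thresholds are all assumptions made up for the example. The "norm" for each feature is taken to be the most frequent choice, and a translation's "temperature" is the share of features on which it departs from that norm.

```python
from collections import Counter

# Hypothetical hand-coded choices for three source-text features.
# These codings are illustrative only, not the project's data.
codings = {
    "Wieland":   {"virtue": "Tugend", "fair": "weiss", "rhyme": "no"},
    "Schiller":  {"virtue": "Tugend", "fair": "schoen", "rhyme": "yes"},
    "Baudissin": {"virtue": "Tugend", "fair": "schoen", "rhyme": "yes"},
    "Zeynek":    {"virtue": "Mannesmut", "fair": "schaetzenswert", "rhyme": "yes"},
}

def norms(codings):
    """Most frequent choice per feature (the 'normal' choice, found by counting)."""
    features = {f for choices in codings.values() for f in choices}
    return {
        f: Counter(c[f] for c in codings.values() if f in c).most_common(1)[0][0]
        for f in features
    }

def temperature(choices, norm):
    """Share of features on which a translation departs from the norm (0.0 to 1.0)."""
    departures = sum(1 for f, v in choices.items() if norm.get(f) != v)
    return departures / len(choices)

def colour(temp):
    """Map a temperature to the slide's colour scale; the thresholds are assumed."""
    scale = [(0.2, "white"), (0.4, "blue"), (0.6, "green"), (0.8, "orange")]
    for limit, name in scale:
        if temp < limit:
            return name
    return "red"

norm = norms(codings)
temps = {name: temperature(choices, norm) for name, choices in codings.items()}
```

With real codings, `temps` could be passed through `colour` to shade one cell per translation, which is one plausible way to arrive at a heat-map like the one described.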

    >


    17/37

    www.delightedbeauty.org: Shakespeare's Global Rewritings

    Matthias Zach (Nantes) collected c. 30 French versions of the couplet.

    Further crowd-sourcing: limited success so far, but we do have some Albanian, Norwegian, Spanish, Italian versions of the couplet.

    What next?

    Sampling based on intuition may not be the best approach to discover how versions inter-relate, copy one another, differ from one another, and interpret the play in differing ways.

    Before attempting global multilingual exploration, let's refine methods.

    But first, one finding of the couplet analysis


    18/37

    Author-Translators and Other Translators

    German and French Authors (= famous writers/translators) translate "fair" and "black" in the Duke's couplet differently from non-Authors

    but German Authors translate more differently, relative to non-authors in the same language

    APOLOGIES FOR THE UNINTUITIVE GRAPHIC!


    19/37

    Whole text: intrinsic structuring: Scenes / Characters (parts)

    also structural: Sentences. Speeches. Rhymed verse / Blank verse / Prose.

    Soliloquy / Duologue / Multilogue.

    Words (lemmas?) / Semantic fields (?)


    20/37

    Tagged base text: scenes, characters, numbered speeches, and special features

    also:


    21/37

    Tools?

    SURVEY of text visualization tools by Zhao Geng / our Project Team: http://cs.swan.ac.uk/~cszg/text/Survey.pdf

    DH: keep up at http://www.digitalhumanities.org/ See esp. Craig: Stylistic Analysis and Authorship Studies, in Companion to Dig Hums, ch. 20, www.digitalhumanities.org/companion

    Susan Schreibman's Versioning Machine: www.v-machine.org/

    multiple witnesses, e.g. Emily Dickinson, There are two [or three] ripenings:

    www.v-machine.org/samples/fp420.html

    Scholarly Editions: www.sd-editions.com/ + InterEdition (an international initiative for digital scholarly editing infrastructure, an EU COST Action): www.interedition.eu/

    Textometric analysis of multilingual text corpora: Dr Zimina Maria, Centre de Textométrie SYLED-CLA2T, Paris Sorbonne

    Paper: http://www.corpus.bham.ac.uk/pclc/CL_05_Zimina_M_005.doc

    www.cavi.univ-paris3.fr/ilpga/ilpga/tal/lexicoWWW/index-gb.htm

    Automated alignment, mkAlign: http://tal.univ-paris3.fr/mkAlign/mkAlignDOC.htm

    Aksis (Bergen) Translation Corpus Aligner: http://gandalf.aksis.uib.no/tca2/index.page

    Screenshot: English-Norwegian Parallel Corpus, Translation Corpus Explorer:

    http://khnt.hd.uib.no/webtce.htm


    22/37

    Matthew L Jockers (Stanford) uses Docuscope to count high-frequency words and punctuation marks in order to machine-sort texts into genres: e.g. 36 19th-century novels into 12 genres, or Shakespeare's plays into 3 genres.

    http://www.stanford.edu/~mjockers/cgi-bin/drupal/node/27


    23/37

    Equipped with a specialized dictionary, Docuscope is able to divide texts into strings of words that are then sorted into one of eighteen word categories, such as "Inner Thinking" and "Past Events." The program turns differentiating amongst genres into a statistical task by testing the frequency of occurrence of words in each of the categories for each individual genre and recognizing where significant differences occur.

    Docuscope was designed as a tool for analyzing student writing, but Witmore (et. al.) discovered that it could also be employed as a specialized sort of feature extraction tool. - Jockers

    [Could differentiating among genres be similar to differentiating among translations? TC]

    Mike Witmore (Working Group for Digital Inquiry, Wisconsin) and Jonathan Hope (Strathclyde):

    In this essay, we explore the underlying linguistic matrix of Shakespeare's dramatic genres using multivariate statistics and a text tagging device known as Docuscope, a hand-curated corpus of several million English words (and strings of words) that have been sorted into grammatical, semantic and rhetorical categories.

    -- http://winedarksea.org/?p=707

    And see other essays by Witmore e.g. Texts as Objects II:

    http://winedarksea.org/?p=381
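To make the category-counting idea concrete, here is a small sketch under stated assumptions. It does not use Docuscope or its dictionary: the two category names are borrowed from the quotation above, but the word lists are invented, and only single words are matched (Docuscope also matches strings of words). The distance function is a crude stand-in for the multivariate statistics mentioned by Witmore and Hope.

```python
import re
from collections import Counter

# Hypothetical mini-dictionary; the real Docuscope dictionary is far larger
# and matches multi-word strings, which this sketch does not attempt.
CATEGORIES = {
    "Inner Thinking": {"think", "believe", "suppose", "know"},
    "Past Events": {"was", "were", "had", "did"},
}

def category_profile(text):
    """Return each category's share of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for category, words in CATEGORIES.items():
            if token in words:
                counts[category] += 1
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {category: counts[category] / total for category in CATEGORIES}

def profile_distance(a, b):
    """Sum of absolute differences between two category profiles."""
    return sum(abs(a[c] - b[c]) for c in CATEGORIES)

text_a = "I think I know what he was"
text_b = "They were gone and he had left"
distance = profile_distance(category_profile(text_a), category_profile(text_b))
```

Applying this to translations rather than genres would need a dictionary in the target language, which is exactly the language-specificity problem with such tools.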


    24/37

    ?? Concordancing software (compare concordance outputs?)

    Understanding Shakespeare by Stephan Thiel, Potsdam:

    http://www.understanding-shakespeare.com/about.html - using

    Classifier4J (http://classifier4j.sourceforge.net/)

    > Beautiful and useful!

    > Summarizing approach: Segmentation of text by scene & part

    & speech. Statistical analysis to find most typical sentence

    in each speech. (Could also be any string of words or [140]

    characters?) Will summaries of

    DATA: digitally assisted text analysis blog by Martin Mueller at

    http://literaryinformatics.northwestern.edu/node/1
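The summarizing approach noted on this slide (find the "most typical" sentence in each speech) could be approximated as below. The scoring rule is an assumption for illustration: a sentence counts as typical when its words are, on average, frequent within the speech. It is not claimed to be how Understanding Shakespeare or Classifier4J compute their summaries, and the sample speech is a toy string.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; apostrophes are kept inside words."""
    return re.findall(r"[a-z']+", text.lower())

def most_typical_sentence(speech):
    """Pick the sentence whose words are, on average, most frequent in the speech."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", speech) if s.strip()]
    freq = Counter(tokenize(speech))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return max(sentences, key=score)

# Toy speech for illustration only.
speech = "Put out the light. And then put out the light. I know not where."
summary = most_typical_sentence(speech)
```

A digest of a scene or a part would then be the list of such sentences, one per speech, which keeps the digests of different translations aligned speech by speech.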


    25/37

    Turnitin

    Input so far: in sequence, the already-digital German Othello texts: Wieland

    (1766), Baudissin (1832), Zaimoglu & Senkel (2003), Wachsmann

    (2005)

    (We also have 6 others in protected pdf formats)

    Expectations:

    B takes 1%-5% from W.

    Z takes 0% from W and 1%-5% from B.

    Wa takes 5%-20% from B.

    Many others take >90% from B and/or from subsequent versions based on

    B.

    Can Turnitin output be neatly visualized? To construct a genetic

    tree? Map this onto geographical data (place of publication)?
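How much one version "takes" from an earlier one can be estimated without Turnitin, whose matching method is not described here. The sketch below swaps in a simple, well-known technique: the share of a later text's word n-grams that also occur in an earlier text. The labels W, B, Z follow the abbreviations on the slide, but the texts are toy strings, not translation data.

```python
import re

def ngrams(text, n=3):
    """Set of word n-grams in a text (lowercased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def share_taken(later, earlier, n=3):
    """Fraction of the later text's n-grams that also occur in the earlier text."""
    later_grams = ngrams(later, n)
    if not later_grams:
        return 0.0
    return len(later_grams & ngrams(earlier, n)) / len(later_grams)

def overlap_matrix(texts, n=3):
    """texts: list of (label, text) in chronological order.
    Returns {(later_label, earlier_label): share} for every later/earlier pair."""
    result = {}
    for i, (later_label, later) in enumerate(texts):
        for earlier_label, earlier in texts[:i]:
            result[(later_label, earlier_label)] = share_taken(later, earlier, n)
    return result

# Toy strings standing in for full play texts (not real translation data).
texts = [
    ("W", "a b c d e f"),
    ("B", "a b c x y z"),
    ("Z", "x y z q r s"),
]
matrix = overlap_matrix(texts)
```

Because each entry points from a later text to an earlier one, a matrix like this might serve as raw material for the genetic tree asked about above, for instance by linking each version to the earlier version it shares most with.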


    26/37

    Desideratum (long-term): historical atlas as interface for exploring when and where Shakespeare's work has been translated, zooming in on individual languages, or places, or periods, or plays, then getting details of how each translation interpreted the work by part, scene, speech, or word or semantic field

    FOR NOW,

    Some tools are language-specific (based on an English text corpus).

    Few tools assist comparison between texts.

    Editioning software (collation of variants)?

    Statistical analysis based on concordances of aspects of the text creates a digest for speed-reading. Digests of: the whole play / a scene / a speech / a character's part (we decide). These digests can then be compared / the comparison visualized - ?

    Lemmatization: if manual, to be avoided (immense labour). But see http://llc.oxfordjournals.org/content/25/3/287.short?rss=1

    Following slides: IBM's Many-Eyes visualisation packages:

    http://manyeyes.alphaworks.ibm.com/manyeyes/


    27/37

    Data file: Most Common Word Pairs to Follow 'Tis' in Shakespeare

    Data source: Project Gutenberg's Shakespeare Plays

    but the / the better / all one / but a / just to / dimpled. I / true; he / this fever / like a / Agamemnon just. /

    Nestor / right. / for Agamemnon's / dry enough-will / most meet. / meet Achilles /
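A data file like this one can be produced with a few lines of counting. The sketch below is an assumption about how such a list might be built, not a description of how the Many Eyes data set was made. It assumes straight apostrophes in the input (Project Gutenberg files may need normalising first), and the sample string is invented.

```python
import re
from collections import Counter

def pairs_after(text, trigger="'tis", top=5):
    """Count the two-word sequences that immediately follow each occurrence of trigger."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(
        (words[i + 1], words[i + 2])
        for i in range(len(words) - 2)
        if words[i] == trigger
    )
    return counts.most_common(top)

# Invented sample text, for illustration only.
sample = "'Tis but a trifle. 'Tis but a dream. 'Tis all one to me."
top_pairs = pairs_after(sample)
```

Unlike the list on the slide, this version strips punctuation and ignores sentence boundaries, so a pair can span a full stop.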


    28/37

    Wordles (Macbeth and The Sonnets)


    29/37

    Jane Austen's Pride and Prejudice: Phrase Net http://manyeyes.alphaworks.ibm.com/manyeyes/page/Phrase_Net.html


    30/37

    Shakespeare's 2-word images.

    Phrase Net based on The Sonnets


    31/37

    Phrase Net: < s > in the Sonnets


    32/37

    above, Othello 3.3 (Word Cloud)

    below left, words spoken, by character and by scene

    below right, Word Tree


    33/37

    Graphs from Franco Moretti's BOOK: Graphs, Maps, Trees

    Timelines showing production of differing literary genres etc


    34/37



    35/37


    36/37


    37/37

    Visualisations of book contents

    http://flowingdata.com/2008/06/12/12-cool-visualizations-to-explore-books/

    http://www.textarc.org/