Jakobsen(2011)Tracking translators keystrokes_eye_TranslogII.pdf


Tracking translators' keystrokes and eye movements with Translog*

Arnt Lykke Jakobsen, Copenhagen Business School, Denmark

Although keylogging opens up many possibilities for studying translation processes, the addition of eye tracking vastly increases our ability to know what kind of process a translator engages in at any given point in time. Following a description of how the Translog software was technologically reshaped in the context of the EU Eye-to-IT project to be able to integrate keylogging and eye tracking, and of how raw eye and key data were represented, an analysis of a small sample of eye and key data results will be presented in a tentative formulation of a recurrent six-step processing micro-cycle which introduces the anchor word as a novel concept. It is optimistically suggested that an elementary cycle of the kind suggested could both function as the basis of computational analysis of large volumes of translational eye and key data and potentially as the core of a computational model of human translation.

Keywords: translation processes, keylogging, eye tracking, Translog

1. Introduction

The Translog software was originally only developed to record translators' keystrokes in time. A complete, timed sequence of keystrokes provides a detailed overview of the entire typing process by which a translation comes into existence. Data records of this type make it possible to study cognitive phenomena such as chunking, based on the distribution of pauses, and such writing-related phenomena as edits and corrections. However, although it is sometimes possible to connect production data with comprehension phenomena, in most instances it is only possible to speculate on what comprehension processes a translator engaged in before a solution was typed, since the only recorded data is a pause.
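The pause-based chunking that such keystroke records support can be sketched in a few lines. The (timestamp, key) event format and the two-second pause threshold below are illustrative assumptions, not Translog's actual internals.

```python
# Sketch: segment a timed keystroke log into chunks at long pauses.
# The (timestamp_ms, key) tuples and the 2000 ms pause threshold are
# illustrative assumptions, not Translog's actual log format.

def chunk_keystrokes(events, pause_ms=2000):
    """Group keystrokes into chunks separated by pauses >= pause_ms."""
    chunks, current, prev_t = [], [], None
    for t, key in events:
        if prev_t is not None and t - prev_t >= pause_ms:
            chunks.append(current)   # pause long enough: close the chunk
            current = []
        current.append(key)
        prev_t = t
    if current:
        chunks.append(current)
    return chunks

# A toy log: two chunks separated by a 2740 ms pause.
log = [(0, "o"), (120, "g"), (260, " "), (3000, "d"), (3150, "e")]
```

Run over a full log, the lengths of the resulting chunks give exactly the kind of segment-length distribution that pause-based analyses work with.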

* Very constructive comments from two anonymous reviewers and the editors of the volume are gratefully acknowledged.


The addition of eye tracking to keylogging adds very rich information on what goes on when no typing activity is recorded. Source-text reading can be tracked as well as target-text reading and monitoring, and also shifts of attention from source to target text and back. Such information radically improves our chances of reconstructing both the comprehension processes that precede production and the way in which comprehension and production processes combine.

Eye movement recording has made us aware of the magnitude of the work performed by the eyes and the human brain in reading. This has been described in several major studies of reading (Rayner and Pollatsek 1989; Rayner 1998; Radach, Kennedy and Rayner 2004), but the kind of reading that is part of written translation has not been studied. When we recorded the eye movements of participants instructed in experiments to read a foreign-language text for comprehension, and compared them to eye movements of participants who translated a screen text in the foreign language and typed the translation on the same screen, we found that written translation involves vastly more eye work and appears to be a considerably more demanding task, cognitively, than just reading and comprehending a text or even reading it and translating it orally (Jakobsen and Jensen 2008). Eye-movement observation helps us better understand the mechanical complexity of the task of written translation, but gaze data also help us penetrate closer to the core of the cognitive challenges in translation. The fundamental assumption here is that there is a correlation between behavioural outside data and cognitive inside processing. Though this so-called eye-mind hypothesis (Just and Carpenter 1980), which states that what the eyes are looking at is what the mind is attending to, has been demonstrated not to apply everywhere during a reading process (cf. e.g. Hyönä, Lorch and Rinck 2003), eye movement data still provide a fine window to the mind. By visualising the way in which reading proceeds and how it interacts with typing, we get the best available picture of how translation processes are coordinated in the mind.

2. Analysing keystroke data

The original impetus for the development of the Translog software was an ambition to supplement think-aloud process data from translation experiments with harder, machine-recorded evidence of human translation processes. The methodological assumption behind this ambition was a conviction that if it were possible to record converging qualitative think-aloud data and quantitative behavioural data, our hypotheses about translation processes would gain important support.

The first Translog version, developed in 1995, had three main functions. It could display source text automatically as full text, in paragraphs, in sentences, or in segments defined by the researcher. The central logging function, in the component then called Writelog (Jakobsen and Schou 1999), recorded all keystrokes, including navigation and deletion keystrokes, but not yet mouse clicks, that were made during the


completion of a translation task in time, and then saved this data in a log file. Subsequently, information stored in the log file could be displayed in two modes. It could be replayed dynamically, and it could be represented as a linear sequence of keystrokes. The replay could be done at different speeds, which allowed the researcher to inspect the temporal dynamics of text production in translation visually when replayed at normal speed, or somewhat more slowly or faster. Faster replay provided direct visual evidence of the chunked structure of translational text production. Slowed-down replay had an observational effect similar to that of a magnifying glass. In our research group, we soon found that by deploying the replay function to support participants' recall during retrospection, we could obtain very rich data. When participants were given the chance to observe a replay of their keystrokes as they were being interviewed, they appeared to immediately and very exactly recall what had been on their mind at a certain point in time during a translation task.1 Though it could not be known with any certainty that what was reported in retrospection was as reliable as concurrent verbalisation, the data elicited in this manner had all the appearance of truthful evidence. It was generally very detailed, and it was articulated spontaneously and with great conviction by participants.

One great advantage of retrospective data, in addition to the data being both rich and accurate, at least apparently, was that there could be no suspicion of such data impacting on the primary (keystroke) data, since it was all elicited after the event.

Ericsson and Simon's (1984) claim, that concurrent think-aloud had no effect on primary processing if verbalised content was available in STM in verbal form, was not supported by our findings in experiments in Copenhagen, which indicated that concurrent think-aloud had a degenerative effect on processing (Jakobsen 2003). For such reasons more and more experiments were run with cued retrospection rather than concurrent verbalisation (e.g. Hansen 2005 and 2006).

Towards the end of 1999, Lasse Schou finished programming the first Windows (98) version of Translog, which we named Translog2000. Early experimental versions of this new development had been available for testing and use in the CBS TRAP project (1996–2000). The three basic functions were still the same: a Translog project created in the Translog Supervisor module (formerly Translog) was displayed in Translog User (formerly Writelog), which logged the key and time data and saved binary data in a log file that could be inspected (replayed and represented linearly) in the Supervisor module, but was not otherwise immediately accessible. The program included a small statistical component, which calculated e.g. task time and the total number of keystrokes.

Two of the main applications of this version of Translog were to the study of translation phases and to how translation students and professional or expert translators

1. There is a fine description of the method in Englund Dimitrova (2005: 66) and an extended discussion of retrospection as a research method in Englund Dimitrova and Tiselius (2009).


chunked their typing of a translation. In the context of the Expertise project2 several experiments demonstrated that professional or expert translators spent more time and effort on checking and revising their translation than did translation students, even though their translation was already of superior quality. It was also demonstrated clearly that expert translators chunked their production into longer segments than did students. This strongly suggested that expert translators handled the translation task cognitively differently in that they processed longer strings of text and longer translation units than did student translators. Comparison of segment length in the initial portions of the drafting phase with segment length further on in the drafting phase gave quantitative empirical support for the facilitation effects hypothesis.3

The fact that Translog provided a complete record of all revision acts made it suitable for studying editorial practices, e.g. the distribution of online revision done during the drafting phase and end revision, or for more qualitative studies of revision. The software was also applied to the study of various distracting phenomena in the production of translations (Hansen 2005).

Although keystroke logging provides data for important insights into the size and composition of the chunks from which translators construct their target texts, it is very difficult to know from a keystroke record what cognitive activity was going on in the pauses between chunks, why pauses had the duration they did, or why they occurred. In most of the recordings we made, chunks were relatively short with fewer than four or five words and with pauses of about two seconds or more between them. This seemed to suggest a process which alternated between source-text reading and comprehension, mental formulation of the translation and physical typing of the translation. However, in some instances the chunks we recorded were so long that they far exceeded anything that could be held in short-term memory at any one time (so-called instances of peak performance, cf. Jakobsen 2005) and were not explicable in terms of a serial succession of comprehension, formulation of the translation and typing of it. Occurrences of that kind seemed to indicate that at least some translators were able to continuously feed their short-term memory and process the information concurrently with their typing of the translation. Such evidence of overlapping comprehension, formulation and typing activity was strongly suggestive of parallel processing.

But this was only one problem that was in obvious need of supplementary evidence. Fundamentally, we could only speculate from the linguistic environment in

2. Details at http://www.hf.uio.no/ilos/forskning/forskningsprosjekter/expertise/.

3. The term facilitation effects has been used in the study of L2 acquisition and bilingualism since the 1960s to indicate features in the L1 that facilitate the acquisition of L2. In translation research the term refers to the observation that translation of a text gets easier and faster as the translator gets into the text. See Englund Dimitrova (2005: 98): "The possible facilitation effect during the writing phase of the translation task will also be investigated, more particularly whether there are differences in segmentation and/or number of characters per segment at the beginning and towards the end of the writing phase of the task."


which pauses occurred, why they occurred and what processing they might be correlated with. The data, coming at the tail end of the translation process, was open to multiple interpretations. We could not even know if pauses were always caused by task-related phenomena or by factors totally irrelevant to the execution of the task. An obvious way to look for supplementary evidence was by trying to establish what translators were looking at during a translation task.

3. Presumed research benefits from adding gaze data to keystroke data

One of the fundamental ambitions in the EU Eye-to-IT project (2006–2009)4 was to develop an application that would integrate reading data ("gaze" or "eye" data) from an eye tracker with the kind of writing data that was already available in programs like Scriptlog, Inputlog and Translog.5 The challenge was to map information from the eye tracker to the words a translator actually reads.

A fundamental assumption was that eye data would provide evidence pertaining identifiably to source-text reading so that source-text comprehension processes could be studied separately from text-production processes and could be compared with other reading processes that were not part of a translation process. Eye data were expected to be available all through a translation task without interruption or with few interruptions and to be able to fill most of the pauses in keystroke activity with interesting data.

Another assumption was that integration of eye and key data would provide new opportunities for researching the dynamics of how source-text comprehension activity, reflected in translators' eye movements, was coordinated with target-text production activity, reflected primarily in keystroke activity but possibly also in gaze activity.

A further assumption was that just as the distribution of pause duration in the writing process seemed relatable to translation difficulty or ease, so fixation duration and eye movement patterns, such as regressions, would be correlatable with comprehension difficulty or ease as well as with text production and/or translation difficulty or ease.

Finally, it was assumed that if it were possible to map the eye tracker's gaze data reliably to the words being read on the screen in real time, i.e. if the eye tracker's gaze data (giving the pixel position of the gaze) could be instantly translated into information about what words were being looked at, we would be able to exploit such real-time mapping to support translation with prompts activated by the translator's gaze.

4. See http://cogs.nbu.bg/eye-to-it/.

5. See http://scriptlog.net/, http://webh01.ua.ac.be/mleijten/inputlog/ and http://translog.dk.


4. Translog development in the Eye-to-IT project

It soon became clear that Translog2000 would be difficult to update in a way that would meet the promises made in the Eye-to-IT research description. Therefore it was decided that Lasse Schou would completely rewrite the program in C#, using Microsoft's .NET 2.0 environment, and introducing Unicode UTF-8 (to accommodate non-Latin scripts) and xml as an open output format. The result was Translog2006, currently distributed as the academic version. (The distributed version does not include the gaze recording component that was developed experimentally for the Eye-to-IT project.6)

The big technical challenges in the Eye-to-IT project were to find a way of integrating keystroke and gaze recording so that it would be possible to have a perfectly timed sequence of eye and key data, and to find a way of mapping information from the eye tracker in real time to what words were currently being read by a user. This dual technical and programming challenge was taken on jointly by Lasse Schou and Oleg Špakov, and in the course of the EU Eye-to-IT project (2006–2009) a gaze-to-word mapping application (GWM), developed at the University of Tampere,7 was successfully integrated into an experimental version of Translog. Gaze samples from the Tobii 1750 eye tracker were interpreted as saccades or fixations, and the GWM module automatically mapped fixations to a source-text or target-text word on the basis of the current positions of words on the screen, with some adjustment performed by an algorithm representing a crude model of reading. Raw gaze data were saved in separate files, and only mapped, interpreted data (the succession of fixations with specified durations, mapped to words) were recorded in Translog. The output from the integrated Translog/GWM application was a perfectly timed record of the sequence of a user's word fixations across the screen and the user's keystrokes (cf. Figures 2 and 3). In Translog, the data could be represented in linear fashion with information at millisecond level about the intervals between gaze or keystroke events, and data could also be replayed with keystrokes appearing dynamically as in earlier Translog versions, but now also dynamically highlighting the user's recorded fixations on source or target text words for as long as a fixation on a word lasted (see Figure 1).

In earlier versions of Translog raw data were recorded in binary code, but Translog2006 stores all raw data in openly accessible xml code. Figures 2 and 3 show an extract of the data underlying the screen representation in Figure 1. Figure 2 gives information about what kind of event (fixation (7), keystroke (1), synchronisation signal (6)) was logged; the Value parameter indicates what event subtype occurred

6. A version which includes keystroke, gaze and audio recording as well as real-time gaze-to-word mapping is currently being tested. Hopefully, it will be ready for release in the course of 2011.

7. The solution was inspired by the iDict project developed at the University of Tampere (Hyrskykari, Majaranta, Aaltonen and Räihä 2000; Hyrskykari 2006).


Figure 1. Translog2006 display of source text in the upper left window and an extract from the linear representation of gaze and keystroke data in the right-hand window, showing the current cursor position after [source gaze: deforestation]. When activated, the replay displays the succession of keystrokes in the lower left window, while highlighting the current source-text word [deforestation] looked at, at the replay speed selected. Source text and (more often) target text fixations sometimes failed to be mapped [target gaze:]. The occurrence of pauses lasting more than a second where no data was recorded, e.g. [*02.751], indicates that the translator was looking away from the screen, most probably to attend visually to the keyboard while typing.

Figure 2. Extract of the xml record of some of the raw data underlying the representation shown in Figure 1.


(source-text fixation (1), target-text fixation (2); keystroke on the space bar (32), letter h (104), etc.). The Cursor and Block parameters show where in the sequence of events a given event occurred, and how many positions the event moved the cursor, and Time gives the time (hrs:mins:secs:mss) at which an event occurred, starting at the moment when the task was begun.

The first LogEvent element in Figure 2 indicates a gaze event (7) in the target text window (2), where the word looked at [også] had 4 characters (Block=4). The record in the third line represents a fixation on a word [have] in the source text, with four letters. The record in line 4 indicates that the key with ASCII/Unicode number 32, [ ], was struck, and that the space was inserted at position 300 at the time indicated. Then follow five fixations on words in the source text [could, help, minimise, help, …] and one in the target text [også], after which we get a sequence of keystrokes making up the target text hjælpe. (Log event 6 indicates a synchronisation signal sent by Translog.)
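The structure just described suggests how such a log could be processed programmatically. The sketch below assumes element and attribute names (LogEvent, Type, Value, Block, Time) taken directly from the description above; Translog2006's actual xml schema may differ in detail.

```python
# Sketch: parse LogEvent records like those described for Figure 2.
# Element/attribute names (LogEvent, Type, Value, Block, Time) are
# assumed from the prose description; the real Translog2006 schema
# may differ.
import xml.etree.ElementTree as ET

EVENT_TYPES = {7: "fixation", 1: "keystroke", 6: "sync"}

def parse_log(xml_text):
    events = []
    for e in ET.fromstring(xml_text).iter("LogEvent"):
        record = {"kind": EVENT_TYPES.get(int(e.get("Type")), "other"),
                  "value": int(e.get("Value")),
                  "time": e.get("Time")}
        if record["kind"] == "fixation":
            # Value 1 = source-text window, 2 = target-text window;
            # Block = length in characters of the word looked at.
            record["window"] = "ST" if record["value"] == 1 else "TT"
            record["word_len"] = int(e.get("Block", 0))
        elif record["kind"] == "keystroke":
            # Value holds the ASCII/Unicode number of the key struck.
            record["char"] = chr(record["value"])
        events.append(record)
    return events

# A two-event sample: a 4-letter target-text fixation, then a space.
sample = """<Events>
  <LogEvent Type="7" Value="2" Block="4" Time="0:00:12:345"/>
  <LogEvent Type="1" Value="32" Block="0" Time="0:00:13:010"/>
</Events>"""
```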

The words to which fixations are automatically mapped by the system are listed elsewhere in the xml log file. If mapping is complete and successful, every fixation will be mapped to the word that was being looked at, at the time the event was recorded. Figure 3 shows an extract from such a list.

Figure 3. Automatically generated list of successively fixated words in the source and target texts. Note that in five instances a fixation was recorded, but word mapping failed (xsi:nil="true").
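The gaze-to-word mapping that produces such a list can be thought of as a nearest-bounding-box lookup. The sketch below is a minimal illustrative variant: the word-box format and the 40-pixel tolerance are assumptions, and the crude reading-model adjustment mentioned above is not reproduced.

```python
# Sketch: map a fixation's pixel position to the word whose on-screen
# bounding box contains (or lies nearest to) it. The box format and
# the distance tolerance are assumptions; the real GWM module also
# adjusts mappings with a model of reading, not reproduced here.

def map_fixation(x, y, word_boxes, max_dist=40):
    """word_boxes: list of (word, left, top, right, bottom) in pixels.
    Returns the fixated word, or None if nothing is close enough
    (an 'unmapped' fixation, cf. xsi:nil="true" in Figure 3)."""
    best, best_d = None, max_dist
    for word, l, t, r, b in word_boxes:
        # Distance from the fixation point to the box (0 if inside).
        dx = max(l - x, 0, x - r)
        dy = max(t - y, 0, y - b)
        d = (dx * dx + dy * dy) ** 0.5
        if d < best_d:
            best, best_d = word, d
    return best

# Toy layout: two words on one line of the source-text window.
boxes = [("help", 100, 50, 140, 70), ("minimise", 150, 50, 230, 70)]
```

Returning None for distant fixations is what makes unmapped fixations explicit in the output rather than forcing every sample onto some word.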


5. The analytical challenge: Preliminary process observations

With recorded data organised in this structured manner, we face a new analytical challenge. All the details in a translator's eye-key activity can be inspected, but to begin to make coherent sense, details have to be interpretable as constituents in a model. The translator recorded in Figures 1–3 spent considerable time attending visually to the keyboard during typing. For this reason, there are long periods of more than 10 seconds with no recorded eye data in this particular recording. Another consequence of this is that we get fairly extended periods of typing without intervening visual attention to the source text. Finally, because of the extended visual attention to the keyboard, which is cognitively much like a third text, the gaze data show that it often took this translator some rereading to get back to the next relevant source-text segment.

Eye-movement recording visualises how much work is performed by the eyes and the human brain in reading. The challenges, from the point of view of translation process research, are to investigate how reading, when part of a process of translation, differs from other kinds of reading, and how reading interacts with other cognitive processes involved in translation. Our findings in 2007, when we recorded the eye movements of participants instructed in experiments to read a foreign-language text for comprehension, and then compared them to eye movements of participants who translated a screen text in the foreign language and typed the translation on the same screen, showed that eye movements in reading are highly dependent both on the purpose of reading (cf. early observations by Yarbus 1967) and on the nature of the reading task. Written translation involves vastly more eye work and appears to be a considerably more demanding task, cognitively, than just reading and comprehending a text or even reading it and translating it orally (Jakobsen and Jensen 2008). This may be in part because the modality of speech has been better automatised than the modality of writing, but it also appears to be caused to some extent by the necessary ocular motor effort involved in helping the brain keep track of what source-text words can be or have been aligned with what target-text words. Alignment work forces the brain to constantly instruct the eyes to shift between the source and target texts. Though eye tracking provides ample evidence that many translators have very accurate spatial text memory and can shift between local text strings in different texts with great accuracy, such transitions may also cause confusion and make reorientation and rereading necessary. Such random noise in the data makes it more difficult to detect regularities.

A preliminary report of the shifts of attention between source and target texts can be generated from lists such as the ones shown in Figures 2 and 3, and the points at which text was typed can also be reconstructed. Such a report reveals that the eyes are constantly engaged in three basic types of activity: (a) source-text (ST) reading, (b) target-text (TT) reading and (c) monitoring of the keystroke activity and outcome. We know that translation must also be taking place, most probably in parallel with other processes, but behavioural evidence of this activity is only indirect. At the time recorded in Figures 1–3, the translator had just typed og de kunne også


(a translation of "and could also"). The ST fragment processed in the extract shown in Figure 3 was help minimise. Looking more closely at the sequence of word fixations, we make the following observations and interpretations: (1) The most recently typed word (også) was fixated immediately after having been typed (an example of production being monitored). (2) The fixation of have, which appeared below the intended target word help one line down,8 illustrates the phenomenon of undershooting, known from studies of eye movements at line shifts in ordinary reading. The distance from også in the target text up to help in the source text required a saccade almost the length of that required at line shifts, and therefore the target word help was, apparently, not directly fixated. It is possible, however, that the translator's brain perceived the target word parafoveally because after the short stopover on have, the saccade veered off to the left to could, a word already translated. (3) As could had already been translated, a probable reason for rereading that more distant word, rather than going to the more proximate new ST target word (help), could have been the need to establish a fresh mental anchor word for the next chunk of source text to be processed. (4) Having established the anchor, the mental ground had been prepared for the next two target words, help and minimise, to be successively fixated. (5) Though one might expect the translator to have wanted to fixate the next word also (emissions, the object of the transitive verb minimise), this did not happen until later, perhaps an indication that this translator did not engage in very deep parsing of the source text. Instead, this translator's gaze regressed to briefly refixate help (and perhaps minimise) while probably formulating a translation. (6) Before typing the translation, however, the last word of the current target text was fixated, establishing a target anchor for the translation of help minimise to attach to. (7) Only then did the translator type the (unidiomatic) translation hjælpe i at minimere, without looking at the screen. (8) Finally, the new target text screen input was monitored (in an unmapped fixation). Having performed this sequence of actions, the translator was ready to embark on translating the next chunk, emissions from deforestation, in much the same manner.

6. A recurrent processing pattern

Eye movement observation helps us better understand the mechanical complexity of the task of written translation, but gaze data also help us penetrate closer to the core of the cognitive processes in translation. The eye-mind hypothesis (Just and Carpenter 1980) is the fundamental assumption that what the eyes are looking at is what the mind is attending to. The correlation has been demonstrated not to apply universally during a reading process (cf. e.g. Hyönä, Lorch Jr. and Rinck 2003), but although the correlation is not always perfectly straightforward, the basic assumption still stands

8. Figure 4 shows the source text as laid out on the screen in the experiment.


that what the eye is looking at is (in general) something the mind is attending to. So eye movement data provide a fine window to the mind, and by visualising the way in which reading proceeds and how reading interacts with typing, we get the best available picture of how translation processes are coordinated in the mind.

With a structured record of the exact temporal succession of translators' eye and hand activity, we can begin to ask old questions in a new way and begin to formulate tentative answers to them, as was attempted in the previous section.

One such question concerns the way in which translators segment their production of the target text. Instead of basing our account of segmentation on temporal aspects of text production (alternating sequences of pauses and production periods), we can now enrich our analysis with eye data. As we have just seen, eye data provides strong evidence of how much source text was read prior to the production of a piece of target text. This gives us an exact idea of what source text segment was read immediately prior to the typing of the matching piece of target text. Instead of mechanically defining a segment or a translation unit by the occurrence of a pause in the typing activity of a certain duration, we can now more qualitatively, and with greater detail, identify what source text segment was being read and what corresponding target text was being typed, regardless of the duration of any segment boundary pauses.
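A minimal sketch of this eye-data-based segmentation pairs each run of source-text fixations with the typing burst that follows it. The simplified ("ST", word), ("TT", word), ("KEY", char) event tuples are an assumption for illustration, not Translog's record format.

```python
# Sketch: identify qualitative segments by pairing each run of
# source-text reading with the typing burst that follows it. The
# ("ST", word) / ("TT", word) / ("KEY", char) tuples are an assumed
# simplification of the merged eye-key record.

def segments(events):
    segs = []
    st_words, typed = [], []
    for kind, value in events:
        if kind == "ST":
            if typed:  # typing has ended: close the previous segment
                segs.append((st_words, "".join(typed)))
                st_words, typed = [], []
            st_words.append(value)
        elif kind == "KEY":
            typed.append(value)
        # "TT" monitoring fixations neither open nor close a segment here
    if st_words or typed:
        segs.append((st_words, "".join(typed)))
    return segs

# Toy record: read two ST words, glance at the TT, type, read on, type.
log = [("ST", "help"), ("ST", "minimise"), ("TT", "også"),
       ("KEY", "h"), ("KEY", "j"), ("ST", "emissions"), ("KEY", "e")]
```

Note that no pause threshold appears anywhere: segment boundaries fall out of the alternation between ST reading and typing, which is exactly the qualitative criterion argued for above.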

Another major question concerns the coordination of comprehension, translation and formulation processes. The addition of gaze data gives us a finer-grained picture of the behavioural and cognitive complexity of the eye-hand (and brain) interaction that goes on in translation than we had before. Though we still only get indirect evidence of translation processing going on, we get stronger evidence of the steps that the human translator's brain goes through to compute the trajectory of the eyes and the finger movements across the keyboard in the execution of an act of translation.

The observations and interpretations made in the previous section can be summarised schematically. In Table 1, the actions have been tabulated to give a survey of the sequence of actions and outcomes.

Table 1. Visual attention shifts between ST and TT based on automatic mapping of gaze data to words looked at. The TYP column indicates where typing was done. After the second fixation of også, the words hjælpe i at minimere were typed. Question marks indicate unmapped fixations.

ST                   TT              TYP
                     også (?)
have
could
help
minimise
help (?)
                     også (?)
                                     hjælpe i at minimere
                     minimere (?)
deforestation (?)
minimise
emissions


The sequence of actions performed by our human translator can now be hypothetically abstracted into a small algorithm. Starting with an act of comprehension, the cycle of cognitive and motor processes that we can establish on the basis of the above example data involves the following six steps:

1. Moving the gaze to read the next chunk of new source text (and constructing a translation of it),
2. Shifting the gaze to the target text to locate the input area and read the current target-text anchor word(s),
3. Typing the translation of the source-text chunk,
4. Monitoring the typing process and the screen outcome,
5. Shifting the gaze to the source text to locate the relevant reading area,
6. Reading the current source-text anchor word(s).

These steps should be seen as constituting a complete micro-cycle in which some of the steps (or portions of them) can be skipped or, conversely, can be repeated several times. For instance, in step 2, the gaze will be shifted to the target text, but not necessarily to an anchor word. We also frequently find that the gaze may be shifted back and forth several times between the source text word(s) being processed (step 2) and the translation being typed (steps 3 and 4). A finer-grained model would therefore add subcategories to each step or some of them and would point out that e.g. steps 3 and 4 can be executed quite differently, depending on the translator's typing skills. It should also be noted that steps 3 and 4 are to some extent concurrent. Monitoring takes place concurrently with text being typed, but (screen) monitoring frequently continues after typing has stopped. However, even without such detail, the model allows us to predict with a high degree of probability that step 1 will be followed by step 2, step 2 by step 3, etc.

    As can be seen, the ocular motor effort involved in helping the brain keep track of what source-text words have been read and translated, of what words can be or have already been aligned with what target-text words, and of what target-text words have been typed, and how, is very considerable. Alignment work (vertical alignment) across the two languages involved forces the brain to constantly instruct the eyes to shift between the source and target texts. Monolingual (horizontal) syntagmatic attachment also necessitates rereading when new text is added to old text, visually or typographically.

    Sample data analysis

    The availability of large volumes of structured data makes it natural to look to applications of statistical methods to big-data corpora for support for observations we make on the basis of small data samples. To do this, we need to be sure that the data collected satisfies basic empirical criteria, e.g. regarding completeness. As is evident from the sample shown (see esp. Figure 3), the current system does not always


    succeed in mapping fixations. A more fundamental challenge concerns the discovery of fixations. This is at once a technical and a theoretical challenge. The eye tracker that was used in our experiments was a 50 Hz machine, i.e. it recorded gaze samples at 20 ms intervals. With such equipment, the traditional method of identifying fixations is to say that a fixation occurred if five consecutive samples were recorded within a specified area, e.g. no more than 40 pixels apart. This means that a fixation is always recorded as having a duration of at least 100 ms. Though more sophisticated, velocity-based definitions are now gaining currency, this does not change the fact that what constitutes a fixation is highly dependent on the equipment used. It would be absurd, for instance, to use the traditional definition on data from a 1000 Hz eye tracker and claim that a fixation could be as brief as 5 ms, but it might make perfect sense to claim that fixations can occasionally have shorter durations than 100 ms. There is also a possible upper limit that has to be considered: how long can a fixation be before we need to split it into several fixations? There are several other technical challenges to getting good reading data from translation experiments, especially because of the frequent shifts of visual attention and the discontinuous nature of the reading. In addition, gaze data recording is sensitive to a large number of variables, including head movement, calibration accuracy, blinks and the shape of the translator's eyelashes. With poor-quality data, the danger is that fixations, however defined, are not even discovered. And obviously, if fixations are not discovered, there is nothing

    to be mapped. To illustrate the variability in the different data records we are challenged by, Figure 4 shows the succession of fixations (still in the same extract) as shown in another

    Figure 4. Extract from a ClearView gaze plot representation of fixations (635–664). Fixations 640–664 were recorded in the exact same time span and from the same event as that recorded in Figures 2–3. (Fixations 635–639 were earlier.)


    software program. Figure 4 shows a gaze plot generated by the Tobii company's ClearView software, which records the succession of fixations and saccades and plots them to the computed, approximate position on the screen, with no adjustment to what is displayed on the screen. When the gaze plot is laid over the text that was displayed when the recording was made, it can be seen that, according to ClearView, there were probably several more word fixations than the eleven fixations that were successfully mapped by the GWM module reporting to Translog (cf. Figure 3).

    The representation of fixations in Figure 4 illustrates some of the theoretical and technical problems involved in connecting recorded fixations reliably to what we think was probably seen and read. These problems go to the heart of the eye-mind relationship: how do the fixations computed by our equipment correlate with the mental experience of reading a sequence of words? Our equipment records what the physical eye appears to be targeting, but exactly how this recording corresponds to what the mind sees is a major issue. Many of the fixations are not recorded as having been on a specific word, but appear slightly below or above a word, or a fixation is recorded as being between words or at one extreme of a word, or else two (or more) successive fixations are recorded within the same word. Manual mapping of such data is a very instructive reminder that what we call a fixation is neither a very well-defined concept nor an easily identified object, but a technology- and theory-dependent abstraction with a certain grounding in behaviour.
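    The dispersion-based criterion described above (at least five consecutive 50 Hz samples, i.e. at least 100 ms, within a small screen area) can be sketched as follows. This is a simplified illustration, not the detection algorithm actually used by Translog/GWM or ClearView, and its window-restart strategy is cruder than a full dispersion-threshold implementation:

```python
def detect_fixations(samples, min_count=5, max_dispersion=40.0, sample_ms=20):
    """Identify fixations in a list of (t_ms, x, y) gaze samples recorded
    at a fixed rate (20 ms apart for a 50 Hz tracker). A fixation is
    reported when at least `min_count` consecutive samples stay within
    `max_dispersion` pixels of each other on both axes; with 50 Hz data
    this enforces the conventional 100 ms minimum duration."""

    def within(win):
        xs = [x for _, x, _ in win]
        ys = [y for _, _, y in win]
        return (max(xs) - min(xs) <= max_dispersion
                and max(ys) - min(ys) <= max_dispersion)

    def summarise(win):
        # Each sample stands for one sampling interval, so a five-sample
        # fixation at 50 Hz is reported with the conventional 100 ms duration.
        t0 = win[0][0]
        mx = sum(x for _, x, _ in win) / len(win)
        my = sum(y for _, _, y in win) / len(win)
        return (t0, win[-1][0] - t0 + sample_ms, mx, my)

    fixations, window = [], []
    for s in samples:
        window.append(s)
        if not within(window):
            if len(window) - 1 >= min_count:
                fixations.append(summarise(window[:-1]))
            window = [s]  # simplification: restart from the breaking sample
    if len(window) >= min_count:
        fixations.append(summarise(window))
    return fixations
```

    Each fixation is returned as (onset ms, duration ms, mean x, mean y). Raising the sampling rate without lowering `min_count` illustrates the equipment-dependence discussed above: the same rule yields very different minimum durations on a 50 Hz and a 1000 Hz tracker.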

    One way of making sense of the gaze data represented in Figure 4 is by interpreting it in the light of a model of reading or, better, in the light of a process model of translation which includes a model of probable reading behaviour. The function of such a model is to close the gap, as far as possible, between the recorded fixations and the words that were actually read by the participant. Table 2 presents the result of such a process. Fixations 640 through 647 could be mapped identically without any problems. (Translog/GWM interpreted fixation 640 as two separate fixations, but was unable to map the second. Fixation 646 looks like a fixation on hjælpe in ClearView, but since the timestamp information shows that the word had not yet been typed at the time of the fixation, it had to be mapped to også.) Fixations 660–664 were also mapped almost identically (Translog/GWM collapsed 660 and 661 into a single fixation on deforestation). Fixation 663 was recorded in ClearView as being somewhere between minimise (in line 4) and countries (in line 3). A decision based on proximity would align this fixation with countries, but this mapping conflicts with the six-step model outlined above, which predicts that a fixation that serves to monitor target output (step 4) will be followed by a shift of the gaze to the ST (step 5), followed by a fixation on an anchor word (step 6). The expected trajectory would therefore be from minimere (target output monitored) to ST anchor minimise, with perhaps a quick stopover near the next target word (emissions) to adjust the trajectory. Therefore, the decision was to map this fixation to minimise in accordance with the model, and to similarly map fixation 664 to emissions rather than to 'to go' in the line above.
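    A purely proximity-based mapping of the kind just rejected for fixation 663 can be sketched as follows. The function and data names are invented for illustration; GWM's actual mapping logic is more elaborate and, as argued above, needs to be constrained by the processing model rather than by proximity alone:

```python
import math

def map_fixation_to_word(fx, fy, word_boxes, max_dist=50.0):
    """Map a fixation at screen position (fx, fy) to the nearest word.

    `word_boxes` maps word tokens to bounding boxes (left, top, right,
    bottom) in pixels. Distance is measured to the box itself (zero if
    the fixation lands inside it). Returns None, i.e. an "unmapped"
    fixation, if no box lies within `max_dist` pixels."""
    best, best_d = None, max_dist
    for word, (left, top, right, bottom) in word_boxes.items():
        dx = max(left - fx, fx - right, 0)   # horizontal distance to box
        dy = max(top - fy, fy - bottom, 0)   # vertical distance to box
        d = math.hypot(dx, dy)
        if d < best_d:
            best, best_d = word, d
    return best
```

    On a fixation recorded between two lines, such a function simply returns whichever word's box is nearer, which is exactly why model-based corrections of the kind applied to fixations 663 and 664 remain necessary.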


    Thus, with respect to fixations 640–647 and 660–664, there is almost perfect agreement between the automatic mapping result from Translog/GWM and from the

    Table 2. Automatically mapped fixations by Translog/GWM compared to manually mapped fixations from ClearView. Between fixations 647 and 648 there was an interval of 2.013 seconds in which the word hjælpe was typed. Between fixations 651 and 654 there was an interval of 3.054 s during which the words i at minimere were typed.

    Fixated words. Time extract: 13 seconds, 509–521 seconds into task.

    Translog            ClearView's record of fixations, manually mapped to words
    word                fix  word             Dur  Comment
                        635  Contributed      239
                        636  have             120  line below help
                        637  help             179
                        638  minimise         239
                        639  levestandarder   100  on way down to end
    også (+unmapped)    640  også             259
    have                641  have             199  line below help
    could               642  could            299
    help                643  help             140
    minimise            644  minimise         399
    help                645  help             120
    (unmapped)          646  også             159
    også                647  også             120
                        648  hand             120  on way up to emissions
                        649  emissions        179  first half of word fixated
                        650  emissions        219  second half of word fixated
                        651  minimise         319
                        652  at               698
                        653  at               179
                        654  minimise         179
                        655  countries        160  line above emissions
                        656  emissions        140  first half of word fixated
                        657  emissions        279  first half of word fixated
                        658  minimise         139
                        659  from             140
    deforestation       660  defo(restation)  120
                        661  (defo)restation  279
    minimere            662  minimere         419
    minimise            663  minimise         199
    emissions           664  emissions        321


    manual mapping of the ClearView gaze plot representation of recorded data. However, the data do not always support identical interpretations. ClearView fixations 648–659, for instance, were not mapped by Translog/GWM. They were not even identified as fixations. This may have been caused by the fact that between fixations 647 and 648 there was an interval of 2.013 s in which the translator typed the word hjælpe while probably looking at the keyboard. When the translator's gaze returned to the screen, GWM did not recognise the translator's gaze as quickly as ClearView did. Whatever the technical reason for this (perhaps some slight inconsistency between the ClearView and the GWM calibration data for this translator), missing data is a serious problem, but in this particular case we can fill the gap by manually mapping the fixations recorded in ClearView, as was done in Table 2. As we shall see, the data illustrate that the six steps in our model are not always followed strictly sequentially. Steps can be skipped or repeated several times, and occasionally processing may even backtrack to earlier steps if regular forward step-by-step processing fails to yield a relevant result. Fundamentally, however, we find the same basic processing cycle when we reconstruct the data missing in GWM. This is very good news for our chances of analysing the data automatically and for modelling the human translation process computationally. Let us take a closer look at the manually mapped sequence of fixations (643–660).

    If we simplify the sequence slightly (collapsing multiple sequential fixations on the same word and disregarding the irregular fixation on countries), we get the following sequence of fixations: hand, emissions, minimise, at, minimise, emissions, minimise, from, deforestation. In terms of our model, the sequence would be interpreted as representing the following steps: after fixating a succession of new source-text words, i.e. help, minimise and help (step 1, fixations 643–645), the gaze was shifted to the target text, where fixations 646 and 647 identified the target-text anchor word, også, in two (repeated) fixations (step 2). This was followed by the typing of hjælpe (step 3), a translation of the first of the two new words read in 643–645. The absence of gaze data during the subsequent two seconds indicates that visual (monitoring) attention was on the keyboard only (step 4). After that, the gaze was shifted towards the next new piece of source text (step 5). Since the next new segment that was read (minimise emissions) overlapped with the previous segment (help minimise), minimise functioned as the mental anchor (step 6) for the next new segment (minimise emissions, step 1). This step was followed by a shift of the gaze to the target-text area (step 2) and beyond that to the keyboard, as the translation of the rest of minimise emissions was typed (i at minimere, step 3). The fact that no target-text anchor was fixated and the fact that two successive source-text segments overlapped suggest that help minimise was treated as one translation segment even if emissions was not fixated until after the translation of help had been typed. During the typing of i at minimere (step 3), which lasted 3.054 s, two fixations on at were recorded, indicating that visual attention was shifted from the keyboard to the screen (during step 4) for about one third (0.9 s) of the time it took to type this portion of the translation. After the 3-second typing period, which might well indicate


    cognitive uncertainty about the unidiomatic solution produced at this point, a new cycle was initiated, beginning with the shift of the gaze to the source text (step 5) and followed by fixation of the source-text anchor word minimise (step 6, fixation 654). This was followed by a reading of new source text (emissions from deforestation), but apparently this was too big a chunk for this translator to handle in a single processing cycle, so the anchor word minimise and its object emissions were refixated, after which the translator was ready to shift the gaze to the target text (step 2) and type the translation of emissions (step 3). As can be seen, the basic processing model is recognisable even in a sequence characterised by repeated fixations and backtracking. This gives the model potential for being used to handle fixation-mapping problems and to diagnose e.g. parsing or meaning-construction problems. It also opens up a new opportunity to computationally record the succession of new source-text segments read by a translator and to automatically align the source-text segments read and how they were translated.
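    The simplification applied above, collapsing runs of repeated fixations on the same word and discarding the stray fixation on countries, is trivial to automate; a minimal sketch (the function name is invented for illustration):

```python
def simplify(fixated_words, ignore=()):
    """Collapse consecutive fixations on the same word and drop any words
    listed in `ignore` (e.g. an isolated, off-trajectory fixation)."""
    out = []
    for w in fixated_words:
        if w in ignore:
            continue
        if not out or out[-1] != w:
            out.append(w)
    return out
```

    Applied to the manually mapped words of fixations 648–661 with countries ignored, this yields exactly the nine-word sequence quoted above, which is what makes the sequence comparable against the six-step model.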

    Conclusion and perspective: Toward a computational model of human translation?

    The addition of eye tracking to keylogging has made it possible to study what source or target text units are being worked on by a translator at any given point in time in much greater detail than has hitherto been possible. The data that we get from such integrated logging provides tangible evidence of the complex processing activity required by a translator. All along, the aim of the Translog development reported in the present article has been to enable us to know more certainly, from behavioural evidence, what the nature of this processing activity is, what steps are involved, what segments are read and aligned, and how this whole process is monitored. Our detailed analysis of a small sample of eye and key data showed that with the current state of the art there are still some technical issues that we should strive to improve. However, the analysis also indicated that there appears to be a micro-cycle of processing steps at the heart of translation. The integrated succession of eye and key data suggests that the basic dynamics of translational processing can be described in terms of six basic steps.

    With the present state of technological development, it still seems relevant, perhaps even necessary, to examine small volumes of eye movement and keystroke data manually and selectively, both in order to measure the success rate of the system's automatic mapping and to develop hypotheses from small samples of carefully controlled, clean data that can be tested against bigger samples of somewhat noisier data. But this should not prevent us from pursuing a larger goal. The potential for large-scale computational analysis of translator activity data is there, or will very soon be there, and the prospect of creating a computational model of how expert human translators typically execute their skill seems within reach.


    The eventual success of a computational model of human translation depends crucially both on the quality of the recorded raw data and on the soundness of our system's automatic interpretation of the data, primarily in the way fixations are identified and in the mapping algorithm. These interpretations will be informed initially by the knowledge of human translation that has been accumulated in the research community, but this knowledge might well be enriched by the discovery of new patterns and regularities in large volumes of data about the behaviour of human translators' hands and eyes.
