Stephen Doherty, CNGL/SALIS, stephen.doherty2@mail.dcu.ie


Current Research: A comparative investigation of the readability and comprehensibility of SMT and RBMT output for controlled and uncontrolled input.


Overview:

Past Research

Readability & Comprehensibility

Controlled Language

Research Proposal (Methodology)

Evaluation (Eye Tracking)

Conclusion


Translating Versus Post-Editing: A Segmentation Comparison Based on Pauses (B.A. Dissertation)

Think-Aloud Protocols in Translation Studies (Interessen der kognitiv orientierten Translationswissenschaft)


CNGL Work Package: ILT1.8 Controlled Language:

Supervisors – Dr. Sharon O’Brien, Dr. Dorothy Kenny

“adapt the systems developed by other ILT WPs to deal with in-house data which conforms to both source and target controlled language guidelines”


What is readability?

(Gray 1935: “In the reader, those features affecting readability are 1. prior knowledge, 2. reading skill, 3. interest, and 4. motivation. In the text, those features are 1. content, 2. style, 3. design, and 4. structure”.)

What is comprehensibility?


Metrics: (Reading scores, recall tests...)

E.g. Flesch Reading Ease:

Gunning Fog Index

SMOG (Simple Measure of Gobbledygook) (McLaughlin 1969)
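The readability formulas named on this slide can be sketched in a few lines. The coefficients below are the standard published ones for Flesch Reading Ease, the Gunning Fog Index and SMOG; everything else is an assumption made for illustration: the function names are hypothetical, the syllable counter is a crude vowel-group heuristic rather than a dictionary lookup, and "complex word" is simplified to "three or more syllables" without the usual exclusions. SMOG is also formally defined on 30-sentence samples, which this sketch only approximates by scaling.

```python
import math
import re

def count_syllables(word):
    """Rough heuristic (assumption): one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def text_stats(text):
    """Return (sentences, words, syllables, polysyllabic words) using naive splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    polysyllabic = sum(1 for s in syllables if s >= 3)
    return len(sentences), len(words), sum(syllables), polysyllabic

def flesch_reading_ease(n_sentences, n_words, n_syllables):
    # Higher scores indicate easier text.
    return 206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (n_syllables / n_words)

def gunning_fog(n_sentences, n_words, n_complex):
    # Approximates the years of schooling needed to follow the text.
    return 0.4 * ((n_words / n_sentences) + 100 * (n_complex / n_words))

def smog(n_sentences, n_polysyllabic):
    # Polysyllable count is scaled to a 30-sentence sample.
    return 1.0430 * math.sqrt(n_polysyllabic * 30 / n_sentences) + 3.1291

if __name__ == "__main__":
    n_sent, n_words, n_syll, n_poly = text_stats("The cat sat on the mat. It was happy.")
    print(flesch_reading_ease(n_sent, n_words, n_syll))
    print(gunning_fog(n_sent, n_words, n_poly))
    print(smog(n_sent, n_poly))
```

Scores from such formulas depend only on surface features (sentence length, word length), which is one reason the proposal pairs them with recall tests and other human measures.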


What is controlled language?

“an explicitly defined restriction of a natural language that specifies constraints on lexicon, grammar, and style”

(Huijsen, 1998)


Types of CL:

Human-Orientated Controlled Language (HOCL): readability & comprehensibility e.g. AECMA Simplified English

Machine-Orientated Controlled Language (MOCL): improved translatability, MT system specific

(Huijsen, 1998)


Examples of CLs: AECMA Simplified English, Sun Microsystems' Controlled English, IBM Easy English, Caterpillar Technical English, GM...

Usage (mostly English, but…)

Symantec (CNGL Industry Partner)


Roturier (2006):

Consistent spelling (54)

Do not use pronouns that have no specific referent (19)

Avoid unusual punctuation (35)

Avoid embedded clauses introduced by commas or dashes (41)

Do not use more than 25 words per sentence (5)

Use a question mark only at the end of a direct question (48)
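Two of the rules above are mechanical enough to check automatically. The following is a minimal sketch of such a checker, not Roturier's or Symantec's implementation: the sentence splitter is naive, the function names are hypothetical, and the set of "unusual" punctuation marks is an invented placeholder, since the slide does not define which marks count as unusual.

```python
import re

MAX_WORDS = 25  # from the rule "Do not use more than 25 words per sentence"
UNUSUAL_PUNCTUATION = set(";~^|")  # hypothetical choice, not taken from the source

def split_sentences(text):
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def check_sentence(sentence):
    """Return a list of rule violations found in one sentence."""
    violations = []
    n_words = len(sentence.split())
    if n_words > MAX_WORDS:
        violations.append(f"too long: {n_words} words (limit {MAX_WORDS})")
    found = sorted(set(sentence) & UNUSUAL_PUNCTUATION)
    if found:
        violations.append("unusual punctuation: " + " ".join(found))
    return violations

def check_text(text):
    """Map each offending sentence to its list of violations."""
    report = {}
    for sentence in split_sentences(text):
        violations = check_sentence(sentence)
        if violations:
            report[sentence] = violations
    return report

if __name__ == "__main__":
    for sentence, problems in check_text("Click OK. Save the file; then close it.").items():
        print(sentence, "->", problems)
```

Rules such as pronoun reference or embedded clauses need syntactic analysis and are outside what a surface check like this can cover.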


O’Brien (2003): three categories of rules:

Lexical (e.g. rules that allow or rule out the use of specific acronyms or abbreviations)

Syntactic (e.g. specifying when and where past participles can be used, and avoiding the present participle)

Textual:

Text structure (e.g. specifying admissible sentence length)

Pragmatic (e.g. using certain verb forms for specific text purposes, such as the imperative for instructions)


A comparative investigation of the readability and comprehensibility of SMT and RBMT output for controlled and uncontrolled input


Both automatic and human evaluation (focus)

Automatic evaluation (BLEU…)

Human evaluation: eye tracking & retrospective protocols (recall tests & interviews)
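For the automatic side of the evaluation, and assuming the metric meant on this slide is BLEU, the core computation can be sketched as follows. This is a simplified illustration rather than the tooling used in the study: it scores a single sentence against a single reference, tokenises on whitespace, and applies no smoothing, whereas BLEU is normally reported at corpus level. The function names are hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed single-reference BLEU for one sentence (simplified sketch)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped matches: each candidate n-gram counts at most as often as in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any empty n-gram order zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalise candidates shorter than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)

if __name__ == "__main__":
    print(sentence_bleu("the cat sat on", "the cat sat on the mat"))
```

A score like this measures n-gram overlap with a reference translation only; it says nothing directly about how readable or comprehensible the output is, which is why the human measures are the focus here.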


Eye Tracking:

What is it exactly? (background)

Successful application in this research area

Tobii Eye Tracker & ClearView software

Additional video recording, keystroke & mouse logging


Tobii 1750 Eye Tracker (www.tobii.se)


Recall tests (comprehensibility)

Retrospective interviews (generation of additional data & resolving possible issues)
