Pros and Cons of back translation in assessments and surveys
PROS AND CONS OF BACK TRANSLATION IN ASSESSMENTS AND SURVEYS
PRESENTERS
Andrea Ferrari
Founder of cApStAn (2000)
Business engineer by training
Italian verifier and project manager for translation verification in PISA 2000
21 years of experience in linguistic quality assurance (LQA) for surveys and assessments
CFO + in charge of quality control of linguistic quality assurance @ cApStAn
Steve Dept
Founder of cApStAn (2000)
Linguist by training
French verifier and project manager for translation verification in PISA 2000
27 years of experience in test adaptation, survey translation and linguistic quality assurance (LQA)
CEO + in charge of International Large-Scale Assessments (ILSAs) @ cApStAn
CONTENTS
SETTING THE STAGE: brief history of translation quality management in tests and surveys
EXPECTED OUTCOMES of translation quality management
BACK TRANSLATION: what issues it detects, how these are reported, how they are fixed
TRANSLATION VERIFICATION: what issues it detects, how these are reported, how they are fixed
SUMMARY
Q&A
SETTING THE STAGE: TRANSLATION QUALITY MANAGEMENT IN TESTS AND SURVEYS
MILESTONES
In the late 60s: “test translation changes test difficulty to the extent that comparisons across language groups may have limited validity”
In the 70s: linguistic quality control methods are introduced, e.g. back translation (Brislin, 1969, 1973, 1988…)
Prof. Y. Poortinga: “75% of research in cross-cultural psychology before 1990 was flawed because of poor quality of translations”
A good summary of breakthroughs:
Hambleton, R. K. et al. (2005): Adapting educational and psychological tests for cross-cultural assessment
Driving force behind ITC GUIDELINES (1999, 2010, 2017)
Political Attitudes and Democracy in Five Nations
(G. Almond and S. Verba, 1963)
IEA Cross-national Study of Mathematics (1964)
EXPECTED OUTCOMES
Feedback meaningful for test developers/psychometricians
Feedback that is timely (corrective action), possibly including loop back to source version
Complete documentation of intentional deviations versus source
→ Assurance that all steps have been taken to maximize cross-language comparability
BACK TRANSLATION
What it detects
How issues are reported
How issues are fixed
WHAT IT DETECTS
o Added information
o Missing information
o Mistranslation
WHAT IT MAY NOT DETECT
o Higher or lower register in target versus source
o Missing key correspondence between stimulus and question (e.g. literal match, synonymous match)
Back-translation less likely to be helpful here
WHAT IT DEFINITELY WILL NOT DETECT
o Fluency in target: stilted, awkward, literal translation
o Certain psychometric characteristics that may get lost in translation
HOW ISSUES ARE REPORTED
o Five recent terms of reference (ToR) that include back-translation as a requirement provide no further specifications
Typical example:
“Deliverable: pre-test, translation and back-translation of data collection tools”
o At minimum: Human reviewer compares back-translation to original source, flags “doubtful” segments, sends this feedback to translator
o Translator uses feedback to improve translation
More robust procedure:
- Reviewer uses a taxonomy to categorize issues
- Translator reports how flagged issues were addressed
- Reviewer follows up
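The more robust reporting loop can be pictured as a small issue log. The sketch below is only an illustration, not a tool described in the presentation: the category names are borrowed from the issue types back translation detects, and the class names, field names and sample entries are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class IssueCategory(Enum):
    # Hypothetical taxonomy; a real project would define its own.
    ADDED_INFORMATION = "added information"
    MISSING_INFORMATION = "missing information"
    MISTRANSLATION = "mistranslation"


@dataclass
class Flag:
    segment_id: str
    category: IssueCategory                    # reviewer categorizes the issue
    reviewer_note: str
    translator_response: Optional[str] = None  # how the translator addressed it
    closed: bool = False                       # set by the reviewer at follow-up


def awaiting_translator(flags: List[Flag]) -> List[Flag]:
    """Flags the translator has not yet reported on."""
    return [f for f in flags if f.translator_response is None]


def open_flags(flags: List[Flag]) -> List[Flag]:
    """Flags the reviewer still has to follow up on."""
    return [f for f in flags if not f.closed]


# Invented sample entries: the reviewer flags two doubtful segments
# after comparing the back translation with the original source.
log = [
    Flag("Q3", IssueCategory.MISSING_INFORMATION, "time reference missing in BT"),
    Flag("Q7", IssueCategory.MISTRANSLATION, "frequency adverb differs in BT"),
]

# The translator reports on Q3 and the reviewer closes it at follow-up.
log[0].translator_response = "time reference restored in target"
log[0].closed = True
```

With a log like this, `open_flags(log)` shows what still needs reviewer follow-up, which is exactly the information a minimal "flag and forward" workflow does not keep.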
HOW ISSUES ARE FIXED
o Back-translation per se does not address this step
o Cf. previous slide: corrective action depends on the workflow and the overall LQA design that includes the BT step; it can be minimal or more robust
Additional consideration: How are Reviewers selected, trained and instructed?
WHY BACK TRANSLATION IS REASSURING
control element
element of hard evidence: think double-blind studies with placebo
TRANSLATION VERIFICATION
What it detects
How issues are reported
How issues are fixed
LQC
Verification by linguists (or by pairs: linguist plus domain specialist)
Documentation of issues (verifier intervention categories)
Monitoring of corrective action (final check)
Quantitative and qualitative reports
Defining Linguistic Quality Control in the ILSA setting:
Check whether translated/adapted data collection instruments comply with translation and adaptation (T&A) notes
Report issues as well as risks
Propose, implement and follow up corrective action
WHAT IT DETECTS
Verifier comment: Correct response has same root as main verb in the question (unlike in source)
Verifier comment: "Church" was not adapted. Churches are not the main religious organisations in the target country
HOW ISSUES ARE REPORTED
VERIFIER INTERVENTION CATEGORIES (CAPSTAN)
PISA 2006 FT: 5,380 verifier comments, covering 42 national versions in 36 languages for 38 countries, were analysed and described with key words
A taxonomy of verifier intervention categories was developed
SEVERITY CODES (IEA)
1. Major Change or Error: e.g. incorrect order of choices in a multiple-choice question; omission of a question; incorrect translation which changes the meaning or difficulty of the passage or question.
2. Minor Change or Error: e.g. spelling errors that do not affect comprehension.
3. Suggestion for Alternative: translation may be adequate, but you suggest a different wording.
4. Acceptable Change: change is acceptable and appropriate, e.g. a reference to winter is changed from January to July for the Southern Hemisphere.
1? In case of Doubt: if you are not sure what code to apply, use “1?”, so that no serious issue is left unaddressed.
More ‘operational’ classification, but less ‘informative’
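Because the severity codes are 'operational', they lend themselves to simple triage. The sketch below encodes the five codes as listed above. The rule that codes "1" and "1?" go to the top of the follow-up queue is an assumption made for illustration, as are the function names and the sample comments.

```python
from enum import Enum
from typing import List, Tuple


class Severity(Enum):
    MAJOR = "1"        # Major Change or Error
    MINOR = "2"        # Minor Change or Error
    SUGGESTION = "3"   # Suggestion for Alternative
    ACCEPTABLE = "4"   # Acceptable Change
    DOUBT = "1?"       # In case of Doubt


def parse_code(raw: str) -> Severity:
    """Map a code as typed by the verifier to a Severity member."""
    return Severity(raw.strip())


def is_priority(code: Severity) -> bool:
    # Assumption: doubtful cases are handled like major ones,
    # so that no serious issue is left unaddressed.
    return code in (Severity.MAJOR, Severity.DOUBT)


def priority_items(comments: List[Tuple[str, str]]) -> List[str]:
    """Return the item ids whose comments need priority follow-up."""
    return [item for item, raw in comments if is_priority(parse_code(raw))]


# Invented sample of (item id, severity code) pairs.
comments = [("item 1", "1?"), ("item 2", "2"), ("item 3", "4"), ("item 4", "1")]
priority = priority_items(comments)
```

An unknown code such as "5" raises `ValueError` in `parse_code`, which is a deliberate choice here: a mistyped code should surface rather than be silently ignored.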
HOW ISSUES ARE FIXED
In translation verification, issues are reported and fixes are proposed at the same time
All corrections made are tracked
Ideally, a reviewer examines verifier feedback and decides what issues require follow-up
SUMMARY: Back-translation vs Translation Verification
HOW MUCH TIME AND EFFORT, ON WHAT?
Back-translation means time/effort
1) by a Back-translator (possibly aided by MT)
2) by a Reviewer (to compare back-translation to original source)
3) by the Translator (to implement corrections)
Translation Verification means time/effort (more? less?)
1) by a Verifier (to compare target to source sentence by sentence; to report issues; to suggest corrections)
2) by a Reviewer (to analyze verifier feedback)
CONCLUDING
BACK TRANSLATION (BT)
1 person works with source and target
Usually catches mistranslations
Does not catch fluency, register
Does not catch culture-driven perception shifts
No documentation of equivalence issues
After BT, a reviewer needs to compare two same-language versions
Corrections still need to be implemented
TRANSLATION VERIFICATION (VER)
2 people work with source and target
Usually catches mistranslations
Reports and corrects fluency, register
Usually catches culture-driven perception shifts
Systematic documentation of equivalence issues
After VER, a reviewer needs to analyze verifier feedback
Some corrections may need to be rejected/undone
CONCLUDING
BACK TRANSLATION (BT)
Literal translation scores well on BT index
Back translator may want to show off his translation skills (and embellish BT)
Back translator only needs translation skill
Compliance with translation and adaptation notes cannot be checked
No procedure to suggest adaptations if needed
Residual typos in the target version not corrected
TRANSLATION VERIFICATION (VER)
Literal translation flagged as awkward
Verifier offers a diagnosis of potential equivalence issues in translated version
Verifier needs to be trained to detect and report survey-specific issues
Compliance with translation and adaptation notes systematically checked
Verifier can identify the need for adaptations
Linguistic quality control a subset of verification
CROSS-CULTURAL SURVEY GUIDELINES
https://ccsg.isr.umich.edu/index.php/chapters/translation-chapter/translation-overview#twelve
“Translation procedures from the past – no longer recommended”
“instead of looking at two source language texts, it is much better in practical and theoretical terms to focus attention on first producing the best possible translation and then directly evaluating the translation produced in the target language, rather than indirectly through a back translation. Comparisons of an original source text and a back-translated source text provide only limited and potentially misleading insight into the quality of the target language text.”
A COST-EFFECTIVE APPROACH: AD HOC VERIFICATION
Identify a selection of sensitive points (literal matches, synonymous matches, patterns, technical terms)
Verify these sensitive points carefully for each language
Above threshold: pilot partially verified version
Below threshold: Full Verification (sentence by sentence)
Nothing speaks against asking the reviewer to back translate non-compliant segments
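The ad hoc verification decision can be sketched in a few lines. The slide does not say how the threshold is measured, so everything quantitative here is an assumption: the metric (share of sensitive points found compliant), the default threshold of 0.9, the reading that a result at or above the threshold leads to piloting, and all function names.

```python
from typing import Dict, List


def compliance_rate(checks: Dict[str, bool]) -> float:
    """Share of sensitive points that the verifier found compliant."""
    if not checks:
        raise ValueError("no sensitive points were checked")
    return sum(checks.values()) / len(checks)


def next_step(checks: Dict[str, bool], threshold: float = 0.9) -> str:
    # Assumption: at or above the threshold, the partially verified
    # version goes to pilot; below it, full verification is triggered.
    if compliance_rate(checks) >= threshold:
        return "pilot partially verified version"
    return "full verification (sentence by sentence)"


def non_compliant(checks: Dict[str, bool]) -> List[str]:
    """Sensitive points that could be back translated by the reviewer."""
    return [point for point, ok in checks.items() if not ok]


# Invented results for one language version: point id -> compliant?
checks = {
    "literal_match_1": True,
    "synonymous_match_2": True,
    "technical_term_3": False,
    "pattern_4": True,
}
```

For the sample above, three of four points are compliant, so the outcome depends entirely on where the threshold is set, which is why that value should be agreed before verification starts.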
THANK YOU VERY MUCH
REFERENCES 1
Almond, G., & Verba, S. (1963). The Civic Culture or Political Attitudes and Democracy in Five Nations. Sage Publications.
Brislin, R. W. (1969). Back-translation for cross-cultural research (Doctoral dissertation, The Pennsylvania State University). Ann Arbor, MI: University Microfilms, No. 70-13,803.
Brislin, R. W., Lonner, W., & Thorndike, R. M. (1973). Cross-cultural research methods. New York: Wiley.
Brislin, R. W. (1988). The wording and translation of research instruments. In W. Lonner & J. W. Berry (Eds.), Field methods in cross-cultural research.
Hambleton, R. K., Merenda, P., & Spielberger, C. (Eds.). (2005). Adapting educational and psychological tests for cross-cultural assessment. Hillsdale, NJ: Lawrence S. Erlbaum Publishers.
REFERENCES 2
Iliescu, D. (2017). Adapting Tests in Linguistic and Cultural Situations. New York: Cambridge University Press.
Harkness, J. A. (2003). Questionnaire translation. In J. A. Harkness, F. van de Vijver, & P. Ph. Mohler (Eds.), Cross-cultural survey methods (pp. 35-56). Hoboken, NJ: John Wiley & Sons.
Harkness, J. A. et al. (Eds.). (2010). Survey Methods in Multinational, Multiregional and Multicultural Contexts. Hoboken, NJ: John Wiley & Sons.
Survey Research Center. (2016). Guidelines for Best Practice in Cross-Cultural Surveys. Ann Arbor, MI: Survey Research Center, Institute for Social Research, University of Michigan. Retrieved May 28, 2020, from http://www.ccsg.isr.umich.edu/.
WHAT IS VERIFICATION?
Verification
Item Functioning
Adaptations & Guidelines
Proofreading
PROOFREADING VS. VERIFICATION
Verification:
Focus on maintaining same difficulty level and ensuring correct item functioning
Less flexibility as regards form (especially in key parts)
Preferential changes to be avoided
Literal & synonymous matches preferred
If it is not broken, do NOT fix it.
Proofreading:
Linguistic fluency and correctness, equivalency on content level
More flexibility as regards form
More room for preferential changes
Rich vocabulary can be a plus