Prof Klaus: Terminology Management

Post on 11-May-2015

2.474 views 7 download

Tags:

Transcript of Prof Klaus: Terminology Management

Terminology Management in Technical Communication

tcworld IndiaBangalore, 11 – 12 March 2011

Klaus-Dirk SchmitzInstitute for Information ManagementFaculty 03University of Applied Sciences Cologneklaus.schmitz@fh-koeln.de

K.-D. Schmitz, IIM, FH Köln

Overview

What is terminology?

Basic principles of terminology management

Terminology management in companies

Tools for managing terminology Terminology extraction tools Terminology databases Terminology control tools

K.-D. Schmitz, IIM, FH Köln

Note: If you omit the password, MultiTermprompts you for a password when loading,assuming the database is password-protected. If you log on as the system administrator, you are normally asked whether you want exclusive access to the database. This is not the case when opening a database using parameters; in this case, it is assumed that you do want exclusive access. Only when exclusive access is not available, MultiTerm does assume that you still want to take part in normal multi-user operation.

What is terminology ?

K.-D. Schmitz, IIM, FH Köln

Note: If you omit the password, MultiTermprompts you for a password when loading, assuming the database is password-protected. If you log on as the system administrator, you are normally asked whether you want exclusive access to the database. This is not the case when opening a database using parameters; in this case, it is assumed that you do want exclusive access. Only when exclusive access is not available, MultiTerm does assume that you still want to take part in normal multi-user operation.

What is terminology ?

K.-D. Schmitz, IIM, FH Köln

Note: If you omit the password, MultiTermprompts you for a password when loading, assuming the database is password-protected. If you log on as the system administrator, you are normally asked whether you want exclusive access to the database. This is not the case whenopening a database using parameters; in this case, it is assumed that you do want exclusive access. Only when exclusive access is not available, MultiTerm does assume that you still want to take part in normal multi-user operation.

What is terminology ?

K.-D. Schmitz, IIM, FH Köln

What is terminology ?

databaseexclusive accessloadinglog onMultiTerm *multi-user operationopen a databaseparameterpasswordpassword-protectedpromptsystem administrator

terminology =vocabulary of a subject field= Gesamtheit der Begriffe und Benennungen in einem Fachgebiet(DIN 2342)=set of designations belonging to one special language (ISO 1087-1)

K.-D. Schmitz, IIM, FH Köln

„„mousemouse““

conceptconcept

objectobjecttermterm

Terminological Triangle

K.-D. Schmitz, IIM, FH Köln

Object

Any part of the perceivable or conceivable world

Objects may be material (e.g. mouse) or immaterial (e.g. magnetism)

K.-D. Schmitz, IIM, FH Köln

Concept

Unit of thinking made up of characteristicsthat are derived by categorizing objects having a number of identical properties (DIN)

Unit of knowledge created by a unique combination of characteristics (ISO)

Concepts are not bound to particular languages. They are, however, influenced by social or cultural background

K.-D. Schmitz, IIM, FH Köln

Term

Designation of a defined conceptin a special languageby a linguistic expression

Designation: Any representation of a concept

A term may consist of one or more words A term may consist of one or more wordsSingle word term: mouse, printer, laserMulti-word term: laser printer, printer with single-sheet feed

““mousemouse””

K.-D. Schmitz, IIM, FH Köln

Synonymy

““return key?return key?””

Communication can be disturbed!

““enter keyenter key””

K.-D. Schmitz, IIM, FH Köln

Synonymy

Synonymy exists if two or more terms in a given language represent the same concept.

K.-D. Schmitz, IIM, FH Köln

Synonymy

K.-D. Schmitz, IIM, FH Köln

Homonymy / Polysemy

““mousemouse””

Communication can fail!

K.-D. Schmitz, IIM, FH Köln

Homonymy / PolysemyPolysemy: etymological affinity, the same word.

Homonymy exists if one term or several terms that have the same external form refer to several concepts.

True homonymy: different words with the same form, no etymological affinity.

Differentiation between polysemy and homonymy is irrelevant for terminology work.

K.-D. Schmitz, IIM, FH Köln

Coining new terms

K.-D. Schmitz, IIM, FH Köln

Coining of terms: word formation + term buildings mechanisms

Composition: cyberspace, translation memory system

Derivation: preface, management

Conversation: the chair to chair, green (adj) the green

Terminologization: mouse (IT) mouse (bio, general), virus (IT) virus (med)

Loan word: festschrift, zeitgeist from DE, rickshaw from JP

Abbreviation: CEO, AIDS, scuba, Interpol

New creation: blurb, quark (very rare)

Term-related issues

K.-D. Schmitz, IIM, FH Köln© Prof. Dr. Petra Drewer & Prof. Dr. Klaus‐Dirk Schmitz

• USB flash drive• flash drive• USB stick• USB memory key• memory stick• keydrive• pendrive• thumbdrive• jumpdrive• etc.

Selecting “good” terms

K.-D. Schmitz, IIM, FH Köln

Criteria for the selection and creation/coining of terms:

Transparency/motivation

Consistency

Appropriateness

Linguistic economy

Derivability

Linguistic correctness

Preference for native language

Term-related issues

K.-D. Schmitz, IIM, FH Köln

Transparency

The concept designated by the term can be inferred without a definition(e.g. pipe wrench or adjustable wrench vs. monkey wrench)

The meaning of the term is visible by:

morphological motivationpage setup, error messagedata network identification code*network printing device setup*

semantic motivationworm, virus, infected file, vulnerability, firewall*

K.-D. Schmitz, IIM, FH Köln

Terminology must be defined accurately and used consistently at least within: one document one product one company or organization

Only one term for each concept(avoid synonyms !)

Only one concept for each term(avoid homonyms !)

But also consistent term coining (see Wikipedia: wrench)

Consistency

K.-D. Schmitz, IIM, FH Köln

Appropriateness

Appropriateness means that terms: have to be familiar to the user (localization!)

don’t cause confusion or insecurity

have no negative connotations (neutral, politically correct)

express installation (only components needed)network installation (all components)

system error, severe error, fatal error, user error etc. master/slave, web designers’ bible, knowledge nugget etc. nuclear energy vs. atomic energy

K.-D. Schmitz, IIM, FH Köln

Appropriateness

K.-D. Schmitz, IIM, FH Köln

Other features

Linguistic economy Ultrakurzwellenüberreichweitenfernsehrichtfunkverbindung

Derivability medicinal plant vs. herb herbal, herbalist, ... Bedeutungslehre Semantik

Linguistic correctness aktualisieren vs. updaten, geupdated, upgedatet, ... OpenSource, Cafe ToGo, fünfköpfiger Familienvater, …

Preference for native language Startseite vs. Homepage, Multifunktionsleiste vs. Ribbon

K.-D. Schmitz, IIM, FH Köln

Overview

What is terminology?

Basic principles of terminology management

Terminology management in companies

Tools for managing terminology Terminology extraction tools Terminology databases Terminology control tools

K.-D. Schmitz, IIM, FH Köln

Terminology is an important carrier of knowledge for domain-specific information in companies

Within a company, but also between company and customers as well as between company and suppliers

Many departments and sectors of a company create, disseminate, retrieve and use terminology

BUT: Terminology management is discussed controversially: Costs + effort Quality and efficiency

Therefore: Costs and benefits of terminology management

Terminology in companies

K.-D. Schmitz, IIM, FH Köln

Successful terminology management in companiesPractical tips and guidelines: Basic principles, implementation, cost-benefit analysis, system overview

published in German 4/2010, about 300 pages, CD with data

also available in English

tekom survey

K.-D. Schmitz, IIM, FH Köln

Online questionnaire end of 2009 about 1,000 sent out, mostly to tekom members High response rate of 940 questionnaires (77% tekom) 34% managerial staff and CEOs

64% employees 67% industrial enterprises

15% software companies13% service providers (TD / translation / localization)

(And: questionnaire for tools providers, questionnaire for benchmarking companies (25), 2 benchmarking workshops)

tekom survey

K.-D. Schmitz, IIM, FH Köln

At an average, a company

has to manage 11.81 different information products

is creating 5.87 different technical documentations

has to deal with the fact, that 5.04 different sections/ departments are involved in the process of creating (new) terms

is translating information products into 10.1 different languages

tekom survey: the situation

K.-D. Schmitz, IIM, FH Köln

Product planning and development

Product usage

Product maintenance

R&D & Product management Information pertaining to product release changesMarketing and product management Product information sheets for pre-salesMarketing Marketing material

Customer presentationsSales and Marketing &TD Product catalogues

Price catalogues Contract documents

Corporate communications Press releases

R&D & Product Management Product specificationsR&D Procedural instructions

Process documentation for product development CAD graphics, 3D-models pertaining to the product Laboratory manuals

R&D & software development Software GUIs / user menusR&D & TD marketing Images pertaining to the productR&D & TD Data sheets

Parts listsR&D & Marketing Packaging labels / product labelsQM Quality documentation

TD Maintenance/Service manualsTD & Service Repair manuals

Spare parts catalogues

Product marketing

Technical Communication (TD) Montage and installation instructions Commissioning manuals Online help

Training & TD Training documentsMarketing and TD Multimedia/simulation programsTD & R&D & software dev. User interfaces(GUIs) / softwaredescriptionsR&D Control elementsand labels

K.-D. Schmitz, IIM, FH Köln

Who is creating terminology?

Technical documentation 79.7%

Research / (software) development / engineering

79.7%

Marketing 63.5%

Product management/ Portfolio management

61.3%

Translation / Localization 40.4%

Distribution / Sales 39.3%

Customer service / After sales 30.7%

Training 28.4%

Management board 26.3%

Corporate communications / Public relation

24.4%

Quality assurance / Quality management 16.3%

Purchase / Procurement 12.3%

Montage / Assembly planning / Production

12.3%

Servicing / Maintenance 8.2%

IT service 6.7%

Multiple answers, average 5.04different departments/ sections

K.-D. Schmitz, IIM, FH Köln

84.2 % report, that always or very frequently various departments/sections use different terms for the same concept

70.7 % report, that always or very frequently differing terms for the same concept are used in various documents

47.1 % of the staff always or very frequently have problems in understanding technical terms on the spot

51.1 % of the staff always or very frequently have to ask for or retrieve the correct term for a given concept

Terminology problems

K.-D. Schmitz, IIM, FH Köln

Consequences: Terminology problems

K.-D. Schmitz, IIM, FH Köln

Opinions about terminology

96,0%96,0%96,9%

87,7%

sehen die Zeitersparnis inder Kommunikation und

Arbeit als eher groß bis sehrgroß an

halten dieArbeitserleichterung für eher

groß bis sehr groß

schätzen dieQualitätsverbesserung vonDokumenten und in derKommunikation als eher

groß bis sehr groß

gehen von einer eher großenbis sehr großen Erleichterungder Verständlichkeit für den

Kunden aus

Saving time Making work easier

Improving quality

Better understanding

for the customer

K.-D. Schmitz, IIM, FH Köln

And: Context, Example, Synonym, No-Term, Source, Explanation, References, Position numbers, short forms

34.8%Illustrations

44.9%Project, product, customer, department information

51.4%Grammatical information (Gender, POS, Number etc.)

72.3%Status (preferred, admitted, deprecated, do not use etc.)

78.2%Subject field information

84.3%Definitions

Documentation of terminology

Terminology documentation

K.-D. Schmitz, IIM, FH Köln

1. Analysis of the problem2. Analysis of the impact (very often, very expensive, bad quality)

3. Framework conditions for the value of benefit (many documents, high volumes, many languages, using TMS, using CMS)

4. Definition of goals (management triangle: costs, time, quality)

5. Analysis of benefit (consistent terminology, less changes, higher TM match rate, less queries by translators)

6. With/without comparison7. Analysis of costs (initial costs: system, implementation, training;

running costs: licenses, personnel, working hours)

8. Cost-benefit analysis9. Success factors and risks (early, involve all, workflow)

Model for a cost-benefit analysis

K.-D. Schmitz, IIM, FH Köln

For a specific application scenario: Costs for changes of terms in the source language Costs for queries from translators Costs for terminology related translation corrections Translation costs

e.g. for changes of terms: duration in number of hours of work for making changes wages for the hours of work number of changes made per document number of newly created documents per year

And: when changes? in which formats/systems? with/without termbase!

Key performance indicators: benefits

K.-D. Schmitz, IIM, FH Köln

For a specific application scenario:

Procurement costs for a termbase system (12,000-40,000 €)

Costs for system support and update (1,100-9,000 €)

Time for one SL term entry (ø 30 min) plus TL info (ø 20 min)

Number of new entries/year (60-600) + updated entries (100-1200)

Monthly salaries for the terminology staff

Key performance indicators: costs

K.-D. Schmitz, IIM, FH Köln

Cost-benefit analysisDegree of Necessity

Type of Problems

Degree of Effects

Framework Conditionsfor Optimum Usage

e.g. Number of Languages

e.g. TMS Usage

e.g. Number Employees TD

Alternatives

Benefit Key Indicators:e.g. Costs for Source Text Changes

Costs for Answering Translators QuestionsCosts for Target Text Corrections

TMS Match Rates

Cost Key Indicators:e.g. Investment Costs for TermBase System

Costs for Training PersonnelRunning Costs for Terminology Management

System Maintenance

Without Defined Terminology

With Defined Terminology

(Without Defined Terminology)

With Defined Terminology

Figures to Compare from Benchmarking or Estimation

Evaluation and Comparison Benefit Evaluation and Comparison Costs

e.g. Amount of Translation

K.-D. Schmitz, IIM, FH Köln

Salary expenses

- 200 000 €

+ 200 000 €

- 150 000 €

- 50 000 €

- 100 000 €

+ 150 000 €

+ 50 000 €

+ 100 000 €

Licenses

Initial investment

Salary expenses

Licenses

Translatorqueries

Changes

Translationcosts

Corrections

Translatorqueries

Changes

Translationcosts

Corrections

Translatorqueries

Changes

Translationcosts

Corrections

Salary expenses

Licenses

Salary expenses

Licenses

2. Year1. Year 3. Year 4. Year

ROI

Break-Even-Point

Cost-benefit analysis: sample

K.-D. Schmitz, IIM, FH Köln

Pain curve for terminology management

Source: Dunne, Keiran; Multilingual, April 2006

K.-D. Schmitz, IIM, FH Köln

Terminology error propagation

Source: Dunne, Keiran; Multilingual, April 2006

K.-D. Schmitz, IIM, FH Köln

Source: Childress, Mark; Multilingual, April 2006

Terminology error propagation

K.-D. Schmitz, IIM, FH Köln

Overview

What is terminology?

Basic principles of terminology management

Terminology management in companies

Tools for managing terminology Terminology extraction tools Terminology databases Terminology control tools

K.-D. Schmitz, IIM, FH Köln

Terminology extraction

In many application scenarios of terminology work, the extraction of terminology from (existing) textual material is recommended.

We can differentiate between the following extraction methods:

Monolingual term extraction (text in electronic form)

Bilingual term extraction (parallel aligned texts, i.e. TMs)

Manual (human) term extraction

Computer-assisted term extraction (tools propose term candidates)

• With statistical methods (for “all” languages, cannot use knowledge about syntax)

• With linguistic methods (better results, but only for “important”languages)

• With hybrid methods (combining statistical and linguistic methods)

K.-D. Schmitz, IIM, FH Köln

Features of (monolingual) term extraction tools:

Common functionalities from concordance programs (e.g. WordSmith): identify words, word statistics, KWIC index, alphabetic/frequency order

Reducing inflected word forms to the basic canonical form:needed for real statistics, needs morphological knowledge

Filtering and ignoring function words (articles, conjunctions etc.) and general language words (but what is general language?)

Filtering and ignoring terms that are already included in a termbase

Identifying multi-word terms, noun phases and verbal phases

Identifying discontinuous elements and elliptical constructions

Terminology extraction tools

K.-D. Schmitz, IIM, FH Köln

Improving and enriching term candidates with SDL MultiTermExtract

Terminology extraction tools

K.-D. Schmitz, IIM, FH Köln

Settings to improve the results of term extraction with SDL MultiTermExtract

Terminology extraction tools

K.-D. Schmitz, IIM, FH Köln

Benefits and problems of term extraction tools:

Term extraction tools are helpful in preparing terminology for large translation projects (with several translators) and for aninitial feeding of a term base (with company or subject specificterminology)

Result of a term extraction is a list of term candidates; the list must be checked; but what about the texts (with possible not extracted terms)?

Results are only terms (and context examples), but no other terminological information; it is a kind of a to-do list for the terminologist

The more linguistics the better the results; but what about “less common” and minority languages?

Terminology extraction tools

K.-D. Schmitz, IIM, FH Köln

Terminology management systems

Terminology management systems are software applications that are designed to manage terminological data.

They support tasks related to terminology work and store the results: Terminological data can be entered, edited, deleted, retrieved and filtered.

Most of the systems available on the market are based on (relational) data base systems (MS-Access, SQL, Oracle).

Can be seen as a kind of CAT-Tools (CAT=computer assisted translation).

Tables in word processing or spreadsheet programs are not adequate for terminology management !

K.-D. Schmitz, IIM, FH Köln

Classification of terminology management systems:

Complexity (languages): monolingual / bilingual / multilingual

Entry structure: predefined / free-definable / hybrid

Autonomy: autonomous / CAT tool component / hybrid

Software technology: stand-alone / client-server / browser-based

Business aspects: proprietary / commercial / open source

e.g. SDL MultiTerm 2009

Terminology management systems

K.-D. Schmitz, IIM, FH Köln

Designing a terminology management solution

Before designing a terminology management solution and choosing, adapting or programming a terminology management application:

Analyze the needs and objectives

Specify the user groups, tasks and workflow

Define the terminological data categories needed

Take into account the basic modelling principles

Model the terminological entry

Select, adapt, develop the software Meta data

K.-D. Schmitz, IIM, FH Köln

Typology of data categories I

Complex data categories Open data categories

content not predictable and defined by specificatione.g.: term, definition, note

Closed data categoriescontent defined by a limited set of possible valuese.g.: gender, part of speech, geographical usage

Simple data categoriescontent only yes or no; values of closed data categoriese.g.: masculine, noun, DE

K.-D. Schmitz, IIM, FH Köln

Typology of data categories II Concept-oriented data categories

e.g.: subject field, figure

Language-oriented data categoriese.g.: definition ?

Term-oriented data categoriese.g.: part of speech, context

Administrative data categoriese.g.: author, date, note

Special data categoriese.g.: term, language, (structural elements), (shared resources)

K.-D. Schmitz, IIM, FH Köln

K.-D. Schmitz, IIM, FH Köln

All terminological information belonging to one concept including all terms in all languages and all term-related and administrative data must be store in one terminological entry

concept = terminological entry

Concept orientation

K.-D. Schmitz, IIM, FH Köln

wordword meaningmeaning

meandingmeanding

meaningmeaning

meaningmeaning

meaningmeaning

meaningmeaning

Lexicographical view / model / entry

K.-D. Schmitz, IIM, FH Köln

conceptconcept termterm

descriptive descriptive terminology managementterminology management

termterm

termterm

termterm

termterm

termterm

Terminological view / model / entry

K.-D. Schmitz, IIM, FH Köln

conceptconcept termterm

prescriptive prescriptive terminology managementterminology management

termterm

termterm

termterm

(term)(term)

termterm

Terminological view / model / entry

K.-D. Schmitz, IIM, FH Köln

Lexicographical entry

K.-D. Schmitz, IIM, FH Köln

Terminological entry

K.-D. Schmitz, IIM, FH Köln

Terminological entry

K.-D. Schmitz, IIM, FH Köln

Eintragsmodellierung + Prinzipien

K.-D. Schmitz, IIM, FH Köln

All terms belonging to one concept should be managed (in one terminological entry) as autonomous (repeatable) blocks of data categories without any preference for a specific term

Therefore all terms can be documented with the relevant term-related data categories

Term autonomy is necessary for the main term, all synonyms, all variants, and all short forms

Term autonomy is not explicitly discussed in theoretical literature

Term autonomy

K.-D. Schmitz, IIM, FH Köln

TermEntryConcept

represented by ID-No. and/or classification / notation

Language 1+ AuxInfo

...

Term 1+ AuxInfo

Term 2+ AuxInfo

Term 1+ AuxInfo

Term 2+ AuxInfo

Term 1+ AuxInfo

Term 3+ AuxInfo

Concept orientation & term autonomy

Language 3+ AuxInfo

Language 2+ AuxInfo

K.-D. Schmitz, IIM, FH Köln

Concept orientation & term autonomy

K.-D. Schmitz, IIM, FH Köln

Terminological data modeling

Terminological data categories ISO 12620:1999 / 12620:2009 definition, subject field, grammar, context,

project code, author, date etc.

Data Category Registry (DCR)

Terminological data modeling principles ISO 12200 / 16642 / 30042 meta model, concept orientation, term autonomy,

TBX (Termbase eXchange)

K.-D. Schmitz, IIM, FH Köln

Terminology control

In many application scenarios of terminology work, the checking of correct and consistent use of terminology in documents (created by technical writers or translators) is recommended.

We can differentiate between the following control methods:

Monolingual terminology control

Bilingual terminology control (for translations)

Manual (human) terminology control (part of proof reading & QA)

Computer-assisted term control (tools analyze and check documents)

• Without linguistic methods (for “all” languages, using the content of a term base)

• With linguistic methods (better results, but only for “important”languages)

K.-D. Schmitz, IIM, FH Köln

Features of terminology checking tools: Similar to spell checkers and auto correction Integrated into editors, authoring systems, CAT tools, but also as

stand-alone programs Directly during the writing process of a document or translation, or

as an autonomous process (when the document is finished) Connection to the term base entries (interactive or via

export/import) Very often combined with grammar and style checking (controlled

language) Using fuzzy search and/or linguistics (inflected terms in texts vs.

canonical form of the terms in term bases) Deprecated terms must be maintained in the term base (no-terms)

Terminology control tools

K.-D. Schmitz, IIM, FH Köln

Sample of a terminology check with acrolinx IQ Suite

Terminology control tools

K.-D. Schmitz, IIM, FH Köln

Automatic detection of linguistic variants with acrolinx IQ Suite

Terminology control tools

K.-D. Schmitz, IIM, FH Köln

You will find terminology problems in many companies

In any case, terminology management will improve important factors such as: better communication, consistent corporate language, avoidance of errors and misunderstanding, faster training, better translations

Not all benefits are easy to calculate and quantify

Benefits and ROI depend on several company specific framework conditions

Terminology management is no luxury, it is a necessityfor all industrial companies and service providers,and this can be documented by key indicators

Conclusion 1

K.-D. Schmitz, IIM, FH Köln

Only excellent managed terminology can guarantee a high quality information and communication:

follow guidelines for term creation and term selection

model your termbase with concept orientationand term autonomy, and include appropriate data categories for documenting terms

Wrong decisions and mistakes can later be repaired only with huge efforts and costs !

Conclusion 2

Thank you for your attention

Prof. Dr. Klaus-Dirk SchmitzFachhochschule Köln

Fakultät 03 - ITMK/IIMMainzer Str. 5

50678 Kölnklaus.schmitz@fh-koeln.de