Electronic dictionaries in writing tools: user needs and models for user interaction


Electronic dictionaries in writing tools: user needs and models for user interaction

Ulrich Heid

Universität Hildesheim, Institut für Informationswissenschaft und Sprachtechnologie,

Universitätsplatz 1, D-31141 Hildesheim, Germany

Santiago de Compostela: Multilex-2015, October 2015

Heid (IwiSt/IMS) Text production dictionaries santiago15-fol 1 / 32

Overview

• Framework: Lexicographic Function Theory and its implications for e-dictionary making

• User needs:
  • General aspects
  • Needs in text production, and proposals from the literature to satisfy them:
    • Needs resulting from linguistic complexity
    • Needs resulting from different levels of knowledge of users

• Models of interaction:
  • Information on demand
  • (New) ways of presenting lexicographic data

• Conclusion: lessons learnt

Context and warning
Projects – cooperation

• This presentation does not contain anything new: it just re-arranges and re-interprets recent work, giving a practical state of the art rather than abstract visions

• Based on cooperation in SeLA – Scientific e-Lexicography for Africa, a project funded by BMBF (05-2012 – 12-2015) and organized by DAAD:
  • University of Pretoria: Theo Bothma, Daan Prinsloo, Elsabe Taljard
  • University of Stellenbosch: Rufus H. Gouws
  • UNISA, University of South Africa: Sonja E. Bosch
  • University of Namibia: Herman Beyer
  • University of Hildesheim: Gertrud Faaß

Framework and reminder: Lexicographic Function Theory
Dictionaries as information tools (Tarp 2008 etc.)

• The dictionary provides data from which users can derive information to satisfy a given need

• An “ideal” dictionary provides the user with exactly the { types | amount ... } of data which he/she needs

• Assumption in FT: lexicographers (should) know what is best for a given user (type)

→ different types of (e-)dictionaries

→ different data offers

[Figure: potential user, user situation, need for information, dictionary with lexicographical data, extraction of information, satisfaction of needs]

Framework and reminder: Lexicographic Function Theory
Parameters influencing the process of information derivation (Tarp 2008 etc.)

• Needs of users arising in different situations:
  • Cognitive needs: learn about “things” or words
  • Communicative needs:
    • Text production vs. text reception
    • Monolingual vs. bilingual
    • etc.

• Users’ pre-existing knowledge
  • Knowledge of the targeted language
  • Knowledge of the targeted domain (e.g. in specialized dictionaries)
  • Knowledge about using the (e-)dictionary, or, more generally, about using electronic information tools

• Awareness of the use situation and needs

Implications of user needs and pre-existing knowledge
A view on the scenario of lexicography

• To satisfy different user needs, lexicographers will collect large amounts of lexicographic data

• For each type of need and/or for each type of user, a specific subset of the data will be needed

• Thus a filtering approach is necessary, where the filter is defined according to user types and needs

[Figure: lexicographic data is filtered according to specifications into dict-1, dict-2, dict-3 for user-1, user-2, ..., user-n]

Implications of user needs and pre-existing knowledge
Lexicographic scenario: need for well-defined dictionary specifications

Dictionary plan (Gouws 2013)

• Lexicographic data categories:
  • Must be clearly distinguished, categorized and marked up
  • Must be presentable in different forms (Spohr 2012), e.g. with different degrees of specialization, different metalanguage, etc.

• Filtering:
  • By lexicographic function
  • According to pre-existing knowledge
  → Selection of data categories
  → Selection of presentation modes

[Figure: lexicographic data, filters and specifications; dict-1 to dict-3; user-1 to user-n]
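The filtering idea (select data categories by lexicographic function, select the presentation mode by pre-existing knowledge) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Profile` class, the category labels, the two presentation modes and the wording of the toy entry are hypothetical and not taken from the cited work.

```python
from dataclasses import dataclass

# Hypothetical mapping: data categories assumed to be needed per function.
CATEGORIES_BY_FUNCTION = {
    "production": {"collocations", "valency", "examples"},
    "reception": {"definition", "examples"},
}


@dataclass(frozen=True)
class Profile:
    function: str  # "production" or "reception"
    expert: bool   # pre-existing knowledge, simplified to a binary flag


def filter_entry(entry: dict, profile: Profile) -> dict:
    """Keep only the categories relevant for the profile's function and
    pick, for each of them, the presentation mode matching the user."""
    wanted = CATEGORIES_BY_FUNCTION[profile.function]
    mode = "expert" if profile.expert else "lay"
    return {cat: forms[mode] for cat, forms in entry.items() if cat in wanted}


# Toy entry: each data category is stored once per presentation mode.
ENTRY = {
    "definition": {"expert": "N, count.: sum paid before it is due",
                   "lay": "money you get before you have earned it"},
    "collocations": {"expert": "V+N: give/pay so. an ~",
                     "lay": "you can say: pay someone an advance"},
    "valency": {"expert": "~ on N",
                "lay": "an advance on (your salary)"},
    "examples": {"expert": "The university pays me an advance.",
                 "lay": "The university pays me an advance."},
}

learner_writing = filter_entry(ENTRY, Profile("production", expert=False))
print(sorted(learner_writing))
```

The same stored entry yields a different "dictionary" for each profile, which is the point of keeping data categories cleanly marked up and separable from their presentation.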

User needs: general aspects
Parameters relevant for data selection

• Lexicographic functions
  • Text production ←→ text reception
  • Elements of cognitive needs involved in a communicative situation: learning while producing text, training for text production

• Properties of the targeted linguistic phenomena
  • Lexicographic data categories needed for a given function: words, word combinations, linguistic properties, ...
  • Interaction of lexical objects with “grammar”

• Pre-existing knowledge in users
  • Lexical items of the targeted language
  • Linguistic properties of the targeted lexical items
  • Grammatical knowledge of the targeted language

User needs in text production
Linguistic aspects

• Need to know a lexical object
  • Access:
    • From a “concept”
    • From a source language item
  • Choice among alternatives, based on properties of each

• Need to insert the lexical object into an upcoming context: construction, sentence, discourse, text (type) ...
  • Access to linguistic properties of lexical objects, on different levels of linguistic description
  • Some properties may act as constraints and rule out certain options

User needs in text production
Levels of interactivity – interaction models (Prinsloo, Bothma and Heid 2015)

• Mainly interactive tools: with different amounts of user interaction required
  • Step-wise build-up of a construction or a sentence
  • Guidance through options of lexical or grammatical choice
  • Guidance with cognitively oriented elements: lexical or grammatical explanations

• Mainly automatic tools: user input triggers automatic processing
  • Checking tools (Verlinde 2014 and ILT online): grammar checkers, style checkers, collocation checkers ...
  • (Automatic) translation functions

Phenomenon-related needs: collocations as a case in point
An example of criteria for the selection of lexicographic data categories

• Underlying notion of collocation: in the tradition of pedagogical lexicography (Hausmann 2006, Mel’čuk)
  • Lexically and/or pragmatically constrained, language-specific (Bartsch 2004): FR prendre une douche ←→ IT fare la doccia
  • Base plus collocate: {douche | doccia} ⊕ verb
  • Syntactic relationship between base and collocate

• Lexicographic data needed (Gouws 2015):
  • Knowledge of the collocation: preferred lexical combination
  • Knowledge about the collocation: properties relevant for its insertion into context

Phenomenon-related needs: collocations as a case in point
Types of knowledge about collocations relevant for text production – Examples

• Morphosyntax: e.g.
  • Number preferences: DE den Rechtsweg (sing.) einschlagen ([to] take legal action) ←→ IT adire le vie (plural) legali
  • Determination: IT fare la doccia ([to] take a shower); DE sein Veto einlegen ([to] veto)

• Syntactic valency: e.g. [to] be in a position (+ to + INF); DE in der Lage sein (+ zu + INF)

• Collocational preferences: e.g. DE {scharfe | heftige | massiv(e) ...} Kritik üben ([to] criticize severely)

• Pragmatic preferences: e.g. by text type (Wandji Tchami et al. 2015):
  • FR medical experts: X accroît le risque de Y (X increases the risk of Y)
  • FR medical lay persons: X augmente le risque de Y

Phenomenon-related needs: collocations as a case in point
Access to data on collocations

Different scenarios

• Text production: onomasiological access (cf. Giacomini 2013)

  known                              searched for
  base lemma + reading
  meaning of word combination        typical collocation (lexical rendition)
  maybe: syntactic environment       fit into text/sentence to be built

• Text reception: semasiological, form-based access

  known                              searched for
  (element of) word (combination)    meaning in context, plus pragmatic properties

Phenomenon-related needs: collocations as a case in point
An example: different kinds of access (data from OCDSE)

Production

Reading 1: forward movement [military]

• ADJ + advance
  - [speed] rapid ∼
  - [agent] German ∼, Allied ∼, etc.

• V + advance
  - [make] make an ∼ on X
    The regiment made an advance on the enemy lines.

Reading 2: development (often in the plural)

• ADJ + advance
  - [amount] considerable ∼; big ∼, substantial ∼; dramatic ∼, enormous ∼, great ∼, spectacular ∼, tremendous ∼.

• V + advance
  - [make] make ∼es (in/on) [plural!]

Reading 3: amount of money

• ADJ + advance
  - [quantity] small ∼, large ∼
  - [type] cash ∼

• V + advance
  - [provide] give so. an ∼, pay so. an ∼
    The university pays me an advance for this business trip.

Reception

• Readings
  (1) [military] forward movement
  (2) development
  (3) amount of money

• Typical adjectives
  - Allied etc. (cf. German etc.) (1)
  - big (=considerable) (2)
  - cash (3)
  - considerable (=big) (2)
  - dramatic (2)
  - German (cf. Allied, etc.) (1)
  - great (2)
  - important (1)
  - large (3)
  - notable (2)
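The two kinds of access on this slide can be read as two views over one data set. The following Python sketch, under that assumption, derives a production view (grouped by reading and semantic group) and a reception view (alphabetical, with reading numbers) from the same rows. The rows are a subset of the adjectives shown for "advance"; the tuple layout and the function names are hypothetical.

```python
from collections import defaultdict

# (adjective, reading number, semantic group): subset of the slide's data.
ADJECTIVES = [
    ("rapid", 1, "speed"), ("German", 1, "agent"), ("Allied", 1, "agent"),
    ("considerable", 2, "amount"), ("big", 2, "amount"),
    ("dramatic", 2, "amount"),
    ("small", 3, "quantity"), ("large", 3, "quantity"), ("cash", 3, "type"),
]


def production_view(rows):
    """Onomasiological view: reading -> semantic group -> adjectives."""
    view = defaultdict(lambda: defaultdict(list))
    for adjective, reading, group in rows:
        view[reading][group].append(adjective)
    return {reading: dict(groups) for reading, groups in view.items()}


def reception_view(rows):
    """Form-based view: adjectives in alphabetical order (ignoring case),
    each with the number of the reading it belongs to."""
    pairs = [(adjective, reading) for adjective, reading, _ in rows]
    return sorted(pairs, key=lambda pair: pair[0].lower())


print(production_view(ADJECTIVES)[1])
for adjective, reading in reception_view(ADJECTIVES):
    print(f"- {adjective} ({reading})")
```

Storing each collocate once with its reading and semantic group, and generating both layouts from it, avoids maintaining two separate entries for production and reception.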

Phenomenon-related needs: collocations as a case in point
Access to collocational data for text production

Proposal for onomasiological access: example (Giacomini 2011: 263)

• Search:

  Base            syntactic filter    semantic filters
  paura (fear)    ⊕ PP (di)           ⊕ cause (= natural phenomenon)

• Result:
  paura [...]
  colloc: paura ⊕ PP (di)
  – causa: elementi e fenomeni naturali: paura del terremoto; paura del fuoco; ...

• Option for a comparison with collocations of quasi-synonyms: paura del fuoco ↔ panico per il fuoco; *spavento, *ansia
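A search of this kind (base plus syntactic filter plus semantic filter) can be sketched in Python as a constraint-based lookup. This is a minimal sketch and not the cited implementation: the `Collocation` record, its field names and the pattern label `PP(per)` are assumptions; only the Italian word combinations come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Collocation:
    base: str
    pattern: str    # syntactic pattern of the collocate
    role: str       # semantic role of the collocate
    sem_class: str  # semantic class of the collocate
    form: str       # the word combination itself


DATA = [
    Collocation("paura", "PP(di)", "cause", "natural phenomenon",
                "paura del terremoto"),
    Collocation("paura", "PP(di)", "cause", "natural phenomenon",
                "paura del fuoco"),
    Collocation("panico", "PP(per)", "cause", "natural phenomenon",
                "panico per il fuoco"),
]


def search(data, base: str, pattern: Optional[str] = None,
           role: Optional[str] = None, sem_class: Optional[str] = None):
    """Return the forms matching the base and every filter that is set.
    A filter left as None does not constrain the result."""
    hits = []
    for c in data:
        if c.base != base:
            continue
        if pattern is not None and c.pattern != pattern:
            continue
        if role is not None and c.role != role:
            continue
        if sem_class is not None and c.sem_class != sem_class:
            continue
        hits.append(c.form)
    return hits


print(search(DATA, "paura", pattern="PP(di)", role="cause",
             sem_class="natural phenomenon"))
# Comparison with a quasi-synonym: same semantic filter, different base.
print(search(DATA, "panico", role="cause", sem_class="natural phenomenon"))
```

Leaving filters optional lets the user start from the base alone and narrow the result step by step, in the same order as the three steps of the wireframe prototype.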

Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (1/3)

Step 1: Enter base lemma, possibly with reading, if it is polysemous

Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (2/3)

Step 2: Semantic selection

Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (3/3)

Step 3: Syntactic selection

Needs due to different levels of pre-existing knowledge
Lexical selection as a complex decision task: Bantu languages

Copulatives in Northern Sotho: how to translate [to] be (1/3) (Bothma et al. 2013)

• Linguistic parameters of the lexico-grammatical selection task:
  • Lexical semantics (× 3):

    Identifying         Descriptive              Associative
    this is a letter    this woman is clever     he is (together) with Sara
    ke lengwalo         mosadi yo o bohlale      o na le Sara

  • Aktionsart-like: stative ←→ inchoative (× 2)
  • Mood: indicative ←→ situative ←→ relative (× 3)
  • Person or noun class (× (14+4))
  • Positive ←→ negative (× 2)
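The multipliers noted next to each parameter give a rough idea of the size of the selection task. The short Python calculation below multiplies them out; it assumes that every value of one parameter can combine with every value of the others, which the slide does not state, so the result is best read as an upper bound. The dictionary keys are informal labels chosen for this sketch.

```python
from math import prod

# Number of options per parameter, as noted on the slide.
PARAMETERS = {
    "lexical semantics": 3,          # identifying, descriptive, associative
    "Aktionsart-like": 2,            # stative, inchoative
    "mood": 3,                       # indicative, situative, relative
    "person or noun class": 14 + 4,
    "polarity": 2,                   # positive, negative
}

# Upper bound on feature combinations, assuming free combination.
total = prod(PARAMETERS.values())
print(total)  # 648
```

A learner facing several hundred possible feature combinations for a single English verb is the motivation for the stepwise guidance model.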

Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho (2/3)

Model for stepwise guidance: lexical selection as a decision tree

[Figure: decision tree with choice points A to G and a question at each branching point (B or C? D or E? F or G?)]

• Choice points: A, B, C, ...

• Provides only relevant choices, depending on prior selection(s)

• Presence of cognitively relevant data at each choice point: grammatical hints about the choice at hand, examples

→ A combination of dictionary and grammar, with on-demand support for text production
  • Systematic path to the solution
  • Decision-relevant information provided:
    • Options at each choice point (minimal amount of data)
    • Grammatical hints and examples only if needed by the user
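The stepwise guidance model can be sketched in Python as a walk through a tree of choice points. This is a minimal sketch under stated assumptions: the `Node` class and the `walk` function are hypothetical, the tree shape (A leads to B or C, B to D or E, C to F or G) is one reading of the figure, and the hint texts are placeholders rather than real grammatical explanations.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    hint: str = ""                                # shown only on demand
    children: dict = field(default_factory=dict)  # option -> Node


def walk(root, answers, want_hints=False):
    """Follow the user's answers from the root.

    Returns the labels of the visited choice points and the prompts
    shown on the way. Each prompt lists only the options that are
    still relevant after the prior selections.
    """
    node, path, shown = root, [root.label], []
    for answer in answers:
        if not node.children:  # leaf reached: nothing left to choose
            break
        prompt = f"{node.label}: choose {' or '.join(node.children)}"
        if want_hints and node.hint:
            prompt += f" ({node.hint})"
        shown.append(prompt)
        node = node.children[answer]
        path.append(node.label)
    return path, shown


# Assumed tree shape; hint texts are placeholders.
B = Node("B", hint="hint for choice B",
         children={"D": Node("D"), "E": Node("E")})
C = Node("C", hint="hint for choice C",
         children={"F": Node("F"), "G": Node("G")})
A = Node("A", hint="hint for choice A", children={"B": B, "C": C})

path, shown = walk(A, ["C", "G"])
print(path)   # ['A', 'C', 'G']
print(shown)  # ['A: choose B or C', 'C: choose F or G']
```

Calling `walk(A, ["C", "G"], want_hints=True)` shows the same prompts with the hints appended, so the minimal data stays the default and explanations appear only when requested.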

Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho – sample steps (3a/3 to 3e/3)

• (3a) Selecting stative vs. inchoative copulative

• (3b) Selecting one of the readings of the copulative: identifying – descriptive – associative

• (3c) Stative descriptive copulative selected; selection among moods: indicative – situative – relative

• (3d) Almost all features selected; noun class remains

• (3e) For noun class: select positive vs. negated

Needs due to different levels of pre-existing knowledge
Combining data for communicative and cognitive needs

Learner-oriented tools for text production (Bosch/Faaß 2014)

e-Zulu (and e-Xhosa) dictionary and grammar trainer (Sanasi 2015)

• Focus on the Zulu possessive construction:
  • Lexical choice of nominals for possessor and possession
  • Noun classes of possessor and possession
  • Noun-class-dependent connector (expressing the possessive relation)
  • Morphophonological adaptation rules

• Stepwise guidance on demand:
  • Nominal lexemes can be input in Zulu or English
  • Data about the noun class and the connector: input by user or provided by system
  • etc.
  • Reminder of rules on demand
  → From stepwise guidance to full translation

Needs due to different levels of pre-existing knowledge
e-Zulu dictionary (1/2) (Bosch/Faaß 2014)

• Input in English: rooms of hotel

• Choice options:
  • Translation only
  • Stepwise explanation of Zulu rules applied

Needs due to different levels of pre-existing knowledge
e-Zulu dictionary (2/2) (Bosch/Faaß 2014)

Needs due to different levels of pre-existing knowledge
Combining data for communicative and cognitive needs

Learner-oriented tools for text production (Prinsloo et al. 2014, 2015)

Sepedi (= Northern Sotho) sentence builder for speakers of English

• Phenomena:
  • Lexical selection: nominals, verbs
  • Noun class system of Sepedi: concords and pronouns
  • Grammatical rules for valency constructions, relative clauses, etc.

• Same principles as with Zulu possessives:
  • At each step in text production, the user may decide whether and how much help to get from the tool (individualization: Tarp 2011)
  • User input may be in either English or Sepedi, with the option open at each step of the sentence construction
  • Integratable with a large English → Sepedi dictionary
  • Grammatical information on demand

Models for user interaction
Information on demand (Bothma 2011)

• Basic amount of data is available by default

• Additional data may be accessed via unfoldable items:
  • Grammatical explanations in decision trees (Bothma et al. 2013)
  • “Info” button in Sepedi sentence builder (Prinsloo et al. 2015)
  • Option to see explanations in learning tools (Sanasi 2015)

⇒ Open questions:
  • Deciding beforehand about the amount of data required (profile-based dictionaries), or deciding at each step in the text production process?
  • How much use do users make of the extra data offer? (Trap-Jensen 2010)
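The "unfoldable items" model can be illustrated with a small Python sketch: the default data is always rendered, while each additional item appears only as a collapsed label until the user expands it. The `render` function, the entry layout and the `[+]`/`[-]` markers are hypothetical; the Northern Sotho example and its classification as an identifying copulative are taken from the copulative slides.

```python
def render(entry, expanded=frozenset()):
    """Render default data in full; render each on-demand item either
    folded (label only) or unfolded (label plus text)."""
    lines = list(entry["default"])
    for label, text in entry["on_demand"].items():
        if label in expanded:
            lines.append(f"[-] {label}: {text}")
        else:
            lines.append(f"[+] {label}")
    return "\n".join(lines)


ENTRY = {
    "default": ["ke lengwalo = this is a letter"],
    "on_demand": {
        "Grammar": "identifying copulative",
        "Info": "(placeholder for a longer explanation)",
    },
}

print(render(ENTRY))                 # everything folded
print(render(ENTRY, {"Grammar"}))    # user unfolds the grammar item
```

In this sketch the decision about how much data to see is taken at each step by the user; a profile-based variant would instead fix the `expanded` set beforehand, which corresponds to the first open question on the slide.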

Models for user interaction
Linguistic complexity ←→ interactional simplicity

• Dilemma:
  • Complex linguistic decision processes may require complex descriptions (Bantu languages, collocation selection)
  • But: many users want simple tools that are easy to use (Heid/Zimmermann 2012):
    • Few clicks
    • Short explanations
    • Little effort before getting to the result

• Proposal:
  • Providing guidance tools only on demand, in addition to “standard” dictionary entries
  • Maybe adding non-linear guidance devices, especially for learners:
    • Graphical elements (Runte 2015)
    • Interactive elements, for learners to explore linguistic phenomena

Models for user interaction
Graphical display of lexical relations (Runte 2015)

• Display of relationships between lexemes:
  • Paradigmatic:
    • Synonyms, antonyms
    • Hyp(er)onyms
  • Syntagmatic:
    • Typical adjectives
    • Typical verbs, ...

[Figure: graph around the German lexeme Arbeitnehmer, with the nodes <Qualifikation>, qualifiziert, hochqualifiziert, Angestellter, Arbeiter, Erwerbstaetiger, Arbeitskraft, einstellen, beschaeftigen, kuendigen, arbeiten]

• Analyzed in eye-tracking studies: presentation works well for learners

Conclusion
Lessons learnt from overview of recent work

• Parameters relevant for the design of dictionaries in writing support tools:
  • Properties of targeted lexical objects: addressing linguistic complexity
  • Pre-existing knowledge of users: on lexical objects and their insertion into text
  • Flexibility wrt interaction models: combining automatic and interactive use

• Current approaches (mainly from SeLA):
  • Constraint-based selection in collocations dictionary
  • Stepwise guidance in decision trees
  • Learners’ bilingual dictionaries with explanations
  • Stepwise sentence builder: flexible amounts of support
  • Graphical presentation

Future work

• User testing of prototypes, to understand which approaches work best

• From mock-ups and prototypes to tools with sizeable lexical resources:
  • e-Zulu: several hundreds of items
  • Sepedi sentence builder: work towards large grammatical cov