Electronic dictionaries in writing tools: user needs and models for user interaction
Ulrich Heid
Universität Hildesheim, Institut für Informationswissenschaft und Sprachtechnologie,
Universitätsplatz 1, D-31141 Hildesheim, Germany
Santiago de Compostela: Multilex-2015, October 2015
Heid (IwiSt/IMS) Text production dictionaries santiago15-fol 1 / 32
Overview
• Framework: Lexicographic Function Theory and its implications for e-dictionary making
• User needs:
  • General aspects
  • Needs in text production, and proposals from the literature to satisfy them:
    • Needs resulting from linguistic complexity
    • Needs resulting from different levels of knowledge of users
• Models of interaction:
  • Information on demand
  • (New) ways of presenting lexicographic data
• Conclusion: lessons learnt

Context and warning
Projects and cooperation
• This presentation does not contain anything new: it re-arranges and re-interprets recent work, offering a practical state of the art rather than abstract visions
• Based on cooperation in SeLA (Scientific e-Lexicography for Africa), a project funded by BMBF (05-2012 to 12-2015) and organized by DAAD:
  • University of Pretoria: Theo Bothma, Daan Prinsloo, Elsabe Taljard
  • University of Stellenbosch: Rufus H. Gouws
  • UNISA, University of South Africa: Sonja E. Bosch
  • University of Namibia: Herman Beyer
  • University of Hildesheim: Gertrud Faaß

Framework and reminder: Lexicographic Function Theory
Dictionaries as information tools (Tarp 2008 etc.)
• The dictionary provides data from which users can derive information to satisfy a given need
• An “ideal” dictionary provides the user with exactly those types and that amount (etc.) of data which he/she needs
• Assumption in FT: lexicographers (should) know what is best for a given user (type)
  → different types of (e-)dictionaries
  → different data offers
[Figure: a potential user in a user situation has a need for information; the dictionary offers lexicographical data; extraction of information leads to satisfaction of needs]

Framework and reminder: Lexicographic Function Theory
Parameters influencing the process of information derivation (Tarp 2008 etc.)
• Needs of users arising in different situations:
  • Cognitive needs: learn about “things” or words
  • Communicative needs:
    • Text production vs. text reception
    • Monolingual vs. bilingual
    • etc.
• Users’ pre-existing knowledge:
  • Knowledge of the targeted language
  • Knowledge of the targeted domain (e.g. in specialized dictionaries)
  • Knowledge about using the (e-)dictionary, or, more generally, about using electronic information tools
• Awareness of the use situation and needs

Implications of user needs and pre-existing knowledge
A view on the scenario of lexicography
• To satisfy different user needs, lexicographers will collect large amounts of lexicographic data
• For each type of need and/or for each type of user, a specific subset of the data will be needed
• Thus a filtering approach is necessary, where the filter is defined according to user types and needs
[Figure: lexicographic data pass through filters, defined by specifications, to yield dictionaries dict-1, dict-2, dict-3 for users user-1, user-2, ..., user-n]

Implications of user needs and pre-existing knowledge
Lexicographic scenario: need for well-defined dictionary specifications
Dictionary plan (Gouws 2013)
• Lexicographic data categories:
  • Must be clearly distinguished, categorized and marked up
  • Must be presentable in different forms (Spohr 2012), e.g. with different degrees of specialization, different metalanguage, etc.
• Filtering:
  • By lexicographic function
  • According to pre-existing knowledge
  → Selection of data categories
  → Selection of presentation modes
[Figure: lexicographic data pass through filters, defined by specifications, to yield dictionaries dict-1, dict-2, dict-3 for users user-1, user-2, ..., user-n]
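The filtering idea can be illustrated with a minimal Python sketch. It is not the SeLA implementation: the class names, the numeric knowledge scale and the sample items (including the valency pattern) are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    category: str         # lexicographic data category, e.g. "collocation"
    functions: frozenset  # lexicographic functions the item serves
    min_level: int        # hypothetical knowledge level needed to use the item
    text: str


@dataclass(frozen=True)
class UserProfile:
    function: str  # "production" or "reception"
    level: int     # hypothetical scale: 0 = beginner, 2 = expert


def filter_entry(items, profile):
    """Keep the items that serve the user's function and fit their knowledge level."""
    return [
        item for item in items
        if profile.function in item.functions and item.min_level <= profile.level
    ]


# Invented sample data for one entry.
ENTRY = [
    DataItem("definition", frozenset({"reception"}), 0, "advance: forward movement"),
    DataItem("collocation", frozenset({"production"}), 0, "make an advance on X"),
    DataItem("valency", frozenset({"production"}), 2, "advance + on + NP"),
]

learner = UserProfile(function="production", level=0)
selected = filter_entry(ENTRY, learner)
print([item.category for item in selected])
```

In this sketch one data collection yields a different "dictionary" per profile: a beginner producing text sees only the collocation, while an expert profile would also receive the valency item.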

User needs: general aspects
Parameters relevant for data selection
• Lexicographic functions:
  • Text production ←→ text reception
  • Elements of cognitive needs involved in a communicative situation: learning while producing text, training for text production
• Properties of the targeted linguistic phenomena:
  • Lexicographic data categories needed for a given function: words, word combinations, linguistic properties, ...
  • Interaction of lexical objects with “grammar”
• Pre-existing knowledge in users:
  • Lexical items of the targeted language
  • Linguistic properties of the targeted lexical items
  • Grammatical knowledge of the targeted language

User needs in text production
Linguistic aspects
• Need to know a lexical object:
  • Access:
    • From a “concept”
    • From a source language item
  • Choice among alternatives, based on properties of each
• Need to insert the lexical object into an upcoming context: construction, sentence, discourse, text (type), ...
  • Access to linguistic properties of lexical objects, on different levels of linguistic description
  • Some properties may act as constraints and rule out certain options

User needs in text production
Levels of interactivity and interaction models (Prinsloo, Bothma and Heid 2015)
• Mainly interactive tools, with different amounts of user interaction required:
  • Step-wise build-up of a construction or a sentence
  • Guidance through options of lexical or grammatical choice
  • Guidance with cognitively oriented elements: lexical or grammatical explanations
• Mainly automatic tools, where user input triggers automatic processing:
  • Checking tools (Verlinde 2014 and ILT online): grammar checkers, style checkers, collocation checkers, ...
  • (Automatic) translation functions

Phenomenon-related needs: collocations as a case in point
An example of criteria for the selection of lexicographic data categories
• Underlying notion of collocation: in the tradition of pedagogical lexicography (Hausmann 2006, Mel’čuk)
  • Lexically and/or pragmatically constrained, language-specific (Bartsch 2004): FR prendre une douche ←→ IT fare la doccia
  • Base plus collocate: {douche | doccia} ⊕ verb
  • Syntactic relationship between base and collocate
• Lexicographic data needed (Gouws 2015):
  • Knowledge of the collocation: preferred lexical combination
  • Knowledge about the collocation: properties relevant for its insertion into context

Phenomenon-related needs: collocations as a case in point
Types of knowledge about collocations relevant for text production: examples
• Morphosyntax, e.g.:
  • Number preferences: DE den Rechtsweg (sing.) einschlagen ([to] take legal action) ←→ IT adire le vie (plural) legali
  • Determination: IT fare la doccia ([to] take a shower); DE sein Veto einlegen ([to] veto)
• Syntactic valency, e.g.: [to] be in a position (+ to + INF); DE in der Lage sein (+ zu + INF)
• Collocational preferences, e.g.: DE {scharfe | heftige | massiv(e) ...} Kritik üben ([to] criticize severely)
• Pragmatic preferences, e.g. by text type (Wandji Tchami et al. 2015):
  • FR medical experts: X accroît le risque de Y (X increases the risk of Y)
  • FR medical lay persons: X augmente le risque de Y

Phenomenon-related needs: collocations as a case in point
Access to data on collocations: different scenarios
• Text production: onomasiological access (cf. Giacomini 2013)
  known | searched for
  base lemma + reading |
  meaning of word combination | typical collocation (lexical rendition)
  maybe: syntactic environment | fit into text/sentence to be built
• Text reception: semasiological, form-based access
  known | searched for
  (element of) word (combination) | meaning in context, plus pragmatic properties

Phenomenon-related needs: collocations as a case in point
An example: different kinds of access (data from OCDSE)
Production
Reading 1: forward movement [military]
• ADJ + advance
  - [speed] rapid ∼
  - [agent] German ∼, Allied ∼, etc.
• V + advance
  - [make] make an ∼ on X: The regiment made an advance on the enemy lines.
Reading 2: development (often in the plural)
• ADJ + advance
  - [amount] considerable ∼; big ∼, substantial ∼; dramatic ∼, enormous ∼, great ∼, spectacular ∼, tremendous ∼
• V + advance
  - [make] make ∼es (in/on) [plural!]
Reading 3: amount of money
• ADJ + advance
  - [quantity] small ∼, large ∼
  - [type] cash ∼
• V + advance
  - [provide] give so. an ∼, pay so. an ∼: The university pays me an advance for this business trip.
Reception
• Readings
  (1) [military] forward movement
  (2) development
  (3) amount of money
• Typical adjectives
  - Allied etc. (cf. German etc.) (1)
  - big (= considerable) (2)
  - cash (3)
  - considerable (= big) (2)
  - dramatic (2)
  - German (cf. Allied, etc.) (1)
  - great (2)
  - important (1)
  - large (3)
  - notable (2)
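The two presentations above can be derived from one underlying data set. The following Python sketch shows the idea under simple assumptions: the (reading, adjective) pairs are a subset of the slide's production view, and the function names are invented for illustration.

```python
# (reading number, adjective) pairs, a subset of the production view above.
ADJECTIVES = [
    (1, "rapid"), (1, "German"), (1, "Allied"),
    (2, "considerable"), (2, "big"), (2, "dramatic"), (2, "great"),
    (3, "small"), (3, "large"), (3, "cash"),
]


def production_view(pairs):
    """Group collocates under each reading (meaning first, then form)."""
    view = {}
    for reading, adjective in pairs:
        view.setdefault(reading, []).append(adjective)
    return view


def reception_view(pairs):
    """List collocates alphabetically, each with its reading (form first)."""
    return sorted(
        ((adjective, reading) for reading, adjective in pairs),
        key=lambda pair: pair[0].lower(),
    )


print(production_view(ADJECTIVES)[3])
print(reception_view(ADJECTIVES)[:3])
```

The production view answers "which adjectives go with this reading?", while the reception view lets a reader who has met an adjective in a text look up which reading it signals.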

Phenomenon-related needs: collocations as a case in point
Access to collocational data for text production
Proposal for onomasiological access: example (Giacomini 2011: 263)
• Search:
  base | syntactic filter | semantic filters
  paura (fear) | ⊕ PP (di) | ⊕ cause (= natural phenomenon)
• Result:
  paura [...]
  colloc: paura ⊕ PP (di)
  – causa: elementi e fenomeni naturali: paura del terremoto; paura del fuoco; ...
• Option for a comparison with collocations of quasi-synonyms: paura del fuoco ↔ panico per il fuoco; *spavento, *ansia
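A search of this kind can be sketched as successive filters over a collocation store. The Python below is only an illustration: the record layout, the pattern labels such as "PP(di)" and the `search` function are assumptions, and only the example phrases come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collocation:
    base: str
    pattern: str         # syntactic pattern of the collocate (label is an assumption)
    semantic_role: str   # role of the collocate, e.g. "cause"
    semantic_class: str  # semantic class of the collocate
    example: str


STORE = [
    Collocation("paura", "PP(di)", "cause", "natural phenomenon", "paura del terremoto"),
    Collocation("paura", "PP(di)", "cause", "natural phenomenon", "paura del fuoco"),
    Collocation("panico", "PP(per)", "cause", "natural phenomenon", "panico per il fuoco"),
]


def search(store, base, pattern=None, semantic_class=None):
    """Start from the base lemma, then narrow by optional syntactic and semantic filters."""
    hits = [c for c in store if c.base == base]
    if pattern is not None:
        hits = [c for c in hits if c.pattern == pattern]
    if semantic_class is not None:
        hits = [c for c in hits if c.semantic_class == semantic_class]
    return [c.example for c in hits]


print(search(STORE, "paura", pattern="PP(di)", semantic_class="natural phenomenon"))
print(search(STORE, "panico", semantic_class="natural phenomenon"))
```

Running the same semantic filter over a quasi-synonym, as in the second call, gives the kind of side-by-side comparison mentioned in the last bullet.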

Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (1/3)
Step 1: Enter base lemma, possibly with reading if it is polysemous

Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (2/3)
Step 2: Semantic selection

Phenomenon-related needs: collocations as a case in point
A wireframe prototype for a collocation dictionary (3/3)
Step 3: Syntactic selection

Needs due to different levels of pre-existing knowledge
Lexical selection as a complex decision task: Bantu languages
Copulatives in Northern Sotho: how to translate [to] be (1/3) (Bothma et al. 2013)
• Linguistic parameters of the lexico-grammatical selection task:
  • Lexical semantics (×3):
    Identifying | Descriptive | Associative
    this is a letter | this woman is clever | he is (together) with Sara
    ke lengwalo | mosadi yo o bohlale | o na le Sara
  • Aktionsart-like: stative ←→ inchoative (×2)
  • Mood: indicative ←→ situative ←→ relative (×3)
  • Person or noun class (×(14+4))
  • Positive ←→ negative (×2)
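Reading the annotations on this slide as multiplicative option counts (an interpretation, not something the slide states explicitly), the size of the decision space can be worked out in a few lines of Python. The result is an upper bound that assumes every combination of parameter values is possible.

```python
from math import prod

# Option counts as annotated on the slide; treating them as independent
# multiplicative factors is an assumption.
FACTORS = {
    "lexical semantics": 3,           # identifying, descriptive, associative
    "aktionsart": 2,                  # stative, inchoative
    "mood": 3,                        # indicative, situative, relative
    "person or noun class": 14 + 4,
    "polarity": 2,                    # positive, negative
}

total = prod(FACTORS.values())
print(total)  # 648
```

Several hundred candidate forms for a single English verb illustrate why the selection is presented as a guided decision task rather than as a flat list of equivalents.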

Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho (2/3)
Model for stepwise guidance: lexical selection as a decision tree
[Figure: decision tree with choice points A to G; questions at the nodes lead to “B or C”, “D or E” and “F or G”]
• Choice points: A, B, C, ...
• Provides only relevant choices, depending on prior selection(s)
• Presence of cognitively relevant data at each choice point: grammatical hints about the choice at hand, examples
→ A combination of dictionary and grammar, with on-demand support for text production
• Systematic path to the solution
• Decision-relevant information provided:
  • Options at each choice point (minimal amount of data)
  • Grammatical hints and examples only if needed by the user
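The stepwise model can be sketched as a small tree of choice points in Python. This is a hedged illustration, not the tool described by Bothma et al.: the class, the `walk` function, the hint texts and the leaf labels are placeholders, and only one branch of the tree is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ChoicePoint:
    question: str
    hint: str  # grammatical hint, shown only on demand
    options: dict = field(default_factory=dict)  # option label -> ChoicePoint or result string


def walk(node, choices, show_hints=False):
    """Follow the user's choices; return the node reached and what was displayed."""
    shown = []
    for choice in choices:
        # Only the options relevant at this point are offered.
        shown.append(f"{node.question} {sorted(node.options)}")
        if show_hints:
            shown.append(f"hint: {node.hint}")
        node = node.options[choice]
        if isinstance(node, str):  # a leaf: the selection is complete
            break
    return node, shown


# Placeholder tree covering a single path; labels stand in for real forms.
mood = ChoicePoint(
    "Which mood?",
    "Placeholder hint text about moods.",
    {"indicative": "stative descriptive indicative",
     "situative": "stative descriptive situative",
     "relative": "stative descriptive relative"},
)
reading = ChoicePoint(
    "Which reading?",
    "Placeholder hint text about readings.",
    {"descriptive": mood},
)
root = ChoicePoint(
    "Stative or inchoative?",
    "Placeholder hint text about the stative/inchoative contrast.",
    {"stative": reading},
)

result, shown = walk(root, ["stative", "descriptive", "indicative"])
print(result)
print(len(shown))
```

With `show_hints=True` the same walk displays a hint after each question, which corresponds to offering grammatical support only to users who ask for it.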

Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho: sample steps (3a/3)
• Selecting stative vs. inchoative copulative

Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho: sample steps (3b/3)
• Selecting one of the readings of the copulative: identifying, descriptive, associative

Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho: sample steps (3c/3)
• Stative descriptive copulative selected; selection among moods: indicative, situative, relative

Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho: sample steps (3d/3)
• Almost all features selected; noun class remains

Needs due to different levels of pre-existing knowledge
Copulatives in Northern Sotho: sample steps (3e/3)
• For noun class: select positive vs. negated

Needs due to different levels of pre-existing knowledge
Combining data for communicative and cognitive needs
Learner-oriented tools for text production (Bosch/Faaß 2014)
e-Zulu (and e-Xhosa) dictionary and grammar trainer (Sanasi 2015)
• Focus on the Zulu possessive construction:
  • Lexical choice of nominals for possessor and possession
  • Noun classes of possessor and possession
  • Noun-class-dependent connector (expressing the possessive relation)
  • Morphophonological adaptation rules
• Stepwise guidance on demand:
  • Nominal lexemes can be input in Zulu or English
  • Data about the noun class and the connector: input by the user or provided by the system
  • etc.
  • Reminder of rules on demand
→ From stepwise guidance to full translation
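The interplay of connector choice and morphophonological adaptation can be sketched in Python. This is not code or data from the e-Zulu tool: the concord table is a small subset recalled from standard descriptions of Zulu and should be checked against a grammar, the function name is invented, and the sketch ignores special cases such as possessors of class 1a.

```python
# Possessive concords keyed by the noun class of the possession.
# Illustrative subset only; verify against a Zulu grammar before reuse.
POSSESSIVE_CONCORD = {6: "a", 9: "ya", 10: "za"}

# Vowel coalescence: the concord's final a merges with the possessor's initial vowel.
COALESCENCE = {"a": "a", "i": "e", "u": "o"}


def possessive(possession, possession_class, possessor):
    """Build 'possession of possessor' from the concord and the coalescence rule."""
    concord = POSSESSIVE_CONCORD[possession_class]
    merged_vowel = COALESCENCE[possessor[0]]
    linked = concord[:-1] + merged_vowel + possessor[1:]
    return f"{possession} {linked}"


print(possessive("incwadi", 9, "umfana"))      # book of the boy
print(possessive("amakamelo", 6, "ihhotela"))  # rooms of the hotel
```

A guided tool could expose each intermediate value (noun class, concord, merged vowel) as a separate step, or return only the final string when the user asks for the translation alone.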

Needs due to different levels of pre-existing knowledge
e-Zulu dictionary (1/2) (Bosch/Faaß 2014)
• Input in English: rooms of hotel
• Choice options:
  • Translation only
  • Stepwise explanation of Zulu rules applied

Needs due to different levels of pre-existing knowledge
e-Zulu dictionary (2/2) (Bosch/Faaß 2014)

Needs due to different levels of pre-existing knowledge
Combining data for communicative and cognitive needs
Learner-oriented tools for text production (Prinsloo et al. 2014, 2015)
Sepedi (= Northern Sotho) sentence builder for speakers of English
• Phenomena:
  • Lexical selection: nominals, verbs
  • Noun class system of Sepedi: concords and pronouns
  • Grammatical rules for valency constructions, relative clauses, etc.
• Same principles as with Zulu possessives:
  • Individualization (Tarp 2011): at each step in text production, the user may decide whether and how much help to get from the tool
  • User input may be in either English or Sepedi, with the option open at each step of the sentence construction
• Integratable with a large English → Sepedi dictionary
• Grammatical information on demand

Models for user interaction
Information on demand (Bothma 2011)
• Basic amount of data is available by default
• Additional data may be accessed via unfoldable items:
  • Grammatical explanations in decision trees (Bothma et al. 2013)
  • “Info” button in Sepedi sentence builder (Prinsloo et al. 2015)
  • Option to see explanations in learning tools (Sanasi 2015)
⇒ Open questions:
• Deciding beforehand about the amount of data required (profile-based dictionaries), or deciding at each step in the text production process?
• How much use do users make of the extra data offered? (Trap-Jensen 2010)
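A minimal Python sketch of the on-demand model follows. The `Entry` class, the `render` function and the "[+]"/"[-]" markers are assumptions chosen for illustration; the sample data reuse the OCDSE example shown earlier in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    lemma: str
    basic: list                                  # data shown by default
    extras: dict = field(default_factory=dict)   # label -> unfoldable content


def render(entry, unfolded=frozenset()):
    """Show basic data always; show an extra item only if the user unfolded it."""
    lines = [entry.lemma] + [f"  {item}" for item in entry.basic]
    for label, content in entry.extras.items():
        if label in unfolded:
            lines.append(f"  [-] {label}: {content}")
        else:
            lines.append(f"  [+] {label}")
    return "\n".join(lines)


advance = Entry(
    lemma="advance",
    basic=["make an advance on X"],
    extras={"example": "The regiment made an advance on the enemy lines."},
)

print(render(advance))
print(render(advance, unfolded={"example"}))
```

A profile-based dictionary would fix the `unfolded` set once per user type, whereas step-wise individualization lets the user change it at each point; logging which labels get unfolded is one way to approach the question of how much the extra data are used.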

Models for user interaction
Linguistic complexity ←→ interactional simplicity
• Dilemma:
  • Complex linguistic decision processes may require complex descriptions (Bantu languages, collocation selection)
  • But many users want simple tools that are easy to use (Heid/Zimmermann 2012):
    • Few clicks
    • Short explanations
    • Little effort before getting to the result
• Proposal:
  • Providing guidance tools only on demand, in addition to “standard” dictionary entries
  • Maybe adding non-linear guidance devices, especially for learners:
    • Graphical elements (Runte 2015)
    • Interactive elements, for learners to explore linguistic phenomena

Models for user interaction
Graphical display of lexical relations (Runte 2015)
• Display of relationships between lexemes:
  • Paradigmatic:
    • Synonyms, antonyms
    • Hyp(er)onyms
  • Syntagmatic:
    • Typical adjectives
    • Typical verbs, ...
[Figure: graph centred on Arbeitnehmer, linked to Angestellter, Arbeiter, Erwerbstätiger, Arbeitskraft; to qualifiziert, hochqualifiziert (grouped under <Qualifikation>); and to einstellen, beschäftigen, kündigen, arbeiten]
• Analyzed in eye-tracking studies: the presentation works well for learners
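The data behind such a display can be held as a list of typed edges. The Python sketch below uses the lexemes from the figure, but the relation labels attached to each edge are assumptions, since the original graphic is not preserved; the function name is also invented.

```python
# (source lexeme, relation type, target lexeme); relation labels are assumed.
RELATIONS = [
    ("Arbeitnehmer", "paradigmatic", "Angestellter"),
    ("Arbeitnehmer", "paradigmatic", "Arbeiter"),
    ("Arbeitnehmer", "paradigmatic", "Erwerbstätiger"),
    ("Arbeitnehmer", "paradigmatic", "Arbeitskraft"),
    ("Arbeitnehmer", "syntagmatic: adjective", "qualifiziert"),
    ("Arbeitnehmer", "syntagmatic: adjective", "hochqualifiziert"),
    ("Arbeitnehmer", "syntagmatic: verb", "einstellen"),
    ("Arbeitnehmer", "syntagmatic: verb", "beschäftigen"),
    ("Arbeitnehmer", "syntagmatic: verb", "kündigen"),
    ("Arbeitnehmer", "syntagmatic: verb", "arbeiten"),
]


def neighbours(lemma, relation_prefix):
    """Return the lexemes linked to lemma by relations starting with the given prefix."""
    return [
        target for source, relation, target in RELATIONS
        if source == lemma and relation.startswith(relation_prefix)
    ]


print(neighbours("Arbeitnehmer", "paradigmatic"))
print(neighbours("Arbeitnehmer", "syntagmatic"))
```

A graphical front end could draw one node per lexeme and colour the edges by relation type, so that learners can explore paradigmatic and syntagmatic neighbours separately.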

Conclusion
Lessons learnt from overview of recent work
• Parameters relevant for the design of dictionaries in writing support tools:
  • Properties of targeted lexical objects: addressing linguistic complexity
  • Pre-existing knowledge of users: on lexical objects and their insertion into text
  • Flexibility wrt interaction models: combining automatic and interactive use
• Current approaches (mainly from SeLA):
  • Constraint-based selection in collocations dictionary
  • Stepwise guidance in decision trees
  • Learners’ bilingual dictionaries with explanations
  • Stepwise sentence builder: flexible amounts of support
  • Graphical presentation

Future work
• User testing of prototypes, to understand which approaches work best
• From mock-ups and prototypes to tools with sizeable lexical resources:
  • e-Zulu: several hundreds of items
  • Sepedi sentence builder: work towards large grammatical coverage