GENERAL AND COMPUTATIONAL LINGUISTICS. LEXICAL KNOWLEDGE: THE GENERATIVE LEXICON.


Introduction

• Lexicon: ideally, the collection of all the words of a language

• Information stored in a lexicon:
  - Phonetic information: pronunciation
  - Semantic information: meaning
  - Morphological information: transitivity vs. intransitivity (verbs), count vs. mass (nouns)


Lexicon (contd…)

Example: the entry for “eat” in the Oxford Advanced Learner’s Dictionary

eat /i:t/ v (pt ate /et/; pp eaten /i:tn/): 1. sth (up) to put food into the mouth, chew and swallow it: He was too ill to eat.

Parts of the lexical entry:
  - Pronunciation: /i:t/
  - Morphological information: pt ate, pp eaten
  - Meaning: to put food into the mouth, chew and swallow it


Mental Lexicon

• Mental lexicon: the information stored in the mind of a native speaker

• Native speakers store:
  - Phonetic information: pronunciation
  - Semantic information: meaning
  - Morphological information: transitivity vs. intransitivity (verbs), count vs. mass (nouns)

• Additional information: use of a word in a new context, the syntactic environment of a word, word-formation rules


Example of Mental Lexicon

Example: “eat” in a native speaker’s mind

• Pronunciation: long /i:/ is used in eat

• Grammatical information: the past tense is ate /et/

• Word-formation rules: /-s/ is the third person singular present tense marker, as in he eats

• Meaning:
  1. Take in solid food: She ate a banana.
  2. Take a meal: We did not eat until 10 P.M.
  3. Worry or cause anxiety in a persistent way: What’s eating you?

• Syntactic information: eat needs an agent to perform the action; the agent role is obligatory.
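
The kinds of information listed for eat can be pictured as fields of a record. The following is only a minimal sketch: the class name LexicalEntry, its field names and the naive -s rule are assumptions made for illustration, not a format proposed in these slides.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LexicalEntry:
    """Hypothetical record for the information a speaker stores about a word."""
    lemma: str
    pronunciation: str
    past_tense: str
    senses: List[str] = field(default_factory=list)
    obligatory_roles: Tuple[str, ...] = ()


EAT = LexicalEntry(
    lemma="eat",
    pronunciation="/i:t/",
    past_tense="ate",
    senses=[
        "take in solid food",
        "take a meal",
        "worry or cause anxiety in a persistent way",
    ],
    obligatory_roles=("agent",),
)


def third_person_singular(entry: LexicalEntry) -> str:
    # Naive word-formation rule: append -s (ignores spelling changes such as -es).
    return entry.lemma + "s"


print(third_person_singular(EAT))
```

Keeping the word-formation rule as a function rather than a stored form mirrors the slide's point that speakers know rules, not only listed forms.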


Lexicon in Computational Linguistics

A lexicon meant for Natural Language Processing (NLP) must provide the following:

• Morphological information
  - Part-of-speech information
  - Rules to deal with both regular and irregular forms, e.g. ate (past tense of eat), men (plural of man)

• Semantic information
  - Must be able to handle lexical ambiguity

• Syntactic information
  - Action verbs will always have an agent
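
One common way to combine regular rules with irregular forms is to consult an exception table first and fall back to suffix rules. The sketch below assumes this design; the table IRREGULAR, the function analyse and the two suffix rules are illustrative simplifications, not a full morphological analyser.

```python
# Exception table for irregular forms (the two examples from the slide).
IRREGULAR = {
    "ate": ("eat", "past"),
    "men": ("man", "plural"),
}


def analyse(form):
    """Return (lemma, feature) for a word form; a deliberately naive sketch."""
    if form in IRREGULAR:
        return IRREGULAR[form]
    # Regular past tense: strip -ed (ignores doubling, e.g. 'stopped').
    if form.endswith("ed") and len(form) > 3:
        return (form[:-2], "past")
    # Regular -s: plural noun or third person singular verb.
    if form.endswith("s") and len(form) > 2:
        return (form[:-1], "plural or 3sg")
    return (form, "base")


for word in ["ate", "men", "walked", "books"]:
    print(word, analyse(word))
```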


Polysemy and the Logical Problem of Polysemy

Polysemy

• An individual word can have an indefinite number of subtle meaning differences

• Natural languages are highly polysemous

• This creates ambiguity

• Weinreich distinguishes between two types of ambiguity:
  - Contrastive ambiguity
  - Complementary polysemy


Polysemy and the Logical Problem of Polysemy (contd…)

Contrastive Ambiguity

• A lexical item carries two distinct, unrelated meanings

• This is a case of homonymy: words that are spelled or pronounced in the same way but have different meanings

Example:
  bank: a financial institution
  bank: a place beside a body of water


Complementary polysemy

(1) a. Mary doesn’t believe the book.
    b. John sold his books to Mary.
(2) a. Eno the cat is sitting on yesterday’s newspaper.
    b. Yesterday’s newspaper really got me upset.
(3) a. Mary is in Harvard Square looking for the Bach sonatas.
    b. We won’t get to the concert until after the Bach sonatas.
(4) a. I have my lunch in the backpack.
    b. Your lunch was no longer today than it was yesterday.
(5) a. The phone rang during my appointment.
    b. My next appointment is John.


Sense Enumeration Lexicon (SEL)

• WordNet and similar resources are examples of SENSE ENUMERATION LEXICA

• The direct approach to handling polysemy is to let the lexicon list a word multiple times, with each listing annotated with a separate meaning or lexical sense.

• Widely accepted in both computational and theoretical linguistics.


Sense Enumeration Lexicon (SEL)

• Example of Contrastive Senses

bank1
  CAT = count-noun
  GENUS = financial-institution

bank2
  CAT = count-noun
  GENUS = shore
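
A sense enumeration lexicon can be pictured as a table that maps each word to a list of separately stored senses. The sketch below assumes a plain dictionary layout; the names SEL, senses and disambiguate are hypothetical and are not taken from WordNet or any other resource.

```python
# Each word maps to an explicit list of senses, as in the bank1/bank2 entries.
SEL = {
    "bank": [
        {"id": "bank1", "CAT": "count-noun", "GENUS": "financial-institution"},
        {"id": "bank2", "CAT": "count-noun", "GENUS": "shore"},
    ],
}


def senses(word):
    """All enumerated senses of a word (empty list if the word is unknown)."""
    return SEL.get(word, [])


def disambiguate(word, required_genus):
    """Keep only the senses whose GENUS matches what the context requires."""
    return [s["id"] for s in senses(word) if s["GENUS"] == required_genus]


print(disambiguate("bank", "shore"))
```

Disambiguation here amounts to picking one of the pre-listed senses, which is exactly the property the generative lexicon later calls into question.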


Nominal polysemy in sense enumeration lexica

• newspaper

newspaper1
  CAT = count-noun
  GENUS = artefact

newspaper2
  CAT = count-noun
  GENUS = information


Sense Enumeration Lexicon (SEL)

• Possible modification for complementary polysemy in SEL: both senses are stored under a single entry

newspaper
  senses (sense1, sense2):
    CAT = count-noun, GENUS = information
    CAT = count-noun, GENUS = artefact


Verbal polysemy in sense enumeration lexica

Syntactic polysemy deals with polyvalency (I), object deletion (II) and the general properties of argument expression (III).

(I) a. Mary began to read the novel.
    b. Mary began reading the novel.
    c. Mary began the novel.

(II) a. Mary ate (her meal) quickly.
     b. Mary devoured *(her meal) quickly.

(III) a. John carved a doll (out of the wood).
      b. John carved the wood (into a doll).


GENERATIVE LEXICON THEORY

• (Pustejovsky, 1991, 1995)

• Claim: the concepts associated with a word in a context are GENERATED by a process that starts from lexical entries structured into QUALIA STRUCTURES and involves GENERATIVE DEVICES such as TYPE COERCION and CO-COMPOSITION


Generative lexicon theory: lexical entries

A lexical entry in the generative lexicon consists of at least the following elements:

• Argument Structure
  - True Arguments
  - Default Arguments
  - Shadow Arguments
  - True Adjuncts

• Event Structure

• Qualia Structure
  - Formal
  - Constitutive
  - Telic
  - Agentive


Argument Structure

• True Arguments: syntactically realized parameters of the lexical item
  Example: John arrived late.

• Default Arguments: parameters that are logically present in the expression but are not necessarily expressed syntactically
  Example: John built the house out of bricks.

• True Adjuncts: parameters that modify the logical expression but are part of the situational interpretation
  Example: She drove down to New York on Tuesday.


Argument Structure (contd…)

• Shadow Arguments: parameters that are semantically incorporated in the lexical item and can be expressed only through discourse specification and contextual factors
  - Example: Mary buttered her toast. The hidden argument is the material being spread on the toast.
  - These are not optional arguments; they are expressible only under specific conditions.
  - They refer to semantic content that is not necessarily expressed in syntax.
  - Example: Mary buttered her toast with margarine.
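
The difference between true, default and shadow arguments can be sketched as three separate slots in an entry. Everything below is an illustrative assumption: the class ArgumentStructure, the slot names (ARG1, D-ARG1, S-ARG1), the type labels, and the simplified rule that a shadow argument may be expressed only when the filler differs from the incorporated material.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ArgumentStructure:
    true_args: Dict[str, str]     # must be realized syntactically
    default_args: Dict[str, str]  # logically present, optionally expressed
    shadow_args: Dict[str, str]   # incorporated into the word's meaning


# 'build': the material (out of bricks) is a default argument.
BUILD = ArgumentStructure(
    true_args={"ARG1": "human", "ARG2": "artifact"},
    default_args={"D-ARG1": "material"},
    shadow_args={},
)

# 'butter': the spread material is incorporated into the verb itself.
BUTTER = ArgumentStructure(
    true_args={"ARG1": "human", "ARG2": "phys_obj"},
    default_args={},
    shadow_args={"S-ARG1": "butter"},
)


def is_obligatory(argstr: ArgumentStructure, slot: str) -> bool:
    return slot in argstr.true_args


def shadow_expressible(argstr: ArgumentStructure, slot: str, filler: str) -> bool:
    # Simplification: expressing the shadow argument is only informative
    # when the filler is not the material already built into the verb.
    incorporated = argstr.shadow_args.get(slot)
    return incorporated is not None and filler != incorporated


print(shadow_expressible(BUTTER, "S-ARG1", "margarine"))
```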


SELECTION

1. a. The man fell/died.
   b. The rock fell/!died.

2. a. John forced/!convinced the door to open.
   b. John forced/convinced the guests to leave.

3. a. John poured milk into/!on his coffee.
   b. John poured milk into/on the bowl.
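
Contrasts like fell/!died can be modelled by letting each verb name the type it selects for and checking the subject's type against a small hierarchy. The sketch below assumes a three-node hierarchy and the tables NOUN_TYPE and VERB_SELECTS; all of these names and type assignments are hypothetical.

```python
# Toy type hierarchy: child -> parent (None marks the top).
PARENT = {"animate": "phys_obj", "inanimate": "phys_obj", "phys_obj": None}

NOUN_TYPE = {"man": "animate", "rock": "inanimate"}

# Type each verb selects for its subject (illustrative assumption).
VERB_SELECTS = {"fall": "phys_obj", "die": "animate"}


def subsumes(general, specific):
    """True if 'specific' equals 'general' or lies below it in the hierarchy."""
    while specific is not None:
        if specific == general:
            return True
        specific = PARENT[specific]
    return False


def acceptable(verb, subject):
    return subsumes(VERB_SELECTS[verb], NOUN_TYPE[subject])


for verb in ("fall", "die"):
    for noun in ("man", "rock"):
        print(noun, verb, acceptable(verb, noun))
```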


Integrating Selection into Grammars


CONSEQUENCE OF SELECTION: ONTOLOGICAL ASSUMPTIONS


BUT …

• Unlike in other lexical theories, in GLT types can be modified via TYPE COERCION (see below)


Event Structure

• The event structure gives the event type of a lexical item or a phrase

• Events can be sub-classified into at least three sorts: State, Process and Transition

Event structure of build, as found in the following expressions:
  They are building a new house.
  The house was built by John.

build
  EVENTSTR = E1 = process
             E2 = state


QUALIA ROLES

Word meaning is structured on the basis of four generative factors, called qualia roles, that capture how humans understand objects and relations in the world and provide the minimal explanation for the linguistic behavior of lexical items.

FORMAL: the basic category that distinguishes an object within a larger domain

CONSTITUTIVE: the relation between an object and its constituent parts

TELIC: the object’s purpose and function

AGENTIVE: factors involved in the object’s origin or ‘coming into being’


Qualia structures and argument polysemy

Qualia Structure for novel

novel
  Qualia:
    const = narrative
    formal = book
    telic = reading
    agent = writing
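
The qualia structure for novel can be written down as a small immutable record with one field per role. This is a minimal sketch; the class name Qualia and the helper describe are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Qualia:
    formal: str        # what kind of thing it is
    constitutive: str  # what it is made up of
    telic: str         # what it is for
    agentive: str      # how it comes into being


# Values taken from the qualia structure for 'novel' on the slide.
NOVEL = Qualia(
    formal="book",
    constitutive="narrative",
    telic="reading",
    agentive="writing",
)


def describe(q: Qualia) -> dict:
    """Return the four roles as a plain dictionary, in a fixed order."""
    roles = ("formal", "constitutive", "telic", "agentive")
    return {role: getattr(q, role) for role in roles}


print(describe(NOVEL))
```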


A generative device: type coercion

• Type Coercion: a lexical item or phrase is coerced to a semantic interpretation by a governing item in the phrase, without changing its syntactic type
  Mary began to read the novel.
  Mary began reading the novel.
  Mary began the novel.

• Function Application with Coercion: the verb takes different complement types, and different interpretations of the verb arise for the different complements
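
The effect of type coercion in Mary began the novel can be sketched as follows: begin expects an event, and when it is given an entity it recovers an event from the noun's telic role. The names Phrase, TELIC, coerce_to_event and begin are hypothetical, and the string paraphrase stands in for a real semantic representation.

```python
from typing import NamedTuple, Optional


class Phrase(NamedTuple):
    text: str
    sem_type: str                   # "event" or "entity"
    head_noun: Optional[str] = None


# Telic role of 'novel' as given in its qualia structure.
TELIC = {"novel": "reading"}


def coerce_to_event(phrase: Phrase) -> Phrase:
    """Return the phrase unchanged if it is an event, else use the telic role."""
    if phrase.sem_type == "event":
        return phrase
    activity = TELIC.get(phrase.head_noun)
    if activity is None:
        raise TypeError(f"cannot coerce {phrase.text!r} to an event")
    return Phrase(f"{activity} {phrase.text}", "event", phrase.head_noun)


def begin(subject: str, complement: Phrase) -> str:
    # 'begin' selects for an event; coercion repairs an entity complement.
    return f"{subject} began {coerce_to_event(complement).text}"


print(begin("Mary", Phrase("the novel", "entity", "novel")))
print(begin("Mary", Phrase("reading the novel", "event")))
```

Both calls yield the same interpretation, so a single entry for begin covers the different complement types instead of one enumerated sense per complement.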


Other generative devices

• Selective Binding: a lexical item or a phrase operates specifically on the substructure of a phrase, without changing the overall type in the composition
  Example: a good knife: a knife that cuts well

• Co-composition: multiple elements within a phrase behave as functors, generating new non-lexicalized senses for the words in composition
  Examples: John baked the potato. John baked the cake.
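
Both devices can be sketched with a toy lexicon of qualia values. In the sketch below, good binds selectively to the noun's telic role, and bake receives a creation reading only when the object's agentive role is itself bake. The table LEXICON, its values (including the empty agentive role for potato) and the two function names are assumptions for illustration.

```python
# Toy qualia values; the entries are illustrative assumptions.
LEXICON = {
    "knife": {"telic": "cut", "agentive": "make"},
    "cake": {"telic": "eat", "agentive": "bake"},
    "potato": {"telic": "eat", "agentive": None},  # natural kind: not made by baking
}


def good(noun: str) -> str:
    """Selective binding: 'good' evaluates the noun's telic event."""
    telic = LEXICON[noun]["telic"]
    return f"a {noun} that {telic}s well"


def bake_sense(obj: str) -> str:
    """Co-composition: the object's agentive role shapes the verb's sense."""
    if LEXICON[obj]["agentive"] == "bake":
        return "creation"
    return "change_of_state"


print(good("knife"))
print(bake_sense("potato"), bake_sense("cake"))
```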


NOMINAL POLYSEMY

A dot object is a deeper structure relating the apparently contradictory senses of the word. For each sense pair there is a relation that ‘connects’ the senses in a well-defined way.

The dot object is characterized as:
- a Cartesian type product of n types (the product $\tau_1 \times \tau_2$ of types $\tau_1$ and $\tau_2$, each denoting a set, is the set of ordered pairs $\langle t_1, t_2 \rangle$, where $t_1 \in \tau_1$ and $t_2 \in \tau_2$)
- with some additional constraints: there exists a relation $R$ between the elements of $\tau_1$ and $\tau_2$, namely $R(t_1, t_2)$. This relation must be seen as part of the definition of the semantics for the dot object.
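
The characterization of a dot object as a type product constrained by a relation can be sketched with finite sets: form the Cartesian product and keep only the pairs that stand in R. The sets, the relation hold and the function dot_object below are toy assumptions, not part of the formal theory.

```python
from itertools import product


def dot_object(tau1, tau2, relation):
    """Pairs <t1, t2> from tau1 x tau2 that satisfy the relation R(t1, t2)."""
    return {(t1, t2) for t1, t2 in product(tau1, tau2) if relation(t1, t2)}


# Toy denotations for two types (assumed for illustration).
PHYS_OBJ = {"copy_a", "copy_b"}
INFO = {"monday_news", "tuesday_news"}

# Which physical copy holds which information content.
HOLDS = {("copy_a", "monday_news"), ("copy_b", "tuesday_news")}


def hold(x, y):
    return (x, y) in HOLDS


pairs = dot_object(PHYS_OBJ, INFO, hold)
print(sorted(pairs))
```

The unconstrained product has four pairs here; the relation cuts it down to the two pairs in which a copy actually holds the content.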


Type combinations included in the broad range of complex types encountered in natural language:

a. phys_obj•info: e.g., book, record
b. event•event: e.g., construction, examination
c. event•question: e.g., exam
d. event•food: e.g., lunch, dinner
e. event•human: e.g., appointment

For each of these type products, there is a unique relation, Ri, that structures the types.

For example, nouns such as book or record are structured by a containment relation R (container-like concepts). This containment relation, hold(x,y), must be encoded directly into the semantics of the concept as the FORMAL quale value.


The lexical structure for newspaper as a dot object is represented as follows:

newspaper
  ARGSTR = ARG1 = y:information
           ARG2 = x:phys_obj
  QUALIA = information•phys_obj
           FORM = hold(x,y)
           TELIC = read(e,w,x•y)
           AGENT = write(e′,v,x•y)

This translates to the following logical expression:

$$\lambda x{\bullet}y\;\exists e'\,\exists v\,[\,\mathit{newspaper}(x{:}\mathit{phys\_obj} \bullet y{:}\mathit{info}) \land \mathit{hold}(x,y) \land \lambda w\,\lambda e\,[\mathit{read}(e,w,x \bullet y)] \land [\mathit{write}(e',v,x \bullet y)]\,]$$
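
A dot-object entry lets a predicate pick out either component type without a second lexical entry. The sketch below assumes a record DotObject whose qualia values are kept as opaque strings, and a function select_aspect standing in for the selecting predicate; both names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DotObject:
    lemma: str
    types: Tuple[str, str]  # the two component types of the dot object
    formal: str             # relation connecting the two components
    telic: str
    agentive: str


# Qualia values are stored as uninterpreted labels in this sketch.
NEWSPAPER = DotObject(
    lemma="newspaper",
    types=("phys_obj", "information"),
    formal="hold(x,y)",
    telic="read(e,w,x.y)",
    agentive="write(e',v,x.y)",
)


def select_aspect(noun: DotObject, required_type: str) -> str:
    """Return the component a predicate selects, or fail with a type error."""
    if required_type in noun.types:
        return required_type
    raise TypeError(f"{noun.lemma} has no {required_type} aspect")


# 'sit on the newspaper' selects the physical object,
# 'the newspaper got me upset' selects the information.
print(select_aspect(NEWSPAPER, "phys_obj"))
print(select_aspect(NEWSPAPER, "information"))
```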


MORE FORMAL DETAILS


Three Ranks of Type

Entities

Events


System of Generating Types


Qualia are incorporated into Type Itself


Qualia as Types


Functional Selection


Functional Type Coercion


Co-composition


Coercion in Function Composition


Selection and Coercion


Type Specification


GLT AND LEXICAL RESOURCES


Generative lexicon vs. WordNet

• Formal role is similar to the hypernymy relation

• Constitutive role is similar to the meronymy relation

• Nothing in WordNet like
  - the functionality link
  - event structure

• Exists in some WordNets, e.g., Hindi WordNet


LEXICAL RESOURCES BASED ON GLT

• SIMPLE

• LKB


SIMPLE

• Lexicon created at the University of Pisa


[Figure: type hierarchy, some semantic types for abstract and concrete entities. Node labels shown: TOP; ENTITY, TELIC, AGENTIVE, CONSTITUTIVE; Concrete_entity, Abstract_entity, Property, Representation, Event; Furniture, Instrument, Clothing, Artwork; Sign, Language, Information; Living_entity, Human, Animal, Vegetal_entity, Artifact, Substance, Location, Food, Material; Quality, Quantity, Physical_prop, Psychol_prop; Convention, Cognitive_fact; Artifactual_material.]


[Figure: type hierarchy, some semantic types for events. Node labels shown: EVENT; Phenomenon, Change, Psych_event, Aspectual, State, Act; Cause_change, Relational_state, Non_relational_act, Relational_act, Move, Cause_act, Relational_change, Change_possession, Change_location, Acquire_knowledge, Natural_transition, Creation, Speech_act.]


[Figure: type hierarchy, some semantic types for adjectives. Node labels shown: TOP; Extensional, Intensional; Psychological_prop, Social_prop, Physical_prop, Intensifying_prop, Temporal_prop, Relational_prop; Temporal, Modal, Emotive, Manner, Object_related, Emphasizer.]


Extended Qualia Structure: extended roles

• Formal: isa, antonym_comp, antonym_grad, mult_opposition

• Agentive: result_of, agentive_prog, agentive_cause, agentive_experience, caused_by, source
  - ARTIFACTUAL AGENTIVE: created_by, derived_from

• Telic: indirect_telic, purpose
  - DIRECT TELIC: object_of_activity
  - ACTIVITY: is_the_activity_of, is_the_ability_of, is_the_habit_of
  - INSTRUMENTAL: used_for, used_as, used_by, used_against

• Constitutive: made_of, is_a_follower_of, has_as_member, is_a_member_of, has_as_part, instrument, kinship, is_a_part_of, resulting_state, relates, uses
  - PROPERTY: causes, concerns, affects, constitutive_activity, contains, has_as_colour, has_as_effect, has_as_property, measured_by, measures, produces, produced_by, property_of, quantifies, related_to, successor_of, precedes, typical_of, feeling
  - LOCATION: is_in, lives_in, typical_location


Orthogonal dimensions of meaning

[Figure: four axes labelled Formal role, Agentive role, Telic role and Constitutive role. Labels shown: instrument; is_a; used_for; created_by; is_made_of.]


Orthogonal dimensions of meaning: violin

• Formal role: is_a musical_instrument

• Telic role: used_for playing

• Agentive role: created_by make

• Constitutive role: has_as_part strings; is_made_of wood
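
The qualia relations shown for violin can be stored as (word, relation, value) triples and grouped by the role each relation belongs to. This is a minimal sketch: the triple layout, the table ROLE_OF and the function by_role are assumptions for illustration, not the storage format of any actual lexicon.

```python
from collections import defaultdict

# (word, relation, value) triples for 'violin', as shown in the figure.
TRIPLES = [
    ("violin", "is_a", "musical_instrument"),
    ("violin", "used_for", "playing"),
    ("violin", "created_by", "make"),
    ("violin", "has_as_part", "strings"),
    ("violin", "is_made_of", "wood"),
]

# Which qualia role each relation belongs to.
ROLE_OF = {
    "is_a": "formal",
    "used_for": "telic",
    "created_by": "agentive",
    "has_as_part": "constitutive",
    "is_made_of": "constitutive",
}


def by_role(word):
    """Group a word's relation values under the four qualia roles."""
    grouped = defaultdict(list)
    for w, relation, value in TRIPLES:
        if w == word:
            grouped[ROLE_OF[relation]].append(value)
    return dict(grouped)


print(by_role("violin"))
```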


Traditional dictionary definition

botte (barrel): recipiente di legno, fatto di doghe arcuate tenute unite da cerchi di ferro, che serve per la conservazione e il trasporto di liquidi, specialmente vino
(“wooden container, made of curved staves held together by iron hoops, used for storing and transporting liquids, especially wine”)

Meaning dimensions expressed by Qualia relations

• recipiente (container): Formal: isa

• di legno (of wood): Constitutive: made_of

• fatto (made): Agentive: created_by

• di doghe arcuate tenute unite da cerchi di ferro (of curved staves held together by iron hoops): Constitutive: made_of

• che serve per la conservazione e il trasporto (used for storage and transport): Telic: used_for

• di liquidi, specialmente vino (of liquids, especially wine): Constitutive: contains


REFERENCES

• Pustejovsky, J. (1995). The Generative Lexicon. Cambridge, MA: MIT Press.


ACKNOWLEDGMENTS

• Slides borrowed from
  - Debasri Chakrabarti
  - James Pustejovsky
  - Nilda Ruimy