BrainNet: Combining evidence from corpora and from the brain to
study conceptual representations
Massimo Poesio
Uni Essex, Language & Computation
Uni Trento, CIMEC/CLIC
COLLABORATORS
Essex: HEBA LAKANY (now Strathclyde), FRANCISCO SEPULVEDA
BRIAN MURPHY (CMU, was Trento)
Trento: ANDREW ANDERSON, YUQIAO GU, YUAN TAO, MARCO BARONI, GABRIELE MICELI
MOTIVATIONS
Research on conceptual knowledge is carried out in Artificial Intelligence, Computational Linguistics, Neural Science, and Psychology
But there is limited interchange between AI, CL and the other disciplines studying concepts
  Except indirectly, through the use of WordNet
This line of research: use evidence from Neural Science, work on (vector-space) models in CL, and psychology to rethink the design of lexical repositories such as WordNet
THE (LEXICAL) SEMANTICS REVOLUTION IN CL AND AI
The availability of repositories of lexical knowledge such as ConceptNet, Cyc, FrameNet, and especially WordNet has had a dramatic impact on research and development in HLT and AI, leading to the development of the first HLT systems able to do (some form of) lexical semantic interpretation on large amounts of data
This extensive use, however, has also highlighted the limitations of such resources (focusing here on WordNet as it’s the best known)
LIMITATIONS OF WORDNET
Already familiar from the CL literature:
  Coverage
  Overly fine-grained distinctions
More fundamental problems:
  Evidence for categorical distinctions
  Assumptions about taxonomic structure
  Lack of information about function / perceptual properties
  Emotional import
ENCOUNTERING WORDNET’S LIMITATIONS: A TYPICAL EXAMPLE
Between 2003 and 2006 Abdulrahman Almuhareb and I ran a series of studies on ontology learning from text (Poesio & Almuhareb, 2008)
We used WordNet to identify the categories of interest and to evaluate the results of our system
QUANTITATIVE EVALUATION
ATTRIBUTES:
  PROBLEM: can’t compare against WordNet
  Precision / recall against hand-annotated datasets
  Human judges (ourselves):
    We used the classifiers to classify the top 20 features of 21 randomly chosen concepts
    We separately evaluated the results
CATEGORIES:
  Clustering of the balanced dataset
  PROBLEM: The WordNet category structure is highly subjective
CLUSTERING: ERROR ANALYSIS
ANIMAL: bear, bull, camel, cat, cow, deer, dog, elephant, horse, kitten, lion, monkey, puppy, rat, sheep, tiger, turtle
EDIBLE FRUIT: apple, banana, berry, cherry, fig, grape, kiwi, lemon, lime, mango, melon, olive, orange, peach, pear, pineapple, strawberry, watermelon, (pistachio, oyster)
ILLNESS: acne, anthrax, arthritis, asthma, cancer, cholera, cirrhosis, diabetes, eczema, flu, glaucoma, hepatitis, leukemia, malnutrition, meningitis, plague, rheumatism, smallpox, (superego, lumbago, neuralgia, sciatica, gestation, menopause, quaternary, pain)
IN WORDNET: PAIN
EXAMPLE: STATES AND RELATIONS
[Figure: two arrangements of the nodes FEELING, STATE and ATTRIBUTE, one labelled WORDNET and one labelled PLAUSIBLE ALTERNATIVE?]
LIMITS OF THIS TYPE OF EVALUATION
No way of telling how complete / accurate our concept descriptions are
  Both in terms of relations and in terms of their relative importance
No way of telling whether the category distinctions we get from WordNet are empirically founded
EVIDENCE FROM OTHER AREAS OF COGNITIVE SCIENCE
Attributes: evidence from psychology
Association lists (priming)
  E.g., use results of association tests to evaluate proximity (Lund et al, 1995; Pado and Lapata, 2008)
  Comparison against feature norms (Schulte im Walde, 2008)
Feature norms
Category distinctions: evidence from neural science
USING BRAIN DATA TO IDENTIFY CATEGORY DISTINCTIONS
Studies of brain-damaged patients have been shown to provide useful insights into the organization of conceptual knowledge in the brain
  Warrington and Shallice 1984, Caramazza & Shelton 1998
fMRI has been used to identify these distinctions in healthy subjects as well
  E.g., Martin & Chao
See, e.g., Capitani et al 2003 for a survey
SETUP
[Figure labels: Magnetic Resonance Imaging Scanner; fMRI Setup]
Simple Paradigms
Image visualisation
Property elicitation
Silent naming
Concept “simulation”
CATEGORY DISTINCTIONS IN THE BRAIN
[Figure labels: ANIMALS, TOOLS, VOXEL]
A MORE COMMON CASE
[Figure, panel d. RED: Law, BLUE: Music]
MVPA: USING SUPERVISED LEARNING TO CLASSIFY ACTIVATION PATTERNS
Simple experiment: show subjects pictures of different objects (e.g., shoes vs. bottles) on different trials of different runs
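The MVPA idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the voxel patterns are synthetic, the classifier is a nearest-centroid rule (a stand-in, not necessarily what any of the studies mentioned here used), and evaluation is leave-one-run-out so that training and test trials never come from the same run. The names `make_run`, `nearest_centroid` and `leave_one_run_out` are hypothetical.

```python
import math
import random

def make_run(rng, n_trials=10, n_voxels=20, shift=1.0):
    """Synthetic run: 'shoe' trials are shifted up in the first 5 voxels, 'bottle' trials down."""
    trials = []
    for t in range(n_trials):
        label = "shoe" if t % 2 == 0 else "bottle"
        sign = 1.0 if label == "shoe" else -1.0
        pattern = [rng.gauss(0.0, 1.0) + (sign * shift if v < 5 else 0.0)
                   for v in range(n_voxels)]
        trials.append((pattern, label))
    return trials

def centroid(patterns):
    """Voxel-wise mean of a list of activation patterns."""
    n = len(patterns)
    return [sum(col) / n for col in zip(*patterns)]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_centroid(train, pattern):
    """Assign the label whose training centroid is closest to the pattern."""
    by_label = {}
    for p, label in train:
        by_label.setdefault(label, []).append(p)
    centroids = {label: centroid(ps) for label, ps in by_label.items()}
    return min(centroids, key=lambda label: distance(centroids[label], pattern))

def leave_one_run_out(runs):
    """Train on all runs but one, test on the held-out run, and average over runs."""
    correct = total = 0
    for held_out in range(len(runs)):
        train = [trial for i, run in enumerate(runs) if i != held_out for trial in run]
        for pattern, label in runs[held_out]:
            correct += nearest_centroid(train, pattern) == label
            total += 1
    return correct / total

rng = random.Random(0)
runs = [make_run(rng) for _ in range(6)]
accuracy = leave_one_run_out(runs)
print(f"leave-one-run-out accuracy: {accuracy:.2f} (chance = 0.50)")
```

Holding out whole runs rather than individual trials is the design choice that matters here: trials within a run share scanner drift, so mixing them across training and test sets would inflate the accuracy.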
FROM WORDNET TO BRAINNET
Neural evidence, unlike the evidence used to compile dictionaries and WordNet, and like the evidence one gathers from corpora and certain behavioral experiments, is entirely objective (although it can be subjective in the sense of differing from subject to subject)
The objective of our research is to combine evidence from brain data, from corpora, and from behavioral experiments (all of which is rather noisy) to develop a new architecture for conceptual knowledge: BrainNet
FIRST CASE STUDY: ABSTRACT CONCEPTS
Until recently, most work on concepts in CL / neuroscience / psychology focused on concrete concepts
But the type of conceptual knowledge that really challenges traditional assumptions about its organization is ‘abstract concepts’ or, to be more precise, the set of categories of non-concrete concepts
  Events / actions
  States
  ‘Urabstract’ concepts: LAW, JUSTICE, ART
We are carrying out explorations of abstract knowledge using fMRI
THEORIES OF ABSTRACT CONCEPTS IN COGNITIVE NEUROSCIENCE
In CL/AI: TAXONOMIC organization for both abstract and concrete concepts
Best known in Cognitive Neuroscience: Paivio’s DUAL CODE theory (Paivio, 1986)
  CONCRETE: verbal system & visual system
  ABSTRACT: verbal system only
Schwanenflugel & Akin 1994: CONTEXT AVAILABILITY
Barsalou’s SCENARIO-BASED MODEL (Barsalou, 1999):
  Abstract knowledge organized around SCENARIOS
THE OBJECTIVES OF OUR EXPERIMENT
Identify the representation in the brain of a variety of WordNet categories exemplifying both concrete and abstract concepts (abstract words chosen by inspecting the words rated as most abstract in the De Rosa et al 2005 norms)
  Really abstract: ATTRIBUTE, COMMUNICATION, EVENT, LOCATION, ‘URABSTRACT’
  A category of concrete objects: TOOLS
  A complex category: SOCIAL-ROLE
Comparing two types of classification:
  TAXONOMIC (as in WordNet)
  DOMAIN (cf. Barsalou’s hypothesis about abstract concepts being ‘situated’)
Two domains: LAW and MUSIC
  Using WordNet Domain
STIMULI
CATEGORY        LAW             (English)       MUSIC          (English)
attribute       giurisdizione   jurisdiction    sonorita'      sonority
                cittadinanza    citizenship     ritmo          rhythm
                impunita'       impunity        melodia        melody
                legalita'       legality        tonalita'      tonality
                illegalita'     illegality      intonazione    pitch
communication   divieto         prohibition     canzone        song
                verdetto        verdict         pentagramma    stave
                ordinanza       decree          ballata        ballad
                addebito        accusation      ritornello     refrain
                ingiunzione     injunction      sinfonia       symphony
STIMULI, 2: URABSTRACTS
CATEGORY        LAW             (English)       MUSIC          (English)
urabstracts     giustizia       justice         musica         music
                liberta'        liberty         blues          blues
                legge           law             jazz           jazz
                corruzione      corruption      canto          singing
                refurtiva       loot            punk           punk
STIMULI, 3: SOCIAL ROLES
social-role     giudice         judge           musicista      musician
                ladro           thief           cantante       singer
                imputato        defendant       compositore    composer
                testimone       witness         chitarrista    guitarist
                avvocato        lawyer          tenore         tenor
ABSTRACT CONCEPTS: DATA COLLECTION AND ANALYSIS
7 right-handed native speakers of Italian
Task:
  Words presented in white on grey screen for 10 sec
  Cross in between, 7 sec
  Subjects had to think of a situation in which the word applied
Scanner: 4T Bruker MedSpec MRI scanner, EPI pulse sequence
  TR=1000ms, TE=33ms, 26° flip angle. Voxel dimensions 3mm*3mm*5mm
Preprocessing: using UCL’s Statistical Parametric Mapping software
  Data corrected for head motion
Classification: using a single layer NN
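The slide says only that a single layer NN was used; the activation function, training rule and input features are not given. As a rough illustration, here is a minimal sketch of one possible single-layer network: a single sigmoid output unit with no hidden layer, trained by stochastic gradient descent to separate two classes (say, the LAW and MUSIC domains). The data are synthetic and every name and hyperparameter below is an assumption, not the authors' setup.

```python
import math
import random

def sigmoid(z):
    # Clamp z to avoid overflow in math.exp for extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Output of the single sigmoid unit for input pattern x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_single_layer(data, n_features, epochs=200, lr=0.1):
    """Stochastic gradient descent on the logistic (cross-entropy) loss."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for x, y in data:
            # Gradient of the loss with respect to the unit's pre-activation.
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias

def synthetic_trial(rng, label, n_features=12):
    """Hypothetical trial: class 1 is shifted up in the first 4 features, class 0 down."""
    sign = 1.0 if label == 1 else -1.0
    pattern = [rng.gauss(0.0, 1.0) + (sign if i < 4 else 0.0) for i in range(n_features)]
    return pattern, label

rng = random.Random(1)
train = [synthetic_trial(rng, i % 2) for i in range(80)]
test = [synthetic_trial(rng, i % 2) for i in range(40)]
weights, bias = train_single_layer(train, n_features=12)
accuracy = sum((predict(weights, bias, x) > 0.5) == bool(y) for x, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

With more than two classes (the taxonomic categories, for instance) the same idea extends to one output unit per class with a softmax over the outputs.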
MAIN QUESTIONS
Can the taxonomic and domain classes be distinguished from the fMRI data?
Is there a difference in classification accuracy between taxonomy and domain?
Can the taxonomic and domain classes be predicted across participants?
RESULTS WITHIN PARTICIPANTS (CATEGORY DISTINCTIONS)
ALL CATEGORICAL DISTINCTIONS CAN BE PREDICTED ABOVE CHANCE
THERE ARE SIGNIFICANT DIFFERENCES BETWEEN CATEGORIES
RESULTS WITHIN PARTICIPANTS (DOMAIN)
WITHIN PARTICIPANTS RESULTS SUMMARY
Can discriminate both taxonomic and domain distinctions with accuracy well above chance
Easiest categories to recognize: TOOL, ATTRIBUTE, LOCATION
  Then SOCIAL ROLE, COMMUNICATION
Main confusions: communication / event
CATEGORY LOCALIZATION IN THE BRAIN
[Figure: colour-coded activation maps]
Red: Attribute, Blue: Tool, Green: Location
Red: Social-role, Green: Attribute, Blue: Urabstract
Red: Social-role, Green: Communication, Blue: Event
Overlaps: R+G = Yellow, G+B = Cyan, R+B = Pink, R+G+B = White
Visually apparent inter-region differences in activation.
The precuneus appears to contain voxels systematically associated with independent taxonomic/topical categories.
CROSS PARTICIPANTS RESULTS SUMMARY
Concrete categories TOOL and LOCATION can be predicted across participants; ATTRIBUTE can also be classified significantly above chance, but less concrete classes become conflated with ATTRIBUTE.
In general DOMAIN can be predicted across participants; however, domain membership is classified much better in the most abstract taxonomic classes (ATTRIBUTE, COMMUNICATION and URABSTRACT)
CATEGORY        LAW             (English)        MUSIC          (English)
attribute       giurisdizione   jurisdiction     sonorita'      sonority
                cittadinanza    citizenship      ritmo          rhythm
                impunita'       impunity         melodia        melody
                legalita'       legality         tonalita'      tonality
                illegalita'     illegality       intonazione    pitch
communication   divieto         prohibition      canzone        song
                verdetto        verdict          pentagramma    stave
                ordinanza       decree           ballata        ballad
                addebito        accusation       ritornello     refrain
                ingiunzione     injunction       sinfonia       symphony
event           arresto         arrest           concerto       concert
                processo        trial            recital        recital
                reato           crime            assolo         solo
                furto           theft            festival       festival
                assoluzione     acquittal        spettacolo     show
social-role     giudice         judge            musicista      musician
                ladro           thief            cantante       singer
                imputato        defendant        compositore    composer
                testimone       witness          chitarrista    guitarist
                avvocato        lawyer           tenore         tenor
tool            manette         handcuffs        violino        violin
                toga            robe             tamburo        drum
                manganello      truncheon        tromba         trumpet
                cappio          noose            metronomo      metronome
                grimaldello     skeleton key     radio          radio
location        tribunale       court/tribunal   palco          stage
                carcere         prison           auditorium     auditorium
                questura        police station   discoteca      disco
                penitenziario   penitentiary     conservatorio  conservatory
                patibolo        gallows          teatro         theatre
urabstracts     giustizia       justice          musica         music
                liberta'        liberty          blues          blues
                legge           law              jazz           jazz
                corruzione      corruption       canto          singing
                refurtiva       loot             punk           punk
TAXONOMIC / DOMAIN ORGANIZATION
WHAT THE DATA SUGGESTS
EEG vs fMRI
A question about conceptual organization that can be clearly investigated using neural evidence is: Which categories can be distinguished?
But: fMRI too expensive to carry out systematic investigations (~500 euros per hour)
Alternative: EEG
  Used in BCI for a variety of ‘mind reading’ tasks
  Also used to study semantics with ERPs
EEG vs. fMRI
USING EEG TO STUDY SEMANTICS: ERP
• Features: signal amplitude and slope at a range of resolutions give a compact representation of the waveform
• N400 Violations of person and number in pronoun-verb agreement
• Up to 70% detection on single trials
• Gaussian Naive-Bayes, SM Log. Regression, Linear SVM
EEG Spectral Analysis of Concepts?
Participants presented with aural or visual concept stimuli
EEG apparatus records electrical activity on the scalp
Waveforms can be reduced to frequency components
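To make "reduced to frequency components" concrete, here is a minimal sketch using a naive discrete Fourier transform on a synthetic signal. The sampling rate, signal length and the two sine components are hypothetical values chosen for the illustration; real EEG analysis would typically use an FFT library and windowing.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Naive O(n^2) discrete Fourier transform; returns magnitudes for bins 0..n//2."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        mags.append(abs(coeff) / n)
    return mags

sample_rate = 128   # Hz (hypothetical)
n_samples = 128     # one second of signal, so bin k corresponds to k Hz

# Synthetic "waveform": a 10 Hz component plus a weaker 20 Hz component.
signal = [math.sin(2 * math.pi * 10 * t / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 20 * t / sample_rate)
          for t in range(n_samples)]

mags = dft_magnitudes(signal)
peak_bin = max(range(len(mags)), key=lambda k: mags[k])
peak_hz = peak_bin * sample_rate / n_samples
print(f"dominant frequency: {peak_hz:.0f} Hz")
```

On this synthetic input the dominant component is the 10 Hz sine, and the magnitude at each bin is half the amplitude of the corresponding sine (0.5 at 10 Hz, 0.25 at 20 Hz).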
EEG pros and cons
Pros:
  Lighter
  Cheaper
  Better temporal resolution (ms)
Cons:
  Coarser spatial resolution (cm)
  Noisy (e.g., very sensitive to skull depth)
EEG CAN BE USED TO IDENTIFY MAJOR CATEGORICAL DISTINCTIONS
Murphy et al, Brain and Language 2011:
  7 Italian subjects
  30 animals, 30 tools
  Each presented 6 times
  Task: silent naming
STIMULI
EEG SIGNALS: TIME-FREQUENCY (PER CHANNEL)
Data analysis
Classification System Schematic
[Figure: 64 channels preprocessed data -> filter by time, frequency and electrode -> X channels filtered data -> CSSD decomposition into a “Tool” component and an “Animal” component -> vector transform: var(“tool”), var(“animal”) -> feature vector -> Support Vector Machine -> answer]
RESULTS
Murphy et al, 2011, Brain and Language
• Time/Freq window optimisation, CSP extraction of class-sensitive sources, 5-fold cross-validated SVM
• With group analysis, 98% accuracy categorising mammals vs tools
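The cross-validation part of this pipeline can be sketched as follows. This is an illustration of 5-fold cross-validation only: the CSP step is omitted, the one-dimensional features are synthetic, and the classifier is a simple threshold between class means, explicitly NOT an SVM. All function names are hypothetical.

```python
import random
from statistics import mean

def k_fold_indices(n_items, k, rng):
    """Shuffle item indices and deal them into k disjoint folds of near-equal size."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(items, labels, k, fit, predict, rng):
    """Train on k-1 folds, test on the remaining fold, and average accuracy over folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(items), k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(items)) if i not in held_out]
        model = fit([items[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(predict(model, items[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return mean(accuracies)

# Stand-in classifier (not an SVM): threshold halfway between the two class means.
def fit_threshold(xs, ys):
    tool = mean(x for x, y in zip(xs, ys) if y == "tool")
    animal = mean(x for x, y in zip(xs, ys) if y == "animal")
    return (tool + animal) / 2, tool > animal

def predict_threshold(model, x):
    threshold, tool_is_high = model
    return "tool" if (x > threshold) == tool_is_high else "animal"

rng = random.Random(2)
labels = ["tool", "animal"] * 30   # 60 synthetic trials
features = [rng.gauss(1.5 if y == "tool" else -1.5, 1.0) for y in labels]
accuracy = cross_validate(features, labels, 5, fit_threshold, predict_threshold, rng)
print(f"5-fold cross-validated accuracy: {accuracy:.2f}")
```

Any steps that are fitted to the data, such as a time/frequency window search or spatial filter extraction, would normally be fitted inside each training fold so that the held-out fold stays unseen.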
PRELIMINARY CONCLUSIONS
EEG can be used to decode broad categorical distinctions
May need to use fMRI to study
  Finer grained distinctions
  Cross-language distinctions
BRAIN EVIDENCE AND CORPUS EVIDENCE
Can we find ways of combining evidence about strength of categorial distinctions coming from EEG / fMRI with the evidence coming from corpora?
First question: what is the relation between the conceptual spaces induced from corpora and the conceptual spaces elicited using EEG?
PREDICTING BRAIN (FMRI) ACTIVATION USING CONCEPT DESCRIPTIONS
T. Mitchell, S. Shinkareva, A. Carlson, K. Chang, V. Malave, R. Mason and M. Just. 2008. Predicting human brain activity associated with the meanings of nouns. Science 320, 1191–1195
MITCHELL ET AL 2008: METHODS
Record fMRI activation for 60 nominal concepts
  And extract 200 ‘best’ features, or VOXELs
Build conceptual descriptions for these concepts from corpora (the Web)
  25 features for each concept
  25 verbs expressing typical properties of living things / tools
  Collect strength of association between these features and each concept
Learn association between each voxel and the 25 verbal features using 58 concepts
Use learned model to predict activation of the 2 held-out concepts (compare using Euclidean distance)
  Accuracy: 77%
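The held-out evaluation step can be sketched as below. The sketch assumes a linear model in which each voxel's activation is a weighted sum of the corpus features, and it covers only prediction and matching, not the learning of the weights. The weights, feature values and observed images are toy numbers invented for the illustration (3 voxels and 2 features rather than 200 voxels and 25 verbs), and the Euclidean comparison follows the slide.

```python
import math

def predict_image(weights, features):
    """Linear model: activation of voxel v = sum_i weights[v][i] * features[i]."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_held_out(pred_a, pred_b, obs_a, obs_b):
    """True if the correct pairing of predicted and observed images is closer than the swapped one."""
    correct = euclidean(pred_a, obs_a) + euclidean(pred_b, obs_b)
    swapped = euclidean(pred_a, obs_b) + euclidean(pred_b, obs_a)
    return correct < swapped

# Toy, hypothetical numbers: one row of weights per voxel, one column per corpus feature.
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [0.5, 0.5]]
features = {"hammer": [0.9, 0.1], "dog": [0.2, 0.8]}
observed = {"hammer": [1.0, 0.0, 0.6], "dog": [0.1, 0.9, 0.4]}

pred_hammer = predict_image(weights, features["hammer"])
pred_dog = predict_image(weights, features["dog"])
matched = match_held_out(pred_hammer, pred_dog, observed["hammer"], observed["dog"])
print("held-out pair matched correctly:", matched)
```

Repeating this over every pair of held-out concepts and counting the fraction of correctly matched pairs gives an accuracy figure of the kind reported on the slide, with 50% as chance.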
MITCHELL ET AL 2008
MITCHELL ET AL 2008: VERB FEATURES
MITCHELL ET AL: LEARNING ASSOCIATIONS
OUR EXPERIMENTS
Replicate the Mitchell et al study using EEG data instead of fMRI
  Different feature selection mechanisms
Compare different methods for building concept descriptions
  In addition to hand-picked, also a variety of standard corpus models
For Italian
B. Murphy, M. Baroni, and M. Poesio, EEG responds to conceptual stimuli and corpus semantics, EMNLP 2009
RESULTS USING THE HAND-PICKED FEATURES
RESULTS USING AUTOMATICALLY SELECTED FEATURES
MITCHELL ET AL
AA-MP
INTERIM SUMMARY
It is possible to establish systematic links between knowledge about concepts acquired from corpora and knowledge extracted from brain data
These links may be used for instance to compare ontology learning methods (need however to extend the investigation of categorial distinctions discussed above)
APPLICATIONS
‘Mind-reading’ techniques can be used for a variety of other studies of interest to CL types
DEEP RELATIONS: fMRI can be used to extract information about POLARITY
  This can be used for sentiment analysis in text
ADAM: being able to distinguish between ANIMALS and TOOLS using EEG can be used as an early predictor of certain classes of semantic dementia
CONCLUSIONS
Evidence from neuroscience, combined with evidence from corpora and from behavioral studies, may be used to put our theories of the lexicon on a firmer empirical footing
The resulting resources may be more useful both for HLT and for other applications
THANKS!