
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1246–1258, Seattle, Washington, USA, 18-21 October 2013. ©2013 Association for Computational Linguistics

Understanding and Quantifying Creativity in Lexical Composition

Polina Kuznetsova    Jianfu Chen    Yejin Choi
Department of Computer Science

Stony Brook University
Stony Brook, NY 11794-4400

{pkuznetsova,jianchen,ychoi}@cs.stonybrook.edu

Abstract

Why do certain combinations of words such as “disadvantageous peace” or “metal to the petal” appeal to our minds as interesting expressions with a sense of creativity, while other phrases such as “quiet teenager” or “geometrical base” do not appeal as much? We present statistical explorations to understand the characteristics of lexical compositions that give rise to the perception of being original, interesting, and at times even artistic. We first examine various correlates of perceived creativity based on information theoretic measures and the connotation of words, then present experiments based on supervised learning that give us further insights into how different aspects of lexical composition collectively contribute to perceived creativity.

1 Introduction

An essential property of natural language is the generative capacity that makes it possible for people to express indefinitely many thoughts through indefinitely many different ways of composing phrases and sentences (Chomsky, 1965). The supply of novel, creative expressions never seems to be exhausted. Various types of writers, such as novelists, journalists, movie script writers, and creatives in advertising, continue producing novel phrases and expressions that are original while still fitting the desired meaning in the given situation. Consider unique phrases such as “geological split personality” or “intoxicating Shangri-La of shoes”,1 which

1 Examples from New York Times articles in 2013.

continue to flow into online text, drawing attention from readers.

Writers put significant effort into choosing the perfect words when completing their compositions, as a well-chosen combination of words is impactful in readers' minds, both for rendering the precise intended meaning and for stimulating an increased level of cognitive response and attention. Metaphors in particular, one of the quintessential forms of linguistic creativity, have been discussed extensively by studies across multiple disciplines, e.g., Cognitive Science, Psychology, Linguistics, and Literature (e.g., Lakoff and Johnson (1980), McCurry and Hayes (1992), Goatly (1997)). Moreover, recent studies based on fMRI have begun to uncover biological evidence that supports the impact of creative phrases on people's minds. These studies report that unconventional metaphoric expressions elicit significantly increased involvement of brain processing when compared against the effect of conventional metaphors or literal expressions (e.g., Mashal et al. (2007), Mashal et al. (2009)).

Several linguistic elements, e.g., syntax, semantics, and pragmatics, are likely working together to produce the perception of creativity. However, their underlying mechanisms by and large remain to be investigated. In this paper, as a small step toward a quantitative understanding of linguistic creativity, we present a focused study on the lexical composition of two content words.

Being creative, by definition, implies qualities such as being unique, novel, unfamiliar, or unconventional. But not every unfamiliar combination of words appeals as creative. For example, unfamiliar biomedical terms, e.g., “cardiac glycosides”, are merely informative without appreciable creativity. Similarly, less frequent combinations of words, e.g., “rotten detergent” or “quiet teenager”, though describing situations that are certainly uncommon, do not bring about a sense of creativity. Finally, some unique combinations of words can be simply nonsensical, e.g., “elegant glycosides”.

Different studies have assumed different definitions of linguistic creativity depending on their context and end goals (e.g., Chomsky (1976), Zhu et al. (2009), Gervas (2010), Maybin and Swann (2007), Carter and McCarthy (2004)). In this paper, as an operational definition, we consider a phrase creative if it is (a) unconventional or uncommon, and (b) expressive in an interesting, imaginative, or inspirational way.

A system that can recognize creative expressions could be of practical use for many aspiring writers, who are often in need of inspirational help in searching for the optimal choice of words. Such a system could also be integrated into automatic assessment of writing style and quality, and utilized to automatically construct a collection of interesting expressions from the web, which may be potentially useful for enriching natural language generation systems.

With these practical goals in mind, we aim to understand phrases with linguistic creativity in a broad scope. Similarly to the work of Zhu et al. (2009), our study encompasses phrases that evoke a sense of interestingness and creativity in readers' minds, rather than focusing exclusively on clearly but narrowly defined figures of speech such as metaphors (e.g., Shutova (2010)), similes (e.g., Veale et al. (2008), Hao and Veale (2010)), and humor (e.g., Mihalcea and Strapparava (2005), Purandare and Litman (2006)). Unlike the study of Zhu et al. (2009), however, we concentrate specifically on how combinations of different words give rise to the sense of creativity, as this is an angle that has not been directly studied before. We leave the roles of syntactic elements for future research.

We first examine various correlates of perceived creativity based on information theoretic measures and the connotation of words, then present experiments based on supervised learning that give us further insights into how different aspects of lexical composition collectively contribute to perceived creativity.

2 Theories of Creativity and Hypotheses

Many researchers, from ancient philosophers to modern-day scientists, have proposed theories that attempt to explain the mechanism of the creative process. In this section, we draw connections from some of these theories, developed for general human creativity, to the problem of quantitatively interpreting linguistic creativity in lexical composition.

2.1 Divergent Thinking and Composition

Divergent thinking (e.g., McCrae (1987)), which seeks to generate multiple unstereotypical solutions to an open-ended problem, has been considered the key element of the creative process, in contrast with convergent thinking, which finds a single, correct solution (e.g., Cropley (2006)). Applying the same high-level idea to lexical composition, a divergent composition that explores an unusual, unconventional set of words is more likely to be creative.

Note that the key novelty then lies in the compositional operation itself, i.e., the act of putting together a set of words in an unexpected way, rather than the rareness of the individual words being used. In recent years there has been a swell of work on compositional distributional semantics that captures the compositional aspects of language understanding, such as sentiment analysis (e.g., Yessenalina and Cardie (2011), Socher et al. (2011)) and language modeling (e.g., Mitchell and Lapata (2009), Baroni and Zamparelli (2010), Guevara (2011), Clarke (2012), Rudolph and Giesbrecht (2010)). However, none has examined the compositional nature of lexical combinations in quantifying creativity.

We consider two computational approaches to capture the notion of creative composition. The first is via various information theoretic measures, e.g., relative entropy reduction, to measure the surprisal of seeing the next word given the previous word. The second is via supervised learning, where we explore different modeling techniques to capture the statistical regularities in creative compositional operations. In particular, we will explore (1) compositional operations of vector space models, (2) kernels capturing the non-linear composition of different dimensions in the meaning space, and (3) the use of neural networks as an alternative way to incorporate non-linearity in vector composition (see §5).

2.2 Latent Memory and Creative Semantic Subspace

Although we expect that unconventional composition has a connection to the creativity of resulting phrases, that alone does not explain many counterexamples where the composition itself is uncommon but the resulting expression is not creative due to a lack of interestingness or imagination, e.g., “room and water”.2 Therefore, we must consider additional conditions that give rise to creative phrases.

Let S represent the semantic space, i.e., the set of all possible semantic representations that can be expressed by a phrase composed of two content words.3 We hypothesize that some subsets of the semantic space {Si | Si ⊂ S} are semantically futile regions for appreciable linguistic creativity, regardless of how novel the composition in itself might be. Such regions may include technical domains such as law or pharmacology. Similarly, we expect semantically fruitful subsets of the semantic space where creative expressions are more frequently found. For instance, phrases such as “guns and roses” and “metal to the petal” are semantically close to each other and yet both can be considered interesting and creative (as opposed to one of them losing the sense of creativity due to its semantic proximity to the other).

This notion of a creative semantic subspace connects to theories that suggest that latent memories serve as motives for creative ideas and that one's creativity largely depends on the prior experience and knowledge one has been exposed to (e.g., Freud (1908), Necka (1999), Glaskin (2011), Cohen and Levinthal (1990), Amabile (1997)), a point also made by Einstein: “The secret to creativity is knowing how to hide your sources.”

Figure 5 presents visualized support for the creative semantic subspace,4 where we observe that phrases in the neighborhood of legal terms are generally not creative, while the semantic neighborhood of

2 With additional context this example may turn into a creative one, but for simplicity we focus on phrases with two content words considered out of context.

3 Investigation of the recursive composition of more than two content words and of the influence of syntactic packaging is left for future research.

4 See §6 for a more detailed discussion.

Source       # of uniq words   # of sent   Avg sent len   Entropy
QUOTESraw    29498             49402       28             173.05
GLOSSESraw   20869             7745        53             96.79

Table 1: Entropy of word distribution in datasets

Dataset    total   #(-)   #(+)   #(+)/total %
GLOSSES    1912    149    18     0.94
QUOTES     3298    204    35     1.06

Table 2: Distribution of creative (+) / common (-) word pairs over the GLOSSES and QUOTES datasets.

“kingdom” and “power” is relatively more fruitful for composing creative (i.e., unique and uncommon while being imaginative and interesting, per our operational definition of creativity given in §1) word pairs, e.g., “invisible empire”. In our empirical investigation, this notion of semantically fruitful and futile subspaces is captured using distributional semantic space models under a supervised learning framework (§5).

2.3 Affective Language

Another angle we probe is the connection between creative expressions and the use of affective language. This idea is supported in part by previous research that explored the connection between figurative language such as metaphors and sentiment (e.g., Fussell and Moss (1998), Rumbell et al. (2008), Rentoumi et al. (2012)). The focus of previous work was either on the interpretation of the sentiment in metaphors, or on the use of metaphors in the description of affect. In contrast, we aim to quantify the correlation between creative expressions (beyond metaphors) and the use of sentiment-laden words in a more systematic way. This exploration has a connection to the creative semantic subspace discussed earlier (§2.2), but pays more direct attention to the aspects of sentiment and connotation.

3 Creative Language Dataset

We start our investigation by considering two types of naturally existing collections of sentences: (1) quotes and (2) dictionary glosses. We expect that quotes are likely to be rich in creative expressions, while dictionary glosses stand at the opposite end of the spectrum.


Figure 1: Distribution of creative (double lines in blue) versus common (single lines in red) word pairs with varying ranges of frequencies (x-axis) for GLOSSES, QUOTES, and both datasets combined.

Figure 2: Distribution of creative (double lines in blue) versus common (single lines in red) word pairs with varying ranges of PMI values (x-axis) for GLOSSES, QUOTES, and both datasets combined.

QUOTESraw: We crawled inspirational quotes from “Brainy Quote”.5

GLOSSESraw: We collected glosses from the Oxford Dictionary and the Merriam-Webster Dictionary.6 Overall we crawled about 8K definitions. Table 1 shows statistics of the datasets.7

Entropy of word distribution We conjecture that QUOTES and GLOSSES differ in terms of word variety, which can be quantified by the entropy of their word distributions.

5 http://www.brainyquote.com/

6 http://oxforddictionaries.com/ and http://www.merriam-webster.com/. We only consider words appearing in both dictionaries to avoid unusual words such as compound words, e.g., “zero-base”.

7 QUOTESraw contains 30K unique words and GLOSSESraw has 20K unique words. QUOTESraw has a much larger number of sentences, while its average sentence is shorter.

To compute the entropy for each dataset, we use ngram statistics from the corresponding dataset to measure the probability of each word. As expected, the QUOTES dataset has higher entropy than GLOSSES (Table 1).
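As a rough illustration of this entropy computation, the following minimal Python sketch (not the authors' code; the toy sentences are invented) computes the unigram entropy of a small corpus:

```python
import math
from collections import Counter

def word_entropy(sentences):
    """Entropy (in bits) of the unigram word distribution of a corpus."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Toy comparison: a varied "quotes-like" sample vs. a repetitive "gloss-like" sample.
quotes_like = ["imagination rules the world", "deep nonsense hides invisible empires"]
gloss_like = ["a device used for a purpose", "a device used for a task"]
print(word_entropy(quotes_like) > word_entropy(gloss_like))  # True: more word variety
```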

3.1 Creative Word Pairs

We extract word pairs corresponding to the following syntactic patterns: [NN NN], [JJ NN], [NN JJ] and [JJ JJ]. Not all pairs from QUOTESraw are creative, and likewise, not all pairs from GLOSSESraw are uncreative. Therefore, we perform manual annotation on a subset of the collected pairs as follows. We obtain a small subset of pairs by applying stratified sampling based on bigram frequency buckets: first we sort word pairs by their bigram frequencies obtained from the Web 1T corpus (Brants and Franz (2006)), group them into consecutive frequency buckets each containing 400 word pairs, then sample 40 word pairs from each bucket.
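A minimal sketch of this stratified sampling step, assuming a hypothetical dictionary pair_freqs that maps each extracted word pair to its Web 1T bigram count (the implementation details are our own illustration, not the authors'):

```python
import random

def stratified_sample(pair_freqs, bucket_size=400, sample_per_bucket=40, seed=0):
    """Sort word pairs by bigram frequency, cut them into consecutive buckets of
    `bucket_size` pairs, and sample `sample_per_bucket` pairs from each bucket."""
    rng = random.Random(seed)
    ranked = sorted(pair_freqs, key=pair_freqs.get)          # least to most frequent
    buckets = [ranked[i:i + bucket_size] for i in range(0, len(ranked), bucket_size)]
    return [p for b in buckets for p in rng.sample(b, min(sample_per_bucket, len(b)))]

# pair_freqs would map ("quiet", "teenager") -> Web 1T bigram count (illustrative).
```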

We label word pairs using Amazon Mechanical Turk (AMT) (e.g., Snow et al. (2008)). We ask three turkers to score each pair on a 1-5 scale, where 1 is the least creative and 5 is the most creative. We then obtain the final creativity scale score by averaging the scores over the 3 users. In addition, we ask turkers a series of yes/no questions to help them determine whether the given pair is creative or not.8 We determine the final label of a word pair based on two scores, the creativity scale score and the yes/no question-based score. If the creativity scale score is 4 or 5 and the question-based score is positive, we label the pair as creative. Similarly, if the creativity scale score is 1 or 2 and the question-based score is negative, we label the pair as common. We discard the rest from the final dataset. This filtering process is akin to the removal of neutral sentiment in early work on sentiment analysis (e.g., Pang et al. (2002)).9 Table 2 shows the statistics of the resulting dataset.

Creative Pairs and their Frequencies: To gain insight into the stratified sample of word pairs, we plot the label (∈ {creative, common}) distribution of word pairs as a function of simple statistics, such as a range (bucket) of bigram frequencies or PMI values of the given pair of words. Both bigram frequencies and PMI scores are computed based on the Google Web 1T corpus (Brants and Franz (2006)). Figure 1 shows the results for word frequencies. As expected, word pairs with high frequencies are much more likely to be common, while word pairs with low frequencies can be either of the two. Also as expected, pairs extracted from QUOTES are relatively more likely to be creative than those from GLOSSES. In any case, it is clear that not all rare pairs are creative.

Creative Pairs and their PMI Scores: Similarly, Figure 2 plots the relation between the label distribution of word pairs and their corresponding PMI. As expected, pairs with high PMI are more likely to be common, though the trend is not as

8 E.g., “is this word combination boring and not original?” or “does it provoke unusual imagination?”.

9 Cohen's Kappa and Pearson Correlation on the filtered data are 0.69 and 0.72 respectively. Corresponding scores for the unfiltered data drop to 0.26 and 0.29 respectively. All the experiments are performed on the filtered data.

Common                  Creative
quiet teenager          inglorious success
constant longitude      thorny existence
watery juice            relaxed symmetry
noble political         sardonic destiny
diet cooking            dispassionate history
verbal interpretation   poetical enthusiasm
unwelcome situation     verbal beauty
migratory tuna          earth breathe
lousy businessman       disadvantageous peace
terrific marriage       alchemical marriage
solved issue            deep nonsense

Table 3: Sample Creative / Common Word Pairs

skewed as before.

Final Dataset: From our initial annotation study, it became apparent to us that creative pairs are very rare, perhaps not surprisingly, even among infrequent pairs. In order to build a word pair corpus with as many creative pairs as possible, we focus on infrequent word pairs for further annotation, from which we construct a larger and balanced set of creative and common word pairs, with 394 word pairs for each class. The specific construction procedure is as follows: first, combine all of the word pairs extracted from both QUOTESraw and GLOSSESraw into a single dataset, sort them by bigram frequency, and group them into consecutive frequency buckets of 40 word pairs each; finally, balance each frequency bucket by discarding word pairs with higher frequency values from the larger class in that bucket. Examples of labeled word pairs are shown in Table 3. Hereafter we use this balanced dataset of word pairs for all experiments.10
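The balancing step could look roughly like the following sketch, assuming hypothetical dictionaries of human labels and bigram frequencies; this is an illustration of the described procedure, not the authors' implementation:

```python
def balance_buckets(labeled_pairs, freqs, bucket_size=40):
    """labeled_pairs: {pair: 'creative' | 'common'}.  Within each consecutive
    frequency bucket, trim the larger class by dropping its highest-frequency
    pairs until both classes have the same count."""
    ranked = sorted(labeled_pairs, key=lambda p: freqs[p])
    balanced = []
    for i in range(0, len(ranked), bucket_size):
        bucket = ranked[i:i + bucket_size]
        creative = [p for p in bucket if labeled_pairs[p] == 'creative']
        common = [p for p in bucket if labeled_pairs[p] == 'common']
        n = min(len(creative), len(common))
        # keep the n lowest-frequency pairs of each class in this bucket
        balanced += sorted(creative, key=lambda p: freqs[p])[:n]
        balanced += sorted(common, key=lambda p: freqs[p])[:n]
    return balanced
```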

4 Creativity Measures

4.1 Information Measures

In this section we explore information theoretic measures to quantify the surprisal aspect of creative word pairs, relating to the divergent, compositional nature of creativity discussed in §2.1.

Entropy of Context Seeing a word w changes our expectation of what might follow next. Some words have stronger selective preference (higher entropy) than others.

10 The resulting dataset is available at http://www.cs.stonybrook.edu/~pkuznetsova/creativity/


Figure 3: Distribution of creative (double lines in blue) versus common (single lines in red) word pairs with varying ranges of information or polarity measures (x-axis). (Panels: (a) H(w1w2), (b) RH(w1, w2), (c) KL(w1w2, w1), (d) MI(w1, w2), plus the two polarity-difference measures Ldiff_subj(w1, w2) and Ldiff_conn(w1, w2).)

Figure 4: Conditional probability of neighboring words for “inglorious” (filled / red) and “very” (unfilled / blue).

For instance, the entropy after seeing “very” would be higher than that after seeing “inglorious”, as the former can be used in a wider variety of contexts than the latter. Figure 4 visualizes the relatively more skewed distribution of “inglorious”. We compute the entropy of the future context conditioned on w1, w2, and w1w2, which we denote as H(w1), H(w2), and H(w1w2) respectively; the latter is shown in Figure 3(a).11

11 As before, language models are drawn from the Google Web 1T corpus (Brants and Franz (2006)).

Relative Entropy Transformation In order to focus more directly on the relative change of entropy as a result of composition, we compute the Relative Entropy Transformation:

RH(w1, w2) = |H(w1) − H(w1w2)| / (H(w1) + H(w1w2))    (1)

As expected (Figure 3(b) and Table 4), this relative quantity captures creativity better than the absolute measure H(w1w2) computed above. The idea behind this measure has a connection to uncertainty reduction in the psycholinguistic literature (e.g., Frank (2010), Hale (2003), Hale (2006)).

KL divergence To capture unusual combinations of words, we compare the difference between the distributional contexts of w1 and w1w2:

KL(w1w2, w1) = Σ_{wi ∈ V} P(wi|w1, w2) log [ P(wi|w1, w2) / P(wi|w1) ]    (2)

Figure 3(c) shows that KL(w1w2, w1)12 is among the effective measures in capturing creative pairs.

12 We also compute KL(w1, w2) in a similar manner as KL(w1w2, w1).

Mutual Information Finally, we consider mutual information (Figure 3(d)):

MI(w1, w2) = Σ_{wi ∈ V} P(wi|w1, w2) log [ P(wi|w1, w2) / (P(wi|w1) · P(wi|w2)) ]    (3)

Correlation coefficients Pearson coefficients for all measures are shown in Table 4. Interestingly, information theoretic measures that compare the distributions of a word's context, such as RH(w1, w2), KL(w1w2, w1) and MI(w1, w2), capture the surprisal aspect of creativity better than simple frequencies or PMI scores that do not consider contextual changes. But even in those cases where the correlation is statistically significant, the values are not very high. We conjecture that there are two reasons for this. First, the Pearson coefficient assumes linear correlation, and hence is not sensitive enough to capture the non-linear relationships that are evident in the graphs shown in Figure 3. Second, these measures only capture the surprisal aspect of creativity, missing the other important qualities: interestingness and imaginativeness.

4.2 Sentiment and Connotation

Next we investigate the connection between creativity and sentiment, as illustrated in §2.3. We consider both sentiment (more explicit) and connotation (more implicit) words,13 and consider them with or without distinguishing the polarity (i.e., positive, negative). To determine sentiment and connotation, we use lexicons provided by OpinionFinder (Wilson et al. (2005)) and Feng et al. (2013) respectively. We denote the polarity of a word wi as L(wi).14 When wi has a negative polarity, L(wi) is assigned a value of -1, and when wi is positive, L(wi) is equal to 1. We assume that a word is neutral when it is not in the lexicon, assigning 0 to L(wi). For a word pair w1w2 we compute the absolute difference Ldiff(w1, w2) between the polarities of the tokens in the pair, in order to catch examples such as “inglorious success”.

13 E.g., expressions such as “blue sky” or “white sand” are not sentiment-laden, but do have positive connotation.

14 We denote polarity from OpinionFinder as Lsubj and connotation as Lconn.
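A minimal sketch of the polarity features, with a toy dictionary standing in for the OpinionFinder and connotation lexicons (the lexicon entries below are invented for illustration):

```python
def polarity(word, lexicon):
    """L(w): +1 positive, -1 negative, 0 if the word is not in the lexicon."""
    return lexicon.get(word, 0)

def l_diff(w1, w2, lexicon):
    """|L(w1) - L(w2)|: large when the two words clash in polarity."""
    return abs(polarity(w1, lexicon) - polarity(w2, lexicon))

# Toy stand-in for a sentiment/connotation lexicon (illustrative, not OpinionFinder).
toy_lexicon = {"inglorious": -1, "success": 1, "quiet": 0, "teenager": 0}
print(l_diff("inglorious", "success", toy_lexicon))  # 2: polarity clash
print(l_diff("quiet", "teenager", toy_lexicon))      # 0: no clash
```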

Measure               Corr Coeff   p-value*    adj p-value**

pointwise, noncontextual
Freq(w1w2)            0.014        0.67        0.86
PMI(w1, w2)           0.011        0.75        0.86

information theoretic, contextual
H(w1)                 -0.038       0.26        0.49
H(w2)                 -0.126       0.00019     0.00083
H(w1w2)               0.013        0.71        0.86
RH(w1, w2)            0.113        0.00081     0.0024
KL(w1w2, w1)          0.134        7.152e-05   0.00054
KL(w1, w2)            -0.080       0.018       0.039
MI(w1, w2)            0.125        0.00022     0.00083

sentiment & connotation
Lsubj(w1)             0.006        0.87        0.87
Lsubj(w2)             0.031        0.36        0.60
Ldiff_subj(w1, w2)    0.168        6.67e-07    1.00e-05
Lconn(w1)             0.023        0.49        0.74
Lconn(w2)             0.008        0.80        0.86
Ldiff_conn(w1, w2)    0.082        0.015       0.038

Table 4: Pearson correlation between various measures and creativity of word pairs. Boldface denotes statistical significance (p ≤ 0.05).
note *: Two-tailed p-value, 394 word pairs per class
note **: We used the Benjamini-Hochberg method to adjust p-values for multiple tests

Table 4 shows the Pearson coefficients for the sentiment and connotation based measures. It turns out that the polarity of each word on its own does not have a high impact on the creativity of a word pair. Rather, it is the difference between the two words that gives rise to the sense of creativity.

4.3 Learning to Recognize Creativity

Now we put together all measures explored in §4.1 and §4.2 in a supervised-learning framework. As expected, the combination of the various measures, rather than any one alone, leads to the best performance:

~F12 = [RH(w1, w2); KL(w1, w2); H(w1w2); Ldiff_conn(w1, w2); PMI(w1, w2); H(w2); KL(w1w2, w1); KL(w2, w1); Ldiff_subj(w1, w2); MI(w1, w2); Freq(w1w2); H(w1)]

Table 5 shows the performance of the above feature vector with 12 features using libsvm (Chang and Lin, 2011). We use C-Support Vector Classification (C-SVC). Performance is reported in accuracy using 5-fold cross validation.15

15 Among these 12 features, the feature selection algorithm of Chen and Lin (2005) determines that the two most important ones are RH(w1, w2) and KL(w1, w2).
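A hedged sketch of this setup using scikit-learn, whose SVC class wraps libsvm's C-SVC; creativity_features is a hypothetical helper returning the 12 measures for a pair, and the hyperparameters are illustrative rather than those tuned in the paper:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate(pairs, labels, creativity_features):
    """5-fold cross-validated accuracy of a C-SVC over the 12 creativity measures.

    pairs:  list of (w1, w2) tuples
    labels: array of 0/1 labels (1 = creative)
    creativity_features(w1, w2): hypothetical helper returning the 12 measures
        (RH, KL, H, MI, PMI, Freq, Ldiff, ...) described above.
    """
    X = np.array([creativity_features(w1, w2) for (w1, w2) in pairs])
    clf = make_pipeline(StandardScaler(), SVC(C=1.0))  # C-SVC, libsvm under the hood
    scores = cross_val_score(clf, X, np.asarray(labels), cv=5, scoring="accuracy")
    return scores.mean()
```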


5 Learning Creative Pairs with Distributional Semantic Vectors

The measures explored in §4 were largely uninformed of the distributional semantic dimensions of each word. However, in order to pursue the conceptual aspect of creativity illustrated in §2.2, that is, the notion of semantic subspaces that are inherently futile or fruitful for creativity, we need to incorporate semantic representations more directly. We therefore explore the use of distributional vector space models. Another goal of this section is an additional learning-based investigation of the compositional nature of creative word pairs, complementing the investigation in §4, which focused on the compositional aspect of creativity described in §2.1.

With the above goals in mind, in what follows we explore three different ways to learn the compositional aspect of creative word pairs: (1) learning with explicit compositional vector operations (§5.1), (2) learning nonlinear composition via kernels (§5.2), and (3) learning nonlinear composition via deep learning (§5.3). Note that in all these approaches, the notion of a creative semantic subspace is integrated indirectly, as the feature representation always incorporates the resulting (composed) vector representations.

Baseline & Configuration We consider the concatenation of two word vectors [~w1; ~w2] as the baseline, since it can be viewed as what simple bag-of-words features would be. Since the creative pair dataset is not yet at scale, we choose to work with vector space models of reduced dimensionality. We experimented with both Non-Negative Sparse Embedding (Murphy et al. (2012)) and the neural semantic vectors of Huang et al. (2012), but report experiments with the latter only, as those gave us slightly better results.

5.1 Compositional Vector Operations

We consider the following compositional vector operations, inspired by recent studies on compositional distributional semantics (e.g., Guevara (2011), Clarke (2012), Mitchell and Lapata (2008), Widdows (2008)).

• ADD: ~w1 + ~w2

• DIFF: abs(~w1 − ~w2)

• MULT: ~w1 .∗ ~w2

• MIN: min{~w1, ~w2}

• MAX: max{~w1, ~w2}

All operations take two input vectors in Rn and output a vector in Rn. Each operation is applied element-wise. We then perform binary classification over the composed vectors using a linear SVM. Besides using features based on a single composed vector, we also experiment with features based on concatenating multiple composed vectors, in the hope of capturing more diverse compositional operations. See Table 5 for more details and experimental results.
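These element-wise operations are straightforward to express with NumPy; the sketch below assumes pre-trained word vectors w1 and w2 are given as arrays and simply illustrates the feature construction, not the authors' exact pipeline:

```python
import numpy as np

def compose(w1, w2, op):
    """Element-wise composition of two word vectors in R^n (returns a vector in R^n)."""
    ops = {
        "ADD":  w1 + w2,
        "DIFF": np.abs(w1 - w2),
        "MULT": w1 * w2,
        "MIN":  np.minimum(w1, w2),
        "MAX":  np.maximum(w1, w2),
    }
    return ops[op]

def features(w1, w2, op=None):
    """Feature vector fed to the linear SVM: a composed vector, optionally
    concatenated with the raw vectors, e.g. [abs(w1 - w2); w1; w2]."""
    if op is None:
        return np.concatenate([w1, w2])          # baseline [w1; w2]
    return np.concatenate([compose(w1, w2, op), w1, w2])
```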

5.2 Learning Nonlinear Composition via Kernels

As an alternative to explicit vector compositions, we also probe implicit operations based on non-linear combinations of semantic dimensions using kernels (e.g., Scholkopf and Smola (2002), Shawe-Taylor and Cristianini (2004)), in particular the following (a short sketch follows the list):

• Polynomial: K(x, y) = (γ x^T y + r)^d, γ > 0

• RBF: K(x, y) = exp(−γ ‖x − y‖^2), γ > 0

• Laplacian: K(x, y) = exp(−γ ‖x − y‖), γ > 0
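A sketch of the three kernels with scikit-learn: the polynomial and RBF kernels are built in, and the Laplacian kernel can be supplied as a callable via sklearn.metrics.pairwise.laplacian_kernel. Hyperparameter values are illustrative, not those used in the paper:

```python
from sklearn.metrics.pairwise import laplacian_kernel
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def kernel_classifiers(gamma=0.1):
    """The three non-linear kernels probed in §5.2 (hyperparameters illustrative)."""
    return {
        "Polynomial": SVC(kernel="poly", degree=3, gamma=gamma, coef0=1.0),
        "RBF":        SVC(kernel="rbf", gamma=gamma),
        "Laplacian":  SVC(kernel=lambda A, B: laplacian_kernel(A, B, gamma=gamma)),
    }

def compare(X, y):
    """X: rows are concatenated pair vectors [w1; w2]; y: 1 = creative, 0 = common."""
    return {name: cross_val_score(clf, X, y, cv=5).mean()
            for name, clf in kernel_classifiers().items()}
```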

5.3 Learning Non-linear Composition via Deep Learning

Yet another alternative for modeling non-linear composition is deep learning. To learn a non-linear transformation of a pair of semantic vectors, we explore the use of autoencoders (e.g., Pollack (1990), Voegtlin and Dominey (2005)). We follow the formulation of vector composition proposed by Socher et al. (2011), except that we do not stack autoencoders for recursion. More specifically, given the two input words ~w1, ~w2 ∈ Rn, we want to learn a vector space representation of their combination ~p ∈ Rn. The recursive autoencoder (RAE) of Socher et al. (2011) models the composition of a word pair as a non-linear transformation of their concatenation [~w1; ~w2]:

~p = f(M1 [~w1; ~w2] + ~b1)    (4)

where M1 ∈ Rn×2n. After adding a bias term ~b1 ∈ Rn, a nonlinear element-wise function f, such as tanh, is applied to the resulting vector. The representation ~p of the word pair is then fed into a reconstruction layer to reconstruct the two input vectors,


Methods                                           Accuracy

Creativity measures (§4.3)
~F12                                              62.30

Baseline: vector concatenation (no composition)
[~w1; ~w2]                                        67.51

Explicit vector composition (§5.1)
~w1 + ~w2                                         66.62
abs(~w1 − ~w2)                                    60.03
min{~w1, ~w2}                                     66.08
max{~w1, ~w2}                                     64.97
~w1 .∗ ~w2                                        56.34
[abs(~w1 − ~w2); ~w1; ~w2]                        69.54
[max{~w1, ~w2}; ~w1; ~w2]                         68.02

Non-linear composition via kernels (§5.2)
Polynomial                                        65.86
RBF                                               69.16
Laplacian                                         68.15

Non-linear composition via deep learning (§5.3)
f(M1[~w1; ~w2] + ~b1)                             67.25

Table 5: Performance comparison of creativity classifiers.

Incorrectly predicted word pairs (y∗)   Semantically close word pairs (y∗)

CONFUSION DUE TO WORD SIMILARITY (20/42)
“entire carton” (-)                     “whole angst” (+)
“outdated tax” (-)                      “graconian tax” (+)
“dismissive way” (-)                    “amorous way” (+)
“insidious part” (+)                    “leather part” (-)

CONFUSION DUE TO SUBJECTIVE LABELING (8/42)
“independent religion” (+)              “wonderful religion” (-)

WORD SENSE DISAMBIGUATION PROBLEMS (2/42)
“fiscal cliff” (-)                      “winding lake” (+)
“opera window” (+)                      “work-shop floor” (-)

Table 6: Error analysis: y∗ denotes the true label. For each incorrectly predicted word pair (left column), we show an example of a semantically close word pair (right column) with the opposite true label that might have confused learning.

and a softmax layer to predict the probability of the word pair being creative or not creative. We initialize the word vectors using the pre-learned vector space representations of Huang et al. (2012).
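The forward pass of this composition can be sketched with NumPy as follows; the weights are random and untrained, the dimensionality is arbitrary, and the sketch only illustrates Eq. (4) plus the reconstruction and softmax heads described above (training, which the paper performs following Socher et al. (2011), is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                                   # word-vector dimensionality (illustrative)
M1, b1 = rng.normal(scale=0.1, size=(n, 2 * n)), np.zeros(n)       # composition, Eq. (4)
M2, b2 = rng.normal(scale=0.1, size=(2 * n, n)), np.zeros(2 * n)   # reconstruction layer
Ws, bs = rng.normal(scale=0.1, size=(2, n)), np.zeros(2)           # softmax head

def forward(w1, w2):
    """One (untrained) forward pass: compose, reconstruct, and classify a word pair."""
    x = np.concatenate([w1, w2])
    p = np.tanh(M1 @ x + b1)             # ~p = f(M1 [w1; w2] + b1)
    x_hat = M2 @ p + b2                  # reconstruction of [w1; w2]
    logits = Ws @ p + bs
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                 # P(creative), P(not creative)
    recon_loss = np.sum((x - x_hat) ** 2)
    return p, probs, recon_loss
```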

5.4 Experimental Results

Table 5 shows the performance comparison of the different feature sets and algorithms. In all cases, parameters are tuned on the training portion of the data. We see that simple vector composition alone does not perform better than vector concatenation [~w1; ~w2]. However, combining abs(~w1 − ~w2) or max{~w1, ~w2} with [~w1; ~w2] performs better than concatenation alone. Kernels with non-linear transformations of the feature space generally improve performance over the linear SVM, suggesting that kernels capture some of the interesting compositional aspects of creativity that are not covered by the explicit vector compositions considered in §5.1. We also experimented with additional features derived from the creativity measures explored in §4, but we omit their results as they did not improve performance. Unfortunately, learning nonlinear composition with deep learning did not yield better results. We conjecture that this is due to the small dataset we were able to obtain for this study, which may not have been enough to learn the rich parameter space of the nonlinear transformation matrix.

6 Analysis and Insight

Error analysis We manually inspected 42 randomly chosen error cases and characterized the potential causes of those errors. Examples of the three types of errors are shown in Table 6. For each incorrectly predicted word pair, we also show a semantically close word pair with the opposite true label that might have confused the learning algorithm.

Visualization To gain additional insight, we project word pairs, represented by their vector concatenations, onto a 2-dimensional space using t-Distributed Stochastic Neighbor Embedding (van der Maaten and Hinton (2008)). Figure 5 shows some of the interesting regions of the projection: some regions are relatively futile for creative phrases (e.g., regions involving simple adjectives such as “good” and “bad”, or regions corresponding to legal terms), while some regions are relatively more fruitful (e.g., regions involving abstract adjectives such as “infinite”, “universal”, “fundamental”). There are also many other regions (e.g., in the vicinity of “true”, “perfect” or “intelligent” in Figure 5) where the separation between creative and noncreative phrases is not as prominent. In those regions, compositional aspects would play a bigger role in determining creativity than memorizing fruitful semantic subspaces.
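A minimal sketch of this projection with scikit-learn's t-SNE and matplotlib, assuming pair_vectors holds the concatenated [~w1; ~w2] vectors, labels marks creative pairs with 1, and pair_names holds the surface strings (all hypothetical inputs; perplexity and other settings are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_pairs(pair_vectors, labels, pair_names):
    """Project concatenated [w1; w2] vectors to 2-D and color points by label."""
    X2 = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(pair_vectors)
    creative = np.asarray(labels) == 1
    plt.scatter(X2[creative, 0], X2[creative, 1], c="blue", label="creative")
    plt.scatter(X2[~creative, 0], X2[~creative, 1], c="red", label="common")
    for (x, y), name in zip(X2, pair_names):
        plt.annotate(name, (x, y), fontsize=6)   # label each point with its word pair
    plt.legend()
    plt.show()
```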


[Figure 5 scatter plot omitted. Visible example pairs include “infinite promise”, “universal aspiration”, “absolute barbarism”, “invisible empire”, “good marathon”, “normal professor”, “legal slavery”, and “judicial verdict”, among others.]

Figure 5: Creative (blue bold) and not creative (red italic) word pairs graph.

7 Related Work

Among computational approaches that touch on linguistic creativity, many have focused on metaphor (e.g., Dunn (2013), Krishnakumaran and Zhu (2007), Mashal et al. (2007), Rumbell et al. (2008), Rentoumi et al. (2012), Mashal et al. (2009)). Other linguistic devices and phenomena related to creativity include irony (e.g., Davidov et al. (2010), Gonzalez-Ibanez et al. (2011), Filatova (2012)), neologisms (e.g., Cartoni (2008)), humor (e.g., Mihalcea and Strapparava (2005), Purandare and Litman (2006)), and similes (e.g., Hao and Veale (2010)).

Veale (2011) proposed the new task of creative text retrieval, to harvest expressions that potentially convey the same meaning as a query phrase in a fresh or unusual way. Our work contributes to the retrieval process by recognizing more creative phrases. Ozbal and Strapparava (2012) explored automatic creative naming of commercial products and services, focusing on the generation of creative phrases within a specific domain. Costello (2002) investigated the cognitive process that guides people's choice of words when making up a novel noun-noun compound. In contrast, we present a data-driven investigation of quantifying creativity in lexical composition. Memorability is loosely related to linguistic creativity (Danescu-Niculescu-Mizil et al. (2012)), as some creative quotes may be more memorable, but not all creative phrases are memorable and vice versa.

8 Conclusion

We presented the first study that focuses on learning and quantifying creativity in lexical composition, exploring statistical techniques motivated by several theories and hypotheses of creativity, ranging from divergent thinking and compositional structure to the creative semantic subspace and the connection to sentiment and connotation. Our experimental results suggest the viability of learning creative language, and point to promising directions for future research.

Acknowledgments This research was supported in part by the Stony Brook University Office of the Vice President for Research, and in part by a gift from Google. We thank the anonymous reviewers for insightful comments and suggestions.

References

T. Amabile. 1997. Motivating creativity in organizations: On doing what you love and loving what you do. California Management Review, 40(1):39–58.

Marco Baroni and Roberto Zamparelli. 2010. Nouns are vectors, adjectives are matrices: representing adjective-noun constructions in semantic space. In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, pages 1183–1193, Stroudsburg, PA, USA.

Thorsten Brants and Alex Franz. 2006. Web 1T 5-gram corpus version 1.1. Google Inc.

Ronald Carter and Michael McCarthy. 2004. Talking, creating: interactional language, creativity, and context. Applied Linguistics, 25(1):62–88.

Bruno Cartoni. 2008. Lexical resources for automatic translation of constructed neologisms: the case study of relational adjectives. In LREC.

Chih-Chung Chang and Chih-Jen Lin. 2011. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3):27.

Yi-Wei Chen and Chih-Jen Lin. 2005. Combining SVMs with various feature selection strategies. In Taiwan University. Springer-Verlag.

Noam Chomsky. 1965. Aspects of the Theory of Syntax, volume 11. The MIT Press.

Carol Chomsky. 1976. Creativity and innovation in child language. Journal of Education, Boston.

Daoud Clarke. 2012. A context-theoretic framework for compositionality in distributional semantics. Computational Linguistics, 38(1):41–71.

W.M. Cohen and D.A. Levinthal. 1990. Absorptive Capacity: A New Perspective on Learning and Innovation. Administrative Science Quarterly, 35(1).

Fintan J. Costello. 2002. Investigating creative language: People's choice of words in the production of novel noun-noun compounds. In Proceedings of the 24th Annual Conference of the Cognitive Science Society.

Arthur Cropley. 2006. In praise of convergent thinking. Creativity Research Journal, 18(3):391–404.

Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg, and Lillian Lee. 2012. You had me at hello: How phrasing affects memorability. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers, Volume 1, pages 892–901. Association for Computational Linguistics.

Dmitry Davidov, Oren Tsur, and Ari Rappoport. 2010. Semi-supervised recognition of sarcastic sentences in Twitter and Amazon. In Proceedings of the Fourteenth Conference on Computational Natural Language Learning, pages 107–116. Association for Computational Linguistics.

Jonathan Dunn. 2013. What metaphor identification systems can tell us about metaphor-in-language. Meta4NLP 2013, page 1.

Song Feng, Jun Seok Kang, Polina Kuznetsova, and Yejin Choi. 2013. Connotation lexicon: A dash of sentiment beneath the surface meaning. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Sofia, Bulgaria, August. Association for Computational Linguistics.

Elena Filatova. 2012. Irony and sarcasm: Corpus generation and analysis using crowdsourcing. In LREC, pages 392–398.

Stefan L. Frank. 2010. Uncertainty reduction as a measure of cognitive processing effort. In Proceedings of the 2010 Workshop on Cognitive Modeling and Computational Linguistics, pages 81–89. Association for Computational Linguistics.

Sigmund Freud. 1908. Creative writers and day-dreaming. Standard Edition, 9:143–153.

Susan R. Fussell and Mallie M. Moss. 1998. Figurative language in descriptions of emotional states. In Social and Cognitive Approaches to Interpersonal Communication.

Pablo Gervas. 2010. Engineering linguistic creativity: Bird flight and jet planes. In Proceedings of the NAACL HLT 2010 Second Workshop on Computational Approaches to Linguistic Creativity, pages 23–30. Association for Computational Linguistics.

Katie Glaskin. 2011. Dreams, memory, and the ancestors: creativity, culture, and the science of sleep. Journal of the Royal Anthropological Institute, 17(1):44–62.

Andrew Goatly. 1997. The Language of Metaphors. Routledge.

Roberto Gonzalez-Ibanez, Smaranda Muresan, and Nina Wacholder. 2011. Identifying sarcasm in Twitter: A closer look. In ACL (Short Papers), pages 581–586. Citeseer.

Emiliano Guevara. 2011. Computing semantic compositionality in distributional semantics. In Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011), pages 135–144. Citeseer.

John Hale. 2003. The information conveyed by words in sentences. Journal of Psycholinguistic Research, 32(2):101–123.

John Hale. 2006. Uncertainty about the rest of the sentence. Cognitive Science, 30(4):643–672.

Yanfen Hao and Tony Veale. 2010. An ironic fist in a velvet glove: Creative mis-representation in the construction of ironic similes. Minds and Machines, 20(4):635–650.

Eric H. Huang, Richard Socher, Christopher D. Manning, and Andrew Y. Ng. 2012. Improving Word Representations via Global Context and Multiple Word Prototypes. In Annual Meeting of the Association for Computational Linguistics (ACL).


Saisuresh Krishnakumaran and Xiaojin Zhu. 2007. Hunting elusive metaphors using lexical resources. In Proceedings of the Workshop on Computational Approaches to Figurative Language, pages 13–20. Association for Computational Linguistics.

George Lakoff and Mark Johnson. 1980. Metaphors We Live By. University of Chicago Press, Chicago.

N. Mashal, M. Faust, T. Hendler, and M. Jung-Beeman. 2007. An fMRI investigation of the neural correlates underlying the processing of novel metaphoric expressions. Brain and Language, pages 115–126.

N. Mashal, M. Faust, T. Hendler, and M. Jung-Beeman. 2009. An fMRI study of processing novel metaphoric sentences. Laterality, (1):30–54.

Janet Maybin and Joan Swann. 2007. Everyday creativity in language: Textuality, contextuality, and critique. Applied Linguistics, 28(4):497–517.

Robert R. McCrae. 1987. Creativity, divergent thinking, and openness to experience. Journal of Personality and Social Psychology, 52(6):1258.

Susan M. McCurry and Steven C. Hayes. 1992. Clinical and experimental perspectives on metaphorical talk. Clinical Psychology Review, 12(7):763–785.

Rada Mihalcea and Carlo Strapparava. 2005. Making computers laugh: Investigations in automatic humor recognition. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 531–538, Vancouver, British Columbia, Canada, October. Association for Computational Linguistics.

Jeff Mitchell and Mirella Lapata. 2008. Vector-based models of semantic composition. In Proceedings of ACL-08: HLT, pages 236–244.

Jeff Mitchell and Mirella Lapata. 2009. Language models based on semantic composition. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1, pages 430–439. Association for Computational Linguistics.

Brian Murphy, Partha Pratim Talukdar, and Tom Mitchell. 2012. Learning effective and interpretable semantic models using non-negative sparse embedding. In COLING, pages 1933–1950.

Edward Necka. 1999. Memory and creativity. Encyclopedia of Creativity, ed. by M.A. Runco and S.R. Pritzker, 2:193–99.

Gozde Ozbal and Carlo Strapparava. 2012. A computational approach to the automation of creative naming. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 703–711, Jeju Island, Korea, July. Association for Computational Linguistics.

Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up?: sentiment classification using machine learning techniques. In Proceedings of the ACL-02 Conference on Empirical Methods in Natural Language Processing, Volume 10, pages 79–86. Association for Computational Linguistics.

J. B. Pollack. 1990. Recursive distributed representation. Artificial Intelligence, 46:77–105.

Amruta Purandare and Diane Litman. 2006. Humor: Prosody analysis and automatic recognition for F*R*I*E*N*D*S*. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pages 208–215. Association for Computational Linguistics.

Vassiliki Rentoumi, George A. Vouros, Vangelis Karkaletsis, and Amalia Moser. 2012. Investigating metaphorical language in sentiment analysis: A sense-to-sentiment perspective. ACM Trans. Speech Lang. Process., 9(3):6:1–6:31, November.

Sebastian Rudolph and Eugenie Giesbrecht. 2010. Compositional matrix-space models of language. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 907–916. Association for Computational Linguistics.

Tim Rumbell, John Barnden, Mark Lee, and Alan Wallington. 2008. Affect in metaphor: Developments with WordNet.

Bernhard Scholkopf and Alexander J. Smola. 2002. Learning with Kernels. The MIT Press.

John Shawe-Taylor and Nello Cristianini. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press.

Ekaterina Shutova. 2010. Models of metaphor in NLP. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL '10, pages 688–697, Stroudsburg, PA, USA. Association for Computational Linguistics.

Rion Snow, Brendan O'Connor, Daniel Jurafsky, and Andrew Y. Ng. 2008. Cheap and fast—but is it good?: evaluating non-expert annotations for natural language tasks. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 254–263. Association for Computational Linguistics.

Richard Socher, Jeffrey Pennington, Eric H. Huang, Andrew Y. Ng, and Christopher D. Manning. 2011. Semi-supervised recursive autoencoders for predicting sentiment distributions. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 151–161. Association for Computational Linguistics.

L.J.P. van der Maaten and G.E. Hinton. 2008. Visualizing high-dimensional data using t-SNE.

Tony Veale, Yanfen Hao, and Guofu Li. 2008. Multilingual harvesting of cross-cultural stereotypes. In ACL, pages 523–531.


Tony Veale. 2011. Creative language retrieval: A robust hybrid of information retrieval and linguistic creativity. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pages 278–287, Portland, Oregon, USA, June. Association for Computational Linguistics.

Thomas Voegtlin and Peter F. Dominey. 2005. Linear recursive distributed representations. Neural Networks, 18(7):878–895, September.

Dominic Widdows. 2008. Semantic vector products: Some initial investigations. In Proceedings of the Second AAAI Symposium on Quantum Interaction.

Theresa Wilson, Paul Hoffmann, Swapna Somasundaran, Jason Kessler, Janyce Wiebe, Yejin Choi, Claire Cardie, Ellen Riloff, and Siddharth Patwardhan. 2005. OpinionFinder: A system for subjectivity analysis. In Proceedings of HLT/EMNLP on Interactive Demonstrations, pages 34–35. Association for Computational Linguistics.

Ainur Yessenalina and Claire Cardie. 2011. Compositional matrix-space models for sentiment analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 172–182. Association for Computational Linguistics.

Xiaojin Zhu, Zhiting Xu, and Tushar Khot. 2009. How creative is your writing? A linguistic creativity measure from computer science and cognitive psychology perspectives. In Proceedings of the Workshop on Computational Approaches to Linguistic Creativity, pages 87–93. Association for Computational Linguistics.
