Ordinal Common-sense Inference

Transactions of the Association for Computational Linguistics
Vancouver, July 31st, 2017

Johns Hopkins University

Sheng Zhang, Kevin Duh, Benjamin Van Durme, Rachel Rudinger

New task: Ordinal Common-sense Inference

New corpus: JOCI [joe-cee]

39k examples of (Context, Hypothesis, Subjective likelihood)

“We use words to talk about the world. Therefore, to understand what words mean, we must have a prior explication of how we view the world.”

-- Hobbs (1987)

Common Sense for language
▷ Definitions
▷ Common-sense inference is prevalent
▷ Characterize common-sense inference

New task: Ordinal Common-sense Inference

Common Sense from language

Definitions

Shared Knowledge

If …
I know p
You know p
I know that you know p
You know that I know p
……

Then p is shared knowledge

Background Knowledge

If p is shared across some group, then we say that p is background knowledge.

Common Sense

When that group is really big, then p is called: commonly known background knowledge

or just simply: common sense

Common Sense is Prevalent

Example

Rachel walked up to a house. She knocked on the door.

What door?
Houses have doors.

Common-sense Inference

“Inferences… though conveyed by language… draw on one’s knowledge of natural objects and events that goes beyond one’s knowledge of language itself.”

-- Clark (1975)

“a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows”

-- McCarthy (1959)

Characterize Common-sense Inference

Textual Inference
Recognizing Textual Entailment (RTE) (Dagan et al. 2006)

Text: “China launched a meteorological satellite into orbit Wednesday.”

Hypothesis:
Entailment: China launched a satellite.
Contradiction: China canceled the satellite launch.
Neutral: China owns the satellite.
…
Neutral: The orbit is around Neptune.

(Example adapted from Clark et al., 2003)

Non-entailing Inference

“China launched a meteorological satellite into orbit Wednesday.”

China owns the satellite.
The orbit is around Neptune.

Not logically entailed, but more or less likely to be true in a given context.

Context ↝ Hypothesis
Rather than Entailment / Contradiction: Subjective Likelihood

Subjective Likelihood
Continuous category (Saurí and Pustejovsky, 2009)
Discrete values: Very likely, Likely, Plausible, Technically possible, Impossible

Ordinal Common-sense Inference

Context: “China launched a meteorological satellite into orbit Wednesday.”

Hypothesis:
Very likely: There was a rocket launch.
Likely: China owns the satellite.
Plausible: The satellite weighs 10,000 pounds.
Technically possible: The orbit is around Neptune.
Impossible: The satellite was caught by a bird.

(Example adapted from Clark et al., 2003)
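The ordinal labels in the satellite example can be sketched as a small data structure. This is only an illustration: the class names `Likelihood` and `Example` and the integer codes 1 to 5 are assumptions made here, not something the slides specify. What matters is that the labels are ordered, so they can be compared and sorted.

```python
from enum import IntEnum
from typing import NamedTuple


class Likelihood(IntEnum):
    """Five-point ordinal scale; the integer codes are an assumption."""
    IMPOSSIBLE = 1
    TECHNICALLY_POSSIBLE = 2
    PLAUSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


class Example(NamedTuple):
    """One (Context, Hypothesis, Label) triple."""
    context: str
    hypothesis: str
    label: Likelihood


CONTEXT = "China launched a meteorological satellite into orbit Wednesday."

# Hypotheses and labels copied from the slide, listed in arbitrary order.
examples = [
    Example(CONTEXT, "The orbit is around Neptune.", Likelihood.TECHNICALLY_POSSIBLE),
    Example(CONTEXT, "There was a rocket launch.", Likelihood.VERY_LIKELY),
    Example(CONTEXT, "The satellite was caught by a bird.", Likelihood.IMPOSSIBLE),
    Example(CONTEXT, "China owns the satellite.", Likelihood.LIKELY),
    Example(CONTEXT, "The satellite weighs 10,000 pounds.", Likelihood.PLAUSIBLE),
]

# Because the scale is ordinal, hypotheses can be ranked by likelihood.
ranked = sorted(examples, key=lambda ex: ex.label, reverse=True)
for ex in ranked:
    print(f"{ex.label.name:<22}{ex.hypothesis}")
```

A three-way RTE label set has no such ordering between Entailment, Neutral, and Contradiction beyond the two extremes, which is the contrast the ordinal formulation draws.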

Common Sense from language

Approaches

• Human Elicitation

• Text Mining

Human Elicitation

Expert elicitation is expensive.
FRACAS (Cooper et al., 1996)

Crowdsourced elicitation is scalable.
SNLI (Bowman et al., 2015)
ROCStories (Mostafazadeh et al., 2016)

Elicitation Bias

“Features such as <is larger than a tulip> or <moves faster than an infant>, although logically possible, do not occur in human responses … people are capable of verifying that a <dog is larger than a pencil>.”

-- McRae et al. (2005)

Text Mining

Reporting Bias: P(people write about X) ≠ P(X in the real world)

[Figure: frequencies of “A person may x”]

(Van Durme 2010, Gordon and Van Durme, 2013)

Our Approach (Data for Ordinal Common-sense Inference)

No elicitation bias
No reporting bias

(Schubert 2002, Van Durme and Schubert 2008)

[Diagram: Text and KB → Common-sense Inference Candidates (Context ↝ Hypothesis) → Ordinal Common-sense Inference, with stages labeled Automated Construction and Crowdsourced Annotation]

Automated Construction

Text → KB

Abstracted Propositions
[person] borrow [book] from [library]

Propositional Templates (book)
person borrow ___ from library
person buy ___
…

No frequency

publication.n.01
  hyponym: magazine.n.01
  hyponym: collection.n.02
  hyponym: book.n.01

Decision Trees
[Figure: decision tree with nodes “person buy ___”, “person subscribe to ___”, and “person borrow ___ from library”, and yes/no branches]
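The step from abstracted propositions to propositional templates can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `propositional_templates` is hypothetical, and the second KB entry, “[person] buy [book]”, is an assumed source for the template “person buy ___” shown on the slide.

```python
import re

SLOT = re.compile(r"\[(\w+)\]")  # matches typed arguments such as [book]


def propositional_templates(proposition: str, target: str) -> list[str]:
    """Blank out each [target] argument in turn; other arguments keep their type name."""
    matches = list(SLOT.finditer(proposition))
    templates = []
    for blank in matches:
        if blank.group(1) != target:
            continue
        pieces, last = [], 0
        for m in matches:
            pieces.append(proposition[last:m.start()])
            pieces.append("___" if m is blank else m.group(1))
            last = m.end()
        pieces.append(proposition[last:])
        templates.append("".join(pieces))
    return templates


# Toy KB: the first entry is from the slide, the second is assumed.
kb = [
    "[person] borrow [book] from [library]",
    "[person] buy [book]",
]

book_templates = [t for p in kb for t in propositional_templates(p, "book")]
print(book_templates)
```

Each template is a property that may hold of a type such as book; the yes/no questions in the decision tree on the slide are phrased over templates of this form.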

Common-sense Inference Candidates

Context: A child is reading books on a park bench.

KB: “___ be borrowed from a library”

Hypothesis: The books are borrowed from a library.
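Turning a KB template into a hypothesis for a given context can be sketched as a simple slot fill. This is a toy rule assumed here for illustration: the function `realize_hypothesis` is hypothetical, it handles only templates that begin with “___ be ”, and the caller supplies the noun phrase and its number. The slides do not describe how the actual generation step handles agreement or noun-phrase selection.

```python
def realize_hypothesis(template: str, noun_phrase: str, plural: bool) -> str:
    """Fill the ___ slot with a noun phrase and inflect the bare copula 'be'."""
    prefix = "___ be "
    if not template.startswith(prefix):
        raise ValueError("this sketch only handles templates starting with '___ be '")
    copula = "are" if plural else "is"
    return f"{noun_phrase} {copula} {template[len(prefix):]}."


context = "A child is reading books on a park bench."
hypothesis = realize_hypothesis("___ be borrowed from a library", "The books", plural=True)
print(context, "~>", hypothesis)
```

The resulting (Context, Hypothesis) pair is only a candidate; its likelihood label comes later from annotation.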

Automatic Generation

[Diagram: Text and KB → Common-sense Inference Candidates (Context ↝ Hypothesis); the diagram also shows SNLI]

Crowdsourced Annotation

Common-sense Inference Candidates (Context ↝ Hypothesis) → Ordinal Common-sense Inference

Ordinal Label Annotation

Amazon Mechanical Turk

Initial Sentence (Context): Mary saw a car.

1. The following statement is ___ to be true during or shortly after the context of the initial sentence.

Hypothesis: The car was made of gold.

Selected label: tech possible

Option: This statement does not make sense.

JOCI corpus (JHU Ordinal Common-sense Inference)

39k (Context, Hypothesis, Label)

Major: Context from SNLI/ROCStories; Hypothesis from Our Approach
Comparing: Context from SNLI, ROCStories, COPA; Hypothesis from SNLI, ROCStories, COPA

JOCI

Scalable & Reliable

Average annotation time per example: 20.71s
Average cost per example: 1.99¢
Average Cohen’s κ: 0.54
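For readers unfamiliar with the agreement statistic, here is a minimal sketch of Cohen's κ for two annotators. It is the standard unweighted form, κ = (p_o − p_e) / (1 − p_e), which treats labels as nominal; the slides do not say whether a weighted variant was used for the ordinal labels, and the two label lists below are invented toy data, not JOCI annotations.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels on the 1-5 ordinal scale (hypothetical, for illustration only).
annotator_1 = [5, 4, 3, 2, 1, 5, 3, 3]
annotator_2 = [5, 4, 3, 3, 1, 4, 3, 2]
print(round(cohens_kappa(annotator_1, annotator_2), 3))
```

A value of 0 means agreement no better than chance and 1 means perfect agreement, which gives a scale for reading the reported average of 0.54.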

JOCI

Scalable & Reliable

Capable of evaluating/training inference systems
• Label Distribution

JOCI

Label Distribution is Balanced

[Figure: JOCI label distribution over Very-likely, Likely, Plausible, Technically possible, Impossible]

SNLI

[Figure: label distribution for SNLI, with panels Entailment, Neutral, and Contradiction; labels shown: Very-likely, Plausible, Tech-possible, Impossible]

ROCStories

[Figure: label distribution for ROCStories over Very-likely, Likely, Plausible, Technically possible, Impossible]

Our Goal for JOCI

Scalable & Reliable

Capable of evaluating/training inference systems
• Label Distribution
• Baselines

Baseline(JOCI) > Baseline(SNLI/ROCStories)

Common Sense for language
New task: Ordinal Common-sense Inference

Common Sense from language
▷ Mining common sense is challenging
- Human Elicitation (Elicitation bias)
- Text Mining (Reporting bias)
▷ Our Approach: Text Mining + Crowdsourced Annotation

New corpus: JOCI

Sheng Zhang, Kevin Duh, Benjamin Van Durme, Rachel Rudinger

JOCI: http://decomp.net/common-sense-inference

Thank you!