Coh-Metrix: An Automated Measure of Text Cohesion
Danielle S. McNamara, Yasuhiro Ozuru, Max Louwerse, Art Graesser
Coh-Metrix Investigators
Co-PIs and Senior Researchers: Max Louwerse, Art Graesser, Zhiqiang Cai, Randy Floyd, Xiangen Hu, Vasili Rus
Postdocs & Staff: Rachel Best, David Dufty, Christian Hempelman, Tenaha O’Reilly, Yasuhiro Ozuru
Many students
Coh-Metrix
• Coh-Metrix v1.2 analyzes texts on many different dimensions of cohesion and language
– Input text on a web site
– Outputs 12 primary measures and over 200 additional measures
Graesser, McNamara, Louwerse, & Cai, 2004
Prior Research
• Increasing text cohesion improves memory for text content.
– Increasing argument overlap between sentences.
• Most plastics are good insulators. So are clothes you wear, like sweaters and coats.
• Most plastics are good insulators. Other good insulators are the clothes you wear, like sweaters and coats.
– Adding connectives
• For example, most plastics are good insulators.
• because, consequently, so that, in addition, however
– Adding headers and topic sentences
Prior Research
• Increasing text cohesion improves memory for text content.
• Text cohesion is particularly crucial for low-knowledge readers.
• Decreasing text cohesion helps high-knowledge readers process the text more actively and understand it at a deeper level.
– McNamara, Kintsch, Songer, & Kintsch (1996, C&I)
– McNamara & Kintsch (1996, DP)
– McNamara (2001, CJEP)
Cohesion and Coherence
• Research points to the need to consider text difficulty in terms of text cohesion and coherence.
– Cohesion is a property of the text.
– Coherence is a property of the reader’s mental representation.
• We need automated measures of cohesion and coherence.
Current Method: Readability Measures
• E.g., Flesch-Kincaid Grade Level
• Based on the work of Rudolph Flesch in the 1940’s
• Scores range from 0-12 to predict grade appropriateness
• Measure based on surface characteristics
– sentence length
– word length
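To make the "surface characteristics" point concrete, the following is a minimal sketch of the standard Flesch-Kincaid Grade Level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The function names and the vowel-group syllable counter are assumptions made for this illustration; production readability tools count syllables more carefully, so scores from this sketch will only approximate theirs.

```python
import re


def count_syllables(word: str) -> int:
    """Crude syllable estimate (an assumption): count runs of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level, driven only by sentence length and word length."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    sample = ("Most plastics are good insulators. "
              "So are clothes you wear, like sweaters and coats.")
    print(round(flesch_kincaid_grade(sample), 2))
```

Note that nothing in this calculation looks at how sentences relate to one another, which is why two versions of a text that differ in cohesion can receive similar scores.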
Goals of Coh-Metrix Tool
• Analyze texts on many different dimensions of cohesion and language
– Input text on a web site
– Outputs over 200 measures
• Focus primarily on deeper levels of meaning and cohesion, unlike standard readability formulas
• Tailor texts to students (K12, college) with different world knowledge and abilities
Computational Linguistics Modules
Lexicons
Morpho-semantics
Part-of-speech tagging
Syntactic parsing
Latent Semantic Analysis
Pattern classifiers
Corpora norms
Heart Disease Text (McNamara et al., 1996)
[Figure: Flesch-Kincaid Grade Level (axis 5.8 to 7.4, labeled easy to hard) and argument overlap (axis 0 to 0.7) for four cohesion versions of the text: high local/high global, high local/low global, low local/high global, low local/low global.]
Any disorder that stops the heart from supplying blood to the body is a threat to life. Heart disease is such a disorder.
Any disorder that stops the blood supply is a threat to life. Heart disease is very common
Cohesion and Readability Scores for 19 pairs of passages examined in 12 published studies
[Figure: Flesch-Kincaid Grade Level (axis 7.4 to 8.6, labeled easy to hard) and argument overlap (axis 0 to 0.7) for high-cohesion vs. low-cohesion texts. Data labels: 7.8 and 8.4; 0.26 and 0.45.]
List of Cohesion Publications
• Beck et al. (1984)
• Beck et al. (1991)
• Britton and Gulgoz (1989)
• Cataldo & Oakhill (2000)
• Kintsch (1990)
• Lehman & Schraw (2002)
• Linderholm et al. (2000)
• Loxterman et al. (1994)
• McNamara (2001)
• McNamara et al. (1996)
• Vidal-Abarca et al. (2000)
• Voss & Silfies (1996)
[Figure: axis labels are "Argument Overlap, adjacent, unweighted" (0.0 to 1.0), "Low vs High Cohesion" (1.00, 2.00), and "Text code".]
What variables showed a greater than 50% difference in favor of the cohesive text?
[Figure: axis labels are "Argument Overlap, adjacent, unweighted" (0.0 to 1.0), "Low vs High Cohesion" (1.00, 2.00), and "Text code". Annotations, in slide order, follow.]
Texts:
• Linderholm et al. 2000, Mademoiselle Germaine (Easy Text)
• McNamara et al. 1996, Mammal Text, Exp. 1
• Lehman & Schraw 2002, The Quest for the Northwest Passage
Variables:
• No differences
• causal, particle to verb ratio; causal connectives; LSA Sentence to Sentence; noun overlap
• clarification connectives; causal, particle to verb ratio; causal connectives; pronoun incidence
Overall Results
• The 20 variables showing the largest differences were co-reference measures.
• Argument overlap measures showed the largest differences in comparison to noun and stem overlap measures
– Argument overlap includes pronouns
• They skied all day. They were tired.
– Regardless of whether overlap was counted at distances of 1, 2, or 3 sentences
– Adjacent overlap showed the largest difference
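As a rough illustration of adjacent argument overlap, the sketch below scores the proportion of adjacent sentence pairs that share at least one argument (a noun or pronoun). It is not the Coh-Metrix implementation: the function name, the binary per-pair scoring, and the hand-listed argument sets (standing in for the output of a part-of-speech tagger) are all assumptions made for this example.

```python
from typing import List, Set


def adjacent_argument_overlap(sentence_args: List[Set[str]]) -> float:
    """Proportion of adjacent sentence pairs sharing at least one argument."""
    pairs = list(zip(sentence_args, sentence_args[1:]))
    if not pairs:
        return 0.0
    # A pair counts as overlapping when the two argument sets intersect.
    shared = sum(1 for first, second in pairs if first & second)
    return shared / len(pairs)


# Hand-listed arguments (hypothetical tagger output) for examples from the slides.
low_cohesion = [{"plastics", "insulators"},
                {"clothes", "you", "sweaters", "coats"}]
high_cohesion = [{"plastics", "insulators"},
                 {"insulators", "clothes", "you", "sweaters", "coats"}]
skiing = [{"they", "day"}, {"they"}]  # overlap carried by the pronoun alone

if __name__ == "__main__":
    print(adjacent_argument_overlap(low_cohesion))
    print(adjacent_argument_overlap(high_cohesion))
    print(adjacent_argument_overlap(skiing))
```

Because pronouns are included in the argument sets, the skiing pair overlaps even though no noun is repeated, which is the distinction the slide draws between argument overlap and noun or stem overlap.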
Other Significant Variables
• Type-Token Ratio for Nouns (L>H)
• Higher level constituents per sentence (H>L)
• Ratio of causal particles and causal verbs (p<.06; H>L)
• Causal connectives (p<.07; H>L)
• Celex, log Freq, min in sentence (p<.08; L>H)
• Average Words per Sentence (p<.08; H>L)
• LSA, sentence to sentence (p<.11; H>L)
[Figure: Type-token ratio (axis 0.65 to 0.74) for low- vs. high-cohesion texts.]
Indicates that the high-cohesion texts did not add new information
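A type-token ratio is the number of distinct words (types) divided by the total number of words (tokens), so repeating a word lowers it. The sketch below shows the calculation on a hand-picked list of nouns; the function name and the assumption that nouns have already been extracted are illustrative and not taken from Coh-Metrix.

```python
from typing import Iterable


def type_token_ratio(tokens: Iterable[str]) -> float:
    """Distinct tokens divided by total tokens (case-insensitive)."""
    normalized = [t.lower() for t in tokens]
    if not normalized:
        return 0.0
    return len(set(normalized)) / len(normalized)


# Nouns listed by hand (hypothetical tagger output) from the insulator example.
low_cohesion_nouns = ["plastics", "insulators", "clothes", "sweaters", "coats"]
high_cohesion_nouns = ["plastics", "insulators", "insulators",
                       "clothes", "sweaters", "coats"]

if __name__ == "__main__":
    print(type_token_ratio(low_cohesion_nouns))
    print(type_token_ratio(high_cohesion_nouns))
```

The high-cohesion version repeats "insulators" without introducing a new noun, so its ratio is lower, in line with the L>H direction reported for this measure.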
Constituents
[Figure: Constituents (axis 1.8 to 2.5) for low- vs. high-cohesion texts.]
Causal Ratio
[Figure: Causal ratio (axis 0 to 1) for low- vs. high-cohesion texts.]
Connectives
[Figure: Causal connectives (axis 0 to 25) for low- vs. high-cohesion texts.]
Celex
[Figure: Celex Min Log (axis 0.80 to 1.05) for low- vs. high-cohesion texts.]
Number of Words
[Figure: Number of words (axis 0 to 800) for low- vs. high-cohesion texts.]
Descriptive Statistics
N     Minimum   Maximum   Mean    Std. Deviation
38    101.0     1390.0    590.8   381.9
LSA
[Figure: LSA (axis 0.00 to 0.30) for low- vs. high-cohesion texts. Data labels: 0.21, 0.27.]
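A sentence-to-sentence score of this kind is typically a cosine between the vectors of adjacent sentences. The sketch below substitutes raw word-count vectors for LSA vectors, so it only captures literal word repetition, whereas LSA compares sentences in a reduced semantic space learned from a corpus. The function names are assumptions made for this example.

```python
import math
import re
from collections import Counter
from typing import List


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def mean_adjacent_similarity(sentences: List[str]) -> float:
    """Average cosine between each sentence and the one that follows it."""
    # Word counts stand in for LSA vectors here (a deliberate simplification).
    vectors = [Counter(re.findall(r"[a-z']+", s.lower())) for s in sentences]
    pairs = list(zip(vectors, vectors[1:]))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    print(mean_adjacent_similarity(["They skied all day.", "They were tired."]))
```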
ANNOUNCING THE RELEASE OF
Coh-Metrix 1.1
Current Goals
• Examine cohesion measures by grade level for TASA and complete textbooks.
• Conduct empirical studies to further examine the effects of text cohesion for adults.
• Conduct experiments to establish the effects of cohesion for young children.
– e.g., currently conducting comprehension and eye-tracking studies with 3rd-5th grade children.
What will Coh-Metrix achieve?
• Enhance education by giving educators better tools for choosing textbooks
• Help publishers more appropriately tailor books to target age groups
• Help writers improve the cohesion of their writing
• Help researchers better understand the hidden properties of text