(C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.
(C) 2000, The University of Michigan
Language and Information
Handout #1
September 7, 2000
Course Information
• Instructor: Dragomir R. Radev ([email protected])
• Office: 305A, West Hall
• Phone: (734) 615-5225
• Office hours: TTh 3-4
• Course page: http://www.si.umich.edu/~radev/760
• Class meets on Thursdays, 5-8 PM in 311 West Hall
Introduction
Demos
• AskJeeves
• OneAcross
• Systran
Some Statistics
• Business e-mail sent per day in the US: 2.1 Billion
• Spam per day: 7 Billion
• First class mail per year: 107 Billion
• Text on Internet (2/99): > 6TB
• indexed: 16% (Lawrence and Giles, Nature 400, 1999)
• Dialog (www.dialog.com): 9 TB
• Average college library: 1 TB
• More statistics: http://www.cyberatlas.internet.com
Languages
• Languages: 39,000 languages and dialects (22,000 dialects in India alone)
• Top languages: Chinese/Mandarin (885M), Spanish (332M), English (322M), Bengali (189M), Hindi (182M), Portuguese (170M), Russian (170M), Japanese (125M)
• Source: www.sil.org/ethnologue, www.nytimes.com
• Internet: English (128M), Japanese (19.7M), German (14M), Spanish (9.4M), French (9.3M), Chinese (7.0M)
• Usage: English (1999-54%, 2001-51%, 2003-46%, 2005-43%)
• Source: www.computereconomics.com
Syllabus
• Introduction to the course and linguistic background
– The study of language. Computational Linguistics and Psycholinguistics.
• Elementary probability and statistics
– Describing data. Measures of central tendency. The z score. Hypothesis testing.
• Information theory
– Entropy, joint entropy, conditional entropy. Relative entropy and mutual information. Chain rules.
• Data compression and coding
– Entropy rate. Language modeling. Examples of codes. Optimal codes. Huffman codes. Arithmetic coding. The entropy of English.
Syllabus
• Clustering
– Cluster analysis. Clustering of terms according to semantic similarity. Distributional clustering.
• Concordancing and collocations
– Concordances. Collocations. Syntactic criteria for collocability.
• Literary detective work
– The statistical analysis of writing style. Decipherment and translation.
• Information extraction
– Message understanding. Trainable methods.
• Word sense disambiguation and lexical acquisition
– Supervised disambiguation. Unsupervised disambiguation. Attachment ambiguity. Computational lexicography.
Syllabus
• Part-of-speech tagging [*]
– Statistical taggers. Transformation-based learning of tags. Maximum entropy models. Weighted finite-state transducers.
• Question answering
– Semantic representation. Predictive annotation.
• Text summarization
– Single-document summarization. Multi-document summarization. Language models. Maximal Marginal Relevance. Cross-document structure theory. Trainable methods. Text categorization.
• Other topics
– Text alignment. Word alignment. Statistical machine translation. Discourse segmentation. Text categorization. Maximum entropy modeling.
Assignments
• Problem sets
– The assignments will involve analysis of Web-based data using both manual and automated techniques
• Project
– Data analysis and/or programming involved
• Midterm
– A mixture of short-answer and essay-type questions
• Final
– A mixture of short-answer and essay-type questions
Projects
Each student will be responsible for designing and completing a research project that demonstrates the ability to use concepts from the class in addressing a practical problem for humanities computing. A significant part of the final grade will depend on the project assignment. Students will need to submit a project proposal, a progress report, and the project itself. Students can elect to do a project on an assigned topic, or to select a topic of their own.
The final version of the project will be put on the World Wide Web, and will be defended in front of the class at the end of the semester (procedure TBA).
In some cases (and only with the instructor’s approval), students may be allowed to work in pairs, e.g., students with different backgrounds may collaborate on a larger project.
Readings
• Textbook:
– Oakes, Chapter 1, pages 1 – 10, 24 – 35
• Additional readings
– M&S, Chapter 2, pages 39 – 54
– M&S, Chapter 3, pages 81 – 113
Computational Linguistics
Syntactic categories
• Substitution test:
Joseph eats {Chinese, hot, fresh, vegetarian} food.
• Open (lexical) and closed (functional) categories:
– open: No-fly-zone, yadda yadda yadda
– closed: the, in
Morphology
• Parts of speech: eight (or so) general types
• Inflection (number, person, tense…)
• Derivation (adjective-adverb, noun-verb)
• Compounding (separate words or single word)
• Part-of-speech tagging
• Morphological analysis (prefix, root, suffix, ending)
The dog chased the yellow bird.
Part of Speech Tags
NN /* singular noun */
IN /* preposition */
AT /* article */
NP /* proper noun */
JJ /* adjective */
, /* comma */
NNS /* plural noun */
CC /* conjunction */
RB /* adverb */
VB /* un-inflected verb */
VBN /* verb +en (taken, looked (passive,perfect)) */
VBD /* verb +ed (took, looked (past tense)) */
CS /* subordinating conjunction */
From Church (1991) - 79 tags
Part of Speech Tags
r RP partitive article
s S particle
sx SX particle
a T nominal
u U proper noun
v1p V1PPI verb 1st person plural present indicative
v1p V1PPM verb 1st person plural present imperative
v1p V1PPC verb 1st person plural present conditional
v1p V1PPS verb 1st person plural present subjunctive
v1p V1PFI verb 1st person plural future indicative
v1p V1PII verb 1st person plural imperfect indicative
v1p V1PSI verb 1st person plural simple-past indicative
v1p V1PIS verb 1st person plural imperfect subjunctive
v2p V2PPI verb 2nd person plural present indicative
v2p V2PPC verb 2nd person plural present conditional
From Tzoukermann and Radev (1995) - 258 tags
Jabberwocky (Lewis Carroll)
`Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,
And the mome raths outgrabe.

"Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!"
Nouns
• Nouns: dog, tree, computer, idea
• Nouns vary in number (singular, plural), gender (masculine, feminine, neuter), case (nominative, genitive, accusative, dative)
• Latin: filius (m), filia (f), filium (object)
• German: Mädchen
• Clitics (‘s)
Pronouns
• Pronouns: she, ourselves, mine
• Pronouns vary in person, gender, number, case (in English: nominative, accusative, possessive, 2nd possessive, reflexive)
Joe bought him an ice cream.
Joe bought himself an ice cream.
• Anaphors: herself, each other
Determiners and Adjectives
• Articles: the, a
• Demonstratives: this, that
• Adjectives: describe properties
• Attributive and predicative adjectives
• Agreement: in gender, number
• Comparative and superlative (derivative and periphrastic)
• Positive form
Verbs
• Actions, activities, and states (throw, walk, have)
• English: four verb forms
• tenses: present, past, future
• other inflection: number, person
• gerunds and infinitive
• aspect: progressive, perfective
• voice: active, passive
• participles, auxiliaries
• irregular verbs
• French and Finnish: many more inflections than English
Other Parts of Speech
• Adverbs, prepositions, particles
• phrasal verbs (the plane took off, take it off)
• particles vs. prepositions (she ran up a bill/hill)
• Coordinating conjunctions: and, or, but
• Subordinating conjunctions: if, because, that, although
• Interjections: Ouch!
Phrase-structure Grammars
• Constituent order (SVO, SOV)
Alice bought Bob flowers.
Bob bought Alice flowers.
• imperative forms
• sentences with auxiliary verbs
• interrogative sentences
• declarative sentences
• start symbol and rewrite rules
• context-free view of language
Sample Phrase-structure Grammar
S → NP VP
NP → AT NNS
NP → AT NN
NP → NP PP
VP → VP PP
VP → VBD
VP → VBD NP
PP → IN NP

AT → the
NNS → drivers
NNS → teachers
NNS → lakes
VBD → drank
VBD → ate
VBD → saw
IN → in
IN → of
NN → cake
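To make the rewrite rules concrete, here is a minimal sketch of a recognizer for this sample grammar. The function name `recognizes` and the span-based search strategy are illustrative assumptions, not part of the handout; the rule for prepositional phrases is written as PP → IN NP, since PP is the symbol used on the right-hand side of the NP and VP rules.

```python
from functools import lru_cache

# Rewrite rules of the sample grammar (non-terminals only).
RULES = {
    "S": [("NP", "VP")],
    "NP": [("AT", "NNS"), ("AT", "NN"), ("NP", "PP")],
    "VP": [("VP", "PP"), ("VBD",), ("VBD", "NP")],
    "PP": [("IN", "NP")],
}

# Lexical rules: part-of-speech tag -> words it can rewrite to.
LEXICON = {
    "AT": {"the"},
    "NNS": {"drivers", "teachers", "lakes"},
    "VBD": {"drank", "ate", "saw"},
    "IN": {"in", "of"},
    "NN": {"cake"},
}


def recognizes(sentence, start="S"):
    """Return True if the grammar can derive the sentence from the start symbol."""
    words = tuple(sentence.lower().rstrip(".").split())

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # Can `symbol` derive exactly the words in positions i..j-1?
        if symbol in LEXICON:
            return j - i == 1 and words[i] in LEXICON[symbol]
        for rhs in RULES.get(symbol, []):
            if len(rhs) == 1:
                if derives(rhs[0], i, j):
                    return True
            else:
                left, right = rhs
                # Both children must cover a non-empty span, so the
                # left-recursive rules (NP -> NP PP) still terminate.
                if any(derives(left, i, k) and derives(right, k, j)
                       for k in range(i + 1, j)):
                    return True
        return False

    return len(words) > 0 and derives(start, 0, len(words))


print(recognizes("The drivers ate the cake."))
print(recognizes("The teachers saw the drivers in the lakes."))
print(recognizes("Ate the drivers."))
```

The recognizer only answers yes or no; returning the parse trees themselves would expose the attachment ambiguity of sentences with a trailing prepositional phrase.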
Phrase-structure Grammars
• Local dependencies
• Non-local dependencies
• Subject-verb agreement
The students who wrote the best essays were given a reward.
• wh-extraction
Should Derek read a magazine?
Which magazine should Derek read?
• Empty nodes
Dependency: Arguments and Adjuncts
• Event + dependents (verb arguments are usually NPs)
• agent, patient, instrument, goal - semantic roles
• subject, direct object, indirect object
• transitive, intransitive, and ditransitive verbs
• active and passive voice
Sally watched the kids in the car.
Phrase Structure Ambiguity
• Grammars are used for generating and parsing sentences
• Parses
• Syntactic ambiguity
• Attachment ambiguity: Visiting relatives can be boring.
• The children ate the cake with a spoon.
• High vs. low attachment
• Garden path sentences: The horse raced past the barn fell. Is the book on the table red?
Ungrammaticality vs. Semantic Abnormality
* Slept children the.
# Colorless green ideas sleep furiously.
# The cat barked.
Semantics and Pragmatics
• Lexical semantics and compositional semantics
• Hypernyms, hyponyms, antonyms, meronyms and holonyms (part-whole relationship, tire is a meronym of car), synonyms, homonyms
• Senses of words, polysemous words
• Homophony (bass).
• Collocations: white hair, white wine
• Idioms: to kick the bucket
Discourse Analysis
• Anaphoric relations:
1. Mary helped Peter get out of the car. He thanked her.
2. Mary helped the other passenger out of the car. The man had asked her for help because of his foot injury.
• Information extraction problems (entity cross-referencing)
Hurricane Hugo destroyed 20,000 Florida homes. At an estimated cost of one billion dollars, the disaster has been the most costly in the state’s history.
Pragmatics
• The study of how knowledge about the world and language conventions interact with literal meaning.
• Speech acts
• Research issues: resolution of anaphoric relations, modeling of speech acts in dialogues
Other Research Areas
• Linguistics is traditionally divided into phonetics, phonology, morphology, syntax, semantics, and pragmatics.
• Sociolinguistics: interactions of social organization and language.
• Historical linguistics: change over time.
• Linguistic typology
• Language acquisition
• Psycholinguistics: real-time production and perception of language
Ambiguities in Natural Language
• address, resent, entrance, number
• Lee: Wait to buy IBM (http://cnnfn.cnn.com/2000/07/19/investing/q_talking_stocks/)
• Pfizer to buy Warner-Lambert in $90-billion deal (http://detnews.com/2000/business/0002/07/02080007.htm)
My Research Interests
• Text summarization (especially, of multiple documents)
• Text categorization and clustering
• Information extraction
• Question answering
Main Research Forums
• Conferences: ACL, SIGIR, ANLP, Coling, EACL/NAACL, AMTA/MT Summit, ICSLP/Eurospeech
• Journals: Computational Linguistics, Natural Language Engineering, Information Retrieval, Information Processing and Management, ACM Transactions on Information Systems
• University centers: Columbia, CMU, UMass, MIT, UPenn, USC/ISI, NMSU, Brown, Michigan, Maryland, Edinburgh, Cambridge, Saarbrücken, Kyoto, and many others
• Industrial research sites: AT&T, Bell Labs, IBM, Xerox PARC, SRI, BBN/GTE, MITRE, Microsoft
• Startups: Nuance, Ask.com, Inxight
Mathematical Foundations
Probability Spaces
• Probability theory: predicting how likely it is that something will happen
• basic concepts: experiment (trial), basic outcomes, sample space
• discrete and continuous sample spaces
• for NLP: mostly discrete spaces
• events: subsets of the sample space Ω; Ω is the certain event while ∅ is the impossible event
• event space - all possible events
Probability Spaces
• Probabilities: numbers between 0 and 1
• Probability function (distribution): distributes a probability mass of 1 throughout the sample space Ω.
• Example: coin is tossed three times. What is the probability of 2 heads?
• Uniform distribution
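The coin example can be worked by enumerating the sample space. The sketch below assumes a fair coin (a uniform distribution over the eight outcomes) and reads "2 heads" as exactly two heads; the helper name `probability` is illustrative only.

```python
from fractions import Fraction
from itertools import product

# Sample space for three tosses: every sequence of H/T of length 3.
sample_space = list(product("HT", repeat=3))


def probability(event):
    """Probability of an event under a uniform distribution.

    `event` is a predicate that selects the basic outcomes in the event.
    """
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))


p_two_heads = probability(lambda outcome: outcome.count("H") == 2)
print(len(sample_space), p_two_heads)  # 8 outcomes, probability 3/8
```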
Conditional Probability and Independence
• Prior and posterior probability
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
[Figure: Venn diagram of events A and B and their intersection A ∩ B]
Conditional Probability and Independence
• The chain rule:
$$P(A_1 \cap \dots \cap A_n) = P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_1 \cap A_2) \cdots P\Big(A_n \,\Big|\, \bigcap_{i=1}^{n-1} A_i\Big)$$
• This rule is used in many ways in statistical NLP, more specifically in Markov Models.
• Two events are independent when P(A ∩ B) = P(A)P(B)
• Unless P(B)=0 this is equivalent to saying that P(A) = P(A|B)
• If two events are not independent, they are considered dependent
Bayes’ Theorem
$$P(B \mid A) = \frac{P(B \cap A)}{P(A)} = \frac{P(A \mid B)\, P(B)}{P(A)}$$
• Bayes’ theorem is used to calculate P(B|A) given P(A|B).
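As a small numeric illustration of swapping the order of conditioning, the sketch below applies Bayes' theorem to made-up numbers. The events and all three input probabilities are hypothetical and chosen only for the arithmetic: B is "a message is spam" and A is "the message contains a given word".

```python
def bayes(p_a_given_b, p_b, p_a):
    """Return P(B|A) computed from P(A|B), P(B) and P(A)."""
    return p_a_given_b * p_b / p_a


# Hypothetical inputs (not from the handout).
p_b = 0.2              # P(B): prior probability of spam
p_a_given_b = 0.5      # P(A|B): word appears given spam
p_a_given_not_b = 0.05 # P(A|not B): word appears given not spam

# P(A) by summing over the two ways A can happen.
p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

p_b_given_a = bayes(p_a_given_b, p_b, p_a)
print(round(p_b_given_a, 3))
```

The prior P(B) of 0.2 is updated to a posterior P(B|A) of 5/7 once the word has been observed.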
Random Variables
• Simply a function:
$$X : \Omega \to \mathbb{R}^n$$
• The numbers are generated by a stochastic process with a certain probability distribution.
• Example: the discrete random variable X that is the sum of the faces of two randomly thrown dice.
• Probability mass function (pmf) which gives the probability that the random variable has different numeric values:
$$p(x) = P(X = x) = P(A_x) \quad \text{where } A_x = \{\omega \in \Omega : X(\omega) = x\}$$
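The two-dice example can be tabulated directly. This sketch assumes fair dice, so each of the 36 basic outcomes has probability 1/36, and builds the pmf by counting how many outcomes map to each sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Basic outcomes: ordered pairs of faces from two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# X maps each outcome to the sum of its faces; count outcomes per value of X.
counts = Counter(a + b for a, b in outcomes)

# pmf: p(x) = P(X = x) = (number of outcomes with sum x) / 36
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)
```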
Random Variables
• If a random variable X is distributed according to the pmf p(x), then we write X ~ p(x)
• For a discrete random variable, we have that:
$$\sum_i p(x_i) = \sum_i P(A_{x_i}) = P(\Omega) = 1$$
Expectation and Variance
• Expectation = mean (average) of a random variable.
• If X is a random variable with a pmf p(x), such that $\sum_x |x|\, p(x) < \infty$, then the expectation is:
$$E(X) = \sum_x x\, p(x)$$
• Example: rolling one die
• Variance = measure of whether the values of the random variable tend to be consistent over trials or to vary a lot.
$$Var(X) = E((X - E(X))^2) = E(X^2) - E^2(X)$$
• Standard deviation = square root of variance
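For the one-die example, the expectation and variance can be computed straight from the definitions. The sketch assumes a fair six-sided die and uses exact fractions so that the two forms of the variance can be compared without rounding; the helper name `expectation` is illustrative.

```python
from fractions import Fraction
from math import sqrt

# pmf of one fair die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def expectation(pmf, g=lambda x: x):
    """E(g(X)) = sum over x of g(x) * p(x)."""
    return sum(g(x) * p for x, p in pmf.items())


mean = expectation(pmf)                                     # E(X)
variance = expectation(pmf, lambda x: (x - mean) ** 2)      # E((X - E(X))^2)
variance_alt = expectation(pmf, lambda x: x ** 2) - mean ** 2  # E(X^2) - E^2(X)
std_dev = sqrt(variance)

print(mean, variance, variance_alt, round(std_dev, 3))
```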
Expectation and Variance
• Composition of functions:
$$E(g(Y)) = \sum_y g(y)\, p(y)$$
• Examples:
If g(Y) = aY + b, then E(g(Y)) = aE(Y) + b
E(X+Y) = E(X) + E(Y)
E(XY) = E(X)E(Y), if X and Y are independent
Joint and Conditional Distributions
• Joint (multivariate) probability distributions:
p(x,y) = P(X = x , Y = y)
• Marginal pmf:
$$p_X(x) = \sum_y p(x,y) \qquad p_Y(y) = \sum_x p(x,y)$$
• If X and Y are independent:
p(x,y) = pX(x)pY(y)
Joint and Conditional Distributions
• Conditional pmf in terms of the joint distribution:
$$p_{X|Y}(x \mid y) = \frac{p(x,y)}{p_Y(y)} \quad \text{for } y \text{ such that } p_Y(y) > 0$$
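The marginal and conditional pmfs can be read off a joint table mechanically. In the sketch below the joint distribution over two binary variables is hypothetical, chosen only so that the arithmetic is easy to follow, and the function names are illustrative.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y) over two binary variables.
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8),
    (1, 1): Fraction(3, 8),
}


def marginal_x(x):
    """p_X(x) = sum over y of p(x, y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def marginal_y(y):
    """p_Y(y) = sum over x of p(x, y)."""
    return sum(p for (_, yi), p in joint.items() if yi == y)


def conditional_x_given_y(x, y):
    """p_{X|Y}(x|y) = p(x, y) / p_Y(y), defined only when p_Y(y) > 0."""
    p_y = marginal_y(y)
    if p_y == 0:
        raise ValueError("conditional pmf undefined when p_Y(y) = 0")
    return joint[(x, y)] / p_y


# X and Y are independent only if p(x, y) = p_X(x) p_Y(y) for every pair.
independent = all(p == marginal_x(x) * marginal_y(y)
                  for (x, y), p in joint.items())

print(marginal_x(0), marginal_y(1), conditional_x_given_y(1, 1), independent)
```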
Determining P
• Estimation
• Example “The cow chewed its cud”
• Relative frequency
• Parametric approach (doesn’t work for distribution of words in newspaper articles in a particular topic category)
• Non-parametric approach
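Relative frequency is the simplest of these estimates: the probability of an outcome is approximated by the proportion of times it was observed. The sketch below applies it to single words in a tiny made-up corpus (the corpus text is a hypothetical example, not data from the course).

```python
from collections import Counter

# Hypothetical toy corpus, already tokenized by whitespace.
corpus = "the cow chewed its cud while the farmer watched the cow".split()

counts = Counter(corpus)
total = len(corpus)


def relative_frequency(word):
    """Estimate P(word) as count(word) / total number of tokens."""
    return counts[word] / total


print(relative_frequency("the"), relative_frequency("cow"),
      relative_frequency("dog"))
```

Unseen words such as "dog" receive probability zero under this estimate, which is one reason a plain relative-frequency estimate is unreliable on small samples.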
The Binomial Distribution
• The number r of successes out of n trials given that the probability of success in any single trial is p:
$$B(r; n, p) = \binom{n}{r} p^r (1-p)^{n-r}, \quad \text{where } \binom{n}{r} = \frac{n!}{(n-r)!\, r!}$$
• Example: tossing a (possibly weighted) coin n times.
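The binomial formula translates directly into code. This sketch uses `math.comb` from the Python standard library (Python 3.8 or later) for the binomial coefficient; the function name `binomial_pmf` and the weight 0.7 for the biased coin are illustrative choices.

```python
import math


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials,
    each succeeding with probability p."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)


# Fair coin tossed 3 times: probability of exactly 2 heads.
print(binomial_pmf(2, 3, 0.5))

# Weighted coin (p = 0.7) tossed 10 times: full distribution over r.
distribution = [binomial_pmf(r, 10, 0.7) for r in range(11)]
print(round(sum(distribution), 10))
```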
Pascal’s Triangle
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
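Each row of the triangle lists the binomial coefficients for one value of n, and each inner entry is the sum of the two entries above it. A minimal sketch of that construction (the function name `pascal_rows` is illustrative):

```python
import math


def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle."""
    rows = []
    row = [1]
    for _ in range(count):
        rows.append(row)
        # Each inner entry is the sum of the two neighbours in the row above.
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return rows


for row in pascal_rows(5):
    print(" ".join(str(value) for value in row))

# Row n, position r equals the binomial coefficient "n choose r".
print(pascal_rows(5)[4][2] == math.comb(4, 2))
```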
The Normal Distribution
• Describes a continuous distribution
$$n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$
• Standard normal distribution: when μ = 0 and σ = 1
• In statistics, normal distribution is often used to approximate the binomial distribution. It should only be used when np(1-p) > 5
• In NLP, such assumptions are unwise. Example: “shade tree mechanics”
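To see how close the normal approximation is when the condition np(1-p) > 5 holds, the sketch below compares the binomial probability with the normal density at a few points. The choice n = 100, p = 0.5 is an arbitrary illustration, and the normal curve is given the binomial's mean np and standard deviation sqrt(np(1-p)).

```python
import math


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n trials."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution n(x; mu, sigma)."""
    return (math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
            / (math.sqrt(2 * math.pi) * sigma))


n, p = 100, 0.5                      # illustrative values; np(1-p) = 25 > 5
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

for r in (40, 45, 50, 55, 60):
    print(r, round(binomial_pmf(r, n, p), 5), round(normal_pdf(r, mu, sigma), 5))
```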
Statistics for Corpus Linguistics
Statistics for Corpus Linguistics
• Descriptive statistics: how to describe data
• Describing relationships: the Chi-square test, correlation, regression
• Information theory: information, entropy, coding, redundancy, optimal codes, mutual information
Measures of Central Tendency
• Mode: the most frequent score in a data set
• Median: central score of the distribution
• Mean: average of all scores
Examples
• Split “Moby Dick” into 135 files (“pages”).
• Occurrences of the word “the” in the first 15 pages:
Data: 17 125 99 300 80 36 43 65 78 259 62 36 40 120 45
Mean: 93.67
Median: 65
Mode: 36
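The three measures for the fifteen page counts can be reproduced with Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Occurrences of "the" in the first 15 pages, as listed in the example.
counts = [17, 125, 99, 300, 80, 36, 43, 65, 78, 259, 62, 36, 40, 120, 45]

mean = statistics.mean(counts)      # average of all scores
median = statistics.median(counts)  # central score of the sorted data
mode = statistics.mode(counts)      # most frequent score

print(round(mean, 2), median, mode)
```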
Probabilities
• p = a/n, where a is the number of successes, and n is the number of trials.
• The sum of all probabilities is 1: $\sum_i p(i) = 1$
• Independent probabilities (product of probabilities): P(a AND b) = P(a) * P(b)
Binomial Coefficient
$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
The probability of r successes in n trials is:
$$\binom{n}{r} p^r q^{n-r}$$
where p is the probability of success in a single trial and q = 1 - p
Related Concepts
• For binomial distributions:
– standard deviation is the square root of the product of n, p, and q
– mean is n × p
• Normal distributions:
– approximate the binomial for large values of n
– asymptotic bell curves
Skewed Normal Distributions
• Positively skewed (most of the data is below the mean)
• Negatively skewed (the opposite)
• Bimodal distributions
• In corpus analysis: the number of letters in a word or the length of a verse in syllables is usually positively skewed
• Lognormal distributions
Central Limit Theorem
When samples are repeatedly drawn from a population, the means of the samples are approximately normally distributed around the population mean, provided the samples are large enough. This holds whether or not the population itself is normally distributed.
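A small simulation can illustrate the theorem. This is only a sketch: the exponential population, the sample size of 50, the 2000 repetitions, and the fixed seed are all assumptions made for the illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_mean(size: int) -> float:
    # Exponential population with mean 1.0: strongly skewed, not normal.
    return statistics.fmean(rng.expovariate(1.0) for _ in range(size))

sample_means = [sample_mean(50) for _ in range(2000)]
center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)

print(center)  # close to the population mean, 1.0
print(spread)  # close to 1 / sqrt(50), about 0.14
```

A histogram of `sample_means` would look roughly bell-shaped even though the underlying population is not.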
![Page 62: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/62.jpg)
Measures of Variability
• Variance = $\sum (x - \bar{x})^2 / (N - 1)$
• Range
• Standard deviation is the square root of the variance
• Semi inter-quartile range (25%-75% range): Columbia ACT scores (26-30)
Data: 17 125 99 300 80 36 43 65 78 259 62 36 40 120 45
Mean: 93.67
Median: 65
Variance: 6729.52
Standard Deviation: 82.03
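The figures above can be reproduced with Python's standard `statistics` module. This is a sketch; the variable names are illustrative.

```python
import statistics

# Data from the slide above.
data = [17, 125, 99, 300, 80, 36, 43, 65, 78, 259, 62, 36, 40, 120, 45]

mean = statistics.mean(data)
median = statistics.median(data)
variance = statistics.variance(data)  # sample variance: divides by N - 1
std_dev = statistics.stdev(data)      # square root of the sample variance
value_range = max(data) - min(data)

print(f"mean={mean:.2f} median={median} variance={variance:.2f} "
      f"sd={std_dev:.2f} range={value_range}")
```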
![Page 63: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/63.jpg)
z-score
• A measure of how far a value is from the mean, in terms of standard deviations
• Example: $\mu = 93$, $\sigma = 82$. Let’s consider a page with 144 occurrences of the word “the”. The z-score for that page is:
z = (144-93)/82 = 0.62
• Using the table on pages 258-259 of Oakes, we find that about 27% of the distribution lies above z = 0.62, so the new page is at roughly the 73rd percentile
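The z-score and the corresponding areas under the standard normal curve can be sketched with `statistics.NormalDist`. This stands in for the printed table and is not taken from Oakes; `below` is the fraction of the distribution under z and `above` is the upper-tail area.

```python
from statistics import NormalDist

mean, std_dev = 93, 82   # values from the example above
count = 144              # occurrences of "the" on the new page

z = (count - mean) / std_dev
below = NormalDist().cdf(z)  # fraction of a standard normal below z
above = 1 - below            # upper-tail area

print(f"z={z:.2f} below={below:.3f} above={above:.3f}")
```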
![Page 64: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/64.jpg)
Hypothesis Testing
• If two data sets are both normally distributed, and the means and standard deviations are known, we can test whether the difference between their means is significant
• Example: Francis and Kucera reported that the mean sentence length in government documents is 25.48 words, while in the Present-Day Edited American English corpus, the mean length is only 19.27 words
![Page 65: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/65.jpg)
Hypotheses
• Null hypothesis: that the difference can be explained in terms of chance and natural variability
• Statistical significance: when a difference at least as large as the observed one would arise less than 5% of the time if the null hypothesis were true
![Page 66: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/66.jpg)
T-testing
• Tests the difference between two groups for normally-distributed interval data
• The t-test is normally used with small samples: fewer than 30 items
• The one-sample study compares a sample mean with an established population mean
$t_{obs} = (\bar{x} - \mu) / \mathrm{stderr}$
![Page 67: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/67.jpg)
Example 1
• Mixed corpus: 2.5 verbs per sentence with 1.2 standard deviation
• Scientific corpus: 3.5 verbs per sentence with 1.6 standard deviation
• Number of sentences in the scientific corpus: 100
• Standard error in the scientific corpus: standard deviation divided by the square root of the sample size, $1.6/\sqrt{100} = 0.16$
• Observed value of t = (3.5-2.5)/0.16 = 6.25
![Page 68: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/68.jpg)
Example 1 (Cont’d)
• Number of degrees of freedom: in the example: 99
• Use table on page 260 of Oakes
• Find value: 1.671
• The observed value of t is larger, therefore the null hypothesis can be rejected
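A sketch of the one-sample computation for Example 1, taking the standard error as the sample standard deviation divided by the square root of the sample size (1.6/√100). The variable names are illustrative, and the critical value is simply the table value quoted above.

```python
from math import sqrt

population_mean = 2.5  # mixed corpus, verbs per sentence
sample_mean = 3.5      # scientific corpus, verbs per sentence
sample_sd = 1.6        # standard deviation in the scientific corpus
n = 100                # sentences in the scientific corpus

std_err = sample_sd / sqrt(n)
t_obs = (sample_mean - population_mean) / std_err
df = n - 1
critical = 1.671       # table value quoted on the slide

print(std_err, t_obs, df)
print("reject null hypothesis:", t_obs > critical)
```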
![Page 69: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/69.jpg)
Tests for Difference
$t_{obs} = (\bar{x}_1 - \bar{x}_2) / \mathrm{stderr}$
$\mathrm{stderr}^2 = s_1^2/n_1 + s_2^2/n_2$
![Page 70: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/70.jpg)
| Control (n=8) | Test (n=7) |
|---|---|
| 10 | 8 |
| 5 | 1 |
| 3 | 2 |
| 6 | 1 |
| 4 | 3 |
| 4 | 4 |
| 7 | 2 |
| 9 | |
![Page 71: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/71.jpg)
Example 2
$\mathrm{stderr} = \sqrt{\frac{2.27 \times 2.27}{7} + \frac{2.21 \times 2.21}{8}} = \sqrt{0.736 + 0.611} = \sqrt{1.347} = 1.161$
t = (6-3)/1.161 = 2.584
![Page 72: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/72.jpg)
Example 2 (Cont’d)
• Number of degrees of freedom: 7 + 8 - 2 = 13
• The critical value at the 5 per cent significance level is 2.16
• Since the observed value is greater than 2.16, we can reject the null hypothesis
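The two-sample computation in Example 2 can be sketched as follows. The summary statistics (means 6 and 3, standard deviations 2.21 and 2.27) are taken as given on the slides rather than recomputed from the raw scores, and the variable names are illustrative.

```python
from math import sqrt

# Summary statistics as given in Example 2.
mean_control, sd_control, n_control = 6.0, 2.21, 8
mean_test, sd_test, n_test = 3.0, 2.27, 7

# stderr^2 = s1^2/n1 + s2^2/n2
std_err = sqrt(sd_test**2 / n_test + sd_control**2 / n_control)
t_obs = (mean_control - mean_test) / std_err
df = n_control + n_test - 2
critical = 2.16  # 5 per cent level for 13 degrees of freedom, from the slide

print(f"stderr={std_err:.3f} t={t_obs:.3f} df={df}")
print("reject null hypothesis:", t_obs > critical)
```

Because the slide rounds the standard error to 1.161 before dividing, the unrounded t value may differ from 2.584 in the third decimal place.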
![Page 73: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/73.jpg)
Parametric and Non-parametric Tests
• Four scales of measurement: ratio, interval, ordinal, nominal
• parametric tests (e.g., t-test): interval or ratio-scored dependent variables; assumes independent observations; usually normal distributions only
• non-parametric tests: mostly for frequencies and rank-ordered scales; any type of distributions; less powerful than parametric tests
![Page 74: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/74.jpg)
Chi-square Test
$\chi^2 = \sum \frac{(O-E)^2}{E}$
• Relationship between the frequencies in a display table
• Null hypothesis: no difference in distribution (all distributions are equal)
![Page 75: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/75.jpg)
Special cases
• When the number of degrees of freedom is 1, as in a 2x2 contingency table, Yates’s correction factor is used.
• If O > E, subtract 0.5 from O; otherwise, add 0.5 to O (the correction reduces |O - E| by 0.5).
• If E < 5, results are not reliable.
![Page 76: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/76.jpg)
Two-dimensional Contingency Table
| | X = yes | X = no |
|---|---|---|
| Y = yes | a | b |
| Y = no | c | d |
![Page 77: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/77.jpg)
$\chi^2 = \frac{N\,(|ad - bc| - N/2)^2}{(a+b)(c+d)(a+c)(b+d)}$
Expected value = (row total × column total) / grand total number of items
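A sketch of the 2x2 shortcut formula with Yates's correction, checked against the cell-by-cell form in which |O - E| is reduced by 0.5 before squaring. The function names and the counts are hypothetical and do not come from the handout.

```python
def chi_square_yates(a: int, b: int, c: int, d: int) -> float:
    """Shortcut formula for a 2x2 table with Yates's correction."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

def chi_square_cellwise(a: int, b: int, c: int, d: int) -> float:
    """Same statistic cell by cell: sum of (|O - E| - 0.5)^2 / E."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    total = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n  # row total x column total / N
            total += (abs(observed[i][j] - expected) - 0.5) ** 2 / expected
    return total

# Hypothetical counts, chosen only for illustration.
print(chi_square_yates(20, 10, 5, 15))
print(chi_square_cellwise(20, 10, 5, 15))
```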
![Page 78: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/78.jpg)
Third Person Singular Reference (O)
| | Japanese | English | Total |
|---|---|---|---|
| Ellipsis | 104 | 0 | 104 |
| Central pronouns | 73 | 314 | 387 |
| Non-central pronouns | 12 | 28 | 40 |
| Names | 314 | 291 | 605 |
| Common NPs | 205 | 174 | 379 |
| Total | 708 | 807 | 1515 |
![Page 79: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/79.jpg)
Third Person Singular Reference (E)
| | Japanese | English | Total |
|---|---|---|---|
| Ellipsis | 48.6 | 55.4 | 104 |
| Central pronouns | 180.9 | 206.1 | 387 |
| Non-central pronouns | 18.7 | 21.3 | 40 |
| Names | 282.7 | 322.3 | 605 |
| Common NPs | 177.1 | 201.9 | 379 |
| Total | 708 | 807 | 1515 |
![Page 80: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/80.jpg)
$(O-E)^2/E$ for the Two Languages
| | Japanese | English |
|---|---|---|
| Ellipsis | 63.2 | 55.4 |
| Central pronouns | 64.4 | 56.5 |
| Non-central pronouns | 2.4 | 2.1 |
| Names | 3.5 | 3.0 |
| Common NPs | 4.4 | 3.9 |
$\chi^2 = \sum (O-E)^2/E = 258.8$; df = (5-1) × (2-1) = 4 → significantly different at the 0.001 level
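The chi-square computation for this table can be sketched in a few lines of Python, starting from the observed counts. The variable names are illustrative. Because the slide sums cell values that were rounded to one decimal place, the unrounded total may differ slightly from 258.8.

```python
# Observed counts (Japanese, English) from the table of observed values.
observed = {
    "Ellipsis": (104, 0),
    "Central pronouns": (73, 314),
    "Non-central pronouns": (12, 28),
    "Names": (314, 291),
    "Common NPs": (205, 174),
}

col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]
grand_total = sum(col_totals)

chi_square = 0.0
for label, row in observed.items():
    row_total = sum(row)
    for j, o in enumerate(row):
        e = row_total * col_totals[j] / grand_total  # expected value
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(col_totals) - 1)
print(f"chi-square={chi_square:.1f} df={df}")
```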
![Page 81: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/81.jpg)
Rank Correlation
• Pearson - continuous data
• Spearman’s rank correlation coefficient - non-continuous variables
$\rho = 1 - \frac{6 \sum d^2}{N (N^2 - 1)}$
![Page 82: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/82.jpg)
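Example
| S | X | Y | X' | Y' | d | $d^2$ |
|---|---|---|---|---|---|---|
| 1 | 894 | 80.2 | 2 | 5 | 3 | 9 |
| 2 | 1190 | 86.9 | 1 | 2 | 1 | 1 |
| 3 | 350 | 75.7 | 6 | 6 | 0 | 0 |
| 4 | 690 | 80.8 | 4 | 4 | 0 | 0 |
| 5 | 826 | 84.5 | 3 | 3 | 0 | 0 |
| 6 | 449 | 89.3 | 5 | 1 | 4 | 16 |
$\rho = 1 - \frac{6 \times 26}{6\,(6^2 - 1)} \approx 0.26$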
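A sketch of Spearman's rank correlation for the example data. The helper `ranks_descending` is a hypothetical name; it assigns rank 1 to the largest value and assumes there are no ties, which holds for this data.

```python
def ranks_descending(values):
    """Rank 1 for the largest value; assumes no ties."""
    order = sorted(values, reverse=True)
    return [order.index(v) + 1 for v in values]

x = [894, 1190, 350, 690, 826, 449]
y = [80.2, 86.9, 75.7, 80.8, 84.5, 89.3]

rx, ry = ranks_descending(x), ranks_descending(y)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
n = len(x)
rho = 1 - 6 * sum_d2 / (n * (n**2 - 1))

print(rx, ry, sum_d2, round(rho, 3))
```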
![Page 83: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/83.jpg)
Linear Regression
• Dependent and independent variables
• Regression: used to predict the behavior of the dependent variable
• Needed: $\bar{X}$, $\bar{Y}$, X, b = slope of Y(X)
$b = \frac{N \sum XY - \sum X \sum Y}{N \sum X^2 - (\sum X)^2}$
$Y' = \bar{Y} + b(X - \bar{X})$
![Page 84: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/84.jpg)
Example
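| Section | X | Y | $X^2$ | XY |
|---|---|---|---|---|
| 1 | 22 | 20 | 484 | 440 |
| 2 | 49 | 24 | 2401 | 1176 |
| 3 | 80 | 42 | 6400 | 3360 |
| 4 | 26 | 22 | 676 | 572 |
| 5 | 40 | 23 | 1600 | 920 |
| 6 | 54 | 26 | 2916 | 1404 |
| 7 | 91 | 55 | 8281 | 5005 |
| TOTAL | 362 | 212 | 22758 | 12877 |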
![Page 85: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/85.jpg)
Example (Cont’d)
$b = \frac{(7 \times 12877) - (362 \times 212)}{(7 \times 22758) - (362 \times 362)} = \frac{90139 - 76744}{159306 - 131044} = \frac{13395}{28262} = 0.474$
a = 5.775
Y’ = 5.775 + 0.474 X
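The slope and intercept can be recomputed from the example data with a short Python sketch. The intercept is taken as the mean of Y minus b times the mean of X, which is the same line as Y' = Ȳ + b(X - X̄); the `predict` helper is a hypothetical name.

```python
x = [22, 49, 80, 26, 40, 54, 91]
y = [20, 24, 42, 22, 23, 26, 55]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)
sum_xy = sum(a * b for a, b in zip(x, y))

# Slope: (N * sum(XY) - sum(X) * sum(Y)) / (N * sum(X^2) - sum(X)^2)
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
# Intercept: mean(Y) - b * mean(X)
a = sum_y / n - b * sum_x / n

def predict(value: float) -> float:
    """Predicted Y for a given X on the fitted line."""
    return a + b * value

print(f"b={b:.3f} a={a:.3f}")
```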
![Page 86: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/86.jpg)
N-gram Models
![Page 87: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/87.jpg)
Word Prediction
• Example: “I’d like to make a collect …”
• “I have a gub”
• augmentative communication systems
• “He is trying to fine out”
• “Hopefully, all with continue smoothly in my absence”
• “They are leaving in about fifteen minuets to go to her house”
• “I need to notified the bank of [this problem]”
• Language model: a statistical model of word sequences
![Page 88: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/88.jpg)
Counting Words
• Brown corpus (1 million words from 500 texts)
• Example: “He stepped out into the hall, was delighted to encounter a water brother” - how many words?
• Word forms and lemmas. “cat” and “cats” share the same lemma (also tokens and types)
• Shakespeare’s complete works: 884,647 word tokens and 29,066 word types
• Brown corpus: 61,805 types and 37,851 lemmas
• American Heritage 3rd edition has 200,000 “boldface forms” (including some multiword phrases)
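How many words the example sentence contains depends on what is counted. The sketch below uses simple regular expressions as an assumed tokenization (not the one used for the corpora above) to count word tokens, tokens including punctuation, and lowercased word types.

```python
import re

sentence = ("He stepped out into the hall, was delighted to encounter "
            "a water brother")

word_tokens = re.findall(r"\w+", sentence)          # words only
with_punct = re.findall(r"\w+|[^\w\s]", sentence)   # words and punctuation
word_types = {w.lower() for w in word_tokens}       # distinct word forms

print(len(word_tokens), len(with_punct), len(word_types))
```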
![Page 89: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/89.jpg)
Unsmoothed N-grams
• First approximation: each word has an equal probability of following any other. E.g., with 100,000 words, the probability of each of them at any given point is .00001
• “the” occurs 69,971 times in the Brown corpus (BC), while “rabbit” appears 11 times
• “Just then, the white …”
$P(w_1, w_2, \ldots, w_n) = P(w_1)\,P(w_2 \mid w_1)\,P(w_3 \mid w_1 w_2) \cdots P(w_n \mid w_1 w_2 \ldots w_{n-1})$
Bigram model: replace $P(w_n \mid w_1 w_2 \ldots w_{n-1})$ with $P(w_n \mid w_{n-1})$
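A minimal sketch of an unsmoothed bigram model estimated by relative frequency. The toy corpus and the function names are assumptions made for illustration; they are not drawn from the Brown corpus.

```python
from collections import Counter

# Hypothetical toy corpus, not from the Brown corpus.
corpus = "the white rabbit saw the white house and the rabbit ran".split()

unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))

def bigram_prob(prev: str, word: str) -> float:
    """Unsmoothed relative-frequency estimate of P(word | prev)."""
    return bigram_counts[(prev, word)] / unigram_counts[prev]

def sequence_prob(words: list[str]) -> float:
    """P(w1) * P(w2 | w1) * ... under the bigram approximation."""
    prob = unigram_counts[words[0]] / len(corpus)
    for prev, word in zip(words, words[1:]):
        prob *= bigram_prob(prev, word)
    return prob

print(bigram_prob("the", "white"))                # seen bigram
print(bigram_prob("the", "house"))                # unseen bigram gets 0
print(sequence_prob(["the", "white", "rabbit"]))
```

Any word pair that never occurs in the corpus receives probability zero, which is the main limitation of an unsmoothed model.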
![Page 90: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/90.jpg)
Markov Models
• Assumption: we can predict the probability of some future item on the basis of a short history
• Bigrams: first-order Markov models
• Bigram grammars: represented as an N-by-N matrix of probabilities, where N is the size of the vocabulary that we are modeling.
![Page 91: (C) 2000, The University of Michigan 1 Language and Information Handout #1 September 7, 2000.](https://reader030.fdocuments.in/reader030/viewer/2022032414/56649eec5503460f94bfe1a0/html5/thumbnails/91.jpg)
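Relative Frequencies
| | a | aardvark | aardwolf | aback | … | zoophyte | zucchini |
|---|---|---|---|---|---|---|---|
| a | X | 0 | 0 | 0 | … | X | X |
| aardvark | 0 | 0 | 0 | 0 | … | 0 | 0 |
| aardwolf | 0 | 0 | 0 | 0 | … | 0 | 0 |
| aback | X | X | X | 0 | … | X | X |
| … | … | … | … | … | … | … | … |
| zoophyte | 0 | 0 | 0 | X | … | 0 | 0 |
| zucchini | 0 | 0 | 0 | X | … | 0 | 0 |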