Natural Language Processing: Part-of-speech tagging, its classes, and how to process it

Part of Speech Tagging

Perspectivising NLP: Areas of AI and their inter-dependencies

[Figure: areas of AI and their inter-dependencies: Search, Logic, Knowledge Representation, Machine Learning, Planning, Expert Systems, NLP, Vision, Robotics]

Two pictures

[Figure 1: NLP alongside Vision and Speech, resting on Statistics and Probability plus Knowledge Based approaches]

[Figure 2: NLP Trinity, with three axes: Problem (Morph Analysis, Part of Speech Tagging, Parsing, Semantics), Language (Hindi, Marathi, English, French), Algorithm (HMM, CRF)]

What it is

POS Tagging is a process that attaches to each word in a sentence a suitable tag from a given set of tags.

The set of tags is called the Tag-set.

Standard Tag-set: Penn Treebank (for English).

Definition

Tagging is the assignment of a single part-of-speech tag to each word (and punctuation marker) in a corpus.

“_“ The_DT guys_NNS that_WDT make_VBP traditional_JJ hardware_NN are_VBP really_RB being_VBG obsoleted_VBN by_IN microprocessor-based_JJ machines_NNS ,_, ”_” said_VBD Mr._NNP Benton_NNP ._.
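The word_TAG notation used in the example can be read programmatically. The following is a minimal sketch, assuming tokens are separated by whitespace and the tag follows the last underscore; the function name `parse_tagged` is hypothetical and the sentence is shortened for illustration.

```python
def parse_tagged(text):
    """Split 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # rpartition splits on the LAST underscore, so a word that itself
        # contains '_' still has its tag separated correctly.
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


sentence = "The_DT guys_NNS that_WDT make_VBP traditional_JJ hardware_NN ._."
pairs = parse_tagged(sentence)
words = [w for w, _ in pairs]
tags = [t for _, t in pairs]
print(pairs)
```

Splitting on the last underscore keeps punctuation tokens such as `._.` and `,_,` intact, since both the word and the tag are the punctuation mark itself.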

POS Tags

NN – Noun; e.g. Dog_NN

VM – Main Verb; e.g. Run_VM

VAUX – Auxiliary Verb; e.g. Is_VAUX

JJ – Adjective; e.g. Red_JJ

PRP – Pronoun; e.g. You_PRP

NNP – Proper Noun; e.g. John_NNP

POS Tag Ambiguity

In English: I bank1 on the bank2 on the river bank3 for my transactions.

Bank1 is a verb; the other two banks are nouns.

In Hindi: ”Khaanaa” can be a noun (food) or a verb (to eat).

For Hindi:

Rama achhaa gaata hai. (hai is VAUX: auxiliary verb); Ram sings well

Rama achha ladakaa hai. (hai is VCOP: copula verb); Ram is a good boy

Process

List all possible tags for each word in the sentence.

Choose the best suitable tag sequence.

Example

”People jump high”.

People : Noun/Verb

jump : Noun/Verb

high : Noun/Verb/Adjective

We can start with probabilities.
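The first step of the process, listing every candidate tag sequence, can be sketched as follows. This is an illustrative sketch only: the dictionary `candidates` simply restates the tag options given for "People jump high", and the variable names are assumptions.

```python
from itertools import product

# Candidate tags per word, as listed in the example.
candidates = {
    "People": ["Noun", "Verb"],
    "jump": ["Noun", "Verb"],
    "high": ["Noun", "Verb", "Adjective"],
}

words = ["People", "jump", "high"]

# Every combination of one tag per word is a candidate tag sequence.
sequences = list(product(*(candidates[w] for w in words)))

print(len(sequences))  # 2 * 2 * 3 = 12
for seq in sequences:
    print(" ".join(f"{w}_{t}" for w, t in zip(words, seq)))
```

The number of candidates grows multiplicatively with sentence length, which is why the second step needs probabilities to pick the best sequence rather than inspecting every one by hand.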

Importance of POS tagging

Ack: presentation by Claire Gardent on POS tagging by

What is Part of Speech (POS)

Words can be divided into classes that behave similarly.

Traditionally eight parts of speech in English: noun, verb, pronoun, preposition, adverb, conjunction, adjective and article.

More recently larger sets have been used: e.g. Penn Treebank (45 tags), Susanne (353 tags).

Why POS

POS tells us a lot about a word (and the words near it). E.g.:

adjectives are often followed by nouns

personal pronouns are often followed by verbs

possessive pronouns by nouns

Pronunciation depends on POS, e.g. object (stress on the first syllable as NN, on the second syllable as VM), content, discount

First step in many NLP applications

Categories of POS

Open and closed classes

Closed classes have a fixed membership of words: determiners, pronouns, prepositions

Closed class words are usually function words: frequently occurring, grammatically important, often short (e.g. of, it, the, in)

Open classes: nouns, verbs, adjectives and adverbs (allow addition of new words)

Open Class (1/2)

Nouns:

Proper nouns (Scotland, BBC)

common nouns: count nouns (goat, glass), mass nouns (snow, pacifism)

Verbs: actions and processes (run, hope)

also auxiliary verbs (is, are, am, will, can)

Open Class (2/2)

Adjectives: properties and qualities (age, colour, value)

Adverbs: modify verbs, or verb phrases, or other adverbs, e.g. Unfortunately John walked home extremely slowly yesterday

Sentential adverb: unfortunately

Manner adverb: extremely, slowly

Time adverb: yesterday

Closed class

Prepositions: on, under, over, to, with

Determiners: the, a, an, some

Pronouns: she, you, I, who

Conjunctions: and, but, or, as, when, if

Auxiliary verbs: can, may, are

Penn tagset (1/2)

Penn tagset (2/2)

Indian Language Tagset: Noun

Indian Language Tagset: Pronoun

Indian Language Tagset: Quantifier

Indian Language Tagset: Demonstrative

3    Demonstrative  DM   DM      Vaha, jo, yaha,
3.1  Deictic        DMD  DM DMD  Vaha, yaha
3.2  Relative       DMR  DM DMR  jo, jis
3.3  Wh-word        DMQ  DM DMQ  kis, kaun
     Indefinite     DMI  DM DMI  KoI, kis

Indian Language Tagset: Verb, Adjective, Adverb

Indian Language Tagset: Postposition, Conjunction

Indian Language Tagset: Particle

Indian Language Tagset: Residuals

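Bigram Assumption

Best tag sequence

$$T^* = \arg\max_T P(T \mid W) = \arg\max_T P(T)\,P(W \mid T) \quad \text{(by Bayes' Theorem)}$$

$$P(T) = P(t_0 = \text{\^{}},\ t_1 t_2 \dots t_n,\ t_{n+1} = .)$$

$$= P(t_0)\,P(t_1 \mid t_0)\,P(t_2 \mid t_1) \dots P(t_n \mid t_{n-1})\,P(t_{n+1} \mid t_n)$$

$$= \prod_{i=0}^{n+1} P(t_i \mid t_{i-1}) \quad \text{(Bigram Assumption)}$$

Here the $i = 0$ factor is read as $P(t_0)$.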

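Lexical Probability Assumption

$$P(W \mid T) = P(w_0 \mid t_0 \dots t_{n+1})\,P(w_1 \mid w_0, t_0 \dots t_{n+1})\,P(w_2 \mid w_1 w_0, t_0 \dots t_{n+1}) \dots P(w_n \mid w_0 \dots w_{n-1}, t_0 \dots t_{n+1})\,P(w_{n+1} \mid w_0 \dots w_n, t_0 \dots t_{n+1})$$

Assumption: A word is determined completely by its tag. This is inspired by speech recognition.

$$= P(w_0 \mid t_0)\,P(w_1 \mid t_1) \dots P(w_{n+1} \mid t_{n+1})$$

$$= \prod_{i=0}^{n+1} P(w_i \mid t_i)$$

$$= \prod_{i=1}^{n+1} P(w_i \mid t_i) \quad \text{(Lexical Probability Assumption)}$$

The $i = 0$ factor drops out because $w_0 = t_0 = \text{\^{}}$, so $P(w_0 \mid t_0) = 1$.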
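Under the two assumptions, the score of one candidate tag sequence is a product of bigram factors and lexical factors. The sketch below illustrates that arithmetic; the function name `sequence_score`, the dictionary layout, and all probability values are hypothetical, and P(t0 = ^) is assumed to be 1 because every sentence starts with ^.

```python
from math import prod


def sequence_score(words, tags, bigram, lexical):
    """P(T) * P(W|T) under the bigram and lexical assumptions.

    words and tags include the boundary markers '^' and '.'.
    Missing table entries are treated as probability 0.
    """
    # P(T): product of P(t_i | t_{i-1}); P(t_0 = '^') is assumed to be 1.
    p_tags = prod(bigram.get((prev, cur), 0.0) for prev, cur in zip(tags, tags[1:]))
    # P(W|T): product of P(w_i | t_i) for i >= 1 (the '^' factor is 1).
    p_words = prod(lexical.get((t, w), 0.0) for w, t in zip(words[1:], tags[1:]))
    return p_tags * p_words


# HYPOTHETICAL numbers, chosen only to make the arithmetic easy to follow.
bigram = {("^", "N"): 0.5, ("N", "V"): 0.5, ("V", "."): 0.5}
lexical = {("N", "people"): 0.1, ("V", "jump"): 0.2, (".", "."): 1.0}

score = sequence_score(
    ["^", "people", "jump", "."], ["^", "N", "V", "."], bigram, lexical
)
print(score)
```

With these made-up values the tag part contributes 0.5 x 0.5 x 0.5 and the word part contributes 0.1 x 0.2 x 1.0. Any sequence using a transition or word-tag pair missing from the tables scores 0.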

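Generative Model

^_^ People_N Jump_V High_R ._.

[Figure: tag lattice with states ^, N, V, A and . over the words; arcs between tag states are labelled Bigram Probabilities, arcs from tag states to words are labelled Lexical Probabilities]

This model is called a Generative model. Here words are observed from tags as states. This is similar to an HMM.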

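Bigram probabilities

     N     V     A
N    0.2   0.7   0.1
V    0.6   0.2   0.2
A    0.5   0.2   0.3

Lexical Probability

[Table: lexical probabilities for the words People, jump, high, one row per tag. Only fragments survive: "0.4x10 - 7", "10 - 7", and a row "A 0 0 - 1".]

values in cell are P(col-heading/row-heading)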

Calculation from Corpus

Actual data:

^ Ram got many NLP books. He found them all very interesting.

POS tagged:

^ N V A N N . N V N A R A .

Recording numbers

     ^    N    V    A    R    .
^    0    2    0    0    0    0
N    0    1    2    1    0    1
V    0    1    0    1    0    0
A    0    1    0    0    1    1
R    0    0    0    1    0    0
.    1    0    0    0    0    0

Probabilities

     ^    N    V    A    R    .
^    0    1    0    0    0    0
N    0    1/5  2/5  1/5  0    1/5
V    0    1/2  0    1/2  0    0
A    0    1/3  0    0    1/3  1/3
R    0    0    0    1    0    0
.    1    0    0    0    0    0
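The counts and probabilities in these tables can be reproduced from the tag sequence. The sketch below assumes that ^ is written at the start of both sentences, which is what gives the ^ to N count of 2 and the . to ^ count of 1 recorded in the table; variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# Tag sequence from the example, with '^' written at the start of BOTH
# sentences (an assumption made here so the counts match the table).
tags = "^ N V A N N . ^ N V N A R A .".split()

# Count each (previous tag, next tag) pair.
bigram_counts = Counter(zip(tags, tags[1:]))

# Number of times each tag appears as the first element of a pair.
prev_counts = Counter(prev for prev, _ in bigram_counts.elements())

# Maximum-likelihood estimate: P(next | prev) = count(prev, next) / count(prev).
bigram_prob = {
    (prev, nxt): Fraction(c, prev_counts[prev])
    for (prev, nxt), c in bigram_counts.items()
}

labels = ["^", "N", "V", "A", "R", "."]
for prev in labels:
    row = [str(bigram_prob.get((prev, nxt), 0)) for nxt in labels]
    print(prev, *row)
```

Each row is normalised by the number of times the row tag occurs as a predecessor, so every non-empty row sums to 1. `Fraction` keeps the values exact, matching the 1/5, 2/5 and 1/3 entries.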

To find

$$T^* = \arg\max_T \big(P(T)\,P(W \mid T)\big)$$

$$P(T)\,P(W \mid T) = \prod_{i=0}^{n+1} P(t_i \mid t_{i-1})\,P(w_i \mid t_i)$$

$P(t_i \mid t_{i-1})$: Bigram probability

$P(w_i \mid t_i)$: Lexical probability

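Bigram probabilities

     N     V     A     R
N    0.15  0.7   0.05  0.1
V    0.6   0.2   0.1   0.1
A    0.5   0.2   0.3   0
R    0.1   0.3   0.5   0.1

Lexical Probability

[Table: lexical probabilities for the words People, jump, high, one row per tag. Only fragments survive: "0.4x10 -7", and the rows "A 0 0 -1" and "R 0 0 0".]

values in cell are P(col-heading/row-heading)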
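Finding the argmax does not require scoring every tag sequence. A dynamic-programming search (Viterbi decoding, named here as one common technique; the slides do not say which search they use) keeps only the best path ending in each tag. The sketch below takes the bigram values among N, V, A, R from the table above. The start probabilities and lexical probabilities are HYPOTHETICAL, since the lexical table did not survive, and the end-of-sentence transition is omitted because the table has no "." column.

```python
TAGS = ["N", "V", "A", "R"]

# Transition probabilities P(next | prev), taken from the bigram table.
bigram = {
    "N": {"N": 0.15, "V": 0.7, "A": 0.05, "R": 0.1},
    "V": {"N": 0.6, "V": 0.2, "A": 0.1, "R": 0.1},
    "A": {"N": 0.5, "V": 0.2, "A": 0.3, "R": 0.0},
    "R": {"N": 0.1, "V": 0.3, "A": 0.5, "R": 0.1},
}

# HYPOTHETICAL values (not from the slides): start probabilities P(tag | ^)
# and lexical probabilities P(word | tag), chosen only for illustration.
start = {"N": 0.6, "V": 0.2, "A": 0.1, "R": 0.1}
lexical = {
    "people": {"N": 0.01, "V": 0.001},
    "jump": {"N": 0.002, "V": 0.01},
    "high": {"N": 0.001, "A": 0.01, "R": 0.02},
}


def viterbi(words):
    """Return (best tag sequence, its probability) for the given words."""
    # best[tag] = (probability of the best path ending in tag, that path)
    best = {t: (start[t] * lexical[words[0]].get(t, 0.0), [t]) for t in TAGS}
    for word in words[1:]:
        new_best = {}
        for cur in TAGS:
            emit = lexical[word].get(cur, 0.0)
            # Pick the predecessor that maximises path prob * transition prob.
            prob, path = max(
                ((best[prev][0] * bigram[prev][cur], best[prev][1]) for prev in TAGS),
                key=lambda item: item[0],
            )
            new_best[cur] = (prob * emit, path + [cur])
        best = new_best
    prob, path = max(best.values(), key=lambda item: item[0])
    return path, prob


path, prob = viterbi(["people", "jump", "high"])
print(path, prob)
```

With these made-up lexical values the best path is N V R, the same tagging as the People_N Jump_V High_R example. Different lexical values could change the result, so this only illustrates the search procedure.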
