LIN3022 Natural Language Processing Lecture 5 Albert Gatt LIN3022 -- Natural Language Processing.
Natural Language Processing
04/21/23 1
Natural Language Processing
Lecture Notes 1
Today
• Administration and Syllabus
– course web page
• Introduction
Natural Language Processing
• What is it?
– What goes into getting computers to perform useful and interesting tasks involving human languages.
– Secondarily: the insights that such computational work gives us into human languages and human processing of language.
Natural Language Processing
• Foundations are in computer science (AI, theory, algorithms, …); linguistics; mathematics; logic and statistics; and psychology
Why Should You Care?
• Two trends
1. An enormous amount of knowledge is now available in machine-readable form as natural language text
2. Conversational agents are becoming an important form of human-computer communication
Knowledge of Language
• Words (words and their composition)
• Syntax (structure of sentences)
• Semantics (explicit meaning of sentences)
• Discourse and pragmatics (implicit and contextual meaning)
Small Applications
• Line breakers
• Hyphenators
• Spelling correctors
• Optical Character Recognition software
• Grammar and style checkers
Big Applications
• Question answering
• Conversational agents
• Text summarization
• Machine translation
Note
• In NLP, as in many areas of AI:
– We’re often dealing with ill-defined problems
– We don’t often come up with perfect solutions/algorithms
– We can’t let either of those facts get in our way
Course Material
• We’ll be intermingling discussions of:
– Linguistic topics
  • Syntax and meaning representations
– Computational techniques
  • Context-free grammars
– Applications
  • Translation and QA systems
Chapter 1
• Knowledge of language
• Ambiguity
• Models and algorithms
• History
Knowledge of Language
• Phonetics and phonology: speech sounds, their production, and the rule systems that govern their use
• Morphology: words and their composition from more basic units
– Cat, cats (inflectional morphology)
– Child, children
– Friend, friendly (derivational morphology)
Knowledge of Language
• Syntax: the structuring of words into legal larger phrases and sentences
Semantics
• The meaning of words and phrases
– Lexical semantics: the study of the meanings of words
– Compositional semantics: how to combine word meanings
– Word-sense disambiguation
  • River bank vs. financial bank
Pragmatics
• Indirect speech acts:
– Do you have a stapler?
• Presupposition:
– Have you stopped beating your wife?
• Deixis and point of view:
– Zoe was angry at Joe. Where was he?
• Implicature:
– Yes, there are 3 flights to Boston. In fact, there are 4.
– * The general was assassinated. In fact, he isn’t dead.
Discourse
• Utterance interpretation in the context of the text or dialog
– Sue took the trip to New York. She had a great time there.
  • Sue/she
  • New York/there
  • took/had (time)
Ambiguity
• Almost all of the non-trivial tasks performed by NLP systems are ambiguity resolution tasks
• There is ambiguity at all levels of language
Ambiguity
• I saw the woman with the telescope
• Syntactically ambiguous:
– I saw (NP the woman with the telescope)
– I saw (NP the woman) (PP with the telescope)
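The two bracketings above can be reproduced mechanically. The following is a minimal sketch and not part of the original slides: the toy grammar, the lexicon, and the function name `parses` are all assumptions made for illustration. It enumerates every parse of a sentence under a small binary context-free grammar.

```python
from functools import lru_cache

# Toy grammar (assumed for illustration): (left child, right child) -> parents
BINARY_RULES = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # PP attaches to the verb phrase
    ("Det", "N"): ["NP"],
    ("NP", "PP"): ["NP"],   # PP attaches to the noun phrase
    ("P", "NP"): ["PP"],
}

# Toy lexicon (assumed): word -> possible categories
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"],
    "woman": ["N"], "telescope": ["N"], "with": ["P"],
}


def parses(words):
    """Return every bracketed S-tree that spans the whole word list."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def span(i, j):
        # All (label, tree) analyses of words[i:j]
        if j - i == 1:
            return tuple((tag, f"({tag} {words[i]})")
                         for tag in LEXICON.get(words[i], []))
        found = []
        for k in range(i + 1, j):          # try every split point
            for left_label, left_tree in span(i, k):
                for right_label, right_tree in span(k, j):
                    for parent in BINARY_RULES.get((left_label, right_label), []):
                        found.append((parent, f"({parent} {left_tree} {right_tree})"))
        return tuple(found)

    return [tree for label, tree in span(0, len(words)) if label == "S"]


if __name__ == "__main__":
    for tree in parses("I saw the woman with the telescope".split()):
        print(tree)
```

Under this toy grammar the sentence receives exactly two trees, one attaching the PP to the noun phrase and one attaching it to the verb phrase.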
“I made her duck”
• I cooked waterfowl for her
• I cooked waterfowl belonging to her
• I created the duck she owns
• I caused her to lower her head quickly
• …
• Part of speech tagging: is “duck” a noun or verb?
• Parsing syntactic structure: is “her” part of the “duck” NP?
• Word-sense disambiguation (lexical semantics): does “make” mean create, cause, or cook?
Dealing with Ambiguity
• Two approaches:
– Tightly coupled interaction among processing levels; knowledge from other levels can help decide among choices at ambiguous levels.
– Pipeline processing
• Most NLP systems are probabilistic: they make the most likely choices
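As a rough illustration of "making the most likely choice", the sketch below scores three readings of "I made her duck" and picks the highest-scoring one. Everything in it is an assumption for illustration: the probabilities are invented (not estimated from any corpus), the names `P_SENSE`, `P_TAG`, and `most_likely` are hypothetical, and multiplying the two probabilities assumes the choices are independent, which a real system would not take for granted.

```python
# Invented probabilities, for illustration only.
P_SENSE = {"cook": 0.2, "create": 0.5, "cause": 0.3}  # sense of "make"
P_TAG = {"noun": 0.7, "verb": 0.3}                     # part of speech of "duck"

# Each reading pairs a sense of "make" with a tag for "duck".
READINGS = [("cook", "noun"), ("create", "noun"), ("cause", "verb")]


def score(reading):
    """Score a reading as P(sense) * P(tag), assuming independence."""
    sense, tag = reading
    return P_SENSE[sense] * P_TAG[tag]


def most_likely(readings):
    """Return the reading with the highest score."""
    return max(readings, key=score)


if __name__ == "__main__":
    print(most_likely(READINGS))
```

With different numbers a different reading would win; the point is only that disambiguation reduces to comparing scores.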
Models and Algorithms
• Models (as we are using the term here):
– Formalisms to represent linguistic knowledge
• Algorithms:
– Used to manipulate the representations and produce the desired behavior
  • choosing among possibilities and combining pieces
Models
• State machines: finite state automata, finite state transducers
• Formal rule systems: context-free grammars
• Logical formalisms: first-order predicate calculus; higher-order logics
• Models of uncertainty: Bayesian probability theory
• Vector space models
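To make the first item concrete, here is a minimal sketch of a deterministic finite state automaton. The toy language it recognizes (a "b", two or more "a"s, then "!") and the names `TRANSITIONS`, `ACCEPTING`, and `accepts` are assumptions chosen for illustration, not something taken from the slides.

```python
# Transition table: (state, input symbol) -> next state
TRANSITIONS = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,   # loop: any number of additional "a"s
    (3, "!"): 4,
}
ACCEPTING = {4}


def accepts(string):
    """Return True if the automaton ends in an accepting state."""
    state = 0
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:       # no transition defined: reject
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for s in ["baa!", "baaaa!", "ba!", "baa"]:
        print(s, accepts(s))
```

The whole model is the transition table; the recognition algorithm is a single loop over the input.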
Algorithms
• Many of the algorithms that we’ll study will turn out to be transducers: algorithms that take one kind of structure as input and output another.
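A small sketch of the transducer idea, under stated assumptions: the input is a list of lexical symbols (letters plus the hypothetical tags `+N`, `+SG`, `+PL`), the output is a surface string, and the spelling rule (add "es" after s, x, or z, otherwise "s") is a deliberate simplification of English plural formation. The function name `transduce` is also invented for this example.

```python
SIBILANT_LETTERS = set("sxz")


def transduce(symbols):
    """Map lexical symbols, e.g. ['f','o','x','+N','+PL'], to a surface string."""
    state = "start"              # remembers what kind of letter was read last
    output = []
    for symbol in symbols:
        if symbol == "+PL":
            # Output depends on the current state, as in a finite state transducer
            output.append("es" if state == "sibilant" else "s")
        elif symbol in ("+N", "+SG"):
            continue             # tags with no surface form; state unchanged
        elif len(symbol) == 1 and symbol.isalpha():
            output.append(symbol)
            state = "sibilant" if symbol in SIBILANT_LETTERS else "other"
        else:
            raise ValueError(f"unknown symbol: {symbol!r}")
    return "".join(output)


if __name__ == "__main__":
    print(transduce(list("fox") + ["+N", "+PL"]))
    print(transduce(list("cat") + ["+N", "+PL"]))
```

The input structure (a stem plus morphological tags) and the output structure (a spelled word) are different kinds of objects, which is what distinguishes a transducer from a plain recognizer.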
Algorithms
• In particular:
– State-space search
  • To manage the problem of making choices during processing when we lack the information needed to make the right choice
– Dynamic programming
  • To avoid having to redo work during the course of a state-space search
– Machine learning (classifiers, EM, etc.)
State Space Search
• States represent pairings of partially processed inputs with partially constructed answers
– E.g. sentence + partial parse tree
• Goal is to arrive at the right/best structure after having processed all the input.
– E.g. the best parse tree spanning the sentence
• As with most interesting AI problems, the spaces are too large and the criteria for “bestness” are difficult to encode (so we rely on heuristics and probabilities)
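The state description above can be sketched on a smaller problem than parsing. The example below is an assumption-laden illustration rather than anything from the slides: it segments an unspaced string into words, where a state pairs the amount of input consumed with the words found so far, and "best" is defined (arbitrarily, for this sketch) as the segmentation with the fewest words. The vocabulary and the name `segment` are hypothetical.

```python
import heapq

# Hypothetical vocabulary for the example
VOCAB = {"the", "them", "men", "menu", "me", "nu", "u", "en"}


def segment(text, vocab):
    """Uniform-cost search for the segmentation of text with the fewest words."""
    # A state is (cost so far, characters consumed, words built so far)
    frontier = [(0, 0, ())]
    while frontier:
        cost, position, words = heapq.heappop(frontier)
        if position == len(text):          # all input processed: goal state
            return list(words)
        for end in range(position + 1, len(text) + 1):
            candidate = text[position:end]
            if candidate in vocab:         # expand: consume one more word
                heapq.heappush(frontier, (cost + 1, end, words + (candidate,)))
    return None                            # no segmentation exists


if __name__ == "__main__":
    print(segment("themenu", VOCAB))
```

Because every step costs the same and the cheapest state is always expanded first, the first goal state popped is a fewest-words segmentation. Swapping the cost for a negative log probability would turn this into a search for the most probable segmentation.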
Dynamic Programming
• Don’t do the same work over and over.
• Avoid this by building and making use of solutions to sub-problems that must be invariant across all parts of the space.
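A standard illustration of this idea is minimum edit distance, sketched below. The choice of example, the unit costs for insertion, deletion, and substitution, and the function name `edit_distance` are assumptions for this sketch. Each table cell stores the solution to one sub-problem (the distance between two prefixes), so no sub-problem is solved twice.

```python
def edit_distance(source, target):
    """Levenshtein distance with unit insert, delete, and substitute costs."""
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] = distance between source[:i] and target[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i                      # delete all i characters
    for j in range(cols):
        dist[0][j] = j                      # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                  # deletion
                dist[i][j - 1] + 1,                  # insertion
                dist[i - 1][j - 1] + substitution,   # match or substitution
            )
    return dist[-1][-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))  # prints 3
```

A naive recursive search over edit sequences would revisit the same prefix pairs exponentially often; the table reduces the work to one step per cell.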