Graphical Models over Multiple Strings
Markus Dreyer and Jason Eisner Dept. of Computer Science, Johns Hopkins University
EMNLP 2009
Presented by Ji Zongcheng
Contents: Overview; Motivation; Formal Modeling Approach; Approximate Inference; Experiments; Conclusions
Overview
We study graphical modeling in the case of string-valued random variables, rather than variables over finite domains such as booleans, words, or tags.
A weighted finite-state transducer can model the probabilistic relationship between two strings, whereas we are interested in building up joint models of three or more strings.
Graphical models
Build: choose the variables, their domains, and their possible direct interactions.
Train: estimate the parameters θ of the joint distribution p(V1, ..., Vn); this involves a choice of training procedure.
Infer: predict unobserved variables from observed ones; this involves a choice of exact or approximate inference algorithm.
Weighted finite-state transducer
FSA: finite-state automaton; WFSA: weighted finite-state automaton
FST: finite-state transducer; WFST: weighted finite-state transducer
FSM: finite-state machine; WFSM: weighted finite-state machine
A machine over K strings (tapes) is called:
K = 1: acceptor
K = 2: transducer
K > 2: machine
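To make the K = 1 case concrete, the following is a minimal sketch of a weighted acceptor that scores a single string. The class name `WFSA`, its methods, and the toy arc weights are all hypothetical and chosen for illustration; the sketch also assumes no epsilon arcs, which real finite-state toolkits do support.

```python
from collections import defaultdict


class WFSA:
    """A tiny weighted finite-state acceptor (K = 1) without epsilon arcs."""

    def __init__(self, start, finals):
        self.start = start
        self.finals = finals            # state -> final weight
        self.arcs = defaultdict(list)   # (state, symbol) -> [(next_state, weight)]

    def add_arc(self, src, symbol, dst, weight):
        self.arcs[(src, symbol)].append((dst, weight))

    def score(self, string):
        """Sum, over all accepting paths for `string`, of the product of arc weights."""
        forward = {self.start: 1.0}
        for symbol in string:
            nxt = defaultdict(float)
            for state, weight in forward.items():
                for dst, arc_weight in self.arcs.get((state, symbol), []):
                    nxt[dst] += weight * arc_weight
            forward = nxt
        return sum(w * self.finals.get(s, 0.0) for s, w in forward.items())


# Toy acceptor (illustrative weights): any number of "a"s followed by one "t".
toy = WFSA(start=0, finals={1: 1.0})
toy.add_arc(0, "a", 0, 0.5)
toy.add_arc(0, "t", 1, 0.5)

print(toy.score("aat"))  # 0.5 * 0.5 * 0.5 = 0.125
print(toy.score("aa"))   # 0.0, because the string does not end in a final state
```

A transducer (K = 2) would label each arc with an input/output symbol pair instead of a single symbol, and a K-tape machine with a K-tuple of symbols.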
Contents: Overview; Motivation; Formal Modeling Approach; Approximate Inference; Training the Model Parameters; Comparison With Other Approaches; Experiments; Conclusions
Motivation
String mapping between different forms and representations is ubiquitous in NLP and computational linguistics.
However, many problems involve more than just two strings:
Morphological paradigms (e.g. infinitive, past, and present-tense forms of a verb)
Word translation
Cognates in multiple languages
Modern and ancestral word forms
In bioinformatics and in system combination, multiple sequences need to be aligned
……
We propose a unified model for multiple strings that is suitable for all the problems mentioned above.
Formal Modeling Approach
Variables: A Markov random field (MRF) is a joint model of a set of random variables, V = {V1, ..., Vn}. We assume that all variables are string-valued. This assumption is not crucial, since most other kinds of values can easily be encoded as strings.
Factors: A factor (or potential function) F_j maps each assignment A to a non-negative real number, F_j : A → R≥0. A unary factor is a WFSA, a binary factor is a WFST, and a factor that depends on k > 2 variables is a WFSM.
An MRF defines a probability for each assignment A of values to the variables in V: p(A) = (1/Z) ∏_j F_j(A), where the constant Z = Σ_A' ∏_j F_j(A') normalizes the distribution.
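The definition above can be sketched by brute force on a toy problem. This is only an illustration under a strong simplifying assumption: each variable ranges over a small finite list of candidate strings, and factors are ordinary Python functions, whereas the model in the talk uses weighted finite-state machines over unbounded string domains. The function name `joint_distribution`, the variable names, and the factor weights are hypothetical.

```python
from itertools import product


def joint_distribution(variables, candidates, factors):
    """Enumerate every assignment A and return (A, p(A)) pairs.

    factors: list of (scope, function); each function receives the values of
    the variables in its scope and returns a non-negative weight.
    """
    assignments = [dict(zip(variables, values))
                   for values in product(candidates, repeat=len(variables))]
    unnormalized = []
    for a in assignments:
        weight = 1.0
        for scope, f in factors:
            weight *= f(*(a[v] for v in scope))   # product of all factor values
        unnormalized.append(weight)
    z = sum(unnormalized)                          # normalizing constant Z
    return [(a, w / z) for a, w in zip(assignments, unnormalized)]


variables = ["V1", "V2"]
candidates = ["bricht", "brach"]
factors = [
    # Unary factor (plays the role of a WFSA): prefers V1 ending in "t".
    (("V1",), lambda v1: 2.0 if v1.endswith("t") else 1.0),
    # Binary factor (plays the role of a WFST): prefers two different forms.
    (("V1", "V2"), lambda v1, v2: 3.0 if v1 != v2 else 1.0),
]

dist = joint_distribution(variables, candidates, factors)
best_assignment, best_prob = max(dist, key=lambda pair: pair[1])
print(best_assignment, best_prob)
```

With these illustrative weights the four unnormalized scores are 2, 6, 3, and 1, so Z = 12 and the most probable assignment is V1 = "bricht", V2 = "brach" with probability 0.5. Enumeration is only feasible here because the candidate list is finite.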
Parameters: The factors are parameterized by a real-valued vector of feature weights θ. Eisner (2002) explains how to specify and train such parameterized WFSMs.
Power of the formalism: The framework is powerful enough to express computationally undecidable problems. The graphical models literature has developed many methods for exact or approximate inference.
Figure 1: Example of a factor graph
Approximate Inference
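Belief Propagation (BP)
BP vs. the forward-backward algorithm: forward-backward applies only to chain-structured factor graphs, and BP generalizes it to other graph structures.
Loopy Belief Propagation: BP applied to factor graphs that have cycles, where it is an approximate method.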
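How does BP work in general?
Each variable V maintains a belief about its value. Two kinds of messages are passed: from variables to factors and from factors to variables.
The final beliefs are the output of the algorithm.
If variable V is observed: modify (2) and (4) by multiplying in an evidence potential (1 on the observed value v and 0 on every other value).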
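The message-passing scheme described above can be sketched with the standard sum-product updates. This is not the talk's implementation: in the talk, messages over string-valued variables are themselves weighted finite-state machines, while this sketch assumes each variable has a small finite list of candidate strings. It also reads the slide's references (2) and (4) as the variable-to-factor message and the belief, which is an assumption. The function names, variable names, and weights are hypothetical.

```python
from itertools import product
from math import prod


def normalize(msg):
    total = sum(msg.values())
    return {x: w / total for x, w in msg.items()} if total > 0 else msg


def loopy_bp(domains, factors, evidence=None, iterations=10):
    """Sum-product BP. domains: var -> candidate values;
    factors: list of (scope, function); evidence: var -> observed value."""
    evidence = evidence or {}

    def ev(var, val):
        # Evidence potential: 1 on the observed value, 0 on all others.
        return 1.0 if var not in evidence or evidence[var] == val else 0.0

    edges = [(j, v) for j, (scope, _) in enumerate(factors) for v in scope]
    var_to_fac = {(v, j): {x: 1.0 for x in domains[v]} for j, v in edges}
    fac_to_var = {(j, v): {x: 1.0 for x in domains[v]} for j, v in edges}

    for _ in range(iterations):
        # Variable -> factor: product of messages from the *other* factors.
        for j, v in edges:
            var_to_fac[(v, j)] = normalize({
                x: ev(v, x) * prod(fac_to_var[(k, u)][x]
                                   for k, u in edges if u == v and k != j)
                for x in domains[v]})
        # Factor -> variable: sum out every other variable in the scope.
        for j, v in edges:
            scope, f = factors[j]
            msg = {x: 0.0 for x in domains[v]}
            for values in product(*(domains[u] for u in scope)):
                a = dict(zip(scope, values))
                w = f(*values)
                for u in scope:
                    if u != v:
                        w *= var_to_fac[(u, j)][a[u]]
                msg[a[v]] += w
            fac_to_var[(j, v)] = normalize(msg)

    # Belief: product of all incoming factor messages, times evidence.
    return {v: normalize({x: ev(v, x) * prod(fac_to_var[(j, u)][x]
                                             for j, u in edges if u == v)
                          for x in domains[v]})
            for v in domains}


domains = {"pres_3sg": ["bricht", "brecht"], "pres_2sg": ["brichst", "brechst"]}
factors = [
    # Binary factor: prefers forms that share the same stem vowel.
    (("pres_3sg", "pres_2sg"), lambda a, b: 3.0 if a[2] == b[2] else 1.0),
    # Unary factor: mild prior for the regular-looking form.
    (("pres_2sg",), lambda b: 2.0 if b == "brechst" else 1.0),
]
beliefs = loopy_bp(domains, factors, evidence={"pres_3sg": "bricht"})
print(beliefs["pres_2sg"])
```

Because this toy factor graph is a tree, BP is exact here: observing "bricht" yields a belief of 0.6 for "brichst" and 0.4 for "brechst". On graphs with cycles the same updates are simply iterated, and the resulting beliefs are approximations.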
Experiments
Task: reconstruct missing word forms in morphological paradigms.
Given: the lemma (e.g. brechen)
Observed: some forms (e.g. brachen, bricht, ...)
Predict: the remaining forms (e.g. breche, brichst, ...)
Development Data (100 verbs)
Test Data (9293 test paradigms)
Conclusions
We presented a graphical model with string-valued variables.
Factors are defined by weighted finite-state machines (WFSAs, WFSTs, and, for more than two variables, WFSMs).
Approximate inference can be done by loopy BP.
Potentially applicable to: transliteration, cognate modeling, multiple-sequence alignment, system combination.
Thank you!