Page 1:

Graphical Models over Multiple Strings

Markus Dreyer and Jason Eisner Dept. of Computer Science, Johns Hopkins University

EMNLP 2009

Presented by Ji Zongcheng

Page 2:

Contents: Overview, Motivation, Formal Modeling Approach, Approximate Inference, Experiments, Conclusions

Page 3:

Contents: Overview, Motivation, Formal Modeling Approach, Approximate Inference, Experiments, Conclusions

Page 4:

Overview

We study graphical modeling in the case of string-valued random variables, rather than over finite domains such as booleans, words, or tags.

A weighted finite-state transducer can model the probabilistic relationship between two strings.

We are interested in building joint models of three or more strings.
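For concreteness, here is a hedged sketch (my notation, not from the slides) of what such a joint model might look like for three string variables, e.g. the infinitive, past, and present-tense forms of a verb, with one pairwise WFST factor per pair:

```latex
% X_1, X_2, X_3 are string-valued variables; each F_{ij} is a weighted
% finite-state transducer that scores a pair of strings; Z normalizes.
\[
p(x_1, x_2, x_3) \;=\; \frac{1}{Z}\,
  F_{12}(x_1, x_2)\; F_{13}(x_1, x_3)\; F_{23}(x_2, x_3)
\]
```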

Page 5:

Graphical models

Build: choose the variables, their domains, and the possible direct interactions

Train: the parameters θ of the joint distribution p(V1, . . . , Vn); choice of training procedure

Infer: predict unobserved variables from observed ones; choice of exact or approximate inference algorithm

Page 6:

Weighted finite-state transducers

FSA: finite-state automaton; WFSA: weighted finite-state automaton
FST: finite-state transducer; WFST: weighted finite-state transducer
FSM: finite-state machine; WFSM: weighted finite-state machine

K = 1 tape: acceptor
K = 2 tapes: transducer
K > 2 tapes: machine
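A minimal sketch (my own illustration, not the authors' code) of a K = 1 machine: a hypothetical weighted finite-state acceptor whose score for a string is the total weight of all accepting paths, computed by a simple forward dynamic program.

```python
# Toy weighted finite-state acceptor (K = 1): states, weighted arcs, and a
# forward DP that sums, over all accepting paths, the product of arc weights.
from collections import defaultdict

class WFSA:
    def __init__(self, start, finals):
        self.start = start                 # start state
        self.finals = dict(finals)         # state -> final (stopping) weight
        self.arcs = defaultdict(list)      # state -> [(label, next_state, weight)]

    def add_arc(self, src, label, dst, weight):
        self.arcs[src].append((label, dst, weight))

    def score(self, s):
        """Total weight of all paths that accept the string s."""
        forward = {self.start: 1.0}
        for ch in s:
            nxt = defaultdict(float)
            for state, w in forward.items():
                for label, dst, arc_w in self.arcs[state]:
                    if label == ch:
                        nxt[dst] += w * arc_w
            forward = nxt
        return sum(w * self.finals.get(q, 0.0) for q, w in forward.items())

# Hypothetical toy acceptor that prefers strings ending in "en":
m = WFSA(start=0, finals={0: 0.1, 1: 0.1, 2: 1.0})
for ch in "bcehinrt":                      # loop arcs at state 0
    m.add_arc(0, ch, 0, 0.5)
m.add_arc(0, "e", 1, 0.8)                  # "e" may start the suffix ...
m.add_arc(1, "n", 2, 0.9)                  # ... followed by "n"
print(m.score("brechen"))                  # higher than m.score("bricht")
```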

Page 7:

Contents: Overview, Motivation, Formal Modeling Approach, Approximate Inference, Training the Model Parameters, Comparison With Other Approaches, Experiments, Conclusions

Page 8:

Motivation

String mapping between different forms and representations is ubiquitous in NLP and computational linguistics.

However, many problems involve more than just two strings:
Morphological paradigms (e.g., the infinitive, past, and present-tense forms of a verb)
Word translation
Cognates in multiple languages
Modern and ancestral word forms
In bioinformatics and in system combination, multiple sequences need to be aligned
...

We propose a unified model for multiple strings that is suitable for all the problems mentioned above.

Page 9:

Contents: Overview, Motivation, Formal Modeling Approach, Approximate Inference, Experiments, Conclusions

Page 10:

Formal Modeling Approach

Variables

Markov Random Field (MRF): a joint model of a set of random variables, V = {V1, . . . , Vn}

We assume that all variables are string-valued. The assumption is not crucial, since most other discrete values can easily be encoded as strings.

Factors

Factor (or potential function): Fj : A → R≥0, i.e., each factor maps an assignment A to a non-negative real number

Unary factor: a WFSA
Binary factor: a WFST
A factor that depends on k > 2 variables: a k-tape WFSM

An MRF defines a probability for each assignment A of values to the variables in V:
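A hedged reconstruction of the formula (the standard MRF factorization built from the definitions above; the paper's exact notation may differ slightly):

```latex
% Probability of an assignment A as a normalized product of factors,
% where Z is the partition function.
\[
p(A) \;=\; \frac{1}{Z} \prod_{j} F_j(A),
\qquad
Z \;=\; \sum_{A'} \prod_{j} F_j(A')
\]
```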

Page 11:

Formal Modeling Approach

Parameters

A vector of feature weights θ ∈ R^d
How to specify and train such parameterized WFSMs? Eisner (2002) explains how.
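A hedged sketch of one way such a parameterization might look (my illustration, with made-up feature names): each arc carries a feature vector f, and its weight is exp(θ·f), so every path weight, and hence the whole machine, depends smoothly on θ.

```python
# Log-linear arc weights: weight(arc) = exp(theta . f(arc)).
import math

def arc_weight(theta, features):
    """theta, features: dicts mapping feature names to real values."""
    return math.exp(sum(theta.get(name, 0.0) * value
                        for name, value in features.items()))

# Hypothetical feature names for a toy edit-style transducer:
theta = {"copy": 1.2, "insert": -0.7, "delete": -0.9, "substitute": -1.1}
print(arc_weight(theta, {"copy": 1.0}))        # weight of a copy arc
print(arc_weight(theta, {"substitute": 1.0}))  # weight of a substitution arc
```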

Power of the formalism

The framework is powerful enough to express computationally undecidable problems.
The graphical models community has developed many methods: exact or approximate inference.

Figure 1: Example of a factor graph

Page 12:

Contents: Overview, Motivation, Formal Modeling Approach, Approximate Inference, Experiments, Conclusions

Page 13:

Approximate Inference

Belief Propagation (BP) vs. the forward-backward algorithm (which runs only on chain-structured factor graphs)

Loopy Belief Propagation (when the factor graph has cycles)

Page 14:

Approximate Inference

How does BP work in general?

Each variable V maintains a belief about its value. Two kinds of messages are passed: from variables to factors and from factors to variables.

The final beliefs are the output of the algorithm.

If variable V is observed to have value v: modify equations (2) and (4) by multiplying in an evidence potential (1 on v and 0 on all other values).
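Below is a minimal loopy BP sketch over finite domains (my own illustration: in the paper the domains are infinite sets of strings, so each message and belief is itself a WFSA, pointwise products become finite-state intersections, and the sums over values become compositions followed by projection).

```python
# Hedged sketch of loopy belief propagation over *finite* domains.
from itertools import product

def loopy_bp(variables, domains, factors, iters=20):
    """
    variables: list of variable names
    domains:   dict var -> list of possible values
    factors:   list of (scope, table); scope is a tuple of vars, table maps a
               tuple of values (aligned with scope) to a non-negative weight
    returns:   dict var -> approximate marginal belief {value: probability}
    """
    msg_fv = {(j, v): {x: 1.0 for x in domains[v]}          # factor -> variable
              for j, (scope, _) in enumerate(factors) for v in scope}
    msg_vf = {(v, j): {x: 1.0 for x in domains[v]}          # variable -> factor
              for j, (scope, _) in enumerate(factors) for v in scope}

    for _ in range(iters):
        # Variable -> factor: product of messages from all *other* factors.
        for (v, j), msg in msg_vf.items():
            for x in domains[v]:
                msg[x] = 1.0
                for k, (scope, _) in enumerate(factors):
                    if k != j and v in scope:
                        msg[x] *= msg_fv[(k, v)][x]
        # Factor -> variable: sum out all the factor's other variables.
        for j, (scope, table) in enumerate(factors):
            for v in scope:
                others = [u for u in scope if u != v]
                new = {x: 0.0 for x in domains[v]}
                for assignment in product(*(domains[u] for u in others)):
                    partial = dict(zip(others, assignment))
                    for x in domains[v]:
                        partial[v] = x
                        w = table[tuple(partial[u] for u in scope)]
                        for u in others:
                            w *= msg_vf[(u, j)][partial[u]]
                        new[x] += w
                msg_fv[(j, v)] = new

    # Final beliefs: normalized product of all incoming factor messages.
    beliefs = {}
    for v in variables:
        b = {x: 1.0 for x in domains[v]}
        for j, (scope, _) in enumerate(factors):
            if v in scope:
                for x in domains[v]:
                    b[x] *= msg_fv[(j, v)][x]
        z = sum(b.values()) or 1.0
        beliefs[v] = {x: w / z for x, w in b.items()}
    return beliefs

# Tiny usage: two variables over a toy domain, coupled by one factor
# that prefers them to agree (an "agreement" WFST in the string case).
doms = {"X": ["a", "b"], "Y": ["a", "b"]}
agree = {(x, y): (2.0 if x == y else 1.0) for x in doms["X"] for y in doms["Y"]}
print(loopy_bp(["X", "Y"], doms, [(("X", "Y"), agree)]))
```

An observed variable can be handled exactly as described on the slide: add a unary "evidence" factor that assigns weight 1 to the observed value and 0 to everything else.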

Page 15:

Contents: Overview, Motivation, Formal Modeling Approach, Approximate Inference, Experiments, Conclusions

Page 16:

Experiments

Reconstruct missing word forms in morphological paradigms:
Given the lemma (e.g., brechen)
Observed forms (e.g., brachen, bricht, ...)
Predict the missing forms (e.g., breche, brichst, ...)

Page 17:

Experiments

Development Data (100 verbs)

Page 18:

Experiments

Test Data (9293 test paradigms)

Page 19:

Contents: Overview, Motivation, Formal Modeling Approach, Approximate Inference, Experiments, Conclusions

Page 20:

Conclusions

Graphical models with string-valued variables
Factors are defined by weighted finite-state machines (WFSAs, WFSTs, k-tape WFSMs)
Approximate inference can be done by loopy BP

Potentially applicable to:
Transliteration
Cognate modeling
Multiple-sequence alignment
System combination

Page 21:

Thank you!