Post on 24-May-2020
A Simple and Efficient Solution of the Identifiability Problem
for Hidden Markov Models and Quantum Random Walks
Alexander Schönhuth
Pacific Institute for the Mathematical Sciences
School of Computing Science
Simon Fraser University
February 2009
Alexander Schönhuth Identifiability Problem
Guideline
1 Introduction
    Identifiability Problem
    Hidden Markov Processes (HMPs)
    Quantum Random Walks (QRWs)
2 String Functions
    Stochastic Processes as String Functions
    Hankel Matrices and Dimension of String Functions
    Observable Operators
    Dimension of HMPs and QRWs
    Minimal Representations
3 Solution of the Identifiability Problem
    Computational Bottleneck
    Key Insight
    Algorithm
Identifiability Problem
Situation: $\Phi : \mathcal{P} \to \mathcal{S}$, where $\mathcal{P}$ is a set of parameterizations and $\mathcal{S}$ is the corresponding set of stochastic processes.

Definition
A stochastic process $\Phi(P)$, as induced by the parameterization $P \in \mathcal{P}$, is said to be identifiable iff
\[ \Phi^{-1}(\Phi(P)) = \{P\}. \tag{1} \]
Hidden Markov Processes (HMPs)
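[Figure: state diagram of a two-state HMP. A START node enters the hidden states 1 and 2 with probabilities 0.8 and 0.2; the two states are linked by the transition probabilities 0.3, 0.7, 0.5, 0.5; each state emits the symbols a, b, c with its own emission probabilities.]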
Initial probabilities: $\pi = (0.8, 0.2)^T$

Transition probabilities:
\[ M = (m_{ij} := P(i \to j))_{i,j=1,2} = \begin{pmatrix} 0.3 & 0.7 \\ 0.5 & 0.5 \end{pmatrix} \]

Emission probabilities, e.g. $e_{1b} = 0.5$, $e_{2c} = 0.45$.

Random source $(X_t)$ with values in $\Sigma = \{a, b, c\}$, e.g.:
\[ P_X(X_1 = a, X_2 = b) = \pi_1 e_{1a}(m_{11} e_{1b} + m_{12} e_{2b}) + \pi_2 e_{2a}(m_{21} e_{1b} + m_{22} e_{2b}) \]
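The example probability can be evaluated mechanically. The following is a minimal sketch, not the author's code: it assumes the emission rows (0.25, 0.5, 0.25) for state 1 and (0.25, 0.3, 0.45) for state 2, which are read off the diagram (only $e_{1b}$ and $e_{2c}$ are stated explicitly), and the name `string_probability` is hypothetical.

```python
from fractions import Fraction as F

PI = [F(8, 10), F(2, 10)]                      # initial probabilities pi
M = [[F(3, 10), F(7, 10)],                     # transition probabilities m_ij
     [F(5, 10), F(5, 10)]]
# Assumption: full emission rows read off the diagram; the text only
# states e_1b = 0.5 and e_2c = 0.45.
E = [{"a": F(25, 100), "b": F(50, 100), "c": F(25, 100)},
     {"a": F(25, 100), "b": F(30, 100), "c": F(45, 100)}]


def string_probability(word):
    """Return P(X_1 = word[0], ..., X_n = word[-1]) by a forward recursion."""
    forward = PI[:]                            # forward[i] = P(prefix emitted, next state = i)
    for symbol in word:
        # Emit the symbol in the current state, then move to the next state.
        emitted = [forward[i] * E[i][symbol] for i in range(2)]
        forward = [sum(emitted[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(forward)


print(string_probability("ab"))                # 23/250, i.e. 0.092
```

Exact rational arithmetic is used so that later rank computations on such probabilities are not affected by rounding.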
Quantum Random Walks (QRWs)
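A QRW $Q = (G, U, \psi_0)$ consists of
- a directed graph $G = (V, E)$,
- a unitary operator $U : \mathbb{C}^{|E|} \to \mathbb{C}^{|E|}$ and
- a wave function $\psi_0 \in \mathbb{C}^{|E|}$.

Classical random source associated with the QRW $Q = (G, U, \psi_0)$:
- sequences of symbols $v_0 \ldots v_t v_{t+1} \ldots$ from $V$,
- underlying sequences of states $\psi_0 \ldots \psi_t \psi_{t+1} \ldots$ from $\mathbb{C}^{|E|}$.

Mechanism:
1. Generate symbol $v_t \in V$ with probability
\[ \sum_{e \in E,\, e = (v_t, u)} |(U\psi_t)_e|^2. \]
2. Update the state by restricting $U\psi_t$ to the edges leaving $v_t$ and rescaling:
\[ \psi_{t+1} = \Big(1 \Big/ \sum_{e \in E,\, e = (v_t, x)} |(U\psi_t)_e|^2\Big) \cdot \sum_{e \in E,\, e = (v_t, u)} (U\psi_t)_e \in \mathbb{C}^{|E|}. \]
3. Return to the first step.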
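One step of this mechanism can be sketched as follows. Everything concrete here is a hypothetical toy instance: the two-vertex graph, the matrix `U`, the start state `PSI0` and the function names are not from the slides. The sketch also assumes that the projected state is rescaled to unit Euclidean norm, so that the symbol probabilities of the next step again sum to 1.

```python
import math

# Hypothetical toy instance: two vertices, four directed edges (tail, head).
EDGES = [(0, 0), (0, 1), (1, 0), (1, 1)]
S = 1 / math.sqrt(2)
U = [[S, S, 0, 0],
     [0, 0, S, S],
     [S, -S, 0, 0],
     [0, 0, S, -S]]                      # real orthogonal, hence unitary
PSI0 = [1 + 0j, 0j, 0j, 0j]              # unit-norm wave function


def apply_unitary(psi):
    """Return U psi as a list of complex amplitudes, one per edge."""
    return [sum(U[r][c] * psi[c] for c in range(len(psi))) for r in range(len(U))]


def symbol_distribution(psi):
    """P(v) = sum of |(U psi)_e|^2 over all edges e leaving vertex v."""
    phi = apply_unitary(psi)
    dist = {}
    for (tail, _head), amp in zip(EDGES, phi):
        dist[tail] = dist.get(tail, 0.0) + abs(amp) ** 2
    return dist


def collapse(psi, v):
    """Project U psi onto the edges leaving v, then rescale to unit norm (assumption)."""
    phi = apply_unitary(psi)
    kept = [amp if tail == v else 0j for (tail, _head), amp in zip(EDGES, phi)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in kept))
    return [a / norm for a in kept]


distribution = symbol_distribution(PSI0)   # probabilities of emitting vertex 0 or 1
next_state = collapse(PSI0, 1)             # state after observing vertex 1
```

Sampling a symbol from `distribution` and feeding the collapsed state back in repeats the loop described above.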
Identifiability Problem
Given the parameterizations of two HMPs $\mathcal{M}_1, \mathcal{M}_2$ or two QRWs $Q_1, Q_2$, decide whether the associated random processes $p_1, p_2$ are equivalent.

Input: Two parameterizations of two HMPs $\mathcal{M}_1, \mathcal{M}_2$ or two QRWs $Q_1, Q_2$.
Output: Yes if $p_1 = p_2$, no otherwise.

Solution for HMPs: Ito, Amari and Kobayashi, IEEE Tr. Inf. Th., 1992. The algorithm is exponential in the number of hidden states.
No solution for QRWs known!
String Functions
Let $\Sigma^* := \cup_{t \ge 0} \Sigma^t$ be the set of all strings of finite length over an alphabet $\Sigma$.

Treat random processes $(X_t)$ with values in $\Sigma$ as string functions $p_X : \Sigma^* \to \mathbb{R}$ by
\[ p_X(v = v_0 v_1 \ldots v_t) := P(X_0 = v_0, X_1 = v_1, \ldots, X_t = v_t). \]

By standard arguments:
\[ (X_t) = (Y_t) \iff \forall v \in \Sigma^* : p_X(v) = p_Y(v). \]
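Dimension of String Functions: The Hankel Matrix

Let $wv = w_1 \ldots w_m v_1 \ldots v_n \in \Sigma^{m+n}$ be the concatenation of two strings $w = w_1 \ldots w_m \in \Sigma^m$, $v = v_1 \ldots v_n \in \Sigma^n$.

Consider the (infinite-dimensional) Hankel matrix
\[ P_p := [p(wv)]_{v,w \in \Sigma^*} \in \mathbb{R}^{\Sigma^* \times \Sigma^*} \cong \mathbb{R}^{\mathbb{N} \times \mathbb{N}} \]
for a string function $p : \Sigma^* \to \mathbb{R}$.

Example: Let $\Sigma = \{0, 1\}$. With rows indexed by $v$, columns indexed by $w$, and $\epsilon$ the empty string:
\[ P_p = \begin{pmatrix}
p(\epsilon) & p(0) & p(1) & \cdots \\
p(0) & p(00) & p(10) & \cdots \\
p(1) & p(01) & p(11) & \cdots \\
p(00) & p(000) & p(100) & \cdots \\
p(01) & p(001) & p(101) & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix} \]

We define the dimension of $p$ to be
\[ \dim p := \operatorname{rk} P_p \in \mathbb{N} \cup \{\infty\}. \]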
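To make the definition concrete, the sketch below builds a finite block of the Hankel matrix for the two-state HMP of the introduction and computes its rank exactly. It is an illustration under assumptions: the emission rows are read off the diagram, the block is restricted to strings of length at most 2, and the helper names `p` and `rank` are hypothetical. A finite block only gives a lower bound on $\dim p$ in general.

```python
from fractions import Fraction as F
from itertools import product

PI = [F(4, 5), F(1, 5)]
M = [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]]
# Assumption: emission rows read off the diagram (only e_1b, e_2c are stated).
E = [{"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},
     {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}]


def p(word):
    """String function of the HMP: probability of emitting `word`."""
    forward = PI[:]
    for s in word:
        emitted = [forward[i] * E[i][s] for i in range(2)]
        forward = [sum(emitted[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(forward)


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


# All strings of length <= 2 index both the rows (v) and the columns (w).
words = ["".join(t) for n in range(3) for t in product("abc", repeat=n)]
block = [[p(w + v) for w in words] for v in words]   # entry (v, w) is p(wv)
print(len(words), rank(block))                       # 13 2
```

The 13 x 13 block already has rank 2, the number of hidden states.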
Observable Operators
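Let $p_v$ resp. $p^w$ be the row resp. column vector of $P_p$ referring to strings $v$ resp. $w$, that is, $p_v(w) = p^w(v) = p(wv)$.

Definition
The linear operators
\[ \rho_v, \tau_w : \mathbb{R}^{\Sigma^*} \to \mathbb{R}^{\Sigma^*}, \qquad p \mapsto p_v,\ p^w \]
for $v, w \in \Sigma^*$ are called observable operators.

Observation: Let $v_1, \ldots, v_t, w_1, \ldots, w_s \in \Sigma$ be single letters. Then it holds that
\[ \rho_{v_1 \ldots v_t} = \rho_{v_1} \circ \ldots \circ \rho_{v_t} \]
and, in the reverse order on the letters,
\[ \tau_{w_1 \ldots w_s} = \tau_{w_s} \circ \ldots \circ \tau_{w_1}. \]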
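As a quick check of the two composition rules, here is the two-letter case worked out for letters $a, b \in \Sigma$. It assumes the conventions $p_v(w) = p(wv)$ and $p^w(v) = p(wv)$, which are the ones used in the proof of the key lemma.

```latex
\begin{align*}
(\rho_a \circ \rho_b)(p)(w) &= \rho_a(p_b)(w) = p_b(wa) = p(wab) = p_{ab}(w) = \rho_{ab}(p)(w), \\
(\tau_b \circ \tau_a)(p)(v) &= \tau_b(p^a)(v) = p^a(bv) = p(abv) = p^{ab}(v) = \tau_{ab}(p)(v).
\end{align*}
```

The general statements follow by induction on the string length.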
Dimension of Hidden Markov Processes and Quantum Random Walks

Lemma
Let $p : \Sigma^* \to \mathbb{R}$ be associated with a hidden Markov process on $d$ hidden states resp. a quantum random walk on a graph with $|E|$ edges. Then there are string functions
\[ g_i : \Sigma^* \to \mathbb{R}, \quad i = 1, \ldots, N, \]
where $N = d$ resp. $N = |E|^2$, such that
\[ \operatorname{span}\{p^w \mid w \in \Sigma^*\} \subset \operatorname{span}\{g_i \mid i = 1, \ldots, N\} \]
and computation of $g_i(v = v_1 \ldots v_k)$ is efficient.

Corollary: The lemma straightforwardly implies
\[ \dim p \le N. \]
Finite-dimensional Processes
Theorem (AS, Jaeger, 2007)
Let $p : \Sigma^* \to \mathbb{R}$. Then the following conditions are equivalent.
(i) $\dim p = \operatorname{rk} P_p \le d$.
(ii) There exist vectors $x, y \in \mathbb{R}^d$ as well as matrices $T_a \in \mathbb{R}^{d \times d}$ for all $a \in \Sigma$ such that
\[ \forall v \in \Sigma^* : p(v = v_1 \ldots v_n) = \langle y | T_{v_n} \cdots T_{v_1} | x \rangle. \]

Definition
An ensemble $((T_a)_{a \in \Sigma}, x, y)$ as in (ii) with $d = \dim p$ is called a minimal representation of $p$.

Idea: Given two stochastic processes $p_1, p_2$, compare their minimal representations.
Computation of Minimal Representations
1 Determine words $v_1, \ldots, v_d$ and $w_1, \ldots, w_d$ such that for
\[ V := [p(w_j v_i)]_{1 \le i,j \le d} : \quad \operatorname{rk} V = \dim p. \]
2 Define
\[ x = (x_1, \ldots, x_d)^T := (p(v_1), \ldots, p(v_d))^T \]
and
\[ y = (y_1, \ldots, y_d)^T := (V^T)^{-1} (p(w_1), \ldots, p(w_d))^T. \]
3 For each $a \in \Sigma$, compute matrices
\[ W_a := [p(w_j a v_i)]_{1 \le i,j \le d} \in \mathbb{R}^{d \times d}. \]
4 A minimal representation of $p$ is then given by
\[ ((W_a V^{-1})_{a \in \Sigma}, x, y). \]
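The four steps can be traced on the two-state HMP from the introduction. This is a sketch under assumptions, not the author's implementation: the emission rows are read off the diagram, the strings $v_1 = \epsilon, v_2 = b$ and $w_1 = \epsilon, w_2 = a$ are hand-picked (they give an invertible $V$), $y$ is computed from the values $p(w_j)$, and all helper names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

PI = [F(4, 5), F(1, 5)]
M = [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]]
E = [{"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},      # assumed emission rows
     {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}]


def p(word):
    """Probability that the HMP emits `word` (forward recursion)."""
    forward = PI[:]
    for s in word:
        emitted = [forward[i] * E[i][s] for i in range(2)]
        forward = [sum(emitted[i] * M[i][j] for i in range(2)) for j in range(2)]
    return sum(forward)


def inverse2(a):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(2)) for i in range(2)]


ROWS, COLS = ["", "b"], ["", "a"]                       # v_1, v_2 and w_1, w_2 (hand-picked)
V = [[p(w + v) for w in COLS] for v in ROWS]            # step 1: V[i][j] = p(w_j v_i)
V_inv = inverse2(V)
x = [p(v) for v in ROWS]                                # step 2
V_inv_T = [[V_inv[j][i] for j in range(2)] for i in range(2)]
y = matvec(V_inv_T, [p(w) for w in COLS])
T = {a: matmul([[p(w + a + v) for w in COLS] for v in ROWS], V_inv)   # steps 3 and 4
     for a in "abc"}


def p_from_representation(word):
    """Evaluate <y | T_{v_n} ... T_{v_1} | x>; T_{v_1} is applied first."""
    state = x
    for s in word:
        state = matvec(T[s], state)
    return sum(yi * si for yi, si in zip(y, state))


print(p_from_representation("ab") == p("ab"))           # True
```

Because $v_1 = \epsilon$ here, $y$ comes out as the first unit vector, so the representation simply reads off the first coordinate of the propagated state.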
Identification of Finite-Dimensional Processes: Generic Algorithm

Determine matrices $V_1, V_2$ of maximal rank for $p_1, p_2$.
If $\operatorname{rk} V_1 \ne \operatorname{rk} V_2$ ($\Leftrightarrow \dim p_1 \ne \dim p_2$) then output 'NOT IDENTICAL'.
if $d = \operatorname{rk} V_1 = \operatorname{rk} V_2$ then
    Compute $V_3 := [p_2(w_j v_i)]_{1 \le i,j \le d}$, where $v_i, w_j$ are from $V_1$.
    If $V_1 \ne V_3$, output 'NOT IDENTICAL'.
    Compute matrices $W_{1a}, W_{2a}$ for all $a \in \Sigma$ and vectors $x_1, x_2, y_1, y_2$, all referring to the strings of $V_1$.
    If $W_{1a} = W_{2a}$ for all $a$ and $x_1 = x_2$, $y_1 = y_2$ then output 'IDENTICAL'.
    Else, output 'NOT IDENTICAL'.
end if
Computational Bottleneck
Computational bottleneck of the identifiability problem: determination of bases for the row and the column space of $P_p$.
Hidden Markov Processes and Quantum Random Walks

Situation ($\Sigma = \{0, 1\}$): the rows are indexed by strings $v$. The columns $g_1, \ldots, g_N$ stand next to the columns $p^w$ of the Hankel matrix, whose entries are $p^w(v) = p(wv)$, e.g. $p^0(0) = p(00)$ and $p^1(0) = p(10)$.
\[ \begin{array}{ccc|cccc}
g_1(\epsilon) & \cdots & g_N(\epsilon) & p(\epsilon) & p^0(\epsilon) & p^1(\epsilon) & \cdots \\
g_1(0) & \cdots & g_N(0) & p(0) & p^0(0) & p^1(0) & \cdots \\
g_1(1) & \cdots & g_N(1) & p(1) & p^0(1) & p^1(1) & \cdots \\
g_1(00) & \cdots & g_N(00) & p(00) & p^0(00) & p^1(00) & \cdots \\
g_1(01) & \cdots & g_N(01) & p(01) & p^0(01) & p^1(01) & \cdots \\
\vdots & & \vdots & \vdots & \vdots & \vdots & \ddots
\end{array} \]
where for all $w \in \Sigma^*$:
\[ p^w \in \operatorname{span}\{g_i,\ i = 1, \ldots, N\}. \]
Key Insight
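Lemma
Let $p : \Sigma^* \to \mathbb{R}$ such that for all $w \in \Sigma^*$
\[ p^w \in \operatorname{span}\{g_i,\ i = 1, \ldots, N\} \]
for suitable $g_i : \Sigma^* \to \mathbb{R}$, $i = 1, \ldots, N$ (hence $\dim p \le N$). Then, for strings $v_0, v_1, \ldots, v_m \in \Sigma^*$, it holds that
\[ (g_1(v_0), \ldots, g_N(v_0)) \in \operatorname{span}\{(g_1(v_j), \ldots, g_N(v_j)) \mid j = 1, \ldots, m\} \]
\[ \Longrightarrow \quad \forall u \in \Sigma^* : \ p_{uv_0} \in \operatorname{span}\{p_{uv_1}, \ldots, p_{uv_m}\}. \]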
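Proof: Choose $\beta_1, \ldots, \beta_m$ and, for a given $w \in \Sigma^*$, $\alpha_1, \ldots, \alpha_N$ such that
\[ (g_1(v_0), \ldots, g_N(v_0)) = \sum_{j=1}^{m} \beta_j (g_1(v_j), \ldots, g_N(v_j)), \qquad p^w = \sum_{i=1}^{N} \alpha_i g_i. \]
It follows, for arbitrary $w \in \Sigma^*$,
\[ p_{v_0}(w) = p(wv_0) = p^w(v_0) = \sum_{i=1}^{N} \alpha_i g_i(v_0) = \sum_{j=1}^{m} \beta_j \sum_{i=1}^{N} \alpha_i g_i(v_j) = \sum_{j=1}^{m} \beta_j p^w(v_j) = \sum_{j=1}^{m} \beta_j p_{v_j}(w), \]
meaning that $p_{v_0} = \sum_{j=1}^{m} \beta_j p_{v_j}$. Applying $\rho_u$ yields
\[ p_{uv_0} = \rho_u(p_{v_0}) = \sum_{j=1}^{m} \beta_j \rho_u(p_{v_j}) = \sum_{j=1}^{m} \beta_j p_{uv_j}. \]
⋄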
Solution of the Identifiability Problem
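Theorem
Let $p : \Sigma^* \to \mathbb{R}$ such that for all $w \in \Sigma^*$
\[ p^w \in \operatorname{span}\{g_i,\ i = 1, \ldots, N\} \]
for suitable $g_i : \Sigma^* \to \mathbb{R}$, $i = 1, \ldots, N$.
Then one can determine strings
\[ v_i, w_j, \quad i, j = 1, \ldots, \dim p \]
such that
\[ \operatorname{rk}([p(w_j v_i)]_{1 \le i,j \le \dim p}) = \dim p \]
in time linear in $N$.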
Algorithm
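Collect strings $v$ into $A_{row}$ such that the $p_v$, $v \in A_{row}$, span the row space.

$h(v) := (g_1(v), \ldots, g_N(v)) \in \mathbb{R}^N$
$A_{row} \leftarrow \{\epsilon\}$, $B_{row} \leftarrow \{h(\epsilon)\}$, $C_{row} \leftarrow \Sigma$
while $C_{row} \ne \emptyset$ do
    Choose $v \in C_{row}$ and remove it from $C_{row}$.
    if $h(v)$ is linearly independent of $B_{row}$ then
        $A_{row} \leftarrow A_{row} \cup \{v\}$
        $B_{row} \leftarrow B_{row} \cup \{h(v)\}$
        $C_{row} \leftarrow C_{row} \cup \{av \mid a \in \Sigma\}$
    end if
end while

Collect strings $w$ into $A_{col}$ such that the $p^w$, $w \in A_{col}$, span the column space.

$q(w) := (p(wv),\ v \in A_{row}) \in \mathbb{R}^{|A_{row}|}$
$A_{col} \leftarrow \{\epsilon\}$, $B_{col} \leftarrow \{q(\epsilon)\}$, $C_{col} \leftarrow \Sigma$
while $C_{col} \ne \emptyset$ do
    Choose $w \in C_{col}$ and remove it from $C_{col}$.
    if $q(w)$ is linearly independent of $B_{col}$ then
        $A_{col} \leftarrow A_{col} \cup \{w\}$
        $B_{col} \leftarrow B_{col} \cup \{q(w)\}$
        $C_{col} \leftarrow C_{col} \cup \{wa \mid a \in \Sigma\}$
    end if
end while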
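Both loops can be sketched in a few lines for the two-state HMP of the introduction. The sketch rests on assumptions that are not in the slides: the emission rows are read off the diagram, $g_i(v)$ is taken to be the probability of emitting $v$ when starting in hidden state $i$ (one natural choice for an HMP), candidates are processed in first-in-first-out order, and the names `collect`, `h`, `p` and `rank` are hypothetical.

```python
from collections import deque
from fractions import Fraction as F

SIGMA = "abc"
PI = [F(4, 5), F(1, 5)]
M = [[F(3, 10), F(7, 10)], [F(1, 2), F(1, 2)]]
E = [{"a": F(1, 4), "b": F(1, 2), "c": F(1, 4)},      # assumed emission rows
     {"a": F(1, 4), "b": F(3, 10), "c": F(9, 20)}]


def h(v):
    """h(v) = (g_1(v), g_2(v)) with g_i(v) = P(emit v | start in hidden state i)."""
    g = [F(1), F(1)]
    for s in reversed(v):
        g = [E[i][s] * sum(M[i][j] * g[j] for j in range(2)) for i in range(2)]
    return g


def p(word):
    """String function of the HMP, p(v) = sum_i pi_i g_i(v)."""
    return sum(PI[i] * gi for i, gi in enumerate(h(word)))


def rank(vectors):
    """Exact rank by Gaussian elimination over the rationals."""
    rows, rk = [list(r) for r in vectors], 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][col] / rows[rk][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def collect(feature, extend):
    """Shared loop of both phases: keep strings whose feature vector is new."""
    kept, basis, queue = [""], [feature("")], deque(SIGMA)
    while queue:
        s = queue.popleft()                       # choose and remove a candidate
        if rank(basis + [feature(s)]) > len(basis):
            kept.append(s)
            basis.append(feature(s))
            queue.extend(extend(s, a) for a in SIGMA)
    return kept


A_row = collect(h, lambda v, a: a + v)            # row phase extends by prefixes av
A_col = collect(lambda w: [p(w + v) for v in A_row],
                lambda w, a: w + a)               # column phase extends by suffixes wa
V = [[p(w + v) for w in A_col] for v in A_row]
print(A_row, A_col, rank(V))                      # ['', 'b'] ['', 'a'] 2
```

Only extensions of accepted strings are ever enqueued, which is what the key lemma justifies: once $h(v)$ is dependent on the collected vectors, no string $uv$ contributes a new row.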
Conclusion
Identifiability problem for hidden Markov processes and quantum random walks presented.
Solution efficient in the parameterizations.
Core idea also applicable to efficiently test HMMs and QRWs for ergodicity:

Theorem
Let $M := [\sum_a W_a] V^{-1}$. A finite-dimensional process $p$ is ergodic iff
\[ \dim \operatorname{Eig}(M; 1) = 1. \]
Thanks for the attention!