InfoGAN ... Interpretable Representation Learning by Information Maximizing...
SLAM
SLAM
FastSLAM
● (μ, Σ)
Nonverbal communication system
recognition
What is the state of the class?
Student 1
Student 2
Student 3
What are the students thinking?…
(Suggestion received)
Then, let’s try to …
→
U-net VAE
InfoGAN
[Figure: VAE. input x → Encoder → Z (μ, σ) → Decoder Pθ(x|z) → output x’]
[Figure: U-VAE and VAE, each drawn as an Encoder and a Decoder, with latent variables Z1 and Z2]
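The VAE diagram above (Encoder producing μ and σ, Decoder Pθ(x|z)) relies on sampling the latent Z from those two outputs. A minimal sketch of that sampling step and of the usual KL regularizer follows. It assumes a diagonal Gaussian posterior and a standard normal prior, which the slide does not state explicitly; the function names `reparameterize` and `kl_to_standard_normal` are illustrative, not taken from the deck.

```python
import math
import random


def reparameterize(mu, sigma, rng):
    """Sample z = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_to_standard_normal(mu, sigma):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ); requires sigma > 0."""
    return 0.5 * sum(
        m * m + s * s - 1.0 - 2.0 * math.log(s) for m, s in zip(mu, sigma)
    )


# Hypothetical encoder outputs for a 2-dimensional latent space.
rng = random.Random(0)
mu = [0.5, -1.0]
sigma = [1.0, 0.5]

z = reparameterize(mu, sigma, rng)      # latent sample passed to the decoder
kl = kl_to_standard_normal(mu, sigma)   # regularization term of the VAE loss
```

Writing the sample as μ + σ·ε keeps the randomness in ε, so gradients can flow through μ and σ when the same expression is used in an autodiff framework.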
[Figure: "show ask attend and answer" model. Block labels: image, question, CNN, concat, Attention, weighted, concat, answer]
[Figure: Generator, Decoder and Discriminator, each based on "show ask attend and answer". Labels: image, quasi question, z~N(0,I), fake sentence, real sentence, real or fake, reconstructed quasi question, z_hat]
VQA (Visual Question Answering) [1]
[1] Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering, Vahid Kazemi, Ali Elqursh, arXiv:1704.03162v2 [cs.CV]
InfoGAN [2]
[2] InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets, Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel, arXiv:1606.03657v1 [cs.LG]
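For reference, the objective of InfoGAN [2] can be written as below. This is the formulation from the cited paper, not something recovered from the slides: c denotes the paper's latent code, Q is the auxiliary network that approximates the posterior over c, and V(D,G) is the standard GAN value function. How the slide's own labels (quasi question, z_hat) map onto these symbols is not stated in the transcript.

```latex
\begin{align}
\min_{G,Q}\,\max_{D}\; V_{\mathrm{InfoGAN}}(D,G,Q) &= V(D,G) - \lambda\, L_I(G,Q), \\
L_I(G,Q) &= \mathbb{E}_{c \sim P(c),\, x \sim G(z,c)}\bigl[\log Q(c \mid x)\bigr] + H(c)
\;\le\; I\bigl(c;\, G(z,c)\bigr).
\end{align}
```

L_I is a variational lower bound on the mutual information between the code c and the generated sample, so maximizing it encourages the generator to keep c recoverable from its output.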
VAE U-net
POI
→
Conventional SV
Personalized SV
Point Of Interest
image
@ailab.ics.keio.ac.jp
↓
λ:
Gi:
Gj:UP