EMNLP2014 Reading Group "Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector...

Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space. Arvind Neelakantan, Jeevan Shankar, Alexandre Passos, Andrew McCallum. Presented by Yuya Unno (@unnonouno), Preferred Infrastructure, at the EMNLP2014 Reading Group @PFI

description

Presented the EMNLP 2014 paper "Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space".

Transcript of EMNLP2014 Reading Group "Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector...

Page 1: EMNLP2014 Reading Group "Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space"

Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space

Arvind Neelakantan, Jeevan Shankar, Alexandre Passos, Andrew McCallum

Yuya Unno (@unnonouno), Preferred Infrastructure

EMNLP2014 Reading Group @PFI

Page 2:

Self-introduction

Yuya Unno (@unnonouno)

• Natural language processing, text mining, and machine learning

• Also does a little [illegible] processing

• Co-chair of the NLP Young Researchers' Association (YANS), 2014-

• Background: [illegible], then IBM research, then PFI

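Page 3:

Announcement: NLP Young Researchers' Association (YANS)

• YANS Symposium (September)

  • A gathering where young researchers (roughly under 40) push their own research forward

  • Students and engineers from industry are also welcome

  • This year it was held at [illegible]

• YANS social gathering (March)

  • Held during the annual meeting of the Association for Natural Language Processing

  • Just a drinking party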

Page 4:

Overview

• Extends word2vec to learn multiple sense vectors for a single word

  • Solves word representation learning and word sense disambiguation at the same time

  • Also proposes a method that determines the number of sense vectors automatically

Page 5:

Understanding Skip-gram through a figure [Mikolov+13]

[Figure: Skip-gram diagram; labels not recoverable]

Page 6:

Understanding Skip-gram through the formula

• The inner product of the word vector v(wt) and the context vector v(c) is fed into a sigmoid
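The bullet above can be sketched in a few lines of Python. This is only an illustration of "inner product into a sigmoid"; the function names and the toy vectors are assumptions made for this example and do not come from the slides.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    """Logistic function mapping a real score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def skipgram_prob(v_word, v_context):
    """Score a (word, context) pair: sigmoid of the inner product."""
    return sigmoid(dot(v_word, v_context))


# Toy vectors (hypothetical values, 3 dimensions).
v_wt = [0.5, -1.0, 2.0]
v_c = [1.0, 0.5, 0.25]
p = skipgram_prob(v_wt, v_c)
print(round(p, 4))
```

A larger inner product pushes the score toward 1, which is how training pulls a word vector toward the contexts it actually appears in.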

Page 7:

Multi Sense Skip-gram (MSSG) model

1.  Compute the average context from the context vectors of the surrounding words

2.  Among the sense candidates, predict the sense whose context is most similar

3.  Select the sense vector corresponding to that sense

4.  Treat the selected sense vector as the one that predicts the surrounding context (same as Skip-gram)
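Steps 1 to 3 above can be sketched as follows. This is a minimal illustration, not the authors' implementation: cosine similarity is assumed as the similarity measure, and all names and toy values are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def average(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def predict_sense(context_vectors, sense_centers):
    # Step 1: average the context vectors of the surrounding words.
    v_context = average(context_vectors)
    # Step 2: pick the sense whose context center is most similar.
    sims = [cosine(mu, v_context) for mu in sense_centers]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, v_context


# Toy data (hypothetical): one word with two senses, 2-dimensional vectors.
context_vectors = [[1.0, 0.0], [0.8, 0.2]]
sense_centers = [[0.0, 1.0], [1.0, 0.1]]
sense_vectors = [[0.3, 0.7], [0.9, -0.2]]

sense, v_context = predict_sense(context_vectors, sense_centers)
# Step 3: look up the sense vector for the predicted sense.
v_sense = sense_vectors[sense]
print(sense, v_sense)
```

Step 4 would then score v_sense against each surrounding context vector exactly as plain Skip-gram scores a word vector.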

Page 8:

Understanding MSSG through a figure

[Figure: MSSG diagram, including a part labeled "Skip-gram"; other labels not recoverable]

Page 9:

Understanding MSSG through the formula

• Simply use the predicted sense vector

  • A word has one vector per sense, whereas a context has only a single vector

Page 10:

Understanding MSSG through the algorithm

• The only addition is code that predicts the sense from the surrounding context

Page 11:

Non-Parametric MSSG (NP-MSSG) model

• In MSSG the number of sense vectors was fixed in advance

• NP-MSSG determines it automatically

  • The rule is simple: if the average of the context vectors differs too much from everything seen so far, add a new sense vector

Page 12:

How NP-MSSG predicts the sense

• If a context unlike any seen so far appears, assign a new sense number

  • k(wt) is the number of senses assigned to wt so far

  • vcontext is the average context vector of wt

  • μ(wt, k) is the context center of the k-th sense
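The assignment rule described above can be sketched like this. It is an illustration under stated assumptions: cosine similarity is assumed as the similarity measure, the threshold name lam and its value are arbitrary choices for the example, and the update of the context centers after each assignment is left out for brevity.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def assign_sense(v_context, centers, lam):
    """Return a sense index for v_context, creating a new sense if needed.

    centers plays the role of mu(wt, 1..k(wt)); it is extended in place
    when no existing center is at least lam-similar to v_context.
    """
    if centers:
        sims = [cosine(mu, v_context) for mu in centers]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] >= lam:
            return best
    # Context unlike anything seen so far: open a new sense.
    centers.append(list(v_context))
    return len(centers) - 1


# Toy run (hypothetical values): three contexts for the same word.
centers = []
lam = 0.5  # arbitrary threshold chosen for this example
s0 = assign_sense([1.0, 0.0], centers, lam)
s1 = assign_sense([0.9, 0.1], centers, lam)
s2 = assign_sense([0.0, 1.0], centers, lam)
print([s0, s1, s2], len(centers))
```

The second context is close to the first and reuses its sense, while the third is orthogonal to it and opens a new one, so k(wt) grows only when needed.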

Page 13:

Summary of the methods

• MSSG

  • Assigns multiple sense vectors to each word (the number is a fixed parameter)

  • [illegible]

  • Runs ordinary Skip-gram using the predicted sense vector

• NP-MSSG

  • Assigns multiple sense vectors, as in MSSG

  • [illegible]

  • Runs Skip-gram with the predicted sense vector, as in MSSG

Page 14:

Experimental results (Apple)

• The fruit sense and the Jobs sense separate cleanly

Page 15:

Experimental results (Run)

• Words with ambiguous meaning get multiple sense vectors

Page 16:

Experiment: can similarity between senses be taken into account?

• localSim (similarity between the best-matching senses) [remainder illegible]

Page 17:

Impressions

• I had actually been running almost the same experiment (^o^)

  • I had struggled with how to tune the number of classes automatically, so my reaction was: is something this simple really enough?

• Deciding from the topic of the whole document feels more natural than deciding from the surrounding words

  • Use Paragraph Vector?