Ulanov nlp-7

  • Date posted: 18-Dec-2014

transcript

  • 1. Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
  • 2. [Numbered list of eight items; text lost in extraction.] Chris Manning and Hinrich Schuetze. Foundations of Statistical Natural Language Processing, MIT Press, 1999. Philipp Koehn. Statistical Machine Translation, Cambridge Univ. Press, 2010. Stanford Coursera (Manning).
  • 5. Sentiment Analysis, Opinion Mining [...] Sentiment Analysis, Opinion Mining, Data Mining [...]
  • 8. User1, 1.4.2011 22:01: iPhone 5 [...] 4s [...] 4 [...] 1410 [...] 1400 [...] apple [...]
  • 9. Named Entity Recognition; Relationship Extraction; Sentiment Identification; Co-reference Resolution; Synonym Extraction; Information Extraction; NLP [...]
  • 10. [...] General Inquirer (1045 pos, 1160 neg), Bing Liu (2007 pos, 4784 neg), MPQA (2718 pos, 4913 neg). [...] 1, 2 [...]; 4, 5 (+ [...]). [...] 0/1 [Pang & Lee 2002]; [...] log [...]; delta [...] log [...] (formula symbols lost in extraction). Classifiers: NaiveBayes, SVM, Decision Trees. ~83% F-[...]
    Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. "Thumbs up?: sentiment classification using machine learning techniques." Proceedings of the ACL-02 [...]
    Martineau, Justin, and Tim Finin. "Delta TFIDF: An Improved Feature Space for Sentiment Analysis." ICWSM. 2009.
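The weighting formulas on slide 10 did not survive extraction. As a hedged illustration of the idea behind the cited Delta TFIDF paper, the sketch below weights each term by its count in the document times the difference between its idf in the negative and in the positive training documents. The function name `delta_tfidf`, the add-one smoothing, the sign convention (terms that lean positive get positive weights) and the toy documents are all assumptions made here, not taken from the slide.

```python
import math
from collections import Counter

def delta_tfidf(doc_tokens, pos_docs, neg_docs):
    """Weight each term of one document by tf * (idf_neg - idf_pos)."""
    # Document frequency of every term in each labeled training set.
    df_pos = Counter(t for d in pos_docs for t in set(d))
    df_neg = Counter(t for d in neg_docs for t in set(d))
    n_pos, n_neg = len(pos_docs), len(neg_docs)
    weights = {}
    for term, tf in Counter(doc_tokens).items():
        # Add-one smoothing (an assumption) avoids log(0) and division by zero.
        idf_pos = math.log2((n_pos + 1) / (df_pos[term] + 1))
        idf_neg = math.log2((n_neg + 1) / (df_neg[term] + 1))
        weights[term] = tf * (idf_neg - idf_pos)
    return weights

# Hypothetical toy corpus.
pos = [["great", "phone"], ["great", "battery"]]
neg = [["poor", "battery"], ["poor", "screen"]]
w = delta_tfidf(["great", "battery", "poor"], pos, neg)
```

With these toy documents, "great" gets a positive weight, "poor" a negative one, and "battery", which is equally frequent in both classes, gets zero. That is the property that makes the feature space useful for a linear classifier.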
  • 11. [...] 2011 [...]: 750 [...] 124 [...]

        MicroP   MicroR (Accuracy)   MacroR   MacroF1
        0.84     0.84                0.59     0.60
        0.84     0.84                0.62     0.63
        0.84     0.80                0.59     0.61
        0.86     0.82                0.59     0.61

    Row labels that survive extraction (assignment to rows unclear): Perceptron; Perceptron + delta-tf-idf.
    *A. Ulanov, G. Sapozhnikov. Context-Dependent Opinion Lexicon Translation with the Use of a Parallel Corpus. Dialog 2013.
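The table on slide 11 reports micro-averaged and macro-averaged scores side by side, and the macro figures are much lower. A minimal sketch of why that can happen, assuming single-label classification and an imbalanced test set; the helper name `micro_macro_recall` and the toy labels are hypothetical and not from the slide.

```python
from collections import Counter

def micro_macro_recall(gold, pred):
    """Return (micro recall, macro recall) for single-label predictions."""
    classes = sorted(set(gold))
    total = Counter(gold)
    correct = Counter(g for g, p in zip(gold, pred) if g == p)
    # Micro: pool all documents; with one label per document this is accuracy.
    micro = sum(correct.values()) / len(gold)
    # Macro: average the per-class recalls, so rare classes count equally.
    macro = sum(correct[c] / total[c] for c in classes) / len(classes)
    return micro, macro

# Hypothetical imbalanced test set and a classifier that always answers "pos".
gold = ["pos"] * 8 + ["neg"] * 2
pred = ["pos"] * 10
micro, macro = micro_macro_recall(gold, pred)
```

Here the micro score is 0.8 while the macro score is 0.5, because the minority class is never recognized. This is one plausible reading of the gap between the MicroR and MacroR columns, not a claim about the actual experiment.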
  • 12. SVM (liblinear), accuracy. Term weight: (0.5 + 0.5 · [...]) · log(([...] + 0.5) / ([...] + 0.5)) (symbols lost in extraction). Weighting-scheme labels surviving on the slide: bTFIDF (norm), dTFIDF, adTFIDFs.

        Dataset        [...]   [...]   Accuracy by weighting scheme
        Movie Review   2000    668     87.85   88.20   91.60   96.60
        MultiDomain    8000    217     86.96   88.25   92.25   96.36
        BLOGS06        17898   2832    77.39   78.55   80.58   85.04

    Paltoglou, Georgios, and Mike Thelwall. "A study of information retrieval weighting schemes for sentiment analysis." Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2010.
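The fragments that survive on slide 12 (0.5 + 0.5, log, + 0.5 twice) look like an augmented term frequency multiplied by a smoothed delta idf, two of the weighting components studied in the cited Paltoglou and Thelwall paper. The sketch below is one reading of those fragments under that assumption. The function names, the order of the class indices 1 and 2, and the toy counts are assumptions, not recovered from the slide.

```python
import math

def augmented_tf(tf, max_tf):
    """Augmented term frequency: 0.5 + 0.5 * tf / (largest tf in the document)."""
    return 0.5 + 0.5 * tf / max_tf

def smoothed_delta_idf(n1, df1, n2, df2):
    """log((N1*df2 + 0.5) / (N2*df1 + 0.5)); the 0.5 terms keep it finite."""
    return math.log((n1 * df2 + 0.5) / (n2 * df1 + 0.5))

def term_weight(tf, max_tf, n1, df1, n2, df2):
    """Product of the two components above for one term in one document."""
    return augmented_tf(tf, max_tf) * smoothed_delta_idf(n1, df1, n2, df2)

# Hypothetical counts: 100 documents per class.
balanced = term_weight(2, 4, n1=100, df1=10, n2=100, df2=10)  # same df in both classes
skewed = term_weight(2, 4, n1=100, df1=0, n2=100, df2=10)     # term absent from class 1
```

A term that is equally common in both classes gets weight zero, while a term missing from one class still gets a finite weight thanks to the 0.5 smoothing.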
  • 13. [...] :) :( [Go et al. 2009] [...] 50% [...] (40% [Jiang et al. 2011]) [...]: SVM [...] 82%
    Go, Alec, Richa Bhayani, and Lei Huang. "Twitter sentiment classification using distant supervision." CS224N Project Report, Stanford (2009): 1-12.
    Jiang, Long, et al. "Target-dependent Twitter Sentiment Classification." ACL. 2011.
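Slide 13 pairs the emoticons :) and :( with the distant-supervision paper by Go et al. The general idea is to treat an emoticon as a noisy class label and remove it from the text before training. A minimal sketch of that labeling step follows; the function name `distant_label`, the decision to skip tweets with no emoticon or with both, and the sample tweets are assumptions made for illustration.

```python
POSITIVE = (":)",)
NEGATIVE = (":(",)

def distant_label(tweet):
    """Return (text without emoticons, label) or None if no usable label."""
    has_pos = any(e in tweet for e in POSITIVE)
    has_neg = any(e in tweet for e in NEGATIVE)
    if has_pos == has_neg:
        # Either no emoticon at all or conflicting ones: skip the tweet.
        return None
    text = tweet
    for emoticon in POSITIVE + NEGATIVE:
        # Strip the emoticon so a classifier cannot simply memorize it.
        text = text.replace(emoticon, "")
    label = "pos" if has_pos else "neg"
    return (" ".join(text.split()), label)

# Hypothetical tweets.
examples = [distant_label(t) for t in (
    "love my new phone :)",
    "battery died again :(",
    "just a tweet",
)]
```

The resulting pairs can be fed to any standard text classifier, such as the SVM mentioned on the slide.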
  • 14. [...] Delta-tf-idf [...]
  • 15. [...] PMI [...]
    J. Blitzer, M. Dredze, and F. Pereira. 2007. Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In ACL.
  • 16. [Banea et al. 2011] [...] (Google Translate) [...] #1 [...] #2 [...] [Steinberger et al. 2012] [...] Google Translate [...] [Ulanov & Sapozhnikov 2013]
  • 17. [...]: bootstrapping [...] JJ- NN RB* VB* [...] PMI [...]. $SO(phrase) = PMI(phrase, \text{"excellent"}) - PMI(phrase, \text{"poor"})$, where $PMI(a, b) = \log \frac{hits(a, b)}{hits(a)\,hits(b)}$.
    *Turney, P. Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In Proceedings of Annual Meeting of the Association for Computational Linguistics (ACL-2002), 2002.
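The semantic orientation on slide 17 is a difference of two PMI values estimated from search-engine hit counts. A small sketch of that arithmetic, assuming all hit counts are nonzero; the function names, the dictionary keys and the counts themselves are hypothetical, and base-2 logarithms are used as in Turney's paper.

```python
import math

def pmi(hits_ab, hits_a, hits_b):
    """PMI estimated from hit counts, up to an additive constant."""
    return math.log2(hits_ab / (hits_a * hits_b))

def semantic_orientation(hits):
    """SO(phrase) = PMI(phrase, "excellent") - PMI(phrase, "poor")."""
    return (pmi(hits["phrase+excellent"], hits["phrase"], hits["excellent"])
            - pmi(hits["phrase+poor"], hits["phrase"], hits["poor"]))

# Hypothetical hit counts for one candidate phrase.
hits = {
    "phrase": 1_000,
    "excellent": 1_000_000,
    "poor": 1_000_000,
    "phrase+excellent": 80,  # phrase co-occurring with "excellent"
    "phrase+poor": 20,       # phrase co-occurring with "poor"
}
so = semantic_orientation(hits)
```

With these counts the phrase co-occurs four times more often with "excellent" than with "poor", so its orientation comes out at about 2 bits, that is, positive. The constant that hit counts omit (the total number of pages) cancels in the difference.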
  • 18. [Hatzivassiloglou & McKeown 1997]: [...] (and, or, but, either-or, neither-nor) [...] (90% F1). PMI [Turney 2002]: PMI(best, candidate) - PMI(worst, candidate).
    Hatzivassiloglou, Vasileios, and Kathleen R. McKeown. "Predicting the semantic orientation of adjectives." ACL, 1997.
  • 19. [...] PMI, IG... Double propagation [Qiu et al 2009, 2011] [...]
  • 20. [...] 40% [Jiang et al. 2011] [...] $so(a) = \sum_i \frac{so(w_i)}{d(w_i, a)^2}$