LSTM Networks for Sentiment Analysis with Keras
LSTM Networks for Sentiment Analysis
YAN TING LIN
Summary
• This tutorial aims to provide an example of how a Recurrent Neural Network (RNN) using the Long Short-Term Memory (LSTM) architecture can be implemented using Theano. In this tutorial, this model is used to perform sentiment analysis on movie reviews from the Large Movie Review Dataset, sometimes known as the IMDB dataset.
• In this task, given a movie review, the model attempts to predict whether it is positive or negative. This is a binary classification task.
• Ref: http://deeplearning.net/tutorial/lstm.html
Data
• Ref: https://keras.io/datasets/
• Dataset of 25,000 movie reviews from IMDB, labeled by sentiment (positive/negative). Reviews have been preprocessed, and each review is encoded as a sequence of word indexes (integers). For convenience, words are indexed by overall frequency in the dataset, so that for instance the integer "3" encodes the 3rd most frequent word in the data. This allows for quick filtering operations such as: "only consider the top 10,000 most common words, but eliminate the top 20 most common words".
• As a convention, "0" does not stand for a specific word, but instead is used to encode any unknown word.
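The indexing scheme described above can be sketched in plain Python. This is a toy illustration, not the Keras loader: the helpers `build_word_index` and `encode` are hypothetical, the three tiny reviews are made up, and the parameter names `num_words` and `skip_top` are chosen here only for illustration.

```python
from collections import Counter


def build_word_index(tokenized_reviews):
    """Rank words by overall frequency: the most frequent word gets index 1."""
    counts = Counter(word for review in tokenized_reviews for word in review)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ranked, start=1)}


def encode(review, word_index, num_words=None, skip_top=0):
    """Encode a review as integers; 0 stands for any unknown or filtered word."""
    encoded = []
    for word in review:
        idx = word_index.get(word, 0)  # words never seen before map to 0
        too_common = idx <= skip_top
        too_rare = num_words is not None and idx > num_words
        encoded.append(0 if too_common or too_rare else idx)
    return encoded


# Made-up data, only to exercise the helpers.
reviews = [
    ["the", "movie", "was", "great"],
    ["the", "movie", "was", "bad"],
    ["the", "plot", "was", "great"],
]
word_index = build_word_index(reviews)

unknown_example = encode(["the", "movie", "was", "awful"], word_index)
filtered_example = encode(
    ["the", "movie", "was", "great"], word_index, num_words=4, skip_top=1
)
print(unknown_example)   # "awful" is unknown, so it becomes 0
print(filtered_example)  # "the" is the single most common word, so it becomes 0
```

Because the index is the frequency rank, both filters reduce to integer comparisons, which is what makes the "top 10,000 but not the top 20" style of filtering cheap.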
Data
Data Label
Train Data: X_train
Train Data Answer: y_train
Test Data: X_test
Test Data Answer: y_test
Understanding LSTM Networks
• Ref: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
• Recurrent Neural Networks
• The Problem of Long-Term Dependencies
• LSTM Networks
• The Core Idea Behind LSTMs
• Step-by-Step LSTM Walk Through
• Variants on Long Short Term Memory
• Conclusion
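For orientation, the step-by-step walk through in the outline above covers the standard LSTM update for one time step. A compact statement of those equations follows, assuming the usual notation: $x_t$ is the input, $h_{t-1}$ the previous hidden state, $C_{t-1}$ the previous cell state, $\sigma$ the logistic sigmoid, and $\odot$ element-wise multiplication. The weight and bias names are conventional labels, not identifiers from the slides.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(new cell state)} \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{(new hidden state)}
\end{aligned}
```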
Install TensorFlow
ImportError: No module named tensorflow
# create a virtual environment using Python 2.7
• conda create -n tensorflow python=2.7
# enter the conda virtual environment
• source activate tensorflow
# use pip to install (Mac OS X, GPU enabled, Python 2.7)
• export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/mac/gpu/tensorflow-0.11.0-py2-none-any.whl
• sudo pip install --upgrade $TF_BINARY_URL
Install Keras (conda)
• conda install -c conda-forge keras
• # you may use conda-forge to install TensorFlow
• # ref: https://conda-forge.github.io
• conda install -c conda-forge tensorflow
Data Preprocessing
Make each IMDB review a fixed length (80 word indexes)
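The fixed-length step can be sketched in plain Python as follows. The helper `pad_to_length` is hypothetical and written only for illustration. It pads and truncates at the front of the sequence, which is my understanding of the default behaviour of the Keras `pad_sequences` utility; treat that correspondence as an assumption and check it against your Keras version.

```python
def pad_to_length(sequence, maxlen=80, value=0):
    """Return a list of exactly `maxlen` integers.

    Longer sequences keep only their last `maxlen` items; shorter ones are
    left-padded with `value`.
    """
    truncated = sequence[-maxlen:]
    return [value] * (maxlen - len(truncated)) + truncated


short_review = [5, 12, 7]          # made-up review with 3 word indexes
long_review = list(range(1, 101))  # made-up review with 100 word indexes

padded = pad_to_length(short_review)
clipped = pad_to_length(long_review)

print(len(padded), padded[-3:])               # 77 zeros, then the 3 real indexes
print(len(clipped), clipped[0], clipped[-1])  # only the last 80 indexes remain
```

After this step every review has the same length, so a batch of reviews can be stacked into one rectangular array for the model.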
Model
Train Model
• In neural network terminology:
• one epoch = one forward pass and one backward pass of all the training examples
• batch size = the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need.
• number of iterations = number of passes, each pass using [batch size] number of examples. To be clear, one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes).
• Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to complete 1 epoch.
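The relationship between epoch, batch size, and iterations in the example above can be checked with a few lines of Python. The helper names `iterations_per_epoch` and `batches` are hypothetical, and the sketch assumes that a final, smaller batch still counts as one iteration.

```python
def iterations_per_epoch(num_examples, batch_size):
    """Number of forward/backward passes needed to see every example once."""
    # Integer ceiling division: a final partial batch counts as one iteration.
    return (num_examples + batch_size - 1) // batch_size


def batches(examples, batch_size):
    """Yield consecutive slices of at most `batch_size` examples."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


training_examples = list(range(1000))  # stand-in for 1000 training examples

n_iterations = iterations_per_epoch(len(training_examples), 500)
batch_sizes = [len(batch) for batch in batches(training_examples, 500)]

print(n_iterations)  # 2 iterations complete 1 epoch
print(batch_sizes)   # [500, 500]
```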
Result - 1: Downloading the data and training the model take a long time
Result - 2: After 1 hour
Time Reduction
• Make the training data smaller: 5x smaller and 5x faster.
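One way to shrink the training set is to keep a random fifth of the examples while keeping each review paired with its label. The sketch below is an assumption about how this could be done, not the method used in the slides. The helper `subsample` is hypothetical, and the data is a made-up stand-in for `X_train` and `y_train`.

```python
import random


def subsample(x, y, factor=5, seed=0):
    """Keep roughly 1/factor of the examples, preserving (x, y) pairing."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    keep = max(1, len(x) // factor)
    indices = sorted(rng.sample(range(len(x)), keep))
    return [x[i] for i in indices], [y[i] for i in indices]


# Stand-in data: 25 "reviews" whose label is simply the parity of the review id.
X_train = list(range(25))
y_train = [i % 2 for i in X_train]

X_small, y_small = subsample(X_train, y_train)
print(len(X_small), len(y_small))  # 5 5
```

Sampling at random rather than taking the first fifth avoids inheriting any ordering in the data, though a smaller training set will usually cost some accuracy.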
Visualizing your model
# install pydot and graphviz
• conda install -c anaconda pydot=1.0.28
• conda install -c anaconda graphviz=2.38.0
# in Python code
Dropout Comparison - 1
Dropout Comparison - 2
Why Keras?