Tensorflow (1.x) Review Session
CS330: Deep Multi-task and Meta Learning — 9/26/2019 — Suraj Nair
Overview
● Installation
● Tensorflow Basics
  ○ Variables / Placeholders / Constants / Operations
  ○ Graphs / Sessions
  ○ Optimizers
  ○ Training loop
  ○ MNIST Example
● High Level APIs
  ○ tf.layers, Keras
● Advanced features
  ○ Memory layers
  ○ Manually extracting gradients
  ○ Distributions
  ○ Eager execution

Some examples adapted from https://adventuresinmachinelearning.com/python-tensorflow-tutorial/
Installation
● In this class we will use Python 3.5+ and tensorflow 1.14.0
● pip install tensorflow(-gpu)==1.14.0
● We recommend a python virtual environment
  ○ virtualenv
  ○ Anaconda
● Notes about GPUs:
  ○ Tf 1.14 requires CUDA 10
  ○ If you want to run your code on a GPU, make sure CUDA 10 is installed (CUDA 9 will not work)
Tensorflow Basics: Variables/Constants/Operations
● Standard definitions of a constant and a variable
  ○ A constant is fixed
  ○ A variable can be assigned any value and can be optimized
  ○ Both can be used in a computation graph
● Operations can be applied to Variables to define new Variables
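The bullets above can be sketched as follows. The `tensorflow.compat.v1` import is an addition to the slide — a compatibility shim so this 1.x-style code also runs under TF 2.x:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # graph mode; needed only when running under TF 2.x

a = tf.constant(2.0, name="a")   # a constant is fixed
b = tf.Variable(1.0, name="b")   # a variable can be reassigned / optimized
c = tf.add(a, b, name="c")       # an operation defines a new graph node
d = c * c                        # Python operators also build graph ops

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    result = sess.run(d)
print(result)  # 9.0
```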
Tensorflow Basics: Placeholders
● What if you don’t know the value of a variable yet?
● A placeholder is a node in the graph whose value is filled in later
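A minimal placeholder sketch (again using the `compat.v1` shim, an addition to the slide, so it also runs on TF 2.x) — the value is supplied via `feed_dict` at run time:

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # needed only under TF 2.x

x = tf.placeholder(tf.float32, shape=[None, 3], name="x")  # value filled in later
y = tf.reduce_sum(x, axis=1)

with tf.Session() as sess:
    # The placeholder's value is supplied here, at run time
    out = sess.run(y, feed_dict={x: np.array([[1., 2., 3.], [4., 5., 6.]])})
print(out)  # [ 6. 15.]
```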
Tensorflow Basics: Graphs/Sessions
● A set of variables/constants/placeholders and the operations between them form a computation graph, stored as a tf.Graph object
● A tf.Session stores a graph, as well as computation information (e.g. GPU/CPU)
● tf.Session.run(variable) runs the computation graph and outputs the value of that variable
● You need to create a session and initialize the variables
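A sketch of the graph/session workflow described above (the `compat.v1` shim is an addition so the code also runs under TF 2.x). Note the initializer op must be created inside the same graph as the variables:

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # needed only under TF 2.x

graph = tf.Graph()
with graph.as_default():                      # build nodes into this tf.Graph
    w = tf.Variable(3.0)
    out = w * 2.0
    init = tf.global_variables_initializer()  # must be created in the same graph

with tf.Session(graph=graph) as sess:         # session stores the graph + device info
    sess.run(init)                            # initialize variables first
    val = sess.run(out)                       # run the graph, get the node's value
print(val)  # 6.0
```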
Tensorflow Basics: Optimizers
● Given a node in the graph, you can optimize the variables in the graph to minimize/maximize that node
● Standard set of optimizers
  ○ Adam
  ○ SGD
● When you run the optimizer using sess.run, it computes gradients and applies them
Tensorflow Basics: Optimizers/Training Loop
● The training loop repeatedly calls sess.run on the optimizer op; each call computes gradients and applies them
● Normally the data inputs will be placeholders, fed a fresh batch each step
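The optimizer/training-loop pattern can be sketched with a toy linear regression (the `compat.v1` shim and the synthetic data are additions for illustration):

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # needed only under TF 2.x

# Toy data: y = 2x + 1 (stand-in for a real dataset)
xs = np.random.rand(100, 1).astype(np.float32)
ys = 2.0 * xs + 1.0

x = tf.placeholder(tf.float32, [None, 1])   # data inputs are placeholders
y = tf.placeholder(tf.float32, [None, 1])
w = tf.Variable(0.0)
b = tf.Variable(0.0)
loss = tf.reduce_mean(tf.square(w * x + b - y))        # node to minimize

train_op = tf.train.AdamOptimizer(0.1).minimize(loss)  # computes + applies gradients

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(500):                               # training loop
        sess.run(train_op, feed_dict={x: xs, y: ys})   # one gradient step per run
    final_loss, w_val = sess.run([loss, w], feed_dict={x: xs, y: ys})
print(final_loss, w_val)  # loss near 0, w near 2
```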
Tensorflow Basics: MNIST Example
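The MNIST slide itself is not reproduced in the transcript; the sketch below shows the same pattern with random arrays standing in for real MNIST batches (the `compat.v1` shim, the single-layer architecture, and the synthetic data are assumptions, not the slide's exact code):

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # needed only under TF 2.x

images = tf.placeholder(tf.float32, [None, 784])  # flattened 28x28 images
labels = tf.placeholder(tf.int64, [None])         # digit class ids 0-9

logits = tf.layers.dense(images, 10)              # single linear layer: 784 -> 10
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
accuracy = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(logits, 1), labels), tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Random batch as a stand-in for mnist.train.next_batch(...)
    batch_x = np.random.rand(32, 784).astype(np.float32)
    batch_y = np.random.randint(0, 10, size=32)
    _, acc = sess.run([train_op, accuracy],
                      feed_dict={images: batch_x, labels: batch_y})
print(acc)
```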
High Level APIs
● Writing out every Variable in a big network is time consuming
● Helpful wrapper functions contain standard network layers
  ○ tf.layers
    ■ tf.layers.Dense
    ■ tf.layers.Conv2D
  ○ tf.keras
    ■ tf.keras.layers.Conv2D
    ■ tf.keras.layers.ConvLSTM2D
● Wrapper loss functions
  ○ Cross entropy loss
  ○ Mean squared error
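A sketch combining a wrapper layer and a wrapper loss (the `compat.v1` shim and the tiny architecture are illustrative additions):

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # needed only under TF 2.x

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
h = tf.keras.layers.Conv2D(16, 3, activation="relu")(x)  # wrapper conv layer
h = tf.keras.layers.Flatten()(h)
pred = tf.keras.layers.Dense(1)(h)                       # wrapper dense layer

target = tf.placeholder(tf.float32, [None, 1])
mse = tf.losses.mean_squared_error(target, pred)         # wrapper MSE loss

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(mse, feed_dict={
        x: np.zeros((2, 28, 28, 1), np.float32),
        target: np.ones((2, 1), np.float32)})
print(out)  # 1.0: zero inputs + zero-initialized biases give pred = 0
```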
Advanced Features: Memory
● Recurrent cell:
  ○ lstm_cell = tf.keras.layers.LSTMCell(hidden_size)
  ○ output, hidden_state = lstm_cell(input, hidden_state)
● Or use a recurrent layer:
  ○ lstm_layer = tf.keras.layers.LSTM(hidden_size, return_sequences=True)
  ○ output = lstm_layer(input)
    ■ Where input is (batch_size, seq_len, input_dim)
    ■ And output is (batch_size, seq_len, hidden_size)
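The recurrent-layer variant above, run end to end to confirm the shapes (the `compat.v1` shim and the specific sizes are illustrative additions):

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # needed only under TF 2.x

batch_size, seq_len, input_dim, hidden_size = 4, 7, 5, 8
inputs = tf.placeholder(tf.float32, [None, seq_len, input_dim])

# return_sequences=True -> one output per timestep, not just the final one
lstm_layer = tf.keras.layers.LSTM(hidden_size, return_sequences=True)
outputs = lstm_layer(inputs)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    out = sess.run(outputs, feed_dict={
        inputs: np.random.rand(batch_size, seq_len, input_dim).astype(np.float32)})
print(out.shape)  # (4, 7, 8) = (batch_size, seq_len, hidden_size)
```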
Advanced Features: Using Gradients
● Can directly get the gradient da/db using tf.gradients(a, b)
● For example, you can do gradient descent without using an optimizer
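Gradient descent with `tf.gradients` and a manual update, as the slide suggests (the `compat.v1` shim and the toy objective are illustrative additions):

```python
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()  # needed only under TF 2.x

w = tf.Variable(5.0)
loss = tf.square(w)                  # minimize w^2; the gradient is 2w
grad = tf.gradients(loss, w)[0]      # d(loss)/d(w), extracted directly
step = tf.assign_sub(w, 0.1 * grad)  # manual gradient-descent update, no optimizer

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(step)
    final_w = sess.run(w)
print(final_w)  # close to 0
```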
Advanced Features: Distributions
● Often you may need distributions
  ○ Training a generative model
  ○ Stochastic policies
● Used to be tf.Distributions — now tensorflow_probability.distributions
  ○ pip install tensorflow-probability
● If a distribution is reparameterized, you can backpropagate through sample()
Advanced Features: Eager Execution
● Standard TF uses a static graph
  ○ The graph is built once and fixed
    ■ Makes feedforward/gradient computation fast, since the graph is static
● Eager execution
  ○ The graph is built dynamically
  ○ Track gradients with tf.GradientTape
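A minimal eager-mode sketch with `tf.GradientTape` (the version guard is an addition: TF 1.14 needs eager mode switched on explicitly, while TF 2.x is eager by default):

```python
import tensorflow as tf

# TF 1.x: eager mode must be enabled before building any ops; no-op check on TF 2.x
if not tf.executing_eagerly():
    tf.enable_eager_execution()

w = tf.Variable(3.0)
with tf.GradientTape() as tape:
    loss = w * w               # ops run immediately — no graph, no session
grad = tape.gradient(loss, w)  # d(w^2)/dw evaluated at w = 3
print(float(grad))  # 6.0
```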