Convolutional Neural Networks (Part 3)
CS 510 Lecture #25
April 5th, 2017
Announcements (repeat)
• PA4 is due Monday, April 17th
• Test #2 will be Wednesday, April 19th
• Test #3 is Monday, May 8th at 8AM
  – Just 1 hour long
  – University schedule says 7:30…
CS510, Image Computation, © Ross Beveridge & Bruce Draper (4/7/17)
VGG.PY

    import numpy as np
    import tensorflow as tf
    from scipy.misc import imread, imresize

    # vgg16 and class_names are defined earlier in vgg.py
    if __name__ == '__main__':
        sess = tf.Session()
        images = tf.placeholder(tf.float32, [None, 224, 224, 3])
        vgg = vgg16(images, 'vgg16_weights.npz', sess)

        img1 = imresize(imread('test.jpeg', mode='RGB'), (224, 224))
        img2 = imresize(imread('test1.jpg', mode='RGB'), (224, 224))
        image_stack = np.stack([img1, img2])

        probs = sess.run(vgg.probs, feed_dict={vgg.imgs: image_stack})
        preds = np.argmax(probs, axis=1)

        for index, p in enumerate(preds):
            print("Prediction: %s; Probability: %f" % (class_names[p], probs[index, p]))
Don't need to touch this
This is where you load your images
Invokes VGG net, returns activations (+ example print)
AlexNet : The Start of a Revolution
What Does AlexNet Learn?
• Layer #1 Convolution masks:
Final Layer Features
http://yosinski.com/static/proj/deepvis_goose_ostrich.jpg
Created with gradient-ascent optimization and L2 regularization
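As a hedged illustration, gradient ascent with L2 regularization can be sketched on a toy score function; the linear "activation", learning rate, and decay values below are all hypothetical stand-ins for backprop through the real network:

```python
import numpy as np

def activation_grad(w, img):
    """Gradient of a stand-in linear activation sum(w * img) w.r.t. img."""
    return w

def visualize(w, steps=2000, lr=0.1, l2=0.1):
    """Gradient ascent on the input image, with L2 decay toward zero."""
    img = np.zeros_like(w)                 # start from a blank image
    for _ in range(steps):
        grad = activation_grad(w, img)     # from backprop in a real network
        img += lr * (grad - l2 * img)      # ascend activation, penalize norm
    return img

w = np.array([1.0, -2.0, 0.5])
img = visualize(w)
# The L2 term keeps pixel values bounded: at the fixed point, grad = l2 * img
```

For a real network the gradient comes from backpropagating the chosen unit's activation to the input pixels; the L2 term is what keeps the synthesized image from blowing up.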
4/7/17
2
More Recent Performance
Network     Year  Top-1 Error Rate  Top-5 Error Rate
AlexNet     2012  37.5%             17%
VGG-16      2015  28.07%            9.33%
GoogLeNet   2015  (unreported)      9.15%
PReLU-Net   2015  24.27%            7.38%
ResNet      2016  21.43%            5.71%
• Ceiling unclear (some errors in labels)
• Steady improvement over time
• What you would expect from continuous refinement
Batch Processing
• GPUs have improved
  – No need to split processing across 2 GPUs
• Batch Training
  – Batch size determined by GPU memory limitations
    • 64 images is common (so is 32)
  – Instead of epochs, batches
    • Images selected at random
    • Images pre-processed each time they are selected
    • Images processed as a batch
    • Weights updated as a batch
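The batch loop above can be sketched in NumPy; `preprocess`, the placeholder gradient, and all parameter values here are hypothetical stand-ins, since the real forward/backward pass depends on the network:

```python
import numpy as np

rng = np.random.default_rng(0)

def preprocess(img):
    """Stand-in pre-processing, re-applied each time an image is drawn."""
    return img - img.mean()

def train(dataset, weights, num_batches=100, batch_size=64, lr=0.01):
    for _ in range(num_batches):
        # Images selected at random (no epoch bookkeeping)
        idx = rng.integers(0, len(dataset), size=batch_size)
        # Pre-process each selected image, then process them as one batch
        batch = np.stack([preprocess(dataset[i]) for i in idx])
        # ... forward pass on `batch`, backprop would produce `grad` ...
        grad = np.zeros_like(weights)      # placeholder gradient
        weights = weights - lr * grad      # one weight update per batch
    return weights
```

The point of the sketch is the control flow: sampling, pre-processing, and the weight update all happen once per batch rather than once per epoch.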
Image Pre-processing
• Goal: more scale & translation invariance
• Every time an image is included in a batch
  – Randomly pick a number between 256 & 480
  – Rescale shorter side to this number
  – Crop a random 224x224 image from the result
  – Horizontally flip image 50% of the time
  – Subtract mean pixel value
• No training sample is ever the same twice!
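A minimal NumPy sketch of the five steps above; nearest-neighbor indexing stands in for a real image resize, and the per-image mean stands in for the dataset mean pixel used in practice:

```python
import numpy as np

rng = np.random.default_rng(0)

def resize_shorter_side(img, target):
    """Rescale so the shorter side equals `target` (nearest-neighbor)."""
    h, w = img.shape[:2]
    scale = target / min(h, w)
    nh, nw = round(h * scale), round(w * scale)
    rows = (np.arange(nh) * h / nh).astype(int)
    cols = (np.arange(nw) * w / nw).astype(int)
    return img[rows][:, cols]

def augment(img, crop=224):
    target = rng.integers(256, 481)        # random number between 256 & 480
    img = resize_shorter_side(img, target)
    h, w = img.shape[:2]
    y = rng.integers(0, h - crop + 1)      # random 224x224 crop
    x = rng.integers(0, w - crop + 1)
    img = img[y:y + crop, x:x + crop]
    if rng.random() < 0.5:                 # horizontal flip 50% of the time
        img = img[:, ::-1]
    return img - img.mean()                # subtract mean pixel value
```

Because the scale, crop position, and flip are redrawn every time, the same source image yields a different training sample on each draw.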
Image Pre-processing Example
Source image (Golden Retriever)
Training instance #1: scaled to ~256 (V), center cut
Training instance #2: scaled to ~400 (V), upper-left cut
Batch Normalization
• Batch Normalization nodes process all the data in a batch at once
• Compute the mean and std. dev. of the HxWxD values
• Normalize the batch data to zero mean and unit variance
• Re-mean and re-scale the data using two learned parameters, α and β
[Diagram: Input Raster Data (HxWxD) → Convolution → ReLU → Output Raster Data (HxWxD), with a new node type, the Batch Normalization node, inserted into the pipeline]
Batch Normalization (II)
Step 1: Compute the batch mean
$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

Step 2: Compute the batch variance
$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$

Step 3: Normalize the data
$\hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \varepsilon}}$

Step 4: Rescale the data & move the mean
$y_i = \alpha \hat{x}_i + \beta$
Note: α and β trained by backprop
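The four steps translate directly into NumPy; α and β default to the identity transform here, and ε is the usual small constant guarding the square root:

```python
import numpy as np

def batch_norm(x, alpha=1.0, beta=0.0, eps=1e-5):
    """Batch-normalize an (N, ...) array over the batch axis."""
    mu = x.mean(axis=0)                    # Step 1: batch mean
    var = ((x - mu) ** 2).mean(axis=0)     # Step 2: batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # Step 3: normalize
    return alpha * x_hat + beta            # Step 4: rescale and shift
```

With the identity α = 1, β = 0, the output of each feature has (approximately) zero mean and unit variance over the batch; training then adjusts α and β by backprop.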
Batch Normalization: Effect
• Batch normalization increases learning speed
• Unclear if final answer is improved
• β resembles a bias term
https://shuuki4.wordpress.com/2016/01/13/batch-normalization-%EC%84%A4%EB%AA%85-%EB%B0%8F-%EA%B5%AC%ED%98%84/