Fast Semantic Proposals for Image and Video Annotation using … · 2019-04-09 · Srikar...


Srikar Muppirisetty, Manager, Machine Learning and Data Analytics, Volvo Cars

Joint work with Sohini RoyChowdary, presented at ICMLA 2018

Fast Semantic Proposals for Image and Video Annotation using Modified ESNs


Motivation

Annotated data is the "future source code" - Nvidia

Data is the new oil; mined, annotated data is the currency/gold

With DL models, massive amounts of high-quality annotated data become a necessity


Challenges with Data Annotation

Cost
• Manual annotation is expensive and time consuming
• Automatic generation of fast and accurate semantic pre-proposals for video and images.

Scalability
• Scalability of algorithms across data sets is often a challenge
• Proposed framework based on a variant of RNN, with high-level feature abstraction from a very small number of image frames.


Recurrent Neural Network
• Directed cycles in the connections between units
• Useful for sequential (series) data
• In an RNN, decisions are influenced by what has been learnt from the past
• Difficulty in training RNNs: vanishing gradient problem
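The vanishing gradient bullet can be illustrated numerically. The following is a minimal sketch, not taken from the talk: it assumes a plain tanh recurrence x(k) = tanh(W x(k-1)) with a random W scaled so that its largest singular value is 0.9 (all names and sizes here are illustrative assumptions), and tracks how the norm of the Jacobian of x(k) with respect to x(0) shrinks as k grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 8, 50                      # illustrative state size and number of steps

W = rng.normal(size=(n, n))
W *= 0.9 / np.linalg.norm(W, 2)   # scale so the largest singular value is 0.9

x = rng.normal(size=n)
J = np.eye(n)                     # accumulates d x(k) / d x(0)
norms = []
for _ in range(T):
    x = np.tanh(W @ x)
    # Chain rule through one step: d tanh(W x) / d x = diag(1 - tanh^2) W
    J = np.diag(1.0 - x**2) @ W @ J
    norms.append(np.linalg.norm(J, 2))

print(f"step 1: {norms[0]:.3e}, step {T}: {norms[-1]:.3e}")
```

Because the tanh derivative never exceeds 1, each step multiplies the Jacobian norm by at most 0.9, so the gradient signal reaching early time steps decays geometrically in this setting.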


Echo State Network
• Variant of RNN
• Input layer, dynamical reservoir, output layer
• Reservoir unit: large, sparsely and randomly connected neurons
• Inner weights are fixed and only the output layer is trained: low computational cost for training

Source: https://www.mdpi.com/1996-1073/8/10/12228/htm

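$\mathbf{x}(k) = f\big(\mathbf{W}_{in}\,\mathbf{u}(k) + \mathbf{W}\,\mathbf{x}(k-1) + \mathbf{W}_{back}\,\mathbf{y}(k-1)\big)$

$\mathbf{y}(k) = f_{out}\big(\mathbf{W}_{out}\,(\mathbf{u}(k), \mathbf{x}(k))\big)$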
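The state update and readout equations above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the ridge-regression readout fit, the teacher-forced feedback, the function names (`esn_step`, `esn_output`, `train_readout`), and the toy data are all hypothetical choices made for the example.

```python
import numpy as np

def esn_step(x_prev, u, y_prev, W_in, W, W_back, f=np.tanh):
    """Reservoir update: x(k) = f(W_in u(k) + W x(k-1) + W_back y(k-1))."""
    return f(W_in @ u + W @ x_prev + W_back @ y_prev)

def esn_output(u, x, W_out, f_out=lambda z: z):
    """Readout: y(k) = f_out(W_out (u(k), x(k))), with (u, x) concatenated."""
    return f_out(W_out @ np.concatenate([u, x]))

def train_readout(U, Y, W_in, W, W_back, ridge=1e-6):
    """Fit only W_out (ridge regression, an assumption); inner weights stay fixed.

    Teacher forcing: the known target y(k-1) is fed back while collecting states.
    """
    x = np.zeros(W.shape[0])
    y_prev = np.zeros(Y.shape[1])
    rows = []
    for u, y in zip(U, Y):
        x = esn_step(x, u, y_prev, W_in, W, W_back)
        rows.append(np.concatenate([u, x]))
        y_prev = y
    S = np.asarray(rows)                        # (T, n_in + n_res)
    A = S.T @ S + ridge * np.eye(S.shape[1])    # regularised normal equations
    return np.linalg.solve(A, S.T @ Y).T        # (n_out, n_in + n_res)

# Toy usage with made-up sizes and data.
rng = np.random.default_rng(0)
n_in, n_res, n_out, T = 3, 20, 1, 200
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9
W_back = np.zeros((n_res, n_out))                # feedback disabled in this toy run
U = rng.normal(size=(T, n_in))
Y = U.sum(axis=1, keepdims=True)                 # toy target: sum of the inputs
W_out = train_readout(U, Y, W_in, W, W_back)
```

Only `W_out` is learned, via a single linear solve, which is where the low training cost mentioned on the slide comes from.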


Tuning of ESN

Spectral radius
• Maximum absolute value of the eigenvalues of the weight matrix
• Higher spectral radius for longer memory

Reservoir size
• Number of neurons in the reservoir
• Larger reservoir offers better performance due to more non-linearity in the ESN

Connectivity
• Connectivity between different neurons in the weight matrix
• A 10-neuron network (10 × 10 = 100 weights) with connectivity of 0.6 has 40 zero weights
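The three tuning knobs above can be made concrete with a short sketch. It assumes one common way of building a reservoir (random uniform weights, a fixed fraction set to zero, then rescaling to the target spectral radius); the function name `make_reservoir` and the uniform weight range are illustrative assumptions, not details from the talk.

```python
import numpy as np

def make_reservoir(n_neurons, connectivity, spectral_radius, seed=0):
    """Random reservoir matrix with a given size, connectivity and spectral radius."""
    rng = np.random.default_rng(seed)
    n_weights = n_neurons * n_neurons
    n_nonzero = int(round(connectivity * n_weights))

    # Place the non-zero weights at randomly chosen positions.
    W = np.zeros(n_weights)
    idx = rng.choice(n_weights, size=n_nonzero, replace=False)
    W[idx] = rng.uniform(-1.0, 1.0, size=n_nonzero)
    W = W.reshape(n_neurons, n_neurons)

    # Rescale so the largest eigenvalue magnitude equals the target radius.
    rho = np.max(np.abs(np.linalg.eigvals(W)))
    if rho > 0:
        W *= spectral_radius / rho
    return W

# The slide's example: 10 neurons, connectivity 0.6.
W = make_reservoir(n_neurons=10, connectivity=0.6, spectral_radius=0.9)
n_zero = int(np.sum(W == 0))
rho = float(np.max(np.abs(np.linalg.eigvals(W))))
print(n_zero, round(rho, 6))
```

With 100 possible weights and 60 of them non-zero, the count of zero weights matches the 40 quoted on the slide.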


ESN approach for Semantic Segmentation

Input image
→ Extract some hand-crafted image planes
→ Extract input feature vector per image pixel
→ ESN model with a reservoir layer
→ Probability image predicted by ESN
→ Post processing
→ Final segmented binary image mask
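The data flow in this pipeline can be sketched as below. The slides do not say which hand-crafted planes or which post-processing are used, so everything specific here is an assumption for illustration: the planes (R, G, B, gray level, gradient magnitude), the fixed 0.5 threshold, and the function names are hypothetical, and the ESN itself is replaced by a placeholder probability image.

```python
import numpy as np

def image_planes(rgb):
    """Hypothetical hand-crafted planes: R, G, B, gray level, gradient magnitude."""
    rgb = rgb.astype(float) / 255.0
    gray = rgb.mean(axis=2)
    gy, gx = np.gradient(gray)          # derivatives along rows and columns
    grad = np.hypot(gx, gy)
    return np.dstack([rgb[..., 0], rgb[..., 1], rgb[..., 2], gray, grad])

def pixel_features(planes):
    """One feature vector per pixel: (H, W, P) -> (H * W, P)."""
    return planes.reshape(-1, planes.shape[2])

def postprocess(prob, threshold=0.5):
    """Assumed post-processing: threshold the probability image to a binary mask."""
    return (prob >= threshold).astype(np.uint8)

# Toy usage on a random image.
rng = np.random.default_rng(0)
H, W_px = 4, 6
image = rng.integers(0, 256, size=(H, W_px, 3), dtype=np.uint8)

feats = pixel_features(image_planes(image))   # input to the ESN, one row per pixel
prob = feats[:, 3].reshape(H, W_px)           # placeholder for the ESN probability image
mask = postprocess(prob)
```

In the real pipeline the placeholder line would be replaced by the trained ESN readout applied to each pixel's feature vector.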


Proposed ESN framework

[Figure panels: Baseline ESN | Modified ESN]


Mathematical Formulation

• Reservoir state in Baseline ESN

• Reservoir state in Modified ESN

• Output state


Data Sets

Weizmann two object dataset

• 100 color images

• Manually annotated

LISA vehicle detection dataset

• 3 video subsets (30 fps)

• Urban (300): 1 car

• Sunny (300): 3-4 cars

• Dense (1600): 4 or more cars

ADE Challenge subset data

• 20K images

• 125 images for drivable surface

• Manually annotated


Performance Metrics

• Sensitivity / Recall
• Specificity
• F-score
• Max F-score
• IoU
• AUC
• FDR
• Time
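Several of the metrics listed above follow directly from the pixel-wise confusion counts of a predicted binary mask against its manual annotation. The sketch below uses the standard definitions; the function name `mask_metrics` and the toy masks are illustrative, and AUC, Max F-score and time are left out because they depend on the probability image and the runtime rather than on a single thresholded mask.

```python
import numpy as np

def mask_metrics(pred, truth):
    """Pixel-wise metrics for a predicted binary mask against a ground-truth mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),           # recall
        "specificity": ratio(tn, tn + fp),
        "fdr": ratio(fp, fp + tp),                   # false discovery rate
        "f_score": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }

# Toy masks: 2 true positives, 1 false positive, 1 false negative, 2 true negatives.
pred = np.array([[1, 1, 0], [0, 1, 0]])
truth = np.array([[1, 0, 0], [0, 1, 1]])
m = mask_metrics(pred, truth)
```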


Experiments & Results


Weizmann 2-object Dataset


ADE Challenge Data Set


LISA Vehicle Detection Data Set (Video)


Summary

• A modification to existing echo state network (ESN) models to incorporate spatial and temporal features within an image and also across a batch of training images.

• A modified ESN framework that generates fast automated proposals (~1 second per image) for detection/segmentation tasks, using only 20-30% of a dataset for training and testing on the remaining 70-80% of the dataset.
