Neural Network Architectures for Large-scale 3d Reconstruction
Viren Jain Research & Machine Intelligence, Google Inc.
• reconstructing neural wiring diagrams from 3d, nanoscale images of brain tissue
“connectomics”
Which neurons form synaptic connections with each other?
‣ Associate each dendrite and axon with a parent cell body
‣ Find synapses (connections) between particular cells
• Global architecture (pathways, projections): fMRI, DTI, tracer injections
• Local circuitry (detailed wiring diagrams): electron microscopy, molecular genetic fluorescent labeling
• 1970s: C. elegans, 300 neurons; printed photos; MRC in the UK
• 2013: retina, Drosophila medulla, 1000 neurons; O(teravoxel); Max Planck Institute, MIT, HHMI
• In progress: mouse cortical column, whole Drosophila brain, songbird; 100,000 neurons; O(petavoxel); Harvard, HHMI, Allen Institute, MPI
• Machines now being designed & built for whole mouse brain: 100,000,000 neurons; O(100-1,000 petavoxels)
Synapse Identification
• Conventional image processing techniques and even modern ‘off-the-shelf’ machine learning techniques make too many mistakes
[Figure: three EM image examples with presynaptic and postsynaptic structures labeled; one detection marked incorrect (X)]
• How to increase accuracy?
• An image may contain trillions of voxels
• Only linear-time methods are feasible at this scale
• Need to decompose the problem
Image Segmentation for Connectomics
Global Region Formation
Local Boundary Prediction
“Simple” Global Region Formation
“Sophisticated” Local Boundary Prediction
input image → local boundaries → nonlocal segmentation
Multi-layer neural network architecture that can be trained by gradient learning to interpret images in a desired way.
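The idea above can be made concrete with a minimal NumPy sketch of a single convolutional layer: a bank of filters, a bias, and a nonlinearity. This is illustrative only; the networks discussed in the talk are 3d, many layers deep, and their kernels are learned by gradient descent rather than drawn at random.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Valid-mode 2-D cross-correlation, the core operation of a
    convolutional layer (no padding, stride 1)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def conv_layer(image, kernels, bias):
    """One convolutional layer: filter bank + bias + tanh nonlinearity.
    Stacking such layers, with the kernels trained by backpropagation,
    yields the boundary-prediction networks discussed here."""
    return np.tanh(np.stack([conv2d_valid(image, k) for k in kernels]) + bias)

rng = np.random.default_rng(0)
img = rng.standard_normal((16, 16))
kernels = rng.standard_normal((4, 5, 5)) * 0.1   # 4 random 5x5 filters
features = conv_layer(img, kernels, bias=0.0)
print(features.shape)  # (4, 12, 12)
```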
Convolutional Networks
• No prior research on best feature representations for EM data
• Not afraid to gather lots of training data
Why Convolutional Networks?
LeNet-5 Digit/Letter Recognition Network
Yann LeCun and others, Bell Labs (1980s)
Jain et al., ICCV 2007; Turaga et al., Neural Computation 2010
Jain et al., CVPR 2010; Turaga et al., NIPS 2009
Helmstaedter et al, Nature 2013
“Connectomic reconstruction of the IPL in mouse retina”
[Figure: denoising comparison against CLEAN reference. NOISY: PSNR = 14.96; CN2: PSNR = 24.25; BLS-GSM: PSNR = 23.78; FoE: PSNR = 23.02]
Jain & Seung, NIPS 2008
test set performance
Natural Image Denoising with Convolutional Networks
Convolutional nets are related to CRFs
CRF model (image restoration):
$$p(x \mid y) \propto \exp\Big(\beta \Big[ \tfrac{1}{2} \sum_i x_i (w \ast x)_i + \sum_i x_i (y_i + b) \Big]\Big)$$

Mean field:
$$\mu_i = \tanh\big(\beta \, [(w \ast \mu)_i + y_i + b]\big)$$

Inference (fixed-point iteration):
$$\mu_i(t+1) = \tanh\big(\beta \, [(w \ast \mu(t))_i + y_i + b]\big)$$

CN2 is the equivalent convolutional network obtained by unrolling this iteration in time.
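The mean-field fixed point can be unrolled directly into code. Below is a minimal NumPy sketch of the iteration; the coupling filter `w`, bias `b`, and inverse temperature `beta` are toy stand-ins, not trained parameters.

```python
import numpy as np

def conv2d_same(x, w):
    """'Same'-size 2-D cross-correlation with zero padding."""
    kh, kw = w.shape
    xp = np.pad(x, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.empty_like(x)
    for r in range(x.shape[0]):
        for c in range(x.shape[1]):
            out[r, c] = np.sum(xp[r:r + kh, c:c + kw] * w)
    return out

def mean_field_denoise(y, w, b, beta, n_iters=20):
    """Iterate mu(t+1) = tanh(beta * [(w * mu(t)) + y + b]).
    Unrolled over time, this is a convolutional network whose
    layers share the same filter w."""
    mu = np.tanh(beta * (y + b))  # initialize from the observation
    for _ in range(n_iters):
        mu = np.tanh(beta * (conv2d_same(mu, w) + y + b))
    return mu

rng = np.random.default_rng(1)
y = np.sign(rng.standard_normal((8, 8)))  # toy noisy binary observation
w = np.full((3, 3), 0.2)
w[1, 1] = 0.0                             # no self-coupling
mu = mean_field_denoise(y, w, b=0.0, beta=1.0)
print(mu.shape)
```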
• Takes a long time to learn the parameters (even on a GPU)
• Computations involved in backpropagation make (further) parallelization difficult
• Deep and Wide Multiscale Recursive Networks
• Model Statistical Structure in Label-Space
(3) Model Statistical Structure in Label-Space
• Labels for neighboring image locations are correlated (i.e., structured prediction).
• In supervised image labeling approaches, predictions for neighboring locations are typically correlated only through their dependence on overlapping input image patches.
• (Bayesian) MRF approach: model p(Y) and P(X|Y)
• Computational expense of probabilistic learning & inference can limit model complexity of MRFs in important ways (Jain 2007, 2009).
[Diagram: recursive architecture. Input and output are x, y, z affinity channels at locations 1..N; two multilayer perceptrons, mlp_1 (trained unsupervised) and mlp_2 (trained supervised). Node = feature map, arrow = filter.]
DAWMR Architecture: Architecture of a Single Iteration
• Input channels (3d image + 4d affinity graph) at three scales: original, downsampled, further downsampled
• Unsupervised feature extraction (filtering, encoding, pooling, subsampling): on the order of 1,000 feature maps per scale
• Supervised classification (MLP or SVM): ~200 units predicting x, y, z affinities
• Recursive pipeline: input image → DAWMR iteration 1 → affinity graph → DAWMR iteration 2 → affinity graph → DAWMR iteration 3 → affinity graph
Huang & Jain, ICLR 2014
Training Pipeline: Implementation Architecture
• Stages: unsupervised feature learning → feature extraction → supervised learning
[Diagram: storage (HDD or flash memory) → multicore CPU system (thousands of CPU cores) → GPU]
• Distributed CPU feature extraction takes advantage of a parallel file system for accessing raw image data of training examples spread across a large image space
• The distributed CPU stage allows expensive operations (sparse coding, multiscale downsampling, etc.) without having to optimize their implementation for GPUs
• The GPU stage is focused on simple and relatively “stable” multilayer perceptron computations
Recursive processing
[Diagram: image → DAWMR iteration 1 → affinity graph → DAWMR iteration 2 → affinity graph → DAWMR iteration 3 → segmented affinity graph]
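The recursion can be sketched as follows. Here `toy_predict` is a hypothetical stand-in for a trained DAWMR network; the point is only the data flow: each iteration sees the raw image concatenated with the affinity graph from the previous iteration.

```python
import numpy as np

def dawmr_iteration(image, affinities, predict):
    """One recursive pass: the network sees the raw image plus the
    affinity graph produced by the previous iteration."""
    return predict(np.concatenate([image[None], affinities], axis=0))

def toy_predict(channels):
    """Hypothetical stand-in for a trained network: just averages
    its input channels and emits x, y, z affinity maps."""
    mixed = channels.mean(axis=0)
    return np.stack([mixed] * 3)

image = np.random.default_rng(2).random((8, 8))
affinities = np.zeros((3, 8, 8))   # start with an empty affinity graph
for _ in range(3):                 # three DAWMR iterations, as in the talk
    affinities = dawmr_iteration(image, affinities, toy_predict)
print(affinities.shape)  # (3, 8, 8)
```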
Rand Index:
$$R(S, T) = \binom{n}{2}^{-1} \sum_{i,\, j \neq i} \big[ \delta(S_i = S_j,\, T_i = T_j) + \delta(S_i \neq S_j,\, T_i \neq T_j) \big]$$
Huang & Jain, In Review
It would be useful to have a model that could provide some measure of p(segmentation | image). Such a measure could be used to identify spurious locations in the reconstruction or to evaluate specific manipulations (e.g., agglomeration).
Shape Modeling with Neural Networks
[Figure: negative and positive shape examples]
Energy-based model for segmentation
Combinatorial Energy Learning for Segmentation
• Binary shape descriptors for efficient representation of 3-D shape configurations
• A deep neural network architecture combining a convolutional image network with a fully-connected network that scores compatibility between a segmentation and a 3-D image
• A training procedure that uses pairwise object relations within a segmentation to learn the energy-based model
• A connectivity region data structure for efficiently computing the energy of configurations of 3-D objects
CELIS: Local energy model
Combinatorial Agglomeration
CELIS: Energy-based model
E(S, I) = compatibility between raw data and segmentation (learned)
minimize E by searching over the space of possible segmentations
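The search over segmentations can be sketched as greedy agglomeration: start from an oversegmentation and repeatedly apply the merge that lowers the energy most. The toy correlation-clustering energy below is a hypothetical stand-in for the learned compatibility E(S, I).

```python
def greedy_agglomerate(labels, edges, energy):
    """Greedy agglomeration: repeatedly apply the pairwise merge that
    lowers the energy most; stop when no merge reduces it."""
    labels = dict(labels)
    while True:
        best = None
        cur = energy(labels)
        for i, j in edges:
            if labels[i] == labels[j]:
                continue
            trial = {k: (labels[i] if v == labels[j] else v)
                     for k, v in labels.items()}
            e = energy(trial)
            if best is None or e < best[0]:
                best = (e, trial)
        if best is None or best[0] >= cur:
            return labels
        labels = best[1]

# Toy energy on a 5-supervoxel chain: pay (1 - affinity) for merging
# across an edge, affinity for cutting it (stand-in for learned E(S, I)).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
aff = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.1, (3, 4): 0.9}

def energy(labels):
    return sum((1 - aff[e]) if labels[e[0]] == labels[e[1]] else aff[e]
               for e in edges)

init = {i: i for i in range(5)}   # start from singletons (oversegmentation)
final = greedy_agglomerate(init, edges, energy)
print(len(set(final.values())))   # 2
```

The low-affinity edge (2, 3) survives as the segment boundary; all high-affinity edges are merged.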
Combinatorial Energy Learning for Image Segmentation, Submitted
End-to-End Learning
input image → nonlocal segmentation
Google: Samantha Ainsley, Tim Blakely, Tom Dean, Michal Januszewski, Peter Li, Larry Lindsey, Jeremy Maitin-Shepard, Art Pope, Mike Tyka
MPI: Winfried Denk, Jorgen Kornfeld, Fabian Svara
Janelia Bock Lab: Davi Bock, Eric Perlman
Harvard/AIBS: Nuno da Costa, Wei-Chung Lee, Clay Reid
Janelia FlyEM: Bill Katz, Stephen Plaza, Pat Rivlin, Proofreading Team