Supervised Convolutional GSN
for Protein Secondary Structure
Prediction
Jian Zhou
Olga Troyanskaya
Princeton University
What’s in this talk…
• Problem: Predict protein secondary structure
• Iterative prediction with multi-layer hierarchical
representation
– Supervised GSN
– Convolutional architecture for GSN
– A trick for improving convergence and performance
• Performance evaluations
Previous Approaches: neural networks from 1988 (Qian & Sejnowski); bidirectional recurrent neural networks (Baldi et al., 1999); conditional neural fields (Peng et al., 2009); many more…
Protein secondary structure prediction

Task: predict the secondary structure label sequence (8 classes) from the protein sequence (20 types of amino acids). Prediction can also draw on the evolutionary neighborhood of the sequence, and the secondary structure in turn constrains the 3D structure.

Protein sequence:
MDLSALRVEEVQNVINAMQKILECPICLELIKEPVSTKCDHIFCKFCMLKLLNQKKGPSQCPLCKNDITKRSLQESTRFSQLVEELLKIICAFQLDTGLEYANSYNFAKKGK

Secondary structure label sequence:
CCGGGSSHHHHHHHHHHHHHHTSCSSSCCCCSSCCBCTTSCCCCSHHHHHHHHSSSSSCCCTTTSCCCCTTTCBCCCSSSHHHHHHHHHHHHHHHHTCCCCCC

[Figure: the corresponding 3D structure; image credit: Wikimedia Commons]
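To make the input and output spaces concrete, here is a minimal sketch of one-hot encoding a sequence fragment and its labels. The class ordering, and the use of raw one-hot inputs rather than the evolutionary-neighborhood (sequence profile) features the talk mentions, are assumptions for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
SS8_CLASSES = "HGIEBTSC"              # assumed ordering of the 8 DSSP classes

def one_hot(sequence, alphabet):
    """Encode a string as a (length, |alphabet|) one-hot matrix."""
    index = {c: i for i, c in enumerate(alphabet)}
    encoding = np.zeros((len(sequence), len(alphabet)))
    for pos, ch in enumerate(sequence):
        encoding[pos, index[ch]] = 1.0
    return encoding

X = one_hot("MDLSALRVEE", AMINO_ACIDS)  # input fragment, shape (10, 20)
Y = one_hot("CCGGGSSHHH", SS8_CLASSES)  # label fragment, shape (10, 8)
```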
Motivation
• Challenge: Prediction with both local and long-range dependencies
• Plan:
Multi-layer hierarchical representation
Both ‘upward’ and ‘downward’
connections
Supervised GSN formulation
Model

• Generative Stochastic Network (GSN)

$H_{t+1} \sim P_{\theta_1}(H \mid H_t, X_t), \qquad X_{t+1} \sim P_{\theta_2}(X \mid H_{t+1})$

[Diagram: Markov chain with hidden states H_0, H_1, H_2, H_3 and visible states X_0, X_1, X_2]

– Learns the transition operators of a Markov chain whose stationary distribution estimates the data distribution P(X)
– Learning P(X | H) can be much easier than learning P(X), by design
– Trainable using back-propagation
Bengio, Y., Thibodeau-Laufer, É., Alain, G., and Yosinski, J. Deep Generative Stochastic Networks Trainable by Backprop. ICML 2014.
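To make the chain concrete, here is a minimal NumPy sketch of running such a chain, assuming a simple tied-weight tanh parameterization with Gaussian noise injected into the hidden state. The actual GSN of Bengio et al. is trained with a walkback/reconstruction cost, which this sampling-only sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 20, 50
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # tied up/down weights

def gsn_chain(x0, steps=10, noise_std=0.5):
    """Run the GSN Markov chain: H_{t+1} ~ P(H | H_t, X_t), then
    X_{t+1} ~ P(X | H_{t+1}). Gaussian noise on the hidden
    pre-activation makes the transition operator stochastic."""
    x, h = x0, np.zeros(n_hidden)
    samples = []
    for _ in range(steps):
        pre_h = x @ W + h                                 # depends on X_t and H_t
        h = np.tanh(pre_h + rng.normal(scale=noise_std, size=n_hidden))
        x = np.tanh(h @ W.T)                              # decode X_{t+1} from H_{t+1}
        samples.append(x)
    return samples

chain = gsn_chain(rng.normal(size=n_visible))  # X_0 drawn at random here
```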
Model

GSN, modeling P(X):

$H_{t+1} \sim P_{\theta_1}(H \mid H_t, X_t), \qquad X_{t+1} \sim P_{\theta_2}(X \mid H_{t+1})$

Supervised GSN, modeling P(Y|X):

$H_{t+1} \sim P_{\theta_1}(H \mid H_t, Y_t, X_0), \qquad Y_{t+1} \sim P_{\theta_2}(Y \mid H_{t+1})$

[Diagram: the GSN chain generates X_1, X_2, … from X_0 through hidden states H_0–H_3; the supervised GSN chain generates Y_1, Y_2, … from Y_0 through hidden states H_0–H_3, always conditioned on the fixed input X_0]

Learning P(Y | H) can be much easier than learning P(Y | X) directly, by utilizing the previous state of the chain.
Model

Supervised GSN, modeling P(Y|X):

$H_{t+1} \sim P_{\theta_1}(H \mid H_t, Y_t, X_0), \qquad Y_{t+1} \sim P_{\theta_2}(Y \mid H_{t+1})$

Training maximizes the log-likelihoods of the true labels at every step of the chain: each output distribution $P_{\theta}(Y \mid H_1), P_{\theta}(Y \mid H_2), \dots$ is matched against the true $P(Y \mid X_0)$.

[Diagram: unrolled chain Y_0, Y_1, … with each step’s output distribution compared to the true labels]
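The following is a minimal sketch of one unrolled training pass. It assumes fully connected weights Wx, Wy, Wh, Wout (the talk’s model is convolutional) and all-zeros as the test-time Y_0; gradients through the unrolled chain would be obtained by back-propagation, omitted here.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def supervised_gsn_loss(x0, y_true, Wx, Wy, Wh, Wout, steps=3, noise_std=0.5):
    """Unroll the supervised GSN chain and accumulate the label
    negative log-likelihood at every step:
      H_{t+1} ~ P(H | H_t, Y_t, X_0),  Y_{t+1} ~ P(Y | H_{t+1}),
    minimizing -sum_t log P_theta(y_true | H_t)."""
    h = np.zeros((x0.shape[0], Wh.shape[0]))
    y = np.zeros_like(y_true)            # Y_0 at the test-time initialization
    loss = 0.0
    for _ in range(steps):
        pre = x0 @ Wx + y @ Wy + h @ Wh  # H_{t+1} depends on H_t, Y_t, and X_0
        h = np.tanh(pre + rng.normal(scale=noise_std, size=h.shape))
        p_y = softmax(h @ Wout)          # P_theta(Y | H_{t+1}) per position
        loss -= np.sum(y_true * np.log(p_y + 1e-12))
        y = p_y                          # feed the prediction back into the chain
    return loss

# Toy shapes: length-10 sequence, 20 input channels, 50 hidden units, 8 classes.
L_seq, n_in, n_h, n_c = 10, 20, 50, 8
Wx = rng.normal(scale=0.1, size=(n_in, n_h))
Wy = rng.normal(scale=0.1, size=(n_c, n_h))
Wh = rng.normal(scale=0.1, size=(n_h, n_h))
Wout = rng.normal(scale=0.1, size=(n_h, n_c))
x0 = rng.normal(size=(L_seq, n_in))
y_true = np.eye(n_c)[rng.integers(0, n_c, size=L_seq)]
print(supervised_gsn_loss(x0, y_true, Wx, Wy, Wh, Wout))
```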
Architecture for protein secondary structure prediction

• Multi-scale representation – multi-layer convolutional architecture
• Local information sensitive – output units at the bottom layer

[Diagram: the convolutional supervised GSN unrolled over chain steps Y_0, Y_1, Y_2, Y_3; tied upward/downward weights W1, W1’ connect X and Y to H_0, and W2, W2’ connect the higher hidden layers H_1, H_2, H_3]
Model

[Diagram: within each layer, a convolution with tanh activation is followed by mean pooling (Conv → Pool → Conv), mapping the input X and labels Y through hidden layers H_0, H_1, …]
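As a sketch of the Conv → Pool → Conv path with tanh units and mean pooling, here is a minimal NumPy version; kernel widths, channel counts, and valid-mode convolution are illustrative assumptions, not the talk’s exact settings.

```python
import numpy as np

def conv1d_tanh(x, W, b):
    """Valid 1-D convolution along the sequence with tanh activation.
    x: (length, in_channels); W: (width, in_channels, out_channels)."""
    width = W.shape[0]
    out = np.stack([
        np.tensordot(x[i:i + width], W, axes=([0, 1], [0, 1]))
        for i in range(x.shape[0] - width + 1)
    ])                                    # (length - width + 1, out_channels)
    return np.tanh(out + b)

def mean_pool(x, size):
    """Non-overlapping mean pooling along the sequence axis."""
    trimmed = x[: (x.shape[0] // size) * size]
    return trimmed.reshape(-1, size, x.shape[1]).mean(axis=1)

# One Conv -> Pool -> Conv pass with toy shapes.
rng = np.random.default_rng(3)
x = rng.normal(size=(100, 20))            # length 100, 20 input channels
h1 = mean_pool(conv1d_tanh(x, rng.normal(scale=0.1, size=(7, 20, 32)),
                           np.zeros(32)), size=2)
h2 = conv1d_tanh(h1, rng.normal(scale=0.1, size=(5, 32, 64)), np.zeros(64))
```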
Training

Experiments on the initialization of the chain during training: for a subset of training batches, Y_0 is initialized at the specified test-time value (Y_0^test) rather than at the true labels (Y_0^true).

[Plot: accuracy vs. number of iterations when 0%, 20%, 50%, 80%, or 100% of training batches use the test initialization]

– Optimal performance at 50% test initialization
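A minimal sketch of this initialization trick follows; representing Y_0^test as all zeros is an assumption, since the talk does not specify the test-time value here.

```python
import numpy as np

rng = np.random.default_rng(4)

def initialize_y0(y_true, test_init_fraction=0.5):
    """Initialization trick: for a random fraction of training batches,
    start the chain from the test-time value Y_0^test instead of the
    true labels Y_0^true. The talk reports 50% as optimal."""
    if rng.random() < test_init_fraction:
        return np.zeros_like(y_true)   # Y_0^test (all zeros is an assumption)
    return y_true.copy()               # Y_0^true
```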
Performance

CullPDB-30 test set    Overall accuracy (8-class)
1 layer                0.714 ± 0.006
2 layers               0.720 ± 0.006
3 layers               0.721 ± 0.006

CB513 dataset          Overall accuracy (8-class)
RaptorSS8/CNF          0.649 ± 0.003
Our method             0.664 ± 0.005

CullPDB dataset: 6133 proteins, with <30% identity between any protein pair; available at www.princeton.edu/~jzthree/datasets
Performance through averaging iterative predictions:

[Figure: single-protein prediction example; predictions averaged over the first 1, 2, 4, 8, 16, and 32 iterations (Y_1, Y_2, Y_4, Y_8, Y_16, Y_32) shown against the true label]
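A minimal sketch of this averaging step: whether the talk averages class probabilities or hard label assignments is not stated, so averaging probabilities is an assumption here.

```python
import numpy as np

def averaged_prediction(y_samples):
    """Average the per-position class probabilities across the chain's
    iterative predictions Y_1 ... Y_T, then take the argmax per position."""
    mean_probs = np.mean(np.stack(y_samples), axis=0)   # (length, n_classes)
    return mean_probs.argmax(axis=-1)
```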
Summary

• We developed a supervised convolutional GSN model for protein secondary structure prediction.
• Supervised GSN
– Stochastic iterative prediction through a Markov chain
– The initialization trick empirically improves both performance and convergence rate
• Convolutional architecture for supervised GSN
– Combines high-level representation and local prediction
– Improved over the previous best performance
• Filters: Layer 1, X, Y ↔ H_0

[Figure: learned first-layer filters visualized as position × channel maps: W_{X→H_0} (amino acids), W_{Y→H_0} (secondary structure), and W_{H_0→Y} (secondary structure)]