Optimal Transport Graph Neural Networks · Property smoothness & robustness of representations...

Post on 07-Aug-2020

1 views 0 download

Transcript of Optimal Transport Graph Neural Networks · Property smoothness & robustness of representations...

Optimal Transport Graph Neural Networks

Graph Convolutions and Chemprop Model

Toxic?+

Molecule embedding

Graph Convolutional

Network / Chemprop

(GCN)

2

Property smoothness & robustness of representations

Small, soluble Large, soluble

Large, insoluble

3

Solu

bilit

y

Molecule embedding distance

Pro

perty

val

ue d

ista

nce

4

Property smoothness & robustness of representations

From atom embeddings to molecule embedding

Toxic?+

• Simplistic graph compression

5

Toxic?+

Green, red and orange sets have the same sum and same average.

6

• Simplistic graph compression

From atom embeddings to molecule embedding

Toxic?

Better set functions?

7

From atom embeddings to molecule embedding

Prototypes Inspired From Functional Groups

Prototype 1: Aromatic Nitro

Prototype 3: Nitroso

Prototype 2: Azo-type

Prototype 3: Aromatic Amine

dist=0.5

dist=2

dist=10

dist=20

8

Our idea: • learn a dictionary of structural basis functions that serve to highlight key facets of compounds • express molecules by relating them to abstract molecular prototypes • prototypes highlight property values associated with different structural features (e.g. solubility)

Prototypes as GCN Embedding Point Clouds

Prototype 1: Aromatic Nitro

Prototype 3: Nitroso

Prototype 2: Azo-type

Prototype 3: Aromatic Amine

dist=0.5

dist=2

dist=10

dist=20

9

Our idea: • learn a dictionary of structural basis functions that serve to highlight key facets of compounds • express molecules by relating them to abstract molecular prototypes • prototypes highlight property values associated with different structural features (e.g. solubility)

Wasserstein Prototypes

Molecule atom embeddings

Prototype 1Prototype 2

Prototype 3Prototype 4

d4

d1

d3

d2

d1d2d3d4

Toxic? 10

Wasserstein Prototypes

11

Wasserstein distances between sets of embeddings

Molecule atom embeddings

Prototype 1 Prototype 2

Prototype 3Prototype 4

d4

d1

d3

d2

d1d2d3d4

Toxic?

Property smoothness: Latent Space for Real Models

Graph Convolution w/ sum aggregation

Spearman ⍴ = .424±.029

Pearson r = .393±.049

Wasserstein Prototypes

Spearman ⍴ = .815±.026

Pearson r = .828±.020

Molecule embedding in 2D vs property value

Molecule embedding distance vs property value distance

12

13

Property smoothness: Latent Space for Real Models

14

Property smoothness: Latent Space for Real Models

Results on Molecular Property Prediction

Solubility LipophilicityInhibitors of human

β-secretase 1 (BACE-1)

Blood-brain barrier

penetration (permeability)

15