Avi Pfeffer, Principal Scientist, Charles River Analytics at MLconf SEA - 5/20/16

Practical Probabilistic Programming with Figaro

Avi Pfeffer

Charles River Analytics

MLConf May 20, 2016

Why Probabilistic Programming?

Figaro

Examples and Applications

Where We’re Going

Overview

We want to:

Predict the future

Infer past causes of current observations

Learn from experience

With much less effort and expertise than before

What Are We Trying To Do?

Probabilistic Reasoning Lets You Do All These Things

Probabilistic Reasoning: Predicting the Future

Probabilistic Reasoning: Inferring Factors that Caused Observations

Probabilistic Reasoning: Using the Past to Predict the Future

Probabilistic Reasoning: Learning from the Past

You need to:

Implement the representation

Implement the probabilistic inference algorithm

Implement the learning algorithm

Interact with data

Integrate with an application

But Probabilistic Reasoning Is Hard!

Drastically reduce the work to create probabilistic reasoning applications

Goal of Probabilistic Programming

1. Expressive programming language for representing models

2. General-purpose inference and learning algorithms apply to models written in the language

All you have to do is represent the model in code and you automatically get the application

How Probabilistic Programming Achieves This
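The idea on this slide can be sketched in a few lines of plain Scala. This is a toy illustration, not the Figaro API: the `Dist` type, `flip`, and the rain/wet-grass model with its probabilities are all assumptions made up for this example. The point is only that the model is ordinary code, while prediction and inference come from general-purpose operations (here, exact enumeration) that know nothing about the specific model.

```scala
// Toy sketch only; NOT the Figaro API. All names and numbers are hypothetical.
object TinyPP {
  // A discrete distribution as a list of (value, probability) pairs.
  final case class Dist[A](outcomes: List[(A, Double)]) {
    def map[B](f: A => B): Dist[B] =
      Dist(outcomes.map { case (a, p) => (f(a), p) }).normalized

    def flatMap[B](f: A => Dist[B]): Dist[B] =
      Dist(outcomes.flatMap { case (a, p) =>
        f(a).outcomes.map { case (b, q) => (b, p * q) }
      }).normalized

    // Condition on evidence: keep consistent outcomes and renormalize.
    // (Evidence with probability zero would yield NaN in this sketch.)
    def observe(pred: A => Boolean): Dist[A] =
      Dist(outcomes.filter { case (a, _) => pred(a) }).normalized

    // Probability that a predicate holds.
    def probability(pred: A => Boolean): Double =
      outcomes.collect { case (a, p) if pred(a) => p }.sum

    // Merge duplicate values and rescale so probabilities sum to 1.
    def normalized: Dist[A] = {
      val merged = outcomes.groupBy(_._1).toList.map { case (a, ps) => (a, ps.map(_._2).sum) }
      val total = merged.map(_._2).sum
      Dist(merged.map { case (a, p) => (a, p / total) })
    }
  }

  def flip(p: Double): Dist[Boolean] = Dist(List(true -> p, false -> (1 - p)))

  // The model: it rains with probability 0.2; the grass is wet with
  // probability 0.9 if it rained and 0.1 otherwise.
  val model: Dist[(Boolean, Boolean)] =
    for {
      rain <- flip(0.2)
      wet  <- flip(if (rain) 0.9 else 0.1)
    } yield (rain, wet)

  // Predicting: how likely is wet grass?
  val pWet: Double = model.probability { case (_, wet) => wet }

  // Inferring a past cause: how likely is rain, given that the grass is wet?
  val pRainGivenWet: Double =
    model.observe { case (_, wet) => wet }.probability { case (rain, _) => rain }
}
```

Changing the model means editing only the `for` expression; `observe` and `probability` stay the same. Exact enumeration scales poorly, which is why a real system supplies a library of inference algorithms.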

It’s easy to incorporate rich domain knowledge into probabilistic programs

Probabilistic programming can work well even when you don’t have a lot of data

Probabilistic programming models are explainable and understandable

Probabilistic programming can predict outputs belonging to complex data types of variable size, like social networks

Probabilistic Programming Compared to Deep Learning

Overview

Figaro goals

A probabilistic programming system that is:

Easy to interact with data

Easy to integrate with applications

General and expressive representation to capture common programming patterns

An extensible library of inference algorithms

Figaro provides data structures to represent probabilistic programs

Scala programs construct the Figaro models

Inference algorithms implemented in Scala operate on these models

Figaro as a Scala Library

Easy interaction with data and integration with applications

Can embed general-purpose code in probabilistic programs

Can construct models programmatically

Figaro inherits functional and object-oriented features of Scala

Can use Scala functions to specify constraints

Scala supports extensible library of inference algorithms

Advantages of Scala Embedding

Hard to reason about models at source level, since arbitrary Scala code may be embedded in model

Syntax not as elegant as self-contained languages

Steeper learning curve

You need to learn Scala and Figaro

But we have found that beginners can easily learn to write models quickly

We have found that the power and practicality of Figaro more than make up for these disadvantages

Disadvantages of Scala Embedding

Overview

Figaro novices were able to quickly build up an integrated probabilistic reasoning application

Hydrological Terrain Modeling for Army Logistics

We were able to perform a sophisticated analysis far better than our previous non-probabilistic method

Malware Lineage (DARPA Cyber Genome)

[Figure: Phase I IV&V Result. Bar chart comparing New Algorithm with Old Algorithm With New Features on Parent Correct, Parent Precision, Parent Recall, and Parent FMeasure; axis from 0 to 1.]

Tracklet Merging (DARPA PPAML Challenge Problem)

We came up with a new algorithm that we would not have thought of without probabilistic programming and expressed it in one slide

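import com.cra.figaro.language._

// Each tracklet has weighted candidates for its successor and predecessor.
class Tracklet(
    toCandidates: List[(Double, Tracklet)],
    fromCandidates: List[(Double, Tracklet)]
) {
  // Select picks one candidate with probability proportional to its weight.
  val next = Select(toCandidates: _*)
  val previous = Select(fromCandidates: _*)
}

// `sources` is assumed to be the collection of tracklets, defined elsewhere.
for (source <- sources) {
  // The predecessor of a tracklet's successor must be the tracklet itself.
  val nextPrevious = Chain(source.next, nextTracklet => nextTracklet.previous)
  nextPrevious.observe(source)
}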

Tracklet Merging in Figaro

Overview

We’ve significantly reduced the effort required to build complex probabilistic reasoning applications

But it still requires a fair amount of machine learning expertise to make these applications work

You need to know how to represent models

You need to know how to choose and configure inference algorithms

Current State of the Art

A probabilistic programming framework that domain experts with little or no machine learning knowledge can use

1. An English-like language for describing a domain

2. A method for automatically filling in the gaps in a model

3. Automated inference techniques that optimally choose and configure algorithms for a particular problem

Our Goal

1. Decompose an inference problem into many subproblems

2. Optimize the choice of an appropriate solver for each subproblem

3. Combine the subproblem solutions into a solution of the whole problem

Automated Inference Strategy
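The three steps can be sketched in plain Scala under strong simplifying assumptions. Nothing here is Figaro code: the `Factor` type, the two solvers, the size threshold, and the example weights are hypothetical, and the decomposition used (splitting into groups of factors that share no variables) is just the simplest one for which combining results is exact. The quantity computed is the sum over all assignments of the product of factor weights, which for independent subproblems is the product of the subproblem sums.

```scala
// Toy sketch only; NOT Figaro's strategy code. All names are hypothetical.
object StrategySketch {
  // A factor over binary variables maps an assignment to a non-negative weight.
  final case class Factor(vars: List[String], weight: Map[String, Boolean] => Double)
  type Problem = List[Factor]

  def variables(p: Problem): List[String] = p.flatMap(_.vars).distinct

  def assignments(vars: List[String]): List[Map[String, Boolean]] =
    vars.foldLeft(List(Map.empty[String, Boolean])) { (acc, v) =>
      acc.flatMap(m => List(m + (v -> true), m + (v -> false)))
    }

  // Step 1: decompose into groups of factors that share no variables.
  def decompose(problem: Problem): List[Problem] =
    problem.foldLeft(List.empty[Problem]) { (groups, f) =>
      val (touching, rest) =
        groups.partition(_.exists(_.vars.exists(v => f.vars.contains(v))))
      (f :: touching.flatten) :: rest
    }

  sealed trait Solver { def total(p: Problem): Double }

  // Exact: sum the product of weights over every assignment.
  case object Enumeration extends Solver {
    def total(p: Problem): Double =
      assignments(variables(p)).map(a => p.map(_.weight(a)).product).sum
  }

  // Approximate: average over uniform random assignments, scaled by 2^n.
  final case class Sampling(samples: Int, seed: Long) extends Solver {
    def total(p: Problem): Double = {
      val rng = new scala.util.Random(seed)
      val vs = variables(p)
      val mean = Iterator.fill(samples) {
        val a = vs.map(v => v -> rng.nextBoolean()).toMap
        p.map(_.weight(a)).product
      }.sum / samples
      mean * math.pow(2, vs.size)
    }
  }

  // Step 2: choose a solver per subproblem (a crude size heuristic).
  def choose(p: Problem, maxExactVars: Int = 12): Solver =
    if (variables(p).size <= maxExactVars) Enumeration else Sampling(10000, 42L)

  // Step 3: combine; independent subproblems multiply.
  def solve(problem: Problem): Double =
    decompose(problem).map(sub => choose(sub).total(sub)).product

  val example: Problem = List(
    Factor(List("a", "b"), m => if (m("a") == m("b")) 2.0 else 1.0),
    Factor(List("b"), m => if (m("b")) 3.0 else 1.0),
    Factor(List("c"), m => if (m("c")) 0.5 else 1.5)
  )
}
```

On `example`, `decompose` yields two subproblems (the factors over `a` and `b`, and the factor over `c`), and `solve` agrees with enumerating the whole problem at once. A real strategy would weigh cost against accuracy rather than use a fixed variable count.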

Subproblems are represented as factor graphs

Factored algorithms are used to solve subproblems, e.g., variable elimination, belief propagation, Gibbs sampling

We intelligently choose between the available algorithms on each subproblem

Structured Factored Inference (SFI)
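As one concrete instance of a factored algorithm, here is a minimal sketch of variable elimination over table factors in plain Scala. It is not Figaro's implementation: the `Factor` representation, the helper names, and the three-variable chain A -> B -> C with its probabilities are assumptions chosen for illustration, and all variables are taken to be binary.

```scala
// Toy sketch only; NOT Figaro's implementation. All names are hypothetical.
object VESketch {
  type Assignment = Map[String, Boolean]

  // A table factor: one weight per joint value of its (binary) variables.
  final case class Factor(vars: List[String], table: Map[List[Boolean], Double]) {
    // Look up the weight, ignoring any extra variables in the assignment.
    def apply(a: Assignment): Double = table(vars.map(a))
  }

  def assignments(vars: List[String]): List[Assignment] =
    vars.foldLeft(List(Map.empty[String, Boolean])) { (acc, v) =>
      acc.flatMap(m => List(m + (v -> true), m + (v -> false)))
    }

  def fromFunction(vars: List[String])(f: Assignment => Double): Factor =
    Factor(vars, assignments(vars).map(a => vars.map(a) -> f(a)).toMap)

  // Product of two factors over the union of their variables.
  def multiply(f: Factor, g: Factor): Factor =
    fromFunction((f.vars ++ g.vars).distinct)(a => f(a) * g(a))

  // Sum a variable out of a factor.
  def sumOut(v: String, f: Factor): Factor =
    fromFunction(f.vars.filterNot(_ == v))(a => f(a + (v -> true)) + f(a + (v -> false)))

  // Eliminate variables in the given order: multiply the factors that
  // mention the variable, sum it out, and put the result back.
  def eliminate(factors: List[Factor], order: List[String]): Factor = {
    val remaining = order.foldLeft(factors) { (fs, v) =>
      val (withV, withoutV) = fs.partition(_.vars.contains(v))
      if (withV.isEmpty) fs else sumOut(v, withV.reduce(multiply)) :: withoutV
    }
    remaining.reduce(multiply)
  }

  def bern(p: Double, x: Boolean): Double = if (x) p else 1 - p

  // Chain A -> B -> C with made-up conditional probabilities.
  val pA = fromFunction(List("A"))(a => bern(0.3, a("A")))
  val pBgivenA = fromFunction(List("A", "B"))(a => bern(if (a("A")) 0.8 else 0.1, a("B")))
  val pCgivenB = fromFunction(List("B", "C"))(a => bern(if (a("B")) 0.9 else 0.2, a("C")))

  // Marginal over C after eliminating A and then B.
  val marginalC: Factor = eliminate(List(pA, pBgivenA, pCgivenB), List("A", "B"))
  val pC: Double = marginalC(Map("C" -> true))
}
```

Eliminating A first leaves a factor over B alone, so no intermediate table ever covers all three variables. Elimination order affects cost, not the result.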

Compiled Graphical Model of Figaro Program

[Figure: graphical model diagram]

Decompose Problem Automatically

[Figure: graphical model diagram]

Subproblems
