Welcome to Bowdoin | Bowdoin College - Birds, books, and …tpietrah/TALKS/bbm.pdf · 2018. 3....

Birds, books, and matrices: a brief adventure in artificial intelligence and neural networks

Thomas Pietraho

Spring, 2018

I am an algebraist

Neural networks: major successes

Neural nets can recognize images

car ball bridge burrito

Current accuracy ≈ 95%

Neural nets can translate

Polish: mój poduszkowiec jest pełen węgorzy

English: my hovercraft is full of eels

Google’s version is very close to human translation for a number of languages. Not Chinese.

Neural nets can play games

Top ranked Go player defeated by AlphaGo (4-1). AlphaGo destroyed by AlphaGo Zero (100-0).

Image by Saran Poroong

Neural networks: minor successes

Neural nets can judge a book by its cover

history science romance sports

Problem: Predict book genre based on its cover.

Accuracy 76%.

with Parikshit Sharma, ’17, IndieBio

Neural nets can identify birds

cardinal wood duck anhinga chickadee

Problem: Predict species of bird based on image.

Accuracy 87%. (P., 2017)

american crow fish crow common raven

Neural nets can be useful to an algebraist?

From The Accountant

What are neural nets?

Neural nets are functions

Image courtesy of JD Cruzan

Input: 1.00 4.98 7.21 9.89 1.01 2.30

Output (regression): 3.72 2.67 22.01 1.92 3.70

Output (classification): 0 0 0 1 0

In this form, neural nets can carry out

• regression, or
• classification
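
The "function" view can be sketched in a few lines: numbers go in, numbers come out, and the same outputs can be read either as predicted values (regression) or as a vote for one category (classification). The names `tiny_net` and `as_class` and the weights below are hypothetical, chosen only for illustration; they are not the network behind the slide's numbers.

```python
def tiny_net(inputs, weights, biases):
    """One linear layer: a function from a list of numbers to a list of numbers."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def as_class(outputs):
    """Turn raw outputs into a one-hot vector marking the largest entry."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return [1 if i == best else 0 for i in range(len(outputs))]

x = [1.00, 4.98, 7.21, 9.89, 1.01, 2.30]  # the six input values shown on the slide

# Hypothetical weights (an assumption): output i reads input i, scaled by 0.5.
weights = [[0.5 if j == i else 0.0 for j in range(6)] for i in range(5)]
biases = [0.0] * 5

regression_output = tiny_net(x, weights, biases)      # five real numbers
classification_output = as_class(regression_output)   # a single 1 marks the chosen class
```

With these made-up weights the largest output is the fourth one, so the classification vector has its 1 in the fourth position.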

Neural nets are made up of “neurons”

Two parameters: laziness and loudness. This specifies a neuron’s activation function.
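
The slide names the two parameters without defining them. One plausible reading, and it is only an assumption, is that laziness is a threshold (how much input the neuron needs before it responds) and loudness is the size of its response. Under that assumption, a sigmoid-shaped neuron might look like this:

```python
import math

def neuron(total_input, laziness, loudness):
    """Sigmoid neuron under an assumed reading of the slide's parameters.

    laziness: input level at which the neuron is half-way to firing (assumed threshold)
    loudness: the output level it approaches once it fires (assumed output scale)
    """
    return loudness / (1.0 + math.exp(-(total_input - laziness)))

quiet = neuron(-10.0, laziness=0.0, loudness=1.0)   # far below threshold: close to 0
loud = neuron(10.0, laziness=0.0, loudness=1.0)     # far above threshold: close to loudness
halfway = neuron(3.0, laziness=3.0, loudness=2.0)   # exactly at threshold: loudness / 2
```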

Neural nets are networks of neurons

[Figure: network diagram; label: output]

Neural nets are universal

Theorem (G. Cybenko, 1989)

Every continuous function on a compact domain can be approximated arbitrarily well by a neural network.

Examples of functions: image classification, language translation, etc.

Question: Why no self-driving cars in the 1990s?

Learning with neural nets

Procedure:

• assemble a neural network (craft)
• adjust laziness and loudness for each neuron (math)
• measure error based on a sample of data and repeat (fast processors)

Advances in all three parts of this process are responsible for the machine learning revolution since 2012.
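
The adjust, measure, repeat loop can be sketched with the smallest possible "network": a single linear neuron fitted by gradient descent. The talk does not say which training method was used; gradient descent, the function name `train`, the step count, the learning rate, and the sample data are all assumptions made for this illustration.

```python
def train(samples, steps=2000, rate=0.01):
    """Fit y = w * x + b to (x, y) samples by plain gradient descent."""
    w, b = 0.0, 0.0                          # start from an untrained neuron
    n = len(samples)
    for _ in range(steps):                   # repeat
        grad_w = grad_b = 0.0
        for x, y in samples:
            err = (w * x + b) - y            # measure error on each sample
            grad_w += 2 * err * x / n        # gradient of mean squared error w.r.t. w
            grad_b += 2 * err / n            # gradient of mean squared error w.r.t. b
        w -= rate * grad_w                   # adjust the parameters downhill
        b -= rate * grad_b
    return w, b

# Made-up training data lying exactly on the line y = 2x + 1.
samples = [(float(x), 2.0 * x + 1.0) for x in range(-5, 6)]
w, b = train(samples)
```

On this toy data the loop recovers a slope near 2 and an intercept near 1; real networks repeat the same cycle over many more parameters, which is where fast processors matter.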

Image courtesy of Kaiming He

We don’t completely understand why neural nets work

Image courtesy of Elsayed et al.

A problem in algebra

Matrix multiplication

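\[
\begin{pmatrix} 0.98 & 0.23 & 0.12 \\ 0.12 & 0.34 & 0.67 \\ 0.11 & 0.54 & 0.18 \end{pmatrix}
\cdot
\begin{pmatrix} 0.56 & 0.09 & 0.10 \\ 0.99 & 0.45 & 0.41 \\ 0.39 & 0.02 & 0.11 \end{pmatrix}
=
\begin{pmatrix} 0.82 & 0.19 & 0.21 \\ 0.67 & 0.18 & 0.23 \\ 0.67 & 0.26 & 0.25 \end{pmatrix}
\]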

This process is a mess, involving lots of ordinary addition and multiplication. But it is an important mess.
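
To see the size of the mess, here is a minimal sketch of schoolbook matrix multiplication that also counts the ordinary multiplications it performs. The function name `multiply_counting` is hypothetical; the two matrices are the ones shown on the slide.

```python
def multiply_counting(a, b):
    """Schoolbook product of two matrices, plus the number of scalar multiplications used."""
    n, m, p = len(a), len(b), len(b[0])
    count = 0
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]   # one ordinary multiplication
                count += 1
    return c, count

A = [[0.98, 0.23, 0.12], [0.12, 0.34, 0.67], [0.11, 0.54, 0.18]]
B = [[0.56, 0.09, 0.10], [0.99, 0.45, 0.41], [0.39, 0.02, 0.11]]
C, count = multiply_counting(A, B)   # count is 3 * 3 * 3 = 27 for 3 x 3 matrices
```

Rounded to two decimals, `C` agrees with the product on the slide, and the count grows as the cube of the matrix size.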

Goal: minimize number of ordinary multiplications: “rank”

matrix size    rank (ordinary multiplications)
2 × 2          8, improved to 7 (Strassen, 1969)
3 × 3          27, improved to 23 (Laderman, 1976)
4 × 4          64, improved to 49 (Strassen, 1969), then 48 (Stothers, 2012)
1000 × 1000    10^9, improved to 264M (Strassen, 1969), then 238M (Stothers, 2012)
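
Strassen's improvement from 8 to 7 for 2 × 2 matrices can be checked directly. The sketch below uses the standard published formulas, written with the entry names (a b; c d) and (e f; g h); the function name `strassen_2x2` is hypothetical.

```python
def strassen_2x2(A, B):
    """Multiply two 2 x 2 matrices using 7 ordinary multiplications instead of 8."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # Only additions and subtractions are needed to assemble the result.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

product = strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The saving looks small, but the entries may themselves be matrix blocks, so applying the scheme recursively compounds it at every level.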

A little insight

A neural network can model matrix multiplication:

\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\cdot
\begin{pmatrix} e & f \\ g & h \end{pmatrix}
=
\begin{pmatrix} i & j \\ k & l \end{pmatrix}
\]

Question: can our methods learn this network?
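
The transcript does not preserve the network architecture, so the following is an assumed form rather than the one from the talk: each hidden unit multiplies one weighted mix of (a, b, c, d) by one weighted mix of (e, f, g, h), and each output is a weighted sum of the hidden units. In such a network the number of hidden units equals the number of ordinary multiplications. The names `bilinear_net`, `U`, `V`, and `W` are hypothetical, and the weights shown are the schoolbook ones with 8 hidden units.

```python
def bilinear_net(U, V, W, left, right):
    """Network computing (i, j, k, l) from left = (a, b, c, d) and right = (e, f, g, h)."""
    hidden = [sum(u * x for u, x in zip(U[r], left)) *
              sum(v * y for v, y in zip(V[r], right))   # one ordinary multiplication per unit
              for r in range(len(U))]
    return [sum(w * hval for w, hval in zip(row, hidden)) for row in W]

# Schoolbook weights: hidden unit (i, j, k) computes A[i][k] * B[k][j].
U = [[0] * 4 for _ in range(8)]
V = [[0] * 4 for _ in range(8)]
W = [[0] * 8 for _ in range(4)]
r = 0
for i in range(2):
    for j in range(2):
        for k in range(2):
            U[r][2 * i + k] = 1    # select entry A[i][k]
            V[r][2 * k + j] = 1    # select entry B[k][j]
            W[2 * i + j][r] = 1    # add the product into C[i][j]
            r += 1

outputs = bilinear_net(U, V, W, [1, 2, 3, 4], [5, 6, 7, 8])
```

Learning a network of this shape with fewer hidden units that still reproduces the product would amount to finding a scheme of lower rank.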

Will it learn?

Thanks: Dj and HPC

[Figure: error over learning time]

matrix size    rank
2 × 2          8 X   7 X   6 X
2 × 3          11 X   10 X
3 × 2          15 X   14 X
3 × 3          23 X   22 X
4 × 4          49 X   48 X   47 X   46 X   45 X   44 X

Upshot: This result reduces the computational cost for 1000 × 1000 matrix multiplication from 238M to 172M ordinary multiplications!
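
The 1000 × 1000 figures can be reproduced with a rough estimate: if a k × k scheme of rank r is applied recursively, an n × n product needs about n^(log r / log k) ordinary multiplications. The talk does not state this formula, so treat it as an assumption; under it, rank 7 for 2 × 2 gives roughly 264M, and ranks 48 and 45 for 4 × 4 give roughly 238M and 172M. The function name `multiplication_count` is hypothetical.

```python
import math

def multiplication_count(n, block, rank):
    """Estimated ordinary multiplications for an n x n product when a
    block x block scheme of the given rank is applied recursively."""
    return n ** (math.log(rank) / math.log(block))

# Estimates in ordinary multiplications for n = 1000.
estimates = {
    (2, 8): multiplication_count(1000, 2, 8),    # schoolbook, about 10^9
    (2, 7): multiplication_count(1000, 2, 7),    # Strassen's 2 x 2 scheme
    (4, 48): multiplication_count(1000, 4, 48),
    (4, 45): multiplication_count(1000, 4, 45),
}
```

The estimate ignores additions and the fact that 1000 is not a power of 2 or 4, so it is a guide to scale rather than an exact count.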

I am an algebraist

Luckily (for algebraists), the neural network solution is only an approximation.

Question: how can one obtain an exact solution?

Hint: algebra

