A Snake Learns

A Snake Learns Machine Learning and Python Igor Guerrero @igorgue

Transcript of A Snake Learns

8/6/2019 A Snake Learns

http://slidepdf.com/reader/full/a-snake-learns 1/33

A Snake Learns Machine Learning and Python

Igor Guerrero

@igorgue


What's Machine Learning?


"[Machine learning,] a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases".

- Wikipedia (http://en.wikipedia.org/wiki/Machine_Learning)


Cool Story, Bro!

Machine Learning is more than just algorithms!


Machine Learning in real life.

Data Input

Algorithms

Data Output

Runtime


Big Data is Big


I'm not telling you to switch databases...

If your current relational database doesn't cut it for ML, there are alternatives! And really good ones!

http://aws.amazon.com/elasticmapreduce/ (let them run your stuff, based on Hadoop)


Brute-force "learning"

Data is the algorithm


Silly Google practices this!

89,600 < 714,000,000

Brute-forcing their spell checker...

Not so genius now, right?
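The slides do not say how Google's spell checker works internally, so the following is only a minimal sketch of the "data is the algorithm" idea: among the spellings one edit away from the input, pick the one seen most often. The `COUNTS` table, `edits1`, and `correct` are hypothetical names and toy numbers invented for illustration; a real system would use web-scale counts.

```python
import string

# Hypothetical usage counts standing in for web-scale hit counts.
COUNTS = {"learning": 900, "leaning": 50, "python": 700}


def edits1(word):
    """Return every string that is one edit away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:]
                for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts=COUNTS):
    """Keep known words; otherwise return the most frequent known neighbor."""
    if word in counts:
        return word
    candidates = [w for w in edits1(word) if w in counts]
    # With no known neighbor, give up and return the input unchanged.
    return max(candidates, key=counts.get) if candidates else word


print(correct("lerning"))  # "learning" beats "leaning" because it has more hits
print(correct("pythn"))
```

No grammar rules or dictionaries of common mistakes are involved: the counts alone decide which spelling wins.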


http://code.google.com/apis/predict/


The Netflix Challenge winner was a collection of results generated by multiple algorithms:

http://www.netflixprize.com/leaderboard 


NLP

Natural Language Processing, I knew grammar was useful.


A field of computer science and linguistics concerned with the interactions between computers and human (natural) languages


Guess the first word!

dataisbig

Word?(d) + ataisbig

Word?(da) + taisbig

Word?(dat) + aisbig

Word?(data) + isbig

(repeat procedure with the rest)

This is known as word segmentation, which is very useful for search in foreign languages!
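The prefix-by-prefix procedure above can be sketched as follows. This is a minimal illustration, assuming a tiny hand-written vocabulary stands in for the `Word?` check; `VOCAB` and `segment` are hypothetical names, not part of the talk. The sketch also backtracks when a prefix is a word but the remainder cannot be split.

```python
# Hypothetical toy vocabulary standing in for the Word?() check.
VOCAB = {"data", "is", "big"}


def segment(text, vocab=VOCAB):
    """Split `text` into known words, or return None if that is impossible."""
    if not text:
        return []
    # Try ever-longer prefixes: d, da, dat, data, ...
    for i in range(1, len(text) + 1):
        first, rest = text[:i], text[i:]
        if first in vocab:
            # Repeat the procedure with the rest of the string.
            remainder = segment(rest, vocab)
            if remainder is not None:
                return [first] + remainder
    return None


print(segment("dataisbig"))  # ['data', 'is', 'big']
```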


Word?(word) = #Google hits / ~#pages of the web

It works, I promise!

http://ngrams.googlelabs.com/datasets

Google n-gram database built from Google Books scans.
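A hedged sketch of how such a frequency-based `Word?` score could drive segmentation: each candidate word gets the probability count / total, and the split with the highest product of probabilities wins. The counts below are invented toy numbers, not real Google hits or n-gram data, and `word_prob` and `best_segmentation` are hypothetical names.

```python
import math
from functools import lru_cache

# Invented toy counts; real values would come from hit counts or n-gram data.
COUNTS = {"data": 500, "dat": 2, "a": 1000, "is": 900, "i": 700, "big": 300}
TOTAL = sum(COUNTS.values())


def word_prob(word):
    """Word?(word): relative frequency of `word`, 0 if never seen."""
    return COUNTS.get(word, 0) / TOTAL


@lru_cache(maxsize=None)
def best_segmentation(text):
    """Return (log-probability, words) for the most probable split of `text`."""
    if not text:
        return (0.0, ())
    best = (-math.inf, ())
    for i in range(1, len(text) + 1):
        first, rest = text[:i], text[i:]
        p = word_prob(first)
        if p == 0:
            continue  # an unknown word cannot be part of the split
        rest_score, rest_words = best_segmentation(rest)
        score = math.log(p) + rest_score  # sum of logs = log of the product
        if score > best[0]:
            best = (score, (first,) + rest_words)
    return best


print(best_segmentation("dataisbig")[1])  # ('data', 'is', 'big')
```

With these toy counts, "data is big" beats "dat a is big" because the rare word "dat" drags the product down.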


Recommendations

Based on your viewing history you might like "Snakes on a Plane"...


Amazon loves these


Euclidean Distance Algorithm

$$d(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2}$$
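As a small illustration, the Euclidean distance can be computed in a few lines of Python. The function name `euclidean_distance` is an assumption made for this sketch; it accepts points of any matching dimension, of which the two-dimensional case on the slide is one instance.

```python
import math


def euclidean_distance(p, q):
    """Square root of the summed squared differences between coordinates."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The smaller the distance, the more alike the two points (for example, two people's ratings) are.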


Toby might enjoy "Lady in the Water" and "The Night Listener".

And he'd hate "Just My Luck"...
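A minimal sketch of how a prediction like this could be produced: score every unseen movie by the ratings of other people, weighted by how similar each person is to the target user. The ratings, names, and titles below are invented placeholders, not the data behind the slide, and turning a distance d into a similarity with 1 / (1 + d) is one common choice assumed here.

```python
import math

# Invented placeholder ratings (1 to 5); not the data used in the talk.
RATINGS = {
    "ana": {"Movie A": 5.0, "Movie B": 1.0, "Movie C": 4.0},
    "ben": {"Movie A": 4.5, "Movie B": 1.5, "Movie C": 4.5,
            "Movie D": 5.0, "Movie E": 1.0},
    "cal": {"Movie A": 1.0, "Movie B": 5.0, "Movie D": 2.0, "Movie E": 4.5},
}


def similarity(a, b):
    """1 / (1 + Euclidean distance) over the movies both people rated."""
    shared = [m for m in RATINGS[a] if m in RATINGS[b]]
    if not shared:
        return 0.0
    d = math.sqrt(sum((RATINGS[a][m] - RATINGS[b][m]) ** 2 for m in shared))
    return 1 / (1 + d)


def recommend(user):
    """Rank unseen movies by similarity-weighted average rating, best first."""
    totals, weights = {}, {}
    for other in RATINGS:
        if other == user:
            continue
        sim = similarity(user, other)
        if sim == 0:
            continue
        for movie, rating in RATINGS[other].items():
            if movie in RATINGS[user]:
                continue  # only score movies the user has not rated
            totals[movie] = totals.get(movie, 0.0) + sim * rating
            weights[movie] = weights.get(movie, 0.0) + sim
    return sorted(((totals[m] / weights[m], m) for m in totals), reverse=True)


for score, movie in recommend("ana"):
    print(f"{movie}: predicted rating {score:.2f}")
```

Movies with a high predicted rating are the "might enjoy" picks; those near the bottom are the ones the user would probably hate.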


Classification

"Dividing" data sets


Great for face recognition!

Facebook implemented it!

http://face.com offers a Free API!


Support Vector Machines

The line that divides the objects is calculated via an SVM.

http://www.csie.ntu.edu.tw/~cjlin/libsvm/ 
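To make the idea of a dividing line concrete, here is a minimal pure-Python sketch of a linear SVM trained by sub-gradient descent on the hinge loss. It is not how LIBSVM works internally and does not use its API; the toy points, the hyperparameters, and the names `train_linear_svm` and `predict` are all assumptions made for illustration.

```python
# Toy, linearly separable 2-D data invented for this sketch.
POINTS = [(-2, -1), (-1, -2), (-2, -2), (2, 1), (1, 2), (2, 2)]
LABELS = [-1, -1, -1, 1, 1, 1]


def train_linear_svm(points, labels, epochs=100, lr=0.01, reg=0.01):
    """Fit w and b so that the line w . x + b = 0 separates the classes."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            # Regularization step: shrink w, which favors a wide margin.
            w[0] -= lr * reg * w[0]
            w[1] -= lr * reg * w[1]
            # Hinge-loss step: only points inside the margin push the line.
            if margin < 1:
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b


def predict(w, b, point):
    """Return +1 or -1 depending on which side of the line `point` falls."""
    score = w[0] * point[0] + w[1] * point[1] + b
    return 1 if score >= 0 else -1


w, b = train_linear_svm(POINTS, LABELS)
print([predict(w, b, p) for p in POINTS])
```

For real workloads a tested library such as LIBSVM is the better choice; the sketch only shows what "calculating the line" means.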


Clustering

"Similarities" between different sets


This is how compression algorithms work

1. AAAA AAA AA AAAAAA

2. BB BBBBB BBB BBBBBB

3. CCC CCCC CCCC CCC

Use Euclidean Distance to know what elements are similar!
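One way to group similar elements with Euclidean distance is k-means clustering, sketched below. The talk does not name a specific clustering algorithm, so k-means is an assumption here, as are the toy points, the hand-picked starting centroids, and the function name `kmeans`. The sketch uses `math.dist`, available in Python 3.8 and later.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move the centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return assignments, centroids


# Invented toy data: two visibly separate groups of 2-D points.
POINTS = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8, 9.5)]
assignments, centroids = kmeans(POINTS, centroids=[POINTS[0], POINTS[3]])
print(assignments)  # [0, 0, 0, 1, 1, 1]
```

Starting centroids are passed in explicitly to keep the example deterministic; in practice they are usually chosen at random and the algorithm is run several times.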


Resources

● Programming Collective Intelligence: http://oreilly.com/catalog/9780596529321

● Hadoop tutorial: http://developer.yahoo.com/hadoop/tutorial/

● R Programming language: http://www.r-project.org/

● My favorite Machine Learning community members:

  ○ Ilya Grigorik (Google): http://www.igvita.com/

  ○ Jonathan Harris (We Feel Fine): http://www.wefeelfine.org/

● Contact me: http://igorgue.com