All your types are belong to us!
ALL YOUR TYPES ARE BELONG TO US!
PHILLIP TRELFORD, @PTRELFORD
DDD DUNDEE 2013, #DUNDDD
F#UNCTIONAL LONDONERS
Meetup
• 600 members
• 50 meetups
• Meets every 2 weeks
• Talks & Hands On
Topics
• Finance
• Machine Learning
• Big Data
• Gaming
FSHARP.ORG/GROUPS
F# TESTIMONIALS – MACHINE LEARNING
FSHARP.ORG/TESTIMONIALS
For a machine learning scientist, speed of experimentation is the critical factor to optimize.
Compiling is fast but loading large amounts of data in memory takes a long time.
With F#’s REPL, you only need to load the data once
and you can then code and explore in the interactive environment.
Unlike C# and C++, F# was designed for this mode of interaction.
- Patrice Simard, Microsoft
FSHARP.ORG/TESTIMONIALS - AMYRIS BIOTECH
F# has been phenomenally useful.
I would be writing a lot of this in Python otherwise
and F# is more robust, 20x - 100x faster to run
and for anything but the most trivial programs,
faster to develop.
- Darren Platt, Amyris Biotechnology
CASE STUDIES
F# TOOLS FOR HALO 3
Questions
• Controllable player skill distribution (slow down!)
• Controllable skills distributions (re-ordering)
Simulations
• Large scale simulation of 8,000,000,000 matches
• Distributed computation – 15 machines for 2 wks
Tools
• Result viewer (Logged results: 52GB of data)
• Real-time simulator of partial update
ADCENTER
Weeks of data in training:
• 7,000,000,000 impressions
2 weeks of CPU time during sessions
• 2 wks x 7 days x 86,400 sec/day
Learning algorithm speed requirement:
• 5,787 impression updates /sec
• 172.8 µs per impression update
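The speed requirement follows directly from the two figures above. A quick check of the arithmetic, written in Python to match the Python snippet used elsewhere in this deck (the variable names are illustrative, not from the talk):

```python
# Reproduce the AdCenter throughput arithmetic from the slide.
impressions = 7_000_000_000      # impressions in the training data
seconds = 2 * 7 * 86_400         # 2 wks x 7 days x 86,400 sec/day

updates_per_second = impressions / seconds
microseconds_per_update = 1e6 / updates_per_second

print(f"{seconds:,} seconds of CPU time")
print(f"{updates_per_second:,.0f} impression updates/sec")
print(f"{microseconds_per_update:.1f} µs per impression update")
```

The budget is an average: it assumes the learner processes every impression exactly once within the two weeks.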
LIVE DEMOS
TYPE PROVIDERS: JSON
open FSharp.Data
type Simple = JsonProvider<"sample.js">
let simple = Simple.Parse(""" { "name":"Tomas", "age":4 } """)
simple.Age
CSV TYPE PROVIDER
SPLIT DATA SET (FROM ML IN ACTION)
Python
def splitDataSet(dataSet, axis, value):
    retDataSet = []
    for featVec in dataSet:
        if featVec[axis] == value:
            reducedFeatVec = featVec[:axis]
            reducedFeatVec.extend(featVec[axis+1:])
            retDataSet.append(reducedFeatVec)
    return retDataSet
F#
let splitDataSet(dataSet, axis, value) =
    [|for featVec in dataSet do
        if featVec.[axis] = value then
            yield featVec |> Array.removeAt axis|]
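Both versions keep the rows whose column `axis` equals `value` and drop that column from each kept row. As a rough illustration, the same filter can be written as a single Python comprehension, which mirrors the shape of the F# array expression. The name `split_data_set` and the sample rows are made up for this sketch and are not from the talk:

```python
def split_data_set(data_set, axis, value):
    """Keep rows where row[axis] == value, with column `axis` removed."""
    return [row[:axis] + row[axis + 1:]
            for row in data_set
            if row[axis] == value]

# Illustrative rows: two feature columns followed by a label.
data = [
    [1, 1, "yes"],
    [1, 1, "yes"],
    [1, 0, "no"],
    [0, 1, "no"],
    [0, 1, "no"],
]

first_is_one = split_data_set(data, 0, 1)
first_is_zero = split_data_set(data, 0, 0)
print(first_is_one)   # [[1, 'yes'], [1, 'yes'], [0, 'no']]
print(first_is_zero)  # [[1, 'no'], [1, 'no']]
```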
K-MEANS CLUSTERING ALGORITHM
(* K-Means Algorithm *)
// NOTE: distance, average and Seq.iterate are helpers that are not shown
// on this slide (Seq.iterate is not part of FSharp.Core).
/// Group all the vectors by the nearest center.
let classify centroids vectors =
    vectors |> Array.groupBy (fun v -> centroids |> Array.minBy (distance v))
/// Repeatedly classify the vectors, starting with the seed centroids
let computeCentroids seed vectors =
    seed |> Seq.iterate (fun centers -> classify centers vectors
                                        |> Array.map (snd >> average))
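The slide leaves out the helpers it relies on. Below is a minimal, self-contained Python sketch of the same classify-then-average loop. It is not the speaker's code, and it makes several assumptions: `distance` is Euclidean, `average` is the component-wise mean, and the iteration yields the seed first and then each successive result (as Haskell's `iterate` does). The sample vectors are made up.

```python
import math
from itertools import islice

def distance(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))

def classify(centroids, vectors):
    """Group all the vectors by their nearest centroid."""
    groups = {}
    for v in vectors:
        nearest = min(centroids, key=lambda c: distance(v, c))
        groups.setdefault(nearest, []).append(v)
    return groups

def compute_centroids(seed, vectors):
    """Yield successive centroid estimates, starting with the seed.

    A centroid that attracts no vectors drops out of the next estimate.
    """
    centers = list(seed)
    while True:
        yield centers
        centers = [average(group)
                   for group in classify(centers, vectors).values()]

# Made-up data: two obvious clusters and two seed centroids (tuples, so
# they can be used as dictionary keys).
vectors = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
seed = [(0.0, 0.0), (10.0, 10.0)]

# Take a fixed number of estimates; real code would stop once the
# centroids stop moving.
*_, centers = islice(compute_centroids(seed, vectors), 5)
print(centers)  # [(1.0, 1.5), (9.5, 9.0)]
```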
R – TYPE PROVIDER
WORLD BANK DATA
RESOURCES
TYPE PROVIDERS
• JSON
• XML
• CSV
• Excel
• SQL
• R
• MATLAB
• Hadoop
• ...
TRYFSHARP.ORG
BUY THE BOOK
GET THE T-SHIRT
MACHINE LEARNING JOB TRENDS
• Source: indeed.co.uk
QUESTIONS