Euroclojure 2017


Transcript of Euroclojure 2017

The REturn of Clojure Data Science

Elise Huard @elise_huard - Euroclojure 2017

Thursday, 20 July 17


http://www.mastodonc.com/


Basic tooling

Data structures

2 Examples

Roadmap?

Credits


Your Mileage May Vary


Basic tooling


“Awkward-sized data”


Sampling

https://github.com/bigmlcom/sampling
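To make the idea of sampling from awkward-sized data concrete, here is a minimal sketch of single-pass reservoir sampling in plain Clojure. It does not use the API of the library linked above; the function name `reservoir-sample` and the seeded `java.util.Random` are assumptions made for illustration.

```clojure
(defn reservoir-sample
  "Returns a vector of at most k items drawn uniformly from coll in a
  single pass (Algorithm R). rng is a java.util.Random, passed in so
  that a run can be repeated with the same seed."
  [^java.util.Random rng k coll]
  (first
   (reduce (fn [[reservoir seen] item]
             (let [seen (inc seen)]
               (if (<= seen k)
                 ;; Fill the reservoir with the first k items.
                 [(conj reservoir item) seen]
                 ;; Afterwards, keep the new item with probability k/seen.
                 (let [j (.nextInt rng (int seen))]
                   [(if (< j k) (assoc reservoir j item) reservoir)
                    seen]))))
           [[] 0]
           coll)))

;; Draw 5 items from a sequence of 1000 without holding it all in memory.
(def sample (reservoir-sample (java.util.Random. 42) 5 (range 1000)))
```

Because the reservoir never grows beyond k items, the same function works on a lazy sequence read from a file that is too large to load at once.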


Making sense of data


Notebooks


Data structures


(deftype DataSet
  [^IPersistentVector column-names
   ^IPersistentVector columns
   ^IPersistentVector shape]
  clojure.lang.IMeta
  (meta [m] nil)
  clojure.lang.IObj
  (withMeta [m meta]
    (with-meta (mp/convert-to-nested-vectors m) meta))
  ...)


Plain old Clojure Data Structures

[{:vote "I would probably vote for it",
  :age 57,
  :rural "urban",
  :arguments-against "It might encourage people to stop working",
  :age_group "40_65",
  :country_code "AT",
  :weight "1.533.248.826",
  :arguments-for "It increases appreciation for household work and volunteering | It encourages financial independence and self-responsibility | It reduces anxiety about financing basic needs",
  :gender "male",
  :dem_has_children "yes",
  :dem_full_time_job "yes",
  :dem_education_level "high",
  :awareness "I understand it fully"}
 ... ]


clojure.spec

(s/def ::acctid int?)
(s/def ::first-name string?)
(s/def ::last-name string?)
(s/def ::email ::email-type)

(s/def ::person (s/keys :req [::first-name ::last-name ::email]
                        :opt [::phone]))
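As a rough illustration of how specs like these can check a row of data, the sketch below validates two maps against `::person`. It assumes the `clojure.spec.alpha` namespace that ships with Clojure 1.9 and later. The definitions of `::email-type` and `::phone` are not shown on the slide, so the ones here are placeholders, and the sample rows are made up.

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumed definitions: the slide references ::email-type and ::phone
;; without showing them, so these are illustrative stand-ins.
(s/def ::email-type (s/and string? #(re-matches #"[^@\s]+@[^@\s]+\.[^@\s]+" %)))
(s/def ::phone string?)

(s/def ::first-name string?)
(s/def ::last-name string?)
(s/def ::email ::email-type)
(s/def ::person (s/keys :req [::first-name ::last-name ::email]
                        :opt [::phone]))

;; Made-up rows: one complete, one with a wrong type and a missing key.
(def ok-row  {::first-name "Ada" ::last-name "Lovelace" ::email "ada@example.org"})
(def bad-row {::first-name "Ada" ::last-name 42})

(def ok-valid?  (s/valid? ::person ok-row))   ; true
(def bad-valid? (s/valid? ::person bad-row))  ; false

;; explain-str describes every failing key, which helps when cleaning data.
(def bad-report (s/explain-str ::person bad-row))
```

Running `s/valid?` over every row before analysis gives a cheap way to find malformed records in a dataset made of plain maps.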


clojure.spec

(defn -integer? [x]
  (cond (string? x) (str->int x)
        (clojure.core/integer? x) x
        (and (clojure.core/double? x) (double->int x)) (double->int x)
        :else :clojure.spec/invalid))

(def integer? (s/conformer -integer?))
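The conformer above relies on helpers (`str->int`, `double->int`) that the slide does not show. The following is a simplified sketch of the same coercion idea, again assuming `clojure.spec.alpha` from Clojure 1.9 and later, where the invalid marker is `:clojure.spec.alpha/invalid`. The names `parse-int`, `->int` and `::age` are hypothetical.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

(defn parse-int
  "Hypothetical helper: parses text as a long, or returns the spec
  invalid marker when it is not a whole number."
  [text]
  (try (Long/parseLong (str/trim text))
       (catch NumberFormatException _ :clojure.spec.alpha/invalid)))

(def ->int
  ;; A conformer both validates and transforms: strings are parsed,
  ;; integers pass through, anything else is rejected.
  (s/conformer
   (fn [x]
     (cond
       (string? x)  (parse-int x)
       (integer? x) x
       :else        :clojure.spec.alpha/invalid))))

(s/def ::age ->int)

(def from-string (s/conform ::age "57"))           ; 57
(def from-long   (s/conform ::age 57))             ; 57
(def rejected?   (not (s/valid? ::age "fifty-seven"))) ; true
```

This is useful for survey data read from CSV, where every field arrives as a string and has to be coerced before any arithmetic.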


shiny transducers: kixi.stats

• Arithmetic mean
• Geometric mean
• Harmonic mean
• Variance
• Standard deviation
• Standard error
• Skewness
• Kurtosis
• Covariance
• Covariance matrix
• Correlation
• Correlation matrix
• Simple linear regression

...

https://github.com/MastodonC/kixi.stats

(->> [{:x 1 :y 3 :z 2} {:x 2 :y 2 :z 4} {:x 3 :y 1 :z 6}]
     (transduce identity (correlation-matrix {:x :x :y :y :z :z})))
;; => {[:x :y] -1.0, [:x :z] 1.0, [:y :z] -1.0,
;;     [:y :x] -1.0, [:z :x] 1.0, [:z :y] -1.0}
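To show why statistics fit the transducer model, here is a minimal sketch of a reducing function for the arithmetic mean in plain Clojure. It is not the kixi.stats implementation; `mean-rf` is a hypothetical name, and the sketch only illustrates the three arities (init, completion, step) that `transduce` expects.

```clojure
(defn mean-rf
  "Reducing function for the arithmetic mean, usable with transduce.
  The accumulator is a [sum count] pair."
  ([] [0.0 0])                            ; init: empty accumulator
  ([[sum n]] (when (pos? n) (/ sum n)))   ; completion: nil for no input
  ([[sum n] x] [(+ sum x) (inc n)]))      ; step: fold in one value

(def rows [{:x 1 :y 3 :z 2} {:x 2 :y 2 :z 4} {:x 3 :y 1 :z 6}])

;; The transducer picks the values; the reducing function summarises them.
(def mean-x (transduce (map :x) mean-rf rows))

;; Transducers compose, so filtering needs no intermediate collection.
(def mean-x-where-y>1
  (transduce (comp (filter #(> (:y %) 1)) (map :x)) mean-rf rows))
```

Separating the selection step from the summary step is what lets the same statistic run over vectors, lazy sequences or channels without change.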


xforms https://github.com/cgrand/xforms

redux https://github.com/henrygarner/redux

huri https://github.com/sbelak/huri


Example: universal basic income EU survey


(def in-favour
  (filter #(= (:vote %) "I would vote for it") data))

(defn comp-by-numbers [a b]
  (> (:how-many a) (:how-many b)))

(defn tally-numbers-fn [d]
  (fn [reason]
    (hash-map :reason reason
              :how-many (reduce + 0 (map #(get % reason) d)))))

(table-view
  (sort comp-by-numbers
        (map (tally-numbers-fn in-favour) reasons-for)))
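The tally above presumably works on rows where each reason has already been expanded into its own numeric column; that preprocessing is not shown. As an alternative sketch under that assumption, the pipe-separated `:arguments-for` string from the raw rows can be tallied directly. The function name `tally-reasons` and the three sample rows are made up for illustration.

```clojure
(require '[clojure.string :as str])

;; Made-up rows in the shape of the survey data shown earlier.
(def sample-data
  [{:vote "I would vote for it"
    :arguments-for "It reduces anxiety about financing basic needs | It encourages financial independence and self-responsibility"}
   {:vote "I would vote for it"
    :arguments-for "It reduces anxiety about financing basic needs"}
   {:vote "I would probably vote for it"
    :arguments-for "It encourages financial independence and self-responsibility"}])

(defn tally-reasons
  "Splits the pipe-separated answers under key k and counts each reason,
  returning [reason count] pairs sorted by descending count."
  [k rows]
  (->> rows
       (mapcat #(str/split (get % k "") #"\s*\|\s*"))
       (remove str/blank?)
       frequencies
       (sort-by val >)))

(def in-favour-sample
  (filter #(= (:vote %) "I would vote for it") sample-data))

(def tallied (tally-reasons :arguments-for in-favour-sample))
```

Here `frequencies` replaces the hand-written tally function, at the cost of assuming that the answer strings are spelled consistently across rows.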


Example: neural nets


Cortex https://github.com/thinktopic/cortex

deeplearning4j https://deeplearning4j.org/


http://yann.lecun.com/exdb/mnist/


(defn initial-description
  [input-w input-h num-classes]
  [(layers/input input-w input-h 1 :id :data)
   (layers/convolutional 5 0 1 20)
   (layers/max-pooling 2 0 2)
   (layers/dropout 0.9)
   (layers/relu)
   (layers/convolutional 5 0 1 50)
   (layers/max-pooling 2 0 2)
   (layers/batch-normalization)
   (layers/linear 1000)
   (layers/relu :center-loss {:label-indexes {:stream :labels}
                              :label-inverse-counts {:stream :labels}
                              :labels {:stream :labels}
                              :alpha 0.9
                              :lambda 1e-4})
   (layers/dropout 0.5)
   (layers/linear num-classes)
   (layers/softmax :id :labels)])


Java bindings

https://github.com/mastodonc/kixi.mallet

long wishlist ...


Roadmap?


• Notebooks: continue to improve/upgrade gorilla-repl, new alternative?

• Adding more to kixi.stats

• Mine the good bits of Incanter

• More Clojure bindings to Java libs


Credits


David Edgar Liebke
Michael Anderson
Jony Hudson
Carin Meier
Simon Belak

...


Thank you

Elise Huard @elise_huard - Euroclojure 2017