From Power Chord to the Power of Models - Oredev
![Page 1: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/1.jpg)
From Power Chords
to the Power of
Models
Ali Kheyrollahi (@aliostad)
![Page 2: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/2.jpg)
> stackoverflow> £1.5 bln
global fashion destination
> 35% every year
![Page 3: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/3.jpg)
8
Local pop music
9
Local pop music “Cheelee pom!”
10
Boney M “Rasputin”
11
Blondie “Heart of Glass”
![Page 4: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/4.jpg)
![Page 5: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/5.jpg)
Infobox
Free text
Links
![Page 6: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/6.jpg)
Data Acquisition
![Page 7: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/7.jpg)
Data Source - Wiki
4,990,279 English Articles
37,583,879 Articles
![Page 8: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/8.jpg)
Data Source - Wiki vs Britannica
Feng Zhu (assistant prof at Harvard):
“There has been lots of research on the accuracy of Wikipedia, and the results are mixed—some studies show it is just as good as the experts, others show [that] Wikipedia is not accurate at all.”
“… the editors [of Britannica] are still not found to be more objective than the crowd in articles that are sufficiently revised.”
![Page 9: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/9.jpg)
Data Source - Wikipedia in scholar papers
[Chart: Wikipedia in scholarly papers per year, 2005 to 2014; vertical axis from 0 to 180,000. Source: Google Scholar]
![Page 10: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/10.jpg)
Data Acquisition - Wiki
Pipeline: List of Rock Genres → Rock Genres → Rock Artists
At each stage: Store HTML, Capture Links
Tools: Python scripts, Postgres
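The acquisition flow on this slide (store the HTML, capture the links, follow them to the next level) can be sketched roughly as follows. This is a minimal sketch and not the talk's actual scripts: it uses the standard library's `html.parser`, an in-memory `sqlite3` table as a stand-in for Postgres, and a hard-coded HTML snippet instead of a live request. The `pages` table, the `LinkCollector` class, and the sample HTML are all hypothetical.

```python
import sqlite3
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags that point at wiki articles."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Assumption: internal article links start with /wiki/
            if href and href.startswith("/wiki/"):
                self.links.append(href)


def store_page(conn, url, html):
    """Store the raw HTML so it can be re-parsed later without re-fetching."""
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, html) VALUES (?, ?)", (url, html)
    )


def capture_links(html):
    """Return the article links found in a page, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# sqlite3 in memory stands in for the Postgres store named on the slide.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, html TEXT)")

# Hypothetical snippet standing in for the downloaded list-of-genres page.
sample_html = (
    '<ul><li><a href="/wiki/Krautrock">Krautrock</a></li>'
    '<li><a href="/wiki/Shoegazing">Shoegazing</a></li>'
    '<li><a href="https://example.org/">external</a></li></ul>'
)

store_page(conn, "/wiki/List_of_rock_genres", sample_html)
genre_links = capture_links(sample_html)
print(genre_links)
```

The same two functions would then be applied to each genre page to reach the artist pages.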
![Page 11: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/11.jpg)
Data Source - Content vs. Data
Hyphen U+002D
figure dash U+2012
horizontal bar U+2015
minus sign U+2212
em dash U+2014
en dash U+2013
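One way to turn such content into data is to fold every dash look-alike into a plain ASCII hyphen before parsing. The sketch below is an assumption about how that clean-up could be done, not the speaker's code; `normalize_dashes` is a hypothetical helper name.

```python
# Map each dash look-alike to the ASCII hyphen-minus (U+002D).
DASH_TABLE = str.maketrans({
    "\u2012": "-",  # figure dash
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2015": "-",  # horizontal bar
    "\u2212": "-",  # minus sign
})


def normalize_dashes(text):
    """Replace typographic dashes with '-' so ranges and names parse uniformly."""
    return text.translate(DASH_TABLE)


years_active = normalize_dashes("1965\u20131970")  # en dash between the years
start, end = years_active.split("-")
print(start, end)
```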
![Page 12: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/12.jpg)
Data Exploration
![Page 13: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/13.jpg)
Data Exploration
“I personally … literally just look at the screen, just like the matrix”
Claudia Perlich, multi-award winner Data Scientist
![Page 14: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/14.jpg)
Data Exploration
“… the dirty little secret that I have won all of them because I have found something wrong with the data… I would like to play around with dataset and get intimately familiar with dataset and its properties.”
Claudia Perlich
![Page 15: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/15.jpg)
Album Genre
![Page 16: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/16.jpg)
Album Genre
http://wiki-rock.azurewebsites.net/top10-album-genres.html
![Page 17: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/17.jpg)
Data Models
![Page 18: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/18.jpg)
Data Models Model?!
![Page 19: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/19.jpg)
Data Models Model
Mathematical representation of a concept based on parameters that impact that concept
• Rating of a native app
• Stack Overflow score
• Credit score
• Fraud check
![Page 20: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/20.jpg)
“All models are wrong… but some are useful.”
George Box
Data Models Model
![Page 21: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/21.jpg)
Data Models Graph 101
Social Network Analysis and Graph Theory
• Nodes/vertices and edges/lines
• Directedness:
  • Directed
  • Undirected
• Degree, InDegree/OutDegree
• Weight
A B
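The terms on this slide can be made concrete with a tiny directed, weighted graph. This is a sketch in plain Python with made-up nodes and weights (only the A to B edge mirrors the diagram); a graph library would normally handle this bookkeeping.

```python
# Each edge is (source, target, weight); direction runs source -> target.
edges = [("A", "B", 1.0), ("B", "C", 2.0), ("A", "C", 0.5)]

# Adjacency as a dict of dicts: graph[source][target] = weight
graph = {}
for source, target, weight in edges:
    graph.setdefault(source, {})[target] = weight
    graph.setdefault(target, {})  # make sure sink nodes appear too

# Out-degree: edges leaving a node. In-degree: edges arriving at it.
out_degree = {node: len(targets) for node, targets in graph.items()}
in_degree = {node: 0 for node in graph}
for targets in graph.values():
    for target in targets:
        in_degree[target] += 1

# Ignoring direction, degree is the sum of the two.
degree = {node: in_degree[node] + out_degree[node] for node in graph}
print(out_degree, in_degree, degree)
```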
![Page 22: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/22.jpg)
Data Models Centrality
[Figure: example graph with values 12, 4, 2, 2, 1 labelled Degree]
Same degree, different betweenness
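The point that two nodes can share a degree yet differ in betweenness can be checked on a small graph. The sketch below implements Brandes' algorithm for unweighted, undirected graphs in plain Python so that it runs without extra packages; the five-node graph is hypothetical and is not the one drawn on the slide.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: total / 2 for v, total in score.items()}


# Triangle a-b-c with a tail c-d-e: a and d both have degree 2.
adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
bc = betweenness(adj)
print(len(adj["a"]), len(adj["d"]))  # same degree
print(bc["a"], bc["d"])              # different betweenness
```

Node `a` sits inside the triangle and lies on no shortest path between other nodes, while `d` is the only route to `e`.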
![Page 23: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/23.jpg)
Graph Codez
import networkx as nx

g = nx.Graph()  # undirected graph; use nx.DiGraph() for a directed one
g.add_edge('a', 'b')
g.add_edge('b', 'c')
…
print(len(g['b']))  # degree of node 'b'
c = nx.betweenness_centrality(g, normalized=True)
# c -> dictionary of node names and their score
![Page 24: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/24.jpg)
Modelling Influence using Wiki
![Page 25: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/25.jpg)
Data Models Cited Influence
Howlin’ Wolf
Captain Beefheart
1940 1964
![Page 26: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/26.jpg)
Data Models Cited Influence
Most influential Rock Artists Based on out-degree
The Beatles => 188
Black Sabbath => 127
Led Zeppelin => 118
Jimi Hendrix => 114
Bob Dylan => 94
Pink Floyd => 86
Iron Maiden => 77
Metallica => 77
The Rolling Stones => 66
The Beach Boys => 65
Neil Young => 63
Nirvana => 62
Slayer => 60
Queen => 59
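A ranking like this one can be produced by counting outgoing edges in a directed influence graph. The sketch below assumes each edge runs from the influencer to the artist who cites them; only the Howlin' Wolf to Captain Beefheart edge comes from the slides, and `artist_x` and `artist_y` are hypothetical placeholders.

```python
from collections import Counter

# (influencer, influenced) pairs; direction is an assumption of this sketch.
influence_edges = [
    ("Howlin' Wolf", "Captain Beefheart"),  # edge shown on the earlier slide
    ("Howlin' Wolf", "artist_x"),           # hypothetical
    ("Howlin' Wolf", "artist_y"),           # hypothetical
    ("Captain Beefheart", "artist_y"),      # hypothetical
]

# Out-degree of a node = number of artists citing it as an influence.
out_degree = Counter(influencer for influencer, _ in influence_edges)

ranking = out_degree.most_common()
for artist, count in ranking:
    print(f"{artist} => {count}")
```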
![Page 27: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/27.jpg)
Data Models Cited Influence
Most influential Rock Artists Based on Betweenness Centrality
Jimi Hendrix => 53476.2014921
The Beatles => 47511.7957531
Bob Dylan => 38107.0298185
Led Zeppelin => 32701.7223273
Nirvana => 29733.9066836
Metallica => 29356.6009213
Queen => 28989.2844223
Robert Smith => 28880.670718
Elvis Presley => 28463.2891497
Slade => 27656.487307
Iron Maiden => 22449.6697023
Ramones => 22437.6112965
Rush => 21125.9481602
Neil Young => 19913.887522
![Page 28: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/28.jpg)
Data Models Cited Influence
Most influential Artists Based on Betweenness Centrality

Heavy Metal
Metallica => 566.06
Iron Maiden => 419.21
Corey Taylor => 146.0
Led Zeppelin => 122.73
Slipknot => 116.58
King Diamond => 94.7
Machine Head => 85.12
Rush => 70.41
Black Sabbath => 68.0
Van Halen => 54.56
Deep Purple => 53.5
Megadeth => 42.63
Guns N' Roses => 24.25

Alternative Rock
Nirvana => 490.08
Muse => 114.5
Weezer => 97.33
Pixies => 94.17
Sonic Youth => 78.5
Rivers Cuomo => 69.5
Siouxsie and the Banshees => 51.67
The Smiths => 51.5
Jeff Buckley => 46.17
The Offspring => 43.0
Placebo => 42.0
My Chemical Romance => 34.0
The Smashing Pumpkins => 32.33

Progressive Rock
Rush => 54.0
Marillion => 34.0
Pink Floyd => 33.0
Yes => 20.0
Porcupine Tree => 19.5
Dream Theater => 19.0
Chris Squire => 16.5
Primus => 15.0
Tool => 12.0
Mahavishnu Orchestra => 8.0
Geddy Lee => 7.0
Neil Peart => 5.0
Keith Emerson => 5.0
![Page 29: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/29.jpg)
Data Models PageRank
![Page 30: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/30.jpg)
Data Models Page Rank
The Beatles => 0.00837723421839
Blind Lemon Jefferson => 0.00837369035189
Josh White => 0.00824945015047
Bessie Smith => 0.00717743996144
Louis Armstrong => 0.00692897940193
James P. Johnson => 0.00628676810257
Little Richard => 0.00584677302727
Muddy Waters => 0.005773172933
Tampa Red => 0.00572032424174
Robert Johnson => 0.00523579252974
Big Bill Broonzy => 0.00516075834679
Moon Mullican => 0.0050657751593
Black Sabbath => 0.00498789229732
Elvis Presley => 0.00497932058047
Duke Ellington => 0.00465800760107
Bo Diddley => 0.0044496675634
Jimmy Page => 0.00437658472459
Frank Zappa => 0.00431978608953
Miles Davis => 0.00396303890974
Jimi Hendrix => 0.00391117233916
Sister Rosetta Tharpe => 0.00390833570401
Bing Crosby => 0.00385435213525
Bob Dylan => 0.00358608821536
James Brown => 0.00349870931123
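PageRank scores of this kind can be approximated with a short power iteration. The sketch below is a plain-Python illustration, not the code behind the figures above. It assumes each edge points from an artist to the artists they cite as influences, so rank flows toward the influencers; the node names, the damping factor of 0.85, and the iteration count are all assumptions.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict of node -> list of linked nodes."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node receives the same small "teleport" share.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[v] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical graph: b and d cite a, c cites b, a cites nobody.
cites = {"a": [], "b": ["a"], "c": ["b"], "d": ["a"]}
rank = pagerank(cites)
for artist, value in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(f"{artist} => {value:.4f}")
```

Under this edge direction, an early artist cited by already well-ranked artists accumulates score even with few direct citations.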
![Page 31: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/31.jpg)
Other Models
![Page 32: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/32.jpg)
Weighted graph Album Genres
[Figure: Krautrock, Psychedelic Rock and Experimental Rock joined pairwise by edges of weight 1]
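The weighted genre graph can be built by adding 1 to the edge between every pair of genres that share an album. This is a sketch under that assumption; the album identifiers are hypothetical, and only the three genre names come from the slide.

```python
from collections import Counter
from itertools import combinations

# Hypothetical albums mapped to the genres listed in their infobox.
album_genres = {
    "album_1": ["Krautrock", "Psychedelic Rock", "Experimental Rock"],
    "album_2": ["Krautrock", "Experimental Rock"],
}

# Edge weight = number of albums on which both genres appear together.
weights = Counter()
for genres in album_genres.values():
    # Sort so that each undirected pair always has the same key.
    for genre_a, genre_b in combinations(sorted(set(genres)), 2):
        weights[(genre_a, genre_b)] += 1

for (genre_a, genre_b), weight in weights.most_common():
    print(f"{genre_a} - {genre_b}: {weight}")
```

A single album with three genres contributes three edges of weight 1, matching the triangle on the slide; further albums raise the weights.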
![Page 33: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/33.jpg)
Genre Affinity
[Figure: weighted graph linking Indie Rock, Shoegazing, Alternative Rock, Dream Pop and Post-rock; edge weights shown: 22, 25, 24, 12]
![Page 34: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/34.jpg)
Genre Affinity
[Figure: weighted graph linking Gothic Metal, Doom Metal, Black Metal, Heavy Metal and Stoner Metal; edge weights shown: 13, 34, 27, 12]
![Page 35: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/35.jpg)
Clustering in Networks
![Page 36: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/36.jpg)
Clustering in Networks
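Adjacency Matrix (Similarity Matrix)

|    | u1 | u2 | u3 | u4 | u5 |
|----|----|----|----|----|----|
| u1 | 0  | 1  | 0  | 0  | 1  |
| u2 | 1  | 0  | 1  | 1  | 0  |
| u3 | 0  | 1  | 0  | 0  | 1  |
| u4 | 0  | 1  | 0  | 0  | 1  |
| u5 | 1  | 0  | 1  | 1  | 0  |

Degree Matrix

|    | u1 | u2 | u3 | u4 | u5 |
|----|----|----|----|----|----|
| u1 | 2  | 0  | 0  | 0  | 0  |
| u2 | 0  | 3  | 0  | 0  | 0  |
| u3 | 0  | 0  | 2  | 0  | 0  |
| u4 | 0  | 0  | 0  | 2  | 0  |
| u5 | 0  | 0  | 0  | 0  | 3  |

[Figure: the corresponding graph with nodes 1 to 5]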
![Page 37: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/37.jpg)
Clustering in Networks
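Spectral Clustering: Using Eigenvectors of the Laplacian Matrix

$$
\underbrace{\begin{bmatrix}
2 & 0 & 0 & 0 & 0 \\
0 & 3 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 \\
0 & 0 & 0 & 2 & 0 \\
0 & 0 & 0 & 0 & 3
\end{bmatrix}}_{\text{Degree Matrix}}
-
\underbrace{\begin{bmatrix}
0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 0
\end{bmatrix}}_{\text{Adjacency Matrix (Similarity Matrix)}}
=
\underbrace{\begin{bmatrix}
2 & -1 & 0 & 0 & -1 \\
-1 & 3 & -1 & -1 & 0 \\
0 & -1 & 2 & 0 & -1 \\
0 & -1 & 0 & 2 & -1 \\
-1 & 0 & -1 & -1 & 3
\end{bmatrix}}_{\text{Laplacian Matrix}}
$$

Rows and columns are ordered u1 to u5.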
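The subtraction on this slide is easy to reproduce with NumPy. The sketch below rebuilds the degree matrix from the row sums of the adjacency matrix and checks the result; the adjacency values are taken from the slide with zeros on the diagonal, and the variable names are my own.

```python
import numpy as np

# Adjacency matrix for u1..u5 as on the slide (no self-loops).
A = np.array([
    [0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],
])

# Degree matrix: each node's degree (row sum) on the diagonal.
D = np.diag(A.sum(axis=1))

# Unnormalized graph Laplacian.
L = D - A
print(L)

# Every row of a graph Laplacian sums to zero.
print(L.sum(axis=1))
```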
![Page 38: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/38.jpg)
Clustering in Networks
Eigenvector: a nonzero vector v whose direction does not change when it is multiplied by matrix A; the product equals v multiplied by a scalar λ (Av = λv)
u1 u2 u3 u4 u5
-0.7 0.3 -0.2 -0.1 0.7
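The defining property can be verified numerically on the Laplacian of the five-node example. This sketch uses `numpy.linalg.eigh`, which suits symmetric matrices and returns eigenvalues in ascending order with the eigenvectors as columns. It checks the property only and stops short of the clustering step, where the rows of the leading eigenvectors would typically be grouped (for example with k-means).

```python
import numpy as np

# Laplacian of the five-node example (degree matrix minus adjacency matrix).
L = np.array([
    [2, -1, 0, 0, -1],
    [-1, 3, -1, -1, 0],
    [0, -1, 2, 0, -1],
    [0, -1, 0, 2, -1],
    [-1, 0, -1, -1, 3],
], dtype=float)

# eigh is meant for symmetric matrices; column k pairs with eigenvalues[k].
eigenvalues, eigenvectors = np.linalg.eigh(L)

# Check L v = lambda v for every eigenpair.
property_holds = all(
    np.allclose(L @ eigenvectors[:, k], eigenvalues[k] * eigenvectors[:, k])
    for k in range(len(eigenvalues))
)

print(np.round(eigenvalues, 6))
print(property_holds)
```

The smallest eigenvalue of a graph Laplacian is always 0, and its multiplicity equals the number of connected components; here it appears once because the graph is connected.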
![Page 39: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/39.jpg)
Spectral Clustering Codez
from sklearn.cluster import spectral_clustering
import numpy as np

# n = number of nodes; n_clusters = number of clusters wanted
A = [[0.0 for _ in range(n)] for _ in range(n)]
… # build adjacency matrix
res = spectral_clustering(np.array(A), n_clusters=n_clusters)
# res -> array of cluster indices, one per node, e.g. [1,1,0,5,…]
![Page 40: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/40.jpg)
Spectral Clustering Results
• Folk Rock, Country Rock, Blues, Folk, Country, Americana, Roots Rock, Blues Rock, Southern Rock
• Power Metal, Progressive Metal, Symphonic Metal, Black Metal, Melodic Death Metal, Groove Metal, Nu Metal, Thrash Metal, Death Metal, Metalcore, Industrial Metal, Gothic Metal, Christian Metal, Doom Metal, Speed Metal
• Alternative Rock, Indie Rock, New Wave, Synthpop, Electronica
• Rock, R&B, Pop, Pop Rock, Funk, Soul
• Heavy Metal, Hard Rock, Alternative Metal
![Page 41: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/41.jpg)
Intelligent Models
![Page 42: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/42.jpg)
word2vec Model
Skip-gram: a proximity-based probability model trained using Neural Networks (Deep Learning)
Pink Floyd were an English rock band formed in London
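Skip-gram training data consists of (centre word, nearby word) pairs taken from a sliding window. The sketch below generates those pairs for the example sentence; the window size of 2 and the simple lowercase whitespace tokenization are assumptions, and no model is trained here.

```python
def skipgram_pairs(tokens, window=2):
    """Return (centre, context) pairs for every word within `window` positions."""
    pairs = []
    for i, centre in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((centre, tokens[j]))
    return pairs


sentence = "Pink Floyd were an English rock band formed in London"
tokens = sentence.lower().split()
pairs = skipgram_pairs(tokens, window=2)

# Context words seen around "rock"
print([context for centre, context in pairs if centre == "rock"])
```

The model is then trained to predict the context word from the centre word, so words that appear in similar contexts end up with similar vectors.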
![Page 43: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/43.jpg)
word2vec Representation
[Figure: the words rock, Pink Floyd, band, formed and London each shown as a one-hot vector, for example 0000000100000 and 0010000000000; rock and pop shown with dense vectors, [0.9, 0.1, 0.2, 0.4, 0.1, 0.1] and [0.8, 0.1, 0.1, 0.4, 0.1, 0.2]]
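The difference between the two representations shows up when vectors are compared. In the sketch below, two distinct one-hot vectors always have cosine similarity 0, while the two dense vectors from the slide are nearly parallel. Which word each vector belongs to is not asserted here, and `cosine` is a helper written for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Two one-hot vectors as on the slide: a single 1 in different positions.
one_hot_a = [int(ch) for ch in "0000000100000"]
one_hot_b = [int(ch) for ch in "0010000000000"]

# The two dense vectors shown on the slide.
dense_a = [0.9, 0.1, 0.2, 0.4, 0.1, 0.1]
dense_b = [0.8, 0.1, 0.1, 0.4, 0.1, 0.2]

one_hot_similarity = cosine(one_hot_a, one_hot_b)
dense_similarity = cosine(dense_a, dense_b)
print(one_hot_similarity, round(dense_similarity, 3))
```

One-hot vectors carry no notion of similarity between words; learned dense vectors do.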
![Page 44: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/44.jpg)
word2vec Demo
![Page 45: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/45.jpg)
Album Genre Model
Fun Happy Saturday We Are Friends Electronic Frozen Blood In My Veins Redneck Dance Chaos and Mayhem Basement Dub
Sentiment Analysis in text
Predicting the genre based on name of the album
![Page 46: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/46.jpg)
Deep Learning Basics
0) A method of supervised learning
1) Traditional Neural Networks with many layers
2) Often uses convolution as the node function
3) Training on Big Data can take weeks even on GPU
4) Huge success attributed to improved training, powerful computation and above all Big Data
5) Pooling, Dropout and local connections important
![Page 47: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/47.jpg)
Deep Learning Topology
![Page 48: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/48.jpg)
Deep Learning TensorFlow
“Wish you were here”
=> [123, 101, 42, 1969]
=> [123, 101, 42, 1969, 0, 0, 0, … 0]
Rock
=> [0, 0, 0, 1, 0, 0, 0, 0]
=> [[100000000000], [000000010000], … ]
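The encoding steps on this slide (words to indices, padding to a fixed length, genre to a one-hot label) can be sketched in plain Python before any tensors are involved. The vocabulary indices reuse the numbers on the slide, but the vocabulary itself, the genre order, the fixed length of 8, and the use of 0 for unknown words are assumptions; no TensorFlow calls are shown.

```python
# Hypothetical vocabulary; index 0 is reserved for padding.
vocab = {"wish": 123, "you": 101, "were": 42, "here": 1969}

# Hypothetical genre order chosen so that Rock lands at position 3.
genres = ["Blues", "Country", "Folk", "Rock", "Pop", "Funk", "Soul", "Electronica"]


def encode_title(title, vocab, max_len):
    """Map words to indices, then truncate or zero-pad to max_len."""
    # Assumption: words missing from the vocabulary map to 0.
    ids = [vocab.get(word, 0) for word in title.lower().split()]
    ids = ids[:max_len]
    return ids + [0] * (max_len - len(ids))


def one_hot(label, labels):
    """Return a list with 1 at the label's position and 0 elsewhere."""
    return [1 if candidate == label else 0 for candidate in labels]


features = encode_title("Wish you were here", vocab, max_len=8)
target = one_hot("Rock", genres)
print(features)
print(target)
```

Padding gives every title the same length, which lets titles be stacked into one fixed-shape batch for training.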
![Page 49: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/49.jpg)
Deep Learning Demo
![Page 50: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/50.jpg)
Wrap-up
![Page 51: From Power Chord to the Power of Models - Oredev](https://reader031.fdocuments.in/reader031/viewer/2022021502/58e77df71a28ab4d5a8b4a83/html5/thumbnails/51.jpg)
References
• All pictures from wikipedia.org used under Creative Commons
• Source of all data is from wikipedia.org collected online using a single call and then stored and processed
• Efficient Estimation of Word Representations in Vector Space. Mikolov et al. http://arxiv.org/abs/1301.3781
• Gensim's word2vec
• networkx lib
• word2vec blog post (500K docs): Five crazy abstractions my Deep Learning word2vec model just did
• word2vec on Rock music blog: Daft Punk+Tool=Muse: word2vec model trained on a small Rock music corpus
• code for word2vec on wiki data
• Highcharts: highcharts
• word2vec paper: PDF
• Automatic real-time road marking recognition using a feature-driven approach PDF
• Video of the road marking recognition: here and here and here
• Future of Programming - Rise of the Scientific Programmer (and fall of the craftsman)
• Deep Learning articles
• code for Deep Learning genre analysis
• …