Tensor Abuse - how to reuse machine learning frameworks
Uploaded by ted-dunning
Transcript of Tensor Abuse - how to reuse machine learning frameworks
© 2017 MapR Technologies / MapR Confidential
Tensor Abuse
In the Workplace
Agenda
• What are tensors?
• How modern tensor systems work
• It's not just machine learning
• It’s also not magic
• Machine learning as optimization
• Animation of optimization algorithm
Spoilers
• Trigger warning: Greek letters ahead
– AND lots of animations for developing intuitions
• Tensors have a history in physics
• We don’t use them that way in machine learning
• Tensors are cool because they make it easier to code
numerical algorithms
• But auto-derivatives are as big a deal
• And we can abuse this machinery
Linear Operators To the N-th Degree
• Tensors were originally invented for differential geometry
• That gave us relativity
But This Isn’t Physics
• In computing, we use some of the same words
• But they don’t mean the same thing
• We don’t need the Greek letters
• The point is that we have important patterns of computation
– We mostly don’t care about changing coordinate systems
Basic Operations
• Element-wise operations
• Outer products
• Reductions
• Matrix and vector products are special cases
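A minimal sketch of these operations in NumPy (NumPy chosen purely for illustration; TensorFlow and similar frameworks expose the same primitives):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Element-wise operation: applied independently to each entry
elementwise = a * b            # [4., 10., 18.]

# Outer product: every element of a times every element of b
outer = np.outer(a, b)         # 3x3 matrix, outer[i, j] = a[i] * b[j]

# Reduction: collapse an axis with an operation such as sum
total = elementwise.sum()      # 32.0

# A dot product is just an element-wise product plus a reduction
dot = (a * b).sum()            # same as np.dot(a, b)

# A matrix-vector product: broadcast element-wise, then reduce each row
M = np.arange(6.0).reshape(2, 3)
mv = (M * a).sum(axis=1)       # same as M @ a
```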
This is news?!?
Sounds like APL in sheep’s clothing
But it really is important because loop structuring is critical
Why This Matters for Machine Learning
• Sums of products and element-wise evaluations are ubiquitous
in machine learning
• And this is often surrounded by an outer loop
• This is susceptible to pipelining in GPUs, but only if you can see the large-scale patterns in the code
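As a sketch of why exposing the pattern matters: the explicit loops below hide the structure from any vectorizing runtime, while the tensor expression states the whole pattern in one step (NumPy used here purely for illustration):

```python
import numpy as np

X = np.random.default_rng(0).standard_normal((1000, 64))
w = np.random.default_rng(1).standard_normal(64)

# Sums of products written as explicit loops, one row at a time.
# A compiler or GPU runtime cannot easily see the whole pattern here.
slow = np.empty(1000)
for i in range(1000):
    s = 0.0
    for j in range(64):
        s += X[i, j] * w[j]
    slow[i] = s

# The same computation as one tensor operation: the large-scale
# pattern (a matrix-vector product) is visible and can be pipelined.
fast = X @ w

assert np.allclose(slow, fast)
```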
But, tensors are only half of the story
Machine Learning Has Changed
• Machine learning used to be really hard
– Numeric performance was zilch
– Training data was poor and small
– Learning many layers of an MLP was impossible
– Productivity for new approaches hampered by code complexity
• Recent advances have changed things (a lot)
– Important new regularization techniques
– Per-coefficient learning-rate techniques
– New gradient-based optimization algorithms
– Automated differentiation
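As a sketch of the per-coefficient learning-rate idea mentioned above, here is an Adagrad-style update rule (simplified for illustration; production optimizers differ in detail):

```python
import numpy as np

def adagrad_step(w, grad, cache, lr=0.1, eps=1e-8):
    """One Adagrad-style update: each coefficient gets its own
    effective learning rate, shrinking where past gradients were large."""
    cache += grad ** 2
    w -= lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Minimize f(w) = sum(w**2); the gradient is 2*w
w = np.array([3.0, -2.0])
cache = np.zeros_like(w)
for _ in range(200):
    grad = 2 * w
    w, cache = adagrad_step(w, grad, cache)
```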
Gradients are a Big Deal
• In low dimensions, simple search techniques work well
– Line search, evolutionary processes, polyhedron warping all work
• It’s different in high dimensions
– Need some guide about which direction to go
– Too many ways to get sideways
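The guidance a gradient provides can be sketched with plain gradient descent on a high-dimensional quadratic (illustrative only; not a production optimizer):

```python
import numpy as np

def gradient_descent(grad_f, w0, lr=0.1, steps=100):
    """Follow the negative gradient: the one direction guaranteed to
    decrease a smooth function locally, even in very high dimensions."""
    w = w0.copy()
    for _ in range(steps):
        w -= lr * grad_f(w)
    return w

# f(w) = ||w||^2 in 1000 dimensions; its gradient is 2*w
rng = np.random.default_rng(42)
w0 = rng.standard_normal(1000)
w = gradient_descent(lambda w: 2 * w, w0, lr=0.1, steps=100)
# Each step multiplies w by 0.8, so w converges rapidly to the minimum
```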
Yeah, But …
• Gradients are also a pain in the ***
– Traditionally you needed to derive them by hand
– For some problems like robot kinematics or astrodynamics, equations
would cover many pages
• The big change in recent years is automatic differentiation
– Some limits on how problems are formulated
– But pretty complicated forms can be allowed easily
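The core idea of automatic differentiation can be sketched in a few lines with dual numbers (this is forward mode; systems like TensorFlow use reverse mode, but the principle of mechanically propagating derivatives is the same):

```python
class Dual:
    """A number carrying its own derivative: (value, derivative)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule applied automatically at every multiplication
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x while propagating the derivative mechanically."""
    return f(Dual(x, 1.0)).dot

g = derivative(lambda x: x * x + 3 * x, 2.0)  # d/dx(x^2 + 3x) = 2x + 3 = 7.0
```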
hidden = tf.nn.elu(x - biases)
y_pred = tf.matmul(hidden, weights)
loss = tf.reduce_mean((y - y_pred) ** 2)
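Read as plain tensor operations, that fragment is just an element-wise ELU, a matrix product, and a mean-squared-error reduction. A NumPy re-expression (illustrative only, with assumed shapes; not the TensorFlow API) makes that visible:

```python
import numpy as np

def elu(z):
    # Exponential linear unit, applied element-wise
    return np.where(z > 0, z, np.exp(z) - 1)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))        # batch of inputs (shapes assumed)
biases = rng.standard_normal(4)
weights = rng.standard_normal((4, 1))
y = rng.standard_normal((8, 1))

hidden = elu(x - biases)               # element-wise op with broadcasting
y_pred = hidden @ weights              # matrix product
loss = np.mean((y - y_pred) ** 2)      # reduction to a scalar
```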
The Lessons to Learn
• These new systems are astounding
– Tensors allow us to encode very complex algorithms
– Auto-differentiation allows us to use gradients
– New optimizers solve very nasty problems
– The same code can drive CPUs, GPUs, or clusters
– The code is strange and abstract, but not that hard
• Not just for breakfast any more
– Machine learning isn’t the only thing you can do with these systems
– Go for it!
Q&A
@mapr
maprtechnologies
ENGAGE WITH US