Tensor Abuse - how to reuse machine learning frameworks
Uploaded by ted-dunning
Transcript of Tensor Abuse - how to reuse machine learning frameworks
© 2017 MapR Technologies / MapR Confidential
Tensor Abuse
In the Workplace
Agenda
• What are tensors?
• How modern tensor systems work
• It's not just machine learning
• It’s also not magic
• Machine learning as optimization
• Animation of optimization algorithm
Spoilers
• Trigger warning: Greek letters ahead
– AND lots of animations for developing intuitions
• Tensors have a history in physics
• We don’t use them that way in machine learning
• Tensors are cool because they make it easier to code
numerical algorithms
• But auto-derivatives are as big a deal
• And we can abuse this machinery
Linear Operators To the N-th Degree
• Tensors were originally invented for differential geometry
• That gave us relativity
But This Isn’t Physics
• In computing, we use some of the same words
• But they don’t mean the same thing
• We don’t need the Greek letters
• The point is that we have important patterns of computation
– We mostly don’t care about changing coordinate systems
Basic Operations
• Element-wise operations
• Outer products
• Reductions
• Matrix and vector products are special cases
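A minimal sketch of these operations in NumPy (NumPy chosen purely for illustration; TensorFlow and similar frameworks expose the same primitives):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Element-wise operation: applied independently to each entry
elementwise = a * b            # [4., 10., 18.]

# Outer product: every element of a times every element of b
outer = np.outer(a, b)         # 3x3 matrix, outer[i, j] = a[i] * b[j]

# Reduction: collapse an axis with an operation such as sum
total = elementwise.sum()      # 32.0

# A dot product is just an element-wise product plus a reduction
dot = (a * b).sum()            # same as np.dot(a, b)

# A matrix-vector product: broadcast element-wise, then reduce each row
M = np.arange(6.0).reshape(2, 3)
mv = (M * a).sum(axis=1)       # same as M @ a
```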
This is news?!?
Sounds like APL in sheep’s clothing
But it really is important because loop structuring is critical
Why This Matters for Machine Learning
• Sums of products and element-wise evaluations are ubiquitous
in machine learning
• And this is often surrounded by an outer loop
• This is susceptible to pipelining in GPUs, but only if you can see the large-scale patterns in the code
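As a sketch of why exposing the pattern matters: the explicit loops below hide the structure from any vectorizing runtime, while the tensor expression states the whole pattern in one step (NumPy used here purely for illustration):

```python
import numpy as np

X = np.random.default_rng(0).standard_normal((1000, 64))
w = np.random.default_rng(1).standard_normal(64)

# Sums of products written as explicit loops, one row at a time.
# A compiler or GPU runtime cannot easily see the whole pattern here.
slow = np.empty(1000)
for i in range(1000):
    s = 0.0
    for j in range(64):
        s += X[i, j] * w[j]
    slow[i] = s

# The same computation as one tensor operation: the large-scale
# pattern (a matrix-vector product) is visible and can be pipelined.
fast = X @ w

assert np.allclose(slow, fast)
```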
But, tensors are only half of the story
Machine Learning Has Changed
• Machine learning used to be really hard
– Numeric performance was zilch
– Training data was poor and small
– Learning many layers of an MLP was impossible
– Productivity for new approaches hampered by code complexity
• Recent advances have changed things (a lot)
– Important new regularization techniques
– Per-coefficient learning-rate techniques
– New gradient-based optimization algorithms
– Automated differentiation
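As a sketch of the per-coefficient learning-rate idea mentioned above, here is an Adagrad-style update rule (simplified for illustration; production optimizers differ in detail):

```python
import numpy as np

def adagrad_step(w, grad, cache, lr=0.1, eps=1e-8):
    """One Adagrad-style update: each coefficient gets its own
    effective learning rate, shrinking where past gradients were large."""
    cache += grad ** 2
    w -= lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Minimize f(w) = sum(w**2); the gradient is 2*w
w = np.array([3.0, -2.0])
cache = np.zeros_like(w)
for _ in range(200):
    grad = 2 * w
    w, cache = adagrad_step(w, grad, cache)
```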
Gradients are a Big Deal
• In low dimensions, simple search techniques work well
– Line search, evolutionary processes, polyhedron warping all work
• It’s different in high dimensions
– Need some guide about which direction to go
– Too many ways to get sideways
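The guidance a gradient provides can be sketched with plain gradient descent on a high-dimensional quadratic (illustrative only; not a production optimizer):

```python
import numpy as np

def gradient_descent(grad_f, w0, lr=0.1, steps=100):
    """Follow the negative gradient: the one direction guaranteed to
    decrease a smooth function locally, even in very high dimensions."""
    w = w0.copy()
    for _ in range(steps):
        w -= lr * grad_f(w)
    return w

# f(w) = ||w||^2 in 1000 dimensions; its gradient is 2*w
rng = np.random.default_rng(42)
w0 = rng.standard_normal(1000)
w = gradient_descent(lambda w: 2 * w, w0, lr=0.1, steps=100)
# Each step multiplies w by 0.8, so w converges rapidly to the minimum
```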
Yeah, But …
• Gradients are also a pain in the ***
– Traditionally you needed to derive them by hand
– For some problems like robot kinematics or astrodynamics, equations
would cover many pages
• The big change in recent years is automatic differentiation
– Some limits on how problems are formulated
– But pretty complicated forms can be allowed easily
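The core idea of automatic differentiation can be sketched in a few lines with dual numbers (this is forward mode; systems like TensorFlow use reverse mode, but the principle of mechanically propagating derivatives is the same):

```python
class Dual:
    """A number carrying its own derivative: (value, derivative)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule applied automatically at every multiplication
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x while propagating the derivative mechanically."""
    return f(Dual(x, 1.0)).dot

g = derivative(lambda x: x * x + 3 * x, 2.0)  # d/dx(x^2 + 3x) = 2x + 3 = 7.0
```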
hidden = tf.nn.elu(x - biases)
y_pred = tf.matmul(hidden, weights)
loss = tf.reduce_mean((y - y_pred) ** 2)
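Read as plain tensor operations, that fragment is just an element-wise ELU, a matrix product, and a mean-squared-error reduction. A NumPy re-expression (illustrative only, with assumed shapes; not the TensorFlow API) makes that visible:

```python
import numpy as np

def elu(z):
    # Exponential linear unit, applied element-wise
    return np.where(z > 0, z, np.exp(z) - 1)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))        # batch of inputs (shapes assumed)
biases = rng.standard_normal(4)
weights = rng.standard_normal((4, 1))
y = rng.standard_normal((8, 1))

hidden = elu(x - biases)               # element-wise op with broadcasting
y_pred = hidden @ weights              # matrix product
loss = np.mean((y - y_pred) ** 2)      # reduction to a scalar
```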
The Lessons to Learn
• These new systems are astounding
– Tensors allow us to encode very complex algorithms
– Auto-differentiation allows us to use gradients
– New optimizers solve very nasty problems
– The same code can drive CPUs, GPUs, or clusters
– The code is strange and abstract, but not that hard
• Not just for breakfast any more
– Machine learning isn’t the only thing you can do with these systems
– Go for it!
Q&A
@mapr
maprtechnologies
ENGAGE WITH US