Sweating the Small Stuff
Optimization and tooling
FCNY July 2010
Machinarium
Here's what this talk is about
AS3 is only a tool
Apparat and TDSI
Wireworld code remap example
Adding your tools to a build process
Q & A
Kumbaya
AS3 ≠ Flash
Many languages can script SWFs
• AS3, MXML, AS2, HaXe, C, C++, Lua …
Different reasons to use each one
• AS3: common
• Alchemized C: runs fast, but hard to write
• HaXe: runs fast, targets practically everything (Tramp.)
AS3 ≠ Flash
Did you know?
• Alchemy C code can run at 30% the speed of native C code
• AS3? 3%
Use the tools at your disposal
Pick the right tools for the right job
• You can use more than one at a time, you know
Apparat
Behind Audiotool
• Joa Ebert
TAAS – bytecode analyzer, optimizer
Stripper – removes SWF debug data
Reducer – compresses PNGs in SWFs
Concrete – lets you implement abstract classes
Coverage – unit testing thingy
TDSI
TDSI
It's a car modding term.
• We're not supposed to get it.
Finds slow "dummy code" in your SWF that you deliberately placed in your AS3
Replaces it with Alchemy opcodes
Fast Math replacement
Memory system
Inlining and macros
• ?
So here’s what we’re going to do
We’ll start with a slow AS3 app and improve it in stages.
Generally applicable strategies for optimization
Wireworld in a nutshell
Particle system… but stuck in a grid
Information leaks from pixel to pixel
Instead of particle.move(); particle.drop();
It’s pixel.countNeighbors(); pixel.changeColor();
Supports circuit-like systems
WW computer by Owen and Moore (Quinapalus.com)
wireworldAS3 (Google Projects)
I get a lot of attention from Germans…?
Wireworld in a nutshell
DISCLAIMER
wireworldAS3 is needlessly complicated.
Show and tell
This complexity isn’t required from your own projects.
The Apparat project contains some example code that may be easier to follow at home
Naïve implementation
for ( every row ) {
    for ( every column ) {
        there is a pixel.
        for ( every neighbor of the pixel ) {
            do something.
        }
        update the pixel's state.
    }
}
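The loop above can be sketched as runnable code. This is a minimal illustration in Python rather than AS3, and it is not taken from wireworldAS3. It assumes the standard Wireworld rule (head becomes tail, tail becomes wire, wire becomes head when exactly one or two of its eight neighbors are heads); the character encoding and the name `naive_step` are made up for the example.

```python
# Illustrative encoding of the four Wireworld states (an assumption).
EMPTY, HEAD, TAIL, WIRE = " ", "H", "t", "."

def naive_step(grid):
    """Return the next generation, visiting every cell in the grid."""
    rows, cols = len(grid), len(grid[0])
    nxt = []
    for r in range(rows):                      # for ( every row )
        row = []
        for c in range(cols):                  # for ( every column )
            cell = grid[r][c]
            heads = 0
            for dr in (-1, 0, 1):              # for ( every neighbor )
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                        heads += grid[rr][cc] == HEAD
            # update the pixel's state
            if cell == HEAD:
                row.append(TAIL)
            elif cell == TAIL:
                row.append(WIRE)
            elif cell == WIRE and heads in (1, 2):
                row.append(HEAD)
            else:
                row.append(cell)
        nxt.append("".join(row))
    return nxt

# An electron travelling right along a six-cell wire.
print(naive_step(["tH...."]))   # ['.tH...']
```

Every cell gets its eight neighbors inspected each generation, even empty cells that can never change.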
Naïve implementation
Result: sucks
Slow, slow, slow, slow.
Touching every pixel seems dumb
[Chart: Performance. Bars: Naïve]
New idea
List the pixels (or nodes) that might change their neighbors
for ( each node in the list ) {
    for ( each neighbor of the node ) {
        if ( the neighbor might change its neighbors next time ) {
            add it to a new list
        }
    }
    update the node
}
Then swap the old list with the new list.
Way less work
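One way the active-list idea might look in code, again as a Python sketch and not the wireworldAS3 implementation. It assumes the standard Wireworld rule and tracks only the cells that can change: current heads, current tails, and wire cells next to a head. The state codes, the dict-based grid and the name `active_step` are all assumptions for illustration.

```python
EMPTY, HEAD, TAIL, WIRE = 0, 1, 2, 3   # illustrative state codes

NEIGHBORS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if dr or dc]

def active_step(state, heads, tails):
    """Advance one generation, touching only cells that can change.

    state: dict mapping (row, col) -> state code for non-empty cells
    heads, tails: lists of coordinates currently holding HEAD / TAIL
    Returns the new (heads, tails) lists; state is updated in place.
    """
    # Count head neighbors, but only for wire cells next to a head.
    head_count = {}
    for (r, c) in heads:
        for dr, dc in NEIGHBORS:
            n = (r + dr, c + dc)
            if state.get(n) == WIRE:
                head_count[n] = head_count.get(n, 0) + 1

    # Wire cells with one or two head neighbors go on the new list.
    new_heads = [n for n, k in head_count.items() if k in (1, 2)]

    for cell in tails:          # tails cool down to wire
        state[cell] = WIRE
    for cell in heads:          # heads become tails
        state[cell] = TAIL
    for cell in new_heads:      # excited wire becomes heads
        state[cell] = HEAD

    return new_heads, heads     # swap: the old heads are the new tails

# Six-cell wire with an electron at the left end.
state = {(0, c): WIRE for c in range(6)}
state[(0, 0)], state[(0, 1)] = TAIL, HEAD
heads, tails = active_step(state, [(0, 1)], [(0, 0)])
print(heads, tails)   # [(0, 2)] [(0, 1)]
```

The work per generation now scales with the number of electrons, not with the size of the grid.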
Linked List instead of Array
Arrays
untyped (slow)
ordered (unnecessary)
weird push() and pop() are expensive and lame
Linked List instead of Array
Linked lists are easy: every node points to the next node in the list
start → node → node → node → node → null
Chop it and you get two LLs
Connect their ends and they’re one again
No pushing or popping
No class
Result: better
[Chart: Performance. Bars: Naïve, Linked Lists]
Let’s take a breather
Grab that second beer.
Optimization is never an end unto itself
• It’s so easy to forget that
• Performance matters in four or five situations:
1 Addressing bad user experience, freezing process
2 Facing stiff competition
3 Porting code to mobile devices
4 Plotting world domination
Otherwise, don’t we all have enough on our plates already?
Are there any other bright ideas?
How about filters? Pixel Bender?
Wireworld rule is basically a weird BitmapFilter
These work, but they're slow
Remapping to these is scary
Some tasks can be PBJ'd, but not all of them
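To see why the rule resembles a filter, it can be split into two image passes: a 3x3 convolution that counts neighboring heads, then a per-pixel lookup. The sketch below is plain Python, not BitmapFilter or Pixel Bender code, and assumes the standard Wireworld rule; the encoding and function names are invented for the example.

```python
HEAD, TAIL, WIRE, EMPTY = "H", "t", ".", " "   # illustrative encoding

KERNEL = [[1, 1, 1],
          [1, 0, 1],
          [1, 1, 1]]   # weights the 8 neighbors, ignores the center

def count_heads(grid):
    """Pass 1: convolve a heads-only mask with KERNEL."""
    rows, cols = len(grid), len(grid[0])
    counts = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for kr in range(3):
                for kc in range(3):
                    rr, cc = r + kr - 1, c + kc - 1
                    if 0 <= rr < rows and 0 <= cc < cols:
                        counts[r][c] += KERNEL[kr][kc] * (grid[rr][cc] == HEAD)
    return counts

def apply_rule(cell, heads):
    """Pass 2: per-pixel lookup, the part a plain convolution can't do."""
    if cell == HEAD:
        return TAIL
    if cell == TAIL:
        return WIRE
    if cell == WIRE and heads in (1, 2):
        return HEAD
    return cell

def filter_step(grid):
    """Run both passes over the whole image."""
    counts = count_heads(grid)
    return ["".join(apply_rule(cell, counts[r][c])
                    for c, cell in enumerate(row))
            for r, row in enumerate(grid)]

print(filter_step(["tH...."]))   # ['.tH...']
```

The second pass depends on the pixel's own state as well as the count, which is what makes the filter "weird".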
Result: disappointing
[Chart: Performance. Bars: Naïve, Linked Lists, Convolution, Pixel Bender]
Property Vectors instead of objects
Make a Vector for each node property
xVec, yVec, stateVec
node.next.x becomes xVec[ nextVec[ i ] ]
Replaces the node class with ints, Booleans and other Vectors
The Vectors don't grow or shrink, but the data changes its value
Nodes now point to each other with an index
The linked lists are still in there
Property Vectors instead of objects
Wait! We can't use null.
You have to make a custom null. Call it something else.
Nada. Diddly. Buggerall. Squat. Bupkis.
It’s just an int, like -1, that we use to signify nothingness.
Doesn’t work well with dynamic properties
There are ways around this
Maybe you shouldn’t be using dynamic properties
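A rough picture of property vectors with a custom null, using Python's typed `array` as a stand-in for AS3's `Vector.<int>`. The sample values are invented; the vector names mirror the slide's xVec, yVec, stateVec and nextVec.

```python
from array import array

NADA = -1   # custom null: an index that points nowhere

# One fixed-length typed array per node property.
x_vec = array("i", [10, 11, 12, 13])
y_vec = array("i", [5, 5, 6, 6])
state_vec = array("b", [1, 2, 3, 3])
next_vec = array("i", [2, NADA, 3, 1])   # list order: 0 -> 2 -> 3 -> 1

def walk(start):
    """Follow next_vec from start until NADA, yielding node indices."""
    i = start
    while i != NADA:
        yield i
        i = next_vec[i]

print(list(walk(0)))         # [0, 2, 3, 1]
print(x_vec[next_vec[0]])    # node.next.x for node 0 -> 12
```

The linked list survives as `next_vec`; relinking a node is an int assignment such as `next_vec[3] = NADA`.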
Result: on par with linked lists
Main advantage: all the data is primitive.
[Chart: Performance. Bars: Naïve, Linked Lists, Property Vectors]
ByteArray time
We can pack our primitive data into a BA
Write out the values of each property for each cell, same order every time.
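A sketch of what "same order every time" can mean in practice: a fixed-size record per cell, so cell `i` always starts at byte `i * record_size`. This uses Python's `struct` and `bytearray` in place of an AS3 ByteArray, and the field widths are a hypothetical layout chosen for the example.

```python
import struct

# Hypothetical record: x and y as 16-bit ints, state as one byte,
# next as a 32-bit index. "<" means little-endian with no padding.
CELL = struct.Struct("<hhbi")

def make_buffer(cell_count):
    """Allocate one zeroed block big enough for every cell."""
    return bytearray(CELL.size * cell_count)

def write_cell(buf, i, x, y, state, nxt):
    """Write all properties of cell i, always in the same order."""
    CELL.pack_into(buf, i * CELL.size, x, y, state, nxt)

def read_cell(buf, i):
    """Read cell i back as an (x, y, state, next) tuple."""
    return CELL.unpack_from(buf, i * CELL.size)

buf = make_buffer(3)
write_cell(buf, 0, 10, 5, 1, 2)
write_cell(buf, 2, 12, 6, 3, -1)
print(CELL.size)          # 9
print(read_cell(buf, 2))  # (12, 6, 3, -1)
```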
Result: SUCKS!
[Chart: Performance. Bars: Naïve, Linked Lists, Property Vectors, ByteArray]
Wait. What??
BAs + [your project] = SUPAH FAST!!!1
Dopes. BAs aren’t a cure-all.
ByteArray.position
• Like a needle on a record player or hard disk
• It’s fast, as long as you don’t lift that needle
BAs in AS3 will perform well for you in many cases, just not all cases
TDSI manipulates BAs with Alchemy opcodes, not the BA methods
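The needle analogy can be illustrated by contrasting two access styles. This is a Python analogy only: `io.BytesIO` stands in for a cursor-based ByteArray and `struct.unpack_from` for address-based reads. It is not the AS3 or TDSI API, and it says nothing about actual speeds in Flash.

```python
import io
import struct

data = struct.pack("<4i", 100, 200, 300, 400)   # four 32-bit ints

# Stream style: one shared cursor, moved before every random read.
stream = io.BytesIO(data)

def read_int_stream(index):
    stream.seek(index * 4)                  # "lifting the needle"
    return struct.unpack("<i", stream.read(4))[0]

# Addressed style: the offset travels with the call, no cursor to move.
def read_int_addressed(index):
    return struct.unpack_from("<i", data, index * 4)[0]

print(read_int_stream(2), read_int_addressed(2))   # 300 300
```

A Wireworld step hops between neighbors all over the buffer, so the cursor style pays for a reposition on nearly every read.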
Finally, the TDSI step
Use TDSI’s Memory API
Slows your program way down at first (Don’t freak out)
Run the SWF through TDSI
Result: Excellent
[Chart: Performance. Bars: Naïve, Linked Lists, Property Vectors, ByteArray, TDSI ByteArray]
Bonus: green threads
Cut a big loop into a repeatable task
Perform the task in response to a timed event
Stop the task when the loop test fails
Overdrive
Packing the flash event loop
Framerate threshold
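A minimal sketch of the green-thread pattern, with a Python generator standing in for the repeatable task and a plain `while` loop standing in for Flash's timer or enter-frame event. The names `count_primes` and `run_slice`, and the 5 ms budget, are assumptions for illustration.

```python
import time

def count_primes(limit):
    """A big loop cut into steps: yields after each unit of work."""
    found = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            found += 1
        yield found            # hand control back to the scheduler

def run_slice(task, budget_s, clock=time.perf_counter):
    """Run task until the time budget is spent or the loop ends.

    Returns (finished, last_value).
    """
    deadline = clock() + budget_s
    last = None
    while True:
        try:
            last = next(task)
        except StopIteration:
            return True, last  # the loop test failed: stop the task
        if clock() >= deadline:
            return False, last # budget spent: yield until the next tick

task = count_primes(5000)
finished, ticks = False, 0
while not finished:            # each pass plays the role of one timed event
    finished, result = run_slice(task, budget_s=0.005)
    ticks += 1
print(result, "primes found over", ticks, "tick(s)")
```

The time budget plays the part of the framerate threshold: raise it to pack more work into each tick, lower it to keep the display responsive.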
Don’t forget to solve the problem
Was all that really worth it?
Wireworld won’t impact most people.
Other systems can seriously benefit
• Audio players/synths
• Emulators
• Graphics engines
• Solvers
• Tough stough
Was all that really worth it?
Consider:
• This is probably the most efficient SWF we've ever compiled during a talk at FCNY
Hard task
Runs fast
Targeting mobile devices
• Your app will stand a better chance
• (Apparently WW already runs nice on Froyo)
Fitting tools like TDSI into your workflow
Bash script / Bat file
• Write it just once
• Tack it on the end of your FB builders list
• Register it as an external build tool
Most IDEs offer some way of doing this
• Double click it
Ant build for Eclipse fans
AGAIN: implementation before optimization
BAT / BASH
That’s it.
Please direct all your questions to Hudson.
Thank you.