Numba - A Dynamic Python Compiler for Science


  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    1/39

    Numba: A dynamic Python compiler for Science (i.e. for NumPy and other typed containers)

    March 16, 2013

    Travis E. Oliphant, Jon Riehl

    Mark Florisson, Siu Kwan Lam

    Saturday, March 16, 13

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    2/39

    Where I'm coming from

    Before / After

    $\rho_0 (2\pi f)^2 U_i(a, f) = [C_{ijkl}(a, f)\, U_{k,l}(a, f)]_{,j}$

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    3/39

    1,000,000 to 2,000,000 users of NumPy!


  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    4/39

    NumFOCUS --- blatant ad!

    www.numfocus.org

    501(c)3 Public Charity

    Join Us! http://numfocus.org/membership/

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    5/39

    Code that users might write

    $x_i = \sum_{j=0}^{i-1} k_{i-j,j}\, a_{i-j}\, a_j$

    $O = I \star F$

    Slow!!!!
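As a rough illustration of the kind of code in question, the sum on this slide might be written as a plain Python double loop. This is only a sketch: it assumes the formula reads x[i] = sum over j from 0 to i-1 of k[i-j][j] * a[i-j] * a[j], and the function name `triangular_sum` and the list-of-lists layout for `k` are illustrative choices, not taken from the slides.

```python
def triangular_sum(k, a):
    """Compute x[i] = sum_{j=0}^{i-1} k[i-j][j] * a[i-j] * a[j] with plain loops.

    `k` is a square list of lists and `a` a list of the same length
    (both layouts are assumptions made for this sketch).
    """
    n = len(a)
    x = [0.0] * n
    for i in range(n):
        total = 0.0
        for j in range(i):  # j = 0 .. i-1
            total += k[i - j][j] * a[i - j] * a[j]
        x[i] = total
    return x


a = [1.0, 2.0, 3.0]
k = [[1.0] * 3 for _ in range(3)]  # all-ones coefficients
print(triangular_sum(k, a))
```

Every iteration of the inner loop goes through dynamic dispatch and item lookups in the interpreter, which is the overhead the following slides discuss.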

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    6/39

    Why is Python slow?

    1. Dynamic typing

    2. Attribute lookups

    3. NumPy get-item (a[...])


  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    7/39

    What are Scientists doing Now?

    - Writing critical parts in C/C++/Fortran and wrapping with
        - SWIG
        - ctypes
        - Cython
        - f2py (or fwrap)
        - hand-coded wrappers
    - Writing new code in Cython directly
        - Cython is modified Python with type information everywhere.
        - It produces a C-extension module which is then compiled.

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    8/39

    Cython is the most popular these days. But speeding up NumPy-based codes should be even easier!

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    9/39

    NumPy Array is typed container

    shape
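To make the idea of a typed container concrete without guessing at the slide's figure, here is a small sketch that uses only the standard library's `array` and `memoryview`. They expose the same kind of metadata a NumPy array carries (element type, item size, shape, strides). The variable names are illustrative, and this is an analogy rather than NumPy's own API.

```python
from array import array

# A typed container: every element is a C double ('d'), stored contiguously.
data = array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
view = memoryview(data)

# The buffer describes itself: element format, bytes per item, shape, strides.
print(view.format, view.itemsize, view.shape, view.strides)

# Reinterpret the same memory as a 2 x 3 grid without copying.
# (memoryview.cast needs a byte view as the intermediate step.)
grid = view.cast("B").cast("d", shape=[2, 3])
print(grid.shape, grid.strides, grid[1, 2])
```

Because the element type and memory layout are known up front, a compiler can turn an index expression into a direct address computation instead of a generic object lookup.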


  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    10/39

    Let's use this!

    NumPy users are already using typed containers with regular storage and access patterns. There is plenty of information to optimize the code if we either:

    - Provide type information for function inputs (jit)
    - Create a call-site for each function that compiles and caches the result the first time it gets called with new types.
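The second option, a call-site that compiles and caches per set of argument types, can be sketched in plain Python. This is not Numba's implementation: `caching_dispatcher` and `fake_compile` are hypothetical names, and the "compiler" here simply records the request and returns the original function.

```python
def caching_dispatcher(compile_for_types):
    """Decorator factory: specialize a function once per tuple of argument types.

    `compile_for_types(func, types)` stands in for a real compiler; here it is
    a hypothetical hook that just returns a callable.
    """
    def decorate(func):
        cache = {}  # maps argument-type tuples to specialized callables

        def call_site(*args):
            types = tuple(type(arg) for arg in args)
            if types not in cache:  # first call with these types: compile
                cache[types] = compile_for_types(func, types)
            return cache[types](*args)

        call_site.cache = cache
        return call_site
    return decorate


compiled = []  # records each (name, types) "compilation" for demonstration


def fake_compile(func, types):
    compiled.append((func.__name__, types))
    return func  # a real compiler would return machine code here


@caching_dispatcher(fake_compile)
def add(x, y):
    return x + y


add(1, 2)      # compiles for (int, int)
add(3, 4)      # cache hit, no new compilation
add(1.0, 2.0)  # compiles for (float, float)
print(compiled)
```

The type check on every call is the small run-time cost that an up-front signature avoids.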

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    11/39

    Requirements Part I

    - Work with CPython (we need the full scientific Python stack!)
    - Minimal modifications to code (use type inference)
    - Programmer control over what and when to jit
    - Ability to build static extensions (for libraries)
    - Fall back to Python C-API for object types.

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    12/39

    Requirements Part II

    - Produce code as fast as C (maybe even Fortran)
    - Support NumPy array-expressions and be able to produce universal functions (e.g. y = sin(x))
    - Provide a tool that could adapt to provide parallelism and produce code for modern vector hardware (GPUs, accelerators, and many-core machines)

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    13/39

    Do we have to write the full compiler??

    No!

    LLVM has done much heavy lifting

    LLVM = Compilers for everybody

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    14/39

    Face of a modern compiler

    [Diagram: front-ends parse C++, C, Fortran, and ObjC into an Intermediate Representation (IR); back-ends generate code from the IR for x86, ARM, and PTX.]

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    15/39

    Face of a modern compiler

    [Diagram: Numba is the front-end that parses Python into the Intermediate Representation (IR); LLVM is the back-end that generates code for x86, ARM, and PTX.]

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    16/39

    Example

    Numba


  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    17/39

    NumPy + Mamba = Numba

    [Diagram: a Python function passes through LLVMPY and the LLVM library to become machine code. Labels around LLVM: Intel, Nvidia, Apple, AMD, ARM; OpenCL, ISPC, CUDA, CLANG, OpenMP.]

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    18/39

    Simple API

    - jit --- provide type information (fastest to call at run-time)
    - autojit --- detects input types, infers output, generates code if needed, and dispatches (a little more run-time call overhead)

        from numba import jit, autojit

        #@jit('void(double[:,:], double, double)')
        @autojit
        def numba_update(u, dx2, dy2):
            nx, ny = u.shape
            for i in xrange(1,nx-1):
                for j in xrange(1, ny-1):
                    u[i,j] = ((u[i+1,j] + u[i-1,j]) * dy2 +
                              (u[i,j+1] + u[i,j-1]) * dx2) / (2*(dx2+dy2))

    Comment out one of jit or autojit (don't use together)

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    19/39

    Example

        import numba
        from numba import f8
        from math import sin, pi

        @numba.jit(f8(f8))
        def sinc(x):
            if x==0.0:
                return 1.0
            else:
                return sin(x*pi)/(pi*x)

    Numba

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    20/39

    - ~150x speed-up
    - Real-time image processing (50 fps Mandelbrot)

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    21/39

    Speeding up Math Expressions

    $x_i = \sum_{j=0}^{i-1} k_{i-j,j}\, a_{i-j}\, a_j$

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    22/39

    Image Processing

        from numba import jit

        @jit('void(f8[:,:],f8[:,:],f8[:,:])')
        def filter(image, filt, output):
            M, N = image.shape
            m, n = filt.shape
            for i in range(m//2, M-m//2):
                for j in range(n//2, N-n//2):
                    result = 0.0
                    for k in range(m):
                        for l in range(n):
                            result += image[i+k-m//2,j+l-n//2]*filt[k, l]
                    output[i,j] = result

    ~1500x speed-up

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    23/39

    Compile NumPy array expressions

        from numba import autojit

        @autojit
        def formula(a, b, c):
            a[1:,1:] = a[1:,1:] + b[1:,:-1] + c[1:,:-1]

        @autojit
        def express(m1, m2):
            m2[1:-1:2,0,...,::2] = (m1[1:-1:2,...,::2] *
                                    m1[-2:1:-2,...,::2])
            return m2

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    24/39

    Fast vectorize

    NumPy's ufuncs take kernels and apply the kernel element-by-element over entire arrays

    Write kernels in Python!

        from numba import f8, f4
        from numba.vectorize import vectorize
        from math import sin, pi

        @vectorize([f8(f8), f4(f4)])
        def sinc(x):
            if x==0.0:
                return 1.0
            else:
                return sin(x*pi)/(pi*x)
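Conceptually, a ufunc wraps a scalar kernel and applies it to every element. The sketch below mimics that in plain Python with a hypothetical `elementwise` decorator. It only illustrates the element-by-element idea: it produces no compiled code and does none of the broadcasting or type dispatch that real ufuncs provide.

```python
from math import sin, pi


def elementwise(kernel):
    """Wrap a scalar kernel so it maps over a sequence, one element at a time."""
    def apply(values):
        return [kernel(v) for v in values]
    return apply


@elementwise
def sinc(x):
    if x == 0.0:
        return 1.0
    return sin(x * pi) / (pi * x)


print(sinc([0.0, 0.5, 1.0]))
```

The kernel itself stays an ordinary scalar function, which is why it can be written and tested in Python before being handed to a compiler.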

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    25/39

    Case-study -- j0 from scipy.special

    - scipy.special was one of the first libraries I wrote
    - extended umath module by adding new universal functions to compute many scientific functions by wrapping C and Fortran libs.
    - Bessel functions are solutions to a differential equation:

    $$x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \alpha^2)\, y = 0$$

    $$y = J_\alpha(x)$$

    $$J_n(x) = \frac{1}{\pi} \int_0^\pi \cos(n\tau - x \sin\tau)\, d\tau$$
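For integer order n, the standard integral representation J_n(x) = (1/pi) * integral from 0 to pi of cos(n*t - x*sin(t)) dt can be evaluated numerically in a few lines. The sketch below uses the composite trapezoidal rule. The function name `bessel_j` and the step count are illustrative assumptions, and this is not the cephes algorithm that `scipy.special.j0` wraps.

```python
from math import cos, sin, pi


def bessel_j(n, x, steps=200):
    """Approximate J_n(x) = (1/pi) * integral_0^pi cos(n*t - x*sin(t)) dt
    for integer n with the composite trapezoidal rule."""
    h = pi / steps
    # Endpoint terms (t = 0 and t = pi) carry half weight.
    total = 0.5 * (cos(0.0) + cos(n * pi - x * sin(pi)))
    for i in range(1, steps):
        t = i * h
        total += cos(n * t - x * sin(t))
    return total * h / pi


print(bessel_j(0, 0.0))                # J_0(0) equals 1 analytically
print(bessel_j(0, 2.404825557695773))  # close to the first zero of J_0
```

The integrand is smooth, even, and periodic for integer n, so the trapezoidal rule converges quickly here. A production routine would still prefer a dedicated algorithm for speed and for large arguments.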

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    26/39

    scipy.special.j0 wraps cephes algorithm


  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    27/39

    Result --- equivalent to compiled code

        In [6]: %timeit vj0(x)
        10000 loops, best of 3: 75 us per loop

        In [7]: from scipy.special import j0

        In [8]: %timeit j0(x)
        10000 loops, best of 3: 75.3 us per loop

    But! Now code is in Python and can be experimented with more easily (and moved to the GPU / accelerator more easily)!

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    28/39

    Laplace Example

        from numba import jit

        @jit('void(double[:,:], double, double)')
        def numba_update(u, dx2, dy2):
            nx, ny = u.shape
            for i in xrange(1,nx-1):
                for j in xrange(1, ny-1):
                    u[i,j] = ((u[i+1,j] + u[i-1,j]) * dy2 +
                              (u[i,j+1] + u[i,j-1]) * dx2) / (2*(dx2+dy2))

        @jit('void(double[:,:], double, double)')
        def numbavec_update(u, dx2, dy2):
            u[1:-1,1:-1] = ((u[2:,1:-1]+u[:-2,1:-1])*dy2 +
                            (u[1:-1,2:] + u[1:-1,:-2])*dx2) / (2*(dx2+dy2))

    Adapted from http://www.scipy.org/PerformancePython, originally by Prabhu Ramachandran
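One detail worth keeping in mind when comparing the two forms: an explicit in-place loop reuses values already updated in the current sweep, while a whole-array slice expression in NumPy evaluates the right-hand side from the old values before assigning. The sketch below reproduces both behaviours on nested lists so it runs without NumPy or Numba. The function names and the 5 x 5 test grid are illustrative assumptions.

```python
import copy


def loop_update(u, dx2, dy2):
    """In-place sweep, mirroring the explicit double loop (updates are reused)."""
    nx, ny = len(u), len(u[0])
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            u[i][j] = ((u[i + 1][j] + u[i - 1][j]) * dy2 +
                       (u[i][j + 1] + u[i][j - 1]) * dx2) / (2 * (dx2 + dy2))


def whole_array_update(u, dx2, dy2):
    """Sweep that reads only old values, as a slice expression does in NumPy."""
    old = copy.deepcopy(u)
    nx, ny = len(u), len(u[0])
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            u[i][j] = ((old[i + 1][j] + old[i - 1][j]) * dy2 +
                       (old[i][j + 1] + old[i][j - 1]) * dx2) / (2 * (dx2 + dy2))


def make_grid(n=5):
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [1.0] * n  # top edge held at 1 as a boundary condition
    return grid


a, b = make_grid(), make_grid()
loop_update(a, 1.0, 1.0)
whole_array_update(b, 1.0, 1.0)
print(a[2][2], b[2][2])  # the centre values differ after one sweep
```

Both schemes head toward the same steady state for this problem, but their intermediate iterates differ, so per-sweep results should not be expected to match exactly.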
  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    29/39

    Results of Laplace example

        Version          Time    Speed Up
        NumPy            3.19    1.0
        Numba            2.32    1.38
        Vect. Numba      2.33    1.37
        Cython           2.38    1.34
        Weave            2.47    1.29
        Numexpr          2.62    1.22
        Fortran Loops    2.30    1.39
        Vect. Fortran    1.50    2.13

    https://github.com/teoliphant/speed.git
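The speed-up column is consistent with dividing the NumPy time by each version's time. A short check, with the numbers copied from the slide (the time unit is not stated there, and it cancels in the ratio anyway):

```python
# (version, time, speed-up as printed on the slide)
rows = [
    ("NumPy", 3.19, 1.0),
    ("Numba", 2.32, 1.38),
    ("Vect. Numba", 2.33, 1.37),
    ("Cython", 2.38, 1.34),
    ("Weave", 2.47, 1.29),
    ("Numexpr", 2.62, 1.22),
    ("Fortran Loops", 2.30, 1.39),
    ("Vect. Fortran", 1.50, 2.13),
]

baseline = rows[0][1]  # NumPy is the reference
for version, time, reported in rows:
    ratio = baseline / time
    print(f"{version:14s} {ratio:.3f} (slide: {reported})")
```

For example, Numba's entry is 3.19 / 2.32, or about 1.375, which the slide reports as 1.38.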
  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    30/39

    Numba can change the game!

    [Diagram: C++, C, Fortran, and Python all compile to LLVM IR, which targets x86, ARM, and PTX.]

    Numba turns Python into a compiled language (but much more flexible). You don't have to reach for C/C++.

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    31/39

    Many More Advanced Features

    - Extension classes (jit a class --- autojit coming soon!)
    - Struct support (NumPy arrays can be structs)
    - SSA --- can refer to local variables as different types
    - Typed lists and typed dictionaries and sets coming soon!
    - pointer support
    - calling ctypes and CFFI functions natively
    - pycc (create stand-alone dynamic library and executable)
    - pycc --python (create static extension module for Python)

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    32/39

    Uses of Numba

    [Diagram: Numba turns a Python function into a function pointer, which is handed to a framework accepting dynamic function pointers. Uses shown: ufuncs, generalized ufuncs, function-based indexing, memory filters, window kernel funcs, I/O filters, reduction filters, computed columns.]

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    33/39

    Accelerate/NumbaPro -- blatant ad!

    Python and NumPy compiled to Parallel Architectures (GPUs and multi-core machines)

    - Create parallel-for loops
    - Parallel execution of ufuncs
    - Run ufuncs on the GPU
    - Write CUDA directly in Python!
    - Free for Academics

    fast development and fast execution!

    Currently premium features will be contributed to open-source over time!

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    34/39

    Numba Development

        1260 Mark Florisson
         203 Jon Riehl
         181 Siu Kwan Lam
         110 Travis E. Oliphant
          30 Dag Sverre Seljebotn
          28 Hernan Grecco
          19 Ilan Schnell
          11 Mark Wiebe
           8 James Bergstra
           4 Alberto Valverde
           3 Thomas Kluyver
           2 Maggie Mari
           2 Dan Yamins
           2 Dan Christensen
           1 timo
           1 Yaroslav Halchenko
           1 Phillip Cloud
           1 Ondřej Čertík
           1 Martin Spacek
           1 Lars Buitinck
           1 Juan Luis Cano Rodríguez

        git log --format=format:%an | sort | uniq -c | sort -r

    [Photos: Siu, Mark, Jon]

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    35/39

    Milestone Roadmap

    - Rapid progress this year
    - Still some bugs -- needs users!
    - Version 0.7 end of Feb.
    - Version 0.8 in April
    - Version 0.9 June
    - Version 1.0 by end of August
    - Stable API (jit, autojit) easy to use
    - Should be able to write equivalent of NumPy and SciPy with Numba and memory-views.

    http://numba.pydata.org
    http://llvmpy.org
    http://compilers.pydata.org

    We need you:

    - your use-cases
    - your tests
    - developer help
  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    36/39

    Architectural Overview

    [Diagram: Python Source -> Python Parser -> Python AST -> Numba Stage 1 ... Numba Stage n -> Numba Code Generator -> LLVM. Also labelled: Numba Environment, Numba AST.]
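The source-to-AST-to-stages-to-code shape of this pipeline can be imitated with the standard library alone. The sketch below is not Numba's pipeline: `FoldConstants` is a made-up toy stage, and Python's built-in `compile` stands in for the code generator and LLVM.

```python
import ast


class FoldConstants(ast.NodeTransformer):
    """A toy 'stage': replace additions of two numeric constants by their sum."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform children first
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, (int, float))
                and isinstance(node.right.value, (int, float))):
            folded = ast.Constant(node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


source = "def f(x):\n    return x * (1 + 2)\n"

tree = ast.parse(source)                  # Python source -> Python AST
tree = FoldConstants().visit(tree)        # one transformation stage
ast.fix_missing_locations(tree)
code = compile(tree, "<sketch>", "exec")  # stand-in for code generation

namespace = {}
exec(code, namespace)
print(namespace["f"](5))
```

Each stage takes a tree and returns a tree, so stages can be added, removed, or reordered independently.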

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    37/39

    Numba Architecture

    - Entry points: /numba/decorators.py
    - Environment: /numba/environment.py
    - Pipeline: /numba/pipeline.py
    - Code generation: /numba/codegen/...

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    38/39

    Development Roadmap

    - Better stage separation, better modularity
    - Untyped Intermediate Representation (IR)
    - Typed IR
    - Specialized IR
    - Module level entry points
    - Better Array Specialization

  • 8/13/2019 Numba - A Dynamic Python Compiler for Science

    39/39

    Community Involvement

    - ~/git/numba$ wc AUTHORS
      25 88 1470 AUTHORS
      (4 lines are blank or instructions)
    - Github https://github.com/numba/numba
    - Mailing list --- [email protected]
    - Sprints --- contact Jon Riehl
    - Examples:
        - Hernan Grecco just contributed Python 3 support (Yeah!)
        - Dag collaborating on autojit classes with Mark F.
        - We need you to show off your amazing demo!