NumPy References The Numpy Example List Tables of contents ordered.

Post on 21-Jan-2016


NumPy References

The NumPy Example List

http://www.scipy.org/Numpy_Example_List

Tables of contents ordered by alphabetical order of functions/methods illustrated

The NumPy Tentative Tutorial http://www.scipy.org/Tentative_NumPy_Tutorial

The most extensive tutorial but not finished

These notes are loosely based on this

The Guide to NumPy book http://www.tramy.us/numpybook.pdf

The first 211 pages are largely a reference manual, but with considerable discussion and occasional examples

This is the definitive source

Last 167 pages cover the C API for NumPy

SciPy course online http://www.rexx.com/~dkuhlman/scipy_course_01.html

Also covers NumPy

Mostly lists of functions under various headings

NumPy for Matlab Users http://www.scipy.org/NumPy_for_Matlab_Users

Succinct (16 pp.) but useful tables (for Matlab users)

A matlab, R, IDL, NumPy/SciPy dictionary http://mathesaurus.sourceforge.net/

Links to tables of comparable functions for NumPy for users of other software (including Matlab)

NumPy Basics

More important ndarray attributes

ndarray.ndim Number of axes of the array (i.e., its rank in Pythonese)

ndarray.shape dimensions of the array Tuple of integers indicating the array’s size in each dimension.

ndarray.size Total number of elements of the array Equal to the product of the elements of shape

ndarray.dtype An object describing the type of the elements in the array. Can create or specify dtypes using standard Python types. NumPy provides others, e.g.: bool_, character, int_, int8, int16, int32, int64, float_, float16, float32, float64, complex_, complex64, object_

ndarray.itemsize Size in bytes of each array element

E.g., an array of elements of type float64 has itemsize 8 (=64/8)

Equivalent to ndarray.dtype.itemsize

ndarray.data Buffer containing the actual elements of the array

Normally, don’t need this—access elements with indexing

>>> from numpy import *

>>> a = arange(10).reshape(2,5)

>>> a.shape

(2, 5)

>>> a.ndim

2

>>> a.size

10

>>> a.dtype

dtype('int32')

>>> a.itemsize

4

>>> a.data

<read-write buffer for 0x00FA1D88, size 40, offset 0 at 0x00FCE2E0>

More on Array Creation

Have already seen the following array-constructing functions

arange(stop )

arange(start, stop )

arange(start, stop, step )

Like Python range() except it returns an ndarray

Defaults: start = 0, step = 1

linspace (start, stop, num )

Returns num evenly spaced samples from start to stop

zeros(shape )

Returns an array of 0’s dimensioned by tuple shape—e.g., zeros((2,3))

ones(shape )

Returns an array of 1’s dimensioned by tuple shape

array() transforms

sequences of sequences into bidimensional arrays

sequences of sequences of sequences into tridimensional arrays

Etc.

Type of resulting array deduced from type of the sequence elements

>>> b = array([((1, 2), (3, 4)), ((5, 6), (7, 8))])

>>> b.shape

(2, 2, 2)

>>> b.dtype

dtype('int32')

With zeros(), ones(), linspace(), default type is float64 but can be explicitly specified at creation time:

>>> array( [ [1,2], [3,4] ], dtype=complex )

array([[ 1.+0.j, 2.+0.j],

[ 3.+0.j, 4.+0.j]])

>>> ones( (2,2), dtype=int16 )

array([[1, 1],

[1, 1]], dtype=int16)

empty() creates an array without filling it in

Initial content is arbitrary (whatever happens to be in that memory), so don't rely on it

>>> empty( (2) )

array([ 2.21650110e-301, 1.30472769e-305])

fromfunction() constructs an n-dimensional array given an n-ary function and upper bounds of n lists of arguments

fromfunction(func, arg_lists, **kwargs )

Where func is an n-ary function, arg_lists is a tuple of n integers, (a0, a1, …, an-1)

It returns the n-dimensional array computed as follows

func is applied to the elements of

A = range(a0) × range(a1) × … × range(an-1)

in lex order to give a 1D array Ar

Here × is the Cartesian product

The final result is Ar.reshape(arg_lists)

E.g.,

>>> def sum_inds(i, j):

... return i + j

...

>>> print fromfunction(sum_inds, (2,3))

[[ 0. 1. 2.]

[ 1. 2. 3.]]

Here A = [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2)]

Ar = array([0., 1., 2., 1., 2., 3.])

If we use the function just once, use a lambda form

>>> print fromfunction(lambda i,j: i+j, (2,3))

[[ 0. 1. 2.]

[ 1. 2. 3.]]

Besides a tuple, arg_lists can be an array

>>> ar = array([2,3])

>>> print fromfunction(lambda i,j: i+j, ar)

[[ 0. 1. 2.]

[ 1. 2. 3.]]

Keyword arguments to func may also be passed in as keywords to fromfunction
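The description of fromfunction() above can be checked with a small sketch. It assumes Python 3 and the conventional `import numpy as np`; the `offset` keyword is a hypothetical parameter added only to illustrate keyword forwarding. Note that fromfunction() actually calls func once on whole index arrays rather than once per element; for an element-wise function like this one the result is the same.

```python
import itertools
import numpy as np

def sum_inds(i, j, offset=0):
    # 'offset' is a hypothetical keyword, used only to show kwargs forwarding
    return i + j + offset

shape = (2, 3)

# Cartesian product range(2) x range(3), in lexicographic order
pairs = list(itertools.product(*(range(n) for n in shape)))

# Apply the function to every index pair, then reshape the 1D result
flat = np.array([sum_inds(i, j) for i, j in pairs], dtype=float)
by_hand = flat.reshape(shape)

# The same array built directly by NumPy
direct = np.fromfunction(sum_inds, shape)

# Extra keyword arguments are passed through to the function
shifted = np.fromfunction(sum_inds, shape, offset=10)
```

Both `by_hand` and `direct` hold the 2x3 array shown in the session above; `shifted` is the same array with 10 added to every element.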

Sometimes convenient to construct an array by applying array() to a list comprehension

>>> array([x*x for x in range(11) if x % 2 == 0])

array([ 0, 4, 16, 36, 64, 100])

More on Array Basic Operations

Recall: arithmetic operators on arrays apply elementwise

A new array is created and filled with the result.

>>> a = arange(4)

>>> b = arange(1, 5)

>>> -a + 2*b

array([2, 3, 4, 5])

The product operator * operates elementwise on arrays

The matrix product can be done with the dot function (extended)

>>> dot([[1,2],[3,4]], [[1,2],[2,1]])

array([[ 5, 4],

[11, 10]])

Or create matrix objects (see below)

Can do some operations in place (no new array created)

>>> a = ones( 2, dtype = int )

>>> b = random.random( 2 )

>>> a += 3

>>> a

array([4, 4])

>>> b += a

>>> b

array([ 4.22907662, 4.84909757])

>>> a += b # b is converted to integer type

>>> a

array([8, 8])
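The silent float-to-integer conversion in the last step reflects the NumPy release these notes were written against. As a hedged sketch, assuming a recent NumPy under Python 3 (where in-place ufuncs refuse casts that lose information by default), the same effect has to be requested explicitly. The values of `b` are made up for the illustration.

```python
import numpy as np

a = np.ones(2, dtype=int)
b = np.array([4.2, 4.8])   # illustrative values, not from the notes

# Recent NumPy raises instead of silently truncating floats into an int array
try:
    a += b
    raised = False
except TypeError:
    raised = True

# Ask for the truncating cast explicitly
np.add(a, b, out=a, casting="unsafe")
```

After the explicit call, `a` holds the truncated sums. An alternative with the same result is `a += b.astype(int)`.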

With arrays of different types, the type of the resulting array is the more general or precise operand type

>>> a = ones(3, dtype=int32)

>>> b = linspace(0,pi,3)

>>> b.dtype.name

'float64'

>>> c = a + b

>>> c

array([ 1. , 2.57079633, 4.14159265])

>>> c.dtype.name

'float64'

This is called upcasting

See the table in the NumPy Tentative Tutorial

Many unary operations (e.g., sum of elements) are implemented as ndarray methods

By default, apply to the array as a list of numbers, regardless of shape

But can apply an operation along the specified axis of an array—parameter axis

>>> b = arange(9).reshape(3,3)

>>> b

array([[0, 1, 2],

[3, 4, 5],

[6, 7, 8]])

>>> b.sum(axis=0) # sum of each column

array([ 9, 12, 15])

>>> b.min(axis=1) # min of each row

array([0, 3, 6])

>>> b.cumsum(axis=1) # cumulative sum along the rows

array([[ 0, 1, 3],

[ 3, 7, 12],

[ 6, 13, 21]])

Can use Python’s functional programming tools for lists on arrays

Convert list result with array()

>>> arr = array([1,14, 6, 9, 4, 7, 12])

>>> array(filter(lambda x: 5 <= x <= 10, arr))

array([6, 9, 7])

>>> a = array([2**i for i in range(1,5)])

>>> a

array([ 2, 4, 8, 16])

>>> b = arange(1,5)

>>> b

array([1, 2, 3, 4])

>>> array(map(lambda x,y: x-y, a, b))

array([ 1, 2, 5, 12])

>>> reduce(lambda x,y: x+y, b)

10
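The session above relies on Python 2, where filter() and map() return lists. A minimal sketch for Python 3 (an assumption; there they return iterators, and reduce() lives in functools) is given here next to the vectorized NumPy equivalents, which are usually shorter and faster.

```python
from functools import reduce
import numpy as np

arr = np.array([1, 14, 6, 9, 4, 7, 12])

# Python 3: materialize the iterator before handing it to array()
via_filter = np.array(list(filter(lambda x: 5 <= x <= 10, arr)))
# Vectorized equivalent: a Boolean mask
via_mask = arr[(arr >= 5) & (arr <= 10)]

a = np.array([2**i for i in range(1, 5)])
b = np.arange(1, 5)

via_map = np.array(list(map(lambda x, y: x - y, a, b)))
# Vectorized equivalent: plain elementwise arithmetic
via_ufunc = a - b

total = reduce(lambda x, y: x + y, b)
# Vectorized equivalent: the sum() method
total_np = b.sum()
```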

More on Indexing, Slicing, Iterating

1D arrays can be indexed, sliced and iterated over like Python lists

Multidimensional arrays can be indexed, sliced and iterated over with 1 index per axis

Indices are given in a tuple separated by commas

>>> b = arange(9).reshape(3,3)

>>> b

array([[0, 1, 2],

[3, 4, 5],

[6, 7, 8]])

>>> b[1,2]

5

>>> b[:,1] # the 2nd column of b

array([1, 4, 7])

>>> b[0:2,:] # the 1st and 2nd rows of b

array([[0, 1, 2],

[3, 4, 5]])

When fewer indices are provided than the number of axes, missing indices are complete slices

>>> b[-1] # the last row, equivalent to b[-1,:]

array([6, 7, 8])

b[i] can be read as b[i, as many ':,' as needed]

Can also be written using dots as b[i,...]

... means as many ':,' as needed for a complete indexing tuple

E.g., if x is a rank-5 array, then

x[1,2,...] is same as x[1,2,:,:,:]

x[...,3] is same as x[:,:,:,:,3]

x[4,...,5,:] is same as x[4,:,:,5,:]

>>> c = arange(12).reshape(2,2,3)

>>> c

array([[[ 0, 1, 2],

[ 3, 4, 5]],

[[ 6, 7, 8],

[ 9, 10, 11]]])

>>> c[1,...] # same as c[1,:,:] or c[1]

array([[ 6, 7, 8],

[ 9, 10, 11]])

>>> c[...,2] # same as c[:,:,2]

array([[ 2, 5],

[ 8, 11]])
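The three rank-5 equivalences listed earlier can be verified directly. This is a sketch assuming `import numpy as np`; the shape (5, 3, 2, 6, 4) is an arbitrary choice that is just large enough for the indices used.

```python
import numpy as np

# Arbitrary rank-5 array, big enough for the indices 4, 2, 5 and 3 below
x = np.arange(5 * 3 * 2 * 6 * 4).reshape(5, 3, 2, 6, 4)

same_front = np.array_equal(x[1, 2, ...], x[1, 2, :, :, :])
same_back = np.array_equal(x[..., 3], x[:, :, :, :, 3])
same_middle = np.array_equal(x[4, ..., 5, :], x[4, :, :, 5, :])

# The ellipsis expands to however many ':' are missing
middle_shape = x[4, ..., 5, :].shape
```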

Iterating is with respect to the 1st axis

>>> for row in b:

... print row

...

[0 1 2]

[3 4 5]

[6 7 8]

To perform something for each array element, one can use the flat attribute

An iterator over all the array elements

>>> for element in b.flat:

... print element,

...

0 1 2 3 4 5 6 7 8

Indexing with Arrays of Indices

NumPy offers more indexing facilities than regular Python sequences

Arrays can be indexed with

integers and slices (as we’ve seen) and

arrays of integers (here) and arrays of booleans (next)

Can index with an array of indices of elements to include in a result array

>>> a = arange(6) * 2

>>> a

array([ 0, 2, 4, 6, 8, 10])

>>> i = array([0,2,3,2])

>>> a[i]

array([0, 4, 6, 4])

Can use a 2D (or higher-dimension) array of indices to index into a 1D array

Result has the shape of the index array

>>> j = array([[2,4],[1,3]])

>>> a[j]

array([[4, 8],

[2, 6]])

When the indexed array is multidimensional, elements in an index array refer to 1st dimension

>>> b = arange(9).reshape(3,3)

>>> b

array([[0, 1, 2],

[3, 4, 5],

[6, 7, 8]])

>>> k = array([0,2])

>>> b[k]

array([[0, 1, 2],

[6, 7, 8]])

Can give (separate) arrays of indexes for more than 1 dimension

The arrays of indices for each dimension must have the same shape

>>> a = arange(12).reshape(3,4)

>>> a

array([[ 0, 1, 2, 3],

[ 4, 5, 6, 7],

[ 8, 9, 10, 11]])

>>> i, j = array([0,2]), array([1,3])

>>> a[i,j]

array([ 1, 11])

>>> a[i,2]

array([ 2, 10])

>>> a[:,j] # select each row; columns as j

array([[ 1, 3],

[ 5, 7],

[ 9, 11]])

>>> a[:2,2]

array([2, 6])

Can put i and j in a sequence (e.g., a list) then index with the list

>>> l = [i,j]

>>> a[l]

array([ 1, 11])

Can’t do this by putting i and j into an array

This array is interpreted as indexing the first dimension of a

>>> s = array(l)

>>> a[s]

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

IndexError: index (3) out of range (0<=index<2) in dimension 0

But can convert the array to a sequence (e.g., a tuple) for indexing

>>> a[tuple(s)]

array([ 1, 11])
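A hedged sketch of the same distinction, assuming a recent NumPy under Python 3. There, a plain list of index arrays is no longer a dependable way to index several dimensions at once, so the tuple form is the portable spelling.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
i, j = np.array([0, 2]), np.array([1, 3])

# One index array per dimension: picks a[0,1] and a[2,3]
pairwise = a[i, j]

# Stacking i and j into a single array indexes only the first dimension
s = np.array([i, j])
try:
    a[s]
    out_of_range = False
except IndexError:
    out_of_range = True   # s contains 3, but a has only 3 rows (0..2)

# Converting to a tuple restores the one-array-per-dimension meaning
via_tuple = a[tuple(s)]
```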

Can use indexing with arrays as a target to assign to

>>> a = arange(6) ** 2

>>> a

array([ 0, 1, 4, 9, 16, 25])

>>> a[[1,3,5]] = 0

>>> a

array([ 0, 0, 4, 0, 16, 0])

>>> a[[1,2,5]] = [3,5,7]

>>> a

array([ 0, 3, 5, 0, 16, 7])

Indexing with Boolean Arrays

When indexing with a Boolean array, we specify for each element whether to include it

>>> a = arange(6).reshape(2,3)

>>> b = a % 2 == 1

>>> b

array([[False, True, False],

[ True, False, True]], dtype=bool)

>>> a[b]

array([1, 3, 5])

Useful for assignment

>>> a[b] = -1

>>> a

array([[ 0, -1, 2],

[-1, 4, -1]])

For each dimension of the array, can give a 1D Boolean array selecting the slices we want

Dimension of result is the same as the dimension of indexed array

>>> a = arange(12).reshape(3,4)

>>> a

array([[ 0, 1, 2, 3],

[ 4, 5, 6, 7],

[ 8, 9, 10, 11]])

>>> b1 = array([False,True,True])

>>> b2 = array([True,False,True,False])

>>> a[b1] # select rows

array([[ 4, 5, 6, 7],

[ 8, 9, 10, 11]])

>>> a[:,b2] # select columns

array([[ 0, 2],

[ 4, 6],

[ 8, 10]])

>>> a[b1,b2] # Select 1st Trues and 2nd Trues (weird)

array([ 4, 10])

>>> a[b1][:,b2] # Select rows then columns

array([[ 4, 6],

[ 8, 10]])
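The "rows then columns" selection can also be written in one step with np.ix_(), which builds an open mesh from one index sequence per dimension. This is an alternative technique not used in the notes; the sketch assumes `import numpy as np`.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])          # rows 1 and 2
b2 = np.array([True, False, True, False])   # columns 0 and 2

# Pairs up the True positions: (1,0) and (2,2)
paired = a[b1, b2]

# Two-step selection, as in the session above
chained = a[b1][:, b2]

# One-step selection of the full row/column block
block = a[np.ix_(b1, b2)]
```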

See the NumPy Tutorial, section 4.3 (Fancy indexing and index tricks), for more on indexing

Printing Arrays

When you print an array, NumPy displays it like nested lists, but with the following layout

the last axis is printed from left to right,

the last but 1, from top to bottom,

the rest, from top to bottom, slice separated by empty lines

So 1D arrays are printed as rows, 2D as matrices, 3D as lists of matrices

>>> print arange(24).reshape(2,3,4)

[[[ 0 1 2 3]

[ 4 5 6 7]

[ 8 9 10 11]]

[[12 13 14 15]

[16 17 18 19]

[20 21 22 23]]]

If an array is too large to be printed, NumPy skips the central part of the array

>>> print arange(10000).reshape(100, 100)

[[ 0 1 2 ..., 97 98 99]

[ 100 101 102 ..., 197 198 199]

[ 200 201 202 ..., 297 298 299]

...,

[9700 9701 9702 ..., 9797 9798 9799]

[9800 9801 9802 ..., 9897 9898 9899]

[9900 9901 9902 ..., 9997 9998 9999]]

To force NumPy to print the entire array, change the printing options using set_printoptions

>>> set_printoptions(threshold=nan)
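Passing nan as the threshold is tied to the NumPy release used in these notes. As a sketch for a recent NumPy under Python 3 (an assumption), sys.maxsize serves as the "never summarize" value, and the np.printoptions context manager keeps the change local.

```python
import sys
import numpy as np

big = np.arange(10000).reshape(100, 100)

# With the default threshold (1000 elements) the middle is elided
summarized = str(big)

# Temporarily raise the threshold so every element is printed
with np.printoptions(threshold=sys.maxsize):
    full = str(big)
```

Outside the `with` block the default summarizing behaviour is back in force; `np.set_printoptions(threshold=sys.maxsize)` makes the change permanent for the session.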

String-Array Conversion

fromstring(str, dtype=type, sep=separator )

str is a string that can be interpreted as sequence of elements of type type separated by separator

Returns an array of these elements converted to type

>>> fromstring("45.5 76.8 84.3 4.5", dtype='float', sep = ' ')

array([ 45.5, 76.8, 84.3, 4.5])

>>> fromstring("45,76,84", dtype='int', sep = ',')

array([45, 76, 84])

This tolerates whitespace after the ‘,’

Do the same with string method split() and array()

Specify dtype

>>> array('45,76,84'.split(','), dtype='int32')

array([45, 76, 84])

array2string(arr )

Convert array arr to a string as output by print

The default printing mechanism uses this function

>>> array2string(array([1,2,3]))

'[1 2 3]'

Writing/Reading Arrays to/from Files

tofile(fname, sep=separator, format=fstring )

Open text file named fname for writing

Write the flattened contents of self to the file

Separated by separator

Formatted according to fstring

>>> array([[1,2],[3,4]]).tofile('f2.dat', sep=' ', format="%d")

Contents of f2.dat

1 2 3 4

fromfile (fname, dtype=type, sep=separator )

Open text file named fname for reading

Should contain a sequence of elements that

can be interpreted as elements of type type and are separated by separator

Returns a 1D array of these elements converted to type

>>> fromfile('f2.dat', dtype='int', sep=' ').reshape(2,2)

array([[1, 2],

[3, 4]])

tofile() and fromfile() can take an open file object besides a file name

If separator is the empty string (the default),

writing/reading is in binary mode

format not used in tofile()

The default format for tofile() is “%s”
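A self-contained round trip through tofile() and fromfile() in text mode, as a sketch assuming Python 3 and `import numpy as np`. The temporary directory is only there so that the example leaves no file behind.

```python
import os
import tempfile
import numpy as np

a = np.array([[1, 2], [3, 4]])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "f2.dat")

    # Text mode: a non-empty separator, one format applied to each element
    a.tofile(path, sep=" ", format="%d")

    with open(path) as f:
        text = f.read()

    # The shape is not stored in the file, so it must be restored by hand
    back = np.fromfile(path, dtype=int, sep=" ").reshape(2, 2)
```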

For anything more complex, roll our own functions to write/read arrays to/from files

from numpy import *

def out3D( fname, a ):
    # Write a 3D array as text: one row per line, a blank line after each 2D layer
    f = open(fname, 'w')
    for l in a:
        for r in l:
            for c in r:
                f.write( "%d " % c )
            f.write( "\n" )
        f.write( "\n" )
    f.close()

a = arange(1,9).reshape(2,2,2)
out3D('aout.txt', a)

E:\Old D Drive\c690f08\NumPy>awrite.py

Contents of aout.txt

1 2

3 4

(blank line)

5 6

7 8

(blank line)

from numpy import *

def in3D( fname ):
    # Read back a file written by out3D: a blank line closes each 2D layer
    f = open(fname, 'r')
    a = []
    a1 = []
    for l in f:
        if l == "\n":
            a.append(a1)
            a1 = []
        else:
            a1.append([int(x) for x in l.split()])
    f.close()
    return array(a)

print in3D( 'aout.txt' )

E:\Old D Drive\c690f08\NumPy>aread.py

[[[1 2]

[3 4]]

[[5 6]

[7 8]]]

Pickling Arrays

A quick way to save intermediate results is to pickle them

Array method

dump(file )

where file is a file object open for writing or a file name,

pickles the contents of self to file

NumPy function

load(file )

where file is a file object open for reading or a file name,

loads a pickled array (and returns the original array)

In one session

>>> from numpy import *

>>> array([1,2,3]).dump('apicklejar')

In a later session

>>> from numpy import *

>>> print load('apicklejar')

[1 2 3]

Array method

dumps()

returns the pickled representation of self as a string

NumPy function

loads(picklestr )

unpickles array pickle string picklestr, returning the original array

>>> arr = array([1,2,3])

>>> str = arr.dumps()

>>> str

'\x80\x02cnumpy.core.multiarray\n_reconstruct\nq\x01cnumpy\nndarray\nq\x02K\x00\

x85U\x01b\x87Rq\x03(K\x01K\x03\x85cnumpy\ndtype\nq\x04U\x02i4K\x00K\x01\x87Rq\x0

5(K\x03U\x01<NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00tb\x89U\x0c\x01\x00\x00\x

00\x02\x00\x00\x00\x03\x00\x00\x00tb.'

>>> loads(str)

array([1, 2, 3])
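The module-level loads() function belongs to the NumPy release used in these notes. A hedged sketch for a recent NumPy under Python 3 (an assumption): dumps() returns a bytes object, and the standard pickle module reads it back.

```python
import pickle
import numpy as np

arr = np.array([1, 2, 3])

# The array method still produces a pickle, now as bytes
payload = arr.dumps()

# The standard library unpickles it into an equal array
restored = pickle.loads(payload)
```

For plain numeric data that must outlive a session, np.save() and np.load() are a commonly used alternative to pickling.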

More Advanced NumPy

C-Style and FORTRAN-Style Arrays

[Figure: memory layouts of a C-style 2D array (left) and a Fortran-style 2D array (right)]

Each takes 12 blocks of contiguous memory (see figure)

2 ways an ndarray object can use this contiguous memory

In the C style of indexing (left), the last index “varies the fastest”

Generally the NumPy default

In the Fortran style (right), the 1st index “varies the fastest”

If A is a C-style array, the same block of memory can be used to represent its transpose, A^T, as a Fortran-style array

The stride is how many elements in the underlying 1D layout we must jump to get to the next array element of a specific dimension

E.g., in a C-style 4×5×6 array, must jump over 30 elements to increment the 1st index by 1

30 is the stride for the 1st dimension
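The 4×5×6 example can be reproduced with the strides attribute. One caveat for this sketch (which assumes `import numpy as np`): NumPy reports strides in bytes, so they are divided by itemsize here to obtain strides counted in elements, as in the definition above.

```python
import numpy as np

c = np.zeros((4, 5, 6))              # C order is the default
f = np.zeros((4, 5, 6), order="F")   # same shape, Fortran order

# .strides is in bytes; divide by the element size to count elements
c_elem_strides = tuple(s // c.itemsize for s in c.strides)
f_elem_strides = tuple(s // f.itemsize for s in f.strides)
```

In C order the last index has stride 1 and the first has stride 5*6 = 30; in Fortran order the first index has stride 1 and the last has stride 4*5 = 20.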

The figure shows highlighted areas C[1:3,1:3] and F[1:3,1:3]

Neither C contiguous nor Fortran contiguous

But can still be represented by an ndarray object using the same striding tuple as the original array used

So a regular indexing expression on an ndarray can always produce an ndarray object without copying any data

The “view” feature of array indexing

Allows fast indexing without exploding memory usage
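A small sketch of the view behaviour, assuming `import numpy as np`. The 4×3 array is a hypothetical stand-in for the array in the figure.

```python
import numpy as np

# Hypothetical stand-in for the C-style array in the figure
C = np.arange(12).reshape(4, 3)

# An interior block: same strides as C, but only part of its memory
sub = C[1:3, 1:3]

neither_contiguous = (not sub.flags.c_contiguous) and (not sub.flags.f_contiguous)
same_strides = sub.strides == C.strides
shares = np.shares_memory(sub, C)

# Writing through the view changes the original: no data was copied
sub[0, 0] = -1
```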

Changing the Shape of an Array

An array has a shape, the number of elements along each axis

>>> a = arange(6).reshape(2,3)

>>> a

array([[0, 1, 2],

[3, 4, 5]])

>>> a.shape

(2, 3)

Several ndarray methods change an array’s shape

transpose() changes rows to columns and vice versa

>>> a.transpose()

array([[0, 3],

[1, 4],

[2, 5]])

ravel() “flattens” an array (gives the underlying 1D object)

>>> a.ravel()

array([0, 1, 2, 3, 4, 5])

Order of the elements in the array resulting from ravel() is normally C-style

So ravel() usually doesn’t copy its argument

But may need to be copied if the array was made by taking slices of another array or

created with unusual options
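Whether ravel() copied can be checked with np.shares_memory(). A sketch assuming `import numpy as np`:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# C-contiguous input: ravel() can return a view of the same memory
flat = a.ravel()
flat_is_view = np.shares_memory(a, flat)

# The transpose is not C-contiguous, so flattening it in C order needs a copy
t_flat = a.T.ravel()
t_is_view = np.shares_memory(a, t_flat)

# A strided slice (every other column) also forces a copy
s_flat = a[:, ::2].ravel()
s_is_view = np.shares_memory(a, s_flat)
```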

ravel() and reshape() can be told with an optional order argument to use FORTRAN-style ordering

>>> a.ravel('F')

array([0, 3, 1, 4, 2, 5])

>>> a.reshape((3,2), order='F')

array([[0, 4],

[3, 2],

[1, 5]])

>>> a

array([[0, 1, 2],

[3, 4, 5]])

reshape() leaves the array it’s invoked on unchanged

But resize() changes it

>>> a.resize(3,2)

>>> a

array([[0, 1],

[2, 3],

[4, 5]])

A dimension of -1 in a reshaping operation is automatically calculated to correspond to the other dimensions

>>> b = arange(24).reshape(2,3,4)

>>> b.reshape(2,-1,6).shape

(2, 2, 6)

Stacking together Different Arrays

Several arrays can be stacked together, along different axes:

Function column_stack() stacks 1D arrays as columns into a 2D array

row_stack() stacks 1D arrays as rows into a 2D array

The 1D arrays must have the same length in both cases

>>> a = array([1,2,3])

>>> b = array([4,5,6])

>>> column_stack((a,b,a))

array([[1, 4, 1],

[2, 5, 2],

[3, 6, 3]])

>>> row_stack((a,b))

array([[1, 2, 3],

[4, 5, 6]])

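In general, for arrays with 2 or more dimensions,

hstack() stacks along their 2nd axes

vstack() stacks along their 1st axes

So

row_stack() is equivalent to vstack() for any input arrays

column_stack() is equivalent to hstack() only for 2D arrays (for 1D arrays, hstack() simply concatenates them into a longer 1D array)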
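How these functions treat 1D and 2D inputs can be checked with a short sketch, assuming `import numpy as np`. vstack() is used for the row case, since it behaves the same as row_stack().

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

rows = np.vstack((a, b))          # 1D inputs become rows: shape (2, 3)
joined = np.hstack((a, b))        # 1D inputs are concatenated: shape (6,)
cols = np.column_stack((a, b))    # 1D inputs become columns: shape (3, 2)

# For 2D inputs, column_stack() and hstack() agree
a2, b2 = a.reshape(3, 1), b.reshape(3, 1)
agree_2d = np.array_equal(np.column_stack((a2, b2)), np.hstack((a2, b2)))
```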

concatenate() (more general) has optional argument axis for the axis along which concatenation is done

Default axis is 0

>>> x = array([[1,2],[3,4]])

>>> y = array([[5,6],[7,8]])

>>> concatenate((x,y))

array([[1, 2],

[3, 4],

[5, 6],

[7, 8]])

>>> concatenate((x,y),axis=1)

array([[1, 2, 5, 6],

[3, 4, 7, 8]])

For more info, see NumPy Tentative Tutorial, section 3.6.2 (Stacking together different arrays)

Splitting One Array into Several Smaller Ones

Function hsplit() splits an array along its horizontal axis

Specifying either

the number of equally shaped arrays to return (an error if this number is not a divisor of the number of columns) or

a tuple of columns before which splitting occurs

>>> a = arange(8).reshape(2,4)

>>> a

array([[0, 1, 2, 3],

[4, 5, 6, 7]])

>>> hsplit(a,2)

[array([[0, 1],

[4, 5]]), array([[2, 3],

[6, 7]])]

>>> [a1, a2] = hsplit(a,2)

>>> a1

array([[0, 1],

[4, 5]])

>>> a2

array([[2, 3],

[6, 7]])

>>> [a3,a4,a5] = hsplit(a, (1,3))

>>> a3

array([[0],

[4]])

>>> a4

array([[1, 2],

[5, 6]])

>>> a5

array([[3],

[7]])

vsplit() is similar but splits along the vertical axis

array_split() (more general) has optional argument axis for the axis along which splitting is done

>>> [b1,b2] = array_split(a, 2, axis=1)

>>> b1

array([[0, 1],

[4, 5]])

>>> b2

array([[2, 3],

[6, 7]])
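One practical difference between the two families, shown as a sketch assuming `import numpy as np`: hsplit() insists on an equal division, while array_split() tolerates a remainder.

```python
import numpy as np

a = np.arange(8).reshape(2, 4)

# 4 columns cannot be divided into 3 equal parts
try:
    np.hsplit(a, 3)
    uneven_ok = True
except ValueError:
    uneven_ok = False

# array_split() hands the extra column to the first piece
pieces = np.array_split(a, 3, axis=1)
widths = [p.shape[1] for p in pieces]
```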

Copies and Views

When working with arrays, their data is sometimes copied into a new array and sometimes not

There are three cases

No Copy

Arrays are objects (instances of ndarray)

Variables are actually bound to references to arrays, not to arrays themselves

Assignment involving such variables copies the reference and not the array

>>> a = array([1,2,3])

>>> b = a

>>> b is a

True

>>> b[0] = 5

>>> print a

[5 2 3]

Similarly, Python passes mutable objects as references

So function calls make no copies of arrays

View or Shallow Copy

Different array objects can share the same data

Method view() creates a new array object that looks at the same data

>>> a = arange(4)

>>> c = a.view()

>>> c is a

False

>>> print c

[0 1 2 3]

Changing the shape of a view doesn’t change the shape of its base

>>> c.shape = 2,2

>>> a.shape

(4,)

Can change the base via the view even when they have different shapes

>>> c[0,0] = 7

>>> print a

[7 1 2 3]

The type of the view is ndarray, like all NumPy arrays

>>> type(c)

<type 'numpy.ndarray'>

What distinguishes a view is that it doesn’t own its own memory

Value of the base attribute for an array that doesn’t own its own memory is the array whose memory the view references

For an array that owns its own memory, the value is None

>>> c.base

array([7, 1, 2, 3])

>>> print a.base

None

A view is a reference to its base and prevents it from being deallocated

>>> from numpy import *

>>> from gc import *

>>> enable() # garbage collection

>>> a = arange(4)

>>> c = a.view()

>>> print c

[0 1 2 3]

>>> a = arange(6) # Can’t access old array via ‘a’ now

>>> collect() # Collect garbage

0

>>> print a

[0 1 2 3 4 5]

>>> print c.base

[0 1 2 3]

A slice is a view

Its base is the array it’s derived from

>>> a = arange(8)

>>> s = a[2:6]

>>> type(s)

<type 'numpy.ndarray'>

>>> print s

[2 3 4 5]

>>> s.base is a

True

Again, we can update the base via the view (slice)

>>> s[:] = 9

>>> print a

[0 1 9 9 9 9 6 7]

Deep Copy

Method copy() makes a complete copy of the array and its data

>>> a = arange(4)

>>> d = a.copy()

>>> d is a

False

>>> print d.base

None

>>> d[0] = 9

>>> print a

[0 1 2 3]

The flags Attribute

Array flags give info about how the array’s memory area is interpreted

There are 6 flags, all Boolean

C_CONTIGUOUS is true if the data is in a single, C-style contiguous segment

F_CONTIGUOUS is true if the data is in a single, Fortran-style contiguous segment

OWNDATA is true if the array owns the memory it uses

WRITEABLE is true if the data area can be written to

ALIGNED is true if the data and strides are aligned appropriately for the hardware (as determined by the compiler)

UPDATEIFCOPY is true if this array is a copy of some other array

When this array is deallocated, the base array is updated with its contents

Flags can be read using attribute access with the lowercase versions of the names

>>> a = arange(4)

>>> print a.flags

C_CONTIGUOUS : True

F_CONTIGUOUS : True

OWNDATA : True

WRITEABLE : True

ALIGNED : True

UPDATEIFCOPY : False

>>> v = a.view()

>>> print v.flags.owndata, v.flags.updateifcopy

False False

>>> c = a.copy()

>>> print c.flags.owndata, c.flags.updateifcopy

True False

Only the UPDATEIFCOPY, WRITEABLE, ALIGNED flags can be changed by the user

But UPDATEIFCOPY can never be set to True

Flags can be changed using attribute access with the lowercase versions of the names

Can also use method setflags(), with keyword Boolean arguments

write: the new value of WRITEABLE

align: new value of ALIGNED

uic: new value of UPDATEIFCOPY (must be False)

>>> a.setflags(write=False)

>>> a[0] = 9

Traceback (most recent call last):

File "<stdin>", line 1, in <module>

RuntimeError: array is not writeable

>>> a.flags.writeable

False

>>> a.flags.writeable = True

>>> a[0] = 9
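The exception type raised on a write to a read-only array depends on the NumPy release (the session above shows RuntimeError; recent releases raise ValueError). A sketch that tolerates both, assuming `import numpy as np`:

```python
import numpy as np

a = np.arange(4)
a.setflags(write=False)

# Writing to a read-only array is refused
try:
    a[0] = 9
    blocked = False
except (ValueError, RuntimeError):
    blocked = True

unchanged = a[0] == 0

# Re-enable writing through the flags attribute
a.flags.writeable = True
a[0] = 9
```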

Universal Functions

A universal function (ufunc) is a function that operates element-wise on ndarrays

Supports array broadcasting, type casting, and other standard features

I.e., a ufunc is a “vectorized” wrapper for a function that takes a fixed number of scalar inputs and produces a fixed number (we consider only 1) of scalar outputs

Universal functions are instances of class numpy.ufunc

We’ve already seen numerous ufuncs

Regarding type casting, we noted that

With arrays of different types, the type of the resulting array is the more general or precise operand type

We explain broadcasting (for when operands have different yet compatible shapes) below

See EricsBroadcastingDoc, http://www.scipy.org/EricsBroadcastingDoc

There are several ufunc methods; we look at all but 1

Where a and b are arrays,

reduce(a ) applies self cumulatively to the elements of a, combining them into a single result

accumulate(a ) accumulates the result of applying self to all elements of a

outer(a, b ) computes the result of applying self to all pairs of elements of a and b

Note that the + operator is syntactic sugar for the ufunc add()

>>> a = arange(1,5)

>>> print a

[1 2 3 4]

>>> add.reduce(a)

10

>>> add.accumulate(a)

array([ 1, 3, 6, 10])

>>> add.outer(a, a)

array([[2, 3, 4, 5],

[3, 4, 5, 6],

[4, 5, 6, 7],

[5, 6, 7, 8]])

For multidimensional arrays, all these methods allow the operation to be done along a particular axis

>>> b = arange(12).reshape(2,6)

>>> print b

[[ 0 1 2 3 4 5]

[ 6 7 8 9 10 11]]

>>> add.reduce(b, axis=1)

array([15, 51])

ufunc methods can’t be used with operators that are syntactic sugar for ufuncs

Some such operators and the names of the corresponding ufuncs

Operator    ufunc
x + y       add(x, y)
x * y       multiply(x, y)
x - y       subtract(x, y)
x // y      floor_divide(x, y) (integer division)
x / y       true_divide(x, y) (in Python 2 without "from __future__ import division": divide(x, y))
x % y       mod(x, y)
x ** y      power(x, y)
x > y       greater(x, y)
x >= y      greater_equal(x, y)
x < y       less(x, y)
x <= y      less_equal(x, y)
x != y      not_equal(x, y)
x == y      equal(x, y)
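A quick check of the operator/ufunc correspondence for the two division operators, as a sketch assuming Python 3 and `import numpy as np`. It also shows why the named ufunc is needed whenever a ufunc method such as reduce() is wanted.

```python
import numpy as np

x = np.array([7, 8, 9])
y = np.array([2, 2, 4])

floor_matches = np.array_equal(x // y, np.floor_divide(x, y))
true_matches = np.array_equal(x / y, np.true_divide(x, y))

# Operators have no methods; the ufunc behind them does
running_product = np.multiply.accumulate(x)
total = np.add.reduce(x)
```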

Broadcasting

Broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations

Smaller array is "broadcast" across the larger array (if possible) so they have compatible shapes

Broadcasting vectorizes array operations so looping occurs in C instead of Python

Usually leads to efficient algorithm implementations

But sometimes not

NumPy operations are usually done element-by-element

Requires 2 arrays to have exactly the same shape

Example 1

>>> a = array([1.0, 2.0, 3.0])

>>> b = array([2.0, 2.0, 2.0])

>>> a * b

array([ 2., 4., 6.])

NumPy's broadcasting rule relaxes this constraint when the arrays' shapes meet certain constraints

Simplest example: an array and a scalar are combined

Example 2

>>> a = array([1.0,2.0,3.0])

>>> b = 2.0

>>> a * b

array([ 2., 4., 6.])

Result equivalent to the previous example (b was an array)

Scalar b is stretched during the operation into an array with same shape as a

In fact, NumPy uses the original scalar value without actually making copies

Because Example 2 moves less memory around, it’s faster than Example 1

The Broadcasting Rule

To broadcast, the size of the trailing axes for both arrays must either be the same or one must be 1

If this condition isn’t met, a ValueError exception is thrown (reported here as “frames are not aligned”; recent NumPy releases report “operands could not be broadcast together”)

The size of the result array is the max size along each dimension from the input arrays

Rule says nothing about 2 arrays having same number of dimensions

E.g., have a 256x256x3 array of RGB values

Want to scale each color component in the image by a different value

Multiply the image by a 1D array with 3 values

Lining up the sizes of the trailing axes according to the broadcast rule shows they’re compatible

Image (3D): 256 x 256 x 3

Scale (1D): 3

Result (3D): 256 x 256 x 3

E.g., here both A and B have axes with length 1 that are expanded to a larger size

A (4D): 8 x 1 x 6 x 1

B (3D): 7 x 1 x 5

Result (4D): 8 x 7 x 6 x 5
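Both shape calculations can be reproduced without doing any arithmetic on real data. This sketch assumes a NumPy release that provides np.broadcast_shapes() (1.20 or later); the scale factors are made-up values.

```python
import numpy as np

# Shapes only: the trailing axes are lined up and sizes of 1 are stretched
rgb_shape = np.broadcast_shapes((256, 256, 3), (3,))
ab_shape = np.broadcast_shapes((8, 1, 6, 1), (7, 1, 5))

# The image example with actual arrays
image = np.ones((256, 256, 3))
scale = np.array([0.5, 1.0, 2.0])   # hypothetical per-channel factors
scaled = image * scale
```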

Example 3

>>> a = array([[ 0.0, 0.0, 0.0],

... [10.0,10.0,10.0],

... [20.0,20.0,20.0],

... [30.0,30.0,30.0]])

>>> b = array([1.0,2.0,3.0])

>>> a + b

array([[ 1., 2., 3.],

[ 11., 12., 13.],

[ 21., 22., 23.],

[ 31., 32., 33.]])

b is added to each row of a

Contents of b and result are off by 1

When b is longer than the rows of a, the shapes are incompatible
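The incompatible case, and one way around it, as a sketch assuming `import numpy as np`. The 4-element `b` is a made-up example of a vector that matches the number of rows rather than the number of columns.

```python
import numpy as np

a = np.zeros((4, 3))
b = np.array([1.0, 2.0, 3.0, 4.0])   # length 4 does not match the 3 columns

# Trailing axes are 3 and 4: neither equal nor 1, so broadcasting fails
try:
    a + b
    failed = False
except ValueError:
    failed = True

# Turning b into a 4x1 column makes the shapes compatible: (4,3) with (4,1)
per_row = a + b[:, np.newaxis]
```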

Broadcasting lets us take the outer product (or any other outer operation) of 2 arrays

The following shows an outer addition operation of 2 1D arrays

Produces same result as Example 3

Example 4

>>> a = array([0.0,10.0,20.0,30.0])

>>> b = array([1.0,2.0,3.0])

>>> a[:,newaxis] + b

array([[ 1., 2., 3.],

[ 11., 12., 13.],

[ 21., 22., 23.],

[ 31., 32., 33.]])

The newaxis index operator inserts a new axis into a

Makes it a 2D 4x1 array

Getting Help on a Function or Method

Enter the help facility with

>>> help()

This is followed by some directions for its use

The prompt now is help>

To exit, type quit

Suppose we already have done

>>> from numpy import *

For help on a NumPy function, still must prefix its name with ‘numpy.’

help> numpy.arange

Help on built-in function arange in numpy:

numpy.arange = arange(...)

arange([start,] stop[, step,], dtype=None)

For integer arguments, just like range() except it returns an array

Etc.

For ndarray methods, prefix the name with ‘numpy.ndarray.’

help> numpy.ndarray.setflags

Help on method_descriptor in numpy.ndarray:

numpy.ndarray.setflags = setflags(...)

a.setflags(write=None, align=None, uic=None)

Instead of entering the help facility, can pass the name of the function/method to help()

Don’t need the ‘numpy.’ prefix

>>> help(arange)

Help on built-in function arange in module numpy.core.multiarray:

Etc.

>>> help(ndarray.setflags)

Help on method_descriptor:

setflags(...)

a.setflags(write=None, align=None, uic=None)

Matrices

NumPy provides two fundamental objects

an N-dimensional array object (ndarray) and

a universal function object (ufunc)

Other objects are built on top of these

Matrices (subclass matrix) are 2D objects that inherit from ndarray

matrix() is like array() but produces a matrix

>>> from numpy import matrix

>>> from numpy import linalg

>>> A = matrix( [[1,2,3],[11,12,13],[21,22,23]])

>>> print A

[[ 1 2 3]

[11 12 13]

[21 22 23]]

Make a column vector, 3x1 (not just 3)

>>> x = matrix( [[1],[2],[3]] )

>>> print x

[[1]

[2]

[3]]

Make a row vector, 1x3

>>> y = matrix( [[1,2,3]] )

>>> print y

[[1 2 3]]

* is now matrix (not element-wise) multiplication

>>> A * x

matrix([[ 14],

[ 74],

[134]])

Find the transpose

>>> A.T

matrix([[ 1, 11, 21],

[ 2, 12, 22],

[ 3, 13, 23]])

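Solve a system of equations, A z = x, for the unknown vector z

Result gives the unknowns in order: z[0], z[1], z[2]

Note that this particular A is singular (its rows differ by multiples of [10, 10, 10]), so the solution shown is only one of infinitely many and depends on round-off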

>>> linalg.solve(A, x)

matrix([[ 0.03333333],

[-0.76666667],

[ 0.83333333]])
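The matrix A used here happens to be singular, which is worth checking before trusting a solver's output. The sketch below (assuming Python 3 and `import numpy as np`, with plain arrays instead of matrix objects) confirms that the printed vector does satisfy the equations, and that adding a null-space vector gives another equally valid solution.

```python
import numpy as np

A = np.array([[1, 2, 3], [11, 12, 13], [21, 22, 23]], dtype=float)
x = np.array([1.0, 2.0, 3.0])

# The determinant is zero up to round-off, so A has no inverse
det_is_zero = abs(np.linalg.det(A)) < 1e-9

# The solution printed above, written as exact fractions
z = np.array([1.0, -23.0, 25.0]) / 30.0
z_solves = np.allclose(A @ z, x)

# A maps [1, -2, 1] to zero, so z plus any multiple of it also solves the system
null_vec = np.array([1.0, -2.0, 1.0])
other_solves = np.allclose(A @ (z + null_vec), x)
```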

matrix() may take a string description, rows separated by “;”

>>> B = matrix("[1 2; 3 1]")

>>> print B

[[1 2]

[3 1]]

Find the determinant

>>> linalg.det(B)

-5.0

Find the inverse (a matrix)

>>> Binv = linalg.inv(B)

>>> print Binv

[[-0.2 0.4]

[ 0.6 -0.2]]

>>> type(Binv)

<class 'numpy.core.defmatrix.matrix'>

>>> B * Binv

matrix([[ 1., 0.],

[ 0., 1.]])

matrix() also takes an array as argument

Converts it to a (2D) matrix even if it's 1D

>>> from numpy import *

>>> matrix(arange(4))

matrix([[0, 1, 2, 3]])

>>> matrix(arange(4)).T

matrix([[0],

[1],

[2],

[3]])

Can reshape the array before converting

>>> matrix(arange(4).reshape(2,2))

matrix([[0, 1],

[2, 3]])

Or reshape the matrix

>>> D = matrix(arange(4))

>>> D.reshape(2,2)

matrix([[0, 1],

[2, 3]])

The A attribute of a matrix is the underlying 2D array (a plain ndarray)

>>> D.A

array([[0, 1, 2, 3]])

Can index, slice, and iterate over matrices much as with arrays

The linear algebra package works with both arrays and matrices

It’s generally advisable to use arrays

But you can mix them

E.g., use arrays for the bulk of the code

Switch to matrices when doing lots of multiplication
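Where the `@` operator is available (Python 3.5 or later with NumPy 1.10 or later, which is an assumption about the reader's setup and not something these notes cover), plain arrays can do the matrix arithmetic directly. A sketch using the 2×2 matrix B from above; the right-hand side `rhs` is a made-up example.

```python
import numpy as np

B = np.array([[1.0, 2.0], [3.0, 1.0]])
rhs = np.array([5.0, 5.0])   # hypothetical right-hand side

# '*' stays elementwise on arrays; '@' is the matrix product
elementwise = B * B
matrix_product = B @ B

# The linear algebra routines accept plain arrays
solution = np.linalg.solve(B, rhs)
identity = B @ np.linalg.inv(B)
```

Here `solution` satisfies `B @ solution == rhs` up to round-off, and `identity` is the 2×2 identity matrix up to round-off.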