Matplotlib: A tutorial, Devert Alexandre (marmakoide.org/download/teaching/dm/dm-matplotlib.pdf)

UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA SCHOOL OF SOFTWARE ENGINEERING OF USTC

Matplotlib: A tutorial

Devert Alexandre, School of Software Engineering of USTC

30 November 2012

Table of Contents

1 First steps

2 Curve plots

3 Scatter plots

4 Boxplots

5 Histograms

6 Usage example

Devert Alexandre (School of Software Engineering of USTC) — Matplotlib — Slide 2/44

Curve plot

Let’s plot a curve

import math
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
xRange = (-math.pi, math.pi)

x, y = [], []
for n in range(nbSamples):
    k = (n + 0.5) / nbSamples
    x.append(xRange[0] + (xRange[1] - xRange[0]) * k)
    y.append(math.sin(x[-1]))

# Plot the sinusoid
plt.plot(x, y)
plt.show()

This will show you something like this:

[Figure: the sinusoid; x ticks from -4 to 4, y ticks from -1.0 to 1.0]

numpy

matplotlib can work with numpy arrays

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
xRange = (-math.pi, math.pi)

x, y = numpy.zeros(nbSamples), numpy.zeros(nbSamples)
for n in range(nbSamples):
    k = (n + 0.5) / nbSamples
    x[n] = xRange[0] + (xRange[1] - xRange[0]) * k
    y[n] = math.sin(x[n])

# Plot the sinusoid
plt.plot(x, y)
plt.show()

numpy provides many functions and is efficient.

• zeros builds arrays filled with 0

• linspace builds arrays filled with an arithmetic sequence

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
nbSamples = 256
x = numpy.linspace(-math.pi, math.pi, num=nbSamples)
y = numpy.zeros(nbSamples)
for n in range(nbSamples):
    y[n] = math.sin(x[n])

# Plot the sinusoid
plt.plot(x, y)
plt.show()

numpy functions can work on entire arrays

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)

# Plot the sinusoid
plt.plot(x, y)
plt.show()
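
As a quick sanity check, the vectorized call can be compared with the element-by-element loop used earlier. This is only a sketch; the names y_loop, y_vec and same are illustrative and do not come from the slides.

```python
import math
import numpy

x = numpy.linspace(-math.pi, math.pi, num=256)

# Element-by-element version: one math.sin call per sample
y_loop = numpy.zeros(len(x))
for n in range(len(x)):
    y_loop[n] = math.sin(x[n])

# Vectorized version: numpy.sin is applied to the whole array at once
y_vec = numpy.sin(x)

# Both versions should agree up to floating point rounding
same = numpy.allclose(y_loop, y_vec)
print(same)
```

The vectorized form is shorter and avoids the Python-level loop, which is usually where the time goes on large arrays.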

PDF output

Exporting to a PDF file is just one change

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)

# Plot the sinusoid
plt.plot(x, y)
plt.savefig('sin-plot.pdf', transparent=True)
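
savefig picks the output format from the file extension, so the same figure can be written as PDF or PNG. The sketch below writes both into a temporary directory; the directory, the file names and the dpi value are illustrative choices, and the Agg backend is selected only so that the script runs without a display.

```python
import math
import os
import tempfile

import numpy
import matplotlib
matplotlib.use('Agg')  # non-interactive backend: writes files, opens no window
import matplotlib.pyplot as plt

# Generate and plot a sinusoid
x = numpy.linspace(-math.pi, math.pi, num=256)
plt.plot(x, numpy.sin(x))

# Write the same figure in two formats (paths are illustrative)
out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, 'sin-plot.pdf')
png_path = os.path.join(out_dir, 'sin-plot.png')
plt.savefig(pdf_path, transparent=True)  # vector output
plt.savefig(png_path, dpi=150)           # raster output, 150 dots per inch
plt.close()
```

A vector format such as PDF generally suits printed documents, while PNG is convenient for web pages.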

2 Curve plots

Multiple curves

It’s often convenient to show several curves in one figure

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y)
plt.plot(x, z)
plt.show()

[Figure: the sine and cosine curves drawn in the same figure]

Custom colors

Changing colors can help to make nice documents

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, c='#FF4500')
plt.plot(x, z, c='#4682B4')
plt.show()

[Figure: the sine and cosine curves, each drawn in its custom color]

Line thickness

Line thickness can be changed as well

import math
import numpy
import matplotlib.pyplot as plt

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, linewidth=3, c='#FF4500')
plt.plot(x, z, c='#4682B4')
plt.show()

[Figure: the sine and cosine curves, the sine drawn with a thicker line]

Line patterns

For printed documents, colors can be replaced by line patterns

import math
import numpy
import matplotlib.pyplot as plt

# Linestyles can be '-', '--', '-.', ':'

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, linestyle='--', c='#000000')
plt.plot(x, z, c='#808080')
plt.show()

[Figure: the sine drawn as a dashed black line, the cosine as a solid gray line]

Markers

It is sometimes relevant to show the data points

import math
import numpy
import matplotlib.pyplot as plt

# Markers can be '.', ',', 'o', '1' and more

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=64)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, marker='1', markersize=15, c='#000000')
plt.plot(x, z, c='#000000')
plt.show()

[Figure: the sine curve with a marker on each data point, the cosine as a plain line]

Legend

A legend can help to make self-explanatory figures

import math
import numpy
import matplotlib.pyplot as plt

# legend location can be 'best', 'center', 'upper left', 'lower right', etc.

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()

[Figure: the sine and cosine curves with a legend listing sin(x) and cos(x)]

Custom axis scale

Changing the axis scale can improve readability

import math
import numpy
import matplotlib.pyplot as plt

# legend location can be 'best', 'center', 'upper left', 'lower right', etc.

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()

[Figure: the sine and cosine curves with their legend; y ticks now run from -1.5 to 1.5]

Grid

The same goes for a grid, which can be helpful

import math
import numpy
import matplotlib.pyplot as plt

# legend location can be 'best', 'center', 'upper left', 'lower right', etc.

# Generate a sinusoid and a cosinoid
x = numpy.linspace(-math.pi, math.pi, num=256)
y = numpy.sin(x)
z = numpy.cos(x)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)
axis.grid(True)

# Plot the curves
plt.plot(x, y, c='#FF4500', label='sin(x)')
plt.plot(x, z, c='#4682B4', label='cos(x)')
plt.legend(loc='best')
plt.show()

[Figure: the sine and cosine curves with their legend, drawn over a grid]

Error bars

Your data might come with a known measurement error

import math
import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a noisy sinusoid
x = numpy.linspace(-math.pi, math.pi, num=48)
y = numpy.sin(x + 0.05 * numpy.random.standard_normal(len(x)))
# Error magnitudes must not be negative, hence the absolute value
yerror = 0.1 * numpy.abs(numpy.random.standard_normal(len(x)))

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_ylim(-0.5 * math.pi, 0.5 * math.pi)

# Plot the curves
plt.plot(x, y, c='#FF4500')
plt.errorbar(x, y, yerr=yerror)
plt.show()
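
Error bars do not have to be symmetric. As a sketch, assuming made-up lower and upper error magnitudes (both must be non-negative, and recent matplotlib versions are expected to reject negative values), errorbar accepts a pair of arrays for yerr. The Agg backend is selected only so that the example runs without a display.

```python
import math

import numpy
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

rng = numpy.random.RandomState(0)

# A noisy sinusoid with made-up asymmetric error magnitudes
x = numpy.linspace(-math.pi, math.pi, num=24)
y = numpy.sin(x) + 0.05 * rng.standard_normal(len(x))
lower = 0.1 * numpy.abs(rng.standard_normal(len(x)))  # distance below each point
upper = 0.2 * numpy.abs(rng.standard_normal(len(x)))  # distance above each point

fig = plt.figure()
axis = fig.add_subplot(111)

# yerr given as [lower, upper]: two rows, one value per data point
container = axis.errorbar(x, y, yerr=[lower, upper], fmt='o', capsize=3,
                          c='#FF4500')
```

Here fmt='o' draws the data as points without a connecting line, and capsize adds small caps at the ends of the bars.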

[Figure: the noisy sinusoid with an error bar on each point; y ticks from -1.5 to 1.5]

3 Scatter plots

Scatter plot

A scatter plot just shows one point for each dataset entry

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = numpy.random.standard_normal(nbPoints)

# Plot the points
plt.scatter(x, y)
plt.show()

[Figure: scatter plot of the 100 points]

Aspect ratio

It can be very important to have the same scale on both axes

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = 0.1 * numpy.random.standard_normal(nbPoints)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
plt.scatter(x, y, c='#FF4500')
plt.show()

[Figure: the scatter plot with equal scales; x ticks from -3 to 3, y ticks from -0.3 to 0.3]

Aspect ratio

Alternative way to keep the same scale on both axes

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate a 2d normal distribution
nbPoints = 100
x = numpy.random.standard_normal(nbPoints)
y = 0.1 * numpy.random.standard_normal(nbPoints)

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

# Same limits on both axes, with a 5% margin on each side
cmin, cmax = min(min(x), min(y)), max(max(x), max(y))
margin = 0.05 * (cmax - cmin)
cmin -= margin
cmax += margin

axis.set_xlim(cmin, cmax)
axis.set_ylim(cmin, cmax)

# Plot the points
plt.scatter(x, y, c='#FF4500')
plt.show()
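
The limit computation can be wrapped in a small reusable helper. This is only a sketch: the function name common_limits and its margin parameter are hypothetical, not part of matplotlib or of the slides.

```python
import numpy


def common_limits(*arrays, margin=0.05):
    """Return one (low, high) pair covering every array, padded by margin * span."""
    low = min(numpy.min(a) for a in arrays)
    high = max(numpy.max(a) for a in arrays)
    pad = margin * (high - low)
    return low - pad, high + pad


# Small worked example: the data spans 0 to 10, so the padding is 5% of 10
x = numpy.array([0.0, 1.0])
y = numpy.array([2.0, 10.0])
cmin, cmax = common_limits(x, y)
print(cmin, cmax)
```

The returned pair can then be passed to both axis.set_xlim(cmin, cmax) and axis.set_ylim(cmin, cmax).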

[Figure: the scatter plot with both axes running from about -2 to 2]

Multiple scatter plots

As for curves, you can show 2 datasets on one figure

import numpy
import numpy.random
import matplotlib.pyplot as plt

colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

# Generate a 2d normal distribution
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
for i in range(len(x)):
    plt.scatter(x[i], y[i], c=colors[i % len(colors)])

plt.show()

[Figure: the two point clouds, each in its own color]

Showing centers

It can help to see the centers or the median points

import numpy
import numpy.random
import matplotlib.pyplot as plt

colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

# Generate a 2d normal distribution
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points, then a larger dot at the median point of each dataset
for i in range(len(x)):
    col = colors[i % len(colors)]
    plt.scatter(x[i], y[i], c=col)
    plt.scatter([numpy.median(x[i])], [numpy.median(y[i])], c=col, s=250)

plt.show()

[Figure: the two point clouds, each with a large dot at its median point]

Marker styles

You can use different marker styles

import numpy
import numpy.random
import matplotlib.pyplot as plt

markers = ('+', '^', '.')

# Generate a 2d normal distribution
nbPoints = 100

x, y = [], []

x += [numpy.random.standard_normal(nbPoints)]
y += [0.25 * numpy.random.standard_normal(nbPoints)]

x += [0.5 * numpy.random.standard_normal(nbPoints) + 3.0]
y += [2 * numpy.random.standard_normal(nbPoints) + 2.0]

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
for i in range(len(x)):
    m = markers[i % len(markers)]
    plt.scatter(x[i], y[i], marker=m, c='#000000')
    plt.scatter([numpy.median(x[i])], [numpy.median(y[i])], marker=m, s=250, c='#000000')

plt.show()

[Figure: the two point clouds in black, each with its own marker style]

4 Boxplots

Boxplots

Let’s do a simple boxplot

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
x = numpy.random.standard_normal(256)

# Show a boxplot of the data
plt.boxplot(x)
plt.show()
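
To read a boxplot, it helps to know what it summarizes. With matplotlib's default settings, the box spans the first to the third quartile, the line inside it is the median, and the whiskers reach the most extreme data points within 1.5 times the interquartile range of the box; points beyond are drawn individually. The sketch below recomputes these values with numpy; the helper name box_stats is hypothetical.

```python
import numpy


def box_stats(values, whis=1.5):
    """Compute the quantities a default boxplot displays (illustrative helper)."""
    values = numpy.asarray(values, dtype=float)
    q1, median, q3 = numpy.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_fence, high_fence = q1 - whis * iqr, q3 + whis * iqr
    # Whiskers stop at the most extreme data points still inside the fences
    inside = values[(values >= low_fence) & (values <= high_fence)]
    outliers = values[(values < low_fence) | (values > high_fence)]
    return {
        'q1': q1, 'median': median, 'q3': q3,
        'whisker_low': inside.min(), 'whisker_high': inside.max(),
        'outliers': outliers,
    }


# Nine regular values plus one point far away from the others
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

In this example the value 30 lies beyond the upper fence, so it is reported as an outlier rather than stretching the whisker.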

[Figure: a single boxplot; y ticks from -3 to 3]

You might want to show the original data at the same time

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
x = numpy.random.standard_normal(256)

# Show a boxplot of the data, drawn over the data points
plt.scatter([0] * len(x), x, c='#4682B4')
plt.boxplot(x, positions=[0])
plt.show()

[Figure: the boxplot drawn over the scattered data points, at position 0]

Multiple boxplots

Boxplots are often used to show various distributions side by side

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
data = []
for i in range(5):
    mu = 10 * numpy.random.random_sample()
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Show a boxplot of the data
plt.boxplot(data)
plt.show()

[Figure: five boxplots side by side, numbered 1 to 5]

Orientation

Changing the orientation is easily done

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
data = []
for i in range(5):
    mu = 10 * numpy.random.random_sample()
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Show a boxplot of the data
plt.boxplot(data, vert=0)
plt.show()

[Figure: the five boxplots drawn horizontally]

Legend

Good graphics have a proper legend

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Generate normal distribution data
labels = ['mercury', 'lead', 'lithium', 'tungsten', 'cadmium']
data = []
for i in range(len(labels)):
    mu = 10 * numpy.random.random_sample() + 100
    sigma = 2 * numpy.random.random_sample() + 0.1
    data.append(numpy.random.normal(mu, sigma, 256))

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_title('Alien nodules composition')
axis.set_ylabel('concentration (ppm)')

# Show a boxplot of the data
plt.boxplot(data)

# Name the boxes after plotting, so that boxplot does not reset the tick labels
xtickNames = plt.setp(axis, xticklabels=labels)
plt.show()

[Figure: five labeled boxplots titled 'Alien nodules composition'; y axis: concentration (ppm)]

5 Histograms

Histograms

Histograms are convenient to sum up results

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
data = numpy.abs(numpy.random.standard_normal(30))

# Show a histogram
plt.barh(range(len(data)), data, color='#4682B4')
plt.show()
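
The code above draws one bar per data value. When the goal is instead to count how many values fall into each interval, the binning can be done with numpy.histogram before drawing the bars. This is a sketch under that assumption; the choice of 5 bins and the variable names are illustrative, and the Agg backend is selected only so that the example runs without a display.

```python
import numpy
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

rng = numpy.random.RandomState(0)
data = numpy.abs(rng.standard_normal(30))

# Count the values falling into 5 equal-width bins
counts, edges = numpy.histogram(data, bins=5)
centers = 0.5 * (edges[:-1] + edges[1:])

# One vertical bar per bin, as wide as the bin itself
fig = plt.figure()
axis = fig.add_subplot(111)
axis.bar(centers, counts, width=edges[1] - edges[0], color='#4682B4')
axis.set_xlabel('value')
axis.set_ylabel('count')
```

pyplot also provides plt.hist(data, bins=5), which performs the binning and the drawing in one call.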

[Figure: 30 horizontal bars, one per data value]

A variant to show 2 quantities per item on 1 figure

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
data = [[], []]
for i in range(len(data)):
    data[i] = numpy.abs(numpy.random.standard_normal(30))

# Show a histogram
labels = range(len(data[0]))

plt.barh(labels, data[0], color='#4682B4')
plt.barh(labels, -1 * data[1], color='#FF4500')
plt.show()

[Figure: horizontal bars extending to both sides of 0, one quantity on each side]

Labels

Very often, we need to name items on a histogram

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
names = ['Wang Bu', 'Cheng Cao', 'Zhang Xue Li', 'Tang Wei', 'Sun Wu Kong']
marks = 7 * numpy.random.random_sample(len(names)) + 3

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_xlim(0, 10)

plt.yticks(numpy.arange(len(marks)) + 0.5, names)

axis.set_title('Data-mining marks')
axis.set_xlabel('mark')

# Show a histogram
# (align='edge' starts each bar at its integer position, so that the ticks at +0.5 fall near the bar centers)
plt.barh(range(len(marks)), marks, align='edge', color='#4682B4')
plt.show()

[Figure: horizontal bars labeled with the five names, titled 'Data-mining marks'; x axis: mark, from 0 to 10]

Error bars

Error bars indicate the accuracy of values

import numpy
import numpy.random
import matplotlib.pyplot as plt

# Some data
names = ['6809', '6502', '8086', 'Z80', 'RCA1802']
speed = 70 * numpy.random.random_sample(len(names)) + 30
error = 9 * numpy.random.random_sample(len(names)) + 1

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

plt.yticks(numpy.arange(len(names)) + 0.5, names)

axis.set_title('8 bits CPU benchmark - string test')
axis.set_xlabel('score')

# Show a histogram
# (align='edge' starts each bar at its integer position, so that the ticks at +0.5 fall near the bar centers)
plt.barh(range(len(names)), speed, xerr=error, align='edge', color='#4682B4')
plt.show()

[Figure: horizontal bars with error bars, labeled with the five CPU names, titled '8 bits CPU benchmark - string test'; x axis: score]

6 Usage example

Old Faithful

Let’s display some real data: Old Faithful geyser

This works, but it is a good example of a half-done job

import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/geyser.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111, aspect='equal')

# Plot the points
plt.scatter(data[:, 0], data[:, 1], c='#FF4500')
plt.show()

[Figure: scatter plot of the geyser data, without title or axis labels]

Let’s make a more readable figure

import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/geyser.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_title('Old Faithful geyser dataset')
axis.set_xlabel('waiting time (mn.)')
axis.set_ylabel('eruption duration (mn.)')

# Plot the points
plt.scatter(data[:, 0], data[:, 1], c='#FF4500')
plt.show()

[Figure: scatter plot titled 'Old Faithful geyser dataset'; x axis: waiting time (mn.), y axis: eruption duration (mn.)]

Mercury & fishes

Let’s display more complex data: fish and mercury poisoning

A first try

import numpy
import matplotlib.pyplot as plt

# Read the data
data = numpy.loadtxt('./datasets/fish.dat')

# Axis setup
fig = plt.figure()
axis = fig.add_subplot(111)

axis.set_title('North Carolina fishes mercury concentration')
axis.set_xlabel('weight (g.)')
axis.set_ylabel('mercury concentration (ppm)')

# Plot the points
plt.scatter(data[:, 3], data[:, 4], c='#FF4500')
plt.show()

[Figure: scatter plot titled 'North Carolina fishes mercury concentration'; x axis: weight (g.), y axis: mercury concentration (ppm)]

Show more information with subplots
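
The slides give no code for this figure. A minimal sketch of the idea follows, assuming synthetic stand-in arrays (length, weight and mercury are generated here, not read from fish.dat) and pyplot's subplots helper. The Agg backend is selected only so that the example runs without a display.

```python
import numpy
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

# Synthetic stand-in data (assumption: NOT the real fish dataset)
rng = numpy.random.RandomState(0)
length = rng.uniform(20.0, 70.0, 100)                       # cm
weight = 0.02 * length ** 3 * rng.uniform(0.8, 1.2, 100)    # g
mercury = 0.03 * length * rng.uniform(0.5, 1.5, 100)        # ppm

# One row of three subplots under a common title
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
fig.suptitle('North Carolina fishes mercury concentration')

# Each panel: x data, y data, x label, y label
panels = [
    (length, weight, 'length (cm.)', 'weight (g.)'),
    (weight, mercury, 'weight (g.)', 'mercury concentration (ppm)'),
    (length, mercury, 'length (cm.)', 'mercury concentration (ppm)'),
]
for axis, (px, py, xlabel, ylabel) in zip(axes, panels):
    axis.scatter(px, py, c='#FF4500')
    axis.set_xlabel(xlabel)
    axis.set_ylabel(ylabel)
```

With the real dataset, the three arrays would be replaced by the matching columns of the loaded data; the column holding the length is not stated in the slides.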

[Figure: three scatter plots titled 'North Carolina fishes mercury concentration': weight (g.) against length (cm.), mercury concentration (ppm) against weight (g.), and mercury concentration (ppm) against length (cm.)]

Data from one river with its own color
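
Again, the slides give no code for this figure. One way to sketch it, assuming a hypothetical integer river identifier per fish and synthetic measurements (none of these arrays come from fish.dat), is to draw one scatter call per river, cycling through a color tuple. The Agg backend is selected only so that the example runs without a display.

```python
import numpy
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

colors = ('#FF4500', '#3CB371', '#4682B4', '#DB7093', '#FFD700')

# Synthetic stand-in data (assumption: NOT the real fish dataset)
rng = numpy.random.RandomState(1)
nbFishes = 90
river = rng.randint(0, 3, nbFishes)              # hypothetical river id per fish
length = rng.uniform(20.0, 70.0, nbFishes)       # cm
mercury = 0.03 * length * rng.uniform(0.5, 1.5, nbFishes)  # ppm

fig = plt.figure()
axis = fig.add_subplot(111)
axis.set_xlabel('length (cm.)')
axis.set_ylabel('mercury concentration (ppm)')

# One scatter call per river, each with its own color and legend entry
rivers = numpy.unique(river)
for i, r in enumerate(rivers):
    mask = (river == r)
    axis.scatter(length[mask], mercury[mask],
                 c=colors[i % len(colors)], label='river %d' % r)
axis.legend(loc='best')
```

The boolean mask selects the rows belonging to one river, so each group can be styled independently.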

[Figure: the same three scatter plots titled 'North Carolina fishes mercury concentration', with the points of each river drawn in their own color]