Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology...

61
Scientific Programming with the SciPy Stack Shaun Walbridge Kevin Butler

Transcript of Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology...

Page 1: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ScientificProgrammingwiththeSciPyStackShaunWalbridge

KevinButler

Page 3: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ScientificComputing

Page 4: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Theapplicationofcomputationalmethodstoallaspectsoftheprocessofscientificinvestigation–dataacquisition,datamanagement,analysis,visualization,andsharingofmethodsandresults.

ScientificComputing

Page 5: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ExtendingArcGISArcGISisasystemofrecord.Combinedataandanalysisfrommanyfieldsandintoacommonenvironemnt.Whyextend?Can'tdoitall,wesupportover1000GPtools—enablingintegrationwithotherenvironmentstoextendtheplatform.

Page 6: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Python

Page 7: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

WhyPython?Accessiblefornew-comers,andthe

Extensivepackagecollection(56kon ),broaduser-baseStronggluelanguageusedtobindtogethermanyenvironments,bothopensourceandcommercialOpensourcewithliberallicense—dowhatyouwant

BrandnewtoPython?ThistalkmaybechallengingResourcesincludematerialsthatforgettingstarted

mosttaughtfirstlanguageinUSuniversites

PyPI

Page 8: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

PythoninArcGISPythonAPIfordrivingArcGISDesktopandServerAfullyintegratedmodule:importarcpyInteractiveWindow,PythonAddins,PythonTooboxesExtensions:

SpatialAnalyst:arcpy.saMapDocument:arcpy.mappingNetworkAnalyst:arcpy.naGeostatistics:arcpy.gaFastcursors:arcpy.da

ArcGISAPIforPython

Page 9: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

PythoninArcGISPython3.5inPro( )

arcpy.mpinsteadofarcpy.mappingContinuetoaddmodules:NetCDF4,xlrd,xlwt,PyPDF2,dateutil,pip

,witha usingSciPyforontheflyvisualizations

DesktopvsProPython

Pythonrasterfunction repositoryofexamples

Page 10: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

PythoninArcGISHere,focusonSciPystack,what’sincludedoutoftheboxMovetowardmaintainable,reusablecodeandbeyondthe“one-off”Recurringtheme:multi-dimensionaldatastructuresAlsosee whichcoversdaskBrendanCollinstalktomorrow

Page 11: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

SciPy

Page 12: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

WhySciPy?Mostlanguagesdon’tsupportthingsusefulforscience,e.g.:

VectorprimitivesComplexnumbersStatistics

Objectorientedprogrammingisn’talwaystherightparadigmforanalysisapplications,butistheonlywaytogoinmanymodernlanguagesSciPybringsthepiecesthatmatterforscientificproblemstoPython.

Page 13: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

SciPyStack

Page 14: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

IncludedSciPyPackage KLOC Contributors Stars

118 441 4909

7 75 1053

236 429 4011

183 408 8765

387 387 2930

243 443 3642

Totals 1174 1885

matplotlib

Nose

NumPy

Pandas

SciPy

SymPy

Page 15: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

TestingwithNose—aPythonframeworkfortesting

Testsimproveyourproductivity,andcreaterobustcodeNosebuildsonunittestframework,extendsittomaketestingeasy.Pluginarchitecture, andcanbeextendedwith .

Nose

includesanumberofpluginsthird-partyplugins

Page 16: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

1. Anarrayobjectofarbitraryhomogeneousitems2. Fastmathematicaloperationsoverarrays3. RandomNumberGeneration

,CC-BYSciPyLectures

Page 17: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ArcGIS+NumPyArcGISandNumPycaninteroperateonraster,table,andfeaturedata.SeeIn-memorydatamodel.Examplescriptto ifworkingwithlargerdata.

WorkingwithNumPyinArcGISprocessbyblocks

Page 18: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ArcGIS+NumPy

Page 19: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

PlottinglibraryandAPIforNumPydataMatplotlibGallery

Page 21: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

SciPy:GeometricMeanCalculatingageometricmeanofanentirerasterusingSciPy( )source

importscipy.statsrast_in='data/input_raster.tif'rast_as_numpy_array=arcpy.RasterToNumPyArray(rast_in)raster_geometric_mean=scipy.stats.stats.gmean(rast_as_numpy_array,axis=None)

Page 22: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

UseCase:BenthicTerrainModeler

Page 23: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

BenthicTerrainModelerAPythonAdd-inandPythontoolboxforgeomorphologyOpensource,canborrowcodeforyourownprojects:

Activecommunityofusers,primarilymarinescientists,butalsousefulforotherapplications

https://github.com/EsriOceans/btm

Page 24: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

LightweightSciPyIntegration

Usingscipy.ndimagetoperformbasicmultiscaleanalysisUsingscipy.statstocomputecircularstatistics

Page 25: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

LightweightSciPyIntegration

Examplesource

importarcpyimportscipy.ndimageasndfrommatplotlibimportpyplotasplt

ras="data/input_raster.tif"r=arcpy.RasterToNumPyArray(ras,"",200,200,0)

fig=plt.figure(figsize=(10,10))

Page 26: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

LightweightSciPyIntegration

foriinxrange(25):size=(i+1)*3print"running{}".format(size)med=nd.median_filter(r,size)

a=fig.add_subplot(5,5,i+1)plt.imshow(med,interpolation='nearest')a.set_title('{}x{}'.format(size,size))plt.axis('off')plt.subplots_adjust(hspace=0.1)prev=med

plt.savefig("btm-scale-compare.png",bbox_inches='tight')

Page 27: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s
Page 28: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

SciPyStatistics

Breakdownaspectintosin()andcos()variablesAspectisacircularvariable—withoutthis0and360areoppositesinsteadofbeingthesamevalue

Page 29: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

SciPyStatisticsSummarystatisticsfromSciPyincludecircularstatistics( ).source

importscipy.stats.morestats

ras="data/aspect_raster.tif"r=arcpy.RasterToNumPyArray(ras)

morestats.circmean(r)morestats.circstd(r)morestats.circvar(r)

Page 30: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Demo:SciPy

Page 31: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

MultidimensionalData

Page 32: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

NetCDF4Fast,HDF5andNetCDF4read+writesupport,OPeNDAPHeirarchicaldatastructuresWidelyusedinmeterology,oceanography,climatecommunitiesEasier:MultidimensionalToolbox,butcanbeuseful

( )Source

importnetCDF4nc=netCDF4.Dataset('test.nc','r',format='NETCDF4')printnc.file_format#outputs:NETCDF4nc.close()

Page 33: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

MultidimensionalImprovements

Multidimensionalformats:HDF,GRIB,NetCDFAccessviaOPeNDAP,vectorrenderer,RasterFunctionChaining

Multi-DsupportedasWMS,andinMosaicdatasets(10.2.1+)Anexamplewhichcombinesmutli-Dwithtime

Page 34: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Pandas

Page 35: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

PanelData—likeR"dataframes"BringarobustdataanalysisworkflowtoPythonDataframesarefundamental—treattabular(andmulti-dimensional)dataasalabeled,indexedseriesofobservations.

Page 36: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

( )Source

importpandas

data=pandas.read_csv('data/season-ratings.csv')data.columns

Index([u'season',u'households',u'rank',u'tv_households',\u'net_indep',u'primetime_pct'],dtype='object')

Page 37: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

majority_simpsons=data[data.primetime_pct>50]

seasonhouseholdstv_householdsnet_indepprimetime_pct0113.4m[41]92.151.680.7511741212.2m[n2]92.150.478.5046732312.0m[n3]92.148.476.5822783412.1m[48]93.146.272.7559064510.5m[n4]93.146.572.093023569.0m[50]95.446.171.032357678.0m[51]95.946.670.713202788.6m[52]97.044.267.584098899.1m[53]98.042.364.3835629107.9m[54]99.439.960.91603110118.2m[55]100.838.157.466063111214.7m[56]102.236.853.958944121312.4m[57]105.535.051.094891

Page 38: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Demo:Pandas

Page 39: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

SymPy

Page 40: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

AComputerAlgebraSystem(CAS),solvemathequations( )source

fromsympyimport*x=symbol('x')eq=Eq(x**3+2*x**2+4*x+8,0)

solve(eq,x)

Page 41: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Demo:SymPy

Page 42: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

WhereandHowFast?

Page 43: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

WhereCanIRunThis?Now:

ArcGISPro(64-bit)10.4:ArcMap,Server,both32-and64-bitenvironments

Bothnowshipwith (sansIPython)MKLenabledNumPyandSciPyeverywhere

Olderreleases:NumPy:ArcGIS9.2+,matplotlib:ArcGIS10.1+,SciPy:10.4+,Pandas:10.4+

CondaformanagingfullPythonenvironments,consumingandproducingpackagesWiththeArcGISAPIforPython!CanrunanywherePythonruns.

StandalonePythonInstallforPro

ScipyStack

Page 44: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

HowDoesItperform?BuiltwithIntel’s andcompilers—highlyoptimizedFortranandCunderthehood.Automatedparallelizationforexecutedcode

MathKernelLibrary(MKL)

MKLPerformanceChart

Page 45: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

fromfutureimport*

Page 46: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

OpeningDoorsMachinelearning(scikit-learn,scikit-image,...)Deeplearning(theano,...)Bayesianstatistics(PyMC)

MarkovChainMonteCarlo(MCMC)Frequentiststatistics(statsmodels)WithConda,notjustPython!tensorflow,manyothers

Page 47: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Resources

Page 49: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

NewtoPythonCourses:

Books:

ProgrammingforEverybodyCodecademy:PythonTrack

LearnPythontheHardWayHowtoThinkLikeaComputerScientist

Page 51: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ScientificCourses:

PythonScientificLectureNotesHighPerformanceScientificComputingCodingtheMatrix:LinearAlgebrathroughComputerScienceApplicationsTheDataScientist’sToolbox

Page 52: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ScientificBooks:

Free:

verycompellingbookonBayesianmethodsinPython,usesSciPy+PyMC.

ProbabilisticProgramming&BayesianMethodsforHackers

KalmanandBayesianFiltersinPython

Page 53: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ScientificPaid:

HowtouselinearalgebraandPythontosolveamazingproblems.

ThecannonicalbookonPandasandanalysis.

CodingtheMatrix

PythonforDataAnalysis:DataWranglingwithPandas,NumPy,andIPython

Page 54: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

PackagesOnlyrequireSciPyStack:

Scikit-learn:

IncludesSVMs,canusethoseforimageprocessingamongotherthings...

FilterPy,Kalmanfilteringandoptimalestimation:

Lecturematerial

FilterPyonGitHubAnextensivelistofmachinelearningpackages

Page 55: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Code

AnopensourcecollectionoffunctionchainstoshowhowtodocomplexthingsusingNumPy+scipyontheflyforvisualizationpurposes

withahandfulofdescriptivestatisticsincludedinPython3.4.TIP:WantacodebasethatrunsinPython2and3?

,whichhelpsmaintainasinglecodebasethatsupportsboth.Includesthefuturizescripttoinitiallyaprojectwrittenforoneversion.

ArcPy+SciPyonGithubraster-functions

statisticslibrary

Checkoutfuture

Page 56: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ScientificArcGISExtensions

CombinesPython,R,andMATLABtosolveawidevarietyofproblems

speciesdistribution&maximumentropymodels

PySALArcGISToolboxMovementEcologyToolsforArcGIS(ArcMET)MarineGeospatialEcologyTools(MGET)

SDMToolbox

BenthicTerrainModelerGeospatialModelingEnvironmentCircuitScape

Page 57: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ConferencesThelargestgatheringofPythonistasintheworld

AmeetingofScientificPythonusersfromallwalks

ThePythoneventforPythonandGeoenthusiasts

TalksfromPythonconferencesaroundtheworldavailablefreelyonline.

PyCon

SciPy

GeoPython

PyVideo

PyVideoGIStalks

Page 58: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

Closing

Page 59: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

ThanksGeoprocessingTeamThemanyamazingcontributorstotheprojectsdemonstratedhere.

Getinvolved!AllareonGitHubandhappilyacceptcontributions.

Page 60: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s

RateThisSessioniOS,Android:Feedbackfromwithintheapp

WindowsPhone,ornosmartphone?Cuneiformtabletsaccepted.

Page 61: Scientific Programming with the SciPy Stack · A Python Add-in and Python toolbox for geomorphology Open source, can borrow code for your own projects: ... The Data Scientist’s