ScientificProgrammingwiththeSciPyStackShaunWalbridge
KevinButler
https://github.com/scw/scipy-devsummit-2017-talk
HighQualityPDF(5MB)
ResourcesSection
ScientificComputing
Theapplicationofcomputationalmethodstoallaspectsoftheprocessofscientificinvestigation–dataacquisition,datamanagement,analysis,visualization,andsharingofmethodsandresults.
ScientificComputing
ExtendingArcGISArcGISisasystemofrecord.Combinedataandanalysisfrommanyfieldsandintoacommonenvironemnt.Whyextend?Can'tdoitall,wesupportover1000GPtools—enablingintegrationwithotherenvironmentstoextendtheplatform.
Python
WhyPython?Accessiblefornew-comers,andthe
Extensivepackagecollection(56kon ),broaduser-baseStronggluelanguageusedtobindtogethermanyenvironments,bothopensourceandcommercialOpensourcewithliberallicense—dowhatyouwant
BrandnewtoPython?ThistalkmaybechallengingResourcesincludematerialsthatforgettingstarted
mosttaughtfirstlanguageinUSuniversites
PyPI
PythoninArcGISPythonAPIfordrivingArcGISDesktopandServerAfullyintegratedmodule:importarcpyInteractiveWindow,PythonAddins,PythonTooboxesExtensions:
SpatialAnalyst:arcpy.saMapDocument:arcpy.mappingNetworkAnalyst:arcpy.naGeostatistics:arcpy.gaFastcursors:arcpy.da
ArcGISAPIforPython
PythoninArcGISPython3.5inPro( )
arcpy.mpinsteadofarcpy.mappingContinuetoaddmodules:NetCDF4,xlrd,xlwt,PyPDF2,dateutil,pip
,witha usingSciPyforontheflyvisualizations
DesktopvsProPython
Pythonrasterfunction repositoryofexamples
PythoninArcGISHere,focusonSciPystack,what’sincludedoutoftheboxMovetowardmaintainable,reusablecodeandbeyondthe“one-off”Recurringtheme:multi-dimensionaldatastructuresAlsosee whichcoversdaskBrendanCollinstalktomorrow
SciPy
WhySciPy?Mostlanguagesdon’tsupportthingsusefulforscience,e.g.:
VectorprimitivesComplexnumbersStatistics
Objectorientedprogrammingisn’talwaystherightparadigmforanalysisapplications,butistheonlywaytogoinmanymodernlanguagesSciPybringsthepiecesthatmatterforscientificproblemstoPython.
SciPyStack
IncludedSciPyPackage KLOC Contributors Stars
118 441 4909
7 75 1053
236 429 4011
183 408 8765
387 387 2930
243 443 3642
Totals 1174 1885
matplotlib
Nose
NumPy
Pandas
SciPy
SymPy
TestingwithNose—aPythonframeworkfortesting
Testsimproveyourproductivity,andcreaterobustcodeNosebuildsonunittestframework,extendsittomaketestingeasy.Pluginarchitecture, andcanbeextendedwith .
Nose
includesanumberofpluginsthird-partyplugins
1. Anarrayobjectofarbitraryhomogeneousitems2. Fastmathematicaloperationsoverarrays3. RandomNumberGeneration
,CC-BYSciPyLectures
ArcGIS+NumPyArcGISandNumPycaninteroperateonraster,table,andfeaturedata.SeeIn-memorydatamodel.Examplescriptto ifworkingwithlargerdata.
WorkingwithNumPyinArcGISprocessbyblocks
ArcGIS+NumPy
PlottinglibraryandAPIforNumPydataMatplotlibGallery
Computationalmethodsfor:
Integration( )Optimization( )Interpolation( )FourierTransforms( )SignalProcessing( )LinearAlgebra( )Spatial( )Statistics( )Multidimensionalimageprocessing( )
scipy.integratescipy.optimizescipy.interpolate
scipy.fftpackscipy.signal
scipy.linalgscipy.spatial
scipy.statsscipy.ndimage
SciPy:GeometricMeanCalculatingageometricmeanofanentirerasterusingSciPy( )source
importscipy.statsrast_in='data/input_raster.tif'rast_as_numpy_array=arcpy.RasterToNumPyArray(rast_in)raster_geometric_mean=scipy.stats.stats.gmean(rast_as_numpy_array,axis=None)
UseCase:BenthicTerrainModeler
BenthicTerrainModelerAPythonAdd-inandPythontoolboxforgeomorphologyOpensource,canborrowcodeforyourownprojects:
Activecommunityofusers,primarilymarinescientists,butalsousefulforotherapplications
https://github.com/EsriOceans/btm
LightweightSciPyIntegration
Usingscipy.ndimagetoperformbasicmultiscaleanalysisUsingscipy.statstocomputecircularstatistics
LightweightSciPyIntegration
Examplesource
importarcpyimportscipy.ndimageasndfrommatplotlibimportpyplotasplt
ras="data/input_raster.tif"r=arcpy.RasterToNumPyArray(ras,"",200,200,0)
fig=plt.figure(figsize=(10,10))
LightweightSciPyIntegration
foriinxrange(25):size=(i+1)*3print"running{}".format(size)med=nd.median_filter(r,size)
a=fig.add_subplot(5,5,i+1)plt.imshow(med,interpolation='nearest')a.set_title('{}x{}'.format(size,size))plt.axis('off')plt.subplots_adjust(hspace=0.1)prev=med
plt.savefig("btm-scale-compare.png",bbox_inches='tight')
SciPyStatistics
Breakdownaspectintosin()andcos()variablesAspectisacircularvariable—withoutthis0and360areoppositesinsteadofbeingthesamevalue
SciPyStatisticsSummarystatisticsfromSciPyincludecircularstatistics( ).source
importscipy.stats.morestats
ras="data/aspect_raster.tif"r=arcpy.RasterToNumPyArray(ras)
morestats.circmean(r)morestats.circstd(r)morestats.circvar(r)
Demo:SciPy
MultidimensionalData
NetCDF4Fast,HDF5andNetCDF4read+writesupport,OPeNDAPHeirarchicaldatastructuresWidelyusedinmeterology,oceanography,climatecommunitiesEasier:MultidimensionalToolbox,butcanbeuseful
( )Source
importnetCDF4nc=netCDF4.Dataset('test.nc','r',format='NETCDF4')printnc.file_format#outputs:NETCDF4nc.close()
MultidimensionalImprovements
Multidimensionalformats:HDF,GRIB,NetCDFAccessviaOPeNDAP,vectorrenderer,RasterFunctionChaining
Multi-DsupportedasWMS,andinMosaicdatasets(10.2.1+)Anexamplewhichcombinesmutli-Dwithtime
Pandas
PanelData—likeR"dataframes"BringarobustdataanalysisworkflowtoPythonDataframesarefundamental—treattabular(andmulti-dimensional)dataasalabeled,indexedseriesofobservations.
( )Source
importpandas
data=pandas.read_csv('data/season-ratings.csv')data.columns
Index([u'season',u'households',u'rank',u'tv_households',\u'net_indep',u'primetime_pct'],dtype='object')
majority_simpsons=data[data.primetime_pct>50]
seasonhouseholdstv_householdsnet_indepprimetime_pct0113.4m[41]92.151.680.7511741212.2m[n2]92.150.478.5046732312.0m[n3]92.148.476.5822783412.1m[48]93.146.272.7559064510.5m[n4]93.146.572.093023569.0m[50]95.446.171.032357678.0m[51]95.946.670.713202788.6m[52]97.044.267.584098899.1m[53]98.042.364.3835629107.9m[54]99.439.960.91603110118.2m[55]100.838.157.466063111214.7m[56]102.236.853.958944121312.4m[57]105.535.051.094891
Demo:Pandas
SymPy
AComputerAlgebraSystem(CAS),solvemathequations( )source
fromsympyimport*x=symbol('x')eq=Eq(x**3+2*x**2+4*x+8,0)
solve(eq,x)
Demo:SymPy
WhereandHowFast?
WhereCanIRunThis?Now:
ArcGISPro(64-bit)10.4:ArcMap,Server,both32-and64-bitenvironments
Bothnowshipwith (sansIPython)MKLenabledNumPyandSciPyeverywhere
Olderreleases:NumPy:ArcGIS9.2+,matplotlib:ArcGIS10.1+,SciPy:10.4+,Pandas:10.4+
CondaformanagingfullPythonenvironments,consumingandproducingpackagesWiththeArcGISAPIforPython!CanrunanywherePythonruns.
StandalonePythonInstallforPro
ScipyStack
HowDoesItperform?BuiltwithIntel’s andcompilers—highlyoptimizedFortranandCunderthehood.Automatedparallelizationforexecutedcode
MathKernelLibrary(MKL)
MKLPerformanceChart
fromfutureimport*
OpeningDoorsMachinelearning(scikit-learn,scikit-image,...)Deeplearning(theano,...)Bayesianstatistics(PyMC)
MarkovChainMonteCarlo(MCMC)Frequentiststatistics(statsmodels)WithConda,notjustPython!tensorflow,manyothers
Resources
OtherSessions—
tomorrow10:30inMesquiteG-H
—stickaround,inthisroomin30min!
earliertoday,
yesterday,
ExploringContinuumAnalytics'OpenSourceOfferings
GettingDataSciencewithRandArcGIS2016video
IntegratingOpen-sourceStatisticalPackageswithArcGIS2016video
HarnessingthePowerofPythoninArcGISUsingtheCondaDistribution 2016video
NewtoPythonCourses:
Books:
ProgrammingforEverybodyCodecademy:PythonTrack
LearnPythontheHardWayHowtoThinkLikeaComputerScientist
GISFocusedPythonScriptingforArcGISArcPyandArcGIS-GeospatialAnalysiswithPythonPythonDevelopersGeoNetCommunityGISStackexchange
ScientificCourses:
PythonScientificLectureNotesHighPerformanceScientificComputingCodingtheMatrix:LinearAlgebrathroughComputerScienceApplicationsTheDataScientist’sToolbox
ScientificBooks:
Free:
verycompellingbookonBayesianmethodsinPython,usesSciPy+PyMC.
ProbabilisticProgramming&BayesianMethodsforHackers
KalmanandBayesianFiltersinPython
ScientificPaid:
HowtouselinearalgebraandPythontosolveamazingproblems.
ThecannonicalbookonPandasandanalysis.
CodingtheMatrix
PythonforDataAnalysis:DataWranglingwithPandas,NumPy,andIPython
PackagesOnlyrequireSciPyStack:
Scikit-learn:
IncludesSVMs,canusethoseforimageprocessingamongotherthings...
FilterPy,Kalmanfilteringandoptimalestimation:
Lecturematerial
FilterPyonGitHubAnextensivelistofmachinelearningpackages
Code
AnopensourcecollectionoffunctionchainstoshowhowtodocomplexthingsusingNumPy+scipyontheflyforvisualizationpurposes
withahandfulofdescriptivestatisticsincludedinPython3.4.TIP:WantacodebasethatrunsinPython2and3?
,whichhelpsmaintainasinglecodebasethatsupportsboth.Includesthefuturizescripttoinitiallyaprojectwrittenforoneversion.
ArcPy+SciPyonGithubraster-functions
statisticslibrary
Checkoutfuture
ScientificArcGISExtensions
CombinesPython,R,andMATLABtosolveawidevarietyofproblems
speciesdistribution&maximumentropymodels
PySALArcGISToolboxMovementEcologyToolsforArcGIS(ArcMET)MarineGeospatialEcologyTools(MGET)
SDMToolbox
BenthicTerrainModelerGeospatialModelingEnvironmentCircuitScape
ConferencesThelargestgatheringofPythonistasintheworld
AmeetingofScientificPythonusersfromallwalks
ThePythoneventforPythonandGeoenthusiasts
TalksfromPythonconferencesaroundtheworldavailablefreelyonline.
PyCon
SciPy
GeoPython
PyVideo
PyVideoGIStalks
Closing
ThanksGeoprocessingTeamThemanyamazingcontributorstotheprojectsdemonstratedhere.
Getinvolved!AllareonGitHubandhappilyacceptcontributions.
RateThisSessioniOS,Android:Feedbackfromwithintheapp
WindowsPhone,ornosmartphone?Cuneiformtabletsaccepted.
Top Related