The Language, Libraries and Culture of Python in MeteorologySciPy in Py-ART The Language, Libraries...

32
The Language, Libraries and Culture of Python in Meteorology Jonathan Helmus Argonne Na*onal Laboratory, USA This presentation has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (Argonne). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE- AC02-06CH11357. This research was supported by the Office of Biological and Environmental Research of the U.S. Department of Energy as part of the Atmospheric Radiation Measurement Climate Research Facility.

Transcript of The Language, Libraries and Culture of Python in MeteorologySciPy in Py-ART The Language, Libraries...

  • The Language, Libraries and Culture of Python in Meteorology

    JonathanHelmus

    ArgonneNa*onalLaboratory,USA

    This presentation has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. This research was supported by the Office of Biological and Environmental Research of the U.S. Department of Energy as part of the Atmospheric Radiation Measurement Climate Research Facility.

  • Technology Powers Scientific Discovery

    §  Scien*ficadvancementsaredrivenbystate-of-the-arttechnologies.

    §  Newtechnologiesleadthewaytoscien*ficbreakthroughsandamorecompleteunderstandingoftheworldaroundus.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    2

  • Computers Prime Technology for Science

    §  Computersarethemostsignificanttechnologyavailabletoscien*ststoday.

    §  Theabilitytoperformfastcomputa*onshasvastlychangedmeteorologyandwillcon*nuetochangethefieldinthefuture.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    3

    Source:R.G.Strauchelal,JTech,1984,1,37ARMCSAPRRadarhRp://www.arm.gov/instruments/csapr

  • Programming: Controlling Computers for Science

    §  Thedailyopera*onofscienceinvolvescomputers.

    §  Computersrequirespecificdirec*onsintheformofsoUwaretorealizetheirpoten*altopowerscien*ficresearchandopera*ons.

    §  TheabilitytocreatesoUwarecustomizedtoaspecificscien*fictaskis

    oUenneeded.

    §  Wri*ngsoUwarecanbea*meconsumingandburdensometask.–  Scien*stses*matethat35%oftheirresearch*meisspentdevelopingsoUware[1].–  Mostscien*stslearntodevelopsoUwareinformallyfromtheirpeers[2].–  Thechoiceofprogramminglanguageaffectsthe*merequiredfordevelopment.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    4

    [1]P.Prabhu,etal,ASurveyofthePrac*ceofComputa*onalScience,2011,10.1145/2063348.2063374[2]J.E.Hannay,etal,HowDoScien*stsDevelopandUseScien*ficSoUware,2009,10.1109/SECSE.2009.5069155.

  • Python is Ideal for Writing Scientific Software

    §  Pythonistheidealprogramminglanguageforwri*ngsoUwaretomeetthecomputa*onalchallengesinmeteorologyandatmosphericscience.

    §  DevelopingsoUwareinPythonissimpleandfastdueto:

    –  Thedesignandphilosophyofthelanguage.

    –  Theavailabilityofanumberofhigh-quality,efficientscien*ficPythonlibraries.

    –  Thewelcoming,vibrant,andgrowingPythoncommunity.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    5

  • Python: the Language

    Python,theprogramminglanguageis:

    §  Ahigh-levellanguage.–  Allowsonetofocusonthethetaskconceptsratherthanthedetailsofthecomputer.–  Makesdevelopmentsimpleandquick.

    §  Interpretedandinterac*ve.–  Nocompila*onstep,speedsupdevelopmentcycle.–  Interac*veshell(REPL)fortes*nganddebugging.

    §  Generalpurpose.–  CanbeusedtowritesoUwareacrossawiderangeofapplica*ons,notdomainspecific.–  Largeruseranddeveloperbase.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA6

  • Python: the Language

    Pythonisanideallanguageforscien*ficprogrammingbecauseitis:§  Conciseandreadable

    §  OpenSourceandFree

    –  NocompanycontrolsthesoUware,nolicensingcosts.

    §  Hasalargenumberofthird-partylibraries.–  PyPIcontainsmorethan72,000packages.–  Manyscience-focusedlibrariesexistcoveringavarietyoffields.

    §  Easytointerfacewithexis*ngC,C++andFortrancode.–  ToolslikeCython,cffi,ctypes,andF2PYmakeinterfacingwithexis*ngcodesimple.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    7

    colors=['red','blue','green']forcolorincolors:print(color)

    @colors=("red","blue","green");foreach(@colors){print"$_\n”;}

    Python Perl

  • Python: the Challenges

    Pythondoesfacesomechallenges:

    §  Execu*onspeedisslowcomparedtosomeprogramminglanguages.–  Manytasksarenotlimitedbyexecu*onspeed.–  Op*onsexisttospeedupPythoncode.

    §  Thelanguageisnotthebestatsometypesofparallelcomputa*ons.–  TheGlobalInterpreterLock(GIL)poseslimitsofmul*-threadedperformance.–  Handlingconcurrencyisnotasrobustassomelanguages(Go,Erlang,Clojure).–  Supportforcomputa*ononGPUsandotherhighlyparallelarchitecturesislimitedand

    requiresaddi*onallibraries(Numba,PyCUDA)§  Thelanguageiscurrentlyinflux.

    –  Python3isbackwardsincompa*blewithPython2.–  SomelibrariesdonotsupportPython3.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA8

  • Scientific Python Libraries: Introduction

    TheScien*ficPythonEcosystemorSciPystackisacollec*onofopensourcesoUwareforscien*ficcompu*nginPython.Someofthecorepackagesare:§  NumPy:Providespowerful,efficientmul*-dimensionalarraysinPython§  SciPy:Fundamentalnumericalalgorithmsforcommontasksinscience.§  matplotlib:comprehensivepublica*on-quality2Dplolng

    §  Jupyter/IPython:Rich,interac*veinterfacesforprocessingdataandtes*ngideas.§  pandas:Highperformance,easytousedatastructures.§  SymPy:Symbolicmathema*csandcomputeralgebra

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    9

  • The Python ARM Radar Toolkit: Py-ART

    §  Py-ARTisamoduleforvisualizing,correc*ngandanalyzingweatherradardatausingpackagesfromthescien*ficPythonstack.

    §  DevelopmentbegantoaddresstheneedsoftheARMprogramwiththeacquisi*onofmulPplescanningcloudandprecipitaPonradars.

    §  Theprojecthassincebeenexpandedtoworkwithavarietyofweatherradars,includingNEXRAD,andhasawideuserbaseincludingradarresearchers,weatherenthusiastsandclimatemodelers.

    §  AvailableonGitHubasopensourcesoQware,arm-doe.github.io/pyart/.§  Condapackagesareavailableatanaconda.org/jjhelmus

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    10

  • Scientific Python Libraries: NumPy

    §  NumPyisthefundamentalpackageforscien*ficcomputa*oninPython.

    §  NumPyprovides:–  Powerful,efficient(fast)mul*-dimensionalarrayobject,thendarrayclass.–  RobustmethodsformanipulaPngthesearrays.–  Maskedandrecordarrayobjects.–  Rou*nesforlinearalgebra,Fouriertransform,andrandomnumbers.–  Comprehensive,well-wriRendocumenta*on,hRp://docs.scipy.org/doc/.

    §  Otherscien*ficPythonpackagesbuildonNumPytoaddaddi*onalfeaturesandabili*es.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    11

  • NumPy’s ndarray details

    §  NumPy’scorefunc*onalityisprovidedbyitsndarrayobject.§  Andarrayisahomogeneous,stridedviewonaconPguousblockof

    memory.

    §  Althoughsimple,thendarrayisapowerfulconstructastheloca*onofunderlyingmemorycanbepassedtootherlanguages(C,C++,CythonandFortran)withouttheneedtocopydata.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA12

    4bytes 4bytes 4bytes

    16bytes

    16bytes

    (3,4)ndarrayoffloat32strides(16,4)

  • ndarray views

    §  SlicingaNumPyndarrayalmostalwayscreatesa“view”ofthedata.

    §  Nocopyingofdataisneededwhenaccessingormodifyingviews.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    13

    4bytes 4bytes 4bytes

    16bytes

    16bytes

    4bytes 4bytes 4bytes

    8bytes

    16bytes

    a=np.zeros((3,4),np.float32)(3,4)ndarray

    a[0,:](4,)ndarray

    a[:2,::2](2,2)ndarray

  • NumPy in Py-ART

    NumPyisusedextensivelythroughoutPy-ARTasndarraysaretheprimaryobjectsusedtostoreandmanipulatenumericaldata.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    14

    elifdata_type_name=='PHIDP2':out[:]=360.*(data.view('uint16')-1.)/65534.mask[data.view('uint16')==0]=Trueelifdata_type_name=='HCLASS2':out[:]=data.view('uint16')elifdata_type_name=='XHDR':#Herewereturnanarraywiththetimesinmilliseconds.returndata[...,:2].copy().view('i4')#onebytedatatypeselifdata_type_name[-1]!='2':#makeaviewoflefthalfofthedataasuint8,#thisistheactualraydatacollected,therighthalfisblank.nrays,nbin=data.shapendata=data.view('(2,)uint8').reshape(nrays,-1)[:,:nbin]

  • NumPy in Py-ART

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    15

    #decoderunlengthencodingrle_size=radial_header['nbytes']*2rle=np.fromstring(buf2[pos:pos+rle_size],dtype='>u1')colors=np.bitwise_and(rle,0b00001111)runs=np.bitwise_and(rle,0b11110000)//16radial[:]=np.repeat(colors,runs)

  • Scientific Python Libraries: SciPy

    SciPyisacollec*onofmathema*calalgorithmsandfunc*onswhichbuilduponNumPytoprovideefficientsolu*onstocommonnumericaltasks.SciPyisdividedintosubpackageswhichcoveranumberofscien*ficdomains:§  Imageprocessing§  Signalprocessing§  Interpola*on§  Spa*aldatastructuresandalgorithms§  Clustering§  Numericalintegra*on§  Differen*alequa*ons§  Sta*s*cs§  Sparsematrices

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    16

  • SciPy in Py-ART

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    17

    SciPyfunc*onsforimageprocessing,numericalintegra*on,interpola*on,spa*alanalysisandsparsematrixstorageareallusedinPy-ART.

    importscipy.ndimage…def_find_regions(vel,gfilter,limits):"""Findregionsofsimilarvelocity."""mask=~gfilterlabel=np.zeros(vel.shape,dtype=np.int32)nfeatures=0forlmin,lmaxinzip(limits[:-1],limits[1:]):#findconnectedregionswithinthelimitsinp=(lmin

  • Scientific Python Libraries: matplotlib

    matplotlibisaplolnglibrarywhichworkswithNumPy.

    §  Comprehensive2D,publica*onqualityplots.–  Mul*pleplottypes:line,scaRer,image,contours,pseudocolor,…–  Manyoutputformats:png,jpg,svg,ps,pdf,…

    §  Alimitedsetof3Dplots.–  Line,scaRer,wireframe,tri-surface,contour,polygon,…

    §  Plotscanbeexaminedinterac*velyorembeddedinapplica*ons.

    –  ExploredatainaGUI–  ARTView:GUIviewerbuiltontopofthePy-ARTwhichembedsmatplotlibplots

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    18

  • matplotlib in Py-ART

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    19

    importmatplotlib.pyplotaspltimportpyartradar=pyart.io.read('sgpxsaprrhicmacI5.c0.20110524.015604_NC4.nc')fig=plt.figure(figsize=(12,3))display=pyart.graph.RadarDisplay(radar)display.plot('reflectivity_horizontal',vmin=-20,vmax=40,cmap='pyart_NWSRef',title='XSAPR',colorbar_label='Refl.(dBZ)')display.set_limits(ylim=(0,15),xlim=(-42,42))plt.tight_layout()plt.savefig('figure.png')

  • Scientific Python Libraries: Jupyter/IPython

    ProjectJupyter(previouslyIPython)isasetofrich,interac*veinterfacesandtoolsforprocessingdataandtes*ngideasinPython.§  UserInterfaces

    –  JuypterConsole:Terminalbasedinterac*vePythonenvironment–  JuypterNotebook:Webbasedplauormforauthoringrichdocuments–  Bothhaveexcellentintegra*onwithmatplotlib

    §  Kernels–  IPython:interac*vecompu*nginPython–  ipyparallel:Lightweightparallelcompu*ngwithnotebookintegra*on–  IJulia,IRKernel,IRuby,IPerl,…

    §  Manyotherinteres*ngtools:nbviewer,nbconvert,nbgrader,jupyterhub…

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    20

  • Jupyter/IPython in Py-ART

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    21

  • Scientific Python Libraries: Cython

    Cython§  PythontoCcodetranslator.§  GeneratesaPythonextensionmodule.§  CanbeusedtospeedupPythoncodeby

    addingsta*ctypeinforma*on.§  AlsocanbeusedtointeractwithC/C++

    func*onandclassesinexternallibraries.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    22

    Cfile(.c)

    CCompiler(gcc)

    Externallibraries(*.so,.a)

    Cythonfiles(.pyx,.pxd)

    Cython

    CompiledPythonmodule(.so)

  • Cython in Py-ART: wrapping libraries

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    23

    cdefexternfrom"rsl.h":ctypedefstructRadar:Radar_headerhVolume**vctypedefstructRadar_header:intmonth,day,yearinthour,minutefloatsec...ctypedefstructVolume:Volume_headerhSweep**sweep...

    ...

    Radar*RSL_anyformat_to_radar(char*infile)...

    voidRSL_free_volume(Volume*v)voidRSL_free_radar(Radar*r)

    cimport_rsl_hcdefclassRslFile:

    cdef_rsl_h.Radar*_Radarcdef_rsl_h.Volume*_Volumedef__cinit__(self,filename):self._Radar=_rsl_h.RSL_anyformat_to_radar(filename)ifself._RadarisNULL:raiseIOError('filecannotberead.')def__dealloc__(self):_rsl_h.RSL_free_radar(self._Radar)defget_volume(self,intvolume_number):rslvolume=_RslVolume()rslvolume.load(self._Radar.v[volume_number])returnrslvolume...propertymonth:def__get__(self):returnself._Radar.h.monthdef__set__(self,intmonth):self._Radar.h.month=month

    _rsl_h.pxd _rsl_interface.pyx

  • Cython in Py-ART: speeding up Python code

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    24

    107secondsvs.0.234seconds,x450performanceimprovement.

    @cython.boundscheck(False)@cython.wraparound(False)def_fast_edge_finder(int[:,::1]labels,float[:,::1]data,intrays_wrap_around,intmax_gap_x,intmax_gap_y,inttotal_nodes):"""Returnthegateindicesandvelocitiesofalledgesbetweenregions."""cdefintx_index,y_index,right,bottom,y_check,x_checkcdefintlabel,neighborcdeffloatvel,nvelcollector=_EdgeCollector(total_nodes)right=labels.shape[0]-1bottom=labels.shape[1]-1forx_indexinrange(labels.shape[0]):fory_indexinrange(labels.shape[1]):label=labels[x_index,y_index]iflabel==0:continuevel=data[x_index,y_index]#leftx_check=x_index-1ifx_check==-1andrays_wrap_around:x_check=right#wraparoundifx_check!=-1:neighbor=labels[x_check,y_index]nvel=data[x_check,y_index]...#addtheedgetothecollection(ifvalid)collector.add_edge(label,neighbor,vel,nvel)

  • More Scientific Python Libraries…

    §  pandas:datastructuresandanalysis§  xray:labeledarrayanddatasets§  Iris:meteorology/climatedatamodel

    §  basemap:plot2Ddataonmaps§  pyproj:cartographictransforma*ons§  Cartopy:cartographictoolsforPython

    §  netCDF4-python:ReadandwriteNetCDFfiles.§  h5py:ReadandwriteHDF5files

    §  scikit-learn:machinelearning§  scikit-image:imageprocessing§  Andmany,manymore…

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    25

  • Python Community: Online

    ThePythoncommunityprovidesawelcoming,vibrantandhelpfulculture.Muchofthecommunityinterac*onsoccuronline:§  Websites

    –  Python.org:Documenta*on,tutorial,PyPIandawiki.–  Otherwebsites,scipy.org,pyaos.johnny-lin.com

    §  Socialmedia–  Facebook,TwiRer,Google+,IRC,YouTube–  Blogs:planetpython.organdplanet.scipy.org.–  Podcasts:TalkPythontomeandpodcast.__init__

    §  Mailinglists–  NearlyalltheSciPystackpackageshavetheirownmailinglists–  PyAOSmailinglist(pyaos.johnny-lin.com)

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    26

  • Python Community: in Person

    ThePythoncommunityalsomeetsinperson.§  Conferences

    –  SciPy–  PyCon–  Localandspecializedconferences(PyData,AMS)–  ConferencetalksoUenavailableatpyvideo.org

    §  Localusergroups–  wiki.python.org/moin/LocalUserGroups

    §  Meetupsandhackathons.–  python.meetup.com

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA27

  • Python Community: for Developers

    §  Mailinglists–  Manyprojectshavea–devmailinglist

    §  Socialcodingsites–  GitHub–  Bitbucket

    §  NumFocus

    §  Scien*ficPythonfocusedcompanies–  Con*nuumAnaly*cs–  Enthought

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    28

  • My Own Path Through the Python Community

    §  Undergraduate-ChemistryatMichiganTechnologicalUniversity

    §  Ph.D.atTheOhioStateUniversity-ChemicalPhysics.–  LearnedtoprograminPython–  Wrotenmrglue,alibraryforworkingwithNMRdatainPython

    §  Post-docatUConnHeathCenter–  Con*nuedtoprograminPython–signalprocessingforNMR–  ARendedfirstSciPyconference

    §  AdvancedAlgorithmsEngineeratArgonneNa*onalLaboratory–  DevelopmentleadofthewidelyusedopensourcePy-ARTproject.–  Contribu*ngtootherlibrariesintheSciPystack.–  ARendandat*mespresentatSciPy,PyCon,ChiPy,…

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA 29

  • Careers in Meteorology with Python Programming

    Doyouenjoyprogramming?Solvingproblems?Python?§  HighPerformanceCompu*ng(HPC)§  Informa*onProcessingTechnologies§  Instrumenta*onandprocessing§  DataAssimila*on§  NumericalWeatherPredic*on§  Climateandatmosphericmodeling

    Regardlessofthecareerpathyouchoose,learningPythonwillenhanceyourpoten*al.

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    30

  • Ques*ons?

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA 31

  • Resources for Learning Python

    §  Websites–  Python.orgtutorial–  ScipyLectureNotes,www.scipy-lectures.org

    §  Books–  “ThinkPython:HowtoThinkLikeaComputerScien*st”byAllenB.Downey–  “Effec*veComputa*oninPhysics:FieldGuidetoResearchwithPython”byAnthony

    Scopatz,KathrynD.Huff

    §  Inperson–  Tutorialsatconference(PyCon,SciPy,AMS)–  SoUwareCarpentry,soUware-carpentry.org

    §  Videos–  ManyPythonconferencesmakestheirtutorialsavailableonYouTubeorpyvideo.org

    TheLanguage,LibrariesandCultureofPythoninMeteorology,J.Helmus,AMS2016,NewOrleans,LA

    32