Opportunities for software engineering practices in deploying environmental models to cloud computing architectures
Faiza Samreen, Will Simm, Richard Bassett, Gordon Blair, Paul Young and Ensemble Team
Vision: Models in the Cloud
Series of semi-structured interviews identified challenges in:
• Systems administration
• Data storage/exchange
• Alternative architectures
• Monolithic code
• Interfacing and reuse
• Version control
Challenges for environmental modellers
“I’ve never been taught to do anything on a computer apart from at school”
“By the time you’ve written ten lines of code, that quickly becomes a hundred, it quickly becomes, ‘Well, we can’t start again now.’”
“It took me ages to read through the Fortran file”
“If you need to repeat it then copy and paste.”
• Abstraction: no need to become systems administrators
• Framework Support: separation of scientific tasks into components
• Flexible and emergent architectures: on-demand cloud computing
• Education: an understanding of software engineering may help structure models in a more scalable, reusable, fault-tolerant manner
Opportunities for environmental science
Allow us [the environmental scientist] to spend more time concentrating on science
Model a,b,c,…,n
Parameter a,b,c,…,n
e.g. 10 models. Each model has 10 parameters with 10 possibilities. Each simulation takes 10 hours using 64 CPUs.
= 417 days…
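The 417-day figure can be reproduced with simple arithmetic. The sketch below assumes that parameters are varied one at a time (10 models × 10 parameters × 10 values = 1,000 simulations) and that the runs execute back to back on a single 64-CPU allocation; the variable names are illustrative and not taken from the slides.

```python
# Ensemble cost arithmetic from the slide.
# Assumption: parameters are varied one at a time, so the run count is
# models * parameters * values (not a full factorial sweep).
models = 10
parameters_per_model = 10
values_per_parameter = 10
hours_per_simulation = 10
cpus_per_simulation = 64

simulations = models * parameters_per_model * values_per_parameter
wall_clock_hours = simulations * hours_per_simulation  # runs executed serially
wall_clock_days = wall_clock_hours / 24
cpu_hours = wall_clock_hours * cpus_per_simulation

print(f"{simulations} simulations")
print(f"{wall_clock_hours} hours = {wall_clock_days:.0f} days")
print(f"{cpu_hours} CPU-hours")
```

If enough resources were available to run the simulations concurrently, the CPU-hours would stay the same while the wall-clock time would shrink towards the 10 hours of a single run.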
Cloud Computing
The provision of shared (rented) computing resources over the internet
• More than just personal storage (e.g. Dropbox, iCloud)
• Access to dedicated resources (e.g. GPUs), storage and software
• Elastic and convenient (i.e. only pay for what you use, instantly available and no limitations on available resources)
http://www2.mmm.ucar.edu/wrf/WG2/wrf_moving_nest.gif
“Exploratory” - current lack of:
• awareness of the potential of cloud computing
• skills to exploit cloud facilities
• tools and frameworks to support cloud deployment
Status of cloud computing within environmental science
a) Demonstrate the suitability of a cloud platform by deploying a complex environmental model
b) Raise the level of abstraction
Approaches to facilitate clouds
Scientist
Models, Data, Programming
Libraries, Compilers, Databases
Hardware components
Platforms
Operating systems
• Free-to-use community numerical weather prediction model
• Wide range of applications e.g. air quality (WRF-CHEM) and hydrology (WRF-HYDRO)
a) Demo: WRF in the Cloud
However
• Although portable, realistically a high-performance computer is needed
• Steep learning curve, particularly for model installation
Lagos, Nigeria 2m urban heat island
• Series of automated scripts to configure and install WRF on Microsoft Azure’s cloud, including all dependencies
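One thing an automated installer has to get right is build order, because each library must exist before anything that links against it. The sketch below is not the authors' script: the dependency map is a hypothetical, simplified list of libraries that WRF and WPS are commonly built against, used only to illustrate how a script might derive a valid order.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key is built after the packages in its set.
# The package list is an assumption for illustration, not the contents of the
# deployment scripts described on the slide.
DEPENDENCIES = {
    "zlib": set(),
    "libpng": {"zlib"},
    "jasper": set(),
    "mpi": set(),
    "hdf5": {"zlib"},
    "netcdf": {"hdf5"},
    "wrf": {"netcdf", "mpi"},
    "wps": {"wrf", "libpng", "jasper"},
}


def install_order(deps):
    """Return a build order in which every package follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = install_order(DEPENDENCIES)
for step, package in enumerate(order, start=1):
    print(f"{step}. build {package}")
```

Declaring dependencies as data and deriving the order keeps the script maintainable when a new library is added, compared with a hand-ordered list of install commands.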
Demo: WRF in the Cloud
Demo: WRF in the Cloud
Cost and performance of running WRF simulation on Azure cluster
WRF simulation execution time over Azure cloud and HEC
Abstraction
• However, our WRF deployment is simply a demonstration of what is possible and still needs abstraction, i.e. clouds are complicated
Typical model users
A. New users, e.g. Master's-level students who may take 3 months to learn how to install and configure the model before doing any science
B. Those who just want to run the model in a standard way and get some results to feed into other models or projects
C. Power users who may want to quickly deploy for a project without wanting to wait for an HPC queue or make changes to the model code
Abstraction
Leads to two key areas for abstraction:
1. Deployment of models to appropriate software/hardware
2. Experiment description
[Figure: layered stack (Scientist; Models, Data, Programming; Libraries, Compilers, Databases; Hardware components; Platforms; Operating systems) annotated with abstraction areas 1 and 2]
Abstraction tools: MDE? DSL?
• Model Driven Engineering (MDE) is a software development methodology that creates and exploits domain models (i.e. conceptual models of all the topics related to a problem)
• A Domain Specific Language (DSL) is a language that is specialized to capture concerns of a specific domain (i.e. specifically written to solve a given problem)
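To make the DSL idea concrete, the sketch below shows a tiny embedded (internal) DSL in Python for describing an experiment. It is purely illustrative: the `Experiment` class, the `vary` method and the parameter names `resolution_km` and `urban_scheme` are hypothetical and do not come from the project's own tooling.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Experiment:
    """Hypothetical fluent builder acting as a tiny embedded DSL."""

    model: str
    sweeps: dict = field(default_factory=dict)

    def vary(self, parameter, values):
        """Declare a parameter and the values it should take."""
        self.sweeps[parameter] = list(values)
        return self  # returning self allows chained, sentence-like calls

    def runs(self):
        """Expand the description into one dict per simulation (full factorial)."""
        names = list(self.sweeps)
        return [dict(zip(names, combo)) for combo in product(*self.sweeps.values())]


# Illustrative parameter names and values; not real WRF options.
experiment = (
    Experiment("WRF")
    .vary("resolution_km", [1, 3, 9])
    .vary("urban_scheme", ["off", "on"])
)
for run in experiment.runs():
    print(experiment.model, run)
```

The scientist states what should vary; expanding that description into individual runs is left to the framework, which is the separation of concerns that a DSL aims for.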
WRF Abstraction
MDE WRF Deployment Schematic
[Figure: schematic with the following labels: Knowledgebase; Fortran Namelist Generator; Jupyter Inline Ensemble DSL; Namelist.WPS; Namelist.Input; WRF.conf; Class; Config.DSL; Machine Learning; Tacit Knowledge; CloudML; Informed Choice; Supported Architecture; Cost/Time Estimator; Deploy; Analysis: Success? Time? Cost?]
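The schematic includes a Fortran namelist generator that produces files such as Namelist.Input. A minimal sketch of what such a generator might look like is given below, assuming the higher-level configuration is available as a nested dictionary. The function names are hypothetical, and the values are illustrative rather than a working WRF configuration; the group and key names (`time_control`, `domains`, `run_hours`, `e_we` and so on) are standard entries in WRF's `namelist.input`.

```python
def format_value(value):
    """Render a Python value using Fortran namelist literal syntax."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return ".true." if value else ".false."
    if isinstance(value, str):
        return f"'{value}'"
    if isinstance(value, (list, tuple)):
        return ", ".join(format_value(v) for v in value)
    return str(value)


def to_namelist(groups):
    """Convert {group: {key: value}} into Fortran namelist text."""
    lines = []
    for group, entries in groups.items():
        lines.append(f"&{group}")
        for key, value in entries.items():
            lines.append(f" {key} = {format_value(value)},")
        lines.append("/")
        lines.append("")
    return "\n".join(lines)


# Illustrative values only; not a validated WRF setup.
config = {
    "time_control": {"run_hours": 24, "history_interval": 60},
    "domains": {"max_dom": 2, "e_we": [100, 121], "e_sn": [100, 121], "dx": 9000},
}
namelist_text = to_namelist(config)
print(namelist_text)
```

Generating the namelist from structured data means a user edits a small, validated description instead of a long Fortran-style text file by hand.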
• Interviewed scientists to understand levels of software engineering in environmental science
• Found opportunities for software engineering within environmental science
• Prototyped a model deployment in the cloud
Summary A
Abstraction can support:
• Interoperability, scaling, accessibility, democratization & open science
Allowing:
• Less systems administration, more time for science
• More model runs and easy orchestration of ensembles
• Pay-per-simulation, no queue times
Now:
• Building tools and frameworks to leverage new technologies
Summary B
Thank you
@EnsembleLancs
https://www.ensembleprojects.org/
[email protected]