D3.1. TANGO Toolbox - Alpha version Version: v1.3 – Final, Date: 23/12/2016
TANGO Consortium 2016
Page 1 of 39
Transparent heterogeneous hardware Architecture deployment for eNergy Gain in Operation
D3.1 TANGO Toolbox
Alpha version (year-1)
Lead Editor Jean-Christophe DEPREZ (CETIC)
Authors Jean-Christophe DEPREZ (CETIC), Lotfi Guedria (CETIC), Renaud De Landtsheer (CETIC), David Garcia Perez (ATOS), Roi Sucasas Font (ATOS), Richard Kavanagh (ULE), Jorge Ejarque (BSC), Yiannis Georgiou (BULL)
Version 1.3
Reviewers Bruno Wery (DELTATEC), Yiannis Georgiou (BULL)
Work package WP 3
Due date 31/12/2016
Submission date 23/12/2016
Distribution level (CO, PU): PU – Software (and associated report)
Abstract This document provides the installation manuals for the various software packages found in the TANGO architecture. The results achieved at the end of Year-1 make it possible to use the development tools to shape application code and then to run applications on heterogeneous hardware, so that time and energy consumption profiles can be collected. These profiles allow developers to establish the desired benchmarks and assist them in making requirements, design and coding decisions.
Keywords TANGO, software, framework, installation, manual
Licensing information: Each component is delivered under its own open source license, specified in each code file's headers.
This report is licensed under Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
http://creativecommons.org/licenses/by-sa/3.0/
Document Description
Document Revision History
Version | Date | Description of change | Modified by
v0.1 | 2016/11/25 | First draft version | Jean-Christophe DEPREZ (CETIC)
v0.2 | 2016/12/12 | Initial integration of Programming Model, Code Optimiser, Energy Modeller, Energy probes, Application Life-cycle Deployment Engine | ULE, ATOS, BSC
v0.3 | 2016/12/13 | Introduction, Conclusion, Executive Summary | Jean-Christophe DEPREZ (CETIC)
v0.4 | 2016/12/15 | Integrate Design-Time Characteriser (Poroto) | Jean-Christophe DEPREZ and Lotfi GUEDRIA (CETIC)
v1.0 | 2016/12/16 | Integrate SLURM input from Yiannis and finalise layout and ToC | Jean-Christophe DEPREZ (CETIC) and Yiannis Georgiou (BULL)
v1.1 | 2016/12/18 | Initial set of basic review comments from Bruno handled | Jean-Christophe DEPREZ (CETIC)
v1.2 | 2016/12/20 | Updates by partners to address Bruno's review comments | CETIC, BSC, ULE, ATOS, BULL
v1.3 | 2016/12/20 | Updates to address Yiannis's review comments | Jean-Christophe DEPREZ (CETIC)
Table of Contents
Table of Contents .......................................................................................................................... 3
Table of Figures ............................................................................................................................. 5
Terms and abbreviations ............................................................................................................... 6
Executive Summary ....................................................................................................................... 7
1 Introduction .......................................................................................................................... 8
1.1 About this deliverable ................................................................................................... 8
1.2 Document structure ...................................................................................................... 8
2 Installation and Configuration Guide for development tools ............................................... 9
2.1 Installation and Configuration of Requirements and Modelling Tooling .................... 10
2.1.1 Installation and Configuration of the Design Time Characteriser ....................... 10
2.1.1.1 Overview and context ..................................................................................... 10
2.1.1.2 Installation ....................................................................................................... 11
2.1.1.3 Example of tool usage ..................................................................................... 12
2.1.2 Installation and Configuration of the Design and Development Time Optimiser 14
2.1.2.1 System requirements and Software Dependencies ........................................ 14
2.1.2.2 Installing Placer (Year-1 Implementation of the Design-time Optimiser) ....... 14
2.1.2.3 Running Placer ................................................................................................. 14
2.2 Installation and Configuration of Programming Model Tooling ................................. 15
2.2.1 System Requirements and Software Dependencies: .......................................... 15
Common: ......................................................................................................................... 15
--with-monitor option: .................................................................................................... 16
--with-tracing option ....................................................................................................... 16
2.2.2 Installation Instructions ....................................................................................... 16
2.2.3 Application Development Overview ................................................................... 16
2.2.4 Application Compilation ...................................................................................... 17
2.2.5 Application Execution .......................................................................................... 17
2.2.6 Known Limitations ............................................................................................... 18
2.3 Installation and Configuration of Code Optimiser Tooling ......................................... 18
2.3.1 Platforms Supported ........................................................................................... 18
2.3.2 Software Pre-requisites and Dependencies ........................................................ 19
2.3.3 Installation Instructions ....................................................................................... 19
2.3.4 Configuration ....................................................................................................... 19
3 Installation and Configuration Guide for runtime software packages ................................ 20
3.1 Installation and Configuration of Extra Energy Probes ............................................... 21
3.1.1 Nvidia GPUs ......................................................................................................... 21
3.1.1.1 Supported OS platforms and products............................................................ 21
3.1.1.2 Example: .......................................................................................................... 22
3.1.2 NVIDIA Plugins ..................................................................................................... 22
3.1.2.1 Collectd ............................................................................................................ 22
3.1.2.2 Slurm ............................................................................................................... 22
3.1.2.3 Installation and configuration ......................................................................... 24
3.2 Installation and Configuration of SLURM .................................................................... 24
3.2.1 Platforms Supported ........................................................................................... 24
3.2.2 Software Pre-requisites and Dependencies ........................................................ 24
3.2.3 Installation instructions ....................................................................................... 24
3.2.4 Slurm accounting and profiling framework......................................................... 25
3.2.5 SLURM Key Functions .......................................................................................... 27
3.2.6 SLURM Components ............................................................................................ 27
3.2.6.1 SLURMCTLD ..................................................................................................... 28
3.2.6.2 SLURMD ........................................................................................................... 29
3.2.6.3 SlurmDBD (SLURM Database Daemon) ........................................................... 29
3.3 Installation and Configuration of Energy Modeller ..................................................... 30
3.3.1 Minimal System Requirements ........................................................................... 30
3.3.2 Platforms Supported ........................................................................................... 30
3.3.3 Software Pre-requisites and Dependencies ........................................................ 30
3.3.4 Installation Instructions ....................................................................................... 31
3.3.5 Using the standalone calibrator .......................................................................... 34
3.3.5.1 Apps.csv ........................................................................................................... 35
3.3.5.2 Using the Watt Meter Emulator ...................................................................... 35
3.3.5.3 watt-meter-emulator.properties .................................................................... 36
3.4 Installation and Configuration of Application Life-cycle Deployment Engine ............ 36
3.4.1 System Requirements ......................................................................................... 36
3.4.2 Installation and configuration ............................................................................. 36
3.4.3 API Documentation ............................................................................................. 38
4 Conclusions ......................................................................................................................... 39
Table of Figures
FIGURE 1: GENERAL TANGO ARCHITECTURE WITH DEVELOPMENT TOOLS IN RED BOXES. .............................. 9
FIGURE 2: SCREENSHOT OF EXAMPLE MODEL USING PLACER. .................................................. 15
FIGURE 3: GENERAL TANGO ARCHITECTURE WITH OPERATION SOFTWARE COMPONENTS IN RED BOXES. ...... 20
FIGURE 4: SLURM SIMPLIFIED ARCHITECTURE. .................................................... 28
Terms and abbreviations
ALDE Application Lifecycle Deployment Engine
API Application Programming Interface
BMC Baseboard Management Controller
BSD Berkeley Software Distribution
COP Code Optimiser
(C)OMPSs (Cloud) Open MP Superscalar (from BSC)
CPU Central Processing Unit
CUDA Compute Unified Device Architecture
FPGA Field Programmable Gate Array
GID Group Identification (in *nix OS)
GPU Graphics Processing Unit
DTC Design-Time Characteriser
DTO Design-Time Optimiser
EC European Commission
IDE Integrated Development Environment
IPMI Intelligent Platform Management Interface
PCIe Peripheral Component Interconnect express
RAPL Running Average Power Limit
REST Representational State Transfer
RIFFA Reusable Integration Framework for FPGA Accelerators
ROCCC Riverside Optimizing Compiler for Configurable Circuits
SLURM Simple Linux Utility for Resource Management
UID User Identification (*nix OS)
VHDL VHSIC Hardware Description Language
Executive Summary
This document (D3.1) accompanies the software release of the TANGO project at the end of Year 1 under work package 3. It presents the installation manuals of the different software components found in the overall TANGO architecture. First, it presents the installation of development tools and then of the software to install on the operational/runtime infrastructure of heterogeneous hardware.
At this stage, implementation has not started for all components. However, the available component implementations provide the necessary and sufficient basis to profile the time and energy performance of an application, so as to define the desired benchmarks and help developers make decisions on requirements, design, and coding.
This document remains focused on the technical details of installing the TANGO components. The scientific contributions achieved at the end of Year 1 relying on the various TANGO software components are presented in Deliverable D3.2.
In the following years, the scientific and technical effort will continue under work package 4 in Year 2 and work package 5 in Year 3. While Year-1 focused on providing the necessary software to profile time and energy performance and obtain static benchmarking information on applications run on a heterogeneous hardware architecture, the effort in Year-2 will further integrate the TANGO components to make it possible to explore trade-offs on additional non-functional behaviours such as security, robustness and maintainability. Finally, during Year 3, the focus will be on enhancing programmer productivity.
1 Introduction
TANGO develops software to facilitate exploiting heterogeneous hardware architectures in order to develop applications that meet time and energy performance goals while also satisfying security and dependability requirements. To achieve this goal, WP3, which runs during the first year of the TANGO project, proposes to develop different tools. Although these tools have not reached their final implementation state, this deliverable presents their initial installation guides.
1.1 About this deliverable
This document presents information related to the software packages delivered as part of deliverable D3.1 of the TANGO project. It is an accompanying document that describes the installation of the different software tools used at development time by development teams and of the different software packages to install on the operational infrastructure to run and operate applications.
Next to this technical installation guide, deliverable D3.2 presents the scientific report on these different software tools and packages as well as an initial set of benchmarks obtained on the actual two TANGO testbeds.
1.2 Document structure
The first part of this document presents the installation manuals of the software tools for the development team, while the second part describes how to install the various software packages to obtain an operational TANGO infrastructure.
2 Installation and Configuration Guide for development tools
From its general architecture defined in D2.1 and recalled in Figure 1, the TANGO framework contains the development tools highlighted in red boxes:
- Requirements and Modelling tools, to model, characterise and optimise the granularity and the placement of software components on a heterogeneous infrastructure at design (and deployment) time
- Programming Model plugins, to facilitate specifying OmpSs and COMPSs tasks in the application source code
- Code Optimiser (for energy consumption), to assist developers with profiling energy consumption at the source code level
- Device Emulator, which can potentially be used at development or at deployment time to obtain time and energy performance data without actually running application code on a real infrastructure
Figure 1: General TANGO Architecture with development tools in Red boxes.
In Year-1, an initial implementation is provided for the Requirements and Modelling tools, the Programming Model plugins and the Code Optimiser. Each development tool has its installation manual presented in the subsections below. The Device Emulator will only start being implemented in Year-2.
2.1 Installation and Configuration of Requirements and Modelling Tooling
The Requirements and Design Modelling toolbox is composed of the following sub-components:
- a Design-Time Characteriser (DTC) for application time and energy performance on FPGA
- a Design and deployment Time Optimiser (DTO) to optimise the placement and execution of tasks on a heterogeneous infrastructure
- a graphical modelling plugin to facilitate specifying characterisation data on an application for the design and deployment time optimiser. As of Year-1, this graphical modelling component has not been implemented yet.
The Requirements and Modelling sub-components, in particular the Design-Time Characteriser and the Design and deployment Time Optimiser, work as independent tools. Once their implementations converge and their input/output relation is better understood, the graphical modelling plugin will provide a high-level integration. As of Year-1, the installation manuals of the DTC and the DTO are presented in their own independent sub-sections below.
2.1.1 Installation and Configuration of the Design Time Characteriser
2.1.1.1 Overview and context
Currently, the proposed design-time characterisation process leverages the Poroto tool developed by CETIC (https://github.com/cetic/poroto). The core of this tool is licensed under the 3-clause BSD license.
The target hardware for Poroto is typically a PCIe-enabled workstation hosting an FPGA-based accelerator board.
The tool enables the generation of an FPGA design implementing a computation that the user defines in his code. It assumes the input code is provided as separate C files for execution on the CPU and the FPGA respectively. The C files targeting the FPGA implement functions that will be translated to VHDL, compiled, synthesised and programmed onto the target FPGA accelerator board. The code on the CPU side can then very easily be adapted to substitute the call to the original function by a call to a wrapper module generated by the tool. The wrapper encapsulates calls to the FPGA board driver API, which essentially implements data transfer through the PCIe interface.
The Poroto tool was initially designed around a proprietary driver from AlphaData. It was recently extended to support a more generic interface through the integration of the RIFFA framework. RIFFA (Reusable Integration Framework for FPGA Accelerators) is a simple framework developed at UC San Diego (http://riffa.ucsd.edu/) to support communicating data from a host CPU to an FPGA via a PCI Express bus. The integration of the RIFFA framework makes it possible to address FPGA accelerators regardless of the FPGA family (Xilinx or Altera), provided that the PCIe IP is available for the target architecture. Poroto also includes native support for GHDL (a VHDL simulator), which greatly facilitates testing and accelerates the validation of the selected FPGA computation by targeting a virtual FPGA before integrating it as a real hardware acceleration for a software application running on the CPU. GHDL support within Poroto allows a fast evaluation and characterisation process where compiling, testing and validating the selected computation is highly optimised and automated.
Poroto is implemented in the Python programming language and realises the following compilation steps for a rapid evaluation of offloading computation kernels onto an FPGA board:
- The source C code of the computation to be offloaded is parsed and adapted to be fed to the underlying ROCCC tool (a compiler that generates VHDL from C). This step takes into account various configurations and constraints of ROCCC. In case the tool cannot infer some characteristics of the input code, low-level pragmas are used to guide the code generation.
- The ROCCC compilation process is automated.
- VHDL code for interfaces, memories and FPGA glue logic is generated: interfacing leverages either the generic RIFFA framework IP or the vendor-specific AlphaData logic.
- Test set-ups are generated for the FPGA design and also for the CPU design that will exploit the offloaded computation.
- The compilation process for the FPGA is automated (target dependent, in our case: Xilinx synthesis tools).
- The CPU code that interfaces with the FPGA implementation is generated (sending the bit stream to program the FPGA, transferring the data, triggering execution of the offloaded design and retrieving the result data).
2.1.1.2 Installation
The Poroto tool is intended to be installed and run on the following platforms:
- recent Linux distributions such as Debian Jessie (or later) or Ubuntu 14.04 (or higher)
- recent OS X based machines: Mac OS X Yosemite
2.1.1.2.1 Dependencies
The Poroto tool has the following open-source dependencies:
- the Python interpreter (version 2.7 or above)
- the PyCParser and PLY libraries (included in Poroto)
- the ROCCC compiler (version 0.7.6)
- the GCC compiler
- the RIFFA framework (version 2.0 or above, optional)
- GHDL (version 0.31 or above, optional)
Besides leveraging the ROCCC compiler for the generation of VHDL code, the tool makes use of proprietary components associated with the target platforms, like the AlphaData board (ADM-XRC-6T1):
- AlphaData VHDL library, C SDK and driver
- Xilinx PlanAhead compilation suite
- Xilinx IP cores for the generation of memory blocks, FIFOs and computational IP (integer multiplication and division, floating point support, ...)
The above tools and APIs are not packaged with the tool and should be acquired and installed separately by the user. The templates related to these proprietary tools are not provided by default within Poroto but can be disclosed to interested parties who have a similar platform and valid licenses for the associated proprietary tools.
2.1.1.2.2 Instructions
The tool does not require any installation steps. To launch the tool, the following two environment variables must be set:
- ROCCC_ROOT: points to the top directory of the ROCCC tool
- POROTO_ROOT: points to the top directory of the Poroto tool
In order to simplify usage, a Makefile support file is included in the Poroto distribution to set up the correct environment and select the right configuration parameters.
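For instance, the two variables could be set as follows before invoking the tool; the paths below are purely illustrative and must be adapted to the local installation:

```shell
# Illustrative paths only -- substitute your actual install locations.
export ROCCC_ROOT=/opt/roccc-0.7.6     # top directory of the ROCCC tool
export POROTO_ROOT="$HOME/tools/poroto" # top directory of the Poroto tool
```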
2.1.1.3 Example of tool usage
After installing the Poroto software and all its dependencies, one can use the simple demo provided to check the toolchain. The demo is a simple code that calculates the element-by-element sum of two vectors. The demo source code, available in the tool's demo directory, is:
#pragma poroto memory test_A int 100
#pragma poroto memory test_B int 100
#pragma poroto memory test_Out int 100
#pragma poroto stream::roccc_bram_in VectorAdd::A(test_A, N)
#pragma poroto stream::roccc_bram_in VectorAdd::B(test_B, N)
#pragma poroto stream::roccc_bram_out VectorAdd::Out(test_Out, N)
void VectorAdd(int N, int* A, int* B, int* Out)
{
    int i;
    for(i = 0; i < N; ++i) {
        Out[i] = A[i] + B[i];
    }
}
In order to test the correctness of the generated code, we can specify test vectors (in a python file):
from poroto.test import TestVector
test_vectors = {
    'VectorAdd': TestVector(1, {
        'N': [12],
        'A': [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]],
        'B': [[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21]],
        'Out': [[10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]],
    }),
}
The demo uses a simple Makefile to invoke Poroto :
POROTO_ROOT=../..
FILES=vector_add.c
include $(POROTO_ROOT)/poroto.mak
The tool can be invoked using the Makefile provided in the demo:
make clean
make gen
make compile
make run
The gen target reads the source code, applies specific code transformations and optimisations, and then invokes the ROCCC tool for each module to be converted into VHDL. Next, it generates all the dependencies needed by the modules, such as memory blocks, IP blocks, data streams and test benches. If the target is able to communicate with a host environment, the tool also generates the required wrappers to invoke the FPGA from the host environment.
For the GHDL target, it is possible to compile the generated VHDL and run the test bench in a simulated environment; this is done using the compile and run targets.
To specify a hardware target, one must add the TARGET parameter either in the Makefile or on the command line, e.g. to use the AlphaData FPGA-based accelerator board:
make TARGET=alphadata clean
make TARGET=alphadata gen
The generated project is found under the project/ directory and can be imported as-is into the FPGA back-end tool for synthesis and implementation, for instance the Xilinx PlanAhead suite. The wrapper and the C testbench for the host CPU are also generated. The wrapper code is shown below:
#include <inttypes.h>
#include "fpga.h"

void VectorAdd(int N, int *A, int *B, int *Out)
{
    fpga_write_vector(0, (N)*4, A);
    fpga_write_vector(1, (N)*4, B);
    pFpgaSpace[0x1] = (uint32_t)N;
    while (pFpgaSpace[0x2000] == 0)
        ; // Wait for resultReady
    fpga_read_vector(2, (N)*4, Out);
}
The existing host code can simply invoke the transformed function without changing the rest of the code, since the C wrapper keeps the same function signature (i.e. it implements a function with the same name and parameters, but this time it triggers the offloaded part rather than executing on the CPU).
The Poroto software comes with several other examples. Below is a list of the ones available in the distribution:
- simple_add: a simple adder block with no data streaming
- vector_add: a simple vector addition
- vector_add_ip: a simple vector addition using an external IP block to perform the operation
- vector_add_float: a simple vector addition based on float elements
- matrix_multiplication: a generic multiplication of integer matrices
- buffer_sliding: a 3x3 moving window over a matrix
- vector_avg: an n-element wide moving window over a vector
- vector_add_reduce: a reduce operation performed on a vector using an add operator
The code generated by Poroto can be further optimised using dedicated pragmas. With these pragmas, the user can control the transformation passes performed by ROCCC, such as partial loop unrolling, arithmetic balancing and pipelining optimisation, as well as the performance of the data streams generated by Poroto.
Furthermore, other advanced code transformations can be applied, such as code or variable inlining, data bit-size customisation and loop fusion.
2.1.2 Installation and Configuration of the Design and Development Time Optimiser
Placer is still a research prototype; as such, it has no front end, and the examples are hard-coded into Scala structures that must be compiled prior to execution.
2.1.2.1 System requirements and Software Dependencies
The computer must have the following software packages installed:
- Java 1.8
- Scala 2.11
- IntelliJ 14 IDE with the Scala plugin
2.1.2.2 Installing Placer (Year-1 Implementation of the Design-time Optimiser)
To install the delivered prototype of Placer, a single zip archive has been supplied in the TANGO GitHub repository: https://github.com/TANGO-Project/placer
The zip file contains an IntelliJ project, and the necessary jar files are included as libraries.
The project includes several example scripts that contain software and hardware models declared using Placer's structures. These scripts end with a call to Placer's solver.
2.1.2.3 Running Placer
Placer is still in a prototype state. It comes as an IntelliJ project. Once opened in IntelliJ, the user is presented with IntelliJ's project view. The project contains two examples, named Example1 and Example2, as illustrated in Figure 2. Both are applications and can be executed as-is. The matching optimal placement result is displayed in the console that appears below in IntelliJ.
Figure 2: Screenshot of example model using Placer.
2.2 Installation and Configuration of Programming Model Tooling
The TANGO Programming Model and Runtime Abstraction Layer is a combination of BSC's COMPSs and OmpSs task-based programming models, where COMPSs deals with coarse-grain tasks and platform-level heterogeneity while OmpSs deals with fine-grain tasks and node-level heterogeneity. The code can be found at https://github.com/TANGO-Project/compss-tango.
2.2.1 System Requirements and Software Dependencies:
Common:
- a supported platform running Linux (i386, x86-64, ARM, PowerPC or IA64)
- Git client
- bash and tcsh
- Apache Maven 3.0 or better
- Java SDK 8.0 or better
- GNU C/C++ compiler version 4.4 or better
- GNU GCC Fortran
- autotools (libtool, automake, autoreconf, make)
- boost-devel
- python-devel 2.7 or better
- GNU bison 2.4.1 or better
- GNU flex 2.5.4 or 2.5.33 or better (avoid versions 2.5.31 and 2.5.34 of flex as they are known to fail; use at least 2.5.33)
- GNU gperf 3.0.0 or better
- SQLite 3.6.16 or better
--with-monitor option:
This option enables the runtime to generate and to install tools to visualise the execution monitoring information and the execution graph.
- xdg-utils package
- graphviz package
--with-tracing option:
This option enables the runtime to generate execution trace files, which can be opened with the Paraver tool (https://tools.bsc.es/paraver).
- libxml2-devel 2.5.0 or better
- gcc-fortran
- papi-devel (suggested)
2.2.2 Installation Instructions
To install the whole framework, you just need to clone the general repository and run the following commands:
$ git clone https://github.com/TANGO-Project/general.git
$ cd general/IntegratedDevelopmentEnvironment/ProgrammingModelRuntime/
$ ./install.sh <Installation_Prefix> [options]

# Examples
# User local installation
$ ./install.sh $HOME/TANGO --no-monitor --no-tracing
# System installation
$ sudo -E ./install.sh /opt/TANGO
2.2.3 Application Development Overview
To develop an application with the TANGO programming model, developers have to implement at least three files: the application's main workflow in appName.c/cc, the application functions that will become coarse-grain tasks in appName.idl, and the implementation of those functions in appName-functions.cc. Other application files can be included in a src folder, providing the building configuration in a Makefile.
- appName.c/cc: contains the main coarse-grain task workflow
- appName.idl: contains the coarse-grain task definitions
- appName-functions.c/cc: implementation of the coarse-grain tasks
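As a purely illustrative sketch of this three-file layout, an appName.idl declaring one coarse-grain task might look as follows; the interface name and task signature here are hypothetical, and the exact IDL syntax is defined by the COMPSs C binding documentation:

```
// appName.idl -- declares which functions are coarse-grain tasks
// (hypothetical example; see the COMPSs user manual for the real syntax)
interface appName {
    void compute_block(in int size, inout int result);
};
```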
To define a coarse-grain task which itself contains fine-grain tasks, developers have to annotate the coarse-grain functions with the OmpSs compiler directives (pragmas).
More information about how to define coarse-grain tasks and other concerns when implementing a coarse-grain task workflow can be found in http://compss.bsc.es/releases/compss/latest/docs/COMPSs_User_Manual_App_Development.pdf
More information about how to define fine-grain tasks and other considerations when implementing a fine-grain task workflow can be found in https://pm.bsc.es/ompss-docs/specs/
2.2.4 Application Compilation
Once the application has been implemented, developers have to use the buildapp command. Before running the command, the user has to define a set of environment variables to indicate whether the coarse-grain tasks contain OmpSs tasks, or OmpSs tasks with CUDA or OpenCL code. The following example shows how to run this command.
$ export WITH_OMPSS=1 # If there are coarse-grain tasks defined as a workflow of fine-grain tasks
$ export WITH_CUDA=1  # If there are fine-grain tasks defined for a CUDA device
$ export WITH_OCL=1   # If there are fine-grain tasks defined for an OpenCL device
$ buildapp appName
2.2.5 Application Execution
An application implemented with the TANGO programming model can be easily executed by using the COMPSs execution scripts. They automatically start the Runtime Abstraction Layer and transparently execute both coarse-grain and fine-grain tasks on the selected resources.
Users can use the runcompss command to run the application in interactive nodes.
Usage: runcompss [options] application_name application_arguments
An example of running the application on the localhost, which is useful for initial debugging:
$ runcompss --lang=c appName appArgs...
To run an application on a preconfigured grid of computers, the TANGO Programming Model and Runtime environment must be installed on all the nodes. The application must also be deployed on all the nodes. Then, users have to provide the resource description in a resources.xml file and the application configuration for these resources in a project.xml file. Information about how to define these files can be found in
http://compss.bsc.es/releases/compss/latest/docs/COMPSs_User_Manual_App_Exec.pdf
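As a rough sketch of what a resources.xml describes — the element and attribute names below are illustrative assumptions, not the exact COMPSs schema; the manual above is the authoritative reference — each node is declared with its hostname and capacity, roughly as:

```
<!-- resources.xml (illustrative sketch only; see the COMPSs manual for the real schema) -->
<ResourcesList>
  <ComputeNode Name="node1.example.org">
    <Processor Name="MainProcessor">
      <ComputingUnits>12</ComputingUnits>
    </Processor>
    <Memory>
      <Size>32.0</Size>
    </Memory>
  </ComputeNode>
</ResourcesList>
```

The project.xml then selects which of these declared resources the application actually uses.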
$ runcompss --lang=c --project=/path/to/project.xml \
    --resources=/path/to/resources.xml appName app_args
More information about other possible arguments can be found by executing
$ runcompss --help
To queue an application on a cluster managed by the SLURM resource manager, users have to use the enqueue_compss command.
Usage: enqueue_compss [queue_system_options] [runcompss_options] application_name application_arguments
The following command shows how to queue the application, requesting 3 nodes with at least 12 cores, 2 GPUs and approximately 32 GB of memory per node:
$ enqueue_compss --num_nodes=3 --tasks-per-node=12 --gpus-per-node=2 --node-memory=32000 --lang=c appName appArgs
Other options available for enqueue_compss can be found by executing the following command.
$ enqueue_compss --help
In the next release, we will also describe how to build and deploy the application using the Application Lifecycle Deployment Engine.
2.2.6 Known Limitations
In this preliminary version of the TANGO programming model, only normal static functions in appName-functions.c/cc are supported as tasks. Other types of functions are unsupported and can make applications fail. The known issues are:
Objects as return type or defined as parameters with OUT direction
Methods called on an object
Methods with the static definition in the idl file
There is also an issue when deserializing big objects with the Boost library.
We are working to solve these issues in further versions.
2.3 Installation and Configuration of Code Optimiser Tooling
The code optimizer is a standalone Eclipse plugin that analyses Java programs for their active power consumption, allowing users to understand how much power their code uses. The code can be found at https://github.com/TANGO-Project/code-optimiser-plugin
2.3.1 Platforms Supported
All Linux and Windows variants supported by the Eclipse IDE.
2.3.2 Software Pre-requisites and Dependencies
To use the Code Optimizer Plug-in (COP), the following dependencies must be resolved:
Dependency Version Comment
JDK 1.7+ Provides the Java runtime environment for executing the plug-in.
Eclipse Platform 3.6+ Provides the environment for executing the Code Optimizer.
JVM Monitor 3.8+ Library dependencies managed by Maven.
Maven 3.0+ The build environment.
2.3.3 Installation Instructions
After installation of a suitable Java JDK and Eclipse Platform as described above, the Code Optimizer plug-in and Eclipse site must be built using Maven:
Firstly, checkout the COP tool source code from the URL obtained from the Tango git repository:
user@host:~$ git clone https://github.com/TANGO-Project/code-optimiser-plugin
Then build the plug-in and the Eclipse update site using Maven:
user@host:~$ cd <cop_path>/code-optimiser-plugin
user@host:<cop_path>/cop-plugin$ mvn clean install
user@host:~$ cd <cop_path>/tango-eclipse-site
user@host:<cop_path>/tango-eclipse-site $ mvn clean install
Install the plug-in from a local update site using the following local URL:
jar:file:<cop_path>/tango-eclipse-site/target/site-<version>-SNAPSHOT.zip!/
2.3.4 Configuration
The COP component needs valid calibration data for the energy model to provide accurate values. An initial set of values is provided by default. The energy modeller calibration tool may be used to generate a new set of values in cases where an attached watt meter is available.
3 Installation and Configuration Guide for runtime software packages
Once application development has taken place, it is possible to deploy and run the application on a heterogeneous infrastructure. While in year-1 a significant amount of manual effort remains to upload, compile, execute, collect and aggregate monitoring data, most tools from the TANGO general architecture shown in the red box provide an initial implementation.
Figure 3: General TANGO Architecture with operation software components in Red boxes.
First, the SLURM tool provides an implementation for the Device Supervisor and the Infrastructure Monitor. Second, energy probes installed on hardware hosts can retrieve the energy consumed by the various heterogeneous components and communicate it to the Infrastructure Monitor (SLURM or other). Third, the Self-Adaptation Manager proposes an initial implementation of its Energy Modeller. Fourth, the Application Life-cycle engine also presents an initial implementation. Only the implementation of the Device Emulator will start during Year-2. The standard code for SLURM is available at https://slurm.schedmd.com/
The subsections below present the installation manual for:
Extra Energy Probes to install on a hardware host to measure the energy of the GPU and Xeon Phi elements of that host. (NOTE: energy consumption for the host itself can already be collected through IPMI or RAPL by SLURM.)
SLURM (Device Supervisor and Infrastructure Monitor)
Energy Modeller (Part of the Self-Adaptation Engine)
Application Life-cycle Deployment Engine
It is worth emphasizing that for the benchmarking exercises of year-1 on Nova 2, only the SLURM component was used. The others have not yet been fully integrated. Furthermore, most of these other components will mainly be useful for achieving self-adaptation actions at deployment and operation time. Thus, for this first year, where benchmarking exercises were done statically and applications could be installed and executed manually, the TANGO operational components other than SLURM were not strictly required.
3.1 Installation and Configuration of Extra Energy Probes
At the beginning of the project, SLURM already provided probes to measure the energy of a whole node using IPMI and the energy consumption of an Intel CPU using Intel RAPL. To complement this, during the first year it was decided to start building energy probes for other hardware components such as Nvidia GPUs and Intel Xeon Phi Many-Core Processors (MCP). The tests of the probes for this last component are still in progress. Finally, it is expected that by the end of the project this component reaches at least TRL 6 (Technology Readiness Level 6: system/subsystem model or prototype demonstration in a relevant environment).
These probes are designed to work either with the SLURM monitoring infrastructure or with a CollectD monitoring server. The next sections describe them.
3.1.1 Nvidia GPUs
In order to monitor these GPUs, this component relies on the NVIDIA Management Library (NVML2) provided by NVIDIA. This is a C-based API that offers a set of functions for monitoring various states within these GPUs, like temperature, power consumption, fan speeds etc.
3.1.1.1 Supported OS platforms and products3
The NVML library currently supports the following operating systems:
- Windows Server 2008 R2 64-bit, Windows Server 2012 R2 64bit, Windows 7-8 64-bit
- Linux 32-bit and 64-bit
The list of fully supported NVIDIA products is the following:
- NVIDIA Tesla Line: S2050, C2050, C2070, C2075, M2050, M2070, M2075, M2090, X2070, X2090, K8, K10, K20, K20X, K20Xm, K20c, K20m, K20s, K40c, K40m, K40t, K40s, K40st, K40d, K80
- NVIDIA Quadro Line: 410, 600, 2000, 4000, 5000, 6000, 7000, M2070-Q, K2000, K2000D, K4000, K5000, K6000
- NVIDIA GRID Line: K1, K2, K340, K520
2 https://developer.nvidia.com/nvidia-management-library-nvml
3 NVML API Reference documentation: https://docs.nvidia.com/deploy/nvml-api/nvml-api-reference.html
And finally, it also offers limited support for the following products:
- NVIDIA Tesla Line: S1070, C1060, M1060 and all other previous generation Tesla-branded parts
- NVIDIA Quadro Line: all other current and previous generation Quadro-branded parts
- NVIDIA GeForce Line: all current and previous generation GeForce-branded parts
3.1.1.2 Example
The example that can be found in the Monitor Infrastructure repository4 looks for NVIDIA GPUs and retrieves the power usage of these devices (in Watts). To install and run this example, the requirements are:
- Download and install the GPU Deployment Kit5
- Modify the Makefile if needed, then run the make command
- Run the program
3.1.2 NVIDIA Plugins
Taking the previous example as a basis, we created two plugins to integrate the NVIDIA monitoring capability into the Monitoring Infrastructure. These plugins are used by Collectd and Slurm respectively. The integration of these two plugins into the TANGO framework is currently taking place and will be finalized during the second year of the project.
3.1.2.1 Collectd
The “NVIDIA” Collectd plugin follows the instructions described in the Collectd Wiki – Plugin Architecture6.
These are the steps to compile and run the NVIDIA plugin for Collectd:
- Get collectd source code from https://github.com/collectd/collectd
- Run build.sh, then ./configure && make
- Create / edit the plugin following the instructions from the Plugin architecture page
- Compile the plugin C program, for example:
o gcc -DHAVE_CONFIG_H -Wall -Werror -g -O2 -shared -fPIC -Isrc/ -Isrc/daemon/ -lnvidia-ml -L nvidia/lib/ -ldl -o nvidia_plugin.so nvidia/nvidia_plugin.c
This plugin also adds the following metric to Collectd:
static data_source_t dsrc[1] = { { "watts", DS_TYPE_GAUGE, 0, NAN } };
Collectd can expose these results by enabling write plugins, such as the RRDtool or CSV ones; the new metric should be available after configuring and launching Collectd.
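A collectd.conf fragment loading the plugin could look like the following sketch — the plugin name matches the shared object built above, while the directory paths are illustrative assumptions:

```
# collectd.conf fragment (paths are illustrative)
PluginDir "/opt/collectd/lib/collectd"
LoadPlugin nvidia_plugin   # the NVML-based plugin built above
LoadPlugin csv             # write the new "watts" gauge to CSV files
<Plugin csv>
  DataDir "/var/lib/collectd/csv"
</Plugin>
```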
3.1.2.2 Slurm
The “NVIDIA” Slurm plugin follows the following architecture:
4 https://github.com/TANGO-Project/general/tree/master/Middleware/Monitor_Infrastructure/Tests%20Examples/NVML
5 https://developer.nvidia.com/gpu-deployment-kit
6 https://collectd.org/wiki/index.php/Plugin_architecture
/*
 * init() is called when the plugin is loaded, before any other functions
 * are called. Put global initialization here.
 */
extern int init(void)
{
    nvmlReturn_t result;
    …
    return SLURM_SUCCESS;
}

/*
 * This function is called before shutting down.
 */
extern int fini(void)
{
    nvmlReturn_t result;
    …
    return SLURM_SUCCESS;
}

/*
 * Setter and getter methods for the gathered values:
 */
extern int acct_gather_energy_p_update_node_energy(void)
{
    int rc = SLURM_SUCCESS;
    …
    return rc;
}

extern int acct_gather_energy_p_get_data(enum acct_energy_type data_type, void *data)
{
    int rc = SLURM_SUCCESS;
    …
    return rc;
}

extern int acct_gather_energy_p_set_data(enum acct_energy_type data_type, void *data)
{
    int rc = SLURM_SUCCESS;
    …
    return rc;
}

/*
 * These functions are called for configuration:
 */
extern void acct_gather_energy_p_conf_options(s_p_options_t **full_options, int *full_options_cnt)
{
    …
    return;
}

extern void acct_gather_energy_p_conf_set(s_p_hashtbl_t *tbl)
{
    …
    return;
}

extern void acct_gather_energy_p_conf_values(List *data)
{
    …
    return;
}
3.1.2.3 Installation and configuration
Currently, the Collectd and Slurm plugins for NVIDIA have been tested in isolation. During the second year of the project, they will be fully integrated as part of the TANGO framework and installed on the Nova 2 testbed.
3.2 Installation and Configuration of SLURM
In Year-1, SLURM plays the role of the Device Supervisor and of the component that collects the measurement data, which is part of the role of the Infrastructure Monitor.
Slurm is a basic part of the TANGO framework and there are planned contributions in upcoming work packages during years 2 and 3. However, since some basic features of Slurm are needed for the TANGO Device Supervisor and Infrastructure Monitor of year 1, this section provides the general installation and configuration of Slurm along with a short user-level guide.
3.2.1 Platforms Supported
All Linux variants are supported, on the following hardware architectures: i386, x86-64, ARM, PowerPC and IA64.
3.2.2 Software Pre-requisites and Dependencies
GNU C/C++ compiler versions 4.4 or later
autotools (libtool, automake, autoreconf, make)
freeipmi version 1.2.1 or later
hwloc and hwloc-devel
munge and munge-devel
mysql or mariadb
hdf5
3.2.3 Installation instructions
1. Slurm can be downloaded from https://schedmd.com/downloads.php . It is better to select the latest stable version; at the time of writing this report, this is version 16.05.
2. Make sure the clocks, users and groups (UIDs and GIDs) are synchronized across the cluster.
3. Download and install MUNGE for authentication from https://dun.github.io/munge/ . Make sure that all nodes in your cluster have the same munge.key, and that the MUNGE daemon, munged, is started before you start the Slurm daemons.
4. bunzip2 the distributed tar-ball and untar the files: tar --bzip -x -f slurm*tar.bz2
5. cd to the directory containing the Slurm source and type ./configure with appropriate options, typically --prefix= and --sysconfdir=
6. Type make to compile Slurm.
7. Type make install to install the programs, documentation, libraries, header files, etc.
8. Build a configuration file using a web browser and doc/html/configurator.html.
9. Create the Slurm user on all compute nodes of the cluster.
10. The parent directories for Slurm's log files, process ID files, state save directories, etc. must be created and made writable by the Slurm user as needed prior to starting the Slurm daemons.
11. Install the configuration file in <sysconfdir>/slurm.conf and copy it to all nodes of the cluster.
12. Start the slurmctld and slurmd daemons.
For the configuration details you can follow the instructions at the official Slurm site: https://slurm.schedmd.com/documentation.html
3.2.4 Slurm accounting and profiling framework
Slurm provides mechanisms that enable detailed monitoring and reporting of resource consumption during job execution. This section gives some configuration details of this framework.
The following parameters are needed to configure the job accounting gather characteristics that are collected during the job execution:
JobAcctGatherType
The job accounting mechanism type. Acceptable values at present include:
jobacct_gather/aix
jobacct_gather/linux
jobacct_gather/cgroup (jobacct_gather/cgroup uses cgroups to collect accounting statistics)
jobacct_gather/none (no accounting data collected). The default value is jobacct_gather/none.
JobAcctGatherFrequency
The job accounting sampling interval. For jobacct_gather/none this parameter is ignored. For jobacct_gather/aix and jobacct_gather/linux the parameter is a number of seconds between samplings of job state. The default value is 30 seconds. The minimum is 1 sec. A value of zero disables the periodic job sampling and provides accounting information only on job termination (reducing SLURM interference with the job).
AcctGatherNodeFreq
The AcctGather plugins' sampling interval for node accounting. For an AcctGather plugin value of none, this parameter is ignored. For all other values, this parameter is the number of seconds between node accounting samples. The minimum is 1 sec. The default value is zero, which disables accounting sampling for nodes. Note: the accounting sampling interval for jobs is determined by the value of JobAcctGatherFrequency.
AcctGatherEnergyType
Identifies the plugin to be used for energy consumption accounting. The jobacct_gather plugin and slurmd daemon call this plugin to collect energy consumption data for jobs and nodes. Configurable values at present are:
acct_gather_energy/none No energy consumption data is collected.
acct_gather_energy/ipmi Energy consumption data is collected from the Baseboard Management Controller (BMC) using the Intelligent Platform Management Interface (IPMI).
acct_gather_energy/ipmi_raw Energy consumption data is collected from the Baseboard Management Controller (BMC) using the Intelligent Platform Management Interface (IPMI), based on BMC internal consolidation.
acct_gather_energy/rapl Energy consumption data is collected from hardware sensors using the Running Average Power Limit (RAPL) mechanism. The recommended option is acct_gather_energy/rapl.
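Put together, a slurm.conf fragment enabling Linux job accounting with RAPL energy collection might read as follows — the sampling intervals shown are illustrative choices, not required values:

```
# slurm.conf fragment (sampling intervals are illustrative)
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
AcctGatherEnergyType=acct_gather_energy/rapl
AcctGatherNodeFreq=30
```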
Besides accounting, which gives aggregated results on resource usage, it is always desirable to have detailed data about an application's performance. This data has traditionally been used to improve an application's use of resources, particularly CPUs. There is an increasing need to improve the scheduling and placement of an application with regard to its use of cluster resources. It is important to schedule applications to use energy efficiently. It is also important to allocate resources that are physically close together to minimize network latency for both message passing and use of parallel file systems.
In this context, a profiling plugin exists that allows detailed data from different sources to be collected simultaneously and stored in a single file. The file is an HDF5 file, a format well known in High Performance Computing that allows heterogeneous data to reside in one structured dataset. In this case, there are sections for Energy statistics, Lustre I/O, Network I/O, and Task data.
There are community programs, notably HDFView, for viewing and manipulating these files.
AcctGatherInfinibandType
Identifies the plugin to be used for InfiniBand network traffic accounting. The plug-in is activated only when profiling to HDF5 files is activated and the user asks for network data collection for jobs through --profile=Network (or =All). The collection of network traffic data takes place at node level; hence, only in the case of exclusive job allocation will the collected values reflect the job's real traffic. All network traffic data is logged in HDF5 files per job on each node. No storage in the SLURM database takes place. Configurable values at present are:
acct_gather_infiniband/none No InfiniBand network data is collected.
acct_gather_infiniband/ofed InfiniBand network traffic data is collected from the hardware monitoring counters of InfiniBand devices through the OFED library.
AcctGatherFilesystemType
Identifies the plugin to be used for file system traffic accounting. The plug-in is activated only when profiling to HDF5 files is activated and the user asks for file system data collection for jobs through --profile=Lustre (or =All). The collection of file system traffic data takes place at node level; hence, only in the case of exclusive job allocation will the collected values reflect the job's real traffic. All file system traffic data is logged in HDF5 files per job on each node. No storage in the SLURM database takes place. Configurable values at present are:
acct_gather_filesystem/none No file system data are collected.
acct_gather_filesystem/lustre Lustre file system traffic data are collected from the counters found in /proc/fs/lustre/.
For additional information on the accounting plugins, see Accounting and Resource Limits in the SLURM documentation.
Follow the link to Profiling Using HDF5 User Guide in the SLURM HTML documentation for more details.
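For completeness, the HDF5 profiling plugin itself is enabled through the AcctGatherProfileType parameter; a sketch of the relevant slurm.conf lines, together with a profiling request at submission time, is shown below (the profile directory path is an illustrative assumption):

```
# slurm.conf fragment (profile directory is illustrative)
AcctGatherProfileType=acct_gather_profile/hdf5
ProfileHDF5Dir=/var/log/slurm/profile
AcctGatherInfinibandType=acct_gather_infiniband/ofed
AcctGatherFilesystemType=acct_gather_filesystem/lustre

# profiling is then requested per job, e.g.:
#   srun --profile=All ./my_app
```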
3.2.5 SLURM Key Functions
As a cluster resource manager, SLURM has three key functions. Firstly, it allocates exclusive and/or non-exclusive access to resources (Compute Nodes) to users for some duration of time so they can perform work. Secondly, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates conflicting requests for resources by managing a queue of pending work.
Optional plug-ins can be used for accounting, advanced reservation, backfill scheduling, resource limits by user or bank account, and sophisticated multifactor job prioritization algorithms.
Users interact with SLURM using various command line utilities7:
SRUN to submit a job for execution
SBCAST to transmit a file to all nodes running a job
SCANCEL to terminate a pending or running job
SQUEUE to monitor job queues
SINFO to monitor partition and the overall system state
SACCTMGR to view and modify SLURM account information. Used with the slurmdbd daemon
SACCT to display data for all jobs and job steps in the SLURM accounting log
SBATCH for submitting a batch script to SLURM
SALLOC for allocating resources for a SLURM job
SATTACH to attach to a running SLURM job step.
STRIGGER used to set, get or clear SLURM event triggers
SVIEW used to display SLURM state information graphically. Requires an XWindows capable display
SREPORT used to generate reports from the SLURM accounting data when using an accounting database
SSTAT used to display various status information of a running job or step
System administrators perform privileged operations through an additional command line utility, SCONTROL.
The central controller daemon, SLURMCTLD, maintains the global state and directs operations. Compute nodes simply run a SLURMD daemon (similar to a remote shell daemon) to export control to SLURM.
3.2.6 SLURM Components
SLURM consists of three types of daemons and various command-line user utilities. The relationships between these components are illustrated in the following diagram:
7 For more information, a detailed user guide can be found in the section "Slurm Users" at https://slurm.schedmd.com/documentation.html
Figure 4: SLURM Simplified Architecture.
3.2.6.1 SLURMCTLD
The central control daemon for SLURM is called SLURMCTLD. SLURMCTLD is multi-threaded; thus, some threads can handle problems without delaying services to normal jobs that are also running and need attention. SLURMCTLD runs on a single management node (with a fail-over spare copy elsewhere for safety), reads the SLURM configuration file, and maintains state information on:
Nodes (the basic compute resource)
Partitions (sets of nodes)
Jobs (or resource allocations to run jobs for a time period)
Job steps (parallel tasks within a job).
Software Subsystem Role Description
Node Manager Monitors the state and configuration of each node in the cluster. It receives state-change messages from each Compute Node's SLURMD daemon asynchronously, and it also actively polls these daemons periodically for status reports.
Partition Manager Groups nodes into disjoint sets (partitions) and assigns job limits and access controls to each partition. The partition manager also allocates nodes to jobs (at the request of the Job Manager) based on job and partition properties. SCONTROL is the (privileged) user utility that can alter partition properties.
Job Manager Accepts job requests (from SRUN or a metabatch system), places them in a priority-ordered queue, and reviews this queue periodically or when any state change might allow a new job to start. Resources are allocated to qualifying jobs and that information transfers to (SLURMD on) the relevant nodes so the job can execute. When all nodes assigned to a job report that their work is done, the Job Manager revises its records and reviews the pending-job queue again.
3.2.6.2 SLURMD
The SLURMD daemon runs on all the Compute Nodes of each cluster that SLURM manages and performs the lowest level work of resource management. Like SLURMCTLD (previous subsection), SLURMD is multi-threaded for efficiency; but, unlike SLURMCTLD, it runs with root privileges (so it can initiate jobs on behalf of other users).
SLURMD carries out five key tasks and has five corresponding subsystems. These subsystems are described in the following table.
SLURMD Subsystem Description of Key Tasks
Machine Status Responds to SLURMCTLD requests for machine state information and sends asynchronous reports of state changes to help with queue control.
Job Status Responds to SLURMCTLD requests for job state information and sends asynchronous reports of state changes to help with queue control.
Remote Execution Starts, monitors, and cleans up after a set of processes (usually shared by a parallel job), as decided by SLURMCTLD (or by direct user intervention). This can often involve many changes to process-limit, environment-variable, working-directory, and user-id settings.
Stream Copy Service Handles all STDERR, STDIN, and STDOUT for remote tasks. This may involve redirection, and it always involves locally buffering job output to avoid blocking local tasks.
Job Control Propagates signals and job-termination requests to any SLURM-managed processes (often interacting with the Remote Execution subsystem).
3.2.6.3 SlurmDBD (SLURM Database Daemon)
The SlurmDBD daemon stores accounting data in a database. Storing the data directly in a database from SLURM may seem attractive, but it requires the availability of user name and password data, not only for the SLURM control daemon (slurmctld) but also for the user commands which need to access the data (sacct, sreport, and sacctmgr). Making possibly sensitive information available to all users makes database security more difficult to provide. Sending the data through an intermediate daemon can provide better security and performance
(through caching data), and SlurmDBD provides such a service. SlurmDBD is written in C and is multi-threaded, secure and fast.
More information can be found in the official Slurm documentation for accounting8.
3.3 Installation and Configuration of Energy Modeller
The code can be found at https://github.com/TANGO-Project/energy-modeller
3.3.1 Minimal System Requirements
The energy modeller has two modes of operation: the first gathers data and populates the models used for energy and power calculations, while the second acts as a sub-component used for querying the generated models. In both cases it requires access to a MySQL database for storing and querying the data used within the energy models.
The energy modeller is expected to work over a network and utilise a monitoring infrastructure, such as Zabbix or integration into SLURM, to provide the raw power information for the physical hosts. In the event that not all host machines have Watt meters attached, an estimated power value may be utilised instead. Such an estimated value can be generated by the Watt meter emulator component.
If Zabbix or SLURM are not to be used, the modeller may be directly attached to a WattsUp? Meter in order to provide a standalone mode of operation.
3.3.2 Platforms Supported
The Energy modeller has been tested on both Windows and Linux and works within any Java compliant environment.
3.3.3 Software Pre-requisites and Dependencies
To use the Energy Modeller, the following dependencies must be resolved:
Dependency Version Comment
Java 7
Maven 2.2.1
MySQL 5.6.17
MySQL-connector-java 5.1.30
Apache commons-math3 3.3
log4j 1.2.17
Sigar 1.6.4 Used in standalone mode only, to obtain CPU load metric data
WattsUp SDK 1.0 Used in standalone mode to contact the WattsUp? Meter
NRJavaSerial 3.11.0 Used in standalone mode to contact the WattsUp? Meter
8 https://slurm.schedmd.com/accounting.html
3.3.4 Installation Instructions
To install the energy modeller, the following steps must be performed:
1. Generate the Energy Modeller jar using the command: mvn clean package (executed in the Energy Modeller directory)
2. Install the database. SQL statements to set up the database are held in the file "energy modeller db.sql", which is located in {energy-modeller root directory}\src\main\resources.
3. Install the Standalone Calibration Tool:
a) Generate the energy modeller standalone calibration tool jar using the command: mvn clean package (executed in the standalone calibration tool directory)
b) Install the energy-modeller-standalone-calibration-tool on each host that is to be calibrated.
A configuration file called "Apps.csv" can now be specified. This file provides details about the application(s) used to induce the training load for the host.
An example is provided within the source code, and the headers of a default file are written to disk if the Apps.csv file is not found. A test application has also been provided under utils\ascetic-load-generator-app. The file specifies the following: the start time the application should run, the standard out and error files to redirect output to, the application's working directory, and whether output should also be redirected to the screen.
Configuring the Energy Modeller
4. The last stage is to configure the energy modeller. The energy modeller is highly configurable and has several settings files that may be used to change its behaviour:
Settings File Purpose
energy-modeller-db.properties Holds database information for the energy modeller
energy-modeller-predictor.properties Holds settings relating to the prediction of energy usage.
energy-modeller-db-zabbix.properties Holds information on how to connect to the Zabbix database directly.
ascetic-zabbix-api.properties Settings for the Zabbix client, used to connect to a Zabbix based information source.
filter.properties This holds settings to distinguish between a host and a VM.
These settings must be tailored to the specific infrastructure. The settings are described below and an example of the settings is provided for reference.
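All of these files use the standard Java key = value properties format. As a minimal, hypothetical illustration (this helper is not part of the TANGO code base), such a file can be read as follows:

```python
def load_properties(text):
    """Parse a minimal Java-style .properties file into a dict.

    Lines starting with '#' are comments; keys and values are split on
    the first '=' and surrounding whitespace is stripped.
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

# Fragment of the energy-modeller-db.properties example shown below.
example = """
# database settings (values are placeholders)
energy.modeller.db.url = jdbc:mysql://iaas-vm-dev:3306/ascetic-em
energy.modeller.db.driver = org.mariadb.jdbc.Driver
"""
settings = load_properties(example)
print(settings["energy.modeller.db.driver"])  # org.mariadb.jdbc.Driver
```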
energy-modeller-db.properties
This file specifies various database related settings for the energy modeller. An example is provided below:
energy.modeller.db.url = jdbc:mysql://iaas-vm-dev:3306/ascetic-em
energy.modeller.db.driver = org.mariadb.jdbc.Driver
energy.modeller.db.password = XXXXX
energy.modeller.db.user = user-em
This specifies how the energy modeller connects to its background database: the connection URL, the JDBC driver, and the username and password to use.
The SQL script to set up the database structure is held in the file IaaS energy modeller db.sql, under the directory {energy-modeller root directory}\src\main\resources.
energy-modeller-predictor.properties
This file specifies settings for the energy predictor mechanism; an example of such a file is provided below:
energy.modeller.cpu.energy.predictor.datasource = ZabbixDirectDbDataSourceAdaptor
energy.modeller.cpu.energy.predictor.workload = CpuRecentHistoryWorkloadPredictor
energy.modeller.cpu.energy.predictor.default_load = -1.0
energy.modeller.cpu.energy.predictor.utilisation.observe_time.min = 0
energy.modeller.cpu.energy.predictor.utilisation.observe_time.sec = 15
The data source parameter indicates how the energy modeller's predictor function will gain the environment data that it needs. It can be one of the following options:
ZabbixDirectDbDataSourceAdaptor: The default connector that directly accesses the Zabbix database for the information that it requires. This adaptor utilises the configuration file energy-modeller-db-zabbix.properties.
SlurmDataSourceAdaptor: This adaptor connects the energy modeller into a SLURM job management based environment, allowing access to information about the physical hosts.
ZabbixDataSourceAdaptor: This is an alternative adaptor that utilises the JSON API of Zabbix in order to obtain the required host and VM data.
WattsUpMeterDataSourceAdaptor: For local usage of the energy modeller.
It should be noted that the observation window should not be too small, especially when using the Zabbix data source adaptors, which may provide fewer data points than the WattsUpMeterDataSourceAdaptor; the latter is able to report at intervals as low as one second.
The energy predictor can utilise several different workload estimator functions. The default is to use the CpuRecentHistoryWorkloadPredictor. This has the following configuration settings.
The default_load parameter indicates what load the predictor should use as an estimate. It should be specified in the range 0..1. An alternative is to provide the value -1, in which case the predictor defaults to using the observed current load.
In the case where the observed current load is being used, the observe_time.min and observe_time.sec parameters indicate the size of the observation window for CPU utilisation. The two values are simply added together to make the total observation window time. The default observation window size is 15 minutes.
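The interplay of these parameters can be sketched as follows (hypothetical helper functions, not TANGO code; the semantics follow the description above):

```python
def observation_window_seconds(observe_min, observe_sec):
    """Total CPU-utilisation observation window: the minute and second
    settings are simply added together."""
    return observe_min * 60 + observe_sec

def effective_load(default_load, observed_load):
    """A default_load in the range 0..1 is used as-is; the special
    value -1 falls back to the observed current load."""
    return observed_load if default_load == -1 else default_load

# Example settings file above: 0 min + 15 s gives a 15 s window.
print(observation_window_seconds(0, 15))   # 15
# The documented default window is 15 minutes:
print(observation_window_seconds(15, 0))   # 900
# default_load = -1 defers to whatever load is currently observed:
print(effective_load(-1.0, 0.42))          # 0.42
```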
The other options for workload prediction are:
BasicAverageCpuWorkloadPredictor
BasicAverageCpuWorkloadPredictorDisk
BootAverageCpuWorkloadPredictor
BootAverageCpuWorkloadPredictorDisk
DoWAverageCpuWorkloadPredictor
DoWAverageCpuWorkloadPredictorDisk
These predictors work on historical load information and are designed for virtualised infrastructures in which each VM can be tagged with basic information about the application the VM is for and the disk image it is based upon.
Average CPU Workload predictors: give an estimate of the workload based upon the average CPU utilisation for a given application tag or base disk image.
Average Boot Workload predictors: give an estimate of the workload based upon the time from boot of a VM for a given application tag or base disk image.
Day of Week (DoW) Workload predictors: give an estimate of the workload based upon the time and day of the week that a VM is active for a given application tag or base disk image.
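For example, switching the estimator from the default to the day-of-week predictor only requires changing the workload property shown earlier (assuming the class name is given verbatim, as for CpuRecentHistoryWorkloadPredictor):

```properties
energy.modeller.cpu.energy.predictor.workload = DoWAverageCpuWorkloadPredictor
```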
energy-modeller-db-zabbix.properties
This is the configuration file used to configure the energy modeller when using the ZabbixDirectDBDataSourceAdaptor. It holds the database connection settings used to connect directly to the Zabbix database.
energy.modeller.zabbix.db.driver = org.mariadb.jdbc.Driver
energy.modeller.zabbix.db.url = jdbc:mysql://192.168.3.199:3306/zabbix
energy.modeller.zabbix.db.user = zabbix
energy.modeller.zabbix.db.password = XXXXX
energy.modeller.filter.begins = wally
energy.modeller.filter.isHost = true
This specifies how the energy modeller connects directly to the Zabbix database: the connection URL, the driver to use, and the username and password.
filter.properties
This settings file is used in conjunction with the ZabbixDataSourceAdaptor and the ascetic-zabbix-api.properties configuration file. It has two properties: the first indicates a string to be searched for at the start of a host/VM name; the second indicates whether a name matching that string denotes a host or a virtual machine (VM), true for a host and false for a VM. The following is an example of the defaults that are written to disk in the event the file is not found.
energy.modeller.filter.begins = wally
energy.modeller.filter.isHost = true
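The intended semantics can be sketched in a few lines (a hypothetical helper, not the actual Java implementation; it assumes names that do not match the prefix fall into the opposite category):

```python
def classify(name, filter_begins="wally", filter_is_host=True):
    """Classify a Zabbix item name as 'host' or 'vm' following the
    filter.properties semantics: names starting with filter_begins are
    hosts when isHost is true (VMs when false); other names are assumed
    to be the opposite kind."""
    matched = name.startswith(filter_begins)
    if filter_is_host:
        return "host" if matched else "vm"
    return "vm" if matched else "host"

# With the default settings above, 'wally...' names are physical hosts.
print(classify("wally152"))   # host
print(classify("appvm-07"))   # vm
```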
In addition to this settings file, the energy modeller has a method called setHostFilter that allows alternative patterns, such as looking at the ending of a hostname, to determine whether a name refers to a host or a VM.
3.3.5 Using the standalone calibrator
The standalone calibration tool is designed to calibrate the model that is used for physical hosts and can be found at https://github.com/TANGO-Project/energy-modeller-calibration-tool. Its usage is as follows:
java -jar energy-modeller-standalone-calibration-tool-0.0.1-SNAPSHOT.jar <hostname> [halt-on-calibrated] [benchmark-only] [no-benchmark] [use-watts-up-meter]
<hostname>: This is a mandatory argument that states which host to calibrate.
[halt-on-calibrated]: The halt-on-calibrated flag will prevent calibration in cases where the data has already been gathered.
[benchmark-only]: The benchmark-only flag skips the calibration run and performs a benchmark run only. Benchmarking allows physical hosts to be ranked in order, for example by performance per Watt.
[no-benchmark]: The no-benchmark flag skips the benchmarking.
[use-watts-up-meter]: The use-watts-up-meter flag can be used so that Zabbix is not used for calibration and local measurements are performed instead. This requires a WattsUp? Meter.
The standalone calibrator uses the same configuration files as the energy modeller, namely the energy-modeller-db, energy-modeller-predictor and energy-modeller-db-zabbix properties files. In addition it has the calibration_settings and energy-modeller-watts-up-meter.properties files, as well as an Apps.csv file that is used to specify the training load to be induced.
calibration_settings.properties
#Settings
#Fri Feb 13 11:55:08 GMT 2015
poll_interval=2
delay_before_taking_measurements=4
working_directory=
log_executions=true
simulate_calibration_run=false
The poll_interval indicates how often, in seconds, measurements should be taken during the run. Note that Zabbix must be configured to report values back fast enough for a change to be observed. The delay_before_taking_measurements setting indicates the delay in seconds to wait immediately after an induced load starts or ends before measurements are taken.
The working_directory indicates where the apps.csv settings file is located. log_executions indicates whether a log should be created recording when each application used to generate the training load was started and stopped. simulate_calibration_run indicates whether the gathered data should be written to the energy modeller's database; simulated runs do not record the data gathered.
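Under these settings, the sampling instants for one induced load can be sketched as follows (a hypothetical helper; times are seconds from the start of the calibration run):

```python
def sample_times(load_start, load_end, poll_interval=2, delay=4):
    """Power-measurement instants for one induced load: sampling begins
    delay_before_taking_measurements seconds after the load starts and
    repeats every poll_interval seconds until the load stops."""
    return list(range(load_start + delay, load_end + 1, poll_interval))

# A load running from t=0 to t=50 with the defaults above
# (poll_interval=2, delay=4) is sampled at t = 4, 6, ..., 50.
print(sample_times(0, 50))
```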
energy-modeller-watts-up-meter.properties
#Settings
#Fri May 01 14:29:48 BST 2015
energy.modeller.wattsup.scrape.file=//opt//wattsup-zabbix-probe//testnode5-wattsup.log
energy.modeller.wattsup.hostId=10134
energy.modeller.wattsup.hostname=testnode5
energy.modeller.wattsup.port=FILE
3.3.5.1 Apps.csv
Time From Start, Command, stdOut, stdError, Working Directory, Output To Screen, Stop Time
0,sleep 50,test.out,error.out,,TRUE,50
60,run-stress-point.sh 10 4 60,test.out,error.out,,TRUE,120
160,run-stress-point.sh 20 4 60,test1.out,error1.out,,TRUE,220
260,run-stress-point.sh 40 4 60,test2.out,error2.out,,TRUE,320
360,run-stress-point.sh 60 4 60,test3.out,error3.out,,TRUE,420
460,run-stress-point.sh 80 4 60,test4.out,error4.out,,TRUE,520
560,run-stress-point.sh 100 4 60,test5.out,error5.out,,TRUE,620
The apps.csv file has several columns. Namely: Time From Start, Command, stdOut, stdError, Working Directory, Output To Screen, Stop Time.
These columns indicate: the time in seconds from the start of the calibration run at which to execute the program; the command used to run the program; where to redirect standard out and standard error; the working directory of the application; whether output should also be sent to the screen; and finally the time at which the application is expected to stop, again in seconds from the start of the calibration run.
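The format can be parsed with a short sketch like the following (illustrative Python only; the calibration tool itself is written in Java):

```python
import csv
import io

# First two rows of the example Apps.csv listing above.
APPS_CSV = """\
Time From Start, Command, stdOut, stdError, Working Directory ,Output To Screen, Stop Time
0,sleep 50,test.out,error.out,,TRUE,50
60,run-stress-point.sh 10 4 60,test.out,error.out,,TRUE,120
"""

def parse_apps(text):
    """Parse an Apps.csv training-load schedule into a list of dicts,
    converting the two times to int and the screen flag to bool."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text), skipinitialspace=True):
        row = {k.strip(): v.strip() for k, v in raw.items()}
        rows.append({
            "start": int(row["Time From Start"]),
            "command": row["Command"],
            "stdout": row["stdOut"],
            "stderr": row["stdError"],
            "workdir": row["Working Directory"],
            "to_screen": row["Output To Screen"].upper() == "TRUE",
            "stop": int(row["Stop Time"]),
        })
    return rows

apps = parse_apps(APPS_CSV)
print(apps[1]["command"], apps[1]["stop"])  # run-stress-point.sh 10 4 60 120
```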
3.3.5.2 Using the Watt Meter Emulator
The Watt meter emulator is a tool designed to emulate the presence of a Watt meter in cases where one is not attached to a physical host that reports to the energy modeller via Zabbix. The code can be found at https://github.com/TANGO-Project/watt-meter-emulator. Its usage is as follows:
java -jar host-power-emulator-0.0.1-SNAPSHOT.jar [hostname] [host-name-to-clone] [stop-on-clone]
[hostname]: This is an optional argument that states which host to emulate the Watt meter for. If no hostname is specified the tool will work for all calibrated hosts.
[host-name-to-clone]: This is an optional argument that allows the named host to have its data cloned for the purpose of emulating the named host.
[stop-on-clone]: This parameter stops the emulated Watt meter as soon as the cloning of the host calibration data has been completed. Thus it may be used to simply copy calibration data from one host to another.
The watt meter emulator uses the same configuration files as the energy modeller, namely the energy-modeller-db, energy-modeller-predictor and energy-modeller-db-zabbix properties files. In addition it has the watt-meter-emulator.properties file.
3.3.5.3 watt-meter-emulator.properties
#Settings
#Tue May 05 16:39:29 BST 2015
output_name=power-estimated
poll_interval=1
This file has two settings: the metric name that should be output to Zabbix, and the rate, in seconds, at which this value should be pushed to Zabbix.
3.4 Installation and Configuration of Application Life-cycle Deployment Engine
The Application Lifecycle Deployment Engine (ALDE) is the component responsible for taking an application, building it, packaging it, and, if the targeted testbed supports it, deploying it remotely from the TANGO development environment to the TANGO operational environment.
A complete implementation of ALDE is not expected for this first year; hence no integration with other components of the TANGO toolbox has been performed at the moment. However, the initial implementation provides the basic functionality reported hereafter. The code is accessible at https://github.com/TANGO-Project/alde
3.4.1 System Requirements
ALDE has the following basic requirements to run:
Python 3.4 or higher
With that minimum requirement, ALDE can perform a basic run on any Windows, Mac OS or Linux based system. However, to build an application, other tools may be necessary depending on the build scripts, such as the gcc compiler and third-party libraries. These depend on the application to be built and on the selected packaging system. More specific requirements of this type will be reported later in the project.
3.4.2 Installation and configuration
ALDE is packaged in tar.gz file format. To install it, open a console on your system (Linux, Windows or Mac OS; again, compiling an application may require specific software to be installed on it) and perform the following steps:
1. Check that the right python version is already installed:
$ python --version
Python 3.5.1
2. Unpackage ALDE
$ tar xvfz alde-1.0.dev0.tar.gz
alde-1.0.dev0/
alde-1.0.dev0/alde.egg-info/
alde-1.0.dev0/alde.egg-info/dependency_links.txt
alde-1.0.dev0/alde.egg-info/PKG-INFO
alde-1.0.dev0/alde.egg-info/requires.txt
alde-1.0.dev0/alde.egg-info/SOURCES.txt
alde-1.0.dev0/alde.egg-info/top_level.txt
alde-1.0.dev0/alde.egg-info/zip-safe
alde-1.0.dev0/alde.py
alde-1.0.dev0/app.py
alde-1.0.dev0/model/
alde-1.0.dev0/model/application.py
alde-1.0.dev0/model/base.py
alde-1.0.dev0/model/models.py
alde-1.0.dev0/model/__init__.py
alde-1.0.dev0/PKG-INFO
alde-1.0.dev0/setup.cfg
alde-1.0.dev0/setup.py
alde-1.0.dev0/__init__.py
3. Install it
$ pip3.4 install --editable .
Obtaining file:///home/a510804/moment/alde-1.0.dev0/dist/alde-1.0.dev0
Collecting Flask (from alde==1.0.dev0)
  Using cached Flask-0.11.1-py2.py3-none-any.whl
Collecting Flask-Restless (from alde==1.0.dev0)
Collecting Flask-SQLAlchemy (from alde==1.0.dev0)
Collecting Flask-Testing (from alde==1.0.dev0)
Requirement already satisfied: python-dateutil in /home/a510804/.local/lib/python3.4/site-packages (from alde==1.0.dev0)
Collecting sqlalchemy (from alde==1.0.dev0)
Collecting click>=2.0 (from Flask->alde==1.0.dev0)
  Using cached click-6.6-py2.py3-none-any.whl
Collecting itsdangerous>=0.21 (from Flask->alde==1.0.dev0)
Collecting Jinja2>=2.4 (from Flask->alde==1.0.dev0)
  Using cached Jinja2-2.8-py2.py3-none-any.whl
Collecting Werkzeug>=0.7 (from Flask->alde==1.0.dev0)
  Using cached Werkzeug-0.11.11-py2.py3-none-any.whl
Collecting mimerender>=0.5.2 (from Flask-Restless->alde==1.0.dev0)
Requirement already satisfied: six>=1.5 in /home/a510804/.local/lib/python3.4/site-packages (from python-dateutil->alde==1.0.dev0)
Collecting MarkupSafe (from Jinja2>=2.4->Flask->alde==1.0.dev0)
Collecting python-mimeparse>=0.1.4 (from mimerender>=0.5.2->Flask-Restless->alde==1.0.dev0)
  Using cached python_mimeparse-1.6.0-py2.py3-none-any.whl
Installing collected packages: click, itsdangerous, MarkupSafe, Jinja2, Werkzeug, Flask, sqlalchemy, python-mimeparse, mimerender, Flask-Restless, Flask-SQLAlchemy, Flask-Testing, alde
  Running setup.py develop for alde
Successfully installed Flask-0.11.1 Flask-Restless-0.17.0 Flask-SQLAlchemy-2.1 Flask-Testing-0.6.1 Jinja2-2.8 MarkupSafe-0.23 Werkzeug-0.11.11 alde click-6.6 itsdangerous-0.24 mimerender-0.6.0 python-mimeparse-1.6.0 sqlalchemy-1.1.4
4. Configure the database for the application: edit the file app.py in the root folder of the application and set the following variable:
SQL_LITE_URL='sqlite:////tmp/test.db'
5. Set the port of the REST service. Again, edit the file app.py and set the following variable:
PORT=5000
Once ALDE is integrated with the other operational components of the TANGO toolbox, it will be possible to run it as follows.
$ python app.py
For the TANGO development tools, such as the Requirement and Design Modelling toolbox, the Programming Model plugins, and eventually the Code Optimiser, ALDE provides a REST9 service as the front-end for accessing the operational part of a TANGO system, in other words the infrastructure where an application is executed. This service runs at the URL http://localhost:5000/ of the operational TANGO system.
3.4.3 API Documentation
The documentation for the REST API of this component is available here: http://docs.applicationlifecycledeploymentengine.apiary.io/
At the moment, the REST API allows the creation of the following entities to define a testbed:
Testbed – It defines a testbed, its characteristics and endpoints to interact with it, if possible.
o Node – It defines a computation node in a testbed. If the testbed is of the on-line type, its nodes will be automatically added by ALDE to the database.
o CPU – CPU information of a Node (it can contain several CPUs)
o GPU – GPU information of a Node (it can contain several GPUs)
o MCP – MCP information of a Node (it can contain several MCPs)
o Memory – Information of the different memory configurations of the node. Typically there will be one entry per memory module.
o FPGA – FPGA information of a node (it can contain several FPGAs).
Also, the user can define applications to be built, compiled, packaged and, when possible, deployed on a target testbed.
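The resulting entity hierarchy can be sketched with Python dataclasses (an illustration of the data model only; the field names are hypothetical and ALDE's actual models are SQLAlchemy based):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CPU:
    model: str
    cores: int

@dataclass
class GPU:
    model: str

@dataclass
class MCP:
    model: str

@dataclass
class Memory:
    size_mb: int        # typically one entry per memory module

@dataclass
class FPGA:
    model: str

@dataclass
class Node:
    # A node can contain several CPUs, GPUs, MCPs and FPGAs.
    name: str
    cpus: List[CPU] = field(default_factory=list)
    gpus: List[GPU] = field(default_factory=list)
    mcps: List[MCP] = field(default_factory=list)
    memory: List[Memory] = field(default_factory=list)
    fpgas: List[FPGA] = field(default_factory=list)

@dataclass
class Testbed:
    name: str
    on_line: bool       # on-line testbeds have their nodes discovered by ALDE
    nodes: List[Node] = field(default_factory=list)

tb = Testbed("example-testbed", on_line=False,
             nodes=[Node("node-1", cpus=[CPU("x86_64", 8)])])
print(tb.nodes[0].cpus[0].cores)  # 8
```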
9 https://en.wikipedia.org/wiki/Representational_state_transfer
4 Conclusions
At the end of the first year of the project, the installation procedures presented in Section 2 help a development team set up a development environment that facilitates the implementation of an application capable of benefiting from heterogeneous hardware capabilities, notably in terms of parallelisation.
In addition, the installation process described in Section 3 guides the setup of an operational infrastructure composed of heterogeneous hardware, so that it becomes possible to measure the energy consumed by the various types of heterogeneous hardware elements found in a physical host.
Using the development tools and an operational testbed, it is then possible to profile an application's time and energy performance, and to benchmark various implementation alternatives of an application in order to determine how best to scope the granularity of the computing tasks to deploy and run on the available heterogeneous hardware.
At the end of Year-1, the implementation of the various TANGO components is far from final. Extensive effort will continue, not only to improve the scientific innovation but also to better integrate these components and automate many tasks that currently remain manual.
At the end of Year-1, neither the software nor the hardware for the Smart Device elements of the TANGO architecture has progressed far enough to be worth presenting. These IoT Smart Device scenarios will be explored further during Years 2 and 3, notably through the development of the industrial case study from Deltatec.