BodenType DC Deliverable D3.1 Dissemination level: PU Grant Agreement: 768875
Status: Submitted © BodenType DC Consortium 2018 Version: 1.0 Page 1 | 33
Benchmark Modeling Methodology
Project acronym: BodenTypeDC GA Number 768875 Call identifier H2020-EE-2017-RIA-IA
Deliverable no. D3.1 Dissemination Level: PU Deliverable type: Report
Due Date of deliverable: 31 03 2018 Completion date of deliverable: 31 03 2018
Lead beneficiary responsible for deliverable: Fraunhofer IOSB
Related work package: WP3 Modeling and benchmarking loads patterns
Authors: Batz, Thomas (Fraunhofer IOSB) Herzog, Reinhard (Fraunhofer IOSB) Watson, Kym (Fraunhofer IOSB) Sarkinen, Jeffrey (SICS)
Summers, Jon (SICS)
Ref. Ares(2018)1890870 - 09/04/2018
Document history:
Revision Date Status
V0.1 09.01.2018 Initial draft
V0.2 29.01.2018 First version with content
V0.3 30.01.2018 Next draft with more content
V0.35 09.02.2018 Next draft with more content
V0.4 12.02.2018 Next draft with more content
V0.5 15.02.2018 Interface descriptions by SICS; Application oriented load architecture
V0.6 18.02.2018 New section layout
V0.7 07.03.2018 Input of the partners, new document structure, modified figures
V0.8 19.03.2018 Separated conceptional aspects from deployment; integrated input from all
V0.9 23.03.2018 Final draft with review by SICS
V1.0 31.03.2018 Final report after QA
Disclaimer: The European Commission support for the production of this publication does not
constitute endorsement of the contents which reflects the views only of the authors, and the
Commission cannot be held responsible for any use which may be made of the information
contained therein.
Notice on using EU funds: This project has received funding from the European Union’s Horizon
2020 research and innovation programme under grant agreement No 768875
Copyright Notice: BodenTypeDC Consortium 2018. All rights reserved.
1. Executive Summary
The project BodenTypeDC aims to prototype an innovative energy and cost-efficient data center in
Boden, Sweden. WP3 “Modeling and benchmarking loads patterns” in this project contributes through
the following objectives:
- Develop a methodology for modeling of benchmark loads in data centers
- Specify benchmarks based on real application oriented and synthetic scenarios
- Provide tools to generate IT workload models in the data center based on the benchmarks.
The IT workload models will be used in the test operation and measurement phase in WP5 “Prototype
testing, measurement”. The benchmarks will be the basis for an objective assessment of the
performance metrics (including Power Usage Effectiveness) of the data center. They will be validated
and iteratively adapted as needed.
This first WP3 Deliverable 3.1 “Benchmark modeling methodology” describes the approach to
specifying and realizing benchmarks to achieve the above objectives. The benchmarks shall include
application oriented benchmark load patterns complemented by low-level synthetic benchmark load
patterns. User stories (application scenarios) taken from given application domains will be mapped
onto use cases (and if needed sub-use cases) and then mapped step-wise onto layers of Application
Primitives. The lowest layer shall comprise deployable software modules. This same general approach
is applicable to the low-level synthetic loads as well and yields a high degree of flexibility in specifying
benchmark load patterns. Recognized sources of use cases such as Industry 4.0 of the Plattform
Industrie 4.0 and the Industrial Internet Consortium will be referenced.
A functional architecture for the development of application patterns, the deployment of loads in the
data center and the measurement of performance metrics is presented.
The software containerization platform docker is proposed for the technical realization of the load
deployment. This platform provides an efficient means to deploy possibly very complex and large
software systems as docker images in so-called containers running on a host (hardware or virtual
machine). The docker images can be software emulations of real IT systems commonly used in
Industrial Internet of Things applications in various domains. The docker images can be easily
replicated and dynamically orchestrated on the hosts of the data center.
List of abbreviations
AD Application Domain
AP Application Primitive
API Application Programming Interface
AWS Amazon Web Services
DC Data Center
HW Hardware
IIC Industrial Internet Consortium
IIoT Industrial Internet of Things
IoT-A Internet of Things – Architecture
IoT RA ISO/IEC Internet of Things Reference Architecture
IIRA Industrial Internet Reference Architecture
IVI Japanese Industrial Value Chain Initiative
I4.0 Industry 4.0 or Industrie 4.0 (of the organization Plattform Industrie 4.0;
https://www.plattform-i40.de/I40/Navigation/EN)
OPC Open Platform Communications
OPC UA Open Platform Communications Unified Architecture
PUE Power Usage Effectiveness
RAMI4.0 Reference Architecture Model Industrie 4.0
ROI Return on Investment
SLA Service Level Agreement
SNMP Simple Network Management Protocol
SUC (Sub) Use Case
SW Software
UC Use Case
VBS Value Based Services
VM Virtual Machine
Table of Contents
1. Executive Summary .............................................................................................................. 3
2. Introduction ......................................................................................................................... 8
3. Overall methodology .......................................................................................................... 10
4. Relevant Application Domains ............................................................................................ 12
4.1. Domain Architectures ............................................................................................................ 12
4.1.1. Smart Cities .................................................................................................................... 13
4.1.2. Industry 4.0 use cases .................................................................................................... 14
4.1.3. IIC Use Cases................................................................................................................... 14
5. Application Oriented Benchmark Load Patterns ................................................................. 15
5.1. Load Characterization ............................................................................................................ 15
5.2. Example decomposition of a User Story ................................................................................ 16
5.2.1. User Story: Equipment Lifecycle Management .............................................................. 17
5.2.2. Use Case: Predictive Maintenance ................................................................................. 17
5.2.3. Mapping to Application Primitives ................................................................................. 18
5.3. Example Domain Autonomous Driving .................................................................................. 18
6. Functional View of Subsystems .......................................................................................... 20
6.1. Overview ................................................................................................................................ 20
6.2. Subsystems ............................................................................................................................ 20
6.2.1. Subsystem 1 “Load Definition” ...................................................................................... 20
6.2.2. Subsystem 2 “Load Deployment” .................................................................................. 21
6.2.3. Subsystem 3 “Measurement System” ............................................................................ 21
6.3. Interfaces ............................................................................................................................... 21
6.3.1. Interface 1 (high level load definition) ........................................................................... 22
6.3.2. Interface 2 (polling of IT specific measurement data) ................................................... 22
6.3.3. Interface 3 (measurement data feedback for dynamic load mapping) ......................... 22
7. Technical realization........................................................................................................... 23
7.1. What are Dockers? ................................................................................................................ 23
7.2. Docker deployment ............................................................................................................... 23
7.3. Realization of Subsystem 2 .................................................................................................... 25
7.4. Load Mapping Strategies ....................................................................................................... 26
7.5. Deployment of Synthetic Benchmark Load Patterns ............................................................. 27
7.6. Load balancing ....................................................................................................................... 28
8. Outlook .............................................................................................................................. 30
9. References ......................................................................................................................... 31
10. List of figures ...................................................................................................................... 33
2. Introduction
The task of the BodenTypeDC project is to build an energy and cost-efficient data center at Boden in
Sweden. This covers the building construction, equipping it with IT equipment as well as a cooling,
measurement and control infrastructure.
The characteristics of the computing workload on the data center and how it is distributed over the
available resources have a major impact on the energy consumed and on metrics such as the Power Usage
Effectiveness (PUE). A load balancing strategy can optimize which computing resources are used and
when. This normally depends on the management strategy for operating the data center, the service
spectrum of the data center and the Service Level Agreements (SLA) offered to customers. A service
level agreement gives a guaranteed computing performance for a given cost model. The objective from
the perspective of the data center is to minimize energy usage needed to achieve the service level
agreements and to simultaneously maximize the revenue. The data center needs to be able to develop
offers to potential new customers that it can fulfil with the available resources. A data center can vary
the number of CPUs assigned to the different customers, the way the load is distributed to these CPUs,
and the mechanisms for redistribution at run-time. On the other hand, a data center
customer (user) needs to be able to estimate what resources and hence what service level agreement
is needed.
BodenTypeDC aims to address two principal questions regarding workload characterization:
a. What is the relationship between various benchmark loads and energy effectiveness?
b. Can application profiles be characterized with benchmark load patterns?
In BodenTypeDC WP5, various experiments and measurements will be performed in the real data
center to prove its power efficiency and to optimize the infrastructure. These experiments will also
have the objective of validating the benchmark load patterns. Feedback from WP5 will be used to
refine the benchmark specifications.
Two classes of benchmark load patterns will be considered that can be used in combination:
a. High-level loads: Application oriented loads motivated / derived from actual applications
b. Low-level loads: Synthetic load patterns defined at the HW level
Major sources of high-level benchmark workloads shall be usage scenarios in typical industrial
application domains. The aim is to identify use cases that typically execute in a data center, to derive
representative load patterns for these use cases, and to generalize them for the application
domain. The objective is to characterize a class of applications. The application domains under
consideration are in the area of Industrial Internet of Things as explained in section 4.
The low-level IT workloads are essentially time series of workloads for each server. The workloads can
be described in terms of CPU, RAM, hard disk or communication workloads that vary over time.
Moreover, synthetic static workloads are needed to model the background load and extreme
situations. These are models with a certain workload percentage for selected servers (even up to 100
%). The actual workload can be adjusted according to a target workload (up to 500 kW) taking into
account baseline workloads already produced by users of the data center (up to 350 kW).
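Such a per-server time series can be sketched as follows. The generator below is purely illustrative (the function name, the sinusoidal shape and all parameter values are assumptions, not project specifications): a static baseline load is superimposed with a periodic variation and clipped at 100 %.

```python
import math

def synthetic_workload(baseline_pct, peak_pct, period_s, step_s, duration_s):
    """Return a time series (t, cpu_target_pct) for one server: a static
    baseline load with a superimposed sinusoidal variation, clipped at 100 %."""
    series = []
    for t in range(0, duration_s, step_s):
        variation = (peak_pct - baseline_pct) / 2.0 * (1.0 + math.sin(2.0 * math.pi * t / period_s))
        series.append((t, min(100.0, baseline_pct + variation)))
    return series

# One hour sampled every 60 s: 30 % background load oscillating up to 80 % peaks.
pattern = synthetic_workload(baseline_pct=30, peak_pct=80, period_s=900, step_s=60, duration_s=3600)
```

Analogous generators can be defined for RAM, disk and communication loads, so that a low-level benchmark is simply a set of such series per server.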
This report is structured as follows:
Section 3 describes the overall methodology of developing benchmark load patterns, covering
a) the validation loop with measurements in WP5 and
b) the interaction with initiatives developing recognized use cases for application domains.
Section 4 gives an overview of several major application domains that will be considered as guiding
sources of relevant load patterns.
Section 5 describes a generic layered architecture for mapping user stories (application scenarios) in
application domains to application primitives that can be deployed in a data center as a high-level
application oriented benchmark load.
In section 6, the functional view of the subsystems in the data center is described in relation to load
deployment. The proposed technical realization is presented in section 7. The approach focuses on
using the software containerization platform docker for load deployment.
Finally, section 8 gives an outlook of the further work planned in BodenTypeDC around benchmark
load patterns.
3. Overall methodology
The top-level process for establishing benchmark load patterns is shown in Figure 1, where
BodenTypeDC internal and external feedback loops are distinguished. The internal feedback loops are
on the one hand with the WP3 experts and on the other hand between WP3 and WP5. Controlled
experiments with benchmark loads and pertinent performance metrics in the data center will be
conducted in WP5; see also section 6.2.3. The results of the WP5 experiments will be stored in
databases to link load patterns, application domains and performance metrics.
The benchmarks will be revised depending on the validation results, e.g.
- sensitivity of the metrics to the chosen benchmark load
- relevance to the operation of the data center
- relevance to the application domain. Not all data processing is suitable for data center
  deployment. Some may be done at the “edge” rather than in the “cloud”.
The external feedback loop addresses the interaction with external organizations defining use cases
for application domains or lower level workloads for data centers. BodenTypeDC will derive and
propose benchmark load patterns from this input and report back to the respective organization as
appropriate. The objective is to characterize the workload for each application domain as a generic,
parameterizable workload model. It may be useful to develop an abstraction over multiple application
domains and attain abstract load patterns for a class of industrial applications.
Figure 1: Organisational Feedback Loops
4. Relevant Application Domains
4.1. Domain Architectures
A benchmark for a data center shall reflect the specific characteristics of the applications to be served.
The problem is that all applications are different and have very individual behavior. The challenge is to
find the commonalities in the diversity. The goal is to define application blueprints which are generic
enough to be implementable as a benchmark, but still specific enough to reflect the requirements of
relevant application domains.
The assumption for the benchmark modeling approach is that data center applications can be
categorized into so-called “Application Verticals”. An application vertical denotes
the complete range of elements in the technology stack for a given application domain. The verticals of
two distinct domains may be completely different in terms of technologies, software
architectures or business cases. But the applications within a given vertical are similar, as they share
common reference architectures. Such reference architectures define blueprints for application
primitives, which are used as common building blocks for specific applications. By implementing a
reference architecture for a specific application, the result may be very specific in many aspects but it
will still contain common structures built from the same types of application primitives and wired in
comparable ways. Figure 2 illustrates the concepts of application verticals containing individual
reference architectures.
Figure 2: Application Domain Verticals
The benchmarks to be implemented will contain application primitives defined by domain specific
reference architectures. These application primitives shall be instantiated and composed to represent
a typical workload behavior for an application in such a domain.
The application verticals to be considered will be from the Industrial Internet of Things (IIoT) context.
There are several reference architectures available, like the “Reference Architecture Model Industrie
4.0” (RAMI4.0), the “Industrial Internet Reference Architecture” (IIRA), the “ISO/IEC Internet of Things
Reference Architecture” (IoT RA), and the “Internet of Things – Architecture” (IoT-A). The IIRA is very
popular in the industrial sector as well as in some smart city applications. The RAMI4.0 is currently being
merged with the IIRA. The IoT-A has until now only been adopted within research projects. As a starting
point the IIRA seems to be a good choice. The IIRA also defines several use case scenarios, which may
be used to instrument the BodenTypeDC benchmark.
The following sections discuss relevant use cases in the various application domains.
4.1.1. Smart Cities
In order to enhance the quality of life of their citizens and improve economics and business
competitiveness, modern cities are required to design and implement new and smart infrastructures,
business models and strategies. Various definitions exist as to what a smart city could be. Information
and communication technologies, sensor networks, and the collection and interpretation of big data play a
central role in all definitions. Databases and models are needed for the city buildings and complete
infrastructure to support management and planning tasks. Some cities are establishing an open data
platform to enable new applications and business models to be realized.
Typical themes in smart cities are buildings, healthcare, environment, transportation and homes [1].
Use cases in the city of Tartu [2] are for example:
- Energy consumption, analysis of the current situation like the solar report and showing the
  energy class for single buildings
- Heating demands for single buildings and city areas
Generally, use cases could include:
- Smart Urban Planning
- Smart Governance
- e-Participation
- Collecting and aggregating the data for the key indicators of a smart city
- Monitoring of traffic flow
4.1.2. Industry 4.0 use cases
As defined in Plattform Industrie 4.0, Application Scenarios are a generic, general description of a task
or challenge faced by a user, see [3]. The scenario may be relevant today or in the future. The
description includes business aspects and value chains. Application examples are possible solutions for
an Application Scenario.
An Application Scenario can contain one or more sub-scenarios. For example, the Application Scenario
“VBS - Value Based Services” contains the sub-scenarios “Condition Monitoring Services”, “Machine
and Process Optimization Services” and “Production Scheduling Services”, see [4]. Each sub-scenario
has a description of the stakeholder and vision, the values and experiences, the key business objectives
and the fundamental capabilities. Further Application Scenarios which are relevant to BodenTypeDC
are a) Smart product development for smart production (SP2), b) Innovative product development
(IPD).
4.1.3. IIC Use Cases
The Industrial Internet Consortium (IIC) has a Use Case Task Group that is compiling vertical use cases
for the IIC verticals and horizontal use cases. As part of the liaison between IIC and the Japanese
Industrial Value Chain Initiative IVI, IIC has also adopted use cases from IVI. IVI has defined several
Smart Manufacturing Scenarios grouped into categories [5]. Many of these categories could involve
data center deployment, e.g. “Quality Management Information” (with use case Traceability of Quality
Data), “Preventive Maintenance”, “Asset and Equipment Management” and “Maintenance Service
Management”. In addition, case studies have been published by IIC members under [6].
5. Application Oriented Benchmark Load Patterns
5.1. Load Characterization
The load on a computer system (CPU, storage, interfaces) is very much dependent on the applications
which run on it. Hence, low-level hardware-oriented load patterns are not in general a true reflection
of the load generated by an application, cf. e.g. [7]. A de-compositional approach to defining
application specific benchmark workloads is adopted here (similar to that in [8]). This involves
decomposing the application into functional layers as explained below. In fact, the approach here will
start with User Stories of an Application Domain at the highest level, map (decompose) these to Use
Cases, which in turn are mapped to Application Primitives. The Application Primitives may be organized
in a layer hierarchy.
The following key concepts will be used to define the high-level loads in the so-called load pattern
characterization architecture (See Figure 3).
Figure 3: Load pattern characterization in layers
User Story: Describes, in a few sentences in the everyday or business language of an actor, a situation that
captures what a user does or needs to do as part of his or her job function. A user story captures the
'who', 'what' and 'why' of an activity in a simple, concise way. It is domain specific. A user story is
sometimes referred to as an application scenario. It describes a situation and its context.
Use Case: Describes functional aspects, actors and workflows to be executed by the actors. An actor
tells a user story, which motivates a use case to be performed by an actor. The use cases may be refined
with sub-use cases.
The use cases are mapped to Application Primitives (normally software modules) that are required to
realize the respective use case. The Application Primitives are hierarchically grouped in layers of
application primitives as shown in Figure 3 with two levels of primitives. In general, there may be more
levels. Primitives of a given level call primitives of the next lower level. The lowest level may be
hardware oriented operations such as the basic numeric operations or numeric routines such as matrix
multiplication (e.g. on a GPU).
In this abstract and generic approach, no particular constraints are imposed on the primitive types. It
is assumed that the lowest level comprises modules that can be deployed or executed on the target
hardware. One deployment method is with docker; this is explained in section 7.2.
An Application Primitive may have additional constraints such as necessary disk or main memory
storage, tolerable latency and data input / output throughput.
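The layering and the constraints described above could be captured in a small data model. A minimal sketch, in which the class and field names are illustrative assumptions rather than part of the project specification:

```python
from dataclasses import dataclass, field

@dataclass
class ApplicationPrimitive:
    """One element of the layered load pattern (cf. Figure 3)."""
    name: str
    level: int
    ram_mb: int = 0            # required main memory
    disk_mb: int = 0           # required disk storage
    max_latency_ms: int = 0    # tolerable latency; 0 means unconstrained
    calls: list = field(default_factory=list)  # primitives of the next lower level

    def total_ram_mb(self) -> int:
        """Aggregate the memory constraint over the whole call hierarchy."""
        return self.ram_mb + sum(c.total_ram_mb() for c in self.calls)

# A level-1 primitive calling a level-0 numeric routine, as in the text.
matmul = ApplicationPrimitive("matrix multiplication", level=0, ram_mb=256)
training = ApplicationPrimitive("model training", level=1, ram_mb=1024, calls=[matmul])
```

Aggregating constraints over the hierarchy in this way would allow a deployment tool to check whether a composed load pattern fits the target hardware.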
The lowest level could also comprise data analytics algorithms running on a cloud platform. The SPEC
Cloud™ IaaS 2016 Benchmark from the Standard Performance Evaluation Corporation [9] considers
two workloads:
- An I/O intensive workload: Yahoo! Cloud Serving Benchmark (YCSB)
- A compute-intensive workload: K-Means with Apache Hadoop
The K-Means algorithm is often used for clustering applications.
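To illustrate the compute-intensive class, a plain K-Means kernel can be written in a few lines. This is a generic sketch, not the SPEC Cloud / Hadoop implementation; all values are illustrative.

```python
import random

def k_means(points, k, iterations=20, seed=0):
    """A plain k-means on 2-D points, intended purely as a CPU-bound kernel."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                                + (p[1] - centroids[i][1]) ** 2)
            clusters[nearest].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # keep the old centroid if a cluster ran empty
                centroids[i] = (sum(p[0] for p in cl) / len(cl),
                                sum(p[1] for p in cl) / len(cl))
    return centroids

rng = random.Random(1)
data = ([(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
        + [(rng.gauss(5, 1), rng.gauss(5, 1)) for _ in range(200)])
centroids = k_means(data, k=2)
```

Scaling the number of points, clusters and iterations gives a simple knob for tuning the CPU demand of such a synthetic compute workload.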
In Figure 3, parameters are needed for each arrow to describe when and how an element calls
elements in the lower layers. It is possible for an element to instantiate and call multiple entities in the
next lower layer. This parametrization of the load pattern yields an actual load that can be deployed.
With the technique of application profiling, the idea is to find a load pattern that yields a close
approximation to the real application load as measured close to the hardware and operating system.
This is achieved by analyzing the measurement traces of an application.
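A toy illustration of such profiling, assuming a purely linear relation between call rate and CPU utilisation; the model, the trace values and the per-call cost are all hypothetical:

```python
def fit_call_rate(measured_util_pct, cost_per_call_pct):
    """Closed-form least-squares fit of a constant call rate r in the linear
    model util ~ r * cost_per_call; for constant cost this is mean(util) / cost."""
    return sum(measured_util_pct) / len(measured_util_pct) / cost_per_call_pct

trace = [42.0, 38.5, 41.0, 40.5]                    # hypothetical measured CPU trace (%)
rate = fit_call_rate(trace, cost_per_call_pct=0.8)  # assumed cost of one call: 0.8 % CPU
```

In practice the fit would cover several parameters at once (call rates, instance counts, payload sizes) and be validated against the full measurement trace rather than a single mean.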
5.2. Example decomposition of a User Story
This example is a primary user story with use cases in the Industry 4.0 initiative of Plattform Industrie
4.0 and in IIC.
5.2.1. User Story: Equipment Lifecycle Management
The manager of a production plant wants an equipment lifecycle management with optimized
equipment maintenance so that he can manage the equipment more cost effectively to support the
production process. The manager aims to increase the ROI of the plant by reducing down-times and
maintenance costs.
Equipment vendors want equipment lifecycle management so that they can give better
service/guarantee to their customer and keep or even increase their business. Equipment vendors
want to collect and analyze data from as many factories as possible in order to better understand how
the equipment is being used and to improve the quality of their equipment.
5.2.2. Use Case: Predictive Maintenance
Predictive maintenance techniques are designed to help determine the condition of equipment in
order to predict when maintenance should be performed. The aim is to reduce costs as compared to
routine or time-based preventive maintenance, since maintenance tasks are performed only when
needed or justified. Moreover, predictive maintenance allows activities to be scheduled at a
convenient time with a minimum disruption of production. This reduces equipment failures and
increases plant availability. Other potential advantages include increased equipment lifetime,
increased plant safety, fewer accidents with negative impact on environment, and optimized spare
parts handling. The actors in this use case are the equipment vendor, the maintenance planner, the
plant manager and the maintenance service provider.
Workflow for a plant manager:
1) Gather sensor data from equipment (e.g. usage, vibrations, temperature, energy usage, etc.)
and the manufacturing process (e.g. relating to quality of the part products).
2) Analyze the data to estimate when a machine should be repaired (e.g. when the product
quality drops below a threshold or the usage of a machine part has exceeded a threshold).
3) Notify the maintenance planner.
4) Notify the maintenance service provider.
Workflow for an equipment vendor:
5) Gather sensor data from equipment (possibly from several plants) on a cloud platform.
6) Analyze the data to recommend a maintenance strategy and to estimate the residual life time
of equipment and parts depending on the usage profile.
7) Notify the plant manager.
A further use case for the equipment vendor is to provide requirements to the development
department on new products.
5.2.3. Mapping to Application Primitives
The deployment of analytics for predictive maintenance typically involves three steps:
- Selection and training of a predictive model
- Testing and validation of the model on previously unseen data
- Deployment of the model to make predictions on real data streams
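These three steps can be sketched with a deliberately trivial stand-in model (a learned vibration threshold). Every function, field and number below is an illustrative assumption, not a project-defined method:

```python
def train(history):
    """Step 1: 'train' a trivial model: a vibration threshold with 10 % margin
    above the largest value observed on healthy equipment."""
    healthy = [v for v, failed in history if not failed]
    return max(healthy) * 1.1

def validate(model, unseen):
    """Step 2: fraction of correctly classified, previously unseen samples."""
    correct = sum((v > model) == failed for v, failed in unseen)
    return correct / len(unseen)

def deploy(model, stream):
    """Step 3: raise maintenance alerts on a live sensor stream."""
    return [v for v in stream if v > model]

history = [(0.8, False), (0.9, False), (1.0, False), (2.5, True)]
threshold = train(history)
accuracy = validate(threshold, [(0.7, False), (2.0, True)])
alerts = deploy(threshold, [0.9, 1.4, 1.0, 3.0])
```

From a load-pattern perspective the three steps differ markedly: training is compute-intensive and batch-like, validation is a shorter batch job, and deployment is a long-running, I/O-driven stream workload.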
Examples of Application Primitives at level 2:
- Application Primitive 21: A database server with a standardized interface as a repository of
  sensor data, such as a SensorThings API server.
- Application Primitive 22: Data curation (to pre-process the data into the right format)
- Application Primitive 23: Predictive model training
- Application Primitive 24: Predictive model deployment with calls to Application Primitives at
  level 1
Here are a few of many possible examples of Application Primitives at level 1. For further examples
see [7].
- Application Primitive 11: Signal feature extraction
- Application Primitive 12: Unsupervised learning with a Gaussian Mixture Model
- Application Primitive 13: Self-Organizing Map (SOM)
- Application Primitive 14: K-Means clustering
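As a concrete impression of such a level-1 primitive, signal feature extraction can be as simple as reducing a raw sensor window to a few summary statistics. The choice of features here is illustrative:

```python
import math

def extract_features(window):
    """Illustrative 'signal feature extraction' primitive: reduce a raw
    sensor window (e.g. vibration samples) to a few summary features."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)   # root mean square
    peak = max(abs(x) for x in window)                # largest excursion
    return {"mean": mean, "rms": rms, "peak": peak}

features = extract_features([0.0, 1.0, -1.0, 2.0])
```

Primitives like this are called at high rate on incoming sensor data, so they contribute a steady, mostly CPU-bound component to the overall load pattern.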
5.3. Example Domain Autonomous Driving
An application domain like autonomous driving defines various use cases, e.g.:
- use case 1: build the situational representation of an actual driving scene on a road with
  various cars, cyclists, pedestrians and traffic infrastructure (streets, lanes, crossings, traffic
  lights, crosswalks …) involved, in order to generate the actual options for the driver, or
- use case 2: train an algorithm to recognize special kinds of vehicles up to a detection rate of
  P %.
Use case 2 can be performed in a remote data center whereas use case 1 has to run in a nearby data
center or in the car itself to achieve the necessary real-time response time reliably.
For use case 1 there are several possible sub-use cases:
sub-use case 1: receive the data from the environment sensors
sub-use case 2: establish the situational overview
sub-use case 3: perform the routing algorithm for the destination of the driver
sub-use case 4: generate the recommended action for the car/driver
. . .
For use case 2 examples of sub-use cases are:
sub-use case 1: transfer the necessary data to the processing environment
sub-use case 2: perform the recognition algorithm until the success rate is at least P%.
sub-use case 3: transfer the identified parameters back to the development center and delete
the data in the data center
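As an illustrative sketch of sub-use case 2, the following Python fragment runs a training loop until a target detection rate P is reached. The train_step and evaluate callables are hypothetical stand-ins for the real recognition algorithm, not project artifacts.

```python
def train_until(target_rate, train_step, evaluate, max_epochs=100):
    """Run training epochs until the detection rate reaches target_rate (sub-use case 2).

    train_step and evaluate are hypothetical stand-ins for the real
    recognition algorithm and its validation."""
    for epoch in range(1, max_epochs + 1):
        model = train_step(epoch)
        rate = evaluate(model)
        if rate >= target_rate:
            return epoch, rate
    raise RuntimeError("target detection rate not reached within max_epochs")

# Stub: detection rate improves by 5 percentage points per epoch, capped at 99 %.
epoch, rate = train_until(
    target_rate=0.95,
    train_step=lambda e: e,                        # the "model" is just the epoch count here
    evaluate=lambda m: min(0.60 + 0.05 * m, 0.99),
)
```

The stopping criterion is the quantity of interest for benchmarking: the number of epochs until P% is reached determines the duration and energy cost of the data center run.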
6. Functional View of Subsystems
This section introduces the software architecture for deployment of application load patterns and describes the subsystems and the interfaces.
6.1. Overview
Figure 4 shows a simplified view of the architecture, in which the cooling process is neglected.
Figure 4: Functional SW-Architecture and interfaces for the deployment of Application Load Patterns
6.2. Subsystems
The system is divided into three subsystems, one for the generation of the application load patterns,
one for load deployment and one for the measurement system.
6.2.1. Subsystem 1 “Load Definition”
This subsystem implements the load pattern layers as described in section 5.1. Each element in the
layers has associated characteristics that comprise relevant parameters and constraints. The
Application Primitives in the lowest layer are combined to form a characteristic load pattern for an
application domain. Application Primitives may require access to external data repositories or have a
built-in database of emulated input data. For example, processing algorithms in the field of Machine
Learning require input data in order to run in a meaningful way. A benchmark load is a defined mixture
of load patterns with parameters and constraints for its concrete deployment. A benchmark with
defined parameter variations forms the basis of a controlled experiment in the data center.
6.2.2. Subsystem 2 “Load Deployment”
Subsystem 2 involves deploying loads onto the BodenTypeDC data center. These loads can either be
synthetic loads or application oriented loads. The application oriented loads will be defined by
Subsystem 1 and will be provided in the form of a high-level load mixture, which will include docker
images. The high-level load mixture will be deployed using a docker orchestrator according to a desired
load mapping strategy. A more detailed description of docker deployment within subsystem 2 can be
found in section 7.3, “Realization of Subsystem 2”.
For synthetic load deployment, see section 7.5.
6.2.3. Subsystem 3 “Measurement System”
Subsystem 3 involves collecting and analyzing measurement data in the BodenTypeDC data center.
Subsystem 3 will be realized in work package 5, task 5.1. Parameters that will be monitored on the IT
equipment include power consumption, temperatures, CPU utilization, memory usage, disk and
network usage. In the facility the environment and energy consumptions will be monitored as well as
the performance of the cooling system. The data which is collected will be useful in evaluating how
different workloads affect data center operation and performance. The metric Power Usage
Effectiveness (PUE) will be obtained for the experiments. The measurement data can help provide an
understanding of how different high-level load definitions affect data centers so that they can be built
and planned for accordingly in the future.
6.3. Interfaces
There are three interfaces between the subsystems defined above. Interfaces 2 and 3 are between
Subsystem 2 ‘Load Deployment’ and Subsystem 3 ‘Measurement System’ (one in each direction).
Interface 1 is from Subsystem 1 ‘Load Definition’ to Subsystem 2 ‘Load Deployment’. There is no
return interface because this step cannot be done by software at the moment. The feedback loop
described in section 3 and Figure 1 from ‘Experiments’ to ‘Application Load Patterns’ will be done
manually by analyzing the experimental results.
Table 6-1: Interfaces within the Load Pattern Architecture
Interface 1: High-level load definition for the data center
Interface 2: Polling IT-specific measurement data
Interface 3: Feedback of measurement data for dynamic load mapping strategies
6.3.1. Interface 1 (high level load definition)
Interface 1 describes the mapping of the high-level load mixture to subsystem 2 (Load Deployment).
The load deployment can vary over time. A benchmark load has a start and stop time, parameter sets
and can also be given resource constraints regarding the target host systems. The parameter sets
determine the experiment variations to be run in the data center. Experiment sequences over varying
parameters help to identify the dependency of the performance metrics on particular parameters.
Simple examples of variable parameters are:
the iteration count for a Machine Learning algorithm,
the number of data servers, e.g. of OPC UA or SensorThings API servers (cf. section 7.2),
the number of information nodes in a server.
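Such parameter sets can be expanded into a sequence of experiment configurations, for example by taking the Cartesian product of the parameter value lists. The following Python sketch uses hypothetical parameter names; the actual load modeling language will be defined in later WP3 work.

```python
from itertools import product

# Hypothetical parameter variations for one benchmark load
# (names are illustrative, not taken from this deliverable).
variations = {
    "ml_iterations":    [100, 1000],
    "server_count":     [1, 4, 16],
    "nodes_per_server": [50, 500],
}

# Expand the parameter sets into one experiment configuration per combination.
experiments = [
    dict(zip(variations, combo)) for combo in product(*variations.values())
]
print(len(experiments))  # 2 * 3 * 2 = 12 experiment runs
```

Running the resulting sequence of configurations in the data center yields the performance metrics as a function of each parameter, which is the basis for the controlled experiments described in section 6.2.1.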
6.3.2. Interface 2 (polling of IT specific measurement data)
Interface 2 is the methodology of collecting data from the IT equipment which will be realized in WP5,
T5.1. The data collected will include power consumption, temperatures, CPU utilization, network
usage, disk usage and memory usage. The data can be collected using an appropriate management
protocol (e.g. SNMP) or via measurement agents that run on the servers’ operating systems.
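A minimal sketch of such a polling mechanism is shown below. The read_metrics and publish callables are placeholders for, e.g., an SNMP query and a hand-over to Subsystem 3; they are illustrative and not part of the actual WP5 implementation.

```python
import time

def poll_measurements(read_metrics, publish, interval_s=10, samples=3):
    """Generic polling loop for Interface 2: read IT metrics at a fixed
    interval and hand them to the measurement system.

    read_metrics and publish are placeholders for an SNMP query or an
    on-server measurement agent."""
    for _ in range(samples):
        sample = read_metrics()          # e.g. power, temperature, CPU, memory
        sample["timestamp"] = time.time()
        publish(sample)
        time.sleep(interval_s)

# Stubbed demonstration: a fake metric reader and an in-memory sink.
collected = []
poll_measurements(
    read_metrics=lambda: {"cpu_util": 0.42, "power_w": 180.0},
    publish=collected.append,
    interval_s=0,                        # no waiting in this demonstration
    samples=3,
)
```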
6.3.3. Interface 3 (measurement data feedback for dynamic load mapping)
Interface 3 will involve using measurement data provided by subsystem 3 to dynamically adjust the
load mapping in subsystem 2. This load mapping adjustment is given the name “DC management
strategy” in the overall system architecture Figure 4. This interface will be considered for adoption in
WP5, T5.1, where the strategy, yet to be decided, could be used to optimize data center performance
and efficiency by moving workloads to different locations within the data center. This could involve
moving loads based on measured data center metrics, such as temperatures, power consumption or
resource interference.
7. Technical realization
Several variants exist to implement the benchmark loads in a data center. One variant is to use virtual
machines; another is to use Docker containers.
Docker containers are lightweight and currently appear to be the more suitable choice. The following
section describes what Docker is and how it can be used.
7.1. What is Docker?
Docker is a concept and software realization that allows virtualization on the level of the operating
system, also known as containerization. The software is developed by Docker, Inc. [10]. It was primarily
developed for Linux, but it is now also available for Windows.
It allows independent containers to run within a single Linux instance based on Linux kernel features.
This avoids the overhead of virtual machines. Docker allows the definition of constraints on resources,
such as CPU, memory, block I/O, and network.
Docker images are instantiated in docker containers to run on a host. A host can be physical hardware
or a virtual machine. Hosts can be combined to form a cluster which can be managed as a whole.
Containers can be distributed across this cluster; they can communicate with each other and
the outside world and share external storage.
7.2. Docker deployment
The idea is to create Docker images representing domain typical application patterns, such as:
OPC UA Server and Clients for IIoT Applications, such as predictive maintenance. OPC UA is a
data transfer protocol widely used in industrial applications, e.g. in Plattform Industrie 4.0
(where it is a recommended protocol) and in many IIC testbeds. OPC UA servers can be
generated automatically from information models defined in the semantic description
language AutomationML and are therefore well suited to large scale deployment for
benchmark loads. For more information, see [11, 12, 13, 14].
OGC SensorThings API Server and Clients for IoT Applications, like Smart City, Environmental
Monitoring, Smart Agriculture, etc.
Images may be loaded in one or more docker containers to run on a host (see Figure 5). The
orchestration of the docker containers can be done with Kubernetes or Docker Swarm. The load
definition will be implemented as a script for the docker orchestration.
Figure 5: Load Emulation with Docker Swarm
In some cases data center load definitions will require external components to emulate load elements,
like sensor feeds or application clients. Such load components are needed to stimulate load within the
data center, but shall not be included in the measurement environment (see Figure 6).
Figure 6: External Load Stimulation
7.3. Realization of Subsystem 2
Subsystem 2 (as shown in Figure 4) will include deploying the high-level load mixture onto the data
center. This will be implemented using the concept of a docker orchestrator which manages a docker
host cluster. The docker orchestrator will have the capability to deploy and manage workloads running
within the cluster.
To enable this architecture each server will have an OS installed either on bare-metal (BM) or on a
virtual machine (VM). The benefit of using a virtual machine is that it will increase flexibility as VMs
can be migrated. This would be useful for implementing VM migration tactics to save energy and
increase performance. VM migration is not planned to be adopted in this project, but the architecture
could be built to enable its use for future work. VM migration processes could be optimized using the
strategies identified by the DOLFIN FP7 project, where OpenStack and VMware were used as VM
Hypervisor managers [15].
The servers will be managed using JUJU, which has the capability to install operating systems and
configure necessary software and is detailed in [16].
Each operating system will have a docker engine installed and the docker hosts will be grouped
together into docker clusters. The docker cluster will be managed by a docker orchestrator which will
deploy and manage containers. The containers can be deployed individually or in groups. Containers
within docker groups will run on the same docker host.
The load mixture will be created based on docker images and a load definition, as depicted in Figure
7. The load definition could be in the form of a script that would specify which docker image should be
used as the base image and how many instances of this docker container should be deployed. It will
also specify if the container should belong to a group and any constraints that it has about where it
can run in the cluster. For example, if two types of servers are available in the data center, then one
constraint could be that this container should only run on a particular server type.
Figure 7: Load Distribution Architecture
7.4. Load Mapping Strategies
The load mapping strategy describes how the load is mapped onto the available hosts and how the
load can be redistributed during runtime. The load mapping can be adapted to optimize the
performance metrics for SLAs and the cooling energy costs. This has to take the management strategy
of the data center into consideration. For example, certain hosts (CPUs) may be managed directly by
the customers. The load distribution may be static, based on heuristics or dynamically redistributed
based on actual load. The different types of hosts, such as CPUs, multipurpose CPUs, GPUs and ASICs,
and related constraints in SLAs also impact the load mapping strategy.
The docker containers can be deployed with different mapping strategies including automatic resource
availability-based mapping, host group specified mapping or a mixture of both. The resource
availability-based mapping will be defined within the docker orchestrator and can be configured to
operate in different ways. It has the potential to evenly spread the containers throughout the cluster
so that each host uses the least amount of resources. It could also prioritize filling each host with as
many containers as possible and leaving other hosts unoccupied for thermal distribution or energy
management. The containers can also be distributed to host groups within the cluster which are
defined by the user. For example, one group could contain one rack of the data center and containers
could be constrained to run within this rack. Mapping could also be done manually where every
container has a specified host it should run on.
A possible way to implement such load mappings is the docker compose tool. Compose is a tool for
defining and running multi-container docker applications. With Compose, a YAML file is used to
configure the application’s services. Then, with a single command, all the services are created and
started from the configuration.
The data representation and serialization language YAML (yaml.org) is suitable for defining benchmark
loads and the load mapping as it can describe complex data structures.
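As a purely hypothetical sketch (service names, image names, replica counts and constraints are placeholders, not project artifacts), a load definition in docker-compose v3 syntax could look as follows:

```yaml
# Hypothetical load-definition sketch in docker-compose (v3) syntax.
# Image names, replica counts and constraints are illustrative only.
version: "3"
services:
  opcua-server:
    image: bodentype/opcua-server:latest        # placeholder image name
    deploy:
      replicas: 8                               # number of container instances
      placement:
        constraints:
          - node.labels.server_type == type_a   # pin this load to one server type
  sensorthings-client:
    image: bodentype/sensorthings-client:latest
    deploy:
      replicas: 16
```

The deploy section expresses the mapping constraints discussed above: replica counts set the load volume, and placement constraints restrict which host groups within the cluster may run the containers.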
7.5. Deployment of Synthetic Benchmark Load Patterns
During the operational phase of the data center, synthetic benchmarks will also be available for
deployment in the data center. These synthetic loads will be developed in Task 3.3 and will be
implemented either using the existing SICS loading software architecture or by running synthetic
loads as docker images to be consistent with the real IoT application loads scenario.
The existing SICS software enables synthetic benchmark models to be run on the data center. The
software uses a Flask webserver [17], which utilizes Ansible [18] to communicate with servers (see
Figure 8). Many computer architecture level parameters can be set using the loading tool Stress-ng
[19]. This includes CPU stress type and utilization, RAM, disk and network. The synthetic loads can be
scheduled to run on organized groups of servers with different defined loads. Using a web interface,
the user can either manually schedule an experimental run or upload a file which contains a group of
runs that can be configured to run at specific times. The software is modular and new loading tools
can be added. The current software may also be further developed to enable the user to upload time
series data or scripts which represent model loads, with some dynamic behavior. Other synthetic
benchmarks to be considered include Prime95 [20] and Linpack [21].
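To illustrate how such computer architecture level parameters might map onto stress-ng options, the following Python sketch builds a stress-ng command line. The helper function and its defaults are illustrative; the flags shown (--cpu, --cpu-load, --vm, --vm-bytes, --timeout) are standard stress-ng options.

```python
def stress_ng_command(cpu_workers=0, cpu_load=100, vm_workers=0, vm_bytes="256M",
                      timeout_s=60):
    """Build a stress-ng command line for one synthetic load run.

    The helper and its parameter mix are illustrative; the flags used
    are standard stress-ng options."""
    cmd = ["stress-ng", "--timeout", f"{timeout_s}s"]
    if cpu_workers:
        cmd += ["--cpu", str(cpu_workers), "--cpu-load", str(cpu_load)]
    if vm_workers:
        cmd += ["--vm", str(vm_workers), "--vm-bytes", vm_bytes]
    return cmd

# A 5-minute run stressing 4 CPU workers at 60 % utilization:
cmd = stress_ng_command(cpu_workers=4, cpu_load=60, timeout_s=300)
print(" ".join(cmd))  # stress-ng --timeout 300s --cpu 4 --cpu-load 60
```

A scheduler such as the SICS loading software could generate one such command per server group and per scheduled run.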
Figure 8: SICS Current Load Distribution Architecture
Synthetic benchmarks could be packaged in docker images and deployed and managed in the same
way as the real-application based benchmarks. The advantage of this is that the system is
homogeneous, and all loads are contained in docker images. The advantage of using the existing SICS
software is that no docker deployment engine is required to deploy the loads, while Ansible ensures
flexible control of load distribution.
7.6. Load balancing
In order to execute the application primitives on real hardware, the docker containers need to be
mapped to docker hosts. For most cases this mapping will be done in a way to balance the load equally
amongst the available hardware. This is also true for our benchmark applications. There are two
options for the load balancing of docker images.
The first option is to use docker swarm, which has been included in the docker distribution since
version 1.12.0. The management of clusters is implemented in the docker
engine itself and it is designed in a decentralized, scalable and secure way. The docker swarm manager
uses “ingress load balancing”, which allows rules to map one externally visible service (that is external
to the docker swarm) onto many internal services. The swarm manager uses internal load balancing to
distribute the external requests amongst the internal services. Docker also allows the usage of an
external load balancer, which leads to the second option for load balancing.
The second option is the use of the kubernetes tool [22], which is also built on the concept of docker
images but provides some extended services for deployment management. Kubernetes was developed
by Google and is based on their experience of running containers in production. Kubernetes presumably
has the largest community among container orchestration tools, with about 1200 contributors
versus 160 for docker. For load balancing, kubernetes seems to have extended support
for specific cloud installations, like AWS, Azure, or OpenStack.
The decision of which load balancing toolset is to be used for the implementation requires further
analysis and experimentation.
8. Outlook
In this document the benchmark modeling methodology has been explained. The main topics of the
subsequent work in WP3 are:
Development of a formal load modeling language based on YAML [23], which describes the
configuration and deployment of Application Primitives in the benchmark load patterns.
Initial benchmark load pattern specifications derived for selected application domains and
complemented by synthetic load patterns.
Development of a benchmark load generation tool.
9. References
[1] G. Senatore, G. Galasso, D. Brunelleschi, G. Farina: ESPRESSO D6.3 – Report on business impacts
http://espresso.espresso-project.eu/content/deliverables/ (visited March 2018)
[2] ESPRESSO Pilot Tartu, Use Cases http://espresso.espresso-project.eu/espresso-pilots/tartu/use-
case-2-city-information-modelling/ and http://espresso.espresso-project.eu/espresso-
pilots/tartu/use-case-2-city-information-modelling/ (visited March 2018)
[3] Fortschreibung der Anwendungsszenarien der Plattform Industrie 4.0 (2016), Federal Ministry for
Economic Affairs and Energy (BMWi). Available at http://www.plattform-
i40.de/I40/Redaktion/DE/Downloads/Publikation/fortschreibung-anwendungsszenarien.pdf (in
German) (visited March 2018)
[4] Exemplification of the Industrie 4.0 Application Scenario Value-Based Service following IIRA
Structure, Federal Ministry for Economic Affairs and Energy (BMWi) (2017). Available at
http://www.plattform-i40.de/I40/Redaktion/DE/Downloads/Publikation/exemplification-i40-value-
based-service.pdf (visited March 2018)
[5] https://iv-i.org/en/docs/ScenarioWG_2016.pdf (visited March 2018)
[6] http://www.iiconsortium.org/case-studies/index.htm (visited March 2018)
[7] The Industrial Internet of Things Volume T3 (2017): Analytics Framework, Industrial Internet
Consortium. http://www.iiconsortium.org/industrial-analytics.htm (visited March 2018)
[8] Brown, Aaron B. (1997). A Decompositional Approach to Computer System Performance
Evaluation. Harvard Computer Science Group Technical Report TR-03-97.
http://nrs.harvard.edu/urn-3:HUL.InstRepos:23574652 (visited March 2018)
[9] http://www.spec.org/ (visited March 2018)
[10] https://www.docker.com/company (visited March 2018)
[11] https://www.iosb.fraunhofer.de/servlet/is/51300/ (visited March 2018)
[12] Pfrommer, J. (2017): Semantic interoperability at big-data scale with the open62541 OPC UA implementation. In : Interoperability and Open-Source Solutions for the Internet of Things. Cham: Springer International Publishing, p. 13. Available online at http://publica.fraunhofer.de/documents/N-
438796.html (visited March 2018)
[13] Sauer, Olaf (2018): PLUG and WORK mit OPC UA und AutomationML. In: Praxishandbuch OPC UA. Grundlagen - Implementierung - Nachrüstung - Praxisbeispiele. Würzburg: Vogel Business Media, pp. 156–160.
[14] https://open62541.org/doc/current/ (visited March 2018)
[15] http://www.dolfin-fp7.eu/ (visited March 2018)
[16] https://jujucharms.com/ (visited March 2018)
[17] http://flask.pocoo.org/ (visited March 2018)
[18] https://www.ansible.com/ (visited March 2018)
[19] http://manpages.ubuntu.com/manpages/xenial/man1/stress-ng.1.html (visited March 2018)
[20] https://www.mersenne.org/download/ (visited March 2018)
[21] https://software.intel.com/en-us/articles/intel-mkl-benchmarks-suite (visited March 2018)
[22] https://kubernetes.io/ (visited March 2018)
[23] http://www.yaml.org/spec/1.2/spec.html (visited March 2018)
10. List of figures
Figure 1: Organisational Feedback Loops
Figure 2: Application Domain Verticals
Figure 3: Load pattern characterization in layers
Figure 4: Functional SW-Architecture and interfaces for the deployment of Application Load Patterns
Figure 5: Load Emulation with Docker Swarm
Figure 6: External Load Stimulation
Figure 7: Load Distribution Architecture
Figure 8: SICS Current Load Distribution Architecture