RESOURCE MANAGEMENT FOR ISOLATION ENHANCED CLOUD SERVICES

Himanshu Raj, Ripal Nathuji, Abhishek Singh, Paul England
Microsoft Corporation
ACM Workshop on Cloud Computing Security 2009

Presented by: Yun Liaw

Outline

Introduction
Example Scenario for Isolation Attributes
Enforcing Cache Isolation in Multicore Systems
  Cache Hierarchy Aware Core Assignment
  Page-Coloring Based Cache Partitioning
  Experimental Evaluation
An SLA Driven Approach to Resource Management in the Cloud Infrastructure
Related Work
Conclusions and Future Work
Comments

Introduction

In the IaaS model, cloud computing separates the service provider from the infrastructure owner
  The service provider (SP) has less control over the service deployment, and must trust the cloud infrastructure provider (CIP) to uphold the guarantees provided in the service level agreement (SLA)
A service provider must trust the infrastructure provider's ability to properly isolate services from each other
  For both performance and security reasons
Traditionally: physical isolation
  Good isolation but costly
In the cloud: use virtualization to encapsulate a service inside a VM
  Flexible but weaker isolation

Introduction

Resources are implicitly shared among VMs
  Last level cache (LLC) on multicore processors, and memory bandwidth
  This sharing presents opportunities for security and performance interference
    Compromising process confidentiality
    DoS attacks launched by malicious VMs
Isolation attributes for a service, defined as part of the SLA between SP and CIP, serve two purposes
  To capture the degree of isolation demanded by a service
  To allow a service to authoritatively report its isolation characteristics to the service user (isolation attestation)

Slide callouts: "This paper's focus!", "Last Level Cache"

Introduction

This paper's focus:
  Presenting mechanisms to enforce some isolation constraints, focusing on the last level cache (LLC)
    Cache hierarchy aware core assignment
    Page-coloring based cache partitioning
  Providing an example formulation of a constraint satisfaction problem (CSP) for the CIP's VM placement

Example Scenario for Isolation Attributes

Several VMs belonging to various independent SPs are deployed on a CIP's infrastructure
Example scenario: Virtual Desktop Experience (VDE)
  Session VM: specific to a client, and works as her personal computer
  Service VM: provides services that can be accessed in the VDE
  The SP adds value by allowing roaming access to the VDE and by providing management ability

Example Scenario for Isolation Attributes

The service client's concerns about the service (which may be addressed in the SLA between client and SP) create isolation and resource management concerns for the SP
  Example: can an adversary VM impact the performance of a session VM?
These isolation and resource management concerns are in turn passed on to the SLA between SP and CIP
The CIP must manage its resources to meet the SLA between SP and CIP
  The resource assignment problem can be posed as a constraint satisfaction problem (CSP)

Example Scenario for Isolation Attributes


Enforcing Cache Isolation in Multicore Systems

Shared caches are commonly used in the multicore systems that are prevalent in today's large-scale data centers
  It is difficult to guarantee performance to a thread whose active working set spills out of its local caches into the LLC
  A thread's confidentiality can be compromised by a cache-based side channel attack
Two techniques for cache isolation
  Cache hierarchy aware core assignment
  Page-coloring based cache partitioning

Cache Hierarchy Aware Core Assignment

1. Group cores on a machine based on their LLC organization
   All cores sharing the LLC are put in a single group
2. If a VM V's SLA defines an isolation attribute related to the cache:
   1. Choose a group that is currently not assigned to any other VM
   2. Assign the cores in this group to V as V's virtual processors
      Depending on the number of virtual processors, one or more groups may be used
Drawback: underutilization of cores within a group
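The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Hyper-V implementation described in the talk: the function names, the `core_to_llc` map, and the `owners` bookkeeping dictionary are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def group_cores_by_llc(core_to_llc: Dict[int, int]) -> Dict[int, List[int]]:
    """Step 1: put all cores that share an LLC into one group."""
    groups = defaultdict(list)
    for core, llc in sorted(core_to_llc.items()):
        groups[llc].append(core)
    return dict(groups)


def assign_isolated_cores(groups: Dict[int, List[int]],
                          owners: Dict[int, str],
                          vm: str,
                          num_vcpus: int) -> Optional[List[int]]:
    """Step 2: give `vm` whole, currently unowned groups until it has enough cores.

    `owners` maps LLC id -> owning VM and is updated only on success.
    Returns the cores backing the VM's virtual processors, or None if
    there are not enough free groups.
    """
    chosen, cores = [], []
    for llc, members in groups.items():
        if llc in owners:
            continue  # this LLC already belongs to another VM
        chosen.append(llc)
        cores.extend(members)
        if len(cores) >= num_vcpus:
            break
    if len(cores) < num_vcpus:
        return None
    for llc in chosen:
        owners[llc] = vm  # the whole group is reserved, even if some cores stay idle
    return cores[:num_vcpus]


# Hypothetical machine: 8 cores, cores 0-3 share one LLC and cores 4-7 another.
groups = group_cores_by_llc({core: core // 4 for core in range(8)})
owners: Dict[int, str] = {}
print(assign_isolated_cores(groups, owners, "target", 1))     # [0]
print(assign_isolated_cores(groups, owners, "perturbing", 3))  # [4, 5, 6]
print(assign_isolated_cores(groups, owners, "third", 1))       # None
```

The last call shows the drawback named on the slide: four cores are idle (1, 2, 3 and 7), yet no further isolated VM can be placed because both groups are already reserved.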


Page-coloring Based Cache Partitioning – Cache

Cache line: the smallest unit of memory that can be transferred between RAM and the cache
N-way associative cache
  A hybrid between a fully associative cache (which requires parallel searches of all slots) and a direct mapped cache (which may cause collisions of addresses that map to the same slot)
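The trade-off between the three cache organizations can be made concrete with a small sketch. It assumes plain modulo set indexing (real processors may hash address bits), and the 32 KB cache size and the function names are hypothetical values chosen for illustration.

```python
def cache_geometry(cache_bytes: int, line_size: int, ways: int):
    """Return (number of cache lines, number of sets) for a set-associative cache."""
    num_lines = cache_bytes // line_size
    return num_lines, num_lines // ways


def set_index(phys_addr: int, line_size: int, num_sets: int) -> int:
    """Return the set an address maps to, assuming plain modulo indexing."""
    line_number = phys_addr // line_size  # which line-sized block of memory
    return line_number % num_sets         # blocks num_sets lines apart share a set


CACHE_BYTES, LINE_SIZE = 32 * 1024, 64    # hypothetical 32 KB cache, 64-byte lines
ADDR_A, ADDR_B = 0, 32 * 1024             # two addresses exactly one cache size apart

for ways in (1, 8, 512):                  # direct mapped, 8-way, fully associative
    num_lines, num_sets = cache_geometry(CACHE_BYTES, LINE_SIZE, ways)
    same_set = (set_index(ADDR_A, LINE_SIZE, num_sets)
                == set_index(ADDR_B, LINE_SIZE, num_sets))
    print(f"ways={ways:3d} sets={num_sets:3d} same set: {same_set}")
```

The two addresses land in the same set in all three organizations. What differs is how many lines that set can hold: one in the direct mapped case (so the two lines evict each other), eight in the 8-way case, and all 512 in the fully associative case, at the cost of searching every slot.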


Page-coloring Based Cache Partitioning – Page

Page: a fixed-length block of memory that is contiguous in memory addressing
A page is usually the smallest unit of data for the following:
  Memory allocation for a program
  Transfer between main memory and any other auxiliary store

Page-coloring Based Cache Partitioning – Page Coloring

Page coloring
  A software technique that controls the mapping of physical memory pages to a processor's cache blocks
  Memory pages that map to the same cache blocks are assigned the same color
  The granularity of a page color is the unit of cache space that can be allocated to an application (VM)

Page-coloring Based Cache Partitioning – Page Coloring

Example (from the figure):
  Memory: 6 GB, page size 4 KB
  Cache: 8 MB, 16-way set associative, cache line size 64 bytes
  • 128K cache lines in this cache (8 MB / 64 bytes)
  • 8K associative sets in this cache (128K / 16)
  • 1 page = 64 cache lines (4 KB / 64 bytes)
  Maximum number of colors that this cache can support = number of sets / number of cache lines per page = 8K / 64 = 128
By controlling the color of pages assigned to an application, the OS can manipulate cache blocks at the granularity of a page color, the unit of cache space that can be allocated to an application
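The arithmetic on this slide can be checked with a short script. The cache and page parameters are the ones from the example; the function names and the `page_color` helper, which assumes that consecutive physical pages cycle through the colors (plain modulo set indexing), are illustrative assumptions.

```python
KB, MB = 1024, 1024 * 1024


def num_page_colors(cache_bytes: int, ways: int, line_size: int, page_size: int) -> int:
    """Number of page colors = number of sets / cache lines per page."""
    num_lines = cache_bytes // line_size     # 8 MB / 64 B  = 128K lines
    num_sets = num_lines // ways             # 128K / 16    = 8K sets
    lines_per_page = page_size // line_size  # 4 KB / 64 B  = 64 lines
    return num_sets // lines_per_page        # 8K / 64      = 128 colors


def page_color(phys_addr: int, page_size: int, colors: int) -> int:
    """Color of the page containing phys_addr, assuming modulo set indexing."""
    return (phys_addr // page_size) % colors


colors = num_page_colors(8 * MB, ways=16, line_size=64, page_size=4 * KB)
print(colors)                                # 128
print((8 * MB) // colors // KB)              # 64 (KB of cache per color)
print(page_color(0, 4 * KB, colors))         # 0
print(page_color(4 * KB, 4 * KB, colors))    # 1
print(page_color(128 * 4 * KB, 4 * KB, colors))  # 0: wraps around after 128 pages
```

Under these assumptions each color corresponds to 64 KB of the 8 MB cache, which is the smallest share of cache that can be handed to a VM.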

Page-coloring Based Cache Partitioning

The hypervisor allocates the memory pages that back a VM, so it can influence the cache usage of threads in the VM
Page coloring is used for cache isolation by keeping separate the sets of colors that back individual VMs running on CPU cores that share the LLC
Drawback: underutilization of memory
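A minimal sketch of the idea, assuming page frames cycle through colors in order (frame number modulo the number of colors). The `ColoredPageAllocator` class and its methods are hypothetical and are not the allocator used in the talk's Hyper-V prototype.

```python
from collections import defaultdict


class ColoredPageAllocator:
    """Hands out physical page frames only from the colors a VM owns."""

    def __init__(self, num_frames: int, num_colors: int):
        self.num_colors = num_colors
        self.free = defaultdict(list)       # color -> free frame numbers
        for frame in range(num_frames):
            self.free[frame % num_colors].append(frame)
        self.vm_colors = {}                 # VM name -> set of colors it owns

    def reserve_colors(self, vm: str, colors) -> None:
        """Give `vm` exclusive use of `colors`; overlapping sets are rejected."""
        colors = set(colors)
        for other, taken in self.vm_colors.items():
            if taken & colors:
                raise ValueError(f"colors already reserved by {other}")
        self.vm_colors[vm] = colors

    def alloc(self, vm: str, count: int) -> list:
        """Allocate `count` frames for `vm`, spread over the VM's colors."""
        colors = sorted(self.vm_colors[vm])
        frames = []
        while len(frames) < count:
            progressed = False
            for color in colors:
                if self.free[color] and len(frames) < count:
                    frames.append(self.free[color].pop())
                    progressed = True
            if not progressed:              # the VM's colors are exhausted
                for frame in frames:        # roll back the partial allocation
                    self.free[frame % self.num_colors].append(frame)
                raise MemoryError("no free pages left in this VM's colors")
        return frames


# Tiny hypothetical machine: 16 page frames, 4 colors, split 50/50.
allocator = ColoredPageAllocator(num_frames=16, num_colors=4)
allocator.reserve_colors("target", {0, 1})
allocator.reserve_colors("perturbing", {2, 3})
print(sorted(allocator.alloc("target", 8)))  # [0, 1, 4, 5, 8, 9, 12, 13]
```

After this allocation the target VM cannot get a ninth page even though half of the machine's memory is still free, which is the memory underutilization drawback noted on the slide.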


Experiment – Implementation Detail and Methodology

Based on Microsoft Hyper-V
  The memory management component in Hyper-V [11] is replaced by a Windows NT kernel's memory allocation API
The configuration of each physical machine is extended with 2 pieces of information
  The group information for cores
  The number of page colors and their current size

Experiment – Implementation Detail and Methodology

Experimental platform:
  8-core, Intel Nehalem processor based machine
  6 GB RAM
  8 MB shared LLC
  The prefetch function of the Nehalem processor is disabled
Cache hierarchy:
  2 groups of cores

Experiment – Implementation Detail and Methodology

Target VM:
  1 virtual processor
  Running program: allocates an array of a specific working set size, and then accesses it in a regular pattern
Perturbing VM:
  3 virtual processors
  Running program: a memory-intensive application that repeatedly accesses memory and causes cache thrashing
Cache hierarchy aware core assignment (CHACA) experiment
  Target VM and perturbing VM are placed on different groups of cores
Page-coloring based cache partitioning (PCBCP) experiment
  Target VM and perturbing VM are placed on the same group of cores
  The target VM gets 50% of the total number of colors available, and the perturbing VM gets the other 50%

Experiment Result - No Isolation and CHACA

The execution time decreases to the baseline when the working set is smaller than the LLC
In CHACA, since the perturbing VM is placed on a different group of cores, it has no influence on the target VM

Experiment Result - PCBCP

Additional threads do not impact the performance

Experiment Result - PCBCP

(The figure uses a log axis)
Coloring causes a performance penalty
The execution time can be reduced when the perturbing VM is included

An SLA Driven Approach to RM in the Cloud Infrastructure

The SLA between SP and CIP can be converted into a set of CIP-specific constraints
The constraints are defined in terms of available resources at the CIP
  → A Constraint Satisfaction Problem (CSP)!
Example scenario: the SLA between SP and CIP defines
  Number of processors = 2
  Replication factor (r) = 5
  H/w fault domain (n) = 5
  Cache based DoS attack avoidance = True
  Cache based side channel attack avoidance = True
  → Place 5 VMs (based on r) on physical machines in the cloud such that the SLA is satisfied

An SLA Driven Approach to RM in the Cloud Infrastructure

Example scenario (cont'd)
  Physical node: Blade object

Blade Attributes

An SLA Driven Approach to RM in the Cloud Infrastructure

Let VMs be the set of virtual machines, corresponding to vm1, vm2, …, vm5, that need to be placed on the set Blades
Decision variables of each VM
  Blade
  ProcessorDomain
  PageColorDomain

Pseudo code of a greedy algorithm for CSP formulation

Figure annotations: Decision Variables, Constraints
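The pseudo code shown on the slide is not preserved in this transcript. As a stand-in, the following is a minimal greedy sketch for the example SLA above; it is not the authors' algorithm. The `Blade` and `Sla` classes, their fields, and `greedy_place` are hypothetical. The sketch assumes that cache isolation is met by giving each VM an entire unassigned processor domain (a group of cores sharing an LLC), and it leaves out the PageColorDomain decision variable to stay short.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Blade:
    name: str
    fault_domain: int
    # processor domain id -> number of cores, for domains not yet given to a VM
    free_domains: Dict[int, int] = field(default_factory=dict)


@dataclass
class Sla:
    processors: int = 2
    replication: int = 5        # r: number of VM replicas to place
    fault_domains: int = 5      # n: replicas must span this many h/w fault domains
    cache_dos_avoidance: bool = True
    cache_side_channel_avoidance: bool = True


def greedy_place(sla: Sla, blades: List[Blade]) -> Optional[Dict[str, Tuple[str, int]]]:
    """Assign each VM a (Blade, ProcessorDomain) pair, first fit, no backtracking."""
    if not (sla.cache_dos_avoidance or sla.cache_side_channel_avoidance):
        raise NotImplementedError("this sketch only covers cache-isolated placement")
    free = {b.name: dict(b.free_domains) for b in blades}  # work on a copy
    placement: Dict[str, Tuple[str, int]] = {}
    used_fault_domains = set()
    for i in range(1, sla.replication + 1):
        for blade in blades:
            # Fault-domain constraint: keep spreading until n domains are in use.
            must_spread = len(used_fault_domains) < sla.fault_domains
            if must_spread and blade.fault_domain in used_fault_domains:
                continue
            # Cache constraint: need a whole free domain with enough cores.
            domain = next((d for d, cores in free[blade.name].items()
                           if cores >= sla.processors), None)
            if domain is None:
                continue
            del free[blade.name][domain]
            used_fault_domains.add(blade.fault_domain)
            placement[f"vm{i}"] = (blade.name, domain)
            break
        else:
            return None  # greedy search found no blade for this VM
    return placement


# Hypothetical cloud: 5 blades, one per fault domain, two 4-core domains each.
blades = [Blade(f"blade{k}", fault_domain=k, free_domains={0: 4, 1: 4})
          for k in range(5)]
print(greedy_place(Sla(), blades))
```

With these inputs each of vm1 to vm5 lands on its own blade and fault domain. Because the search never backtracks, it can return None for inputs that a complete CSP solver would still be able to satisfy.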

Related Work

There is little prior work on security and isolation specific SLA constraints
  This work is the first attempt at characterizing specific isolation-related attributes for the SLA between SP and CIP
  Monahan et al. define security-related SLA constraints that are applicable in the cloud computing scenario [10]
Research on cache-based interference

Conclusions and Future Work

Conclusions:
  This paper envisions that SPs in a cloud computing environment will also specify security and performance isolation constraints as part of their SLA
  One such set of constraints advocated in this paper is based on cache sharing in contemporary multicore systems
  This paper presents 2 approaches to provide security and performance isolation
  This paper provides a generic CSP formulation
Future work:
  To use other CSP solvers to formulate and solve the CSP
  To evaluate the impact of SLA isolation attributes on the overall cost of VM placement
  Isolation attestation

Comments

The paper does not describe the cache isolation approaches in much detail
CSP might be a good approach to study
Soy-braised egg = plain egg flavored with fennel, cardamom, licorice, and thyme (?!)
