INSE 6620 (Cloud Computing Security and Privacy)
Attacks on Cloud
Prof. Lingyu Wang
Outline
Co-Residence Attack
Power Attack
Ristenpart et al., "Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds"; Xu et al., "Power Attack: An Increasing Threat to Data Centers"
The Threat of Multi-Tenancy
In traditional systems, the security goal is usually to "keep the bad guys out"
Clouds bring new threats with multi-tenancy:
Multiple independent users share the same physical infrastructure
So, an attacker can legitimately be on the same physical machine as the target
The bad guys are next to you...
What Would the Bad Guys do?
Step 1: Find out where the target is located
Step 2: Try to be co-located with the target in the same (physical) machine
Step 2.1: Verify it’s achieved
Step 3: Gather information about the target once co-located
“Hey, You, Get Off of My Cloud”
Influential: cited by 872 papers as of July 2014 (Google Scholar)
Media coverage: MIT Technology Review, Network World (two articles), Computer World, Data Center Knowledge, IT Business Edge, Cloudsecurity.org, Infoworld
Attack launched against a commercially available "real" cloud (Amazon EC2)
Claims up to 40% success in co-residence with the target VM
First work showing concrete threats in the cloud
Approach Overview
Map the cloud infrastructure to estimate where the target is located - cartography
Launch probe VMs trying to be co-resident with the target VM
Use various heuristics to verify co-residence of two VMs
Exploit cross-VM side-channel leakage to gather information about the target
(Analogous to the classic attack steps: footprinting, port scanning, discovering vulnerabilities, initial exploitation, privilege escalation)
Threat Model
Attacker model:
Cloud infrastructure provider is trustworthy
Cloud insiders are trustworthy
Attacker is a malicious non-provider-affiliated third party who can legitimately use the cloud provider's service
Victim model:
Victims are other cloud users that have sensitive information
The Amazon EC2
Xen hypervisor
Domain0 (Dom0) is used to manage guest images, physical resource provisioning, and access control rights
Dom0 routes packets and reports itself as a first hop
[Figure: Xen hypervisor hosting Dom0, Guest1, and Guest2]
Users may choose to create an instance in:
2 regions (United States and Europe)
3 availability zones (for fault tolerance)
5 Linux instance types: m1.small, c1.medium, m1.large, m1.xlarge, c1.xlarge
IP Addresses of Instances
An instance may have a public IP, e.g., 75.101.210.100, which, from outside the cloud, maps to an external DNS name: ec2-75-101-210-100.compute-1.amazonaws.com
And an internal IP and DNS name: 10.252.146.52, domU-12-31-38-00-8D-C6.compute-1.internal
Within the cloud, both domain names resolve to the internal IP
75.101.210.100 -> ec2-75-101-210-100.compute-1.amazonaws.com -> 10.252.146.52
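Since the external DNS name simply embeds the public IP, the mapping in either direction can be computed offline without touching the cloud. A minimal sketch (function names are my own; the naming pattern is the one shown on the slide):

```python
def public_ip_to_external_dns(ip: str) -> str:
    """Derive the EC2-style external DNS name that encodes a public IP."""
    return "ec2-" + ip.replace(".", "-") + ".compute-1.amazonaws.com"

def external_dns_to_public_ip(name: str) -> str:
    """Recover the public IP encoded in an EC2-style external DNS name."""
    label = name.split(".", 1)[0]           # "ec2-75-101-210-100"
    return label[len("ec2-"):].replace("-", ".")
```

Resolving the external name from inside the cloud (step three of the chain above) then yields the internal IP.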
Network Probing
nmap: TCP connect probes (3-way handshake)
hping: TCP SYN traceroutes
Both nmap and hping target ports 80 and 443
wget: retrieve web pages up to 1024 B
Internal probing: from one instance to another
Legitimate w.r.t. Amazon policies
External probing: from outside EC2
Not illegal (port scanning is legal)
Only targets ports 80/443 (with services running), which has implications for ethical issues
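The TCP connect probe used here (the nmap "-sT" style check) amounts to attempting a full 3-way handshake; a minimal sketch, with host and port as placeholders:

```python
import socket

def tcp_connect_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Attempt a full TCP handshake; True if the port accepts the connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A SYN traceroute (hping) additionally varies the TTL to learn the path, which is what reveals Dom0 as the first hop.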
Step 1: Mapping the Cloud
Hypothesis: the Amazon EC2 internal IP address space is cleanly partitioned between availability zones
(likely to make it easy to manage separate network connectivity for these zones)
Instance types within these zones also show considerable regularity. Moreover, different accounts exhibit similar placement.
Mapping the Cloud
20 instances for each of the 15 zone/type pairs, total 300
Plot of internal IPs against zones
Result: Different availability zones correspond to different statically defined internal IP address ranges.
Mapping the Cloud
20 instances of each type, from another account, zone 3
Plot of internal IPs in Zone 3 against instance types
Result: Same instance types correspond loosely with similar IP address range regions.
Derive IP Address Allocation Rules
Heuristics to label /24 prefixes with both availability zone and instance type:
All IPs from a /16 are from the same availability zone
A /24 inherits any included sampled instance type; if it contains multiple instance types, it is ambiguous
A /24 containing a Dom0 IP address only contains Dom0 IP addresses; we associate to this /24 the type of the Dom0's associated instance
All /24s between two consecutive Dom0 /24s inherit the former's associated type
Example: 10.250.8.0/24 contained Dom0 IPs associated with the m1.small instances in 10.250.9.0/24 and 10.250.10.0/24
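The first heuristics above (a /24 inherits its sampled types, is ambiguous on conflict, and takes a Dom0's associated type) can be sketched as follows; the interpolation between consecutive Dom0 /24s is omitted, and the function names and data shapes are my own:

```python
import ipaddress

def label_slash24s(samples, dom0_samples):
    """
    Toy version of the labeling heuristics.
    samples:      list of (internal_ip, instance_type) from probe instances
    dom0_samples: list of (dom0_ip, instance_type of associated instance)
    Returns {"/24 prefix": instance_type or "ambiguous"}.
    """
    labels = {}
    for ip, itype in samples:
        net = str(ipaddress.ip_network(ip + "/24", strict=False))
        if net in labels and labels[net] != itype:
            labels[net] = "ambiguous"   # multiple sampled types in one /24
        else:
            labels.setdefault(net, itype)
    # A /24 holding Dom0 IPs gets the type of the Dom0's associated instance
    for ip, itype in dom0_samples:
        net = str(ipaddress.ip_network(ip + "/24", strict=False))
        labels[net] = itype
    return labels
```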
Mapping 6057 EC2 Servers
Preventing Cloud Cartography
Why prevent it?
Makes subsequent attacks harder
Hides the infrastructure and the number of users
What makes mapping easier?
Static local IPs (changing them may complicate management)
External-to-internal IP mapping (preventing it can only slow down mapping; timing and traceroute are still possible)
Step 2: Determine Co-residence
Network-based co-residence checks: instances are likely co-resident if they have:
a matching Dom0 IP address (Dom0 is the first hop from this instance, or the last hop to the victim)
small packet round-trip times (needs a "warm-up"; the first probe is discarded)
numerically close internal IP addresses (e.g., within 7)
8 m1.small instances on one machine
Step 2: Determine Co-residence
Verified via a hard-disk-based covert channel (all "instances" are in zone 3)
Effective false positive rate of ZERO
So go with a simpler test:
Close enough internal IPs? If yes, then traceroute. A single hop (Dom0) in between? If yes, the test passes
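The simpler two-step test combines IP closeness with the traceroute hop count; a sketch, where `hops_between` is assumed to come from an actual traceroute between the two instances:

```python
import ipaddress

def likely_coresident(ip_a: str, ip_b: str, hops_between: int,
                      max_diff: int = 7) -> bool:
    """
    Simplified two-step co-residence test:
    1) internal IPs numerically close (e.g., within 7), and
    2) a traceroute from A to B shows a single intermediate hop (Dom0).
    """
    diff = abs(int(ipaddress.ip_address(ip_a)) - int(ipaddress.ip_address(ip_b)))
    return diff <= max_diff and hops_between == 1
```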
Step 3: Exploiting VM Placement
Facts about Amazon placement:
The same account never has two instances on the same machine (so 8 instances will be placed on 8 machines)
Sequential locality: if A stops and then B starts, A and B are likely co-resident
Parallel locality: if A and B, under different accounts, are run at roughly the same time, they are likely co-resident
Machines with fewer instances are more likely to receive new placements (load balancing)
m1.xlarge and c1.xlarge instances have their own machines
Step 3: Exploiting VM Placement
Strategy 1: brute-forcing placement
141 successes out of 1,686 attempts: a success rate of 8.4%
Strategy 2: abusing placement locality
Attacker instance-flooding right after the target instances are launched, exploiting parallel locality
Observing instances disappearing/reappearing
Triggering the creation of new instances (elasticity)
40% success rate when flooding 5 minutes after the target launch
Step 3: Exploiting VM Placement
“Window” for parallel locality is quite large
Evidence of sequential locality
(Each instance is killed immediately after probing)
Step 4: Information Leakage
Co-residency affords the ability to:
Launch denial of service
Estimate the victim's workload
Extract cryptographic keys via side channels
Mitigations
Co-residence checks:
Prevent identification of Dom0/hypervisor
VM placement:
Allow users to control, or exclusively use, machines
Side-channel leaks:
Many methods exist
Limitations: impractical (overhead), application-specific, or insufficient protection
Also, all of them require knowing all possible channels in advance
Amazon's response
Amazon downplays report highlighting vulnerabilities in its cloud service
"The side channel techniques presented are based on testing results from a carefully controlled lab environment with configurations that do not match the actual Amazon EC2 environment.""As the researchers point out, there are a number of factors that would make such an attack significantly more difficult in practice."
http://www.techworld.com.au/article/324189/amazon_downplays_report_highlighting_vulnerabilities_its_cloud_service
Outline
Co-Residence Attack
Power Attack
Background
The number of servers in data centers surged from 24 million in 2008 to 35 million in 2012
Power consumption increased by 56%
Very expensive to upgrade existing power infrastructures
How to add more servers with less cost?
Power Attack
Solution: oversubscription
Place more servers than can be supported by the power infrastructure
Assumption: not all servers will reach peak consumption (nameplate power ratings) at the same time
This leaves the data center vulnerable to a power attack:
A malicious workload that can generate power spikes on multiple servers, racks, or the whole data center
Launched as a regular user
Causes DoS to both providers and clients by tripping the circuit breakers (CBs)
Power Distribution in Data Centers
Three tiers: 60-400 kV → transformer → 10-20 kV → switchgear → 400-600 V → UPS/PDUs (power distribution units) → racks
Circuit breakers at the switchgear, PDUs, and rack-level branch circuits
Oversubscription
Google's analysis:
Workload traces collected from real data centers: search, webmail, and MapReduce
Peak power reaches 96% of rated capacity at the rack level, but only 72% at the data center level
Oversubscription would allow adding 38% more servers
A big assumption: workloads never reach peak consumption
Benign workloads - maybe; malicious ones - no!
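The headroom figure follows directly from the 72% data-center-level peak: sizing the power infrastructure to the observed aggregate peak, rather than the nameplate sum, leaves room for roughly 1/0.72 ≈ 1.39x as many servers. A back-of-the-envelope check:

```python
# If aggregate peak draw is only 72% of rated capacity, then capacity sized
# to that peak supports ~1/0.72 = 1.39x the servers, i.e. roughly 39% more
# (the Google analysis cited on the slide reports this as 38%).
dc_peak_fraction = 0.72
extra_servers = 1 / dc_peak_fraction - 1
print(f"~{extra_servers:.0%} more servers")   # ~39% more servers
```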
Threat Model
Target: can be a rack, PDU, or data center
Runs public services, e.g., IaaS, PaaS, SaaS
Uses power oversubscription
Power consumption is monitored/managed at rack level (machine level is too expensive)
Adversary:
Hackers, competitors, cyber crime/cyber warfare
A regular user with sufficient resources (a large number of accounts, workload) and a mapping of the cloud
Our focus: how to generate power spikes under IaaS, PaaS, SaaS?
Power Attack in PaaS
PaaS: the attacker can run any chosen applications
Load balancing: load (utility) balancing ≠ power balancing
Attack in two stages:
1. Utility (e.g., CPU) reaches 100%
2. Fine-tune the workload to further increase power consumption (remember: utility ≠ power)
Single Server Test
Goal: find out how workloads affect power
SPEC CPU2006 benchmark
Results:
Different workloads have very different power costs
Same CPU utilization (100%) and memory usage ≠ same power
e.g., benchmark 462 vs. 465, and 462 vs. 456
Single Server Test
HPL benchmark
Multiple parameters to adjust, e.g., block size NB (how the problem is solved)
Results:
Same workload, same CPU and memory usage
Different power costs under different parameters
Rack Level Test
Results: similar to a single machine
Attack:
Increase the workload to reach the utility cap
Further increase the power cost by changing the workload/tuning parameters
Damage Assessment
Overheating: one CPU is overheated, resulting in system failure
CB tripped in a room with 16 servers, of which only 4 are under attack
It will only get worse in the real world:
When memory/I/O devices are attacked
With better "power proportionality" (60% of power is consumed when idle in this case)
Power Attack in IaaS
IaaS: more control using VMs; more exposure
Attack vectors:
Parasite attack (attack from inside): run applications from VMs; launch DoS attacks on such VMs (more power cost than a normal workload)
Exploit routine operations: live migration of VMs; launch a parasite attack during migration
Evaluation - Parasite Attack
Parasite attack:
Co-resident with the victim
Run an intensive workload
Launch DoS attacks on such VMs
Results:
Normal load: 180 W
Intensive load: 200 W
DoS: 230 W (peak 245 W), a 30% increase
(smurf attack: broadcasting)
Evaluation – VM Migration
Results:
During migration, both the source and destination experience power spikes (memory copy, NICs, CPUs)
Both intra-rack and inter-rack (migrating to the same server)
Power Attack in SaaS
SaaS: limited control
Attack vectors: specially crafted requests
Trigger large numbers of cache misses
Floating-point operations: the floating-point unit (FPU) is more power-hungry than the arithmetic logic unit (ALU)
Divisions rather than additions/multiplications
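The point that the operation mix, not CPU utilization, drives power can be illustrated with two loops that both keep a core fully busy while exercising different units. A hypothetical micro-workload sketch (measuring the actual power difference would require hardware counters such as RAPL, which is outside this sketch):

```python
def alu_heavy(n: int) -> int:
    """Integer add/xor/shift loop: exercises the ALU."""
    acc = 0
    for i in range(n):
        acc = (acc + i) ^ (i << 1)
    return acc

def fpu_heavy(n: int) -> float:
    """Floating-point division loop: exercises the power-hungrier FPU."""
    acc = 1.0
    for i in range(1, n + 1):
        acc = acc / 1.0000001 + i / 3.0
    return acc
```

Both loops report 100% CPU utilization to the scheduler; only power telemetry distinguishes them, which is exactly why utility-based load balancing misses such workloads.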
Power Attack in SaaS
RUBiS online shopping benchmark, modified to support:
Floating-point operations (discount coupons)
Cache misses (continuously browsing at random)
Result: 30-40% power spikes
Data Center Level Simulations
Based on the configuration of the Google data center in Lenoir, NC, USA
The "original" workload is based on traces of the data center; the "attack" workload adds HPC workloads
Attacks target the "peak", "medium", and "valley" regions of the trace
Results
One PDU: a 22-minute attack trips the PDU-level CB
Multi-PDU: all 4 attacks trip CBs
The first 3 attacks recovered thanks to load balancing
The last attack causes DoS (only 53% of requests processed) during hours 58-69
Results
A DC-level attack is possible
Larger-scale attacks require more resources
Mitigation
Power capping: limit peak consumption
Challenge: a 2-minute sampling window is enough for attacks; it takes even longer (12 minutes) to actually reduce consumption
Server consolidation (shut down servers not in use)
Better power proportionality → more aggressive oversubscription → more vulnerable
Challenges: the need to save power; power is difficult to monitor; it is difficult to distinguish between normal users and attacks
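Why the sampling window matters can be seen from the shape of a capping controller: it only reacts to draws it observes at sample boundaries, so a spike shorter than one window is invisible. A hypothetical sketch (`read_power` and `throttle` stand in for real telemetry and actuation hooks, which vary by vendor):

```python
import time

def power_cap_loop(read_power, throttle, cap_watts,
                   sample_secs=120.0, iterations=None):
    """
    Hypothetical rack-level power-capping loop. With a 120 s sampling
    window (the 2-minute figure from the slide), any spike shorter than
    one window passes unthrottled - the weakness a power attack exploits.
    """
    n = 0
    while iterations is None or n < iterations:
        if read_power() > cap_watts:   # only sees power at sample boundaries
            throttle()                 # actuation itself takes further minutes
        n += 1
        if iterations is None or n < iterations:
            time.sleep(sample_secs)
```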
Mitigation
Promising solutions:
Models estimating the power consumption of requests, and consequently limiting them
Power balancing instead of load balancing
Deploying per-server UPS