Page 1: 6620handout4o

INSE 6620 (Cloud Computing Security and Privacy)

Attacks on Cloud

Prof. Lingyu Wang

1

Page 2: 6620handout4o

Outline

Co-Residence Attack
Power Attack

Ristenpart et al., Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds; Xu et al., Power Attack: An Increasing Threat to Data Centers

Page 3: 6620handout4o

The Threat of Multi-Tenancy

In traditional systems, the security goal is usually to "keep the bad guys out". Clouds bring new threats with multi-tenancy:

Multiple independent users share the same physical infrastructure
So, an attacker can legitimately be on the same physical machine as the target
The bad guys are next to you...

Page 4: 6620handout4o

What Would the Bad Guys Do?

Step 1: Find out where the target is located

Step 2: Try to be co-located with the target in the same (physical) machine

Step 2.1: Verify it's achieved

Step 3: Gather information about the target once co-located

Page 5: 6620handout4o

"Hey, You, Get Off of My Cloud"

Influential: cited by 872 papers as of July 2014 (Google Scholar)
Media coverage: MIT Technology Review, Network World, Network World (2), Computer World, Data Center Knowledge, IT Business Edge, Cloudsecurity.org, Infoworld

Attack launched against a commercially available "real" cloud (Amazon EC2)
Claims up to 40% success in co-residence with the target VM
First work showing concrete threats in the cloud

Page 6: 6620handout4o

Approach Overview

Map the cloud infrastructure to estimate where the target is located - cartography (classic counterpart: footprinting)
Use various heuristics to verify co-residence of two VMs (port scanning, discovering vulnerabilities)
Launch probe VMs trying to be co-resident with the target VMs (initial exploitation)
Exploit cross-VM side-channel leakage to gather information about the target (privilege escalation)

Page 7: 6620handout4o

Threat Model

Attacker model
The cloud infrastructure provider is trustworthy
Cloud insiders are trustworthy
The attacker is a malicious non-provider-affiliated third party who can legitimately use the cloud provider's services

Victim model
Victims are other cloud users that have sensitive information

Page 8: 6620handout4o

The Amazon EC2

(Figure: Xen hypervisor hosting Dom0, Guest1, Guest2)

Domain0 (Dom0) is used to manage guest images, physical resource provisioning, and access control rights
Dom0 routes packets and reports itself as a first hop

Users may choose to create instances in:
2 regions (United States and Europe)
3 availability zones (for fault tolerance)
5 Linux instance types: m1.small, c1.medium, m1.large, m1.xlarge, c1.xlarge

Page 9: 6620handout4o

IP Addresses of Instances

An instance may have a public IP, e.g., 75.101.210.100, which, from outside the cloud, maps to an external DNS name: ec2-75-101-210-100.compute-1.amazonaws.com

And an internal IP and DNS name: 10.252.146.52, domU-12-31-38-00-8D-C6.compute-1.internal

Within the cloud, both domain names resolve to the internal IP

75.101.210.100 -> ec2-75-101-210-100.compute-1.amazonaws.com -> 10.252.146.52
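The external DNS name itself encodes the public IP, which is what makes surveying EC2's address space scriptable. A minimal sketch (the function name is mine, not from the paper):

```python
def public_ip_from_dns(name: str) -> str:
    """Recover the public IP embedded in an EC2 external DNS name,
    e.g. 'ec2-75-101-210-100.compute-1.amazonaws.com' -> '75.101.210.100'."""
    host = name.split(".")[0]            # 'ec2-75-101-210-100'
    if not host.startswith("ec2-"):
        raise ValueError("not an EC2 external DNS name")
    return host[len("ec2-"):].replace("-", ".")

print(public_ip_from_dns("ec2-75-101-210-100.compute-1.amazonaws.com"))
# prints 75.101.210.100
```

Resolving the same name from inside the cloud (e.g., with the standard-library `socket.gethostbyname`) would instead return the internal IP, per the slide.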


Page 10: 6620handout4o

Network Probing

nmap: TCP connect probes (3-way handshake)
hping: TCP SYN traceroutes

Both nmap and hping target ports 80 and 443

wget: retrieve web pages up to 1024 B
Internal probing: from one instance to another

Legitimate w.r.t. Amazon policies

External probing from outside EC2
Not illegal (port scanning is)
Only targeting ports 80/443 with services running - implications on ethical issues

Page 11: 6620handout4o

Step 1: Mapping the Cloud

Hypothesis:
The Amazon EC2 internal IP address space is cleanly partitioned between availability zones
(likely to make it easy to manage separate network connectivity for these zones)
Instance types within these zones also show considerable regularity
Moreover, different accounts exhibit similar placement

Page 12: 6620handout4o

Mapping the Cloud

20 instances for each of the 15 zone/type pairs, 300 in total

Plot of internal IPs against zones

Result: Different availability zones correspond to different statically defined internal IP address ranges.
Page 13: 6620handout4o

Mapping the Cloud

20 instances of each type, from another account, in zone 3

Plot of internal IPs in zone 3 against instance types

Result: The same instance types correspond loosely with similar IP address range regions.
Page 14: 6620handout4o

Derive IP Address Allocation Rules

Heuristics to label /24 prefixes with both availability zone and instance type:

All IPs from a /16 are from the same availability zone
A /24 inherits any included sampled instance type; if there are multiple instance types, it is ambiguous
A /24 containing a Dom0 IP address only contains Dom0 IP addresses; we associate to this /24 the type of the Dom0's associated instance
All /24's between two consecutive Dom0 /24's inherit the former's associated type

Example: 10.250.8.0/24 contained Dom0 IPs associated with the m1.small instances in 10.250.9.0/24 and 10.250.10.0/24
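The per-/24 heuristics above can be sketched as a small labelling routine. This is an illustrative reconstruction, not the authors' code, and the sample data is hypothetical; the /16 rule and the "between consecutive Dom0 /24s" rule are omitted for brevity:

```python
from ipaddress import ip_network

def label_prefixes(samples, dom0_map):
    """Label /24 prefixes with an instance type.
    samples:  {instance_ip: instance_type} for probed instances
    dom0_map: {dom0_ip: type of the Dom0's associated instance}"""
    labels = {}
    # A /24 inherits any sampled instance type; conflicting types -> ambiguous
    for ip, itype in samples.items():
        p = str(ip_network(ip + "/24", strict=False))
        labels[p] = "ambiguous" if labels.get(p, itype) != itype else itype
    # A /24 containing a Dom0 IP gets the type of the Dom0's associated instance
    for ip, itype in dom0_map.items():
        p = str(ip_network(ip + "/24", strict=False))
        labels[p] = itype
    return labels

labels = label_prefixes(
    {"10.250.9.12": "m1.small", "10.250.9.200": "c1.medium"},
    {"10.250.8.5": "m1.small"})
# labels == {'10.250.9.0/24': 'ambiguous', '10.250.8.0/24': 'm1.small'}
```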

Page 15: 6620handout4o

Mapping 6057 EC2 Servers

Page 16: 6620handout4o

Preventing Cloud Cartography

Why prevent it?
Makes the following attacks harder
Hides the infrastructure and the number of users

What makes mapping easier?
Static local IPs - changing them may complicate management
External-to-internal IP mapping - preventing it can only slow down mapping (timing and tracert are still possible)

Page 17: 6620handout4o

Step 2: Determine Co-residence

Network-based co-resident checks: instances are likely co-resident if they have:

matching Dom0 IP address (Dom0: 1st hop from this instance, or last hop to the victim)

small packet round-trip times (needs a "warm-up" - the 1st probe is discarded)

numerically close internal IP addresses (e.g., within 7)

8 m1.small instances on one machine
Page 18: 6620handout4o

Step 2: Determine Co-residence

Verified via a hard-disk-based covert channel
All "instances" are in zone 3

Effective false positive rate of zero
Go with a simpler test:

Close enough internal IPs? If yes, then tracert. A single hop (Dom0) in between? If yes, the test passes
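Putting the checks together, the decision procedure can be sketched as follows (the field names and the RTT threshold are illustrative assumptions, not values from the paper; the paper's final simplified test swaps the RTT check for a traceroute showing a single Dom0 hop):

```python
from ipaddress import ip_address

def likely_co_resident(probe, victim, max_ip_gap=7, rtt_ms_threshold=1.0):
    """Network-based co-residence test, per the slides:
    matching Dom0 IP, or numerically close internal IPs plus small RTTs
    (the first 'warm-up' probe is discarded)."""
    if probe["dom0_ip"] == victim["dom0_ip"]:       # same first hop
        return True
    gap = abs(int(ip_address(probe["internal_ip"]))
              - int(ip_address(victim["internal_ip"])))
    rtts = probe["rtts_to_victim_ms"][1:]           # drop warm-up probe
    return (gap <= max_ip_gap
            and len(rtts) > 0
            and sum(rtts) / len(rtts) < rtt_ms_threshold)

probe = {"dom0_ip": "10.1.0.2", "internal_ip": "10.252.146.52",
         "rtts_to_victim_ms": [5.0, 0.3, 0.4]}
victim = {"dom0_ip": "10.1.0.1", "internal_ip": "10.252.146.57"}
print(likely_co_resident(probe, victim))   # True: close IPs, small RTTs
```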
Page 19: 6620handout4o

Step 3: Exploiting VM Placement

Facts about Amazon placement
The same account never has instances on the same machine (so 8 instances will be placed on 8 machines)
Sequential locality (A stops, then B starts: A and B are likely co-resident)
Parallel locality (A and B under different accounts run at roughly the same time: likely co-resident)
Machines with fewer instances are more likely to receive new placements (load balancing)
m1.xlarge and c1.xlarge have their own machines
Page 20: 6620handout4o

Step 3: Exploiting VM Placement

Strategy 1: Brute-forcing placement
141 out of 1686: a success rate of 8.4%

Strategy 2: Abusing placement locality
Attacker instance-flooding right after the target instances are launched - exploiting parallel locality

Observing instances disappearing/reappearing
Triggering the creation of new instances (elasticity)

40% success rate (flooding after 5 minutes)
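Strategy 2 is essentially a launch-check-kill loop. A sketch, with hypothetical callback functions standing in for the EC2 API and the co-residence test:

```python
def flood_for_co_residence(launch_probe, is_co_resident, terminate, n=20):
    """Instance-flooding sketch: right after the victim launches, start many
    probe instances; keep any that land on the victim's machine and kill
    the rest immediately after probing."""
    hits = []
    for _ in range(n):
        probe = launch_probe()
        if is_co_resident(probe):
            hits.append(probe)
        else:
            terminate(probe)
    return hits
```

With real cloud APIs behind the callbacks, `hits` would be the probe instances confirmed co-resident with the target.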
Page 21: 6620handout4o

Step 3: Exploiting VM Placement

The "window" for parallel locality is quite large
Evidence of sequential locality

(Each instance is killed immediately after probing)

Page 22: 6620handout4o

Step 4: Information Leakage

Co-residency affords the ability to:
Denial of service
Estimate the victim's workload
Extract cryptographic keys via side channels
Page 23: 6620handout4o

Mitigations

Co-residence checks:
Prevent identification of Dom0/the hypervisor

VM placement:
Allow users to control/exclusively use machines

Side channel leaks:
Many methods exist
Limitations: impractical (overhead), application-specific, or insufficient protection
Also, all of them require knowing all possible channels in advance
Page 24: 6620handout4o

Amazon's Response

Amazon downplays the report highlighting vulnerabilities in its cloud service:
"The side channel techniques presented are based on testing results from a carefully controlled lab environment with configurations that do not match the actual Amazon EC2 environment."
"As the researchers point out, there are a number of factors that would make such an attack significantly more difficult in practice."

http://www.techworld.com.au/article/324189/amazon_downplays_report_highlighting_vulnerabilities_its_cloud_service

Page 25: 6620handout4o

Outline

Co-Residence Attack
Power Attack

Page 26: 6620handout4o

Background

The number of servers in data centers surged from 24 million in 2008 to 35 million in 2012

Power consumption increased 56%

Very expensive to upgrade existing power infrastructures

How to add more servers with less cost?

Page 27: 6620handout4o

Power Attack

Solution: oversubscription
Place more servers than can be supported by the power infrastructure
Assumption: not all servers will reach peak consumption (nameplate power ratings) at the same time

This leaves the data center vulnerable to a power attack:
Malicious workload that can generate power spikes on multiple servers/racks/the whole data center
Launched as a regular user
Causing DoS to both providers and clients by triggering the circuit breakers (CBs)
Page 28: 6620handout4o

Power Distribution in Data Centers

Three tiers:
60-400 kV → transformer → 10-20 kV → switchgear → 400-600 V → UPS/PDUs (power distribution units) → racks
Circuit breakers at the switchgear, PDUs, and rack-level branch circuits
Page 29: 6620handout4o

Oversubscription

Google's analysis
Workload traces collected from real data centers: search, webmail, and MapReduce
Peak power reaches 96% of rated capacity at rack level, but only 72% at data center level
Oversubscription would allow adding 38% more servers

A big assumption:
Workloads never reach peak consumption
Benign workloads - maybe; malicious ones - no!
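The 38% figure follows from the gap between rack-level and facility-level peaks: if the aggregate facility never exceeds 72% of rated capacity, the same capacity can host roughly 1/0.72 ≈ 1.39 times the servers. A back-of-envelope check (not the paper's exact provisioning model):

```python
facility_peak_fraction = 0.72            # facility-level peak vs. rated capacity
headroom = 1 / facility_peak_fraction - 1
print(f"extra servers: {headroom:.0%}")  # roughly 39%, in line with the ~38% above
```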

Page 30: 6620handout4o

Threat Model

Target can be a rack, a PDU, or the data center
Running public servers, e.g., IaaS, PaaS, SaaS
Power oversubscription
Power consumption is monitored/managed at rack level (machine level is too expensive)

Adversary
Hackers, competitors, cyber crime/cyber warfare
A regular user with sufficient resources (large number of accounts, workload) and a mapping of the cloud

Our focus: how to generate power spikes under IaaS, PaaS, SaaS?
Page 31: 6620handout4o

Power Attack in PaaS

PaaS: the attacker can run any chosen applications
Load balancing:
Load (utility) balancing ≠ power balancing

Attack in two stages:
Utility reaches 100% (e.g., CPU)
Fine-tune the workload to further increase power consumption (remember: utility ≠ power)
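Why the second stage works can be seen with a toy linear power model (all coefficients and shares below are made up for illustration, not measured values): two workloads with the same 100% CPU-utilization reading draw different power depending on which units they exercise.

```python
def server_power_w(cpu_util, fpu_share, mem_bw_share):
    """Toy model: power grows with CPU utilization, and further with heavy
    FPU use and memory bandwidth, at the same utilization reading.
    Coefficients (watts) are illustrative only."""
    idle, cpu, fpu_bonus, mem = 120.0, 60.0, 25.0, 35.0
    return (idle + cpu * cpu_util
            + fpu_bonus * cpu_util * fpu_share
            + mem * mem_bw_share)

# Stage 1: saturate the CPU with an "easy" workload
baseline = server_power_w(cpu_util=1.0, fpu_share=0.25, mem_bw_share=0.25)
# Stage 2: same 100% CPU reading, but FPU- and memory-heavy
tuned = server_power_w(cpu_util=1.0, fpu_share=0.75, mem_bw_share=0.75)
print(baseline, tuned)   # 195.0 225.0 - same utility, higher power
```

A scheduler balancing only the utilization reading sees both workloads as identical, which is exactly the gap the attack exploits.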
Page 32: 6620handout4o

Single Server Test

Goal: find out how workloads affect power
SPEC CPU2006, HPC benchmark

Results:
Different workloads have very different power costs
Same CPU (100%) and memory usage ≠ same power, e.g., 462 vs 465, 462 vs 456

Page 33: 6620handout4o

Single Server Test

HPL benchmark
Multiple parameters to adjust
Adjust block size NB (how the problem is solved)

Results:
Same workload, same CPU and memory usage
Different power costs under different parameters

Page 34: 6620handout4o

Rack-Level Test

Results:
Similar to a single machine

Attack:
Increase the workload to reach the utility cap
Further increase the power cost by changing the workload/tuning parameters

Page 35: 6620handout4o

Damage Assessment

Overheating
One CPU is overheated, resulting in system failure

CB tripped
In a room with 16 servers, of which only 4 are under attack

It will only get worse in the real world
When memory/I/O devices are attacked
With better "power proportionality"
(60% of power is consumed when idle in this case)

Page 36: 6620handout4o

Power Attack in IaaS

IaaS: more control using VMs; more exposure
Attack vectors:

Parasite attack - attack from the inside
Run applications from VMs
Launch DoS attacks on such VMs (more power cost than a normal workload)

Exploit routine operations
Live migration of VMs
Launch a parasite attack during migration

Page 37: 6620handout4o

Evaluation - Parasite Attack

Parasite attack
Co-resident with the victim
Run an intensive workload
Launch DoS attacks on such VMs

Results
Normal load: 180 W
Intensive load: 200 W
DoS: 230 W (peak 245 W)
A 30% increase
smurf: broadcasting

Page 38: 6620handout4o

Evaluation - VM Migration

Results
During migration, both the source and the destination experience power spikes (memory copy, NICs, CPUs)
Both intra-rack and inter-rack (migrating to the same server)

Page 39: 6620handout4o

Power Attack in SaaS

SaaS: limited control
Attack vectors: specially crafted requests

Trigger large numbers of cache misses
Floating point operations

The floating point unit (FPU) is more power hungry than the arithmetic logic unit (ALU)

Divisions rather than additions/multiplications

Page 40: 6620handout4o

Power Attack in SaaS

RUBiS online shopping benchmark
Modified to support:

Floating point operations (discount coupons)
Cache misses (continuously browsing at random)

30-40% power spikes

Page 41: 6620handout4o

Data Center Level Simulations

Based on the configurations of a Google data center in Lenoir, NC, USA
The "Original" workload is based on traces of the data center; the "Attack" workload includes HPC workloads

Attacking the "peak", "medium", and "valley" regions

Page 42: 6620handout4o

Results

One PDU: a 22-minute attack trips the PDU-level CB
Multi-PDU: 4 attacks, all trip CBs

The first 3 attacks are recovered from due to load balancing
The last attack causes DoS (only 53% of requests processed) during hours 58-69

Page 43: 6620handout4o

Results

A DC-level attack is possible

Larger-scale attacks require more resources

Page 44: 6620handout4o

Mitigation

Power capping: limit peak consumption
Challenges: a 2-minute sampling window is enough for attacks; even longer (12 minutes) is needed to actually reduce consumption

Server consolidation (shut down servers not in use)
Better power proportionality - more aggressive oversubscription - more vulnerable
Challenges:
Need to save power
Difficult to monitor power
Difficult to distinguish between users and attacks

Page 45: 6620handout4o

Mitigation

Promising solutions
Models that estimate the power consumption of requests and consequently limit them
Power balancing instead of load balancing
Deploying per-server UPSs