INSE 6620 (Cloud Computing Security and Privacy)
Attacks on Cloud
Prof. Lingyu Wang
Outline
Co-Residence Attack
Power Attack
Ristenpart et al., "Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds"; Xu et al., "Power Attack: An Increasing Threat to Data Centers"
The Threat of Multi-Tenancy
In traditional systems, the security goal is usually to "keep the bad guys out"
Clouds bring new threats with multi-tenancy:
Multiple independent users share the same physical infrastructure
So, an attacker can legitimately be on the same physical machine as the target
The bad guys are next to you...
What Would the Bad Guys do?
Step 1: Find out where the target is located
Step 2: Try to be co-located with the target in the same (physical) machine
Step 2.1: Verify it’s achieved
Step 3: Gather information about the target once co-located
“Hey, You, Get Off of My Cloud”
Influential: cited by 872 papers as of July 2014 (Google Scholar)
Media coverage: MIT Technology Review, Network World (two articles), Computer World, Data Center Knowledge, IT Business Edge, Cloudsecurity.org, Infoworld
Attack launched against a commercially available "real" cloud (Amazon EC2)
Claims up to 40% success in co-residence with the target VM
First work showing concrete threats in the cloud
Approach Overview
Map the cloud infrastructure to estimate where the target is located - cartography
Launch probe VMs trying to be co-resident with the target VM
Use various heuristics to verify co-residence of two VMs
Exploit cross-VM side-channel leakage to gather information about the target
(Analogous to the classic attack steps: footprinting, port scanning, discovering vulnerabilities, initial exploitation, privilege escalation)
Threat Model
Attacker model:
Cloud infrastructure provider is trustworthy
Cloud insiders are trustworthy
Attacker is a malicious non-provider-affiliated third party who can legitimately use the cloud provider's service
Victim model:
Victims are other cloud users that have sensitive information
The Amazon EC2
Xen hypervisor
Domain0 (Dom0) is used to manage guest images, physical resource provisioning, and access control rights
Dom0 routes packets and reports itself as a first hop
[Figure: Xen hypervisor hosting Dom0, Guest1, and Guest2]
Users may choose to create an instance in:
2 regions (United States and Europe)
3 availability zones (for fault tolerance)
5 Linux instance types: m1.small, c1.medium, m1.large, m1.xlarge, c1.xlarge
IP Addresses of Instances
An instance may have a public IP, e.g., 75.101.210.100, which, from outside the cloud, maps to an external DNS name: ec2-75-101-210-100.compute-1.amazonaws.com
And an internal IP and DNS name: 10.252.146.52, domU-12-31-38-00-8D-C6.compute-1.internal
Within the cloud, both domain names resolve to the internal IP
75.101.210.100 -> ec2-75-101-210-100.compute-1.amazonaws.com -> 10.252.146.52
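Since the external DNS name simply embeds the public IP, the mapping in either direction can be computed offline without touching the cloud. A minimal sketch (function names are my own; the naming pattern is the one shown on the slide):

```python
def public_ip_to_external_dns(ip: str) -> str:
    """Derive the EC2-style external DNS name that encodes a public IP."""
    return "ec2-" + ip.replace(".", "-") + ".compute-1.amazonaws.com"

def external_dns_to_public_ip(name: str) -> str:
    """Recover the public IP encoded in an EC2-style external DNS name."""
    label = name.split(".", 1)[0]           # "ec2-75-101-210-100"
    return label[len("ec2-"):].replace("-", ".")
```

Resolving the external name from inside the cloud (step three of the chain above) then yields the internal IP.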
Network Probing
nmap: TCP connect probes (3-way handshake)
hping: TCP SYN traceroutes
Both nmap and hping target ports 80 and 443
wget: retrieve web pages up to 1024 B
Internal probing: from one instance to another
Legitimate w.r.t. Amazon policies
External probing: from outside EC2
Not illegal (port scanning is legal)
Only targets ports 80/443 (with services running), which has implications for ethical issues
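The TCP connect probe used here (the nmap "-sT" style check) amounts to attempting a full 3-way handshake; a minimal sketch, with host and port as placeholders:

```python
import socket

def tcp_connect_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Attempt a full TCP handshake; True if the port accepts the connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A SYN traceroute (hping) additionally varies the TTL to learn the path, which is what reveals Dom0 as the first hop.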
Step 1: Mapping the Cloud
Hypothesis: the Amazon EC2 internal IP address space is cleanly partitioned between availability zones
(likely to make it easy to manage separate network connectivity for these zones)
Instance types within these zones also show considerable regularity. Moreover, different accounts exhibit similar placement.
Mapping the Cloud
20 instances for each of the 15 zone/type pairs, total 300
Plot of internal IPs against zones
Result: Different availability zones correspond to different statically defined internal IP address ranges.
Mapping the Cloud
20 instances of each type, from another account, zone 3
Plot of internal IPs in Zone 3 against instance types
Result: Same instance types correspond loosely with similar IP address range regions.
Derive IP Address Allocation Rules
Heuristics to label /24 prefixes with both availability zone and instance type:
All IPs from a /16 are from the same availability zone
A /24 inherits any included sampled instance type; if it contains multiple instance types, it is ambiguous
A /24 containing a Dom0 IP address only contains Dom0 IP addresses; we associate to this /24 the type of the Dom0's associated instance
All /24s between two consecutive Dom0 /24s inherit the former's associated type
Example: 10.250.8.0/24 contained Dom0 IPs associated with the m1.small instances in 10.250.9.0/24 and 10.250.10.0/24
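The first heuristics above (a /24 inherits its sampled types, is ambiguous on conflict, and takes a Dom0's associated type) can be sketched as follows; the interpolation between consecutive Dom0 /24s is omitted, and the function names and data shapes are my own:

```python
import ipaddress

def label_slash24s(samples, dom0_samples):
    """
    Toy version of the labeling heuristics.
    samples:      list of (internal_ip, instance_type) from probe instances
    dom0_samples: list of (dom0_ip, instance_type of associated instance)
    Returns {"/24 prefix": instance_type or "ambiguous"}.
    """
    labels = {}
    for ip, itype in samples:
        net = str(ipaddress.ip_network(ip + "/24", strict=False))
        if net in labels and labels[net] != itype:
            labels[net] = "ambiguous"   # multiple sampled types in one /24
        else:
            labels.setdefault(net, itype)
    # A /24 holding Dom0 IPs gets the type of the Dom0's associated instance
    for ip, itype in dom0_samples:
        net = str(ipaddress.ip_network(ip + "/24", strict=False))
        labels[net] = itype
    return labels
```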
Mapping 6057 EC2 Servers
Preventing Cloud Cartography
Why prevent it?
Makes subsequent attacks harder
Hides the infrastructure and the number of users
What makes mapping easier?
Static local IPs (changing them may complicate management)
External-to-internal IP mapping (preventing it can only slow down mapping; timing and traceroute are still possible)
Step 2: Determine Co-residence
Network-based co-residence checks: instances are likely co-resident if they have:
a matching Dom0 IP address (Dom0 is the first hop from this instance, or the last hop to the victim)
small packet round-trip times (needs a "warm-up"; the first probe is discarded)
numerically close internal IP addresses (e.g., within 7)
8 m1.small instances on one machine
Step 2: Determine Co-residence
Verified via a hard-disk-based covert channel (all "instances" are in zone 3)
Effective false positive rate of ZERO
So go with a simpler test:
Close enough internal IPs? If yes, then traceroute. A single hop (Dom0) in between? If yes, the test passes
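The simpler two-step test combines IP closeness with the traceroute hop count; a sketch, where `hops_between` is assumed to come from an actual traceroute between the two instances:

```python
import ipaddress

def likely_coresident(ip_a: str, ip_b: str, hops_between: int,
                      max_diff: int = 7) -> bool:
    """
    Simplified two-step co-residence test:
    1) internal IPs numerically close (e.g., within 7), and
    2) a traceroute from A to B shows a single intermediate hop (Dom0).
    """
    diff = abs(int(ipaddress.ip_address(ip_a)) - int(ipaddress.ip_address(ip_b)))
    return diff <= max_diff and hops_between == 1
```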
Step 3: Exploiting VM Placement
Facts about Amazon placement:
The same account never has two instances on the same machine (so 8 instances will be placed on 8 machines)
Sequential locality: if A stops and then B starts, A and B are likely co-resident
Parallel locality: if A and B, under different accounts, are run at roughly the same time, they are likely co-resident
Machines with fewer instances are more likely to receive new placements (load balancing)
m1.xlarge and c1.xlarge instances have their own machines
Step 3: Exploiting VM Placement
Strategy 1: brute-forcing placement
141 successes out of 1,686 attempts: a success rate of 8.4%
Strategy 2: abusing placement locality
Attacker instance-flooding right after the target instances are launched, exploiting parallel locality
Observing instances disappearing/reappearing
Triggering the creation of new instances (elasticity)
40% success rate when flooding 5 minutes after the target launch
Step 3: Exploiting VM Placement
“Window” for parallel locality is quite large
Evidence of sequential locality
(Each instance is killed immediately after probing)
Step 4: Information Leakage
Co-residency affords the ability to:
Launch denial of service
Estimate the victim's workload
Extract cryptographic keys via side channels
Mitigations
Co-residence checks:
Prevent identification of Dom0/hypervisor
VM placement:
Allow users to control, or exclusively use, machines
Side-channel leaks:
Many methods exist
Limitations: impractical (overhead), application-specific, or insufficient protection
Also, all of them require knowing all possible channels in advance
Amazon's response
Amazon downplays report highlighting vulnerabilities in its cloud service
"The side channel techniques presented are based on testing results from a carefully controlled lab environment with configurations that do not match the actual Amazon EC2 environment.""As the researchers point out, there are a number of factors that would make such an attack significantly more difficult in practice."
http://www.techworld.com.au/article/324189/amazon_downplays_report_highlighting_vulnerabilities_its_cloud_service
Outline
Co-Residence Attack
Power Attack
Background
The number of servers in data centers surged from 24 million in 2008 to 35 million in 2012
Power consumption increased by 56%
Very expensive to upgrade existing power infrastructures
How to add more servers with less cost?
Power Attack
Solution: oversubscription
Place more servers than can be supported by the power infrastructure
Assumption: not all servers will reach peak consumption (nameplate power ratings) at the same time
This leaves the data center vulnerable to a power attack:
A malicious workload that can generate power spikes on multiple servers, racks, or the whole data center
Launched as a regular user
Causes DoS to both providers and clients by tripping the circuit breakers (CBs)
Power Distribution in Data Centers
Three tiers: 60-400 kV → transformer → 10-20 kV → switchgear → 400-600 V → UPS/PDUs (power distribution units) → racks
Circuit breakers at the switchgear, PDUs, and rack-level branch circuits
Oversubscription
Google's analysis:
Workload traces collected from real data centers: search, webmail, and MapReduce
Peak power reaches 96% of rated capacity at the rack level, but only 72% at the data center level
Oversubscription would allow adding 38% more servers
A big assumption: workloads never reach peak consumption
Benign workloads - maybe; malicious ones - no!
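The headroom figure follows directly from the 72% data-center-level peak: sizing the power infrastructure to the observed aggregate peak, rather than the nameplate sum, leaves room for roughly 1/0.72 ≈ 1.39x as many servers. A back-of-the-envelope check:

```python
# If aggregate peak draw is only 72% of rated capacity, then capacity sized
# to that peak supports ~1/0.72 = 1.39x the servers, i.e. roughly 39% more
# (the Google analysis cited on the slide reports this as 38%).
dc_peak_fraction = 0.72
extra_servers = 1 / dc_peak_fraction - 1
print(f"~{extra_servers:.0%} more servers")   # ~39% more servers
```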
Threat Model
Target: can be a rack, PDU, or data center
Runs public services, e.g., IaaS, PaaS, SaaS
Uses power oversubscription
Power consumption is monitored/managed at rack level (machine level is too expensive)
Adversary:
Hackers, competitors, cyber crime/cyber warfare
A regular user with sufficient resources (a large number of accounts, workload) and a mapping of the cloud
Our focus: how to generate power spikes under IaaS, PaaS, SaaS?
Power Attack in PaaS
PaaS: the attacker can run any chosen applications
Load balancing: load (utility) balancing ≠ power balancing
Attack in two stages:
1. Utility (e.g., CPU) reaches 100%
2. Fine-tune the workload to further increase power consumption (remember: utility ≠ power)
Single Server Test
Goal: find out how workloads affect power
SPEC CPU2006 benchmark
Results:
Different workloads have very different power costs
Same CPU utilization (100%) and memory usage ≠ same power
e.g., benchmark 462 vs. 465, and 462 vs. 456
Single Server Test
HPL benchmark
Multiple parameters to adjust, e.g., block size NB (how the problem is solved)
Results:
Same workload, same CPU and memory usage
Different power costs under different parameters
Rack Level Test
Results: similar to a single machine
Attack:
Increase the workload to reach the utility cap
Further increase the power cost by changing the workload/tuning parameters
Damage Assessment
Overheating: one CPU is overheated, resulting in system failure
CB tripped in a room with 16 servers, of which only 4 are under attack
It will only get worse in the real world:
When memory/I/O devices are attacked
With better "power proportionality" (60% of power is consumed when idle in this case)
Power Attack in IaaS
IaaS: more control using VMs; more exposure
Attack vectors:
Parasite attack (attack from inside): run applications from VMs; launch DoS attacks on such VMs (more power cost than a normal workload)
Exploit routine operations: live migration of VMs; launch a parasite attack during migration
Evaluation - Parasite Attack
Parasite attack:
Co-resident with the victim
Run an intensive workload
Launch DoS attacks on such VMs
Results:
Normal load: 180 W
Intensive load: 200 W
DoS: 230 W (peak 245 W), a 30% increase
(smurf attack: broadcasting)
Evaluation – VM Migration
Results:
During migration, both the source and destination experience power spikes (memory copy, NICs, CPUs)
Both intra-rack and inter-rack (migrating to the same server)
Power Attack in SaaS
SaaS: limited control
Attack vectors: specially crafted requests
Trigger large numbers of cache misses
Floating-point operations: the floating-point unit (FPU) is more power-hungry than the arithmetic logic unit (ALU)
Divisions rather than additions/multiplications
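The point that the operation mix, not CPU utilization, drives power can be illustrated with two loops that both keep a core fully busy while exercising different units. A hypothetical micro-workload sketch (measuring the actual power difference would require hardware counters such as RAPL, which is outside this sketch):

```python
def alu_heavy(n: int) -> int:
    """Integer add/xor/shift loop: exercises the ALU."""
    acc = 0
    for i in range(n):
        acc = (acc + i) ^ (i << 1)
    return acc

def fpu_heavy(n: int) -> float:
    """Floating-point division loop: exercises the power-hungrier FPU."""
    acc = 1.0
    for i in range(1, n + 1):
        acc = acc / 1.0000001 + i / 3.0
    return acc
```

Both loops report 100% CPU utilization to the scheduler; only power telemetry distinguishes them, which is exactly why utility-based load balancing misses such workloads.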
Power Attack in SaaS
RUBiS online shopping benchmark, modified to support:
Floating-point operations (discount coupons)
Cache misses (continuously browsing at random)
Result: 30-40% power spikes
Data Center Level Simulations
Based on the configuration of the Google data center in Lenoir, NC, USA
The "original" workload is based on traces of the data center; the "attack" workload adds HPC workloads
Attacks target the "peak", "medium", and "valley" regions of the trace
Results
One PDU: a 22-minute attack trips the PDU-level CB
Multi-PDU: all 4 attacks trip CBs
The first 3 attacks recovered thanks to load balancing
The last attack causes DoS (only 53% of requests processed) during hours 58-69
Results
A DC-level attack is possible
Larger-scale attacks require more resources
Mitigation
Power capping: limit peak consumption
Challenge: a 2-minute sampling window is enough for attacks; it takes even longer (12 minutes) to actually reduce consumption
Server consolidation (shut down servers not in use)
Better power proportionality → more aggressive oversubscription → more vulnerable
Challenges: the need to save power; power is difficult to monitor; it is difficult to distinguish between normal users and attacks
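Why the sampling window matters can be seen from the shape of a capping controller: it only reacts to draws it observes at sample boundaries, so a spike shorter than one window is invisible. A hypothetical sketch (`read_power` and `throttle` stand in for real telemetry and actuation hooks, which vary by vendor):

```python
import time

def power_cap_loop(read_power, throttle, cap_watts,
                   sample_secs=120.0, iterations=None):
    """
    Hypothetical rack-level power-capping loop. With a 120 s sampling
    window (the 2-minute figure from the slide), any spike shorter than
    one window passes unthrottled - the weakness a power attack exploits.
    """
    n = 0
    while iterations is None or n < iterations:
        if read_power() > cap_watts:   # only sees power at sample boundaries
            throttle()                 # actuation itself takes further minutes
        n += 1
        if iterations is None or n < iterations:
            time.sleep(sample_secs)
```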
Mitigation
Promising solutions:
Models estimating the power consumption of requests, and consequently limiting them
Power balancing instead of load balancing
Deploying per-server UPS