
Page 1: 6620handout5t

7/21/2014

1

Faculty of Engineering and Computer Science

Concordia Institute for Information Systems Engineering

Cloud Traffic Security

Wen Ming Liu

INSE 6620 July 23, 2014

Agenda

2

• Cloud Applications

• Side-Channel Attacks

• Challenges and Solutions

• Ceiling Padding

Page 2: 6620handout5t

Cloud Computing Architecture

3

Cloud Computing Architecture

4

Page 3: 6620handout5t

Agenda

5

• Cloud Applications

• Side-Channel Attacks

• Challenges and Solutions

• Ceiling Padding

6

Web-Based Applications

untrusted Internet

Client ⇄ Server (Encryption)

"Cryptography solves all security problems!" Really?

Page 4: 6620handout5t

7

Side-Channel Attack on Encrypted Traffic

Internet

Client ⇄ Server (Encrypted Traffic)

User Input Observed Directional Packet Sizes

a: 801→, ←54, ←509, 60→

00: 812→, ←54, ←505, 60→,

813→, ←54, ←507, 60→

b-byte s-byte

• Network packets' sizes and directions between a user and a popular search engine

• Collected by acting as a normal user and eavesdropping on the traffic with Sniffer Pro 4.7.5

• Collected in May 2012

Indicator of the input itself

Fixed pattern: identified input string
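The attack described above is, at heart, a dictionary lookup. A minimal sketch, assuming the attacker has pre-recorded the fingerprints shown in the table (the `fingerprints` mapping and `identify` helper are hypothetical names):

```python
# Hypothetical sketch of the eavesdropper's dictionary attack: each candidate
# input is first profiled by acting as a normal user; sniffed traffic is then
# identified by exact lookup of its directional packet sizes.
# Convention here: positive = client-to-server bytes, negative = server-to-client.

fingerprints = {
    (801, -54, -509, 60): "a",    # observed when typing 'a'
    (812, -54, -505, 60): "0",    # first '0' of "00"
    (813, -54, -507, 60): "0",    # second '0' of "00"
}

def identify(observed):
    """Map one sniffed flow-vector back to the user input, if known."""
    return fingerprints.get(tuple(observed), "unknown")

print(identify([801, -54, -509, 60]))   # -> a
```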

8

Updated Patterns Dec 2013

Internet

Client ⇄ Server (Encrypted Traffic)

User Input Observed Directional Packet Sizes

a: 590→, 67→, ←60, ←60, ←728, 60→

00: 590→, 67→, ←60, ←60, ←698, 60→,

590→, 68→, ←60, ←60, ←717, 60→

b-byte s-byte

• Patterns may change over time, but attacks will still work in similar ways

• Patterns may differ across Web applications, but they are always there

Indicator of the input itself

Fixed pattern: identified input string

Page 5: 6620handout5t

9

To Make Things Worse

• The "Autocomplete" feature allows adversaries to combine the packets corresponding to multiple keystrokes

• Web applications are highly interactive

User Input Observed Directional Packet Sizes

a: 590→, 67→, ←60, ←60, ←728, 60→

00: 590→, 67→, ←60, ←60, ←698, 60→,

590→, 68→, ←60, ←60, ←717, 60→

b-byte s-byte

10

Longer Inputs, More Unique Patterns

• s value for each character entered:

char: a    b    c    d    e    f    g
s:    509  504  502  516  499  504  502

char: h    i    j    k    l    m    n
s:    509  492  517  499  501  503  488

char: o    p    q    r    s    t
s:    509  525  494  498  488  494

char: u    v    w    x    y    z
s:    503  522  516  491  502  501

• First and second keystrokes (rows: first keystroke with its s value; columns: second keystroke):

First \ Second |  s₁  |  a    b    c    d
a              | 509  | 487  493  501  497
b              | 504  | 516  488  482  481
c              | 502  | 501  488  473  477
d              | 516  | 543  478  509  499

Unique s values: 12 out of 16 (second keystroke alone), 16 out of 16 (both keystrokes)

The unique patterns leak users' private information: the input string.

In reality, it may take more than two keystrokes to uniquely identify an input string.
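The "12 out of 16 / 16 out of 16" counts can be reproduced directly from the two tables on this slide. A minimal sketch (table values copied from the slide; variable names are mine):

```python
from collections import Counter

# s-values from the slide: first[i] is the s-value of first keystroke i,
# second[i][j] is the s-value of second keystroke j after first keystroke i.
first = {"a": 509, "b": 504, "c": 502, "d": 516}
second = {
    "a": {"a": 487, "b": 493, "c": 501, "d": 497},
    "b": {"a": 516, "b": 488, "c": 482, "d": 481},
    "c": {"a": 501, "b": 488, "c": 473, "d": 477},
    "d": {"a": 543, "b": 478, "c": 509, "d": 499},
}

# How many 2-char strings does the second s-value alone pin down?
counts = Counter(second[i][j] for i in second for j in second[i])
alone = sum(1 for i in second for j in second[i] if counts[second[i][j]] == 1)

# And the (first, second) s-value pair?
pair_counts = Counter((first[i], second[i][j]) for i in second for j in second[i])
both = sum(1 for i in second for j in second[i]
           if pair_counts[(first[i], second[i][j])] == 1)

print(alone, both)   # -> 12 16
```

The two colliding second-keystroke values (501 and 488) are what keep the single-keystroke count at 12; combining both keystrokes removes every collision.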

Page 6: 6620handout5t

Side-Channel Leaks

• To protect the information in critical applications against network sniffing, a common practice is to encrypt their network traffic. However, as this research discovered, serious information leaks are still a reality.

• Even though the communications generated during these state transitions are protected by HTTPS, their observable attributes can still give away information about the user's selection.

• The eavesdropper cannot see the contents, but can observe the number of packets and the timing and size of each packet.

11

Slides 11-27 are partially based on: S. Chen, R. Wang, X. Wang, and K. Zhang. Side-channel leaks in web applications: A reality today, a challenge tomorrow. In IEEE Symposium on Security and Privacy’10, pages 191–206, 2010.

Main Findings

• Analysis of the side-channel weakness in web applications.

• Several high-profile and very popular web applications disclose surprisingly detailed sensitive user information: personal health data, family income, investment details, search queries.

• The root causes of the side-channel information leaks are fundamental characteristics of today's web applications: stateful communication, low-entropy input, and significant traffic distinctions.

• In-depth study of the challenges in mitigating the threat.

• Evaluate the effectiveness and the overhead of common mitigation techniques such as packet padding.

• Show that effective solutions to the side-channel problem have to be application-specific, relying on an in-depth understanding of the application being protected.

• This suggests the need for a significant improvement in the current practice for developing web applications.

12

Page 7: 6620handout5t

Fundamental Characteristics of Web Applications

The root causes are fundamental characteristics of today's web applications:

• Low-entropy inputs for better interaction: small input space; autosuggestion, autocomplete.

• Stateful communications: transitions to the next state depend both on the current state and on its input. Although the information from each transition may be insignificant, their combination can be very powerful.

• Significant traffic distinctions: the chance of two different user actions having the same traffic pattern is very small. Such distinctions often come from the objects updated by client-server data exchanges.

13

Significant traffic distinctions

14

Page 8: 6620handout5t

Basics of Wi-Fi Encryption Schemes:

WEP: susceptible to key-recovery attacks
WPA: TKIP (RC4)
WPA2: CCMP (128-bit AES block cipher in counter mode)

The ciphertext fully preserves the size of its plaintext!

15

Scenario: search over encrypted Wi-Fi (WPA/WPA2).
Example: a user types "list" on a WPA2 laptop.

Observed per-keystroke packet sizes (request → response):

Keystroke 1: 821 → 910
Keystroke 2: 822 → 931
Keystroke 3: 823 → 995
Keystroke 4: 824 → 1007

Consequence: anybody on the street knows our search queries.

Attacker's effort: linear, not exponential.
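The linear-vs-exponential point can be made concrete: because auto-suggest emits one observable request/response pair per keystroke, the attacker matches each keystroke against the alphabet independently instead of guessing the whole query at once. A toy cost comparison (alphabet size 26 and the function names are assumptions of this sketch):

```python
# Toy comparison of attack effort against a 4-character query.
# Per-keystroke traffic lets the attacker prune candidates one keystroke
# at a time, so the effort grows linearly in the query length.

def brute_force_cost(n, alphabet=26):
    """Guessing an n-character query all at once."""
    return alphabet ** n

def incremental_cost(n, alphabet=26):
    """Matching one keystroke at a time against the alphabet."""
    return alphabet * n

print(brute_force_cost(4), incremental_cost(4))   # -> 456976 104
```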

16

Page 9: 6620handout5t

OnlineHealthA

(“A” denoting a pseudonym)

• A web application by one of the most reputable companies in online services

• Illness/medication/surgery information is leaked, as well as the type of doctor being queried

• Vulnerable designs:

  • Entering health records: by typing (auto-suggestion) or by mouse selection (a tree-structured organization of elements)

  • Finding a doctor: uses a dropdown list item as the search input

17

Attacker's power:

• Entering health records: whether by keyboard typing or mouse selection, the attacker gains a 2000× ambiguity-reduction power.

• Find-A-Doctor: the attacker can uniquely identify the specialty.

Page 10: 6620handout5t

OnlineTaxA

• The online version of one of the most widely used applications for U.S. tax preparation.

• Design: a tax-preparation wizard that tailors the conversation based on the user's previous input.

• The forms you work on tell a lot about your family: filing status, number of children, whether you paid a big medical bill, the adjusted gross income (AGI).

19

Entry page of Deductions & Credits

Summary of Deductions & Credits

Full credit

Not eligible

Partial credit

All transitions have unique traffic patterns.

Consult the IRS instructions: $1,000 for each child.

Phase-out starts at $110,000: for every $1,000 of income, lose $50 of credit.

AGI axis (two-children scenario): $0–$110,000 full credit; $110,000–$150,000 partial credit; above $150,000 not eligible. (Two children give a $2,000 credit; losing $50 per $1,000, the phase-out spans $40,000, hence the $150,000 endpoint.)

child credit state machine

20

Page 11: 6620handout5t

Entry page of Deductions & Credits

Summary of Deductions & Credits

Full credit

Not eligible

Partial credit

Even worse, most decision procedures for credits/deductions have asymmetric paths:

Eligible: more questions ("Enter your paid interest")
Not eligible: no more questions

Student-loan-interest credit, AGI axis: $0–$115,000 full credit; $115,000–$145,000 partial credit; above $145,000 not eligible.

21

A subset of identifiable AGI thresholds:

Disabled Credit: $24,999
Earned Income Credit: $41,646
Retirement Savings: $53,000
IRA Contribution: $85,000–$105,000
Child Credit: $110,000 ($130,000 or $150,000 or $170,000 …)
Student Loan Interest: $115,000–$145,000
College Expense: $116,000
First-time Homebuyer Credit: $150,000–$170,000
Adoption Expense: $174,730–$214,780

• We are not tax experts.

• OnlineTaxA can find more than 350 credits/deductions.

22

Page 12: 6620handout5t

OnlineInvestA

A major financial institution in the U.S.

Which funds do you invest in? No secret:
• Each price-history curve is a GIF image from MarketWatch.
• Everybody in the world can obtain the images from MarketWatch.
• Just compare the image sizes!

Your investment allocation:
• Given only the size of the pie chart, can we recover it?
• Challenge: hundreds of pie charts collide on the same size.

23

Inference based on the evolution of the pie-chart size over 4 or 5 days:

• The financial institution updates the pie chart every day after the market closes.

• The mutual fund prices are public knowledge.

[Figure: candidate pruning — size of day 1: ≅80,000 charts; size of day 2 + prices of the day: ≅800 charts; size of day 3 + prices of the day: ≅80 charts; size of day 4 + prices of the day: ≅8 charts; then 1 chart]

24

Page 13: 6620handout5t

Challenging to Mitigate the Vulnerabilities

25

Why challenging?

• Traffic differences are everywhere. Which ones result in serious data leaks? We need to analyze the application semantics, the availability of domain knowledge, etc. Hard.

• Is there a vulnerability-agnostic defense that fixes the vulnerabilities without finding them? Obviously, padding is a must-do strategy. We found that even for the discussed apps, the defense policies have to be case-by-case.

26

Page 14: 6620handout5t

Universal Mitigation Policies

• See if the problem can be solved without analyzing individual applications

• Application-agnostic manner: padding

  • Rounding:

  • Random padding:

  • Average overhead:

• Given , the reduction power is calculated after padding

27

Any Problem?
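The two application-agnostic policies named above can be sketched as follows (the slide leaves the exact formulas elided; function names are mine):

```python
import random

def round_pad(size, delta):
    """Rounding: pad the packet up to the next multiple of delta."""
    return ((size + delta - 1) // delta) * delta

def random_pad(size, delta, rng=random):
    """Random padding: append a random number of bytes in [0, delta)."""
    return size + rng.randrange(delta)

print(round_pad(473, 64))   # -> 512
print(round_pad(516, 64))   # -> 576
```

Rounding is deterministic (the same input always yields the same padded size), while random padding re-randomizes on every response; both trade overhead for ambiguity.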

Agenda

28

• Cloud Applications

• Side-Channel Attacks

• Challenges and Solutions

• Ceiling Padding

Page 15: 6620handout5t

29

Trivial problem?

30

Don't Forget the Cost

(Prefix) char | s Value | Rounding Δ=64 | Δ=160 | Δ=256
(c) c         | 473     | 512 | 480 | 512
(c) d         | 477     | 512 | 480 | 512
(d) b         | 478     | 512 | 480 | 512
(d) d         | 499     | 512 | 640 | 512
(a) c         | 501     | 512 | 640 | 512
(b) a         | 516     | 576 | 640 | 768
Padding Overhead (%)    | 6.5% | 14.1% | 13.0%

• No guarantee of better privacy at a higher cost:

  • Δ↑ ⇏ privacy↑
  • Δ↑ ⇏ overhead↑

• Making all inputs indistinguishable by rounding would result in a 21,074% overhead for a well-known online tax system [1]

[1] S. Chen, R. Wang, X. Wang, and K. Zhang. Side-channel leaks in web applications: A reality today, a challenge tomorrow. In IEEE Symposium on Security and Privacy'10, pages 191–206, 2010.
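The overhead row of the table can be reproduced from the s values alone. A short sketch (values from the table; the `overhead` helper is mine):

```python
def round_pad(size, delta):
    """Pad a packet size up to the next multiple of delta."""
    return ((size + delta - 1) // delta) * delta

sizes = [473, 477, 478, 499, 501, 516]   # s values from the table

def overhead(delta):
    """Percentage of extra bytes introduced by rounding with this delta."""
    padded = [round_pad(s, delta) for s in sizes]
    return 100.0 * (sum(padded) - sum(sizes)) / sum(sizes)

for d in (64, 160, 256):
    print(f"delta={d}: {overhead(d):.1f}%")   # -> 6.5%, 14.1%, 13.0%
```

Note that Δ=160 costs more than Δ=256 here, which is exactly the slide's point: a larger Δ does not imply a larger overhead, nor better privacy.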

Page 16: 6620handout5t

Two Conflicting Goals

31

• To prevent side-channel attacks, we face two seemingly conflicting goals:

  • Privacy protection: reduce the differences in packet sizes

  • Cost: minimize the overhead (communication, processing, …)

• The similarity of these goals may allow us to borrow existing expertise from privacy-preserving data publishing (PPDP).

How? Grouping and breaking.

Slides 31-45 are partially based on: W. M. Liu, L. Wang, K. Ren, P. Cheng, M. Debbabi, S. Zhu, "PPTP: Privacy-Preserving Traffic Padding in Web-Based Applications," IEEE Trans. on Dependable and Secure Computing (TDSC).

Solution: Ceiling Padding

32

• Ceiling padding: pad every packet to the maximum size in its group.

s Value | Padding Option 1 | Padding Option 2 | (Prefix) char
473     | 477              | 478              | (c) c
477     | 477              | 478              | (c) d
478     | 499              | 478              | (d) b
499     | 499              | 516              | (d) d
501     | 516              | 516              | (a) c
516     | 516              | 516              | (b) a

PPDP analogy: the s value plays the quasi-identifier, Options 1 and 2 are two generalization functions, and the (prefix) char is the sensitive attribute; a PPTP padding group corresponds to a PPDP anonymized group.

• PPTP goals: privacy, cost

• PPDP goals: privacy, data utility

So we can apply existing techniques from data publication to achieve ceiling padding.

However, there are a few differences, and hence challenges...
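Ceiling padding itself is straightforward once the grouping is fixed. A minimal sketch, assuming the simplest grouping strategy (sort by size, cut every k entries, in the spirit of the table's Option 2; the function name is mine):

```python
def ceiling_padding(pairs, k):
    """Partition (size, input) pairs into groups of at least k members by
    sorted size, then pad every size to the maximum size of its group."""
    pairs = sorted(pairs)
    groups = [pairs[i:i + k] for i in range(0, len(pairs), k)]
    if len(groups) > 1 and len(groups[-1]) < k:   # merge an undersized tail
        groups[-2].extend(groups.pop())
    out = {}
    for g in groups:
        ceiling = max(s for s, _ in g)            # the group's dominant size
        for s, a in g:
            out[a] = ceiling
    return out

data = [(473, "(c)c"), (477, "(c)d"), (478, "(d)b"),
        (499, "(d)d"), (501, "(a)c"), (516, "(b)a")]
print(ceiling_padding(data, 3))
# -> every input padded to 478 or 516, matching the table's Option 2
```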

Page 17: 6620handout5t

33

PPTP Components

• Interaction:

  • action a: atomic user input that triggers traffic (a keystroke, a mouse click, …)

  • action-sequence: a sequence of actions carrying complete input information (consecutive keystrokes, …)

  • action-set Ai: collection of all i-th actions in a set of action-sequences

• Observation:

  • flow-vector v: a sequence of flows (directional packet sizes) triggered by an action

  • vector-sequence: a sequence of flow-vectors triggered by an equal-length action-sequence

  • vector-set Vi: collection of all i-th vectors in a set of vector-sequences

• Vector-Action Set VAi: pairs of i-th actions and the corresponding i-th flow-vectors

User Input Observed Directional Packet Sizes

a: 801→, ←54, ←509, 60→

00: 812→, ←54, ←505, 60→,

813→, ←54, ←507, 60→

34

Privacy and Cost

• k-indistinguishability: given a vector-action set VA

  • Padding group: any S ⊆ VA such that all pairs in S have identical flow-vectors and no S′ ⊃ S satisfies this property

  • VA satisfies k-indistinguishability (k an integer) if the cardinality of every padding group is no less than k

• Goal of privacy protection:

  • Upon observing any flow-vector in the traffic, the eavesdropper cannot determine which action in the table (vector-action set) triggered this flow-vector.

• l-diversity:

  • Addresses cases where not all inputs should be treated equally in padding (for example, some statistical information about the likelihood of different inputs may be publicly known).
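The k-indistinguishability property is easy to check mechanically once the padded vector-action set is given. A minimal sketch (the function name is mine):

```python
from collections import defaultdict

def satisfies_k_indistinguishability(vector_action_set, k):
    """Check that every padding group (the actions sharing one identical
    padded flow-vector) contains at least k distinct actions."""
    groups = defaultdict(set)
    for vector, action in vector_action_set:
        groups[tuple(vector)].add(action)
    return all(len(actions) >= k for actions in groups.values())

# Padded set from the ceiling-padding table (Option 2 grouping):
padded = [([478], "(c)c"), ([478], "(c)d"), ([478], "(d)b"),
          ([516], "(d)d"), ([516], "(a)c"), ([516], "(b)a")]
print(satisfies_k_indistinguishability(padded, 3))   # -> True
print(satisfies_k_indistinguishability(padded, 4))   # -> False
```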

Page 18: 6620handout5t

35

Privacy and Cost

• Vector-distance:

  • Given two equal-length flow-vectors v1 and v2, the vector-distance is the total number of bytes of difference between the flows: dist(v1, v2) = Σ_{i=1}^{n} |v1[i] − v2[i]|.

• Padding cost:

  • Given a vector-set V, the padding cost is the sum of the vector-distances between each flow-vector in V and its counterpart after padding.

• Processing cost:

  • Given a vector-set V, the processing cost is the number of flows in V whose corresponding packets should be padded.
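The three definitions above translate directly into code. A minimal sketch (function names are mine):

```python
def vector_distance(v1, v2):
    """Total bytes of difference between two equal-length flow-vectors."""
    return sum(abs(a - b) for a, b in zip(v1, v2))

def padding_cost(original, padded):
    """Sum of vector-distances between each flow-vector and its padded
    counterpart."""
    return sum(vector_distance(o, p) for o, p in zip(original, padded))

def processing_cost(original, padded):
    """Number of flows whose packets actually had to be padded."""
    return sum(1 for o, p in zip(original, padded)
               for fo, fp in zip(o, p) if fp != fo)

orig = [[473], [477], [478]]
pad = [[478], [478], [478]]          # one ceiling-padding group
print(padding_cost(orig, pad), processing_cost(orig, pad))   # -> 6 2
```

The 478-byte vector costs nothing to pad, which is why ceiling padding always leaves at least one untouched flow per group.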

Agenda

36

• Cloud Applications

• Side-Channel Attacks

• Challenges and Solutions

• Ceiling Padding

Page 19: 6620handout5t

Challenge 1

37

s Value | Padding Option 1 | Padding Option 2 | (Prefix) char
473     | 477              | 478              | (c) c
477     | 477              | 478              | (c) d
478     | 499              | 478              | (d) b
499     | 499              | 516              | (d) d
501     | 516              | 516              | (a) c
516     | 516              | 516              | (b) a

Differences and challenges:

• Data utility measures vs. padding cost:

  • Traffic padding: the cost of Option 1 is worse than that of Option 2

  • Data publication: the utility of Function 1 is better than that of Function 2

38

Challenge 2

• Recall that adversaries may combine multiple keystrokes

• Example (second-keystroke s values; rows: first keystroke, columns: second keystroke):

     a    b    c    d
a   487  493  501  497
b   516  488  482  481
c   501  488  473  477
d   543  478  509  499

• One obvious, but invalid, solution: pad every keystroke separately.

  Padding only the colliding second-keystroke entries still leaves the combinations distinguishable via the first keystroke:

          a    b    c    d
a (509)   …    …   501   …
b (504)   …   488   …    …
c (502)  501  488   …    …
d (516)   …    …    …    …

  Even after the first-keystroke values are padded as well (509→516, 502→504), the pairs remain distinct:

          a    b    c    d
a (516)   …    …   501   …
b (504)   …   488   …    …
c (504)  501  488   …    …
d (516)   …    …    …    …

• Another obvious, but invalid, solution: pad the whole string!

Strings | 1st keystroke | 2nd keystroke
ac      | 509           | 501
ca      | 502           | 501
ad      | 509           | 497
dd      | 516           | 499
…       | …             | …

Page 20: 6620handout5t

39

PPTP - Overview of Algorithms

• Intention: to demonstrate the existence of abundant possibilities for approaching the PPTP issue, not to design an exhaustive list of solutions.

• Design three algorithms for partitioning inputs into padding groups.

  • Main difference: the algorithms handle increasingly complicated cases.

  • Computational complexity:

    • svsdSimple: O(n log n)
    • svmdGreedy: O(m × n²) (worst case), O(m × n log n) (average case)
    • mvmdGreedy: O(m × n²) (worst case), O(m × n log n) (average case)

40

Experiment Settings

• Collect data from two real-world web applications:

  • A popular search engine (users' search keywords need to be protected): collect flow-vectors of the query-suggestion widget for all possible combinations of four letters, by crafting requests to simulate the normal AJAX connection request.

  • An authoritative drug information system from a national institute (users' possible health information needs to be protected): collect the vector-action set for all the drug information by mouse selection, following the application's three-level tree-hierarchical navigation.

Page 21: 6620handout5t

41

Overhead - Padding Cost

• The padding cost against k:

  • Compared to rounding with Δ = 512 (engineB) and Δ = 5120 (drugB), which achieves only 5-indistinguishability, our algorithms have less padding cost in both cases.

  • Our algorithms are superior especially when the number of flow-vectors is larger.

42

Overhead – Execution Time

• Generate n-size flow data by synthesizing n/|VA| copies of engineB and drugB.

• The computation time of mvmdGreedy increases slowly with n:

  • practically efficient (1.2 s for 2.7 M flow-vectors);
  • requires slightly more overhead than rounding when rounding is applied with a single Δ value.

• The computation time of mvmdGreedy against the privacy property k:

  • a tighter upper bound: O(m × n × 2k × λ) (worst case), O(m × n × log(2k × λ)) (average case);
  • the computation time increases slowly with k for engineB, and decreases slowly for drugB.

Page 22: 6620handout5t

43

Overhead – Processing Cost

• An application can choose to incorporate the padding at different stages of processing a request; however, we must minimize the number of packets to be padded:

  • pad the flow-vectors on the fly, or
  • modify the original data beforehand.

• The processing cost against k: rounding must pad each flow-vector regardless of k and the application, while our algorithms have much less cost for engineB and slightly less for drugB.

Extension

44

• Adapt l-diversity to address cases where not all inputs should be treated equally in padding.

• Model: capture the information about the inequality.

• Algorithms: need additional constraints on the partition.

Page 23: 6620handout5t

45

Experiments

• Collect data from two real-world web applications:

  • Another popular search engine (users' search keywords need to be protected)

  • An authoritative patent information system from a national institute (companies' patent interests need to be protected)

46

Challenge 3: Ceiling Padding Defeated

2-indistinguishability

Condition  | s Value | Rounding Δ=112 | Δ=144 | Δ=176 | Ceiling Padding
Cancer     | 360     | 448   | 432   | 528   | 360
Cervicitis | 290     | 336   | 432   | 352   | 360
Cold       | 290     | 336   | 432   | 352   | 290
Cough      | 290     | 336   | 432   | 352   | 290
Padding Overhead (%)  | 18.4% | 40.5% | 28.8% | 5.7%

• Facts (ceiling padding): the 2-indistinguishable groups are {Cancer, Cervicitis} → 360 bytes and {Cold, Cough} → 290 bytes.

• Observation: a patient received a 360-byte packet after login.

  • Cancer? Cervicitis? ⇒ 50%, 50%

  • Extra knowledge: this patient is male.

  • Cancer? Cervicitis? ⇒ 100%, 0%

Slides 46-58 are partially based on: W. M. Liu, L. Wang, K. Ren, M. Debbabi,, “Background Knowledge-Resistant Traffic Padding for Preserving User Privacy in Web-Based Applications,” Proc. The 5th IEEE International Conference and on Cloud Computing Technology and Science (IEEE CloudCom 2013),

Page 24: 6620handout5t

47

Solution: Add Randomness

• Random Ceiling Padding: instead of deterministically forming padding groups, the server randomly (uniformly, in this example) selects one of the other three conditions (together with the real condition) to form a padding group for ceiling padding.

Condition  | s Value
Cancer     | 360
Cervicitis | 290
Cold       | 290
Cough      | 290

• A cancer patient always receives a 360-byte packet.

• A cervicitis patient receives a 290-byte packet 66.7% of the time and a 360-byte packet 33.3% of the time.
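The per-condition behavior can be checked by simulation. A minimal sketch of the scheme as described on this slide for k = 2 (the function name is mine):

```python
import random

S_VALUE = {"Cancer": 360, "Cervicitis": 290, "Cold": 290, "Cough": 290}

def random_ceiling_padding(condition, k=2, rng=random):
    """Per request, draw k-1 other conditions uniformly at random and pad
    to the maximum s-value of the transient group."""
    others = [c for c in S_VALUE if c != condition]
    group = [condition] + rng.sample(others, k - 1)
    return max(S_VALUE[c] for c in group)

rng = random.Random(0)
# A cancer patient always gets 360 bytes (360 is the global maximum);
# a cervicitis patient gets 360 only when Cancer is drawn as partner (1/3).
sizes = [random_ceiling_padding("Cervicitis", rng=rng) for _ in range(30000)]
frac = sizes.count(360) / len(sizes)
print(frac)   # roughly 0.333
```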

48

Better Privacy Protection

Diseases   | s Value
Cancer     | 360
Cervicitis | 290
Cold       | 290
Cough      | 290

• Can tolerate adversaries' extra knowledge.

• Suppose an adversary knows a patient is male and he saw s = 360. The possible combinations are:

Patient Has | Server Selects
Cancer      | Cervicitis
Cancer      | Cold
Cancer      | Cough
Cold        | Cancer
Cough       | Cancer

The adversary can now only be 60% sure, instead of 100% sure, that the patient has cancer.

• The cost is not necessarily worse:

  • In this example, the two methods actually lead to exactly the same expected padding and processing costs.
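The 60% figure follows from Bayes' rule over the (patient, partner) combinations that produce a 360-byte packet. A minimal sketch, assuming a uniform prior over the conditions the adversary considers possible and the k = 2 uniform scheme from the previous slide (the `posterior` helper is mine):

```python
from fractions import Fraction

S_VALUE = {"Cancer": 360, "Cervicitis": 290, "Cold": 290, "Cough": 290}

def posterior(possible, observed):
    """Adversary's posterior over the patient's condition, given the
    observed padded size, a uniform prior over `possible`, and one
    uniformly chosen partner condition (k = 2 random ceiling padding)."""
    weights = {}
    for real in possible:
        partners = [c for c in S_VALUE if c != real]
        # Probability that padding the real condition yields `observed`:
        hits = sum(1 for p in partners
                   if max(S_VALUE[real], S_VALUE[p]) == observed)
        weights[real] = Fraction(hits, len(partners))
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

male = ["Cancer", "Cold", "Cough"]   # extra knowledge rules out Cervicitis
post = posterior(male, 360)
print(post["Cancer"])   # -> 3/5
```

Cancer always produces 360 (weight 1), while Cold and Cough produce it only when Cancer is the drawn partner (weight 1/3 each), hence 1 / (1 + 1/3 + 1/3) = 60%.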

Page 25: 6620handout5t

Analysis

49

• Scenario:

  • Algorithm: randomness drawn from a uniform distribution

  • Data: action-sequences and flow-vectors of length one

• Analysis of privacy preservation: Lemma 6.1

• Analysis of costs: Lemma 6.2, Lemma 6.3

Model, Scheme and Experiment

50

• Model: component, privacy, method, cost

• Scheme:

  • Main idea: the server randomly selects members to form the group. Different choices of random distribution lead to different algorithms.

  • Two instantiations of the scheme: bounded uniform distribution and normal distribution; computational complexity O(k).

• Experiment:

  • Data from real-world web applications

  • Low overheads and high uncertainty

Page 26: 6620handout5t

51

Privacy Properties

Model the privacy requirement of traffic padding from two perspectives:

• k-indistinguishability: for any flow-vector, at least k different actions can trigger it. Given a vector-action set VA and a padding algorithm M with range Range(M, VA):

  ∀ v ∈ Range(M, VA): |{ a : Pr(M(a) = v) > 0 ∧ (v, a) ∈ VA }| ≥ k

• Uncertainty: apply the concept of entropy from information theory to quantify an adversary's uncertainty about the real action performed by a user. Given a vector-action sequence VA-seq and a padding algorithm M:

  H(v, VA, M) = − Σ_{a ∈ A} Pr(M(a) = v) log₂ Pr(M(a) = v)

  H(VA, M) = Σ_{v ∈ Range(M, VA)} H(v, VA, M) × Pr(M(A) = v)

  Φ(VA-seq, M) = min_{VA ∈ VA-seq} H(VA, M)
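The entropy-based uncertainty measure can be computed directly from an adversary's posterior. A minimal sketch (the 60/20/20 posterior corresponds to the earlier male-patient example under random ceiling padding; the function name is mine):

```python
from math import log2

def uncertainty(posterior):
    """Shannon entropy (bits) of the adversary's posterior over actions,
    in the spirit of the per-observation uncertainty measure above."""
    return -sum(p * log2(p) for p in posterior.values() if p > 0)

# Deterministic ceiling padding, adversary knows the patient is male:
print(uncertainty({"Cancer": 1.0}))          # 0 bits: no uncertainty left
# Random ceiling padding, 60/20/20 posterior:
print(round(uncertainty({"Cancer": 0.6, "Cold": 0.2, "Cough": 0.2}), 3))  # -> 1.371
```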

52

Privacy Properties (cont.)

• δ-uncertain k-indistinguishability:

An algorithm M gives δ-uncertain k-indistinguishability for a vector-action sequence VA-seq if

  • M w.r.t. any VA ∈ VA-seq satisfies k-indistinguishability, and

  • the uncertainty of M w.r.t. VA-seq is not less than δ.

Page 27: 6620handout5t

53

Padding Method

• Ceiling padding [15][16]:

  • Inspired by PPDP: grouping and breaking.

  • Dominant-vector of a padding group: the size of each group is not less than k, and every flow-vector in a group is padded to the dominant-vector of that group.

  • Achieves k-indistinguishability, but is not sufficient if the adversary possesses prior knowledge.

• Random ceiling padding method: a mechanism M that, when responding to an action a (per user request),

  • randomly selects k−1 other actions, and

  • pads the flow-vector of action a to the dominant-vector of the transient group (those k actions).

• Randomness:

  • Randomly selects members of the transient group from certain candidates based on certain distributions.

  • To reduce the cost, change the probability of an action being selected as a member of the transient group.

54

Cost Metrics

• Expected processing cost: how many flow-vectors need to be padded.

  procCost(VA-seq, M) = Σ_{VA ∈ VA-seq} Σ_{(v,a) ∈ VA} Pr(M(a) ≠ v) / Σ_{VA ∈ VA-seq} |{(v, a) ∈ VA}|

• Expected padding cost: the proportion by which packet sizes increase compared to the original flow-vectors. Given a vector-action sequence VA-seq and a padding algorithm M, for a pair (v, a):

  padCost((v, a), VA, M) = Σ_{v′ ∈ Range(M, VA)} Pr(M(a) = v′) × (|v′| − |v|) / |v|

  padCost(VA, M) = Σ_{(v,a) ∈ VA} padCost((v, a), VA, M)

  padCost(VA-seq, M) = Σ_{VA ∈ VA-seq} padCost(VA, M)

Page 28: 6620handout5t

55

Overview of Scheme

• Main idea:

  • To respond to a user input, the server randomly selects members to form the group.

  • Different choices of random distribution lead to different algorithms.

• Goal:

  • The privacy properties need to be ensured.

  • The costs of achieving such privacy protection should be minimized.

• Two-stage scheme:

  • Stage 1: derive randomness parameters (one-time, an optimization problem);

  • Stage 2: form the transient group (real-time).

56

Scheme (cont.)

• Computational complexity: O(k)

  • Stage 1: pre-calculated only once

  • Stage 2: select k−1 random actions without duplicates, O(k)

• Discussion on privacy:

  • The adversary cannot collect the vector-action set even by acting as a normal user: with |VA| = 100 and k = 20 under the uniform scheme, the number of possible transient groups is C(99, 19) ≈ 2⁶⁶.

  • Approximating the distribution is hard: all users share one random process.

• Discussion on costs:

  • Deterministically incomparable with those of ceiling padding.
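The magnitude claim can be checked directly, assuming the uniform scheme draws the 19 partner actions from the other 99:

```python
import math

# With |VA| = 100 actions and k = 20, each request's transient group is one
# of C(99, 19) possible subsets of partner actions.
n_groups = math.comb(99, 19)
print(2 ** 66 < n_groups < 2 ** 67)   # -> True: roughly 2**66 groups
```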

Page 29: 6620handout5t

57

Instantiations of Scheme

• The scheme can be realized in many different ways: choose group members from different subsets of candidates, based on different distributions, in order to reduce costs.

• Bounded uniform distribution:

  • ct: cardinality of the candidate set

  • cl: number of larger candidates

  • With actions sorted by size, the candidate window for action i is [max(0, min(i − cl, |VA| − ct)), min(max(0, min(i − cl, |VA| − ct)) + ct, |VA|)]

• Normal distribution:

  • µ: mean

  • σ: standard deviation

58

Uncertainty and Costs vs. k

• The padding and processing costs of all algorithms increase with k, while TUNI and NORM have lower costs than SVMD.

• Our algorithms have much larger uncertainty for Drug and slightly larger for Engine.