Optimizing Password Composition Policies Jeremiah Blocki Saranga Komanduri Ariel Procaccia Or...

63
Optimizing Password Composition Policies Jeremiah Blocki Saranga Komanduri Ariel Procaccia Or Sheffet To appear at EC 2013

Transcript of Optimizing Password Composition Policies Jeremiah Blocki Saranga Komanduri Ariel Procaccia Or...

Optimizing Password Composition Policies

Jeremiah BlockiSaranga Komanduri

Ariel ProcacciaOr Sheffet

To appear at EC 2013

2

3

Password Composition Policy

password

Password Composition Policy

4

How Do Users Respond?

Password1

6

Predictable Responses

1. password2. 1234563. 123456784. abc1235. qwerty6. monkey7. letmein8. dragon9. 111111….25. password1

7

Previous Work

• Initial password composition policies designed without empirical data [BDP, 2006].

• User’s respond to password composition policies in predictable ways [KSKMBCCE, 2011]

• Trivial password choices vary widely across contexts [BX, 2012].

• No theoretical models of password composition policies.

8

Our Contributions

We initiate an algorithmic study of password composition policies.

Theoretical Model

Security Goal

Policy Structure

User Model

9

Outline

• User Model• Policy Structure• Goal• Algorithms and Reductions• Experiments

10

Rankings ModelUser 1 User 2 User 3 User 4 User 5 User 6 User 7

password 123456 letmein password 12345 password password

letmein 12345 password 111111 123456 111111 111111

abc123 12345678 baseball Passw0rd 12345678 letmein Passw0rd

Passw0rd password Passw0rd abc123 baseball iloveyou iloveyou

… … … … … … …

qwerty Passw0rd qwerty1 letmein password baseball #$H%*@T

qwerty1 qwerty1 qwerty qwerty P@ssw0rd #$H%*@T letmein

Each User: Passwords P ordered by preference.n = 7 (number of users).

11

Rankings Model: Example 1User 1 User 2 User 3 User 4 User 5 User 6 User 7

password 123456 letmein password 12345 password password

letmein 12345 password 111111 123456 111111 111111

abc123 12345678 baseball Passw0rd 12345678 letmein Passw0rd

Passw0rd password Passw0rd abc123 baseball iloveyou iloveyou

… … … … … … …

qwerty Passw0rd qwerty1 letmein password baseball #$H%*@T

qwerty1 qwerty1 qwerty qwerty P@ssw0rd #$H%*@T letmein

Allowed Passwords All Passwords

𝑨=𝑷− { 𝑝𝑎𝑠𝑠𝑤𝑜𝑟𝑑 ′ }

12

Rankings Model: Example 1

Pr[111111 | A] = 3/7Pr[letmein | A] = 2/7Pr[123456 | A]=Pr[12345 | A]=1/7

User 1 User 2 User 3 User 4 User 5 User 6 User 7

password 123456 letmein password 12345 password password

letmein 12345 password 111111 123456 111111 111111

abc123 12345678 baseball Passw0rd 12345678 letmein Passw0rd

Passw0rd password Passw0rd abc123 baseball iloveyou iloveyou

… … … … … … …

qwerty Passw0rd qwerty1 letmein password baseball #$H%*@T

qwerty1 qwerty1 qwerty qwerty P@ssw0rd #$H%*@T letmein

13

Rankings Model: Example 2User 1 User 2 User 3 User 4 User 5 User 6 User 7

password 123456 letmein password 12345 password password

letmein 12345 password 111111 123456 111111 111111

abc123 12345678 baseball Passw0rd 12345678 letmein Passw0rd

Passw0rd password Passw0rd abc123 baseball iloveyou iloveyou

… … … … … … …

qwerty Passw0rd qwerty1 letmein password baseball #$H%*@T

qwerty1 qwerty1 qwerty qwerty P@ssw0rd #$H%*@T letmein

)}(|{ wNoNumberswPA

Allowed Passwords All Passwords

14

Warm-up

Fact: Let A’ A then for any w A’ Pr[w|A] ≤ Pr[w|A’]

User 1 User 2 User 3 User 4 User 5 User 6 User 7

password 123456 letmein password 12345 password password

letmein 12345 password 111111 123456 111111 111111

abc123 12345678 baseball Passw0rd 12345678 letmein Passw0rd

Passw0rd password Passw0rd abc123 baseball iloveyou iloveyou

… … … … … … …

qwerty Passw0rd qwerty1 letmein password baseball #$H%*@T

qwerty1 qwerty1 qwerty qwerty P@ssw0rd #$H%*@T letmein

Initially one person uses letmein as their password.

letmein

15

Warm-up

Fact: Let A’ A then for any w A’ Pr[w|A] ≤ Pr[w|A’]

User 1 User 2 User 3 User 4 User 5 User 6 User 7

password 123456 letmein password 12345 password password

letmein 12345 password 111111 123456 111111 111111

abc123 12345678 baseball Passw0rd 12345678 letmein Passw0rd

Passw0rd password Passw0rd abc123 baseball iloveyou iloveyou

… … … … … … …

qwerty Passw0rd qwerty1 letmein password baseball #$H%*@T

qwerty1 qwerty1 qwerty qwerty P@ssw0rd #$H%*@T letmein

Every user who used letmein before is still using the same password.

16

Outline

• User Model• Policy Structure– Positive Rules – Negative Rules – Singleton Rules

• Goal• Algorithms and Reductions• Experiments

17

Positive Rules

Rules R1,…,Rm P

R1 = {w | Length(w) 14}.

Active Rules: S {1,…,m}.

.

18

Positive Rules - Example

Rules R1,…,Rm P

R1 = {w | Length(w) 14}.

Active Rules: S {1,…,m}.

A{1}= {w | Length(w) 14}.

19

Negative Rules

Rules R1,…,Rm P

R1 = {w | Length(w) < 8}.

Active Rules: S {1,…,m}.

20

Negative Rules - Example

Rules R1,…,Rm P

R1 = {w | Length(w) < 8}.

Active Rules: S {1,…,m}.

A{1}= P - {w | Length(w) < 8}.

21

Singleton Rules

Rule Rw= {w} for each w P.

Can allow/ban any individual password.

Special Case of Positive Rules/Negative Rules.

22

Outline

• User Model• Policy Structure• Goal• Algorithms and Reductions• Experiments

23

Online Attack

password

Guess Limit: k-strikes policy

12345

12345

p(k, A) – probability of a successful untargeted attack given A.

25

p(k,A) - Example

p(1,A) = Pr[111111] = 3/7p(2,A) = p(1,A) + Pr[letmein] = 5/7p(3,A) = p(2,A) + Pr[123456]= 6/7

User 1 User 2 User 3 User 4 User 5 User 6 User 7

password 123456 letmein password 12345 password password

letmein 12345 password 111111 123456 111111 111111

abc123 12345678 baseball Passw0rd 12345678 letmein Passw0rd

Passw0rd password Passw0rd abc123 baseball iloveyou iloveyou

… … … … … … …

qwerty Passw0rd qwerty1 letmein password baseball #$H%*@T

qwerty1 qwerty1 qwerty qwerty P@ssw0rd #$H%*@T letmein

26

Goal: Optimize p(k,A)

Goal: Find a password composition policy S {1,…,m} which minimizes p(k,AS) for some k.

p(k, A) – Fraction of accounts an adversary can crack with k guesses per account given policy A.

p(1, A): minimum entropy of the password distribution resulting from policy A.

28

Outline

• User Model• Policy Structure• Goal• Algorithms and Reductions• Experiments

29

ResultsRankings Model

Constant k Large k

Singleton Rules P NP-HardAPX-Hard (UGC)

Positive Rules P NP-Hard

Negative Rules n1/3-approx is NP-Hard NP-Hard

This Talk: k=1

n1/3-approx is NP-Hard

Parameters: n, m, |P|

30

Negative Rules are Hard!

Theorem: Unless P = NP no polynomial time algorithm can even approximate p(1,AS) to a factor of n1/3- in the negative rules setting.

31

Reduction

Maximum Independent Set: g vertices e edges

Theorem [Hastad 1996]: NP-Hard to distinguish the following two cases (1) any independent set has size at most K = g or (2) the maximum independent set has size g1-.

32

Reduction (Preference Lists)Preference Lists: Type 1

W1 … W1

W2 … W2

… … …WK … WK

B1 … Bg

… … …

Observation: Unless we ban W1,…,WK we have p(1,AS) ≥ g/n

33

Reduction (Preference Lists)

Preference Lists: Type 2 (for each edge e = {u,v})(u,v,1) … (u,v,g)(v,u,1) … (v,u,g)

X … X… … …

Observation: If for any edge e = {u,v} we ban (u,v,1),…,(u,v,g) and (v,u,1),…,(v,u,g) then p(1,AS) ≥ g/n.

34

Reduction (Preference Lists)

Preference Lists: Type 3 (for each vertex v, i j [K])(v,i,j,1) … (v,i,j,g)(v,j,i,1) … (v,j,i,g)

X … X… … …

Observation: If we ban (v,i,j,1),…,(v,i,j,g) and (v,j,i,1),…,(v,j,i,g) then p(1,AS) ≥ g/n.

35

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rx,4

K=4Preference Lists: Type 1

W1 … W1

W2 … W2

… … …

WK … WK

B1 … Bg

… … …

s

t

36

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rx,4

K=4 Preference Lists: Type 2 (edge e = {u,x})

(u,x,1) … (u,x,g)

(x,u,1) … (x,u,g)

X … X

… … …

s

t

p(1,AS) ≥ g/n

37

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rx,4

K=4 Preference Lists: Type 2 (edge e = {u,s})

(u,s,1) … (u,s,g)

(s,u,1) … (s,u,g)

X … X

… … …

s

t

38

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rx,4

K=4 Preference Lists: Type 2 (edge e = {s,t})

(s,t,1) … (s,t,g)

(t,s,1) … (t,s,g)

X … X

… … …

s

t

39

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rw,4

K=5Preference Lists: Type 1

W1 … W1

W2 … W2

… … …

WK … WK

B1 … Bg

… … …

s

t

p(1,AS) ≥ g/n

40

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rw,4

K=5Preference Lists: Type 1

W1 … W1

W2 … W2

… … …

WK … WK

B1 … Bg

… … …

s

t

Rv,5

41

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rw,4

K=5

s

t

Rv,5

Preference Lists: Type 3 (for each vertex u, i j [K])

(v,2,5,1) … (v,2,5,g)

(v,5,2,1) … (v,5,2,g)

X … X

… … …

p(1,AS) ≥ g/n

42

Reduction (Rules)

Ru,1

Rv,2Rw,3

Rw,4

K=4

s

t

Preference Lists: Type 3 (w, i=4, j=2)

(w,4,2,1) … (w,4,2,g)

(w,2,4,1) … (w,2,4,g)

X … X

… … …

44

ReductionIndependent Set of Size K? maxS [m] p(1,AS)Yes 1/n

No g/n where n = O(g3)

45

ResultsRankings Model

Constant k Large k

Singleton Rules P NP-HardAPX-Hard (UGC)

Positive Rules P NP-Hard

Negative Rules n1/3-approx is NP-Hard NP-Hard

This Talk: k=1

P

Parameters: n, m, |P|

46

Key Difference: Positive vs. Negative

Let S w = {i | w Ri} (all rules Ri that contain w).

Negative Rules: Ban w - activate any rule in Sw.

Positive Rules: Ban w - deactivate all rules in Sw.

47

Positive Rules

Fact: Let S* {1,…m} denote the optimal solution, and let S S* then either

(1) p(1,AS) = p(1,AS*), or (S is optimal)

(2) S-Sw S*, where Pr[w|AS] = p(1,AS).

All rules Ri that contain the most popular word in AS.

48

Positive Rules

Fact: Let S* {1,…m} denote the optimal solution, and let S S* then either

(1) p(1,AS) = p(1,AS*), or (S is optimal)

(2) S-Sw S*, where Pr[w|AS] = p(1,AS).

Proof: Suppose for contradiction that w AS*, and observe that .

Therefore, . Contradiction!S*S AA

Si

iSi

i RR*

S*S*S AA|wA |PrPr,1 wp

49

Positive Rules Algorithm

Iterative Elimination: Initialize: S0 = {1,…,m}

Repeat: (Ban w - current most popular password)

Si+1 = Si – Sw

Claim: One of the Si’s must be the optimal solution!

50

ResultsRankings Model

Constant k Large k

Singleton Rules P NP-HardAPX-Hard (UGC)

Positive Rules P NP-Hard

Negative Rules n1/3-approx is NP-Hard NP-Hard

This Talk: k=1

Question: What if we don’t have access to the full preference lists of each user? What if we don’t want to run in time n?

Parameters: n, m

51

ResultsRankings Model

Constant k Large k

Singleton Rules P NP-HardAPX-Hard (UGC)

Positive Rules P NP-Hard

Negative Rules n1/3-approx is NP-Hard NP-Hard

This Talk: k=1

Sampling Algorithm: ε-approximation with probability 1-δ

Parameters: m, 1/ε, 1/δ

52

Sampling Algorithm

Theorem: There is an efficient algorithm that makes O(m log (m/𝛿)/𝜀2) queries and with probability at least 𝛿 outputs positive rules S ⊆ [m] s.t

p(1,AS) ≤ p(1,AS*)+𝜀.

Sample: q(A) returns w with probability P[w|A].

Idea: Run iterative elimination. In each round use sampling to estimate the probability of the most popular word.

53

Sampling Lemma

Lemma: Let s=100 log (m/)/2 denote the number of samples in each round, and let BADi denote the event that in iteration i, there exists a password w s.t.

(e.g., our probability estimate off by /2). ThenPr[i.BADi]≤

2

|Pr

s

sw w

iSA

# times w sampled

54

Sampling Lemma

Partition P into buckets.

2

Pr

iS

Aw 4

Pr2

iSAw

… …

12

Pr2

iSi iAw

B0 B1 Biw

Contains at mot 2i+1/ such passwords.

55

Sampling Lemma

Partition P into buckets.

s=100 log (m/)/2

… …B0 B1 Bi

2122PrPr

i

wS

ms

sAw

i

Chernoff Bounds:

Contains at most 2i+1/ such passwords.

w

56

Sampling Lemma

Partition P into buckets.

s=100 log (m/)/2

… …B0 B1 Bi

Contains at most 2i+1/ passwords.

Union Bound: 1

1

21 2

2

22Pr.Pr

i

i

i

wSi mms

sAwBw

i

57

Sampling Lemma

Partition P into buckets.

s=100 log (m/)/2

… …B0 B1 Bi

mm

BADi

ii

012

PrUnion Bound (buckets):

58

Sampling Lemma

Partition P into buckets.

s=100 log (m/)/2

… …B0 B1 Bi

mmBADi i.PrUnion Bound (rounds):

64

Outline

• User Model• Policy Structure• Goal• Algorithms and Reductions• Experiments– RockYou Dataset– Rules– Results

65

RockYou Dataset

• RockYou password leak: 32 million plaintext passwords.

• No Preference Lists: Insufficient for our sampling algorithm.

• We test our algorithm under an additional assumption…

66

0.51

Normalized Probabilities

RockYou: initial distribution over P.

0.5

letmein (0.1)

PA

1letmein (0.2)

A

67

Normalized ProbabilitiesRankings Model

Constant k Large k

Singleton Rules P NP-HardAPX-Hard (UGC)

Positive Rules P NP-Hard

Negative Rules n1/3-approx is NP-Hard NP-Hard

Normalized Probabilities Model

Constant k Large k

Singleton Rules P P

Positive Rules P NP-Hard

Negative Rules NP-Hard NP-Hard

71

72

Base Line Results

73

Results

74

Discussion

• Optimal solution was better under negative rules.

• However, sampled solutions were much better with positive rules.

• Interesting Directions:– Additional Rules?– Is the Normalized Probabilities Model reasonable?– General experiment in preference list model?

75

Open Questions

• Efficient approximation algorithm in negative rules setting with normalized probabilities assumption?

• Adversary with limited background knowledge about the user (e.g., age, gender, birthday).