A Comprehensive Quality Evaluation of Security and Privacy Advice on the Web
Elissa M. Redmiles, Noel Warford, Amritha Jayanti, Aravind Koneru, Sean Kross, Miraida Morales, Rock Stevens, and Michelle L. Mazurek
People must learn a variety of security & privacy behaviors
By Elissa Redmiles, May 16, 2017
Estimates of damage caused by phishing vary widely, ranging from $61 million per year to $3 billion per year of direct losses to victims in the U.S.
By Jason Hong
Despite advances on core security problems, user decisions can still lead to significant security risks
How do they learn security? Is security education working?
Ecosystem-wide quality measurement of one of the most prevalent security education sources: online articles
Where is the Digital Divide? A Survey of Security, Privacy, and Socioeconomics. CHI 2017.
How I Learned to be Secure: a Census-Representative Survey of Security Advice Sources and Behavior. CCS 2016.
Evaluate quality of corpus along three axes
Comprehensibility: can users understand the document?
Actionability: can users follow the advice?
Accuracy: will following the advice make users more secure?
Collected representative corpus of online security advice
Step 1: Collect documents based on user-generated searches & expert recommendations
User Generated Search Queries (989 docs)
• List 5 search queries for each of 3 digital security topics you’re interested in learning more about
• Show up to 6 security & privacy news articles
• First one they indicate interest in: ask for 3 search queries
Expert Recommended Advice (889 docs)
• 10 security experts & librarians
Step 2: Crowd workers clean corpus: “Is this document about online privacy/security?”
1,264 documents left after cleaning
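The cleaning step can be pictured as a vote filter over crowd answers to the on-topic question. The sketch below is only an illustration: the slides do not say how many workers judged each document or how their answers were combined, so the majority rule, the `keep_document` and `clean_corpus` helpers, and the example votes are all assumptions.

```python
def keep_document(votes, min_share=0.5):
    """Return True when more than min_share of votes say the document is on-topic.

    votes: list of booleans, one per crowd worker
           (True = "yes, this document is about online privacy/security").
    The majority threshold is an assumption, not taken from the talk.
    """
    if not votes:
        raise ValueError("need at least one vote per document")
    yes_votes = sum(1 for vote in votes if vote)
    return yes_votes / len(votes) > min_share


def clean_corpus(votes_by_doc):
    """Keep the ids of documents that pass the vote filter, in input order."""
    return [doc_id for doc_id, votes in votes_by_doc.items() if keep_document(votes)]


# Hypothetical votes for three documents.
example_votes = {
    "doc-a": [True, True, False],
    "doc-b": [False, False, True],
    "doc-c": [True, True, True],
}
kept = clean_corpus(example_votes)
```

Here `kept` holds the ids of the documents that a majority of workers judged to be about online privacy or security.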
What to use when evaluating security documents?
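[Figure: decision tree for choosing a readability measure. Decision nodes: "Domain-Specific Application?", "First-Glance Perception Matters?", "Expect Uniform Distribution?" (each answered Yes/No). Recommended measures at the leaves: Cloze, Smart Cloze, FRES, Ease.]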
Figure Credit: Comparing and Developing Tools to Measure the Readability of Domain-Specific Texts. EMNLP 2019.
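For readers unfamiliar with FRES (the Flesch Reading Ease Score named in the figure), here is a minimal sketch of the standard formula. The function name is illustrative, and the word, sentence, and syllable counts are assumed to be computed elsewhere; syllable counting in particular is left out because it needs its own heuristics.

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Standard Flesch Reading Ease formula; higher scores mean easier text.

    FRES = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical counts: 100 words, 5 sentences, 150 syllables.
example_score = flesch_reading_ease(100, 5, 150)
```

Because FRES depends only on surface counts, it cannot tell whether a reader knows a domain term such as "two-factor authentication", which is one reason a domain-aware measure might be preferred for security documents.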
Smart Cloze tool creates domain-relevant distractors
Use NLP techniques to generate four grammatically-probable distractors:
• two distractors drawn from a domain-specific dictionary we generate
• two from a general dictionary
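A rough sketch of the two-plus-two distractor selection described above. It is not the Smart Cloze tool's actual code: the `make_distractors` helper and the word lists are hypothetical, and the "grammatically-probable" filtering is assumed to have happened upstream, so both candidate lists already fit the blank's part of speech.

```python
import random


def make_distractors(answer, domain_words, general_words, rng=None):
    """Pick two domain-specific and two general distractors for one cloze blank.

    domain_words / general_words: candidate words assumed to already be
    grammatically plausible for the blank (that filtering is not shown here).
    """
    rng = rng or random.Random()
    domain_pool = sorted({w for w in domain_words if w != answer})
    general_pool = sorted(
        {w for w in general_words if w != answer and w not in domain_pool}
    )
    if len(domain_pool) < 2 or len(general_pool) < 2:
        raise ValueError("need at least two candidates in each dictionary")
    return rng.sample(domain_pool, 2) + rng.sample(general_pool, 2)


# Hypothetical blank whose correct answer is "password".
example_distractors = make_distractors(
    "password",
    domain_words=["firewall", "malware", "password", "router"],
    general_words=["garden", "window", "bottle"],
    rng=random.Random(0),
)
```

The test-taker would then choose among the correct word and these four alternatives.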
Each document evaluated by three test-takers, who had excellent reliability (ICC > 0.90)
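ICC here is the intraclass correlation coefficient, a measure of agreement among raters. The slides do not say which ICC variant was used, so the sketch below assumes the simplest one, the one-way random-effects ICC(1,1), and uses made-up scores.

```python
def icc_one_way(ratings):
    """One-way random-effects ICC(1,1).

    ratings: list of rows, one row per rated item, each row holding the
    scores given by k raters. This variant is an assumption; the talk only
    reports the resulting ICC values.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two rated items")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    row_means = [sum(row) / k for row in ratings]
    grand_mean = sum(row_means) / n
    # Mean square between items and mean square within items.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ratings have no variance; ICC is undefined")
    return (ms_between - ms_within) / denominator


# Hypothetical scores from three raters who agree perfectly on three items.
perfect_agreement = icc_one_way([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

Values near 1 indicate that most of the variation in scores comes from differences between items rather than disagreement among raters.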
Census-representative sample of test takers
55% of documents at least partially comprehensible
Average doc perceived as “somewhat” easy to read
Mean = 47.5%
[Chart legend: partial comprehension, full comprehension]
Variance within domain groupings: some government providers far more comprehensible than others
[Chart legend: partial comprehension, full comprehension]
Evaluate quality of corpus along three axes
Comprehensibility: measure with Smart Cloze & perceived ease
Actionability: can users follow the advice?
Accuracy: will following the advice make users more secure?
55% of documents at least partially comprehensible
To measure actionability (and accuracy), we need to extract advice imperatives from documents
Two research assistants manually annotated 1,264 documents to extract imperatives
Started with literature-grounded taxonomy of 194 codes; 206 new codes discovered through annotation
374 unique advice imperatives
2,780 pieces of advice
securityadvice.cs.umd.edu
12 high level topics of security advice
Four theoretically-grounded actionability sub-metrics
Time Consumption: how time consuming would it be to follow this advice? Grounded in economic frameworks (cost)
Difficulty: how difficult would it be to follow this advice? Grounded in HiTL, the human-in-the-loop framework (capabilities)
Confidence: how confident is the user that they can follow the advice? Grounded in PMT, protection motivation theory (perceived ability), & HiTL (knowledge acquisition)
Disruption: how disruptive would it be to follow this advice? Grounded in economic frameworks (cost)
Answered on a Likert Scale: Very to Not at All
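To compare or aggregate answers like these, the labels are typically mapped to ordinal numbers. The sketch below assumes a four-point scale ("not at all", "slightly", "somewhat", "very") and a median across evaluators. The slides give only the scale's endpoints and do not describe the aggregation, so both choices are illustrative.

```python
from statistics import median

# Assumed four-point scale, ordered from lowest to highest.
LIKERT_LEVELS = ["not at all", "slightly", "somewhat", "very"]


def encode(label):
    """Map a Likert label to its ordinal position (0 = "not at all")."""
    return LIKERT_LEVELS.index(label.strip().lower())


def median_rating(labels):
    """Median ordinal rating across the evaluators of one piece of advice."""
    return median(encode(label) for label in labels)


# Hypothetical confidence answers from three evaluators for one piece of advice.
example_rating = median_rating(["Very", "Somewhat", "Slightly"])
at_least_somewhat = example_rating >= encode("somewhat")
```

A threshold such as `at_least_somewhat` is one way to express summary statements of the form "rated somewhat or higher".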
Each piece of advice evaluated by three evaluators, who had good reliability (ICC > 0.85)
Census-representative sample of evaluators
Majority of advice rated as actionable
3/4 of advice: “somewhat”+ confident
2/3 of advice: at most “slightly” time consuming, disruptive, and difficult
20% of documents contain at least one unactionable piece of advice
Evaluate quality of corpus along three axes
Comprehensibility: measure with Smart Cloze & perceived ease
Actionability: can users follow the advice?
Accuracy: will following the advice make users more secure?
People are somewhat or very confident about implementing 3/4 of advice
2/3 considered at most slightly time consuming, disruptive, or difficult to implement
55% of documents at least partially comprehensible
Recruit security experts to evaluate advice accuracy
Recruitment
41 Experts
Qualification
2+
CTF, pen testing, secure development
OR those who are certified
Ask experts to evaluate impact on risk & to prioritize
Perceived accuracy: accurate, useless, harmful
Risk reduction (or increase): 0-50+%
Priority: number 1, top 3, top 5, top 10
Each piece of advice evaluated by three experts, who had good reliability (ICC > 0.85)
Average of 38 pieces of advice evaluated by each expert
Experts perceive 333 pieces of advice (89%) as accurate
All documents contain at least one piece of accurate advice
Experts are a bit more discerning when prioritizing advice, but 118 pieces of advice are rated in the “top 5”
Used matrix factorization to generate full ranked list across all votes
Top Advice
#1 Use unique passwords for different accounts
#2 Update devices
#3 Use anti-malware software
#4 Scan attachments you open for viruses
…
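The talk says only that matrix factorization was used to turn the experts' votes into one ranked list. As an illustration of the general idea, and not the authors' procedure, the sketch below fits a rank-1 factorization (score ≈ expert factor × advice factor) to a sparse expert-by-advice score matrix with alternating least squares, then sorts advice by its factor. The numeric encoding of votes (for example 4 = "number 1", 3 = "top 3", 2 = "top 5", 1 = "top 10") and all names are assumptions.

```python
def rank_advice(scores, iterations=50):
    """Rank advice via a rank-1 factorization fitted on observed votes only.

    scores: dict mapping (expert_id, advice_id) -> numeric priority score,
            higher meaning more important (a hypothetical encoding).
    """
    experts = sorted({expert for expert, _ in scores})
    advice_ids = sorted({advice for _, advice in scores})
    expert_factor = {expert: 1.0 for expert in experts}
    advice_factor = {advice: 1.0 for advice in advice_ids}
    for _ in range(iterations):
        # Least-squares update of each advice factor, expert factors held fixed.
        for target in advice_ids:
            observed = [(e, s) for (e, a), s in scores.items() if a == target]
            num = sum(s * expert_factor[e] for e, s in observed)
            den = sum(expert_factor[e] ** 2 for e, _ in observed)
            advice_factor[target] = num / den if den else 0.0
        # Least-squares update of each expert factor, advice factors held fixed.
        for target in experts:
            observed = [(a, s) for (e, a), s in scores.items() if e == target]
            num = sum(s * advice_factor[a] for a, s in observed)
            den = sum(advice_factor[a] ** 2 for a, _ in observed)
            expert_factor[target] = num / den if den else 0.0
    return sorted(advice_ids, key=lambda a: advice_factor[a], reverse=True)


# Hypothetical votes: each expert scored only two of the three pieces of advice.
example_scores = {
    ("expert-0", "unique-passwords"): 4, ("expert-0", "update-devices"): 2,
    ("expert-1", "update-devices"): 2, ("expert-1", "scan-attachments"): 1,
    ("expert-2", "unique-passwords"): 4, ("expert-2", "scan-attachments"): 1,
}
example_ranking = rank_advice(example_scores)
```

The appeal of a factorization here is that each expert rated only a subset of the advice, so a full ranking has to be inferred from partially overlapping votes.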
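Users’ reported adoption of advice correlates with actionability & prioritization
[Figure: correlations between Reported User Adoption and (a) Expert Priority Ranking of Advice, (b) User Priority Ranking of Advice, and (c) Advice Actionability Ratings (Confidence, Time Consumption, Disruption, Difficulty). Coefficients shown for the priority rankings: r = 0.600, r = 0.212. Coefficients shown for the actionability ratings: r = 0.391, r = 0.305, r = 0.355, r = 0.367.]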
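The r values on this slide are correlation coefficients. The slide does not say which kind, so the sketch below assumes Pearson's r and uses made-up per-advice numbers purely to show the computation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Hypothetical data: one confidence rating and one adoption rate per piece of advice.
confidence_ratings = [1.0, 2.0, 2.5, 3.0]
reported_adoption = [0.20, 0.35, 0.50, 0.55]
example_r = pearson_r(confidence_ratings, reported_adoption)
```

An r between 0.3 and 0.6, as reported on the slide, indicates a moderate positive association rather than a deterministic relationship.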
Problem with online security advice: there is too much
Comprehensibility: average document is “partially” comprehensible to the average U.S. user
• Leaves behind low-literacy users
Actionability: majority of advice rated as actionable, and actionability correlates with prioritization & adoption
• Data storage & network security advice not very actionable
• 20% of documents contain at least one unactionable piece of advice
Accuracy: 89% of advice rated accurate
• Lack of prioritization & falsifiability: experts think (almost) all the advice is great
Future of Security Advice
Now What?
Future of security advice requires falsifiability for security claims and empirical studies to narrow down behaviors
A Comprehensive Quality Evaluation of Security and Privacy Advice on the Web
Collected a corpus of 1,264 security advice documents
• Through user generated queries and expert recommendations
Evaluated quality along three axes
• Average document is partially comprehensible to the average U.S. user
• Majority of advice rated actionable; actionability correlated w/ reported behavior
• 89% of advice rated accurate by experts
Experts can’t narrow down advice; need empirical science
• Experts struggle to identify the most impactful advice
• We need more concrete measurement & falsifiability
Elissa M. Redmiles, Noel Warford, Amritha Jayanti, Aravind Koneru, Sean Kross, Miraida Morales, Rock Stevens, and Michelle L. Mazurek
@eredmil1