CSE 127: Computer Security - cseweb.ucsd.edu

62
CSE 127: Computer Security Modern client-side defenses Deian Stefan

Transcript of CSE 127: Computer Security - cseweb.ucsd.edu

Page 1: CSE 127: Computer Security - cseweb.ucsd.edu

CSE 127: Computer Security Modern client-side defenses

Deian Stefan

Page 2: CSE 127: Computer Security - cseweb.ucsd.edu

Today

How can we build flexible and secure client-side web applications (from vulnerable/untrusted components)

Page 3: CSE 127: Computer Security - cseweb.ucsd.edu

Modern web sites are complicated

Page 4: CSE 127: Computer Security - cseweb.ucsd.edu

• Page developer

• Library developers

• Service providers

• Data provides

• Ad providers

• CDNs

• Network provider

Many acting parties on a site

Page 5: CSE 127: Computer Security - cseweb.ucsd.edu

• How do we protect page from ads/services?

• How to share data with a cross-origin page?

• How to protect one user from another’s content?

• How do we protect the page from a library?

• How do we protect the page from the CDN?

• How do we protect the page from network provider?

Page 6: CSE 127: Computer Security - cseweb.ucsd.edu

Recall: Same origin policy

Idea: isolate content from different origins ➤ E.g., can’t access document of cross-origin page

➤ E.g., can’t inspect responses from cross-origin

c.com b.coma.com

postMessage

✓JSON

DOM access✓

Page 7: CSE 127: Computer Security - cseweb.ucsd.edu

Why is the SOP not good enough?

Page 8: CSE 127: Computer Security - cseweb.ucsd.edu

The SOP is not strict enough

• Third-party libs run with privilege of the page

• Code within page can arbitrarily leak/corrupt data ➤ How? So, how should we isolate untrusted code?

• iframes isolation is limited ➤ Can’t isolate user-provided content from page (why?)

➤ Can’t isolate third-party ad placed in iframe (why?)

Page 9: CSE 127: Computer Security - cseweb.ucsd.edu

The SOP is not strict enough

• Third-party libs run with privilege of the page

• Code within page can arbitrarily leak/corrupt data ➤ How? So, how should we isolate untrusted code?

• iframes isolation is limited ➤ Can’t isolate user-provided content from page (why?)

➤ Can’t isolate third-party ad placed in iframe (why?)

Page 10: CSE 127: Computer Security - cseweb.ucsd.edu
Page 11: CSE 127: Computer Security - cseweb.ucsd.edu

The SOP is not flexible enough

• Can’t read cross-origin responses ➤ What if we want to fetch data from provider.com?

➤ JSON with padding (JSONP)

- To fetch data, insert new script tag: <script src=“https://provider.com/getData?cb=f”></script>

- To share data, reply back with script wrapping data f({ ...data...})

Page 12: CSE 127: Computer Security - cseweb.ucsd.edu

http://example.com

provider.com

Page 13: CSE 127: Computer Security - cseweb.ucsd.edu

Why is JSONP is a terrible thing?

• Provider data can easily be leaked (CSRF)

• Page is not protected from provider (XSS)

Page 14: CSE 127: Computer Security - cseweb.ucsd.edu

Why is JSONP is a terrible thing?

• Provider data can easily be leaked (CSRF)

• Page is not protected from provider (XSS)

Page 15: CSE 127: Computer Security - cseweb.ucsd.edu

The SOP is also not enough security-wise

Page 16: CSE 127: Computer Security - cseweb.ucsd.edu

Outline: modern mechanisms

• iframe sandbox

• Content security policy (CSP)

• HTTP strict transport security (HSTS)

• Subresource integrity (SRI)

• Cross-origin resource sharing (CORS)

Page 17: CSE 127: Computer Security - cseweb.ucsd.edu

iframe sandbox

Idea: restrict actions iframe can perform

Approach: set sandbox attribute, by default: ➤ disallows JavaScript and triggers (autofocus,

autoplay videos etc.)

➤ disallows form submission

➤ disallows popups

➤ disallows navigating embedding page

➤ runs page in unique origin: no storage/cookies

Page 18: CSE 127: Computer Security - cseweb.ucsd.edu

Grant privileges explicitly

Can enable dangerous features with: ➤ allow-scripts: allows JS + triggers (e.g., autofocus)

➤ allow-forms: allow form submission

➤ allow-popups: allow iframe to create popups

➤ allow-top-navigation: allow breaking out of frame

➤ allow-same-origin: retain original origin

Page 19: CSE 127: Computer Security - cseweb.ucsd.edu

Grant privileges explicitly

Page 20: CSE 127: Computer Security - cseweb.ucsd.edu

What can you do with iframe sandbox?

• Run content in iframe with least privilege ➤ Only grant content privileges it needs

• Privilege separate page into multiple iframes ➤ Split different parts of page into sandboxed iframes

Page 21: CSE 127: Computer Security - cseweb.ucsd.edu

Least privilege: twitter button

Page 22: CSE 127: Computer Security - cseweb.ucsd.edu

Least privilege: twitter button

• Let’s embed button on our site ➤ How do they suggest doing it?

➤ What’s the problem with this embedding approach?

<a href="https://twitter.com/share?ref_src=twsrc%5Etfw" class="twitter-share-button" data-show-count="false">Tweet</a><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

Page 23: CSE 127: Computer Security - cseweb.ucsd.edu

Least privilege: twitter button

• Let’s embed button on our site ➤ How do they suggest doing it?

➤ What’s the problem with this embedding approach?

<a href="https://twitter.com/share?ref_src=twsrc%5Etfw" class="twitter-share-button" data-show-count="false">Tweet</a><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

Page 24: CSE 127: Computer Security - cseweb.ucsd.edu

Least privilege: twitter button

• Better approach: put it in an iframe iframe

➤ Is this good enough?

➤ What can a malicious/compromised twitter still do?

<iframe src="https://platform.twitter.com/widgets/tweet_button.html" style="border: 0; width:130px; height:20px;"></iframe>

Page 25: CSE 127: Computer Security - cseweb.ucsd.edu

• Use iframe sandbox: remove all permissions and then enable JavaScript, popups, etc.

➤ Is the allow-same-origin unsafe?

➤ Why do you think we need all the other bits?

<iframe src=“https://platform.twitter.com/widgets/tweet_button.html" sandbox=“allow-same-origin allow-scripts allow-popups allow-forms” style="border: 0; width:130px; height:20px;"></iframe>

Least privilege: twitter button

Page 26: CSE 127: Computer Security - cseweb.ucsd.edu

Privilege separation: blog feed• Typically include user content inline:

➤ Problem with this?

• With iframe sandbox:

➤ May need allow-scripts - why?

➤ Is allow-same-origin safe too?

<div class=“post”> <div class=“author”>{{post.author}}</div> <div class=“body”>{{post.body}}</div></div>

<iframe sandbox srcdoc=“...<div class=“post”> <div class=“author”>{{post.author}}</div> <div class=“body”>{{post.body}}</div></div>...”></iframe>

Page 27: CSE 127: Computer Security - cseweb.ucsd.edu

Privilege separation: blog feed• Typically include user content inline:

➤ Problem with this?

• With iframe sandbox:

➤ May need allow-scripts - why?

➤ Is allow-same-origin safe too?

<div class=“post”> <div class=“author”>{{post.author}}</div> <div class=“body”>{{post.body}}</div></div>

<iframe sandbox srcdoc=“...<div class=“post”> <div class=“author”>{{post.author}}</div> <div class=“body”>{{post.body}}</div></div>...”></iframe>

Page 28: CSE 127: Computer Security - cseweb.ucsd.edu

What are some limitations of iframe sandbox?

Page 29: CSE 127: Computer Security - cseweb.ucsd.edu

Too strict vs. not strict enough

• Consider running library in sandboxed iframes ➤ E.g., password strength checker

➤ Desired guarantee: checker cannot leak password

• Problem: sandbox does not restrict exfiltration ➤ Can use XHR to write password to b.ru

b.ru/chk.htmla.com

Page 30: CSE 127: Computer Security - cseweb.ucsd.edu

• Can we limit the origins that the page (iframe or otherwise) can talk talk to? ➤ Can only leak to a trusted set of origins

➤ Gives us a more fine-grained notion of least privilege

• This can also prevent or limit damages due to XSS

Too strict vs. not strict enough

Page 31: CSE 127: Computer Security - cseweb.ucsd.edu

Outline: modern mechanisms

• iframe sandbox (quick refresher)

• Content security policy (CSP)

• HTTP strict transport security (HSTS)

• Subresource integrity (SRI)

• Cross-origin resource sharing (CORS)

Page 32: CSE 127: Computer Security - cseweb.ucsd.edu

Content Security Policy (CSP)

• Idea: restrict resource loading to a whitelist ➤ By restricting to whom page can talk to: restrict

where data is leaked!

• Approach: send page with CSP header that contains fine-grained directives ➤ E.g., allow loads from CDN, no frames, no plugins

Content-Security-Policy: default-src https://cdn.example.net; child-src 'none'; object-src 'none'

Page 33: CSE 127: Computer Security - cseweb.ucsd.edu

script-src: where you can load scripts from

connect-src: limits the origins you can XHR to

font-src: where to fetch web fonts form

form-action: where forms can be submitted

child-src: where to load frames/workers from

img-src: where to load images from

default-src: default fallback

Page 34: CSE 127: Computer Security - cseweb.ucsd.edu

Special keywords• ‘none’ - match nothing

• ‘self’ - match this origin

• ‘unsafe-inline’ - allow unsafe JS & CSS

• ‘unsafe-eval’ - allow unsafe eval (and the like)

• http: - match anything with http scheme

• https: - match anything with https scheme

• * - match anything

Page 35: CSE 127: Computer Security - cseweb.ucsd.edu

How can CSP help with XSS?

• If you list all places you can load scripts from: ➤ Only execute code from trusted origins

➤ Remaining vector for attack: inline scripts

• CSP by default disallows inline scripts ➤ If scripts are enabled at least it disallows eval

Page 36: CSE 127: Computer Security - cseweb.ucsd.edu

Adoption challenge

• Problem: inline scripts are widely-used ➤ Page authors use the ‘unsafe-inline' directive

➤ Is this a problem?

• Solution: script nonce and script hash ➤ Allow scripts that have a particular hash

➤ Allow scripts that have a specific nonce

Page 37: CSE 127: Computer Security - cseweb.ucsd.edu

Adoption challenge

• Problem: inline scripts are widely-used ➤ Page authors use the ‘unsafe-inline' directive

➤ Is this a problem?

• Solution: script nonce and script hash ➤ Allow scripts that have a particular hash

➤ Allow scripts that have a specific nonce

Page 38: CSE 127: Computer Security - cseweb.ucsd.edu

Other adoption challenges

• Goal: set most restricting CSP that is permissive enough to not break existing app

• How can you figure this out for a large app? ➤ CSP has a report-only header and report-uri directive

➤ Report violations to server; don’t enforce

• In practice: devs hardly ever get the policy right

Page 39: CSE 127: Computer Security - cseweb.ucsd.edu

How is CSP really used in practice?

in this case is the resource from c.com. In other words, strict-dynamic allows you to vet a “root” script using a nonce or a hash.That “root” script then has the ability to load other resourcesunmolested.2

• unsafe-inline: Any inline script is allowed. This is very inse-cure.

• unsafe-eval: Allows the use of eval() in allowed resources.This is also unsafe as it allows scripts to inject any JavaScriptonto the page, bypassing any CSPs currently in place.

3 METHODOLOGYIn order to obtain our data set, we performed a simple crawl ofthe Alexa top 1 million sites [2] using a headless browser in or-der to collect the Content-Security-Policy headers present ineach request. With this data, we normalized the policies remov-ing unnecessary white space and sorting both the directives andtheir associated values in lexicographical order. Additionally, wenormalized the ‘nonce’ and ‘sha256’ values by removing theunique content of the nonce and hash values. After identifying5,240 sites from our initial corpus that returned CSP headers withthe ‘script-src’ directive and the ‘unsafe-inline’ keywordwe revisited those sites, this time using a proxy to capture the con-tent of all scripts loaded by each site. We constructed a graph whosevertices are the origins of all loaded scripts and whose edges repre-sent the loading of a script by another script. Finally, we visited thesame 5240 sites again the following day and in order to determinewhether each script changed across visits providing and estimateof the percentage of scripts which are dynamically generated.

3.1 LimitationsThe Internet is constantly growing and evolving and as a result ourdata represents only a snapshot of a small portion of the greaterInternet. Increasingly websites serve di�erent content based upon avisitors browser cookies, device type, and other factors. Our freshlyinstantiated, i.e. without any cookies, headless browsing crawlermay receive di�erent content than the average visitor to the samewebsites. Additionally, for any given website, we browse only thelanding page, not exhaustively exploring all links on each page.

4 RESULTSOut of the 1 million sites that we visit 39,022 or approximately 3.9%return the Content-Security-Policy header and in total we �nd6,670 unique normalized policies. CSP usage is trending upwardincreasing from 1.6% of sites in 2016 [14] to 1.97% in 2017 [5] to3.9% in 2018. The 2016 number is based upon a Google searchindex of approximately 1 billion sites while the 2017 and 2018numbers are both based upon the Alexa top 1 million sites at thetime. The usage frequencies of the various directives are displayedin �gure 1. Our data indicates that the most common usage of CSPtoday is to require the usage of HTTPS. The ‘upgrade-insecure-requests’ directive is themost commonly used directive appearingin 56.74% of policies. This represents a dramatic change from the2It’s important to note that since eval is blocked by default when using CSP, in orderfor the script from b.com to inject new JavaScript resources onto the page it willneed to use some other API method such as document.createElement() which willthen need to be added to the document through document.body.appendChild() orsimilar.

2016 data from Weichselbaum et al. [14] where it appeared in only1.87% of policies. The adoption of forced HTTPS is certainly apositive security trend. The ‘frame-ancestors’ directive has alsogained popularity appearing in only 8.11% of policies in 2016 whileappearing in 54.07% of policies in 2018.

Figure 1: Distribution of CSP directives.

The directive that primarily deals with XSS mitigation is the‘script-src’ directive. In our data this directive appears in 7,317or 18.75% of policies. Figure 2 shows the frequencies of various typesof values included in policies with the ’script-src’ directive.The ‘unsafe-inline’ keyword which renders the policy entirelyuseless for XSS protection appears in 90.69% of policies. The mostsecure keyword ‘sha256’ appears in 7.95% of policies in our dataset, up from just 1.65% in 2016. The next best alternative ‘nonce’appears in 6.57% of policies up from 0.92% in 2016. The use ofwildcards in sources also presents security risks. Only 18.38% ofpolicies that specify at least one host include exclusively hostswith fully speci�ed paths. Of all policies with the ‘srcipt-src’directive 38.28% include at least one host with a wildcard as and18.03% include the general wildcard. Whitelisting entire protocols,i.e. https:// or http://, is also extremely insecure and 59.87% ofpolicies include at least one protocol among their sources.

Figure 3 shows the most commonly whitelisted hosts. Google’spervasiveness on the Internet is apparent as 13 of the top 15 mostwhite-listed hosts are Google domains with the remaining 2 comingfrom Facebook.

Figure 4 shows the origins of the scripts most commonly re-quested by other third party scripts. The most common use casesfor scripts loading other scripts are advertisements and tracking.These account for 59.11% of scripts loaded by other scripts. Wede�ne ads and tracking scripts based upon the uBlock Origin [6]default �lter settings.

The loading of dynamically generated scripts prevents the useof the most secure keyword, ‘sha256’. According to our data ofthe 37,087 scripts loaded by sites that include the ‘script-src’directive and the ‘unsafe-inline’ keyword in their CSPs 13.06%are dynamically generated and 43.03% of sites load at least onedynamically generated script.

3

Page 40: CSE 127: Computer Security - cseweb.ucsd.edu

How is CSP really used in practice?

in this case is the resource from c.com. In other words, strict-dynamic allows you to vet a “root” script using a nonce or a hash.That “root” script then has the ability to load other resourcesunmolested.2

• unsafe-inline: Any inline script is allowed. This is very inse-cure.

• unsafe-eval: Allows the use of eval() in allowed resources.This is also unsafe as it allows scripts to inject any JavaScriptonto the page, bypassing any CSPs currently in place.

3 METHODOLOGYIn order to obtain our data set, we performed a simple crawl ofthe Alexa top 1 million sites [2] using a headless browser in or-der to collect the Content-Security-Policy headers present ineach request. With this data, we normalized the policies remov-ing unnecessary white space and sorting both the directives andtheir associated values in lexicographical order. Additionally, wenormalized the ‘nonce’ and ‘sha256’ values by removing theunique content of the nonce and hash values. After identifying5,240 sites from our initial corpus that returned CSP headers withthe ‘script-src’ directive and the ‘unsafe-inline’ keywordwe revisited those sites, this time using a proxy to capture the con-tent of all scripts loaded by each site. We constructed a graph whosevertices are the origins of all loaded scripts and whose edges repre-sent the loading of a script by another script. Finally, we visited thesame 5240 sites again the following day and in order to determinewhether each script changed across visits providing and estimateof the percentage of scripts which are dynamically generated.

3.1 LimitationsThe Internet is constantly growing and evolving and as a result ourdata represents only a snapshot of a small portion of the greaterInternet. Increasingly websites serve di�erent content based upon avisitors browser cookies, device type, and other factors. Our freshlyinstantiated, i.e. without any cookies, headless browsing crawlermay receive di�erent content than the average visitor to the samewebsites. Additionally, for any given website, we browse only thelanding page, not exhaustively exploring all links on each page.

4 RESULTSOut of the 1 million sites that we visit 39,022 or approximately 3.9%return the Content-Security-Policy header and in total we �nd6,670 unique normalized policies. CSP usage is trending upwardincreasing from 1.6% of sites in 2016 [14] to 1.97% in 2017 [5] to3.9% in 2018. The 2016 number is based upon a Google searchindex of approximately 1 billion sites while the 2017 and 2018numbers are both based upon the Alexa top 1 million sites at thetime. The usage frequencies of the various directives are displayedin �gure 1. Our data indicates that the most common usage of CSPtoday is to require the usage of HTTPS. The ‘upgrade-insecure-requests’ directive is themost commonly used directive appearingin 56.74% of policies. This represents a dramatic change from the2It’s important to note that since eval is blocked by default when using CSP, in orderfor the script from b.com to inject new JavaScript resources onto the page it willneed to use some other API method such as document.createElement() which willthen need to be added to the document through document.body.appendChild() orsimilar.

2016 data from Weichselbaum et al. [14] where it appeared in only1.87% of policies. The adoption of forced HTTPS is certainly apositive security trend. The ‘frame-ancestors’ directive has alsogained popularity appearing in only 8.11% of policies in 2016 whileappearing in 54.07% of policies in 2018.

Figure 1: Distribution of CSP directives.

The directive that primarily deals with XSS mitigation is the‘script-src’ directive. In our data this directive appears in 7,317or 18.75% of policies. Figure 2 shows the frequencies of various typesof values included in policies with the ’script-src’ directive.The ‘unsafe-inline’ keyword which renders the policy entirelyuseless for XSS protection appears in 90.69% of policies. The mostsecure keyword ‘sha256’ appears in 7.95% of policies in our dataset, up from just 1.65% in 2016. The next best alternative ‘nonce’appears in 6.57% of policies up from 0.92% in 2016. The use ofwildcards in sources also presents security risks. Only 18.38% ofpolicies that specify at least one host include exclusively hostswith fully speci�ed paths. Of all policies with the ‘srcipt-src’directive 38.28% include at least one host with a wildcard as and18.03% include the general wildcard. Whitelisting entire protocols,i.e. https:// or http://, is also extremely insecure and 59.87% ofpolicies include at least one protocol among their sources.

Figure 3 shows the most commonly whitelisted hosts. Google’spervasiveness on the Internet is apparent as 13 of the top 15 mostwhite-listed hosts are Google domains with the remaining 2 comingfrom Facebook.

Figure 4 shows the origins of the scripts most commonly re-quested by other third party scripts. The most common use casesfor scripts loading other scripts are advertisements and tracking.These account for 59.11% of scripts loaded by other scripts. Wede�ne ads and tracking scripts based upon the uBlock Origin [6]default �lter settings.

The loading of dynamically generated scripts prevents the useof the most secure keyword, ‘sha256’. According to our data ofthe 37,087 scripts loaded by sites that include the ‘script-src’directive and the ‘unsafe-inline’ keyword in their CSPs 13.06%are dynamically generated and 43.03% of sites load at least onedynamically generated script.

3

Warning: This graph is a few years old

Page 41: CSE 127: Computer Security - cseweb.ucsd.edu

How is CSP really used in practice?

in this case is the resource from c.com. In other words, strict-dynamic allows you to vet a “root” script using a nonce or a hash.That “root” script then has the ability to load other resourcesunmolested.2

• unsafe-inline: Any inline script is allowed. This is very inse-cure.

• unsafe-eval: Allows the use of eval() in allowed resources.This is also unsafe as it allows scripts to inject any JavaScriptonto the page, bypassing any CSPs currently in place.

3 METHODOLOGYIn order to obtain our data set, we performed a simple crawl ofthe Alexa top 1 million sites [2] using a headless browser in or-der to collect the Content-Security-Policy headers present ineach request. With this data, we normalized the policies remov-ing unnecessary white space and sorting both the directives andtheir associated values in lexicographical order. Additionally, wenormalized the ‘nonce’ and ‘sha256’ values by removing theunique content of the nonce and hash values. After identifying5,240 sites from our initial corpus that returned CSP headers withthe ‘script-src’ directive and the ‘unsafe-inline’ keywordwe revisited those sites, this time using a proxy to capture the con-tent of all scripts loaded by each site. We constructed a graph whosevertices are the origins of all loaded scripts and whose edges repre-sent the loading of a script by another script. Finally, we visited thesame 5240 sites again the following day and in order to determinewhether each script changed across visits providing and estimateof the percentage of scripts which are dynamically generated.

3.1 LimitationsThe Internet is constantly growing and evolving and as a result ourdata represents only a snapshot of a small portion of the greaterInternet. Increasingly websites serve di�erent content based upon avisitors browser cookies, device type, and other factors. Our freshlyinstantiated, i.e. without any cookies, headless browsing crawlermay receive di�erent content than the average visitor to the samewebsites. Additionally, for any given website, we browse only thelanding page, not exhaustively exploring all links on each page.

4 RESULTSOut of the 1 million sites that we visit 39,022 or approximately 3.9%return the Content-Security-Policy header and in total we �nd6,670 unique normalized policies. CSP usage is trending upwardincreasing from 1.6% of sites in 2016 [14] to 1.97% in 2017 [5] to3.9% in 2018. The 2016 number is based upon a Google searchindex of approximately 1 billion sites while the 2017 and 2018numbers are both based upon the Alexa top 1 million sites at thetime. The usage frequencies of the various directives are displayedin �gure 1. Our data indicates that the most common usage of CSPtoday is to require the usage of HTTPS. The ‘upgrade-insecure-requests’ directive is themost commonly used directive appearingin 56.74% of policies. This represents a dramatic change from the2It’s important to note that since eval is blocked by default when using CSP, in orderfor the script from b.com to inject new JavaScript resources onto the page it willneed to use some other API method such as document.createElement() which willthen need to be added to the document through document.body.appendChild() orsimilar.

2016 data from Weichselbaum et al. [14] where it appeared in only1.87% of policies. The adoption of forced HTTPS is certainly apositive security trend. The ‘frame-ancestors’ directive has alsogained popularity appearing in only 8.11% of policies in 2016 whileappearing in 54.07% of policies in 2018.

Figure 1: Distribution of CSP directives.

The directive that primarily deals with XSS mitigation is the‘script-src’ directive. In our data this directive appears in 7,317or 18.75% of policies. Figure 2 shows the frequencies of various typesof values included in policies with the ’script-src’ directive.The ‘unsafe-inline’ keyword which renders the policy entirelyuseless for XSS protection appears in 90.69% of policies. The mostsecure keyword ‘sha256’ appears in 7.95% of policies in our dataset, up from just 1.65% in 2016. The next best alternative ‘nonce’appears in 6.57% of policies up from 0.92% in 2016. The use ofwildcards in sources also presents security risks. Only 18.38% ofpolicies that specify at least one host include exclusively hostswith fully speci�ed paths. Of all policies with the ‘srcipt-src’directive 38.28% include at least one host with a wildcard as and18.03% include the general wildcard. Whitelisting entire protocols,i.e. https:// or http://, is also extremely insecure and 59.87% ofpolicies include at least one protocol among their sources.

Figure 3 shows the most commonly whitelisted hosts. Google’spervasiveness on the Internet is apparent as 13 of the top 15 mostwhite-listed hosts are Google domains with the remaining 2 comingfrom Facebook.

Figure 4 shows the origins of the scripts most commonly re-quested by other third party scripts. The most common use casesfor scripts loading other scripts are advertisements and tracking.These account for 59.11% of scripts loaded by other scripts. Wede�ne ads and tracking scripts based upon the uBlock Origin [6]default �lter settings.

The loading of dynamically generated scripts prevents the useof the most secure keyword, ‘sha256’. According to our data ofthe 37,087 scripts loaded by sites that include the ‘script-src’directive and the ‘unsafe-inline’ keyword in their CSPs 13.06%are dynamically generated and 43.03% of sites load at least onedynamically generated script.

3

Page 42: CSE 127: Computer Security - cseweb.ucsd.edu

What’s frame-ancestors?

What problem is this addressing?

Page 43: CSE 127: Computer Security - cseweb.ucsd.edu

Clickjacking!

Busting Frame Busting:a Study of Clickjacking Vulnerabilities on Popular Sites

Gustav Rydstedt, Elie Bursztein, Dan BonehStanford University

{rydstedt,elie,dabo}@stanford.edu

Collin JacksonCarnegie Mellon [email protected]

June 7, 2010

Abstract

Web framing attacks such as clickjacking useiframes to hijack a user’s web session. The mostcommon defense, called frame busting, preventsa site from functioning when loaded inside aframe. We study frame busting practices for theAlexa Top-500 sites and show that all can be cir-cumvented in one way or another. Some circum-ventions are browser-specific while others workacross browsers. We conclude with recommen-dations for proper frame busting.

1 Introduction

Frame busting refers to code or annotationprovided by a web page intended to preventthe web page from being loaded in a sub-frame.Frame busting is the recommended defenseagainst clickjacking [10] and is also requiredto secure image-based authentication such asthe Sign-in Seal used by Yahoo. Sign-in Sealdisplays a user-selected image that authenticatesthe Yahoo! login page to the user. Withoutframe busting, the login page could be openedin a sub-frame so that the correct image isdisplayed to the user, even though the toppage is not the real Yahoo login page. Newadvancements in clickjacking techniques [22]using drag-and-drop to extract and inject datainto frames further demonstrate the importanceof secure frame busting.

Figure 1: Visualization of a clickjacking attackon Twitter’s account deletion page.

Figure 1 illustrates a clickjacking attack: thevictim site is framed in a transparent iframe thatis put on top of what appears to be a normalpage. When users interact with the normal page,they are unwittingly interacting with the victimsite. To defend against clickjacking attacks, thefollowing simple frame busting code is a com-monly used by web sites:

i f ( top . l o c a t i o n != l o c a t i o n )top . l o c a t i o n = s e l f . l o c a t i o n ;

Frame busting code typically consists of aconditional statement and a counter-action thatnavigates the top page to the correct place.As we will see, this basic code is fairly easyto bypass. We discuss far more sophisticatedframe busting code (and circumvention tech-niques) later in the paper.

1

• How does frame-ancestor help? ➤ Don’t allow non twitter origins to frame delete page!

Page 44: CSE 127: Computer Security - cseweb.ucsd.edu

http://best.game.ever

twitter.com

Page 45: CSE 127: Computer Security - cseweb.ucsd.edu

What about the other two?

in this case is the resource from c.com. In other words, strict-dynamic allows you to vet a “root” script using a nonce or a hash.That “root” script then has the ability to load other resourcesunmolested.2

• unsafe-inline: Any inline script is allowed. This is very inse-cure.

• unsafe-eval: Allows the use of eval() in allowed resources.This is also unsafe as it allows scripts to inject any JavaScriptonto the page, bypassing any CSPs currently in place.

3 METHODOLOGYIn order to obtain our data set, we performed a simple crawl ofthe Alexa top 1 million sites [2] using a headless browser in or-der to collect the Content-Security-Policy headers present ineach request. With this data, we normalized the policies remov-ing unnecessary white space and sorting both the directives andtheir associated values in lexicographical order. Additionally, wenormalized the ‘nonce’ and ‘sha256’ values by removing theunique content of the nonce and hash values. After identifying5,240 sites from our initial corpus that returned CSP headers withthe ‘script-src’ directive and the ‘unsafe-inline’ keywordwe revisited those sites, this time using a proxy to capture the con-tent of all scripts loaded by each site. We constructed a graph whosevertices are the origins of all loaded scripts and whose edges repre-sent the loading of a script by another script. Finally, we visited thesame 5240 sites again the following day and in order to determinewhether each script changed across visits providing and estimateof the percentage of scripts which are dynamically generated.

3.1 LimitationsThe Internet is constantly growing and evolving and as a result ourdata represents only a snapshot of a small portion of the greaterInternet. Increasingly websites serve di�erent content based upon avisitors browser cookies, device type, and other factors. Our freshlyinstantiated, i.e. without any cookies, headless browsing crawlermay receive di�erent content than the average visitor to the samewebsites. Additionally, for any given website, we browse only thelanding page, not exhaustively exploring all links on each page.

4 RESULTSOut of the 1 million sites that we visit 39,022 or approximately 3.9%return the Content-Security-Policy header and in total we �nd6,670 unique normalized policies. CSP usage is trending upwardincreasing from 1.6% of sites in 2016 [14] to 1.97% in 2017 [5] to3.9% in 2018. The 2016 number is based upon a Google searchindex of approximately 1 billion sites while the 2017 and 2018numbers are both based upon the Alexa top 1 million sites at thetime. The usage frequencies of the various directives are displayedin �gure 1. Our data indicates that the most common usage of CSPtoday is to require the usage of HTTPS. The ‘upgrade-insecure-requests’ directive is themost commonly used directive appearingin 56.74% of policies. This represents a dramatic change from the2It’s important to note that since eval is blocked by default when using CSP, in orderfor the script from b.com to inject new JavaScript resources onto the page it willneed to use some other API method such as document.createElement() which willthen need to be added to the document through document.body.appendChild() orsimilar.

2016 data from Weichselbaum et al. [14] where it appeared in only1.87% of policies. The adoption of forced HTTPS is certainly apositive security trend. The ‘frame-ancestors’ directive has alsogained popularity appearing in only 8.11% of policies in 2016 whileappearing in 54.07% of policies in 2018.

Figure 1: Distribution of CSP directives.

The directive that primarily deals with XSS mitigation is the‘script-src’ directive. In our data this directive appears in 7,317or 18.75% of policies. Figure 2 shows the frequencies of various typesof values included in policies with the ’script-src’ directive.The ‘unsafe-inline’ keyword which renders the policy entirelyuseless for XSS protection appears in 90.69% of policies. The mostsecure keyword ‘sha256’ appears in 7.95% of policies in our dataset, up from just 1.65% in 2016. The next best alternative ‘nonce’appears in 6.57% of policies up from 0.92% in 2016. The use ofwildcards in sources also presents security risks. Only 18.38% ofpolicies that specify at least one host include exclusively hostswith fully speci�ed paths. Of all policies with the ‘srcipt-src’directive 38.28% include at least one host with a wildcard as and18.03% include the general wildcard. Whitelisting entire protocols,i.e. https:// or http://, is also extremely insecure and 59.87% ofpolicies include at least one protocol among their sources.

Figure 3 shows the most commonly whitelisted hosts. Google’spervasiveness on the Internet is apparent as 13 of the top 15 mostwhite-listed hosts are Google domains with the remaining 2 comingfrom Facebook.

Figure 4 shows the origins of the scripts most commonly re-quested by other third party scripts. The most common use casesfor scripts loading other scripts are advertisements and tracking.These account for 59.11% of scripts loaded by other scripts. Wede�ne ads and tracking scripts based upon the uBlock Origin [6]default �lter settings.

The loading of dynamically generated scripts prevents the useof the most secure keyword, ‘sha256’. According to our data ofthe 37,087 scripts loaded by sites that include the ‘script-src’directive and the ‘unsafe-inline’ keyword in their CSPs 13.06%are dynamically generated and 43.03% of sites load at least onedynamically generated script.

3

Page 46: CSE 127: Computer Security - cseweb.ucsd.edu

What is MIXed content?

• Why is this bad? ➤ Network attacker can inject their own scripts,

images, etc.!

https://…

http://…

Page 47: CSE 127: Computer Security - cseweb.ucsd.edu

What is MIXed content?

• Why is this bad? ➤ Network attacker can inject their own scripts,

images, etc.!

https://…

http://…

Page 48: CSE 127: Computer Security - cseweb.ucsd.edu

How does CSP help?

• upgrade-insecure-requests ➤ Rewrite HTTP URL to HTTPS before making request

• block-all-mixed-content ➤ Don’t load any content over HTTP

Page 49: CSE 127: Computer Security - cseweb.ucsd.edu

Outline: modern mechanisms

• iframe sandbox (quick refresher)

• Content security policy (CSP)

• HTTP strict transport security (HSTS)

• Subresource integrity (SRI)

• Cross-origin resource sharing (CORS)

Page 50: CSE 127: Computer Security - cseweb.ucsd.edu

Motivation for HSTS

• Attacker can force you to go to HTTP vs. HTTPS ➤ SSL Stripping attack (Moxie)

- They can rewrite all HTTPS URLs to HTTP

- If server serves content over HTTP: doom!

• HSTS header: never visit site over HTTP again ➤ Strict-Transport-Security: max-age=31536000

Page 51: CSE 127: Computer Security - cseweb.ucsd.edu

Outline: modern mechanisms

• iframe sandbox (quick refresher)

• Content security policy (CSP)

• HTTP strict transport security (HSTS)

• Subresource integrity (SRI)

• Cross-origin resource sharing (CORS)

Page 52: CSE 127: Computer Security - cseweb.ucsd.edu

Motivation for SRI• CSP+HSTS can be used to limit damages, but

can’t really defend against malicious code

• How do you know that the library you’re loading is the correct one?Won’t using HTTPS address this problem?

Page 53: CSE 127: Computer Security - cseweb.ucsd.edu

Motivation for SRI• CSP+HSTS can be used to limit damages, but

can’t really defend against malicious code

• How do you know that the library you’re loading is the correct one?Won’t using HTTPS address this problem?

Mikko Ohtamaa

Page 54: CSE 127: Computer Security - cseweb.ucsd.edu

Subresource integrity

• Idea: page author specifies hash of (sub)resource they are loading; browser checks integrity ➤ E.g., integrity for scripts

➤ E.g., integrity for link elements

<link rel="stylesheet" href="https://site53.cdn.net/style.css" integrity="sha256-SDfwewFAE...wefjijfE">

<script src="https://code.jquery.com/jquery-1.10.2.min.js" integrity="sha256-C6CB9UYIS9UJeqinPHWTHVqh/E1uhG5Tw+Y5qFQmYg=">

Page 55: CSE 127: Computer Security - cseweb.ucsd.edu

What happens when check fails?

• Case 1 (default): ➤ Browser reports violation and does not render/

execute resource

• Case 2: CSP directive with integrity-policy directive set to report ➤ Browser reports violation, but may render/execute

resource

Page 56: CSE 127: Computer Security - cseweb.ucsd.edu

Multiple hash algorithms

• Authors may specify multiple hashes ➤ E.g.,

• Browser uses strongest algorithm

• Why support multiple algorithms? ➤ Don’t break page on old browser

<script src="hello_world.js" integrity=“sha256-... sha512-... "></script>

Page 57: CSE 127: Computer Security - cseweb.ucsd.edu

Outline: modern mechanisms

• iframe sandbox (quick refresher)

• Content security policy (CSP)

• HTTP strict transport security (HSTS)

• Subresource integrity (SRI)

• Cross-origin resource sharing (CORS)

Page 58: CSE 127: Computer Security - cseweb.ucsd.edu

Recall: SOP is also inflexible

• Problem: Can’t fetch cross-origin data ➤ Leads to building insecure sites/services: JSONP

• Solution: Cross-origin resource sharing (CORS) ➤ Data provider explicitly whitelists origins that can

inspect responses

➤ Browser allows page to inspect response if its origin is listed in the header

Page 59: CSE 127: Computer Security - cseweb.ucsd.edu

E.g., CORS usage: amazon

• Amazon has multiple domains ➤ E.g., amazon.com and aws.com

• Problem: amazon.com can’t read cross-origin aws.com data

• With CORS aws.com can allow amazon.com to readHTTP responses amazon.com evil.biz

aws.com

Page 60: CSE 127: Computer Security - cseweb.ucsd.edu

How CORS works• Browser sends Origin header with XHR request

➤ E.g., Origin: https://amazon.com

• Server can inspect Origin header and respond with Access-Control-Allow-Origin header ➤ E.g., Access-Control-Allow-Origin: https://amazon.com

➤ E.g., Access-Control-Allow-Origin: *

• CORS XHR may send cookies + custom headers ➤ Need “preflight” request to authorize this

Page 61: CSE 127: Computer Security - cseweb.ucsd.edu

New: Permissions Policy

• Web platform has access to many things ➤ accelerometer, ambient-light-sensor, battery,

camera, display-capture, geolocation, …

• Permissions policy tries to limit access to certain resources ➤ Header:

Permissions-Policy: fullscreen=(), geolocation=() ➤ Iframe attribute:

<iframe src="https://other.com/map" allow="geolocation"></iframe>

Page 62: CSE 127: Computer Security - cseweb.ucsd.edu

✓ How do we protect page from ads/services?

✓ How to share data with cross-origin page?

✓ How to protect one user from another’s content?

✓ How do we protect the page from a library?

✓ How do we protect the page from the CDN?

✓ How do we protect the page from network provider?