Performance
Mobile Cloud Computing 14.11.2014 Jukka K. Nurminen
Today
• General
• What is performance? • Why does it matter? • What can you do?
• Especially for mobile web
What is Performance? • Response time
– Processing • Mobile & Server
– Communication • Latency vs. bandwidth
– Feedback • Real response time vs. hiding the delay
• Resource use – Especially for concurrent apps – Memory and cache use, CPU load
• Money? • Energy
– Specific for mobile – Visible only after longer time of use – => harder to use for marketing
Bandwidth and Latency
• Latency: the delay between the sender transmitting a message and the receiver decoding it. It is mainly a function of the signal's travel time and the processing time at any nodes the information traverses. “How fast can you drive?”
• Bandwidth: the maximum rate at which information can be transferred, commonly measured in bits/second. “How many lanes does the road have?”
• Extreme example: “Sneakernet”
  – Copy data to DVDs and load them onto a truck
  – Hours of latency but very high and cheap bandwidth
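The difference between the two quantities can be made concrete with a small calculation. This is a minimal sketch: the link speed, the truck's travel time, and the DVD count are assumed figures chosen only for illustration.

```python
def transfer_time_s(size_bytes: float, latency_s: float, bandwidth_bps: float) -> float:
    """Delivery time = fixed latency + time to push the bits through the link."""
    return latency_s + (size_bytes * 8) / bandwidth_bps


# Assumed figures, for illustration only.
SMALL_BYTES = 15                      # a tiny request payload
LARGE_BYTES = 1000 * 4.7e9            # 1000 DVDs at 4.7 GB each
LINK_LATENCY_S = 0.1                  # 100 ms
LINK_BANDWIDTH_BPS = 10e6             # 10 Mbit/s
TRUCK_LATENCY_S = 10 * 3600           # ten hours on the road

# Small transfers are dominated by latency, large ones by bandwidth.
small_over_link = transfer_time_s(SMALL_BYTES, LINK_LATENCY_S, LINK_BANDWIDTH_BPS)
large_over_link = transfer_time_s(LARGE_BYTES, LINK_LATENCY_S, LINK_BANDWIDTH_BPS)

# Sneakernet: all the data arrives when the truck does.
sneakernet_bps = LARGE_BYTES * 8 / TRUCK_LATENCY_S

print(f"small over link: {small_over_link:.6f} s")
print(f"large over link: {large_over_link / 86400:.1f} days")
print(f"sneakernet effective bandwidth: {sneakernet_bps / 1e9:.2f} Gbit/s")
```

With these assumed numbers the truck beats the link for the bulk transfer by a wide margin, while for the 15-byte message the link's 100 ms latency is essentially the whole cost.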
Time is money
Delay           User perception
0–100 ms        Instant
100–300 ms      Small perceptible delay
300–1000 ms     Machine is working
1,000+ ms       Likely mental context switch
10,000+ ms      Task is abandoned
Well-publicized studies from Google, Microsoft, and Amazon all show that web performance translates directly to dollars and cents—e.g., a 2,000 ms delay on Bing search pages decreased per-user revenue by 4.3%! Similarly, an Aberdeen study of over 160 organizations determined that an extra one-second delay in page load times led to 7% loss in conversions, 11% fewer page views, and a 16% decrease in customer satisfaction!
What can an end-user do?
• Get a faster device • Get a faster network connection • Get more efficient software, e.g. browser
• All of these approaches have problems
What can a developer do?
• Ensure that enough server resources are available – Own or cloud providers
• Do the application in a smart way – Appropriate algorithms and good code – Consider the special needs of different platforms e.g. mobile
• Hard to manage multiple versions • Responsive web design
– but still long way to go
– Use new platform features? • Can bring major simplifications and resource savings (e.g. WebSocket) • May not work with old platforms
Difficulty: Compatibility slows down adoption of new features
Time to have the feature in the browser + Time until users have the latest browser = Many years
• This is easier if you have a dedicated client
  – but then the users have to install your app
[Figure: HTML5test.com score over the years, Jan 2009 to Jan 2014, for Chrome, Firefox, Internet Explorer, Maxthon, Opera, and Safari. Y-axis: score (points), 0 to 600. Source: http://html5test.com/results/desktop.html, retrieved 11/13/2014]
World wide device shipments
Mobile-first?
WEB
Web structure
Firewalls NATs
Browser Architecture
• Generate and submit web requests to web servers
• Accept responses from web servers and produce visual presentations out of them
• Render the results
Techniques for responsive web apps Push technologies AJAX
Push Technologies
• How can browser know if something changed on the server?
• What if we wanted the browser to react to changes that happen at the server side? – E.g. new email arrives, location of an object changes
• Simple (and inefficient) way – Browser polls the server
Push technologies
• In normal cases the client (= browser) is the initiating party making the request.
• A number of alternative possibilities exist (Comet is sometimes used as the umbrella term)
  – A set of “hacks”
    • Leaving the connection open
    • Long polling (a query that only returns when results are available)
  – XMPP-based pub-sub techniques and other protocols
• With HTML5
  – WebSocket
  – Server-sent events
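The difference between plain polling and long polling can be sketched with an in-process simulation. This is only an illustration under stated assumptions: a `queue.Queue` stands in for the server's event source, and `short_poll`, `long_poll`, and `publish_later` are hypothetical helper names; no real HTTP traffic is involved.

```python
import queue
import threading
import time

# Stand-in for server-side state: events waiting to be delivered to the client.
events: "queue.Queue[str]" = queue.Queue()


def publish_later(delay_s: float, message: str) -> None:
    """Simulate something changing on the server after a delay."""
    threading.Timer(delay_s, events.put, args=(message,)).start()


def short_poll(interval_s: float, max_polls: int):
    """Plain polling: ask repeatedly; most requests come back empty."""
    requests = 0
    for _ in range(max_polls):
        requests += 1
        try:
            return events.get_nowait(), requests
        except queue.Empty:
            time.sleep(interval_s)
    return None, requests


def long_poll(timeout_s: float):
    """Long polling: one request that is held open until an event exists."""
    try:
        return events.get(timeout=timeout_s), 1
    except queue.Empty:
        return None, 1


if __name__ == "__main__":
    publish_later(0.2, "new email")
    print(short_poll(interval_s=0.02, max_polls=200))  # many requests for one event
    publish_later(0.2, "new email")
    print(long_poll(timeout_s=5.0))                    # a single request
```

Both variants deliver the event, but plain polling spends a request every interval while nothing has changed, which costs bandwidth and, on mobile, battery.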
AJAX Introduction
• Rationale:
  – Why should a complete web page be loaded if only a small portion of it needs updating?
  – Why should all the data be loaded to a page if the user is unlikely to go through it all?
• Although the acronym stands for “Asynchronous JavaScript and XML”, it is increasingly common to return JSON rather than XML
AJAX Asynchronous JavaScript and XML • Enables more interactive web applications • Moves data in the background
1. Submit XMLHttpRequest
2. Run server-side code to generate the proper response to the request
3. Send the response in JSON or XML
4. Response processing typically modifies the DOM tree, which influences what the user sees
Mobile Web Clients
Challenges and Opportunities
• Limited bandwidth
• Battery consumption of communication and computing
• Amount of data may influence the phone bill
  – And operators are more sensitive to the amounts of data
  – Although flat-rate tariffing is spreading
• Mobile display size and other limitations of the mobile UI
Opportunities:
• Mobile first
• Context-specific (e.g. location) services
• Always-on
• Access to human user
• Novel innovations
Separate content for mobile devices - Responsive web design
Impact of mobile versions www.websiteoptimization.com
An average web application, as of early 2013, is composed of 90 requests, fetched from 15 hosts, with 1,311 KB total transfer size:
• HTML: 10 requests, 52 KB
• Images: 55 requests, 812 KB
• JavaScript: 15 requests, 216 KB
• CSS: 5 requests, 36 KB
• Other: 5 requests, 195 KB
Proxy to transcode the content
• Compresses the page content
• Pros:
  – Much less data needs to be sent
  – The resulting page fits the mobile screen
  – Does not require much processing on the phone side
• Cons:
  – Not the same page as what was created => functionality may suffer
  – All data goes through Opera servers => single point of failure
RabbIT proxy
• Compress text pages to gzip streams. This reduces size by up to 75%
• Compress images to 10% jpeg. This reduces size by up to 95%
• Remove advertising • Remove background images • Cache filtered pages and images • Uses keepalive if possible • Easy and powerful configuration • Multi threaded solution written in java • Modular and easily extended • Complete HTTP/1.1 compliance
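The text-compression step can be tried out with Python's standard `gzip` module. This is a sketch of the general idea, not RabbIT's implementation; the sample page is made up, and because it is highly repetitive it compresses far better than a typical real page would.

```python
import gzip


def gzip_saving(text: str) -> float:
    """Return the fraction of bytes saved by gzip-compressing the text."""
    raw = text.encode("utf-8")
    packed = gzip.compress(raw)
    if gzip.decompress(packed) != raw:  # compression must be lossless
        raise ValueError("round trip failed")
    return 1 - len(packed) / len(raw)


# Made-up, very repetitive markup (an assumption for the demo).
page = "<html><body>" + "<p>Mobile web performance matters.</p>" * 200 + "</body></html>"

print(f"saved {gzip_saving(page):.0%} of {len(page.encode('utf-8'))} bytes")
```

Unlike the 10% JPEG recompression of images, gzip is lossless: the client gets back exactly the text the server produced.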
Web Performance
Based on Ilya Grigorik: High Performance Browser Networking
Available: http://chimera.labs.oreilly.com/books/1230000000545
Sections 10-12
A good source of modern web-related information
Latency is the main bottleneck, not bandwidth
What if we could reduce cross-atlantic RTTs from 150 ms to 100 ms? This would have a larger effect on the speed of the internet than increasing a user’s bandwidth from 3.9 Mbps to 10 Mbps or even 1 Gbps.
www.webpagetest.org
www.aalto.fi
www.aalto.fi (with www.webpagetest.org)
First View
Second View
www.google.com searching for “Aalto”
How to speed things up?
Modern browser techniques to speed up things
• Resource pre-fetching and prioritization
  – Document, CSS, and JavaScript parsers may communicate extra information to the network stack to indicate the relative priority of each resource: blocking resources required for first rendering are given high priority, while low-priority requests may be temporarily held back in a queue.
• DNS pre-resolve
  – Likely hostnames are pre-resolved ahead of time to avoid DNS latency on a future HTTP request. A pre-resolve may be triggered through learned navigation history, a user action such as hovering over a link, or other signals on the page.
• TCP pre-connect
  – Following a DNS resolution, the browser may speculatively open the TCP connection in anticipation of an HTTP request. If it guesses right, it can eliminate another full roundtrip (TCP handshake) of network latency.
• Page pre-rendering
  – Some browsers allow you to hint the likely next destination and can pre-render the entire page in a hidden tab, such that it can be instantly swapped in when the user initiates the navigation.
Some techniques
• Critical resources such as CSS and JavaScript should be discoverable as early as possible in the document.
• CSS should be delivered as early as possible to unblock rendering and JavaScript execution.
• Noncritical JavaScript should be deferred to avoid blocking DOM and CSSOM construction.
• The HTML document is parsed incrementally by the parser; hence the document should be periodically flushed for best performance.
HTTP/1.1 performance improvement mechanisms
• Persistent connections to allow connection reuse
• Chunked transfer encoding to allow response streaming
• Request pipelining to allow parallel request processing
• Byte serving to allow range-based resource requests
• Improved and much better-specified caching mechanisms
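Of the mechanisms above, chunked transfer encoding is easy to show on the wire: each chunk is sent as its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The sketch below is a simplified encoder and decoder; the function names are made up for this example, and chunk extensions and trailers are deliberately ignored.

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Yield each part as an HTTP/1.1 chunk, so the total length need not be known up front."""
    for part in parts:
        if part:  # skip empty parts: a zero-length chunk would end the body early
            yield f"{len(part):X}\r\n".encode("ascii") + part + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk marker followed by an empty trailer section


def decode_chunked(data: bytes) -> bytes:
    """Reassemble a chunked body (no chunk extensions or trailers supported)."""
    body, pos = b"", 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


wire = b"".join(encode_chunked([b"Hello, ", b"world"]))
print(wire)
print(decode_chunked(wire))
```

Because every chunk carries its own length, the server can start sending the response before the whole body has been generated.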
Persistent connections
Saving: (N-1) * RTT, where N = number of resources (avg 90)
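The saving formula can be evaluated directly. N = 90 comes from the slide; the 100 ms RTT is an assumed value. The result is an upper bound, since it assumes every resource would otherwise open its own connection, one after another.

```python
def handshake_saving_ms(n_resources: int, rtt_ms: float) -> float:
    """TCP handshake round trips avoided by reusing one connection: (N - 1) * RTT."""
    return (n_resources - 1) * rtt_ms


# N = 90 is the average from the slide; RTT = 100 ms is an assumption.
print(handshake_saving_ms(90, 100))
```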
Pipelined: Additional requests sent before replies arrive
Parallel processing at server side: Head of line blocking as replies cannot be multiplexed
Alternative to pipelining
• Up to N parallel TCP connections from a client to a server
  – N is browser dependent, typically 6
• Sharding: resources split under multiple host names (which could even reside on the same server)
  – Allows any number of parallel TCP connections (more than 6)
• What number is optimal?
  – Application specific
  – Latency and bandwidth specific
  – Each new host name causes a DNS lookup (+ TLS handshake in case of HTTPS)
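One way to assign resources to shards is to hash the resource path, so that a given resource always maps to the same host name and stays cacheable under one URL. This is a minimal sketch; the `static{}.example.com` naming scheme and the function names are assumptions, not something the slides prescribe.

```python
import zlib


def shard_host(path: str, shards: int, pattern: str = "static{}.example.com") -> str:
    """Pick a shard host for a resource; the same path always gets the same host."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    index = zlib.crc32(path.encode("utf-8")) % shards
    return pattern.format(index + 1)


def max_parallel_connections(shards: int, per_host: int = 6) -> int:
    """Upper bound on parallel connections, given the typical per-host limit of 6."""
    return shards * per_host


for resource in ["/img/logo.png", "/css/site.css", "/js/app.js"]:
    print(resource, "->", shard_host(resource, shards=4))
print(max_parallel_connections(4))
```

More shards raise the connection ceiling, but each additional host name adds the DNS lookup (and possibly TLS handshake) cost noted above.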
Header overhead
• Average header overhead: 500-800 bytes per HTTP request
• With cookies much more (can be limited to be < 8 KB by servers or proxies)
Headers: 352 bytes Payload: 15 bytes 96% protocol overhead
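The 96% figure follows from dividing the header bytes by the total bytes sent, as this short check shows (the function name is made up for the example):

```python
def protocol_overhead(header_bytes: int, payload_bytes: int) -> float:
    """Share of the transferred bytes that is headers rather than payload."""
    return header_bytes / (header_bytes + payload_bytes)


# Numbers from the slide: 352 bytes of headers for a 15-byte payload.
print(f"{protocol_overhead(352, 15):.0%}")
```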
Reduce number of loaded resources
• Eliminate unnecessary requests
• Concatenation
  – Multiple JavaScript or CSS files are combined into a single resource.
• Spriting
  – Multiple images are combined into a larger, composite image.
• Inlining
  – Embed JavaScript, CSS, and other resources in the HTML
• Drawbacks
  – Cache performance
    • A single update invalidates the whole cache
  – Loading of unnecessary resources
  – JavaScript and CSS are parsed only when the whole file is downloaded (HTML is processed incrementally)
  – Complicated management + search for optimal strategy
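Concatenation and inlining are simple enough to sketch as build-step helpers. The function names and sample inputs below are hypothetical; inlining is shown with a base64 `data:` URI, which is one common way to embed a small binary resource in HTML or CSS.

```python
import base64
from typing import Sequence


def concatenate(sources: Sequence[str]) -> str:
    """Combine several CSS (or JavaScript) sources into one resource, one request."""
    return "\n".join(sources)


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Inline a resource as a data: URI so it needs no separate request."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


bundle = concatenate(["body { margin: 0 }", "h1 { color: navy }"])
icon = to_data_uri(b"\x89PNG placeholder bytes", "image/png")  # not a real image
print(bundle)
print(f'<img src="{icon}">')
```

Both helpers trade requests for the drawbacks listed above: a change to any one source invalidates the whole bundle in the cache, and base64 encoding makes an inlined resource about one third larger than the original bytes.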
HTTP 2.0
Jukka K. Nurminen
Heavily influenced by Chapter 12 of High performance browser networking
4.2.2014
SPDY & HTTP2.0
• By Google, since 2009
• Goals:
  – Target a 50% reduction in page load time (PLT).
  – Avoid the need for any changes to content by website authors.
  – Minimize deployment complexity, avoid changes in network infrastructure.
  – Develop this new protocol in partnership with the open-source community.
  – Gather real performance data to (in)validate the experimental protocol.
• “When we download the top 25 websites over simulated home network connections, we see a significant improvement in performance—pages loaded up to 55% faster.”
For more see e.g. http://www.webpronews.com/google-spdy-gaining-adoption-2012-01
SPDY Today
• Supported in Chrome, Firefox, and Opera browsers (maybe more)
• Many large web destinations (e.g., Google, Twitter, Facebook) offer SPDY to compatible clients
• Many people are using SPDY (but they don’t know that) • Work towards HTTP 2.0 standard starts
– Based on SPDY lessons learned
HTTP 2.0 standard
• All HTTP 1.1 concepts are available
  – HTTP methods, status codes, URIs, and header fields
• All existing applications can be delivered without modification.
• HTTP 2.0 main focus is on performance
• HTTP 2.0 modifies how the data is formatted (framed) and transported between the client and server
• Standardization is still on-going
• Heavily influenced by Google and its SPDY protocol
HTTP 2.0 targets
• Substantially and measurably improve end-user perceived latency in most cases, over HTTP 1.1 using TCP.
• Address the "head of line blocking" problem in HTTP.
• Not require multiple connections to a server to enable parallelism, thus improving its use of TCP, especially regarding congestion control.
• Retain the semantics of HTTP 1.1, leveraging existing documentation, including (but not limited to) HTTP methods, status codes, URIs, and where appropriate, header fields.
• Clearly define how HTTP 2.0 interacts with HTTP 1.x, especially in intermediaries.
• Clearly identify any new extensibility points and policy for their appropriate use.
Performance
Compatibility
Extensibility
Binary Framing Layer
Terminology
• All communication is performed with a single TCP connection.
• The stream is a virtual channel within a connection, which carries bidirectional messages. Each stream has a unique integer identifier (1, 2, …, N).
• The message is a logical HTTP message, such as a request or response, which consists of one or more frames.
• The frame is the smallest unit of communication, which carries a specific type of data, e.g., HTTP headers, payload, and so on.
  – Frames use binary encoding, and header data is compressed
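To illustrate what "binary framing" means, the sketch below packs and unpacks a frame header. Note the assumption: it uses the 9-byte header layout of the HTTP/2 specification as later published in RFC 7540 (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier). The drafts current when these slides were written may have differed, and the function names are made up.

```python
import struct


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Build one frame: 9-byte binary header (RFC 7540 layout) followed by the payload."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload does not fit the 24-bit length field")
    if not 0 <= stream_id < 1 << 31:
        raise ValueError("stream identifier must fit in 31 bits")
    header = len(payload).to_bytes(3, "big") + struct.pack(">BBI", frame_type, flags, stream_id)
    return header + payload


def unpack_frame(data: bytes):
    """Parse one frame and return (type, flags, stream_id, payload)."""
    length = int.from_bytes(data[:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", data[3:9])
    stream_id &= 0x7FFFFFFF  # the top bit is reserved
    return frame_type, flags, stream_id, data[9:9 + length]


DATA, END_STREAM = 0x0, 0x1  # type and flag values as defined in RFC 7540
frame = pack_frame(DATA, END_STREAM, stream_id=3, payload=b"hello")
print(frame.hex())
print(unpack_frame(frame))
```

Because the header has a fixed size and states the payload length, a receiver can split the byte stream into frames without parsing text, and the stream identifier tells it which message each frame belongs to.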
Request and Response Multiplexing
• HTTP 1.x: Only one response can be delivered at a time (response queuing) per connection
  – Head-of-line blocking, inefficient TCP use
• HTTP 2.0: full request and response multiplexing
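Multiplexing can be pictured as cutting each message into frames tagged with a stream identifier and interleaving them on one connection. The sketch below uses simple round-robin interleaving, which is an assumption made for illustration; real implementations choose the order based on priorities and flow control, and the function names are made up.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import Dict, List, Tuple

Frame = Tuple[int, bytes]  # (stream id, chunk of the message)


def split_frames(stream_id: int, message: bytes, frame_size: int) -> List[Frame]:
    """Cut one message into frames that all carry its stream id."""
    return [(stream_id, message[i:i + frame_size]) for i in range(0, len(message), frame_size)]


def multiplex(messages: Dict[int, bytes], frame_size: int) -> List[Frame]:
    """Interleave the frames of all streams round-robin on a single connection."""
    per_stream = [split_frames(sid, msg, frame_size) for sid, msg in messages.items()]
    wire: List[Frame] = []
    for round_of_frames in zip_longest(*per_stream):
        wire.extend(frame for frame in round_of_frames if frame is not None)
    return wire


def demultiplex(wire: List[Frame]) -> Dict[int, bytes]:
    """Reassemble each message from its frames using the stream id."""
    messages: Dict[int, bytes] = defaultdict(bytes)
    for stream_id, chunk in wire:
        messages[stream_id] += chunk
    return dict(messages)


responses = {1: b"A" * 10, 3: b"BBB"}  # a long response and a short one
wire = multiplex(responses, frame_size=4)
print([stream_id for stream_id, _ in wire])
print(demultiplex(wire) == responses)
```

The short response on stream 3 is complete after the second frame on the wire, instead of waiting behind the long response as it would with HTTP 1.x response queuing.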
Request Prioritization
• Each stream can be assigned a 31-bit priority value – 0 represents the highest priority stream.
• Client and server can apply different strategies to process individual streams, messages, and frames in an optimal order
• Strategies
  – The HTML document itself is critical to construct the DOM; the CSS is required to construct the CSSOM; JavaScript is often needed for both. Remaining resources, such as images, are often fetched with lower priority
  – Modern browsers prioritize requests based on type of asset, its location on the page, and even learned priority from previous visits. For example, if rendering was blocked on a certain asset in a previous visit, the same asset may be prioritized higher in the future.
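The priority scheme described above (a 31-bit value, 0 being the highest priority) maps naturally onto a priority queue. This is a minimal sketch: the class name and the concrete priority values given to each asset type are assumptions for illustration, not values from any specification.

```python
import heapq
from itertools import count


class PriorityScheduler:
    """Serve queued resources lowest priority value first (0 = highest priority)."""

    MAX_PRIORITY = 2**31 - 1  # largest 31-bit value

    def __init__(self) -> None:
        self._heap = []
        self._arrival = count()  # tie-breaker: first come, first served

    def add(self, priority: int, resource: str) -> None:
        if not 0 <= priority <= self.MAX_PRIORITY:
            raise ValueError("priority must fit in 31 bits")
        heapq.heappush(self._heap, (priority, next(self._arrival), resource))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


scheduler = PriorityScheduler()
scheduler.add(20, "photo.jpg")   # assumed priorities, for illustration only
scheduler.add(0, "index.html")
scheduler.add(1, "site.css")
scheduler.add(1, "app.js")

while scheduler:
    print(scheduler.pop())
```

The document and the render-blocking CSS and JavaScript are served before the image, regardless of the order in which they were requested.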
One connection per origin
• HTTP 2.0 connections are persistent, and only one connection should be used between the client and server – No more multiple TCP connections like in HTTP1.1
• Less overhead
  – Fewer sockets to manage along the connection path, smaller memory footprint, and better connection throughput.
• Consistent prioritization between all streams
• Better compression through use of a single compression context
• Improved impact on network congestion due to fewer TCP connections
• Less time in slow-start and faster congestion and loss recovery
• Most HTTP transfers are short and bursty, whereas TCP is optimized for long-lived, bulk data transfers. By reusing the same connection between all streams, HTTP 2.0 is able to make more efficient use of the TCP connection.
Server Push
• Push additional resources to the client without the client explicitly asking for them
  – A typical web application consists of dozens of resources, all of which are discovered by the client by examining the document provided by the server. So why not eliminate the extra latency and let the server push the associated resources to the client ahead of time?
• Pushed content goes to browser cache – Invisible to client application
• Different strategies to apply server push
Header compression
• HTTP 1.x:
  – Headers are sent as plain text
  – Adds around 500–800 bytes of overhead per request, and kilobytes more if HTTP cookies are required
• HTTP 2.0
  – Only changes relative to the previous request are sent. HTTP 2.0 uses "header tables" on both the client and server to track and store previously sent key-value pairs.
  – Header tables persist for the entire HTTP 2.0 connection and are incrementally updated both by the client and server.
  – Each new header key-value pair is either appended to the existing table or replaces a previous value in the table.
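The header-table idea can be illustrated with a toy differential encoder. This sketch only mirrors the description above (send what changed, keep a table on both ends); it is not the actual HTTP 2.0 header compression format, and the class name and the use of `None` to mark a removed header are assumptions made for the example.

```python
from typing import Dict, Optional

Delta = Dict[str, Optional[str]]


class HeaderTable:
    """Keeps the previously seen headers so that only differences cross the wire."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def encode(self, headers: Dict[str, str]) -> Delta:
        """Sender side: return only new or changed pairs; removed keys map to None."""
        delta: Delta = {k: v for k, v in headers.items() if self._table.get(k) != v}
        for key in self._table.keys() - headers.keys():
            delta[key] = None
        self._table = dict(headers)
        return delta

    def decode(self, delta: Delta) -> Dict[str, str]:
        """Receiver side: apply the difference and return the full header set."""
        for key, value in delta.items():
            if value is None:
                self._table.pop(key, None)
            else:
                self._table[key] = value
        return dict(self._table)


sender, receiver = HeaderTable(), HeaderTable()
first = {":method": "GET", ":path": "/index.html", "user-agent": "demo-browser"}
second = {":method": "GET", ":path": "/style.css", "user-agent": "demo-browser"}

print(receiver.decode(sender.encode(first)))   # first request: everything is sent
delta = sender.encode(second)
print(delta)                                   # second request: only the changed path
print(receiver.decode(delta) == second)
```

For the second request only the changed path travels over the connection; the method and user-agent are recovered from the receiver's table.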
Initial SPDY mechanism did not work because of security problems
• Early versions of SPDY used zlib, with a custom dictionary, to compress all HTTP headers, which delivered 85%–88% reduction in the size of the transferred header data, and a significant improvement in page load time latency
• In the summer of 2012, a "CRIME" security attack was published against TLS and SPDY compression algorithms, which could result in session hijacking. As a result, the zlib compression algorithm was disabled
Other features
• Flow control in a rather similar fashion as in TCP
• Application Layer Protocol Negotiation (ALPN) is used to discover and negotiate HTTP 2.0 support as part of the regular HTTPS negotiation
Status of standardization
• Lots of open issues still
• Performance is not always as good as targeted
• Using SPDY as an intermediate option still makes a lot of sense
• If you only develop web apps, there is little need to care. If the performance of those apps is important, it is good to understand what is happening