Mini-Training: To cache or not to cache

Private and Confidential TO CACHE OR NOT TO CACHE MAXIME LEMAITRE - 06/02/2013

description

In today’s systems, the time it takes to bring data to the end user can be very long, especially under heavy load. An application can often improve performance by using an appropriate caching system. There are many caching levels that you can use in your applications today: CDN, in-memory/local cache, distributed cache, output cache, browser cache, HTML cache.

Transcript of Mini-Training: To cache or not to cache

Page 1: Mini-Training: To cache or not to cache

Private and Confidential

TO CACHE OR NOT TO CACHE

MAXIME LEMAITRE - 06/02/2013

Page 2: Mini-Training: To cache or not to cache

Agenda

• Introduction
• Definition, concepts, … and a quiz
• Caching in Web Applications
• Caching for Web Farms and IS
• Conclusion

Page 3: Mini-Training: To cache or not to cache

Can you give me an example of a cache system?

• CPU Cache
• Disk Cache
• DNS Cache
• Output Cache
• Browser Cache
• Web Cache
• Database Cache
• Memoization
• Distributed Cache
• Memory Cache
• ???

Page 4: Mini-Training: To cache or not to cache

Caching: introduction

A cache transparently stores data “somewhere” so that future requests can be served faster.

If the requested data is in the cache (cache hit), the request can be served by simply reading the cache, which is comparatively faster.

Otherwise (cache miss), the data has to be recomputed or fetched from its original storage location, which is comparatively slower.
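The hit/miss behaviour described above can be sketched in a few lines of C#. This is only an illustration: `SimpleCache` and the `"user-" + id` loader are hypothetical names that stand in for a real cache and a slow database or remote call, and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Minimal in-process cache: a dictionary in front of a slow "origin" function.
public sealed class SimpleCache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _store = new Dictionary<TKey, TValue>();
    private readonly Func<TKey, TValue> _loadFromOrigin;

    public int Hits { get; private set; }
    public int Misses { get; private set; }

    public SimpleCache(Func<TKey, TValue> loadFromOrigin)
    {
        _loadFromOrigin = loadFromOrigin;
    }

    public TValue Get(TKey key)
    {
        TValue value;
        if (_store.TryGetValue(key, out value))
        {
            Hits++;                    // cache hit: served from memory
            return value;
        }

        Misses++;                      // cache miss: go to the slow origin
        value = _loadFromOrigin(key);
        _store[key] = value;           // keep a copy for future requests
        return value;
    }
}

public static class SimpleCacheDemo
{
    public static void Main()
    {
        // Hypothetical slow lookup standing in for a database or remote call.
        var cache = new SimpleCache<int, string>(id => "user-" + id);

        cache.Get(42);   // miss
        cache.Get(42);   // hit
        cache.Get(7);    // miss

        Console.WriteLine("hits={0}, misses={1}", cache.Hits, cache.Misses);
        // prints "hits=1, misses=2"
    }
}
```

Note that nothing in this sketch ever removes an item. Expiration and eviction are what make a cache safe to use in practice.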

Page 5: Mini-Training: To cache or not to cache

Caching: why? how? when? who?

• Why?
  – Better performance: the system loads faster and is more responsive
  – Better scalability: limits bottlenecks in a system
  – Better robustness: can support more load
  – Reduced costs: fewer round trips, servers and hardware
• How?
  – Many APIs/frameworks are available to meet all of your needs: always easy to use
  – Many types of caching: “think cache”!
• When?
  – Not too late: “designed with cache” is better than “designed for being cached”
• Who?
  – You (the devs): you will never get a business request to use caching

Do you remember “Quality of Service”? Caching is typically implicit, and you will have to put it in your projects yourself.

Page 6: Mini-Training: To cache or not to cache

Caching: short quiz

• Does caching always improve performance?
  No: even if performance is the target, an unhealthy cache will hurt performance. “Cache everything” is also counterproductive.
• Is caching small data items useless?
  No: ideal caching candidates are items that are frequently accessed, take long to compute, or change infrequently.
• Is caching mandatory for web apps?
  Yes: because of high traffic, dynamic page content, … and because our business involves frequent updates (live events, odds, …).
• Is caching only for the data access layer?
  No: you can cache at any layer. Caching raw data is only a small part of the job.
• Caching == indexes?
  No: indexes are a way to get data faster; a cache is a (possibly stale) copy of the data.

Page 7: Mini-Training: To cache or not to cache

Caching: terminology

• Cache Hit: the requested data is contained in the cache
• Cache Miss: the requested data is not in the cache and has to be recomputed or fetched from its original storage
• Cache Key: unique identifier for a data item in the cache
• Expiration
  – Absolute: the item expires at a specific date, regardless of how often it is accessed
  – Sliding: the item expires a given amount of time after it was last accessed
• Backing Store: persists cached data on disk
• Cache Scavenging: deleting items from the cache when memory is scarce
• MRU/LRU/LFU: algorithms for choosing which objects to remove from the cache when scavenging
• Memoization: avoids repeating the calculation of results for previously processed inputs
• Cache Dependency: makes an item’s lifetime in the cache dependent on other application elements such as files or databases
• Local Cache: caching data on clients rather than on servers
• Distributed Cache: extension of the traditional concept of a cache that may span multiple servers
• …
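To make scavenging with an LRU policy concrete, here is a minimal sketch of a fixed-capacity cache that evicts the least recently used item when it is full. `LruCache` is a hypothetical class written for this illustration, not a framework type. It is not thread-safe and ignores expiration.

```csharp
using System;
using System.Collections.Generic;

// Fixed-capacity cache that evicts the least recently used (LRU) item.
public sealed class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    // Most recently used at the front, least recently used at the back.
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order =
        new LinkedList<KeyValuePair<TKey, TValue>>();
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _index =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();

    public LruCache(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        _capacity = capacity;
    }

    public bool TryGet(TKey key, out TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (_index.TryGetValue(key, out node))
        {
            _order.Remove(node);       // a hit moves the item to the front
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }

    public void Set(TKey key, TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> existing;
        if (_index.TryGetValue(key, out existing))
        {
            _order.Remove(existing);   // replace the old entry for this key
            _index.Remove(key);
        }
        else if (_index.Count >= _capacity)
        {
            // Scavenging: drop the least recently used entry.
            var lru = _order.Last;
            _order.RemoveLast();
            _index.Remove(lru.Value.Key);
        }
        _index[key] = _order.AddFirst(new KeyValuePair<TKey, TValue>(key, value));
    }
}

public static class LruCacheDemo
{
    public static void Main()
    {
        var cache = new LruCache<string, int>(2);
        int ignored;

        cache.Set("a", 1);
        cache.Set("b", 2);
        cache.TryGet("a", out ignored);   // "a" is now the most recently used
        cache.Set("c", 3);                // cache is full: evicts "b"

        Console.WriteLine(cache.TryGet("b", out ignored));   // False
        Console.WriteLine(cache.TryGet("a", out ignored));   // True
    }
}
```

An MRU or LFU policy would differ only in which entry is picked for removal inside `Set`.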

Page 8: Mini-Training: To cache or not to cache

Caching in Web Applications: in the next slides

• CDN (reverse proxy)
  – Serves content to end users with high availability and high performance (images, scripts, media, …)
• User-scoped caching (Session)
  – Allows us to store data that persists between multiple requests on a per-user basis
• Memory cache
• Browser cache
• HTML5 cache features
• ASP.NET caching techniques

Page 9: Mini-Training: To cache or not to cache

System.Runtime.Caching.MemoryCache: those who do not know this cache do not know Betclic very well

• Previously called ASP.NET cache or Web Cache
• Built-in, supported and recommended by Microsoft
• Basically a key-value store (dictionary) that lives throughout the lifetime of the application, but ASP.NET automatically manages the removal of cached items (expiration, eviction, dependency, …)
• Be aware of double-checked locking
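One common way to apply double-checked locking around `MemoryCache` is sketched below. It assumes a reference to the `System.Runtime.Caching` assembly (a NuGet package on newer .NET versions). `CachedLoader`, `GetOrLoad` and the `"match:1"` key are hypothetical names for this illustration. The single global lock keeps the sketch short; a real application might prefer one lock per key.

```csharp
using System;
using System.Runtime.Caching;

public static class CachedLoader
{
    private static readonly ObjectCache Cache = MemoryCache.Default;
    private static readonly object Sync = new object();

    // Returns the cached value for 'key'; 'load' runs only when the item is missing.
    // 'load' must not return null, because MemoryCache cannot store null values.
    public static T GetOrLoad<T>(string key, Func<T> load, TimeSpan timeToLive)
        where T : class
    {
        // First check, without taking the lock (the common, fast path).
        var cached = Cache.Get(key) as T;
        if (cached != null) return cached;

        lock (Sync)
        {
            // Second check: another thread may have filled the cache
            // while this one was waiting for the lock.
            cached = Cache.Get(key) as T;
            if (cached != null) return cached;

            T value = load();
            var policy = new CacheItemPolicy
            {
                // Absolute expiration: the item is dropped after 'timeToLive'.
                AbsoluteExpiration = DateTimeOffset.UtcNow.Add(timeToLive)
            };
            Cache.Set(key, value, policy);
            return value;
        }
    }
}

public static class CachedLoaderDemo
{
    public static void Main()
    {
        int loads = 0;
        Func<string> slowLoad = () => { loads++; return "odds-for-match-1"; };

        CachedLoader.GetOrLoad("match:1", slowLoad, TimeSpan.FromMinutes(5));
        CachedLoader.GetOrLoad("match:1", slowLoad, TimeSpan.FromMinutes(5));

        Console.WriteLine(loads);   // prints 1: the loader ran only once
    }
}
```

Without the second check inside the lock, several threads that miss at the same time would each run the expensive load and overwrite each other's result.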

Page 10: Mini-Training: To cache or not to cache

Browser Cache: part of the HTTP protocol

Resources that are cached locally by the browser are controlled by three basic mechanisms, defined by HTTP headers.
• Freshness allows a response to be used without re-checking it on the origin server, and can be controlled by both the server and the client.
  – The “Expires” response header gives a date when the document becomes stale
  – “Cache-Control: max-age” tells the cache how many seconds the response is fresh for
• Validation is used to check whether a cached response is still good after it becomes stale.
  – For example, if the response has a “Last-Modified” header, a cache can make a conditional request using the “If-Modified-Since” header to see if it has changed
  – The “ETag” mechanism allows for strong validation (via the “If-None-Match” header), whereas the “If-Modified-Since” header is often referred to as weak validation
• Invalidation is usually a side effect of another request that passes through the cache. For example, if the URL associated with a cached response subsequently gets a POST, PUT or DELETE request, the cached response will be invalidated.

What’s the difference between a response 200 (from cache) and 304 (Not Modified)?
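As a hint for the question above: “200 (from cache)” is what browser developer tools typically show when the cached copy was still fresh and no request was sent at all, whereas 304 means the browser did contact the server to validate a stale copy and was told to reuse it. The validation step can be sketched as follows. `ConditionalGet` is a hypothetical helper, and hashing the body is only one possible way to derive an ETag.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ConditionalGet
{
    // Derives a validator from the response body (one possible scheme).
    public static string ComputeETag(string body)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(body));
            // ETag values are quoted strings; the first 8 bytes are enough for a sketch.
            return "\"" + BitConverter.ToString(hash, 0, 8).Replace("-", "") + "\"";
        }
    }

    // Returns the status code the server would answer with.
    // 'ifNoneMatch' is the validator sent by the browser, or null on a first visit.
    public static int Respond(string currentBody, string ifNoneMatch)
    {
        string currentETag = ComputeETag(currentBody);
        return ifNoneMatch == currentETag
            ? 304    // Not Modified: no body sent, the browser reuses its copy
            : 200;   // OK: full body sent again, together with the new ETag
    }
}

public static class ConditionalGetDemo
{
    public static void Main()
    {
        string page = "<html>odds: 1.85</html>";
        string etag = ConditionalGet.ComputeETag(page);

        Console.WriteLine(ConditionalGet.Respond(page, null));   // 200 (first visit)
        Console.WriteLine(ConditionalGet.Respond(page, etag));   // 304 (unchanged)
        Console.WriteLine(ConditionalGet.Respond("<html>odds: 1.90</html>", etag));   // 200 (changed)
    }
}
```

A 304 still costs a round trip to the server. Only freshness (“Expires”, “Cache-Control: max-age”) avoids the request entirely.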

Page 11: Mini-Training: To cache or not to cache

Output Cache: an efficient cache for dynamic content

• Caches the HTML that is generated as the result of a request
• Simply add an attribute to a controller or an action
• Able to cache multiple versions of the same controller action based on the request parameters used to call the action
• By default, content is cached in three locations: the web server, any proxy servers, and the user’s web browser, but you can have fine-grained control

Warning: do not add output caching to anything bound to the user, such as session data.
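In ASP.NET MVC this is declarative: the `OutputCache` attribute takes settings such as `Duration` and `VaryByParam`. The sketch below is not that implementation. It is a plain C# illustration of the idea of keeping one cached HTML version per parameter value, for a fixed duration. `OutputCacheSketch`, `Render` and the injected clock are hypothetical names chosen so that the example runs without a web server.

```csharp
using System;
using System.Collections.Generic;

// Caches rendered HTML per (action, parameter value) for a fixed duration.
public sealed class OutputCacheSketch
{
    private sealed class Entry
    {
        public string Html;
        public DateTime ExpiresAtUtc;
    }

    private readonly Dictionary<string, Entry> _entries = new Dictionary<string, Entry>();
    private readonly TimeSpan _duration;
    private readonly Func<DateTime> _utcNow;   // injected clock, to keep the demo deterministic

    public OutputCacheSketch(TimeSpan duration, Func<DateTime> utcNow)
    {
        _duration = duration;
        _utcNow = utcNow;
    }

    public string Render(string action, string parameter, Func<string, string> renderAction)
    {
        string key = action + "?" + parameter;   // one cached version per parameter value
        Entry entry;
        if (_entries.TryGetValue(key, out entry) && entry.ExpiresAtUtc > _utcNow())
        {
            return entry.Html;                   // served without running the action
        }

        string html = renderAction(parameter);
        _entries[key] = new Entry { Html = html, ExpiresAtUtc = _utcNow() + _duration };
        return html;
    }
}

public static class OutputCacheSketchDemo
{
    public static void Main()
    {
        var now = new DateTime(2013, 2, 6, 12, 0, 0, DateTimeKind.Utc);
        var cache = new OutputCacheSketch(TimeSpan.FromSeconds(60), () => now);

        int renders = 0;
        Func<string, string> details = id => { renders++; return "<h1>Match " + id + "</h1>"; };

        cache.Render("Details", "1", details);   // rendered
        cache.Render("Details", "1", details);   // served from the cache
        cache.Render("Details", "2", details);   // other parameter value: rendered
        now = now.AddSeconds(61);
        cache.Render("Details", "1", details);   // expired: rendered again

        Console.WriteLine(renders);   // prints 3
    }
}
```

The key contains nothing about the current user, which is exactly why user-bound output must not be cached this way: the first visitor's page would be served to everyone else.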

Page 12: Mini-Training: To cache or not to cache

Output Cache: Donut Caching vs Donut Hole Caching

• Donut Caching
  Server-side caching technique in which the entire page gets cached except for small portions which remain dynamic. It’s not a native feature of MVC (why?) but there are NuGet packages.
• Donut Hole Caching
  Donut hole caching is where you cache one or more parts of a page, but not the entire page. It is handled by using the built-in OutputCache attribute on one or more child actions (called using Html.Action or Html.RenderAction from the parent view).

Page 13: Mini-Training: To cache or not to cache

HTML5 Web Storage: Local Storage / Session Storage

• A “super cookie” that allows us to persist data (up to 5 MB) in the browser (shared between tabs and kept after restart)
• Stores only strings (objects can be serialized with JSON)
• JS API to store, get, remove, clear, get remaining space, …
• Many frameworks available, such as http://www.jstorage.info/ (caching and TTL support for all browsers)
• Used at Betclic for Live/Multiplex (keeps static translations)

Page 14: Mini-Training: To cache or not to cache

HTML Cache Manifest: for offline web applications

• HTML5 feature to access a web application even without a network connection
  – Avoids the classic “404 Not Found” page, but not only that …
  – In the top 5 supported HTML5 features on mobile devices
• Cache Manifest File: allows a dev to specify which files the browser should cache and make available to offline users
  – CACHE: files that will be explicitly cached after they’re downloaded
  – NETWORK: white-listed resources that require a connection; these resources bypass the cache
  – FALLBACK: fallback file used if a resource is inaccessible
• There are also APIs to check the cache state and to switch between online and offline work
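A manifest with the three sections described above could look like the following sketch. All file paths are hypothetical. The page would reference the file through the `manifest` attribute of its `<html>` element, and the server has to serve it with the `text/cache-manifest` MIME type.

```
CACHE MANIFEST
# v1 (changing this comment makes browsers download the files again)

CACHE:
/css/site.css
/js/app.js
/img/logo.png

NETWORK:
/api/

FALLBACK:
/ /offline.html
```

Here every URL under `/api/` always goes to the network, and any other page that cannot be reached falls back to `/offline.html`.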

Page 15: Mini-Training: To cache or not to cache

Caching for Information Systems and Web Farms: on the road to another type of cache

All the previous techniques are good but suffer from major drawbacks:
– Cached data resides in the same process/server
  If we have 30 front-end servers, each of them has to set up its own cache.
– No high availability
  If the web site is restarted or the application pool is recycled, all cached items are lost.
– Cached data is limited to the server’s capacity
  If the amount of cached data becomes large, we need additional memory on the server.
– Does not fit very well in web farms/information systems
  Imagine a company with databases in Gib and front-end servers in Paris …

Couldn't we find a way to solve all these problems?

Distributed Cache

Page 16: Mini-Training: To cache or not to cache

Distributed Cache: introduction

Quite recent in web applications because
• Memory is now very cheap and network cards are very fast
• Works well on the lower-cost machines usually employed for web servers, as opposed to database servers, which require expensive hardware

Currently two main approaches
• Use a real distributed cache (e.g. Memcached)
• Use a NoSQL database

Page 17: Mini-Training: To cache or not to cache

Typical Web Architecture: without distributed cache

Users
• Need to route users to the same machine (i.e. sticky sessions)

Web Tier
• Each machine round trips for data
• Some data might be expensive to retrieve
• Cached data is typically stored in the memory of one (each) server

Database
• CPU and disk can get saturated due to an increase in traffic

Page 18: Mini-Training: To cache or not to cache

Scalable Web Architecture: with distributed cache

Users
• No sticky load balancing needed: all servers have access to the same cached data

Web Tier
• Easy access to the cache cluster

Caching Tier
• Multiple machines mean scale and potential for high availability
• More machines == more memory for cache objects

Database
• Reduces load on the database

Page 19: Mini-Training: To cache or not to cache

Windows Server AppFabric: an example of distributed caching

• A distributed in-memory cache for “data”
• .NET client API (NuGet package)
• Two patterns:
  – Cache-Aside
  – Read-Through and Write-Behind
• Main features
  – Logical containers (Cache/Region)
  – Local cache
  – Expiration/eviction
  – Notifications
  – Secure API
  – High availability
  – Concurrency model
  – Tags
  – Monitoring API
  – …

// Requires the AppFabric client assemblies
using Microsoft.ApplicationServer.Caching;

// Create a DataCacheFactory based on the config file
var dcf = new DataCacheFactory();

// Get the cache named "MyCache"
var cache = dcf.GetCache("MyCache");

// Add an item called "Test" - throws if it already exists
cache.Add("Test", new Data { TheData = "Test" });

// Get "Test" - add it if it is not in the cache (cache-aside)
var test = cache.Get("Test") as Data;
if (test == null)
{
    test = new Data { TheData = "Test" };
    cache.Add("Test", test);
}

Page 20: Mini-Training: To cache or not to cache

Windows Server AppFabric Caching: bonuses

• Additional ASP.NET providers:
  – Session State Provider
  – Output Cache Provider

Page 21: Mini-Training: To cache or not to cache

Caching challenges: remember this!

• Adding caching to a system is always easy
• Items in the cache can become out-of-date or stale
  – Find the best expiration duration
• Be careful with multiple cache levels
  – DB cache + service cache + API cache + browser cache = ???
• Be careful with cache health
  – Always monitor your cache
• Be careful with cache context
  – Do not include user-specific data
• Find the best granularity for your usage
  – Large list vs many small items?
• …

Page 22: Mini-Training: To cache or not to cache

Questions?

Page 23: Mini-Training: To cache or not to cache

Appendices

• http://www.mnot.net/cache_docs/
• MvcDonutCaching package: http://nuget.org/packages/MvcDonutCaching
• AppFabric.Client package: http://nuget.org/packages/ServerAppFabric.Client/
• http://html5demos.com/
• http://www.html5rocks.com
• Caching Architecture Guide for .NET Framework Applications: http://msdn.microsoft.com/en-us/library/ee817646.aspx
• http://redis.io/
• http://www.infoq.com/news/2011/11/distributed-cache-nosql-data-sto

Page 24: Mini-Training: To cache or not to cache

Find out more

• On https://techblog.betclicgroup.com/

Page 25: Mini-Training: To cache or not to cache

About Betclic

• Betclic Everest Group, one of the world leaders in online gaming, has a unique portfolio comprising various complementary international brands: Betclic, Everest Gaming, bet-at-home.com, Expekt…
• Active in 100 countries with more than 12 million customers worldwide, the Group is committed to promoting secure and responsible gaming and is a member of several international professional associations including the EGBA (European Gaming and Betting Association) and the ESSA (European Sports Security Association).
• Through our brands, Betclic Everest Group places expertise, technological know-how and security at the heart of our strategy to deliver an online gaming offer attuned to the passion of our players.