Intro to Caching, Caching Algorithms and Caching Frameworks

Transcript

Friday, January 2, 2009

Intro to Caching, Caching Algorithms and Caching Frameworks, part 1

Introduction:

A lot of us have heard the word cache, and when you ask about caching many people can give you a perfect definition, yet they don't know how a cache is built, or on which criteria one caching framework should be favored over another, and so on. In this article we are going to talk about caching, caching algorithms and caching frameworks, and which is better than the other.

The Interview:

"Caching is a temp location where I store data in (data that I need it frequently) as the original data is expensiveto be fetched, so I can retrieve it faster. "

That is what programmer 1 answered in the interview (one month ago he submitted his resume to a company that wanted a Java programmer with strong experience in caching and caching frameworks and extensive data manipulation).

Programmer 1 had made his own cache implementation using a hashtable, and that is all he knows about caching. His hashtable contains about 150 entries, which he considers extensive data (caching = hashtable, load the lookups into the hashtable and everything will be fine, nothing else). So let's see how the interview goes.

Interviewer: Nice, and based on what criteria do you choose your caching solution?

Programmer 1: huh, (thinking for 5 minutes), mmm, based on, on, on the data (coughing…)

Interviewer: excuse me! Could you repeat what you just said again?

Programmer 1: data?!

Interviewer: oh I see, ok list some caching algorithms and tell me which is used for what

Programmer 1: (staring at the interviewer and making strange expressions with his face, expressions that no one knew a human face could make :D )

Interviewer: ok, let me ask it in another way: how will a cache behave when it reaches its capacity?

Programmer 1: capacity? Mmm (thinking… a hashtable is not limited by capacity, I can add what I want and it will extend its capacity) (that was in programmer 1's mind, he didn't say it)

The interviewer thanked programmer 1 (the interview lasted only 10 minutes). After that a woman came and said: oh, thanks for your time, we will call you back, have a nice day.

This was the worst interview programmer 1 ever had (he didn't read that there was a part in the job description which stated that the candidate should have a strong caching background; in fact he only saw the line talking about the excellent package ;) ).

Talk the talk and then walk the walk

After programmer 1 left, he wanted to know what the interviewer was talking about and what the answers to his questions were, so he started to surf the net. Programmer 1 didn't know anything about caching except: when I need a cache, I will use a hashtable. After using his favorite search engine he was able to find a nice caching article and started to read.

Why do we need cache?

A long time ago, before the caching age, a user used to request an object, and this object was fetched from a storage place. As the object grew bigger and bigger, the user spent more time fulfilling his request. It also really made the storage place suffer, because it had to work the whole time. This made both the user and the db angry, and there was one of two possibilities:

1- The user gets upset, complains, and even won't use the application again (that was almost always the case).

2- The storage place packs up its bags and leaves your application, which makes big problems (no place to store data) (happened in rare situations).

Caching is a godsend:

A few years later, researchers at IBM (in the 60s) introduced a new concept and named it "Cache".

What is Cache?

Caching is a temp location where I store the data that I need frequently; as the original data is expensive to fetch, this way I can retrieve it faster.

A cache is made of a pool of entries. These entries are copies of real data that live in storage (a database, for example), and each entry is tagged with a tag (a key identifier) used for retrieval. Great, so programmer 1 already knows this, but what he doesn't know are the caching terminologies, which are as follows:

Cache Hit:

When the client invokes a request (let's say he wants to view product information) and our application gets the request, it will need to access the product data in our storage (database), so it first checks the cache.

If an entry can be found with a tag matching that of the desired data (say the product id), the entry is used instead. This is known as a cache hit (the cache hit is the primary measurement of caching effectiveness; we will discuss that later on). The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache.

Cache Miss:

On the contrary, when the tag isn't found in the cache (no match was found), this is known as a cache miss. A hit to the back storage is made and the data is fetched back and placed in the cache, so future requests for it will result in a cache hit.
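In code, that hit/miss flow is just a lookup followed by a conditional load. Below is a minimal read-through sketch; the Database interface and its fetch method are hypothetical stand-ins for the back storage, not part of any framework discussed here:

import java.util.HashMap;
import java.util.Map;

interface Database {
    Object fetch(String key); // hypothetical back-storage lookup
}

public class ReadThroughCache {

    private final Map<String, Object> cache = new HashMap<String, Object>();
    private final Database storage;

    public ReadThroughCache(Database storage) {
        this.storage = storage;
    }

    public synchronized Object get(String key) {
        Object value = cache.get(key);
        if (value != null) {
            return value;           // cache hit: served from memory
        }
        value = storage.fetch(key); // cache miss: hit the back storage
        cache.put(key, value);      // keep it so future requests hit
        return value;
    }
}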

If we encounter a cache miss, there can be one of two scenarios:

First scenario: there is free space in the cache (the cache didn't reach its limit), so the object that caused the cache miss will be retrieved from our storage and inserted into the cache.

Second scenario: there is no free space in the cache (the cache reached its capacity), so the object that caused the cache miss will be fetched from the storage, and then we will have to decide which object in the cache we need to evict in order to place the newly fetched one. This is done by the replacement policy (caching algorithms), which decides which entry will be removed to make more room and will be discussed below.

Storage Cost:


When a cache miss occurs, data is fetched from the back storage, loaded and placed in the cache, but how much space does the data we just fetched take in the cache memory? This is known as the storage cost.

Retrieval Cost:

And when we need to load the data, we need to know how much it takes to load it. This is known as the retrieval cost.

Invalidation:

When an object that resides in the cache is updated in the back storage, the cached copy needs to be updated too; keeping the cache up to date is known as invalidation. The entry is invalidated in the cache and fetched again from the back storage to get the updated version.
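Continuing the hypothetical ReadThroughCache sketch from above, invalidation can be as simple as dropping the stale entry (lazy, reloaded on the next miss) or eagerly refreshing it; both methods below are illustrative assumptions, not a specific framework's API:

// Lazy: drop the stale copy; the next get() fetches the fresh version.
public synchronized void invalidate(String key) {
    cache.remove(key);
}

// Eager: refresh the entry from the back storage right away.
public synchronized void refresh(String key) {
    cache.put(key, storage.fetch(key));
}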

Replacement Policy:

When a cache miss happens and we don't have enough room, the cache ejects some other entry in order to make room for the previously uncached data. The heuristic used to select the entry to eject is known as the replacement policy.

Optimal Replacement Policy:

The theoretically optimal page replacement algorithm (also known as OPT or Belady's optimal page replacement policy) tries to achieve the following: when a cached object needs to be placed in the cache, the algorithm should replace the entry which will not be used for the longest period of time.

For example, a cache entry that is not going to be used for the next 10 seconds will be replaced in preference to an entry that is going to be used within the next 2 seconds.

Thinking about the optimal replacement policy, we can say it is impossible to achieve, but some algorithms do near-optimal replacement based on heuristics. So everything is based on heuristics; what makes one algorithm better than another? And what do they use for their heuristics?
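Since OPT needs knowledge of the future, it can only be computed offline, for example when replaying a recorded request trace to grade other algorithms. Here is a toy sketch of the victim selection, assuming the future request sequence is handed to us:

import java.util.List;

public class BeladyVictimPicker {

    // Given the keys currently cached and the (normally unknowable) future
    // request sequence, return the key whose next use is farthest away.
    public static String pickVictim(List<String> cachedKeys, List<String> futureRequests) {
        String victim = null;
        int farthest = -1;
        for (String key : cachedKeys) {
            int nextUse = futureRequests.indexOf(key);
            if (nextUse == -1) {
                return key;         // never used again: the perfect victim
            }
            if (nextUse > farthest) {
                farthest = nextUse;
                victim = key;
            }
        }
        return victim;
    }
}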

Nightmare at Java Street:

While reading the article, programmer 1 fell asleep and had a nightmare (the scariest nightmare one can ever have).

Programmer 1: nihahha I will invalidate you. (Talking in a mad way)

Cached Object: no no please let me live, they still need me, I have children.

Programmer 1: all cached entries say that before they are invalidated, and since when do you have children? Never mind, now vanish forever.

Buhaaahaha, laughed programmer 1 in a scary way. Silence took over the place for a few minutes, and then a police siren broke this silence; the police caught programmer 1, and he was accused of invalidating an entry that was still needed by a cache client, and he was sent to jail.

Programmer 1 woke up really scared; he started to look around and realized that it was just a dream. Then he continued reading about caching and tried to get rid of his fears.

Caching Algorithms:

No one can talk about caching algorithms better than the caching algorithms themselves

Least Frequently Used (LFU):

I am Least Frequently Used; I count how often an entry is needed by incrementing a counter associated with each entry.


I evict the entry with the least-frequently-used counter first. I am not that fast, and I am not that good at adaptive actions (adaptive means keeping the entries which are really needed and discarding the ones that aren't needed for the longest period, based on the access pattern, or in other words the request pattern).

Least Recently Used (LRU):

I am the Least Recently Used cache algorithm; I remove the least recently used items first, the ones that weren't used for the longest time.

I require keeping track of what was used when, which is expensive if one wants to make sure that I always discard the least recently used item. Web browsers use me for caching. New items are placed at the top of the cache. When the cache exceeds its size limit, I discard items from the bottom. The trick is that whenever an item is accessed, I place it at the top.

So items which are frequently accessed tend to stay in the cache. There are two ways to implement me: either an array or a linked list (which has the least recently used entry at the back and the recently used ones at the front).

I am fast, and I am adaptive, in other words I can adapt to the data access pattern. I have a large family which completes me, and they are even better than me (I do feel jealous sometimes, but it is ok). Some of my family members are LRU2 and 2Q (they were implemented in order to improve LRU caching).

Least Recently Used 2(LRU2):

I am Least Recently Used 2; some people call me Least Recently Used Twice, which I like more. I add entries to the cache the second time they are accessed (it requires two accesses in order to place an entry in the cache); when the cache is full, I remove the entry whose second most recent access is oldest. Because of the need to track the two most recent accesses, the access overhead increases with cache size; if I am applied to a big cache, that can be a disadvantage. In addition, I have to keep track of some items that are not yet in the cache (they haven't been requested twice yet). I am better than LRU, and I am also adaptive to access patterns.

-Two Queues:

I am Two Queues; I add entries to an LRU cache as they are accessed. If an entry is accessed again, I move it to a second, larger, LRU cache.

I remove entries so as to keep the first cache at about 1/3 the size of the second. I provide the advantages of LRU2 while keeping the cache access overhead constant, rather than having it increase with cache size, which makes me better than LRU2, and like the rest of my family I am adaptive to access patterns. A simplified sketch of the idea follows below.
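A minimal sketch of the two-queues idea, using two LinkedHashMaps: a FIFO probation queue for entries seen once and an access-ordered LRU map for entries seen again. This is the simplified 2Q described above, not the full algorithm from the original paper, and the sizes (for example 1/3 and 2/3 of the total) are left to the caller:

import java.util.LinkedHashMap;
import java.util.Map;

public class TwoQueuesCache<K, V> {

    private final Map<K, V> fifo; // probation: entries seen once
    private final Map<K, V> lru;  // protected: entries seen at least twice

    public TwoQueuesCache(final int fifoSize, final int lruSize) {
        fifo = new LinkedHashMap<K, V>(16, 0.75f, false) {
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > fifoSize; // evict the oldest first-timer
            }
        };
        lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > lruSize;  // evict the least recently used
            }
        };
    }

    public synchronized V get(K key) {
        V value = lru.get(key);
        if (value != null) {
            return value;                 // already promoted
        }
        value = fifo.remove(key);
        if (value != null) {
            lru.put(key, value);          // second access: promote
        }
        return value;
    }

    public synchronized void put(K key, V value) {
        if (lru.containsKey(key)) {
            lru.put(key, value);
        } else {
            fifo.put(key, value);         // first access: probation
        }
    }
}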

Adaptive Replacement Cache (ARC):

I am Adaptive Replacement Cache; some people say that I balance between LRU and LFU to improve the combined result. Well, that's not 100% true; actually I am made of 2 LRU lists. One list, say L1, contains entries that have been seen only once "recently", while the other list, say L2, contains entries that have been seen at least twice "recently".

Items that have been seen twice within a short time have a low inter-arrival rate and hence are thought of as "high-frequency". So we think of L1 as capturing "recency" and L2 as capturing "frequency"; that is why most people think I am a balance between LRU and LFU, but that is ok, I am not angry about that.

I am considered one of the best-performing replacement algorithms: self-tuning and a low-overhead replacement cache. I also keep a history of entries equal to the size of the cache; this is to remember the entries that were removed, and it allows me to see if a removed entry should have stayed and whether we should have chosen another one to remove (I really have a bad memory). And yes, I am fast and adaptive.

Most Recently Used (MRU):

I am Most Recently Used; in contrast to LRU, I remove the most recently used items first. You will surely ask me why. Well, let me tell you something: when access is unpredictable, and determining the least recently used entry in the cache system is a high-time-complexity operation, I am the best choice, that's why.

I am common in database memory caches. Whenever a cached record is used, I move it to the top of the stack. And when there is no room, guess what? I replace the top-most entry with the new entry.

First in First out (FIFO):

I am First In First Out; I am a low-overhead algorithm that requires little effort for managing the cache entries. The idea is that I keep track of all the cache entries in a queue, with the most recent entry at the back and the earliest entry at the front. When there is no place left and an entry needs to be replaced, I remove the entry at the front of the queue (the oldest entry) and replace it with the newly fetched entry. I am fast, but I am not adaptive.

-Second Chance:

Hello, I am Second Chance, a modified form of the FIFO replacement algorithm, known as the Second Chance replacement algorithm; I am better than FIFO at little cost for the improvement. I work by looking at the front of the queue as FIFO does, but instead of immediately replacing the cache entry (the oldest one), I check to see if its referenced bit is set (I use a bit that tells me whether this entry has been used or requested before or not). If it is not set, I replace this entry. Otherwise, I clear the referenced bit and insert this entry at the back of the queue (as if it were a new entry), and I keep repeating this process. You can think of this as a circular queue. The second time I encounter the same entry, I will replace it, as its referenced bit is now cleared. I am better than FIFO in speed.

-Clock:

I am Clock, and I am a more efficient version of FIFO than Second Chance, because I don't push the cached entries to the back of the list like Second Chance does, but I perform the same general function as Second Chance.

I keep a circular list of the cached entries in memory, with the "hand" (something like an iterator) pointing to the oldest entry in the list. When a cache miss occurs and no empty place exists, I consult the R (referenced) bit at the hand's location to know what I should do. If R is 0, I place the new entry at the hand's position; otherwise I clear the R bit, increment the hand (iterator) and repeat the process until an entry is replaced. I am even faster than Second Chance. A compact sketch of my hand sweep follows below.
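A compact sketch of that hand sweep, under the assumption that the key-to-slot lookup table is kept elsewhere and omitted here for brevity (a real cache would pair this with a hash table, as the earlier implementations do):

public class ClockCache {

    private final Object[] keys;
    private final Object[] values;
    private final boolean[] referenced; // the R bit per slot
    private int hand = 0;               // points at the oldest entry

    public ClockCache(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
        referenced = new boolean[capacity];
    }

    // On a miss with no empty place: sweep until a slot with R == 0 is found.
    private int evictSlot() {
        while (true) {
            if (!referenced[hand]) {
                int victim = hand;              // R is 0: replace this entry
                hand = (hand + 1) % keys.length;
                return victim;
            }
            referenced[hand] = false;           // clear R and move the hand on
            hand = (hand + 1) % keys.length;
        }
    }

    public void add(Object key, Object value) {
        int slot = evictSlot();
        keys[slot] = key;
        values[slot] = value;
        referenced[slot] = true; // a hit on this slot would also set the R bit
    }
}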

Simple time-based:

I am simple time-based caching; I invalidate entries in the cache based on absolute time periods. I add items to the cache, and they remain in the cache for a specific amount of time. I am fast, but not adaptive to access patterns.

Extended time-based expiration:

I am extended time-based expiration cache; I invalidate the items in the cache based on relative time points. I add items to the cache, and they remain in the cache until I invalidate them at certain points in time, such as every five minutes or each day at 12:00.

Sliding time-based expiration:

I am sliding time-based expiration; I invalidate entries in the cache by specifying the amount of time an item is allowed to be idle in the cache after its last access time; after that time I invalidate it. I am fast, but not adaptive to access patterns.
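The time-based variants differ only in how the expiry deadline is computed. A minimal sketch, where the sliding flag is an assumption made to show absolute and sliding expiration in one class (extended expiration would instead compute deadlines from a schedule):

import java.util.HashMap;
import java.util.Map;

public class TimedCache {

    private static class Entry {
        Object value;
        long expiresAt;
    }

    private final Map<Object, Entry> entries = new HashMap<Object, Entry>();
    private final long ttlMillis;
    private final boolean sliding; // true: idle timeout, false: absolute TTL

    public TimedCache(long ttlMillis, boolean sliding) {
        this.ttlMillis = ttlMillis;
        this.sliding = sliding;
    }

    public synchronized void put(Object key, Object value) {
        Entry e = new Entry();
        e.value = value;
        e.expiresAt = System.currentTimeMillis() + ttlMillis;
        entries.put(key, e);
    }

    public synchronized Object get(Object key) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        long now = System.currentTimeMillis();
        if (now >= e.expiresAt) {
            entries.remove(key);           // expired: invalidate
            return null;
        }
        if (sliding) {
            e.expiresAt = now + ttlMillis; // idle timer restarts on access
        }
        return e.value;
    }
}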

Ok, after we listened to some (famous) replacement algorithms talking about themselves, note that some other replacement algorithms take other criteria into consideration, like:

Cost: if items have different costs, keep those items that are expensive to obtain, e.g. those that take a long time to get.


Size: If items have different sizes, the cache may want to discard a large item to store several smaller ones.

Time: some caches keep information that expires (e.g. a news cache, a DNS cache, or a web browser cache). The computer may discard items because they have expired. Depending on the size of the cache, no further caching algorithm to discard items may be necessary.

The E-mail!

After programmer 1 read the article, he thought for a while and decided to send a mail to the author of this caching article. He felt like he had heard the author's name before, but he couldn't remember who this person was. Anyway, he sent him a mail asking what happens if he has a distributed environment: how will the cache behave?

The author of the caching article got his mail, and ironically it was the man who interviewed programmer 1 :D. The author replied and said:

Distributed caching:

*Cached data can be stored in a memory area separate from the caching directory itself (which handles the caching entries and so on); it can be across the network or on disk, for example.

*Distributing the cache allows an increase in the cache size.

*In this case the retrieval cost will also increase, due to network request time.

*This will also lead to a hit ratio increase, due to the larger size of the cache.

But how will this work?

Let's assume that we have 3 servers; 2 of them will handle the distributed caching (hold the caching entries), and the 3rd one will handle all the incoming requests (which ask about cached entries):

Step 1: the application requests keys entry1, entry2 and entry3. After resolving the hash values for these entries, it is decided, based on the hashing value, to forward each request to the proper server.

Step 2: the main node sends parallel requests to all relevant servers (those which have the cache entries we are looking for).

Step 3: the servers send responses to the main node (which sent the request in the 1st place asking for the cached entry).

Step 4: the main node sends the responses to the application (cache client).

*In case a cache entry is not found, the hashing value for the entry will still be computed and will route to either server 1 or server 2, for example; in this case our entry won't be found in server 1, so it will be fetched from the DB and added to server 1's caching list.
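The routing in step 1 boils down to hashing the key into a server index. A hypothetical sketch (the server names are made up for illustration; a production setup would rather use consistent hashing, so that adding a server doesn't remap most keys):

public class CacheRouter {

    private final String[] servers = { "cacheServer1", "cacheServer2" };

    // Hash the key and map it onto one of the cache servers.
    public String serverFor(String key) {
        int bucket = Math.abs(key.hashCode() % servers.length);
        return servers[bucket];
    }
}

So serverFor("entry1") always lands on the same server, which is what lets the main node know where to look for (or where to place) each entry.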

Measuring Cache:

Most caches can be evaluated by measuring the hit ratio and comparing it to the theoretical optimum; this is usually done by generating a list of cache keys with no real data. However, the hit ratio measurement assumes that all entries have the same retrieval cost, which is not true; for example, in web caching the number of bytes the cache can serve is more important than the hit ratio (I can replace one big entry with 10 small entries, which is more effective on the web).
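The basic bookkeeping behind the hit ratio is trivial; a minimal sketch:

public class CacheStats {

    private long hits;
    private long misses;

    public void recordHit()  { hits++; }
    public void recordMiss() { misses++; }

    // hit ratio = hits / total accesses
    public double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}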

Conclusion:

We have seen some of the popular algorithms that are used in caching; some of them are based on time, or cache object size, and some are based on frequency of usage. In the next part we are going to talk about caching frameworks and how they make use of these caching algorithms, so stay tuned ;)

Related Articles:

Part 2 (Algorithm Implementation)

Part 3 (Algorithm Implementation)

Part 4 (Frameworks Comparison)

Part 5 (Frameworks Comparison)

Posted by Ahmed Ali at 1:09 PM

Labels: Algorithms, Framework

Intro to Caching, Caching Algorithms and Caching Frameworks, part 2

Introduction:

In this part we are going to show how to implement some of the famous replacement algorithms mentioned in part 1. The code in this article is just for demonstration purposes, which means you will have to put in some extra effort if you want to make use of it in your application (if you are going to build your own implementation and won't use any caching framework).

The Leftover policy:

After programmer 1 read the article, he proceeded to review the comments on it. One of these comments was talking about a leftover policy, which is named "Random Cache".

Random Cache:

I am Random Cache; I replace any cache entry I want (you can call it the unlucky entry), I just do that and no one can complain about it. By doing this I remove any overhead of tracking references and so on. I am better than the FIFO policy, and in some cases I perform even better than LRU, but in general LRU is better than me.

It is comment time:

While programmer 1 was reading the rest of the comments, he found a very interesting comment about the implementation of some of the famous replacement policies. Actually it was a link to the commenter's site, which has the actual implementation, so programmer 1 clicked the link, and here is what he got:

Meet the Cache Element:

public class CacheElement {

    private Object objectValue;

    private Object objectKey;

    private int index;

    private int hitCount;

    // getters and setters.
}

This is the cache entry which we will use to hold the key and the value; it will be used in all the cache algorithm implementations.

Common Code for All Caches:

public final synchronized void addElement(Object key, Object value) {

    int index;
    Object obj;

    // get the entry from the table
    obj = table.get(key);

    // If we already have the entry in our table,
    // then get it and replace only its value.
    if (obj != null) {
        CacheElement element;

        element = (CacheElement) obj;
        element.setObjectValue(value);
        element.setObjectKey(key);

        return;
    }
}

The above code is common to all our implementations; it checks whether the CacheElement already exists in our cache. If so, we just need to replace its value and we don't need to do anything else. But what if we don't find it? Then we will have to dig deeper and see what happens below.

The Talk Show:

Today's episode is a special one: we have special guests, who are in fact competitors. We are going to hear what each of them has to say, but first let's introduce our guests:

Random Cache, FIFO Cache

Let’s start with the Random Cache.

Meet Random Cache implementation:

public final synchronized void addElement(Object key, Object value) {

    int index;
    Object obj;

    obj = table.get(key);

    if (obj != null) {
        CacheElement element;

        // Just replace the value.
        element = (CacheElement) obj;
        element.setObjectValue(value);
        element.setObjectKey(key);

        return;
    }

    // If we haven't filled the cache yet, put it at the end.
    if (!isFull()) {
        index = numEntries;
        ++numEntries;
    } else {
        // Otherwise, replace a random entry.
        index = (int) (cache.length * random.nextFloat());
        table.remove(cache[index].getObjectKey());
    }

    cache[index].setObjectValue(value);
    cache[index].setObjectKey(key);
    table.put(key, cache[index]);
}

Analyzing Random Cache Code (Talk show):

In today's show the Random Cache is going to explain its code, and here we go. I will go straight to the main point: if I am not full, then I will place the new entry that the client requested at the end of the cache (in case of a cache miss).

I do this by getting the number of entries that reside in the cache and assigning it to index (which will be the index of the entry the client is adding); after that I increment the number of entries.

if (!isFull()) {
    index = numEntries;
    ++numEntries;
}

If I don't have enough room for the current entry, I will have to kick out a random entry (totally random, bribing isn't allowed).

In order to get the random entry, I use the random utility shipped with Java to generate a random index, and I ask the table to remove the entry whose index equals the generated index.

else {
    // Otherwise, replace a random entry.
    index = (int) (cache.length * random.nextFloat());
    table.remove(cache[index].getObjectKey());
}

At the end I just place the entry in the cache, whether the cache was full or not.

cache[index].setObjectValue(value);
cache[index].setObjectKey(key);
table.put(key, cache[index]);


Magnifying the Code:

It is said that when you look at stuff from a near view, it is easier to understand, so that's why we have a magnifying glass, and we are going to magnify the code to get nearer to it (and maybe understand it more).

Cache entries in the same voice: hi ho, hi ho, into cache we go.

New cache entry: excuse me, I have a question! (asking a singing old cache entry near him)

Old cache entry: go ahead.

New cache entry: I am new here and I don’t understand my role exactly, how will the algorithm handle us?

Old cache entry: cache! (instead of man!), you remind me of myself when I was new (the 1st time I was added to the cache), I used to ask questions like that. Let me show you what will happen.

Meet FIFO Cache Implementation:

public final synchronized void addElement(Object key, Object value) {

    int index;
    Object obj;

    obj = table.get(key);

    if (obj != null) {
        CacheElement element;

        // Just replace the value.
        element = (CacheElement) obj;
        element.setObjectValue(value);
        element.setObjectKey(key);

        return;
    }

    // If we haven't filled the cache yet, put it at the end.
    if (!isFull()) {
        index = numEntries;
        ++numEntries;
    } else {
        // Otherwise, replace the current pointer entry with the new one,
        // in order to make a circular FIFO.
        index = current;
        if (++current >= cache.length)
            current = 0;

        table.remove(cache[index].getObjectKey());
    }

    cache[index].setObjectValue(value);
    cache[index].setObjectKey(key);
    table.put(key, cache[index]);
}

Analyzing FIFO Cache Code (Talk show):

After Random Cache finished, the audience went crazy for it, which made FIFO a little bit jealous, so FIFO started talking and said:

When there is no more room for the new cache entry, I have to kick out the entry at the front (the one that came first), as I work in a circular-queue-like manner. By default the current position is at the beginning of the queue (it points to the beginning of the queue).

I assign the current value to index (the index of the current entry) and then check whether the incremented current is greater than or equal to the cache length (because I want to reset the current pointer position to the beginning of the queue); if so, I set current to zero again. After that I just kick out the entry at the index position (which is now the first entry in the queue) and place the new entry.

else {
    // Otherwise, replace the current pointer, which takes care of
    // FIFO in a circular fashion.
    index = current;

    if (++current >= cache.length)
        current = 0;

    table.remove(cache[index].getObjectKey());
}

cache[index].setObjectValue(value);
cache[index].setObjectKey(key);
table.put(key, cache[index]);

Magnifying the Code:

Back to our magnifying glass we can observe the following actions happening to our entries



Conclusion:

As we have seen in this article, we implemented the FIFO replacement policy and also the Random replacement policy. In the upcoming articles we will take our magnifying glass and magnify the LFU and LRU replacement policies; till then, stay tuned ;)

Posted by Ahmed Ali at 11:05 PM

Labels: Algorithms, Framework

Intro to Caching, Caching Algorithms and Caching Frameworks, part 3

Introduction:

In part 1 we talked about the basics and terminologies of caching, and we also showed replacement policies. In part 2 we implemented some of these famous replacement policies, and now in this part we will continue with the implementation of two famous algorithms, which are LFU and LRU. Again, the implementation in this article is for the sake of demonstration (we just concentrate on the replacement algorithm and skip other things like loading data and so on); in order to use it you will have to do some extra work, but you can base your implementation on it.

Meet LFU Cache Implementation:

public synchronized Object getElement(Object key) {

    Object obj;

    obj = table.get(key);

    if (obj != null) {
        CacheElement element = (CacheElement) obj;
        element.setHitCount(element.getHitCount() + 1);
        return element.getObjectValue();
    }
    return null;
}

public final synchronized void addElement(Object key, Object value) {

    int index;
    Object obj;

    obj = table.get(key);

    if (obj != null) {
        CacheElement element;

        // Just replace the value.
        element = (CacheElement) obj;
        element.setObjectValue(value);
        element.setObjectKey(key);

        return;
    }

    if (!isFull()) {
        index = numEntries;
        ++numEntries;
    } else {
        CacheElement element = removeLfuElement();
        index = element.getIndex();
        table.remove(element.getObjectKey());
    }

    cache[index].setObjectValue(value);
    cache[index].setObjectKey(key);
    cache[index].setIndex(index);
    table.put(key, cache[index]);
}

public CacheElement removeLfuElement() {

    CacheElement[] elements = getElementsFromTable();
    CacheElement leastElement = leastHit(elements);
    return leastElement;
}

public static CacheElement leastHit(CacheElement[] elements) {

    CacheElement lowestElement = null;
    for (int i = 0; i < elements.length; i++) {
        CacheElement element = elements[i];
        if (lowestElement == null) {
            lowestElement = element;
        } else {
            if (element.getHitCount() < lowestElement.getHitCount()) {
                lowestElement = element;
            }
        }
    }
    return lowestElement;
}

Analyzing LFU Cache Code (Talk Show):

Presenter: it is getting hotter and hotter now; our next contestant is the LFU cache, please make some noise for it.

The audience began to scream for LFU, which made LFU hesitant.

Hello, I am LFU. When the cache client wants to add a new element and the cache is full (no room for the new entry), I have to kick out the least frequently used entry, with the help of the removeLfuElement method, which allows me to get the least frequently used element. After I get it, I remove this entry and place the new entry.

else {
    CacheElement element = removeLfuElement();
    index = element.getIndex();
    table.remove(element.getObjectKey());
}

If we dive into this method…, I am saying, if we dive into this method (still nothing happened). LFU tried pressing the next button on the presentation remote control (to get the next presentation slide), but it didn't work.

Ahh, now we are talking. Ok, if we dive into this method we will see that it is just getting all the elements in the cache by calling the getElementsFromTable method and then returning the element with the fewest hits.

public CacheElement removeLfuElement() {

    CacheElement[] elements = getElementsFromTable();
    CacheElement leastElement = leastHit(elements);
    return leastElement;
}

It does that by calling the leastHit method, which loops over the cache elements and checks whether the current element has the fewest hits; if so, I make it my lowestElement, the one that will be replaced by the new entry.

public static CacheElement leastHit(CacheElement[] elements) {

    CacheElement lowestElement = null;
    for (int i = 0; i < elements.length; i++) {
        CacheElement element = elements[i];
        if (lowestElement == null) {
            lowestElement = element;
        } else {
            if (element.getHitCount() < lowestElement.getHitCount()) {
                lowestElement = element;
            }
        }
    }
    return lowestElement;
}

LFU stopped talking and waited for a reaction from the audience, and the only reaction it got was scratching heads (the audience didn't get some stuff).

One of the production team whispered to the LFU cache: you didn't mention how the lowest element is distinguished from the other elements.

Then the LFU cache started talking again and said: by default, when you add an element to the cache, its hit count will be the same as the previous element's, so how do we handle the hit count thing?

Every time I encounter a cache hit, I increment the hit count of the entry and then return the entry the cache client asked for, which looks something like this:

public synchronized Object getElement(Object key) {

    Object obj;

    obj = table.get(key);

    if (obj != null) {
        CacheElement element = (CacheElement) obj;
        element.setHitCount(element.getHitCount() + 1);
        return element.getObjectValue();
    }
    return null;
}

Magnifying the Code:

Did anyone say magnification?


Meet LRU Cache Implementation:

private void moveToFront(int index) {
    int nextIndex, prevIndex;

    if (head != index) {
        nextIndex = next[index];
        prevIndex = prev[index];

        // Only the head has a prev entry that is an invalid index, so
        // we don't check.
        next[prevIndex] = nextIndex;

        // Make sure index is valid. If it isn't, we're at the tail
        // and don't set prev[next].
        if (nextIndex >= 0)
            prev[nextIndex] = prevIndex;
        else
            tail = prevIndex;

        prev[index] = -1;
        next[index] = head;
        prev[head] = index;
        head = index;
    }
}

public final synchronized void addElement(Object key, Object value) {
    int index;
    Object obj;

    obj = table.get(key);

    if (obj != null) {
        CacheElement entry;

        // Just replace the value, but move it to the front.
        entry = (CacheElement) obj;
        entry.setObjectValue(value);
        entry.setObjectKey(key);

        moveToFront(entry.getIndex());

        return;
    }

    // If we haven't filled the cache yet, place in next available spot
    // and move to front.
    if (!isFull()) {
        if (numEntries > 0) {
            prev[numEntries] = tail;
            next[numEntries] = -1;
            moveToFront(numEntries);
        }
        ++numEntries;
    } else {
        // We replace the tail of the list.
        table.remove(cache[tail].getObjectKey());
        moveToFront(tail);
    }

    cache[head].setObjectValue(value);
    cache[head].setObjectKey(key);
    table.put(key, cache[head]);
}

Analyzing LRU Cache Code (Talk show):

After LFU finished talking, there was not much screaming; the audience didn't like the presentation, and LFU was hesitant while talking. This gave a big push to LRU, which started by saying:

This time I will also consider the case when the cache is not full. I am a little more complex than those other algorithms: when the cache isn't full and this is the first entry, I just increment numEntries, which represents the number of entries in the cache.

After adding a second entry, I need to move it to the front by calling the moveToFront method (we will talk about it soon); I didn't do this for the first entry because it is, for sure, the first element.

So let’s see some action.

As you can see, I state that the previous of the current entry will get the tail value and the next will be -1 (undefined, in other words); these are just initial data.

After adding the new entry (which isn't the first entry), I move it to the front.

if (!isFull()) {
    if (numEntries > 0) {
        prev[numEntries] = tail;
        next[numEntries] = -1;
        moveToFront(numEntries);
    }
    ++numEntries;
}

The moveToFront method moves an entry to the head of the array, so that the least recently used elements reside at the bottom of the array.

Before I do any move, I check that the head is not equal to the current index (this will be false in case we only have 1 entry). If it is not, I assign the value of the next of the current entry (which is a pointer to the next entry, as in a linked list) to nextIndex, and the value of the previous of the current entry (a pointer to the previous entry, as in a linked list) to prevIndex.

int nextIndex, prevIndex;

if (head != index) {
    nextIndex = next[index];
    prevIndex = prev[index];

Then I assign the value of nextIndex to the next of the previous entry.

// Only the head has a prev entry that is an invalid index, so
// we don't check.
next[prevIndex] = nextIndex;

After that I check nextIndex: if it is greater than or equal to 0, then the previous of the next entry will get the value of prevIndex; otherwise the tail will be set to prevIndex.

// Make sure index is valid. If it isn't, we're at the tail
// and don't set prev[next].
if (nextIndex >= 0)
    prev[nextIndex] = prevIndex;
else
    tail = prevIndex;

And because I moved this entry to the front, there won't be any previous entry for it, so I assign -1 to its prev. The next of the current entry (the top one) will be the old head, the prev of the old head will get the index of the current entry, and then the head is assigned the current index.

prev[index] = -1;
next[index] = head;
prev[head] = index;
head = index;

Magnifying the Code:

It is magnifying time! Get your magnifying glass; we are going to see some interesting stuff here.

It is Confession Time! :

LRU didn't mention that it is possible to implement the LRU algorithm in a simpler way. Our previous implementation is based on arrays; the other implementation that the LRU cache didn't mention is through LinkedHashMap, which was introduced in JDK 1.4.

public class LRUCache2 extends LinkedHashMap {

    private static final int MAX_ENTRIES = 3;

    public LRUCache2() {
        super(MAX_ENTRIES + 1, .75F, true);
    }

    // This method is invoked by put and putAll after inserting a new entry
    // into the map. It allows the map to have up to 3 entries and then
    // delete the oldest entry each time a new entry is added.
    protected boolean removeEldestEntry(Map.Entry eldest) {
        return this.size() > MAX_ENTRIES;
    }
}

For sure, the LinkedHashMap solution is less time consuming to write than the array solution, and it is more efficient, because you leave the handling of the deletion and so on to the JDK itself, so you won't bother implementing such stuff yourself. OSCache uses such an implementation in its LRU caching implementation. A quick usage sketch follows below.
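A quick usage sketch of the class above, showing the access-order behavior (keySet() lists entries from least to most recently used):

LRUCache2 cache = new LRUCache2();
cache.put("a", 1);
cache.put("b", 2);
cache.put("c", 3);
cache.get("a");                     // touch "a": it becomes the most recently used
cache.put("d", 4);                  // over capacity: "b" (the LRU entry) is evicted
System.out.println(cache.keySet()); // prints [c, a, d]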

Conclusion:

We have seen how to implement the LFU and LRU algorithms, and the two ways to implement LRU; it is up to you to choose which way to use, arrays or LinkedHashMap. For me, I would recommend arrays for a small number of entries and LinkedHashMap for a big number of entries.

In the next part we will be talking about caching frameworks and a comparison between them, and which caching algorithm is employed by which caching framework; stay tuned till then ;)

Posted by Ahmed Ali at 10:55 PM

Intro to Caching, Caching Algorithms and Caching Frameworks, part 4

Introduction:

In part 1 we talked about the caching introduction and some terminologies of caching, and in part 2 and part 3 we saw implementations of some famous replacement algorithms. Now in this part we will see a comparison between open source Java caching frameworks, as I am not that rich to buy commercial frameworks :D. In this part we will be talking about OSCache, Ehcache, JCS and Cache4J, and we are going to concentrate on memory caching only. There will be a performance comparison based on in-memory caching, using the JBoss caching benchmark framework and other cache test cases.

The Task:

"Programming Mania" is a famous programming magazine from geeks to geeks. In every release of the magazine there is a section specialized in framework comparisons, like MVC, ORM and so on. This month they decided that they are going to make a comparison of caching frameworks.

And as we know, the editors have a programming background; in fact they are real programmers (not fake ones).

Head of Editors: this time we want to make our comparison article about caching frameworks, so we need to investigate the caching frameworks already in use, and I don't need to remind you that the economic crisis affected us as well, so we will just care about open source frameworks.

Programmer 1: oh, okay no problem in that.

Head of Editors: excellent. Oh, and by the way, we will make it in two parts, so try getting as much information as you can.

Programmer 1: ok, no problem.

Head of Editors: oh yeah, one more thing, I am expecting you to be done by the day after tomorrow, as we are going to release the article this week.

Programmer 1: !!! : (Shocked)

First few lines!

In order for programmer 1 to make the right comparison, he needs to know what type of objects caching frameworks cache. Some caching frameworks cache just normal POJOs, while others cache portions of JSPs and so on. Below is a list of common objects that caching frameworks cache:

1- POJO Caching
2- HTTP Response Caching
3- JSP Caching
4- ORM Data Access Caching

The Checklist:

After programmer 1 read a lot about caching, he made a checklist which enables him to make the comparison of the different frameworks; he will validate each item from the checklist against all the caching frameworks.

The checklist is as follows:


Programmer 1 decided to list the famous caching frameworks he is going to compare, so he selected the following frameworks:

· Java Caching System (JCS)
· Ehcache
· OSCache
· Cache4J
· ShiftOne
· WhirlyCache
· SwarmCache
· JBoss Cache

As soon as he finished listing the frameworks, he started to write the first few lines of the 1st part.

Java Caching System (JCS):

JCS is a distributed caching system written in Java for server-side Java applications. It is intended to speed up dynamic web applications by providing a means to manage cached data of various dynamic natures. Like any caching system, JCS is most useful for high-read, low-put applications.

The foundation of JCS is the Composite Cache, which is the pluggable controller for a cache region. Four types of caches can be plugged into the Composite Cache for any given region: Memory, Disk, Lateral, and Remote. The JCS jar provides production-ready implementations of each of the four types of caches. In addition to the core four, JCS also provides additional plug-ins of each type.

JCS provides a framework with no point of failure, allowing for full session failover (in clustered environments), including session data across up to 256 servers. JCS has quick nested categorical removal, data expiration (idle time and max life), an extensible framework, fully configurable runtime parameters, remote synchronization, remote store recovery, and a non-blocking "zombie" (balking facade) pattern.

"balking facade pattern , if a method is invoked on an object and that object is not in appropriate state to executethat method, have the method return without doing anything is in state or even throw an exception for example'IllegalStateException' “

The configurations of JCS are set in a properties file named config.ccf.
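As a rough idea of what such a file looks like, here is a minimal config.ccf sketch; the property names follow the JCS documentation, but the values are just an example, not the configuration used in the benchmark later in this article:

jcs.default=
jcs.default.cacheattributes=org.apache.jcs.engine.CompositeCacheAttributes
jcs.default.cacheattributes.MaxObjects=1000
jcs.default.cacheattributes.MemoryCacheName=org.apache.jcs.engine.memory.lru.LRUMemoryCache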

-Memory Cache:

JCS supports LRU and MRU. The LRU Memory Cache is an extremely fast, highly configurable memory cache. It uses a Least Recently Used algorithm to manage the number of items that can be stored in memory. The LRU Memory Cache uses its own LRU map implementation that is significantly faster than both the commons LRUMap implementation and the LinkedHashMap provided with JDK 1.4 and up (at least that is what JCS claims, which we will check below).

-Disk Cache:

The Indexed Disk Cache is a fast, reliable, and highly configurable swap for cached data. The Indexed Disk Cache follows the fastest pattern for disk swapping.

-Lateral Cache:

The TCP Lateral Cache provides an easy way to distribute cached data to multiple servers. It comes with a UDPdiscovery mechanism, so you can add nodes without having to reconfigure the entire farm. The TCP Lateral ishighly configurable.


-Remote Cache:

JCS also provides an RMI-based Remote Cache Server. Rather than having each node connect to every other node, you can use the Remote Cache Server as the connection point.

JCS and Check List:

JCS in Action:

Our programmer 1 was checking the JCS site, where they claim that its LRU map caching algorithm is faster than the LinkedHashMap shipped with JDK 1.4 and up. So our newbie ran the following test against JCS (1.3) and LinkedHashMap on JDK 1.4 and 1.6.

The above is the PC specification on which we are going to run our test.

In order to check what JCS claims, we used their own test case from the JCS site (I will be using this test case for the rest of our framework testing).

The following configuration file was used during the test:

JCS

After using this test case for LinkedHashMap and JCS we got the following results:

JCS vs. LinkedHashMap

Ehcache:

Ehcache is a Java distributed cache for general-purpose caching, J2EE and light-weight containers, tuned for large cache objects. It features memory and disk stores, replication by copy and invalidate, listeners, and a gzip caching servlet filter; it is fast and simple.


Ehcache acts as a pluggable cache for Hibernate 2.1, with a small footprint and minimal dependencies, and it is fully documented and production tested.

It is used in a lot of Java frameworks such as Alfresco, Cocoon, Hibernate, Spring, JPOX, Jofti, Acegi, Kosmos,Tudu Lists and Lutece.

One of its features is caching domain objects that map to database entities. As the domain objects that map to database entities are the core of any ORM system, Ehcache is the default cache for Hibernate. With Ehcache you can cache both Serializable and non-serializable objects.

Non-serializable objects can use all parts of Ehcache except for the disk store and replication. If an attempt is made to persist or replicate them, they are discarded and a WARNING-level log message is emitted.

Another feature of Ehcache is that an admin can monitor cache statistics, change configuration and manage the cache through the JMX service, as Ehcache supports it (which is a really nice feature).

The configurations of Ehcache are set in an XML file named ehcache.xml.
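For orientation, here is a minimal ehcache.xml sketch; the attribute names follow the Ehcache documentation, but the values are just an example, not the configuration used in the tests below:

<ehcache>
    <defaultCache
        maxElementsInMemory="1000"
        eternal="false"
        timeToIdleSeconds="120"
        timeToLiveSeconds="300"
        overflowToDisk="false"
        memoryStoreEvictionPolicy="LRU"/>
</ehcache>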

-Memory Cache:

Ehcache supports LRU, LFU and FIFO.

-Disk Cache:

Ehcache can store up to 100 GB of data on disk and access it in a fast manner.

Ehcache and Check List:

OSCache:

OSCache is a caching solution that includes a JSP tag library and a set of classes to perform fine-grained dynamic caching of JSP content, servlet responses or arbitrary objects. It provides both in-memory and persistent on-disk caches, and can allow your site to continue functioning normally even if the data source is down (for example, if an error occurs like your db going down, you can serve the cached content so people can still surf the site).

When dealing with static HTML pages, the page response can be cached indefinitely in memory, thus avoiding reprocessing of the page. OSCache does so by using the URI and query parameters to form a unique key.

This key is used to store the page content. HTTP response caching is implemented as a servlet filter; thus, the cache filter abstracts the API usage from the client.


By default, the cache filter holds the page response in 'application' scope and refreshes the cache every hour. These default values can be changed.

In the case of dynamic pages (JSPs), OSCache provides tags that surround the static part of the page; thus, only the static part of the page is cached.

OSCache can be configured for a persistent cache. When the memory capacity is reached, objects are evicted from memory and stored on a hard disk. Objects are evicted from memory based on the configured cache algorithm. For other persistence places (a DB, for example), you could also implement your own custom persistence listener (to persist anywhere you want). OSCache supports distributed caching.

When an application is deployed in a cluster of application servers, the local cache is kept in sync by communication amongst all the caches in the cluster; this is achieved either by JMS or by JGroups.

Multiple caches can be created, each with their own unique configuration.

Another feature of OSCache is that an admin can monitor cache statistics, change configuration and manage the cache through the JMX service, but this is only available via the Spring framework (while Ehcache supports this feature without the need for any other framework).

OSCache is also used by many projects: Jofti, Spring, Hibernate.

OSCache is also used by many sites like TheServerSide, JRoller, JavaLobby

The configurations of OSCache are set in a property file named oscache.properties.

-Memory Cache:

OSCache supports LRU and FIFO, and any other custom replacement algorithm.

-Disk Cache:

OSCache supports the disk cache: when using memory and disk, once the memory capacity is reached an item is removed from memory but not from disk. Therefore, if that item is needed again, it will be found on disk and brought back into memory. You get behavior similar to a browser cache. However, you still need to do some administrative tasks to clean the disk cache periodically, since this has not been implemented in OSCache.

OSCache and Check List:


Cache4J:

Cache4j is a cache for Java objects that stores objects only in memory (suitable for Russian-speaking guys only, as there is no documentation in English, and the JavaDoc is in Russian as well :D).

It is mainly useful for caching POJO objects only.

In the wish list they stated that they want to support disk caching and distributed handling as well, but that was a long time ago, in 2006, and nothing has happened since.

It supports the LRU, LFU, and FIFO caching algorithms. For storing objects in its cache, cache4j offers hard and soft references (a best practice for caching frameworks is to use weak and soft references, because if the JVM needs to garbage collect some objects to make room in memory, the cached objects will be the first to be removed). A sketch of the soft-reference idea follows below.
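The soft-reference idea is easy to sketch with the JDK alone; this is an illustration of the general technique, not cache4j's actual code:

import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

public class SoftCache {

    private final Map<Object, SoftReference<Object>> map =
            new HashMap<Object, SoftReference<Object>>();

    public synchronized void put(Object key, Object value) {
        // The GC may clear softly reachable values under memory pressure,
        // so cached objects are reclaimed before an OutOfMemoryError.
        map.put(key, new SoftReference<Object>(value));
    }

    public synchronized Object get(Object key) {
        SoftReference<Object> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        Object value = ref.get();  // null if the GC already cleared it
        if (value == null) {
            map.remove(key);       // drop the dead reference
        }
        return value;
    }
}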

Cache4j is implemented in a way that lets multiple application threads access the cache simultaneously. It also provides easy-to-use programming APIs.

-Memory Cache:

Cache4J supports LRU, LFU and FIFO.

Cache4J Check List:

Performance in action:

Ok, now it is show time. For this performance testing, programmer 1 used 3 different test cases, which are as follows:

1- Test case from the JCS site (applied to all caching frameworks)
2- JBoss Cache benchmark framework (which is really a very nice cache benchmark framework)
3- Test case from the Cache4J site (applied to all caching frameworks)

The 1st and 3rd test cases are just simple tests of populating the cache and retrieving from it, while the JBoss Cache benchmark ships with a lot of test cases, from replication to distributed and clustering testing.

All the tests here were performed on a single machine (no distributed testing was performed), and all the tests were performed in memory.

The versions of the frameworks we are going to test are as follows:


OSCache: 2.4.1
Ehcache: 1.6.0
JCS: 1.3
Cache4j: 0.4

Configurations Used:

OSCache

Ehcache

JCS

Cache4J:

SynchronizedCache cache = new SynchronizedCache();
cache.setCacheConfig(new CacheConfigImpl("cacheId", null, 0, 0, 0, 1000000, null, "lru", "strong"));

JBoss cache benchmark:

We can see here that nearly 8 million get operations were invoked on the different cache frameworks; JCS took the smallest time while OSCache took the biggest time.


We see here that nearly 2 million put operations were invoked on the different cache frameworks; cache4j took the smallest time while OSCache took the biggest time.

The cache test performed here was an in-memory cache test with 25 threads accessing the cache, but we will not depend on this alone, and we will continue with our testing.

JCS Test Case:

OScache vs. Ehcache

OSCache vs. JCS

OScache vs. Cache4J

Ehcache vs. JCS

Ehcache vs. Cache4J

JCS vs. Cache4J

The winner in this test is Ehcache, which achieved outstanding results against all the other frameworks. This test just adds 50,000 items to the cache, then retrieves them, and measures the time taken for adding and getting the items from the cache.

Cache4j Test Case:

---------------------------------------------------------------
java.version=1.6.0_10
java.vm.name=Java HotSpot(TM) Client VM
java.vm.version=11.0-b15
java.vm.info=mixed mode, sharing
java.vm.vendor=Sun Microsystems Inc.
os.name=Windows XP
os.version=5.1
os.arch=x86
---------------------------------------------------------------
This test can take about 5-10 minutes. Please wait ...
---------------------------------------------------------------
                GetPutRemoveT   GetPutRemove   Get
---------------------------------------------------------------
cache4j 0.4     2250            2125           1703
oscache 2.4.1   4032            4828           1204
ehcache 1.6     1860            1109           703
jcs 1.3         2109            1672           766
---------------------------------------------------------------

As we can see, OSCache again took the biggest time, while Ehcache took the smallest.

This test also performs addition and retrieval of cache items, which means there are no cache misses (unlike the test cases in the JBoss Cache benchmark).

And the gold medal goes to!

Our winning framework in this part is Ehcache, which achieved the best time in most of the tests and the best performance for cache misses and cache hits; not only that, but it also provides very good features, from monitoring statistics to distributed functionality.

2nd place goes to JCS and OSCache. JCS is really a great caching framework, but it won't serve the need of caching responses and JSP portions; it will, however, be a great choice for caching POJOs. OSCache has nice features, but unfortunately its performance is not that good; that is because an exception is thrown when there is a cache miss, which affects performance, while most of the cache frameworks introduced here just return null when a cache miss is encountered.

Finally, in the last place comes Cache4j, which does a really good job at caching but isn't feature-rich, and its documentation is in Russian, so it won't be helpful when you face a problem with it :D, but it still achieved outstanding results.

Conclusion:

In this part we have seen different cache frameworks and we made a comparison of them, but that's not the end; we still have more open source caching frameworks to check, so stay tuned ;)

Posted by Ahmed Ali at 10:55 PM

Labels: Algorithms, Framework

Intro to Caching, Caching Algorithms and Caching Frameworks, part 5

Introduction:

In part 1 we talked about the caching introduction and some terminologies of caching; in part 2 and part 3 we saw implementations of some famous replacement cache algorithms; and in part 4 we saw comparisons between some famous caching frameworks. In this part we are going to continue what we started in part 4, and as in part 4 we will concentrate only on memory caching.

The Task:

After programmer 1 released the caching article in "Programming Mania", the geek-to-geek magazine, he got a lot of threatening mails and terrible messages from caching geeks defending their beloved caching frameworks and warning him that if he didn't make their beloved caching framework win the contest, he would regret the day he became a programmer.

That didn't scare our programmer, and he went on to complete the second part of the comparison:

· ShiftOne
· WhirlyCache
· SwarmCache
· JBoss Cache

ShiftOne:

ShiftOne, or as they call it JOCache, is a lightweight caching framework that implements several strict object caching policies and comes with a set of cache algorithm implementations that support in-memory caching.

ShiftOne cache enforces two rules for every cache:

Max Size - each cache has a hard limit on the number of elements it will contain. When this limit is exceeded, the least valuable element is evicted. This happens immediately, on the same thread. This prevents the cache from growing uncontrollably.

Element Timeout - each cache has a maximum time for which its elements are considered valid. No element that exceeds this time limit will ever be returned. This ensures predictable data freshness.

ShiftOne uses the decorator pattern in order to make it more flexible for the user to use any underlying caching product to maintain the cache.

The following caching products can be plugged into ShiftOne:

EHCache

SwarmCache

JCS Cache

Oro Cache

ShiftOne enables the client to gather statistics (hit/miss) about the cache by using JMX, and not only that, but it also enables integration with the Hibernate ORM through adaptors. When it comes to in-memory caching (which is the only thing JOCache supports), JOCache uses soft references for the caching entries.

JOCache was originally implemented as part of the ExQ project to support ResultSet caching. It was later split out for use by other projects. It was designed to cache large, expensive database query results.

-Memory Cache:

ShiftOne cache supports LRU, LFU, FIFO, Single, Zero

ShiftOne and Check List:


WhirlyCache:

WhirlyCache is a fast, configurable in-memory object cache for Java. It can be used to speed up a website or an application by caching objects that would otherwise have to be created by querying a database or by another expensive procedure.

WhirlyCache runs a separate thread to prune the cache; in other words, cache maintenance is not performed by the same application thread that the client uses. Thus, there is less burden on the application thread.

Whirlycache is built around several design principles that differ from other cache implementations:

Require synchronization as infrequently as possible

Do as little as possible in the insertion and retrieval operations

Soft limits are acceptable for many applications

Disk overflow becomes a bad idea very quickly

Many attributes of Whirlycache are configurable in an XML file, but the most important components of the cache are the Backend, the Tuner, and the Policy.

WhirlyCache supports pluggable backend implementations that need to implement the ManagedCache interface (which is a subinterface of java.util.Map, although not all the methods of Map need to be implemented). WhirlyCache currently supports two backends: ConcurrentHashMap and FastHashMap. You can even implement your own backend by implementing the ManagedCache interface.

The Tuner is a background thread that performs the cache maintenance activities specified in the configured Policy implementation. One Tuner thread per cache is created, and it is configured to run every n seconds. It depends on your application, but you definitely don't want to run the Tuner too often, since it will only serve to burden the system unnecessarily.
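The idea of a Tuner-style maintenance thread can be sketched generically (an illustration with a ScheduledExecutorService, not Whirlycache's actual implementation; the eviction "policy" here is a crude stand-in):

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Generic sketch of a background "tuner": one daemon thread per cache wakes
// up every n seconds and applies the eviction policy, so application threads
// never pay for maintenance work on the insert/retrieve path.
public class TunedCache {
    private final Map<String, Object> store = new ConcurrentHashMap<String, Object>();
    private final ScheduledExecutorService tuner =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "cache-tuner");
                t.setDaemon(true);
                return t;
            });

    public TunedCache(final int maxSize, long periodSeconds) {
        tuner.scheduleAtFixedRate(new Runnable() {
            public void run() {
                // crude policy stand-in: drop arbitrary entries over the limit
                Iterator<String> it = store.keySet().iterator();
                while (store.size() > maxSize && it.hasNext()) {
                    it.next();
                    it.remove();
                }
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    public void put(String key, Object value) { store.put(key, value); }
    public Object get(String key)             { return store.get(key); }
}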


-Memory Cache:

Currently, WhirlyCache offers FIFO, LFU and LRU. You can specify a different Policy implementation per named cache in the whirlycache.xml configuration file.

WhirlyCache and Check List:

SwarmCache:

SwarmCache is an in-memory cache intended more for caching domain objects on the data access layer. It offers support for a distributed cache in a clustered environment. SwarmCache supports the LRU caching algorithm; however, SwarmCache is essentially an in-memory cache. When LRU is set as the caching algorithm and the memory capacity is reached, SwarmCache evicts objects from its memory as per LRU logic.

SwarmCache uses soft references to the cached objects. So, if LRU is not set as the caching algorithm, it relies on the garbage collector to sweep through its memory and clean up the objects that are least frequently accessed. However, SwarmCache recommends a combination of the above two as the caching algorithm.
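The soft-reference technique described above looks roughly like this (an illustrative sketch, not SwarmCache's code):

import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal soft-reference cache: the JVM is free to clear SoftReferences
// under memory pressure, so entries silently disappear instead of
// exhausting the heap.
public class SoftRefCache<K, V> {
    private final Map<K, SoftReference<V>> map =
            new ConcurrentHashMap<K, SoftReference<V>>();

    public void put(K key, V value) {
        map.put(key, new SoftReference<V>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) return null;
        V value = ref.get();
        if (value == null) map.remove(key); // GC cleared it: drop the stale entry
        return value;
    }
}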

SwarmCache provides a wrapper so that it can be used with the Hibernate ORM and DataNucleus.

When used in a clustered environment, each server instantiates its own manager. For each type of object that the server wishes to cache, it instantiates a cache and adds it to the manager. The manager joins a multicast group and communicates with the other managers in the group. Whenever an object is removed from a cache, the manager notifies all other managers in the group. Those managers then ensure that the object is removed from their respective caches. The result is that a server will not have in its cache a stale version of an object that has been updated or deleted on another server.

Note that the managers only need to communicate when an object is removed from a cache. This only happens when an object is updated or deleted. The managers do not co-operate beyond this. This means that the amount of inter-server communication is proportional to the amount of updates/deletes in the application. Also notice that there is no "server"; all hosts are equal peers, and they can come and go from the cache group as they please without affecting other group members. Thus the operation of the distributed cache is very robust.
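A toy sketch of this remove-only coordination (the wire format and class names are made up for illustration; SwarmCache's actual implementation differs):

import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy sketch of peer invalidation: every manager joins the same multicast
// group; a remove is broadcast so all peers drop their (possibly stale) copy.
public class InvalidatingCacheManager {

    private final Map<String, Object> cache = new ConcurrentHashMap<String, Object>();
    private final InetAddress group;
    private final int port;
    private final MulticastSocket socket;

    public InvalidatingCacheManager(String groupAddr, int port) throws Exception {
        this.group = InetAddress.getByName(groupAddr);   // e.g. "230.0.0.1"
        this.port = port;
        this.socket = new MulticastSocket(port);
        socket.joinGroup(group);

        Thread listener = new Thread(() -> {
            byte[] buf = new byte[256];
            while (!socket.isClosed()) {
                try {
                    DatagramPacket p = new DatagramPacket(buf, buf.length);
                    socket.receive(p);
                    String key = new String(p.getData(), 0, p.getLength(),
                            StandardCharsets.UTF_8);
                    cache.remove(key);   // drop the stale entry a peer announced
                } catch (Exception e) {
                    return;
                }
            }
        }, "invalidation-listener");
        listener.setDaemon(true);
        listener.start();
    }

    public void put(String key, Object value) { cache.put(key, value); }
    public Object get(String key)             { return cache.get(key); }

    // A local remove also notifies every other manager in the group.
    // (We receive our own packet too; removing twice is harmless.)
    public void remove(String key) throws Exception {
        cache.remove(key);
        byte[] msg = key.getBytes(StandardCharsets.UTF_8);
        socket.send(new DatagramPacket(msg, msg.length, group, port));
    }
}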

-Memory Cache:

LRU, Timeout, Automatic and Hybrid

SwarmCache and Check List:

JBoss Cache:

JBoss offers two cache flavors, namely CoreCache and PojoCache.

JBoss Core Cache is a tree-structured, clustered, transactional cache. It can be used in a standalone, non-clustered environment to cache frequently accessed data in memory, thereby removing data retrieval or calculation bottlenecks while providing "enterprise" features such as JTA compatibility, eviction and persistence.

JBoss Cache is also a clustered cache, and can be used in a cluster to replicate state, providing a high degree of failover. A variety of replication modes are supported, including invalidation and buddy replication, and network communications can be either synchronous or asynchronous.

JBoss Cache can be - and often is - used outside of JBoss AS, in other Java EE environments such as Spring, Tomcat, Glassfish, BEA WebLogic and IBM WebSphere, and even in standalone Java programs, thanks to its minimal dependency set.

POJO Cache is an extension of the core JBoss Cache API. POJO Cache offers additional functionality such as:

maintaining object references even after replication or persistence.

fine-grained replication, where only modified object fields are replicated.

an "API-less" clustering model, where POJOs are simply annotated as being clustered.

In addition, JBoss Cache offers a rich set of enterprise-class features:

being able to participate in JTA transactions (works with most Java EE compliant transaction managers).

Attach to JMX consoles and provide runtime statistics on the state of the cache.

Allow client code to attach listeners and receive notifications on cache events.

Allow grouping of cache operations into batches, for efficient replication.

The cache is organized as a tree, with a single root. Each node in the tree essentially contains a map, which acts as a store for key/value pairs. The only requirement placed on objects that are cached is that they implement java.io.Serializable.
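To make the tree structure concrete, here is a small usage sketch written against the JBoss Cache 3.x API as commonly documented (treat the exact signatures as an assumption for your version):

import java.io.Serializable;

import org.jboss.cache.Cache;
import org.jboss.cache.CacheFactory;
import org.jboss.cache.DefaultCacheFactory;
import org.jboss.cache.Fqn;

public class JBossCacheTreeDemo {
    public static void main(String[] args) {
        CacheFactory<String, Serializable> factory =
                new DefaultCacheFactory<String, Serializable>();
        Cache<String, Serializable> cache = factory.createCache();

        // Each node is addressed by an Fqn (fully qualified name) path;
        // the node itself holds a map of key/value pairs.
        Fqn products = Fqn.fromString("/catalog/products");
        cache.put(products, "42", "Road bike");   // value must be Serializable

        Serializable name = cache.get(products, "42");
        System.out.println(name);

        cache.stop();
    }
}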

JBoss Cache works out of the box with most popular transaction managers, and even provides an API where custom transaction manager lookups can be written.

The cache is completely thread-safe. It employs multi-versioned concurrency control (MVCC) to ensure thread safety between readers and writers, while maintaining a high degree of concurrency. The specific MVCC implementation used in JBoss Cache allows reader threads to be completely free of locks and synchronized blocks, ensuring a very high degree of performance for read-heavy applications. It also uses custom, highly performant lock implementations that employ modern compare-and-swap techniques for writer threads, tuned to multi-core CPU architectures.

Multi-versioned concurrency control (MVCC) has been the default locking scheme since JBoss Cache 3.x.
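The reader/writer behaviour described above can be illustrated with a toy copy-on-write map (this only illustrates the MVCC idea, it is not JBoss Cache's implementation):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Toy MVCC illustration: readers always see one complete, immutable snapshot
// and take no locks, while writers install a new version with a
// compare-and-swap retry loop.
public class MvccMap<K, V> {
    private final AtomicReference<Map<K, V>> current =
            new AtomicReference<Map<K, V>>(new HashMap<K, V>());

    public V get(K key) {
        return current.get().get(key);      // lock-free read of one snapshot
    }

    public void put(K key, V value) {
        Map<K, V> snapshot, next;
        do {
            snapshot = current.get();
            next = new HashMap<K, V>(snapshot); // copy-on-write new version
            next.put(key, value);
        } while (!current.compareAndSet(snapshot, next)); // retry if we raced
    }
}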

-Memory Cache:

JBoss Cache supports LRU, LFU, MRU, Expiration, ElementSize and FIFO.

JBoss Check List:


Performance in action:

OK, now it is show time. For this performance testing, programmer 1 used 3 different test cases, which are as follows:

Test Case from JCS site (applied on all caching frameworks)

JBoss Cache benchmark framework (which is really a very nice cache benchmark framework)

Test Case from Cache4J site (applied on all caching frameworks)

The 1st and 3rd test cases are just simple tests of populating and retrieving from the cache, while the JBoss cache benchmark ships with a lot of test cases, from replication to distributed and clustering testing.

All the tests here were performed on a single machine (no distributed testing was performed), and all the tests were performed in memory.
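The 1st and 3rd test cases boil down to timing loops of roughly this shape (a sketch; the operation count and the plain HashMap stand-in are assumptions, not the actual test code):

import java.util.HashMap;
import java.util.Map;

// Rough shape of a simple populate/retrieve timing test run in memory on a
// single machine; a real run would target each framework's cache instead of
// the HashMap used here as a stand-in.
public class SimpleCacheBench {
    public static void main(String[] args) {
        Map<Integer, String> cache = new HashMap<Integer, String>();
        int ops = 1000000;

        long start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            cache.put(i, "value-" + i);          // populate
        }
        long putMillis = (System.nanoTime() - start) / 1000000;

        start = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            cache.get(i);                        // retrieve (no misses)
        }
        long getMillis = (System.nanoTime() - start) / 1000000;

        System.out.println("put: " + putMillis + " ms, get: " + getMillis + " ms");
    }
}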

The versions of the frameworks we are going to test now are as follows:

OSCache: 2.4.1
Ehcache: 1.6.0
JCS: 1.3
Cache4j: 0.4
JBoss: 3.0.0
Whirly: 1.0.1
Swarm: 1.0
ShiftOne: 2.0b

JBoss cache benchmark:


We can see here that there are nearly 8 million get operations invoked on the different cache frameworks, and WhirlyCache took the shortest time (followed by JBoss Cache) while OSCache took the longest.

We also see that there are nearly 2 million put operations invoked on the different cache frameworks, and again WhirlyCache took the shortest time while OSCache took the longest.

The cache test performed here was an in-memory test, with 25 threads accessing the cache.

JCS Test Case:


[Benchmark charts: pairwise comparisons of Cache4j, EhCache, JCS, OSCache, ShiftOne, SwarmCache, WhirlyCache and JBoss Cache against one another]

The winner in this test is Ehcache, which achieved outstanding results against all the other frameworks; in 2nd place comes WhirlyCache, and in 3rd place comes JBoss Cache.

Cache4j Test Case:

Cache4J Test With Remove

As we can see, SwarmCache took the longest time, while Ehcache and WhirlyCache took the shortest.

This test also performs addition and retrieval of cache items, which means there are no cache misses (like the test cases in the JBoss cache benchmark). But there is an extra step this test does, which is removing cache entries from the cache; if we omit this operation (and just concentrate on the put and get operations), we get the following results:

Cache4J Test Without Remove

As we can see, the JBoss and Swarm times are heavily reduced, which means that the remove operation takes a lot of time in these two cache frameworks. But let's not forget that JBoss is not a flat cache (it is a structured cache), which might be the reason for the delay, and it also uses a transaction-like mechanism for caching, which would affect its performance as well; still, these are great features (and for sure we won't invoke the remove method that often).

And the gold medal goes to!

Our candidate frameworks in this part are WhirlyCache and JBoss Cache; both achieved very good performance for cache hits and misses. But let's not forget that Whirly is not a distributed cache, which is a drawback, while JBoss offers a structured cache as we discussed before, besides the transaction mechanism it also offers. WhirlyCache is really nice for in-memory caching in either single- or multi-threaded applications; on the contrary, SwarmCache's performance is really bad in multi-threaded applications, and it threw an out-of-memory exception more than once while being tested.

Second place goes to ShiftOne, which is really nice but suffers from a lack of support, documentation and even configuration.

If we also consider the caching frameworks we introduced in the previous part, we get the following order:



First place: EhCache (still the best) along with Whirly and JBoss

Second place: ShiftOne and JCS

Third place: Cache4J and OSCache

The worst performance was achieved by SwarmCache (I guess it would be faster not to cache your objects at all than to cache them with SwarmCache :D ).

Conclusion:

In this part we have seen a comparison of different open source cache frameworks and concluded that EhCache is one of the best choices (besides JBoss and Whirly cache), while Swarm is one of the poorest choices you will ever make.

Posted by Ahmed Ali at 9:34 PM

Labels: Framework