Keeping Infinispan In Shape: Highly-Precise, Scalable Data Eviction

[email protected] | twitter.com/galderz | zamarreno.com Thursday, October 7, 2010

Galder Zamarreño & Vladimir Blagojevic
Senior Engineer, Red Hat
7th October 2010, JUDCon - Berlin

Who is Galder?

• R&D engineer (Red Hat Inc):

• Infinispan developer

• JBoss Cache developer

• Contributor and committer:

• JBoss AS, Hibernate, JGroups, JBoss Portal, etc.

• Blog: zamarreno.com

• Twitter: @galderz

Agenda

• Eviction in:

• Java Collections Framework

• JBoss Cache

• Infinispan 4.0

• New in Infinispan 4.1:

• LIRS

• Batching Updates

Java Collections Framework

• Key building block of any Java app

• Introduced in Java 1.2

• Extended with concurrent collections in Java 1.5

• Collection element eviction ??

Collections cannot grow forever

If using a collection as cache

• Forces clients to either:

• Remove elements proactively

• Or run a periodic cleanup process

• Which can be a PITA...
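
To make the burden concrete, here is a minimal sketch of a map used as a cache where the client has to run the cleanup itself. The class name `ManualCleanupCache`, the idle-time rule and the explicit `now` parameter are all assumptions made for illustration; none of them come from the talk.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ManualCleanupCache {
    // Hypothetical wrapper: the cached value plus the time it was last touched.
    private static final class Timestamped {
        final String value;
        volatile long lastAccess;

        Timestamped(String value, long now) {
            this.value = value;
            this.lastAccess = now;
        }
    }

    private final Map<String, Timestamped> map = new ConcurrentHashMap<>();
    private final long maxIdleMillis;

    public ManualCleanupCache(long maxIdleMillis) {
        this.maxIdleMillis = maxIdleMillis;
    }

    public void put(String key, String value, long now) {
        map.put(key, new Timestamped(value, now));
    }

    public String get(String key, long now) {
        Timestamped t = map.get(key);
        if (t == null) {
            return null;
        }
        t.lastAccess = now; // every read has to maintain the bookkeeping
        return t.value;
    }

    // The "periodic cleanup process": nothing happens unless someone calls this.
    public int cleanup(long now) {
        int removed = 0;
        for (Map.Entry<String, Timestamped> e : map.entrySet()) {
            boolean idle = now - e.getValue().lastAccess > maxIdleMillis;
            if (idle && map.remove(e.getKey(), e.getValue())) {
                removed++;
            }
        }
        return removed;
    }

    public int size() {
        return map.size();
    }
}
```

If nobody schedules `cleanup`, the map simply keeps growing, which is the problem the slide describes.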

Eviction in JBoss Cache days

JBoss Cache eviction issues

• Under heavy load, eviction queues could fill up and become a bottleneck

• Possibly a side effect of MVCC’s non-blocking reads

• Separating data into regions could alleviate the issue

• But it’s really a hack!

Original Infinispan 4.0 eviction

• Tried a different approach:

• Avoid using queues to hold events

• Taking advantage of the change from a tree to a map structure:

• Attempt to maintain map ordered as per eviction rules

• Could we use ConcurrentSkipListMap ?

• No. O(1) desired for all map operations

Doubly Linked ConcurrentHashMap

• Based on H.Sundell and P.Tsigas paper:

• Lock-Free Deques and Doubly Linked Lists (2008)

• With each cache access or update, update the links

• Eviction just a matter of walking the linked list one way

Issues under heavy load

• Algorithm always uses same node (the tail) to append stuff

• With high concurrency, CAS stress led to loads of retrying

• So much retrying that we were getting infinite loops

• Before 4.0.0.Final, reverted to a more conservative algorithm

4.0.0.Final eviction

LRU eviction algorithm issues

• Weak access locality

• Keys accessed only once are not evicted in a timely manner

• In loops, soon to be accessed keys might get evicted first

• With distinct access frequencies, frequently accessed keys can unfortunately get evicted

• LRU’s working set limited to cache size
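
The loop case can be reproduced with a few lines of plain Java. This is a minimal sketch, assuming a single-threaded LRU built on `LinkedHashMap` in access order; `LruLoopDemo` and `hitsForLoop` are names made up for this example and have nothing to do with Infinispan's own classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruLoopDemo {
    // Plain LRU: LinkedHashMap in access order, dropping the eldest entry when over capacity.
    public static <K, V> Map<K, V> lru(final int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Cycles over keys 0..distinctKeys-1 for the given number of rounds and counts cache hits.
    public static int hitsForLoop(int capacity, int distinctKeys, int rounds) {
        Map<Integer, String> cache = lru(capacity);
        int hits = 0;
        for (int r = 0; r < rounds; r++) {
            for (int k = 0; k < distinctKeys; k++) {
                if (cache.get(k) != null) {
                    hits++;
                } else {
                    cache.put(k, "v" + k);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(hitsForLoop(3, 4, 10)); // prints 0: the loop is one key larger than the cache
        System.out.println(hitsForLoop(4, 4, 10)); // prints 36: only the first round misses
    }
}
```

With a loop just one key larger than the cache, LRU always evicts the key that will be needed next, so the hit count drops to zero.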

Enter the room... LIRS

• Eviction algorithm that can cope with weak access locality

• Based on S.Jiang and X.Zhang’s 2002 paper:

• LIRS: An efficient low inter-reference recency set replacement policy to improve buffer cache performance

• LIRS based around two concepts: IRR and Recency

IRR and Recency

• Inter-Reference Recency (IRR):

• Number of other unique keys accessed between two consecutive accesses to same key

• Recency (R)

• Number of other unique keys accessed from last reference until now
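
The two definitions can be checked against a small access trace. The sketch below is only an illustration: the class `IrrRecency`, its method names and the use of `-1` for "key accessed fewer than twice" are assumptions of this example (the LIRS paper treats that case as an infinite IRR).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IrrRecency {
    // IRR: distinct other keys seen between the last two accesses to key.
    // Returns -1 when key has been accessed fewer than twice.
    public static int irr(List<String> trace, String key) {
        int last = trace.lastIndexOf(key);
        if (last < 0) {
            return -1;
        }
        int previous = trace.subList(0, last).lastIndexOf(key);
        if (previous < 0) {
            return -1;
        }
        return distinctOthers(trace.subList(previous + 1, last), key);
    }

    // Recency: distinct other keys seen since the last access to key.
    // Returns -1 when key never appears in the trace.
    public static int recency(List<String> trace, String key) {
        int last = trace.lastIndexOf(key);
        if (last < 0) {
            return -1;
        }
        return distinctOthers(trace.subList(last + 1, trace.size()), key);
    }

    private static int distinctOthers(List<String> window, String key) {
        Set<String> seen = new HashSet<>(window);
        seen.remove(key);
        return seen.size();
    }

    public static void main(String[] args) {
        List<String> trace = Arrays.asList("A", "B", "C", "B", "A", "D", "D");
        System.out.println(irr(trace, "A"));     // prints 2 (B and C between the two accesses to A)
        System.out.println(recency(trace, "A")); // prints 1 (only D since the last access to A)
    }
}
```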

How does LIRS work?

• If a key has a high IRR, its next IRR is likely to be high again

• Keys with highest IRR are considered for eviction

• Once IRR is out of date, we start relying on Recency

• LIRS = Low Inter-reference Recency Set

How is LIRS implemented?

• Low IRR (LIR) area

• Holds hot keys!

• High IRR (HIR) area

• Holds recently accessed keys

• Keys here might get promoted to LIR area
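
The two areas can be sketched in a single-threaded form, following the stack S and queue Q structure described in the Jiang and Zhang paper. This is a simplified illustration, not Infinispan's implementation: `LirsSketch`, its field names and the fixed LIR/HIR capacities are assumptions of this example, it tracks keys only (no values), and it does not bound the number of non-resident keys kept in the stack.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;

public class LirsSketch<K> {
    private enum Kind { LIR, HIR }

    private final int lirCapacity; // size of the LIR (hot) area
    private final int hirCapacity; // size of the resident HIR area
    // Recency stack S: iteration order runs from bottom (oldest) to top (most recent).
    private final LinkedHashMap<K, Kind> stack = new LinkedHashMap<>();
    // Queue Q of resident HIR keys: the first element is the next eviction victim.
    private final LinkedHashSet<K> queue = new LinkedHashSet<>();
    private int lirCount;

    public LirsSketch(int lirCapacity, int hirCapacity) {
        if (lirCapacity < 1 || hirCapacity < 1) {
            throw new IllegalArgumentException("capacities must be >= 1");
        }
        this.lirCapacity = lirCapacity;
        this.hirCapacity = hirCapacity;
    }

    public boolean contains(K key) {
        return stack.get(key) == Kind.LIR || queue.contains(key);
    }

    // Records one access to key and returns true if it was a cache hit.
    public boolean access(K key) {
        Kind kind = stack.get(key);
        if (kind == Kind.LIR) {            // hit in the LIR area
            moveToTop(key, Kind.LIR);
            prune();
            return true;
        }
        boolean hit = queue.contains(key); // resident HIR key?
        if (!hit) {
            if (lirCount < lirCapacity) {  // warm-up: fill the LIR area first
                stack.put(key, Kind.LIR);
                lirCount++;
                return false;
            }
            if (queue.size() >= hirCapacity) {
                evictQueueHead();          // make room for the incoming key
            }
        }
        queue.remove(key);
        if (kind == Kind.HIR) {            // still in S, so its new IRR is low: promote to LIR
            moveToTop(key, Kind.LIR);
            demoteBottomLir();             // one promotion, one demotion: lirCount is unchanged
            prune();
        } else {                           // not in S: stays HIR, goes to the end of Q
            queue.add(key);
            stack.put(key, Kind.HIR);
        }
        return hit;
    }

    private void moveToTop(K key, Kind kind) {
        stack.remove(key);
        stack.put(key, kind);
    }

    // The bottom of S is always a LIR key here; it becomes a resident HIR key at the end of Q.
    private void demoteBottomLir() {
        Iterator<Map.Entry<K, Kind>> it = stack.entrySet().iterator();
        K bottom = it.next().getKey();
        it.remove();
        queue.add(bottom);
    }

    // Stack pruning: drop HIR keys from the bottom until a LIR key is at the bottom.
    private void prune() {
        Iterator<Map.Entry<K, Kind>> it = stack.entrySet().iterator();
        while (it.hasNext() && it.next().getValue() == Kind.HIR) {
            it.remove();
        }
    }

    // Evicts the oldest resident HIR key; if it is still in S it stays there as non-resident.
    private void evictQueueHead() {
        Iterator<K> it = queue.iterator();
        it.next();
        it.remove();
    }
}
```

With `new LirsSketch<Integer>(2, 1)` and ten rounds over keys 0 to 3, `access` reports 18 hits, because the two LIR keys are never displaced by the loop; a plain LRU cache of the same total size gets no hits on that pattern.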

Cache Hit - LIR area

Hit - HIR area and in LIR Q ...

Hit - HIR area and not in LIR Q

Cache Miss

LIRS implementation hurdles

• LIRS requires a lot of key shifting around

• It can lead to high contention

• Unless you can implement it in a scalable way, it’s useless

• Is there a low-contention way to implement a high-precision eviction algorithm?

Batching Eviction Updates

• Original idea: X.Ding, S.Jiang and X.Zhang’s 2009 paper:

• BP-Wrapper: A System Framework Making Any Replacement Algorithms (Almost) Lock Contention Free

• Keeping cache access per thread in a queue

• If queue reaches a threshold:

• Acquire locks and execute eviction as per algorithm

• Batching updates significantly lowers lock contention

Batching Updates in Infinispan

• Decided against recording access per thread

• 100s of threads could be hitting the cache; some are short-lived

• Created BoundedConcurrentHashMap

• Based on Doug Lea's ConcurrentHashMap

• Records accesses in a lock-free queue in each segment

• When threshold passed, acquire lock for segment and evict
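
The per-segment batching idea can be sketched roughly as follows. This is not the BoundedConcurrentHashMap source: `BatchedLruSegment`, its fields and the use of LRU as the policy applied during a drain are assumptions made to keep the example short. Accesses are appended to a lock-free queue, and the lock is taken only once the queue has reached the threshold.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class BatchedLruSegment<K, V> {
    private final int capacity;
    private final int batchThreshold;
    private final Map<K, V> data = new ConcurrentHashMap<>();
    // Lock-free log of accesses that have not been applied to the eviction policy yet.
    private final ConcurrentLinkedQueue<K> accessLog = new ConcurrentLinkedQueue<>();
    private final AtomicInteger pending = new AtomicInteger();
    private final ReentrantLock lock = new ReentrantLock();
    // Guarded by lock: keys in LRU order (eldest first), maintained in access order.
    private final LinkedHashMap<K, Boolean> recency = new LinkedHashMap<>(16, 0.75f, true);

    public BatchedLruSegment(int capacity, int batchThreshold) {
        this.capacity = capacity;
        this.batchThreshold = batchThreshold;
    }

    public V get(K key) {
        V value = data.get(key);
        if (value != null) {
            recordAccess(key);
        }
        return value;
    }

    public void put(K key, V value) {
        data.put(key, value);
        recordAccess(key);
    }

    public int size() {
        return data.size();
    }

    public boolean containsKey(K key) {
        return data.containsKey(key);
    }

    // Cheap path: no lock, just append to the queue and count.
    private void recordAccess(K key) {
        accessLog.add(key);
        if (pending.incrementAndGet() >= batchThreshold) {
            drain();
        }
    }

    // Expensive path: take the lock once and apply the whole batch.
    private void drain() {
        lock.lock();
        try {
            K key;
            while ((key = accessLog.poll()) != null) {
                pending.decrementAndGet();
                if (data.containsKey(key)) {
                    recency.put(key, Boolean.TRUE); // moves key to the most-recent end
                }
            }
            Iterator<K> eldestFirst = recency.keySet().iterator();
            while (recency.size() > capacity && eldestFirst.hasNext()) {
                K victim = eldestFirst.next();
                eldestFirst.remove();
                data.remove(victim);
            }
        } finally {
            lock.unlock();
        }
    }
}
```

One consequence of this design is that the segment can temporarily hold up to `batchThreshold - 1` entries more than `capacity` between drains; that is the price paid for taking the lock only once per batch.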

Precision and Performance

• Segment level eviction does not affect overall precision

• A community member ran some performance tests:

• After swapping their own LRU cache with BoundedCHM:

• Berlin SPARQL Benchmark performance increased 55-60% for both cold and hot caches

Summary

• In JBoss Cache, eviction can become a bottleneck

• Infinispan 4.0 uses conservative eviction

• Infinispan 4.1 has a more precise eviction algorithm (LIRS)

• Batching updates, present in 4.1, significantly lowers lock contention

• Result = Highly-concurrent, highly-precise implicit eviction
