Ece2013 Java Advanced Memorymanagement

Java Advanced Memory Management, Christian Campo, EclipseCon Europe 2013, Memory API

description

Talk on Java memory management from EclipseCon Europe 2013. It covers pitfalls when using Java caches and gives API tips for scaling better with growing data.

Transcript of Ece2013 Java Advanced Memorymanagement

Page 1: Ece2013 Java Advanced Memorymanagement

Java - Advanced Memory Management

Christian Campo - EclipseCon Europe 2013

Memory API

Page 2: Ece2013 Java Advanced Memorymanagement

Copyright © 2013 compeople AG, Made available under the Eclipse Public License v 1.0

Agenda

• Overview
• Challenges
• Cache Problems
• API

Page 3: Ece2013 Java Advanced Memorymanagement

Overview

Private filesystem

Hierarchical data structure
• root, folders, files, attributes, binary content

Simultaneous read/write from multiple threads
• application UI, synchronizer, filesystem driver etc.

Stored in an SQL database

Schema is dynamic (columns added at runtime)

No OR-Mapper

Page 4: Ece2013 Java Advanced Memorymanagement

Data model

<root> "/"
• "docs" (/docs)
  • "a.txt" (/docs/a.txt)
• "images" (/images)

Attributes:
• lastmodified
• length
• owner
• customernumber

Page 5: Ece2013 Java Advanced Memorymanagement

Components

Store API

Files / Directories

Nodes

SQL Access

Application, Virtual File System, Synchronizer (Daemon)

Page 6: Ece2013 Java Advanced Memorymanagement

Challenges

• Repetitive queries (attributes)
  › Windows assumes that any filesystem operation is cheap
• Application developers don't think about internals
  › they assume that any API call is cheap and fast
• Data structure is dynamic
• Growing size of data

Page 7: Ece2013 Java Advanced Memorymanagement

Developer stereotypes

DB Developer
• ACID
• transactions
• logic should be a stored procedure
• db has buffer (=cache)

Framework Developer
• clean and small API
• refactor often
• DBs are too slow → cache

Application Developer
• never refactor :-)
• wrap every framework
• Framework is too slow → cache
• why can't I access the DB directly? :-)

Page 8: Ece2013 Java Advanced Memorymanagement

Page 9: Ece2013 Java Advanced Memorymanagement

Cache - Requirements

Rule #1: Cache improves speed but does not change functionality

Rule #2: Cache must not lead to OOM

Page 10: Ece2013 Java Advanced Memorymanagement

Virtual File System - FileCache

Page 11: Ece2013 Java Advanced Memorymanagement

Virtual File System - FileCache

often lists children of a directory

tempting to add a cache

build "MyCache" :-)

Page 12: Ece2013 Java Advanced Memorymanagement

VFS - "MyCache" V1.0

private WeakHashMap<String, List<GFile>> cache = new WeakHashMap<String, List<GFile>>();
private Store store;

public synchronized List<GFile> list(final GFile gFile) {
    List<GFile> list = cache.get(gFile.getPath());
    if (list == null) {
        // cache miss: ask the store and remember the result
        list = store.list(gFile);
        cache.put(gFile.getPath(), list);
    }
    return list;
}

Page 13: Ece2013 Java Advanced Memorymanagement

VFS - "MyCache" V1.0 - JUnit test

// create + write content
for (int x = 0; x < 100; x++) {
    GFile file = store.create(new GFile("/docs/test" + x + ".txt"));
}
// check
for (int x = 0; x < 100; x++) {
    List<GFile> list = list(new GFile("/docs"));
    assertTrue(list.contains(new GFile("/docs/test" + x + ".txt")));
}

// list from "MyCache"
public synchronized List<GFile> list(final GFile gFile) { ... }

OK + FASTER with Cache

Page 14: Ece2013 Java Advanced Memorymanagement

VFS - "MyCache" V1.0 - JUnit test

// create + write content + check
for (int x = 0; x < 100; x++) {
    GFile file = store.create(new GFile("/docs/test" + x + ".txt"));
    List<GFile> list = list(new GFile("/docs"));
    assertTrue(list.contains(new GFile("/docs/test" + x + ".txt")));
}

// list from "MyCache"
public synchronized List<GFile> list(final Store store, final GFile gFile) { ... }

FAILS with Cache: the first list() call caches the listing of "/docs", so later calls return a stale list that lacks the newly created files -> expiration time for cache entries

Page 15: Ece2013 Java Advanced Memorymanagement

VFS - MyCache V2.0

private long EXPIRATION_TIME = 200; // milliseconds
private WeakHashMap<String, MyCacheEntry> cache = new WeakHashMap<String, MyCacheEntry>();

public synchronized List<GFile> list(final LocalStore store, final GFile gFile) {
    final MyCacheEntry myCacheEntry = cache.get(gFile.getPath());
    // reload when there is no entry or the entry is older than EXPIRATION_TIME
    if (myCacheEntry == null
            || System.currentTimeMillis() - EXPIRATION_TIME > myCacheEntry.currentTime) {
        final List<GFile> listGFiles = store.list(gFile);
        cache.put(gFile.getPath(), new MyCacheEntry(listGFiles));
        return listGFiles;
    } else {
        return myCacheEntry.listGFiles;
    }
}

class MyCacheEntry {
    public List<GFile> listGFiles;
    public long currentTime; // creation time of this entry in milliseconds

    MyCacheEntry(final List<GFile> listGFiles) {
        this.listGFiles = listGFiles;
        this.currentTime = System.currentTimeMillis();
    }
}

Page 16: Ece2013 Java Advanced Memorymanagement

Virtual FS - MyCache V2.0 - more problems ...

• It was hard to find a good value for EXPIRATION_TIME
  › 200 ms: effective, but it was easy to get wrong HITS
  › 10 ms: better (not always right), but not many HITS
• implement a maximum number of entries
• find means to "know" when CacheEntries are invalid
• statistics?

Page 17: Ece2013 Java Advanced Memorymanagement

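Virtual FS - LoadingCache (Guava)

import java.util.concurrent.TimeUnit;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

private final LoadingCache<String, List<GFile>> cache = CacheBuilder.newBuilder()
    .maximumSize(10)                               // at most 10 directory listings
    .softValues()                                  // values may be reclaimed under memory pressure
    .expireAfterWrite(200, TimeUnit.MILLISECONDS)
    .build(new CacheLoader<String, List<GFile>>() {
        @Override
        public List<GFile> load(final String path) {
            return store.list(new GFile(path));
        }
    });

Rule #3: Never write your own Cache implementation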

Page 18: Ece2013 Java Advanced Memorymanagement

Virtual FS - invalidating CacheEntries

// coarse: drop every cached listing (Guava: invalidateAll())
public GFile create(GFile file) {
    GFile g = store.create(file);
    cache.invalidateAll();
    return g;
}

// targeted: drop only the listing of the parent directory (Guava: invalidate(key))
public GFile create(GFile file) {
    GFile g = store.create(file);
    cache.invalidate(getParent(file).getPath());
    return g;
}

Rule #4: Have a strategy for removing cache entries (expiration timing alone does not work)
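
The effect of Rule #4 can be reproduced without a database. The following is a minimal sketch using only the JDK; FakeStore and ListingCache are hypothetical stand-ins invented for this illustration and are not the talk's Store or Guava's LoadingCache. With invalidation switched off the cache keeps returning the old listing; with invalidation on create, the next list() call reloads it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the store; not the talk's Store API. */
class FakeStore {
    private final Map<String, List<String>> children = new HashMap<String, List<String>>();

    void create(String dir, String name) {
        List<String> names = children.get(dir);
        if (names == null) {
            names = new ArrayList<String>();
            children.put(dir, names);
        }
        names.add(name);
    }

    List<String> list(String dir) {
        List<String> names = children.get(dir);
        // Copy, so a cached listing is a snapshot and never aliases store data.
        return names == null ? new ArrayList<String>() : new ArrayList<String>(names);
    }
}

/** Caches directory listings; optionally drops the parent's entry on create. */
class ListingCache {
    private final FakeStore store;
    private final boolean invalidateOnCreate;
    private final Map<String, List<String>> cache = new HashMap<String, List<String>>();

    ListingCache(FakeStore store, boolean invalidateOnCreate) {
        this.store = store;
        this.invalidateOnCreate = invalidateOnCreate;
    }

    synchronized List<String> list(String dir) {
        List<String> listing = cache.get(dir);
        if (listing == null) {
            listing = store.list(dir);
            cache.put(dir, listing);
        }
        return listing;
    }

    synchronized void create(String dir, String name) {
        store.create(dir, name);
        if (invalidateOnCreate) {
            cache.remove(dir); // targeted: only the parent's listing is stale
        }
    }
}
```

Dropping only the parent's entry keeps the other cached listings warm, which is the reason the targeted variant is usually preferable to flushing everything.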

Page 19: Ece2013 Java Advanced Memorymanagement

Attribute Cache

Page 20: Ece2013 Java Advanced Memorymanagement

Attribute Cache

• heavy load on the db
• a lot of read operations on attributes
• not many writes
• → cache attributes

Page 21: Ece2013 Java Advanced Memorymanagement

Attribute Cache

private final Map<String, String> attributesCache;

public String getAttribute(final String name) {
    String value = attributesCache.get(name);
    if (value != null) {
        return value;
    }
    ... // read it from DB
}

public void setAttribute(final String name, String value) {
    // only touch the DB when the value differs from the cached one
    if (!Objects.equal(value, attributesCache.get(name))) {
        attributesCache.put(name, value);
        ... // update attribute in the database
    }
}

Page 22: Ece2013 Java Advanced Memorymanagement

Attribute Cache - JUnit test

// previous value of length = "1057"
db.transaction(new DbRunnable() {
    protected void run() {
        file.setAttribute("length", "0");
        long newLength = store.put(file, inputStream);
        file.setAttribute("length", String.valueOf(newLength));
    }
});
...
file.setAttribute("length", "0");

• IOException while reading inputStream
• transaction rolls back
• attribute length = "0" in the cache, but still "1057" in the DB
• setAttribute("length", "0") after the transaction does not reach the DB, because the cached value already equals "0"

public void setAttribute(final String name, String value) {
    if (!Objects.equal(value, attributesCache.get(name))) {
        attributesCache.put(name, value);
        ... // update attribute in the database
    }
}

Rule #5: Cache should be aware of transaction rollbacks

Page 23: Ece2013 Java Advanced Memorymanagement

Attribute Cache - Transactions

private final Map<String, Optional<String>> attributesCache;

private void init() {
    getTransaction().register(this);
}

public String getAttribute(final String key) { ... }

public void setAttribute(final String name, String value) { ... }

// called when the transaction rolls back
public void rollback() {
    attributesCache.clear();
}
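
A rollback-aware cache can be sketched with plain JDK classes. Everything here is an assumption made for illustration: RollbackListener, SketchTransaction and AttributeCache are hypothetical names, the "database" is just a Map, and the transaction buffers writes until commit. The talk's real transaction API is not shown on the slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical callback; the talk's transaction API is not shown. */
interface RollbackListener {
    void rollback();
}

/** Hypothetical transaction: buffers writes and applies them on commit. */
class SketchTransaction {
    private final Map<String, String> db;
    private final Map<String, String> pending = new HashMap<String, String>();
    private final List<RollbackListener> listeners = new ArrayList<RollbackListener>();

    SketchTransaction(Map<String, String> db) {
        this.db = db;
    }

    void register(RollbackListener listener) {
        listeners.add(listener);
    }

    void write(String name, String value) {
        pending.put(name, value);
    }

    void commit() {
        db.putAll(pending);
        pending.clear();
    }

    void rollback() {
        pending.clear();
        for (RollbackListener listener : listeners) {
            listener.rollback(); // let caches forget uncommitted values
        }
    }
}

/** Attribute cache that clears itself when the transaction rolls back. */
class AttributeCache implements RollbackListener {
    private final Map<String, String> db;
    private final SketchTransaction tx;
    private final Map<String, String> cache = new HashMap<String, String>();

    AttributeCache(Map<String, String> db, SketchTransaction tx) {
        this.db = db;
        this.tx = tx;
        tx.register(this);
    }

    String getAttribute(String name) {
        String value = cache.get(name);
        if (value == null) {
            value = db.get(name);
            if (value != null) {
                cache.put(name, value);
            }
        }
        return value;
    }

    void setAttribute(String name, String value) {
        if (!Objects.equals(value, cache.get(name))) {
            cache.put(name, value);
            tx.write(name, value);
        }
    }

    @Override
    public void rollback() {
        cache.clear();
    }
}
```

Clearing the whole cache on rollback is the bluntest option; it is correct because every entry can be reloaded from the database, at the price of a few extra reads afterwards.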

Page 24: Ece2013 Java Advanced Memorymanagement

Attribute Cache - Transactions

db.transaction(new DbRunnable() {
    protected void run() {
        try {
            store.put(file, inputStream);
            file.setAttribute("loaded", "true");
        } catch (IOException ioe) {
            file.setAttribute("loaded", "false");
            throw ioe;
        }
    }
});

• IOException -> transaction rollback
• loaded = "false" is NOT in the cache and NOT in the DB
• The cache changes behaviour once it takes part in the transaction rollback.
• Rule #6: Don't leave the transaction with an exception if the data should not be rolled back

Page 25: Ece2013 Java Advanced Memorymanagement

Attribute Cache - Isolation

[Diagram: Transaction 1 (Component A) and Transaction 2 (Component B) access the same GFile; x = ? ???]

Page 26: Ece2013 Java Advanced Memorymanagement

AttributeCache - Isolation - Solution 1

public void setAttribute(final String name, String value) {
    ...
    getTransaction().getContext().set(... + name, value);
    ...
}

public String getAttribute(String name) {
    ...
    String value = getTransaction().getContext().get(... + name);
    ...
}

• store cached values globally IN the transaction context
• however, all attribute values are then stored in the Transaction

Page 27: Ece2013 Java Advanced Memorymanagement

AttributeCache - Isolation - Solution 2 (better)

private HashMap<String, String> cache = new HashMap<String, String>();

public void setAttribute(final String name, String value) {
    ...
    cache.put(getTransactionId() + name, value);
    ...
}

public String getAttribute(String name) {
    ...
    String value = cache.get(getTransactionId() + name);
    ...
}

store cached values locally, keyed by transaction id
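
Keying by transaction id could look roughly like the sketch below. TxScopedAttributeCache and its methods are hypothetical names; the transaction id is passed in explicitly because getTransactionId() is not shown in the talk. One detail worth noting: plain concatenation of id and name can collide (id 1 with name "2x" and id 12 with name "x" both give "12x"), so the sketch inserts a separator, assuming ids are numeric.

```java
import java.util.HashMap;
import java.util.Map;

/** Attribute cache whose entries are visible only to the transaction that wrote them. */
class TxScopedAttributeCache {
    private final Map<String, String> cache = new HashMap<String, String>();

    // The ':' separator keeps "1" + "2x" apart from "12" + "x".
    private static String key(long txId, String name) {
        return txId + ":" + name;
    }

    synchronized void put(long txId, String name, String value) {
        cache.put(key(txId, name), value);
    }

    synchronized String get(long txId, String name) {
        return cache.get(key(txId, name));
    }

    /** Drops every entry of a finished (committed or rolled back) transaction. */
    synchronized void endTransaction(long txId) {
        final String prefix = txId + ":";
        cache.keySet().removeIf(k -> k.startsWith(prefix));
    }

    synchronized int size() {
        return cache.size();
    }
}
```

Removing a transaction's entries when it ends keeps the map from growing without bound, which ties back to Rule #2.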

Page 28: Ece2013 Java Advanced Memorymanagement

ChunkCache - data with dirty state

Page 29: Ece2013 Java Advanced Memorymanagement

ChunkCache - data with dirty state

• VFS writes and reads data in random "segments"
  › e.g. "write offset 110.000 length 200"
• Stored on disk in 1 MB "Chunks"

[Diagram: the Virtual FileSystem issues many small writes ("write offset=110.000 length=2000", "write...", ...) directly to the ChunkFileStore, which holds the data in Chunks (1 MB)]
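
How a segment maps onto chunks is simple integer arithmetic. The sketch below assumes that 1 MB means 1,048,576 bytes and that chunks are numbered from 0; ChunkMath is a hypothetical helper, not a class from the talk.

```java
/** Maps byte offsets of a file to chunk numbers (hypothetical helper). */
class ChunkMath {
    // Assumption: "1 MB" on the slide means 1024 * 1024 bytes.
    static final long CHUNK_SIZE = 1024L * 1024L;

    /** Number of the chunk that contains the given byte offset. */
    static long chunkIndex(long offset) {
        return offset / CHUNK_SIZE;
    }

    /** Position of the given byte offset inside its chunk. */
    static long offsetInChunk(long offset) {
        return offset % CHUNK_SIZE;
    }

    /** Number of the last chunk touched by a write of 'length' bytes (length > 0). */
    static long lastChunkIndex(long offset, long length) {
        return (offset + length - 1) / CHUNK_SIZE;
    }
}
```

With these assumptions, the write at offset 110.000 with length 200 lies entirely inside chunk 0, while a write that starts shortly before a chunk boundary touches two chunks and therefore two cache entries.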

Page 30: Ece2013 Java Advanced Memorymanagement

ChunkCache - data with dirty state

• Cache writes lazily
• Cache maintains dirty state
• LoadingCache used
• timeout, maxsize

[Diagram: the Virtual FileSystem sends its writes ("write offset=110.000 length=200", "write...") to the ChunkCache, which sits in front of the ChunkFileStore and its Chunks (1 MB)]

Page 31: Ece2013 Java Advanced Memorymanagement

ChunkCache - data with dirty state

private final LoadingCache<Integer, Chunk> cache = CacheBuilder.newBuilder()
    .maximumSize(4)
    .expireAfterAccess(2, TimeUnit.SECONDS)
    .removalListener(new RemovalListener<Integer, Chunk>() {
        @Override
        public void onRemoval(final RemovalNotification<Integer, Chunk> notification) {
            if (notification.getValue().isDirty()) {
                // write chunk
            }
        }
    })
    .build(new CacheLoader<Integer, Chunk>() {
        @Override
        public Chunk load(final Integer key) throws Exception {
            // load Chunk for key
        }
    });

• write to disk on remove
• expired cache entries are only removed when the cache is accessed !!
• no automatic housekeeping: a dirty chunk stays unwritten until a later access or an explicit cleanUp() / invalidateAll()
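
The write-on-removal idea can be illustrated without Guava. This is a JDK-only sketch built on LinkedHashMap in access order, so it is a different mechanism from the LoadingCache above and has no time-based expiry; SketchChunk and WriteBackChunkCache are hypothetical names, and the "disk" is a list that records which chunks were written.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical chunk: only the dirty flag matters for this sketch. */
class SketchChunk {
    final int index;
    boolean dirty;

    SketchChunk(int index) {
        this.index = index;
    }
}

/** LRU write-back cache: dirty chunks are written when evicted or on flush(). */
class WriteBackChunkCache {
    /** Stands in for the disk: records the chunk numbers that were written. */
    final List<Integer> written = new ArrayList<Integer>();
    private final Map<Integer, SketchChunk> lru;

    WriteBackChunkCache(final int maxSize) {
        // accessOrder = true makes the map iterate from least to most recently used.
        this.lru = new LinkedHashMap<Integer, SketchChunk>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, SketchChunk> eldest) {
                if (size() > maxSize) {
                    writeIfDirty(eldest.getValue()); // never drop unwritten data
                    return true;
                }
                return false;
            }
        };
    }

    /** Simulates a write into the given chunk. */
    synchronized void markDirty(int index) {
        SketchChunk chunk = lru.get(index);
        if (chunk == null) {
            chunk = new SketchChunk(index);
            lru.put(index, chunk);
        }
        chunk.dirty = true;
    }

    /** Explicit housekeeping: write everything that is still dirty. */
    synchronized void flush() {
        for (SketchChunk chunk : lru.values()) {
            writeIfDirty(chunk);
        }
    }

    private void writeIfDirty(SketchChunk chunk) {
        if (chunk.dirty) {
            written.add(chunk.index);
            chunk.dirty = false;
        }
    }
}
```

The explicit flush() is the important part: whichever cache is used, something (a timer, a close hook, a sync call) has to trigger the write of chunks that are dirty but never evicted.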

Page 32: Ece2013 Java Advanced Memorymanagement

Page 33: Ece2013 Java Advanced Memorymanagement

API - query - list

query for resources satisfying a condition

• can produce a lot of instances
• returns only upon complete resultset
• returns all result instances
• memory footprint unpredictable (maybe large)

ArrayList<GFile> query("type='file' and lastmodified>'20130901'");

Page 34: Ece2013 Java Advanced Memorymanagement

API - query - lazy list

• List can be a lazy, custom implementation
  • that only stores primary keys
  • builds instances on demand
• returns only upon complete resultset
• smaller memory footprint

List<GFile> query("type='file' and lastmodified>'20130901'");
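
A lazy list of this kind can be sketched on top of java.util.AbstractList. LazyList and its loader function are hypothetical; the loader stands in for whatever builds a GFile from a primary key, and primary keys are assumed to be long values.

```java
import java.util.AbstractList;
import java.util.function.LongFunction;

/** Read-only list that keeps only primary keys and builds elements on access. */
class LazyList<T> extends AbstractList<T> {
    private final long[] primaryKeys;
    private final LongFunction<T> loader; // hypothetical: loads one instance by key

    LazyList(long[] primaryKeys, LongFunction<T> loader) {
        this.primaryKeys = primaryKeys.clone();
        this.loader = loader;
    }

    @Override
    public T get(int index) {
        // Nothing is cached here: every access builds a fresh instance.
        return loader.apply(primaryKeys[index]);
    }

    @Override
    public int size() {
        return primaryKeys.length;
    }
}
```

The memory held by the list is 8 bytes per result regardless of how large the instances are. The query still has to run to completion to collect the keys, which matches the "returns only upon complete resultset" point above.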

Page 35: Ece2013 Java Advanced Memorymanagement

API - query - iterator

• Iterator can be a lazy, custom implementation
  • and builds instances on demand
• uses DB cursor and reads on demand
• smaller memory footprint
• result can only be consumed once

Iterator<GFile> query("type='file' and lastmodified>'20130901'");
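
An iterator over a cursor might be sketched as follows. KeyCursor is a hypothetical stand-in for a database cursor (its next() contract resembles the one of JDBC's ResultSet), ArrayKeyCursor is a test double, and the builder function stands in for the code that creates a GFile from a key.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.LongFunction;

/** Hypothetical stand-in for a DB cursor. */
interface KeyCursor {
    boolean next();     // advance to the next row; false when exhausted
    long currentKey();  // key of the row the cursor is positioned on
}

/** Test double that walks over a fixed array of keys. */
class ArrayKeyCursor implements KeyCursor {
    private final long[] keys;
    private int pos = -1;

    ArrayKeyCursor(long... keys) {
        this.keys = keys.clone();
    }

    @Override
    public boolean next() {
        return ++pos < keys.length;
    }

    @Override
    public long currentKey() {
        return keys[pos];
    }
}

/** Pulls one row at a time from the cursor and builds the instance on demand. */
class CursorIterator<T> implements Iterator<T> {
    private final KeyCursor cursor;
    private final LongFunction<T> builder;
    private boolean fetched;   // cursor sits on a row that was not returned yet
    private boolean exhausted; // cursor reported the end of the result

    CursorIterator(KeyCursor cursor, LongFunction<T> builder) {
        this.cursor = cursor;
        this.builder = builder;
    }

    @Override
    public boolean hasNext() {
        if (!fetched && !exhausted) {
            fetched = cursor.next();
            exhausted = !fetched;
        }
        return fetched;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        fetched = false;
        return builder.apply(cursor.currentKey());
    }
}
```

Because the iterator only moves forward over the cursor, the result can be consumed exactly once; a real implementation would also need to close the cursor when iteration ends or is abandoned.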

Page 36: Ece2013 Java Advanced Memorymanagement

API - track objects "idea"

• track "framework" objects that the application holds on to
  • for statistics
  • for error logging
  › e.g. "more than 10.000 file resources WARN"
• Implementation
  › store all returned objects in a WeakHashMap
  › GC removes objects when there is no hard reference
  › WeakHashMap.size() can give you an ESTIMATE
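
The tracking idea could be sketched as below. ObjectTracker and its threshold parameter are hypothetical. Two caveats apply: WeakHashMap compares keys with equals() and hashCode(), so objects that are equal to each other are counted once, and size() only shrinks after the garbage collector has actually cleared the references, which is why it is an estimate.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

/** Counts objects handed out to the application that are still reachable. */
class ObjectTracker {
    // Keys are held weakly, so tracking does not keep the objects alive.
    private final Map<Object, Boolean> live =
            Collections.synchronizedMap(new WeakHashMap<Object, Boolean>());
    private final int warnThreshold;

    ObjectTracker(int warnThreshold) {
        this.warnThreshold = warnThreshold;
    }

    /** Registers the object and returns it, so calls can wrap a return value. */
    <T> T track(T obj) {
        live.put(obj, Boolean.TRUE);
        return obj;
    }

    /** Approximate number of tracked objects that were not collected yet. */
    int estimate() {
        return live.size();
    }

    boolean overThreshold() {
        return estimate() > warnThreshold;
    }
}
```

A framework method would return `tracker.track(new GFile(...))` and log a warning when overThreshold() becomes true.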

Page 37: Ece2013 Java Advanced Memorymanagement

API - famous "count" implementation

looks trivial, but it's a real-world example :-)

WASTE

public long notYetSyncFiles() {
    // builds the complete result just to count it
    return localStore.query("type='file' and changed='true'").size();
}

OK

public long notYetSyncFiles() {
    return localStore.queryCount("type='file' and changed='true'");
}

Page 38: Ece2013 Java Advanced Memorymanagement

API - mapping to String

• fields that have a limited number of values
  › true / false
  › bitsets (1, 8, 128, 32)
  › filetypes 'file' / 'directory'
  › etc.
• if you read those a lot from DB or remote services*
  › you quickly have 10.000 or more instances of 'true'

if (value.equals("true") || value.equals("false") || ...) {
    value = value.intern();
}

• intern() is expensive
• interned Strings used to be stored in PermGen
  • small space
  • since Java 7 they are stored in the heap
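
An alternative that avoids calling intern() at runtime is to map the few known values onto shared constants. This is a sketch of that variation, not the code from the slide; Canonical is a hypothetical helper and the list of values is only an example. It relies on the fact that string literals in Java source are already shared constants.

```java
/** Replaces frequently repeated values by one shared String instance each. */
class Canonical {
    static String of(String value) {
        if (value == null) {
            return null;
        }
        switch (value) {
            // Each literal below is a single shared instance per JVM.
            case "true":
                return "true";
            case "false":
                return "false";
            case "file":
                return "file";
            case "directory":
                return "directory";
            default:
                return value; // unknown values are passed through untouched
        }
    }
}
```

Applied at the point where values are read from the database, thousands of separate "true" instances collapse into one, and values outside the known set cost only a switch lookup.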

Page 39: Ece2013 Java Advanced Memorymanagement

Cache / API - Summary

• Cache must not change the behaviour
• Use existing Cache implementations
• Caches must be aware of database behaviour (ACID)
• API must scale

Page 40: Ece2013 Java Advanced Memorymanagement
