Page 1: Ece2013 Java Advanced Memorymanagement

Java - Advanced Memory Management

Christian Campo - EclipseCon Europe 2013

Memory API

Page 2: Ece2013 Java Advanced Memorymanagement

Copyright © 2013 compeople AG, Made available under the Eclipse Public License v 1.0

Agenda

• Overview
• Challenges
• Cache Problems
• API

Page 3: Ece2013 Java Advanced Memorymanagement

Overview  

Private filesystem

Hierarchical data structure
• root, folders, files, attributes, binary content

Simultaneous read/write from multiple threads
• application UI, synchronizer, filesystem driver, etc.

Stored in an SQL database

Schema is dynamic (columns added at runtime)

No OR-Mapper

Page 4: Ece2013 Java Advanced Memorymanagement

Data model

<root> "/"
    "docs"      (/docs)
        "a.txt"     (/docs/a.txt)
    "images"    (/images)

Attributes:
- lastmodified
- length
- owner
- customernumber

Page 5: Ece2013 Java Advanced Memorymanagement

Components  

Store API

Files / Directories

Nodes

SQL Access

Application, Virtual File System, Synchronizer (Daemon)

Page 6: Ece2013 Java Advanced Memorymanagement

Challenges

• Repetitive queries (attributes)
  › Windows assumes any file system operation is cheap
• Application developers don't think about internals
  › they assume that any API call is cheap and fast
• Data structure is dynamic
• Growing size of data

Page 7: Ece2013 Java Advanced Memorymanagement

Developer stereotypes

DB Developer
• ACID
• transactions
• logic should be a stored procedure
• DB has a buffer (= cache)

Framework Developer
• clean and small API
• refactor often
• DBs are too slow → cache

Application Developer
• never refactor :-)
• wrap every framework
• framework is too slow → cache
• why can't I access the DB directly? :-)

Page 9: Ece2013 Java Advanced Memorymanagement

Cache - Requirements

Rule #1: Cache improves speed but does not change functionality

Rule #2: Cache must not lead to OOM

Page 10: Ece2013 Java Advanced Memorymanagement

Virtual File System - FileCache

Page 11: Ece2013 Java Advanced Memorymanagement

Virtual File System - FileCache

often lists children of a directory

tempting to add a cache

build "MyCache" :-)

Page 12: Ece2013 Java Advanced Memorymanagement

VFS - "MyCache" V1.0

private WeakHashMap<String, List<GFile>> cache = new WeakHashMap<String, List<GFile>>();
private Store store;

public synchronized List<GFile> list( final GFile gFile ) {
    List<GFile> list = cache.get( gFile.getPath() );
    if ( list == null ) {
        list = store.list( gFile );
        cache.put( gFile.getPath(), list );
    }
    return list;
}

Page 13: Ece2013 Java Advanced Memorymanagement

VFS - "MyCache" V1.0 - JUnit test

// create + write content
for (int x = 0; x < 100; x++) {
    GFile file = store.create(new GFile("/docs/test" + x + ".txt"));
}
// check
for (int x = 0; x < 100; x++) {
    List<GFile> list = list(new GFile("/docs"));
    assertTrue(list.contains(new GFile("/docs/test" + x + ".txt")));
}

// list from "MyCache"
public synchronized List<GFile> list( final GFile gFile ) {}

OK + FASTER with Cache

Page 14: Ece2013 Java Advanced Memorymanagement

VFS - "MyCache" V1.0 - JUnit test

// create + write content + check
for (int x = 0; x < 100; x++) {
    GFile file = store.create(new GFile("/docs/test" + x + ".txt"));
    List<GFile> list = list(new GFile("/docs"));
    assertTrue(list.contains(new GFile("/docs/test" + x + ".txt")));
}

// list from "MyCache"
public synchronized List<GFile> list( final Store store, final GFile gFile ) {}

FAILS with Cache (the listing of /docs cached in the first iteration is stale from the second iteration on) → expiration time for cache entries

Page 15: Ece2013 Java Advanced Memorymanagement

VFS - MyCache V2.0

private long EXPIRATION_TIME = 200; // milliseconds
private WeakHashMap<String, MyCacheEntry> cache = new WeakHashMap<String, MyCacheEntry>();

public synchronized List<GFile> list( final LocalStore store, final GFile gFile ) {
    final MyCacheEntry myCacheEntry = cache.get( gFile.getPath() );
    if ( myCacheEntry == null
            || System.currentTimeMillis() - EXPIRATION_TIME > myCacheEntry.currentTime ) {
        // no entry yet, or the entry is older than EXPIRATION_TIME: reload
        final List<GFile> listGFiles = store.list( gFile );
        cache.put( gFile.getPath(), new MyCacheEntry( listGFiles ) );
        return listGFiles;
    } else {
        return myCacheEntry.listGFiles;
    }
}

class MyCacheEntry {
    public List<GFile> listGFiles;
    public long currentTime;

    MyCacheEntry( final List<GFile> listGFiles ) {
        this.listGFiles = listGFiles;
        this.currentTime = System.currentTimeMillis();
    }
}

Page 16: Ece2013 Java Advanced Memorymanagement

Virtual FS - MyCache V2.0 - more problems ...

• It was hard to find a good value for EXPIRATION_TIME
  › 200 ms: effective, but it was easy to get wrong HITS
  › 10 ms: better (not always right), but not many HITS
• implement a maximum number of entries
• find means to "know" when CacheEntries are invalid
• statistics?

Page 17: Ece2013 Java Advanced Memorymanagement

Virtual FS - LoadingCache (Guava)

private final LoadingCache<String, List<GFile>> cache = CacheBuilder.newBuilder()
    .maximumSize( 10 )
    .softValues()
    .expireAfterWrite( 200, TimeUnit.MILLISECONDS )
    .build( new CacheLoader<String, List<GFile>>() {
        public List<GFile> load( final String path ) {
            return store.list( new GFile( path ) );
        }
    } );

Rule #3: Never write your own Cache implementation

Page 18: Ece2013 Java Advanced Memorymanagement

Virtual FS - invalidating CacheEntries

// Variant 1: discard the whole cache on every create
public GFile create(GFile file) {
    GFile g = store.create(file);
    cache.invalidateAll();
    return g;
}

// Variant 2: discard only the listing of the parent directory
public GFile create(GFile file) {
    GFile g = store.create(file);
    cache.invalidate(getParent(file).getPath());
    return g;
}

Rule #4: Have a strategy for removing cache entries (expiration timing alone does not work)
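The second variant above can be sketched with plain JDK types. This is only an illustration under stated assumptions, not the code from the talk: `InMemoryStore`, `ListingCache` and `parentOf` are hypothetical stand-ins, paths are plain strings, and a `ConcurrentHashMap` replaces the Guava cache so the example runs without dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the store: keeps file paths in memory.
class InMemoryStore {
    private final List<String> paths = new ArrayList<>();

    synchronized void create(String path) {
        paths.add(path);
    }

    // Returns the direct children of the given directory.
    synchronized List<String> list(String dir) {
        List<String> children = new ArrayList<>();
        for (String p : paths) {
            if (ListingCache.parentOf(p).equals(dir)) {
                children.add(p);
            }
        }
        return children;
    }
}

// Caches directory listings and drops the parent's entry on create.
class ListingCache {
    private final InMemoryStore store;
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    ListingCache(InMemoryStore store) {
        this.store = store;
    }

    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }

    List<String> list(String dir) {
        // Loads the listing from the store only if it is not cached yet.
        return cache.computeIfAbsent(dir, store::list);
    }

    void create(String path) {
        store.create(path);
        cache.remove(parentOf(path)); // invalidate only the affected listing
    }
}
```

With targeted invalidation like this, a test that interleaves create and list calls no longer depends on choosing an expiration time.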

Page 19: Ece2013 Java Advanced Memorymanagement

Attribute Cache

Page 20: Ece2013 Java Advanced Memorymanagement

Attribute Cache

• heavy load on the DB
• a lot of read operations on attributes
• not many writes
• → cache attributes

Page 21: Ece2013 Java Advanced Memorymanagement

Attribute Cache

private final Map<String, String> attributesCache;

public String getAttribute( final String name ) {
    String value = attributesCache.get( name );
    if ( value != null ) {
        return value;
    }
    ... // read it from DB
}

public void setAttribute( final String name, String value ) {
    if ( !Objects.equal( value, attributesCache.get( name ) ) ) {
        attributesCache.put( name, value );
        ... // update attribute in the database
    }
}

Page 22: Ece2013 Java Advanced Memorymanagement

Attribute Cache - JUnit test

// previous value of length = "1057"
db.transaction(new DbRunnable() {
    protected void run() {
        file.setAttribute("length", "0");
        long newLength = store.put(file, inputStream);
        file.setAttribute("length", String.valueOf(newLength));
    }
});
...
file.setAttribute("length", "0");

• IOException reading inputStream
• transaction rolls back
• attribute length = "0" in the cache, but still "1057" in the DB
• setAttribute("length", "0") after the transaction does not reach the DB, because the cached value already equals "0"

public void setAttribute( final String name, String value ) {
    if ( !Objects.equal( value, attributesCache.get( name ) ) ) {
        attributesCache.put( name, value );
        ... // update attribute in the database
    }
}

Rule #5: Cache should be aware of transaction rollbacks

Page 23: Ece2013 Java Advanced Memorymanagement

Attribute Cache - Transactions

private final Map<String, Optional<String>> attributesCache;

private void init() {
    getTransaction().register(this);
}

public String getAttribute( final String key ) { ... }

public void setAttribute( final String name, String value ) {}

public void rollback() {
    // called when the transaction rolls back
    attributesCache.clear();
}
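The effect of clearing the cache on rollback can be shown in a small self-contained sketch. Everything here is an assumption made for illustration: `AttributeStore` is a hypothetical class, the "database" is a plain `HashMap`, and a transaction is simulated by restoring a snapshot when the work throws.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical attribute holder with a write-through cache in front of a "database".
class AttributeStore {
    private final Map<String, String> db = new HashMap<>();    // stands in for the database
    private final Map<String, String> cache = new HashMap<>(); // attribute cache

    String getAttribute(String name) {
        // Falls back to the database when the cache has no entry.
        return cache.computeIfAbsent(name, db::get);
    }

    void setAttribute(String name, String value) {
        if (!Objects.equals(value, cache.get(name))) {
            cache.put(name, value);
            db.put(name, value);
        }
    }

    // Runs the work as one simulated transaction. On failure the database
    // snapshot is restored and the cache is cleared, so the cache can never
    // hold values that were rolled back. The exception is rethrown.
    void transaction(Runnable work) {
        Map<String, String> snapshot = new HashMap<>(db);
        try {
            work.run();
        } catch (RuntimeException e) {
            db.clear();
            db.putAll(snapshot);
            cache.clear();
            throw e;
        }
    }

    String dbValue(String name) {
        return db.get(name);
    }
}
```

Without the `cache.clear()` call, a later `setAttribute("length", "0")` would be skipped because the cache would still report "0", which is the failure described for the JUnit test.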

Page 24: Ece2013 Java Advanced Memorymanagement

Attribute Cache - Transactions

db.transaction(new DbRunnable() {
    protected void run() throws IOException {
        try {
            store.put(file, inputStream);
            file.setAttribute("loaded", "true");
        } catch (IOException ioe) {
            file.setAttribute("loaded", "false");
            throw ioe;
        }
    }
});

• IOException → transaction rollback
• loaded = "false" is NOT in the cache and NOT in the DB
• The cache changes behaviour once it takes part in transaction rollbacks.
• Rule #6: Don't exit with an exception if the data should not be rolled back

Page 25: Ece2013 Java Advanced Memorymanagement

Attribute Cache - Isolation

[Diagram: Component A (Transaction 1) and Component B (Transaction 2) both access the same GFile. Open question: x = ?]

Page 26: Ece2013 Java Advanced Memorymanagement

AttributeCache - Isolation - Solution 1

public void setAttribute( final String name, String value ) {
    ...
    getTransaction().getContext().set( ... + name, value );
    ...
}

public String getAttribute( String name ) {
    ...
    String value = getTransaction().getContext().get( ... + name );
    ...
}

• store cached values globally, in the transaction context
• drawback: all attribute values are stored in the transaction

Page 27: Ece2013 Java Advanced Memorymanagement

AttributeCache - Isolation - Solution 2 (better)

private HashMap<String, String> cache = new HashMap<String, String>();

public void setAttribute( final String name, String value ) {
    ...
    cache.put( getTransactionId() + name, value );
    ...
}

public String getAttribute( String name ) {
    ...
    String value = cache.get( getTransactionId() + name );
    ...
}

store cached values locally, keyed by transaction ID
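A minimal sketch of a cache keyed by transaction ID, under assumptions of my own: `TxAttributeCache`, the numeric transaction ID and the `end` method are hypothetical. It uses a nested map instead of concatenating ID and name, because a concatenated key without a separator can be ambiguous (ID 1 with name "2x" and ID 12 with name "x" give the same string).

```java
import java.util.HashMap;
import java.util.Map;

// Attribute cache whose entries are only visible inside the transaction that wrote them.
class TxAttributeCache {
    // transaction id -> (attribute name -> value)
    private final Map<Long, Map<String, String>> byTransaction = new HashMap<>();

    synchronized void put(long txId, String name, String value) {
        byTransaction.computeIfAbsent(txId, id -> new HashMap<>()).put(name, value);
    }

    // Returns the value cached by this transaction, or null if it has none.
    synchronized String get(long txId, String name) {
        Map<String, String> values = byTransaction.get(txId);
        return values == null ? null : values.get(name);
    }

    // Call on commit or rollback so finished transactions do not leak entries.
    synchronized void end(long txId) {
        byTransaction.remove(txId);
    }
}
```

Dropping all entries of a transaction in one step also gives a natural hook for the rollback handling discussed earlier.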

Page 28: Ece2013 Java Advanced Memorymanagement

ChunkCache - data with dirty state

Page 29: Ece2013 Java Advanced Memorymanagement

ChunkCache - data with dirty state

• VFS writes and reads data in random "segments"
  › e.g. "write offset 110.000 length 200"
• Stored on disk in 1 MB "Chunks"

[Diagram: the Virtual FileSystem sends many small writes ("write offset=110.000 length=2000", "write...", ...) directly to the ChunkFileStore, which holds the data in chunks of 1 MB]

Page 30: Ece2013 Java Advanced Memorymanagement

ChunkCache - data with dirty state

• Cache writes lazily
• Cache maintained in dirty state
• LoadingCache used
• timeout, maxsize

[Diagram: writes from the Virtual FileSystem ("write offset=110.000 length=200", "write...") go to the ChunkCache, which sits in front of the ChunkFileStore and its chunks of 1 MB]

Page 31: Ece2013 Java Advanced Memorymanagement

ChunkCache - data with dirty state

private final LoadingCache<Integer, Chunk> cache = CacheBuilder.newBuilder()
    .maximumSize( 4 )
    .expireAfterAccess( 2, TimeUnit.SECONDS )
    .removalListener( new RemovalListener<Integer, Chunk>() {
        public void onRemoval( final RemovalNotification<Integer, Chunk> notification ) {
            if ( notification.getValue().isDirty() ) {
                // write chunk
            }
        }
    } )
    .build( new CacheLoader<Integer, Chunk>() {
        public Chunk load( final Integer key ) throws Exception {
            // load Chunk for key
        }
    } );

• write to disk on remove
• cache entries only expire on access !!
• no automatic housekeeping
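The write-on-removal idea, plus an explicit flush to compensate for the missing automatic housekeeping, can be sketched without Guava. This is a simplified stand-in, not the talk's implementation: `Chunk` and `WriteBackChunkCache` are hypothetical, the "disk" is a list of written chunk indexes, and a `LinkedHashMap` in access order provides the LRU eviction.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A chunk of file data plus a flag telling whether it still has to be written.
class Chunk {
    final int index;
    boolean dirty;

    Chunk(int index) {
        this.index = index;
    }
}

// LRU cache of chunks that writes dirty chunks back when they are evicted.
class WriteBackChunkCache {
    final List<Integer> writtenChunks = new ArrayList<>(); // stands in for the disk
    private final LinkedHashMap<Integer, Chunk> lru;

    WriteBackChunkCache(final int maxSize) {
        // accessOrder = true keeps the least recently used entry first
        this.lru = new LinkedHashMap<Integer, Chunk>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Chunk> eldest) {
                if (size() > maxSize) {
                    writeIfDirty(eldest.getValue()); // write back before eviction
                    return true;
                }
                return false;
            }
        };
    }

    synchronized Chunk get(int index) {
        Chunk chunk = lru.get(index);
        if (chunk == null) {
            chunk = new Chunk(index); // a real cache would load it from disk here
            lru.put(index, chunk);
        }
        return chunk;
    }

    synchronized void write(int index) {
        get(index).dirty = true; // lazy: only mark, do not touch the disk yet
    }

    // Explicit housekeeping: write all dirty chunks, e.g. from a timer or on close.
    synchronized void flush() {
        for (Chunk chunk : lru.values()) {
            writeIfDirty(chunk);
        }
    }

    private void writeIfDirty(Chunk chunk) {
        if (chunk.dirty) {
            writtenChunks.add(chunk.index);
            chunk.dirty = false;
        }
    }
}
```

With a Guava cache, calling `cleanUp()` periodically from a background thread is one way to trigger pending maintenance when the cache is otherwise idle.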

Page 33: Ece2013 Java Advanced Memorymanagement

API - query - list

query for resources satisfying a condition

ArrayList<GFile> query("type='file' and lastmodified>'20130901'");

• can produce a lot of instances
• returns only upon complete resultset
• returns all result instances
• memory footprint unpredictable (maybe large)

Page 34: Ece2013 Java Advanced Memorymanagement

API - query - lazy list

List<GFile> query("type='file' and lastmodified>'20130901'");

• List can be a lazy, custom implementation
  › that only stores primary keys
  › and builds instances on demand
• returns only upon complete resultset
• smaller memory footprint

Page 35: Ece2013 Java Advanced Memorymanagement

API - query - iterator

• Iterator can be a lazy, custom implementation
  › that builds instances on demand
• uses a DB cursor and reads on demand
• smaller memory footprint
• result can only be consumed once

Iterator<GFile> query("type='file' and lastmodified>'20130901'");
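A lazy iterator of this kind might look like the following sketch. The names are hypothetical: `LazyResultIterator` takes an iterator of primary keys (standing in for a database cursor) and a loader function that builds one instance per key, so no result object exists before `next()` asks for it.

```java
import java.util.Iterator;
import java.util.function.LongFunction;

// Iterator that holds only primary keys and builds each result on demand.
class LazyResultIterator<T> implements Iterator<T> {
    private final Iterator<Long> keys;      // stands in for a DB cursor
    private final LongFunction<T> loader;   // builds one instance for a key

    LazyResultIterator(Iterator<Long> keys, LongFunction<T> loader) {
        this.keys = keys;
        this.loader = loader;
    }

    @Override
    public boolean hasNext() {
        return keys.hasNext();
    }

    @Override
    public T next() {
        // The instance is created only now; keys.next() throws
        // NoSuchElementException when the result is exhausted.
        return loader.apply(keys.next());
    }
}
```

The caller holds at most the instances it has not yet released, so the memory footprint depends on how the result is consumed rather than on the size of the result set.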

Page 36: Ece2013 Java Advanced Memorymanagement

API - track objects "idea"

• track "framework" objects that the application holds on to
• for statistics
• for error logging
  › e.g. "more than 10.000 file resources: WARN"
• Implementation
  › store all returned objects in a WeakHashMap
  › GC removes objects when there is no hard reference
  › WeakHashMap.size() can give you an ESTIMATE
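The tracking idea can be sketched in a few lines. `ObjectTracker`, `track` and the warning threshold are assumptions made for this example; only `WeakHashMap` and its `size()` come from the slide.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Tracks objects handed out by the framework without keeping them alive.
class ObjectTracker {
    // Keys are held weakly, so tracked objects can still be garbage collected.
    private final Map<Object, Boolean> tracked =
            Collections.synchronizedMap(new WeakHashMap<>());
    private final int warnThreshold;

    ObjectTracker(int warnThreshold) {
        this.warnThreshold = warnThreshold;
    }

    // Registers the object and returns it, so the call can wrap a return value.
    <T> T track(T object) {
        tracked.put(object, Boolean.TRUE);
        return object;
    }

    // Only an estimate: entries disappear some time after the GC clears them.
    int estimate() {
        return tracked.size();
    }

    boolean shouldWarn() {
        return estimate() > warnThreshold;
    }
}
```

Note that `WeakHashMap` compares keys with `equals()`, so two distinct but equal objects are counted once; for classes that override `equals()` the number is an even rougher estimate.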

Page 37: Ece2013 Java Advanced Memorymanagement

API - famous "count" implementation

WASTE:
public long notYetSyncFiles() {
    return localStore.query("type='file' and changed='true'").size();
}

OK:
public long notYetSyncFiles() {
    return localStore.queryCount("type='file' and changed='true'");
}

looks trivial, but it's a real-world example :-)

Page 38: Ece2013 Java Advanced Memorymanagement

API - mapping to String

• fields that have a limited number of values
  › true / false
  › bitsets (1, 8, 128, 32)
  › filetypes 'file' / 'directory'
  › etc.
• if you read those a lot from the DB or remote services*
  › you quickly have 10.000 or more instances of 'true'

• intern() is expensive
• PermGen stores interned strings
  › small space
  › since Java 7 in the heap

if (value.equals("true") || value.equals("false") || ... ) {
    value = value.intern();
}
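As an alternative to `String.intern()`, the same deduplication can be done with a plain lookup table. This swaps in a different technique from the one on the slide: `ValueCanonicalizer` is a hypothetical helper that maps each known value to one shared instance and leaves every other string untouched.

```java
import java.util.HashMap;
import java.util.Map;

// Replaces frequently repeated values by one shared instance each.
class ValueCanonicalizer {
    private final Map<String, String> canonical = new HashMap<>();

    ValueCanonicalizer(String... knownValues) {
        for (String v : knownValues) {
            canonical.put(v, v);
        }
    }

    // Returns the shared instance for known values, the argument itself otherwise.
    String canonicalize(String value) {
        String shared = canonical.get(value);
        return shared != null ? shared : value;
    }
}
```

A caller would create one instance with the small set of expected values, for example "true", "false", "file" and "directory", and pass every value read from the DB through `canonicalize`.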

Page 39: Ece2013 Java Advanced Memorymanagement

Cache / API - Summary

• Cache must not change the behaviour
• Use existing Cache implementations
• Caches must be aware of database behaviour (ACID)
• API must scale

Page 40: Ece2013 Java Advanced Memorymanagement

Copyright © 2009 compeople AG, Made available under the Eclipse Public License v 1.0