Immutable Collections
cr.openjdk.java.net/~psandoz/conferences/2017-JavaOne/j1-2017... · JavaOne...
JavaOne 2017 Immutable Collections CON6079 2
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Agenda
• A recap of unmodifiable collections in the JDK
• A brief overview of immutable collections in external Java libraries and JVM-based platforms
• Immutable collections leveraging persistent data structures
When referring to immutable collections, there are no claims made as to the immutability of the collection’s elements
Advantages of immutability
• Don’t need to think about concurrency and data races
• Resistant to misbehaving libraries
• Are constants that may be optimized at runtime
• Implementations can optimize for time and space, in both representation and transformation
Immutable collections wish list
• Manifests immutability (of the collections, not their elements)
• Sealed (not publicly extensible)
• Provide a bridge to mutable collections (not extension of)
• Efficient construction, updates, and copying
Unmodifiable in the JDK
• The JDK has the notion of unmodifiable collections
• Unmodifiable is a runtime property of a collection
• Modifying (add, put, remove, …) methods throw UnsupportedOperationException
• No way to directly query whether a collection is unmodifiable
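A minimal sketch of what "runtime property" means in practice: the static type is plain List, so the compiler accepts add, and the failure only appears when the call executes. The class and method names are illustrative assumptions, not from the talk.

```java
import java.util.ArrayList;
import java.util.List;

class UnmodifiableAtRuntime {
    // Returns true if calling add on the list throws UnsupportedOperationException.
    // Probing by attempting a mutation is for illustration only: on a
    // modifiable list the probe succeeds and really changes the list.
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> fixed = List.of("a", "b");      // Java 9+
        List<String> growable = new ArrayList<>();
        System.out.println(rejectsAdd(fixed));       // true
        System.out.println(rejectsAdd(growable));    // false
    }
}
```

Both arguments have the same static type, which is why the property cannot be seen in the type system.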
Two forms of unmodifiable
• Unmodifiable view or wrapper to a source or backing collection
    List<T> uvl = Collections.unmodifiableList(sourceList);
• Directly unmodifiable
    List<T> dul = List.of(1, 2, 3, …);
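The difference between the two forms can be sketched as follows: a view still observes changes made through its backing list, while a directly unmodifiable list has no backing list to change. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TwoFormsDemo {
    // Size of an unmodifiable view after its backing list grows.
    static int viewSizeAfterSourceGrows() {
        List<Integer> source = new ArrayList<>(List.of(1, 2, 3));
        List<Integer> view = Collections.unmodifiableList(source);
        source.add(4);          // mutate the backing list, not the view
        return view.size();     // the view observes the change
    }

    // Size of a directly unmodifiable list; nothing else can reach its storage.
    static int directSize() {
        List<Integer> direct = List.of(1, 2, 3);
        return direct.size();
    }

    public static void main(String[] args) {
        System.out.println(viewSizeAfterSourceGrows()); // 4
        System.out.println(directSize());               // 3
    }
}
```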
Immutability with unmodifiable collections
• When wrapping, ensure the source collection is never accessible*
    List<T> uvl = Collections.unmodifiableList(new ArrayList<>(source));
    source = null;

    List<T> dul = stream.collect(
        collectingAndThen(toList(), Collections::unmodifiableList));
• List.of and friends are as if the source is never accessible
* Except, of course, to the unmodifiable wrapper
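A runnable sketch of the two idioms above, assuming hypothetical helper names snapshot and collectUnmodifiable. In both cases the only reference to the mutable list is held by the wrapper, so the result behaves as immutable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConfinedSourceDemo {
    // Wraps a private copy, so later changes to 'source' cannot leak through.
    static <T> List<T> snapshot(List<T> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    // Collects into a list that only the unmodifiable wrapper can reach.
    static <T> List<T> collectUnmodifiable(Stream<T> stream) {
        return stream.collect(Collectors.collectingAndThen(
                Collectors.toList(), Collections::unmodifiableList));
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(List.of("a", "b"));
        List<String> snap = snapshot(source);
        source.add("c");                                          // not visible via snap
        System.out.println(snap);                                 // [a, b]
        System.out.println(collectUnmodifiable(Stream.of(1, 2))); // [1, 2]
    }
}
```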
JDK collections as immutable collections
✗ Manifests immutability
✗ Sealed
• Provide a bridge to mutable collections
✗ Efficient construction, updates and copying
Unmodifiable is a reasonable abstraction for mutable collections but not for immutable collections
Guava’s immutable collections
• Defines sealed types such as ImmutableList, ImmutableMap, …
• These implement the corresponding JDK mutable collection type (ImmutableList implements List)
• Copying is smart
    ImmutableList.copyOf(otherCollection)
Guava’s collections are a good compromise
✔ Manifests immutability
✔ Sealed
✘ Provide a bridge to mutable collections
✘ Efficient construction (✔✘), updates (✘), and copying (✔✘)
Eclipse collections: something for everyone
✔ Manifests immutability
✘ Sealed
✔ Provide a bridge to mutable collections
✘ Efficient construction, updates, and copying
Vavr (Java), Clojure, Scala
✔ Manifests immutability
✔ Sealed*
✔ Provide a bridge to mutable collections*
✔ Efficient construction, updates, and copying
* Not completely verified but believed to be mostly true
Vavr (Java), Clojure, Scala
✔ Efficient updates (addition, removal, replace, merge)
• The immutable collection implementations leverage persistent data structures for maps, sets and vectors (non-linked lists)
Persistent data structures
• A persistent data structure preserves the previous version of itself when modified
• Hash Array Mapped Tries (HAMTs) are the basis of efficient persistent (immutable) maps, sets, and vectors
• Provide structural sharing between a new and previous version of a collection
• Effectively constant time for many operations
• Cache friendly
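To make "preserves the previous version" and "structural sharing" concrete, here is a minimal sketch that deliberately uses a singly linked stack rather than a HAMT, since it is the simplest structure with both properties. PStack is a hypothetical name, not a type from the talk.

```java
class PStack<T> {
    final T head;          // null only for the empty stack
    final PStack<T> tail;  // shared with older versions, never copied
    final int size;

    private PStack(T head, PStack<T> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    static <T> PStack<T> empty() {
        return new PStack<>(null, null, 0);
    }

    // Returns a new version; 'this' is left untouched and becomes the tail.
    PStack<T> push(T value) {
        return new PStack<>(value, this, size + 1);
    }

    // Returns the previous version without copying anything.
    PStack<T> pop() {
        if (size == 0) throw new IllegalStateException("empty stack");
        return tail;
    }

    public static void main(String[] args) {
        PStack<Integer> v1 = PStack.<Integer>empty().push(1).push(2);
        PStack<Integer> v2 = v1.push(3);
        System.out.println(v1.size);        // 2: the old version is preserved
        System.out.println(v2.size);        // 3
        System.out.println(v2.tail == v1);  // true: v2 shares all of v1
    }
}
```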
17
JavaOne 2017 Immutable Collections CON6079
Trie
In computer science, a trie, also called digital tree and sometimes radix tree or prefix tree (as they can be searched by prefixes), is a kind of search tree — an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings
A trie for keys "A", "to", "tea", "ted", "ten", "i", "in", and "inn".
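A small sketch of a character trie holding the keys from the caption above. It is an ordinary mutable trie, shown only to illustrate how keys sharing a prefix share a path; SimpleTrie and its methods are assumed names.

```java
import java.util.Map;
import java.util.TreeMap;

class SimpleTrie {
    private final Map<Character, SimpleTrie> children = new TreeMap<>();
    private boolean terminal;   // true if some key ends exactly at this node

    static SimpleTrie of(String... keys) {
        SimpleTrie root = new SimpleTrie();
        for (String key : keys) root.insert(key);
        return root;
    }

    void insert(String key) {
        SimpleTrie node = this;
        for (char c : key.toCharArray()) {
            node = node.children.computeIfAbsent(c, x -> new SimpleTrie());
        }
        node.terminal = true;
    }

    boolean contains(String key) {
        SimpleTrie node = find(key);
        return node != null && node.terminal;
    }

    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    // Walks one character per level; null if the path does not exist.
    private SimpleTrie find(String s) {
        SimpleTrie node = this;
        for (char c : s.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return null;
        }
        return node;
    }

    public static void main(String[] args) {
        SimpleTrie t = SimpleTrie.of("A", "to", "tea", "ted", "ten", "i", "in", "inn");
        System.out.println(t.contains("tea"));   // true
        System.out.println(t.contains("te"));    // false: a prefix, not a key
        System.out.println(t.hasPrefix("te"));   // true
    }
}
```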
Hash Array Mapped Trie
• Symbol is a 5 bit sequence
• String is fixed in size, 32 bits, consisting of 7 symbols (last symbol is truncated to 2 bits)
• String is the hashCode of an Object (the key)
Hash Array Mapped Trie
Bit   32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01
Value  1  1  0  0  1  0  1  0  1  1  1  1  1  1  1  0  1  0  1  1  1  0  1  0  1  0  1  1  1  1  1  0
Hash code: 0xCAFEBABE
Symbols: s1 s2 s3 s4 s5 s6 s7
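The split of 0xCAFEBABE into seven symbols can be checked with a short sketch. It assumes that depth 0 takes the lowest five bits, which matches the symbolAtDepth method used in the talk's code; how the slide's labels s1 to s7 map onto depths is not stated here, so the output is indexed by depth only.

```java
import java.util.Arrays;

class HashSymbols {
    // The d-th 5-bit symbol of h, counting from the low-order bits.
    static int symbolAtDepth(int h, int d) {
        return (h >>> (d * 5)) & (32 - 1);
    }

    // All seven symbols of a 32-bit hash; the last one has only 2 bits.
    static int[] symbols(int h) {
        int[] s = new int[7];
        for (int d = 0; d < 7; d++) {
            s[d] = symbolAtDepth(h, d);
        }
        return s;
    }

    public static void main(String[] args) {
        // Prints [30, 21, 14, 29, 15, 5, 3]
        System.out.println(Arrays.toString(symbols(0xCAFEBABE)));
    }
}
```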
HAMT properties
• Wide branching factor, 32
• Limited tree depth, 6
• Effectively constant time lookup: $O(\log_{32} N) = O(\log_2 N / \log_2 32) = O(\log_2 N / 5)$
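As a worked instance of this bound, assume one million entries with well-distributed hash codes (the entry count is an illustrative choice, not a figure from the talk):

```latex
\log_{32} 10^{6} = \frac{\log_2 10^{6}}{\log_2 32} \approx \frac{19.93}{5} \approx 3.99
```

So a lookup visits about four levels, and because a 32-bit hash yields at most $\lceil 32/5 \rceil = 7$ symbols, the number of levels is bounded regardless of $N$.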
HAMT properties
• Good structural sharing (for updates, merging and splitting) but also good memory usage and cache coherency
• The basis for vectors, where index is the hash code (see also Relaxed Radix Balanced trees), and multi-maps
• Can be applied to mutable collections, for efficient construction of an immutable collection
• Efficiently implemented in Java
A naive implementation
import java.util.Optional;

public class PMap<K, V> {
    // 32 slots, each a pair: [key, value] or [SUB_LAYER_NODE, PMap]
    Object[] nodes = new Object[32 * 2];

    public Optional<V> get(K k) {
        // hash and SUB_LAYER_NODE are not shown on the slide
        return get(k, hash(k), 0);
    }

    private Optional<V> get(K k, int h, int d) {
        int symbol = symbolAtDepth(h, d);
        Object _k = nodes[symbol * 2];
        if (_k == SUB_LAYER_NODE) {
            PMap<K, V> n = (PMap<K, V>) nodes[symbol * 2 + 1];
            return n.get(k, h, d + 1);
        } else if (k.equals(_k))
            return Optional.of((V) nodes[symbol * 2 + 1]);
        else
            return Optional.empty();
    }

    // The d-th 5-bit symbol of the hash, counting from the low-order bits
    static int symbolAtDepth(int h, int d) {
        return (h >>> (d * 5) & (32 - 1));
    }
}
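The slide shows lookup only. The following is a self-contained sketch of how a path-copying put could complete the naive layout. The put method, the NaivePMap name, the use of hashCode() for hashing, and the choice to reject full 32-bit hash collisions are all assumptions made for illustration, not the talk's implementation. Keys and values are assumed non-null.

```java
import java.util.Optional;

class NaivePMap<K, V> {
    private static final Object SUB_LAYER_NODE = new Object();
    private final Object[] nodes;   // 32 pairs: [key, value] or [SUB_LAYER_NODE, child]

    NaivePMap() { this(new Object[32 * 2]); }
    private NaivePMap(Object[] nodes) { this.nodes = nodes; }

    Optional<V> get(K k) { return get(k, k.hashCode(), 0); }

    @SuppressWarnings("unchecked")
    private Optional<V> get(K k, int h, int d) {
        int i = symbolAtDepth(h, d) * 2;
        Object key = nodes[i];
        if (key == SUB_LAYER_NODE) return ((NaivePMap<K, V>) nodes[i + 1]).get(k, h, d + 1);
        if (k.equals(key)) return Optional.of((V) nodes[i + 1]);
        return Optional.empty();
    }

    // Returns a new map; this map is never modified.
    NaivePMap<K, V> put(K k, V v) { return put(k, v, k.hashCode(), 0); }

    @SuppressWarnings("unchecked")
    private NaivePMap<K, V> put(K k, V v, int h, int d) {
        int i = symbolAtDepth(h, d) * 2;
        Object[] copy = nodes.clone();        // copy this node only; siblings stay shared
        Object key = nodes[i];
        if (key == null || k.equals(key)) {   // empty slot or same key: store here
            copy[i] = k;
            copy[i + 1] = v;
        } else if (key == SUB_LAYER_NODE) {   // descend, copying the path
            copy[i + 1] = ((NaivePMap<K, V>) nodes[i + 1]).put(k, v, h, d + 1);
        } else {                              // slot taken by another key: push both down
            if (d == 6) throw new UnsupportedOperationException("full hash collision not handled");
            copy[i] = SUB_LAYER_NODE;
            copy[i + 1] = new NaivePMap<K, V>()
                    .put((K) key, (V) nodes[i + 1], key.hashCode(), d + 1)
                    .put(k, v, h, d + 1);
        }
        return new NaivePMap<>(copy);
    }

    static int symbolAtDepth(int h, int d) { return (h >>> (d * 5)) & (32 - 1); }

    public static void main(String[] args) {
        NaivePMap<Integer, String> m1 = new NaivePMap<Integer, String>().put(1, "one");
        NaivePMap<Integer, String> m2 = m1.put(33, "thirty-three"); // same symbol at depth 0
        System.out.println(m1.get(33).isPresent());  // false: m1 is unchanged
        System.out.println(m2.get(33).get());        // thirty-three
    }
}
```

Every node costs 64 array slots whether or not they are occupied, which is the waste the bitmap-based layout addresses.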
A compact representation
public class PMap<K, V> {
    // bit map of symbols
    @Stable
    private final int bitmap;

    // [..., k, v, ....] or
    // [..., SUB_LAYER_NODE, PMap, ...] or
    // [..., COLLISION_NODE, CollisionNode, ...] or
    // invariant: a sub-layer will not consist of a single mapping node
    @Stable
    private final Object[] nodes;
A better representation
private Optional<V> get(K k, int h, int dShift) {
    int symbol = symbolAtDepth(h, dShift);
    if (bitmapGet(bitmap, symbol) == 0)
        return Optional.empty();
    int nodeCount = bitmapCountFrom(bitmap, symbol);
    Object _k = nodes[nodeCount * 2];
    if (_k == SUB_LAYER_NODE) {
        PMap<K, V> s = (PMap<K, V>) nodes[nodeCount * 2 + 1];
        return s.get(k, h, dShift + 5);
    } else if (_k.equals(k))
        return Optional.of((V) nodes[nodeCount * 2 + 1]);
    else
        return Optional.empty();
}

private static int bitmapCountFrom(int bitmap, int symbol) {
    return Integer.bitCount(bitmap & ((1 << symbol) - 1));
}
Integer.bitCount compiles to POPCNT on x64
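A sketch of how the bitmap maps a symbol to a position in the compact array: the index is the number of set bits below the symbol's bit. bitmapCountFrom is taken from the slide; bitmapGet is called on the slide but not shown, so its body here is an assumption, and slotOf is a hypothetical helper added for the demonstration.

```java
class BitmapIndexDemo {
    // Assumed definition: non-zero if the bit for 'symbol' is set.
    static int bitmapGet(int bitmap, int symbol) {
        return bitmap & (1 << symbol);
    }

    // Number of present symbols that are smaller than 'symbol'.
    static int bitmapCountFrom(int bitmap, int symbol) {
        return Integer.bitCount(bitmap & ((1 << symbol) - 1));
    }

    // Index of the key slot in the compact [k, v, k, v, ...] array, or -1 if absent.
    static int slotOf(int bitmap, int symbol) {
        if (bitmapGet(bitmap, symbol) == 0) return -1;
        return bitmapCountFrom(bitmap, symbol) * 2;
    }

    public static void main(String[] args) {
        // Three symbols present: 3, 14 and 30, so the array needs only 6 slots, not 64.
        int bitmap = (1 << 3) | (1 << 14) | (1 << 30);
        System.out.println(slotOf(bitmap, 3));   // 0
        System.out.println(slotOf(bitmap, 14));  // 2
        System.out.println(slotOf(bitmap, 30));  // 4
        System.out.println(slotOf(bitmap, 5));   // -1
    }
}
```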
A better representation
• Space is required only for present nodes, using the bitmap (made possible with HotSpot optimizations)
• Further refinements (and tradeoffs) possible
• Sub nodes and entries could be separated for more cache friendly traversal (see Steindorfer’s work on compressed HAMTs, aka CHAMP)
• Hash codes could be cached
Persistent Map API
public void forEach(BiConsumer<K, V> action);
public Optional<V> get(K k);
public PMap<K, V> put(K k, V v);
public PMap<K, V> remove(K k);
Persistent collections API
• Modifying methods return a new collection
• An implementation shares unmodified structure with the previous collection
• Require mutable builders to efficiently construct in a confined manner
• For example, closure/thread confined construction then freezing
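The "confined construction then freezing" idea can be sketched with plain JDK lists. This is only an illustration of the confinement pattern under assumed names (FreezeDemo, build): it freezes by copying, whereas a builder for a persistent collection would aim to freeze its internal structure in place without a copy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

class FreezeDemo {
    // The mutable builder list is handed only to 'filler' for the duration
    // of the call; callers of build only ever see the frozen result.
    static <T> List<T> build(Consumer<List<T>> filler) {
        List<T> builder = new ArrayList<>();
        filler.accept(builder);
        // Copy before wrapping, in case 'filler' kept a reference to the builder.
        return Collections.unmodifiableList(new ArrayList<>(builder));
    }

    public static void main(String[] args) {
        List<Integer> frozen = FreezeDemo.<Integer>build(b -> {
            b.add(1);
            b.add(2);
        });
        System.out.println(frozen);   // [1, 2]
    }
}
```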
Demo: Visualizing HAMT-based persistent maps
https://github.com/PaulSandoz/per/
Summary
• Unmodifiable is a reasonable abstraction for mutable but not immutable
• For efficient immutable collections we need persistent collections
• Sets, maps and vectors using HAMTs have proven to be effective in many libraries and platforms
What about Java?
• We shall continue to improve on unmodifiable in the JDK
• Selective sedimentation of persistent collections into the Java platform?
• Claim: it is possible to optimize such collections very aggressively with internal APIs, HotSpot, and safely contained unsafe mechanisms
References
• Fast And Space Efficient Trie Searches, Bagwell https://pdfs.semanticscholar.org/93a1/fe7f226cfbc7cb2bceac39308a66c8aef0b0.pdf
• Ideal Hash Trees, Bagwell http://lampwww.epfl.ch/papers/idealhashtrees.pdf
• RRB-Trees: Efficient Immutable Vectors, Bagwell and Rompf https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf
• Optimizing Hash-Array Mapped Tries for Fast Lean Immutable JVM Collections, Steindorfer and Vinju https://michael.steindorfer.name/publications/oopsla15.pdf
• Efficient Immutable Collections - PhD Thesis - Steindorfer https://michael.steindorfer.name/publications/phd-thesis-efficient-immutable-collections.pdf
• Cache-Aware Lock-Free Concurrent Hash Tries, Prokopec, Bagwell, Odersky https://infoscience.epfl.ch/record/166908/files/ctries-techreport.pdf