Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and...
Transcript of Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and...
![Page 1: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/1.jpg)
PVS
Empirical Study of Usage and Performance of Java Collections
1Diego Costa, 1Artur Andrzejak, 1Janos Seboek, 2David Lo
1Heidelberg University, 2Singapore Management University
1
![Page 2: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/2.jpg)
Published as:
Paper/slides available at: https://pvs.ifi.uni-heidelberg.de/publications/
Empirical Study of Usage and Performance of Java Collections
1Diego Costa, 1Artur Andrzejak, 1Janos Seboek, 2David Lo
2
1Heidelberg University, 2Singapore Management University
Diego Costa, Artur Andrzejak, Janos Seboek, and David Lo. 2017. Empirical Study of Usage and Performance of Java Collections. In Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering (ICPE '17), L'Aquila, Italy — April 22 - 26, 2017
![Page 3: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/3.jpg)
Collections
• Collections are objects that groups multiple elements into a single unit.
• Use its metadata to track, access and manipulate its elements
2
![Page 4: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/4.jpg)
Collections
• Collections are objects that groups multiple elements into a single unit.
• Use its metadata to track, access and manipulate its elements
Object[]
Integer
ArrayList<Integer>
2
![Page 5: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/5.jpg)
Collections
• Collections are objects that groups multiple elements into a single unit.
• Use its metadata to track, access and manipulate its elements
CollectionOverhead
ElementsFootprint
Object[]
Integer
ArrayList<Integer>
2
![Page 6: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/6.jpg)
Motivation
• Numerous studies have identified the inefficient use of collections as the main cause of runtime bloat
Execution Time
+17% Improv.
Configuration of one HashMap instance
[Liu et al. 2009]
Memory Usage
+54% Improv.
Use of ArrayMapsinstead of HashMaps
[Ohad et al. 2009]
Energy Consumption
+300% Improv.
Use of ArrayList instead of LinkedList
[Jung et al. 2016]
3
![Page 7: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/7.jpg)
Collection Frameworks
• The Java Collection Framework offers a Standardimplementation of the major collection abstractions
• Stable and reliable framework
• Easy to use
• However, there exist alternative libraries that provide a myriad of different implementations:
• Primitive Collections (IntArrayList)
• Immutable Collections
• Multimaps (Map<K, Collection>)
• Multisets (Map<K, int>)
4
![Page 8: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/8.jpg)
Collection Frameworks
• The Java Collection Framework offers a Standardimplementation of the major collection abstractions
• Stable and reliable framework
• Easy to use
• However, there exist alternative libraries that provide a myriad of different implementations:
• Primitive Collections (IntArrayList)
• Immutable Collections
• Multimaps (Map<K, Collection>)
• Multisets (Map<K, int>)
4
Unsupported features
![Page 9: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/9.jpg)
Collection Frameworks
• The Java Collection Framework offers a Standardimplementation of the major collection abstractions
• Stable and reliable framework
• Easy to use
• However, there exist alternative libraries that provide a myriad of different implementations:
• Primitive Collections (IntArrayList)
• Immutable Collections
• Multimaps (Map<K, Collection>)
• Multisets (Map<K, int>)
4
Unsupported features
Simplified API
![Page 10: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/10.jpg)
Analysis of Performance Impact of Alternative Collections
1. Study on Collections Usage• How often do programmers use alternative implementations?
2. Experimental Evaluation of popular Java Collection Libraries• Are there better alternatives to the most commonly used Collections
regarding performance?
5
Goal: Can we find alternatives to the Standard collection
types which improve performance on time/memory?
![Page 11: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/11.jpg)
Study on Usage of Collections
• Dataset• We analyze the GitHub Java Corpus
• 10K projects
• 268 MLOC
• Static Analysis• Use of Java Parser to extract variable declaration and allocation sites
of Types with suffix:
{List, Map, Set, Queue, Vector}
• Manually removed false-positives
6
![Page 12: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/12.jpg)
Developers Rarely Use non-Standard Collections
ArrayList48%
HashSet10%
HashMap22%
Lists57%
Sets15%
Maps28%
7
![Page 13: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/13.jpg)
ArrayList48%
LinkedList5%
HashSet10%
LinkedHashSet1%
HashMap22%
LinkedHashMap2%
8
Lists57%
Sets15%
Maps28%
Developers Rarely Use non-Standard Collections
![Page 14: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/14.jpg)
ArrayList48%
LinkedList5%Other Lists
4%
HashSet10%
TreeSet1%
LinkedHashSet…
Other Sets3%
HashMap22%
LinkedHashMap2%
TreeMap1%
Other Maps3%
• Top 4 represent 86% of all declared instantiations
• Non-Standard collections are declared <4%
9
Lists57%
Sets15%
Maps28%
Developers Rarely Use non-Standard Collections
Evaluate alternatives to
ArrayList, HashMap,
HashSet and LinkedList
![Page 15: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/15.jpg)
Commonly Used Element Data Types
From the categorized data types:
• Strings are the most commonly held data type, followed by Numeric
Lists Sets Map Keys Map Values
Top50
String
Numeric
Collections
Other
0%
20%
40%
60%
80%
100%
Lists Sets MapKeys
MapValues
10
![Page 16: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/16.jpg)
Commonly Used Element Data Types
From the categorized data types:
• Strings are the most commonly held data type, followed by Numeric
Lists Sets Map Keys Map Values
Top50
String
Numeric
Collections
Other
0%
20%
40%
60%
80%
100%
Lists Sets MapKeys
MapValues
10
Evaluate collections holding
Strings, Integer and Long
![Page 17: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/17.jpg)
Initial Capacity is Rarely Specified
ArrayList HashMap HashSet
Top50
Specified
Default
Copy
0%
20%
40%
60%
80%
100%
List Map Set
11
![Page 18: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/18.jpg)
Initial Capacity is Rarely Specified
ArrayList HashMap HashSet
Top50
Specified
Default
Copy
0%
20%
40%
60%
80%
100%
List Map Set
11
Evaluate collections with
default Initial Capacity
![Page 19: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/19.jpg)
Superior Alternatives
• Superior Alternative: a Non-Standard implementation that can outperform a Standard counterpart in terms of execution time and/or memory consumption.
• Can we find a superior alternative to the most commonly used collection types?
12
![Page 20: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/20.jpg)
Experimental Study on Java Collections
• We selected 6 alternative libraries:• Repository Popularity (GitHub)
• Appearance in previous partial benchmarks
13
Libraries Version JCF
Compatible
Available at
Trove 3.0.3 yes trove.starlight-systems.com
Guava 18.0 yes
github
/google/guava
GSCollections 6.2.0 yes /goldmansachs/gs-collections
HPPC 0.7.1 no /carrotsearch/hppc
Fastutil 7.0.10 yes /vigna/fastutil
Koloboke 0.6.8 yes /leventov/Koloboke
![Page 21: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/21.jpg)
Experimental Study on Java Collections
• Seven typical scenarios evaluated• populate, iterate, contains, get, add, remove, copy
• Collections holding from 100 to 1 million elements
• Alternatives to the most commonly used collections• JDK 1.8.0_65
• ArrayList, HashMap, HashSet and LinkedList• Object collection alternatives
• Primitive collection alternatives
14
![Page 22: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/22.jpg)
CollectionsBench Suite
• We create a benchmark suite: CollectionsBench• Open Java Microbenchmark Harness
@Setuppublic void setup() {fullList = this.createNewList();fullList.addAll(values); // Randomly
} generated
@Benchmarkpublic void iterate() {for (T element : fullList) {blackhole.consume(element);// Blackholes avoid dead-code // optimization}
} 15
![Page 23: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/23.jpg)
CollectionsBench Suite
• We create a benchmark suite: CollectionsBench
@Setuppublic void setup() {fullList = this.createNewList();fullList.addAll(values); // Randomly
} generated
return new ArrayList<T>();
return new LinkedList<T>();
Only the instantiation is needed for each collection type*
return new FastList<T>();
Here we measure:
• Execution time (ns)
• Collection Overhead (allocation)• GC Profiler
@Benchmarkpublic void iterate() {for (T element : fullList) {blackhole.consume(element);// Blackholes avoid dead-code // optimization}
} 16
![Page 24: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/24.jpg)
Experimental Planning
• To accomplish a steady state performance evaluation:
@Benchmarkpublic void iterate()
5s
Iteration
17
![Page 25: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/25.jpg)
Experimental Planning
• To accomplish a steady state performance evaluation:
@Benchmarkpublic void iterate()
5s
Iteration
17
IterationWarmup
Measured Iteration
10𝑥
30𝑥
Fork 1
![Page 26: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/26.jpg)
Experimental Planning
• To accomplish a steady state performance evaluation:
@Benchmarkpublic void iterate()
5s
Iteration
17
IterationWarmup
Measured
Iteration
Iteration Iteration
10𝑥
30𝑥
10𝑥
30𝑥
Fork 1 Fork 2
![Page 27: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/27.jpg)
Experimental Planning
• To accomplish a steady state performance evaluation:
@Benchmarkpublic void iterate()
5s
Iteration
17
IterationWarmup
Measured
Iteration
Iteration Iteration
AVG Time ± CI (99%)AVG Memory allocated ± CI (99%)
10𝑥
30𝑥
10𝑥
30𝑥
Fork 1 Fork 2
![Page 28: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/28.jpg)
Reporting Speedup/Slowdown
• We present the results of alternatives normalized to the Standard implementation performances.
• Means with overlapping CI are set to zero
• We use the following speedup/slowdown definitions:
𝑆 =
𝑇𝑠𝑡𝑑𝑇𝑎𝑙𝑡
, 𝑖𝑓 𝑇𝑠𝑡𝑑 > 𝑇𝑎𝑙𝑡
−𝑇𝑎𝑙𝑡𝑇𝑠𝑡𝑑
, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
18
![Page 29: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/29.jpg)
Reporting Speedup/Slowdown
• We present the results of alternatives normalized to the Standard implementation performances.
• Means with overlapping CI are set to zero
• We use the following speedup/slowdown definitions:
𝑆 =
𝑇𝑠𝑡𝑑𝑇𝑎𝑙𝑡
, 𝑖𝑓 𝑇𝑠𝑡𝑑 > 𝑇𝑎𝑙𝑡
−𝑇𝑎𝑙𝑡𝑇𝑠𝑡𝑑
, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
1 1 1
2 7
1 1 1
2 7
𝑆 = {0, 1.5, 1.9, 2, 7.8}
𝑆 = {0,−1.5,−1.9, −2,−7.8}
Heat map
Overlapping
Confidence Intervals 19
![Page 30: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/30.jpg)
Reporting Memory Overhead
• For the memory comparison we present the collection overhead reduction per element (with compressed object pointers)
20
![Page 31: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/31.jpg)
Reporting Memory Overhead
• For the memory comparison we present the collection overhead reduction per element (with compressed object pointers)
For instance
Collection X: 100 bytes per element
Collection Y: 10 bytes per element
• Evaluated on the copy scenario
20
![Page 32: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/32.jpg)
Reporting Memory Overhead
• For the memory comparison we present the collection overhead reduction per element (with compressed object pointers)
For instance
Collection X: 100 bytes per element
Collection Y: 10 bytes per element
• Evaluated on the copy scenario
90%
Reduction of Collection Overhead
20
![Page 33: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/33.jpg)
Superior Alternatives: LinkedList
• LinkedList was outperformed by everyArrayList alternative
contains add get remove copyJCF
2 2 2 2
31 0 0 1 0
* * * * *1
4 6 32
11 12 16 11 12
FU2 2 2 2
31 1 1 1 1
* * * * *1
4 6 32
3 3 4 3 4
GS2 2 2 2
31 1 1 1 1
* * * * *1
4 6 32
10 13 16 12 12
HP1 2 2 2 2 2 1 1 1 1
* * * * *1
3 6 32 2 2 1 -1 1
100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
84%
JCF LinkedList$Entryconsumes 24 bytes
21
StandardFastutil
GSCollec.HPPC
Reduction of Collection Overhead
![Page 34: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/34.jpg)
Superior Alternatives: LinkedList
• LinkedList was outperformed by everyArrayList alternative
contains add get remove copyJCF
2 2 2 2
31 0 0 1 0
* * * * *1
4 6 32
11 12 16 11 12
FU2 2 2 2
31 1 1 1 1
* * * * *1
4 6 32
3 3 4 3 4
GS2 2 2 2
31 1 1 1 1
* * * * *1
4 6 32
10 13 16 12 12
HP1 2 2 2 2 2 1 1 1 1
* * * * *1
3 6 32 2 2 1 -1 1
100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
Asymptotic disadvantage
84%
JCF LinkedList$Entryconsumes 24 bytes
21
StandardFastutil
GSCollec.HPPC
Reduction of Collection Overhead
![Page 35: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/35.jpg)
Superior Alternatives: LinkedList
• LinkedList was outperformed by everyArrayList alternative
contains add get remove copyJCF
2 2 2 2
31 0 0 1 0
* * * * *1
4 6 32
11 12 16 11 12
FU2 2 2 2
31 1 1 1 1
* * * * *1
4 6 32
3 3 4 3 4
GS2 2 2 2
31 1 1 1 1
* * * * *1
4 6 32
10 13 16 12 12
HP1 2 2 2 2 2 1 1 1 1
* * * * *1
3 6 32 2 2 1 -1 1
100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
Asymptotic disadvantage Asymptotic advantage
public boolean remove(Object o)
84%
JCF LinkedList$Entryconsumes 24 bytes
21
StandardFastutil
GSCollec.HPPC
Reduction of Collection Overhead
![Page 36: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/36.jpg)
Superior Alternatives: ArrayList
• GSCollections provides a superior alternative• Faster when populating the list (no time penalty)
populate iterate copy
FU1 1 1 1 1 0 0 0 0 0
-4 -4 -4 -3 -3
GS1 1 1 0 1 0 0 0 0 0 0 0 0 0 0
HP1 2 2 1 2 -1 -1 -1 -1 -1
-6 -7 -14 -14 -9
100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
No memory difference
22
FastutilGSCollec.
HPPC
![Page 37: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/37.jpg)
Superior Alternatives: ArrayList
• GSCollections provides a superior alternative• Faster when populating the list (no time penalty)
populate iterate copy
FU1 1 1 1 1 0 0 0 0 0
-4 -4 -4 -3 -3
GS1 1 1 0 1 0 0 0 0 0 0 0 0 0 0
HP1 2 2 1 2 -1 -1 -1 -1 -1
-6 -7 -14 -14 -9
100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
Distinct Array copy calls
Std: Arrays.copyOf();
Alt: System.arraycopy();
No memory difference
22
FastutilGSCollec.
HPPC
![Page 38: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/38.jpg)
Superior Alternatives: ArrayList
• GSCollections provides a superior alternative• Faster when populating the list (no time penalty)
populate iterate copy
FU1 1 1 1 1 0 0 0 0 0
-4 -4 -4 -3 -3
GS1 1 1 0 1 0 0 0 0 0 0 0 0 0 0
HP1 2 2 1 2 -1 -1 -1 -1 -1
-6 -7 -14 -14 -9
100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
Distinct Array copy calls
Std: Arrays.copyOf();
Alt: System.arraycopy();
HPPC adds each element instead of copying the array
No memory difference
22
FastutilGSCollec.
HPPC
![Page 39: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/39.jpg)
Superior Alternatives: HashSet
• Koloboke provides a superior alternative• Fastutil is a solid 2nd option
populate iterate contains add remove copyFU
0 -1 -1 0 -2 1 2 2 1
30 0 0 1 1 0 0 -2 0 0 0 -1 -1 1 1 2 1 0 2 1
GS-1 -1 0 0 -2 -1 1 0 0 0 -1 -1 0 0 0 0 0 0 0 0 0 0 0 2 1 1 0 1 2 1
Gu0 0 0 0 -1 -2 0 -1 -1 -1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 -1 -2 -2 0
HP-2 -2 -1 0 -2 0 2 1 0
30 0 0 1 1 0 0 -2 0 0 0 -1 -2 1 0 1 -1 -1 1 0
Ko0 0 0 1 0 0 2 2 1
30 0 0 1 1 0 0 -2 0 0 0 -1 -1 1 1
13 16 29 29 51
Tr-2 -2 -2 0 -2 -2 -2 0 0 2 0 0 0 0 1 -2 -2 -2 -2 -2 -2 -2 -2 -1 -2 0 -2 -2 1 -2
100 1K 10K 100K 1M 100 1K 10K100
K1M 100 1K 10K
100K
1M 100 1K 10K100
K1M 100 1K 10K
100K
1M 100 1K 10K100
K1M
75%
Std HashSet$Nodeobject consumes 32 bytes
23
FastutilGSCollec.
GuavaHPPC
KolobokeTrove
Reduction of Collection Overhead
![Page 40: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/40.jpg)
Superior Alternatives: HashSet
• Koloboke provides a superior alternative• Fastutil is a solid 2nd option
populate iterate contains add remove copyFU
0 -1 -1 0 -2 1 2 2 1
30 0 0 1 1 0 0 -2 0 0 0 -1 -1 1 1 2 1 0 2 1
GS-1 -1 0 0 -2 -1 1 0 0 0 -1 -1 0 0 0 0 0 0 0 0 0 0 0 2 1 1 0 1 2 1
Gu0 0 0 0 -1 -2 0 -1 -1 -1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 -1 -2 -2 0
HP-2 -2 -1 0 -2 0 2 1 0
30 0 0 1 1 0 0 -2 0 0 0 -1 -2 1 0 1 -1 -1 1 0
Ko0 0 0 1 0 0 2 2 1
30 0 0 1 1 0 0 -2 0 0 0 -1 -1 1 1
13 16 29 29 51
Tr-2 -2 -2 0 -2 -2 -2 0 0 2 0 0 0 0 1 -2 -2 -2 -2 -2 -2 -2 -2 -1 -2 0 -2 -2 1 -2
100 1K 10K 100K 1M 100 1K 10K100
K1M 100 1K 10K
100K
1M 100 1K 10K100
K1M 100 1K 10K
100K
1M 100 1K 10K100
K1M
75%
Koloboke performs a memory copy of its HashTable
Std HashSet$Nodeobject consumes 32 bytes
Impact of memory efficiency on time
23
FastutilGSCollec.
GuavaHPPC
KolobokeTrove
Reduction of Collection Overhead
![Page 41: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/41.jpg)
Superior Alternatives: HashMap
• Standard HashMap is a solidimplementation
• No superior alternatives on time
populate iterate contains remove copyFU
0 -1 -1 0 1 -1 -2 -1 0 1 -1 0 0 0 1 -1 -1 -1 -1 0 -1 -1 -1 -2 -1
GS-1 -1 -1 -1 0 -2
-3-2 -1 0 -1 0 -1 0 0 0 0 0 1 1 0 -1 -1 -1 -1
Gu -3-2 -2 -2 -2
-7 -9 -7 -8 -70 0 -1 0 0 -2 -2 -2
-3-2 -9 -10 -9 -13 -9
HP-2 -2 -1 -1 0 -1 -1 0 0 1 -1 0 0 0 0 -1 -1 -1 -1 0 -1 -2 -2 -2 -2
Ko-2
-6 -4 -4 -2 -3 -3-2 -1 1 -1 -2
-3 -3-1 -2
-11-13-15 -4 4 6 9 11 11
Tr -3-2 -2 -1 -1
-7 -9 -5 -4 -31 1 1 1 2 -2 -2 -2 -1 -1 -4 -4 -6 -5 -7
100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
24
FastutilGSCollec.
GuavaHPPC
KolobokeTrove
![Page 42: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/42.jpg)
Superior Alternatives: HashMap
• Standard HashMap is a solidimplementation
• No superior alternatives on time
populate iterate contains remove copyFU
0 -1 -1 0 1 -1 -2 -1 0 1 -1 0 0 0 1 -1 -1 -1 -1 0 -1 -1 -1 -2 -1
GS-1 -1 -1 -1 0 -2
-3-2 -1 0 -1 0 -1 0 0 0 0 0 1 1 0 -1 -1 -1 -1
Gu -3-2 -2 -2 -2
-7 -9 -7 -8 -70 0 -1 0 0 -2 -2 -2
-3-2 -9 -10 -9 -13 -9
HP-2 -2 -1 -1 0 -1 -1 0 0 1 -1 0 0 0 0 -1 -1 -1 -1 0 -1 -2 -2 -2 -2
Ko-2
-6 -4 -4 -2 -3 -3-2 -1 1 -1 -2
-3 -3-1 -2
-11-13-15 -4 4 6 9 11 11
Tr -3-2 -2 -1 -1
-7 -9 -5 -4 -31 1 1 1 2 -2 -2 -2 -1 -1 -4 -4 -6 -5 -7
100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
55%
Std. HashMap$Node object consumes 32 bytes
• Fastutil provides a superior alternative on memory consumption
24
FastutilGSCollec.
GuavaHPPC
KolobokeTrove
Reduction of Collection Overhead
![Page 43: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/43.jpg)
Object vs Primitive Collections
• Reducing collection footprint: overhead + element footprint
int[]Object[]
Object collectionArrayList<>
Primitive collectionIntArrayList
25
![Page 44: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/44.jpg)
Object vs Primitive Collections
• Reducing collection footprint: overhead + element footprint
int[]Object[]
Object collectionArrayList<>
Primitive collectionIntArrayList
SmallerCollection Overhead
25
![Page 45: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/45.jpg)
Object vs Primitive Collections
• Reducing collection footprint: overhead + element footprint
int[]Object[]
Object collectionArrayList<>
Primitive collectionIntArrayList
Integer
(16 bytes)
int
(4 bytes)
SmallerCollection Overhead
25
![Page 46: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/46.jpg)
Object vs Primitive Collections
• Reducing collection footprint: overhead + element footprint
int[]Object[]
Object collectionArrayList<>
Primitive collectionIntArrayList
Integer
(16 bytes)
int
(4 bytes)
SmallerCollection Overhead
SmallerElement Footprint
25
![Page 47: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/47.jpg)
Object vs Primitive Collections
• Reducing collection footprint: overhead + element footprint
int[]Object[]
Object collectionArrayList<>
Primitive collectionIntArrayList
Integer
(16 bytes)
int
(4 bytes)
SmallerCollection Overhead
SmallerElement Footprint
25
![Page 48: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/48.jpg)
Object vs Primitive Collections
• Reducing collection footprint: overhead + element footprint
int[]Object[]
Object collectionArrayList<>
Primitive collectionIntArrayList
Integer
(16 bytes)
int
(4 bytes)
SmallerCollection Overhead
SmallerElement Footprint
25
80%
Collection Footprint Reduction
![Page 49: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/49.jpg)
Superior Alternatives: Primitive Collections
• We found superior alternatives to all three abstraction types • Data Type: Integer to int
• Performance varies considerably from distinct libraries for multiple reasons
For instance, ArrayList primitive implementations
Libs populate iterate contains copy
FU 3 3 32 2 -2 -2 -2 -1 -2 1 2
3 3 30 0 0 0 0
GS 3 3 32 2 1 1 1 1 1 2 2
3 3 40 0 0 0 0
HP2 1 2 2 2 1 1 1 1 2 1 2
3 3 3 -5 -7 -8 -9 -7
Tr 3 3 32 2 1 1 1 1 1 0 0 0 1 1
-3 -8 -9 -7 -4100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
26
FastutilGSCollec.
HPPCTrove
![Page 50: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/50.jpg)
Superior Alternatives: Primitive Collections
• We found superior alternatives to all three abstraction types • Data Type: Integer to int
• Performance varies considerably from distinct libraries for multiple reasons
For instance, ArrayList primitive implementations
Libs populate iterate contains copy
FU 3 3 32 2 -2 -2 -2 -1 -2 1 2
3 3 30 0 0 0 0
GS 3 3 32 2 1 1 1 1 1 2 2
3 3 40 0 0 0 0
HP2 1 2 2 2 1 1 1 1 2 1 2
3 3 3 -5 -7 -8 -9 -7
Tr 3 3 32 2 1 1 1 1 1 0 0 0 1 1
-3 -8 -9 -7 -4100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
for(Object..)
5x slower than for(IntProcedure) commonly implemented
26
FastutilGSCollec.
HPPCTrove
![Page 51: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/51.jpg)
Superior Alternatives: Primitive Collections
• We found superior alternatives to all three abstraction types • Data Type: Integer to int
• Performance varies considerably from distinct libraries for multiple reasons
For instance, ArrayList primitive implementations
Libs populate iterate contains copy
FU 3 3 32 2 -2 -2 -2 -1 -2 1 2
3 3 30 0 0 0 0
GS 3 3 32 2 1 1 1 1 1 2 2
3 3 40 0 0 0 0
HP2 1 2 2 2 1 1 1 1 2 1 2
3 3 3 -5 -7 -8 -9 -7
Tr 3 3 32 2 1 1 1 1 1 0 0 0 1 1
-3 -8 -9 -7 -4100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M 100 1K 10K 100K 1M
for (int = offset; i-- > 0; )
30% more branch misses
for(Object..)
5x slower than for(IntProcedure) commonly implemented
26
FastutilGSCollec.
HPPCTrove
![Page 52: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/52.jpg)
Superior Alternative: Primitive Collections
• GSCollections, Koloboke and Fastutil provide solid superior alternatives
ArrayList HashMap HashSet
80% 76% 84%
Footprint Reduction
Integer → int
Time impact
(1M elements)12x
+2x populate
+4x contains
+2x remove12x 12x
+2x populate
-2x iterate
+6x/+4x copy12x 12x
+4x populate
+4x/+3x iterate
+33x/+7x copy12x27
![Page 53: Empirical Study of Usage and Performance of Java Collections · PVS Empirical Study of Usage and Performance of Java Collections 1Diego Costa , 1Artur Andrzejak 1Janos Seboek 2David](https://reader035.fdocuments.in/reader035/viewer/2022071214/6042e0c8221462326e63c390/html5/thumbnails/53.jpg)
Summary
• There are performance opportunities on Alternative Collection Frameworks
• Time/memory improvement with moderate refactor effort
• We provide a Guideline to assist developers on:• Identifying superior alternative implementations
• Which scenarios an alternative could lead to a substantial performance improvement
• CollectionsBench is open-source and available at GitLab
28