1
Empirical Study of Object-layout Strategies and Optimization Techniques
Natalie Eckel
Supervisor: Dr. Joseph (Yossi) Gil
Computer Science Department
Technion - The Israel Institute of Technology
M.Sc. seminar (in the proceedings of ECOOP’2000)
2
Outline
• Overhead incurred due to multiple inheritance:
  – VPTRs and VBPTRs
  – The separate compilation dilemma
  – Hierarchies used in our experiments
  – Distribution of object size
• Optimization techniques:
  – Elimination of transitive virtual inheritance
  – Inlining virtual bases
  – Bidirectional layout
  – Hermaphrodite bidirectional layout
  – Packing VBPTRs
3
The Subobject Rule
• Basic rule of OO: if class B inherits from class A, then
  – Every object of B must have inside it a subobject of A.
• Example (B. Meyer): if SoftwareEngineer is an Engineer, then
  – There is a part in every software engineer which is an engineer.
• Rationale: procedures and methods expecting objects of A should also be able to operate on an object of type B.
[Figure: a Software Engineer object containing an Engineer subobject]
4
The VPTR
• VPTR: virtual table pointer.
  – A pointer leading from every object and every subobject to a table of virtual functions (and other RTTI).
• Single inheritance: the VPTR can be shared between an object, its subobject, its sub-subobject, its sub-sub-subobject, etc.
  – The VPTR is laid out at offset 0.
• Multiple inheritance: the VPTR can be shared with only one subobject.
[Figure: a VPTR leading from a Software Engineer object, with its Engineer subobject, to the virtual functions table (VTBL)]
5
The VBPTR
• VBPTR: virtual base pointer
  – Answers the question: where is the subobject?
  – Occurs only in the multiple inheritance case.
• Rationale: the diamond problem
  – It is impossible for class Person to have a fixed offset with respect to both Teacher and Student.
• Solution: VBPTRs leading to the shared Person subobject.
[Figure: the diamond hierarchy of Person, Teacher, Student, and TeachingAssistant, and the layout of a TA object with Teacher, Student, TA, and Person subobjects, showing VPTRs and VBPTRs]
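The diamond problem can be illustrated with a minimal sketch. It assumes a simplified model in which every subobject occupies one slot; the two layouts and the helper names (`offset_between`, `person_offsets`) are hypothetical and do not come from any real compiler.

```python
# Simplified model: a layout is a list of subobject names, one slot each.
# Both layouts are illustrative assumptions, not real compiler output.
TEACHER_LAYOUT = ["Teacher", "Person"]
TA_LAYOUT = ["Teacher", "Student", "TeachingAssistant", "Person"]


def offset_between(layout, source, target):
    """Distance in slots from subobject `source` to subobject `target`."""
    return layout.index(target) - layout.index(source)


def person_offsets(layout):
    """Offset of the shared Person subobject from each base present in `layout`."""
    return {
        base: offset_between(layout, base, "Person")
        for base in ("Teacher", "Student")
        if base in layout
    }


if __name__ == "__main__":
    print(person_offsets(TEACHER_LAYOUT))
    print(person_offsets(TA_LAYOUT))
```

In this model the distance from Teacher to Person differs between a plain Teacher object and a TA object, so code compiled for Teacher cannot hard-code it; a VBPTR stored in the object supplies the location at run time.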
6
Experimental Setting
Hierarchy              Language   Weight in experiments   Number of classes   Number of inheritance links
Unidraw                C++         7.2%                     613                  476
Self                   Self       21.1%                    1801                 1838
Laure                  Laure       3.5%                     295                  315
JDK 1.1                Java       19.3%                    1654                 1927
Eiffel 4               Eiffel     23.4%                    1999                 2678
Ed                     LOV         5.1%                     434                  750
LOV                    LOV         5.1%                     436                  774
Geode                  LOV        15.4%                    1318                 2785
Total:                            100%                     8550                11543
Used in benchmarking:                                      6898                 9616
7
No Dynamic Measurements
• Objective: estimate the saving for all possible object sizes
  – The chicken and egg problem: people may not use multiple inheritance (MI) because of its current overhead.
• Dynamic measurement adds other factors:
  – Selection of inputs
  – How to deal with libraries?
  – Correlated instantiations
  – Cache
  – …
8
The Topology of Hierarchies
Hierarchy   Depth   Average number of parents   Percentage of virtual bases   Average number of virtual bases
Unidraw      8      1.02                        0.3%                          0.02
Self        16      1.05                        0.2%                          0.73
Laure       11      1.07                        3.7%                          2.86
JDK 1.1      8      1.23                        1.1%                          0.52
Eiffel 4    14      1.34                        3.2%                          2.49
Ed           8      1.73                        5.3%                          3.79
LOV          9      1.78                        5.5%                          3.99
Geode       11      2.11                        7.6%                          8.37
Total:      16      1.39                        2.9%                          2.62
9
Overheads of Multiple Inheritance
• Space overhead:
  – VPTR: if a class X inherits from n “roots”, then its objects will have at least n VPTRs in their layout.
  – VBPTR: one to every “shared” base, and usually more than one.
• Time overhead:
  – VPTR: add/subtract an offset, i.e., “this adjustment”, in down- and up-casts (not dealt with here).
  – VBPTR: follow pointers in up-casts.
• Inessential VBPTRs (used by some compilers): add a transitive edge to shortcut every chain of VBPTRs.
  – Minimizes time overhead.
  – Induces space overhead.
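The lower bound on VPTRs can be sketched as a count of root ancestors. This is a minimal sketch, assuming a hierarchy given as a mapping from each class to its direct parents; the class names and the helpers `roots_of` and `min_vptrs` are hypothetical.

```python
# Hypothetical hierarchy: class name -> list of direct parents.
PARENTS = {
    "Window": [],
    "Serializable": [],
    "Dialog": ["Window", "Serializable"],
    "FileDialog": ["Dialog"],
}


def roots_of(cls, parents):
    """Return the set of root ancestors of `cls` (classes with no parents)."""
    direct = parents.get(cls, [])
    if not direct:
        return {cls}
    found = set()
    for parent in direct:
        found |= roots_of(parent, parents)
    return found


def min_vptrs(cls, parents):
    """Lower bound from the slide: at least one VPTR per root."""
    return len(roots_of(cls, parents))


if __name__ == "__main__":
    print(min_vptrs("FileDialog", PARENTS))
```

The count is only the lower bound stated above; an actual layout may need more VPTRs, and VBPTRs are not modeled here.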
10
Compilation Models
• Given an inheritance link (a,b), is it
  – Simple inheritance (no diamonds)?
  – Virtual inheritance (a diamond might show up later)?
• Whole program analysis
  – The whole picture is available for compilation.
  – The compiler assigns virtual inheritance to solve diamond problems.
• Separate compilation
  – The compiler must make the decision without seeing the whole picture.
  – Solution: all inheritance links are treated as virtual.
• C++ compilation model
  – The user takes the responsibility to assign virtual inheritance.
  – We consider C++ compilers with whole program information.
[Figure: an inheritance link between classes a and b]
11
Distribution of Object Size
Definition: object size is the total number of compiler-generated fields in the layout of objects of a certain class.
12
Cost of Using Separate Compilation Over C++ Compilation Model
13
Elimination of Transitive Virtual Inheritance
• A preliminary step to more sophisticated techniques
• Can be done in any compilation model
[Figure: classes V, A, and B before and after elimination; the virtual edge marked “this edge is transitive!” is removed]
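The elimination step can be sketched on a small inheritance graph. This is a minimal sketch, assuming a hierarchy stored as a mapping from each class to its direct parents with a virtual flag; the example edges (A inherits virtually from V, B inherits from A and virtually from V) are an assumed reading of the figure, and the function names are hypothetical.

```python
# Assumed reading of the figure: class -> {parent: is_virtual}.
EXAMPLE = {
    "V": {},
    "A": {"V": True},
    "B": {"A": False, "V": True},
}


def ancestors(cls, parents):
    """All proper ancestors of `cls`."""
    seen = set()
    stack = list(parents.get(cls, {}))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, {}))
    return seen


def eliminate_transitive_virtual_edges(parents):
    """Return a copy of `parents` without virtual edges implied by a longer path."""
    result = {}
    for cls, direct in parents.items():
        kept = {}
        for parent, is_virtual in direct.items():
            # The edge is transitive if another direct parent already reaches `parent`.
            reachable_otherwise = any(
                parent in ancestors(other, parents)
                for other in direct
                if other != parent
            )
            if not (is_virtual and reachable_otherwise):
                kept[parent] = is_virtual
        result[cls] = kept
    return result


if __name__ == "__main__":
    print(eliminate_transitive_virtual_edges(EXAMPLE))
```

Only virtual edges are candidates for removal: V is shared, so B still reaches the same single V subobject through A once the direct edge is gone.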
14
The Efficacy
Definition: the efficacy of an optimization technique for a certain class is the relative reduction in object size for that class due to application of the technique.
Definition: accumulative efficacy = (x,y) means that x% of classes experience at least y% reduction in their object size.
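The two definitions can be written out as a small calculation. This is a minimal sketch, assuming object sizes are given as (size before, size after) pairs counted in compiler-generated fields; the sample sizes and function names are made up for illustration.

```python
def efficacy(size_before, size_after):
    """Relative reduction in object size, as a percentage."""
    if size_before == 0:
        return 0.0
    return 100.0 * (size_before - size_after) / size_before


def accumulative_efficacy_holds(sizes, x, y):
    """True if at least x% of the classes see at least y% reduction.

    `sizes` is a list of (size_before, size_after) pairs, one per class.
    """
    hits = sum(1 for before, after in sizes if efficacy(before, after) >= y)
    return 100.0 * hits / len(sizes) >= x


# Made-up sample: four classes with their object sizes before and after.
SAMPLE = [(4, 2), (10, 9), (1, 1), (8, 4)]

if __name__ == "__main__":
    print([efficacy(b, a) for b, a in SAMPLE])
    print(accumulative_efficacy_holds(SAMPLE, 50, 50))
```

With this sample, half of the classes shrink by 50%, so an accumulative efficacy of (50%,50%) holds while (60%,50%) does not.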
15
Efficacy of Elimination of Transitive Virtual Inheritance
• Eliminates 4.1% of inheritance links
• Reduces the fraction of virtual inheritance links from 35.2% to 28.6%
• Accumulative efficacy = (8%,8%)
16
Inlining of Virtual Bases
• Inlining: lay out a virtual base inside a child, thus eliminating at least one VBPTR.
  – Has the potential of saving a VPTR.
• A virtual base can be inlined into several children, as long as the shared inheritance semantics is respected.
• Not without whole program analysis! Must examine descendants!
  – Can we inline X into Y?
  – No! But we have to see Z to understand why:
    • Due to the repeated inheritance semantics of C++, class Z has two Y objects in it.
    • If Y has X inlined into it, then there would be two copies of X in Z, which contradicts the C++ semantics.
[Figure: hierarchy of classes X, Y, W, and Z, with a virtual (v) inheritance edge]
17
Inlining Techniques
• Devirtualization of single virtual inheritance
  – V is inlined into E
• Simple inlining
  – Devirtualization + inline into one child
  – V is inlined into E and either A, B, C, or D
• Aggressive inlining
  – Find a maximal independent set of children to inline into
    • Classes are independent if they don’t share a descendant
  – V is inlined into E, either A or B, and either C or D
[Figure: hierarchy of classes V, A, B, C, D, E, F, and G]
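The selection of inlining targets can be sketched as a greedy search for pairwise independent children. This is a minimal sketch under stated assumptions: the hierarchy below (F derives from A and B, G derives from C and D) is a guess that matches the outcome described on the slide, the greedy pass yields a maximal set rather than a provably largest one, and all function names are hypothetical.

```python
# Assumed hierarchy: class -> list of direct subclasses.
CHILDREN = {
    "V": ["A", "B", "C", "D", "E"],
    "A": ["F"],
    "B": ["F"],
    "C": ["G"],
    "D": ["G"],
}


def descendants(cls, children):
    """All proper descendants of `cls`."""
    seen = set()
    stack = list(children.get(cls, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, []))
    return seen


def independent(a, b, children):
    """Two classes are independent if they do not share a descendant.

    Assumption of this sketch: a class counts as its own descendant, so a
    class and one of its subclasses are never independent.
    """
    da = descendants(a, children) | {a}
    db = descendants(b, children) | {b}
    return not (da & db)


def greedy_inline_targets(candidates, children):
    """Greedily pick a maximal set of pairwise independent classes."""
    chosen = []
    for cls in candidates:
        if all(independent(cls, other, children) for other in chosen):
            chosen.append(cls)
    return chosen


if __name__ == "__main__":
    print(greedy_inline_targets(CHILDREN["V"], CHILDREN))
```

On the assumed hierarchy the pass keeps one class from each of the pairs A/B and C/D, plus E, which agrees with the slide's description of aggressive inlining.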
18
Efficacy of Inlining Techniques
[Figure: Simple Inlining vs. Aggressive Inlining]
Technique:                              Devirtualization   Simple      Aggressive
Inlined fraction of inheritance links   17%                17%+2.4%    17%+6.3%
Average efficacy (for big objects)      10-20%             25-30%      60-70%
Accumulative efficacy                   (20%,25%)          (30%,30%)   (35%,50%)
19
Bidirectional Object Layout
• Idea: use both ascending and descending memory addresses for object layout
• One VPTR can be saved in a marriage of a “positive” and a “negative” class
  – C has mixed directionality
[Figure: class C inherits from A- and B+. Standard layout: A B C. Bidirectional layout: A- B+ C.]
20
Bidirectional Layout of Virtual Functions Table
• The virtual functions table must also have a directionality.
  – Positive classes: entries 0, 1, 2, …
  – Negative classes: entries -1, -2, …
[Figure: the table for A- B+ C. A’s virtual table occupies entries -1, -2, -3; B’s virtual table and the functions introduced in C occupy entries 0 to 4.]
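Signed table indices can be modeled with a single array and an origin. This is a minimal sketch, not a description of any compiler's table format; the class `BidirectionalTable`, the function names in the example, and the split of three entries for B and two for C are assumptions.

```python
class BidirectionalTable:
    """Table addressed by signed indices: negative entries grow downward."""

    def __init__(self, negative_entries, positive_entries):
        # negative_entries[0] sits at index -1, negative_entries[1] at -2, ...
        self._storage = list(reversed(negative_entries)) + list(positive_entries)
        self._origin = len(negative_entries)  # storage position of index 0

    def entry(self, index):
        """Return the entry at signed `index`, or raise IndexError."""
        position = self._origin + index
        if not 0 <= position < len(self._storage):
            raise IndexError(index)
        return self._storage[position]


# Hypothetical functions: A is negative, B is positive, C adds two more.
TABLE_C = BidirectionalTable(
    ["A.f1", "A.f2", "A.f3"],
    ["B.g1", "B.g2", "B.g3", "C.h1", "C.h2"],
)

if __name__ == "__main__":
    print(TABLE_C.entry(-1), TABLE_C.entry(0), TABLE_C.entry(4))
```

Code that only knows A keeps using its negative indices and code that only knows B keeps using its non-negative ones, which is what lets the two tables hang off one shared pointer.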
21
The Theorem of Marriage
• The BIG question: how to assign directionality to classes to maximize savings?
• Whole program analysis: various algorithms and heuristics are possible.
• Separate compilation: assign directionality at random! (Actually, use a good hash function.)
• The theorem of marriage: with random assignments, a class that has n roots will enjoy an expected saving of at least $\lfloor n/2 \rfloor / 2 \approx n/4$. In other words, about half of all root classes will eventually find a mate.
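The bound can be checked numerically. This is a minimal sketch, assuming that each of the n roots is independently positive or negative with probability 1/2 and that the number of marriages equals the smaller of the two groups; the bound is read here as floor(n/2)/2, roughly n/4, and both function names are hypothetical.

```python
from math import comb


def expected_marriages(n):
    """Exact E[min(P, n - P)], where P ~ Binomial(n, 1/2) counts positive roots."""
    total = sum(comb(n, k) * min(k, n - k) for k in range(n + 1))
    return total / 2 ** n


def pairing_lower_bound(n):
    """floor(n/2)/2: pair the roots up; each pair is mixed with probability 1/2."""
    return (n // 2) / 2


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, expected_marriages(n), pairing_lower_bound(n))
```

Under these assumptions the exact expectation never falls below the pairing bound, and for larger n it sits well above n/4.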
22
Marriages of Non-Virtual and Virtual Bases
• Once classes A and B are married in C, they remain married in all of C’s descendants.
• However, a marriage of virtual bases cannot be permanent.
  – V1 and V2 are married in A
  – V2 and V3 are married in B
  – What happens in C?
• Each class marries its virtual bases independently of what its ancestors did.
• Theorem: if there are n virtual base classes, then the number of marriages is $n/2 - O(\sqrt{n})$.
  – That is the expectation for the separate compilation model.
[Figure: virtual bases V1+, V2-, and V3+ with classes A, B, and C]
23
Bidirectional Layout Efficacy
C++ compilation model with inessential VBPTRs:
• Applied after Aggressive Inlining
• Big objects have 20% of their size occupied by VPTRs
• 5% savings for big objects: a quarter of the VPTRs, as predicted
• Accumulative efficacy: (30%,30%)
Separate compilation without inessential VBPTRs:
• The number of VPTRs and VBPTRs is about the same
• 15-20% savings for big objects: almost half of the VPTRs, as predicted
• Accumulative efficacy: (60%,18%)
24
Hermaphrodite Bidirectional Object Layout
• Bidirectional layout drawback: two base classes with the same directionality will never be married
• Hermaphroditing: a directed (hermaphrodite) class has two types of instances: “positive” and “negative”
  – Two hermaphrodite classes can always be married
25
Efficacy of Hermaphrodite Bidirectional Layout
C++ compilation model with inessential VBPTRs:
• Accumulative efficacy: (33%,33%)
• Applied after Aggressive Inlining
Separate compilation without inessential VBPTRs:
• Accumulative efficacy: (50%,25%)
• Makes savings for all classes of size 2 and more!
26
Packing VBPTRs
• Observation: objects are laid out consecutively in memory.
• Motivation: in large objects, VBPTRs occupy 80-90% of their size.
• Idea: instead of using full-blown pointers to virtual base sub-objects, use offsets.
  – Assumption: machine word = 4 bytes
  – Small objects (under size 1K): an offset to a sub-object can be stored in one byte, giving 4 offsets in a word
  – Larger objects (under size 0.25MB): an offset can be stored in 2 bytes, giving 2 offsets in a word
• A class can reuse empty “slots” in non-virtual bases.
• A class cannot reuse empty slots in virtual base sub-objects.
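The packing arithmetic can be sketched as follows. This is a minimal sketch using the slide's 4-byte word and size thresholds; the fallback to a full word for objects of 0.25MB or more, the omission of slot reuse across bases, and the function names are assumptions.

```python
WORD_BYTES = 4  # the slide's assumption: machine word = 4 bytes


def offset_bytes(object_size_bytes):
    """Bytes needed per packed offset, following the slide's size thresholds."""
    if object_size_bytes < 1024:          # small objects: under 1K
        return 1
    if object_size_bytes < 256 * 1024:    # larger objects: under 0.25MB
        return 2
    return WORD_BYTES                     # assumed fallback: one full word


def words_for_vbptrs(num_vbptrs, object_size_bytes):
    """Words occupied by `num_vbptrs` packed offsets (no slot reuse modeled)."""
    slots_per_word = WORD_BYTES // offset_bytes(object_size_bytes)
    return -(-num_vbptrs // slots_per_word)  # ceiling division


if __name__ == "__main__":
    print(words_for_vbptrs(8, 200), words_for_vbptrs(8, 4096))
```

For an object with eight VBPTRs, the packed offsets take two words when the object is under 1K and four words when it is under 0.25MB, compared with eight words for full pointers.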
27
Efficacy of Packing in C++ Compilation Model
[Figure: two panels, “2 slots in word” and “4 slots in word”]
• Expected savings:
  – 4 slots in word: saves 60-70% in object size
  – 2 slots in word: saves 40-45% in object size
28
Summary
• Evils of virtual inheritance and different compilation models.
• Distribution of object size votes against separate compilation.
• Optimization techniques:
  – Inlining (not so trivial).
    • Aggressive inlining.
  – Bidirectional layout.
    • Architectural support.
    • Hermaphroditing idea
      – Secure savings for all sizes of objects
      – Possible run-time costs for checking the instance directionality
  – Packing VBPTRs
• The bottom line: savings in the range of 40% can be achieved for all object sizes!
29
Future Research
• Dynamic measurements
• More optimization techniques
• Efficient implementation of Java interfaces