Post on 19-Jan-2016
Custom STL Allocators
Pete Isensee
Xbox Advanced Technology Group
pkisensee@msn.com
Topics
• Allocators: What are They Good For?
• Writing Your First Allocator
• The Devil in the Details
• Allocator Pitfalls
– State
– Syntax
– Testing
• Case Study
Containers and Allocators
• STL containers allocate memory
– e.g. vector (contiguous), list (nodes)
– string is a container, for this talk
• Allocators provide a standard interface for container memory use
• If you don’t provide an allocator, one is provided for you
Example
• Default Allocator
list<int> b;
// same as:
list< int, allocator<int> > b;
• Custom Allocator
#include "MyAlloc.h"
list< int, MyAlloc<int> > c;
The Good
• Original idea: abstract the notion of near and far memory pointers
• Expanded idea: allow customization of container allocation
• Good for
– Size: Optimizing memory usage (pools, fixed-size allocators)
– Speed: Reducing allocation time (single-threaded, one-time free)
Example Allocators
• No heap locking (single thread)
• Avoiding fragmentation
• Aligned allocations (_aligned_malloc)
• Fixed-size allocations
• Custom free list
• Debugging
• Custom heap
• Specific memory type
The Bad
• No realloc()
• Requires advanced C++ compilers
• C++ Standard hand-waving
• Generally library-specific
– If you change STL libraries you may need to rewrite allocators
• Generally not cross-platform
– If you change compilers you may need to rewrite allocators
The Ugly
• Not quite real objects
– Allocators with state may not work as expected
• Gnarly syntax
– map<int,char> m;
– map<int,char,less<int>, MyAlloc<pair<const int,char> > > m;
Pause to Reflect
• “Premature optimization is the root of all evil” – Donald Knuth
• Allocators are a last resort and low-level optimization
• Especially for games, allocators can be the perfect optimization
• Written correctly, they can be introduced w/o many code changes
Writing Your First Allocator
• Create MyAlloc.h
• #include <memory>
• Copy or derive from the default allocator
• Rename “allocator” to “MyAlloc”
• Resolve any helper functions
• Replace some code with your own
Writing Your First Allocator
• Demo
• Visual C++ Pro 7.0 (13.00.9466)
• Dinkumware STL (V3.10:0009)
• 933MHz PIII w/ 512MB
• Windows XP Pro 2002
• Launch Visual Studio
Two key functions
• Allocate
• Deallocate
• That’s all!
Conventions
template< typename T >
class allocator
{
public: // class members default to private; the typedefs must be visible
typedef size_t size_type;
typedef T* pointer;
typedef const T* const_pointer;
typedef T value_type;
};
Allocate Function
• pointer allocate( size_type n, allocator<void>::const_pointer p = 0 )
– n is the number of items T, NOT bytes
– returns pointer to enough memory to hold n * sizeof(T) bytes
– returns raw bytes; NO construction
– may throw an exception (std::bad_alloc)
– default calls ::operator new
– p is optional hint; avoid
Deallocate function
• void deallocate( pointer p, size_type n )
– p must come from allocate()
– p must be raw bytes; already destroyed
– n must match the n passed to allocate()
– default calls ::operator delete(void*)
– Most implementations allow and ignore NULL p; you should too
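The two functions above can be sketched as a complete, if trivial, allocator. This is a minimal sketch, not the talk's demo code: it assumes a C++11 or later library, where containers fill in the remaining members through std::allocator_traits, and the names CountingAlloc, live_blocks and counting_alloc_demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Hypothetical global counter of outstanding blocks (not part of any STL).
inline int& live_blocks() { static int n = 0; return n; }

// Hypothetical allocator using the C++11 minimal allocator interface.
template <typename T>
struct CountingAlloc {
    typedef T value_type;

    CountingAlloc() {}
    template <typename U>
    CountingAlloc(const CountingAlloc<U>&) {}  // template copy ctor, used when rebinding

    // n counts objects of type T, not bytes. Returns raw, unconstructed memory.
    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        void* p = ::operator new(n * sizeof(T));
        ++live_blocks();
        return static_cast<T*>(p);
    }

    // p must come from allocate(); any objects in it are already destroyed.
    void deallocate(T* p, std::size_t /*n*/) {
        if (p == 0) return;  // allow and ignore NULL
        --live_blocks();
        ::operator delete(p);
    }
};

// No per-instance state, so every instance compares equal.
template <typename T, typename U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

// Returns the number of blocks still outstanding once the containers are gone.
int counting_alloc_demo() {
    {
        std::vector<int, CountingAlloc<int> > v;
        v.push_back(1);
        v.push_back(2);
        std::list<int, CountingAlloc<int> > l;
        l.push_back(3);
        assert(live_blocks() > 0);  // the containers hold memory here
    }
    return live_blocks();
}
```

Calling counting_alloc_demo() from main should report no outstanding blocks, since each container returns everything it took.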
A Custom Allocator
• Demo
• That’s it!
• Not quite: the devil is in the details
– Construction
– Destruction
– Example STL container code
– Rebind
Construction
• Allocate() doesn’t call constructors
• Why? Performance
• Allocators provide construct function
void construct(pointer p, const T& t)
{ new( (void*)p ) T(t); }
• Placement new
– Doesn’t allocate memory
– Calls copy constructor
Destruction
• Deallocate() doesn’t call destructors
• Allocators provide a destroy function
void destroy( pointer p )
{ ((T*)p)->~T(); }
• Direct destructor invocation
– Doesn’t deallocate memory
– Calls destructor
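The split between raw memory and object lifetime can be seen in isolation. A minimal sketch, assuming free functions in place of the allocator members; Tracked, construct_in_place, destroy_in_place and construct_destroy_demo are hypothetical names.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that counts how many instances are alive.
struct Tracked {
    explicit Tracked(int v) : value(v) { ++live(); }
    Tracked(const Tracked& o) : value(o.value) { ++live(); }
    ~Tracked() { --live(); }
    int value;
    static int& live() { static int n = 0; return n; }
};

// Same shape as the allocator's construct(): placement new, copy constructor.
template <typename T>
void construct_in_place(T* p, const T& t) { new (static_cast<void*>(p)) T(t); }

// Same shape as the allocator's destroy(): destructor only, memory untouched.
template <typename T>
void destroy_in_place(T* p) { p->~T(); }

bool construct_destroy_demo() {
    // Raw bytes only: no Tracked constructor has run yet.
    Tracked* p = static_cast<Tracked*>(::operator new(sizeof(Tracked)));
    const int before = Tracked::live();

    construct_in_place(p, Tracked(7));  // the temporary dies, the copy at p lives on
    const int during = Tracked::live();
    const int value = p->value;

    destroy_in_place(p);                // object gone, memory still owned
    const int after = Tracked::live();
    ::operator delete(p);               // now the memory goes back

    assert(during == before + 1);
    return value == 7 && after == before;
}
```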
Example: Vector
template< typename T, typename A >
class vector {
A a; // allocator
pointer pFirst; // first object
pointer pEnd; // 1 beyond end of allocated storage
pointer pLast; // 1 beyond last object
};
Example: Reserve
void vector::reserve( size_type n )
{
    size_type count = size(); // save first: size() depends on pFirst
    pointer p = a.allocate( n, 0 );
    // loop on a.construct() to copy
    // loop on a.destroy() to tear down
    a.deallocate( pFirst, capacity() );
    pFirst = p;
    pLast = p + count;
    pEnd = p + n;
}
Performance is paramount
• Reserve
– Single allocation
– Doesn’t default construct anything
– Deals properly with real objects
  • No memcpy
  • Copy constructs new objects
  • Destroys old objects
– Single deallocation
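The reserve steps can be written out in full against a toy container. This is a sketch under stated assumptions, not any vendor's vector: it goes through std::allocator_traits (C++11), assumes the allocator's pointer type is a plain T*, and skips exception safety. TinyVec and reserve_demo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

template <typename T, typename A = std::allocator<T> >
class TinyVec {
    typedef std::allocator_traits<A> Traits;
public:
    TinyVec() : pFirst(0), pLast(0), pEnd(0) {}
    ~TinyVec() { release(pFirst, pLast, capacity()); }

    std::size_t size() const { return static_cast<std::size_t>(pLast - pFirst); }
    std::size_t capacity() const { return static_cast<std::size_t>(pEnd - pFirst); }
    const T& operator[](std::size_t i) const { return pFirst[i]; }

    void reserve(std::size_t n) {
        if (n <= capacity()) return;
        const std::size_t count = size();
        T* p = Traits::allocate(a, n);            // single allocation, raw memory
        for (std::size_t i = 0; i < count; ++i)   // copy construct, no memcpy
            Traits::construct(a, p + i, pFirst[i]);
        release(pFirst, pLast, capacity());       // destroy old objects, single deallocation
        pFirst = p;
        pLast = p + count;
        pEnd = p + n;
    }

    void push_back(const T& t) {
        if (pLast == pEnd) {
            T copy(t);  // t may refer to an element that is about to move
            reserve(capacity() ? capacity() * 2 : 4);
            Traits::construct(a, pLast, copy);
        } else {
            Traits::construct(a, pLast, t);
        }
        ++pLast;
    }

private:
    TinyVec(const TinyVec&);             // copying left out of the sketch
    TinyVec& operator=(const TinyVec&);

    void release(T* first, T* last, std::size_t cap) {
        for (T* q = first; q != last; ++q) Traits::destroy(a, q);
        if (first) Traits::deallocate(a, first, cap);
    }

    A a;         // allocator
    T* pFirst;   // first object
    T* pLast;    // 1 beyond last object
    T* pEnd;     // 1 beyond end of allocated storage
};

bool reserve_demo() {
    TinyVec<std::string> v;
    v.push_back("a");
    v.push_back("b");
    v.reserve(100);
    assert(v.capacity() == 100);
    return v.size() == 2 && v[0] == "a" && v[1] == "b";
}
```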
Rebind
• Allocators don’t always allocate T
list<Obj> ObjList; // allocates nodes
• How? Rebind
template<typename U> struct rebind
{ typedef allocator<U> other; };
• To allocate an N given type T
Alloc<T> a;
T* t = a.allocate(1); // allocs sizeof(T)
typename Alloc<T>::template rebind<N>::other na; // keywords needed when T is a template parameter
N* n = na.allocate(1); // allocs sizeof(N)
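In C++11 and later the same rebinding is usually spelled through std::allocator_traits, which works whether or not the allocator defines a rebind member. A minimal sketch, assuming a hypothetical Node type standing in for a list node and an allocator whose pointer type is a plain pointer:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical list node; real libraries use their own internal node types.
struct Node {
    int value;
    Node* next;
};

// Given an allocator for some T, obtain one for Node and allocate a single node.
template <typename Alloc>
int rebind_demo(const Alloc& a) {
    typedef typename std::allocator_traits<Alloc>::template rebind_alloc<Node> NodeAlloc;
    typedef std::allocator_traits<NodeAlloc> NodeTraits;
    static_assert(std::is_same<typename NodeTraits::value_type, Node>::value,
                  "the rebound allocator must allocate Node");

    NodeAlloc na(a);                         // template copy constructor at work
    Node* n = NodeTraits::allocate(na, 1);   // room for one Node, not one T
    assert(n != 0);

    Node init = { 42, 0 };
    NodeTraits::construct(na, n, init);
    const int v = n->value;
    NodeTraits::destroy(na, n);
    NodeTraits::deallocate(na, n, 1);
    return v;
}
```

For example, rebind_demo(std::allocator<int>()) allocates a Node even though the allocator passed in was written for int.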
Allocator Pitfalls
• To Derive or Not to Derive
• State
– Copy ctor and template copy ctor
– Allocator comparison
• Syntax issues
• Testing
• Case Study
To Derive or Not To Derive
• Deriving from std::allocator
– Dinkumware derives (see <xdebug>)
– Must provide rebind, allocate, deallocate
– Less code; easier to see differences
• Writing from scratch
– Allocator not designed as base class
– Josuttis and Austern write from scratch
– Better understanding
• Personal preference
Allocators with State
• State = allocator member data
• Default allocator has no data
• C++ Std says (paraphrasing 20.1.5):
– Vendors encouraged to support allocators with state
– Containers may assume that allocators don’t have state
State Recommendations
• Be aware of compatibility issues across STL vendors
• list::splice() or C::swap() will indicate if your vendor supports stateful allocators
– Dinkumware: yes
– STLport: no
• Test carefully
State Implications
• Container size increase
• Must provide allocator:
– Constructor(s)
  • Default may be private if parameters required
– Copy constructor
– Template copy constructor
– Global comparison operators (==, !=)
• No assignment operators required
• Avoid static data; generates one per T
Heap Allocator Example
template< typename T >
class Halloc {
public:
Halloc(); // could be private
explicit Halloc( HANDLE hHeap );
Halloc( const Halloc& ); // copy
template< typename U > // templatized
Halloc( const Halloc<U>& ); // copy
};
Template Copy Constructor
• Can’t see private data
template< typename U >
Halloc( const Halloc<U>& a ) :
m_hHeap( a.m_hHeap ) {} // error
• Solutions
– Provide public data accessor function
– Or allow access to other types U
template <typename U>
friend class Halloc;
Allocator comparison
• Example
template< typename T, typename U >
bool operator==( const Alloc<T>& a,
const Alloc<U>& b )
{ return a.state == b.state; }
• Provide both == and !=
• Should be global functions, not members
• May require accessor functions
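The state requirements can be pulled together in one portable sketch. It is not the talk's Halloc: a hypothetical Heap struct stands in for the Win32 HANDLE, the allocator is named HeapAlloc, and it uses the C++11 minimal interface. The template copy constructor relies on the friend declaration, and the comparison operators use the accessor, so both solutions appear.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>

// Hypothetical stand-in for a real heap handle; counts outstanding blocks.
struct Heap {
    Heap() : blocksOut(0) {}
    void* alloc(std::size_t bytes) {
        void* p = ::operator new(bytes);
        ++blocksOut;
        return p;
    }
    void free(void* p) {
        --blocksOut;
        ::operator delete(p);
    }
    int blocksOut;
};

template <typename T>
class HeapAlloc {
public:
    typedef T value_type;

    explicit HeapAlloc(Heap* heap) : m_heap(heap) {}  // no default ctor: a heap is required
    // The compiler-generated copy constructor copies m_heap.
    template <typename U>
    HeapAlloc(const HeapAlloc<U>& other) : m_heap(other.m_heap) {}  // needs the friend below

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        return static_cast<T*>(m_heap->alloc(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { if (p) m_heap->free(p); }

    Heap* heap() const { return m_heap; }  // public accessor for the comparisons

private:
    template <typename U> friend class HeapAlloc;  // lets HeapAlloc<T> read HeapAlloc<U>::m_heap
    Heap* m_heap;
};

// Global comparisons: equal means memory from one can be freed by the other.
template <typename T, typename U>
bool operator==(const HeapAlloc<T>& a, const HeapAlloc<U>& b) { return a.heap() == b.heap(); }
template <typename T, typename U>
bool operator!=(const HeapAlloc<T>& a, const HeapAlloc<U>& b) { return !(a == b); }

bool heap_alloc_demo() {
    Heap heapA, heapB;
    HeapAlloc<int> a(&heapA);
    HeapAlloc<int> b(&heapB);
    {
        std::list<int, HeapAlloc<int> > x(a);  // the container copies and rebinds a
        x.push_back(1);
        x.push_back(2);
        assert(heapA.blocksOut >= 2);          // nodes came from heapA
        assert(heapB.blocksOut == 0);
    }
    HeapAlloc<double> a2(a);                   // template copy constructor
    return heapA.blocksOut == 0 && a == a2 && a != b;
}
```

The allocator is passed to the list as a named object; writing the temporary inline with &heapA would parse as a function declaration.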
Syntax: Typedefs
• Prefer typedefs
• Offensive
list< int, Alloc< int > > b;
• Better
// .h
typedef Alloc< int > IAlloc;
typedef list< int, IAlloc > IntList;
// .cpp
IntList v;
Syntax: Construction
• Containers accept allocators via ctors
IntList b( IAlloc( x,y,z ) );
• If none specified, you get the default
IntList b; // calls IAlloc()
• Map/multimap requires pairs (the key is const in map’s value_type)
Alloc< pair< const K,T > > a;
map< K, T, less<K>,
Alloc< pair< const K,T > > >
m( less<K>(), a );
Syntax: Other Containers
• Container adaptors accept containers via constructors, not allocators
Alloc<T> a;
deque< T, Alloc<T> > d(a);
stack< T, deque<T,Alloc<T> > > s(d);
• String example
Alloc<T> a;
basic_string< T, char_traits<T>, Alloc<T> > s(a);
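The syntax slides can be checked end to end with a small tagged allocator. A sketch only: TagAlloc and its tag member are hypothetical, the allocator uses the C++11 minimal interface, and whether basic_string accepts such a minimal allocator depends on the library version.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <map>
#include <new>
#include <stack>
#include <string>
#include <utility>

// Hypothetical allocator whose only state is an integer tag.
template <typename T>
struct TagAlloc {
    typedef T value_type;
    TagAlloc() : tag(0) {}
    explicit TagAlloc(int t) : tag(t) {}
    template <typename U>
    TagAlloc(const TagAlloc<U>& o) : tag(o.tag) {}
    T* allocate(std::size_t n) { return static_cast<T*>(::operator new(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
    int tag;
};
template <typename T, typename U>
bool operator==(const TagAlloc<T>& a, const TagAlloc<U>& b) { return a.tag == b.tag; }
template <typename T, typename U>
bool operator!=(const TagAlloc<T>& a, const TagAlloc<U>& b) { return a.tag != b.tag; }

// Typedefs keep the container declarations readable.
typedef TagAlloc<std::pair<const int, char> > PairAlloc;  // map stores pair<const K, T>
typedef std::map<int, char, std::less<int>, PairAlloc> IntCharMap;
typedef std::deque<int, TagAlloc<int> > IntDeque;
typedef std::stack<int, IntDeque> IntStack;
typedef std::basic_string<char, std::char_traits<char>, TagAlloc<char> > TagString;

bool syntax_demo() {
    PairAlloc pa(1);
    IntCharMap m(std::less<int>(), pa);  // comparator first, then allocator
    m[7] = 'x';

    TagAlloc<int> da(2);
    IntDeque d(da);
    IntStack s(d);                       // the adaptor takes a container, not an allocator
    s.push(5);

    TagAlloc<char> ca(3);
    TagString str(ca);
    str += "hi";

    assert(m.size() == 1 && s.top() == 5);
    return m.get_allocator().tag == 1
        && d.get_allocator().tag == 2
        && str.get_allocator().tag == 3
        && str == "hi";
}
```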
Testing
• Test the normal case
• Test with all containers (don’t forget string, hash containers, stack, etc.)
• Test with different objects T, particularly those w/ non-trivial dtors
• Test edge cases like list::splice
• Verify that your version is better!
• Allocator test framework: www.tantalon.com/pete.htm
Case Study
• In-place allocator
– Hand off existing memory block
– Dole out allocations from the block
– Never free
• Example usage
typedef InPlaceAlloc< int > IPA;
void* p = malloc( 1024 );
list< int, IPA > x( IPA( p, 1024 ) );
x.push_back( 1 );
free( p );
• View code
In-Place Allocator
• Problems
– Fails w/ multiple concurrent copies
– No copy constructor
– Didn’t support comparison
– Didn’t handle containers of void*
• Correct implementation
– Reference counted
– Copy constructor implemented
– Comparison operators
– Void specialization
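A rough idea of the corrected design, not the code shown in the talk: this sketch swaps reference counting for a caller-owned Arena record that every allocator copy points at, so concurrent copies share one offset. Arena, its fields and in_place_demo are hypothetical, the interface is the C++11 minimal one, and the void specialization is left out because allocator_traits no longer needs it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <list>
#include <new>

// Hypothetical bookkeeping for one caller-supplied block, shared by all copies.
struct Arena {
    Arena(void* block, std::size_t bytes)
        : base(static_cast<char*>(block)), size(bytes), used(0) {}
    char* base;
    std::size_t size;
    std::size_t used;  // invariant: used <= size
};

template <typename T>
class InPlaceAlloc {
public:
    typedef T value_type;

    explicit InPlaceAlloc(Arena* arena) : m_arena(arena) {}
    template <typename U>
    InPlaceAlloc(const InPlaceAlloc<U>& o) : m_arena(o.arena()) {}

    // Dole out the next suitably aligned slice of the block.
    T* allocate(std::size_t n) {
        Arena& ar = *m_arena;
        const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(ar.base) + ar.used;
        const std::size_t pad =
            static_cast<std::size_t>((alignof(T) - addr % alignof(T)) % alignof(T));
        if (pad > ar.size - ar.used) throw std::bad_alloc();
        const std::size_t start = ar.used + pad;
        if (n > (ar.size - start) / sizeof(T)) throw std::bad_alloc();
        ar.used = start + n * sizeof(T);
        return static_cast<T*>(static_cast<void*>(ar.base + start));
    }

    void deallocate(T*, std::size_t) {}  // never free; the block's owner does that

    Arena* arena() const { return m_arena; }

private:
    Arena* m_arena;
};

template <typename T, typename U>
bool operator==(const InPlaceAlloc<T>& a, const InPlaceAlloc<U>& b) { return a.arena() == b.arena(); }
template <typename T, typename U>
bool operator!=(const InPlaceAlloc<T>& a, const InPlaceAlloc<U>& b) { return !(a == b); }

bool in_place_demo() {
    void* p = std::malloc(1024);
    if (!p) return false;
    bool ok = false;
    {
        Arena arena(p, 1024);
        InPlaceAlloc<int> a(&arena);
        std::list<int, InPlaceAlloc<int> > x(a);
        x.push_back(1);
        x.push_back(2);
        assert(arena.used > 0 && arena.used <= arena.size);
        const char* node = reinterpret_cast<const char*>(&x.front());
        ok = x.size() == 2 && node >= arena.base && node < arena.base + arena.size;
    }  // the list is destroyed before the block goes away
    std::free(p);
    return ok;
}
```

The inner scope matters: the list still walks its nodes on destruction, so it must be gone before the block is freed.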
In-Place Summary
• Speed
– Scenario: add x elements, remove half
– About 50x faster than default allocator!
• Advantages
– Fast; no overhead; no fragmentation
– Whatever memory you want
• Disadvantages
– Proper implementation isn’t easy
– Limited use
Recommendations
• Allocators: a last resort optimization
• Base your allocator on <memory>
• Beware porting issues (both compilers and STL vendor libraries)
• Beware allocators with state
• Test thoroughly
• Verify speed/size improvements
Recommendations part II
• Use typedefs to simplify life
• Don’t forget to write
– Rebind
– Copy constructor
– Templatized copy constructor
– Comparison operators
– Void specialization
References
• C++ Standard section 20.1.5, 20.4.1
• Your STL implementation: <memory>
• GDC Proceedings: References section
• Game Gems III
• pkisensee@msn.com
• www.tantalon.com/pete.htm