1 Becoming More Effective with C++ … Day Two Stanley B. Lippman


Transcript of 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Page 1: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Becoming More Effective with C++ … Day Two

Stanley B. Lippman
[email protected]

Page 2: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

The Problem of Static Initialization

Page 3: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Complex Global Objects

Global objects are problematic for many reasons. Their primary benefit is that they simplify information sharing across functions, modules, and libraries. The three primary drawbacks of global objects are:

- The direct use of global objects across functions, modules, and/or libraries results in code that is difficult to reuse or modify in any significant way.
- In a threaded environment, global objects require locks before write access.
- Complex global objects are considerably more expensive to initialize, and are difficult to program correctly.

In this unit, we look at this last issue, that of complex global objects.

Page 4: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

C-Style Strings vs. string Class

First, we need to understand what it means to say that a global object is complex. Consider the following pairs of global declarations.

In this pair, the first instance is a C-style string. The second is a standard library string object.

Both hold a constant string literal indicating a version number. What are the differences in initialization?

    const char* const version1 = "0.00";
    const string version2( "0.00" );

Page 5: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

C-Style Strings vs. string Class

The first instance represents a constant expression. version1 can be completely evaluated at compile-time.

Typically, the string literal is allocated in the program data segment. (Alternatively, the implementation may use a string table to avoid multiple instances of the same string literal.)

version1 is initialized to the location of the string literal during compilation.

There is no run-time overhead.

    const char* const version1 = "0.00";

Page 6: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

C-Style Strings vs. string Class

The second instance represents a constructor invocation. In general, this cannot be evaluated at compile-time.

The string literal is still allocated during compilation. The actual invocation of the string constructor, however, must be delayed until program start-up.

In addition, if version2 is needed by another global object not defined in the same file, we have a potentially serious dependency problem. (More on that later.)

    const string version2( "0.00" );

Page 7: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Built-in Array vs. vector Class

In this pair, the first instance is a built-in array. The second is a standard library vector object.

Both are initialized to the same four constant expressions.

What are the differences in initialization?

    int lut1[] = { 7, 12, 48, 106 };
    vector<int> lut2( lut1, lut1+4 );

Page 8: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Built-in Array vs. vector Class

In this pair, the first instance is a built-in array object. The second is a standard library vector object. Both are initialized to the same values.

Again, the first instance represents a constant expression. lut1 can be completely evaluated at compile-time.

Typically, the array memory is allocated in the program data segment with the elements initialized to the literal values.

There is no run-time overhead. (Of course, only constant expressions can be used within the initialization list.)

    int lut1[] = { 7, 12, 48, 106 };

Page 9: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Built-in Array vs. vector Class

The second instance again represents a run-time constructor invocation.

The begin and end addresses marking off the range of elements with which to initialize lut2 are constant expressions.

The iteration across that range to copy the values is a run-time activity.

Again, if lut2 is needed by another global object not defined in the same file, we have the same potentially serious dependency problem.

    vector<int> lut2( lut1, lut1+4 );

Page 10: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Pointer Initializations

In this pair, both are pointers to objects of type int. Both address an integer object containing the value 7.

- One is initialized to the address of the first element of lut1.
- The other is initialized with a copy of that element's value. The object it addresses is allocated on the heap.

What are the differences in initialization?

    int lut1[] = { 7, 12, 48, 106 };

    int *pi1 = &lut1[ 0 ];
    int *pi2 = new int( lut1[ 0 ] );

Page 11: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Pointer Initializations

In this pair, the first instance is initialized with the address of the first array element.

The address of a global object is a constant expression: the compiler knows its value.

The memory for the pointer is allocated within the data segment by the compiler and initialized with the address of the first element during code generation.

There is no run-time overhead.

    int lut1[] = { 7, 12, 48, 106 };
    int *pi1 = &lut1[ 0 ];

Page 12: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Pointer Initializations

In this pair, the second instance invokes the new expression. The new expression calls a library allocation function (operator new). It cannot be evaluated at compile-time.

Heap memory, in general, is a run-time resource whose allocation requires one or two function calls.

So, although this does not involve a class constructor, it still requires a run-time function invocation.

    int lut1[] = { 7, 12, 48, 106 };
    int *pi2 = new int( lut1[ 0 ] );

Page 13: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Simple Type Initialization

In this pair, both objects are of the built-in integer data type.

What are the differences in initialization? Or, rather, at this point, why is the second instance a run-time initialization?

    int ival1 = 7;
    int ival2 = lut1[ 0 ];

Page 14: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Simple Type Initialization

The first instance is initialized to a constant literal. The compiler is able to completely evaluate the expression and complete the initialization during code-generation.

The second instance is initialized to the value of a non-constant integer object. The value associated with a non-const object of any type cannot be known until run-time.

While no constructor or function needs to be invoked in the initialization of ival2, it still cannot be initialized at compile-time and requires static initialization. (The C++ Standard calls this run-time step dynamic initialization; these slides use static initialization to mean the run-time initialization of static objects.)

    int ival1 = 7;
    int ival2 = lut1[ 0 ];

Page 15: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Summary: Constant Expressions

Each pair's first definition makes use of constant expressions, and results in compile-time initialization that is loaded within the program's data segment:

    const char* const version1 = "0.00";
    int lut1[] = { 7, 12, 48, 106 };
    int *pi1 = &lut1[ 0 ];
    int ival1 = 7;

A constant expression is an expression that can be fully evaluated at compile-time.

Yes, the address of a global object is a constant expression!

In the C language, global data can only be initialized with constant expressions …

Page 16: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Summary: Constant Expressions

Each pair's second definition requires run-time evaluation of its initial value.

Evaluation must be delayed until program start-up.

    const string version2( "0.00" );
    vector<int> lut2( lut1, lut1+4 );
    int *pi2 = new int( lut1[ 0 ] );
    int ival2 = lut1[ 0 ];

The next question, then, is how this initialization gets accomplished at run-time.

Page 17: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

OK, given the following program fragment

    Matrix identity;
    int main()
    {
        // identity must be initialized by this point!
        Matrix m1 = identity;
        ...
        return 0;
    }

the language guarantees that identity is constructed prior to the first user statement of main(), and that it is destructed following the last statement of main().

A global object such as identity with an associated constructor and destructor is said to require both static initialization and deallocation.

Page 18: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Static Initialization: Why the Urgency?

OK, given the following program fragment, why is the issue of when static initialization takes place an issue?

    Matrix identity( 1,0,0,0,0,1,0,0,
                     0,0,1,0,0,0,0,1 );

    int main()
    {
        Matrix m1 = identity;
        // ...
        return 0;
    }

Page 19: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Static Initialization: Why the Urgency?

Unless identity is initialized before the first statement of main(), our program fails.

In this case, it is the compiler's responsibility to carry out the necessary initialization in a timely fashion. (We'll see a case later in which the initialization becomes our responsibility.)

The language guarantees just that. We are promised that identity is constructed prior to the first user statement of main(), and that it is destructed following the last statement of main().

A global object such as identity with an associated constructor and destructor is said to require both static initialization and deallocation.

Page 20: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Static Initialization at Start-up

All global objects, whether complex or simple, are allocated within the program data segment.

If an initial value is specified, and that initial value can be evaluated at compile-time, the object is initialized with that value.

Otherwise, a special initialization function (and special deallocation function, if necessary) is synthesized within the module under compilation.

The non-constant initial expression is applied to the object within this function.

Page 21: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Static Initialization at Start-up

In our example, the program is rewritten something like the following (this is pseudo-code):

    Matrix identity; // no constructor applied

    int main()
    {
        // prior to execution beginning here,
        // __sti_someName() and all other static
        // initialization functions must be invoked

        Matrix m1 = identity;
        // …
    }

    // compiler-synthesized initialization function
    void __sti_someName()
    {
        identity.Matrix::Matrix(
            1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1 );
    }

Page 22: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Static Initialization at Start-up

If there are multiple objects within a file requiring static initialization, they are placed within the function in the order of their declaration.

This is guaranteed by the language. (What is not guaranteed is the order in which these initialization functions are invoked. We'll look at that in a minute.)

So, for example, given the following set of complex global objects:

    const string version2( "0.00" );
    vector<int> lut2( lut1, lut1+4 );
    int *pi2 = new int( lut1[ 0 ] );
    int ival2 = lut1[ 0 ];

Page 23: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Static Initialization at Start-up

    // typical transformation
    const string version2; // no constructor
    vector<int> lut2;      // no constructor
    int *pi2 = 0;
    int ival2 = 0;

    void __sti_someName()
    {
        // Pseudo-code
        version2.string::string( "0.00" );
        lut2.vector::vector( lut1, lut1+4 );
        pi2 = new int( lut1[ 0 ] );
        ival2 = lut1[ 0 ];
    }

Page 24: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Static Initialization at Start-up

The job of the compilation system, then, is to identify and invoke all the static initialization functions within the program executable. Moreover, it must do that prior to the beginning of main().

Again, within a module, the order of initialization is strictly defined: it is the textual order of declaration. The order of destruction is the reverse.

Unfortunately, the order of initialization across modules is left unspecified by the C++ Standard.

Using objects requiring static initialization across modules requires some special programming.

Let's first look at the problem, then illustrate one programmer solution.

Page 25: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Problem Illustration

    #include <iostream>
    #include "my_static_object.h"

    extern my_static_object *p;

    // foo makes use of p
    // foo and p are defined in separate modules
    extern int foo();

    // we want to use p and foo() here -
    // we have no way within the language
    // to force the initialization of p.
    int iv = p->bar();
    int iv2 = foo();

    int main()
    {
        // !! core dumps before reaching here.
        std::cout << "beginning main: " << p << std::endl;
    }

Page 26: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

The Problem Statement

We need to guarantee that when a user includes a header file declaring one or more global objects that require static initialization, those objects:

- are initialized in the module in which the header file is included, but
- are initialized just once, even though multiple modules are likely to include the header file.

As an example of the problem, think of cout, cin, and cerr, each of which must be statically initialized before use and which are included in hundreds of files.

Page 27: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

A Schwarz Counter Solution

Jerry Schwarz, the designer of the original iostream library, came up with a solution that now generally bears his name: the Schwarz Counter.

Consider our my_static_object class, as follows:

    class my_static_object {
    public:
        my_static_object( int ival );
        int foo() const { return _ival; }

    private:
        int _ival;
    };

    // the external object we need to initialize
    extern my_static_object *p;

Page 28: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

A Schwarz Counter Solution

Within the header file of the class, an auxiliary class (some call it a helper class) and a static object of that class are introduced:

    class my_stat_obj_init {
    public:
        my_stat_obj_init();
        ~my_stat_obj_init();

    private:
        static int init_count;
        static int init_val;
    };

    // the Schwarz counter static object
    static my_stat_obj_init obj;

Page 29: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

What's Going on?

The auxiliary class maintains a static data member that keeps a reference count.

This member is incremented with each constructor call and decremented with each destructor call.

If it is 0 prior to being incremented, the global object we care about is initialized within the constructor:

    my_stat_obj_init::my_stat_obj_init()
    {
        if ( init_count++ )
            return;

        // this is the whole point of the class
        p = new my_static_object( init_val );
    }

Page 30: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Destructor & Static Members

The cost of the static object initialization of obj is the constructor call, and generally a word of storage for each module it is included in.

The benefit is that it does not require any explicit work from the user.

The destructor and static data declarations look as follows:

    my_stat_obj_init::~my_stat_obj_init()
    {
        if ( --init_count )
            return;
        delete p;
    }

    int my_stat_obj_init::init_count;
    int my_stat_obj_init::init_val = 1024;

Page 31: 1 Becoming More Effective with C++ … Day Two Stanley B. Lippman

Potential Drawback …

If possible, I recommend not using global objects, in particular global objects requiring static initialization. A singleton object that allocates itself on first use is one possible alternative.

It is important for the C++ programmer to:

- Recognize the difference between simple and complex global objects.
- Recognize when a complex global object may be accessed by complex global objects in other modules, and program such that the global object accessed is initialized.

Hopefully this unit has clarified the issue and suggested at least one solution.