Transactional Memory for C/C++

23
Transactional Memory for C/C++ IBM XL C/C++ for Blue Gene/Q, V12.0 (technology preview)

Transcript of Transactional Memory for C/C++

Page 1: Transactional Memory for C/C++

Transactional Memory for C/C++IBM XL C/C++ for Blue Gene/Q, V12.0 (technology preview)

���

Page 2: Transactional Memory for C/C++

ii Transactional Memory for C/C++

Page 3: Transactional Memory for C/C++

Contents

Chapter 1. Transactional memory . . . . 1

Chapter 2. Compiler option reference . . 3-qtm . . . . . . . . . . . . . . . . . 3

Chapter 3. Predefined macro fortransactional memory . . . . . . . . . 5

Chapter 4. Environment variables fortransactional memory . . . . . . . . . 7TM_MAX_NUM_ROLLBACK . . . . . . . . 7TM_REPORT_STAT_ENABLE. . . . . . . . . 7TM_REPORT_NAME . . . . . . . . . . . 7TM_REPORT_LOG . . . . . . . . . . . . 8TM_ENABLE_INTERRUPT_ON_CONFLICT . . . 8

Chapter 5. Pragma directives forparallel processing . . . . . . . . . 11Nesting OpenMP, transactional memory, andthread-level speculative execution . . . . . . . 11#pragma tm_atomic . . . . . . . . . . . 12

Chapter 6. Built-in functions fortransactional memory . . . . . . . . 15tm_get_stats . . . . . . . . . . . . . . 15tm_get_all_stats . . . . . . . . . . . . . 16tm_print_stats . . . . . . . . . . . . . 16reset_speculation_mode . . . . . . . . . . 17

Index . . . . . . . . . . . . . . . 19

iii

Page 4: Transactional Memory for C/C++

iv Transactional Memory for C/C++

Page 5: Transactional Memory for C/C++

Chapter 1. Transactional memory

Transactional memory is a model for controlling concurrent memory accesses inthe scope of parallel programming.

In parallel programming, concurrency control ensures that threads running inparallel do not update the same resources at the same time. Traditionally, theconcurrency control of shared memory data is through locks, for example, mutexlocks. A thread acquires a lock before modifying the shared data, and releases thelock afterward. A lock-based synchronization can lead to some performance issuesbecause threads might need to wait to update lock-protected data.

Transactional memory is an alternative to lock-based synchronization. It attemptsto simplify parallel programming by grouping read and write operations andrunning them like a single operation. Transactional memory is like databasetransactions where all shared memory accesses and their effects are eithercommitted all together or discarded as a group. All threads can enter the criticalregion simultaneously. If there are conflicts in accessing the shared memory data,threads try accessing the shared memory data again or are stopped withoutupdating the shared memory data. Therefore, transactional memory is also called alock-free synchronization. Transactional memory can be a competitive alternative tolock-based synchronization.

A transactional memory system must hold the following properties across theentire execution of a concurrent program:

AtomicityAll speculative memory updates of a transaction are either committed ordiscarded as a unit.

ConsistencyThe memory operations of a transaction take place in order. Transactionsare committed one transaction at a time.

IsolationMemory updates are not visible outside of a transaction until thetransaction commits data.

Transactional memory on Blue Gene/Q

On Blue Gene/Q, the transactional memory model is implemented in the hardwareto access all the memory up to the 16 GB boundary.

Transactions are implemented through regions of code that you can designate to besingle operations for the system. The regions of code that implement thetransactions are called transactional atomic regions.

Transactional memory is enabled with the "-qtm" compiler option, and requiresthread safe compilation mode.

Execution modes

When transactional memory is activated in Blue Gene/Q, transactions are run inone of the following operating modes:

1

Page 6: Transactional Memory for C/C++

v Speculation mode (default)v Irrevocable mode

Each mode applies to an entire transactional atomic region.

Speculation mode

Under speculation mode, Kernel address space, devices I/Os, and mostmemory-mapped I/Os are protected from the irrevocable actions exceptwhen the safe_mode clause is specified. The transaction goes intoirrevocable mode if such an action occurs to guarantee the correct result.

Irrevocable mode

System calls, irrevocable operations such as I/O operations, and OpenMPconstructs trigger transactions to go into irrevocable mode, which serializestransactions. Transactions are also running in irrevocable mode when themaximum number of transaction rollbacks has been reached.

Under irrevocable mode, each memory update of a thread is committedinstantaneously instead of at the end of the transaction. Therefore, memoryupdates are immediately visible to other threads. If the transactionbecomes irrevocable, the threads run nonspeculatively.

Using variables and synchronization constructs withtransactional memory

The semantics of transactional memory ensure that the effects of transactions of athread are visible to other threads only after the transactions commit data orbecome irrevocable. When you use variables or synchronization constructs insidetransactions, be careful when the volatile and regular variables that are visible toother threads are updated.

Data races when using transactional memory

A data race might happen if a memory location is accessed concurrently from boththe following types of code sections:v A transactional atomic region that is not nested in other critical sectionsv A lock-based critical section of another thread

For example, the atomicity of a lock-based critical section might be broken whenthe transaction happens in the middle of the critical section. The atomicity of thetransaction might also be broken if the transaction becomes irrevocable and isinterleaved with the critical section.

The data race happens because each transactional atomic region can be thought ofas using a different lock. In contrast, the #pragma omp critical directive uses onelock for all critical regions in the same parallel region.

Related informationv The "-qtm" compiler optionv Built-in functions for transactional memoryv Environment variables for transactional memoryv #pragma tm_atomic

2 Transactional Memory for C/C++

Page 7: Transactional Memory for C/C++

Chapter 2. Compiler option reference

-qtm

Category

Optimization and tuning

Pragma equivalent

None.

Purpose

Enables support for transactional memory.

Syntax

��notm

-q tm ��

Defaults

-qnotm

Usage

The -qtm option requires the thread safe compilation mode. Use -qtm with thebgxlc_r, bgxlC_r, or bgxlc++_r thread safe compiler invocation command.

Related informationv Transactional memoryv #pragma tm_atomicv Built-in functions for transactional memoryv Environment variables for transactional memory

3

Page 8: Transactional Memory for C/C++

4 Transactional Memory for C/C++

Page 9: Transactional Memory for C/C++

Chapter 3. Predefined macro for transactional memory

You can use the following macro to test whether the transactional memory featureis enabled.

Compiler option:-qtm

Macro name:__IBM_TRANSACTIONAL_MEMORY__

Predefined value:1

Description:When the transactional memory feature is enabled with the -qtm option,the __IBM_TRANSACTIONAL_MEMORY__ macro is defined as 1. If theoption is not specified, the macro is undefined.

5

Page 10: Transactional Memory for C/C++

6 Transactional Memory for C/C++

Page 11: Transactional Memory for C/C++

Chapter 4. Environment variables for transactional memory

The environment variables for transactional memory have no effect unlesstransactional memory is enabled with the "-qtm" compiler option.

TM_MAX_NUM_ROLLBACK

The TM_MAX_NUM_ROLLBACK environment variable indicates the maximumnumber a thread of a particular transactional atomic region can roll back before thethread goes into irrevocable mode.

��10

TM_MAX_NUM_ROLLBACK = n ��

The default value is 10.

TM_MAX_NUM_ROLLBACK = 0 forces the thread to enter irrevocable mode.

TM_MAX_NUM_ROLLBACK = -1 sets the thread to run infinite rollbacks.

TM_REPORT_STAT_ENABLE

The TM_REPORT_STAT_ENABLE environment variable enables or disablesstatistics query built-in functions for transactional memory.

�� TM_REPORT_STAT_ENABLE= YES

��

The value is not case sensitive.

When TM_REPORT_STAT_ENABLE is set to YES, you can retrieve the statisticsthrough the following built-in functions:v “tm_get_stats” on page 15v “tm_get_all_stats” on page 16

Related informationv Built-in functions for transactional memory

TM_REPORT_NAME

The TM_REPORT_NAME environment variable specifies the name and location ofthe statistics log file for transactional memory.

�� TM_REPORT_NAME= file_path

��

7

Page 12: Transactional Memory for C/C++

If TM_REPORT_NAME is not set, the log file is placed in the current workingdirectory and is named tm_report.log.pid. The extension pid is the ID of theprocess that called tm_print_stats.

Related informationv Built-in functions for transactional memory

TM_REPORT_LOG

The TM_REPORT_LOG environment variable specifies how to create the statisticslog file for transactional memory.

By default, TM_REPORT_LOG is not defined and no log file is created.

�� TM_REPORT_LOG = SUMMARYFUNCALLVERBOSE

��

The value is not case sensitive.

SUMMARYThe statistics log file is generated only at the end of the program.

FUNCThe statistics log file is generated and updated at each call to thetm_print_stats built-in function.

ALLThe statistics log file is generated and updated at each call to thetm_print_stats built-in function and at the end of the program.

VERBOSEThe statistics log file is generated and updated at each call to thetm_print_stats built-in function and at the end of the program. The generatedreport file also includes the addresses of memory access conflicts during thespeculation.

If TM_REPORT_LOG is set to either SUMMARY, ALL, or VERBOSE, a log file thatcontains the following statistics is also created when the program ends:v It contains the cumulative statistics for transactional atomic regions that each

hardware thread has run. The statistics follow the Structure of statistic counters.v It contains a summary that shows the sum of all the statistic counters for all the

hardware threads.

Related informationv Built-in functions for transactional memory

TM_ENABLE_INTERRUPT_ON_CONFLICT

The TM_ENABLE_INTERRUPT_ON_CONFLICT environment variable indicateswhether to generate interrupts on conflicts.

8 Transactional Memory for C/C++

Page 13: Transactional Memory for C/C++

��NO

TM_ENABLE_INTERRUPT_ON_CONFLICT = YES ��

The default value is NO. The value is not case sensitive.

Conflict resolution usually happens at the end of transactions. Use thisenvironment variable to generate interrupts for access conflicts in transactionalatomic regions. This option is more useful for long running transactions becausethe conflict resolution logic immediately starts inside the interrupt handler of thekernel.

Note: Interrupts are always generated for speculative buffer overflows orsupervised-mode violations.

Chapter 4. Environment variables for transactional memory 9

Page 14: Transactional Memory for C/C++

10 Transactional Memory for C/C++

Page 15: Transactional Memory for C/C++

Chapter 5. Pragma directives for parallel processing

Parallel processing operations are controlled by pragma directives in your programsource.v The omp pragmas have effect only when parallelization is enabled with the

-qsmp compiler option.v The speculative pragmas have effect only when thread-level speculative

execution is enabled with the -qsmp=speculative compiler option.v The tm pragmas have effect only when transactional memory is enabled with the

-qtm compiler option.

Nesting OpenMP, transactional memory, and thread-level speculativeexecution

This section describes how you can mix parallel regions. The following types ofparallel regions can be used without restrictions in the same program if they arenot nested:v OpenMPv Transactional memory (TM)v Thread-level speculative execution (SE)

They can also be nested but with some restrictions. The following table describesthe behavior of different nesting scenarios.

Table 1. Nesting rules for OpenMP, TM, and SE

Scenario Description Runtime action

BEGIN SEBEGIN SE

END SEEND SE

An SE region isnested inside anSE region.

The SE nesting is flattened. The inner nested SEregion is run speculatively by one thread as part ofthe parallel outer SE region. The clauses specified onthe inner SE region are still effective.

BEGIN SEBEGIN TM

END TMEND SE

A TM region isnested inside anSE region.

The TM region is run speculatively in SE mode aspart of the outer SE region.

BEGIN SEBEGIN OpenMP

END OpenMPEND SE

An OpenMPregion is nestedinside an SEregion.

An OpenMP region running in parallel inside thespeculative SE region causes the SE region to bestopped. The stopped SE region is rolled back and runnonspeculatively. The inner OpenMP region is runnonspeculatively by multiple threads.

BEGIN TMBEGIN SE

END SEEND TM

An SE region isnested inside aTM region.

The inner SE region is run speculatively in TM modeby one thread as part of the outer TM region.

BEGIN TMBEGIN TM

END TMEND TM

A TM region isnested inside aTM region.

The TM nesting is flattened. The inner nested TMregions are run speculatively as part of the outer TMregion.

11

Page 16: Transactional Memory for C/C++

Table 1. Nesting rules for OpenMP, TM, and SE (continued)

Scenario Description Runtime action

BEGIN TMBEGIN OpenMP

END OpenMPEND TM

An OpenMPregion is nestedinside a TMregion.

An OpenMP region running in parallel inside thespeculative TM region causes the TM region to bestopped. The stopped transaction is then rolled backand run nonspeculatively. The inner OpenMP regionis run nonspeculatively by multiple threads.

BEGIN OpenMPBEGIN SE

END SEEND OpenMP

An SE region isnested inside anOpenMP region.

The SE region is run by one thread in nonspeculativemode. The outer OpenMP region is run in parallel.

BEGIN OpenMPBEGIN TM

END TMEND OpenMP

A TM region isnested inside anOpenMP region.

The outer OpenMP region is run in parallel and theTM region is run speculatively.

BEGIN TMEND TM

BEGIN SEEND SE

A programcontains separateTM and SEregions.

The first region is run in TM mode. The second regionis run in SE mode by one thread.

If the reset_speculation_mode function is called afterthe first TM region, the second SE region is run inparallel speculatively.

BEGIN SEEND SE

BEGIN TMEND TM

A programcontains separateSE and TMregions.

The first region is run in SE mode. The second regionis run in TM irrevocable mode.

If the reset_speculation_mode function is called afterthe first SE region, the second TM region is run inparallel speculatively.

#pragma tm_atomic

The tm_atomic directive indicates a transactional atomic region.

Syntax

�� # pragma tm_atomic(safe_mode)

��

safe_modeUsing the safe_mode clause reduces overhead and increases performance.However, if safe_mode is specified, irrevocable actions are not checked atruntime. The run result is undefined if an irrevocable action occurs during theexecution.

Usage

The transactional memory directive is enabled with the -qtm compiler option.

This directive must be placed immediately before the code block of thetransactional atomic region.

A transactional atomic region must be a structured block as defined in theOpenMP 3.0 specification. A structured block is one of the following constructs:

12 Transactional Memory for C/C++

Page 17: Transactional Memory for C/C++

v An executable statement, possibly compound, with a single entry at the top anda single exit at the bottom

v An OpenMP construct

Notes:

v The following restrictions apply to branch operations:– Entry to the transactional atomic region cannot be the result of a branch.– The point of entry must not be a call to setjmp().– Calls to longjmp() and throw() must not cause the transaction to branch in or

out of the transactional atomic region.v You cannot call exit() in a transactional atomic region.v You cannot create a local object of variable length array in a transactional atomic

region.

Transactional atomic regions can be nested. However, the atomicity, consistency,and isolation properties are provided at the outermost transactional atomic region.v Stopping an inner transaction causes the corresponding outer transaction to stop.v An inner transaction does not commit data at the end of its transactional atomic

region. Instead, the data of the inner transaction is committed later when thedata of the corresponding outer transaction is committed.

v If a conflict occurs in a nested transaction, the thread is rolled back to thebeginning of the outermost transaction.

Exampleint v, w, z;int a[10], b[10];

#pragma omp parallel forfor (v = 0; v < 10; v++)

for (w = 0; w < 10; w++)for (z = 0; z < 10; z++){

//Use the tm_atomic directive to indicate a transactional atomic region.#pragma tm_atomic{

a[v] = a[w] + b[z];}

}

Related informationv The "-qtm" compiler optionv Execution modesv Built-in functions for transactional memoryv Environment variables for transactional memoryv "Exception handling (C++ only)"

Chapter 5. Pragma directives for parallel processing 13

Page 18: Transactional Memory for C/C++

14 Transactional Memory for C/C++

Page 19: Transactional Memory for C/C++

Chapter 6. Built-in functions for transactional memory

To call the built-in functions that are specific to transactional memory, you mustinclude the speculation.h file in your source code.

The built-in functions for transactional memory have no effect unless transactionalmemory is enabled with the "-qtm" compiler option.

Structure of statistic counters

The following structure that contains the statistic counters can be used with thetransactional memory built-in functions. It is declared in the speculation.h file.typedef struct TmReport_s {

// Thread IDunsigned long hwThreadId;

// Total number of transactionsunsigned long totalTransactions;

// Total number of rollbacks for transactional memory threadsunsigned long totalRollbacks;

// Total number of serialization caused by JMV conflictsunsigned long totalSerializedJMV;

// Total number of serialization caused by TM_MAX_NUM_ROLLBACK reachedunsigned long totalSerializedMAXRB;

// Total number of serialization caused by buffer overflow, hardware race,// and concurrent regions of transactional memory and thread-level speculative// execution in the same processunsigned long totalSerializedOTHER;

} TmReport_t;

tm_get_statsPurpose

With the tm_get_stats built-in function, you can retrieve statistical information forthe transactional memory of a particular hardware thread in your program.

Prototypevoid tm_get_stats (TmReport_t *stats)

Usage

When the tm_get_stats built-in function is called, it updates the argument withcumulative statistics of all the transactional atomic regions that a particularhardware thread has run.

To use the tm_get_stats function, you must set the TM_REPORT_STAT_ENABLEenvironment variable to YES.

15

Page 20: Transactional Memory for C/C++

Related informationv Environment variables for transactional memory

tm_get_all_statsPurpose

With the tm_get_all_stats built-in function, you can retrieve statistical informationfor the transactional memory of all hardware threads in a program.

Prototypevoid tm_get_all_stats (TmReport_t *stats)

Usage

When the tm_get_all_stats built-in function is called, it updates the argumentwith cumulative statistics of all the transactional atomic regions that all hardwarethreads have run.

To use the tm_get_all_stats function, you must set theTM_REPORT_STAT_ENABLE environment variable to YES.

You must use this built-in function outside of parallel regions.

Related informationv Environment variables for transactional memory

tm_print_statsPurpose

With the tm_print_stats built-in function, you can write statistics for thetransactional memory of a particular hardware thread to a log file.

Prototypevoid tm_print_stats(void)

Example...

#pragma tm_atomiccode-block

tm_print_stats()...

Usage

After a transactional atomic region, you can call the tm_print_stats built-infunction to write cumulative statistics of all the transactional atomic regions that aparticular hardware thread has run to a log file.

To enable the statistics log file to be generated at each call to the tm_print_statsfunction, you must set the TM_REPORT_LOG variable to FUNC or ALL.

By default, the log file is named tm_report.log.pid, where pid is the ID of theprocess that called tm_print_stats. The log file is saved in the current working

16 Transactional Memory for C/C++

Page 21: Transactional Memory for C/C++

directory of the program that uses transactional memory. To override the defaults,you can specify the TM_REPORT_NAME environment variable.

Note: If you use this function in a transactional atomic region, it causes thetransaction running in irrevocable mode.

Related informationv Environment variables for transactional memory

reset_speculation_mode

Purpose

With the reset_speculation_mode built-in function, you can switch mode inbetween thread-level speculative execution (SE) and transactional memory (TM) atruntime.

Prototypevoid reset_speculation_mode(void)

Example...#pragma speculative for

for-loop

reset_speculation_mode()

#pragma tm_atomiccode-block

...

Usage

You must use the reset_speculation_mode function after a TM or SE region.

To switch the mode, make sure all the TM or SE executions before the call arecompleted; otherwise, it might result in undefined behavior.

Related informationv “Nesting OpenMP, transactional memory, and thread-level speculative

execution” on page 11

Chapter 6. Built-in functions for transactional memory 17

Page 22: Transactional Memory for C/C++

18 Transactional Memory for C/C++

Page 23: Transactional Memory for C/C++

Index

Bbuilt-in functions

reset_speculation_mode 17tm_get_all_stats 16tm_get_stats 15tm_print_stats 16

Ccompiler option

-qtm 3

Eenvironment variables

TM_ENABLE_INTERRUPT_ON_CONFLICT 8TM_MAX_NUM_ROLLBACK 7TM_REPORT_LOG 8

environment variables (continued)TM_REPORT_NAME 7TM_REPORT_STAT_ENABLE 7

Iirrevocable mode 1

Mmacros

related to compiler options 5

Pparallel processing

parallel processing pragmas 11nesting 11

parallel processing (continued)pragma directives 11

pragmastm_atomic 12

Ssafe mode 1supervised mode 1

Ttransactional memory 1

19