DB2 Key Performance Metric Descriptions
DB2 OBJECTS & PERFORMANCE
Buffer Pool:
Memory allocated to DB2; the main memory that the DB Manager uses to cache table and index data
pages as they are read from disk or modified. The DB Manager decides when to bring data from disk into
the buffer pool. When old data is no longer being used, it can be written back out to disk.
Table Space:
Logical layer between the database and the physical tables that hold the data; maps the logical database
design to physical storage. Two types of table spaces:
• (SMS) System Managed Space: the OS file system allocates and manages the space where the table
is stored.
• (DMS) Database Managed Space: the DB Manager controls the storage space; in effect, a
special-purpose file system.
Container:
Allocates storage for the table space; it is the physical storage. It can be a directory name, a device, or a
file name. All database and table data is assigned to table spaces. A single table space can span multiple
containers, but each container can belong to ONLY one table space.
Extent:
A unit of space within a container of a table space. DB objects are stored in pages within DB2, and
pages are grouped into allocation units (extents).
[Figure: Table Space, Container_0, Extent, Pages]
Page sizes:
Rows of table data are organized into blocks called pages. Four page sizes exist (4K, 8K, 16K, 32K).
In a page of table data, about 75 bytes are reserved for DB2; the remainder is used for user data.
As you increase the page size, the following limits also increase:
• Maximum number of columns in the table
• Maximum row length
• Maximum table size
Big Block Reads:
If several pages (an extent) are retrieved in a single request, a big-block read occurs. If the rows a
query needs are in the pages of the extent already retrieved, no further physical I/O is required.
Sequential Pre-fetching:
Ability of the DB Manager to read pages in advance of those pages being referenced by a query. I/O
servers perform the page reading.
Page Cleaning:
As pages are read and modified, they accumulate in the buffer pool. Page cleaner tasks write out
modified pages to guarantee availability of buffer pool pages for use by read requests.
Table Descriptor:
The table descriptor provides information about the table, particularly the data definition from the
CREATE TABLE statement that created the object.
Catalog Table Space:
Catalog is where DB2 keeps all its metadata about database objects.
Temporary Table Spaces:
Space for intermediate tables that DB2 builds while it determines the final result set.
Catalog Cache:
Table descriptors for tables, views, and aliases; a descriptor stores information about a table, view, or
alias in a condensed internal format. When an SQL statement references a table, it causes an insert of a
table descriptor into the cache, so that subsequent SQL statements referencing that same table can use
that descriptor and avoid reading from disk.
Package Cache: (see Package)
The package cache hit ratio tells you whether or not the package cache is being used effectively. If the
hit ratio is high (more than 0.8), the cache is performing well. A smaller ratio may indicate that the
package cache should be increased.
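The ratio can be worked out from two counters. The sketch below assumes the usual definition, 1 minus inserts over lookups, fed from counters such as the pkg_cache_lookups and pkg_cache_inserts monitor elements; the function name and the sample numbers are illustrative only.

```python
def package_cache_hit_ratio(lookups: int, inserts: int) -> float:
    """Return the fraction of package cache lookups satisfied from the cache.

    Assumption: every insert corresponds to a lookup that missed the cache.
    """
    if lookups == 0:
        return 0.0  # no activity yet, so there is no meaningful ratio
    return 1.0 - inserts / lookups


# Hypothetical counter values, not taken from a real snapshot.
ratio = package_cache_hit_ratio(lookups=5000, inserts=400)
print(f"hit ratio = {ratio:.2f}")
print("cache performing well" if ratio > 0.8 else "consider a larger package cache")
```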
The package and section information required for the execution of dynamic and static SQL statements
is placed in the package cache as required. This information is needed whenever a dynamic or static
statement is being executed.
The package cache exists at a database level. This means that agents with similar environments can
share the benefits of another agent's work. For static SQL statements, this can mean avoiding catalog
access.
Hash Join:
In a hash join, one table (selected by the optimizer) is scanned and rows are copied into memory buffers
drawn from the sort heap allocation. The memory buffers are divided into partitions based on a hash
code computed from the columns of the join predicates. Rows of the other table involved in the join are
matched to rows from the first table by comparing the hash code. If the hash codes match, the actual
join predicate columns are compared.
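The build and probe steps described above can be sketched in a few lines. This is a toy in-memory version, not DB2's implementation: the function name, the partition count, and the sample tables are all assumptions made for illustration.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key, partitions=4):
    # Build phase: scan one table and copy its rows into in-memory partitions,
    # chosen by a hash code computed from the join column.
    buckets = [defaultdict(list) for _ in range(partitions)]
    for row in build_rows:
        code = hash(row[build_key])
        buckets[code % partitions][code].append(row)

    # Probe phase: hash each row of the other table, look only in the matching
    # partition, and compare the real column values when the hash codes match.
    joined = []
    for row in probe_rows:
        code = hash(row[probe_key])
        for candidate in buckets[code % partitions].get(code, []):
            if candidate[build_key] == row[probe_key]:
                joined.append({**candidate, **row})
    return joined


# Hypothetical sample data.
departments = [{"dept_id": 10, "dept": "Sales"}, {"dept_id": 20, "dept": "Ops"}]
employees = [
    {"name": "Ann", "dept_id": 10},
    {"name": "Bob", "dept_id": 20},
    {"name": "Cy", "dept_id": 30},
]
result = hash_join(departments, employees, "dept_id", "dept_id")
for joined_row in result:
    print(joined_row)
```

Cy has no matching department, so only two joined rows come back.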
Consideration of the performance implications of coding your predicates in different ways
Join Predicate:
A join predicate is applied to identify the records that shall be joined. If the predicate evaluates to True,
then the combined record occurs in the joined table; otherwise, it does not. The join predicate can be
any predicate supported by SQL, for example in WHERE and ON clauses.
A join is a relation composition. It is a fundamental operation in relational algebra.
DPCs (Deferred Procedure Calls):
Interrupts that run at a lower priority than standard interrupts.
User Mode:
User mode is a restricted processing mode designed for applications, environment subsystems, and
integral subsystems.
Privileged Mode:
Privileged mode is designed for O/S components and allows direct access to hardware and memory.
Split I/O:
May result from requesting data in a size that is too large to fit into a single I/O
Logical & Physical:
• Logical Reads is the number of Logical I/O requests made by DB2 for the physical file (or table).
• Physical Reads is the actual number of Physical I/O operations performed to satisfy the Logical I/O
requests.
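The difference between the two counts can be seen with a toy page cache. This is only a sketch: the class name and the LRU eviction policy are assumptions made for the example, and real buffer pool management is more involved.

```python
from collections import OrderedDict


class TinyBufferPool:
    """Toy page cache; LRU eviction is an assumption made for this sketch."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()
        self.logical_reads = 0
        self.physical_reads = 0

    def read(self, page_id):
        self.logical_reads += 1              # every request is a logical read
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # hit: no disk access needed
            return
        self.physical_reads += 1             # miss: the page must come from disk
        self.pages[page_id] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page


pool = TinyBufferPool(capacity=3)
for page in [1, 2, 3, 1, 2, 4, 1, 5, 2, 1]:
    pool.read(page)

hit_ratio = 1 - pool.physical_reads / pool.logical_reads
print(pool.logical_reads, pool.physical_reads, round(hit_ratio, 2))
```

Ten logical reads here cost six physical reads; the remaining four requests were satisfied from memory.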
The values of physical disk counters are sums of the values of the logical disks (or partitions) into which
they are divided.
MDL (Memory Descriptor List) Read Hits:
Read requests to the file system cache that hit the cache, so no disk access is required to
provide memory access to the page.
Data Map Hits:
The percentage of data maps in the file system cache that could be resolved without having to retrieve a
page from the disk, because the page was already in physical memory.
Heap:
A logical grouping of memory that fulfills the needs of a particular component. For example, the utility
heap memory is used by DB2 utilities such as backup, restore, and load.
Indexes:
Indexes provide quick access to data and can enforce uniqueness on the rows in the table.
Threshold Trigger:
An event that occurs when the value of a performance variable exceeds or falls below a user-defined
threshold value. The action that occurs as a result of a threshold trigger can be:
• Logging information in an alert log file.
• Displaying information in an alert log window.
• Generating an audio alarm.
• Issuing a message window.
• Invoking a predefined command or program.
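A threshold trigger boils down to comparing a sampled value against user-defined bounds and taking an action. The sketch below is hypothetical throughout: the function name, the metric name, and the in-memory list standing in for an alert log file are all made up for illustration.

```python
def check_threshold(name, value, low=None, high=None):
    """Return an alert message if `value` is outside [low, high], else None."""
    if high is not None and value > high:
        return f"ALERT {name}={value} exceeds {high}"
    if low is not None and value < low:
        return f"ALERT {name}={value} below {low}"
    return None


alert_log = []  # stand-in for "logging information in an alert log file"
samples = [("package_cache_hit_ratio", 0.92), ("package_cache_hit_ratio", 0.71)]
for metric, value in samples:
    message = check_threshold(metric, value, low=0.8)
    if message:
        alert_log.append(message)

print(alert_log)
```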
A database must have at least one buffer pool, and can have several buffer pools depending on the
workload characteristics and the database page sizes used.
Using the Hidden Buffer pools:
When the main buffer pools are configured too large, it is possible that they will not fit into the
addressable memory space. (We will talk about addressable memory later.) That means DB2 cannot
start the database, because a database must have at least one buffer pool. If the database is not started,
you cannot connect to the database and change the buffer pool sizes. For this reason, DB2 pre-allocates
four small hidden buffer pools, one for each page size. Should the main buffer pools fail to start, DB2
will start the database with the small buffer pools.
Sorting & Memory:
Sorting is required when no index satisfies the requested ordering of fetched rows, or the optimizer
determines that a sort is less expensive than an index scan.
There are two kinds of sorts: private sorts and shared sorts.
• Private sorts take place in an agent's private agent memory
• Shared sorts take place in the database's shared memory
The following formula calculates approximately how much memory the database shared memory set
requires:
Database shared memory = (Main bufferpools + 4 hidden bufferpools + database heap + utility heap +
locklist + package cache + catalog cache) + (number of estore pages * 100 bytes) + approx. 10%
overhead
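The formula can be turned into a small calculator. This is a sketch under two assumptions: all inputs are given in bytes, and the roughly 10% overhead is applied to the sum of the other terms. The sizes passed in are hypothetical and only exercise the arithmetic.

```python
MB = 1024 * 1024


def database_shared_memory(main_bufferpools, hidden_bufferpools, database_heap,
                           utility_heap, locklist, package_cache, catalog_cache,
                           estore_pages=0, overhead=0.10):
    """Estimate database shared memory in bytes (all inputs are in bytes)."""
    heaps = (main_bufferpools + hidden_bufferpools + database_heap + utility_heap
             + locklist + package_cache + catalog_cache)
    estore = estore_pages * 100            # 100 bytes per estore page
    return (heaps + estore) * (1 + overhead)


# Hypothetical sizes, not recommendations.
estimate = database_shared_memory(
    main_bufferpools=400 * MB,
    hidden_bufferpools=1 * MB,
    database_heap=20 * MB,
    utility_heap=20 * MB,
    locklist=4 * MB,
    package_cache=8 * MB,
    catalog_cache=2 * MB,
    estore_pages=0,
)
print(f"{estimate / MB:.1f} MB")
```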
Agent Private Memory:
Each DB2 agent process needs to acquire memory to perform work. It will use memory to optimize,
build and execute access plans on behalf of the application, to perform sorts, to record cursor
information such as location and state, to gather statistics, etc. Agent private memory is allocated for a
DB2 agent when the agent is assigned as the result of a connect request or a new SQL request in a
parallel environment.
When an agent becomes idle, it retains its agent private memory. This is designed to improve
performance, because the agent will have its private memory ready when it is called again. If there are
many idle agents and all of them retain their private memory, it is possible that the system runs out of
memory. To avoid this, DB2 has a registry variable which limits the amount of memory each idle agent
can retain.
Example:
All database requests are serviced by DB2 agents or subagents. For example, when an application
connects to a database, a DB2 agent is assigned to it. When the application issues any database
requests, such as a SQL query, the agent goes out and performs all the tasks that are required to
complete the query. It works on behalf of the application.
Each agent, or subagent, is considered a DB2 process, and it acquires a certain amount of memory to
perform work. This memory is referred to as the agent private memory. It cannot be shared with any
other agents.
Boundaries of Private & Shared Memory:
In addition to its own private memory, in which the agent performs its "private" tasks such as private
sorts using the sortheap, the agent also requires database-level resources such as the buffer pools, the
locklist and the log buffers. These resources are found within the database shared memory.
The way DB2 works is that everything within the database shared memory is shared by all DB2 agents or
subagents connected to the same database. Therefore this memory set is called shared memory, as
opposed to private memory.
Example:
Agent x connecting to database A uses the resources within the database shared memory
of database A. Now a second agent, agent y, also connects to database A. Agent y will share the
database memory of database A with agent x. (Of course, both agent x and y have their own agent
private memory, which is not shared.)
What is a package?
Basically, it is a database object that contains optimized SQL. Each SQL statement must go through the
DB2 optimizer before it can be executed. The optimizer generates a data access plan (which is used to
locate the data when a query is executed), and the access plan is stored in a package. The package itself
is stored in the system catalog if the SQL statement is a static statement coded in an application, or in
the package cache if the SQL statement is a dynamic statement.
You can view the access plan for one or more SQL statements with Explain. It's possible to run an
embedded SQL application using only the package. When you precompile the application, a package is
created containing access plans for the SQL statements coded in the application.
That package is stored in the database that was used to precompile the application.
You can also precompile an application and have the information needed to create the package stored
in an external bind file, which can then be bound to any database you want to use the application with.
The binding process generates the package and stores it in the specified database; this is referred to as
deferred binding.
A program module that contains embedded dynamic SQL has an associated package and sections, but
the sections act only as placeholders for SQL statements that are dynamically prepared.
Package Section or Section:
A section is a compiled form of a SQL statement. Every section corresponds to one statement. An
optimized access plan will be stored in a section.
[Figure: SQL -> Optimizer -> Data Access Plan -> Package -> System Catalog]
Dynamic Statements:
Dynamic SQL allows a programmer or end user to create a SQL statement's specifics at runtime and pass
the statement to the database. The database then returns data into the program variables, which are
bound at SQL runtime.
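A statement assembled at runtime can be sketched as follows. Python's built-in sqlite3 module is used here purely as a stand-in database so the example runs anywhere; the table, column names, and helper function are hypothetical, and DB2 itself is not involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("Ann", "Sales", 50000), ("Bob", "Ops", 42000), ("Cy", "Sales", 61000)],
)


def run_query(filters):
    """Assemble a SELECT at runtime from user-chosen column filters."""
    allowed = {"dept", "salary"}  # only known column names enter the SQL text
    clauses, values = [], []
    for column, value in filters.items():
        if column not in allowed:
            raise ValueError(f"unknown column: {column}")
        clauses.append(f"{column} = ?")   # the value is bound at execution time
        values.append(value)
    sql = "SELECT name FROM staff"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY name"
    return [row[0] for row in conn.execute(sql, values)]


names = run_query({"dept": "Sales"})
print(names)
```

The statement text is not known until `run_query` is called, which is what distinguishes it from a static statement fixed in the application source.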
Static Statements:
A static SQL statement is written and not meant to be changed. Although static SQL statements can be
stored as files ready to be executed later or as stored procedures in the database, static SQL does not
quite offer the flexibility that is allowed with dynamic SQL. THINK: Stored Procedures
The problem with static SQL is that even though numerous queries may be available to the end user,
there is a good chance that none of these "canned" queries will satisfy the users' needs on every
occasion.
Comparing Dynamic vs. Static SQL Statements [1]:
An application using dynamic SQL has a higher start-up (or initial) cost per SQL statement due to the
need to compile the SQL statements before using them.
Once compiled, the execution time for dynamic SQL compared to static SQL should be equivalent and,
in some cases, faster due to better access plans being chosen by the optimizer.
[1] http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/ad/c0005785.htm
Each time a dynamic statement is executed, the initial compilation cost becomes less of a factor. If
multiple users are running the same dynamic application with the same statements, only the first
application to issue the statement realizes the cost of statement compilation.
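The compile-once behaviour can be illustrated with a toy cache keyed by statement text. Everything here is hypothetical: the class, the fake compile function, and the "plan" tuple only model the idea that the first request pays for compilation and later requests reuse the result.

```python
class StatementCache:
    """Toy stand-in for a shared package cache keyed by statement text."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn
        self.plans = {}
        self.compilations = 0

    def prepare(self, sql_text):
        if sql_text not in self.plans:
            # Only the first request for this text pays the compilation cost.
            self.compilations += 1
            self.plans[sql_text] = self.compile_fn(sql_text)
        return self.plans[sql_text]


def fake_compile(sql_text):
    # Placeholder for the optimizer; returns a made-up "access plan".
    return ("PLAN", sql_text.upper())


cache = StatementCache(fake_compile)
statement = "select name from staff where dept = ?"
for user in ["user1", "user2", "user3"]:
    plan = cache.prepare(statement)

print(cache.compilations)
```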
Differences between Static & Dynamic:
Dynamic SQL is often used by ad hoc query tools, which allow a SQL statement to be created on-the-fly
by a user to satisfy the particular query requirements for that particular situation. After the statement
is customized according to the user's needs, the statement is sent to the database, checked for syntax
errors and privileges required to execute the statement, and compiled in the database where the
statement is carried out by the database server.
Although dynamic SQL provides more flexibility for the end user's query needs, the performance may
not compare to that of a stored procedure whose code has already been analyzed by the SQL
optimizer.
A call-level interface (CLI):
CLI is used to embed SQL code in a host program, such as ANSI C. It is one of the methods that allows a
programmer to embed SQL in different procedural programming languages. When using a call-level
interface, you simply pass the text of a SQL statement into a variable using the rules of the host
programming language.
You can execute the SQL statement in the host program through the use of the variable into which you
passed the SQL text.
Direct SQL:
Direct SQL is where a SQL statement is executed from some form of an interactive terminal. The SQL
results are returned directly to the terminal that issued the statement. Most of this book has focused on
direct SQL. Direct SQL is also referred to as interactive invocation or direct invocation.
Embedded SQL:
Embedded SQL is SQL code used within programs written in other languages, such as Pascal,
FORTRAN, COBOL, and C. SQL code is actually embedded in a host programming language, as
discussed previously, with a call-level interface.
Embedded SQL statements in host programming language codes are commonly preceded by EXEC SQL
and terminated by a semicolon in many cases. Other termination characters include END-EXEC and the
right parenthesis.
Deadlock:
A condition under which a transaction cannot proceed because it is dependent on exclusive resources
that are locked by another transaction, which in turn is dependent on exclusive resources that are in
use by the original transaction.
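The circular dependency can be modelled as a cycle in a "waits for" relationship. The sketch below assumes, for simplicity, that each transaction waits for at most one other transaction; the function name and transaction labels are made up, and this is not how DB2's deadlock detector is implemented.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a wait cycle, or None if there is none.

    `waits_for` maps a transaction to the transaction holding the lock it needs.
    """
    for start in waits_for:
        seen = []
        current = start
        # Follow the chain of waits until it ends or revisits a transaction.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            return seen[seen.index(current):]
    return None


deadlocked = find_deadlock({"T1": "T2", "T2": "T1"})
not_deadlocked = find_deadlock({"T1": "T2", "T2": "T3"})
print(deadlocked, not_deadlocked)
```

In the first case T1 waits on T2 while T2 waits on T1, so neither can proceed; in the second, T3 is not waiting on anyone, so the chain eventually clears.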
Fenced vs. Not Fenced Resources:
A fenced resource executes in a separate process from the database agent. A not-fenced resource
executes in the same process as the database agent.
[Figure: File Buffer Cache; system buffer cache hit]
Buffer Cache / Hit Ratio / Read & Write Request
On a file read request, the file system first attempts to read the requested data from the buffer cache. If
the data is not already present in the buffer cache, it is read from disk and cached in the buffer cache.
Similarly, writes to a file are cached so that future reads can be satisfied without necessitating a disk
access, and to reduce the frequency of disk writes. The use of a file system buffer cache can be
extremely effective when the cache hit rate is high. It also enables the use of sequential read-ahead and
write-behind policies to reduce the frequency of physical disk I/Os.
Another benefit is in making file writes asynchronous, since the application can continue execution
without waiting for the disk write to complete. Figure 3 shows the sequence of actions for a write
request under cached I/O.
Note:
While the file system buffer cache improves I/O performance, it also consumes a significant portion of
system memory.
Dynamic SQL Processing
[Figure: Dynamic SQL Processing. A dynamic SQL statement is assembled and completed at runtime, then
looked up in the global package cache. If an executable access plan already exists, the compiler is not
invoked. If no executable exists, the compiler is invoked and the DB Optimizer produces the access plan
used to reach the data in the table(s).]
SQL Monitoring Diagram
[Figure: SQL is processed by the DB Optimizer into an access plan against the data in the table(s). The
package, which is the executable form of the SQL (the access plan), is stored in the system catalog tables.]
Access Paths (how to get the data; what is my strategy?):
• Index Usage
• Sort Methods
• Lock Semantics
• Join Methods
Sorting:
Sorting applies to data that must be returned in some sequence or order.
DB2 first attempts to satisfy the ordering via index usage. If an index can't be used, a sort occurs.
A sort involves:
• Sort phase
    o Overflowed – the data being sorted cannot fit entirely in the sort heap, so it overflows into
      temporary database tables
    o Non-overflowed – the data fits in the sort heap; this performs better
• Return of the results of the sort phase
    o Piped – the sorted information can be returned directly without requiring a temp table to store a
      final, sorted list of data. This is better!
    o Non-piped – the results require a temp table to be returned
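The overflowed versus non-overflowed distinction can be sketched with a sort that is given a fixed "sort heap" measured in rows. This is a simplified illustration, not DB2's algorithm: the function name is hypothetical, temp tables are modelled as in-memory lists, and the overflow path uses a standard sorted-runs-plus-merge approach.

```python
import heapq


def sort_rows(rows, sort_heap_rows):
    """Sort `rows`, spilling to 'temp tables' when they exceed the sort heap.

    Returns (sorted_rows, overflowed).
    """
    if len(rows) <= sort_heap_rows:
        return sorted(rows), False            # non-overflowed: fits in the heap

    # Overflowed: sort one heap-sized chunk at a time, park each sorted run in
    # a "temp table", then merge the runs into the final ordering.
    temp_tables = []
    for start in range(0, len(rows), sort_heap_rows):
        temp_tables.append(sorted(rows[start:start + sort_heap_rows]))
    return list(heapq.merge(*temp_tables)), True


data = [7, 3, 9, 1, 8, 2, 6]
in_memory = sort_rows(data, sort_heap_rows=10)
spilled = sort_rows(data, sort_heap_rows=3)
print(in_memory)
print(spilled)
```

Both calls return the same ordering; the second simply does more work because the data does not fit in the assumed heap.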
Figure: Non-Piped returns
See SORTHEAP database configuration parameter
Sort Heap:
Block of memory allocated each time a sort is performed.
[Figure: Sort Heap, Sorted Data, Temp Tables]
Transactions:
A set of SQL statements (series of instructions) that execute in a single operation. A transaction is
completed when either an explicit COMMIT or ROLLBACK is encountered.
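COMMIT and ROLLBACK can be demonstrated with any transactional database. The sketch below uses Python's built-in sqlite3 module as a stand-in so it is runnable without DB2; the table and the transfer helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(amount, fail=False):
    """Move `amount` from account 1 to account 2 as a single transaction."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = 1", (amount,))
        if fail:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (amount,))
        conn.commit()        # both updates become permanent together
    except RuntimeError:
        conn.rollback()      # the first update is undone; nothing is half-applied


transfer(30)                 # completes with COMMIT
transfer(30, fail=True)      # completes with ROLLBACK
balances = [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
print(balances)
```

Only the first transfer changes the balances; the failed one leaves no trace.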
Internal Commits:
The total number of commits initiated by the database manager
Locks:
The DB2 locking mechanism attempts to avoid resource conflicts while still providing full data integrity.
Locks are released when the resource is no longer required, at the end of the transaction.
Lock Escalations:
The number of times that locks have been escalated from several row locks to a table lock.
Logs:
As transactions are processed, they are tracked within the log files. DB2 tracks all statements that are
issued within its logs. DB2 uses write-ahead logging to ensure that changes to the database will be
applied. Changes are written to the logs first, then later applied to the physical database tables.
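The log-first ordering can be illustrated with a toy model. This is a heavy simplification and every name in it is hypothetical: the log and the table are plain Python objects, and recovery replays the whole log rather than starting from a checkpoint as a real database would.

```python
class TinyWAL:
    """Toy write-ahead log: changes reach the log before the 'table'."""

    def __init__(self):
        self.log = []       # stand-in for the log files (assumed durable)
        self.table = {}     # stand-in for the physical table on disk
        self.applied = 0    # number of log records already applied to the table

    def write(self, key, value):
        self.log.append((key, value))   # the change is logged first

    def apply_pending(self):
        for key, value in self.log[self.applied:]:
            self.table[key] = value     # later, the change reaches the table
        self.applied = len(self.log)

    def recover(self):
        """After a simulated crash, rebuild the table by replaying the log."""
        self.table, self.applied = {}, 0
        self.apply_pending()


wal = TinyWAL()
wal.write("row1", "a")
wal.apply_pending()
wal.write("row1", "b")          # logged, but not yet applied to the table
before = dict(wal.table)
wal.recover()
after = dict(wal.table)
print(before, after)
```

Because the second change was in the log before the simulated crash, replaying the log brings the table up to date even though the change had not yet been applied.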
.NET and DB2 Model