CM Programs

ASM
OnDemand Concurrent Manager Analysis Programs

Author: Martin Fitzgerald
Creation Date: Friday, June 12 2009
Last Updated: Monday, June 15 2009
Version: 1.0
Approvals:



Table of Contents

Concurrent Manager Analysis
    Structure
Installation
Managing the Historical Data
    Parameters
    Actions
        PLSQL tasks
        SQLPLUS tasks
    Execution Frequency
Job Execution Reports
    Common Concepts
        Job Execution flow
            Execution Time vs Actual Runtime
        Common Filter Parameters
        Switching Data Store
    OOD: Concurrent Program Execution Summary
        Parameters
        Purpose
    OOD: Program execution counts over time
        Parameters
        Purpose
    OOD: Detailed Program runtime history
        Parameters
        Purpose
Manager Execution Reports
    OOD: Show detailed Manager activity
        Parameters
        Example Report
            Report Heading section
            Queue Heading section
            Worker Detail section
            Manager summary section
Known Bugs


Concurrent Manager Analysis

Oracle OnDemand has a number of tools to analyze Concurrent Manager performance. Unfortunately most of these tools have two main drawbacks:

• For the most part the tools are only accessible by OnDemand personnel. While that helps us look at what happened in an environment, it can leave customers frustrated if they just want to understand what is happening on their own environment.

• For the most part the historical data is summarized, and it can be difficult to get the detail you need to compare points in time over long stretches.

To resolve these issues, and to make it easier in general to understand what is happening on any given environment, we have provided a set of basic reports. At present these reports are purely SQL*Plus generated reports. They represent the key scripts we use to understand the performance of the Concurrent Manager subsystem separately from the performance of the jobs executed by the Concurrent Manager.

Structure

In order to use these reports we rely on some tables to store the historical data. The manner in which these tables are populated is described later on. If these tables do not exist, none of the reports will work.

There are 7 tables associated with the historical data store:

• OOD_CONCURRENT_REQUESTS – This table holds much of the information originally contained in the FND_CONCURRENT_REQUESTS table. To keep space usage low and make access to the table faster, only a small subset of the columns is pulled over. These are mostly date columns, although we also capture some characteristics of the job that distinguish it from other jobs.

• OOD_CONCURRENT_PROCESSES – This table holds the historical data for the FND_CONCURRENT_PROCESSES table.

• OOD_CONCURRENT_QUEUES – This table holds the historical data for the FND_CONCURRENT_QUEUES table. As time goes by queues can be created and deleted. By keeping a secondary copy we keep the connection between jobs that ran and the queue they ran in, even if the queue no longer exists.

• OOD_CONCURRENT_QUEUE_SIZE – This table holds the historical data for the FND_CONCURRENT_QUEUE_SIZE table, which is used to provide the number of workers, the poll time and the workshifts for a queue. Using this data we can keep track of when worker targets were changed.

• OOD_CONC_REQ_SETS – It turns out the start and end times of a job are not the real story. This table tries to capture the real “run” times of a job, and the reports use those times where they can.

• OOD_CONCURRENT_QUEUE_CONTENT – This table holds the historical data for the FND_CONCURRENT_QUEUE_CONTENT table, which is used to define the includes and excludes of all the queues.

• OOD_CONC_POP – This table holds the dates on which the historical tables were last updated.

The basic idea is that, once populated, the historical tables mirror the key tables used to hold live data, so most scripts should only need to change the prefix from FND to OOD to view the historical information.


Installation

To install, follow the directions below. For ease of use, just paste this into an SR when asking for a CRT:
______________________________________________________________________
To install the reports:

• cd into the newly created cm_jobs directory
  o cd /autofs/upgrade/OHSPERF/OOD_CMJOBS/cm_jobs

• Run the install.ksh script. You will need to provide the apps password when prompted.
  o ksh install.ksh

• Once installed, the reports cannot be used until the “OOD: Create and Update Historical CM tables” job is executed. To do this, log in as System Administrator and run the job from the SRS screen. Alternatively you can log in to the database via sqlplus as the APPS user and run the “ood_ctables” script:
  o sqlplus apps/password @ood_ctables 700

Once these steps are completed the reports are installed and ready for use.
______________________________________________________________________


Managing the Historical Data

The historical data on most systems should be quite small, since only a reduced column set is taken from the fnd_concurrent_requests table. All activities for managing the space taken are handled by the Concurrent Manager program “OOD: Create and Update Historical CM tables”. This program has the following characteristics:

Parameters

• Historical Days to Keep – This parameter is used to delete entries older than a certain period. At present the default is 800, which means we keep just over 2 years' worth of history.

Actions

This program is split into 2 sections: a PLSQL procedure that manages the tables and a couple of SQL statements that report on the contents.

PLSQL tasks

For each table explicitly listed above this program will perform the following actions in sequence:

1. create_table – If the table does not currently exist it will create it.

2. create_indexes – We create a number of indexes (especially on the OOD_CONCURRENT_REQUESTS table) based on the access patterns of the provided reports. Each time the program is run we check to see if all expected indexes still exist.

3. update_table – We then update the contents with the current data from the live tables.

4. purge_table – Using the parameter provided on submission, the program now deletes older entries.

5. analyze_table – If any of the previous actions resulted in a structure change (a new table or new index) then the table is analyzed.
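The sequence above can be sketched in a few lines of Python. This is only an illustration of the ordering and of the "analyze only after a structure change" rule. The function name, its arguments and the action strings are hypothetical and are not taken from the PLSQL procedure itself.

```python
def plan_table_maintenance(table_exists, existing_indexes, expected_indexes):
    """Return the ordered maintenance actions for one historical table.

    Hypothetical sketch: the real work is done by a PLSQL procedure whose
    internals are not shown in this document.
    """
    actions = []
    structure_changed = False

    # 1. create_table: only when the table is missing.
    if not table_exists:
        actions.append("create_table")
        structure_changed = True

    # 2. create_indexes: (re)create every expected index that is missing.
    for index_name in expected_indexes:
        if not table_exists or index_name not in existing_indexes:
            actions.append("create_index " + index_name)
            structure_changed = True

    # 3 and 4. update_table and purge_table run on every execution.
    actions.append("update_table")
    actions.append("purge_table")

    # 5. analyze_table: only if a table or index was created above.
    if structure_changed:
        actions.append("analyze_table")
    return actions


first_run = plan_table_maintenance(False, set(), ["ood_conc_pop_n1"])
later_run = plan_table_maintenance(True, {"ood_conc_pop_n1"}, ["ood_conc_pop_n1"])
print(first_run)
print(later_run)
```

On a first run the plan includes the create and analyze steps. On a later run with nothing missing only the update and purge steps remain.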

SQLPLUS tasks

To help manage how much space we use for these tables we also report on the size of the tables and their contents.

• Table sizes – We run a script to look at the table sizes. This requires accurate statistics, so the sizes may be off if stats were not gathered recently. An example output is below:

o An initial execution will look similar to the following:

14:40:39 - Create table : OOD_CONCURRENT_REQUESTS
14:40:45 - Create synonym : OOD_CONCURRENT_REQUESTS
14:40:45 - Create index : ood_concurrent_requests_U1 on OOD_CONCURRENT_REQUESTS
14:40:47 - Create index : ood_concurrent_requests_n3 on OOD_CONCURRENT_REQUESTS
14:40:48 - Create index : ood_concurrent_requests_n6 on OOD_CONCURRENT_REQUESTS
14:40:49 - Create index : ood_concurrent_requests_n90 on OOD_CONCURRENT_REQUESTS
14:40:49 - Create index : ood_concurrent_requests_n91 on OOD_CONCURRENT_REQUESTS
14:40:50 - Create index : ood_concurrent_requests_n96 on OOD_CONCURRENT_REQUESTS
14:40:51 - Create index : ood_concurrent_requests_n97 on OOD_CONCURRENT_REQUESTS
14:40:52 - Create index : ood_concurrent_requests_n98 on OOD_CONCURRENT_REQUESTS
14:40:52 - Create index : ood_concurrent_requests_n99 on OOD_CONCURRENT_REQUESTS
14:40:53 - Updating table : OOD_CONCURRENT_REQUESTS
14:40:55 - Purging table : OOD_CONCURRENT_REQUESTS
14:40:55 - Analyze table : OOD_CONCURRENT_REQUESTS
14:41:21 - Create table : CONCURRENT_PROCESSES
14:41:21 - Create synonym : OOD_CONCURRENT_PROCESSES
14:41:21 - Create index : ood_concurrent_processes_N1 on OOD_CONCURRENT_PROCESSES
14:41:21 - Create index : ood_concurrent_processes_N2 on OOD_CONCURRENT_PROCESSES
14:41:21 - Create index : ood_concurrent_processes_u1 on OOD_CONCURRENT_PROCESSES
14:41:21 - Updating table : OOD_CONCURRENT_PROCESSES
14:41:21 - Purging table : OOD_CONCURRENT_PROCESSES
14:41:21 - Analyze table : OOD_CONCURRENT_PROCESSES
14:41:22 - Create table : CONCURRENT_QUEUES
14:41:22 - Create synonym : OOD_CONCURRENT_QUEUES
14:41:22 - Create index : ood_concurrent_queues_N1 on OOD_CONCURRENT_QUEUES
14:41:22 - Updating table : OOD_CONCURRENT_QUEUES
14:41:22 - Purging table : OOD_CONCURRENT_QUEUES
14:41:22 - Analyze table : OOD_CONCURRENT_QUEUES
14:41:22 - Create table : ood_conc_req_sets
14:41:22 - Create synonym : OOD_CONC_REQ_SETS
14:41:22 - Create index : ood_conc_req_sets_u1 on OOD_CONC_REQ_SETS
14:41:22 - Create index : ood_conc_req_sets_n1 on OOD_CONC_REQ_SETS
14:41:22 - Updating table : OOD_CONC_REQ_SETS
14:41:23 - Purging table : OOD_CONC_REQ_SETS
14:41:23 - Analyze table : OOD_CONC_REQ_SETS
14:41:24 - Create index : ood_concurrent_queue_size_n1 on OOD_CONCURRENT_QUEUE_SIZE
14:41:24 - Updating table : OOD_CONCURRENT_QUEUE_SIZE
14:41:24 - Purging table : OOD_CONCURRENT_QUEUE_SIZE
14:41:24 - Analyze table : OOD_CONCURRENT_QUEUE_SIZE
14:41:24 - Create index : ood_concurrent_queue_cont_n1 on OOD_CONCURRENT_QUEUE_CONTENT
14:41:24 - Updating table : OOD_CONCURRENT_QUEUE_CONTENT
14:41:24 - Purging table : OOD_CONCURRENT_QUEUE_CONTENT
14:41:24 - Analyze table : OOD_CONCURRENT_QUEUE_CONTENT
14:41:24 - Create synonym : OOD_CONC_POP
14:41:24 - Create index : ood_conc_pop_n1 on OOD_CONC_POP
14:41:24 - Purging table : OOD_CONC_POP
14:41:24 - Analyze table : OOD_CONC_POP

o A subsequent execution would look like this:

23:37:22 - Updating table : OOD_CONCURRENT_REQUESTS
23:37:40 - Purging table : OOD_CONCURRENT_REQUESTS
23:37:40 - Updating table : OOD_CONCURRENT_PROCESSES
23:37:40 - Purging table : OOD_CONCURRENT_PROCESSES
23:37:41 - Updating table : OOD_CONCURRENT_QUEUES
23:37:41 - Purging table : OOD_CONCURRENT_QUEUES
23:37:41 - Updating table : OOD_CONC_REQ_SETS
23:38:11 - Purging table : OOD_CONC_REQ_SETS
23:38:12 - Updating table : OOD_CONCURRENT_QUEUE_SIZE
23:38:12 - Purging table : OOD_CONCURRENT_QUEUE_SIZE
23:38:12 - Updating table : OOD_CONCURRENT_QUEUE_CONTENT
23:38:12 - Purging table : OOD_CONCURRENT_QUEUE_CONTENT
23:38:12 - Purging table : OOD_CONC_POP

Note that in the second execution we didn’t have to create the indexes or analyze the tables.

Table contents – Next we show how many records are in the largest of the tables (OOD_CONCURRENT_REQUESTS) on a day by day basis. An example output is below :

Sizes are in MB

OWNER    TABLE_NAME                        LOGICAL   PHYSICAL DIFFERENCE   NUM_ROWS
-------- ------------------------------ ---------- ---------- ---------- ----------
APPLSYS  OOD_CONCURRENT_QUEUE_SIZE             .02        .10        .08        201
APPLSYS  OOD_CONCURRENT_QUEUES                 .01        .10        .09         83
APPLSYS  OOD_CONCURRENT_QUEUE_CONTENT          .01        .10        .09        169
APPLSYS  OOD_CONC_POP                          .00        .13        .12          2
APPLSYS  OOD_CONCURRENT_PROCESSES              .75        .92        .17       3495
APPLSYS  OOD_CONC_REQ_SETS                     .94       1.30        .35      19818
APPLSYS  OOD_CONCURRENT_REQUESTS             52.16      63.15      10.99     372033
                                        ---------- ----------
sum                                          53.89      65.80

These contents then would say that we are using about 66MB of space.

The last step in the program is to review how many records are in the DB on a day-by-day basis. This also gives a basic measure of whether the number of jobs is increasing over time. Example output is below:

          Number of Processing
DATES          Jobs      Hours
--------- --------- ----------
22-MAY-09     19426        419
23-MAY-09      9045        186
24-MAY-09      6559        199
25-MAY-09      7029        118
26-MAY-09     22005        223
27-MAY-09     21198        203
28-MAY-09     21749        333
29-MAY-09     19846        356
30-MAY-09      4954        298
31-MAY-09      8205        625
01-JUN-09     17249        356
02-JUN-09     23811        404
03-JUN-09     20586        211
04-JUN-09     20596        234
05-JUN-09     18288        272
06-JUN-09      8302        281
07-JUN-09      7325        221
08-JUN-09     21082        204
09-JUN-09     20192        249
10-JUN-09     30229        249
11-JUN-09     29758        241

From this information I can see we are running about 20,000 jobs a day, although the last 2 days had a significant increase, going all the way to 30,000. The number of processing hours shows how many hours it would take to run all the jobs single threaded (i.e. it is the combined run time of all jobs).
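As a rough illustration of what "processing hours" means here, the sketch below sums the individual run times of each day's jobs. It assumes a job is counted on the day it started, which this document does not state, and the function and sample data are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def daily_job_summary(jobs):
    """jobs: iterable of (actual_start, actual_completion) datetimes.

    Returns {date: (job_count, processing_hours)}, where processing hours
    is the combined run time of the day's jobs (single-threaded time).
    Assumption: a job belongs to the day on which it started.
    """
    counts = defaultdict(int)
    seconds = defaultdict(float)
    for start, end in jobs:
        day = start.date()
        counts[day] += 1
        seconds[day] += (end - start).total_seconds()
    return {day: (counts[day], seconds[day] / 3600.0) for day in counts}


# Hypothetical sample: two overlapping jobs on one day, one job on the next.
sample = [
    (datetime(2009, 6, 10, 8, 0), datetime(2009, 6, 10, 9, 30)),
    (datetime(2009, 6, 10, 8, 15), datetime(2009, 6, 10, 8, 45)),
    (datetime(2009, 6, 11, 23, 0), datetime(2009, 6, 11, 23, 30)),
]
summary = daily_job_summary(sample)
for day in sorted(summary):
    print(day, summary[day])
```

The two jobs on the first day overlap in wall-clock time, yet they still add up to 2 processing hours. This is why a day can show far more than 24 processing hours.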


Execution Frequency

Since this program manages the population and deletion of records in the historical data store, it's important that it be run frequently. At present I recommend executing it nightly.


Job Execution Reports

Once installed there will be 3 Job Execution reports available:

• OOD: Concurrent Program Execution Summary
• OOD: Program execution counts over time
• OOD: Detailed Program runtime history

This section of the document first goes over some common concepts used by these reports and then goes into the specifics of each report. When viewing any of these reports use the “View Output” button. The output may exist in both the log and out files; however, only the output shown by the “View Output” button is properly formatted.

Common Concepts

Job Execution flow

Before analyzing these reports it's important to understand the flow of how a job begins executing.

1. Job Submission – The time a job is submitted into the system is recorded in the FND_CONCURRENT_REQUESTS table as the “REQUEST_DATE”. The requested_start_date of a job request can actually be in the past. For example, a third party system could submit a job at 11am Jan 2nd 2009 with a requested_start_date of 11am Jan 1st 2009. To reconcile this issue we take the greater of these 2 dates to determine the effective requested_start_date.

2. Once a job is submitted it will have a phase code of ‘P’. If there are any incompatibilities between that job and any other job on the system it will have a status code of ‘Q’ for standby; otherwise it will have a status code of ‘I’ (Pending normal).

3. When the sysdate is greater than the requested_start_date one of 2 things happens:

   a. If the status is ‘Q’ the job remains in standby mode until the Conflict Resolution Manager (CRM) releases it. Once the CRM releases the job it updates the CRM_RELEASE_DATE column to the current time and the job goes into Pending normal mode.

   b. If the job is already in Pending normal mode it will be available to be picked up by a running manager.

4. When a worker wakes up to poll it will check to see if there are any jobs waiting to run. At this time it will lock the record, move it to the running phase and begin executing the job.

5. When the job completes the manager takes care of any cleanup work (printing, rescheduling …) and completes the job by setting it to a complete phase.

In general then we are trying to track and understand the times associated with the following phases:

Status Code             Q              I          R
Status meaning          Standby        Pending    Running
Common Interpretation   Incompatible   Waiting    Executing

All the reports will break out the times by these 3 states.
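The split into the 3 states can be sketched as follows. This is a minimal illustration built only from the flow described above: the effective start is the greater of the request date and the requested start date, Standby ends at the CRM release, Pending ends at the actual start, and Running ends at completion. The function name is hypothetical, and clamping negative gaps to zero is an assumption, not something the reports are known to do.

```python
from datetime import datetime, timedelta


def phase_times(request_date, requested_start_date, crm_release_date,
                actual_start_date, actual_completion_date):
    """Return (standby, pending, running) as timedeltas for one request."""
    zero = timedelta(0)
    # A requested start in the past is replaced by the submission time.
    effective_start = max(request_date, requested_start_date)

    if crm_release_date is not None:
        # The job sat in Standby until the CRM released it.
        standby = max(crm_release_date - effective_start, zero)
        pending_from = max(crm_release_date, effective_start)
    else:
        # No conflict: the job went straight to Pending normal.
        standby = zero
        pending_from = effective_start

    pending = max(actual_start_date - pending_from, zero)
    running = actual_completion_date - actual_start_date
    return standby, pending, running


day = datetime(2009, 6, 14)
standby, pending, running = phase_times(
    request_date=day.replace(hour=15, minute=11, second=34),
    requested_start_date=day.replace(hour=15, minute=11, second=34),
    crm_release_date=day.replace(hour=15, minute=18, second=47),
    actual_start_date=day.replace(hour=15, minute=18, second=54),
    actual_completion_date=day.replace(hour=15, minute=22, second=55),
)
print(standby, pending, running)
```

With these sample timestamps the job spends 7 minutes 13 seconds in Standby, 7 seconds Pending and 4 minutes 1 second Running.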


Execution Time vs Actual Runtime

The main issue with execution time is that it often doesn't represent the load a job puts on the system. As an example let's take a job which executes 10 child processes. The master program submits a child and waits for the child to complete, then it submits the next one. Overall the master program's load on the system is tiny, perhaps accounting for a few seconds of runtime. However, its completion time may be hours later than its start time. To reconcile this we have created a table called OOD_CONC_REQ_SETS, which tries to capture the real load runtime of a job instead of the recorded runtime. Where possible we use this metric to account for job load rather than just “completion time – start time”.
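To make the difference concrete, here is one possible way to estimate a master job's own runtime: subtract the time during which at least one of its children was running from its recorded elapsed time. This document does not describe how OOD_CONC_REQ_SETS actually derives its figure, so the approach below is purely an assumption for illustration, and the function name and sample numbers are hypothetical.

```python
def own_runtime_seconds(parent, children):
    """Estimate the seconds a parent spent doing its own work.

    parent and children are (start, end) pairs in seconds. Time covered
    by at least one child is treated as waiting, not as load caused by
    the parent. This is an assumed method, not the one the tool uses.
    """
    p_start, p_end = parent
    # Clip each child to the parent's window, then merge overlapping spans.
    clipped = sorted((max(s, p_start), min(e, p_end)) for s, e in children)
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if e <= s:
            continue  # child ran entirely outside the parent's window
        if cur_end is None or s > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return (p_end - p_start) - covered


# A master that runs 10 children one after another, 600 seconds each,
# with a 1 second gap before each child.
master = (0, 6010)
kids = [(i * 601 + 1, i * 601 + 601) for i in range(10)]
print(own_runtime_seconds(master, kids))
```

The recorded runtime of this master is 6010 seconds, but under this assumption only 10 seconds are attributed to the master itself.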

Common Filter Parameters

These 3 jobs share a number of parameters that can be used to filter the selection criteria.

• Job Filters – These filters on the job being executed are useful in getting a handle on the most used jobs, applications, queues or middle tier CM nodes in the system.

  o Program Short Name – Find the short name from the Concurrent Program Define screen.

  o Application Short Name – Program Application Short name (for example AR, ONT). Remember that GL is actually SQLGL and AP is SQLAP.

  o Concurrent Queue – This is the Concurrent Manager short name as seen in the second field of the Concurrent Manager Define screen.

  o Hostname – Concurrent Manager host name. Only relevant in a PCP configuration.

• Job execution times – The default for the following 3 parameters is 0. Essentially these are used to grab any jobs which were in conflict for a long time (Standby time), waiting for a worker for a long time (Pending time) or running for a long time (Running time). Times are in whole minutes.

  o Min Standby Time
  o Min Wait Time
  o Min Run Time

• Timeframe – Start and end days for the reports to be looking for jobs. Starting Day begins at 00:00:00 of the specified day and Ending Day ends at 23:59:59 of the specified day.

  o Starting Day
  o Ending Day

• Business Day definition – Most of the time we are most concerned about jobs running during the business day. To exclude jobs running early in the morning, set the Start of Business Day to a more appropriate time (for example 08:00:00). Similarly, to exclude jobs run at night, bring the End of Business Day earlier, to say 5pm (e.g. 17:00:00).

  o Start of Business Day
  o End of Business Day
  o Exclude Weekends – Excluding weekends may also make sense. Setting this to Y will exclude jobs which ran on Saturday and Sunday.

• Submitting User filters

  o Application User Name – Select only a specific Application user's jobs.
  o Application Resp Name – Select only those jobs submitted from a specific Responsibility.
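The Timeframe and Business Day filters combine roughly as in the sketch below. It is a minimal illustration under two assumptions: the filters are applied to a single timestamp per job (here called started_at), and the business day bounds are inclusive. The function and parameter names are hypothetical and do not come from the report scripts.

```python
from datetime import date, datetime, time


def in_report_window(started_at, starting_day, ending_day,
                     business_start=time(0, 0, 0),
                     business_end=time(23, 59, 59),
                     exclude_weekends=False):
    """Return True when a job timestamp passes the date and time filters."""
    # Starting Day begins at 00:00:00, Ending Day ends at 23:59:59.
    window_start = datetime.combine(starting_day, time(0, 0, 0))
    window_end = datetime.combine(ending_day, time(23, 59, 59))
    if not (window_start <= started_at <= window_end):
        return False
    # Keep only jobs inside the business day.
    if not (business_start <= started_at.time() <= business_end):
        return False
    # weekday() is 5 for Saturday and 6 for Sunday.
    if exclude_weekends and started_at.weekday() >= 5:
        return False
    return True


june_start, june_end = date(2009, 6, 1), date(2009, 6, 30)
friday_morning = datetime(2009, 6, 12, 9, 0, 0)
saturday_morning = datetime(2009, 6, 13, 10, 0, 0)
print(in_report_window(friday_morning, june_start, june_end,
                       time(8, 0, 0), time(17, 0, 0), True))
print(in_report_window(saturday_morning, june_start, june_end,
                       time(8, 0, 0), time(17, 0, 0), True))
```

The Friday 09:00 job passes all three checks, while the Saturday job is dropped by the weekend filter.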

Switching Data Store

All the reports include the following flag:

• Live or Historical Tables – By default the reports run against the historical data store populated by the job “OOD: Create and Update Historical CM tables”. Switching this to “Live” makes the report run against the live FND tables instead.


OOD: Concurrent Program Execution Summary

Parameters

Besides the filters mentioned in the common concepts section the following parameters are used:

• This report adds the capability of providing 3 unique column definitions which will be used as both the first 3 columns and the group by clause. The choices which can be used are as follows:

  o QUEUE – Concurrent manager queue short name
  o Application – Program Application short name
  o Hostname – Concurrent Manager host name
  o Responsibility – The Responsibility of the submitting user
  o Username – The Username of the submitting user
  o Short_Program_name – Concurrent program name
  o Full_program_name – User familiar concurrent program name
  o NONE – No filter on this column.

  Each of the first 3 columns can be any of the above values.

• The other unique parameter is the “Order By” parameter. This allows you to sort the report either by the number of jobs run or by the load the jobs represented on the system.

  o JOBCOUNT – Orders the report by number of executions
  o JOBLOAD – Orders the report by summed runtime.

Purpose

The point of this report is to try and answer some potentially complex questions. For example:

1. What job is run out of the GL application the most during the day?
2. Who has been submitting all those FSG reports?
3. What machine is getting the most load?

Given the common filters and the addition of 3 unique groupings it should be fairly easy to drill down into which jobs, users or responsibilities are causing the most load on the environment.
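The grouping and ordering idea can be sketched like this. It is only an illustration of "group by up to 3 chosen columns, then sort by count or by summed runtime". The dictionary keys, the run_seconds field and the descending sort order are assumptions, not details taken from the report itself.

```python
from collections import defaultdict


def summarize(jobs, group_columns, order_by="JOBCOUNT"):
    """Group jobs by the chosen columns and sort by count or by load.

    jobs: list of dicts holding the grouping columns plus "run_seconds".
    group_columns: up to 3 column names; "NONE" entries are skipped.
    Returns rows of (group_key, job_count, summed_run_seconds).
    """
    totals = defaultdict(lambda: [0, 0.0])
    for job in jobs:
        key = tuple(job[col] for col in group_columns if col != "NONE")
        totals[key][0] += 1
        totals[key][1] += job["run_seconds"]
    rows = [(key, count, load) for key, (count, load) in totals.items()]
    sort_index = 1 if order_by == "JOBCOUNT" else 2
    rows.sort(key=lambda row: row[sort_index], reverse=True)
    return rows


# Hypothetical sample data.
sample_jobs = [
    {"Application": "SQLGL", "Username": "ALICE", "run_seconds": 30},
    {"Application": "SQLGL", "Username": "ALICE", "run_seconds": 30},
    {"Application": "SQLAP", "Username": "BOB", "run_seconds": 600},
]
by_count = summarize(sample_jobs, ["Application", "Username", "NONE"], "JOBCOUNT")
by_load = summarize(sample_jobs, ["Application", "Username", "NONE"], "JOBLOAD")
print(by_count[0])
print(by_load[0])
```

The same data gives a different top row depending on the ordering: the most frequently run group comes first under JOBCOUNT, and the group with the largest summed runtime comes first under JOBLOAD.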


OOD: Program execution counts over time

Parameters

Besides the filters mentioned in the common concepts section the following parameters are used:

• Interval – This parameter will display a record per interval detailing the job performance for any selected jobs within the interval. Available options are:

  o HOURS – The default value. Shows job performance by hour
  o MINUTES – Shows job performance by minute
  o DAYS – A day by day comparison
  o WEEKS – Provides a week by week comparison using the week number of the year
  o MONTHS – A month by month comparison

Purpose

This report can be used to compare execution times over specific time periods. It's useful in determining execution rates and the load incurred by jobs, and how these have changed over long periods of time.
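One way to picture the Interval parameter is as a function that truncates each job timestamp to its bucket, so that jobs in the same bucket can be counted together. The sketch below assumes ISO week numbering for WEEKS. The report may number weeks differently, and the function name and key formats are hypothetical.

```python
from collections import Counter
from datetime import datetime


def interval_key(ts, interval="HOURS"):
    """Return the bucket label a timestamp falls into."""
    if interval == "MINUTES":
        return ts.strftime("%Y-%m-%d %H:%M")
    if interval == "HOURS":
        return ts.strftime("%Y-%m-%d %H")
    if interval == "DAYS":
        return ts.strftime("%Y-%m-%d")
    if interval == "WEEKS":
        # Assumption: ISO week number of the year.
        year, week, _ = ts.isocalendar()
        return "%d-W%02d" % (year, week)
    if interval == "MONTHS":
        return ts.strftime("%Y-%m")
    raise ValueError("unknown interval: " + interval)


# Hypothetical job start times.
starts = [
    datetime(2009, 6, 14, 15, 18, 54),
    datetime(2009, 6, 14, 15, 19, 26),
    datetime(2009, 6, 14, 16, 2, 0),
]
per_hour = Counter(interval_key(ts, "HOURS") for ts in starts)
per_month = Counter(interval_key(ts, "MONTHS") for ts in starts)
print(per_hour)
print(per_month)
```

Switching the interval only changes the bucket label. The counting step stays the same, which is what makes the hour, day, week and month views directly comparable.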


OOD: Detailed Program runtime history

Parameters

Besides the filters mentioned in the common concepts section the following parameters are used:

• REQUEST_ID – If you know the specific request ID you are interested in you can specify it here.

Purpose

This report lists, line by line, all the jobs that meet the filter criteria. Included in the report will be:

• Requested start date
• CRM Release date
• Actual start date
• Actual completion date

In addition the times between each milestone are displayed.

This report is more useful for analyzing the details of specific jobs than for aggregating them, where the averages can distort the analysis.

If the REQUEST_ID parameter is specified then the report will display the request ID and all its child processes in an indented fashion. For example the following report shows me all the child processes of request ID 52065923:

REQUEST_ID           QUE_NAME        CONCURRENT_PROG USER_NAME  REQUEST_DATE     ...
-------------------- --------------- --------------- ---------- -----------------...
52065923             CUST_MRP        MRCNSP          PPCBATCH   09-06-14 15:11:34...
 52065967            CUST_MRP        MRCMON          PPCBATCH   09-06-14 15:18:54...
 52065968            CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:26...
 52065974            CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
 52065975            CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
 52065976            CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
 52065977            CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
 52065978            CUST_MRP        MRCSDW          PPCBATCH   09-06-14 15:19:27...
 52065969            CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
 52065970            CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
 52065971            CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
 52065972            CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
 52065973            CUST_MRP        MRCNSW          PPCBATCH   09-06-14 15:19:26...
 52065994            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:17...
 52065996            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:37...
 52065997            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:42...
 52065998            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:42...
 52066000            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:46...
 52066001            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:21:49...
 52066002            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:22:00...
 52066003            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:22:15...
 52066004            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:22:15...
 52066005            CUST_MRP        MRCSLD          PPCBATCH   09-06-14 15:22:16...
 52066006            CUST_MRP        MRCNEW          PPCBATCH   09-06-14 15:22:16...
 52066011            CUST_MRP        MRPEXPWF        PPCBATCH   09-06-14 15:22:58...
 52066012            CUST_MRP        MRPAUREL        PPCBATCH   09-06-14 15:22:58...
 52066010            CUST_MRP        MRCEAP          PPCBATCH   09-06-14 15:22:54...


Using this option we can see that job 52065923 kicked off job 52065967 which kicked off job 52065968 … The indented feature allows us to see the relationship between parent, sibling and child for these processes.
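The indented layout can be produced by walking the parent and child links depth first. The sketch below assumes each request records its parent's request ID, as the PARENT_REQUEST_ID column of FND_CONCURRENT_REQUESTS does. Whether the report uses that column is an assumption, and the function name and the two-space indent are hypothetical.

```python
def format_request_tree(root_id, parent_of):
    """Return the request IDs under root_id as indented text lines.

    parent_of maps a child request ID to its parent request ID.
    """
    children = {}
    for request_id, parent_id in parent_of.items():
        children.setdefault(parent_id, []).append(request_id)

    lines = []

    def walk(request_id, depth):
        # Indent each request by two spaces per level below the root.
        lines.append("  " * depth + str(request_id))
        for child_id in sorted(children.get(request_id, [])):
            walk(child_id, depth + 1)

    walk(root_id, 0)
    return lines


# Only the two relationships stated in the text are used here.
links = {52065967: 52065923, 52065968: 52065967}
tree = format_request_tree(52065923, links)
print("\n".join(tree))
```

Each level of indentation corresponds to one "kicked off by" step, so siblings line up under their common parent.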


Manager Execution Reports

The following report is more about the execution of the concurrent manager itself than about the jobs within it. The report name is “OOD: Show detailed Manager activity” and the output is a somewhat graphical depiction of the activity of the manager during a short period of time (between 1 and 120 minutes).

OOD: Show detailed Manager activity

Parameters

There are only a few parameters used by this program:

• Starting Date and Time – Start time for when the analysis should begin
• Reporting time in mins – Amount of time to report on. Max is 120 mins (2 hours).
• Queue Name to display – Setting this to something other than % will limit the program to show only those queues that match the filter.
• Live or Historical Tables – Setting this to LIVE will switch the program to look at the live FND tables rather than the historical data store.

Example Report

Below is an example report, followed by an explanation of how to interpret it (the report is from an actual customer; however, the job names have been replaced):

Looking at jobs submitted between 06/14/2009 15:18:00 and 06/14/2009 15:33:00
*** Process Flags - Running=*, Pending=-, Standby=_, Paused=~
*** Running...but sleeping=z
*** Manager Flags - Running=*, Idle =.
/* Processing requests for concurrent queue: CUST_MRP                 |               |
Manager                            Request  CRM      Actual   Actual  |  2         3  |
Proc.  Request-Program short name  Start    Release  Start    End     |890123456789012|
------ --------------------------- -------- -------- -------- --------|---------------|
298663 52065923-MRCNSP             15:11:34 15:18:47 15:18:54 15:22:55|zzzz*          |
|      52066010-MRCEAP             15:22:54          15:22:55 15:22:55|    *          |
|      52066012-MRPAUREL           15:22:58 15:23:19 15:23:25 15:23:25|    _*         |
|      --- Mgr Process(1298663) 06/07/09 09:13:32                     |zzzz**.........|
\___________________________________
298664 52065931-MRCNSW             15:12:46 15:13:14 15:13:21 15:18:01|*              |
|      52065959-MRCSLD             15:18:01          15:18:01 15:18:01|*              |
|      52065977-MRCSDW             15:19:27          15:19:32 15:19:33| *             |
|      52065970-MRCNSW             15:19:26 15:19:47 15:20:03 15:21:49| -**           |
|      52066000-MRCSLD             15:21:46          15:21:49 15:21:49|   *           |
|      52066001-MRCSLD             15:21:49          15:21:49 15:21:50|   *           |
|      52066006-MRCNEW             15:22:16 15:22:18 15:22:20 15:22:58|    *          |
|      52066011-MRPEXPWF           15:22:58          15:22:58 15:23:28|    **         |
|      --- Mgr Process(1298664) 06/07/09 09:13:32                     |******.........|
\___________________________________
298666 52065919-MRCNSP             15:11:24 15:12:13 15:12:16 15:18:28|*              |
|      52065962-MRCEAP             15:18:27          15:18:28 15:18:29|*              |
|      52065974-MRCSDW             15:19:27          15:19:29 15:19:33| *             |
|      52065978-MRCSDW             15:19:27          15:19:33 15:19:33| *             |
|      52065971-MRCNSW             15:19:26 15:19:47 15:20:03 15:22:16| -***          |
|      52066005-MRCSLD             15:22:16          15:22:16 15:22:17|    *          |
|      --- Mgr Process(1298666) 06/07/09 09:13:32                     |*****..........|
\___________________________________
298667 52065927-MRCMON             15:12:16 15:12:44 15:12:46 15:18:06|*              |
298669 52065960-MRCNEW             15:18:01 15:18:16 15:18:23 15:18:29|*              |
|      52065963-MRPEXPWF           15:18:29          15:18:29 15:19:03|**             |
|      52065994-MRCSLD             15:21:17          15:21:34 15:21:36|   *           |
|      --- Mgr Process(1298669) 06/07/09 09:13:32                     |**.*...........|
\___________________________________
298673 52065967-MRCMON             15:18:54 15:19:17 15:19:25 15:22:17|_zzz*          |
298676 52065968-MRCSDW             15:19:26          15:19:26 15:19:43| *             |
|      52065997-MRCSLD             15:21:42          15:21:43 15:21:44|   *           |
|      52065998-MRCSLD             15:21:42          15:21:44 15:21:45|   *           |
|      52066003-MRCSLD             15:22:15          15:22:15 15:22:15|    *          |
|      52066004-MRCSLD             15:22:15          15:22:16 15:22:16|    *          |
|      --- Mgr Process(1298676) 06/07/09 09:13:32                     |.*.**..........|
\___________________________________
298677 52065976-MRCSDW             15:19:27          15:19:31 15:19:34| *             |
|      52065973-MRCNSW             15:19:26 15:19:47 15:20:04 15:22:16| -***          |
|      --- Mgr Process(1298677) 06/07/09 09:13:32                     |.****..........|
\___________________________________
298680 52065969-MRCNSW             15:19:26 15:19:47 15:19:55 15:22:01| ****          |
|      52066002-MRCSLD             15:22:00          15:22:01 15:22:01|    *          |
|      --- Mgr Process(1298680) 06/07/09 09:13:32                     |.****..........|
\___________________________________
298681 52065996-MRCSLD             15:21:37          15:21:38 15:21:39|   *           |
298682 52065975-MRCSDW             15:19:27          15:19:31 15:19:33| *             |
|      52065972-MRCNSW             15:19:26 15:19:47 15:20:03 15:21:53| -**           |
|      52065964-MRPAUREL           15:18:29 15:23:19 15:23:23 15:23:25|_____*         |
|      --- Mgr Process(1298682) 06/07/09 09:13:32                     |.***.*.........|
\___________________________________
****** Process  Jobs   Busy%  - Manager Summary for CUST_MRP
|      -------- ------ ------
|      1298663       3   26.8 06/07/09 09:13:32                       |zzzz**.........|
|      1298664       8  19.73 06/07/09 09:13:32                       |******.........|
|      1298665       0      0 06/07/09 09:13:32                       |...............|
|      1298666       6  18.67 06/07/09 09:13:32                       |*****..........|
|      1298667       1    .67 06/07/09 09:13:32                       |*..............|
|      1298668       0      0 06/07/09 09:13:32                       |...............|
|      1298669       3   4.67 06/07/09 09:13:32                       |**.*...........|
|      1298670       0      0 06/07/09 09:13:32                       |...............|
|      1298671       0      0 06/07/09 09:13:32                       |...............|
|      1298672       0      0 06/07/09 09:13:32                       |...............|
|      1298673       1  19.13 06/07/09 09:13:32                       |.zzz*..........|
|      1298674       0      0 06/07/09 09:13:32                       |...............|
|      1298675       0      0 06/07/09 09:13:32                       |...............|
|      1298676       5   2.13 06/07/09 09:13:32                       |.*.**..........|
|      1298677       2     15 06/07/09 09:13:32                       |.****..........|
|      1298678       0      0 06/07/09 09:13:32                       |...............|
|      1298679       0      0 06/07/09 09:13:32                       |...............|
|      1298680       2     14 06/07/09 09:13:32                       |.****..........|
|      1298681       1    .13 06/07/09 09:13:32                       |...*...........|
|      1298682       3   12.6 06/07/09 09:13:32                       |.***.*.........|
\___________________________________

Report Heading sectionLooking at jobs submitted between 06/14/2009 15:18:00 and 06/14/2009 15:33:00*** Process Flags - Running=*, Pending=-, Standby=_, Paused=~*** Running...but sleeping=z *** Manager Flags - Running=*, Idle =.

The heading section shows the start and end time of the report, which should match the criteria provided at submission time. It also lists the characters used to represent the various states a program can be in:

Running - The concurrent manager has begun processing the job
Pending - Process is waiting for a manager to run the job
Standby - Process is in conflict with another job and cannot run
Paused - Job has submitted a child process and is now paused waiting for the child process to complete
Running … but sleeping - Job has submitted a child process but did not go into pause state. Instead it is listed as Running by the concurrent manager and is taking up a slot in the queue. The job is actually executing a “sleep” command as it waits for the child process to complete.

The queue is started with a number of workers, and each worker is either executing a job or waiting to execute one. The flags used to represent these 2 states are also displayed:

Running - Manager is actually processing a job
Idle - Manager has no work to do and is not processing any jobs.

Queue Heading section

/* Processing requests for concurrent queue: CUST_MRP                 |               |
Manager                            Request  CRM      Actual   Actual  |  2         3  |
Proc.  Request-Program short name  Start    Release  Start    End     |890123456789012|
------ --------------------------- -------- -------- -------- --------|---------------|

Each queue will have its own heading section. In this case we can see that the section covers the activity of the CUST_MRP queue. The columns are as follows:

Manager Proc. - A queue spawns several workers to deal with the jobs in the queue. Each worker has a unique “Manager Process ID”. Tracking the performance of a queue therefore means tracking the performance of each worker and making sure that the workers are picking up jobs quickly. For that reason this report sorts the jobs by the worker process that executed them.

Request-Program short name - This column provides the request ID and a substring of the short name of the program being executed by the manager process.

The next 4 columns provide the key milestones of a job:
o Request Start - Greater of REQUEST_DATE and REQUESTED_START_DATE
o CRM Release - May be blank if there are no incompatibilities; otherwise it will contain the time the CRM released the job to run
o Actual Start - Actual start time
o Actual End - Actual completion time
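The four milestones can be turned into waiting and running durations. The following is a minimal sketch, not the report's own code: the function name `job_phases`, the helper `at`, and the treatment of the interval before CRM release as standby time and the interval after it as pending time are assumptions made for illustration. The example values come from request 52065973-MRCNSW in the sample report, with REQUEST_DATE assumed equal to REQUESTED_START_DATE.

```python
from datetime import datetime


def job_phases(request_date, requested_start_date, crm_release, actual_start, actual_end):
    """Split one job's life into standby, pending and running seconds."""
    # Request Start is the greater of the two request timestamps.
    request_start = max(request_date, requested_start_date)
    # A blank CRM Release (None here) means the job never waited on the CRM.
    release = crm_release if crm_release is not None else request_start
    return {
        "request_start": request_start,
        "standby_s": (release - request_start).total_seconds(),
        "pending_s": (actual_start - release).total_seconds(),
        "running_s": (actual_end - actual_start).total_seconds(),
    }


def at(hms):
    """Illustrative helper: a time of day on the sample report date 06/14/2009."""
    return datetime.strptime("06/14/2009 " + hms, "%m/%d/%Y %H:%M:%S")


# Request 52065973-MRCNSW from the sample report.
phases = job_phases(at("15:19:26"), at("15:19:26"), at("15:19:47"),
                    at("15:20:04"), at("15:22:16"))
print(phases["standby_s"], phases["pending_s"], phases["running_s"])
```

For this request the sketch reports 21 seconds before release, 17 seconds waiting for a manager, and 132 seconds running.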

The next set of columns represents time in minutes. Since this report was run for a 15 minute period there are 15 columns in this section, and each column represents 1 of those 15 minutes. Each column is labelled with the last digit of its minute, and at every 10 minute boundary the heading shows whether the report has reached 10, 20, 30, 40 or 50 minutes past the hour. In this case we can see that the report spanned both the 20 minute and 30 minute marks. If the report were run against 120 minutes, this section alone would be 120 columns wide. For that reason the report is limited to a maximum of 120 minutes.
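To make the per-minute columns concrete, here is a rough sketch of how one job's row could be derived from its milestones. It is an illustration only: the name `minute_flags` and the rule that running wins over pending, and pending over standby, within a minute are inferred from the sample rows, not taken from the scripts. The paused and sleeping flags are left out because they depend on child-process data that is not among the milestone columns.

```python
from datetime import datetime, timedelta


def minute_flags(window_start, minutes, request_start, crm_release, actual_start, actual_end):
    """Return one flag character per minute of the report window."""
    # A blank CRM Release (None) means there was no standby phase.
    release = crm_release if crm_release is not None else request_start
    # Phases in priority order (assumed): running, then pending, then standby.
    phases = [
        ("*", actual_start, actual_end),
        ("-", release, actual_start),
        ("_", request_start, release),
    ]
    row = []
    for i in range(minutes):
        lo = window_start + timedelta(minutes=i)
        hi = lo + timedelta(minutes=1)
        flag = " "
        for char, start, end in phases:
            # The phase touches this minute, even if only for an instant.
            if start < hi and end >= lo:
                flag = char
                break
        row.append(flag)
    return "".join(row)


def at(hms):
    """Illustrative helper: a time of day on the sample report date 06/14/2009."""
    return datetime.strptime("06/14/2009 " + hms, "%m/%d/%Y %H:%M:%S")


window = at("15:18:00")
# 52065973-MRCNSW: released by the CRM, then waited for a manager.
print("|" + minute_flags(window, 15, at("15:19:26"), at("15:19:47"),
                         at("15:20:04"), at("15:22:16")) + "|")
# 52066012-MRPAUREL: standby in minute 22, ran within minute 23.
print("|" + minute_flags(window, 15, at("15:22:58"), at("15:23:19"),
                         at("15:23:25"), at("15:23:25")) + "|")
```

Under these assumptions a job that runs for less than a second still gets a * for its minute, which is why the graph should be read together with the milestone times.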


Worker Detail section

298663 52065923-MRCNSP             15:11:34 15:18:47 15:18:54 15:22:55|zzzz*          |
|      52066010-MRCEAP             15:22:54          15:22:55 15:22:55|    *          |
|      52066012-MRPAUREL           15:22:58 15:23:19 15:23:25 15:23:25|    _*         |
|      --- Mgr Process(1298663)    06/07/09 09:13:32                  |zzzz**.........|
\___________________________________

Each worker is displayed in numerical order.
LINE1 - The first worker in the list ran 3 jobs. The first job spent nearly all of its time waiting on child processes: it was in “Running … but Sleeping” mode for most of the period shown (as denoted by the “z” character). Review the details of the report example in the previous section to see what it was waiting on.

LINE2 - The second job, MRCEAP, is not incompatible with anything so the CRM Release column is blank. In addition we can see that the time between the requested start and the actual start is 1 second. Each subsequent line executed by the same worker has a “|” instead of the worker number to indicate that it is part of the same worker flow.

LINE3 - The 3rd job does have incompatibilities defined and it had to wait. Since it was requested to run at 15:22:58 it fell into Standby for the minute designated 22. It was released 21 seconds later, but based on when it was submitted and released it shows up in the report with an “_” to designate standby mode. It also took less than a second to run; even so, we show it as having run during that minute.

Clearly the graph isn’t exact. The intent is to show some representation of the busyness of a queue and how jobs reacted to that activity. The detailed milestone times are also supplied to give context to the graphical representation. As can be seen, many fast jobs can run within the same minute.

LINE4 - The next line sums up the activity of the worker that ran these jobs. It shows when the worker started and, if the worker has exited, its end time as well. In addition it pulls together the statuses from the individual jobs so it is clear how busy the worker was.

LINE5 - The last line for each worker is a separator line that marks the end of the worker.
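The way the summary line pulls the job rows together can be sketched as follows. This is an illustration under assumptions, not the script's logic: judging from the sample output, a minute counts as Running (*) if any job ran in it, keeps the sleeping flag (z) if a job was only sleeping, and is otherwise Idle (.), so standby and pending minutes appear as idle for the worker. The function name `worker_summary` is hypothetical.

```python
def worker_summary(job_rows):
    """Combine the per-minute rows of one worker's jobs into a summary row."""
    width = max(len(row) for row in job_rows)
    summary = []
    for i in range(width):
        # Collect the flags all jobs show for this minute.
        column = {row[i] for row in job_rows if i < len(row)}
        if "*" in column:
            summary.append("*")   # at least one job was running
        elif "z" in column:
            summary.append("z")   # occupied, but only sleeping on a child
        else:
            summary.append(".")   # standby, pending or nothing: worker idle
    return "".join(summary)


# The three job rows of worker 298663, padded to the 15 minute window.
rows = [
    "zzzz*          ",
    "    *          ",
    "    _*         ",
]
print("|" + worker_summary(rows) + "|")
```

With the three rows of worker 298663 this yields the same pattern as the Mgr Process(1298663) line in the sample.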

It is important to note that not every worker will be represented. If a worker ran no jobs then it won’t be displayed in this section. In addition, if a worker ran only 1 job there is no need to show a summary line for that worker. For example the following section shows worker 298667. Since it ran only a single job, the report skipped the summary line and went straight to the next worker.

298667 52065927-MRCMON             15:12:16 15:12:44 15:12:46 15:18:06|*              |
298669 52065960-MRCNEW             15:18:01 15:18:16 15:18:23 15:18:29|*              |
|      52065963-MRPEXPWF           15:18:29          15:18:29 15:19:03|**             |
…

In these cases the summary line would equal the worker line so there is no need to show the summary line separately.


Manager summary section

******  Process   Jobs  Busy% - Manager Summary for CUST_MRP
|      -------- ------ ------
|       1298663      3   26.8 06/07/09 09:13:32                       |zzzz**.........|
|       1298664      8  19.73 06/07/09 09:13:32                       |******.........|
|       1298665      0      0 06/07/09 09:13:32                       |...............|
|       1298666      6  18.67 06/07/09 09:13:32                       |*****..........|
|       1298667      1    .67 06/07/09 09:13:32                       |*..............|
|       1298668      0      0 06/07/09 09:13:32                       |...............|
|       1298669      3   4.67 06/07/09 09:13:32                       |**.*...........|
|       1298670      0      0 06/07/09 09:13:32                       |...............|
|       1298671      0      0 06/07/09 09:13:32                       |...............|
|       1298672      0      0 06/07/09 09:13:32                       |...............|
|       1298673      1  19.13 06/07/09 09:13:32                       |.zzz*..........|
|       1298674      0      0 06/07/09 09:13:32                       |...............|
|       1298675      0      0 06/07/09 09:13:32                       |...............|
|       1298676      5   2.13 06/07/09 09:13:32                       |.*.**..........|
|       1298677      2     15 06/07/09 09:13:32                       |.****..........|
|       1298678      0      0 06/07/09 09:13:32                       |...............|
|       1298679      0      0 06/07/09 09:13:32                       |...............|
|       1298680      2     14 06/07/09 09:13:32                       |.****..........|
|       1298681      1    .13 06/07/09 09:13:32                       |...*...........|
|       1298682      3   12.6 06/07/09 09:13:32                       |.***.*.........|
\___________________________________

The manager summary section provides an indication of the activity of the workers in the queue. Each line has the following columns:

Process - The unique worker ID as described before
Jobs - The number of jobs the worker processed in the timeframe specified
Busy% - This number takes the total time available in the timeframe specified and indicates how much of it the worker spent actively running a job. This is useful to see if the queue is maxed out. As mentioned earlier, a * in each column may only mean that 1 second of every minute was taken; the Busy% is a far more accurate gauge of the amount of time the worker spent doing work.
The start and end times of the queue are then provided.
The last column repeats the graph shown earlier when the worker was originally displayed.

Note that in this section even workers which never ran a job will be portrayed. This makes it easier to see if capacity was available during specified times.
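As a rough illustration of the Busy% figure, the sketch below divides the seconds a worker spent running jobs inside the report window by the length of the window. The function name `busy_percent` and the clipping of runs to the window are assumptions. The report's own figures can differ slightly in the last digit, since the scripts' exact rounding is not described here.

```python
from datetime import datetime


def busy_percent(window_start, window_end, runs):
    """Percentage of the window a worker spent running jobs.

    runs is a list of (actual_start, actual_end) pairs for one worker.
    """
    window_s = (window_end - window_start).total_seconds()
    busy_s = 0.0
    for start, end in runs:
        # Count only the part of each run that falls inside the window.
        overlap = (min(end, window_end) - max(start, window_start)).total_seconds()
        busy_s += max(overlap, 0.0)
    return 100.0 * busy_s / window_s


def at(hms):
    """Illustrative helper: a time of day on the sample report date 06/14/2009."""
    return datetime.strptime("06/14/2009 " + hms, "%m/%d/%Y %H:%M:%S")


window_start, window_end = at("15:18:00"), at("15:33:00")
# Worker 1298680 ran two jobs inside the window.
runs_1298680 = [(at("15:19:55"), at("15:22:01")), (at("15:22:01"), at("15:22:01"))]
print(busy_percent(window_start, window_end, runs_1298680))
# Worker 1298667 ran one job that started before the window opened.
runs_1298667 = [(at("15:12:46"), at("15:18:06"))]
print(round(busy_percent(window_start, window_end, runs_1298667), 2))
```

For worker 1298680 this gives 126 busy seconds out of 900, or 14, matching the summary. For worker 1298667 only the last 6 seconds of its job fall inside the window, which is why a job that ran for several minutes contributes just .67 here.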


Known Bugs

These scripts try to mine the job history tables to show how the concurrent manager is performing. However there are known issues in the data.

Some jobs seem to resubmit themselves using the same request ID. This is an anomaly of the job and as such it is hard to account for.

Outages and system downtimes will cause huge spikes in both standby time and Pending time.

Some of the logic determining the values for actual start time and job load may need to be refined. It seems to work well, but it is somewhat convoluted and there may be conditions where it does not produce the right values.