1 ASAP Extension (ASAPX) 2.0 Overview September 17, 2001 Joe Davis [email protected].


1

ASAP Extension (ASAPX) 2.0 Overview

September 17, 2001

Joe Davis

[email protected]

2

The ASAP Extension (ASAPX)

An application interface to ASAP
An API for use by your application
A calculation and collection service
An alerting mechanism
An availability tracker

3

How ASAPX Works

Applications register by calling the API
ASAPX allocates shared memory in each processor for application data
Applications store and update counters and state values in shared memory
ASAPX samples shared memory to compute metrics for an interval

[Diagram: Application, ASAPX API, Shared Memory, ASAPX, ASAP, ASAP Database]

4

ASAP Entities

Entities are Applications or other logical groupings
All components of an Entity share unique ASAPX definitions
Entities are described using Entity Definition Language (EDL)
Approximately 88 user entities
– Depends on System entities
– ASAP 2.0 maximum is 100 entities
Generic APP entity applies when no entity is defined

5

Entity-Specific ASAPX Definitions

DataItems EDL entity property
– Up to 12 items (0-11) in shared memory
– Three classes of DataItems
   Counters - Type I
   Time Units - Types S, M, and U
   Constants - Type C
– Example: DataItems "0 I, 1 M, 2 C"
Metrics (EDL Attributes)
– Statistical properties computed at each interval
– Computed using DataItems and user-defined formula
Display Formats
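The DataItems property string can be read mechanically. The following is a minimal Python sketch, not part of ASAPX, that parses a string such as "0 I, 1 M, 2 C" and checks it against the limits stated above. The function name `parse_data_items` and the duplicate-index check are assumptions made for illustration.

```python
from typing import Dict

# Type codes as listed on the slide: I = counter, S/M/U = time units, C = constant.
DATA_ITEM_CLASSES = {"I": "counter", "S": "time", "M": "time", "U": "time", "C": "constant"}
MAX_DATA_ITEMS = 12  # DataItems are numbered 0-11


def parse_data_items(spec: str) -> Dict[int, str]:
    """Turn a DataItems string such as "0 I, 1 M, 2 C" into {index: type code}."""
    items: Dict[int, str] = {}
    for part in spec.split(","):
        index_text, type_code = part.split()
        index = int(index_text)
        if not 0 <= index < MAX_DATA_ITEMS:
            raise ValueError(f"DataItem index {index} is outside 0-11")
        if type_code not in DATA_ITEM_CLASSES:
            raise ValueError(f"unknown DataItem type {type_code!r}")
        if index in items:  # hypothetical rule: reject duplicate definitions
            raise ValueError(f"DataItem {index} is defined twice")
        items[index] = type_code
    return items


items = parse_data_items("0 I, 1 M, 2 C")
print({index: DATA_ITEM_CLASSES[code] for index, code in items.items()})
```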

6

Registering a Domain with ASAPX

error := ASAP_REGISTER_ ( domain^name:name^len
                        , seg^offset
                        , [ error^detail ]
                        , [ segment^id ]
                        , [ segment^base ]
                        , [ version ]
                        , [ asap^id:id^len ]
                        , [ flags ] ) extensible;

flags.<0:12> - reserved
flags.<13> - allow replace of non-constants
flags.<14> - don't concatenate process name
flags.<15> - start with ranking deactivated

[Diagram: ASAPXMON and the Shared Memory block: Boundary Tag, Control Info, Data Items, Checksum, Boundary Tag]
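TAL numbers the bits of a 16-bit word from the left, so `<0>` is the most significant bit and `<15>` the least significant. The sketch below, in Python rather than TAL, shows how the register flag bits listed above translate into integer values. The constant and function names are invented for this illustration and are not ASAPX declarations.

```python
def tal_bit(n: int) -> int:
    """Mask for bit <n> of a 16-bit word, counting <0> as the most significant bit."""
    if not 0 <= n <= 15:
        raise ValueError("bit number must be 0-15")
    return 1 << (15 - n)


# Hypothetical names for the documented ASAP_REGISTER_ flag bits.
FLAG_ALLOW_REPLACE = tal_bit(13)    # allow replace of non-constants
FLAG_NO_PROCESS_NAME = tal_bit(14)  # don't concatenate process name
FLAG_RANKING_OFF = tal_bit(15)      # start with ranking deactivated
RESERVED_MASK = sum(tal_bit(n) for n in range(0, 13))  # flags.<0:12>


def register_flags(allow_replace: bool = False,
                   no_process_name: bool = False,
                   ranking_off: bool = False) -> int:
    """Combine the requested options into one 16-bit flags word."""
    value = 0
    if allow_replace:
        value |= FLAG_ALLOW_REPLACE
    if no_process_name:
        value |= FLAG_NO_PROCESS_NAME
    if ranking_off:
        value |= FLAG_RANKING_OFF
    return value


print(register_flags(no_process_name=True, ranking_off=True))
```

The same numbering applies to the flags words of ASAP_CONTROL_ and ASAP_REMOVE_.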

7

Domain Names

<entity>\<domain>[\<subdom>...]
– 1st level must be EDL defined Entity name
– 5 levels
– 64 bytes max
– Must be unique
– Examples
   Atm\Chicago\East\$Atm5
   Assembly\Belt12\Station6\P34345
   Basketball\Scores\Clippers

Domains are logical representations of business functions
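The naming rules above are easy to check before registering. This is a small Python sketch under stated assumptions: the entity name counts as one of the 5 levels, entity names compare without regard to case, and the function name `split_domain_name` is hypothetical. Uniqueness is enforced by ASAPX itself and is not modeled here.

```python
from typing import List, Set

MAX_LEVELS = 5   # includes the entity level (assumption based on the examples)
MAX_BYTES = 64


def split_domain_name(name: str, entities: Set[str]) -> List[str]:
    """Validate <entity>\\<domain>[\\<subdom>...] and return its levels."""
    if len(name.encode("ascii")) > MAX_BYTES:
        raise ValueError("domain name exceeds 64 bytes")
    levels = name.split("\\")
    if not 2 <= len(levels) <= MAX_LEVELS:
        raise ValueError("domain name needs 2 to 5 levels")
    if any(level == "" for level in levels):
        raise ValueError("empty level in domain name")
    # Assumption: entity names are matched case-insensitively.
    if levels[0].upper() not in {entity.upper() for entity in entities}:
        raise ValueError(f"{levels[0]!r} is not a defined entity")
    return levels


print(split_domain_name("Atm\\Chicago\\East\\$Atm5", {"ATM"}))
```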

8

Updating DataItems

error := ASAP_UPDATE_( seg^offset
                     , [ error^detail ]
                     , data^item
                     , value
                     , [ math ] ) extensible;

error := ASAP_UPDATELIST_( seg^offset
                         , [ error^detail ]
                         , num
                         , list ) extensible;

math
0 = add (default)
1 = replace

[Diagram: Shared Memory block: Boundary Tag, Control Info, Data Items, Checksum, Boundary Tag]
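The effect of the `math` parameter can be pictured with a plain list standing in for the 12 DataItems. This Python sketch only models the add and replace semantics described above. It does not call the real API, and `apply_update` is an invented name.

```python
from typing import List

MATH_ADD = 0      # default: add value to the DataItem
MATH_REPLACE = 1  # overwrite the DataItem with value


def apply_update(data_items: List[float], item: int, value: float,
                 math: int = MATH_ADD) -> None:
    """Model of one DataItem update against an in-memory list."""
    if not 0 <= item < len(data_items):
        raise IndexError("data item number out of range")
    if math == MATH_ADD:
        data_items[item] += value
    elif math == MATH_REPLACE:
        data_items[item] = value
    else:
        raise ValueError("math must be 0 (add) or 1 (replace)")


items = [0.0] * 12
apply_update(items, 1, 5)                    # counter: 0 -> 5
apply_update(items, 1, 2)                    # counter: 5 -> 7
apply_update(items, 0, 2500, MATH_REPLACE)   # constant set outright
print(items[:2])
```

Counters are natural candidates for add, while a constant such as a remaining balance is simply replaced with its current value.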

9

Creating Metrics (Attributes)

Defined as EDL Attributes
TransRate is the Attribute name
MetricRule is formula
– #<n> refers to a DataItem
   Difference between samples
   Or constant value at sample
– S is seconds in interval
– C<n> means <n> is a constant
Format controls display

AT TransRate
   Grid YES Graph NO GraphMax 1000
   Format "F7.2"
   Help "Successful transaction rate"
   StatePair YES StateRule UseStateGraphState
   MetricRule "#0/S"
   TypeData REAL64;

AT S0 Grid NO Graph NO GraphMax 9 Help "State of TransRate ";
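To make the MetricRule semantics concrete, here is a rough Python sketch of how a rule such as "#0/S" could be evaluated from two samples. It is an illustration of the rules stated above, not the ASAPX evaluator: the function name, the dictionaries used for samples, and the reading of `C<n>` as the literal number `<n>` are all assumptions.

```python
import re
from typing import Dict


def evaluate_metric_rule(rule: str, item_types: Dict[int, str],
                         previous: Dict[int, float], current: Dict[int, float],
                         seconds: float) -> float:
    """Evaluate a MetricRule for one interval (illustrative only)."""
    def data_item(match: "re.Match") -> str:
        index = int(match.group(1))
        if item_types[index] == "C":
            value = current[index]                     # constant value at sample
        else:
            value = current[index] - previous[index]   # difference between samples
        return f"({float(value)!r})"

    expression = re.sub(r"#(\d+)", data_item, rule)
    expression = re.sub(r"\bC(\d+)\b", r"\1", expression)            # C<n> -> n
    expression = re.sub(r"\bS\b", f"({float(seconds)!r})", expression)
    # Only plain arithmetic may remain before the expression is evaluated.
    if not re.fullmatch(r"[0-9eE.+\-*/() ]+", expression):
        raise ValueError(f"unsupported MetricRule: {rule!r}")
    return float(eval(expression, {"__builtins__": {}}, {}))


rate = evaluate_metric_rule("#0/S", {0: "I"}, {0: 100}, {0: 700}, seconds=60)
print(rate)
```

With a counter that moved from 100 to 700 over a 60 second interval, the rule "#0/S" yields 600/60.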

10

ASAPX Built-in Metrics (Attributes)

11 built-in Metrics
– Avail - Availability since registration
– Busy - Percent busy for interval
– Cpu - Process Cpu
– DownTime - Down time seconds since registration
– Nak - Number of intervals without update
– Pri - Priority of registering process
– PState - Process state of registering process
– RegTime - Domain registration date/time
– UnAvail - Unavailability for interval
– Version - Application supplied version
– WState - Wait state of registering process
Include EDL definitions as described in the ASAPX Manual

11

Operational Status and State

Set by ASAPX to:

– “Up” - OpState 2

– “Inactive” - 6*

– “Down” - 8

– “Removed” - 0

– “Unknown” - 6

* If the Nak built-in is used, the state reported for “Inactive” is the OEM state for the Nak attribute instead of state 6

12

Operational Status and State

Applications can override ASAPX and set Status Text and OpState directly

– Status text is 1 to 15 ASCII bytes

– OpState is a valid OEM State, 1-8

13

ASAP_OPSTATE_

error := ASAP_OPSTATE_( seg^offset

, [ error^detail ]

, OpText:OpText^Len

, OpState

) EXTENSIBLE;

[Diagram: Shared Memory block: Boundary Tag, Control Info, Data Items, Checksum, Boundary Tag]

If an application sets Status and OpState and then fails, ASAPX overrides OpState to DOWN (8). Status text is not overridden.

14

Domain Control

error := ASAP_CONTROL_( seg^offset

, [ error^detail ]

, flags ) extensible;

flags.<0:13> - reserved
flags.<14> - enable ranking
flags.<15> - disable ranking

[Diagram: Shared Memory block: Boundary Tag, Control Info, Data Items, Checksum, Boundary Tag]

15

Domain Removal

error := ASAP_REMOVE_( seg^offset
                     , [ error^detail ]
                     , [ segment^id ]
                     , [ flags ] ) extensible;

flags.<0:14> - reserved
flags.<15> - de-allocate segment

[Diagram: Shared Memory block: Boundary Tag, Control Info, Data Items, Checksum, Boundary Tag]

16

ASAP Discrete Object Thresholds

ASAPX fully supports ASAP DOTs
– Except monitored via API
Metrics ranked to set alert level
All version 1 Objectives functions have been removed from ASAPX.

[Screenshots: Standard Ranking; Percent & Historical Ranking]

17

ASAPX Conversion to DOTs

ASAPX 1.0        ASAP 2.0 DOTs
SET DATABASE     SET OBJECTIVESDB
SET DATAITEMS    Replaced by EDL
SET METRICS      Replaced by EDL
SET NAMEFILE     Obsolete
SET RANK         SET OBJECTIVESRANK
SET TMF          SET OBJECTIVESAUDIT
ADD, ALTER       RANK
COMMIT           COMMIT, MONITOR, RANK
DELETE, INFO     RANK
LIST             MONITOR, RANK

18

Domain Aggregation

Atm\Chicago\East\$Atm37
Atm\Chicago\West\$Atm38
Atm\Newyork\East\$Atm39

– 3rd Level Aggregates
   Atm\Chicago\East\#
   Atm\Chicago\West\#
   Atm\Newyork\East\#
– 2nd Level Aggregates
   Atm\Chicago\#
   Atm\Newyork\#
– 1st Level Aggregate
   Atm\#

State propagation
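The aggregate names follow directly from the domain names: each aggregate keeps the first levels of a domain and replaces the rest with "#". The Python sketch below derives them for the three example domains. The function names are hypothetical, and the sketch only builds names; it does not sum statistics or propagate state.

```python
from typing import Dict, Iterable, List, Set


def aggregate_names(domain: str) -> List[str]:
    """Aggregate names a domain rolls up into, deepest level first."""
    levels = domain.split("\\")
    return ["\\".join(levels[:depth] + ["#"])
            for depth in range(len(levels) - 1, 0, -1)]


def aggregates_by_level(domains: Iterable[str]) -> Dict[int, Set[str]]:
    """Group the distinct aggregate names by aggregation level."""
    result: Dict[int, Set[str]] = {}
    for domain in domains:
        for name in aggregate_names(domain):
            level = name.count("\\")  # named levels kept in front of "#"
            result.setdefault(level, set()).add(name)
    return result


domains = ["Atm\\Chicago\\East\\$Atm37",
           "Atm\\Chicago\\West\\$Atm38",
           "Atm\\Newyork\\East\\$Atm39"]
for level, names in sorted(aggregates_by_level(domains).items(), reverse=True):
    print(level, sorted(names))
```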

19

Write-to-Collector Mode

ASAPX SGP collects statistics serially from all ASAPXMON processes then forwards them to the ASAP Collector

In Write-to-Collector mode all ASAPXMON processes work in parallel

– Detail records sent directly to ASAP Collector

– Aggregate records (if any) are still collected by the SGP

– Uses fewer resources

[Diagram: ASAPXMON, ASAPXSGP, ASAP Collector and ASAP Database, connected by numbered flows 1-4]

20

Limits Increased

128 domains per CPU to 1024 domains per CPU!
– Done for K to S migration customers who end up with fewer processors
– Use this increase wisely, be careful of too much APP data
– ASAP 1.0 memory segments cannot be used with ASAP 2.0 due to the segment size increase for this enhancement!
20 metrics to 30 metrics (EDL Attributes)
– DataItems remains 12
– Metrics increased due to new built-in metrics
If you are approaching limits, we want to know!

21

ASAP Extension (ASAPX) 2.0 Managing ASAPX


22

ASAPX Components

Executables: ASAPXMON, ASAPXSGP
Libraries: ASAPXLIB, ASAPXSRL, ASAPXSRO
Declarations: ASAPXDEC, ASAPXH
Definitions: ZASPXC, ZASPXCOB, ZASPXTAL
Utilities: ASAPX, ASAPXTST, INSTALL

[Diagram: six XMON processes, XSGP, ACOL, ASAPX and XTST]

23

ASAPXMON Monitor Process

Handles domain registration functions in a cpu
Allocates and manages shared memory
Produces metrics at each interval
Produces aggregates for its CPU
Uses DOTs to rank metric (attribute) values
Writes detail domain data to the SGP or to the ASAP Collector
Writes aggregate domain data to the SGP

24

ASAPXSGP Smart Gatherer Process

Is managed by the ASAP Monitor process
Manages all ASAPXMON processes
Reads ASAPCONF and ASAPXCNF at startup time
Collects statistics from ASAPXMON processes
Creates aggregate domain records using the sum of all CPU records received from ASAPXMON processes
Creates the domain table
Responds to ASAP CI commands
Forwards COMMIT commands to ASAPXMON

25

ASAPX Command Interpreter

Reads ASAPCONF and ASAPXCNF at startup
Gutted for ASAPX 2.0 due to DOTs conversion
– CLEANUP command
– DISABLE/ENABLE STATS commands
Most customers will not use ASAPX CI

26

ASAPXTST API Test Program

ASAPXTST lets you execute the API procedures interactively
Used for education and testing purposes
ASAPXTST API commands
– CONTROL -> ASAP_CONTROL_
– REGISTER -> ASAP_REGISTER_
– REMOVE -> ASAP_REMOVE_
– UPDATE -> ASAP_UPDATE_
– UPDATELIST -> ASAP_UPDATELIST_
– OPSTATE -> ASAP_OPSTATE_

27

Installing ASAPX

IP Setup Install
– IP Setup moves files to NSK
– IP Setup places files
DSM/SCM Install
– IP Setup moves files to NSK
– DSM/SCM accepts files and places them correctly

28

Configuring ASAPX

Once files are placed on $SYSTEM, create the ASAPXCNF file.
– INSTALL macro
Add SET commands to define ASAPX
– SWAPVOLS
Modify ASAPCONF to:
– SET APP ON
– SET APP PARAM
– SET APP OBJECT
Can be central or remote!

ASAPXCNF
   SET AGGREGATE ATM 1 2
   SET COLLECTOR $ZOOS
   SET CPUS 0-7
   SET PRIORITY 180
   SET SWAPVOLS 0-15 $DATA.ASAPX

ASAPCONF
   .
   .
   .
   SET APP ON
   ...

29

Multi-node ASAPX Configurations

Centralized versus Remote ASAPXCNF files

– Default is one per node.

– For centralized, include SET APP PARAM in the ASAPCONF file and fully qualify the name of ASAPXCNF with the node name where it resides.

30

The ASAPX Command Interpreter

[$SYSTEM.SYSTEM.]ASAPX [/run-params/] [asapconf] [[;] command [[;] command, ...]]
– run-params: standard run parameters
– asapconf: the name of the ASAP configuration file
– command: a valid ASAPX command

31

ASAPX Commands

ALLOW, ASAP, CLEANUP, DISABLE, ENABLE, ENV, ERROR, EXIT, FC, HELP, HISTORY, OBEY, SET, STATUS

32

ASAPX Commands

ALLOW
– Sets the allowed error count for batch processing
ASAP
– Accidental leftover, not useful, to be removed next release.
CLEANUP
– Deletes the objectives table, shared memory and the domain table
DISABLE
– Disables statistics processing by returning error -6 on updates and control operations
ENABLE
– Enables statistics processing

33

ASAPX Commands

ENV
– Shows the current ASAPX environment
FC
– Standard FIX command
HELP
– HELP [ command ] [ subcommand ]
HISTORY
– Displays up to 100 lines from the history buffer, the default is 20 lines
OBEY
– Executes ASAPX commands stored in an obey file

34

ASAPX Commands

STATUS
– Shows the status of ASAPX components, the ASAPXSGP and all ASAPXMON processes

13+ status
Process  Cpu  Pin  Stats    Regst  Avail  HDmn#  Shared Mem  Private Mem
$APPH    1    131  Enabled  0      0      0      0           15984
$APPH0   0    91   Enabled  24     24     26     6598        113562
$APPH1   1    156  Enabled  25     25     25     6354        109358

35

ASAPX SET Commands

SET AGGREGATE SET COLLECTOR SET CPUS SET PRIORITY SET SWAPVOLS SET TEST

36

ASAPX SET Commands

SET AGGREGATE <entity> [1][2][3][4]
– Turns on aggregation for the entity at the specified level(s) of the domain names
SET COLLECTOR [<cpu>] [\<node>.]$<collector-name>
– Specified with <cpu> turns on write to collector mode
– Additional specifications with <cpu> define a different collector for that <cpu>
– ASAP ID can exceed 2 chars when collector is remote
SET CPUS cpu-range
– SET CPUS 0-4, 9-15
– SET CPUS 0-15 (default)
– Must run in every CPU where an application resides
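A cpu-range such as "0-4, 9-15" expands to an explicit list of CPU numbers. The following Python sketch shows one way to expand it. The helper name `parse_cpu_range` and the upper bound of 15 (taken from the default range) are assumptions, and the parsing rules of the real command interpreter may differ.

```python
from typing import List

MAX_CPU = 15  # assumption: highest CPU number, matching the 0-15 default


def parse_cpu_range(text: str) -> List[int]:
    """Expand a cpu-range such as "0-4, 9-15" into a sorted list of CPUs."""
    cpus = set()
    for part in text.split(","):
        low, dash, high = part.strip().partition("-")
        start = int(low)
        end = int(high) if dash else start
        if not 0 <= start <= end <= MAX_CPU:
            raise ValueError(f"invalid CPU range {part.strip()!r}")
        cpus.update(range(start, end + 1))
    return sorted(cpus)


print(parse_cpu_range("0-4, 9-15"))
```

A check like this makes it easy to confirm that every CPU hosting an instrumented application is covered by the configured range.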

37

ASAPX SET Commands

SET OBJECT [ ASAPXMON ] filename
– Defines ASAPXMON object file
– SET OBJECT $ALYSTS.SYSTEM.ASAPXMON
– SET OBJECT $SYSTEM.SYSTEM.ASAPXMON (default)
SET PRIORITY priority
– Defines ASAPXMON priority (default is 150)
SET SWAPVOLS cpu-range subvol [ , … ]
– SET SWAPVOLS 0-4 $DATA.SWAP, 5 $DATA2.SWAP
– SET SWAPVOLS 0-15 $SYSTEM.ZASAPX (default)
SET TEST ON
– SET TEST ON

38

Managing ASAPX

ASAPX MUST be running when Applications are calling ASAP_REGISTER_
Once Applications have registered, ASAPX can be shut down and restarted without affecting the application
CPU failures can corrupt shared memory!
CPU failures can corrupt the domain table unless you have TMF ON in ASAPCONF!
ASAPX uses both to rebuild both after failures

39

ASAPX Limits

1024 domains per processor
12 data items per domain
30 metrics per domain

40

ASAP Extension (ASAPX) 2.0 Internals

For Compaq Employees Only


41

Internals

Data Flows and Data Structures
Shared memory allocation
ASAPXMEM memory analysis tool
Source code overview
What to collect if there is a problem

42

Data Flows

1 Register request from the application process (via the ASAP_REGISTER_ API call) to the ASAPXMON in its CPU
2 Stats request and reply from ASAPXSGP to all ASAPXMON processes
3 Stats records written to ASAPCOL from ASAPXSGP and ASAPXMON
4 ASAPX CI commands

[Diagram: Application Process with ASAPX API, ASAPXMON, ASAPXSGP, ASAPX and ASAPCOL, connected by flows 1-4]

43

Register Request

DEFINITION register-msg.
   02 msgcode        type binary 16.                                 !register msg
   02 d-name         type binary 16 occurs app-dflt-dom-wlen times.
   02 dname          type character 1 redefines d-name.              !domain name
   02 dname-len      type binary 16.                                 !seg len
   02 pin            type binary 16.                                 !pin
   02 vers           type binary 16.
   02 v-bytes        redefines vers.                                 !version
      09 v1          type binary 8.
      09 v2          type binary 8.
   02 flags          type binary 16.                                 !caller flags
   02 asapx-version  type binary 16.                                 !asapx version
end

44

Register Reply

DEFINITION reply-msg.
   02 addr        type binary 16.                                  !addr
   02 status      type binary 16.                                  !status
   02 replycode   type binary 16.                                  !reply
   02 f-name      type binary 16 occurs app-dflt-fname-wlen times.
   02 fname       type character 1 redefines f-name.               !seg fname
   02 fname-len   type binary 16.                                  !seg len
   02 sla-offset  type binary 32.                                  !sla offset
   02 rcount      type binary 16.                                  !# times registered
   02 appmon-err  type binary 16.                                  !appmon err
   02 dom         type *.                                          !parsed domain
end

45

Statistics Request & Reply

DEF stats-msg.
   02 msgcode    type binary 16.   !request code
   02 obj-cmd    type binary 16.   !obj command, if any
   02 stats-cmd  type binary 16.   !stats command, if any
   02 start      type binary 16.   !starting s for continuation
end.

DEF stats-reply.
   02 addr       type binary 16.   !buf addr
   02 status     type binary 16.   !status
   02 replycode  type binary 16.   !reply code
   02 data       type binary 16.   !start of data
end.

46

APP^STATS2

DEF sgp-metric-stats2.
   02 val    type float 64.
   02 val16  type binary 16 redefines val.
   02 val32  type binary 32 redefines val.
   02 val64  type binary 64 redefines val.
   02 valch  type character 8 redefines val.
   02 state  type binary 16.
END.

DEFINITION APP-STATS2.
   02 system   type binary 16 occurs 4 times.
   02 word     type binary 16 redefines system.
   02 sysname  type character 8 redefines system.
   02 P-key-x  type *.
   02 Q-key-x  type * redefines P-key-x.

47

APP^STATS2

   02 event        type *.
   02 Pid          type character 8.
   02 Cpu          type binary 16.
   02 Pin          type binary 16.
   02 Spare        type binary 16 occurs 5 times.
   02 AggRec       type binary 16.
   02 Data-count   type binary 16.
   02 Op-Text      type binary 16 occurs 8 times.
   02 OpText       type character 1 redefines op-text.
   02 OpState      type binary 16.
   02 Error        type binary 64.
   02 Error-state  type binary 16.
   02 DitemCount   type float 64.
   02 Ditem        type float 64 occurs 12 times.
   02 Data         type sgp-metric-stats2 occurs 30 times.
END.

48

ASAPXMON Shared Memory

ASAPXMON allocates 2 shared segments (Z<asap-id>C<n> and Z<asap-id>C<n>A)
– <n> is the CPU number in hexadecimal
C<n> is the main application shared segment
C<n>A is a private segment used only by ASAPXMON

[Diagram: ASAPXMON attached to segments C<n> and C<n>A]
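The segment naming pattern can be spelled out for a given CPU. This short Python sketch builds both names from the pattern above. The function name is hypothetical, a single hexadecimal digit is assumed for CPUs 0-15, and the ASAP id "APP" is used purely as an illustrative value.

```python
from typing import Tuple


def segment_names(asap_id: str, cpu: int) -> Tuple[str, str]:
    """Return the (C<n>, C<n>A) segment file names for one CPU."""
    if not 0 <= cpu <= 15:
        raise ValueError("CPU number must be 0-15")
    main = f"Z{asap_id}C{cpu:X}"   # <n> is the CPU number in hexadecimal
    return main, main + "A"


print(segment_names("APP", 1))
print(segment_names("APP", 10))
```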

49

C<n> Segment

Contains only two structures
– 1 occurrence of ACTRL
– <n> occurrences of SLA
SLA portion is bit-mapped
SLA always allocated to lowest memory slot

[Diagram: segment layout. ACTRL^DEF at ACTRL^BASE, then from SLA^BASE: SLA^DEF Slot 0, SLA^DEF Slot 1, ... SLA^DEF Slot (max^domains-1)]

50

C<n> Segment

ACTRL contains the last stats collection time and the current state of stats processing
SLA is the application shared structure visible to the API
ASAP_REGISTER_'s seg^offset points directly to an SLA slot

[Diagram: segment layout. ACTRL^DEF, then SLA^DEF Slot 0, SLA^DEF Slot 1, ... SLA^DEF Slot (max^domains-1)]

51

The SLA Structure

Fields, in order:
Start-Tag
DT
Pin
Dname:len
Pname:len
State-Addr
Vers
Msg
Utime
Flags
Spare[0:7]
Metric[0:11]
Defs[0:11]
Last-Error
Checkdata (DT)
End-Tag

Callouts on the diagram:
Boundary tag
Domain and Process names
The address of ACTRL
The Message Box for Control, Remove and Error functions
Application's last update time
Register flags
OpText/OpState in ASAPX 2.0
Application defined Data Items
Data Item types (I, C, S, etc.)
Checksum word

52

C<n>A Segment

Segment contains 3 structures
– 1 occurrence of CTRL
– 1 occurrence of XMAP
– APP^MAX^DOMAIN occurrences of SLOT
SLOT portion bit-mapped to the SLA bitmap (XMAP) for allocation
– SLOT[x] = SLA[x]
SLOT entries maintained in a doubly-linked list for processing in alpha order
– NCB is names control block

[Diagram: segment layout. CTRL^DEF at CTRL^BASE, XMAP^DEF at XMAP^BASE, then from SLOTS^BASE: SLOT^DEF Slot 0, ... SLOT^DEF Slot (max^domains-1)]

53

C<n>A Segment

CTRL contains number of domains registered and active, last stats time, etc.
XMAP is the memory bitmap of SLA and SLOT areas
SLOT contains all the internal information ASAPX needs to know about an application
– Control information
– Statistics samples (last and new)
– Objectives information on each domain

[Diagram: segment layout. CTRL^DEF, XMAP^DEF, then SLOT^DEF Slot 0, ... SLOT^DEF Slot (max^domains-1)]

54

SLOT structure

DEF app-slots.
   02 pin      type binary 16.
   02 dom      type *.
   02 offset   type binary 32.
   02 paddr    type binary 32.
   02 naddr    type binary 32.
   02 vers     type binary 16.
   02 version  type character 2 redefines vers.
   02 next     type binary 16.
   02 state    type binary 16.
   02 lrank    type binary 16.
   02 ldrank   type binary 16.
   02 rank     type binary 16.
   02 remove   type binary 16.
   02 flags    type binary 16.
   02 time     type binary 64.
   02 nak      type binary 64.

55

SLOT structure

   02 error        type binary 64.
   02 error-state  type binary 16.
   02 unavail      type float 64.
   02 rtime        type binary 64.
   02 dtime        type binary 64.
   02 ptime        type binary 64.
   02 optime       type binary 64.
   02 pstate       type binary 16.
   02 wstate       type binary 16.
   02 pri          type binary 16.
   02 lopstate     type binary 16.
   02 s1           type sla-copy.
   02 s2           type sla-copy.
   02 data         occurs asap-max-attrs times.
      04 state     type binary 16.
   02 objs         type SGP-Domain2.
end.

56

ASAPXMEM

Joe's .. ASAPXTST 256> run asapx.asapxmem/cpu 1/$system.asapx.zappc1
Joe's .. ASAPXTST 256..
Compaq ASAP Extension (ASAPX) - T0403V01 - (15JUN99) System \CENTDIV
Copyright Compaq Computer Corporation 1999
1+ xmap
XMAP.MAP[0] = 1111111111111101 1111111111000000 0000000000000000 0000000000000000
              0000000000000000 0000000000000000 0000000000000000 0000000000000000
              0000000000000000 0000000000000000 0000000000000000 0000000000000000
              0000000000000000 0000000000000000 0000000000000000 0000000000000000
2+
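The XMAP dump can be decoded by hand or with a few lines of code. The Python sketch below reads the 16 words shown above from left to right, treating each "1" as an allocated slot. That reading direction is an assumption, although it agrees with the REGISTERED = 25 and HIGH^SLOT = 25 values in the CTRL output. The helper names are hypothetical.

```python
from typing import List, Optional

# The 16 bitmap words printed by the xmap command above.
XMAP_WORDS = ["1111111111111101", "1111111111000000"] + ["0000000000000000"] * 14


def slot_states(words: List[str]) -> List[bool]:
    """Flatten the bitmap words into one used/free flag per slot."""
    return [bit == "1" for word in words for bit in word]


def lowest_free_slot(states: List[bool]) -> Optional[int]:
    """First free slot, which is where the next SLA would be allocated."""
    for slot, used in enumerate(states):
        if not used:
            return slot
    return None


def highest_used_slot(states: List[bool]) -> Optional[int]:
    used = [slot for slot, flag in enumerate(states) if flag]
    return used[-1] if used else None


states = slot_states(XMAP_WORDS)
print(sum(states), lowest_free_slot(states), highest_used_slot(states))
```

Because an SLA is always allocated to the lowest free slot, a hole in the bitmap typically marks a domain that was removed and whose slot has not yet been reused.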

57

ASAPXMEM

2+ ctrl
CTRL =
   REGISTERED = 25
   AVAILABLE = 25
   HIGH^SLOT = 25
   STIME = 1999/08/04 12:29:02
   MYNAME = $APPH1
3+ actrl
ACTRL =
   STATE = ACTIVE
   STIME = 1999/08/04 12:29:02
4+

58

ASAPXMEM

5+ sla 0
SLA[0] =
   START^TAG = 15038
   DT =
   PIN = 354
   DNAME = ATM\GUEST\CHICAGO\$JX00
   DNAME^LEN = 23
   PNAME = $JX00
   PNAME^LEN = 5
   STATE^ADDR = 1275068416
   VERSION = A1

59

ASAPXMEM

   MSG =
   ERROR = 0
   RANK = 0
   REMOVE = 0
   UTIME = 1999/08/04 12:31:27
   FLAGS = 0
   SPARE[0] = 0 0 0 0 0 0 0 0
   METRIC[0] = 750330. 83370. 2504543118197. 3339610464. 4131500. 2. 0. 0. 0. 0. 0. 0.
   DEFS[0] = 1930 1930 1933 1932 1931 1937 1930 1930 -1 -1 -1 -1
   LAST^ERROR = 0
   CHECKDATA = 12413
   END^TAG = 15038

60

ASAPXMEM

11+ help
ASAPX Memory Peeper Commands
   SLA [<n>[:<n>]]   - Shows App Shared Memory
   SLOT [<n>[:<n>]]  - Shows ASAPX Shared Memory
   MEM [<n>[:<n>]]   - Shows App and ASAPX Memory
   DOM [<n>[:<n>]]   - Shows Domain Record *
   ALL [<n>[:<n>]]   - Shows all things
   CTRL              - Shows ASAPXMON ctrl struct
   ACTRL             - Shows ASAPXMON actrl struct
   XMAP              - Shows ASAPXMON memory XMAP
   FC                - Standard FC command
   HELP              - This tiny bit of help
   HISTORY [<n>]     - History for n = 1-100 lines(20)
   ! [<n>|<text>]    - Re-execute line no.(n) or text
12+

* - Not yet implemented

61

The Domain Table

1 per node
Completely managed by ASAPXMON processes
Used for global update and control of domains
Should be TMF protected for optimum availability
Can be purged if all applications and ASAP are stopped using the ASAPX CLEANUP command

[Diagram: ASAPXMON processes sharing the domain table files Z<asapid>DOM and Z<asapid>DOM0]

62

The Domain Table

RECORD domain-rec.
   FILE is "ZDOMAINS" key-sequenced maxextents 100.
   02 d-name     type binary 16 occurs 32 times.
   02 dname      type character 64 redefines d-name.
   02 p-name     type binary 16 occurs 4 times.
   02 pname      type character 1 redefines p-name.
   02 cpu-num    type binary 16.
   02 cpu        type character 2 redefines cpu-num.
   02 pin        type binary 16.
   02 slot       type binary 16.
   02 dname-len  type binary 16.
   02 pname-len  type binary 16.

63

The Domain Table

   02 vers     type binary 16.
   02 v-bytes  redefines vers.
      09 v1    type binary 8.
      09 v2    type binary 8.
   02 state    type binary 16.
   02 rank     type binary 16.
   02 flags    type binary 16.
   02 rtime    type binary 64.
   02 dtime    type binary 64.
   02 itime    type binary 64.
   02 otime    type binary 64.
   key is dname duplicates not allowed.
   key "cp" is cpu.
end

64

Source Code Overview

T0403V02 Source Modules
SASAPX    Source code for ASAPX CI
SASPETAL  Source for ASAP EMS
SASPLIST  Source for ASAP OBJECTIVES command code
SASPXCMN  Common source procedures for ASAPX
SASPXDDL  Source for ASAPX-specific DDL
SASPXFMT  Source for ASAPX formatting procedures
SASPXHLP  Source for ASAPX help procedures
SASPXHST  Source for ASAPX history procedures
SASPXLIB  Source for ASAPXLIB, ASAPXSRO, ASAPXSRL
SASPXMON  Source for ASAPXMON
SASPXSGP  Source for ASAPXSGP

65

Source Code Overview

SASPXTAL  TAL output from ASAPXDDL
SASPXTST  Source for ASAPXTST
SDDLTAL   ASAP TAL output from ASPDDLDB
SLOGDEC   Source for ASAP EMS logging procedures
SLOGSTRU  ASAP Log structure
SNSRVDEC  ASAP Source used to build ASAPXSGP
SNSSCONF  ASAP Source used to build ASAPXSGP
SNSSMON   ASAP Source used to build ASAPXSGP
SNSSSGP   ASAP SGP Shell source code
SSGPOBJ   ASAP DOTs API source code

66

What to Collect if there is a problem

Copy of ASAPLOG
Copy of EMSLOG
Copy of the Domain Table
– $SYSTEM.ZASAPX.Z<asap-id>DOM
ASAP MONITOR APP, LIST Output
Copy of Shared Memory segment(s) if the problem is with ASAPXMON or Statistics production or accuracy.
– $SYSTEM.ZASAPX.Z<asap-id>C<n>
– $SYSTEM.ZASAPX.Z<asap-id>C<n>A

67

ASAP Extension (ASAPX) 2.0 Application Instrumentation


68

Goal: ASAP 2.0 Monitors Applications

Down domain detected!

69

Monitoring an application with ASAP 2.0.

Define Metrics (Attributes)
Reverse Engineer DataItems
Define EDL
Load/Distribute/Verify EDL
Test EDL
Identify code points
Instrument the Application
Final Test

70

Define Metrics - What to measure?

What criteria are used to judge the application at the end of the year, month, day?
What don't you know about the application?
General suggestions:
– Success rate
– Total transaction rate
– Error rate
– Success percentage
– Server response time
– Processing states
Application specific:
– Cash remaining
– Web site visits
– Shares traded
– Quantity on hand
Recommend using rate values instead of raw counts since objectives on counts are ASAP Rate dependent

71

Define Metrics - What is most important?

What criterion is most important?
Order them in descending order from most to least important.
Each item will be defined as an attribute in ASAP EDL.

Application Metrics (in priority order)
Cash remaining
Total transaction rate
Successful transaction rate
Failed transaction count
CPU utilization
Server response time
Card reject rate
Failed transaction rate

72

Reverse Engineer DataItems - Formulas

Cash - computed and stored as a constant by the application.

Total rate - (Trans count + Error count)/seconds in interval

Trans rate - Trans count/seconds in interval

Error count - Error count

CPU busy - an ASAPX built-in metric.

Resp time - Time/(Trans count + Error count)

Reject rate - Reject count/seconds in interval

Error rate - Error count/seconds in interval
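As a worked check of these formulas, the Python sketch below computes each metric from one interval's worth of values. The function name and the sample numbers are invented for illustration, and the guard against an interval with no transactions is an added assumption rather than documented ASAPX behavior.

```python
from typing import Dict


def atm_metrics(trans: float, errors: float, time_seconds: float,
                rejects: float, cash: float, seconds: float) -> Dict[str, float]:
    """Apply the formulas above to the counts accumulated in one interval."""
    total = trans + errors
    return {
        "Cash": cash,                          # constant, stored by the application
        "TotalRate": total / seconds,
        "TransRate": trans / seconds,
        "ErrorCount": errors,
        # Assumption: report 0 when nothing was processed, to avoid dividing by zero.
        "RespTime": time_seconds / total if total else 0.0,
        "RejectRate": rejects / seconds,
        "ErrorRate": errors / seconds,
    }


# Hypothetical 60 second interval: 540 successes, 60 failures, 120 s of server time.
for name, value in atm_metrics(540, 60, 120.0, 6, 2500, 60).items():
    print(f"{name:10} {value}")
```

CPU busy is absent from the function because it is supplied by the Busy built-in rather than computed from DataItems.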

73

Reverse Engineer DataItems - Variables

Application DataItems
– Constant representing cash remaining
– Successful transaction count
– Failed transaction count
– Accumulated time spent processing in the server
– Card reject count
ASAPX Computed Data
– Number of seconds in the interval
– Busy attribute

74

Reverse Engineer DataItems - Types

DataItems are defined in EDL using the DataItems EDL entity property.
They are assigned numbers 0-11, beginning with 0.
DataItems are updated with 64-bit binary values.
DataItems are stored as 64-bit floating point.

I - integer
S, M, U - time units
C - constant

[Diagram: Shared Memory slots 0-4 holding these DataItems: Transaction count (I), Error count (I), Processing time (U), Cash remaining (C), Card rejects (I)]

75

Define EDL - What is an entity?

Entity A = Application A
Entity B = Application B and Application C
Entity C = 1/2 of Application D
Entity D = 1/2 of Application D
Common Properties
– DataItems
– Metrics

[Diagram: Entity A contains Application A; Entity B contains Applications B and C; Entities C and D each cover half of Application D]

76

Define EDL - Application Name Space

Abstract application domain names
Entity name is 1st (leftmost) level
Rules
– 64 bytes or less
– no commas, quotes, etc.
– up to 5 levels
– no unbalanced trees
– lowest level cannot be "#"
– auto name append
Examples
– Atm\Chicago\Loop\Branch12\$Atm43
– Deposit\Chicago\Atm\$Atm43

77

Define EDL - Application Name Space

[Diagram: two alternative name-space trees under Atm. Either Atm splits into Denver and Dallas, each with Deposit, Transfer and Withdraw below it, OR Atm splits into Deposit, Transfer and Withdraw, each with Denver and Dallas below it.]

78

Define EDL - Consider Aggregates

Example 1
– Atm
– Atm\Denver, Atm\Dallas
– Atm\Denver\Deposit, Atm\Denver\Transfer, Atm\Denver\Withdraw, Atm\Dallas\Deposit, etc.
Example 2
– Atm
– Atm\Deposit, Atm\Transfer, Atm\Withdraw
– Atm\Deposit\Denver, Atm\Deposit\Dallas, Atm\Transfer\Denver, Atm\Transfer\Dallas, etc.

79

Checklist So Far

Attributes are defined.
DataItems are defined.
Application Name Space is defined.
Entities are defined.

80

Define EDL - Create EDL File

You MUST use the ASAP2APP EDL file as the start of all user-defined EDL definitions.
Be careful, do not use ASAP1APP.
Modify Entity name and properties.
Append custom Attributes to the end, after ErrorState.
Do not modify or change the order of any preceding attribute.

ENTITY App
   CI ASAP
   Command "APP\*,RAW,TAB,STATE,AGGREGATE"
   Enabled NO
   ErrorState ErrorState
   Help "Generic Application Entity"
   KeyForNode NodeName
   KeyForObj Domain
   KeyForRow "Dateymd Time"
   MaxObjectives 100
   SGPFile ASAPXSGP
   SGPManaged YES
   SGPSuffix H
   Reserved NO
   Version 2.10103;
AT NodeName Grid YES Graph NO GraphMax 0 Help "NSK System Name";
AT Sysnum Grid NO Graph NO GraphMax 0 Help "System Number";
AT Domain Grid YES Graph NO GraphMax 0 Help "Domain Name";
AT Status Grid YES Graph YES GraphMax 0 Help "Operational Status" StatePair YES StateIsOp YES StateRule UseStateGraphState TypeData CHAR20;
AT OpState Grid NO Graph NO GraphMax 9 Help "Operational State";
AT Dateymd Grid NO Graph NO GraphMax 0 Help "Date of Stats";
AT Time Grid YES Graph YES GraphMax 0 Help "Time of Stats";
AT Valid Grid NO Graph NO GraphMax 0 Help "Validity Flag";
AT ET Grid NO Graph NO GraphMax 0 Help "Elapsed Time in Minutes";
AT CT Grid NO Graph NO GraphMax 0 Help "Count of Attributes";
AT Error Grid YES Graph NO GraphMax 0 Format I4 Help "Collection Error" StatePair YES StateRule UseStateGraphState TypeData INT64;
AT ErrorState Grid NO Graph NO GraphMax 9 Help "State of Error";

81

Define EDL - Entity Name & Properties

ENTITY ATM
   CI ASAP
   Command "APP\*ATM,DETAIL,RAW,TAB,STATE,AGGREGATE"
   Detail "APP^,TAB,STATE,DETAIL,MINSTATE"
   DataItems "0 C, 1 I, 2 I, 3 U, 4 I"
   Enabled YES
   ErrorState ErrorState
   Help "ATM Application"
   KeyForNode NodeName
   KeyForObj Domain
   KeyForRow "Dateymd Time"
   MaxObjectives 200
   SGPFile ASAPXSGP
   SGPManaged YES
   SGPSuffix H
   Reserved NO
   Version 1.00000;

82

Define EDL - Attributes

AT ErrorState Grid NO Graph NO GraphMax 9 Help "State of Error";

AT Error Grid YES Graph NO GraphMax 0 Format I4 Help "Collection Error" StatePair YES StateRule UseStateGraphState TypeData INT64;

AT ErrorState Grid NO Graph NO GraphMax 9 Help "State of Error";

Copy the Error and ErrorState attributes and place them at the end of the file, following the original ErrorState attribute. We will start by modifying the copied lines to become the first custom attribute.

Copied lines

83

Define EDL - Attributes

AT Cash Grid YES Graph NO GraphMax 3000 Format "I6" Help "Cash remaining in ATM" StatePair YES StateRule UseStateGraphState MetricRule "#0" TypeData REAL64;

AT S0 Grid NO Graph NO GraphMax 9 Help "State of Cash ";

Set GraphMax even though UseStateGraphState is on, and it doesn't really make sense for Cash.

Create Cash attribute - modified parts of newly copied lines are in red.

Copied lines

84

After copying the new lines once again to the end of the file, create the TotalRate attribute.

Define EDL - Attributes

AT TotalRate Grid YES Graph NO GraphMax 10 Format "F7.2" Help "Total transaction rate" StatePair YES StateRule UseStateGraphState MetricRule "(#1 + #2)/S" TypeData REAL64;

AT S1 Grid NO Graph NO GraphMax 9 Help "State of TotalRate ";

Consider ASAP rate when setting GraphMax and Format.

85

Define EDL - Attributes

AT TransRate Grid YES Graph NO GraphMax 1000 Format "F7.2" Help "Successful transaction rate" StatePair YES StateRule UseStateGraphState MetricRule "#1/S" TypeData REAL64;

AT S2 Grid NO Graph NO GraphMax 9 Help "State of TransRate ";

Copy and Create the TransRate attribute.

86

Define EDL - Attributes

AT ErrorCount Grid YES Graph NO GraphMax 10 Format "I3" Help "Failed transaction count" StatePair YES StateRule UseStateGraphState MetricRule "#2" TypeData REAL64;

AT S3 Grid NO Graph NO GraphMax 9 Help "State of ErrorCount ";

Create the ErrorCount attribute.

87

Define EDL - Attributes

AT Busy Grid YES Graph NO GraphMax 100 Format "F6.2" Help "Process percent busy" StatePair YES StateRule UseStateGraphState TypeData REAL64;

AT BusyState Grid NO Graph NO GraphMax 9 Help "State of Busy";

Busy is 1 of the 11 ASAPX built-in attributes. No DataItems or MetricRule formulas are required. Just include the definition of a built-in in your EDL file exactly as indicated.

Type in the Busy attribute.

88

Define EDL - Attributes

AT RespTime Grid YES Graph NO GraphMax 10 Format "F6.3" Help "Server response time seconds" StatePair YES StateRule UseStateGraphState MetricRule "#3/(#1 + #2)" TypeData REAL64;

AT S5 Grid NO Graph NO GraphMax 9 Help "State of RespTime ";

Create the RespTime attribute.

89

Define EDL - Attributes

AT RejectRate Grid YES Graph NO GraphMax 0 Format "F5.3" Help "Card reject rate" StatePair YES StateRule UseStateGraphState MetricRule "#4/S" TypeData REAL64;

AT S6 Grid NO Graph NO GraphMax 9 Help "State of Rejects";

Create the RejectRate attribute.

90

Define EDL - Attributes

AT ErrorRate Grid YES Graph NO GraphMax 0 Format "F6.3" Help "Failed transaction rate" StatePair YES StateRule UseStateGraphState MetricRule "#2/S" TypeData REAL64;

AT S7 Grid NO Graph NO GraphMax 9 Help "State of ErrorRate ";

Create the ErrorRate attribute.

91

Completed EDL File

ENTITY ATM
   CI ASAP
   Command "APP\*ATM,RAW,DETAIL,TAB,STATE,AGGREGATE"
   Detail "APP^,TAB,STATE,DETAIL,MINSTATE"
   DataItems "0 C, 1 I, 2 I, 3 U, 4 I"
   Enabled YES
   ErrorState ErrorState
   Help "ATM Application"
   KeyForNode NodeName
   KeyForObj Domain
   KeyForRow "Dateymd Time"
   MaxObjectives 200
   SGPFile ASAPXSGP
   SGPManaged YES
   SGPSuffix H
   Reserved NO
   Version 1.00000;

AT NodeName Grid YES Graph NO GraphMax 0 Help "NSK System Name";

AT Sysnum Grid NO Graph NO GraphMax 0 Help "System Number";

AT Domain Grid YES Graph NO GraphMax 0 Help "Domain Name";

AT Status Grid YES Graph YES GraphMax 0 Help "Operational Status" StatePair YES StateIsOp YES StateRule UseStateGraphState TypeData CHAR15; AT OpState Grid NO Graph NO GraphMax 9 Help "Operational State";

AT Dateymd Grid NO Graph NO GraphMax 0 Help "Date of Stats";

AT Time Grid YES Graph YES GraphMax 0 Help "Time of Stats";

AT Valid Grid NO Graph NO GraphMax 0 Help "Validity Flag";

AT ET Grid NO Graph NO GraphMax 0 Help "Elapsed Time in Minutes";

AT CT Grid NO Graph NO GraphMax 0 Help "Count of Attributes";

AT Error Grid YES Graph NO GraphMax 0 Format I4 Help "Collection Error" StatePair YES StateRule UseStateGraphState TypeData INT64; AT ErrorState Grid NO Graph NO GraphMax 9 Help "State of Error";

92

Completed EDL File

AT Cash Grid YES Graph NO GraphMax 3000 Format "I6" Help "Cash remaining in ATM" StatePair YES StateRule UseStateGraphState MetricRule "#0" TypeData REAL64;

AT S0 Grid NO Graph NO GraphMax 9 Help "State of Cash ";

AT TotalRate Grid YES Graph NO GraphMax 10000 Format "F7.2" Help "Total transaction rate" StatePair YES StateRule UseStateGraphState MetricRule "(#1 + #2)/S" TypeData REAL64;

AT S1 Grid NO Graph NO GraphMax 9 Help "State of TotalRate ";

AT TransRate Grid YES Graph NO GraphMax 10000 Format "F7.2" Help "Successful transaction rate" StatePair YES StateRule UseStateGraphState MetricRule "#1/S" TypeData REAL64;

AT S2 Grid NO Graph NO GraphMax 9 Help "State of TransRate ";

AT ErrorCount Grid YES Graph NO GraphMax 1000 Format "I3" Help "Failed transaction count" StatePair YES StateRule UseStateGraphState MetricRule "#2" TypeData REAL64;

AT S3 Grid NO Graph NO GraphMax 9 Help "State of ErrorCount ";

AT Busy Grid YES Graph NO GraphMax 100 Format "F6.2" Help "Process percent busy" StatePair YES StateRule UseStateGraphState TypeData REAL64;

AT BusyState Grid NO Graph NO GraphMax 9 Help "State of Busy ";

AT RespTime Grid YES Graph NO GraphMax 10 Format "F6.3" Help "Server response time" StatePair YES StateRule UseStateGraphState MetricRule "#3/(#1 + #2)" TypeData REAL64;

AT S5 Grid NO Graph NO GraphMax 9 Help "State of RespTime ";

AT RejectRate Grid YES Graph NO GraphMax 100 Format "F5.3" Help "Card reject rate" StatePair YES StateRule UseStateGraphState MetricRule "#4/S" TypeData REAL64;

AT S6 Grid NO Graph NO GraphMax 9 Help "State of Rejects";

AT ErrorRate Grid YES Graph NO GraphMax 10000 Format "F6.3" Help "Failed transaction rate" StatePair YES StateRule UseStateGraphState MetricRule "#2/S" TypeData REAL64;

AT S7 Grid NO Graph NO GraphMax 9 Help "State of ErrorRate ";
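To make the MetricRule formulas concrete, here is a minimal Python sketch that evaluates several of them for a single sampling interval. It is an illustration, not ASAPX's actual evaluator. The function name and the dictionary layout are hypothetical, and it assumes that "#n" on a counter or time item means the change over the interval, that "#n" on a constant means the current value, that "S" is the interval length in seconds, and that type U values are microseconds.

```python
# Sketch only: evaluates a few of the ATM MetricRules for one interval.
# Assumptions (not stated in the slides): "#n" on a counter or time item is
# its change over the interval, "#n" on a constant is its current value,
# "S" is the interval length in seconds, and type U values are microseconds.

def compute_metrics(prev, curr, interval_seconds):
    """prev/curr map DataItem index -> value sampled at interval start/end."""
    ok = curr[1] - prev[1]                          # item 1 (I): successes
    failed = curr[2] - prev[2]                      # item 2 (I): failures
    resp_seconds = (curr[3] - prev[3]) / 1_000_000  # item 3 (U): time spent
    total = ok + failed
    return {
        "Cash": curr[0],                            # MetricRule "#0"
        "TotalRate": total / interval_seconds,      # "(#1 + #2)/S"
        "TransRate": ok / interval_seconds,         # "#1/S"
        "ErrorCount": failed,                       # "#2"
        # "#3/(#1 + #2)"; the zero guard is a choice made for this sketch
        "RespTime": resp_seconds / total if total else 0.0,
        "ErrorRate": failed / interval_seconds,     # "#2/S"
    }


if __name__ == "__main__":
    start = {0: 5000, 1: 100, 2: 10, 3: 2_000_000}
    end = {0: 4800, 1: 160, 2: 20, 3: 9_000_000}
    for name, value in compute_metrics(start, end, 60).items():
        print(name, value)
```

With these sample values, 70 transactions finish in the interval and together account for 7 seconds of server time, so the RespTime rule yields 0.1.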

93

Load/Distribute/Verify EDL

Place new EDL file

– On all NSK servers: $SYSTEM.SYSTEM.MYEDL

– On all ASAP Client workstations: C:\Program Files\Tandem\Asap\Edl\MYEDL.edl

Add EDL file to ASAPUSER file

Compile and verify EDL

– On ASAP Client using IDE window

– On NSK Server using ASAP CI

EDL automatically compiled at startup

Use SHOW command

94

Load/Distribute/Verify EDL

Restart ASAP on all NSK servers to pick up new EDL

– Watch EMS events for possible EDL error messages

Restart all ASAP Client sessions

– Restarts the Client’s private ASAP CI

95

Test EDL - Use ASAPXTST program

Register a domain for your new Entity.

Update DataItems with expected values.

Validate results using APP commands and ASAP Client.

1+ register
error := ASAP_REGISTER_(domain^name:domain^name^len -- in/required
                       ,seg^offset                  -- out/required
                       ,error^detail                -- out/optional
                       ,segment^id                  -- in/optional, default 0
                       ,segment^base                -- in/optional
                       ,version                     -- in/optional, default
                       ,asap^id:id^len              -- in/optional, default
                       ,flags )                     -- in/optional, bits are:
                                                    -- 13 allow replace on non-constants
                                                    -- 14 don't concat process name
                                                    -- 15 start with rank deactivated
please enter:
  domain^name(): myapp\test
  segment^id():
  segment^base():
  version():
  asap^id:id^len("$SG":4):
  flags():
error: 0
error^detail: 0
result: domain registered
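The flags argument is described by bit number, which is easy to misread if you expect bit 0 to be the low-order bit. The Python sketch below builds a flags word under the assumption that flags is a 16-bit word with TAL-style numbering, where <0> is the most significant bit and <15> the least significant. The constant names and the helper function are hypothetical labels, not part of ASAPX.

```python
# Sketch only. Assumes a 16-bit flags word with TAL-style bit numbering:
# <0> is the most significant bit, <15> the least significant.
# The names below are hypothetical labels for the documented bits.
ALLOW_REPLACE_NON_CONSTANTS = 13
NO_CONCAT_PROCESS_NAME = 14
START_RANK_DEACTIVATED = 15


def flags_word(*bits):
    """Return the integer value of a flags word with the given bits set."""
    word = 0
    for bit in bits:
        if not 13 <= bit <= 15:
            raise ValueError("flags bits 0 through 12 are reserved")
        word |= 1 << (15 - bit)   # convert TAL bit number to a shift count
    return word


if __name__ == "__main__":
    print(flags_word(NO_CONCAT_PROCESS_NAME, START_RANK_DEACTIVATED))
```

Under that numbering, leaving flags empty at the ASAPXTST prompt corresponds to a word of 0 with none of the three options selected.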

96

Identify Code Points

Where is work committed or failed?

Where are important values computed?

Where are the I/O operations?

Application Source

ComputeCash; <code point 0>

EndTransaction; <code point 1>

AbortTransaction; <code point 2>

ReadUpdate $Receive

Reply $Receive <code points 3>

ComputeShares; <code point 4>

97

Instrument the Application

Insert one or more ASAP_REGISTER_ procedure calls into the application startup sequence.

Insert ASAP_UPDATE_ procedure calls at the application code points.

Handle errors returned

– For some errors call ASAP_REMOVE_ to remove the domain before re-registering with ASAP using ASAP_REGISTER_.

98

Program Globals and Declarations Example

int(32) asap^offset := -1d
       ,.ext asap^offset2;
?nolist, source zaspxtal
?nolist, source $system.system.asapxdec
?library $system.system.asapxlib
proc do^asap(ditem,val,math) extensible;
  int ditem; int(64) val; int math;
  forward;
...

99

ASAP Interface Procedure Example

proc do^asap(ditem,val,math) extensible;
  int ditem; int(64) val; int math;
begin
  string .dname[0:11] := ["NONSTOP\DEMO"];
  int err := 0, err^dtl := 0;
  if asap^offset <= 0d then
    begin
    @asap^offset2 := $xadr(asap^offset);
    if (err := asap_register_(dname:12,asap^offset2, err^dtl)) <> 0 then
      error^msg(4021,err,err^dtl);
    end;
  if asap^offset <= 0d then return;
  if (err := asap_update_(asap^offset,err^dtl,ditem,val,
                          $optional($param(math),math))) <> 0 then
    begin
    error^msg(4022,err,err^dtl);
    err := asap_remove_(asap^offset,err^dtl,,1);
    asap^offset := -1d;
    end;
end; -- of proc do^asap
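The control flow of do^asap (register on first use, update, then remove and reset on an update error so the next call registers again) can be sketched in Python as follows. FakeAsapx is a hypothetical in-memory stand-in written only for this illustration. It is not the ASAPX API, and the math parameter is left out because the slides do not describe its semantics.

```python
# Sketch only: FakeAsapx is a hypothetical stand-in, not the real ASAPX API.
class FakeAsapx:
    def __init__(self):
        self.domains = {}            # offset -> list of (item, value) updates
        self.next_offset = 1
        self.fail_next_update = False

    def register(self, domain_name):
        offset = self.next_offset
        self.next_offset += 1
        self.domains[offset] = []
        return 0, offset             # (error, offset)

    def update(self, offset, item, value):
        if self.fail_next_update:    # lets a caller simulate an update error
            self.fail_next_update = False
            return 1
        self.domains[offset].append((item, value))
        return 0

    def remove(self, offset):
        self.domains.pop(offset, None)
        return 0


class AsapReporter:
    """Mirrors do^asap: register lazily, update, reset the offset on error."""

    def __init__(self, asapx, domain_name):
        self.asapx = asapx
        self.domain_name = domain_name
        self.offset = -1             # -1 means "not registered"

    def report(self, item, value):
        if self.offset <= 0:
            err, offset = self.asapx.register(self.domain_name)
            if err != 0:
                return False         # the TAL version logs, then returns
            self.offset = offset
        if self.asapx.update(self.offset, item, value) != 0:
            self.asapx.remove(self.offset)
            self.offset = -1         # forces re-registration on the next call
            return False
        return True


if __name__ == "__main__":
    reporter = AsapReporter(FakeAsapx(), "NONSTOP\\DEMO")
    print(reporter.report(1, 1), reporter.offset)
```

Keeping the registration state inside one wrapper means the application's code points only ever make a single call, which is the same design choice the TAL procedure makes with its global asap^offset.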

100

ASAP Within your Application

ComputeCash(cash);
do^asap(0,cash,1);
.
.
endtransaction;
do^asap(1,1f);
.
.
aborttransaction;
do^asap(2,1f);
.
.
readupdate(…;
rtime := juliantimestamp;
.
Reply(…;
do^asap(3,juliantimestamp-rtime);
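The response-time code point brackets the request: take a timestamp when the request is read, then report the elapsed time once the reply has been sent. A minimal Python sketch of that pattern follows. The names timed_reply, handle_request, and record are hypothetical, and it assumes the elapsed time is reported in microseconds.

```python
import time


def timed_reply(handle_request, record):
    """Run handle_request() and pass the elapsed microseconds to record().

    Sketch only: record stands in for a call such as do^asap(3, elapsed).
    """
    start_ns = time.monotonic_ns()          # taken when the request is read
    result = handle_request()               # do the work and send the reply
    elapsed_us = (time.monotonic_ns() - start_ns) // 1_000
    record(3, elapsed_us)                   # DataItem 3 carries response time
    return result


if __name__ == "__main__":
    samples = []
    timed_reply(lambda: time.sleep(0.01), lambda item, us: samples.append((item, us)))
    print(samples)
```

Because the elapsed time is accumulated in one DataItem while completed and failed transactions are counted in two others, a rule such as "#3/(#1 + #2)" can turn the three into an average per transaction.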

101

Final Test

Make sure ASAP is running.

Start your application.

Check data using the APP command.

Check data using ASAP Client.

Goal achieved, application instrumented for ASAP.