ppt on VSAM
Transcript of ppt on VSAM
1-1
VSAM
Virtual Storage Access Method
Allows for interactive updates (adds, changes, and deletes)
1-2
Three Types of VSAM files
KSDS: Key Sequenced Data Set
RRDS: Relative Record Data Set
ESDS: Entry Sequenced Data Set
1-3
KSDS
Allows for users to access records sequentially or randomly
Includes an index component and a data component
A CLUSTER consists of both the index and data components together
The index relates a value in a key field to the actual location of the record on disk
The index is only used in random (or dynamic) processing
1-4
VSAM Data Component Storage Concepts
Control Interval (CI)
• Fixed amount of storage; must be a multiple of 512 bytes
• Usually 2048 or 4096 bytes
• Data records are stored in a CI
• A CI of 4096 bytes can store forty 100-byte records
Control Area (CA)
• One cylinder in size
• Size of a cylinder varies by disk drive
• Contains numerous CIs
1-5
Approximate number of records in a CI
(CI size - 10) / record length
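As a quick check of the formula above, here is a minimal Python sketch. The function name and the `overhead` parameter are illustrative choices, not part of VSAM; the constant 10 is simply the figure used on this slide, and the result is only an approximation for fixed-length records.

```python
def records_per_ci(ci_size: int, record_length: int, overhead: int = 10) -> int:
    """Approximate number of fixed-length records that fit in one CI.

    Uses the slide's formula: (CI size - 10) / record length, rounded down.
    """
    if ci_size <= 0 or ci_size % 512 != 0:
        raise ValueError("CI size must be a positive multiple of 512")
    if record_length <= 0:
        raise ValueError("record length must be positive")
    return (ci_size - overhead) // record_length


print(records_per_ci(4096, 100))  # 40, matching the example on slide 1-4
print(records_per_ci(2048, 100))  # 20
```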
1-6
Records in a CI
Calculate the number of records in a CI using the formula on the previous slide
Determine the percentage of an average CI to leave open for records that will be added or inserted later
Concept of FreeSpace
• Space where records can be added “on the fly”
1-7
CI in a CA
A fixed number based on CI size and disk drive
For disk drive with ACD001:
• CI of 2048: 315 CI in a CA
• CI of 4096: 180 CI in a CA
Determine the percent of the CIs in a CA to be left open for additions
Known as CA freespace
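The effect of the two freespace percentages can be estimated with a little arithmetic. The sketch below is a rough approximation only: the function names are made up for illustration, the percentages (20% and 10%) are example values, and real VSAM works out freespace in bytes, so its rounding may differ.

```python
def loaded_records_per_ci(records_per_ci: int, ci_free_pct: int) -> int:
    """Records placed in each CI at load time, leaving ci_free_pct percent open."""
    if not 0 <= ci_free_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return records_per_ci * (100 - ci_free_pct) // 100


def loaded_cis_per_ca(cis_per_ca: int, ca_free_pct: int) -> int:
    """CIs used in each CA at load time, leaving ca_free_pct percent of the CIs empty."""
    if not 0 <= ca_free_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return cis_per_ca * (100 - ca_free_pct) // 100


# Example: 4096-byte CI, 100-byte records, 180 CIs per CA (figures from the slides).
per_ci = (4096 - 10) // 100                 # 40 records fit in a CI
print(loaded_records_per_ci(per_ci, 20))    # 32 records loaded, room for 8 more
print(loaded_cis_per_ca(180, 10))           # 162 CIs loaded, 18 left free
```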
1-8
Freespace in a CI
Is used to add records which belong in the CI
Records in a CI are shifted automatically within the CI to accommodate the inserted record
1-9
CI Splits
If there is no freespace in the CI in which the record is to be inserted: CI split
• Half of the records in the CI are moved to a free CI in the same CA
• The inserted record is then inserted in the proper CI
These happen routinely and are accomplished quickly
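The CI split described above can be modeled with a toy simulation. This is a simplified sketch, not how VSAM is implemented: the `ControlArea` class, its fields, and the tiny capacity of four records per CI are all assumptions made to keep the example readable.

```python
import bisect


class ControlArea:
    """Toy model of one CA: in-use CIs kept in key order plus a count of free CIs."""

    def __init__(self, ci_capacity: int, total_cis: int):
        if ci_capacity < 2 or total_cis < 1:
            raise ValueError("need at least 2 records per CI and 1 CI per CA")
        self.ci_capacity = ci_capacity
        self.cis = [[]]                  # each CI is a sorted list of keys
        self.free_cis = total_cis - 1    # CIs not yet holding any records

    def insert(self, key: int) -> None:
        # The target CI is the first one whose highest key is >= key, else the last CI.
        idx = len(self.cis) - 1
        for i, ci in enumerate(self.cis):
            if ci and key <= ci[-1]:
                idx = i
                break
        ci = self.cis[idx]
        if len(ci) >= self.ci_capacity:
            if self.free_cis == 0:
                raise RuntimeError("no free CI in this CA: a CA split would be needed")
            # CI split: move the upper half of the records to a free CI.
            half = len(ci) // 2
            new_ci = ci[half:]
            del ci[half:]
            self.cis.insert(idx + 1, new_ci)
            self.free_cis -= 1
            if key > ci[-1]:
                ci = new_ci
        bisect.insort(ci, key)           # records shift to keep the CI in key order


ca = ControlArea(ci_capacity=4, total_cis=3)
for k in (10, 20, 30, 40):
    ca.insert(k)
ca.insert(25)                            # the CI is full, so it splits
print(ca.cis)                            # [[10, 20], [25, 30, 40]]
```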
1-10
CA splits
If a CI needs to split and there is no free CI in the CA: CA split
• Half of the CIs are moved to a free CA (usually at the end of the file, where there is unused space)
• Therefore, each of the two CAs has 50% free space
• The original CI can now split
CA splits have much overhead and should be avoided!!!
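The 50% figure follows directly from moving half of the CIs. A minimal sketch, assuming a CA is modeled as a plain list of CIs in key order (the function names and the 180-CI figure are illustrative):

```python
def split_ca(full_ca: list) -> tuple:
    """Split a full CA into two CAs, moving the upper half of its CIs to a new CA."""
    half = len(full_ca) // 2
    return full_ca[:half], full_ca[half:]


def free_fraction(ca: list, cis_per_ca: int) -> float:
    """Fraction of a CA's CIs that are unused."""
    return (cis_per_ca - len(ca)) / cis_per_ca


CIS_PER_CA = 180                                   # 4096-byte CIs, per the earlier slide
full_ca = [[key] for key in range(CIS_PER_CA)]     # every CI in use
old_ca, new_ca = split_ca(full_ca)
print(free_fraction(old_ca, CIS_PER_CA))           # 0.5
print(free_fraction(new_ca, CIS_PER_CA))           # 0.5
```

The cost comes from physically moving roughly half a cylinder of data and updating the index, which is why the slides advise avoiding CA splits.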
1-11
Avoid CA splits
Reorganization of files
• Export: Copy the VSAM file to a sequential data set
• Import: Delete and reload the VSAM file from the sequential data set
• Resets the freespace as in the define cluster
Allocate adequate freespace
Analyze the primary key
1-12
Primary Key Analysis
Is there a pattern in the PK?
Example: The Financial Aid office must keep three years of data on-line.
• Previous year: State reporting
• Current year: Distributing aid to students
• Next year: Granting/guaranteeing aid for next year
1-13
Primary Key Analysis
Key options for the financial aid file
1-digit year + SSN
• Little on-line activity on first third of file
• Most “adds” are in last third: CI/CA split likely
SSN + 1-digit year
• Activity is spread evenly throughout the file
• Recommended
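The difference between the two key layouts can be seen by checking where new records land in the sorted file. The sketch below uses made-up 9-digit numbers in place of real SSNs and assumes year digits 1, 2, 3 for previous, current, and next year; all names and data are hypothetical.

```python
import bisect


def adds_by_third(existing_keys, new_keys):
    """Count how many new keys fall in each third of the sorted existing file."""
    ordered = sorted(existing_keys)
    counts = [0, 0, 0]
    for key in new_keys:
        position = bisect.bisect_left(ordered, key)
        counts[min(position * 3 // len(ordered), 2)] += 1
    return counts


# Hypothetical data: 300 existing students, 30 new applicants for next year ("3").
existing_ssns = [f"{n:09d}" for n in range(100, 1000, 3)]
new_ssns = [f"{n:09d}" for n in range(101, 1000, 30)]

year_first = [year + ssn for year in "123" for ssn in existing_ssns]
ssn_first = [ssn + year for ssn in existing_ssns for year in "123"]

print(adds_by_third(year_first, ["3" + ssn for ssn in new_ssns]))  # [0, 0, 30]
print(adds_by_third(ssn_first, [ssn + "3" for ssn in new_ssns]))   # [10, 10, 10]
```

With the year first, every add lands in the last third of the file, which concentrates splits there; with the SSN first, the adds are spread across the whole file.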
1-14
VSAM Index Component
Every KSDS has an index component for each primary key AND each foreign key
Base cluster: Primary key index and Data components
Foreign key index is known as the alternate index
1-15
Primary Key Index and Data
Two parts of the index:
• Sequence set
• Index set
1-16
Primary Key Index and Data
Sequence Set
Lowest level of the index component
Contains information that relates key values to a specific CI
Links the highest PK in a CI to the address of that CI
• Stores all the key-address pairs for the CIs in a CA in one CI of the sequence set
• There is a separate CI in the sequence set for each CA in the data
1-17
Primary Key Index and Data
Index Set
Highest level of the index component
Key-address pairs stored in one CI (can be stored in main memory for processing efficiency)
• Address pointer links to address of the appropriate sequence set
• Based on size of data component (number of cylinders or CAs needed to store the data component), you may need an intermediate index
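The two-level lookup (index set, then sequence set, then data CI) can be sketched as follows. This is a simplified model under stated assumptions: CAs and CIs are plain Python lists, "addresses" are list positions, and the function names are invented for illustration. No intermediate index level is modeled.

```python
import bisect

# Hypothetical data component: two CAs, each holding two CIs of sorted keys.
data = [
    [[10, 20], [30, 40]],   # CA 0
    [[50, 60], [70, 80]],   # CA 1
]


def build_index(data):
    """Build (index_set, sequence_set) from the data component.

    sequence_set[ca] holds (highest key in CI, CI number) pairs for that CA.
    index_set holds (highest key in CA, CA number) pairs.
    """
    sequence_set = [[(ci[-1], ci_no) for ci_no, ci in enumerate(ca)] for ca in data]
    index_set = [(seq[-1][0], ca_no) for ca_no, seq in enumerate(sequence_set)]
    return index_set, sequence_set


def locate(key, index_set, sequence_set):
    """Return (CA number, CI number) of the CI that holds or would hold key."""
    ca_pos = bisect.bisect_left([high for high, _ in index_set], key)
    if ca_pos == len(index_set):
        return None                      # key is above the highest key in the file
    ca_no = index_set[ca_pos][1]
    seq = sequence_set[ca_no]
    ci_pos = bisect.bisect_left([high for high, _ in seq], key)
    return ca_no, seq[ci_pos][1]


index_set, sequence_set = build_index(data)
print(locate(30, index_set, sequence_set))   # (0, 1)
print(locate(60, index_set, sequence_set))   # (1, 0)
print(locate(90, index_set, sequence_set))   # None
```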
1-18
Alternate Index and Data Component
Alternate index relates alternate key to primary key
Then uses the primary key index to locate the data
Can have unique or non-unique AI
• Figure 2-4 (unique)
• Figure 2-5 (non-unique)
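A non-unique alternate index can be pictured as a mapping from each alternate key value to the list of primary keys that carry it. The sketch below uses Python dictionaries as stand-ins for the primary index and the AI; the student records and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical base cluster: primary key -> record.
records = {
    "1001": {"name": "Lee", "major": "CS"},
    "1002": {"name": "Diaz", "major": "MATH"},
    "1003": {"name": "Chen", "major": "CS"},
}


def build_alternate_index(records, field):
    """Map each alternate key value to the primary keys that have it."""
    aix = defaultdict(list)
    for pk in sorted(records):
        aix[records[pk][field]].append(pk)
    return dict(aix)


def read_by_alternate_key(aix, records, alt_key):
    """Alternate key -> primary key(s) -> data records, in two steps."""
    return [records[pk] for pk in aix.get(alt_key, [])]


major_index = build_alternate_index(records, "major")
print(major_index)   # {'CS': ['1001', '1003'], 'MATH': ['1002']}
print([r["name"] for r in read_by_alternate_key(major_index, records, "CS")])  # ['Lee', 'Chen']
```

A unique AI is the special case where every list holds exactly one primary key.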
1-19
Alternate Index
Systems analyst determines whether to update the AI each time the cluster is changed
Can cause much overhead to update all indices especially in the case of a CA split
Other alternative is to re-build the index periodically (every night)
1-20
Relative Record Data Set (RRDS)
Lets the user access each record at random without the overhead of maintaining an index
Instead each record in a RRDS is numbered, starting with 1 for the first record
RRDS consists of a specified number of areas or slots
• Each slot's number is known as the relative record number (RRN)
1-21
RRDS
May need a routine to convert the PK of a record to a relative record number
Hashing
• Most common hashing routine: the remainder option on the divide
Can cause empty slots, which can waste storage, but we avoid CI/CA splits
Difficult to hash if PK is non-numeric
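The division-remainder routine mentioned above is short enough to show in full. In this sketch the function name and the slot count of 97 are arbitrary example choices; 1 is added to the remainder because RRNs start at 1.

```python
def hash_to_rrn(primary_key: int, slots: int) -> int:
    """Division-remainder hashing: map a numeric PK to an RRN in 1..slots."""
    if slots <= 0:
        raise ValueError("the file must have at least one slot")
    return primary_key % slots + 1


print(hash_to_rrn(123456789, 97))   # 40
print(hash_to_rrn(97, 97))          # 1
print(hash_to_rrn(96, 97))          # 97
```

Any slot that no key happens to hash to stays empty, which is the wasted storage the slide refers to.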
1-22
RRDS
Collision
If the hashing routine results in the same RRN for two different PKs
Must set a secondary searching technique in case of collisions
• Usually linear probing: check the next record up to a maximum number of tries
Needed to know if the record to be added already exists
Needed to determine if the record to be retrieved exists without reading the entire file
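Linear probing with a maximum number of tries might look like the sketch below. The `RelativeFile` class is a hypothetical in-memory stand-in for an RRDS; wrapping around from the last slot to the first is an assumption, and deletions are not handled (an empty slot ends the search).

```python
class RelativeFile:
    """Toy RRDS: a fixed list of slots addressed by RRN (1-based)."""

    def __init__(self, slots: int, max_tries: int):
        self.slots = [None] * slots      # slots[i] holds the record for RRN i + 1
        self.max_tries = max_tries

    def _probe(self, pk: int):
        """Yield slot positions: the hashed slot first, then the following slots."""
        start = pk % len(self.slots)
        for step in range(self.max_tries):
            yield (start + step) % len(self.slots)

    def add(self, pk: int, data: str) -> int:
        """Store a record and return its RRN; reject duplicates."""
        for pos in self._probe(pk):
            if self.slots[pos] is None:
                self.slots[pos] = (pk, data)
                return pos + 1
            if self.slots[pos][0] == pk:
                raise KeyError(f"record {pk} already exists")
        raise RuntimeError("no free slot found within the maximum number of tries")

    def read(self, pk: int):
        """Return the record's data, or None, without reading the entire file."""
        for pos in self._probe(pk):
            if self.slots[pos] is None:
                return None
            if self.slots[pos][0] == pk:
                return self.slots[pos][1]
        return None


f = RelativeFile(slots=7, max_tries=3)
print(f.add(10, "A"))   # 4  (10 % 7 = 3, so RRN 4)
print(f.add(17, "B"))   # 5  (collides with 10, next slot is used)
print(f.read(17))       # B
print(f.read(3))        # None
```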
1-23
RRDS
Advantages
No index overhead
Direct relationship between data and location of the data
Permits both random and sequential processing
If good hashing routine with minimal collisions: performance efficiency is excellent
1-24
RRDS
Disadvantages
Storage efficiency
Collisions
Difficulty in determining a good hashing technique
• Difficulty with alphabetic key
Does NOT support the concept of FK or AI
Not widely used
1-25
ESDS
Entry Sequenced Data Set
The simplest type of VSAM file
Records are stored sequentially at time of entry
1-26
ESDS
Similar to sequential processing
Does allow OPEN EXTEND to add records to the end of the file
1-27
ESDS
Author does not recommend it
Says it is restricted to sequential processing
BUT you can build an AI for a FK
BUT as of my current COBOL manuals, COBOL cannot use the AI and processing must be sequential.
This is my current topic for career day questions