Chapter 05 Working with Datasets
© 2006 IBM Corporation
Chapter objectives
Be able to:
Explain what a data set is
Describe data set naming conventions and record formats
List some access methods for managing data and programs
Explain what catalogs and VTOCs are used for
Create, delete, and modify data sets
Key terms in this chapter
block size
catalog
data set
high level qualifier (HLQ)
library
logical record length (LRECL)
member
PDS and PDSE
record format (RECFM)
system managed storage (SMS)
virtual storage access method (VSAM)
VTOC
What is a data set?
A data set is a collection of logically related data records stored on one disk storage volume or a set of volumes.
A data set can be:
– a source program
– a library of macros
– a file of data records used by a processing program.
You can print a data set or display it on a terminal. The logical record is the basic unit of information used by a program running on z/OS.
What an access method is
Defines the technique used to store and retrieve data.
Includes system-provided programs and utilities to define and process data sets.
Commonly used access methods include the following:
– VSAM, QSAM, BSAM, BDAM, and BPAM.
DASD: Use and terminology
Direct Access Storage Device (DASD) is another name for a disk drive.
DASD volumes are used for storing data and executable programs.
Data sets in a z/OS system are organized on DASD volumes.
– A disk drive contains cylinders
– Cylinders contain tracks
– Tracks contain data records.
Using a data set
To use a data set, you first allocate it. Then, access the data using macros for the access method that you have chosen.
Various ways to allocate a data set:
– ISPF data set panel, option 3.2
– Access Method Services
– TSO ALLOCATE command
– job control language (JCL)
Allocating space on DASD volumes
How space is specified:
– explicitly (SPACE parameter)
– implicitly (SMS data class)
Logical records and blocks:
– A logical record is the smallest amount of data to be processed
– Logical records are grouped into physical records called blocks
Data set extents:
– Space for a disk data set is assigned in extents
Data set record formats
– F: Fixed records.
– FB: Fixed blocked records. BLKSIZE = n * LRECL
– V: Variable records.
– VB: Variable blocked records. BLKSIZE >= 4 + n * largest LRECL
– U: Undefined records. No defined internal structure for the access method.
Variable-length records carry a record descriptor word (RDW), and each block of them carries a block descriptor word (BDW).
Record and block descriptor words are each 4 bytes long.
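The block size rules above can be checked with a little arithmetic. The following is a minimal sketch, not a z/OS interface: the function names are hypothetical, and the VB helper assumes that LRECL already counts each record's 4-byte RDW.

```python
RDW_LEN = 4  # record descriptor word length in bytes (from the slide)
BDW_LEN = 4  # block descriptor word length in bytes (from the slide)


def fb_records_per_block(lrecl, blksize):
    """Return n for a fixed blocked (FB) data set, where BLKSIZE = n * LRECL."""
    if lrecl <= 0 or blksize <= 0:
        raise ValueError("LRECL and BLKSIZE must be positive")
    if blksize % lrecl != 0:
        raise ValueError("for RECFM=FB, BLKSIZE must be a multiple of LRECL")
    return blksize // lrecl


def vb_min_blksize(largest_lrecl, n=1):
    """Smallest BLKSIZE satisfying BLKSIZE >= 4 + n * largest LRECL.

    Assumption: largest_lrecl already includes the record's 4-byte RDW.
    """
    if largest_lrecl <= RDW_LEN or n < 1:
        raise ValueError("LRECL must exceed the RDW length and n must be >= 1")
    return BDW_LEN + n * largest_lrecl


print(fb_records_per_block(lrecl=80, blksize=27920))
print(vb_min_blksize(largest_lrecl=255, n=10))
```

The FB helper rejects a block size that is not an exact multiple of the record length, which is the condition the FB formula expresses.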
Types of data sets
We discuss three types in this class:
– Sequential, partitioned, and VSAM
A sequential data set is a collection of records written and read in sequential order from beginning to end.
A partitioned data set (PDS) is a collection of sequential data sets, called members.
– Consists of a directory and one or more members.
– Also called a library.
A PDSE is a partitioned data set extended.
PDS versus PDSE
PDS data sets:
– Simple and efficient way to organize related groups of sequential files.
PDSE data sets:
– Similar to a PDS, but advantages include:
• Space reclaimed automatically when a member is deleted
• Flexible size
• Can be shared
• Faster directory searches
What is a data set, and how is it stored
[Figure: a sequential data set (Record 1, Record 2, Record 3, Record 4, etc.) and a partitioned data set stored on DASD. The partitioned data set consists of a directory with entries for COMPJCL, JCOPY, and SORT1, the members COMPJCL, JCOPY, and SORT1, available space, and previously used space that is recoverable by the compress utility.]
How data is stored in a z/OS system
Data is stored on a direct access storage device (DASD), a magnetic tape volume, or optical media.
You can store and retrieve records either directly or sequentially.
You use DASD volumes for storing data and executable programs, including the operating system itself, and for temporary working storage.
You can use one DASD volume for many different data sets, and reallocate or reuse space on the volume.
How data sets are named
Data set naming convention: each data set name must be unique.
– Maximum of 44 characters, including the periods
– Maximum of 22 name segments, called level qualifiers
– The first (leftmost) qualifier is the high level qualifier (HLQ)
– Level qualifiers are separated by periods ('.')
– Each level qualifier:
• From 1 to 8 characters
• The first character must be alphabetic (A-Z) or national (@ # $)
• The remaining 7 characters: alphabetic, national, numeric (0-9), or hyphen (-)
• Upper case only
– Example: MYID.JCL.FILE2 (HLQ: MYID, 3 qualifiers)
Member name of a PDS (partitioned data set):
– From 1 to 8 characters
– First character: alphabetic (A-Z) or national (@ # $)
– The remaining 7 characters: alphabetic, national, or numeric (0-9)
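The naming rules above translate directly into a small validator. This is a minimal sketch that encodes only the rules listed on this slide; the function and pattern names are hypothetical, and it is not a substitute for the checks z/OS itself performs.

```python
import re

# One level qualifier: 1 to 8 characters; the first is A-Z or national (@ # $);
# the rest are A-Z, national, 0-9, or hyphen.
QUALIFIER = re.compile(r"[A-Z@#$][A-Z0-9@#$-]{0,7}")
# PDS member name: same first-character rule, no hyphen in the remainder.
MEMBER = re.compile(r"[A-Z@#$][A-Z0-9@#$]{0,7}")


def is_valid_dsname(name):
    """Check the total length (1 to 44) and every period-separated qualifier."""
    if not 1 <= len(name) <= 44:
        return False
    return all(QUALIFIER.fullmatch(q) for q in name.split("."))


def is_valid_member(name):
    """Check a PDS member name against the member rules."""
    return MEMBER.fullmatch(name) is not None


def high_level_qualifier(name):
    """Return the first (leftmost) qualifier of a data set name."""
    return name.split(".")[0]


print(is_valid_dsname("MYID.JCL.FILE2"), high_level_qualifier("MYID.JCL.FILE2"))
```

The limit of 22 qualifiers needs no separate check: 22 one-character qualifiers plus 21 periods already occupy 43 of the 44 characters.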
Catalogs and VTOCs
z/OS uses a catalog, together with a volume table of contents (VTOC) on each DASD volume, to manage the storage and placement of data sets.
VTOC:
– Lists the data sets on a volume
– Lists the free space on the volume.
VTOC location
[Figure: layout of a DASD volume, showing the volume label (volser), the VTOC, the extents (tracks) occupied by the data sets MY.DATA and YOUR.DATA, and free space.]
How a catalog is used
A catalog associates a data set with the volume on which the data set is located.
Locating a data set requires:
– Data set name
– Volume name
– Unit (the volume device type, for example disk or tape)
A typical z/OS system includes a master catalog and numerous user catalogs.
Catalog Structure
[Figure: catalog structure. The master catalog, SYSTEM.MASTER.CATALOG, holds either a data set entry (SYS1.A1) or HLQ aliases (IBMUSER, USER) that point to user catalogs. User catalog USERCAT.IBM holds the data sets with HLQ=IBMUSER, and user catalog USERCAT.COMPANY holds the data sets with HLQ=USER. The data sets shown (IBMUSER.A1, USER.A1, SYS1.A1, IBMUSER.A2, IBMUSER.A3, and USER.TAPE.A1) reside on volumes wrk001 (unit 3390), wrk002 (unit 3390), and tape volume 012345.]
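The lookup path through the master catalog and the user catalogs can be sketched with plain dictionaries. This is a hypothetical in-memory model, not how catalogs are stored or searched on z/OS: the catalog and data set names come from the figure, but the assignment of each data set to a volume is assumed for illustration.

```python
# Hypothetical model. Which volume holds which data set is an assumption.
MASTER_CATALOG = {
    "aliases": {"IBMUSER": "USERCAT.IBM", "USER": "USERCAT.COMPANY"},
    "entries": {"SYS1.A1": ("wrk001", "3390")},
}
USER_CATALOGS = {
    "USERCAT.IBM": {
        "IBMUSER.A1": ("wrk001", "3390"),
        "IBMUSER.A2": ("wrk002", "3390"),
        "IBMUSER.A3": ("wrk002", "3390"),
    },
    "USERCAT.COMPANY": {
        "USER.A1": ("wrk001", "3390"),
        "USER.TAPE.A1": ("012345", "tape"),
    },
}


def locate(dsname):
    """Return (volume, unit) for a cataloged data set, or None if not found."""
    hlq = dsname.split(".")[0]
    user_catalog = MASTER_CATALOG["aliases"].get(hlq)
    if user_catalog is not None:
        # The HLQ is an alias: continue the search in the user catalog.
        return USER_CATALOGS[user_catalog].get(dsname)
    # No alias: the entry, if any, is in the master catalog itself.
    return MASTER_CATALOG["entries"].get(dsname)


print(locate("IBMUSER.A2"))
print(locate("USER.TAPE.A1"))
```

A result of None corresponds to an uncataloged data set, for which the volume and unit have to be supplied by the requester.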
Cataloged and uncataloged data sets
Note the '//' and parameter statements used in job control language (JCL).
Traditional disk capacity (DASD or disk)
Data management in z/OS
Data management involves all of the following tasks:
– allocation,
– placement,
– monitoring,
– migration,
– backup,
– recall,
– recovery, and
– deletion.
Storage management is either manual or automated (or a combination).
In z/OS, DFSMS is used to automate storage management for data sets.
Data Facility Storage Management Subsystem (DFSMS)
VSAM
VSAM stands for virtual storage access method.
VSAM provides more complex functions than other disk access methods.
VSAM data set types:
– Key-sequenced data set (KSDS)
– Entry-sequenced data set (ESDS)
– Relative record data set (RRDS)
– Linear data set (LDS)
VSAM - Structure
VSAM data sets consist of control intervals (CIs) and control areas (CAs).
The sizes of the CI and CA are normally determined by the access method; they are not visible to the user.
Each control area contains a fixed number of control intervals (for example, a CI might be a block and a CA a track).
A control interval normally contains multiple records.
The records are stored at one end of the control interval.
Control information is stored at the other end of the control interval.
The space between them is free space.
Simple VSAM control interval
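[Figure: a simple control interval holding records R1, R2, and R3, followed by free space, then the record descriptor fields (RDFs) and the CIDF at the end of the CI.]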
VSAM – Structure (cont)
The control information has two types of entry:
– a control interval descriptor field (CIDF), which is always present
– record descriptor fields (RDFs), which are present when the CI contains records and describe the length of the associated record.
Free space within a CI is always contiguous.
VSAM – Structure (cont)
Record insertion:
– A new record is placed in the correct order relative to the other records.
– Records may have to be moved within the control interval (CI).
– If there is not enough space, the CI is split (about equally), using a free CI from the pool in the control area (CA).
– If there are no more free CIs in the CA, the CA is split (about equally).
Record deletion:
– When a record is deleted, later records are moved down so that the free space remains contiguous.
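The insertion and deletion rules above can be illustrated with a toy model. This sketch is a deliberate simplification and does not reflect VSAM internals: the class name is hypothetical, each CI's capacity is counted in records rather than bytes, RDFs and the CIDF are ignored, keys are assumed unique, and a CA split is reported as an error rather than performed.

```python
import bisect


class ControlArea:
    """Toy control area: a pool of CIs, each holding a sorted list of keys."""

    def __init__(self, ci_count, ci_capacity):
        if ci_count < 1 or ci_capacity < 2:
            raise ValueError("need at least one CI and room for two records")
        self.ci_capacity = ci_capacity
        self.used = [[]]              # CIs in key order; starts with one empty CI
        self.free_cis = ci_count - 1  # untouched CIs available for splits

    def _find_ci(self, key):
        """Index of the last non-empty CI whose lowest key is <= key, else 0."""
        idx = 0
        for i, ci in enumerate(self.used):
            if ci and ci[0] <= key:
                idx = i
        return idx

    def insert(self, key):
        idx = self._find_ci(key)
        ci = self.used[idx]
        if len(ci) >= self.ci_capacity:
            if self.free_cis == 0:
                raise RuntimeError("no free CI left: a CA split would be needed")
            half = len(ci) // 2          # split about equally
            upper = ci[half:]
            del ci[half:]
            self.used.insert(idx + 1, upper)
            self.free_cis -= 1
            if key >= upper[0]:
                ci = upper
        bisect.insort(ci, key)           # keep records in order within the CI

    def delete(self, key):
        # list.remove shifts later records down, so free space stays contiguous.
        self.used[self._find_ci(key)].remove(key)


ca = ControlArea(ci_count=3, ci_capacity=4)
for k in (10, 20, 30, 40, 25):
    ca.insert(k)
print(ca.used, ca.free_cis)
```

Inserting the fifth key into a full CI moves the upper half of its records into a free CI, which is the CI split described above.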
VSAM – Entry Sequenced Data Set (ESDS)
Records in an ESDS are stored in the order in which they are written.
Records are loaded irrespective of their contents, and their byte addresses cannot be changed.
An ESDS can be accessed by key if an alternate index is defined for it.
Records are accessed in sequential order, that is, the order in which they were written to the file.
Accessing a particular record involves either searching the records sequentially until it is located, or using its relative byte address (RBA), that is, the number of bytes from the beginning of the file at which to start reading.
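The idea of a relative byte address can be shown with an append-only byte store. This is a minimal sketch with a hypothetical class name; it treats the data set as one flat byte string and ignores control intervals and their control information, which a real ESDS does not.

```python
class ToyESDS:
    """Append-only store; each record is addressed by its relative byte address."""

    def __init__(self):
        self._data = bytearray()
        self._lengths = {}  # RBA -> record length

    def append(self, record):
        """Store a record at the end and return its RBA, which never changes."""
        rba = len(self._data)
        self._data += record
        self._lengths[rba] = len(record)
        return rba

    def read(self, rba):
        """Read the record that starts rba bytes from the beginning."""
        length = self._lengths[rba]
        return bytes(self._data[rba:rba + length])


esds = ToyESDS()
first = esds.append(b"ALPHA")
second = esds.append(b"BRAVO!")
print(first, second, esds.read(second))
```

Because records are only ever added at the end, an RBA handed out by `append` stays valid for the life of the store.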
VSAM – Relative Record Data Set (RRDS)
A relative record data set (RRDS) is one of the VSAM data set organizations.
Records are accessed based on their ordinal position in the file (relative record number, RRN). For example, the record to be accessed might be the 42nd record in a file of 999.
An RRDS is similar in concept to an ESDS, but its records can be read by random or dynamic access using the RRN.
Unlike an ESDS, an RRDS cannot have an alternate index.
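Access by relative record number reduces to simple arithmetic when every record occupies a slot of the same length. The sketch below assumes fixed-length slots numbered from 1 and ignores control interval boundaries and control information; the function name is hypothetical.

```python
def rrn_to_offset(rrn, slot_length):
    """Byte offset of the slot for a 1-based relative record number."""
    if rrn < 1 or slot_length < 1:
        raise ValueError("RRN and slot length must be at least 1")
    return (rrn - 1) * slot_length


# The 42nd record of a file with 100-byte slots.
print(rrn_to_offset(42, 100))
```

No search is needed: the position follows from the record number alone, which is what makes random access by RRN cheap.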
VSAM – Linear Data Set
A linear data set (LDS) contains data that can be accessed as byte-addressable strings in virtual storage.
It is a VSAM data set whose CI size is a multiple of 4096 bytes.
An LDS has no embedded control information in its CIs; all LDS bytes are data bytes.
Logical records must be blocked and deblocked by the program; that is, logical records are not apparent from VSAM's point of view.
In a sense, an LDS is a non-VSAM data set with some VSAM facilities, such as VSAM-specific information in the catalog.
The most common LDS user is IBM DB2®. LDS is the VSAM data set organization used by the Data-in-Virtual (DIV) facility.
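Because an LDS carries no record structure, the program decides where each logical record lives. The following sketch shows one possible layout, chosen here purely for illustration: fixed-length records packed into 4096-byte CIs, with no record spanning two CIs. The function name and the layout are assumptions, not something VSAM or DB2 prescribes.

```python
CI_SIZE = 4096  # an LDS CI size is a multiple of 4096; the smallest is used here


def locate_record(index, record_length):
    """Return (CI number, offset within the CI) for a 0-based record index."""
    if not 1 <= record_length <= CI_SIZE:
        raise ValueError("record must fit in one CI")
    if index < 0:
        raise ValueError("index must not be negative")
    per_ci = CI_SIZE // record_length  # whole records that fit in one CI
    return index // per_ci, (index % per_ci) * record_length


print(locate_record(41, 100))
```

With 100-byte records, 40 fit in each CI and the last 96 bytes of every CI are left unused by this particular layout.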
z/OS UNIX file systems
z/OS UNIX System Services (z/OS UNIX) allows z/OS to access UNIX files.
A z/OS UNIX file system is hierarchical and byte-oriented.
Files in the UNIX file system are sequential files and are accessed as byte streams.
UNIX files and traditional z/OS data sets can reside on the same DASD volume.
MVS and file systems: a UNIX file system as a data set
UNIX file system structure
[Figure: a UNIX file system tree. A top-level directory contains subdirectories, and each directory contains files or further directories.]
Comparison of MVS Datasets and UNIX HFS (Hierarchical file system)
Summary
A data set is a collection of logically related data (programs or files)
Data sets are stored on disk drives (DASD) and tape.
Most z/OS data processing is record-oriented.
Byte stream files are present only in z/OS UNIX.
z/OS records follow well-defined formats, based on record format (RECFM), logical record length (LRECL), and the maximum block size (BLKSIZE).
z/OS data set names have up to 44 characters, divided by periods into qualifiers.
Summary (continued)
Catalogs are used to locate data sets.
VSAM is an access method that provides more complex functions than other disk access methods.
z/OS libraries are known as partitioned data sets (PDS or PDSE) and contain members.
A file in the hierarchical file system can be either a text file or a binary file.
z/OS treats an entire UNIX file system hierarchy as a collection of "data sets", where each UNIX "data set" is a mountable file system.