Bsit61 Unix


UNIX SYSTEM PROGRAMMING

(BSIT 61)

Contributing Authors

Dr. G.Raghvendra Rao Prof. & H.O.D. Dept. of Computer Science

NIE, Mysore


UNIX SYSTEM PROGRAMMING

Course Introduction

The present course – Unix Systems Programming – can be considered as an

extension course of the operating systems course, where you had been introduced to

the fundamentals of Unix. As you are familiar, an operating system is a platform which

helps the user or most often several users to optimally make use of the system

resources – like the CPU time, printers, memory, other storage devices etc. Unix is one

of the most popular operating systems that was initially introduced as a multi user, multi

process operating system. The student is expected to have undergone a basic study

about these features in the Unix course.

A study of an operating system can be done at two levels – at the user level or at

the system level. The user level programming must have already been studied by the

student. Here, the user becomes familiar with how to create and open files and

directories, save them and execute them using the facilities of the operating system.

Additional features like storing them under a password, storing the files under read,

write or execute restrictions, creating different users and so on can also be a part of

user level programming.

At the next level is system programming. Here, we study the environment under which a Unix program works. Because Unix is a multi-user system, it has to store programs and data belonging to different users and ensure that they are made available only to the respective users (or, in some cases, to other users authorized by the owner). An elaborate system for storing the environment, sharing data and so on has therefore been designed. The study of this environment and its manipulation forms the first part of the study of Unix system programming.

Unix deals with all programs by looking at them as processes. A process can be thought of as a program, or a part of one, in execution, which can be scheduled independently. The combined interaction of all such processes gives the desired program output. Hence Unix has an elaborate system for creating processes, storing them, switching them on and off resources and so on. The sequence in which the processes execute depends on various parameters and on the system environment. The ability to understand and manipulate this environment is the key to success, and it also forms a part of the study.

Since a number of processes keep interacting and also use the system resources in a shared manner, it becomes desirable i) to control the access of the processes to resources depending on their requirements and ii) to ensure that there is no overlapping of resource usage. To do this, the system uses signalling mechanisms such as semaphores. We also learn the use of these mechanisms.

I/O is a very important aspect of programming. Here too, the system must be able to perform the input and output operations pertaining to each of the processes. It is also essential to ensure that I/O operations do not overlap, since that could mix up data belonging to different processes. Hence, semaphores or record locking may be resorted to.

The processes must also be able to interact with each other. Such inter-process communication needs communication channels, which are provided by pipes, streams and so on, typically between processes created by fork. We also talk of co-processes and of daemon processes, which run in the background to carry out a variety of operations.

A few programs to illustrate these operations are also included, but detailed laboratory exercises should supplement the course.

After having seen a bird’s eye view of the coverage of the topics, we have some

words about the material presentation here.


Since this is an advanced course, several prerequisite courses need to have been completed by the student. Two of them are C programming and fundamentals of Unix. Since C and Unix are not separable, a reasonably sound knowledge of C is a must. In particular, a good understanding of pointers, structures, user-defined data structures etc. is essential to follow the course material.

Talking about user-defined data types, a large number of them are defined and used in the course. It is neither possible nor desirable to give an exhaustive list of them at the very beginning, since to understand many of them it is necessary to know something about the quantities they describe. Hence, one often comes across defined types like pid_t and sigset_t, which represent process IDs and signal sets respectively. The candidate is advised to assimilate them as and when they are encountered.

There is an associated problem. Often several descriptions or functions are so interconnected that one cannot be explained without the other. For example, if two concepts a and b each need the support of the other, they must still be dealt with in some order, a before b or b before a. In each such case, the student is advised to take the concept on trust until a detailed analysis is made available. In some cases, that analysis may even be in a different block, to ensure continuity of the concepts involved. It is for the instructors to see that the candidate does not feel out of place. Though every effort is made to maintain a sequential and steady flow of the issues involved, in certain cases it becomes impossible to maintain such a flow. Hence, the instructors need to guide the candidate at such junctures.

Unix system programming is mostly about functions and function calls. However, certain theoretical and conceptual details need to be well understood before the functions are explained. Since this is a Unix course, many of the basic Unix functions, even if they are presumed to have been covered in the operating system course, are reiterated in the first few pages. The student is advised to make sure that these concepts are clearly understood.


The other important aspect is implementation. Unix is not a single system. There are several Unix implementations: SVR4, 4.3BSD and 4.3+BSD, to name a few. There are subtle differences between them. Some of the functions available in one version may differ in detail in another, or may not be available at all. Hence, implementation details need to be fine-tuned to suit the available version.

Further, many of the options, like flag settings, are too complicated and detailed to be included in the course work. Though many of them have been included for the sake of completeness, in a few cases they have been left out to avoid ambiguity and to ensure continuity. In case the candidate needs further details on them, he may have to refer to the system manuals.

The candidates are strongly advised to go through the reference text books,

especially book (1), to get any of their other doubts clarified.

Now we shall go on with the course.


Unit – I

Unit Introduction

In this unit, we recollect the various aspects of Unix fundamentals with which we are already familiar. Though this unit is meant to be a repetition, the student is advised to go through it diligently, since a sound knowledge not only of the functions available for the various operations, but also of the ways in which arguments are passed, the return values one expects from the functions etc. is vital for following the rest of the course.

The unit itself is divided into 3 blocks. The first block discusses the fundamentals of Unix file I/O operations. It is expected to clarify the various operations on Unix files like open, read, write, lseek and close. It also introduces the concept of file descriptors. For each of the operations, the list of optional arguments that change the type of interaction with the system is also included. We also talk about the concepts of file sharing, duplication of descriptors and atomic operations, and about how to change the properties of an already open file.

The second block discusses the various file and directory functions. It begins with the stat functions, which give us an idea about the status of the files. Then we look into the different types of files that one normally comes across. The block then discusses the concept of group and user IDs and the various access permissions for operating on files. The block also deals with the way files are actually stored, with the alignment of various data blocks, and with the concept of symbolic links.

The third block deals with the standard I/O functions. It discusses the need for and the concept of buffering, the methods of opening a stream and of getting data into and out of such streams. The other concept dealt with is that of positioning in a stream. The block also discusses concepts like temporary files, which are created by a program while it runs and are closed once the program ends, and password files, both encrypted password files and shadow password files.


Block – I

Block Introduction

This block introduces the students to the fundamentals of Unix file I/O operations. The understanding of these concepts is doubly important because Unix treats most of its devices and resources as files. The block gives a brief description of each of the operations, the function call, the arguments to be passed and the various options available. It also gives an account of what return value to expect when the function succeeds or fails.

The following functions are dealt with:

open function : To open files
creat function : To create new files
close function : To close files
lseek function : To set the file offset
read function : To read from an open file
write function : To write into an open file

We also look into the concepts of file sharing and atomic operations. We also see how to duplicate file descriptors and how to change the properties of an already open file.


Contents

1.0 Introduction

1.1 Open Function

1.2 Create Function

1.3 Close Function

1.4 Lseek Function

1.5 Read Function

1.6 Write Function

1.7 File Sharing

1.8 Atomic operations

1.9 Duplication of file descriptors

1.10 Changing the properties of an already open file

1.11 Block summary

1.12 Review question and answers.


Fundamentals of Unix file I/O

In this block, we reintroduce ourselves to the fundamentals of file I/O. Though the

student is expected to have studied these and other concepts in some detail in the

earlier courses, it is essential that all concepts are unambiguously understood. So, we

get a bird’s eye view of the same, again, before going into the more advanced topics of

Unix system programming.

Most Unix file I/O operations can be performed, in fact, by using a few primitive

functions: open, write, read, lseek, close. We examine the effect of each and also the

effect of varying the buffer size on these functions. Of course, each of these functions

takes one or more arguments, which further qualify the action indicated.

Before we go into the actual study of the operations, it is also desirable that we

are clear about the concept of file descriptors. To the kernel, all open files are referred

to by their respective file descriptors. Normally, a file descriptor is a non-negative

integer. When a file (either existing file or a new file) is opened, the kernel returns a file

descriptor. When a read or write operation is to be done, the file needs to be identified by

its file descriptor, which was associated with it by the kernel at the time of opening or

creating the file.

Normally, file descriptor 0 (zero) is associated with the standard input, 1 with the standard output and 2 with the standard error. The range of valid descriptor numbers is implementation-dependent; in many older systems it was 0 to 63.

Now, we go on to the descriptions of the I/O functions.


1.1 Open function

The open function can create a new file or open an existing file. Its standard format is as given below:

int open(const char *pathname, int oflag, ... /* mode_t mode */);

This returns a file descriptor if the operation is successful, else returns -1 on error.

Now to the arguments.

The pathname is the name of the file to be opened or created. The options available for this function are specified by the oflag argument. There are a large number of options that can be specified under this head. Some are listed below:

O_RDONLY : Open for reading only

O_WRONLY : Open for writing only

O_RDWR : Open for reading as well as writing

Exactly one of these three options must be specified. In addition, a number of optional flags can be combined with it using the bitwise OR operator. (The ... in the format stands for the mode argument, which is needed only when a file is created.)

Some of these optional flags are listed below:

O_CREAT : Creates the file if it does not exist. This option demands the third argument, "mode", which specifies the access permissions of the new file.

O_APPEND : Appends whatever is written to the end of the file on each write.

O_EXCL : Generates an error if O_CREAT is also specified and the file already exists. If the file does not exist, it is created. The test for existence and the creation form a single atomic operation.

O_TRUNC : If the file exists and is successfully opened for either write-only or read-write, truncates its length to 0.

O_SYNC : Has each write wait for the physical I/O to complete, so that write returns only after the data has reached the device. This keeps the program synchronized with the device, for example with a slow device like a printer.

The successful open operation returns a file descriptor, which is guaranteed to be the

lowest numbered unused descriptor. This property may become useful in some cases.
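The points above can be illustrated with a minimal sketch. The helper names open_rw and reopen_same_slot are hypothetical and not part of any library, and the permission constants S_IRUSR and S_IWUSR (owner read and write) come from <sys/stat.h>, which the text has not introduced yet. The second helper relies on the lowest-unused-descriptor rule and assumes that no other descriptor is opened or closed in between.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open a file for reading and writing, creating it if necessary and
 * truncating it if it exists. The third argument (mode) is required
 * because O_CREAT is given. Returns the descriptor, or -1 on error. */
int open_rw(const char *pathname)
{
    int fd = open(pathname, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        perror("open");
    return fd;
}

/* Close fd and open the same file again, read-only. Because open
 * returns the lowest unused descriptor, the number just released is
 * handed back, provided nothing else changed the descriptor table. */
int reopen_same_slot(int fd, const char *pathname)
{
    if (close(fd) == -1)
        return -1;
    return open(pathname, O_RDONLY);
}
```

A caller would typically check the returned value against -1 before using it in read or write.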

1.2. Create function

This creates a new file. If the file already exists, creat does not return an error for that reason; it truncates the existing file to length 0, provided the caller has write permission for it.

The format is

int creat(const char * pathname, mode_t mode);

It returns the file descriptor if OK, returns –1 on error.

One problem with creat is that it opens the file only for writing. If the file is to be read after writing into it, it should be closed and then opened again using open. Equivalently, open can be called directly with O_RDWR, O_CREAT and O_TRUNC.
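As a hedged illustration of the write-only restriction, the sketch below writes through a descriptor obtained from creat, closes it, and reopens the file to read the data back. The function names are hypothetical. The first function shows the open call to which creat is equivalent.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* What creat(pathname, mode) amounts to: write-only, create, truncate. */
int creat_equivalent(const char *pathname, mode_t mode)
{
    return open(pathname, O_WRONLY | O_CREAT | O_TRUNC, mode);
}

/* Write len bytes of msg to a freshly created file, then close it and
 * reopen it read-only to read the data back into out.
 * Returns the number of bytes read back, or -1 on error. */
ssize_t write_then_read(const char *pathname, const char *msg, size_t len,
                        char *out, size_t outlen)
{
    int fd = creat(pathname, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    close(fd);                      /* this descriptor was write-only */

    fd = open(pathname, O_RDONLY);  /* reopen for reading */
    if (fd == -1)
        return -1;
    ssize_t n = read(fd, out, outlen);
    close(fd);
    return n;
}
```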

1.3. Close function

This will close an already opened file

The format is

int close(int filedes);


When a process is terminated, all opened files are automatically closed and

need not be explicitly closed by the program.

1.4. lseek function

The position at which a file is read from or written into is referred to as the

“current file offset”. This non negative integer measures the number of bytes from

the beginning of the file at which the current activity ( of reading or writing) is taking

place. Also the read/write operations suitably increment the offset as the operations

progress.

By default, the offset is initialized to 0, ( the beginning of the file ) when the file is

opened.

However, if the O_APPEND option is specified, the offset is set to the current length of the file in bytes (that is, to the end of the file) before each write.

However, the offset of an open file can be explicitly positioned at a suitable place

by calling lseek.

The format is

off_t lseek(int filedes, off_t offset, int whence);

This returns the new file offset if OK, else returns -1.

The value of the "whence" argument determines the interpretation of the offset.

i) If whence is SEEK_CUR, the file's offset is set to its current value plus the offset (the offset can be positive or negative).

ii) If whence is SEEK_SET, the file's offset is set to offset bytes from the beginning of the file (so the offset cannot be negative).

iii) If whence is SEEK_END, the file's offset is set to the size of the file plus the offset (the offset can be positive or negative).

Note: Look at the arguments above carefully. SEEK_END adds an offset to the size of the file. Suppose the current size of the file is 100 bytes and we give 50 bytes as the offset. The next operation then takes place at byte offset 150, i.e. if data is written there, a "hole" is created in the file between the old end of the file and the newly written data. These intermediate bytes read back as 0 at a later stage.

You may note that the holes can be created in the other two operations also.
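The creation of a hole can be sketched as follows. The function name make_file_with_hole is hypothetical. With head = 100 and gap = 50 it reproduces the situation in the note: 100 bytes of data, a seek to offset 150, and one byte written there.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file holding head bytes of data, then a hole of gap bytes,
 * then one more byte. Returns the final file size, or -1 on error. */
off_t make_file_with_hole(const char *pathname, off_t head, off_t gap)
{
    int fd = open(pathname, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    for (off_t i = 0; i < head; i++) {
        if (write(fd, "a", 1) != 1) {
            close(fd);
            return -1;
        }
    }

    /* Move the offset gap bytes past the current end of the file. */
    if (lseek(fd, gap, SEEK_END) == (off_t)-1) {
        close(fd);
        return -1;
    }

    /* The hole comes into existence only when something is written. */
    if (write(fd, "z", 1) != 1) {
        close(fd);
        return -1;
    }

    off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}
```

Reading any byte inside the hole afterwards yields the value 0, even though nothing was ever written there.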

1.5. Read function

This is to read data from an open file.

The general format is

ssize_t read(int filedes, void *buff, size_t nbytes);

It returns the number of bytes read if the read operation is successful, and -1 on error. If the end of file is encountered before the read operation starts, it returns 0.

The read operation starts from the current file offset and, as the reading continues, the offset is incremented by the number of bytes actually read.

1.6 Write function

This is to write data into an opened file.

The typical format is

ssize_t write(int filedes, const void *buff, size_t nbytes);

This returns the number of bytes written if OK, and -1 on error.

The value returned is usually the same as the nbytes argument; a smaller value means that only part of the data was written. Normally, the write operation starts at the current offset of the file. If the file was opened with O_APPEND, the file's offset is first set to the end of the file, and the offset is incremented by the number of bytes written during the write operation.
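The return values of read and write are easiest to appreciate in a copy loop. The following is a minimal sketch, not a definitive implementation: the names write_all and copy_fd and the 4096-byte buffer size are assumptions made for illustration. It retries a write that was interrupted by a signal (errno set to EINTR) and continues after a short write.

```c
#include <errno.h>
#include <unistd.h>

/* Write exactly len bytes, continuing after short writes and retrying
 * when interrupted. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copy everything from descriptor in to descriptor out. read returns 0
 * at end of file. Returns the total bytes copied, or -1 on error. */
long copy_fd(int in, int out)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (write_all(out, buf, (size_t)n) == -1)
            return -1;
        total += n;
    }
    return n == -1 ? -1 : total;
}
```

Calling copy_fd(0, 1) would copy the standard input to the standard output, using the descriptor numbers introduced earlier.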

1.7. File sharing

Often, different processes need to share open files. Unix supports such operations.

Now, what actually is file sharing and how do processes share files? To answer this,

we need to know the data structures used by the kernel for I/O.

Every process has an entry in the process table. Each such entry has a table of

open file descriptors, in the form of a vector, with one entry per descriptor. Each file

descriptor has i) the file descriptor flags and ii) a pointer to the file table entry.

The kernel maintains a file table for all open files. Each file table entry includes i)

the file status flags of the file (read, write, append etc.. ) ii) the current file offset and

iii) a pointer to the v-node entry for the file ( see next paragraph). Each open file has a

v-node structure. It contains information about the type of file and pointers to the

functions that operate on the file. In many cases, the v-node also contains the i-node for

the file.

The information is read from the disk when the file is opened, so that all relevant

information about the file is readily available.

The figure gives some idea about the data structures we have just now

discussed.

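[Figure: a single process with two open files. The process table entry holds the descriptors fd 0, fd 1 and fd 2, each with its fd flags and a pointer. Each open descriptor points to its own file table entry (file status flags, current file offset, v-node pointer), which in turn points to a v-node table entry (v-node information, i-node information, current file size).]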

Let us examine the case of a single process that has two different files open: one file is open on standard input (file descriptor 0) and the other on standard output (file descriptor 1).

For each of the open files there is a file table entry, which contains details like the file status flags and the current file offset. There is also a v-node table entry for each file, containing the v-node information, the i-node information and the current file size. It can be reached only through the v-node pointer field of the file table entry.

Now suppose two or more independent processes open the same file. Each of them has its own process table entry. Since it is the same file that is being referred to by all of them, there is a single, common v-node table entry. But the file table entries have to be different, because each of the processes may be reading from or writing to a different part of the file, i.e. their offsets are different.

[Figure: two independent processes with the same file open. Each process table entry points to its own file table entry (file status flags, current file offset, v-node pointer), and both file table entries point to the same v-node table entry.]

The student is advised to verify what happens when operations like write, write with O_APPEND, lseek etc. are undertaken, and will do well to trace the resulting changes in these tables.
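One way to trace these tables is a small experiment. The sketch below makes a simplifying assumption: instead of two processes, it opens the same file twice within one process, which also produces two separate file table entries. The function name independent_offsets is hypothetical. Reading through the first descriptor advances only its own offset.

```c
#include <fcntl.h>
#include <unistd.h>

/* Open the same file twice and read n bytes through the first
 * descriptor only. The resulting offsets of both descriptors are
 * stored in *off1 and *off2. Returns 0 on success, -1 on error. */
int independent_offsets(const char *pathname, size_t n,
                        off_t *off1, off_t *off2)
{
    char buf[64];
    int fd1 = open(pathname, O_RDONLY);
    if (fd1 == -1)
        return -1;
    int fd2 = open(pathname, O_RDONLY);   /* a second file table entry */
    if (fd2 == -1) {
        close(fd1);
        return -1;
    }

    if (n > sizeof buf)
        n = sizeof buf;
    ssize_t r = read(fd1, buf, n);        /* advances fd1's offset only */

    *off1 = lseek(fd1, 0, SEEK_CUR);      /* query without moving */
    *off2 = lseek(fd2, 0, SEEK_CUR);

    close(fd1);
    close(fd2);
    return r == -1 ? -1 : 0;
}
```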

1.8. Atomic operations

Since Unix allows two or more processes to operate on a file simultaneously, several problems arise. Consider the following situation:

Processes A and B are operating on a file. Both are trying to append some data of their own to the file. Different scenarios arise.

We have to note that, without special support, an append needs two steps: one first has to do an lseek to the end of the file (since the new data is to be written at the end of the file) and then do the write. Suppose process A has done the lseek and positioned its offset at the end of the file. Meanwhile the kernel switches to process B. B also seeks to the end of the file and, because it still has time left, also writes its data. Now suppose the end of the file was at 2000 bytes and B writes another 200 bytes; the end of the file is then extended to 2200 bytes. Suppose process A is now given the CPU. Since A's offset still points to 2000, it starts writing from that location, overwriting the data written by B.

It is easy to see that a similar situation arises if A is allowed to write after the lseek of B, only that now the data of A will be overwritten.

The problem arises because we are using two separate operations (lseek and write) to perform the append. Each of them is atomic on its own, but a process switch can take place between the two, leading to the problems described above.

Modern versions of Unix overcome this problem through the O_APPEND flag. A write to a file opened with O_APPEND still involves the two stages of seeking to the end of the file and then writing the data, but the kernel performs them as a single atomic operation. No other process can slip in between them, and none of the anomalies that we have described above can take place.
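The effect of O_APPEND can be sketched with two descriptors that stand in for processes A and B. This is a simplification: both descriptors live in one process, and the function name is hypothetical. Each descriptor has its own offset, yet no write overwrites the other's data, because every write goes to the current end of the file.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Append to one file through two independently opened descriptors.
 * Returns the final file size, or -1 on error. */
off_t append_through_two_descriptors(const char *pathname)
{
    int a = open(pathname, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND,
                 S_IRUSR | S_IWUSR);
    if (a == -1)
        return -1;
    int b = open(pathname, O_WRONLY | O_APPEND);
    if (b == -1) {
        close(a);
        return -1;
    }

    off_t size = -1;
    if (write(a, "AAAA", 4) == 4 &&
        write(b, "BB", 2) == 2 &&    /* lands at offset 4, not at 0 */
        write(a, "CCC", 3) == 3)     /* lands at offset 6, after B's data */
        size = lseek(a, 0, SEEK_END);

    close(a);
    close(b);
    return size;
}
```

Without O_APPEND on the second descriptor, its write would start at its own offset 0 and overwrite the first two bytes written through the first one.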

Another example of such an anomaly is the creation of a file. When a new file is to be created, it is desirable that the existence of the file be checked, and that the file be created only if no such file exists.

Suppose the checking and the creation are made two different operations. Then, if the process checks for the non-existence of the file and comes back to create the file in its next time slot, the file may have been created by another process in the intermediate period. This would bring in anomalies. The solution again is to make the check and the creation a single atomic operation, which is what open does when O_CREAT and O_EXCL are specified together.

Now we can define an atomic operation. It is an operation composed of multiple steps. If the operation is performed, then all the steps are completed in one go; otherwise none of them is performed. It is not possible to perform one set of steps in one time slot and perform the remaining steps during the next slot.
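For the file creation example, the atomic form can be sketched with the O_EXCL option described under the open function. The function name create_exclusive and the return value -2 for "already exists" are conventions invented for this sketch; the error code EEXIST from <errno.h> is what open reports in that case.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

/* Create pathname only if it does not exist yet. The existence check
 * and the creation happen as one atomic step inside open.
 * Returns the new descriptor, -2 if the file already existed,
 * or -1 on any other error. */
int create_exclusive(const char *pathname)
{
    int fd = open(pathname, O_WRONLY | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1 && errno == EEXIST)
        return -2;
    return fd;
}
```

If two processes call this function with the same pathname at the same time, only one of them obtains a descriptor.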

1.9 Duplication of file descriptors

An existing file descriptor is duplicated by using either of the following functions:

int dup(int filedes);

int dup2(int filedes, int filedes2);

Both of them return the new file descriptor if OK, and -1 on error.

The new file descriptor returned by dup is guaranteed to be the lowest numbered available file descriptor.

With dup2, the value of the new file descriptor is specified by the filedes2 argument. If filedes2 is already open, it is first closed and then made to refer to the same file as filedes. If filedes equals filedes2, dup2 returns filedes2 without closing it.

1.10. Changing the properties of an already open file:

The fcntl function:

The general format is

#include <fcntl.h>

int fcntl(int filedes, int cmd, ... /* int arg */);

If OK, the return value depends on cmd; on error, it returns -1.

The fcntl function can be used for different purposes:

1. cmd = F_DUPFD. It returns a duplicate file descriptor (note that this is another method, apart from using dup).

2. cmd = F_GETFD or cmd = F_SETFD. Correspondingly, the function gets or sets the file descriptor flags.

3. cmd = F_GETFL or cmd = F_SETFL. Correspondingly, the function gets or sets the file status flags.

4. cmd = F_GETOWN or cmd = F_SETOWN. Correspondingly, the function gets or sets the asynchronous I/O ownership.

5. cmd = F_GETLK or cmd = F_SETLK. The function gets or sets the record locks.

We will quickly see how each of these operates

F_DUPFD : Duplicates the file descriptor filedes. The new file descriptor is returned as the value of the function. It is the lowest numbered descriptor that is not already open and that is greater than or equal to the value of the third argument. The new descriptor has its own set of file descriptor flags, but shares the same file table entry as filedes.

F_GETFD : Returns the file descriptor flags for filedes as the value of

the function.

F_SETFD : Set the file descriptor flags for filedes. The new flag value

is set from the third argument of the function.

F_GETFL : Returns the file status flag for filedes as the value of the

function.

The different file status flags are as below:

O_RDONLY : Open for reading only
O_WRONLY : Open for writing only
O_RDWR : Open for reading and writing
O_APPEND : Append on each write
O_NONBLOCK : Non-blocking mode
O_SYNC : Wait for writes to complete

F_SETFL : Sets the file status flags to the value of the third argument.

F_GETOWN : Gets the process ID or process group ID that currently receives the signals for asynchronous I/O.

F_SETOWN : Sets the process ID or process group ID that is to receive those signals.
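The F_GETFL and F_SETFL commands can be sketched as below. Two points are not stated in the text and are added here as assumptions of the sketch: the three access modes are not separate bits, so the value returned by F_GETFL is first masked with O_ACCMODE (defined in <fcntl.h>) before comparison, and a flag is turned on by fetching the current flags and OR-ing the new one in, so that the other flags are not disturbed. The function names are hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Return the access mode of an open descriptor (O_RDONLY, O_WRONLY or
 * O_RDWR), or -1 on error. */
int access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return flags & O_ACCMODE;
}

/* Turn on additional file status flags (for example O_APPEND) while
 * keeping the ones already set. Returns -1 on error. */
int add_status_flags(int fd, int extra)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | extra);
}
```

For instance, add_status_flags(fd, O_APPEND) switches an already open descriptor to append mode.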


Block Summary

We have looked into the fundamentals of Unix file I/O operations. In particular, we have seen the open, creat, close, lseek, read and write functions. We have discussed the concept of file sharing and the need for atomic operations. The duplication of file descriptors and the changing of the properties of an already open file were also discussed.


Review Questions

1. What is a file descriptor?
2. What value is associated with the standard input and what value with the standard output?
3. Give the format of the function for opening a file.
4. What are the modes in which a file can be opened?
5. What option makes each write wait for the physical I/O on a slow device to complete?
6. In which mode is a file opened by the creat function?
7. Explain the concept of "current file offset".
8. Explain how a hole gets created in a file.
9. What is file sharing? How is it implemented in Unix?
10. Explain the concept of atomic operations.
11. Which function is used for changing the properties of an open file? What is its format?


Answers to review questions:

1. When a file is opened, the kernel returns a file descriptor, a non-negative integer. The file is identified by this descriptor for reading and writing.
2. Standard input is associated with 0, standard output with 1.
3. int open(const char *pathname, int oflag, ... /* mode_t mode */);
4. A file can be opened for reading only, for writing only, or for reading and writing.
5. O_SYNC
6. Write-only mode.
7. The position at which a file is read from or written to is referred to as the current file offset.
8. When lseek is called with SEEK_END and a positive offset, the file offset moves beyond the end of the file. If data is then written there, a hole is created in the file between the old end of file and the beginning of the newly written data.
9. When more than one process wants to access the same file, the concept of file sharing comes in. Unix implements this by creating a separate file table entry for each such process, all of them pointing to the same v-node table entry.
10. When two or more operations have to be performed to complete a task, the process may be switched out after performing only some of them. An example is finding the end of file and appending data to it. If the process is switched out after finding the end of file and comes back to append at a later stage, some other process might have operated on the file in the meantime, thereby changing the scenario. To overcome this, the entire set of operations needs to be completed in one shot. This is called an atomic operation.
11. fcntl. The format is
int fcntl(int filedes, int cmd, ... /* int arg */);


Block –II

Block Introduction

In this block, we look into the various aspects of files, directory and the various

functions operating on them. We first begin with the stat functions – which are designed

to return the status information about the file specified. This status information would be

helpful to us in a variety of ways, as will become obvious later on.

We then get an idea of the various types of files, like regular files, directory files, special files, FIFOs, sockets etc. Each of them has its own unique features and is useful in certain specific situations. We then get some idea about the various permissions on files. To be able to use a file, one should first of all have a valid user ID. There is also the concept of a group ID, wherein a group of users shares a single group ID. In addition, files can be accessed with read, write or execute permission, or various combinations thereof. There is also the concept of ownership of files and directories. Given suitable "access rights", a user will be able to set and modify any or all of the above. We will be familiarizing ourselves with these facts in this block.

Then we get an insight into the way files are actually stored, with the alignment of various data blocks. We should be able to modify these associations or unlink them.

We will also be introduced to the concept of symbolic links. A symbolic link is an indirect pointer to a file, which overcomes some of the drawbacks of normal links. We see the various ways of operating on such symbolic links.

We also get some idea about the concept of times – as viewed from the file point

of view.


Contents

2.1 Stat function

2.2 Types of files

2.3 Set-user-id and set-group-id

2.4 File access permission.

2.5 Ownership of new file and directories

2.6 Access function

2.7 Umask function

2.8 Chmod and fchmod functions

2.9 Chown, fchown and lchown functions

2.10 Filesize

2.11 File truncation

2.12 File systems

2.13 Link, unlink and rename functions

2.14 Symbolic links

2.15 Symlink and readlink functions

2.16 File times

2.17 Utime function

2.18 Mkdir and rmdir functions

2.19 Chdir, fchdir and getcwd

2.20 Block summary

2.21 Review question and answers


Block – II

Files and Directories

The previous block covered the basic I/O functions around regular files. Additional

features of the files, the file systems and other properties of files are examined in this

section.

2.1 Stat functions

There are three stat functions. Typically they appear as below:

int stat(const char *pathname, struct stat *buf);

int fstat(int filedes, struct stat *buf);

int lstat(const char *pathname, struct stat *buf);

All of them return 0 if successful and -1 on error. Now let us see their functions.

stat : Returns a structure of information about the file whose pathname is given.

lstat : Similar to stat, but when the named file is a symbolic link it returns information about the symbolic link itself, and not about the file referenced by the symbolic link. (At this stage it suffices to say that this is useful in situations where we are walking down a directory hierarchy.)

fstat : Obtains information about a file that is already open on the descriptor filedes.


It may be noted that the second argument is a pointer to a structure that we must supply. The function fills in the structure pointed to by buf. Of course, the actual structure fields may differ from implementation to implementation. One typical structure could be as follows:

Struct stat {

mode_t st_mod; /* file type, mode and permissions */

ino_t st_ino; /* i-node number ( serial number)*/

dev_t st_dev; /* device number */

dev_t st_rdev /* device no. for special files */

nlink_t st_nlink; /* number of links */

uid_t st_uid; /* user ID of owner */

gid_t st_gid; /* group id of owner */

off_t st_size; /* size of regular files, in bytes */

time_t st_atime; /*time of last access */

time_t st_mtime /* time of last modification */

time_t st_cttime /* time of last file status change */

long st_blksize /* best I/O block size */

27

Page 28: Bsit61 Unix

long st_blocks /* number of 512 blocks allocated */
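As a minimal sketch of how the stat function and its structure are used together, the fragment below fills a struct stat for a pathname and prints a few of its members. The helper name print_basic_info is our own choice for this illustration and is not part of any library; the casts are there only because the exact integer types of the members vary between implementations.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Print a few members of struct stat for `path`.
 * Returns 0 on success, -1 if stat() fails.
 * (print_basic_info is a hypothetical helper for this sketch.) */
int print_basic_info(const char *path)
{
    struct stat buf;              /* the structure we must supply */

    if (stat(path, &buf) < 0) {
        perror("stat");
        return -1;
    }
    printf("%s: i-node %lu, %ld bytes, %lu link(s), owner uid %lu\n",
           path,
           (unsigned long)buf.st_ino,
           (long)buf.st_size,
           (unsigned long)buf.st_nlink,
           (unsigned long)buf.st_uid);
    return 0;
}
```

Calling print_basic_info("/etc") would print one line describing that directory; replacing stat by lstat in the sketch would describe a symbolic link itself rather than the file it points to.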

2.2 Types of Files

Most of the files encountered in Unix belong to one of two types:

- regular files, or

- directories.

However, these are not the only types that a Unix system supports. The complete list of file types is given below:

1. Regular file: The normal data files, either text or binary. The Unix kernel treats text and binary files in the same manner, the interpretation being left to the application programs.

2. Directory file: Unix treats directories also as files. A directory file contains the names of the files contained in that directory and pointers to them. Only the kernel can write into a directory, but any process that has sufficient permissions can read its contents.

3. Character special file: This is used for certain types of devices on a system.

4. Block special file: This type of file is typically used for disk devices. All devices on the system are either character special files or block special files.

5. FIFO (first in, first out): This type of file is used for communication between processes. It is sometimes referred to as a pipe (the analogy being that whatever enters the pipe first is what comes out first at the other end).

6. Socket: A type of file used for network communication between processes. (A socket can also be used for non-network communication between processes on a single host.) The analogy is with the electrical socket, which connects an electrical device to the electrical network for power transfer.

7. Symbolic link: This type of file points to another file.

The type of a file is encoded in the st_mode member of the stat structure. Each type can be tested for with one of the following macros, which take st_mode as their argument:

Macro         Type of file
S_ISREG()     Regular file
S_ISDIR()     Directory file
S_ISCHR()     Character special file
S_ISBLK()     Block special file
S_ISFIFO()    FIFO
S_ISLNK()     Symbolic link
S_ISSOCK()    Socket

Just to make some of these concepts clear, and also to get some programming practice, we write a simple program which accepts a series of command line arguments and prints the type of the file named by each argument.

#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    int i;
    struct stat buf;
    const char *ptr;

    for (i = 1; i < argc; i++) {
        printf("%s: ", argv[i]);
        /* lstat, not stat, so that a symbolic link is reported as such */
        if (lstat(argv[i], &buf) < 0) {
            perror("lstat error");   /* report the failure and go on */
            continue;
        }
        if (S_ISREG(buf.st_mode)) ptr = "regular";
        else if (S_ISDIR(buf.st_mode)) ptr = "directory";
        else if (S_ISCHR(buf.st_mode)) ptr = "character special";
        else if (S_ISBLK(buf.st_mode)) ptr = "block special";
        else if (S_ISFIFO(buf.st_mode)) ptr = "FIFO";
        else if (S_ISLNK(buf.st_mode)) ptr = "symbolic link";
        else if (S_ISSOCK(buf.st_mode)) ptr = "socket";
        else ptr = "unknown mode";
        printf("%s\n", ptr);
    }
    exit(0);
}

The program is straightforward enough not to need a detailed explanation. It accepts a number of command line arguments and prints the file type of each.

For example, if the command is

$ a.out /vmunix /etc /bin /var/spool/cron/FIFO

the output could be

/vmunix: regular
/etc: directory
/bin: symbolic link
/var/spool/cron/FIFO: FIFO

In most systems, however, regular files and directories form about 90% to 95% of the total number of files stored.

2.3 Set-user-ID and set-group-ID

Every process has several IDs associated with it. Some are listed as follows:

i) The real user ID and real group ID: These IDs identify the user who is running the process. They do not normally change during a login session and are taken from the user's entry in the password file at login.

ii) The effective user ID, effective group ID and supplementary group IDs: These determine the file access permissions of the process.

iii) The saved set-user-ID and saved set-group-ID: These contain copies of the effective user ID and effective group ID, made when a program is executed.

In most cases the effective user ID and effective group ID are identical to the real user ID and real group ID respectively. The exception is a program file whose set-user-ID or set-group-ID bit is set in st_mode: when such a file is executed, the effective user ID (or effective group ID) of the process becomes that of the owner (or group owner) of the file.

Every file has an owner and a group owner. The owner is indicated by st_uid and the group owner by st_gid of the stat structure.

2.4 File access permissions.

The st_mode value also encodes the access permission bits for the file. All types of files have permissions.

The nine permission bits can be listed as follows:

st_mode mask    Meaning
S_IRUSR         user-read
S_IWUSR         user-write
S_IXUSR         user-execute
--------------------------------------------------
S_IRGRP         group-read
S_IWGRP         group-write
S_IXGRP         group-execute
--------------------------------------------------
S_IROTH         others-read
S_IWOTH         others-write
S_IXOTH         others-execute
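To see how the nine masks are applied to an st_mode value, the following sketch converts the permission bits into the familiar nine-letter form used by directory listings. The function name perm_string and its output layout are assumptions made for this illustration only.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Fill `out` (at least 10 bytes) with a string such as "rw-r--r--"
 * describing the nine permission bits in `mode`.
 * (perm_string is a hypothetical helper for this sketch.) */
void perm_string(mode_t mode, char out[10])
{
    static const mode_t bits[9] = {
        S_IRUSR, S_IWUSR, S_IXUSR,      /* user   */
        S_IRGRP, S_IWGRP, S_IXGRP,      /* group  */
        S_IROTH, S_IWOTH, S_IXOTH       /* others */
    };
    static const char letters[] = "rwxrwxrwx";
    int i;

    for (i = 0; i < 9; i++)
        out[i] = (mode & bits[i]) ? letters[i] : '-';
    out[9] = '\0';
}
```

Passing the st_mode member filled in by stat to perm_string gives the permission string of an existing file; the file type bits in st_mode are simply ignored by the masks.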

2.5. Ownership of new files and directories.

When a new file or directory is created, it is appropriate that ownership issues

are also addressed suitably. The rules for the ownership of a new directory are similar

to those of a new file.

The user ID of a new file is set to the effective user ID of the process. The group ID of a new file is, depending on the implementation, either the effective group ID of the process or the group ID of the directory in which the file is created.

2.6. Access function

The prototype of the access function is

int access(const char *pathname, int mode);

It returns 0 if OK, -1 on error.

When accessing a file with the open function, the kernel performs its access tests based on the effective user ID and the effective group ID. The access function, on the other hand, bases its tests on the real user ID and real group ID.

The argument mode is either F_OK or the bitwise OR of any of the other values below:

mode    Description
R_OK    Test for read permission
W_OK    Test for write permission
X_OK    Test for execute permission
F_OK    Test for existence of the file
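A short sketch of the two common uses of access follows: testing for existence with F_OK, and testing for a combination of permissions by ORing the mode values. The helper names file_exists and can_read_write are hypothetical and chosen only for this illustration.

```c
#include <unistd.h>

/* Returns 1 if `path` exists, 0 otherwise.
 * (file_exists is a hypothetical helper for this sketch.) */
int file_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Returns 1 if the real user ID and real group ID of the caller
 * give both read and write permission on `path`, 0 otherwise.
 * (can_read_write is a hypothetical helper for this sketch.) */
int can_read_write(const char *path)
{
    return access(path, R_OK | W_OK) == 0;
}
```

Because the tests use the real IDs, a set-user-ID program can call such a helper to find out whether the user who started it would have been allowed the access on their own.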

2.7 Umask function

This function sets the file mode creation mask for the process and returns the previous value.

The prototype of the umask function is

mode_t umask(mode_t cmask);

It returns the previous file mode creation mask.

The cmask argument is formed as the bitwise OR of any or all of the access permission bits listed in section 2.4. The file mode creation mask is used whenever the process creates a new file or a new directory: any bit that is on in the mask is turned off in the mode of the new file.

A typical umask call could be

umask(S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);

(Which permissions are masked off by this call?)

It is to be noted that most Unix users never set the umask value explicitly. At the time of login, the shell's start-up file sets the value and it is never changed by the user. However, while creating new files, it is always desirable to ensure that specific access conditions are enabled, and we must also be able to modify them at our convenience.
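The effect of the mask can be observed with a sketch like the one below, which creates a file while a given mask is in force and reports the permission bits the new file actually received. The function name create_under_mask is our own, and the sketch assumes that the pathname does not exist yet and that the file system applies no rules (such as default access control lists) beyond the mask.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create `path` requesting rw-rw-rw- while `cmask` is in force.
 * Returns the permission bits of the new file, or (mode_t)-1 on
 * error. (create_under_mask is a hypothetical helper.) */
mode_t create_under_mask(const char *path, mode_t cmask)
{
    struct stat buf;
    mode_t old, result = (mode_t)-1;
    int fd;

    old = umask(cmask);                    /* install the new mask */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL,
              S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);
    if (fd >= 0) {
        if (fstat(fd, &buf) == 0)
            result = buf.st_mode & (S_IRWXU | S_IRWXG | S_IRWXO);
        close(fd);
    }
    umask(old);                            /* restore the caller's mask */
    return result;
}
```

For instance, with a mask of S_IWGRP | S_IWOTH the write bits for group and others are removed from the requested mode, so a typical file system gives the new file rw-r--r--. Note that the sketch restores the previous mask, since the mask is an attribute of the whole process.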

2.8 Chmod and fchmod functions

These two functions allow the user to change the file access permissions of an existing file.

The prototype of the chmod function is

int chmod(const char *pathname, mode_t mode);

It returns 0 if OK, -1 on error.

The prototype of the fchmod function is

int fchmod(int filedes, mode_t mode);

It returns 0 if OK, -1 on error.

The student might have already observed that while chmod operates on a file specified by its pathname, fchmod operates on a file that is already open.

The argument mode is specified as the bitwise OR of the constants listed in the table below:


mode       Description
S_ISUID    Set user ID on execution
S_ISGID    Set group ID on execution
S_ISVTX    Saved text
S_IRWXU    Read, write and execute by owner
S_IRUSR    Read by owner
S_IWUSR    Write by owner
S_IXUSR    Execute by owner
S_IRWXG    Read, write and execute by group
S_IRGRP    Read by group
S_IWGRP    Write by group
S_IXGRP    Execute by group
S_IRWXO    Read, write and execute by others
S_IROTH    Read by others
S_IWOTH    Write by others
S_IXOTH    Execute by others


There are two additional points to be noted.

1. If we try to set the saved-text bit (S_ISVTX) of a regular file and we do not have super user privileges, the bit is automatically turned off in the mode. Stated the other way, if a non-privileged user tries to use this option, the mode argument is silently modified to nullify the attempt. This way, the system prevents malicious users from turning on the saved-text bit for many files and overflowing the swap area.

2. The group ID of a newly created file may be a group to which the calling process does not belong (for example, when the file takes the group ID of its directory, as noted in section 2.5). If the group ID of the file equals neither the effective group ID of the process nor one of its supplementary group IDs, and the process does not have super user privileges, the set-group-ID bit is automatically turned off. This prevents a user from creating a set-group-ID file owned by a group that the user does not belong to.

2.9. Chown, fchown and lchown functions:

The chown functions allow the user to change the user ID of a file, and also its group ID.

The prototypes are

int chown(const char *pathname, uid_t owner, gid_t group);

int fchown(int filedes, uid_t owner, gid_t group);

int lchown(const char *pathname, uid_t owner, gid_t group);

Each of them returns 0 if OK, -1 on error.

The three functions work in a similar fashion, except when the file referenced is a symbolic link. In that case lchown changes the ownership of the symbolic link itself, instead of the file pointed to by the symbolic link.

2.10. File Size

The st_size member of the stat structure contains the size of the file in bytes. This field is meaningful only for regular files, directories and symbolic links. (Why not in other cases?)

For a regular file, a size of 0 bytes is allowed.

For a directory file, the size is usually a multiple of a number such as 16.

For a symbolic link, the file size is the number of bytes in the pathname stored in the link.

It may be noted that holes can be created in a file if a write operation seeks past the current end of file before writing, instead of starting at the exact end of the file.

However, when a file with holes is copied by reading and writing its contents, the holes are written out in the copy as actual bytes of 0.
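The creation of a hole can be sketched as follows. The function writes two bytes, seeks forward past the end of the file, writes two more bytes and then reports st_size. The name make_file_with_hole and the data written are assumptions made for this illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write "ab", skip `gap` bytes beyond the end of file, write "cd".
 * Returns the resulting st_size, or -1 on error.
 * (make_file_with_hole is a hypothetical helper for this sketch.) */
off_t make_file_with_hole(const char *path, off_t gap)
{
    struct stat buf;
    off_t size = -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);

    if (fd < 0)
        return -1;
    if (write(fd, "ab", 2) == 2 &&
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 &&   /* seek past end of file */
        write(fd, "cd", 2) == 2 &&
        fstat(fd, &buf) == 0)
        size = buf.st_size;                        /* 2 + gap + 2 bytes */
    close(fd);
    return size;
}
```

With a gap of 4096 the reported size is 4100 bytes, although only four bytes were ever written. Reading the file back returns bytes of 0 for the hole; whether disk blocks are actually allocated for it depends on the file system.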

2.11. File truncation

Sometimes we may like to truncate a file by chopping off the data at the end of the file.

The prototype is

int truncate(const char *pathname, off_t length);

It returns 0 if OK, -1 on error.

If the file is already open, we use ftruncate:

int ftruncate(int filedes, off_t length);

It returns 0 if OK, -1 on error.

Both functions chop off the portion of the file extending beyond the length specified.

This of course presumes that the present size of the file is larger than length. What if it is not so? If the existing size of the file is less than length, the effect depends on the implementation: some leave the file as it is, while others extend it to the given length, with the added bytes reading back as 0 (possibly creating a hole in the file).
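A minimal sketch of shortening an existing file with ftruncate is shown below. The function name keep_first_bytes is hypothetical; the sketch is meant for the case where the file is at least as long as the requested length, since the behaviour in the opposite case varies as described above.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Cut the file `path` down to `length` bytes.
 * Returns the new st_size, or -1 on error.
 * (keep_first_bytes is a hypothetical helper for this sketch.) */
off_t keep_first_bytes(const char *path, off_t length)
{
    struct stat buf;
    off_t size = -1;
    int fd = open(path, O_WRONLY);   /* ftruncate needs write access */

    if (fd < 0)
        return -1;
    if (ftruncate(fd, length) == 0 && fstat(fd, &buf) == 0)
        size = buf.st_size;
    close(fd);
    return size;
}
```

The same effect could be had in one step with truncate(path, length); the descriptor form is convenient when the program already has the file open for other reasons.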

2.12. File systems.

Before we can appreciate the various complexities involved with links to a file, symbolic links and so on, it is desirable that we have a clear concept of the organization and operational mechanisms of the Unix file system. Different implementations of Unix file systems exist today. What we are aiming at in this subsection is a broad outline that suffices for understanding the concepts involved.


[Figure: a disk drive divided into partitions (Partition 1, Partition 2, Partition 3); each partition holds a file system made up of the boot block, the super block, the i-list, and the directory blocks and data blocks; the i-list is a list of i-nodes.]

A disk drive is divided into a number of partitions, and each partition contains a file system, as indicated in the second level of the figure. Each i-list is actually a list of i-nodes, as in the lowest level.

Suppose we look into a file system in more detail:

[Figure: the i-list, data blocks and directory blocks of a file system; each directory block entry holds an i-node number and a file name.]


A little explanation is in order.

Each file system is made up of an i-list together with a series of directory blocks and data blocks. (After all, Unix manages directories also as files.)

The i-list contains a number of i-nodes, each pointing to the data block(s) of one file. The i-node contains all the information about the file: the file type, the file's access permission bits, the size of the file, pointers to the data blocks of the file and so on. Most of the information in the stat structure is obtained from the i-node.

One more interesting point: when the same file is to appear under a second name or in a different directory of the same file system, or when it is renamed, it is not necessary to physically duplicate the contents of the file. All that is needed is to enter the filename in the corresponding directory and make it point to the same i-node. This is a very important concept that becomes useful in the next section.

2.13. Link, unlink, remove and rename functions

We have seen that any file can have multiple directory entries pointing to its i-node. We create such an additional entry, a link to an existing file, using the link function.

Its prototype is as follows:

int link(const char *existingpath, const char *newpath);

It returns 0 if OK, -1 on error.

This function creates a new directory entry, newpath, that references the existing file existingpath.


If newpath already exists, an error is returned.

One other point to note is that every i-node has a "link count" field, which indicates the number of directory entries that point to the i-node. (In the stat structure it appears as st_nlink.)

When a link operation is executed, the link count of the file is incremented by one.

The creation of the new directory entry and the incrementing of the link count must be an atomic operation.

(Exercise to students: why?)

It may be noted that creating links to directories can cause loops in the file system, creating problems. Thus, creating a link to a directory is normally permitted only to the super user.

To remove an existing directory entry, we use the unlink function.

Its prototype is

int unlink(const char *pathname);

It returns 0 if OK, -1 on error.

This function removes the directory entry and decrements the link count of the file referenced by pathname. If there are other links to the file, the data in the file remains accessible through those links.

Only when the link count reaches 0 can the contents of the file be deleted. One other condition prevents deletion: as long as some process has the file open, its contents are not deleted. Whenever a file is closed, the kernel therefore performs a two-stage check:

i) It counts the number of processes that have the file in question open.

ii) Once the count in i) becomes 0, it checks the link count; if that is also 0, the file's contents are deleted.

If pathname is a symbolic link, unlink removes the symbolic link itself, not the file referenced by the link.

A file can also be unlinked using the remove function.

Its prototype is

int remove(const char *pathname);

It returns 0 if OK, -1 on error.

For a file, remove is the same as unlink, i.e. the file becomes inaccessible through that directory entry. For a directory, remove is the same as rmdir: the directory is removed, provided it is empty.

A file or directory can be renamed with the rename function.

Its prototype is

int rename(const char *oldname, const char *newname);

It returns 0 if OK, -1 on error.

a) If oldname specifies a file (and not a directory), then we are renaming a file. If newname already exists, it must not be a directory; the existing newname is removed and oldname is renamed to newname. Note that renaming involves changing the contents of both the directory containing oldname and the directory containing newname. Hence, to execute this call, one must have write permission in both directories.

b) If oldname specifies a directory, then we are renaming a directory. If newname exists, it must be a directory and it must be empty. In that case the existing newname is removed and oldname is renamed to newname; if newname is a non-empty directory, the call fails.

c) If oldname and newname refer to the same file, the function returns successfully without doing anything.

2.14 Symbolic links:

We have talked about symbolic links often in the previous sections. We look into more formal details in this section.

A symbolic link is an indirect pointer to a file, unlike the hard links of the previous section, which point directly to the i-node of the file. Symbolic links were introduced to overcome some of the limitations imposed by hard links:

a) Hard links normally require that the link and the file reside in the same file system.

b) Only the super user can create a hard link to a directory.

In the case of symbolic links, there is no restriction on where the link and the file it points to reside. Also, anybody can create a symbolic link to a directory.

Not all functions follow a symbolic link to the file it points to. The following table lists the functions that follow symbolic links and those that operate on the link itself.

Function    Follows symbolic link
access      Yes
chdir       Yes
chmod       Yes
chown       Yes
creat       Yes
exec        Yes
lchown      No
link        Yes
lstat       No
mkdir       Yes
mkfifo      Yes
mknod       Yes
open        Yes
opendir     Yes
pathconf    Yes
readlink    No
remove      No
rename      No
stat        Yes
truncate    Yes
unlink      No

2.15 Symlink and readlink functions

symlink creates a symbolic link. Its prototype is

int symlink(const char *actualpath, const char *sympath);

It returns 0 if OK, -1 on error.

A new directory entry, sympath, is created to point to actualpath. It may look strange, but actualpath need not exist when sympath is being created. Also, actualpath and sympath need not reside in the same file system.

Since the open function follows a symbolic link, we need a way to open the link itself and read the name stored in it. The readlink function does this. Its prototype is

int readlink(const char *pathname, char *buf, int bufsize);

If the function is successful, it returns the number of bytes placed into buf; on error it returns -1. The contents placed in buf are not null terminated.
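Because readlink does not add a terminating null byte, the caller has to add it before treating the result as a string. The following sketch wraps that step; the function name read_link_target is hypothetical and used only for this illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/* Read the name stored in the symbolic link `linkpath` into `buf`
 * as a null-terminated string. Returns the length of the name, or
 * -1 on error. A name longer than bufsize - 1 bytes is cut short.
 * (read_link_target is a hypothetical helper for this sketch.) */
long read_link_target(const char *linkpath, char *buf, size_t bufsize)
{
    ssize_t n;

    if (bufsize == 0)
        return -1;
    n = readlink(linkpath, buf, bufsize - 1);   /* leave room for '\0' */
    if (n < 0)
        return -1;
    buf[n] = '\0';                              /* readlink does not do this */
    return (long)n;
}
```

After a call such as symlink("no-such-file", "mylink"), which succeeds even though the target does not exist, read_link_target("mylink", buf, sizeof buf) places the string "no-such-file" in buf.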

2.16 File times

The system maintains three time fields for each file. Their names and purposes are listed in the table below.

Field       Purpose
st_atime    Last access time of the file data
st_mtime    Last modification time of the file data
st_ctime    Last change time of the i-node status

The modification time is the time when the file data was last modified, whereas the changed-status time is the time when the i-node was last changed. A write operation changes the contents of the file, whereas chmod, chown and similar calls change the i-node without touching the file data. The access time can be used by system administrators to delete files that have not been accessed for a long time. Similarly, the modification time and changed-status time can be used to archive only those files that have been modified since the previous archive was made.

2.17 utime function

The access time and modification time of a file can be changed with the utime function.

int utime(const char *pathname, const struct utimbuf *times);

It returns 0 if OK, -1 on error. If times is a null pointer, both times are set to the current time. Otherwise the function uses the structure

struct utimbuf {
    time_t actime;    /* access time */
    time_t modtime;   /* modification time */
};


2.18 mkdir and rmdir functions

mkdir is used to create directories. Its prototype is

int mkdir(const char *pathname, mode_t mode);

This creates a new, empty directory.

Similarly, an empty directory is deleted with the rmdir function. Its prototype is

int rmdir(const char *pathname);

Both functions return 0 if OK, -1 on error.

If the link count of the directory becomes 0 with this call and no other process has the directory open, then the space occupied by the directory is freed. If one or more processes have the directory open when the link count becomes 0, no new files can be created in the directory, and the directory is freed only after the last of those processes closes it.
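The two calls can be tried out together with a sketch such as the following, which creates a directory, confirms its type with stat and removes it again. The function name make_then_remove_dir and the choice of permission bits are assumptions for this illustration; note that a directory normally needs the execute bits, which for a directory mean permission to search it.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create directory `path`, check that it really is a directory,
 * then remove it. Returns 0 if every step succeeded, -1 otherwise.
 * (make_then_remove_dir is a hypothetical helper for this sketch.) */
int make_then_remove_dir(const char *path)
{
    struct stat buf;
    int ok;

    /* rwxr-xr-x before the file mode creation mask is applied */
    if (mkdir(path, S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH) < 0)
        return -1;
    ok = (stat(path, &buf) == 0) && S_ISDIR(buf.st_mode);
    if (rmdir(path) < 0)     /* succeeds because the directory is empty */
        return -1;
    return ok ? 0 : -1;
}
```

If files had been created inside the directory between the two calls, rmdir would fail and the files would have to be unlinked first.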

2.19 Chdir, fchdir and getcwd

Every process has a current working directory. All pathnames that do not begin with a slash are searched for beginning from this directory. The current working directory is an attribute of the process. We can change the current working directory of the calling process by calling the chdir function.

int chdir(const char *pathname);

Similarly, fchdir specifies the new working directory by an open file descriptor:

int fchdir(int filedes);

Both return 0 if OK, -1 on error.

A function is also provided for finding the complete absolute path of the current working directory:

char *getcwd(char *buf, size_t size);

It returns buf, holding the pathname, if successful and NULL on error.
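The following sketch combines the two calls: it changes the working directory and then asks the system for the resulting absolute path. The function name change_and_report is hypothetical and used only for this illustration.

```c
#include <unistd.h>

/* Make `dir` the current working directory and store its absolute
 * pathname in `buf` (of `size` bytes). Returns 0 if OK, -1 on error.
 * (change_and_report is a hypothetical helper for this sketch.) */
int change_and_report(const char *dir, char *buf, size_t size)
{
    if (chdir(dir) < 0)
        return -1;
    /* getcwd fails if `size` is too small for the pathname */
    return (getcwd(buf, size) != NULL) ? 0 : -1;
}
```

The path reported by getcwd need not be spelled the same way as the argument given to chdir, for example when dir is a relative pathname or passes through a symbolic link. Since the working directory is an attribute of the process, the change does not affect the shell that started the program.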


Block Summary

We began with a discussion of the file information made available by the stat functions. Then we were introduced to the several types of files: regular files, directory files, special files, sockets and so on. We discussed the concept of user and group IDs, the ownership of files and directories, and how to change them.

We also discussed symbolic links, which are indirect pointers to files that overcome some of the shortcomings of hard links. We closed the discussion with the concept of file times.


REVIEW QUESTIONS

1. What is the need for the stat functions? Name the various functions and give their formats.

2. What are the different types of files one normally comes across?

3. What are the different file access permissions?

4. What does the umask function do?

5. How do you change the file access permissions of an existing file?

6. What functions are available to change the ownership of files?

7. What do the truncate functions do?

8. Give the format of the remove function.

9. Give the format of the rename function.

10. In what way are symbolic links an improvement over normal links?


Answers:

1. The stat functions allow us to get a structure of information about the file indicated. There are three stat functions:

i) int stat(const char *pathname, struct stat *buf);

ii) int fstat(int filedes, struct stat *buf);

iii) int lstat(const char *pathname, struct stat *buf);

2. The different types of files one encounters in Unix are regular files, directory files, character special files, block special files, FIFOs, sockets and symbolic links.

3. The different file access permissions are read, write and execute, each enabled separately for user, group and others. So there are actually 9 permission bits.

4. It sets the file mode creation mask for the process and returns the previous value.

5. By the chmod function (or fchmod, for a file that is already open), whose format is

int chmod(const char *pathname, mode_t mode);

6. We have the functions chown, fchown and lchown.

7. There are 2 truncate functions:

truncate chops off the portion of a file, named by its pathname, that extends beyond the given length;

ftruncate does the same for a file that is already open.

8. int remove(const char *pathname);

9. The typical format is

int rename(const char *oldname, const char *newname);

10. i) The link and the file the link is pointing to need not reside in the same file system.

ii) There is no restriction that only the super user may create links to directories.


Block Introduction (BLOCK – III)

In this block, we look into some of the concepts of the standard I/O library. The I/O routines of the earlier blocks operate on what are known as file descriptors. With the standard I/O library, when a file is opened, the function returns a pointer to a FILE object instead. This pointer is used for further operations. In this block, we look at some of the aspects of managing standard I/O.

We begin with the concept of buffering, which is normally assumed to be available unless specified otherwise. We discuss fully buffered, line buffered and unbuffered operation, looking at the respective functions and the methods of using them.

Then we go on to the methods of opening a stream. It can be done in various modes like read, write and various combinations, such as appending at the end of the file or truncating the file.

After having opened a stream, we look into the concept of taking input from it and sending output to it. This may be done one character at a time, one line at a time or a block at a time. We have various functions for each of them.

There is also the concept of positioning a stream, wherein we can start our operations at any desired place in the stream by suitably positioning the offset. We will then be looking into the concept of temporary files, which are created on the fly by programs in execution and are removed once the program creating them terminates.

We close the discussion with the important concept of password files. There are drawbacks in storing passwords in a simple file. Hence, Unix systems store them in encrypted form. Even this may not totally solve the problem. So, sometimes we may resort to storing them elsewhere in the system, by making use of the concept of shadow password files.


Contents:

3.1 Introduction

3.2 Concept of buffering

3.3 Opening a stream

3.4 input into and output out of a stream

3.5 Line at a time I/O

3.6 Positioning a stream

3.7 Concept of Temporary Files

3.8 Password file

3.9 Shadow passwords

3.10 Review Question & Answers


3. Standard I/O Library

3.1 Introduction:

Here we briefly look into the concept of the standard I/O library, which handles details such as buffer allocation and performing I/O in optimally sized chunks.

The I/O routines seen so far are centred around file descriptors. When a file is opened, a descriptor is returned for the file by the kernel. All subsequent I/O operations on the file are done using the descriptor. With the standard I/O library, the focus shifts to "streams". A stream is associated with a file when the file is opened or created, and the stream is used in further I/O operations.

When a stream is opened, the standard I/O function fopen returns a pointer to a FILE object. This object is a structure containing the information needed by the standard I/O library. The usual fields of this structure include the file descriptor to be used for I/O, a pointer to a buffer for the stream, the size of the buffer, a count of the number of characters currently in the buffer and so on.

In this section, we get a detailed view of the standard I/O library. As in other cases, we only get some insight into the library, since different implementations may provide slightly different operations.

Before looking into specific aspects of streams, we note that three streams are predefined and automatically available to every process: standard input, standard output and standard error. They refer to the same files as the file descriptors STDIN_FILENO, STDOUT_FILENO and STDERR_FILENO. The streams are referenced through the file pointers stdin, stdout and stderr, which the <stdio.h> header provides.


3.2 Concept of buffering

Buffering is used to keep the number of read and write calls to a minimum. When several similar input or output operations are to be done sequentially, buffering can obviate the need for a system call on every operation.

Further, buffering should ideally be automatic for each I/O stream, so that application programmers need not worry about it. In general, three types of buffering are provided:

a) Fully buffered: Here the actual I/O takes place when the standard I/O buffer is full. Files residing on disk are normally fully buffered by the standard I/O library. The buffer is usually obtained by a call to malloc when I/O is performed on the stream for the first time. Further, a buffer can be "flushed" automatically by the standard I/O routines or by calling the fflush function. For the standard I/O library, flushing means writing out the contents of the buffer, even if it is only partially filled. (In the terminal driver, by contrast, flushing means discarding the data already stored in a buffer.)

b) Line buffered: In this case, I/O takes place when a newline character is encountered. The data can arrive at any speed, even one character at a time, with the actual I/O taking place only when the line is completed. Line buffering is typically used when a stream refers to a terminal, where the data is being typed manually.

c) Unbuffered: Here the characters are not buffered. The actual I/O is expected to take place as early as possible after the function call.

The functions used to change the buffering are setbuf and setvbuf. Their prototypes are as follows:

void setbuf(FILE *fp, char *buf);

int setvbuf(FILE *fp, char *buf, int mode, size_t size);

setbuf returns no value; setvbuf returns 0 if successful, else a nonzero integer.

Common sense dictates that these functions are to be called after opening the stream, but before performing any other operation on it.

setbuf can be used to turn buffering on or off. To turn on buffering, buf must point to a buffer of length BUFSIZ (a constant defined in <stdio.h>). To turn off buffering, buf must be NULL.

setvbuf is used to specify exactly the type of buffering needed. The mode argument chooses among the three types of buffering discussed above:

_IOFBF : fully buffered

_IOLBF : line buffered

_IONBF : unbuffered

If we specify an unbuffered stream but still give a buffer and a size, the unbuffered mode is given priority and the other arguments are ignored.

The following table gives an idea of the various options of buffering:

Function   mode     buf         Buffer and length                     Type of buffering
setbuf              non-null    user buf of length BUFSIZ             fully buffered or line buffered
                    NULL        (no buffer)                           unbuffered
setvbuf    _IOFBF   non-null    user buf of length size               fully buffered
                    NULL        system buffer of appropriate length   fully buffered
           _IOLBF   non-null    user buf of length size               line buffered
                    NULL        system buffer of appropriate length   line buffered
           _IONBF   (ignored)   (no buffer)                           unbuffered

This table is nothing but a summary of the previous discussions of this section.

We can also force a stream to be flushed at any time:

int fflush(FILE *fp);

This forces any unwritten data for the stream to be passed to the kernel.
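A minimal sketch of choosing the buffering for a stream is given below. It opens a file for writing and asks for line buffering, letting the library allocate the buffer itself by passing NULL as buf. The function name open_line_buffered is hypothetical; note that setvbuf is called immediately after the stream is opened and before any other operation on it.

```c
#include <stdio.h>

/* Open `path` for writing with line buffering.
 * Returns the stream, or NULL on error.
 * (open_line_buffered is a hypothetical helper for this sketch.) */
FILE *open_line_buffered(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return NULL;
    /* NULL buffer: the library supplies one of an appropriate size */
    if (setvbuf(fp, NULL, _IOLBF, BUFSIZ) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

A log file is a typical candidate for this treatment, since each completed line then reaches the kernel without waiting for the buffer to fill. Where a partial line must be written out at once, fflush(fp) can still be called explicitly.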

3.3 Opening a stream

fopen, freopen and fdopen

The following are the prototypes of the functions:

FILE *fopen(const char *pathname, const char *type);

FILE *freopen(const char *pathname, const char *type, FILE *fp);

FILE *fdopen(int filedes, const char *type);

All of them return a file pointer if OK, else NULL.

fopen opens a specified file.

freopen opens a specified file on a specified stream. If the stream is already open, it is closed first and then reopened. This is typically used to open a specified file as one of the predefined streams: standard input, standard output or standard error.

fdopen takes an existing file descriptor and associates a standard I/O stream with the descriptor.

The type argument specifies how the stream is to be opened:

type                 Description
r or rb              Open for reading
w or wb              Truncate to 0 length or create for writing
a or ab              Open for appending, or create for writing
r+ or r+b or rb+     Open for reading and writing
w+ or w+b or wb+     Truncate to 0 length or create for reading and writing
a+ or a+b or ab+     Open or create for reading and writing at end of the file

You may wonder why there are two or three different type strings for the same job. The character b allows the standard I/O system to differentiate between text and binary files. On Unix it makes no difference, since the Unix kernel does not distinguish between text and binary files.

A close examination of the type descriptions also brings home another point: not every operation is possible with every type. A summary of the six different ways to open a stream, with the restrictions that apply to each, is given in the following table. Students are advised to work out the reason for each entry.

Condition                                r    w    a    r+   w+   a+

File must already exist                  y    -    -    y    -    -
Previous content of file discarded       -    y    -    -    y    -
Stream can be read                       y    -    -    y    y    y
Stream can be written                    -    y    y    y    y    y
Stream can be written only at the end    -    -    y    -    -    y

By default, a stream that is newly opened is fully buffered, unless it refers to a
terminal device, in which case it is line buffered. However, once the stream is opened,
but before any other operation is performed on it, we can change the buffering if we
want to by using the setbuf or setvbuf functions.

A stream which is open may be closed by using fclose.

int fclose (FILE *fp);

Any buffered output data is flushed before the stream is closed, and any buffered
input data is discarded. Also, when a process terminates normally, all unwritten
buffered data are flushed and all open standard I/O streams are closed.
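The type strings and fclose can be seen at work in the following sketch. The helper names append_line and count_lines are hypothetical, chosen only for this illustration; the type "a" creates the file if needed, while "r" fails if the file does not exist.

```c
#include <stdio.h>

/* Hypothetical helper: append one line to the file at path.
 * Type "a" creates the file if needed and writes only at its end.
 * Returns 0 on success, -1 on failure. */
int append_line(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");
    if (fp == NULL)
        return -1;
    int ok = (fputs(line, fp) != EOF && fputc('\n', fp) != EOF);
    if (fclose(fp) == EOF)          /* fclose flushes the buffered output */
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: count the newlines in a file.
 * Type "r" fails if the file does not exist, in which case -1 is returned. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    long n = 0;
    int c;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            n++;
    fclose(fp);
    return n;
}
```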

3.4 Input into and output out of a stream

There are three different types of unformatted I/O.

a) One character at a time: we can read or write one character at a
time, with the buffering being taken care of by the standard I/O
functions.

b) One line at a time: the fgets and fputs functions do the job. Each
line is terminated with a newline character, and the caller of fgets
specifies the maximum line length it can accept.

c) Direct I/O: this type of I/O is supported by the fread and fwrite functions.
Each read or write operation transfers a certain number of objects,
each of a specified size.

There are three input functions to read one character at a time:

int getc (FILE *fp);

int fgetc (FILE *fp);

int getchar (void);

getchar is equivalent to getc(stdin). There are subtle differences between getc and
fgetc. getc can be implemented as a macro, while fgetc cannot. This means that the
argument of getc should not be an expression with side effects, since a macro may
evaluate it more than once. Further, since fgetc is guaranteed to be a function, we can
pass its address as an argument to another function. Calls to getc are also likely to be
somewhat faster, since a macro avoids the overhead of a function call.

All of them return the next character if OK, and EOF on end of file or error. Notice
that the same value EOF is returned in both cases. To distinguish between the two,
we call either ferror or feof.

int ferror (FILE *fp);

int feof (FILE *fp);

Both of them return nonzero if the condition is true and 0 otherwise.
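The usual character-at-a-time loop, together with the ferror check described above, may be sketched as follows. The function name copy_chars is our own, used only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copy everything from in to out, one character at
 * a time. Returns the number of characters copied, or -1 on an error. */
long copy_chars(FILE *in, FILE *out)
{
    long n = 0;
    int c;                          /* int, not char, so that EOF is distinct */

    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;
        n++;
    }
    if (ferror(in))                 /* EOF was caused by a read error */
        return -1;
    return n;                       /* otherwise feof(in) is true */
}
```

Note that c is declared as an int. If it were a char, the value EOF could not be told apart from a valid character.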

After reading from a stream, we can push a character back by using ungetc.

int ungetc (int c, FILE *fp);

The characters so pushed back are returned by subsequent reads on the stream
in the reverse order of pushing, i.e. the last character pushed back will appear first.
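A common use of pushback is to look at the next character without consuming it. The following is a small sketch; peek_char is a hypothetical name, not a library function. Only one character of pushback is guaranteed by the standard, which is all this helper needs.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character of fp without
 * consuming it, or EOF at end of file or on error. */
int peek_char(FILE *fp)
{
    int c = getc(fp);
    if (c != EOF)
        ungetc(c, fp);              /* push it back for the next read */
    return c;
}
```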

Similarly, we have three output functions:

int putc (int c, FILE *fp);

int fputc (int c, FILE *fp);

int putchar (int c);

3.5 Line at a time I/O

Line at a time input is provided by the following functions

char *fgets (char *buf, int n, FILE *fp);

char *gets (char *buf);

In both cases buf is the buffer to read the line into. gets reads from standard input,
and hence no stream is specified, whereas fgets reads from the specified stream. n in
fgets is the size of the buffer. fgets reads up to and including the next newline, but
never more than (n-1) characters, and the buffer is always terminated by a null
character. gets has no way of knowing the size of the buffer and can therefore
overflow it; it should not be used in new programs and was removed from the C
standard in C11.
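The behaviour of fgets with respect to the newline and the (n-1) limit can be seen in the following sketch. The helper read_line is a name of our own choosing; it strips the trailing newline, if any, and reports the length read.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line (at most n-1 characters) into buf
 * and strip the trailing newline, if one was read.
 * Returns the length of the string stored, or -1 at end of file or on error. */
long read_line(FILE *fp, char *buf, int n)
{
    if (fgets(buf, n, fp) == NULL)
        return -1;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        buf[--len] = '\0';
    return (long)len;
}
```

If a line is longer than the buffer, only the first (n-1) characters are returned and the rest of the line is delivered by the next call.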

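Line at a time output is provided by the two similar functions

int fputs (const char *str, FILE *fp);

int puts (const char *str);

These need little explanation: fputs writes the null-terminated string to the specified
stream, while puts writes it to standard output followed by a newline.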

3.6 Positioning a stream:

Positioning a standard I/O stream becomes important when reads or writes are to
be done at a particular place in an existing, nonempty file.

There are two ways to position a standard I/O stream. The first method makes
use of the functions ftell and fseek, which assume that the file's current position can be
stored in a long integer: ftell returns the current file position and fseek moves to a given
position.

The prototypes are as below

long ftell (FILE *fp);

int fseek (FILE *fp, long offset, int whence);

ftell returns the current file position as a long integer if OK, and -1L on error. fseek
returns 0 if it can successfully position the stream, otherwise it returns a nonzero value.
The whence argument says how offset is interpreted: from the beginning of the file
(SEEK_SET), from the current position (SEEK_CUR) or from the end of the file
(SEEK_END).

Of course, there is also a rewind function to return to the beginning of the stream.

void rewind (FILE *fp);

ANSI C also provides two other functions to do the job

int fgetpos (FILE *fp, fpos_t *pos);

int fsetpos (FILE *fp, const fpos_t *pos);

fgetpos stores the current value of the file's position in the object pointed to by
pos.

fsetpos positions the stream to the location stored in the object pointed to by pos,
which must have been obtained by an earlier call to fgetpos.
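As a small illustration of ftell and fseek, the following sketch finds the size of a file and then returns to where it was. The name stream_size is hypothetical, and the sketch assumes a Unix system, where seeking to the end of a regular file is always meaningful.

```c
#include <stdio.h>

/* Hypothetical helper: return the size of the file behind fp in bytes,
 * leaving the current position unchanged. Returns -1 on error. */
long stream_size(FILE *fp)
{
    long here = ftell(fp);                  /* remember where we are */
    if (here < 0)
        return -1;
    if (fseek(fp, 0L, SEEK_END) != 0)       /* move to the end */
        return -1;
    long size = ftell(fp);
    if (fseek(fp, here, SEEK_SET) != 0)     /* go back */
        return -1;
    return size;
}
```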

3.7 Concept of Temporary Files:

Temporary files are created on the fly by programs in execution and are removed
automatically when they are closed or when the program terminates, as the case may be.

The standard I/O library provides two functions to assist in the creation of temporary
files.

char *tmpnam (char *ptr);

This generates a string that is a valid path name and that does not match the name of
any existing file.

FILE *tmpfile (void);

This creates a temporary binary file (type wb+) that is automatically removed when it is
closed or when the program terminates.

To demonstrate how these functions work, we look at a small piece of code that
works using them.

#include <stdio.h>
#include <stdlib.h>

#define MAXLINE 4096    /* assumed maximum line length */

/* The textbook's err_sys helper is replaced by perror and exit here,
   so that the program is self-contained. */
int
main (void)
{
    char name[L_tmpnam], line[MAXLINE];
    FILE *fp;

    printf ("%s\n", tmpnam (NULL));     /* first temporary name */

    tmpnam (name);                      /* second temporary name */
    printf ("%s\n", name);

    if ((fp = tmpfile ()) == NULL) {    /* create the temporary file */
        perror ("tmpfile error");
        exit (1);
    }
    fputs ("This is a temporary file\n", fp);   /* write into the temporary file */
    rewind (fp);                        /* rewind to read the contents */
    if (fgets (line, sizeof (line), fp) == NULL) {
        perror ("fgets error");         /* display error */
        exit (1);
    }
    fputs (line, stdout);               /* print the line to standard output */
    exit (0);
}

3.8 Password file

The unix password file contains the following fields

Description                  struct passwd member

user name                    char *pw_name
encrypted password           char *pw_passwd
numerical user id            uid_t pw_uid
numerical group id           gid_t pw_gid
comment field                char *pw_gecos
initial working directory    char *pw_dir
initial shell program        char *pw_shell

The description column describes each field of an entry in the password file. The
corresponding values are returned to programs in a structure called struct passwd,
defined in <pwd.h>. The names of its members are given in the member column of
the above table.

There will be one entry for each user.

There will be at least one entry with the user name root. The user id for this entry is 0.

The encrypted password is a copy of the user's password that has been put through
a one-way encryption algorithm. This means that even if some unauthorized person were
to get hold of this field, he cannot directly decipher the password of the user.

Some of the fields may be empty as well.
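Programs do not normally parse the password file themselves. POSIX provides the functions getpwuid and getpwnam, which return a pointer to a filled-in struct passwd. The following sketch uses getpwuid; the helper name print_passwd_entry is our own.

```c
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: print the password file entry for the given
 * user id. Returns 0 if an entry was found, -1 otherwise. */
int print_passwd_entry(uid_t uid)
{
    struct passwd *pw = getpwuid(uid);  /* points to static storage */

    if (pw == NULL)
        return -1;
    printf("name=%s uid=%ld gid=%ld dir=%s shell=%s\n",
           pw->pw_name, (long)pw->pw_uid, (long)pw->pw_gid,
           pw->pw_dir, pw->pw_shell);
    return 0;
}
```

Calling print_passwd_entry(0) would be expected to print the entry for root on a typical system.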

3.9 Shadow passwords

In the previous section, we talked about the password entries being encrypted,
so that even if one were to get the password file, it may not be possible to directly get
back the passwords. However, many unauthorized users make it a point to get a
copy of the password file and make intelligent guesses about the passwords. Each
guess is run through the same one-way algorithm to see if the result matches an entry in
the password file. One may say it is a laborious process, but the success rate, in the
hands of experienced persons, is quite high. To make it harder for such persons to obtain
the encrypted passwords, some systems store them elsewhere, in a file often called the
shadow password file.

They may also require that the user changes the password at regular intervals.
This concept is called password aging.

With this background, we are in a position to look into the actual details of the
Unix process operations.

Block Summary

In this block, we got an introduction to the important standard I/O functions.
We began with the concept of buffering, and discussed full buffering, line buffering
and unbuffered streams.

We then looked into the various methods of opening a stream and the various
combinations of allowed operations. We then looked into input and output on a stream:
the ability to read or write one character at a time, one line at a time or one block at a
time. Also, the methods of positioning at desired points in a stream by the use of offsets
were discussed.

We also discussed the concepts of temporary files, the password file and the
encrypted passwords used by Unix, and also the meaning of shadow passwords, which
help to control the onslaught of password breakers. With this, we are now in a position
to actually start discussions about the Unix processes.

Review Questions

1. What is the need for buffering?

2. What are the 3 types of buffering?

3. What functions are used to set or turn off buffering? Explain.

4. What functions are available for opening a stream?

5. What are the 3 modes of I/O into a stream?

6. What is a temporary file?

7. What is special about the Unix password file?

Answers

1. When a file is accessed repeatedly, it is wasteful to go to the kernel with a
read / write call for every small request. So, to ensure efficient input and output, the
data are buffered and the number of read / write calls is kept to a minimum.

2. a) Fully buffered b) Line buffered c) Unbuffered.

3. The functions setbuf and setvbuf are used.

The prototypes are

void setbuf (FILE *fp, char *buf);

and int setvbuf (FILE *fp, char *buf, int mode, size_t size);

The difference is that while setbuf is used only for turning buffering on or off, setvbuf
allows us to specify exactly the type of buffering needed.

4. There are 3 functions available for opening a stream. They are fopen, freopen and
fdopen. The prototypes are

FILE *fopen (const char *pathname, const char *type);

FILE *freopen (const char *pathname, const char *type, FILE *fp);

FILE *fdopen (int filedes, const char *type);

fopen opens a specified file.

freopen opens a file on a specified stream. If the stream is already open, it is closed
and opened again.

fdopen takes an existing file descriptor and associates a standard I/O stream
with the descriptor.

5. The three modes are i) one character at a time ii) one line at a time

iii) direct I/O.

6. A temporary file is created on the fly by a program in execution and is removed
automatically when it is closed or when the program terminates.

7. Unix systems store passwords in encrypted form, i.e. the user's password is
put through a one-way encryption algorithm and the result is stored in the password file.

UNIT – II

Unit Introduction

This unit is made up of three blocks and is expected to give the student an
insight into some of the most important concepts of system programming.

The first block deals with the various aspects of processes and their control. We
note that each process is identified by its id, can create child processes and can also
interact with other processes. Each program maintains details of the environment in
which it is working. Processes also share system resources, and they may sometimes
have to be made to wait until the resource they need becomes available. It may also
happen that the resource in question is held by a process which itself is waiting for
some other process, and so on, thereby making the wait infinite. Such hazards, together
with the concept of a “race” between processes, and how to avoid them are also
studied. We also get ideas about setting and resetting user ids, the concept of
accounting and the measurement of process times.

The second block deals with the concept of signaling. Signals are essential for
process synchronization and for handling a variety of conditions like hardware
exceptions, terminal generated signals etc. This block gives an idea about the various
signals that are encountered, the actions that a process can take on receiving a signal
(ignore, catch or default action) and the need for the alarm and pause functions. We also
see the concept of masking of signals and of signal sets. We briefly list a few job control
signals also.

The third block gives ideas about what happens during a login. We study the
difference between terminal login and network login and how they are handled. We
look into the idea of process groups and how to get or set their ids. We familiarise
ourselves with the idea of sessions and controlling terminals, and finally we see how
foreground and background job control can be done.

BLOCK – IV

Block Introduction

In this block, we get ourselves acquainted with several of the system concepts.
In fact, this could be considered one of the more important blocks in the course.

The block begins with the environment of a single process. Each program
possesses an environment list, which gives the details of the system environment in
which it is working. There are functions that allow the program to get the values of the
environment variables, as also to set those variables. Further, each program is also
handed certain limits on the system resources that it can use. These can be set or
modified if the need arises. We study these concepts in brief.

Then we move on to the control of processes. Each process is identified by
its identification number (id). A process can create child processes to perform specific
tasks by calling the fork function. There are several aspects to the relationship
between a parent process and a child process. These are discussed in some detail.
We also discuss the various possibilities arising out of the order of termination of the
processes.

There is also the possibility of making a process wait till an event it is
looking for occurs. We may wait in general or for some specific
event. Different types of wait operations will be studied.

We will see the concept of a race condition, wherein two or more processes
operate on shared data or resources and the final outcome depends on the order in
which they happen to run. A related hazard arises when processes are cyclically
waiting for some system resource which is held by some other process. In such a
situation, no process will get to complete its task and the wait becomes infinitely long.
We see how we can counter such situations.

We also have a section on real and effective user ids, how to set or reset them
etc. We touch upon the concepts of process accounting, user identification and
process times. This, in brief, is the overall view of this block.

Contents

4.0 Introduction

4.1 Process termination

4.2 Memory allocation

4.3 Finding the limits of a process

4.4 Unix process control

4.5 The fork function.

4.6 vfork function

4.7 Exit functions

4.8 wait and waitpid functions

4.9 wait3 and wait4 functions

4.10 Race conditions.

4.11 exec functions

4.12 User Ids and Group ids

4.13 System Function

4.14 Process accounting

4.15 User identification

4.16 Process time

4.17 Review Questions

THE UNIX PROCESS

4.0 Introduction : We now look into the concept of a Unix process. To begin with, we
talk of a single-process environment and then move on to the process control primitives,
with which several processes are controlled.

Any C program starts execution with the main function.

The prototype of the main function is

int main (int argc, char *argv[]);

where argc is the number of command line arguments and argv is an array of pointers
to the arguments.

When a C program is started by the kernel, a special startup routine is called
before the main function is called. This startup routine takes certain values, like the
command line arguments and the environment, from the kernel and prepares the stage
for the execution of the main function.
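The following sketch shows how the argument list handed to main can be walked through. The helper name print_args is our own; a real program would simply call print_args(argc, argv) from its main function.

```c
#include <stdio.h>

/* Hypothetical helper: echo each command line argument with its index.
 * Returns the number of arguments printed. */
int print_args(int argc, char *argv[])
{
    for (int i = 0; i < argc; i++)
        printf("argv[%d]: %s\n", i, argv[i]);
    return argc;
}
```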

4.1 Process termination

Processes can be terminated in any of the following ways

a) return from main

b) calling exit

c) calling _exit

d) calling abort

e) by a terminating signal.

Of these, (a), (b) and (c) may be termed normal terminations, while (d) and (e) are
abnormal terminations.

Return from main is the most natural way of terminating a process.
The startup routine mentioned in the previous section ensures that when the main
function completes execution, exit is called.

We look into the other methods of process termination briefly.

4.1.1 exit and _exit functions:

void exit (int status);

void _exit (int status);

Both these functions terminate the process normally. Whereas _exit returns to the
kernel immediately, exit returns to the kernel only after certain cleanup operations, such
as flushing and closing all open standard I/O streams, are performed.

Most Unix shells provide us a way to examine the exit status of a process. If either
function is called without an exit status, or main does not return a value, the exit status
of the process may be undefined.
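The difference between the two functions can be observed with the following sketch. It goes a little beyond the text above: it registers a cleanup handler with the standard function atexit, and it uses fork, pipe and waitpid, which are taken up later in this block. The names cleanup and cleanup_ran are hypothetical.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int pipe_wr = -1;            /* write end of the pipe, used by the child */

static void cleanup(void)           /* hypothetical exit handler */
{
    (void)write(pipe_wr, "c", 1);   /* tell the parent that cleanup ran */
}

/* Hypothetical demonstration: run a child that terminates with exit or
 * with _exit. Returns 1 if the child's exit handler ran, 0 if it did
 * not, and -1 on error. */
int cleanup_ran(int use_underscore_exit)
{
    int fd[2];
    char ch;
    pid_t pid;

    if (pipe(fd) < 0)
        return -1;
    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0) {                 /* child */
        close(fd[0]);
        pipe_wr = fd[1];
        atexit(cleanup);
        if (use_underscore_exit)
            _exit(0);               /* straight to the kernel, no cleanup */
        exit(0);                    /* runs cleanup() first */
    }
    close(fd[1]);                   /* parent */
    ssize_t n = read(fd[0], &ch, 1);    /* 0 bytes: pipe closed unwritten */
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

With exit the handler runs and the parent receives one byte, whereas with _exit the pipe is simply closed by the kernel without anything having been written.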

4.1.2 Environment list:

Each program is also handed an environment list. Like the argument list, the
environment list is an array of character pointers, each pointing to a null-terminated
string. The array itself is terminated by a null pointer. One typical environment could be
as below

environment pointer     environment list      environment strings

environ ------------->  [0] --------------->  HOME = - - - - - - - -
                        [1] --------------->  PATH = - - - - - - - -
                        [2] --------------->  SHELL = - - - - - - - -
                        [3] --------------->  USER = - - - - - - - -
                        NULL

The entire list is pointed to by the global variable environ. Using this pointer,
individual strings of the list may be accessed. Later on, we will also see the functions
getenv and putenv, which are the usual way of working with the environment.
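Walking through the environment list by way of the environ pointer may be sketched as follows. The helper name print_environment is our own; environ itself is the real global variable set up by the startup routine.

```c
#include <stdio.h>

extern char **environ;              /* set up by the startup routine */

/* Hypothetical helper: print every name=value string of the environment.
 * Returns the number of strings printed. */
int print_environment(void)
{
    int n = 0;

    /* the list ends with a null pointer */
    for (char **p = environ; p != NULL && *p != NULL; p++) {
        puts(*p);
        n++;
    }
    return n;
}
```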

4.1.3 The memory layout of a C program:

Unless specified otherwise, a C program is composed of the following pieces

a) Text segment: consists of the machine instructions to be executed.
The text segment is sharable, so only a single copy needs to be kept in
memory even when the program is being run by a number of users.

b) Initialized data segment: contains the variables that are explicitly initialized in
the program.

c) Uninitialized data segment: data in this segment is initialized to zero or null
pointers by the kernel before the program starts executing.

d) Stack: all the automatic variables, along with the information saved on each
function call, are stored on the stack. This is also what makes recursive C
functions possible.

e) Heap: dynamic memory allocation is done from the heap.

(high address)   environment variables
                 stack
                 heap
                 uninitialized data
                 initialized data
(low address)    text

4.2 Memory allocation:

There are three functions specified in C for memory allocation

a) malloc allocates the specified number of bytes of memory. The initial
value of the memory is indeterminate.

b) calloc allocates space for the specified number of objects, each of a specified
size, with all bits initialized to zero.

c) realloc changes the size of a previously allocated area. A decrease in size
causes no problems, but when the size increases the existing contents may
have to be moved to some other location to provide the additional space
required. The initial value of the space between the end of the old contents
and the end of the new area is indeterminate.

The prototypes of these functions are as below:

void *malloc (size_t size);

void *calloc (size_t nobj, size_t size);

void *realloc (void *ptr, size_t newsize);

The descriptions above should indicate the need for each of the parameters
concerned.

All of them return a pointer to the allocated memory if the operation is successful,
and a null pointer in case of error.

There is also a function to free the space once its use is over.

void free (void *ptr);
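The point about realloc leaving the newly added space indeterminate can be handled as in the following sketch. The helper resize_ints is a hypothetical name; it assumes new_n is greater than zero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resize an array of ints from old_n to new_n
 * elements (new_n > 0), zero-filling any elements that were added.
 * Returns the new pointer, or NULL on failure, in which case the old
 * block is left untouched and is still owned by the caller. */
int *resize_ints(int *arr, size_t old_n, size_t new_n)
{
    int *p = realloc(arr, new_n * sizeof *p);   /* may move the block */

    if (p == NULL)
        return NULL;
    if (new_n > old_n)                          /* new tail is indeterminate */
        memset(p + old_n, 0, (new_n - old_n) * sizeof *p);
    return p;
}
```

The result of realloc is kept in a separate pointer so that the original block is not lost if the call fails. The caller releases the array with free when it is no longer needed.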

4.1.4 alloca function:

There is one more function, alloca, which has the same calling sequence
as malloc.

void *alloca (size_t size);

Its functioning is also similar to that of malloc, but instead of allocating memory
from the heap, it allocates from the stack frame of the calling function. The advantage is
that we need not free the space at the end, a step which is often forgotten by
programmers and leads to memory leaks. The allocated space is released
automatically when the calling function returns. The disadvantage is that on some
systems it is not possible to increase the size of the stack frame after the function has
been called, so some systems do not support the alloca function.

4.2 The Environment Variables:

We have briefly discussed the concept of environment strings in a previous
section. We look into its various details here.

The environment strings are of the form

name of the variable = value of the variable

The Unix kernel normally makes no use of these variables. It is for the applications
to interpret them in a manner that is useful to them. The most common users of the
environment variables are the shell programs.

ANSI C defines a function that helps us to fetch values from the environment.

char *getenv (const char *name);

This returns a pointer to the value string associated with the environment variable
called name, and a null pointer if it is not found.
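Since getenv returns a null pointer for a variable that is not defined, callers usually supply a fallback. A minimal sketch, with env_or_default being a name of our own choosing:

```c
#include <stdlib.h>

/* Hypothetical helper: return the value of the environment variable
 * name, or fallback if the variable is not defined. */
const char *env_or_default(const char *name, const char *fallback)
{
    const char *value = getenv(name);   /* NULL if name is not defined */

    return value != NULL ? value : fallback;
}
```

For example, env_or_default("HOME", "/") would give the home directory when HOME is set and "/" otherwise.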

The following table gives a list of the commonly used environment variables and

their descriptions.

Variable    Description

HOME        Home directory
LANG        Name of locale
LOGNAME     Login name
PATH        List of paths to search for an executable file
TERM        Terminal type
TZ          Time zone information

However, it is not enough if we are only able to get the value of an environment
variable. We may like to change an existing variable, add a new variable or remove an
existing variable.

These operations can be done by the following functions.

4.2.1 putenv, setenv and unsetenv functions:

The prototypes are

int putenv (const char *str);

int setenv (const char *name, const char *value, int rewrite);

Both of them return 0 if successful, and nonzero on error.

putenv takes a string of the form name=value and places it in the environment
list. If name already exists, its old definition is first removed.

setenv sets name to the value specified. If name already exists, two cases arise:
(a) if rewrite is nonzero, the existing definition of name is replaced by the new value;
(b) if rewrite is 0, the existing definition is left undisturbed, name is not set to the new
value, and no error occurs.

The third function is

int unsetenv (const char *name);

It removes any existing definition of name. It is not an error if no such definition
exists. (Older systems declare unsetenv as returning void.)
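The effect of the rewrite argument and of unsetenv may be sketched as follows, assuming a POSIX system on which unsetenv returns an int. The three helper names are hypothetical wrappers written only for this illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: give name a value only if it has none yet. */
int set_default(const char *name, const char *value)
{
    return setenv(name, value, 0);  /* rewrite == 0: keep an existing value */
}

/* Hypothetical helper: give name a value, replacing any existing one. */
int set_always(const char *name, const char *value)
{
    return setenv(name, value, 1);  /* rewrite != 0: replace */
}

/* Hypothetical helper: remove name from the environment. This succeeds
 * even if name was not defined. */
int remove_var(const char *name)
{
    return unsetenv(name);
}
```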

At the implementation level, it is interesting to see how these functions operate.
The environment list and the environment strings are typically stored at the top of the
process's address space, above the stack. Deleting a string is the easiest operation:
we just find the pointer in the environment list and move all the subsequent pointers up
by one. Adding a string or modifying an existing string is more difficult, because the
space at the top of the stack cannot be expanded any further. The following steps are
therefore taken.

1. If an existing name is being modified

a) If the size of the new value is smaller than or equal to the size of the existing
value, we can simply copy the new string over the old string.

b) If the new value is longer than the existing value, we must call malloc to
obtain room for the new string, copy the new string to this area, and then
replace the old pointer in the environment list for name with a pointer to this
allocated area.

2. If a new name is being added, the process becomes still more complicated. We
first call malloc to allocate room for the name=value string and copy the
string to this area. Then, depending on whether this is the first time that we have
added a new name to the list or whether we have done so previously, we call either
malloc or realloc to obtain room for a list that is one pointer longer. The first time,
the old list (which lives above the stack) is copied into the new area. We then store
the pointer to the new string at the end of the list.

Of course, we must keep a null pointer at the end of the list. If this
description appears a bit confusing, look at the following figure, which is
expected to clear the doubts.
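Step 2 above can be sketched in code. The sketch below works on a list of its own, already held on the heap, rather than on the real environ, and the name list_append is hypothetical; it is meant only to show the two allocations involved, not how any particular library implements putenv.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: append a copy of the string str ("name=value")
 * to a heap-allocated, null-terminated list that currently holds n
 * entries. list may be NULL when n is 0. Returns the new list, or NULL
 * on failure (the old list is then left unchanged). */
char **list_append(char **list, size_t n, const char *str)
{
    char *copy = malloc(strlen(str) + 1);       /* room for the string */
    if (copy == NULL)
        return NULL;
    strcpy(copy, str);

    /* room for the n old pointers, the new pointer and the final NULL;
     * realloc(NULL, ...) behaves like malloc for the very first entry */
    char **grown = realloc(list, (n + 2) * sizeof *grown);
    if (grown == NULL) {
        free(copy);
        return NULL;
    }
    grown[n] = copy;
    grown[n + 1] = NULL;                        /* keep the list terminated */
    return grown;
}
```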

[Figure: the environ pointer points to the list of name pointers (1 to 5, terminated by a
null pointer X); each pointer in the list points to the corresponding value string.]

Suppose we want to include a new entry (3a) between 3 and 4: what do we do?

Suppose we want to give an existing name a new value, and the space needed by
the new value is more than what is already allocated: what do we do?

These were the questions answered in the description above.

4.3 Finding the limits of a process

getrlimit and setrlimit functions

Every system has its own limitations on the system resources. It has to work
within these limits to perform its objectives. In turn, the system will have to impose
limits on the resources that can be allocated to the processes that work under it. The
processes in turn will have to work within the limits specified for them. However, for
smooth functioning, it becomes desirable that a process should be able to find out the
limits of the resources allocated to it and, in some cases, get them changed.

These operations can be done by using the getrlimit and setrlimit functions. Their
prototypes are as follows:

int getrlimit (int resource, struct rlimit *rlptr);

int setrlimit (int resource, const struct rlimit *rlptr);

Both of them return 0 if successful, and nonzero on error.

It may be noted that each call to these functions works with one resource at a
time, and takes a pointer to a structure of the following format

struct rlimit {
    rlim_t rlim_cur;    /* soft limit: current limit of the resource */
    rlim_t rlim_max;    /* hard limit: maximum value for rlim_cur */
};

The following rules govern the changing of these limits.

a) Only a superuser process can raise the upper limit (the maximum limit, also
called the hard limit). Raising the hard limit would put pressure on the
overall resource position of the system, and hence is permitted only to the
superuser.

b) Any process can lower its hard limit to a value greater than or equal to
its current limit (also called the soft limit). Lowering the value is a decision of
the individual user, which may be useful in certain situations. However, for
ordinary users such lowering of the limit is irreversible.

c) Any process can change its current limit to any value up to the maximum
value set for the process.

An infinite (unlimited) value is specified by the constant RLIM_INFINITY.
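Rule (c) can be sketched as follows. The helper set_soft_limit is a hypothetical name; it changes only the current (soft) limit and never asks for more than the hard limit allows.

```c
#include <sys/resource.h>

/* Hypothetical helper: set the soft limit of resource to wanted, clamped
 * to the hard limit. The hard limit itself is left unchanged.
 * Returns 0 on success, -1 on error. */
int set_soft_limit(int resource, rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) < 0)
        return -1;
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;       /* rule (c): cannot exceed the hard limit */
    rl.rlim_cur = wanted;
    return setrlimit(resource, &rl);
}
```

For instance, set_soft_limit(RLIMIT_CORE, 0) would suppress the creation of core files for the calling process and its children.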

Some of the resources that can be limited in the above manner are listed below

Resource           Description

RLIMIT_CORE        The maximum size (in bytes) of a core file. If the limit is set to 0, the core file is not created.

RLIMIT_CPU         The maximum amount of CPU time allocated, in seconds. When the current limit is exceeded, the SIGXCPU signal is sent to the process.

RLIMIT_DATA        The maximum size (in bytes) of the data segment. This includes the initialized data, the uninitialized data and the heap.

RLIMIT_FSIZE       The maximum size (in bytes) of a file that can be created. When a file larger than the current limit is written, the signal SIGXFSZ is sent to the process.

RLIMIT_MEMLOCK     The maximum amount of address space that can be locked in memory.

RLIMIT_NOFILE      The maximum number of open files per process at any given instant. Changing this limit affects the value returned by the sysconf function for the argument _SC_OPEN_MAX.

RLIMIT_NPROC       The maximum number of child processes per real user id. Changing this limit affects the value returned for _SC_CHILD_MAX by the sysconf function.

RLIMIT_RSS         The maximum resident set size (RSS). If physical memory is in short supply, the kernel takes away memory from processes that exceed their RSS. (On the other hand, if there is sufficient memory available, say, because the other processes are not utilizing their full value, the system may ignore the excess.)

RLIMIT_STACK       The maximum size of the stack.

RLIMIT_VMEM        The maximum size of the mapped address space. It affects the mmap function.

The resource limits not only affect the calling process but are also

inherited by the children of the process. This is again logical, since the creation

of new processes need not affect the overall resource balance of the system.

To gain an insight into the actual working of these resource names and

the functions they perform, we write a simple program that simply takes the

names of resources and prints their limits for the process.

We shall first see the program (ref: Text book 1) and then the details

#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>
#include <stdlib.h>

/* doit(name) expands into pr_limits("name", name);
   the # operator turns the macro argument into a string */
#define doit(name) pr_limits (#name, name)

static void pr_limits (char *, int);

int
main (void)
{
    doit (RLIMIT_CORE);
    doit (RLIMIT_CPU);
    doit (RLIMIT_DATA);
    doit (RLIMIT_FSIZE);
#ifdef RLIMIT_MEMLOCK               /* not available on every system */
    doit (RLIMIT_MEMLOCK);
#endif
#ifdef RLIMIT_NOFILE                /* called RLIMIT_OFILE on older BSD systems */
    doit (RLIMIT_NOFILE);
#endif
#ifdef RLIMIT_NPROC
    doit (RLIMIT_NPROC);
#endif
#ifdef RLIMIT_RSS
    doit (RLIMIT_RSS);
#endif
    doit (RLIMIT_STACK);
    exit (0);
}

static void
pr_limits (char *name, int resource)
{
    struct rlimit limit;

    /* the textbook's err_sys helper is replaced by perror and exit,
       so that the program is self-contained */
    if (getrlimit (resource, &limit) < 0) {
        perror (name);
        exit (1);
    }
    printf ("%-20s ", name);
    if (limit.rlim_cur == RLIM_INFINITY)
        printf ("%10s  ", "infinite");
    else
        printf ("%10ld  ", (long) limit.rlim_cur);
    if (limit.rlim_max == RLIM_INFINITY)
        printf ("%10s\n", "infinite");
    else
        printf ("%10ld\n", (long) limit.rlim_max);
}

Now a brief description of what we have done.

To begin with, we have defined the doit(name) macro, which expands into a call to
pr_limits. The # operator turns the macro argument into a string, so pr_limits receives
both the name of the resource as a string and the corresponding resource constant.
The pr_limits function calls getrlimit and then uses if statements to print the current and
maximum limits of the resource. If a value is RLIM_INFINITY (unlimited) it prints
infinite against the resource, else it prints the corresponding limit as a long integer.

4.4 Unix process control:

Having seen several of the Unix functions in the previous unit, and also how a
single process operates, we are now in a position to study the concept of
process control in Unix. The controls that we look into include creation of new
processes, execution of programs and termination of processes.

Before looking into the actual control mechanisms, we shall see how Unix keeps
track of the various processes. After all, when one is talking of controlling a number of
processes, the basic requirement is to be able to distinguish one process from another.
Unix does this in a straightforward manner, by assigning a unique identifier, hereafter
called the process ID, to each process. The process ID is a non-negative integer.

Normally process ID 0 is used for a scheduler process called the swapper.
Process ID 1 is the init process, which is invoked by the kernel at the end of the
bootstrap procedure. On some systems process ID 2 is the page daemon.

Unix also provides functions for obtaining the ID of a process and other related information. The following list gives some of the information obtainable through these functions.

Functions                  Information

pid_t getpid(void)         Process ID of the calling process

pid_t getppid(void)        Parent process ID of the calling process

uid_t getuid(void)         Real user ID of the calling process

uid_t geteuid(void)        Effective user ID of the calling process

gid_t getgid(void)         Real group ID of the calling process

gid_t getegid(void)        Effective group ID of the calling process
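As a small illustration of the functions listed above, the following sketch collects all six identifiers of the calling process into one string. The function name describe_ids and the output format are our own choices for illustration; the IDs are cast to long only so that they can be printed portably.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Format the identifiers of the calling process into buf.
 * Returns the number of characters snprintf would have written. */
int describe_ids(char *buf, size_t size)
{
    return snprintf(buf, size,
                    "pid=%ld ppid=%ld uid=%ld euid=%ld gid=%ld egid=%ld",
                    (long) getpid(), (long) getppid(),
                    (long) getuid(), (long) geteuid(),
                    (long) getgid(), (long) getegid());
}
```

A caller would typically pass a buffer of about 128 bytes and print the result with puts.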

4.5 The fork function:

Now we see one of the most important functions in Unix process control. This function, called the fork function, allows an existing process to create a new process. In fact, this is the only way in which a new process can be created by the Unix kernel.

The prototype of the function is

pid_t fork(void);

It returns 0 in the child, the process ID of the child in the parent, and -1 if unsuccessful.

The new process created is called the child process and the process which created the child process is called the parent process.

(You may note: Since each child can have only one parent, it gets a return value of 0 on creation. It can always find its parent by calling getppid. On the other hand, a single parent can have a number of children, hence the parent is given the process ID of the child, so that it can distinguish between its children.)
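The three possible return values of fork can be seen in the following minimal sketch. The function name fork_and_collect is hypothetical, and the parent uses waitpid (discussed in section 4.8) only to pick up the exit status of the child.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with the given code; return the code the
 * parent collects, or -1 if fork or waitpid fails. */
int fork_and_collect(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed: no child was created */
    if (pid == 0)
        _exit(code);               /* child: fork returned 0 */

    /* parent: fork returned the process ID of the child */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that both branches run the same code after the fork; only the value of pid tells them apart.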

After the creation, both the parent and the child keep continuing their operations.

The child is considered a copy of the parent and hence gets a copy of the parent’s data space, heap and stack.

(Again note: It is only a copy, not the actual space of the parent. Otherwise, there

will be chaos with both the parent and the children doing all sorts of modifications on the

parent’s space).

Of course, some of the recent implementations do something smarter here. Instead of copying the complete space every time a child is created, they allow the child to share the parent’s space on a read-only basis. Only when one of the processes starts writing into a region does it get a physical copy of that region for itself. This technique is called copy-on-write.

Once a child is created, it is expected to compete for resources just like the parent. Whether the parent or the child gets to execute first depends on the scheduling algorithm, resource demands and a lot of other factors. To synchronize the activities of the parent and the child, it is always necessary to introduce sufficient checks and balances.

This is more important when standard I/O streams are shared between processes. If the I/O is line buffered (look at a previous block) we may still get the output correctly, since each newline character flushes the buffer. But if it is fully buffered, any data still sitting in the buffer at the time of the fork is copied to the child along with the rest of the data space. Sufficient care about the sequence of operations and flushing must then be taken, if one is not to either lose data or get duplicates of the same.

4.5.1 FILE sharing between the parent and the child

We have said in the previous section that the child gets a copy of the parent’s data space. In the same way, all open descriptors of the parent are duplicated in the child process, so by default the child shares the same open files as the parent. However, one or two issues need to be resolved.

Since both the parent and the child are allowed to operate on the same set of files, it is imperative that they share the same file offset. This is automatically taken care of: a duplicated descriptor refers to the same open file as the original, so during subsequent read and (more importantly) write operations, an operation by the child advances the offset seen by the parent and vice versa. In this way they can be sure that they do not overwrite each other.

Continuing in the same line, if they (parent and the child) were to write to the

same descriptor, their outputs will be intermixed, unless both of them synchronize their

operations.

There are two ways in which this can be handled after a fork.

(a) The parent waits while the child completes its operation. After the termination of the child, the offsets of the shared descriptors will have been updated accordingly for the parent to use them.

(b) After the fork, the parent and the child each close the descriptors they do not need and go their own ways. In this way neither of them interferes with the descriptors of the other.
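The sharing of the file offset, together with method (a), can be observed with a short experiment. The sketch below is only an illustration: the function name and the scratch file under /tmp are our own assumptions. The child writes 5 bytes through the inherited descriptor, the parent waits, and then asks where its own offset now stands.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's file offset after the child has written 5 bytes
 * through the inherited descriptor, or -1 on any failure. */
long offset_after_child_write(void)
{
    char path[] = "/tmp/forkdemoXXXXXX";   /* hypothetical scratch file */
    int fd = mkstemp(path);
    pid_t pid;
    off_t pos;

    if (fd < 0)
        return -1;
    unlink(path);                  /* the file vanishes once fd is closed */

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {                /* child writes through the shared descriptor */
        ssize_t n = write(fd, "hello", 5);
        _exit(n == 5 ? 0 : 1);
    }

    waitpid(pid, NULL, 0);         /* method (a): the parent waits for the child */
    pos = lseek(fd, 0, SEEK_CUR);  /* query the current offset without moving it */
    close(fd);
    return (long) pos;
}
```

Although the parent itself never wrote anything, the offset it sees has moved by the 5 bytes written by the child.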

In addition to open files and descriptors, the following properties of the parent are

“inherited” by the child.

a) real user id, real group id

b) effective user id, effective group id

c) supplementary group ids.

d) Process group id.

e) Session id.

f) Controlling terminal.

g) Current working directory, root directory.

h) File mode creation mask

i) Environment, resource limits and a host of other features.

The parent and the child differ in the following respects:

a) The return value from fork is different.

b) The process IDs are different.

c) The parent process IDs are different (naturally!).

d) The child’s time values for various operations are set to 0 at the beginning.

e) File locks set by the parent are not inherited by the child.

f) Pending alarms are cleared for the child.

g) The set of pending signals for the child is empty.

Though we have discussed so many features of fork, we have not discussed why forking is needed in the first place.

Forking is needed

a) So that the parent and child can each execute portions of the same code independently. For example, when a server gets a request for a service from one of its clients, all that it does is to fork a new process and allow it to handle the request, while the server itself waits for the next request.

b) When a process wants to execute a different program. Here the child calls exec (see section 4.11) right after the fork to jump to the new program, while the parent carries on with its present task.

However, one also needs to be careful in creating new processes with fork. If there are already too many processes in the system, or if the present real user ID has exceeded its limit on the number of simultaneous processes, fork will fail and return -1. The return value must therefore always be checked.

4.6 vfork function:

The function vfork has the same calling sequence and return values as fork.

vfork is intended to create a new process whose purpose is to execute a new program (whereas a child created by fork continues executing the same program as the parent).

Though vfork creates a new process, it does not hand over a full copy of the parent’s address space. Instead, the child runs in the address space of the parent until it calls exec or exit. Once it calls exec, it executes the new program and hence does not need the parent’s address space any more.

A major difference with respect to fork is that, while in the case of fork there is no way of knowing in advance whether the parent or the child gets executed first, in the case of vfork the child always runs first and continues till it calls exec or exit. The parent resumes only after this.

Let us write a simple program to see how vfork differs from fork.

#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* err_sys is an error-reporting helper assumed to be defined elsewhere;
   it is not a standard library function */

int glob = 6;       /* the external variable is initialized */

int
main(void)
{
    int var;        /* automatic variable on the stack */
    pid_t pid;

    var = 88;
    printf("before vfork\n");

    if ((pid = vfork()) < 0)
        err_sys("vfork error");
    else if (pid == 0) {    /* if it is a child */
        glob++;             /* modify the parent's variables */
        var++;
        _exit(0);           /* child terminates without flushing stdio buffers */
    }

    /* parent */
    printf("pid = %d, glob = %d, var = %d\n", (int) getpid(), glob, var);
    exit(0);
}

The program is too straightforward to need any explanation. The only point to note is that the parent’s variables, which are modified by the child, show the changed values in the parent. This, of course, is to be expected, since the parent and the child operate in the same address space.

4.7 Exit functions:

We repeat what we have seen in an earlier section: a process can terminate either normally or abnormally.

It can terminate normally by

a) executing a return from the main function

b) calling the exit function

c) calling the _exit function

It can terminate abnormally by

a) calling abort

b) receiving a signal that terminates it

In all cases of termination, the kernel acts similarly. It closes all the open descriptors of the process, releases the memory earmarked for the process, and so on.

The parent also needs to be informed about the termination of its child process and, more importantly, how it terminated, since the parent may have to take suitable action based on this information. The exit and _exit functions take an exit status as an argument. In the case of abnormal termination, the kernel generates a “termination status” to indicate the reason for the abnormal termination. In either case, the parent can obtain the termination status from the wait or waitpid function (refer to the next section).

Now, what happens if the parent process has terminated before the child process? In such a case, where does the information indicated in the above paragraph go? Unix has an answer. The kernel starts the system with the “init” process. This can be thought of as a sort of “mother of all processes”, in the sense that it creates the first processes, which in turn create their children, and so on. Whenever a process terminates, the kernel checks whether that process has any children that are still alive. If so, init becomes the parent of each of them (by changing the parent process ID of the child to 1; see the earlier discussion on process IDs).

In this way, there can be no process without a parent, since init terminates only after all the other processes have terminated.

Unfortunately, there is a problem even when the parent is alive when the child terminates. Preliminary concepts of scheduling tell us that only one process can be active at any instant of time (on a single-processor system). So when the parent becomes active some time after a child has terminated and starts looking for that child, it does not help if the child has simply vanished. The kernel therefore has to maintain some minimal information about the terminated process: its process ID, its termination status, the amount of CPU time used and so on. A terminated process whose information has not yet been collected by its parent is called a “zombie”. Zombies are maintained till such time as the data is picked up by their parents.

4.8 wait and waitpid functions:

A process that calls wait or waitpid can

a) block (if all its children are still running)

b) return immediately with the termination status of a child (if a child has already terminated)

c) return immediately with an error (if it has no child processes)

But why should a process call the wait or waitpid function to begin with? It can call it at any time of its own accord, or in response to a signal from the kernel. In either case, however, the purpose is to collect the termination information of its children.

Going back to the previous section, where we left off, a terminated child process,

whether the termination is normal or abnormal, makes the Kernel send a SIGCHLD

signal to the parent. The parent, in response to this signal, may execute a wait function.

It is clear that if the parent calls wait after a SIGCHLD signal has been sent by the kernel, wait can return immediately with the termination status of the child in question, since the data is already available. On the other hand, if wait is called without any such signal having arrived, the parent may block until such time as a child terminates.

The typical formats of wait and waitpid are as follows:

pid_t wait(int *statloc);

pid_t waitpid(pid_t pid, int *statloc, int options);

Both of them return the process ID of the child if successful, or -1 on error. In addition, waitpid can return 0 when it is called in its non-blocking form and no child has terminated yet.

The wait function, as indicated above, may block the calling process until such time as a child process terminates. But waitpid has an option to prevent such blocking. Also, waitpid does not simply wait for any child process to terminate; it can be specific about the process whose termination it is waiting for. wait, on the other hand, returns the moment any child process terminates, giving its status details.

Now we shall briefly see the arguments of the two functions. statloc is a pointer to an integer. Unless it is a null pointer, the termination status of the terminated (child) process is stored in this location. If the parent process does not care about the actual status of the terminated process (but is only interested in finding out whether the process has terminated or not), this argument is set to null.

The pid argument in waitpid can have different values, and its interpretation and the action taken change accordingly.

a) pid == -1 waits for the termination of any child process. In effect, waitpid becomes similar to wait.

b) pid > 0 waits for the child whose process ID equals the value of pid.

c) pid == 0 waits for any child whose process group ID equals that of the calling process.

d) pid < -1 waits for any child whose process group ID equals the absolute value of pid.

To sum up, waitpid is a more sophisticated version of wait in that

a) It provides a non-blocking version of wait. This becomes important on several occasions, where we just want to check the status of a process but are not interested in waiting for its termination to occur.

b) It allows us to wait for the termination of a particular process. This is more meaningful, because in most cases we need the status of one specific process in order to continue, not the status of any process.

c) It supports job control. We can actually control the sequence in which the processes work and terminate.
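Points (a) and (b) can be put together in a short sketch. We assume here the standard WNOHANG option for the non-blocking form and the WIFEXITED and WEXITSTATUS macros for decoding the status; the function name poll_for_child and the sleep intervals are chosen purely for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a child that exits with `code`, then poll for it with WNOHANG.
 * Stores in *polls the number of polls that found the child still
 * running, and returns the child's exit status, or -1 on failure. */
int poll_for_child(int code, int *polls)
{
    int status;
    pid_t pid, done;

    *polls = 0;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        usleep(50000);             /* child: pretend to work for 50 ms */
        _exit(code);
    }

    /* parent: waitpid returns 0 while this particular child is running */
    while ((done = waitpid(pid, &status, WNOHANG)) == 0) {
        (*polls)++;
        usleep(10000);             /* the parent could do useful work here */
    }
    if (done != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing a positive pid makes the parent wait for that one child only, and WNOHANG keeps the parent free to do other work between checks.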

4.8.1 Before we continue, we look at a very useful and curiously simple function. Especially when job control is wanted, the sequence in which the processes work and interact becomes exceedingly important. Suppose a process needs some input from another process. Then it is essential that the second process should finish first. What happens if the second process gets delayed while the first process has almost completed, except, of course, for the portion that must be supplied by the second process? One way to make the first process wait is the sleep function. We simply call sleep() to make up for the imbalance in the timings.

4.9 wait3 and wait4 functions:

These can be considered extensions of wait and waitpid. They provide an additional argument that allows the kernel to return a summary of the resources used by the terminated process (and all its child processes).

The typical format is

pid_t wait3(int *statloc, int options, struct rusage *rusage);

pid_t wait4(pid_t pid, int *statloc, int options, struct rusage *rusage);

Both return the process ID if successful, else return -1.
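As a rough sketch of how the extra argument is used, the function below runs a short CPU-bound child and reads its user CPU time from the rusage structure filled in by wait4. Note that wait4 is a BSD extension and is assumed to be available on the system; the function name and the size of the busy loop are arbitrary.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a short CPU-bound child and return the user CPU time it consumed,
 * in microseconds, as reported through wait4. Returns -1 on failure. */
long child_user_cpu_usec(void)
{
    struct rusage usage;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        volatile unsigned long i, sum = 0;
        for (i = 0; i < 5000000UL; i++)   /* burn a little CPU time */
            sum += i;
        _exit(0);
    }
    if (wait4(pid, &status, 0, &usage) != pid)
        return -1;
    return (long) usage.ru_utime.tv_sec * 1000000L
         + (long) usage.ru_utime.tv_usec;
}
```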

4.10 Race conditions:

Race conditions arise out of the sharing of resources. If there is no sharing of resources, there is little possibility of a race condition. A race condition occurs when multiple processes are trying to operate on a shared resource and the final outcome depends on the order in which the processes run. The fork operation, if used carelessly, can produce race conditions.

Suppose there is a logical sequence in the program that depends, for its result, on whether the child or the parent runs first after the fork. Normally, no such prediction is possible, since it depends on the scheduling algorithm and the conditions existing at the time when the scheduler has to choose the process to run.

One way of overcoming the problem is to use sleep, described in the previous section. Suppose we want the child to complete before the parent. If it happens that way, there is no problem. Otherwise, the parent must be made to wait for the child. On the other hand, if it is the child that has to wait for the parent, it can be put to sleep. But for how long? It is a problem that can never be decided beforehand.

One simple way to overcome this problem is to use polling. Suppose that, instead of simply putting the child to sleep for a fixed time, we use the following code

while (getppid() != 1)
    sleep(1);

Until the parent has terminated (at which point init, with process ID 1, becomes the parent), the child sleeps. But it has to be woken up once every second (sleep(1)) to check whether the test condition has been met. This can be wasteful in many situations.

To overcome these difficulties, signals and various types of interprocess communication can be used, which we shall see in later sections.
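To give a flavour of what such a mechanism looks like, here is a minimal sketch that uses a pipe, one form of interprocess communication that has not been introduced so far and is assumed here only for illustration. The child blocks in read until the parent writes a token, so the parent’s step always comes first and no polling is needed. All the names in the sketch are hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Force the parent's step to run before the child's, whatever the
 * scheduler does. Each process appends one letter to a trace pipe.
 * Returns 1 if the trace reads "PC", 0 otherwise, -1 on failure. */
int parent_runs_first(void)
{
    int go[2], trace[2];
    char token = 'x';
    char order[3] = "";
    pid_t pid;

    if (pipe(go) < 0 || pipe(trace) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close(go[1]);
        /* child: read blocks until the parent writes the token */
        if (read(go[0], &token, 1) != 1)
            _exit(1);
        if (write(trace[1], "C", 1) != 1)
            _exit(1);
        _exit(0);
    }

    close(go[0]);
    if (write(trace[1], "P", 1) != 1)    /* the parent's step comes first */
        return -1;
    if (write(go[1], &token, 1) != 1)    /* now release the child */
        return -1;
    waitpid(pid, NULL, 0);
    if (read(trace[0], order, 2) != 2)
        return -1;
    close(go[1]);
    close(trace[0]);
    close(trace[1]);
    return strcmp(order, "PC") == 0;
}
```

Unlike the polling loop, the waiting process here consumes no CPU time while it waits and is woken up exactly when the condition is met.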

4.11 exec functions:

The fork function, as we have seen earlier, creates a new (child) process, which can then cause another program to execute by calling an exec function. When exec is called, the calling process is completely replaced by the new program, which starts executing from its main function. The process ID does not change, because exec does not create a new process; it simply replaces the current process with a new program.

Though we refer to exec as one function, there are as many as six exec functions.

They are of the following types:

int execl(const char *pathname, const char *arg0, ... /* (char *) 0 */);

int execv(const char *pathname, char *const argv[]);

int execle(const char *pathname, const char *arg0, ... /* (char *) 0, char *const envp[] */);

int execve(const char *pathname, char *const argv[], char *const envp[]);

int execlp(const char *filename, const char *arg0, ... /* (char *) 0 */);

int execvp(const char *filename, char *const argv[]);

All of them return -1 on error and do not return at all if successful.

Now, how do they differ from one another?

The first difference is that execl, execv, execle and execve take a path name as the argument, while the other two take a file name. When a file name without a slash is given, the executable is searched for in the directories named by the PATH environment variable, which is a list of directories separated by colons.

The second difference is regarding the argument list. It may be noted that the functions execl, execlp and execle require the command line arguments to be specified as a list of separate arguments (arg0, arg1, arg2 etc.), terminated by a null pointer. But execv, execvp and execve allow the user to build an array of pointers to the respective arguments and pass this array as the argument, like argv[]. In fact, the student can note that the letter l in the function name stands for list (execl, execle, execlp), while v stands for vector (execv, execvp and execve).

The two functions whose names end with e (namely execle and execve) allow the user to pass a pointer to an array of pointers, which in turn point to the environment strings. The others use the environ variable in the calling process to copy the existing environment to the new program. Normally, when a child is created, the environment is copied to the child (except in certain special cases, which we do not deal with here).

To conclude, we make the following (repeated) observations regarding the exec functions:

i) When the function name ends with e, it takes a pointer to an array of pointers to the environment strings.

ii) When the name ends with p, it takes a file name as the argument. It then uses the PATH environment variable to find the executable file.

iii) When the name contains l, the function takes a list of arguments, to be indicated individually; if instead it contains v, an array (or a vector) will be the argument.
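The usual pattern of fork followed by exec can be sketched as follows, using the vector form with PATH search (execvp). The function name run_program and the convention of returning 127 when the exec fails are our own choices for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `file` (searched for along PATH) with the argument vector argv in
 * a child process. Returns the program's exit status, 127 if the exec
 * failed, or -1 if fork or waitpid failed. */
int run_program(const char *file, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(file, argv);        /* returns only if the exec failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller builds the vector itself, for example char *args[] = { "echo", "hello", NULL }, and calls run_program("echo", args). With the list form, the child would instead call execlp("echo", "echo", "hello", (char *) 0).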

When an exec operation takes place, there are several properties that the new

program inherits from the calling process. One of them, obviously, is the process Id.

The others in the list are as follows:

a) parent process id

b) real user id and real group id

c) supplementary group ids.

d) Process group id

e) Session id

f) Controlling terminal

g) Time left until the alarm clock

h) Current working directory

i) Root directory

j) File mode creation mask

k) File locks

l) Pending signals

m) Resource limits etc.,.

4.12 User Ids and Group Ids

The setuid and setgid functions.

The real user ID and effective user ID are important attributes of a process, which can be set by the setuid function. Similarly, the group IDs (both real and effective) can be set using the setgid function.

Typical formats are

int setuid(uid_t uid);

int setgid(gid_t gid);

Both of them return 0 if successful, else return -1. Obviously, changing user IDs is a serious business and hence strict restrictions need to be applied.

The following are some of the rules as to who can change the IDs:

a) If the process has super user privileges, the setuid function sets the real user ID, the effective user ID and the saved set-user-ID to the value indicated by uid.

b) If the process does not have super user privileges, but uid equals either the real user ID or the saved set-user-ID, then only the effective user ID is set to uid. The real user ID and the saved set-user-ID are not touched.

c) If the process does not have super user privileges and uid equals neither the real user ID nor the saved set-user-ID, then an error is returned.

d) Only a super user process can change the real user ID. The real user ID is set when a user logs in and is not changed for the duration of the login session. Since login is a super user process, it can set all three user IDs by calling setuid.

e) The effective user ID is set by the exec functions only if the set-user-ID bit is set for the program file. Otherwise, the exec functions leave the effective user ID at its current value.

f) The saved set-user-ID is copied from the effective user ID by exec. This copy is made after exec has stored the effective user ID from the file’s user ID.
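Rules (a) and (b) mean that a process is always allowed to set its effective user ID back to its real user ID. The following small sketch, with a hypothetical function name, does exactly this and then checks that the two IDs agree.

```c
#include <sys/types.h>
#include <unistd.h>

/* Give up any extra privileges by setting the effective user ID back to
 * the real user ID. Returns 0 on success, -1 on error. */
int drop_to_real_user(void)
{
    if (setuid(getuid()) < 0)
        return -1;
    /* after a successful call the two IDs must agree */
    return (geteuid() == getuid()) ? 0 : -1;
}
```

For an ordinary process the call changes nothing, since both IDs are already equal. For a set-user-ID program it removes the privileges that were gained through the program file.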

4.12.1 setreuid and setregid functions:

These functions are for swapping the real and effective IDs: setreuid for the user IDs and setregid for the group IDs.

Typical format is

int setreuid(uid_t ruid, uid_t euid);

int setregid(gid_t rgid, gid_t egid);

Both of them return 0 on successful operation and -1 on error.

These functions allow an unprivileged user to swap the real user ID and the effective user ID. This lets a set-user-ID program switch to the user’s normal permissions and back again.

4.13 System Function:

This allows the user to execute system commands from inside a program. A typical format is

int system(const char *cmdstring);

The cmdstring (the command string) is a command that the system function passes to the shell for execution, and the actual operation depends on what the command stands for.

The following bit of program should help us get a clear idea of the operation of the system function.

#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>

/* err_sys and pr_exit are helper functions assumed to be defined
   elsewhere; they are not part of the standard library */

int
main(void)
{
    int status;

    if ((status = system("date")) < 0)
        err_sys("system error");
    pr_exit(status);

    if ((status = system("nosuchcommand")) < 0)
        err_sys("system error");
    pr_exit(status);

    if ((status = system("who; exit 44")) < 0)
        err_sys("system error");
    pr_exit(status);

    exit(0);
}

All that the program does is to pass three command strings to the system function in turn, report an error if system itself fails, and otherwise print the termination status of each command.

The function pr_exit is a helper that uses the appropriate macros to print a description of the termination status.
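The kind of decoding that a helper like pr_exit performs can be sketched as below. This is not the pr_exit function itself; the name exit_code_of and its return conventions are assumptions made for the illustration. It relies on the standard WIFEXITED and WEXITSTATUS macros from sys/wait.h.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run `cmd` through the shell with system() and decode the result.
 * Returns the command's exit status (0-255), -1 if system() itself
 * failed, or -2 if the shell was terminated abnormally. */
int exit_code_of(const char *cmd)
{
    int status = system(cmd);

    if (status < 0)
        return -1;                       /* fork or waitpid failed */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);      /* normal termination */
    return -2;                           /* abnormal termination */
}
```

With this helper, exit_code_of("who; exit 44") gives 44, and a command that the shell cannot find gives the shell’s own exit status of 127.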

4.14 Process accounting:

Once we talk of resource sharing, we also need some way of keeping track of who has used how much of which resource; in short, process accounting. This is an optional facility that needs to be enabled before it operates. Once enabled, it writes an accounting record for each process that terminates. These records are normally 32 bytes of binary data which give some of the important details about the process, like the amount of CPU time used, user ID, group ID, elapsed time etc.

A typical accounting structure that holds an accounting record has the following format:

struct acct
{
    char   ac_flag;     /* flag */
    char   ac_stat;     /* termination status */
    uid_t  ac_uid;      /* real user id */
    gid_t  ac_gid;      /* real group id */
    dev_t  ac_tty;      /* controlling terminal */
    time_t ac_btime;    /* starting time */
    comp_t ac_utime;    /* user CPU time (clock ticks) */
    comp_t ac_stime;    /* system CPU time (clock ticks) */
    comp_t ac_etime;    /* elapsed time (clock ticks) */
    comp_t ac_mem;      /* memory usage (average) */
    comp_t ac_io;       /* bytes transferred during read and write */
    comp_t ac_rw;       /* blocks read or written */
    char   ac_comm[8];  /* command name */
};

All the fields are self explanatory except the first one, namely the flag, which

needs some explanation.

The ac_flag member of the structure records certain events during the execution of the process

ac_flag     Description

AFORK       Process is the result of fork, but did not call exec

ASU         Process used super user privileges

ACOMPAT     Process used compatibility mode

ACORE       Process dumped core

AXSIG       Process was killed by a signal

Whenever a new process is created, the data required for accounting are initialised by the kernel and are updated as long as the process is alive. Once the process terminates, the corresponding accounting record is written to the accounting file. Since a record is written only when a process terminates, the records in the accounting file are in the order of termination, not in the order of creation.

One other point we should note is that the accounting is for processes, not programs. That means that when one program calls exec to run another, and that one a third, and so on, all within the same process, all that we get is a single accounting record. So, the accounting records need some processing before they can be directly used to debit or charge costs for the use of system resources.

Now the question arises: how does one use this information for the accounting purposes of a particular program? One sequence of operations can be as follows:

a) Become a super user and enable accounting with the accton command.

b) Run the program for which accounting is needed. Depending on the number of processes that it creates, we get that many accounting records in the accounting file.

c) Become a super user again and disable accounting.

d) Another program can be run (at the super user level) to pick the required fields from the accounting file and print them. Once the fields of the accounting record are known, it is easy to write a program that does this.

4.15 user identification:

Any process is identified by its real and effective user IDs and its real and effective group IDs. Suppose we want to know the login name of the user. We can call the function getpwuid with the user ID. But if a user has multiple login names, each with the same user ID, it is not an easy proposition. The getlogin function gives a method to fetch the login name actually in use.

Typical format is

char *getlogin(void);

It returns a pointer to a string giving the login name, or NULL on error.

4.16 Process times

Most systems deal with three measures of time: real time (or clock time), user CPU time and system CPU time. Any process can call the times function to obtain the time values for itself and for its terminated children.

The typical format is

clock_t times(struct tms *buf);

It returns the elapsed time in clock ticks if OK, else returns -1 on error.

The structure tms, pointed to by buf, has the following format

struct tms {
    clock_t tms_utime;   /* user CPU time */
    clock_t tms_stime;   /* system CPU time */
    clock_t tms_cutime;  /* user CPU time for the terminated children */
    clock_t tms_cstime;  /* system CPU time for the terminated children */
};

Two points need to be noted at this stage

1. The times referred to in the fields of the above structure are all in clock ticks. They can be converted to seconds by dividing by the number of clock ticks per second, the _SC_CLK_TCK value returned by sysconf.

2. The return value of times is not an absolute time. In fact, hardly anybody will be interested in its absolute value. Most often, we are interested in the elapsed time (the duration between two events), which can be got by calling times at both events and simply subtracting the earlier value from the later one.
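The conversion mentioned in the first point can be sketched as follows. The two function names are hypothetical; the sketch assumes only the times function and the sysconf call with _SC_CLK_TCK.

```c
#include <sys/times.h>
#include <unistd.h>

/* Convert a clock_t tick count to seconds, using the tick rate that
 * sysconf reports. Returns -1.0 if the rate cannot be determined. */
double ticks_to_seconds(clock_t ticks)
{
    long per_second = sysconf(_SC_CLK_TCK);

    if (per_second <= 0)
        return -1.0;
    return (double) ticks / (double) per_second;
}

/* Total CPU time (user + system) used so far by the calling process,
 * in seconds, or -1.0 on error. */
double cpu_seconds_used(void)
{
    struct tms t;

    if (times(&t) == (clock_t) -1)
        return -1.0;
    return ticks_to_seconds(t.tms_utime + t.tms_stime);
}
```

To measure the elapsed time of some operation, one would save the return value of times before the operation, call times again afterwards, and pass the difference to ticks_to_seconds.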

Block Summary

This is an important block, wherein we discussed a number of vital issues. We learnt about the concept of a process, which can be roughly described as the executing unit of a program. We learnt about the system and process environment, their usefulness, and how to view the environment variables as well as to set them.

Then the concept of a parent process creating a child process using fork was discussed. Several aspects of the shareability of files and resources between the parent and the child processes, as also the results of the early termination of either the parent or the child, were discussed. We also learnt how to make processes wait and studied the different methods of calling the wait function.

Then we were introduced to the concept of a race condition, wherein multiple processes operate on a shared resource and the final outcome depends on the order in which they run. We have also seen how sleep and polling can be used to get around this situation, and why they are not fully satisfactory.

We also talked about user IDs and the methods of setting and getting them, and about the concepts of process accounting, user identification and process times.

Review Questions

1. What is the alloca function? How does it differ from malloc?

2. What are the commonly used environment variables?

3. Which function helps us to fetch the environment values?

4. Which functions help us to set the environment values?

5. Which functions help us to get and set the limits of a process?

6. What does the fork function do?

7. Which two functions help us to make a process wait?

8. What is a race condition?

9. What are the three measures of time? Which function allows us to get the time values?

Answers

1. alloca is used for allocating memory of a suitable size, by calling it in the form

void *alloca(size_t size);

But instead of allocating from the heap as malloc does, it allocates memory from the stack frame of the calling function.

2. The commonly used environment variables are

HOME – home directory

LANG – name of the locale

LOGNAME – login name

PATH – list of directories to search for executable files

TERM – terminal type

TZ – time zone etc.

3. The function getenv is used. Its typical format is

char *getenv(const char *name);

4. The function setenv is used. Its typical format is

int setenv(const char *name, const char *value, int rewrite);

5. The functions getrlimit and setrlimit are used. Their typical formats are

int getrlimit(int resource, struct rlimit *rlptr);

and

int setrlimit(int resource, const struct rlimit *rlptr);

respectively.

6. It creates a child process for a parent.

7. The functions wait and waitpid will be useful. Their formats are

pid_t wait(int *statloc);

and

pid_t waitpid(pid_t pid, int *statloc, int options);

respectively.

8. A race condition occurs when two or more processes operate on a shared resource and the final outcome depends on the order in which the processes run.

9. The three measures are real time, user CPU time and system CPU time. The function clock_t times(struct tms *buf) fills in a buffer with the time values.

BLOCK - V

Block Introduction

This block is about signals. Signaling is a very important aspect of synchronizing processes. Since every process has to share the system resources and also share data and information, it becomes essential that there is a signaling mechanism to convey messages.

We start with the basic concepts of signaling and what a process can do on receiving a signal: it can ignore the signal, catch it, or allow the default action for the signal to happen. Then we learn some of the commonly encountered signals. This list is not comprehensive but is only indicative in nature.

We then look at some functions that operate with signals. We also look into the

concept of a reentrant program – which is of vital importance if we have to operate with

signals.

We also look into the concept of signal sets and functions to operate on signal

sets. We finally have an encounter with a few job control signals.


Contents

5.1 Introduction

5.2 Basic Concepts of signals

5.3 Some of the commonly used signals

5.4 We now look at some of the functions of unix that operate with signals

5.5 The concept of Reentrant functions

5.6 Reliable signal terminology

5.7 Raise and Kill functions

5.8 Alarm and pause functions

5.9 Signal sets

5.10 Sigpending function

5.11 Sigaction function

5.12 Sigsuspend function

5.13 Abort function

5.14 Sleep function

5.15 Job control signals


Signals

5.1 Introduction:

In this block, we study the signals used by the Unix system for its various operations. They are of utmost importance in process synchronization, the handling of asynchronous events and other such operations.

Signals can be considered as software interrupts. Their function is to make the currently running program, or other processes in various states, take note of some important event that has taken place, like a user typing a key to stop a program or some program getting terminated.

One basic problem with signaling is that signals should not get lost, but must be registered by the programs and processes which are supposed to take notice of them.

What we see here is essentially a list of signals, their description, their intended effects

and side effects if any. The student can keep in mind that it is not necessary to

remember all of them, though that may be desirable. What is more important is to note

that such signals exist and in specific cases, one should be able to use them and more

importantly, use them correctly.

5.2 Basic Concepts of signals

As we have noted, signals are software interrupts that are used to signal certain situations. They all have names. To distinguish them from other names, all of them begin with the characters SIG.

To assist in system implementation, all these names are defined as positive integer constants (called signal numbers) in the header <signal.h>, which must be included in a program before signals can be used.


Having seen what a signal stands for, we now see under what conditions signals

get generated. Actually a large number of conditions and combinations of conditions

generate the signals. We list only a few of them below.

a) Hardware Exceptions: These are special conditions generated during the

execution of the programs which make further computations not possible or

prone to errors- like invalid memory reference, division by 0 and so on. Normally,

the system hardware has the capability to detect such erroneous situations and

raise an alarm to the kernel. The kernel, on receiving these alarms (which are

normally in the form of some bit being set / reset) generates the corresponding

signal, so that the process currently running or any other process can be warned

about the same.

b) Terminal generated signals: These signals are generated when users press certain terminal keys. For example, if the DELETE key (Control-C on many systems) is pressed while a process is executing, the terminal driver generates an interrupt signal (SIGINT). This mechanism is normally useful in stopping a runaway program.

c) Sometimes it is desirable to send signals to other processes. Suppose a background process has run away. One way of stopping it is to use the kill(1) command, which sends a signal that terminates the background process.

d) Similarly, the kill(2) function allows a process to send any signal to another process or process group. Obviously there are limitations as to which process can send a signal to which other process. As a rule of thumb, the sender has to be the owner of the process that receives the signal. Alternatively, the superuser can send any signal to any process.

e) There are certain situations in which a condition arising during the execution of a program needs to be made known to a process, but no hardware-generated exception is available. Examples are when an alarm set by the process goes off, or when a process writes to a pipe after the reader has terminated. In such cases, software signals are generated.


It may be noted that by definition signals are asynchronous events. No one can predict beforehand when a signal goes off. (If one could predict such events, signals would not be needed: the process could simply check a particular bit, a memory location etc.) Because of the unpredictable nature of the events, signals are needed, so that the process need not keep checking whether the event has happened; instead the event, when it happens, sends a signal to the process.

So far, we have tried to understand the need for and the nature of signals. Now, once a signal is generated, what should happen? The process tells the kernel what to do when the signal occurs. Technically, this is called the “disposition of the signal”. Some authors also call it the “action associated with the signal”.

Normally, three types of actions are possible.

a) Ignore the signal: Most signals can simply be ignored, in which case they have no effect on the process. You may ask why a signal that is ignored is generated in the first place. The answer is that a signal only reports that a certain situation has arisen. It is for the process to decide, based on the prevailing situation, whether the signal needs to be attended to or can be ignored.

However, two signals, SIGKILL and SIGSTOP, cannot be ignored. We see shortly that these are the signals to kill or stop a process, and the intention is to ensure that the action takes place at all costs.

b) Catch the signal: This is actually the action part of the signaling process. The kernel is asked to call a particular function whenever the signal occurs. In that function, the programmer supplies the routine that takes whatever action is needed.

c) Allow the default action to take place: Every signal has a default action. If the process neither ignores the signal nor catches it, the default action takes place. It may be noted that in most cases, the default action is to terminate the process.
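The three dispositions can be tried out with the signal function, which is described in section 5.4. The following is a minimal sketch and is not taken from the text book; the names on_usr1 and demo_dispositions are hypothetical. It catches SIGUSR1, ignores SIGUSR2 and then restores the default action for both.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Set by the handler; sig_atomic_t is safe to write from a handler. */
static volatile sig_atomic_t caught = 0;

/* Hypothetical handler: records that SIGUSR1 was caught. */
static void on_usr1(int signo)
{
    (void)signo;
    caught = 1;
}

/* Returns 1 if SIGUSR1 was caught while SIGUSR2 was ignored, -1 on error. */
int demo_dispositions(void)
{
    caught = 0;
    if (signal(SIGUSR1, on_usr1) == SIG_ERR)   /* catch the signal */
        return -1;
    if (signal(SIGUSR2, SIG_IGN) == SIG_ERR)   /* ignore the signal */
        return -1;

    raise(SIGUSR2);             /* ignored: no effect on the process */
    raise(SIGUSR1);             /* caught: on_usr1 runs before raise returns */

    signal(SIGUSR1, SIG_DFL);   /* allow the default action again */
    signal(SIGUSR2, SIG_DFL);
    return caught;
}
```

Calling demo_dispositions() from main is expected to return 1: the ignored SIGUSR2 has no visible effect, while SIGUSR1 runs the handler. Had both signals been left at their default action, either raise would have terminated the process.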


5.3 Some of the commonly used signals

Now we make a list of some of the commonly used signals and a brief

description of each of them. Though the student may not be able to remember all of

them, it may be pointed out that experienced programmers are normally expected to

know that such signals exist.

SIGABRT This is generated by calling abort. The process terminates abnormally.

SIGALRM This is generated when a timer that is set by the alarm function expires. This is also generated when an interval timer set by the setitimer(2) function expires.

SIGBUS This is generated when an implementation defined hardware fault occurs.

SIGCHLD Whenever a process stops or terminates, the SIGCHLD signal is sent to its parent. By default the signal is ignored. The parent can catch this signal if it wants to be notified about changes in the child’s status. The normal action in the signal catching function is to call one of the wait functions, to fetch the child’s process ID and termination status.

SIGCONT This signal is sent to a stopped process to make it continue. The default action is to continue the stopped process; the signal is ignored if the process is not stopped.

SIGEMT This is raised when there is an implementation defined hardware fault.

SIGFPE This is raised when an arithmetic exception, such as division by 0 or floating point overflow, occurs.

SIGHUP This signal is raised when a disconnect is detected by the terminal interface, and is sent to the controlling process associated with the controlling terminal. This signal is also generated when the controlling process terminates. In this case, the signal is sent to each process in the foreground process group. By convention, this signal is also used to make daemon processes reread their configuration files.

SIGILL This signal is raised when a process has executed an illegal hardware instruction.

SIGINFO This signal is raised by typing the status key. It normally causes status information on the processes in the foreground process group to be displayed on the terminal.

SIGIO This is raised to signify an asynchronous I/O event.

SIGIOT This is raised to indicate an implementation defined hardware fault.

SIGKILL This gives the system administrator a sure way to kill a process. It cannot be caught or ignored.

SIGPIPE This is raised if a pipe is being written to, but the reader of the pipe has terminated. This is also raised if a socket is being written to by a process, but the reader has terminated.

SIGPROF This is raised when a profiling interval timer set by the setitimer(2) function has expired.

SIGPWR This is an interesting signal, available in SVR4, but is system dependent. It is useful on a system connected through a UPS. When power fails, the UPS takes over the supply of power and the system can be notified. At this point of time, since the power supply is being continued, there is nothing that the system needs to do. If the battery of the UPS gets too low, the software is notified again, and this second notification is done by SIGPWR. The signal is sent to the init process, asking it to start the shutdown process immediately.


SIGQUIT This signal is raised when the terminal quit key is typed. It is sent to all processes in the foreground process group, asking the processes to terminate and also to generate a core file.

SIGSEGV This is raised when a process has made an invalid memory reference.

SIGSTOP This is raised to stop a process. It cannot be caught or ignored.

SIGSYS This is raised to signal an invalid system call.

SIGTERM This is the termination signal sent by default by the kill(1) command.

SIGTRAP This is raised to indicate an implementation defined hardware fault.

SIGTSTP This is raised when the terminal driver encounters the suspend key being pressed. It is sent to all processes in the foreground process group.

SIGTTIN This is raised by the terminal driver when a process in a background process group tries to read from its controlling terminal.

SIGTTOU This is raised by the terminal driver when a process in a background process group tries to write to its controlling terminal.

SIGURG This is raised to notify a process that an urgent condition has occurred, for example when out-of-band data arrives on a network connection.

SIGUSR1, SIGUSR2 User defined signals for use in application programs.

SIGXCPU This is raised to indicate that a process has exceeded its current (soft) CPU time limit.

SIGXFSZ This is raised when the current (soft) file size limit is exceeded by a process.

SIGVTALRM This is raised when a virtual interval timer set by the setitimer(2) function expires.


5.4 We now look at some of the functions of Unix that operate with signals.

The first of them is the signal function.

Its typical format is

void (*signal (int signo, void (*func)(int)))(int);

Looking at the arguments, signo is the name of the signal. The value of func can be (i) the constant SIG_IGN, (ii) the constant SIG_DFL or (iii) the address of the function to be called when the signal is raised.

If we specify SIG_IGN for func, we are telling the system to ignore the signal. (Of course, the signals SIGKILL and SIGSTOP cannot be ignored.)

If we specify SIG_DFL for func, we are setting the action associated with the signal to be its default action.

If we specify the address of a function, then when the signal is raised, it is caught by calling the function specified. Some authors call this function the “signal handler” or the “signal catching function”.

Before we proceed, we would like to know the default actions of the signals we have studied earlier. Fortunately, we need not make an exhaustive list again. Most of them simply specify terminate as the default action. We can simply list those that have a different default action.

The following signals have ignore as the default action

SIGCHLD, SIGPWR, SIGURG

The following signals have stop process as the default action

SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU

SIGCONT continues a stopped process and is otherwise ignored. Most of the others have terminate as the default action (several of them also generate a core file).

To make ourselves familiar with all the concepts so far, we look into a simple signal

handler program that catches either of the two user defined signals (Ref. Text Book 1).


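#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* err_sys and err_dump are error-reporting helpers assumed from the
   text book; they print a message and terminate the program. */

static void sig_usr (int signo);  /* one handler for both the signals */

int
main (void)
{
    if (signal (SIGUSR1, sig_usr) == SIG_ERR)
        err_sys ("can't catch SIGUSR1");
    if (signal (SIGUSR2, sig_usr) == SIG_ERR)
        err_sys ("can't catch SIGUSR2");
    pause ();   /* wait until a signal is caught */
    return 0;
}

static void
sig_usr (int signo)  /* argument is signal no */
{
    if (signo == SIGUSR1)
        printf ("received SIGUSR1\n");
    else if (signo == SIGUSR2)
        printf ("received SIGUSR2\n");
    else
        err_dump ("received signal %d\n", signo);
    return;
}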

5.5 The concept of Reentrant functions

We started the discussion of signals by saying that a signal can be viewed as a software interrupt. This means that when a signal arrives while a process is executing, the process is temporarily interrupted to handle the signal (by running the signal handler function). If the signal handling does not end in termination of the process, we come back to the point at which the process was executing when the signal was raised and continue from where we left off.

This looks simple enough, provided we can mark the position where we abandoned the process to execute the signal handler, and this is easily done using the system stack. But the problem lies elsewhere. We cannot always resume the original process correctly if it was in the middle of certain operations when we interrupted it. For example, if it was in the middle of allocating memory, or was using a function that keeps its result in static storage, and the signal handler calls the same function, the data that the interrupted code depends on may have been changed in between.

A function that can safely be interrupted at any point, called again from the signal handler, and then resumed from where we left off (as desired in the above discussion) is said to be a reentrant function, i.e. the result of previous or interleaved executions does not affect the present execution.

Most systems publish a list of reentrant functions that are guaranteed to satisfy this property. The programmer will have to be extra careful when calling non-reentrant functions from a signal handler.
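As an illustration of the above, here is a sketch of a handler that restricts itself to reentrant operations: it sets a flag of type sig_atomic_t and calls write, and it saves and restores errno around the call. The names safe_handler and demo_safe_handler are hypothetical, and the sketch assumes a POSIX system where write is on the list of functions that may be called from a handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Flag shared between the handler and the main program. */
static volatile sig_atomic_t got_signal = 0;

/* Hypothetical handler that avoids non-reentrant functions such as
   printf or malloc: it only sets a flag and calls write(2). */
static void safe_handler(int signo)
{
    static const char msg[] = "signal caught\n";
    int saved_errno = errno;     /* write may change errno */
    ssize_t n;

    (void)signo;
    got_signal = 1;
    n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;                     /* nothing useful to do on failure here */
    errno = saved_errno;         /* interrupted code sees its own errno */
}

/* Installs the handler and raises SIGUSR1.
   Returns the flag (1 if the handler ran), or -1 on error. */
int demo_safe_handler(void)
{
    got_signal = 0;
    if (signal(SIGUSR1, safe_handler) == SIG_ERR)
        return -1;
    if (raise(SIGUSR1) != 0)
        return -1;
    signal(SIGUSR1, SIG_DFL);
    return got_signal;
}
```

The main program would normally test got_signal at a convenient point and do the real work there, so that the handler itself stays small and reentrant.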

5.6. Reliable signal terminology

We define some of the terminology that we will be using in our future discussions.

A signal is said to be “generated” for a process when the event that causes the signal occurs. When the signal is generated, the kernel usually sets a flag to indicate the same.

The signal is said to be “delivered” when the action for the signal is taken. During the time between the generation of a signal and its delivery, the signal is said to be pending.

A process can “block” a signal. If a signal that is blocked is generated for a process, and the action for that signal is either the default action or to catch the signal, then the signal remains pending until the process either unblocks the signal or changes the action so that the signal is ignored.

What to do with a blocked signal is decided when the signal is delivered, not when it is generated. This allows the process to change the action for the signal before it is delivered.

The sigpending function can be called by a process to determine which signals are blocked and pending for the process.

If a blocked signal is generated more than once before it is unblocked, the system may deliver it either once or more than once. If it is delivered more than once, we say that the signals are queued. Most Unix systems, however, do not queue signals and deliver the signal only once.

Each process has a signal mask, which normally takes the form of a set of bits, one bit for each possible signal. If the bit for a signal is on, that signal is currently blocked. To examine or change its current signal mask, the process can use the sigprocmask function.

The typical format of the function is

int sigprocmask (int how, const sigset_t *set, sigset_t *oset);

Returns 0 if successful, else returns -1.

If oset is a non-null pointer, the current signal mask of the process is returned through oset.

If set is a non-null pointer, then the how argument indicates how the current signal mask is modified. The different values of how and their significance are listed below.

How Description

SIG_BLOCK set contains the additional signals that need to be blocked. The new signal mask is the union of the current signal mask and set.

SIG_UNBLOCK set contains the signals to be unblocked. The new signal mask is the intersection of the current signal mask and the complement of set.

SIG_SETMASK The new signal mask is the value pointed to by set.

If set is a null pointer, the signal mask of the process is not changed, and the value of how is ignored.

If there are any pending, unblocked signals after the call to sigprocmask, at least one of them is delivered to the process before sigprocmask returns.

5.7 Raise and Kill functions:

The kill function sends a signal to either a process or a group of processes. The raise function allows a process to send a signal to itself.

Typical formats are

int kill (pid_t pid, int signo);

int raise (int signo);

kill returns 0 if successful, else -1; raise returns 0 if successful, else a nonzero value.

For the kill function

pid > 0 The signal is sent to the process whose process ID is pid.

pid == 0 The signal is sent to all processes whose process group ID equals the process group ID of the sender and for which the sender has permission to send the signal.

pid < -1 The signal is sent to all processes whose process group ID equals the absolute value of pid and for which the sender has permission to send the signal.

pid == -1 Left unspecified by the original POSIX.1 standard. On many systems the signal is sent to all processes for which the sender has permission to send the signal.

For signo, signal 0 is the null signal, which is used to check whether a process still exists. kill performs the normal error checking but sends no signal. If we send the null signal to a process that does not exist, kill returns -1 (with errno set to ESRCH). If any other result is obtained, the process exists.

Unfortunately, there is a big catch here. Unix recycles process IDs. What does that mean? When a process terminates, its ID may later be allotted to some other process. If you send the null signal to ascertain whether an earlier process still exists, you may be told that the process exists, but it may not be the process you are expecting; it may be a new process with the same ID.
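The null signal test can be sketched as below. The function name process_exists is hypothetical. Note that a return of -1 from kill with errno set to EPERM also means that the process exists; we are merely not permitted to signal it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a process with this ID exists,
   0 if it does not, -1 on any other error.  Because process IDs are
   recycled, a result of 1 does not prove it is the process we expect. */
int process_exists(pid_t pid)
{
    if (kill(pid, 0) == 0)    /* null signal: error checking only */
        return 1;
    if (errno == EPERM)       /* exists, but we may not signal it */
        return 1;
    if (errno == ESRCH)       /* no such process */
        return 0;
    return -1;
}
```

For example, process_exists(getpid()) returns 1, since the calling process certainly exists and may signal itself.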

5.8 Alarm and pause functions:

The alarm function is used to set a timer that will go off at a specified time in the future. When the timer expires, SIGALRM is generated. If we do not catch the signal, the default action is to terminate the process.

The typical format is as below

unsigned int alarm (unsigned int seconds);

Returns 0 or the number of seconds left on a previously set alarm.

The seconds value is the number of clock seconds after which the signal is generated. At that instant the kernel generates SIGALRM. But the process may actually get control to handle the signal later, because of scheduling delays and the other aspects that we have discussed earlier.

Also, only one alarm can be set for a process. If an alarm is already set for the process when alarm is called, the alarm is reset to the new value and the time remaining on the previous alarm is returned, in seconds. If seconds is 0, the previous alarm is cancelled.

Since an uncaught SIGALRM terminates the process by default, most processes that use an alarm catch the signal and then decide whether to terminate or not.
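The rule that a process has only one alarm, and that a new call returns the time left on the old one, can be checked with a short sketch. The function name demo_alarm is hypothetical. No SIGALRM is ever delivered here, because the last call cancels the timer.

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>

/* Sets a 10 second alarm, replaces it with a 5 second alarm and then
   cancels it.  The values returned by the second and third calls are
   stored through the two pointers. */
void demo_alarm(unsigned int *left_first, unsigned int *left_second)
{
    alarm(10);                   /* first alarm: 10 seconds from now */
    *left_first = alarm(5);      /* replaces it; about 10 seconds were left */
    *left_second = alarm(0);     /* cancels it; about 5 seconds were left */
}
```

The values are only approximately 10 and 5, since a second boundary may be crossed between the calls.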

The typical format of the pause function is as follows

int pause (void);

Returns -1 with errno set to EINTR; otherwise it does not return.

The pause function suspends the calling process until a signal is caught. It returns only if a signal handler is executed and that handler returns.

Using a combination of pause and alarm, we can simulate the sleep function. Let us call this function sleep_new.

#include <signal.h>
#include <unistd.h>

static void
sig_alrm (int signo)
{
    return;   /* nothing to do, just return to wake up the pause */
}

unsigned int
sleep_new (unsigned int nsecs)   /* takes the delay in seconds */
{
    if (signal (SIGALRM, sig_alrm) == SIG_ERR)
        return (nsecs);
    alarm (nsecs);        /* start the timer */
    pause ();             /* wait till the signal is raised */
    return (alarm (0));   /* turn off the timer, return unslept time */
}

This simple implementation, though it illustrates the use of alarm, pause and signals, is imperfect.

The student is advised to think about why it is imperfect, based on the discussions so far.


5.9 Signal sets.

In one of the previous sections (on the concept of blocking, to be exact) we talked about signal sets, though we did not specify the actual form that these sets take. We now deal with them in more detail.

So far we have been looking at the effects of single signals, one at a time. But several functions need to refer to more than one signal at once, for example to name all the signals that are to be blocked. In such cases we talk of a set of signals, or a signal set, which is represented by the data type sigset_t. Note that we used a signal set to arrange and rearrange the mask in sigprocmask.

The following functions manipulate signal sets.

1) int sigemptyset (sigset_t *set);

2) int sigfillset (sigset_t *set);

3) int sigaddset (sigset_t *set, int signo);

4) int sigdelset (sigset_t *set, int signo);

All of them return 0 if ok, -1 on error.

5) int sigismember (const sigset_t *set, int signo);

Returns 1 if true, 0 if false, -1 on error.

The function sigemptyset initializes the signal set pointed to by set so that all signals are excluded.

The function sigfillset initializes the signal set pointed to by set so that all signals are included. Before using a signal set for any operation, the application should call one of these two functions to initialize the set.

The sigaddset and sigdelset functions respectively add to and delete from the set pointed to by set the signal whose number appears in signo. Note that each call adds or deletes one signal only.
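The five functions can be exercised together in a short sketch. The function name demo_sigset is hypothetical; the choice of SIGINT and SIGQUIT is only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Builds the set {SIGINT, SIGQUIT}, removes SIGQUIT again and checks
   the membership of both.  Returns 1 if the set behaves as expected,
   0 if not, -1 on error. */
int demo_sigset(void)
{
    sigset_t set;

    if (sigemptyset(&set) < 0)           /* always initialize first */
        return -1;
    if (sigaddset(&set, SIGINT) < 0)     /* one signal per call */
        return -1;
    if (sigaddset(&set, SIGQUIT) < 0)
        return -1;
    if (sigdelset(&set, SIGQUIT) < 0)
        return -1;

    return sigismember(&set, SIGINT) == 1 &&
           sigismember(&set, SIGQUIT) == 0;
}
```

Starting from sigfillset instead and deleting the few signals we want to leave unblocked is the usual way to build a mask that blocks nearly everything.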


5.10 Sigpending function

Often we are interested in knowing the set of signals that are blocked from delivery and are currently pending for the calling process. The sigpending function does this job.

The typical format is

int sigpending (sigset_t *set);

Returns 0 if ok, -1 if error. The set of pending signals is returned through the set argument.

We can write a small function that prints the names of the signals in the signal mask of the calling process (i.e. its blocked signals), as below.

#include <errno.h>
#include <signal.h>
#include <stdio.h>

/* err_sys is an error-reporting helper assumed from the text book. */

void
pr_mask (const char *str)
{
    sigset_t sigset;
    int errno_save;

    errno_save = errno;   /* we may be called from a signal handler */
    if (sigprocmask (0, NULL, &sigset) < 0)
        err_sys ("sigprocmask error");
    printf ("%s", str);
    if (sigismember (&sigset, SIGINT)) printf ("SIGINT ");
    if (sigismember (&sigset, SIGQUIT)) printf ("SIGQUIT ");
    if (sigismember (&sigset, SIGUSR1)) printf ("SIGUSR1 ");
    if (sigismember (&sigset, SIGALRM)) printf ("SIGALRM ");
    /* print the corresponding signal names.
       This list can go on, to check as many signals as you need. */
    printf ("\n");
    errno = errno_save;
}


5.11 Sigaction function

This function is used to examine or modify (or both) the action associated with a particular signal. The function sigaction supersedes the signal function that we have seen earlier in this block.

The typical format is

int sigaction (int signo, const struct sigaction *act, struct sigaction *oact);

Returns 0 if ok, -1 on error.

The argument signo is the signal whose action we are examining or modifying. If the act pointer is not null, we are modifying the action. If oact is not null, the system returns the previous action for the signal through it.

The structure sigaction is of the following form.

struct sigaction {

    void (*sa_handler)(int); /* address of the signal handler */

                             /* it can also be either SIG_IGN or SIG_DFL */

    sigset_t sa_mask;        /* additional signals to block while the handler runs */

    int sa_flags;            /* signal options */

};

Once sigaction sets up the action for a particular signal, it remains in force until it is explicitly changed by calling sigaction again.

Some of the sa_flags options are listed below along with their descriptions.

Option flag Description

SA_NOCLDSTOP If signo is SIGCHLD, do not generate the signal when a child process stops.

SA_RESTART System calls interrupted by this signal are automatically restarted.

SA_ONSTACK If an alternate stack has been declared, this signal is delivered to the process on the alternate stack.

This is not the complete set of options. The student may refer to standard literature to get a complete list of options.

5.12 Sigsuspend function.

Often we would like not to receive signals while certain critical sections of the code are being executed. One way to do this is to use the signal mask to block and unblock signals. Just before entering the critical section we block the signals, and we unblock them after we leave the section. However, consider the scenario in which we unblock a signal and then call pause to wait for that previously blocked signal to occur. If the signal arrives between the unblocking and the call to pause, it is handled before pause is called; the signal is in effect lost, and pause may block for ever.

There is another way, in which we both reset the mask and put the process to sleep in a single atomic operation. (An atomic operation is one that cannot be interrupted part way. Here it means that unmasking and pausing are completed in one stroke, so no signal can slip in between them, avoiding the problem we have just discussed.)

int sigsuspend (const sigset_t *sigmask);

Returns -1 with errno set to EINTR.

The signal mask of the process is set to the value pointed to by sigmask, and the process is suspended until a signal is caught. When sigsuspend returns, after the signal handler has returned, the signal mask is restored to the value it had before the call.
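The following sketch shows the usual pattern: block the signal, start the event that will generate it, and then wait with sigsuspend using a mask in which the signal is unblocked. The names on_alarm and wait_for_alarm are hypothetical. Because SIGALRM is blocked until sigsuspend is entered, it cannot be lost in the way described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_fired = 1;
}

/* Waits until an alarm set for `seconds` seconds has gone off.
   Returns 1 after the alarm, 0 if seconds is 0, -1 on error. */
int wait_for_alarm(unsigned int seconds)
{
    struct sigaction act, oact;
    sigset_t block, old, waitmask;

    if (seconds == 0)
        return 0;                     /* alarm(0) would never fire */

    act.sa_handler = on_alarm;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (sigaction(SIGALRM, &act, &oact) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)   /* block before arming */
        return -1;

    alarm_fired = 0;
    alarm(seconds);

    waitmask = old;                   /* mask to use while waiting */
    sigdelset(&waitmask, SIGALRM);    /* SIGALRM must be deliverable */
    while (!alarm_fired)
        sigsuspend(&waitmask);        /* atomically unblock and wait */

    sigprocmask(SIG_SETMASK, &old, NULL);   /* restore mask and action */
    sigaction(SIGALRM, &oact, NULL);
    return 1;
}
```

The loop on the flag is needed because sigsuspend returns after any caught signal, not only after SIGALRM.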

5.13 Abort function

The prototype of the abort function is as follows

void abort (void);

This function never returns.

This function sends the SIGABRT signal to the calling process, and the process normally should not ignore this signal.

However, the process can catch SIGABRT. This facility is provided to allow the process to perform any cleanup it wants before it terminates. Even if the signal handler returns, abort still terminates the process.
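To observe this safely, the sketch below calls abort in a child process and lets the parent inspect how the child ended. The names cleanup_on_abort and demo_abort are hypothetical; the cleanup is represented by a single message written to standard error.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical cleanup handler: runs when SIGABRT is caught. */
static void cleanup_on_abort(int signo)
{
    static const char msg[] = "cleaning up before abort\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);

    (void)n;
    (void)signo;
}

/* Returns 1 if the child was terminated by SIGABRT even though it
   caught the signal, 0 otherwise, -1 on error. */
int demo_abort(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {                       /* child */
        signal(SIGABRT, cleanup_on_abort);
        abort();   /* handler runs and returns; the process still terminates */
    }
    if (waitpid(pid, &status, 0) != pid)  /* parent */
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}
```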

5.14 Sleep function

We have used, and also briefly seen, the sleep function in one of the previous sections. However, sleep is a function that needs to be studied in the context of signals.

The prototype of sleep is

unsigned int sleep (unsigned int seconds);

Returns 0 or the number of unslept seconds.

This function, when called, causes the calling process to be suspended until one of two conditions is met.

a) The amount of real time specified by seconds has elapsed.

b) A signal is caught by the process and the signal handler returns.

However, the difficulty is the same as the one we have seen with the alarm signal, i.e. there may be a time lag between the time when the sleep ends and the time when the process actually resumes.

A word about the values returned by sleep. In a normal return, it returns 0. However, if it returns early because a signal was caught (case (b) above), it returns the number of seconds that were still to elapse for the normal return.
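The early return of case (b) can be demonstrated with a sketch in which a child process interrupts its parent's sleep. The names wake_up and demo_interrupted_sleep are hypothetical, and the delays of 1 and 5 seconds are arbitrary choices for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler: does nothing, its return ends the sleep. */
static void wake_up(int signo)
{
    (void)signo;
}

/* Sleeps for 5 seconds but is interrupted after about 1 second.
   Returns the number of unslept seconds, or -1 on error. */
int demo_interrupted_sleep(void)
{
    unsigned int unslept;
    pid_t parent = getpid();
    pid_t pid;

    if (signal(SIGUSR1, wake_up) == SIG_ERR)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {              /* child: signal the parent after ~1 s */
        sleep(1);
        kill(parent, SIGUSR1);
        _exit(0);
    }

    unslept = sleep(5);          /* cut short when SIGUSR1 is caught */
    waitpid(pid, NULL, 0);       /* collect the child */
    signal(SIGUSR1, SIG_DFL);
    return (int)unslept;
}
```

The value returned is expected to be close to 4, but the exact figure depends on scheduling and on how the system rounds the remaining time.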

5.15 Job control signals.

There are six signals that can be considered to be job control signals.

SIGCHLD Child process has terminated or stopped

SIGCONT Continue the process, if stopped

SIGSTOP Stop the process (cannot be caught or ignored)

SIGTSTP Interactive stop signal

SIGTTIN Read from the controlling terminal by a member of a background process group

SIGTTOU Write to the controlling terminal by a member of a background process group

Typically, a job control signal is one that can be used for controlling the sequence of process execution.

Since we have already seen these signals, we will not discuss them in any detail. The student is advised to note that by suitably using these signals, we can stop and continue processes irrespective of the normal scheduling, and so get the desired sequence of operations. However, one should be careful about the interactions between the signals to ensure proper sequencing.

One such interaction is that when any of the four stop signals (SIGTSTP, SIGSTOP, SIGTTIN or SIGTTOU) is generated for a process, any pending SIGCONT signal for that process is discarded.

Similarly, when the SIGCONT signal is generated for a process, any pending stop signals for that process are discarded.
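Stopping and continuing a process with these signals can be sketched as follows. The function name demo_stop_continue is hypothetical. The parent stops a child with SIGSTOP, confirms the stop with waitpid and the WUNTRACED option, continues the child with SIGCONT and finally terminates it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child was seen to stop on SIGSTOP and was later
   terminated by SIGTERM, 0 otherwise, -1 on error. */
int demo_stop_continue(void)
{
    int status, stopped;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {                  /* child: just wait for signals */
        for (;;)
            pause();
    }

    kill(pid, SIGSTOP);              /* cannot be caught or ignored */
    if (waitpid(pid, &status, WUNTRACED) != pid) {   /* report the stop */
        kill(pid, SIGKILL);
        return -1;
    }
    stopped = WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP;

    kill(pid, SIGCONT);              /* let the child run again */
    kill(pid, SIGTERM);              /* default action: terminate */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return stopped && WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM;
}
```

A shell with job control does essentially this when a foreground job is suspended with the suspend key and later resumed.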


Block Summary

We have briefly discussed the importance of signals in the synchronization of processes. Signaling is necessary to ensure proper sequencing of processes and sharing of resources, and also to take care of conditions like hardware exceptions, terminal generated signals, killing of processes etc. We have seen a list of commonly used signals and the fact that a process can ignore a signal, catch it or allow the default action on it. It was also established that reentrant functions are essential for the smooth operation of signal handling.

We have looked into the details of several functions like raise, kill, alarm, pause etc. We then discussed the concept of signal sets and five functions to manipulate such signal sets. Apart from this, we studied the sigpending, sigaction, sigsuspend, abort and sleep functions. We closed the discussion with job control signals.


Review Questions

1. What are the situations in which signaling becomes necessary?

2. What are the actions that a process can take on encountering a signal?

3. What is a reentrant function?

4. Give the format of the kill and raise functions.

5. What is the purpose of the alarm and pause functions?

6. What is a signal set?

7. What are the operations on signal sets?

8. Give the format of the sleep function.

9. Name the six job control signals.

10. Which signal does the abort function send?


Answers

1. Signaling becomes necessary in situations like process synchronization, resource sharing, hardware exceptions, terminal generated signals, killing of processes etc.

2. There are 3 types of actions possible: ignore the signal, catch the signal and allow the default action.

3. A function that can be interrupted and later resumed from where we left off, even if it is called again in between, without affecting the correctness of its results is called a reentrant function.

4. The formats are int kill (pid_t pid, int signo);

and int raise (int signo); respectively.

5. The alarm function is used to specify a future time when the SIGALRM signal should go off. The pause function suspends the process until a signal is caught.

6. It is a collection of signals, which makes the arrangement of signal masks easy.

7. The operations are: initializing a set to be empty (sigemptyset), initializing a set to include all signals (sigfillset), adding a signal to a set (sigaddset), deleting a signal from a set (sigdelset) and checking whether a given signal is a member of a set (sigismember).

8. unsigned int sleep (unsigned int seconds);

9. SIGCHLD, SIGCONT, SIGSTOP, SIGTSTP, SIGTTIN and SIGTTOU.

10. The SIGABRT signal.


BLOCK VI

BLOCK INTRODUCTION

In this block we study interactions between processes. In the previous chapters

we have seen that there are relationships between different processes. Every process

has a parent process, but need not necessarily have child processes. Whenever a child

process terminates, the parent is notified. Also, the parent can obtain the exit status

details of its children.

In this block, we look at the process groups in more detail, in order to study the

relationships between the processes and also the relationship between the login shell

(which gets invoked when we login) and the processes that start from the login shell.

CONTENTS

6.1 The concept of terminal login

6.2 Network logins

6.3 Process groups

6.4 Sessions

6.5 Controlling terminal

6.6 tcgetpgrp and tcsetpgrp functions

6.7 Job control

6.8 Review questions and answers


6.1 The concept of terminal login

What happens when a user logs in from a terminal? Though there are slight differences between one version of Unix and another, we look at the general procedure, pointing out version-specific details where necessary.

Normally the system administrator will have created a file which we can call the login file. In this file, each line corresponds to one terminal device of the system (note that it need not always be an I/O terminal; other devices are also included). The line specifies the name of the device, the parameters to be used for logging in etc. When the system is started, the kernel creates the first process, init (process ID 1), and init is responsible for the interaction with the various users as and when they log in at different terminals. init reads the above-mentioned login file and, for every terminal that is allowed to log in, does a fork followed by an exec of a program that gets the details of the terminal (let us call it getty; tty stands for terminal device). The entire process can be shown as in the following figure.

[Figure: init (process ID 1) forks once per terminal, and each child process executes getty.]

The processes described so far all run with superuser privileges, with a real user ID of 0.

getty calls the open function for the terminal device. This opens the terminal for reading and writing. Once the device is open (the actual mode of opening and the delay before it becomes open depend on the type of the device), file descriptors 0, 1 and 2 are set to the device. (The student is advised to look back at the concept of file descriptors to be sure what these descriptors are about.) getty then outputs a prompt such as login: and waits for the user to type in his user name. Once the name is typed, getty invokes the login program. init invokes getty with an empty environment, and getty creates the environment for login, containing the name of the terminal and other details. At this stage, the status of the processes is as shown below.

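[Figure: init (process ID 1) reads the login file, forks once per terminal and creates an empty environment; each child execs getty, which opens the terminal device (file descriptors 0, 1, 2), reads the user name and sets the initial environment; getty then execs login.]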

Now login takes over. Since it is running with superuser privileges, it calls the function getpwnam to fetch the password entry for the user. It prompts the user for his password and reads the password typed in by him. It encrypts the typed password and compares the result with the encrypted entry in the password file (the student will do well to remember that passwords in Unix are stored in encrypted form in the password file, so anything typed at the terminal must first be encrypted with the same algorithm before it can be compared). If the password fails to match (after several attempts, the number of which can be set), login calls exit with an argument of 1. This termination is noticed by init, which does another fork and exec of getty, so that the user can try the login process again.

If, on the other hand, the login has been successful, then login changes to the home directory of the user (chdir). The ownership of the terminal is changed using chown, so that the user who has successfully logged in becomes the owner and group owner. The access permissions of the terminal are changed to allow user-read, user-write and group-write. The group IDs are set, and the home directory, user name and path are incorporated into the environment. The user who has logged in will now be able to work under his own user ID. At this stage the arrangement of processes appears as follows.

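[Figure: init (process ID 1) forks; the child execs getty, which execs login, which in turn starts the login shell. File descriptors 0, 1 and 2 of the login shell refer to the terminal device driver, which is connected over an RS-232 line to the user at the terminal.]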

6.2 Network Logins

In the present-day scenario, networks are taking over from individual systems. In the case of the terminal login described above, wherein a central unit caters to the needs of several terminals, init has full information about the terminal devices enabled for logins and runs a getty for each device. In the case of a network, however, the login arrives through the network (Ethernet) drivers which interface to the kernel, and no prior information is available about how many such logins can occur. Hence, unlike the previous case, wherein init forks a process for each terminal and waits for a login to come in, it is more sensible to act as and when a network connection request arrives.

At startup, the init process executes a shell script when the system is ready for multiuser operation. This starts a daemon process called inetd. This daemon keeps waiting for TCP/IP connection requests to arrive, and once such a request arrives, it does a fork and exec (as in the previous case).

However, in a network login, instead of logging in at a terminal, the user is supposed to use telnet to get into the remote system. Normally, he does this by typing in the command "telnet hostnode".

Then a server process called telnetd is started (of course, the usual process of checking the password, exiting if the password does not match etc. has to be gone through; at this stage, we presume a successful match). telnetd then produces two processes, one to take care of the network connection and communication with the client and the second doing the job of the login shell. The two are joined through a pseudo terminal, which can be thought of as the link that connects the client and the server. File descriptors 0, 1 and 2 are attached to the pseudo terminal. Then login performs operations like changing to the home directory, setting the group IDs and user ID and setting up the initial environment.

The arrangement of the processes can be briefly given as below.

[Figure: init (process ID 1) executes a shell script when the system comes up for multiuser operation, which starts inetd. On a TCP connection request from a telnet client, inetd forks and the child execs telnetd. telnetd is joined through a pseudo terminal (file descriptors 0, 1, 2) to the login shell, and over the network connection to the user at a terminal.]

6.3 Process groups

Every process has a process ID, and it also has a process group ID. A process group is a collection of one or more processes, with its own unique process group ID. Process group IDs are positive integers and can be stored in the data type pid_t, like process IDs.

The function getpgrp returns the process group ID of the calling process. The typical format is

pid_t getpgrp(void);

Further, each process group may have a process group leader, which is identified by having its process ID equal to the process group ID of the group.

A process group leader creates a process group and creates processes in the group. It is not necessary that the life of the process group leader should equal or exceed the life of the group itself. A process group survives as long as even one of the processes in the group survives, so the group leader can terminate before the group does. The last process of the group may either terminate or enter some other process group; in either case the group itself ceases to exist.

The function setpgid helps a process either to join an existing process group or to create a new process group.

The typical format is

int setpgid (pid_t pid, pid_t pgid);

This sets the process group ID of the process whose ID is pid to pgid, i.e. it places the process pid in the group with group ID pgid. If the two arguments pid and pgid are equal, then pid creates a new group and itself becomes the group leader.

A process can set the process group ID of either itself or one of its children. However, once the child has called one of the exec functions, the parent can no longer change the process group ID of the child.

If pid = 0, the process ID of the caller is used. If pgid = 0, then pid is used as the group ID.
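The following is a minimal sketch of how getpgrp and setpgid might be used together. The function name child_becomes_group_leader is our own and purely illustrative; the sketch assumes a POSIX system.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative helper (not part of any library).
 * Forks a child that puts itself into a new process group.
 * Returns 1 if the child saw getpgrp() == getpid() afterwards,
 * 0 if it did not, and -1 if fork or waitpid failed. */
int child_becomes_group_leader(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* pid = 0: act on the caller; pgid = 0: use the caller's pid
         * as the new process group ID. */
        if (setpgid(0, 0) < 0)
            _exit(2);
        _exit(getpgrp() == getpid() ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The child reports its result through its exit status, which the parent collects with waitpid.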


6.4 Sessions

In this section, we look into a very important concept, the concept of sessions. A session is a collection of one or more process groups.

A process can establish a new session by calling the setsid function.

The typical format is

pid_t setsid (void);

It returns the process group ID if successful and -1 on error. If the calling process is not a process group leader, setsid creates a new session (if the caller is already a process group leader, the function returns an error). Three things then happen:

a) The calling process becomes the session leader of the new session. The process will be the only process of the new session.

b) The process becomes the process group leader of a new process group.

c) The process loses its association with its controlling terminal.

Now what is a controlling terminal? That is the subject of the next section.
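The effects listed above can be observed with a small sketch like the one below. The helper name child_starts_new_session is hypothetical; the child is used as the caller of setsid because a freshly forked child is never a process group leader.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative helper (not part of any library).
 * Forks a child that calls setsid and checks that it has become both
 * the session leader and the process group leader.
 * Returns 1 if the checks pass, 0 if they fail, -1 on fork/waitpid error. */
int child_starts_new_session(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        pid_t sid = setsid();           /* create the new session */
        if (sid == (pid_t)-1)
            _exit(2);
        int ok = (sid == getpid())      /* return value is our own ID      */
              && (getsid(0) == getpid())/* we are the session leader       */
              && (getpgrp() == getpid());/* and the process group leader   */
        _exit(ok ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Daemon programs commonly use this fork-then-setsid pattern to detach themselves from the terminal on which they were started.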

6.5 Controlling terminal

The concept of a controlling terminal was raised in the previous section. In this section we discuss what a controlling terminal is.

A controlling terminal is normally the terminal device on which one logs in. A session can have a single controlling terminal. The session leader establishes the connection with the controlling terminal on behalf of the session. Such a session leader is called the controlling process. The process groups within a session can be divided into a single foreground process group and one or more background process groups.

When the terminal's interrupt key is pressed, the interrupt signal (SIGINT) is sent to all processes in the foreground process group; similarly, the quit key sends the quit signal (SIGQUIT) to them.

Normally, once a user logs in and a session is created, his terminal automatically becomes the controlling terminal.

If any other program or process wants to communicate with the controlling terminal, it has to open the file /dev/tty and use it for communication.


6.6 tcgetpgrp and tcsetpgrp functions

These functions are used to tell the device driver which process group is the foreground process group.

The general format is

pid_t tcgetpgrp(int filedes);

This returns the process group ID of the foreground process group if OK, and -1 on error.

The other function is

int tcsetpgrp(int filedes, pid_t pgrpid);

This returns 0 if successful and -1 on error.

The function tcgetpgrp returns the process group ID of the foreground process group for the terminal open on filedes.

tcsetpgrp can be used to set the foreground process group ID to pgrpid (if the process has a controlling terminal). The pgrpid value should be the process group ID of a process group in the same session. filedes refers to the file descriptor of the controlling terminal of the session.
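As an illustration, a process can find out whether it is running in the foreground by comparing its own process group ID with the one reported by tcgetpgrp. The helper name in_foreground below is our own; this is only a sketch.

```c
#include <sys/types.h>
#include <unistd.h>

/* Illustrative helper (not part of any library).
 * Returns 1 if the caller belongs to the foreground process group of
 * the terminal open on fd, 0 if it belongs to a background group, and
 * -1 if tcgetpgrp fails (for example when fd is not a terminal). */
int in_foreground(int fd)
{
    pid_t fg = tcgetpgrp(fd);
    if (fg == (pid_t)-1)
        return -1;
    return fg == getpgrp();
}
```

A program would typically call in_foreground(STDIN_FILENO) before deciding whether to prompt the user.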

6.7 Job control

The term job control has been encountered previously also. It normally means the capability to start multiple jobs (groups of processes) from a single terminal and to control which job can access the terminal and which jobs run in the background. Using job control essentially means that we are using a shell that supports job control and that uses the corresponding features of the system.

When a background job is started, the shell assigns it a job identifier and prints one or more process IDs. The interaction with the terminal driver becomes important because the suspend key, if entered from the terminal, affects the foreground job. This key, when pressed, makes the driver send the SIGTSTP signal to all processes in the foreground process group, stopping them. The background jobs, however, remain unaffected.

There is yet another condition that the terminal driver must be able to handle. There is one foreground job and there may be one or more background jobs. If a character is entered from the terminal, which of these jobs receives that input? The foreground job is normally expected to receive the input. If a background job is in need of input, it may try to read from the terminal. When the terminal driver detects this, it sends a SIGTTIN signal to the background job. The background job stops and, if needed, the user can bring it to the foreground. Afterwards, it can read its input from the controlling terminal.

Similarly, a background job may try to output to the terminal. Depending on the circumstances, the user may allow or disallow this. There are, however, conflicting opinions about the need for job control. With the advent of windowing systems it can be claimed that job control is not needed and that the effort and resources needed for it can be used for better purposes. Others feel it could supplement the facilities provided by the windowing system.


Block Summary

In this block, we looked into the concept of a terminal login. We understood the various steps that the system has to undertake for a successful login. We also discussed the difference between a terminal login and a network login. The concept of process groups, and how to set the process group ID and get its current value, were also discussed.

Then the concept of sessions was discussed, as also the idea of a controlling terminal. We familiarised ourselves with the functions that indicate the foreground process group to the driver. The final idea discussed was the concept of foreground and background jobs and how we can control them.


Review Questions

1. What is the difference between network login and terminal login?

2. What is a process group?

3. Which function gives the process group id of the calling process?

4. What is a session?

5. Which function establishes a new session? Give its format.

6. What is a controlling terminal in the context of a session?

7. What does the function tcgetpgrp do?

8. What does the function tcsetpgrp do?

Answers

1. In a terminal login, we use the simple login process, whereas in a network login we use telnet.

2. A process group is a collection of one or more processes, with its own process group id.

3. The function is getpgrp. Its format is pid_t getpgrp(void), and it returns the process group id.

4. A session is a collection of one or more process groups.

5. The setsid function. The format is pid_t setsid (void);

6. A controlling terminal in the context of a session is the terminal with which the session leader establishes the connection.

7. It returns the process group id of the foreground process group.

8. It helps to set the foreground process group id.


Unit – III

INTRODUCTION

This unit consists of two blocks. In the first block, we are introduced to the details of terminal I/O operations. We get acquainted with the two major modes of I/O operations, the canonical and noncanonical modes. Then we get started with the normal I/O functions, how to set and control the I/O parameters etc. The next concept is how to interact with slower devices, especially when a number of such devices are present, either by polling or by use of semaphores etc.

We also see what a deadlock is, which incidentally is a result of record locking. We also learn to distinguish between advisory and mandatory locking. Then we get ourselves introduced to streams and the details of stream messages. We also discuss daemons, which are background processes that do several sundry and accounting jobs.

In the second block, we look into the concept of inter-process communication. We start with the concept of pipes, which can be viewed as a connecting channel between two processes through the kernel. A pipe is a half-duplex connection. We study the various functions to open and close pipes and also how we can get a full-duplex pipe out of the structure.

We also study FIFOs, which help us connect even processes that do not share a common ancestor, which is not possible with pipes. The other concept we study is message queues.

Then we move on to semaphores, which help processes to share resources by giving suitable indications. We study the various functions that operate on semaphores. We also study the concepts of shared memory, stream pipes and client-server operations.


BLOCK - VII

Block Introduction

In this block, we shall be looking into the details of terminal I/O operations. The block starts with an introduction to the terminology used in connection with I/O operations. We start with the two major modes of I/O operations, the canonical and noncanonical modes. Then we see the normal I/O functions: getting and setting attributes, baud rate and line control functions and the concept of terminal identification.

Then the concept of nonblocking I/O, which facilitates interaction with slow devices, is introduced. The other key concept is record locking, which allows multiple processes to use the same file simultaneously; we see how to set or adjust locks and other details.

The next concept is the deadlock, which of course is a product of record locking. When two or more processes have locked their own resources and are each waiting for resources held by the others, a deadlock occurs. We also see the concept of implied inheritance and how locks are released. A subtopic is the concept of advisory and mandatory locking.

Then we move on to streams: the details of stream messages and the functions that help us to read from and write to streams. The other concept is I/O multiplexing: how to choose and operate with one out of a list of several devices.

Finally, we look at the concept of daemons: why we need them, how to create them etc.


Contents:

7.1 Introduction

7.2 Getting and setting attributes

7.3 Baud rate functions

7.4 Line control functions

7.5 Terminal identification

7.6 The canonical mode

7.7 Non canonical mode

7.8 Termcap, terminfo and curses

7.9 Non blocking I/O

7.10 Record locking

7.11 Concept of deadlock

7.12 Streams

7.13 I/O multiplexing

7.14 Select function

7.15 Poll function

7.16 Introduction to daemons and their characterization

7.17 How to write daemons and why?

Review Question and Answers


BLOCK VII

Terminal I/O Operations

7.1 Introduction: Handling of I/O is at best a messy operation. A lot of factors need to be taken into account, and even then the result can at best be only satisfactory. Before going into the details of I/O operations, we start by noting the two major modes of terminal input processing.

a) Canonical mode input processing

b) Noncanonical mode input processing

In canonical mode, the terminal input is processed as lines. The terminal driver returns at most one line per read request.

In noncanonical mode, the input characters are not assembled into lines.

The default mode is canonical.

We can think of a terminal device as being controlled by the terminal driver. Each terminal device has an input queue and an output queue.

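[Figure: input and output queues of a terminal device. The device places the next character it reads at one end of the input queue (which has a maximum size), and the process reads the next character from the other end. The process writes the next character to the output queue, from which the next character is transmitted to the device. Input characters are copied to the output queue if echo is enabled.]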

The above logical representation gives the input and output queues for a terminal device. There are several points to consider.

a) There is a link between the input and output queues only if echo is enabled.

b) The maximum size of the input queue may be predefined. Once this size is exceeded, what happens to the next incoming input is implementation dependent. One simple mechanism could be to raise an alarm in some form.

c) There is also a limit on the maximum number of bytes that a canonical line can contain.

d) The output queue is also limited in size. However, there need not be any overflow condition here, because if the buffers get filled up, the kernel can just put the writing process to sleep.

e) There is a tcflush function that helps to flush either the input queue or the output queue.

Most Unix systems implement canonical processing using a module called the terminal line discipline. This is shown below:

[Figure: the user process calls the read and write functions; inside the kernel these pass through the terminal line discipline to the terminal device driver, which communicates with the actual device.]

When the user process wants to do I/O with a device, it calls the appropriate read and write functions. These functions, in turn, call the terminal line discipline, which interacts with the terminal device driver to reach the actual device. Note that the actual devices can be physical devices with varying physical properties that need to be tuned properly before the I/O operations can start. Hence the need for the terminal line discipline.

All the characteristics of the terminal device (those that can be checked and changed) are contained in a structure called termios.

struct termios {
    tcflag_t c_iflag;     /* input flags */
    tcflag_t c_oflag;     /* output flags */
    tcflag_t c_cflag;     /* control flags */
    tcflag_t c_lflag;     /* local flags */
    cc_t     c_cc[NCCS];  /* control characters */
};

The data type tcflag_t can hold a variety of flag values. The nature of each flag field is evident from the comments. Each of these fields can be set to a number of optional values. The actual list of values is too long to be given here, but it is enough to point out that most of the terminal characteristics one would want to control are covered.

We also list the normal functions used in I/O operations and their method of usage in brief.

Function      Description

tcgetattr     Fetches the attributes and returns them in a termios structure
tcsetattr     Sets the attributes as given in the termios structure
cfgetispeed   Get the input speed (in bauds)
cfgetospeed   Get the output speed (in bauds)
cfsetispeed   Set the input speed (in bauds)
cfsetospeed   Set the output speed (in bauds)
tcdrain       Wait for all output to be transmitted
tcflow        Suspend or resume transmission or reception
tcflush       Flush pending input and/or output
tcsendbreak   Send BREAK character
tcgetpgrp     Get foreground process group id
tcsetpgrp     Set foreground process group id

We study some of these functions in more detail.

7.2 Getting and setting attributes:

tcgetattr and tcsetattr

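Typical formats are

int tcgetattr (int filedes, struct termios *termptr);

int tcsetattr (int filedes, int opt, const struct termios *termptr);

Both return 0 if OK and -1 on error. The filedes argument refers to the terminal device. The get function returns the current terminal attributes in the structure pointed to by termptr. The set function sets the attributes from that structure; the opt argument (TCSANOW, TCSADRAIN or TCSAFLUSH) specifies when the new attributes take effect.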
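The usual way of working with these two functions is to fetch the current attributes, change only the flag of interest and write the structure back. The sketch below follows that pattern for the ECHO flag; the helper name set_echo is our own and is not a library function.

```c
#include <termios.h>
#include <unistd.h>

/* Illustrative helper (not part of any library).
 * Turns echo on (enable != 0) or off (enable == 0) for the terminal
 * open on fd. Returns 0 on success, -1 if fd is not a terminal or if
 * either call fails. */
int set_echo(int fd, int enable)
{
    struct termios t;

    if (tcgetattr(fd, &t) < 0)          /* fetch the current attributes */
        return -1;

    if (enable)
        t.c_lflag |= ECHO;
    else
        t.c_lflag &= ~(tcflag_t)ECHO;

    /* TCSANOW: the change takes effect immediately */
    if (tcsetattr(fd, TCSANOW, &t) < 0)
        return -1;
    return 0;
}
```

A call such as set_echo(STDIN_FILENO, 0) would switch off echo on the terminal attached to standard input; the caller is responsible for switching it back on afterwards.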

7.3 Baud rate functions:

Baud rate, as we know, is the speed at which data is transferred to or from the I/O device. It is a historical term, used here to mean bits per second. A number of functions are available to manipulate the baud rates of the connected I/O devices. We shall examine a few of them.


Typical formats are

speed_t cfgetispeed (const struct termios *termptr);

speed_t cfgetospeed (const struct termios *termptr);

Both of them return the baud rate value.

int cfsetispeed (struct termios *termptr, speed_t speed);

int cfsetospeed (struct termios *termptr, speed_t speed);

Both return 0 if OK and -1 on error.

However, note that the set functions only change the baud rate stored in the termios structure; the device itself is not affected until tcsetattr is called with that structure. Moreover, tcsetattr reports success if it was able to carry out any of the requested changes, even if it could not carry out all of them.

It is therefore good practice to use tcgetattr afterwards to check whether the settings have actually taken effect before going ahead with the next steps.

Similarly, it is advisable to use tcgetattr to get the current settings before attempting to set new values.
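Since the cfset functions work only on a termios structure in memory, their effect can be sketched without touching any device. In the example below, the helper name set_speed_9600 is our own choice, and B9600 is one of the standard speed constants from <termios.h>.

```c
#include <termios.h>

/* Illustrative helper (not part of any library).
 * Records 9600 baud as both input and output speed in *t.
 * Only the structure is modified; a later tcsetattr call would be
 * needed to apply the speeds to a real terminal.
 * Returns 0 on success, -1 on error. */
int set_speed_9600(struct termios *t)
{
    if (cfsetispeed(t, B9600) < 0)
        return -1;
    if (cfsetospeed(t, B9600) < 0)
        return -1;
    return 0;
}
```

In a real program the structure would first be filled in by tcgetattr, passed to a helper like this one, and then written back with tcsetattr.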

7.4 Line Control functions:

Some functions are available which provide line control capability for terminal devices.

The typical formats are

int tcdrain (int filedes);

int tcflow (int filedes, int action);

int tcflush (int filedes, int queue);

int tcsendbreak (int filedes, int duration);

All of them return 0 for successful operation and -1 for error.

Now a brief description of each of them:

a) The tcdrain function waits for all output to be transmitted.

b) The tcflow function provides us several options to control the flow of data. The actual action to be performed is given by the action argument. The action can be any one of the following values:

TCOOFF   Suspends the output.
TCOON    Restarts the previously suspended output.
TCIOFF   Transmits the STOP character, which has the effect of asking the terminal device to stop transmission.
TCION    Transmits the START character, which has the effect of asking the terminal device to restart transmission.

c) The tcflush function lets us flush the input buffer or the output buffer. That is, we may want to start afresh by discarding all the data that is lying in the input buffer, yet to be read by the program, or by throwing away all output data that has been written by the program but not yet transmitted to the device. The reason why we may want to do this is not important here, except that tcflush can accomplish it when a suitable value is passed in the queue argument.

The queue argument can take any one of the following values:

TCIFLUSH    To flush the input queue
TCOFLUSH    To flush the output queue
TCIOFLUSH   To flush both input and output queues

d) The tcsendbreak function transmits a continuous stream of 0 bits for a period controlled by the duration argument. A duration of 0 transmits for between 0.25 and 0.5 seconds; for non-zero values the actual length of the transmission depends on the implementation and need not match the value exactly.


7.5 Terminal identification:

The typical format is

char *ctermid (char *ptr);

If ptr is non-null, it is taken to point to an array (of at least L_ctermid bytes) in which the name of the controlling terminal is stored, and ptr itself is returned. If ptr is null, the function returns a pointer to a static array that contains the name.

There are two other similar functions.

int isatty (int filedes);

This returns 1 (true) if filedes refers to a terminal device, and 0 (false) if it does not.

The other function is

char *ttyname (int filedes);

This returns a pointer to the path name of the terminal open on filedes, or NULL on error.
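A short sketch of how isatty and ttyname might be combined is given below. The helper name describe_fd and the wording of the messages are our own.

```c
#include <stdio.h>
#include <unistd.h>

/* Illustrative helper (not part of any library).
 * Writes a one-line description of fd into buf (at most len bytes).
 * Returns 1 if fd refers to a terminal and 0 otherwise. */
int describe_fd(int fd, char *buf, size_t len)
{
    if (!isatty(fd)) {
        snprintf(buf, len, "fd %d: not a terminal", fd);
        return 0;
    }

    const char *name = ttyname(fd);     /* NULL if the name is unavailable */
    snprintf(buf, len, "fd %d: terminal %s", fd, name ? name : "(unknown)");
    return 1;
}
```

Calling describe_fd for descriptors 0, 1 and 2 shows quickly whether the standard streams of a program are attached to a terminal or have been redirected.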

7.6 The canonical mode:

Having seen some of the functions, we would like to look at the canonical mode as

a whole.

Implementing and operating in canonical mode is fairly straightforward. Once a read is issued, the terminal driver normally returns when a complete line has been entered. More precisely, a read returns under any of the following conditions:

i) The read returns when the requested number of bytes have been read.

ii) The read returns when a line delimiter is encountered.

iii) The read returns if a signal is caught and the function does not restart automatically.

To make the concept a bit more clear, we write a small function getpass, that reads a password from the user at the terminal (refer to text book (i) for the program).

#include <signal.h>
#include <stdio.h>
#include <termios.h>

#define MAX_PASS_LEN 8   /* maximum number of characters in the password */

char *getpass(const char *prompt)
{
    static char buf[MAX_PASS_LEN + 1];  /* one extra byte for the null */
    char *ptr;
    sigset_t sig, sigsave;
    struct termios term, termsave;
    FILE *fp;
    int c;

    /* open the controlling terminal for reading and writing */
    if ((fp = fopen(ctermid(NULL), "r+")) == NULL)
        return NULL;
    setbuf(fp, NULL);                   /* unbuffered */

    sigemptyset(&sig);                  /* create empty set */
    /* block SIGINT and SIGTSTP, save signal mask */
    sigaddset(&sig, SIGINT);
    sigaddset(&sig, SIGTSTP);
    sigprocmask(SIG_BLOCK, &sig, &sigsave);

    tcgetattr(fileno(fp), &termsave);   /* save tty state */
    term = termsave;                    /* structure copy */
    /* clear the echo flags so that the password is not displayed */
    term.c_lflag &= ~(ECHO | ECHOE | ECHOK | ECHONL);
    tcsetattr(fileno(fp), TCSAFLUSH, &term);

    fputs(prompt, fp);
    ptr = buf;                          /* point to the buffer */
    while ((c = getc(fp)) != EOF && c != '\n') {
        /* store the character only while the maximum length is not reached */
        if (ptr < &buf[MAX_PASS_LEN])
            *ptr++ = c;
    }
    *ptr = 0;                           /* null terminate */
    putc('\n', fp);                     /* echo a new line */

    tcsetattr(fileno(fp), TCSAFLUSH, &termsave);  /* restore tty state */
    sigprocmask(SIG_SETMASK, &sigsave, NULL);     /* restore the signal mask */
    fclose(fp);
    return buf;
}

The comments made at appropriate places in the program are self explanatory.

7.7 Non canonical mode:

This mode, as we have seen, does not look at the input data in terms of lines. It is selected by turning off the ICANON flag in the c_lflag field of the termios structure. Apart from the fact that the data is not processed in terms of lines, the other difference from canonical mode is that certain special characters that indicate erase, EOF, EOL etc. lose their meaning and hence are not processed.


In canonical mode, all that the system has to do is to check when a line is complete and return it. But in noncanonical mode the system has to decide when to return data to the caller.

Normally, data is returned either when a prespecified amount of data has arrived or when a prespecified time has elapsed, irrespective of the amount of data accumulated. There are two entries in the c_cc array of the termios structure for this purpose, MIN and TIME (indexed by VMIN and VTIME). These two can be used to set the behaviour in noncanonical mode. TIME specifies the number of tenths of a second to wait for data to arrive. MIN specifies the minimum number of bytes to wait for before a read returns. These two, as we can see, can be set independently, but their operations are interrelated. Let us look at the combinations.

a) MIN = 0, TIME = 0

If any data is available, read returns it immediately. If no data is available, read returns 0 immediately.

b) MIN = 0, TIME > 0

The non-zero TIME specifies a read timer that is started when read is called. read returns as soon as a single byte is received or when the timer expires; if the timer expires with no data available, read returns 0.

c) MIN > 0, TIME = 0

read returns only when the specified minimum number of bytes has been received. This can be dangerous, as in some cases it may mean waiting for very long periods.

d) MIN > 0, TIME > 0

TIME acts as a timer between bytes, started only when the first byte is received. If MIN bytes are received before the timer expires, read returns MIN bytes. If the timer expires before MIN bytes are received, whatever bytes are available are returned.
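The settings described above can be sketched as a small helper that prepares a termios structure for noncanonical input. The name make_noncanonical is our own; the function changes only the structure in memory, and we assume the caller applies it with tcsetattr.

```c
#include <termios.h>

/* Illustrative helper (not part of any library).
 * Prepares *t for noncanonical input: clears ICANON and stores the
 * MIN and TIME values in the c_cc array. min is a byte count and
 * tenths is a time in units of 0.1 second. All other flags in *t are
 * left untouched. */
void make_noncanonical(struct termios *t, cc_t min, cc_t tenths)
{
    t->c_lflag &= ~(tcflag_t)ICANON;    /* no line assembly */
    t->c_cc[VMIN]  = min;
    t->c_cc[VTIME] = tenths;
}
```

For instance, make_noncanonical(&t, 1, 0) corresponds to case c) above with MIN = 1: each read waits until at least one byte has arrived.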


7.8 Termcap, terminfo and curses:

These are facilities which allow processes to do small manipulations with the terminals, without depending on the OS to do it.

termcap stands for "terminal capability". It refers to a file containing descriptions of various terminals: the features that they support and how to adjust the terminal parameters. These descriptions are used by programs such as the vi editor.

However, termcap has a serious drawback. As more and more terminals are added, scanning the termcap file for the details of a particular terminal itself becomes slow. To overcome this problem, terminfo was created, which contains a compiled version of the textual descriptions, so that an entry can be located much faster.

While both termcap and terminfo tell us how to change the terminal capabilities, neither of them is able to do so on its own. For this there is the curses library, which contains several functions to perform the various operations.

7.9 Non blocking I/O:

In one of the previous sections, we discussed blocking of I/O. The "slow" system calls are those that can block for long periods (forever, for that matter).

For example, reading from a file in which data is not yet present, or reading and writing files that have records locked, can block the I/O operation indefinitely.

Nonblocking I/O allows us to issue an I/O call and be sure that, if the operation cannot be completed for whatever reason, control returns immediately, with an error indicating that the operation would have blocked.

Now, how do we specify non blocking I/O for a given descriptor?

a) If we are using open to get the descriptor, we can specify the O_NONBLOCK flag.

b) If the descriptor is already open, we can call fcntl to switch on the O_NONBLOCK flag.

7.10 Record locking:

This is a very important I/O concept on which a lot of effort and time is spent. The problem is very simple. We know Unix allows sharing of files. More than one process (or user) can access the same file (or copies of the same file in some cases) simultaneously. What happens if two (or three) of them try to modify the contents of a file? Or if one of them is writing into it and another is trying to read from it? Since we can never be sure about the sequence in which the operations are done (whether reading precedes writing or comes after it), the output that the processes produce becomes indeterminate.

One simple way to avoid this problem is to ensure that when one of the processes is writing into the file, no other process can access the file (in some cases reading can be allowed, but that again depends on the application). All that the Unix system does is to lock the file (or only the record concerned, so that other records of the same file continue to be accessible).

We can define record locking as the mechanism wherein a given process can prevent other processes from modifying a region of the file. What is locked in most cases is a “range” of bytes in a file.

7.10.1. Some functions for record locking

The typical format of the function is

int fcntl (int filedes, int cmd, ... /* struct flock *flockptr */);

For record locking, cmd can be any one of F_GETLK, F_SETLK or F_SETLKW.

The last argument flockptr is a pointer to an flock structure.

The flock structure itself can look like this

struct flock {
    short l_type;   /* F_RDLCK, F_WRLCK, or F_UNLCK */
    off_t l_start;  /* offset in bytes, relative to l_whence */
    short l_whence; /* SEEK_SET, SEEK_CUR or SEEK_END */
    off_t l_len;    /* length in bytes */
    pid_t l_pid;    /* returned with F_GETLK */
};

The first member describes the type of lock desired: a shared read lock (F_RDLCK), an exclusive write lock (F_WRLCK), or unlocking the region (F_UNLCK).

The starting byte offset of the region being locked or unlocked is given by l_start and l_whence.

The size of the region that is being locked or unlocked is given by l_len.

Now, a few words about shared and exclusive locks. Common sense tells us that no harm is done if several processes share a read lock over a region, since they only read the bytes of the region independently; but only one process at a time should hold an exclusive write lock that lets it write into the region.

Further, when a write operation is in progress in any region of a file, no read operation should be taking place in the same region. So a read lock and a write lock cannot be held on the same byte at the same time.

Needless to say, to obtain a read lock the descriptor must be open for reading, and to obtain a write lock it must be open for writing.

Now, we are in a position to define the three different commands (for the cmd argument) of the fcntl function.

a) F_GETLK: checks whether the lock described by the structure pointed to by flockptr would be blocked by a lock already held by some other process. If such a lock exists, information about it is returned by overwriting the structure pointed to by flockptr. If there is no lock that would prevent our lock from being created, the structure is left unchanged except that l_type is set to F_UNLCK.

b) F_SETLK: sets the lock described by flockptr. If the lock cannot be granted because another process holds a conflicting lock, fcntl returns immediately with an error (errno is EACCES or EAGAIN).

c) F_SETLKW: this is the same as F_SETLK except for the W, meaning wait. If the requested lock cannot be granted because another process has already locked the region, the calling process is put to sleep. It wakes up either when the lock becomes available or when the wait is interrupted by a signal.

Note that testing for a lock with F_GETLK and then trying to obtain it with F_SETLK or F_SETLKW are two different operations. Together they are not atomic, so there is no guarantee that some other process will not obtain the lock in the intermediate time lag.

When setting or releasing locks, the system combines or splits adjacent areas as required.

7.11 Concept of deadlock:

A deadlock is said to occur when two processes are each waiting for a resource that is held (and locked) by the other. To elaborate, suppose process 1 needs resources A and B. Resource A is available and process 1 locks it. Resource B is not presently available, and process 1 can proceed only when resource B also becomes available.

Similarly, let process 2 need resource A, which is held by process 1 as above. It already has with it resource B, which it has locked.

Now process 1 does not proceed with its operation until resource B becomes available. Resource B will never become available, because it is with process 2, which will release it only when resource A is made available to process 2. But this cannot happen, because process 1 will not release A till process 2 releases B.

So, for an infinite time, A and B are held locked by process 1 and process 2 respectively.

Now, if a deadlock is detected, the kernel has to choose one of the processes to give way. With record locks, this is done by making the fcntl call of one process return an error (EDEADLK) instead of putting it to sleep. For example, in the above case process 2 may receive the error and release resource B, so that process 1 can complete its job. Once process 1 releases A and B, process 2 can complete. However, the main difficulty in tackling deadlocks lies not so much in breaking the deadlock as in detecting whether a deadlock has occurred and, if so, why it has occurred.

7.11.1 Implied inheritance and release of locks:

We shall look into certain other implications of record locking. As we have seen, locks are associated with a process and a file. When a process terminates, all its locks are released. In addition, whenever a descriptor is closed, any locks held by that process on the file referenced by the descriptor are also released.

Suppose we have executed the following steps

fd1 = open (pathname, ...);
read_lock (fd1, ...);
fd2 = dup (fd1); /* duplicate fd1 into fd2 */
close (fd2);

After close (fd2), the lock that was obtained on fd1 is released.

We can look at another example

fd1 = open (pathname, ...);
read_lock (fd1, ...);
fd2 = open (pathname, ...);
close (fd2);

Here too, closing fd2 releases the lock obtained on fd1, because both descriptors refer to the same file.

The second aspect to note is that locks are not inherited by the child created by a fork. If a process has secured a lock and then calls fork, the child is considered to be an entirely different process and hence has no claim to any of the locks held by its parent on any descriptors.

In fact this follows from the definition of locks and common sense. Look at the basic logic behind locking: more than one process should not be able to write to the locked sections at the same time. If the children of a process were allowed to inherit its locks, then over a period of time more than one process would be able to operate on the same region, thereby defeating the purpose of locking.

However, one exception is that locks may be inherited across an exec, because in that case the new program runs in the same process, and effectively only one process is still able to access the records.

7.11.2 Advisory and Mandatory locks:

Roughly, mandatory locking can be described as locking that is enforced by the kernel. When such locking is in effect, every calling process has its open, read and write actions verified by the kernel to ensure that it is not violating the locking restrictions. If a process tries to read from or write to a region locked by another process, its action may be blocked.

On the other hand, suppose we have several routines which frequently access, say, a database. If we know that only these routines access the database, and all of them use record locking in a consistent way, then the processes using them form what can be called a set of “cooperating processes”. The kernel does not check every read or write for validity. The routines themselves are so designed that they request the appropriate lock before touching a region and do not overreach their limits. Such locking is termed advisory.

The advantage of advisory locking is obvious: there is no checking overhead on every read and write. But it is to be borne in mind that any outside (“rogue”) process that is not part of the pool may still read or write wherever the file permissions allow.

However, it may be noted that even mandatory record locking can be circumvented by intelligent programmers. Also, malicious users (who may themselves be genuine users) can simply lock publicly readable files, so that they prevent anyone else from making use of the files.

7.12 Streams

Streams are provided as a general way to interface communication drivers into the kernel. A stream provides a full duplex path between the user process and a device driver. Streams can also be used with pseudo device drivers. A simple stream can be visualized as below.

[Figure: a simple stream. User process, then stream head (system call interface), then device driver (or pseudo device driver).]

Beneath the stream head, any number of processing modules can be included in the stream. In fact we can define downstream and upstream sides for a stream. The data that we write into a stream goes downstream, whereas the data read by the device driver is sent upstream.

Stream modules are similar to device drivers in the sense that they execute as a part of the kernel and are normally link edited into the kernel when the kernel is built.

We can access a stream with the normal functions open, close, read, write and ioctl.

In addition, several versions of Unix contribute their own set of functions to operate on streams. We see some of these functions as we go along.

7.12.1. Stream messages:

All input and output operations on streams are based on messages. (In fact all stream devices are character special files.) The stream head and the user process exchange messages using read, write, ioctl, getmsg, getpmsg, putmsg and putpmsg.

Any message between the user process and the stream head consists of the following:

i) A message type

ii) (Optional) Control information.

iii) (Optional) data.

Of course, different types of messages can accommodate different types of information, which we see in detail.

The control information and the data are each specified by a structure of the following type:

struct strbuf {
    int maxlen; /* size of buffer */
    int len;    /* no. of bytes currently in buffer */
    char *buf;  /* pointer to buffer */
};

When we send a message, len gives the number of bytes in the buffer. When we receive a message, maxlen gives the size of the buffer, so that the kernel won't overflow it, and len is set by the kernel to the number of bytes stored.

A zero length message is also acceptable, and a len field of -1 indicates that there is no control or data information.

7.12.2: We now look into some of the functions that will be helpful to us in operating with streams.

A stream message is written into the stream using either the putmsg or the putpmsg function.

Typical formats are

int putmsg (int filedes, const struct strbuf *ctlptr, const struct strbuf *dataptr, int flag);

and

int putpmsg (int filedes, const struct strbuf *ctlptr, const struct strbuf *dataptr, int band, int flag);

Both of them return 0 if successful, -1 if not.

The first three arguments (the file descriptor, the pointer to the control buffer and the pointer to the data buffer) are fairly evident.

The only difference between putmsg and putpmsg is that the latter allows us to specify a priority band, through the argument band.

Every message has a queuing priority. The three priority classes are:

i) high priority messages (highest priority)

ii) priority band messages (medium priority)

iii) ordinary messages (lowest priority).

Ordinary messages (lowest priority) have a band value of 0. Priority band messages have bands from 1 to 255; the higher the band number, the higher the priority. The messages in a queue are arranged in the order of their priorities.

7.12.3 Reading data from a stream:

Just as we write into a stream using write, putmsg and putpmsg, we use read, getmsg and getpmsg to read data from a stream.

Typical formats are given below:

int getmsg (int filedes, struct strbuf *ctlptr, struct strbuf *dataptr, int *flagptr);

and

int getpmsg (int filedes, struct strbuf *ctlptr, struct strbuf *dataptr, int *bandptr, int *flagptr);

Both return a non negative value on success and -1 on error.

The details of these two functions are similar to those of putmsg and putpmsg. Only one difference can be noticed: instead of the band and flag values, we pass the pointers bandptr and flagptr. Needless to say, the integers they point to are to be set properly before the functions are called.

By setting the integer pointed to by flagptr to 0, we get the next message waiting in the queue at the stream head.

If we want to get only high priority messages, we should set the integer pointed to by flagptr to RS_HIPRI.

7.13 I/O multiplexing:

When we want to read from one descriptor and write into another, we can use blocking I/O in a loop.

while ((n = read (STDIN_FILENO, buf, BUFSIZ)) > 0)
    if (write (STDOUT_FILENO, buf, n) != n)
        err_sys ("write error");

In this case, we simply block on the read over and over, until no more input is left.

Suppose we have to read from two descriptors. We cannot use a blocking read on either of them, since while we are blocked on one descriptor, the other descriptor may have data ready.

One method to take care of such a situation is to call fork, so that two processes take care of the two I/O operations (each taking one half of the work).

In such a case, each of the two processes can block on one of the descriptors. However, there may be a small problem during termination.

If the child terminates first, the parent is notified of this and it can also terminate. On the other hand, if the parent terminates first, the child is not told about it and may not be able to continue usefully. In such cases, it is desirable that the parent sends a suitable signal to enable the child also to stop.

There is a second way of I/O multiplexing. In this case, we avoid blocking. We set both descriptors to non blocking and issue a read on the first descriptor. If data is available, it is processed. If no data is available, the call returns immediately, since it is non blocking. The read is then issued on the second descriptor. After some time gap, we again try reading the first descriptor and subsequently the second descriptor, and so on. This type of operation is called polling. Note that there may be more than two descriptors also; we keep asking each descriptor 1, 2, 3, ... in the same order for data and then allow a time gap before repeating the process. This method is useful when most of our reads find data. But imagine a case where there is a high probability that our polling will not result in data being read, as each descriptor receives data only sparingly. Then we will be wasting a lot of CPU time in polling.

A third method is asynchronous I/O. In this case, the process neither goes round searching for data as in polling, nor blocks on a descriptor. Instead, it asks the kernel to notify it with a signal when a descriptor has data available for I/O. One problem that may arise is that when the process receives the signal, it knows that data is available on one of the descriptors, but it does not know on which. So either there should be a mechanism by which the ready descriptor is identified, or the process should poll the descriptors once the data ready signal arrives. The first option makes programming (at the system level) a bit more complex, while the second option can be time consuming when a large number of descriptors need to be checked to ascertain which of them is ready with the data. With these principles in mind, we look at some of the functions that are useful in I/O multiplexing.

7.14 Select function:

Typical format is

int select (int maxfdp1, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *tvptr);

It returns a count of ready descriptors on success, 0 on time out, -1 on error.

The arguments readfds, writefds and exceptfds are pointers to descriptor sets. They specify which descriptors we want tested for being readable, which for being writable, and which for an exception condition. Each descriptor set is stored in an fd_set data type. With this data type we can only allocate a variable of the type, assign a variable of this type to another variable of the same type, or use the macros below to operate on the set.

The normally used macros are

FD_ZERO (fd_set *fdset);         /* clear all bits in fdset */
FD_SET (int fd, fd_set *fdset);  /* turn on the bit for fd */
FD_CLR (int fd, fd_set *fdset);  /* turn off the bit for fd */
FD_ISSET (int fd, fd_set *fdset); /* test bit for fd */

The argument maxfdp1 stands for “max fd plus 1”. We take the number of the highest descriptor we are interested in, add 1 to it and store it in the first argument. (Since descriptor numbers start from 0, if n is the largest numbered descriptor we are interested in, then the descriptors to be examined are 0, 1, 2, ..., n, which makes n+1 descriptors. That is the reason why we add 1 to the highest descriptor number.)

Now, let us look at the last argument. This specifies how long we want to wait.

struct timeval {
    long tv_sec;  /* seconds */
    long tv_usec; /* and microseconds */
};

So we specify the number of seconds and microseconds we are willing to wait before select returns.

There are three cases for this argument:

a) tvptr == NULL: wait indefinitely. A return is made only when one of the specified descriptors is ready or when a signal is caught.

b) tvptr->tv_sec == 0 and tvptr->tv_usec == 0: both seconds and microseconds are zero, indicating that once all the descriptors are tested, a return is made immediately. This means there is no blocking in the select function.

c) tvptr->tv_sec != 0 or tvptr->tv_usec != 0: wait for the specified period. A return is made when one of the specified descriptors is ready or when the time expires.

A select call returns either a positive value indicating the number of descriptors that are ready, or 0 indicating that no descriptor is ready, or -1 to indicate that an error has occurred.

a) If a descriptor in readfds is ready, a read from it will not block.

b) If a descriptor in writefds is ready, a write to it will not block.

c) A descriptor in the exception set, if ready, indicates that an exception condition is pending on it. It may mean the arrival of out of band data or a similar condition, which needs to be handled.

7.15 Poll function:

This can be considered to be similar to select in the sense that it allows the programmer to wait on several descriptors, but its interface is slightly different.

Typical format is

int poll (struct pollfd fdarray[ ], unsigned long nfds, int timeout);

It returns a count of ready descriptors, 0 on time out and -1 on error.

Instead of building a set of descriptors for each condition (readable, writable or exception) as in the case of select, poll operates on an array of structures, each element specifying the conditions of interest for one descriptor.

The typical structure of pollfd is as follows:

struct pollfd {
    int fd;        /* file descriptor number */
    short events;  /* events to be checked for */
    short revents; /* events that have occurred on fd */
};

The number of elements in the array appears in nfds.

The events member of each element of the array is to be suitably set by the caller. On return, the revents member shows which events have occurred on that descriptor.

The argument timeout specifies how long one is to wait. It can be

a) timeout == INFTIM: wait for ever (until a descriptor is ready or a signal is caught). On systems that do not define INFTIM, the value -1 is used.

b) timeout == 0: no waiting time.

c) timeout > 0: the waiting time is specified in milliseconds.

These cases are similar to those of the select function.

7.16 Introduction to daemons and their characterization:

Daemons are processes that live for long periods of time. They normally start when the system is bootstrapped and continue throughout the operation of the system, terminating only during shut down. They are designed to run in the background. Their main reason for existence is to perform a lot of house keeping, accounting and other day-to-day jobs so that the other processes can work comfortably.

Further, daemons typically run with super user privileges (user id = 0), and none of them has a controlling terminal, meaning all of them run in the background. The parent of all these daemons is the init process.

7.17 How to write daemons and why?

We have already noted that daemons are written and executed to perform several house keeping activities in the background. However, since there will be a number of such daemons, all working in the background, it becomes essential to ensure that there are no unwanted side effects. By side effects we mean interactions between the daemons, or between a daemon and another process, that have not been foreseen. Otherwise, we may end up tackling situations that are not only undesirable, but also quite difficult to trace in the first place.

Now, what are the ground rules that one is expected to follow while coding daemons?

a) Call fork and let the parent exit. If the daemon was started as a shell command, the exit of the parent makes the shell think that the command is done. It also ensures that the child, which continues as the daemon, is not a process group leader, since it has a new process id but has inherited the process group id of its parent. This is needed for the next step.

b) Call setsid to create a new session. The process becomes the leader of the new session and has no controlling terminal.

c) Change the current working directory to the root directory.

d) Set the file mode creation mask to 0. This ensures full flexibility of operation to the daemon process while it goes about creating files.

Since the daemon does not have a controlling terminal, error messages from the daemon are difficult to handle. Hence, normally a central error logging facility is required. Most Unix systems provide a log device driver to facilitate the same.

Block Summary

We began the discussion with the concept of canonical and non canonical modes of input processing, also noting that most Unix systems use canonical mode. We noted that the characteristics of a terminal device are contained in the structure termios, which can be accessed. We discussed how to get and set these attributes, as well as the baud rate functions and line control functions.

We also saw how, in the non canonical mode, we can set the minimum byte count and the time for data input.

The next major topic was record locking to ensure data integrity, but we also saw how this can lead to deadlocks. Of course we saw some functions to help us in record locking. We saw the difference between advisory and mandatory locks also.

The next concept was about streams and the functions that help us to operate with streams.

The next concept was I/O multiplexing, with the study of the functions select and poll and the need for polling.

The last item was about daemons, which are background processes that do sundry jobs. We looked into their need and certain ground rules as to how to write them.

Review Questions

1. What are the two modes of terminal I/O operations?

2. Give the structure of termios.

3. What functions are available to set and get the terminal attributes? What is their format?

4. Name the line control functions.

5. What facility is available to set the amount of data processed in the non canonical mode?

6. Why is record locking needed?

7. Distinguish between advisory and mandatory locks.

8. What does the putmsg function do?

9. Name the two functions available for reading data from a stream.

10. What does the poll function do?

Answers

1. They are canonical mode and non canonical mode.

2. The structure is

struct termios {
    tcflag_t c_iflag;  /* input flags */
    tcflag_t c_oflag;  /* output flags */
    tcflag_t c_cflag;  /* control flags */
    tcflag_t c_lflag;  /* local flags */
    cc_t c_cc[NCCS];   /* control characters */
};

3. int tcgetattr (int filedes, struct termios *termptr) to get the attributes

int tcsetattr (int filedes, int opt, const struct termios *termptr) to set the attributes

4. They are tcdrain, tcflow, tcflush and tcsendbreak.

5. The two values MIN and TIME, which specify the minimum number of bytes and the time in units of 0.1 second respectively. When both are non zero, the read returns when the minimum number of bytes has been received or the time expires, whichever is earlier.

6. It is needed to ensure that multiple read and write operations by different processes on a single region of a file do not produce jumbled data.

7. Advisory locks are self imposed by the cooperating processes. Mandatory locks are enforced by the kernel, which checks every open, read and write against the locking restrictions.

8. The putmsg function puts a message into a stream.

9. They are getmsg and getpmsg.

10. It waits on a set of descriptors and returns the number of ready descriptors.

Block – VIII

Block Introduction

In this block, we deal with the concepts of interprocess communication. Communication between different processes can be taken up in a number of ways. One important mechanism is by pipes. A pipe can be thought of as a half duplex connection between two processes, through which data can pass. Pipes are created between processes and, to get the full duplex communication that one would often need, two such pipes are created between a pair of processes and the appropriate ends are closed. This concept, along with the relevant functions, will be studied in the block.

The other concept we study is about coprocesses, which can be thought of as an alternative to pipes. We also study FIFOs, which help us overcome some of the restrictions posed by pipes. We also study message queues, which help transfer messages, especially between the clients and the server.

Then we get an idea of semaphores, which indicate the availability or otherwise of resources. We see how to implement and operate semaphores and also the various functions for the same. We then move on to the concept of shared memory and how to operate on it, by setting suitable limits for the individual processes. We also see the concept of stream pipes and the basic client server operations.

Contents

8.1 Introduction

8.2 Pipes

8.3 Popen and Pclose function

8.4 Concept of Coprocesses

8.5 FIFOs

8.6 Message Queues

8.7 Semaphores

8.8 The concept of shared memory

8.9 Client Server Properties

8.10 Stream pipes

8.11 Passing file descriptors

Inter Process communication

8.1 Introduction: Processes have an enormous need to share data and information. So far, we have seen only one way of data exchange between processes: by explicitly passing open files across a fork or an exec, or through the file system. However, there are other techniques available that facilitate communication across processes, termed IPC (interprocess communication) by programmers. We see some of them in the next few sections.

8.2 Pipes:

The concept of pipes is a very important and time tested method of IPC in Unix systems. They however have two limitations:

a) Data flows in only one direction; they are half duplex.

b) They can be used only between processes that have a common ancestor, i.e. the two processes must share a common fork parent at some level.

Now, to begin with, what is a pipe and how do we create one?

A pipe can be viewed as a connection between two processes, through which data can pass. But the fact is that this connection has to pass through the kernel. So, we look at the process pictorially as below:

[Figure: a user process holding fd[0] and fd[1], the two ends of a pipe.]

A pipe can be created by calling the pipe function. The typical format is

int pipe (int filedes[2]);

It returns 0 if successful, -1 on error.

Note that two file descriptors are returned through the array filedes[ ]. The first of these, filedes[0], is open for reading and the second, filedes[1], is open for writing, i.e. the output of filedes[1] is the input of filedes[0]. This is also shown pictorially in the above figure.

Normally a call to pipe is followed by a call to fork, thereby creating an IPC channel from the parent to the child.

Once a pipe is created, the direction of data transfer is to be reflected in the pipe. If the parent wants to send data to the child, the parent closes the read end of the pipe (fd[0]) and the child closes the write end (fd[1]). Similarly, if the child wants to send data to the parent, the child closes fd[0] and the parent closes fd[1].

The following figure gives the description

[Figure: parent and child after the fork, each holding fd[0] and fd[1] of the same pipe in the kernel.]

Note that by closing a corresponding pair of fd[0] and fd[1], we can ensure a unidirectional data transfer.

The same concept of sending data from the parent to the child over a pipe is illustrated in the following program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MAXLINE 4096

int
main (void)
{
    int n, fd[2];
    pid_t pid;
    char line[MAXLINE];

    if (pipe (fd) < 0) {
        perror ("pipe error");
        exit (1);
    }
    if ((pid = fork ()) < 0) {
        perror ("fork error");
        exit (1);
    } else if (pid > 0) { /* parent: writes into the pipe */
        close (fd[0]);
        write (fd[1], "sample closing\n", 15);
    } else { /* child: reads from the pipe */
        close (fd[1]);
        n = read (fd[0], line, MAXLINE);
        write (STDOUT_FILENO, line, n);
    }
    exit (0);
}

8.3 Popen and Pclose function:

Creating a pipe, forking a child, closing the unused ends of the pipe, executing a shell to run a command and waiting for the command to terminate: this is the normal sequence of operations when a process wants to run another program and exchange data with it.

The Unix standard I/O library provides two functions to do all this work. They are the popen and pclose functions.

Typical formats are

FILE *popen (const char *cmdstring, const char *type);

Returns a file pointer if successful, else a NULL pointer.

Similarly, the other function is

int pclose (FILE *fp);

Returns the termination status of cmdstring, or -1 on error.

The function popen does a fork and an exec to execute cmdstring, and returns a file pointer. If the type argument is “r”, the file pointer is connected to the standard output of cmdstring, so the parent reads what the command writes.

[Figure: the parent's fp connected to the standard output of the child (cmdstring).]

On the other hand, if the type is “w”, the file pointer is connected to the standard input of cmdstring, so what the parent writes becomes the input of the command.

The pclose function closes the standard I/O stream, waits for the command to terminate and returns the termination status.

Just to familiarize ourselves with the concepts, we write a small program which copies a file to a pager program (Ref. Text No:1)

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

#define MAXLINE 4096
#define PAGER "${PAGER:-more}"
/* use the environment variable PAGER if it is set, else more */

int
main (int argc, char *argv[])
{
    char line[MAXLINE];
    FILE *fpin, *fpout;

    if (argc != 2) {
        fprintf (stderr, "usage: a.out <pathname>\n");
        exit (1);
    }
    if ((fpin = fopen (argv[1], "r")) == NULL) {
        perror (argv[1]);
        exit (1);
    }
    if ((fpout = popen (PAGER, "w")) == NULL) {
        perror ("popen error");
        exit (1);
    }
    /* copy argv[1] to pager */
    while (fgets (line, MAXLINE, fpin) != NULL) {
        if (fputs (line, fpout) == EOF) {
            perror ("fputs error to pipe");
            exit (1);
        }
    }
    if (ferror (fpin)) {
        perror ("fgets error");
        exit (1);
    }
    if (pclose (fpout) == -1) {
        perror ("pclose error");
        exit (1);
    }
    exit (0);
}

8.4 Concept of Coprocesses:

Unix has the concept of a filter. In the simplest case, a filter is a program that reads from the standard input and writes to the standard output. Filters are normally connected in shell pipelines. If, however, the same program both generates the filter's input and reads the filter's output, the filter is termed a coprocess.

A coprocess normally runs in the background from a shell, and its standard input and standard output are connected to another program by means of pipes.

Looked at another way, popen gives us a one-way pipe, either to the standard input of another process or from its standard output. A coprocess can be viewed as having a two-way pipe, or rather two one-way pipes in opposite directions between the same pair of processes: one to its standard input and one from its standard output.

The concept can be viewed as per the following sketch

[Figure: the parent writes to pipe 1 (fd1[1]), which feeds the standard input of the child (coprocess); the standard output of the child feeds pipe 2, which the parent reads (fd2[0]).]

The following is a simple coprocess (we call it add2) that reads two numbers from its standard input, computes their sum and writes it to the standard output.

int
main(void)
{
    int  n, int1, int2;
    char line[MAXLINE];

    /* leave room for the terminating null */
    while ((n = read(STDIN_FILENO, line, MAXLINE - 1)) > 0) {
        line[n] = 0;                        /* terminate with null */
        if (sscanf(line, "%d%d", &int1, &int2) == 2) {
            sprintf(line, "%d\n", int1 + int2);
            n = strlen(line);
            if (write(STDOUT_FILENO, line, n) != n)
                err_sys("write error");
        } else {
            if (write(STDOUT_FILENO, "invalid args\n", 13) != 13)
                err_sys("write error");
        }
    }
    exit(0);
}

The above program acts as a filter that adds two numbers.

We now write one more program that uses the add2 coprocess. It reads two numbers from its standard input, passes them to the coprocess, and writes the value returned by the coprocess to its standard output.

#include <signal.h>

static void sig_pipe(int);      /* signal handler */

int
main(void)
{
    int   n, fd1[2], fd2[2];
    pid_t pid;
    char  line[MAXLINE];

    if (signal(SIGPIPE, sig_pipe) == SIG_ERR)
        err_sys("signal error");
    if (pipe(fd1) < 0 || pipe(fd2) < 0)
        err_sys("pipe error");
    if ((pid = fork()) < 0)
        err_sys("fork error");
    else if (pid > 0) {                     /* parent */
        close(fd1[0]);                      /* parent only writes to pipe 1 */
        close(fd2[1]);                      /* and only reads from pipe 2 */
        while (fgets(line, MAXLINE, stdin) != NULL) {
            n = strlen(line);
            if (write(fd1[1], line, n) != n)
                err_sys("write error to pipe");
            if ((n = read(fd2[0], line, MAXLINE - 1)) < 0)
                err_sys("read error from pipe");
            if (n == 0) {
                err_msg("child closed pipe");
                break;
            }
            line[n] = 0;                    /* null terminate */
            if (fputs(line, stdout) == EOF)
                err_sys("fputs error");
        }
        if (ferror(stdin))
            err_sys("fgets error on stdin");
        exit(0);
    } else {                                /* child */
        close(fd1[1]);
        close(fd2[0]);
        if (fd1[0] != STDIN_FILENO) {
            if (dup2(fd1[0], STDIN_FILENO) != STDIN_FILENO)
                err_sys("dup2 error to stdin");
            close(fd1[0]);
        }
        if (fd2[1] != STDOUT_FILENO) {
            if (dup2(fd2[1], STDOUT_FILENO) != STDOUT_FILENO)
                err_sys("dup2 error to stdout");
            close(fd2[1]);
        }
        if (execl("./add2", "add2", (char *) 0) < 0)
            err_sys("execl error");
    }
}

static void
sig_pipe(int signo)
{
    printf("SIGPIPE caught\n");
    exit(1);
}

8.5 FIFOs:

FIFOs are sometimes called named pipes. The pipes we have seen so far can be used only between related processes that have a common ancestor. With FIFOs, however, unrelated processes can also exchange data.

Creating a FIFO is similar to creating a file.

The typical format is

int mkfifo (const char *pathname, mode_t mode);

Returns 0 if successful, -1 on error.

The specifications for the mode argument of the mkfifo function are similar to those of the open function seen in the first unit. Also, the rules for the user and group ownership of a new FIFO are the same as described earlier.

Once a FIFO is created using mkfifo, we can operate on it in a way similar to normal file operations. We open it using the open function and use the normal I/O functions like read, write, close etc. as we do with files.

However, there are one or two points that we have to bear in mind while dealing with FIFOs, regarding the effect of the non-blocking flag (O_NONBLOCK).

a) In the normal case, when O_NONBLOCK is not specified, an open for read only blocks until some other process opens the same FIFO for writing. Similarly, an open for write only blocks until some other process opens the FIFO for reading.

b) When O_NONBLOCK is specified, an open for read only does not block but returns immediately. An open for write only, however, returns an error (errno is set to ENXIO) if no process has the FIFO open for reading.
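The two non-blocking rules above can be checked with a minimal sketch like the following. The function name fifo_nonblock_demo and the path passed to it are illustrative assumptions; the calls themselves (mkfifo, open, unlink) are the standard ones.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: returns 0 if the O_NONBLOCK rules hold for a
 * freshly created FIFO at path, -1 otherwise. */
int fifo_nonblock_demo(const char *path)
{
    int rfd, wfd, ok = 0;

    unlink(path);                          /* start clean; ignore errors */
    if (mkfifo(path, 0600) < 0)
        return -1;

    /* Write only, non-blocking, no reader yet: open fails with ENXIO */
    wfd = open(path, O_WRONLY | O_NONBLOCK);
    if (wfd < 0 && errno == ENXIO)
        ok++;
    else if (wfd >= 0)
        close(wfd);

    /* Read only, non-blocking: open returns at once even with no writer */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd >= 0) {
        ok++;
        /* Now a reader exists, so the write-only open succeeds */
        wfd = open(path, O_WRONLY | O_NONBLOCK);
        if (wfd >= 0) {
            ok++;
            close(wfd);
        }
        close(rfd);
    }
    unlink(path);
    return ok == 3 ? 0 : -1;
}
```

Without O_NONBLOCK the same opens would have to be made from two different processes, since each one blocks until its partner arrives.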

It is also common for several processes to write into a FIFO at the same time, i.e. more than one process may open a given FIFO for writing. This may lead to intermixed data in the FIFO. To avoid it, a write into a FIFO is made an atomic operation, which ensures that data from different writers is not interleaved. Since a single write could otherwise be arbitrarily large, the system also specifies the maximum amount of data that can be written atomically into a FIFO (the constant PIPE_BUF).
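The atomic write limit can be queried at run time. The sketch below asks for it on an ordinary pipe, on the assumption (true under POSIX) that the same limit applies to pipes and FIFOs; the helper name atomic_write_limit is hypothetical.

```c
#include <unistd.h>

/* Hypothetical helper: returns the atomic-write limit (PIPE_BUF) that
 * applies to a newly created pipe, or -1 on error. */
long atomic_write_limit(void)
{
    int fd[2];
    long limit;

    if (pipe(fd) < 0)
        return -1;
    limit = fpathconf(fd[1], _PC_PIPE_BUF);  /* query the write end */
    close(fd[0]);
    close(fd[1]);
    return limit;
}
```

A writer that keeps each write at or below this limit can rely on its data not being interleaved with that of other writers.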

FIFOs are used by shell commands to pass data from one shell pipeline to another, bypassing the need for creating intermediate temporary files. They are also useful in the client-server scenario.

8.6 Message Queues:

As the name indicates, a message queue is a list of messages. Message queues are normally stored as linked lists of messages in the kernel. Each queue is identified by a message queue identifier. For simplicity, we may refer to a message queue as simply a "queue" and a message queue id as simply a "queue id" (in this section only).

A new queue can be created or an existing queue opened, and new messages are added to the end of the queue. Every message has a type field, a length field and the actual data bytes. Though it is called a queue, it is not necessary that we fetch the messages in a first-in-first-out manner. Messages can also be fetched based on their message type.

Before we start looking into the details of the functions operating on message queues, we look at the data structure, called msqid_ds, that the kernel associates with each message queue.

The following is the msqid_ds structure:

struct msqid_ds {
    struct ipc_perm  msg_perm;    /* message permissions */
    struct msg      *msg_first;   /* pointer to the first message on the queue */
    struct msg      *msg_last;    /* pointer to the last message */
    ulong            msg_cbytes;  /* current no. of bytes in queue */
    ulong            msg_qnum;    /* no. of messages in queue */
    ulong            msg_qbytes;  /* max no. of bytes in queue */
    pid_t            msg_lspid;   /* pid of last msgsnd() */
    pid_t            msg_lrpid;   /* pid of last msgrcv() */
    time_t           msg_stime;   /* last msgsnd() time */
    time_t           msg_rtime;   /* last msgrcv() time */
    time_t           msg_ctime;   /* last change time */
};

Now briefly look at each of these fields:

msg_first and msg_last point to the locations where the first and last messages are stored in the kernel.

The first field, msg_perm, is a structure that defines the inter-process communication permissions. It records the ownership of the queue and the access permissions granted on it.

A typical permission structure looks as follows

struct ipc_perm {
    uid_t   uid;    /* owner's effective user id */
    gid_t   gid;    /* owner's effective group id */
    uid_t   cuid;   /* creator's effective user id */
    gid_t   cgid;   /* creator's effective group id */
    mode_t  mode;   /* access modes */
    ulong   seq;    /* slot usage sequence number */
    key_t   key;    /* key */
};

All the fields in the ipc_perm structure are self-evident.

The remaining fields of the message queue structure become obvious as we proceed further with the various functions.

Further, message queues cannot grow to any size just because they are linked list structures. The system imposes certain limits on them, which of course can be modified or reset. The following are some of the limits.

Name      Description
MSGMAX    The size, in bytes, of the largest message that can be sent
MSGMNB    The maximum size, in bytes, of one particular queue
MSGMNI    The maximum number of message queues
MSGTQL    The maximum number of messages

Normally the largest message size is set to 2048 bytes and the maximum number of message queues to 50.

Now we look at the various functions that operate on the message queues.

The function msgget is used, as indicated earlier, either to open an existing message queue or to create a new message queue.

The typical format is

int msgget (key_t key, int flag);

It returns the message queue id if successful, otherwise -1.

Each message queue is associated with a key. If a new queue is being created, the user has to specify the key. If an existing queue is being opened, the key specified must be the same as the one that was specified when the queue was created. The kernel converts the key to an identifier that it associates uniquely with the queue.

When we create a new queue, we must also specify a flag with the IPC_CREAT bit set. Setting both the IPC_CREAT and IPC_EXCL bits guarantees that a new queue is created, since the call then fails if a queue with that key already exists.

The other function that one would like to examine is the msgctl function. This performs various operations on the queue.

The typical format is

int msgctl (int msqid, int cmd, struct msqid_ds *buf);

Returns 0 if successful, -1 on error.

Look at the arguments:

msqid is the id of the queue.

The cmd argument specifies the command to be performed on the queue. The normal operations are as follows

IPC_STAT   Fetch the msqid_ds structure for the queue and store it in the structure pointed to by buf.

IPC_SET    Set the fields msg_perm.uid, msg_perm.gid, msg_perm.mode and msg_qbytes from the structure pointed to by buf. This can be done only if the effective user id of the process doing the operation equals the value of msg_perm.uid or msg_perm.cuid, or by a process with super user privileges. Only the super user is permitted to increase the value of msg_qbytes.

IPC_RMID   Remove the message queue, along with any data in it, from the system. The removal is immediate, so any process still using the queue gets an error on its next operation on the queue. This command can be executed only by a process whose effective user id equals msg_perm.cuid or msg_perm.uid, or by one with super user privileges.

To place data into the message queue, we use the function msgsnd.

The typical format is

int msgsnd (int msqid, const void *ptr, size_t nbytes, int flag);

The function returns 0 if ok, else returns -1.

Now the arguments.

msqid is the id of the message queue in question. ptr points to a long integer that contains the message type, followed immediately by the message data. The structure to hold this can be something as follows:

struct mesg {
    long mtype;        /* type of the message */
    char mtext[MAX];   /* message text, with size MAX */
};

The ptr argument points to this structure.

The flag can specify IPC_NOWAIT, so that the call does not wait when the queue is full. (This is comparable to the non-blocking flag in I/O operations.)

Messages can be retrieved from the queue by the msgrcv function.

The typical format is

int msgrcv (int msqid, void *ptr, size_t nbytes, long type, int flag);

Returns the size of the data portion if successful, -1 on error.

ptr points to a structure just like in msgsnd. nbytes specifies the size of the data buffer for the message data. If the message received is longer than the buffer size, the message is truncated to fit the buffer if the MSG_NOERROR flag is set. If this flag is not set, a message longer than the buffer size results in an error, and the message stays on the queue.

The type argument lets us specify which message we want

type == 0   The first message on the queue is returned.
type > 0    The first message whose message type equals type is returned.
type < 0    The first message whose type is the lowest value less than or equal to the absolute value of type is returned.

Note that type == 0 means a first-in-first-out operation. A non-zero type can be used to produce a priority based queue.

We can specify a flag value of IPC_NOWAIT to make the operation non-blocking (refer to the description of msgsnd above).
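The effect of the type argument can be seen in a minimal sketch such as the one below. The names put, fetch_demo and TEXT_MAX are hypothetical, as are the message texts; the queue is private and is removed before the function returns.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define TEXT_MAX 64                 /* illustrative size, an assumption */

struct mesg {
    long mtype;                     /* type of the message, must be > 0 */
    char mtext[TEXT_MAX];           /* message text */
};

/* Put one message of the given type on the queue; 0 if ok, -1 on error */
static int put(int msqid, long type, const char *text)
{
    struct mesg m;

    m.mtype = type;
    strncpy(m.mtext, text, TEXT_MAX - 1);
    m.mtext[TEXT_MAX - 1] = '\0';
    return msgsnd(msqid, &m, strlen(m.mtext) + 1, IPC_NOWAIT);
}

/* Hypothetical demo: send a type 2 message, then a type 1 message, and
 * fetch one message using the given type argument.  The fetched text is
 * copied to out.  Returns 0 if ok, -1 on error. */
int fetch_demo(long type, char *out, size_t outsize)
{
    struct mesg m;
    int msqid, rc = -1;

    if (outsize == 0)
        return -1;
    if ((msqid = msgget(IPC_PRIVATE, IPC_CREAT | 0600)) < 0)
        return -1;
    if (put(msqid, 2, "sent first") == 0 &&
        put(msqid, 1, "sent second") == 0 &&
        msgrcv(msqid, &m, TEXT_MAX, type, IPC_NOWAIT) >= 0) {
        strncpy(out, m.mtext, outsize - 1);
        out[outsize - 1] = '\0';
        rc = 0;
    }
    msgctl(msqid, IPC_RMID, NULL);  /* remove the queue in every case */
    return rc;
}
```

With type 0 the call fetches the message that was sent first. With type 1, or with type -2 (lowest type not exceeding 2), it fetches the type 1 message even though that one was sent second.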


8.7 Semaphores:

A semaphore is a counter used to control access to a shared data object by multiple processes. To obtain a shared resource, a process follows these steps.

a) Check the value of the semaphore, which indicates whether or not the resource is available.

b) If the value of the semaphore is positive, the process can make use of the resource. The process decrements the value of the semaphore by 1. This indicates that one more unit of the resource controlled by the semaphore has been committed.

c) If the value of the semaphore is 0, the resource is not available and hence the process goes to sleep until the semaphore value becomes greater than 0. When it wakes up, it goes back to step (a).

d) When a process has finished with the shared resource and returns it, the semaphore value is incremented by 1.

When the resource is returned, any processes sleeping on the semaphore are awakened.

To work correctly, the test of the semaphore value and the decrement of its value (indicating allocation to a process) must be a single atomic operation.

8.7.1. Implementation of semaphore structures:

The kernel maintains a semid_ds structure for each set of semaphores.

The data structure can be described as follows

struct semid_ds {
    struct ipc_perm  sem_perm;   /* permissions */
    struct sem      *sem_base;   /* pointer to the first semaphore in the set */
    ushort           sem_nsems;  /* no. of semaphores in the set */
    time_t           sem_otime;  /* last semop() time */
    time_t           sem_ctime;  /* last change time */
};

The sem_base pointer points to the memory in the kernel where the semaphores are stored. It points to an array of sem structures.

struct sem {
    ushort  semval;   /* semaphore value, always >= 0 */
    pid_t   sempid;   /* pid for last operation */
    ushort  semncnt;  /* no. of processes awaiting semval > currval */
    ushort  semzcnt;  /* no. of processes awaiting semval = 0 */
};

Now, we shall look at a few functions that operate on semaphores. The first function is used to obtain a semaphore id.

The typical format is

int semget (key_t key, int nsems, int flag);

Returns the semaphore id on success, else returns -1.

This can be used either to get an existing semaphore set or to create a new semaphore set.

If a new set is being created, then apart from other initializations, sem_otime is set to 0, sem_ctime is set to the current time and sem_nsems is set to nsems, the number of semaphores in the set.

If we are referencing an existing set of semaphores, we can specify nsems as 0.


8.7.2. The function semctl:

The next function we see is semctl, which can be thought of as a catch-all function for all sorts of operations on semaphores.

The typical format is

int semctl (int semid, int semnum, int cmd, union semun arg);

The argument semid specifies the id of the semaphore set on which the operations are to be done.

semnum specifies the particular member of the semaphore set on which the operation is to be performed. The valid values of semnum are from 0 to nsems-1.

cmd specifies the command to be performed on the member indicated by semnum of the semaphore set whose id is given in semid. There are ten commands, each of which is described briefly here

IPC_STAT   Fetch the semid_ds structure for this set and store it in the structure pointed to by arg.buf.

IPC_SET    Set the following fields from the structure pointed to by arg.buf
             i) sem_perm.uid
             ii) sem_perm.gid
             iii) sem_perm.mode
           This command can be executed only if the effective user id of the calling process equals sem_perm.cuid or sem_perm.uid, or by a process with super user privileges.

IPC_RMID   Remove the semaphore set from the system. As we have seen earlier, this removal is immediate, and any process still using the semaphore gets an error on its next attempt to access the semaphore. This command, again, can be executed only if the effective user id of the calling process equals sem_perm.cuid or sem_perm.uid, or by a process with super user privileges.

GETVAL     Return the value of semval for the member semnum.

SETVAL     Set the value of semval for the member semnum. The value to be set is given by arg.val.

GETPID     Return the value of sempid for the member semnum.

GETNCNT    Return the value of semncnt for the member semnum.

GETZCNT    Return the value of semzcnt for the member semnum.

GETALL     Fetch all the semaphore values in the set. These values are stored in the array pointed to by arg.array.

SETALL     Set all the semaphore values in the set to the values in the array pointed to by arg.array.

The final argument of the function is the actual union, named semun.

union semun {
    int              val;    /* for SETVAL */
    struct semid_ds *buf;    /* for IPC_STAT and IPC_SET */
    ushort          *array;  /* for GETALL and SETALL */
};
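A minimal sketch of the SETALL, SETVAL and GETALL commands follows. The function name semctl_demo and the initial values are hypothetical. The union is declared by the program only on Linux, on the assumption that other systems already declare it in <sys/sem.h>.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#if defined(__linux__)
/* Assumption: on Linux the calling program must declare this union */
union semun {
    int              val;     /* for SETVAL */
    struct semid_ds *buf;     /* for IPC_STAT and IPC_SET */
    unsigned short  *array;   /* for GETALL and SETALL */
};
#endif

/* Hypothetical demo: create a set of 3 semaphores, initialise them with
 * SETALL, change member 1 with SETVAL and read all back with GETALL.
 * Returns 0 if ok, -1 on error; the values are stored in out[0..2]. */
int semctl_demo(unsigned short out[3])
{
    unsigned short init[3] = { 5, 0, 2 };
    union semun arg;
    int semid, rc = -1;

    if ((semid = semget(IPC_PRIVATE, 3, IPC_CREAT | 0600)) < 0)
        return -1;
    arg.array = init;
    if (semctl(semid, 0, SETALL, arg) == 0) {   /* semnum is ignored */
        arg.val = 7;
        if (semctl(semid, 1, SETVAL, arg) == 0) {
            arg.array = out;
            if (semctl(semid, 0, GETALL, arg) == 0)
                rc = 0;
        }
    }
    semctl(semid, 0, IPC_RMID);                 /* remove the set */
    return rc;
}
```

After a successful call the array holds the SETALL values, except for member 1, which carries the value given to SETVAL.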

8.7.3. The function semop

The next function we shall be looking at is semop, which atomically performs an array of operations on a semaphore set.

The typical format is

int semop (int semid, struct sembuf semoparray[], size_t nops);

semid is the semaphore id.

semoparray is a pointer to an array of the operations to be done on the semaphore set. Each element of the array is of the following type.

struct sembuf {
    ushort sem_num;   /* member no. in set (0, 1, 2, ..., nsems-1) */
    short  sem_op;    /* operation (negative, 0 or positive) */
    short  sem_flg;   /* IPC_NOWAIT, SEM_UNDO */
};

nops specifies the number of operations in the array.

The operation to be done on each member of the set is specified by the respective sem_op value. As has been indicated, this value can be negative, 0 or positive.

a) When sem_op is positive, the process is returning resources. The value of sem_op is added to the current value of the semaphore we are operating upon. If the undo flag (SEM_UNDO) is set, sem_op is also subtracted from the semaphore's adjustment value for the process.

b) If sem_op is negative, the process wants to obtain the resources that the semaphore controls. In such a case, a number of alternative situations are possible. If the current value of the semaphore is greater than or equal to the absolute value of sem_op (which indicates that the required resources are available), the absolute value of sem_op is subtracted from the semaphore's value. This guarantees that the resulting value does not fall below 0. If the undo flag has been specified, the absolute value of sem_op is also added to the semaphore's adjustment value for this process. On the other hand, if the semaphore's value is less than the absolute value of sem_op, the process is requesting more resources than are actually available. In such a case:

   i) If IPC_NOWAIT is specified, control returns immediately with an error.

   ii) If IPC_NOWAIT is not specified, the semncnt value for the semaphore is incremented and the calling process is put to sleep (suspended) pending one of the following occurrences:

      1. The semaphore value becomes greater than or equal to the absolute value of sem_op. This happens when some other processes have released their resources. Then the value of semncnt for the semaphore is decremented and the absolute value of sem_op is subtracted from the semaphore value. If the undo flag is specified, the absolute value of sem_op is also added to the semaphore's adjustment value for this process.

      2. The semaphore is removed from the system. In this case the function returns an error.

      3. A signal is caught by the process and the signal handler returns. Then the value of semncnt for the semaphore is decremented and the function returns an error.

c) If sem_op is 0, the process wants to wait until the semaphore's value becomes 0. If the current value of the semaphore is already 0, the function returns immediately. If it is not zero, then

   i) If IPC_NOWAIT is specified, an error is returned.

   ii) If IPC_NOWAIT is not specified, the semzcnt value for the semaphore is incremented and the calling process is suspended until one of the following occurs:

      1. The semaphore's value becomes 0. Then the value of semzcnt for the semaphore is decremented.

      2. The semaphore is removed from the system. Then the function returns an error.

      3. A signal is caught by the process and the signal handler returns. Then the value of semzcnt for the semaphore is decremented and the function returns an error.
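The rules for negative and positive sem_op values can be exercised with a minimal sketch like this one, which treats a single semaphore as a counter of two resource units. The names sem_change and semop_demo are hypothetical. As an assumption, the union semun is declared by the program only on Linux, since other systems may already declare it in <sys/sem.h>.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#if defined(__linux__)
/* Assumption: on Linux the calling program must declare this union */
union semun { int val; struct semid_ds *buf; unsigned short *array; };
#endif

/* Apply one operation to member 0 of the set without blocking */
static int sem_change(int semid, short op)
{
    struct sembuf sb;

    sb.sem_num = 0;
    sb.sem_op  = op;                      /* < 0 obtain, > 0 return */
    sb.sem_flg = IPC_NOWAIT | SEM_UNDO;
    return semop(semid, &sb, 1);
}

/* Hypothetical demo with 2 units of a resource.  Returns the final
 * semaphore value (expected 1), or -1 if any step behaves unexpectedly. */
int semop_demo(void)
{
    union semun arg;
    int semid, val = -1;

    if ((semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600)) < 0)
        return -1;
    arg.val = 2;
    if (semctl(semid, 0, SETVAL, arg) == 0 &&
        sem_change(semid, -2) == 0 &&                      /* 2 -> 0 */
        sem_change(semid, -1) == -1 && errno == EAGAIN &&  /* none left */
        sem_change(semid, +1) == 0)                        /* 0 -> 1 */
        val = semctl(semid, 0, GETVAL);
    semctl(semid, 0, IPC_RMID);                            /* remove the set */
    return val;
}
```

The second request fails at once because IPC_NOWAIT is set. Without that flag the caller would sleep until another process returned a unit.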


8.7.4. What happens if a process holding system resources terminates?

It becomes a problem if a process suddenly terminates, for whatever reason, while it still holds resources allocated through a semaphore. How do we account for the resources held by the process? When we specify the SEM_UNDO flag for a semaphore operation and allocate resources, the kernel remembers how many resources were allocated from that particular semaphore to the process (the absolute value of sem_op). When the process terminates, voluntarily or involuntarily, the kernel checks whether the process has any outstanding semaphore adjustments and, if so, applies them to the respective semaphores.

If the value of a semaphore is set using semctl, with either the SETVAL or the SETALL command, the adjustment value for that semaphore in all processes is set to 0.

8.8 The concept of shared memory:

The basic principle behind shared memory is that two or more processes are allowed to share a given region of memory. This makes data exchange fast, because the data need not be copied between, for example, the client and the server. The most important concern is synchronizing access to the region among the different processes: while one process is putting data into the shared region, no other process should access the data in that region, and vice versa. Two mechanisms are commonly used to synchronize the operations: semaphores and record locking.

We look into the details in the subsequent paragraphs.

The kernel maintains the information about each shared memory segment in the following structure type:

struct shmid_ds {
    struct ipc_perm   shm_perm;     /* security permissions */
    struct anon_map  *shm_amp;      /* pointer in kernel */
    int               shm_segsz;    /* size of the segment in bytes */
    ushort            shm_lkcnt;    /* no. of times segment is being locked */
    pid_t             shm_lpid;     /* pid of the last shmop() */
    pid_t             shm_cpid;     /* pid of the creator */
    ulong             shm_nattch;   /* no. of current attaches */
    ulong             shm_cnattch;  /* used only for shminfo */
    time_t            shm_atime;    /* last attach time */
    time_t            shm_dtime;    /* last detach time */
    time_t            shm_ctime;    /* last change time */
};

All the fields are self-explanatory.

Also, the system sets certain limits on shared memory

Name      Description
SHMMAX    Maximum size in bytes of a shared memory segment
SHMMIN    Minimum size in bytes of a shared memory segment
SHMMNI    Maximum no. of shared memory segments, system wide
SHMSEG    Maximum no. of shared memory segments, per process

These limits bound the various parameters of memory sharing.

Now, we are in a position to look into several of the memory sharing functions:

8.8.1. The function shmget

The typical format is

int shmget (key_t key, int size, int flag);

If successful, returns the shared memory id, -1 on error.

When a new shared memory segment is created, the following members of the shmid_ds structure are initialized.

i) The ipc_perm structure is initialized. The method of initializing this field is the same as we have seen on several previous occasions.

ii) shm_lpid, shm_nattch, shm_atime and shm_dtime are all set to 0.

iii) shm_ctime is set to the current time.

size specifies the size of the shared memory segment in bytes. If we are referencing an existing segment, size can be specified as zero. If a new segment is being created, the size must be specified suitably.

8.8.2 The shmctl function:

The shmctl function performs various shared memory operations.

The typical format is

int shmctl (int shmid, int cmd, struct shmid_ds *buf);

Returns 0 if successful, otherwise returns -1.

The cmd argument specifies one of the following commands to be performed. The operation is done on the segment specified by shmid.

IPC_STAT     Fetch the shmid_ds structure for this segment and store it in the structure pointed to by buf.

IPC_SET      Set the following fields from the structure pointed to by buf in the structure associated with this segment: shm_perm.uid, shm_perm.gid and shm_perm.mode. The command can be executed only if the process calling the function has its effective user id equal to shm_perm.cuid or shm_perm.uid, or by a process with super user privileges.

IPC_RMID     Remove the shared memory segment from the system. Note that the shm_nattch field in the shmid_ds structure is an attachment count for the shared memory segment. When a shared memory segment is removed, its identifier is removed immediately, so that shmat can no longer attach the segment. The segment itself is actually removed only after the last process using it either terminates or detaches from it. This command can be executed only by a process whose effective user id equals shm_perm.cuid or shm_perm.uid, or by a process with super user privileges.

SHM_LOCK     Lock the shared memory segment in memory. This can be executed only by the super user.

SHM_UNLOCK   Unlock the shared memory segment. Again, this can be executed only by the super user.

8.8.3 The shmat function:

This function attaches a shared memory segment to the address space of the calling process.

The typical format is

void *shmat (int shmid, void *addr, int flag);

The function returns a pointer to the shared memory segment if successful, otherwise -1.

The address at which the segment gets attached to the process depends on the addr argument and on whether the SHM_RND bit is specified in the flag argument.

a) If addr is 0, the segment is attached at the first available address selected by the kernel.

b) If addr is non zero and SHM_RND is not specified, the segment is attached at the address given by addr.

c) If addr is non zero and SHM_RND is specified, the segment is attached at the address given by (addr - (addr mod SHMLBA)). SHMLBA stands for "lower boundary address multiple of shared memory".

It is advisable to specify addr as 0 and let the system choose the address.

8.8.4 The shmdt function:

This function detaches the memory segment once we have completed our operations on it.

The typical format is

int shmdt (void *addr);

Returns 0 if successful, otherwise -1.

It may be noted that the function detaches the memory segment but does not remove the identifier and the associated data structures.
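The four functions can be seen together in a minimal sketch such as the following, in which a child process writes into a shared segment and the parent reads the data afterwards. The names shm_demo and SEG_SIZE are hypothetical. Waiting for the child to exit stands in for the semaphore or record lock that a real application would use for synchronization.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

#define SEG_SIZE 128                /* illustrative size, an assumption */

/* Hypothetical demo: the child writes text into a shared segment and
 * the parent copies it to out after the child has exited.
 * Returns 0 if ok, -1 on error. */
int shm_demo(const char *text, char *out, size_t outsize)
{
    int   shmid, rc = -1;
    char *seg;
    pid_t pid;

    if (outsize == 0)
        return -1;
    if ((shmid = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600)) < 0)
        return -1;
    seg = (char *)shmat(shmid, NULL, 0);    /* addr 0: kernel chooses */
    if (seg != (char *)-1) {
        seg[0] = '\0';
        if ((pid = fork()) == 0) {          /* child inherits attachment */
            strncpy(seg, text, SEG_SIZE - 1);
            seg[SEG_SIZE - 1] = '\0';
            _exit(0);
        }
        /* waitpid doubles as the synchronization: read only afterwards */
        if (pid > 0 && waitpid(pid, NULL, 0) == pid) {
            strncpy(out, seg, outsize - 1);
            out[outsize - 1] = '\0';
            rc = 0;
        }
        shmdt(seg);                         /* detach; the id still exists */
    }
    shmctl(shmid, IPC_RMID, NULL);          /* now remove the segment */
    return rc;
}
```

Note that the data written by the child is never copied through the kernel. The parent reads it directly from the region both processes have attached.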

8.9 Client Server Properties:

Several properties of client-server operation are affected by the type of IPC used.

The simplest way of having a client-server operation is to have the client fork and exec the desired server. Before the fork, the pipe function can be used to create two one-way pipes, needed for two-way data transfer, as seen earlier. The server can be a set-user-id program. Also, the server can look at its real user id and hence determine the client's identity.

This arrangement helps one to build an "open server". It can open files for the client, instead of the client calling the open function. In this type of client-server architecture, since the server is a child of the client, it can only pass the contents of the file back to the parent. The child cannot pass a file descriptor back to the parent, though the parent can pass such a descriptor to the child.

The next type of server is a daemon process that uses some form of IPC with its clients. In this case, either FIFOs or message queues are used for message transfers. With message queues, several possibilities regarding their operation arise.

a) A single queue can be used for passing messages between the server and all its clients. The clients, when sending their requests, can include their process id, which the server can then use as the message type when returning the responses.

b) Each client has its own message queue. Before sending its first request, each client creates its own message queue with the key IPC_PRIVATE. The server also has its own queue, with a key known to each of its clients. The client sends its first request to the server's well known queue, and the request contains the message queue id of the client's queue. The server sends its first response to the client's queue, and all future requests and responses are exchanged on this queue.

The problem is that the server has to go on reading messages from multiple queues, as neither select nor poll works with message queues.

The basic problem with shared memory is that only a single message can be in a shared memory segment at a time. On the other hand, it is possible for each client to have its own shared memory segment with the server. But this poses the additional problem that the server should be able to identify the client accurately.

8.10 Stream pipes:

One problem with the pipes we have seen so far is that they are unidirectional. So, to have a full duplex connection, we had to create two such unidirectional pipes and close the complementary ends, as we have seen earlier.

One other way of doing it is to use a bi-directional pipe, called the stream pipe. Its structure is given below:

[Figure: two user processes, each holding descriptors fd[0] and fd[1], connected by a single stream pipe.]

To see how a single stream pipe can operate, we shall rewrite the program which we wrote using coprocesses in section 8.4.

#include <signal.h>

static void sig_pipe(int);      /* our signal handler */

int
main(void)
{
    int   n, fd[2];
    pid_t pid;
    char  line[MAXLINE];

    if (signal(SIGPIPE, sig_pipe) == SIG_ERR)
        err_sys("signal error");
    if (s_pipe(fd) < 0)         /* only needs a single stream pipe */
        err_sys("pipe error");
    if ((pid = fork()) < 0)
        err_sys("fork error");
    else if (pid > 0) {                     /* parent */
        close(fd[1]);
        while (fgets(line, MAXLINE, stdin) != NULL) {
            n = strlen(line);
            if (write(fd[0], line, n) != n)
                err_sys("write error to pipe");
            if ((n = read(fd[0], line, MAXLINE - 1)) < 0)
                err_sys("read error from pipe");
            if (n == 0) {
                err_msg("child closed the pipe");
                break;
            }
            line[n] = 0;                    /* null terminate */
            if (fputs(line, stdout) == EOF)
                err_sys("fputs error");
        }
        if (ferror(stdin))
            err_sys("fgets error on stdin");
        exit(0);
    } else {                                /* child */
        close(fd[0]);
        if (fd[1] != STDIN_FILENO) {
            if (dup2(fd[1], STDIN_FILENO) != STDIN_FILENO)
                err_sys("dup2 error to stdin");
        }
        if (fd[1] != STDOUT_FILENO) {
            if (dup2(fd[1], STDOUT_FILENO) != STDOUT_FILENO)
                err_sys("dup2 error to stdout");
        }
        if (execl("./add2", "add2", (char *) 0) < 0)
            err_sys("execl error");
    }
}

static void
sig_pipe(int signo)
{
    printf("SIGPIPE caught\n");
    exit(1);
}

Since each end of the stream pipe is full duplex, the parent in the above
program uses only fd[0] for both reading and writing, and the child duplicates fd[1]
onto both stdin and stdout (standard input and standard output).

[Figure: the parent holds fd[0]; in the child, fd[1] is duplicated onto stdin and stdout.]
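The arrangement described above can be tried without the helper functions of the reference text. The following is only a minimal sketch under stated assumptions: socketpair with a UNIX domain stream socket stands in for the stream pipe, the child runs a small in-process filter instead of executing add2, and the names run_filter and sum_via_stream_pipe are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child side: a tiny filter standing in for add2. It reads "a b" from
 * stdin and writes the sum to stdout until it sees end of file. */
static void run_filter(void)
{
    char buf[128];
    ssize_t n;
    int a, b, len;

    while ((n = read(STDIN_FILENO, buf, sizeof(buf) - 1)) > 0) {
        buf[n] = '\0';
        if (sscanf(buf, "%d %d", &a, &b) == 2)
            len = snprintf(buf, sizeof(buf), "%d\n", a + b);
        else
            len = snprintf(buf, sizeof(buf), "invalid args\n");
        if (write(STDOUT_FILENO, buf, (size_t)len) != len)
            _exit(1);
    }
    _exit(0);
}

/* Hypothetical helper: asks a forked filter for a + b over ONE
 * full-duplex channel. Returns 0 and stores the result in *sum,
 * or returns -1 on failure. */
int sum_via_stream_pipe(int a, int b, int *sum)
{
    int fd[2], len;
    char line[128];
    ssize_t n = -1;
    pid_t pid;

    assert(sum != NULL);
    /* Assumption: a UNIX domain socket pair plays the stream pipe. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) < 0)
        return -1;
    if ((pid = fork()) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                        /* child */
        close(fd[0]);
        if (dup2(fd[1], STDIN_FILENO) < 0 ||
            dup2(fd[1], STDOUT_FILENO) < 0)
            _exit(1);
        if (fd[1] != STDIN_FILENO && fd[1] != STDOUT_FILENO)
            close(fd[1]);
        run_filter();                      /* never returns */
    }

    /* parent: fd[0] is used for both writing and reading */
    close(fd[1]);
    len = snprintf(line, sizeof(line), "%d %d\n", a, b);
    if (write(fd[0], line, (size_t)len) == len)
        n = read(fd[0], line, sizeof(line) - 1);
    close(fd[0]);                          /* child sees EOF and exits */
    waitpid(pid, NULL, 0);
    if (n <= 0)
        return -1;
    line[n] = '\0';
    return sscanf(line, "%d", sum) == 1 ? 0 : -1;
}
```

A caller would declare an int, pass its address as the third argument, and check for a return value of 0 before using the sum. The sketch assumes the short reply arrives in a single read, which is adequate for a demonstration but not for long messages.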

8.10.1 The s_pipe function:

This function, which creates the stream pipe, is similar to the pipe function, so we
do not go into great detail. On systems where the standard pipe function already
creates a full-duplex stream pipe (SVR4, for example), s_pipe just calls pipe; on
BSD-derived systems the same effect can be obtained with socketpair.

The typical format is

int s_pipe (int fd[2]);

It returns 0 if successful and -1 on error. The two file descriptors, fd[0] and fd[1],
are returned through the fd argument.
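For experimentation on a system whose pipe function is not full duplex, a function with the same format as s_pipe can be approximated as follows. This is a sketch only, not the implementation from the reference text: it assumes that a UNIX domain stream socket pair is an acceptable stand-in for a stream pipe, and the names s_pipe_sketch and s_pipe_roundtrip are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch of an s_pipe-style helper: returns 0 on success, -1 on error,
 * and fills fd[0] and fd[1] with the two ends of a full-duplex channel.
 * Assumption: socketpair() stands in for the stream pipe. */
int s_pipe_sketch(int fd[2])
{
    assert(fd != NULL);
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fd);
}

/* Demonstration: data written on either end can be read on the other.
 * Returns 0 if both directions work, -1 otherwise. */
int s_pipe_roundtrip(void)
{
    int fd[2], ok;
    char buf[8];

    if (s_pipe_sketch(fd) < 0)
        return -1;

    /* direction 1: fd[0] -> fd[1] */
    ok = write(fd[0], "ping", 4) == 4 &&
         read(fd[1], buf, sizeof(buf)) == 4 &&
         memcmp(buf, "ping", 4) == 0;

    /* direction 2: fd[1] -> fd[0], over the same pair of descriptors */
    ok = ok &&
         write(fd[1], "pong", 4) == 4 &&
         read(fd[0], buf, sizeof(buf)) == 4 &&
         memcmp(buf, "pong", 4) == 0;

    close(fd[0]);
    close(fd[1]);
    return ok ? 0 : -1;
}
```

With an ordinary unidirectional pipe the second direction would fail, which is exactly the limitation the stream pipe removes.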

8.11 Passing file descriptors:

The ability to pass an open file descriptor between processes is a desirable feature,
as it helps in designing different types of applications. For example, one process
(typically a server) can do everything needed to open a file, and the calling process
(let us call it the client) just needs to accept the descriptor passed to it by the
server and use it for its I/O functions. The details of opening the file or the device
are all hidden from the client.

When an open file descriptor is passed from one process to another, the passing
process (server) and the receiving process (client) share the same file table
entry. In simple terms, it just means passing a pointer to an open file table entry
from one process to another. The pointer is assigned the first available file
descriptor in the receiving process.

[Figure: two process table entries, (1) and (2), each holding an fd pointer to the same file table entry (file status flags, current offset, v-node pointer), which in turn points to one v-node table entry (v-node information, i-node information, current file size).]

The file descriptor is being passed from process (1) to process (2).

8.11.1 Functions for sending and receiving file descriptors:

The following functions can be used to send and receive file descriptors.
Their typical formats are

int send_fd (int spipefd, int filedes);

int send_err (int spipefd, int status, const char *errmsg);

Both return 0 if successful and -1 on error.

When a process wants to pass a file descriptor, it calls send_fd; to report an
error instead, it calls send_err. The send_fd function sends the descriptor filedes
across the stream pipe spipefd. The send_err function sends errmsg across the stream
pipe spipefd, followed by the status byte. The value of status can be from -1
through -255.

The receiver obtains the descriptor by calling the function recv_fd.
The typical format is

int recv_fd (int spipefd, ssize_t (*userfunc) (int, const void *, size_t));

It returns the file descriptor if successful, else a number less than 0.

If an error message was sent by the server, the client's userfunc is called to
process the message. The first argument to userfunc is the constant STDERR_FILENO,
followed by a pointer to the error message and its length.
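On systems with UNIX domain sockets, a descriptor can be passed as ancillary data of type SCM_RIGHTS using sendmsg and recvmsg. The following is a simplified sketch of that mechanism, not the send_fd and recv_fd of the reference text: it omits the error message and status byte protocol and the userfunc callback, and the names fd_ctl, setup_msg, send_fd_sketch, recv_fd_sketch and fd_passing_roundtrip are all hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Buffer for one control message carrying a single int descriptor;
 * the union keeps it suitably aligned for struct cmsghdr. */
union fd_ctl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Point msg at a one-byte payload and at the control buffer. */
static void setup_msg(struct msghdr *msg, struct iovec *iov,
                      char *byte, union fd_ctl *ctl)
{
    memset(msg, 0, sizeof(*msg));
    memset(ctl, 0, sizeof(*ctl));
    iov->iov_base = byte;
    iov->iov_len = 1;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    msg->msg_control = ctl->buf;
    msg->msg_controllen = sizeof(ctl->buf);
}

/* Sketch: send filedes across the UNIX domain socket sockfd.
 * Returns 0 on success, -1 on error. */
int send_fd_sketch(int sockfd, int filedes)
{
    struct msghdr msg;
    struct iovec iov;
    union fd_ctl ctl;
    struct cmsghdr *cmsg;
    char byte = 0;          /* payload byte that accompanies the descriptor */

    assert(filedes >= 0);
    setup_msg(&msg, &iov, &byte, &ctl);
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &filedes, sizeof(int));
    return sendmsg(sockfd, &msg, 0) == 1 ? 0 : -1;
}

/* Sketch: receive a descriptor from sockfd.
 * Returns the new descriptor, or -1 on error. */
int recv_fd_sketch(int sockfd)
{
    struct msghdr msg;
    struct iovec iov;
    union fd_ctl ctl;
    struct cmsghdr *cmsg;
    char byte;
    int newfd;

    setup_msg(&msg, &iov, &byte, &ctl);
    if (recvmsg(sockfd, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&newfd, CMSG_DATA(cmsg), sizeof(int));
    return newfd;
}

/* Demonstration inside one process: pass the write end of a pipe over
 * a socket pair, write through the received descriptor, and read the
 * data back from the pipe. Returns 0 if every step works. */
int fd_passing_roundtrip(void)
{
    int sv[2], p[2], newfd, ok = 0;
    char buf[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(p) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (send_fd_sketch(sv[0], p[1]) == 0 &&
        (newfd = recv_fd_sketch(sv[1])) >= 0) {
        ok = newfd != p[1] &&              /* a fresh descriptor number */
             write(newfd, "ok", 2) == 2 &&
             read(p[0], buf, 2) == 2 &&
             memcmp(buf, "ok", 2) == 0;
        close(newfd);
    }
    close(p[0]);
    close(p[1]);
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

In a real client and server the two ends of the socket pair would belong to different processes, for example after a fork. The demonstration keeps both ends in one process only so that it can check the result itself.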


Block Summary

We began with the basic idea of interprocess communication. We introduced
ourselves to the concept of pipes, which can be viewed as a connecting channel
between two processes that goes through the kernel. We also noted that a pipe call
is normally followed by a call to fork. We then studied the functions that open and
close such pipes: the popen and pclose functions.

Then we moved on to coprocesses and wrote a program that utilised the
concept of coprocesses. The next topic was FIFOs, which can be looked upon as
pipes between processes that do not share a common ancestor in fork operations. We
also saw the concept of a message queue, which is a list of messages, as the name
suggests. We studied the msgget and msgctl functions.

We moved on to semaphores, a mechanism to provide access to a shared data
object for multiple processes. We studied the methods of implementing the semaphore
structures: the functions semctl, semop etc.

Then we studied the concept of shared memory and the functions shmget and
shmctl, as also the functions shmat and shmdt. We also briefly studied stream
pipes and the s_pipe function. The last concept that we studied was the passing of
file descriptors and the functions required for the same.


Review Questions:

1. What is a pipe? What are its limitations?

2. What are the typical formats for the popen and pclose functions?

3. What is a filter?

4. How does a FIFO differ from a pipe?

5. What is a message queue? How does it store and identify messages?

6. What is a semaphore?

7. What is the need for shared memory? What is the concept behind it?

8. Name the function used to create a shared memory segment.

9. What is a stream pipe?

10. Name the functions used to send and receive file descriptors.


Answers

1. A pipe is a connection between two processes through which data can pass. Its
limitations are that data can pass in only one direction and that it can be used only
by processes that have a common ancestor.

2. The typical formats are

FILE *popen (const char *cmdstring, const char *type);

and

int pclose (FILE *fp);

3. A filter can be described as a program that reads from the standard input and
writes to the standard output.

4. A FIFO helps to exchange data even between unrelated processes.

5. A message queue is a list of messages. Message queues are normally stored as
linked lists of messages in the kernel. Each queue is identified by a message queue
identifier.

6. A semaphore is a counter used to provide access to a shared data object for multiple
processes.

7. The concept behind shared memory is to allow two or more processes to share a
given region of memory, after suitable synchronisation is done. This makes the
operations faster, since the data need not be copied between the processes.

8. The function is shmget. Its typical format is

int shmget (key_t key, int size, int flag);

9. A stream pipe can be viewed as a bidirectional (full-duplex) pipe.

10. The functions are send_fd and recv_fd respectively.


References:

1. W. Richard Stevens: Advanced Programming in the UNIX Environment, Addison-Wesley.
(The course closely follows the pattern of this book.)

2. Terrence Chan: UNIX System Programming Using C++, PHI.
