Unix Shell Scripting



UNIX SHELL SCRIPTING

Perhaps the most important achievement of UNIX is to demonstrate that a powerful operating system for interactive use need not be expensive either in equipment or in human effort: UNIX can run on hardware costing as little as $40,000, and less than two man-years were spent on the main system software.

The UNIX Time-Sharing System (1974)
Dennis M. Ritchie and Ken Thompson


Table of Contents

Module 1 Introduction to Operating System
Module 2 Exploring the UNIX Shell
Module 3 Processes
Module 4 A Shell Script
Module 5 A Overview
Module 6 The vi Editor
Module 7 The variable
Module 8 Parameters
Module 9 Regular Expressions
Module 10 A Sample Shell Script
Module 11 Useful Utilities of Shell
Module 12 Arithmetic on Shell
Module 13 Functions
Module 14 Sed and AWK
Module 15 Database Using Shell Script
Module 16 Overview of Perl
Exercise for Lab Experience
Appendix List of UNIX Commands


Module 1
Introduction to Operating System:

In simple terms, an operating system is a manager. It manages all the available resources on a computer. These resources can be the hard disk, a printer, or the monitor screen. Even memory is a resource that needs to be managed. Within an operating system are the management functions that determine who gets to read data from the hard disk, what file is going to be printed next, what characters appear on the screen, and how much memory a certain program gets.

Note: An operating system (OS) is a collection of system programs that together control the operation of a computer system.

Operating systems may be classified both by how many tasks they can perform "simultaneously" and by how many users can be using the system "simultaneously". That is: single-user or multi-user, and single-task or multi-tasking. A multi-user system must clearly be multi-tasking.

Single User Operating System

MS-DOS/PC-DOS was designed specifically to suit a single user's requirements. The user can run only one program at a time. At any instant there is only one process running on the CPU.

Multi User Operating System

Here many users can work on the system at the same time. One large CPU and a high-capacity storage medium are enclosed in what is called the system unit, and different terminals are attached to it. Each user works on a separate terminal and utilizes the CPU's resources.

Each user's programs and other files are stored on the system unit's storage media. Thus there is one CPU, and many users are using it. There is therefore a need for an OS that effectively divides the resources of the CPU among all users. Such an OS is called a multi-user OS.


Features of Multi User OS

1. Multi Processing

As many users are working at a time, every user will run their own program. When a program is run by a user, it is a process. When the same program is run by another user, it is another process. If different users are running different programs, many processes are executing at once. A user should not have to wait until other users' programs finish execution.

The same program can be shared by many users and run by all of them at the same time. This ability of the OS to run several processes together is called multi-processing.

2. Time Sharing

The CPU can execute only one instruction at a time. Since several users are running their programs, the OS divides the CPU time among them. It allots each user a definite time interval, called a time slice, within which that user's program is executed. Once the time slice is over, the CPU switches to the next user and executes that user's program.

Thus every user's program is constantly being interrupted by another user's program, but no user notices this because the CPU is very fast. In this way the OS effectively divides the CPU time between several users.

3. Memory Management

A program can run only if it is loaded into the internal memory. So when many users are running their programs, all of those programs have to be loaded into memory. Main memory is therefore divided logically so that every user's program gets its share. Also, when a user's program finishes execution, it has to be removed from internal memory so that the freed part of memory can be used to store another user's program.

4. Multi Tasking

Many users work in a multi-user environment, each running their own process. Thus more than one process is executing at a time. But a single user can also run more than one process or program if their requirements demand it. Running a number of processes for one user in this way is called multi-tasking.

PARTS OF Operating System:

Any Operating System consists of two parts.
o The Shell and
o The Kernel

The Shell: The shell acts as an interface between the user and the machine. It interprets every command given by the user and instructs the kernel to act accordingly.

A single user OS will have only one shell devoted entirely to the user, whereas in a multi user OS every user will have a separate shell.

Kernel: The Kernel is the part of the OS that interacts directly with the hardware of the computer system.

Why is UNIX Important?

During the past 25 years the UNIX Operating System has evolved into a powerful, flexible, and versatile operating system. It serves as the Operating System for all types of computers, including single user personal computers and engineering workstations, multi-user microcomputers, minicomputers, mainframes and supercomputers, as well as special purpose devices, with approximately 20 million computers now running UNIX and more than 100 million people using these systems. This rapid growth is expected to continue. The success of UNIX is due to many factors, including its portability to a wide range of machines, its adaptability and simplicity, the wide range of tasks that it can perform, its multi-user and multi-tasking nature, and its suitability for networking, which has become increasingly important as the Internet has blossomed. What follows is a description of the features that have made the UNIX system so popular.

Understanding UNIX:

The UNIX operating system was designed to let a number of programmers access the computer at the same time and share its resources.

The operating system coordinates the use of the computer's resources, allowing one person, for example, to run a spell check program while another creates a document, another edits a document, another creates graphics, and another formats a document -- all at the same time, with each user oblivious to the activities of the others.

The operating system controls all of the commands from all of the keyboards and all of the data being generated, and permits each user to believe he or she is the only person working on the computer.

This real-time sharing of resources makes UNIX one of the most powerful operating systems ever.

Although UNIX was developed by programmers for programmers, it provides an environment so powerful and flexible that it is found in businesses, sciences, academia, and industry. Many telecommunications switches and transmission systems are also controlled by administration and maintenance systems based on UNIX.

While initially designed for medium-sized minicomputers, the operating system was soon moved to larger, more powerful mainframe computers. As personal computers grew in popularity, versions of UNIX found their way into these boxes, and a number of companies produce UNIX-based machines for the scientific and programming communities.

The uniqueness of UNIX

The features that made UNIX a hit from the start are:

o Multitasking capability
o Multi-user capability
o Portability
o Cooperative Tools and Utilities
o Excellent Networking capability
o Open Source Code

Multitasking

Many computers do just one thing at a time, as anyone who uses a PC or laptop can attest. Try logging onto your company's network while opening your browser while opening a word processing program. Chances are the processor will freeze for a few seconds while it sorts out the multiple instructions.

UNIX, on the other hand, lets a computer do several things at once, such as printing out one file while the user edits another file. This is a major feature for users, since they don't have to wait for one application to end before starting another one.

Multi-user

The same design that permits multitasking permits multiple users to use the computer. The computer can take the commands of a number of users -- determined by the design of the computer -- to run programs, access files, and print documents at the same time.

The computer can't tell the printer to print all the requests at once, but it does prioritize the requests to keep everything orderly. It also lets several users access the same document by compartmentalizing the document so that the changes of one user don't override the changes of another user.

Portability

A major contribution of the UNIX system was its portability, permitting it to move from one brand of computer to another with a minimum of code changes. At a time when different computer lines of the same vendor didn't talk to each other -- let alone machines of multiple vendors -- that meant a great savings in both hardware and software upgrades.

It also meant that the operating system could be upgraded without having all the customer's data entered again. And new versions of UNIX were backward compatible with older versions, making it easier for companies to upgrade in an orderly manner.

Cooperative Tools and Utilities

UNIX comes with hundreds of programs that are divided into two classes:

o Integral utilities that are absolutely necessary for the operation of the computer, such as the command interpreter, and
o Tools that aren't necessary for the operation of UNIX but provide the user with additional capabilities, such as typesetting capabilities and e-mail.

Fig 1.1 UNIX Tools (DC, Mail, Man, fsck, nroff, vi, Calendar)

Tools can be added or removed from a UNIX system, depending upon the applications required.

Excellent Networking Capability:

The UNIX system provides an excellent environment for networking. It offers programs and utilities that provide the services needed to build networked applications, the basis for distributed, networked computing. With networked computing, information and processing are shared among different computers in a network. The UNIX system has proved to be useful in client/server computing. The UNIX system also has been the base system for the development of Internet services. UNIX provides an excellent platform for Web servers.


Open Source Code:

UNIX has provision for protecting data and communicating with other users. The source code (Open Source) for the UNIX system has been made available to users and programmers.

History of UNIX:

1965 Bell Laboratories joins with MIT and General Electric in the development effort for the new operating system, Multics, which would provide multi-user, multi-processor, and multi-level (hierarchical) file system, among its many forward-looking features.

1969 AT&T was unhappy with the progress and drops out of the Multics project. Some of the Bell Labs programmers who had worked on this project, Ken Thompson, Dennis Ritchie, Rudd Canaday, and Doug McIlroy, designed and implemented the first version of the Unix File System on a PDP-7 along with a few utilities. It was given the name UNIX by Brian Kernighan as a pun on Multics.

1971 The system now runs on a PDP-11, with 16Kbytes of memory, including 8Kbytes for user programs and a 512Kbyte disk.

Its first real use is as a text processing tool for the patent department at Bell Labs. That utilization justified further research and development by the programming group. UNIX caught on among programmers because it was designed with these features:

o Programmers environment
o Simple user interface
o Simple utilities that can be combined to perform powerful functions
o Hierarchical file system
o Simple interface to devices consistent with file format
o Multi-user, multi-process system
o Architecture independent and transparent to the user.

1973 UNIX is re-written using C, a new language developed by Dennis Ritchie. Being written in this high-level language greatly decreased the effort needed to port it to new machines.

1974 Thompson and Ritchie publish a paper in the Communications of the ACM describing the new Unix OS. This generates enthusiasm in the academic community, which sees a potentially great teaching tool for studying programming systems development. Since AT&T is prevented from marketing the product due to the 1956 Consent Decree, they license it to universities for educational purposes and to commercial entities.

By 1977, the fifth and sixth editions had been released; these contained many new tools and utilities. The number of machines running the UNIX System, primarily at Bell Laboratories and universities, increased to more than 600 by 1978. The seventh edition, the direct ancestor of the UNIX Operating System available today, was released in 1979.

UNIX System III, based on the seventh edition, became AT&T's first commercial release of the UNIX System in 1982. However, after System III was released, AT&T, through its Western Electric manufacturing subsidiary, continued to sell versions of the UNIX system. UNIX System III, the various research editions, and experimental versions were distributed to colleagues at universities and other research laboratories.

A UNIX System Timeline

The following timeline summarizes the development of UNIX from its beginning:

Year  UNIX Variant or Standard          Comments
1969  UNICS (later called UNIX)         A new operating system invented by Ken Thompson and Dennis Ritchie for the PDP-7
1973  Fourth Edition                    Written in C programming language; widely used inside Bell Laboratories
1975  Sixth Edition                     First version widely available outside of Bell Labs; more than 600 machines ran it
1978  3BSD                              Virtual memory
1979  Seventh Edition                   Included the Bourne shell, UUCP, and C; the direct ancestor of modern UNIX
1980  Xenix                             Introduced by Microsoft
1980  4BSD                              Introduced by UC Berkeley
1982  System III                        First public release outside of Bell Labs
1983  System V Release 1                First supported release
1983  4.1BSD                            UC Berkeley release with performance enhancements
1984  4.2BSD                            UC Berkeley release with many networking capabilities
1984  System V Release 2                Protection and locking of files, enhanced system administration, and job control features added
1986  HP-UX                             First version of HP-UX released for HP Precision Architecture
1987  System V Release 3                STREAMS, RFS, TLI added
1987  4.3BSD                            Minor enhancements to 4.2BSD
1988  POSIX                             POSIX.1 published
1989  System V Release 4                Unified System V, BSD, and Xenix
1990  XPG3                              X/Open specification set
1990  OSF/1                             Open Software Foundation release designed to compete with SVR4
1991  Linux 0.01                        Linus Torvalds started development of Linux
1992  SVR4.2                            USL developed version of SVR4 for the desktop
1992  HP-UX 9.0                         Supported workstations including a GUI
1993  Solaris 2.3                       POSIX compliant
1993  4.4BSD                            Final Berkeley release
1993  SVR4.2MP                          Last version of UNIX developed by USL
1994  Linux 1.0                         First version of Linux not considered a "beta"
1994  Solaris 2.4                       Motif supported
1995  UNIX 95                           X/Open mark for systems registered under the Single UNIX Specification
1995  Solaris 2.5                       CDE supported
1995  HP-UX 10.0                        Conformed to the Single UNIX Specification and the Common Desktop Environment (CDE)
1996  Linux 2.0                         Performance improvements and networking software added
1997  Solaris 2.6                       UNIX 95 compliant, JAVA supported
1997  Single UNIX Specification, Ver2   Open Group specification set
1997  System V Release 5 (SVR5)(SCO)    Enhanced SV kernel, including 64-bit support, increased reliability, and performance enhancements
1997  UnixWare 7                        SCO UNIX based on SVR5 kernel
1997  HP-UX 11.0                        64-bit operating system
1998  UNIX 98                           Open Group mark for systems registered under the Single UNIX Specification, Version 2
1998  Solaris 7                         Support for 64-bit applications, free for noncommercial users
1999  Linux 2.2                         Device drivers added

Versions of UNIX Today

With most things in life, where there is active competition the best will ultimately survive and triumph. This is the case with several different versions or flavors of UNIX. Although many different versions exist, a common design and/or code base is present in most of them. Also, two major kinds of UNIX operating system software markets exist today. The commercial market is where customers generally have to pay for the operating system software and generally may not get any source code (well, not for free anyway!). The other market is also commercial, but is considered open source. Open source means that you get full access to the source code of the system or programs and can make changes or modifications to that source code as long as you maintain the rights of the original software owner.

Today, the UNIX leaders include Solaris, Linux, HP-UX, AIX, and SCO.


Why Is UNIX Popular?

Many people ask why UNIX is so popular or why it is used so much, in so many different ways and in so many computing environments. The answer lies with the very nature of UNIX and the model that was used to design, build, and continuously improve the operating system.

Availability of Source Code

One of the most significant points of UNIX is the availability of source code for the system. (For those new to software, source code contains the programming elements that, when passed through a compiler, will produce a binary program, which can be executed.) The binary program contains specific computer instructions, which tell the system "what to do." When the source code is available, it means that the system (or any subcomponent) can be modified without consulting the original author of the program. Access to the source code is a very positive thing and can result in many benefits. For example, if software defects (bugs) are found within the source code, they can be fixed right away, without waiting for the author to do so.

Another great reason is that new software functions can be integrated into the source code, thereby increasing the usefulness and the overall functionality of the software. Having the ability to extend the software to the user's requirements is a massive gain for the end user and the software industry as a whole. Over time, the software can become much more useful. One downside to having access to the source code is that it can become hard to manage, because many different people could have modified the code in unpredictable (and perhaps negative) ways. However, this problem is typically addressed by having a "source code maintainer," who reviews the source code changes before the modifications are incorporated into the original version.

Another downside to source code access is that individuals may use this information with the goal of compromising system or component security. The Internet Worm of 1988 is one well-known example. The author, who was a graduate student at Cornell University at the time, was able to exploit known security problems within the UNIX system to launch a software program that gained unauthorized access to systems and was able to replicate itself to many networked computers. The Worm was so successful in attaching and attacking systems that it caused many of the computers to crash due to the amount of resources needed to replicate. Although the Worm didn't actually cause significant permanent damage to the systems it infected, it opened the eyes of the UNIX community to the dangers of source code access and security on the Internet as a whole.

Flexible Design

UNIX was designed to be modular, which makes it a very flexible architecture. The modularity helps provide a framework that makes it much easier to introduce new operating system tools, applications, and utilities, or to help in the migration of the operating system to new computer platforms or other devices. Although some might argue that UNIX isn't flexible enough for their needs, it is quite adaptable and can handle most requirements. This is evidenced by the fact that UNIX runs on more general computer platforms and devices than any other operating system.

GNU

The GNU project, started in the early 1980s, was intended to act as a counterbalance to the widespread activity of corporate greed and adoption of license agreements for computer software. The "GNU's Not UNIX" project was responsible for producing some of the world's most popular UNIX software.

This includes the Emacs editor and the gcc compiler. They are the cornerstones of the many tools that a significant number of developers use every day.

Open Software

UNIX is open, which basically means that no single company, institution, or individual owns UNIX, nor can it be controlled by a central authority. However, the UNIX name remains a trademark. Anyone using the Internet may obtain open source software, install it, modify it, and then redistribute the software without ever having to shell out any money in the process. The open source movement has made great advances and has clearly demonstrated that quality software can, in fact, be free. Granted, it is quite true that certain versions of UNIX are not open, and you do indeed need to pay to use these operating systems in the form of an end-user licensing agreement. Generally speaking, vendors that charge for UNIX represent only a portion of the total number of UNIX releases available within the UNIX community.

Programming Environment

UNIX provides one of the best development environments available by providing many of the important tools software developers need. Also, there are software tools such as compilers and interpreters for just about every major programming language known in the world. Not only can one write programs in just about any computer language, UNIX also provides additional development tools such as text editors, debuggers, linkers, and related software. UNIX was conceived and developed by programmers for programmers, and it stands to reason that it will continue to be the programmer's development platform of choice now and in the future.


Availability of Many Tools

UNIX comes with a large number of useful applications, utilities, and programs, which many people consider to be one of UNIX's greatest strengths. They are collectively known or commonly referred to as UNIX "tools," and they cover a wide range of functions and purposes. One of the most significant aspects of UNIX is the availability of software to accomplish one or more very specific tasks. You will find throughout this text that the concept of tools is quite universal and is used repeatedly. This book not only discusses the subject of system administration but also provides detailed descriptions of UNIX-based tools. As a system administrator, you will come to depend on certain tools to help you do your job. Just as construction workers rely on the tools they use, so too will the administrator rely on the software that permits them to handle a wide range of functions, tasks, issues, and problems.

There are tools to handle many system administration tasks that you might encounter. Also, there are tools for development, graphics manipulation, text processing, database operations -- just about any user- or system-related requirement. If the basic operating system version doesn't provide a particular tool that you need, chances are that someone has already developed the tool and it is available via the Internet.
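Small, single-purpose tools are usually combined with pipes rather than used alone. The following is a minimal sketch, not taken from the text: the word list is made-up sample data, and the pipeline counts how often each word occurs using only the standard tr, sort, uniq, and head utilities.

```shell
#!/bin/sh
# Sample input (hypothetical data, chosen only for illustration).
words="kernel shell kernel file shell kernel"

# One word per line -> group identical words -> count each group ->
# sort by count, highest first -> keep only the top entry.
top=$(printf '%s\n' "$words" | tr ' ' '\n' | sort | uniq -c | sort -rn | head -n 1)

# Print the most frequent word together with its count.
echo "$top"
```

Each stage does one narrow job, and replacing any stage (for example, head -n 3 for the top three words) changes the result without touching the others.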

System Libraries

A system library is a collection of software that programmers use to augment their applications. UNIX comes with quite a large collection of functions or routines that can be accessed from several different languages to aid the application writer with a variety of tasks. For example, should the need arise to sort data, UNIX provides several different sort functions.

Well Documented

UNIX is well documented with both online manuals and with many reference books and user guides from publishers. Unlike some operating systems, UNIX provides online manual page (man page) documentation of all tools that ship with the system.

Also, it is quite customary that open source tools provide good documentation.

Further, the UNIX community provides journals and magazine articles about UNIX, tools, and related topics of interest.


ARCHITECTURE OF UNIX SYSTEM:

To understand how the UNIX System works, you need to understand its structure. The UNIX Operating System is made up of several major components. Those components include the Kernel, the shell, the file system, and the commands or user programs.

UNIX is a layered operating system. The innermost layer is the hardware that provides the services for the OS. The operating system, referred to in UNIX as the kernel, interacts directly with the hardware and provides the services to the user programs. These user programs don't need to know anything about the hardware. They just need to know how to interact with the kernel, and it's up to the kernel to provide the desired service. One of the big appeals of UNIX to programmers has been that most well written user programs are independent of the underlying hardware, making them readily portable to new systems.

Note: The core of the UNIX system is the Kernel. The kernel controls the computer's resources, allotting them to different users and to different tasks.

User programs interact with the kernel through a set of standard system calls. These system calls request services to be provided by the kernel. Such services include accessing a file: open, close, read, write, link, or execute a file; starting or updating accounting records; changing ownership of a file or directory; changing to a new directory; creating, suspending, or killing a process; enabling access to hardware devices; and setting limits on system resources.

UNIX is a multi-user, multi-tasking operating system. You can have many users logged into a system simultaneously, each running many programs. It's the kernel's job to keep each process and user separate and to regulate access to system hardware, including CPU, memory, disk and other I/O devices.

UNIX utilities or commands are a collection of about 200 programs that service the day-to-day processing requirements. These programs are invoked through the shell, which is itself another utility.

Apart from the utilities that are provided as part of the UNIX operating system, more than a thousand UNIX-based application programs, such as database management systems, word processors, and accounting software, are available.

The basic unit used to organize information in the UNIX System is called a file. The UNIX file system provides a logical method for organizing, storing, retrieving, manipulating, and managing information.


UNIX SHELLS

The Shell reads your commands and interprets them as requests to execute a program or programs, which it then arranges to have carried out. Because the shell plays this role, it is called a command interpreter. Besides being a command interpreter, the shell is also a programming language. As a programming language, it permits you to control how and when commands are carried out. For each user working with UNIX at any time, a separate shell program is running. There may be several shells running in memory, but only one kernel.

UNIX provides several shells, including three major variants:

1. The Bourne shell

2. The C Shell

3. The Korn shell


The original UNIX system shell, sh, was written by Steve Bourne, and as a result it is known as the Bourne shell.

The C shell, csh, was originally developed as part of BSD UNIX. csh introduced a number of important enhancements to sh, including the concept of a command history list and job control.

The Korn shell, ksh, builds on sh and extends it by adding many features from the C shell.

Each of these shells has its own prompt. The Bourne shell has the $ prompt. So when you log in, it is the Bourne shell that is established for you and the stage is set for you to work on the machine.

Features of Shell:

Interactive Processing: It acts as an interface and provides communication between the users and the system.

Background Processing: Time-consuming, non-interactive tasks can proceed while the user continues with other processing.

Input/Output redirection: Programs that interact with a user can be made to take their input from another source, such as a file, and send their output to another destination, such as a printer.

Shell Scripts: A frequently used sequence of shell commands can be stored in a file. The name of the file can later be used to execute the stored sequence with a single command.

Shell Variables: The user can control the behavior of the shell, as well as other programs and utilities, by storing data in variables.
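The features listed above can be seen together in a few lines of Bourne-style shell. This is a minimal sketch under stated assumptions: the file names and the variable names workdir, report, and sort_pid are invented for illustration, and mktemp is assumed to be available (it is common but not universal).

```shell
#!/bin/sh
# Shell variables: store data that later commands can use.
workdir=$(mktemp -d)          # scratch directory for this demo (assumes mktemp exists)
report="$workdir/report.txt"

# Output redirection: send a command's output to a file instead of the terminal.
printf 'pear\napple\nfig\n' > "$report"

# Background processing: '&' starts the job without waiting for it.
sort "$report" > "$workdir/sorted.txt" &
sort_pid=$!                   # process ID of the background job

echo "sorting in the background as process $sort_pid"

# Wait for the background job to finish before using its result.
wait "$sort_pid"

# Input redirection: read the first line from the sorted file.
read -r first < "$workdir/sorted.txt"
echo "first item: $first"

# Clean up the scratch directory.
rm -r "$workdir"
```

Saved in a file and run with sh followed by the file name, the same lines become a shell script: the stored sequence is executed with a single command.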


The File System

The UNIX file system looks like an inverted tree structure. You start with the root directory, denoted by /, at the top and work down through sub-directories underneath it.


Each node is either a file or a directory of files, where the latter can contain other files and directories. You specify a file or directory by its path name, either the full, or absolute, path name or the one relative to a location. The full path name starts with the root, /, and follows the branches of the file system, each separated by /, until you reach the desired file, e.g.:

/home/Sreedhar/source/xntp

A relative path name specifies the path relative to another, usually the current working directory that you are at. Two special directory entries should be introduced now:

.   the current directory
..  the parent of the current directory

So if I'm at /home/frank and wish to specify the path above in a relative fashion I could use:

../Sreedhar/source/xntp

This indicates that I should first go up one directory level, then come down through the Sreedhar directory, followed by the source directory and then to xntp.
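The same walk can be tried safely in a scratch area. The sketch below is illustrative only: it rebuilds the example tree under a temporary directory (the variable top and the use of mktemp are assumptions) and checks that the relative and absolute routes end in the same place.

```shell
#!/bin/sh
# Build a throwaway copy of the example tree under a temporary directory.
top=$(mktemp -d)
mkdir -p "$top/home/frank" "$top/home/Sreedhar/source/xntp"

# Start in the directory that plays the role of /home/frank.
cd "$top/home/frank" || exit 1

# Relative path: up one level (..), then down through Sreedhar and source to xntp.
cd ../Sreedhar/source/xntp || exit 1
relative=$(pwd -P)

# Absolute path: name every directory from the top of the demo tree.
cd "$top/home/Sreedhar/source/xntp" || exit 1
absolute=$(pwd -P)

# Both routes report the same directory.
echo "relative: $relative"
echo "absolute: $absolute"

# Leave the scratch tree before removing it.
cd / && rm -r "$top"
```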

Unix Directories, Files and Inodes

Every directory and file is listed in its parent directory. In the case of the root directory, that parent is itself. A directory is a file that contains a table listing the files contained within it, mapping file names to inode numbers. An inode is a data structure on disk that the kernel reads to learn the information about each file. It specifies the permissions on the file, its ownership, the times of last access, modification, and inode change, and the physical location of the data blocks on the disk containing the file.
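The link between names and inodes can be observed with ls -i, which prints the inode number before each name. This is a small sketch with invented file names (original, alias); it assumes mktemp and awk are available and that the scratch directory is on a file system that supports hard links.

```shell
#!/bin/sh
dir=$(mktemp -d)
echo "hello" > "$dir/original"

# A hard link adds a second directory entry that points at the same inode.
ln "$dir/original" "$dir/alias"

# 'ls -i' prints "<inode number> <name>"; awk keeps only the number.
ino_original=$(ls -i "$dir/original" | awk '{print $1}')
ino_alias=$(ls -i "$dir/alias" | awk '{print $1}')

echo "original -> inode $ino_original"
echo "alias    -> inode $ino_alias"

# 'ls -l' shows data kept in the inode: permissions, link count, owner, size, time.
ls -l "$dir/original"

rm -r "$dir"
```

Two names sharing one inode number is why removing one of them leaves the data reachable through the other.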

The system does not require any particular structure for the data in the fileitself. The file can be ASCII or binary or a combination, and may representtext data, a shell script, compiled object code for a program, directory table,junk, or anything you would like.

T here ‘s no header, tra iler, la bel in form ation or EOF character as part of thefile.
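The link between file names and inode numbers can be observed with the -i option of ls. The sketch below is an illustration only: it assumes a scratch directory from mktemp and uses a hard link (ln) to give one inode two names; the file names are made up.

```shell
#!/bin/sh
dir=$(mktemp -d)
echo "hello" > "$dir/original"
ln "$dir/original" "$dir/alias"     # hard link: a second name for the same inode

ls -i "$dir"                        # the inode number is printed before each name

# Pick the inode number out of the listing for each name.
ino1=$(ls -i "$dir/original" | awk '{print $1}')
ino2=$(ls -i "$dir/alias" | awk '{print $1}')
echo "original -> inode $ino1"
echo "alias    -> inode $ino2"

rm -rf "$dir"
```

The two names show the same inode number, since the directory table only maps names to inodes while the inode itself describes the file.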

Unix Programs

A program, or command, interacts with the kernel to provide the environment and perform the functions called for by the user. A program can be: an executable shell file, known as a shell script; a built-in shell command; or a source compiled, object code file.

The shell is a command line interpreter. The user interacts with the kernel through the shell. You can write ASCII (text) scripts to be acted upon by a shell.

System programs are usually binary, having been compiled from C source code. These are located in places like /bin, /usr/bin, /usr/local/bin, /usr/ucb, etc.


Module 2

Exploring the UNIX Shell:

The shell is a rather unique component of the UNIX operating system since it is one of the primary ways to interact with the system. It is typically through the shell that users execute other commands or invoke additional functions.

The shell is commonly referred to as a command interpreter and is responsible for executing tasks on behalf of the user. Figure 2-1 shows a pictorial view of how the shell fits with the UNIX system. As you can see, the shell operates within the framework just like any other program. It provides an interface between the user, the operating system functions, and ultimately the system kernel.

The UNIX Shell

Another powerful feature of the UNIX shell is the ability to support the development and execution of custom shell scripts. The shell contains a mini programming language that provides a lightweight way to develop new tools and utilities without having to be a heavyweight software programmer. A UNIX shell script is a combination of internal shell commands, regular UNIX commands, and some shell programming rules.

UNIX supports a large number of different shells, and many of the popular ones are freely available on the Internet. Also, many versions of UNIX come with one or more shells, and as the system administrator you can install additional shells when necessary and configure the users of the system to use different shells, depending on specific preferences or requirements. The table below lists many of the popular shells and a general description of each.

Once a user has logged into the system, the default shell prompt appears and the shell simply waits for input from the user. Thus, logging into a Solaris system as the root user for example, the standard Bourne shell prompt will be

#

The system echoes this prompt to signal that it is ready to receive input from the keyboard. At this point, this user is free to type in any standard UNIX command, application, or custom script name and the system will attempt to execute or run the command. The shell assumes that the first argument given

Shell Name   General Description
sh           Standard Bourne shell, which is one of the most popular shells around.
csh          Standard shell with C-like language support.
bash         GNU Bourne-Again shell that includes elements from the Korn shell and C shell.
tcsh         Standard C shell with command-line editing and filename completion capabilities.
ksh          The Korn shell combines the best features of the Bourne and C shells and includes powerful programming tools.
zsh          Korn shell like, but also provides many more features such as built-in spell correction and programmable command completion.

Accessing a UNIX System

The configuration you use to access your UNIX System can be based on one of two basic models: using a multi-user computer or a single-user computer.

On a multi-user system, you use your own terminal device to access the UNIX system. The computer you access can be a workstation, a microcomputer, a mainframe computer, or even a supercomputer.

Single-user systems are personal computers on which you run the UNIX operating system directly. (UnixWare 7.1 by SCO, Solaris 7 from SunSoft, public domain versions of UNIX, and the popular variant of UNIX known as Linux can be used on a single-user system.)


Your display can be character-based, or it can be bit mapped. It may display a single window or multiple windows, as in the X-Windows system.

Before You Start

UNIX System from a PC: Many different application packages, called terminal emulators, run on a PC and enable you to connect to a UNIX system. Terminal emulators all function in the same basic way, in that they act as a terminal attached to the UNIX machine. This allows you to enter commands the same way that you would if you were using a terminal.

UNIX System from a Terminal: If your terminal has not been set to work with a UNIX System, you must have its options set appropriately. Setting options is done in different ways on different terminals.

Selecting a LOGIN: Every UNIX System has at least one person, called the System Administrator, whose job is to maintain the system and make it available to its users. The system administrator is also responsible for adding new users to the system and setting up their initial work environment on the computer.

The login name should be created by the system administrator. In general, a login name (logname) can be almost any combination of letters and numbers, but the UNIX System places some constraints on logname selections:

The login name must be more than two characters long, and if it is longer than eight, only the first eight characters are relevant.

It can contain any combination of lowercase letters and numbers and must begin with a lowercase letter. If you log in using uppercase letters, a UNIX system will assume that your terminal can only receive uppercase letters, and will only send uppercase letters for the entire session.

Your logname should not have any symbols or spaces in it, and it must be unique for each user. Some lognames are reserved customarily for certain uses. For example, root normally refers to the system administrator or superuser who is responsible for the whole system.
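The logname constraints listed above can be expressed as shell patterns. The following is a rough sketch only: valid_logname is a hypothetical helper written for this illustration, not a standard UNIX command, and real systems may enforce different or additional rules.

```shell
#!/bin/sh
LC_ALL=C; export LC_ALL     # make [a-z] mean exactly the lowercase ASCII letters

# valid_logname is a hypothetical helper, not a standard UNIX command.
valid_logname() {
    case $1 in
        *[!a-z0-9]*) return 1 ;;             # symbols, spaces or uppercase letters
        [a-z][a-z0-9][a-z0-9]*) return 0 ;;  # lowercase first letter, 3+ characters
        *) return 1 ;;                       # too short, or starts with a digit
    esac
}

for name in sreedhar ab Root 9lives 'my name' administrator; do
    if valid_logname "$name"; then
        # Only the first eight characters are relevant on such systems.
        echo "ok:       $name (first eight: $(printf '%s' "$name" | cut -c1-8))"
    else
        echo "rejected: $name"
    fi
done
```

The first pattern rejects any character outside lowercase letters and digits, and the second checks both the leading letter and the minimum length of three characters.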

Connecting to a UNIX System:

Direct Connect: With single user workstations and personal computers, and with the primary administration terminal on a multi-user system (console), a cable permanently connects the terminal with the computer. After booting your PC and invoking your terminal emulator or turning on your terminal, hit the carriage return and you should see the UNIX System prompt that says


login:

Dial in Access: You may have to dial into the computer using a modem before you are connected. Use your emulator or dial function to dial the UNIX System access number. When the system answers the call, you will hear a high-pitched tone and you should see some characters appear on screen. Then you get the UNIX system login prompt.

Local Area Network: Another means of connecting your PC or terminal to the UNIX System is via a local area network. A local area network (LAN) is a set of communication devices and cables that connects several PCs or terminals and computers. A number of LAN environments are in use today, such as LAN Manager and NetWare. Each LAN environment provides a set of software that can be used in conjunction with a specialized hardware card at each end of the network, called a NIC (network interface card) or a LAN card, that enables you to connect a client machine to a server machine. The clients and servers may be running Windows or UNIX, or both. The protocol most frequently used to connect a client machine to a UNIX server is TCP/IP, with other protocols such as IPX and SPX also widely used on LANs.

An example of this environment would be a group of Windows PCs connected to a common UNIX server running a UNIX operating system such as UnixWare 7, Solaris, or Linux. This type of environment usually is maintained by a LAN administrator, a person who knows how local area networks work. This is often the same person as the system administrator.

In accessing a UNIX System on a LAN, you first need to configure your PC to be able to recognize the system you wish to connect to.

IP Network: If your PC is connected to an IP network, such as the Internet or an intranet, you can use the telnet command to access any computer on this network that allows such connections. The computer you access may be a UNIX computer, or a computer running some other operating system, and it may be a local computer or one located thousands of miles away. A variety of telnet commands can help you manage a telnet session with the computer you are accessing.

Logging In:

As a multi-user system, the UNIX System first requires that you identify yourself before you access the system.

login: <user login name>


Changing Your Password:

When you first log into a UNIX System, you will have either no password at all (a null password) or an arbitrary password assigned by the system administrator. These are only intended for temporary use. Neither offers any real security. A null password gives anyone access to your account; one assigned by the system administrator is likely to be easily guessed by someone. Officially assigned passwords often consist of simple combinations of your initials and your student, employee, or social security number. If your password is simply your employee number and the letter X, anyone with access to this information has access to all of your computer files. Sometimes random combinations of letters and numbers are used. Such passwords are difficult to remember, and consequently users will be tempted to write them down in a convenient place. (Resist this temptation!)

The passwd Command:

You change your password by using the passwd command. When you issue this command, the system checks to see if you are the owner of the login. This prevents someone from changing your password and locking you out of your own account. passwd first announces that it is changing the password, and then it asks for your (current) old password, like this:

$ passwd

passwd: changing password

Old password:

New password:

Re-enter new password:

$

The system asks for a new password and asks for the password to be verified (you do this by retyping it). The next time you log in, the new password is effective. Although you can ordinarily change your password whenever you want, on some systems after you change your password you must wait a specific period of time before you can change it again.

How to pick a password?

When choosing a password, it is important that it be something that could not be guessed, either by somebody unknown to you trying to break in, or by an acquaintance who knows you. Suggestions for choosing and using a password follow:


Don't
    Use a word (or words) in any language
    Use a proper name
    Use information that can be found in your wallet
    Use information commonly known about you (car license, pet name, etc)
    Use control characters. Some systems can't handle them
    Write your password anywhere
    Ever give your password to *anybody*

Do
    Use a mixture of character types (alphabetic, numeric, special)
    Use a mixture of upper case and lower case
    Use at least 6 characters
    Choose a password you can remember
    Change your password often
    Make sure nobody is looking over your shoulder when you are entering your password

Caution: If you do forget your password, there is no way to retrieve it. Because it is encrypted, even your system administrator cannot look up your password. If you cannot remember it, the administrator will have to give you a new password.

Changing a Password at Initial Login

On some systems, you will be required to change your password the first time you log in. This will work as described previously and will look like this:

login: sreedhar
Password:
Your password has expired.
Choose a new one.
Old password:
New password:
Re-enter new password:

Password Aging

To ensure the secrecy of your password, you will not be allowed to use the same password for long stretches of time. On UNIX Systems, passwords age. When yours gets to the end of its lifespan, you will be asked to change it. The length of time your password will be valid is determined by your system administrator. However, you can view the status of your password on most UNIX systems. Generally, the -s option to the passwd command shows you the status of your password, like this:


$ passwd -s
rayjay PW 04/01/99 7 30 5

rayjay     name
PW         passwd status
04/01/99   date last changed
7          min days between changes
30         max days between changes
5          days before user will be warned to change password

The first field contains your login name; the next fields list the status of your password, the date it was last changed, and the minimum and maximum days allowed between password changes; and the last field is the number of days before expiry on which you will be warned to change your password. Note that this is simply an example; on your system, you may not be allowed to read all of these fields.

An Incorrect Login

If you make a mistake in typing either your login or your password, the UNIX System will respond this way:

login: sreedhar
Password:
Login Incorrect
login:

You will receive the "Password:" prompt even if you type an incorrect or nonexistent login name. This prevents someone from guessing login names and learning which one is valid by discovering one that yields the "Password:" prompt. Because any login results in "Password:" an intruder cannot guess login names in this way.

If you repeatedly type your login or password incorrectly (three to five times, depending on how your system administrator has set the default), the UNIX System will disconnect your terminal if it is connected via modem or LAN. On some systems, the system administrator will be notified of erroneous login attempts as a security measure. If you do not successfully log in within some time interval (usually a minute), you will be disconnected.

If you have problems logging in, you might also check to make sure that your CAPS LOCK key has not been set. If it has been set, you will inadvertently enter an incorrect logname or password, because in UNIX uppercase and lowercase letters are treated differently. (Note that unlike in some other environments, your account will not get locked if you enter your password incorrectly some number of times; you will just get disconnected.)


When you successfully enter your login and password, the UNIX System responds with a set of messages, similar to this:

login: sreedhar
Password:
UNIX System V/386/486 Release 4.0 Version 3.0
minnie
Copyright (c) 1984, 1986, 1987, 1988, 1989, 1990 AT&T
Copyright (C) 1987, 1988 Microsoft Corp.
Copyright (C) 1990, NCR Corp.
All Rights Reserved
Last login: Mon January 29 19:55:17 on term/17

You first see the UNIX System announcement that tells you the particular version of UNIX you are using. Next you see the name of your system, minnie in this case. This is followed by the copyright notice.

Finally, you see a line that tells you when you logged in last. This is a security feature. If the time of your last login does not agree with when you remember logging in, call your system administrator. This discrepancy could be an indication that someone has broken into your system and is using your login.

After this initial announcement, the UNIX System presents system messages and news.

Message of the Day (MOTD)

Because every user has to log in, the login sequence is the natural place to put messages that need to be seen by all users. When you log in, you will first see a message of the day (MOTD). Because every user must see this MOTD, the system administrator (or root) usually reserves these messages for comments of general interest, such as this:

Attention ALL Users !!!
minnie will be coming down on Sunday Feb. 5, 2007 from
8:00am until 12:00pm (noon) for system maintenance. Please
schedule your work accordingly. Thank you.

The UNIX System Prompt

After you log in, you will see the UNIX System command prompt at the far left side of the current line. The default system prompt (for most UNIX Systems) is the dollar sign:


$

This $ is the indication that the UNIX System is waiting for you to enter a command.

In the examples in this book, you will see the $ at the beginning of a line as it would be seen on the screen, but you are not supposed to type it.

The command prompt is frequently changed by users. Users who have accounts on different machines may use a different prompt on each one to remind them which computer they are using. Some users change their prompt to tell them where they are in the UNIX file system, or you may simply find the $ symbol unappealing and wish to use a different symbol or set of symbols that you find more attractive. It is simple to do this.

The UNIX System enables you to define a prompt string, PS1, which is used as a command prompt. The symbol PS1 is a shell variable (see Chapter 7) that contains the string you want to use as your prompt. To change the command prompt, set PS1 to some new string. For example,

$ PS1="UNIX:> "

changes your primary prompt string from whatever it currently is to the string "UNIX:> ". From that point, whenever the UNIX System is waiting for you to enter a command, it will display this new prompt at the beginning of the line. You can change your prompt to any string of characters you want. You can use it to remind yourself which system you are on, like this:

$ PS1="MyUnix-> "
MyUnix->

or simply to give yourself a reminder:

$ PS1="Leave at 4:30 PM> "
Leave at 4:30 PM>

If you redefine your prompt, it stays effective until you change it or until you log off. Later in this chapter, you will learn how to make these changes automatically when you first log in.
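As a further illustration of a prompt that reminds you which machine you are on, the prompt string can be built from the system name. This is a sketch for Bourne-compatible shells (sh, ksh, bash); using uname -n to obtain the system name is a choice made for this example, and the assignment must be written without spaces around the = sign.

```shell
# Type these at the prompt of a Bourne-compatible shell (sh, ksh, bash).
PS1="$(uname -n)> "     # uname -n prints the system (node) name

# Writing PS1 = "..." with spaces would instead be read as a command
# named PS1 with two arguments, so the prompt would not change.
echo "The prompt string is now: $PS1"
```

On a system called minnie, the prompt would then read minnie> until you change it again or log off.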

Some Basic UNIX Commands

Entering Commands on UNIX Systems

The UNIX System makes a large number of programs available to the user. To run one of these programs you issue a command. For example, when you type news or passwd, you are really instructing the UNIX System command interpreter to execute a program with the name news or passwd, and to display the results on your screen.


Some commands simply provide information to you; news works this way. An often-used command is date, which prints out the current day, date, and time. There are hundreds of other commands, and you will learn about many of them in this book. Different variants of the UNIX system share a large common set of commands (sometimes different names are used for the same command in different UNIX variants) and provide other commands that are unique for that particular version of UNIX.

Unix Command Line Structure

The UNIX system offers several file and directory related commands which users can apply according to their requirements.

A command is a program that tells the Unix system to do something. It has the form:

command [options] [arguments]

where an argument indicates on what the command is to perform its action, usually a file or series of files. An option modifies the command, changing the way it performs.

Commands are case sensitive. command and Command are not the same.

Options are generally preceded by a hyphen (-), and for most commands, more than one option can be strung together, in the form:

command -[option][option][option]

e.g.: ls -alR

will perform a long list on all files in the current directory and recursively perform the list through all sub-directories.

For most commands you can separate the options, preceding each with a hyphen, e.g.:

command -option1 -option2 -option3

as in:

ls -a -l -R

Some commands have options that require parameters. Options requiring parameters are usually specified separately,

e.g.: lpr -P printer3 -# 2 file

will send 2 copies of file to printer3.


These are the standard conventions for commands. However, not all Unix commands will follow the standard. Some don't require the hyphen before options and some won't let you group options together, i.e. they may require that each option be preceded by a hyphen and separated by white space from other options and arguments.

Options and syntax for a command are listed in the man page for the command.
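The equivalence of grouped and separated options can be checked directly with ls. The sketch below assumes a scratch directory created with mktemp; the file and directory names are invented for the illustration.

```shell
#!/bin/sh
top=$(mktemp -d)
mkdir -p "$top/demo/sub"
touch "$top/demo/.hidden" "$top/demo/visible" "$top/demo/sub/nested"
cd "$top/demo"

grouped=$(ls -alR)         # options strung together after one hyphen
separate=$(ls -a -l -R)    # the same options, each with its own hyphen

if [ "$grouped" = "$separate" ]; then
    echo "ls -alR and ls -a -l -R produced the same listing"
fi

cd / && rm -rf "$top"
```

Both forms ask for a long (-l), recursive (-R) listing that includes hidden files (-a), so the two captured listings are identical.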

UNIX Commands:

UNIX comes with a large number of commands that fall under each of the categories listed above for both the generic user and the system administrator. It is quite hard to list and explain all of the available UNIX functions and/or commands in a single book. Therefore, a review of some of the more important user-level commands and functions has been provided, and subsequent modules provide a more in-depth look at system-level commands. All of the commands discussed below can be run by generic users and of course by the system administrator. However, one or more subfunctions of a command may be available only to the system administrator.

The standard commands, which are available across many different versions of UNIX, are listed below. For example, if we wanted to get a listing of all the users that are currently logged into the system, the who command can be used.

UNIX Command   Meaning
cat            Show the content of file.
date           Show system date and time.
hostname       Display name of system.
find           Search for a specific file.
grep           Search a file for specified pattern.
ls             List files in a directory.
more           Another command to show content of file.
ps             Show status of processes.
who            Show current users on the system.


Metacharacters and Wildcards

The metacharacters have special meaning to the shell; they should not normally be used as any part of a file name.

The "-" symbol can usually be used in a filename provided it is not the first character. For example, if we had a file called -l then issuing the command ls -l would give you a long listing of the current directory because the ls command would think the -l was an option rather than a file name argument. Some UNIX commands provide facilities to overcome this problem.

The shell offers certain special characters, called wild card characters, that help us to specify patterns. The shell matches the pattern against the file names, selects all the files whose names match the pattern, and applies the specified command to them. The wild card characters are as follows:

* : This wild card character matches any number of characters. Therefore, in any pattern that contains the symbol *, it will be replaced by any number of any characters.


The wildcard ? is expanded by the shell to match any single character in a file name. The exception is that the ? will NOT match a dot "." as the first character of a file name (for example, in a hidden file).

The wildcard * is expanded by the shell to match zero to any number of characters in a file name. The single * will be expanded to mean all files in the current directory except those beginning with a dot. Beware of the command rm * which could cause serious damage by removing all files!

Specifying Multiple File Names

Multiple filenames can be specified using special pattern-matching characters. The rules are:

    '?' matches any single character in that position in the filename.
    '*' matches zero or more characters in the filename. A '*' on its own will match all files. '*.*' matches all files containing a '.'.
    Characters enclosed in square brackets ('[' and ']') will match any filename that has one of those characters in that position.
    A list of comma separated strings enclosed in curly braces ("{" and "}") will be expanded as a Cartesian product with the surrounding characters.

For example:

1. ??? matches all three-character filenames.
2. ?ell? matches any five-character filename with 'ell' in the middle.
3. he* matches any filename beginning with 'he'.
4. [m-z]*[a-l] matches any filename that begins with a letter from 'm' to 'z' and ends in a letter from 'a' to 'l'.
5. {/usr,}{/bin,/lib}/file expands to /usr/bin/file /usr/lib/file /bin/file and /lib/file.

Note that the UNIX shell performs these expansions (including any filename matching) on a command's arguments before the command is executed.

Example
*.c

includes all files ending with '.c' because * stands for any number of any characters, e.g. new.c, ptr.c, str.c etc.

A command like rm *.c will therefore delete all files ending with '.c'. The other files which do not end with '.c' will be retained. The pattern specifies that the files must necessarily end with '.c'.

? : This wild card specifies any one character. Therefore, if the wild card ? appears in a pattern, it will be replaced by any one character.

Example
cat ab?xy

The above command will display the contents of all files whose name starts with ab followed by any one character followed by xy.

[ ] : This wild card specifies any one of the characters listed within the [ ].

Example
rm ab[efg]yz

The above command will delete all the files that begin with ab followed by either e, f, or g followed by yz.
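A safe way to see what a pattern will match, before handing it to a command such as rm, is to give it to echo first. The sketch below works in a scratch directory created with mktemp; all file names are invented for the illustration.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir"
touch new.c ptr.c str.c notes.txt abcxy abexy abeyz abfyz abhyz .hidden

echo *.c           # prints: new.c ptr.c str.c
echo ab?xy         # prints: abcxy abexy
echo ab[efg]yz     # prints: abeyz abfyz
echo *             # every name except .hidden, which starts with a dot

# Keep the expansions so they can be inspected afterwards.
c_files=$(echo *.c)
one_char=$(echo ab?xy)
from_set=$(echo ab[efg]yz)

cd / && rm -rf "$dir"
```

Because the shell expands the pattern before the command runs, echo simply receives the list of matching names and prints it.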

PIPES

UNIX offers a provision whereby the output of one program can be made the input of another program. The two programs are separated by the | symbol.

Example
$ cat fil.c | pg


The above command will display the contents of the file fil.c page by page because the output is piped to a program called pg which displays the output only one screenful at a time.

UNIX Standard Files:

Three files are automatically opened for each process in the system. These files are referred to as standard input, standard output and standard error.

Standard input, sometimes abbreviated to stdin, is where a command expects to find its input, usually the keyboard.

Standard output (stdout) and standard error (stderr) are where the command expects to put its output, usually the screen.

These defaults can be changed using redirection.
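Redirection of the three standard files can be sketched as follows, assuming a Bourne-compatible shell. The operators used are >, >>, 2> and <; the file names and the nonexistent path are made up for the illustration.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir"

echo "first line"  > out.txt     # >  sends standard output to a file (overwriting it)
echo "second line" >> out.txt    # >> appends standard output to the file
ls /no/such/path 2> err.txt      # 2> sends standard error to a file
wc -l < out.txt                  # <  takes standard input from a file

lines=$(wc -l < out.txt)         # number of lines collected in out.txt
echo "out.txt holds $lines lines; the error message went to err.txt"
```

Nothing from the failed ls appears on the screen, because its error message was written to err.txt instead of the terminal.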


Note: Remember that in AIX, not all file names refer to real data files!

Some files may be "special files" which in reality are a pointer to some of the devices on the system (for example /dev/tty0).


Two or more commands can be separated by a pipe on a single command line. The requirement is that any command to the left of a pipe must send output to standard output.

Any command to the right of the pipe must take its input from standard input.

The example on the visual shows that the output of who is passed as input to wc -l, which gives us the number of active users on the system.


A command is referred to as a filter if it can read its input from standard input, alter it in some way, and write its output to standard output. A filter can be used as an intermediate command between pipes.

A filter is commonly used with a string of piped commands, as in the example above. The ls -l command lists all the files in the current directory and then pipes this information to the grep command. The grep command will be covered in more detail later in the course, but in this example, the grep command is used to find all lines beginning with a d (directories).

The output of the grep command is then piped to the wc -l command. The result is that the command is counting the number of directories. In this example, the grep command is acting as a filter.
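The two pipelines described in the text can be written out as below. This is a sketch run in a scratch directory; the directory and file names are invented, and the pattern '^d' is one way of selecting the lines that begin with a d.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir"
mkdir alpha beta
touch file1 file2 file3

who | wc -l                          # number of login sessions on the system

# ls -l writes one line per entry; grep keeps the lines starting with d;
# wc -l counts what is left.
dirs=$(ls -l | grep '^d' | wc -l)
echo "directories here: $dirs"
```

With two directories and three plain files in the scratch directory, the second pipeline counts 2.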

Placing multiple commands separated by a ";" on a single line produces the same result as entering each command on a separate command line. There need be no association between the two commands.


The \ must be the last character on the line and immediately followed by pressing Enter.

Do not confuse the continuation prompt > with the redirection character >. The secondary prompt will not form part of the completed command line. If you require a redirection character you must type it explicitly.
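Command separation with ; and line continuation with \ can both be tried in a short script. This is a minimal sketch; the commands chosen (date, uname -n, echo) are arbitrary examples.

```shell
#!/bin/sh
# Two unrelated commands on one line, separated by a semicolon.
date; uname -n

# One command spread over two lines: the backslash is the very last
# character on the first line, so the shell joins the lines together.
msg=$(echo one two \
three)
echo "$msg"        # prints: one two three
```

When the second command is typed interactively, the shell shows the secondary prompt (normally >) at the start of the continued line; that > is not part of the command.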

Module 3

Processes:

A program or a command that is actually running on a system is referred to as a process.

UNIX can run a number of different processes at the same time as well as many occurrences of a program (such as vi) existing simultaneously in the system.

The process ID (PID) is extracted from a process table.

In a shell environment, the process ID is stored in the variable $$.

To identify the running processes, execute the command ps, which will be covered later in this course. For example, ps -u team01 shows all running processes from user team01.


ps prints information only about processes started from your current terminal. Only the Process ID, Terminal, Elapsed Time and Command are displayed. The -e option displays information about EVERY process running in the system.

The -f option, in addition to the default information provided by ps, displays the User Name, PPID, and start time for each process (that is, a FULL listing).

The -l option displays the User ID, PPID and priorities for each process in addition to the information provided by ps (that is, a LONG listing).
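The ps options described above can be tried as follows. This is a sketch only: the exact columns differ between UNIX variants, and the use of id -un to obtain the current user name and of -p to select a single process are choices made for this example.

```shell
#!/bin/sh
mypid=$$                   # $$ holds the process ID of the current shell
echo "This shell is process $mypid"

ps                         # processes started from the current terminal
ps -f -p "$mypid"          # FULL listing for this shell only
ps -l -p "$mypid"          # LONG listing for this shell only
ps -u "$(id -un)"          # all processes owned by the current user
ps -e | wc -l              # rough count of every process (includes the header line)
```

Comparing the -f and -l listings for the same PID shows which extra columns, such as PPID and priority, each option adds.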


Processes that are started from and require interaction with the terminal are called foreground processes. Processes that are run independently of the initiating terminal are referred to as background processes.

Background processes are most useful with commands that take a long time to run.

A process can only be run in the background if:

1. It doesn't require keyboard input, and
2. It is invoked with an ampersand & as the last character in the command line.

Notes: The <ctrl-c> may not always work. A shell script or program can trap the signal a <ctrl-c> generates and ignore its meaning.


You can stop a foreground process by pressing <ctrl-z>. This does not terminate the process; it suspends it so that you can subsequently restart it.

To restart a suspended process in the background, use the bg command. To bring a suspended or background process into the foreground, use the fg command.

To find out what suspended/background jobs you have, issue the jobs command.

The bg, fg, kill commands can be used with a job number. For instance, to kill job number 3, you can issue the command: kill %3

The jobs command does not list jobs that were started with the nohup command if the user has logged off and then logged back into the system. On the other hand, if a user invokes a job with the nohup command and then issues the jobs command without logging off, the job will be listed.
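The jobs, fg and bg commands are meant for interactive use, so they are awkward to show in a script. The sketch below instead starts a background process from a script and stops it by process ID, using the shell variable $! (the PID of the most recent background command) and the wait command; these two are standard shell features not covered in the text above.

```shell
#!/bin/sh
sleep 30 &                 # the trailing & runs the command in the background
bgpid=$!                   # $! is the PID of the most recent background command
echo "started background sleep as process $bgpid"

kill "$bgpid"              # terminate it by PID (interactively: kill %1)
wait "$bgpid" 2>/dev/null  # collect its exit status
status=$?
echo "background process ended with status $status"
```

A status above 128 indicates that the process was ended by a signal rather than finishing on its own.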


Module 4

Shell Script:

A shell script is a simple text file that contains UNIX commands.

When a shell script is executed, the shell reads the file one line at a time and processes the commands in sequence.

Any UNIX command can be run from within a shell script. There are also a number of built-in shell facilities which allow more complicated functions to be performed. These will be illustrated later.

Any UNIX editor can be used to create a shell script.


A shell script is a collection of commands in a file. In the example a shell script hello is shown.

To execute this script, start the program ksh and pass the name of the shell script as argument:

$ ksh hello

This shell reads the commands from the script and executes all commands line by line.
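A script of this kind can be created and run as sketched below. The file name hello follows the text, but its contents here are invented for the illustration, and sh is used in place of ksh on the assumption that ksh may not be installed everywhere.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir"

# Create a two-line script called hello (its contents are illustrative).
cat > hello <<'EOF'
echo "Hello from a shell script"
date
EOF

sh hello                        # the shell reads hello and runs it line by line

first=$(sh hello | head -n 1)   # capture the first line of the script's output
echo "first line of output: $first"
```

Where the Korn shell is available, ksh hello runs the same file in the same way.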

The .profile file

After a user logs in and as part of starting up the user's shell, two profile files are executed. The first is the system profile /etc/profile, which is run by every user, and the second is the .profile in the user home directory, which is only run by the user who owns it.

The .profile contains a sequence of commands that help you customize your environment. Because the .profile is read each time you log in, the commands you put in this file to customize your environment will be executed at the start of every login session.

These commands can include, but are certainly not limited to, the following:

1. aliases
2. terminal control characteristics
3. creation/definition of shell environment variables (including your prompt)

The first file that the operating system uses at login is the /etc/environment file. This file contains variables specifying the basic environment for all processes and can only be changed by the system administrator.

The second file that the operating system uses at login time is the /etc/profile file. This file controls system-wide default variables such as the mail messages and terminal types.

/etc/profile can only be changed by the administrator.

The .profile file is the third file read at login time. It resides in a user's login directory and enables a user to customize their individual working environment. The .profile file overrides commands run and variables set and exported by the /etc/profile file.

Ensure that newly created variables do not conflict with standard variables such as MAIL, PS1, PS2 and so forth.

At startup time the shell checks to see if there is any new mail in /usr/spool/mail/$LOGNAME. If there is, then MAILMSG is echoed back. In normal operation, the shell checks periodically.

The ENV="$HOME/.kshrc" variable will cause the file $HOME/.kshrc to be run every time a new Korn shell is explicitly started. This file will usually contain Korn shell specifics.

The .profile file is read only when the user logs in.
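A .profile along the lines described above might look like the following sketch. Every value in it (the alias name ll, the erase character, the prompt string, the editor path) is an assumption chosen for illustration, not a recommended setting.

```sh
# Sketch of a $HOME/.profile; all values are illustrative assumptions.

# 1. An alias (ll is a hypothetical name).
alias ll='ls -l'

# 2. Terminal control characteristics, applied only when a terminal is attached.
if [ -t 0 ]; then
    stty erase '^H'
fi

# 3. Shell environment variables, including the prompt.
PS1='$ '
EDITOR=/usr/bin/vi
ENV="$HOME/.kshrc"      # file to be run by each new Korn shell
export PS1 EDITOR ENV
```

Exporting the variables makes them visible to programs started from the login shell.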

Be aware that your .profile file may not be read if you are accessing the system through CDE (the Common Desktop Environment). By default, CDE instead uses a file called .dtprofile. In the CDE environment, if you wish to use the .profile file, it is necessary to uncomment the DTSOURCEPROFILE variable assignment at the end of the .dtprofile file.

Module 5

Overview

The tilde (~) Expansion:

The C shell provides an easy way to abbreviate the pathname of your home directory. When the tilde symbol (~) appears at the beginning of a word in your command line, the shell replaces it with the full pathname of your login directory.

Example:

% mv file ~/newfile

is the abbreviated way of typing this:

% mv file $home/newfile
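The Korn shell and other POSIX shells perform the same expansion; there the home directory variable is spelled HOME rather than the C shell's home. A minimal sketch, assuming HOME is set as it normally is after login (the file name newfile follows the example above):

```sh
# A leading tilde is replaced by the home directory when the word is expanded.
expanded=~/newfile
echo "$expanded"

# The same path written out with the HOME variable.
spelled_out=$HOME/newfile
echo "$spelled_out"
```

Both echo commands should print the same path.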

The whence Command

The whence command can be used to determine exactly where the command you specify is located. For instance, it may be a command located on the disk drive, it may be an alias, or it may be built in to the Korn shell. whence reports the proper location.

whence

$ whence ls          <works with basic commands>
/bin/ls

$ whence dir         <works with aliases>
/bin/ls -al | more

$ whence echo        <works with built-in commands>
echo
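whence is a Korn shell command and may be missing from other shells. As a hedged alternative, the POSIX command -v utility gives similar answers; the sketch below assumes ls is installed as an ordinary program and that echo is a built-in of the shell in use.

```sh
# command -v prints a path name for a program on disk
# and the bare name for a shell built-in.
ls_location=$(command -v ls)
echo_location=$(command -v echo)

echo "ls   -> $ls_location"
echo "echo -> $echo_location"
```

The exact path printed for ls depends on the system.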

Aliases

Aliases in the Korn shell allow you to create your own commands. You can simply rename existing commands, or you can group commands together to create entirely new commands. This feature is also available in the C shell, but the command syntax is slightly different.

The ksh syntax for alias commands:

alias name='value'

The ENV variable specifies a Korn shell script to be invoked every time a new shell is created. The shell script in this example is .kshrc (which is the standard name used), but any other filename can also be used.

The difference between .profile and .kshrc is that .kshrc is read each time a subshell is spawned, whereas .profile is read once at login.

You can also set the following variable in $HOME/.profile:

EDITOR=/usr/bin/vi
export EDITOR

It will do the same thing that the set -o vi command does, as shown in the example.

The alias command invoked with no arguments prints the list of aliases in the form name=value on standard output.

The Korn shell sets up a number of aliases by default. Notice that the history and r commands are in fact aliases of the fc command. Once this alias is established, typing r will re-execute the previously entered command.

To carry down the value of an alias to subsequent subshells, the ENV variable has to be modified. The ENV variable is normally set to $HOME/.kshrc in the .profile file (although you can set ENV to any shell script). By adding the alias definition to the .kshrc file (by using one of the editors) and invoking the .profile file, the value of the alias will be carried down to all subshells, because the .kshrc file is run every time a Korn shell is explicitly invoked.

The file pointed to by the ENV variable should contain Korn shell specifics.

The unalias command will cancel the alias named. The names of the aliases specified with the unalias command will be removed from the alias list.
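The life cycle of an alias can be sketched as follows. The alias name ll and its value are hypothetical, and the exact format printed by alias differs slightly between shells.

```sh
# Define an alias and print its definition.
alias ll='ls -l'
alias ll

# Remove it again; afterwards the shell no longer knows the name.
unalias ll
if alias ll >/dev/null 2>&1; then
    alias_state="still defined"
else
    alias_state="removed"
fi
echo "ll is $alias_state"
```

An alias defined this way lasts only for the current shell, which is why the text places permanent definitions in the file named by ENV.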

The /etc/environment file contains default variables set for each process. Only the system administrator can change this file.

PATH is the sequence of directories that is searched when looking for a command whose path name is incomplete.

TZ is the time zone information.

LANG is the locale name currently in effect.

LOCPATH is the full path name of the location of National Language Support information, part of this being the National Language Support Table.

NLSPATH is the full path name for messages.

Module 6

The vi Editor

It is important to know vi for the following reasons:

• It is the only editor available in maintenance mode on RISC System/6000
• Standard editor across all UNIX systems
• Command-line editing feature
• Used as default editor for some programs

This unit covers only a subset of the vi functions. It is a very powerful editor. Refer to the online documentation for additional functions.

vi does its editing in a buffer. When a session is initiated, one of two things happens:

• If the file to be edited exists, a copy of the file is put into a buffer in /tmp by default.
• If the file does not exist, an empty buffer is opened for this session.

Tildes represent empty lines in the editor.

The editor starts in command mode.

Module 7

The Variables:

There are a number of variables automatically set by the shell when it starts. These allow you to reference arguments on the command line.

User Variables

A variable name may consist of letters, digits, and underscores, and must not begin with a digit. The sample session below creates a variable called person and initializes it with the string Sreedhar.

It is important to note that you must NOT precede or follow the equal sign with a space or TAB character.

Sample Session:

$ person=Sreedhar

The next sample session shows that the bare word person does not represent the string Sreedhar. The string person is echoed as person. The BourneShell will only do the substitution of the value of the variable when the name of the variable is preceded with a dollar sign ($).

Sample Session:

$ echo person
person
$ echo $person
Sreedhar
$

If you want to have embedded spaces in a variable, it is necessary to quote the string.

Sample Session:

$ person='Sreedhar and Venkatesh'
$ echo $person
Sreedhar and Venkatesh
$

Shell variables are an integral part of shell programming. They provide the ability to store and manipulate information within a shell program.

All shell variable names are case sensitive. For example, HOME and home are not the same.

As a convention, uppercase names are used for the standard variables set by the system and lowercase names are used for the variables set by the user.

The set command, given without arguments, displays all the variables currently set, together with their values. The set command is a built-in command of the shell, and therefore gives a different output depending on the shell being run, for instance a Bourne or a Korn shell.

The echo command displays the string of text to standard out (by default to the screen).

To set a variable, use the = with NO SPACES on either side. Once the variable has been set, to refer to the value of that variable precede the variable name with a $. There must be NO SPACE between the $ and the variable name.

Notice there need not be a space BEFORE the $ of the variable in order for the shell to do variable substitution. Note, though, what happened when there was no space AFTER the variable name. The shell searched for a variable whose name was xylong, which did not exist. When a variable that has not been defined is referenced, the user does not get an error. Rather, a null string is returned.

To eliminate the need for a space after the variable name, the curly braces { } are used.

Note that the $ is OUTSIDE of the braces.
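The three cases described above can be tried side by side. This is a small sketch; the variable xy and its value abc are assumptions made for illustration, and xylong is assumed to be unset.

```sh
xy=abc                      # hypothetical variable

with_space="$xy long"       # the space ends the variable name
no_braces="$xylong"         # the shell looks up xylong, which is not set
with_braces="${xy}long"     # the braces mark where the name ends

echo "[$with_space] [$no_braces] [$with_braces]"
```

The middle pair of brackets comes out empty because an undefined variable yields a null string rather than an error.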

A variable can be set to the output of some command or group of commands by using the backquotes (also referred to as grave accents). They should not be mistaken for single quotes. In the examples the output of the date and who commands are stored in variables.

The backquotes are supported by the Bourne shell, C shell and Korn shell. The $(command) form was introduced by the Korn shell; it is not understood by the original Bourne shell, although POSIX shells accept it as well.
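Both forms of command substitution are sketched below. The variable names are hypothetical, and date and basename are used only because they are available almost everywhere.

```sh
# Backquote form, understood by the Bourne, C, and Korn shells.
today=`date +%Y`

# $(...) form; it can be nested without extra escaping.
year=$(date +%Y)
editor_name=$(basename /usr/bin/vi)

echo "$today $year $editor_name"
```

In each case the shell runs the command first and then stores whatever it printed, minus the trailing newline, in the variable.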

Read-Only User Variables

The contents of the user variables and the shell variables can be modified by the user. It is possible to assign a new value to them. The new value can be assigned from the dollar ($) prompt or from inside a BourneShell script.

Read-only variables are different. The value of read-only variables cannot be changed.

The variable must be initialized to some value; and then, by entering the following command, it can be made read only.

Command format: readonly variable_name

variable_name = name of the variable to be made read only

Sample Session:

$ person=Sreedhar
$ readonly person
$ echo $person
Sreedhar
$ person=Venkatesh
person: is read only
$

The readonly command given without any arguments will display a list of all the read-only variables.

Sample Session:

$ person=Sreedhar
$ readonly person
$ example=Venkatesh
$ readonly example
$ readonly
readonly person
readonly example
$

Read-Only Shell Variables

The read-only shell variables are similar to the read-only user variables, except the value of these variables is assigned by the shell, and the user CANNOT modify them.

Name of the Calling Program

The shell will store the name of the command you used to call a program in the variable named $0.

It has the number zero because it appears before the first argument on the command line.

Sample Session:

$ cat name_ex
echo 'The name of the command used'
echo 'to execute this script was' $0
$ name_ex
The name of the command used
to execute this script was name_ex
$

Arguments

The BourneShell will store the first nine command line arguments in the variables named $1, $2, ..., $9. These variables appear in this section because you cannot change them using the equal sign. It is possible to modify them using the set command.

Sample Session:

$ cat arg_ex
echo 'The first five command line'
echo 'arguments are' $1 $2 $3 $4 $5
$ arg_ex Sreedhar Venkatesh Santhosh
The first five command line
arguments are Sreedhar Venkatesh Santhosh
$

The script arg_ex will display the first five command-line arguments. Only three were supplied here, so the variables $4 and $5 have a null value.

The BourneShell variable $* represents all of the command-line arguments as shown in the following example.

Sample Session:

$ cat display_all
echo $*
$ display_all Sreedhar Venkatesh Santhosh
Sreedhar Venkatesh Santhosh
$

The BourneShell variable $# contains the number of arguments on the command line. This is a string variable that represents a decimal number. You can use the expr utility to perform calculations with that number and test to perform logical tests on it.

Sample Session:

$ cat num_args
echo 'This script was called with'
echo $# 'arguments'
$ num_args Sreedhar Venkatesh Santhosh
This script was called with
3 arguments
$
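The use of expr and test with $# can be sketched as below. The function describe_args is a hypothetical stand-in for a script, used so that the example runs from a single file; inside a function $# counts the function's own arguments in the same way.

```sh
# Report on the number of arguments received.
describe_args() {
    if test $# -eq 0; then
        echo "no arguments"
    else
        # expr does the arithmetic on the string held in $#.
        remaining=`expr $# - 1`
        echo "$# arguments, $remaining after the first"
    fi
}

describe_args
describe_args Sreedhar Venkatesh Santhosh
```

The spaces around the minus sign matter, because expr expects each operand and operator as a separate argument.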

BourneShell Environment - Exporting Variables

Within a process, you can declare, initialize, read, and modify variables. The variable is local to that process. When a process forks a child process, the parent process does not automatically pass the value of the variable to the child process.

Here is an example of the variables not being exported.

Sample Session:

$ cat no_export
car=mercedes       # set the variable
echo $0 $car $$    # $0   = name of file executed
                   # $car = value of variable car
                   # $$   = PID number (process id)
inner              # execute another BourneShell script
echo $0 $car $$    # display same as above
$ cat inner
echo $0 $car $$    # display variables for this process
$ chmod a+x no_export
$ chmod a+x inner
$ no_export
no_export mercedes 4790
inner 4792
no_export mercedes 4790
$

When no_export was executed, it assigned a value of mercedes to the variable car and printed it out. The call to inner created a child process. Its PID is 4792, while the parent PID is 4790. Notice that when inner tried to print the value of car, it printed nothing, because the value of car was not passed by the parent.

Can the value be passed from parent to child process? Yes, by using the export command. Let's look at an example.

Sample Session:

$ cat export_it
car=mercedes
export car
echo $0 $car $$
inner1
echo $0 $car $$
$ cat inner1
echo $0 $car $$
car=chevy
echo $0 $car $$
$ chmod a+x export_it
$ chmod a+x inner1
$ export_it
export_it mercedes 4798
inner1 mercedes 4800
inner1 chevy 4800
export_it mercedes 4798
$

In the export_it BourneShell script, the variable car was initialized to mercedes; and then it was exported. This means that the value of car is now available to a child process. When inner1 prints out the value of car it has the value of mercedes. This is as we expect because the value of car was exported from the parent. The next line of inner1 changes the value of car to chevy. This is shown in the next line of the sample session. The last line of the session shows the return to the parent process and the value is still mercedes. How is this possible?

Exporting variables is only valid from the parent to the child process. The child process cannot change the parent's variable.

Reading Input Into a Shell Variable

The BourneShell script can read user input from standard input. The read command will read one line from standard input and assign the line to one or more variables. The following example shows how this works.

Sample Session:

$ cat read_script
echo "Please enter a string of your choice"
read a
echo $a
$

This simple script will read one line from standard input (keyboard) and assign it to the variable a.

Sample Session:

$ read_script
Please enter a string of your choice
Here it is
Here it is
$

The line read from standard input can also be assigned to several variables as shown in the following example.

Sample Session:

$ cat reads
echo "Please enter three strings"
read a b c
echo $a $b $c
echo $c
echo $b
echo $a
$

This time, we will turn on the trace mechanism and follow the execution of this BourneShell script.

Sample Session:

$ sh -x reads
+ echo Please enter three strings
Please enter three strings
+ read a b c
this is more than three strings
+ echo this is more than three strings
this is more than three strings
+ echo more than three strings
more than three strings
+ echo is
is
+ echo this
this
$

It is interesting to note that the spaces separate the values for the variables a, b, and c. For example, the variable a was assigned the string this, the variable b was assigned the string is, and the remainder of the line was assigned to c (including the spaces).

Sample Session:

$ cat read_ex
echo 'Enter line: \c'
read line
echo "The line was: $line"
$

In this example, the \c escape suppresses the newline that echo would otherwise print after the prompt. The single quote marks protect the backslash from being interpreted by the shell. Also notice that the double quote marks do not prevent the substitution of the variable line.

Sample Session:

$ read_ex
Enter line: All's well that ends well
The line was: All's well that ends well
$
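Not every echo honors \c; some print the two characters literally. Where that is a concern, printf is a portable way to write a prompt without a trailing newline. The sketch below is a variation on read_ex under that assumption; the function name ask_line is hypothetical, and the answer is fed in through a pipe so that the example runs unattended.

```sh
# Prompt without a trailing newline, then read one line from standard input.
ask_line() {
    printf 'Enter line: '
    read line
    echo "The line was: $line"
}

# Supply the answer on standard input instead of typing it.
result=$(echo "All's well that ends well" | ask_line)
echo "$result"
```

Because the typed answer is not echoed back by a terminal here, the prompt and the reply appear on one line.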

Module 8

Parameters:

A shell script is invoked by typing its name. Parameters are passed to the script by appending them to the script name, with spaces as separators.

POSITIONAL PARAMETERS

A BourneShell script can also read in command-line arguments. The first argument is referred to as $1, the second is $2, and so on. Command-line arguments are referred to as positional parameters.

Let's look at an example BourneShell script to see how these are used.

Sample Session:

$ cat neat_shell
echo $1 $2 $3
echo $0 is the name of the shell script
echo "There were $# arguments."
echo $*
$

Ensure that the BourneShell script is executable by issuing this command:

Sample Session:

$ chmod a+x neat_shell
$

Now, if we type the name of the BourneShell script with no arguments, we get the following results.

Sample Session:

$ neat_shell

neat_shell is the name of the shell script
There were 0 arguments.

$

In this sample session, there were no arguments given so none were printed. $0 is the positional parameter that refers to the name of the script. Since there were no arguments given with this invocation of neat_shell, there were zero arguments listed.

$0: The Name of the Invoking Command

The special variable $0 represents the name of the executing program. The following shell script, if called script.sh, would output "This program is called script.sh.":

#!/bin/sh
echo This program is called $0.
exit 0

$1 $2 $3 ... $9, $*: Shell Parameters

The first parameter to the shell is known as $1, the second as $2, etc. The collection of ALL parameters is known as $*.

Consider the following as an example (file prog):

#!/bin/sh
echo the first parameter is $1
echo the second parameter is $2
echo the collection of ALL parameters is $*
exit 0

The output of that program could be:

sh_prompt> prog first second
the first parameter is first
the second parameter is second
the collection of ALL parameters is first second
sh_prompt>

$#: Number of Parameters

The number of parameters used can be obtained by looking at the value of $#.

Setting values of positional Parameters

Though we have compared the positional parameters with variables, they are in essence quite different. For instance, you can't assign values to $1, $2, etc. as we do to any other user-defined variables, or system variables for that matter.

Saying a=10 or b=alpha is fine, but $1=dollar or $2=100 is simply not done. There is one way to assign values to the positional parameters: using the set command.

$ set Friends come and go, but enemies accumulate

The above command sets $1 to 'Friends', $2 to 'come', and so on. To verify, we use the echo statement to display their values.

$ echo $1 $2 $3 $4 $5 $6 $7
Friends come and go, but enemies accumulate

Using shift: Shifts Parameters

When a large number of parameters (more than 9) are passed to the shell, shift can be used to read those parameters. If the number of parameters to be read is known, say three, a program similar to the following could be written:

#!/bin/sh
echo The first parameter is $1.
shift
echo The second parameter is $1.
shift
echo The third parameter is $1.
exit 0

Obviously the above example contains redundancy, especially if there are a large number of parameters.

To solve this problem, use a for or while loop.
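One way to remove the redundancy is a while loop that keeps shifting until no parameters are left. This is a minimal sketch; the function show_params is a hypothetical stand-in for a script so that the example can be run from a single file.

```sh
# Print every parameter in turn, however many there are.
show_params() {
    count=1
    while [ $# -gt 0 ]; do
        echo "Parameter $count is $1."
        shift                          # $2 becomes $1, $3 becomes $2, ...
        count=`expr $count + 1`
    done
}

show_params a b c d e f g h i j k
```

Each pass through the loop looks only at $1, so the limit of nine numbered variables never comes into play.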

Module 9

Regular Expression:

What is a Regular Expression?

A regular expression is a set of characters that specify a pattern. The term "regular" has nothing to do with a high-fiber diet. It comes from a term used to describe grammars and formal languages.

Regular expressions are used when you want to search for specific lines of text containing a particular pattern. Most of the UNIX utilities operate on ASCII files a line at a time. Regular expressions search for patterns on a single line, and not for patterns that start on one line and end on another.

It is simple to search for a specific word or string of characters. Almost every editor on every computer system can do this. Regular expressions are more powerful and flexible. You can search for words of a certain size. You can search for a word with four or more vowels that ends with an "s". Numbers, punctuation characters, you name it, a regular expression can find it. What happens once the program you are using finds it is another matter. Some just search for the pattern. Others print out the line containing the pattern. Editors can replace the string with a new pattern. It all depends on the utility.

Regular expressions confuse people because they look a lot like the file matching patterns the shell uses. They even act the same way, almost. The square brackets are similar, and the asterisk in a file matching pattern acts similar to, but not identical to, the asterisk in a regular expression. In particular, the Bourne shell, C shell, find, and cpio use file name matching patterns and not regular expressions.

The Structure of a Regular Expression

There are three important parts to a regular expression. Anchors are used to specify the position of the pattern in relation to a line of text. Character Sets match one or more characters in a single position. Modifiers specify how many times the previous character set is repeated. A simple example that demonstrates all three parts is the regular expression "^#*". The up arrow is an anchor that indicates the beginning of the line. The character "#" is a simple character set that matches the single character "#". The asterisk is a modifier. In a regular expression it specifies that the previous character set can appear any number of times, including zero. This is a useless regular expression, as you will see shortly.

There are also two types of regular expressions: the "Basic" regular expression, and the "extended" regular expression. A few utilities like awk and egrep use the extended expression. Most use the "regular" regular expression. From now on, if I talk about a "regular expression," it describes a feature in both types.

Here is a table of the Solaris (around 1991) commands that allow you to specify regular expressions:

Utility   Regular Expression Type
vi        Basic
sed       Basic
grep      Basic
csplit    Basic
dbx       Basic
dbxtool   Basic
more      Basic
ed        Basic
expr      Basic
lex       Basic
pg        Basic
nl        Basic
rdist     Basic
awk       Extended
nawk      Extended
egrep     Extended
EMACS     EMACS Regular Expressions
PERL      PERL Regular Expressions

The Anchor Characters: ^ and $

Most UNIX text facilities are line oriented. Searching for patterns that span several lines is not easy to do. You see, the end of line character is not included in the block of text that is searched. It is a separator. Regular expressions examine the text between the separators. If you want to search for a pattern that is at one end or the other, you use anchors. The character "^" is the starting anchor, and the character "$" is the end anchor. The regular expression "^A" will match all lines that start with a capital A. The expression "A$" will match all lines that end with the capital A. If the anchor characters are not used at the proper end of the pattern, then they no longer act as anchors. That is, the "^" is only an anchor if it is the first character in a regular expression. The "$" is only an anchor if it is the last character. The expression "$1" does not have an anchor. Neither does "1^". If you need to match a "^" at the beginning of the line, or a "$" at the end of a line, you must escape the special characters with a back slash. Here is a summary:

Pattern   Matches
^A        "A" at the beginning of a line
A$        "A" at the end of a line
A^        "A^" anywhere on a line
$A        "$A" anywhere on a line
^^        "^" at the beginning of a line
$$        "$" at the end of a line

The use of "^" and "$" as indicators of the beginning or end of a line is a convention other utilities use. The vi editor uses these two characters as commands to go to the beginning or end of a line. The C shell uses "!^" to specify the first argument of the previous line, and "!$" is the last argument on the previous line.

It is one of those choices that other utilities go along with to maintain consistency. For instance, "$" can refer to the last line of a file when using ed and sed. cat -e marks the end of each line with a "$". You might see it in other programs as well.
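The first four rows of the anchor summary can be tried with grep. The sample lines below are made up for the purpose, and grep -c is used so that each pattern reports how many lines it matched.

```sh
# Hypothetical sample lines, written to a scratch file.
sample=$(mktemp)
printf '%s\n' 'Apple' 'BANANA' 'A^B' 'cost $A' > "$sample"

starts_with_A=$(grep -c '^A' "$sample")     # Apple and A^B
ends_with_A=$(grep -c 'A$' "$sample")       # BANANA and cost $A
literal_caret=$(grep -c 'A^' "$sample")     # only the line holding A^
literal_dollar=$(grep -c '$A' "$sample")    # only the line holding $A

echo "$starts_with_A $ends_with_A $literal_caret $literal_dollar"
rm -f "$sample"
```

The single quotes keep the shell from treating $A as a variable before grep ever sees the pattern.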

Matching a character with a character set

The simplest character set is a character. The regular expression "the" contains three character sets: "t", "h" and "e". It will match any line with the string "the" inside it. This would also match the word "other". To prevent this, put spaces before and after the pattern: " the ". You can combine the string with an anchor. The pattern "^From: " will match the lines of a mail message that identify the sender. Use this pattern with grep to print every address in your incoming mail box:

grep '^From: ' /usr/spool/mail/$USER
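To try the pattern without a real mail spool, a small stand-in file works just as well. The mailbox contents and addresses below are invented for this sketch.

```sh
# Build a tiny stand-in mailbox in a scratch file.
mailbox=$(mktemp)
cat > "$mailbox" <<'EOF'
From: alice@example.com
Subject: lunch
I heard From: bob that lunch is at noon.
From: carol@example.com
EOF

# Only lines that begin with "From: " are printed.
senders=$(grep '^From: ' "$mailbox")
echo "$senders"
rm -f "$mailbox"
```

The third line contains "From: " but not at the start, so the anchor keeps it out of the result.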

Some characters have a special meaning in regular expressions. If you want to search for such a character, escape it with a back slash.

Match any character with .

The character "." is one of those special meta-characters. By itself it will match any character, except the end-of-line character. The pattern that will match a line with a single character is

^.$

Specifying a Range of Characters with [...]

If you want to match specific characters, you can use the square brackets to identify the exact characters you are searching for. The pattern that will match any line of text that contains exactly one digit is

^[0123456789]$

This is verbose. You can use the hyphen between two characters to specify a range:

^[0-9]$

You can intermix explicit characters with character ranges. This pattern will match a single character that is a letter, number, or underscore:

[A-Za-z0-9_]

Character sets can be combined by placing them next to each other. If you wanted to search for a word that

1. Started with a capital letter "T"
2. Was the first word on a line
3. The second letter was a lower case letter
4. Was exactly three letters long, and
5. The third letter was a vowel

the regular expression would be "^T[a-z][aeiou] ".
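The single-digit pattern and the five-condition pattern can be checked against a few sample lines. The lines themselves are assumptions made for illustration.

```sh
# Hypothetical sample lines, written to a scratch file.
words=$(mktemp)
printf '%s\n' 'The cat' 'Tree top' 'the end' 'Thy will' '7' '42' > "$words"

# A line that is exactly one digit long.
single_digit=$(grep -c '^[0-9]$' "$words")

# Capital T, a lower case letter, a vowel, then a space.
three_letter_T=$(grep '^T[a-z][aeiou] ' "$words")

echo "$single_digit"
echo "$three_letter_T"
rm -f "$words"
```

"Thy will" fails on the third letter and "Tree top" fails on the trailing space, which shows how each character set constrains one position.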

Exceptions in a character set

You can easily search for all characters except those in square brackets by putting a "^" as the first character after the "[". To match all characters except vowels use "[^aeiou]".

Like the anchors in places that can't be considered an anchor, the characters "]" and "-" do not have a special meaning if they directly follow "[". Here are some examples:

Regular Expression  Matches
[]                  The characters "[]"
[0]                 The character "0"
[0-9]               Any number
[^0-9]              Any character other than a number
[-0-9]              Any number or a "-"
[0-9-]              Any number or a "-"
[^-0-9]             Any character except a number or a "-"
[]0-9]              Any number or a "]"
[0-9]]              Any number followed by a "]"
[0-9-z]             Any number, or any character between "9" and "z"
[0-9\-a\]]          Any number, or a "-", an "a", or a "]"

Repeating character sets with *

The third part of a regular expression is the modifier. It is used to specify how many times you expect to see the previous character set. The special character "*" matches zero or more copies. That is, the regular expression "0*" matches zero or more zeros, while the expression "[0-9]*" matches zero or more numbers.

This explains why the pattern "^#*" is useless, as it matches any number of "#"s at the beginning of the line, including zero. Therefore this will match every line, because every line starts with zero or more "#"s.

At first glance, it might seem that starting the count at zero is stupid. Not so. Looking for an unknown number of characters is very important. Suppose you wanted to look for a number at the beginning of a line, and there may or may not be spaces before the number. Just use "^ *" to match zero or more spaces at the beginning of the line. If you need to match one or more, just repeat the character set. That is, "[0-9]*" matches zero or more numbers, and "[0-9][0-9]*" matches one or more numbers.
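The difference between "zero or more" and "one or more" shows up clearly when the matches are counted. The four sample lines below, the last of them empty, are assumptions for illustration.

```sh
# Hypothetical sample lines; the final one is an empty line.
lines=$(mktemp)
printf '%s\n' '42 apples' '   7 pears' 'no number' '' > "$lines"

# Zero or more digits at the start: true of every line, even the empty one.
zero_or_more=$(grep -c '^[0-9]*' "$lines")

# Optional leading spaces, then at least one digit.
leading_number=$(grep -c '^ *[0-9][0-9]*' "$lines")

echo "$zero_or_more $leading_number"
rm -f "$lines"
```

The first count equals the number of lines in the file, which is the sense in which such a pattern is useless.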

Matching a specific number of sets with \{ and \}

You can continue the above technique if you want to specify a minimum number of character sets. You cannot specify a maximum number of sets with the "*" modifier. There is a special pattern you can use to specify the minimum and maximum number of repeats. This is done by putting those two numbers between "\{" and "\}". The back slashes deserve a special discussion. Normally a back slash turns off the special meaning for a character. A period is matched by a "\." and an asterisk is matched by a "\*".

If a back slash is placed before a "<", ">", "{", "}", "(", ")", or before a digit, the back slash turns on a special meaning. This was done because these special functions were added late in the life of regular expressions. Changing the meaning of "{" would have broken old expressions. This is a horrible crime punishable by a year of hard labor writing COBOL programs. Instead, adding a back slash added functionality without breaking old programs. Rather than complain about the asymmetry, view it as evolution.

Having convinced you that "\{" isn't a plot to confuse you, an example is in order. The regular expression to match 4, 5, 6, 7 or 8 lower case letters is

[a-z]\{4,8\}

Any numbers between 0 and 255 can be used. The second number may be omitted, which removes the upper limit. If the comma and the second number are omitted, the pattern must be duplicated the exact number of times specified by the first number.

You must remember that modifiers like "*" and "\{1,5\}" only act as modifiers if they follow a character set. If they were at the beginning of a pattern, they would not be a modifier. Here is a list of examples, and the exceptions:

Regular Expression  Matches
*                   Any line with an asterisk
\*                  Any line with an asterisk
\\                  Any line with a back slash
^*                  Any line starting with an asterisk
^A*                 Any line
^A\*                Any line starting with an "A*"
^AA*                Any line if it starts with one "A"
^AA*B               Any line starting with one or more "A"'s followed by a "B"
^A\{4,8\}B          Any line starting with 4, 5, 6, 7 or 8 "A"'s followed by a "B"
^A\{4,\}B           Any line starting with 4 or more "A"'s followed by a "B"
^A\{4\}B            Any line starting with "AAAAB"
\{4,8\}             Any line with "{4,8}"
A{4,8}              Any line with "A{4,8}"
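Several rows of the table can be checked with grep, which uses basic regular expressions. The sample lines are made up: they hold 3, 4, 8, and 9 "A"s followed by a "B", plus one line with literal braces.

```sh
# Hypothetical sample lines, written to a scratch file.
rows=$(mktemp)
printf '%s\n' 'AAAB' 'AAAAB' 'AAAAAAAAB' 'AAAAAAAAAB' 'A{4,8}' > "$rows"

four_to_eight=$(grep -c '^A\{4,8\}B' "$rows")   # the lines with 4 and 8 A's
at_least_four=$(grep -c '^A\{4,\}B' "$rows")    # also the line with 9 A's
exactly_four=$(grep -c '^A\{4\}B' "$rows")      # AAAAB only
literal_braces=$(grep -c 'A{4,8}' "$rows")      # the literal text A{4,8}

echo "$four_to_eight $at_least_four $exactly_four $literal_braces"
rm -f "$rows"
```

Without the back slashes the braces are ordinary characters, which is why the last pattern finds only the line that spells them out.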

Matching words with \< and \>

Searching for a word isn't quite as simple as it at first appears. The string "the" will match the word "other". You can put spaces before and after the letters and use this regular expression: " the ". However, this does not match words at the beginning or end of the line. And it does not match the case where there is a punctuation mark after the word.

There is an easy solution. The characters "\<" and "\>" are similar to the "^" and "$" anchors, as they don't occupy a position of a character. They do "anchor" the expression between to only match if it is on a word boundary. The pattern to search for the word "the" would be "\<[tT]he\>". The character before the "t" must be either a new line character, or anything except a letter, number, or underscore. The character after the "e" must also be a character other than a number, letter, or underscore, or it could be the end of line character.
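The effect of the word anchors can be seen by counting matches with and without them. This sketch assumes a grep that supports "\<" and "\>", as the GNU and BSD versions do; the sample lines are invented.

```sh
# Hypothetical sample lines, written to a scratch file.
text=$(mktemp)
printf '%s\n' 'the end' 'other things' 'See the, then go' 'The start' 'bathe' > "$text"

# Plain string: also counts lines where "the" sits inside a longer word.
loose=$(grep -c 'the' "$text")

# Word anchors: "the" or "The" must stand alone.
whole_word=$(grep -c '\<[tT]he\>' "$text")

echo "$loose $whole_word"
rm -f "$text"
```

The comma after "the" in the third line does not prevent a match, because punctuation counts as a word boundary.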

Backreferences - Remembering patterns with \(, \) and \1

Another pattern that requires a special mechanism is searching for repeated words. The expression "[a-z][a-z]" will match any two lower case letters. If you wanted to search for lines that had two adjoining identical letters, the above pattern wouldn't help. You need a way of remembering what you found, and seeing if the same pattern occurred again. You can mark part of a pattern using "\(" and "\)". You can recall the remembered pattern with "\" followed by a single digit. Therefore, to search for two identical letters, use "\([a-z]\)\1". You can have 9 different remembered patterns. Each occurrence of "\(" starts a new pattern. The regular expression that would match a 5 letter palindrome (e.g. "radar") would be

\([a-z]\)\([a-z]\)[a-z]\2\1
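Both backreference patterns can be tried on a short word list. The words are assumptions chosen for the sketch, and the palindrome pattern is wrapped in "^" and "$" here so that it must cover the whole line.

```sh
# Hypothetical word list, one word per line.
candidates=$(mktemp)
printf '%s\n' 'radar' 'level' 'hello' 'abcde' 'civic' > "$candidates"

# Two adjoining identical letters.
doubled=$(grep '\([a-z]\)\1' "$candidates")

# Five letter palindromes: letters 4 and 5 repeat letters 2 and 1.
palindromes=$(grep -c '^\([a-z]\)\([a-z]\)[a-z]\2\1$' "$candidates")

echo "$doubled $palindromes"
rm -f "$candidates"
```

Here \1 and \2 recall whatever text the first and second marked groups actually matched, not the patterns themselves.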

Potential Problems

That completes a discussion of the Basic regular expression. Before I discuss the extensions the extended expressions offer, I wanted to mention two potential problem areas.

The "\<" and "\>" characters were introduced in the vi editor. The other programs didn't have this ability at that time. Also the "\{min,max\}" modifier is new and earlier utilities didn't have this ability. This made it difficult for the novice user of regular expressions, because it seemed each utility had a different convention. Sun has retrofitted the newest regular expression library to all of their programs, so they all have the same ability. If you try to use these newer features on other vendors' machines, you might find they don't work the same way.

The other potential point of confusion is the extent of the pattern matches. Regular expressions match the longest possible pattern. That is, the regular expression

A.*B

matches "AAB" as well as "AAAABBBBABCCCCBBBAAAB." This doesn't cause many problems using grep, because an oversight in a regular expression will just match more lines than desired. If you use sed, and your patterns get carried away, you may end up deleting more than you wanted to.
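A short sed sketch shows how far the longest match can reach. The input string is made up for illustration; the second command uses a bracket expression, "[^B]*", as one possible way to stop at the first "B".

```shell
line='xxAyyBzzBww'

# A.*B runs from the first A to the LAST B on the line.
greedy=$(printf '%s\n' "$line" | sed 's/A.*B/-/')
echo "$greedy"

# A[^B]*B cannot cross a B, so it stops at the first one.
narrow=$(printf '%s\n' "$line" | sed 's/A[^B]*B/-/')
echo "$narrow"
```

The first command prints "xx-ww" and the second prints "xx-zzBww".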

Extended Regular Expressions

Two programs use the extended regular expression: egrep and awk. With these extensions, those special characters preceded by a back slash no longer have the special meaning: "\{," "\}," "\<," "\>," "\(," "\)" as well as the "\digit." There is a very good reason for this, which I will delay explaining to build up suspense.

The character "?" matches 0 or 1 instances of the character set before, and the character "+" matches one or more copies of the character set. You can't use the \{ and \} in the extended regular expressions, but if you could, you might consider the "?" to be the same as "\{0,1\}" and the "+" to be the same as "\{1,\}."

By now, you are wondering why the extended regular expressions are even worth using. Except for two abbreviations, there are no advantages, and a lot of disadvantages. Therefore, examples would be useful.

The three important characters in the expanded regular expressions are "(," "|," and ")." Together, they let you match a choice of patterns. As an example, you can use egrep to print all From: and Subject: lines from your incoming mail:

egrep '^(From|Subject): ' /usr/spool/mail/$USER

All lines starting with "From:" or "Subject:" will be printed. There is no easy way to do this with the Basic regular expressions. You could try "^[FS][ru][ob][mj]e*c*t*: " and hope you don't have any lines that start with "Sromeet:." Extended expressions don't have the "\<" and "\>" characters. You can compensate by using the alternation mechanism. Matching the word "the" in the beginning, middle, end of a sentence, or end of a line can be done with the extended regular expression:

(^| )the([^a-z]|$)

There are two choices before the word, a space or the beginning of a line. After the word, there must be something besides a lower case letter or else the end of the line. One extra bonus with extended regular expressions is the ability to use the "*," "+," and "?" modifiers after a "(...)" grouping. The following will match "a simple problem," "an easy problem," as well as "a problem." Note that the space after the adjective sits inside the optional group; otherwise "a problem" would need two spaces to match.

egrep "a[n]? ((simple|easy) )?problem" data
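Both uses of grouping can be tried on inline data. This sketch uses "grep -E", which accepts the same extended expressions as egrep; the mail headers and phrases are invented, and the optional-group pattern is written with the trailing space inside the group so that the bare "a problem" also matches.

```shell
# Hypothetical mailbox excerpt.
mail='From: alice@example.com
To: bob@example.com
Subject: lunch
Sromeet: not a real header'

# Alternation: lines starting with either header name.
headers=$(printf '%s\n' "$mail" | grep -E '^(From|Subject): ')
printf '%s\n' "$headers"

# "?" applied to a whole group: the adjective (and its space) is optional.
count=$(printf '%s\n' 'a simple problem' 'an easy problem' 'a problem' 'a hard problem' |
        grep -E -c 'a[n]? ((simple|easy) )?problem')
echo "$count"
```

Three of the four phrases match; "a hard problem" does not.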

I promised to explain why the back slash characters don't work in extended regular expressions. Well, perhaps the "\{...\}" and "\<...\>" could be added to the extended expressions. These are the newest addition to the regular expression family. They could be added, but this might confuse people if those characters are added and the "\(...\)" are not. And there is no way to add that functionality to the extended expressions without changing the current usage. Do you see why? It's quite simple. If "(" has a special meaning, then "\(" must be the ordinary character. This is the opposite of the Basic regular expressions, where "(" is ordinary, and "\(" is special. The usage of the parentheses is incompatible, and any change could break old programs.

If the extended expression used "(...|...)" as regular characters, and "\(...\|...\)" for specifying alternate patterns, then it is possible to have one set of regular expressions that has full functionality. This is exactly what GNU emacs does, by the way.


The rest of this is random notes.

Regular Expression   Class      Type            Meaning

.                    all        Character Set   A single character (except newline)
^                    all        Anchor          Beginning of line
$                    all        Anchor          End of line
[...]                all        Character Set   Range of characters
*                    all        Modifier        Zero or more duplicates
\<                   Basic      Anchor          Beginning of word
\>                   Basic      Anchor          End of word
\(..\)               Basic      Backreference   Remembers pattern
\1..\9               Basic      Reference       Recalls pattern
\{M,N\}              Basic      Modifier        M to N duplicates
+                    Extended   Modifier        One or more duplicates
?                    Extended   Modifier        Zero or one duplicate
(...|...)            Extended   Anchor          Shows alternation
\(...\|...\)         EMACS      Anchor          Shows alternation
\w                   EMACS      Character set   Matches a letter in a word
\W                   EMACS      Character set   Opposite of \w

Module 10
A Sample Shell Script

This visual shows another way of invoking a shell script. This method relies on the user first making the script an executable file with the chmod command.

After this step the script can be invoked by its name.

Note that the shell uses the PATH variable to find executable files. If you get an error message like the following,

$ hello
ksh: hello: not found

check your PATH variable. The directory in which the shell script is stored must be defined in the PATH variable.


Each shell script is executed in a subshell. Variables defined in a shell script cannot be passed back to the parent shell.

If you invoke a shell script with a . (dot), it runs in the current shell. Variables defined in this script (dir1, dir2) are therefore defined in the current shell.
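The two ways of running a script can be compared side by side. This is a sketch under stated assumptions: the script name setdirs and its contents are hypothetical (only the variable names dir1 and dir2 come from the text), and mktemp -d is assumed to be available for a scratch directory.

```shell
workdir=$(mktemp -d)

# A tiny hypothetical script that only defines two variables.
cat > "$workdir/setdirs" <<'EOF'
dir1=/tmp
dir2=/usr
EOF

unset dir1 dir2

sh "$workdir/setdirs"            # runs in a child shell
after_subshell=${dir1:-unset}    # the parent never sees dir1

. "$workdir/setdirs"             # runs in the current shell
after_dot=${dir1:-unset}         # now dir1 is defined here

echo "after subshell: $after_subshell; after dot: $after_dot"
rm -r "$workdir"
```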

Every process gives back an exit status to its parent process. By convention, 0 is given back when the process ended successfully and a nonzero value in all other cases.


To find out the exit code of a completed command, use echo $?:

$ date
$ echo $?
0
$ _

This shows successful execution of the date command. The visual shows an example for an unsuccessful execution of a command.
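Both outcomes can be captured in variables. In this sketch the failing command is an ls on a path assumed not to exist; the exact nonzero value it returns differs between systems, so only "zero" versus "nonzero" is relied on.

```shell
# A command that succeeds: $? is 0 immediately afterwards.
date >/dev/null
date_status=$?
echo "date exit status: $date_status"

# A command that fails. The right-hand side of || runs only on failure,
# and $? there still holds the failing command's status.
ls_status=0
ls /no/such/directory >/dev/null 2>&1 || ls_status=$?
echo "ls exit status: $ls_status"
```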

CONTROL CONSTRUCTS:

The BourneShell control constructs can alter the flow of control within the script. The BourneShell provides simple two-way branch if statements and multiple-branch case statements, plus for, while, and until statements.

In discussing these control structures, the BourneShell keywords will be in bold type and the normal type are the user supplied items to cause the desired effect in command format boxes.

Types of Tests Used with Control Constructs:

The test utility evaluates expressions and returns an exit status indicating whether the expression is true (equal to zero) or false (not equal to zero). There are no options with this utility. The format for this utility is as follows:

Command Format: test expression

expression - composed of constants, variables, and operators

Expressions will be looked at in greater detail later with some examples. There are a few items that need to be mentioned that apply to expressions. Expressions can contain one or more evaluation criteria that test will evaluate. A -a that separates two criteria is a logical AND operator. In this case, both criteria must evaluate to true in order for test to return a value of true. The -o is the logical OR operator. When this operator separates two criteria, one or the other (or both) must be true for test to return a true condition.

You can negate any criterion by preceding it with an exclamation mark (!). Parentheses can be used to group criteria. If there are no parentheses, the -a (logical AND operator) takes precedence over the -o (logical OR operator). The test utility will evaluate operators of equal precedence from left to right.

Within the expression itself, you must put special characters, such as parentheses, in quote marks so the BourneShell will not evaluate them but will pass them to test.

Since each element (evaluation criterion, string, or variable) in an expression is a separate argument, each must be separated by a space.

The test utility will work from the command line but it is more often used in a script to test input or verify access to a file.

Another way to do the test evaluation is to surround the expression with left and right brackets. A space character must appear after the left bracket and before the right bracket.

test expression = [ expression ]
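The operators described above can be combined as follows. This is a minimal sketch with a made-up variable, count; it relies on the -a, -o, ! and escaped-parenthesis forms behaving as described in the text, which common shells such as bash and dash do (POSIX marks -a and -o as obsolescent).

```shell
count=5

# -a: both criteria must be true.
if [ "$count" -gt 0 -a "$count" -lt 10 ]
then range=inside
else range=outside
fi

# -o: one criterion is enough. test and [ ] are interchangeable.
if test "$count" -eq 0 -o "$count" -eq 5
then either=yes
else either=no
fi

# ! negates; the parentheses are escaped so the shell passes them to test.
if [ ! \( "$count" -lt 3 -o "$count" -gt 8 \) ]
then middle=yes
else middle=no
fi

echo "$range $either $middle"
```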

Test on Numeric Values

Test expressions can be in many different forms. The expressions can appear as a set of evaluation criteria. The general form for testing numeric values is:

int1 op int2

This criterion is true if the integer int1 has the specified algebraic relationship to integer int2.

The valid operators (op) are:

-eq equal

-ne not equal

-gt greater than

-lt less than

-ge greater than or equal

-le less than or equal

Test on Character Strings

The evaluation criterion for character strings is similar to numeric comparisons. The general form is:

string1 op string2

The operators (op) are:

string1 = string2     true if string1 and string2 are equal

string1 != string2    true if string1 and string2 are not equal

string1               true if string1 is not the null string

Sample Session:

$ cat test_string
number=1
numero=0001
if test $number = $numero
then echo "String vals for $number and $numero are ="
else echo "String vals for $number and $numero not ="
fi
if test $number -eq $numero
then echo "Numeric vals for $number and $numero are ="
else echo "Numeric vals for $number and $numero not ="
fi

$ chmod 755 test_string

$ sh -x test_string
number=1
numero=0001
+ test 1 = 0001
+ echo String vals for 1 and 0001 not =
String vals for 1 and 0001 not =
+ test 1 -eq 0001
+ echo Numeric vals for 1 and 0001 are =
Numeric vals for 1 and 0001 are =

$ test_string
String vals for 1 and 0001 not =
Numeric vals for 1 and 0001 are =

$

Test on File Types

The test utility can be used to determine information about file types. All of the criteria can be found in Appendix B. A few of them are listed here:

-r filename true if filename exists and is readable

-w filename true if filename exists and is writable

-x filename true if filename exists and is executable

-f filename true if filename exists and it is a plain file

-d filename true if filename exists and it is a directory.

-s filename true if filename exists and it contains information (has a size greater than 0 bytes)

Example:
$ test -d new_dir

If new_dir is a directory, this criterion will evaluate to true. If it does not exist, or is not a directory, then it will be false.

Taking Decisions using if then

The format for this construct is:

Command Format: if expression
                then commands
                fi

The if statement evaluates the expression and then returns control based on this status. The fi statement marks the end of the if; notice that fi is if spelled backward.

The if statement executes the statements immediately following it if the expression returns a true status. If the return status is false, control will transfer to the statement following the fi.

Sample Session:

$ cat check_args
if (test $# = 0)
then echo 'Please supply at least 1 argument'
     exit
fi
echo 'Program is running'
$

This little script will check to ensure that you are giving at least one argument. If none are given it will display the error message and exit. If one or more arguments are given it will display "Program is running" and run the rest of the script, if any.

Sample Session:

$ check_args
Please supply at least 1 argument
$ check_args xyz
Program is running
$

Taking Decision using if then else

The format for this construct is:

Command Format: if expression
                then commands
                else commands
                fi

The else part of this structure makes the single-branch if statement into a two-way branch. If the expression returns a true status, the commands between the then and the else statement will be executed. After these have been executed, control will start again at the statement after the fi.

If the expression returns false, the commands following the else statement will be executed.

Sample Session:

$ cat test_string
number=1
numero=0001
if test $number = $numero
then echo "String values of $number and $numero are equal"
else echo "String values of $number and $numero not equal"
fi
if test $number -eq $numero
then echo "Numeric values of $number and $numero are equal"
else echo "Numeric values of $number and $numero not equal"
fi
$

Taking Decision using if then elif

The format for this construct is:

Command Format: if expression
                then commands
                elif expression
                then commands
                else commands
                fi

The elif construct combines the else and if statements and allows you to construct a nested set of if then else structures.
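An elif chain pairs naturally with the file tests listed earlier. The sketch below is illustrative only: the function name describe and the scratch files are hypothetical, and mktemp -d is assumed to be available.

```shell
tmp=$(mktemp -d)
: > "$tmp/empty"                 # zero-length file
echo 'some data' > "$tmp/full"   # file with content

# Classify a path; the first test that succeeds decides the branch.
describe() {
    if [ -d "$1" ]
    then echo 'directory'
    elif [ -s "$1" ]
    then echo 'non-empty file'
    elif [ -f "$1" ]
    then echo 'empty file'
    else echo 'missing'
    fi
}

kind_dir=$(describe "$tmp")
kind_full=$(describe "$tmp/full")
kind_empty=$(describe "$tmp/empty")
kind_none=$(describe "$tmp/nothing-here")
echo "$kind_dir / $kind_full / $kind_empty / $kind_none"
rm -r "$tmp"
```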

The case control Structure

The format for this construct is:

Command Format: case test-string in
                pattern-1 ) commands-1 ;;
                pattern-2 ) commands-2 ;;
                pattern-3 ) commands-3 ;;
                ...
                *) commands ;;
                esac

The case structure allows a multiple-branch decision mechanism. The path that is taken depends on a match between the test-string and one of the patterns.

Sample Session:

$ cat case_ex
echo 'Enter A, B, or C: \c'
read letter
case $letter in
    A) echo 'You entered A' ;;
    B) echo 'You entered B' ;;
    C) echo 'You entered C' ;;
    *) echo 'You did not enter A, B, or C' ;;
esac
$ chmod a+x case_ex
$ case_ex
Enter A, B, or C: B
You entered B
$ case_ex
Enter A, B, or C: b
You did not enter A, B, or C
$

This example uses the value of a character that the user entered as the test string. The value is represented by the variable letter. If letter has the value of A, the structure will execute the command following A. If letter has a value of B or C, then the appropriate commands will be executed. The asterisk indicates any string of characters; and it, therefore, functions as a catchall for a no-match condition. The lowercase b in the second sample session is an example of a no match condition.
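Because case patterns use the same wildcard syntax as filename matching, the lowercase input could also be accepted. The following sketch is a variation, not the original script: the function name classify is hypothetical, and the bracket and "|" patterns are standard case syntax not shown in the sample above.

```shell
classify() {
    case $1 in
        [aA])   echo 'You entered A' ;;        # character class: a or A
        b|B)    echo 'You entered B' ;;        # alternatives separated by |
        [0-9]*) echo 'You entered a number' ;; # a digit, then anything
        *)      echo 'No match' ;;             # catchall
    esac
}

classify a
classify B
classify 42
classify xyz
```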

The Loop Control Structure

The for Loop:

The format for this construct is:

Command Format: for loop-index in argument-list
                do
                commands
                done

This structure will assign the value of the first item in the argument list to the loop index and executes the commands between the do and done statements. The do and done statements indicate the beginning and end of the for loop.

After the structure passes control to the done statement, it assigns the value of the second item in the argument list to the loop index and repeats the commands. The structure will repeat the commands between the do and done statements once for each argument in the argument list. When the argument list has been exhausted, control passes to the statement following the done.

Sample Session:

$ cat find_henry
for x in project1 project2 project3
do
    grep henry $x
done

Sample Session:

$ head project?
==> project1 <==
henry
joe
mike
sue

==> project2 <==
joe
mike
sue

==> project3 <==
joe
mike
sue
henry

==> project4 <==
joe
mike

$ find_henry
henry
henry
$

Each file in the argument list was searched for the string, henry. When a match was found, the string was printed.

The while Loop

The format for this construct is:

Command Format: while expression
                do
                commands
                done

As long as the expression returns a true exit status, the structure continues to execute the commands between the do and the done statement. Before each loop through the commands, the structure executes the expression. When the exit status of the expression is false (non-zero), control is passed to the statement following the done statement.

The commands to be executed must change the expression test or an infinite loop can result.
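A minimal while loop that is guaranteed to terminate might look like this. The variable names i and total are arbitrary; the loop sums the integers 1 through 5, and the increment of i is what eventually makes the test fail.

```shell
i=1
total=0

while [ "$i" -le 5 ]
do
    total=$((total + i))
    i=$((i + 1))      # changes the tested value, so the loop ends
done

echo "total=$total i=$i"
```

When the loop finishes, total is 15 and i is 6, the first value for which the test is false.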

The until Loop

The format for this construct is:

Command Format: until expression
                do
                commands
                done

The until and while structures are very similar. Both test the expression at the top of the loop; the only difference is the sense of the test. The until structure will continue to loop until the expression returns true (a nonerror condition). The while loop will continue as long as a true (nonerror) condition is returned.

The commands to be executed must change the expression test or an infinite loop can result.

Sample Session:

$ cat until_ex
secretname='jenny'
name='noname'
echo 'Try to guess the secret name!'
echo
until (test "$name" = "$secretname")
do
    echo 'Your guess: \c'
    read name
done
echo 'You did it!'
$

The until loop will continue until name is equal to the secret name.

Sample Session:

$ chmod a+x until_ex
$ until_ex
Try to guess the secret name!

Your guess: gaylan
Your guess: art
Your guess: richard
Your guess: jenny
You did it!
$

The break, and continue Statement

The break and continue loop control commands correspond exactly to their counterparts in other programming languages. The break command terminates the loop (breaks out of it), while continue causes a jump to the next iteration (repetition) of the loop, skipping all the remaining commands in that particular loop cycle.

#!/bin/bash

LIMIT=19  # Upper limit

echo
echo "Printing Numbers 1 through 20 (but not 3 and 11)."

a=0

while [ $a -le "$LIMIT" ]
do
  a=$(($a+1))

  if [ "$a" -eq 3 ] || [ "$a" -eq 11 ]  # Excludes 3 and 11.
  then
    continue  # Skip rest of this particular loop iteration.
  fi

  echo -n "$a "  # This will not execute for 3 and 11.
done

# Exercise:
# Why does loop print up to 20?

echo; echo

echo Printing Numbers 1 through 20, but something happens after 2.

##################################################################

# Same loop, but substituting 'break' for 'continue'.

a=0

while [ "$a" -le "$LIMIT" ]
do
  a=$(($a+1))

  if [ "$a" -gt 2 ]
  then
    break  # Skip entire rest of loop.
  fi

  echo -n "$a "
done

echo; echo; echo

exit 0

The break command may optionally take a parameter. A plain break terminates only the innermost loop in which it is embedded, but a break N breaks out of N levels of loop.
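A nested loop makes the effect of the parameter visible. This sketch, with arbitrary loop variables i and j, searches for the first pair whose product is 4 and uses break 2 to leave both loops at once.

```shell
found=''

for i in 1 2 3
do
    for j in 1 2 3
    do
        if [ $((i * j)) -eq 4 ]
        then
            found="$i x $j"
            break 2      # leave the inner AND the outer loop
        fi
    done
done

echo "first pair: $found"
```

With a plain break, only the inner loop would stop and the outer loop would go on to i=3.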

Module 11
Useful Utilities for Shells

cat - concatenate a file

Display the contents of a file with the concatenate command, cat.

Syntax
cat [options] [file]

Common Options

-n  precede each line with a line number
-v  display non-printing characters, except tabs, new-lines, and form-feeds
-e  display $ at the end of each line (prior to new-line) (when used with -v option)

Examples
% cat filename

You can list a series of files on the command line, and cat will concatenate them, starting each in turn, immediately after completing the previous one, e.g.:

% cat file1 file2 file3

DATE

The date command gives the current date and time.

Syntax
date ["+format-string"]

Example
$ date

Its output is as follows

Mon Nov 27 11:24:35 EST 2006

The date command can also be used with a format-string

The format-string can include following format-characters.

%D Gives date in MM/DD/YY format


%T Gives time as HH:MM:SS

%H Gives Hour from 00 to 23

%M Gives Minute from 00 to 59

%S Gives Second from 00 to 59

%m Gives month of the year

%d Gives day of the month

%y Gives last two digits of the year.

Example
$ date "+DATE IS %D TIME IS %T"

Will give the output as: DATE IS 11/27/06 TIME IS 16:38:23

Example
$ date "+DAY %d MONTH %m YEAR %y"

will give the output as: DAY 27 MONTH 11 YEAR 06

The find command recursively searches the directory tree for each specified path, seeking files that match a Boolean expression written using the terms given in the text that follows the expression. The output from the find command depends on the terms specified by the final parameter.

Note that the -print option is the default so is not required. This was not always the case. In earlier versions of AIX and on other UNIX systems that have not yet implemented the POSIX standard for the find command, the -print option is required for the result to be displayed or used in a pipe.

The command following -exec, in this example ls, is executed for each file name found.


find replaces the {} with the names of the files matched. It is used as a placeholder for matches.

Note use of the escaped ; to terminate the command that find is to execute.

The find command may also be used with a -ls option; that is, $ find . -name 'm*' -ls.

Note that the -exec option is non-interactive.

The \; is hard coded with the find command. This is required for use with -exec and -ok.

It is a good idea to use the -ok option rather than -exec if there are not a lot of files that match the search criteria. It is a lot safer if your pattern is not exactly what you think it is.
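The pieces described above (-name, -exec, the {} placeholder and the escaped ;) fit together as in this sketch. The scratch directory and file names are hypothetical, mktemp -d is assumed to be available, and basename stands in for ls so that the output is predictable.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch "$tmp/memo.txt" "$tmp/main.c" "$tmp/notes.txt" "$tmp/sub/map.txt"

# For every regular file whose name starts with m, run basename on it.
# {} is replaced by each path found; \; ends the -exec command.
m_files=$(find "$tmp" -name 'm*' -type f -exec basename {} \; | LC_ALL=C sort)
printf '%s\n' "$m_files"

rm -r "$tmp"
```

The search descends into sub, so map.txt is reported along with main.c and memo.txt, while notes.txt is not.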

The grep command searches for the pattern specified and writes each matching line to standard output.

The search can be for simple text, like a string or a name. grep can also look for logical constructs, called regular expressions, that use patterns and wildcards to symbolize something special in the text, for example, only lines that start with an uppercase T.

The command displays the name of the file containing the matched line, if more than one file is specified for the search.


On-Line Documentation:

The UNIX manual, usually called man pages, is available on-line to explain the usage of the UNIX system and commands. To use a man page, type the command "man" at the system prompt followed by the command for which you need information.

Syntax
man [options] command_name

Common Options

-k keyword list command synopsis line for all keyword matches

-M path path to man pages

-a show all matching man pages (SVR4)

Backup using tar

Another program used to read and write files associated with an archive is tar. Some of the available options are

-A  Append tar files to an archive
-c  Create a new archive
-f  Name of archive
-P  Keep absolute paths of files
-t  List the files in an archive
-v  Verbose mode
-x  Extract files from an archive
-z  Compress/decompress files using gzip
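A typical backup cycle combines several of these options. This is a sketch only: the directory and archive names are hypothetical, and it assumes a tar that supports -z and -C (GNU tar and BSD tar both do) plus mktemp -d for scratch space.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/project"
echo 'hello' > "$tmp/project/a.txt"
echo 'world' > "$tmp/project/b.txt"

# -c create, -z gzip, -f archive name
( cd "$tmp" && tar -czf backup.tar.gz project )

# -t list the contents without extracting
listing=$(tar -tzf "$tmp/backup.tar.gz")
printf '%s\n' "$listing"

# -x extract; -C chooses the directory to extract into
mkdir "$tmp/restore"
tar -xzf "$tmp/backup.tar.gz" -C "$tmp/restore"
restored=$(cat "$tmp/restore/project/a.txt")
echo "$restored"

rm -r "$tmp"
```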


gzip

This reduces the size of a file, thus freeing valuable disk space. For example,type

% ls -l science.txt

and note the size of the file using ls -l . Then to compress science.txt, type

% gzip science.txt

This will compress the file and place it in a file called science.txt.gz

To see the change in size, type ls -l again.

To expand the file, use the gunzip command.

% gunzip science.txt.gz

nslookup

Nslookup is a program to query Internet domain name servers. Nslookup has two modes: interactive and non-interactive. Interactive mode allows the user to query name servers for information about various hosts and domains or to print a list of hosts in a domain. Non-interactive mode is used to print just the name and requested information for a host or domain.

nslookup host

This prints the domain name, IP address, and alias information for the given host, e.g., nslookup www.kent.edu gives related data for www.kent.edu

Cut command.

The cut command selects a list of columns or fields from one or more files. Option -c is for columns and -f for fields. It is entered as

cut options [files]

For example, if a file named testfile contains

this is firstline
this is secondline
this is thirdline

Examples:

cut -c1,4 testfile

will print this to standard output (screen):

ts
ts
ts

It is printing columns 1 and 4 of each line of this file, which contain t and s (part of this).

Options:

-c list  cut the column positions identified in list.
-f list  cut the fields identified in list.
-s       could be used with -f to suppress lines without delimiters.
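The field options are easiest to see on delimited data. In this sketch the record is invented, and -d (which sets the field delimiter and is not listed above) is a standard cut option; without it, fields are separated by tabs.

```shell
record='alice:x:1001:users'

# -f with -d: select fields separated by a colon.
name=$(printf '%s\n' "$record" | cut -d: -f1)
ids=$(printf '%s\n' "$record" | cut -d: -f1,3)

# -c with a range: columns 1 through 4.
range=$(printf '%s\n' 'this is firstline' | cut -c1-4)

# -s drops the line that contains no delimiter at all.
with_s=$(printf '%s\n' 'no delimiter here' "$record" | cut -d: -f1 -s)

echo "$name | $ids | $range | $with_s"
```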

Awk and Sed

Awk is a programming language that can be applied to data-manipulation and computing tasks on a UNIX operating system. Sed, a stream editor, acts like a filter by executing a group of editing instructions for a text file.

Examples:

df -t | awk 'BEGIN {tot=0} $2 == "total" {tot=tot+$1} END {print(tot*512)/1000000}'

Will give total space in your system in megabytes. Here the output of command df -t is being passed into awk, which sums field 1 of every line whose second field is "total". In the same way, if you change $1 to $4 it will accumulate and display the sum of field 4.

The sed command launches a stream editor which you can use at the command line.

You can enter your sed commands in a file and then, using the -f option, edit your text file. It works as

sed [options] files

options:

-e 'instruction'  Apply the editing instruction to the files.
-f script         Apply the set of instructions from the editing script.
-n                Suppress default output.
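The -e and -n options can be tried on a few lines of inline text. The input here is made up for illustration; the p command used with -n prints only the addressed line.

```shell
text='alpha
beta
gamma'

# Two -e instructions are applied, in order, to every line.
edited=$(printf '%s\n' "$text" | sed -e 's/beta/BETA/' -e 's/gamma/GAMMA/')
printf '%s\n' "$edited"

# -n turns off the default output, so only line 2 is printed by "2p".
second=$(printf '%s\n' "$text" | sed -n '2p')
echo "$second"
```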

For more information about sed, enter man sed at the command line in your system.

Module 12
Arithmetic on Shell Variables

A Unix command called expr evaluates an expression given to it on the command line.

- Each operator and operand given to expr must be a separate argument
- The usual arithmetic operators (+, -, *, /, %) are recognized by expr
- Remember to use backslashes to protect the expression from the shell
- expr only evaluates integer arithmetic expressions
- Use the ':' operator with expr to match characters in the first operand against a regular expression given as the second argument; by default it returns the number of characters matched

Arithmetic is done with expr

expr 5 + 7
expr 5 \* 7

Backslash required in front of '*' since it is a filename wildcard and would be translated by the shell into a list of file names

You can save arithmetic result in a variable

Store the following in a file named arith.sh and execute it

#!/bin/sh
# Perform some arithmetic
x=24
y=4
Result=`expr $x \* $y`
echo "$x times $y is $Result"
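The remaining operators, including ':', behave as in the following sketch. The operands are arbitrary; note that division is integer division and that ':' matches from the start of the first operand.

```shell
quotient=$(expr 17 / 5)      # integer division, the fraction is dropped
remainder=$(expr 17 % 5)
product=$(expr 6 \* 7)       # * escaped so the shell does not expand it

# ':' returns the number of characters the regular expression matched.
matched=$(expr 'shellscript' : 'shell')

echo "$quotient $remainder $product $matched"
```

This prints "3 2 42 5".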

read and echo Revisited

Let us consider the simple shell script given below:

#!/bin/sh
# Usage read echo

echo Enter the Values of a, b and c
read a b c
echo $a $b $c
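How read distributes words among several variables can be checked without typing anything, by piping the input in. The input strings are arbitrary; the braces keep read and echo in the same shell so the variables are still set when echo runs.

```shell
# More words than variables: the last variable takes the rest of the line.
extra=$(echo '10 20 30 40' | { read a b c; echo "a=$a b=$b c=$c"; })
echo "$extra"

# Fewer words than variables: the leftover variables are empty.
short=$(echo '10' | { read a b c; echo "a=$a b=[$b] c=[$c]"; })
echo "$short"
```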

Module 13
Functions

Like "real" programming languages, Bash has functions, though in a somewhat limited implementation. A function is a subroutine, a code block that implements a set of operations, a "black box" that performs a specified task. Wherever there is repetitive code, when a task repeats with only slight variations, then consider using a function.

function function_name {
command...
}

or

function_name () {
command...
}

This second form will cheer the hearts of C programmers (and is more portable).

As in C, the function's opening bracket may optionally appear on the second line.

function_name ()
{
command...
}

A function may be "compacted" into a single line.

fun () { echo "This is a function"; echo; }

In this case, however, a semicolon must follow the final command in the function.

fun () { echo "This is a function"; echo } # Error!

Functions are called, triggered, simply by invoking their names.

Example: Simple functions

#!/bin/bash

JUST_A_SECOND=1

funky ()
{ # This is about as simple as functions get.
  echo "This is a funky function."
  echo "Now exiting funky function."
} # Function declaration must precede call.

fun ()
{ # A somewhat more complex function.
  i=0
  REPEATS=30

  echo
  echo "And now the fun really begins."
  echo

  sleep $JUST_A_SECOND    # Hey, wait a second!
  while [ $i -lt $REPEATS ]
  do
    echo "----------FUNCTIONS---------->"
    echo "<------------ARE-------------"
    echo "<------------FUN------------>"
    echo
    let "i+=1"
  done
}

# Now, call the functions.

funky
fun

exit 0
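Functions become more useful once they take arguments and report a result. The sketch below goes slightly beyond the example above: inside a function, $1 refers to the function's own first argument, and the exit status of the last command becomes the function's status. The names greet and is_even are hypothetical.

```shell
# Prints a greeting for its first argument.
greet() {
    echo "Hello, $1"
}

# Succeeds (status 0) when its argument is even, so it can be used in an if.
is_even() {
    [ $(( $1 % 2 )) -eq 0 ]
}

message=$(greet world)
echo "$message"

if is_even 4
then parity=even
else parity=odd
fi
echo "4 is $parity"
```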

Debugging

The Bash shell contains no debugger, nor even any debugging-specific commands or constructs. Syntax errors or outright typos in the script generate cryptic error messages that are often of no help in debugging a non-functional script.

Example: A buggy script

#!/bin/bash
# ex74.sh

# This is a buggy script.
# Where, oh where is the error?

a=37

if [$a -gt 27 ]
then
  echo $a
fi

exit 0

Output from script:

./ex74.sh: [37: command not found

What's wrong with the above script (hint: after the if)?

Module 14

Sed:

Sed is the ultimate stream editor. If that sounds strange, picture a stream flowing through a pipe. Okay, you can't see a stream if it's inside a pipe. That's what I get for attempting a flowing analogy.

Anyhow, sed is a marvelous utility. Unfortunately, most people never learn its real power. The language is very simple, but the documentation is terrible. The Solaris on-line manual pages for sed are five pages long, and two of those pages describe the 34 different errors you can get. A program that spends as much space documenting the errors as it does documenting the language has a serious learning curve.

Sed has several commands, but most people only learn the substitute command: s. The substitute command changes occurrences of the regular expression into a new value (by default, only the first occurrence on each line). A simple example is changing "day" in the "old" file to "night" in the "new" file:

sed s/day/night/ <old >new

I didn't put quotes around the argument because this example didn't need them. If you read my earlier tutorial, you would understand why it doesn't need quotes. If you have meta-characters in the command, quotes are necessary. In any case, quoting is a good habit, and I will henceforth quote future examples. That is:

sed 's/day/night/' <old >new

There are four parts to this substitute command:

s        Substitute command
/../../  Delimiter
day      Regular Expression Pattern String
night    Replacement string
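A line that contains the pattern twice shows what the substitute command does with and without the g flag. The input string is made up for illustration.

```shell
# Without a flag, only the first match on each line is replaced.
first_only=$(echo 'day by day' | sed 's/day/night/')
echo "$first_only"

# With g, every match on the line is replaced.
every=$(echo 'day by day' | sed 's/day/night/g')
echo "$every"
```

The first command prints "night by day"; the second prints "night by night".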

sed in shell script

If you have many commands and they won't fit neatly on one line, you can break up the line using a backslash:

sed -e 's/a/A/g' \
    -e 's/e/E/g' \
    -e 's/i/I/g' \
    -e 's/o/O/g' \
    -e 's/u/U/g' <old >new

Sed is extremely powerful, and you can do things in sed that you can't do in any standard word processor. And because sed is external to the word processor and comes with every Unix system in the world, once you learn sed you'll have a very handy tool in your toolkit, even if (like me) you rarely use Unix.

How it works: You feed sed a script of editing commands (like, "change every line that begins with a colon to such-and-such") and sed sends your revised text to the screen. To save the revisions on disk, use the redirection arrow, >newfile.txt. Sample syntax:

sed "one-or-two-sed-commands" input.file >newfile.txt
sed -f bigger_sed.script input.file >newfile.txt

awk:

Awk is a ``pattern scanning and processing language'' which is useful for writing quick and dirty programs that don't have to be compiled. The calling syntax of awk is like sed:

UNIX> awk program [ file ]

or

UNIX> awk -f program-file [ file ]

Like sed, awk can work on standard input or on a file. Like the shell, if you start an awk program with

#!/bin/awk -f

then you can execute the program directly from the shell.

Most systems also have nawk, which stands for ``new awk.'' Nawk has many more features than awk and is generally more useful. I am just going to cover awk, but you should check out nawk too in your own time. Nawk has some nice things like a random number generator, that awk doesn't have.

awk programs are composed of ``pattern-action'' statements of the form:

pattern { action }

What such a statement does is apply the action to all lines that match the pattern. If there is no pattern, then it applies the action to all lines. If there is no action, then the default action is to copy the line to standard output. Patterns can be regular expressions enclosed in slashes (they can be more than that, but for now, just assume that they are regular expressions).

So, for example, the program awkgrep works just like ``grep Jim''.

UNIX> cat awkgrep
#!/bin/awk -f

/Jim/
UNIX> cat input
Which of these lines doesn't belong:

Bill Clinton
George Bush
Ronald Reagan
Jimmy Carter
Sylvester Stallone
UNIX> awkgrep input
Jimmy Carter
UNIX> awkgrep < input
Jimmy Carter
UNIX>

Awk actions basically look like C programs. There are some big differences, but for the most part, you can do most basic things that you can do in C.

Awk breaks up each line into fields, which are basically whitespace-separated words. You can get at word i by specifying $i. The variable NF contains the number of words on the line. The variable $0 is the line itself.

So, to print out the first and last words on each line, you can do:

UNIX> cat input
Which of these lines doesn't belong:

Bill Clinton
George Bush
Ronald Reagan
Jimmy Carter
Sylvester Stallone
UNIX> awk '{ print $1, $NF }' input
Which belong:

Bill Clinton
George Bush
Ronald Reagan
Jimmy Carter
Sylvester Stallone
UNIX>


An alternative awkgrep prints out $0 when it finds the pattern:

UNIX> cat awkgrep2
#!/bin/awk -f

/Jim/ { print $0 }
UNIX> awkgrep2 input
Jimmy Carter
UNIX>

Awk has a printf just like C. You don't have to use parentheses when you call it (although you can if you'd like). Unlike print, printf will not print a newline unless you ask for one. So, for example, awkrev reverses the order of the words on each line of a file:

UNIX> cat awkrev
#!/bin/awk -f

{ for (i = NF; i > 0; i-- ) printf "%s ", $i
  printf "\n" }

UNIX> awkrev input
belong: doesn't lines these of Which

Clinton Bill
Bush George
Reagan Ronald
Carter Jimmy
Stallone Sylvester
UNIX>

A few things that you'll notice about awkrev: Actions can be multiline. You don't need semicolons to separate lines like in C. However, you can specify multiple commands on a line and separate them with semicolons as in C. And you can block commands with curly braces as in C. If you want a command to span two lines (this often happens with complex printf statements), you need to end the first line with a backslash.

Also, you'll notice that awkrev didn't declare the variable i. Awk just figured out from the way it is used that it holds an integer.
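The layout rules just listed can be tried out with a sketch like the following. The function name describe_lines and its messages are invented for this example; the variables n and first are, like i in awkrev, never declared.

```shell
# Sketch: two statements on one line separated by a semicolon, a braced
# block, and a printf continued onto a second line with a backslash.
describe_lines() {
    awk '{
        n = NF; first = $1
        if (n > 1) {
            printf "%d words, first is %s, last is %s\n", \
                   n, first, $NF
        } else
            print "short line"
    }'
}

printf 'Which of these\n\n' | describe_lines
```

Lines with fewer than two words, including empty lines, take the else branch.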

Type casting

Awk lets you convert variables from one type to another on the fly. For example, to convert an integer to a string, you simply use it as a string. String construction can be done with concatenation, which is often very convenient. These principles are used in awkcast:

UNIX> echo "4 Jim" | awkcast
Word 1: as a number: 4, as a string: 4.
  0 appended: number: 40, string 40
Word 2: as a number: 0, as a string: Jim.
  0 appended: number: 0, string Jim0
UNIX>

Casting a string to an integer gives it its atoi() value.
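The listing of awkcast itself is not reproduced here. The sketch below is a hypothetical reconstruction, written as a shell function, that produces output of the same shape; the exact spacing and the helper variable s are assumptions.

```shell
# Hypothetical reconstruction of awkcast; the original listing is not shown.
awkcast() {
    awk '{
        for (i = 1; i <= NF; i++) {
            # %d uses the field as a number, %s uses it as a string.
            printf "Word %d: as a number: %d, as a string: %s.\n", i, $i, $i
            # Concatenating "0" builds a new string, which %d converts back.
            s = $i "0"
            printf "  0 appended: number: %d, string %s\n", s, s
        }
    }'
}

echo "4 Jim" | awkcast
```

A string that does not start with a number, such as Jim or Jim0, converts to 0, which is the atoi() behaviour described above.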

BEGIN and END

There are two special patterns, BEGIN and END, which cause the corresponding actions to be executed before and after any lines are processed, respectively. Therefore, the following program (awkwc) counts the number of lines and words in the input file.

UNIX> cat awkwc
#!/bin/awk -f

BEGIN { nl = 0; nw = 0 }
{ nl++ ; nw += NF }
END   { print "Lines:", nl, "words:", nw }
UNIX> awkwc awkwc
Lines: 5 words: 26
UNIX> wc awkwc
      5      26     103 awkwc
UNIX>
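The same BEGIN / main / END shape works for any running total. As a sketch under assumed input (one number at the start of each line), the hypothetical function below averages the first field.

```shell
# Sketch: average of the first field over all input lines.
average_first_field() {
    awk 'BEGIN { sum = 0; n = 0 }
         { sum += $1; n++ }
         END { if (n > 0) print "Average:", sum / n }'
}

printf '2\n4\n9\n' | average_first_field    # prints: Average: 5
```

The test on n in the END action avoids a division by zero when the input is empty.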

next and exit

Awk tries to process each statement on each line. Unlike sed, there is no ``hold space.'' Instead, each statement is processed on the original version of each line. Two special commands in awk are next and exit. Next tells awk to stop processing the current input line and go directly to the next one, skipping all the rest of the statements. Exit tells awk to exit immediately.

Here are some simple examples. awkpo prints out only the odd-numbered lines (note that this is an awkward way to do this, but it works):

UNIX> cat awkpo
#!/bin/awk -f

BEGIN { ln=0 }
{ ln++
  if (ln%2 == 0) next
  print $0
}

UNIX> cat -n input
     1  Which of these lines doesn't belong:
     2
     3  Bill Clinton
     4  George Bush
     5  Ronald Reagan
     6  Jimmy Carter
     7  Sylvester Stallone
UNIX> cat -n input | awkpo
     1  Which of these lines doesn't belong:
     3  Bill Clinton
     5  Ronald Reagan
     7  Sylvester Stallone
UNIX>

awkptR prints out all lines until it reaches a line with a capital R:

UNIX> cat awkptR
#!/bin/awk -f

/R/ { exit }
{ print $0 }

UNIX> awkptR input
Which of these lines doesn't belong:

Bill Clinton
George Bush
UNIX>

Arrays

Arrays in awk are a little odd. First, you don't have to malloc() any storage -- just use it and there it is. Second, arrays can have any indices -- integers, floating point numbers or strings. This is called ``associative'' indexing, and can be very convenient. You cannot have multi-dimensional arrays or arrays of arrays though. To simulate multidimensional arrays, you can just concatenate the indices.
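A minimal sketch of both ideas, string indices and concatenated indices, is given below. The function name count_pairs and the name/year sample data are assumptions for illustration.

```shell
# Sketch: count rows per (name, year) pair using one associative array.
# The index is the two fields joined by a space, standing in for a 2-D index.
count_pairs() {
    awk '{ count[$1 " " $2]++ }
         END { for (key in count) print key, count[key] }' | sort
}

printf 'jim 1997\nbob 1997\njim 1997\njim 1998\n' | count_pairs
```

The order in which for (key in count) visits the indices is not defined, which is why the output is piped through sort.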

Take a look at awkgolf. This is typical of quick-and-dirty awk programs that you sometimes write to look at data. This one processes golf scores. Suppose you have some score files, as in the files usopen, masters, kemper and memorial. These files first have the name of the tournament in all caps, and then scores for a bunch of golfers. Suppose you'd like to see all the golfers with scores for each tournament in a readable form. This is what awkgolf does. Let's break it into its four parts.

The first part is the BEGIN line:

BEGIN { nt = 0 ; np = 0 }


This simply initializes two variables: nt is the number of tournaments, and np is the number of players.

The next line looks a little cryptic:

/^[A-Z]*$/ { this = $0; tourn[nt] = $0 ; nt++; next }

This rule only fires on lines that consist entirely of capital letters (strictly, because of the *, the pattern also matches an empty line). These are the lines that identify tournaments. On these lines, it does the following:

- Sets the this variable to be the tournament name.
- Puts the tournament's name into the tourn array.
- Increments the nt variable.
- Skips the rest of the program and goes on to the next line.

The next part works on all lines that contain the pattern '--'. These are the lines with golfers' scores:

/--/ { golfer = $1
       for (i = 2; $i != "--" ; i++) golfer = golfer" "$i
       if (isgolfer[golfer] != "yes") {
         isgolfer[golfer] = "yes"
         g[np] = golfer
         np++;
       }
       score[golfer" "this] = $(i+1)
     }

The first two lines of this action set the golfer variable to be the golfer's name. Note that you can do string comparison in awk using the standard comparison operators (here !=), unlike in C where you would have to use strcmp().

The next 5 lines use awk's associative arrays: The array isgolfer is checked to see if it contains the string ``yes'' under the golfer's name. If so, we have processed this golfer before. If not, we set the golfer's entry in isgolfer to ``yes,'' set the np-th entry of the array g to be the golfer, and increment np.

Finally, we set the golfer's score for the tournament in the score array. Note that we don't use double-indirection. Instead, we simply concatenate the golfer's name and the tournament's name, and use that as the index for the array.

The last part of the program does the final formatting:

END { printf("%-25s", " ");
      for (j = 0; j < nt; j++) printf("%9s", tourn[j])
      printf("\n")

      for (i = 0; i < np; i++) {
        printf("%-25s", g[i])
        for (j = 0; j < nt; j++) printf("%9s", score[g[i]" "tourn[j]])
        printf("\n")
      }
    }

The first three lines print out 25 spaces, and then the names of the tournaments as held in the tourn array. Then we loop through each golfer, and print the golfer's name, padded to 25 characters, and then his score in each tournament. Note that if the golfer didn't play in the tournament, that entry of the score array will be the null string. This is quite convenient, because we don't have to test for whether the golfer played the tournament -- we can just use awk's default values.

OK, let's try awkgolf:

UNIX> awkgolf kemper    # Note that the output is only sorted because it's
                        # sorted in the input file
KEMPER
Justin Leonard -10
Greg Norman -7
Nick Faldo -7
Nick Price -7
Loren Roberts -6
Jay Haas -5
Paul Stankowski -5
Lee Janzen -4
Phil Mickelson -4
Davis Love III -3
Tom Lehman 0
Vijay Singh 0
Kirk Triplett 1
Steve Jones 2
Mark O'Meara 5
Don Pooley missed
Ernie Els missed
Fred Couples missed
Hal Sutton missed
Jesper Parnevik missed
Scott McCarron missed
Steve Stricker missed
UNIX> cat masters usopen kemper memorial | awkgolf
MASTERS USOPEN KEMPER MEMORIAL
Tiger Woods 281 6 5
Tommy Tolles 283 2 -11
Tom Watson 284 16 0
Paul Stankowski 285 6 -5 -3
Fred Couples 286 13 missed
Davis Love III 286 5 -3 -7
Justin Leonard 286 9 -10 0
Steve Elkington 287 7
Tom Lehman 287 -2 0 -3
Ernie Els 288 -4 missed -1
Vijay Singh 288 21 0 -14
Jesper Parnevik 289 11 missed -4
Lee Westwood 291 6
Nick Price 291 6 -7
Lee Janzen 292 13 -4 -11
Jim Furyk 293 2 -12
Mark O'Meara 294 9 5 -2
Scott McCarron 294 3 missed missed
Scott Hoch 298 3 -11
Jumbo Ozaki 300 missed
Frank Nobilo 303 9 -10
Bob Tway missed 2 -7
Brad Faxon missed 17 2
David Duval missed 11 -5
Greg Norman missed missed -7 -12
Loren Roberts missed 4 -6
Nick Faldo missed 11 -7
Phil Mickelson missed 10 -4
Steve Jones missed 15 2 3
Steve Stricker missed 9 missed -1
Jay Haas 2 -5 -4
Billy Andrade 4 -7
Hal Sutton 6 missed -1
Kirk Triplett 1 -2
Don Pooley missed -4
UNIX>

File redirection

You can specify that the output of print and printf go to a file with redirection. For example, to copy standard input to the file f1 you could do:

UNIX> awk '{print $0 > "f1"}' < input
UNIX> cat f1
Which of these lines doesn't belong:

Bill Clinton
George Bush
Ronald Reagan
Jimmy Carter
Sylvester Stallone
UNIX>
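The file name after > does not have to be a constant. As a sketch, the hypothetical function below sends each line to a file named after its first field; the directory argument and the ".txt" suffix are illustrative choices, not part of the original examples.

```shell
# Sketch: send each line to a file named after its first field.
# Usage: split_by_first_field DIRECTORY < input
split_by_first_field() {
    # The file name expression is parenthesised so that every awk
    # parses the concatenation as a single name.
    awk -v dir="$1" '{ print $0 > (dir "/" $1 ".txt") }'
}
```

Within one run of awk, > empties a file only the first time it is opened; later prints to the same name append to it.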

Awk without standard input

Sometimes you just want to write a program that doesn't use standard input. To do this, you just write the whole program as a BEGIN statement, exiting at the end.
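A minimal sketch of such a program is given below. The function name squares and its argument are assumptions; the point is only that the whole program sits in one BEGIN action, so awk never waits for input.

```shell
# Sketch: a program that is a single BEGIN action, so awk reads no input.
squares() {
    awk -v n="$1" 'BEGIN {
        for (i = 1; i <= n; i++) print i, i * i
        exit
    }'
}

squares 3
```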


Multiline awk programs in the Bourne shell

The Bourne shell lets you define multiline strings simply by putting newlines in the string (within single or double quotes, of course). This means that you can embed simple multiline awk scripts in a sh program without having to use cumbersome backslashes or intermediate files. For example, shwc works just like awkwc, but works as a shell script rather than an awk program.

UNIX> shwc awkwc
Lines: 5 words: 26
UNIX> shwc < awkwc
Lines: 5 words: 26
UNIX> shwc awkwc awkwc
usage: shwc [ file ]
UNIX>
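The source of shwc is not listed. The sketch below is a hypothetical reconstruction, written as a shell function rather than a script file, that behaves like the session above; the argument check and the choice to print the usage message on standard output are assumptions.

```shell
# Hypothetical reconstruction of shwc as a shell function.
shwc() {
    if [ $# -gt 1 ]; then
        echo "usage: shwc [ file ]"
        return 1
    fi
    # The awk program is one multiline single-quoted string.
    # With no file argument, "$@" expands to nothing and awk reads stdin.
    awk 'BEGIN { nl = 0; nw = 0 }
               { nl++ ; nw += NF }
         END   { print "Lines:", nl, "words:", nw }' "$@"
}
```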

Awk's limitations

Awk is useful for simple data processing. It is not useful when things get more complex, for a few reasons. First, if your data file is huge, you'll do better to write a C program (using for example the fields library from CS302/360) because it will be more efficient, sometimes by a factor of 60 or more. Second, once you start writing procedure calls in awk, it seems to me you may as well be writing C code. Third, you often find awk's lack of double indirection and string processing cumbersome and inefficient.

Awk is not a good language for string processing. Irritatingly, it doesn't let you get at string elements with array operations. I.e. the following will fail:

UNIX> cat sp.awk
{ s = $1 ; s[0] = 'a' ; print s }

UNIX> awk -f sp.awk input
awk: syntax error near line 1
awk: illegal statement near line 1
UNIX>

Of course, sed is ideal for string processing, so often you can get what you want with a combination of sed and awk.
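Within awk itself, the usual workaround for the failed assignment above is the substr() function. The sketch below, with a made-up function name, rebuilds the string with a new first character.

```shell
# Sketch: replace the first character of the first field with "a".
# awk strings are indexed from 1, so substr(s, 2) is everything after it.
replace_first_char() {
    awk '{ s = $1; s = "a" substr(s, 2); print s }'
}

echo "Bill Clinton" | replace_first_char    # prints: aill
```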


Module 15
Database Using Shell Scripts

There are one or two facts about databases. If you know anything at all about databases you'll know everything that follows.

1. A database consists of one (or more) tables, each of which consists of a sequence of identically structured rows or records. The rows (records) are subdivided into fields or columns. A schema is a table that describes a table or tables.

2. The data in a database is manipulated (updated, queried, etc.) using commands written in SQL (Structured Query Language). Many people seem to associate SQL with one particular database package. This is wrong: all well-known database packages (Oracle, MySQL, MS Access, Postgres, MS SQL Server, etc.) support SQL, although there may be minor differences.

3. Most database packages operate in a client/server fashion. The database server receives SQL requests via the net and returns results via the net. The results of such queries will, in general, be sets of rows or records. The database server is a permanently running programme, in principle similar to a WWW server. One exception to this rule is MS Access, which operates by direct manipulation of the host operating system files that hold the database tables.

4. How databases actually store their tables, schemas, etc. varies from package to package and is, almost always, of no concern to the user. For information, MS Access stores all the tables and schemas of a database in a single file whose name conventionally ends in the letters ".mdb". For each table MySQL maintains several Unix file system files, typically one for the data, one for the schema and one for the index. Oracle stores everything for all its databases in a group of 4-10 files that are built on top of the local file system.

A Shell Script (CGI Backend)

#!/bin/sh
PATH=$PATH:/usr/local/mysql/bin
export PATH
echo "Content-type: text/html"
echo
PLACE=`echo $QUERY_STRING | cut -d= -f2`
echo "<html><head><title>Shell Example #3</title></head>"
echo "<body><h1>Shell Example #3</h1><p>Results of database query for"
echo $PLACE
echo "<p>"
echo "use mydatabase;" > /tmp/$$.sql
echo "select latitude,longitude,easting,northing from gazetteer where feature = '$PLACE';" >> /tmp/$$.sql
mysql -u demo < /tmp/$$.sql > /tmp/$$.res
ROWS=`cat /tmp/$$.res | wc -l`
if [ $ROWS -eq 0 ]
then
    echo "No information for" $PLACE
else
    echo "<table border=2><tr>"
    # In the second s command the character between the first two
    # slashes is a TAB.
    tail -n +2 /tmp/$$.res | sed -e 's/^/<tr><td>/
s/	/<td>/g'
    echo "</table>"
fi
echo "</body></html>"
rm /tmp/$$.*

Actual database access is performed using the command line MySQL client programme. To ensure that this can be found, the search path is modified by the second and third lines of the script.

PATH=$PATH:/usr/local/mysql/bin
export PATH

The name of the location being queried is then extracted from the QUERY_STRING environment variable.
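The extraction step can be tried on its own. In the sketch below the query string is set by hand to a made-up value; in a real CGI run the web server sets QUERY_STRING before starting the script.

```shell
# Sketch: pull the value out of a single name=value query string.
# The sample string is made up; a real CGI server sets QUERY_STRING itself.
QUERY_STRING='place=Prague'
PLACE=`echo "$QUERY_STRING" | cut -d= -f2`
echo "$PLACE"    # prints: Prague
```

This handles only one name=value pair and does not decode characters that the browser has escaped (such as %20 or +).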

The MySQL command line client can be used non-interactively by arranging for it to read SQL from its standard input, in this case using redirection from a file. The required SQL is constructed in a temporary file.

On a normal Unix system any user can create files in the directory /tmp. The symbol $$ in the file name is replaced by the current process identification number. No two running processes share the same number, so this avoids any problems with two instances of the back end running simultaneously.
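A minimal sketch of the naming scheme, using hypothetical variable names sqlfile and resfile, might look like this:

```shell
# Sketch: build per-process scratch file names from $$ and clean them up.
sqlfile=/tmp/$$.sql
resfile=/tmp/$$.res
echo "use mydatabase;" > "$sqlfile"
: > "$resfile"                 # create an empty results file
ls /tmp/$$.*                   # both names carry this shell's process ID
rm -f /tmp/$$.*
```

Process IDs are reused once a process has finished, and the names are predictable, so on systems that provide it the mktemp command is often a safer way to obtain scratch files.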

Here is a typical example of the contents of the SQL file.

use mydatabase;
select latitude,longitude,easting,northing from gazetteer where feature = 'Prague';

The output from the MySQL client is also written to a temporary file. Typical text is shown below (for a different query).

latitude	longitude	easting	northing
195180	-21240	145	487
190860	-8040	384	346
188820	-11160	325	284
197880	-5820	424	563

It will be noted that the output file includes column names and that columns are separated by TAB characters.

The next step is to determine the number of lines in the output file; this will be zero if no matches have been found. This is done by arranging for the standard Unix command wc to read the file and write the number of lines to its standard output.

The code

if [ $ROWS -eq 0 ]
then
    echo "No information for" $PLACE
else
    echo "<table border=2><tr>"
    tail -n +2 /tmp/$$.res | sed -e 's/^/<tr><td>/
s/	/<td>/g'
    echo "</table>"
fi

operates conditionally on the number of rows. The interesting case arises when the number of rows is non-zero. In this case the standard Unix command tail is used to transfer the file, less its first line, to the standard input of the standard Unix command sed. sed is the Unix non-interactive editor that is used here to modify the MySQL command line client output by

- Inserting <tr><td> at the start of every line. Remember that the metacharacter ^ matches the start of a line in the regular expressions used by all Unix editors.

- Replacing all occurrences of TAB characters by the string <td>. The final g on the sed sub-command ensures that the substitution is global.

Note that the sed edit script, introduced by the sed command line argument -e, spreads over two lines.
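The two substitutions can be tested without a database. The sketch below feeds two tab-separated lines to a hypothetical function rows_to_html; holding the TAB in a shell variable is a choice made here so that the character is visible in the source, and the two -e arguments are equivalent to the two-line script.

```shell
# Sketch: turn tab-separated rows into HTML table rows, skipping the header.
TAB=`printf '\t'`
rows_to_html() {
    tail -n +2 | sed -e 's/^/<tr><td>/' -e "s/$TAB/<td>/g"
}

printf 'latitude\tlongitude\n195180\t-21240\n' | rows_to_html
# prints: <tr><td>195180<td>-21240
```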

Simple File Creation:

There are two simple ways to create another file: one uses the cat command in conjunction with the redirect symbol, the other uses the echo command in conjunction with the redirect symbol. The example Indented Cat in the Pipes and Redirects section is a good example of the cat method. That example contains only literal text, however. It is more appropriate to see something like the example below, which shows a variable being used in the source data block.


Example cat and variables

cat >> $sql0 <<-EOA
SET ECHO OFF
SET FEEDBACK OFF
SET HEADING OFF
SELECT my_package.my_function($column)
FROM v\$database
WHERE name LIKE '%&1%';
EXIT
EOA

sqlplus -s $uid/$password@database @$sql0 $sql_arg_1 > $log0

The file created has its name stored in the variable $sql0, and as we can see the block between the EOA flags is the data that goes into the file. The data block is actually a segment of SQL*Plus statements, as indicated by the filename variable. As is common with SQL*Plus code, the key words are picked out in ALL CAPS, with objects (tables, procedures, columns, etc.) all in lower case. The SELECT line contains a reference to a called, packaged, PL/SQL function which has a column name as an argument. Here the column name is held in a variable called $column and this will be substituted at script run-time by the real value.

There are some unfortunate consequences of generating SQL*Plus statements from within a shell script which you have to be aware of. Firstly, don't forget to put the EXIT statement at the end of the block or you will end up with a script that stays in SQL*Plus forever. Secondly, don't forget to put the semi-colons (;) at the end of every SQL statement, or each statement will overwrite the previous one or just create one long unprocessable mess. Thirdly, some internal database tables may contain the dollar symbol, which is special to the shell, so escape them with the back-slash (\) as shown on the FROM line.
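The substitution and escaping rules can be checked without SQL*Plus by sending the block to the screen instead of a file. In the sketch below the function name make_sql and the value given to column are made up for the demonstration.

```shell
# Sketch: a here-document expands $column, keeps \$ as a literal dollar sign,
# and passes single quotes through unchanged.
column=created
make_sql() {
    cat <<EOA
SELECT my_package.my_function($column)
FROM v\$database
WHERE name LIKE '%&1%';
EXIT
EOA
}

make_sql
```

The output shows my_function(created) on the SELECT line and v$database, without the back-slash, on the FROM line.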

On the WHERE line there is a reference to a SQL*Plus positional parameter '&1' which will pick up its value from the variable $sql_arg_1 at run-time as shown in the last line, just after the end of block flag. Did I say this was a simple example? Well, at least you don't have to worry about quoting when using this method. All quotes find their way to the destination file unscathed. Now to do the same thing using echo instead of cat, see the example below.

Example simple echo

echo "SET ECHO OFF" >> $sql0
echo "SET FEEDBACK OFF" >> $sql0
echo "SET HEADING OFF" >> $sql0
echo "SELECT my_package.my_function($column)" >> $sql0
echo " FROM v\$database" >> $sql0
echo " WHERE name LIKE '%&1%';" >> $sql0
echo "EXIT" >> $sql0
sqlplus -s $uid/$password@database @$sql0 $sql_arg_1 > $log0


Complex File Creation:

So what's the point of all this extra typing? Well, for one thing it allows you to put special bits of code into the block which will only be used at certain times, by hiding them in complex command groups. The example below shows how this is done.

Example complex echo forms

echo "SET ECHO OFF" >> $sql0
echo "SET FEEDBACK OFF" >> $sql0
echo "SET HEADING OFF" >> $sql0
echo "SELECT my_package.my_function($column)" >> $sql0
echo " FROM v\$database" >> $sql0
if [ "$db_type" = "m" ]
then
    echo " WHERE name = '$db_name';" >> $sql0
else
    echo " WHERE name LIKE '%&1%';" >> $sql0
fi
echo "EXIT" >> $sql0
sqlplus -s $uid/$password@database @$sql0 $sql_arg_1 > $log0

This is basically the same block except the WHERE clause has been hidden inside an if statement. Now, depending on the Database Type in the $db_type variable, the WHERE clause can take one of two forms. Conveniently, the additional argument, which is not required by SQL*Plus in the first form, is ignored at execution time, even though it is still supplied on the last line. This is common with all scripts: arguments are only used if they are referenced from within the script.

So there you have the first two ways of creating another file from a script. The version using cat can only cope with a single output form; the version using echo can output a multitude of forms depending on the complex command forms you use. The choice is yours. There are, however, other ways to create output files. You can use direct generation, as in the example List, to create a list of files. Or the indirect method shown in the example Counted List, where lines are built inside a loop construct and then appended to the file to create a menu file. Or the example Sorted List, where a list of words is sorted into alphabetic order, duplicates are removed, and the rest are stored in a file.

Example list

ls -1 *.log > $lst0

Example counted list

count=1
for file in `ls -1 *.log`
do
    echo "$count: $file" >> $mnu0
    count=`expr $count + 1`
done
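The same loop can be wrapped in a function and fed the file names as arguments, as in the sketch below. The function name counted_list is an assumption; writing to standard output rather than to $mnu0 is a choice made here so the result is easy to inspect.

```shell
# Sketch: number the arguments, one per line, as "N: name".
counted_list() {
    count=1
    for file in "$@"
    do
        echo "$count: $file"
        count=`expr $count + 1`
    done
}

counted_list a.log "b c.log"
```

Calling it as counted_list *.log >> $mnu0 lets the shell expand the pattern directly, which keeps file names containing spaces in one piece, something the output of ls inside backquotes does not.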

Example sorted list

echo $@ | tr ' ' '\n' | sort -u > $lst0


Module 16
OVERVIEW OF PERL

What is perl?

Perl, sometimes referred to as Practical Extraction and Report Language, is an interpreted programming language with a huge number of uses, libraries and resources. Arguably one of the most discussed and used languages on the internet, it is often referred to as the swiss army knife, or duct tape, of the web.

Perl was first brought into being by Larry Wall circa 1987 as a general purpose Unix scripting language to make his programming work simpler. Although it has far surpassed his original creation, Larry Wall still oversees development of the core language, and the newest version, Perl 6.

Running Perl

The simplest way to run a Perl program is to invoke the Perl interpreter with the name of the Perl program as an argument:

perl sample.pl

The name of the Perl file is sample.pl, and perl is the name of the Perl interpreter. This example assumes that Perl is in the execution path; if not, you will need to supply the full path to Perl too:

/usr/local/bin/perl sample.pl

This is the preferred way of invoking Perl because it eliminates the possibility that you might accidentally invoke a copy of Perl other than the one you intended. We will use the full path from now on to avoid any confusion.

This type of invocation is the same on all systems with a command-line interface. The following line will do the trick on Windows NT, for example:

c:\NTperl\perl sample.pl


Invoking Perl on UNIX

UNIX systems have another way to invoke an interpreter on a script file. Place a line like

#!/usr/local/bin/perl

at the start of the Perl file. This tells UNIX that the rest of this script file is to be interpreted by /usr/local/bin/perl. Then make the script itself executable:

chmod +x sample.pl

You can then "execute" the script file directly and let the script file tell the operating system what interpreter to use while running it.

You can supply Perl command-line arguments on the interpreter invocation line in UNIX scripts. The following line is a good start to any Perl script:

#!/usr/local/bin/perl -w -t

A Perl Script

A Perl program consists of an ordinary text file containing a series of Perl commands. Commands are written in what looks like a bastardized amalgam of C, shell script, and English. In fact, that's pretty much what it is.

Perl code can be quite free-flowing. The broad syntactic rules governing where a statement starts and ends are

- Leading white space is ignored. You can start a Perl statement anywhere you want: at the beginning of the line, indented for clarity (recommended), or even right-justified (definitely frowned on) if you like.

- Commands are terminated with a semicolon.

- White space outside of string literals is irrelevant; one space is as good as a hundred. That means you can split statements over several lines for clarity.

- Anything after a pound sign (#) is ignored. Use this to pepper your code with useful comments.

Here's a Perl statement:

print "My name is Sreedhar\n";

No prizes for guessing what happens when Perl runs this code; it prints

My name is Sreedhar

If the \n doesn't look familiar, don't worry; it simply means that Perl should print a newline character after the text; in other words, Perl should go to the start of the next line.

Printing more text is a matter of either stringing together statements or giving multiple arguments to the print function:

print "My name is Sreedhar,\n";
print "I live in Bangalore,\n",
      "I work in a Wipro there.\n";

That's right, print is a function. It may not look like it in any of the examples so far, where there are no parentheses to delimit the function arguments, but it is a function, and it takes arguments. You can use parentheses in Perl functions if you like; it sometimes helps to make an argument list clearer. More accurately, in this example the function takes a single argument consisting of an arbitrarily long list. We'll have much more to say about lists and arrays later, in the "Data Types" section. There will be a few more examples of the more common functions in the remainder of this chapter, but refer to the "Functions" chapter for a complete run-down on all of Perl's built-in functions.

So what does a complete Perl program look like? Here's a trivial UNIX example, complete with the invocation line at the top and a few comments:

#!/usr/local/bin/perl -w                # Show warnings

print "My name is Sreedhar,\n";         # Let's introduce ourselves
print "I live in Bangalore,\n",
      "I work in a Wipro there.\n";     # Remember the line breaks

That's not at all typical of a Perl program though; it's just a linear sequence of commands with no structural complexity. The "Flow Control" section later in this overview introduces some of the constructs that make Perl what it is. For now, we'll stick to simple examples like the preceding for the sake of clarity.

Exercise:

1. Write a shell script to modify all files in a directory.
2. Write a shell script that presents a user screen which allows the user to enter data in a file, delete a record, add a record, and update or query the file.


Appendix A

List of basic UNIX Commands:

The basic UNIX commands include some of the most commonly used commands for users, and constructs for building shell scripts.

The following charts offer a summary of some simple UNIX commands. These are certainly not all of the commands available in this robust operating system, but these will help you get started.

Ten ESSENTIAL UNIX Commands:

These are ten commands that you really need to know in order to get started with UNIX. They are probably similar to commands you already know for another operating system.

Command      Example                  Description

1. ls        ls                       Lists files in current directory
             ls -alF                  List in long format
2. cd        cd tempdir               Change directory to tempdir
             cd ..                    Move back one directory
             cd ~dhyatt/web-docs      Move into dhyatt's web-docs directory
3. mkdir     mkdir graphics           Make a directory called graphics
4. rmdir     rmdir emptydir           Remove directory (must be empty)
5. cp        cp file1 web-docs        Copy file into directory
             cp file1 file1.bak       Make backup of file1
6. rm        rm file1.bak             Remove or delete file
             rm *.tmp                 Remove all files ending in .tmp
7. mv        mv old.html new.html     Move or rename files
8. more      more index.html          Look at file, one page at a time
9. lpr       lpr index.html           Send file to printer
10. man      man ls                   Online manual (help)


Ten VALUABLE UNIX Commands:

Once you have mastered the basic UNIX commands, these will be quite valuable in managing your own account.

Command                     Example                          Description

1. grep <str> <files>       grep "bad word" *                Find which files contain a certain word
2. chmod <opt> <file>       chmod 644 *.html                 Change file permissions (owner may write, others read only)
                            chmod 755 file.exe               Change file permissions to executable
3. passwd                   passwd                           Change password
4. ps <opt>                 ps aux                           List all running processes by #ID
                            ps aux | grep dhyatt             List process #IDs running by dhyatt
5. kill <opt> <ID>          kill -9 8453                     Kill process with ID #8453
6. gcc (g++) <source>       gcc file.c -o file               Compile a program written in C
                            g++ fil2.cpp -o fil2             Compile a program written in C++
7. gzip <file>              gzip bigfile                     Compress file
                            gunzip bigfile.gz                Uncompress file
8. mail (pine)              mail [email protected] < file1     Send file1 by email to someone
                            pine                             Read mail using pine
9. telnet <host>            telnet vortex.tjhsst.edu         Open a connection to vortex
   ssh <host>               ssh -l dhyatt jazz.tjhsst.edu    Open a secure connection to jazz as user dhyatt
10. ftp <host>              ftp station1.tjhsst.edu          Upload or download files to station1
    ncftp <host/directory>  ncftp metalab.unc.edu            Connect to archives at UNC


Ten FUN UNIX Commands:

These are ten commands that you might find interesting or amusing. They are actually quite helpful at times, and should not be considered idle entertainment.

Command                     Example                          Description

1. who                      who                              Lists who is logged on your machine
2. finger                   finger                           Lists who is on computers in the lab
3. ytalk <user@place>       ytalk dhyatt@threat              Talk online with dhyatt who is on threat
4. history                  history                          Lists commands you've done recently
5. fortune                  fortune                          Print random humorous message
6. date                     date                             Print out current date
7. cal <mo> <yr>            cal 9 2000                       Print calendar for September 2000
8. xeyes                    xeyes &                          Keep track of cursor (in "background")
9. xcalc                    xcalc &                          Calculator ("background" process)
10. mpage <opt> <file>      mpage -8 file1 | lpr             Print 8 pages on a single sheet and send to printer (the font will be small!)


Ten HELPFUL UNIX Commands

These ten commands are very helpful, especially with graphics and word processing type applications.

Command                     Example                          Description

1. netscape                 netscape &                       Run Netscape browser
2. xv                       xv &                             Run graphics file converter
3. xfig / xpaint            xfig & (xpaint &)                Run drawing program
4. gimp                     gimp &                           Run photoshop type program
5. ispell <fname>           ispell file1                     Spell check file1
6. latex <fname>            latex file.tex                   Run LaTeX, a scientific document tool
7. xemacs / pico            xemacs (or pico)                 Different editors
8. soffice                  soffice &                        Run StarOffice, a full word processor
9. m-tools (mdir, mcopy,    mdir a:                          DOS commands from UNIX (dir A:)
   mdel, mformat, etc.)     mcopy file1 a:                   Copy file1 to A:
10. gnuplot                 gnuplot                          Plot data graphically


Ten USEFUL UNIX Commands:

These ten commands are useful for monitoring system access, or simplifying your own environment.

Command                     Example                          Description

1. df                       df                               See how much free disk space
2. du                       du -b subdir                     Estimate disk usage of directory in bytes
3. alias                    alias lls="ls -alF"              Create new command "lls" for long format of ls
4. xhost                    xhost +threat.tjhsst.edu         Permit window to display from x-window program from threat
                            xhost -                          Allow no x-window access from other systems
5. fold                     fold -s file1 | lpr              Fold or break long lines at spaces (80 columns unless -w is given) and send to printer
6. tar                      tar -cf subdir.tar subdir        Create an archive called subdir.tar of a directory
                            tar -xvf subdir.tar              Extract files from an archive file
7. ghostview (gv)           gv filename.ps                   View a Postscript file
8. ping (traceroute)        ping threat.tjhsst.edu           See if machine is alive
                            traceroute www.yahoo.com         Print data path to a machine
9. top                      top                              Print system usage and top resource hogs
10. logout (exit)           logout or exit                   How to quit a UNIX shell