
Scripting with Perl and Tcl

Hans Petter Langtangen

Simula Research Laboratory

and

Department of Informatics

University of Oslo

Table of Contents

1 Introduction to Perl
  1.1 A Scientific Hello World Script
    1.1.1 Reading and Writing Data Files
    1.1.2 The Complete Code
    1.1.3 Dissection
    1.1.4 The Concept of Context in Perl
  1.2 Automating Simulation and Visualization
    1.2.1 The Complete Code
    1.2.2 Dissection
  1.3 There’s More Than One Way To Do It
    1.3.1 A Script for Perl Beginners
    1.3.2 Using the Underscore Variable
    1.3.3 A Script Written in Typical Perl Style
    1.3.4 Shorter Scripts for Lazy Programmers
    1.3.5 The Ultimate Goal: Getting Rid of the Script File
    1.3.6 Perl Has a Grep Function Too
  1.4 Frequently Encountered Tasks
    1.4.1 Basic Control Statements
    1.4.2 File Reading and Writing
    1.4.3 Running an Application
    1.4.4 One-Line Perl Scripts
    1.4.5 Array and List Operations
    1.4.6 Hash Operations
    1.4.7 Splitting and Joining Text
    1.4.8 Text Processing
    1.4.9 String Operations
    1.4.10 Environment Variables
    1.4.11 Subroutines
    1.4.12 Nested, Heterogeneous Data Structures
    1.4.13 Testing a Variable’s Type
    1.4.14 Numerical Expressions
    1.4.15 Listing of Files in a Directory
    1.4.16 Testing File Types
    1.4.17 Copying and Renaming Files
    1.4.18 Creating and Moving to Directories
    1.4.19 Removing Files and Directories
    1.4.20 Splitting Pathnames
    1.4.21 Traversing Directory Trees
    1.4.22 Downloading Internet Files
    1.4.23 CPU-Time Measurements
    1.4.24 Programming with Classes
    1.4.25 Debugging Perl Scripts
    1.4.26 Regular Expressions
    1.4.27 Exercises
    1.4.28 Building and Using Modules
    1.4.29 Binary Input/Output
  1.5 Installing Perl and Additional Modules
    1.5.1 Installing Basic Perl
    1.5.2 Manual Installation of Perl Modules
    1.5.3 Automatic Installation of Perl Modules
    1.5.4 The Required Perl Modules
  1.6 Perl Versus Python
    1.6.1 Python’s Advantages
    1.6.2 Perl’s Advantages
    1.6.3 Efficiency
  1.7 GUI Programming with Perl/Tk
    1.7.1 The First Perl/Tk Encounter
    1.7.2 The Similarity of Python/Tkinter and Perl/Tk
    1.7.3 Binding Events
  1.8 Web Interfaces and CGI Programming
    1.8.1 Web Versions of the Scientific Hello World Program
    1.8.2 Debugging CGI Scripts in Perl with CGI::Debug
    1.8.3 Using Perl’s CGI Module to Construct Forms

2 Introduction to Tcl/Tk
  2.1 A Scientific Hello World Script
    2.1.1 The Complete Code
    2.1.2 Dissection
  2.2 Reading and Writing Data Files
    2.2.1 The Complete Code
    2.2.2 Dissection
    2.2.3 Double Quotes, Braces, Brackets, and Variable Substitution
  2.3 Automating Simulation and Visualization
    2.3.1 The Complete Code
    2.3.2 Dissection
  2.4 Frequently Encountered Tasks
    2.4.1 File Reading and Writing
    2.4.2 Running an Application
    2.4.3 List Operations
    2.4.4 Associative Array Operations
    2.4.5 Splitting and Joining Text
    2.4.6 Text Processing
    2.4.7 String Operations
    2.4.8 Numerical Expressions
    2.4.9 Environment Variables
    2.4.10 Procedures
    2.4.11 Listing of Files in a Directory
    2.4.12 Testing File Types
    2.4.13 Copying and Renaming Files
    2.4.14 Creating and Moving to Directories
    2.4.15 Removing Files and Directories
    2.4.16 Splitting Pathnames
    2.4.17 Traversing Directory Trees
    2.4.18 CPU-Time Measurements
    2.4.19 Programming with Classes
    2.4.20 Building and Using Packages
    2.4.21 Regular Expressions
    2.4.22 Exercises
  2.5 GUI Programming with Tcl/Tk
    2.5.1 The First Tcl/Tk Encounter
    2.5.2 Binding Events
    2.5.3 Widget Name Hierarchy
    2.5.4 The Similarity of Python/Tkinter and Tcl/Tk
    2.5.5 Using Variables in Widget Names
    2.5.6 Configuring Widgets
    2.5.7 The Grid Geometry Manager

Preface

The purpose of this document is to show how the introductory programming examples from the book “Python Scripting for Computational Science” [?] can be implemented in Perl and Tcl. In addition, we list some core functionality of these scripting languages, typically corresponding to the same information and examples as in Chapter 3 (Basic Python) in [?]. If you know the examples in a Python context from Chapters 2 (Getting Started with Python Scripting) and 3 (Basic Python) in [?], it is quite easy to pick up basic Perl and Tcl from the present note. The Perl and Tcl chapters can be read independently.

The author has a desire to include other scripting languages, e.g., Ruby and Scheme. Potential authors of such (independent) chapters, with the same structure as the Perl and Tcl chapters, are encouraged to drop me an email ([email protected]).

The present printing of the document contains the Perl part only.

Chapter 1

Introduction to Perl

This chapter gives a quick introduction to the Perl language for readers who are familiar (at least to some extent) with the Python scripts from Chapters 2.1 (A Scientific Hello World Script) through 2.3 (Gluing Stand-Alone Applications) and Chapter 3 (Basic Python) in the book [?]. We shall look at the same sample scripts and show how the syntax changes when we program in Perl.

Recommended Documentation. As a companion to the introductory examples and the overview of basic Perl functionality provided in this appendix, you need the Perl man pages. These come along with the Perl distribution. I find it convenient to read the man pages in plain text format using the perldoc tool. Some common ways of looking up information with perldoc are exemplified below.

perldoc perl       # overview of all Perl man pages
perldoc perlsub    # read about subroutines
perldoc Cwd        # look up a special module, here 'Cwd'
perldoc -f open    # look up a special function, here 'open'
perldoc -q cgi     # search the FAQ for the text 'cgi'

A Web version of the man pages can be found in the doc.html file. There you can also find the Perl FAQ and a quick reference.

Having grasped the basic introduction to Perl from this appendix, you will find the definitive Perl reference, the famous “Camel book” [?], very useful. However, much of the text in [?] coincides with the Perl man pages. If you feel that a more comprehensive introduction to Perl is needed, “Learning Perl” [?] and [?] are recommended. Ready-made recipes for numerous common tasks in scripting are collected in the highly recommended “Perl Cookbook” [?]. Advanced features of Perl are well discussed in [?] and [?]. Some Web resources regarding Perl topics are listed in doc.html.

The first Perl encounter consists of three of the examples from the introduction to Python in Chapter 2 (Getting Started with Python Scripting) in [?]. We start out with a Hello World script, before continuing with a script concerning file handling and array processing. Thereafter we present a script gluing a simulation and a visualization program. All scripts referred to in this section are found in src/perl. Thereafter, in Chapter 1.4 we list, in an example-oriented way, some basic and useful Perl functionality for quick reference. Chapter 1.5 explains how to install Perl and additional modules. A brief comparison of Perl versus Python appears in Chapter 1.6, while Chapters 1.7 and 1.8 deal with graphical user interfaces: standard GUIs and dynamic Web pages, respectively.

1.1 A Scientific Hello World Script

Our first look at Perl will be the Scientific Hello World script from Chapter 2.1 (A Scientific Hello World Script) in [?]. This script reads a real number from the command line, takes the sine of the number, and writes “Hello, World! sin(r)=s” with the appropriate values of the numbers r and s. In Perl, we can write the script like this:

#!/usr/bin/perl
$r = $ARGV[0];   # fetch the first ([0]) command-line argument
$s = sin($r);    # compute sin(r) and store in variable s
print "Hello, World! sin($r)=$s\n";   # print to standard output

Comments in Perl start with # and continue for the rest of the line. However, the first line #!/usr/bin/perl has a special meaning: Under Unix it specifies that the script, if run as an executable file, is to be interpreted by the program /usr/bin/perl. If the executable Perl interpreter is stored in another path on your system, you must write the correct full path in the top line of the script or (usually better) use a different header to be presented in Chapter 1.1.1.

Scalar variables in Perl are always preceded by a $ sign, i.e., $r and $s are scalar variables in the present script. The command-line arguments to a Perl script are automatically stored in the array ARGV. Subscripting this array is done as in $ARGV[0] (which implies extracting the first entry; arrays in Perl start with 0 as in C and Python). The length of the array is $#ARGV+1, i.e., $ARGV[$#ARGV] is the last entry of the array. The array itself as a variable is reached with the syntax @ARGV (and one can say, e.g., print "ARGV=@ARGV").
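The indexing rules above can be tried out in a small sketch. To make it behave the same regardless of how it is invoked, it uses an ordinary array @args with made-up contents as a stand-in for @ARGV; the same syntax applies to @ARGV itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# @args is a hypothetical stand-in for @ARGV with made-up entries
my @args = ("0.1", "-v", "out.dat");

my $last_index = $#args;          # index of the last entry
my $length     = $#args + 1;      # number of entries
my $count      = scalar(@args);   # the same number, via scalar context

print "first entry: $args[0]\n";
print "last entry:  $args[$last_index]\n";
print "length:      $length (scalar gives $count)\n";
print "all entries: @args\n";     # entries joined by single spaces
```

Replacing @args by @ARGV and running the script with a few arguments shows the same behavior for real command-line input.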

Variables can be directly inserted into a text string, a convenient feature called variable interpolation:

print "Hello, World! sin($r)=$s\n"; # print to screen

Such variable interpolation works only if the string is surrounded by double quotes. With single quotes the text is written out literally, dollar characters included.
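A minimal sketch of the difference between the two quoting styles follows; the value assigned to $r is arbitrary and chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $r = 0.5;        # arbitrary sample value
my $s = sin($r);

my $double = "sin($r)=$s";   # variables are replaced by their values
my $single = 'sin($r)=$s';   # text is kept literally, dollar signs included

print "$double\n";
print "$single\n";
```

The first line of output contains the numbers, whereas the second shows the variable names exactly as typed.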

Perl’s syntax is much inspired by C. For example, the newline character is \n and all statements are terminated by a semicolon.

As usual in scripting, variables are never declared; the context determines the type. Contrary to Python, a variable can be used both as a string and a floating-point number. For example, $r is initialized to a text, but can be sent to the sine function, which expects a floating-point variable, without any explicit type conversion.

Perl’s printf function gives good control of the output format of numbers and strings:

printf "Hello, World! sin(%g)=%12.5e\n", $r, $s;

There is no possibility to control the format when using variable interpolation (i.e., Python’s %(s)12.5e is not supported).
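One possible workaround, offered here as a sketch and not taken from the original scripts, is to format the number with sprintf first and then interpolate the resulting string. The variable names are the same as in the Hello World script; the value 0.1 is just a sample.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $r = 0.1;                          # sample input value
my $s = sprintf("%12.5e", sin($r));   # format first ...

print "Hello, World! sin($r)=$s\n";   # ... then interpolate the string
```

The design choice is simply to separate formatting from output: sprintf accepts the same format strings as printf but returns the text instead of printing it.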

If the script is stored in a file hw.pl, you can execute the script by typing

perl hw.pl 0.1

or you can make the file executable under Unix (chmod a+x hw.pl) and then just write

./hw.pl 0.1

1.1.1 Reading and Writing Data Files

Chapter 2.2 (Working with Files and Data) in [?] deals with a script for reading a file with (x, y) data points in two columns and writing a new two-column file with transformed data points (x, f(y)). On the next pages we shall present and explain a Perl counterpart to the Python scripts. This case study demonstrates how to work with files, subroutines, and arrays in Perl.

1.1.2 The Complete Code

: # *-*-perl-*-*
eval 'exec perl -w -S $0 ${1+"$@"}'
  if 0;  # if running under some shell

die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;

($infilename, $outfilename) = @ARGV;

open(INFILE,  "<$infilename");   # open for reading
open(OUTFILE, ">$outfilename");  # open for writing

# read one line at a time:
while (defined($line=<INFILE>)) {
    ($x, $y) = split(' ', $line);  # extract x and y value
    $fy = myfunc($y);              # transform y value
    printf(OUTFILE "%g %12.5e\n", $x, $fy);
}
close(INFILE); close(OUTFILE);

sub myfunc {
    my ($y) = @_;
    if ($y >= 0.0) { return $y**5.0*exp(-$y); }
    else           { return 0.0; }
}

This script is stored in src/perl/datatrans1.pl.


1.1.3 Dissection

The Perl script starts with a header

: # *-*-perl-*-*
eval 'exec perl -w -S $0 ${1+"$@"}'
  if 0;  # if running under some shell

This header ensures that executing the script as

./datatrans1.pl infile outfile

implies interpreting the code by the first perl program encountered in the directories listed in your PATH environment variable. The explanation of all the details in our Perl header is intricate, but it can be found in the file src/perl/headerfun.sh. (This is actually a document written in Bash (!) so you need to run the file to get the document printed.)

In the case where the user has failed to provide two command-line arguments, we want to write a usage message and abort the script. This is accomplished by Perl’s die statement: die prints a string on standard error and terminates the script. In the present example the script dies if there are fewer than two command-line arguments:

die "Usage: $0 infilename outfilename\n" if $#ARGV < 1;

Recall that $#ARGV is the last legal index in @ARGV, i.e., the length of @ARGV is$#ARGV+1, so the test is $#ARGV+1 < 2, leading to $#ARGV < 1.

Extracting the first two command-line arguments can be performed by standard subscripting:

$infilename  = $ARGV[0];
$outfilename = $ARGV[1];

However, it is more common (and elegant) to use Perl’s list assignment construction:

($infilename, $outfilename) = @ARGV;

The list on the left-hand side is set equal, entry by entry, to the entries in the array on the right-hand side. We refer to the remark at the end of this section for an explanation of the difference between list and array in Perl terminology.
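The entry-by-entry rule has a few consequences that are easy to check in a small sketch: surplus entries on the right are ignored, an array on the left absorbs all remaining entries, and variables without a matching entry become undefined. The array @args and its contents are hypothetical stand-ins for @ARGV.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical stand-in for @ARGV
my @args = ("in.dat", "out.dat", "extra");

my ($infile, $outfile)      = @args;  # "extra" is silently ignored
my ($first, @rest)          = @args;  # @rest absorbs the remaining entries
my ($p, $q, $r, $missing)   = @args;  # no fourth entry: $missing is undef

print "infile=$infile outfile=$outfile\n";
print "first=$first rest=@rest\n";
print "fourth variable is ", (defined $missing ? "defined" : "undef"), "\n";

# list assignment also gives a compact way of swapping two variables
($infile, $outfile) = ($outfile, $infile);
print "after swap: infile=$infile outfile=$outfile\n";
```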

Opening files in Perl is done with the open function:

open(INFILE,  "<$infilename");   # open for reading
open(OUTFILE, ">$outfilename");  # open for writing

The first argument to open is a file handle, which is used for accessing the file in the Perl code. Input files are recognized by < in front of the name (see footnote 1), > signifies an output file, and >> implies that text will be appended to the file.

Reading from a file handle, line by line, is accomplished by

while (defined($line=<INFILE>)) {
    # process $line
}

In the present script we want to split the line into an array of words, separated by whitespace. The split function performs this task:

($x, $y) = split(' ', $line);  # extract x and y value
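The string ' ' as first argument to split is a special case: leading whitespace is skipped and the line is split on runs of whitespace. The following sketch contrasts this with splitting on a single-space pattern; the sample line is made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = "  1.5   2.25\n";       # made-up line with irregular spacing

my @words  = split(' ', $line);    # special case: split on runs of whitespace
my @pieces = split(/ /, $line);    # split on every single space character

print scalar(@words),  " words:  ", join("|", @words),  "\n";
print scalar(@pieces), " pieces: ", join("|", @pieces);
```

With the special ' ' form only the two numbers remain, while the single-space pattern yields empty strings for adjacent spaces and leaves the newline attached to the last piece.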

Having the coordinates $x and $y available, we can transform the y value by calling a function myfunc,

$fy = myfunc($y);  # transform y value

One way of printing the transformed coordinate pair to the output file is to apply the printf function:

printf(OUTFILE "%g %12.5e\n", $x, $fy);

The core of a printf call is the format string, which follows the same syntax as in C and Python (and all other languages that support C’s printf style of formatting). Perl’s ordinary print function can also be used for writing to files, e.g., print OUTFILE "$x $fy\n";

The myfunc function is defined as

sub myfunc {
    my ($y) = @_;
    if ($y >= 0.0) { return $y**5.0*exp(-$y); }
    else           { return 0.0; }
}

Functions are referred to as subroutines in Perl. Their look is typically

sub name {
    # all subroutine arguments are stored in the array @_
    ...
    return ...
}

Footnote 1: If there is no < symbol, the file is opened for reading. In fact, open(F,"<$name"), open(F,"$name"), and open(F,$name) all lead to opening a file with name $name for reading.

The most striking difference from subprograms in other languages is that the argument list is not a part of the subroutine heading. Instead, all arguments are available in an array @_. The first step is normally to store the arguments in local variables:

my ($y) = @_;    # list assignment
# or
my $y = $_[0];   # subscripting

The my keyword specifies that all variables on the left-hand side are declared as local variables in the subroutine. This is a good habit, as unintended use of global variables inside a subroutine may have undesired effects in other parts of the script.
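As an illustration of argument handling and local variables, consider the sketch below. The subroutine scale_and_shift and its arguments are hypothetical and only serve to show several arguments being unpacked from @_, an optional argument, and the fact that a my variable inside the subroutine does not disturb a variable with the same name outside.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# hypothetical helper: returns factor*x + offset, with offset optional
sub scale_and_shift {
    my ($x, $factor, $offset) = @_;       # unpack the arguments
    $offset = 0 unless defined $offset;   # default for a missing argument
    return $factor*$x + $offset;
}

my $x = 10;   # variable outside the subroutine, same name as the local one

my $with_offset    = scale_and_shift(2, 3, 1);
my $without_offset = scale_and_shift(2, 3);

print "with offset: $with_offset, without offset: $without_offset\n";
print "x outside the subroutine is still $x\n";
```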

As in Chapter 2.2 (Working with Files and Data) in [?], we can modify datatrans1.pl such that (i) the file is loaded into an array of lines, (ii) the x and y coordinates are stored in two arrays, and (iii) the output file is written by a for loop over the array entries.

We start with making the open statement a bit more robust. Perl does not by default write any error message if the file we try to open does not exist. This can be quite annoying, but the problem is solved by a “try something or die” construction:

open(INFILE, "<$infilename")
    or die "unsuccessful opening of $infilename; $!\n";

The $! variable is a special variable in Perl containing the last error message issued by the operating system.
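The behavior of the “or die” construction can be observed without aborting the script by wrapping it in an eval block, as in the sketch below. Two things here are assumptions beyond the text: the file name is made up and presumed not to exist, and the sketch uses the three-argument form of open with a lexical file handle, a later Perl idiom, instead of the two-argument form with a bareword handle shown above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# made-up file name, assumed not to exist on the system
my $name = "no_such_dir_for_this_sketch/data.txt";

# eval traps the die so that the message can be inspected afterwards
my $ok = eval {
    open(my $in, "<", $name)
        or die "unsuccessful opening of $name; $!\n";
    close($in);
    1;   # reached only if open succeeded
};

print "caught: $@" unless $ok;   # $@ holds the message passed to die
```

On a typical Unix system the message ends with the operating system's text for a missing file, which is exactly the information $! contributes.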

Loading a file into an array of lines is enabled by the syntax

@lines = <INFILE>;

One can then process the array @lines, line by line:

for $line (@lines) {
    # process $line
}

# equivalent syntax:
foreach $line (@lines) {
    # process $line
}

In the present case we want to create two arrays, @x and @y, containing the x and y coordinates:

@x = (); @y = ();   # start with empty arrays
for $line (@lines) {
    ($xval, $yval) = split(' ', $line);
    push(@x, $xval); push(@y, $yval);
}

The x and y coordinates are extracted by splitting the line with respect to whitespace, exactly as we did in the datatrans1.pl code. The push function appends new array entries.

Creating the output file can now be performed by a C-like for loop over the array indices:

open(OUTFILE, ">$outfilename")
    or die "unsuccessful opening of $outfilename; $!\n";

for ($i = 0; $i <= $#x; $i++) {
    $fy = myfunc($y[$i]);  # transform y value
    printf(OUTFILE "%g %12.5e\n", $x[$i], $fy);
}
close(OUTFILE);

Recall that $#x is the last valid index in the array @x. The complete code is found in src/perl/datatrans2.pl.

Remarks on Terminology. Perl distinguishes between the terms array and list. Roughly speaking, an array is the variable having a list as value [?, Ch. 4.0]. For example, in an assignment @a = ("a","b","c"), @a is an array, whereas its value ("a","b","c") is a list. The function push operates on array variables and not on lists, meaning that push(@a,"q") works well, while push(("a","b","c"),"q") does not make sense.

1.1.4 The Concept of Context in Perl

Operations in Perl are evaluated in a specific context. For newcomers to the language the context concept can be quite confusing. A thorough explanation of context is provided in the “Camel” book [?, Ch. 2] or the perldata man page (invoke perldoc perldata and search for “Context”). Here we shall only exemplify the two major contexts: scalar and list. The assignment

@a = ("a","b","c");

evaluates the list on the right-hand side in a list context, and @a becomes an array variable having its entries equal to the three scalars in the list ("a","b","c"). When assigning the list to a scalar,

$a = ("a","b","c");

the list on the right-hand side is evaluated in a scalar context. In this case, the value of the list is the value of the last element (as with the C comma operator). Therefore, $a becomes "c". On the other hand,

$b = @a;

evaluates the array variable @a in a scalar context, and its value is then the length of the array. That is, $b becomes 3.
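The effect of context on an array variable can be summarized in a short sketch. The variable names are chosen for illustration; the parentheses around $head are what turn the second assignment into a list assignment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @a = ("a", "b", "c");

my $n      = @a;           # scalar context: the length of the array
my ($head) = @a;           # list context: the first entry
my $count  = scalar(@a);   # scalar context requested explicitly

print "n=$n head=$head count=$count\n";
```

The scalar function is handy inside argument lists, such as a print statement, where an array would otherwise be evaluated in list context.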

These examples show that an array variable can have a list as value in a list context and its length as value in a scalar context. A hash evaluated in a scalar context becomes true if there are elements in the hash, and false otherwise (see footnote 2).

The property that an array evaluates to its length in a scalar context is often taken advantage of by Perl programmers. Two common applications are

for ($i = 0; $i < @a; $i++) {
    # work with $a[$i] ...
}

die "Usage: $0 file" unless @ARGV;
die "Usage: $0 -f file" unless @ARGV == 2;

Especially the two latter examples have an attractive readability.

Footnote 2: There is more information in the scalar value, see the perldata man page.

The return value of many Perl functions depends on the context. One example is localtime:

$t = localtime();

yields the date as a string; $t is "Sun May 13 09:02:27 2001", for instance. In a list context,

@t = localtime();

localtime returns a list of nine values containing the time, day, month, year, etc. (see perldoc -f localtime), and @t becomes an array of numbers, say (27, 2, 9, 13, 4, 101, 0, 132, 1).
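A sketch of how the nine values are typically unpacked is given below. The variable names follow the order documented in perldoc -f localtime; note that the month counts from 0 and the year counts from 1900, which is why the sample list above contains 4 for May and 101 for 2001. The date layout built with sprintf is just one arbitrary choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $now = time;                  # seconds since the epoch

my $stamp = localtime($now);     # scalar context: a readable string

# list context: nine values in the documented order
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst)
    = localtime($now);

# month is 0-based and year is relative to 1900
my $iso = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                  $year + 1900, $mon + 1, $mday, $hour, $min, $sec);

print "$stamp\n$iso\n";
```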

1.2 Automating Simulation and Visualization

Chapter 2.3 (Gluing Stand-Alone Applications) in [?] describes a simple simulation code, called oscillator, for solving a differential equation modeling an oscillating system. Using a script, we can improve the user friendliness of the simulation code and also launch a visualization of the solution. A Python version of such a script is explained in detail in Chapter 2.3 in [?], and the purpose of the present section is to present the Perl version of that script.

1.2.1 The Complete Code

: # *-*-perl-*-*
eval 'exec perl -w -S $0 ${1+"$@"}'
  if 0;  # if running under some shell

# default values of input parameters:
$m = 1.0; $b = 0.7; $c = 5.0; $func = "y"; $A = 5.0;
$w = 2*3.14159; $y0 = 0.2; $tstop = 30.0; $dt = 0.05;
$case = "tmp1"; $screenplot = 1;

# read variables from the command line, one by one:
while (@ARGV) {
    $option = shift @ARGV;   # load cmd-line arg into $option
    if ($option eq "-m") {
        $m = shift @ARGV;    # load next command-line arg
    }
    elsif ($option eq "-b")            { $b = shift @ARGV; }
    elsif ($option eq "-c")            { $c = shift @ARGV; }
    elsif ($option eq "-func")         { $func = shift @ARGV; }
    elsif ($option eq "-A")            { $A = shift @ARGV; }
    elsif ($option eq "-w")            { $w = shift @ARGV; }
    elsif ($option eq "-y0")           { $y0 = shift @ARGV; }
    elsif ($option eq "-tstop")        { $tstop = shift @ARGV; }
    elsif ($option eq "-dt")           { $dt = shift @ARGV; }
    elsif ($option eq "-noscreenplot") { $screenplot = 0; }
    elsif ($option eq "-case")         { $case = shift @ARGV; }
    else {
        die "$0: invalid option '$option'\n";
    }
}

# create a subdirectory with name equal to case and generate
# all files in this subdirectory:
$dir = $case;
use File::Path;     # contains the rmtree function
if (-d $dir) {      # does $dir exist?
    rmtree($dir);   # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir)       or die "Could not move to $dir; $!\n";

# make input file to the program:
open(F,">$case.i") or die "open error; $!\n";
print F "
    $m
    $b
    $c
    $func
    $A
    $w
    $y0
    $tstop
    $dt
";
close(F);

# run simulator:
$cmd = "oscillator < $case.i";  # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;

# make gnuplot script:
open(F, ">$case.gnuplot");
print F "set title '$case: m=$m b=$b c=$c f(y)=$func A=$A w=$w y0=$y0 dt=$dt';";
if ($screenplot) {
    print F "plot 'sim.dat' title 'y(t)' with lines;\n";
}
print F <<EOF;  # print multiple lines using a "here document"
set size ratio 0.3 1.5, 1.0;
# define the postscript output format:
set term postscript eps monochrome dashed 'Times-Roman' 28;
# output file containing the plot:
set output '$case.ps';
# basic plot command:
plot 'sim.dat' title 'y(t)' with lines;
# make a plot in PNG format as well:
set term png small;
set output '$case.png';
plot 'sim.dat' title 'y(t)' with lines;
EOF
close(F);
# make plot:
$cmd = "gnuplot -geometry 800x200 -persist $case.gnuplot";
$failure = system($cmd);
die "running gnuplot failed\n" if $failure;

The complete source code appears in src/perl/simviz1.pl.

1.2.2 Dissection

The script starts with a safe Perl header, which ensures interpretation of the script by the first Perl interpreter found in the user’s path. After having assigned default values to the input parameters to the oscillator code, we encounter an important part of many scripts, namely parsing of command-line arguments. The idea is that we “eat” the entries in @ARGV one by one using the shift operator:

$option = shift @ARGV;

This statement implies setting $option equal to the first element in @ARGV and then removing this element from @ARGV (see footnote 3). We search for options on the command line until the @ARGV array is empty:

while (@ARGV) {              # while @ARGV is non-empty
    $option = shift @ARGV;   # load command-line arg. into $option
    if ($option eq "-m") {
        $m = shift @ARGV;    # load next command-line arg
    }
    elsif ($option eq "-b") { $b = shift @ARGV; }
    ...
    else {
        die "$0: invalid option '$option'\n";
    }
}

As an alternative to this explicit grabbing of command-line arguments, we can use a special Perl utility called GetOptions [?, p. 445]:

use Getopt::Long;  # load module with GetOptions function
GetOptions("m=f" => \$m, "b=f" => \$b, "c=f" => \$c,
           "func=s" => \$func, "A=f" => \$A, "w=f" => \$w,
           "y0=f" => \$y0, "tstop=f" => \$tstop,
           "dt=f" => \$dt, "case=s" => \$case,
           "screenplot!" => \$screenplot);

Footnote 3: Experienced Perl programmers will often write just $option = shift; because shift without arguments, when used outside a subroutine, implies shifting @ARGV. More examples regarding such shortcuts in Perl are provided in Chapter 1.3.

The syntax m=f means searching for the command-line argument --m and loading the following argument as a floating-point number (=f) into the Perl variable $m. A single hyphen as in -m works too. Similarly, func=s specifies --func to take a string argument. The specification of the flag screenplot allows us to use either --screenplot for setting $screenplot to a true value or --noscreenplot for setting $screenplot to a false value (note that to get this on/off behavior, the exclamation mark is required in "screenplot!" => \$screenplot). The GetOptions function has a rich functionality; the purpose here is just to notify the reader about the existence of such a handy function. Instructive information is obtained from perldoc Getopt::Long. There are several other modules in the Getopt family, for example Getopt::Simple for a simplified interface to Getopt::Long, Getopt::Std for single-character options, Getopt::Mixed for long and single-character options, and Getopt::Declare for handling command-line options or configuration files with associated help text and initialization code.
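To experiment with these option specifications without typing command lines, one can use GetOptionsFromArray, which Getopt::Long exports on request in reasonably recent versions and which parses an ordinary array instead of @ARGV. The sketch below uses a subset of the options from simviz1.pl; the contents of @args are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# default values, as in simviz1.pl (only a subset of the options)
my ($m, $func, $case, $screenplot) = (1.0, "y", "tmp1", 1);

# made-up "command line"; note the mix of single and double hyphens
my @args = ("--m", "2.5", "-func", "siny", "--noscreenplot",
            "--case", "test1");

GetOptionsFromArray(\@args,
    "m=f"         => \$m,           # floating-point argument
    "func=s"      => \$func,        # string argument
    "case=s"      => \$case,        # string argument
    "screenplot!" => \$screenplot,  # on/off flag: --screenplot, --noscreenplot
) or die "invalid command-line options\n";

print "m=$m func=$func case=$case screenplot=$screenplot\n";
```

Recognized options are removed from @args during parsing, in the same way as GetOptions removes them from @ARGV.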

The next step in our script is to move to the prescribed directory. However, we should first check whether the directory exists, and if so, we should delete it and recreate it to avoid mismatch between old and new result files. Checking if a directory exists is done by the command if (-d $directoryname) in Perl. Removing a non-empty directory can be conveniently done by first loading an external Perl module, use File::Path, and then calling the function rmtree in that module:

use File::Path;     # has the rmtree function
if (-d $dir) {      # does $dir exist?
    rmtree($dir);   # remove directory (old files)
}
mkdir($dir, 0755) or die "Could not create $dir; $!\n";
chdir($dir)       or die "Could not move to $dir; $!\n";

Observe that we test for success of mkdir. For example, insufficient permission to create a new directory will not be noticeable when running the script unless we include the or die statement (footnote 4).

The next task is to write an input file for the oscillator program. Multi-line output can easily be created through an ordinary string with embedded newlines (footnote 5):

print F "
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
";

4 Python will in such cases abort the script and write a “Permission denied” message to standard output. See Exercise 1.8.

5 Python requires a triple quoted string for this purpose.

Alternatively, we can use a special Perl construction (stemming from Unix shells), known as a here document:

print F <<EOF;
$m
$b
$c
$func
$A
$w
$y0
$tstop
$dt
EOF

Everything between the two EOF marks is treated as output text. The closing EOF must start in the first column of the script file. The Gnuplot script later in the simviz1.pl code is actually written as a here document.
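To see that a here document and an ordinary string with embedded newlines produce the same text, one can compare the two directly. This is a small sketch; the parameter values are made up for the purpose of the comparison.

```perl
use strict;
use warnings;

# Hypothetical parameter values, chosen only for illustration.
my ($m, $c, $dt) = (1.0, 5.0, 0.05);

# Ordinary double-quoted string with explicit newlines:
my $plain = "$m\n$c\n$dt\n";

# The same text as a here document; the terminator EOF must stand
# alone at the very start of a line.
my $heredoc = <<EOF;
$m
$c
$dt
EOF

print "identical\n" if $plain eq $heredoc;
```

Variables are interpolated in both forms, so the choice between them is mostly a matter of readability.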

Perl’s system function is used for running applications:

$cmd = "oscillator < $case.i";  # command to run
$failure = system($cmd);
die "running the oscillator code failed\n" if $failure;

Visualization of the solution in Gnuplot requires writing a small script with the proper Gnuplot commands:

open(F, ">$case.gnuplot");
print F <<EOF;  # print multiple lines using a "here document"
...
# output file containing the plot:
set output '$case.ps';  # variable interpolation
...
EOF
close(F);

# make plot:
$failure = system("gnuplot $case.gnuplot");
die "running gnuplot failed\n" if $failure;

Never forget to close files before continuing with system commands involving the generated files!

1.3 There’s More Than One Way To Do It

A famous Perl slogan is “There’s More Than One Way To Do It” (often abbreviated TIMTOWTDI, pronounced “Tim Toady”). The goal of the present section is to exemplify this slogan and demonstrate different Perl programming styles. We shall develop scripts for finding files containing a specified string and show that there might be many different Perl solutions to a programming problem.

When working with computers, you have probably often tried to find a file containing some particular text without remembering what the filename is. If you remember parts of the text, the Unix grep command is handy. For example,

grep superLibFunc *

searches all files (*) in the current working directory for the text string superLibFunc and writes out the matches. This can help you find the file you are looking for. We shall present a cross-platform Perl script, which implements the grep functionality.

1.3.1 A Script for Perl Beginners

A verbose, easy-to-read grep script in Perl can take the following form.

: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

die "Usage: $0 pattern file1 file2 ...\n" if $#ARGV < 1;

# first command-line argument is the pattern to search for:
$pattern = shift @ARGV;
# run through the next command-line arguments, i.e. files, and grep:
while (@ARGV) {
    $file = shift @ARGV;
    if (-f $file) {
        open(FILE,"<$file");
        @lines = <FILE>;  # read all lines
        foreach $line (@lines) {
            if ($line =~ /$pattern/) {
                print "$file: $line";
            }
        }
        close(FILE);
    }
}

The only new statement here is

if ($line =~ /$pattern/)

which tests whether the variable $line matches the regular expression contained in $pattern. If so, we write out this line.
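The matching test can be explored without any files. In the sketch below, the pattern and the list of lines are hypothetical stand-ins for a command-line argument and the contents of a file.

```perl
use strict;
use warnings;

# Hypothetical pattern and lines, standing in for real input.
my $pattern = 'super\w+Func';
my @lines = ("call superLibFunc(x);\n",
             "nothing here\n",
             "superFastFunc();\n");

my @hits;
foreach my $line (@lines) {
    # =~ binds the string on the left to the regular expression on the right
    push @hits, $line if $line =~ /$pattern/;
}
print @hits;
```

Since the pattern is interpolated into the regular expression, any regex metacharacters in it (such as \w+ here) keep their special meaning.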


1.3.2 Using the Underscore Variable

The Perl program can be written more compactly using the implicit $_ variable. Let us present the code first and then explain what the syntax means.

#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
    if (-f) {
        open(FILE,"<$_");
        foreach (<FILE>) {
            if (/$pattern/) {
                print;
            }
        }
        close(FILE);
    }
}

The extraction of command-line arguments is elegantly performed by dividing the arguments into the leading search string and an array holding the filenames:

($pattern, @files) = @ARGV;

Many Perl commands can be issued without an explicit variable to work with. One example is foreach (@files). In such cases the “invisible” variable is $_. That is, foreach (@files) actually means foreach $_ (@files).

The previous code is best explained by showing the equivalent Perl statements where the $_ appears explicitly:

#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach $_ (@files) {
    if (-f $_) {
        open(FILE,"<$_");
        foreach $_ (<FILE>) {
            if ($_ =~ /$pattern/) {
                print $_;
            }
        }
        close(FILE);
    }
}

1.3.3 A Script Written in Typical Perl Style

A more modern Perl style could be introduced in the script that makes use of the implicit $_ variable:

#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
    next unless -f;
    open FILE, $_;
    foreach (<FILE>) { print if /$pattern/; }
    close FILE;
}

The next unless -f statement means that one jumps to the next iteration in the loop unless the test if (-f $_) is true, i.e., unless the current filename ($_) is an existing file.

1.3.4 Shorter Scripts for Lazy Programmers

There are many shortcuts in Perl aimed at lazy programmers. Here is an example of a grep script equivalent to those above, but with a much more compact file reading construction:

#!/usr/bin/perl
$pattern = shift;  # shift; means shift @ARGV
while (<>) {       # read line by line in file by file
    print if /$pattern/o;  # o increases the efficiency
}

The while (<>) loop implies reading all lines in all files whose names are in @ARGV. (If there are no filenames on the command line, <> reads from standard input.) Since processing a list of files in a line-oriented fashion is a frequently encountered task in scripts, while (<>) is a popular and widely used construction that saves quite some typing. It goes without saying that each line is available in the $_ variable.

1.3.5 The Ultimate Goal: Getting Rid of the Script File

We can also do the grep operation with a command-line Perl script:

perl -n -e 'print if /superLibFunc/;' file1 file2 file3

Here, the -n option tells Perl to invoke a loop over all lines in all files specified on the command line (equivalent to while (<>)) and execute the string after -e as a Perl script applied to each line. Implicit here is that the line is stored in the $_ variable.

1.3.6 Perl Has a Grep Function Too

The grep operation is so common that Perl has in fact a built-in grep function:

#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach $file (@files) {
    if (-f $file) {
        open FILE, "<$file";
        @lines = <FILE>;
        @match = grep /$pattern/, @lines;  # Perl grep
        print "$file: @match";
        close FILE;
    }
}

The grep function searches for $pattern in a list of all the lines in the file and returns a list with the lines that match $pattern. Of course, this readable script can be condensed to two lines if desired, using the <> notation:

#!/usr/bin/perl
$pattern = shift;
print grep /$pattern/, <>;

Observe that we here do not easily print the filename.

Remark. We should mention that reading the whole file into memory at once, which is implied by @lines=<FILE> and also by the <> operator when it is used in list context (as in the two-line script above), may face memory problems if you work with large data files. The line-by-line reading can then be more appropriate.
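The behavior of the built-in grep function is easy to check on a small list. The lines and the pattern below are made up for illustration; the block form and the scalar-context count are standard features of grep, shown here as a sketch.

```perl
use strict;
use warnings;

# Hypothetical data standing in for the lines of a file.
my @lines   = ("alpha\n", "beta\n", "gamma\n", "delta\n");
my $pattern = 'l';

# List context: grep returns the matching elements.
my @match = grep /$pattern/, @lines;

# Scalar context: grep returns the number of matching elements.
# The block form { ... } accepts any expression, not only a regex.
my $count = grep { /$pattern/ } @lines;

print "$count matches: @match";
```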

Exercise 1.1. Modify a very Perl-ish grep script.
Consider a grep script in typical modern Perl style:

#!/usr/bin/perl
die "Usage: $0 pattern file1 file2 ...\n" unless @ARGV >= 2;
($pattern, @files) = @ARGV;
foreach (@files) {
    next unless -f;
    open FILE, $_;
    foreach (<FILE>) { print if /$pattern/; }
    close FILE;
}

Extend this script such that the filename and the line number are printed at the beginning of the lines that match the given string. You can count the number of lines in the last foreach loop, or you can make use of Perl’s special variable $., which holds the line number of the current line. Write the line number in a field of width (say) 5 characters such that the output is nicely aligned in three columns (filename, line number, line), see Exercise 8.4 on page 349 in [?] for a sample output.

Observe how simple such an extension would have been if we had used named variables instead of $_, or in other words, readability and extendability are seldom well supported by extensive use of $_.


1.4 Frequently Encountered Tasks

Frequently encountered tasks in Perl scripts have been collected and organized in the present section, with the aim of providing a kind of example-oriented quick reference for the reader. The following tasks are covered:

– basic control structures,

– file reading and writing,

– (multi-line) output with format control,

– executing other programs,

– working with arrays and hashes,

– splitting, joining, searching, and replacing text,

– writing and calling Perl subroutines,

– checking a file’s type, size, and age,

– listing and removing files,

– creating and removing directories,

– moving to directories, traversing directory trees,

– measuring CPU time,

– building simple Perl modules,

– working with regular expressions.

1.4.1 Basic Control Statements

A typical if-else test follows this syntax:

if ($answer eq "copy") {
    $copy = 1;
} elsif ($answer == 0) {  # numeric comparison (== for numbers, eq for strings)
    $quit = 1;
} elsif ($answer eq 'run' or $answer eq 'execute') {
    $run = 1;
} else {
    print "Invalid answer $answer\n";
}

Perl has numerous ways of writing if tests. Some examples are

if ($pen ne "up") { $pen = "up"; }
if (not $pen eq "up") { $pen = "up"; }
if (!($pen eq "up")) { $pen = "up"; }
$pen = "up" if $pen ne "up";
$pen = "up" if not $pen eq "up";
$pen = "up" if ! ($pen eq "up");
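When negating a string comparison, note that ! binds more tightly than eq, whereas not has very low precedence. The sketch below (with a made-up value of $pen) shows the three variants side by side; the parentheses after ! are what make the negation apply to the whole comparison.

```perl
use strict;

my $pen = "down";

# not has very low precedence: parsed as not($pen eq "up")
my $with_not = (not $pen eq "up") ? 1 : 0;

# ! binds tighter than eq: without parentheses this is (!$pen) eq "up"
my $bang_unparenthesized = (! $pen eq "up") ? 1 : 0;

# with parentheses, ! negates the whole comparison
my $bang_parenthesized = (!($pen eq "up")) ? 1 : 0;

print "$with_not $bang_unparenthesized $bang_parenthesized\n";  # 1 0 1
```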

The for or foreach statement visits the entries in an array, entry by entry:

# convert some PostScript files to GIF:
@somelist = ('file1.ps', 'file2.ps', 'file3.ps');
for $psfile (@somelist) {
    $giffile = $psfile; $giffile =~ s/\.ps$/.gif/;
    system("convert ps:$psfile gif:$giffile");
}

There is both a while loop and a do-while loop in Perl:

$r = 0; $dr = 0.1;
while ($r <= 10) {
    $s = sin($r); print "$s\n";
    $r += $dr;
}

$r = 0; $dr = 0.1;
do {
    $s = sin($r); print "$s\n";
    $r += $dr;
} while ($r <= 10);

The last statement breaks out of a loop:

for $line (@list_of_lines) {
    last if $line =~ /^% set/;
}

The next statement continues with the next iteration in the loop:

# process regular files only:
for $file (@files) {
    next if not -f $file;  # continue with next file
    # process $file:
    ...
}
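The effect of next and last can be traced on a short list of numbers. The data and the stopping criterion in this sketch are arbitrary choices for illustration.

```perl
use strict;
use warnings;

# Hypothetical data: negative entries are skipped, a zero ends the loop.
my @entries = (3, -1, 4, -1, 5, 0, 9);
my @kept;
for my $x (@entries) {
    next if $x < 0;   # continue with the next entry
    last if $x == 0;  # leave the loop at the first zero
    push @kept, $x;
}
print "@kept\n";  # 3 4 5
```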

1.4.2 File Reading and Writing

The following code segments demonstrate opening a file and reading it line by line or loading it into a list of lines:

$infilename = "myprog.cpp";
open(INFILE, "<$infilename")  # open for reading
    or die "Cannot read file $infilename; $!\n";
@lines = <INFILE>;  # load file into a list of lines

# alternative reading, line by line:
while (defined($line = <INFILE>)) {
    # process $line
}

# quicker variant, using $_:
while (<INFILE>) {
    # process current line, stored in $_
}
close(INFILE);


The recipe for opening a file for writing a list of lines is given next.

$outfilename = "myprog2.cpp";
open(OUTFILE, ">$outfilename")  # open for writing
    or die "Cannot write to file $outfilename; $!\n";
$line_no = 0;  # count the line number in @lines
foreach $line (@lines) {
    $line_no++;
    print OUTFILE "$line_no: $line";
}
close(OUTFILE);

We can proceed with appending text to a file, using Perl’s features for writing (large) blocks of text in one output statement, with embedded variables if desired:

open(OUTFILE, ">>$filename")  # open for appending
    or die "Cannot append to file $filename; $!\n";

# print multiple lines at once, using a "here document":
print OUTFILE <<EOF;
/*
  This file, "$outfilename", is a version
  of "$infilename" where each line is numbered.
*/
EOF

# equivalent output using a string instead:
print OUTFILE "/*
  This file, \"$outfilename\", is a version
  of \"$infilename\" where each line is numbered.
*/\n";

close(OUTFILE);

If you need to treat a file handle, such as OUTFILE, like a variable, e.g., when sending it to a function, you should use Perl’s FileHandle objects, see perldoc FileHandle.
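A related way of treating a file handle as a variable, available in Perl 5.6 and later, is to let open store the handle in an ordinary my variable. The sketch below is an illustration of that idiom, not code from the scripts in this chapter: the helper write_numbered is hypothetical, and the output goes to an in-memory string (supported from Perl 5.8) so that no files on disk are needed.

```perl
use strict;
use warnings;

# Hypothetical helper: writes numbered lines to whatever handle it receives.
sub write_numbered {
    my ($fh, @lines) = @_;
    my $line_no = 0;
    foreach my $line (@lines) {
        $line_no++;
        print $fh "$line_no: $line";
    }
}

# Three-argument open with a lexical handle; the "file" is here an
# in-memory string instead of a file on disk.
my $buffer = '';
open(my $out, '>', \$buffer) or die "Cannot open in-memory file; $!\n";
write_numbered($out, "first\n", "second\n");
close($out);

print $buffer;
```

Replacing \$buffer by a filename gives the usual file output; the subroutine does not need to know the difference.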

1.4.3 Running an Application

Any operating system command can be executed by calling the system function. Here is an example involving running an application myprog:

$cmd = "myprog -c file.1 -p -f -q";
$failure = system("$cmd > res");  # output goes to file res
die "$0: running $cmd failed\n" if $failure;

A different way of testing for failure is

system("$cmd > res") == 0 or die "$0: running $cmd failed\n";

The return value from system is also available in the special Perl variable $?:

system("$cmd > res");
die "$0: running $cmd failed\n" if $?;
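The value returned by system (and stored in $?) is the raw wait status, not the exit code itself; the exit code of the application sits in the upper byte. The following sketch uses the running Perl interpreter ($^X) as a stand-in application so that it does not depend on any external program.

```perl
use strict;
use warnings;

# $^X is the path of the running perl, used here as a stand-in application
# that terminates with exit code 3.
my $status = system($^X, '-e', 'exit 3');

# Shift away the lower byte to obtain the application's exit code.
my $exit_code = $status >> 8;
print "exit code: $exit_code\n";
```

A plain truth test on the return value, as in the examples above, is sufficient when we only need to know whether the command failed.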

To redirect the output from the application into a list of lines, one can use back quotes:

$cmd = "myprog -c file.1 -p -f -q";
@res = `$cmd`;

Alternatively, one can open a pipe to the application and read the output as if it were a file:

open(APP, "$cmd |");
@res = <APP>;

# alternative line by line reading:
open(APP, "$cmd |");
while (<APP>) {
    # process the current line, stored in $_
}
close(APP);

Pipes can also be used for running interactive applications. After having opened a write pipe to a program, we can issue various commands, which are executed upon closing the pipe. Here is an example involving the interactive Gnuplot program:

open (GNUPLOT, "| gnuplot -persist");  # open a pipe to Gnuplot
print GNUPLOT "set xrange [0:10]; set yrange[-2:2]\n";
print GNUPLOT "plot sin(x)\n";  # draw a sine function
print GNUPLOT "quit\n";
close(GNUPLOT);  # run Gnuplot with the commands

1.4.4 One-Line Perl Scripts

Perl supports some command-line options for wrapping a script with a loop over all lines in a series of files. This is very convenient for creating one-line scripts on the fly. For example,

perl -p -i.bak -e '...' file1 file2 file3

runs a loop over all lines in file1, file2, and file3. For each line, the Perl commands provided inside the quotes (after the -e option) are executed, and the -p option implies that the line is printed after execution of the commands. Without the -i option the printing goes to standard output, but with -i the files are modified in-place, i.e., the original file is replaced by the new output. With -i.bak the file file1 is first copied to file1.bak before it is overwritten. The -p and -i.bak options are normally combined into -pi.bak. Each line in the files is stored in $_. As an illustration we can let the script specified by the -e option be s/float/double/g; meaning that float is replaced by double in some files (here file1, file2, and file3):

perl -pi.bak -e 's/float/double/g;' file1 file2 file3

To avoid automatic printing of each line, we can replace the -p option by -n. Suppose a data file has numbers in a series of columns, separated by whitespace, and you want to extract the first and the fourth column. The relevant one-liner is then

perl -ne '@s=split; print "$s[0]\t$s[3]\n"' datafile

Calling split without an argument implies splitting $_ with respect to whitespace. The equivalent Perl script, stored in a file, in this latter example can also be made very short:

while (<>) {@s=split; print "$s[0]\t$s[3]\n";}

1.4.5 Array and List Operations

The most common statements for creating and traversing arrays are listed next. Creating an array with three entries goes like this:

@arglist = ($myarg1, "displacement", "tmp.ps");

We can use an array as an entry too,

@arr = ($var1, $var2);
@arglist = ($myarg1, "displacement", @arr, "tmp.ps");

but @arglist does not have an array as the third element; the @arr array’s entries are simply inserted in @arglist, i.e., @arglist now contains

($myarg1, "displacement", $var1, $var2, "tmp.ps");

To force the third entry to be the @arr array, this entry must be a reference to @arr, obtained by prefixing @arr with a backslash (see page 29):

@arglist = ($myarg1, "displacement", \@arr, "tmp.ps");
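The difference between inserting an array and inserting a reference to it shows up in the length of the resulting list. The entries in this sketch are arbitrary placeholder strings.

```perl
use strict;
use warnings;

my @arr = ('a', 'b');

my @flat   = ('x', @arr,  'y');  # the entries of @arr are inserted
my @nested = ('x', \@arr, 'y');  # one entry holds a reference to @arr

print scalar(@flat), "\n";    # 4 entries
print scalar(@nested), "\n";  # 3 entries
print "$nested[1][0]\n";      # first entry of the referenced array: a
print "@{$nested[1]}\n";      # all entries of the referenced array: a b
```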

New entries can be appended to an array using the push function, e.g.,

push(@arglist, $myvar2);
push(@arglist, @arr2);

Changing entries is enabled by subscripting, e.g.,

$arglist[2] = "displacement";

Traversing lists applies the for or foreach loop,

foreach $entry (@arglist) {
    print "entry is $entry\n";
}
# or
for $entry (@arglist) {
    print "entry is $entry\n";
}

Index-based traversal is also possible:

for ($i = 0; $i <= $#arglist; $i++) {
    print "entry is $arglist[$i]\n";
}
# or
for ($i = 0; $i < @arglist; $i++) {
    print "entry is $arglist[$i]\n";
}

A widely used shortcut for creating a list of strings is the qw operator:

@strlist = qw/item1 item2 item3/;
# equivalent to:
@strlist = ("item1", "item2", "item3");

The qw operator is frequently used in Perl/Tk programming.

Extracting entries from an array is often performed by a list assignment, e.g.,

($filename, $plottitle, $psfile) = @arglist;

This assignment works regardless of the length of @arglist (footnote 6). If @arglist has (say) two elements, $psfile becomes an undefined variable. The final list entry on the left-hand side can be a list, e.g.,

($filename, $plottitle, @rest) = @arglist;
# @rest becomes $arglist[2], $arglist[3] and so on

The shift function returns and removes the first array element:

$first_entry = shift @arglist;

The pop function returns and removes the last array element:

$last_entry = pop @arglist;

Without arguments, shift and pop work on @ARGV in the main program and @_ in subroutines, e.g.,

$file = shift;  # same as shift @ARGV;

sub myroutine {
    my $arg1 = shift;  # same as shift @_;
    my $arg2 = shift;
    ...
}

6 Similar list assignments in Python require that the lists on each side of the assignment operator have equal lengths.

Array items can be changed in-place:

# @A is some array of numbers
for ($i=0; $i<=$#A; $i++) {
    if ($A[$i] < 0.0) { $A[$i] = 0.0; }
}
# @A does not contain negative numbers

The following construction also works, i.e., entries in @A are changed (footnote 7):

for $r (@A) {
    if ($r < 0.0) { $r = 0.0; }
}

Perl arrays allow slicing: @arglist[1..3] returns the second up to and including the fourth entry, that is, 1..3 denotes the indices 1-3.
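A short sketch of slicing, with placeholder entries; besides a range, a slice can also take an explicit list of indices, where negative indices count from the end of the array.

```perl
use strict;
use warnings;

my @arglist = qw/a b c d e/;

my @middle = @arglist[1..3];   # indices 1, 2 and 3
my @ends   = @arglist[0, -1];  # first and last entry

print "@middle\n";  # b c d
print "@ends\n";    # a e
```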

Unlike Python, an array assignment like

@a = @b;

creates a new array @a where each element is a copy of the corresponding array element in @b. To make a refer to the array b, as in the Python assignment a = b, we need to let a be a reference:

$a = \@b;

See page 29 for more information about references and how to access the values referred to by $a.
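The difference between copying an array and referring to it becomes visible when the original array is modified afterwards. The sketch below uses the name $ref for the reference (instead of $a) merely to avoid confusion with the special sort variable $a.

```perl
use strict;
use warnings;

my @b   = (1, 2, 3);
my @a   = @b;    # independent copy of the entries
my $ref = \@b;   # reference to the same array

$b[0] = 99;      # change the original array

# the copy is unaffected, the reference sees the change;
# $ref->[0] and ${$ref}[0] are two notations for the same entry
print "$a[0] $ref->[0] ${$ref}[0]\n";  # 1 99 99
```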

Reversing the order of the entries in an array is performed by the reverse function:

@reversed_strlist = reverse(@strlist);

Sorting an array is also easy:

@sorted_list = sort(@list);  # sort in ascending ASCII order

The sort order can be controlled by a user-defined function, e.g.,

sub numeric_sort {
    if ($a < $b) { return -1; }
    elsif ($a == $b) { return 0; }
    else { return 1; }
}
@sorted_list = sort numeric_sort @list;

7 The similar construction does not work in Python (cf. the example starting on page 87 in [?]).

The arguments $a and $b in sort criteria routines are automatically initialized by Perl and used instead of the @_ array for speed. The numeric_sort routine is often required, but writing a separate subroutine is actually not necessary because Perl already has a compound comparison operator <=> that works with numbers:

@sorted_list = sort { $a <=> $b } @list; # numeric sort

The expression $a <=> $b evaluates to -1, 0 or 1, depending on whether $a is less than, equal to, or greater than $b, respectively. The corresponding operator for text is cmp. We refer to the description of the sort function in perldoc perlfunc (or write just perldoc -f sort) for numerous examples on writing customized sort functions, e.g., case-insensitive text comparison.

The perlfunc man page is very useful; if you wonder about the Perl function name for doing a specific task, write perldoc perlfunc and search for keywords in this man page.
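The three most common sort criteria can be compared on small lists. The numbers and words in this sketch are arbitrary; the case-insensitive criterion lowercases both operands with lc before comparing them with cmp.

```perl
use strict;
use warnings;

my @list  = (10, 9, 100, 1);
my @words = qw/banana apple Cherry/;

my @ascii   = sort @list;                          # default: string order
my @numeric = sort { $a <=> $b } @list;            # numeric order
my @nocase  = sort { lc($a) cmp lc($b) } @words;   # ignore case

print "@ascii\n";    # 1 10 100 9
print "@numeric\n";  # 1 9 10 100
print "@nocase\n";   # apple banana Cherry
```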

1.4.6 Hash Operations

A hash, also known as associative array in other languages, or dictionary in Python, is a kind of array where the index, called key, can be an arbitrary text. For example, all command-line options to a script could be stored in a hash with the name of the option (without any hyphens) as key:

$cmlargs{'m'} = 1.2;      # or $cmlargs{m} = 1.2;
$cmlargs{'tstop'} = 6.0;  # or $cmlargs{tstop} = 6.0;

This allows for easy processing of a large number of command-line arguments and corresponding script variables. Here is a possible code segment:

# init the entire hash with default values:
# (the entire hash is preceded by %)
%cmlargs = (
    'tstop' => 6.0,
    'm'     => 1.2
);

while (@ARGV) {  # run through all command-line arguments
    $option = shift @ARGV;
    $option = substr($option, 2);  # strip off hyphens (--)
    if (exists($cmlargs{$option})) {
        # next command-line argument is the value:
        $value = shift @ARGV;
        $cmlargs{$option} = $value;
    } else {
        die "The option $option is not registered\n";
    }
}
# traverse the hash structure, key by key:
foreach $option (keys %cmlargs)
    { print "cmlargs{'$option'}=$cmlargs{$option}\n"; }

With this technique you could develop various tools for initializing and processing command-line options, and each time you need to add a new variable and a corresponding option to the script, you can simply add one new line to the initialization of the default values in the hash cmlargs.
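A few further hash operations are often needed together with the ones above: adding and deleting entries, testing for a key, and visiting the keys in a defined order (keys returns them in no particular order). The key dt in this sketch is a made-up addition to the example hash.

```perl
use strict;
use warnings;

my %cmlargs = (tstop => 6.0, m => 1.2);

$cmlargs{dt} = 0.05;   # add a new key (hypothetical option dt)
delete $cmlargs{m};    # remove a key and its value

my $has_m = exists $cmlargs{m} ? 1 : 0;   # test for a key
my @keys  = sort keys %cmlargs;           # keys in sorted order

print "has m: $has_m, keys: @keys\n";
```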

1.4.7 Splitting and Joining Text

The split function splits a string according to a delimiter string or a regular expression. A common use of split is to split a text into words:

$files = "case1.ps case2.ps case3.ps";
@filenames = split(' ', $files);  # split wrt whitespace

The entries in @filenames become

("case1.ps", "case2.ps", "case3.ps")

The behavior of split(' ', $str) is equivalent to str.split() in Python, i.e., whitespace surrounding the words is ignored. Any string delimiter can be used, e.g.,

$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(', ', $files);

results in @filenames as

("case1.ps", "case2.ps", "case3.ps")

The split function can also split with respect to a regular expression, just as re.split in Python, e.g.,

$files = "case1.ps, case2.ps, case3.ps";
@filenames = split(/,\s*/, $files);

This results in the correct split of $files.

(There is a slight difference between Perl and Python when splitting a string with respect to whitespace using the regular expression \s+. Leading and trailing blanks result in an empty string as first and last element in the returned list when using Python, whereas Perl’s split function does not produce an array element corresponding to the trailing blanks.)
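The different treatment of leading and trailing whitespace can be verified on a string with blanks at both ends. The test string is made up; the counts in the comments follow from the split rules described above.

```perl
use strict;
use warnings;

my $str = "  case1.ps   case2.ps  ";

my @by_space = split(' ', $str);     # leading and trailing blanks ignored
my @by_regex = split(/\s+/, $str);   # leading blanks give an empty first field

print scalar(@by_space), "\n";  # 2
print scalar(@by_regex), "\n";  # 3

# comma followed by optional whitespace as delimiter:
my @by_comma = split(/,\s*/, "case1.ps, case2.ps,case3.ps");
print join("|", @by_comma), "\n";  # case1.ps|case2.ps|case3.ps
```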

The join command is the inverse of split:

@filenames = ("case1.ps", "case2.ps", "case3.ps");
$cmd = "print " . join(" ", @filenames);

yields $cmd as the string "print case1.ps case2.ps case3.ps".


1.4.8 Text Processing

A basic issue in text processing is recognizing and replacing parts of a text. Recognizing text can be done in several ways:

# exact string match:
if ($line eq "double") {  # is $line equal to "double"?
    ...
}

# matching with full regular expressions:
if ($line =~ /double/) {  # does $line contain double?
    # (here, double can be replaced by any valid regular expression)
    ...
}

Note that in Perl, the comparison operators for strings and numbers are different (footnote 8), e.g., eq and ne for strings vs. == and != for numbers, see also Chapter 1.4.14.

Here is an example regarding substituting float by double everywhere in a file:

$copyfilename = "$filename.old~~";
rename($filename, $copyfilename);  # keep the original file as a backup
open(FILE, "<$copyfilename") or die "$0: couldn't open file; $!\n";
$filestr = join("", <FILE>);  # read lines and join them to a string
close(FILE);

$filestr =~ s/float/double/g;  # substitute

open(FILE, ">$filename");  # write to the orig file
print FILE $filestr;       # print the whole (modified) file
close(FILE);

Since the need for such types of file substitutions often arises, Perl offers a one-line statement for accomplishing the task:

perl -pi.old~~ -e 's/float/double/g;' *.c

See page 20 for an explanation of the various parts of this command.
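The substitution itself can be tested on a string before it is applied to files. In the sketch below the text is made up, and the word boundaries \b are an addition (not used in the script above) that prevents matches inside longer identifiers. The substitution operator returns the number of replacements made.

```perl
use strict;
use warnings;

# Hypothetical file contents held in a string.
my $filestr = "float x; float y;\nint n; floater z;\n";

# \b restricts the match to the whole word float (floater is left alone).
my $count = ($filestr =~ s/\bfloat\b/double/g);

print "$count substitutions\n";
print $filestr;
```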

1.4.9 String Operations

Strings in Perl are enclosed in single or double quotes, but the type of quotes affects the string contents, as illustrated next. Double quotes enable variable interpolation:

$w = 'World';
$s1 = "Hello, $w!";  # becomes "Hello, World!"

Single quotes preserve $, @, and other special Perl characters:

$s2 = 'Hello, $w!';  # becomes "Hello, $w!"

Multi-line strings are also possible:

$s3 = "ordinary strings
can be used for
multi-line
text";

8 Python applies == as well as <, <=, >, >= for all data types.

String concatenation is enabled by the dot operator:

$myfile = $filename . '_tmp' . '.dat';

The $myfile variable becomes case1_tmp.dat if $filename is the string case1.

Substrings can be extracted by the substr function, e.g.,

$teststr = '0123456789';
# extract 5 characters, starting
# from the beginning of the string:
$strpart = substr($teststr, 0, 5);
# result: '01234'

# another example:
$strpart = substr($teststr, 3, 5);
# result: '34567'

# skipping the first two characters:
$strpart = substr($teststr, 2);

# skipping up to the last three characters:
$strpart = substr($teststr, -3);

Stripping away leading and trailing blanks in a string is easily carried out by regular expressions:

$line1 =~ s/^\s*//; $line1 =~ s/\s*$//;
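The substr calls and the stripping of blanks can be checked on small test strings. The strings below are made up; the single substitution with an alternation is one possible way of combining the two stripping steps.

```perl
use strict;
use warnings;

my $teststr = '0123456789';
my $first5  = substr($teststr, 0, 5);  # '01234'
my $middle  = substr($teststr, 3, 5);  # '34567'
my $tail    = substr($teststr, 2);     # '23456789'
my $last3   = substr($teststr, -3);    # '789'

# strip leading and trailing whitespace from a copy of $line:
my $line = "   some text \t\n";
(my $trimmed = $line) =~ s/^\s+|\s+$//g;

print "$first5 $middle $tail $last3 [$trimmed]\n";
```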

1.4.10 Environment Variables

The environment variables are stored in a Perl hash called ENV. You can modify, e.g., $ENV{PATH} in the script and it has effect on all child processes (started by calls to the system function, for instance). Here is an example of how we can read the PATH environment variable, split it into its various directories, and check each directory if it contains the executable file vtk:

$program = "vtk";
$path = $ENV{PATH};  # /usr/bin:/usr/local/bin:/usr/X11/bin etc.
@paths = split(/:/, $path);
foreach $dir (@paths) {
    if (-d $dir) {
        if (-x "$dir/$program") {
            $program_path = $dir;
            last;  # jump out of the loop (as break in C and Python)
        }
    }
}
if (defined($program_path)) {
    print "$program found in $program_path\n";
} else { print "$program not found\n"; }


Note that the regular expression split on colon is Unix specific. On Windows we need to insert a semi-colon instead (note that /[:;]/ does not give a cross-platform solution since colon is used in Windows paths, e.g., C:\). Also note the need for double quotes in the second if test; writing $dir/$program without double quotes would be an invalid mixture of variables and text (the slash), or division of two text variables. What we need is to construct a new string using variable interpolation.

1.4.11 Subroutines

Functions in Perl are called subroutines. Subroutines take the form

sub name {
    # extract local variables from the argument array @_
    # body of routine
    ...
    return # some data structure
}

The arguments are not part of the subroutine heading. Instead, they are available in the array @_. Output variables are transferred to the calling code by returning an appropriate data structure, e.g., a list of the various output quantities. The return statement can be omitted.

A Simple Example of a Subroutine. A subroutine for finding the maximum value of two numbers can be written straightforwardly as follows:

sub max {
    my ($a, $b) = @_;
    my $max;  # = maximum value of $a and $b
    if ($a > $b) { $max = $a; } else { $max = $b; }
    return $max;
}

The my keyword makes variables local to the subroutine (footnote 9). Unless you specify a variable with my it is treated as a global variable whose value is visible outside the routine as well. Frequently, one maps the @_ array onto suitable local values using convenient list techniques, e.g.,

my ($a, $b) = @_;

This allows working with scalars, such as $a and $b, instead of the array entries $_[0] and $_[1]. Alternatively, we can extract $a and $b using the shift operator:

my $a = shift;  # same as shift @_;
my $b = shift;
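The effect of my on the visibility of a variable can be demonstrated with two small routines. The names $counter, bump_global, and bump_local are hypothetical and serve only this sketch.

```perl
use strict;
use warnings;

our $counter = 0;   # global (package) variable

# works on the global variable:
sub bump_global { $counter++; }

# my creates a new variable that hides the global one inside the routine:
sub bump_local {
    my $counter = 100;
    $counter++;
    return $counter;
}

bump_global();
my $local_result = bump_local();
print "global: $counter, local: $local_result\n";  # global: 1, local: 101
```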

9 See [?] for a precise explanation of the my keyword.


Variable Number of Arguments. Here is a subroutine statistics, with a variable number of arguments, which returns a list containing the average and the minimum and maximum value of all the arguments:

($avg, $min, $max) = statistics($v1, $v2, $v3, $b);  # usage

sub statistics {
    # arguments are available in the array @_
    my $avg = 0; my $n = 0;  # local variables
    my $term;
    foreach $term (@_) { $n++; $avg += $term; }
    $avg = $avg / $n;

    my $min = $_[0]; my $max = $_[0];
    shift @_;  # swallow first arg., it's already treated
    foreach $term (@_) {
        if ($term < $min) { $min = $term; }
        if ($term > $max) { $max = $term; }
    }

    return ($avg, $min, $max);
}

Call by Reference. Modifying the arguments inside the subroutine, i.e., call by reference, is enabled by working directly on the @_ array. For example,

swap($v1, $v2);  # swap the values of $v1 and $v2

sub swap {
    my $tmp = $_[0];
    $_[0] = $_[1];
    $_[1] = $tmp;
}

That is, @_ contains references to the variables used in the subroutine call (footnote 10). We remark that the swap function is just an example on call by reference; the elegant Perl way of swapping two variables reads ($v2,$v1)=($v1,$v2).
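The difference between working directly on @_ and working on a copy can be seen in the following sketch, where the routine names double_in_place and double_copy are made up for the illustration.

```perl
use strict;
use warnings;

# works directly on @_: the caller's variable is changed
sub double_in_place { $_[0] *= 2; return; }

# copies the argument first: the caller's variable is untouched
sub double_copy {
    my ($x) = @_;
    $x *= 2;
    return $x;
}

my $v = 5;
double_in_place($v);         # $v is now 10

my $w = 5;
my $r = double_copy($w);     # $w is still 5, $r is 10

print "$v $w $r\n";  # 10 5 10
```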

One can also pass references to variables to subroutines and in this way get the effect of call by reference. A reference to a variable $a reads \$a. Having the reference as a variable $a_ref, we can extract its value by ${$a_ref}. We may then write the swap function as

sub swap {
    my ($a_ref, $b_ref) = @_;  # extract references
    # swap the contents of the underlying variables
    my $tmp = ${$a_ref};
    ${$a_ref} = ${$b_ref};
    ${$b_ref} = $tmp;
}

swap(\$v1, \$v2);

10 Perl applies call by reference, and copying the arguments in @_ into local variables in a my statement simulates call by value.


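Note that swapping the references themselves, as in the routine below, only exchanges the local copies $a_ref and $b_ref inside the subroutine; the variables in the calling code are left unchanged:

sub swap2 {
    my ($a_ref, $b_ref) = @_;
    # swap references:
    my $tmp = $a_ref; $a_ref = $b_ref; $b_ref = $tmp;
}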
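A quick check of the two reference-based routines makes the difference in their effect on the calling code explicit. The routines are repeated here so that the sketch is self-contained; the test values are arbitrary.

```perl
use strict;
use warnings;

sub swap {
    my ($a_ref, $b_ref) = @_;
    my $tmp = ${$a_ref};        # swap the underlying values
    ${$a_ref} = ${$b_ref};
    ${$b_ref} = $tmp;
    return;
}

sub swap2 {
    my ($a_ref, $b_ref) = @_;
    my $tmp = $a_ref;           # swap the local copies of the references only
    $a_ref = $b_ref;
    $b_ref = $tmp;
    return;
}

my ($v1, $v2) = (1, 2);
swap(\$v1, \$v2);
print "$v1 $v2\n";   # 2 1: the values have been exchanged

my ($p, $q) = (1, 2);
swap2(\$p, \$q);
print "$p $q\n";     # 1 2: nothing has changed in the calling code
```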

Another example on using references in Perl appears on page 31.

Keyword Arguments. By using a hash to hold the arguments passed to a subroutine, one can obtain a very readable syntax and the possibility for assigning default values to an arbitrary set of the arguments (footnote 11). Here is an example, where we call a subroutine with two parameters, message and file:

$filename = "my.tmp";
print2file(message => "testing hash args", file => $filename);

sub print2file {
    my %args = (message => "no message",  # default
                file    => "tmp.tmp",     # default
                @_);                      # assign and override

    open(FILE,">$args{file}");
    print FILE "$args{message}\n\n";
    close(FILE);
}

Inside the subroutine we first assign default values to the hash entries and thereafter we insert the argument list @_, which can be interpreted as a hash as well. This latter hash might then override our default values. For example, calling

print2file(file => $filename);

leaves $args{message} as no message, but $args{file} is overwritten by the $filename variable inside the print2file subroutine. The use of a hash in subroutine calls also makes the sequence of arguments irrelevant. The technique is used throughout Perl’s Tk module for creating graphical user interfaces (see Chapter 1.7).
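The merging of default values and supplied arguments can be inspected without writing any file. The routine settings below is a hypothetical variant of print2file that simply returns the resulting hash.

```perl
use strict;
use warnings;

# Hypothetical routine: returns the argument hash after the defaults
# have been overridden by the supplied key-value pairs.
sub settings {
    my %args = (message => "no message",  # default
                file    => "tmp.tmp",     # default
                @_);                      # later pairs override earlier ones
    return %args;
}

my %s = settings(file => "my.tmp");
print "message=$s{message} file=$s{file}\n";  # message=no message file=my.tmp
```

The override works because a later occurrence of a key in a list assigned to a hash replaces the earlier one.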

Omitting Parentheses in a Call. If a subroutine is declared before you call it, you can omit the parentheses in the call statement, e.g.,

sub myproc {
    my $file1 = shift;  # implicit shift on @_
    my $file2 = shift;
    ...
}
# call myproc without parentheses:
myproc $myfile, "$yourdir/$yourfile";

11 This is the counterpart to Python’s keyword arguments, see page 111 in [?].


All the subroutines in the Perl libraries are declared before you use them soyou can omit parenthesis if you desire. Here are some examples:

print "No of iterations=$iter\n";
print("No of iterations=$iter\n");

open TMPFILE, ">$tmpfile";
open(TMPFILE, ">$tmpfile");

system "simulator -q 1.2";
system("simulator -q 1.2");

Multiple Arrays as Arguments. If you want to send several arrays to a subroutine, you need to explicitly pass references to the arrays. Otherwise, one cannot detect where one array stops and the next starts in @_. We shall now show an example where we transfer two arrays to a subroutine and print them out simultaneously in a nice format:

@curvelist = ('curve1', 'curve2', 'curve3');
@explanations = ('initial shape of u',
                 'initial shape of H',
                 'shape of u at t=2.5');

# send the two arrays to displaylist, using references
# (\@list is a reference to the array @list):
displaylist(list => \@curvelist, help => \@explanations);

The implementation of the displaylist routine, taking two array arguments transferred by references, is listed next.

sub displaylist {
    my %args = (@_);
    # extract the two lists from the two references:
    my $list_ref = $args{'list'};   # extract reference
    my @list = @$list_ref;          # extract array from reference
    my $help_ref = $args{'help'};   # extract reference
    my @help = @$help_ref;          # extract array from reference

    my $index = 0; my $item;
    for $item (@list) {
        printf("item %d: %-20s description: %s\n",
               $index, $item, $help[$index]);
        $index++;
    }

    # Alternative, without lots of local variables:
    $index = 0;
    for $item (@{$args{'list'}}) {
        printf("item %d: %-20s description: %s\n",
               $index, $item, ${$args{'help'}}[$index]);
        $index++;
    }
}


The output of displaylist looks like this:

item 0: curve1               description: initial shape of u
item 1: curve2               description: initial shape of H
item 2: curve3               description: shape of u at t=2.5

We refer to the Pass by Reference section of perldoc perlsub (or the equivalent text in [?, p. 116-118]) for more information.
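The reason why references are needed can be seen by counting the arguments that actually arrive in @_. The sketch below is only an illustration; the helper names count_args and sizes are made up for this purpose.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments arrived in @_.
sub count_args { return scalar(@_); }

# Hypothetical helper: expects two array references.
sub sizes {
    my ($first_ref, $second_ref) = @_;
    return (scalar(@$first_ref), scalar(@$second_ref));
}

my @x = (1, 2, 3);
my @y = ('a', 'b');

my $flat = count_args(@x, @y);     # arrays are flattened into one list
my $refs = count_args(\@x, \@y);   # two scalars (references) are passed
my ($nx, $ny) = sizes(\@x, \@y);   # array boundaries are preserved

print "flattened: $flat arguments\n";
print "by reference: $refs arguments, sizes $nx and $ny\n";
```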

1.4.12 Nested, Heterogeneous Data Structures

The problems with displaylist and the need for references also occur in nested, heterogeneous data structures. Say we want a list such as the curves1 list on page 88 (Lists and Tuples) in [?]. In Perl we could build some of its components first, which are straight arrays:

@point1 = (0,0);
@point2 = (0.1,1.2);
@point3 = (0.3,0);
@point4 = (0.5,-1.9);

A list of these points must be a list of references to @point1, @point2, etc.:

@points = (\@point1, \@point2, \@point3, \@point4);

Now, suppose we have an array @xy1 similar to @points. The curves1 array is supposed to contain a string, @points, another string, and @xy1. Again, references are required to avoid “flattening” the structure:

@curves1 = ("u1.dat", \@points, "H1.dat", \@xy1);

It is tedious to write the sublists as separate variables, so we can instead write

@curves1 = ("u1.dat", [[0,0], [0.1,1.2], [0.3,0], [0.5,-1.9]],
            "H1.dat", \@xy1);

That is, a list in square brackets provides a reference to an array.

Indexing is performed with a syntax similar to Python. For example,

$a = $curves1[1][1][0];

yields $a as 0.1.

Nested data structures in Perl must make use of references, and it can be troublesome to debug such structures. The Data::Dumper module converts Perl data structures to readable strings: print Dumper(@curves1) results in the present case in

$VAR1 = 'u1.dat';
$VAR2 = [
          [
            0,
            0
          ],
          [
            '0.1',
            '1.2'
          ],
          [
            '0.3',
            0
          ],
          [
            '0.5',
            '-1.9'
          ]
        ];
$VAR3 = 'H1.dat';
$VAR4 = [
          [
            '0.3',
            0
          ],
          [
            '0.5',
            '-1.9'
          ]
        ];

The Data::Dumper module supports lots of output formats, see perldoc Data::Dumper. More information about references can be found in perldoc perlreftut.
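For experimentation, the curves1 structure can be collected in a small self-contained script. This is a sketch under the assumption that @xy1 holds the two points shown in the Data::Dumper output above; it also illustrates that the arrow between adjacent subscripts is optional, and that passing a reference to Dumper yields a single $VAR1 entry.

```perl
use strict;
use warnings;
use Data::Dumper;

my @xy1 = ([0.3, 0], [0.5, -1.9]);   # assumed contents of @xy1
my @curves1 = ("u1.dat", [[0,0], [0.1,1.2], [0.3,0], [0.5,-1.9]],
               "H1.dat", \@xy1);

# The arrow between adjacent subscripts is optional:
my $v1 = $curves1[1][1][0];
my $v2 = $curves1[1]->[1]->[0];

# Number of points in the first curve (dereference the inner array):
my $npoints = scalar(@{$curves1[1]});

# Passing a reference gives one $VAR1 instead of $VAR1, ..., $VAR4;
# Indent = 1 selects a more compact layout.
$Data::Dumper::Indent = 1;
print Dumper(\@curves1);
print "v1=$v1 v2=$v2 npoints=$npoints\n";
```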

1.4.13 Testing a Variable’s Type

An ordinary Perl variable is either a scalar, an array, or a hash. The prefix determines the type of the variable, so the variable name together with its prefix shows its type; there is no need to test the variable’s type (as in Python). Writing

$var = 1;                             # scalar
@var = (1, 2);                        # array
%var = (key1 => 1, key2 => 'two');    # hash

creates three different Perl variables. Every time we use one of the variables, the prefix immediately shows its type.

However, when working with references the prefix is always a dollar. The function ref can be used to test what kind of underlying data structure the reference is pointing to. The return value in a scalar context is a string, like 'SCALAR', 'ARRAY', or 'HASH'. In a boolean context, ref returns true if its argument is a reference:


if (ref($r) eq "HASH") {     # test return value
    print "r is a reference to a hash.\n";
}
unless (ref($r)) {           # use in boolean context
    print "r is not a reference at all.\n";
}

The ref function is handy when you work with nested, heterogeneous data structures. See perldoc -f ref and perldoc perlref for more information.
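As an illustration of how ref can steer the traversal of a nested structure, consider the following sketch. The function describe and the sample data are hypothetical and only serve to demonstrate the technique.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a one-line description of a nested structure.
sub describe {
    my ($r) = @_;
    my $type = ref($r);
    if ($type eq 'ARRAY') {
        return "[" . join(", ", map { describe($_) } @$r) . "]";
    } elsif ($type eq 'HASH') {
        my @items;
        for my $key (sort keys %$r) {
            push(@items, "$key => " . describe($r->{$key}));
        }
        return "{" . join(", ", @items) . "}";
    } elsif ($type) {
        return "<$type reference>";      # SCALAR, CODE, etc.
    } else {
        return defined($r) ? "$r" : "undef";   # not a reference
    }
}

my $data = ["u1.dat", [[0, 0], [0.1, 1.2]], {color => "red", width => 2}];
print describe($data), "\n";
# prints [u1.dat, [[0, 0], [0.1, 1.2]], {color => red, width => 2}]
```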

1.4.14 Numerical Expressions

Perl supports the same numerical expressions as C. Strings are automatically transformed to numbers when required:

$b = 1.2;        # b is a number
$b = "1.2";      # b is a string
$a = 0.5 * $b;   # b is converted to a real number before mult.

if ($b < 100) { print "ok\n"; } else { print "error!\n"; }
# prints "ok"

In the last test, the < operator works on numbers, and $b is interpreted as a number (<, >, ==, !=, etc. are the comparison operators for numbers, whereas strings must be compared with lt, gt, eq, ne, etc.).
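The difference between the two families of operators is easy to overlook. The small sketch below (all variable names are chosen for illustration only) compares the same values numerically and as strings, and does the same for sorting with <=> and cmp.

```perl
use strict;
use warnings;

my $x = "10";
my $y = "9";

my $num_less = ($y < $x)  ? 1 : 0;       # numeric: 9 < 10 is true
my $str_less = ($y lt $x) ? 1 : 0;       # string: "9" lt "10" is false
my $num_eq   = ("1.0" == 1)   ? 1 : 0;   # equal as numbers
my $str_eq   = ("1.0" eq "1") ? 1 : 0;   # different as strings

my @sorted_num = sort { $a <=> $b } (10, 9, 100);   # 9 10 100
my @sorted_str = sort { $a cmp $b } (10, 9, 100);   # 10 100 9

print "numeric: less=$num_less equal=$num_eq sorted=@sorted_num\n";
print "string:  less=$str_less equal=$str_eq sorted=@sorted_str\n";
```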

1.4.15 Listing of Files in a Directory

The following statements return a list of files (in the current working directory) having extensions .ps or .gif:

@filelist = glob("*.ps *.gif");

# alternative:
@filelist = <*.ps *.gif>;

A more sophisticated glob function is also available, see perldoc File::Glob.

1.4.16 Testing File Types

Perl supports a range of tests for classifying files:

if (-f $myfile) { print "$myfile is a plain file\n"; }
if (-d $myfile) { print "$myfile is a directory\n"; }
if (-x $myfile) { print "$myfile is executable\n"; }
if (-z $myfile) { print "$myfile is empty (zero size)\n"; }
if (-T $myfile) { print "$myfile is a text file\n"; }
if (-B $myfile) { print "$myfile is a binary file\n"; }

There are also tests for the size and age of a file:


$size = -s $myfile;
$days_since_last_access = -A $myfile;
$days_since_last_modification = -M $myfile;

See perldoc perlfunc and search for -f, -d, and so on for information about file tests.

The stat function gives more detailed results about a file:

($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
 $atime,$mtime,$ctime,$blksize,$blocks) = stat($myfile);

A quote from the description of stat in the man page perlfunc explains what the various list entries above mean:

 0 dev      device number of filesystem
 1 ino      inode number
 2 mode     file mode (type and permissions)
 3 nlink    number of (hard) links to the file
 4 uid      numeric user ID of file's owner
 5 gid      numeric group ID of file's owner
 6 rdev     the device identifier (special files only)
 7 size     total size of file, in bytes
 8 atime    last access time since the epoch
 9 mtime    last modify time since the epoch
10 ctime    inode change time (NOT creation time!) since the epoch
11 blksize  preferred block size for file system I/O
12 blocks   actual number of blocks allocated

There is an alternative stat function in the File::stat module, see perldoc File::stat.
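A brief sketch of the File::stat interface may be helpful. The example creates its own scratch file through File::Temp (an assumption made only to keep the sketch self-contained) and accesses the size and modification time by name instead of by list position.

```perl
use strict;
use warnings;
use File::stat;                  # replaces stat with an object-returning version
use File::Temp qw(tempfile);

# Create a small scratch file so that the example is self-contained.
my ($fh, $myfile) = tempfile(UNLINK => 1);
print $fh "hello\n";
close($fh);

my $st = stat($myfile) or die "Cannot stat $myfile: $!\n";
my $size  = $st->size;           # same as entry 7 in the built-in stat list
my $mtime = $st->mtime;          # same as entry 9
printf("%s: %d bytes, modified %s\n",
       $myfile, $size, scalar(localtime($mtime)));
```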

1.4.17 Copying and Renaming Files

Renaming a file is simple:

rename($myfile, "tmp.1"); # rename $myfile to tmp.1

Moving files across file systems is reliably done with the move function in Perl’s File::Copy library:

use File::Copy;
move($myfile, "/work/temp") or die "Could not rename file\n";

Copying a file $file to a file $tmpfile is performed with the copy function in the File::Copy library:

use File::Copy;
copy($file, $tmpfile);
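Both copy and move return a false value on failure, so it is wise to check the result. The following sketch works in a scratch directory created by File::Temp; the file names are hypothetical.

```perl
use strict;
use warnings;
use File::Copy;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);          # scratch directory, removed at exit
my $file = "$dir/data.txt";
open(my $out, '>', $file) or die "Cannot open $file: $!\n";
print $out "some data\n";
close($out);

my $tmpfile = "$dir/data.bak";
copy($file, $tmpfile) or die "Copy of $file failed: $!\n";
move($file, "$dir/moved.txt") or die "Move of $file failed: $!\n";

print "copy exists\n" if -f $tmpfile;
print "original is gone\n" unless -e $file;
```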


1.4.18 Creating and Moving to Directories

Creating a directory and moving to a directory are tasks performed with the mkdir and chdir functions, respectively:

use Cwd; $origdir = cwd;   # remember where we are
$dir = "../mynewdir";
mkdir($dir, 0755) or die "$0: couldn't create dir; $!\n";
chdir($dir);
...
chdir($origdir);   # move back to the original directory
chdir;             # move to your home directory ($ENV{HOME})

Suppose you want to create a new directory perl/projects/test1 in your home directory, but none of perl, projects, and test1 exist. Instead of using repeated mkdir commands, Perl offers the mkpath command, from the File::Path module, to create the whole path in one statement:

use File::Path;
mkpath("$ENV{HOME}/perl/projects/test1");

1.4.19 Removing Files and Directories

Single files are removed by the unlink statement, e.g.,

unlink("myfile") or die "Could not remove file\n";

A list of files can also be transferred to unlink:

unlink(@files);
unlink(glob("*.ps *.gif"));

unlink "myfile", 'yourfile', @thosefiles, "$file.tmp" or
    die "Could not remove files\n";

Frequently, one wants to remove a directory tree, possibly full of files, an action that requires the rmtree function from the File::Path library:

use File::Path;
rmtree("mydir");
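The interplay between mkpath and rmtree can be tested safely in a scratch area. In the sketch below, a temporary directory from File::Temp plays the role of the home directory; this substitution is our assumption, made to avoid touching real files.

```perl
use strict;
use warnings;
use File::Path;                 # exports mkpath and rmtree
use File::Temp qw(tempdir);

my $root = tempdir(CLEANUP => 1);        # scratch area instead of $ENV{HOME}
my $dir = "$root/perl/projects/test1";

mkpath($dir);                            # creates all three levels
my $created = (-d $dir) ? 1 : 0;

open(my $fh, '>', "$dir/tmp.dat") or die "Cannot create file: $!\n";
close($fh);

rmtree("$root/perl");                    # removes the tree and the file in it
my $removed = (-e "$root/perl") ? 0 : 1;

print "created=$created removed=$removed\n";
```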

1.4.20 Splitting Pathnames

Let $fname be a filename containing a possibly long path, e.g.,

$fname = "/usr/home/hpl/scripting/perl/intro/hw2a.pl";

Occasionally, one wants to split this filename into the basename hw2a.pl and the directory name /usr/home/hpl/scripting/perl/intro/:


use File::Basename;
$basename = basename($fname);
$dirname = dirname($fname);

One can also extract the base of the basename, hw2a, either by

$base = $basename;
# substitute the file extension by an empty string:
$base =~ s/\.pl$//g;

or by the fileparse function:

($base, $dirname, $extension) = fileparse($fname,".pl");

The fileparse function can take an arbitrary number of possible extensions.
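As a sketch of the latter feature, we can pass a list of extensions to fileparse. Note that the extensions are interpreted as patterns, which is why the dot is escaped below; the second and third path names are hypothetical.

```perl
use strict;
use warnings;
use File::Basename;

my @suffixes = ('\.pl', '\.py', '\.tcl');   # patterns, hence the escaped dot
my @results;
for my $fname ("/usr/home/hpl/scripting/perl/intro/hw2a.pl",
               "/usr/home/hpl/scripting/python/intro/hw2a.py",
               "/usr/home/hpl/README") {
    my ($base, $dirname, $extension) = fileparse($fname, @suffixes);
    push(@results, "$base|$dirname|$extension");
    print "base=$base dir=$dirname ext=$extension\n";
}
```

A file name without any of the listed extensions, like README here, is returned with an empty string as extension.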

1.4.21 Traversing Directory Trees

The very useful Unix find command can be implemented in a cross-platform fashion in Perl using the File::Find library and its find function. The basic recipe for using Perl’s find goes as follows.

use File::Find;
# run through directory trees dir1, dir2, and dir3, and
# for each file call the user-provided subroutine ourfunc:
find(\&ourfunc, "dir1", "dir2", "dir3");

sub ourfunc {
    # $_ contains the name of the selected file
    $file = $_;
    # process $file
    # $File::Find::dir contains the current directory
    # (you are automatically chdir()'ed to this directory)
    # $File::Find::name contains $File::Find::dir/$file
}

We shall now implement a script that lists all files larger than 1Mb in the home directory tree. The easiest way to extract the size of a file is to write

$size = -s $file;

Our script in Perl might then look like

#!/usr/bin/perl
use File::Find;

find(\&printsize, $ENV{HOME});   # traverse home-directory tree

sub printsize {
    $file = $_;   # more descriptive variable name...
    if (-f $file) {   # is $file a plain file, not a directory?
        $size = -s $file;   # or $size = (stat($file))[7];
        if ($size > 1000000) {
            printf("%.1fMb %s in %s\n",
                   $size/1000000.0, $file, $File::Find::dir);
        }
    }
}


We recommend reading perldoc File::Find to see the many possibilities that Perl’s find function offers.

There is a program find2perl that translates a Unix find command into the equivalent Perl program. The resulting program is not always easy to read for newcomers to Perl, so writing the Perl script yourself gives better control of what you want to do. In the present example you can try

find2perl $HOME -name '*' -type f -size +2000 -exec ls -s {} \;

and realize that the resulting code has 55 (!) lines and is less cross-platform than our hand-coded version.
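Instead of printing inside the callback, one often wants to collect the matching files in an array for later processing. A possible sketch, using an anonymous subroutine that fills an array declared outside it, is shown below. The scratch directory tree and the 1000-byte threshold are assumptions made to keep the example small and independent of the home directory.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a tiny scratch tree so the example does not depend on $ENV{HOME}.
my $root = tempdir(CLEANUP => 1);
mkdir("$root/sub") or die "mkdir failed: $!\n";
for my $spec (["$root/small.dat", 10], ["$root/sub/big.dat", 5000]) {
    my ($name, $nbytes) = @$spec;
    open(my $fh, '>', $name) or die "Cannot open $name: $!\n";
    print $fh "x" x $nbytes;
    close($fh);
}

my $limit = 1000;     # bytes; hypothetical threshold for this demo
my @large;            # filled in by the anonymous subroutine below
find(sub {
         return unless -f $_;             # skip directories etc.
         push(@large, $File::Find::name) if -s $_ > $limit;
     }, $root);

print "$_\n" for @large;
```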

1.4.22 Downloading Internet Files

The libwww-perl package contains numerous modules and scripts for working with the World Wide Web. You can easily test if libwww-perl is already installed on your system by trying

perl -e 'use LWP::Simple'

If this one-liner gives an error message, you need to get libwww-perl from CPAN (see page 54).

The Perl script lwp-download (from the libwww-perl package) fetches a single file whose URL is known:

lwp-download http://www.ifi.uio.no/~hpl/downloadme.dat

The script looks at the file contents and creates a suitable local filename for the copy. In this case, downloadme.dat is a text file that lwp-download stores as downloadme.dat.txt. A second argument to lwp-download can be used to specify a local filename.

Inside a Perl script we can easily copy a file, given as a URL, to a local file:

use LWP::Simple;
$URL = "http://www.ifi.uio.no/~hpl/downloadme.dat";
getstore($URL, "downloadme.dat");
# copy only if local file is not up-to-date:
mirror($URL, "downloadme.dat.pl");

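or we can load the remote file directly into a string and split it into an array of lines (get returns the whole document as a single string):

$content = get($URL);
@lines = split(/^/, $content);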

The URL in these examples could also have been an ftp address, e.g.,

ftp://ftp.ifi.uio.no/pub/blab/xite/xite3_4.tar.gz

Check out perldoc LWP::Simple for details regarding more functionality.


1.4.23 CPU-Time Measurements

Measurement of elapsed time in Perl can be done with the time function:

$t0 = time;   # elapsed time in seconds since the epoch
# do tasks...
$elapsed_time = time - $t0;

Because time is measured in whole seconds, you need to perform efficiency tests that last several seconds. Timing with finer resolution is possible, see the Perl FAQ: perldoc -q 'time under a second'.
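One way of obtaining finer resolution is the Time::HiRes module, which is distributed with recent Perl versions. The sketch below assumes that the module is available; the sleep call merely stands in for the task to be timed.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);   # fractional-second versions of time and sleep

my $t0 = time;                    # floating-point seconds since the epoch
sleep(0.25);                      # stands in for the task being timed
my $elapsed_time = time - $t0;

printf("elapsed: %.3f s\n", $elapsed_time);
```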

Throughout this section we assume that the reader is familiar with terms like epoch, elapsed time, system time, CPU time, and the difference between children and parent processes, as briefly explained in Chapter 8.10.1 (CPU-Time Measurements) in [?].

A more sophisticated function times returns an array with four entries. The first two represent the user and system times of the current process while the next two contain the user and system times of the current process’ child processes.

@t0 = times;
# do tasks...
system "$time_consuming_command";   # child process
@t1 = times;
$user_time = $t1[0] - $t0[0];
$system_time = $t1[1] - $t0[1];
$cpu_time = $user_time + $system_time;
$cpu_time_system_call = $t1[2] - $t0[2] + $t1[3] - $t0[3];

There is also a higher-level module Benchmark, based on the time and times functions, with various support for timing of Perl scripts. The usage goes as follows.

use Benchmark;
$t0 = new Benchmark;
# do some tasks...
$t1 = new Benchmark;
$td = timediff($t1, $t0);   # time difference between $t0 and $t1
$nice_td_formatting = timestr($td, 'noc');
print "tasks: $nice_td_formatting\n";

The output looks like this:

tasks: 9 wallclock secs( 3.12 usr + 0.10 sys = 3.22 CPU)

The Benchmark module also has a function timeit that runs a piece of Perl code a specified number of times:

use Benchmark;
print "100 runs took ", timestr(timeit(100, \&somefunc)), "\n";


We refer to perldoc Benchmark for more details about this module.

From a pedagogical point of view it might be instructive to write a function like timeit in the Benchmark module. Doing this we also have the possibility of tailoring such a timing function to suit our needs. The function, here called timer, can take four arguments: (i) a function to call, (ii) a list of arguments to be used in the function to call, (iii) the number of call repetitions, and (iv) the name of the function to call. In Perl we would represent the first two arguments by a function reference and a reference to a list. The complete function could then take the following form:

sub timer {
    my ($func_ref, $args_ref, $repetitions, $func_name) = @_;
    my $t0 = time;                  # initial elapsed time
    my ($u0, $s0, $rest) = times;   # initial user and system time
    for (my $i = 0; $i < $repetitions; $i++) {
        &$func_ref(@$args_ref);
    }
    my @t1 = times;
    printf("$func_name: elapsed=%g, CPU=%g\n",
           time - $t0, $t1[0] - $u0 + $t1[1] - $s0);
}

The similar Python function is presented in Chapter 8.10.1 (CPU-Time Measurements) in [?].
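To see how such a timing function is called, we sketch a variant, here named timer2, that returns the elapsed and CPU times instead of printing them. Both timer2 and the workload function work are hypothetical names introduced for this illustration.

```perl
use strict;
use warnings;

# Variant of the timer idea that returns the numbers instead of printing.
sub timer2 {
    my ($func_ref, $args_ref, $repetitions) = @_;
    my $t0 = time;                   # initial elapsed time
    my ($u0, $s0) = times;           # initial user and system time
    for (my $i = 0; $i < $repetitions; $i++) {
        $func_ref->(@$args_ref);
    }
    my ($u1, $s1) = times;
    return (time - $t0, ($u1 - $u0) + ($s1 - $s0));
}

# Hypothetical workload: a sum of square roots.
sub work {
    my ($n) = @_;
    my $s = 0;
    $s += sqrt($_) for 1..$n;
    return $s;
}

# The argument list is passed as a reference to an anonymous array:
my ($elapsed, $cpu) = timer2(\&work, [20000], 50);
printf("work: elapsed=%g, CPU=%g\n", $elapsed, $cpu);
```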

1.4.24 Programming with Classes

Classes are implemented in Perl using quite advanced concepts like references and packages. Although Perl fans claim that classes in Perl are much more flexible than those in C++ and Java, there is no doubt that programming with classes is weirder in Perl than in C++, Java, and Python. Explaining Perl classes in a couple of pages without first covering references and packages is difficult and therefore omitted here.

1.4.25 Debugging Perl Scripts

Unfortunately, Perl is by default quite silent about errors. The following short script, which tries to open a non-existing file, illustrates the point:

perl -w -e 'open(F,"<mynonexistingfile"); close(F);'

Perl executes this script without any error message (see footnote 12). The Fatal module can be used for letting Perl speak up about run-time errors:

perl -e 'use warnings; use strict; use diagnostics; \
use Fatal qw/open/; local *F; \
open(F,"<mynonexistingfile"); close(F);'

12 Python provides instructive run-time messages by default in similar examples (and the messages can be turned off by, e.g., appropriate exception handling in the script).


Note that you must list the functions you want to be verbose, here open. The reported error message now contains the helpful message

Can’t open(F, <mynonexistingfile): No such file or directory
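Without the Fatal module, the same information is available by testing the return value of open explicitly and inspecting the $! variable. The sketch below uses a file name in a scratch directory, which we assume does not exist.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);
my $filename = "$dir/mynonexistingfile";    # assumed not to exist

my $message = "";
if (open(my $fh, '<', $filename)) {
    close($fh);
} else {
    $message = "Cannot open $filename: $!";   # $! holds the system error text
}
print "$message\n" if $message;

# The common one-line form terminates the script instead:
#   open(my $fh, '<', $filename) or die "Cannot open $filename: $!\n";
```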

The use warnings, use strict, and use diagnostics commands can help you detect statements that are candidates for trouble. However, applying use strict to (most of) the Perl scripts in this appendix will result in lots of error messages about lack of the main:: prefix for all global variables or an explicit my or local operator (to make variables local). For quick scripting this can be a bit annoying. When writing larger scripts, on the other hand, use strict is a good habit. Here is a sample code demonstrating some implications of use strict:

use strict;
# introduce the global variable $counter for the first time:
$counter = 1;                # generates error message
$main::counter = 1;          # ok, explicit indication of package name
my $counter = 1;             # ok, localizing $counter with the my operator
my $counter; $counter = 1;   # equiv. with the line above

The reader is encouraged to take a look at the man pages for the Fatal, strict, and diagnostics modules. For details on warnings, see perldoc warnings and man perllexwarn.

Inserting print statements on the fly in the code is an efficient and widely used debugging method among Perl programmers. Alternatively, the -d option to a Perl script enables you to interactively debug the script through a command-line debugger,

perl -d -w mybuggyscript.pl

The -w option turns on many useful warnings about, e.g., unused variables. The most important commands inside the debugger are s for single step, n for single step without stepping into subroutines, x for pretty-print of data structures and variables, and b 85 for setting a break point at line 85. More detailed information is provided by perldoc perldebug.

There is a Perl/Tk GUI for the Perl debugger, available in the module ptkdb. Invoke the debugger by

perl -d:ptkdb -w mybuggyscript.pl

There are several Perl debuggers with graphical interfaces, check out the links in the Perl resources section in doc.html.

Another Perl module is Devel::Trace, which prints each statement prior to executing it (the same effect as the -x option to Unix shell scripts).


1.4.26 Regular Expressions

The material on regular expressions explained in a Python context in Chapter 8.2 (Regular Expressions and Text Processing) in [?] carries over to Perl, but the surrounding Perl code is different. To test if a string $str matches a regular expression contained in a string $pattern, one writes

if ($str =~ /$pattern/) { ... }

A specific example can be

$str = "myfile.tmp";
if ($str =~ /\.tmp$/) { print "$str has extension .tmp"; }

Backslashes and special symbols are preserved in text enclosed in forward slashes /.../, as in Python raw strings. However, if the regular expression is to be stored in a double-quoted string, backslashes and special Perl characters must be preceded by a backslash:

$pattern = "\\.tmp\$";
if ($str =~ /$pattern/) { print "$str has extension .tmp"; }

With single-quoted strings a backslash is a backslash, but Perl’s variable interpolation cannot be used.
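The quoting rules can be verified by storing the same pattern in different ways and comparing the results. The sketch below also includes the qr// operator, which is not discussed in the text above; it yields a precompiled pattern that can be stored in a variable without extra backslashes.

```perl
use strict;
use warnings;

my $str = "myfile.tmp";

my $p1 = "\\.tmp\$";    # double quotes: escape the backslash and the dollar
my $p2 = '\.tmp$';      # single quotes: a backslash is a backslash
my $p3 = qr/\.tmp$/;    # qr// yields a precompiled pattern

my $same = ($p1 eq $p2) ? 1 : 0;        # both strings hold \.tmp$
my @hits = grep { $str =~ /$_/ } ($p1, $p2, $p3);

print "p1 and p2 identical: $same\n";
print scalar(@hits), " of 3 patterns match $str\n";
```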

Pattern-Matching Modifiers. Perl offers pattern-matching modifiers to adjust the meaning of the dot, ^, $, whitespace, etc. The syntax for applying a pattern-matching modifier is like

if ($str =~ /$pattern/q) { ... }

where q denotes one or more single-character pattern-matching modifiers from the following list:

i   case-insensitive matching
g   match globally, i.e., find all occurrences
s   let . match newline as well
m   treat string as multiple lines, i.e., change ^
    and $ from matching at only the very start or end of
    the string to the start or end of any line anywhere
    within the string (a line is from a newline to the
    next newline)
x   extend the pattern's legibility by permitting
    whitespace and comments
o   compile pattern once only (for increased efficiency)

The o modifier is a counterpart to compiling regular expressions in Python.

We can use other delimiters than forward slashes if the /.../ group is preceded by an m, e.g.,

$found = 1 if $path =~ m#/usr/local/bin#;
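The effect of the most common modifiers can be demonstrated on a small multi-line string. The text and the variable names in this sketch are chosen for illustration only.

```perl
use strict;
use warnings;

my $text = "First line\nsecond LINE\nthird line";

my $plain  = ($text =~ /^second/)  ? 1 : 0;   # ^ matches only at string start
my $multi  = ($text =~ /^second/m) ? 1 : 0;   # ^ matches after each newline
my $nocase = () = $text =~ /line/gi;          # count matches, ignoring case
my $dot    = ($text =~ /First.*third/)  ? 1 : 0;   # . stops at newline
my $dot_s  = ($text =~ /First.*third/s) ? 1 : 0;   # . also matches newline
my $ext    = ($text =~ / second \s+ LINE   # words on the second line
                       /x) ? 1 : 0;

print "plain=$plain multi=$multi nocase=$nocase ",
      "dot=$dot dot_s=$dot_s ext=$ext\n";
```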

Extracting Multiple Matches. Suppose you have a string with several numbers. To extract all numbers from this string, without knowing how many numbers there may be, we can apply the following Perl construct (see footnote 13):

13 This construct is a counterpart to Python’s findall function in the re module.


$s = "3.29 is a number, 4.2 and 0.5 too";
@n = $s =~ /\d+\.\d*/g;

The array @n now contains the entries 3.29, 4.2, and 0.5.
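When the position of each match is needed as well, the g modifier can be used in a scalar context inside a while loop. The following sketch relies on the built-in pos function, which returns the offset just after the most recent match.

```perl
use strict;
use warnings;

my $s = "3.29 is a number, 4.2 and 0.5 too";
my @found;                          # list of [number, start offset] pairs
while ($s =~ /(\d+\.\d*)/g) {
    push(@found, [$1, pos($s) - length($1)]);
}
printf("%s at offset %d\n", @$_) for @found;
```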

Groups. Groups are constructed by enclosing parts of a pattern in parentheses, cf. Chapter 8.2.4 (Using Groups to Extract Parts of a Text) in [?]. Perl stores the matches corresponding to groups in the variables $1 (first group), $2 (second group), and so on. An illustrating code segment is given next.

$interval = "[1.45, -1.99E+01]";
if ($interval =~ /\[(.*),(.*)\]/) {
    print "lower limit=$1, upper limit=$2\n"
}
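A match in list context returns the groups directly, which avoids the $1 and $2 variables. In the sketch below we have also added \s* around the groups to strip whitespace; this is our own adjustment of the pattern.

```perl
use strict;
use warnings;

my $interval = "[1.45, -1.99E+01]";
my ($lower, $upper) = $interval =~ /\[\s*(.*?)\s*,\s*(.*?)\s*\]/;
die "no interval found in $interval\n" unless defined($lower);

# The extracted strings are converted to numbers when used as such:
printf("lower limit=%g, upper limit=%g\n", $lower, $upper);
```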

Substitution. The basic syntax of substitution in Perl is

$somestring =~ s/pattern/replacement/g;

implying that pattern is replaced by replacement in $somestring. The pattern-matching modifier g ensures that all occurrences (not only the first) are substituted. Here is a specific example:

# change /usr/bin/perl to /usr/local/bin/perl
$line =~ s/\/usr\/bin\/perl/\/usr\/local\/bin\/perl/g;

The forward slashes in the path names must be quoted because the forward slash is a delimiter for the substitution operator. Fortunately, this “Leaning Toothpick Syndrome” [?, p. 70] can be avoided by choosing another delimiter, e.g.,

$line =~ s#/usr/bin/perl#/usr/local/bin/perl#g;
# or
$line =~ s{/usr/bin/perl}{/usr/local/bin/perl}g;

See [?, p. 255] for more information on alternative delimiters.

The one-line Perl command for substitution listed on page 20 is so useful that we repeat it here:

perl -pi.bak -e 's/pattern/replacement/g;' *.c *.h

The pattern is replaced by replacement everywhere in the files *.c and *.h. A nice feature is that the -i.bak option leads Perl to take a copy, here with extension .bak, of the original files.


Substitution with Groups. The example of switching arguments in function calls, as covered in Chapter 8.2.9 (Substitution and Backreferences) in [?], is readily implemented in Perl using the group variables $1, $2, and so on. If superLibFunc(arg1,arg2) is supposed to be edited to superLibFunc(arg2,arg1), where arg1 and arg2 are legal variable names in C and the call can contain extra whitespace, a suitable substitution code segment reads:

$arg = "[^,]+";
$call = "superLibFunc\\s*\\(\\s*($arg)\\s*,\\s*($arg)\\s*\\)";

# perform the substitution in a file stored as a string $filestr:
$filestr =~ s/$call/superLibFunc($2, $1)/g;

# or (less preferred style):
$filestr =~ s/$call/superLibFunc(\2, \1)/g;

print FILE $filestr;   # print everything back to file

Note the need for double backslashes when the regular expression is stored as an ordinary Perl string. The complete script for the substitution is found in src/perl/swap1.pl. Another version, appearing in the file swap2.pl, has comments in the regular expression:

$arg = "[^,]+";
$call = "superLibFunc  # name of function to match
         \\s*          # possible whitespace
         \\(           # left parenthesis
         \\s*          # possible whitespace
         ($arg)        # first argument plus optional whitespace
         ,             # comma between the arguments
         \\s*          # possible whitespace
         ($arg)        # second argument plus optional whitespace
         \\)           # closing parenthesis";

$filestr =~ s/$call/superLibFunc($2, $1)/gx;

The final x is the pattern-matching modifier for comments and extra whitespace in regular expressions. A more typical Perl style is to write the previous code segment without storing the regular expression in a string variable:

$filestr =~ s{superLibFunc  # name of function to match
              \s*           # possible whitespace
              \(            # left parenthesis
              \s*           # possible whitespace
              ($arg)        # first argument plus optional whitespace
              ,             # comma between the arguments
              \s*           # possible whitespace
              ($arg)        # second argument plus optional whitespace
              \)            # closing parenthesis
             }{superLibFunc($2, $1)}gx;

This version is available in the swap3.pl script.
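A quick way of testing the substitution is to apply it to a short string instead of a file. In the sketch below the argument pattern is tightened to \w+, under the assumption that the arguments are plain identifiers, so that surrounding whitespace is not captured; the test string is made up for the occasion.

```perl
use strict;
use warnings;

my $arg = '\w+';     # assumption: arguments are plain C identifiers
my $filestr = "a = superLibFunc(x, y);\nb = superLibFunc ( alpha ,beta );\n";

# Work on a copy so that the original text is kept for comparison:
(my $swapped = $filestr) =~
    s{superLibFunc \s* \( \s* ($arg) \s* , \s* ($arg) \s* \)}
     {superLibFunc($2, $1)}gx;

print $swapped;
```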


Debugging Regular Expressions. After having spent quite some energy on figuring out a complicated regular expression, nothing is less exciting than seeing the regex fail to behave the way you expect. The $& variable in Perl is set equal to the complete text matched by the specified pattern and is thus central when debugging regular expressions.

A Perl counterpart to the Python function debugregex on page 347 (Debugging Regular Expressions) in [?] is presented below. Because of the difference in functionality for extracting groups and matches in Perl and Python, the two versions of debugregex do not have a line by line correspondence.

#!/usr/bin/perl

sub debugregex {
    my ($pattern, $str) = @_;
    $s = "does '" . $pattern . "' match '" . $str . "'?\n";
    if ($str =~ /$pattern/) {
        # obtain a list of groups (if present):
        @groups = $str =~ m/$pattern/g;

        # repeat string, but with match enclosed in square brackets:
        $match = $&;
        $str2 = $str; $str2 =~ s/$match/[$match]/g;
        $s = $s . $str2;

        if ($groups[0] == 1) {
            # ordinary match, no groups (see perlop man page)
        } else {
            for $group (@groups) {
                $s = $s . "\ngroup: " . $group;
            }
        }
    } else {
        $s = $s . "No match";
    }
    return $s;
}

$teststr = "some numbers 2.3, 6.98, and 0.5 are here";
$pattern1 = "(\\d+\\.\\d+)";                # 3 groups (numbers)
$pattern2 = "^(\\w+)\\s+.*\\s+(\\w+)\$";    # 2 groups (some and here)

print debugregex($pattern1, $teststr), "\n";
print debugregex($pattern2, $teststr), "\n";

The output becomes

does '(\d+\.\d+)' match 'some numbers 2.3, 6.98, and 0.5 are here'?
some numbers 2.3, 6.98, and [0.5] are here
group: 2.3
group: 6.98
group: 0.5
does '^(\w+)\s+.*\s+(\w+)$' match
     'some numbers 2.3, 6.98, and 0.5 are here'?
[some numbers 2.3, 6.98, and 0.5 are here]
group: some
group: here

The Perl version of debugregex is perhaps less useful than the Python version since regular expressions are seldom stored in strings in Perl. Instead they appear directly inside /.../ constructions, and to use debugregex, we need to copy the regular expression, store it in a string, and quote special characters before performing the call. However, listing debugregex illustrates useful code segments that can be reused in your own scripts when debugging regular expressions.

1.4.27 Exercises

Most of the Python exercises from Chapters 2 (Getting Started with Python Scripting) through 8 (Advanced Python) in [?] are well suited for implementation in Perl. For an efficient hands-on training with Perl, we recommend in particular the following set of exercises: 2.4, 2.6, 2.7, 2.11, 3.2, 3.15, 3.7, 3.16, 8.7, 8.12, 8.18, and ??.

File handles for standard input and standard output, used in Exercise 2.6 in [?], have the names STDIN and STDOUT, respectively, in Perl.

Regarding Exercise 3.15 on page 128 in [?], check out Perl’s documentation of the localtime function for extracting the correct date information (just write the operating system command perldoc -f localtime).

The relevance of Exercise 8.17 in [?] for Perl programmers is minor, because Perl has a special variable $/, the input record separator, which you can set to an empty string "" to make Perl read input in a paragraph-by-paragraph style. For example,

$/ = "";
@paragraphs = <SOMEFILE1>;
# each entry in @paragraphs is now a paragraph

# alternative:
$/ = "";
while (<>) {
    # $_ is now a paragraph
}

# read the whole file into a string:
undef $/;   # 'slurp' mode
$filestr = <SOMEFILE2>;
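Since $/ is a global variable, it is often advisable to change it with local inside a block, such that the normal line-by-line reading is restored afterwards. The sketch below demonstrates this on an in-memory file, i.e., a filehandle opened on a reference to a string; the text itself is made up.

```perl
use strict;
use warnings;

my $text = "First paragraph,\ntwo lines.\n\n\nSecond paragraph.\n\nThird.\n";

my @paragraphs;
{
    local $/ = "";                 # paragraph mode, restored at end of block
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!\n";
    @paragraphs = <$fh>;
    close($fh);
}

my $filestr;
{
    local $/;                      # undefined value: 'slurp' mode
    open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!\n";
    $filestr = <$fh>;
    close($fh);
}

printf("%d paragraphs, %d characters in total\n",
       scalar(@paragraphs), length($filestr));
```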


Exercise 1.2. Make a flexible file/directory remove function.
On page 120 (Removing Files and Directories) in [?] we present a flexible Python function remove for removing one or more files and directories. The function can be called with a string or list of strings as argument. Implement the same function in Perl and explain why there is no need to test for the argument type in Perl in this application.

Exercise 1.3. Make a generic debug print function in Perl.
Make a Perl counterpart to the Python debug function from page 111 (Functions) in [?]. Test the function on a nested, heterogeneous list of list and hash structures. (Hint: use the Data::Dumper module.)

Exercise 1.4. Use Getopt::Long to parse the command line.
The Getopt::Long module allows you to specify and handle command-line options in Perl code. Read the man page documentation of this module and apply it to improve the simviz1.pl script.

Exercise 1.5. Interpret a Perl script.
What do the following commands do?

1> perl -ne 'print if $. > 10 and $. < 22' outbox
2> perl -lne 'END{print $.}' file
3> perl -lne 'BEGIN{$c=0}$c++ if /#/;END{print "$ARGV: $c"}' \
        file1 file2 file3

One hint might be to find the source of the man pages and search for special constructions.

Write a Bourne shell script that first writes an explanation of what the commands do to the screen and then demonstrates the command through an example (generate file(s) in the Bourne shell script before running the commands).

Exercise 1.6. Automatic editing of script headers.
The purpose of this exercise is to write a script which changes typical Perl headings (she-bang lines) like

#!/usr/bin/perl

or other hardcoded paths, to the Perl header

: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0;  # if running under some shell

Such a tool is useful when you want to make portable scripts that are always interpreted by, e.g., your own Perl installation. Search a set of directory trees and for each executable file that is not a directory, check if the first line matches text of the form #!\s*/.*/perl, and if so, edit the file automatically.

In case you become interested in the magic of Perl headers, you are encouraged to run the script src/perl/headerfun.sh.


Exercise 1.7. Mail a collection of files and directories.
The usual way to mail files electronically is to include them as attachments. However, manual attachment is inconvenient if there are many files. Sometimes one also wants to mail complete directory trees, and a mail script with the following interface would then be attractive:

mailfiles.pl -s subject -c comments -a [email protected] \
    -x '(.tex,.pl,.py,.c)' doc/README src app/test1 app/test2

In this interface, the -s option is used to specify a subject for the email, -c is used to insert a text in the body of the email, -a is used to assign an email address, and -x takes a list of extensions of files to be mailed (list items are separated by comma, and the list is enclosed in parentheses; the quotes are used to tell the shell that the list is one command-line argument). The rest of the command-line arguments are the file or directory names to be mailed. For example, doc/README is typically a file, whereas src, app/test1, and app/test2 could be directory trees. All files with extensions coinciding with those given through the -x option will be picked out from these directory trees and mailed. If the -x option is not specified, all files in the trees are to be sent.

The inner workings of mailfiles.pl will consists of (i) creating a list of allfiles to be mailed, then (ii) packing these files into one single file using a tar

facility, (iii) compressing the tarfile, and finally (iv) including the compressedtarfile as an attachment in a mail. The body of the mail should consist of theuser’s comments given through the -c option to mailfiles.pl in addition toa list of all the files that are included in the tarfile.

Perl has some useful modules for accomplishing the tasks in the mailfiles.plscript: Archive::Tar creates tarfiles [?, p. 164], Compress:Zlib can be used forgzip/gunzip file compression, and Mail::Send [?, p. 349] provides functions forsending emails with attachments. Information on the usage of these modulesis provided by the associated man pages.

Write the mailfiles.pl script and write also the inverse script mail2files.pl,which packs out the attached tarfile from an email saved to file.

Exercise 1.8. Make Perl less silent about errors.
Create a directory without write permissions and move to this directory. (Relevant Unix commands are mkdir tmp; chmod a-w tmp.) Run the following Perl command to create a new directory:

perl -e 'mkdir("tmp2", 0755);'

The one-line script is executed without error messages, but no directory is created. The similar Python command,

python -c ’import os; os.mkdir("tmp2")’


results in an error message (“Permission denied”). Use the Fatal module in Perl so that mkdir prints an error message if creation of a new directory is unsuccessful.

Exercise 1.9. Debug a Perl substitution command.
Assume that we want to remove comments like

/*
REMAINING TASKS:
...
...
*/

from a set of files. We try the following one-line substitution command in Perl:

perl -pi.old~~ -e 's#/\*\s*REMAINING TASKS.*?\*/##gs;' f1.c f2.c f3.c

However, this command does not seem to work. Can you explain what is going on in detail and create a script that works as intended?

Exercise 1.10. Document a script using POD.
First, read about Perl’s documentation system POD. A good source is perldoc perlpod. Then, find a non-trivial script you have written in Perl and equip it with documentation in man-page style, using the POD tool. Use the programs pod2html and pod2man to generate man pages for the script in HTML and nroff formats.

1.4.28 Building and Using Modules

The simplest possible Perl module consists of some Perl code stored in a file with extension .pm. Here is a sketch:

# module file: MyMod.pm

package MyMod;

$logfile = ""; # variable shared among subroutines

sub set_logfile {
    ($logfile) = @_;
    print "logfile=$logfile\n";
}

sub another_routine {
    print "inside another_routine\n";
    for $arg (@_) { print "argument=$arg\n"; }
}

1;


The package statement defines a namespace MyMod, i.e., when accessed from user code, all variables and functions in the module must be prefixed by the package name: MyMod::. This avoids name clashes with other modules that have subroutines or variables with the same names (e.g., set_logfile or logfile).
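The namespace mechanism can be tried out in a single file. The following is a minimal sketch, assuming for illustration that the module code and the user code live in the same script (normally MyMod would be stored in MyMod.pm); the variable $logfile in package main is a hypothetical addition that only serves to show the absence of a name clash.

```perl
use strict;
use warnings;

# Sketch: two packages in one file, each with its own $logfile
package MyMod;

our $logfile = "";   # package variable, full name $MyMod::logfile

sub set_logfile {
    ($logfile) = @_;
    print "logfile=$logfile\n";
}

package main;

our $logfile = "main.log";   # hypothetical: same short name, other namespace

MyMod::set_logfile('tmp.log');       # call through the package prefix
print "main:  $main::logfile\n";     # unchanged by the call above
print "MyMod: $MyMod::logfile\n";    # set by MyMod::set_logfile
```

The two variables share the short name logfile but are distinct because their full names carry different package prefixes.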

A module file must end with a statement that evaluates to true. The standard choice is 1; and Perl will issue a compilation error if you forget this crucial statement.

In the user’s code the MyMod module is imported by writing use MyMod. This import statement will be successful only if Perl knows where to find your module. There are four ways of telling Perl about your directory of modules:

1. apply the use lib statement, e.g., use lib "/home/hpl/lib";

2. set the PERLLIB or PERL5LIB environment variables,

3. modify Perl’s list of search directories for modules, or

4. install the module in the official Perl library directories.

Suppose we have stored the MyMod.pm in the subdirectory src/intrp/perl under the directory stored in the environment variable scripting. Let us explain the details of the four techniques for making Perl aware of our MyMod module. The use lib statement has the form

use lib "/home/hpl/scripting/src/perl";

We can work with paths built from environment variables, to make the script more portable:

use lib "$ENV{scripting}/src/perl";

The increased portability comes at the cost of requiring all users to have particular environment variables set correctly. (This requirement makes the technique less applicable to CGI scripts.)

As an alternative to use lib you can include your directories with modules in an environment variable PERLLIB or PERL5LIB. Here is an example in a Bash start-up file:

export PERLLIB=$PERLLIB:$scripting/src/perl

Another solution is to modify Perl’s list of directories, @INC, used to search for modules [?, p. 171]:

BEGIN { unshift @INC, "$ENV{scripting}/src/perl"; }
use MyMod;

If Perl has trouble finding libraries, it will print the content of the @INC variable, thus letting you check if all required directories are present or not.
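The content of @INC can also be inspected directly from a script. The sketch below is only an illustration: the directory name /home/hpl/lib is a hypothetical example, and since the unshift here runs at run time, it does not influence use statements (which are processed at compile time, hence the BEGIN block in the text).

```perl
use strict;
use warnings;

# Hypothetical module directory, used only for illustration
my $moddir = "/home/hpl/lib";

# Put the directory first in the search path unless it is already there
unshift @INC, $moddir unless grep { $_ eq $moddir } @INC;

# Print the search path, one directory per line
print "$_\n" for @INC;
```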


Installing a module in the official Perl library directories should be done in terms of a Makefile.PL file automatically generated by the h2xs program. An excellent description of this process is available in the item “Use h2xs to generate module boilerplate” in the “Effective Perl” book [?].

Being sure that Perl will find your new module, you can start using it. A complete test is shown next.

#!/usr/bin/perl

use lib "$ENV{scripting}/src/intro/perl";
use MyMod;

MyMod::set_logfile('tmp.log');
print "MyMod::logfile=$MyMod::logfile\n";
$p1 = "just some text";
MyMod::another_routine($p1, 'mytmp.tmp');

The visibility of the module’s functions and variables can easily be controlled (public versus private access). We refer to the description in [?] for details.

The example of a Perl module presented in this section is very simple; Perl has many additional features to control the behavior of a module. An overview of making modules is provided by the question “How do I create a module?” in the Perl FAQ (see link from doc.html). You can look up [?] for a more comprehensive description of making modules.

1.4.29 Binary Input/Output

Perl has the functions pack and unpack for handling binary data, and these functions work in much the same way as in Python (see Chapter 8.3.6 in [?]). For example, converting the Perl variable $np to the binary int format in C can be done with a call to pack:

$cvar = pack('i', $np);

Similarly, converting $r to the binary double format in C is done by

$cvar = pack('d', $r);

The format specification for the most common formats is the same as in Python, see perldoc -f pack for a complete list. Writing an array of real numbers to file in C double format can be performed by the loop

foreach $r (@array) { print FILE pack('d',$r); }

Reading C doubles from file can be done using read to get a specified numberof bytes and then unpack to convert a binary number to a Perl variable:

@array = ();
$d = length(pack('d', 0)); # no of bytes in a C double
# read $d bytes into $data in each pass in the loop:
while (($n = read(FILE, $data, $d)) == $d) {
    push(@array, unpack('d', $data)); # convert to Perl variable
}


Forcing a file to be treated as a binary file, which is necessary on some operating systems, is obtained by calling binmode(FILE), see perldoc -f binmode.
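The writing and reading steps above can be combined into a small round-trip test. This is a minimal sketch under some assumptions: the file name tmp_doubles.bin and the sample numbers are hypothetical, and lexical filehandles are used instead of the bareword FILE handle of the text.

```perl
use strict;
use warnings;

my $filename = "tmp_doubles.bin";      # hypothetical file name
my @array = (1.5, -2.25, 3.0e10);      # sample data

# Write the numbers in C double format
open(my $out, '>', $filename) or die "Cannot open $filename: $!";
binmode($out);                         # treat the file as binary
foreach my $r (@array) { print $out pack('d', $r); }
close($out);

# Read the numbers back, one double at a time
my $d = length(pack('d', 0));          # no of bytes in a C double
my @copy = ();
open(my $in, '<', $filename) or die "Cannot open $filename: $!";
binmode($in);
my $data;
while (read($in, $data, $d) == $d) {
    push(@copy, unpack('d', $data));   # convert to Perl variable
}
close($in);
unlink($filename);                     # remove the temporary file

print "read back: @copy\n";
```

Since the numbers are stored in the machine's native double format, the values read back are bitwise identical to those written, but the file is not portable between platforms with different byte order.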

1.5 Installing Perl and Additional Modules

This section explains how to install Perl and the modules referred to in this document. The set up follows Appendix A in [?]. You should therefore make sure that you have set the right environment variables and made the right directories according to [?] before proceeding.

1.5.1 Installing Basic Perl

Download a tarfile with the stable Perl distribution (see doc.html for an appropriate link) and unpack it in $SYSDIR/src/perl. Go to the new directory, which was created when unpacking the tarfile. First you need to run a configuration script to set up the proper makefiles for compiling Perl:

sh ./Configure -Dprefix=$PREFIX -sde

If this script executes successfully, the next step is to compile Perl by typing make. To test if the building of Perl was successful, type make minitest (can be omitted). To install Perl, type make install. Object files and other temporary files from building Perl can be removed by the command make clean.

1.5.2 Manual Installation of Perl Modules

The CPAN archive contains a large number of Perl modules. If you lack some special functionality in Perl and plan to develop it yourself, check out CPAN – very often you will find that others have already done the job! Modules from CPAN can be installed either manually or automatically. We recommend using the automatic procedure described below under the heading “Automatic Installation of Perl Modules”.

Manual installation of modules from CPAN follows this recipe:

1. Download a tarfile, e.g., SomeMod-0.01.tar.gz. (Here 0.01 denotes the version number of the SomeMod module). Unpack it in some suitable directory, e.g., $SYSDIR/src/perl/tools (if you follow the set-up in this appendix):

gzip -dc SomeMod-0.01.tar.gz | tar xvf -

2. Go to the module’s directory:

cd SomeMod-0.01

3. All modules shall contain a makefile called Makefile.PL. Run

perl Makefile.PL


and then

make

Sometimes you will see that a module depends on other modules, so you need to install these first.

4. To test the module you can write make test. This step is optional.

5. To install the module such that Perl can find it when you say use SomeMod in some script, you need to run

make install

This will copy the necessary files to the directories where Perl is installed. If you have your own installation of Perl, this will work fine, but if Perl was installed by your system administrator, you probably do not have write permission in the relevant directories. In that case, you can install a private copy of the module by specifying a path to the desired installation directory (say /home/hpl/lib):

perl Makefile.PL LIB=/home/hpl/lib

When you use the module in a Perl script you need to make Perl aware of where the module is installed, e.g., by writing

use lib "/home/hpl/lib";

prior to use SomeMod (see Chapter 1.4.28).

1.5.3 Automatic Installation of Perl Modules

Perl has a special module called CPAN for automating installation of Perl modules. The first time you make use of this utility you need to configure your set-up for the CPAN module. Invoke the CPAN shell:

perl -MCPAN -e shell

and answer no to the question “Are you ready for manual configuration?”. This will initiate an automatic configuration, which might be sufficient in many cases, especially if you have your own Perl installation. In case you do not like the decisions that were made, invoke the CPAN shell again and issue the command o conf init to revisit the dialog. For example, you may desire to specify a LIB option when the CPAN module runs perl Makefile.PL for you (see the description of manual installation of Perl modules).

To install a module, invoke the CPAN shell as described above and just say install and the module name, e.g.,

install Tar

The latest version of the Tar module will be fetched from a nearby CPAN site, together with all the modules that Tar depends on and that you do not already have. The tarfiles are unpacked and the installation procedure is run. You can also install modules without using the interactive CPAN shell:


perl -MCPAN -e 'install Tar'

We remark that even in this case some of the installation procedures will prompt the user for extra information, but the default values will often be sufficient. You can perform a manual installation if you run into problems (by default, the CPAN shell unpacks the sources in .cpan/build in your home directory – just go to a module’s directory and issue the commands required for manual installation).

To learn about the modules, look at their man pages (write, for instance, perldoc Tar).

1.5.4 The Required Perl Modules

The Perl examples in this document require some CPAN modules. You can install these manually or automatically. In the latter case, run

perl -MCPAN -e 'install Bundle::libnet'
perl -MCPAN -e 'install Tk'             # Perl/Tk
perl -MCPAN -e 'install LWP::Simple'    # WWW tools

# for CGI programming (dynamic Web pages):
perl -MCPAN -e 'install CGI::Debug'
perl -MCPAN -e 'install CGI::QuickForm'

# for an exercise:
perl -MCPAN -e 'install Tar'

# for the Regression module in "Python Scripting for
# Computational Science" book:
perl -MCPAN -e 'install Algorithm::Diff'

1.6 Perl Versus Python

For newcomers to Perl and Python it may be difficult to judge which language to use in a given project. Below are some comments on pros and cons of these two languages.

1.6.1 Python’s Advantages

– Python is easy to learn. Many people find Python to be considerably easier to learn than Perl. Python has also been used for teaching programming in high schools with success (footnote 14).

– Python has a very clean syntax. Python’s clean syntax makes programs easy to read and modify, also for others than the author. Going back to a Python script after a year is normally much easier than going back to the same type of code in Perl. A Python slogan (footnote 15) is “there should be one – and preferably only one – obvious way to do it”, in contrast to Perl’s “There’s More Than One Way To Do It” (see Chapter 1.3). The one-way-to-do-it philosophy in Python eliminates to a large extent the need for detailed programming standards in a project. Different Perl cultures use different styles and constructs, a fact that can be confusing when novice Perl programmers read other people’s code.

Footnote 14: See link in doc.html.

– Python has easy-to-use data structures. Lists and dictionaries (hashes) can be heterogeneous and arbitrarily nested in Python and Perl, but in Python there is no need to work with references and the associated dereferencing syntax as in Perl. That is, the Python syntax is intuitive and makes it easy to work with complicated, nested data structures. Moreover, the keys in Python dictionaries can be arbitrary (immutable) objects, whereas the keys in Perl hashes are limited to plain text.

– Python has full and natural support for object orientation. Object-oriented programming is more awkward in Perl as it was added at a late stage in the development of the language.

– Python’s application domain is wide. Python is a combination of a “scripting language” for gluing existing applications (for which Unix shells, Tcl, and Perl are competing tools) and a system programming language with object-orientation and support for complicated, nested, heterogeneous, user-defined data structures (an area where C++, Java, and to some extent Perl are competing tools). Perl also supports complicated data structures, but with a less attractive syntax; in Python you can work directly with the objects in the structure, while in Perl you must work through references, a fact that clutters the code with combinations of @, $, and backslashes and makes it less readable. (Compare, for example, the displaylist subroutine on page 31 with the corresponding straightforward implementation in Python.)

– Python supports multi-language programming in an easy way. There are several tools that make it easy to combine Python with C, C++, or Fortran libraries, as shown in Chapters 5, 9, and 10 in [?]. Combining Perl with C++, and in particular Fortran, is less well supported.

– Python programs can easily be equipped with a GUI. There are well-developed interfaces to various GUI tools, e.g., wxWindows, Qt, Gtk, Java Foundation Classes (JFC), and MFC (Microsoft Foundation Classes), besides this book’s main GUI library: Tk. Although Tk, and other libraries like Gtk, can be used from Perl as well, we find that Python’s simple class construct provides a convenient and efficient way of simplifying the development of independent and reusable GUI components.

Footnote 15: Type import this in a Python interpreter to see this and more slogans.

– Python supports numerical programming. There are Python modules for efficient numerics and visualization, making Python the most widespread and perhaps the preferred language for scripting in scientific computing.

– Python offers extensive error checking. Descriptive error messages are issued when something goes wrong at run-time. Python programs never dump core; you always get a traceback, showing where and why the error arose. Perl is much more silent (by default; see Chapter 1.4.25).

– Python statements can be written in an interactive shell. There is a very user-friendly interactive shell in the IDLE tool. Especially in combination with Python interfaces to your Fortran and C/C++ codes (as explained in Chapter 5.3 in [?]), this interactive mode allows you to work in a Matlab-style with your own libraries. To use Perl interactively, you need to invoke a debugger.

– Python and Java are seamlessly integrated. A special version of Python, called Jython, is implemented in 100% pure Java. Java classes can be used directly in Python programs, and Java code can make use of Python classes.

– The popularity of Python is fast growing. Many application areas where Perl has a strong tradition, e.g. databases, interactive Web pages, or text processing, see an increasing use of Python.

– It only takes a few hours to get started and be productive in Python. After an efficient and gentle start, you can pick up the more advanced parts of the language in parallel with applying it to your programming projects. The learning curve is, according to my teaching experience, clearly steeper in Perl.

1.6.2 Perl’s Advantages

Perl and Python have quite similar functionality and application areas. Even if you know Python, there are many reasons why you should seriously consider picking up the Perl language too. Some good reasons are mentioned below.

– Perl is probably the most widespread and popular dynamically typed language today. There is a wide collection of Perl-based tools freely available on the Internet. Therefore, it is likely that you encounter Perl scripts in your work, and knowing some basic Perl puts you in a position where you can fix weaknesses or tailor the scripts to your own needs.

– Perl has an extensive set of add-on modules that can be freely obtained from the searchable CPAN archive. You will often experience that there is a Perl module available for your scripting problem at hand.

– Python scripts can make use of Perl code through the pyperl tool.


– Perl is particularly popular for writing CGI scripts, i.e., creating interactive Web pages. Perl’s CGI tools are more comprehensive than those available for Python.

– Perl has (together with Python) more comprehensive interfaces to operating system commands and file operations than most other scripting languages.

– Regular expression syntax is best documented in a Perl context, i.e., you need to understand Perl code to look up all the good regular expression documentation. Although Python supports most of Perl’s regular expression syntax, many programmers (even Python fans) prefer Perl for heavy text processing with regular expressions.

– Perl has some nice features that allow small scripts to be expressed as very compact one-line operating system commands.

– Perl’s “There’s More Than One Way To Do It” principle gives the programmer greater flexibility (and perhaps more fun and challenges) than when using Python to solve a problem.

– Perl is usually faster than Python – up to about twice as fast for typical application areas of scripting in computational science and engineering.

1.6.3 Efficiency

It may be of interest to compare the efficiency of Perl versus Python. We have made a series of scripts in the directory tree src/efficiency for comparisons of Perl, Python, and to some extent Tcl, C, and C++. The relative efficiency of the datatrans1.* scripts is reported in Chapter 2.2.7 in [?]. In that example we found that Perl ran almost twice as fast as Python. In the following we present two other applications concerning reading text files with regular expressions and performing mathematical operations in for loops. We remark that the relative efficiency of the two languages reported here depends on the version of the interpreters, the C compiler used to compile them, and the hardware platform. The best way to extrapolate the results to your own programming life is to repeat the tests on your own machine.

Interpreting Text Files via Regular Expressions. The subdirectory regex of src/efficiency tests the efficiency of reading a big data file consisting of textual array data. Each line takes the form

[2,10,0]=16.7

meaning that a three-dimensional array has the value 16.7 for the indices (2, 10, 0). The purpose is then to read this file, line by line, extract the indices and the array value, using the regular expression

\[(\d+),(\d+),(\d+)\]=(.*)


and fill the corresponding array item. The Python implementation, in regexread.py, applies a three-dimensional NumPy array, whereas the Perl implementation, in regexread.pl, stores the numbers in a one-dimensional array (since three-dimensional arrays in Perl go via references and hence need some extra loops initially). The data file is created by the makedata.py script.
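To make the idea concrete, here is a minimal sketch of the regular expression approach with storage in a one-dimensional array. It is not the regexread.pl script; the array dimensions, the index formula, and the two sample lines are assumptions made for the illustration.

```perl
use strict;
use warnings;

# Assumed array dimensions (hypothetical; a real data file defines its own)
my ($ni, $nj, $nk) = (3, 11, 2);
my @a = (0) x ($ni*$nj*$nk);           # flat storage of a 3D array

# Sample lines in the format of the data file
my @lines = ("[2,10,0]=16.7", "[0,0,1]=-1.5");

for my $line (@lines) {
    if ($line =~ /\[(\d+),(\d+),(\d+)\]=(.*)/) {
        my ($i, $j, $k, $value) = ($1, $2, $3, $4);
        # map the index triple (i,j,k) to a single index
        $a[($i*$nj + $j)*$nk + $k] = $value;
    }
}

my $idx = (2*$nj + 10)*$nk + 0;        # position of the item (2,10,0)
print "item (2,10,0) has value $a[$idx]\n";
```

The single-index formula is one common way of laying out a three-dimensional array in a flat list; any consistent mapping would do.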

As usual in the efficiency tests in [?], we normalize the CPU time by the CPU time of the fastest implementation. Perl then ran at 1.0 and Python at 3.3.

(A faster solution to this programming problem would be to avoid regular expressions and instead split the line first with respect to =, strip the first (left-hand side) string, remove the brackets (first and last character), and then split with respect to comma. This implementation and its efficiency relative to using regular expressions is left as an exercise.)

If we instead load a data file where the array data are written in NumPy format, such that a plain eval(file.read()) type of statement recreates the array, the Python script (readeval.py) is very short and ran at 2.8 CPU time units. These numbers are independent of the size of the array as long as the array fits into memory.

From these tests on interpreting text files with regular expressions, we may conclude that Perl is more than three times faster than Python. The reading in Python is a bit faster if the array is in NumPy format (which is expected since there is less text to interpret in this format). We remark that large array structures should preferably be stored in binary format. The difference between Perl and Python is then much smaller.

Function Evaluations. The next test concerns computing the integral $\int_a^b f(x)\,dx$ by the Trapezoidal rule:

$$I \approx h\left(\frac{1}{2}f(a) + \sum_{i=1}^{n-1} f(a + ih) + \frac{1}{2}f(b)\right), \quad h = (b - a)/n .$$

A program implementing this formula, which is in fact the purpose of Exercise 4.5 in [?], will spend almost all its time on evaluating the function f(x).
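A minimal Perl sketch of the Trapezoidal rule may look as follows. This is not the int.pl script from src/efficiency/integration; the subroutine name trapezoidal, its argument order, and the test function x*x are assumptions chosen for the illustration.

```perl
use strict;
use warnings;

# Trapezoidal rule for the integral of f over [lo, hi] with n intervals.
# $f is a reference to a subroutine taking x and returning f(x).
sub trapezoidal {
    my ($f, $lo, $hi, $n) = @_;
    my $h = ($hi - $lo)/$n;
    my $sum = 0.5*($f->($lo) + $f->($hi));   # end points have weight 1/2
    for my $i (1 .. $n-1) {
        $sum += $f->($lo + $i*$h);           # interior points have weight 1
    }
    return $h*$sum;
}

# Example: integrate x*x from 0 to 1 (the exact value is 1/3)
my $I = trapezoidal(sub { my $x = shift; return $x*$x; }, 0, 1, 1000);
printf "integral of x*x on [0,1]: %.6f\n", $I;
```

The loop calls the function once per interior point, which illustrates why the run time is dominated by function evaluations.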

The scripts for computing I are found in src/efficiency/integration; int.py is the Python implementation, int.pl is the corresponding Perl version, and int_vec.py is a vectorized Python version based on NumPy. Since int.pl ran faster than int.py, we scale the CPU times by the former. The Perl version then ran at 1.0, plain Python at 1.2, and the vectorized Python version at 0.1. Matlab implementations, based on both plain loops and vectorization, ran at approximately the same speed as the corresponding Python versions.


1.7 GUI Programming with Perl/Tk

Two types of graphical interfaces to programs are covered in this section. Chapter 1.7 explains how to apply the Tk library from Perl for creating standard graphical user interfaces in Perl programs. Chapter 1.8 addresses graphical user interfaces in Web pages. These interfaces are handled by CGI scripts.

The exposition in this section assumes the reader to be familiar with Python/Tkinter programming from Chapter 6 in [?] in general and Chapter 6.1 in [?] in particular.

The Tk library for easy creation of graphical user interfaces can be called from Perl, using a Perl module referred to as Perl/Tk. Programming Perl/Tk is, not surprisingly, very similar to programming Python/Tkinter. The purpose of this section is therefore to illustrate the syntax differences a Python/Tkinter programmer faces when working with Perl/Tk. The purpose is to get the reader started with the Perl/Tk syntax such that further GUI programming in Perl scripts for a Python/Tkinter programmer becomes a trivial task.

All the GUI scripts *.pl in the src/perl directory correspond to the similar Python scripts with the same basenames and extension .py in src/py/gui.

Documentation. A book [?] is devoted to GUI building with Perl. This book also contains a complete reference to Perl/Tk widgets. Tk comes, of course, with man pages that can be accessed with perldoc: perldoc Tk::Scale, for instance. There is also a Perl/Tk quick reference book [?]. The doc.html contains links to the Perl/Tk FAQ as well as to electronic Perl/Tk introductions.

The source of the Perl/Tk distribution contains a demo of Perl/Tk widgets. Go to the Perl/Tk package’s directory, then to its demos subdirectory, and write perl widgets. A GUI appears with a list of Tk demos. For each demo you can view the source code in a separate window. Newcomers to Perl/Tk might be confused by the Perl framework for organizing the demos. However, concentrating on the specific widget commands and knowing the meaning of the qw operator (see page 22) should be sufficient for taking advantage of this collection of Perl/Tk examples.

Megawidgets. Perl/Tk contains the Tk package plus many of the megawidgets in the Tix library (which is originally a Tcl/Tk extension). The collection of widgets is quite rich and includes scrolled widgets, notebooks, advanced lists, combo boxes, and so on. A nice feature of Perl/Tk is that the module comes with all the necessary C files for compilation, that is, there is no need to install the native Tcl/Tk libraries or other Perl extension modules. All you have to say to install Perl/Tk is perl -MCPAN -e 'install Tk'.

We shall in the following write the Scientific Hello World GUI scripts from Chapter 6.1 in [?] using Perl’s Tk extension. This will point out how easy it is to use Tk from Perl once you are familiar with the Tk widgets, their functions, arguments, and functionality.

The scripts presented in this section are found in the directory src/perl.

1.7.1 The First Perl/Tk Encounter

We start out with creating a GUI as in Figure 6.1 on page 222 in [?]. In this GUI the user can fill in a number, push the equals button and see the sine of the number written to the right in the window. The script coded in Perl/Tk is called hwGUI1.pl and looks as follows.

: # *-*-perl-*-*
  eval 'exec perl -w -S $0 ${1+"$@"}'
    if 0; # if running under some shell

use Tk;
# create main window perl variable ($main_win) to hold all widgets:
$main_win = MainWindow->new();
$top = $main_win->Frame();    # create frame
$top->pack(-side => 'top');   # pack frame in main window

$hwtext = $top->Label(-text => "Hello, World! The sine of");
$hwtext->pack(-side => 'left');

$r = 1.2; # default
$r_entry = $top->Entry(-width => 6, -relief => 'sunken',
                       -textvariable => \$r);
$r_entry->pack(-side => 'left');

$compute = $top->Button(-text => " equals ", -command => \&comp_s);
$compute->pack(-side => 'left');

sub comp_s { $s = sin($r); }

$s_label = $top->Label(-textvariable => \$s, -width => 18);
$s_label->pack(-side => 'left');

MainLoop();

Let us explain the script line by line. The first three lines are standard and ensure that we call the first Perl interpreter in our path, see Chapter 1.1.1.

Any Perl script using the Tk extension must begin with

use Tk;

Unfortunately, many Perl installations do not contain the Tk package. You can easily test whether your Perl interpreter understands Tk commands by running

perl -e 'use Tk'

If you get an error message of the type


Can't locate Tk.pm in @INC (@INC contains: ....

you need to get Perl/Tk installed. Instructions are provided in Chapter 1.5.

Before we can create and pack widgets with Perl/Tk we need to make a main window:

$main_win = MainWindow->new();

Having the main window, we normally start with making a frame to hold all our widgets and subframes:

$top = $main_win->Frame();    # create frame
$top->pack(-side => 'top');   # pack frame in main window

The label “Hello, World! The sine of” is constructed in this way in Perl/Tk:

$hwtext = $top->Label(-text => "Hello, World! The sine of");
$hwtext->pack(-side => 'left');

The reader should notice the close similarity with the corresponding code expressed in Python/Tkinter:

hwtext = Label(top, text="Hello, World! The sine of")
hwtext.pack(side='left')

If desired, we can merge the creation of the Label object and its packing in one statement:

$top->Label(-text => "Hello, World! The sine of")->pack(-side => 'left');

Now we do not have any variable holding the label anymore, which means that we cannot update (configure) the label later. Usually, this is a disadvantage.

A text entry tied to the Perl variable $r is created by calling the Entry method in our top frame:

$r = 1.2; # default
$r_entry = $top->Entry(-width => 6, -relief => 'sunken',
                       -textvariable => \$r);
$r_entry->pack(-side => 'left');

The width of the text entry equals 6 characters, and the entry is displayed with a sunken relief, giving a 3D effect in the GUI. Technically, we send a reference to the variable $r, denoted by \$r in Perl, as argument to the entry widget.

The button in our GUI is supposed to compute $s = sin($r) when being pressed. The corresponding Perl/Tk code is

$compute = $top->Button(-text => " equals ", -command => \&comp_s);
$compute->pack(-side => 'left');

sub comp_s { $s = sin($r); }


The subroutine to be invoked by pressing the button is assigned by the -command argument. The value of the argument is a reference to the subroutine, produced by the syntax \&comp_s in Perl.
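The two kinds of references used so far, \$r and \&comp_s, can be explored without Tk. The sketch below is a plain Perl illustration of the mechanism only; the names $r_ref and $command are hypothetical, and the direct calls stand in for what a widget might do internally.

```perl
use strict;
use warnings;

my $r = 1.2;    # the variable tied to the text entry in the GUI
my $s;          # the variable displayed by the result label

sub comp_s { $s = sin($r); }

my $r_ref   = \$r;         # reference to a scalar variable
my $command = \&comp_s;    # reference to a subroutine

$$r_ref = 0.5;             # change $r through the reference
$command->();              # call comp_s through the reference

print "r=$r s=$s\n";
```

Whoever holds the references can read or update $r and trigger the computation without knowing the names of the variable or the subroutine, which is what makes them suitable as widget arguments.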

The last widget is a label for displaying the result of the sine computation.The text in this label must reflect the value of the Perl variable $s:

$s_label = $top->Label(-textvariable => \$s, -width => 18);
$s_label->pack(-side => 'left');

or shorter:

$top->Label(-textvariable=>\$s, -width=>18)->pack(-side=>'left');

Finally, we need to call the event loop when all GUI components are declared and packed:

MainLoop();

This loop waits for the user’s events, such as pressing buttons and writing text, and performs the corresponding actions as defined by the widgets. If you forget to call MainLoop, nothing will be shown on the screen and the script just hangs.

1.7.2 The Similarity of Python/Tkinter and Perl/Tk

At this stage we can outline how a typical Python/Tkinter statement translates to Perl/Tk. Consider

widget_var = widget_type(parent_widget, opt1=v1, opt2=v2,
                         command=myfunc)

In Perl/Tk this takes the form

$widget_var = $parent_widget->widget_type(-opt1 => v1, -opt2 => v2,
                                          -command => \&myfunc);

1.7.3 Binding Events

Binding the event “pressing return in the text entry $r_entry” to calling the comp_s subroutine is accomplished with this statement in Perl/Tk:

$r_entry->bind('<Return>', \&comp_s);

Exercise 1.11. Perl/Tk version of the GUI in Chapter 6.2 in [?].
Write a GUI for the simulation and visualization script in Chapter 1.2, following the simvizGUI1.py Python script from Chapter 6.2 in [?].


Hint: The PhotoImage class in the Python script takes the name Photo inPerl/Tk, and the construction is also slightly different. Another differencebetween Python/Tkinter and Perl/Tk is that the from_ argument in Scale

has the stright name from in Perl/Tk. Check out the Tk::Image and Tk::Scale

man pages with perldoc.�

A List of Common Widget Operations. Chapter 6.3 (A List of Common Widget Operations) in [?] describes a script demoGUI.py containing many of the most common Tk widgets. The script serves as both a demo of the look and workings of the widgets and as a kind of quick reference for typical widget constructions. Newcomers to Tk GUI programming may find the demo script very useful as one can simply copy code segments from this script to get a widget up and running. A simplified (and actually quite different) Perl/Tk version of the demoGUI.py script is found in demoGUI.pl in the src/perl directory.

1.8 Web Interfaces and CGI Programming

The number of Perl users increased dramatically in the latter half of the 1990s when Perl's powerful text processing features were recognized to make interactive Web pages based on the Common Gateway Interface (CGI) much easier than in traditional languages like Fortran, C, C++, and even Java. Perl is now a core technology in the Internet programming world, and several modules for CGI programming are available.

Perl offers more CGI and network programming modules than Python. On the other hand, the powerful Zope and Plone tools for managing dynamic Web pages are based on and programmable from Python. The optimal CGI programming environment is therefore likely to be a Python program calling up special Perl functionality when desired. This is fortunately a reality as the company ActiveState has created a tool pyperl for calling Perl from Python.

Some Python tools for CGI programming are explained in Chapter 7 (Web Interfaces and CGI Programming) in [?]. Here we shall have a look at similar tools in Perl and see how they apply to the examples from [?]. This implies that the present section is written for readers who have grasped the basics of CGI programming in Python from Chapter 7 (Web Interfaces and CGI Programming) in [?].

1.8.1 Web Versions of the Scientific Hello World Program

We start out with a Web interface to our Scientific Hello World program. Figure 7.1 (Introductory CGI Scripts) on page 290 in [?] displays a Web page where the user can fill in a value r, click the equals button, and see sin(r) being written in a new Web page, see Figure 7.2 (Introductory CGI Scripts) on page 290 in [?]. The Web page with a text entry is a plain HTML file:

<HTML><BODY BGCOLOR="white">
<FORM ACTION="hw1.pl.cgi" METHOD="POST">
Hello, World! The sine of
<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="1.2">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton">
</FORM></BODY></HTML>

This file is identical to the hw1-py.html file from page 291 (Web Forms and CGI Scripts) in [?], used in conjunction with a CGI script in Python, except for the ACTION parameter, which now specifies a Perl script hw1.pl.cgi to be called. This CGI script is almost a line-by-line translation of the similar script in Python. Perl has a module called CGI with easy access to form variables. As an example, the value of the form parameter r is extracted by calling the param function in the CGI module:

use CGI;
$form = CGI->new();
$r = $form->param("r");

With this information, and having a look at the Python counterpart hw1.py.cgi, we can easily create the CGI script in Perl:

#!/usr/local/bin/perl
use CGI;
# required opening of all CGI scripts with output:
print "Content-type: text/html\n\n";
# extract the value of the variable "r" (in the text field):
$form = CGI->new();
$r = $form->param("r"); $s = sin($r);
# print answer (very primitive HTML code):
print "Hello, World! The sine of $r equals $s\n";

The script is found in the file src/perl/hw1.pl.cgi. Observe that the output is incomplete HTML code, but it will most likely be shown correctly in a browser.
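For comparison, a complete HTML page can be produced with a here-document. The sketch below is not part of hw1.pl.cgi: html_page is a hypothetical helper, the page title is our own choice, and a fixed value of $r replaces the form variable such that the code runs on the command line without the CGI module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# html_page is a hypothetical helper that wraps the answer in a
# complete HTML document and returns it as a string.
sub html_page {
    my ($r, $s) = @_;
    return <<EOF;
<HTML><HEAD><TITLE>Hello, Web World!</TITLE></HEAD>
<BODY BGCOLOR="white">
Hello, World! The sine of $r equals $s
</BODY></HTML>
EOF
}

my $r = 1.2;   # fixed value instead of a form variable
print "Content-type: text/html\n\n";
print html_page($r, sin($r));
```

In a real CGI script the value of $r would come from the param function as shown above.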

In an improved version of this CGI script we let the user stay within the same page, i.e., the Web page acts as a sine calculator. This is accomplished by letting the script generate the HTML code for the Web form:

#!/usr/local/bin/perl
use CGI;
# required opening of all CGI scripts with output:
print "Content-type: text/html\n\n";
# extract the value of the variable "r" (in the text field):
$form = CGI->new();
if (defined($form->param("r"))) {
    $r = $form->param("r"); $s = sin($r);
} else {
    $r = "1.2"; $s = "";  # default
}
# print form with value:
print <<EOF;
<HTML><BODY BGCOLOR="white">
<FORM ACTION="hw2.pl.cgi" METHOD="POST">
Hello, World! The sine of
<INPUT TYPE="text" NAME="r" SIZE="10" VALUE="$r">
<INPUT TYPE="submit" VALUE="equals" NAME="equalsbutton"> $s
</FORM></BODY></HTML>
EOF

The if test can in principle be omitted, that is, we can write

$r = $form->param("r");

even when no form variables are available (which is the case the first time the script is launched). However, the sine computation requires that $r is a valid real number. The script can be found in the file src/perl/hw2.pl.cgi and is a line-by-line translation of the similar Python script.
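The requirement that $r is a valid real number can be checked explicitly before the sine is computed. The following sketch is one possible way of doing this, assuming that an empty result field is an acceptable response to invalid input; safe_sine is a hypothetical helper and not part of hw2.pl.cgi, while looks_like_number comes from the standard Scalar::Util module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# safe_sine is a hypothetical helper. It returns sin($r) for numeric
# input and an empty string otherwise, so that the result field in
# the form is simply left blank.
sub safe_sine {
    my ($r) = @_;
    return "" unless defined $r && looks_like_number($r);
    return sin($r);
}

# Try a valid number, some text, and a missing form variable:
for my $input ("1.2", "abc", undef) {
    my $shown = defined $input ? "'$input'" : "undef";
    my $s = safe_sine($input);
    print "$shown -> '$s'\n";
}
```

With such a helper the if test in the script could be reduced to a test on whether the form variable is defined, since text that is not a number no longer reaches the sin function.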

1.8.2 Debugging CGI Scripts in Perl with CGI::Debug

Debugging CGI scripts quickly becomes challenging as the browser responds with the standard message Internal Server Error if something goes wrong. A compulsory step is to run the CGI script on the command line to detect errors and warnings. Form variables can be provided as command-line arguments to Perl scripts. For example,

perl myscript.cgi formvar1='some text' var2='another answer' q=4

yields three form variables: formvar1, var2, and q.

The Perl module CGI::Debug is very useful for debugging CGI scripts when run in a browser, because it prints informative and helpful messages about errors. The following demonstrations will indicate the basic features of CGI::Debug.

Because many standard Perl installations do not contain the CGI::Debug module, you may need to install the module, most easily accomplished using the CPAN shell, see page 53. In case you install CGI::Debug as part of your own Perl installation, the CGI script needs to run this Perl interpreter. That is, the header in the CGI script must contain the complete path to your Perl interpreter, e.g. (see footnote 16),

#!/some/long/path/to/my/own/bin/perl -w
use CGI::Debug;

Alternatively, you can notify any Perl interpreter about the location of the directory where CGI::Debug is installed, e.g.,

#!/usr/local/bin/perl -w
use lib '/some/long/path/to/my/own/lib';  # CGI::Debug is here
use CGI::Debug;

16 A wrapper script, as explained in Chapter 7.1.4 (A General Shell Script Wrapper for CGI Scripts) in [?], is not necessary unless you have linked your Perl interpreter with special, local shared libraries.

Writing use CGI::Debug turns on debugging facilities. This will not affect a working script. We have included CGI::Debug and introduced two errors in a version of the hw2.pl.cgi script (the erroneous version is called hw2e.pl.cgi).

After the CGI object form is created we write some statements that trigger warnings and errors:

print "$undefined_var\n";                  # warning
print "$form->param("undefined_key")\n";   # error
# CGI scripts are normally not allowed to write files:
open(FILE, ">myfile") or die "Cannot open myfile!";

Running the script in a browser leads to a compilation error from Perl and a corresponding report written to the browser by CGI::Debug. The report is something like

syntax error at /some/path/hw2e.pl.cgi line 13,
near ""$form->param("undefined_key"

Execution of /some/path/hw2e.pl.cgi aborted due to compilation errors.

Your program doesn't produce ANY output!

Parameters
----------
equalsbutton = 6[equals]
r = 3[1.2]

followed by a listing of the contents of all environment variables related to CGI programming.
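The syntax error arises because the inner double quotes around undefined_key terminate the string that starts after print. Replacing them by single quotes removes the syntax error, but the method call is still not evaluated, because Perl interpolates only the variable $form inside a double-quoted string. The sketch below illustrates this behavior and two common remedies. The package Form is a hypothetical stand-in for the CGI object, introduced here only to make the example run without the CGI module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Form is a hypothetical stand-in for the CGI object; its param method
# looks up a key in a small hash and returns undef for unknown keys.
package Form;
sub new   { my ($class, %vars) = @_; return bless { %vars }, $class; }
sub param { my ($self, $key) = @_; return $self->{$key}; }

package main;
my $form = Form->new(r => 1.2);

# Only the variable $form is interpolated here, not the method call;
# the output is something like Form=HASH(0x...)->param('r').
print "$form->param('r')\n";

# Two ways of getting the value of the method call into the string:
my $line1 = "r = " . $form->param('r') . "\n";   # concatenation
my $line2 = "r = @{[ $form->param('r') ]}\n";    # interpolated list
print $line1, $line2;
```

Concatenation is usually the most readable choice; the @{[ ... ]} construction is handy inside here-documents.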

Let us look at the effect of removing the wrong hash index. Running the script again leads to a run-time error since the CGI script is run by a "nobody" who is not likely to have permission to create new files for writing. The open command will work if the owner of the directory creates a file myfile and gives all others permission to write on it, or if the owner gives all others write access to the current directory. CGI::Debug will in the present example write the error message from the die statement in a new Web page. Of course, both the undefined key and the undefined variable will be caught by running the script from the command line (where the owner, who executes the script, has write permission in the current directory). Opening a file, on the other hand, works well when running the script directly from the command line, but not when the script is run in a browser. This latter type of error is very common, and CGI::Debug helps you detect it easily.
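The message in the die statement becomes more informative if it includes Perl's $! variable, which holds the reason reported by the operating system (for example a missing permission). The following sketch shows the idea outside a CGI context; try_open_for_writing is a hypothetical helper, and a temporary directory from the standard File::Temp module is used such that the example can be run anywhere. The failure is provoked by a non-existing directory rather than by missing write permission.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# try_open_for_writing is a hypothetical helper: it returns a file
# handle on success, and undef plus the system error message on failure.
sub try_open_for_writing {
    my ($filename) = @_;
    open(my $fh, '>', $filename) or return (undef, "$!");
    return ($fh, "");
}

my $dir = tempdir(CLEANUP => 1);   # scratch directory for the demo

# Opening a file in a writable directory succeeds:
my ($fh, $err) = try_open_for_writing("$dir/myfile");
if ($fh) { print $fh "some data\n"; close($fh); }

# A directory that does not exist triggers the failure branch:
my ($fh2, $err2) = try_open_for_writing("$dir/no/such/dir/myfile");
print "Cannot open myfile: $err2\n" unless $fh2;
```

In a CGI script one would typically write open(...) or die "Cannot open myfile: $!"; the text of $! then appears in the error report.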

Finally, we can remove the open statement and the script runs correctly, but Perl issues a warning about the undefined_var variable. This warning is written by CGI::Debug at the end of the Web page.

1.8.3 Using Perl’s CGI Module to Construct Forms

The CGI module has numerous functions for writing HTML code. Consider the CGI script hw2.pl.cgi from page 64. Instead of writing the HTML code as plain text in print statements, we employ some hopefully self-explanatory functions from the CGI module:

#!/usr/local/bin/perl
use CGI;
$wp = CGI->new();
print $wp->header,
      $wp->start_html(-title=>"Hello, Web World!",
                      -BGCOLOR=>'white'),
      #$wp->start_form(-action=>'hw3.pl.cgi'), # default
      $wp->start_form,
      "Hello, World! The sine of ",
      $wp->textfield(-name=>'r', -default=>1.2, -size=>10),
      "\n", $wp->submit(-name=>'equals'), " ";
if ($wp->param()) {
    $r = $wp->param("r"); $s = sin($r);
} else { $s = sin(1.2); }
print $s, "\n", $wp->end_form,
      $wp->end_html, "\n";

The complete source code is found in src/perl/hw3.pl.cgi. This code can be made nicer by removing the explicit appearance of the CGI object, i.e., the $wp-> construct. The statement

use CGI qw/:standard/;

imports the most standard CGI functions directly into the namespace so we do not need to work with a CGI object. This enables the following more readable version of the script:

#!/usr/local/bin/perl
use CGI qw/:standard/;
print header,
      start_html(-title=>"Hello, Web World!",
                 -BGCOLOR=>'white'),
      start_form,
      "Hello, World! The sine of ",
      textfield(-name=>'r', -default=>1.2, -size=>10),
      "\n", submit(-name=>'equals'), " ";
if (param()) { $r = param("r"); $s = sin($r); }
else { $s = sin(1.2); }
print $s, "\n", end_form, end_html, "\n";

The source code is found in src/perl/hw4.pl.cgi. More information on features in the CGI module can be obtained from the well-written man page; just type perldoc CGI.

Using Perl's CGI::QuickForm Module to Construct Forms. Plain output of HTML text or using the CGI utilities for generating HTML code quickly leads to somewhat lengthy scripts, especially when forms with many elements need to use HTML tables for achieving satisfactory layout. The CGI::QuickForm module takes the specification of a form and automatically generates an HTML page, typeset in a nicely formatted way. The Web version of our interactive Scientific Hello World program can be expressed as follows using the CGI::QuickForm module:

#!/ifi/ganglot/k00/inf3330/www_docs/packages/SunOS/bin/perl
use CGI qw/:standard/;
use CGI::QuickForm;

show_form(
    -ACCEPT => \&on_valid_form,  # must be supplied
    -TITLE => "Hello, Web World!",
    -FIELDS => [
        { -LABEL => 'Hello, World! The sine of ',
          -TYPE => 'textfield', -name => 'r',
          -default => 1.2, },
    ],
    -BUTTONS => [ {-name => 'compute'}, ],  # "submit" button(s)
);

sub on_valid_form {
    my $r = param('r');
    my $s = sin($r);
    print header, $s;  # write new page with the answer
}

You can find the source in src/perl/hw5.pl.cgi. Unfortunately, few standard Perl installations contain the CGI::QuickForm module. We therefore have to hardcode the path to our own Perl interpreter, which knows about CGI::QuickForm (if you have followed the instructions in Chapter 1.5), or we can apply a use lib statement to notify the Perl interpreter where our CGI::QuickForm module is installed. Since CGI scripts are run by a "nobody" user and not yourself, the script will not be aware of your special environment variables. You therefore need to hardcode the complete path in a use lib statement. If you need to make use of environment variables that are under your control, you can wrap a Bourne shell script around your CGI script; details are provided in Chapter 7.1.4 (A General Shell Script Wrapper for CGI Scripts) in [?]. When you let a CGI script use your own Perl, make sure that the Perl interpreter and its libraries are readable for all users.

The CGI::QuickForm man page offers a good introduction to the many features of this module. Although the use of CGI::QuickForm in our Scientific Hello World example is overkill, the advantages of the module become more apparent when the form contains several form elements and when validation of user input is desired.

One example showing that CGI::QuickForm is handy is the src/perl/simviz1.pl.cgi script, which is a counterpart to the CGI script cg/src/python/simviz1.py.cgi from Chapter 7.2 (Adding Web Interfaces to Scripts) in [?]. The purpose of this script is to create a Web interface to the oscillator code from Chapter 2.3 (Gluing Stand-Alone Applications) in [?]. The user can type in values of the mathematical parameters and get a plot of the solution.

2.5. GUI Programming with Tcl/Tk 101

pack $hw.text -side top -pady 20

# second frame consists of the sine computations:
set sine [ frame $top.sine ]
pack $sine -side top -pady 20 -padx 10
label $sine.intro -text "The sine of "
set r 1.2;  # default
entry $sine.r -width 6 -relief sunken -textvariable r
button $sine.eq -text " equals" -command comp_s -relief flat
label $sine.s -textvariable s -width 14
pack $sine.intro $sine.r $sine.eq $sine.s -side left

# third frame consists of a quit button (we drop the
# frame and add the button straight into the toplevel widget)
button $top.quit -text "Goodbye, GUI World!" -command exit \
    -background yellow -foreground blue
pack $top.quit -side top -pady 5 -fill x

bind $sine.r <Return> comp_s
proc comp_s { } { global r; global s; set s [ expr sin($r) ] }
bind . <q> exit

2.5.6 Configuring Widgets

In Chapter 6.1.6 (An Alternative to Tkinter Variables) in [?] we also develop a version of the Scientific Hello World GUI where we do not tie variables to widgets but instead extract their content with a get function and set their values either with set or configure. The hwGUI7_novar.tcl script is a version of hwGUI7.tcl employing this strategy. After having created an entry,

label .sine.intro -text "The sine of "
entry .sine.r -width 6 -relief sunken

we insert a default value:

.sine.r insert end 1.2; # default value

The default value can, of course, also be inserted at construction time of the entry.

When we need to read the contents of the entry (in the comp_s routine) we simply use the get command:

set r [ .sine.r get ]
set s [ expr sin($r) ]

Updating the contents of the result label is done with configure:

.sine.s configure -text $s

where .sine.s is the name of a label widget. This means that the comp_s function takes the form

proc comp_s { } {
    global .sine.r .sine.s;  # access global widgets
    set r [ .sine.r get ]
    set s [ expr sin($r) ]
    .sine.s configure -text $s
}

2.5.7 The Grid Geometry Manager

The grid geometry manager works as explained in Chapter 6.1.8 (An Introduction to the Grid Geometry Manager) in [?], and a Tcl/Tk version of the Scientific Hello World GUI is found in the file hwGUI9_grid.tcl.

With a background from these examples and some experience with the Tcl syntax, the reader should have a good starting point for being productive with Tcl/Tk programming, or at least, the reader should be able to look up Tk documentation written with Tcl syntax (for example, the original Tk man pages). The Tk functions and options are almost identical regardless of the language we use. Unfortunately, some small differences exist, and these can be annoying. If a particular Tk construction triggers an error when expressed in another language, you should try to find man pages for the Tk bindings in the language in question and check if the option has a different name or different legal values.

Exercise 2.1. Tcl/Tk version of the GUI in Chapter 6.2 (Adding GUIs to Scripts) in [?].

Write the simviz1.py script from Chapter 6.2 (Adding GUIs to Scripts) in [?] in Tcl/Tk. You can use the Scientific Hello World GUIs as starting point, but you need to consult the Tk man pages to find the proper syntax for a slider and a picture in Tcl/Tk.

Hint: PhotoImage in the Python script is a counterpart to the image procedure in Tk, being an example of different names in the Python and Tcl interfaces to Tk. The construction of the image is also different.

A List of Common Widget Operations. Chapter 6.3 (A List of Common Widget Operations) in [?] presents a demo script containing many of the most important widgets in Tk and Tk extensions. A simpler type of demo script has been written in Tcl/Tk and [incr Widgets], see demoGUI.tcl in src/tcl. The demoGUI.tcl script serves as a rough overview of common GUI constructions in Tcl/Tk and [incr Widgets].

Bibliography