Introduction to Linux for Computational Research

©University of Reading 2009
Thursday, 27 August 2009

Course Notes, Exercises and Example Answers

Linux systems are used in many areas of computational research. This course introduces the features of Linux that are useful for computational research, for instance when using ThamesBlue or the Campus Grid. The course is a mixture of exercises and reference material.

Any trademarks in these notes are the property of their owners: Linux is a registered trademark of Linus Torvalds. Windows is a registered trademark of Microsoft Corporation. Ubuntu is a registered trademark of Canonical Ltd.

IT Services and Systems Engineering



Table of Contents

The Practical Session
First Session
    Before you start
    Exercise 1: Opening a terminal
    Exercise 2: Where am I?
    Exercise 3: Permissions
    Exercise 4: Wildcards
    Exercise 5: Creating a file with a text editor
    Exercise 6: Looking inside files
    Exercise 7: Copying, Moving and Deleting files and directories
    Exercise 8: The Linux environment
    Exercise 9: Job Control
Second Session
    Exercise 1: The Linux File System
    Exercise 2: Redirection
    Exercise 3: More commands - filters
    Exercise 4: Command substitution
    Exercise 5: Making a shell script (1)
    Exercise 6: Making a shell script (2)
    Exercise 7: Installing Programs
    Exercise 8: Manually Installing Programs
    Exercise 9: Logging on to Remote Systems
    Exercise 10: More Utilities
More Help
Answers
    Session 1
    Session 2


The Practical Session

Answers

You will find the answers to the main exercises at the end of the notes.

Fonts

This is the font that is used for general information throughout this guide.

Text with a shaded background like this contains instructions which you should carry out.

Commands and options to be typed into the computer are in this font. Commands are typed in at a “prompt”, which is represented by:

prompt>

Parts of a command you should replace (e.g. fillinyourname) are in italics.

So if you saw:

prompt> ls -l filename

you would type “ls -l ” followed by the filename of your choice, and then hit the return key.

Downloading the CD

You can take the CD you are using back with you. If you were not at the course then please follow these instructions:

Go to http://www.ubuntu.com/getubuntu/download

Click “Begin Download Ubuntu 9.10 Desktop (32-bit)” (this is a slightly newer version than the one used in the course, but should be compatible).

You will get a .iso file, which is a copy of all the data on a CD; you will need to write this to a CD. You’ll need a CD writer drive and the software to go with it. There are instructions at https://help.ubuntu.com/community/BurningIsoHowto which cover some Windows and Mac clients.

Re-boot your machine with the CD in the drive. If this doesn’t just work, then pressing F12 or ESC at boot time often brings up a menu for you to choose from, but this varies widely between machines and it is best to ask someone who knows. This is not specific to Linux, so your local IT whiz should know what to do.

When the Ubuntu screen appears, select “English” then “Try Ubuntu without Installing”.


First Session

Before you start

You should find that the machine has already been booted into the Ubuntu Linux desktop (it will be generally a brown colour). By default the live CD we are using has a US keyboard layout - you’ll find these activities much easier if we change it to UK.

In the top left corner click “System”, then “Preferences”, then “Keyboard”. In the window that opens select the “Layouts” tab at the top. Next click “Add”, select “United Kingdom” in the “Country:” list, and then click “Add”. Next click on “USA” and then “Remove”. You should now only have “United Kingdom” in the central list.

Finally click “Apply System-Wide...” and then “Close”.

Warning: DO NOT click on “Install” or anything similar at any point during this tutorial - it will make those in IT Services who’ll have to re-install the machine very unhappy.


Exercise 1: Opening a terminal

The very first exercise is to open a “terminal”. Terminals get their name from the hardware which used to allow many users to connect to a central, usually UNIX-based, computer system and type in commands. These days a terminal is usually a piece of software which gives you a prompt where you can type shell commands (“the shell” is another name for the command line) and run programs on your local machine (but the process for the user is fairly similar).

To open a terminal, look at the top left of your desktop and choose Applications, then Accessories, then Terminal.

You should now see a window which looks something like this:

This is where you can type commands in the remaining exercises.

Various things work slightly differently in Linux - the default for cut and paste is often not Control-C and Control-V as in Windows. Instead, when you highlight text it is automatically copied, and pressing both mouse buttons simultaneously (or the middle button if you have one) pastes the last highlighted text where you are pointing. Many Linux applications also support the Windows way of doing cut and paste, but the terminal is not one of them, so watch out!

In addition, when you are typing in the terminal there is a useful feature called “tab-completion”. This means that if you type in the start of a command or filename and then press the tab key then (if it is the only allowable name starting with what you typed) it will complete the name for you. This can save a lot of typing. For example, try typing:

prompt> ech

and then press the tab key.

There is also “history” on the command line - you can scroll through previous commands you entered and edit them using the cursor arrows, then press return to run the edited command.

Exercise 2: Where am I?

Introduces the commands pwd, ls, cd, mkdir

Introduces the concepts of files, directories and the home and current directories

A Linux system is arranged in a tree-like structure with files (leaves) and directories (branches). Windows users will recognise directories as being the Linux equivalent of folders, which are used to organise files. In this exercise we are going to navigate around some parts of the Linux file system to see what we can find.

When you do something with files and directories on Linux (making a new file for example), if you do not specify where in the directory “tree” the file should be created, it defaults to your current “location”. This is known as your current working directory. Let's find out where in the file system we are at the moment by using pwd (which is short for print-working-directory):

prompt> pwd

In fact because you have just logged in, your current working directory is also your “home directory”, i.e. the place in the directory tree where you will always start off.

Your home directory is special and important to you for two reasons:

Several (hidden) configuration files which affect your environment (see later) are stored

in this area

It's owned by you and you can freely manipulate the directories and files under your

home area – it's your personal space to work in.

Let's list what you have in your home directory:

prompt> ls


which gives you a list of files and directories. Let's create a new directory to do some work in.

prompt> mkdir training

prompt> ls

Now let's move out of our home area and into the training directory, checking it worked:

prompt> pwd

prompt> cd training

prompt> pwd

prompt> ls

The ls should produce no output: it's a new directory and therefore it's empty. To go up one directory, there is a shorthand name for “up one level”:

prompt> cd ..

prompt> pwd

You can also jump directly to any other part of the file system (please note that file names are

case sensitive):

prompt> cd /etc/X11/cursors

prompt> pwd

Now keep trying to go up one directory level until you can't go any higher - you have reached the “root” of the file system - what is it called?

When you give a directory or filename which begins with /, i.e. as an absolute position in the tree, this is known as a path (more precisely, an absolute path).

Now, to go back to your home directory, there are two short cuts:

prompt> cd

or

prompt> cd ~

What do you think will happen if you run the following command (try it)?

prompt> cd ~/training
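The difference between relative and absolute paths can be seen in a small sketch like the one below. It is only an illustration: it works in a throwaway directory made with mktemp (not covered in these notes) so that nothing in your real home area is touched, and the variable names are made up for the example.

```shell
# Work in a throwaway directory so your real files are left alone.
scratch=$(mktemp -d)
cd "$scratch"

mkdir training            # created below the current working directory
cd training               # relative path: interpreted from where we are now
inside=$(pwd)

cd ..                     # ".." means "one level up"
parent=$(pwd)

cd "$scratch/training"    # absolute path: starts with / and works from anywhere
again=$(pwd)

echo "inside: $inside"
echo "parent: $parent"
echo "again:  $again"

# Tidy up (rm -r is explained in Exercise 7).
cd /
rm -r "$scratch"
```

Both cd commands end up in the same place; the relative form depends on where you currently are, while the absolute form does not.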


Exercise 3: Permissions

Introduces the commands chmod, man

Introduces the concept of permissions on files and directories, groups, man pages and flags

Linux (and Unix) attaches an ownership and set of permissions to every file and directory. What you can and cannot do to files and directories is defined by those permissions. How do we find out what permissions are in place for a given file/directory? We use the ls command but with an optional extra setting called a “flag”. Let's try it on a directory called “/etc”:

prompt> ls -l /etc

Please note that the character after the minus sign is the letter “l” rather than the number “1”.

You should see output which looks something like this:

-rw-r--r-- 1 root root 4221 2007-06-18 10:55 wgetrc

-rw-r--r-- 1 root root 1343 2007-01-09 18:39 wodim.conf

drwxr-xr-x 2 root root 63 2009-04-20 14:07 wpa_supplicant

drwxr-xr-x 9 root root 80 2009-09-14 14:25 X11

drwxr-xr-x 5 root root 98 2009-04-20 14:08 xdg

drwxr-xr-x 2 root root 184 2009-04-20 14:05 xml

drwxr-xr-x 2 root root 35 2009-04-20 14:06 xulrunner-1.9

-rw-r--r-- 1 root root 461 2009-04-17 08:29 zsh_command_not_found

The permissions of the files are listed in the first column of the output. The first ten characters are split thus:

Character 1 is the type of file.

Characters 2-4 are the read/write/execute permissions for the owner of the file. The owner of the file is given in the third column of the directory listing.

Characters 5-7 are the read/write/execute permissions for the group the file is associated with. Files are associated with a named group of users (note the owner of the file does not need to be in this group); this is shown in the fourth column.

Characters 8-10 are the read/write/execute permissions for everyone else.

In the case of the file “wodim.conf” this breaks down as:

The first character is a “-” so it's a normal file (d means a directory)

2-4 are “rw-” so the owner can read and write to the file but not execute (run) it

5-7 are “r--” so any other user in the group “root” can read the file but nothing else

8-10 are “r--” so any other user can read the file but nothing else.
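As a rough illustration of the same breakdown, the sketch below pulls the ten permission characters apart for a scratch directory. It assumes the GNU version of stat (whose -c %A option prints just the permission string) and uses mktemp and cut, none of which are covered in these notes; the variable names are made up for the example.

```shell
# Make a throwaway directory containing one empty file.
scratch=$(mktemp -d)
touch "$scratch/example.conf"

# stat -c %A (GNU coreutils) prints only the ten permission characters.
file_perms=$(stat -c %A "$scratch/example.conf")
dir_perms=$(stat -c %A "$scratch")

echo "file:      $file_perms"
echo "directory: $dir_perms"

# Split the directory's string into the four parts described above.
kind=$(echo "$dir_perms" | cut -c1)       # character 1: type
owner=$(echo "$dir_perms" | cut -c2-4)    # characters 2-4: owner
group=$(echo "$dir_perms" | cut -c5-7)    # characters 5-7: group
others=$(echo "$dir_perms" | cut -c8-10)  # characters 8-10: everyone else

echo "type=$kind owner=$owner group=$group others=$others"

rm -r "$scratch"
```

The exact permissions of the file depend on how your system is set up, but the directory string will start with “d” and the file string with “-”.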

Let's now back up for a moment. Where did ls -l come from? Each Linux command comes with a little manual of its own. To read the manual for ls:


prompt> man ls

and see what it tells you about the “-l” flag. Hit “space” to scroll down and “q” to quit.

Now look at the manual for a command called chmod.

Work out the full chmod command you would use to make sure that anyone in any group could read the contents of a file you create. Also, how would you make a file executable (so it can be run as a program)? Check the answer with a demonstrator.

There is one common error message which Linux users might encounter: “Permission denied.”

This can cover a range of situations such as:

Trying to read or alter a file which you have no right to

Trying to delete a file or directory which you have no right to

Trying to create a file in an area which you have no right to

Try the following and see what happens:

prompt> mkdir /etc/wibble

This is because /etc is owned by the system administrator's account (known as “root”) and root has not given other users write permission to this area.
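One way to check this for yourself is sketched below: it compares the listing for /etc with the listing for a directory you have just created. The -d flag makes ls describe the directory itself rather than its contents; the scratch directory comes from mktemp, which is not covered in these notes.

```shell
# -d lists the directory itself instead of what is inside it.
etc_listing=$(ls -ld /etc)
echo "$etc_listing"

# For comparison, a directory that you have just created yourself.
scratch=$(mktemp -d)
own_listing=$(ls -ld "$scratch")
echo "$own_listing"

rmdir "$scratch"
```

On a typical system the third column shows root for /etc and your own username for the scratch directory, and only the owner column of /etc has a “w” in it.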

Exercise 4: Wildcards

Concepts introduced include wildcards and case sensitivity

As you saw from the /etc directory, some areas contain lots of files. There's a handy trick for gathering groups of files together using “wildcards”. These are useful for many Linux commands but for now, let's stick with the ls command on the /etc directory.

Remind yourself of the contents of /etc:

prompt> ls /etc

and then try

prompt> ls /etc/a*

“*” is a substitute for “anything from zero to N characters of any type”. So everything starting with an “a” is listed. If a directory starts with an “a”, all its files are listed even if they do not start with “a” themselves.


Now try

prompt> ls /etc/A*

Is the result the same or different?

What do you think will happen if you try

prompt> ls /etc/*a*

Try the following:

prompt> ls /etc/*a

prompt> ls /etc/[a-c]*

prompt> ls /etc/*[0-9]*

Can you explain what happens in each case?
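If you want to experiment somewhere smaller than /etc, a sketch like the following may help. It creates a handful of made-up file names in a scratch directory and uses echo rather than ls, which shows that it is the shell that replaces each wildcard pattern with the matching names before the command runs.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch alpha beta data1 data22 gamma nodes

starts_a=$(echo a*)         # names beginning with "a"
ends_a=$(echo *a)           # names ending with "a"
contains_a=$(echo *a*)      # names with an "a" anywhere
a_to_c=$(echo [a-c]*)       # names whose first character is a, b or c
has_digit=$(echo *[0-9]*)   # names containing at least one digit

echo "a*       -> $starts_a"
echo "*a       -> $ends_a"
echo "*a*      -> $contains_a"
echo "[a-c]*   -> $a_to_c"
echo "*[0-9]*  -> $has_digit"

cd /
rm -r "$scratch"
```

Here a* matches only alpha, *a matches alpha, beta and gamma, and *[0-9]* matches data1 and data22.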

Exercise 5: Creating a file with a text editor

Introduces the command nano.

Unlike Windows, where files tend to be in special formats specific to the application designed to use them, Linux tries to use plain text format as much as possible. This keeps files simple and readable and allows lots of different applications to easily read the same file. There are lots and lots of text editors for Linux. Long-time users have very firm opinions as to which is the best, while not being able to agree with each other. In this exercise, we'll use a very simple text editor to create a text file. The text editor we introduce here is called nano. It's just a very simple screen with some commands written on the bottom.

prompt> cd ~/training

prompt> ls

prompt> nano myfile.txt

Type in the following paragraphs:

The Reading Campus Grid is a Condor Pool consisting of about 300

nodes. It is provided by IT Services and the School of Systems

Engineering.

The aim of the Campus Grid is to provide researchers with a High

Throughput Computing resource; these are designed to process tasks

that require fairly short processing times (usually minutes or

hours) but that need to be run 100's or 1000's of times.


For example; say you have a process that takes a sample of data

from a data-set, analyses the sample and then stores a result. This

process takes 30 minutes but needs to be done 1000 times. This

requires 500 hours of processing time. On a single computer this

task will take about 21 days to complete whereas using the Campus

Grid (using around 250 nodes) this can be done in 2 hours!

Now save the file (press Ctrl-O then RETURN) and exit (Ctrl-X).

prompt> ls

You should have a new file called “myfile.txt”.

Exercise 6: Looking inside files

Introduces the commands cat, less, grep

You can of course use a text editor to look inside a file without changing anything, but there are quicker (and, it turns out, more useful) ways of looking inside text files. To see what's inside a text file:

prompt> cat myfile.txt

(cat is short for concatenate). The problem with cat is that big files rapidly scroll off the top of the terminal screen. A more powerful command is called more, but that has been supplanted by an even more powerful command named less:

prompt> less myfile.txt

(press “space” to scroll, “q” to quit). less allows you to search through files by hitting the “/” key and entering the string to search for.

Look for the words “example” and then “process” in myfile.txt using less.

There are even faster ways to look for words in text files.

Use grep to look for all the lines of myfile.txt containing the word “Engineering”:

prompt> grep Engineering myfile.txt
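grep has a few flags that are often handy alongside the basic search; the sketch below tries three of them on a small made-up file. The file contents and variable names are invented for the example, and the “>” used to create the file is redirection, which is covered in the second session.

```shell
scratch=$(mktemp -d)

# Create a four-line sample file (">" sends the output into the file).
printf '%s\n' \
  "The Campus Grid is a Condor Pool." \
  "It is provided by IT Services and Systems Engineering." \
  "Each process takes 30 minutes." \
  "This Process is repeated 1000 times." > "$scratch/sample.txt"

plain=$(grep Engineering "$scratch/sample.txt")     # matching lines only
numbered=$(grep -n process "$scratch/sample.txt")   # -n adds line numbers
any_case=$(grep -i -c process "$scratch/sample.txt") # -i ignores case, -c counts lines

echo "$plain"
echo "$numbered"
echo "lines mentioning process in any case: $any_case"

rm -r "$scratch"
```

Note that the search is case sensitive unless you add -i, so the plain search for “process” finds only the third line.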


Exercise 7: Copying, Moving and Deleting files and directories

Introduces the concept of Linux not warning about deletions

Introduces the commands cp, mv, rm and rmdir

To copy a file, use cp and to move or rename it, use mv.

prompt> cp myfile.txt myjunk.txt

prompt> ls

prompt> mv myjunk.txt myold.txt

prompt> ls

1. Copy myfile.txt to another file with the name “myfile2.txt”.

2. Edit the second file with nano and change some text at random.

3. Come up with a single command using wildcards which searches both files for a particular word.

4. Make a directory and try to copy it. What happens?

You can copy directories (and their contents) by adding the -r “flag” to cp. mv does not require this:

prompt> mkdir test

prompt> cp -r test test2

prompt> mv test2 test3

To delete a file we use the command rm.

prompt> rm myold.txt

Notice how you got no warning whatsoever. This is one of the most important differences

between Linux and Windows and MacOS – Linux assumes you always know exactly what you're

doing and rarely checks to see if it's a sensible thing to do.

What do you think the command “rm *” does? What would happen if you tried this in a directory containing all your research results?

Now make a new directory called “temporary” and then try to remove it using:

rm

rmdir

rm is for removing files, rmdir for empty directories. You can also use a -r flag to make rm

delete directories (it will delete them and all the files in them too).
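Because rm gives no warning, one cautious habit is to preview what a wildcard will expand to before deleting anything. The sketch below does this by putting echo in front of the command; the file names are made up and everything happens in a scratch directory.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch result1.dat result2.dat notes.txt

# echo just prints the command the shell would run; nothing is deleted yet.
preview=$(echo rm *.dat)
echo "$preview"

# Happy with the preview? Then run it for real.
rm *.dat
remaining=$(ls)
echo "left over: $remaining"

cd /
rm -r "$scratch"
```

rm also has a -i flag, which asks for confirmation before each file is removed.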


Exercise 8: The Linux environment

Introduces the Linux environment variables and shell variables.

Introduces the commands export and set.

So far we've talked a lot about commands in Linux. Before we look at producing our own shell scripts (user-defined commands) in the next session, we also need to consider the Linux “environment”. The Linux environment consists of a set of variables, each of which has a value. These environment variables are widely used for the configuration of both the shell and applications, and when writing shell scripts they are also used as variables.

When we refer to an environment variable in the shell we prefix it with a “$”. To print anything in Linux we use the echo command, so to print the contents of a variable called PATH, we'd do the following:

prompt> echo $PATH

Sometimes it can be confusing to know where the name of the environment variable ends so

there is another form ${PATH}. Try setting a variable yourself:

prompt> export MYNAME=yourname

prompt> echo ${MYNAME}

To print out the whole of your environment, use the env command.

prompt> env

Some applications require environment variables to be set in order to work properly. There are

a number of standard variables which are useful to know about when working in Linux:

$USER     Your username
$SHELL    The name of the shell you are using
$PATH     The list of directories to look in for programs when running a command; these are separated by a colon (:)
$PWD      The current directory
$HOME     Your home directory (same as ~)

How could you add the current directory to the list of directories to search for programs in?

There are a number of other environment variables which specify a list of directories to look in for files of various types (these tend to end in PATH). The directories are searched from left to right, so it makes a difference whether you put a new directory at the start or the end of the list. Also take care not to put extra colons at the beginning or end of the path, as the shell will take this to mean that you also wish to look in the current directory (when the search happens).
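The effect of the search order can be sketched as follows. The directory /opt/mytools is hypothetical and used only for illustration (it does not need to exist), and the original PATH is restored at the end.

```shell
# /opt/mytools is a made-up directory, used only to show the ordering.
tools=/opt/mytools
original=$PATH

# Put the new directory first: it is searched before everything else.
export PATH="$tools:$original"
first_choice=$PATH
echo "$first_choice"

# Put the new directory last: it is searched only if nothing earlier matches.
export PATH="$original:$tools"
last_resort=$PATH
echo "$last_resort"

# Put things back as they were.
export PATH="$original"
```

A program with the same name as a system command would be found in /opt/mytools in the first case, but the system version would win in the second.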


There is a very similar concept to environment variables: shell variables. Try the following command to set a shell variable:

prompt> myname=yourname

Can you tell what the difference is?

You should see that you can use echo to print the value of the shell variable, but you’ll not see it

in the output of env. This is because env is a separate command (like an application) to the

shell. Environment variables are seen by programs you run, whereas shell variables are only

seen by the shell interpreter. In the shell all environment variables are also shell variables. You

can see all the shell variables using the set command:

prompt> set

Notice how many more shell variables there are than environment variables. In general

environment variables have names in upper-case with underscores (_) representing spaces while

shell variables have names in lower-case with underscores (_) representing spaces.
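The difference can also be seen by asking a separate program to print the variables. In the sketch below sh -c starts a second shell as a separate program, and ${name:-unset} prints “unset” if the variable is not visible to it; the variable names and values are made up for the example.

```shell
course_dir=/tmp/training         # shell variable only
export COURSE_NAME=linux-intro   # environment variable

# The second shell is a separate program, so it only sees the environment.
seen_by_child=$(sh -c 'echo "dir=${course_dir:-unset} name=${COURSE_NAME:-unset}"')
echo "$seen_by_child"

# export turns an existing shell variable into an environment variable.
export course_dir
seen_after_export=$(sh -c 'echo "dir=${course_dir:-unset}"')
echo "$seen_after_export"
```

The first line printed is “dir=unset name=linux-intro”; after the export, the second shell can see course_dir too.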

Exercise 9: Job Control

Introduces the commands sleep, jobs, kill, ps, fg, and bg.

So far we have only had one program running at a time and we have had to wait for it to finish before we are able to go on to the next command, but you can run as many programs concurrently as you wish from the Linux shell. We can run any command we like in the background, while we get on with work at the terminal, by adding a “&” symbol at the end of the line. For example, let’s start a simple command, sleep, which basically waits for as long (in seconds) as you tell it to, and then returns:

prompt> sleep 1000 &

You will find that you get a new prompt straight away. This is because the sleep command ran

in the background.

We can see all the running tasks using the jobs command:

prompt> jobs

[1]+ Running sleep 1000 &

The [1]+ gives the job number, and the word “Running” means that the task is running and has not been stopped. We can kill the task using kill:


prompt> kill %1

prompt> jobs

[1]+ Terminated sleep 1000

Note we need to add a % to the start of the job number.

We can use the commands fg and bg to switch a task between the background (i.e. as if run with a &) and the foreground (i.e. one that you have to wait for). You may wonder how you can enter the bg command if the task is already running in the foreground and there is no prompt. Well, we can temporarily stop the command using the key combination Control-Z. We can also use Control-C to kill the current foreground task.

prompt> sleep 1000

Press control-z.

prompt> jobs

prompt> bg %1

prompt> jobs

prompt> fg %1

Press control-c.

Lastly, the job that has a “+” symbol next to its number is the task that fg or bg will operate on if you don’t specify an argument.

There is also a lower-level command for looking at processes called ps:

prompt> ps

This will show you the process id and name of all the processes running in your shell (there can be many processes per job). In this case it will be the ps command itself and bash. You can use kill with a process id instead of a job number; in this case there is no “%”.

From the ps listing try to kill the bash command.

This will not work - tasks can choose to ignore the kill command. But we can make kill be

more insistent (and not let the command ignore us) using the -KILL option. (e.g. kill -KILL

processnumber). This is a last resort!

Try to kill bash with the -KILL option - it will close the terminal window - you’ll have to open

another one!
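Using kill with a process id can be sketched as below. The example relies on two things not covered in these notes: $!, which holds the process id of the most recent background task, and wait, which pauses until a background task has ended and reports how it ended.

```shell
sleep 1000 &        # start a long-running task in the background
pid=$!              # remember its process id
echo "started sleep with process id $pid"

kill "$pid"         # no % here, because this is a process id, not a job number

# wait reports how the task ended; a task stopped by a plain kill
# ends with status 143 (128 plus 15, the number of the TERM signal).
status=0
wait "$pid" || status=$?
echo "exit status: $status"
```

A task removed with kill -KILL would instead end with status 137 (128 plus 9).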


Second Session

There are two routes through the second session:

Shell Scripts - Exercises 1, 2, 3, 4, 5, 6, 10 (about 2 hours)

Basic Administration - Exercises 1, 7, 8, 9, 10 (about 1 hour)

Choose the area that is most interesting to you and then go back to the other ones if you have

time. If you wish to continue after today then email [email protected] with any

questions.

Exercise 1: The Linux File System

Introduces the structure of the Linux file system and what is normally stored in different directories.

Introduces the commands df, find and locate.

In the first session we looked at the concept of directories but here we want to have a quick

look at the purpose of the different parts of the Linux file system.

In Linux all the files are in one big directory tree (unlike in Windows where there is one directory tree per drive). In fact it’s even more inclusive than that because every device connected to your system and every running program is represented within that one directory tree! The layout here is typical, but systems can be set out in widely differing ways.

It all begins with the directory / - the “root” directory. Let’s have a look at what’s in it:

prompt> ls /

Here we see the desire of the original UNIX programmers to save typing - just like the short and

confusing names for commands, the directory names are mostly confusing short-hands. There

are some directory names that come up again and again at different points in the tree:

bin       Programs - these directories will often be in $PATH
sbin      Programs that only the root or super user can run
lib       Support “libraries” for running programs - these are shared pieces of code that more than one program can make use of
lib64     If you have a 64-bit machine you’ll see lib64 directories to store libraries with 64-bit code in them
tmp       Space for temporary files
etc       Space for configuration files
src       Source code
include   Files used during the compiling of source code to access libraries
doc       Documentation
man       “man page” documentation
info      “info page” documentation (a newer help system than man, see later)


When you see these directories in the root directory they predominantly store operating system

files. The directory names are replicated in the /usr directory and here they contain

applications which come with the operating system. They appear again within the /usr/local

directory and its sub-directories and in this case they contain locally installed applications.

Other common directories in the root directory:

/home                 The directory where all the user home directories are stored
/root                 The super user’s home directory
/boot                 The files used to start the machine
/dev                  Contains lots of “special” files which represent all the devices attached to the computer
/proc                 Information on current processes and special files used for information and control of various aspects of the system
/sys                  Information and control of various aspects of the system
/mnt, /cdrom, /media  Places where removable drives appear
/var                  Contains data that changes while the system is running

In Linux any directory can be a “mount point” - a directory which the operating system replaces with the directory structure stored on a different disk. The df command shows all the disks (or filesystems) and where they are mounted. Try it:

prompt> df -h

(The -h flag says to display figures in human-readable units.) What you’ll get on the live CD system we’re using for this course is very different from what you’d see on an installed system - no real disks are being used; instead we see lots of “tmpfs”, which means a file system held in memory. But you can see how the different parts of the directory tree (shown on the far right) are provided by different devices (shown on the left).

All this goes to show that the Linux file system is very complex - so how do you find a file that

you’ve lost? There are two commands that can help you here: find and locate. locate is a

utility which has access to an index of all the files on the system and will return all files for

which the word you give appears anywhere in their full path and filename:

prompt> locate purple

Surprisingly even for the word purple there is a whole list of files matching. One note of

caution: locate will only show files that you are allowed to access.

find is also a powerful command for finding files. You give it the name of a directory and it lists

everything in it - including any subdirectories and their contents. Try:

prompt> find ~


You’ll notice some extra files and directories compared with what you see with ls. Run ls on your home

directory and compare the output - what have the extra files and directories got in common?

Files and directories whose name starts with a period (.) are known as hidden files/directories

and are not normally displayed unless you ask for them to be, try:

prompt> ls -A

And see them appear. Returning to the find command: a list of all files is not that useful, but we can make use of the Linux methodology of doing complex things by bringing together many tools that each do one simple thing. Here we take the output of the find command and make it the input to the grep command we came across earlier, to pick out everything in our home directory whose path contains “bash”:

prompt> find ~ | grep bash

Exercise 2 gives more detail of how this strange command actually works.
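As a small illustration of hidden files and of piping find into grep, here is a sketch that works in a throwaway directory rather than your real home directory. The directory and file names are made up for the example, and wc -l (which counts lines) is only introduced properly in Exercise 3.

```shell
# Build a small throwaway directory (names are illustrative only).
demo=$(mktemp -d)
touch "$demo/.bashrc" "$demo/.bash_history" "$demo/notes.txt"
mkdir "$demo/.config"

# ls hides names starting with a period; ls -A shows them too.
visible=$(ls "$demo" | wc -l)
all=$(ls -A "$demo" | wc -l)
echo "ls shows $visible entries, ls -A shows $all"

# find lists everything, hidden or not; grep keeps the paths containing "bash".
matches=$(cd "$demo" && find . | grep bash)
echo "$matches"
```

Only notes.txt is visible to plain ls, while find reports the two hidden bash files as well.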

Exercise 2: Redirection

Introduces the concepts of redirection and “piping” commands together. Introduces the command sort.

In the previous section we ran the command:

prompt> find ~ | grep bash

We should notice two things from this. Firstly the | or “pipe” symbol, which tells the shell to take the output of the first command (rather than sending it to the screen) and make it the input to the second command (rather than taking it from the keyboard). So A | B can be represented as:

Keyboard -> A -> B -> Screen

(In our example, though, find does not prompt for any input from the keyboard.) We can actually link as many commands as we like together in this way.

The second thing we notice is that the grep command has an argument missing (in Session 1, Exercise 6 we used “grep Engineering myfile.txt”) and here we only give one argument - the word to look for. As we have not told grep which file to look in we may expect there to be an error - but there isn’t. This is because if you don’t tell grep where to get input from it will read it from its input, for example the keyboard, or in this case the output from the find command. It is fairly standard for Linux commands to read by default from their input (where no input files are given) and write to their output (if no output file is specified).

What happens if there is an error with the first command? Re-run the last example, this time

causing find to complain about an error. What happens?

My example is:

prompt> find notthere | grep bash

find: `notthere`: No such file or directory

Somehow the error message from find gets to the screen. This is because each program has

one input and two outputs. The input is known as standard in or STDIN; and one output,

known as standard out or STDOUT, is for normal output and the other, known as standard error

or STDERR, is for errors. The pipe collects the standard out but leaves the standard error to still

go to the screen. So we can update our diagram to:

This concept is taken further as we can also tell the shell to send output to files or get input from files. This (along with pipes) is known as “redirection”.

To send the (standard) output from a command to a file use >filename:

prompt> echo hello world >myfile

Or if we want to append the output to the file, rather than overwrite the file we use >>filename:

prompt> echo added line >>myfile

Check the contents of myfile. There are similar operators 2> and 2>> which redirect the standard error output.
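The following sketch shows the three operators side by side. It uses ls on one file that exists and one that does not, so that a single command produces both normal output and an error. The temporary directory and the file names out.txt and err.txt are assumptions made for the example.

```shell
work=$(mktemp -d)

# > creates (or overwrites) the file; >> appends to it.
echo hello world > "$work/myfile"
echo added line >> "$work/myfile"

# ls prints the name it found on standard output and its complaint about
# the missing name on standard error, so each can go to its own file.
# (ls also reports failure, hence the || message to carry on regardless.)
ls "$work/myfile" "$work/notthere" > "$work/out.txt" 2> "$work/err.txt" \
    || echo "(ls reported a problem, as expected)"

cat "$work/myfile"
echo "normal output: $(cat "$work/out.txt")"
echo "error output:  $(cat "$work/err.txt")"
```

Nothing from ls reaches the screen directly here: both of its outputs have been redirected.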

Take the command “find notthere | grep bash” and amend it so that errors are sent into a file “errorfile”. Use 2>> as you don’t want the second command’s error output overwriting the first.

We can take input from a file using <, for example:

prompt> grep "hello" <myfile

(Updated pipe diagram: Keyboard -> in [A] out -> in [B] out -> Screen, with the err output of both A and B going directly to the Screen.)


Let’s combine this together. This example looks in the spell checker’s list of words, finds all the

words with “house” in, sorts them and saves them to a file:

prompt> grep house < /usr/share/dict/words | sort >houses

prompt> less houses

(Remember to press ‘q’ to exit less.)

This is the Linux (and UNIX) methodology of doing work by combining lots of small programs together to tackle complex problems - it is a very powerful tool.
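If your system does not have /usr/share/dict/words, the same combination can be tried on a tiny word list of your own. This sketch makes one up (the five words are arbitrary) and runs the same grep, sort and > steps on it.

```shell
work=$(mktemp -d)

# A tiny stand-in for the spell checker's word list (made-up contents).
printf '%s\n' treehouse cat household boat housework > "$work/words"

# grep reads the list on standard input, sort orders the matching lines,
# and > saves the result to a file instead of printing it.
grep house < "$work/words" | sort > "$work/houses"

cat "$work/houses"
```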

Exercise 3: More commands - filters

Introduces the commands: nl, tac, uniq, head, tail, wc, seq, tee, tr and cut.

The commands grep, sort, cat are small commands that do specialised jobs, taking some input,

changing it in some way and then outputting it. These are known as filters. There is a wide

range of other filters that you’ll find useful when bringing together commands to perform a

task.

These filter commands are described below. Note that where arguments are in brackets they

are optional - e.g. [filenames]:

grep pattern [filenames]

Search for lines that contain “pattern”, either in the given filenames or in standard input, and output the matching lines to standard output. Some interesting optional arguments which can be added before “pattern” are -i to make the search case insensitive and -v to output the lines that don’t match “pattern”.

cat [filenames]

Short for concatenate. Take the files given as arguments and print them in turn to

standard output. If no files are given copy standard input to standard output. cat is

often used at the start of a pipe instead of redirecting standard input as it gives a more

logical layout.

sort [filenames] or sort -n [filenames]

Take the files given as arguments or standard input, sort the lines in alphabetical order and print them to standard output. If you add -n to the list of arguments then it will sort numerically.

nl [filenames]

Same as cat but add line numbers.


tac [filenames]

Same as cat but output the lines in reverse order.

uniq [filenames] or

uniq -c [filenames]

Go through the input lines (files or standard input) and remove repeated adjacent lines, so the input must be sorted for all duplicates to be removed. With the -c argument it gives a count of how many times each line appeared in the input.

head -# [filenames]

Same as cat but only output the first # lines (e.g. head -10 myfile).

tail -# [filenames]

Same as cat but only output the last # lines (e.g. tail -10 myfile).

wc [filenames]

Output the number of lines, words and characters in the input files (or standard input).

With -w just outputs number of words, with -l just outputs number of lines and with -m

just outputs number of characters.

seq end or

seq start end or seq start increment end

Prints a sequence of numbers on consecutive lines - does not look at standard input. By default it starts at 1 and prints every whole number up to the value of end. Will work with real numbers too.

tee filename

The tee command copies its standard input to standard output but also saves a copy of

the standard input to the file given - this is really useful if you want to see what is going

on half way down a pipe, or see the output of a command on the screen but also save a

copy for later.
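To show how these filters combine, here is a sketch that finds the most frequent line in a file. The list of program names is invented for the example, and tail -n 1 is used as the more portable spelling of tail -1.

```shell
work=$(mktemp -d)

# Made-up record of which program each job ran.
printf '%s\n' gromacs matlab gromacs R matlab gromacs > "$work/jobs"

# uniq only merges adjacent duplicates, hence the first sort.
# uniq -c prefixes each line with its count, sort -n orders by that count,
# tee saves a copy of the counts on the way past, tail keeps the last line.
top=$(sort "$work/jobs" | uniq -c | sort -n | tee "$work/counts" | tail -n 1)
echo "most used: $top"

# seq generates the numbers 1 to 10 and wc -l counts the lines produced.
seq 1 10 | wc -l
```

The file counts holds the full table of counts, which is a handy way to check what was flowing through the middle of the pipe.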

Taking the two word lists /usr/share/dict/american-english and /usr/share/dict/british-english

as the inputs perform the following calculations using the above commands:

Count the number of British words which contain “ous”

Count the number of American words which do not contain “ous”

Count the number of words which are in both dictionaries

So far all these have been dealing with whole lines. There is also a set of useful tools which deal with processing the characters on each line:

tr set1 set2 or

tr -d characters or tr -s characters


tr stands for translate and performs operations on individual characters in lines. The

first form tr set1 set2 takes two sets of characters and where it finds one in the first set it

will change it to the corresponding character in the second set.

For example, print the word list capitalising ‘a’, ‘b’ and ‘c’:

prompt> cat /usr/share/dict/words | tr abc ABC

But if we wish to capitalise all letters then this is a bit of a pain, so we can use ranges of characters as well; useful ones include “a-z”, “A-Z” and “0-9”:

prompt> cat /usr/share/dict/words | tr a-z A-Z

tr -d characters will simply drop any matching character while tr -s characters will delete

repeats of matching characters. This is very useful for deleting repeated spaces, but how

would you do that as spaces separate the arguments? We can put arguments with spaces

in quotes, for example:

prompt> cat ~/training/myfile.txt | tr -s ' '

cut -c ranges

Here we specify the ranges of characters in each standard input line which we want to

copy to the standard output, for example to just output the 1st, 2nd, 5th and 6th

character of every word in the word list we would do:

prompt> cat /usr/share/dict/words | cut -c 1-2,5-6

cut -d character -f ranges

In this second form we tell cut that we want to split our input into “fields” separated by a given character; we can then tell it which of these fields to extract. For example to get a list of usernames and real names from the password file (in which the fields on each line are separated by a ‘:’) we’d use:

prompt> cat /etc/passwd | cut -d : -f 1-1,5-5

You may also want to see what the file looked like before:

prompt> cat /etc/passwd

We suggest that later you spend some time trying out each of these commands.
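A quick way to try tr and cut without touching any real files is to feed them text with echo. In this sketch the two password-style lines are invented, but they follow the same colon-separated layout as /etc/passwd.

```shell
# Made-up lines in the same layout as /etc/passwd.
passwd_like='alice:x:1000:1000:Alice Smith:/home/alice:/bin/bash
bob:x:1001:1001:Bob Jones:/home/bob:/bin/bash'

# Fields 1 and 5 are the username and the real name.
names=$(echo "$passwd_like" | cut -d : -f 1,5)
echo "$names"

# tr -s ' ' squeezes runs of spaces into one; tr a-z A-Z upper-cases.
shout=$(echo "too    many   spaces" | tr -s ' ' | tr a-z A-Z)
echo "$shout"

# cut -c picks character positions: here 1-2 and 5-6 of "abcdefgh".
picked=$(echo abcdefgh | cut -c 1-2,5-6)
echo "$picked"
```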


Exercise 4: Command substitution

Introduces command substitution.

There is one other useful trick which is similar to redirection: command substitution. This is

where the output of one command is used as part of the command line used to run another

command. When the shell is reading a command line and comes across code like:

$(command line)

Then it will execute the command line given and replace the $() with the output of that

command. The most common use of command substitution is in setting variables, for example:

prompt> pwd

prompt> export MY_PWD=$(pwd)

prompt> echo $MY_PWD

There is another common way of writing command substitution, using back-ticks (which look like ` - typed using the key directly below the “ESC” key in the top left of a keyboard). For example:

prompt> export MY_HOME=`cat /etc/passwd | grep $USER | cut -d ':' -f 6-6`

prompt> echo $MY_HOME

When we are using command substitution we often want to discard the errors from a

command. The standard way of doing this is to redirect to the file “/dev/null” - this is a special

file that simply discards anything written to it. For example:

prompt> export BASH_FILES=$(find notthere | grep bash)

prompt> export BASH_FILES=$(find notthere 2>/dev/null | grep bash)

Notice how the errors are suppressed with the second form.
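The sketch below pulls these ideas together. It also shows one practical difference between the two notations: $( ) can be nested inside another $( ) without extra escaping. The commands basename and dirname are not covered in these notes; they are standard utilities that return the last part of a path and everything before it, and are used here only to have something to nest.

```shell
work=$(mktemp -d)
touch "$work/.bashrc"

# $(...) is replaced by the command's output before the line is run.
here=$(pwd)
echo "we are in $here"

# Nesting: the inner command runs first and its output feeds the outer one.
parent=$(basename "$(dirname /usr/share/dict/words)")
echo "$parent"

# find complains about the missing directory, but 2>/dev/null discards
# the complaint, so only the real match ends up in the variable.
found=$(find "$work" "$work/notthere" 2>/dev/null | grep bashrc)
echo "$found"
```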

Exercise 5: Making a shell script (1)

Introduces shell scripts and special shell variables.

Linux is made up from lots of very small commands which perform very specific jobs. It is often the case that you need to group lots of commands together into a “recipe” which you use over and over again. Such recipes are held in text files and are known as “scripts”. There are different flavours of scripts depending on which language they are written in and because we are making a script written with bash-shell commands, this will be a “shell script”.


Our script will do two things:

1. List the contents of /etc

2. Put the output into a new file called listing.txt

Use nano to enter the following script into a file called listing.sh:

#This is a comment. Other than #! anything after a # is ignored

#List /etc and REDIRECT the output to a file called listing.txt

ls /etc > listing.txt

and save the script as “listing.sh”.

There are two ways we can run this script. Method one involves sourcing the script - saying to the shell you're typing at, “please run this text file as a bash script”.

prompt> . ./listing.sh

prompt> less listing.txt

(The “.” character means “run this”. It is good practice to refer to the shell script with a “./” in front of it: “./listing.sh” means “the listing.sh file in this directory only”, whereas referring to “listing.sh” on its own means “the first listing.sh in the directories listed in $PATH”.)

The second way of doing this is a bit tidier. We place the strange looking line “#!/bin/bash” at the start of the script, which forces the script to run as a bash shell script every time.

Use nano to change the first line of the script so it looks like this:

#!/bin/bash

# This is a comment. Other than #!, anything after a # is ignored

# List /etc and REDIRECT the output to a file called listing.txt

ls /etc > listing.txt

Now try to run the file directly (so no “.” sign).

prompt> ./listing.sh

You should get “permission denied”. This is because your listing.sh file doesn't have execution

permission (a safety feature which stops you running any old file).

Use the chmod command to make listing.sh executable and run it using the line above.

Now look inside your new file, “listing.txt”, to check the script has worked.


There are some more important bash features that are useful when writing shell scripts. The

first is some special shell variables:

$0 The name of the script

$1 The first argument to the script

...

$9 The ninth argument to the script

$# The number of arguments

$* All the arguments used when calling the script

Using these variables you can easily add arguments to your scripts. For example add two

arguments to the listing.sh script - the first to specify which directory to list and the second the

name of the file to put the listing in. Test this with another directory.
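Before changing listing.sh, it can help to see what these variables contain. The sketch below writes a small script called args.sh (a made-up name) and runs it with two arguments. It uses a “here document” (the <<'EOF' ... EOF lines), which is not covered in these notes, simply as a way of creating the file without opening nano; running the file with "bash filename" is a third way to run a script that does not need execute permission.

```shell
work=$(mktemp -d)

# Create the script. Quoting 'EOF' stops the shell expanding $1 etc. now;
# they are left in the file to be expanded when the script runs.
cat > "$work/args.sh" <<'EOF'
#!/bin/bash
# Report how the script was called.
echo "script name: $0"
echo "number of arguments: $#"
echo "first: $1, second: $2"
echo "all: $*"
EOF

# Run it with two arguments and show what it printed.
out=$(bash "$work/args.sh" /etc listing.txt)
echo "$out"
```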

Exercise 6: Making a shell script (2)

Introduces the for and if commands.

The second useful bash feature is control statements. If you are familiar with other

programming languages you’ll be familiar with for-loops, while-loops and if-then-else

statements. These allow you to control which commands get executed, how many times and in

what order. bash has equivalent commands, but these are slightly more “clunky” than in most

programming languages. There is plenty in the man pages for bash if you want to know more,

but we’re going to look at two useful cases here. The first is a form of the for loop:

for variable in list; do
    commands
done

We provide a list of items where it says list. The variable named in variable is set to each item in the list in turn, and for each item the commands in commands are run. For example, type the following into a shell (be careful as bash is fussy about things like semi-colons and newlines!):

prompt> for i in /etc /bin /usr; do

> ls $i

> done

Note that the prompt will change as you type the second and third line to remind you that you

have not finished typing the command.

For loops are most useful when combined with command substitution. Can you do the same

loop but this time listing the contents of all the directories in /usr?
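As a hint for how the two features fit together, here is a sketch that loops over the output of ls in a throwaway directory (the directory and file names are invented). It assumes none of the names contain spaces, since the shell splits the output of $( ) into list items at spaces and newlines.

```shell
work=$(mktemp -d)
mkdir "$work/alpha" "$work/beta"
touch "$work/alpha/a1" "$work/beta/b1" "$work/beta/b2"

# $(ls "$work") is replaced by "alpha beta", so the loop body
# runs once with d=alpha and once with d=beta.
seen=""
for d in $(ls "$work"); do
    echo "contents of $d:"
    ls "$work/$d"
    seen="$seen $d"
done
echo "visited:$seen"
```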

The second useful form we shall use is two types of the if command:


Form 1:

if command; then
    commands
else
    commands
fi

Form 2:

if command; then
    commands
fi

If the command returns “true” then the first list of commands is executed, otherwise the optional second list of commands is executed. You may ask how a command may return “true”. We’re not really going to go into that here, but suffice to say that all programs in Linux return an integer number (a return code), normally 0 for normal exit and otherwise an integer representing an error code. The if command takes a zero return code to be true. The variable $? contains the return code from the last command. Try:

prompt> true

prompt> echo $?

prompt> false

prompt> echo $?
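Because if only looks at the return code, any command can be used as the condition. This sketch uses grep, which returns 0 when it finds a match and 1 when it does not; the -q flag (quiet) just stops grep printing the matching lines. The || used to capture the second return code is a shell operator meaning “run the right-hand side only if the left-hand side failed”.

```shell
# grep's return code decides which branch of the if is taken.
if echo "hello world" | grep -q hello; then
    result="found"
else
    result="missing"
fi
echo "$result"

# Capture a non-zero return code: inside the right-hand side of ||,
# $? still holds the return code of the command that just failed.
code=0
echo "hello world" | grep -q goodbye || code=$?
echo "grep returned $code"
```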

We can make use of the [[ ... ]] command to evaluate if conditions are true. This allows you to

do normal comparisons, but in a strange way. Here are some of the more useful ones:

-a filename     true if filename exists

-d filename     true if filename exists and is a directory

-f filename     true if filename exists and is a normal file

-r filename     true if filename exists and is readable by you

-w filename     true if filename exists and is writable by you

-x filename     true if filename exists and is executable by you

-s filename     true if filename exists and is not empty

s1 == s2        true if string s1 is equal to string s2

s1 != s2        true if string s1 is not equal to string s2

! exp           true only if exp is false (NOT)

exp1 && exp2    true if both exp1 and exp2 are true (AND)

exp1 || exp2    true if either exp1 or exp2 is true (OR)

( ... )         groups expressions (to stop confusion)

Please note that there must be spaces between every symbol.


Shell or environment variables used as arguments are sometimes undefined, which can cause problems. You may therefore see the following as a test for whether a variable is unset or empty:

if [[ "x$myvar" == "x" ]]; then
    echo empty
fi

So if we wanted to check whether the filename stored in $file was a valid program we may use:

if [[ -x $file && -s $file ]]; then
    echo Program!
else
    echo Not a program.
fi
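Here is a sketch that tries several of these tests on files we control. It writes a small bash script, check.sh (a made-up name), using a here document, then runs it on a directory, a non-empty file, an empty file and with no argument at all. The elif keyword, short for “else if”, is part of bash but is not covered elsewhere in these notes.

```shell
work=$(mktemp -d)
touch "$work/empty"
echo data > "$work/full"

# The script classifies the path given as its first argument.
cat > "$work/check.sh" <<'EOF'
#!/bin/bash
if [[ "x$1" == "x" ]]; then
    echo "no argument"
elif [[ -d $1 ]]; then
    echo "directory"
elif [[ -f $1 && -s $1 ]]; then
    echo "non-empty file"
elif [[ -f $1 ]]; then
    echo "empty file"
else
    echo "something else"
fi
EOF

bash "$work/check.sh" "$work"
bash "$work/check.sh" "$work/full"
bash "$work/check.sh" "$work/empty"
bash "$work/check.sh"
```

The four runs print "directory", "non-empty file", "empty file" and "no argument" in turn.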

Write a script which takes a directory name as its only argument and goes through each item in the directory and prints either “FILE: filename” if it is a file, “DIRECTORY: filename” for directories and “OTHER: filename” for everything else (there are sometimes things other than files and directories - try your script out with the /dev directory). Check that your program does correctly report the types of directories and the special files in /dev.

Exercise 7: Installing Programs

Introduces the concept of installing from packages and the “root” user. Introduces the commands sudo/su, apt-get/apt-cache and yum.

The slides at the start of this session demonstrated how to use the graphical installation utility

in Ubuntu, and other Linux flavours will have similar graphical packaging systems, but what if

you need to install a package at the command line?

Each variety of Linux has its own set of command line packaging tools. Here we go through

those provided by Ubuntu (and Debian), but also give the equivalent commands for Red Hat and

its derivatives (e.g. CentOS, Fedora Core, Scientific Linux) in call out boxes.

To handle packages in any Linux you will need to be logged in as the super-user (“root”) - the same as the “Administrator” in Windows - the user who can do anything. In Linux the

command sudo allows you to run a command as another user (depending on whether it is

allowed in a policy). The sudo command can be configured in many ways, but in Ubuntu by

default it will allow you to run any command as root, after asking you for your password. Let’s

try to install the popular emacs text editor:

prompt> apt-get install emacs


You will get a message such as:

E: Could not open lock file /var/lib/dpkg/lock - open (13 Permission Denied)

E: Unable to lock the administration directory (/var/lib/dpkg/), are you root?

Now try using sudo:

prompt> sudo apt-get install emacs

You’ll get the message:

Reading package lists... Done

Building dependency tree

Reading state information... Done

The following extra packages will be installed:

emacs22 emacs22-bin-common emacs22-common emacsen-common xaw3dg

Suggested packages:

emacs22-el

The following NEW packages will be installed:

emacs emacs22 emacs22-bin-common emacs22-common emacsen-common

xaw3dg

0 upgraded, 6 newly installed, 0 to remove and 0 not upgraded.

Need to get 21.1MB of archives.

After this operation, 66.2MB of additional disk space will be used.

Do you want to continue [Y/n]?

Answer yes: “y”

And you’ll get a long set of messages as it downloads and installs the packages. (Notice that you

did not have to type in a password - the live session user does not have one, but if you were

using a proper install of Ubuntu it would prompt for your password).

If you were using a RedHat type system, you’d use yum install emacs. In addition, sudo will not be set up as it is in Ubuntu, so you will need to use su to temporarily log in as root (you’ll need to know the root password). This should be used like “su -” and you should type “exit” when you no longer need to be root.

But what if you don’t know the name of the package? You can use apt-cache to find the

package. Say we want to install java - a popular programming language which is needed to view

some webpages. We may just ask the system to install java, try:

prompt> sudo apt-get install java

You’ll see an error because there is no package called java. So we search for java in the

descriptions of all packages:

prompt> apt-cache search java


Java is mentioned in many packages, but we notice “openjdk-6-jre - OpenJDK Java runtime, using Hotspot JIT” which looks right, so we ask the system to tell us more info:

prompt> apt-cache show openjdk-6-jre

This prints out lots of information, but from the description we see that this is the package to

install.

In RedHat type systems you can use “yum search”.

If you wish to uninstall a package then use “apt-get remove”, for example:

prompt> sudo apt-get remove emacs

In RedHat systems you use “yum erase <package>”.

Lastly, you may want to apply all updates to your system (like “Microsoft Update” in Windows):

prompt> sudo apt-get update

prompt> sudo apt-get upgrade

(Don’t run these - there will not be enough space on our temporary copy of Ubuntu.) The first command updates the lists of packages, the second upgrades the installed packages to their latest versions.

In RedHat type systems use “yum update” (the lists of packages are updated automatically by

yum).

Updating your packages is VERY important to do regularly (even daily) on all the Linux

machines in your control. Ubuntu and other versions of Linux do provide graphical tools for

updating (and even display warnings when you haven’t done so) on the desktop, so you will

probably only need to do it using apt-get on a server. On Ubuntu this is the “Update Manager”

mentioned in the slides.

Exercise 8: Manually Installing Programs

Introduces compiling from source and the commands: ./configure, make, wget, gzip, tar.

Exercise 7 looked at installing programs where you have root access and there is a package in

the system. This is good for computers that you manage, but occasionally there may not be a

package, or, if you are a user on a shared system you will not have permission to use the

package manager. In both cases you will need to install the program manually.


In UNIX and Linux there are very few programs that come ready to use. Instead they are distributed as “source” - the actual text files that the programmer wrote to make the program - rather than as “binary” - the program converted so it will directly run on the computer. This is because each Linux machine is different and so the program needs to be converted to a slightly different form for each machine. Turning source files into binary programs is known as “compiling”.

Luckily, often someone has already compiled the program for your flavour of Linux and this will run directly without any modification, as long as the machine has all the other programs and libraries installed that the program relies on (known as “dependencies”). In Ubuntu and Debian these files are .deb files and in RedHat type systems they are .rpm files. To install a .deb file use “dpkg -i program.deb” and to install a .rpm use “rpm -i program.rpm”. You must be careful to get the right flavour and version of Linux otherwise it will probably not work. Don’t forget you have to use sudo or be root to install packages.

Compiling from the source code follows a standard procedure for most programs. We’ll demonstrate by installing the command units, a program for converting between units of measurement, into our home directory.

First we need to get the source code - we can browse around the internet to find this, but I know

that I can get units using:

prompt> cd

prompt> wget http://ftp.gnu.org/gnu/units/units-1.87.tar.gz

wget is a program for downloading files. Type wget followed by the full URL (we could have

used firefox from the desktop, but on a computational resource this may not be available).

Notice the file ends with tar.gz - this means that it is a tar archive compressed with gzip. To

expand out this file we need to use:

prompt> tar xzvf units-1.87.tar.gz

You should see a list of the files that are created. Other common archive types, with the commands to create archives and extract the files, include:

Extension       Extract                     Create

file.tar.gz     tar xzvf file.tar.gz        tar czvf file.tar.gz filelist

file.tgz        tar xzvf file.tgz           tar czvf file.tgz filelist

file.tar.bz2    tar xjvf file.tar.bz2       tar cjvf file.tar.bz2 filelist

file.zip        unzip file.zip              zip file.zip filelist
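To see the create and extract commands working together without downloading anything, here is a sketch that archives a small made-up directory and unpacks it somewhere else. The v (verbose) flag is left out to keep the output short, and two extras not listed in the table are used: the t flag, which lists an archive’s contents without extracting, and -C, which tells tar to change into the given directory first.

```shell
work=$(mktemp -d)
mkdir "$work/project"
echo "some results" > "$work/project/results.txt"

# c = create, z = compress with gzip, f = the archive file name follows.
tar czf "$work/project.tar.gz" -C "$work" project

# t = list what is inside the archive.
tar tzf "$work/project.tar.gz"

# x = extract, here into a separate directory.
mkdir "$work/unpacked"
tar xzf "$work/project.tar.gz" -C "$work/unpacked"
cat "$work/unpacked/project/results.txt"
```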

In the vast majority of cases the programmers will be polite and all the files will appear in a

new directory with the same or similar name as the filename (without the .tar.gz). Let’s go to

that directory and see what files it contains.


prompt> cd units-1.87

prompt> ls

There will nearly always be README and INSTALL (or similarly named) files and you should read these to check how to install the program. Normally programs are installed using three commands: “./configure”, “make” and “make install”.

The first stage is to configure the compiling process using a script that is supplied with the

source code - this checks that the machine has the required dependencies, enables and disables

features; and sets options. Try it:

prompt> ./configure --prefix=$HOME/install

Note we add one option (--prefix=$HOME/install) - we tell the system to install units in the

install directory of our home directory. We would not have permission to install it to the

default location. We continue with the process:

prompt> make

prompt> make install

prompt> cd

And the package is installed. It is good practice to install all your local programs (at least those

that are utilities) to one place so you can simply add this location to your “PATH” and then

Linux will know where to look for the programs. If you explore the install directory you’ll be

able to find the units program:

prompt> cd

prompt> ls install

prompt> ls install/bin

We can add this to the PATH by executing:

prompt> export PATH=$PATH:$HOME/install/bin

And if we add this as the last line to our .bashrc file (the commands to automatically run when

we open a terminal) then we’d always have the units command available.
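The effect of PATH can be seen with a stand-in program, in case the units build is not to hand. This sketch creates a tiny script called hello-demo (a made-up name) in a throwaway install/bin directory, adds that directory to PATH and then runs the program by name alone. The command -v builtin, not otherwise covered here, reports where the shell found a command.

```shell
work=$(mktemp -d)
mkdir -p "$work/install/bin"

# A stand-in for an installed program (hypothetical name: hello-demo).
printf '#!/bin/sh\necho "hello from my own install"\n' > "$work/install/bin/hello-demo"
chmod u+x "$work/install/bin/hello-demo"

# Appending the directory to PATH lets the shell find the program by name.
export PATH="$PATH:$work/install/bin"
where=$(command -v hello-demo)
echo "found at: $where"
hello-demo
```

The change to PATH only lasts for the current shell, which is why the export line needs to go in .bashrc to make it permanent.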

Try it out:

prompt> units

2411 units, 71 prefixes, 33 nonlinear units

You have: miles per hour

You want: km/hr


* 1.609344

/ 0.62137119

Press Control-D to exit.

Exercise 9: Logging on to Remote Systems

Introduces the commands ssh and scp and the Windows programs PuTTY and WinSCP. We will also add users and change passwords (passwd, adduser).

Frequently you will need to connect from one Linux machine to another or copy files to another

Linux machine. The command used to “log-on” to another Linux machine’s command line is

called ssh. The command is used as follows:

ssh username@machine

To try this out we need to have a machine running an ssh server - i.e. a machine that allows command-line access. For the sake of this exercise we’re going to use our training machine. Let’s install the ssh server.

prompt> sudo apt-get install ssh-server

We also need to have a user to log in with - let’s create a new user. Normally you could do this from “System->Administration->Users and groups”; but as this is a command-line based course let’s use the adduser command, which adds a new user and allows us to set a password and provide information about them:

prompt> sudo adduser newuser

Remember the password that you give to the account! Incidentally, the passwd command allows you to change your password, or if you use sudo to run it then you can change anyone’s password - just give their username as the first argument!

Right let’s now try to log-in to the machine using ssh:

prompt> ssh newuser@localhost

Say yes when asked whether to continue - this message will always appear the first time you access a machine using ssh. If the message appears even though you’ve previously used ssh to contact the machine from the same place, then contact the administrator of the remote machine for help. localhost is always used to denote the machine you are running the command on. You’ll have to give the password.

Type pwd and you’ll see we’re now in newuser’s home directory. Type exit to log out.


To copy files we use the command scp, which is very similar to cp; you can even use the -r flag to copy whole directories. The addition is that the source file you are copying from, or the destination you are copying to, can be on a different machine. You specify this using the format username@machine:filename. If you do not specify a path for the filename then it is looked for in your home directory on that machine. For example let’s send the units-1.87.tar.gz file to newuser:

prompt> scp units-1.87.tar.gz newuser@localhost:

Log in again as newuser to check it’s there, log out and copy it back to a different filename.
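One way the check and the copy back might look is sketched below. It is written as a dry run so that it is safe to paste anywhere: the helper run is made up for this sketch (it is not part of ssh or scp) and only prints each command unless DRY_RUN is set to 0. The name units-copy.tar.gz is likewise just an example. Giving ssh a command after the machine name runs that one command remotely and then returns, which saves logging in and out by hand.

```bash
#!/bin/bash
# Sketch only: run() is a hypothetical helper, not part of ssh or scp.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

remote=newuser@localhost      # username@machine, as used in the exercise

# Send the file to newuser's home directory (nothing after the colon).
run scp units-1.87.tar.gz "$remote:"

# Run a single command on the remote machine to check the file arrived.
run ssh "$remote" ls -l units-1.87.tar.gz

# Copy it back under a different (made-up) name in the current directory.
run scp "$remote:units-1.87.tar.gz" units-copy.tar.gz
```

Run it with DRY_RUN=0 in front (for example DRY_RUN=0 ./copy-back.sh, if you saved it under that name) to carry out the commands for real; you will be asked for newuser’s password each time.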

If you wish to log in to Linux machines from Windows you can use a tool called PuTTY

(http://www.chiark.greenend.org.uk/~sgtatham/putty/) and you can copy files using WinSCP

(http://winscp.net/). These use the same protocols but add graphical interfaces. As these are not

Linux tools we’ll leave it to you to have a go another time.

Exercise 10: More Utilities

Introduces the commands: diff, top, nice and time.

There are just a few more commands that we wish to introduce to you. The time command allows you to measure how long a command or program takes to run. Try it out:

prompt> time sort /usr/share/dict/british-english >output

You will get some lines like:

real 0m2.992s

user 0m0.308s

sys 0m0.092s

This gives three pieces of information. The first line tells you how long the command took to run in real-world time (from you pressing return until you get the next prompt). The second is the CPU time spent running the code which is part of the command itself. The third gives the amount of time that the CPU ran Linux operating system code on behalf of your command. Note that most of the time is unaccounted for: it was taken up waiting for things to happen (for instance reading the word list from disk and writing the output file) and running other programs.
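To put a number on the unaccounted time, subtract the user and sys figures from the real figure. A minimal sketch of that arithmetic, using the sample figures printed above (the function name unaccounted is made up for illustration):

```bash
#!/bin/bash
# unaccounted REAL USER SYS: print REAL - (USER + SYS), all in seconds.
# The shell itself only does whole-number arithmetic, so awk does the sum.
unaccounted() {
    awk -v real="$1" -v user="$2" -v sys="$3" \
        'BEGIN { printf "%.3f\n", real - (user + sys) }'
}

# Sample figures from above: 2.992s real, 0.308s user, 0.092s sys.
unaccounted 2.992 0.308 0.092    # prints 2.592
```

So in that run roughly 2.6 of the 3 seconds were spent waiting rather than computing.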

The nice command is another command that can be used to prefix a different command or

program. This tells your computer that the command should be run at a lower priority than

normal. This is not that useful on your own machine, but allows you to be a good neighbour on

shared machines which are heavily loaded. For example:

prompt> time nice sort /usr/share/dict/british-english >output

You will get some lines like:

real 0m5.169s

user 0m0.320s

sys 0m0.916s

In this case the program overall took nearly twice as long to run in real time, but the user CPU time was almost unchanged.
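You can see what nice actually changes by asking it to report the priority setting (the “niceness”) instead of running a real program. The sketch below relies on the behaviour of the nice command found on Linux systems, where running nice with no command prints the current niceness; the adjustment of 5 is an arbitrary example value.

```bash
#!/bin/bash
# With no command, nice prints the niceness the shell is running at
# (0 is the usual default; higher numbers mean lower priority).
base=$(nice)

# Nest two calls: the outer one lowers the priority by 5, and the inner
# one reports the niceness it was started with.
lowered=$(nice -n 5 nice)

echo "normal niceness:  $base"
echo "under nice -n 5:  $lowered"
```

When you use nice without the -n option, as in the sort example above, it adds 10 to the niceness. The largest value is 19.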

Another command that is useful on multiuser machines is top. This can be used to view all the

running processes (or programs) on the machine, sorted in order of the amount of CPU they are

using. There is a lot of information displayed and it is updated every few seconds. This way you can check whether it is one of your programs that is making the machine run slowly.

prompt> top

Press ‘h’ to get help and ‘q’ to quit.

Finally, the diff command is a very useful command. It allows two text files to be compared

line by line and any changes noted. Try the following:

prompt> diff -u /usr/share/dict/american-english /usr/share/dict/british-english

The last lines of output look like:

@@ -98322,13 +98083,9 @@

yips

yo

yodel

-yodeled

-yodeler

-yodeler's

-yodelers

-yodeling

yodelled

yodeller

+yodeller's

yodellers

yodelling

yodels

This shows the changes that are needed to go from the first file to the second (take away words like “yodeler” and add “yodeller’s”). A line starting with + means add this line, - means remove it, and a space means the line is the same in both files. The line starting with @@ records where in each file this change is: the starting line number and the number of lines shown, first for the old file (after the -) and then for the new file (after the +). The -u flag allows you to see the changes in context. There are very many different formats and options to diff, but this is the one most likely to be useful for finding minor changes in text files.
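The dictionary files are large, so it can be easier to see how the format works on something tiny. The sketch below builds two three-word lists in a temporary directory (the words and file names are made-up sample data) and compares them:

```bash
#!/bin/bash
# Build two tiny word lists in a temporary directory.
workdir=$(mktemp -d)
printf '%s\n' yodel yodeled yodels  > "$workdir/american"
printf '%s\n' yodel yodelled yodels > "$workdir/british"

# diff -u starts with two header lines naming the files (with timestamps);
# tail -n +3 skips them so only the @@ line and the changes remain.
changes=$(diff -u "$workdir/american" "$workdir/british" | tail -n +3)
echo "$changes"

rm -r "$workdir"
```

This prints the line @@ -1,3 +1,3 @@ (three lines starting at line 1 in each file), followed by yodel and yodels unchanged with -yodeled and +yodelled between them.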

More Help

A great place to find more help is the man pages for various commands. There are also info pages, which are accessed in the same way but have a simple hypertext system for navigating around the command’s description rather than simply presenting it as one long page. Man pages can also be found on the internet by searching for “man command”.

The web is really the best place to find out more information - most problems have been

encountered by someone else before you!

Answers

For this online copy of the notes we provide suggested answers for the main exercises where there are no examples/answers in the main text.

Session 1

Exercise 2: Where am I?

The name of the “root” of the file system is a directory called / (i.e. the symbol ‘/’).

The command “cd ~/training” moves you directly to the training directory which you created inside your home directory.

Exercise 3: Permissions

The full chmod command to make sure that anyone in any group could read the contents of a

file you create is:

chmod a+r filename

To make a file executable (just for you) use:

chmod u+x filename
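To watch the two commands take effect, you can try them on a scratch file and look at the permission letters after each step. In this sketch the helper show_mode is made up for illustration, and chmod 600 is only there to start from a known state (read and write for the owner, nothing for anyone else):

```bash
#!/bin/bash
# Work on a scratch file so nothing real is changed.
scratch=$(mktemp)

# show_mode FILE: print the ten permission characters from ls -l.
show_mode() {
    ls -l "$1" | cut -c1-10
}

chmod 600 "$scratch"                  # known starting point
start=$(show_mode "$scratch")         # -rw-------

chmod a+r "$scratch"                  # a = all: owner, group and others
after_read=$(show_mode "$scratch")    # -rw-r--r--

chmod u+x "$scratch"                  # u = user: only the owner
after_exec=$(show_mode "$scratch")    # -rwxr--r--

echo "$start -> $after_read -> $after_exec"
rm "$scratch"
```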

Exercise 4: Wildcards

This section has a number of uses of the ls command. Here is what each produces:

ls /etc
the names of all the files in the directory /etc.

ls /etc/a*
the names of all the files with names beginning with ‘a’ in the /etc directory AND the list of the contents of all the directories with names beginning with ‘a’ in the /etc directory.

ls /etc/A*
the names of all the files with names beginning with ‘A’ in the /etc directory AND the list of the contents of all the directories with names beginning with ‘A’ in the /etc directory. The result is different because in Linux it matters whether you use upper or lower case letters in filenames.

ls /etc/*a*
filenames and directory contents where their names contain an ‘a’ anywhere.

ls /etc/*a
filenames and directory contents where their names end in ‘a’.

ls /etc/[a-c]*
filenames and directory contents where their names start with an ‘a’, ‘b’ or ‘c’.

ls /etc/*[0-9]*
filenames and directory contents where their names contain a digit anywhere.
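The contents of /etc differ from machine to machine, so here is a sketch that tries the same patterns on a scratch directory with known contents. The file names are made up, echo is used instead of ls so that you see exactly what each pattern expands to, and LC_ALL=C is an assumption made here so that [a-c] means plain lower-case a, b and c whatever your language settings are.

```bash
#!/bin/bash
# glob_demo (made-up name): the body is in ( ) so that the cd and the
# locale setting do not affect the rest of your shell session.
glob_demo() (
    export LC_ALL=C
    workdir=$(mktemp -d)
    cd "$workdir" || exit 1
    touch alpha Alpha bat cat.txt delta file2

    echo a*          # alpha
    echo A*          # Alpha
    echo *a*         # Alpha alpha bat cat.txt delta
    echo *a          # Alpha alpha delta
    echo [a-c]*      # alpha bat cat.txt
    echo *[0-9]*     # file2

    cd / && rm -r "$workdir"
)

glob_demo
```

If you only want the names and not the contents of matching directories, ls -d /etc/a* does the same job with ls.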

Exercise 7: Copying, Moving and Deleting files and directories

1. Copy myfile.txt to another file with the name “myfile2.txt”.

cp myfile.txt myfile2.txt

2. Edit the second file with nano and change some text at random.

nano myfile2.txt

3. Come up with a single command using wildcards which searches both files for a particular word.

grep word myfile*.txt

4. Make a directory and try to copy it. What happens?

mkdir test

cp test test2

-> there is an error, as cp will not copy a directory unless it is given the -r flag.

What does “rm *” do? It deletes all the files in the current directory (but not directories or hidden files)!
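The cp behaviour from item 4 can be checked safely in a temporary directory. This sketch hides cp’s error message and looks at whether the command succeeded instead; the file notes.txt is made up so that the directory has something in it.

```bash
#!/bin/bash
workdir=$(mktemp -d)
mkdir "$workdir/test"
touch "$workdir/test/notes.txt"

# Without -r, cp refuses to copy the directory and reports failure.
if cp "$workdir/test" "$workdir/test2" 2>/dev/null; then
    plain_copy=worked
else
    plain_copy=failed
fi
echo "cp without -r: $plain_copy"

# With -r the directory and everything inside it is copied.
cp -r "$workdir/test" "$workdir/test2"
copied=$(ls "$workdir/test2")
echo "test2 contains: $copied"

rm -r "$workdir"
```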

Exercise 8: The Linux environment

Adding the current directory to the list of directories to search for programs in:

export PATH=$PATH:$PWD
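To see the effect, try it in a scratch directory containing a small script. Everything in this sketch apart from the export line is made up for illustration, including the script name hello.sh. It also assumes that programs are allowed to run from the temporary directory, which is the case on most systems.

```bash
#!/bin/bash
# Scratch directory holding a tiny script called hello.sh.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '#!/bin/bash\necho hello\n' > hello.sh
chmod u+x hello.sh

# Before: the shell cannot find hello.sh by its name alone.
command -v hello.sh || echo "hello.sh not found on PATH"

# Append the current directory to the search list, as in the answer.
export PATH=$PATH:$PWD

# After: the bare name now works.
greeting=$(hello.sh)
echo "$greeting"

cd - >/dev/null
rm -r "$workdir"
```

The change only lasts for the shell you typed it in; a new terminal window starts with the original PATH.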

Session 2

Exercise 2: Redirection

Take the command “find notthere | grep bash” and amend it so that errors are sent into a file “errorfile”. For this question you must notice that it is the first of the two commands, find, that produces the error. The message produced was:

prompt> find notthere | grep bash

find: `notthere`: No such file or directory

Therefore we need to redirect the error output from the first command, which is before the

pipe (‘|’) symbol. So the answer is:

find notthere 2>>errorfile | grep bash
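It is worth trying the redirection on both sides of the pipe to convince yourself that the position matters. The sketch below does this in a temporary directory; errorfile2 is a made-up name for the second attempt.

```bash
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Redirection attached to find (before the pipe): the message is appended
# to errorfile and nothing appears on the screen.
find notthere 2>>errorfile | grep bash
captured=$(cat errorfile)
echo "errorfile holds: $captured"

# Redirection attached to grep (after the pipe): find's message still goes
# to the screen, and grep reports no errors, so errorfile2 stays empty.
find notthere | grep bash 2>>errorfile2
misplaced=$(cat errorfile2)
echo "errorfile2 holds: '$misplaced'"

cd - >/dev/null
rm -r "$workdir"
```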

Exercise 3: More commands - filters

Count the number of British words which contain “ous”:

grep ous /usr/share/dict/british-english | wc -l

Count the number of American words that do not contain “ous”:

grep -v ous /usr/share/dict/american-english | wc -l

Count the number of words that are in both dictionaries:

cat /usr/share/dict/british-english /usr/share/dict/american-english | sort | uniq -c | grep 2 | wc -l

This is complicated, but this is how it works: concatenate the two word lists, sort them into alphabetical order, make a list of how many times each unique line appears, find those lines for which there are two instances and then count them.
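A slightly shorter variant, offered here as an alternative rather than as the course’s answer, uses the -d option of uniq, which prints only the lines that are repeated. That removes the need for grep, and it also avoids miscounting any line that happens to contain the digit 2 somewhere else. The sketch compares both pipelines on two tiny made-up word lists:

```bash
#!/bin/bash
# Two tiny word lists standing in for the dictionaries.
workdir=$(mktemp -d)
printf '%s\n' apple color zebra  > "$workdir/american"
printf '%s\n' apple colour zebra > "$workdir/british"

# The pipeline from the answer: count lines whose tally contains a 2.
with_grep=$(cat "$workdir/british" "$workdir/american" | sort | uniq -c | grep 2 | wc -l)

# Variant: uniq -d prints only the lines that appear more than once.
with_d=$(cat "$workdir/british" "$workdir/american" | sort | uniq -d | wc -l)

echo "uniq -c | grep 2: $with_grep"
echo "uniq -d:          $with_d"

rm -r "$workdir"
```

Both give 2 here, for apple and zebra.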

Exercise 5: Making a shell script (1)

The command to make the listing.sh script executable is:

chmod u+x listing.sh

The listing.sh script with the two arguments added is:

#!/bin/bash

# This is a comment. Other than #!, anything after a # is ignored

# List the directory given as the first argument ($1) and REDIRECT the

# output to the file named by the second argument ($2)

ls "$1" > "$2"

And you can use it to list the /usr directory to the file usr-listing.txt with:

./listing.sh /usr usr-listing.txt

Exercise 6: Making a shell script (2)

A for loop which lists the contents of all the directories in /usr:

for dir in `ls /usr`; do
    ls /usr/$dir
done

Notice the use of command substitution to make the list of items for the loop and that the ls

command needs to be given the full path to the directory we want to list, so we require

“/usr/$dir” rather than just “$dir”.
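Building the list with ls works for /usr, but it splits names wherever they contain a space. A common alternative, sketched here on a scratch directory with made-up names, is to let a wildcard produce the list. The wildcard already expands to full paths, and putting "$dir" in quotes keeps a name like “my docs” together as one item. The function name list_subdirs is hypothetical.

```bash
#!/bin/bash
# list_subdirs DIR: list the contents of every directory inside DIR.
list_subdirs() {
    for dir in "$1"/*/; do       # the trailing / matches directories only
        ls "$dir"
    done
}

# Scratch tree; one directory name contains a space.
top=$(mktemp -d)
mkdir "$top/bin" "$top/my docs"
touch "$top/bin/tool" "$top/my docs/readme"

listing=$(list_subdirs "$top")
echo "$listing"

rm -r "$top"
```

For the exercise itself the equivalent call would be list_subdirs /usr.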

The full script listing a given directory as “FILE: filename”, “DIRECTORY: directoryname” or “OTHER: othername”:

#!/bin/bash

# Classify each entry of the directory given as the first argument ($1)
for name in `ls "$1"`; do
    if [[ -f $1/$name ]]; then
        echo "FILE: $name"
    else
        if [[ -d $1/$name ]]; then
            echo "DIRECTORY: $name"
        else
            echo "OTHER: $name"
        fi
    fi
done

If you save this as script.sh then you can try it out using:

chmod u+x script.sh

./script.sh $HOME

./script.sh /dev
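Your home directory may not contain anything that counts as OTHER, so here is a sketch that makes one of each kind of entry in a scratch directory. It is a variant of the script rather than the script itself: the same tests are wrapped in a function (the name classify is made up), a wildcard replaces ls, and elif replaces the nested if. mkfifo creates a “named pipe”, which is neither an ordinary file nor a directory.

```bash
#!/bin/bash
# classify DIR: print FILE, DIRECTORY or OTHER for each entry in DIR.
classify() {
    for path in "$1"/*; do
        name=$(basename "$path")     # strip the leading directory part
        if [[ -f $path ]]; then
            echo "FILE: $name"
        elif [[ -d $path ]]; then
            echo "DIRECTORY: $name"
        else
            echo "OTHER: $name"
        fi
    done
}

# Scratch directory holding one of each kind of entry.
workdir=$(mktemp -d)
touch "$workdir/a-file"
mkdir "$workdir/b-dir"
mkfifo "$workdir/c-pipe"

report=$(classify "$workdir")
echo "$report"

rm -r "$workdir"
```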