Perl 101 - The Basics of Perl Programming


Objectives

• To introduce the Perl scripting language
• To introduce the concept of regular expressions and pattern matching
• To introduce various file operations in Perl

Linux User Group at San Jose State University


Agenda

Part 1 (1:30 PM to 2:45 PM)

Introduction to Perl
– Types of variables
– I/O functions
– Operators
– Strings and string formatting
– Selection, iteration and miscellaneous control statements

Data Structures
– Arrays and various operations on arrays
– Hashes and various operations on hashes

Functions
– Subroutines
– Command line arguments


Part 2 (2:50 PM to 3:45 PM)

Regular Expressions
– Regular expressions
– Metacharacters
– Character classes
– Pattern matching

File I/O
– Reading and writing text files
– Various file I/O modes

Misc
– Installing modules

Questions and Answers


What is Perl and where is it used?

• Perl is a scripting language
• Perl is mostly used for extracting information from text files and generating reports based on it
• Perl is also used for developing web applications using the CGI standard
• It is also used for developing scripts for automated testing, system administration, etc.
• It is one of the most popular scripting languages


Perl Introduction

• Perl is commonly expanded as "Practical Extraction and Report Language"
– Developed by Larry Wall in 1987
– Originally designed for reading text files and preparing reports based on that information
– Combines features of C, sed, awk and sh
– Places no built-in limit on data size (only available memory limits it)
– Allows recursion of unlimited depth, again subject to available memory
– Sophisticated pattern matching techniques
– Used for web programming
– Widely used for system administration
– Wide range of third-party modules


The Perl Interpreter

• Perl interpreter
– Converts the script into a parse tree and executes it immediately
– Internally, Perl keeps the compiled form of the script (an op tree, loosely a byte code) for execution
– Known as an interpreter/compiler


Perl - Variables

• Scalar variables
– Simple variables containing a single element, either a number or a string
– Defined by the $ symbol
• Arrays
– List of scalar values
– Defined by the @ symbol
• Hashes
– Similar to arrays
– Each item is identified by a key
– Defined by the % symbol


Perl - Scalar Variables

– Defined by the $ symbol
– Stores either a number or a string
– No declaration is required (unless use strict is in effect); a variable is created when it is first referred to in the program. The related term autovivification usually describes the automatic creation of nested data structures
– Strings and numbers are interchangeable
– Variable names are case sensitive
– Variable names can contain letters, digits and underscores (_)
• Should not start with a digit
• The variables $_ and @_ are special
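
The points about interchangeable strings and numbers and about case-sensitive names can be illustrated with a minimal sketch. All variable names and values here are hypothetical, chosen only for the example.

```perl
use strict;
use warnings;

my $quantity = '42';              # a string that looks like a number
my $sum      = $quantity + 8;     # used as a number: 50
my $label    = $quantity . '0';   # used as a string: '420'

# Names are case sensitive: $Total and $total are different variables.
my $Total = 1;
my $total = 2;

print "$sum $label $Total $total\n";
```

With use strict in effect, each variable has to be declared with my before use, which is why every line above starts with it.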


Perl - Lists

– Collection of scalar data, arrays and hashes
– Represents a logical group of items
– Scalar data inside ( and ), separated by commas, represents a list
– A set of operators works on lists; these are called list operators
• E.g. print
• print (1,2,3) displays 123


Perl - Display Functions

• print
– Unformatted output statement that can print a string, a number or a list
– Writes to STDOUT by default; can be redirected to STDERR or a file
– Returns true if successful, false otherwise (1 represents true)
    print List
    print FILEHANDLE List
• printf
– Formatted output statement
– Writes to STDOUT by default; can be redirected to STDERR or a file
    printf Format, List
    printf FILEHANDLE Format, List
• warn
– Outputs a message on the STDERR stream (unbuffered)
– Useful for tracing errors when STDOUT is redirected
– The program is not terminated and no exception is thrown
    warn "Message"
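
A short sketch of the three display functions, using made-up values. The sprintf call is an addition not covered on the slide; it is included only because it returns the formatted string rather than printing it, which makes the formatting easy to inspect.

```perl
use strict;
use warnings;

# Hypothetical values, chosen only for illustration.
my @items = (1, 2, 3);

# print writes list elements one after another with no separator.
print @items, "\n";

# printf formats its arguments according to the format string.
printf "%-6s|%5.2f\n", 'total', 3.14159;

# A filehandle goes before the list, with no comma after it.
print STDERR "this line goes to STDERR\n";

# sprintf (not on the slide) returns the formatted string.
my $formatted = sprintf '%03d', 7;

# warn writes to STDERR and the script keeps running.
warn "something looks odd\n";
print "still running, formatted=$formatted\n";
```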


Perl - Operators

• Assignment operator
• Auto increment and decrement operators
• Additive operators
• Multiplicative operators
• Exponentiation operator
• Relational operators
• Logical operators
• Range operator
• Binding operator
• Arrow operator
• File I/O operator


Perl - Operators contd

• Assignment operator (=)
• Auto increment and decrement operators (++, --)
• Additive operators (+, -, .)
– Addition and subtraction operators (+, -)
– Concatenation operator (.)
• Multiplicative operators
– Multiplication and division (*, /)
– Repetition operator (x)
• Exponentiation operator (**)


Perl - Operators contd

• Relational operators
– Operating on numeric data
    <, <=, >, >=, ==, !=
– Operating on string data
    lt – less than, e.g. ($Myname lt $YourName)
    gt – greater than
    le – less than or equal to
    ge – greater than or equal to
    eq – equal to
    ne – not equal to
    cmp – compare (returns -1, 0 or 1)
• Logical operators
    AND – && or 'and'
    OR – || or 'or'
    NOT – ! or 'not'
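
The difference between the numeric and the string operators shows up when the same values are compared both ways. The sketch below uses made-up values; the numeric counterpart of cmp, the <=> operator, is not listed on the slide and is brought in here only for the comparison.

```perl
use strict;
use warnings;

# The same pair of values compared numerically and as strings.
my $numeric_equal = ('10' == 10.0)   ? 1 : 0;   # equal as numbers
my $string_equal  = ('10' eq '10.0') ? 1 : 0;   # different as strings

# Sorting the same list with <=> (numeric) and cmp (string).
my @values    = (10, 9, 100);
my @by_number = sort { $a <=> $b } @values;
my @by_string = sort { $a cmp $b } @values;

print "numeric order: @by_number\n";
print "string order:  @by_string\n";
```

In string order "100" sorts before "9" because the comparison goes character by character, which is why choosing the right operator family matters.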


Perl - Operators contd

• Range operator (..)
– Returns a list containing a sequence of values
    print (1..50);
    print ('a'..'z');
• Binding operators (=~, !~)
– Used for binding scalar expressions to a pattern match
• Arrow operator (->)
– Used to retrieve elements from a reference
• File operator (<File_Handle>)
– Used for file I/O operations
– <> with no filehandle reads from the files named on the command line, or from STDIN if there are none

Note: The binding, arrow and file operators are explained later


Perl - Standard Input

• Input operator
– <STDIN> or <>
– Reads a line of input from the user, including the trailing '\n'
– The chomp function can be used to remove the '\n' character
– When the read is the only condition of a while loop, the line is stored in the default variable $_
• chomp
– Usually used to discard the '\n' character at the end of an input string


Perl - Strings

• Sequence of characters; each character is an 8-bit value unless the string holds Unicode data
• Single-quoted string
– Escape sequences other than \\ and \' are not interpreted, and variables are not interpolated
• Double-quoted string
– All escape sequences are interpreted, and variables are interpolated
• Concatenation operator
– '.' can be used for concatenating string values
• Repetition operator
– 'x' makes the given number of concatenated copies of a string
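
A minimal sketch of the quoting rules and the two string operators. The variable names and values are hypothetical.

```perl
use strict;
use warnings;

my $name   = 'World';                 # hypothetical value
my $single = 'Hello, $name\n';        # kept literally: no interpolation, no newline
my $double = "Hello, $name\n";        # $name and \n are both interpreted

my $joined   = 'Perl' . ' ' . '101';  # concatenation
my $repeated = '-' x 10;              # repetition

print length($single), "\n";          # counts the literal $name and \n characters
print $double;
print "$joined\n$repeated\n";
```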


Perl - String Functions

• Converting to lower case: lc STRING
• Converting to upper case: uc STRING
• Converting the initial character to lower case: lcfirst STRING
• Converting the initial character to upper case: ucfirst STRING
• Getting the string length: length (STRING)
– Returns the length of the string
– If STRING is omitted, it returns the length of the string stored in $_


Perl - String Functions contd

• Searching a string: index (STRING, SUBSTRING, [POSITION])
– Returns the index of the first occurrence of SUBSTRING in STRING at or after POSITION
– If POSITION is omitted, the search starts from the beginning of STRING
– Returns -1 if SUBSTRING is not found
– rindex can be used to find the index of the last occurrence of SUBSTRING in STRING at or before POSITION


Perl - String Functions contd

• Extracting/replacing a substring:
    substr STRING, OFFSET, [LENGTH, [REPLACEMENT]]
– Extracts and returns LENGTH characters of STRING starting at OFFSET
– If LENGTH is omitted, extraction runs from OFFSET to the end of STRING
– If LENGTH is negative, extraction leaves that many characters off the end
– If OFFSET is negative, extraction starts that many characters from the end of the string
– If REPLACEMENT is specified, the extracted substring is replaced by REPLACEMENT
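
The string functions from the last few slides (uc, index, rindex, substr) can be tried on one sample string. This is a sketch with a made-up string; the comments give the zero-based positions.

```perl
use strict;
use warnings;

my $text = 'perl script, perl module';    # hypothetical sample string

my $upper = uc 'perl';                     # 'PERL'
my $first = index($text, 'perl');          # 0: first occurrence
my $next  = index($text, 'perl', 1);       # 13: search starts at position 1
my $last  = rindex($text, 'perl');         # 13: last occurrence
my $none  = index($text, 'python');        # -1: not found

my $word = substr($text, 5, 6);            # 'script': 6 characters from offset 5
my $tail = substr($text, -6);              # 'module': last 6 characters
my $head = substr($text, 0, -7);           # 'perl script, perl': drops last 7

# Four-argument form: replace the first 4 characters in a copy.
my $copy = $text;
substr($copy, 0, 4, 'Perl');

print "$upper $first $next $last $none\n$word|$tail|$head\n$copy\n";
```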


Perl - Control Structures

• Branching statements
– if
– if - else
– if - elsif
• Looping statements
– while / until
– do - while / until
– for
– foreach
• Miscellaneous control statements
– last, next, redo
– Expression modifiers
– exit, die


Perl - Control Structures contd

– if statement
    if (condition) {
        # True block
    }
– if - else statement
    if (condition) {
        # True block
    }
    else {
        # False block
    }
– if - elsif statement
    if (condition1) {
        # True block for condition1
    }
    elsif (condition2) {
        # True block for condition2
    }
    else {
        # False block
    }


Perl - Control Structures contd

• while loop
    while (condition) {
        # while block statements
    }
  Control comes out of the loop when the condition is FALSE
• until loop
    until (condition) {
        # until block statements
    }
  Control comes out of the loop when the condition is TRUE


Perl - Control Structures contd

• do while loop
    do {
        # while block statements
    } while (condition);
  Control comes out of the loop when the condition is FALSE
• do until loop
    do {
        # until block statements
    } until (condition);
  Control comes out of the loop when the condition is TRUE


Perl - Control Structures contd

• for
    for (init expr; condition; increment/decrement) {
        # for block
    }
• foreach
– Mostly used when iterating over the values in a list
    foreach $Variable (LIST) {
        # Block of statements
    }
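
The looping statements from the last three slides can be sketched as follows. The counters and limits are arbitrary values picked for the example.

```perl
use strict;
use warnings;

# C-style for loop: sum the integers 1 through 5.
my $sum = 0;
for (my $i = 1; $i <= 5; $i++) {
    $sum += $i;
}

# foreach visits each element of a list in turn.
my @squares;
foreach my $n (1 .. 5) {
    push @squares, $n * $n;
}

# while repeats as long as the condition is true,
# until repeats as long as the condition is false.
my $countdown = 3;
while ($countdown > 0) {
    $countdown--;
}
until ($countdown == 3) {
    $countdown++;
}

# do-while runs the block once before the first test.
my $runs = 0;
do {
    $runs++;
} while ($runs < 0);

print "sum=$sum squares=@squares countdown=$countdown runs=$runs\n";
```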


Perl - Control Structures contd

• continue
– Flow control block that follows a while or foreach loop
– The continue block is executed at the end of every iteration, just before the condition is evaluated again
• next
– Skips the rest of the current iteration and continues with the next iteration of the loop
– Cannot be used inside a do-while/do-until loop
– If a continue block is present, it is also executed
• redo
– Restarts the current iteration without evaluating the condition again
– The continue block, if present, is not executed
• last
– Exits the loop
– The continue block, if present, is not executed
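
How next and last interact with a continue block can be checked with a small counter. This is a sketch; the loop bounds and variable names are arbitrary.

```perl
use strict;
use warnings;

my @kept;
my $continue_runs = 0;

foreach my $n (1 .. 10) {
    next if $n % 2;      # odd number: skip it, the continue block still runs
    last if $n > 6;      # leave the loop, the continue block is skipped
    push @kept, $n;
}
continue {
    $continue_runs++;    # counts iterations that ended normally or via next
}

print "kept=@kept continue_runs=$continue_runs\n";
```

The loop keeps 2, 4 and 6, then leaves at 8. The continue block runs for the seven iterations before that, but not for the one ended by last.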


Perl - Control Structures contd

• exit
– Used to terminate the execution of a script
– exit 0 denotes exit on success and exit 1 denotes exit on error
– Cannot display an error message while exiting
    $Choice = <STDIN>;
    if ($Choice == 4) { exit 0; }
• die
– Displays an error message on the STDERR stream
– and terminates the execution of the script
    copy($SourceFile, $TargetFile) or die "File cannot be copied";
  (copy is provided by the File::Copy module and takes the source file first)


Perl - Arrays

– Set of scalar values
– Grows and shrinks dynamically whenever required
– Can store dissimilar data types
– Prefixed with the '@' symbol
– Uninitialized locations and locations beyond the end of the array are undef
– Can be initialized using
• A list
• The qw operator
• The repetition operator (x)
• The range operator (..)
• Individual locations


Perl - Arrays contd

• Creation using the range operator
    @EmpNumber = (1001..1200);
– Can be used if the array has to be initialized with a range of elements
• Creation using the repetition operator
    @Number = (0) x 10;
– Can be used if the array has to be initialized with the same value in all locations
• Creation of an array with dissimilar elements
    @Details = (1001, "John", 3400.50);


Perl - Arrays contd

• Length of the array
– $#<array name> stores the last index of the array
– $#<array name> + 1 gives the number of elements
• Emptying an array
    $#Emp = -1;
  OR
    @Emp = ();
  OR
    undef @Emp;
• Looping through an array using the foreach statement
• Converting a string into an array using the split function
• Sorting an array: sort @<array_name>
• Reversing an array: reverse @<array_name>
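
The array operations above can be sketched in one short script. The data is made up, and scalar @array is used as an equivalent way of getting the element count that the slide does not mention.

```perl
use strict;
use warnings;

my @emp_numbers = (1001 .. 1005);          # range operator
my @zeros       = (0) x 4;                 # repetition operator
my @names       = qw(Mary John Alice);     # qw: list of quoted words

my $last_index = $#emp_numbers;            # index of the last element
my $count      = scalar @emp_numbers;      # same as $#emp_numbers + 1

my @fields   = split /,/, '1001,John,3400.50';   # string to array
my @sorted   = sort @names;                      # default string sort
my @reversed = reverse @sorted;

@zeros = ();                               # empty the array

print "count=$count last=$last_index fields=@fields\n";
print "sorted=@sorted reversed=@reversed zeros=", scalar(@zeros), "\n";
```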


Perl - Arrays contd

• Pushing and popping elements
– push (<Array>, <Value>/<List>)
• Adds the element or list to the end of the array
• Array expands automatically
– pop (<Array>)
• Removes the element from the end of the array and returns it
• Array shrinks automatically
• Shifting and unshifting
– unshift (<Array>, <Value>/<List>)
• Adds the element or list to the front of the array and returns the new number of elements
• Array expands automatically
– shift (<Array>)
• Removes the element from the front of the array and returns it
• Array shrinks automatically
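
A minimal sketch of the four functions working on the two ends of an array. The values are arbitrary, and the comments show the array contents after each step.

```perl
use strict;
use warnings;

my @queue = (2, 3);

push @queue, 4, 5;                     # (2, 3, 4, 5)
my $popped   = pop @queue;             # removes 5, leaving (2, 3, 4)
my $new_size = unshift @queue, 0, 1;   # (0, 1, 2, 3, 4), returns the new size
my $shifted  = shift @queue;           # removes 0, leaving (1, 2, 3, 4)

print "popped=$popped shifted=$shifted size=$new_size queue=@queue\n";
```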


Perl - Hashes

• Hashes
– Unordered collection of values
– Each element is linked to a key, which uniquely identifies that element
– Any element of the hash can be retrieved, added or deleted using its key
– A hash variable is prefixed with %
– Also called associative arrays
• Accessing elements
– Elements can be retrieved using the key
    $RetrievedElement = $Hash{Key};
• Adding new elements
– $Hash{NewKey} = NewElement;
• Changing existing elements
– $Hash{Key} = NewElement;


Perl - Hashes contd

• Deleting key/value pairs
– delete ($Hash{Key})
• Getting all keys or all values
– keys (%hash)
– values (%hash)
• Does a key exist?
– Using the exists keyword
• Looping through hash elements
– Using foreach and keys
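
The hash operations from these two slides can be sketched together. The employee names and salaries are invented for the example.

```perl
use strict;
use warnings;

my %salary = (John => 3400.50, Mary => 4100);   # hypothetical data

$salary{Alice} = 3900;          # add a new key/value pair
$salary{John}  = 3600;          # change an existing value
delete $salary{Mary};           # delete a pair

my $has_mary  = exists $salary{Mary}  ? 1 : 0;
my $has_alice = exists $salary{Alice} ? 1 : 0;

# keys returns the keys in no particular order, so sort them
# when a stable order is needed.
my @names = sort keys %salary;
foreach my $name (@names) {
    print "$name earns $salary{$name}\n";
}
```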


Perl - More Data Structures

• Combinations of arrays and hashes
• Arrays of arrays
• Hashes of arrays
• Arrays of hashes
• Hashes of hashes
and so on…
http://perldoc.perl.org/perldsc.html
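
Two of these combinations, a hash of arrays and an array of hashes, might look like the sketch below. Nested structures are built from references; the names and data are hypothetical.

```perl
use strict;
use warnings;

# Hash of arrays: each key maps to a reference to an array.
my %skills = (
    John => ['perl', 'awk'],
    Mary => ['sed'],
);
push @{ $skills{Mary} }, 'sh';        # add to Mary's list

# Array of hashes: each element is a reference to a hash.
my @employees = (
    { id => 1001, name => 'John' },
    { id => 1002, name => 'Mary' },
);

my $second_skill = $skills{John}[1];            # second element of John's list
my $second_name  = $employees[1]{name};         # name in the second record
my $mary_count   = scalar @{ $skills{Mary} };   # size of Mary's list

print "$second_skill $second_name $mary_count\n";
```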


Perl - Regular Expressions

• A regular expression, often called a pattern in Perl, is a template that either matches or doesn't match a given string.
• Using simple patterns:
– To compare a pattern (regular expression) to the contents of $_, simply put the pattern between a pair of forward slashes (/)
    $_ = "perl script";
    if (/perl/) {
        print "It matched!\n";
    }

Note: $_ is the default variable


Perl - Regular Expression contd

• Metacharacters
– Dot (.) is a wildcard character: it matches any single character except a newline (which is represented by "\n")
    /b.t/ will match bat, bet, bit, but etc. It will not match bt or boat.
– A backslash in front of any metacharacter makes it non-special.
    /3\.14159/ doesn't have a wildcard character.
– Star (*) is a quantifier: it matches the preceding character zero or more times
    /perl\t*script/ matches any number of tab characters between perl and script, i.e. it matches "perl\tscript" with one tab, "perl\t\tscript" with two tabs, "perl\t\t\tscript" with three tabs, or even "perlscript" with nothing in between at all.


Perl - Regular Expression contd

• Metacharacters
– Plus (+) matches the preceding character one or more times
    /perl +script/ matches if perl and script are separated only by spaces. This pattern won't match "perlscript", since the plus matches only if there are one or more spaces between the two words.
– Question mark (?) means that the preceding character is optional
    /perl ?script/ matches if perl and script are separated by one space or no space. This will match "perl script" as well as "perlscript".
– Parentheses ("( )") may be used for grouping
    /(perl)+/ matches strings like perlperlperl
– The vertical bar (|), often pronounced "or", means that either the left side or the right side may match
    /unix|perl/ will match any string that mentions unix or perl


Perl - Regular Expression contd

• Character class
– A list of possible characters inside square brackets ([]) matches any single character from within the class.
    [abcwxyz] matches any one of the seven characters.
    [a-zA-Z] matches any single letter (uppercase or lowercase).
– A caret (^) at the start of the character class negates it.
• [^def] will match any single character except "d", "e" or "f".
– \d or [0-9] will match any digit
– \w is a shortcut for any "word" character:
• \w or [A-Za-z0-9_] will match any single "word" character
– \s is a shortcut for any whitespace character
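
The metacharacters and character classes above can be checked against the slides' own sample strings. In this sketch each result is stored as 1 (matched) or 0 (did not match); the variable names are made up.

```perl
use strict;
use warnings;

my $dot       = ('bat'  =~ /b.t/) ? 1 : 0;               # one character between b and t
my $dot_none  = ('bt'   =~ /b.t/) ? 1 : 0;               # nothing between b and t
my $dot_two   = ('boat' =~ /b.t/) ? 1 : 0;               # two characters between b and t
my $literal   = ('3x14159' =~ /3\.14159/) ? 1 : 0;       # escaped dot needs a real dot
my $star      = ('perlscript' =~ /perl\t*script/) ? 1 : 0;   # zero tabs is fine
my $plus      = ('perlscript' =~ /perl +script/)  ? 1 : 0;   # needs at least one space
my $optional  = ('perlscript' =~ /perl ?script/)  ? 1 : 0;   # space is optional
my $group     = ('perlperlperl' =~ /(perl)+/) ? 1 : 0;
my $either    = ('I use unix'   =~ /unix|perl/) ? 1 : 0;
my $digit     = ('room 42' =~ /[0-9]/) ? 1 : 0;
my $negated   = ('def' =~ /[^def]/) ? 1 : 0;             # every character is excluded

print "$dot$dot_none$dot_two$literal$star$plus$optional$group$either$digit$negated\n";
```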


Perl - Regular Expression contd

• Matches with m//
– The m// (pattern match) operator allows you to choose any pair of delimiters to quote the contents
– m(perl), m<perl>, m{perl}, or m[perl]
• Case-insensitive matching with /i
– To make a case-insensitive pattern match, so that you can match PERL as easily as perl or Perl, use the /i modifier
• Binding operator =~
– Matching against $_ is merely the default; the binding operator (=~) tells Perl to match the pattern on the right against the string on the left, instead of matching against $_
• Substitution with s///
– s/// is the search and replace feature. It replaces whatever part of a variable matches a pattern with a replacement string.
– If the match fails, nothing happens, and the variable is untouched


Perl - Regular Expression contd

• Global replacements with /g
– s/// will make just one replacement, even if others are possible. The /g modifier tells s/// to make all possible replacements
– A fairly common use of a global replacement is to collapse whitespace; that is, to turn any arbitrary whitespace into a single space
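
The binding operator, the /i and /g modifiers and s/// can be sketched on one line of text. The input string is hypothetical; the (my $copy = $line) =~ s/// form is a common idiom for substituting in a copy and leaving the original untouched.

```perl
use strict;
use warnings;

my $line = "PERL   is\t\tfun";               # hypothetical input with messy whitespace

my $found  = ($line =~ /perl/i)  ? 1 : 0;    # case-insensitive match
my $braces = ($line =~ m{fun})   ? 1 : 0;    # m// with a different delimiter pair
my $absent = ($line !~ /python/) ? 1 : 0;    # !~ is true when there is no match

(my $once = $line) =~ s/\s+/ /;              # replaces only the first run of whitespace
(my $all  = $line) =~ s/\s+/ /g;             # collapses every run of whitespace

print "$found $braces $absent\n$all\n";
```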


Perl - Subroutines

• Section of the script that performs a specific task
• Values can be passed to a subroutine so that it performs the task on these values
• Values passed to a subroutine are called arguments
• A return value is sent back by the subroutine
• @_ is the array where the argument values are stored during a subroutine call
• Subroutine definitions can be placed anywhere in your program text
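
A minimal sketch of a subroutine that takes its arguments from @_ and sends a value back with return. The subroutine name max_of and its arguments are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the largest of its numeric arguments.
sub max_of {
    my @numbers = @_;             # arguments arrive in @_
    my $max = shift @numbers;     # start with the first argument
    foreach my $n (@numbers) {
        $max = $n if $n > $max;
    }
    return $max;
}

my $largest = max_of(3, 17, 8);
print "largest=$largest\n";
```

Copying @_ into a lexical array at the top keeps the caller's values from being modified by accident, since the elements of @_ are aliases to the original arguments.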


Perl - Subroutine contd

• Subroutine declaration (with prototypes)
– sub MySubroutine()
• MySubroutine accepts no arguments
– sub MySubroutine($)
• MySubroutine accepts one scalar argument
– sub MySubroutine($$)
• MySubroutine accepts 2 scalar arguments
– sub MySubroutine(@)
• MySubroutine accepts a list (for example an array) as its arguments
– sub MySubroutine(%)
• MySubroutine accepts a hash, passed as a list of key/value pairs; a hash reference would be written \%
– sub MySubroutine($@)
• MySubroutine accepts a scalar value followed by a list
– sub MySubroutine($;$)
• MySubroutine accepts 2 arguments, where the second argument is optional


Perl - Subroutine contd

• Argument passing and returning values
– Arguments passed from the calling function are stored in the array @_
• Optional arguments
– Optional arguments are specified in the prototype, separated from the mandatory parameters by ';'
• Scope of a variable
– By default, all variables are global (package variables)
– Scope can be changed by using the 'my' or 'local' keyword
• Changing the scope of a variable using 'my'
– Restricts the scope of a variable to the block in which it is defined
– It is not visible to subroutines called from the enclosing block
• Changing the scope of a variable using 'local'
– Gives a global variable a temporary value that lasts until the enclosing block exits
– The temporary value is visible to subroutines called from the enclosing block
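
The difference between my and local is easiest to see from inside a called subroutine. The sketch below uses hypothetical names; the global is declared with our so that the script also runs under use strict.

```perl
use strict;
use warnings;

our $level = 'global';                # package (global) variable

sub show_level { return $level; }     # always reads the package variable

sub with_local {
    local $level = 'local';           # temporary value, seen by called subroutines
    return show_level();
}

sub with_my {
    my $level = 'lexical';            # separate variable, invisible to called subroutines
    return show_level();
}

my @results = (with_local(), with_my(), show_level());
print "@results\n";
```

The last call shows that the value set by local has been restored once with_local returned.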


Perl - Subroutine contd

• Reference variables
– A reference variable stores the reference (address) of another variable
– Variables can be accessed directly through reference variables
– Reference to a scalar
• Command line arguments
– Command line arguments are stored in the built-in array @ARGV when a Perl script is executed
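
References and @ARGV can be sketched as follows. The variable names are made up; the array reference is included to show the arrow operator mentioned earlier in the deck.

```perl
use strict;
use warnings;

my $count     = 10;
my $count_ref = \$count;          # reference to a scalar
$$count_ref  += 5;                # dereference: this changes $count itself

my @list     = (1, 2, 3);
my $list_ref = \@list;            # reference to an array
my $second   = $list_ref->[1];    # arrow operator retrieves an element

# @ARGV holds the command line arguments (empty if none were given).
my $arg_count = scalar @ARGV;
foreach my $arg (@ARGV) {
    print "argument: $arg\n";
}

print "count=$count second=$second args=$arg_count\n";
```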


Perl - File I/O

• Data that outlives the program
– needs to be stored
– in permanent storage
– so that it can be referred to later on
• Example
– A word processing application might keep the text in a linked list or some other data structure while the application is running, but when the application is terminated the contents of the linked list need to be stored in a file


Perl - File I/O contd

• Opening a file
    open FILEHANDLE, FileName
– Returns a true value if the file was opened successfully, a false value otherwise
    open (MyFileHandle, "MyData.txt")
– The file is opened for input if the file name is prefixed by '<'
    open (MyFileHandle, "<MyData.txt")
– The file is opened for output if the file name is prefixed by '>'
• If the file exists, its contents are erased
• If the file doesn't exist, a new file is created
    open (MyFileHandle, ">MyData.txt")
– The file is opened for appending if the file name is prefixed by '>>'
• If the file doesn't exist, a new file is created
    open (MyFileHandle, ">>MyData.txt")


Perl - File I/O contd

• Opening a file contd
– The file is opened for reading and writing if the file name is prefixed by '+>' or '+<'
• '+>' erases the existing contents and opens the file for read and write
• '+<' must be used if the file has to be opened for updating the existing contents
    open (MyFileHandle, "+<MyData.txt")
• Handling errors in file open
– An alternate statement can be attached to the open statement using 'or'
– The alternate statement is executed if opening the file fails
• open FileHandle, FileName or {Alternate Statement}
• Closing a file
– Closing a file flushes any buffered data to the file
– Returns true if successful, false otherwise
    close FileHandle


Perl - File I/O contd

• Reading from a file using the <> (angle) operator
– <FileHandle> returns the next line of input from the file
– If <> alone is specified, input is taken from the files named on the command line, or from STDIN if there are none
– Returns undef when the end of the file is reached
• Reading from a file using readline
– readline (FileHandle) returns one line of data from the file
– The undefined value is returned if the end of the file is reached
• Reading from a file using getc
– getc(FileHandle) returns a character from the file
– The undefined value is returned if the end of the file is reached
• Writing unformatted records into a file using print
– print FileHandle Message
• Returns true if successful, false otherwise
– print FileHandle Message or Alternate statement
• Returns false and executes the alternate statement if unsuccessful, true otherwise
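
The open, write, append, read and close steps from the File I/O slides can be put together as a sketch. It departs from the slides in two labeled ways: it uses the three-argument form of open with lexical filehandles (a widely recommended alternative to the two-argument bareword form shown above), and it writes to a temporary file from the core File::Temp module instead of MyData.txt so that nothing in the current directory is overwritten.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: a temporary file stands in for the slides' MyData.txt.
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close $tmp_fh;

# '>' opens for output and erases any existing contents.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!";
print {$out} "line one\n" or die "Cannot write to $file: $!";
print {$out} "line two\n";
close($out) or die "Cannot close $file: $!";

# '>>' opens for appending.
open(my $app, '>>', $file) or die "Cannot open $file for appending: $!";
print {$app} "line three\n";
close($app);

# '<' opens for input; <$in> returns undef at end of file, ending the loop.
open(my $in, '<', $file) or die "Cannot open $file for reading: $!";
my @lines;
while (my $line = <$in>) {
    chomp $line;                  # drop the trailing newline
    push @lines, $line;
}
close($in);

print scalar(@lines), " lines read: @lines\n";
```

Including $! in the die message reports the operating system's reason for the failure, which the plain "or die" form on the slides would not show.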


Questions
