
Perl Basics

A Perl Tutorial

NLP Course - 2003

What is Perl?
  • Practical Extraction and Report Language
  • Interpreted language
  • Optimized for string manipulation and file I/O
  • Full support for regular expressions

Running Perl Scripts
  • Windows
      • Download ActivePerl from ActiveState
      • Just run the script from a 'Command Prompt' window
  • UNIX – Cygwin
      • Put the following in the first line of your script
          #!/usr/local/bin/perl
      • Make the script executable
          % chmod +x script_name
      • Run the script
          % ./script_name

Basic Syntax
  • Statements end with a semicolon
  • Comments start with '#'
      • Only single-line comments
  • Variables
      • You don't have to declare a variable before you access it
      • You don't have to declare a variable's type

Scalars and Identifiers
  • Identifiers
      • A variable name
      • Case sensitive
  • Scalar
      • A single value (string or numerical)
      • Accessed by prefixing an identifier with '$'
      • Assignment with '='
          $scalar = expression

Strings
  • Quoting Strings
      • With ' (apostrophe): everything is interpreted literally
      • With " (double quotes): variables get expanded
      • With ` (backtick): the text is executed as a separate process, and the output of the command is returned as the value of the string

Check 01_printDate.pl
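
The three quoting styles can be compared side by side. This is a minimal sketch, not the course's 01_printDate.pl: the variable names are made up for illustration, and the backtick command calls the running perl interpreter (through the special variable $^X) only so that the example does not depend on any particular system command.

```perl
use strict;
use warnings;

my $name = "Perl";

# Single quotes: the text is taken literally, so $name is not expanded.
my $literal = 'Hello, $name';

# Double quotes: variables are expanded.
my $expanded = "Hello, $name";

# Backticks: the text runs as a separate process and its output is returned.
# $^X is the path of the perl interpreter running this script.
my $output = `$^X -e "print 6*7"`;

print "$literal\n";    # Hello, $name
print "$expanded\n";   # Hello, Perl
print "$output\n";     # 42
```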

Comparison Operators

  String   Operation                        Arithmetic
  lt       less than                        <
  gt       greater than                     >
  eq       equal to                         ==
  le       less than or equal to            <=
  ge       greater than or equal to         >=
  ne       not equal to                     !=
  cmp      compare, returning -1, 0, or 1   <=>
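
Choosing the string or the arithmetic form matters because the same values can compare differently. The sketch below uses made-up values to show one such case.

```perl
use strict;
use warnings;

# Arithmetic comparison: 10 is not less than 9.
my $numeric_less = (10 < 9) ? "true" : "false";

# String comparison: "10" comes before "9", because "1" sorts before "9".
my $string_less = ("10" lt "9") ? "true" : "false";

# cmp and <=> return -1, 0, or 1.
my $string_order  = "apple" cmp "banana";   # -1
my $numeric_order = 10 <=> 9;               # 1

print "10 < 9 is $numeric_less\n";          # false
print "'10' lt '9' is $string_less\n";      # true
print "cmp gives $string_order, <=> gives $numeric_order\n";
```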

Logical Operators

  Operator   Operation
  ||, or     logical or
  &&, and    logical and
  !, not     logical not
  xor        logical xor
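
A short sketch of the logical operators in use, with illustrative variable names. It relies on two facts about Perl: the empty string counts as false, and || and && return one of their operands rather than a plain 1 or 0.

```perl
use strict;
use warnings;

my $empty = "";
my $name  = "perl";

# || returns the first true operand, which makes it handy for defaults.
my $chosen = $empty || $name;                        # "perl"

# && returns the first false operand, or else the last operand.
my $both = ($name && length $name);                  # 4

# ! negates a truth value; xor is true when exactly one side is true.
my $negated   = !$empty ? "yes" : "no";              # "yes"
my $exclusive = ($empty xor $name) ? "yes" : "no";   # "yes"

print "chosen=$chosen both=$both negated=$negated exclusive=$exclusive\n";
```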

String Operators

  Operator   Operation
  .          string concatenation
  x          string repetition
  .=         concatenation and assignment

  $string1 = "potato";
  $string2 = "head";
  $newstring = $string1 . $string2;   # "potatohead"
  $newerstring = $string1 x 2;        # "potatopotato"
  $string1 .= $string2;               # "potatohead"

Check concat_input.pl

Perl Functions
  • Perl functions are identified by their unique names (print, chop, close, etc.)
  • Function arguments are supplied as a comma-separated list in parentheses
      • The commas are necessary
      • The parentheses are often not
      • Be careful! You can write some nasty and unreadable code this way!

Check 02_unreadable.pl
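
To see how optional parentheses affect readability, the sketch below writes the same nested call twice. It is an illustration with made-up data, not the contents of 02_unreadable.pl.

```perl
use strict;
use warnings;

my @words = ("b", "c", "a");

# Fully parenthesized: each call shows its own argument list.
my $with_parens = join(", ", reverse(sort(@words)));

# The same calls without parentheses. The commas are still required.
my $without_parens = join ", ", reverse sort @words;

print "$with_parens\n";      # c, b, a
print "$without_parens\n";   # c, b, a
```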

Lists
  • Ordered collection of scalars
      • Zero indexed (first item in position '0')
      • Elements addressed by their positions
  • List Operators
      • (): list constructor
      • , : element separator
      • []: take slices (single or multiple element chunks)

List Operations
  • sort(LIST): a new list, the sorted version of LIST
  • reverse(LIST): a new list, the reverse of LIST
  • join(EXPR, LIST): a string version of LIST, delimited by EXPR
  • split(PATTERN, EXPR): a list of the portions of EXPR that are separated by matches of PATTERN

Check 03_listOps.pl
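
The list operations chain together naturally. The following sketch uses a made-up input line and is not the course's 03_listOps.pl.

```perl
use strict;
use warnings;

my $line = "delta alpha charlie bravo";

# split breaks the string apart wherever the pattern (a space) matches.
my @words = split(/ /, $line);

my @sorted   = sort(@words);       # alpha bravo charlie delta
my @reversed = reverse(@sorted);   # delta charlie bravo alpha

# join glues the elements back together with the given delimiter.
my $joined = join("-", @reversed);

# [] takes a slice; here the first two elements of the sorted list.
my @first_two = (sort(@words))[0, 1];

print "$joined\n";      # delta-charlie-bravo-alpha
print "@first_two\n";   # alpha bravo
```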

Arrays
  • A named list
      • Dynamically allocated, can be saved
      • Zero-indexed
      • Shares list operations, and adds to them
  • Array Operators
      • @: refers to the array (or a portion of it, with [])
      • $: refers to a single element (used with [])

Array Operations
  • push(@ARRAY, LIST): add the LIST to the end of the @ARRAY
  • pop(@ARRAY): remove and return the last element of @ARRAY
  • unshift(@ARRAY, LIST): add the LIST to the front of @ARRAY
  • shift(@ARRAY): remove and return the first element of @ARRAY
  • scalar(@ARRAY): return the number of elements in the @ARRAY

Check 04_arrayOps.pl
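
A brief sketch of the array operations, with the array contents shown after each step. The array name and values are illustrative, not taken from 04_arrayOps.pl.

```perl
use strict;
use warnings;

my @queue = ("b", "c");

push(@queue, "d", "e");       # b c d e
unshift(@queue, "a");         # a b c d e

my $last  = pop(@queue);      # returns "e", leaving a b c d
my $first = shift(@queue);    # returns "a", leaving b c d

my $count  = scalar(@queue);  # 3
my $second = $queue[1];       # $ with [] reads one element: "c"
my @pair   = @queue[0, 1];    # @ with [] takes a slice: b c

print "first=$first last=$last count=$count second=$second pair=@pair\n";
```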

Associative Arrays - Hashes
  • Arrays indexed on arbitrary string values
      • Key-Value pairs
      • Use the key to look up the element that holds the value
  • Hash Operators
      • % : refers to the hash
      • {}: denotes the key
      • $ : the value of the element indexed by the key (used with {})

Hash Operations
  • keys(%ARRAY): return a list of all the keys in the %ARRAY
  • values(%ARRAY): return a list of all the values in the %ARRAY
  • each(%ARRAY): iterates through the key-value pairs of the %ARRAY
  • delete($ARRAY{KEY}): removes the key-value pair associated with {KEY} from the %ARRAY
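
The hash operations might be used as follows. The hash name and its contents are hypothetical. Because Perl does not keep hash keys in any particular order, the sketch sorts the results of keys and values before printing them.

```perl
use strict;
use warnings;

my %age = ("alice" => 30, "bob" => 25, "carol" => 41);

$age{"dave"} = 19;           # add a new key-value pair
my $bob = $age{"bob"};       # look up a value by its key: 25
delete($age{"carol"});       # remove a key-value pair

my @names = sort(keys(%age));                  # alice bob dave
my @ages  = sort { $a <=> $b } values(%age);   # 19 25 30

# each returns one key-value pair per call until the hash is exhausted.
my $total = 0;
while (my ($name, $years) = each(%age)) {
    $total += $years;
}

print "names: @names\n";
print "ages: @ages\n";
print "total: $total\n";     # 74
```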

Pattern Matching
  • A pattern is a sequence of characters to be searched for in a character string
      • /pattern/
  • Match operators
      • =~: tests whether a pattern is matched
      • !~: tests whether a pattern is not matched
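
As a minimal illustration of the two match operators, using a made-up sample string:

```perl
use strict;
use warnings;

my $text = "the quick brown fox";

# =~ is true when the pattern is found somewhere in the string.
my $has_fox = ($text =~ /fox/) ? "yes" : "no";   # yes

# !~ is true when the pattern is not found.
my $no_dog = ($text !~ /dog/) ? "yes" : "no";    # yes

if ($text =~ /quick/) {
    print "found 'quick'\n";
}
print "has fox: $has_fox, no dog: $no_dog\n";
```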

Patterns

  Pattern        Matches
  /def/          "define"
  /\bdef\b/      a def word
  /^def/         def word
  /^def$/        def
  /de?f/         df, def
  /d[eE]f/       def, dEf
  /d[^eE]f/      daf, dzf
  /d.f/          dif
  /d.+f/         dabcf
  /d.*f/         df, daffff
  /de{1,3}f/     deef, deeef
  /de{3}f/       deeef
  /de{3,}f/      deeeeef
  /de{0,3}f/     up to deeef

Character Ranges

  Escape Sequence   Pattern           Description
  \d                [0-9]             Any digit
  \D                [^0-9]            Anything but a digit
  \w                [_0-9A-Za-z]      Any word character
  \W                [^_0-9A-Za-z]     Anything but a word character
  \s                [ \r\t\n\f]       White-space
  \S                [^ \r\t\n\f]      Anything but white-space

Backreferences
  • Memory of matched portion of input
      • /[a-z]+(.)[a-z]+\1[a-z]+/
      • Matches asd-eeed-sdsa and sd-sss-ws, but NOT as.eee-dfg
  • They can even be accessed immediately after the pattern is matched
      • (.) in the previous pattern is $1
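
The backreference example can be checked directly. This sketch runs the pattern against the three sample strings from the slide and records what $1 captured for each one that matches; the variable names are illustrative.

```perl
use strict;
use warnings;

my @inputs = ("asd-eeed-sdsa", "sd-sss-ws", "as.eee-dfg");

# Maps each matching input to the character captured by (.).
my %separator;

for my $input (@inputs) {
    # \1 must repeat exactly what (.) captured earlier in the same match.
    if ($input =~ /[a-z]+(.)[a-z]+\1[a-z]+/) {
        $separator{$input} = $1;   # $1 is available right after the match
        print "$input matches, separator is '$1'\n";
    } else {
        print "$input does not match\n";
    }
}
```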

Pattern Matching Options

  Option   Description
  g        Match all occurrences of the pattern
  i        Ignore case
  m        Treat string as multiple lines
  o        Only evaluate (compile) the pattern once
  s        Treat string as single line
  x        Ignore white-space in pattern
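
A sketch of several options in use, with made-up sample strings. In list context a match with the g option returns every match, which this example uses to count occurrences.

```perl
use strict;
use warnings;

my $text = "Cat cat CAT dog";

# g finds every occurrence; i ignores case.
my @all   = ($text =~ /cat/gi);    # Cat cat CAT
my $count = scalar(@all);          # 3

# x lets the pattern contain white-space for readability.
my $date = "2003-10-27";
my ($year, $month) = ($date =~ / (\d{4}) - (\d{2}) /x);

# m lets ^ and $ match at line boundaries inside the string;
# s lets . match a newline as well.
my $lines        = "first\nsecond";
my $line_match   = ($lines =~ /^second$/m)     ? "yes" : "no";   # yes
my $across_lines = ($lines =~ /first.second/s) ? "yes" : "no";   # yes

print "matches: @all ($count)\n";
print "year=$year month=$month\n";
print "m: $line_match, s: $across_lines\n";
```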

Substitutions
  • Substitution operator
      • s/pattern/substitution/options
  • If $string = "abc123def";
      • $string =~ s/123/456/
          Result: "abc456def"
      • $string =~ s/123//
          Result: "abcdef"
      • $string =~ s/(\d+)/[$1]/
          Result: "abc[123]def"
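
Substitution combines with the matching options listed earlier. The sketch below, on a made-up string with two runs of digits, contrasts a plain substitution with one using the g option. Each substitution works on a fresh copy so the results do not build on one another.

```perl
use strict;
use warnings;

my $string = "abc123def123";

# Without g, only the first match is replaced.
(my $first_only = $string) =~ s/123/456/;     # abc456def123

# With g, every match is replaced.
(my $all = $string) =~ s/123/456/g;           # abc456def456

# $1 refers to what (\d+) captured in each match.
(my $wrapped = $string) =~ s/(\d+)/[$1]/g;    # abc[123]def[123]

# s/// returns the number of substitutions it made.
my $letters = $string;
my $removed = ($letters =~ s/\d//g);          # 6, leaving "abcdef"

print "$first_only\n$all\n$wrapped\n$letters ($removed digits removed)\n";
```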