Lesson 2: String Operations, Writing Scripts

Fundamentals of Text Processing for Linguists

Na-Rae Han

Objectives

Writing and executing Python scripts

Learning basic Python syntax

Variable assignments: = vs. +=

Equality comparison: ==

String operations

Boolean operators: and, or, not

1/15/2014

Interactive programming vs. scripts

We have been programming INTERACTIVELY through the IDLE shell. IDLE interprets and responds to each line of code you type in, but you have to retype the code next time.

We will now try writing and running a stand-alone Python SCRIPT: a script file (x = 'a'+'b', print x, …) that is executed in your IDLE shell.

Python script in IDLE editor mode

1. Open up a new window (Ctrl+n).
2. A Python editor window opens up. Type in:
       print "Hello, world!!"
3. Save the script as "hello.py". Don't forget to include the ".py" extension!
4. Run the script by selecting "Run" > "Run Module" or pressing F5.
5. Success! The command executes in the IDLE shell window.
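A minimal sketch of what a slightly longer hello.py could look like. The slides use Python 2's print statement; the parenthesized form below is an assumption made so that the same file also runs under Python 3. The variable name greeting is illustrative and not from the slides.

```python
# hello.py: a stand-alone script. Every line runs, top to bottom, when you press F5.
greeting = 'Hello, world!!'   # illustrative variable name
print(greeting)               # parenthesized print works in Python 2 and Python 3
```

Unlike code typed into the shell, the file can be re-run later without retyping anything.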

String operations

String: A single piece of text, composed of a sequence of characters.

Operations on string objects:

print statement: prints a string
len(): returns an integer
+, +=: return a string
.endswith(), .startswith(): return True/False
in: returns True/False
.upper(), .lower(): return a string
.replace(): returns a string
.split(): splits a string into a list and returns it
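The operations listed above can be tried together in one short script. This is only a sketch: the sample word and all variable names are illustrative, and print() is written with parentheses so the code runs in Python 3 as well as Python 2.

```python
word = 'linguistics'                  # illustrative sample string

n_chars = len(word)                   # an integer
longer = word + '!'                   # a new string
ends_ok = word.endswith('ics')        # True or False
has_gui = 'gui' in word               # True or False
shouted = word.upper()                # a new string
swapped = word.replace('i', 'I')      # a new string
pieces = 'a b c'.split()              # a list of strings

print(n_chars)
print(pieces)
```

None of these operations changes word itself; each one returns a new value.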

Concatenation and variable assignment

>>> v = 'walk'

>>> v + 'ed'

'walked'

>>> v

'walk'

>>> vd = v + 'ed'

>>> vd

'walked'

>>> v

'walk'

>>> v = v + 'ed'

>>> v

'walked'

>>> v == vd

True

Concatenation does not change original

Concatenation operation: v + 'ed' returns 'walked', but the value of v is unchanged!

New variable

vd = v + 'ed': the new variable vd is assigned the output of the concatenation. v is still unaffected.

Changing the original variable

v = v + 'ed': here, v is assigned a new value: its former self suffixed with 'ed'.

==: Equality comparison

v == vd: the double equal sign confirms that v and vd have an equal value!

Equality comparison vs. assignment

=     name = value         Attaches a name to a value
==    value1 == value2     Compares two values, returns True or False
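The difference between = and == can be seen by re-running the 'walk' example as a script. This is a sketch; the names same_before and same_after are illustrative additions.

```python
v = 'walk'                 # = attaches the name v to the string 'walk'
vd = v + 'ed'              # concatenation builds a new string; v is untouched
same_before = (v == vd)    # == compares 'walk' with 'walked'
v = v + 'ed'               # now v is re-assigned to the longer string
same_after = (v == vd)     # == compares 'walked' with 'walked'

print(same_before)         # False
print(same_after)          # True
```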

Right Hand Side (RHS) first

>>> num = 5
>>> num = num + 2
>>> num
7

num has different values in one statement: in num = num + 2, the num on the right-hand side has value 5, and the num on the left-hand side ends up with value 7. How could this be?

Answer: The right-hand side is evaluated first, and then variable assignment happens.

1. Integer 5 is created in Python memory.
2. Variable num is created and points to that memory location.
3. num + 2 is evaluated; the value 7 is created in memory.
4. The name num is then attached to integer 7 in Python memory.
5. 5 is no longer needed; it can be freed up from memory.
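The steps above can be sketched with a second name attached to the same value. The name other is hypothetical and added only for illustration; it shows that re-assigning num attaches the name to a new value and does not modify the old one.

```python
num = 5
other = num        # hypothetical second name attached to the same value 5
num = num + 2      # right-hand side (5 + 2) is evaluated first, then num is re-attached

print(num)         # 7
print(other)       # 5: the old value was not modified; 'other' still refers to it
```

In this sketch the value 5 is still in use because other refers to it; step 5 above applies only once no name refers to a value any more.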

+ and +=

>>> homer = 'doh'

>>> print homer

doh

>>> homer = homer + '!'

>>> print homer

doh!

>>> homer = homer + '!'

>>> print homer

doh!!

>>> homer += '!'

>>> print homer

doh!!!

>>> homer += '!'

>>> print homer

doh!!!!

homer = homer + '!' and homer += '!' do exactly the same thing: suffixing homer with '!'.

Augmented assignments

foo += 10      foo = foo + 10      Add 10
foo -= 10      foo = foo - 10      Subtract 10
foo *= 10      foo = foo * 10      Multiply by 10
foo /= 10      foo = foo / 10      Divide by 10
foo **= 10     foo = foo ** 10     Raise to the power of 10
foo %= 10      foo = foo % 10      Modulo 10
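Each augmented assignment can be checked against its long form. This is a sketch with illustrative variable names and starting values. The division example uses a float on purpose: with two integers, / behaves differently in Python 2 (integer division) and Python 3 (true division).

```python
a = 7
a += 10        # same as a = a + 10

b = 7
b -= 10        # same as b = b - 10

c = 7
c *= 10        # same as c = c * 10

d = 70.0       # float, so the result is the same in Python 2 and Python 3
d /= 10        # same as d = d / 10

e = 2
e **= 10       # same as e = e ** 10

f = 17
f %= 10        # same as f = f % 10 (remainder after dividing by 10)

print(a)       # 17
print(e)       # 1024
print(f)       # 7
```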

Practice

2 minutes

Repeated commands: Use and!

>>> chor = 'tra'

>>> chor = chor + '-la'

>>> print chor

tra-la

>>> chor += '-la'

>>> print chor

tra-la-la

>>> chor += '-la'

>>> print chor

tra-la-la-la

>>> num = 2

>>> num += 1

>>> print num

3

>>> num += 1

>>> print num

4

>>> num *= 2

>>> print num

8

>>> num *= 2

>>> print num

16

Also try: -= /= **=

.startswith(), .endswith()

>>> len('cat')

3

>>> 'cat'.endswith('t')

True

>>> 'cat'.startswith('t')

False

>>> 'cat'.endswith('')

True

>>> c = 'cat'

>>> c.endswith('at')

True

>>> c.endswith('cat')

True

>>> c.startswith('cat')

True

.endswith() can be called on a string, as in 'cat'.endswith('t'), or on a variable whose value is a string, as in c.endswith('at').
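As a sketch of how these methods might be used on words, the example below checks for an '-ing' ending and an 'un-' beginning. The words and variable names are illustrative, not from the slides.

```python
verb = 'walking'                          # illustrative example word

is_ing_form = verb.endswith('ing')        # True
is_un_prefixed = verb.startswith('un')    # False
sing_matches = 'sing'.endswith('ing')     # also True

print(is_ing_form)
print(sing_matches)
```

The last line shows a limitation: the methods only compare characters, so 'sing' matches even though its 'ing' is not a suffix.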

Boolean operators: and, or, not

>>> 'cat'.endswith('at')

True

>>> not 'cat'.endswith('at')

False

>>> not True

False

>>> not False

True

>>> 'cat'.endswith('at') and 'dog'.endswith('g')

True

>>> 'cat'.endswith('at') and 'dog'.endswith('at')

False

>>> 'cat'.endswith('at') or 'dog'.endswith('at')

True

>>> 'cat'.endswith('at') and not 'dog'.endswith('at')

True

not is a unary operator: not A flips the truth value of A.

and and or are binary operators: they take two arguments, as in A and B, A or B.
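A small sketch that stores the results of two tests and then combines them, which can make longer Boolean expressions easier to read. The variable names are illustrative.

```python
word = 'cat'
ends_at = word.endswith('at')          # True
starts_d = word.startswith('d')        # False

both = ends_at and starts_d            # and: True only if both sides are True
either = ends_at or starts_d           # or: True if at least one side is True
flipped = not starts_d                 # not: flips False to True
combined = ends_at and not starts_d    # not is applied before and

print(both)        # False
print(either)      # True
print(combined)    # True
```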

in as a substring operation

>>> 'a' in 'cat'

True

>>> 'ca' in 'cat'

True

>>> 'cat' in 'scattered'

True

>>> 'et' in 'scattered'

False

>>> '' in 'scattered'

True

in is a "substring" operator: A in B is True if A is a substring of B, False otherwise.

not in

Negating in with not:

>>> 'cat' not in 'scattered'
False
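A short sketch of in and not in, including one case the slides do not show: matching is case-sensitive. Variable names are illustrative.

```python
word = 'scattered'

has_cat = 'cat' in word              # True
lacks_et = 'et' not in word          # True
same_thing = not ('et' in word)      # True: equivalent to the line above
capital_cat = 'Cat' in word          # False: 'C' and 'c' are different characters

print(has_cat)
print(capital_cat)
```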

The empty string

>>> ''

''

>>> len('')

0

>>> 'cat'.startswith('')

True

>>> 'cat'.endswith('')

True

>>> ''.startswith('')

True

>>> ''.endswith('')

True

>>> ''.endswith('t')

False

>>> '' in 'cat'

True

>>> '' in ''

True

Strings can have 0 length

'' is an empty string; its length is 0.

'' starts and ends every string

The empty string '' begins and ends every string, even the empty string itself!

'' is a substring of every string

The empty string '' is a substring of every string, even the empty string itself!
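One way to make sense of these results is that concatenating '' changes nothing. The sketch below checks that alongside the slide's tests; variable names are illustrative.

```python
empty = ''
word = 'cat'

length = len(empty)                    # 0
starts = word.startswith(empty)        # True
ends = word.endswith(empty)            # True
inside = empty in word                 # True
unchanged = (empty + word + empty == word)   # True: adding '' changes nothing

print(length)
print(unchanged)
```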

Practice

2 minutes

Try the Boolean operator examples (and, or, not) and the in examples from the preceding slides. Also try with the empty string ''.

.upper(), .lower(), .capitalize()

>>> sign = 'Please be quiet'

>>> sign.upper()

'PLEASE BE QUIET'

>>> sign.lower()

'please be quiet'

>>> sign.capitalize()

'Please be quiet'

>>> 'hello!'.upper()

'HELLO!'

>>> 'hello!'.upper().lower()

'hello!'

>>> 'hello!'.capitalize()

'Hello!'

.upper(), .lower()

.upper() and .lower() upper-case and lower-case every character in a string.

.capitalize()

.capitalize() capitalizes the first character in the string (and lower-cases the rest).

Nested functions

'hello!'.upper().lower() uppercases 'hello!' and then lowercases it back!
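A sketch using a mixed-case string, which makes the effect of each method easier to see than the slide's examples. The string and variable names are illustrative.

```python
sign = 'please BE quiet'             # illustrative mixed-case string

upper = sign.upper()                 # every letter upper-cased
lower = sign.lower()                 # every letter lower-cased
cap = sign.capitalize()              # first character upper-cased, the rest lower-cased
chained = sign.upper().lower()       # .lower() is applied to the result of .upper()

print(upper)       # PLEASE BE QUIET
print(cap)         # Please be quiet
print(chained)     # please be quiet
print(sign)        # please BE quiet (the original is unchanged)
```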

.replace()

>>> foo = 'hello, world!'

>>> foo.replace('l', 'r')

'herro, worrd!'

>>> foo.replace('l', 'r').replace('r', 'l')

'hello, wolld!'

>>> faa = 'colour rumour'

>>> faa.replace('our', 'or')

'color rumor'

>>> mary = 'Mary had a little lamb'

>>> print mary

Mary had a little lamb

>>> mary.replace(' ', '')

'Maryhadalittlelamb'

foo.replace('l', 'r') replaces every instance of 'l' with 'r'.

Stacking .replace()

foo.replace('l', 'r').replace('r', 'l') replaces every instance of 'l' with 'r', and then replaces every 'r' with 'l'.

faa.replace('our', 'or') replaces a string ('our') with another string ('or'). Text has been Americanized!

.replace() for removing

Removing every space: achieved by replacing ' ' with the empty string ''.
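Splitting the stacked call into two steps shows why the second .replace() does not undo the first. This is a sketch; step1, step2 and no_spaces are illustrative names.

```python
foo = 'hello, world!'

step1 = foo.replace('l', 'r')        # every 'l' becomes 'r'
step2 = step1.replace('r', 'l')      # every 'r' becomes 'l', including the original 'r' in 'world'
no_spaces = 'Mary had a little lamb'.replace(' ', '')   # replace each space with ''

print(step1)       # herro, worrd!
print(step2)       # hello, wolld!
print(foo)         # hello, world! (the original is unchanged)
print(no_spaces)   # Maryhadalittlelamb
```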

Splitting a string with .split()

>>> mary = 'Mary had a little lamb'

>>> mary

'Mary had a little lamb'

>>> print mary

Mary had a little lamb

>>> mary.split()

['Mary', 'had', 'a', 'little', 'lamb']

>>> mary.split(' ')

['Mary', 'had', 'a', 'little', 'lamb']

>>> mary.split('a')

['M', 'ry h', 'd ', ' little l', 'mb']

>>> len(mary)

22

>>> len(mary.split())

5

mary.split(' ') splits on every ' ' (space); mary.split('a') splits on every 'a'.

mary.split(), with no separator given, splits on whitespace. Same result as mary.split(' ') in this case, but not always.

len() works with strings and lists

len(string) returns the length of the string: the number of characters.
len(list) returns the number of items in the list.
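Combining len() with .split() gives a simple character count and word count. A minimal sketch using the slide's sentence; the variable names are illustrative.

```python
mary = 'Mary had a little lamb'

n_chars = len(mary)          # length of the string: characters, spaces included
words = mary.split()         # a list of words
n_words = len(words)         # length of the list: number of items

print(n_chars)               # 22
print(n_words)               # 5
```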

Splitting and whitespace characters

>>> hi = 'Hello mother,\nHello father.'

>>> print hi

Hello mother,

Hello father.

>>> hi.split()

['Hello', 'mother,', 'Hello', 'father.']

>>> hi.split('\n')

['Hello mother,', 'Hello father.']

>>> record = 'Lisa Simpson\tBorn 8/12 2002\tSpringfield'

>>> print record

Lisa Simpson    Born 8/12 2002  Springfield

>>> record.split('\t')

['Lisa Simpson', 'Born 8/12 2002', 'Springfield']

So, as long as a string contains only spaces and no other whitespace, .split() is the same as .split(' '), right?

Nope!

>>> la = 'la di   da'
>>> la.split()
['la', 'di', 'da']
>>> la.split(' ')
['la', 'di', '', '', 'da']
>>> foo = 'colorless green ideas'
>>> foo.split('e')
['colorl', 'ss gr', '', 'n id', 'as']

Note the three spaces '   ' between 'di' and 'da'.

.split() default behavior

If no separator is specified:
  Splits on whitespace (includes space, line break \n, tab \t).
  Repeated whitespace does NOT result in empty string tokens.

If a separator is specified:
  Repetition DOES result in empty string tokens.
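The two behaviors can be compared side by side. This is a sketch; the variable names and the mixed-whitespace string are illustrative additions.

```python
la = 'la di   da'                  # three spaces between 'di' and 'da'

default = la.split()               # no separator: runs of whitespace count as one break
on_space = la.split(' ')           # separator given: every single space is a break
mixed = 'la\tdi\n da'.split()      # tab, line break and space are all whitespace

print(default)     # ['la', 'di', 'da']
print(on_space)    # ['la', 'di', '', '', 'da']
print(mixed)       # ['la', 'di', 'da']
```

For counting words, the no-separator form is usually the safer choice, since it never produces empty string tokens.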

Practice 1

2 minutes

Practice 2

2 minutes

Practice 3

2 minutes

Practice 4

3 minutes

Write a Python script that:

  internally stores the "Fox in Sox" text as a string variable named "fox" (http://www.pitt.edu/~naraehan/ling1901/sample-texts.txt),
  prints out the text,
  prints out how many lines are in the text,
  prints out how many words are in the text,
  prints out how many characters are in the text,
  prints out how many times 'ree' occurs in the text. Use the .count() method. Try and figure it out!
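As a warm-up, here is a sketch of the same counts on a short stand-in text. The text below is made up and is NOT the "Fox in Sox" sample; the variable names are illustrative as well. The actual exercise is left to you.

```python
# Stand-in text (not the "Fox in Sox" sample): two lines separated by \n.
text = 'three free trees\nfree three trees'

n_lines = len(text.split('\n'))    # split on line breaks, then count the pieces
n_words = len(text.split())        # split on whitespace, then count the pieces
n_chars = len(text)                # every character counts, including spaces and \n
n_ree = text.count('ree')          # number of non-overlapping occurrences of 'ree'

print(n_lines)     # 2
print(n_words)     # 6
print(n_chars)     # 33
print(n_ree)       # 6
```

If a text ends with a line break, splitting on '\n' yields an extra empty string at the end, so the line count may need adjusting.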

A note on Python syntax

Looking ahead

Conditional: if ... : elif ... : else :

>>> grade = 87
>>> if grade >= 90 :
        print 'You got an A'
    elif grade >= 80 :
        print 'You got a B'
    else :
        print 'Try harder!'

You got a B
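The same conditional written as a script might look like the sketch below. Storing the result in a variable named message is an illustrative choice, not from the slide, and print() is parenthesized so the code runs in Python 3 as well as Python 2.

```python
grade = 87

# Conditions are tested top to bottom; only the first one that is True runs.
if grade >= 90:
    message = 'You got an A'
elif grade >= 80:
    message = 'You got a B'
else:
    message = 'Try harder!'

print(message)     # You got a B
```

The indentation is part of the syntax: the indented lines are the ones that belong to each branch.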

>>>

Wrap-up

If you want to try more commands, visit A Beginner's Python Tutorial, Lesson 6:
http://www.sthurlow.com/python/lesson06/

Exercise #2: http://www.pitt.edu/~naraehan/ling1991/exercise.html#ex2
Due Tuesday midnight