Textpad and Regular Expressions
Textpad and Finding Things
A primer for the uninitiated
Graham Lally
April 2012
[email protected]
http://www.ocsi.co.uk/
Coming up
• Hello Textpad
• Learning to love Regular Expressions
• Other useful things
Why Textpad?
Text-editing Lots of options
http://www.textpad.com/
http://www.flickr.com/photos/eldholm/2354982554/ http://www.flickr.com/photos/grandmaitre/2387376806/
+
Here are the files you have open
Write things here
Configure > Preferences
Basic set-up
Allow multiple instances to run
Replace Notepad
Put shortcut to Textpad on: Context Menu
Also set under "Editor" settings:
Use POSIX regular expression syntax
This means we can use ( ) and { } instead of \( \) and \{ \}
Shortcuts to get started
Search current file(s) F5
Search and replace F8
Search through a whole directory Ctrl + F5
Regular Expressions
Why (oh why)?
Regular expressions (REs / RegExps) let you search for a *range* of things in one go, rather than just a single thing.
Old: Find "Kings Cross"
Out of Kings X, King's X, Kings Cross and King's Cross, this finds only Kings Cross.
RE: Find "King'?s (Cross|X)"
This finds all four: Kings X, King's X, Kings Cross and King's Cross.
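To try the same comparison outside Textpad, here is a minimal sketch using Python's built-in re module. Python is an assumption here (the slides only cover Textpad), but this particular pattern is written the same way in both.

```python
import re

# Pattern from the slide: optional apostrophe, then "Cross" or "X".
pattern = re.compile(r"King'?s (Cross|X)")

variants = ["Kings X", "King's X", "Kings Cross", "King's Cross"]

# A plain-text search for "Kings Cross" only hits one variant...
plain_hits = [v for v in variants if "Kings Cross" in v]

# ...while the regular expression hits all four.
regex_hits = [v for v in variants if pattern.fullmatch(v)]

print(plain_hits)
print(regex_hits)
```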
Why Regular Expressions?
Also find:
Any letter/number
Any letter followed by any number
Any number at the start/end of a line
Any non-numbers
etc.
My first regular expression
Regular expression
Find: .a
F5:
".a"
"." is a special character in REs, meaning find any character (including letters, numbers, spaces and punctuation)
So "..." finds any 3 characters in a row: abc, 123, y r, even ...
.a Find any character followed by an "a", e.g. ba, 1a, %a
And "a.c" will find abc, aac, a6c, a?c, a c, a[c, a-c, etc
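The "." examples above can be checked with a short Python sketch. Python's re module stands in for Textpad's Find dialogue here, which is an assumption; the sample strings are made up for illustration.

```python
import re

# ".a" finds any single character followed by an "a".
dot_a_hits = re.findall(r".a", "ba 1a %a")

# "a.c" matches whenever exactly one character sits between "a" and "c".
candidates = ["abc", "aac", "a6c", "a?c", "a c", "a[c", "a-c", "ac", "abbc"]
a_dot_c_hits = [c for c in candidates if re.fullmatch(r"a.c", c)]

# "..." matches any three characters in a row, even three literal dots.
three_chars = [s for s in ["abc", "123", "y r", "...", "ab"] if re.fullmatch(r"...", s)]

print(dot_a_hits)
print(a_dot_c_hits)
print(three_chars)
```

Only "ac" and "abbc" are rejected by a.c, because "." stands for exactly one character.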
Characters to watch out for
In Textpad*, most characters do what they should.
Some don't:
. ? * + | [ ] ( ) { } ^ $
To actually look for one of these, put a "\" before it. e.g. \? \(
To look for a \, use \\
(* Other RegExp software may differ)
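As a rough illustration of escaping, the sketch below uses Python's re module, where the same backslash rule applies (Textpad itself may differ in details). The sample text is invented, and re.escape() is a Python convenience, not a Textpad feature.

```python
import re

text = r"Cost? (approx.) 3+4 C:\temp"

# Unescaped, "." is a wildcard, so 3.4 matches both "3+4" and "374".
wildcard_hits = re.findall(r"3.4", "3+4 374")

# Escaped, each special character stands only for itself.
question_marks = re.findall(r"\?", text)
open_brackets = re.findall(r"\(", text)
plus_hits = re.findall(r"3\+4", "3+4 374")

# A literal backslash is written as \\ in the pattern.
backslashes = re.findall(r"\\", text)

# re.escape() adds the backslashes for you when the search text is fixed.
escaped = re.escape("(approx.)")

print(wildcard_hits, question_marks, open_brackets, plus_hits, backslashes, escaped)
```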
Finding other things
[...] Find any single character listed in the [ ]
e.g. [abc][123] will match a1, c1, b2, c3, etc
...|... Find anything that matches before or after the |
e.g. abc|123 will match abc and 123
Can also be more than 2: abc|123|xyz|000
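A small Python sketch of the two ideas above, using the slide's own patterns. The test strings are invented, and Python's re module is assumed to behave like Textpad for these simple cases.

```python
import re

# [abc][123]: one character from the first set, then one from the second.
set_pattern = re.compile(r"[abc][123]")
set_results = {s: bool(set_pattern.fullmatch(s)) for s in ["a1", "c1", "b2", "c3", "d1", "a4"]}

# abc|123|xyz|000: any one of the alternatives.
alt_pattern = re.compile(r"abc|123|xyz|000")
alt_hits = alt_pattern.findall("abc 456 xyz 000 123")

print(set_results)
print(alt_hits)
```

"d1" and "a4" fail because "d" and "4" are not listed in their respective [ ] sets.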
Counting things
Use +, * or ? after a character to indicate how many times it should occur:
+ Find one or more of the preceding character, e.g. ab+c will find abc, abbc, abbbbbc, etc
* Find zero or more of the preceding character, e.g. ab*c will find ac, abc, abbc, abbbbbc, etc
? Find zero or one of the preceding character (i.e. "maybe"), e.g. ab?c will find ac and abc, but not abbc
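The three counting symbols can be compared side by side with a short Python sketch (an assumption: Python's re treats +, * and ? the same way as Textpad does). The helper name matching is made up for this example.

```python
import re

candidates = ["ac", "abc", "abbc", "abbbbbc"]

def matching(pattern):
    """Return the candidates that the pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

plus_hits = matching(r"ab+c")    # one or more b
star_hits = matching(r"ab*c")    # zero or more b
maybe_hits = matching(r"ab?c")   # zero or one b

print(plus_hits)
print(star_hits)
print(maybe_hits)
```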
Counting things
Use {min,max} after a character to set your own limits
e.g. ab{4,6}c will find abbbbc, abbbbbc and abbbbbbc
max is optional too
e.g. ab{3,}c will find abbbc, abbbbbbbbbbbbbbbbbbc, etc
min isn't optional in Textpad, so use 0 instead
e.g. ab{0,3}c will find ac, abc, abbc, abbbc
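To see exactly which counts each {min,max} accepts, here is a hedged Python sketch. The helper b_counts is hypothetical; it simply builds "a", some number of b's, then "c", and reports which lengths match.

```python
import re

def b_counts(pattern, upto=8):
    """Return how many b's (from 0 to upto) make 'a' + b's + 'c' match."""
    return [n for n in range(upto + 1) if re.fullmatch(pattern, "a" + "b" * n + "c")]

between_4_and_6 = b_counts(r"ab{4,6}c")
at_least_3 = b_counts(r"ab{3,}c")
at_most_3 = b_counts(r"ab{0,3}c")

print(between_4_and_6)
print(at_least_3)
print(at_most_3)
```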
More character ranges
Textpad also has some "named" ranges you can use, e.g.
[:alpha:] Any letter
[:digit:] Any digit
[:blank:] Space or tab
[:space:] Space, tab, return, line feed, et al
These go inside other [ ] e.g. [[:alpha:][:digit:]] finds any letter or number.
See TextPad's Help (available from the Find dialogue) for more.
Character ranges
Writing [0123456789] every time is silly.
Instead, use [...-...] (dash) to specify a range of characters.
e.g. [0-9] to find any single digit
[a-z] to find any single letter
These can be combined with themselves, plus other characters
e.g. [a-z0-9:\.] to find any letter, number, colon or period
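A quick Python sketch of dash ranges, using the slide's patterns on invented sample text. One assumption to flag: Python's re is case-sensitive by default, so [a-z] below only covers lowercase letters. The "+" after the second set is added so that whole runs are returned instead of single characters.

```python
import re

# [0-9] matches one digit at a time.
digits = re.findall(r"[0-9]", "a1b22")

# [a-z0-9:\.] matches a lowercase letter, digit, colon or period.
runs = re.findall(r"[a-z0-9:\.]+", "room 12b: ok. END!")

print(digits)
print(runs)
```

"END!" produces nothing because uppercase letters and "!" are not in the set.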
Line and word endings
\n Lets you look for line breaks, e.g. Hello\nworld will find Hello at the end of one line followed by world at the start of the next
\< and \> Match the start and end of a word
^ Match the start of a line e.g. ^: will find any colon at the start of a line
$ Match the end of a line e.g. \.$ will find any period at the end of a line
Combine these to match whole lines, e.g. ^\?.*$ to match all lines that start with a question mark
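The line and word markers can be sketched in Python as follows. Two differences from Textpad are assumptions worth flagging: Python needs the re.MULTILINE flag before ^ and $ work per line, and it has no \< or \>, so \b (a word boundary) is used in their place. The sample text is invented.

```python
import re

text = "?why\n: note\nhello world.\nend?"

# Whole lines that start with a question mark.
question_lines = re.findall(r"^\?.*$", text, flags=re.MULTILINE)

# A colon at the start of a line, and a period at the end of a line.
leading_colons = re.findall(r"^:", text, flags=re.MULTILINE)
trailing_periods = re.findall(r"\.$", text, flags=re.MULTILINE)

# \n matches the line break itself.
spans_lines = re.search(r"Hello\nworld", "Hello\nworld") is not None

# \b...\b stands in for Textpad's \<...\> to match a whole word only.
whole_words = re.findall(r"\bworld\b", "world worldly underworld")

print(question_lines, leading_colons, trailing_periods, spans_lines, whole_words)
```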
Getting more complicated
Use (...) to set up "groups" - these can then be checked and counted in themselves
e.g. (hello)+ will find hello, hellohello, hellohellohello, etc.
(hello|bye){2} will find hellohello, byebye, hellobye and byehello
Hello (Tom|Stefan|Emma) will look for any of the 3 names
(t[io]ck){2}! will find ticktock!, tocktick!, ticktick!, and tocktock!
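The group examples above can be tried with a short Python sketch. The helper full is made up for this example, and the sample strings are illustrative only.

```python
import re

def full(pattern, s):
    """True if the whole string s matches the pattern."""
    return re.fullmatch(pattern, s) is not None

repeated = full(r"(hello)+", "hellohellohello")
mixed_pair = full(r"(hello|bye){2}", "hellobye")
single_only = full(r"(hello|bye){2}", "hello")      # only one item, so no match
ticktock = full(r"(t[io]ck){2}!", "ticktock!")
tacktock = full(r"(t[io]ck){2}!", "tacktock!")      # "a" is not in [io]

# The bracketed part can also be read back after a match.
name = re.search(r"Hello (Tom|Stefan|Emma)", "Hello Emma, how are you?").group(1)

print(repeated, mixed_pair, single_only, ticktock, tacktock, name)
```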
More examples to figure out
What do the following regular expressions do?
^[0-9]{3}
^(Source|[0-9]{3})
^.*[0-9]{3}$
Replacements
F8
As expected, Textpad will replace anything matched, with the text specified.
Replacements
Remember though:
The RE will (only/completely) match what you've typed. Textpad will replace what's been matched.
Work out what's being matched and what's being replaced.
e.g. We may want to match (find) all lines ending with a ?, but only replace the word before the ?
Replacements
A more useful example:
[a-z]+[?!]$ Find any lines ending in letters, then ? or !
We want to replace the letters (i.e. [a-z]+) with something, but keep the ? or ! in place
But we don't know which one to use when replacing...
Replacements
Luckily, we can refer to things that have been matched using (...) groups.
Each group has its own number (in sequential order), and can be inserted into the replacement using \ followed by that number (not to be confused with \n for a line break)
e.g. [a-z]+([?!])$ has one (...) group: \1
To replace the letters, we can use new word\1
So hello! would become new word!
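Here is the same replacement as a Python sketch. re.sub plays the role of Textpad's Replace dialogue, which is an assumption; the function name replace_last_word and the extra sample lines are invented.

```python
import re

def replace_last_word(line, new_word="new word"):
    """Swap the letters before a closing ? or ! but keep the punctuation."""
    # ([?!]) is group 1, so \1 puts back whichever mark was matched.
    return re.sub(r"[a-z]+([?!])$", new_word + r"\1", line)

exclaimed = replace_last_word("hello!")
asked = replace_last_word("are you sure?")
untouched = replace_last_word("no change.")

print(exclaimed)
print(asked)
print(untouched)
```

The last line is left alone because it ends in a period, so the pattern never matches.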
Replacements
Textpad uses \1 to \9 - which we can use wherever we want
e.g. ([a-z]+) ([0-9]+) will match Pelham 123
Replacing this with \2 \1 gives us 123 Pelham
Replacing it with \2\2 gives us 123123
(Be a bit careful if a (...) group is optional though, i.e. (...)? - if it's not found, your higher numbers will be wrong.)
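The swaps above look like this in a Python sketch. Two assumptions are worth stating: re.IGNORECASE is added so that [a-z]+ also covers the capital P in Pelham (as a search without "Match case" ticked presumably would), and the optional (Mr )? group is invented for illustration. In Python, group numbers are fixed by the position of each opening bracket, so an unmatched optional group does not shift the later numbers; Textpad may behave differently, as the warning above says.

```python
import re

pattern = re.compile(r"([a-z]+) ([0-9]+)", re.IGNORECASE)

swapped = pattern.sub(r"\2 \1", "Pelham 123")
doubled = pattern.sub(r"\2\2", "Pelham 123")

# With an optional first group that is not found, \2 and \3 still
# refer to the word and the number in Python.
optional = re.compile(r"(Mr )?([a-z]+) ([0-9]+)", re.IGNORECASE)
swapped_optional = optional.sub(r"\3 \2", "Pelham 123")

print(swapped)
print(doubled)
print(swapped_optional)
```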
Phew...
Can get tricky quickly, but remember:
o Use the "Help" button
o Use the "Find Next" button to check what your RE finds
o Check the Cheat Sheet - http://bit.ly/textpadref
Handy Tip 1: Multiple files
To replace across many files, open all the files and tick "All documents" when replacing.
Handy Tip 2: Searching a folder
To look through (closed) files in a directory, use Ctrl + F5. Tick "Search subfolders" if folders are several layers deep.
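For comparison, a folder search can be roughly sketched in Python. This is not how Textpad does it; the function search_folder, its subfolders switch (standing in for "Search subfolders") and the *.txt filter are all hypothetical choices for this example.

```python
import re
from pathlib import Path

def search_folder(root, pattern, subfolders=True, glob="*.txt"):
    """Return (path, line number, line) for every line matching the pattern."""
    regex = re.compile(pattern)
    root = Path(root)
    # rglob descends into subfolders; glob stays in the top folder only.
    files = root.rglob(glob) if subfolders else root.glob(glob)
    hits = []
    for path in sorted(files):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path, number, line))
    return hits
```

Calling search_folder("some/folder", r"King'?s (Cross|X)") would list every matching line, with its file and line number, without opening the files in an editor.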
Handy Tip 3: Save everything
Finally:
You can see which files have changed - these have a * next to their filename.
You can also save all changes to all open files using:
File > Save All
Other Textpad Useful Bits
Block select mode (Ctrl+Q, then B):
Bookmarked lines:
Cut/copy/delete everything selected or bookmarked.
Bookmarks
http://xkcd.com/208/