Regular Expressions: The Proper Care and Feeding Zain Naboulsi MSDN Developer Evangelist Microsoft.

Page 1

Regular Expressions: The Proper Care and Feeding

Zain Naboulsi
MSDN Developer Evangelist
Microsoft

Page 2

Introduction to Regular Expressions

What Are Regular Expressions?

Why Would I Want To Use Them?

Common Misconceptions

Anatomy of a Regular Expression

Page 3

Disclaimer

All opinions in this session are provided "AS IS" with no warranties, and confer no rights.

All opinions are mine and don't necessarily reflect the opinion of Microsoft.

Page 4

What Are Regular Expressions?

Page 5

Regular Expressions

“Regular expressions provide a powerful, flexible, and efficient method for processing text.

[They allow] you to quickly parse large amounts of text to find specific character patterns; to extract, edit, replace, or delete text substrings; or to add the extracted strings to a collection in order to generate a report.”

http://msdn2.microsoft.com/en-us/library/hs600312.aspx

Page 6

Do What?

Simply put, regular expressions will help you find text patterns and do pretty much whatever you want with them.

It sounds simple, but regular expressions are one of the most difficult and least understood constructs in programming.

Page 7

Warning

Regular expressions are part art and part science. There is a steep learning curve but the rewards are significant.

Page 8

The Possibilities

Page 9

Okay, So What Is A Pattern?

“a regular or repetitive form, order, or arrangement”

http://encarta.msn.com/dictionary_1861724272/pattern.html

Page 10

PATTERNS ARE EVERYWHERE

Page 11

Checker Board

Page 12

Fibonacci Sequence

Page 13

Text

The IP Address for the server is 192.169.1.3 but it should be 192.168.1.5, and I am not sure how we managed to get into the 192.169.1 subnet but we need to remove ourselves from it immediately unless we are moving to it then I want the new IP to be 192.169.1.3 I suppose.
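As a rough illustration of the pattern hiding in that text, here is a minimal sketch that pulls out everything shaped like a full IP address. It uses Python's `re` module, which is an assumption on my part since the deck itself targets .NET, and the pattern `ip_like` is a hypothetical one written for this example, not one from the slides.

```python
import re

# The sample text from the "Text" slide.
text = (
    "The IP Address for the server is 192.169.1.3 but it should be "
    "192.168.1.5, and I am not sure how we managed to get into the "
    "192.169.1 subnet but we need to remove ourselves from it immediately "
    "unless we are moving to it then I want the new IP to be 192.169.1.3 "
    "I suppose."
)

# Hypothetical pattern: four groups of 1-3 digits separated by literal dots.
# It only checks the shape; it does not verify each group is in 0-255.
ip_like = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")

addresses = ip_like.findall(text)
print(addresses)
```

The three-part subnet "192.169.1" is skipped because the pattern insists on four groups of digits.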

Page 14

YOU HAVE USED PATTERNS BEFORE

Page 15

Wildcard Searches For Files

Wildcards = VERY simple pattern matching constructs and are NOT regular expressions

Examples:

*.txt

b*b*

?un.txt
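To show how wildcards differ from regular expressions, the sketch below matches a made-up list of file names both ways. It assumes Python, using `fnmatch.fnmatchcase` for the wildcard side and `re.fullmatch` for the regular-expression side; the file names and the regex equivalents are illustrative, not taken from the deck.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical file names, invented for this example.
names = ["notes.txt", "run.txt", "sun.txt", "bob.doc", "rerun.txt"]

# Wildcards: * means "any run of characters", ? means "exactly one character".
txt_files = [n for n in names if fnmatchcase(n, "*.txt")]
un_files = [n for n in names if fnmatchcase(n, "?un.txt")]

# Rough regular-expression equivalents. In a regex the dot is a
# metacharacter, so a literal dot has to be escaped as \. and the
# wildcard * becomes .* while the wildcard ? becomes a single dot.
txt_files_re = [n for n in names if re.fullmatch(r".*\.txt", n)]
un_files_re = [n for n in names if re.fullmatch(r".un\.txt", n)]

print(txt_files, un_files)
```

Both approaches select the same files here; the difference is that the regex forms can be extended with classes, counts, and groups, which wildcards cannot express.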

Page 16

Why Use Regular Expressions?

Page 17

Major Uses of Regular Expressions

Matching = find any text anywhere regardless of complexity

Substitution = once found, you can replace text
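The two uses above can be sketched side by side. This example assumes Python's `re` module (`re.search` for matching, `re.sub` for substitution); the input string `log_line` and both patterns are hypothetical.

```python
import re

# Hypothetical input, invented for this example.
log_line = "server=192.169.1.3 backup=192.169.1.7"

# Matching: find the first address in the 192.169.1 subnet.
match = re.search(r"192\.169\.1\.\d{1,3}", log_line)
first_hit = match.group(0) if match else None

# Substitution: rewrite every such address into the 192.168.1 subnet,
# keeping the last group of digits through the capture group \1.
fixed = re.sub(r"192\.169\.1\.(\d{1,3})", r"192.168.1.\1", log_line)

print(first_hit)
print(fixed)
```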

Page 18

Features

Can literally turn 10 lines of code into 1

Extremely efficient pattern matching mechanism

Once learned, becomes one of the most indispensable techniques you can have
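A small sketch of the "10 lines into 1" claim, assuming Python and a made-up task: checking that a string has the shape NNN-NNN-NNNN. Both function names are hypothetical and exist only for this comparison.

```python
import re

def looks_like_phone_by_hand(s: str) -> bool:
    """Check for the shape NNN-NNN-NNNN without regular expressions."""
    if len(s) != 12:
        return False
    for i, ch in enumerate(s):
        if i in (3, 7):
            # Positions 3 and 7 must be the separators.
            if ch != "-":
                return False
        elif ch not in "0123456789":
            return False
    return True

def looks_like_phone_by_regex(s: str) -> bool:
    """The same check expressed as a single pattern."""
    return re.fullmatch(r"[0-9]{3}-[0-9]{3}-[0-9]{4}", s) is not None

samples = ["555-867-5309", "5558675309", "555-867-530x"]
print([(looks_like_phone_by_hand(s), looks_like_phone_by_regex(s)) for s in samples])
```

The two functions agree on every sample; the regex version states the shape directly instead of walking the string by index.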

Page 19

Languages That Support Regular Expressions

All .NET languages

JScript

XML: XPath & XQuery

T-SQL

PERL

Java

[insert language here]

Page 20

ASP.NET Control

Page 21

Common Misconceptions

Page 22

Misconceptions

Regular Expressions can do complex programming logic

Regular Expressions can do math

Regular Expressions will give me winning lottery numbers

Page 23

Anatomy of a Regular Expression

Page 24

A Sample Expression

^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$
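To get a feel for what the sample expression accepts, here is a minimal sketch that runs it against a few made-up strings. It assumes Python's `re` module rather than .NET, and the candidate addresses are hypothetical.

```python
import re

# The sample expression from the slide, unchanged.
sample = re.compile(r"^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$")

# Hypothetical inputs chosen to probe the pattern's edges.
candidates = [
    "jane@example.com",       # word characters, @, letters, dot, 3 letters
    "jane.doe@example.com",   # dot before the @ is not covered by \w
    "jane@mail.example.com",  # only one dot is allowed after the @
    "jane@example.info",      # final part is limited to 2 or 3 letters
]

results = {c: bool(sample.match(c)) for c in candidates}
for candidate, ok in results.items():
    print(candidate, ok)
```

Only the first candidate matches, which shows that the expression is a teaching example for a simple email-like shape rather than a general email validator.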

Page 25

Anatomy

Characters

Metacharacters

Subexpressions

Page 26

Characters

A literal character matches itself; it can be any valid character in the current encoding method.

For example, the “@” literal character is represented as the decimal value 64 in the ASCII encoding system.

^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$
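A quick way to see the encoding point is to ask for a character's code directly. This sketch assumes Python, where `ord` returns the code point (identical to the ASCII value for ASCII characters); the string being searched is made up.

```python
import re

at_code = ord("@")  # code of the literal "@"
a_code = ord("A")   # code of the next character in the ASCII table

# A literal character in a pattern matches exactly that character.
position = re.search("@", "user@example").start()

print(at_code, a_code, position)
```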

Page 27

Metacharacters

Unlike literal characters, metacharacters are used as “place holders” for characters.

For example, the metacharacter “\t” in regular expressions represents the tab character, whereas the “\d” matches any digit 0 through 9.

^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$
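The two metacharacters named above can be tried on a small tab-separated row. This is a sketch in Python's `re` module, with a made-up input string; note that in Python (as in some other engines) `\d` on text strings may also accept digits from other scripts, which goes beyond the 0 through 9 described on the slide.

```python
import re

# Hypothetical tab-separated row, invented for this example.
row = "id\t42\tqty\t7"

fields = re.split(r"\t", row)      # \t stands for one tab character
digits = re.findall(r"\d", row)    # \d stands for any single digit
numbers = re.findall(r"\d+", row)  # + repeats the digit placeholder

print(fields, digits, numbers)
```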

Page 28

Subexpressions

These are simply smaller expressions nested inside larger ones.

For example, the following expression has a subexpression inside it:

(john|jane)doe
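The subexpression in parentheses can be read back after a match. A minimal sketch, assuming Python's `re` module, where the parenthesized part is available as group 1; the test names are made up.

```python
import re

pattern = re.compile(r"(john|jane)doe")

# Hypothetical inputs: two that fit the pattern and one that does not.
names = ["johndoe", "janedoe", "jimdoe"]

first_names = {}
for name in names:
    m = pattern.fullmatch(name)
    # group(1) is whatever the subexpression (john|jane) matched.
    first_names[name] = m.group(1) if m else None

print(first_names)
```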

Page 29

Must Have Resources

Page 30

Tools

http://www.RegExLib.com

http://www.ultrapico.com/Expresso.htm

Page 31

Book

Page 32

Tools

Page 33

Summary

Page 34

Summary

Regular expressions can be used to manipulate and change text

While there is a steep learning curve, regular expressions are invaluable as a programming tool

Regular expressions are supported by virtually all major programming languages

Page 35

Next Steps

Check out some of the patterns on the RegExLib site

Do a live search on regular expressions and see what others have to say about them

Prepare yourself mentally for a rewarding journey into the world of regular expressions

Have Fun!!!
