Advanced Regular Expressions in .NET
Patrick Delancy
NOTICE!!!
This slide deck has been adapted from a presentation that was intended to be given live, in person… like, with a real person in front of real people. You know… breathing the same air and all that.
The key points have been transcribed onto separate slides, so you still get some benefit from reading through it all, but you are still missing out on all of the great stories, witty banter, hilarious costumes, stunning arias… or something like that.
If you REALLY want to get the most out of this presentation, go to patrickdelancy.com and ask him to come give it to your group!
This presentation will help you understand what Regex is capable of.
Don’t bother trying to memorize the syntax; just remember the concepts.
Then you can make a more intelligent decision about when you should and should not use Regex.
Common Features
...but not ubiquitous
● Non-capturing groups
● Look ahead
● Look behind
● Free-spacing
Non-Capturing Groups
^(.*)(@)(.*)$
email@ddress.com
[1] = email
[2] = @
[3] = ddress.com
^(.*)(?:@)(.*)$
email@ddress.com
[1] = email
[2] = ddress.com
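A minimal C# sketch of the same comparison, written as top-level statements (C# 9 or later). The variable names are illustrative and not from the slides.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "email@ddress.com";

// Three capturing groups: the "@" occupies group 2.
Match capturing = Regex.Match(input, @"^(.*)(@)(.*)$");

// (?:...) still groups, but it does not claim a group number.
Match nonCapturing = Regex.Match(input, @"^(.*)(?:@)(.*)$");

Console.WriteLine(capturing.Groups.Count);       // 4: group 0 (the whole match) plus three captures
Console.WriteLine(nonCapturing.Groups.Count);    // 3: group 0 plus two captures
Console.WriteLine(nonCapturing.Groups[2].Value); // ddress.com
```

Group 0 is always the whole match, which is why the counts are one higher than the number of parenthesized captures.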
Look Ahead
\b\w+(?=\.)   # match the word at the end of each sentence
              # but don’t capture the period.
See Dick. See Jane. See Dick and Jane run.
Dick
Jane
run
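The look-ahead above can be tried with a short sketch like this one (top-level statements; `lastWords` is just an illustrative name).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string text = "See Dick. See Jane. See Dick and Jane run.";

// (?=\.) requires a period right after the word but leaves it out of the match.
string[] lastWords = Regex.Matches(text, @"\b\w+(?=\.)")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", lastWords)); // Dick, Jane, run
```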
Look Behind
(?<=\b19)\d{2}\b   # match all years in the 1900s
                   # capturing only the 2-digit year
1842 1902 1776 1985 2003 1999
02
85
99
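A small sketch of the look-behind in C#, again as top-level statements with illustrative variable names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string years = "1842 1902 1776 1985 2003 1999";

// (?<=\b19) checks that "19" sits right before the match without consuming it.
string[] twoDigitYears = Regex.Matches(years, @"(?<=\b19)\d{2}\b")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", twoDigitYears)); // 02, 85, 99
```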
Free Spacing (Ignore Pattern Whitespace)
new Regex(@"\b[^@]+   # pattern can now span multiple lines
            @[^\b]+\b # and include white space for readability
", RegexOptions.IgnorePatternWhitespace);
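Here is a runnable sketch of free-spacing mode. The pattern is a simplified stand-in rather than the one on the slide, and `emailLike` is a made-up name.

```csharp
using System;
using System.Text.RegularExpressions;

// With IgnorePatternWhitespace, unescaped whitespace in the pattern is skipped
// and "#" starts a comment that runs to the end of the line.
var emailLike = new Regex(@"
    [^@\s]+   # local part: anything except @ or whitespace
    @         # the separator
    [^@\s]+   # domain part
", RegexOptions.IgnorePatternWhitespace);

string found = emailLike.Match("contact email@ddress.com today").Value;
Console.WriteLine(found); // email@ddress.com
```

To match a literal space in this mode, escape it or put it inside a character class.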
Less-Common Features
...in more advanced engines
● Named Captures
● Comments
● Inline Directives
● Conditional Alternation
● Atomic Groups
● Compiled Patterns
● Unicode Categories and Named Character Blocks
Comments
^.*@.*$ # comment to the end of the line
^.*@(?# this is an inline comment).*$
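A brief sketch of both comment styles. It assumes the end-of-line form is meant to be used with `RegexOptions.IgnorePatternWhitespace` (or an inline `(?x)`), since that is what turns `#` into a comment marker. The variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "email@ddress.com";

// "#" starts a comment only in free-spacing mode.
var endOfLine = new Regex(@"^.*@.*$ # comment to the end of the line",
                          RegexOptions.IgnorePatternWhitespace);

// (?# ...) works with or without that option.
var inline = new Regex(@"^.*@(?# this is an inline comment).*$");

// Without the option, " # comment..." is treated as literal text to match.
var literalHash = new Regex(@"^.*@.*$ # comment to the end of the line");

Console.WriteLine(endOfLine.IsMatch(input));   // True
Console.WriteLine(inline.IsMatch(input));      // True
Console.WriteLine(literalHash.IsMatch(input)); // False
```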
Inline Directives
John the (?ix) (?: wiser | better and greater | privy )
John the Wiser, John the BetterAndGreater, john the privy, John the Better and Greater
John the Wiser
John the BetterAndGreater
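The same pattern in a small C# sketch (illustrative names). The `(?ix)` directive only affects what comes after it, so "John the " stays case-sensitive and keeps its literal spaces.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string pattern = @"John the (?ix) (?: wiser | better and greater | privy )";
string input = "John the Wiser, John the BetterAndGreater, john the privy, John the Better and Greater";

string[] hits = Regex.Matches(input, pattern)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

// "john the privy" fails on the lowercase "j"; "Better and Greater" fails because
// the spaces inside the group are ignored by the x directive.
Console.WriteLine(string.Join(" | ", hits)); // John the Wiser | John the BetterAndGreater
```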
Conditional Alternation
^Type:(?:(?<ssn>SSN)|(?<eid>EID)), ID:(?(ssn)\d{3}\-\d{2}\-\d{4}|[-\d]+)$
Type:SSN, ID:352-23-4567
Type:EID, ID:35-2234567
Type:SSN, ID:35-2234567
Type:EID, ID:???
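A short sketch that runs the four sample inputs through the conditional pattern. `idPattern` and `Check` are made-up names for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// (?(ssn)A|B): use branch A if the group named "ssn" captured something, otherwise B.
var idPattern = new Regex(
    @"^Type:(?:(?<ssn>SSN)|(?<eid>EID)), ID:(?(ssn)\d{3}\-\d{2}\-\d{4}|[-\d]+)$");

bool Check(string line) => idPattern.IsMatch(line);

Console.WriteLine(Check("Type:SSN, ID:352-23-4567")); // True
Console.WriteLine(Check("Type:EID, ID:35-2234567"));  // True
Console.WriteLine(Check("Type:SSN, ID:35-2234567"));  // False: an SSN must look like ###-##-####
Console.WriteLine(Check("Type:EID, ID:???"));         // False: only digits and hyphens are allowed
```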
Atomic Grouping / Possessive Quantifiers
\b(in|integer|insert)\b
integer
integers
in
insert
\b(?>in|integer|insert)\b
integer
integers
in
insert
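The transcript loses the slide's highlighting, so here is a sketch that shows which words each pattern accepts. The variable names are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string[] words = { "integer", "integers", "in", "insert" };

var backtracking = new Regex(@"\b(in|integer|insert)\b");
var atomic = new Regex(@"\b(?>in|integer|insert)\b");

// The ordinary group can fall back to "integer" or "insert" after "in" fails at \b.
string[] normalHits = words.Where(w => backtracking.IsMatch(w)).ToArray();

// The atomic group commits to "in" once it matches and never tries the other branches.
string[] atomicHits = words.Where(w => atomic.IsMatch(w)).ToArray();

Console.WriteLine(string.Join(", ", normalHits)); // integer, in, insert
Console.WriteLine(string.Join(", ", atomicHits)); // in
```

Putting the longest alternative first, as in `(?>integer|insert|in)`, is one way to keep the atomic version from rejecting the longer words.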
Compiled Patterns
// instance: construct the Regex, then call IsMatch on it
var pattern = new Regex(@"a+h+!+");
return pattern.IsMatch(value);
// static: the input comes first, then the pattern
var pattern = @"a+h+!+";
return Regex.IsMatch(value, pattern);
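A hedged sketch of opting in to compilation explicitly with `RegexOptions.Compiled`, which generally costs more at construction time in exchange for faster matching, so it tends to pay off for patterns that are built once and reused. `ahh` and `IsExcited` are made-up names.

```csharp
using System;
using System.Text.RegularExpressions;

// Build once and reuse; constructing a compiled Regex per call would waste the setup cost.
var ahh = new Regex(@"a+h+!+", RegexOptions.Compiled);

bool IsExcited(string value) => ahh.IsMatch(value);

Console.WriteLine(IsExcited("aaahhh!!!")); // True
Console.WriteLine(IsExcited("hello"));     // False

// The static helper takes the input first, then the pattern.
Console.WriteLine(Regex.IsMatch("ahh!", @"a+h+!+")); // True
```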
Named Character Blocks & Unicode Groups
\b(?:\p{IsGreek}+\s?)+\p{Pd}\s(?>\p{IsBasicLatin}+\s?)+
Κατα Μαθθαίον - The Gospel of Matthew
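A sketch of the pattern above in C#. The Greek words are written as `\u` escapes so the example does not depend on how the source file is encoded; the variable names and the extra `\p{Lu}` count are illustrative additions.

```csharp
using System;
using System.Text.RegularExpressions;

// "Κατα Μαθθαίον - The Gospel of Matthew", with the Greek words spelled as escapes.
string title = "\u039A\u03B1\u03C4\u03B1 \u039C\u03B1\u03B8\u03B8\u03B1\u03AF\u03BF\u03BD - The Gospel of Matthew";

// \p{IsGreek} and \p{IsBasicLatin} are named blocks; \p{Pd} is the dash punctuation category.
var bilingual = new Regex(@"\b(?:\p{IsGreek}+\s?)+\p{Pd}\s(?>\p{IsBasicLatin}+\s?)+");

Match m = bilingual.Match(title);
Console.WriteLine(m.Success && m.Value == title); // True: the whole title matches

// A general category on its own: count the uppercase letters in either script.
int upperCount = Regex.Matches(title, @"\p{Lu}").Count;
Console.WriteLine(upperCount); // 5
```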
Unique Features
...in the .NET Regex engine
● Balancing Groups
● Character Class Subtraction
● Explicit Capture Only
Balancing Groups
^(?:[^{}]|(?<open>{)|(?<-open>}))*(?(open)(?!))$
{ if (true) { return "A"; } else { return "B"; } }
{ if (true) { return "A"; } else { return "B"; }
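A small sketch of the brace check. The sample strings are simplified versions of the ones on the slide, and `balanced` is an illustrative name.

```csharp
using System;
using System.Text.RegularExpressions;

// (?<open>{) pushes a capture, (?<-open>}) pops one (and fails if there is nothing to pop),
// and (?(open)(?!)) fails the match if anything is still left on the stack at the end.
var balanced = new Regex(@"^(?:[^{}]|(?<open>{)|(?<-open>}))*(?(open)(?!))$");

string good = "{ if (true) { return 1; } else { return 2; } }";
string missingClose = "{ if (true) { return 1; } else { return 2; }";
string extraClose = "{ return 1; } }";

Console.WriteLine(balanced.IsMatch(good));         // True
Console.WriteLine(balanced.IsMatch(missingClose)); // False
Console.WriteLine(balanced.IsMatch(extraClose));   // False
```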
Character Class Subtraction
[0-9-[1-8]]
0123456789
[0-9-[1-8-[2-7]]]
0123456789
[\w-[aeiou]]
Lazy dog, quick fox, blah, blah, blah.
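A sketch that prints the characters each subtraction class keeps. `Keep` is a hypothetical helper written for this example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Concatenate every single-character match of the given class.
string Keep(string input, string pattern) =>
    string.Concat(Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value));

string digits = "0123456789";

Console.WriteLine(Keep(digits, "[0-9-[1-8]]"));       // 09
Console.WriteLine(Keep(digits, "[0-9-[1-8-[2-7]]]")); // 02345679
Console.WriteLine(Keep("Lazy dog", @"[\w-[aeiou]]")); // Lzydg
```

The nested case reads inside out: 1-8 minus 2-7 leaves 1 and 8, and those are what get removed from 0-9.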
Explicit Capture Only
^(?<name>[^@\+]+(\+[^\+]+)?)@(?<domain>(\w+)\.(com|net|org))$
e+mail@ddress.com
[1] = +mail
[2] = ddress
[3] = com
[name] = e+mail
[domain] = ddress.com
(?n)^(?<name>[^@\+]+(\+[^\+]+)?)@(?<domain>(\w+)\.(com|net|org))$
e+mail@ddress.com
[name] = e+mail
[domain] = ddress.com
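A sketch comparing the two patterns in C#. One detail worth knowing: .NET numbers unnamed groups first and gives named groups the numbers after them, so here `+mail` is group 1 and `name` is group 4. The variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string pattern = @"^(?<name>[^@\+]+(\+[^\+]+)?)@(?<domain>(\w+)\.(com|net|org))$";
string input = "e+mail@ddress.com";

Match all = Regex.Match(input, pattern);

// (?n) is the inline form of RegexOptions.ExplicitCapture: plain (...) stops capturing.
Match namedOnly = Regex.Match(input, "(?n)" + pattern);

Console.WriteLine(all.Groups.Count);                  // 6: group 0, three unnamed, two named
Console.WriteLine(all.Groups[1].Value);               // +mail
Console.WriteLine(namedOnly.Groups.Count);            // 3: group 0, name, domain
Console.WriteLine(namedOnly.Groups["name"].Value);    // e+mail
Console.WriteLine(namedOnly.Groups["domain"].Value);  // ddress.com
```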
Patrick Delancy
patrickdelancy.com
This Presentation:
patrickdelancy.com/presentations/...
@patrickdelancy
linkedin.com/in/patrickdelancy
google.com/+patrickdelancy
Some Additional Resources
• https://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines - This is a little outdated, but still a good overview of how Regex implementations vary.
• https://msdn.microsoft.com/en-us/library/20bw873z(v=vs.110).aspx#SupportedNamedBlocks - Here is a reference of all of the named Unicode blocks that .NET supports in Regex. Linked here because I told you I would :)
• http://www.regular-expressions.info/refflavors.html - This is a very comprehensive reference for many common Regex engines. Some content may be out of date as new versions of each platform are released.
• http://www.regexplanet.com/ - An online pattern tester. Not the best interface, but very capable and has some nice features.