Regular Expression
Dr. Tran, Van Hoai
Faculty of Computer Science and Engineering, HCMC Uni. of Technology
hoai@cse.hcmut.edu.vn
2007
Text pattern
Used for text-processing utilities
Text pattern = normal characters + metacharacters = regular expression
Metacharacters in regular expressions are different from those in file name expansion
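The difference between regular-expression metacharacters and file name expansion can be checked with a small sketch. It assumes Python's `fnmatch` module as a stand-in for shell file name expansion and the `re` module as a stand-in for a regex engine; the file names are made up for illustration.

```python
import fnmatch
import re

# Hypothetical file names, chosen only to show the difference.
names = ["script1.sh", "script.sh", "scrip.sh", "notes.txt"]

# File name expansion: "*" stands for any run of characters, "." is literal.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "script*.sh")]

# Regular expression: "*" repeats the item before it (the letter "t"),
# and "." matches any single character.
regex_hits = [n for n in names if re.fullmatch(r"script*.sh", n)]

print(glob_hits)   # ['script1.sh', 'script.sh']
print(regex_hits)  # ['script.sh', 'scrip.sh']
```

The same text `script*.sh` selects different names under each set of rules.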
Example (1)
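grep [a-z]* script*.sh
Unquoted, the shell expands both words into file names before grep runs, so grep may actually receive something like: grep a.txt abc script1.sh script2.sh
grep "[a-z]*" script*.sh
Quoted, the pattern "[a-z]*" reaches grep unchanged and is searched for in the files matching "script*.sh"
Good and safe solutions are to quote the pattern with "" or ''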
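The quoting problem from Example (1) can be imitated outside the shell. The sketch below is only an approximation: it assumes Python's `glob` module behaves closely enough to unquoted shell expansion, the helper `expand` is hypothetical, and the directory contents are taken from the expansion shown on the slide.

```python
import glob
import os
import tempfile


def expand(word, directory):
    """Roughly mimic unquoted shell expansion of one word inside `directory`."""
    pattern = os.path.join(glob.escape(directory), word)
    matches = sorted(os.path.basename(p) for p in glob.glob(pattern))
    # Like most shells, leave the word untouched when nothing matches.
    return matches or [word]


with tempfile.TemporaryDirectory() as directory:
    # Hypothetical directory contents.
    for name in ["a.txt", "abc", "script1.sh", "script2.sh"]:
        open(os.path.join(directory, name), "w").close()

    # Unquoted: every word is expanded before grep starts.
    unquoted = ["grep"] + expand("[a-z]*", directory) + expand("script*.sh", directory)
    # Quoted: the pattern is passed through literally; only the file list expands.
    quoted = ["grep", "[a-z]*"] + expand("script*.sh", directory)

print(unquoted[1])  # a.txt  (a file name ends up in the pattern position)
print(quoted[1])    # [a-z]*
```

In the unquoted case grep would treat `a.txt` as the pattern, which is why quoting is the safe habit.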
Metacharacter sets
Depends on usage context: searching, replacing
Also depends on programs
Different engines: Perl, PHP, .NET regular expression library, Java JDK
Searching patterns (1)
Character Pattern
. any single character, except newline
* zero or more occurrences of the single character (or regex) immediately preceding it
^ the following regex at the beginning of the line
$ the preceding regex at the end of the line
[ ] any one of the enclosed characters, which can be given as a range (-). "^" right after "[" means "not"
{n,m} a range of occurrences of the regex preceding it. {n} matches exactly n occurrences. {n,} matches at least n occurrences. {n,m} matches between n and m occurrences
\ turn off the special meaning of the following character
\b word boundary
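The metacharacters in this table can be tried out with a short sketch. It assumes Python's `re` module, a Perl-style engine whose syntax differs in places from grep or vi; the sample strings are invented for illustration.

```python
import re

# (pattern, text, expected result of searching for pattern in text)
samples = [
    (r"b.g",        "big",         True),   # "." matches any one character
    (r"b.g",        "bg",          False),  # ... but it must match exactly one
    (r"bo*k",       "bk",          True),   # "*" allows zero repetitions of "o"
    (r"^bag",       "a bag",       False),  # "^" anchors at the beginning
    (r"[^0-9]",     "2007",        False),  # negated class: no non-digit present
    (r"[0-9]{2,3}", "7",           False),  # needs two or three digits
    (r"[0-9]{2,3}", "2007",        True),
    (r"3\.14",      "3x14",        False),  # "\." is a literal dot
    (r"\bcat\b",    "concatenate", False),  # "cat" is not a whole word here
    (r"\bcat\b",    "a cat sat",   True),
]

results = [bool(re.search(pattern, text)) for pattern, text, _ in samples]
expected = [want for _, _, want in samples]
print(results == expected)  # True
```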
Searching patterns (2)
Character Pattern
\{n,m\} same as {n,m}, written with escaped braces (the form used in basic regular expressions, e.g. grep and sed)
+ one or more instances of the preceding regex
? zero or one instance of the preceding regex
| alternation: match either the regex before it or the regex after it
( ) group the enclosed regexes so that a match applies to the group as a whole
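A minimal sketch of `+`, `?`, `|` and `( )`, again assuming Python's `re` module, where these metacharacters are written unescaped. The words are arbitrary examples, and `re.fullmatch` is used so that the whole string has to fit the pattern.

```python
import re

# (pattern, text, expected result of matching the whole text)
samples = [
    (r"go+gle",   "ggle",    False),  # "+" needs at least one "o"
    (r"go+gle",   "gooogle", True),
    (r"colou?r",  "color",   True),   # "?" makes the "u" optional
    (r"colou?r",  "colouur", False),  # ... but allows at most one
    (r"cat|dog",  "dog",     True),   # alternation
    (r"(ab)+",    "ababab",  True),   # "+" applies to the whole group
    (r"(ab)+",    "aba",     False),
    (r"gr(a|e)y", "grey",    True),   # alternation limited by the group
]

results = [bool(re.fullmatch(pattern, text)) for pattern, text, _ in samples]
print(results)  # [False, True, True, False, True, True, False, True]
```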
Example (2)
Pattern What does it match?
bag bag anywhere on the line
^bag bag at the beginning of the line
bag$ bag at the end of the line
^bag$ a line containing only bag
[Bb]ag Bag or bag
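The anchored patterns above can be checked line by line. This is a sketch assuming Python's `re.search`, which, like grep, looks for the pattern anywhere in the line unless an anchor says otherwise; the sample lines are made up.

```python
import re

lines = ["bag", "bagpipe", "a bag", "Bag", "handbag"]


def matching(pattern):
    """Return the sample lines in which the pattern is found."""
    return [line for line in lines if re.search(pattern, line)]


print(matching(r"bag"))     # ['bag', 'bagpipe', 'a bag', 'handbag']
print(matching(r"^bag"))    # ['bag', 'bagpipe']
print(matching(r"bag$"))    # ['bag', 'a bag', 'handbag']
print(matching(r"^bag$"))   # ['bag']
print(matching(r"[Bb]ag"))  # ['bag', 'bagpipe', 'a bag', 'Bag', 'handbag']
```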
Example (3)
Pattern What does it match?
b[aeiou]g second letter is a lowercase vowel
b[^aeiou]g second letter is not a lowercase vowel (a consonant, uppercase letter, digit or symbol)
b.g second letter is any character
^\. any line that begins with a dot
^\.[a-z][a-z] ?
^[^\.] ?
bugs* bug, bugs, bugss, etc.
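The character-class and repetition rows can be verified with a few made-up words. The sketch assumes Python's `re` module and covers only the rows whose answer is given on the slide.

```python
import re

words = ["bag", "big", "bug", "bAg", "b9g", "bxg"]

# Second letter is a lowercase vowel.
vowel = [w for w in words if re.fullmatch(r"b[aeiou]g", w)]
# Second letter is anything except a lowercase vowel.
other = [w for w in words if re.fullmatch(r"b[^aeiou]g", w)]

# "s*" repeats only the "s", so the stem "bug" is always required.
plural = [w for w in ["bu", "bug", "bugs", "bugss", "bugsy"]
          if re.fullmatch(r"bugs*", w)]

# "\." is a literal dot; "^" ties it to the start of the line.
dotted = [l for l in [".profile", "profile", " .hidden"] if re.search(r"^\.", l)]

print(vowel)   # ['bag', 'big', 'bug']
print(other)   # ['bAg', 'b9g', 'bxg']
print(plural)  # ['bug', 'bugs', 'bugss']
print(dotted)  # ['.profile']
```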
Example (4)
Pattern What does it match?
"word" ?
"*word"* ?
[A-Z][A-Z]* one or more uppercase letters
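The idiom `[A-Z][A-Z]*` expresses "one or more" using only `*`, which is useful in tools that lack `+`. A small sketch, assuming Python's `re` module and an invented sample sentence, shows that it finds the same runs as `[A-Z]+`.

```python
import re

text = "NASA sent Voyager 1 past the heliopause; IBM was watching."

# One uppercase letter followed by zero or more uppercase letters.
with_star = re.findall(r"[A-Z][A-Z]*", text)
# The same idea written with "+", where the engine supports it.
with_plus = re.findall(r"[A-Z]+", text)

print(with_star)               # ['NASA', 'V', 'IBM']
print(with_star == with_plus)  # True
```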
Example (5)
Pattern What does it match?
floating point number
java identifier
java simple arithmetic expression
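The slide leaves the patterns for these three items open. The sketch below gives one possible set of answers; none of it comes from the slides. It assumes Python's `re` syntax, a deliberately simplified notion of a floating point literal (a decimal point is required), ASCII-only Java identifiers, and arithmetic expressions without parentheses, since arbitrarily nested parentheses cannot be described by a regular expression.

```python
import re

# Assumed definition: optional sign, digits with a decimal point, optional exponent.
FLOAT = re.compile(r"[-+]?(\d+\.\d*|\.\d+)([eE][-+]?\d+)?")

# Assumed definition: a letter, "_" or "$", then letters, digits, "_" or "$".
IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")

# Assumed definition: operands (identifier or integer) separated by + - * /.
OPERAND = r"(?:[A-Za-z_$][A-Za-z0-9_$]*|\d+)"
ARITHMETIC = re.compile(OPERAND + r"(?:\s*[-+*/]\s*" + OPERAND + r")*")


def accepts(regex, text):
    """True when the whole text fits the pattern."""
    return regex.fullmatch(text) is not None


print(accepts(FLOAT, "3.14"), accepts(FLOAT, "-.5e10"), accepts(FLOAT, "abc"))
# True True False
print(accepts(IDENTIFIER, "_tmp"), accepts(IDENTIFIER, "1abc"))
# True False
print(accepts(ARITHMETIC, "a + b*2"), accepts(ARITHMETIC, "a + * b"))
# True False
```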
Replacing patterns (1)
Character Pattern
\ turn off special meaning
\n reuse the text matched by the nth subpattern previously saved by \( and \). Numbered from 1 to 9
& the entire text matched by the search pattern
~, % reuse the previous replacement pattern (~ in ex and vi, % in ed)
\u convert the first character of the replacement pattern to uppercase
\U convert the entire replacement pattern to uppercase
\l, \L same as \u and \U, but convert to lowercase
\e turn off previous \u or \l
\E turn off previous \U or \L
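Other engines offer the same ideas under different spellings. The sketch below assumes Python's `re.sub`, where `\1` works as in the table, `\g<0>` plays the role of `&`, and case conversion (which Python's replacement syntax lacks) is done with a replacement function. The sample strings are invented.

```python
import re

# \1 and \2 reuse the text saved by the first and second group.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Tran, Hoai")

# \g<0> is the whole match, the counterpart of "&".
marked = re.sub(r"[0-9]+", r"<\g<0>>", "room 42")

# No \U in Python: a function receives each match and returns the replacement.
shouted = re.sub(r"\bregex\b", lambda m: m.group(0).upper(), "a regex is a regex")

print(swapped)  # Hoai Tran
print(marked)   # room <42>
print(shouted)  # a REGEX is a REGEX
```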
Example (6)
Command Result
s/.*/( & )/ wrap the line in parentheses, with a space on each side
s/.*/mv & &.old/ ?
/^$/d delete blank lines (in vi, use :g/^$/d to apply it to all lines)
%s/  */ /g turn one or more spaces into one space (note the two spaces before *)
%s/.*/\L&/ lowercase the entire file
%s/yes/No/g replace yes with No
%s/Yes/~/g replace Yes with No (reuses the previous replacement)
s/\(F\)\(ORTRAN\)/\1\L\2/g FORTRAN to Fortran
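Some of these editor commands have close counterparts in other tools. The sketch below is an approximation in Python's `re` module, not vi or sed itself: `count=1` is an assumption needed because Python would otherwise also replace the empty match at the end of the line, and the sample text is made up.

```python
import re

# s/.*/( & )/  : wrap the whole line.
wrapped = re.sub(r".*", r"( \g<0> )", "abc", count=1)

# %s/  */ /g   : one space followed by zero or more spaces becomes one space.
squeezed = re.sub(r"  *", " ", "a   b  c d")

# g/^$/d       : drop lines that are empty from start to end.
non_blank = [line for line in ["one", "", "two"] if not re.search(r"^$", line)]

# s/\(F\)\(ORTRAN\)/\1\L\2/g : keep group 1, lowercase group 2.
fortran = re.sub(r"(F)(ORTRAN)",
                 lambda m: m.group(1) + m.group(2).lower(),
                 "FORTRAN 77 and FORTRAN 90")

print(wrapped)    # ( abc )
print(squeezed)   # a b c d
print(non_blank)  # ['one', 'two']
print(fortran)    # Fortran 77 and Fortran 90
```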
Applications
Pattern Matched text
<TAG\b[^>]*>\(.*?\)</TAG> grab a specific HTML tag
[0-9]\{1,3\}\. ???? IP address
[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4} Email address
? Valid dates (day-month-year)
? WeWe, does not match Wee
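A few of these applications can be sketched in Python's `re` syntax. Several parts are assumptions rather than the slides' answers: the IP pattern completes the `????` in one possible way and does not check that each number is at most 255, the tag name `b` stands in for `TAG`, the group is written `( )` as Python requires, and `(We)\1` is only one way to match a repeated `We`.

```python
import re

# The email pattern from the slide; IGNORECASE lets [A-Z] cover lowercase too.
EMAIL = re.compile(r"[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}", re.IGNORECASE)

# Assumed completion: three "number dot" groups, then a final number.
IP = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")

# Lazy ".*?" stops at the first closing tag instead of the last one.
BOLD = re.compile(r"<b\b[^>]*>(.*?)</b>")

# Assumed pattern: \1 must repeat exactly what the group matched.
REPEATED = re.compile(r"(We)\1")

print(bool(EMAIL.fullmatch("hoai@cse.hcmut.edu.vn")))             # True
print(bool(IP.fullmatch("192.168.0.1")), bool(IP.fullmatch("192.168.0")))
# True False
print(BOLD.findall('x <b class="k">bold</b> y <b>two</b> <br>'))  # ['bold', 'two']
print(bool(REPEATED.search("WeWe")), bool(REPEATED.search("Wee")))
# True False
```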
NEXT: text processing utilities