Regular Expression

Dr. Tran, Van Hoai

Faculty of Computer Science and Engineering, HCMC Uni. of Technology

hoai@cse.hcmut.edu.vn

Dr. Tran, Van Hoai, 2007

Text pattern

Used by text-processing utilities.

Text pattern = normal characters + metacharacters = regular expression.

Metacharacters in regular expressions are different from those in file name expansion.
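The difference between the two metacharacter sets can be sketched in Python, assuming `fnmatch` as a stand-in for shell file name expansion and `re` as a stand-in for a regular expression engine. The two sample names are made up for illustration.

```python
import fnmatch
import re

PATTERN = "script*.sh"

def as_glob(name):
    # File name expansion: "*" stands for any run of characters, "." is literal.
    return fnmatch.fnmatchcase(name, PATTERN)

def as_regex(name):
    # Regular expression: "*" repeats the preceding "t", "." matches any one character.
    return re.fullmatch(PATTERN, name) is not None

for name in ["script10.sh", "scripXsh"]:
    print(name, "glob:", as_glob(name), "regex:", as_regex(name))
```

The same text is read two different ways, so a name accepted by one interpretation can be rejected by the other.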

Example (1)

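grep [A-Z]* script*.sh can mean grep a.txt abc script1.sh script2.sh: the shell expands unquoted patterns into file names before grep runs.

grep "[a-z]*" script*.sh means to find the pattern "[a-z]*" in the files matching "script*.sh".

Good and safe solutions are to quote the pattern with "" or ''.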
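A rough simulation of what the shell does to an unquoted pattern, written in Python. The `expand` helper and the directory listing are hypothetical; real shells have more rules (locale, hidden files, options), so this is only a sketch of the idea.

```python
import fnmatch

def expand(word, names):
    """Mimic unquoted expansion: replace the word by the sorted matching
    file names, or keep it unchanged when nothing matches."""
    hits = sorted(n for n in names if fnmatch.fnmatchcase(n, word))
    return hits or [word]

# Hypothetical directory contents, chosen for illustration only.
names = ["Makefile", "README", "script1.sh", "script2.sh"]

# Unquoted: the pattern itself is expanded, so grep receives a file name as its regex.
unquoted = ["grep"] + expand("[A-Z]*", names) + expand("script*.sh", names)

# Quoted: the pattern reaches grep untouched; only the file argument is expanded.
quoted = ["grep", "[A-Z]*"] + expand("script*.sh", names)

print(unquoted)
print(quoted)
```

In the unquoted case the first expanded file name takes the place of the regular expression, which is why quoting is the safe habit.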

Metacharacter sets

Depend on the usage context: searching or replacing.

Also depend on the program.

Different engines: Perl, PHP, .NET regular expression library, Java JDK.

Searching patterns (1)

Character Pattern

. any single character, except newline

* zero or more occurrences of the single character (or regex) immediately preceding it

^ match the following regex at the beginning of the line

$ match the preceding regex at the end of the line

[ ] any one of the enclosed characters, which can be given as a range (-). "^" right after "[" means any character not in the set

{n,m} a range of occurrences of the regex preceding it. {n} matches exactly n occurrences. {n,} matches at least n occurrences. {n,m} matches between n and m occurrences

\ turn off the special meaning of the following character

\b word boundary
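The metacharacters above can be tried out with a small Python sketch. It assumes Python's `re` module, whose syntax is close to but not identical with grep's; the sample strings are invented for illustration.

```python
import re

def hit(pattern, text):
    """Return True when the pattern matches somewhere in the text."""
    return re.search(pattern, text) is not None

# (pattern, text, expected result)
CHECKS = [
    (r"b.g", "bug", True),             # "." matches any single character
    (r"b.g", "b\ng", False),           # ... except newline
    (r"^ab*c$", "ac", True),           # "*" allows zero occurrences of "b"
    (r"^ab*c$", "abbbc", True),        # ... or many
    (r"^bag", "bagpipe", True),        # "^" anchors at the beginning
    (r"^bag", "a bag", False),
    (r"[0-9]", "room 7", True),        # any one character from the set
    (r"^[^0-9]*$", "room 7", False),   # "^" inside [ ] negates the set
    (r"^a{2,3}$", "aa", True),         # between 2 and 3 occurrences
    (r"^a{2,3}$", "aaaa", False),
    (r"3\.14", "3.14", True),          # "\" makes "." literal
    (r"3\.14", "3x14", False),
    (r"\bbag\b", "a bag here", True),  # word boundaries
    (r"\bbag\b", "baggage", False),
]

for pattern, text, expected in CHECKS:
    print(f"{pattern!r:14} {text!r:14} -> {hit(pattern, text)}")
```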

Searching patterns (2)

Character Pattern

\{n,m\} a range of occurrences of the regex preceding it. \{n\} matches exactly n occurrences. \{n,\} matches at least n occurrences. \{n,m\} matches between n and m occurrences

+ one or more instances of the preceding regex

? zero or one instance of the preceding regex

| alternation: match either the regex before it or the regex after it

( ) group regexes so that a match operator applies to the whole enclosed group
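A short Python sketch of these operators, under the assumption that Python's `re` is an acceptable stand-in: it writes `{n,m}` and `( )` without backslashes, where grep's basic syntax uses `\{n,m\}` and `\( \)`. The words tested are made up.

```python
import re

def whole(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# (pattern, text, expected result)
CHECKS = [
    (r"bugs+", "bug", False),        # "+" needs at least one "s"
    (r"bugs+", "bugss", True),
    (r"colou?r", "color", True),     # "?" makes "u" optional
    (r"colou?r", "colour", True),
    (r"colou?r", "colouur", False),
    (r"(cat|dog)s?", "dogs", True),  # "|" chooses between alternatives
    (r"(cat|dog)s?", "cow", False),
    (r"(ab){2}", "abab", True),      # the group is repeated as a unit
    (r"(ab){2}", "abb", False),
]

for pattern, text, expected in CHECKS:
    print(f"{pattern!r:16} {text!r:10} -> {whole(pattern, text)}")
```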

Example (2)

Pattern What does it match?

bag bag

^bag bag at the beginning of line

bag$ bag at the end of line

^bag$ only bag on the line

[Bb]ag Bag or bag
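The table can be checked with a grep-like helper in Python. This is a minimal sketch: the `grep` function and the sample lines are hypothetical, and `re.search` is used because grep also looks for a match anywhere on the line.

```python
import re

def grep(pattern, lines):
    """Return the lines that contain a match, like a very small grep."""
    return [line for line in lines if re.search(pattern, line)]

lines = ["bag", "bagpipe", "handbag", "a Bag of tricks", "big"]

for pattern in ["bag", "^bag", "bag$", "^bag$", "[Bb]ag"]:
    print(f"{pattern:8} {grep(pattern, lines)}")
```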

Example (3)

Pattern What does it match?

b[aeiou]g second letter is vowel

b[^aeiou]g second letter is consonant (or uppercase, symbol)

b.g second letter is any character

^\. any line begins with a dot

^\.[a-z][a-z] ?

^[^\.] ?

bugs* bug, bugs, bugss, etc.
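One way to explore these patterns, including the rows left as questions, is to run them against a few sample strings. The sketch below assumes Python's `re` module; the helper name and the sample strings are invented for illustration.

```python
import re

def matching(pattern, samples):
    """Return the samples in which the pattern matches somewhere."""
    return [s for s in samples if re.search(pattern, s)]

words = ["bag", "bug", "bXg", "b-g", "bg"]
names = [".profile", ".1st", "notes.txt", ""]

print(matching(r"b[aeiou]g", words))       # vowel in the middle
print(matching(r"b[^aeiou]g", words))      # anything but a lowercase vowel
print(matching(r"b.g", words))             # any single character
print(matching(r"^\.", names))             # begins with a literal dot
print(matching(r"^\.[a-z][a-z]", names))   # try to predict this one first
print(matching(r"^[^\.]", names))          # and this one
print(matching(r"^bugs*$", ["bu", "bug", "bugs", "bugss"]))
```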

Example (4)

Pattern What does it match?

"word" ?

"*word"* ?

[A-Z][A-Z]* one or more uppercase letters

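The quoted-word patterns and the uppercase pattern can be probed in Python. This is only a sketch with an invented sample sentence; it also shows that `[A-Z][A-Z]*` behaves like the shorter `[A-Z]+` in engines that support `+`.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None when there is none."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

# "* means zero or more double quotes, so the quotes are optional.
print(first_match(r'"*word"*', 'say "word" now'))
print(first_match(r'"*word"*', "a word"))

# One uppercase letter followed by zero or more = one or more uppercase letters.
sample = "HCMC Uni of CSE"
print(re.findall(r"[A-Z][A-Z]*", sample))
print(re.findall(r"[A-Z]+", sample))
```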

Example (5)

Pattern What does it match?

floating point number

Java identifier

Java simple arithmetic expression
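The patterns for these three items are not given here, so the following Python sketch shows one possible set of answers under stated assumptions: the float pattern also accepts plain integers, the identifier pattern is ASCII-only and does not exclude keywords, and the arithmetic expression has no parentheses (arbitrarily nested parentheses cannot be described by a classical regular expression). All names are hypothetical.

```python
import re

# Assumed shape: optional sign, digits with an optional fraction, optional exponent.
FLOAT = r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)([eE][+-]?[0-9]+)?"

# Assumed ASCII subset of Java identifiers: letter, "_" or "$", then letters/digits.
JAVA_ID = r"[A-Za-z_$][A-Za-z0-9_$]*"

# Assumed "simple" expression: operands separated by binary operators, no parentheses.
OPERAND = r"(?:[A-Za-z_$][A-Za-z0-9_$]*|[0-9]+)"
EXPR = rf"\s*{OPERAND}(?:\s*[-+*/]\s*{OPERAND})*\s*"

def whole(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

for text in ["3.14", "-0.5e10", ".5", "1e", "1.2.3"]:
    print("float     ", repr(text), whole(FLOAT, text))
for text in ["count", "_tmp1", "$x", "2fast", "a-b"]:
    print("identifier", repr(text), whole(JAVA_ID, text))
for text in ["a + 2*b", "a + * b", "a +"]:
    print("expression", repr(text), whole(EXPR, text))
```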

Replacing patterns (1)

Character Pattern

\ turn off special meaning

\n reuse the text matched by the nth subpattern previously saved by \( and \). Numbered from 1 to 9

& reuse the text matched by the search pattern

~, % reuse the previous replacement pattern

\u convert first character of replacement pattern to uppercase

\U convert entire replacement pattern to uppercase

\l, \L same as \u and \U, but convert to lowercase

\e turn off previous \u or \l

\E turn off previous \U or \L
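For comparison, here is how similar replacements look with Python's `re.sub`. This is a sketch under the assumption that Python is an acceptable stand-in: it writes `\g<0>` where the table uses `&`, and it has no `\u` or `\U` escapes in replacement strings, so case changes are done with a replacement function. The sample strings are invented.

```python
import re

# \1, \2: reuse saved subpatterns (Python groups use plain parentheses).
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# &: the whole matched text is \g<0> in Python.
wrapped = re.sub(r"\w+", r"[\g<0>]", "a bc")

# \U equivalent: uppercase the whole match with a function.
shouted = re.sub(r"\w+", lambda m: m.group(0).upper(), "make me loud")

# \u equivalent: uppercase only the first character of each match.
def upper_first(m):
    text = m.group(0)
    return text[:1].upper() + text[1:]

titled = re.sub(r"\w+", upper_first, "van hoai")

print(swapped, wrapped, shouted, titled, sep="\n")
```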

Example (6)

Command Result

s/.*/( & )/ add spaces and parentheses around the line

s/.*/mv & &.old/ ?

/^$/d delete blank lines (in vi, use g/^$/d for all lines)

%s/  */ /g turn one or more spaces into one space

%s/.*/\L&/ lowercase entire file

%s/yes/No/g replace yes with No

%s/Yes/~/g replace Yes with No (reusing the previous replacement)

s/\(F\)\(ORTRAN\)/\1\L\2/g FORTRAN to Fortran
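Some of these commands can be approximated in Python. The function names below are hypothetical, and the translation is a sketch rather than an exact equivalent: `count=1` mimics a substitution without the `g` flag, and a replacement function takes the place of `\L`.

```python
import re

def wrap(line):
    # s/.*/( & )/ : count=1 because ".*" can also match the empty end of the line.
    return re.sub(r".*", r"( \g<0> )", line, count=1)

def drop_blank(lines):
    # g/^$/d : keep only the lines that are not empty.
    return [line for line in lines if not re.search(r"^$", line)]

def squeeze_spaces(line):
    # %s/  */ /g : a space followed by zero or more spaces becomes one space.
    return re.sub(r"  *", " ", line)

def fix_fortran(line):
    # s/\(F\)\(ORTRAN\)/\1\L\2/g : keep group 1, lowercase group 2.
    return re.sub(r"(F)(ORTRAN)", lambda m: m.group(1) + m.group(2).lower(), line)

print(wrap("text"))
print(drop_blank(["a", "", "b"]))
print(squeeze_spaces("a   b  c d"))
print(fix_fortran("FORTRAN 77 and FORTRAN 90"))
```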

Applications

Pattern Matched text

<TAG\b[^>]*>\(.*?\)</TAG> grab a specific HTML tag

[0-9]\{1,3\}\. ???? IP address

[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4} Email address

Valid dates (day-month-year)

WeWe, does not match Wee
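A few of these applications can be sketched in Python. Several parts are assumptions, not patterns taken from the table: the IP address pattern completes the truncated row in one plausible way and adds a numeric range check, `(We){2}` is a guess at the pattern behind the WeWe row, and `b` is a hypothetical tag name. The email pattern is the one listed, applied case-insensitively.

```python
import re

EMAIL = re.compile(r"[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}", re.IGNORECASE)

# Assumed completion: four groups of 1 to 3 digits separated by dots.
IP_SHAPE = re.compile(r"([0-9]{1,3}\.){3}[0-9]{1,3}")

def is_ipv4(text):
    """Check the dotted shape with a regex, then the 0-255 range in code."""
    if not IP_SHAPE.fullmatch(text):
        return False
    return all(int(part) <= 255 for part in text.split("."))

# Assumed pattern for "WeWe, does not match Wee": the group repeated twice.
TWICE_WE = re.compile(r"(We){2}")

# Hypothetical tag name "b"; the non-greedy .*? stops at the first closing tag.
BOLD = re.compile(r"<b\b[^>]*>(.*?)</b>")

print(bool(EMAIL.fullmatch("hoai@cse.hcmut.edu.vn")))
print(is_ipv4("192.168.1.10"), is_ipv4("999.1.1.1"))
print(bool(TWICE_WE.fullmatch("WeWe")), bool(TWICE_WE.fullmatch("Wee")))
print(BOLD.findall("<b>one</b> and <b class='x'>two</b>"))
```

Checking the numeric range in code keeps the regular expression short; encoding 0 to 255 directly in the pattern is possible but much harder to read.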

Text processing utilities is NEXT