faculty.cse.tamu.edufaculty.cse.tamu.edu/slupoli/notes/ScriptingLanguages/Grep.docx  · Web...


Grep
Take a look: https://shapeshed.com/unix-grep/

General idea and theory
grep is used to search for content within a given input
it requires a search term (a regex, or something simple) and the input
   o input can be a file, displayed material piped in, etc.
the results (by default) are the LINES THAT MATCH!!

GREP Command line Flags/Options
showing most, but not all
notice that a regex or something simple can be used within the syntax
there are three groups of flags
   o regex selection and interpretation
   o output control
   o miscellaneous
will go over each option throughout this doc

Selected Regexp selection & interpretation
-i: ignore the case of your search term
-x: return only lines that match the pattern in full (works with a regex or with -F)
-E: interpret the search as an extended regular expression
-F: interpret the search as fixed strings (one per line), so dots, brackets, etc. are taken literally
-f: get the search patterns from this file
-e: use the next argument as the pattern; protects patterns starting with a hyphen
-w: find matches that form a whole word (bounded by non-word characters or by the start/end of the line)
-v: show lines that don't match, instead of those that do

Selected Miscellaneous
--color: add color to the matched output
--help: get some help
-V: get grep's version
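The selection flags above are easier to compare side by side. The following is a minimal sketch, not taken from the original notes: the file sample.txt and its contents are made up for illustration, and the variable names are arbitrary.

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'Error: disk full' 'error' 'a.b' 'axb' '-v is a flag' > "$tmp/sample.txt"

# -i: ignore case, so both "Error" and "error" lines match
ignore_case=$(grep -i 'error' "$tmp/sample.txt")

# -x: only lines that match the pattern in full
whole_line=$(grep -x 'error' "$tmp/sample.txt")

# -F: the dot is a literal dot, so "axb" is not matched
fixed=$(grep -F 'a.b' "$tmp/sample.txt")

# without -F the dot matches any character, so "a.b" and "axb" both match
regex=$(grep 'a.b' "$tmp/sample.txt")

# -e: marks the next argument as the pattern, even if it starts with a hyphen
hyphen=$(grep -e '-v' "$tmp/sample.txt")

# -v: invert the match, here combined with -i
inverted=$(grep -iv 'error' "$tmp/sample.txt")

printf '%s\n' "-i:" "$ignore_case" "-x:" "$whole_line" "-F:" "$fixed" \
  "regex:" "$regex" "-e:" "$hyphen" "-v:" "$inverted"
rm -r "$tmp"
```

Each result is captured in a variable only so the outputs can be printed together at the end; at the prompt you would simply run the grep commands directly.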


Selected Output control
-c: instead of returning matches, return the number of matching lines
-H: print the filename with each match
-m n: stop reading a file after n matching lines
-n: print the line number where each match was found
-q: don't output anything, but exit with status 0 if any match is found (check that status with echo $?)
-A n: print n lines after the match
-B n: print n lines before the match
-C n: print n lines before and after the match
-o: print only the matching part of the line
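A short sketch of several output-control flags follows. The file and its contents are hypothetical, and -m, -o and -A are assumed to be available (they are in GNU and BSD grep, but are not all required by POSIX).

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'alpha' 'beta cat' 'gamma' 'cat cat' 'delta' > "$tmp/sample.txt"

count=$(grep -c 'cat' "$tmp/sample.txt")         # counts matching LINES, not occurrences
numbered=$(grep -n 'cat' "$tmp/sample.txt")      # prefixes each match with its line number
first_only=$(grep -m 1 'cat' "$tmp/sample.txt")  # stops after the first matching line
only_match=$(grep -o 'cat' "$tmp/sample.txt")    # one output line per match
context=$(grep -A 1 'beta' "$tmp/sample.txt")    # matching line plus one line after it

# -q prints nothing; the exit status carries the answer
if grep -q 'gamma' "$tmp/sample.txt"; then found=yes; else found=no; fi

printf '%s\n' "-c: $count" "-n:" "$numbered" "-m 1: $first_only" \
  "-o:" "$only_match" "-A 1:" "$context" "-q: $found"
rm -r "$tmp"
```

Note that -c reports 2 here even though "cat" occurs three times, because two lines contain it.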

The Basic Grep Command(s)
the basic command requires 3 things
   o "grep"
   o search string to match
   o file (input) to look for the search string
but there are variations on how to receive and export data
   o input
        redirection (< for input, > for output)
        pipes
        just name a file
-r is still good to use for recursively going through a directory
   o shown later


Various input methods with basic grep commands
Straight up Grep command
Using STDIN to supply the Grep command

Various result methods
Results to the screen
Redirecting the output to a file
search multiple files that are named similarly
search multiple files that have various names
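The input and result methods listed above can be sketched as follows. The file names (notes1.txt, notes2.txt, other.log, result.txt) and their contents are hypothetical stand-ins, not the files used in the original notes.

```shell
# Hypothetical sample files, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'apple pie' 'banana' > "$tmp/notes1.txt"
printf '%s\n' 'green apple' > "$tmp/notes2.txt"
printf '%s\n' 'apple core' > "$tmp/other.log"

from_file=$(grep 'apple' "$tmp/notes1.txt")        # straight up: just name the file
from_stdin=$(grep 'apple' < "$tmp/notes1.txt")     # STDIN via input redirection
from_pipe=$(cat "$tmp/notes1.txt" | grep 'apple')  # STDIN via a pipe

grep 'apple' "$tmp/notes1.txt" > "$tmp/result.txt" # redirect the results to a file
saved=$(cat "$tmp/result.txt")

# several similarly named files (shell glob) and several differently named files
similar=$(cd "$tmp" && grep 'apple' notes*.txt)
various=$(cd "$tmp" && grep 'apple' notes1.txt other.log)

printf '%s\n' "$from_file" "$from_stdin" "$from_pipe" "$saved" "$similar" "$various"
rm -r "$tmp"
```

When more than one file is searched, grep prefixes each result with the file name, which is why the last two results look different from the first four.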


The -w option
use it when you want to find that particular word
   o surrounded by whitespace or punctuation
the match is excluded if it is a substring of a larger word
BUT a match followed by punctuation or by the end of the line is still considered a valid answer
can give surprising results when the search patterns come from a file (-f)

Using the -w option
Matches before the -w option
(notice it matches the string, but the matches might be attached to other values)
Using the -w option
Using the -w option (noticing it catches those values at the end of a line)
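To make the before/after difference concrete, here is a small sketch with a made-up file words.txt; the contents are assumptions chosen to show a substring match, a whole-word match, and a match at the end of a line.

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'de' 'Rue de la Paix' 'decode this' 'made it' 'the end is de.' > "$tmp/words.txt"

# without -w: every line contains the letters "de" somewhere
plain=$(grep -c 'de' "$tmp/words.txt")

# with -w: "decode" and "made" are dropped; "de." at the end of a line still counts
whole=$(grep -w 'de' "$tmp/words.txt")

printf '%s\n' "lines matched without -w: $plain" "lines matched with -w:" "$whole"
rm -r "$tmp"
```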


Getting search patterns from a file
"search patterns" are placed in a file; each line is a new pattern
be careful what your patterns are
   o Reg Ex (default), e.g. "1.4[0-9]\{4\}$" result.txt
   o just strings need the -F option
   o digits (especially with decimal points), again, need the -F option
do not rely on option -w here!!!
   o -w still applies to patterns read with -f, but a blank line, a trailing space, or a Windows line ending in the pattern file changes what has to match, so the results can look as if -w were ignored
   o clean up the pattern file first, and use the -F option when the patterns are plain strings
the file setup
   o careful: an empty line within the file is not ignored!!
        it is an empty pattern, and an empty pattern matches every line in your input!!
        the single newline that ends the VERY last pattern is fine, but an extra blank line after it is not

File setup is important
Search Pattern File (with extra blank line)
Simple pattern matching (-f)
(notice Lupoli was not there, but could be replaced with another word that would match)
Search Pattern File (with extra blank line)
Result
(and on and on)
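The effect of -F and of a stray blank line in the pattern file can be checked with a sketch like the one below. The data and pattern files are hypothetical and much smaller than the ones used in the original notes.

```shell
# Hypothetical data and pattern files, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'Lupoli 1.45' 'Smith 1945' 'Jones 2.00' > "$tmp/data.txt"
printf '%s\n' '1.4' 'Jones' > "$tmp/patterns.txt"

# default: each line is a regex, so "1.4" also matches "194"
as_regex=$(grep -f "$tmp/patterns.txt" "$tmp/data.txt")

# -F: each line is a plain string, so only a literal "1.4" matches
as_fixed=$(grep -F -f "$tmp/patterns.txt" "$tmp/data.txt")

# a pattern file with an extra blank line: the empty pattern matches every line
printf '%s\n' 'Smith' '' > "$tmp/blank.txt"
with_blank=$(grep -c -f "$tmp/blank.txt" "$tmp/data.txt")

printf '%s\n' "regex patterns:" "$as_regex" "fixed patterns:" "$as_fixed" \
  "lines matched with a blank pattern line: $with_blank"
rm -r "$tmp"
```

The last count equals the total number of lines in the input, which is the "and on and on" effect: the blank line, not the word Smith, is doing the matching.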

The -H option
prints the filename with each match
doesn't add much in the simple examples that select only one file for input
   o with a single file grep leaves the filename out unless -H is given; with several files it prints the filename anyway
but when used with the recursive search (-r), it makes much more sense and is very useful

-H can be helpful when searching many files
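A minimal sketch of -H on its own and together with -r and -n follows. The directory layout (proj/a.txt, proj/sub/b.txt) is invented for the example, and -r is assumed to be supported, as it is in GNU and BSD grep.

```shell
# Hypothetical directory tree, created only for this illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/proj/sub"
printf '%s\n' 'TODO: fix parser' > "$tmp/proj/a.txt"
printf '%s\n' 'nothing here' 'TODO: write tests' > "$tmp/proj/sub/b.txt"

# one file, no -H: just the matching line
single=$(cd "$tmp/proj" && grep 'TODO' a.txt)

# one file, with -H: the filename is printed in front of the match
single_h=$(cd "$tmp/proj" && grep -H 'TODO' a.txt)

# recursive search with filename (-H) and line number (-n); sorted for stable output
recursive=$(cd "$tmp/proj" && grep -rHn 'TODO' . | sort)

printf '%s\n' "$single" "$single_h" "$recursive"
rm -r "$tmp"
```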

Using RegEx in Grep
thankfully, mostly the same syntax
   o overall grep syntax
        except anything with { } (see below)
   o overall RegEx syntax
but, there are some Grep "presets"
   o built-in syntax to find regEx values (covered later)


Basic Regular Expression Notes
(the Matched and !Match columns treat the example as the whole string; grep itself reports a line when the pattern matches anywhere in it, unless the pattern is anchored with ^ and $ or grep is run with -x)

Syntax | Meaning | Example | Matched | !Match
. | any single character | Sh.t | Shot, Shut, etc. | Sht, Shoot
a | this particular character alone | a | a | any other character than a
ab | these particular characters joined | tha. | that, than, thal, thay | any other joined characters than ab
a|b | or (needs -E, or \| in GNU grep) | demo|example | demo, example | c, ab, ba, aa
* | zero or more times | go*gle | ggle, gogle, google, gooooogle | gooogoogle
[abc] | any of these single characters | tha[nt] | than, that | tha, thant
[a-d] | any of these single characters in range | so[b-f] | sob, soc, sod, soe, sof | so, sobb, soy
[^abc] | none of these characters (notice ^ leads off) | | |
[^a-d] | not a character within this range (notice ^ leads off) | so[^b-f] | soa, sog, soh, sot, sos | sob, soc, sod, soe, sof
^ | starts with (notice NOT within a [grouping]) | ^The | These, The, Theatre, Theta | these, Tomas, Darn
$ | ends with | ton$ | cotton, Clinton, ton, Scranton, Easton | jerk, certain
? | zero or one of the item in front (needs -E, or \? in GNU grep) | (dos)?e | dose, e | nose, doe
  | | doss?e (the s in front of ? is targeted) | dosse, dose | dossse, doddoss, dosss
+ | one or more of the item in front (needs -E, or \+ in GNU grep); equivalent to \{1,\} below, reportedly with less overhead | (dos)+e | |
  | | doss+e (the s in front of + is targeted) | |
\{n\} | n times exactly (need a value in front) | w\{3\} | www | ww, w, wwww
  | | (nag){3} = ??? | |
\{n,m\} | from n to m times (need a value in front) | \(blah\)\{3,5\} | blahblahblah, blahblahblahblah | blah, blahblahblah blahblahblah
\{n,\} | at least n times (need a value in front) | | |

Other syntax
[ ]  group of characters to choose from (bracket expression)
\    Escape
\s   White Space
\S   non-White Space
\d   digit character (not supported by grep's basic or extended syntax; use [0-9] or [[:digit:]], or -P in GNU grep)
\D   non-digit character (same restriction as \d)
\w   Word character
\W   non-Word character (punctuation, spaces)
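The repetition and grouping syntax differs slightly between grep's basic syntax (the default) and its extended syntax (-E). The sketch below is an illustration with made-up data, using -x so that each pattern has to match the whole line, the same way the table reads.

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'ww' 'www' 'wwww' 'blahblahblah' 'blah' 'dose' 'dosse' 'room 42' > "$tmp/re.txt"

# basic syntax: braces and grouping parentheses are escaped
exact_bre=$(grep -x 'w\{3\}' "$tmp/re.txt")
group_bre=$(grep -x '\(blah\)\{3,5\}' "$tmp/re.txt")

# extended syntax (-E): no backslashes, and ? is available
exact_ere=$(grep -xE 'w{3}' "$tmp/re.txt")
optional=$(grep -xE 'doss?e' "$tmp/re.txt")

# a portable way to ask for a digit instead of \d
digits=$(grep '[[:digit:]]' "$tmp/re.txt")

printf '%s\n' "$exact_bre" "$group_bre" "$exact_ere" "$optional" "$digits"
rm -r "$tmp"
```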


RegEx examples
Simple RegEx with the -m option
RegEx beginning of a line example
Rookie syntax for finding words that contain 'll'
Better syntax for finding words that contain 'll'
returning the strings that match the regEx
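The original screenshots for these examples are not part of this transcript, so the commands below are a reconstruction under assumptions: the file lines.txt and its contents are made up, and the pattern used for "words that contain ll" is one plausible choice rather than the one shown in class.

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'The hill is tall' 'the ball' 'They all fell' 'no match' > "$tmp/lines.txt"

# simple regex with -m: stop after two matching lines
first_two=$(grep -m 2 'll' "$tmp/lines.txt")

# beginning of a line: ^ anchors the pattern at the start
starts_the=$(grep '^The' "$tmp/lines.txt")

# rookie approach: prints whole lines that happen to contain "ll"
rookie=$(grep 'll' "$tmp/lines.txt")

# returning only the matching strings: -o with letters allowed on both sides of "ll"
words=$(grep -o '[[:alpha:]]*ll[[:alpha:]]*' "$tmp/lines.txt")

printf '%s\n' "$first_two" "$starts_the" "$rookie" "$words"
rm -r "$tmp"
```

The difference between the last two commands is the point of the exercise: the first reports lines, the second reports the words themselves.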

1. As a group (of 2), complete the exercises below.
2. Pick someone within the group to create a file named answers.txt.
3. Copy this file into your local unix directory:
   /afs/umbc.edu/users/s/l/slupoli/pub/labCode433/GREP/grepdata.txt
   or
   http://faculty.cse.tamu.edu/slupoli/notes/ScriptingLanguages/data/grepdata.txt
4. Complete the exercises below, pasting your answers within the file.

Write grep statements that use command-line options along with the pattern to do the following:
1. Print all lines that contain CA in either uppercase or lowercase.
2. Print all lines that contain an email address (they have an @ in them), preceded by the line number.
3. Print all lines that do not contain the word Sep. (including the period).
4. Print all lines that contain the word de as a whole word.

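The -x option
this is like parenthesizing the pattern and then surrounding it with '^' and '$'. (-x is specified by POSIX.)

Grep mixed with Bash with -x
Sample file passed as "$1":
   def main():
       #set variables
       operand1= ""
       operand2= ""

Command: if grep -F "def main" "$1"; then # then incorrect, mark with X
Output:  def main():

Command: if grep -xF "def main" "$1"; then # then incorrect, mark with X
Output:  (nothing)

the second command prints nothing because -x requires the whole line to equal the search string, and the line in the file is "def main():", not "def main"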
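The comparison above can be turned into a small runnable check. This is a sketch under assumptions: student.py is a made-up file standing in for "$1", and check is a hypothetical helper, not part of the original notes.

```shell
# Hypothetical stand-in for the file passed as "$1".
tmp=$(mktemp -d)
printf '%s\n' 'def main():' '    operand1 = ""' '    operand2 = ""' > "$tmp/student.py"

# Hypothetical helper: reports whether some line of file $2 is exactly the string $1.
check() {
  if grep -qxF "$1" "$2"; then echo "found"; else echo "missing"; fi
}

substring=$(grep -F 'def main' "$tmp/student.py")  # -F alone matches part of a line
partial=$(check 'def main' "$tmp/student.py")      # -x: "def main" is not a whole line
exact=$(check 'def main():' "$tmp/student.py")     # -x: the whole line matches

printf '%s\n' "$substring" "$partial" "$exact"
rm -r "$tmp"
```

Using -q inside the helper keeps grep silent, so the if statement reacts only to the exit status.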


Grep Presets
these preset character classes can be used along with RegEx to find what you need
all use ' ' around the pattern and [[ ]] to denote you are using the shortcuts (presets)
   o the inner [: :] names the class, and the outer [ ] is the usual bracket group it sits in

GREP Presets
[[:alnum:]]: any alphanumeric character
[[:alpha:]]: any alphabetic character
[[:cntrl:]]: any control character // watch, no "o"!!
[[:digit:]]: any digit
[[:lower:]]: any lower case character
[[:print:]]: any printable character
[[:space:]]: any whitespace character, including space, tab, newline, CR, FF, etc.

[[:digit:]] preset examples
preset alone
preset along with RegEx (version 1): rookie version of finding words
preset along with RegEx (version 2): using the -w option to find words
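The three preset examples can be approximated as follows. The screenshots are not in this transcript, so the file presets.txt, its contents, and the exact patterns are assumptions chosen to show the same progression.

```shell
# Hypothetical sample data, created only for this illustration.
tmp=$(mktemp -d)
printf '%s\n' 'call 555 now' 'abc123 is a token' 'no numbers here' > "$tmp/presets.txt"

# preset alone: any line that contains at least one digit
has_digit=$(grep '[[:digit:]]' "$tmp/presets.txt")

# version 1: print every run of digits, even when it is glued to letters
runs=$(grep -o '[[:digit:]][[:digit:]]*' "$tmp/presets.txt")

# version 2: add -w so only runs of digits that stand alone as a word are printed
whole_numbers=$(grep -ow '[[:digit:]][[:digit:]]*' "$tmp/presets.txt")

printf '%s\n' "$has_digit" "$runs" "$whole_numbers"
rm -r "$tmp"
```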

Adding to the file, write a series of grep statements that do the following:
1. Print all lines that contain a phone number with an extension (the letter x or X followed by four digits).
2. Print all lines that begin with a minimum of three digits followed by a blank. Your answer must use the \{ and \} repetition specifier.
3. Print all lines that contain a date. Hint: this is a very simple pattern. It does not have to work for any year before 2000.
4. Print all lines containing a vowel (a, e, i, o, or u) followed by a single character followed by the same vowel again. Thus, it will find "eve" or "adam" but not "vera". Hint: \( and \)
5. Print all lines that do not begin with a capital S.


Solutions
Part 1
#1
   grep --color=always -i 'CA' grepdata.txt
   cat grepdata.txt | grep -i --color=always CA
   grep --color=always -i ca grepdata.txt
   grep --color=always -i CA grepdata.txt
#2
   cat grepdata.txt | grep -n --color=always @
   grep --color=always -n '\w*@\w*\.com' grepdata.txt
   grep --color=always @ -n grepdata.txt
#3
   grep --color=always -v -F "Sep." grepdata.txt
   grep --color=always -v 'Sep\.' grepdata.txt
   grep -v -F -w Sep. grepdata.txt
   cat grepdata.txt | grep -v --color=always 'Sep\.'
#4
   grep --color=always -w de grepdata.txt
   grep --color=always '\<de\>' grepdata.txt
   grep -w de grepdata.txt
   cat grepdata.txt | grep -w --color=always de

Part 2
#1
   grep --color=always -i 'x[[:digit:]]\{4\}' grepdata.txt
   grep --color=always -i "X[[:digit:]]\{4\}" grepdata.txt
#2
   grep --color=always '^[[:digit:]]\{3,\}\s' grepdata.txt
   cat grepdata.txt | grep -i --color=always '^[0-9]\{3,\} '
   grep --color=always '^[[:digit:]]\{3\}[[:digit:]]*[[:space:]]' grepdata.txt
   grep --color=always -i "^[[:digit:]]\{3,\}\s" grepdata.txt
#3
   grep --color=always -w '\s20[[:digit:]][[:digit:]]' grepdata.txt
   cat grepdata.txt | grep -i --color=always ', 20[0-9][0-9]'
   grep -i ,'[[:space:]]'2'[[:digit:]]\{3\}' grepdata.txt
#4
   grep '\([aeiou]\).\(\1\)' grepdata.txt
   grep '\([aeiou]\).\1' grepdata.txt
#5
   grep --color=always -v "^S" grepdata.txt


Sources
Tutorials: https://danielmiessler.com/study/grep/
Exercises: http://evc-cit.info/cit052/grep1.html
Grep with color: http://linuxcommando.blogspot.com/2007/10/grep-with-color-output.html
Various: http://www.thegeekstuff.com/2009/03/15-practical-unix-grep-command-examples
-x option: https://www.gnu.org/software/grep/manual/grep.html