Revision Lecture

Mauro Jaskelioff

AWK Program Structure

• AWK programs consist of patterns and procedures

    Pattern_1 { Procedure_1 }
    Pattern_2 { Procedure_2 }
    Pattern_3 { Procedure_3 }
    …
    Pattern_n { Procedure_n }

• Additionally, a program can contain function definitions (but we don’t need to worry about them now)

Example program

• Don’t mind details! Try to recognize the general structure described on the previous slide.

    BEGIN {
        FS = ":"
        print "Example v0.1"
    }
    $7 ~ /bash/ {
        print $1 " uses bash"
    }
    $4 == 0 { print "user " $1 " belongs to the root group" }
    { print "--------------------------------" }

AWK Input

• AWK input consists of records and fields
• Records are separated by a record separator RS
• By default the RS is a newline, so each record is a line of input
• Each record consists of zero or more fields, separated by a field separator FS
• By default the FS is whitespace: fields are separated by runs of spaces and tabs
• The current record is $0. Each of its fields is $1, $2, …

Example of inputs

Consider the following input file:

    Red,255 0 0
    Green,0 255 0
    Blue,0 0 255

• Default RS and default FS
  if $0="Red,255 0 0"
  then $1="Red,255", $2="0" and $3="0"

• With FS=","
  if $0="Red,255 0 0"
  then $1="Red" and $2="255 0 0"
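The two cases above can be checked from the shell. This is a minimal sketch, assuming a POSIX awk is available; the variable names default_fs and comma_fs are only illustrative and do not come from the slides.

```shell
# First record of the sample file, split with the default FS (whitespace).
default_fs=$(printf 'Red,255 0 0\n' | awk '{ print $1 "|" $2 "|" $3 }')
echo "$default_fs"

# Same record, with FS set to a comma through the -F option.
comma_fs=$(printf 'Red,255 0 0\n' | awk -F, '{ print $1 "|" $2 }')
echo "$comma_fs"
```

The "|" characters are only there to make the field boundaries visible in the output.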

AWK’s Main loop (simplified)

    for each input record r do
        parse r
        for each pattern pat_i do
            if r matches pat_i then
                execute proc_i

Patterns

A pattern can be:

• Relational expression
  – Use relational operators, e.g. $1 > $2
      awk -F: '$1 > $2 {print $0}' /etc/passwd
  – Can do numeric or string comparisons
      awk -F: '$1=="gdm" {print $0}' /etc/passwd

• An empty pattern
      awk -F: '{print $0}' /etc/passwd
  – Always true
  – Equivalent to a true expression. For example, the command above is the same as:
      awk -F: '1 < 2 {print $0}' /etc/passwd

Patterns (2)

• Pattern-matching expression
  – E.g. quoted strings, numbers, operators, defined variables…
  – ~ means match, !~ means don’t match
      awk -F: '$1 ~ /.dm.*/ {print $0}' /etc/passwd
      awk -F: '$0 ~ /^...:/ {print $0}' /etc/passwd
      awk -F: '$1 !~ /^g/ {print $0}' /etc/passwd

• /regular expression/
  – Equivalent to $0 ~ /regular expression/
      awk -F: '/^...:/ {print $1}' /etc/passwd
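To try these pattern kinds without touching the real /etc/passwd, the sketch below runs them on three made-up passwd-style records. The sample data and the variable names (sample, by_name, three, not_g) are assumptions for illustration only.

```shell
# Hypothetical passwd-style records (name:password:uid:gid:comment:home:shell).
sample='root:x:0:0:root:/root:/bin/bash
gdm:x:42:42:GNOME Display Manager:/var/lib/gdm:/sbin/nologin
bin:x:1:1:bin:/bin:/sbin/nologin'

# Relational pattern: string comparison on the first field.
by_name=$(printf '%s\n' "$sample" | awk -F: '$1 == "gdm" { print $1 }')

# Regular-expression pattern on $0: three characters followed by a colon.
three=$(printf '%s\n' "$sample" | awk -F: '/^...:/ { print $1 }')

# Negated match: first field does not start with g.
not_g=$(printf '%s\n' "$sample" | awk -F: '$1 !~ /^g/ { print $1 }')

printf '%s\n---\n%s\n---\n%s\n' "$by_name" "$three" "$not_g"
```

With this sample, the first pattern selects gdm, the second selects gdm and bin (root has four characters before the colon), and the third selects root and bin.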

Special patterns

• Two special patterns:
  – BEGIN
    • Specifies procedures that take place before the first input line is processed
        awk 'BEGIN {print "Version 1.0"}' dataFile
  – END
    • Specifies procedures that take place after the last input record is read
        awk 'END {print "end of data"}' dataFile

• This means we need to refine the description of the main loop (see next slide)

AWK’s refined Main loop

    for each BEGIN pattern do
        execute corresponding procedure

    for each input record r do
        parse r
        for each pattern pat_i do
            if r matches pat_i then
                execute proc_i

    for each END pattern do
        execute corresponding procedure

(The middle loop is the previous version of the main loop.)
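The three phases of the refined loop can be observed with a small program. This is a sketch assuming a POSIX awk; the three-line input and the names result and n are illustrative.

```shell
result=$(printf 'a\nb\nc\n' | awk '
BEGIN { print "start" }          # runs once, before any input is read
      { n++ }                    # empty pattern: runs for every record
END   { print "records: " n }    # runs once, after the last record
')
echo "$result"
```

The BEGIN line is printed first, the counter is incremented once per record, and the END procedure reports the total of 3.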

Procedures

• Procedures consist of the usual assignment, conditional, and looping statements found in most languages.

• These are separated by newlines or semi-colons and are contained within curly brackets { }

• A procedure can be omitted. A pattern with no procedure prints $0 (note that an explicitly empty procedure { } does nothing).
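One detail worth checking by hand: a rule whose procedure is left out entirely prints $0, whereas a rule with an explicit but empty { } prints nothing. A minimal sketch, assuming a POSIX awk (the variable names omitted and empty are illustrative):

```shell
# Pattern with no procedure: matching records are printed.
omitted=$(printf 'one\ntwo\n' | awk '/two/')

# Pattern with an explicitly empty procedure: nothing is printed.
empty=$(printf 'one\ntwo\n' | awk '/two/ { }')

echo "omitted: [$omitted]  empty: [$empty]"
```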

awk Built-in Variables

• awk has a number of built-in variables:
  – FILENAME - current filename
  – FS - Field separator
  – NF - Number of fields in current record
  – NR - Number of current record
  – RS - Record separator
  – $0 - Entire input record
  – $n - nth field in current record
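As an illustration of NR and NF, the sketch below prints the record number, the field count and the first field for two lines of the colour file used earlier. It assumes a POSIX awk and the default FS; the variable name vars is illustrative.

```shell
vars=$(printf 'Red,255 0 0\nGreen,0 255 0\n' | awk '{ print NR, NF, $1 }')
echo "$vars"
```

Each record has three whitespace-separated fields, so NF is 3 on both lines while NR counts 1, 2.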

Control Structures

• if (condition) statement
• if (condition) statement else statement
• for (expr1; expr2; expr3) statement
• for (index in array) statement
  – More about this when we review arrays.
• while (condition) statement

For-While equivalence

for (expr1; expr2; expr3) statement

is equivalent to:

    expr1;
    while (expr2) {
        statement;
        expr3
    }
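The equivalence can be seen by writing the same counting loop both ways. This is a sketch assuming a POSIX awk; the variable names with_for and with_while are illustrative.

```shell
# for loop: initialisation, test and increment in the header.
with_for=$(awk 'BEGIN { for (i = 1; i <= 3; i++) printf "%d ", i }')

# Equivalent while loop: expr1 before the loop, expr3 at the end of the body.
with_while=$(awk 'BEGIN { i = 1; while (i <= 3) { printf "%d ", i; i++ } }')

echo "for:   $with_for"
echo "while: $with_while"
```

Both commands produce the same sequence 1 2 3.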

awk Operators

Symbol              Meaning
$                   Field reference
++ --               Increment, decrement
+ - !               Addition, subtraction, negation
* / %               Multiplication, division, modulus
< <= > >= != ==     Relational operators
~ !~                Match regular expression and negation
in                  Array membership
&& ||               Logical and, logical or
?:                  If-then-else for expressions, e.g. x == y ? "Equal" : "Not equal"
= += -= *= /= %=    Assignment
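A few of the operators in the table, exercised in a BEGIN block. This is only a sketch assuming a POSIX awk; the values and the names ops, n and verdict are made up for illustration.

```shell
ops=$(awk 'BEGIN {
    x = 7; y = 7
    n = x % 4                                   # modulus: 7 % 4 is 3
    n++                                         # increment: n becomes 4
    verdict = (x == y) ? "Equal" : "Not equal"  # conditional expression
    print n, verdict
}')
echo "$ops"
```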

Arrays in awk

• awk has arrays with elements subscripted with strings (associative arrays)

• Assign arrays in one of two ways:
  – Name them in an assignment statement
    • myArray[i]=n++
    • myArray["Red"]="255 0 0"
  – Use the split(str,arr,fs) function, which splits the string str into elements of array arr, using field separator fs. It returns the number of elements created.
    • n=split(input, words, " ")

Example of split

    m=split("Blue 0 0 255",colors," ")

results in:

    m ← 4
    colors[1] ← "Blue"
    colors["2"] ← "0"
    colors[3] ← "0"
    colors["4"] ← "255"

• Since indexes are really strings, it's legal to write them enclosed in quotes
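The split call from this slide can be run as is inside a BEGIN block. A minimal sketch, assuming a POSIX awk; the shell variable split_out is illustrative.

```shell
split_out=$(awk 'BEGIN {
    m = split("Blue 0 0 255", colors, " ")
    # Numeric and quoted subscripts refer to the same elements.
    print m, colors[1], colors["4"]
}')
echo "$split_out"
```

The output shows the element count followed by the first and fourth elements, one read with a numeric subscript and one with a quoted subscript.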

Reading elements in an array

• Using a for loop:

      for (i in array)
          print array[i]

  – Since indexes are strings, this is the only way to loop through all elements of an array

• Using the operator in:

      if (i in array)
          ...

  – we use this to test if an index exists.
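Both uses of in can be combined in one small program. This is a sketch assuming a POSIX awk; the array rgb and the variables count, has_red and has_green are hypothetical names chosen for the example.

```shell
array_out=$(awk 'BEGIN {
    rgb["Red"]  = "255 0 0"
    rgb["Blue"] = "0 0 255"
    for (name in rgb) count++                 # visits every index (order is unspecified)
    if ("Red" in rgb) has_red = "yes"         # membership test, index exists
    if (!("Green" in rgb)) has_green = "no"   # membership test, index missing
    print count, has_red, has_green
}')
echo "$array_out"
```

The loop only counts the elements here because the order in which for (name in rgb) visits the indexes is not defined by awk.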