Revision Lecture
Mauro Jaskelioff
AWK Program Structure
• AWK programs consist of patterns and procedures
    Pattern_1 { Procedure_1 }
    Pattern_2 { Procedure_2 }
    Pattern_3 { Procedure_3 }
    …
    Pattern_n { Procedure_n }
• Additionally, a program can contain function definitions (but we don’t need to worry about them now)
Example program
• Don’t mind details! Try to recognize the general structure described on the previous slide.
BEGIN {
    FS = ":"
    print "Example v0.1"
}
$7 ~ /bash/ { print $1 " uses bash" }
$4 == 0     { print "user " $1 " belongs to the root group" }
            { print "--------------------------------" }
AWK Input
• AWK input consists of records and fields
• Records are separated by a record separator RS
• By default the RS is a newline, so each record is a line of input
• Each record consists of zero or more fields, separated by a field separator FS
• By default the FS is blank space.
• The current record is $0. Each of its fields is $1, $2, …
Example of inputs
Consider the following input file:
    Red,255 0 0
    Green,0 255 0
    Blue,0 0 255
• Default RS and default FS
  if $0="Red,255 0 0"
  then $1="Red,255", $2="0" and $3="0"
• With FS=","
  if $0="Red,255 0 0"
  then $1="Red" and $2="255 0 0"
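The field splitting described above can be checked with a small shell sketch. The variable names (input, default_fs, comma_fs) and the "|" used to make field boundaries visible are illustrative choices, not part of the slides.

```shell
# Sample input from the slide.
input='Red,255 0 0
Green,0 255 0
Blue,0 0 255'

# Default FS (blank space): the first field keeps the comma.
default_fs=$(printf '%s\n' "$input" | awk '{ print $1 "|" $2 "|" $3 }')

# FS set to a comma with -F: each record splits into a name and an RGB string.
comma_fs=$(printf '%s\n' "$input" | awk -F, '{ print $1 "|" $2 }')

printf '%s\n' "$default_fs"
printf '%s\n' "$comma_fs"
```

With the default FS the first line prints as Red,255|0|0; with -F, it prints as Red|255 0 0.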
AWK’s Main loop (simplified)
for each input record r do
    parse r
    for each pattern pat_i do
        if r matches pat_i then
            execute proc_i
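One consequence of this loop is that every pattern is tested against every record, so a single record can trigger more than one procedure. A minimal sketch, with made-up numeric input and an illustrative variable name out:

```shell
# Each record is tested against both patterns, in program order.
out=$(printf '5\n15\n25\n' | awk '
$1 > 10 { print $1 " is greater than 10" }
$1 > 20 { print $1 " is greater than 20" }
')

printf '%s\n' "$out"
```

The record 5 matches nothing, 15 matches only the first pattern, and 25 matches both, so three lines are printed.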
Patterns
A pattern can be:
• Relational expression
  – Use relational operators, e.g. $1 > $2
        awk -F: '$1 > $2 {print $0}' /etc/passwd
  – Can do numeric or string comparisons
        awk -F: '$1=="gdm" {print $0}' /etc/passwd
• An empty pattern
        awk -F: '{print $0}' /etc/passwd
  – Always true
  – Equivalent to a true expression. For example, the command above is the same as:
        awk -F: '1 < 2 {print $0}' /etc/passwd
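To try relational patterns without depending on the local /etc/passwd, the sketch below uses a small made-up sample in the same colon-separated layout. The sample contents and variable names are assumptions for illustration only.

```shell
# Hypothetical passwd-style sample: name:password:uid:gid
passwd_sample='gdm:x:42:42
root:x:0:0
alice:x:1000:100'

# Numeric comparison on the third field.
numeric_match=$(printf '%s\n' "$passwd_sample" | awk -F: '$3 >= 1000 { print $1 }')

# String comparison on the first field.
string_match=$(printf '%s\n' "$passwd_sample" | awk -F: '$1 == "gdm" { print $3 }')

printf '%s\n' "$numeric_match" "$string_match"
```

Only alice has a third field of at least 1000, and only the gdm record matches the string test, so the sketch prints alice and then 42.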
Patterns (2)
• Pattern-matching expression
  – E.g. quoted strings, numbers, operators, defined variables…
  – ~ means match, !~ means don’t match
        awk -F: '$1 ~ /.dm.*/ {print $0}' /etc/passwd
        awk -F: '$0 ~ /^...:/ {print $0}' /etc/passwd
        awk -F: '$1 !~ /^g/ {print $0}' /etc/passwd
• /regular expression/
  – Equivalent to $0 ~ /regular expression/
        awk -F: '/^...:/ {print $1}' /etc/passwd
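The matching operators can be exercised the same way. This is a sketch on a made-up passwd-style sample (the data and variable names are illustrative), showing ~, !~ and a bare /regular expression/ pattern.

```shell
# Hypothetical passwd-style sample: name:password:uid:gid
passwd_sample='gdm:x:42:42
root:x:0:0
alice:x:1000:100'

# First field starts with g.
starts_with_g=$(printf '%s\n' "$passwd_sample" | awk -F: '$1 ~ /^g/ { print $1 }')

# First field does not start with g.
not_g=$(printf '%s\n' "$passwd_sample" | awk -F: '$1 !~ /^g/ { print $1 }')

# Bare regular expression: tested against $0 (three characters, then a colon).
three_chars=$(printf '%s\n' "$passwd_sample" | awk -F: '/^...:/ { print $1 }')

printf '%s\n' "$starts_with_g" "$not_g" "$three_chars"
```

Here only gdm starts with g and only gdm has a three-character name, while root and alice are selected by the negated match.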
Special patterns
• Two special patterns:
  – BEGIN
    • Specifies procedures that take place before the first input line is processed
        awk 'BEGIN {print "Version 1.0"}' dataFile
  – END
    • Specifies procedures that take place after the last input record is read
        awk 'END {print "end of data"}' dataFile
• This means we need to refine the description of the main loop (see next slide)
AWK’s refined Main loop
for each BEGIN pattern do
    execute corresponding procedure
for each input record r do
    parse r
    for each pattern pat_i do
        if r matches pat_i then
            execute proc_i
for each END pattern do
    execute corresponding procedure
(The middle loop is the previous version of the main loop.)
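The three phases of the refined loop can be seen in a short sketch. The input lines and the variable names count and out are illustrative assumptions.

```shell
out=$(printf 'a\nb\nc\n' | awk '
BEGIN { print "start"; count = 0 }   # runs once, before any input is read
      { count++ }                    # runs for every input record
END   { print "records: " count }    # runs once, after the last record
')

printf '%s\n' "$out"
```

The BEGIN procedure prints first, the empty-pattern procedure counts the three records silently, and the END procedure reports the total.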
Procedures
• Procedures consist of the usual assignment, conditional, and looping statements found in most languages.
• These are separated by newlines or semi-colons and are contained within curly brackets { }
• A procedure can be omitted. A pattern with no procedure prints $0 for each matching record. (An explicitly empty procedure { } does nothing.)
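A quick sketch of the default procedure: a pattern written with no procedure at all prints the matching record, while a pattern followed by explicit empty braces prints nothing. The input and variable names are made up for illustration.

```shell
# Pattern only, no braces: the default procedure prints $0.
omitted=$(printf 'one\ntwo\n' | awk '/t/')

# Pattern with explicit empty braces: nothing is executed.
empty=$(printf 'one\ntwo\n' | awk '/t/ { }')

printf 'omitted: [%s]\n' "$omitted"
printf 'empty:   [%s]\n' "$empty"
```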
awk Built-in Variables
• awk has a number of built-in variables:
  – FILENAME - Current filename
  – FS - Field separator
  – NF - Number of fields in current record
  – NR - Number of current record
  – RS - Record separator
  – $0 - Entire input record
  – $n - nth field in current record
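A small sketch of NR and NF in use, on made-up input. Writing $NF to reach the last field is simply $n with n taken from NF.

```shell
out=$(printf 'a b c\nd e\n' | awk '{ print NR ": " NF " fields, last is " $NF }')

printf '%s\n' "$out"
```

The first record has three fields and the second has two, so the last field is c and then e.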
Control Structures
• if (condition) statement
• if (condition) statement else statement
• for (expr1; expr2; expr3) statement
• for (key in array) statement
  – More about this when we review arrays.
• while (condition) statement
For-While equivalence
for (expr1; expr2; expr3) statement
is equivalent to:
expr1;
while (expr2) {
    statement;
    expr3
}
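The equivalence can be checked on a concrete loop. This sketch prints the squares of 1 to 3 both ways; the loop body and variable names are illustrative.

```shell
# for form
for_out=$(awk 'BEGIN { for (i = 1; i <= 3; i++) print i * i }')

# while form, following the translation above
while_out=$(awk 'BEGIN { i = 1; while (i <= 3) { print i * i; i++ } }')

printf '%s\n' "$for_out"
printf '%s\n' "$while_out"
```

Both commands print 1, 4 and 9 on separate lines.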
awk Operators
Symbol              Meaning
$                   Field reference
++ --               Increment, decrement
+ - !               Addition, subtraction, logical negation
* / %               Multiplication, division, modulus
< <= > >= != ==     Relational operators
~ !~                Match regular expression and negation
in                  Array membership
&& ||               Logical and, logical or
?:                  If-then-else for expressions, e.g. x == y ? "Equal" : "Not equal"
= += -= *= /= %=    Assignment
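As a sketch of the ?: operator from the table, the command below compares the first two fields of each record. The input is made up, and the parentheses around the conditional are a defensive choice rather than a requirement stated in the slides.

```shell
out=$(printf '3 3\n4 7\n' | awk '{ print $1, $2, ($1 == $2 ? "Equal" : "Not equal") }')

printf '%s\n' "$out"
```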
Arrays in awk
• awk has arrays with elements subscripted with strings (associative arrays)
• Assign arrays in one of two ways:
  – Name them in an assignment statement
    • myArray[i]=n++
    • myArray["Red"]="255 0 0"
  – Use the split(str,arr,fs) function, which splits the string str into elements of array arr, using field separator fs. It returns the number of elements created.
    • n=split(input, words, " ")
Example of split
m=split("Blue 0 0 255",colors," ")
results in:
    m ← 4
    colors[1] ← "Blue"
    colors["2"] ← "0"
    colors[3] ← "0"
    colors["4"] ← "255"
• Since indexes are really strings it's legal to write them enclosed in quotes
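The split example can be run directly inside a BEGIN procedure, since it needs no input. This sketch mixes quoted and unquoted subscripts on purpose to show that they refer to the same elements.

```shell
out=$(awk 'BEGIN {
    m = split("Blue 0 0 255", colors, " ")
    print m
    # colors[2] and colors["2"] name the same element
    print colors[1], colors["2"], colors[3], colors["4"]
}')

printf '%s\n' "$out"
```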
Reading elements in an array
• Using a for loop:
    for (key in array)
        print array[key]
  – Since indexes are strings, this is the only way to loop through all elements of an array
• Using the operator in:
    if (key in array)
        ...
  – we use this to test if an index exists.
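Both uses of in can be combined in one sketch. The array name rgb and its contents are illustrative. The loop only counts elements because awk does not specify the order in which a for-in loop visits them.

```shell
out=$(awk 'BEGIN {
    rgb["Red"]   = "255 0 0"
    rgb["Green"] = "0 255 0"

    n = 0
    for (key in rgb)      # visits every subscript, in unspecified order
        n++

    has_red  = ("Red"  in rgb) ? "yes" : "no"
    has_blue = ("Blue" in rgb) ? "yes" : "no"   # the test does not create rgb["Blue"]
    print n, has_red, has_blue
}')

printf '%s\n' "$out"
```

The loop variable is called key here because index is the name of a built-in awk function and cannot be used as a variable.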