1 P51UST: Unix and Software Tools Unix and Software Tools (P51UST) Awk Programming (2) Ruibin Bai...

Post on 05-Jan-2016


1P51UST: Unix and Software Tools

Unix and Software Tools (P51UST)

Awk Programming (2)

Ruibin Bai (Room AB326)

Division of Computer Science

The University of Nottingham Ningbo, China

This Lecture

• Awk commands

• Loops and conditionals

• Arrays

• Functions


Awk Commands

• Types of commands

– Assignments of variables or arrays

– Input/output

– String operations

– Control-flow commands

– User-defined functions (commands)


Assignment

• User-defined awk variables are initialised to either zero or the empty string, depending on how they are used.

• Assign variables with =, e.g.
– FS = ":"
– var = count+2
– var = max-min

• Assignment syntax is less strict than in the shell
– Spaces are allowed before and/or after the equals sign '='


Input/output (1)

• print statement
– print [arguments] [destination]

– Prints one or more variables, fields, or strings to standard output.

– Arguments separated by spaces are concatenated in the output; arguments separated by commas are separated by OFS (the output field separator).

– Strings are enclosed in quotation marks.

– The output can be redirected to a file:

    print $1, $2, $4 > "myfile"
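The difference between space-separated and comma-separated print arguments can be sketched as follows. The sample input, the OFS value "-" and the shell function name print_demo are assumptions chosen only for illustration.

```shell
# print_demo: hypothetical wrapper around a two-line awk program
print_demo() {
    awk 'BEGIN { OFS = "-" }
    {
        print $1 $2      # space between arguments: values are concatenated
        print $1, $2     # comma between arguments: values are joined by OFS
    }'
}

printf 'alpha beta\n' | print_demo
```

The first print writes alphabeta and the second writes alpha-beta, because only the comma form inserts OFS.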

Input/output (2)

• Formatted printing: printf

• Syntax printf ([format [, values]])

• Very similar syntax to printf in sh.

printf ("*%-10s*%-5d*%+5d\n","hello",10,10)

Output:

*hello     *10   *  +10
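As a further sketch of field widths in printf, the following lines up a name column and a number column. The input data and the function name printf_demo are made up for this illustration.

```shell
# printf_demo: hypothetical wrapper; formats two fields per input record
printf_demo() {
    # %-8s : string, left-justified in a field 8 characters wide
    # %4d  : integer, right-justified in a field 4 characters wide
    awk '{ printf("%-8s|%4d|\n", $1, $2) }'
}

printf 'apple 3\nkiwi 12\n' | printf_demo
```

Unlike print, printf adds no newline of its own, which is why the format string ends in \n.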

Input/output (3)

• getline: read a record from the keyboard or from a file

    getline [variable] [< "input-file"]
or
    command | getline [variable]

• Example: no input file is specified, so getline reads the next record of the main input into $0

    BEGIN {
        printf("Please enter two values > ")
        getline
        printf("$1 = %s\t$2 = %s\n", $1, $2)
    }

getline

• Reading from the keyboard

    getline var-name < "-"
or
    getline var-name < "/dev/stdin"
or
    getline var-name < "/dev/tty"

• Example

    BEGIN {
        printf("Please enter the name I should search for: > ")
        getline name < "/dev/tty"
    }
    $1 == name || $2 == name || $3 == name {
        printf("$1 = %s\t$2 = %s\t$3 = %s\n", $1, $2, $3)
    }

A Note About getline

• getline is a function and does return a value, BUT if you put brackets after it, e.g. getline(), you will get an error!

• Examples

    getline newValue < "myFile"

Here, the input record is assigned to the variable "newValue".

    BEGIN {
        printf "Enter a name:> "
        getline < "-"
        print
    }

In this example, the user is prompted to enter their name. The input is assigned to $0, and the print statement outputs the value of $0 by default.
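The value returned by getline is 1 when a record was read, 0 at end of file and -1 on error, so it can drive a loop. The following is a minimal sketch; the helper name count_lines and the temporary file are assumptions for illustration.

```shell
# count_lines: hypothetical helper; $1 is the path of the file to read
count_lines() {
    awk -v file="$1" 'BEGIN {
        n = 0
        # testing "> 0" stops the loop at end of file and also when the
        # file cannot be opened (getline then returns -1)
        while ((getline line < file) > 0)
            n++
        close(file)
        print n
    }'
}

tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
count_lines "$tmp"
rm -f "$tmp"
```

Writing the loop as while (getline line < file) without the comparison would never terminate for an unreadable file, because -1 counts as true.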

Control-Flow Commands (1)

• Conditionals

    if (condition) {
        statement1
        statement2
        …
    } else {
        statement3
        statement4
        …
    }

• Conditional operators

    <       less than
    <=      less than or equal to
    ==      equal to
    >       greater than
    >=      greater than or equal to
    !=      not equal to
    ~ /re/  matches (contains) the regular expression re
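A short sketch combining if/else with the ~ and >= operators. The pass mark of 50, the sample input and the function name classify are arbitrary choices for this illustration.

```shell
# classify: hypothetical wrapper; labels the first field of each record
classify() {
    awk '{
        if ($1 ~ /^[0-9]+$/) {          # field consists only of digits
            if ($1 >= 50)
                print $1, "pass"
            else
                print $1, "fail"
        } else {
            print $1, "not a number"
        }
    }'
}

printf '72\n35\nabc\n' | classify
```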

Control-Flow Commands (2)

• For loops

    for (x = start; x <= maximum; x++) {
        command(s)
    }

Or

    for (element in array) {
        command(s)
    }

for loop example

    BEGIN {
        for (x = 1; x <= 10; x++)
            print x
    }

Control-flow Commands (3)

• While loops

    BEGIN {
        while (name == "") {
            printf("Give me a name please > ")
            # stop asking if the input ends or cannot be read
            if ((getline name < "/dev/stdin") <= 0)
                break
        }
    }
    $1 == name || $2 == name {
        printf("here are the data you requested:\n\n")
        printf("\t%s\n\n", $0)
    }

Control-flow Commands (4)

• break
– Exit from the enclosing loop

• continue
– Skip the rest of the current iteration and start the next iteration of the loop

• next
– Force awk to immediately stop processing the current record and go on to the next record

• nextfile
– Skip the remainder of the current input file and go on to the next input file
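The following sketch shows next, continue and break working together: it reports the first negative integer in each record and ignores comment lines. The input format and the function name first_negative are assumptions for illustration.

```shell
# first_negative: hypothetical wrapper around the awk program
first_negative() {
    awk '
    /^#/ { next }                 # comment record: go on to the next record
    {
        for (i = 1; i <= NF; i++) {
            if ($i !~ /^-?[0-9]+$/)
                continue          # not an integer: try the next field
            if ($i < 0) {
                print "record " NR ": first negative value is " $i
                break             # stop scanning this record
            }
        }
    }'
}

printf '# header -1\n4 x -7 -9\n1 2 3\n' | first_negative
```

The comment line still counts towards NR, so the match is reported for record 2.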

Arrays in Awk

• awk has arrays with elements subscripted with numbers or strings (associative arrays)

• Assign arrays in one of two ways:

– Name them in an assignment statement

    myArray[i] = n++

– Use the split() function (to be discussed shortly)

    n = split(input, words, " ")

Array in Awk (2)

• Under awk, it's customary to start array indices at 1, rather than 0.

    myarray[1] = "jim"
    myarray[2] = 456

• Array elements can be subscripted with a number or a string.

    BEGIN {
        my_array[1] = "pear"
        my_array[2] = "tree"
        my_array["David"] = "Cassidy"
    }


Reading Elements in an Array

• Using a for loop:

    for (item in array)
        print array[item]

• Using the operator in:

    if (idx in array)
        …

• …use this to see if an element exists. It does so by testing whether its index exists (nawk). The subscript variable is called idx here because index is the name of a built-in awk function.
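A sketch of why the in operator is the safe existence test: merely referring to an element creates it, whereas in only inspects the index. The array contents and the function name membership_demo are invented for this illustration.

```shell
# membership_demo: hypothetical wrapper around the awk program
membership_demo() {
    awk 'BEGIN {
        fruit["pear"] = 1

        # "in" only tests the index; it does not create an element
        if (!("apple" in fruit))
            print "apple is absent"

        # referring to fruit["plum"] creates it with an empty value
        if (fruit["plum"] == "")
            print "plum looks empty"

        n = 0
        for (k in fruit)
            n++
        print n " elements"     # pear and plum, but not apple
    }'
}

membership_demo
```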

An Array Example

    BEGIN {
        my_array[1] = "Partridge"
        my_array[2] = "pear"
        my_array[3] = "tree"
        my_array[13] = "Cassidy"

        print "Print array element using item-in-array for loop:"
        for (i in my_array)
            print i "=" my_array[i]

        print "\nPrint array element using c-style for loop:"
        min = 1; max = 13
        for (i = min; i <= max; i++) {
            if (i in my_array)
                print i "=" my_array[i]
        }
    }


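Notes on the example:

– A value can be stored at any index (here 1, 2, 3 and 13).

– Elements are not necessarily printed in index order by the item-in-array for loop.

– In the C-style loop, (i in my_array) tests whether a value has ever been stored at index i.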

Copying an Array

• The awk language does not support assignment of arrays.

• Thus, to copy an array, you must copy the individual values from one array to the other.

    BEGIN {
        arr_len = split("Mary lamb freezer", my_array)
        for (word in my_array) {
            copy_array[word] = my_array[word]
        }
        for (word in copy_array) {
            print copy_array[word]
        }
    }
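The element-by-element copy can also be packaged as a function, since awk passes arrays to functions by reference. This is only a sketch: the helper name array_copy and the wrapper copy_demo are hypothetical, and split("", b) is used here simply to make b an empty array before the call.

```shell
# copy_demo: hypothetical wrapper around the awk program
copy_demo() {
    awk '
    # array_copy: copy every element of src into dst.
    # k is listed as an extra parameter so that it is local to the function.
    function array_copy(src, dst,    k) {
        for (k in src)
            dst[k] = src[k]
    }
    BEGIN {
        n = split("Mary lamb freezer", a)
        split("", b)                  # make b an empty array
        array_copy(a, b)
        a[1] = "changed"              # the copy is independent of the original
        print n, b[1], b[2], b[3]
    }'
}

copy_demo
```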


Delete an Array Element

• Syntax

    delete array_name[key]

• Example

    BEGIN {
        my_array["purple"] = "Partridge"
        my_array["mountain"] = "pear"
        my_array["majesties"] = "tree"
        my_array["fruited"] = "Cassidy"

        mykey = "fruited"
        delete my_array["mountain"]
        delete my_array[mykey]
        for (i in my_array) {
            print i "=" my_array[i]
        }
    }


String Functions (1)

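• length([argument])
– Return the length of the argument (of $0 if no argument is given)

• index(string, target)
– Return the position (counting from 1) at which target first occurs within string, or 0 if it does not occur

• substr(string, start [, length])
– Return the substring of string that begins at position start and is at most length characters long (up to the end of string if length is omitted)

• split(string, array [, separator])
– Split string into pieces at each separator (FS by default), store them in array[1], array[2], … and return the number of pieces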
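The four string functions can be tried on one sample string. The string itself and the function name string_demo are chosen only for illustration.

```shell
# string_demo: hypothetical wrapper around the awk program
string_demo() {
    awk 'BEGIN {
        s = "unix and software tools"
        print length(s)              # number of characters in s
        print index(s, "and")        # position of "and", counting from 1
        print substr(s, 10, 8)       # 8 characters starting at position 10
        n = split(s, w, " ")         # w[1] .. w[n] hold the words
        print n, w[1], w[n]
    }'
}

string_demo
```

For this string the four print statements write 23, 6, software and "4 unix tools" respectively.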

String Functions (2)

• Assume the target file is /etc/passwd, in which each line holds seven colon-separated fields:

1. Username
2. Password
3. User ID (UID)
4. Group ID (GID)
5. User ID info
6. Home directory
7. Command/shell

String Functions (3)

• Print each user's login and first name using the index and substr functions

    BEGIN {
        print "Here are the user IDs and first names from /etc/passwd"
        FS = ":"
    }
    {
        blank = index($5, " ")              # position of the first space in field 5
        first = substr($5, 1, blank - 1)    # text before that space
        printf("User ID = %-15s \t first name = %-25s\n", $1, first)
    }

String Functions (4)

• Using the split function to print each user's login, first name and last name.

    BEGIN { FS = ":" }
    {
        howmany = split($5, names, " ")
        printf("User ID = %-15s firstname = %-15s lastname = %-15s\n", $1, names[1], names[howmany])
    }

The system() Function

• The system() function allows a programmer to execute a command whilst within an awk script.

    system("cmd")

• The awk script waits for the command to finish before continuing execution.

• The output of the command is NOT available for processing from within awk.

• The system() function returns an exit status which can be tested by the awk script.
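A sketch of testing the exit status returned by system(), and of using the command | getline form when the command's output is needed inside awk. The commands true, false and echo are stand-ins chosen for illustration, and status_demo is a hypothetical wrapper name.

```shell
# status_demo: hypothetical wrapper around the awk program
status_demo() {
    awk 'BEGIN {
        # system() returns the exit status of the command: 0 means success
        if (system("true") == 0)
            print "true succeeded"
        if (system("false") != 0)
            print "false failed"

        # to process the output of a command, read it with: command | getline
        cmd = "echo hello from the shell"
        if ((cmd | getline line) > 0)
            print "captured: " line
        close(cmd)
    }'
}

status_demo
```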

An Example Using system()

    BEGIN {
        if (system("mkdir UST") == 0) {
            if (system("cd UST") != 0)
                print "change directory - failed"
        } else
            print "make directory - failed"
    }


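This example tries to create a new directory called UST. If that succeeds, the code tries to change directory to UST; if it does not, an error is printed. Note that each system() call runs its command in a separate shell, so the cd does not change the working directory of the awk script itself.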

An Example Using system()


$ awk -f create.awk

$ ls UST

$ awk -f create.awk

mkdir: UST: File exists

make directory - failed

Here, the script (called create.awk) is run and is successful. “ls UST” doesn’t return anything because UST is empty.

Here, the script is run for a second time and so the mkdir command fails because UST already exists. The first error is given by the mkdir command, the second error is given by the awk script

User-Defined Functions

• You can define your own functions in awk, in much the same way as you define a function in C or Java

– Thus code that is to be repeated can be grouped together inside a function

– Allows code reuse!

– NOTE: when calling a function you have defined yourself, no space is allowed between the function name and the opening bracket.
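One point worth a sketch: variables in awk are global unless they appear in the function's parameter list, so a common convention is to declare local variables as extra parameters that the caller never passes. The function name sum_fields, the wrapper sum_demo and the sample input are hypothetical.

```shell
# sum_demo: hypothetical wrapper around the awk program
sum_demo() {
    awk '
    # i and total are extra parameters, so they are local to sum_fields
    function sum_fields(    i, total) {
        total = 0
        for (i = 1; i <= NF; i++)
            total += $i
        return total
    }
    {
        i = "untouched"              # global i, not changed by the function
        print sum_fields(), i
    }'
}

printf '1 2 3\n10 20\n' | sum_demo
```

The call is written sum_fields() with no space before the bracket, as the note above requires.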

An Example using a Function and an Array

    # capitalise the first letter of each word in a string
    function capitalise(input) {
        result = ""
        n = split(input, words, " ")
        for (i = 1; i <= n; i++) {
            w = words[i]
            w = toupper(substr(w, 1, 1)) substr(w, 2)
            if (i > 1)
                result = result " "
            result = result w
        }
        return result
    }

    # this is the main program
    { print capitalise($0) }


Break-down of Example

    # capitalise the first letter of each word in a string
    function capitalise(input) {
    …

– input is the variable to be used in the function: it contains whatever the caller called the function with

Break-down of Example (2)

    result = ""

– Set result to be an empty string

    n = split(input, words, " ")

– Take the input and split it up into the array "words", dividing the input wherever there is a space

– n is the result returned by the split function and contains the number of elements in the array "words"

Break-down of Example (3)

    …
    for (i = 1; i <= n; i++) {
        w = words[i]
        w = toupper(substr(w, 1, 1)) substr(w, 2)
        if (i > 1)
            result = result " "
        result = result w
    }
    return result
    }
    …

– for (…): for each element of the array, from 1 to the number of elements…

– w = words[i]: assign the element to w

– toupper(substr(w, 1, 1)): take the substring which starts at the first character and has a length of 1, and capitalise it using toupper()

– substr(w, 2): take the remainder of the string, starting at the 2nd character, and append it to the capitalised character

– result = result " ": tag a space on to the end of the result string (for every word after the first)

– result = result w: tag the next word on to the end of the result string

Break-down of Example (4)

    …
    # this is the main program
    { print capitalise($0) }

– A line starting with # is a comment in awk

– The action calls the capitalise function with the entire input record and prints the result.

Output from Example

• Given the input file:

    In theory there is no difference between theory and practice, but in practice there is

• …our capitalise function will output:

    In Theory There Is No Difference Between Theory And Practice, But In Practice There Is