L_4-1

Fall 2010 ~ Eric Meyer 

8/6/2019 L_4-1

http://slidepdf.com/reader/full/l4-1 1/40

Lesson A

Extracting Information from Files

Objectives

•  Explain the UNIX approach to file processing
•  Use basic file manipulation commands
•  Extract characters and fields from a file using the cut command
•  Rearrange fields inside a record using the paste command
•  Merge files using the sort command
•  Create a new file by combining cut, paste, and sort

UNIX Approach to File Processing

•  Based on the approach that files should be treated as nothing more than character sequences
•  Because you can directly access each character, you can perform a range of editing tasks, which offers flexibility in file manipulation

Understanding UNIX File Types

•  Regular files, also known as ordinary files
 –   Contain information that you create, maintain, and manipulate, and include ASCII and binary files
•  Directories
 –   System files for maintaining file system structure
•  Special files
 –   Character special files relate to serial I/O devices and communicate one character at a time
 –   Block special files relate to devices such as disks and communicate using blocks of data

File Structures

•  Files can be structured in many ways depending on the kind of data they store
•  UNIX stores data, such as letters and product records, as flat ASCII files
•  Three kinds of regular files are
 –   Unstructured ASCII characters
 –   Unstructured ASCII records
 –   Unstructured ASCII trees

Using Input and Error Redirection

•  You can use redirection operators to retrieve input from something other than the standard input device and send output to something other than the standard output device
•  Examples of redirection:
 –   Redirect the ls command output to a file, instead of to the monitor (or screen)
 –   Redirect a program that receives input from the keyboard to receive input from a file instead
 –   Redirect error messages to a file, instead of to the screen, which is the default
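
The three redirections listed above can be sketched as follows. This is a minimal illustration, assuming a POSIX-style shell; the file names (alpha.txt, listing.txt, errors.txt) are made up for the example.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir"
touch alpha.txt beta.txt

# Redirect the ls output to a file instead of the screen.
ls > listing.txt

# Have a program read its input from a file instead of the keyboard.
line_count=$(wc -l < listing.txt)

# Redirect error messages to a file; this ls fails on purpose.
ls no_such_file 2> errors.txt || true
```

Because the shell creates listing.txt before ls runs, the listing includes listing.txt itself.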

Using Input and Error Redirection

Create a file by typing in all the commands, or by redirecting the cat command output to a file

Manipulating Files

•  When you manipulate files, you work with the files themselves, as well as their contents
•  Create files using output redirection
 –   cat command - concatenate text via output redirection
 –   touch command - used to create empty files
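
A short sketch of both ways to create a file, assuming a POSIX-style shell. The file names are hypothetical, and the here-document stands in for text you would otherwise type at the keyboard and end with Ctrl-D.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# cat with output redirection: whatever arrives on standard input
# is written to the new file.
cat > products.txt <<'EOF'
hammer
wrench
EOF

# touch creates an empty file when the name does not exist yet.
touch notes.txt
```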

Manipulating Files

•  Delete files when you no longer need them
 –   rm command - permanently removes a file; an empty directory is removed with rmdir
 –   The -r option of the rm command will remove a directory and everything it contains (recursive delete)
 –   The -f option forces the removal without prompting
 –   So to remove a directory regardless of its contents, use rm -rf **do not do this to / - it will kill your system**
•  Copy files as a means of back-up or as a means to assist with new file creation
 –   cp command - copies the file(s) specified by the source path to the location specified by the destination path
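
The copy and delete commands above can be tried safely inside a scratch directory. This is a sketch with made-up directory and file names, assuming a POSIX-style shell.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p project/docs
echo "draft" > project/docs/report.txt

# cp: source path first, destination path second.
cp project/docs/report.txt report.bak

# rm: permanently remove a single file.
rm project/docs/report.txt

# rm -r: remove a directory and everything beneath it.
rm -r project
```

Only the backup copy survives; there is no undo for rm.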

Manipulating Files

•  Moving a file in order to change the directory that contains it
 –   mv command - removes file from one directory and places it in another
•  Finding a file helps you locate it in the directory structure
 –   find command - searches for the file that has the name you specify
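
A minimal sketch of moving a file and then locating it by name, assuming a POSIX-style shell; the directory and file names are invented for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir inbox archive
echo "invoice" > inbox/bill.txt

# mv: the file leaves inbox and appears in archive.
mv inbox/bill.txt archive/

# find: search from the current directory downward for a file by name.
found=$(find . -name bill.txt)
echo "$found"
```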

Manipulating Files

•  Combining files using output redirection
 –   cat command - concatenate text of two different files via output redirection
 –   paste command - joins text of different files in side by side fashion
•  Extracting fields of a file using output redirection
 –   cut command - extracts specific columns or fields from a file (the file itself is not changed)
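
The sketch below shows cat, paste, and cut working on two small files. The file names and the choice of ":" as field separator are assumptions made for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'Smith\nJones\n' > names.txt
printf '555-0101\n555-0102\n' > phones.txt

# cat: concatenate two files, one after the other, into a third.
cat names.txt phones.txt > combined.txt

# paste: join the files side by side, using ":" as the separator.
paste -d: names.txt phones.txt > directory.txt

# cut: extract the second ":"-separated field from each line.
cut -d: -f2 directory.txt > numbers.txt
```

Here cut recovers exactly the column that paste added, so numbers.txt matches phones.txt.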

Manipulating Files

•  Re-arranging the contents of a file
 –   sort command - sorts a file’s contents alphabetically or numerically
•  The sort command offers many options:
 –   You can sort the contents of a file and redirect the output to another file
 –   You can use a sort key, which provides the option of sorting on a field position within each line
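
As an illustration of both options, the sketch below sorts a small ":"-separated file first alphabetically and then numerically on its second field. The file names and data are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'pear:30\napple:200\nfig:4\n' > stock.txt

# Sort alphabetically and redirect the result to another file.
sort stock.txt > by_name.txt

# Sort numerically on the second ":"-separated field (the sort key).
sort -t: -k2,2n stock.txt > by_count.txt
```

Without the n modifier, the key would compare as text and 200 would sort before 30.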

Lesson B

Assembling Extracted Information

Objectives

•  Create a script file
•  Use the join command to link files using a common field
•  Use the awk command to create a professional-looking report

Using Script Files

•  UNIX users create shell script files to contain commands that can be run sequentially as a set, which helps automate commands and reuse command sequences
•  UNIX users use the vi editor to create script files, then make the script executable using the chmod command with the x argument

Using Script Files

Type out the script and then make it executable using the chmod command.
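
A minimal sketch of that workflow, assuming a POSIX-style shell. The script name report.sh is hypothetical, and the here-document stands in for typing the script in vi.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Write the script file.
cat > report.sh <<'EOF'
#!/bin/sh
# Commands in a script run one after another, top to bottom.
echo "Report start"
echo "Report end"
EOF

# Add execute permission, then run the script by path.
chmod +x report.sh
output=$(./report.sh)
echo "$output"
```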

Using the Join Command to Create the Vendor Report

Use the join command to create reports showing the relationship between two files
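
The slide's own vendor files are not reproduced in this transcript, so the sketch below uses two invented files that share a vendor number as their first field. It assumes a POSIX-style shell and that both files are already sorted on the join field, which join requires.

```shell
workdir=$(mktemp -d)
cd "$workdir"
# vendors.txt:  vendor-number:vendor-name   (hypothetical data)
# products.txt: vendor-number:product       (hypothetical data)
printf '1:Acme\n2:Bolt Co\n' > vendors.txt
printf '1:hammer\n2:wrench\n' > products.txt

# join pairs up lines whose first field matches in both files.
join -t: vendors.txt products.txt > vendor_report.txt
```

Each output line carries the common field once, followed by the remaining fields from the first file and then the second.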

A Brief Introduction to the Awk Program

Awk uses a print formatting function (printf) from the C programming language to achieve a more professional-looking report
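
As a rough illustration of printf-style formatting in awk, the sketch below turns a ":"-separated file into aligned columns. The input data, column widths, and file names are assumptions for the example, not the report from the slide.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf '1:Acme:hammer\n2:Bolt Co:wrench\n' > vendor_report.txt

# -F: sets the field separator. %-10s left-justifies a string in a
# 10-character column, exactly as printf does in C.
awk -F: '
  BEGIN { printf "%-10s %s\n", "Vendor", "Product" }
        { printf "%-10s %s\n", $2, $3 }
' vendor_report.txt > report.txt
```

The BEGIN block runs once before any input is read, which makes it a natural place for a column heading.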

Standard input, output and error

•  standard input (0: stdin)
 –   The default place where a process reads its input (usually the terminal keyboard)
•  standard output (1: stdout)
 –   The default place where a process writes its output (typically the terminal display)
•  standard error (2: stderr)
 –   The default place where a process can send its error messages (typically the terminal display)

Redirecting standard I/O

•  Standard input and output can be redirected, providing a great deal of flexibility in combining programs and UNIX tools
•  Can redirect standard input from a file using <
    a.out < input12
 –   any use of stdin will instead use input12 in this example
•  Can redirect standard output to a file using >
    testprog1 > testout1
    cal > todaycal
    a.out < input12 > testout
 –   the stdout of a.out is directed to file testout in this example
•  Can also redirect stderr and / or stdout at the same time
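
The last point can be sketched as follows, assuming a POSIX-style shell. File descriptor 2 is stderr, so 2> redirects error messages, and 2>&1 sends stderr to wherever stdout currently points. The file names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch present.txt

# stdout and stderr to separate files (ls fails on missing.txt on purpose).
ls present.txt missing.txt > out.txt 2> err.txt || true

# stdout and stderr to the same file.
ls present.txt missing.txt > both.txt 2>&1 || true
```

Order matters in the second form: the > redirection must come before 2>&1, otherwise stderr is pointed at the terminal before stdout is moved to the file.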

Appending to a file

•  The >> operator appends output to the end of a file, rather than replacing the file's contents as > does

    cat textinfo > assign4
    prog1.exe >> assign4
    prog2.exe >> assign4
    cat endinfo >> assign4
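
The difference between the two operators is easy to see with a throwaway file. This is a small sketch, assuming a POSIX-style shell, with a made-up file name.

```shell
workdir=$(mktemp -d)
cd "$workdir"

echo "first"  > log.txt    # > creates the file or replaces its contents
echo "second" >> log.txt   # >> adds to the end
echo "third"  > log.txt    # > again: the two earlier lines are gone
echo "fourth" >> log.txt
```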

Pipes

•  Pipes allow the standard output of one program to be used as the standard input of another program
•  The pipe operator ‘|’ takes the output from the command on the left and feeds it as standard input to the command on the right of the pipe

Examples

    ls | sort -r
    prog1.exe < input.dat | prog2.exe | prog3.exe > output.dat
    ls -l | cut -c 38-80

•  Pipes are more efficient than using intermediate files
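
To make the comparison with intermediate files concrete, the sketch below gets the same result both ways. It assumes a POSIX-style shell; the file names are invented.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'pear\napple\nfig\n' > fruit.txt

# With an intermediate file: two steps and a leftover file to clean up.
sort fruit.txt > sorted.tmp
first_via_file=$(head -n 1 sorted.tmp)
rm sorted.tmp

# With a pipe: sort's stdout feeds head's stdin directly.
first_via_pipe=$(sort fruit.txt | head -n 1)
```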

Another Example

    du -sc * | sort -n | tail

•  The du command reports disk usage (the default unit is 512-byte blocks on many systems; GNU du defaults to 1024-byte blocks). The s and c flags summarize each argument and give a grand total, respectively
•  The sort -n command sorts by numeric value
•  The head and tail commands print a few lines (10 by default) from the head or tail of the file, respectively
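
Since du output depends on the file system, the sketch below uses a fixed list of made-up sizes to show what the sort -n and head/tail stages do. It assumes a POSIX-style shell.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf '40 d\n8 b\n120 e\n3 a\n15 c\n' > sizes.txt

# sort -n orders by numeric value, so 8 comes before 15 and 120 comes last.
smallest_two=$(sort -n sizes.txt | head -n 2)
largest_two=$(sort -n sizes.txt | tail -n 2)
```

A plain sort would compare the lines as text and place 120 before 15, which is why the pipeline on the slide uses -n.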

Chapter Summary

•  UNIX supports regular files, directories, and character and block special files
•  File structures depend on the data being stored; the three kinds of regular files are unstructured ASCII characters, records, and trees
•  When running, UNIX receives input from the standard input device (keyboard), also known as stdin, and sends output to the standard output device (monitor), also known as stdout. Another standard device, stderr, refers to the error file that defaults to the monitor

Chapter Summary

•  The touch command updates a file’s time and date stamps and creates empty files
•  The rmdir command removes empty directories
•  The cut command extracts specific columns or fields from a file
•  To combine two or more files, use the paste command
•  Use the sort command to sort a file’s contents alphabetically or numerically

Chapter Summary

•  To automate command processing, include commands in a script file that you can later execute as a program
•  Use the join command to extract data from two files sharing a common field and use this field to join the two files
•  Awk is a pattern-scanning and processing language useful for creating a formatted report with a professional look