January 13, 2000
Files – Chapter 2
Basic File Processing Operations
Outline
• Physical versus Logical Files
• Opening and Closing Files
• Reading, Writing and Seeking
• Special Characters in Files
• The Unix Directory Structure
• Physical Devices and Logical Files
• Unix File System Commands
Physical versus Logical Files
• Physical File: A collection of bytes stored on a disk or tape.
• Logical File: A “Channel” (like a telephone line) that hides the details of the file’s location and physical format from the program.
• When a program wants to use a particular file, “data”, the operating system must find the physical file called “data” and make the hookup by assigning a logical file to it. This logical file has a logical name which is what is used inside the program.
Opening Files
• Once we have a logical file identifier hooked up to a physical file or device, we need to declare what we intend to do with the file:
– Open an existing file
– Create a new file
• Either operation makes the file ready for use by the program. We are positioned at the beginning of the file and are ready to read or write.
Opening Files in UNIX/C
• The UNIX system function open( ) is used to open an existing file or create a new file.
fd = open(filename, flags, [pmode]);
– fd: the file descriptor, i.e., the logical file name. The fd is an integer. If the attempt to open the file fails, fd is -1.
– filename: the physical file name. The filename argument can be a pathname.
– flags: an integer argument that controls the operation of the open function. The value of flags is formed by a bitwise OR of the following values:
• O_APPEND: Append every write operation to the end of the file.
• O_CREAT: Create the file if it does not already exist.
• O_EXCL: Return an error if O_CREAT is specified and the file already exists.
• O_RDONLY: Open a file for reading only.
• O_RDWR: Open a file for reading and writing.
• O_TRUNC: Truncate an existing file to a length of 0, destroying its contents.
• O_WRONLY: Open a file for writing only.
• and many others for synchronization.
Opening Files in UNIX/C (cont’d)
– pmode: an integer argument that specifies the protection mode.
• If O_CREAT is specified, pmode is required.
• In UNIX, the pmode is a three-digit octal number that indicates how the file can be used by the owner (1st digit), by members of the owner’s group (2nd digit), and by everyone else (3rd digit). r: read permission, w: write permission, x: execute permission.
pmode = 751 (octal):
    owner   group   world
    r w x   r w x   r w x
    1 1 1   1 0 1   0 0 1
• File protection is tied more to the operating system than to a specific language.
– Examples:
fd = open(filename, O_RDWR | O_CREAT, 0751);
fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0751);
fd = open(filename, O_RDWR | O_CREAT | O_EXCL, 0751);
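The effect of the flag combinations above can be sketched in a few lines of C. The helper names create_exclusive and create_or_truncate are hypothetical, and the mode 0644 (owner reads and writes, everyone else only reads) is an illustrative choice rather than the 0751 used in the slides.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a brand-new file and refuse to touch an existing one.
   Returns a descriptor (>= 0) on success or -1 on failure.
   Hypothetical helper, not a system call. */
int create_exclusive(const char *filename)
{
    assert(filename != NULL);
    /* O_EXCL makes open() fail with errno == EEXIST if the file exists. */
    return open(filename, O_RDWR | O_CREAT | O_EXCL, 0644);
}

/* Open a file for reading and writing, creating it if necessary and
   discarding any previous contents. Hypothetical helper. */
int create_or_truncate(const char *filename)
{
    assert(filename != NULL);
    return open(filename, O_RDWR | O_CREAT | O_TRUNC, 0644);
}
```

Calling create_exclusive twice on the same name should succeed the first time and return -1 the second time, which is how a program can avoid overwriting a file by accident.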
Closing Files
• Makes the logical file name available for another physical file (it’s like hanging up the telephone after a call).
• Ensures that everything has been written to the file [since data is written to a buffer prior to the file].
• Files are usually closed automatically by the operating system when the program terminates (unless the program is abnormally interrupted).
Reading
• Read(Source_file, Destination_addr, Size)
• Source_file = location the program reads from, i.e., its logical file name
• Destination_addr = first address of the memory block where we want to store the data.
• Size = how much information is being brought in from the file (byte count).
Writing
• Write(Destination_file, Source_addr, Size)
• Destination_file = the logical file name where the data will be written.
• Source_addr = first address of the memory block where the data to be written is stored.
• Size = the number of bytes to be written.
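In UNIX/C the generic Read and Write operations above correspond to the system calls read( ) and write( ), each taking a descriptor, a memory address, and a byte count. The following is a minimal sketch of a file copy built on them; the function name copy_file, the 4096-byte buffer, and the mode 0644 are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Copy every byte of src to dst using read() and write().
   Returns the number of bytes copied, or -1 on error.
   Hypothetical helper; the buffer size is an arbitrary choice. */
long copy_file(const char *src, const char *dst)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    assert(src != NULL && dst != NULL);
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }
    /* read() returns the byte count actually read, 0 at end-of-file,
       and -1 on error. */
    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {      /* write() may accept fewer bytes than asked */
            ssize_t w = write(out, buf + done, (size_t)(n - done));
            if (w < 0) {
                close(in);
                close(out);
                return -1;
            }
            done += w;
        }
        total += n;
    }
    close(in);
    close(out);
    return (n < 0) ? -1 : total;
}
```

Reading in blocks rather than one byte at a time keeps the number of system calls small, which matters because each call crosses into the operating system.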
Seeking
• A program does not necessarily have to read through a file sequentially: it can jump to specific locations in the file, or to the end of the file so as to append to it.
• The action of moving directly to a certain position in a file is often called seeking.
• Seek(Source_file, Offset)
– Source_file = the logical file name in which the seek will occur
– Offset = the number of positions in the file the pointer is to be moved from the start of the file.
• The seek function in UNIX/C: lseek( )
pos = lseek(fd, byte_offset, origin)
– pos: a long integer value returned by lseek( ), equal to the position of the file pointer after the move, measured in bytes from the beginning of the file.
– fd: the file descriptor.
– byte_offset: the number of bytes to move from some origin in the file. The byte_offset is a long integer and can be negative.
– origin: a value that specifies the starting position from which the byte_offset is to be taken. The values of origin:
• SEEK_SET: lseek( ) from the beginning of the file;
• SEEK_CUR: lseek( ) from the current position;
• SEEK_END: lseek( ) from the end of the file.
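As a rough illustration of the three origins, the sketch below finds a file's length with SEEK_END and reads its last few bytes using a negative offset. The helper names file_size and read_tail are hypothetical. On current systems the offset and return type of lseek( ) is off_t, which plays the role of the long integer described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the size of an open file in bytes, leaving the file pointer
   where it was. Returns -1 on error, for example when fd refers to a
   pipe or terminal, which cannot seek. Hypothetical helper. */
off_t file_size(int fd)
{
    assert(fd >= 0);
    off_t here = lseek(fd, 0, SEEK_CUR);    /* remember the current position */
    if (here == (off_t)-1)
        return -1;
    off_t end = lseek(fd, 0, SEEK_END);     /* offset 0 from the end = length */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)   /* restore the position */
        return -1;
    return end;
}

/* Read the last n bytes of the named file into buf (fewer if the file
   is shorter). Returns the number of bytes read, or -1 on error.
   Hypothetical helper. */
ssize_t read_tail(const char *filename, char *buf, size_t n)
{
    assert(filename != NULL && buf != NULL);
    int fd = open(filename, O_RDONLY);
    if (fd < 0)
        return -1;
    off_t size = file_size(fd);
    if (size < 0) {
        close(fd);
        return -1;
    }
    off_t back = ((off_t)n < size) ? (off_t)n : size;
    ssize_t got = -1;
    if (lseek(fd, -back, SEEK_END) != (off_t)-1)  /* negative offset from end */
        got = read(fd, buf, (size_t)back);
    close(fd);
    return got;
}
```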
C/C++ Streams
• In C/C++, a file (or another device, such as the keyboard) is a stream of data.
• There are two sets of I/O operations.
– C streams in stdio.h
– C++ stream classes in iostream.h and fstream.h
• Comparison between UNIX/C operations and C/C++ streams
– Both support a complete set of file operations.
– UNIX/C operations:
• Available mostly on UNIX (also in Microsoft Visual C++)
• Fast
• Low level
– C/C++ streams:
• Standard C/C++ features, available on almost all operating systems
• Provide structured I/O
C Streams
• Three standard streams: stdin, stdout, and stderr.
• Opening a file
fopen(const char *filename, const char *mode)
• Closing a file
fclose(FILE *fp)
• Reading a file
fread(void *buf, size_t size, size_t num, FILE *fp) // read num items of size bytes each into buf from fp
fgetc(FILE *fp) // return the next character from fp
fgets(char *buf, int size, FILE *fp) // read a line, or at most size - 1 characters, into buf from fp
fscanf(FILE *fp, const char *format, ...) // read and format data from fp
C Streams (Cont.)
• Writing a file
fwrite(const void *buf, size_t size, size_t num, FILE *fp) // write num items of size bytes each from buf to fp
fputc(int ch, FILE *fp) // write the character ch to fp
fputs(const char *buf, FILE *fp) // write the string in buf to fp
fprintf(FILE *fp, const char *format, ...) // write formatted data to fp
• Seeking in a file
fseek(FILE *fp, long offset, int origin)
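The C stream calls listed above can be combined as in the following sketch, which writes text lines with fprintf( ) and reads one back with fgets( ). The helper names write_lines and read_line are hypothetical, and the code assumes every line is shorter than the caller's buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Write each string followed by a newline.
   Returns 0 on success, -1 on failure. Hypothetical helper. */
int write_lines(const char *filename, const char *const lines[], int count)
{
    assert(filename != NULL && lines != NULL);
    FILE *fp = fopen(filename, "w");
    if (fp == NULL)
        return -1;
    for (int i = 0; i < count; i++) {
        if (fprintf(fp, "%s\n", lines[i]) < 0) {
            fclose(fp);
            return -1;
        }
    }
    /* fclose() flushes the buffer, so its result matters too. */
    return (fclose(fp) == 0) ? 0 : -1;
}

/* Read line number index (counting from 0) into buf, newline included.
   Returns 0 on success, -1 if the file has too few lines or cannot be
   opened. Assumes each line fits in size - 1 characters.
   Hypothetical helper. */
int read_line(const char *filename, int index, char *buf, int size)
{
    assert(filename != NULL && buf != NULL && size > 1);
    FILE *fp = fopen(filename, "r");
    if (fp == NULL)
        return -1;
    int found = -1;
    for (int i = 0; fgets(buf, size, fp) != NULL; i++) {
        if (i == index) {
            found = 0;
            break;
        }
    }
    fclose(fp);
    return found;
}
```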
C++ Stream Classes
• C++ handles file I/O by creating objects of the stream classes.
• Standard stream objects: cin, cout, cerr, clog
• Stream classes:
– in file iostream.h: ios, istream, ostream, iostream
– in file fstream.h: ifstream, ofstream, fstream
• Class hierarchy:
              ios
            /     \
      istream     ostream
      /     \     /     \
ifstream    iostream    ofstream
               |
            fstream
• Opening a file
– constructor
– member function open
• Closing a file
– destructor
– member function close
• Reading a file
– overloaded extraction operator >>
– many others: read, get, getline
• Writing a file
– overloaded insertion operator <<
– many others: write, put
• Seeking in a file
– seekg: set the read/get pointer
– seekp: set the write/put pointer
The LIST Program
• A simple file processing program: LIST
– Display a prompt for the name of the input file.
– Read the user’s response from the keyboard into a variable called filename.
– Open the file for input.
– While there are still characters to be read from the input file,
• read a character from the file and,
• write the character to the terminal screen.
– Close the input file.
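/* read characters from a file and write them to the terminal screen */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    char c;
    int fd;                                     /* file descriptor */
    char filename[20];

    printf("Enter the name of the file: ");     /* step 1 */
    if (fgets(filename, sizeof filename, stdin) == NULL)    /* step 2 */
        return 1;
    filename[strcspn(filename, "\n")] = '\0';   /* drop the trailing newline */
    fd = open(filename, O_RDONLY);              /* step 3 */
    if (fd < 0)
        return 1;                               /* the file could not be opened */
    while (read(fd, &c, 1) > 0)                 /* step 4a */
        putchar(c);     /* step 4b; write(stdout, &c, 1) does not work because stdout is a FILE *, not a descriptor */
    close(fd);                                  /* step 5 */
    return 0;
}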
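// listc.cpp
// program using C streams to read characters from a file
// and write them to the terminal screen
#include <stdio.h>
#include <string.h>

int main()
{
    char ch;
    FILE *file;                                 // stream pointer
    char filename[20];

    printf("Enter the name of the file: ");     // Step 1
    if (fgets(filename, sizeof filename, stdin) == NULL)    // Step 2
        return 1;
    filename[strcspn(filename, "\n")] = '\0';   // drop the trailing newline
    file = fopen(filename, "r");                // Step 3
    if (file == NULL)
        return 1;                               // the file could not be opened
    while (fread(&ch, 1, 1, file) != 0)         // Step 4a
        fwrite(&ch, 1, 1, stdout);              // Step 4b
    fclose(file);                               // Step 5
    return 0;
}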
// listcpp.cpp DO THIS ONE...
// list contents of file using C++ stream classes
#include <fstream>      // standard header; older compilers used <fstream.h>
#include <iomanip>
#include <iostream>
using namespace std;

int main()
{
    char ch;
    fstream file;                               // declare fstream unattached
    char filename[20];

    cout << "Enter the name of the file: "      // Step 1
         << flush;                              // force output
    cin >> setw(sizeof filename) >> filename;   // Step 2 (setw keeps the read within the buffer)
    file.open(filename, ios::in);               // Step 3
    file.unsetf(ios::skipws);                   // include white space in read
    while (1) {
        file >> ch;                             // Step 4a
        if (file.fail()) break;
        cout << ch;                             // Step 4b
    }
    file.close();                               // Step 5
    return 0;
}
Detecting End-of-File
• In UNIX/C
– read returns 0
• Using C streams
– fread returns fewer items than requested (0 when reading one item at a time)
– feof returns true
• Using C++ stream classes
– fail returns true
– eof returns true
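For C streams, a short read from fread( ) can mean either end-of-file or an error, so the two are usually told apart with feof( ) and ferror( ). The sketch below shows one way to do that; count_bytes is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes remaining in a stream by reading one at a time.
   Returns the count, or -1 if reading stopped because of an error
   rather than end-of-file. Hypothetical helper. */
long count_bytes(FILE *fp)
{
    char ch;
    long n = 0;

    assert(fp != NULL);
    while (fread(&ch, 1, 1, fp) == 1)   /* one item read: not at end-of-file */
        n++;
    /* fread() returned 0 items; find out why. */
    if (ferror(fp))
        return -1;
    assert(feof(fp));                   /* otherwise it must be end-of-file */
    return n;
}
```

Note that feof( ) only becomes true after a read has already hit the end of the file, which is why the loop tests the return value of fread( ) rather than calling feof( ) first.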
Special Characters in Files I
• Sometimes the operating system attempts to make life easier for “regular” users by automatically adding or deleting characters for them.
• These modifications, however, make the life of programmers building sophisticated file structures (YOU) more complicated!
Special Characters in Files II: Examples
• Control-Z is added at the end of all files (MS-DOS). This is to signal an end-of-file.
• <Carriage-Return> + <Line-Feed> are added to the end of each line (again, MS-DOS).
• <Carriage-Return> is removed and replaced by a character count on each line of text (VMS)
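A program that processes raw bytes has to deal with these characters itself. As a small sketch, the functions below scan a block of bytes for Carriage-Return + Line-Feed pairs and for a Control-Z byte (value 0x1A); the names count_crlf and length_before_ctrl_z are hypothetical, and the code assumes the bytes were read without any translation by the I/O library.

```c
#include <assert.h>
#include <stddef.h>

/* Count Carriage-Return + Line-Feed pairs in a block of raw bytes.
   Hypothetical helper. */
size_t count_crlf(const char *buf, size_t len)
{
    size_t pairs = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n')
            pairs++;
    }
    return pairs;
}

/* Return the number of bytes that precede the first Control-Z (0x1A),
   or len if the block contains none. Hypothetical helper. */
size_t length_before_ctrl_z(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x1A)
            return i;
    }
    return len;
}
```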
The Unix Directory Structure I
• Any computer system holds many files (hundreds or thousands). These files need to be organized using some method. In Unix, this is called the File System.
• The Unix File System is a tree-structured organization of directories, with the root of the tree represented by the character “/”.
• Each directory can contain regular files or other directories.
• The file name stored in a Unix directory corresponds to its physical name.
The Unix Directory Structure II
• Any file can be uniquely identified by giving its absolute pathname, e.g., /usr6/mydir/addr. (See the next slide.)
• The directory you are in is called your current directory.
• You can refer to a file by the path relative to the current directory.
• “.” stands for the current directory and “..” stands for the parent directory.
Physical Devices and Logical Files
• Unix has a very general view of what a file is: it corresponds to a sequence of bytes with no worries about where the bytes are stored or where they come from.
• Magnetic disks or tapes can be thought of as files and so can the keyboard and the console.
• No matter what the physical form of a Unix file (real file or device), it is represented in the same way in Unix: by an integer.
Stdout, Stdin, Stderr
• Stdout --> Console
fwrite(&ch, 1, 1, stdout);
• Stdin --> Keyboard
fread(&ch, 1, 1, stdin);
• Stderr --> Standard Error (again, Console)
[When the compiler detects an error, the error message is written to this file.]
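Because Unix represents every open file or device by an integer, the three standard streams also have descriptor numbers: 0, 1, and 2, available as STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO in unistd.h. The sketch below writes to a descriptor directly; put_message is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a message straight to a descriptor, bypassing the stdio buffer.
   Returns the number of bytes written, or -1 on error.
   Hypothetical helper. */
ssize_t put_message(int fd, const char *msg)
{
    assert(msg != NULL);
    return write(fd, msg, strlen(msg));
}
```

For example, put_message(STDERR_FILENO, "disk full\n") sends the text to standard error, and fileno(stdout) returns the descriptor behind the stdout stream.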
I/O Redirection and Pipes
• < filename [redirect stdin to “filename”]
• > filename [redirect stdout to “filename”]
E.g., a.out < my-input > my-output
• program1 | program2 [take any stdout output from program1 and use it in place of any stdin input to program2]
E.g., list | sort
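Redirection works because stdout is just descriptor 1: whatever file that descriptor refers to receives the output. The following sketch shows roughly what has to happen for “> filename”, using the system calls dup( ) and dup2( ). The helper names redirect_stdout and restore_stdout are hypothetical, and a real shell performs these steps in a child process before starting the program.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Make descriptor 1 (stdout) refer to filename. Returns a duplicate of
   the old stdout descriptor for restore_stdout(), or -1 on error.
   Hypothetical helper. */
int redirect_stdout(const char *filename)
{
    assert(filename != NULL);
    fflush(stdout);                     /* empty the buffer before switching */
    int saved = dup(STDOUT_FILENO);     /* keep the old destination */
    if (saved < 0)
        return -1;
    int fd = open(filename, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) {  /* descriptor 1 now refers to the file */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                          /* descriptor 1 keeps the file open */
    return saved;
}

/* Undo redirect_stdout(). Returns 0 on success, -1 on error.
   Hypothetical helper. */
int restore_stdout(int saved)
{
    fflush(stdout);                     /* push pending output to the file */
    int rc = dup2(saved, STDOUT_FILENO);
    close(saved);
    return (rc < 0) ? -1 : 0;
}
```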
Unix System Commands
• cat filenames --> Print the contents of the named text files.
• tail filename --> Print the last 10 lines of the text file.
• cp file1 file2 --> Copy file1 to file2.
• mv file1 file2 --> Move (rename) file1 to file2.
• rm filenames --> Remove (delete) the named files.
• chmod mode filename --> Change the protection mode on the named file.
• ls --> List the contents of the directory.
• mkdir name --> Create a directory with the given name.
• rmdir name --> Remove the named directory.