CPPS CHAPTER NO 5€¦ · Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 4 of 29...

29
CE-102 CPPS ( Chapter # 5: Handling Files In C ) Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 1 of 29 CE-102 C OMPUTER P ROGRAMMING & P ROBLEM S OLVING CPPS C HAPTER N O 5 H ANDLING F ILES IN C Computer Engineering Department Sir Syed University of Engineering and Technology Tel: 111994994, 4988000-2

Transcript of CPPS CHAPTER NO 5€¦ · Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 4 of 29...

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 1 of 29

CE-102

COMPUTER PROGRAMMING & PROBLEM SOLVING

CPPS

CHAPTER NO 5

HANDLING FILES IN C

Computer Engineering Department

Sir Syed University of Engineering and Technology

Tel: 111994994, 4988000-2

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 2 of 29

COMPUTER PROGRAMMING AND PROBLEM SOLVING ( CPPS )

HANDLING FILES IN C CHAPTER # 5

Handling Files in C ♦ Introduction to files

♦ File Opening modes

♦ Reading and Writing Files

♦ Binary and Text Modes

♦ I/O (Standard I/O and System I/O )

♦ Console I/O function

♦ Record I/O

♦ Random Access Most programs need to read and write data to disk-.based storage systems. Word processors need to store text files, spreadsheets need to store contents of cells, and databases need to store records. In this chapter explore the facilities that C makes available for input and output ( I/O ) disk system. Disk I/O operations are performed on entities called files. A collection of bytes that is given a name. In most microcomputer systems, are used as a unit of storage primarily on floppy disk and fixed-disk storage systems. The major use of the MS-DOS or PC handles files: to manage their storage on the disk, load them into them, delete them, and so forth. Standard I/O Versus System I/O Probably the broadest division in C file I/O is between standard I/O ( often called stream I/O), and system-level (or low-level I/O).

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 3 of 29

Character, String, Formatted, and Record I/O The standard I/O package makes available tour different ways of reading and writing data. Relationship of standard and system I/O

♦ Character getc ( ) ; putc ( )

♦ String fgets ( ) ; fputs ( )

♦ Formatted fscanf ( ) ; fprintf ( )

♦ Record fread ( ) ; fwrite ( )

Text Versus Binary Another way to categorize file I/O operations is according to whether files are opened in text mode or binary mode. Which of these two modes is used to open the file determines how various details of file management are handled; newlines are stored, for example, and how end-of-file is indicated. The reason that these two modes exist is historical: UNIX systems (on which C was developed) do things one-way, while MS-DOS does them another. Just to confuse matters, there is a second distinction between text binary: the format used for storing numbers on the disk. In test format numbers are stored as strings of characters, while in binary format as they are in memory: two bytes for an integer, four for floating point.

“Text versus binary mode is concerned with new line and EOF translation; text versus binary format is concerned with number representation.”

Standard Input/Output Standard Input/Output is also called stream I/O. Standard I/O is probably the easier to program and understand, so we'll start with it. Of the four kinds of standard I/O, we'll explore the first three in this section: character I/O, string I/O, and formatted I/O.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 4 of 29

Character Input/Output Our first example program will take characters typed at the keyboard and, one at a time, write them to a disk file. Program // writec.c // write one charector at a time in a file void main (void) {

FILE *fptr ; // ptr to FILE char ch; fptr = fopen ( " textfile.txt " , "w" ) ; // open FILE set fptr

while ( ( ch = getche ( ) ) !=' \ r ' ) // get character putc ( ch , fptr ) ; //write it to file

fclose ( fptr ) ; //close file } In operation the program sits there and waits for you to type a line of text. When you've finished, you hit the [Return] key to terminate the program. Output: C> writec Hello Friend C> What you've typed will be written to a file called textfile.txt. To see that the file has in fact been created, you can use the DOS TYPE function, which will read the contents of the file. C> type textfile.txt Hello Friend C> Opening a File Before we can write a file to a disk, or read it, we must open it. Opening a file establishes an understanding between our program and the operating system about which file we're going to access and how we're going to do it. We provide the operating system with the name of the file and other information such as whether we plan to read or write to it. Communication areas are the set up between the file and our program. One of these areas is a C structure that holds information about the file.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 5 of 29

This structure, which is defined to be of type struct FILE, is our contact point. When we make a request to open a file, what we receive back. ( if the request is indeed granted) is a pointer to a particular FILE structure. Each file we open will have its own FILE structure, with a pointer to it.

Figure: Opening a File The FILE structure contains information about the file being used, such as its current size and the location of its data buffers. The FILE structure declared in the header file stdio.h (for "standard I/O") .It is necessary to #include this file with your program whenever you use standard I/O. In the writec.c program we first declare a variable of type pointer-to-FILE, in the statement: FILE *fptr ; Then, we open the file with the statement: fptr = fopen ( "textfile.txt" , "w" ) ; This tells the operating system to open a file called "textfile.txt". (We could also have specified a complete MS-DOS path name here, such as \samples\jan \textfile.txt.). This statement also indicates, via the "w", that we will be writing to the file. The fopen ( ) function returns a pointer to the FILE. structure for our file, which we store in the variable fptr. The one-letter string " w " (note that it is a string and not a character; hence the double, and not single quotes) is called a type. The " w " is but one of several types we can specify when we open a file.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 6 of 29

Different Modes of Filing Here are the other possibilities:

“ r “ Open for reading. The file must already exist.

“ w ” Open for writing. If the file exists its contents will be written over. If it does not exist it will be created.

“ a ” Open for append. Material will be added to the end of an existing

file, or a new file will be created. “ r+ ” Open for both reading and writing. The file must already exist.

“ w+ ” Open for both reading and writing. If the file exists its contents are

written over.

“ a+ ” Open for both reading and appending. If the file does not exist it will be created.

Writing to a File Once we have established a line of communication with a particular file by opening it, we can write to it. In the writec.c program we do this one character at a time, using the statement:- putc ( ch , fptr ) ; The putc ( ) function is similar to the putch ( ) and putchar ( ) functions always write to the console, while putc ( ) writes to a file? The file whose FILE structure is pointed to by the variable fptr, which we obtained when we opened the file. This pointer has become our key to the file; we no longer refer to the file by name but by the address stored in the fptr. The writing process continues in the while loop; each time the putc ( ) function is executed another character is written to the file. When the user types [Return], the loop terminates. Closing the File When we've finished writing to the file we need to close it. This is carried out with the statement: fclose ( fptr ) ; Closing the file has several effects: First, any characters remaining in the buffer are written out to the disk. What buffer? Basically they are invisible to the programmer when using standard I/O. However, a buffer is necessary even if it is invisible. Consider, for example, how inefficient it would be to actually access the disk just to write one character. It takes a while for the disk system to position the head correctly and wait for the right sector of the track to come around. On a floppy disk system the motor actually has to start the disk from disk is accessed. If you typed several characters rapidly, a completely separate disk access, some of the characters word lost. This is where the buffer comes in.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 7 of 29

When you send a character off to a file with putc ( ), the character is actually stored in a buffer -an area of memory- rather than being written immediately to the disk. When the buffer is full, its contents are written to the disk all at once. Or, if the program knows the last character it forces the buffer to be written to the disk by closing the file. A major advantage of using standard I/O (as opposed to system I/O) is activities take place automatically; the programmer doesn’t 't need to about them. Another reason to close the file is to free the communications areas used by that particular file so they are available for other files. These areas include the FILE structure and the buffer itself. Reading from a File If we can write to a file, we should be able to read from one. Here is a program that does just, using the function getc ( ): // readc.c // reads one character at a time in a file void main (void) {

FILE *fptr ; // ptr to FILE char ch ;

fptr = fopen ( "textfile.txt", "r" ) ; // open FILE set fptr while ( ( ch = getc ( fptr ) ) != EOF ) // get character from file printf ( " %c " , ch ) ; / /print it fclose ( fptr ) ; // close file } As you see, this program is quite similar to writec.c. The pointer to FILE is declared the same way, and the file is opened and closed in the same way. The getc ( ) function reads one character from the file "textfile.txt"; this function is the complement of the putc ( ) function. End of FILE A major difference between this program and writec.c is that readc.c must be concerned with knowing when it has read the last character in the file. It does this by looking for an end-of-file (EOF) signal from the operating system. If it tries to read a character, and reads the EOF signal instead, it knows it come to the end of the file. What does EOF consist of? It's important to understand that it is not a character. It is actually an integer value, sent to the program by the operating system and defined in the stdio.h file to have a value of -1.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 8 of 29

No character with value is stored in the file on the disk; rather, when the operating system realizes that the last character in a file has been sent, it transmits the EOF signal. “EOF signal sent by the operating system to a C program is not a character, but an integer with a value of –1.” So our program goes along reading characters and printing them', for a value of -1, or EOF. When it finds this value, the program terminates. We use an integer variable to hold the character being read so we can interpret the EOF signal as the integer -1. What difference does it make? If we used a variable of type char, the character with the ASCII code 255 decimal (FF hex) would be interpreted as an EOF. We want to be able to use all character codes from O to 255 - all possible 8 bit combinations, that is; so by using an integer variable we ensure that only a 16 bit value of -1, which is not the same as any of our character codes, will signal EOF. Trouble Opening the File The two programs we have presented so far have a potential flaw; if the file specified in the fopen ( ) function cannot be opened, the program will not run. Why couldn't a file be opened? If you are trying to open a file for writing, it is probably because there's no more space on the disk. If for reading, it is much more common that a file can't be opened; you can't read it if it hasn’t been created yet. Thus it's important for any program that accesses disk files to check whether a file has been opened successfully before trying to read or write to the file. If the file cannot be opened, the fopen ( ) function returns a value 0 (defined as NULL in stdio.h); Since in C Language this is not considered a valid address, the program infers that the file could not be opened. Here's a variation of readc.c that handles this situation: // readc2.c // reads one character at a time in a file and check for read error int main (void) {

FILE *fptr ; // ptr to FILE char ch ;

if ( fptr = fopen ( "textfile.txt", "r" ) ) = = NULL ) // open FILE set fptr {

printf ( “ Cannot open file “ ) ; exit (1) ; } while ( ( ch = getc ( fptr ) ) != EOF ) // get character from file

printf ( " %c " , ch ) ; // print it fclose ( fptr ) ; return (0); }

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 9 of 29

Here the fopen ( ) function is enclosed in an if statement. If the function returns NULL, then an explanatory message is printed and the exit ( ) is executed, causing the program to terminate immediately, and avoiding the embarrassment of trying to read from a nonexistent file. Since we use the exit ( ) function, to return a value, we also use return for a normal program termination, returning a value of O rather than 1 which indicates an error. Counting Characters The ability to read and write to files on a character-by-character basis has many useful applications. // charcntf.c // count char in a file int main ( int argc ,char *argv [ ] ) {

FILE *fptr ; char string [80] ; int count = 0 ; if ( argc != 2 ) // check # of args {

printf ( " Formate: C> charcntf filename " ) ; exit(1);

}

if ( ( fptr = fopen ( argv [1] , "r" ) ) = = NULL ) //open file {

printf ( " Can't open file %s " , argv [1] ) ; exit(1);

}

while ( getc ( fptr ) != EOF ) // get char from file count++ ; // count it fclose ( fptr ) ; printf ( " FILE %s contains %d characters .", argv[1], count ) ; return ( 0); } In this program we have used a command line argument to obtain the name of the file to be examined; rather than writing it into the fopen ( ) statement. We start by checking the number of arguments; if there are two, then the second one, argv [1], is the name of the file. For compactness, we've also placed the two statements following the if statement on one line.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 10 of 29

The program opens the file checking to make sure it can be opened, and cycles through a loop, reading one character at a time and incrementing the variable count each time a character is read. Counting Words It's easy to modify our character counting program to count words. This can be a useful utility for writers or anyone else who needs to know how many words an article, chapter, or composition comprises. Here's the program: // wordcntf.c // count words in a file int main ( int argc , char *argv[ ] ) { FILE *fptr ; char ch, string [80] ; int white = 1 ; // white space flag int count = 0 ; // word count

if ( argc != 2 ) // check # of argc {

printf ( " Format: C> wordcntf filename " ) ; exit (1); }

if ( ( fptr = fopen ( argv [1], "r" ) ) = = NULL ) // open file

{ printf ( " Can't open file %s", argv [1] ); exit(1);

} while( ( ch = getc (fptr) ) != EOF ) // get char from file { switch (ch) { // if space ,tab, or case ' ' : // new line set flag case ' \t ' : case ' \n ' :

white++ ; berak ; default: //non _ whitespace ,and if ( white )

{ white = 0 ; // flag set count++ ; // count word

} } }

fclose (fptr) ; printf ( " FILE %s contains %d words .", argv[1], count ) ; return (0) ; }

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 11 of 29

What we really count in this program is the change from white space characters (spaces, new lines, and tabs) to actual (non whites pace) characters. In other words, if there's a string of spaces or carriage returns, the program reads them, waiting patiently for the first actual character. It counts this transition as a word. Then it reads actual characters until another white space character appears. A "flag" variable keeps track of whether the program is in the middle of a word or in the middle of white space; the variable white is 1 in the middle of white space, 0 in the middle of a word. String (line) Input/Output Reading and writing strings of characters from and to files is almost as easy as reading and writing individual characters. Here's a program that writes strings to a file, using the string I/O function fputs ( ) // writes.c // Writes sentences typed at keyboard to file void main(void) { FILE *fptr ; // declare ptr to file char string [81] ; // storage for string

fptr = fopen ( "textfile", "w" ) ; // open file while( strlen ( gets (string) ) > 0 ) ) // get char from keyboard { fputs ( string, fptr ) ; // write string to the file fputs ( " \n ", fptr ) ; // write new line to file } fclose (fptr) ; // close file } The user types a series of strings, terminating each by hitting [Return]. To terminate the entire program, the user hits [Return] at the beginning of a line. This creates a string of zero length, which the program recognizes as the signal to close the file and exit. We have set up a character array to store the string; the fputs ( ) function then writes the contents of the array to the disk. Since fputs ( ) does not automatically adds a new line character to the end of the line, we must do this explicitly to make it easier to read the string back from the file.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 12 of 29

// reads.c // read string from file void main(void) { FILE *fptr ; // ptr to FILE char string [81] ; // storage string

fptr = fopen ( "textfile.txt", " r " ) ; // open file while( fgets ( string, 80, fptr ) != NULL ) // read string printf ( " %s " , string ) ; // print string fclose (fptr) ; // close file } The function fgets ( ) takes three parameters. The first is the address where the string is stored, and the second is the maximum length of the string. This parameter keeps the fgets ( ) function from reading in too long a string and overflowing the array. The third parameter, as usual, is the pointer to the FILE structure for the file. Here's a sample run C> writes C> Pakistan Zindabad C> reads C> Pakistan Zindabad The New line Problem Earlier we mentioned that our charcntf.c program might not always returns the same results as other character-counting programs, such as the DOS DIR command. Here's what happens when we try to verify the length of the file.

C> dir textfile.txt

TEXTFILE TXT 116 10-27-87 2:36a

C> charcntf textfile.txt File textfile.txt contains 112 characters.

Using DIR we find 116 characters in the textfile.txt file. Whereas using our home made charcntf.c program, we find 112. This discrepancy occurs because of the difference in the way C and MS- DOS represent the end of a line.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 13 of 29

In C Language the end of a line is signalled by a single character: the new line character, ' \n ' ( ASCII value 10 decimal ). In DOS, on the other hand the end of a line is marked by two characters, the carriage return ( ASCII 13 decimal ) and the linefeed ( which is the same as the C new line: ASCII 10 decimal ). “ The end of a line is represented by a single character in C ( the new line ), but by two characters in MS-DOS files ( carriage return and line feed ).” When your program writes a C text file to disk, Turbo C causes all the new lines to be translated into the carriage return plus line feed ( CR / LF ) combination. When your program reads in a text file, the CR/LF combination is translated back into a single new line character. Thus, DOS oriented programs such as DIR will count two characters at each end of the line, while C oriented programs, such as charcntf.c will count one. Printer Output There are several ways to send output to the printer. In this section we'll look at two approaches that use standard I/O. The first approach uses standard file handles, and the second uses standard file names. Standard File Handles We have seen several examples of obtaining handles that refer to files. MS-DOS also predefines five standard handles that refer to devices rather than files. These handles are accessed using names defined to be of type *FILE in the stdio.h file. They are:

Number Handle Device 0 " stdin " Standard input (keyboard, unless redirected) 1 " stdout " Standard output (screen, unless redirected) 2 " stderr " Standard error (screen) 3 " stdaux " Auxiliary device 4 " stdpm " Printer

The standard input and standard output normal refer to the keyboard and screen, but they can be redirected. Standard error, which is where errors are normally reported, always refers to the screen. The auxiliary device can be assigned when the system is set up; it is often used for a serial device.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 14 of 29

// print.c // print file on printer_use stundard handle int main( int argc , char *argv [ ] ) {

FILE *fptr ; char string [81] ; if ( argc != 2 ) // check argc {

printf ( " Formate: C>print filename " ) ; exit (1) ;

} if ( ( fptr = fopen ( argv [1] , "r" ) ) = = NULL ) // open file {

printf ( " Can't open file %s " , argv [1 ] ) ; exit (1) ;

} while ( fgets ( string, 80, fptr ) != NULL ) // read string { fputs ( string, stdprn ) ; // send to printer putc ( '\r', stdprn ) ; // with return } fclose ( fptr ) ; // close file printf ( " FILE %s contains %d words .", argv [1], count ) ;

return (0) ; } Standard File Names You can also refer to devices by names, rather than handles. These names are used very much like file names. For instance, you probably know that you can create short text files by typing. C> copy con readme.txt This is a short text file. The CON file name used above refers to the console, which is the key- board or screen, depending on whether input or output is intended. Similarly, the standard auxiliary device can be called AUX, and the standard printer can be called PRN. Here is the example that uses the PRN file name to access the printer. Because this is a name, and no a handle, the device must be opened before it is used.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 15 of 29

// print2.c // print file on printer_use standard filename int main( int argc , char *argv [ ] ) {

FILE *fptr1, *fptr2 ; char string [81] ; if ( argc != 2 ) // check c argc {

printf ( " Formate: C> print2 filename " ) ; exit (1) ;

} if ( ( fptr1 = fopen ( argv [1] , "r" ) ) = = NULL ) // open file {

printf ( " Can't open file %s " , argv [1] ) ; exit (1) ; }

if ( ( fptr2 = fopen ( " prn " , "w" ) ) = = NULL ) // open printer {

printf ( " Can't access printer " ) ; exit ( 0);

} while ( fgets ( string, 80, fptr1 ) != NULL ) // read string { fputs ( string, fptr2 ) ; // send to printer } fclose (fptr1); fclose (fptr2) ; // close file return (0); } This approach is advantageous if you use more than one printer. The name LPT1 is usually a synonym for PRN, but if you have additional printers, you can send output to them using file name LPT2, LPT3 and so on.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 16 of 29

Formatted Input / Output So far we have deal with reading and writing only characters and text. How about numbers? // writef.c // writes formatted data to file void main (void) { FILE *fptr ; // declare ptr to FILE

char name [40] ; // agent's name int code ; // code number float height ; // agent's height

fptr = fopen ( "textfile.text","w"); //open file do { printf ( " Type name, code number, and height: " ) ; scanf ( " %s %d %f ", name, &code, &height ) ; fprintf ( fptr,"%s %d %f ", name, code, height ) ; } while ( strlen (name) > 1 ) ; // no name given? fclose (fptr) ; // close file } The Key to this program is the fprintf ( ) function, which writes the values of the three variables to the file. This function is similar to the printf ( ), except that a FILE pointer is included as the first argument. As in printf ( ), we can format the data in a variety of ways; in fact, all the conventions of printf ( ) operates with fprintf ( ) as well. This information is now in the “textfile.txt” file. We can look at it there with the DOS TYPE command or with type2.c although when we use these programs all the output will be on the same line, since there is no new lines in the data. To format output more conveniently, we can write a program specifically to read textfile.txt: // readf.c // reads formatted data to file void main (void) { FILE *fptr ; // declare ptr to FILE char name [40] ; int code ; float height ;

fptr = fopen ( "textfile.text" , "r" ) ; // open file

while ( fscanf ( fptr , " %s %d %f " , name, code, height ) != EOF ) ; printf ( " %s %3d %2f ", name, code, height ) ; fclose (fptr) ; }

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 17 of 29

This program uses the fscanf ( ) function, to read the data from the disk. This function is similar to scanf ( ), except that, as with fprintf ( ), pointer to FILE is included as the first argument. Of course, we can use similar techniques with fprintf ( ) and fscanf ( ) for other sorts of formatted data: any combination of characters, strings, integers and floating point numbers. Number Storage in Text Format It is important to understand hot numerical data is stored on the disk by fprintf ( ). Text and characters are stored one character per byte, as we would expect. Are numbers stored as they are in the memory two bytes for an integer, four bytes for floating point, and so on? NO. They are stored as strings of characters. Thus the integer 999 is stored in two bytes of memory, but requires three bytes in a disk file, one for each ‘9’ character. The floating-point number 12.75 requires four bytes when stored in memory, but five when stored on the disk; one for each digit and one for the decimal point. Numbers with more digits requires even more disk space. Numbers with more than a few significant digits requires substantially more space on the disk using formatted I/O than they do in memory. Thus, if a large amount of numerical data is to be stored on a disk file, using text format can be inefficient. The solution is to use a function that stores numbers in binary format. The fourth method, record I/O, writes data in binary format. When writing data in binary mode it is often desirable to use binary mode. Binary Mode and Text Mode The need for two different modes arose from incompatibilities between C and the MS-DOS operating system. C originated on UNIX system and used UNIX file conventions. MS-DOS, which evolved separately from UNIX, has its own, somewhat different conventions. When C was ported over to the MS-DOS operating system, compiler writer had to resolve these incompatibilities. The solution decided was to have two modes. One, the text mode, made files look like to C programs as if they were UNIX files. The other, binary mode made files look more like MS-DOS files. “ Text mode imitates UNIX files, while binary mode imitates MS-DOS files.” There are two main areas where text and binary mode files are different: the handling of the new lines and the representation of end-of-line.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 18 of 29

Text Versus Binary Mode: NEWLINE We have learn that in text mode, a new line character is translated into the carriage return-line feed ( CR/LF ) combination before being written to the disk. Likewise, the CR/LF on the disk is translated back into a new line when the file is read by the C program. However. If a file is opened in binary mode, as opposed to the text mode, these translations will not take place. Lets revise our character count program in the binary mode: // charcnt2.c // count char in a file opened as binary int main( int argc , char *argv [ ] ) { FILE *fptr ; char string [80] ; int count = 0 ; if ( argc != 2 ) // check # of argc {

printf ( " Formate: C>charcnt2 filename " ) ; exit (1) ;

} if ( ( fptr = fopen ( argv [1], "rb" ) ) = =NULL ) // open file

{ printf ( " Can't open file %s", argv [1] ) ;

exit (1) ; }

while ( getc (fptr) != EOF ) // get char from file count ++ ; // count it

fclose (fptr ); // close file printf ( " FILE %s contains %d characters .", argv [1], count ) ; return (0); } There is only one difference between this program and the original charcntf.c; we have introduced the letter “b” as part of the type setting string, following the “r”. This “b” means that’s the file should be opened in the binary mode. We could also use a ”t” to specify text mode, but since this is the default for characters I/O this is not normally necessary. In binary mode the translation of the CR/LF pair into a new line does not take place. The binary charcntf2.c program reads each CR/LF as two characters, just as it is stored on the file.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 19 of 29

Text Versus Binary Mode: End-of-File The second difference between text and binary modes is in the way end-of-line is detected. Both systems actually keep track of the total length of the file and will signal an EOF when this length has been reached. In text mode, however a special character, 1A hex ( 26 decimal ), inserted after the last character in the file is also used to indicate EOF: ( This character, can be generated from the keyboard by typing [Ctrl] [z] and is often written as ^Z. ). If this character is encountered at any point in the file, the read function will return the EOF signal (-1) to the program. There is a moral to be derived from the text mode conversation of 1A hex to EOF. If a file stored numbers in binary format, it is important that binary mode be used in reading the file back, since one of the numbers might will be the number 1A hex ( 26 decimal ). If this number were detected while we were reading a file in text mode, reading would be terminated at that point. Also when writing in binary format, we do not want numbers that happen to have the value 10 decimal to be interpreted as new line and expanded into CR/LFs. Thus both in reading and writing binary format numbers we should use binary mode when accessing the file.

Figure: Test and Binary Modes

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 20 of 29

Record Input / Output We have saw that storing numbers in the format provided by these functions can take up a lot of disk space, because each digit is stores as a character. Formatted I/O presets another problem; there is no direct way to read and write complex data types such as arrays and structures. Arrays can be handled, but inefficiently by writing each array elements one at a time. Structures must also be written bit by bit. In Standard I/O, one answer to these problems is record I/O, sometimes called block I/O. Record I/O writes numbers to disk files in binary ( or un-translated ) format, so that integer are stored in two bytes, floating point numbers in four bytes, and so on for the other numerical types- the same format used to store numbers in the memory. Record I/O also permits writing any amount of data at once; the process is not limited to a single character or string or to the few values that can be placed in a fprintf ( ) or fscanf ( ) functions. Writing Structures with fwrite ( ) // writer.c int main( ) {

struct // define structure { char name [40] ; int agnumb ; double height ; } agent;

char numstr [81] ; // for number FILE * fptr ; // file pointer

If ( fptr = ( fopen ( "agent.rec", " wb " ) ) = = NULL ) {

printf ( " Can't open fill agent.rec " ) ; exit(1);

} do { printf ( " Enter name: " ) ; // get name gets ( agent.name ) ; printf ( " Enter number: " ) ; // get number gets ( numstr ) ; agent.agnumb = atoi ( numstr ) ;

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 21 of 29

printf ( " Enter height: " ) ; // get height gets ( numstr ) ; agent.height = atof ( numstr ) ;

fwrite ( &agent, sizeof (agent), 1, fptr ) ; // write structure to file printf ( " Add another agent( y / n ) " ) ; } while ( getche( ) = = 'y' ) ; fclose (fptr) ; return (0) ; } This program will accept input concerning name, number, and height of each agent and will then write the data to a disk file called agent.rec. The information obtained from the user at the keyboard is placed in the structure agent. Then the following statement writes the structure to the file: fwrite ( &agent, sizeof (agent), 1, fptr ) ; The first argument of this function is the address of the structure to be written. The second argument is the size of the structure. Instead of counting bytes, we let the program do it by using the sizeof ( ) function. The third argument is the number of such structure we want to write at one time. If we had an array of structures, for example, we might want to write the entire array all at once. This number could then be adjusted accordingly, In this case, however we want to write only one structure at a time. The last argument is the pointer to the file we want to write to. Writing Arrays with fwrite ( ) The fwrite ( ) function need not to be restricted to writing structures to the disk. We can use it and its complementary function fread ( ),to work with other data types as well. Suppose we want to store an integer array of 10 elements in a disk file. We could write the 10 items one at a time by using fprintf ( ), but it is far more efficient to take the following approach: // warray.c // writes array to file int table [10] = {1,2,3,4,5,6,7,8,9,10} ; void main (void) { FILE * fptr ; // file pointer if( ( fptr = fopen ( " table.rec " , " w" ) ) = = NULL ) // open file {

printf ( "Can't open file agents.rec ") ; exit (1) ; } fwrite ( table, sizeof (table), 1, fptr ) ; // write array

fclose (fptr); }

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 22 of 29

This program does not accomplish anything useful, but it does demonstrate that the writing of an array to the disk is similar to the writing of a structure. Reading structure with fread ( ) // readr.c int main() {

struct { char name [40] ; int agnumb ;

double height ; } agent ; FILE * fptr ; // file pointer if ( fptr = ( fopen ( "agent.rec" , "wb" ) ) = = NULL )

{ printf ( " Can't open fill agent.rec " ) ; exit(1);

} while ( fread ( &agent, sizeof (agent), 1, fptr ) = = 1 )

{ printf ( " \n Name: %s " , agent.name ) ;

printf ( " Number: %3d ", agent.agnumb ) ; printf ( " Height: %2f ", agent.height ) ; } fclose (fptr) ; return (0); } The heart of this program is the expression fread ( &agent, sizeof (agent), 1, fptr ) This causes data read from the disk to be places in the structure; the format is the same as the fwrite ( ) . The fread ( ) function returns the number of items read. In record I/O, numerical data is stored in binary mode. This means that an integer always occupies 2 bytes, a floating point number always occupies 4 bytes and so on. Record I/O is more efficient for the storage of numerical data than function that uses text format such as fprintf ( ).

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 23 of 29

Data Base Example # define TRUE 1 void newname(void); void listall (void) ; // prototype void rfile (void) ; void wfile (void) ; struct personal // define data structure {

char name [30] ; // name int agnumb ; // code number float height ; // height in inch } ; struct personal agent [50 ]; // array of 50 structure int n = 0 ; // number of agents listed char numstr [40] ; void main (void) {

clrscr ( ) ; while (TRUE)

{ printf ( " Type ‘e’ to enter new agent \n, 'r' read file " ) ;

printf ( " Type ‘l’ to list all agent \n, ' w' write file" ) ; printf ( " q to quit: " ) ; switch ( getche ( ) ) { case ' e ': newname ( ) ; break ; case ' l ': listall ( ) ; break; case ' w ': wfile ( ) ; break; case ' r ': rfile ( ) ; break;

case ' q ': exit (0) ; default: puts ( " Enter only selected char " ) ; }

} getch ( ) ;

}

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 24 of 29

void newname (void) {

char numstr [88] ; printf ( " \n Record %d enter name: ", n+1 ) ;

gets (agent[n].name); printf ( " Enter agent number (3digits): " ) ; gets (numstr); agent[n].agnumb = atoi (numstr); printf ( " Enter height " ) ;

gets (numstr); agent[n++].height = atof (numstr); } void listall (void) {

int j; if ( n < 1 ) // check for empty test printf ( "Empty list " ) ; for ( j = 0 ; j < n ; j++ ) { printf ( " Record number %d \n:", ,j+1 ) ; printf ( " Name: %s \n ", agent[j].name ) ; printf ( " Agent number: %d \n ", agent[j].agnumb ) ; printf ( " Height: %4.2f \n", agent[j].height ) ; } } void wfile (void) {

FILE * fptr; char ch ; If ( n < 1 )

{ printf ( " Can't write empty list. \n " ) ; return;

} if ( ( fptr = fopen ( "ant.rec " , "w" ) ) = = NULL ) ; printf ( " Can't open file agent.rec \n " ) ;

else { fwrite ( agent, sizeof (agent [0] ), n, fptr ) ; fclose (fptr) ; printf ( " File of %d record written . \n " , n ) ; } }

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 25 of 29

void rfile (void) {

FILE * fptr ; If ( ( fptr = fopen ( "ant.rec", "rb " ) = = NULL ) )

printf ( " Can't open file agent.rec \n " ) ; else { while ( fread ( &agent [n], sizeof ( agent [n] ) , 1, fptr ) = = 1 ) n++ ; fclose (fptr); printf ( " \n File read.total agent is now %d. \n " , n ) ; } } The new capabilities of this program to be written are in the function wfile ( ) and rfile ( ). In wfile ( ) the statement fwrite ( agent, sizeof (agent [0] ), n, fptr ) ; causes, the entire array of n structure to be written to disk all at once. The address of the data is at agent; the size of one record is the sizeof ( agent [0] ), which is the size of the first record; the number of such records is n; and the file pointer is fptr. When the function rfile ( ) reads the data back in, it must do so one record- that is, one structure – at a time, since it does not know in advance how many agents are in the database. Thus the expression fread ( &agent [n], sizeof ( agent [n] ) , 1, fptr ) is embedded in a while loop. Which waits for the fread ( ) function to report that no bytes were read, including end-of-file. This expression causes the data in thestructure to be stored at address & agent [n], and to have the size of agent [n]. It causes one structure to be read at a time from the file pointed to by fptr.

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 26 of 29

Random Access So far all our file reading and writing has been sequential. That is, when writing a file we have taken a group of items – whether character, strings or more complex structure – and placed them on the disk one at a time. Likewise reading we have started at the beginning of the file and gone on until we came to the end. It is also possible to access files “randomly”. This means directly accessing a particular data items, even though it may be in the middle of the file. // reads one agent's record from file int main(void) {

struct { char name [40] ;

int agnumb ; double height ; } agent; FILE * fptr ; int recno ; // record number long int offset ; // must be long

if ( ( fptr = fopen ( "agent.rec", "r " ) ) = = NULL ) {

printf ( " Can't open file agent.rec \n " ) ; exit (1);

} printf ( " Enter record number " ) ; // get record num scanf ( " %d " , &recno ) ; offset = (recno-1) * sizeof (agent) ; // find offset if ( fseek ( fptr, offset, 0 ) != 0 ) // go there {

printf ( " Can't move pointer the rec \n " ) ; exit (1);

}

fread ( &agent, sizeof (agent), 1, fptr ) ; printf ( " Name :%s \n ", agent.name ) ; printf ( " Agent number: %d\ n " , agent.agnumb ) ; printf ( " Height: %4.2f \n " , agent.height ) ; fclose (fptr); return (0);

}

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 27 of 29

Output: Randr Enter record number: 2 Name: Ali Number: 026 Height: 70.5 File Pointers To understand how this program works, you need to be familiar with the concept of file pointers. A file pointer is a pointer to a particular byte in a file. The function we have examined in the above program uses the file pointer each time we wrote something to a file. When we closed a file ad then reopened it, the file pointer was set back to the beginning of the file, so that if we then read from the file, we would start at the beginning. If we had opened a file for append ( using the “a” type option ), then the file pointer would have been placed at the end of an existing file before we began writing.

“ The file pointer points to the byte in the file where the nesx access will take place. The fseek ( ) function lets us move this pointer”.

The function fseek ( ) gives us control over the file pointer. Thus, to access a record in the middle of a file, we use this function to move the file pointer to that record. In the above program, the file pointer is set to the desired value in the expression.

if ( fseek ( fptr, offset, 0 ) != 0 ) The first argument of the fseek ( ) is the pointer to the FILE structure for this file. This indicates where we are in the file. The second argument in the fseek ( ) is called the offset. This is the number if byte from a particular place to start reading. Often this place is the beginning of the file; that is the case here. In our program the offset is calculated by multiplying the size of one record the structure agent by the number of the record we want to access. In the example we access the second record, but the program thinks of this as record number 1 ( since the first record is 0 ) so the multiplication will be by 1. It is essential that the offset be a long integer. The last argument of the fseek ( ), function is called the mode. There are three possible mode numbers, and they determine where the offset will be measured from:

Mode Offset is measured from 0 Beginning of the file

1 Current position of the file pointer 2 End of the file

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 28 of 29

Once the file pointer has been set, we read the contents of the record at that point into the structure agent and print out its contents. Another function, ftell ( ), returns the position of the file pointer. Error Conditions In most cases, if a file can be opened, it can be read from or written to. There are situations, however, when this is not the case. A hardware problem might occur while either reading or writing is taking place, a write operation might run out of the disk space, or some other problem might occur. Most standard I/O functions do not have an explicit error return. For example, if putc ( ) returns an EOF, this might indicate either error a true end-of-file or an error. If fgets ( ) returns a NULL value, this might indicate either an EOF or an error, and so forth. Program // writef2.c // writes formated data to file // program includes error_handle function int main(void) {

FILE *fptr; // declare ptr to FILE Char name [40]; // agent's name int code; // code number float height; // agents height

if ( ( fptr = fopen ( "textfile.txt" , "w" ) = = NULL) ) {

printf ( " Can't open textfile.txt " ) ; exit (1 );

} do

{ printf ( " Type name ,code number, and height: " ) ; scanf ( " %s %d %f ", name, &code, &height ) ; fprintf (fptr, "%s %d %f ", name, code, height ) ; if ( ferror (fptr) ) // check for error { perror ( " Write error " ) ; // writes message fclose (fptr) ; // file close exit (1) ; }

} while ( strlen (name) > 1 ) ; // no name given ?

fclose (fptr) ; return (0) ; }

CE-102 CPPS ( Chapter # 5: Handling Files In C )

Compiled By: Muzammil Ahmad Khan and Muhammad Kashif Shaikh Page 29 of 29

To determine whether an error has been occurred we can use the function ferror ( ). This function takes one argument: the file pointer ( to FILE ). It returns a value of 0 if no error has occurred, and a nonzero ( TRUE ) value if there is error. This function should resets by closing the file after it has been used. Another function is also useful in conjuction with ferror ( ) . This one is called perror ( ). Its takes a string supplied by the program as an argument; this is usually an error message indicating where in the program the error occurred. This function prints out the program’s error message and then goes on to print out a system-error message. In the event that, for example, there is a disk error, ferror ( ) will return a nonero value and perror ( ) will print out the following message: Write error: Bad data The first part of this message is supplied by the program and the second part, from the colon on, is supplied by the system. Explicit error messages of this type can be informative both for the user and for the programmer during development.