Post on 22-Feb-2016
File I/O
Types of files
Command line arguments
File input and output functions
Binary files
Random access
Introduction
Data stored in main memory does not persist
Most programs require that the user be able to store and retrieve data from previous sessions
  ▪ In persistent storage, such as a hard disk
There are many forms of persistent storage
  ▪ Which suggests that the low level processes for accessing them are different
A file is a high level representation that allows us to ignore low level details
Reading Files
Files have formats
  ▪ A set of rules that determine the meaning of a file's contents
To read a file
  ▪ Know (or find) its name
  ▪ Open it for reading
  ▪ Read in the data in the file
  ▪ Close it
There are similar processes for writing to files
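The steps for reading a file can be sketched as a small function. This is only an illustration: the name count_chars is hypothetical, and the functions it calls (fopen, getc, fclose) are covered in detail later in these notes.

```c
#include <stdio.h>

/* Counts the characters in the file called path (hypothetical helper).
   Returns the count, or -1 if the file cannot be opened. */
long count_chars(const char *path)
{
    long count = 0;
    int ch;                            /* int so that EOF can be detected */

    FILE *fp = fopen(path, "r");       /* open it for reading */
    if (fp == NULL) {
        return -1;
    }

    while ((ch = getc(fp)) != EOF) {   /* read in the data in the file */
        count++;
    }

    fclose(fp);                        /* close it */
    return count;
}
```

The name of the file is supplied by the caller, which covers the first step (know or find its name).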
Files
A file represents a section of storage
Files are viewed as contiguous sequences of bytes which can be read individually
  ▪ In reality a file may not be stored sequentially and may not be read one byte at a time
  ▪ Such details are the responsibility of the OS
C has two ways to view files
  ▪ Text
  ▪ Binary
Text and Binary Files
There is a distinction between text files and binary files
  ▪ Text files store all data as text whereas binary files store the underlying binary representation
In addition C allows for both text and binary views of files
  ▪ Usually the binary view is used with binary files
Low Level I/O
In addition to types of files and views of files C has a choice of I/O levels
  ▪ Low level I/O uses the fundamental I/O given by the OS
  ▪ Standard high level I/O uses a standard package of library functions
ANSI C only supports standard I/O since no single low level model can represent the I/O of every OS
  ▪ We will only look at standard I/O
Text Files
Standard Files
C automatically assigns input and output to standard files for some I/O functions
  ▪ e.g. getchar(), gets(), scanf(), printf(), puts()
There are three standard devices for I/O
  ▪ Standard input is set to the keyboard
  ▪ Standard output is set to the display
  ▪ Standard error is set to the display
Redirection causes other devices or files to be used for standard input or output
Redirection
The input or output of a C program can be redirected to a file
  ▪ When the program is run at the command prompt
  ▪ < file redirects standard input so that it is read from the named file
  ▪ > file redirects standard output so that it is written to the named file
To redirect input for a program called a3 so that it comes from a file called a3test
  ▪ ./a3 < a3test
  ▪ ... which is what we are doing when marking some of your assignments ...
Command Line Arguments
Command line arguments are additional input for programs
  ▪ For example, gcc takes a number of command line arguments, such as in gcc -o hello helloworld.c
Our C programs can also take command line arguments
  ▪ Another form of declaring the main function gives the main function two arguments
  ▪ int main(int argc, char* argv[])
argc is the number of arguments
  ▪ Short for argument count; the count is one more than the number of additional arguments
argv is an array of strings (the arguments)
  ▪ The first string is the name of the command, the remaining strings are the additional arguments
Using Command Line Arguments
The value of argc is derived
  ▪ It does not have to be entered by the user
The first element of argv is the name of the executable (the program name)
  ▪ On most systems
The second and subsequent elements of argv are the arguments
  ▪ In the order in which they were entered
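As a rough illustration of how argc and argv relate, the sketch below walks through the array. The function name list_arguments is hypothetical; it simply takes the same two parameters that main receives.

```c
#include <stdio.h>

/* Prints the program name and each additional argument, and returns the
   number of additional arguments (argc minus one for the program name). */
int list_arguments(int argc, char *argv[])
{
    int i;

    if (argc < 1) {        /* defensive: no program name was supplied */
        return 0;
    }

    printf("program name: %s\n", argv[0]);
    for (i = 1; i < argc; i++) {
        printf("argument %d: %s\n", i, argv[i]);
    }
    return argc - 1;
}
```

A main function declared as int main(int argc, char* argv[]) could pass its own parameters straight through with list_arguments(argc, argv).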
Counting Characters
Write a program to count the number and types of characters in a file
  ▪ The file to be read will be given as a command line argument to the program
The program will exit under two conditions
  ▪ The wrong number of arguments is given to the program
  ▪ The file cannot be opened
Counting Characters – includes
    #include <stdio.h>
    #include <stdlib.h>   // required for exit
    #include <ctype.h>    // required for character comparisons

    // Forward Declarations
    void countCharacters(FILE* fp, char* fName);
Counting Characters – main 1
    int main(int argc, char* argv[])
    {
        FILE* fp;
        // Test the number of arguments
        if(argc != 2){
            printf("%s requires file name\n", argv[0]);
            exit(1);
        }
First test to make sure the user has entered the command correctly
  ▪ argc is the size of argv, so the test passes only for the command and one argument
  ▪ argv[0] is printed rather than a fixed name because in Unix (or Linux) the program can be given different names
Counting Characters – main 2
    int main(int argc, char* argv[])
    {
        // ...
        // Attempt to open file
        if((fp = fopen(argv[1], "r")) == NULL){
            printf("Cannot open %s\n", argv[1]);
            exit(1);
        }
        countCharacters(fp, argv[1]);
        fclose(fp);
        return 0;
    }
Then attempt to open the file for reading
  ▪ argv[1] is the argument, which should be a file name
  ▪ fopen returns NULL if the file cannot be opened
  ▪ 1 is typically the value of EXIT_FAILURE; note that exit will exit the program from any function
  ▪ countCharacters processes the file
Counting Characters Function – 1
    // Prints the count of characters in a file, by:
    //   alpha
    //   digits
    //   whitespace
    //   other
    // PRE: fp can be opened and read
    // PARAM: fp is a pointer to a file to be read
    void countCharacters(FILE* fp, char* fName)
    {
        int alpha = 0;
        int digits = 0;
        int white = 0;
        int other = 0;
        int total = 0;
        int ch;   // int rather than char, so that EOF can be told apart from every character
Documentation and variable declarations
  ▪ Note that the pre-condition is documented
Counting Characters Function – 2
    void countCharacters(FILE* fp, char* fName)
    {
        // ...
        // Read file one character at a time
        while((ch = getc(fp)) != EOF){
            if(isalpha(ch)){
                alpha++;
            }else if(isdigit(ch)){
                digits++;
            }else if(isspace(ch)){
                white++;
            }else{
                other++;
            }
        }
        total = alpha + digits + white + other;
Go through the file one character at a time
  ▪ The while loop processes each character until end-of-file
  ▪ It’s an if ... else if ... else statement to minimize comparisons and to ensure that other is counted correctly
Counting Characters Function – 3
    void countCharacters(FILE* fp, char* fName)
    {
        // ...
        // Print number of characters
        printf("%s contains %d characters\n", fName, total);
        printf("%d letters\n", alpha);
        printf("%d digits\n", digits);
        printf("%d whitespace\n", white);
        printf("%d other\n", other);
    }
And then print the number of characters
  ▪ Prints the count of each type of character
Counting Characters Output
[Screenshot: a sample run of the program. The user changes directory to the directory containing the .exe, then runs the program on a Word document, on a file that does not exist, and with no file name argument.]
Discussion
There is no need to use command line arguments with the preceding program
  ▪ It’s just an example of using them
A different version of the program could allow the user to process multiple files
  ▪ With a loop that ended the program when the user wanted
  ▪ In which case it would not make sense to have the file name as a command line argument
Opening Files with fopen
The fopen function is used to open files
  ▪ It returns a pointer to a FILE structure
  ▪ The FILE structure is defined in stdio.h and contains data about the file
  ▪ If the file cannot be opened fopen returns the null pointer
fopen takes two string arguments
  ▪ The name of the file to be opened
  ▪ The mode in which the file is to be opened
File Modes
Mode              Meaning
"r"               opens text file for reading
"w"               opens text file for writing, overwrites existing files, creates new files
"a"               opens text file for writing, appends to the end of existing files
"r+"              opens text file for update (both reading and writing)
"w+"              opens text file for update (both reading and writing), overwrites existing files, creates new files
"a+"              opens text file for update (both reading and writing), the whole file can be read but writing only appends to the end of the file
"rb", "wb", ...   the same as the preceding modes except that binary rather than text mode is used
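The difference between "w" and "a" can be seen by writing to the same file several times. The sketch below is illustrative only; write_and_measure is a hypothetical helper that writes some text in a given mode and then counts the characters in the file.

```c
#include <stdio.h>

/* Opens path in the given mode, writes text, closes the file, and returns
   the number of characters the file then contains (-1 on failure). */
long write_and_measure(const char *path, const char *mode, const char *text)
{
    long length = 0;
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        return -1;
    }
    fputs(text, fp);
    fclose(fp);

    /* Reopen for reading and count what is there now */
    fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }
    while (getc(fp) != EOF) {
        length++;
    }
    fclose(fp);
    return length;
}
```

Writing "abc" with mode "w" and then "de" with mode "a" leaves five characters in the file; writing "x" with mode "w" afterwards leaves only one, since "w" discards the existing contents.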
Character I/O
The functions getc and putc can be used for character based file I/O
  ▪ They are similar to getchar and putchar except that they require a file argument
The getc function will return the EOF value if it has reached the end of a file
  ▪ To avoid trying to process empty files check for EOF before processing the first character
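A minimal sketch of character based I/O follows, assuming both files are already open. The function name copy_chars is hypothetical. Because the EOF test happens before each character is processed, an empty input file is handled without any special case.

```c
#include <stdio.h>

/* Copies every character from in to out and returns the number copied. */
long copy_chars(FILE *in, FILE *out)
{
    long copied = 0;
    int ch;   /* getc returns an int so that EOF differs from every character */

    while ((ch = getc(in)) != EOF) {
        putc(ch, out);   /* the file pointer is the last argument of putc */
        copied++;
    }
    return copied;
}
```

Calling copy_chars(stdin, stdout) would echo keyboard input to the display, since stdin and stdout are themselves FILE pointers.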
Closing Files
Files should be closed when finished with
  ▪ Using the fclose function which takes a file pointer
The fclose function flushes buffers as required, and allows the file to be correctly opened again
The fclose function returns 0 if a file was closed successfully and EOF if it was not
  ▪ Files can be unsuccessfully closed if the disk is full or if their drive is removed
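Since fclose can fail, a careful program checks its return value. The following is a sketch under that assumption; save_text is a hypothetical helper, not part of the standard library.

```c
#include <stdio.h>

/* Writes text to the file called path.
   Returns 0 on success and -1 on any failure, including a failed fclose. */
int save_text(const char *path, const char *text)
{
    int ok;
    FILE *fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;
    }

    ok = (fputs(text, fp) != EOF);

    /* Buffered data may only be written out here, so fclose can fail
       even if every earlier write appeared to succeed. */
    if (fclose(fp) == EOF) {
        ok = 0;
    }

    return ok ? 0 : -1;
}
```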
File I/O Functions
There are file I/O functions similar to the I/O functions we’ve been using
Each function takes a FILE pointer
  ▪ Which could be stdin or stdout if input is to be from the keyboard, or output to the display
fprintf, fscanf and rewind
The fprintf and fscanf functions work just like printf and scanf except with files
  ▪ The file pointer is an additional first argument
  ▪ By contrast, the file pointer is the last argument for putc
The rewind function moves the file position back to the start of the file
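The three functions can be combined in one short sketch. The name write_then_sum is hypothetical, and fp is assumed to be a file that is open for update (for example with "w+"), so that it can be both written and read.

```c
#include <stdio.h>

/* Writes the integers 1..n to fp as text, one per line, then rewinds,
   reads them back with fscanf and returns their sum. */
long write_then_sum(FILE *fp, int n)
{
    long sum = 0;
    int value;
    int i;

    for (i = 1; i <= n; i++) {
        fprintf(fp, "%d\n", i);    /* file pointer is the first argument */
    }

    rewind(fp);                     /* back to the start before reading */

    /* fscanf returns the number of items converted, so 1 means success */
    while (fscanf(fp, "%d", &value) == 1) {
        sum += value;
    }
    return sum;
}
```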
fgets and fputs
The fgets function is used for string input
  ▪ The first argument is the address of a string (a character array)
  ▪ The second is the size of that array; fgets reads at most one fewer characters and adds the null terminator
  ▪ The third is the file from which input is read
  ▪ fgets returns NULL when it encounters an EOF
The fputs function is used for string output
  ▪ It has arguments for a string and a file pointer
  ▪ It does not append a newline when it prints, unlike puts which does
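A small sketch of line based I/O is shown below. Both helper names (write_line and read_line) are hypothetical. Note that fgets keeps the newline it reads, so read_line removes it, while fputs adds none, so write_line supplies one.

```c
#include <stdio.h>
#include <string.h>

/* Writes text to out followed by a newline (fputs does not add one). */
void write_line(FILE *out, const char *text)
{
    fputs(text, out);
    fputs("\n", out);
}

/* Reads one line from in into buf, which has room for size characters,
   and strips the trailing newline if there is one.
   Returns 1 if a line was read and 0 at end of file. */
int read_line(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL) {
        return 0;
    }
    buf[strcspn(buf, "\n")] = '\0';   /* cut the string at the newline */
    return 1;
}
```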
Append Names to a File 1
    #include <stdio.h>

    const int FNAME_LEN = 20;
    const int NAME_MAX = 40;

    int main()
    {
        char fname[FNAME_LEN];
        char name[NAME_MAX];
        FILE* fp;

        printf("Enter the name of the file: ");
        gets(fname);   // caution: gets cannot limit input length and was removed in C11

        fp = fopen(fname, "a+");
FNAME_LEN and NAME_MAX are the maximum lengths of file names and names
The file is opened for appending and reading (a+)
  ▪ fopen will create a new file if fname does not exist
Append Names to a File 2
    int main()
    {
        // ...
        puts("Enter names to add to the file");

        while(gets(name) != NULL && name[0] != '\0'){
            fprintf(fp, "%s\n", name);
        }
Add words to the end of the file
  ▪ puts prints a newline
  ▪ fprintf is similar to printf, and can be used to format numeric values
  ▪ The while loop continues until the user presses enter twice in sequence
Append Names to a File 3
    int main()
    {
        // ...
        puts("File contents\n");
        rewind(fp);
        while(fgets(name, NAME_MAX, fp) != NULL){
            printf("%s", name);
        }
        fclose(fp);
        return 0;
    }
Then print the entire contents of the file
  ▪ rewind goes back to the start of the file
  ▪ fgets is used instead of fscanf since names consist of two words
Append Names to a File 4
[Screenshot: a sample run of the program]
Note that the new names have been appended to the existing file rather than over-writing the file
Binary Files
Storing Numeric Data
All of the examples have involved string and character storage
Consider storing numeric data
  ▪ Storing integers is straightforward
  ▪ But what about storing floating point values?
We could use fprintf for floating point values
  ▪ e.g. fprintf(fp, "%f", num);
  ▪ But this entails making decisions about the format specifier
Storing Bytes
If fprintf stores numeric values they are converted to characters and stored as text
  ▪ This may waste space if the number contains many digits (e.g. 1.0/3)
  ▪ Or may lose precision if the format specifier is used to fix decimal places
      fprintf(fp, "%.2f", 1.0/3);
An alternative is to store the same pattern of bits used to represent the value
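The loss of precision is easy to observe by writing a value as text and reading it back. The sketch below assumes fp is an empty file open for update; text_round_trip is a hypothetical name.

```c
#include <stdio.h>

/* Writes value to fp as text with two decimal places, then reads the text
   back. Returns the value recovered from the text, or -1.0 if the read
   fails. */
double text_round_trip(FILE *fp, double value)
{
    double recovered = -1.0;

    fprintf(fp, "%.2f", value);   /* e.g. 1.0/3 is stored as the text 0.33 */
    rewind(fp);

    if (fscanf(fp, "%lf", &recovered) != 1) {
        return -1.0;
    }
    return recovered;
}
```

Passing 1.0/3 gives back a value close to 0.33, which is not equal to the original, because the digits beyond the second decimal place were never stored.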
Binary File
A binary file stores data using the same representation as a program
  ▪ Numeric data are not converted to strings
The functions fread and fwrite are used for binary I/O
  ▪ They are a little more complex than text file functions
  ▪ They require information about the size of data to be stored
Function Prototype for fwrite
size_t fwrite(const void* ptr, size_t size, size_t nmemb, FILE* fp)
  ▪ ptr is the address of the first memory location to be written
  ▪ size is the size of each variable
  ▪ nmemb is the number of variables
  ▪ fp is the file pointer
size_t is a type, defined in terms of other C standard types, and is an unsigned integer type (often unsigned int or unsigned long)
  ▪ size_t is the type returned by sizeof
Use of fwrite
The complex structure of fwrite allows it to store entire arrays in one function call
  ▪ double temperatures[365];
  ▪ fwrite(temperatures, sizeof(double), 365, fp);
The return value of fwrite is the number of items successfully written to the file
  ▪ This should equal the nmemb parameter
fread
The fread function takes the same set of arguments as fwrite
  ▪ The ptr argument is the address in memory to read the data into
fread should be used to read files that were written using fwrite
  ▪ double temperatures[365];
  ▪ fread(temperatures, sizeof(double), 365, fp);
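The return values of fwrite and fread can be used to confirm that a whole array was transferred. This is a sketch only: round_trip_doubles is a hypothetical helper, and fp is assumed to be open for binary update (for example with "wb+").

```c
#include <stdio.h>

/* Writes n doubles from values to fp, rewinds, and reads them back into
   back. Returns 1 if both calls transferred all n items, otherwise 0. */
int round_trip_doubles(FILE *fp, const double *values, double *back, size_t n)
{
    /* fwrite returns the number of items written, which should equal n */
    if (fwrite(values, sizeof(double), n, fp) != n) {
        return 0;
    }

    rewind(fp);

    /* fread returns the number of items read, which should also equal n */
    if (fread(back, sizeof(double), n, fp) != n) {
        return 0;
    }
    return 1;
}
```

Because the bytes are stored unchanged, the values read back compare equal to the originals, unlike values stored as formatted text.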
Random Access
It may be useful to move to a particular location in a file
  ▪ Without reading the preceding part of the file, like reading an array
  ▪ This is known as random access
The fseek and ftell functions allow random access to files
  ▪ They are usually used with binary files
fseek
The fseek function has three arguments
  ▪ A file pointer to the file
  ▪ An offset indicating the distance to be moved from the starting point
  ▪ The mode which identifies the starting point
      SEEK_SET – the beginning of the file
      SEEK_CUR – the current position
      SEEK_END – the end
fseek returns 0 normally and a nonzero value (typically -1) for an error
  ▪ Such as an attempt to move to a position before the start of the file
ftell
The ftell function returns the current position in a file, as a long
  ▪ The number of bytes from the start of the file
fseek and ftell may differ based on the OS
  ▪ Since the distance that fseek moves is measured in bytes they are normally used for binary files
ANSI C introduced fgetpos and fsetpos for use with larger file sizes
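One common use of fseek and ftell together is to find the size of a file. The sketch below uses a hypothetical helper called file_size and assumes fp is open in binary mode on a system where seeking to SEEK_END is supported, which is usual but not guaranteed by the C standard.

```c
#include <stdio.h>

/* Returns the size in bytes of the file open on fp, or -1 on error.
   The current position in the file is restored before returning. */
long file_size(FILE *fp)
{
    long size;
    long start = ftell(fp);            /* remember where we are */
    if (start == -1L) {
        return -1;
    }

    if (fseek(fp, 0L, SEEK_END) != 0) {   /* move to the end of the file */
        return -1;
    }
    size = ftell(fp);                  /* position at the end = size in bytes */

    if (fseek(fp, start, SEEK_SET) != 0) {   /* go back to where we started */
        return -1;
    }
    return size;
}
```

Dividing the result by sizeof(double) would give the number of values in a file that was written with fwrite as an array of doubles.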
Binary File Example
This example creates an array of random values and writes them to a binary file
The user is then asked for an index value
  ▪ The program finds and prints the value with that index in the file using fseek and fread
Writing and Reading an Array 1
    #include <stdio.h>
    #include <stdlib.h>

    #define ARR_SIZE 100   // length of the array

    int main()
    {
        // declarations
        double numbers[ARR_SIZE];
        double value;
        int i;
        long pos;
        char* fname = "numbers.dat";
        FILE* fp;
Writing and Reading an Array 2
    int main()
    {
        // ...
        // Create a set of double values
        for(i = 0; i < ARR_SIZE; ++i){
            numbers[i] = i + (double)rand() / RAND_MAX;   // RAND_MAX is defined in stdlib.h
        }
Create the array to be written to the file
  ▪ This is probably unnecessarily complicated but it produces an ordered array of doubles with digits to the right of the decimal point
Writing and Reading an Array 3
    int main()
    {
        // ...
        // Open file for writing
        if((fp = fopen(fname, "wb")) == NULL){   // "wb" for writing a binary file
            fprintf(stderr, "Could not open %s.\n", fname);
            exit(1);
        }

        // Write array in binary format
        fwrite(numbers, sizeof(double), ARR_SIZE, fp);
        fclose(fp);
Write the array to the file
  ▪ The arguments to fwrite are the array, the size of each value, the number of values and the file pointer
Writing and Reading an Array 4
    int main()
    {
        // ...
        // Open file for reading
        if((fp = fopen(fname, "rb")) == NULL){   // "rb" for reading a binary file
            fprintf(stderr, "Could not open %s.\n", fname);
            exit(1);
        }
Open file for reading
Writing and Reading an Array 5
    int main()
    {
        // ...
        // Read array elements as requested
        printf("Enter index in range 0 to %d: ", ARR_SIZE-1);
        scanf("%d", &i);
        while(i >= 0 && i < ARR_SIZE){
            pos = (long) i * sizeof(double);          // position in file to be read
            fseek(fp, pos, SEEK_SET);                 // move to position
            fread(&value, sizeof(double), 1, fp);     // binary read
            printf("value at index %d = %.2f\n", i, value);
            printf("Enter index (out of range to quit): ");
            if(scanf("%d", &i) != 1){                 // get next position, stop on invalid input
                break;
            }
        }
        fclose(fp);
        return 0;
    }
Read values from the file
Writing and Reading an Array 6
[Screenshot: a sample run of the program]
Note that the binary file is not comprehensible by humans