Introduction to Data Abstraction, Algorithms and Data Structures With C++ and the STL Spring 2020 Alexis Maciel Department of Computer Science Clarkson University Copyright © 2020 Alexis Maciel


Introduction to Data Abstraction, Algorithms
and Data Structures

With C++ and the STL

Spring 2020

Alexis Maciel
Department of Computer Science
Clarkson University

Copyright © 2020 Alexis Maciel


Contents

Preface

1 Data Abstraction and Object-Oriented Programming
    1.1 A Data Type for Times
    1.2 Encapsulation
    1.3 Modularity and Abstraction
    1.4 Names
    1.5 Object-Oriented Programming
    1.6 Constant Methods
    1.7 Constructors

2 More About Classes
    2.1 Inline Methods
    2.2 Get and Set Methods
    2.3 Operator Overloading
    2.4 Compiling Large Programs
    2.5 The make Utility

3 Strings and Streams
    3.1 C Strings
    3.2 C++ Strings
    3.3 I/O Streams
    3.4 String Streams

4 Vectors
    4.1 A Simple File Viewer
    4.2 Vectors in the STL
    4.3 The Software Life Cycle
    4.4 The Software Development Process
    4.5 Creation of the File Viewer
    4.6 Arrays

5 Generic Algorithms
    5.1 Introduction
    5.2 Iterators
    5.3 Iterator Types and Categories
    5.4 Vectors and Iterators
    5.5 Algorithms in the STL
    5.6 Implementing Generic Algorithms

Bibliography

Index


Preface

These notes are for a second course on computer programming and software development. In a first course, you likely focused on learning the basics: variables, control statements, input and output, files, vectors (or arrays), functions and structures. You may have also had an introduction to classes. These concepts are critically important and they are sufficient for the creation of many useful programs. But many other programs, especially large ones, require more powerful concepts and techniques, as well as a deeper understanding of classes and design principles.

In fact, the creation of large computer programs poses three basic challenges. The overall goal of these notes is to teach you concepts and techniques that will allow you to meet these three challenges. Here’s an overview.

First, a large program always contains a large number of details and all this information can be difficult to manage. The solution is to organize the program into a set of smaller components. This modularity is usually achieved through the technique of abstraction. Even if you are not familiar with these terms, you have already used abstraction in your programs: abstraction happens automatically every time you create a function. In these notes, you will learn how to apply the technique of abstraction to data.

Second, a large program typically holds a large amount of data that needs to be accessed efficiently. In response, computer scientists have invented a wide variety of data structures, which are ways of organizing data so it can be stored and accessed efficiently. You are already familiar with one data structure: vectors (or arrays). In these notes, you will learn about other basic data structures: linked lists and maps. You will learn how and when to use these data structures. You will also learn the techniques involved in the implementation of vectors and linked lists.

Third, a large program often performs complex tasks that require nontrivial techniques. Computer scientists have designed algorithms that perform a wide variety of complex tasks efficiently. In these notes, you will learn efficient algorithms for searching and sorting. You will also learn to design algorithms using the technique of recursion and to compare their relative efficiency in terms of their asymptotic running times.

Several of the concepts and techniques you will learn will be introduced through the creation of programs that are representative of real-life software. These examples will show that these concepts and techniques are tools that have been developed to solve real problems.

As you learn about data abstraction, data structures and algorithms, you will also learn about a number of other important topics such as the software development process, the importance of good documentation, object-oriented programming (but not inheritance and polymorphism), classes, pointers, dynamic memory allocation, exceptions, testing, the use of standard software components (as available in a standard library), and the creation of generic software components using templates and iterators.

These notes use the programming language C++ and its associated Standard Template Library (STL). Even though the main focus of these notes is not on the C++ language itself, you will learn all the relevant features of C++ and the STL. In particular, you will learn to implement data structures that are realistic subsets of the data structures provided in the STL.

After reading these notes and working on the exercises, you will be able to create fairly complex computer programs. But you will also be prepared to continue your study of computer science and software development. For example, at Clarkson University, these notes were written for CS142 Introduction to Computer Science II, a course that leads directly to CS242 Advanced Programming Concepts (graphical user interfaces, more object-oriented programming), CS344 Algorithms and Data Structures (more advanced than those covered in these notes) and CS241 Computer Organization. In fact, the topics covered in these notes are the foundation you need for the study of almost every other subject in computer science, including programming languages, operating systems, artificial intelligence, cryptography, computer networks, database systems, as well as, of course, large-scale software development.

These notes were typeset using LaTeX (MiKTeX implementation with the TeXworks environment). The paper size and margins are set small to make it easier to read the notes on the relatively small screen of a laptop or tablet. But then, if you’re going to print the notes, don’t print them to “fit the page”. That would give you pages with huge text and tiny margins. Instead, print them double-sided, at “actual size” (100%) and centered. If your favorite Web browser doesn’t allow you to do that, download the notes and print them from a standalone PDF reader such as Adobe Acrobat.

Feedback on these notes is welcome. Please send comments to [email protected].


Chapter 1

Data Abstraction and Object-Oriented Programming

In this chapter, we will create a data type for storing times. In the process, we will discuss and learn the important concepts of modularity, abstraction, data abstraction and object-oriented programming.

1.1 A Data Type for Times

Imagine that we are creating a program that works with times. This could be a program that keeps track of appointments, or a program that records when employees start and stop working. Let’s assume that times consist of only hours and minutes. Then each time could be stored as a pair of integers, one for the hours and one for the minutes:

    int h;
    int m;

The hours can be stored as a number from 0 to 23 so there is no need to store a.m. or p.m.

    void read_time(int & h, int & m)
    {
        cin >> h;
        cin.get(); // colon
        cin >> m;
    }

    void print_time(int h, int m)
    {
        cout << h << ':';
        if (m < 10)
            cout << 0;
        cout << m;
    }

Figure 1.1: Functions for reading and printing times

Now, imagine that this program needs to read and print times. The most convenient way to do this is to create functions for this purpose, as shown in Figure 1.1. The function read_time takes its arguments by reference so it can modify the original variables that are given to it. The function print_time, on the other hand, takes its arguments by value so it works with copies of the original variables. This prevents the function from accidentally modifying those variables. It also allows the function to be called on values that are not stored in variables, as in

    print_time(8, 35)
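To see the difference between the two ways of passing arguments in isolation, here is a minimal sketch. The two functions are hypothetical helpers introduced only for this illustration; they are not part of the Time code in these notes.

```cpp
// Hypothetical helper: receives a copy, so the caller's variable is unchanged.
void add_minute_by_value(int m)
{
    m = m + 1;
}

// Hypothetical helper: receives a reference, so the caller's variable changes.
void add_minute_by_reference(int & m)
{
    m = m + 1;
}
```

If a variable m holds 35, calling add_minute_by_value(m) leaves it at 35, while calling add_minute_by_reference(m) changes it to 36. Only the first function can be called on a value that is not stored in a variable, as in add_minute_by_value(35).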

This all works but it’s not very convenient: every time we want to declare a time, we need to declare two integers. And every time we want to pass a time to a function such as read_time or print_time, we need to pass two integers:

    read_time(h, m);
    print_time(h, m);

    struct Time
    {
        int hours;
        int minutes;
    };

Figure 1.2: The Time structure

A solution is to group the integers for the hours and the minutes into a single unit. In C++, this can be done by creating a type of structure, as shown in Figure 1.2. Then the creation of a time can be done by declaring a single variable:

    Time t;

And the read and print functions can be revised to take a single value as argument, as shown in Figure 1.3. These functions can then be used more simply as follows:

    read(t);
    print(t);

Note that the functions read_time and print_time had the word “time” in their names to distinguish them from other similar functions that would take two integers as arguments. But now that the functions have an argument of type Time, this is no longer necessary and the functions can have the more natural names read and print.

    void read(Time & t)
    {
        cin >> t.hours;
        cin.get(); // colon
        cin >> t.minutes;
    }

    void print(const Time & t)
    {
        cout << t.hours << ':';
        if (t.minutes < 10)
            cout << 0;
        cout << t.minutes;
    }

Figure 1.3: Revised functions for reading and printing times

Note also that the print function now takes its argument by constant reference. This ensures that the argument cannot be changed accidentally while at the same time avoiding the copying that would occur if the argument was passed by value. If an argument consists of more than one integer or a few characters, as is the case with the Time structure, it usually takes less time to pass it by reference (constant or not) than by value.
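As a small sketch of passing by constant reference, consider an operation that compares two times. The function is_before is a hypothetical example that does not appear in the code of these notes; the Time structure is repeated so the example is self-contained.

```cpp
struct Time
{
    int hours;
    int minutes;
};

// Hypothetical operation: returns true if a is earlier in the day than b.
// Both arguments are passed by constant reference: nothing is copied and
// the function cannot modify the caller's times.
bool is_before(const Time & a, const Time & b)
{
    // a.hours = 0;   // would not compile: a is a constant reference
    if (a.hours != b.hours)
        return a.hours < b.hours;
    return a.minutes < b.minutes;
}
```

The commented-out assignment shows the protection that the constant reference provides: the compiler rejects any attempt to change the argument.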

So structures are convenient because they allow us to group several variables into a single unit. This simplifies the code, which makes it easier to both write and understand.

We end this section by making another improvement to the read and print functions. Right now, those functions only allow us to read times from standard input (cin) and print times to standard output (cout). We can generalize these functions by adding a stream argument, as shown in Figure 1.4. This allows these functions to read from any input stream and print to any output stream, including file streams. Note that stream arguments are always passed by reference.

    void read(Time & t, istream & in)
    {
        in >> t.hours;
        in.get(); // colon
        in >> t.minutes;
    }

    void print(const Time & t, ostream & out)
    {
        out << t.hours << ':';
        if (t.minutes < 10)
            out << 0;
        out << t.minutes;
    }

Figure 1.4: The functions read and print with a stream argument
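As one possible illustration of why the stream argument is useful, the following sketch reads a time from a string and prints it to another string, using the standard string streams istringstream and ostringstream. The functions read and print are those of Figure 1.4; the helper round_trip is a hypothetical name introduced only for this example.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

struct Time
{
    int hours;
    int minutes;
};

void read(Time & t, istream & in)
{
    in >> t.hours;
    in.get(); // colon
    in >> t.minutes;
}

void print(const Time & t, ostream & out)
{
    out << t.hours << ':';
    if (t.minutes < 10)
        out << 0;
    out << t.minutes;
}

// Hypothetical helper: reads a time from text and returns its printed form.
string round_trip(const string & text)
{
    istringstream in(text);   // an input stream that reads from a string
    ostringstream out;        // an output stream that writes to a string
    Time t;
    read(t, in);
    print(t, out);
    return out.str();
}
```

With these definitions, round_trip("8:5") returns "8:05". The same read and print functions work unchanged with cin, cout and file streams.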

Source code for the data type Time we created in this section, together with a test driver, is available on the course website under Time1.

Study Questions

1.1.1. What is the purpose of a structure?


1.1.2. When should an argument be passed by reference?

1.1.3. When should an argument be passed by constant reference?

1.2 Encapsulation

Continuing with our example of a program that works with times, notice that as long as the functions that use times only need to read and print times, there is no need for those functions to directly access the hours and minutes stored in individual times, as shown in this example:

    Time t;
    read(t, cin);
    print(t, cout);

In fact, ideally, users of times should not directly access the hours and minutes stored in the structure, for two main reasons.

First, it’s safer. That’s because there could be hundreds or even thousands of functions that use times. On the other hand, the number of functions such as read and print that perform basic operations on times is probably much smaller, typically no more than a dozen or two. If the operations are the only functions that access the time data directly, then to ensure that this data is used properly, all we need to do is make sure that the operations use the data properly. We don’t need to worry about this when implementing the users of times. Since the number of users can be much larger than the number of operations, this saves time and effort, which helps us get it right.

Second, if the users of times do not directly access the time data, then if we decided to change how the time data is stored, we would not need to revise those functions. In fact, we would not even need to check if those functions need to be revised. The operations would likely need to be revised but, again, since the number of operations is usually small and the number of users can be very large, this saves time and effort and reduces the risk of errors.

The fact that users of Time do not access the data directly but instead use the operations read and print is an example of encapsulation. Encapsulation can be defined as the bundling of the data and operations of a data type in such a way that the users of the data type can access the operations but not the data.

With the current version of the Time data type, encapsulation depends on programmer discipline because nothing prevents the users of Time from directly accessing the data. In the remainder of this section, we learn how encapsulation can be enforced, that is, how we can prevent the users of Time from accessing the data.

The simplest way of achieving this is to turn Time into a class and declare that the Time data, which consists of the integers hours and minutes, is private. This prevents users of the class from directly accessing the data. We then declare that the Time operations are friends of the class so they have special permission to access the data. The result is shown in Figure 1.5.

Note that we didn’t have to turn Time into a class; in C++, we could have left it as a structure. In fact, the only difference between structures and classes in C++ is that, by default, data is private in a class and public in a structure (meaning that it can be accessed without restrictions). In these notes, we will follow the common practice of using structures when, and only when, all the data is public.
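The difference in default access can be sketched as follows. The names OpenTime, ClosedTime and minutes_since_midnight are hypothetical and are used only to compare a structure and a class side by side; the initial values 8 and 35 are arbitrary.

```cpp
// Hypothetical types, used only to compare default access.
struct OpenTime
{
    int hours;      // public by default in a structure
    int minutes;
};

class ClosedTime
{
    friend int minutes_since_midnight(const ClosedTime & t);

    int hours = 8;      // private by default in a class
    int minutes = 35;
};

// A friend of ClosedTime, so it is allowed to read the private data.
int minutes_since_midnight(const ClosedTime & t)
{
    return t.hours * 60 + t.minutes;
}
```

Given a variable a of type OpenTime, any function can write a.hours = 8. Given a variable b of type ClosedTime, the expression b.hours is rejected by the compiler everywhere except in the friend function.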

In this section, we used classes and privacy to enforce encapsulation. In a later section, we will learn that classes allow us to do even more.

Source code for the class Time of this section is available on the course website under Time2.


    class Time
    {
        friend void read(Time & t, istream & in);
        friend void print(const Time & t, ostream & out);

    private:
        int hours;
        int minutes;
    };

    void read(Time & t, istream & in)
    {
        in >> t.hours;
        in.get(); // colon
        in >> t.minutes;
    }

    void print(const Time & t, ostream & out)
    {
        out << t.hours << ':';
        if (t.minutes < 10)
            out << 0;
        out << t.minutes;
    }

Figure 1.5: A Time class with friend operations


Study Questions

1.2.1. What is encapsulation?

1.2.2. What are two advantages of encapsulation?

1.2.3. What does it mean for a class to grant friendship to a function?

1.3 Modularity and Abstraction

Large programs are always designed as collections of smaller components. For example, the Time class could be a component of a program. The functions that use Time would be other components.

In a well-designed modular program, these software components should satisfy the following two properties:

1. Cohesion: each component performs one well-defined task.

2. Independence: each component is as independent as possible from the others.

In the context of software, independence is achieved when changes to one component do not affect other components. Notice that the definition of modularity says that components should be as independent as possible from each other. Independence is a property that can be achieved at various degrees. The goal is to maximize independence.

Components with a high degree of independence are said to have a low degree of coupling.

Modular programs have a number of advantages. For example, a component that performs one well-defined task is going to be easier to understand and also more likely to be useful in another project. This is not true of a component that performs several unrelated tasks.

1. Modular programs are easier to understand because their components perform well-defined tasks and can be understood in isolation. In addition, modular programs contain little redundant code.

2. Modularity facilitates software reuse because components perform well-defined tasks that may occur elsewhere and because components depend as little as possible on the context in which they are used.

3. Modular programs are easier to implement because components can be coded one at a time and the work can be divided among various programmers.

4. Modular programs are easier to test because components can be tested in isolation. This simplifies locating and fixing errors.

5. Modular programs are easier to modify because changes to one component often only affect that component.

Figure 1.6: Advantages of modularity

Independence, on the other hand, allows us to code, test and modify components in isolation. Independence is also necessary for software reuse: a component that is highly dependent on a particular context will be difficult to use in a different one.

Modularity also typically reduces redundant code because the same component can be reused for repeated tasks. This helps make programs easier to understand and easier to modify. Figure 1.6 summarizes the advantages of modularity.


Independence is usually achieved through information hiding, that is, by having components hide information from each other. Of course, the term hiding here is somewhat figurative: components aren’t people. We consider that component A hides information from component B if component B could be created without knowing some aspect of component A.

With functions, a high degree of information hiding and independence is essentially automatic. For example, consider the function sqrt from the C++ standard library. We know what that function does: it returns the square root of its argument. We may also know how that function does what it does. But what is critical to note is that to use the sqrt function, we only need to know what it does, not how it does it. In other words, a function that uses the sqrt function depends only on what sqrt does, not on how it does it.
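As a small sketch of a function that uses sqrt in this way, consider the following. The function hypotenuse is a hypothetical example; it relies only on the fact that sqrt returns the square root of its argument.

```cpp
#include <cmath>

// Hypothetical user of sqrt: returns the length of the hypotenuse of a
// right triangle whose other two sides have lengths a and b.
double hypotenuse(double a, double b)
{
    // We depend on what sqrt does (it returns a square root),
    // not on how the library computes it.
    return std::sqrt(a * a + b * b);
}
```

If the library changed the way sqrt computes its result, hypotenuse would not need to be revised, or even examined.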

In fact, independence between software components is usually achieved in this way: by having the users of a component depend on its purpose (what the component does) but not on its implementation (how the component does what it does). This is called abstraction. In the case of functions, we call it procedural abstraction. Abstraction is one of the most important techniques for the design of modular programs.

Notice that abstraction leads to two clearly separate perspectives on each software component. From the point of view of its users, a component can be seen as having a purpose but no implementation: it is an abstract component. This is the outside or public view of the component. This view concerns only what the component does and it includes the interface of the component, which is the information needed to use the component, such as the name, return type and arguments of a function.

The developer of a component has a different view: he or she also sees the implementation of the component. This is the inside or private view. It concerns how the component does what it does and it includes the inner workings of the component.

As discussed in the previous section, the concept of abstraction can also be applied to data types. Consider our class Time. Functions that use Time rely on the fact that values of type Time represent times on a 24-hour clock and that the Time operations read and print do what they are supposed to do. But the users of Time do not depend on how the time data is stored or on how the operations perform their tasks. In other words, the users of Time depend on what times are and what operations can be performed on times, but not on how times are stored or how the operations do what they are supposed to do. This is abstraction applied to data and it’s called data abstraction.

With data abstraction, from the point of view of a user component, a data type looks as if it has a purpose but no implementation. When a data type is viewed in this way, we call it an abstract data type (ADT).

It’s important to note that while abstraction is automatic with functions, it is not automatic with data. With data, it must be designed into the software. This is why it is important to understand what data abstraction is and how to achieve it.

Data abstraction is normally achieved through encapsulation, a concept we discussed in the previous section. Recall that encapsulation is the bundling of the data and operations of a data type in such a way that users of the data type can access the operations but not the data. We achieved this with Time by declaring the Time data private. In this way, the users of Time cannot access the data directly; they must instead use the operations that are provided by the data type (read and print).

Data abstraction plays a major role in the design of modular programs because it leads to a high degree of independence. And the benefits can be significant. For example, as mentioned in the previous section, in a large program, a class such as Time could be used by a very large number of other functions. With data abstraction, none of those other functions depend on how the time data is stored, which makes it easier to change how times are stored, if needed. In contrast, without data abstraction, if users of Time were allowed to directly access the data, then each one of them would have to be carefully examined for any necessary revisions.
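To make this concrete, here is a sketch of one way the storage of times could be changed without affecting the users of Time. It assumes, purely for illustration, that a time is stored as a single integer total_minutes (the number of minutes since midnight) instead of as hours and minutes. The function echo_time is a hypothetical user of Time.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

class Time
{
    friend void read(Time & t, istream & in);
    friend void print(const Time & t, ostream & out);

private:
    int total_minutes = 0;  // assumed alternative: minutes since midnight
};

// The operations are revised to match the new way of storing times.
void read(Time & t, istream & in)
{
    int hours = 0;
    int minutes = 0;
    in >> hours;
    in.get(); // colon
    in >> minutes;
    t.total_minutes = hours * 60 + minutes;
}

void print(const Time & t, ostream & out)
{
    int minutes = t.total_minutes % 60;
    out << t.total_minutes / 60 << ':';
    if (minutes < 10)
        out << 0;
    out << minutes;
}

// Hypothetical user of Time: it uses only read and print, so it is the
// same code no matter how the time data is stored.
string echo_time(const string & text)
{
    istringstream in(text);
    ostringstream out;
    Time t;
    read(t, in);
    print(t, out);
    return out.str();
}
```

Only the class and its two operations had to change. A user such as echo_time would compile and behave the same way with the class of Figure 1.5.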

So we can use classes and privacy to encapsulate data types. This allows us to achieve data abstraction, which leads to more information hiding, more independence and more modularity.

In the next section, we will learn that classes also allow us to program in an object-oriented way, which again leads to more modularity. And in the next chapter, we will learn more about classes in C++. In the meantime, we end this section with several notes related to the topics we just discussed.

The main benefit of encapsulation is data abstraction. But, as explained in the previous section, encapsulation also increases safety because, for example, it prevents users of a data type such as Time from setting the data type’s variables (hours and minutes) to invalid values.
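The safety benefit can be sketched with an operation that refuses invalid values. The operations set_time and minutes_since_midnight below are hypothetical; they are not among the operations defined for Time so far in these notes.

```cpp
class Time
{
    friend bool set_time(Time & t, int hours, int minutes);
    friend int minutes_since_midnight(const Time & t);

private:
    int hours = 0;
    int minutes = 0;
};

// Hypothetical operation: stores the new value only if it is a valid time
// on a 24-hour clock. Returns true if the time was changed.
bool set_time(Time & t, int hours, int minutes)
{
    if (hours < 0 || hours > 23 || minutes < 0 || minutes > 59)
        return false;
    t.hours = hours;
    t.minutes = minutes;
    return true;
}

// Hypothetical operation: returns the number of minutes since midnight.
int minutes_since_midnight(const Time & t)
{
    return t.hours * 60 + t.minutes;
}
```

Because the data is private, a user cannot write t.hours = 25. The only way to change a time is through an operation such as set_time, which leaves the time unchanged when given invalid values.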

A concept related to encapsulation is the grouping of several variables into a single unit to form a new data type, without any restrictions on how the users can access the data. This can be called aggregation. In C++, this can be done by creating a type of structure and not specifying any privacy, just as we did in the first section of this chapter, or by creating a class and making the data public. In these notes, we will follow the common practice of using structures for aggregation and classes for encapsulation.

It is possible to view functions as a form of encapsulation. The idea is that a function’s body bundles the operations that perform a task and the users of the function don’t perform those operations directly but instead call the function. Most people, however, think of data types when they think of encapsulation.

Finally, in languages like C++, encapsulation and data abstraction can be realized with classes and privacy. But some languages, such as JavaScript before ECMAScript 2022 introduced private class members, have classes but no notion of privacy. Other languages, such as C, don’t even have classes. In those languages, it is still possible to program with data abstraction but we have to rely on programmer discipline to ensure that the users of a data type do not directly access the data but instead use the operations of the data type. This is how things were done before object-oriented languages like C++ and Java became widely used.

Study Questions

1.3.1. What exactly is a modular program?

1.3.2. Give five distinct advantages of modular programs.

1.3.3. Why does eliminating redundant code make programs easier to understand and easier to modify?

1.3.4. What is abstraction?

1.3.5. With abstraction, what information is hidden from the users of a software component?

1.3.6. What is procedural abstraction?

1.3.7. What is data abstraction?

1.3.8. Besides data abstraction, what is another benefit of encapsulation?

Page 23: Introduction to Data Abstraction, Algorithms and Data ...alexis/CS142/Notes/notes.pdf · will discuss and learn the important concepts of modularity, abstraction, data abstraction

1.4. NAMES 15

1.4 Names

In the examples we’ve seen so far in these notes, we’ve had to choose several names for variables and functions, as well as for a structure and a class. The most important consideration when choosing names is that they should be descriptive. For example, the name of a variable should describe its value. But there is more to it than that. In this section, we describe the naming convention we will follow in these notes.

It is useful to be able to easily distinguish between variable names and names of types. We will do this by using different formats for variables and types. Variable names will use the underscore format: all lowercase with underscores to separate words. For example, t, hours and start_time. Names of types, on the other hand, will use the mixed-case format: capitalized words concatenated without any separating character. For example, Time and PhoneBookEntry.

Function names normally occur together with arguments so we will reuse the underscore format for them, as in read and find_entry. In addition, when the main purpose of a function is to return a value, the name of the function will be a noun phrase that describes that value, as in length. On the other hand, when the main purpose of a function is to perform an action, the name of the function will be a verb phrase that describes the action, as in read and print.
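The following sketch applies this convention to a small example. The names PhoneBookEntry and find_entry are mentioned above, but the members, arguments and behavior shown here are assumptions made only for illustration, as is the function number_of_entries.

```cpp
#include <string>
#include <vector>

// Type name: mixed-case format.
struct PhoneBookEntry
{
    std::string name;           // variable names: underscore format
    std::string phone_number;
};

// Noun phrase: the main purpose is to return a value.
int number_of_entries(const std::vector<PhoneBookEntry> & phone_book)
{
    return static_cast<int>(phone_book.size());
}

// Verb phrase: the main purpose is to perform an action (a search).
// Returns the index of the first entry with the given name, or -1 if
// there is no such entry.
int find_entry(const std::vector<PhoneBookEntry> & phone_book,
               const std::string & name)
{
    for (int i = 0; i < number_of_entries(phone_book); ++i)
        if (phone_book[i].name == name)
            return i;
    return -1;
}
```

The format alone tells a reader that PhoneBookEntry is a type, while phone_book and find_entry are a variable and a function.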

There are many different naming conventions that can be followed. Whatis most important is to use some sort of convention and to be consistent. Inparticular, when working as a team on a project, it is critical for every memberof the team to follow the same convention. So if you are asked in an exerciseto extend some of the code presented in these notes, you should follow theconvention used in these notes.
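As a small illustration of this convention, here is a sketch that applies it to a made-up phone book entry. The data members of PhoneBookEntry and the two functions name_length and print_entry are assumptions invented for this example; they do not come from the code of these notes.

```cpp
#include <iostream>
#include <string>

// Name of a type: mixed-case format.
struct PhoneBookEntry
{
    std::string name;           // names of variables: underscore format
    std::string phone_number;
};

// Main purpose is to return a value: the name is a noun phrase.
int name_length(const PhoneBookEntry & entry)
{
    return static_cast<int>(entry.name.length());
}

// Main purpose is to perform an action: the name is a verb phrase.
void print_entry(const PhoneBookEntry & entry, std::ostream & out)
{
    out << entry.name << ": " << entry.phone_number;
}
```

With these names, a call such as print_entry(entry, cout) reads as an action while name_length(entry) reads as a value.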


Study Questions

1.4.1. What format will we use in these notes for the names of variables, types and functions?

1.5 Object-Oriented Programming

We now know that classes allow us to enforce data abstraction by preventing users of a data type from directly accessing the data. But classes have another benefit: they support object-oriented programming (OOP). This section briefly discusses the basic idea and rationale behind object-oriented programming.

Techniques like modularity and abstraction have been developed to help in the construction of large programs. Object-oriented programming is another one of those techniques.

The way most people usually learn to program is called imperative programming. In imperative programming, a program is viewed as a sequence of instructions that tell the computer what to do. Imperative programming works well with small programs but it is not as effective with large programs.

In object-oriented programming, a program is viewed as a collection of objects that work together to accomplish the overall goal of the program. Each object has a certain set of responsibilities. Objects collaborate by requesting services from each other. They do so by sending messages to other objects. The receiver of a message responds by following a predetermined method. The set of messages understood by an object, as well as the methods used to respond to those messages, are determined by the type, or class, of the object.

For example, with our class Time (see Figure 1.5), the way to read a time t is to call a function:

read(t, in);

In a sense, the time t just sits there waiting for us to do things to it.

In contrast, in the object-oriented view of programming, the time t is an object with responsibilities. When needed, we ask t to read from in:

t.read(in);

On receiving the read message, t responds by executing the corresponding method (which would be defined in the class Time). Note how read illustrates the fact that messages can have arguments.

bool less_than(const Time & t1, const Time & t2)
{
    return (t1.hours < t2.hours) ||
           (t1.hours == t2.hours
            && t1.minutes < t2.minutes);
}

Figure 1.7: The function less_than

Here’s another example. Suppose that we want to be able to compute whether a time occurs before another one. We could create a function for that purpose, as shown in Figure 1.7. This function would be used as follows:

less_than(t1, t2)

This is imperative programming. The object-oriented way is to ask one of the times to tell us whether it occurs before the other time:

t1.less_than(t2)

Note that the two Time objects play different roles: t1 is the receiver of the message while t2 is an argument of the message.


Figure 1.8 shows the class Time with all the time operations turned into methods. The methods are declared public so they can be accessed by the users of Time. (We will see examples of private methods later.) Note that methods are sometimes called member functions in C++.

In the implementation of the methods, the data members of the receiver are accessed without specifying an object. For example, in the implementation of less_than, the hours of the receiver are accessed as hours while the hours of the argument t are accessed as t.hours.

One way to make sense of this (and to remember how it works) is to read a method from the perspective of the receiver, as if the method was telling the receiver how to respond to the message. So the less_than method is telling the receiver to compare its hours and minutes to those of t.
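To see receivers and arguments at work, here is a minimal sketch that uses the read and less_than methods of Figure 1.8. The helper functions time_from and occurs_before are assumptions made up for this example, and string streams are used only so that the times can be read from text.

```cpp
#include <iostream>
#include <sstream>
#include <string>

class Time
{
public:
    void read(std::istream & in)
    {
        in >> hours;
        in.get(); // colon
        in >> minutes;
    }

    bool less_than(const Time & t)
    {
        // hours and minutes belong to the receiver,
        // t.hours and t.minutes belong to the argument
        return (hours < t.hours) ||
               (hours == t.hours && minutes < t.minutes);
    }

private:
    int hours;
    int minutes;
};

// Hypothetical helper: reads a time such as 8:30 from a string.
Time time_from(const std::string & text)
{
    std::istringstream in(text);
    Time t;
    t.read(in);    // t is the receiver, in is the argument
    return t;
}

// Hypothetical helper: asks the first time if it occurs before the second.
bool occurs_before(const std::string & first, const std::string & second)
{
    Time t1 = time_from(first);
    Time t2 = time_from(second);
    return t1.less_than(t2);    // t1 is the receiver, t2 is the argument
}
```

For example, occurs_before("8:30", "10:15") asks the time 8:30 to compare itself to 10:15.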

The main advantage of OOP is two-fold. First, it automatically produces a lot of data abstraction because if you’re going to treat a piece of data as a smart object that provides services to you, why should you mess with that object’s data? This results in a high degree of independence between components.

Second, OOP encourages software components to delegate as many tasks as possible to other components, because these other components are not dumb pieces of data but smart objects that provide services. This delegation allows each component to stay focused on its main task. In other words, OOP leads to a high degree of cohesion in components. Recall that besides independence, cohesion is the other key aspect of modularity.

Note that in pure OOP, every component of a program is an object. With languages such as C++ and Java, we normally use a mix of object-oriented and imperative programming.

Source code for the class Time of this section is available on the course website under Time3.


class Time
{
public:
    void read(istream & in)
    {
        in >> hours;
        in.get(); // colon
        in >> minutes;
    }

    void print(ostream & out)
    {
        out << hours << ':';
        if (minutes < 10)
            out << 0;
        out << minutes;
    }

    bool less_than(const Time & t)
    {
        return (hours < t.hours) ||
               (hours == t.hours &&
                minutes < t.minutes);
    }

private:
    int hours;
    int minutes;
};

Figure 1.8: A class Time with methods


Study Questions

1.5.1. In OOP, how is an object viewed differently from just a piece of data?

1.5.2. What is a message? What is a method? What is a receiver?

1.5.3. What is the ultimate goal of both data abstraction and OOP?

Exercises

1.5.4. Create a class of dates. This class should be called Date and each object in this class represents a date such as January 22, 2019. Include the following methods in the class.

a) A method read(in) that reads the date from input stream in. Dates are typed as m/d/y where m, d and y are integers. No error-checking is performed.

b) A method print(out) that prints the date to output stream out. Dates are printed in numerical format, as in 1/22/2019.

c) A method print_in_words(out) that also prints the date to output stream out but with the month in words, as in January 22, 2019.

1.5.5. Create a class of three-dimensional vectors. The class should be called ThreeDVector. Each object in this class represents a three-dimensional vector of real numbers, such as (3.5, 2.64, −7). Include the following methods in the class. Carefully consider how arguments should be passed to the methods.


a) A method read(in) that reads the vector from input stream in. Vectors are entered in the format (x, y, z) where x, y and z are real numbers. No error-checking is performed.

b) A method print(out) that prints the vector to output stream out. Vectors are printed in the format described for read.

c) A method sum(v) that returns the sum of the receiver and vector v. The argument is a ThreeDVector object.

1.5.6. Create a class of fractions called Fraction. In this class, each object represents a fraction, that is, a number of the form a/b where a is an integer and b is a positive integer. Include the following methods in the class. Carefully consider how arguments should be passed to the methods.

a) A method read(in) that reads the fraction from input stream in. Fractions are entered in the format a/b where a is an integer and b is a positive integer. No error-checking is performed.

b) A method print(out) that prints the fraction to output stream out. Fractions are printed in the format described for read. Fractions are not reduced.

c) Methods sum(s) and product(s) that return, respectively, the sum and product of the receiver and the argument. For example, if the receiver r is 2/3 and s is 3/4, then r.sum(s) returns a fraction whose value is 17/12 and r.product(s) returns 6/12. The returned fractions are not reduced. The arguments and the returned values are all Fraction objects.

1.6 Constant Methods

In this section, we address a major flaw of our class Time. Consider the function shown in Figure 1.9. This function prints a time and then moves to the next line. Note that the Time argument of the function is passed by constant reference. This is the right thing to do since it avoids unnecessarily copying the time while also preventing println from accidentally modifying the time.

void println(const Time & t, ostream & out)
{
    t.print(out);
    out << endl;
}

Figure 1.9: The println function

Unfortunately, println will not compile. The compiler will essentially complain that the method print may change t and t should not be changed because it is declared constant.

The solution is to declare the method print to be a constant method. This is done by adding the keyword const right after the argument list of the method:

void print(ostream & out) const

This essentially declares that the receiver of the print method is constant and cannot be changed by the method.

This has two consequences. First, when compiling print, the compiler will check that the method does not change its receiver. This is valuable on its own because we really don’t want print to change its receiver. So it’s safer to have the compiler check that for us.


Second, because print is a constant method, the compiler will allow print to be used on a constant Time. In general, only messages corresponding to constant methods can be sent to constant objects.

To summarize, methods that are not supposed to modify their receivers should be declared constant for two reasons: to prevent them from modifying their receivers and to allow for the methods to be used on constant objects.

All this means that we should also declare the less_than method constant:

bool less_than(const Time & t) const

That’s because this method should not change either time: neither its argument nor its receiver.
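The following sketch puts these pieces together: a version of Time in which print is a constant method while read is not, along with the println function of Figure 1.9. The helper printed_line is an assumption made up for this example; it uses string streams only so that the output can be examined.

```cpp
#include <iostream>
#include <sstream>
#include <string>

class Time
{
public:
    void read(std::istream & in)    // changes the receiver: not constant
    {
        in >> hours;
        in.get(); // colon
        in >> minutes;
    }

    void print(std::ostream & out) const    // cannot change the receiver
    {
        out << hours << ':';
        if (minutes < 10)
            out << 0;
        out << minutes;
    }

private:
    int hours;
    int minutes;
};

// t is constant here, so only constant methods can be used on it.
void println(const Time & t, std::ostream & out)
{
    t.print(out);
    out << std::endl;
}

// Hypothetical helper: reads a time from text and returns what println prints.
std::string printed_line(const std::string & text)
{
    std::istringstream in(text);
    Time t;
    t.read(in);
    std::ostringstream out;
    println(t, out);
    return out.str();
}
```

If the keyword const is removed from print, the compiler should reject the call t.print(out) in println. And if read were declared constant, the compiler should reject the statements that modify hours and minutes.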

Source code for the class Time of this section is available on the course website under Time4.

Study Questions

1.6.1. How do you prevent a method from modifying its receiver?

1.6.2. How do you ensure that a method can be used on a constant object?

Exercises

1.6.3. Consider the new classes described in the exercises of the previous section. In each of these classes, which methods should be constant?

1.7 Constructors

A common programming mistake is to declare a variable but then forget to set it. This kind of error can be difficult to detect and can lead to hard-to-find bugs. This is why it’s a good habit to always initialize a variable as soon as it is declared. For example, if a counter count is needed for a loop, it is better to set it to its initial value right away, as in

int count = 0;

This way, we won’t forget to set the counter later.

However, it would be more convenient and even safer if variables were initialized automatically as soon as they are declared. This turns out to be another aspect of object-oriented programming: objects are always automatically initialized as soon as they are created. This is done by special methods called constructors.

We’re going to start by adding to Time what is called a default constructor. This is the constructor that executes whenever a Time object is created by a declaration such as

Time t;

The object produced by the default constructor can be called the default object of the class.

We need to decide what the default object of class Time should be. Since it’s not clear what initial value a time should have, it is probably safer to simply initialize a time to an invalid value, one that will be easy to recognize in case we later forget to set the time to a proper value. A value such as 99:99 works well.

The default constructor can then be defined as follows:

Time() { hours = minutes = 99; }

The constructor should be placed in the public section of the class. Note that constructors have the same name as the class but they have no return type (not even void). Constructors can have arguments but the default constructor doesn’t have any.

Our default constructor currently uses an assignment statement to set the data members hours and minutes. A better approach is to use initializers:

Time() : hours(99), minutes(99) {}

In the case of data members with a primitive type, such as int, double, char and bool, this makes no difference. But if Time had a data member that was an object, a string, for example, then that data member would be first initialized by the default constructor of its class and then changed in the body of the Time constructor by the assignment statement. In this case, the initializer version would be more efficient because initializers override the automatic initialization of the data member. In other words, using initializers in a constructor ensures that data members are initialized only once. It’s a good idea to get in the habit of using initializers whenever possible.
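The difference can be observed with a small experiment. In the sketch below, the class Label and the two classes that contain a Label are assumptions made up for this example. A Label counts how many times it has been initialized or assigned. The sketch also defines an assignment operator for Label, which relies on operator overloading, a topic covered later in these notes.

```cpp
#include <string>

// Hypothetical data member type that counts how many times it has been set.
class Label
{
public:
    Label() : text("default"), steps(1) {}
    Label(const std::string & s) : text(s), steps(1) {}

    Label & operator=(const Label & other)
    {
        text = other.text;
        steps = steps + 1;    // an assignment on top of the initialization
        return *this;
    }

    int steps_taken() const { return steps; }

private:
    std::string text;
    int steps;    // number of initializations and assignments so far
};

class AssignedTime
{
public:
    // label is first initialized by the default constructor of Label
    // and then changed by the assignment statement
    AssignedTime() { label = Label("wake up"); }

    int label_steps() const { return label.steps_taken(); }

private:
    Label label;
};

class InitializedTime
{
public:
    // label is initialized only once, by the initializer
    InitializedTime() : label("wake up") {}

    int label_steps() const { return label.steps_taken(); }

private:
    Label label;
};
```

In this experiment, AssignedTime().label_steps() is 2 while InitializedTime().label_steps() is 1.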

Figure 1.10 shows the class Time with two new constructors. These constructors can be used to initialize Time objects to specific values. The second constructor can be used in a declaration such as

Time noon(12);

where it initializes the time to 12:00. The third constructor can be used in a declaration such as

Time wake_up_time(6, 15);

where it initializes the time to 6:15. Note how the arguments of these constructors are provided as part of the declaration of the times. Which constructor is used in a declaration is determined by the arguments provided.


class Time
{
public:
    Time() : hours(99), minutes(99) {}
    Time(int h) : hours(h), minutes(0) {}
    Time(int h, int m) : hours(h), minutes(m) {}

    ...

private:
    int hours;
    int minutes;
};

Figure 1.10: The class Time with three constructors

The implementations of the three Time constructors have most of their code in common. It is possible to eliminate some of this repetition by having the first two constructors delegate their work to the third one as shown in Figure 1.11. We then say that the first two constructors are delegating constructors.

class Time
{
public:
    Time() : Time(99, 99) {}
    Time(int h) : Time(h, 0) {}
    Time(int h, int m) : hours(h), minutes(m) {}

    ...

private:
    int hours;
    int minutes;
};

Figure 1.11: The class Time with delegating constructors

Constructors can also be called directly to create and initialize objects. For example, suppose that we want to print the time 8:30. We can do this as follows:

Time eight_thirty(8, 30);
eight_thirty.print(cout);

But there’s an easier way:

Time(8, 30).print(cout);

This creates a temporary Time object, initializes it to 8:30 by using the third constructor, and tells the Time object to print. The second constructor can be used in the same way. For example,

Time(8).print(cout);

will print 8:00.

We can also use constructors to create objects that are then passed to functions. For example, suppose that t is a time and we need to know if it occurs before 10:30. This could be computed as follows:

t.less_than(Time(10, 30))

However, in the case of objects passed as arguments, it’s possible to do something even simpler:

t.less_than({10, 30})


What happens here is that the compiler expects a Time object as argument and instead finds a list of two integers. It then looks for a way to create a Time object out of those two integers. It does that by looking for a Time constructor that can take two integers as arguments. And it finds one. So the compiler essentially replaces the list {10, 30} by the constructor call Time(10, 30). Because the end result is that a Time object is created and initialized by using the list {10, 30}, this is called list initialization.

Both versions of this code, the one with the explicit call to the constructor and the one with the list of integers, end up doing essentially the same thing. But the version with the list is easier to write and also easier to read because it feels more natural.

Suppose we now need to determine if t occurs before 10:00. This could obviously be computed in a similar way:

t.less_than({10})

But in the case of a single value, the braces are not even needed:

t.less_than(10)

The end result here is that the integer 10 is converted into a Time object. This is called an implicit conversion. In contrast, Time(10) would be an explicit conversion.
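Here is a minimal sketch that checks that these different notations give the same answer. The class is the one of Figure 1.11 together with the constant version of less_than. The functions before_ten_thirty and before_ten are assumptions made up for this example.

```cpp
class Time
{
public:
    Time() : Time(99, 99) {}
    Time(int h) : Time(h, 0) {}
    Time(int h, int m) : hours(h), minutes(m) {}

    bool less_than(const Time & t) const
    {
        return (hours < t.hours) ||
               (hours == t.hours && minutes < t.minutes);
    }

private:
    int hours;
    int minutes;
};

// Hypothetical helper: asks the same question in two ways.
bool before_ten_thirty(const Time & t)
{
    bool with_explicit_call = t.less_than(Time(10, 30));
    bool with_list = t.less_than({10, 30});    // list initialization
    return with_explicit_call && with_list;
}

// Hypothetical helper: asks the same question in three ways.
bool before_ten(const Time & t)
{
    bool with_explicit_call = t.less_than(Time(10));
    bool with_list = t.less_than({10});        // list initialization
    bool with_conversion = t.less_than(10);    // implicit conversion
    return with_explicit_call && with_list && with_conversion;
}
```

Since all the notations end up creating the same Time object, each function returns true exactly when every one of its comparisons is true.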

Note that list initialization and implicit conversions are not always a good idea. For example,

println(Time(8, 30), cout);

is perfectly fine. It’s completely clear that a time will be printed. But consider

println({8, 30}, cout);


If there was another println function that takes as argument an object that can be initialized by the list {8, 30}, then the compiler wouldn’t know which function to use and would complain that the call is ambiguous. And

println(8, cout);

is even riskier because if there was a println function with an argument of type int, that function would be used instead. In addition to possibly leading to errors, both of these calls are also harder to understand because they don’t make it clear that a time will be printed.

Therefore, in the case of println, it’s safer to call the constructor explicitly:

println(Time(8, 30), cout);
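The risk with a bare integer can be seen in the following sketch. The second println function, the one that takes an int, is an assumption made up for this example, as are the two helper functions that capture the output in a string.

```cpp
#include <iostream>
#include <sstream>
#include <string>

class Time
{
public:
    Time() : Time(99, 99) {}
    Time(int h) : Time(h, 0) {}
    Time(int h, int m) : hours(h), minutes(m) {}

    void print(std::ostream & out) const
    {
        out << hours << ':';
        if (minutes < 10)
            out << 0;
        out << minutes;
    }

private:
    int hours;
    int minutes;
};

void println(const Time & t, std::ostream & out)
{
    t.print(out);
    out << std::endl;
}

// Hypothetical second println function, for plain integers.
void println(int n, std::ostream & out)
{
    out << n << std::endl;
}

// Explicit constructor call: the Time version of println is used.
std::string line_from_explicit_call()
{
    std::ostringstream out;
    println(Time(8), out);
    return out.str();
}

// Bare integer: the int version matches exactly, so it is used instead.
std::string line_from_bare_integer()
{
    std::ostringstream out;
    println(8, out);
    return out.str();
}
```

The first helper produces the line 8:00 while the second one produces the line 8, which is probably not what was intended.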

Note that we don’t have any of these problems with less_than:

t.less_than({10, 30})

That’s because that function is a method of class Time, which implies that the receiver provides the context needed to understand that the argument is a Time object.

Note that a Time variable can be declared and initialized by the second constructor with either the notation

Time noon(12);

or

Time noon = 12;

This equal sign is not the assignment operator but just another way of initializing an object.


If you do not include any constructor in a class, then the compiler will automatically generate a default constructor. This compiler-generated default constructor is usually not what is needed. If you add any constructor to a class but not a default constructor, then the compiler will not generate a default constructor and the class will be left without one. In that case, you would not be able to create arrays of objects of that class since objects in an array are automatically initialized by the default constructor. Therefore, you usually need to include a default constructor in all your classes.
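These rules can be checked at compile time. The sketch below uses std::is_default_constructible from the standard header <type_traits>. The three classes and the function hours_of_first_element are assumptions made up for this example.

```cpp
#include <type_traits>

// No constructor is written, so the compiler generates a default constructor.
class NoConstructor
{
public:
    int hours_value() const { return hours; }

private:
    int hours;
};

// A constructor with an argument is written but no default constructor,
// so the compiler does not generate one.
class NoDefault
{
public:
    NoDefault(int h) : hours(h) {}
    int hours_value() const { return hours; }

private:
    int hours;
};

// Both kinds of constructors are written.
class WithDefault
{
public:
    WithDefault() : hours(99) {}
    WithDefault(int h) : hours(h) {}
    int hours_value() const { return hours; }

private:
    int hours;
};

static_assert(std::is_default_constructible<NoConstructor>::value,
              "the compiler generated a default constructor");
static_assert(!std::is_default_constructible<NoDefault>::value,
              "the class was left without a default constructor");

// Each element of the array is initialized by the default constructor.
// The same declaration with NoDefault would not compile.
int hours_of_first_element()
{
    WithDefault times[3];
    return times[0].hours_value();
}
```

Note that the default constructor generated for NoConstructor leaves hours unset, which illustrates why that constructor is usually not what is needed.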

We can simplify the constructors a bit more by using in-class initializers. The idea is to provide an initial value with the declaration of the data members:

int hours = 99;
int minutes = 99;

The constructors can then be implemented as shown in Figure 1.12. Because of the in-class initializers, the default constructor has nothing left to do. The second constructor delegates to the third one, as before, while the third constructor uses initializers to override the in-class initializers and set the data members to other values.

In-class initializers are particularly useful when some data member of a class is set to the same value by a large number of constructors. This is not the case here.

In fact, it is debatable whether in-class initializers are a good idea in the case of Time. One advantage of in-class initializers is that they follow more closely the principle that variables should be initialized as soon as they are declared. But a disadvantage is that the initialization code is split between two different locations: the data member declarations and the constructors. This may make the class harder to understand.


class Time
{
public:
    Time() {}
    Time(int h) : Time(h, 0) {}
    Time(int h, int m) : hours(h), minutes(m) {}

    ...

private:
    int hours = 99;
    int minutes = 99;
};

Figure 1.12: The class Time with in-class initializers

Source code for the class Time of this section is available on the course website under Time5.

Study Questions

1.7.1. Why is it good practice to initialize a variable as soon as it is declared?

1.7.2. What is a constructor?

1.7.3. What is a default constructor?

1.7.4. What is an initializer?

1.7.5. What is an advantage of using initializers?

1.7.6. What is a delegating constructor?


1.7.7. What is an advantage of using delegating constructors?

1.7.8. What is list initialization?

1.7.9. When does the compiler perform an implicit conversion?

1.7.10. What type of constructor is used to perform implicit conversions?

1.7.11. When precisely does the compiler automatically generate a default constructor for a class?

1.7.12. What is an in-class initializer?

1.7.13. What are an advantage and a disadvantage of using in-class initializers?

Exercises

1.7.14. Add the following constructors to the class Date of Exercise 1.5.4:

a) A default constructor that initializes the date to January 1, 2000.

b) A constructor Date(month, day, year) that initializes the receiver to the given month, day and year. The arguments are integers.

1.7.15. Add the following constructors to the class ThreeDVector of Exercise 1.5.5:

a) A default constructor that initializes the vector to (0, 0, 0).

b) A constructor ThreeDVector(x, y, z) that initializes the vector to (x, y, z). The arguments are of type double.

1.7.16. Add the following constructors to the class Fraction of Exercise 1.5.6:


a) A default constructor that initializes the fraction to 0.

b) A constructor Fraction(a) that initializes the fraction to a. The argument is an integer.

c) A constructor Fraction(a, b) that initializes the fraction to a/b. The arguments are integers. Assume that b is positive.


Chapter 2

More About Classes

In the previous chapter, we learned that classes can be used to add data abstraction to our programs and that classes allow us to program in an object-oriented way. These are the main purposes of classes. In this chapter, we will learn a bit more about programming with C++ classes.

2.1 Inline Methods

The methods of our class Time are currently declared and implemented inside the class declaration, as shown in Figure 2.1. This has the effect of making them inline methods.

Compilers treat inline functions differently from ordinary functions. When a compiler sees a call to an inline function, it may remove the function call and replace it by the body of the function. In principle, this should speed up the program because it avoids the extra work involved in calling a function. But it may also increase the size of the program and very large programs can run more slowly because they can’t be stored entirely within the fastest portions of a computer’s memory.

class Time
{
public:
    Time() {}
    Time(int h) : Time(h, 0) {}
    Time(int h, int m) : hours(h), minutes(m) {}

    void read(istream & in)
    {
        in >> hours;
        in.get(); // colon
        in >> minutes;
    }

    void print(ostream & out) const
    {
        out << hours << ':';
        if (minutes < 10)
            out << 0;
        out << minutes;
    }

    bool less_than(const Time & t) const
    {
        return (hours < t.hours) ||
               (hours == t.hours &&
                minutes < t.minutes);
    }

    ...
};

Figure 2.1: The class Time from the previous chapter

The only sure way to know if it’s better to make a function inline is to run tests. But this is often impractical so several rules of thumb have been proposed. One is to make a function inline if it consists of no more than three statements [SS].

According to this rule, the print method should not be inline. The way to achieve this is to declare the method inside the class declaration but implement it outside, as shown in Figure 2.2.

class Time
{
public:
    void print(ostream & out) const;
    ...
};

void Time::print(ostream & out) const
{
    out << hours << ':';
    if (minutes < 10)
        out << 0;
    out << minutes;
}

Figure 2.2: A non-inline version of the print method

Note that in the implementation of the method, its name is preceded by the name of the class: Time::print. This tells the compiler that print is a method that belongs to the class Time and not a standalone function (that is, a function that is not a method of any class).

Inline methods can make a class declaration become crowded. The style preferred by many C++ programmers is for a class declaration to be mainly a list of methods and variable declarations. Long methods are usually implemented outside of the class declaration. This is illustrated in Figure 2.3 where methods longer than a single line are now implemented outside the class declaration. Note that the methods read and less_than are still inline because their implementations are preceded by the keyword inline.

So methods can be made inline by implementing them within the class declaration, or by implementing them outside and using the keyword inline. In the case of standalone functions, the inline keyword is the only option, as shown in Figure 2.4.

Source code for the revised class Time of Figure 2.3, together with a test driver, is available on the course website under Time6.

Study Questions

2.1.1. What is special about an inline function?

2.1.2. What are two ways to make a method inline?

Exercises

2.1.3. Revise the classes in the exercises of Section 1.5 by moving out of the class declarations the implementation of all the methods that are longer than a single line. But make sure short methods stay inline. Use the rule of thumb given in this section.


class Time
{
public:
    Time() : Time(99, 99) {}
    Time(int h) : Time(h, 0) {}
    Time(int h, int m) : hours(h), minutes(m) {}

    void read(istream & in);
    void print(ostream & out) const;
    bool less_than(const Time & t) const;

private:
    int hours;
    int minutes;
};

inline void Time::read(istream & in)
{
    ...
}

void Time::print(ostream & out) const
{
    ...
}

inline bool Time::less_than(const Time & t) const
{
    ...
}

Figure 2.3: The class Time with longer methods implemented outside the class declaration


inline void println(const Time & t, ostream & out)
{
    t.print(out);
    out << endl;
}

Figure 2.4: An inline standalone function

2.2 Get and Set Methods

Our class Time provides all the functionality we need for the pay calculator. But the range of operations that can be performed on these times is fairly limited.

For example, imagine that later, perhaps in another program, we needed to print times with an h instead of a colon, as in 8h30. (This is how it’s done in French, for example.) One option would be to modify the class to include a new print_with_h method. But this requires “reopening” that class: learning again how it works and running the risk of breaking the parts of it that already work. (This wouldn’t be much of an issue with a simple class like Time, but classes can be much larger and much more complex.)

Another option is for the designers of the original class to include so-called get methods in the class. For example, Figure 2.5 shows a version of Time with two get methods called hours and minutes. Then, without modifying the class, these methods allow us to write a print_with_h function that prints times with an h instead of a colon, as shown in Figure 2.6.

These methods are called get methods because they allow us to retrieve values associated with their receivers. In fact, names such as get_hours and get_minutes are common alternatives to the simpler hours and minutes. In these notes, as explained in Section 1.4, we will normally use noun phrases as names for methods whose main purpose is to return a value.


class Time
{
public:
    ...

    int hours() const { return hours_; }
    int minutes() const { return minutes_; }

private:
    int hours_;
    int minutes_;
};

Figure 2.5: The class Time with get methods

void print_with_h(const Time & t, ostream & out)
{
    out << t.hours() << 'h';
    if (t.minutes() < 10)
        out << '0';
    out << t.minutes();
}

Figure 2.6: The function print_with_h


inline void Time::read(istream & in)
{
    in >> hours_;
    in.get(); // colon
    in >> minutes_;
}

void Time::print(ostream & out) const
{
    out << hours_ << ':';
    if (minutes_ < 10)
        out << 0;
    out << minutes_;
}

Figure 2.7: The read and print methods with the new data member names

Note that in a class, a data member and a method cannot have the same name. So, for example, we cannot have a get method and a data member both named hours. The way we’re going to solve this problem in these notes is to always add an underscore to the names of data members. This will allow us to use those names (without the underscore) for the get methods. And this practice has another advantage: in the bodies of the methods, it allows us to easily distinguish the variables that are data members of the class from those that are arguments and local variables of the method. This is illustrated in Figure 2.7.

Now, suppose that t is a Time and that we need to change its minutes to, say, 45. With the class of the previous section, we could do this as follows:

t = {t.hours(), 45};


class Time
{
public:
    ...

    void set_hours(int new_hours)
    {
        hours_ = new_hours;
    }
    void set_minutes(int new_minutes)
    {
        minutes_ = new_minutes;
    }

private:
    int hours_;
    int minutes_;
};

Figure 2.8: The class Time with set methods

But it’s more convenient and more efficient to use a method specifically designed for this purpose:

t.set_minutes(45);

Not surprisingly, such a method is called a set method. Figure 2.8 shows a version of Time with set methods for both the hours and the minutes.

We might also need to change both the hours and minutes of t to, say, 8:35. This could be done as follows:

t = {8, 35};


class Time
{
public:
    ...

    void set(int new_hours, int new_minutes = 0);

private:
    int hours_;
    int minutes_;
};

inline void Time::set(int new_hours, int new_minutes)
{
    hours_ = new_hours;
    minutes_ = new_minutes;
}

Figure 2.9: The class Time with the method set

We can now also use the set methods:

t.set_hours(8);
t.set_minutes(35);

This is more efficient but not that convenient. A better approach is to once again use a method specifically designed for this purpose:

t.set(8, 35);

Figure 2.9 shows a version of Time with such a method.


Note that the declaration of the method set specifies that the value 0 is to be used in case the second argument is missing. This default value is called a default argument. For example, this allows us to set t to 8:00 by simply writing

t.set(8);

Default values can only be specified for the rightmost arguments of a function. When the function is called, the values that are given in the function call are matched to the arguments of the function from left to right. Default values are then used in place of the missing arguments. If needed, a function can specify default values for all its arguments.
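The matching rule can be sketched with a small standalone function. The function format_time and its arguments are hypothetical names chosen for this illustration; they are not part of the class Time.

```cpp
#include <string>

// Hypothetical function, not part of the notes: formats a time as text.
// Default values are given only to the rightmost arguments.
std::string format_time(int hours, int minutes = 0, char separator = ':')
{
    std::string result = std::to_string(hours);
    result += separator;
    if (minutes < 10)
        result += '0';
    result += std::to_string(minutes);
    return result;
}
```

With this declaration, format_time(8) uses both default values, format_time(8, 35) uses only the default separator, and format_time(8, 35, 'h') uses none. Because values are matched from left to right, there is no way to give a separator while leaving out the minutes.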

Note that a default argument could be used to combine the second and third constructors of our class:

Time() : Time(99, 99) {}
Time(int h, int m = 0) : hours_(h), minutes_(m) {}

This reduces the amount of code even further but it may cause a reader to miss the fact that there are three ways to initialize a time, not just two.
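As a quick check, here is a pared-down sketch of a class that contains only these two constructors and the get methods of Figure 2.5. Everything else in the real class Time is omitted, which is an assumption made to keep the example short.

```cpp
class Time
{
public:
    Time() : Time(99, 99) {}                            // no arguments
    Time(int h, int m = 0) : hours_(h), minutes_(m) {}  // one or two arguments

    int hours() const { return hours_; }
    int minutes() const { return minutes_; }

private:
    int hours_;
    int minutes_;
};
```

All three forms of initialization remain available: Time a; uses the first constructor, while Time b(8); and Time c(8, 35); both use the second one.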

Get and set methods are a standard way of adding flexibility to a class because they allow users to perform operations that weren’t anticipated by the designers of the class. This increases the chances that the class will be useful in other projects, which is good.

On the negative side, get and set methods may increase dependence between a class and its users. For example, the simplest way to print a time is to use the print method as in

t.print(out);

But now that Time has get and set methods, nothing prevents us from printing a time ourselves, without using the print method:

out << t.hours() << ':';
if (t.minutes() < 10)
    out << '0';
out << t.minutes();

This code is obviously less readable and less convenient to write. But it is also much more dependent on the details of Time. For example, if the colon was changed to an h or if seconds were added to times, every occurrence of this code would need to be revised. None of this work is needed if we always use the print method.

Therefore, as a general rule, it is much better for users of a class not to use get and set methods to perform a task that can be performed by a method already included in the class. This is an example of the design strategy that software components should delegate as much work as possible to other components. This usually leads to greater independence between software components.

You may be worried that get and set methods reveal too much about the implementation of the class. For example, in the case of Time, if we’re going to include get and set methods for hours and minutes, then why not just make the data members hours_ and minutes_ public? But note that users of Time already know that times consist of hours and minutes. That’s part of what times are. But that’s not necessarily how times are stored. For example, we could store times as a single number of minutes since midnight. That wouldn’t change the fact that times consist of hours and minutes. Even if Time was implemented in this way, we could still include the get and set methods for hours and minutes. The implementation of those methods would be more complicated but the methods themselves still make sense and they don’t reveal how times are stored.
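To make this concrete, here is a minimal sketch of such an implementation. It assumes that hours and minutes are always valid and nonnegative, it uses the hypothetical data member name total_minutes_, and it leaves out every method that is not a constructor, get method or set method.

```cpp
class Time
{
public:
    Time(int h, int m = 0) : total_minutes_(h * 60 + m) {}

    // The get methods compute their values instead of reading them directly.
    int hours() const { return total_minutes_ / 60; }
    int minutes() const { return total_minutes_ % 60; }

    // Each set method changes one part and preserves the other.
    void set_hours(int new_hours)
    {
        total_minutes_ = new_hours * 60 + minutes();
    }
    void set_minutes(int new_minutes)
    {
        total_minutes_ = hours() * 60 + new_minutes;
    }

private:
    int total_minutes_; // minutes since midnight
};
```

Code that uses only hours(), minutes(), set_hours() and set_minutes() cannot tell this version apart from the one that stores two numbers.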

The version of the class Time with get and set methods, as well as a test driver that includes the print_with_h function, is available on the course website under Time7.


Study Questions

2.2.1. What is the main purpose of get and set methods?

2.2.2. Why is it a bad idea for a function to use the get methods instead of the print method to print a time?

2.2.3. What is a default argument?

Exercises

2.2.4. Without modifying the class Time, write a function that takes a Time object as argument and prints the time in the 12-hour format using “a.m.” and “p.m.” (This function should be a standalone function, not a method of the class Time.)

2.2.5. Modify the implementation of the class Time so that times are stored as a single number of minutes. Do this without changing the interface of the class so that users of Time do not need to be revised. (In particular, the class should continue to have the get and set methods we introduced in this section.)

2.2.6. Add to the class Date of Exercise 1.5.4 methods set(month, day, year) and set(month, day) that set the date to the given month, day and year. The second method leaves the year unchanged. The arguments are integers. Use default arguments as appropriate.

2.2.7. Add the following methods to the class ThreeDVector of Exercise 1.5.5:


a) A method set(x, y, z) that sets the vector to (x, y, z). The arguments are of type double.

b) Methods x(), y() and z() that return, respectively, the first, second and third components of the vector. All three methods return a value of type double.

2.2.8. Add the following methods to the class Fraction of Exercise 1.5.6:

a) Methods set(a, b) and set(a) that set the fraction to a/b and a, respectively. The arguments are integers. Assume that b is positive. Use default arguments as appropriate.

b) Methods numerator() and denominator() that return, respectively, the numerator and denominator of the fraction. Both methods return an integer.

2.3 Operator Overloading

Our class Time includes a method that allows us to determine if a time occurs before another one:

t1.less_than(t2)

But the following code is much more natural:

t1 < t2

And because it is more natural, it is easier to remember, easier to write and easier to understand. The essential difference is that the more natural code uses an operator instead of a method.


inline bool Time::operator<(const Time & t) const
{
    return (hours_ < t.hours_) ||
           (hours_ == t.hours_ &&
            minutes_ < t.minutes_);
}

Figure 2.10: A less-than operator for Time

In C++, we can extend existing operators so they work with new argument types. This is done by creating a new version of the operator. Because this adds new meaning to an existing operator, it’s called operator overloading.

To overload the less-than operator so it works with times, we simply need to change the name of the method from less_than to operator<, as shown in Figure 2.10. When the compiler sees an expression such as

t1 < t2

it will interpret it as

t1.operator<(t2)

and the desired result will be achieved.

As explained earlier, the compiler can use the one-argument constructor of class Time to perform implicit conversions from integers to times. So, for example, the code t < 12 would be perfectly valid and result in the time t being compared to 12:00.

On the other hand, it would not be possible to write 8 < t since the left operand of the operator is its receiver, not an argument. Implicit conversions are not performed on receivers.
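The following sketch puts these observations together. The class is reduced to a constructor and the operator, and the function is_morning is a hypothetical helper added only to show the operator in use.

```cpp
class Time
{
public:
    Time(int h, int m = 0) : hours_(h), minutes_(m) {}

    // Method version of the operator: the left operand is the receiver.
    bool operator<(const Time & t) const
    {
        return (hours_ < t.hours_) ||
               (hours_ == t.hours_ && minutes_ < t.minutes_);
    }

private:
    int hours_;
    int minutes_;
};

// Hypothetical helper, used only for illustration.
bool is_morning(const Time & t)
{
    // The argument 12 is implicitly converted to Time(12, 0), so this is
    // interpreted as t.operator<(Time(12, 0)).
    // Writing 12 > t or 0 < t instead would not compile with this class:
    // the receiver of a method is never converted.
    return t < 12;
}
```

The explicit form t.operator<(Time(12)) is also valid C++, although it is rarely written that way.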


inline bool operator<(const Time & t1, const Time & t2)
{
    return (t1.hours_ < t2.hours_) ||
           (t1.hours_ == t2.hours_ &&
            t1.minutes_ < t2.minutes_);
}

Figure 2.11: The less-than operator as a standalone function

Ideally, both operands of the operator should be treated in the same way. To get implicit conversions on both operands of the operator, we must redesign it so that both operands are arguments. This can be done by taking the operator out of the class and turning it into a standalone function with two arguments, as shown in Figure 2.11. This will work because an expression such as

t1 < t2

can also be interpreted by the compiler as

operator<(t1, t2)

Since the data members of Time are private and the less-than operator is no longer a Time method, the operator cannot directly access the data members of the class. We have two options here: we could have the operator use the get methods, or we could declare that the operator is a friend of the class.
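As an illustration of the first option, here is a sketch of the standalone operator written with the get methods. The class is pared down to what the operator needs; this is one possible way to write it, not necessarily the version used in the course code.

```cpp
class Time
{
public:
    Time(int h, int m = 0) : hours_(h), minutes_(m) {}

    int hours() const { return hours_; }
    int minutes() const { return minutes_; }

private:
    int hours_;
    int minutes_;
};

// Standalone operator: both operands are arguments, so implicit
// conversions can be applied to either one.
inline bool operator<(const Time & t1, const Time & t2)
{
    return (t1.hours() < t2.hours()) ||
           (t1.hours() == t2.hours() &&
            t1.minutes() < t2.minutes());
}
```

With this version, both t < 12 and 8 < t compile when t is a Time, because the integer can be converted to a Time on either side.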

As a general rule, friendship declarations should be used only as a last resort because they undermine encapsulation and data abstraction. In our case, for example, a friendship declaration would make the less-than operator dependent on the fact that each time is stored as two numbers, hours_ and minutes_. But if the way in which the times are stored was changed, we would probably want to revise the implementation of the operator so it is as efficient as possible. For example, if each time was stored as a single number of minutes, then the most efficient way to compare times would not be the calculation of Figure 2.11 but the much simpler

t1.total_minutes_ < t2.total_minutes_

So the extra dependency is not a problem here; it might even be desirable.

In other words, even though strictly speaking the less-than operator is a standalone function, it really belongs to the class Time. So a friendship declaration makes sense here.

Note that a friendship declaration would be necessary if the class did not provide get methods.
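A minimal sketch of the friendship option follows, assuming (as in the discussion above) that each time is stored as a single number of minutes. The data member name total_minutes_ comes from the text; the rest of the class is reduced to a constructor for illustration.

```cpp
class Time
{
public:
    Time(int h, int m = 0) : total_minutes_(h * 60 + m) {}

    // The friendship declaration gives the standalone operator
    // access to the private data member.
    friend bool operator<(const Time & t1, const Time & t2);

private:
    int total_minutes_; // minutes since midnight
};

bool operator<(const Time & t1, const Time & t2)
{
    return t1.total_minutes_ < t2.total_minutes_;
}
```

Since the operator is still a standalone function, implicit conversions continue to apply to both operands.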

We can also overload the input (or stream extraction) operator >> so that instead of writing

t.read(in)

we can write

in >> t

just as we do with other types of values such as integers, characters and strings.

The overloading of the input operator involves a couple of particular issues. One has to do with how the compiler interprets code such as

in >> t

Just as in the case of the less-than operator, there are two possibilities:

1. in.operator>>(t)

2. operator>>(in, t)


The first interpretation requires adding the method operator>> to the class istream, which is the class of the standard input stream cin as well as all input file streams.1 But istream is a library class and we cannot modify it.

Now, if the compiler could interpret in >> t as

t.operator>>(in)

we would be all set: we could overload the operator by adding the method operator>> to our class Time. But the C++ compiler does not interpret in >> t in this way.

So we are left with the second interpretation, which corresponds to overloading the operator by creating a standalone function with two arguments.

A possible implementation is shown in Figure 2.12. The operator reads the hours and minutes into local variables and then sets the time by using the set method. Unfortunately, this is a bit inefficient since both numbers are copied twice.

A more efficient approach is to make the operator a friend of the class. This allows us to implement the input operator as shown in Figure 2.13.

As can be seen in both Figures 2.12 and 2.13, the input operator returns the stream it receives as argument. One reason has to do with chains of reading operations. In C++, instead of having to write

in >> t1;
in >> t2;

it is normally possible to write the more convenient and more readable

in >> t1 >> t2;

1 Strictly speaking, input file streams are of class ifstream. We will say more about this in the next chapter when we take a look at I/O stream classes.


istream & operator>>(istream & in, Time & t)
{
    int new_hours;
    in >> new_hours;

    in.get(); // colon

    int new_minutes;
    in >> new_minutes;

    t.set(new_hours, new_minutes);

    return in;
}

Figure 2.12: An input operator for Time

istream & operator>>(istream & in, Time & t)
{
    in >> t.hours_;
    in.get(); // colon
    in >> t.minutes_;
    return in;
}

Figure 2.13: A more efficient input operator


ostream & operator<<(ostream & out, const Time & t)
{
    out << t.hours() << ':';
    if (t.minutes() < 10)
        out << '0';
    out << t.minutes();
    return out;
}

Figure 2.14: An output operator for Time

Such a statement is actually executed as a sequence of nested function calls:

operator>>(operator>>(in, t1), t2);

For this to work, the first call to the input operator must return the stream so it becomes the first argument of the second call to the operator.

Note that the operator must return a reference to the stream, not a copy of the stream. This allows the calling function to access the original stream instead of a copy of the stream. This is done for the same reason that stream arguments are passed by reference: all I/O operations to a single file should normally be performed through a single stream variable, not various copies of it. We will see other examples of functions that return references later in these notes.

The output (or stream insertion) operator << can be overloaded in a way similar to the input operator, as shown in Figure 2.14. In this case, there is no need to use a friendship declaration since no matter how times are stored, this implementation is going to be as efficient as possible.
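The sketch below combines both operators in one small test. It uses string streams, which are only covered in the next chapter, so that the input and output can be checked without a file or a keyboard. The helper echo_two_times and the default values of the constructor are assumptions made for this example; the operators follow Figures 2.12 and 2.14.

```cpp
#include <iostream>
#include <sstream>
#include <string>

class Time
{
public:
    Time(int h = 0, int m = 0) : hours_(h), minutes_(m) {}

    int hours() const { return hours_; }
    int minutes() const { return minutes_; }
    void set(int new_hours, int new_minutes = 0)
    {
        hours_ = new_hours;
        minutes_ = new_minutes;
    }

private:
    int hours_;
    int minutes_;
};

std::istream & operator>>(std::istream & in, Time & t)
{
    int new_hours = 0;
    int new_minutes = 0;
    in >> new_hours;
    in.get(); // colon
    in >> new_minutes;
    t.set(new_hours, new_minutes);
    return in; // a reference to the original stream, not a copy
}

std::ostream & operator<<(std::ostream & out, const Time & t)
{
    out << t.hours() << ':';
    if (t.minutes() < 10)
        out << '0';
    out << t.minutes();
    return out;
}

// Hypothetical helper: reads two times from text and prints them back.
std::string echo_two_times(const std::string & text)
{
    std::istringstream in(text);
    Time t1, t2;
    in >> t1 >> t2; // operator>>(operator>>(in, t1), t2)

    std::ostringstream out;
    out << t1 << " to " << t2; // chaining works for output too
    return out.str();
}
```

Both chains depend on each operator returning the stream it was given.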


The version of the class Time with operators is available on the course website under Time8.

Study Questions

2.3.1. What is operator overloading?

2.3.2. What is the advantage of declaring an operator such as the less-than operator as a standalone function instead of a method?

2.3.3. Why do the input and output operators have to be declared as standalone functions instead of methods?

2.3.4. Why do the input and output operators return the stream?

Exercises

2.3.5. Add an equality testing operator (==) to the class Time.

2.3.6. Add the operators >>, << and += to the class Date of Exercise 1.5.4. They should behave just like the methods read, print and add, respectively.

2.3.7. Add the operators >>, << and + to the class ThreeDVector of Exercise 1.5.5. They should behave just like the methods read, print and add, respectively.

2.3.8. Add the operators + and * to the class Fraction of Exercise 1.5.6. They should behave just like the methods plus and times, respectively.

2.3.9. Add the operators < and == to the class Fraction of Exercise 1.5.6. Fractions such as 2/3 and 8/12 should be considered equal.


2.4 Compiling Large Programs

While developing our class Time, we’ve had all the source code in a single file. That’s the code for the class itself but also the main function that we use to test the class.

When working on a large program, it is not practical to have all the code in one file. For example, an advantage of modularity is that it allows us to work on a component such as the class Time in isolation from the rest of the program. But this would require copying the code of the component out of the program and then copying it back into it when we are done. This is inconvenient and prone to errors.

Having all the code in a single file also implies that the entire program needs to be recompiled every time any part of the code is changed. Since compiling a large program can take a significant amount of time, this is also inconvenient, especially during debugging when often only small changes are made to the program between recompilations.

An alternative is to organize the program so that each component is contained in its own file (or files) and then to recompile only those files that need to be recompiled.

The standard way of doing this is to separate each component into a header file and an implementation file. The header file contains the declarations associated with the component, that is, all the code that the compiler needs in order to compile the code that uses the component. This normally consists of global constants, class declarations, function declarations, as well as the implementation of all inline methods and functions.

The implementation file contains the rest of the code. This normally consists of the implementation of the non-inline methods and functions. The names of these files normally end with the extensions h and cpp, respectively.


For example, we can reorganize the code we’ve written in this chapter into three files, as follows:

1. Time.h: the declaration of the class Time, the declaration of the output operator, the implementation of the inline method set and the implementation of the inline operator less-than, as shown in Figure 2.15.

2. Time.cpp: the implementation of the input and output operators, as shown in Figure 2.16.

3. main.cpp: the main function, as shown in Figure 2.17.

Note that the main function does not need to be split into header and implementation files because that function will not be used by any other component.

Each implementation file should include (#include) the corresponding header file. For example, Time.cpp includes Time.h. Each header file should include all the necessary library files as well as the header files of other components that are used. For example, Time.h includes iostream. Since the main function doesn’t have a header file, all necessary files are included in main.cpp. This consists of the library file iostream as well as the header file Time.h.

Note that no global using declarations are used in Time.h. Instead, for example, the long form std::istream is used to access istream. This is because header files such as Time.h will be included in the source code of other components and it’s best to avoid “polluting” the namespaces of those components. Therefore, in a header file, all using declarations should be local to individual functions. This is not an issue with implementation files, which is why main.cpp and Time.cpp contain the declaration using namespace std.
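Here is a sketch of what a using declaration that is local to a function might look like in a header file. The code imitates a small Time.h that contains the println function of Figure 2.4; the guard symbol TIME_SKETCH_H and the inlined print method are assumptions made to keep the example in a single file.

```cpp
#ifndef TIME_SKETCH_H
#define TIME_SKETCH_H

#include <iostream>

class Time
{
public:
    Time(int h, int m = 0) : hours_(h), minutes_(m) {}

    // No global using declaration: the long form std::ostream is used.
    void print(std::ostream & out) const
    {
        out << hours_ << ':';
        if (minutes_ < 10)
            out << '0';
        out << minutes_;
    }

private:
    int hours_;
    int minutes_;
};

inline void println(const Time & t, std::ostream & out)
{
    using std::endl; // visible only inside this function
    t.print(out);
    out << endl;
}

#endif
```

A file that includes this header can still choose its own using declarations, or none at all, because nothing from std has been brought into its global namespace.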

With a compiler that is part of an integrated development environment (IDE),2 we normally create a project, add to it all the source files and then just click a button to have the IDE compile the project. When recompiling a program, the IDE will automatically compile only the files that actually need to be compiled. As mentioned before, with a large program, this can take much less time than recompiling the entire program.

2 Examples of IDEs are Code::Blocks and Eclipse for Windows, Linux and Mac OS X, Bloodshed Dev-C++ and Microsoft Visual C++ for Windows, and Anjuta for Linux. As of January 2012, Code::Blocks, Eclipse, Dev-C++, Anjuta and the Express Edition of Visual C++ were all available for free.

#ifndef _Time_h_
#define _Time_h_

#include <iostream>

class Time
{
public:
    friend bool operator<(const Time & t1, const Time & t2);

    friend std::istream & operator>>(std::istream & in, Time & t);

    ... // declaration of the methods

private:
    int hours_;
    int minutes_;
};

std::ostream & operator<<(std::ostream & out, const Time & t);

... // implementation of the inline method set and
    // the inline operator <

#endif

Figure 2.15: The header file Time.h

#include "Time.h"

using namespace std;

istream & operator>>(istream & in, Time & t)
{
    ...
}

ostream & operator<<(ostream & out, const Time & t)
{
    ...
}

Figure 2.16: The implementation file Time.cpp

#include <iostream>
#include "Time.h"
using namespace std;

int main()
{
    ... // code for running tests on Time
}

Figure 2.17: The implementation file main.cpp

With a console-type compiler such as g++ for Unix and Linux, the entire program can be compiled by compiling all the cpp files as follows:

g++ *.cpp

Afterwards, the process of recompiling only those files that are needed is usually managed with the assistance of the make utility. This will be discussed briefly in the next section.

Now suppose that another component is added to the pay calculator and that this new component is used by both the main function and the class Time. The header file of that new component would be included in both main.cpp and Time.h. But this would cause problems because Time.h is already included in main.cpp: when main.cpp is compiled, the compiler will read the header file of the new component twice and will complain that the new component is declared more than once.

A solution is to remember which header files are already included in which other header files. In a large program, this would be difficult to manage and prone to errors.

A simpler solution is to use the trick shown in Figure 2.15. At the beginning of the header file, we test to see if the symbol _Time_h_ is defined. If not, we define the symbol and proceed with the rest of the file. The second time the file is visited, the symbol will be defined, the test will fail and the contents of the file will be skipped all the way to the #endif directive at the very bottom. This prevents the contents of the file from being read more than once.

It’s a good idea to use this trick systematically in all header files. Each header file must use a unique symbol name. A scheme based on a modification of the file name works well. For example, for a file named component.h, you can use the symbol _component_h_.
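The effect of the trick can be observed in a single file by pasting the contents of a guarded header twice, which is what the compiler sees when a header is included twice. The struct Component is hypothetical. The symbol COMPONENT_H is a variation on the naming scheme, chosen here because the C++ standard reserves names that begin with an underscore followed by an uppercase letter (such as _Time_h_) for the compiler and its library.

```cpp
// First inclusion: COMPONENT_H is not defined yet, so the contents are read.
#ifndef COMPONENT_H
#define COMPONENT_H

struct Component
{
    int value;
};

#endif

// Second inclusion of the same text: COMPONENT_H is now defined, so
// everything up to the #endif is skipped and Component is not redeclared.
#ifndef COMPONENT_H
#define COMPONENT_H

struct Component
{
    int value;
};

#endif
```

Without the three preprocessor directives, the second copy of the struct would be reported as a redefinition.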

The latest version of our class Time and its test driver (the main function) organized with header and implementation files for separate compilation is available on the course website under Time9.

Study Questions

2.4.1. Why is the setup described in this section especially useful during debugging?

2.4.2. Why is it better not to use global using declarations in a header file?

Exercises

2.4.3. By following the guidelines given in this section, reorganize the code of the classes you created for the exercises of this chapter.

2.5 The make Utility

With a console-type compiler such as g++, it is possible to manually recompile only the files that need to be recompiled. The first time we compile the program, we would compile all the implementation files as follows:

g++ -c *.cpp
g++ *.o -o test_Time

The first command compiles the implementation files and produces object files with the .o extension. The second command links those files to produce the executable test_Time. Afterwards, if, for example, only the file Time.cpp is changed, then we only need to do the following:

g++ -c Time.cpp
g++ *.o -o test_Time

This recompiles Time.cpp and then relinks all the object files together.

The difficulty with this manual approach is that if we modify a header file, then we need to figure out which implementation files include it, either directly or indirectly, because all those files will need to be recompiled. With a large program, this is obviously not practical.

Fortunately, the process can be automated by using the make utility. All the commands needed for the compilation of the program are put in a Makefile along with a declaration of dependencies between the various source files, as shown in Figure 2.18. For example, the entry

main.o: main.cpp Time.h
	g++ -c main.cpp -o main.o

indicates that the file main.o depends on main.cpp and Time.h, and that whenever these other files are changed, main.o should be regenerated by using the command

g++ -c main.cpp -o main.o


test_Time: main.o Time.o
	g++ main.o Time.o -o test_Time

main.o: main.cpp Time.h
	g++ -c main.cpp -o main.o

Time.o: Time.cpp Time.h
	g++ -c Time.cpp -o Time.o

clean:
	rm -f *~ *.o test_Time

Figure 2.18: A Makefile

The list of dependencies contains all the files that are included, directly or indirectly, in main.cpp. The list of dependencies should normally be on a single line and the command line that follows should begin with a TAB character.

To compile the program using the Makefile, simply type make or make test_Time. To run the cleaning command included in the Makefile, type make clean. This will delete the executable test_Time as well as the object files and all files with a name that ends in a tilde (~). This includes the backup files produced by the emacs text editor.

The Makefile shown in Figure 2.18 is available on the course website under Time9. You can use it as a model for your own Makefiles.

Study Questions

2.5.1. What is a Makefile?


Chapter 3

Strings and Streams

In this chapter, we will learn about strings, I/O streams and string streams. These are not only extremely useful components of the C++ standard library, they also are great examples of abstraction and object-oriented programming.

3.1 C Strings

Strings are sequences of characters. They can be names, words, sentences or lines of text. Not surprisingly, since so many applications involve data in the form of text, strings are one of the most common types of data handled by software, second maybe only to numbers.

There are two standard ways of storing strings in C++ programs. The first one comes from the language C, the predecessor of C++. For this reason, strings stored in this way are called C strings. C strings are sequences of characters stored in an array of characters. In addition, the characters of the string are followed by the null character (\0).

This means that the array holding a C string can be larger than the string, which allows the string to grow and shrink by simply moving the null character. But the array still has a fixed size and this imposes a maximum length on the C string. (Later in these notes, we will see that dynamic memory allocation allows us to address this problem.)
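As a small illustration of an array that is larger than the string it holds, the following sketch shortens a C string by moving the null character. The function name shorten is a hypothetical helper chosen for this example; it is not part of the standard library or of these notes.

```cpp
#include <cstring>

// Shortens the C string stored in cs to new_length characters by writing
// a null character at index new_length. Does nothing if new_length is
// negative or is not smaller than the current length.
// Assumes cs holds a valid C string (hypothetical helper for illustration).
void shorten(char cs[], int new_length)
{
    if (new_length >= 0 && new_length < static_cast<int>(std::strlen(cs)))
        cs[new_length] = '\0';
}
```

For example, after the declaration char word[16] = "abstraction"; the call shorten(word, 8) leaves "abstract" in word. The array still has 16 elements; only the position of the null character has changed.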

The other standard way of storing strings in C++ is to use the class string that’s provided in the C++ standard library. To distinguish them from C strings, these strings are often called C++ strings. We will learn about C++ strings in the next section.

C++ strings are much more convenient and safer to use than C strings. However, it is useful to learn how to program with C strings because there are circumstances in which C strings are still used: for example, when programming in C, when working with old C++ code, or if we ever needed to implement our own class of strings. In addition, in C++, literal strings such as "hello" are stored as C strings.

Since C strings are stored in arrays, we can work with them as we would with any other array. But the C++ standard library provides several functions and operators that perform useful operations on C strings. Tables 3.1 and 3.2 list several of these C string functions. All are defined in the library file cstring, except for atoi and atof, which are defined in cstdlib, and the I/O functions and operators, which are defined in iostream. The C string functions can be used without the std:: prefix because, in practice, they are also declared in the global namespace (in addition to the std namespace that holds most elements of the C++ standard library). Additional C string library functions are described in a reference such as [CPP].¹

¹ Some of these other functions involve concepts that we still have not covered, such as pointers. You can ignore these functions for now.

strlen(cs)
    Returns the length of C string cs.

strcpy(dest, source)
strncpy(dest, source, n)
    Makes C string dest a copy of C string source. The second version copies at most n characters. If that maximum is reached before the null character is copied, then the null character is not appended to dest.

strcat(dest, source)
strncat(dest, source, n)
    Appends a copy of C string source to C string dest. The second version copies at most n characters, followed by a null character.

strcmp(cs1, cs2)
strncmp(cs1, cs2, n)
    Returns a negative integer if cs1 < cs2, 0 if cs1 == cs2, a positive integer if cs1 > cs2. Uses alphabetical order. The second version compares at most n characters from each string.

atoi(cs)
atof(cs)
    Converts C string cs into a number. Starting from the beginning of the string, skips whitespace and then converts as many characters as possible into a number. If no conversion can be performed, 0 is returned. The first version returns an int while the second version returns a double. The string is not modified.

Table 3.1: Some C string operations (part 1 of 2)

stream << cs
    Outputs the characters of C string cs.

stream >> cs
    Reads characters into C string cs. Skips leading white space and stops reading at white space (blank, tab or newline) or at the end of the file. The terminating character is not read.

stream.get(cs, n)
    Reads at most n−1 chars into C string cs. Does not read past the end of the current line. The null character is appended to cs. Does not read the newline character, if encountered.

stream.getline(cs, n)
    Reads the rest of the current input line into C string cs. Does not read more than n−1 chars. The null character is appended to cs. Reads the newline character, if encountered, but does not add it to cs.

Table 3.2: Some C string operations (part 2 of 2)

Note that all of these C string functions are fairly unsafe. One reason is that they all assume that the given C strings are valid. For example, Figure 3.1 shows a possible implementation of the function strlen. (The function has been renamed to avoid any chance of conflict with the library version.) The function scans the array from left to right until the null character is encountered. If ever the array that holds the string does not contain a null character, the function will go out of bounds and either return an incorrect length or cause the program to crash.

    int my_strlen(const char cs[])
    {
        int i = 0;
        while (cs[i] != '\0')
            ++i;
        // i is now both the index of '\0' and the length
        // of the string
        return i;
    }

Figure 3.1: An implementation of strlen

Another way in which the C string functions can be unsafe has to do with the size of the arrays. The function strcpy, for example, assumes that the array holding dest is large enough to hold source. If the array is too small, then strcpy will go out of bounds and overwrite other program variables with characters from source. Or the program will crash. Either way, a bad outcome.

The problem is that there is no reliable way for strcpy to determine the size of the array that holds dest. The strncpy version of strcpy tries to address this problem. If n is no greater than the size of the array holding dest, then strncpy will not go out of bounds. But if ever source is of length n or more, then dest may be missing a null character. In addition, there is no way for strncpy to make sure that n is not too large. So strncpy is safer than strcpy, if only because it encourages the programmer to think about the size of the array. But strncpy is by no means completely safe.
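One common way of working around the missing null character is to copy at most one character less than the size of the array and then to write the null character explicitly. The sketch below illustrates this idea; the function name bounded_copy and its parameter dest_size are assumptions made for this example, and the function still relies on the caller to pass the true size of the array.

```cpp
#include <cassert>
#include <cstring>

// Copies source into dest without writing more than dest_size characters,
// and always leaves dest null-terminated (source is truncated if needed).
// The caller must pass the true size of the array holding dest;
// the function has no way to verify it.
void bounded_copy(char dest[], const char source[], std::size_t dest_size)
{
    assert(dest_size > 0);                      // need room for '\0'
    std::strncpy(dest, source, dest_size - 1);  // at most dest_size - 1 chars
    dest[dest_size - 1] = '\0';                 // strncpy may not have added one
}
```

With char small[4], the call bounded_copy(small, "abcdef", sizeof(small)) stores "abc" in small. The copy is truncated, but the result is still a valid C string.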


As mentioned earlier, literal strings such as "abc" are stored as C strings in a C++ program. Therefore, C string functions can be used on literal strings. For example, strlen can be used to compute the length of a literal string: strlen("abc") returns 3. And strcpy can be used to copy a literal string to another C string: strcpy(dest, "abc"). But note that we can’t copy anything to a literal string as in strcpy("abc", source). That’s because literal strings are stored as constant C strings.

The C string functions provide good examples of procedural abstraction: to use these functions, all we need to know is what they do (and how to call them). We don’t need to know how the functions work. And there is nothing we can do that would depend on how these functions work: if ever the implementation of any of these functions was changed, our code would continue to work as long as the function that was changed still does what it is supposed to do.

Study Questions

3.1.1. What is a C string?

3.1.2. Suppose that you want to store a string of size n as a C string. How small can the array be? How large can it be?

3.1.3. What is the difference between the C string functions get and getline?

Exercises

3.1.4. Create a function println(cs) that takes a C string as argument and behaves exactly as the code cout << cs << endl. (But don’t use this code in implementing the function.)


3.1.5. Create a function strlwr(cs) that changes to lowercase all the letters in C string cs. (You’ll probably want to use the function tolower from the standard library cctype. This function takes a character as argument and returns its lowercase equivalent. If the character has no lowercase equivalent, then the function simply returns the character unchanged. You may also need to add the prefix my to the name of your function, as in my_strlwr. That’s because some cstring libraries may include the function strlwr.)

3.1.6. Write your own implementation of the following C string functions. Use the prefix my as in my_strlen to avoid conflicts with the library functions.

a) strcpy.

b) strncpy.

c) strcat.

d) strncat.

e) strcmp. Implement this function by using the comparison operators (<, <=, ==, >, >=) on individual characters.

3.2 C++ Strings

As we saw in the previous section, C strings have several disadvantages: they have a maximum size, they are somewhat inconvenient to use and they are not safe. Later in these notes, we will learn that the first problem can be addressed by storing C strings in dynamically allocated arrays. But this is not an ideal solution since it involves low-level techniques that are prone to errors.


C++ strings address these problems. They (essentially) don’t have a maximum size and they grow automatically as needed. They are also more convenient to use and are much safer than C strings. C++ strings are objects of class string. This class is defined in the library file string and included in the std namespace.

Note that C++ strings do have a maximum size that’s related to the largest value that can be stored in an int variable. But that maximum size is typically much larger than the size of any string in most applications.²

Tables 3.3 to 3.6 show several string operations. Many more are described in a reference such as [CPP].³ In these tables, the word string without qualifier refers to C++ strings. Note that in C++ strings, indices start at 0 just as in arrays.

The second constructor provides a way for converting C strings into C++ strings. The reverse conversion can be performed by the c_str method.⁴,⁵
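The two conversions can be sketched as follows. The function names from_c_string and length_via_c_string are hypothetical and exist only to make the conversions visible; the second one passes the result of c_str directly to the C string function strlen.

```cpp
#include <cstring>
#include <string>

// C string -> C++ string, by using the second constructor.
std::string from_c_string(const char cs[])
{
    return std::string(cs);
}

// C++ string -> C string, by using the c_str method. The resulting
// C string is handed to strlen, which expects a null-terminated array.
std::size_t length_via_c_string(const std::string & s)
{
    return std::strlen(s.c_str());
}
```

For a string that contains no null characters, length_via_c_string(s) returns the same value as s.length().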

As mentioned in the previous section, literal strings such as "abc" are stored as C strings in a C++ program. Therefore, any function that takes a C string as argument can be used on a literal string. For example, string s("abc") initializes s to "abc" by using the second constructor.

That second constructor can also be used for implicit conversions. Therefore, any function that takes a C++ string as argument can also be used on a C string (and a literal string). This applies to the C++ string operations themselves, as in s + "abc". Note, however, that for the sake of efficiency, several operations come with separate implementations designed to handle C string arguments directly.

² A typical limit is approximately 4 billion characters.

³ Some of these operations involve concepts we still have not covered, such as iterators, ranges, exceptions and capacity. You can ignore these operations for now.

⁴ Before C++11, the version of C++ that was finalized in 2011, a common use of the c_str operation was for opening files. This was because the file stream constructor that takes a file name as argument accepted a C string but not a C++ string. If we had a file name stored in a string variable, it was necessary to first convert it into a C string by using the c_str method, as in

    ifstream in(file_name.c_str());

⁵ You probably know that C++ functions cannot return arrays. So you may wonder how the c_str method can return a C string since C strings are stored in arrays. Later in these notes, we will learn that pointers can be used to return arrays in an indirect way.

string s
string s(s2)
string s(s2, i, n)
string s(n, c)
    Creates a string s and initializes it to be empty, or a copy of string s2, or a copy of the substring of s2 that starts at index i and is of length n, or n copies of character c. The argument s2 can also be a C string.

s.length()
s.size()
    Asks string s for the number of characters it currently contains.

s.empty()
    Asks string s if it is empty.

s.max_size()
    Asks string s for the maximum number of characters it can contain.

s[i]
    Returns a reference to the character at index i in string s.

s1 = s2
    Makes string s1 a copy of string s2. The right operand can also be a C string or a single character. Returns a reference to s1.

s1.swap(s2)
    Asks string s1 to swap contents with string s2.

s.clear()
    Asks string s to delete all its characters.

Table 3.3: Some string operations (part 1 of 4)

s1 op s2
    Compares string s1 with string s2 where the operator op is one of ==, !=, <, >, <= or >=. Uses alphabetical order. Returns true or false. One of the operands must be a string object but the other can be a C string.

s1 + s2
    Returns a string that consists of a copy of string s1 followed by a copy of string s2. One of the operands must be a string object but the other can be a C string or a single character.

s1 += s2
    Appends a copy of string s2 to string s1. The right operand can also be a C string or a single character. A reference to s1 is returned.

s.resize(n)
s.resize(n, c)
    Asks string s to change its size to n. If n is smaller than the current size of s, the last characters of s are erased. If n is larger, s is padded with the null character or with copies of character c.

s.substr(i, m)
s.substr(i)
    Asks string s for a copy of the substring that starts at index i, and is of length m or ends at the end of the string.

s.insert(i, s2)
    Asks string s to insert into itself, at index i, a copy of string s2. The second argument can be of any of the forms accepted by the constructors. A reference to s is returned.

Table 3.4: Some string operations (part 2 of 4)

s.replace(i, m, s2)
    Asks string s to replace the substring of length m that starts at index i by a copy of string s2. The third argument can be of any of the forms accepted by the constructors. A reference to s is returned.

s.erase(i, m)
s.erase(i)
    Asks string s to delete m characters starting at index i, or all the characters from index i to the end of the string. A reference to s is returned.

s1.find(s2)
s1.find(s2, i)
    Asks string s1 for the index of the first occurrence of string s2 as a substring. In the first version, the entire string is searched. In the other, the search starts at index i. The argument s2 can also be a C string or a single character. If the search is unsuccessful, the constant string::npos (“not a position”) is returned.

s1.rfind(s2)
s1.rfind(s2, i)
    Similar to find except that the search is for the last occurrence and that the search ends at index i.

s.c_str()
    Asks string s for a C string that contains the same characters as s. That C string is constant and may become invalid if a subsequent operation modifies the size of s.

Table 3.5: Some string operations (part 3 of 4)

stoi(s)
stod(s)
    Converts string s into a number. Starting from the beginning of the string, skips whitespace and then converts as many characters as possible into a number. The first version returns an int while the second version returns a double. The string is not modified.

to_string(x)
    Returns a string representing number x. The argument can be of any of the usual numeric types.

stream << s
    Outputs the characters of string s. Returns a reference to the stream.

stream >> s
    Reads characters into string s. Skips leading white space and stops reading at white space (blank, tab or newline). That terminating character is not read. Returns a reference to the stream.

getline(stream, s)
    Reads characters into string s until the end of the current line. The newline character is read but not included in s. Returns a reference to the stream.

Table 3.6: Some string operations (part 4 of 4)

The find and rfind methods return the special value string::npos (“not a position”) in case the search is unsuccessful. If s is a string, that constant can also be accessed as s.npos.
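As a short illustration of how the result of find is typically compared with string::npos, consider the following sketch. The function names contains and count_char are hypothetical helpers written for this example.

```cpp
#include <string>

// Returns true if sub occurs somewhere in s.
bool contains(const std::string & s, const std::string & sub)
{
    return s.find(sub) != std::string::npos;
}

// Counts the occurrences of character c in s by calling find repeatedly,
// each time starting the search just past the previous match.
int count_char(const std::string & s, char c)
{
    int count = 0;
    std::string::size_type i = s.find(c);
    while (i != std::string::npos) {
        ++count;
        i = s.find(c, i + 1);
    }
    return count;
}
```

Note that a call such as contains("hello world", "lo w") passes literal strings to a function that expects C++ strings. This works because of the implicit conversions performed by the second constructor.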

Study Questions

3.2.1. What are three advantages of C++ strings over C strings?

3.2.2. How can we convert a C string to a C++ string and vice-versa?

Exercises

3.2.3. Create a function println(s) that takes a C++ string as argument and behaves exactly as the code cout << s << endl. (But don’t use this code in implementing the function.)

3.2.4. Write a code segment that starts with a string that contains a person’s name in the format "John Doe". Assume that the name contains a single blank space. Your code should produce another string that contains the same name but in the format "Doe, John".

a) Write your code by using only the default constructor, the indexing operator and the methods length and resize.

b) Now simplify your code by using the method find and the operator +=.

c) Simplify your code further by using the method append. (Consult a reference such as [CPP].)

3.3 I/O Streams

You should already be familiar with basic input and output operations in C++. The library file iostream contains two objects cin and cout that can be used for reading data from the keyboard and displaying data on the screen, as shown in Table 3.7. These objects are of class istream and ostream, respectively. These two classes provide several useful operations. Some of these are listed in Table 3.8. In that table, out and in refer to output and input streams, respectively.

cout
    A buffered output stream (class ostream) normally associated with the computer screen.

cin
    A buffered input stream (class istream) normally associated with the keyboard.

Table 3.7: Standard input and output streams

File input and output is done through file stream objects of class ifstream, ofstream and fstream. The class ifstream is a subclass, or derived class, of istream, which means that all the istream operations are automatically included in ifstream. The same is true of ofstream and ostream. The class fstream is a subclass of both istream and ostream, so all the operations of these two classes are included in fstream. Table 3.9 includes some operations that are available only with file streams.

out << data
    Sends data to the output stream and returns the stream.

out.put(c)
    Asks the output stream to write character c. The stream is returned.

in >> var
    Reads from the input stream a value of the appropriate type and stores it in the variable. Initial white space is skipped. Returns the stream.

in.get()
    Asks the input stream to read and return the next character.

in.get(var)
    Asks the input stream to read the next character and store it in the variable. The stream is returned.

in.peek()
    Asks the input stream to return the next character without removing it from the stream.

in.putback(c)
    Asks the input stream to put back character c so it will be the next character that is read. The stream is returned.

Table 3.8: Some input and output operations

type f
    Creates a file stream of the specified type (ifstream, ofstream or fstream).

type f(file_name)
    Creates a file stream and asks it to open the file with the given name. The argument can be a C++ string or a C string.

f.open(file_name)
    Asks the file stream to open the file with the given name. The argument can be a C++ string or a C string.

f.close()
    Asks the file stream to close the file currently associated with it.

Table 3.9: Some operations specific to file streams

Note that when a file stream variable ceases to exist, then the associated file is automatically closed. This happens, for example, when a stream variable is local to a function and that function returns. In such a case, it is not necessary to explicitly ask the stream to close the file.

Note also that if a function takes as argument an istream passed by reference, then that function can be called on any input stream, including cin as well as any input file stream. Similarly, if a function takes as argument an ostream passed by reference, then that function can be called on any output stream, including cout as well as any output file stream.

The fact that all the operations of a class like istream are also available in its subclasses is called inheritance. The fact that a reference argument declared of a certain class can be set to either an object of that class or an object of any of its subclasses is an example of polymorphism.⁶
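The following sketch illustrates this use of a reference argument. The function names sum_integers and sum_of_text are hypothetical. The second function relies on string streams, another kind of input stream that is covered in Section 3.4; it is included only to show the same function being called on a stream that is neither cin nor a file stream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads integers from in until an input operation fails and returns
// their sum. Because the parameter is an istream passed by reference,
// the argument can be cin, an ifstream, or any other input stream.
int sum_integers(std::istream & in)
{
    int sum = 0;
    int n;
    while (in >> n)
        sum += n;
    return sum;
}

// Calls sum_integers on a string stream, which is also an istream.
int sum_of_text(const std::string & text)
{
    std::istringstream in(text);
    return sum_integers(in);
}
```

The same function could be called as sum_integers(cin), or on an ifstream variable that was opened on a file of integers.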

At any moment, a stream is in at least one of the following states: end of file, fail, bad or good. The fail and bad states are error states.

A stream enters the end of file state if an attempt was made to read past the end of the file. Note that this is not necessarily an error since it will occur, for example, when a number is read and that number was the last piece of data in a text file.

A stream enters the fail or bad states if the last stream operation failed. Note that a stream can be at the same time in the end of file state and in one of the error states, or in one but not the other. Note also that if a stream is in the end of file state or in an error state, then every subsequent input operation on that stream will fail.

A stream is in the good state if it is not in the end of file state or in an error state. This essentially means that the stream is ready to be read from.

The stream classes provide methods to determine if a stream is in any of these states, as shown in Table 3.10. An additional clear method is provided to return the stream to the good state. This method is useful for error recovery.
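As one possible illustration of error recovery with clear, the sketch below skips over a line of invalid input and tries again. The function names are hypothetical. The code also uses the standard istream method ignore, which is not listed in the tables of these notes (it discards characters up to and including a given delimiter), and a string stream (Section 3.4) to stand in for a file or the keyboard.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>

// Tries to read an integer from in. If a read fails because of invalid
// input (and not because the end of the file was reached), clears the
// error state, discards the rest of the current line and tries again.
// Returns true if an integer was read into n.
bool read_int_with_recovery(std::istream & in, int & n)
{
    while (!(in >> n)) {
        if (in.eof() || in.bad())
            return false;   // nothing left to read, or unrecoverable error
        in.clear();         // return the stream to the good state
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return true;
}

// Demonstration helper: returns the first integer that can be read
// from text in this way, or -1 if there is none.
int first_int_in(const std::string & text)
{
    std::istringstream in(text);
    int n;
    if (read_int_with_recovery(in, n))
        return n;
    return -1;
}
```

Without the call to clear, every input operation that follows the first failure would also fail, as explained above.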

When a stream is used as a Boolean expression, it evaluates to true if it is not in an error state. In other words, a stream is true if the last stream operation was successful. We can use this to test if a file opens properly:

    ifstream in("input.txt");
    if (!in) ... // file did not open

⁶ At Clarkson, the concepts of inheritance and polymorphism, as well as the mechanism for creating subclasses, are covered in detail in the course CS242 Advanced Programming Concepts.


stream.eof()
    Asks the stream if it is in the end of file state, which means that an attempt was made to read past the end of the file.

stream.fail()
    Asks the stream if it is in an error state (either fail or bad), which means that an error occurred.

stream.bad()
    Asks the stream if it is in the bad state, which generally means that a non-recoverable error occurred.

stream.good()
    Asks the stream if it is in the good state, which generally means it is ready to be read from.

stream.clear()
    Asks the stream to return to the good state (usually for error recovery).

stream
    Evaluates to true if the stream is not in an error state.

Table 3.10: Error-related stream operations


    int n;
    while (in >> n)
        cout << n << endl;

Figure 3.2: A loop that reads every number from a file

Note that the usual logical operators, including negation (!), can be used on streams.

Another use of streams as Boolean expressions is to control a loop that reads from a file. For example, suppose that a file contains a list of integers and we need to print them to the screen. This can be done as shown in Figure 3.2. This works because the input operator returns the stream. As long as a number is successfully read from the file, the stream is true and the loop keeps running. But after the last number is read, the next input operation fails, the stream is false and the loop stops.

At first, the code

while (in >> n)

may seem awkward and hard to read because the phrase "while read n from in" doesn’t make sense. A solution is to read this code as "while it is possible to read n from in". This actually reads well.

Note that the eof method often cannot be used to control a loop that reads from a file. For example, consider the loop of Figure 3.3. Suppose that the last number in the file is followed by a newline character (\n). Then after reading the last number, the stream will still not be in the end of file state, which means that the loop will run one more time. Then the reading operation will fail and this will cause the last number in the file to be printed twice to the screen.
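The extra iteration can be observed with the following sketch, which counts how many times the body of each kind of loop runs. The function names are hypothetical, and string streams (Section 3.4) are used in place of files so that the example is self-contained.

```cpp
#include <sstream>
#include <string>

// Counts the iterations of a loop controlled by the input operation
// itself, as in Figure 3.2.
int count_with_stream_test(const std::string & text)
{
    std::istringstream in(text);
    int count = 0;
    int n;
    while (in >> n)
        ++count;
    return count;
}

// Counts the iterations of a loop controlled by eof, as in Figure 3.3.
int count_with_eof_test(const std::string & text)
{
    std::istringstream in(text);
    int count = 0;
    int n = 0;
    while (!in.eof()) {
        in >> n;    // may fail; the loop body still runs to completion
        ++count;
    }
    return count;
}
```

On the text "1 2 3\n", the first function counts 3 iterations while the second counts 4, because the stream only enters the end of file state during the failed fourth read.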

    int n;
    while (!in.eof()) {
        in >> n;
        cout << n << endl;
    }

Figure 3.3: A loop that may not work

It’s possible to fix this by adding a test inside the loop, as shown in Figure 3.4. But this loop is now less efficient and more complicated than the loop of Figure 3.2.

    int n;
    while (!in.eof()) {
        in >> n;
        if (in)
            cout << n << endl;
    }

Figure 3.4: A loop that works but is more complicated

As mentioned earlier, the objects cin and cout are defined in the library file iostream. The classes istream and ostream are also defined there. The file stream classes are defined in the library file fstream. All of these classes and objects are part of the std namespace.

The stream objects and operations that were mentioned in this section are sufficient for many applications. Some additional operations are described in a reference such as [CPP].

Input and output streams are not only useful, they are also a very good example of object-oriented programming. Consider cin. That object provides access to the standard input stream (normally the keyboard). The cin object most likely holds data. For example, it may hold the current contents of the input buffer and the current location of the next character to be read. Of course, users of cin do not depend on how that data is stored. As such, cin can be seen as a good example of data abstraction.

But cin may not even hold that data. For example, the buffer could be stored in some device that cin has access to. So cin is best thought of not as a piece of data, but as an object that provides us with a service (access to keyboard input). In addition, users of cin don’t need to be concerned with how that service is provided. And they do not depend on those implementation details. All of this illustrates that OOP goes beyond data abstraction and that OOP does lead to a high level of independence.

Exercises

3.3.1. Experiment with I/O streams and the various I/O stream operations, including error states, by writing a test driver that uses the stream objects and operations shown in Tables 3.7 to 3.10.


3.3.2. Create functions read(cs, n, in) and readln(cs, n, in) that behave exactly as the C string functions get and getline described in Table 3.2. (But don’t use those functions in implementing read and readln. Make sure your functions handle the end of a file properly. That is, make sure they can handle reading from the last line of a file even if that line does not end with a new line character.)

3.4 String Streams

In addition to input and output streams, C++ provides string streams. These are streams that are each associated with a string instead of a device or a file. When you write to a string stream, characters get added to a string. When you read from a string stream, data is read from a string.

String streams support all the general stream operations described in Section 3.3. That’s because the string stream classes istringstream, ostringstream and stringstream are subclasses of the I/O stream classes istream, ostream and iostream, respectively. Table 3.11 lists some operations that are particular to string streams.
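As a minimal sketch of both directions, the first function below writes to an output string stream and retrieves the resulting string, while the second reads a number from an input string stream. The function names label and leading_count are hypothetical.

```cpp
#include <sstream>
#include <string>

// Builds a string such as "3 apples" by writing to an output string
// stream and then asking the stream for a copy of its string.
std::string label(int count, const std::string & item)
{
    std::ostringstream out;
    out << count << ' ' << item;
    return out.str();
}

// Extracts the integer at the beginning of a string such as "3 apples"
// by reading from an input string stream. Returns 0 if the string does
// not start with an integer.
int leading_count(const std::string & s)
{
    std::istringstream in(s);
    int count = 0;
    in >> count;
    return count;
}
```

The operators << and >> used here are the same ones that are used with cout and cin.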

One example of the usefulness of string streams is the conversion between strings and other types of values. In the case of conversions between strings and the basic numerical types, the standard library provides the functions stoi, stod and to_string. These are described in Table 3.6.

But suppose, for example, that we need to perform conversions between strings and objects of class Time. There are no library functions that can do that. But the input operator of class Time essentially converts a string to a time. And the output operator of class Time essentially converts a time to a string. So the idea is to use these operators to read a time from a string and to write a time to a string.


type ss
type ss(s)
    Creates a string stream ss of the specified type (istringstream, ostringstream or stringstream). The second version associates the stream with a copy of string s.

ss.str()
    Asks string stream ss for a copy of its string.

ss.str(s)
    Asks string stream ss to set its string to be a copy of string s.

Table 3.11: Some operations specific to string streams

This can be done as shown in Figure 3.5. In the function to_Time, the input string stream in is initialized with a copy of the string s. A time is then extracted from that string by using the input operator. In the function to_string, a time is written to the output string stream and the resulting string is then retrieved by using the str method.

String streams are also useful for error checking. Suppose that a program receives as input a file specifying when each employee of a company started and stopped working. Suppose that each line in the file contains an employee number, a start time and a stop time. And now, suppose that we need to verify that this file is correctly formatted in this way.

The easiest way to do this is to read the file line by line and then try to extract from each line an employee number and two times. This can be done by initializing an input string stream with the line and then attempting to read an employee number and two times from the string stream. If any of the input operations fail, we know that the line doesn’t contain the required data.


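    #include <sstream>
    #include <string>

    // Converts string s to a time by using the input operator of class Time.
    Time to_Time(const std::string & s)
    {
        std::istringstream in(s);   // in holds a copy of s
        Time t;
        in >> t;
        return t;
    }

    // Converts time t to a string by using the output operator of class Time.
    inline std::string to_string(const Time & t)
    {
        std::ostringstream out;
        out << t;
        return out.str();           // copy of the string held by out
    }

Figure 3.5: Functions to convert between strings and times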


The result is shown in Figure 3.6. The loop is controlled by a call to getline. This works, as explained in the previous section, because getline returns the stream. This code keeps track of the number of the current line so that number can be included in the error messages.

This code also checks that each line in the file only contains an employee number and two times. This is done by using the stream manipulator ws:

    line_stream >> ws

This extracts whitespace from the line. If the line contains only whitespace after the second time, this leaves the string stream in the end of file state. If that’s not the case, an error message is printed.

Note that this code does not check for every possible error. In particular, it does not check that times are properly written: for example, the times 8:5 and 37:50 would not be recognized as errors. We will learn later in these notes how the input operator of class Time can be modified to detect these kinds of errors.

The string stream classes are defined in the library file sstream. These classes are part of the std namespace.

Exercises

3.4.1. Create string conversion functions for the class Date of the exercises of the previous two chapters. More precisely, create a function to_Date(s) that takes a string as argument and returns a Date object. And create a function to_string(d) that takes a Date object as argument and returns a string.

3.4.2. Write code that reads a line of text from cin and verifies that the line contains exactly three integers and nothing else.


    ifstream in("employee_times.txt");

    string line;
    int line_number = 0;
    while (getline(in, line)) {
        ++line_number;
        istringstream line_stream(line);

        int n;
        Time t1;
        Time t2;
        line_stream >> n >> t1 >> t2;

        if (!line_stream) {
            cout << "There is an error at line "
                 << line_number << endl;
            return;
        }

        // check that rest of line is blank
        line_stream >> ws;
        if (!line_stream.eof()) {
            cout << "Line " << line_number
                 << " contains too much data\n";
            return;
        }
    }

    cout << "The file doesn't appear to contain any "
         << "errors\n";

    return;

Figure 3.6: Using a string stream to verify the formatting of an input file


Chapter 4

Vectors

In this chapter, we will learn how to use vectors to store large amounts of data. We will illustrate the usefulness of vectors by creating a simple file viewer. This will also allow us to discuss the software development process and object-oriented design.

4.1 A Simple File Viewer

Our file viewer will be a program that allows the user to view the contents of a text file. Figure 4.1 shows what the interface of this program will look like. The program displays the name of the file that is currently being viewed, if any, followed by a certain number of lines from that file surrounded by a border. Below the text, a menu of commands is displayed followed by the prompt “choice:”. The user types the first letter of a command, the command executes and everything is redisplayed.

The commands next and previous cause the file viewer to display the next or previous “pages”. The command open causes the program to ask the user for the name of another file to open.

    preface.txt
    --------------------------------------------------
     1  After a few computer science courses, students
     2  may start to get the feeling that programs can
     3  always be written to solve any computational
     4  problem. Writing the program may be hard work.
     5  For example, it may involve learning a difficult
     6  technique. And many hours of debugging. But
     7  with enough time and effort, the program can be
     8  written.
     9
    10  So it may come as a surprise that this is not
    11  the case: there are computational problems for
    12  which no program exists. And these are not
    --------------------------------------------------
    next  previous  open  quit
    -------
    command: o
    file: introduction.txt

Figure 4.1: Sample interface of the file viewer

We will create the file viewer later in this chapter. One main decision we will have to make concerns the storage of the file contents. To avoid having to read the file multiple times, we will have the program store the contents of the file in main memory (that is, in the program’s variables).

How should we store the contents of the file? One idea would be as one very long string, with the different lines separated by new line characters. But then moving to the next or previous page would involve scanning the string character by character while counting new line characters. This is somewhat inefficient.

An alternative is to store each line on its own, as a string. We obviously can’t use separate variables for each of the lines, since the number of lines is unknown and might be large. This is where vectors come in. In the next section, we describe what these are.

4.2 Vectors in the STL

In C++, a vector is a container, that is, a software component designed to hold a collection of elements. To be more precise, a vector holds a (finite) sequence of elements. These elements can be of any type but in an individual vector, all elements have to be of the same type. The type of element held by a particular vector is specified at the time that the vector is created. For example, a vector that can store integers can be created as follows:

    vector<int> v;

C++ vectors are objects of class vector. This class is defined in the library file vector and included in the std namespace.

One important property of vectors is that they are dynamic, in the sense that they can grow and shrink as needed. For example, suppose that we want to fill v with integers read from a file stream in. This can be done as follows:

    int x;
    while (in >> x)
        v.push_back(x);

This loop reads every integer in the file and adds a copy of each one of them to the back of the vector. The size of the vector increases automatically.

Vector elements can be accessed by using the indexing operator. For example, all the elements of vector v can be printed as follows:

    for (int i = 0; i < v.size(); ++i)
        cout << v[i];

Note how we can simply ask a vector for its size. There is no need to store that size in a separate variable. And when passing a vector as argument, there is no need to pass the size of the vector as an additional argument since the function can simply ask the vector for its size.

Now, when a vector needs to be traversed in its entirety, as in the above example, it is more convenient to use a range-for loop:

    for (int x : v)
        cout << x;

When only part of the vector needs to be traversed, indices must be used. (Or iterators, a concept we will learn later in these notes.)¹

Tables 4.1 and 4.2 show some of the most basic vector operations. The class supports several more, as described in a reference such as [CPP].²

¹ Range-for loops and iterator-based loops are also safer than index-based loops. That’s because the vector method size usually returns an unsigned integer and comparing an unsigned integer to a plain (signed) int can sometimes fail. The vector’s size would have to be very large for this to happen, typically over 2 billion elements. But it’s good to be aware of this possibility.

² Some of these operations involve concepts that we have not covered, such as iterators, ranges and capacity. You can ignore these operations for now.

vector<T> v
vector<T> v(n)
vector<T> v(n, e)
vector<T> v({elements})
vector<T> v(v2)
    Creates a vector v that can hold elements of type T. The vector is initialized to be empty, or to contain n value-initialized elements of type T, or n copies of element e, or copies of the given elements, or copies of all the elements of vector v2.

v.size()
    Asks vector v for the number of elements it currently contains.

v[i]
    Returns a reference to the element at index i in vector v.

v.front()
v.back()
    Asks vector v for a reference to its front or back element.

v.resize(n)
v.resize(n, e)
    Asks vector v to change its size to n. If n is smaller than the current size of v, the last elements of v are deleted. If n is larger than the current size, then v is padded with either value-initialized elements of type T or with copies of element e.

Table 4.1: Some basic vector operations (part 1 of 2)

v.push_back(e)
    Asks vector v to add a copy of element e to its back end.

v.pop_back()
    Asks vector v to delete its last element.

v1 = v2
v1 = {elements}
    Makes vector v1 contain copies of all the elements of vector v2, or copies of the given elements. Returns a reference to v1.

v.empty()
    Asks vector v if it is empty.

v.max_size()
    Asks vector v for the maximum number of elements it can contain.

v.clear()
    Asks vector v to delete all its elements.

v.assign(n, e)
v.assign({elements})
    Asks vector v to change its contents to n copies of element e, or to copies of the given elements.

v1.swap(v2)
    Asks vector v1 to swap contents with vector v2 (without any elements being copied).

Table 4.2: Some basic vector operations (part 2 of 2)

The second constructor value-initializes the elements of the vector. A complete definition of value initialization involves concepts we have not covered but the following addresses the most common situations. If the vector elements are primitive values, such as int’s, double’s or char’s, then the elements are set to 0. (In the case of char, 0 is converted to the null character.) If the vector elements are objects, then they are initialized with their default constructor. If the vector elements are arrays, then the elements of each of these arrays are value-initialized.³

³ One exception is when the vector elements are objects (or structures) that have no programmer-defined constructor. In that case, the objects are first zero-initialized and then initialized with the compiler-generated default constructor. When an object is zero-initialized, each of its data members is zero-initialized. If the data member is a primitive value, it is set to 0. If it’s an array, all its elements are zero-initialized. The compiler-generated default constructor default-initializes each data member of the object. If the data member is a primitive value, nothing is done. If the data member is an object, it’s initialized with its default constructor. If the data member is an array, all its elements are default-initialized.

The argument of the fourth constructor is an initializer list. When such a list is specified by using braces, the elements should be separated by commas as in {1, 2, 3}. In addition, the parentheses can be omitted:

    vector<T> v{elements}

The following syntax can also be used:

    vector<T> v = {elements}

The fifth constructor is known as a copy constructor. It can also be used with the equal sign syntax:

    vector<T> v = v2

Note that vectors have push_back and pop_back methods but no push_front or pop_front. The reason has to do with efficiency: it is possible to implement the operations at the back efficiently but doing so at the front is more difficult. (We will discuss this point in more detail later in these notes.)

As mentioned earlier, vectors can grow and shrink as needed. Even so, vectors do have a maximum size, which is related to the largest value that can be stored in the integer type used for sizes and indices. However, that maximum size, which can be retrieved by the method max_size, is typically much larger than what’s needed in most applications. For example, for a vector of integers, a typical limit on a 32-bit platform is approximately 1 billion elements, and the limit is much larger on a 64-bit platform.

Vectors are said to be generic because they can hold elements of any type. In fact, vectors belong to a portion of the C++ standard library called the Standard Template Library (STL) and the word template in the name of this library refers to a C++ construct that allows the creation of generic software components. We will see later that the STL includes several other generic components, including other containers as well as functions that implement useful algorithms.

We end this section with a note about the efficient use of vectors. At any moment, a vector has a certain amount of memory available to it. That amount of memory is called the capacity of the vector. More precisely, the capacity of a vector is the maximum number of elements that the vector can hold with the memory it currently has available. In contrast, the size of a vector is the number of elements it currently holds.

As we saw earlier, a vector can be easily filled with integers read from a file stream as follows:

    int x;
    while (in >> x)
        v.push_back(x);

This will typically cause the capacity of the vector to increase multiple times. And each time the capacity of a vector increases, all the elements currently stored in the vector must be copied to a new location where more space is available. Therefore, it’s best to try to reduce the number of times that the vector must increase its capacity.

One way to achieve this is to ask the vector to reserve enough capacity ahead of time. For example, suppose that we know that file stream in contains n integers. Then those integers can be read more efficiently as follows:

    v.reserve(n);
    int x;
    while (in >> x)
        v.push_back(x);

This will cause the vector to increase its capacity only once.

Note that this would work even if all we had was a reasonable upper bound on the number of integers in the stream. If necessary, after the reading is completed, we could ask the vector to reduce its capacity to the minimum required, which is the size of the vector:

    v.shrink_to_fit()

Whether this request is fulfilled depends on the particular implementation of the library being used.

Another way to avoid multiple increases in the capacity of a vector is to increase the size of the vector ahead of time:

    v.resize(n);
    for (int & x : v)
        in >> x;

Note the difference: v.push_back(x) adds a new element to the vector while in >> x in the above code modifies an existing element of the vector.

In this example, it was just as easy to resize the vector as it was to reserve capacity and then use push_back repeatedly. But in many other situations, repeated push_back’s are more convenient.

Table 4.3 lists three methods related to the capacity of vectors.

v.capacity()
    Asks vector v for its capacity.

v.reserve(n)
    Asks vector v to increase its capacity so it is at least n. If n is less than the current capacity, nothing happens.

v.shrink_to_fit()
    Asks vector v to reduce its capacity to the size of the vector. (Whether the request is fulfilled depends on the implementation.)

Table 4.3: Vector operations related to the capacity of the vector

Study Questions

4.2.1. What is a container?

4.2.2. In what way are vectors dynamic?

4.2.3. In what way are vectors generic?

4.2.4. What happens when a vector element is value-initialized?

4.2.5. Why do STL vectors have no push_front or pop_front methods?


Exercises

4.2.6. Experiment with vectors by writing a test driver that creates more than one type of vector and uses all the methods shown in Tables 4.1 and 4.2.

4.2.7. Create a function print(v, out) that takes as arguments a vector of integers v and an output stream out and prints to out all the elements of v, one per line.

4.2.8. Create a function concatenate(v1, v2, v3) that takes as arguments three vectors of integers v1, v2 and v3 and makes v3 contain a copy of all the elements of v1 followed by a copy of all the elements of v2.

4.2.9. Create a function replace(v, x, y) that takes as arguments a vector of integers v and two other integers x and y, and replaces every occurrence of x in v by a copy of y.

4.2.10. Create a function count(v, x) that takes as arguments a vector of integers v and an integer x and returns the number of occurrences of x in v.

4.3 The Software Life Cycle

Before turning to the creation of the file viewer, we’re going to discuss in some detail what activities are actually involved in the creation of a piece of software.

The first one is the specification of the software. This consists in determining exactly what the software must do. A specification is normally concerned only with the external behavior of the software, not its inner workings. In other words, a specification spells out what the software must do, not how it does it. A good specification should be clear, correct and complete. It also helps if it is as concise as possible. Specifying a piece of software typically involves communicating with the client.

The second activity is the design of the software. This generally consists of the following three tasks:

1. Identify the various components of the software and the tasks they are responsible for.

2. Choose or design major algorithms and data structures for these components.

3. Write precise specifications for these components, including precise interfaces.

The interface of a component is what the user code (or client code) uses to communicate with the component. In the case of a function, this consists of the name of the function as well as the type of its arguments and its return value. In the case of a class, this consists of the name of the class and the interface of all of its public methods and data members.

In other words, after the software is designed, we should know what components it contains, what these components do and how to use them.

The third activity is the implementation of the software. This is the writing and testing of the code. This normally involves the following:

1. Code each software component.

2. Test each component on its own. This is called unit testing.

3. Combine the components gradually, one at a time, testing after each addition. This is called integration testing.

Unit and integration testing make it easier to locate and fix errors. This is especially important when dealing with large programs.

Once a program is implemented, it is ready to be used. But the software will usually continue to evolve. This can consist of fixing errors (as reported by users, for example), adapting the software to new environments (such as new hardware), extending the software (to give it new capabilities), or more generally improving the software (to make it more efficient or easier to use, for example). All of these activities constitute what is called software maintenance and evolution, or just maintenance, for short.

Software specification, design, implementation and maintenance are often referred to as the four stages of the software life cycle. We will say a bit more about the first three stages later in this chapter. But first, in the next section, we discuss a key issue in the management of the software development process.

Study Questions

4.3.1. What are the four stages of software development?

4.3.2. What is a software specification?

4.3.3. What are four properties of a good software specification? Hint: Four words that start with the letter c.

4.3.4. What three tasks does software design consist of?

4.3.5. What is the interface of a software component?

4.3.6. What is unit testing?

4.3.7. What is integration testing?


4.3.8. What is the main advantage of integration testing (over testing all the components together right away)?

4.4 The Software Development Process

By the time we are done creating a program, we will have necessarily specified, designed and implemented it. A key question is how to synchronize these activities. A number of approaches are possible.

One is to simply do everything all at once. Essentially, start coding right away, and specify and design the program as you code. This may work reasonably well for small programs but this approach is problematic with large programs. For example, it makes it difficult to split the work among several programmers: how can one programmer implement a component while, at the same time, another programmer is writing code that uses that component unless the two programmers first agree on what the component does and how it should be used (specification, including interface)?

An approach that addresses these problems is to carry out the specification, design and implementation of a program in sequence. This is usually referred to as the waterfall model of software development. The idea is that the product of each stage flows as input into the next stage: a specification flows from the specification stage into the design stage, a design document flows from the design stage into the implementation stage, and code flows from the implementation stage into the maintenance stage.

This approach may seem ideal because each stage of the development of the software relies on complete information from the previous stage. However, it is difficult to specify and design a very large program without writing any code to experiment with the software and verify that the specification and design make sense.


An alternative to the waterfall model is to proceed incrementally by specifying, designing and implementing successively more complete versions of the software. The main advantage of incremental (or iterative) development is that knowledge and experience gained in developing one version of the software can be used in developing the next one. For example, it is possible to get feedback from the client on early versions. Another advantage is that the creation of the entire software proceeds as a sequence of smaller, more manageable projects. And finishing a version of the software, even an incomplete one, is a satisfying experience that typically generates excitement and increased motivation.

Note that the specification and design of a particular version of the program do not need to be completely done before its implementation. Details that concern only one component, such as I/O format, can be left to the developer of that component. In other words, incremental development does not have to be a series of smaller waterfall development projects. A fair amount of flexibility is possible in the development of each version.

There is a lot of evidence that incremental development is superior to the single-pass waterfall model. For example, Larman and Basili [LB03] surveyed the history of incremental development and concluded that incremental development concepts “have been and are recommended practice by prominent software-engineering thought leaders of each decade, associated with many successful large projects, and recommended by standards boards.”

In addition, Fred Brooks, an early leader in the field of software engineering,⁴ described the essential challenges of software engineering and identified incremental development as one of three promising solutions [Bro87]. He points out that software developers used to think of writing programs (as if they were recipes) but that as program size increased, software developers realized it was more appropriate to think of building programs. Similarly, Brooks suggested that instead of building programs, we now think of growing them (by using incremental approaches).

⁴ Brooks was also the 1999 winner of the ACM’s Turing Award, which is widely regarded as the “Nobel Prize” of computer science [TA].

Note that incremental development is a key ingredient in all the agile approaches to software development.

Much more can be said about the software development process. At Clarkson, this is done in a course such as CS350 Software Design and Development.

Study Questions

4.4.1. What is a disadvantage of coding right away without first specifying and designing the software?

4.4.2. What is the waterfall model of software development?

4.4.3. What is incremental software development?

4.4.4. What are the benefits of incremental development?

4.5 Creation of the File Viewer

The general behavior of the file viewer was described earlier in this chapter but several details still need to be specified. For example, how does the program know how many lines to display per “page”? What should the next command do when the last page is already displayed? How should the last page be displayed if it is shorter than the others? What should happen if a file doesn’t open?

One strategy for producing a program specification is to pretend that the program is running and then imagine using it in every possible way. This is sometimes called working through scenarios. This allows us to explore and specify every detail of the program’s behavior.

A full specification of the file viewer should answer these questions as well as every other possible question about the behavior of the program. One possible specification is shown in Figures 4.2 to 4.4.

We now turn to the design of the program. Recall from earlier in these notes that our goal is to design a modular program. There are three strategies that help us achieve this goal. First, of course, we use abstraction, both procedural and data abstraction. Second, as we design the program, we make sure that components delegate as many tasks as possible to other components. Recall that thinking in an object-oriented way encourages delegation. Third, we aim for a design in which each component has its own secret. This typically leads to important aspects of the program being isolated from each other within different components.

Just like the specification of the program, the design can be carried out by working through scenarios. The design then proceeds mainly as a sequence of what/who questions: What needs to be done? Who is going to do it? This is sometimes referred to as the what-who cycle. This design process is said to be responsibility-driven because it is driven by the identification and assignment of responsibilities.

Note that while designing a program, it is a good idea to delay deciding on minor details, especially those that involve only one component. Early on, we want to focus on the major aspects of the program without getting distracted (and possibly overwhelmed) by all the little details.

Now, at any moment, the file viewer holds a copy of the contents of a file.We will call this copy a buffer. There are three main tasks that the file viewerneeds to handle:

1. The user interaction.

Page 116: Introduction to Data Abstraction, Algorithms and Data ...alexis/CS142/Notes/notes.pdf · will discuss and learn the important concepts of modularity, abstraction, data abstraction

108 CHAPTER 4. VECTORS

OVERVIEW

A simple file viewer that allows the user view thecontents of a text file.

DETAILS

The program interacts with the user as shown in thefollowing example:

preface.txt−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

1 After a few computer science courses, students2 may start to get the feeling that programs can3 always be written to solve any computational4 problem. Writing the program may be hard work.5 For example, it may involve learning a difficult6 technique. And many hours of debugging. But7 with enough time and effort, the program can be8 written.9

10 So it may come as a surprise that this is not11 the case: there are computational problems for12 which no program exists. And these are not−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−

next previous open quit−−−−−−−command: ofile: introduction.txt

Figure 4.2: Specification of the file viewer (part 1 of 3)

Page 117: Introduction to Data Abstraction, Algorithms and Data ...alexis/CS142/Notes/notes.pdf · will discuss and learn the important concepts of modularity, abstraction, data abstraction

4.5. CREATION OF THE FILE VIEWER 109

    The program begins by asking the user for a window
    height. This is the number of lines that will be
    displayed as each "page". The displayed lines are
    numbered starting at 1 for the first line of the file.
    If the number of lines on the last page is smaller
    than the window height, the rest of the window is
    filled with unnumbered empty lines.

    Each page is displayed between two lines of 50 dashes.
    The name of the file is printed above the first line
    of dashes. If no file is currently open, the string
    "<no file opened>" is printed instead of the file name.

    Below the second line of dashes, a menu of commands is
    displayed. Below that menu, the prompt "choice:" is
    displayed. The user types the first letter of a
    command, the command executes and everything is
    redisplayed. Some commands prompt the user for more
    information.

    Here is a description of the various commands:

    next: The next page is displayed. Does nothing if the
    last line of the file is already displayed.

    previous: The previous page is displayed. Does
    nothing if the first line of the file is already
    displayed.

Figure 4.3: Specification of the file viewer (part 2 of 3)


    open: Asks for a file name (with prompt "file:") and
    displays that file. If a file named X does not open,
    the message "ERROR: Could not open X" is displayed
    just before the file name is redisplayed.

    quit: Stops the program.

NOTES FOR LATER VERSIONS

    Add more error-checking. For example, check that
    commands are entered properly and that the window
    height is a positive integer.

Figure 4.4: Specification of the file viewer (part 3 of 3)

2. The storage of the buffer.

3. The execution of the commands.

Ideally, we would like to assign these tasks to three separate components. But the execution of the commands is highly dependent on the exact way in which the buffer is stored, so it makes sense to assign these two tasks to a single component that we call Buffer. The overall control of the program, including most of the user interaction, is assigned to a FileViewer component.

Here are some more details on the program's design. The FileViewer has a single public method, as shown in Figure 4.5. The FileViewer stores the lines of text in a Buffer object. It also delegates the displaying of the text and the execution of the commands to that Buffer object. Accordingly, the Buffer class has several public methods, including one for displaying the buffer and one for each of the file viewer commands, as shown in Figure 4.6.


class FileViewer
{
public:
    void run();

private:
    void display();
    void execute_command(char command, bool & done);

    Buffer buffer_;
    int window_height_;
    std::string error_message_;
};

Figure 4.5: Declaration of FileViewer

We now turn to the implementation of the program. As stated earlier, this consists of coding and testing each component of the program.

Figure 4.7 shows the implementation of the FileViewer method run. It is essentially a loop that runs until the user enters the quit command. The method run delegates some of its work to two helper methods, display and execute_command. These methods are not part of the interface of the class; that is, users of the class should not use these methods. That's why they are declared private, as can be seen in Figure 4.5.

The method display is responsible for displaying the buffer and the menu of commands. Its implementation is shown in Figure 4.8. The data member error_message_ is set when an error occurs during the execution of a command. In this version of the file viewer, the only error that can occur is that a file doesn't open.

The implementation of execute_command is shown in Figure 4.9. It's basically a switch statement. The argument done is set to true when the quit command is executed.

class Buffer
{
public:
    void display() const;
    const std::string & file_name() const
    {
        return file_name_;
    }
    void move_to_next_page();
    void move_to_previous_page();
    bool open(const std::string & file_name);
    void set_window_height(int h)
    {
        window_height_ = h;
    }

private:
    std::vector<std::string> v_lines_;
    int ix_top_line_ = 0;
    std::string file_name_;
    int window_height_;
};

Figure 4.6: Declaration of Buffer

void FileViewer::run()
{
    cout << "Window height? ";
    cin >> window_height_;
    cin.get();  // '\n'
    cout << '\n';
    buffer_.set_window_height(window_height_);

    bool done = false;
    while (!done) {
        display();

        cout << "choice: ";
        char command;
        cin >> command;
        cin.get();  // '\n'

        execute_command(command, done);

        cout << endl;
    }
}

Figure 4.7: The FileViewer method run

void FileViewer::display()
{
    const string long_separator(50, '-');
    const string short_separator(8, '-');

    if (!error_message_.empty()) {
        cout << "ERROR: " + error_message_ << endl;
        error_message_.clear();
    }

    string file_name = buffer_.file_name();
    if (file_name.empty())
        cout << "<no file opened>\n";
    else
        cout << file_name << endl;

    cout << long_separator << endl;
    buffer_.display();
    cout << long_separator << endl;
    cout << " next previous open quit\n";
    cout << short_separator << endl;
}

Figure 4.8: The FileViewer method display

void FileViewer::execute_command(char command,
                                 bool & done)
{
    switch (command) {
        case 'n': {
            buffer_.move_to_next_page();
            break;
        }

        case 'o': {
            cout << "file name: ";
            string file_name;
            getline(cin, file_name);
            if (!buffer_.open(file_name))
                error_message_ =
                    "Could not open " + file_name;
            break;
        }

        case 'p': {
            buffer_.move_to_previous_page();
            break;
        }

        case 'q': {
            done = true;
            break;
        }
    }
}

Figure 4.9: The FileViewer method execute_command

Figures 4.10 and 4.11 show the implementation of the Buffer methods that weren't already implemented in the class declaration. In the open method, the contents of the vector are cleared only after we know that the new file opened successfully.
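The idea of clearing the vector only after a successful open can be illustrated in isolation. The following is a minimal sketch, not the code of the file viewer: load_lines is a hypothetical free function, introduced here only for illustration, that follows the same logic as the open method of Figure 4.11.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the file viewer): replaces the
// contents of lines with the lines of the file called file_name.
// Returns false, leaving lines untouched, if the file does not open.
bool load_lines(const std::string & file_name,
                std::vector<std::string> & lines)
{
    std::ifstream file(file_name);
    if (!file)
        return false;   // failure: the old contents are preserved

    lines.clear();      // it is now safe to discard the old contents
    std::string line;
    while (std::getline(file, line))
        lines.push_back(line);
    return true;
}
```

If the vector were cleared before the file is opened, a failed open would leave the viewer with nothing to redisplay.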

After each component is coded, it should be tested on its own. Sometimes, all that this requires is the writing of a test driver, that is, a piece of code whose sole purpose is to test our component. For example, in the file viewer, we can write a test driver to test the various methods of the class Buffer.

But if a component uses other components, then it cannot be tested in complete isolation. This is the case with the class FileViewer because this class uses Buffer.

One option is to wait until the other components have been implemented. In the file viewer, we would wait for Buffer to be coded and tested before testing FileViewer. In a single-person project, this would work well. But in a multi-person project, we'd like to have all the developers coding and testing their respective components at the same time.

So another option is to create stubs for the other components. A stub is a dummy version of a component: it has the same interface but performs none or only a very small part of the responsibilities of the component. For example, to test FileViewer, we could create a version of Buffer that does not read from a file and always prints the same lines. Testing FileViewer with this dummy Buffer would allow us to verify that FileViewer performs the user interaction properly. Of course, this doesn't tell us if FileViewer will work properly with the real Buffer. This is to be determined at integration testing, after both components have been implemented and tested on their own.
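To make the idea of a stub more concrete, here is one possible dummy Buffer. This is only a sketch: the public interface is copied from Figure 4.6, but the bodies of the methods and the text that is printed are assumptions made for this illustration.

```cpp
#include <iostream>
#include <string>

// A stub with the same public interface as the Buffer class of
// Figure 4.6. It never reads a file: open always succeeds and display
// always prints the same numbered lines. (The fixed text is an
// assumption made for this sketch.)
class Buffer
{
public:
    void display() const
    {
        for (int i = 0; i < window_height_; ++i)
            std::cout << i + 1 << " stub line\n";
    }
    const std::string & file_name() const { return file_name_; }
    void move_to_next_page() {}       // does nothing in the stub
    void move_to_previous_page() {}   // does nothing in the stub
    bool open(const std::string & file_name)
    {
        file_name_ = file_name;       // pretend that the file opened
        return true;
    }
    void set_window_height(int h) { window_height_ = h; }

private:
    std::string file_name_;
    int window_height_ = 0;
};
```

Because the stub has the same name and the same public methods as the real class, FileViewer can be compiled and tested against it without any change.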

The complete source code and documentation of the file viewer are available on the course web site under FileViewer.

inline void Buffer::move_to_next_page()
{
    ix_top_line_ += window_height_;
    if (ix_top_line_ >= v_lines_.size())
        ix_top_line_ -= window_height_;
}

inline void Buffer::move_to_previous_page()
{
    ix_top_line_ -= window_height_;
    if (ix_top_line_ < 0)
        ix_top_line_ = 0;
}

void Buffer::display() const
{
    int ix_stop_line_ = ix_top_line_ + window_height_;
    for (int i = ix_top_line_; i < ix_stop_line_; ++i) {
        if (i < v_lines_.size())
            cout << std::setw(6) << i+1 << " "
                 << v_lines_[i];
        cout << endl;
    }
}

Figure 4.10: Implementation of some of the Buffer methods

bool Buffer::open(const string & new_file_name)
{
    std::ifstream file(new_file_name);
    if (!file)
        return false;

    v_lines_.clear();

    string line;
    while (getline(file, line))
        v_lines_.push_back(line);

    file_name_ = new_file_name;
    ix_top_line_ = 0;
    return true;
}

Figure 4.11: Implementation of the Buffer method open

Study Questions

4.5.1. What does working through scenarios mean?

4.5.2. Why should we aim for a design in which each component has its own secret?

4.5.3. What is the what-who cycle?

4.5.4. What is responsibility-driven design?

4.5.5. What is a stub or dummy component?

Exercises

4.5.6. Modify the file viewer as described below. Change the original specification, design and implementation as little as possible. Error messages should be displayed at the top of the window, just like when a file does not open.

a) Add a jump command that asks the user for a line number and redisplays the file with the requested line at the top. In case the user enters an invalid line number N, the program should print the error message ERROR: N is not a valid line number.

b) Add a search command that asks the user for a string and finds the first line that contains that string. The search starts at the first line currently displayed and stops at the end of the file. If the string is found, the line that contains it is displayed at the top. In case the user enters a string X that does not occur in the file, the program should print the error message ERROR: string "X" was not found.

c) Modify the search command so that the search wraps around to the beginning of the file when the end of the file is reached.

d) Add an again command that repeats the last search. If that last search was the previous command, then the new search starts at the second line that is currently displayed. If no search has been previously performed, then the again command asks the user for a string.

4.5.7. Add more error checking to the file viewer. Change the program as little as possible.

a) Make the program check that the user enters a positive integer as the window height. If not, the program should print an error message and ask again.

b) Make the program check that the user enters either the whole name of a command or the first letter of the command. If the user enters an invalid command X, the program should print the error message ERROR: X is not a valid command. The error message should be displayed at the top of the window, just like when a file does not open.

4.6 Arrays

In this chapter, we learned that C++ vectors can be used to store sequences of elements. But there is a simpler alternative: ordinary arrays. Table 4.4 shows how arrays can be created and how their elements can be accessed.

T a[N]
T a[N] = {elements}
T a[] = {elements}
    Creates an array a that can hold elements of type T. The array is
    initialized to contain N default-initialized elements of type T, or
    copies of the given elements. In case the number of given elements
    is less than N, the remaining elements of a are value-initialized.
    N must be a compile-time constant.

a[i]
    Returns a reference to the element at index i in array a.

Table 4.4: Some basic array operations

In the first form of array declaration, the elements of the array are default-initialized. Recall that this means that if the array elements are primitive values, then the elements are not initialized. If the array elements are objects, then they are initialized with their default constructor. If the array elements are themselves arrays, then the elements of each of these arrays are default-initialized.

In the second form of array declaration, elements that are not given explicit initial values are value-initialized. Recall that this means that if these elements are primitive values, then the elements are set to 0. If these elements are objects, then they are initialized with their default constructor. If these elements are themselves arrays, then the elements of each of these arrays are value-initialized. (See the footnote in Section 4.2 for an exception.)
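The second and third forms of array declaration can be checked with a small experiment. The two functions below are hypothetical and exist only for this illustration. The second one uses the sizeof operator, which is not covered in this section, as one way of observing the size that the compiler gave to the array.

```cpp
// Returns the number of elements equal to 0 in an array of 5 integers
// for which only the first two initial values are given.
int zeros_after_partial_initialization()
{
    int a[5] = {1, 2};  // a[2], a[3] and a[4] are value-initialized
    int count = 0;
    for (int i = 0; i < 5; ++i)
        if (a[i] == 0)
            ++count;
    return count;
}

// Returns the number of elements of an array declared with the third
// form, where the size is deduced from the number of initial values.
int deduced_size()
{
    int b[] = {10, 20, 30};
    return static_cast<int>(sizeof(b) / sizeof(b[0]));
}
```

Both functions should return 3: three elements of a are set to 0, and b is given exactly three elements.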

Ordinary C++ arrays have one advantage over vectors: when used in some contexts, the array indexing operator runs slightly more quickly than the vector indexing operator.

However, ordinary arrays have several major weaknesses compared to vectors. In fact, vectors were developed precisely to address these weaknesses.


void print(const int a[], int n)
{
    for (int i = 0; i < n; ++i)
        cout << a[i] << '\n';
}

Figure 4.12: A function that prints an array

The first weakness, and perhaps the main one, is that ordinary C++ arrays have a predetermined size. Predetermined here means determined at compile time, before the execution of the program. The size of an array is also fixed: it cannot change during the execution of the program. This is a significant limitation that can make a program fail in case an array is too small, or waste memory in case an array is unnecessarily large.

In contrast, as we already know, vectors can be declared to be of any size and that size can be changed as needed during the execution of the program, by using methods such as resize, push_back and assign.
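As a reminder of how the size of a vector can change during execution, here is a short sketch. The function name grow_and_shrink is made up for this example.

```cpp
#include <vector>

// Builds a vector whose size changes several times during execution.
std::vector<int> grow_and_shrink()
{
    std::vector<int> v(3);  // size 3, elements set to 0
    v.push_back(7);         // size 4
    v.resize(6);            // size 6, the new elements are set to 0
    v.assign(2, 9);         // size 2, both elements equal to 9
    return v;
}
```

None of these size changes would be possible with an ordinary array, whose size is fixed when the program is compiled.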

Later in these notes, when we learn the techniques that are used to implement containers such as vectors, we will learn that arrays can be dynamically allocated. However, this involves low-level techniques that are somewhat inconvenient and prone to errors. In any case, all C++ arrays, even those that are dynamically allocated, have several other weaknesses.

First, arrays are not aware of their size. In particular, there is usually no reliable way for a function to figure out the size of an array argument, which is why we typically pass the size of the array as a separate additional argument. For example, Figure 4.12 shows a function that prints the contents of an array of integers. One danger is that nothing guarantees that the value of the size argument is correct. In contrast, vectors are aware of their size and whenever we need to know that size, we just have to ask by using the method size().
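The difference can be seen by writing the same function twice, once for arrays and once for vectors. This is a sketch for illustration only; the two sum functions are not part of the notes.

```cpp
#include <cstddef>
#include <vector>

// Array version: the caller must supply the size n, and nothing
// checks that n is correct.
int sum(const int a[], int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += a[i];
    return total;
}

// Vector version: the vector is simply asked for its size.
int sum(const std::vector<int> & v)
{
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}
```

A call such as sum(a, 5) on an array that holds only 4 elements would compile without complaint, which is exactly the danger mentioned above.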

Second, the usual operators, such as =, == and <, don't work with arrays. Actually, they can sometimes work but they don't do what you would expect. For example, if a and b are two arrays, then a = b does not compile when a is declared as an ordinary array variable. In the cases where it does compile (for example, when a and b are array arguments of a function), it causes the names a and b to refer to the same array. This is called a shallow copy. In contrast, we would likely want a = b to cause a to become a separate copy of array b. This is called a deep copy. With arrays, dynamically allocated or not, the only way to achieve a deep copy is to copy the elements one at a time, typically with a loop. This is inconvenient and a possible source of errors. In contrast, vectors support all the usual operators, including =, == and <.
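The following sketch contrasts the two situations. Both functions are hypothetical and written only for this illustration: copy_array performs a deep copy of an array with a loop, and vector_copy_is_deep checks that the = operator on vectors produces a separate copy.

```cpp
#include <vector>

// Deep copy of an array: the elements are copied one at a time.
void copy_array(const int source[], int destination[], int n)
{
    for (int i = 0; i < n; ++i)
        destination[i] = source[i];
}

// With vectors, = performs a deep copy and == compares the elements.
bool vector_copy_is_deep()
{
    std::vector<int> a = {1, 2, 3};
    std::vector<int> b;
    b = a;                            // b is now a separate copy of a
    bool equal_after_copy = (a == b);
    b[0] = 99;                        // changing b does not affect a
    return equal_after_copy && a[0] == 1;
}
```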

Third, it is not easy to have a function return a copy of an array. In particular, the return type of a C++ function cannot be an array. It is possible to get around this problem by using dynamically allocated arrays. But, once again, these techniques are inconvenient and prone to errors. In contrast, functions can return copies of vectors just as if they were a value of any of the primitive data types.⁵
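For example, a function that builds a sequence of numbers can simply return it as a vector. The function squares below is a made-up example.

```cpp
#include <vector>

// Returns a vector that contains the squares of 1, 2, ..., n.
std::vector<int> squares(int n)
{
    std::vector<int> result;
    for (int i = 1; i <= n; ++i)
        result.push_back(i * i);
    return result;
}
```

The caller can store the result directly, as in std::vector<int> v = squares(4). No equivalent declaration exists for ordinary arrays.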

To summarize, ordinary C++ arrays have the following four weaknesses: (1) they have a fixed, predetermined size, (2) they don't know their size, (3) they don't support the usual operators and (4) functions cannot easily return copies of arrays.

Study Questions

4.6.1. What are two advantages of arrays over vectors?

4.6.2. What are four advantages of vectors over ordinary arrays?

⁵ Note, however, that this is something that is usually done only with small vectors. It is more efficient to avoid the copying of the vector by passing it to the function as a reference argument.


Exercises

4.6.3. Modify the file viewer by using an array to implement the Buffer. Use an array of size 100,000. In case a file called X is too large, the program should print the error message ERROR: file X is too large and redisplay the previous file.

4.6.4. Repeat the previous exercise but this time, in case the file is too large, have the program use the array to store part of the file. When another part is needed, the program reopens the file to read that part and store it in the array.


Chapter 5

Generic Algorithms

The C++ Standard Template Library includes algorithms that solve a wide variety of common problems. In this chapter, we will learn how to use and implement some of these algorithms. In the process, we will become familiar with the important concepts of iterators, function templates, function objects and generic programming.

5.1 Introduction

Some problems come up very frequently in a wide variety of software projects. To illustrate, here are just three examples:

1. Computing the maximum of two values.

2. Counting the number of occurrences of an element in some container (a vector, for example).

3. Sorting the elements of a container in some order.


Standard algorithms have been developed for all of these common problems. Some are trivial: to compute the maximum of two values, simply compare them by using the less-than operator (<). Others are nontrivial but simple: to count the number of occurrences of an element in a sequence container such as a vector, initialize a counter to 0 then scan the sequence from beginning to end while adding one to the counter every time an occurrence of the element is seen. Some of these algorithms, on the other hand, can be fairly complex. This includes efficient algorithms for sorting vectors.¹
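The counting algorithm just described can be written by hand in a few lines. This is a sketch for a vector of integers; the name count_occurrences is chosen only for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Counts the occurrences of e in v by scanning v from beginning to end.
int count_occurrences(const std::vector<int> & v, int e)
{
    int counter = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] == e)
            ++counter;
    return counter;
}
```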

When needed, we can always implement these standard algorithms ourselves. But the STL includes functions that implement many of these algorithms. For example,

max(a, b)

returns the maximum of a and b. The number of occurrences of element e in vector v can be computed as follows:

count(v.begin(), v.end(), e)

And

sort(v.begin(), v.end())

rearranges the elements of vector v in increasing order. These functions are called STL algorithms.
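Here is a small sketch that calls the three algorithms on integers. The wrapper functions larger, occurrences and sorted_copy are hypothetical and serve only to make each call easy to try out.

```cpp
#include <algorithm>
#include <vector>

// Returns the larger of a and b.
int larger(int a, int b)
{
    return std::max(a, b);
}

// Returns the number of occurrences of e in v.
int occurrences(const std::vector<int> & v, int e)
{
    return static_cast<int>(std::count(v.begin(), v.end(), e));
}

// Returns a copy of v sorted in increasing order. The argument is
// passed by value, so the caller's vector is not modified.
std::vector<int> sorted_copy(std::vector<int> v)
{
    std::sort(v.begin(), v.end());
    return v;
}
```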

Using STL algorithms has a number of advantages compared to implementing algorithms ourselves. First, obviously, it saves time. Second, it leads to more reliable software because the STL algorithms have been more extensively tested than any of our own implementations ever could be. Third, it makes our programs easier to understand because other programmers are already familiar with the STL algorithms.

¹ Examples are mergesort, heapsort and quicksort. We will study mergesort and quicksort later in these notes. At Clarkson, heapsort is normally covered in the course CS344 Algorithms and Data Structures.

The use of STL algorithms, as well as the use of any software component from a standard library, is an example of software reuse. Reusing any software component, but especially standard components, generally gives those three benefits: faster development, increased reliability, and software that's easier to understand.

Note that the STL algorithms are generic, in the sense that they can be used on more than one type of argument. For example, max can be used on any type of value that supports the less-than operator (<). This includes not only numbers, like int's, but also strings and any other class for which the < operator is overloaded.²

The algorithm count can be used on any type of vector, as long as the element type supports the equality testing operator (==). And sort can be used on any vector whose element type supports the less-than operator (<). In the next section, we will see that the count and sort algorithms are actually even more generic than that since they can be used on a wide variety of containers, not just vectors.

² By strings here, we mean C++ strings (objects of class string). This is because, as is the case with most operators, the < operator does not work properly on C strings. This also applies to literal strings (e.g., "abc") because literal strings are stored as C strings. (Calling max on C strings stored in arrays of different sizes may actually result in a compiler error because the max algorithm requires that its arguments be of the same type.)

To use max on C strings, one solution is to explicitly convert the C strings into C++ strings, as in

max(string("alice"), string("bob"))

A more convenient solution is to specify that we want to use the max algorithm on C++ strings:

max<string>("alice", "bob")

The compiler will then perform implicit conversions to turn the arguments into C++ strings. The notation max<string> is similar to the notation we've used to specify the type of element stored in a vector: vector<string>.
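The generic nature of these algorithms can be seen by using them on values that are not integers. The following sketch uses max on doubles and on C++ strings, and sort on a vector of strings. The function names are made up for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// max used on values of type double.
double larger_double()
{
    return std::max(2.5, 1.5);
}

// max used on C++ strings: the literal strings are converted into
// C++ strings because the type is specified explicitly.
std::string larger_string()
{
    return std::max<std::string>("alice", "bob");
}

// sort used on a vector of strings: the strings are compared with <,
// which gives alphabetical order for these lowercase names.
std::vector<std::string> sorted_names()
{
    std::vector<std::string> names = {"carol", "alice", "bob"};
    std::sort(names.begin(), names.end());
    return names;
}
```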

Now, in the above examples, exactly what type of value are the arguments v.begin() and v.end()? This is what we will learn in the next section.

Study Questions

5.1.1. What are three benefits of using standard software components such as the STL algorithms?

5.1.2. What is a generic algorithm?

Exercises

5.1.3. Experiment with the three STL algorithms mentioned in this section by writing a test driver that uses each one of them on more than one type of argument. (These algorithms are defined in the library file algorithm and included in the std namespace.)

5.2 Iterators

In the previous section, we learned that the number of occurrences of element e in vector v can be computed as follows by using the STL algorithm count:

count(v.begin(), v.end(), e)

The first two arguments specify the portion of the vector over which the counting occurs. But exactly what type of value are those arguments?


These arguments are iterators. In fact, the STL algorithm count has the general form

count(start, stop, e)

where the arguments start and stop are iterators that specify the positions in a container where the counting will start and where it will stop.

All STL algorithms use iterators to specify positions. Another example is the sort algorithm we mentioned in the previous section:

sort(start, stop)

sorts the elements that occur in the portion of a container that begins at start and ends at stop.

We will soon learn exactly what iterators are and what we can do with them. But, first, why do STL algorithms use iterators and not indices to specify positions within a container? The answer is simple: indices are not an efficient way of accessing elements in most types of containers, including the list and map containers that we will study later in these notes. For this reason, most types of containers don't support indices (that is, they don't have an operation like an indexing operator that allows an element to be accessed given its index). Iterators, on the other hand, can be used to efficiently access elements in most types of containers, including vectors, lists and maps.

But then, why don't STL algorithms come in two versions, one for iterators and one for indices? For example, an index version of count could have the general form

count(container, i, j, e)

where the arguments i and j are indices that specify where the counting starts and stops. One possible answer is that since iterators can be used to efficiently access elements in a vector, the index version of count is not needed. By not including it, the STL is kept simpler.

Perhaps a better answer relates to a design issue. When writing code that involves vectors, it is sometimes necessary to use indices. But sometimes iterators would also do the job perfectly well. And in those situations, it is better to use iterators because the same code can later be reused with other containers, such as lists, that don't support indices. The fact that the STL only includes iterator versions of its algorithms encourages us to do precisely that: to use iterators, whenever possible, because this leads to code that's more general and has a greater chance of being reused.
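To illustrate this point, the sketch below makes exactly the same call to count on a vector and on a list. The list container is studied later in these notes; it is used here only to show that the iterator-based call does not change. The two function names are made up for this example.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// Counts the occurrences of e in a vector.
int count_in_vector(const std::vector<int> & v, int e)
{
    return static_cast<int>(std::count(v.begin(), v.end(), e));
}

// Counts the occurrences of e in a list. The call to count is the
// same as for a vector, even though lists don't support indices.
int count_in_list(const std::list<int> & ls, int e)
{
    return static_cast<int>(std::count(ls.begin(), ls.end(), e));
}
```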

So what exactly is an iterator? An iterator is a value, usually an object, whose most basic purpose is simply to mark a position within a container. We say that the iterator points to the element that occupies that position.

For example, consider

count(v.begin(), v.end(), e)

As we said earlier, this returns the number of occurrences of e in v. The begin method returns a begin iterator, one that points to the first element of the vector. The end method returns a special iterator called the end iterator. This iterator does not point to any element. Instead, it should be thought of as pointing to a position that's just past the last element of the container. This means that the second argument of count is not the position of the last element to be counted; it's the position of the first element not to be counted. In other words, count stops counting as soon as it reaches the position specified by its second argument.

In general,

count(start, stop, e)

counts from start to stop. In this context, the pair of iterators start and stop specifies a range of positions. In the STL, ranges of positions are always specified by two iterators that act as the endpoints of an interval. The first iterator is included in the range but the second one is not. Such a range can be represented by using the usual mathematical notation as [start, stop).

Now, let's say that we want to count the number of occurrences of e among the first 10 elements of v. To specify that range of positions, we need an iterator that points to v[0] and one that points to v[10], so that the counting ranges over the elements v[0] to v[9]. The begin iterator points to v[0]. An iterator that points to v[10] can be obtained by adding 10 to v.begin():

count(v.begin(), v.begin() + 10, e)

This works because with vector iterators, itr + i produces an iterator that points i positions to the right of itr. In other words, v.begin() + i converts index i into an iterator. But note that this doesn't work with the iterators provided by every type of STL container. We will revisit this issue in the next section.
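The same idea works for any number of elements, not just 10. Here is a sketch of a function that counts among the first n elements of a vector. The function count_in_first is hypothetical, and it assumes that n is between 0 and the size of the vector.

```cpp
#include <algorithm>
#include <vector>

// Counts the occurrences of e among the first n elements of v, that
// is, in the range [v.begin(), v.begin() + n).
// Assumes that 0 <= n <= v.size().
int count_in_first(const std::vector<int> & v, int n, int e)
{
    return static_cast<int>(
        std::count(v.begin(), v.begin() + n, e));
}
```

Note that when n is 0, the two iterators are equal, the range is empty and the count is 0.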

The STL algorithm count takes iterators as arguments and returns an integer. But STL algorithms can also return iterators. For example,

find(v.begin(), v.end(), e)

returns an iterator to the first occurrence of e in vector v. In general,

find(start, stop, e)

returns an iterator to the first occurrence of e in the range [start, stop). In case e is not found in that range, find returns stop. In the example above, find would return v.end(). To detect that possibility, we can simply compare the returned iterator to v.end():

if (find(v.begin(), v.end(), e) == v.end())
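This comparison can be packaged as a small function that tells us whether a vector contains a given element. This is a sketch; the name contains is chosen only for this illustration.

```cpp
#include <algorithm>
#include <vector>

// Returns true if e occurs in v. The element was found exactly when
// find returns something other than the end iterator.
bool contains(const std::vector<int> & v, int e)
{
    return std::find(v.begin(), v.end(), e) != v.end();
}
```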


We often need to store the iterator returned by a function like find. This can be done as follows:

auto itr = find(v.begin(), v.end(), e);

The keyword auto asks the compiler to deduce the appropriate type for the variable. In this case, itr will be given the type of the iterator returned by find.

The use of auto is a good idea here because iterator types are a bit messy. For example, if v is a (non-constant) vector of integers, then find in the above example would return an iterator of type

vector<int>::iterator

This is because each type of STL container provides its own type of iterator and that type is defined within the class itself. The double-colon (::) is used to access that type. In general, if Container is a container type, then

Container::iterator

is the type of iterator that works with Container.

Note that auto should only be used where it really improves readability. That's the case with complex types such as vector<int>::iterator but not with simple types such as int and string. In those cases, it's better to specify the type explicitly.

So find returns an iterator that points to an element. What if we wanted the index of that element? The index can be computed as the difference between the iterator and the begin iterator: itr - v.begin(). In general, itr2 - itr1 is the distance, in number of elements, between the two iterators. Since v.begin() points to the element with index 0, the distance itr - v.begin() is equal to the index of the element that itr points to.


    for (auto itr = find(v.begin(), v.end(), 5);
         itr != v.end();
         ++itr)
        cout << *itr << ' ';

Figure 5.1: Using iterators to traverse a container

In other words, itr - v.begin() converts iterator itr into an index. But note that this is another example of an operation that works with vector iterators but not with the iterators provided by every other type of STL container. Once again, we will say more about this in the next section.
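As a small illustration of this conversion, here is a sketch of a function that returns the index of the first occurrence of an element. The name index_of and the convention of returning -1 when the element is absent are assumptions made for this example; they are not part of the STL.

```cpp
#include <algorithm>
#include <vector>

using std::find;
using std::vector;

// Returns the index of the first occurrence of e in v,
// or -1 if e does not occur in v.
// (index_of is a hypothetical helper, not an STL function.)
int index_of(const vector<int> & v, int e)
{
    auto itr = find(v.begin(), v.end(), e);
    if (itr == v.end())
        return -1;  // not found: there is no index to report
    return static_cast<int>(itr - v.begin());  // iterator to index
}
```

Note that the comparison with v.end() comes first: the subtraction only gives a meaningful index when the element was found.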

As we said earlier, the most basic purpose of an iterator is to mark a position in a container. But an iterator can also be used to access the element at that position. For example, suppose that we need to change to 12 the first occurrence of 5 in vector v. This could be done as follows:

    auto itr = find(v.begin(), v.end(), 5);
    *itr = 12;

In this code, the dereferencing operator (*) is applied to the iterator returned by find. In general, if itr is an iterator, then *itr refers to the element that itr points to.
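The code above assumes that 5 actually occurs in v. When that is not guaranteed, the iterator should be compared to the end iterator before it is dereferenced. The following sketch both reads and writes through the iterator; the function name negate_first is hypothetical and chosen only for this illustration.

```cpp
#include <algorithm>
#include <vector>

using std::find;
using std::vector;

// Replaces the first occurrence of x in v by -x.
// Returns true if a change was made, false if x was not found.
// (negate_first is a hypothetical helper name.)
bool negate_first(vector<int> & v, int x)
{
    auto itr = find(v.begin(), v.end(), x);
    if (itr == v.end())
        return false;  // the end iterator must not be dereferenced
    *itr = -*itr;      // read the element, then write it back
    return true;
}
```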

Iterators can also be used to step through, or iterate over, the elements of a container. (That’s why they’re called iterators.) For example, suppose that we need to print all the elements of vector v starting at the first occurrence of 5. This can be done as shown in Figure 5.1. The iterator itr is initialized to the position of the first 5. At each iteration of the loop, the increment operator (++) is used to make the iterator point to the next element. Eventually, the iterator will move past the last element, to the position marked by the end iterator. This will cause the loop to terminate.


Iterators can also be used to step through all the elements of a container, as in the following example:

    for (auto itr = v.begin(); itr != v.end(); ++itr)
        cout << *itr << ' ';

But when iterating over an entire container, it is simpler to use a range-for loop:

    for (int x : v)
        cout << x << ' ';

Note that range-for loops are actually implemented by using iterators: in the above example, the range-for loop is essentially converted into the previous iterator-based loop. In fact, apart from built-in arrays, range-for loops can only be used on containers that provide begin and end operations.
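To make the equivalence concrete, here is a sketch of the same computation written both ways. The two function names are hypothetical; the point is only that the loops visit the same elements in the same order.

```cpp
#include <vector>

using std::vector;

// Sums the elements of v with an explicit iterator loop.
int sum_with_iterators(const vector<int> & v)
{
    int sum = 0;
    for (auto itr = v.begin(); itr != v.end(); ++itr)
        sum += *itr;
    return sum;
}

// Sums the elements of v with a range-for loop.
int sum_with_range_for(const vector<int> & v)
{
    int sum = 0;
    for (int x : v)
        sum += x;
    return sum;
}
```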

To summarize, in C++ and many other programming languages, iterators are a uniform and efficient way of specifying positions and accessing elements in various types of containers. An iterator is a value (usually an object) that marks a position in a container, provides access to the element at that position, and allows us to step through all the elements of a container. In the STL, positions are specified by using iterators. This allows STL algorithms to be highly generic in that they can be used on many different types of containers.

All standard types of iterators support the basic operations *, ++ and != mentioned in this section. Vector iterators also support +i and -. In the next section, we will describe in more detail the various operations supported by iterators.

Study Questions

5.2.1. What is an iterator?


5.2.2. In what way do iterators allow algorithms to be more generic?

5.2.3. Does it make sense to dereference the end iterator of a container?

5.2.4. What is the C++ keyword auto used for?

5.2.5. What are some of the operations supported by vector iterators?

5.2.6. How is a range of positions specified in the STL?

Exercises

5.2.7. Assuming that v is a vector of integers, write code fragments that perform the following tasks. Use the STL algorithms count and find.

a) Set integer variable count to the number of 0’s in the first half of v. (In case v is of odd length, consider that the middle element is in the second half of v.)

b) Change to 1 the first 0 that occurs in v. Do nothing if the vector does not contain any 0’s.

c) Add 1 to every element that precedes the first 0 in v. If v contains no 0’s, add 1 to every element in v.

5.2.8. Create the following functions, which are intended to be used on vectors of integers. In each case, the arguments start and stop are iterators of type vector<int>::iterator. The arguments e, x and y are integers.

a) count(start, stop, e) returns the number of occurrences of element e in the range [start,stop).

b) fill(start, stop, e) sets to e every element in the range [start,stop).

c) find(start, stop, e) returns an iterator to the first occurrence of element e in the range [start,stop). In case e is not found, stop is returned.

d) replace(start, stop, x, y) replaces by y every occurrence of element x in the range [start,stop).

(Note that these functions are special cases of the STL algorithms that have the same name. These special cases work only on vectors of integers but later in this chapter, we will learn to create functions that can be used on multiple types of containers, just like the STL algorithms.)

5.3 Iterator Types and Categories

In this section, we will learn that STL containers each provide several types of iterators. In addition, each iterator type belongs to a category that determines what operations the iterators of that type support.

Consider the code

find(v.begin(), v.end(), e)

If v is a non-constant vector, then v.begin() and v.end() return iterators of type iterator, which causes find to return an iterator of the same type. As mentioned in the previous section, this type of iterator can be used to modify the element that it points to.

If, on the other hand, v is a constant vector, then v.begin() and v.end() return iterators of type const_iterator, which once again causes find to return an iterator of the same type. These are called constant iterators because they cannot be used to modify the element they point to. This makes sense because it prevents the elements of a constant vector from being modified.

Now, suppose that v is a non-constant vector but that for some reason we do not wish to modify it. Then, in the above code, we could instead use the methods cbegin and cend:

find(v.cbegin(), v.cend(), e)

These methods always return constant iterators, even if the vector is non-constant. This would cause find to return a constant iterator, thereby protecting the vector elements from modification.
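The following sketch checks this claim at compile time. The function name contains_readonly is hypothetical. The static_assert simply confirms that the iterator returned by find has the constant iterator type, even though the vector itself is not constant.

```cpp
#include <algorithm>
#include <type_traits>
#include <vector>

using std::find;
using std::vector;

// Returns true if e occurs in v. The search uses constant iterators,
// so this code cannot modify v even though v is non-constant.
// (contains_readonly is a hypothetical helper name.)
bool contains_readonly(vector<int> & v, int e)
{
    auto itr = find(v.cbegin(), v.cend(), e);

    // itr is a constant iterator...
    static_assert(
        std::is_same<decltype(itr), vector<int>::const_iterator>::value,
        "cbegin and cend return constant iterators");

    // ...so a statement such as *itr = 0; would not compile here.
    return itr != v.cend();
}
```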

In addition to plain iterators and constant iterators, STL containers provide reverse iterators and constant reverse iterators. These iterators are of type reverse_iterator and const_reverse_iterator, respectively.

Reverse iterators, as the name indicates, are designed to traverse a container in reverse order. Containers that support these iterators also provide a method rbegin that returns a reverse iterator that points to the last element of the container and a method rend that returns a reverse iterator that points to a position just before the first element of the container. In addition, the ++ operator works in reverse direction with reverse iterators. The methods rbegin and rend also have versions crbegin and crend that always return constant reverse iterators.

For example, the following code prints the contents of vector v in reverse order:

    for (auto itr = v.crbegin();
         itr != v.crend();
         ++itr)
        cout << *itr;


And

find(v.rbegin(), v.rend(), e)

searches v in reverse order, therefore returning an iterator to the last occurrence of e.
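As an illustration, a reverse search can be turned into the index of the last occurrence. This is only a sketch: the name last_index_of and the -1 convention are assumptions, and the subtraction of reverse iterators relies on the fact that vector iterators support the - operation.

```cpp
#include <algorithm>
#include <vector>

using std::find;
using std::vector;

// Returns the index of the last occurrence of e in v,
// or -1 if e does not occur in v.
// (last_index_of is a hypothetical helper name.)
int last_index_of(const vector<int> & v, int e)
{
    auto ritr = find(v.rbegin(), v.rend(), e);
    if (ritr == v.rend())
        return -1;  // not found
    // v.rend() - ritr counts the elements from the front of v up to and
    // including the one ritr points to, so subtract 1 to get its index.
    return static_cast<int>(v.rend() - ritr) - 1;
}
```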

So iterators can be of four different types: plain, constant, reverse, constant reverse. But there are also several iterator categories. The two most important ones are bidirectional iterators and random-access iterators.

The difference between these iterator categories is in the set of operations that they support. Bidirectional iterators support the basic operations *, ->, ++, --, == and != described in Table 5.1.

The operator -> is called the dereference-and-select operator or, more simply, the arrow operator. It is a convenient combination of the dereferencing operator (*) followed immediately by the selection operator (.). For example, if itr points to a Time, then the message hours can be sent to *itr as either (*itr).hours() or itr->hours(). The arrow version is easier to both write and read.
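The two notations can be compared side by side in the sketch below. The class SimpleTime is a minimal stand-in written for this example only; it is not the Time class developed earlier in these notes, and the function total_minutes is likewise hypothetical.

```cpp
#include <vector>

using std::vector;

// A minimal stand-in for a time class (hypothetical, for illustration).
class SimpleTime {
public:
    SimpleTime(int h, int m) : hours_(h), minutes_(m) {}
    int hours() const { return hours_; }
    int minutes() const { return minutes_; }
private:
    int hours_;
    int minutes_;
};

// Returns the total number of minutes represented by the times in v.
int total_minutes(const vector<SimpleTime> & v)
{
    int total = 0;
    for (auto itr = v.begin(); itr != v.end(); ++itr) {
        total += 60 * (*itr).hours();  // dereference, then select
        total += itr->minutes();       // same idea with the arrow operator
    }
    return total;
}
```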

Random-access iterators support all the operations supported by bidirectional iterators plus the following ones: +i, -i, +=i, -=i, -, < and [i]. These operations are described in Table 5.2. Note that itr[0] is equivalent to *itr and that, in general, itr[i] is equivalent to *(itr + i).

Vector iterators are random-access iterators. C++ strings also provide random-access iterators. Lists and maps, two containers we will study later in these notes, only provide bidirectional iterators.
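Here is a short sketch that exercises several of the random-access operations on vector iterators. Both function names are hypothetical and were chosen for this illustration.

```cpp
#include <vector>

using std::vector;

// Returns the middle element of v, reaching it in a single step.
// Assumes that v is not empty.
int middle_element(const vector<int> & v)
{
    auto itr = v.begin();
    auto n = v.end() - v.begin();  // number of elements (operator -)
    return itr[n / 2];             // same as *(itr + n / 2)
}

// Returns true if the elements of v read the same in both directions.
bool is_palindrome(const vector<int> & v)
{
    if (v.empty())
        return true;
    auto left = v.begin();
    auto right = v.end() - 1;      // last element (operator -i)
    while (left < right) {         // operator <
        if (*left != *right)
            return false;
        ++left;
        --right;
    }
    return true;
}
```

With a container that only provides bidirectional iterators, neither function would compile as written, since -, [i] and < are not available there.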

*itr
    Returns a reference to the element that itr points to.

itr->f(args)
itr->var
    Accesses the element that itr points to and either sends message f with arguments args to it or returns a reference to data member var. The element must be an object or a structure.

++itr
--itr
    Makes itr point to the next or previous element.

itr1 == itr2
itr1 != itr2
    Returns true if itr1 and itr2 point or don’t point to the same element.

Table 5.1: Operations supported by bidirectional iterators

itr + i
itr - i
    Returns an iterator that points i positions to the right or left of itr.

itr += i
itr -= i
    Moves the iterator i positions to the right or left.

itr1 - itr2
    Returns the distance, in elements, between itr1 and itr2.

itr1 < itr2
    Returns true if itr1 points to the left of itr2.

itr[i]
    Accesses the element i positions to the right of the element that itr points to.

Table 5.2: Additional operations supported by random-access iterators

Now, what if itr is a bidirectional iterator and we need to move it i positions to the right? We can’t simply do itr += i since this only works with random-access iterators. We could write a loop that does ++itr i times. But a more convenient option is to use the library function advance, as in advance(itr, i). When given a random-access iterator, advance uses += i exactly once. When given a bidirectional iterator, advance uses ++ repeatedly.

next(itr, n)
prev(itr, n)
    Returns an iterator that points n positions to the right or left of itr. n can be negative. Uses + or - once if the iterator is random-access. Otherwise, uses ++ or -- repeatedly.

advance(itr, n)
    Advances itr by n positions. n can be negative. Uses += or -= once if the iterator is random-access. Otherwise, uses ++ or -- repeatedly.

distance(itr1, itr2)
    Returns the distance from itr1 to itr2, in elements. Uses - once if the iterators are random-access. Otherwise, uses ++ on itr1 until it reaches itr2. Assumes that itr2 is reachable from itr1.

Table 5.3: Some library functions useful with bidirectional iterators

Table 5.3 lists some other functions that can be used to essentially perform the +i, -i and - operations on bidirectional iterators. But note that when used on bidirectional iterators, those functions are all slower than the corresponding random-access iterator operations. If, in a particular situation, you find yourself using those functions often, it may be worth considering using a container that supports random-access iterators.

The functions of Table 5.3 are defined in the library file iterator and are part of the std namespace.
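The sketch below uses these functions on a list, a container studied later in these notes that only provides bidirectional iterators. The helper names element_at and position_of are hypothetical. Each call walks the list one element at a time, which is why this style is best reserved for occasional use.

```cpp
#include <iterator>
#include <list>

using std::list;

// Returns the element at index i of ls.
// Assumes that 0 <= i < ls.size().
// (element_at is a hypothetical helper name.)
int element_at(const list<int> & ls, int i)
{
    auto itr = ls.begin();
    std::advance(itr, i);  // itr += i would not compile for a list
    return *itr;
}

// Returns the index of the element that itr points to in ls.
// Assumes that itr is reachable from ls.begin().
int position_of(const list<int> & ls, list<int>::const_iterator itr)
{
    return static_cast<int>(std::distance(ls.begin(), itr));
}
```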


Study Questions

5.3.1. What is the difference between a plain iterator and a constant iterator?

5.3.2. What method always returns a constant begin iterator?

5.3.3. What position does a reverse begin iterator point to? What about a reverse end iterator?

5.3.4. In what direction does the -- operator move a reverse iterator?

5.3.5. What operations are supported by bidirectional iterators?

5.3.6. What operations are supported by random-access iterators?

Exercises

5.3.7. Experiment with the operations of random-access iterators by writing a test driver that uses all the operations shown in Table 5.2.

5.4 Vectors and Iterators

We now know that vectors provide random-access iterators. We also know that the methods begin, end and their variations can be used to obtain vector iterators. But there are several other vector operations that take iterators as arguments. Table 5.4 shows some of them.

Note that several vector operations may invalidate existing vector iterators. In general, whenever the size of a vector increases, all iterators may become invalid. When elements are removed from a vector, then the following iterators are invalidated: all iterators that point to the removed elements or to positions to the right of those elements, including the end iterator.


vector<T> v(start, stop)
    Creates a vector v that can hold elements of type T. The vector is initialized with copies of all the elements in the range [start, stop).

v.assign(start, stop)
    Asks vector v to change its contents to a copy of all the elements in the range [start, stop).

v.insert(itr, e)
    Asks vector v to insert, at the position indicated by the iterator itr, a copy of element e. An iterator that points to the new element is returned.

v.insert(itr, {elements})
v.insert(itr, start, stop)
    Asks vector v to insert, at the position indicated by the iterator itr, copies of the given elements, or copies of all the elements in the range [start, stop). An iterator that points to the first new element is returned. If the range is empty, itr is returned.

v.erase(itr)
    Asks vector v to delete the element that itr points to. An iterator that points to the next element is returned.

v.erase(start, stop)
    Asks vector v to delete all the elements in the range [start, stop). An iterator that points to the next element is returned.

Table 5.4: Some vector operations that involve iterators
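Because erase invalidates the iterator it is given, a loop that removes elements should continue from the iterator that erase returns. The following is a minimal sketch of that pattern; the name remove_all is hypothetical. It favors simplicity over speed, since each call to erase shifts the elements that follow.

```cpp
#include <vector>

using std::vector;

// Removes every occurrence of e from v.
// (remove_all is a hypothetical helper name.)
void remove_all(vector<int> & v, int e)
{
    auto itr = v.begin();
    while (itr != v.end()) {     // v.end() is re-evaluated at each iteration
        if (*itr == e)
            itr = v.erase(itr);  // itr is now invalid: use the returned iterator
        else
            ++itr;
    }
}
```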


Study Questions

5.4.1. What operations may invalidate vector iterators?

Exercises

5.4.2. Experiment with the vector operations that use iterators by writing a test driver that uses all the operations shown in Table 5.4.

5.5 Algorithms in the STL

This section presents several of the generic algorithms available in the STL. Many more are described in a reference such as [CPP].

Tables 5.5 and 5.6 list some generic algorithms that are designed to be used on sequences such as vectors. All of these algorithms work by traversing the sequence from one end to the other. Accordingly, their running time is proportional to the number of elements in the range. More formally, we say the running time of these algorithms is Θ(n) where n is the number of elements in the range.

We will define the notation Θ(n) precisely later in these notes. For now, we will only consider its intuitive meaning: a running time is Θ(n) when it is essentially proportional to n. Such a running time is called linear.

The find generic algorithm performs what is called a sequential search of the range. In case the elements in the range are sorted, the binary search algorithm runs much faster, as long as elements can be accessed quickly.

count(start, stop, e)
    Returns the number of occurrences of element e in the range [start,stop). Uses == on elements.

find(start, stop, e)
    Returns an iterator to the first occurrence of element e in the range [start,stop). Returns stop if e is not found. Uses == on elements.

max_element(start, stop)
min_element(start, stop)
    Returns an iterator that points to the maximum or minimum element in the range [start,stop). Returns stop if the range is empty. Uses < on elements.

copy(start1, stop1, start2)
    Copies the elements in the range [start1,stop1) to a range of positions that begins at start2. If n is the number of elements that are copied, returns an iterator that points n positions to the right of start2. Copies forward, from start1 to stop1. Typically not useful when start2 falls within the range [start1,stop1).

copy_backward(start1, stop1, stop2)
    Copies the elements in the range [start1,stop1) to a range of positions immediately to the left of stop2. If n is the number of elements that are copied, returns an iterator that points n positions to the left of stop2. Copies backward, from stop1 to start1. Normally used instead of copy when start2 falls within [start1,stop1).

Table 5.5: Some generic sequence algorithms (part 1 of 2)

fill(start, stop, e)
    Sets all the elements in the range [start,stop) to be copies of element e.

replace(start, stop, x, y)
    Replaces all occurrences of element x by a copy of element y in the range [start,stop). Uses == on elements.

reverse(start, stop)
    Reverses the order of the elements in the range [start,stop).

Table 5.6: Some generic sequence algorithms (part 2 of 2)

The STL includes several versions of the binary search algorithm. These are described in Table 5.7. When given random-access iterators, these algorithms run in time Θ(log n). (That’s the log in base 2 of n.) This is called logarithmic time. Since log 10^6 is about 20, in principle, we can expect a binary search of a vector of size one million to run about 50,000 times faster than a sequential search. That would be the difference between one millisecond and almost one minute.
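As an illustration of the binary search algorithms, the sketch below counts the occurrences of an element in a sorted vector using two binary searches instead of one sequential traversal. The name count_sorted is hypothetical, and the function is only correct under the stated assumption that the vector is sorted.

```cpp
#include <algorithm>
#include <vector>

using std::vector;

// Returns the number of occurrences of e in v.
// Assumes that v is sorted with respect to <.
// (count_sorted is a hypothetical helper name.)
int count_sorted(const vector<int> & v, int e)
{
    auto first = std::lower_bound(v.begin(), v.end(), e);  // first position for e
    auto last = std::upper_bound(v.begin(), v.end(), e);   // last position for e
    return static_cast<int>(last - first);
}
```

When e does not occur in v, both searches return the same position and the result is 0.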

Table 5.7 also includes a description of the generic algorithm sort we mentioned earlier in this chapter. This algorithm requires random-access iterators and runs in time Θ(n log n). In comparison, the simplest sorting algorithms all run in time Θ(n^2). Once again, in principle, we can expect the generic algorithm sort to run about 50,000 times faster than these simple sorting algorithms on a vector of size one million. That would be the difference between one second and about 14 hours. (We will study searching and sorting algorithms in more detail later in these notes.)
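One common use of sort is to bring equal values next to each other. The sketch below uses that idea to detect duplicates with one sort followed by one linear pass, rather than comparing every pair of elements. The name has_duplicates is hypothetical, and this is only one of several reasonable ways to solve the problem.

```cpp
#include <algorithm>
#include <vector>

using std::vector;

// Returns true if some value occurs more than once in v.
// v is passed by value so that the caller's vector is left unchanged.
// (has_duplicates is a hypothetical helper name.)
bool has_duplicates(vector<int> v)
{
    std::sort(v.begin(), v.end());
    for (auto itr = v.begin(); itr != v.end(); ++itr) {
        auto next_itr = itr + 1;
        if (next_itr != v.end() && *itr == *next_itr)
            return true;  // equal values are adjacent after sorting
    }
    return false;
}
```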

Table 5.8 lists some other useful generic algorithms that are typically used on non-container arguments. The generic algorithm swap is defined in the library file utility. The other STL generic algorithms mentioned in this section are all defined in the library file algorithm. All of them are part of the std namespace.


binary_search(start, stop, e)
lower_bound(start, stop, e)
upper_bound(start, stop, e)
    Performs a binary search in the range [start,stop). Assumes that the range is sorted with respect to <. Uses < on elements. The first version returns true if element e is present in the range. Otherwise, it returns false. The second version returns an iterator that points to the first position where e could be inserted in the range while preserving the order. The third version returns an iterator that points to the last such position. All versions run in time Θ(log n) if the arguments are random-access iterators. If the arguments are only bidirectional iterators, the running time is Θ(n).

sort(start, stop)
    Sorts the elements in the range [start,stop) using the < operator. Requires random-access iterators. Runs in time Θ(n log n).

Table 5.7: Some generic algorithms for searching and sorting

swap(x, y)
    Swaps the values of x and y.

max(x, y)
min(x, y)
max({elements})
min({elements})
    Returns the maximum or minimum of the given elements. Uses < to compare elements.

Table 5.8: Some other generic algorithms


Study Questions

5.5.1. What is the intuitive meaning of Θ(n)?

5.5.2. How does the running time of a binary search compare to that of a sequential search?

5.5.3. What is the running time of the generic algorithm sort?

Exercises

5.5.4. Explain why the generic algorithm copy is typically not useful when start2 falls within the range [start1,stop1).

5.5.5. Experiment with the STL searching and sorting algorithms by writing a test driver that uses all the algorithms shown in Table 5.7. To see the difference between lower_bound and upper_bound, make sure you search for an element that occurs multiple times in the sequence.

5.6 Implementing Generic Algorithms

In this section, we will learn how to implement generic algorithms. Recall that this means algorithms that can be used on arguments of more than one type.

Consider the max generic algorithm. Figure 5.2 shows an implementation of an integer version of this algorithm. If ever we needed a version for another type of argument, all we would have to do is replace the three occurrences of int by the other type. For example, Figure 5.3 shows a string version.

    inline const int & max(const int & x, const int & y)
    {
        return (x < y ? y : x);
    }

Figure 5.2: An integer version of the max algorithm

    inline const string & max(const string & x,
                              const string & y)
    {
        return (x < y ? y : x);
    }

Figure 5.3: A string version of the max algorithm

To simplify this process, and to reduce the risk of errors, we could explicitly identify the types that must be changed to produce a new version of the algorithm, as shown in Figure 5.4. The result is a template in which the generic name T is used for the type of value being compared. To make things clear, the template definition begins with a declaration that identifies T as a template parameter. Then, whenever a version of max is needed for a particular type of value, all we have to do is copy the template and replace every occurrence of T by the desired type.

    template <typename T>
    inline const T & max(const T & x, const T & y)
    {
        return (x < y ? y : x);
    }

Figure 5.4: A template for the max algorithm

What we just described is a fairly mechanical process that’s a good candidate for automation. And, in fact, C++ compilers can do just that. When, for example, max is called with integer arguments, the compiler looks for a max function that can take two integers as argument. When that fails, the compiler then looks for a template that can be used to generate such a function. The template of Figure 5.4 will work if T is set to int. The process of generating a function out of a template is called template instantiation.³
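The sketch below shows template instantiation at work. The template is the one described in the text, renamed my_max here (a hypothetical name) so that it cannot clash with the library's own max. The three small functions exist only to show calls that each cause a different instantiation.

```cpp
#include <string>

using std::string;

// The max template from the text, renamed to avoid any clash with
// the library's max. (my_max is a hypothetical name.)
template <typename T>
inline const T & my_max(const T & x, const T & y)
{
    return (x < y ? y : x);
}

// T is deduced to be int.
inline int larger_int() { return my_max(3, 8); }

// T is deduced to be string.
inline string larger_string()
{
    return my_max(string("apple"), string("pear"));
}

// The arguments have different types (int and double), so T cannot
// be deduced. Here T is given explicitly.
inline double larger_double() { return my_max<double>(3, 8.5); }
```

Each function returns its result by value, so the reference returned by my_max is copied before the temporary arguments go away.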

Strictly speaking, a function template is not a function but we can think of it as a generic function, that is, a function that can work on more than one type of argument. The creation of generic functions is an example of generic programming, the writing of code that can be used on a variety of data types. We will soon learn that classes, too, can be generic.⁴

Another example of a generic function is shown in Figure 5.5. As documented there, it’s a function that prints the contents of a vector on a single line. The documentation also clearly states a condition that the type T must meet. It is important to clearly identify such requirements when designing generic functions. (In the case of the generic algorithm max, this was done for us by the C++ standard. See [CPP], for example.)

Figure 5.6 shows an implementation of the generic algorithm count. Note that this template has two parameters, one for the type of iterator and one for the type of element these iterators point to.

You may have noticed that in our implementation of count, the iterator arguments are passed by value. Up until now, we have been using the general rule that only small primitive values such as integers should be passed by value.

³ The template declaration of Figure 5.4 uses the keyword typename to declare the template parameter T. An alternative is to use the keyword class. This is allowed even if the type may not be a class. Although the keyword typename is more accurate, the use of the keyword class is widespread.

⁴ In fact, Bjarne Stroustrup, the original designer of C++, describes the language as a better C that supports data abstraction, object-oriented programming and generic programming [Str].

    // Prints the elements of v to cout on a single line,
    // separated by a single space.
    // Requirement on T: values can be printed to cout by
    // using the output operator (<<).
    template <typename T>
    void println(const vector<T> & v)
    {
        for (const T & e : v)
            cout << e << ' ';
        cout << endl;
    }

Figure 5.5: The generic function println

    template <class Iterator, class T>
    int count(Iterator start, Iterator stop, const T & e)
    {
        int count = 0;
        for (Iterator itr = start; itr != stop; ++itr) {
            if (*itr == e)
                ++count;
        }
        return count;
    }

Figure 5.6: An implementation of the generic algorithm count


namespace my {

... (your code)

} // namespace my

Figure 5.7: A namespace

Iterators often contain data that’s no larger than a single integer, but not always. Still, it is generally better to pass iterators by value. The reason is that iterator arguments tend to be used repeatedly and accessing an iterator through a reference takes longer than accessing it directly. For example, in our implementation of count, if the number of elements in the range is n, then the iterator stop will be accessed n+1 times. It is more efficient to copy stop once when passing it as argument and then be able to directly access a copy of it. Therefore, as a general rule, we will always pass iterator arguments by value. The STL does the same.

The exercises of this section ask you to create your own implementations of various STL algorithms. To avoid possible conflicts with your compiler’s implementations, and to allow you to choose which implementation you want to use in a particular piece of code, it is best to place your implementations in your own namespace, as shown in Figure 5.7. Then, to use your own implementation of count, for example, you would specify the namespace when calling the function (my::count) or you would include a using declaration in your code (using my::count).

Note that if the algorithm library is included in your code and you specify using namespace std, then a declaration such as using my::count will produce a conflict. It will cause the compiler to complain that any call to count is ambiguous. This simply means that the compiler doesn’t know which version of count to use. In such a situation, you must specify the namespace when calling the function, as in std::count or my::count.

And this problem can occur even if you don’t specify using namespace std. For example, if you use count on a vector, then the compiler will consider using std::count because this algorithm is in the same namespace as vector. This is a consequence of what is known as argument-dependent lookup. In such a situation, you must also specify the namespace when calling the function.⁵
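The following sketch places the count implementation of Figure 5.6 in the namespace my and calls both versions with explicit qualification, which sidesteps the ambiguity described above. The wrapper functions zeros_mine and zeros_std are hypothetical names used only to show the two qualified calls.

```cpp
#include <algorithm>
#include <vector>

namespace my {

// The implementation of count shown in Figure 5.6.
template <class Iterator, class T>
int count(Iterator start, Iterator stop, const T & e)
{
    int count = 0;
    for (Iterator itr = start; itr != stop; ++itr) {
        if (*itr == e)
            ++count;
    }
    return count;
}

}  // namespace my

// Counts the zeros in v with our own implementation.
int zeros_mine(const std::vector<int> & v)
{
    return my::count(v.begin(), v.end(), 0);
}

// Counts the zeros in v with the library implementation.
int zeros_std(const std::vector<int> & v)
{
    return static_cast<int>(std::count(v.begin(), v.end(), 0));
}
```

Because each call names its namespace, the compiler never has to choose between the two versions.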

Study Questions

5.6.1. What is a generic function?

5.6.2. What C++ construct allows us to implement generic functions?

5.6.3. What is generic programming?

Exercises

5.6.4. Implement your own version of the following generic algorithms.

a) swap.

b) max_element.

c) copy.

d) copy_backward.

e) reverse. Hint: Make sure you consider sequences of even and odd length.

⁵ Note that this could also be an issue even if the library algorithm is not included in your code. This is because other library files may include algorithm or portions of it. For example, as of March 2018, in the version of g++ that comes with the latest version of Code::Blocks, the string library includes the library file that contains the algorithm max. As a consequence, the conflict occurs with max and strings even if the algorithm library is not included in your code.


Bibliography

[Bro87] Fred P. Brooks. No silver bullet: Essence and accidents of software engineering. Computer, 20(4):10–19, 1987.

[CPP] cppreference.com. Web. Last accessed February 2018. http://cppreference.com.

[LB03] Craig Larman and Victor R. Basili. Iterative and incremental development: A brief history. Computer, 36(6):47–56, 2003.

[SS] Bjarne Stroustrup and Herb Sutter. C++ Core Guidelines. Web. Last accessed February 2019. https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md.

[Str] Bjarne Stroustrup. The C++ programming language. Web. Last accessed April 2017. http://www.stroustrup.com/C++.html.

[TA] ACM Turing Award. Web. Last accessed April 2017. http://amturing.acm.org/.


Index

abstract data type (ADT), 12
abstraction, 11
    data, 12
    procedural, 11
agile software development, 106
argument
    default, 45
array
    dynamically allocated, 122
auto, 132

class, 7, 16
cohesion, 9
constructor, 24
    compiler-generated, 30
    default, 24
container, 93
conversion
    implicit, 28
copy
    deep, 123
    shallow, 123
copy constructor, 97
coupling, 9

delegating constructors, 26
derived class, 78
design, 102
dummy component, 116
dynamic, 93

encapsulation, 7
evolution, 103

file
    header, 56
    implementation, 56
friend, 7, 52
function
    standalone, 37

generic
    algorithm, 127
    container, 98
    function, 150


    programming, 150

implementation, 102
in-class initializers, 30
incremental development, 105
independence, 9
information hiding, 11
inheritance, 81
initialization
    default, 97
    list, 28
    value, 97
    zero, 97
initializer, 25
initializer list, 97
inline, 35
integrated development environment (IDE), 57
iterative development, 105
iterator, 129
    begin, 130
    bidirectional, 138
    category, 138
    constant, 137
    end, 130
    random-access, 138
    reverse, 137

Java, 18

maintenance, 103
make, 62
Makefile, 62
member
    function, 18
message, 16
method, 16
    constant, 22
    get, 40
    set, 43
mixed-case format, 15
modularity, 9
    advantages, 9

naming convention, 15
null
    character, 65

object, 16
    default, 24
operator
    dereference-and-select (->), 138
    dereferencing (*), 133
    increment (++), 133
    overloading, 49

polymorphism, 81
private, 7
programming


    imperative, 16
    object-oriented (OOP), 16, 85
public, 18

receiver, 16
reference
    constant, passing by, 4
responsibility-driven design, 107
reuse, 127

scenarios, 107
searching
    sequential search, 144
software life cycle, 103
specification, 101
Standard Template Library (STL), 98
stream
    I/O, 78
    state, 81
    string, 86
string
    C, 65
    C++, 72
stub, 116
subclass, 78

template
    function, 148
    instantiation, 150
    parameter, 149
testing
    integration, 102
    unit, 102

underscore format, 15

vector, 93
    capacity, 98

waterfall model, 104
what-who cycle, 107