Chapra & Canale 1998, Numerical methods for engineers

- C a m w d v e oavmge of Numerical AMetbo& inahding treatment of optimimrtioa and partial differenti* , _ - ;- .,._ cquaJhm - - -.

~trong ~ o m ~ u t e r %entation ini?b&~. . pseubmde algorithms and overviews of how numdcal methods can be applied with a wide variety of mftwam packages

- 1 \ inctuding h 4 a a ~ a t l a b , . ~ x & ~ , . 1 &d '. TMSL.

,, u More than SO0 problems drawn from dt%d enginwriag pmctice reveal real , applicatjons of text material.

\\ Excellant e x a m m h s ~ ~ qxm _- "" + #-*tin~ so that every

~ ~ ~ ~ H ~ ~ ~ ~ I L I M ' w ~ w ~ ) jpply tbw skill6 In thsit'&mtn Geld *, YSarpfui, scpmte Appstrdiw, Wetting

- with .." M a t h c a d , ~ s @ U % p s d o ~ . -A,.

~mdoGsib vetaston of Numeric$ Methods' "Electronic TOOLKIT" packaged w e text.

ISBN 0-07-115895-2

Steven C. Chapra Professor of Civil Engineering

The University of Colorado Professor Emeritus of Civil Engineering

University of Michigan

Numerical Methods for Engineers W i ~ h Programming and Software Applications

Third Edition

Boston Burr Ridge, IL Dubuque, IA Madison, WI New York San Francisco St. Louis Bangkok Bogoti Caracas Lisbon London Madrid

Mexico City Milan New Delhi Seoul Singapore Sydney Taipei Toronto

WCB/McGraw-HiEE A Division of 7'heMcGrawHd Companies

E'

NUMERICAL METHODS FOR ENGINEERS: WITH PROGRAMMING AND SOFTWARE APPLICATIONS International Editions 1998

Exclusive rights by McGraw-Hill Book Co - Singapore, for manufacture and export. This book cannot be re-exported from the country to which it is consigned by McGraw-Hill.

Copyright O 1998 by The McGraw-Hill Companies, Inc. All rights reserved. Previous editions O 1985 and 1988 by McGraw-Hill, Inc. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a data base or retrieval system, without the prior written permission of the publisher.

Library of Congress Cataloging-in-Publication Data

Chapra, Steven C. Numerical methods for engineers: with programming and software

applications / Steven C. Chapra, Raymond P. Canale. - 3rd ed. p. cm.

Includes index. ISBN 0-07-0 10938-9 1. Engineering mathematics-Data processing. 2. Numerical

calculations-Data processing. 3. Microcomputers-Programming. I. Canale, Raymond P. 11. Title. TA345.C47 1998 5 19.4'024'62-dc2 1 97-37136

When ordering this title, use ISBN 0-07-115895-2

Printed in Singapore

Margaret and Gabriel Chapra

Helen and Chester Canale

Steve Chapra teaches in the Civil, Environmental, and Architectural Engineering De- partment at the University of Colorado. His other books include Surface Water-Quality Modeling and Introduction to Computing for Engineers.

Dr. Chapra received engineering degrees from Manhattan College and the University of Michigan. Before joining the faculty at the University of Colorado, he worked for the Environmental Protection Agency and the National Oceanic and Atmospheric Administra- tion, and was an Associate Professor at Texas A&M University. His general research inter- ests focus on surface water-quality modeling and advanced computer applications in environmental engineering.

He has received a number of awards for his scholarly contributions, including the 1993 Rudolph Hering Medal (ASCE) and the 1987 Meriam-Wiley Distinguished Author Award (American Society for Engineering Education). He has also been recognized as the outstanding teacher among the engineering faculties at both Texas A&M University (1986 Tenneco Award) and the University of Colorado (1992 Hutchinson Award).

Raymond P. Canale is an emeritus professor at the University of Michigan. During his over 20-year career at the university, he taught numerous courses in the area of computers, numerical methods, and environmental engineering. He also directed extensive research programs in the area of mathematical and computer modeling of aquatic ecosys- tems. He has authored or co-authored several books and has published over 100 scientific papers and reports. He has also designed and developed personal computer software to facilitate engineering education and the solution of engineering problems. He has been given the Meriam-Wiley Distinguished Author Award by the American Society for Engineering Education for his books and software and several awards for his technical publications.

Professor Canale is now devoting his energies to applied problems, where he works with engineering firms and industry and governmental agencies as a consultant and expert witness.

PREFACE

This third edition of Numerical Methods for Engineers differs from the second edition in five key ways:

1. Inclusion of sections on major software packages and libraries. At appropriate points throughout the text, we present overviews of how numerical methods can be implemented on a variety of popular software packages and libraries: Mathcad, Excel, MAT- LAB, and the IMSL software library. The intent here is to acknowledge and support the increased use of these tools for numerical analysis. We provide students with the necessary guidance to implement the methods with these packages and relate them to the theory in the text. Several homework problems in each part are also designed to illustrate how the packages can be employed. Prerequisite information (e.g., "Getting Started with Mathcad") is included as appendices.

2. Computer languages. Our continued emphasis on algorithms and program structure may seem somewhat archaic to some instructors. Although we understand that canned software packages are important, we do not hold with those who contend that their use will make programming obsolete. In fact, even if packages were to become the primary tools for numerical calculations, advanced applications involving macros and scripts all require fundamental knowledge of programming and algorithm structure. As in the previous edition, we predominantly use pseudocode to describe our algorithms. In contrast to the previous edition, where the emphasis was on complete programs, this edition stresses a more modular approach using subroutines and functions.

3. New material. Although there are many minor changes and refinements throughout the text, there are only two major additions to the general coverage. First, we have included a major new part on optimization. Second, we have added a new chapter on determining the roots of polynomials. The optimization section was developed because (1) engineering students are increasingly using optimization and (2) optimization is used in a variety of numerical methods contexts such as root location and regression.

4. New homework problems. The homework problems have been significantly re- vamped. In particular, we have modified most of the problems from the previous edition as well as including a number of new problems in every chapter.

x PREFACE

5. Windows version of Numerical Methods TOOLKIT. We have developed a Windows version of the Numerical Methods TOOLKIT that is included with the text on a 3 112-inch diskette. Aside from enhancing some of the methods (e.g., calculation of matrix inverse, integration of several simultaneous ODES), the new version is much more user friendly because of its integration with Windows. Thus, built-in Windows utilities, such as printer output, can be exploited. Aside from student use, the software has been designed to expedite classroom demonstrations by the instructor.

Aside from these additions, the third edition is very similar to the second edition in most other respects. In particular, we have endeavored to maintain most of the features contributing to its pedagogical effectiveness. These include the overall organization, the use of introductions and epilogues to consolidate major topics, and the extensive use of worked examples and engineering applications.

It should be noted that our book has a web site. Its URL is www.mhhe.com/engcs/ generallchapral. Feel free to consult it for additional information on this book. In particular, you can use it to provide us with your feedback.

Finally, as with the previous editions, we have exerted a conscious effort to make this book as student friendly as possible. Thus, we have endeavored to keep our explanations straightforward and oriented practically. Although our primary intent is to provide students with a sound introduct'ion to numerical methods, we have the ancillary objective of making this introduction exciting and pleasurable. We believe that motivated students who enjoy numerical methods, computers, and mathematics will, in the end, make better engineers. If our book fosters an enthusiasm for these subjects, we will consider our efforts a success.

Acknowledgments. Special thanks to Jery Stedinger of Cornell University, who gener- ously shared many insights and suggestions. His comments on optimization were particularly useful to us. David Clough of the University of Colorado shared his wisdom and catholic understanding of numerical methods and software packages. In addition, useful suggestions and reviews were made by David V. Chase (The University of Dayton), Ray- mundo Cordero (ITESM), Theresa Good (Texas A&M University), Wallace Grant (Vir- ginia Tecwirgina Polytechnic Institute & State), James W. Hiestand (University of Ten- nessee at Chattanooga), Steve Klegka (U.S. Military Academy), James L. Kuester (Arizona State University), Karim Miici (ITESM), Robert L. Rankin (Arizona State University), Elisa D. Sotelino (Purdue University), and Hewlon Zimmer (U.S. Merchant Marine Academy).

Finally, it should be stressed that, although we received useful advice from the afore- mentioned individuals, we are responsible for any inaccuracies or mistakes you may detect in this edition.

Steven C. Chapra Raymond P. Canale

CONTENTS PART ONE MODELING, PT 1.1 Motivation 3 COMPUTERS, AND PT 1.2 Mathematical Background 5 ERROR ANALYSIS 3 PT 1.3 Orientation 8

CHAPTER 1 Mathematical Modeling and Engineering Problem-Solving 1 1

1.1 A Simple Mathematical Model 1 1 1.2 Conservation Laws and Engineering 18 Problems 21

CHAPTER 2 Computers and Software 24

2.1 Computing Environments 25 2.2 The Software Development Process 26 2.3 Algorithm Design 30 2.4 Program Composition 4 0 2.5 Quality Control 43 2.6 Documentation and Maintenance 4 7 2.7 Sofhvare Strategy 48 Problems 53

CHAPTER 3 Approximations and Round-Off Errors 56

3.1 Significant Figures 5 7 3.2 Accuracy and Precision 59 3.3 Error Definitions 60 3.4 Round-Off Errors 63 Problems 78

CHAPTER 4 Truncation Errors and the Taylor Series 79

4.1 The Taylor Series 79 4.2 Error Propagation 95 4.3 Total Numerical Error 99 4.4 Blunders, Formulation Errors, and Data Uncertainty 101 Problems 103

xii CONTENTS

EPILOGUE: PART ONE 105 PT 1.4 Trade-offs 105 PT 1.5 Important Relationships and Formulas 108 PT 1.6 Advanced Methods and Additional References 108

PART TWO

ROOTS OF PT 2.1 Motivation 1 1 1 EQUATIONS 1 1 1 PT 2.2 Mathematical Background 1 13

PT 2.3 Orientation 1 14

CHAPTER 5 Bracketing Methods 1 18

5.1 Graphical Methods 1 18 5.2 The Bisection Method 122 5.3 The FalsePosition Method 132 5.4 Incremental Searches and Determining Initial Guesses 137 Problems 138

CHAPTER 6 Open Methods 14 1

6.1 Simple Fixed-Point Iteration 142 6.2 The Newton-Raphson Method 147 6.3 The Secant Method 153 6.4 Multiple Roots 158 6.5 Systems of Nonlinear Equations 161 Problems 165

CHAPTER 7 Roots of Polynomials 167

Polynomials in Engineering and Science 167 Computing with Polynomials 170 Conventional Methods 173

7.4 Mueller's Method 174 7.5 Bairstow's Method 178 7.6 Other Methods 183 7.7 Root Location with Libraries and Packages 183 Problems 193

CHAPTER 8 Engineering Applications: Roots of Equations 194

8.1 Ideal and Nonideal Gas Laws (Chemical/Petroleum Engineering) 194 8.2 Open-Channel Flow (Civil/Environmental Engineering) 197 8.3 Design of an Electric Circuit (Electrical Engineering) 201

CONTENTS xiii

8.4 Vibration Analysis (Mechanical/Aerospace Engineering) 203 Problems 2 10

EPILOGUE: PART TWO 2 1 5 PT 2.4 Trade-offs 2 15 PT 2.5 Important Relationships and Formulas 2 16 PT 2.6 Advanced Methods and Additional References 2 16

PART THREE

LINEAR ALGEBRAIC PT 3.1 Motivation 2 19 EQUATIONS 2 1 9 PT 3.2 Mathematical Background 22 1

PT 3.3 Orientation 229

CHAPTER 9 Gauss Elimination 233

9.1 Solving Small Numbers of Equations 233 9.2 Naive Gauss Elimination 240 9.3 Pitfalls of Elimination Methods 246 9.4 Techniques for Improving Solutions 252 9.5 Complex Systems 259 9.6 Nonlinear Systems of Equations 259 9.7 Gauss-Jordan 261 9.8 Summary 263 Problems 263

CHAPTER 10 LU Decomposition and Matrix Inversion 266

10.1 LU Decomposition 266 10.2 The Matrix Inverse 275 10.3 Error Analysis and System Condition 279 Problems 287

CHAPTER 1 1 Special Matrices and Gauss-Seidel 288

1 1 . 1 Special Matrices 288 1 1.2 Gauss-Seidel 292 1 1.3 Linear Algebraic Equations with Libraries and Packages 299 Problems 307

CHAPTER 12 Engineering Applications: Linear Algebraic Equations 309

12.1 Steady-State Analysis of a System of Reactors (Chemical/Petroleum Engineering) 309

xiv CONTENTS

12.2 Analysis of a Statically Determinate Truss (Civil/Environmental Engineering) 3 12 12.3 Currents and Voltages in Resistor Circuits (Electrical Engineering) 3 16 1 2.4 Spring-Mass Systems (Mechanical/Aerospace Engineering) 3 1 8 Problems 321

EPILOGUE: PART THREE 327 PT 3.4 Trade-offs 327 PT 3.5 Important Relationships and Formulas 328 PT 3.6 Advanced Methods and Additional References 328

PART FOUR

OPTIMIZATION 33 1 PT 4.1 Motivation 33 1 PT 4.2 Mathematical Background 336 PT 4.3 Orientation 337

CHAPTER 13 One-dimensional Unconstrained Optimization 341

13.1 Golden-Section Search 342 13.2 Quadratic Interpolation 349 1 3.3 Newton's Method 35 1 Problems 353

CHAPTER 14 Multidimensional Unconstrained Optimization 355

14.1 Direct Methods 356 14.2 Gradient Methods 360 Problems 373

CHAPTER 15 Constrained Optimization 375

15.1 Linear Programming 375 15.2 Nonlinear Constrained Optimization 386 1 5.3 Optimization with Packages 387 Problems 397

CHAPTER 16 Engineering Applications: Optimization 399

16.1 Least-Cost Design of a Tank (Chemical/Petroleum Engineering) 399 16.2 Least-Cost Treatment of Wastewater (Civil/Environmental Engineering) 403 16.3 Maximum Power Transfer for a Circuit (Electrical Engineering) 408 16.4 Mountain Bike Design (Mechanical/Aerospace Engineering) 41 2 Problems 41 4

CONTENTS xv

EPILOGUE: PART FOUR 420 PT 4.4 Trade-offs 420 PT 4.5 Additional References 42 1

PART FIVE

CURVE FITTING 423 PT 5.1 Motivation 423 PT 5.2 Mathematical Background 425 PT 5.3 Orientation 434

CHAPTER 17 Least-Squares Regression 438

17.1 Linear Regression 438 17.2 Polynomial Regression 454 17.3 Multiple Linear Regression 460 17.4 General Linear Least-Squares 463 17.5 Nonlinear Regression 468 Problems 471

CHAPTER 18 Interpolation 473

18.1 Newton's Divided-Difference lnterpolating Polynomials 474 18.2 Lagrange lnterpolating Polynomials 485 18.3 Coefficients of an lnterpolating Polynomial 490 18.4 Inverse Interpolation 490 18.5 Additional Comments 491 18.6 Spline lnterpolation 494 Problems 505

CHAPTER 19 Fourier Approximation 507

19.1 Curve Fitting with Sinusoidal Functions 508 1 9.2 Continuous Fourier Series 5 14 19.3 Frequency and Time Domains 5 17 1 9.4 Fourier Integral and Transform 52 1 19.5 Discrete Fourier Transform (DFT) 523 19.6 Fast Fourier Transform (FFT) 525 19.7 The Power Spectrum 532 19.8 Curve Fitting with Libraries and Packages 533 Problems 544

xvi CONTENTS

CHAPTER 20 Engineering Applications: Curve Fitting 546

20.1 Linear Regression and Population Models (Chemical/Petroleum Engineering) 546 20.2 Use of Splines to Estimate Heat Transfer (Civil/Environmental Engineering) 550 20.3 Fourier Analysis (Electrical Engineering) 552 20.4 Analysis of Experimental Data (Mechanical/Aerospace Engineering) 553 Problems 555

EPILOGUE: PART FIVE 561 PT 5.4 Trade-offs 5 6 1 PT 5.5 Important Relationships and Formulas 562 PT 5.6 Advanced Methods and Additional References 564

PART SIX

NUMERICAL PT 6.1 Motivation 567 DIFFERENTIATION PT 6.2 Mathematical Background 576 A N D INTEGRATION PT 6.3 Orientation 579 567

CHAPTER 2 1 Newton-Cotes lntegration of Equations 582

2 1 .1 The Trapezoidal Rule 584 21.2 Simpson's Rules 595 2 1.3 lntegration with Unequal Segments 604 2 1.4 Open lntegration Formulas 607 Problems 607

CHAPTER 22 lntegration of Equations 61 0

22.1 Newton-Cotes Algorithms for Equations 61 0 22.2 Romberg lntegration 61 2 22.3 Gauss Quadrature 61 7 22.4 Improper Integrals 624 Problems 628

CHAPTER 23 Numerical Differentiation 629

23.1 High-Accuracy Differentiation Formulas 629 23.2 Richardson Extrapolation 632 23.3 Derivatives of Unequally Spaced Data 634 23.4 Derivatives and Integrals for Data with Errors 635 23.5 Numerical Integration/Differentiation Formulas with Libraries and Packages 636 Problems 641

CONTENTS xvii

CHAPTER 24 Engineering Applications: Numerical lntegration and Differentiation 643

24.1 lntegration to Determine the Total Quantity of Heat (Chemical/Petroleum Engineering) 643

24.2 Effective Force on the Mast of a Racing Sailboat (Civil/Environmental Engineering) 645

24.3 Root-Mean-Square Current by Numerical lntegration (Electrical Engineering) 647 24.4 Numerical lntegration to Compute Work (Mechanical/Aerospace Engineering) 650 Problems 654

EPILOGUE: PART SIX 661 PT 6.4 Trade-offs 661 PT 6.5 Important Relationships and Formulas 662 PT 6.6 Advanced Methods and Additional References 662

PART SEVEN

ORDINARY PT 7.1 Motivation 665 DIFFERENTIAL PT 7.2 Mathematical Background 669 EQUATIONS 665 PT 7.3 Orientation 671

CHAPTER 25 Runge-Kutta Methods 675

25.1 Euler's Method 676 25.2 Improvements of Euler's Method 687 25.3 Runge-Kutta Method 695 25.4 Systems of Equations 705 25.5 Adaptive Runge-Kutta Method 71 0 Problems 71 8

CHAPTER 26 Stiffness and Multistep Methods 71 9

26.1 Stiffness 71 9 26.2 Multistep Methods 723 Problems 743

CHAPTER 27 Boundary-Value and Eigenvalue Problems 745

27.1 General Methods of Boundary-Value Problems 746 27.2 Eigenvalue Problems 752 27.3 ODES and Eigenvalues with Libraries and Packages 768 Problems 775

CHAPTER 28 Engineering Applications: Ordinary Differential Equations 777

28.1 Using ODES to Analyze the Transient Response of a Reactor (Chemical/Petroleum Engineering) 777

28.2 Predator-Prey Models and Chaos (Civil/Environmental Engineering] 784 28.3 Simulating Transient Current for an Electronic Circuit (Electrical Engineering) 788 28.4 The Swinging Pendulum (Mechanical/Aerospace Engineering) 793 Problems 796

EPILOGUE: PART SEVEN 800 PT 7.4 Trade-offs 800 PT 7.5 Important Relationships and Formulas 801 PT 7.6 Advanced Methods and Additional References 801

PART EIGHT

PARTIAL PT 8.1 Motivation 805 DIFFERENTIAL PT 8.2 Orientation 808 EQUATIONS 805

CHAPTER 29 Finite Difference: Elliptic Equations 8 1 2 29.1 The Laplace Equation 8 1 2 29.2 Solution Techniques 8 14 29.3 Boundary Conditions 820 29.4 The Control Volume Approach 826 29.5 Software to Solve Elliptic Equations 829 Problems 830

CHAPTER 30 Finite Difference: Parabolic Equations 832

30.1 The Heat Conduction Equation 832 30.2 Explicit Methods 833 30.3 A Simple Implicit Method 837 30.4 The Crank-Nicholson Method 841 30.5 Parabolic Equations in Two Spatial Dimensions 844 Problems 848

CHAPTER 31 Finite Element Method 849

3 1.1 The General Approach 850 3 1.2 Finite-Element Application in One Dimension 854 3 1.3 Two-Dimensional Problems 863 3 1.4 PDEs with Libraries and Packages 867 Problems 874

CONTENTS xix

CHAPTER 32 Engineering Applications: Partial Differential Equations 876

32.1 One-Dimensional Mass Balance of a Reactor (Chemical/Petroleum Engineering) 876

32.2 Deflections of a Plate (Civil/Environmental Engineering) 880 32.3 Two-Dimensional Electrostatic Field Problems (Electrical Engineering) 882 32.4 Finite-Element Solution of a Series of Springs (Mechanical/Aerospace

Engineering) 885 Problems 889

EPILOGUE: PART EIGHT 891 PT 8.3 Trade-offs 891 PT 8.4 Important Relationships and Formulas 891 PT 8.5 Advanced Methods and Additional References 892

APPENDIX A: THE FOURIER SERIES 893

APPENDIX B: GETTING STARTED WlTH MATHCAD 895

APPENDIX C: GETTING STARTED WlTH MATLAB 905

BIBLIOGRAPHY 91 3

INDEX 916

Numerical Methods for Engineers With Programming and Software Applications

PART ONE

MODELING, COMPUTERS, AND ERROR ANALYSIS

PT 1.1 MOTIVATION

Numerical methods are techniques by which mathematical problems are formulated so that they can be solved with arithmetic operations. Although there are many kinds of numerical methods, they have one common characteristic: they invariably involve large numbers of tedious arithmetic calculations. It is little wonder that with the development of fast, efficient digital computers, the role of numerical methods in engineering problem solving has increased dramatically in recent years.

PTI . I . I Noncomputer Methods

Beyond providing increased computational firepower, the widespread availability of computers (especially personal computers) and their partnership with numerical methods has had a significant influence on the actual engineering problem-solving process. In the precomputer era there were generally three different ways in which engineers approached problem solving:

1. Solutions were derived for some problems using analytical, or exact, methods. These solutions were often useful and provided excellent insight into the behavior of some systems. However, analytical solutions can be derived for only a limited class of problems. These include those that can be approximated with linear models and those that have simple geometry and low dimensionality. Consequently, analytical solutions are of limited practical value because most real problems are nonlinear and involve complex shapes and processes.

2. Graphical solutions were used to characterize the behavior of systems. These graphical solutions usually took the form of plots or nomographs. Although graphical techniques can often be used to solve complex problems, the results are not very precise. Further- more, graphical solutions (without the aid of computers) are extremely tedious and awk- ward to implement. Finally, graphical techniques are often limited to problems that can be described using three or fewer dimensions.

3. Calculators and slide rules were used to implement numerical methods manually. Al- though in theory such approaches should be perfectly adequate for solving complex problems, in actuality several difficulties are encountered. Manual calculations are slow and tedious. Furthermore, consistent results are elusive because of simple blunders that arise when numerous manual tasks are performed.

During the precomputer era, significant amounts of energy were expended on the solution technique itself, rather than on problem definition and interpretation (Fig. PTl. la).

I This unfortunate situation existed because so much time and drudgery were required to obtain numerical answers using precomputer techniques.

4 MODELING, COMPUTERS, AND ERROR ANALYSIS

FORMULATION

FIGURE PTl . I The three ~hases of en ineering a problem solving in (a) t e precomputer and (b) the computer era. The sizes of the boxes indi- cute the level of emphasis directed toward each phase. In-depth analysis Computers facilitate the imple- limited by time- rnentation of solution techniques consuming solution and thus allow more emphasis to be placed on the creative aspects of problem formulation and interpretation of results. (a)

Today, computers and numerical methods provide an alternative for such complicated calculations. Using computer power to obtain solutions directly, you can approach these calculations without recourse to simplifying assumptions or time-intensive techniques. Although analytical solutions are still extremely valuable both for problem solving and for providing insight, numerical methods represent alternatives that greatly enlarge your capabilities to confront and solve problems. As a result, more time is available for the use of your creative skills. Thus, more emphasis can be placed on problem formulation and solution interpretation and the incorporation of total system, or "holistic," awareness (Fig. PTl .lb.)

PT1.1.2 Numerical Methods and Engineering Practice

Since the late 1940s the widespread availability of digital computers has led to a veritable explosion in the use and development of numerical methods. At first, this growth was somewhat limited by the cost of access to large mainframe computers, and, consequently, many engineers continued to use simple analytical approaches in a significant portion of their work. Needless to say, the recent evolution of inexpensive personal computers has given us ready access to powerful computational capabilities. There are several additional reasons why you should study numerical methods:

PT1.2 MATHEMATICAL BACKGROUND 5

1. Numerical methods are extremely powerful problem-solving tools. They are capable of handling large systems of equations, nonlinearities, and complicated geometries that are not uncommon in engineering practice and that are often impossible to solve analytically. As such, they greatly enhance your problem-solving skills.

2. During your careers, you may often have occasion to use commercially available prepackaged, or "canned," computer programs that involve numerical methods. The in- telligent use of these programs is often predicated on knowledge of the basic theory underlying the methods.

3. Many problems cannot be approached using canned programs. If you are conversant with numerical methods and are adept at computer programming, you can design your own programs to solve problems without having to buy or commission expensive software.

4. Numerical methods are an efficient vehicle for learning to use computers. It is well known that an effective way to learn programming is to actually write computer programs. Because numerical methods are for the most part designed for implementation on computers, they are ideal for this purpose. Further, they are especially well-suited to illustrate the power and the limitations of computers. When you successfully implement numerical methods on a computer and then apply them to solve otherwise intractable problems, you will be provided with a dramatic demonstration of how computers can serve your professional development. At the same time, you will also learn to acknowledge and control the errors of approximation that are part and parcel of large-scale numerical calculations.

5. Numerical methods provide a vehicle for you to reinforce your understanding of mathematics. Because one function of numerical methods is to reduce higher mathematics to basic arithmetic operations, they get at the "nuts and bolts" of some otherwise obscure topics. Enhanced understanding and insight can result from this alternative perspective.

PT1.2 MATHEMATICAL BACKGROUND

Every part in this book requires some mathematical background. Consequently, the introductory material for each part includes a section, such as the one you are reading, on mathematical background. Because Part One itself is devoted to background material on mathematics and computers, this section does not involve a review of a specific mathematical topic. Rather, we take this opportunity to introduce you to the types of mathematical subject areas covered in this book. As summarized in Fig. PT1.2, these are

1. Roots of Equations (Fig. PT1.2a). These problems are concerned with the value of a variable or a parameter that satisfies a single nonlinear equation. These problems are especially valuable in engineering design contexts where it is often impossible to explic- itly solve design equations for parameters.

2. Systems of Linear Algebraic Equations (Fig. PT1.2b). These problems are similar in spirit to roots of equations in the sense that they are concerned with values that satisfy equations. However, in contrast to satisfying a single equation, a set of values is sought that simultaneously satisfies a set of linear algebraic equations. Such equations arise in a variety of problem contexts and in all disciplines of engineering. In particular, they

6 COMPUTERS AND SOFTWARE

FIGURE PT1.2 Summary of the numerical methods covered in this book.

PT1.2 MATHEMATICAL BACKGROUND 7

FIGURE PT1.2 (concluded)

originate in the mathematical modeling of large systems of interconnected elements such as structures, electric circuits, and fluid networks. However, they are also encountered in other areas of numerical methods such as curve fitting and differential equations.

3. Optimization (Fig. PT1.2c). These problems involve determining a value or values of an independent variable that correspond to a "best" or optimal value of a function. Thus, as in Fig. PT1.2c, it involves identifying maxima and minima. Such problems occur routinely in engineering design contexts. They also arise in a number of other numerical methods. We address both single- and multi-variable unconstrained optimization. We also describe constrained optimization with particular emphasis on linear programming.

4. Curve Fitting (Fig. PT1.2d). You will often have occasion to fit curves to data points. The techniques developed for this purpose can be divided into two general categories: regression and interpolation. Regression is employed where there is a significant degree of error associated with the data. Experimental results are often of this kind. For these situations, the strategy is to derive a single curve that represents the general trend of the data without necessarily matching any individual points. In contrast, interpolation is used where the objective is to determine intermediate values between relatively error- free data points. Such is usually the case for tabulated information. For these situations, the strategy is to fit a curve directly through the data points and use the curve to predict the intermediate values.

5. Integration (Fig. PT1.2e). As depicted, a physical interpretation of numerical integration is the determination of the area under a curve. Integration has many applications in engineering practice, ranging from the determination of the centroids of oddly shaped objects to the calculation of total quantities based on sets of discrete measurements. In

8 MODELING, COMPUTERS, AND ERROR ANALYSIS

addition, numerical integration formulas play an important role in the solution of differential equations.

6. Ordinary Differential Equations (Fig. PT1.2f ). Ordinary differential equations are of great significance in engineering practice. This is because many physical laws are couched in terms of the rate of change of a quantity rather than the magnitude of the quantity itself. Examples range from population forecasting models (rate of change of population) to the acceleration of a falling body (rate of change of velocity). Two types of problems are addressed: initial-value and boundary-value problems. In addition, the computation of eigenvalues is covered.

7. Partial Differential Equations (Fig. PT1.2g). Partial differential equations are used to characterize engineering systems where the behavior of a physical quantity is couched in terms of its rate of change with respect to two or more independent variables. Exam- ples include the steady-state distribution of temperature on a heated plate (two spatial dimensions) or the time-variable temperature of a heated rod (time and one spatial dimension). Two fundamentally different approaches are employed to solve partial differential equations numerically. In the present text, we will emphasize finite-difference methods that approximate the solution in a pointwise fashion (Fig. PT1.2g). However, we will also present an introduction to finite-element methods, which use a piecewise approach.

PT1.3 ORIENTATION

Some orientation might be helpful before proceeding with our introduction to numerical methods. The following is intended as an overview of the material in Part One. In addition, some objectives have been included to focus your efforts when studying the material.

PT1.3.1 Scope and Preview

Figure PT1.3 is a schematic representation of the material in Part One. We have designed this diagram to provide you with a global overview of this part of the book. We believe that a sense of the "big picture" is critical to developing insight into numerical methods. When reading a text, it is often possible to become lost in technical details. Whenever you feel that you are losing the big picture, refer back to Fig. PT1.3 to reorient yourself. Every part of this book includes a similar figure.

Figure PT1.3 also serves as a brief preview of the material covered in Part One. Chap- ter 1 is designed to orient you to numerical methods and to provide motivation by demonstrating how these techniques can be used in the engineering modeling process. Chapter 2 is an introduction and review of computer-related aspects of numerical methods and suggests the level of computer skills you should acquire to efficiently apply succeeding information. Chapters 3 and 4 deal with the important topic of error analysis, which must be un- derstood for the effective use of numerical methods. In addition, an epilogue is included that introduces the trade-offs that have such great significance for the effective implementation of numerical methods.

PT1.3 ORIENTATION 9

FIGURE PT1.3 Schematic of the organ~zation of the material in Part One: Modeling, Computers, and Error Analysis

1 0 MODELING, COMPUTERS, AND ERROR ANALYSIS

TABLE PTl .I Specific study objectives for Part One.

1 . Recognize the difference between analytical and numerical solutions. 2. Understand how conservation laws are employed to develop mathematical models of physical systems. 3. Define top-down and modular design. 4. Delineate the rules that underlie structured programming. 5 . Be capable of composing structured and modular programs in a high-level computer language. 6. Know how to translate structured flowcharts and pseudocode into code in a high-level language. 7. Start to familiarize yourself with any software packages that you will be using in conjunction with

this text. 8. Recognize the distinction between truncation and round-off errors. 9. Understand the concepts of significant figures, accuracy, and precision.

10. Recognize the difference between true relative error E,, approximate relative error E,, and acceptable error E,, and understand how E , and E , are used to terminate an iteratlve computation.

1 1 . Understand how numbers are represented in dig~tal computers and how this representation induces round-off error. In particular, know the difference between single and extended precision.

12. Recognize how computer arithmetic can introduce and amplify round-off errors in calculations. In particu- lor, a~~rec la te the wroblem of subtractive cancellation , 8 ,

1 3. Understand how thk Taylor series and its remainder are employed to represent continuous functions. 14. Know the relationship between f~nite divided differences and derivatives. 15. Be able to analyze how errors are propagated through functional relationships. 16. Be familiar with the concepts of stability and condition. 17. Familiarize yourself with the trade-offs outlined in the Epilogue of Part One.

PT1.3.2 Goals and Objectives

Study Objectives. Upon completing Part One, you should be adequately prepared to embark on your studies of numerical methods. In general, you should have gained a fundamental understanding of the importance of computers and the role of approximations and errors in the implementation and development of numerical methods. In addition to these general goals, you should have mastered each of the specific study objectives listed in Table PT1.l.

Computer Objectives. Upon completing Part One, you should have mastered sufficient computer skills to develop your own software for the numerical methods in this text. You should be able to develop well-structured and reliable computer programs on the basis of pseudocode, flowcharts, or other forms of algorithms. You should have developed the capability to document your programs so that they may be effectively employed by users. Fi- nally, in addition to your own programs, you may be using software packages along with this book. The Numerical Methods TOOLKIT, which has been provided along with the text, and other packages like Mathcad, MATLAB, or Excel are examples of such software. You should become familiar with these packages, so that you will be comfortable using them to solve numerical problems later in the text.

CHAPTER 1

Mathematical Modeling and Engineering Problem SGlving

Knowledge and understanding are prerequisites for the effective implementation of any tool. No matter how impressive your tool chest, you will be hard-pressed to repair a car if you do not understand how it works.

This is particularly true when using computers to solve engineering problems. Al- though they have great potential utility, computers are practically useless without a fundamental understanding of how engineering systems work.

This understanding is initially gained by empirical means-that is, by observation and experiment. However, while such empirically derived information is essential, it is only half the story. Over years and years of observation and experiment, engineers and scientists have noticed that certain aspects of their empirical studies occur repeatedly. Such general behavior can then be expressed as fundamental laws that essentially embody the cumulative wisdom of past experience. Thus, most engineering problem solving employs the two- pronged approach of empiricism and theoretical analysis (Fig. 1.1).

It must be stressed that the two prongs are closely coupled. As new measurements are taken, the generalizations may be modified or new ones developed. Similarly, the generalizations can have a strong influence on the experiments and observations. In particular, generalizations can serve as organizing principles that can be employed to synthesize observations and experimental results into a coherent and comprehensive framework from which conclusions can be drawn. From an engineering problem-solving perspective, such a framework is most useful when it is expressed in the form of a mathematical model.

The primary objective of this chapter is to introduce you to mathematical modeling and its role in engineering problem solving. We will also illustrate how numerical methods figure in the process.

1.1 A SIMPLE MATHEMATICAL MODEL

A mathematical model can be broadly defined as a formulation or equation that expresses the essential features of a physical system or process in mathematical terms. In a very general sense, it can be represented as a functional relationship of the form

Dependent independent forcing =f( , parameters, variable variables functions

12 MATHEMATICAL MODELING AND ENGINEERING PROBLEM SOLVING

FIGURE 1.1 The engineering problem- solving process.

where the dependent variable is a characteristic that usually reflects the behavior or state of the system; the independent variables are usually dimensions, such as time and space, along which the system's behavior is being determined; the parameters are reflective of the system's properties or composition; and theforcing functions are external influences acting upon it.

The actual mathematical expression of Eq. (1.1) can range from a simple algebraic relationship to large complicated sets of differential equations. For example, on the basis of his observations, Newton formulated his second law of motion, which states that the time rate of change of momentum of a body is equal to the resultant force acting on it. The mathematical expression, or model, of the second law is the well-known equation

where F = net force acting on the body (N, or kg m/s2), m = mass of the object (kg), and a = its acceleration (m/s2).

1 .1 A SIMPLE MATHEMATICAL MODEL 13

The second law can be recast in the format of Eq. (1.1) by merely dividing both sides by m to give

F a = -

m (1.3)

where a = the dependent variable reflecting the system's behavior, F = the forcing function, and m = a parameter representing a property of the system. Note that for this simple case there is no independent variable because we are not yet predicting how acceleration varies in time or space.

Equation (1.3) has several characteristics that are typical of mathematical models of the physical world:

1. It describes a natural process or system in mathematical terms. 2. It represents an idealization and simplification of reality. That is, the model ignores neg-

FIGURE 1.2 Schematic diagram of the forces acting on a falling parachutist. FD is the downward force due to gravity. FU is the upward force due to air resistance.

ligible details of the natural process and focuses on its essential manifestations. Thus, the second law does not include the effects of relativity that are of minimal importance when applied to objects and forces that interact on or about the earth's surface at veloc- ities and on scales visible to humans.

3. Finally, it yields reproducible results and, consequently, can be used for predictive purposes. For example, if the force on an object and the mass of an object are known, Eq. (1.3) can be used to compute acceleration.

Because of its simple algebraic form, the solution of Eq. (1.2) can be obtained easily. However, other mathematical models of physical phenomena may be much more complex, and either cannot be solved exactly or require more sophisticated mathematical techniques than simple algebra for their solution. To illustrate a more complex model of this kind, Newton's second law can be used to determine the terminal velocity of a free-falling body near the earth's surface. Our falling body will be a parachutist (Fig. 1.2). A model for this case can be derived by expressing the acceleration as the time rate of change of the velocity (dvldt) and substituting it into Eq. (1.3) to yield

where v is velocity (m/s). Thus, the mass multiplied by the rate of change of the velocity is equal to the net force acting on the body. If the net force is positive, the object will accel- erate. If it is negative, the object will decelerate. If the net force is zero, the object's velocity will remain at a constant level.

Next, we will express the net force in terms of measurable variables and parameters. For a body falling within the vicinity of the earth (Fig. 1.2), the net force is composed of two opposing forces: the downward pull of gravity F, and the upward force of air resistance F,:

F = FD + Fu (1.5)

If the downward force is assigned a positive sign, the second law can be used to for- mulate the force due to gravity, as

FD = rng (1.6)

where g = the gravitational constant, or the acceleration due to gravity, which is approximately equal to 9.8 m/s2.


Air resistance can be formulated in a variety of ways. A simple approach is to assume that it is linearly proportional to velocity1 and acts in an upward direction, as in

~ h + relocity of a idling Fu = -CV

= mg-cv (1.7)

dl

dv = 8-cv where c = a proportionality constant called the drag coefficient (kg/s). Thus, the greater dr m the fall velocity, the greater the upward force due to air resistance. The parameter c ac- " = -;(- $g + v) dl counts for properties of the falling object, such as shape or surface roughness, that affect air 1 d v = -5dr resistance. For the present case, c might be a function of the type of jumpsuit or the orien-

m - - g + v m tation used by the parachutist during free-fall.

jL = j(-id) The net force is the difference between the downward and upward force. Therefore, - r n g + ~

C Eqs. (1.4) through (1.7) can be combined to yield I n t 3 + y I = -?+c d v mg - c v

-%*C -- - d t m

(1.8) t$g+vl = e

or simplifying the right side,

so, -5,

-!Eg+, = - 4 6 " Equation (1.9) is a model that relates the acceleration of a falling object to the forces act- C c ing on it. It is a differential equation because it is written in terms of the differential rate of

-it

v = -- :ge " + :g change (dv ld t ) of the variable that we are interested in predicting. However, in contrast to the solution of Newton's second law in Eq. (1.3), the exact solution of Eq. (1.9) for the velocity of the falling parachutist cannot be obtained using simple algebraic manipulation. Rather, more advanced techniques such as those of calculus must be applied to obtain an exact or analytical solution. For example, if the parachutist is initially at rest ( v = 0 at t = 0) , calculus can be used to solve Eq. (1.9) for

Note that Eq. (1.10) is cast in the general form of Eq. ( ] . I ) , where v(t) = the dependent variable, t = the independent variable, c and m = parameters, and g = the forcing function.

EXAMPLE 1.1 Analytical Solution to the Falling Parachutist Problem

Problem Statement. A parachutist of mass 68.1 kg jumps out of a stationary hot air bal- loon. Use Eq. (1.10) to compute velocity prior to opening the chute. The drag coefficient is equal to 12.5 kg/s.

/ Solution. Inserting the parameters into Eq. (1.10) yields

I which can be used to compute

' In fact, the relationship is actually nonlinear and might better be represented by a power relationship such as F, = -cv2. We will explore how such nonlinearities affect the model in a problem at the end of this chapter.

1 .1 A SIMPLE MATHEMATICAL MODEL IS

According to the model, the parachutist accelerates rapidly (Fig. 1.3). A velocity of 44.87 m/s (100.4 mi/h) is attained after 10 s. Note also that after a sufficiently long time, a constant velocity, called the terminal velocity, of 53.39 m/s (1 19.4 mi/h) is reached. This velocity is constant because, eventually, the force of gravity will be in balance with the air resistance. Thus, the net force is zero and acceleration has ceased.

Equation (1.10) is called an analytical, or exact, solution because it exactly satisfies the original differential equation. Unfortunately, there are many mathematical models that cannot be solved exactly. In many of these cases, the only alternative is to develop a numerical solution that approximates the exact solution.

As mentioned previously, numerical methods are those in which the mathematical problem is reformulated so it can be solved by arithmetic operations. This can be illustrated

FIGURE 1.3 The analytical solution to the falling parachutist problem as computed in Example 1 . l . Velocity increases with time and asymptotically approaches a terminal velocity.


FIGURE 1.4 The use of a finite difference to a proximate the first derivative oPv with respect to f .

for Newton's second law by realizing that the time rate of change of velocity can be approximated by (Fig. 1.4):

where Av and At = differences in velocity and time computed over finite intervals, v(t,) = velocity at an initial time ti, and v(ti+,) = velocity at some later time ti+,. Note that dvldt AvlAt is approximate because At is finite. Remember from calculus that

Equation (1.11) represents the reverse process. Equation (1.11) is called afinite divided difference approximation of the derivative at

time t,. It can be substituted into Eq. (1.9) to give

This equation can then be rearranged to yield

Notice that the term in brackets is the right-hand side of the differential equation itself [Eq. (1.9)]. That is, it provides a means to compute the rate of change or slope of v. Thus, the differential equation has been transformed into an equation that can be used to determine the velocity algebraically at ti+, using the slope and previous values of v and t. If you are given an initial value for velocity at some time t,, you can easily compute velocity at a

1 .1 A SIMPLE MATHEMATICAL MODEL 17

later time ti+,. This new value of velocity at ti+, can in turn be employed to extend the computation to velocity at t,,, and so on. Thus, at any time along the way,

New value = old value + slope x step size

EXAMPLE 1.2 Numerical Solution to the Falling Parachutist Problem

Problem Statement. Perform the same computation as in Example 1 . 1 but use Eq. (1.12) to compute velocity. Employ a step size of 2 s for the calculation.

Solution. At the start of the computation (ti = O), the velocity of the parachutist is zero. Using this information and the parameter values from Example 1.1, Eq. (1.12) can be used to compute velocity at t,,, = 2 s:

For the next interval (from t = 2 to 4 s), the computation is repeated, with the result

The calculation is continued in a similar fashion to obtain additional values:

The results are plotted in Fig. 1.5 along with the exact solution. It can be seen that the numerical method captures the essential features of the exact solution. However, because we have employed straight-line segments to approximate a continuously curving function, there is some discrepancy between the two results. One way to minimize such discrepancies is to use a smaller step size. For example, applying Eq. (1.12) at l-s intervals results in a smaller error, as the straight-line segments track closer to the true solution. Using hand calculations, the effort associated with using smaller and smaller step sizes would make such numerical solutions impractical. However, with the aid of the computer, large numbers of calculations can be performed easily. Thus, you can accurately model the velocity of the falling parachutist without having to solve the differential equation exactly.

As in the previous example, a computational price must be paid for a more accurate numerical result. Each halving of the step size to attain more accuracy leads to a doubling


FIGURE 1.5 Comparison of the numerical and analytical solutions for the falling parachutist problem.

of the number of computations. Thus, we see that there is a trade-off between accuracy and computational effort. Such trade-offs figure prominently in numerical methods and constitute an important theme of this book. Consequently, we have devoted the Epilogue of Part One to an introduction to more of these trade-offs.

1.2 CONSERVATION LAWS AND ENGINEERING

Aside from Newton's second law, there are other major organizing principles in engineering. Among the most important of these are the conservation laws. Although they form the basis for a variety of complicated and powerful mathematical models, the great conservation laws of science and engineering are conceptually easy to understand. They all boil down to

Change = increases - decreases (1.13)

This is precisely the format that we employed when using Newton's law to develop a force balance for the falling parachutist [Eq. (1.8)].

Although simple, Eq. (1.13) embodies one of the most fundamental ways in which conservation laws are used in engineering-that is, to predict changes with respect to time. We give Eq. (1.13) the special name time-variable (or transient) computation.

Aside from predicting changes, another way in which conservation laws are applied is for cases where change is nonexistent. If change is zero, Eq. (1.13) becomes

Change = 0 = increases - decreases

Increases = decreases

1.2 CONSERVATION MWS AND ENGINEERING 19

Pipe 2 Flow in = 80

Pipe 1 Pipe 4 Flow in = 100 Flow out = ?

FIGURE 1.6 A flow balance for steady incompressible fluid flow at the Pipe 3 iunction of pipes. Flow out = 120

Thus, if no change occurs, the increases and decreases must be in balance. This case, which is also given a special name-the steady-state computation-has many applications in engineering. For example, for steady-state incompressible fluid flow in pipes, the flow into a junction must be balanced by flow going out, as in

Flow in = flow out

For the junction in Fig. 1.6, the balance can be used to compute that the flow out of the fourth pipe must be 60.

For the falling parachutist, steady-state conditions would correspond to the case where the net force was zero, or [Eq. (1.8) with dv /dt = 01

Thus, at steady state, the downward and upward forces are in balance and Eq. (1.15) can be solved for the terminal velocity

Although Eqs. (1.13) and (1.14) might appear trivially simple, they embody the two fundamental ways that conservation laws are employed in engineering. As such, they will form an important part of our efforts in subsequent chapters to illustrate the connection between numerical methods and engineering. Our primary vehicles for making this connection are the engineering applications that appear at the end of each part of this book.

Table 1.1 summarizes some of the simple engineering models and associated conservation laws that will form the basis for many of these engineering applications. Most of the chemical engineering applications will focus on mass balances for reactors. The mass balance is derived from the conservation of mass. It specifies that the change of mass of a chemical in the reactor depends on the amount of mass flowing in minus the mass flowing out.

Both the civil and mechanical engineering applications will focus on models developed from the conservation of momentum. For civil engineering, force balances are utilized to analyze structures such as the simple truss in Table 1.1. The same principles are employed for the mechanical engineering applications to analyze the transient up-and- down motion or vibrations of an automobile.


TABLE 1.1 Devices and types of balances that are commonly used in the four maior areas of engineering. For each case, the conservation law upon which the balance is based is specified.

<-----* - - - _ D r r i _ ***rCl)-_lLICnXI----<--P--

Field Device Organizing Principle Mathematical Expression

Chemical engineering Conservation of mass Mass balance: Input Output

Civil engineering

Structure A! Conservation of momentum

Mechanical engineering Conservation of momentum

Over a unit of time period Amass = inputs - outputs

Force balance: + F,.

t .FH -0- +FH

I

At each node X horizontal forces (F,) = 0 X vertical forces (F,) = 0

Force balance: Upward force

Downward force

m fi = downward force - upward force dt2

Electrical engineering Conservation of charge Current balance: + i , + - i 3

For each node I current (i) = 0 1

+ i,

Circuit Conservation of energy Voltage balance: ~ I R I

i2.2 @ . i3R3

Around each loop Z emf's - H voltage drops for resistors = 0 X t - X i R = O

PROBLEMS 2 1

TABLE 1.2 Some practical issues that will be explored in the engineering applications at the end of each part of this book.

1 . Nonlinear versus linear. Much of classical engineering depends on linearization to permit analytical solu- tlons Although thls is often appropriate, expanded insight can often be gained if nonlinear problems are examined.

2 large versus small systems W~thout a computer, ~t is often not feas~ble to examine systems with over three lnteractlng components. With computers and numerical methods, more realistic m~lt icom~onent systems can be examined.

3. Nonideal versus ideal, ldeallzed laws abound in engineering Often there ore nonidealized alternatives that are more realistic but more computationally demanding. Approximate numerical approaches con fo- cilitate the application of these nonideal relationships.

4 Sensitivity analysis. Because they are so involved, many manual calculations require a great deal of time and effort for successful implementation This sometimes discourages the analyst from implementing the mult~ple computations that are necessary to examine how a system responds under different conditions. Such sensitivity analyses are facilitated when numerical methods allow the computer to assume the computational burden

5 Design. It is often a straightforward proposition to determine the performance of a system as a functlon of ~ts parameters. It is usually more difficult to solve the inverse problem-that IS, determining the parameters when the required performance IS specifled. Numerical methods and computers often permlt this task to be implemented In an efficient manner

Finally, the electrical engineering applications employ both current and energy balances to model electric circuits. The current balance, which results from the conservation of charge, is similar in spirit to the flow balance depicted in Fig. 1.6. Just as flow must balance at the junction of pipes, electric current must balance at the junction of electric wires. The energy balance specifies that the changes of voltage around any loop of the circuit must add up to zero. The engineering applications are designed to illustrate how numerical methods are actually employed in the engineering problem-solving process. As such, they will permit us to explore practical issues (Table 1.2) that arise in real-world applications. Making these connections between mathematical techniques such as numerical methods and engineering practice is a critical step in tapping their true potential. Careful examination of the engineering applications will help you to take this step.

1.1 Answer true or false: (a) The value of a variable that satisfies a single equation is called

the root of the equation. (b) Finite divided differences are used to represent derivatives in

approximate terms. (c) In the precomputer era, numerical methods were widely em-

ployed because they required little computational effort. (d) Interpolation is employed for curve-fitting problems when

there is significant error associated with the data points. (e) Mathematical models should never be used for predictive pur-

poses.

(f) The large systems of equations, nonlinearities, and complicated geometries that are common in engineering practice are easy to solve analytically.

(g) Newton's second law is a good example of the fact that most physical laws are based on the rate of change of quantities rather than on their magnitudes.

(h) A physical interpretation of integration is the area under a curve.

(i) Numerical methods are those in which a mathematical problem is reformulated so that it can be solved by arithmetic operations.


(j) Today more attention can be paid to problem formulation and interpretation because the computer and numerical methods facilitate the solution of engineering problems.

1.2 Read the following problem descriptions and identify which area of numerical methods (as outlined in Fig. PT1.2) relates to their solution. (a) You are responsible for determining the flows in a large inter-

connected network of pipes to distribute natural gas to a series of communities spread out over a 20-mi2 area.

(b) You are performing experiments to determine the voltage drop across a resistor as a function of current. You make measurements of voltage drop for a number of different values of current. Although there is some error associated with your data points, when you plot them they suggest a smooth curvilinear relationship. You are to derive an equation to characterize this relationship.

(c) You must develop a shock-absorber system for a racing car. Newton's second law can be used to derive an equation to predict the rate of change in position of the front wheel in response to external forces. You must compute the motion of the wheel as a function of time after it hits a 6-in bump at 150 mi/h.

(d) You have to determine the annual revenues required over a 20-year period for an entertainment center to be built for a client. Money can presently be borrowed at an interest rate of 10.37 percent. Although the information to perform this estimate is contained in economics tables, values are listed only for interest rates of 10 and 11 percent.

(e) You must determine the temperature distribution on the two- dimensional surface of a flat gasket as a function of the tem- peratures of its edges.

(f) For the falling parachutist problem, you must determine the value of the drag coefficient in order that a 200-lb parachutist not exceed 100 mi/h within 10 s ofjumping. You must make this evaluation on the basis of the analytical solution [Eq. (1.10)]. The information will be used to design a jumpsuit.

(g) You are on a survey crew and must determine the area of a field that is bounded by two roads and a meandering stream.

1.3 Give one example of an engineering problem where each of the following classes of numerical methods can come in handy. If possible, draw from your experience in class and in readings or from any professional experience you have gathered to date. (a) Roots of equations (b) Linear algebraic equations (c) Curve fitting: regression and interpolation (d) Optimization (e) Integration (f) Ordinary differential equations (g) Partial differential equations

1.4 What is the two-pronged approach to engineering problem solving? Into what category should the conservation laws be placed? 1.5 What is the form of the transient conservation law? What is it for steady state? 1.6 The following information is available for a bank account:

Date Deposits Withdrawals Balance

Use the conservation of cash to compute the balance on 6/ 1, 7/ 1, 811, and 911. Show each step in the computation. Is this a steady- state or a transient computation? 1.7 Give examples of conservation laws in engineering and in everyday life. 1.8 Examine your engineering textbooks and find four examples where mathematical models are used to describe the behavior of physical systems. List the independent and dependent variables as well as the parameters and forcing functions. 1.9 Verify that Eq. (1.10) is a solution of Eq. (1.9). 1.10 Repeat Example 1.2. Compute the velocity to t = 12 s, with a step size of (a) 1 and (b) 0.5 s. Can you make any statement regarding the errors of the calculation based on the results? 1.11 Rather than the linear relationship of Eq. (1.7), the upward force on the falling parachutist is actually nonlinear and might better be represented by a power relationship such as

1 2 Fu = -C v

where c' = a second-order drag coefficient (kglm). Using this relationship, repeat the calculation in Example 1.2 with the same initial condition and parameter values. Use a value of 0.23 kg/m for c'. 1.12 The amount of a uniformly distributed radioactive contaminant contained in a closed reactor is measured by its concentration c (becquerellliter, or BqIL). The contaminant decreases at a decay rate proportional to its concentration; that is,

Decay rate = -kc

PROBLEMS 23

where k = a constant with units of day-'. Therefore, according to y Eq. (1.13), a mass balance for the reactor can be written as

d c - - - -kc d t

(change) = (decrease) in mass by decay

Use numerical methods to solve this equation from t = 0 to 1 d, with k = 0.1 d-'. Employ a step size of At = 0.1. The concentration at t = 0 is 10 Bq/L. FIGURE P1.13 1.13 A storage tank contains a liquid at depth y, where y = 0 when the tank is half full. Liauid is withdrawn at a constant flow rate Q to meet demands. The contents are resupplied at a sinusoidal rate or, since the surface area, A, is constant 3Q sin2t (see Fig. P1.13).

Equation (1.13) can be written for this system as - d y = 3- Q . sm2t - - Q d t A A

change in ( volume ) = (inflow) - (outflow)

Use a numerical method to solve for the height y from t = 0 to 5 d with a step size of 0.5 d. The parameter values are A = 1200 m2 and Q = 400 m3/d.

CHAPTER 2 Computers and Software

Numerical methods combine two of the most important tools in the engineering repertoire: mathematics and computers. In fact, numerical methods can be loosely defined as computer math. Good computer programming skills will enhance your studies of numerical methods. In particular, the powers and limitations of numerical techniques are best ap- preciated when you use the methods in tandem with a computer to solve engineering problems.

You will have the opportunity in using this book to develop your own software. Be- cause of the widespread availability of personal computers and magnetic storage devices, this software can be easily retained and applied throughout your career. Therefore, one of the primary objectives of this text is that you come away from it possessing useful, high- quality software.

In addition, engineers are increasingly using software packages to implement numerical methods. Therefore, another objective of this text will be to acquaint yourself with the capabilities, operation, and limitations of some of these packages. In addition, advanced applications with such packages often involve some programming in the guise of macros or scripts. Thus, even though it might superficially appear that using the software is a simple "point and shoot" exercise, sophisticated engineers must also know how to develop short, well-structured programs to truly exploit these useful tools.

This text has a number of features that are intended to maximize this possibility. First, all the numerical techniques are accompanied by material related to effective computer implementation. Algorithms are provided for many of the methods. Second, several popular software packages and libraries are described and used to solve engineering-oriented problems. Finally, supplementary software is available for the application or illustration of several of-the most elementary methods discussed in this book. This software, which is com- patible with widely used personal computers, can serve as a starting point for your own program library.

This chapter provides background information that has utility if you plan to use this text as the basis for developing your own programs. It is written under the premise that you have had some prior exposure to computer programming. Because this book is not intended to support a computer programming course per se, the discussion pertains only to those aspects related to developing high-quality software for numerical methods. It is also intended to provide you with specific criteria for evaluating the quality of your efforts.

FIGURE 2.1 The evolution of the modern computer era can be conceptualized as a succession of ener- E ations closely linked to tec no- logical advances. In the third generation, a divergence emerged with the advent of machines expressly designed for the needs of individuals. Today, a broad range of computers is available.

Source. Redrawn with permission from Steven Chapra and Raymond Canale, Introduction to Comput~ng for Engineers (New York: McGraw- Hill, 1994).

2.1 COMPUTING ENVIRONMENTS 25

COMPUTING ENVIRONMENTS

We believe that a historical perspective is very useful in appreciating recent advances in software design. As shown in Fig. 2.1, the modem computer age can be conceptualized as a progression of generations. Prior to the early 1970s, computing was monolithic in the sense that large mainframe systems were the only available option. Because they were so expensive to own, operate, and maintain, computers could be acquired only by large orga- nizations such as corporations, universities, government agencies, and the military. As might be expected, the centralized way in which computing was configured had a profound impact on the manner in which engineers and scientists interacted with the machines. In particular, a distinct separation existed between the user and the computer.

In the late 1960s, the introduction of integrated circuits resulted in machines that were significantly faster, cheaper, and smaller. This technological breakthrough in turn spawned a number of new developments that have led to a divergence in the previously one- dimensional evolution of the computing industry (Fig. 2.1).


Whereas earlier computers had been developed expressly for large institutions and or- ganizations, the fourth generation resulted in machines that were designed to meet the needs of individuals. Consequently, today's engineer has a broad spectrum of tools with which to implement computations. In terms of size, cost, and number of users, these tools can be broadly divided into four major categories: personal computers, workstations, mainframes, and supercomputers.

As the name implies, apersonal computer (or PC) is a smaller machine principally intended for an individual user. The workstation is generally more powerful and, although often dedicated to an individual user, is typically part of a network of computers. The mainframe is a larger, more expensive machine usually owned by an institution and serves multiple users. Finally, the supercomputer is a large, very expensive, powerful machine that is principally employed for sophisticated applications by a select group of users.

It should be noted that these computer classifications are not hard and fast. For example, as more personal computers are being used as terminals, the boundaries between today's PC's and workstations are beginning to blur.

Whatever definitions are used, the beauty of today's situation is that we have potential access to all types of computers. For example, preliminary work on a large-scale computation might be performed on a personal computer. Then, a telecommunications link can be used to upload the program to a mainframe for fast implementation. In this way, the strengths of each type of computer are exploited. Engineers with access to a range of machines will have enormous capability at their command.

For this reason, and because we anticipate that you will be using different types of machines to learn numerical methods, we have designed this book so that it can be used with any of these systems. Thus, most of the material can be implemented on systems ranging from personal computers to mainframes.

At the same time, we have tried wherever possible to acknowledge and explore some of the exciting new applications that are directly associated with personal computers. In particular, we have devoted a significant portion of this edition to PC-oriented software packages. We believe that the "personal" aspect of the microcomputer revolution is a major source of the present excitement over computing among students and professionals alike.

2.2 THE SOFTWARE DEVELOPMENT PROCESS

No matter what type of computer you use, it has utility only if you provide it with careful instructions. These instructions are the software. The next sections deal with the process of composing your own high-quality software to implement numerical methods.

Figure 2.2 delineates the five steps that constitute the process. Each of these steps will be discussed in the subsequent sections. However, before embarking on this discussion, we must first mention some qualitative changes that have occurred in software development and that have great relevance to implementing numerical methods on computers.

2.2.1 Programming Style

Just as hardware has evolved dramatically, the software development process has also un- dergone a radical transformation in recent years. In the early years of computing, programming was somewhat more of an art than a science. Although all languages had very precise vocabularies and syntax, there were no standard definitions of good programming style.

2.2 THE SOFTWARE DEVELOPMENT PROCESS 27

FIGURE 2.2 The five steps required to produce high-quality software. The feedback arrows indicate that the first four steps con be improved in the light of experience.

Consequently, individuals developed their own criteria for what constituted an excellent piece of software.

In the early years, these criteria were strongly influenced by the available hardware. Early computers were expensive and slow and had small memories. As a consequence, one mark of a good program was that it utilized as little memory as possible. Another might be that it executed rapidly. Still another was how quickly it could be written the first time.

The ultimate result was that programs were highly nonuniform. In particular, because they were strongly influenced by hardware limitations, programmers put little premium on the ease with which programs might be used and maintained. Thus, if you wanted to effectively utilize and maintain a large program over long periods, you usually had to keep


FIGURE 2.3 Trends in hardware and software costs in the computing industry. Source: I. M. Branscornb, "Electron- ics and Computers: An Overview," Science, 215.755 (1982).

the original programmer close by. In other words, the program and the programmer formed a costly package.

Today many of the hardware limitations do not exist or are rapidly disappearing. In addition, since software costs constitute a proportionally larger fraction of the total computing budget (Fig. 2.3), there is a high premium on developing software that is easy to use and modify. In particular, there is a strong emphasis on clarity and readability rather than on terse, noncommunicative code. This emphasis has great significance in engineering, where work is often performed by teams and programs must be shared and modified by several individuals.

Computer scientists have systematically studied the factors and procedures needed to develop high-quality software of this kind. Although the resulting methods are somewhat oriented toward large-scale programming efforts, most of the general techniques are also extremely useful for the types of programs that engineers develop routinely in the course of their work. Collectively, we will call these techniques structured design and programming. In this section, we discuss three of these approaches: modular design, top-down design, and structured programming.

2.2.2 Modular Design

Imagine how difficult it would be to study a textbook that had no chapters, sections, or paragraphs. Breaking complicated tasks or subjects into more manageable parts is one way to make them easier to handle. In the same spirit, computer programs can be divided into small subprograms, or modules, that can be developed and tested separately. This approach is called modular design.

The most important attribute of modules is that they be as independent and self- contained as possible. In addition, they are typically designed to perform a specific, well- defined function and have one entry and one exit point. As such, they are usually short (generally 50 to 100 instructions in length) and highly focused.

One of the primary programming elements used to represent each module is the subprogram or procedure. A subprogram is a series of computer instructions that together perform a given task. Two types of procedures are commonly employed: functions and subroutines. The former usually returns a single result, whereas the latter returns several. A calling, or main, program invokes these modules as they are needed. Thus, the main program orchestrates each of the parts in a logical fashion.

2.2 THE SOFTWARE DEVELOPMENT PROCESS 29

In addition, it should be mentioned that much of the programming related to software packages like Excel and MATLAB involves the development of functions. That is, Excel macros and MATLAB functions are designed to receive some information, perform a calculation, and return results. Thus, modular thinking is consistent with how programming is implemented in package environments.

Modular design has a number of advantages. The use of small, self-contained units makes the underlying logic easier to devise and to understand for both the developer and the user. Development is facilitated because each module can be perfected in isolation. In fact, for large projects, different programmers can work on individual parts. In the same spirit, you can maintain your own library of useful modules for later use in other programs. Modular design also increases the ease with which a program can be debugged and tested because errors can be more easily isolated. Finally, program maintenance and modirication are facilitated. This is primarily because new modules can be developed to perfo~ln additional tasks and then easily incorporated into the already coherent and organized scheme.

2.2.3 Top-Down Design

Although the previous section introduced the notion of modular design, it did not specify how the actual modules are identified. The top-down design process is a particularly effective approach for accomplishing this objective.

Top-down design is a systematic development process that begins with the most general statement of a program's objectives and then successively divides it into more detailed segments. Thus, the design proceeds from the general to the specific. Most top-down designs start with a literal description of the program in its most general terms. This description is then broken down into a few elements that define the program's major functions. Then each major element is divided into subelements. This process is continued until well- defined modules have been identified.

Besides providing a method for dividing the algorithm into well-defined units, top- down design has other benefits. In particular, it ensures that the finished product is comprehensive. By starting with a broad definition and progressively adding detail, the programmer is less likely to overlook important operations.

2.2.4 Structured Programming

Modular and top-down approaches deal with effective ways to organize a program. In both cases, emphasis was placed on what the program is supposed to do. Although this usually guarantees that the program will be coherent and neatly organized, it does not ensure that the actual program instructions or code for each module will manifest an associated clarity and orderliness. Structured programming, on the other hand, deals with how the actual program code is developed so that it is easy to understand, correct, and modify.

In essence, structuredprogramming is a set of rules that prescribe good style habits for the programmer. Although structured programming is flexible enough to allow corisider- able creativity and personal expression, its rules impose enough constraints to render resulting codes far superior to unstructured versions. In particular, the finished product is more elegant and easier to understand.

The rules that constitute structured programming will be described in a subsequent section (2.4.2). Before doing so, we will first turn to the initial step of the program development process---designing the underlying logic or algorithm of the program.


ALGORITHM DESIGN

The most common problems encountered by inexperienced programmers usually can be traced to the premature preparation of a program that does not encompass an overall strategy or plan. In the early days of computing, this was a particularly thorny problem because the design and programming components were not clearly separated. Programmers would often launch into a computing project without a clearly formulated design. This led to all sorts of problems and inefficiencies. For example, the programmer could get well into a job only to discover that some key preliminary factor had been overlooked. In addition, the logic underlying improvised programs was invariably obscure. This meant that it was usually difficult for someone to modify someone else's program. Today, because of the high costs of software development, much greater emphasis is placed on such aspects as preliminary planning that lead to a more coherent and efficient product. The focus of this plan- - - ning is algorithm design.

An algorithm is the sequence of logical steps required to perform a specific task such as solving a problem. Aside from accomplishing its objectives, a good algorithm must fea- ture several attributes:

1. Each step must be deterministic-that is, nothing can be left to chance. The final results cannot depend on who is following the algorithm. In this sense, an algorithm is analo- gous to a recipe. Two chefs working independently from a good recipe should end up with dishes that are essentially identical.

2. The process must always end after a finite number of steps. An algorithm cannot be open-ended.

3. The algorithm must be general enough to deal with any contingency.

Figure 2.4a shows an algorithm for the solution of the simple problem of adding a pair of numbers. Two independent programmers working from this algorithm might develop programs exhibiting somewhat different styles. However, given the same data, their programs should yield identical results.

Step-by-step literal descriptions of the sort depicted in Fig. 2.4a are one way to express an algorithm. These descriptions are particularly useful for small problems or for specifying the broad tasks of a large programming effort. However, for detailed representations of complicated programs, they rapidly become inadequate. For this reason, more versatile alternatives, called flowcharts and pseudocodes, have been developed.

2.3.1 Flowcharting

Aflowchart is a visual or graphical representation of an algorithm. The flowchart employs a series of blocks and arrows, each of which represents a particular operation or step in the algorithm. The arrows represent the sequence in which the operations are implemented. Figure 2.4b shows a flowchart for the problem of adding two numbers.

Not everyone involved with computer programming agrees that flowcharting is a pro- ductive endeavor. In fact, some experienced programmers do not advocate flowcharts. However, we believe that there are three good reasons for studying them. First, they are still used for expressing and communicating algorithms. Second, even if they are not employed routinely, there will be times when they will prove useful in planning, unraveling, or communicating the logic of your own or someone else's program. Finally, and most

2.3 ALGORITHM DESIGN 3 1

FIGURE 2.4 (a) Algorithm, (b] flowchart, 01

(c) pseudocode for the solutio~ of a simple addition program.

important for our purposes, they are excellent pedagogical tools. From a teaching perspective, they are ideal vehicles for visualizing some of the fundamental control structures employed in computer programming.

Flowchart Symbols. Notice that in Fig. 2.4b the blocks have different shapes to distinguish different types of operations. In the early years of computing, flowchart symbols were nonstandardized. This led to confusion, because programmers used different symbols to represent the same operation. Today a number of standardized systems have been developed to facilitate communication. In this book, we use the set of symbols shown in Fig. 2.5.

A flowchart is read by starting at the beginning terminal and following the flowlines to trace the algorithm's logic. Thus, as shown in Fig. 2.4b, we would start at the "begin" block and then follow the arrow sequentially until we arrive at the "end" block. If the flowchart is written properly, there should never be any doubt regarding the correct path.

Top-Down Algorithm Design. There are no set standards for how detailed a flowchart should be. During the development of a major algorithm, you will often draft flowcharts of various levels of complexity. As an example, Fig. 2.6 depicts a hierarchy of three charts that might be used in the development of an algorithm to determine the grade-point average (GPA) or cumulative index of a student. The system flowchart (Fig. 2 . 6 ~ ) delineates


SYMBOL NAME FUNCTION

Represents the beginning or end of a program.

Flowlines Represents the flow of logic. The humps on the horizontal arrow indicate that it passes over and does not connect with the vertical flowlines.

Process Represents calculations or data manipulations.

Input/output Represents inputs or outputs of data and information

Decision Represents a comparison, question, or decision that determines alternative paths to be followed.

Junction Represents the confluence of flowlines.

Off-page Represents a break that is continued on another page. connector

Count-controlled Used for loops which repeat a prespecified number of iterations loop

FIGURE 2.5 Symbols used in flowcharts

the big picture. It identifies the major tasks, or modules, and the sequence that is required to solve the entire problem. Such an overview flowchart is invaluable for ensuring that the total scheme is comprehensive enough to be successful. Next, the major modules can be broken off and charted in greater detail (Fig. 2.6b). Finally, it is sometimes advantageous to break the major modules down into even more manageable units, as in Fig. 2 . 6 ~ . This is an example of the top-down design process that was introduced in Sec. 2.2.3.

In the following sections we will study and improve the flowchart in Fig. 2 . 6 ~ . This will be done to illustrate how the symbols in Fig. 2.5 can be orchestrated to develop an efficient and effective algorithm. In the process we will demonstrate how the power of computers can be reduced to three fundamental types of operations. The first, which should already be apparent, is that computer logic can proceed in a definite sequence. As depicted in Fig. 2.6a and c, this is represented by the flow of logic from box to box. The other two fundamental operations are selection and repetition.

Structured Algorithms. Notice that the operations in Fig. 2.6a and c specify a simple se- quential flow. That is, the operations follow one after the other. However, notice how the first diamond-shaped decision symbol in Fig. 2.6b permits the flow to branch and follow one of two possible paths, depending on the answer to a question. For this case, if final grades are being determined, the flow follows the right branch, whereas if intermediate or midterm grades are being determined, the flow follows the left branch. This type of flowchart operation is called selection.

Aside from depicting a conditional branch, Fig. 2.6b demonstrates another flowchart operation that greatly increases the power of computer programs-repetition. Notice that

2.3 ALGORITHM DESIGN 33

FIGURE 2.6 A hierarchy of flowcharts dealing with the problem of determining a student's grade-point average. The system flowchart in (a) provides a comprehensive plan. A more detailed chart for a maior module is shown in [b). An even more detailed chart for a segment of the maior module is seen in (c). This way of approaching a problem is called top-down design.


the "false" branch of the second decision symbol allows the flow to return to the head of the module and repeat the calculation process for another course. This ability to perform repetitive tasks is among the most important strengths of computers. The operation of "looping7' back to repeat a set of operations is accordingly named a loop in computer jargon.

The operations of selection and repetition can be employed to great advantage when deriving algorithms. This can be demonstrated by developing an improved version of Fig. 2 . 6 ~ . In Fig. 2.6c, we specified a flowchart to total the quiz grades and determine the average. A more detailed flowchart to accomplish these objectives is presented in Fig. 2.7. Notice that the algorithm consists of a loop and a decision construct. The loop inputs,

FIGURE 2.7 A detailed flowchart to compute the average quiz Through the use o grade, an accumu- lator and a counter, this example improves on the simple version shown in Fig. 2.6.


counts, and sums the grades. These operations are repeated until the input of a negative grade triggers the exit from the loop. At this point, a selection construct is used to compute the average grade. A selection construct is employed for this purpose to avoid division by zero in the event that no positive grade has been entered.

Structured Flowcharts. Flowcharting symbols can be orchestrated to construct logic pat- terns of extreme complexity. However, programming experts have recognized that, without rules, flowcharts can become so complicated that they impede rather than facilitate clear thinking.

One of the main culprits that contributes to algorithm complexity is the unconditional transfer, also known as the GO TO. As the name implies, this control structure allows you to "go to" any other location in the algorithm. For flowcharts, it is simply represented by an arrow. Indiscriminate use of unconditional transfers can lead to algorithms that, at a distance, resemble a plate of spaghetti; hence, the derogatory label spaghetti algorithm.

In an effort to avoid this dilemma, advocates of structured programming have demonstrated that any program can be constructed with the three fundamental control structures depicted in Fig. 2.8. As can be seen, they correspond to the three operations-sequence, selection, and repetition-that we have just been discussing.

All flowcharts should be composed of the three fundamental control structures shown in Fig. 2.8. If we limit ourselves to these structures and avoid GO TOs, the resulting algorithm will be much easier to understand.

Before proceeding, we must make one practical revision to our "no GO TO" dictum. Unfortunately, some languages (notably Fortran 77) and some packages do not presently include a complete set of structured programming constructs. This is particularly true of the packages where simplified, primitive programming languages have often been adopted. As more and more users write macros and scripts, this situation is changing. However, at the time of this book's printing, there are still occasions where you must use a GO TO to implement a standard control structure. This said, if you avoid wild jumps using GO TOs, you can still write decent codes in these environments.

2.3.2 Pseudocode

The object of flowcharting is obviously to develop a quality computer program consisting of a set of step-by-step instructions called code. Inspection of Figs. 2.6 and 2.7 illustrates how top-down flowcharting moves in the direction of a computer program. Recall that in top-down design, more detailed flowcharts are developed as we progress from the general overview (Fig. 2.6a) to specific modules (Fig. 2 . 6 ~ ) . At the highest level of detail, the module flowcharts are almost in the form of a computer program. That is, as seen in Fig. 2.7, each flowchart compartment represents a single well-defined instruction to the computer.

An alternative way to express an algorithm that bridges the gap between flowcharts and computer code is called pseudocode. This technique uses codelike statements in place of the graphical symbols of the flowchart. Figure 2.9 shows pseudocode representations for the fundamental control structures.

We have adopted some style conventions for the pseudocode in this book. Keywords such as IF, DO, etc., are capitalized, whereas the conditions, processing steps, and tasks are in lowercase. Additionally, notice that the processing steps are indented. Thus, the keywords form a "sandwich" around the steps to visually define the extent of each control structure.


Sequence

Selection

Repetition

FIGURE 2.8 The three fundamental control structures.

Figure 2.10 shows a pseudocode representation of the flowchart from Fig. 2.7. This version looks much more like a computer program than the flowchart. Thus, one advantage of pseudocode is that it is easier to develop a program with it than with a flowchart. The pseudocode is also easier to modify and share with others. However, because of their


Sequence

Repetition

FIGURE 2.9 Pseudocode representations of the three fundomental control structures.

FIGURE 2.10 Pseudocode to compute the average quiz grade. This algorithm was represented previously by the flowchart in Fig. 2.7.

Sequence

BEGIN Average Grade

sum = 0 count = 0

Repetition -

DO INPUT grade IF gradk <: o EXIT sum = sum + grade count = count + 1

END DO

- IF count > 0 THEN average = sumlcount

Selection - ELSE average = 0

- END IF

END Average Grade


graphic form, flowcharts sometimes are better suited for visualizing complex algorithms. In this text, pseudocode will be used to communicate algorithms related to numerical methods.

2.3.3 Other Useful Structures

Although any numerical algorithm can be developed from the three fundamental control structures, computer scientists have devised several additional structures that have great utility in numerical methods. The most important of these are the multialtemative selection structure and the count-controlled loop.

The Multialternative Selection Structure. Suppose that the ELSE clause of an IFITHENELSE contains another IFITHEN, as in the following pseudocode:

IF condition Task,

ELSE IF condition

Task, ELSE

7a s k, END IF

END IF

For such cases, the ELSE and the IF can be combined, as in

IF condition Task,

ELSE IF condition f a 5 k,

ELSE Task,

END IF

Notice how the ELSE IF makes the structure more concise. Not only are two lines consol- idated, but one of the END IF statements is no longer needed. A general flowchart for the IFITHENELSE IF structure is shown in Fig. 2.11.

Notice how there is a chain, or "cascade," of decisions. The first one is the IF statement, and each successive decision is an ELSE IF statement. Going down the chain, the first condition encountered that tests true will cause a branch to its corresponding code block followed by an exit of the structure. At the end of the chain of conditions, if all have tested false, there is an optional ELSE block.

The Count-Controlled Loop. The loops in Fig. 2.9 are called logical loops because they terminate on a logical condition. In contrast, count-controlled loops perform a specified number of repetitions, or iterations. Of course, such an operation could be programmed with a logical loop. For example, the following pseudocode is designed to repeat 10 times:


FIGURE 2.1 1 General multialternative structure (IF/THEN/ELSE IF).

i= l DO

IF i > 10 EXIT PKINT i i= i+l

END DO

The variable i is a counter that keeps track of the number of iterations. If i is less than or equal to 10, an iteration is performed. On each pass, the value of i is displayed and i is in- cremented by 1. After the 10th iteration, i becomes 11 ; therefore, the logical condition will test true and the loop will terminate.

Although the pretest loop is certainly a feasible option for performing a specified number of iterations, the operation is so common that special statements are available in all common high-level languages for accomplishing the same objective in a more efficient manner. A flowchart and pseudocode for the count-controlled structure are depicted in Fig. 2.12.

The count-controlled loop works as follows. The index (represented as i in Fig. 2.12) is a variable that is set at an initial value of start. The program then tests whether the index is less than or equal to the final value,finish. If so, it executes the body of the loop and then


FIGURE 2.1 2 Flowchart and pseudocode representations of the count- controlled loop that are useful in implementing numerical methods.

cycles back to the DO statement. Every time the END DO statement is encountered, the index is automatically increased by the step. Thus, the index acts as a counter. Then, when the index is greater than the final value (finish), the computer automatically transfers control out of the loop to the line following the END DO statement.'

2.4 PROGRAM COMPOSITION After concocting the algorithm, the next step is to express it as code. Before discussing how this should be done, we will first discuss some computer languages.

2.4.1 High-level and Macro Languages

Hundreds of computer languages have been developed since the computer age began. The question of which language is "best" has been the subject of heated debate. Unfortunately, many programmers have an almost emotional attachment to "their" language, and some go as far as to contend that all others are inferior. We believe that such parochial attitudes can be self-defeating. Although every language has limitations, each can be used to advantage in particular problem contexts.

In this text, we have chosen to emphasize pseudocode for expressing our numerical algorithms. These can be readily translated into the principal problem-solving languages used by engineers and scientists: Fortran and C. In addition, they can also be implemented in the macro/script languages of packages such as the Visual BASIC dialect used for Excel.

Fortran. Fortran, standing for formula translation, was introduced commercially by IBM in 1957. As the name implies, one of its distinctive features is that it uses a notation that fa-

'Although we believe the foregoing description has pedagogical value, it should be noted that computer languages such as Fortran 90 do not actually perform logical tests when implementing count-controlled loops. Rather, the number of iterations are precomputed during compilation. This is the reason why it is illegal to change the value of a counter variable within the body of a loop.

2.4 PROGRAM COMPOSITION 41

cilitates the writing of mathematical formulas. Because of such features, Fortran is the na- tive computer language of many engineers and scientists.

Aside from mathematical formulas, Fortran has other features that are relevant to numerical methods. For example, it has standard capabilities to handle extended precision and special mathematical functions, including complex variables. Additionally, it is highly conducive to modular programming because its subroutines allow local variable scoping and the transfer of values between subprograms and the main program as arguments. Thus, libraries of numerically oriented subprograms can be developed or obtained commercially for use in a Fortran code.

C. As noted in Fig. 2.3, software development constitutes the dominant share of today's computing costs. Developed at Bell Laboratories in 1972, C is the most popular language today for professional software development.

C possesses many of the features of a high-level programming language and hence can be employed by engineers for numerical calculations. It should be noted that it is generally inferior to Fortran 90 in this regard. However, it also offers the low-level access to hardware and software that is more characteristic of assembly language. At the same time, it is more generic than assembly language, which is usually hardware-specific. This combination of low-level capability and portability is what makes C appealing to software developers.

Although engineers and scientists use computers primarily to solve problems, a small but significant fraction develop software. Further, some universities choose to teach their engineering and science students C as their first computer language. Thus, it must be acknowledged that C will be used by a significant number of engineers and scientists over the next decade.

Visual BASIC. BASIC was originally developed by John Kemeny and Thomas Kurtz as an instructional language for students at Dartmouth College in the mid-1960s. Over the en- suing years it has evolved and grown more sophisticated (e.g., Microsoft's Quick BASIC). Today, its primary implementation is Visual BASIC. This dialect is integrated closely with the Microsoft Windows operating system, the de facto computing environment for IBM PCs and compatibles. Today, it is one of the primary development tools used for software development on PCs. In addition, its utility is also reinforced because of its role as the macro language for the popuiar Excel spreadsheet program.

Dialects. As time passes, new versions, or "dialects," of each language are being developed. Not surprisingly, the newer versions of each language often incorporate some of the advantages of the others. The specific dialects used in this book-Fortran 90 and Excel Vi- sual BASIC-were chosen because of their availability, ease of use, and compatibility with a structured programming approach. Thus, all codes in the book are written in a highly structured style derived from pseudocode. Consequently, all can be easily translated into other languages such as Fortran 77 or C. Additionally, they can be readily implemented within software packages such as MATLAB functions. In the following section, we will elaborate on this structured approach.

2.4.2 Structured Programming

Not everyone agrees on a standard definition of structured programming. In general, the major idea is embodied in what can be called the structure principle (Dijkstra 1968):


The structure principle: The static structure (that is, spread out on the page) of the program should correspond in a simple way to the dynamic structure (that is, spread out in time) of the corresponding computation.

Thus, the "jumping around" due to indiscriminate branching and other poor programming habits will be avoided as a natural consequence of the orderly nature of well-conceived computations.

In addition to the structure principle, there are several other rules that can be considered as part of the structured programming philosophy. The following are among the most generally agreed upon rules:

1. Programs should consist solely of the three fundamental control structures of sequence, selection, and repetition (recall Fig. 2.8). Computer scientists have proved that any program can be constructed from these basic structures.

2. Each of the structures must have only one entrance and one exit. 3. Unconditional transfers (GO TOs) should be avoided. 4. The structures should be clearly identified with comments and visual devices such as in-

dentation, blank lines, and blank spaces.

Although the rules may appear deceptively simple, adherence to them avoids the spaghettilike code that is the hallmark of indiscriminate branching. Because of their benefits, these rules and other structured programming practices will be stressed continually in subsequent parts of this book.

Figures 2.13 and 2.14 show how the selection and repetition constructs are programmed in Fortran 90, C, and Visual BASIC. Note how the codes strongly resemble pseudocode.

Figure 2.15 is a dramatic example of the contrast between unstructured and structured approaches. All three programs are designed to determine an average grade, the algorithm for which was delineated in Fig. 2.10. Although the programs compute identical values, their designs are in marked contrast. The purpose, logic, and organization of the unstructured program in Fig. 2.150 is not obvious. The user must delve into the code on a line-by- line basis to decipher the program.

In contrast, the major parts of Fig. 2.15b and c are immediately apparent. Program execution progresses in an orderly fashion from the top to the bottom with no major jumps. Every segment has one entrance and one exit, and each is clearly delineated by spaces and indentation. In essence, the programs are distinguished by a visual clarity that is reminis- cent of a well-designed flowchart. Additionally, the structured programs are a natural evolution of the pseudocode shown in Fig. 2.10.

On the debit side, the programs in Fig. 2.1% and c may take longer to develop than the program in Fig. 2 . 1 5 ~ . Because many pragmatic engineers are accustomed to "getting the job done" in an efficient and timely manner, the extra effort might not always be justified. Such would be the case for a short program you might develop for a "one-time- only" computation. However, for more important programs that are intended for application by yourself or others, there is no question that a structured approach yields long-term benefits.

2.5 QUALITY CONTROL 43

Pseudocode Fortran 90 C Visual BASIC

IF condition I F ( a < 0 . ) THEN i f ( a < O ) C True alternative b = - a b = -a ;

E N D IF END I F 3

IF condition I F ( a < 0 . ) THEN i f ( a x 0 1 C True alternative b = - a b = -a ;

E L S E ELSE 3 e l s e C False alternative b = a b = a;

E N D IF END I F 3

[c) Mvltialternative d e c ~ s ~ o n

IF condition I F ( a < 0 . ) THEN i f ( a < O ) C Alternative, b = - a b = -a ;

E L S E IF ELSE I F ( a > 0 . ) THEN 3 e l s e i f ( a > O ) C Alternative, b = a b = a ;

E L S E ELSE 3 e l s e C Else Alternative b = 0 . b = 0;

E N D IF END I F 3

FIGURE 2.13 Selection constructs as expressed in Fortran 90, ANSI C, and Visual BASIC

I F a < 0 . THEN

b = - a

END I F

I F a < 0 . THEN

b = - a

ELSE

b = a

END I F

I F a < 0 . THEN

b = - a

ELSE I F a > 0 . THEN

b = a

ELSE

b = 0 .

END I F

2.5 QUALITY CONTROL - -- - - -

With the tools described up to this point, you would be capable of developing a program to implement a calculation on the computer. However, the fact that a program prints out information is no guarantee that these answers are correct. consequently, we will now turn to some issues related to the reliability of your program. We are emphasizing this subject at this point to stress its importance in the program-development process. Since we are problem-solving engineers, most of our work will directly or indirectly be used by clients and sponsors. For this reason, it is essential that our programs be reliable-that is, do exactly what they are supposed to do. At best, lack of reliability can be embarrassing. At worst, for engineering problems involving public safety, it can be tragic.

2.5.1 Errors or "Bugs"

During the course of preparing and executing a program of any size, it is likely that errors will occur. These errors are sometimes called bugs in computer jargon. This label was


Pseudocode Fortran 90 C Visual BASIC

13) Logical loop (m~dtest)

DO D 0 w h i l e ( 1 ) C D 0 Task a = a / 2 . a = a / 2 ; a = a / 2 .

IF condition EXIT I F ( a < 1 . ) E X I T i f ( a < l ) b r e a k ; I F ( a < 1 . ) E X I T

Task a = a / 4 . a = a / 4 ; a = a / 4 . END DO END DO > LOOP

(b ] Count-controlled loop

DOi=s tar t , f in ish ,s tep DO i = 1 , 1 0 , 2 f o r ( i = I ; i < = l O ; i = i + 2 ) C FOR i = 1 TO 1 0 STEP

Task x = x + i x = x + i ; x = x + i END DO END DO 1 NEXT i

FIGURE 2.14 Repetition constructs as expressed in Fortran 90, ANSI C, and Visual BASIC

coined by Admiral Grace Hopper, one of the pioneers of computer languages. In 1945, she was working with one of the earliest electromechanical computers when it went dead. She and her colleagues opened the machine and discovered that a moth had become lodged in one of the relays. Thus, the label "bug" came to be synonymous with problems and errors associated with computer operations, and the term debugging came to be associated with their solution.

When you are developing and running a program, four types of bugs can occur:

1. Syntax errors violate the rules of the language such as spelling, number formation, and other conventions. These errors are often the result of mistakes such as typing REED rather than READ. They usually result when the program is compiled and result in the computer printing out an error message. Such messages are called diagnostics because the computer is helping you to "diagnose" the problem.

2. Link or build errors occur during the link phase. A common example might be that you misspelled the name of an intrinsic function. In such cases, the computer would print out a message like, "unresolved external reference."

3. Run-time errors are those that occur during program execution. An example is when there are an insufficient number of data entries for the number of variables in an input statement. For such situations a diagnostic will usually be printed and you must then re- enter the data properly. For other run-time errors the computer may simply terminate the run or print a message providing you with information regarding the error. In any event, you will be cognizant of the fact that an error has occurred.

4. Logic errors, as the name implies, are due to faulty program logic. These are the worst type of error because they can occur without diagnostics. Thus, your program will appear to be working properly in the sense that it executes and generates output. However, the output will be incorrect.

2.5 QUALITY CONTROL 45

S=O. I =o

1 READ *,G I F ( G . L T . 0 ) GO TO 2 S=S+G I = I + l G O TO 1

2 I F ( I . E Q . 0 ) G O TO 3 A = S / I G O TO 4

3 A=O. 4 PRINT *,A

END

PROGRAM g r a d e r I M P L I C I T n o n e REAL :: a v g r d PRINT *, a v g r d 0 END

FUNCTION a v g r d 0 I M P L I C I T n o n e REAL :: sum,grade,avegrd INTEGER :: c o u n t

c o u n t = 0 sum = 0 . READ *, g r a d e

D 0 I F ( g r a d e < O . ) E X I T sum = s u m + g r a d e c o u n t = c o u n t + 1 READ *, g r a d e

END DO

I F ( c o u n t >O) THEN a v g r d = s u m l c o u n t

ELSE a v g r d = 0 .

END I F

END

m a i n 0 C

f l o a t a v e g r d 0 ; p r i n t f ( " % f \ n n , a v e g r d 0 ) ;

1

f l o a t a v e g r d 0 C

i n t c o u n t ; f l o a t sum, ave, g r a d e ;

c o u n t = 0; sum = 0 .0 ; s c a n f ( " % f " , & g r a d e ) ;

w h i l e ( 1 ) C i f ( g r a d e < 0 . 0 ) b r e a k ; sum = sum + g r a d e ; c o u n t + + ; s c a n f ( " % f M , & g r a d e ) ;

1

i f ( c o u n t > 0 ) C a v e = s u m / c o u n t ;

) e l s e C a v e = 0 .0 ;

1 r e t u r n Cave);

1

PROGRAM g r a d e r

FIGURE 2.1 5 (a) An unstructured Fortran 77 and structured (b) Fortran 90 and (c) C versions of the same program. Both versions perform identical cornputations but differ markedly in terms of clarity

All the above types of errors must be removed before a program can be employed for engineering applications. We divide the process of quality control into two categories: debugging and testing. As mentioned above, debugging involves correcting known errors. In contrast, testing is a broader process intended to detect errors that are not known. Addi- tionally, testing should also determine whether the program fulfills the needs for which it was designed.

As will be clear from the following discussion, debugging and testing are interrelated and are often carried out in tandem. For example, we might debug a module to remove the obvious errors. Then, after a test, we might uncover some additional bugs that require cor- rection. This two-pronged process would be repeated until we were convinced that the program was absolutely reliable.


2.5.2 Debugging

As stated above, debugging deals with correcting known errors. There are three ways in which you will become aware of errors. First, an explicit diagnostic can provide you with the exact location and the nature of the error. Second, a diagnostic may occur but the exact location of the error is unclear. For example, the computer may indicate that an error has occurred in a line representing a long equation including some user-defined functions. This error could be due to syntax errors in the equation or it could also stem from errors in the statements defining the functions. Third, no diagnostics occur but the program does not operate properly. These are usually due to logic errors such as infinite loops or mathematical mistakes that do not yield syntax errors. Corrections are easy for the first type of error. However, for the second and third types, some detective work is in order to identify the errors and their exact locations.

Unfortunately, the analogy with detective work is all too apt. That is, ferreting out errors is often difficult and involves a lot of "leg work," collecting clues and running up some blind alleys. In particular, as with detective work, there is no clear-cut methodology involved. However, there are some ways of going about it that are more efficient than others.

One key to identifying bugs is to print out intermediate results. With this approach, it is often possible to determine at what point the computation went awry. Similarly the use of a modular approach will obviously help to localize errors in this way. You could put a different output statement at the start of each module and in this way focus in on the module in which the error occurred. In addition, it is always good practice to debug (and test) each module separately before integrating it with the total package. It should be noted that many programming environments include built-in tools that facilitate debugging.

Although the above procedures can help, there are often times when your only option will be to sit down and read the code line by line following the flow of logic and performing all the computations with a pencil, paper, and pocket calculator. Just as a detective sometimes has to think like a criminal to solve a case, there will be times you will have to think like a computer to debug a program.

Finally, we cannot stress enough that "an ounce of prevention is worth a pound of cure." That is, sound preliminary programming design and structured programming techniques are great ways to avoid errors in the first place. As with so many other aspects of your life, the sooner you adopt a "pay me now rather than pay me later" philosophy, the less grief you will experience as a computer programmer.

2.5.3 Testing

One of the greatest misconceptions of the novice programmer is the belief that, if a program runs and prints out results, it is correct. Of particular danger are those cases where the results appear to be "reasonable" but are in fact wrong. To ensure that such cases do not occur, the program must be subjected to a battery of tests for which the correct answer is known beforehand.

As mentioned in the previous section, it is good practice to debug and test modules prior to integrating them into the total program. Thus, testing usually proceeds in phases. These are module tests, development tests, whole system tests, and operational tests.

Module tests, as the name implies, deal with the reliability of individual modules. Bcc;~usc cach is dcsigncd to perfor111 a specific, well-defined function, the modules can

2.6 DOCUMENTATION AND MAINTENANCE 47

be run in isolation to determine that they are executing properly. Sample input data can be developed for which the proper output is known. This data can be used to run the module and the outcome compared with the known result to verify successful performance.

Development tests are implemented as the modules are integrated into the total program package. That is, a test is performed after each module is integrated. An effective way to do this is with a top-down approach that starts with the first module and progresses down through the program in execution sequence. By performing a test after each module is incorporated, problems can be isolated more easily because new errors can usually be attributed to the latest module that has been added.

After the total program has been assembled, it can be subjected to a series of whole- system tests. These should be designed to expose the program to (1) typical data, (2) unusual but valid data, and (3) incorrect data to check the program's error-handling capabilities.

Note that all the foregoing tests are often collectively referred to as alpha-testing. Once they are successful, a beta- or operational testing phase is usually initiated. These tests are designed to check how the program performs in a realistic setting. A standard way to do this is to have a number of independent subjects (including the ultimate user) implement the program. Operational tests sometimes turn up bugs. In addition, they provide feedback to the developer regarding ways to improve the program so that it more closely meets the user's needs.

Although the ultimate objective of the testing process is to ensure that programs are error-free, there will be complicated cases where it is impossible to subject a program to every possible contingency. Also, it should be recognized that the level of testing depends on the importance and magnitude of the problem context for which the program is being developed. There will obviously be different levels of rigor applied to testing a program for determining the averages of your intramural softball team as compared to a program to reg- ulate the operation of a nuclear reactor or a space station. In each case, however, adequate testing must be performed that is consistent with the liabilities connected with a particular problem context.

2.6 DOCUMENTATION AND MAINTENANCE

2.6.1 Documentation

After the program has been debugged and tested, it must be documented. Documentation is the addition of English-language descriptions to allow the user to implement the program more easily. Remember, along with other people who might employ your software, you also are a "user." Although a program may seem clear and simple when it is first composed, with time the same code may seem inscrutable. Therefore, sufficient documentation must be included to allow you and other users to immediately understand your program and how it can be implemented. This task has both internal and external aspects.

Internal Documentation. Picture for a moment a text with just words and no paragraphs, subheadings, or chapters. Such a book would be more than a little intimidating to study. The way the pages of a text are structured-that is, all the devices that act to break up and organize the material-makes the text more effective and enjoyable. In the same sense,


internal documentation of computer code can greatly enhance the user's understanding of a program and how it works.

The following are some general suggestions for documenting programs internally:

1. Include a module at the head of the program giving the program's title, your name, and the date. This is your signature and marks the program as your work.

2. Include a second module to define each of the key variables. 3. Select variable names that are reflective of the type of information the variables are to

store. 4. Insert spaces within statements to make them easier to read. 5. Use comment statements liberally throughout the program to add explanation and to

skip lines for the purposes of labeling, clarity, and separation of modules. 6. In particular, use comment statements to clearly label and set off all the modules. 7. Use indentation to clarify the program structures. In particular, indentation can be used

to set off loops and decisions.

External Documentation. This refers to instructions in the form of output messages and supplementary printed matter. The messages occur when the program runs and are intended to make the output attractive and "user-friendly." This involves the effective use of spaces, blank lines, or special characters to illuminate the logical sequence and structure of the program output. Attractive output simplifies the detection of errors and enhances the communication of program results. Finally, the program can be designed to output descriptive error messages to alert and inform the user regarding problems.

The supplementary printed matter can range from a single sheet to a comprehensive user's manual. The user's manual for your computer is an example of comprehensive documentation. This manual tells you how to run your computer system and disk-operating programs. When developing your own software, you may find it useful to develop a short "user's manual" that provides description and instruction.

2.6.2 Storage and Maintenance

Remember that when you sign off the computer, your active file will be destroyed. To retain a program for later use, you must transfer it to a secondary storage device such as a disk or tape before turning the machine off.

Aside from this physical act of retaining the program, storage and maintenance consists of two major tasks: (1) upgrading the program in light of experience and (2) ensuring that the program is safely stored. The former is a matter of communicating with users to gather feedback regarding suggested improvements to your design. The question of safe storage of programs is especially critical during development. A backup copy should always be created before making major code changes. Also, maintaining backups on external media such as diskettes is imperative.

2.7 SOFTWARE STRATEGY

One of your primary objectives in using this book should be to develop your own numerical methods software library. Whether you use them to build short macros or large sophisticated computer programs, such a library will be of immense value in your future acade-

2.7 SOFTWARE STRATEGY 49

TABLE 2.1 The programs in the Windows version of the Numerical Methods TOOLKIT.

Program Capability

Roots of equations Determines a real root of a single algebraic or transcendental equation using the bisect~on method. Can also be used to plot functions.

Systems of linear algebraic Solves or finds the inverse of up to 20 linear algebraic equations

equations* using an 1U decomposition implementation of Gauss elimination. Curve fitting* Fits data with up to a 10th-order regression polynomial and

statistically characterizes the adequacy of the fit Integration* Integrates functions or unevenly spaced tabular data with the

trapezoidal rule. Systems of ord~nary differential Solves up to five simultaneous ordinary differential equations using

equat~ons* the 4th-order Runge-Kutta method

*Denotes sign~ficant improvements compared to the previous DOS version

mic and professional efforts. We have provided a diverse array of computer media to help you accomplish this objective. These include pseudocode and software.

Pseudocode and other descriptions of algorithms are provided for many methods in the book. If used as the basis for your own programs, these can constitute a fairly comprehensive numerical methods capability.

Over and above your own collection of programs, we have also developed a generic software product called the Numerical Methods TOOLKIT that is used in conjunction with this book. This package includes modules devoted to five of the more fundamental numerical methods (Table 2.1). Each of the numerical methods programs is fully illustrated by an example problem in the relevant chapter of the book. The examples can provide you with models for the design of input and output screens, including the graphical display of results. Homework problems are included in the relevant chapters to reinforce your capabilities to use the disk on your own computer. Additionally, the software can be employed to check the accuracy of your own programming efforts.

EXAMPLE 2.1 Install and use the Numerical Methods TOOLKIT on your computer

Problem Statement. The purpose of this example is to use the software provided with the text (the Numerical Methods TOOLKIT).

Solution. Run the install.exe program on the disk included with the text and start up the TOOLKIT. The screen should contain the TOOLKIT main menu as shown in Fig. 2 . 1 6 ~ . First, notice the row of five control buttons on the bottom of the menu. The About button describes the authorship of the software. Click this button now and then click OK. The Help button has five messages in sequence that remind you of certain conventions and design features of the TOOLKIT. The first help message describes the user-defined functions that are recognized by the software. These include familiar trigonometric and exponential functions. Note that the arguments of the trigonometric functions are expressed in radians rather than degrees. Thus the sine of 90 degrees must be written as SIN(2*pi*90/360) or SIN(pi12) or SIN(1.57) when using the TOOLKIT. The second Help message describes how plots are generated by the TOOLKIT. Four plot parameters


FIGURE 2.16 (a) Title screen from the Numerical Methods TOOLKIT software. (b) The root location screen showing a graph of the falling parachutist's velocih/ [Eq. ( 1 . 1 O)].

are required to construct a plot. These are the minimum values of X and Y and the interval between tic marks for both X and Y. There are always 10 Delta X and 10 Delta Y intervals on a plot such that the maximum values of X and Y are Xmin + 10 Delta X and Ymin + 10 Delta Y. This convention allows you to create plots of your liking. The third Help message explains the red warning button convention employed by the TOOLKIT. You will note that the TOOLKIT is designed to show all the inputs as well as numerical and graphical results for each method on a single screen. This allows you to view all aspects of a particular problem with a single glance. However, suppose you plot

2.7 SOFTWARE STRATEGY 5 1

a function Y(X) with X and Y ranging from "0 to 10." Upon inspection of the plot you decide that it would be useful to change the range of Y from "0 to 10" to "0 to 20." When you change the value 10 to 20 in the input table, you will note that the Plot button changes color from black to red. This simply alerts you that the plot must be updated to reflect the new range of 20. Click the red Plot button and it changes to black, signaling that the plot and the input parameters are consistent. Thus, when using the software on an interactive basis, you should make sure that there are no red buttons when final results are recorded. The fourth Help message informs you that TOOLKIT computations can be saved and recovered as files with ".nmt" extensions. Thus, if you use the TOOLKIT to help with a homework problem in a mechanical engineering course, you might name your file, MEPROB4.nmt. The fifth Help message alerts you to the fact that the TOOLKIT Print function may not work properly on some network configurations.

The Open and Save buttons invoke typical Windows boxes that allow you to name, store, and recall files in locations of your choice. Any analyses performed with the TOOLKIT will be lost upon exit unless you take specific action using the Save button.

Next, notice the five large stacked buttons on the left side of the main menu. These buttons have names, figures, and mathematical symbols to remind you of the function of the numerical methods. We will carefully explore the program behind each one of these buttons in appropriate sections of the text.

For now we will use the Find Root (top button) program to plot the velocity of the falling parachutist as a function of time. To do this click the Find Root button, whereupon the screen should produce a pattern similar to that of Fig. 2.16b. The screen has five main parts: a table for the input parameters, a box for the input function, a plot window, a table for the results, and a bar containing control buttons. Let us ignore the root-finding capabilities of the program until Part Two and simply use the program to plot Eq. (1.10) with m = 68.1 kg, c = 12.5 kg/s, and g = 9.8 m/s2. Figure 2 . 1 6 ~ shows the results for Xmin and Ymin = 0 with Delta X = 1.5 and Delta Y = 6. Try various values for the plot parameters including negative Xmin and Ymin to gain familiarity with the plot operation and design. Note that the Plot button is red when the plot needs to be updated ( i t . , the actual plot showing on the screen and the plot parameters are inconsistent). Experiment with the Clear button as an alternative way to remove the input parameters and function from the boxes. Also click the Notes button. A window appears that you can use to write notes concerning your work. These notes will be saved and retrieved when you use the Save and Open buttons on the main menu. Finally, use the Print button to generate a hard copy of the plot and input parameters and function. Remember, this may not work for some network printers. This hard copy might be submitted in a report to a client or as part of a homework assignment.

The ability of the TOOLKIT to create plots will have many other uses as you pursue your goal of understanding and applying numerical methods and computers to solve engineering problems. We will explore these uses as well as the other capabilities of the TOOLKIT in appropriate sections of the text.

2.7.1 Packages and Libraries

Throughout this book, we will endeavor to describe how numerical methods are implemented using program software packages and libraries. The former will stress Mathcad,

COMPUTERS AND SOFTWARE

Excel, and MATLAB. The latter will focus on the IMSL library. A short description of each follows.

Mathcad. Mathcad is the flagship product of Mathsoft, Inc. This software package at- tempts to bridge the gap between spreadsheets like Excel and scratch pads. The result is an interactive worksheet that allows the user to perform a number of mathematical, data- handling, and graphical tasks commonly encountered in engineering and science. Informa- tion and equations are input to the worksheet as you would with paper and pencil rather than with a programming language or with spreadsheet notation. The first version of Math- cad was released in 1986, and version 7 was released in 1997.

Mathcad can perform tasks in either numeric or symbolic mode. In numeric mode, Mathcad functions and operators give numerical responses, whereas in symbolic mode results are given as general expressions or equations. Maple V, a comprehensive symbolic math package, is the basis of the symbolic mode and was incorporated into Mathcad in 1993.

Mathcad has a variety of functions and operators that allow convenient implementation of many of the numerical methods developed in this book. These will be described in detail in succeeding chapters. Because of its ease of use, we believe Mathcad is particularly useful as a pedagogical tool.

Excel. Excel is the spreadsheet produced by Microsoft, Inc. Spreadsheets are a special type of mathematical software that allow the user to enter and perform calculations on rows and columns of data. As such, they are a computerized version of a large accounting worksheet on which large interconnected calculations can be implemented and displayed. Because the entire calculation is updated when any number on the sheet is changed, spreadsheets are ideal for "what if?" sorts of analysis.

Excel has some built-in numerical capabilities including equation solving, curve fitting, and optimization. It also includes Visual BASIC as a macro language that can be used to implement numerical calculations. Finally, it has several visualization tools (such as three-dimensional surface plots) that serve as valuable adjuncts for numerical analysis.

MATLAB. MATLAB is the flagship software product of Mathworks, Inc. co-founded by the numerical analyst Cleve Moler and John N. Little. As the name implies, MATLAB was originally developed as a matrix laboratory. To this day, the major element of MATLAB is still the matrix. Mathematical manipulations of matrices are very conveniently implemented in an easy-to-use, interactive environment. To these matrix manipulations, MAT- LAB has added a variety of numerical functions, symbol'ic computations, and visualization tools. As a consequence, the present version represents a fairly comprehensive technical computing environment.

MATLAB has a variety of functions and operators that allow convenient implementation of many of the numerical methods developed in this book. These will be described in detail in succeeding chapters. In addition, programs can be written as so-called scripts, functions, and M-files that can be used to implement numerical calculations.

IMSL. The IMSL (International Mathematical and Statistical Library, Inc.) consists of three separate but coordinated libraries of Fortran subroutines and functions: the MATH (general applied mathematics), STAT (statistics), and SFUN (special function) libraries. This text focuses on the MATH library, which contains over 700 subprograms spanning all

PROBLEMS 53

the numerical areas covered in this text. Most are available in both single and double precision versions. As such, it constitutes an immense capability for numerical analysis. Al- though the routines themselves are written in Fortran, they can be accessed from other languages such as C.

It should be noted that most of our Mathcad, Excel, and MATLAB applications will stress their built-in numerical functions (e.g., matrix manipulations, regression, root solving). However, it should be noted that our pseudocodes can be readily translated into macro programs for direct implementation in these environments.

PROBLEMS - -

2.1 List and define five steps in the production of high-quality software in engineering. 2.2 Define top-down design, modular design, and structured programming. 2.3 Write pseudocode to implement the flowchart depicted in Fig. P2.3. Make sure that proper indentation is included to make the structure clear. 2.4 A value for the concentration of a pollutant in a lake is recorded on each in a set of index cards. A card marked "end of data" is placed at the end of the set. Write an algorithm to determine the sum and the average of these values.

FIGURE P2.3

2.5 Write a structured flowchart for Prob. 2.4. 2.6 Write an algorithm to print out the roots (either real or complex) of a quadratic equation ax2 + bx + c = 0, where a, b, and c are real coefficients. Design your algorithm so that it deals with all possible contingencies that might occur during the computation. 2.7 Develop a structured flowchart for Prob. 2.6. 2.8 Write pseudocode for Prob. 2.6. 2.9 Develop, debug, and document a subprogram for Prob. 2.6 in either a high-level language or a macro language of your choice. Use a subroutine to compute the roots. Perform test runs for the cases (a) a = 1, b = 5 , c = 2; (b) a = 0, b = -3, c = 1.5; (c) a = 1, b = 2.5, c = 7. 2.10 The sine function can be evaluated by the following infinite series:

x3 x5

Write an algorithm to implement this formula so that it computes and prints out the values of sin x as each term in the series is added. In other words, compute and print in sequence the values for

sinx = x

x3 sinx = x - -

3!

up to the order term of your choosing. For each of the above, compute and print out the percent relative error as

true - series approximation % error = x 100%

true

2.11 Develop a structured flowchart for Prob. 2.10. 2.12 Write pseudocode for Prob. 2.10.


2.13 Develop, debug, and document a subprogram for Prob. 2.10 in either a high-level language or a macro language of your choice. Employ the library function for the sine in your computer to determine the true value. Have the program print out the series approximation and the error at each step. As a test case, employ the program to compute sin 1.5 for up to and including the term xis/ 15!. Interpret your results. 2.14 The following algorithm is designed to determine a grade for a course that consists of quizzes, homework, and a final exam:

Step 1: Input course number and name. Step 2: Input weighting factors for quizzes (WQ), homework

(WH), and the final exam (WF). Step 3: Input quiz grades and determine an average quiz grade

(AQ). Step 4: Input homework grades and determine an average home-

work grade (AH). Step 5: If this course has a final grade, continue to step 6. If not, go

to step 9. Step 6: Input final exam grade (FE). Step 7: Determine average grade AG according to

W Q x A Q + W H x A H + W F * F E AG =

W Q + W H + W F

Step 8: Go to step 10. Step 9: Determine average grade AG according to

W Q x A Q + W H x AH AG =

WQ + WH

Step 10: Print out course number, name, and average grade. Step 11: Terminate computation.

Write, debug, and document a structured computer program based on the algorithm. Test it using the following data to calculate a grade without the final exam and a grade with the final exam: WQ = 30; WH = 30; WF = 40; quizzes = 98, 95, 90, 60, 99; homework = 95,90,86, 100, 100,77; and final exam = 91. 2.15 An amount of money P is invested in an account where interest is compounded at the end of the period. The future worth F yielded at an interest rate i after n periods may be determined from the following formulation:

Write a subprogram that will calculate the future worth of an in- vestment in either a high-level language or a macro language of your choice. The input to the program should include the initial in- vestment P, the interest rate i (as a decimal), and the number of years n for which the future worth is to be calculated. The output should also include these values. The output should also include, in a labeled table format, the future worth for each year up to and including the nth year. Run the program for P = $100,000, i = 0.08, and n = 25 years.

2.16 Figure P2.16 shows the reverse of a checking account statement. The bank has developed this to help you balance your checkbook. If you look at it closely, you will realize that it is an algorithm. Develop a step-by-step algorithm (in a format similar to Fig. 2.6a) to accomplish the same task. 2.17 Write a structured flowchart for Prob. 2.16. 2.18 Develop, debug, and document a subprogram for Prob. 2.16 in either a high-level language or a macro language of your choice. Test it by verifying that it duplicates the results shown in Fig. P2.16. 2.19 Write, debug, and document a subprogram to determine statistics for your favorite sport in either a high-level language or a macro language of your choice. Pick anything from football to jog- ging to bowling. If you play intramural sports, make up a subprogram for your team. Design the program so that it is user-friendly and provides valuable and interesting information to anyone (for example, a coach or a player) who might use it to evaluate athletic performance. 2.20 The average daily temperature for an area can be approximated by the following function,

where T,,,,,, = the average annual temperature, T,,,, = the peak temperature, w = the frequency of the annual variation (= 2 n /365), and t,,,, = day of the peak temperature ( 2 205 d). Parameters for some U.S. towns are listed in Table P2.20.

TABLE P2.20 Mean daily air temperature parameters for some selected U.S. locations.

City Tmmn ("C) T,k ("C)

M~ami , FL 22.1 28.3 Yurna, AZ 23.1 33.6 Bismarck, ND 5.2 22.1 Seattle, WA 10.6 17.6 Boston, M A 10.7 22.9

Develop a subprogram in either a high-level language or a macro language of your choice that computes the average temperature between two days of the year entered by the user. Test it for (a) Janu- ary-February in Bismarck, North Dakota (t = 0 to 59), and the July-August temperature in Yuma, Arizona (t = 180 to 242). 2.21 Economic formulas are available to compute annual payments for loans. Suppose that you borrow an amount of money P and agree to repay it in n annual payments at an interest rate of i. The formula to compute the annual payment A is

PROBLEMS 5 5

CHECKS OUTSTANDING - NOT CHARGED TO ACCOUNT

FIGURE P2.16

MONTH

NEW BALANCE 5 AS SHOWN ON THlS STATEMENT

ADD DEPOSITS NOT CREDITED IN THlS STATEMENT

April 19 97

TOTAL 5 - SUBTRACT TOTAL CHECKS OUTSTANDING

YOUR CHECKBOOK BALANCE 600.52 AFTER SUBTRACTING CURRENT MONTH'S SERVICE CHARGE AND ADDING INTEREST EARNED (EARNER ACCOUNTS ONLY) TO YOUR BALANCE

Write a subprogram in either a high-level language or a macro language of your choice to compute A. Test it with P = $35,000 and an interest rate of 15 percent (i = 0.15). Set up the program so that you can evaluate as many values of n as you like. Compute results for n = 1, 2, 3,4, and 5. 2.22 Develop, debug, and test a subprogram in either a high-level language or a macro language of your choice to compute the velocity of the falling parachutist as outlined in Example 1.2. Design the program so that it allows the user to input values for the drag coefficient and mass. Test the program by duplicating the results from Example 1.2. Repeat the computation but employ step sizes of 1 and 0.5 s. Compare your results with the analytical solution obtained previously in Example 1.1. Does a smaller step size make the results better or worse? Explain your results. 2.23 Use the bisection program on the TOOLKIT to plot several functions of your choice. Try polynomials and transcendental functions (e.g., sines, logs) whose behavior may be difficult to visualize

in advance of plotting. Try functions that may not be continuous or otherwise unusual. Utilize several choices for the scales of both the x and y axes to facilitate your exploration. Make permanent copies of the plots if you have a printer. Help us beta-test the program by sending us the output of any examples that failed to perform properly. 2.24 Read the appendices concerning the capabilities and operation of packages useful for numerical methods. Select one or more packages and repeat Prob. 2.23. 2.25 Attain the capability to plot functions in a manner similar to the TOOLKIT. Use your own computer and its utility software or library subroutines if appropriate. If your computer does not have programs to help you, then write your own code using the capabilities of your computer. Develop this program in the form of a subroutine and save it on a magnetic storage disk. Be prepared to up- date this program as you apply it to various problems covered in other parts of this book.

CHAPTER 3 Approximations and ~obnd-off Errors

Because so many of the methods in this book are straightforward in description and application, it would be very tempting at this point for us to proceed directly to the main body of the text and teach you how to use these techniques. However, understanding the concept of error is so important to the effective use of numerical methods that we have chosen to devote the next two chapters to this topic.

The importance of error was introduced in our discussion of the falling parachutist in Chap. 1. Recall that we determined the velocity of a falling parachutist by both analytical and numerical methods. Although the numerical technique yielded estimates that were close to the exact analytical solution, there was a discrepancy, or error, because the numerical method involved an approximation. Actually, we were fortunate in that case because the availability of an analytical solution allowed us to compute the error exactly. For many applied engineering problems we cannot obtain analytical solutions. Therefore, we eannot compute exactly the errors associated with our numerical methods. In these cases we must settle for approximations or estimates of the errors.

Such errors are characteristic of most of the techniques described in this book. This statement might at first seem contrary to what one normally conceives of as sound engineering. Students and practicing engineers constantly strive to limit errors in their work. When taking examinations or doing homework problems, you are penalized, not rewarded, for your errors. In professional practice, errors can be costly and sometimes catastrophic. If a structure or device fails, lives can be lost.

Although perfection is a laudable goal, it is rarely, if ever, attained. For example, despite the fact that the model developed from Newton's second law is an excellent approximation, it would never in practice exactly predict the parachutist's fall. A variety of factors such as winds and slight variations in air resistance would result in deviations from the prediction. If these deviations are systematically high or low, then we might need to develop a new model. However, if they are randomly distributed and tightly grouped around the prediction, then the deviations might be considered negligible and the model deemed adequate. Numerical approximations also introduce similar discrepancies into the analysis. Again, the question is: How much error is present in our calculations and is it tolerable?

3.1 SIGNIFICANT FIGURES 57

This chapter and the next cover basic topics related to the identification, quantification, and minimization of these errors. In this chapter, general information concerned with the quantification of error is reviewed in the first sections. This is followed by a section on one of the two major forms of numerical error: round-off error. Round-off error is due to the fact that computers can represent only quantities with a finite number of digits. Then the next chapter deals with the other major form: truncation error. Truncation error is the discrepancy introduced by the fact that numerical methods may employ approximations to represent exact mathematical operations and quantities. Finally, we briefly discuss errors not directly connected with the numerical methods themselves. These include blunders, formulation or model errors, and data uncertainty.

3.1 SIGNIFICANT FIGURES

This book deals extensively with approximations connected with the manipulation of numbers. Consequently, before discussing the errors associated with numerical methods, it is useful to review basic concepts related to approximate representation of the numbers themselves.

Whenever we employ a number in a computation, we must have assurance that it can be used with confidence. For example, Fig. 3.1 depicts a speedometer and odometer from an automobile. Visual inspection of the speedometer indicates that the car is traveling between 48 and 49 k m h . Because the indicator is higher than the midpoint between the markers on the gauge, we can say with assurance that the car is traveling at approximately 49 k m h . We have confidence in this result because two or more reasonable individuals reading this gauge would arrive at the same conclusion. However, let us say that we insist that the speed be estimated to one decimal place. For this case, one person might say 48.8,

FIGURE 3.1 An automobile speedometer and odometer illustrating the concept of a significant figure.

58 APPROXIMATIONS AND ROUND-OFF ERRORS

whereas another might say 48.9 km/h. Therefore, because of the limits of this instrument, only the first two digits can be used with confidence. Estimates of the third digit (or higher) must be viewed as approximations. It would be ludicrous to claim, on the basis of this speedometer, that the automobile is traveling at 48.8642138 kmh. In contrast, the odometer provides up to six certain digits. From Fig. 3.1, we can conclude that the car has trav- eled slightly less than 87,324.5 km during its lifetime. In this case, the seventh digit (and higher) is uncertain.

The concept of a significant figure, or digit, has been developed to formally designate the reliability of a numerical value. The significant digits of a number are those that can be used with confidence. They correspond to the number of certain digits plus one estimated digit. For example, the speedometer and the odometer in Fig. 3.1 yield readings of three and seven significant figures, respectively. For the speedometer, the two certain digits are 48. It is conventional to set the estimated digit at one-half of the smallest scale division on the measurement device. Thus the speedometer reading would consist of the three significant figures: 48.5. In a similar fashion, the odometer would yield a seven-significant-figure reading of 87,324.45.

Although it is usually a straightforward procedure to ascertain the significant figures of a number, some cases can lead to confusion. For example, zeros are not always significant figures because they may be necessary just to locate a decimal point. The numbers 0.00001845, 0.0001845, and 0.001845 all have four significant figures. Similarly, when trailing zeros are used in large numbers, it is not clear how many, if any, of the zeros are significant. For example, at face value the number 45,300 may have three, four, or five significant digits, depending on whether the zeros are known with confidence. Such uncertainty can be resolved by using scientific notation, where 4.53 x lo4, 4.530 x lo4, 4.5300 x lo4 designate that the number is known to three, four, and five significant figures, respectively.

The concept of significant figures has two important implications for our study of numerical methods:

1. As introduced in the falling parachutist problem, numerical methods yield approximate results. We must therefore develop criteria to specify how confident we are in our approximate result. One way to do this is in terms of significant figures. For example, we might decide that our approximation is acceptable if it is correct to four significant figures.

2. Although quantities such as n, e, or 2/;j represent specific quantities, they cannot be expressed exactly by a limited number of digits. For example,

ad infiniturn. Because computers retain only a finite number of significant figures, such numbers can never be represented exactly. The omission of the remaining significant figures is called round-off error.

Both round-off error and the use of significant figures to express our confidence in a numerical result will be explored in detail in subsequent sections. In addition, the concept of significant figures will have relevance to our definition of accuracy and precision in the next section.

3.2 ACCURACY AND PRECISION 59

ACCURACY AND PRECISION

The errors associated with both calculations and measurements can be characterized with regard to their accuracy and precision. Accuracy refers to how closely a computed or measured value agrees with the true value. Precision refers to how closely individual computed - or measured values agree with each other.

These concepts can be illustrated graphically using an analogy from target practice. The bullet holes on each target in Fig. 3.2 can be thought of as the predictions of a numerical technique, whereas the bull's-eye represents the truth. Inaccuracy (also called bias) is defined as systematic deviation from the truth. Thus, although the shots in Fig. 3.2c are more tightly grouped than those in Fig. 3.2a, the two cases are equally biased because they are both. centered on the upper left quadrant of the target. Imprecision (also called uncertainty), on the other hand, refers to the magnitude of the scatter. Therefore, although Fig. 3.2b and d are equally accurate (that is, centered on the bull's-eye), the latter is more precise because the shots are tightly grouped.

Numerical methods should be sufficiently accurate or unbiased to meet the requirements of a particular engineering problem. They also should be precise enough for adequate

FIGURE 3.2 An example from marksmanship illustrating the concepts of accuracy and precision. (a) Inaccu- rate and imprecise; (b) accurate and imprecise; (c) inaccurate and precise; (dl accurate and precise.


engineering design. In this book, we will use the collective term error to represent both the inaccuracy and the imprecision of our predictions. With these concepts as background, we can now discuss the factors that contribute to the error of numerical computations.

3.3 ERROR DEFINITIONS

Numerical errors arise from the use of approximations to represent exact mathematical operations and quantities. These include truncation errors, which result when approximations are used to represent exact mathematical procedures, and round-off errors, which result when numbers having limited significant figures are used to represent exact numbers. For both types, the relationship between the exact, or true, result and the approximation can be formulated as

True value = approximation + error (3.1)

By rearranging Eq. (3.1), we find that the numerical error is equal to the discrepancy between the truth and the approximation, as in

E, = true value - approximation (3 .2 )

where El is used to designate the exact value of the error, The subscript t is included to designate that this is the "true" error. This is in contrast to other cases, as described shortly, where an "approximate" estimate of the error must be employed.

A shortcoming of this definition is that it takes no account of the order of magnitude of the value under examination. For example, an error of a centimeter is much more significant if we are measuring a rivet rather than a bridge. One way to account for the magnitudes of the quantities being evaluated is to normalize the error to the true value, as in

true error True fractional relative error =

true value

where, as specified by Eq. (3.2), error = true value - approximation. The relative error can also be multiplied by 100 percent to express it as

true error E r = 100%

true value

where E , designates the true percent relative error.

EXAMPLE 3.1 Calculation of Errors

Problem Statement. Suppose that you have the task of measuring the lengths of a I bridge and a rivet and come up with 9999 and 9 cm, respectively. If the true values are

10,000 and 10 cm, respectively, compute (a) the true error and (b) the true percent i relative error for each case.

) Solution.

! (a) The error for measuring the bridge is [Eq. (3.2)]

3.3 ERROR DEFINITIONS 61

and for the rivet it is

1 (b) The percent relative error for the bridge is [Eq. (3.3)] I

I and for the rivet it is

Thus, although both measurements have an error of 1 cm, the relative error for the rivet is f much greater. We would conclude that we have done an adequate job of measuring the i I bridge, whereas our estimate for the rivet leaves something to be desired.

Notice that for Eqs. (3.2) and (3.3), E and E are subscripted with a t to signify that the error is normalized to the true value. In Example 3.1, we were provided with this value. However, in actual situations such information is rarely available. For numerical methods, the true value will be known only when we deal with functions that can be solved analytically. Such will typically be the case when we investigate the theoretical behavior of a particular technique for simple systems. However, in real-world applications, we will obviously not know the true answer a priori. For these situations, an alternative is to nmnalize the error using the best available estimate of the true value, that is, to the approximation itself, as in

approximate error E, = 100%

approximation

where the subscript a signifies that the error is normalized to an approximate value. Note also that for real-world applications, Eq. (3.2) cannot be used to calculate the error term for Eq. (3.4). One of the challenges of numerical methods is to determine error estimates in the absence of knowledge regarding the true value. For example, certain numerical methods use an iterative approach to compute answers. In such an approach, a present approximation is made on the basis of a previous approximation. This process is performed repeatedly, or iteratively, to successively compute (we hope) better and better approximations. For such cases, the error is often estimated as the difference between previous and currer,t approximations. Thus, percent relative error is determined according to

current approximation - previous approximation E, = 100%

current approximation

This and other approaches for expressing errors will be elaborated on in subsequent chapters.

The signs of Eqs. (3.2) through (3.5) may be either positive or negative. If the approximation is greater than the true value (or the previous approximation is greater than the current approximation), the error is negative; if the approximation is less than the true value, the error is positive. Also, for Eqs. (3.3) to (3.3, the denominator may be less than zero, which can also lead to a negative error. Often, when performing computations, we may not


be concerned with the sign of the error, but we are interested in whether the percent absolute value is lower than a prespecified percent tolerance E,. Therefore, it is often useful to employ the absolute value of Eqs. (3.2) through (3.5). For such cases, the computation is repeated until

If this relationship holds, our result is assumed to be within the prespecified acceptable level E,. Note that for the remainder of this text, we will almost exclusively employ absolute values when we use relative errors.

It is also convenient to relate these errors to the number of significant figures in the approximation. It can be shown (Scarborough, 1966) that if the following criterion is met, we can be assured that the result is correct to at least n significant figures.

EXAMPLE 3.2 Error Estimates for Iterative Methods

Problem Statement. In mathematics, functions can often be represented by infinite series. For example, the exponential function can be computed using

Thus, as more terms are added in sequence, the approximation becomes a better and better estimate of the true value of ex. Equation (E3.2.1) is called a Maclaurin series expansion.

Starting with the simplest version, ex = 1, add terms one at a time to estimate e0.5. After each new term is added, compute the true and approximate percent relative errors with Eqs. (3.3) and (3.3, respectively. Note that the true value is e0.5 = 1.648721 . . . . Add terms until the absolute value of the approximate error estimate E, falls below a prespecified error criterion E, conforming to three significant figures.

Solution. First, Eq. (3.7) can be employed to determine the error criterion that ensures a result is correct to at least three significant figures:

Thus, we will add terms to the series until E, falls below this level. The first estimate is simply equal to Eq. (E3.2.1) with a single term. Thus, the first

estimate is equal to 1. The second estimate is then generated by adding the second term, as in

or for x = 0.5,

e0.5 = 1 + 0.5 = 1.5

This represents a true percent relative error of [Eq. (3.3)]

3.4 ROUND-OFF ERRORS 63

I Equation (3.5) can be used to determine an approximate estimate of the error, as in I

Because E, is not less than the required value of F , , we would continue the computation by adding another term, x2/2! and repeating the error calculations. The process is continued until < E,. The entire computation can be summarized as

I Terms Result E l (%I &a (%I

1 Thus, after six terms are included, the approximate error falls below e, = 0.05% and the i computation is terminated. However, notice that, rather than three significant figures, the I result is accurate to five! This is because, for this case, both Eqs. (3.5) and (3.7) are con- ' servative. That is, they ensure that the result is at least as good as they specify. Although,

as discussed in Chap. 6, this is not always the case for Eq. (3 .3 , it is true most of the time.

With the preceding definitions as background, we can now proceed to the two types of error connected directly with numerical methods: round-off errors and truncation errors.

3.4 ROUND-OFF ERRORS As mentioned previously, round-off errors originate from the fact that computers retain only a fixed number of significant figures during a calculation. Numbers such as n, e, or 2/7 cannot be expressed by a fixed number of significant figures. Therefore, they cannot be represented exactly by the computer. In addition, because computers use a base-2 representation, they cannot precisely represent certain exact base-10 numbers. The discrepancy introduced by this omission of significant figures is called round-off error.

3.4.1 Computer Representation of Numbers

Numerical round-off errors are directly related to the manner in which numbers are stored in a computer. The fundamental unit whereby information is represented is called a word. This is an entity that consists of a string of binary digits, or bits. Numbers are typically stored in one or more words. To understand how this is accomplished, we must first review some material related to number systems.

Number Systems. A number system is merely a convention for representing quantities. Because we have 10 fingers and 10 toes, the number system that we are most familiar with is the decimal, or base-10, number system. A base is the number used as the reference for


constructing the system. The base-10 system uses the 10 digits-0, l ,2,3,4,5,6,7,8,9- to represent numbers. By themselves, these digits are satisfactory for counting from 0 to 9.

For larger quantities, combinations of these basic digits are used, with the position or place value specifying the magnitude. The rightmost digit in a whole number represents a number from 0 to 9. The second digit from the right represents a multiple of 10. The third digit from the right represents a multiple of 100 and so on. For example, if we have the number 86,409 then we have eight groups of 10,000, six groups of 1000, four groups of 100, zero groups of 10, and nine more units, or

(8 x lo4) + (6 x lo3) + (4 x lo2) + (0 x 10') + (9 x 10') = 86,409

Figure 3 . 3 ~ provides a visual representation of how a number is formulated in the base-10 system. This type of representation is calledpositional notation.

Because the decimal system is so familiar, it is not commonly realized that there are alternatives. For example, if human beings happened to have had eight fingers and eight toes, we would undoubtedly have developed an octal, or base-8, representation. In the same sense, our friend the computer is like a two-fingered animal who is limited to two states--either 0 or 1. This relates to the fact that the primary logic units of digital comput-

FIGURE 3.3 How the (a) decimal (base 10) and the (b) binary (base 2) systems work. In [b) , the binary number 101 01 101 is equivalent to the decimal number 173.


I Number Sign

FIGURE 3.4 The representation of the decimal integer - 173 on a 16-b~t computer using the signed magnitude method

ers are on/off electronic components. Hence, numbers on the computer are represented with a binary, or base-2, system. Just as with the decimal system, quantities can be represented using positional notation. For example, the binary number 11 is equivalent to (1 x 2') + (1 x 2') = 2 + 1 = 3 in the decimal system. Figure 3.3b illustrates a more complicated example.

Integer Representation. Now that we have reviewed how base-10 numbers can be represented in binary form, it is simple to conceive of how integers are represented on a computer. The most straightforward approach, called the signed magnitude method, employs the first bit of a word to indicate the sign, with a 0 for positive and a 1 for negative. The remaining bits are used to store the number. For example, the integer value of -173 would be stored on a 16-bit computer, as in Fig. 3.4.

EXAMPLE 3.3 Range of Integers

Problem Statement. Determine the range of integers in base-10 that can be represented on a 16-bit computer.

Solution. Of the 16 bits, the first bit holds the sign. The remaining 15 bits can hold binary numbers from 0 to 11 11 11 11 1111 11 1. The upper limit can be converted to a decimal integer, as in

(1 214) + (1 213) + . . . + (1 21) + (1 20)

which equals 32,767 (note that this expression can be simply evaluated as 2'" 1). Thus, a 16-bit computer word can store decimal integers ranging from -32,767 to 32,767. In addition, because zero is already defined as 0000000000000000, it is redundant to use the number 1000000000000000 to define a "minus zero." Therefore, it is usually employed to represent an additional negative number: -32,768, and the range is from -32,768 to 32,767.

Note that the signed magnitude method described above is not used to represent integers on conventional computers. A preferred approach called the 2's complement technique directly incorporates the sign into the number's magnitude rather than providing a separate bit to represent plus or minus (see Chapra and Canale 1994). However, Example 3.3 still serves to illustrate how all digital computers are limited in their capability to represent integers. That is, numbers above or below the range cannot be represented. A more serious


Signed

-1

FIGURE 3.5 The manner in which a floating-point number is stored in a word

limitation is encountered in the storage and manipulation of fractional quantities as described next.

Floating-Point Representation. Fractional quantities are typically represented in computers using floating-point form. In this approach, the number is expressed as a fractional part, called a mantissa or significand, and an integer part, called an exponent or characteristic, as in

where m = the mantissa, h = the base of the number system being used, and e = the exponent. For instance, the number 156.78 could be represented as 0.15678 x lo3 in a floating-point base- 10 system.

Figure 3.5 shows one way that a floating-point number could be stored in a word. The first bit is reserved for the sign, the next series of bits for the signed exponent, and the last bits for the mantissa.

Note that the mantissa is usually normalized if it has leading zero digits. For example, suppose the quantity 1/34 = 0.029411765. . . was stored in a floating-point base-10 system that allowed only four decimal places to be stored. Thus, 1/34 would be stored as

However, in the process of doing this, the inclusion of the useless zero to the right of the decimal forces us to drop the digit 1 in the fifth decimal place. The number can be normalized to remove the leading zero by multiplying the mantissa by 10 and lowering the exponent by 1 to give

Thus, we retain an additional significant figure when the number is stored. The consequence of normalization is that the absolute value of m is limited. That is,

where b = the base. For example, for a base-10 system, rn would range between 0.1 and 1, and for a base-2 system, between 0.5 and 1.

Floating-point representation allows both fractions and very large numbers to be expressed on the computer. However, it has some disadvantages. For example, floating-point


numbers take up more room and take longer to process than integer numbers. More significantly, however, their use introduces a source of error because the mantissa holds only a finite number of significant figures. Thus, a round-off error is introduced.

EXAMPLE 3.4 Hypothetical Set of Floating-Point Numbers L ~ ~ - ] . A ~ J A - r-3 'P 1 Problem Statement. Create a hypothetical floating-point nkmber set for a machine that

stores information using 7-bit words. Employ the first bit for the sign of the number, the next three for the sign and the magnitude of the exponent, and the last three for the

/ magnitude of the mantissa (Fig. 3.6). I

Solution. The smallest possible positive number is depicted in Fig. 3.6. The initial 0 indicates that the quantity is positive. The 1 in the second place designates that the exponent has a negative sign. The 1's in the third and fourth places give a maximum value to the exponent of

1 x 2 I + l x 2 O = 3

Therefore, the exponent will be -3. Finally, the mantissa is specified by the 100 in the last three places, which conforms to

Although a smaller mantissa is possible (e.g., 000,001,010, Oll), the value of 100 is used because of the limit imposed by normalization [Eq. (3.8)]. Thus, the smallest possible positive number for this system is +0.5 x r 3 , which is equal to 0.0625 in the base-10 system. The next highest numbers are developed by increasing the mantissa, as in

01 11101 = (1 x 2-I + 0 x 2-* + 1 x r 3 ) x = (0.078125)10

01 11 110 = (1 x 2-I + 1 x 2-2 + 0 x 2-3) x 2-3 = (0.093750)10

0111111 = (1 x 2-' + 1 x 2T2 + 1 x T 3 ) x 2-3 = (0.109375)10

1 Notice that the base-10 equivalents are spaced evenly with an interval of 0.015625.

1 At this point, to continue increasing, we must decrease the exponent to 10, which gives a value of i

I 1 x 2 ' + 0 x 2 O = 2 i I I

~ I G U R E 3.6 I The smallest poss~ble pos~t~ve floating-po~nt number from Example 3 4 i

/ I 1 Magnitude Sign of Sign of of mantissa number exponent = 0,s + - Magnitude

1 of exponent


7 I Overflow -

Underflow "hole" at zero

1 FIGURE 3.7 1 The hypothetical number system developed in Example 3.4. Each value is indicated by a tick

mark. Only the positive numbers are shown. An identical set would also extend in the negative . , / direction.

i i

The mantissa is decreased back to its smallest value of 100. Therefore, the next number is

0110100= (1 x 2-' + O x 2-2+0 x 2-9 x 2-' = (0.125000)10

This still represents a gap of 0.125000 - 0.109375 = 0.015625. However, now when higher numbers are generated by increasing the mantissa, the gap is lengthened to 0.03 125,

0110101 = (1 x 2-' + 0 x 2-2 + I x 2 ~ ~ ) x 2-2 = (0.156250)10

0110110 = (1 X 2-' + 1 x 2-2 + 0 X 2-3) X 2-2 = (o.187500)10

0110111 = (1 x 2-' + 1 x 2-2 + 1 x zp3) x 2-2 = (0.218750)10

1 This pattern is repeated as each larger quantity is formulated until a maximum number is

I reached. 0011111 = (1 x 2-I + 1 2p2 + 1 2-3) 23 = (7)10

Figure 3.7 manifests several aspects of floating-point representation that have significance regarding computer round-off errors:

1. There Is a Limited Range of Quantities That May Be Represented. Just as for the integer case, there are large positive and negative numbers that cannot be represented. At-


tempts to employ numbers outside the acceptable range will result in what is called an overflow error. However, in addition to large quantities, the floating-point representation has the added limitation that very small numbers cannot be represented. This is illustrated by the underflow "hole" between zero and the first positive number in Fig. 3.7. It should be noted that this hole is enlarged because of the normalization constraint of Eq. (3.8).

2. There Are Only a Finite Number of Quantities That Can Be Represented Within the Range. Thus, the degree of precision is limited. Obviously, irrational numbers cannot be represented exactly. Furthermore, rational numbers that do not exactly match one of the values in the set also cannot be represented precisely. The errors introduced by approximating both these cases are referred to as quantizing errors. The actual approximation is accomplished in either of two ways: chopping or rounding. For example, suppose that the value of n = 3.14159265358. . . is to be stored on a base-10 number system carry- ing seven significant figures. One method of approximation would be to merely omit, or "chop off," the eighth and higher terms, as in n = 3.141592, with the introduction of an associated error of [Eq. (3.2)]

This technique of retaining only the significant terms was originally dubbed "truncation" in computer jargon. We prefer to call it chopping to distinguish it from the truncation errors discussed in Chap. 4. Note that for the base-2 number system in Fig. 3.7, chopping means that any quantity falling within an interval of length Ax will be stored as the quantity at the lower end of the interval. Thus, the upper error bound for chopping is Ax. Additionally, a bias is introduced because all errors are positive. The shortcom- ings of chopping are attributable to the fact that the higher terms in the complete decimal representation have no impact on the shortened version. For instance, in our example of n, the first discarded digit is 6. Thus, the last retained digit should be rounded up to yield 3.141593. Such rounding reduces the error to

Consequently, rounding yields a lower absolute error than chopping. Note that for the base-2 number system in Fig. 3.7, rounding means that any quantity falling within an interval of length Ax will be represented as the nearest allowable number. Thus, the upper error bound for rounding is Ax/2. Additionally, no bias is introduced because some errors are positive and some are negative. Some computers employ rounding. However, this adds to the computational overhead, and, consequently, many machines use simple chopping. This approach is justified under the supposition that the number of significant figures is large enough that resulting round-off error is usually negligible.

3. The Interval between Numbers, Ax, Increases as the Numbers Grow in Magnitude. It is this characteristic, of course, that allows floating-point representation to preserve significant digits. However, it also means that quantizing errors will be proportional to the magnitude of the number being represented. For normalized floating-point numbers, this proportionality can be expressed, for cases where chopping is employed, as


and, for cases where rounding is employed, as

where % is referred to as the machine epsilon, which can be computed as

where b is the number base and t is the number of significant digits in the mantissa. No- tice that the inequalities in Eqs. (3.9) and (3.10) signify that these are error bounds. That is, they specify the worst cases.

EXAMPLE 3.5 Machine Epsilon

Problem Statement. Determine the machine epsilon and verify its effectiveness in characterizing the errors of the number system from Example 3.4. Assume that chopping is used.

Solution. The hypothetical floating-point system from Example 3.4 employed values of the base b = 2, and the number of mantissa bits t = 3. Therefore, the machine epsilon would be [Eq. (3.1 I)]

8 = 21-3 = 0.25

Consequently, the relative quantizing error should be bounded by 0.25 for chopping. The largest relative errors should occur for those quantities that fall just below the upper bound of the first interval between successive equispaced numbers (Fig. 3.8). Those numbers falling in the succeeding higher intervals would have the same value of Ax but a greater value of x and, hence, would have a lower relative error. An example of a maximum error would be a value falling just below the upper bound of the interval between (0. 125000)10 and (0.156250)10. For this case, the error would be less than

I Thus, the error is as predicted by Eq. (3.9).

FIGURE 3.8 The largest quantizing error will occur for those values falling just below the upper bound of the first of a series of equispaced intervals.

~ a r ~ e s t ' r e l a t i v e error


epsilon = 1 DO

IF (epsilon + 1 5 1) EXIT epsilon = epsilon/2

END DO epsilon = 2 X epsilon

FIGURE 3.9 Pseudocode to determine machine epsilon for a binary computer.

The magnitude dependence of quantizing errors has a number of practical applications in numerical methods. Most of these relate to the commonly employed operation of testing whether two numbers are equal. This occurs when testing convergence of quantities as well as in the stopping mechanism for iterative processes (recall Example 3.2). For these cases, it should be clear that, rather than test whether the two quantities are equal, it is advisable to test whether their difference is less than an acceptably small tolerance. Further, it should also be evident that normalized rather than absolute difference should be compared, particularly when dealing with numbers of large magnitude. In addition, the machine epsilon can be employed in formulating stopping or convergence criteria. This ensures that programs are portable-that is, they are not dependent on the computer on which they are implemented. Figure 3.9 lists pseudocode to automatically determine the machine epsilon of a binary computer.

Extended Precision. It should be noted at this point that, although round-off errors can be important in contexts such as testing convergence, the number of significant digits carried on most computers allows most engineering computations to be performed with more than acceptable precision. For example, the hypothetical number system in Fig. 3.7 is a gross exaggeration that was employed for illustrative purposes. Commercial computers use much larger words and, consequently, allow numbers to be expressed with more than adequate precision. For example, computers that use IEEE format allow 24 bits to be used for the mantissa, which translates into about seven significant base-10 digits of precision' with a range of about to

With this acknowledged, there are still cases where round-off error becomes critical. For this reason most computers allow the specification of extended precision. The most common of these is double precision, in which the number of words used to store floating- point numbers is doubled. It provides about 15 to 16 decimal digits of precision and a range of approximately to

In many cases, the use of double-precision quantities can greatly mitigate the effect of round-off errors. However, a price is paid for such remedies in that they also require more memory and execution time. The difference in execution time for a small calculation might seem insignificant. However, as your programs become larger and more complicated, the added execution time could become considerable and have a negative impact on your effectiveness as a problem solver. Therefore, extended precision should not be used frivo- lously. Rather, it should be selectively employed where it will yield the maximum benefit at the least cost in terms of execution time. In the following sections, we will look closer at how round-off errors affect computations, and in so doing provide a foundation of understanding to guide your use of the double-precision capability.

Before proceeding, it should be noted that some of the commonly used software packages (e.g., Excel, Mathcad) routinely use double precision to represent numerical quantities. Thus, the developers of these packages decided that mitigating roundoff errors would take precedence over any loss of speed incurred by using extended precision. Others, like MATLAB, allow you to use extended precision, if you desire.

' Note that only 23 bits are actually used to store the mantissa. However, because of normalization, the first bit of the mantissa i s always I and is. therefore. not stored. Thus, this first hit logether with the 23 stored hits gives

Ihc 11 ~oi;ll hits III' prc%c.ision 1;~. the tllil~iii\\;l.


3.4.2 Arithmetic Manipulations of Computer Numbers

Aside from the limitations of a computer's number system, the actual arithmetic manipulations involving these numbers can also result in round-off error. In the following section, we will first illustrate how common arithmetic operations affect round-off errors. Then we will investigate anumber of particular manipulations that are especially prone to round-off errors.

Common Arithmetic Operations. Because of their familiarity, normalized base-10 numbers will be employed to illustrate the effect of round-off errors on simple addition, subtraction, multiplication, and division. Other number bases would behave in a similar fashion. To simplify the discussion, we will employ a hypothetical decimal computer with a 4-digit mantissa and a 1 -digit exponent. In addition, chopping is used. Rounding would lead to similar though less dramatic errors.

When two floating-point numbers are added, the mantissa of the number with the smaller exponent is modified so that the exponents are the same. This has the effect of aligning the decimal points. For example, suppose we want to add 0.1557 . 10' + 0.4381 . 10-'. The decimal of the mantissa of the second number is shifted to the left a number of places equal to the difference of the exponents [ l - (- 1) = 21, as in

0.4381 . lo-' -t 0.004381 . 10'

Now the numbers can be added,

0.1557 . 10' 0.004381 . 10'

0.160081 . 10'

and the result chopped to 0.1600 . 10'. Notice how the last two digits of the second number that were shifted to the right have essentially been lost from the computation.

Subtraction is performed identically to addition except that the sign of the subtrahend is reversed. For example, suppose that we are subtracting 26.86 from 36.41. That is,

0.3641 . 10' - 0.2686 . 102

0.0955 . 102

For this case the result is not normalized, and so we must shift the decimal one place to the right to give 0.9550 . 10' = 9.550. Notice that the zero added to the end of the mantissa is not significant but is merely appended to fill the empty space created by the shift. Even more dramatic results would be obtained when the numbers are very close, as in

0.7642 lo3 - 0.7641 . lo3

0.0001 . 103

which would be converted to 0.1000 . 10' = 0.1000. Thus, for this case, three nonsignifi- cant zeros are appended. This introduces a substantial computational error because subsequent manipulations would act as if these zeros were significant. As we will see in a later section, the loss of significance during the subtraction of nearly equal numbers is among the greatest source of round-off error in numerical methods.

Multiplication and division are somewhat more straightforward than addition or subtraction. The exponents are added and the mantissas multiplied. Because multiplication of two n-digit mantissas will yield a 2n-digit result, most computers hold intermediate results


in a double-length register. For example,

If, as in this case, a leading zero is introduced, the result is normalized,

and chopped to give

Division is performed in a similar manner, but the mantissas are divided and the exponents are subtracted. Then the results are normalized and chopped.

Large Computations. Certain methods require extremely large numbers of arithmetic manipulations to arrive at their final results. In addition, these computations are often interdependent. That is, the later calculations are dependent on the results of earlier ones. Consequently, even though an individual round-off error could be small, the cumulative effect over the course of a large computation can be significant.

EXAMPLE 3.6 Large Numbers of Interdependent Computations

/ Problem Statement. Investigate the effect of round-off error on large numbers of interdependent computations. Develop a program to sum a number 100,000 times. Sum the number 1 in single precision, and 0.00001 in single and double precision.

Solution. Figure 3.10 shows a Fortran 90 program that performs the summation. Where- as the single-precision summation of 1 yields the expected result, the single-precision

FIGURE 3.10 Fortran 90 program to sum a number 1 O5 times. The case sums the number 1 in single precision and the number 1 0-5 In single and double precision.

PROGRAM f i go31 0 I M P L I C I T n o n e 1NTEGER::i REAL::suml, sum2, XI, x2 DOUBLE PRECISION::sum3, x3 suml =O. sum2=0. sum3=0. x l = l . x 2 = l . e - 5 x 3 = l . d-5 D O i = 1 , 1 0 0 0 0 0

suml=suml+x l sum2=sum2+x2 sum3=sum3+x3

END D O PRINT *, suml PRINT *, sum2 PRINT *, sum3 END o u t p u t : 1 0 0 0 0 0 . 0 0 0 0 0 0

1 . 0 0 0 9 9 0 9 . 9 9 9 9 9 9 9 9 9 9 8 0 8 3 8 E - 0 0 1


summation of 0.00001 yields a large discrepancy. This error is reduced significantly when 0.00001 is summed in double precision.

Quantizing errors are the source of the discrepancies. Because the integer 1 can be represented exactly within the computer, it can be summed exactly. In contrast, 0.00001 cannot be represented exactly and is quantized by a value that is slightly different from its true value. Whereas this very slight discrepancy would be negligible for a small computation, it accumulates after repeated summations. The problem still occurs in double precision but is greatly mitigated because the quantizing error is much smaller.

Note that the type of error illustrated by the previous example is somewhat atypical in that all the errors in the repeated operation are of the same sign. In most cases the errors of a long computation alternate sign in a random fashion and, thus, often cancel out. However, there are also instances where such errors do not cancel but, in fact, lead to a spurious final result. The following sections are intended to provide insight into ways in which this may occur.

Adding a Large and a Small Number. Suppose we add a small number, 0.0010, to a large number, 4000, using a hypothetical computer with the 4-digit mantissa and the 1-digit exponent. We modify the smaller number so that its exponent matches the larger,

which is chopped to 0.4000 . lo4. Thus, we might as well have not performed the addition!

This type of error can occur in the computation of an infinite series. The initial terms in such series are often relatively large in comparison with the later terms. Thus, after a few terms have been added, we are in the situation of adding a small quantity to a large quantity.

One way to mitigate this type of error is to sum the series in reverse order-that is, in ascending rather than descending order. In this way, each new term will be of comparable magnitude to the accumulated sum (see Prob. 3.4).

Subtractive Cancellation. This term refers to the round-off induced when subtracting two nearly equal floating-point numbers.

One common instance where this can occur involves finding the roots of a quadratic equation or parabola with the quadratic formula,

For cases where b2 >> 4ac, the difference in the numerator can be very small. In such cases, double precision can mitigate the problem. In addition, an alternative formulation can be used to minimize subtractive cancellation,


An illustration of the problem and the use of this alternative formula are provided in the following example.

EXAMPLE 3.7 Subtractive Cancellation

I Problem Statement. Compute the values of the roots of a quadratic equation with a = 1, b = 3000.001, and c = 3. Check the computed values versus the true roots of xl = -0.001 andx2 = -3000.

Solution. Figure 3.11 shows a Fortran 90 program that computes the roots XI and x2 on the basis of the quadratic formula [(Eq. (3.12)]. Note that both single- and double- precision versions are given. Whereas the results for x;! are adequate, the percent relative errors for xl are poor for the single-precision version, E, = 2.4%. This level could be inadequate for many applied engineering problems. This result is particularly surprising because we are employing an analytical formula to obtain our solution!

The loss of significance occurs in the line of both programs where two relatively large numbers are subtracted. Similar problems do not occur when the same numbers are added.

On the basis of the above, we can draw the general conclusion that the quadratic formula will be susceptible to subtractive cancellation whenever b2 >> 4ac. One way to cir- cumvent this problem is to use double precision. Another is to recast the quadratic formula in the format of Eq. (3.13). As in the program output, both options give a much smaller error because the subtractive cancellation is minimized or avoided.

FIGURE 3.1 1 Fortran 90 programs to determine the roots of a quadratic. (a) Single and (b) double precision

PROGRAM f i g 0 3 1 1 I M P L I C I T none REAL::a,b,c,d,xl,x2,xlr DOUBLE PRECISION::aa,bb,cc,dd,x11,x22 a = 1 . b = 3 0 0 0 . 0 0 1 c = 3 . d = SQRT(b * b - 4. * a * c ) XI = ( - b + d ) / ( 2 . * a ) x2 = ( - b - d ) / ( 2 . * a ) PRINT *, ' S i n g l e - p r e c i s i o n r e s u l t s : ' PRINT ' ( l x , a I O , f 2 0 . 1 4 ) ' , 'XI = ', XI PRINT ' ( l x , a 1 0 , f 1 0 . 4 ) ' , ' x 2 = ', x 2 PRINT * aa = 1 . b b = 3000 .001 c c = 3. d d = SQRT(bb * b b - 4 . * aa * c c ) x l l = ( - b b + d d ) / ( 2 . * a a ) x 2 2 = ( - b b - d d ) / (2. * a a )

PRINT *, ' D o u b l e - p r e c i s i o n r e s u l t s : ' PRINT ' ( l x , a 1 0 , f 2 0 . 1 4 ) ' , 'XI = ', x l l PRINT ' ( l x , a 1 0 , f 1 0 . 4 ) ' , ' x 2 = ', x 2 2 PRINT * PRINT *, ' M o d i f i e d f o r m u l a f o r f i r s t r o o t : ' x l r = -2 . * c / ( b + d ) PRINT ' ( l x , a I O , f 2 0 . 1 4 ) ' , 'XI = ', x l r END

OUTPUT: S i n g l e - p r e c i s i o n r e s u l t s :

XI = - . 0 0 0 9 7 6 5 6 2 5 0 0 0 0 x2 = - 3 0 0 0 . 0 0 0 0

D o u b l e - p r e c i s i o n r e s u l t s : XI = - . 0 0 1 0 0 0 0 0 0 0 0 7 7 1 x2 = -3000 .0000

M o d i f i e d f o r m u l a f o r f i r s t r o o t : X I = - . 0 0 1 0 0 0 0 0 0 0 0 0 0 0


Note that, as in the foregoing example, there are times where subtractive cancellation can be circumvented by using a transformation. However, the only general remedy is to employ extended precision.

Smearing. Smearing occurs whenever the individual terms in a summation are larger than the summation itself. As in the following example, one case where this occurs is in series of mixed signs.

EXAMPLE 3.8 Evaluation of ex using Infinite Series

Problem Statement. The exponential function y = ex is given by the infinite series

Evaluate this function for x = 10 and x = - 10, and be attentive to the problems of round- off error.

Solution. Figure 3 . 1 2 ~ gives a Fortran 90 program that uses the infinite series to evaluate ex. The variable i is the number of terms in the series, term is the value of the current term added to the series, and sum is the accumulative value of the series. The variable test is the preceding accumulative value of the series prior to adding term. The series is terminated when the computer cannot detect the difference between test and sum.

Figure 3.12b shows the results of running the program for x = 10. Note that this case is completely satisfactory. The final result is achieved in 31 terms with the series identical to the library function value within seven significant figures.

Figure 3 . 1 2 ~ shows similar results for x = -10. However, for this case, the results of the series calculation are not even the same sign as the true result. As a matter of fact, the negative results are open to serious question because ex can never be less than zero. The problem here is caused by round-off error. Note that many of the terms that make up the sum are much larger than the final result of the sum. Furthermore, unlike the previous case, the individual terms vary in sign. Thus, in effect we are adding and subtracting large numbers (each with some small error) and placing great significance on the differences-that is, subtractive cancellation. Thus, we can see that the culprit behind this example of smearing is, in fact, subtractive cancellation. For such cases it is appropriate to seek some other computational strategy. For example, one might try to compute y = e'' as y = (e-')I0. Other than such a reformulation, the only general recourse is extended precision.

Inner Products. As should be clear from the last sections, some infinite series are particularly prone to round-off error. Fortunately, the calculation of series is not one of the more common operations in numerical methods. A far more ubiquitous manipulation is the calculation of inner products, as in

n

This operation is very common, particularly in the solution of simultaneous linear algebraic equations. Such summations are prone to round-off error. Consequently, it is often de- sirable to compute such summations in extended precision.


(a) Program I PROGRAM f i g 0 3 1 2 I M P L I C I T n o n e R E A L : : t e r m , t e s t , sum,x 1NTEGER::i i = O t e r m = 1 . sum = 1 . t e s t = 0 . PRINT *, ' x = ' READ *, x PRINT *, ' i ' , ' t e r m ' , ' s u m ' D 0

I F ( s u m . E Q . t e s t ) EXIT PRINT *, i, t e r m , sum i = i + l t e r m = t e r m * x / i t e s t = sum sum = s u m + t e r m

END D O PRINT *, ' e x a c t v a l u e = ' , e x p ( x ) END

(b) Evaluation of el0

X = 1 0 i t e r m sum 0 1 .000000 1 . 0 0 0 0 0 0 1 1 0 . 0 0 0 0 0 0 1 1 . 0 0 0 0 0 0 2 50 .000000 6 1 .OOOOOO 3 1 6 6 . 6 6 6 7 0 0 227 .666700 4 416 .666700 6 4 4 . 3 3 3 4 0 0 5 8 3 3 . 3 3 3 4 0 0 1 4 7 7 . 6 6 7 0 0 0

2 7 9.183693E-02 22026 .420000 2 8 3.279890E-02 2 2 0 2 6 . 4 5 0 0 0 0 29 1.130997E-02 22026 .460000 3 0 3.769989E-03 22026 .470000 3 1 1 .216126E-03 22026 .470000 e x a c t v a l u e = 22026 .460000

(c) Evaluation of e-'O

i t e r m sum 0 1 .000000 1 .000000 1 -1 0 .000000 -9 .000000 2 50 .000000 41 .OOOOOO 3 - 1 6 6 . 6 6 6 7 0 0 -125 .666700 4 41 6 .666700 291 .000000 5 -833 .333400 -542 .333400

41 -2.989312E-09 8.137590E-05 42 7 . 1 1 7 4 1 0E-10 8 . 1 3 7 6 6 1 E-05 43 -1 .655212E-10 8 .137644E-05 44 3.761815E-11 8.137648E-05 45 -8.359655E-12 8.137647E-05 e x a c t v a l u e = 4.539993E-05

FIGURE 3.12 (a) A Fortran 90 program to evaluate ex using an infinite series. (b) Evaluation of ex (c) Evaluation of e-x.


Although the foregoing sections should provide rules of thumb to mitigate round-off error, they do not provide a direct means beyond trial and error to actually determine the effect of such errors on a computation. In the next chapter, we will introduce the Taylor series, which will provide a mathematical approach for estimating these effects.

PROBLEMS

3.1 Compose your own program based on Fig. 3.9 and use it to determine your computer's machine epsilon. 3.2 In a fashion similar to that in Fig. 3.9, write a short program to determine the smallest number, x,,,i,, used on the computer you will be employing along with this book. Note that your computer will be unable to reliably distinguish between zero and a quantity that is smaller than this number. 3.3 Determine a theoretical relationship to predict the smallest floating-point number for a digital computer on the basis of parameters such as its word size, number of mantissa bits, number of exponent bits, etc. 3.4 The infinite series

1 f(N) = C -

n=l n2

and

and compare with the true value of 2.485168 x and discuss your results. Use 25 terms to evaluate each series. 3.7 The derivative of f(x) = 1/(3 - p) is given by

Do you expect to have difficulties evaluating this function at x = 1.22? Try it using 3- and 4-digit arithmetic with chopping. 3.8 (a) Evaluate the polynomial

y = 2 - 5 , t ? + 6 x + 0 . 5 5

converges on a value of f(N) = n2/6 as N approaches infinity. at = 2.73. Use 3-digit arithmetic with chopping. Write a program to compute f (N) for N = 10,000 by computing the Evaluate the error. sum from n = 1 to 10,000. Then repeat the computation but in re- (b) Repeat (a) but express as verse order-that is, from n = 10,000 to 1 using increments of -1. Explain the results. y = [(x - 5) + 6]x + 0.55

3.5 can the machine be a stop- Evaluate the percent relative error and compare with part (a). ping criterion, e,, for your programs? Provide an example. 3.6 Evaluate e-8.3 using two approaches

CHAPTER 4 Truncation Errors and the Taylor Series

Truncation errors are those that result from using an approximation in place of an exact mathematical procedure. For example, in Chap. 1 we approximated the derivative of velocity of a falling parachutist by a finite-divided-difference equation of the form [Eq. (1.1 I)]

A truncation error was introduced into the numerical solution because the difference equation only approximates the true value of the derivative (recall Fig. l .4). In order to gain insight into the properties of such errors, we now turn to a mathematical formulation that is used widely in numerical methods to express functions in an approximate fashion-the Taylor series.

4.1 THE TAYLOR SERIES

Taylor's theorem (Box 4.1) and its associated formula, the Taylor series, is of great value in the study of numerical methods. In essence, the Taylor series provides a means to predict a function value at one point in terms of the function value and its derivatives at another point. In particular, the theorem states that any smooth function can be approximated as a polynomial.

A useful way to gain insight into the Taylor series is to build it term by term. For example, the first term in the series is

This relationship, called the zero-order approximation, indicates that the value off at the new point is the same as its value at the old point. This result makes intuitive sense because if xi and xi+, are close to each other, it is likely that the new value is probably similar to the old value.

Equation (4.2) provides a perfect estimate if the function being approximated is, in fact, a constant. However, if the function changes at all over the interval, additional terms

80 TRUNCATION ERRORS AND THE TAYLOR SERIES

Box 4.1 Taylor's Theorem

Taylor's theorem: If the function f and its first n + 1 derivatives are continuous on an interval containing a and x, then the value of the function at x is given by

f "(a > f ( x ) = f ( a ) + f f (a ) (x - a ) + -(x 2! -a)*

where the remainder R, is defined as

where t = a dummy variable. Equation (B4.1.1) is called the Tay- lor series or Taylor's formula. If the remainder is omitted, the right side of Eq. (B4.1.1) is the Taylor polynomial approximation to f (x). In essence, the theorem states that any smooth function can be approximated as a polynomial.

Equation (B4.1.2) is but one way, called the integralform, by which the remainder can be expressed. An alternative formulation can be derived on the basis of the integral mean-value theorem.

First theorem of mean for integrals: If the function g is continuous and integrable on an interval containing a and x, then there exists a point 6 between a and x such that

In other words, this theorem states that the integral can be represented by an average value for the function g(e) times the interval length x - a. Because the average must occur between the minimum and maximum values for the interval, there is a point x = 6 at which the function takes on the average value.

The first theorem is in fact a special case of a second mean- value theorem for integrals.

Second theorem of mean for integrals: If the functions g and h are continuous and integrable on an interval containing a and x, and h does not change sign in the interval, then there exists a point 4 between a and x such that

Thus, Eq. (B4.1.3) is equivalent to Eq. (B4.1.4) with h(t) = 1. The second theorem can be applied to Eq. (B4.1.2) with

As t varies from a to x, h(t) is continuous and does not change sign. Therefore, iff ("+')(t) is continuous, then the integral mean-value theorem holds and

This equation is referred to as the derivative or Lagrange form of the remainder.

of the Taylor series are required to provide a better estimate. For example, the first-order approximation is developed by adding another term to yield

The additional first-order term consists of a slope f ' (xi) multiplied by the distance between xi and xi+]. Thus, the expression is now in the fonn of a straight line and is capable of predicting an increase or decrease of the function between xi and xi+,.

Although Eq. (4.3) can predict a change, it is exact only for a straight-line, or linear, trend. Therefore, a second-order term is added to the series to capture some of the curvature that the function might exhibit:

4.1 THE TAYLOR SERIES 8 1

In a similar manner, additional terms can be included to develop the complete Taylor series expansion:

f '((4 f(xi+l> = f (x i ) + fl(xi)(xi+l - x;) + - 2! (xi+, - xi)'

f '3' (xi 1 +- f '"I (4 ) (xi+, - x ; ) ~ + . . . + -

3! n! (xi+l - xi)" + R, (4.5)

Note that because Eq. (4.5) is an infinite series, an equal sign replaces the approximate sign that was used in Eqs. (4.2) through (4.4). A remainder term is included to account for all terms from n + 1 to infinity:

where the subscript n connotes that this is the remainder for the nth-order approximation and < is a value of x that lies somewhere between xi and x;+l. The introduction of the < is so important that we will devote an entire section (Sec. 4.1.1) to its derivation. For the time being, it is sufficient to recognize that there is such a value that provides an exact determination of the error.

It is often convenient to simplify the Taylor series by defining a step size h = xi+, - xi and expressing Eq. (4.5) as

where the remainder term is now

EXAMPLE 4.1 Taylor Series Approximation of a Polynomial

Problem Statement. Use zero- through fourth-order Taylor series expansions to approximate the function

f ( x ) = -0.1x4 - 0.15x3 - 0.5x2 - 0 . 2 5 ~ + 1.2

from xi = 0 with h = 1. That is, predict the function's value at xi+l = 1.

Solution. Because we are dealing with a known function, we can compute values for f (x) between 0 and 1. The results (Fig. 4.1) indicate that the function starts at f (0) = 1.2 and then curves downward to f(1) = 0.2. Thus, the true value that we are trying to predict is 0.2.

The Taylor series approximation with n = 0 is [Eq. (4.2)]


FIGURE 4.1 The approximation of Fix) = - 0.1 x4 - 0.1 5x3 - 0.5x2 - 0.25~ + 1.2 at x = 1 by zero-order, first-order, and second-order Taylor series expansions.

Thus, as in Fig. 4.1, the zero-order approximation is a constant. Using this formulation results in a truncation error [recall Eq. (3.2)] of

E, = 0.2 - 1.2 = -1.0

atx = 1 . For n = 1 , the first derivative must be determined and evaluated at x = 0:

f'(0) = - 0.4(0.0)~ - 0.45(0.0)~ - l.O(O.0) - 0.25 = - 0.25

Therefore, the first-order approximation is [Eq. (4.3)]

f(xi+,) 2 1.2 - 0.25h

which can be used to compute f(1) = 0.95. Consequently, the approximation begins to capture the downward trajectory of the function in the form of a sloping straight line (Fig. 4.1). This results in a reduction of the truncation error to

E, = 0.2 - 0.95 = -0.75

For n = 2, the second derivative is evaluated at x = 0:

Therefore, according to Eq. (4.4),

and substituting h = 1, f (1) = 0.45. The inclusion of the second derivative now adds some downward curvature resulting in an improved estimate, as in Fig. 4.1. The truncation error is reduced further to 0.2 - 0.45 = -0.25.

4.1 THE TAYLOR SERIES 83

Additional terms would improve the approximation even more. In fact, the inclusion of the third and the fourth derivatives results in exactly the same equation we started with:

f(x) = 1.2 - 0.25h - 0.5h2 - 0.15h3 - 0.1h4

/ where the remainder term is

because the fifth derivative of a fourth-order polynomial is zero. Consequently, the Taylor series expansion to the fourth derivative yields an exact estimate at xi+l = 1:

In general, the nth-order Taylor series expansion will be exact for an nth-order polynomial. For other differentiable and continuous functions, such as exponentials and sinu- soids, a finite number of terms will not yield an exact estimate. Each additional term will contribute some improvement, however slight, to the approximation. This behavior will be demonstrated in Example 4.2. Only if an infinite number of terms are added will the series yield an exact result.

Although the above is true, the practical value of Taylor series expansions is that, in most cases, the inclusion of only a few terms will result in an approximation that is close enough to the true value for practical purposes. The assessment of how many terms are required to get "close enough" is based on the remainder term of the expansion. Recall that the remainder term is of the general form of Eq. (4.8). This relationship has two major drawbacks. First, ,$ is not known exactly but merely lies somewhere between xi and xi+*. Second, to evaluate Eq. (4.8), we need to determine the (n + 1)th derivative of f(x). To do this, we need to know f(x). However, if we knew f(x), there would be no need to perform the Taylor series expansion in the present context!

Despite this dilemma, Eq. (4.8) is still useful for gaining insight into truncation errors. This is because we do have control over the term h in the equation. In other words, we can choose how far away from x we want to evaluate f(x), and we can control the number of terms we include in the expansion. Consequently, Eq. (4.8) is usually expressed as

where the nomenclature O(hnt') means that the truncation error is of the order of hn+'. That is, the error is proportional to the step size h raised to the (n + 1)th power. Although this approximation implies nothing regarding the magnitude of the derivatives that multiply hn+', it is extremely useful in judging the comparative error of numerical methods based on Taylor series expansions. For example, if the error is O(h), halving the step size will halve the error. On the other hand, if the error is 0(h2), halving the step size will quar- ter the error.

In general, we can usually assume that the truncation error is decreased by the addition of terms to the Taylor series. In many cases, if h is sufficiently small, the first and other lower-order terms usually account for a disproportionately high percent of the error. Thus, only a few terms are required to obtain an adequate estimate. This property is illustrated by the following example.


EXAMPLE 4.2 Use of Taylor Series Expansion to Approximate a Function with an Infinite Number of Derivatives

Problem Statement. Use Taylor series expansions with n = 0 to 6 to approximate f(x) = cos x at xi+, = n / 3 on the basis of the value of f(x) and its derivatives at x; = n / 4 . Note that this means that h = n / 3 - n / 4 = n/ 12.

Solution. As with Example 4.1, our knowledge of the true function means that we can determine the correct value f (n /3) = 0.5.

The zero-order approximation is [Eq. (4.3)]

n f (?) S cos ($) = 0.707106781

I which represents a percent relative error of

For the first-order approximation, we add the first derivative term where f '(x) = -sin x:

n n IT n f (?) Z cos (q) - sin (q) (12) = 0.521986659

which has E, = -4.40 percent. For the second-order approximation, we add the second derivative term where

f "(x) = -COS x:

f ( f ) e cos ($) - sin (%) (:) - cos ( n / 4 ) n 2

2 (i?) = 0.497754491

with E* = 0.449 percent. Thus, the inclusion of additional terms results in an improved estimate.

The process can be continued and the results listed, as in Table 4.1. Notice that the derivatives never go to zero as was the case with the polynomial in Example 4.1. Therefore, each additional term results in some improvement in the estimate. However, also notice how most of the improvement comes with the initial terms. For this case, by the time we

TABLE 4.1 Taylor series approximation of f (x) = cos x at xi+, = lr/3 using a base point of n/4. Values are shown for various orders (nl of approximation.

Order n f ("'(x) f ( d 3 ) st -- -- - -~ -

cos x 0.707 1 0678 1 -41.4 -sin x 0.52 1986659 -4 4

-COS x 0.49775449 1 0.449 sin x 0.499869 147 2.62 x

cos x 0.50000755 1 -1.51 x lo-3 -sin x 0.500000304 -6.08 x 1 0-5

-cos x 0.499999988 2 40 x


have added the third-order term, the error is reduced to 2.62 x percent, which means that we have attained 99.9738 percent of the true value. Consequently, although the addition of more terms will reduce the error further, the improvement becomes negligible.

4.1.1 The Remainder for the Taylor Series Expansion

Before demonstrating how the Taylor series is actually used to estimate numerical errors, we must explain why we included the argument 6 in Eq. (4.8). A mathematical derivation is presented in Box 4.1. We will now develop an alternative exposition based on a somewhat more visual interpretation. Then we can extend this specific case to the more general formulation.

Suppose that we truncated the Taylor series expansion [Eq. (4.7)] after the zero-order term to yield

A visual depiction of this zero-order prediction is shown in Fig. 4.2. The remainder, or error, of this prediction, which is also shown in the illustration, consists of the infinite series of terms that were truncated:

It is obviously inconvenient to deal with the remainder in this infinite series format. One simplification might be to truncate the remainder itself, as in

FIGURE 4.2 Graphical depiction of a zero-order Taylor series re diction and remainder.


Although, as stated in the previous section, lower-order derivatives usually account for a greater share of the remainder than the higher-order terms, this result is still inexact because of the neglected second- and higher-order terms. This "inexactness" is implied by the approximate equality symbol ( Z ) employed in Eq. (4.9).

An alternative simplification that transforms the approximation into an equivalence is based on a graphical insight. As in Fig. 4.3, the derivative mean-value theorem states that if a function f(x) and its first derivative are continuous over an interval from xi to x,+l, then there exists at least one point on the function that has a slope, designated by f '(c), that is parallel to the line joining f(xi) and f(xi+1). The parameter 6 marks the x value where this slope occurs (Fig. 4.3). A physical illustration of this theorem is that, if you travel between two points with an average velocity, there will be at least one moment during the course of the trip when you will be moving at that average velocity.

By invoking this theorem it is simple to realize that, as illustrated in Fig. 4.3, the slope f '(c) is equal to the rise Ro divided by the run h, or

which can be rearranged to give

RO = f ' ( W (4.10)

Thus, we have derived the zero-order version of Eq. (4.8). The higher-order versions are merely a logical extension of the reasoning used to derive Eq. (4.10). The first-order version is

FIGURE 4.3 Graphical depiction of the derivative mean-value theorem.


For this case, the value of C; conforms to the x value corresponding to the second derivative that makes Eq. (4.11) exact. Similar higher-order versions can be developed from Eq. (4.8).

4.1.2 Using the Taylor Series to Estimate Truncation Errors

Although the Taylor series will be extremely useful in estimating truncation errors throughout this book, it may not be clear to you how the expansion can actually be applied to numerical methods. In fact, we have already done so in our example of the falling parachutist. Recall that the objective of both Examples 1 . 1 and 1.2 was to predict velocity as a function of time. That is, we were interested in determining v(t). As specified by Eq. ( 4 3 , v(t) can be expanded in a Taylor series:

Now let us truncate the series after the first derivative term:

Equation (4.13) can be solved for

v(ti+l> - v(t i ) R1 = --

t i + ~ - ti ti+l - tl -- First-order approximation error

The first part of Eq. (4.14) is exactly the same relationship that was used to approximate the derivative in Example 1.2 [Eq. (1.1 I ) ] . However, because of the Taylor series approach, we have now obtained an estimate of the tlancation error associated with this approximation of the derivative. Using Eqs. (4.6) and (4.14) yields

R1 -- - o(t i+l - ti) (4.16) ti+^ - ti

Thus, the estimate of the derivative [Eq. ( 1 . 1 1 ) or the first part of Eq. (4.14)] has a truncation error of order ti+l - ti. In other words, the error of our derivative approximation should be proportional to the step size. Consequently, if we halve the step size, we would expect to halve the error of the derivative.

EXAMPLE 4.3 The Effect of Nonlinearity and Step Size on the Taylor Series Approximation

Problem Statement. Figure 4.4 is a plot of the function

f (x) = xm (E4.3.1)

for m = 1 , 2 , 3 , and 4 over the range from x = 1 to 2. Notice that for m = 1 the function is linear, and as m increases, more curvature or nonlinearity is introduced into the function.


FIGURE 4.4 Plot of the function f(x) = xm for rn = 1 , 2, 3, and 4. Notice that the function becomes more nonlinear as rn increases.

Employ the first-order Taylor series to approximate this function for various values of the exponent m and the step size h.

Solution. Equation (E4.3.1) can be approximated by a first-order Taylor series expansion, as in

f(xi+l) = f(x;) + mx"-'h (E4.3.2)

which has a remainder

First, we can examine how the approximation performs as m increases-that is, as the function becomes more nonlinear. For m = 1, the actual value of the function at x = 2 is 2.


The Taylor series yields

f(2) = 1 + 1(1)=2

and

The remainder is zero because the second and higher derivatives of a linear function are zero. Thus, as expected, the first-order Taylor series expansion is perfect when the underlying function is linear.

For m = 2, the actual value is f(2) = 22 = 4. The first-order Taylor series approximation is

and

R1 = i ( 1 ) 2 + ~ + ~ + . . . = 1

Thus, because the function is a parabola, the straight-line approximation results in a discrepancy. Note that the remainder is determined exactly.

For m = 3, the actual value is f(2) = 23 = 8. The Taylor series approximation is

and

R1 = + 31)~ + o + o + . . . = 4

Again, there is a discrepancy that can be determined exactly from the Taylor series. Form = 4, the actual value is f (2) = 24 = 16. The Taylor series approximation is

and

RI = ? ( I ) ~ + $ ( ~ ) ~ + ~ ( I ) ~ + O + O + . . . = 11

On the basis of these four cases, we observe that R1 increases as the function becomes more nonlinear. Furthermore, Rl accounts exactly for the discrepancy. This is because Eq. (E4.3.1) is a simple monomial with a finite number of derivatives. This permits a complete determination of the Taylor series remainder.

Next, we will examine Eq. (E4.3.2) for the case m = 4 and observe how R1 changes as the step size h is varied. For m = 4, Eq. (E4.3.2) is

If x = 1, f(1) = 1 and this equation can be expressed as

with a remainder of


FIGURE 4.5 Log-log plot of the remainder R 1 of the first-order Taylor series approximation of the function F(x) = x4 versus step size h. A line with a slo e of 2 is also shown to indicate that as h decreases, the error becomes proportional to /T2.

TABLE 4.2 Comparison of the exact value of the function f ( x ) = x4 with the first-order Taylor series approximation. Both the function and the approximation are evaluated at x + h, where x = 1 .

True First-Order

Approximation RI

This leads to the conclusion that the discrepancy will decrease as h is reduced. Also, at sufficiently small values of h, the error should become proportional to h2. That is, as h is halved, the error will be quartered. This behavior is confirmed by Table 4.2 and Fig. 4.5.

Thus, we conclude that the error of the first-order Taylor series approximation decreases as m approaches 1 and as h decreases. Intuitively, this means that the Taylor series

A. 1 THE TAYLOR SERIES 91

becomes more accurate when the function we are approximating becomes more like a straight line over the interval of interest. This can be accomplished either by reducing the size of the interval or by "straightening" the function by reducing m. Obviously, the latter option is usually not available in the real world because the functions we analyze are typically dictated by the physical problem context. Consequently, we do not have control of their lack of linearity, and our only recourse is reducing the step size or including additional terms in the Taylor series expansion.

4.1.3 Numerical Differentiation

Equation (4.14) is given a formal label in numerical methods-it is called afinite divided difference. It can be represented generally as

Afi f '(xi ) = - h + O (h) (4.18)

where Af, is referred to as thefirst forward difference and h is called the step size, that is, the length of the interval over which the approximation is made. It is termed a "forward" difference because it utilizes data at i and i + 1 to estimate the derivative (Fig. 4 .6~) . The entire term A f l h is referred to as afirstfinite divided difference.

This forward divided difference is but one of many that can be developed from the Taylor series to approximate derivatives numerically. For example, backward and centered difference approximations of the first derivative can be developed in a fashion similar to the derivation of Eq. (4.14). The former utilizes values at xi-1 and xi (Fig. 4.6b), whereas the latter uses values that are equally spaced around the point at which the derivative is estimated (Fig. 4.6~). More accurate approximations of the first derivative can be developed by including higher-order terms of the Taylor series. Finally, all the above versions can also be developed for second, third, and higher derivatives. The following sections provide brief summaries illustrating how some of these cases are derived.

Backward Difference Approximation of the First Derivative. The Taylor series can be expanded backward to calculate a previous value on the basis of a present value, as in

Truncating this equation after the first derivative and rearranging yields

where the error is O(h) and Vf, is referred to as thefirst backward difference. See Fig. 4.6b for a graphical representation.


FIGURE 4.6 Graphical depiction of (a) forward, (b) backward, and (c) centered finitedivided-difference approximations of the first derivative.


Centered Difference Approximation of the First Derivative. A third way to approximate the first derivative is to subtract Eq. (4.19) from the forward Taylor series expansion:

to yield

which can be solved for

f + - f i - f " ' ( ~ i ) ~ 2 + +. . . f '(xi 1 = -- 2h 6

Equation (4.22) is a centered difference representation of the first derivative. Notice that the truncation error is of the order of h2 in contrast to the forward and backward approximations that were of the order of h. Consequently, the Taylor series analysis yields the practical information that the centered difference is a more accurate representation of the derivative (Fig. 4 . 6 ~ ) . For example, if we halve the step size using a forward or backward difference, we would approximately halve the truncation error, whereas for the central difference, the error would be quartered.

EXAMPLE 4.4 Finite-Divided-Difference Approximations of Derivatives

Problem Statement. Use forward and backward difference approximations of O(h) and a centered difference approximation of 0(h2) to estimate the first derivative of

f (x) = -0.1x4 - 0.15x3 - 0.5x2 - 0 . 2 5 ~ + 1.2

at x = 0.5 using a step size h = 0.5. Repeat the computation using h = 0.25. Note that the derivative can be calculated directly as

f ' (x ) = -0.4x3 - 0.45x2 - 1 . 0 ~ - 0.25

and can be used to compute the true value as f'(0.5) = -0.9125.

Solution. For h = 0.5, the function can be employed to determine

X i - 1 = 0 f ( x i p l ) = 1.2 X ; = 0.5 f (x i ) = 0.925

X i + l = 1.0 f ( ~ ~ + ~ ) = 0.2

These values can be used to compute the forward divided difference [Eq. (4.17)],

4.2 ERROR PROPAGATION 95

This relationship is called the second forwardfinite divided difference. Similar manipulations can be employed to derive a backward version

and a centered version

f "(xi) = f(xi+l) - 2 f ( x i ) + f ( x i - I ) + O(hi)

h2

As was the case with the first-derivative approximations, the centered case is more accurate. Notice also that the centered version can be alternatively expressed as

Thus, just as the second derivative is a derivative of a derivative, the second divided difference approximation is a difference of two first divided differences.

We will return to the topic of numerical differentiation in Chap. 23. We have introduced you to the topic at this point because it is a very good example of why the Taylor series is important in numerical methods. In addition, several of the formulas introduced in this section will be employed prior to Chap. 23.

4.2 ERROR PROPAGATION

The purpose of this section is to study how errors in numbers can propagate through mathematical functions. For example, if we multiply two numbers that have errors, we would like to estimate the error in the product.

4.2.1 Functions of a Single Variable

Suppose that we have a function f (x) that is dependent on a single independent variable x. Assume that i is an approximation of x. We, therefore, would like to assess the effect of the discrepancy between x and 3 on the value of the function. That is, we would like to estimate

The problem with evaluating A f ( i ) is that f (x) is unknown because x is unknown. We can overcome this difficulty if i is close to x and f ( 2 ) is continuous and differentiable. If these conditions hold, a Taylor series can be employed to compute f (x) near f(.?), as in

Dropping the second- and higher-order terms and rearranging yields

f ( x ) - f ( x ) r f y i ) ( x - i)


FIGURE 4.7 Graphical depiction of first- order error propagation.

where Af(x") = I f (x) - f(x")l represents an estimate of the error of the function and Ax" = Ix - x"I represents an estimate of the error of x. Equation (4.25) provides the capability to approximate the error in f ( x ) given the derivative of a function and an estimate of the error in the independent variable. Figure 4.7 is a graphical illustration of the operation.

EXAMPLE 4.5 Error Propagation in a Function of a Single Variable

Problem Statement. Given a value of x" = 2.5 with an error of Ax" = 0.01, estimate the resulting error in the function, f ( x ) = x 3.

Solution. Using Eq. (4.25),

A f(x") Z 3(2.5)2(0.01) = 0.1875

Because f (2.5) = 15.625, we predict that

.f(2.5) = 15.625 f 0.1875

or that the true value lies between 15.4375 and 15.8125. In fact, if x were actually 2.49, the function could be evaluated as 15.4382 and if x were 2.51, it would be 15.8132. For this case, the first-order error analysis provides a fairly close estimate of the true error.

4.2 ERROR PROPAGATION 97

4.2.2 Functions of More than One Variable

The foregoing approach can be generalized to functions that are dependent on more than one independent variable. This is accomplished with a multivariable version of the Taylor series. For example, if we have a function of two 'ndependent variables u and v, the Tay- lor series can be written as

where all partial derivatives are evaluated at the base point i . If all second-order and higher terms are dropped, Eq. (4.26) can be solved for

where A k and A6 = estimates of the errors in u and v , respectively. For n independent variables i l , i 2 , . . . , 2 , having errors A x l , A i 2 , . . . , AX, the fol-

lowing general relationship holds:

EXAMPLE 4.6 Error Propagation in a Multivariable Function

1 Problem Statement. The deflection y of the top of a sailboat mast is

where F = a uniform side loading (lblft), L = height (ft), E = the modulus of elasticity (lb/ft2), and I = the moment of inertia (ft4). Estimate the error in y given the following data:

1 Solution. Employing Eq. (4.27) gives


Substituting the appropriate values gives

Ay = 0.0225 + 0.0075 + 0.00375 + 0.005625 = 0.039375

Therefore, y = 0.5625 f 0.039375. In other words, y is between 0.523125 and 0.601875 ft. The validity of these estimates can be verified by substituting the extreme values for the variables into the equation to generate an exact minimum of

52(30. 1)4 Ymax = = 0.60285 g(1.49 x 108)0.0594

Thus, the first-order estimates are reasonably close to the exact values.

Equation (4.27) can be employed to define error propagation relationships for common mathematical operations. The results are summarized in Table 4.3. We will leave the derivation of these formulas as a homework exercise.

4.2.3 Stability and Condition

The condition of a mathematical problem relates to its sensitivity to changes in its input values. We say that a computation is numerically unstable if the uncertainty of the input values is grossly magnified by the numerical method.

These ideas can be studied using a first-order Taylor series

This relationship can be employed to estimate the relative error of f(x) as in

The relative error of x is given by

TABLE 4.3 Estimated error bounds associated with common mathematical operations using inexact numbers ij and k

Operation Estimated Error

Addition Subtraction Multiplication

Division

Ai, + A v Ai, + Ai.

l U l ~ i . + l i .1~6 161~; + li.1~13

l;I2

4.3 TOTAL NUMERICAL ERROR 99

A condition number can be defined as the ratio of these relative errors

i f ' ( i ) Condition number = -

f(3 The condition number provides a measure of the extent to which an uncertainty in x is magnified by f (x). A value of 1 tells us that the function's relative error is identical to the relative error in x. A value greater than 1 tells us that the relative error is amplified, whereas a value less than 1 tells us that it is attenuated. Functions with very large values are said to be ill-conditioned. Any combination of factors in Eq. (4.28) that increases the numerical value of the condition number will tend to magnify uncertainties in the computation of f(x).

EXAMPLE 4.7 Condition Number

I Problem Statement. Compute and interpret the condition number for

f ( x ) = tanx for .i =

I Solution. The condition number is computed as

i ( l / c0s2 x ) Condition number =

tan i

I For i = "12 + O.l (n/2) ,

1.7279(40.86) Condition number = = -11.2

-6.314

Thus, the function is ill-conditioned. For 2 = n / 2 + 0.01(n/2) , the situation is even worse:

1.5865(4053) Condition number = = -101

-63.66

For this case, the major cause of ill conditioning appears to be the derivative. This makes sense because in the vicinity of n/2, the tangent approaches both positive and negative infinity.

4.3 TOTAL NUMERICAL ERROR The total numerical error is the summation of the truncation and round-off errors. In general, the only way to minimize round-off errors is to increase the number of significant figures of the computer. Further, we have noted that round-off error will increase due to subtractive cancellation or due to an increase in the number of computations in an analysis. In contrast, Example 4.4 demonstrated that the truncation error can be reduced by decreasing the step size. Because a decrease in step size can lead to subtractive cancellation or to an increase in computations, the truncation errors are decreased as the round-off errors are increased. Therefore, we are faced by the following dilemma: The strategy for decreasing


FIGURE 4.8 A graphical depiction of the trade-off between round-off and truncation error that sometimes comes into pla in the course of a numerical method. The point of diminishing returns is shown, where round-o Y f error begins to negate the benefits of step-size reduction.

one component of the total error leads to an increase of the other component. In a computation, we could conceivably decrease the step size to minimize truncation errors only to discover that in doing so, the round-off error begins to dominate the solution and the total error grows! Thus, our remedy becomes our problem (Fig. 4.8). One challenge that we face is to determine an appropriate step size for a particular computation. We would like to choose a large step size in order to decrease the amount of calculations and round-off errors without incurring the penalty of a large truncation error. If the total error is as shown in Fig. 4.8, the challenge is to identify the point of diminishing returns where round-off error begins to negate the benefits of step-size reduction.

In actual cases, however, such situations are relatively uncommon because most computers carry enough significant figures that round-off errors do not predominate. Never- theless, they sometimes do occur and suggest a sort of "numerical uncertainty principle" that places an absolute limit on the accuracy that may be obtained using certain computerized numerical methods.

4.3.1 Control of Numerical Errors

For most practical cases, we do not know the exact error associated with numerical methods. The exception, of course, is when we have obtained the exact solution that makes our numerical approximations unnecessary. Therefore, for most engineering applications we must settle for some estimate of the error in our calculations.

There are no systematic and general approaches to evaluating numerical errors for all problems. In many cases error estimates are based on the experience and judgment of the engineer.

4.4 BLUNDERS, FORMULATION ERRORS, AND DATA UNCERTAINTY 101

Although error analysis is to a certain extent an art, there are several practical programming guidelines we can suggest. First and foremost, avoid subtracting two nearly equal numbers. Loss of significance almost always occurs when this is done. Sometimes you can rearrange or reformulate the problem to avoid subtractive cancellation. If this is not possible, you may want to use extended-precision arithmetic. Furthermore, when adding and subtracting numbers, it is best to sort the numbers and work with the smallest numbers first. This avoids loss of significance.

Beyond these computational hints, one can attempt to predict total numerical errors using theoretical formulations. The Taylor series is our primary tool for analysis of both truncation and round-off errors. Several examples have been presented in this chapter. Pre- diction of total numerical error is very complicated for even moderately sized problems and tends to be pessimistic. Therefore, it is usually attempted for only small-scale tasks.

The tendency is to push forward with the numerical computations and try to estimate the accuracy of your results. This can sometimes be done by seeing if the results satisfy some condition or equation as a check. Or it may be possible to substitute the results back into the original equation to check that it is actually satisfied.

Finally you should be prepared to perform numerical experiments to increase your awareness of computational errors and possible ill-conditioned problems. Such experiments may involve repeating the computations with a different step size or method and comparing the results. We may employ sensitivity analysis to see how our solution changes when we change model parameters or input values. We may want to try different numerical algorithms that have different theoretical foundations, are based on different computational strategies, or have different convergence properties and stability characteristics.

When the results of numerical computations are extremely critical and may involve loss of human life or have severe economic ramifications, it is appropriate to take special precautions. This may involve the use of two or more independent groups to solve the same problem so that their results can be compared.

The roles of errors will be a topic of concern and analysis in all sections of this book. We will leave these investigations to specific sections.

4.4 BLUNDERS, FORMULATION ERRORS, A N D DATA UNCERTAINTY p p p p p -

Although the following sources of error are not directly connected with most of the numerical methods in this book, they can sometimes have great impact on the success of a modeling effort. Thus, they must always be kept in mind when applying numerical techniques in the context of real-world problems.

4.4.1 Blunders

We are all familiar with gross errors, or blunders. In the early years of computers, erroneous numerical results could sometimes be attributed to malfunctions of the computer itself. Today, this source of error is highly unlikely, and most blunders must be attributed to human imperfection.

Blunders can occur at any stage of the mathematical modeling process and can contribute to all the other components of error. They can be avoided only by sound knowledge

1 02 TRUNCATION ERRORS AND THE TAYLOR SERIES

of fundamental principles and by the care with which you approach and design your solution to a problem.

Blunders are usually disregarded in discussions of numerical methods. This is no doubt due to the fact that, try as we may, mistakes are to a certain extent unavoidable. How- ever, we believe that there are a number of ways in which their occurrence can be minimized. In particular, the good programming habits that were outlined in Chap. 2 are extremely useful for mitigating programming blunders. In addition, there are usually simple ways to check whether a particular numerical method is working properly. Throughout this book, we discuss ways to check the results of numerical calculations.

4.4.2 Formulation Errors

Formulation, or model, errors relate to bias that can be ascribed to incomplete mathematical models. An example of a negligible formulation error is the fact that Newton's second law does not account for relativistic effects. This does not detract from the adequacy of the solution in Example 1.1 because these errors are minimal on the time and space scales associated with the falling parachutist problem.

However, suppose that air resistance is not linearly proportional to fall velocity, as in Eq. (1.7), but is a function of the square of velocity. If this were the case, both the analytical and numerical solutions obtained in the first chapter would be erroneous because of formulation error. Further consideration of formulation error is included in some of the engineering applications in the remainder of the book. You should be cognizant of these problems and realize that, if you are working with a poorly conceived model, no numerical method will provide adequate results.

4.4.3 Data Uncertainty

Errors sometimes enter into an analysis because of uncertainty in the physical data upon which a model is based. For instance, suppose we wanted to test the falling parachutist model by having an individual make repeated jumps and then measuring his or her velocity after a specified time interval. Uncertainty would undoubtedly be associated with these measurements, since the parachutist would fall faster during some jumps than during others. These errors can exhibit both inaccuracy and imprecision. If our instruments consis- tently underestimate or overestimate the velocity, we are dealing with an inaccurate, or biased, device. On the other hand, if the measurements are randomly high and low, we are dealing with a question of precision.

Measurement errors can be quantified by summarizing the data with one or more well- chosen statistics that convey as much information as possible regarding specific characteristics of the data. These descriptive statistics are most often selected to represent (I) the location of the center of the distribution of the data and (2) the degree of spread of the data. As such, they provide a measure of the bias and imprecision, respectively. We will return to the topic of characterizing data uncertainty in Part Five.

Although you must be cognizant of blunders, formulation errors, and uncertain data, the numerical methods used for building models can be studied, for the most part, independently of these errors. Therefore, for most of this book, we will assume that we have not made gross errors, we have a sound model, and we are dealing with error-free measurements. Under these conditions, we can study numerical errors without cornplicating factors.

PROBLEMS 1 03

PROBLEMS

4.1 The infinite series Compare your estimates with the true value of the second derivative. Interpret your results on the basis of the remainder term of the

x2 x3 x" e x = l + x + - + - + . . . + - Taylor series expansion.

2 3! n ! 4.8 Recall that the velocity of the falling parachutist can be com-

puted by [Eq. (1.10)], can be used to approximate ex. (a) Prove that this Maclaurin series expansion is a special case of g m V(t) = - (1 - e-(clm)'

the Taylor series expansion [Eq. (4.7)] with xi = 0 and h = x. c ) (b) Use the Taylor series to estimate f (x) = e-' at xi+, = 1 for

Use a first-order error analysis to estimate the error of v at t = 6, if x, = 0.25. Employ the zero-, first-, second-, and third-order

g = 9.8 and m = 50 but c = 12.5 f 2. versions and compute the I for each case.

4.9 Repeat Prob. 4.8 with g = 9.8, t = 6, c = 12.5 f 2, and m 4.2 The Maclaurin series expansion for cos x is

= 50 f 0.5.

Starting with the simplest version, cos x = 1, add terms one at a time to estimate cos (7114). After each new term is added, compute the true and approximate percent relative errors. Use your pocket calculator to determine the true value. Add terms until the absolute value of the approximate error estimate falls below an error criterion conforming to two significant figures. 4.3 Perform the same computation as in Prob. 4.2, but use the Maclaurin series expansion for sin x,

to estimate sin (7114). 4.4 Use zero- through third-order Taylor series expansions to predict f (2) for

using a base point at x = 1. Compute the true percent relative error cr for each approximation. 4.5 Use zero- through fourth-order Taylor series expansions to predict f(3) for f(x) = In x using a base point at x = 1. Compute the true percent relative error st for each approximation. Discuss the meaning of the results. 4.6 Use forward and backward difference approximations of O(h) and a centered difference approximation of O(h2) to estimate the first derivative of the function examined in Prob. 4.4. Evaluate the derivative at x = 2 using a step size of h = 0.25. Compare your results with the true value of the derivative. Interpret your results on the basis of the remainder term of the Taylor series expansion. 4.7 Use a centered difference approximation of 0(h2) to estimate the second derivative of the function examined in Prob. 4.4. Per- form the evaluation at x = 2 using step sizes of h = 0.2 and 0.1.

4.10 The Stefan-Boltzmann law can be employed to estimate the rate of radiation of energy H from a surface, as in

where H is in watts, A = the surface area (m2), e = the emissivity that characterizes the emitting properties of the surface (dimen- sionless), u = a universal constant called the Stefan-Boltzmann constant (= 5.67 x lo-* W m-2 KF4), and T = absolute temperature (K). Determine the error of H for a steel plate with A = 0.15 m2, e = 0.90, and T = 650 f 25. Compare your results with the exact error. Repeat thecomputation but with T = 650 f 50. Interpret your results. 4.11 Repeat Prob. 4.10 but for a copper sphere with radius = 0.15 f 0.02 m, e = 0.90 & 0.05, and T = 550 f 25. 4.12 Evaluate and interpret the condition numbers for

(a)f (x)=, / -+l forx=1.0001

(b) f(x) = e-" for x = 9

(c) f(x) = - x forx = 200

for x = 0.01

sin x (e) f ( 4 = --- for x = 1.001n

1 + cosx 4.13 Employing ideas from Sec. 4.2, derive the relationships from Table 4.3. 4.14 Prove that Eq. (4.4) is exact for all values of x if f(x) = a 2 + bx + c. 4.15 Manning's formula for a rectangular channel can be written as

where Q = flow (m3/s), n = a roughness coefficient, B = width (m), H = depth (m), and S = slope. You are applying this formula to a stream where you know that the width = 20 m and the depth = 0.3 m. Unfortunately, you know the roughness and the slope to


only a f 10% precision. That is, you know that the roughness is where v, = the escape velocity of the missile. It is desired to fire about 0.03 with a range from 0.027 to 0.033 and the slope is 0.0003 the missile and reach the design maximum velocity within an accu- with a range from 0.00027 to 0.00033. Use a first-order error analy- racy of f 1%. Determine the range of values for & if ve/vo = 2 and sis to determine the sensitivity of the flow prediction to each of or = 0.2. these two factors. Which one should you attempt to measure with more precision? 4.16 If (x 1 < 1, it is known that FIGURE P4.17

Repeat Prob. 4.2 for this series for x = 0.1. 4.17 A missile leaves the ground with an initial velocity vo form- ing an angle $0 with the vertical as shown in Fig. P4.17. The maximum desired altitude is orR where R is the radius of the earth. The laws of mechanics can be used to show that

EPILOGUE: PART ONE

PT 1.4 TRADE-OFFS, Numerical methods are scientific in the sense that they represent systematic techniques for solving mathematical problems. However, there is a certain degree of art, subjective judgment, and compromise associated with their effective use in engineering practice. For each problem, you may be confronted with several alternative numerical methods and many different types of computers. Thus, the elegance and efficiency of different approaches to problems is highly individualistic and correlated with your ability to choose wisely among options. Unfortunately, as with any intuitive process, the factors influencing this choice are difficult to communicate. Only by experience can these skills be fully comprehended and honed. However, because these skills play such a prominent role in the effective implementation of the methods, we have included this section as an introduction to some of the trade- offs that you must consider when selecting a numerical method and the tools for implementing the method. It is hoped that the discussion that follows will influence your orientation when approaching subsequent material. Also, it is hoped that you will refer back to this material when you are confronted with choices and trade-offs in the remainder of the book.

1. Type of Mathematical Problem. As delineated previously in Fig. PT1.2, several types of mathematical problems are discussed in this book: (a) Roots of equations. (b) Systems of simultaneous linear algebraic equations. (c) Optimization. (d) Curve fitting. (e) Numerical integration. ( f ) Ordinary differential equations. ( g) Partial differential equations.

You will probably be introduced to the applied aspects of numerical methods by con- fronting a problem in one of the above areas. Numerical methods will be required because the problem cannot be solved efficiently using analytical techniques. You should be cognizant of the fact that your professional activities will eventually involve problems in all the above areas. Thus, the study of numerical methods and the selection of automatic computation equipment should, at the minimum, consider these basic types of problems. More advanced problems may require capabilities of handling areas such as functional approximation, integral equations, etc. These areas typically demand greater computation power or advanced methods not covered in this text. Other references such as Carnahan, Luther, and Wilkes (1969); Hamming (1973); Ralston and Rabinowitz (1978); and Burden and Faires (1993) should be consulted for problems beyond the scope of this book. In addition, at the end of each part of this text, we include a brief summary and references for

1 06 EPILOGUE: PART ONE

advanced methods to provide you with avenues for pursuing further studies of numerical methods.

2. Type, Availability, Precision, Cost, and Speed of Computer. You may have the option of working with a variety of computation tools. These range from pocket calculators to large mainframe computers. Of course, any of the tools can be used to implement any numerical method (including simple paper and pencil). It is usually not a question of ultimate capability but rather of cost, convenience, speed, dependability, repeatability, and precision. Although each of the tools will continue to have utility, the recent rapid advances in the performance of personal computers have already had a major impact on the engineering profession. We expect this revolution will spread as technological improvements continue, because personal computers offer an excellent compromise in convenience, cost, precision, speed, and storage capacity. Furthermore, they can be readily applied to most practical engineering problems.

3. Program Development Cost versus Software Cost versus Run-Time Cost. Once the types of mathematical problems to be solved have been identified and the computer system has been selected, it is appropriate to consider software and run-time costs. Software development may represent a substantial effort in many engineering projects and may therefore be a significant cost. In this regard, it is particularly important that you be very well acquainted with the theoretical and practical aspects of the relevant numerical methods. In addition, you should be familiar with professionally developed software. Low-cost software is widely available to implement numerical methods that may be readily adapted to a broad variety of problems.

4. Characteristics of the Numerical Method. When computer hardware and software costs are high, or if computer availability is limited (for example, on some timeshare systems), it pays to choose carefully the numerical method to suit the situation. On the other hand, if the problem is still at the exploratory stage and computer access and cost are not problems, it may be appropriate for you to select a numerical method that always works but may not be the most computationally efficient. The numerical methods available to solve any particular type of problem involve the types of trade-offs just discussed and others: (a) Number of Initial Guesses or Starting Points. Some of the numerical methods for

finding roots of equations or solving differential equations require the user to specify initial guesses or starting points. Simple methods usually require one value, whereas complicated methods may require more than one value. The advantages of complicated methods that are computationally efficient may be offset by the re- quirement for multiple starting points. You must use your experience and judgment to assess the trade-offs for each particular problem.

(b) Rate of Convergence. Certain numerical methods converge more rapidly than others. However, this rapid convergence may require more refined initial guesses and more complex programming than a method with slower convergence. Again, you must use your judgment in selecting a method. Faster is not always better.

(c) Stability. Some numerical methods for finding roots of equations or solutions for systems of linear equations may diverge rather than converge on the correct answer for certain problems. Why would you tolerate this possibility when confronted with

PT1.4 TRADE-OFFS 107

design or planning problems? The answer is that these methods may be highly efficient when they work. Thus, trade-offs again emerge. You must decide if your problem requirements justify the effort needed to apply a method that may not always converge.

(d) Accuracy and Precision. Some numerical methods are simply more accurate or precise than others. Good examples are the various equations available for numerical integration. Usually, the performance of low-accuracy methods can be improved by decreasing the step size or increasing the number of applications over a given interval. Is it better to use a low-accuracy method with small step sizes or a high-accuracy method with large step sizes? This question must be addressed on a case-by-case basis taking into consideration the additional factors such as cost and ease of programming. In addition, you must also be concerned with round-off errors when you are using multiple applications of low-accuracy methods and when the number of computations becomes large. Here the number of significant figures handled by the computer may be the deciding factor.

(e) Breadth of Application. Some numerical methods can be applied to only a limited class of problems or to problems that satisfy certain mathematical restrictions. Other methods are not affected by such limitations. You must evaluate whether it is worth your effort to develop programs that employ techniques that are appropriate for only a limited number of problems. The fact that such techniques may be widely used suggests that they have advantages that will often outweigh their disadvantages. Obviously, trade-offs are occurring.

(f) Special Requirements. Some numerical techniques attempt to increase accuracy and rate of convergence using additional or special information. An example would be to use estimated or theoretical values of errors to improve accuracy. However, these improvements are generally not achieved without some inconvenience in terms of added computing costs or increased program complexity.

(g) Programming Effort Required. Efforts to improve rates of convergence, stability, and accuracy can be creative and ingenious. When improvements can be made without increasing the programming complexity, they may be considered elegant and will probably find immediate use in the engineering profession. However, if they require more complicated programs, you are again faced with a trade-off situation that may or may not favor the new method.

It is clear that the above discussion concerning a choice of numerical methods reduces to one of cost and accuracy. The costs are those involved with computer time and program development. Appropriate accuracy is a question of professional judgment and ethics.

5. Mathematical Behavior of the Function, Equation, or Data. In selecting a particular numerical method, type of computer, and type of software, you must consider the complexity of your functions, equations, or data. Simple equations and smooth data may be appropriately handled by simple numerical algorithms and inexpensive computers. The opposite is true for complicated equations and data exhibiting discontinuities.

6. Ease of Application (User-Friendly?). Some numerical methods are easy to apply; others are difficult. This may be a consideration when choosing one method over another. This same idea applies to decisions regarding program development costs versus

108 EPILOGUE: PART ONE

professionally developed software. It may take considerable effort to convert a difficult program to one that is user-friendly. Ways to do this were introduced in Chap. 2 and are elaborated throughout the book. In addition, the Numerical Methods TOOLKIT software supplement is an example of user-friendly programming.

7. Maintenance. Programs for solving engineering problems require maintenance because during application, difficulties invariably occur. Maintenance may require changing the program code or expanding the documentation. Simple programs and numerical algorithms are simpler to maintain.

The chapters that follow involve the development of various types of numerical methods for various types of mathematical problems. Several alternative methods will be given in each chapter. These various methods (rather than a single method chosen by the authors) are presented because there is no single "best" method. There is no best method because there are many trade-offs that must be considered when applying the methods to practical problems. A table that highlights the trade-offs involved in each method will be found at the end of each part of the book. This table should assist you in selecting the appropriate numerical procedure for your particular problem context.

PTl .S IMPORTANT RELATIONSHIPS AND FORMULAS

Table PT1.2 summarizes important information that was presented in Part One. The table can be consulted to quickly access important relationships and formulas. The epilogue of each part of the book will contain such a summary.

PT1.6 ADVANCED METHODS AND ADDITIONAL REFERENCES The epilogue of each part of the book will also include a section designed to facilitate and encourage further studies of numerical methods. This section will reference other books on the subject as well as material related to more advanced methods.'

To extend the background provided in Part One, numerous manuals on computer programming are available. It would be difficult to reference all the excellent books and manuals pertaining to specific languages and computers. In addition, you probably already have material from your previous exposure to programming. However, if this is your first experience with computers, Chapra and Canale (1994) provide a general introduction to BASIC and Fortran. Your instructor and fellow students should also be able to advise you regarding good reference books for the machines and languages available at your school.

As for error analysis, any good introductory calculus book will include supplementary material related to subjects such as the Taylor series expansion. Texts by Swokowski (1979), Thomas and Finney (1979), and Simmons (1985) provide very readable discussions of these subjects. In addition, Taylor (1982) presents a nice introduction to error analysis.

' ~ o o k s are referenced only by author here; a complete bibliography will be found at the back of this text.

PT1.6 ADVANCED METHODS AND ADDITIONAL REFERENCES 1 09

TABLE M1.2 Summary of important information presented in Part One.

The Structure Principle The static structure of the program should correspond in a simple way to the dynamic structure of the corresponding computation.

Error Definitions True error

True percent relative error

Approximate percent relative error

El = true value - approximation

el = true value - approximation 100% true value

eo = present approximation - previous approximation 100% present approximation

Stopping cr~terion Terminate computation when

€0 < es where e, is the desired percent relat~ve error

Taylor Series Taylor series expansion

Remainder

where

fin+ I ' ([I /,"+I R" = ----- /n + I)!

- - ~

Numerical Differentiation First forward finite divided difference f'(x,l =

f h + 1 ) - f(x,l h + O(hl

(Other divided differences are summarized in Chaps. 3 and 17.1

Error Propagation For n independent variables xi, x2, ..., x, having errors A i l , Ax*, .., A?,,, the error in the function f can be estimated via

A 1 - 1 A + $ 1 A + + I $ 1 Ai,,

Finally, although we hope that our book serves you well, it is always good to consult other sources when trying to master a new subject. Burden and Faires (1993); Ralston and Rabinowitz (1978); Hoffman (1992); and Carnahan, Luther, and Wilkes (1969) provide comprehensive discussions of most numerical methods, including some advanced methods that are beyond our scope. Other enjoyable books on the subject are Gerald and Wheatley (1989); Rice (1983); and Cheney and Kincaid (1985). In addition, Press et al. (1992) include computer codes to implement a variety of methods.

Chapra & Canale 1998, Numerical methods for engineers

Documents

Transcript of Chapra & Canale 1998, Numerical methods for engineers