DISCOVERING COMPUTER SCIENCE · discovercs.denison.edu/toc_preface.pdf · 2015-08-12
Chapman & Hall/CRC
TEXTBOOKS IN COMPUTING
Jessen Havill
Denison University
Granville, Ohio, USA
DISCOVERING COMPUTER SCIENCE
Interdisciplinary Problems, Principles, and Python Programming
Contents
Preface xv
Acknowledgments xxiii
About the author xxv
Chapter 1 ■ What is computation? 1
1.1 PROBLEMS AND ABSTRACTION 2
1.2 ALGORITHMS AND PROGRAMS 4
1.3 EFFICIENT ALGORITHMS 11
Organizing a phone tree 11
A smoothing algorithm 13
A better smoothing algorithm 17
1.4 COMPUTERS ARE DUMB 20
Inside a computer 20
Machine language 21
Everything is bits 22
The universal machine 26
1.5 SUMMARY 29
1.6 FURTHER DISCOVERY 30
Chapter 2 ■ Elementary computations 31
2.1 WELCOME TO THE CIRCUS 31
2.2 ARITHMETIC 32
Finite precision 34
Division 34
Order of operations 35
Complex numbers 37
2.3 WHAT’S IN A NAME? 38
2.4 USING FUNCTIONS 45
Built-in functions 45
Strings 47
Modules 51
*2.5 BINARY ARITHMETIC 54
Finite precision 55
Negative integers 56
Designing an adder 57
Implementing an adder 58
2.6 SUMMARY 62
2.7 FURTHER DISCOVERY 62
Chapter 3 ■ Visualizing abstraction 65
3.1 DATA ABSTRACTION 66
3.2 VISUALIZATION WITH TURTLES 70
Drawing with iteration 72
3.3 FUNCTIONAL ABSTRACTION 76
Function parameters 78
Let’s plant a garden 84
3.4 PROGRAMMING IN STYLE 89
Program structure 89
Documentation 91
Descriptive names and magic numbers 95
3.5 A RETURN TO FUNCTIONS 97
Return vs. print 100
3.6 SCOPE AND NAMESPACES 103
Local namespaces 104
The global namespace 107
3.7 SUMMARY 111
3.8 FURTHER DISCOVERY 112
Chapter 4 ■ Growth and decay 113
4.1 DISCRETE MODELS 114
Managing a fishing pond 114
Measuring network value 121
Organizing a concert 124
4.2 VISUALIZING POPULATION CHANGES 136
4.3 CONDITIONAL ITERATION 140
*4.4 CONTINUOUS MODELS 145
Difference equations 145
Radiocarbon dating 148
Tradeoffs between accuracy and time 150
Propagation of errors 152
Simulating an epidemic 153
*4.5 NUMERICAL ANALYSIS 159
The harmonic series 159
Approximating π 162
Approximating square roots 164
4.6 SUMMING UP 167
4.7 FURTHER DISCOVERY 171
4.8 PROJECTS 171
Project 4.1 Parasitic relationships 171
Project 4.2 Financial calculators 173
*Project 4.3 Market penetration 177
*Project 4.4 Wolves and moose 180
Chapter 5 ■ Forks in the road 185
5.1 RANDOM WALKS 185
A random walk in Monte Carlo 192
Histograms 195
*5.2 PSEUDORANDOM NUMBER GENERATORS 200
Implementation 201
Testing randomness 203
*5.3 SIMULATING PROBABILITY DISTRIBUTIONS 205
The central limit theorem 206
5.4 BACK TO BOOLEANS 209
Short circuit evaluation 212
Complex expressions 214
*Using truth tables 216
Many happy returns 218
5.5 A GUESSING GAME 224
5.6 SUMMARY 233
5.7 FURTHER DISCOVERY 234
5.8 PROJECTS 234
Project 5.1 The magic of polling 234
Project 5.2 Escape! 237
Chapter 6 ■ Text, documents, and DNA 241
6.1 COUNTING WORDS 242
6.2 TEXT DOCUMENTS 250
Reading from text files 250
Writing to text files 253
Reading from the web 254
6.3 ENCODING STRINGS 259
Indexing and slicing 259
Creating modified strings 261
Encoding characters 263
6.4 LINEAR-TIME ALGORITHMS 270
Asymptotic time complexity 274
6.5 ANALYZING TEXT 279
Counting and searching 279
A concordance 284
6.6 COMPARING TEXTS 289
*6.7 GENOMICS 297
A genomics primer 297
Basic DNA analysis 301
Transforming sequences 302
Comparing sequences 304
Reading sequence files 306
6.8 SUMMARY 312
6.9 FURTHER DISCOVERY 313
6.10 PROJECTS 313
Project 6.1 Polarized politics 313
*Project 6.2 Finding genes 316
Chapter 7 ■ Designing programs 321
7.1 HOW TO SOLVE IT 322
Understand the problem 323
Design an algorithm 324
Implement your algorithm as a program 327
Analyze your program for clarity, correctness, and efficiency 330
*7.2 DESIGN BY CONTRACT 331
Preconditions and postconditions 331
Checking parameters 332
Assertions 334
*7.3 TESTING 340
Unit testing 340
Regression testing 342
Designing unit tests 343
Testing floating point values 347
7.4 SUMMARY 350
7.5 FURTHER DISCOVERY 350
Chapter 8 ■ Data analysis 351
8.1 SUMMARIZING DATA 351
8.2 CREATING AND MODIFYING LISTS 360
List accumulators, redux 360
Lists are mutable 361
Tuples 365
List operators and methods 366
*List comprehensions 368
8.3 FREQUENCIES, MODES, AND HISTOGRAMS 373
Tallying values 373
Dictionaries 374
8.4 READING TABULAR DATA 384
*8.5 DESIGNING EFFICIENT ALGORITHMS 390
A first algorithm 391
A more elegant algorithm 399
A more efficient algorithm 400
*8.6 LINEAR REGRESSION 403
*8.7 DATA CLUSTERING 409
Defining similarity 410
A k-means clustering example 411
Implementing k-means clustering 414
Locating bicycle safety programs 416
8.8 SUMMARY 421
8.9 FURTHER DISCOVERY 421
8.10 PROJECTS 422
Project 8.1 Climate change 422
Project 8.2 Does education influence unemployment? 425
Project 8.3 Maximizing profit 427
Project 8.4 Admissions 428
*Project 8.5 Preparing for a 100-year flood 430
Project 8.6 Voting methods 435
Project 8.7 Heuristics for traveling salespeople 438
Chapter 9 ■ Flatland 443
9.1 TWO-DIMENSIONAL DATA 443
9.2 THE GAME OF LIFE 449
Creating a grid 451
Initial configurations 452
Surveying the neighborhood 453
Performing one pass 454
Updating the grid 457
9.3 DIGITAL IMAGES 461
Colors 461
Image filters 463
Transforming images 467
9.4 SUMMARY 471
9.5 FURTHER DISCOVERY 471
9.6 PROJECTS 471
Project 9.1 Modeling segregation 471
Project 9.2 Modeling ferromagnetism 473
Project 9.3 Growing dendrites 474
Chapter 10 ■ Self-similarity and recursion 477
10.1 FRACTALS 477
A fractal tree 479
A fractal snowflake 481
10.2 RECURSION AND ITERATION 488
Solving a problem recursively 491
Palindromes 492
Guessing passwords 495
10.3 THE MYTHICAL TOWER OF HANOI 500
*Is the end of the world nigh? 502
10.4 RECURSIVE LINEAR SEARCH 503
Efficiency of recursive linear search 505
10.5 DIVIDE AND CONQUER 508
Buy low, sell high 508
Navigating a maze 512
*10.6 LINDENMAYER SYSTEMS 518
Formal grammars 518
Implementing L-systems 522
10.7 SUMMARY 525
10.8 FURTHER DISCOVERY 526
10.9 PROJECTS 526
*Project 10.1 Lindenmayer’s beautiful plants 526
Project 10.2 Gerrymandering 531
Project 10.3 Percolation 536
Chapter 11 ■ Organizing data 541
11.1 BINARY SEARCH 542
Efficiency of iterative binary search 546
A spelling checker 548
Recursive binary search 549
Efficiency of recursive binary search 550
11.2 SELECTION SORT 553
Implementing selection sort 553
Efficiency of selection sort 557
Querying data 558
11.3 INSERTION SORT 563
Implementing insertion sort 564
Efficiency of insertion sort 566
11.4 EFFICIENT SORTING 570
Internal vs. external sorting 574
Efficiency of merge sort 574
*11.5 TRACTABLE AND INTRACTABLE ALGORITHMS 577
Hard problems 579
11.6 SUMMARY 580
11.7 FURTHER DISCOVERY 581
11.8 PROJECTS 581
Project 11.1 Creating a searchable database 581
Project 11.2 Binary search trees 581
Chapter 12 ■ Networks 587
12.1 MODELING WITH GRAPHS 588
Making friends 590
12.2 SHORTEST PATHS 594
Finding the actual paths 598
12.3 IT’S A SMALL WORLD. . . 601
Clustering coefficients 603
Scale-free networks 605
12.4 RANDOM GRAPHS 608
12.5 SUMMARY 611
12.6 FURTHER DISCOVERY 611
12.7 PROJECTS 612
Project 12.1 Diffusion of ideas and influence 612
Project 12.2 Slowing an epidemic 614
Project 12.3 The Oracle of Bacon 616
Chapter 13 ■ Abstract data types 621
13.1 DESIGNING CLASSES 622
Implementing a class 625
Documenting a class 632
13.2 OPERATORS AND SPECIAL METHODS 637
String representations 637
Arithmetic 638
Comparison 640
Indexing 642
13.3 MODULES 645
Namespaces, redux 646
13.4 A FLOCKING SIMULATION 648
The World ADT 649
The Boid ADT 655
13.5 A STACK ADT 665
13.6 A DICTIONARY ADT 671
Hash tables 672
Implementing a hash table 673
Implementing indexing 676
ADTs vs. data structures 678
13.7 SUMMARY 682
13.8 FURTHER DISCOVERY 682
13.9 PROJECTS 683
Project 13.1 Tracking GPS coordinates 683
Project 13.2 Economic mobility 687
Project 13.3 Slime mold aggregation 690
Project 13.4 Boids in space 692
Appendix A ■ Installing Python 697
A.1 AN INTEGRATED DISTRIBUTION 697
A.2 MANUAL INSTALLATION 697
Appendix B ■ Python library reference 701
B.1 MATH MODULE 701
B.2 TURTLE METHODS 702
B.3 SCREEN METHODS 703
B.4 MATPLOTLIB.PYPLOT MODULE 704
B.5 RANDOM MODULE 704
B.6 STRING METHODS 705
B.7 LIST METHODS 706
B.8 IMAGE MODULE 706
B.9 SPECIAL METHODS 707
Bibliography 709
Index 713
Preface
In my view, an introductory computer science course should strive to accomplish three things. First, it should demonstrate to students how computing has become a powerful mode of inquiry, and a vehicle of discovery, in a wide variety of disciplines. This orientation is also inviting to students of the natural and social sciences, who increasingly benefit from an introduction to computational thinking, beyond the limited “black box” recipes often found in manuals. Second, the course should engage students in computational problem solving, and lead them to discover the power of abstraction, efficiency, and data organization in the design of their solutions. Third, the course should teach students how to implement their solutions as computer programs. In learning how to program, students more deeply learn the core principles, and experience the thrill of seeing their solutions come to life.
Unlike most introductory computer science textbooks, which are organized around programming language constructs, I deliberately lead with interdisciplinary problems and techniques. This orientation is more interesting to a more diverse audience, and more accurately reflects the role of programming in problem solving and discovery. A computational discovery does not, of course, originate in a programming language feature in search of an application. Rather, it starts with a compelling problem which is modeled and solved algorithmically, by leveraging abstraction and prior experience with similar problems. Only then is the solution implemented as a program.
Like most introductory computer science textbooks, I introduce programming skills in an incremental fashion, and include many opportunities for students to practice them. The topics in this book are arranged to ease students into computational thinking, and encourage them to incrementally build on prior knowledge. Each chapter focuses on a general class of problems that is tackled by new algorithmic techniques and programming language features. My hope is that students will leave the course, not only with strong programming skills, but with a set of problem solving strategies and simulation techniques that they can apply in their future work, whether or not they take another computer science course.
I use Python to introduce computer programming for two reasons. First, Python’s intuitive syntax allows students to focus on interesting problems and powerful principles, without unnecessary distractions. Learning how to think algorithmically is hard enough without also having to struggle with a non-intuitive syntax. Second, the expressiveness of Python (in particular, low-overhead lists and dictionaries) expands tremendously the range of accessible problems in the introductory course. Teaching with Python over the last ten years has been a revelation; introductory computer science has become fun again.
Web resources
The text, exercises, and projects often refer to files on the book’s accompanying web site, which can be found at
http://discoverCS.denison.edu.
This web site also includes pointers for further exploration, links to additional documentation, and errata.
To students
Learning how to solve computational problems and implement them as computer programs requires daily practice. Like an athlete, you will get out of shape and fall behind quickly if you skip it. There are no shortcuts. Your instructor is there to help, but he or she cannot do the work for you.
With this in mind, it is important that you type in and try the examples throughout the text, and then go beyond them. Be curious! There are numbered “Reflection” questions throughout the book that ask you to stop and think about, or apply, something that you just read. Often, the question is answered in the book immediately thereafter, so that you can check your understanding, but peeking ahead will rob you of an important opportunity.
There are many opportunities to delve into topics more deeply. Boxes scattered throughout the text briefly introduce related, but more technical, topics. For the most part, these are not strictly required to understand what comes next, but I encourage you to read them anyway. In the “Further discovery” section of each chapter, you can find additional pointers to explore chapter topics in more depth.
At the end of most sections are several programming exercises that ask you to further apply concepts from that section. Often, the exercises assume that you have already worked through all of the examples in that section. All later chapters conclude with a selection of more involved interdisciplinary projects that you may be asked by your instructor to tackle.
The book assumes no prior knowledge of computer science. However, it does assume a modest comfort with high school algebra and mathematical functions. Occasionally, trigonometry is mentioned, as is the idea of convergence to a limit, but these are not crucial to an understanding of the main topics in this book.
To instructors
This book may be appropriate for a traditional CS1 course for majors, a CS0 course for non-majors (at a slower pace and omitting more material), or an introductory computing course for students in the natural and/or social sciences.
As suggested above, I emphasize computer science principles and the role of abstraction, both functional and data, throughout the book. I motivate functions as implementations of functional abstractions, and point out that strings, lists, and dictionaries are all abstract data types that allow us to solve more interesting problems than would otherwise be possible. I introduce the idea of time complexity intuitively, without formal definitions, in the first chapter and return to it several times as more sophisticated algorithms are developed. The book uses a spiral approach for many topics, returning to them repeatedly in increasingly complex contexts. Where appropriate, I also weave into the book topics that are traditionally left for later computer science courses. A few of these are presented in boxes that may be covered at your discretion. None of these topics is introduced rigorously, as they would be in a data structures course. Rather, I introduce them informally and intuitively to give students a sense of the problems and techniques used in computer science. I hope that the tables below will help you navigate the book, and see where particular topics are covered.
Figure 1 An overview of chapter dependencies.
This book contains over 600 end-of-section exercises and over 300 in-text reflection questions that may be assigned as homework or discussed in class. At the end of most chapters is a selection of projects (about 30 in all) that students may work on independently or in pairs over a longer time frame. I believe that projects like these are crucial for students to develop both problem solving skills and an appreciation for the many fascinating applications of computer science.
Because this book is intended for a student who may take additional courses in computer science and learn other programming languages, I intentionally omit some features of Python that are not commonly found elsewhere (e.g., simultaneous swap, chained comparisons, enumerate in for loops). You may, of course, supplement with these additional syntactical features.
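For instructors who do wish to supplement, the following minimal sketch contrasts each of the three omitted features with a more portable form that carries over to other languages. The variable names and values are illustrative assumptions only and are not taken from the book's examples.

```python
# Simultaneous swap versus the three-step swap with a temporary variable.
a, b = 1, 2
a, b = b, a                # Python-specific simultaneous swap

x, y = 1, 2
temp = x                   # portable form found in most languages
x = y
y = temp

# Chained comparison versus an explicit conjunction of two comparisons.
score = 85                 # hypothetical value for illustration
chained = 80 <= score < 90
explicit = score >= 80 and score < 90

# enumerate in a for loop versus iterating over indices with range.
names = ['Ada', 'Alan', 'Grace']   # hypothetical data for illustration
pairs_enumerate = []
for index, name in enumerate(names):
    pairs_enumerate.append((index, name))

pairs_range = []
for index in range(len(names)):
    pairs_range.append((index, names[index]))
```

In each pair the two forms produce the same result, so either can be used in class without changing the underlying algorithm.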
There is more in this book than can be covered in a single semester, giving an instructor the opportunity to tailor the content to his or her particular situation and interests. Generally speaking, as illustrated in Figure 1, Chapters 1–6 and 8 form the core of the book, and should be covered sequentially. The remaining chapters can be covered, partially or entirely, at your discretion, although I would expect that most instructors will cover at least parts of Chapters 7, 10, 11, and 13. Chapter 7 contains additional material on program design, including design by contract, assertions, and unit testing, that may be skipped without consequences for later chapters. Chapters 9–13 are, more or less, independent of each other. Sections marked with an asterisk are optional, in the sense that they are not assumed for future sections in that chapter. When projects depend on optional sections, they are also marked with an asterisk, and the dependency is stated at the beginning of the project.
Chapter outlines
The following tables provide brief overviews of each chapter. Each table’s three columns, reflecting the three parts of the book’s subtitle, provide three lenses through which to view the chapter. The first column lists a selection of representative problems that are used to motivate the material. The second column lists computer science principles that are introduced in that chapter. Finally, the third column lists Python programming topics that are either introduced or reinforced in that chapter to implement the principles and/or solve the problems.
Chapter 1. What is computation?
Sample problems:
• digital music
• search engines
• GPS devices
• smoothing data
• phone trees
Principles:
• problems, input/output
• abstraction
• algorithms and programs
• computer architecture
• binary representations
• time complexity
• Turing machine
Programming:
(none)
Chapter 2. Elementary computations
Sample problems:
• wind chill
• geometry
• compounding interest
• Mad Libs
Principles:
• finite precision
• names as references
• using functional abstractions
• binary addition
Programming:
• int and float numeric types
• arithmetic and the math module
• variable names and assignment
• calling built-in functions
• using strings, + and * operators
• print and input
Chapter 3. Visualizing abstraction
Sample problems:
• visualizing an archaeological dig
• random walks
• ideal gas
• groundwater flow
• demand functions
Principles:
• using abstract data types
• creating functional abstractions
• basic functional decomposition
Programming:
• using classes and objects
• turtle module
• basic for loops
• writing functions
• namespaces
• docstrings and comments
Chapter 4. Growth and decay
Sample problems:
• network value
• demand and profit
• loans and investing
• bacterial growth
• radiocarbon dating
• diffusion models
  – SIR, SIS, Bass
• competition models
  – Nicholson-Bailey
  – Lotka-Volterra
  – indirect
Principles:
• accumulators
• list accumulators
• difference equations
• approximating continuous models
• accuracy vs. time
• error propagation
• numerical approximation
• classes of growth
Programming:
• for loops
• format strings
• range
• matplotlib
• appending to lists
• while loops
Chapter 5. Forks in the road
Sample problems:
• random walks
• guessing games
• polling and sampling
• particle escape
Principles:
• Monte Carlo simulation
• pseudorandom number generators
• simulating probabilities
• flag variables
• using uniform and normal distributions
• DeMorgan’s laws
Programming:
• random module
• if/elif/else
• comparison operators
• Boolean operators
• matplotlib histograms
• while loops
Chapter 6. Text, documents, and DNA
Sample problems:
• word count
• textual analysis
• parsing XML
• checksums
• concordances
• detecting plagiarism
• congressional votes
• genomics
Principles:
• ASCII, Unicode
• linear-time algorithms
• asymptotic time complexity
• linear search
• dot plots
• string accumulators
Programming:
• str class and methods
• iterating over strings
• indexing and slices
• iterating over indices
• reading and writing text files
• nested loops
Chapter 7. Designing programs
Sample problems:
• word frequency analysis
Principles:
• problem solving
• top-down design
• pre and postconditions
• assertions
• unit testing
Programming:
• assert statement
• conditional execution of main
• writing modules
Chapter 8. Data analysis
Sample problems:
• 100-year floods
• traveling salesman
• Mohs scale
• meteorite sites
• zebra migration
• tumor diagnosis
• education levels
• supply and demand
• voting methods
Principles:
• histograms
• hash tables
• tabular data files
• efficient algorithms
• linear regression
• k-means clustering
• heuristics
Programming:
• list class
• iterating over lists
• indexing and slicing
• list operators and methods
• lists in memory; mutability
• list parameters
• tuples
• list comprehensions
• dictionaries
Chapter 9. Flatland
Sample problems:
• earthquake data
• Game of Life
• image filters
• racial segregation
• ferromagnetism
• dendrites
Principles:
• 2-D data
• cellular automata
• digital images
• color models
Programming:
• 2-D data in list of lists
• nested loops
• 2-D data in a dictionary
Chapter 10. Self-similarity and recursion
Sample problems:
• fractals
• cracking passwords
• Tower of Hanoi
• maximizing profit
• path through a maze
• Lindenmayer system
• electoral districting
• percolation
Principles:
• self-similarity
• recursion
• linear search
• recurrence relations
• divide and conquer
• depth-first search
• grammars
Programming:
• writing recursive functions
Chapter 11. Organizing data
Sample problems:
• spell check
• querying data sets
Principles:
• binary search
• recurrence relations
• basic sorting algorithms
• quadratic-time algorithms
• parallel lists
• merge sort
• intractability
• P=NP (intuition)
• Moore’s law
• binary search trees
Programming:
• nested loops
• writing recursive functions
Chapter 12. Networks
Sample problems:
• Facebook, Twitter, web graphs
• diffusion of ideas
• epidemics
• Oracle of Bacon
Principles:
• graphs
• adjacency list
• adjacency matrix
• breadth-first search
• distance and shortest paths
• depth-first search
• small-world networks
• scale-free networks
• clustering coefficient
• uniform random graphs
Programming:
• dictionaries
Chapter 13. Abstract data types
Sample problems:
• data sets
• genomic sequences
• rational numbers
• flocking behavior
• slime mold aggregation
Principles:
• abstract data types
• data structures
• stacks
• hash tables
• agent-based simulation
• swarm intelligence
Programming:
• writing classes
• special methods
• overriding operators
• modules
Software assumptions
To follow along in this book and complete the exercises, you will need to have installed Python 3.4 (or later) on your computer, and have access to IDLE or another programming environment. The book also assumes that you have installed the matplotlib and numpy modules. Please refer to Appendix A for more information.
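One way to confirm these assumptions before starting is a short check like the sketch below. It is not part of the book or its web site; the function name check_environment and its parameters are hypothetical, and it relies only on the standard sys and importlib modules.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 4), modules=('matplotlib', 'numpy')):
    """Return a dictionary reporting whether each software assumption is met.

    The function name and its default arguments are illustrative assumptions.
    """
    report = {'python': sys.version_info[:2] >= min_version}
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == '__main__':
    for requirement, satisfied in check_environment().items():
        print(requirement, 'OK' if satisfied else 'MISSING')
```

If any line reports MISSING, Appendix A is the place to look for installation instructions.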
Errata
While I (and my students) have ferreted out many errors, readers will inevitably find more. You can find an up-to-date list of errata on the book web site. If you find an error in the text or have another suggestion, please let me know [email protected].