Introduction to Multimedia and the Internet · ComputingScience165-3 † StudyGuide Introduction to...

Computing Science 165-3 Study Guide Introduction to Multimedia and the Internet by Greg Baker Faculty of Applied Sciences Centre for Distance Education Continuing Studies © Simon Fraser University, Summer 2004


Computing Science 165-3 • Study Guide

Introduction to Multimedia and the Internet

by

Greg Baker

Faculty of Applied Sciences
Centre for Distance Education • Continuing Studies

© Simon Fraser University, Summer 2004


Contents

I Introduction

Course Introduction
    Learning Resources
    Requirements
    Evaluation
    Schedule
    Getting through CMPT 165
    About the Author

II The Web and Web Pages

1 The World Wide Web
    1.1 Basics of the Internet
    1.2 Protocols
    1.3 How Web Pages Travel
    1.4 MIME Types
    1.5 Fetching a Web Page
    Summary

2 Markup and HTML
    2.1 Describing Documents
    2.2 HTML Basics
    2.3 More HTML
    2.4 Links in HTML
    2.5 Images in HTML
    2.6 More HTML
    2.7 Validating HTML
    Summary

3 Text and Graphics
    3.1 How Computers Store Data
    3.2 Text and Character Sets
    3.3 Graphics and Image Types
    3.4 Bitmap vs. Vector Images
    3.5 File Formats
    3.6 File Formats, Common
    Summary

4 Cascading Style Sheets
    4.1 CSS
    4.2 Classes and IDs
    4.3 CSS Properties
    4.4 Specifying Colours
    4.5 CSS example
    4.6 Logical versus Physical
    4.7 Why Logical HTML and CSS?
    Summary

5 Design
    5.1 General Design
    5.2 Design Principles and HTML/CSS
    5.3 Web Page Design
    5.4 Usability
    5.5 Web Site Design
    Summary

6 XML
    6.1 What is XML?
    6.2 Some XML Languages
    6.3 Styling XML
    6.4 Validating XML
    6.5 XHTML and HTML
    Summary

III Internet Programming

7 Programming Introduction
    7.1 What is Programming?
    7.2 Starting with Python
    7.3 Example Program
    7.4 Expressions and Variables
    7.5 User Input
    7.6 Types
    7.7 Conditionals
    7.8 Python Libraries
    7.9 Debugging
    Summary

8 Web Programming
    8.1 Making Web Pages with Python
    8.2 HTML Forms
    8.3 CGI
    8.4 Debugging CGI Scripts
    Summary

9 More Programming
    9.1 Functions
    9.2 Local Variables
    9.3 Iteration
    9.4 Lists
    9.5 Handling Errors
    9.6 Solving Problems 1: Making Change
    9.7 Solving Problems 2: Displaying HTML Source
    9.8 Coding Style
    Summary

10 More Web Programming
    10.1 Security Basics
    10.2 Cookies
    10.3 Performance
    Summary

11 Internet Internals
    11.1 HTTP
    11.2 HTTP Tricks
    11.3 DNS
    11.4 Security and Encryption
    11.5 URLs
    Summary

IV Appendices

A Technical Instructions
    A.1 Installing Software
    A.2 SFU Computing Account
    A.3 CMPT 165 Server Account
    A.4 Creating Web Pages
    A.5 Transferring Web Pages

B Software
    B.1 Mozilla
    B.2 TextPad
    B.3 The GIMP
    B.4 Python
    B.5 Secure File Transfer
    B.6 Validators


List of Figures

1 CMPT 165 course schedule

1.1 How information might get from the SFU web server to a home computer
1.2 The conversation that a web client and web server might have when you view a web page
1.3 The parts of a simple URL
1.4 Some example MIME types

2.1 An XHTML document
2.2 The display of Figure 2.1 in a browser
2.3 HTML terms to output “garcon”
2.4 A sample page from the XHTML reference
2.5 A hyperlink on a web page
2.6 URLs from http://www.sfu.ca/~somebody/pics/index.html
2.7 Some non-valid XHTML
2.8 Part of the validation results of Figure 2.7

3.1 Prefixes for storage units
3.2 Scaling (a) a vector image and (b) a bitmapped image
3.3 Colour dithering
3.4 An image with a low-quality lossy compression
3.5 Various types of transparency in images

4.1 A cascading style sheet
4.2 HTML source of a page that references a style sheet
4.3 The style sheet, style.css, referenced by Figure 4.2
4.4 The display of Figure 4.2, with the style sheet
4.5 The display of Figure 4.2, without the style sheet
4.6 Two possible appearances for contents of <del> or <strike>

5.1 A design illustrating proximity
5.2 A design illustrating alignment
5.3 A design illustrating repetition
5.4 A design illustrating contrast

6.1 A recipe marked up in XML
6.2 A sample SVG file that represents a happy face
6.3 The display of Figure 6.2
6.4 Part of the style for Figure 6.1
6.5 Part of the display of Figure 6.1 after the application of CSS from Figure 6.4

7.1 An example Python program
7.2 Three sample executions of the program in Figure 7.1
7.3 A Python program that gets input from the user
7.4 A program with type conversion
7.5 A program with an if block
7.6 Guessing game: testing random
7.7 Guessing game: testing user input
7.8 Guessing game: checking their guess
7.9 Guessing game: checking greater or less

8.1 A program that generates a simple web page
8.2 A web script that does a little more
8.3 Sample output from Figure 8.2
8.4 Sample display of Figure 8.2 in a browser
8.5 Figure 8.1 rewritten with triple-quoted strings
8.6 The body of an HTML file with a form
8.7 The display of Figure 8.6 in a browser
8.8 An HTML form for simple user input
8.9 A web script that uses the input from Figure 8.8

9.1 A program with a function defined
9.2 A program that takes advantage of local variables
9.3 Using a for loop
9.4 Using a while loop
9.5 A version of the guessing game using a while loop
9.6 A version of the guessing game using a for loop
9.7 Using a list to store some integers
9.8 Catching an exception
9.9 Asking until we get correct input
9.10 Making change: first attempt
9.11 Making change: fixed multiple coins
9.12 Making change: details filled in
9.13 Displaying source: basic CGI skeleton
9.14 Displaying source: using urllib
9.15 Displaying source: inserting entities
9.16 Sample display of Figure 9.15 in a browser
9.17 Displaying source: fixed entity replacement

B.1 The main toolbox in the GIMP
B.2 The GIMP’s right-click menu with the option for saving the current file selected
B.3 The main tool buttons in the GIMP


Part I

Introduction


Course Introduction

Welcome to CMPT 165, Multimedia and the Internet. This course is an introduction to the Internet and WWW for non-computer science majors. It isn’t intended to teach you how to use your computer, starting with “Turn it on.” You are expected to have a basic knowledge of how to use your computer. You should be comfortable using it for simple tasks such as running programs, finding and opening files, and so forth. You should also be able to use the Internet.

Here are some of the goals set out for students in this course. By the end of the course, you should be able to

• Explain some of the underlying technologies of multimedia and the Internet.

• Create well-designed web sites using modern web technologies that can be viewed in any web browser.

• Design computer programs to complete a specific task.

• Use your programming skills to create dynamically generated web sites.

• Begin to combine these skills to develop full-featured web sites.

Keep these goals in mind as you progress through the course.

Learning Resources

Study Guide

The Study Guide is intended to guide you through this course. It will help you learn the content and determine what information to focus on in the texts.


The readings for each section are listed at the beginning of the units. You should also look at the key terms listed at the end of each unit.

In some places, there are references to other sections. A reference to “Topic 3.4,” for example, means Topic 4 in Unit 3.

In the Study Guide, side-notes are indicated with this symbol. They are meant to replace the asides that usually happen in lectures when students ask questions. They aren’t strictly part of the “course.”

Required Texts

The only required text for this course is How to Think Like a Computer Scientist: Learning With Python. It will be included in your course package and will arrive with your study guide. This book is a nice introduction to Python programming, which we will use in the second half of the course.

The title of the book is a little misleading. The book does not discuss computer science; it only covers computer programming.

Recommended Texts

There are several other texts recommended for this course. Whether or not you buy or read these is up to you; you certainly aren’t expected to buy all of them. They are all available in the reserve section of the Library, so you can look at them there if you’re not sure whether you want to buy them.

The recommended texts provide more information than we can put in the Study Guide. If you are particularly interested in specific topics, you might consider buying the corresponding recommended texts.

The Internet Book by Douglas E. Comer provides more background on the internals of the Internet, which are explored in Units 1 and 11.

Jeffrey Zeldman’s Designing with Web Standards covers design of web pages with XHTML and CSS, which are discussed in Unit 2 and Unit 4. This is an excellent book for those who are really interested in web page design. It is one of the few (or at least one of the first) books that discuss web page design with techniques made possible by newer web browsers.

The Non-Designer’s Design Book by Robin Williams (no, not that Robin Williams) is recommended for the material in Unit 5, which is also relevant to Assignment 2. Either the first or second edition of this book is acceptable.


If you aren’t near campus, you can use the library’s Telebook service. Telebook services will mail you books, journal articles, and other library material required to complete course assignments. For more information about how to use Telebook, see http://www.lib.sfu.ca/kiosk/other/telebook.htm. Also, see the Telebook material in the red folder.

Web Materials

The address for the web materials for this course can be found in your Red Folder. To access some parts of the site, you will need to log in to WebCT using your SFU Unix ID. Use the same username and password you use for Webmail and my.sfu.ca. Open University students should check their course package for instructions on how to get a WebCT account.

If you feel that you need more information about WebCT, look at the material in your course package. In addition, the SFU Library offers introductory sessions in using WebCT that are available to students in any course that uses WebCT.

The course web site has several main sections:

• Administrative Information

In this section, you will find information about the course, the course supervisor, and your tutor markers. You will also find information on due dates and grading.

• References

Here, you will find information about the various references for this course, including links to the online references that you are expected to use for the course.

• Tools

The “Tools” section contains links to all of the software tools you will need for the course. It lists software that you can download and install on your computer as well as tools that you will use online. More details on the required software can be found in Appendices A and B.

Inside “Tools,” you will find a link to “Discussions,” where you can discuss the course topics with other students. The course supervisor and TMs do not generally respond to discussions; they are intended for student-to-student communication. See “Getting Help,” below, for more information.


• Materials

This section contains supplementary material directly related to the course content. You can download all of the code that is used in this guide if you’d like to try it for yourself or modify it.

You will also find an archive of the course email list and some sample exams.

• Links

The links are organized into categories corresponding to units in this guide. You can use them to supplement the material the guide and other readings present. If you have suggestions for additions to this list, feel free to email the course supervisor.

• Assignments

Here, you will find all of the assignments and exercises for this course.

Online References

There are several online references that are as important as the texts. You can find links to them on the course web site section titled “References.”

These resources are very important to your success in this course. They aren’t meant to be read from beginning to end like the readings in the textbook. You should use them to get an overall picture of the topic and as references as you do the assignments.

The Red Folder

The red folder in your Centre for Distance Education package contains the assignment deadlines and exam dates for the course. It also contains important information from the Centre for Distance Education, in particular, the Student Handbook, which includes information about how to set up your SFU computing account.

Email Communications

The TMs and course supervisor will use email to send announcements and tips during the semester. You should read your SFU email account regularly.


You can also contact the TMs and course supervisor by email if you need help. See the “Getting Help” section at the end of the Introduction for details.

CMPT 165 Web Server

Throughout the course, you will be creating web pages. You have web space available on a web server set up specifically for this course. Appendix A contains information on working with this server.

You must use this server for all web pages in this course. Work on other web servers will not be marked.

Open University students will not have accounts automatically set up on this server. They must immediately email the course supervisor, giving their name and email address so their account can be activated. Failure to do so will not be accepted as an excuse for late exercises or assignments.

Requirements

Computer Requirements

You need to have access to a computer with the minimum requirements noted on the back of the course outline.

You will also need a connection to the Internet through a dialup connection (one is provided with your SFU registration) or through another type of connection like a cable modem or a DSL line. A high-speed connection will make your life easier, but it isn’t required.

Software Requirements

All of the software that you need for this course can be downloaded free. Links to obtain the course software are on the course web site under “Tools.” Appendix A contains instructions for installing the software. Appendix B contains other instructions you need for assignments.

• A web browser: You will be creating web pages in this course. To see them and the other pages you need to read, you need to have a web browser installed on your computer. We strongly recommend that you use Mozilla or Netscape 7. These browsers are the only ones that have near-perfect support for the style sheets we will be using.

Netscape 7 and Mozilla are actually the same browser with some minor differences. Mozilla has been released by a group of developers for free use. The same program was modified slightly and released by Netscape. In this course, we recommend Mozilla because it has fewer superfluous features.

Internet Explorer and earlier versions of Netscape don’t support style sheets as well as the recommended browsers. Since we use style sheets a lot, you will find those browsers harder to work with.

• A text editor: You will be making your web pages and style sheets with a text editor. For Windows, we recommend TextPad; for a Macintosh, BBEdit. You could get by with Notepad or Simpletext, but a better editor will make your life easier.

• A graphics program: To create graphics for your web pages, you will use a graphics editor. You should use GIMP for Windows or GraphicConverter for a Mac. You won’t be able to use Windows Paint. If you already have a program like Photoshop or PhotoPaint, you can use it, but the TMs will not have access to these programs, so they won’t be able to help with any software problems.

• Python: The second half of the course will involve programming with the Python programming language. Python is a good language to work with when you’re learning to program, and it can be downloaded free.

• Other software: There is some other software that you may need to transfer files and install certain software. See Appendix A for instructions and descriptions.

Evaluation

Marking

Your mark in this course will be based on your exercises, assignments, and two exams. The marks will be weighted as follows:

Exercises       10%
Assignments     20%
Midterm exam    20%
Final exam      50%

Each of the exercises and assignments will be weighted equally. You must attain an overall passing grade on the weighted average of the exams in order to pass the course.
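To see how these weights combine, here is a minimal Python sketch. The weights are the ones listed above; the function names, the sample scores, and the 50 percent pass mark are assumptions made for illustration, since this guide does not state the actual passing threshold.

```python
# Weights from the marking scheme (percent of the final mark).
WEIGHTS = {"exercises": 10, "assignments": 20, "midterm": 20, "final": 50}

# Hypothetical pass mark: the guide does not give the real threshold.
ASSUMED_PASS_MARK = 50.0


def course_mark(scores):
    """Return the overall mark (0-100) from percentage scores per component."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


def exam_average(scores):
    """Return the weighted average of the two exams only (0-100)."""
    exam_weight = WEIGHTS["midterm"] + WEIGHTS["final"]
    weighted = (WEIGHTS["midterm"] * scores["midterm"]
                + WEIGHTS["final"] * scores["final"])
    return weighted / exam_weight


def passes_exam_requirement(scores, pass_mark=ASSUMED_PASS_MARK):
    """Check the exam average against the assumed pass mark."""
    return exam_average(scores) >= pass_mark


# Made-up scores: strong term work, weaker exams.
example = {"exercises": 90.0, "assignments": 80.0, "midterm": 60.0, "final": 40.0}
print(course_mark(example))
print(exam_average(example))
print(passes_exam_requirement(example))
```

With these sample scores, the overall mark is 57.0, but the weighted exam average is about 45.7, so the exam requirement would not be met under the assumed threshold.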

Exercises and Assignments

There are four sets of exercises and four assignments in this course. You can find them on the course web site. The due dates and instructions on how to submit them are also there.

The exercises consist of short problems. They will make sure you’re on the right track and have the basics down. They shouldn’t take you too long to complete.

The assignments are more work. You will have to figure out more on your own and explore the concepts in the course.

You will submit both exercises and assignments by email, sending the web address where you have put your web page(s) and any other information required by the assignment. The exact email address that you should send your work to will be indicated on the exercise set or assignment. You must place your assignments in the web space provided for the course. You can find information on uploading the files in Appendix A.

Late exercises and assignments will be penalized 10 percent per day (including weekends). Lateness will be assessed on the basis of the later of the email sent to the TM and the “last modified” date on the files you’ve put in your web space.

So, you shouldn’t change your web pages after the due date, or they will be counted as late. You also shouldn’t delete any of your assignments until the end of the course.

Exercises will not be accepted more than three days after the due date since solutions have been posted by then.
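The late policy can be sketched in Python as follows. This is only an illustration under stated assumptions: the guide does not say whether part of a day counts as a full day or whether the 10 percent is taken from the total available mark, so the sketch assumes both. The function names and the dates are hypothetical.

```python
import math
from datetime import datetime

PENALTY_PER_DAY = 10      # percent per day late, as stated in the guide
EXERCISE_CUTOFF_DAYS = 3  # exercises are refused after this many days


def days_late(due, emailed, last_modified):
    """Days late, using the later of the email time and the file's modified time."""
    submitted = max(emailed, last_modified)
    seconds_late = (submitted - due).total_seconds()
    if seconds_late <= 0:
        return 0
    # Assumption: any part of a day counts as a whole day.
    return math.ceil(seconds_late / 86400)


def penalized_mark(raw_mark, late_days, is_exercise=False):
    """Apply the late penalty to a mark out of 100."""
    if is_exercise and late_days > EXERCISE_CUTOFF_DAYS:
        return 0.0  # not accepted at all
    # Assumption: 10 percentage points of the total are deducted per day.
    return max(0.0, raw_mark - PENALTY_PER_DAY * late_days)


# Hypothetical dates: emailed on time, but a file was edited afterwards.
due = datetime(2004, 6, 4, 23, 59)
emailed = datetime(2004, 6, 4, 20, 0)
last_modified = datetime(2004, 6, 6, 9, 30)

late = days_late(due, emailed, last_modified)
print(late)
print(penalized_mark(85.0, late))
```

In this example the email arrived before the deadline, but the later file change makes the work two days late, so a mark of 85 becomes 65 under these assumptions.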

Exams

There will be two examinations in this course. They are closed-book and you aren’t allowed any notes, calculators, or other aids. The exams have a mixture of short and longer answer questions.

The 50-minute midterm exam will be held in week 7 or 8. You will be responsible for the material in Units 1 to 6 in this guide and the readings given for those units. The final exam will be three hours long and will cover all of the course material.

Academic Dishonesty

We take academic dishonesty very seriously in this course. Academic dishonesty includes (but is not limited to) the following: copying another student’s assignment; allowing others to complete work for you; allowing others to use your work; copying part of an assignment from an outside source; cheating in any way on a test or examination.

If you are unclear on what academic dishonesty is, please read Policy 10.02. It can be found in the “Policies & Procedures” section in the SFU web site.

Cheating on an assignment will result in a mark of 0 on the assignment and a recommendation for a further mark penalty in the course. At the course supervisor’s option, an F may be recommended for the course. Any academic dishonesty will also be recorded in your file at the University.

Any academic dishonesty on the midterm or final will result in a recommendation that an F be given for the course and possibly a recommendation that the student be suspended or expelled from the University.

Copyright

When you create web pages, you must keep in mind other people’s copyright. Whenever someone publishes something, it is illegal for others to copy the material except with the permission of the copyright owner.

That means you can’t just copy text or images from any web site. When you are creating your pages (except where the assignment states otherwise), you can use images from other sites only if the site indicates that you are allowed to do so or if you have sought and received permission. Some sites, particularly clip art sites and other graphics collections, state that you’re allowed to use their images for your own sites.

If you do use anything from other sites for this course, you must indicate where it came from by providing a link to the original source. Failure to do so is academic dishonesty and will be treated as described above.


Schedule

To give you an idea of the pace at which you should be going through the material, please look at the course schedule in Figure 1. Exact assignment due dates are in your Red Folder.

The readings listed here aren’t the whole story. Specific chapters and sections are listed in each unit.

Getting through CMPT 165

Self-directed courses require a lot of motivation. You will be mostly on your own as you work your way through this course.

For each Unit, the following order of activities is suggested:

1. Read the unit in the Study Guide and do the “Check-Up Questions.”

2. Do the readings listed at the start of the unit. Also have a look at the relevant links for the unit on the web site.

3. Have a look at Appendices A and B to see if they discuss any technical skills that are relevant to the material.

4. If there is an assignment or a set of exercises, do it.

After doing the readings, look at the Summary at the end of the unit. It will give you an idea of what you should have learned. The list of terms is particularly useful. These terms are italicized when they are first used. Also, look at the Learning Outcomes listed at the start of the unit for a more detailed list.

You should also pay attention to the course web site. There are useful resources there that can give you help if you need it. If the content of the web site is updated during the course, email will be sent to the course list.

Getting Help

There are several ways for you to get help in this course.

You can check the discussion board on the course web site. You may find that someone else has asked the same question and that it has been answered.

Part | Unit | Week | Topic | Resources | Submit
I | | 1 | Course Introduction | Study Guide |
II | 1 | 1 | The World Wide Web | Study Guide, Internet Book |
 | 2 | 2,3 | Markup and HTML | Study Guide, online HTML reference | Exer 1
 | 3 | 4 | Text and Graphics | Study Guide | Assign 1
 | 4 | 5,6 | More HTML and Style Sheets | Study Guide, online CSS reference and colour chart | Exer 2
 | 5 | 6,7 | Design | Study Guide, Part 1 of Design Book | Assign 2
 | 6 | 7,8 | XML | Study Guide |
 | | 7–8 | Midterm Exam | |
III | 7 | 9,10 | Programming Introduction | Study Guide, Think Python | Exer 3
 | 8 | 11 | Web Programming Introduction | Study Guide | Assign 3
 | 9 | 11,12 | More Programming | Study Guide, Think Python | Exer 4
 | 10 | 12 | More Web Programming | Study Guide |
 | 11 | 13 | Internet Internals | Study Guide, Internet Book | Assign 4
 | | 14–15 | Final Exam | |

Figure 1: CMPT 165 course schedule


You should also feel free to contact the TMs by email or phone. You can phone them during office hours or email at any time. Information on contacting the TMs will be in your course package.

You can also contact the course supervisor (preferably by email) for administrative problems only. You should feel free to contact the supervisor if you’re having serious problems.

About the Author

I hesitated about writing this section. It seems sort of self-congratulatory: a few paragraphs about who I am aren’t so bad, but they set a dangerous precedent. The next thing you know, I’ll have 8×10 glossy photos of myself and I’ll start writing an autobiography.

On the other hand, I’m reminded that students in an on-campus class would learn something about me over the course of the semester. In an effort to duplicate that experience, I relented...

I’m a lecturer at SFU in the School of Computing Science. I started in September of 2000. I finished my M.Sc. in Computing Science at SFU just before that (and I do mean just). My undergraduate degree is in Math and Computer Science from Queen’s University (Kingston, not Belfast).

I am neither a professor nor a doctor, which causes problems for students who are accustomed to using these titles. For the record, I prefer “Greg.”

When I’m not working, I enjoy food and cooking. Unfortunately, I’m getting to an age where enjoyment of food is going to mean buying new pants more often than I like. Vancouver is a great place to be into food. The mix of cultures and available ingredients makes for very interesting cuisine. I’m always interested in good restaurant suggestions if you have any you would like to make.

I’d like to thank Tim Beamish and Rong She, the TAs for the summer 2001 offering of CMPT 165, for helping me prepare this Study Guide. I’d also like to thank the students in the summer 2001 offering who suffered through its draft version.

If you have any comments on the course or restaurant suggestions, please feel free to contact me at [email protected].


Part II

The Web and Web Pages


Unit 1

The World Wide Web

Learning Outcomes

• Describe the basic structure of the Internet.

• Define some basic terms from the Internet and WWW.

• Give examples of protocols used on the Internet.

• Describe the way a web page gets to your computer.

• Understand error and status messages you get when using the Internet.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• (optional) Have a look at the relevant chapters in The Internet Book: Chapters 2 and 10 to 13.

• Do Exercise 1.

Topic 1.1 Basics of the Internet

As I mentioned in the Introduction, we will assume that you have some familiarity with computers and the Internet. In particular, we’re going to assume that you have used the Internet and know what the World Wide Web (WWW) is. You don’t have to be an expert, but you should be comfortable using your computer and navigating the web.


[Figure 1.1 is a network diagram. Its labels: SFU web server; other SFU servers, gateways, etc.; other computers at SFU or elsewhere in Vancouver; UBC web server; gateway; other ISPs; computers nearby; home computer.]

Figure 1.1: How information might get from the SFU web server to a home computer

Connecting to the Internet

So, what is the Internet anyway? Basically, it is a huge network of connected computers. One computer is connected to the next, and they pass information along from one to another. That’s really all there is to it: many different computers, all passing information around so it eventually gets to its destination.

When a desktop computer is connected to the Internet, it is probably only connected to one other computer, its gateway. The gateway will be connected to one or more other computers and will pass along whatever information is sent or received.

The gateway and other computers that form the Internet’s infrastructure or backbone may be connected to many other machines. Their job is to receive information and pass it along in the right direction. They might be computers that are more-or-less like desktop PCs, or they might be specialized devices called routers designed specifically for this job.

For example, Figure 1.1 shows the possible route that a web page might travel to get from SFU’s web server (www.sfu.ca) to a computer connected to a cable modem or ADSL. The actual route from www.sfu.ca to a home computer would have about 8 to 10 steps. Routes to other computers on the Internet might have 20 or even more steps.

The various computers on the Internet can be connected to each other in many different ways. When a home computer is connected to the Internet, it is probably connected by one of the following methods:

• modem: Information is carried to and from your computer over your phone line, the same way a voice call is transmitted. The modem at either end of the phone call translates the sound of the phone call to and from computer data.

• ADSL: Information is also transmitted over the phone line but it does not use a traditional “phone call.” Information is encoded differently than it is in a modem connection, which lets ADSL transmit data faster than a modem.

• cable modem: With a cable modem, the data to and from your computer is carried over your TV cable.

When a computer connects to a network in an office building or at SFU (or if you have a “home network” set up), different connection methods are used. These are typically faster than the “home” connections listed above.

• ethernet: The connector for Ethernet looks like a fat phone jack. Since it uses wires, it’s generally used for desktop computers that don’t have to move around. It is several times faster than either ADSL or a cable modem.

• wi-fi: Wireless networking has recently become affordable and so more popular. Wi-fi (Wireless Fidelity) is also known as wireless LAN (Wireless Local Area Network), AirPort, and 802.11. There are different versions (e.g. 802.11b and 802.11g) with different data transmission rates.

Wi-fi is often used for laptops, which makes it possible to move the machine without having to worry about network cables. Wi-fi is available at SFU and increasingly in other locations.

There are also many other connection types used to make connections between buildings, across cities, and over long distances across countries.

Clients and Servers

You may have noticed that in the discussion above, the home computer, gateway, and the SFU web server are all described only as computers connected to the Internet. That’s true: all of them are connected to the Internet in fundamentally the same way; only the speed of the connection is different. So, why is www.sfu.ca a web server when your computer isn’t?

The only real difference is that www.sfu.ca will answer requests for web pages. The gateway and your home computer won’t. What makes it possible is the web server software installed on the server. This program runs all the time and answers requests for web pages. If you ran similar software on your computer, people could surf web pages on it as well.

[Figure 1.2 is a diagram of four messages passed between “Your Computer” and “Web Server”:
1. Please send me the web page (your computer to web server)
2. Here it is (web server to your computer)
3. Send me this image (your computer to web server)
4. Here it is (web server to your computer)]

Figure 1.2: The conversation that a web client and web server might have when you view a web page

If you did install web server software on your computer, you’d run into some problems. First, the SFU web server has an easy-to-remember name: www.sfu.ca. When home computers are connected to the Internet, they tend to have names like akjx74wuc23nf.bc.hsia.telus.net or h24-84-78-194.vc.shawcable.net, which are a lot harder to remember and type in correctly.

Second, www.sfu.ca has a very fast connection to the Internet, enough to serve up a lot of web pages at the same time. It’s also on all the time, which your computer might not be.

When you want to access web pages, you use a web browser. A web browser is one example of client software. A web client (like Mozilla or Internet Explorer) is the software you use to make the request to a web server. The client has to transmit the request to the server, receive the response, and then process the information so you can use it.

Client and server software let people use computers to transfer information over the Internet. Everything that you transfer is done by interacting with client software for that type of connection. For example, when you load a web page with Mozilla (a web client), it will ask the web server for the web page itself and then for any images or other files that it needs to display the page. Figure 1.2 shows the conversation that might happen to load a web page with one image on it.


The term “web server” may be a little confusing here because it’s used to refer to both the computer itself (www.sfu.ca is a computer sitting in a closet somewhere in Strand Hall) and the software that’s running on the computer (www.sfu.ca runs the Apache web server).

Check-Up Question

▶ Find out what kind of connection is used between your computer and your ISP. If you have a home network, does it use ethernet or wi-fi?

Topic 1.2 Protocols

When a client and server talk to each other, they have to agree on how they will exchange the various pieces of information that are required to get the job done. The “language” that the client and server use to exchange this information is called a protocol.

For example, when they are exchanging web pages, the client and server need to encode the requests and responses shown in Figure 1.2. They also must be able to indicate errors and other behind-the-scenes messages (like “Page not found” or “Page moved to here”).

Web pages are transferred using a protocol called the HyperText Transfer Protocol (HTTP). We will discuss HTTP more in Topic 1.3 and in Unit 11.
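
As a rough illustration of the behind-the-scenes messages mentioned above, here is a small sketch in Python (the language used later in this course). It assumes Python's standard http module, which lists the status messages defined by the HTTP standard. The numeric codes shown are not taken from this Study Guide, and the helper name describe is made up for the example.

```python
from http import HTTPStatus

# Three status messages a web server might send back with a response.
# The numeric codes come from the HTTP standard, not from this guide.
examples = [HTTPStatus.OK, HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.NOT_FOUND]


def describe(status):
    """Return a line such as '404 Not Found' for one status."""
    return f"{status.value} {status.phrase}"


for status in examples:
    print(describe(status))
```

The “Not Found” and “Moved Permanently” entries correspond roughly to the “Page not found” and “Page moved to here” messages described in the text.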

Information on the Internet

There are many different kinds of information that travel over the Internet. The one you’re probably most familiar with is web traffic: web pages and all of the graphics, sounds, and other files that go with them. But, a lot of other stuff travels around as well. The web is just one of many ways that information can travel across the Internet.

Here are some other ways information can be exchanged between computers on the Internet that you might be familiar with:

• Email. Even if you check your email on the web (with SFU’s Webmail or something similar), the mail itself still has to get from the sender to receiver. If someone at Hotmail sends email to your SFU account, it has to travel from the computers at Hotmail to the mail server at SFU.


• Instant Messaging. The various instant messaging services each have their own protocols (which is why they don’t generally work together). These include ICQ, AOL Instant Messenger, Yahoo! Messenger, and MSN Messenger.

• Peer-to-peer file transfer. These methods of file transfer sidestep the traditional client-server model and let people transfer files directly from one client to another. (Technically, the “client” software is performing the duties of a client and a server.) Because of their dubious legality, they tend to come and go, but they have included Napster, Gnutella, Audiogalaxy and Kazaa.

• FTP. FTP stands for “File Transfer Protocol.” It is an older method of transferring files but it is still often used to do things like installing web pages onto a web server.

• Network gaming. Games that allow network play generally act as a client; a server is run by the company that produced the game. The client and server exchange information about the game: moves that are made, changes in the game’s “map,” and other information to make sure the game works properly for all users.

Each of the examples above uses a different protocol to exchange information. That is why you need different client software for each of them: they need clients that speak different languages.

There are many other services that you might not think of. There are protocols for file sharing that allow you to use files and printers as if they were on your machine (like Windows File/Print Sharing). There are also protocols, for things like clock synchronization and remote access to computers, that you might never run across.

Check-Up Question

▶ Are there more examples of protocols/client software that you use?

Topic 1.3 How Web Pages Travel

We have covered web servers and web clients in Topic 1.1. The two computers have to talk to each other in order to deliver web pages.


As noted in Topic 1.2, the “language” a client and server use to talk to each other is called a protocol, and the protocol used to transfer web pages is called HTTP. You may have noticed that web addresses start with http://; this part of the address tells your web browser that you want it to use HTTP to talk to the server. A web server’s job is to communicate using HTTP with any clients that connect to it.

URLs

A URL (Uniform Resource Locator) is the proper name for an Internet address. URLs are also sometimes called URIs (Uniform Resource Identifiers). Here are some examples:

http://www.sfu.ca/
http://www.sfu.ca/~somebody/page.html
http://www.w3.org/Addressing/
https://my.sfu.ca/
ftp://ftp.mozilla.org/pub/mozilla/releases/
mailto:[email protected]

(Actually, all of these are “absolute” URLs. Relative URLs will be introduced in Unit 2.)

The first three URLs are HTTP URLs: they refer to information that is on the WWW. The others refer to information that you access using different protocols.

The fourth one, https://my.sfu.ca/, works almost like a web page, but it uses a “secure” connection to transmit the data. All of the information passed back and forth between the client and the server is encrypted so none of the computers in between can eavesdrop. HTTPS is often used for sensitive information like banking, passwords, and email.

The last two items in the list are URLs for information accessed by other protocols. They are an FTP URL (information on an FTP server) and an email URL (a link that will let you send an email to that address).

The protocol that should be used to access a particular URL is indicated by its scheme. The scheme is the part of the URL before the colon (:). For web (HTTP) URLs, the scheme is “http” or “https” for secure web traffic. The URL schemes “ftp” and “mailto” indicate FTP and email URLs. There are many other less common URL schemes used by specific applications.


http://www.sfu.ca/~somebody/page.html

scheme: http    server: www.sfu.ca    path: /~somebody/page.html

Figure 1.3: The parts of a simple URL

For HTTP and FTP URLs, the first part after the scheme indicates the server that should be contacted to exchange information. The rest of the URL after the server is the path of the file. We will discuss the other parts of an HTTP URL in Topic 2.4.
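
To see the three parts picked out of real URLs, here is a minimal sketch in Python. It assumes the urlsplit function from Python's standard urllib.parse module; the helper name url_parts is made up for this example, and the URLs are the ones used earlier in this topic.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split an absolute URL into (scheme, server, path)."""
    parts = urlsplit(url)
    # urlsplit calls the server part "netloc" (network location).
    return parts.scheme, parts.netloc, parts.path


for url in ["http://www.sfu.ca/~somebody/page.html",
            "ftp://ftp.mozilla.org/pub/mozilla/releases/"]:
    scheme, server, path = url_parts(url)
    print("scheme:", scheme, " server:", server, " path:", path)
```

The same function can be used to check your answers to the Check-Up Question for this topic.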

Check-Up Question

▶ Find the URL of some web pages that you visit often (in the location bar of Mozilla). Identify the scheme, server, and path of each.

Topic 1.4 MIME Types

Any type of data can be transferred over HTTP, including web pages themselves (which are written in a language called HTML, as we will see in Unit 2). Graphics on web pages are also sent by HTTP (in formats called GIF, PNG, and JPEG, which will be discussed in Unit 3). All of the other files you get from the web are also sent by HTTP: video, audio, Office documents, and so on.

When your web browser receives these files, it has to know what to do with them. It treats graphics data very differently from a text file. The browser also has to know what program to open for files it can’t handle itself, like Acrobat files, Office documents, and MP3 audio files.

If you use a Windows computer, you are probably used to looking at the file extension to figure out what type of file you have. For example, MS Word documents are usually named something like essay.doc; the .doc indicates that it is a Word document. Other extensions include .html or .htm for web pages and .pdf for Acrobat files.

Unfortunately, the file extension can’t be used on the web to decide the type of the file. For one thing, some operating systems don’t use file extensions (e.g. MacOS), so we don’t know for sure they will be available. It’s also possible that the browser wouldn’t even know a file name when the data is sent. We will also see other reasons later in the course.

MIME type | File Contents | How it might be handled
text/html | HTML (web page) | display as a web page in browser
application/pdf | Acrobat file | open in Adobe Acrobat
application/msword | MS Word document | open in Word
image/jpeg | JPEG image | display in browser
audio/mpeg | MP3 audio | open with WinAmp
video/quicktime | Quicktime video | open with Windows Media

Figure 1.4: Some example MIME types

The above information applies to files sent as email attachments as well. Since email is older than the web, when HTTP was created, it could use the solutions that were already in use for email.

Instead of using file extensions, the type of data sent by email or HTTP is indicated by a MIME type. MIME stands for Multipurpose Internet Mail Extensions. The MIME type indicates what kind of file is being sent. It tells the web browser (or email program) how it should handle the data. MIME types are made up of two parts. The first is the type, which indicates the overall kind of information: text, audio, video, image, and so on. The second is the subtype, which is the specific kind of information.

For example, a GIF format image will have the MIME type image/gif, indicating that it is an image (which gives the browser a hint what to do with it, even if it doesn’t know the subtype) and the particular type of image is GIF. Most web browsers know how to display GIF images themselves, so they wouldn’t have to use another application to handle it. There are more examples in Figure 1.4.
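
The two-part structure of a MIME type can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the rule that a browser displays “text” and “image” data itself is a simplifying assumption, not how any particular browser decides.

```python
def split_mime_type(mime_type):
    """Split a MIME type such as 'image/gif' into (type, subtype)."""
    main_type, subtype = mime_type.split("/", 1)
    return main_type, subtype


def can_display_in_browser(mime_type):
    """A made-up rule: this pretend browser shows text and images itself."""
    main_type, _ = split_mime_type(mime_type)
    return main_type in ("text", "image")


for mime_type in ["image/gif", "text/html", "audio/mpeg"]:
    print(mime_type, split_mime_type(mime_type), can_display_in_browser(mime_type))
```

Notice that the pretend browser only needs the first part (the type) to make its decision, which is the “hint” described above.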

Microsoft’s Internet Explorer browser doesn’t handle MIME types correctly. It often ignores them and then tries to guess the type itself (and is sometimes wrong). This is one of the reasons that we suggest you don’t use IE for this course.

When you put a file on a web server, the server usually determines the MIME type based on the file’s extension. This isn’t always the case. Some web servers handle things differently, but for the moment the MIME type will come from the file’s extension. However, you should remember that the type and the extension are different things.
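
One way to picture how a server might turn an extension into a MIME type is a simple lookup table. The sketch below is hypothetical: the table, the function name, and the fallback type application/octet-stream (a common generic type for unknown data) are assumptions for illustration. The MIME types themselves are the ones in Figure 1.4, paired with extensions commonly used for those files.

```python
import os.path

# A hypothetical server configuration: file extension -> MIME type.
# A real server's table is much longer.
EXTENSION_TYPES = {
    ".html": "text/html",
    ".htm": "text/html",
    ".pdf": "application/pdf",
    ".doc": "application/msword",
    ".jpg": "image/jpeg",
    ".mp3": "audio/mpeg",
    ".mov": "video/quicktime",
}


def mime_type_for(filename, default="application/octet-stream"):
    """Look up the MIME type a server might announce for a file name."""
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_TYPES.get(extension, default)


for name in ["index.html", "essay.doc", "sfu.jpg", "notes"]:
    print(name, "->", mime_type_for(name))
```

The extension is only used on the server to choose the type. What the browser receives is the MIME type, which is why the two are different things.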

Check-Up Questions

▶ You can determine the MIME type of a file with Mozilla in the “Page Info” window (in the “View” menu). Try going to a web page and have a look. You should see text/html.

▶ When you’re viewing an image you will see the image’s MIME type if you right-click and “View Image.” Try it out with a few images.

Topic 1.5 Fetching a Web Page

Let’s look back at Figure 1.2, now that we know a few more details about the process.

Suppose that the web page that is being requested is at the URL http://www.sfu.ca/about/index.html and that one of the graphics on the page is the JPEG image at http://www.sfu.ca/hp/images/sfu.jpg.

We can now give a more detailed description of what happens in the conversation indicated by each of the four arrows in Figure 1.2.

1. The web browser contacts the server specified in the URL, www.sfu.ca. It asks for the file with path /about/index.html.

2. The server responds with an “OK” message, indicating that the page has been found and will be sent. It indicates that the MIME type of the file is text/html, so it’s an HTML page. Then it sends the contents of the file, so the browser can display it.

3. The browser notices that the web page contains an image with URL http://www.sfu.ca/hp/images/sfu.jpg. It contacts www.sfu.ca and asks for the path /hp/images/sfu.jpg.

4. The server again responds with an “OK” and gives the MIME type image/jpeg, which indicates a JPEG format image. Then it sends the actual contents of the image file.

Once the web browser has the HTML for the web page and all the graphics that go with it, it can draw the page on the screen for you to see.
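
The four steps above can be imitated with a toy simulation in Python. Nothing here touches the network: the “server” is just a dictionary, the file contents are invented, and the names FILES, serve, ImageFinder, and fetch_page are made up for this sketch. It only assumes the standard html.parser and urllib.parse modules.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# A pretend web server: path -> (MIME type, contents). Contents are invented.
FILES = {
    "/about/index.html": (
        "text/html",
        '<html><body><img src="http://www.sfu.ca/hp/images/sfu.jpg" /></body></html>',
    ),
    "/hp/images/sfu.jpg": ("image/jpeg", "(pretend JPEG data)"),
}


def serve(path):
    """Answer one request, as in steps 2 and 4."""
    if path in FILES:
        mime_type, contents = FILES[path]
        return "OK", mime_type, contents
    return "Not Found", "text/plain", ""


class ImageFinder(HTMLParser):
    """Collect the URL of every image mentioned in an HTML page."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src":
                    self.image_urls.append(value)


def fetch_page(url):
    """Play the browser's part and return a log of the conversation."""
    log = []
    path = urlsplit(url).path                      # step 1: ask for the page
    status, mime_type, contents = serve(path)      # step 2: server answers
    log.append((path, status, mime_type))
    if status == "OK" and mime_type == "text/html":
        finder = ImageFinder()
        finder.feed(contents)
        for image_url in finder.image_urls:        # step 3: ask for each image
            image_path = urlsplit(image_url).path
            status, mime_type, _ = serve(image_path)   # step 4: server answers
            log.append((image_path, status, mime_type))
    return log


conversation = fetch_page("http://www.sfu.ca/about/index.html")
for path, status, mime_type in conversation:
    print(path, status, mime_type)
```

A real browser sends these requests over the Internet using HTTP, and a real page usually needs many more than two requests, but the order of events is the same.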


Summary

This unit gives you a brief introduction to how the WWW and the Internet work. By now you should have a better understanding of what goes on behind the scenes when you use the Internet for various tasks. This information will help you later in the course and should be useful anytime you’re using the Internet and things go wrong: you can often fix or work around problems if you understand what’s going on.

We will be coming back to many of the same topics in Unit 11. By then, we will be able to go into more technical detail.

Key Terms

• World Wide Web (WWW)

• Internet

• gateway

• client

• server

• web server

• client software

• web browser

• protocol

• HTTP

• URL

• scheme

• path

• MIME type

• subtype


Unit 2

Markup and HTML

Learning Outcomes

• Discuss various ways of describing documents.

• Compare them for use in various tasks.

• Create a web page that includes links and images using HTML.

• List some common HTML tags.

• Carry out validation on an HTML page.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• Review the XHTML Reference pages found in the “Online References” section of the course web site.

Topic 2.1 Describing Documents

There are two basic ways of editing documents on a computer.

The first is “What You See Is What You Get” or WYSIWYG (pronounced wissy-wig). You’re probably already familiar with it even if you don’t know the term. In a WYSIWYG program, what you see on the screen when you are editing the document looks the same as the finished product. Application programs like MS Word and WordPerfect are WYSIWYG; you view and edit the document on the screen, and it looks exactly as it will when printed.

The other way to edit a document is to use markup. With markup, you use a “markup language,” which looks a lot like a programming language. It has special codes to indicate the appearance of the document. The editing is usually done in a text editor. You can’t see the finished form of the document until it is processed, and then you can look at it with a viewer. Some examples of markup are HTML (for web pages, which we will use later), XML (used behind-the-scenes in many programs to store data), LaTeX (often used in mathematics and computer science), and WordPerfect’s “Reveal Codes” feature.

Most people use WYSIWYG editing for most of their work. Some of the most important reasons they do are that

• it’s easier to learn, and

• the user gets immediate feedback.

There are also good reasons for using markup instead:

• No information about the document is hidden.

• There can be more than meets the eye (e.g., bibliography references, different display for the screen, printers, etc.).

• It is easy to separate content from appearance.

• All you need to edit it is a text editor.

HTML is the markup language used for web pages. There are several web page editors available for HTML that claim to be WYSIWYG, but there is no such thing as a WYSIWYG web editor. HTML doesn’t describe every aspect of the document’s appearance, so it can never be WYSIWYG. Web page editors that claim that they are only give authors a false sense of security.

Physical vs. Logical Markup

We will soon see two distinct forms of markup.

The first is physical markup (or visual markup). Physical markup directly describes how the document should appear. That is, you specify a particular appearance for part of the document. For example, markup that indicates that text should be bold or in a 16pt font is visual markup.


The other form of markup is logical markup (or structural markup). Logical markup describes the purpose or meaning of the text. For example, we might indicate that some text is the heading for a new chapter or paragraph. When you use logical markup, you should describe the content of the document without worrying about its appearance.

To determine the appearance of logical markup, you use a style. A style is a set of rules like “a section heading should be in a bold, 16pt font, with some space above and below it.” Once an appropriate style has been created, you only need to indicate where the section headings are in the document.

Benefits of Logical Markup

Using logical markup with styles has several benefits. The main one is that if you want to change the appearance of the document, you only need to change the style; the changes should then be made throughout the document. In the example, if we wanted to change the way section headings look, we would change the style file to something like “section headings should be an 18pt font, centred, with some space above and below.” If we had used physical markup, we would have to hunt through the entire document and make the change to each section heading.

When you use logical markup, it is also possible to have different styles for different situations. For example, a document could have an on-screen style for viewing on a monitor, a “large-type” style for the visually impaired, and another style for use when it is printed. It is also possible to have a “style” for speaking the document out loud, which could be used by a text-to-speech program for the blind.
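
The idea of one logically marked-up document with several interchangeable styles can be sketched in Python. This is a toy model, not how HTML or style sheets actually work: the document format, the two styles, and the render function are all invented for illustration.

```python
# A document in logical markup: each piece of text is labelled with its purpose.
document = [
    ("heading", "Benefits of Logical Markup"),
    ("paragraph", "Change the style once and every heading follows."),
]

# Two hypothetical styles. Each one maps a purpose to a display rule.
screen_style = {
    "heading": lambda text: text.upper(),
    "paragraph": lambda text: text,
}
print_style = {
    "heading": lambda text: "== " + text + " ==",
    "paragraph": lambda text: "    " + text,
}


def render(document, style):
    """Display each piece according to the style's rule for its purpose."""
    return [style[purpose](text) for purpose, text in document]


for line in render(document, screen_style):
    print(line)
for line in render(document, print_style):
    print(line)
```

The document itself never changes. Switching from screen_style to print_style, or editing the “heading” rule in one style, changes how every heading is displayed.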

HTML has both logical and physical markup (as we will see). MS Word actually allows logical formatting with its underused and underappreciated “Styles” feature.

On the web, logical markup can be used by programs that aren’t traditional web browsers. For example, visually impaired people often use a speech browser. Logical markup can be used to tell the browser to use an appropriate tone of voice or pause at appropriate times.

Also, some search engines use logical markup to figure out what your pages are actually about. Using logical markup properly can help these programs categorize your page correctly and bring you visitors who are interested in your topic.


Check-Up Questions

I How do you usually create your documents? Using WYSIWYG or markup?Do you use physical or logical formatting?

I Does your word processor have a feature that allows you to see the markuplanguage that underlies your document?

I Do you know how to use any logical formatting features of your wordprocessor (like Word’s “Styles”)?

Topic 2.2 HTML Basics

The documents that we create in this course will be done in HTML (Hy-perText Markup Language). It’s really a very descriptive name: HTML is amarkup language that can be used to describe hypertext pages (documentswith links). In particular, HTML is used to describe web pages.

The version of HTML we will be using in this course is XHTML 1.0.XHTML stands for “eXtensible HyperText Markup Language.” For the mo-ment, we will use “HTML” and “XHTML” synonymously.

If you have used HTML before, you might not be familiar with XHTML. It is a newer version of HTML and has a few syntax differences, but it is basically the same. We will discuss the difference between XHTML and the older HTML standards more in Topic 6.5.

If you haven’t written web pages in HTML before, you can think of “HTML” and “XHTML” as the same thing, at least until we get to Topic 6.5.

The markup of an HTML page is done with tags. HTML tags consist of one or a few letters, surrounded by angle brackets (<>). Some tags are:

<b>     bold text
<p>     a paragraph
<h1>    a level 1 (the largest) heading

HTML tags have both an opening and closing version. For the bold text tag, <b> is the opening tag and </b> is the closing tag. The text that should be modified is put between the opening and closing tags. So, this HTML:


this is <b>bold text</b>.

will be displayed like this:

this is bold text.

The material between the opening and closing tags is referred to as the tag’s contents. In the example above, the <b> tag has contents “bold text.”

Some HTML Tags

Here are some common HTML tags:

<h1>...</h1>         a level 1 heading; every page should probably have one of these at the top.
<h2>...</h2>         the second largest heading, for sections within the main document.
<h6>...</h6>         the smallest heading.
<p>...</p>           a paragraph; these tags should be wrapped around each paragraph.
<i>...</i>           italic text (visual markup).
<em>...</em>         emphasized text: text that should stand out, usually displayed in italics (structural markup).
<html>...</html>     the HTML document; should be wrapped around the whole document.
<head>...</head>     the header: information about the page. Only a few tags, like <title>, are allowed in the <head>.
<title>...</title>   the page’s title; usually displayed in the browser’s title bar, must be in the <head>.
<body>...</body>     the body of the page; contains the material that should be displayed on the web page itself.

You should also have a look at the full list of XHTML tags. There is a link to it in the “References” section of the course web site. We will be using it as our definitive resource for tags and entities.

<html>

<head>

<title>The page’s title</title>

</head>

<body>

<h1>Sample page</h1>

<p>This is a paragraph, with some

<b>bold text</b>.</p>

</body>

</html>

Figure 2.1: An XHTML document

Figure 2.2: The display of Figure 2.1 in a browser

A Full XHTML Document

Figure 2.1 is a full XHTML document (although we will be making some additions to it in Topic 2.7). The text you see there is typed into a text editor and saved as a .html file. Its display in Mozilla 1.6 under Windows is shown in Figure 2.2. Remember that other browsers might display this same HTML differently.

XHTML documents could also be saved with a .xhtml extension, but that can cause problems for browsers that don’t know about XHTML (Internet Explorer, in particular). Stick with .html for this course.

This figure can be found in the “Examples” section of the course web page.

More on Closing Tags

In older versions of HTML (before XHTML), it was possible to leave out closing tags in certain circumstances. In XHTML, you can’t omit them: all tags must be closed.

There are a few XHTML tags that aren’t allowed to have any contents. For example, the <br> tag inserts a line break into your page; the next words after <br> appear on the next line. It doesn’t make any sense to have anything between the <br> and the closing tag, </br>. So, the two must always appear right beside each other: <br></br>.

Since empty tags occur frequently in XHTML, there is a short form for the closing tag that can be used for empty tags. These are exactly equivalent in XHTML:

<br></br>

<br />

Finishing the opening tag with the slash indicates that you want to close the tag right away.

The online XHTML reference indicates which tags are empty and can use this short form for the closing tag. The reference is described later, in Topic 2.3.

Nesting Tags

The closing tags must be arranged to ensure the contents are properly nested. It is illegal to write something like the following:

<b><i>bold, italic text</b></i>

Instead, the tags should look like this:

<b><i>bold, italic text</i></b>

so that the <i>. . . </i> is entirely within the <b>. . . </b>, not partially overlapping.
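One way to see why overlapping tags are a problem is to hand the markup to a strict XML parser. The sketch below is only an illustration, not one of the course tools: it uses Python’s standard xml.etree.ElementTree module, and the function name is_well_formed is made up for this example. It checks only that tags are closed and nested properly, not that they are valid XHTML.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment):
    """Return True if every tag in the fragment is closed and properly nested."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<p><b><i>bold, italic text</i></b></p>"))  # True
print(is_well_formed("<p><b><i>bold, italic text</b></i></p>"))  # False
```

A web browser is usually more forgiving than this parser, which is exactly why badly nested tags can go unnoticed.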

While we’re on the subject of nesting tags, note that some tags can only occur inside others. For example, a paragraph, <p>, must be inside the body, <body>. The tag <li>, which stands for “list item,” must go in one of the list structures, for example <ol>, which indicates an ordered list. These tags are used in this way:

<ol>
<li>item 1</li>
<li>item 2</li>
</ol>

That would be rendered in most browsers as:

1. item 1
2. item 2

Again, the online XHTML reference in the “Online References” section of the course web site indicates which tags can contain which others.

HTML Pitfalls

There are a few more points about HTML that you should be aware of:

• All tags in XHTML must be in lower case. So, <body> is okay, but <BODY> and <Body> aren’t. Older versions of HTML allowed uppercase tags, but XHTML doesn’t.

• The browser ignores spacing in your HTML document. It doesn’t matter if you press return to jump to the next line. All spacing on the finished web page must be done with markup. Any combination of spaces, returns, and tabs in your HTML will be treated the same as a single space by the browser.

• HTML is sent over the Internet to the user’s web browser. As a result, the author doesn’t get to decide what the page looks like; the user’s browser does. Authors shouldn’t assume that the page will look the same for everyone as it does for them. For example, pages might look very different in Mozilla, Internet Explorer, WebTV, and so on. If you use the tags correctly, the page should still get its point across.

This is the reason WYSIWYG editors don’t fit well with HTML: HTML is never WYSIWYG.

• If a browser doesn’t know what to do with a particular tag, it ignores it entirely. The contents of the tag will be displayed, without any change in their appearance, which can cause unexpected results in different browsers. Again, if you’ve used the HTML tags as they are intended, your page should still be readable.
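The rule about spacing can be imitated in a few lines of code. This is a rough sketch in Python (used here only for illustration), and collapse_whitespace is a hypothetical helper name; a real browser does considerably more, but the basic idea is that any run of spaces, tabs, and returns ends up as one space.

```python
import re

def collapse_whitespace(text):
    """Treat any run of spaces, tabs, and newlines as a single space."""
    # Trimming the ends is a simplification of what a browser does.
    return re.sub(r"[ \t\r\n]+", " ", text).strip()

source = "This   is a\n\n   paragraph,\twith odd    spacing."
print(collapse_whitespace(source))
# This is a paragraph, with odd spacing.
```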


Check-Up Questions

• Type in Figure 2.1 using a text editor or download it from the “Examples” section of the course web site. Save it with the file name first.html, and then open it in your web browser. It should look something like Figure 2.2.

• Make some changes to your file first.html, and click the “Reload” or “Refresh” button in your browser to view the changes. Try some of the other tags mentioned above.

Topic 2.3 More HTML

Attributes

Some tags can be modified with attributes. To modify the way a tag behaves, attributes and their associated values are put in the opening tag, for example: <tag attribute="value">. (The word “attribute” here should be pronounced with the stress on the first syllable, not on the second as in the verb that is spelled the same way.)

For example, the tag <hr> is used to put a horizontal rule (line) on a page. By default, the rule goes all the way across the page. Its length can be modified with the width attribute. So, <hr width="50%" /> will draw a line across half of the screen. Note that <hr> is another empty tag, so we can put the closing slash in the opening tag.

Note also that the value must be enclosed in quotes. Also, a tag can have several attributes, for example:

<tag attrib="value" attrib2="value">

The order of the attributes does not matter.

The attributes that each tag can have are described in the online XHTML reference. See the link in the “References” section of the course web site.
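To see attributes and values the way a program sees them, here is a small sketch using Python’s standard html.parser module. The class name AttributeLister and the function list_attributes are invented for this illustration. Each opening tag is reported together with its attributes as name/value pairs.

```python
from html.parser import HTMLParser

class AttributeLister(HTMLParser):
    """Collect (tag, attributes) pairs for every opening tag seen."""
    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; the quotes are already removed
        self.found.append((tag, dict(attrs)))

def list_attributes(markup):
    parser = AttributeLister()
    parser.feed(markup)
    parser.close()
    return parser.found

print(list_attributes('<hr width="50%" /><p lang="fr">bonjour</p>'))
```

Because the attributes are stored by name, writing them in a different order inside the tag gives the same result.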

Entities

All HTML tags are enclosed in <>. So, if we want a < on a web page, we can’t just type it into the HTML because the browser will assume that it is indicating the start of a tag. To print this symbol or other special characters, we use entities.

<p lang="fr">gar&ccedil;on</p>

[Figure labels: opening tag <p lang="fr">, with attribute lang and value "fr"; contents gar&ccedil;on, including the entity &ccedil;; closing tag </p>]

Figure 2.3: HTML terms to output “garçon”

All entities start with an “&” and end with a “;”—the entity for a less-than sign is &lt;. So, to display “7 < 10” on a web page, we would type

7 &lt; 10

in our HTML document. There are many other entities, for example:

• &gt; produces a “>”

• &amp; produces an “&”

• &quot; produces a “"”

• &aacute; produces an “á.”

Figure 2.3 shows the parts of an HTML fragment and how they fit together. In the “Online References” section of the course web site, you can find a list of all of the entities and see how they display in your browser.
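If you want to check what an entity stands for without opening a browser, Python’s standard html module can translate in both directions. This is just an illustrative sketch: html.unescape turns entities into characters, and html.escape goes the other way for the characters that are special in HTML.

```python
import html

print(html.unescape("7 &lt; 10"))        # 7 < 10
print(html.unescape("gar&ccedil;on"))    # garçon
print(html.escape("7 < 10 & 3 > 2"))     # 7 &lt; 10 &amp; 3 &gt; 2
```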

Comments

It is possible to include comments in HTML. Comments are never displayed by the browser. They are used to place notes for yourself or others in the HTML itself. A comment starts with <!-- and ends with -->:

<!-- this is an HTML comment. -->

Comments can also be used to disable some HTML that you don’t want displayed but don’t want to delete either.

XHTML Reference

There are many more XHTML tags and entities than have been mentioned here, and you should explore them further. You can reach the definitive reference through the “References” section of the course web site.

Figure 2.4 shows part of the reference page for the <ul> tag.

Figure 2.4: A sample page from the XHTML reference

The Syntax line gives the general usage of the tag. Empty tags will be displayed with the short-form closing tag. The Attribute Specifications lines give a list of possible attributes for this tag and their values.

The Contents line indicates what may be put inside the tag. In Figure 2.4, we see that “one or more” <li> tags can go in a <ul>. The Contained in line gives the tags that it can be placed in. Figure 2.4 indicates that <ul> can be placed inside of the <applet>, <blockquote>, . . . tags.

The text below these lines describes what the tag is for, how it should be used, and its attributes.

As you explore these pages, note that some of the tags and attributes are set in lighter coloured text, like “type” and “compact” in Figure 2.4 (if you can see this subtle difference after the printing and copying process). These tags and attributes are deprecated, which means that using them is discouraged because they will be removed from future versions of XHTML. There are better ways to accomplish the same task, often with style sheets, which are discussed in Unit 4.

There is also a list of the entities on these pages. Some browsers do not support all of the entities, so you should be somewhat conservative about which ones you use. If you can, look at these pages in a few different web browsers to see which entities they display correctly.

Check-Up Questions

• Modify your first.html so you use some entities.

• Have a look at the XHTML reference pages and familiarize yourself with some of the tags that are available. While you’re there, have a look at the entities.

• Modify your first.html so you use two or three tags with attributes.

Topic 2.4 Links in HTML

The links (or hyperlinks) on web pages are the key to the way the web works. Links are usually displayed in a different colour of text and underlined (see Figure 2.5).

In HTML, you use the <a> tag to create a link. The contents (see Figure 2.3) of the <a> tag are the text of the link. The href attribute indicates the URL of the link destination. So, Figure 2.5 would be created with the following HTML:

This is what a <a href="http://www.sfu.ca/">link</a>

usually looks like.

In this case, the link would take the user to SFU’s main web page.

It is also possible to put some other tags inside a link:

This is <a href="http://www.google.com/">a

<em>slightly</em> more interesting link</a>.

Here, the word “slightly” would be emphasized in the link.


Figure 2.5: A hyperlink on a web page

Relative URLs

The value of the href attribute is a URL, which we discussed in Topic 1.3. All of the URLs that we have seen so far are absolute URLs because they start with a scheme. Absolute URLs indicate everything about the location of a document that is needed to fetch it.

You can imagine that it would be a lot of work to type the absolute URL for every link on every web page on a web site. It would also be a problem if you wanted to move a web site to another location: you would have to fix every single link’s URL.

Both of these problems are addressed by relative URLs. Instead of having to type the full URL every time, relative URLs let you indicate just the changes from the current URL. Relative URLs don’t start with a scheme name (http://) and don’t specify a server either.

The destination of a link with a relative URL depends on the URL of the page it’s on. Suppose we are currently looking at the page http://www.sfu.ca/~somebody/pics/index.html. Assume that the examples of relative URLs below are on that page.

First, drop the filename from the current URL (everything after the last slash), so we have http://www.sfu.ca/~somebody/pics/. Then, the following rules are applied:

• If the relative URL starts with a slash, drop the rest of the path as well: http://www.sfu.ca.

• For every ../ at the start of the relative URL, drop another directory from the end of the URL. So, if there is one, we’d have: http://www.sfu.ca/~somebody/.

• Put the rest of the relative URL on the end of the current URL.

Figure 2.6 contains examples of URLs that could be links on the page http://www.sfu.ca/~somebody/pics/index.html and the URL the user would be taken to if they clicked on it.


Link URL                 Destination
img.png                  http://www.sfu.ca/~somebody/pics/img.png
../file.html             http://www.sfu.ca/~somebody/file.html
/test.html               http://www.sfu.ca/test.html
../../test.html          http://www.sfu.ca/test.html
dir/img.png              http://www.sfu.ca/~somebody/pics/dir/img.png
http://www.cs.sfu.ca/    http://www.cs.sfu.ca/ (absolute URL)

Figure 2.6: URLs from http://www.sfu.ca/~somebody/pics/index.html
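The rules for relative URLs are standard enough that most programming languages have them built in. As an illustration only (Python is not needed to write links), the sketch below uses urljoin from Python’s standard urllib.parse module to work out the destinations listed in Figure 2.6. The variable names base and links are just choices for this example.

```python
from urllib.parse import urljoin

# The page the links appear on
base = "http://www.sfu.ca/~somebody/pics/index.html"

links = ["img.png", "../file.html", "/test.html",
         "../../test.html", "dir/img.png", "http://www.cs.sfu.ca/"]

for link in links:
    # urljoin applies the relative URL rules to the base URL
    print(link, "->", urljoin(base, link))
```

Running this should print the same destinations as the figure, which is a handy way to check a relative link you are unsure about.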

Whenever possible, you should use relative URLs on your web pages. Then, if you move your page to a different location (say, from your hard drive to your web space), the links will still work. It also takes less typing.

Check-Up Question

• Put some links on your first.html page. Try both an absolute and a relative link.

Topic 2.5 Images in HTML

We will discuss creating and editing images in Unit 3. For the moment, we will just look at how to put an image that has already been created on a web page.

Images are inserted with the <img> tag. This tag has two required attributes: that is, an <img> tag without them is illegal. The first required attribute is src, which is used to indicate the URL where the image can be found. The second is alt, which is used to specify alternate text for the image.

The alternate text is used in several situations. It can be displayed if the image cannot be loaded for some reason (network congestion, bad URL, etc.), if the image hasn’t been downloaded yet, or if the browser does not support images.

Browsers that don’t support images are rare, but there are some still in use. For example, the visually impaired, who cannot use a graphical browser, often use a text browser and a speech synthesizer.

The <img> tag is empty, so a separate closing tag is unnecessary. You insert an image in the following way (for an image like Figure 2.5):

<img src="link.gif" alt="usual appearance of a

link" />

When you are creating alt text, you should try to write text that gives users as much of the meaning of the image as possible. Note that <img> is empty, so you can close it with the short-form slash.

You can specify the size of the image, in pixels, using the height and width attributes. With this information, the browser can display the page before the images are downloaded, since it knows how much space it needs to leave for them. As a result, your page will be displayed faster, especially for people with a slow connection, so specifying the size is a good idea. So, we might do something like this:

<img src="link.gif" alt="usual appearance of a

link" width="250" height="65" />

When an image is inserted in this way, the browser treats it like a character (admittedly, a funny shaped one) in the current paragraph. If you want an image that “floats” along the left or right margin, you should use style sheets (see Unit 4).

Check-Up Question

• Put an image on your first.html page. You can download an image from any web site for this (right-click or shift-click on the image and select “Save”). Have a look at the discussion of copyright in the Introduction before you start taking images from outside sources.

Topic 2.6 More HTML

Tag Types

As we mentioned earlier, some tags cannot go inside others. At first, which ones can go inside which others may seem arbitrary. You will understand the reasoning better if you think about the four classes of tags.

• Top level tags: The top level tags are the ones that define the overall document structure. The ones we’ve discussed are <html>, <head>, and <body>.

• Head tags: The head tags go inside the page’s <head> tag. These tags give information about the page (rather than things to display on the page itself). We’ve seen <title> and there are a few others, for example, <link> and <meta>.

• Block level tags: Block level tags are placed inside the <body>. The contents of these tags take up some vertical space on the page: that is, they are placed below the previous block level group, not beside it. Some block level tags are <p>, <hr>, <h1>, <blockquote>, and <ol>.

• Inline tags: The inline tags are placed inside the block level tags. They affect the way part of a line looks; they are placed beside the previous text. Some inline tags are <em>, <tt>, <img>, and <a>.

Most of the restrictions on tag placement are the result of these distinctions: block level tags go inside the body; inline tags go inside block level tags.

The lang attribute

The lang attribute can be applied to most HTML tags. It is used to specify the language of the text in the element.

Specifying the language is important for several reasons. Search engines can use the language to categorize your pages properly. It also allows browsers to render the text properly or even to provide an automatic translation for the user.

You can specify the language of almost any tag, but using it with the <html> tag will specify the language for the whole page. You might also want to specify the language of a paragraph (<p>), a quote (<q>), or a long quote (<blockquote>).

Some of the language codes you might use are en (English), fr (French), zh (Chinese), and ja (Japanese). So, to indicate that an entire web page is in English, you should use <html lang="en">; to indicate a paragraph in French, use <p lang="fr">.

A full list of language codes can be found in the “Other Links” section of the course web site.

Versions of XHTML after 1.0 will use the attribute xml:lang to specify the language. You should probably use both to be safe (both are valid in XHTML 1.0):

<html lang="en" xml:lang="en">

Check-Up Questions

• Have a look at the XHTML reference and see if you can categorize the tags as head/block/inline. (Note: there are a few tags that don’t really fit into these categories.)

• Have a look at the language codes reference, and find the codes for any other languages you know.

Topic 2.7 Validating HTML

Web browsers should always do their best to display a page, even if everything isn’t exactly correct. For example, most browsers will properly display the following as bold and italics, even though it isn’t correct:

<b><i>improperly nested tags</b></i>

Browsers want to be able to display as many pages as possible. Because they let some errors pass, many people creating web pages wrongly assume that every web browser will display incorrect HTML that they produce and so they develop bad habits. When you are creating web pages, you should assume that web browsers are very picky about what they display; that will give you the best chance that your pages will work in all browsers.

To help solve this problem, several people have developed HTML validators (or just validators). These programs check your page against the formal definition of HTML and report any errors. Putting your web page through a validator will help you find errors that your browser let pass and discourage bad habits.

For a validator to work, you must tell it what version of HTML you are using (there have been several). To do so, you insert a document type declaration in your HTML file as the first line, even before the <html>. The document type we will be using is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<title>Non-Validating HTML</title>

</head>

<body>

<h1>Non-Validating HTML</h1>

<b><p>This is not the way to make a bold

paragraph.</p></b>

<p>Here are some <b><i>badly nested tags</b></i>

</body>

</html>

Figure 2.7: Some non-valid XHTML

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

(This declaration can be on one line or split across two. You can copy and paste it from one of the examples on the web site; you don’t have to type it yourself or memorize it.)

This line indicates that the document is written in XHTML version 1.0 as defined by the World Wide Web Consortium (W3C). You should add this line to the start of every web page from now on.

In addition, for XHTML documents, you have to specify a namespace for the document. To do so, you add an attribute to the <html> tag:

<html xmlns="http://www.w3.org/1999/xhtml">

The namespace is new with XHTML; it wasn’t present in older HTML versions. You don’t need to worry about why it’s there; just make sure you include it.

Once the HTML version and namespace are specified on your web page, you can visit one of the HTML validators on the web and give it the URL of your page. Links to validators can be found on the course web site in the “Online Tools” section. The validator will retrieve your web page (either from a web server or from your computer) and check it for errors. See Figure 2.7 for an example of some non-valid HTML and Figure 2.8 for a validator’s output.

Figure 2.8: Part of the validation results of Figure 2.7

Figure 2.7 can be found in the “Examples” section under “Study Tools” on the course web site. Links have been added so you can click it and go to the validator’s output. You can find some instructions for working with the validators in Topic B.6.
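Before sending a page to a real validator, it can be handy to do a rough check yourself. The following Python sketch is only that: quick_check is a made-up function, and it is no substitute for a validator. It looks for the document type declaration, uses the standard xml.etree.ElementTree parser to catch unclosed or badly nested tags, and confirms that the <html> tag carries the XHTML namespace. It assumes the page uses no named entities such as &ccedil;, which this simple parser does not know about.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def quick_check(page):
    """Return a list of problems found by a rough, local pre-check."""
    problems = []
    if not page.lstrip().startswith("<!DOCTYPE"):
        problems.append("missing document type declaration")
    try:
        root = ET.fromstring(page)
    except ET.ParseError as err:
        problems.append("not well-formed: %s" % err)
        return problems
    # With the xmlns attribute present, the parser reports the root
    # element's name as "{namespace}html".
    if root.tag != "{%s}html" % XHTML_NS:
        problems.append("missing XHTML namespace on <html>")
    return problems

page = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Sample</title></head>
<body><h1>Sample page</h1><p>Some <b>bold text</b>.</p></body>
</html>"""

print(quick_check(page))
```

An empty list here means only that the rough checks passed. Problems like the bold tag wrapped around a paragraph in Figure 2.7 are properly nested, so only a real validator will report them.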

Check-Up Question

• Make a copy of your HTML from exercise set 1 and try to validate it. If necessary, try to fix the errors so that it does validate.

Summary

After doing this unit, you should be comfortable creating valid web pages using a text editor. You should know some of the basic tags and be learning more.

Key Terms

• WYSIWYG

• markup

• structural/logical markup

• visual/physical markup

• style

• HTML

• XHTML

• tag

• opening/closing tags

• attribute

• entity

• link

• absolute and relative URLs

• head-level tag

• block-level tag

• inline tag

• doctype

• valid HTML

• validator


Unit 3

Text and Graphics

Learning Outcomes

• Describe how data is stored in a computer.

• Describe how text is stored and what a character set is.

• Compare the basic types of computer graphics.

• Identify an appropriate format for a given image.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• Install your graphics program as described in Appendix A.

• Do Assignment 1.

Topic 3.1 How Computers Store Data

When computers store information on a hard drive or in memory, the only things that they can store literally are bits. A bit can be either a zero or a one. A computer’s memory is just a string of bits, for example, 001011010100. . . .

All data must be represented by a string of ones and zeros when it is stored or transmitted by a computer. Luckily, it is possible to represent a wide variety of information using bits. We will explore some of the more common types of information in this unit.

Numbers

The first thing that we need to consider is how computers represent numbers. Once we know how to store numbers in a computer, we can represent everything else as groups of numbers and then use the method described here to convert it to bits.

First, let’s look more closely at the way you count with regular numbers: 1, 2, 3, 4, 5. . . . Consider the number 165. We know what each one of the digits in that number means. The 1 is one hundred, 6 is six tens, and 5 is five ones: 165 = (1 × 100) + (6 × 10) + (5 × 1).

When you go from one place to the next, the value it represents is multiplied by 10. Each digit represents the number of 1s, 10s, 100s, 1000s. . . . The reason the values increase by a factor of 10 is that there are ten possible digits in each place: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. This is called decimal or base 10 arithmetic. (The “dec-” prefix in Latin means 10.)

We can apply the same logic and get a counting system with bits, binary or base 2 (“bin-” means 2). The rightmost bit will be the number of 1s, the next will be the number of 2s, then 4s, 8s, 16s, and so on. Binary values are often written with a little 2 (a subscript), to indicate that they are base 2 values: $101_2$.

To convert binary values to decimal, we can do the same thing we did above, substituting 2s for the 10s:

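$$
\begin{aligned}
10100101_2 &= (1\times 2^7) + (0\times 2^6) + (1\times 2^5) + (0\times 2^4) \\
&\quad + (0\times 2^3) + (1\times 2^2) + (0\times 2^1) + (1\times 2^0) \\
&= 128 + 32 + 4 + 1 \\
&= 165\,.
\end{aligned}
$$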

So, 10100101 is the base 2 representation of the number 165. We can represent any whole number this way.
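The conversion above is easy to repeat for other values with a little code. This is an illustrative sketch in Python; binary_to_decimal is a name chosen for the example. It adds up the place value of every 1 bit, just as in the calculation for 165, and then compares the answer with Python’s built-in conversions.

```python
def binary_to_decimal(bits):
    """Add up the place value (a power of 2) of every 1 bit."""
    total = 0
    # Reverse the string so that position 0 is the rightmost bit.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total

print(binary_to_decimal("10100101"))  # 165
print(int("10100101", 2))             # 165, using the built-in conversion
print(bin(165))                       # 0b10100101
```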

Usually, computers look at bits in fixed groups. One common size to look at is an eight bit group, called a byte. A single byte can represent numbers from 0 ($00000000_2$) to 255 ($11111111_2$).

You should be able to convince yourself that if we look at a group of $n$ bits, there are $2^n$ possible values that can be stored in those bits. So, $n$ bits can represent any number from 0 to $2^n - 1$. Other common groupings are of 16 bits (0–65535) and 32 bits (0–4294967295).
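If you would rather check these ranges than take them on faith, a short calculation does it. The Python sketch below is for illustration only, and largest_value is a hypothetical helper name.

```python
def largest_value(n_bits):
    """Largest whole number that fits in n_bits bits (counting from 0)."""
    return 2 ** n_bits - 1

for n in (8, 16, 32):
    print(n, "bits:", 2 ** n, "values, from 0 to", largest_value(n))
```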


Prefix        Symbol   Factor
(no prefix)            $2^0 = 1$
kilo-         k        $2^{10} = 1024 \approx 10^3$
mega-         M        $2^{20} = 1048576 \approx 10^6$
giga-         G        $2^{30} = 1073741824 \approx 10^9$
tera-         T        $2^{40} = 1099511627776 \approx 10^{12}$

Figure 3.1: Prefixes for storage units

There are some tricks used to represent negative numbers and numbers with fractional parts (like 13.25 or -0.675) using bits. We won’t go into those in this course, but you should know that it’s possible.

When measuring storage, the number of bits or bytes quickly becomes large. Figure 3.1 shows the prefixes that are used for storage units and what they mean.

For example, “12 megabytes” is

$$12 \times 2^{20}\ \text{bytes} = 12582912\ \text{bytes} = 12582912 \times 8\ \text{bits} = 100663296\ \text{bits}\,.$$

Note that the values in Figure 3.1 are slightly different than the usual meaning of the metric prefixes. One kilometre is exactly 1000 metres, not 1024 metres. When we measure values in computers, the 1024 version of the metric prefixes is usually used.

That statement isn’t entirely true. Hard drive makers, for instance, generally use units of 1000 because people would generally prefer a “60 gigabyte” drive to a “55.88 gigabyte” drive ($60 \times 10^9 \approx 55.88 \times 2^{30}$).
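Both calculations can be reproduced in a few lines. This Python sketch is only an illustration; the PREFIXES table and the to_bytes function are names invented here, using the powers of 2 from Figure 3.1.

```python
# Powers of 2 for each prefix, as in Figure 3.1
PREFIXES = {"kilo": 2 ** 10, "mega": 2 ** 20, "giga": 2 ** 30, "tera": 2 ** 40}

def to_bytes(amount, prefix):
    """Convert an amount with a storage prefix into a number of bytes."""
    return amount * PREFIXES[prefix]

twelve_mb = to_bytes(12, "mega")
print(twelve_mb, "bytes =", twelve_mb * 8, "bits")
# 12582912 bytes = 100663296 bits

# A drive sold as "60 gigabytes" using powers of 10 (60 billion bytes),
# measured in gigabytes of 2**30 bytes each:
print(round(60 * 10 ** 9 / 2 ** 30, 2))  # 55.88
```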

Topic 3.2 Text and Character Sets

Of course, computers can store text (like the text that makes up an HTML page or the text of an essay). In order to do so, we must be able to convert text to bits. We convert the text to numbers and then convert the numbers to bits, as above.

There are different ways to translate between characters and numbers. Such a translation is called a character set. The most common is ASCII (pronounced ask-key; it stands for American Standard Code for Information Interchange). ASCII indicates, for example, that the letter ‘A’ corresponds to the number 65; ‘a’ translates to 97; ‘;’ to 59, and so on. You can find a table of all of the characters in the ASCII character set linked from the course web site.
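You can look up these numbers yourself. In Python (used here purely as an illustration), the built-in ord function gives the number for a character and chr goes the other way. For the characters in ASCII, the numbers match the ASCII table.

```python
for ch in "Aa;":
    print(repr(ch), "->", ord(ch))   # 65, 97, and 59

print(chr(65), chr(97), chr(59))     # A a ;

# Encoding text with ASCII gives one number (one byte) per character.
print(list("HTML".encode("ascii")))  # [72, 84, 77, 76]
```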

Every PC uses ASCII, which defines 128 characters (numbered 0 to 127). There are variants of the ASCII code for different regions. There are also several extensions to ASCII that define 256 characters, which allow accented characters and a few other things missing from the original specification. This is the highest number of characters that can be defined if we limit ourselves to one byte per character.

More and more applications support the Unicode character set. The goal of Unicode was to create a character set that can be used for any language. Unicode allows up to $2^{32}$ characters and includes support for the standard ASCII characters along with characters for writing Chinese, Japanese, Greek, and many other languages and special symbols. The current Unicode standard defines 95221 characters.

Unicode support varies depending on the application and operating system. You can use Unicode characters in XHTML pages. The problem is often in entering them; your keyboard probably doesn’t have a key for ‘ψ’ (lowercase Greek letter psi). If your operating system has support for entering languages with different character sets, you can enter them that way.

If not, you can use HTML entities to create the characters. For example, a times symbol, ‘×,’ is Unicode character 215. It can be created with the entity &#215; . You can use character entities for any Unicode character in XHTML.
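To see the character-to-number translation in action, here is a small Python sketch. Python's built-in `ord` and `chr` convert between a character and its Unicode number; the helper `numeric_entity` is a hypothetical name invented for this example.

```python
# ord() gives the number for a character; chr() goes the other way.
print(ord("A"), ord("a"), ord(";"))
print(chr(215))

def numeric_entity(character):
    """Build an entity such as &#215; for a single character."""
    return "&#" + str(ord(character)) + ";"

print(numeric_entity("\u00d7"))   # the times symbol
print(numeric_entity("\u03c8"))   # lowercase Greek letter psi
```

The same function works for any character, since every Unicode character has a number.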

Text Files

When you save a file in a word processor, you really have no way of knowing how that file is converted to bits to be written to the hard disk. The word processor has to encode all of the formatting, text, figures, and everything else that can be included in a word processor document. As a result, word processors have very complicated sets of rules for storing information in files so that it can be opened later, preserving all of the formatting and everything else.

In this course, you have been using a text editor to work with your HTML files. Text editors are very different from word processors. In a text editor, there is no formatting or any other data types, just characters. The reason is that when a text editor opens a file, it takes one byte at a time from the file and converts it to a character in the editor.

That means that when you’re using a text editor, you are directly editing the characters corresponding to the bytes stored on disk. You can actually open up any kind of file in a text editor. If you open up an MS Word document with a text editor, you will see the characters that correspond to the actual bytes stored by Word when it saved the file. (Many text editors, however, won’t actually let you do this, or at least they will warn you that something isn’t right. They are assuming that opening a Word file in a text editor is a stupid thing to do. They’re probably right.)

The term text file generally refers to any file that you can edit with a text editor. This includes HTML files and the CSS and Python we will be writing later in the course.

This is the reason you don’t need a special application to edit HTML files. Any text editor can read the characters from the disk and let you change them. What you see in the text editor is exactly what gets transmitted to the web browser when you view the web page.
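The byte-by-byte conversion a text editor performs can be imitated in Python. This is a simplified sketch that assumes one byte per character (plain ASCII), as described above; real editors also handle multi-byte encodings. The name `bytes_to_text` is made up for this example.

```python
# The bytes a text editor might find in a very small HTML file.
data = bytes([60, 112, 62, 72, 105, 60, 47, 112, 62])

def bytes_to_text(raw):
    """Turn each byte into one character, the way a simple editor would."""
    return "".join(chr(b) for b in raw)

print(bytes_to_text(data))

# Going the other way gives back exactly the same bytes.
assert bytes_to_text(data).encode("ascii") == data
```

Nothing is added or hidden in either direction, which is what makes the file a text file.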

Fonts

Character sets allow the computer to keep track of characters internally by storing them as numbers. In order to display a character on the screen or anywhere else, the computer has to know what that character looks like. There are, of course, many ways to draw a letter. A font (or typeface) is a collection of drawings, one for each character. Some fonts that you might know of are Times, Arial, Helvetica, and Courier.

A font must contain an image of each character. Each of these is called a glyph, so a font is actually a collection of glyphs.

With large character sets like Unicode, it’s difficult for a single font to represent every possible character. It’s also not very practical: do you really want to take up the space on your hard drive needed for every font to have a full set of glyphs for Linear B and Ugaritic?

Generally, systems that support Unicode will try their best to find a glyph for the characters that are being used, even if they have to look in another font for it. Modern operating systems generally come with (or you can download) fonts that can fill in some of the common Unicode characters that aren’t in most fonts (like Chinese and Korean characters and mathematical symbols).


Check-Up Question

• Have a look at the “Code Charts” on the Unicode web site (linked from the “Other Links” section of the course web site). If you happen to speak/write a language that is usually hard to represent in a computer, see if it can be written with Unicode.

Topic 3.3 Graphics and Image Types

The term computer graphics refers to using a computer to create or manipulate any kind of picture, image, or diagram. There are many different ways to create computer graphics, and you should choose a program that fits your needs. We cannot cover them all here, so we will discuss some of the common concepts and terms.

There are two basic ways to store an image in a computer: vector graphics and bitmapped graphics.

You are probably most familiar with bitmapped graphics (sometimes called raster graphics). Bitmapped graphics are used in paint programs. When you use bitmapped graphics, the image is stored in an array of dots or pixels. Each pixel in the grid is assigned a colour. If there are enough pixels, the image will closely resemble the intended image.

The other basic image type is vector graphics. Vector graphics are used in drawing programs. A vector format stores a description of the image, using various shapes like circles, lines, curves, and so on. For example, a vector image could contain the information “a green line from here to here, a red filled circle with centre here and with this radius.”
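The difference between the two kinds of image can be sketched in Python. The shape-list layout and the `rasterize` function below are hypothetical, invented only for this illustration; real vector formats are far richer. The vector image is a short description, and the bitmap is a grid with one colour code per pixel.

```python
# A vector image: a list of shape descriptions (made-up format).
vector_image = [
    {"shape": "circle", "centre": (4, 4), "radius": 3, "colour": "R"},
]

def rasterize(shapes, width, height, background="."):
    """Convert the shape list to a bitmap: a grid with one colour per pixel."""
    bitmap = [[background] * width for _ in range(height)]
    for s in shapes:
        if s["shape"] == "circle":
            cx, cy = s["centre"]
            for y in range(height):
                for x in range(width):
                    # A pixel is inside the circle if it is close enough
                    # to the centre.
                    if (x - cx) ** 2 + (y - cy) ** 2 <= s["radius"] ** 2:
                        bitmap[y][x] = s["colour"]
    return bitmap

bitmap = rasterize(vector_image, 9, 9)
for row in bitmap:
    print("".join(row))
```

Changing the radius means editing one number in the vector description, while the bitmap has to be recomputed pixel by pixel.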

Some of the bitmap graphics programs you might know are Photoshop, Photo-Paint, Paint Shop Pro, GIMP, and Graphic Converter. Since most web graphics are bitmap graphics, these are the programs you might want to use for this course. Some vector programs are Corel Draw, Adobe Illustrator, and Acrobat. Most vector graphics programs have the ability to include bitmaps in vector images, but it doesn’t make them paint programs; the bitmaps are just one type of object that can be inserted into the vector image.


Check-Up Questions

• What graphics programs do you have on your computer? What have you used in the past? Do those programs use bitmap or vector images?

• Open up the bitmap graphics program you’re using for this course (probably the GIMP; you can find instructions in Appendices A and B). Open a new image and draw something.

Topic 3.4 Bitmap vs. Vector Images

Bitmap and vector images each have strengths and weaknesses when it comes to storing various kinds of images and to being manipulated in certain ways. Because of the way they store their images, they are also edited differently.

Bitmap Images

• are very flexible—can represent any image

• are created by scanners, digital cameras, and similar devices

• are used very commonly (on the web, etc.)

• can be displayed on the screen directly if one image pixel is to be the same size as one screen pixel

• take a lot of memory, since the colour of each pixel must be stored (but they can be compressed when saved).

Vector Images

• can change parts of the image easily since they are stored as separate shapes

• can rotate, change size, colour, line width, and so on smoothly (see Figure 3.2)

• are limited to shapes the program was designed to handle

• usually take less memory and disk space than a bitmap

• must be converted to a bitmap for display.


Figure 3.2: Scaling (a) a vector image and (b) a bitmapped image

Check-Up Questions

• If you have created any computer graphics in the past, did you use a bitmap or vector program? Looking at these strengths and weaknesses, do you think you made the right choice?

Topic 3.5 File Formats

There are many ways to store images. Each is called a file format. Once we have decided to use a bitmap or vector format, the program we’re using must still be told how to store and represent the information we need on the disk. Since there are many choices to make here, many different image formats have been created.

Each file format has a different way of converting the image information into a string of bits that can be stored on a disk or transmitted over the Internet. We won’t discuss the details of how each format works. The process is far too complicated for this course. We will just discuss some of the capabilities of common formats. The differences in file formats are the reason you can’t take a GIF file, rename it .jpg , and expect things to work. To convert a file from one type to another, a program must convert the image data from one type to the other.
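One way to see that the format lives in the file's contents and not in its name: GIF, PNG, and JPEG files each begin with a recognizable sequence of bytes. The sketch below checks those opening bytes; the function name `sniff_image_format` is invented for this example, and a real program would check much more than the first few bytes.

```python
def sniff_image_format(data):
    """Guess an image format from the first bytes of the file's contents."""
    if data.startswith(b"GIF87a") or data.startswith(b"GIF89a"):
        return "GIF"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    if data.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    return "unknown"

# The first bytes of a GIF file stay the same even if the file is
# renamed to "picture.jpg"; only the name changes, not the contents.
print(sniff_image_format(b"GIF89a" + b"\x00" * 10))
```

A program that trusts the .jpg extension and tries to read these bytes as JPEG data will simply fail.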

The program Graphic Converter for the Macintosh can handle 145 different graphics formats. Some have more strengths than others and are used more often.

Some common bitmap formats are GIF (graphics interchange format), JPEG (joint photographic experts group), PNG (portable network graphics), BMP (Windows bitmap), and TIFF (tagged image file format); most bitmap programs can read and write all of these examples.

Some common vector formats are SVG (scalable vector graphics), EPS (encapsulated postscript), CMX (Corel meta exchange), PICT (Macintosh Picture), and WMF (Windows metafile). Most vector editing programs can read and write most of these types.

Since there are so many formats to choose from, it can be difficult to decide on one. Here are some things to keep in mind:

• type: does the format store the type of image you want to use? (bitmap or vector)

• portability: can others use images in this format?

• colour depth: does the format store the number of colours you need?

• compression: compression can make a smaller file, at the cost of the compression/decompression time.

• transparency: do you need some parts of the image to be clear?

The first two of these considerations should be self-explanatory. The other three need some explanation.

Colour Depth

Image formats can only store so many different colour values, depending on the number of bits assigned to each pixel. The format must store the level of each of the three primary colours (or components) in order to specify the colour (we will explore this aspect more in Unit 4). The bit depth indicates the number of colours that can be used in the image.

A 24-bit image can indicate any one of $2^{24}$ colours for each pixel. Each of the three components is stored in 8 bits, giving $2^8 = 256$ possible levels. This is usually enough colours because most people cannot distinguish between two colours that differ only by one unit. Thus, full-colour photos look good when stored in 24-bit colour.

Similarly, 15-bit colour can have any one of $2^{15}$ colours, with $2^5 = 32$ possible values for each component in each pixel. It’s sometimes also referred to as 16-bit colour. Full colour images still look good in 15-bit colour, but you can usually tell the difference between the same image in 15- and 24-bit.

Formats with fewer colours are usually paletted or indexed images. Here, some colours are chosen to be in the image’s palette. The colours in the palette are the only ones that can be used in the image. For each pixel, we only need to store a number that refers to a position in the palette.

For example, an 8-bit image has a palette of $2^8 = 256$ colours. An 8-bit paletted image can use up to 256 colours chosen from the $2^{24}$ possible 24-bit colours. A 1-bit image can have $2^1 = 2$ colours, usually just black and white.
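The colour counts above, and the idea of a palette, can be sketched in Python. The names `colours_for_depth`, `palette`, and `pixels` are made up for this illustration, and the three-colour palette is far smaller than a real one.

```python
def colours_for_depth(bits):
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits

for depth in (1, 8, 15, 24):
    print(depth, "bits:", colours_for_depth(depth), "colours")

# A tiny paletted image: the palette holds (red, green, blue) values,
# and each pixel stores only a position in the palette.
palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]
pixels = [[0, 1, 1],
          [2, 2, 0]]

# Looking up every index recovers the full colour of each pixel.
full_colour = [[palette[index] for index in row] for row in pixels]
print(full_colour[1][0])
```

Each pixel in the paletted version needs only a small index instead of three full component values, which is where the space saving comes from.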

Using an image format with more colours gives you more flexibility, but it usually also means a bigger image file as well because more information must be stored for each pixel. It is often desirable to work in 24-bit colour and then convert to a lower bit depth to create a smaller file for storage or transmission.

If you try to store an image in a file that can’t hold as many colours as the image has, some decisions must be made. Usually, the graphics program will choose a set of colours that closely matches the ones in your image. For colours that aren’t found in the palette, the program can either pick the closest colour or do dithering.
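Picking the closest colour can be sketched as follows. This example assumes that "closest" means the smallest squared difference across the red, green, and blue components; that is one simple choice, and graphics programs may measure closeness differently. The function name `closest_colour` is hypothetical.

```python
def closest_colour(colour, palette):
    """Return the palette entry with the smallest squared RGB distance."""
    def distance(candidate):
        return sum((a - b) ** 2 for a, b in zip(colour, candidate))
    return min(palette, key=distance)

palette = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
print(closest_colour((250, 20, 30), palette))    # a slightly off red
print(closest_colour((200, 200, 210), palette))  # a light gray
```

With only four palette entries, a light gray ends up as pure white, which is the kind of visible change that dithering tries to hide.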

Dithering is the process of creating a pattern of pixels that fools the eye into seeing a colour that is between two of the available colours. See Figure 3.3 for an example, using black and white pixels to make a gray square. If you hold the page back, the dithered square on the right should look gray. If everything has gone well in the printing process, it should be about the same colour as the pure gray square on the left. If you look very closely at the pure gray square, you can probably see some fine dithering in it as well; this dithering is done during the printing process since printers and copiers only have black ink to work with.
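Here is a minimal sketch of one dithering method, often called ordered dithering, applied to a gray image that must be shown in black and white only. The threshold values and the name `dither` are choices made for this example; graphics programs generally use larger patterns or other methods.

```python
# 2x2 pattern of thresholds (gray levels run from 0 to 255).
THRESHOLDS = [[51, 153],
              [204, 102]]

def dither(gray_rows):
    """Map gray levels 0-255 to 0 (black) or 1 (white), pixel by pixel."""
    result = []
    for y, row in enumerate(gray_rows):
        out_row = []
        for x, level in enumerate(row):
            # Each pixel is compared with the threshold at its position
            # in the repeating 2x2 pattern.
            threshold = THRESHOLDS[y % 2][x % 2]
            out_row.append(1 if level > threshold else 0)
        result.append(out_row)
    return result

gray_square = [[128] * 4 for _ in range(4)]   # a mid-gray image
for row in dither(gray_square):
    print("".join("#" if pixel == 0 else "." for pixel in row))
```

For a mid-gray input, half of the pixels come out black and half white, in a checkerboard arrangement, so the square looks gray from a distance.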

Figure 3.3: Colour dithering

Compression

It is also possible to create a smaller file with compression. Without compression, bitmap images can be very large. For example, a 640 × 480 pixel image with 24-bit colour depth would take

$640 \times 480 \times 24$ bits = 7372800 bits = 921600 bytes = 900 kB.

Images this large would take a long time to transfer over the Internet. This is the reason the uncompressed Windows BMP format is not used on the web.
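The same calculation works for any image size. A short Python sketch, with a made-up function name, that ignores any extra header information a real file would contain:

```python
def uncompressed_size_bits(width, height, bits_per_pixel):
    """Bits needed to store every pixel with no compression."""
    return width * height * bits_per_pixel

bits = uncompressed_size_bits(640, 480, 24)
size_bytes = bits // 8           # 8 bits in a byte
kilobytes = size_bytes / 2**10   # 1024 bytes in a kilobyte
print(bits, size_bytes, kilobytes)
```

Trying other widths, heights, and bit depths shows how quickly uncompressed images grow.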

There are many different ways to compress bitmap image data. The various methods fall into two categories: lossless compression and lossy compression.

Most image compression techniques are lossless. When data is compressed and then uncompressed with a lossless algorithm, the result is exactly the same as the original.

When an image is compressed with a lossy algorithm and then uncompressed, it may be slightly different. This technique is often acceptable for images, particularly if the changes are so slight that they can’t be seen with the naked eye.
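The difference between the two categories can be demonstrated with Python's standard `zlib` module, which implements the lossless "deflate" method (the method PNG uses). The lossy step below is a deliberately crude stand-in invented for this example; it is not how JPEG works.

```python
import zlib

# Some 8-bit gray values from an imaginary image, repeated 20 times.
original = bytes([10, 10, 10, 10, 12, 13, 200, 200, 200, 201] * 20)

# Lossless: after decompressing we get exactly the original bytes back.
packed = zlib.compress(original)
restored = zlib.decompress(packed)
print(len(original), len(packed), restored == original)

# Lossy (crude): round each value down to a multiple of 16, so nearby
# grays become identical and the exact original cannot be recovered.
approximate = bytes((value // 16) * 16 for value in original)
print(approximate == original)
print(max(abs(a - b) for a, b in zip(original, approximate)))
```

The lossy version is no longer identical to the original, but every value is within 15 gray levels of where it started, which may or may not be visible.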

The JPEG format was created to store photographs at a high quality in very small files. It uses a lossy compression format. JPEG images might change some small details, but the changes can’t usually be seen, except at the lowest quality settings. See Figure 3.4 for an example of a small image (a) that has been compressed and uncompressed with a low-quality lossy format (b). Note that this isn’t a full-colour image; that’s why the JPEG compression has done such a bad job with it: it wasn’t made for that job.

The GIF format uses the LZW compression algorithm, and the Unisys corporation has a patent on this method. The PNG format was created so there would be a free lossless format available for use on the web. In this course, we recommend that you use PNGs so that you do not have to worry about misusing this patented format.


Figure 3.4: An image with a low-quality lossy compression

Transparency

Another feature that is important for some images is transparency. Some image formats can indicate that part of the image is to be transparent so whatever is behind it will show through. This technique is often used on web pages so that images appear to be shapes other than rectangles: the image is still a rectangle, but the background shows through some of it, making it seem otherwise.

Most image formats don’t support transparency, so all images appear rectangular, as in Figure 3.5(a).

Some formats, GIF in particular, support simple transparency. Some of the pixels in the image are marked as transparent, so the background is visible through those parts of the image. This technique can be used to make an image appear any shape; an example can be seen in Figure 3.5(b).

Finally, some image formats support a more general method of transparency called an alpha channel. The image has the image information and a mask, called the alpha channel, that indicates how transparent each part of the image is.

Formats and programs that support alpha channel transparency can do partial transparency, as seen in Figure 3.5(c). The PNG format supports full transparency, as do graphics programs like Photoshop and GIMP. Some web browsers that support PNG graphics can do full transparency; some can’t and so they ignore any advanced transparency information in the image.
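Partial transparency comes down to mixing the image's colour with the colour behind it, in proportions given by the alpha value. A minimal Python sketch, assuming alpha is a number from 0.0 (fully transparent) to 1.0 (fully opaque); the function name `blend` is made up for this example.

```python
def blend(foreground, background, alpha):
    """Mix two (red, green, blue) colours.

    alpha is the opacity of the foreground: 0.0 is fully transparent,
    1.0 is fully opaque.
    """
    return tuple(
        round(alpha * f + (1 - alpha) * b)
        for f, b in zip(foreground, background)
    )

red = (255, 0, 0)
white = (255, 255, 255)
print(blend(red, white, 1.0))   # fully opaque
print(blend(red, white, 0.0))   # fully transparent
print(blend(red, white, 0.5))   # partly transparent
```

Simple transparency, as in GIF, corresponds to allowing only the values 0.0 and 1.0 for each pixel.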


Figure 3.5: Various types of transparency in images

Topic 3.6 Common File Formats

Since some image formats are used more commonly than others, we will describe some of them and what they are usually used for.

These formats are the ones common on the World Wide Web:

• JPEG: JPEG (Joint Photographic Experts Group, pronounced jay-peg) is a bitmap format with lossy compression that is intended for either 24-bit colour or 8-bit gray-scale photographs: the kind of thing that would come out of a digital camera or from scanning a photograph. It does an excellent job of storing this kind of information but a poor job with other kinds of images. Figure 3.4 is an example of what happens when an image that isn’t a photograph is compressed with JPEG.

• GIF: The GIF (pronounced jiff) bitmap format uses a lossless compression algorithm. GIF is well supported on the web and can be used for simple animations and simple transparency. It is, however, limited to 256 colours in an image (8-bit colour). It is also burdened by a patent on its compression algorithm.

• PNG: The PNG (pronounced ping) bitmap format was created as a free replacement for GIF. It uses a better and free compression algorithm. It can also support up to 24-bit colour and full transparency. The PNG format came along after GIF, so it wasn’t supported in some older browsers. All recent browsers support PNG, so you should be able to use it safely.

The following formats, while not common on the web, are often used in other types of digital multimedia:

• BMP: The Windows BMP format is a bitmap format. It is usually uncompressed (for speed), but it can use a simple compression algorithm.

• TIFF: TIFF (Tagged Image File Format) is a common bitmap format among people who do publishing and graphic design. It can be compressed in several different ways and can hold colour depths up to 24 bits.

• EPS: EPS (Encapsulated Postscript) is a vector format. It is quite widely supported and can be used in most drawing programs.

• SVG: SVG (Scalable Vector Graphics) is a vector format that was created by the World Wide Web Consortium in the hopes of making vector graphics possible on web pages. SVG is discussed further in Topic 6.2.

Most computer graphics programs also have a file format of their own, usually called a native format. For example, Photoshop uses PSD files, the GIMP uses XCF files, Corel Draw uses CDR, and so on. These file formats use either no compression or a lossless compression scheme. They are designed so that they can store any of the information that particular program can produce.

So, you should be able to save these files and open them back up without losing any information. That is a good reason to use them while you are working on an image. When you’re ready to publish it or pass it along to others, you should probably convert it into one of the more universal formats described above.

Check-Up Questions

• If you have created any computer graphics in the past, what file format(s) did you use? Do you think you made the right choice(s)?

• What is the native format that your image editing program uses (if any)?


Summary

This unit isn’t intended to teach you everything about image editing or even how to use a particular program. You should know some of the basic terms and ideas behind computer graphics. With this background knowledge, you can easily learn more about graphics and how to use a graphics program for your assignments.

Key Terms

• bit

• byte

• binary

• character set

• ASCII

• Unicode

• text file

• font

• glyph

• bitmap graphics

• vector graphics

• colour depth

• paletted or indexed colour

• dithering

• compression

• lossless compression

• lossy compression

• transparency

• alpha channel


Unit 4

Cascading Style Sheets

Learning Outcomes

• Create and apply style sheets.

• Specify colours for HTML and style sheets.

• List some basic CSS properties.

• Use some more advanced HTML on web pages.

• Design web pages with valid, logical XHTML and create a visual style with CSS.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• Browse the CSS reference pages in “Online Tools.”

• Have a look at the RGB colour chart.

• (optional) Review Designing with Web Standards, Chapters 1 to 3, 5 to 7, and 9.

• Do Exercise 2.


Topic 4.1 CSS

In Topic 2.1, we discussed logical markup and how its appearance is determined with styles. The styles used in HTML are called cascading style sheets or CSS. Style sheets were added to HTML after it was initially created. Before that, you had to accept the default style of the web browser.

CSS is now quite well supported by the major browsers (at least CSS version 1 is). Netscape 6 and later, and Mozilla have near-perfect support for CSS features, which is why they are recommended for this course. Internet Explorer 6 has good support for CSS1.

Each browser gives a default style to each tag, which it uses unless you override it. That’s why you didn’t have to create a style sheet in order to view HTML you created.

A style sheet is usually placed in a separate file from your HTML document so that many HTML pages can reference it. That way, if you want to change the appearance of an entire site, you only need to change the style sheet, not each individual file.

A style can be applied to any tag, like <h1>, <p>, and so forth. Applying style rules to parts of your HTML page will allow you to change the appearance of your structural markup.

Writing Style Sheets

A style sheet is a collection of rules that make changes to the browser’s default style. There are three parts of each rule: a selector, some properties, and values. Those parts are put together like this:

selector {
  property: value;
  property: value;
}

There can be any number of property: value; parts in each rule. As in HTML, spacing in style sheets doesn’t matter: the lines above could have been written on a single line and the effect would be the same.

If you’re putting your style sheet into a separate file, you should write only your rules in a text editor and save the file with a name ending in “.css”. This extension tells the web server that it should be sent with the right MIME type, text/css.

h1 {
  text-align: center;
}
code {
  font-family: sans-serif;
  font-weight: bold;
}

Figure 4.1: A cascading style sheet

The selector specifies what the rule will be modifying. The selector can be the name of a tag, like h1 or hr. These selectors will make the rule change the appearance of all <h1> or <hr> tags.

The property indicates what is to be changed. For example, color is used to change the colour of whatever is selected, and font-size is used to change the size of the text.

Finally, the value indicates what the property should be changed to. Each property has a different set of possible values. For color, we could use the value red and for font-size we could use large.

A full listing of the CSS properties and their values can be found on the CSS reference pages, which are linked from the course web site in the “Online References” section.

Figure 4.1 shows a sample style sheet. This CSS will cause all <h1> lines to be centred and all <code> to be displayed in a bold sans-serif font.
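To make the three parts of a rule concrete, here is a Python sketch that assembles the text of the style sheet in Figure 4.1 from selectors, properties, and values. The function `format_rule` and the layout of `sheet` are invented for this illustration; you would normally just type the CSS yourself.

```python
def format_rule(selector, declarations):
    """Build the text of one CSS rule from a selector and a dictionary
    that maps property names to values."""
    lines = [selector + " {"]
    for prop, value in declarations.items():
        lines.append("  " + prop + ": " + value + ";")
    lines.append("}")
    return "\n".join(lines)

sheet = [
    ("h1", {"text-align": "center"}),
    ("code", {"font-family": "sans-serif", "font-weight": "bold"}),
]
print("\n\n".join(format_rule(sel, decl) for sel, decl in sheet))
```

Every rule has the same shape: one selector, then any number of property and value pairs between the braces.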

Applying a style sheet

There are several ways of applying a style sheet to a web page.

If the CSS is in a separate file, which we suggest for this course, you can add this HTML inside the <head> of your page:

<link rel="stylesheet" href="style.css" type="text/css" />

This line of code assumes that the style sheet is called style.css and is in the same directory as the HTML file. The <link> tag is used to reference a file that is related to this HTML page. The rel attribute gives the type of relation; in this case, the relation is “a style sheet for this page.” The href attribute is the URL where the file can be found; it can be a relative link (as in the example) or an absolute link. Finally, the type attribute gives the MIME type of the file since not all web servers send the right one for CSS files.

It’s also possible to indicate an “alternate” stylesheet that the user could select if they don’t like your default:

<link rel="alternate stylesheet" title="large print"
  href="large.css" type="text/css" />

In Mozilla, alternate stylesheets can be selected from “Use Style” in the “View” menu.

CSS rules can also be inserted into an HTML document with the <style> tag. If you follow this method, however, it is impossible to have that CSS referenced by more than one HTML page. So you lose the ability to change an entire web site with a single change to a CSS.

Check-Up Question

• Create a basic style sheet with a few rules, as in Figure 4.1, and apply it to an HTML page.

Topic 4.2 Classes and IDs

When you are creating HTML pages with CSS, you will often find that the existing HTML tags just aren’t enough to mark up your content. For example, suppose you were creating a web page where you wanted some of the paragraphs to be side-notes: parenthetical remarks for the reader that shouldn’t be confused with the main text.

In the Study Guide, side-notes are indicated in this way. Here, they are meant to replace the spontaneous remarks or explanations that would often happen in lectures. The purpose of the side-notes in this hypothetical web page would probably be the same, but they might look different.

The paragraphs that are part of the main text and the side-notes are both paragraphs, so they should be marked up with the paragraph tag, <p>. But, we would want to make the side-notes look different to give the reader a hint of their meaning.


In HTML, you can create a different look by indicating a class of a tag. Creating a class of a tag is almost like creating a new tag for your pages. A class is defined using the class attribute, which can be applied to any tag.

On our web page, we could do something like this:

<p>A very important thing to remember is...</p>
<p class="sidenote">Actually, it’s not important in
England because...</p>
<p>As long as you keep this in mind, ...</p>

By default, all of the paragraphs would look the same. But because we have indicated a class for some of them, their appearance can be changed without affecting the rest of the paragraphs. To select only a particular class of a tag, a class selector is used:

p.sidenote {
  font-size: smaller;
  color: #070;
}

When this style rule is applied, all <p class="sidenote"> paragraphs would be displayed in dark green and in a smaller font than the other paragraphs. Any CSS rules for paragraphs, p {...}, would still apply to <p class="sidenote"> paragraphs.

A new class can be used for any element on a page that has a distinct purpose and needs a distinct appearance. Classes can be selected in a style sheet with a tag.class selector.

A similar effect can be achieved by specifying an identifier for a particular element. An identifier is given with the id attribute. The difference is that an identifier must be unique on each page: a particular identifier can be used only once on an HTML page.

Think of an identifier as a name for an element of your page. For example, you might write:

<ol id="contents"><li>Introduction</li>...</ol>

You know each page will only have one table of contents, so using an identifier is okay. An identifier is selected with a CSS ID selector:

ol#contents {
  font-family: sans-serif;
  list-style-type: upper-roman;
}


This rule will cause the list to be set in a sans-serif font and numbered with upper-case Roman numerals (I, II, III, IV, V, ...).
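Since an identifier may be used only once on a page, it can be handy to check a page for repeats. The sketch below uses `HTMLParser` from Python's standard `html.parser` module; the class `IdCollector` and the function `duplicate_ids` are names made up for this example, and an HTML validator is the more thorough tool for this job.

```python
from collections import Counter
from html.parser import HTMLParser

class IdCollector(HTMLParser):
    """Record every id attribute seen while reading a page."""
    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for this tag.
        for name, value in attrs:
            if name == "id":
                self.ids.append(value)

def duplicate_ids(html_text):
    """Return the identifiers that appear more than once."""
    collector = IdCollector()
    collector.feed(html_text)
    return [i for i, count in Counter(collector.ids).items() if count > 1]

page = '<ol id="contents"><li>Introduction</li></ol><p id="contents">Oops</p>'
print(duplicate_ids(page))
```

A class, by contrast, would be allowed to appear on both elements.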

An added benefit to using an identifier is that it can be used as a fragment or anchor in a URL. A fragment is used to specify a position in the document, so a link can send the viewer partway down a document. If the above example is in a file called page.html, the relative URL page.html#contents would jump to that page, with the browser scrolled to the table of contents.

If you’ve used HTML before, you might be used to the <a name="contents"> method of creating an anchor. Using id for anchors was introduced with HTML 4 and is supported by all modern browsers.
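The two parts of a URL like page.html#contents can be separated with `urldefrag` from Python's standard `urllib.parse` module. This short sketch only splits the text of the URL; finding the matching element and scrolling to it is the browser's job.

```python
from urllib.parse import urldefrag

# Split a URL into the part that names the page and the fragment.
url, fragment = urldefrag("page.html#contents")
print(url)
print(fragment)
```

The fragment is the text after the # and should match an id on the page.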

So, how do you decide whether you should use a class or an id?

• If the element might appear more than once on the page, you shoulduse a class.

• If you want to use a fragment to jump to that part of the page, use anid.

• If you want to modify the element with JavaScript, you need to use anid to name it. (If you don’t know what that means, ignore it.)

If what you’re doing doesn’t fall into any of these categories, it probablydoesn’t matter which you use. You can also use other class and id on asingle element if appropriate.

<p class="aside" id="otherapp">Oddly enough, all of this

also applies...</p>

It’s generally a good idea to put a few id names throughout youdocument, perhaps on each <h2>, even if you don’t use them. Doingso will let others link to relevant parts of your document.

The Generic Containers

Sometimes it’s useful to be able to use styles or a lang attribute on an arbitrary block of HTML, not just a particular paragraph or phrase.

The HTML tags <span> and <div> were introduced just for this purpose; they are called generic containers. These tags do nothing unless they are modified by attributes. The <div> tag is a block-level tag, and <span> is an inline tag.

For example, we might want to produce a list of links that floats beside the main body of the text. Since there is no appropriate HTML tag for a navigation box, we would use the generic <div> tag. We can’t use <span> since it’s an inline tag. We would use HTML like this:

<div id="sidebar">
<p>...</p>
<p>...</p>
<ul><li>...</li><li>...</li></ul>
</div>

In the style sheet, we would have rules like this one:

div#sidebar {
    float: left;
    background-color: #ccf;
    font-size: small;
    width: 10%;
}

A web page that uses <div> and <span> can be found in the “Examples” section of the course web site.

When you’re choosing an HTML tag for a particular purpose, you should first look for a tag that indicates the meaning of the content you’re trying to mark up; it can be modified with a class if necessary. If there is no existing tag that fits the meaning of the content, you should use one of the generic containers.

Check-Up Questions

• Create a web page with a sidebar floating on a margin as described above. (Use <div> and the float property.)

• The generic tags <div> and <span> can be used anywhere you need to wrap something in a tag for CSS, etc. Were there any of these in your Assignment 1?


Topic 4.3 Some CSS Properties

You should look again at the online CSS reference. Here are a few properties that need a little more explanation than you will get browsing through the reference pages.

• font-family: used to set the typeface. The value should be a list of typeface names in order of preference. The browser will go through the list until it finds a typeface that it can display. Font names with spaces or punctuation should be enclosed in quotation marks. The list should always end with one of the five generic family names (serif, sans-serif, cursive, fantasy, monospace), which every browser should be able to match to something.

    font-family: "Century Schoolbook", Times, serif;

• font-size: sets the font size. The value can be either an absolute size (small, x-large, 12pt) or a relative size (larger, 120%). You should use relative sizes as much as possible to allow for users who have set a larger or smaller default font for their viewing.

    font-size: xx-large;
    font-size: 80%;

• background-image: used to set a background image for an element. When you use this property, note that you have to put the URL of the image inside url(). You should also set the background-color to something similar, in case the image can’t be loaded.

    background-image: url(bg.gif);

• margin-top: used to increase or decrease the space above an element. It can be used to add space for design reasons, or you might use a negative length to shrink the space or even make things overlap. See also margin-bottom, margin-left, margin-right.

    margin-top: 18pt;
    margin-top: -2ex;


• float: used to make an element “float” at a margin so the rest of the text flows around it; this property is common with images. You can indicate that the object can float on either the left or right margin.

    float: left;

You should use the float property any time you want to place elements beside each other on a web page.

• clear: When you have used the float property to make an image or something else float at the side of the page, you often want some text to appear below the floating image. The clear property indicates that the web browser should move down past the floating material to display whatever clear has been applied to. You can specify that the left, right, or both margins are clear.

clear: left;

clear: both;
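As a rough sketch of how float and clear work together, the rules below float an image at the right margin and push a later paragraph down past it. The class names photo and credits are hypothetical, chosen only for this illustration.

```css
/* Float the image at the right margin; text flows around its left side. */
img.photo {
    float: right;
    margin-left: 1em;    /* a little space between the image and the text */
}

/* Start this paragraph below anything floating on the right margin. */
p.credits {
    clear: right;
}
```

Using clear: both instead would move the paragraph below floating material on either margin.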

Check-Up Question

• Take the copy of Assignment 1 to which you previously applied a style sheet; try using some of these properties. Try some different combinations, and see if you can come up with a design you like. If you’ve used HTML properly, you should be able to change the entire look of your site without changing the HTML.

Topic 4.4 Specifying Colours

If you’ve ever mixed colours together, you’ve probably done it with paint. Paints and dyes are mixed with the primary colours cyan (blue), yellow, and magenta (red) in the CYM or subtractive colour model. However, paints and computer screens create colours differently. Paints absorb some of the light that falls on them and reflect the rest, while computer screens actually produce light. The primary colours of light are red, green and blue; this is the RGB or additive colour model.

In order to specify a colour for a computer screen, you need to specify the intensity of each of the three colours. The component values are usually specified in hexadecimal, a counting scheme with 16 digits (base 16).


We don’t need to worry about the details here. For web pages, what you really need to know is that the “digits” are (from lowest to highest):

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F.

The red, green, and blue components, each expressed with a hexadecimal digit, are combined in order, with a # in front. A lower digit means less of that component; a higher digit means more. So, #F00 is a colour with a lot of red and no green or blue: red. The code #FFF indicates a colour with a lot of all three components: white.

Here are some examples:

#F00   red
#0F0   green
#00F   blue
#FFF   white
#000   black
#777   grey
#700   dark red
#FF0   yellow

Originally, you had to write the colours with six hex digits. If you’ve done HTML before, you might be used to that. The three-digit short form is newer. Basically, you convert from three to six digits by repeating each one: red would be expressed as #FF0000.

There is a chart of RGB colour values in the “Online References” section of the course web site. You will need to use this scheme to specify colours in style sheets.

The RGB colour chart online uses the six-digit style. You can use either; they are treated exactly the same.
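To illustrate the two forms, each pair of rules below should give exactly the same result. The selectors and colours are arbitrary choices for this sketch.

```css
/* Three-digit form and its six-digit equivalent (each digit repeated). */
h1 { color: #F00; }
h1 { color: #FF0000; }

/* The same conversion for a colour with three different digits. */
a:visited { color: #529; }
a:visited { color: #552299; }
```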

You shouldn’t have to memorize a lot of colours in order to make a good guess at a colour code for a particular colour. You can figure one out by combining colours you already know.

For example, suppose you want a colour code for pink. Pink is a mixture of red and white:

red #F00

white #FFF


For each of the red, green, and blue parts, we can average the values for red and white together to get pink. Remember that 7 is about halfway between 0 and F: #F77. We could continue this process and average white and pink together to get a lighter shade of pink: #FBB.
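The averaging can be written out digit by digit. The sketch below shows the arithmetic in comments, treating each hex digit as a number from 0 to 15 (so A is 10, B is 11, and F is 15); the class names pink and lightpink are hypothetical.

```css
/* red    #F00  ->  R = 15, G = 0,  B = 0
   white  #FFF  ->  R = 15, G = 15, B = 15
   average:         R = (15 + 15) / 2 = 15   -> F
                    G = (0 + 15) / 2  = 7.5  -> about 7
                    B = (0 + 15) / 2  = 7.5  -> about 7 */
p.pink { background-color: #F77; }

/* pink   #F77  ->  R = 15, G = 7,  B = 7
   white  #FFF  ->  R = 15, G = 15, B = 15
   average:         R = 15                   -> F
                    G = (7 + 15) / 2 = 11    -> B
                    B = (7 + 15) / 2 = 11    -> B */
p.lightpink { background-color: #FBB; }
```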

Note that if you’re going to set the colours in the document with CSS, you have to be careful that you do not conflict with the settings of some users. For example, suppose you have set the colour of the document text to black:

body {
    color: #000;
}

If users who have set their defaults to white text on a black background visit your site, they will get black text on a black background. To avoid this possibility, if you’re going to set any document colours, you should set them all. Doing so will make sure all your colours fit together well. You should make all five of these property changes or none of them; the values given are close to the defaults for most browsers:

body {
    background-color: #FFF;
    color: #000;
}
a:link { color: #00E; }
a:visited { color: #529; }
a:active { color: #F00; }

Check-Up Questions

• Have a look at the RGB colour chart online. Try to make sense out of the way the colours are mixed together.

• In your image editor, open up the “colour picker” (usually, you click on the active colour in the toolbar). There should be an RGB colour selector; play with it.

• Try using the colours above on a web page and make sure the descriptions here match.


Topic 4.5 A CSS example

Figure 4.2 shows a complete HTML page with a <link> to a style sheet. Figure 4.3 shows the style sheet that it references.

The display of this page in one browser is shown in Figure 4.4. The display of the same page without the style sheet is shown in Figure 4.5.

The style sheet in Figure 4.3 uses a few CSS features that we haven’t looked at yet. The <a> tag has some special selectors associated with it, known as pseudoclass selectors. The a:link selector determines the appearance of a link that the user hasn’t visited; the a:visited selector applies to a link that the user has previously been to; a:active applies to a link that the user is currently clicking on.

The third line from the bottom of Figure 4.3 uses a contextual selector, h1 em. This rule applies to text in an <em> tag that is itself inside an <h1>. This rule causes the characters “H1” in the heading to have a white background instead of the grey background that the other <em> has.
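As another sketch of contextual selectors, the rules below would affect only content inside the sidebar from Topic 4.2, assuming the page contains a <div id="sidebar"> like the one shown there. The property values are arbitrary choices for illustration.

```css
/* Applies to every <p> inside <div id="sidebar">, but to no other paragraph. */
div#sidebar p {
    font-style: italic;
}

/* Applies to list items inside the sidebar only. */
div#sidebar li {
    list-style-type: square;
}
```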

Figures 4.2 to 4.5 can also be found in the “Examples” section of the course web site.

Check-Up Question

• Make a copy of your Assignment 1. Create a simple style sheet (you could use Figure 4.1 and add some rules of your own) and apply it to your page.

Topic 4.6 Logical versus Physical

As we explained in Topic 2.1, logical markup is used to indicate the meaning or purpose of the content, and a style is used to indicate the appearance. Physical markup only indicates the appearance.

This difference is a useful way of determining whether particular markup is logical or physical. If there is only one appearance that makes sense for a particular piece of markup, it’s physical. If you could give it another appearance with a style sheet (and still have it make sense), it’s logical.

For example, consider the following HTML fragments:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>CSS Example</title>
<link rel="StyleSheet" href="style.css" type="text/css" />
</head>
<body>
<h1>The <em>H1</em></h1>
<p>
This is an <em>exciting</em> paragraph. As they say,
<span class="latin">caveat emptor</span>.
</p><p>
A style sheet is referenced like this:
<code class="html">&lt;link rel="StyleSheet" href="style.css"
type="text/css" /&gt;</code>
</p>
<p><a href="somewhere.html">A link somewhere</a></p>
</body></html>

Figure 4.2: HTML source of a page that references a style sheet

body {
    background-color: #fff;
    color: #000;
}
a:link { color: #00e; }
a:visited { color: #529; }
a:active { color: #f00; }
h1 {
    border-top: medium solid;
    text-align: center;
}
em {
    background-color: #ccc;
    font-style: italic;
}
h1 em { background-color: #fff; }
span.latin { font-style: italic; }
code.html { font-weight: bold; }

Figure 4.3: The style sheet, style.css, referenced by Figure 4.2


Figure 4.4: The display of Figure 4.2, with the style sheet

Figure 4.5: The display of Figure 4.2, without the style sheet


Figure 4.6: Two possible appearances for contents of <del> or <strike>.

I am <del>very</del> unhappy about that.

I am <strike>very</strike> unhappy about that.

The <del> tag is used to indicate deleted text. So, the first fragment indicates that we intend to delete the word “very.”

Most browsers display the contents of the <del> tag with a “strikethrough” line. The <strike> tag indicates strikethrough text. So, both <del> and <strike> look the same in most browsers.

Suppose we used a style sheet to change the appearance of both so they were displayed as greyed-out text, instead of strikethrough (see Figure 4.6).
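A style sheet rule along these lines could produce the greyed-out appearance. It is only a sketch: the exact grey is an arbitrary choice, and the rule used to produce Figure 4.6 may have been different.

```css
/* Replace the usual strikethrough line with grey text. */
del, strike {
    text-decoration: none;   /* remove the line through the text */
    color: #777;             /* grey, from the colour examples in Topic 4.4 */
}
```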

The change doesn’t make any sense for the <strike> tag: we asked for strikethrough text but got something completely different. But, for the <del> tag, the change in appearance is perfectly reasonable.

So, the <del> tag is logical; <strike> is physical. You should have been able to guess that this was the case from the fact that <del> gives a meaning (proposed deletion), but this test gives you another way to determine which is which.

Things to Avoid

You shouldn’t fall into the trap of assuming that just because you’re using logical tags, you’re doing logical markup. Remember, the point of the logical tags isn’t to achieve a desired appearance but to indicate the meaning of the content.

• Use of tables for layout. The HTML <table> tag is intended to mark up tabular data as in Figure 3.1. Tables are things with rows and columns of data, not a way to create boxes for visual design.

• Abuse of the heading tags. The heading tags, <h1>–<h6>, should not be used just because the default style is what you’re looking for. The contents of these tags should be the title of the text below them, not a paragraph or a footer.


• The line break tag. The <br /> tag should only be used to break a line when it’s necessary for the meaning of the text, not to achieve a visual appearance. If you’re formatting a mailing address or computer code, <br /> should be used to separate the lines. If you want to control the length of the lines in a paragraph, you should use CSS.

You should also watch for physical attributes on logical tags. Most of these are deprecated, so they won’t validate.

Check-Up Question

• Go back to your Assignment 1 and see if you’ve misused any logical markup. Did you use any physical markup?

Topic 4.7 Why Logical HTML and CSS?

Why do we make such a big deal over using logical markup? Why shouldn’t web authors just use physical markup and indicate the appearance of their pages?

Many web authors make the mistake of assuming that what they see when they are designing their pages is exactly what the user will see. This isn’t always going to be the case. Pages will look different in different browsers, with different fonts, and so on.

Many authors, however, don’t consider uses of their pages in ways other than display in mainstream desktop browsers. Consider these possibilities:

• Speech-based browsers, like IBM’s Home Page Reader, are used by people with impaired vision. These browsers can use structural markup to guess at the meaning of particular parts of the page and adjust the voice accordingly. For example, an emphasized word will be pronounced differently than the rest of the text.

• Small-screen browsers, like those on a cell phone or palmtop, probably won’t be able to display all of your images and fonts the same way as a desktop browser. If you’ve used decent logical markup, they will be able to make some reasonable guesses about how to display your page and convey the right meaning to the reader.


• Search engines can use structural markup to guess at the important points on your page. If a word occurs frequently in your page’s headings, a search engine could guess that it’s the main topic of your page.

Physical markup is much harder (or impossible) to interpret for these purposes. When you are designing web pages and creating your HTML markup, you should keep these alternate uses in mind.

Another benefit of properly created logical HTML is a decrease in file size. When designers use physical markup, they will often specify the physical appearance of every paragraph on the site individually, for example:

<p><font face="Arial" size="-1"><b>...</b></font></p>

If you do the same thing with style sheets, you only have to specify the style information once for your entire site. The markup required for every paragraph is simplified to

<p>...</p>

and results in much smaller HTML files, which will load faster for users. If you are paying for your web server space, it also means less bandwidth, which will cost less.
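A style sheet rule along the following lines could stand in for the <font> and <b> markup above. It is only an approximation: font-size: smaller is assumed here as a rough equivalent of size="-1", and sans-serif is added as the generic fallback recommended in Topic 4.3.

```css
/* Written once in the site's style sheet, instead of in every paragraph. */
p {
    font-family: Arial, sans-serif;
    font-size: smaller;
    font-weight: bold;
}
```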

The optional readings in Designing with Web Standards cover this topic in much more detail than is possible here.

Summary

This unit has a lot of material in it. If you feel a little overwhelmed, that’s okay. Wait a little while, and go through it again.

After you complete this unit, you should be able to create valid, logical XHTML and incorporate style sheets into it appropriately. You should be able to create valid style sheets now. While you are doing Assignment 2, you should experiment with style sheets and become more comfortable with them.

Key Terms

• CSS

• rule

• selector

• property


• value

• class (of a tag)

• identifier (of a tag)

• generic container

• RGB colour


Unit 5

Design

Learning Outcomes

• Explain some principles of design.

• Explain how they can be applied to web pages.

• Apply some basic design ideas to web sites.

• Evaluate the design of web sites and other materials.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• (optional) Read the first half of The Non-Designer’s Design Book.

• Do Assignment 2.

Topic 5.1 General Design

When we discuss how to design web pages, we must first describe some general principles of design. This section discusses design for any medium.

The points made here are based on the first half of The Non-Designer’s Design Book by Robin Williams. Williams lays out four principles of design for creating well-designed documents. The four principles are proximity, alignment, repetition, and contrast. Used together, these ideas can help you change a poor design into one that is visually pleasing and easy to get information from.

There are a few other things you should do when you are designing anything. Keep in mind that the whole point of your design is to make it easy for viewers to get the information they want. Many web pages fail here: how often do you find yourself hunting through a page for the link you want? Remember, it’s all about information.

When you’re working on your design, have other people look at it. What’s the first thing that they see? Is that what you want them to focus on? Can they find the information you want them to get from your presentation?

Finally, you should design for your medium. Surfing the web, you get the impression that a lot of people don’t know that the World Wide Web isn’t a magazine or television. Every medium has its own strengths and weaknesses when it comes to getting a message across. You have to keep these aspects in mind so you don’t try to force your design into a medium where it doesn’t fit.

Proximity

Stated concisely, you should “group related items together” (p. 15). The elements on your page should not be scattered randomly or all grouped tightly together.

Related items should be placed near each other, and unrelated items should be separated by some space. If you follow this principle, those viewing your document should realize what items are related before they start to read. It will make it easier for them to scan your document and find the relevant information.

Many inexperienced people seem to be afraid to leave any blank space on their page (whitespace). Whitespace helps separate unrelated topics, and you shouldn’t avoid it.

When it is properly used, proximity helps your document look organized. Thinking about how items should be grouped together might even help you understand the organization of your material better.

Figure 5.1 is an example of a design that illustrates the concept of proximity. Notice that the contact information (the address, phone number, and web site) is grouped together. Also, the logo and name of the business are separated from this information.


Figure 5.1: A design illustrating proximity

Alignment

“Alignment” refers to the position of something on the page. Williams states that “nothing should be placed on the page arbitrarily. Every item should have a visual connection with something else on the page” (p. 31). A common mistake beginners make is placing elements wherever they fit without considering the way they line up with other things on the page. Remember that everything should line up with something else.

Alignment should create a “line” on the page that the eye can follow. It should also make your page look organized. This “line” will give the reader a good idea of how to follow the information. Centring text doesn’t always do a good job of creating this line.

Alignment is illustrated in Figure 5.2. On this page, the left sides of the rules and heading are aligned, as are the left sides of the quote and main text. The right sides of the rules and main text are aligned.

Of course, you don’t have to align everything on the page. In Figure 5.2, the title and the text do not have the same left alignment.

Repetition

The idea here is to “repeat some aspect of the design throughout the entire piece” (p. 49). For example, you might use the same distinctive font, rule, bullet, or colour in your entire presentation. You can repeat anything that the reader can recognize.


Figure 5.2: A design illustrating alignment


Figure 5.3: A design illustrating repetition

Along with alignment, repetition will unify your presentation so it all looks like part of the same creation. A repeated element gives the user something to hang on to and gives the presentation a consistent feel.

On the other hand, you shouldn’t repeat too much. Everything in your presentation shouldn’t look the same.

In Figure 5.3, the fonts for the headings and lists are repeated, as are the bullets. Notice that there are also several repeated items from Figures 5.1 and 5.2. These include the heavy horizontal rule, fonts, and logo. These elements make it clear that all of these designs fit together.

Contrast

“If two items are not exactly the same, then make them different. Really different” (p. 63). The reader should be able to tell at a glance what parts of the page serve the same purpose.

Contrast can be created by using a different typeface, colour, background, border, and so on. These differences should be obvious; nothing should be just a little bit different from something else.

A common problem in word processing documents is using a 12pt body font, with 14pt headings. In this case, it is too difficult to distinguish between a heading and a short paragraph. It would be better to have the headings 16pt, bold, and in a (noticeably) different font or colour.

You shouldn’t be afraid to try something new with your design to create contrast. The worst thing that could happen is that it will look ugly and you will have to change it back. Remember, when you’re working with a computer, you can do that.

As is the case with repetition, you should not have too much contrast. The point is to make some elements stand out. If everything is in a different font and colour, nothing will stand out, and your whole presentation will look cluttered.

In Figure 5.4, there are three types of text on the page: the heading, the main text, and the questions for the reader. These groups are distinguished by their fonts and background colour, which makes it clear that each has a distinct purpose on the page.

Check-Up Questions

• Look at some ads in a magazine or newspaper. Find some that use these four principles well and some that don’t. How would you change the poorer ones?

• Critique your Assignment 1 with respect to these principles. How did you do?

• Critique Figures 5.1 to 5.4 for all four design principles.

Topic 5.2 Design Principles and HTML/CSS

Williams’s design principles can be used when you are designing web pages. Remember that the web isn’t a magazine or poster; design for the web is different. No matter how much you might want to, you can’t use the same visual layout on a web page that you use for a poster. (On the other hand, you can’t include a hyperlink on a poster.)

Figure 5.4: A design illustrating contrast

If you stay within the confines of what HTML and CSS are intended to do, you can still do some very interesting design. In addition, if you do, your web page will look and act like a web page, not like a print or TV ad. It will be a lot less confusing for your users.

To achieve proximity on a web page, you have to think about how your information fits together. You can then separate the different groups of information using headings and the CSS margin properties. When you are creating a web site, you also have the option of moving some information to another page, which certainly qualifies as separation.

Web pages usually have a strong left alignment with no style sheet changes. You can further affect alignment using the text-align property for text and the float property for images and other elements.

CSS can easily be used to create repetition. You can create a distinctive style for headings, links, or other common elements on your pages. You can also use the list-style-type and list-style-image properties to give all of your lists a distinctive bullet. It is also easy to use CSS to use a particular colour for accents on your pages (links, borders, etc.).

Finally, contrast is easily lost on web pages. You should be careful that your headings do not look too similar and that they stand out from other elements. You should also make sure that your links stand out from the rest of your page. Some web authors make links look almost exactly like the surrounding text, which makes it difficult for users to find them.
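As a loose sketch of how the four principles might translate into a style sheet, the rules below use only properties mentioned in this unit and in Unit 4. The class name figure, the colours, and the sizes are arbitrary assumptions for illustration; they are not taken from the course examples.

```css
/* Proximity: extra space above each heading separates one group of
   information from the previous one. */
h2 { margin-top: 3em; }

/* Alignment: float images at the right margin so the text keeps its
   strong left alignment. */
img.figure { float: right; }

/* Repetition: the same bullet and the same accent colour everywhere. */
ul { list-style-type: square; }
h1, h2 {
    color: #700;
    border-top: medium solid #700;
}

/* Contrast: headings clearly different from body text and from each other. */
h1 { font-size: 200%; font-family: sans-serif; }
h2 { font-size: 140%; font-family: sans-serif; }
```

Whether these particular values look good together is a matter for experiment; the point is only that each principle maps onto a small number of rules.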

Topic 5.3 Web Page Design

In addition to the design points described above, there are some others that you should keep in mind when you are designing web sites. Some of them can also be applied to other types of multimedia design.

For web sites in particular, you must always remember that HTML and the web are not WYSIWYG. HTML doesn’t specify every aspect of a page’s appearance, and browsers aren’t required to display a page exactly as specified. So, you cannot design in the same way you do for other media. On the web, you cannot specify the appearance of your pages exactly.


There are two ways to deal with this problem. You can understand it andtake it into account when designing your site, or you can stop making webpages. This lack of control causes stress for many people who have designedfor other media. They frequently ask: “How do I get my page to always looklike this?” The answer is, “You can’t.”

Type

Cascading style sheets can be used to suggest changes in typeface. As wementioned earlier, we won’t be too concerned about choosing typefaces inthis course. If you’re interested, you can have a look at Part 2 of The Non-Designer’s Design Book. This is an excellent starting point for learning todesign with type.

For the purposes of this course, you should keep one thing in mind: yourtext should be easy to read. There are are a few things you can do whendesigning web pages to ensure that they are.

First, don’t make your text too small. Small fonts are very hard to readon computer screens. All web browsers have a default text size that has beenchosen to make the text easily readable. You shouldn’t shrink the type foryour body text. It may be appropriate to shrink small amounts of text likethe page footers or captions under an image, but you should leave your mainparagraph font at the default size.

If you are going to change the typeface, you should be careful about itsreadability. Decorative fonts should only be used at larger sizes and not forlong passages of text. This kind of font might be appropriate for headings,but it should never be used as a body font; it’s too hard to read in largeamounts. For your body font, you should choose a simple serif or sans-seriffont.

When you are specifying fonts on a web page, you must remember that not every browser will have the same fonts available. The method of specifying fonts in CSS takes this factor into account; you should list the font you really want to use, followed by some reasonable alternatives, followed by one of the five generic font names. For example, if you want a sans-serif body font with a particular look, you might specify it in CSS as follows:

font-family: Optima, "Zapf Optima", AvantGarde, Tahoma,
    Arial, Helvetica, sans-serif;
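The fallback behaviour of a font list can be pictured with a short Python sketch. This is only a simplified model and not how any real browser is implemented: the function name pick_font and the sets of installed fonts are made up for illustration, and Python 3 syntax is assumed.

```python
# Simplified model of font fallback: use the first listed font that is
# available. Generic names such as "sans-serif" are always available.
GENERIC_FONTS = {"serif", "sans-serif", "monospace", "cursive", "fantasy"}

def pick_font(font_list, installed):
    """Return the first font in font_list that this (hypothetical) browser can use."""
    for name in font_list:
        if name in installed or name in GENERIC_FONTS:
            return name
    return None  # nothing usable was listed; a real browser would use its default

font_list = ["Optima", "Zapf Optima", "AvantGarde", "Tahoma",
             "Arial", "Helvetica", "sans-serif"]

# Two hypothetical machines with different fonts installed.
print(pick_font(font_list, {"Arial", "Times New Roman"}))
print(pick_font(font_list, set()))
```

Ending the list with a generic name means the search always succeeds, which is why the generic name goes last.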


Colour

A serious discussion of designing with colour is also beyond the scope of this course. We will provide a few tips to help you begin to work with the colours on your web page. You can find some comparisons of various colour schemes in the “Examples” section of the course web page. You might want to have a look at those examples as you read this section. It will be easier to understand the descriptions if you have some examples to work with.

Many web pages lack contrast between their text and background colours. The colours you use for your background and text (and everything else that’s on the background) should be quite different. If not, the text and other foreground material won’t stand out well, which will make it hard to read or even hard to see. Many beginning web page designers put black text on a dark background, which is hard to read. Simply switching to white text improves the look of the page greatly.
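Contrast between two colours can also be estimated numerically. The following Python 3 sketch uses the contrast-ratio formula from the W3C’s Web Content Accessibility Guidelines (WCAG) 2.0, which is not something this guide covers; the function names and the dark blue background are assumptions chosen only for illustration.

```python
def channel(value):
    """Convert one 0-255 colour channel to its linear-light value."""
    c = value / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb):
    """Relative luminance of an (r, g, b) colour, from 0 (black) to 1 (white)."""
    r, g, b = (channel(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(colour_a, colour_b):
    """Contrast ratio between two colours, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted((luminance(colour_a), luminance(colour_b)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

BLACK, WHITE = (0, 0, 0), (255, 255, 255)
DARK_BLUE = (0, 0, 102)   # an arbitrary example of a dark background

print(contrast_ratio(BLACK, DARK_BLUE))   # black text on the dark background
print(contrast_ratio(WHITE, DARK_BLUE))   # white text on the same background
```

The white-text ratio comes out far higher than the black-text ratio, which matches the advice above. WCAG suggests a ratio of at least 4.5 for ordinary body text.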

Another common pitfall is making the pages too bright. Fluorescent orange isn’t a colour you should use as the background of your page. If you start with some very subtle colours and add some highlights, you’ll probably get good results (especially if you keep the “repetition” idea in mind and use the same colour throughout your site).

Finally, even if you keep these things in mind, some colours just don’t look very good together. The best thing to do is to experiment with different colour schemes and find one that works.

Check-Up Questions

• If you haven’t done so already, look at the colour comparisons on the course web site.

• Have a look back at your Assignment 1. How did you do with respect to these design ideas?

Topic 5.4 Usability

Even if your web page looks nice, it’s still very easy to make it entirely impossible for someone to use. Many of the observations here are based on the findings of Jakob Nielsen. He publishes a bi-weekly column called Alertbox on web usability; there is a link to it in the “Links” section of the course web site. He has also published a book called Designing Web Usability.

Don’t break expectations

People who come to your site have certain expectations about the way web pages work and the kind of information they will find there. If you don’t meet those expectations, users are likely to leave. Nielsen sums this situation up with the comment that “Users spend most of their time on other sites.”

Many people have the urge to make their web site “stand out” by making it look and work differently from other sites. Such innovations make their web sites difficult to navigate. Web sites that don’t work like most other sites force the user to learn new ideas for a single site.

You should ask yourself: “Will visitors take the time to figure out how my site works?” If your site works more or less the way other web sites do, they won’t have to learn anything new, so the answer is definitely “Yes.” The less you meet users’ expectations, the more likely it is that the answer will be “No.”

Nielsen has identified some common mistakes that violate users’ expectations:

• Don’t change colours too much. With style sheets, it’s possible to change the colour of the links on a site. Don’t do that: users expect links to be blue or purple. If your links are another colour, users may not notice them.

• Leave the Back button alone. There are several ways to make the browser’s Back button not work as expected. The most common is to have links open in a new browser window. If users want a link in a new window, they can open it that way themselves; let them browse the way they want. Besides, new windows are annoying.

• Don’t use frames. Frames are bad for several reasons. The biggest problem involves URLs. When the web was designed, the idea was that a URL would describe the page that you were currently looking at. On pages with frames, the URL identifies only the original frameset, not what is currently in the frames themselves. So, bookmarks for framed pages do not work correctly.


Not everybody is you

When they create web pages, many people seem to start out with the belief that everyone has the same system as they do. That’s not true. Web surfers use every imaginable combination of computer and web browser. If you use HTML and CSS properly, there shouldn’t be any problem, but it’s easy to exclude readers with your design.

• Don’t rely on new technology. Some web sites won’t work if the user doesn’t have JavaScript or Flash or something else enabled. Do you really think people will download a plug-in or enable JavaScript just to visit your site? There is no excuse to require these things to view any web site. They can be used to enhance a site, but they shouldn’t be required.

• Don’t rely on a particular browser. Nobody’s going to download another browser to view your site. Ever.

• Don’t design for a particular screen size. Some users have small screens and some have huge screens. Also, not everyone browses with their web browser taking up the entire screen. Try looking at your page with your web browser at a few different sizes. It might look a little funny, but the information should still all be there and readable.

Don’t be annoying

It’s amazing that people forget this point. There are several things you often see on web pages that are just plain annoying.

• Use animation and movement cautiously. People who are actually trying to read the content of your site can be easily distracted by flashing text, moving animations, and the like. You should only use animation on your site for a good reason. Providing animated eye-candy isn’t a good reason.

• Keep files small. When a user visits your site, they have to download the HTML file and all the images on the page. If your page has a lot on it, downloading can take a long time. Web users are impatient, and they will often back up rather than wait for a big site to load.


• Don’t move your pages. Many web authors seem to think that the only way people can get to their site is through their own links. If you move a page from one URL to another, it might work correctly from within your own site, but links from other pages and search engines will no longer work. See Topic 11.2 for information on automatic redirects.

What works?

The most commonly used sites on the web follow most of the rules and avoid most of the pitfalls outlined above. Good usability is part of the reason they are so popular: people can use these sites to find the information they want with a minimum of headaches.

The Google search engine (http://www.google.com/) is one of the most popular sites on the web (although usage is very hard to measure). If you look at Google’s site, you will find that they have obeyed all of the rules outlined above. Their pages are very simple and present the information that the user wants without a lot of other stuff that gets in the way. The entire web site works like a web site; there’s almost nothing that a user who is familiar with the web would have to learn to use Google.

Also, don’t forget the other big reason that people use Google: it has information people want. Always keep your potential readers in mind, and keep your site relevant to them.

We won’t mention web sites that have particularly bad design for two reasons. First, it’s not very nice. Second, many of the designers realize that their potential users aren’t coming back and fix their sites. So, by the time you read this, the sites might have a much better design.

There are many other suggestions and warnings on Jakob Nielsen’s Alertbox web site. It’s an excellent resource for people who want to make good web sites.

Topic 5.5 Web Site Design

In addition to the general ideas about design presented above, there are other considerations to keep in mind when you are designing a web site.

There are millions of web sites out there. Why should someone stay at yours? First, you should have something interesting to say; if not, nobody’s going to go to your site in the first place. Once someone is looking at your web site, using it should be a pleasant experience and it should be easy for users to find the information they want. You should be designing for the user’s needs. This is called “user-centred design.”

Each page should include navigational aids and links to related pages where the user can find more information. These should be consistent on each page of your site. You should also make sure you provide the user with some context. They should be able to figure out where they are in your site, so they can find more information more easily.

Here are some things you should make sure you have on each page of your site:

• enough information so the page can “stand alone.” You shouldn’t assume that the user knows the topic you’re discussing, since they might have come from an outside link or search engine.

• links to the rest of your site. You shouldn’t have any “dead end” pages. Remember that users from outside links might want to see the rest of your site.

• information about the author. Anybody can write anything on a web page, whether they know anything about the subject or not. If the user can figure out who you are, that’s a good first step in trusting you.

• last update information. Pages on the web can get out of date and have no indication that the information isn’t current. Make sure the user can figure out when you last updated your page.

Summary

This unit should help you design good web sites and other multimedia presentations. A lot of guidelines have been presented here, and it’s hard to keep them all in mind at once. When you’re doing Assignment 2, look back through this unit and try to put the guidelines into practice.

Key Terms

• proximity

• alignment

• repetition

• contrast

• usability


Unit 6

XML

Learning Outcomes

• Explain how to store various information with XML.

• List some examples of XML schema.

• Use CSS to indicate style information for XML.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

Topic 6.1 What is XML?

XML stands for “eXtensible Markup Language.” XML is a structured markup language that can be used to mark up any kind of data.

XML looks a lot like HTML. It uses the same kinds of tags, entities, attributes, and so on. The difference is that there are no tags built into XML, so XML documents can contain any tags and attributes. When you are using XML, you can define your own tags to represent any information. Every document must have a root element, which is wrapped around everything else.

<recipe author="Greg Baker">
  <title>Guacamole</title>
  <ingredients>
    <item>4 ripe avocados</item>
    <item>5 cloves garlic, minced</item>
    <item>125 mL can minced jalape&#241;o peppers</item>
    <item>1 tsp salt</item>
    <item>1/2 tsp pepper</item>
    <item>1 tbsp lime juice</item>
  </ingredients>
  <steps>
    <item>Scoop out the flesh of the avocados into a bowl.</item>
    <item>Combine all of the other ingredients in the bowl.</item>
    <item>Mash with a potato masher until the desired consistency is
      reached.</item>
  </steps>
</recipe>

Figure 6.1: A recipe marked up in XML

For example, suppose you want to represent a recipe in XML. A recipe has two main parts: the list of ingredients and the list of instructions to follow to complete the recipe. Each list will have several items.

An example of how a recipe could be marked up is shown in Figure 6.1. (By the way, it’s also a good Guacamole recipe.) In Figure 6.1, the root element is <recipe>.
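Because the recipe is ordinary XML, a general-purpose XML library can read it without knowing anything about recipes. The sketch below uses Python 3 and its standard xml.etree.ElementTree module on a shortened copy of Figure 6.1; the variable names are chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the recipe from Figure 6.1.
RECIPE = """<recipe author="Greg Baker">
  <title>Guacamole</title>
  <ingredients>
    <item>4 ripe avocados</item>
    <item>125 mL can minced jalape&#241;o peppers</item>
    <item>1 tsp salt</item>
  </ingredients>
  <steps>
    <item>Scoop out the flesh of the avocados into a bowl.</item>
    <item>Combine all of the other ingredients in the bowl.</item>
  </steps>
</recipe>"""

root = ET.fromstring(RECIPE)
print(root.tag, root.get("author"))   # the root element and its attribute
print(root.find("title").text)

# The two lists use the same <item> tag, so we look inside each parent.
ingredients = [item.text for item in root.find("ingredients").findall("item")]
steps = [item.text for item in root.find("steps").findall("item")]
print(len(ingredients), "ingredients,", len(steps), "steps")
```

The parser also turns the entity &#241; into the character ñ, just as a browser does for XHTML.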

When creating markup for the recipe, we could have added more structure to the document. For example, we could have marked up the ingredients more carefully:

<item><quantity>1</quantity> <unit>tsp</unit>
    <ingred>salt</ingred></item>

If we added these tags, we could always ignore them, so nothing has been lost. But the added structure could be used to manipulate the data. For example, if we were reading such a file with a recipe management program and wanted to make a triple batch, it could easily be transformed to:

<item><quantity>3</quantity> <unit>tsp</unit>
    <ingred>salt</ingred></item>

If the program were even smarter, it could become (since one tablespoon is three teaspoons):

<item><quantity>1</quantity> <unit>tbsp</unit>
    <ingred>salt</ingred></item>

The amount of structure needed in the markup depends on the application.

By using appropriate tags, any kind of information can be represented with XML. So XML has a great deal of flexibility, which is part of what has made it popular in many areas.
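The triple-batch idea described above can be sketched in a few lines of Python 3. This is a minimal illustration, not a real recipe management program: the function name scale_item is made up, and it assumes whole-number quantities (an amount like 1/2 would need extra handling).

```python
import xml.etree.ElementTree as ET

ITEM = "<item><quantity>1</quantity> <unit>tsp</unit> <ingred>salt</ingred></item>"

def scale_item(item, factor):
    """Multiply the quantity, then turn every 3 tsp into 1 tbsp when it divides evenly."""
    quantity = int(item.find("quantity").text) * factor
    unit = item.find("unit")
    if unit.text == "tsp" and quantity % 3 == 0:
        quantity //= 3
        unit.text = "tbsp"
    item.find("quantity").text = str(quantity)
    return item

item = scale_item(ET.fromstring(ITEM), 3)
print(ET.tostring(item, encoding="unicode"))
```

None of this would be possible with the plain text “1 tsp salt” unless the program also guessed where the number and the unit were.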

If you start creating new XML tags for every document you write, it won’t be terribly useful. You can create a style for the tags and display the document, as we will see in Topic 6.3.

The real power of XML comes from deciding on a set of tags and attributes that will be used for a particular kind of information. This way any program that wants to use the data will be able to do so without worrying about file formats created by other companies.

Check-Up Question

• Try to figure out how you could store some other types of data with XML.

Topic 6.2 Some XML Languages

Sets of XML tags and attributes have been created for many purposes. These are called schema.

XHTML

You are already quite familiar with one XML schema. XHTML is really just a language defined with XML. The XHTML schema is a set of rules; for example “<html> is the top-level element” and “<ul> must contain one or more <li> elements.” When you stick to this set of rules, web browsers know how to display your document.

SVG

SVG stands for “Scalable Vector Graphics.” It is an XML language that is used to represent vector graphics for use on the Web. It was created by the W3C because there is no standard vector graphics format used on the web.

<?xml version="1.0" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20010904//EN"
  "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<svg width="4cm" height="4cm" viewBox="0 0 400 400"
    xmlns="http://www.w3.org/2000/svg">
  <!-- outline -->
  <circle cx="200" cy="200" r="150"
      fill="yellow" stroke="black" stroke-width="10" />
  <!-- eyes -->
  <circle cx="150" cy="150" r="20" fill="black" />
  <circle cx="250" cy="150" r="20" fill="black" />
  <!-- smile -->
  <path d="M125,250 C150,310 250,310 275,250" stroke="black"
      stroke-width="20" fill="none" stroke-linecap="round" />
</svg>

Figure 6.2: A sample SVG file that represents a happy face

Most browsers don’t support SVG at the moment, so it’s not terribly useful, but it’s a good example of what can be done with XML. Figure 6.2 shows an example SVG file that contains a picture of a happy face.

The intention isn’t really that you would edit an SVG file by hand in a text editor. You could edit a vector image in a program like Corel Draw and then save it as an SVG. The benefit is that you would then be able to open up the same file in Adobe Illustrator. If the two programs agree on the XML schema, they will be able to open up each other’s files with no problems.

Figure 6.3: The display of Figure 6.2
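Since SVG is XML, any program with an XML parser can get at the shapes in the file. The Python 3 sketch below reads a cut-down version of Figure 6.2 (the circles only) with the standard xml.etree.ElementTree module. The one detail that differs from the recipe example is the xmlns attribute: it puts every tag in the SVG namespace, so the parser reports tag names with that namespace in braces in front.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"   # namespace prefix used by the parser

# The circles from Figure 6.2, without the doctype and the smile.
SVG_TEXT = """<svg width="4cm" height="4cm" viewBox="0 0 400 400"
    xmlns="http://www.w3.org/2000/svg">
  <circle cx="200" cy="200" r="150"
      fill="yellow" stroke="black" stroke-width="10" />
  <circle cx="150" cy="150" r="20" fill="black" />
  <circle cx="250" cy="150" r="20" fill="black" />
</svg>"""

root = ET.fromstring(SVG_TEXT)
circles = root.findall(SVG_NS + "circle")
radii = [int(circle.get("r")) for circle in circles]
print(root.tag)
print(radii)
```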


MathML

MathML is a markup language for mathematical expressions. It can be used to describe mathematical formulas that can be displayed or manipulated in other ways (like by a computer algebra program).

MathML isn’t well supported by browsers either, partly because most computers don’t have fonts to display the required characters. With MathML, you can include formulas like this in XHTML documents:

$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

OpenOffice

OpenOffice is a free office suite (word processor, spreadsheet, etc.). The native data formats for OpenOffice are XML-based. These XML files aren’t meant to be edited by hand in a text editor. They are created by the OpenOffice tools with a nice user interface.

The benefit is that it would be fairly easy for a programmer to decipher the OpenOffice XML formats and create another program that reads or writes the same file format.

Check-Up Question

• Can you find other examples of XML-based formats?

Topic 6.3 Styling XML

If we opened the recipe in Figure 6.1 with a browser, it would have no way to display it in a useful way. Since no browser knows about the <ingredients> tag (we just made it up a few pages ago), it won’t have any rules for how it should be displayed. It would either display the text of the document, ignoring the markup, or display the XML markup itself as text.

So, we need some way to give the browser style information about our tags. We did this for XHTML with CSS. We can do the same for XML, except there are no default styles built into the browser for us to start with. All of the style information has to be given from scratch.

Part of a style sheet for the XML from Figure 6.1 is shown in Figure 6.4.


ingredients {
  display: block;
  margin-top: 1ex;
  margin-bottom: 1ex;
}

ingredients item {
  display: list-item;
  margin-left: 2em;
  list-style-type: disc;
}

Figure 6.4: Part of the style for Figure 6.1

Figure 6.5: Part of the display of Figure 6.1 after the application of CSS from Figure 6.4

We have to indicate whether the elements are block-level, inline, or otherwise with the display property. We haven’t used display before, since the type of the HTML tags is already known by the browser.

A few other basic visual properties of the elements have been set as well. Note that a contextual selector is used to modify the <item> tags in the ingredient list but not those from the step list. A more complete style sheet for the document can be found on the course web site.

CSS works well for XML, but there are many other style decisions that you might want to make for a document that aren’t possible with CSS. CSS was created with changing the style of HTML in mind, not the complexities that come up with XML.

A more flexible style language, XSL, the “eXtensible Stylesheet Language,” was created to fill this need. XSL is another XML schema. That is, you can use a special set of XML tags to indicate style information for your XML document.

XSL is very powerful, and we won’t cover it in detail here. There are a few examples of XSL on the course web site.

Check-Up Question

• Have a look at the examples of CSS and XSL formatting on the web site. Try to modify the CSS example to change the appearance.

Topic 6.4 Validating XML

How can we check to see if the XML in Figure 6.1 is valid? We can’t, since no validator would have any way of knowing whether the tags were used properly. Since we just made up <ingredients>, we can’t expect a validator to know the rules for its use.

Some of the basic rules for all XML can be checked, however. For instance, we can check to see that <ingredients> is properly closed and that all of the tags inside are nested properly. We could also check for quotes around attribute values and so on.

For XML we can check for the basic syntax rules that make the XML well-formed. Checking for well-formed XML involves checking all of the basic syntax rules without worrying about the particular tags that are used.

There are XML well-formedness checkers available online. (“Formedness” is almost certainly not a word, but that shouldn’t spoil the fun.) Any XML document can be checked for well-formedness.
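Any XML parser has to do this kind of check before it can read a document, so a parser can double as a simple well-formedness checker. The Python 3 sketch below relies on the standard xml.etree.ElementTree module; the function name is_well_formed and the sample strings are made up for illustration. It says nothing about validity, only about the basic syntax rules.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, whatever tags it happens to use."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

good = "<ingredients><item>1 tsp salt</item></ingredients>"
unclosed = "<ingredients><item>1 tsp salt</item>"
badly_nested = "<ingredients><item>1 tsp salt</ingredients></item>"

for sample in (good, unclosed, badly_nested):
    print(is_well_formed(sample), sample)
```

The parser never needs to know what <ingredients> means; it only checks that every tag is closed and properly nested.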

If we know the rules for the use of each tag and attribute of an XML language, then a validator could check the XML the same way we have been checking XHTML up to now.

This set of rules is what is given in an XML schema. Since there is a schema for XHTML, it can be validated. In fact, the whole point of the doctype we have been specifying in each XHTML file is to indicate the schema being used.

There are also schema for SVG and MathML, so those can be validated as well.

Check-Up Question

• Check some example XML documents for well-formedness and validity with the online checker. You can use both XHTML and other XML documents.


Topic 6.5 XHTML and HTML

As we said above, XHTML is just an XML schema. Basically, it’s a set of XML tags, attributes, and rules along with a meaning for each one. It also happens to be an XML schema that web browsers can display.

Versions of HTML (versions 1–4) before XHTML were not based on XML. XML was created in 1998, well after the first versions of HTML were created.

Earlier versions of HTML were similar to XHTML, but their rules were less strict. For example, attribute values didn’t have to be quoted if they only contained letters and numbers: <p id=conclusion> would be valid HTML, but not XHTML.
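The difference in strictness can be seen by giving the same snippet to two parsers from Python 3’s standard library. This is only a sketch: html.parser is a lenient HTML parser, not an HTML 4.01 validator, and the class name AttributeCollector is made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

class AttributeCollector(HTMLParser):
    """Record the tag name and attributes of every start tag seen."""
    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))

SNIPPET = "<p id=conclusion>The end.</p>"

# The HTML parser accepts the unquoted attribute value.
collector = AttributeCollector()
collector.feed(SNIPPET)
print(collector.seen)

# The XML parser refuses it, because XML requires quotes.
try:
    ET.fromstring(SNIPPET)
    xml_accepts = True
except ET.ParseError:
    xml_accepts = False
print(xml_accepts)
```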

With XHTML 1.0, the standard was recreated on top of XML. XHTML 1.0 is essentially the same as HTML 4.01, which was the last non-XML-based HTML standard. The only changes between HTML 4.01 and XHTML 1.0 were the syntax rules that come with XML; all of the tags and attributes are the same.

There is a version 1.1 of XHTML that we haven’t discussed here; this guide covers version 1.0. There are a few technical reasons for this choice, in particular:

• With XHTML 1.1, the MIME type text/html shouldn’t be used. The proper MIME type for XHTML documents is application/xhtml+xml, but some browsers (Internet Explorer, in particular) don’t recognize this type. Incidentally, this is why we used .html and not .xhtml for XHTML files: to make sure files were sent with the text/html MIME type.

• There is no “Transitional” version of XHTML 1.1. The transitional doctype allows some of the older physical markup tags to be used. We haven’t discussed the transitional doctype, but many web authors still use it.

• XHTML 1.1 doesn’t allow the lang attribute to specify a language, only the newer xml:lang attribute. Most browsers don’t support xml:lang.

Other than these points, everything else discussed in this guide also applies to XHTML 1.1.

XHTML 2.0 is being written now. Many of the planned changes fall into the category of “should have been done that way in the first place.” It is a significant departure from the older HTML standards: XHTML 2.0 pages will be incompatible with current web browsers in many ways.

Summary

This unit is intended to get you comfortable with the basic ideas of XML. It should also reinforce the ideas behind XHTML that we have been discussing up to this point. There aren’t a lot of technical skills here, just some background on an interesting topic.

Key Terms

• XML

• SVG

• MathML

• XSL

• well-formed


Part III

Internet Programming


Unit 7

Programming Introduction

Learning Outcomes

• Explain what programming is and what Python is.

• Create simple Python programs from a description of their behaviour.

• Identify the types of data that can be manipulated in Python.

• Use conditional statements in Python.

• Debug programs in Python.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• Read Chapters 1 to 4 and Appendix A in How to Think Like a Computer Scientist. You can ignore these sections: 3.4, 3.5, 3.11, and 4.9 to 4.11.

• Do Exercise 3.

This Unit and Unit 9 cover material found in How to Think Like a Computer Scientist. You are expected to do these readings to fill in gaps in the material in this Study Guide, which is much less complete than in previous units.

This guide and the text cover the material in different ways. Different students will probably find that they are more comfortable with one or the other. You can start with either one, but you should read both, since the Study Guide doesn’t cover everything in depth.


Topic 7.1 What is Programming?

The second half of this course focuses on computer programming. We won’t assume you have programmed before and will start at the first step: What is programming?

Basically, a computer program is a list of steps that a computer can follow to accomplish some task.

A programming language is a particular way of expressing instructions to a computer. There are many programming languages. They are all designed for different reasons, and all have strengths and weaknesses, but they share many of the same concepts.

We have already covered HTML, which we said was a markup language. You might be wondering what makes a programming language different from a markup language. For one thing, a markup language is used to create a document, whereas a programming language is used to create a computer program.

Also, there are some things that a programming language must be able to do that HTML cannot. HTML doesn’t have any variables (Topic 7.4), conditional statements (Topic 7.7), or iterative statements (Topic 9.3). Anything called a “programming language” will have these (or something equivalent). Basically, writing HTML isn’t “programming.”
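As a preview of those three features, here is a small Python sketch that uses each of them once. The function name describe_total and the numbers are made up for illustration, the details are explained in the topics mentioned above, and the code is written in Python 3 syntax, which differs slightly from the Python 2.3 shown later in this unit.

```python
def describe_total(numbers):
    """Add up a list of numbers and say whether the total is big or small."""
    total = 0                    # a variable: a named place to keep a value
    for n in numbers:            # an iterative statement: repeat for each number
        total = total + n
    if total > 10:               # a conditional statement: choose what to do
        return "big"
    else:
        return "small"

print(describe_total([1, 2, 3]))
print(describe_total([5, 6, 7]))
```

Nothing in HTML can remember a value, repeat a step, or make a decision like this.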

Why Learn to Program?

There’s a good chance that if you’re in this course, you’ll never make a living at programming. So, why would you bother learning how to do it?

For a lot of people, the answer may be “because it’s fun.” Programming is an interesting challenge.

It is also nice to have another tool you can use to solve problems. If you know how to program, it’s surprising how often you will find that a quick program is the easiest way to deal with a problem.


Topic 7.2 Starting with Python

In this course, we will be using the Python programming language. You can download Python for free. Python is also an excellent programming language for people who are learning to program.

You may be wondering why this course doesn’t teach programming in C++ or Java. These are the languages that you probably hear about most often. The reason is simple: we aren’t trying to make you into computer programmers.

C++ and Java are very useful for creating desktop applications and other big projects. We aren’t doing that here, and you’ll probably never do it. Languages like Python are a lot easier to work with and are well suited to web programming. In fact, C++ and Java are rarely used for web programming because they aren’t well suited to the task.

One nice feature of Python is its interactive interpreter. You can start up Python and start typing in Python code. It will be executed immediately, and you will see the results.

You can also type Python code into a text editor and run it all at once. The interactive interpreter is generally used for exploring the language or testing ideas. Python code in a file can be run as an application and even double-clicked to run your program.

We will start by working with the Python interpreter. See Appendix B for instructions on getting Python running. When you start the Python interpreter, you’ll see something like this:

Python 2.3.2 (#2, Oct 6 2003, 08:02:06)
Type "help", "copyright", "credits" or "license" for
more information.
>>>

The >>> is the prompt. Whenever you see it in the interpreter, you can type Python commands. When you press return, the command will be executed and you will be shown the result. Whenever you see the >>> prompt in examples, it’s an example of what you’d see in the interpreter if you typed the code after the >>>.

For some reason, when people are taught to program, the first programthey see is one that prints the words “Hello world” on the screen. Not wanting

Page 124: Introduction to Multimedia and the Internet · ComputingScience165-3 † StudyGuide Introduction to Multimedia and the Internet by Greg Baker Faculty of AppliedSciences Centrefor

124 UNIT 7. PROGRAMMING INTRODUCTION

to rock the boat, we’ll do that too. Here’s what it looks like in the Pythoninterpreter:

>>> print "Hello world"
Hello world

The stuff after the prompt is the first piece of Python code we’ve seen. We could have also typed it into a text editor, named the file hello.py and run it.

The print command in Python is used to put text on the screen. Whatever comes after it will be printed on the screen.

Any text in quotes, like "Hello world" in the example, is called a string. Strings are just a bunch of characters. They have to be placed in quotes to distinguish them from Python commands. If we had left out the quotes, Python would have complained that it didn’t know what “Hello” meant, since there is no built-in command called Hello.

Check-Up Questions

• Type print "Hello world" into a text editor and save it as hello.py. Run it with Python.

If you’re using Windows and you run the program by double-clicking the file, the output window might disappear before you can see the results. You can stop this from happening by running the program in IDLE or by waiting for the user to press return before ending the program. We’ll talk about how to do that in the next topic.

• Add a few more print statements to your hello.py program (one per line). Run it and see what happens.

Topic 7.3 An Example Program

Figure 7.1 contains the Python code for a simple guessing game program. You probably won’t understand all of it now, but that’s okay.

Some sample executions of the program can be found in Figure 7.2. When the program is run, the game is played once; the sample gives three separate executions of the program.

You can download this program from the course web site if you want to try it for yourself. By the end of this unit, you should understand how it works.

import random

num = random.randint(1,10)
print "I'm thinking of a number between 1 and 10."
guess = int(raw_input("Take a guess: "))
if num==guess:
    print "Right! Wow, that was fast."
else:
    if guess<num:
        print "That was too small."
    else:
        print "That was too big."
    guess = int(raw_input("Guess again: "))
    if num==guess:
        print "Yup, that's it."
    else:
        print "Sorry, wrong again. It was " + str(num) + "."
print "Thanks for playing the game!"

Figure 7.1: An example Python program

I'm thinking of a number between 1 and 10.
Take a guess: 4
That was too small.
Guess again: 6
Sorry, wrong again. It was 9.
Thanks for playing the game!

I'm thinking of a number between 1 and 10.
Take a guess: 7
Right! Wow, that was fast.
Thanks for playing the game!

I'm thinking of a number between 1 and 10.
Take a guess: 6
That was too big.
Guess again: 3
Yup, that's it.
Thanks for playing the game!

Figure 7.2: Three sample executions of the program in Figure 7.1

Check-Up Question

• Make some modifications to the program in Figure 7.1 and see what happens when you run it.

Topic 7.4 Expressions and Variables

By putting a number of print statements in a program, we can make Python output a bunch of text. That’s not very exciting, but at least we can make programs.

The next step is to create a program that can do some calculations and output the results.

>>> print 33+5
38

More complicated expressions are also possible:

>>> print (14+19)*5
165
>>> print 327-2*(9**2)
165
>>> print 327 - 2*(9**2)
165

The * operator is used for multiplication (you don’t have a × on your keyboard) and ** is used for a power (the last two calculations are 327 − 2(9²)). Notice that you can also put spaces in expressions. The spaces don’t change the calculation, so you can use them to make the code easier to read.
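To see how Python orders these operators, here is a small sketch. The variable names are made up for illustration, and print is written with parentheses so that the code also runs on newer versions of Python (in the Python 2.3 used in this guide, print a works just as well).

```python
# ** (power) is done before *, and * is done before + and -,
# just as in ordinary arithmetic.
a = 327 - 2 * 9 ** 2      # 9**2 is 81, 2*81 is 162, 327-162 is 165
b = 327 - 2 * (9 ** 2)    # these parentheses only make the order explicit
c = (327 - 2) * 9 ** 2    # these parentheses change the order: 325*81

print(a)   # 165
print(b)   # 165
print(c)   # 26325
```

When in doubt about the order, adding parentheses costs nothing and makes the calculation easier to read.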

When Python sees a calculation like 33+5, it evaluates it and uses the result. Python will see that it’s a calculation because it’s not in quotes like a string:

>>> print "33+5"
33+5

Remember that a string is only a series of characters.

Any kind of calculation that returns a result is called an expression. You can use an expression anywhere you need to give Python a value to work with.

What if you want to remember the result of a calculation? Often, you’ll want to use the same result several times, and you shouldn’t have to calculate it every time. In Python, you can store data in a variable. You can think of a variable as a label on a particular part of the computer’s memory.

>>> x = 3+4
>>> print x
7
>>> print x*2
14

Here, the first line calculates the value 7 and stores it in a variable named x. This is called an assignment statement. To assign a value to a variable, you give the name of the variable, an equals sign (=) and an expression to calculate the value you want to store.

Once a value has been stored in a variable, it can be used in any expression. Variables can hold things besides numbers.

>>> course = "CMPT 165"
>>> print course
CMPT 165
>>> print "This is " + course
This is CMPT 165
>>> print course*2
CMPT 165CMPT 165

Notice that Python knows how to “add” two strings: it concatenates them. It can also “multiply” a string by an integer: it makes copies of the string.
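As a small sketch of putting the two string operators to work, the code below builds a title and an underline of the same width. The variable names are invented for this example; str (covered in Topic 7.6) and the built-in len function, which counts the characters in a string, are used here ahead of their introduction. The parenthesized print also runs on newer versions of Python.

```python
# Assumed example values; none of these names come from the guide.
first = "CMPT"
number = 165

# + joins strings, so the number has to be turned into a string first.
course = first + " " + str(number)

# * repeats a string a whole number of times.
underline = "-" * len(course)

print(course)
print(underline)
```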

Remember that variables don’t store the expression you assign; they store the result of the calculation. For example,

>>> x = 5
>>> y = x*2
>>> print x,y
5 10
>>> x = 20
>>> print x,y
20 10

In this example, the variable y is holding the number 10, not the calculation “take x and multiply it by 2.”
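If you do want a stored result to reflect a new value, you have to run the assignment again. A minimal sketch, using made-up variable names (print is written with parentheses so the code also runs on newer versions of Python):

```python
price = 5
total = price * 2      # total now holds 10, not the recipe "price * 2"

price = 20             # changing price does not touch total
print(total)           # still 10

total = price * 2      # assigning again recalculates with the new price
print(total)           # 40
```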

name = raw_input("What is your name? ")
print "Hello, " + name
age = raw_input("How old are you? ")
print "Wow,", age, "is pretty old."

Figure 7.3: A Python program that gets input from the user

There are two variables used in Figure 7.1. The one called num holds the number that the user is trying to guess. The other called guess holds the guesses made by the user.

Topic 7.5 User Input

The program in Figure 7.3 asks the user some questions and gives responses based on them. For example, when we run the program, it might look like this:

What is your name? Skippy
Hello, Skippy
How old are you? 19
Wow, 19 is pretty old.

The raw_input function is built into Python. When you use the function, it asks the user for some input and waits for them to type it in. When the user presses return, the string that they typed is returned. Here, we are putting their input into the variables name and age.

Note that whatever the user types, Python will consider it a string, even if it looks like a number.

>>> num = raw_input("Type a number: ")
Type a number: 14
>>> print "One more is", num+1
One more is
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot concatenate 'str' and 'int' objects

When you try to do some arithmetic on num, Python stops because it doesn’t know how to add a string (num) and an integer (1).

We can work around this problem, but a little background in how Python stores things is necessary first.

Check-Up Question

• Write a program that takes some user input and does something with it.

Topic 7.6 Types

So far, we have seen Python working with integers and strings. There’s more to it than that: there are several other kinds of information that Python can work with.

The type of information you have affects the types of operations you can do with it. For example, you can subtract two integers, but you can’t subtract strings.

>>> 10-6
4
>>> "Hello"-"H"
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsupported operand type(s) for -: 'str' and 'str'

You might have an idea about what should happen when you take the “H” away from “Hello” but Python doesn’t. If you try to use the subtraction operator (-) on strings, it gives you an error.

The important part of this error is the last line: the operands of the - aren’t a type Python knows how to subtract. Operands are the values that are going into the subtraction. For 4-3, the operands are 4 and 3 and - is the operator.

Python tells you that it doesn’t know how to subtract strings. The operands to - must be numbers. The + operator, on the other hand, knows how to add two numbers (it does the arithmetic) and how to add two strings (it does concatenation).

Something unexpected can happen when you try to divide two values. This is also a problem caused by the operand types.

>>> 33/2
16
>>> 33.0/2
16.5

In Python, integers are stored differently than floating point numbers (numbers with decimal parts, sometimes called real numbers). In Topic 3.1, we discussed how a computer can store integers (at least, positive integers) and this is close to what Python does (except it has to be able to store negative values too). For floating point values, it also has to keep track of fractional parts of the values, and they have to be represented differently in the computer’s memory.

In the first calculation above, Python is dividing two integers, and it insists that the result will also be an integer. So, it does the division (16.5) and rounds down to get the result (16).

In the second calculation, Python recognizes 33.0 as a floating point value. Then, it can do the division (16.5), and since the result can be a floating point value, it doesn’t get rounded off.
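The behaviour of / described above is how the Python 2.3 used in this guide works; from Python 3.0 onward, / between two integers gives 16.5. If you want the round-down behaviour spelled out regardless of version, the // operator (not otherwise used in this guide, but available since Python 2.2) does that. A short sketch with made-up variable names:

```python
whole = 33 // 2          # floor division: 16
negative = -33 // 2      # "rounds down" means toward the lower number: -17
exact = 33.0 / 2         # a floating point operand gives 16.5

print(whole)
print(negative)
print(exact)
```

The negative case shows that the result is rounded down rather than simply having its fractional part chopped off.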

You can check the type of values in Python with the type function.

>>> type(13)
<type 'int'>
>>> type(13.0)
<type 'float'>
>>> type(13/2)
<type 'int'>
>>> type("Hello")
<type 'str'>

Type Conversion

We can also use type to confirm the problem we had with numeric input above:

>>> num = raw_input("Type a number: ")
Type a number: 14
>>> type(num)
<type 'str'>

name = raw_input("What is your name? ")
print "Hello, " + name
age = int( raw_input("How old are you? ") )
print "Next year, you'll be", age+1

Figure 7.4: A program with type conversion

We know that “14” looks like a number, but Python treats it like a string: the character ‘1’ followed by a ‘4.’

If we explicitly tell Python that we want to treat the user’s input like an integer, it will. This has been done in Figure 7.4.

What is your name? Skippy
Hello, Skippy
How old are you? 19
Next year, you'll be 20

The int function will convert the string to an integer. In Figure 7.4, it takes the string returned by raw_input; the converted integer is stored in age. The float function can be used to convert it to a floating point value, if you expect the user to enter fractional values.

In Figure 7.1, the raw_input function is used to ask the user for a guess and int is used to convert it to an integer. If the input hadn’t been converted to an integer, the user could never “win” the game. Python would look at the integer 4 and string "4" and say that they are different.
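The comparison problem in the guessing game can be tried without any typing. In this sketch, the variable typed stands in for what raw_input would hand back, and all of the names are made up for illustration. The parenthesized print also runs on newer versions of Python.

```python
typed = "4"              # stands in for the string raw_input returns
num = 4                  # stands in for the number being guessed

same_before = (typed == num)    # a string compared with an integer
guess = int(typed)              # convert the string to an integer
same_after = (guess == num)     # an integer compared with an integer

print(same_before)              # False: "4" and 4 are different values
print(same_after)               # True
print(float("16.5") + 1)        # float handles input with a fractional part
```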

If you need to convert a value to a string, you can use the str function.

>>> x=16.5
>>> print "x=",x
x= 16.5
>>> print "x="+x
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot concatenate 'str' and 'float' objects
>>> print "x="+str(x)
x=16.5

Note the first print statement above: when you use the comma to print more than one thing, a space is automatically inserted. If you don’t want the space, you need to do string concatenation, as in the last print statement.

If you’re impatient to get to web programming, you could skip ahead to the first few topics of the next unit and come back to the rest of this one later.

Check-Up Questions

• What happens in Figure 7.4 if the user enters something that isn’t a number (like “old”) for their age?

• Remove the int functions from Figure 7.1 and see how the behaviour changes.

Topic 7.7 Conditionals

All of the code we have written so far has been pretty simple. It all runs from top to bottom, and every line executes once as it goes by. The process soon becomes boring. It’s also not very useful.

What we need to create more interesting programs is some way to make decisions: to look at the user’s input or some other value and make some choices based on the result.

The if statement

The most common way to make decisions in Python is by using the if statement. The if statement lets you ask if some condition is true. If it is, the body of the if will be executed.

For example, the program in Figure 7.5 contains an if statement. The condition in the if checks to see if the value in the variable num is less than 10. If it is, the code inside the if is executed.

num = int(raw_input("Enter an integer: "))
if num<10:
    print "That was smaller than ten."
    print "Actually, it was", 10-num, "less than 10."
print "I'm done now."

Figure 7.5: A program with an if block

Here are three example executions of the program:

Enter an integer: 3
That was smaller than ten.
Actually, it was 7 less than 10.
I'm done now.

Enter an integer: 31
I'm done now.

Enter an integer: -3
That was smaller than ten.
Actually, it was 13 less than 10.
I'm done now.

As you can see from the second run of the program, the two indented print statements are not executed when the if condition is false. These two statements make up the body of the if statement. The last print is executed no matter what; it isn’t part of the if.

In Python (unlike many programming languages), the amount of space you use is important. The only way you can indicate what statements are part of the if body is by indenting, which means you’ll have to be careful about spacing in your program.

All block statements in Python (we’ll be seeing more later) are indented the same way. You start the block and then everything that’s indented after it is the body of the block. When you stop indenting, the block is over.

How much you indent is up to you, but you have to be consistent. Most Python programmers indent 4 spaces and all of the example code for this course is written that way.
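Here is a small sketch of how indentation decides what belongs to an if. It builds up a string instead of printing at each step, so the effect of each line is easy to follow; the variable names are made up, and the parenthesized print also runs on newer versions of Python.

```python
num = 31                      # try 3 or -3 here as well
report = "checked"

if num < 10:
    # These two lines are indented, so they form the body of the if.
    report = report + ", smaller than ten"
    report = report + ", by " + str(10 - num)

# This line is not indented, so it runs whether or not the condition was true.
report = report + ", done"
print(report)
```

With num set to 31, neither indented line runs. If you accidentally indented the last assignment as well, ", done" would disappear from the output for large numbers, which is exactly the kind of spacing mistake to watch for.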

Boolean Expressions

The expressions that are used for if conditions must be either true or false. In Figure 7.5, the condition is num<10, and it evaluates to true when the value stored in num is under 10.

These conditions are called boolean expressions. The two boolean values are “true” and “false.” A boolean expression is any expression that evaluates to true or false.

In Python, boolean values are actually represented as integers. False is represented by 0; any other integer (usually 1) represents true. You can usually ignore this point, though.

The less than sign (<) does just what it should. If the left operand is less than the right operand, it returns true. There are also boolean operators for greater than (>), less than or equal (<=), and greater than or equal (>=).

To check whether two values are equal, use the == operator; != is the not equal operator.

>>> if 4-1==3:
...     print "Yes"
...
Yes

Note the difference between = and ==. The = is used for variable assignment; you’re telling Python to put a value into a variable. The == is used for comparison; you’re asking Python a question about the two operands. Python won’t let you accidentally use a = as part of a boolean expression, for this reason.

In the Python interpreter, the “...” prompt is used to indicate that you’re expected to keep typing because you haven’t finished the statement yet. In this example, you need to give the if body before the statement can be executed. Typing a blank line tells the interpreter that you’re done.
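To make the difference between = and == concrete, the sketch below stores the answer to a comparison in a variable, which is a handy way to look at a boolean value directly. The variable names are made up, and the parenthesized print also runs on newer versions of Python.

```python
total = 4 - 1              # = stores the value 3 in total

matches = (total == 3)     # == asks "are these equal?" and gives True or False
differs = (total != 3)     # != asks the opposite question
smaller = (total < 10)

print(matches)             # True
print(differs)             # False
print(smaller)             # True
```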

The else clause

In Figure 7.1, we wanted to take one particular action if the user guessed correctly and another if they guessed incorrectly. It could also have been written in the following way:

if num==guess:
    ...
if num!=guess:
    ...

If we had used this form, the program would have worked correctly, but it’s redundant: we’ve had it evaluate the same condition twice, which seems wasteful. If we did this too often, it would slow our program down.

In the if statement, you can specify an else clause. The purpose of the else is to give an “if not” block of code. The else code is executed if the condition in the if is false. This form was used in Figure 7.1 to create an if block like this:

if num==guess:
    ...
else:
    ...

It is also possible to allow more possibilities with elif blocks. The elif is used as an “else if” option. In Figure 7.1, we could have done something like this:

if num==guess:
    print "Right!"
elif num<guess:
    print "Too big!"
else:
    print "Too small!"

Any number of elifs can be inserted to allow for many possibilities. Whenever an if…elif…elif…else structure is used, only one of the code bodies will be executed. The else will only execute if none of the conditions are true.
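The “only one body is executed” rule matters most when more than one condition could be true. The sketch below uses a made-up example (turning a score into a letter grade, with invented cut-offs) to show that Python stops at the first condition that is true. The parenthesized print also runs on newer versions of Python.

```python
score = 85            # hypothetical value; try others

if score >= 90:
    grade = "A"
elif score >= 80:
    grade = "B"       # this body runs for 85 ...
elif score >= 70:
    grade = "C"       # ... so this one is skipped, although 85 >= 70 is true
else:
    grade = "F"

print(grade)          # B
```

Because the conditions are checked from top to bottom, their order matters: putting the score >= 70 test first would give every passing score a C.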

We can also put ifs inside one another. In Figure 7.1, there was a bunch of code that we only wanted to execute if the first guess was wrong. If the first guess was right, we wanted to stop asking for guesses. So, the else part of the first if block has a couple of if blocks inside.

Topic 7.8 Python Libraries

In most programming languages, you aren’t expected to do everything from scratch. Some prepackaged functions come with the language, and you can use them whenever you need to. These are generally called libraries. In Python, each part of the total built-in library is called a module.

In Figure 7.1, the random module was used to generate a random value that the user could guess.

The first line in Figure 7.1, import random, tells Python that we want to use the random module. Python has a lot of modules, and the import statement lets us use only the ones we want. If Python imported all of its libraries every time it ran a program, it would take a long time to get started.

In the second line, the num variable is set to a random value from 1 to 10. The randint function comes from the random library and returns an integer in the range you indicate.

A function from a module is accessed by the module name, a dot, and the name of the function itself. So, the sin function from the math module would be called as math.sin(...).
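A short sketch of the module-dot-function pattern with two modules at once. It uses math.sqrt (the square root function, which this guide does not otherwise mention) alongside random.randint; the variable names are made up, and the parenthesized print also runs on newer versions of Python.

```python
import math
import random

root = math.sqrt(16)          # sqrt from the math module
roll = random.randint(1, 6)   # randint from the random module: 1 to 6 inclusive

print(root)                   # 4.0
print(roll)                   # different from run to run
```

Leaving out the import line, or writing sqrt(16) without the math. in front, gives an error because Python doesn’t know where to find the function.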

There are Python modules to do all kinds of things, far too many to mention here. There is a reference to the Python libraries linked from the course web site.

We will mention a few more modules as we cover other topics in the course. You can always go to the reference and get a full list and description of their contents.

Topic 7.9 Debugging

Unfortunately, when you write programs, they usually won’t work the first time. They will have errors or bugs. This is perfectly normal, and you shouldn’t get discouraged when your programs don’t work the first time. Debugging is as much a part of programming as writing code.

Section 1.3 and Appendix A in How to Think Like a Computer Scientist cover the topic of bugs and debugging very well, so we won’t repeat too much here. You should read those before you start to write programs on your own.

Beginning programmers often make the mistake of concentrating too much on trying to fix errors in their programs without understanding what causes them. If you start to make random changes to your code in the hopes of getting it to work, you’re probably going to introduce more errors and make everything worse.

When you realize there’s a problem with your program, you should do things in this order:

1. Figure out where the problem is.

2. Figure out what’s wrong.

3. Fix it.

Getting it right the first time

The easiest way to get through the first two steps here quickly is to write your programs so you know what parts are working and what parts might not be.

Write small pieces of code and test them as you go. As you write your first few programs, it’s perfectly reasonable to test your program with every new line or two of code.

It’s almost impossible to debug a complete program if you haven’t tested any of it. If you get yourself into this situation, it’s often easier to remove most of the code and add it back slowly, testing as you do. Obviously, it is much easier to test as you write.

Don’t write your whole program without testing and then ask the TMs to fix it. Basically, they would have to rewrite your whole program to fix it, and they aren’t going to do that.

As you add code and test, you should temporarily insert some print statements. These will let you test the values that are stored in variables so you can confirm that they are holding the correct values. If not, you have a bug somewhere in the code you’ve written and should fix it before you move on.

Let’s go back to the guessing game in Figure 7.1 again. You shouldn’t try to write this program all at once. Figures 7.6 to 7.9 show some of the programs that you might write and test on the way to finishing Figure 7.1.

import random

num = random.randint(1,10)
print "I was thinking of", num

Figure 7.6: Guessing game: testing random

Figure 7.6 is just a test to make sure that we know how to use the randint function and the random module. We would run this a few times to make sure we get different values.

Next, the user input was added (Figure 7.7). Again, we’ve just printed out the values of the two variables so we can see what’s going on.

import random

num = random.randint(1,10)
print "I'm thinking of a number between 1 and 10."
guess = int(raw_input("Take a guess: "))
print "I was thinking of", num
print "You guessed", guess

Figure 7.7: Guessing game: testing user input

In Figure 7.8 and Figure 7.9, we have started testing the if statements. In order to make it easier, a line has been added that sets the variable num to 6, overwriting the random value. It will be much easier to test the conditions since we know we’re always trying to guess 6. This line would, of course, be removed in the final version of the program.

import random

num = random.randint(1,10)
num = 6 # forget the random value while testing
print "I'm thinking of a number between 1 and 10."
guess = int(raw_input("Take a guess: "))
if num==guess:
    print "Right!"
else:
    print "Wrong."

Figure 7.8: Guessing game: checking their guess

import random

num = random.randint(1,10)
num = 6 # forget the random value while testing
print "I'm thinking of a number between 1 and 10."
guess = int(raw_input("Take a guess: "))
if num==guess:
    print "Right! Wow, that was fast."
else:
    if guess<num:
        print "Too small."
    else:
        print "Too big."

Figure 7.9: Guessing game: checking greater or less

Finding bugs

Unfortunately, you won’t always catch every problem in your code as you write it, no matter how careful you are. Sooner or later, you’ll realize there is a bug somewhere in your program that is causing problems.

Again, you should resist the urge to try to fix the problem before you know what’s wrong. Appendix A of How to Think Like a Computer Scientist talks about different kinds of errors and what to do about them.

When you realize you have a bug in your program, you’re going to have to figure out where it is. When you are narrowing the source of a bug, the print statement can be your best friend.

Usually, you’ll first notice either that a variable doesn’t contain the value you think it should or that the flow of control isn’t the way you think it should be because the wrong part of an if is executed.

You need to work backwards from the symptom of the bug to its cause. For example, suppose you had an if statement like this:

if length*width < possible_area:

If the condition doesn’t seem to be working properly, you need to figure out why. You can add in some print statements to help you figure out what’s really going on. For example,

print "l*w:", length*width
print "possible:", possible_area
if length*width < possible_area:
    print "I'm here"

When you check this way, be sure to copy and paste the exact expressions you’re testing. If you accidentally mistype them here, it could take a long time to figure out what has happened.

You’ll probably find that at least one of the print statements isn’t printing what you expect. In the example, suppose the value of length*width wasn’t what we expected. Then, we could look at both variables separately:

print "l, w:", length, width

If length was wrong, you would have to backtrack further and look at whatever code sets length. Remove these print statements and add in some more around the length=… statement.
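The backtracking described above can be sketched with one invented bug: length holds the string "4" (as it would if it came from the user without conversion) instead of the number 4. The values and the bug are assumptions made up for this example, and repr is a built-in function, not covered in this guide, that shows a value with its quotes if it is a string. The parenthesized print also runs on newer versions of Python.

```python
# Hypothetical values: length arrived as a string, as it would from raw_input.
length = "4"
width = 3
possible_area = 20

# Step 1: print the exact expression used in the condition.
print("l*w: " + str(length * width))     # shows 444, not the expected 12

# Step 2: that looks wrong, so look at each variable on its own.
print("length is " + repr(length))       # the quotes reveal a string
print("width is " + repr(width))

# Step 3: fix the problem where length is set, then check again.
length = int(length)
print("l*w: " + str(length * width))     # 12
print(length * width < possible_area)    # True
```

The 444 comes from “multiplying” a string by an integer, which makes copies of the string. Once the tests pass, the extra print lines would be removed.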

Summary

By now, you should be able to understand all of the code in Figure 7.1. You should also be able to start writing programs of similar difficulty.

If you’re intimidated about starting to write programs, don’t panic. Writing your first few programs will be difficult, since you’re learning about programming, Python, the Python tools, and error messages all at once. After you get a few programs working and get the feeling of the steps you have to take, it will get easier.

Key Terms

• computer programming

• programming language

• Python

• expression

• variable

• assignment statement

• operator

• operand

• variable type

• integer

• floating point

• conditional statement

• if statement

• boolean expression

• Python module

• debugging


Unit 8

Web Programming

Learning Outcomes

• Create HTML forms to get user input.

• Create Python programs to generate web pages, using a user’s input.

• Install Python programs for use on a web server.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• Review the instructions in Topic A.5 for getting Python programs working on the web server.

• Do Assignment 3.

Now that we know a little about programming, we can concentrate on writing programs that create web pages. This isn’t very hard at all. In fact, we already have all of the programming we need to do some interesting web programming.

Programs that generate web pages are often called web scripts or CGI scripts; we will discuss “CGI” in Topic 8.3. Instead of just typing HTML code into a file and putting it on a web server, we will start writing programs that create the HTML for us. This method will let us create dynamically generated web content. A dynamically generated page is created by our program when it is requested. Web pages that are contained in files on the web server are generally called static.

print "Content-type: text/html"
print
print "<html><head>"
print "<title>Python did this</title>"
print "</head><body>"
print "<p>Here I am</p>"
print "</body></html>"

Figure 8.1: A program that generates a simple web page

Any programming language can be used to create web scripts. We will continue to use Python in this course. Other common web scripting languages are Perl, PHP, and ASP. Java and other programming languages are also used occasionally.

Topic 8.1 Making Web Pages with Python

Figure 8.1 is a Python program that creates a simple web page. This program generates the same page every time. That’s not very exciting and we could have done it with a static HTML file, but it will get better as we add more programming features to it.

The first line of the program outputs an HTTP header. It is part of the conversation that the web server and web browser are having in HTTP as part of transferring the page. We will talk more about the details of HTTP in Unit 11.

For the moment, we’re only concerned with this one line of HTTP. The Content-type header indicates the MIME type of the information we’re about to send. In Figure 8.1, we have indicated the type text/html since we are sending HTML. If we were dynamically creating some other type of data, we could have used any other MIME type, but we’ll be sticking to HTML in this course.

The second line of Figure 8.1 just prints a blank line. The blank line separates the HTTP headers from the content. This is how the web browser knows that you’re starting to send the HTML itself.

The rest of the program just prints out some HTML code. It will be displayed in the web browser just like HTML code typed into a .html file.
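To see the role of the blank line, the sketch below builds the same kind of output in a single string and then splits it apart again at the first blank line. This is only an illustration of the idea, not how a real web server or browser is written; the variable names are made up, and the "\n" line-break code, the split method, and the [0] and [1] indexing are Python features not covered in this unit. The parenthesized print also runs on newer versions of Python.

```python
# Build output like Figure 8.1 in one string so its parts are easy to see.
header = "Content-type: text/html"
html = "<html><head><title>Python did this</title></head><body><p>Here I am</p></body></html>"

# "\n" is a line break; two in a row end the header line and add the blank line.
response = header + "\n\n" + html
print(response)

# Splitting at the first blank line recovers the two parts.
parts = response.split("\n\n", 1)
print(parts[0])    # the header
print(parts[1])    # the HTML
```

If the blank line were left out, there would be nothing to mark where the headers stop and the page begins, which is why forgetting the empty print is a common reason for a web script to fail.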

import time

print "Content-type: text/html"
print
print "<html><head>"
print "<title>Slightly cooler</title>"
print "</head><body>"
print "<p>Right now, it is "
print "<strong>", time.asctime(), "</strong></p>"
print "<p>2 + 3 + 4 =", 2+3+4, "</p>"
print "</body></html>"

Figure 8.2: A web script that does a little more

Content-type: text/html

<html><head>

<title>Slightly cooler</title>

</head><body>

<p>Right now, it is

<strong> Tue Nov 11 11:11:00 2003 </strong></p>

<p>2 + 3 + 4 = 9 </p>

</body></html>

Figure 8.3: Sample output from Figure 8.2

When you’re creating a program that will run on a web server, you are much more limited than when you create a program to run on your own computer. The only way to do output is with the print statement. You can’t use raw_input, since you don’t have the same chance to ask the user to type something. You can still get input from the user in other ways; we will discuss this in the next two Topics.

On the other hand, you can use any programming features that you want to get the output you need to print. Figure 8.2 contains a web script that does a little more than Figure 8.1. Sample output from the Python program is shown in Figure 8.3. When this program is installed on a web server and loaded with a web browser, it might look like Figure 8.4.
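As a variation on Figure 8.2, the following sketch computes its values first and then joins them into the HTML with +. This is only one possible way to write it; the variable names now and total are our own. Joining the pieces avoids the extra spaces that print puts between comma-separated items (visible around the date in Figure 8.3).

```python
import time

now = time.asctime()      # current date and time as a string
total = 2 + 3 + 4         # any calculation can go here

print("Content-type: text/html")
print("")
print("<html><head><title>Slightly cooler</title></head><body>")
print("<p>Right now, it is <strong>" + now + "</strong></p>")
print("<p>2 + 3 + 4 = " + str(total) + "</p>")
print("</body></html>")
```

The str function is needed because a number cannot be joined directly to a string with +.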


Figure 8.4: Sample display of Figure 8.2 in a browser

print "Content-type: text/html"

print

print """<html><head>

<title>Python did this</title>

</head><body>

<p>Here I am</p>

</body></html>"""

Figure 8.5: Figure 8.1 rewritten with triple-quoted strings

The way you install a web script and get it working is different on different web servers. See Appendix A for information on how to do it in the web space provided for this course. On other web servers, you’ll probably have to look for local documentation or ask for help from the server administrator.

Many web scripts have to output large chunks of HTML code that won’t change. In Figure 8.1, the whole program is just a collection of print statements. It’s often tedious to write these statements.

In Python, there’s a shortcut you can use whenever you need to work with large chunks of text. You can use a pair of """ to wrap up a large string, as you can see in Figure 8.5. A triple-quoted string can also contain line breaks, which regular strings (quoted with ") cannot.
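The difference can be checked directly. In this sketch (the names page and same are our own), the same text is written once as a triple-quoted string and once as a regular string, where each line break has to be spelled out as \n.

```python
# A triple-quoted string may span several lines.
page = """<html><head>
<title>Python did this</title>
</head><body>
<p>Here I am</p>
</body></html>"""

# The same text as a regular string needs \n for each line break.
same = "<html><head>\n<title>Python did this</title>\n</head><body>\n<p>Here I am</p>\n</body></html>"

print(page == same)        # the two strings are identical
print(page.count("\n"))    # number of line breaks in the text
```

The two forms produce exactly the same string; the triple-quoted one is simply easier to read and edit.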

Triple-quoted strings make it easy to copy-and-paste large pieces of HTML from a static file into a Python program. You can just copy the code into a print """. . . """ wherever it should go.

Check-Up Questions

• Install Figure 8.1 and Figure 8.2 in your web space, and make sure you can get them running.

• Modify Figure 8.2 to do some other calculations.


<form action="sample.py">

<p>(a) <input type="text" name="text1" value="A textbox" /></p>

<p>(b) <input type="text" size="6" maxlength="10" name="text2" /></p>

<p>(c) <input type="checkbox" name="check1" /></p>

<p>(d) <input type="checkbox" checked="checked" name="check2" /></p>

<p>(e) <input type="submit" value="Go!" name="button1" /></p>

</form>

Figure 8.6: The body of an HTML file with a form

Topic 8.2 HTML Forms

When we create web scripts, we can’t use raw_input to get input from the user, but we need some way to do it. On the web, you have probably entered terms into a search engine or typed your address into an e-commerce site. These are examples of forms, and we can use them to get input from the user when we are web scripting.

The data in forms is usually sent back to a web script on the web server. In Topic 8.3, we will write web scripts to process the data and send back an HTML page in response.

The <form> tag is wrapped around the entire form. It shouldn’t affect the appearance of the page; it’s just used as a marker to indicate what parts of the page are part of the form.

We need to give an action attribute to the form. It will indicate the URL of the web script that will get the results of the form. The form in Figure 8.6 will send its results to a web script named sample.py in the same directory as the current page.

The <input> tag is used to put controls on the form. The type attribute of the tag indicates the kind of control, for example, a text input box, a button, or a check button. We also usually specify a name for each control. The name is used so we can refer to the control later.

Three types of <input> can be seen in Figure 8.6, and their appearance in a browser can be seen in Figure 8.7.

The text type of input is a single-line text box; two examples are (a) and (b) in Figures 8.6 and 8.7. Since each <input> should be given a unique name so we can refer to it later, these text boxes are named text1 and text2.


Figure 8.7: The display of Figure 8.6 in a browser

The value attribute can be used to put some text in the box initially, as has been done in (a). The attribute size indicates how many characters wide the box should be, and maxlength gives the maximum number of characters that the user will be allowed to enter, as in (b).

An input with type="checkbox" will produce a check button that can either be unchecked (c) or checked (d). If the attribute checked is included ("checked" is the only possible value), the box will be checked initially; otherwise, it will be unchecked.

The final type of input shown in Figures 8.6 and 8.7 is a submit button, (e). The user can click on the button to send their results to the script named in the action attribute of the form. The value attribute is used to give the text that should be placed on the button itself.

There are several other possible values for the type attribute of the <input> tag. You can find information about these types and how they are used in the online HTML reference in the “Online References” section.

Check-Up Question

• Try creating a web page with a form and some controls on it. Of course, the controls won’t do anything yet, but you should be able to see them. Try validating the page you create.


<form action="cgiinput.py">

<p> What is your name?

<input type="text" name="user" size="20" /> </p>

<p> How old are you?

<input type="text" name="age" size="3" /> </p>

<p><input type="submit" value="Go" /> </p>

</form>

Figure 8.8: An HTML form for simple user input

Topic 8.3 CGI

Web scripts, like the ones we started to write in Topic 8.1, can read information that is entered into a form.

Suppose we want to recreate something like the program in Figure 7.3 as a web page. We would start with a form like the one in Figure 8.8.

Figure 8.9 contains a Python web script that takes the user’s input. Since the <form> tag’s action attribute gives the file name cgiinput.py as the URL for submission, Figure 8.9 would have to be saved in a file named cgiinput.py in the same directory as Figure 8.8.

The method of passing information from an HTML form to a web script is called CGI (Common Gateway Interface). The Python module cgi provides functions that can be used to work with CGI data.

The first thing you’ll usually do when you are working with CGI is to load the data into a Python variable using the cgi.FieldStorage function. In Figure 8.9, it is put into a variable named form.

To get at the inputs of the form, you call them by name. This is why form inputs have a name attribute. Figure 8.9 has two examples of how to do this with a text input. A KeyError when using CGI data is caused when you ask for a form element that wasn’t actually sent from the form.

The square brackets, [ ], are used here to access elements of a dictionary. The Python code form["user"] is saying “in the dictionary named form, look up the entry for "user".” If you want to learn more about dictionaries, see Chapter 10 of How to Think Like a Computer Scientist.
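The dictionary lookup can be tried without a web server. In this sketch an ordinary Python dictionary stands in for the form data; the real cgi.FieldStorage object is not used, and the name fake_form and the values "Lenny" and "21" are made up for illustration. Looking up a name that was never sent raises KeyError, as described above. The try/except used to catch it is covered in Topic 9.5.

```python
# An ordinary dictionary standing in for submitted form data (an assumption
# for illustration; a real script gets its data from cgi.FieldStorage).
fake_form = {"user": "Lenny", "age": "21"}

# Look up an entry that exists.
print("Hello, " + fake_form["user"])

# Look up an entry that was never sent: Python raises KeyError.
try:
    fake_form["email"]
    missing = False
except KeyError:
    missing = True

print(missing)
```

With real form data there is one extra step: the entry is an object, and its .value holds the string the user typed.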


import cgi

form = cgi.FieldStorage()

# print HTTP/HTML header stuff

print """Content-type: text/html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html><head>

<title>A CGI Script</title>

</head><body>

"""

# print HTML body using form data

print "<p>Hello,", form["user"].value, "</p>"

print "<p>Wow,", form["age"].value, "is pretty old.</p>"

print "</body></html>"

Figure 8.9: A web script that uses the input from Figure 8.8

The .value part lets us get the actual input’s value. There are other properties of an input that we could use, but we won’t have to in this course.

Just like with raw_input, any input from a form is a string. If you want to treat it like a number, you’ll have to convert it, as we did in Topic 7.6.
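A short sketch of why the conversion matters. The string "21" is a made-up value standing in for something like form["age"].value, and the variable names are our own.

```python
# Form values arrive as strings, even when the user typed digits.
age_text = "21"            # made-up value standing in for form["age"].value

# Adding strings joins them; it does not do arithmetic.
joined = age_text + "1"

# Converting with int() gives a number we can calculate with.
age = int(age_text)
next_year = age + 1

print(joined)
print(next_year)
```

The first result is the string "211"; only the converted value gives the number 22.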

Check-Up Questions

• Modify the program in Figure 8.9 so it uses an if statement where the condition depends on the user’s input.

• Write your own CGI script that takes some form data and displays it.

Topic 8.4 Debugging CGI Scripts

You can debug CGI scripts using most of the same tricks as you use for other programs, as discussed in Topic 7.9 and in How to Think Like a Computer Scientist, Section 1.3 and Appendix A.

The biggest problem is getting error messages back. Generally, if you have an error in a web script, you don’t want every user that comes along to see all your error messages. So, by default, most web servers just display a generic error message that says something went wrong.

The web server for this course has been set up to show you the same error messages you would see if you were running the program on your own computer.

If you ever run Python programs on another web server, you should use the cgitb module, which will catch everything except syntax errors and show you the error messages. To use cgitb, add this to the start of your program:

import cgitb; cgitb.enable()

Also, remember that web scripts are just Python programs. You can run them with the Python interpreter on your own computer. Just remember that the output that you see there will be the HTML code itself.

The only problem is getting CGI data. If your script expects to use the cgi library to get CGI data from a web form, it won’t be available if you’re just running the script in the Python interpreter.

Another trick that can come in handy when you just can’t figure out what’s going on is to have a look at all of the output of your script. You can do this by tricking the web browser into displaying everything as plain text. Add this at the very start of your program:

print "Content-type: text/plain"

print

This will tell the browser that everything else (including the real Content-type line) should be displayed as plain text. You can then look through it for problems.

Check-Up Questions

• Run a CGI script on your computer and have a look at the output.

• Take a CGI program that you’ve been working with and remove the Content-type line. What happens?

• Introduce some different errors into a CGI script and see what happens when you run it.


Summary

There aren’t really any new programming skills in this unit. It is really about transferring the programming ideas from Unit 7 to web programming. You should be able to use all of the ideas from Unit 7 for web scripts and you should be able to get those scripts running on the web server.

Key Terms

• web scripts

• CGI scripts

• dynamically generated

• Content-type

• triple-quoted string

• HTML form

• input tag

• CGI


Unit 9

More Programming

Learning Outcomes

• Create and use functions in programs.

• Analyze the functions required to complete a task.

• Use for and while loops in a program.

• Create programs that store information in lists.

• Use exception handling.

• Analyze a problem and create a program that solves it.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• Read chapters 5 to 8 in How to Think Like a Computer Scientist. You can ignore these sections: 5.5, 5.7, 5.8, 6.3, 7.4, 7.6, 7.7, 7.9, 7.10, 8.8, and 8.10 to 8.16.

• Read section 11.5 in How to Think Like a Computer Scientist.

• Do Exercise 4.


Topic 9.1 Functions

We have already seen how several functions work in Python. In particular, we have used raw_input, type, and int. Each of these is built into Python and can be used in any Python program.

We have also used the function random.randint from the random library. Before we can use this function, the random module must be imported.

A function is given arguments when it is called. These are the values in parentheses that come after the name of the function. For example, in int("321"), the string "321" is the argument. Functions can have no arguments (like cgi.FieldStorage()), or they can take several.

Functions that return values can be used as part of an expression. We saw how the int function works, which returns an integer. It can be used in an expression like this:

x = 3*int("10") + 2

After this statement, the variable x will contain the number 32. In this expression the int function returns the integer 10, which is then used in the calculation.

Python functions can return any type of value including strings and floating point values.
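A few more expressions of the same kind, as a sketch. Besides int, it uses float, str and len, which are other built-in Python functions chosen here only as examples of different return types; the variable names are our own.

```python
# Each call returns a value that is then used in a larger expression.
x = 3 * int("10") + 2              # int returns the integer 10
half = float("7") / 2              # float returns the floating point value 7.0
label = "Total: " + str(x)         # str returns a string
length = len("hello") + 1          # len returns an integer

print(x)
print(half)
print(label)
print(length)
```

In each line the function call is evaluated first, and its return value takes the call's place in the rest of the expression.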

Defining your own functions

You can define your own functions as well. They are defined with a def block, as shown in Figure 9.1. The code inside the def isn’t executed right away. The function is defined and then run whenever it is called.

In Figure 9.1, the function is named read_integer and takes one argument that we’ll call prompt. Inside the function definition, prompt works like a variable. Its value is filled in with whatever argument is given when the function is called.

The next line is a triple-quoted string that describes the function. This is called a documentation string or docstring. It has no effect on the behaviour of the function, but it will help somebody reading your code figure out what it does.


def read_integer(prompt):
    """Read an integer from the user and return it."""
    input = raw_input(prompt)
    return int(input)

num = read_integer("Type a number: ")
print "One more is", num+1
num = read_integer("Type another: ")
print "One less is", num-1

Figure 9.1: A program with a function defined

Every function you write in this course must have a meaningful docstring. It will help us understand your code more easily when we mark it. It is also a good habit to get into. When you have to come back to some of your own code after a few weeks, you’ll be glad you included it.

The statements in the body of the function are what will be executed when the function is called. The return statement indicates the value that the function returns.

The main part of the program in Figure 9.1 makes two calls to the read_integer function. Here’s what the program looks like when it’s run:

Type a number: 15

One more is 16

Type another: 192

One less is 191

You should define functions to do tasks that you’ll have to do several times. That way you’ll only have to type and debug the code once and be able to use it many times. As a general rule, you should never copy-and-paste code. If you need to reuse code, put it in a function and call it as many times as necessary.

Defining functions is also useful when you are creating a larger program. Even if you’re only going to call a function once, it helps you break your program into smaller pieces. Writing and debugging many smaller pieces of code is much easier than working on one large one.


Topic 9.2 Local Variables

In Figure 9.1, the argument prompt is only available in the read_integer function. If we tried to use prompt outside of the function, Python would give the error

NameError: name 'prompt' is not defined

It does so because prompt is local to the read_integer function. In fact, any variables that are created within a function are local to that function. That means that they can’t be used outside of the function.

This is actually a very good thing. It means that when you write a function, you can use a variable like num without worrying that some other part of the program is already using it. The function gets an entirely separate thing named num, and anything named num in the rest of the program is undisturbed.
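A minimal sketch of this behaviour. The function double is hypothetical and exists only to show that a variable named num inside a function is separate from one named num in the main program.

```python
def double(value):
    """Return twice the given value."""
    num = value * 2        # this num exists only inside double
    return num

num = 10                   # a separate num in the main program
result = double(7)

print(result)
print(num)                 # still 10: the function's num did not disturb it
```

Calling double creates and discards its own num; the main program's num keeps the value it was given.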

Have a look at the program in Figure 9.2. When it’s run, its output is:

Content-type: text/html

<h1>Page title</h1><p>This is the first paragraph.</p>

There is no confusion between the variable open in the function and the one in the main program. When the function uses the variable open, it is totally unrelated to the one in the main program.

So, to use the buildtag function, you don’t have to worry about how it was implemented. All you have to know is what it does.

Also notice that the docstring in Figure 9.2 is much longer. It includes an example of what the function should do. Giving examples is a good idea because it gives you something to check when you test the function. The actual behaviour should match your expectations in the docstring.

There is actually a Python module called doctest that looks through your docstrings for things that look like examples of the function’s use. It then checks them to make sure the examples match what actually happens.
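The following is a rough sketch of doctest at work on a function like buildtag from Figure 9.2, under a few assumptions of our own: the local variables are renamed open_tag and close_tag, and the docstring example calls the function directly instead of printing, so the expected output is shown with quotes. The DocTestParser and DocTestRunner classes are part of the doctest module; how they are wired together here is just one possible way to run a single docstring.

```python
import doctest

def buildtag(tag, contents):
    """Create HTML with the given tag and contents.

    >>> buildtag("p", "Hello")
    '<p>Hello</p>'
    """
    open_tag = "<" + tag + ">"
    close_tag = "</" + tag + ">"
    return open_tag + contents + close_tag

# Pull the examples out of the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(buildtag.__doc__, {"buildtag": buildtag},
                          "buildtag", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print(results.failed)      # number of examples that did not match
print(results.attempted)   # number of examples that were tried
```

If the function's behaviour ever stops matching the example, the failed count becomes non-zero and doctest prints a report of what it expected and what it got.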


def buildtag(tag,contents):
    """Create HTML with the given tag and contents.

    This function returns a string with the given HTML
    tag and contents. For example,
    >>> print buildtag("p", "Hello")
    <p>Hello</p>
    """
    open = "<" + tag + ">"
    close = "</" + tag + ">"
    return open + contents + close

# build the page's contents
open = buildtag("h1","Page title")
par1 = buildtag("p","This is the first paragraph.")

# output the page (not valid HTML)
print "Content-type: text/html"
print
print open + par1

Figure 9.2: A program that takes advantage of local variables


num = int( raw_input("How high should I count? ") )

for i in range(num):
    print i,

Figure 9.3: Using a for loop

Topic 9.3 Iteration

We are still missing one major concept in computer programming. We need to be able to execute the same code several times (iterate). There are several ways to iterate in most programming languages. We will discuss two ways you can use in Python: for and while loops.

The for loop

If you know ahead of time how many times you want to execute some code, you can use the for loop. Figure 9.3 is a very simple program that uses a for loop.

The easiest way to construct a for loop is with the range function. When a for loop is given range(x), the loop body will execute x times. Figure 9.3 will look like this when it’s executed:

How high should I count? 12

0 1 2 3 4 5 6 7 8 9 10 11

Notice that the range starts from zero and counts up to x − 1. If we wanted to count from one, we could have written the loop like this:

for i in range(num):
    print i+1,
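Another option, shown here only as a sketch, is to give range a starting value: range(1, num+1) counts from 1 up to num. The value 5 stands in for whatever the user would type, and list(...) is used just to display all the values at once.

```python
num = 5   # stands in for the number the user would type

# range(num) counts from 0 up to num-1.
from_zero = list(range(num))

# range(1, num+1) counts from 1 up to num.
from_one = list(range(1, num + 1))

print(from_zero)
print(from_one)
```

Either way the loop body runs num times; only the values given to the loop variable differ.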

The while loop

If you don’t know how many times you want the loop body to execute, the for loop is hard to work with.


name = raw_input("What is your name? ")
while name=="":
    name = raw_input("Please enter your name: ")
print "Hello,", name

Figure 9.4: Using a while loop

Another way to do iteration is the while loop. To construct a while loop, you use a condition as you did in an if statement. The body of the loop will execute as many times as necessary until the condition becomes false.

For example, the program in Figure 9.4 will ask the user to enter his or her name. If a user just presses enter, the program will keep asking until the user provides a response. It looks like this when it’s executed:

What is your name?

Please enter your name:

Please enter your name:

Please enter your name: Zippy

Hello, Zippy
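A while loop does not have to involve user input. In this sketch (the limit of 1000 and the variable names are arbitrary choices of ours), the loop keeps doubling a value until the condition becomes false, and counts how many times the body ran.

```python
# Keep doubling until the value passes 1000; we don't need to work out
# in advance how many times the loop body will run.
value = 1
steps = 0
while value <= 1000:
    value = value * 2
    steps = steps + 1

print(steps)
print(value)
```

The condition is checked before each pass through the body, so the loop stops as soon as value is larger than 1000.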

The Guessing Game, again

Now that we can write loops, we can have another look at the guessing game example from Topic 7.3. That program only allowed the user to make two guesses before the game ended. Now that we have loops, we can make the game a little more interesting.

Figure 9.5 uses a while loop to let the user keep trying until they guess correctly. Here are some sample runs of this program:

I’m thinking of a number between 1 and 10.

Take a guess: 4

That was too small.

Guess again: 7

That was too small.

Guess again: 9

Yup, that’s it.

Thanks for playing the game!


import random
num = random.randint(1,10)

print "I'm thinking of a number between 1 and 10."
guess = int(raw_input("Take a guess: "))

# keep trying until they get it
while guess!=num:
    if guess<num:
        print "That was too small."
    else:
        print "That was too big."
    guess = int(raw_input("Guess again: "))

print "Yup, that's it."
print "Thanks for playing the game!"

Figure 9.5: A version of the guessing game using a while loop


import random
num = random.randint(1,10)

print "I'm thinking of a number between 1 and 10."
guesses = int( raw_input("How many guesses? ") )

# give them their guesses
for attempt in range(guesses):
    guess = int( raw_input("Guess number "
                 + str(attempt+1) + ": ") )
    if guess<num:
        print "That was too small."
    elif guess>num:
        print "That was too big."
    else:
        print "Yup, that's it."
        break # don't keep guessing after they get it.

print "Thanks for playing the game!"

Figure 9.6: A version of the guessing game using a for loop

I’m thinking of a number between 1 and 10.

Take a guess: 6

That was too small.

Guess again: 9

That was too big.

Guess again: 8

Yup, that’s it.

Thanks for playing the game!

Figure 9.6 uses a for loop to give users a maximum number of guesses before it stops. Note that when they guess correctly, the break statement is used. It ends the loop right away instead of making them guess more. You can use break any time to exit a loop.

Here are two runs of Figure 9.6:

I’m thinking of a number between 1 and 10.

How many guesses? 5


Guess number 1: 4

That was too small.

Guess number 2: 7

That was too big.

Guess number 3: 6

Yup, that’s it.

Thanks for playing the game!

I’m thinking of a number between 1 and 10.

How many guesses? 2

Guess number 1: 5

That was too big.

Guess number 2: 2

That was too small.

Thanks for playing the game!
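To watch break at work without typing guesses, here is a sketch of the same loop with everything fixed in advance. The secret number 6 and the scripted guesses are made up so that every run is identical, and the messages are collected in a list (lists are covered in Topic 9.4) before being printed.

```python
num = 6                      # the "secret" number, fixed so the run is repeatable
scripted = [4, 7, 6, 9]      # guesses that would normally come from raw_input

messages = []
for guess in scripted:
    if guess < num:
        messages.append("That was too small.")
    elif guess > num:
        messages.append("That was too big.")
    else:
        messages.append("Yup, that's it.")
        break                # leave the loop; the guess 9 is never looked at

for line in messages:
    print(line)
```

Only three messages are produced: once the third guess matches, break ends the loop and the fourth guess is never examined.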

Check-Up Questions

• In Figure 9.6, the user isn’t told what the correct answer is. Modify the program so it outputs the value of num before stopping.

• Modify Figure 9.6 so it outputs the value in num only if the user never guesses correctly.

Topic 9.4 Lists

Let’s have a closer look at the range function that we used to create for loops. By exploring in the Python Interpreter, we can see what it does:

>>> for i in range(7):
...     print i,
...

0 1 2 3 4 5 6

>>> range(7)

[0, 1, 2, 3, 4, 5, 6]

We haven’t seen anything like the value returned by range before in Python. This is a Python list.

A list in Python is an ordered collection of other Python values, written in square brackets. The values in a list can be any Python object:

>>> list = [10, 20, "cow", 34.5]

>>> list

[10, 20, 'cow', 34.5]

>>> list[2]

'cow'

>>> list[0]

10

As you can see here, you also use the square brackets to get a value out of a list. The first item in a list is accessed with [0]. The numbering starts at zero, not one.

So, the loop “for i in range(7)” is really looping for each entry in the list [0, 1, 2, 3, 4, 5, 6]. We can also use the for loop to loop over any other list:

>>> for name in ["Peter", "Paul", "Mary"]:
...     print "Hello", name
...

Hello Peter

Hello Paul

Hello Mary

Lists can also be used to store several values so you can use them later. One useful tool for doing this is the append function for lists, which adds an element to the end of the list.
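A small sketch of building up a list with append, using made-up words. It also uses sort, which rearranges a list into order, and len, which gives the number of items.

```python
values = []                 # start with an empty list
values.append("word")       # each append adds to the end
values.append("hello")
values.append("one")

print(values)
print(values[0])            # the first item is at position 0
print(len(values))

values.sort()               # rearranges the list into order
print(values)
```

Note that sort changes the list itself, so after sorting, values[0] is "hello" and no longer "word".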

Figure 9.7 is a program that uses a list to store the user’s input so it can be manipulated once it’s all there. When it runs, it looks like this:

Enter something, enter to exit: word

Enter something, enter to exit: hello

Enter something, enter to exit: one

Enter something, enter to exit: two

Enter something, enter to exit: internet

Enter something, enter to exit:

Here are they are unsorted:

word hello one two internet

Here are they are in order:

hello internet one two word


# put a list with no elements into 'values':
values = []

# get the user's input--go until they enter a blank.
while 1:
    val = raw_input("Enter something, enter to exit: ")
    if val=="":
        break
    else:
        values.append(val)

# output as-is
print "Here are they are unsorted:"
for num in values:
    print num,
print

# sort and output
values.sort()
print "Here are they are in order:"
for num in values:
    print num,

Figure 9.7: Using a list to store some values


age_string = raw_input("How old are you? ")
try:
    age = int(age_string)
    print "Next year, you'll be", age+1
except:
    print "That wasn't a number."

Figure 9.8: Catching an exception

Check-Up Question

• There are a few things in Figure 9.7 that are probably new to you. Use How to Think Like a Computer Scientist and the Python interpreter to make sure you understand all of the parts of that program.

Topic 9.5 Handling Errors

So far, whenever we did something like ask for user input, we have assumed that it will work correctly. Consider the program in Figure 7.4, where we got the user to type their age and converted it to an integer. If the user enters something that can’t be converted to an integer, the results are not very pretty:

What is your name? Lenny

Hello, Lenny

How old are you? old

Traceback (most recent call last):
  File "input2.py", line 4, in ?
    age = int( raw_input("How old are you? ") )
ValueError: invalid literal for int(): old

This isn’t very helpful for the user. It would be much better if we could give them another chance to answer or at least a useful error message. If this was a CGI script, we would need to format an error in HTML.

Python lets you catch any kind of error, as Figure 9.8 shows. Here are two sample runs of that program:


got_age = 0
while got_age==0:
    age_string = raw_input("How old are you? ")
    try:
        age = int(age_string)
        got_age = 1
    except:
        print "Please enter a number."
        got_age = 0

print "Next year, you'll be", age+1

Figure 9.9: Asking until we get correct input

How old are you? old

That wasn’t a number.

How old are you? 21

Next year, you’ll be 22

Errors that happen while the program is running are called exceptions. The try/except block lets the program handle exceptions when they happen. If any exceptions happen while the try part is running, the except code is executed. It is ignored otherwise.

Figure 9.9 shows another example. In this program, the while loop will continue until there is no exception. The variable got_age is used to keep track of whether or not we have the input we need.
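The figures use a plain except, which catches every kind of exception. As a variation (not the form used in the figures), an except line can name the exception it expects; for a failed int conversion that is ValueError, as the traceback earlier in this topic shows. The function to_age and its use of -1 as a signal for bad input are our own choices for this sketch.

```python
def to_age(text):
    """Return text converted to an integer, or -1 if it is not a number."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings like "old"
        return -1

print(to_age("21"))
print(to_age("old"))
```

Naming the exception means that other, unexpected errors are not silently hidden by the except block.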

Check-Up Question

• When using form data in a CGI script, if the input that you ask for wasn’t sent by the browser, you will get a “KeyError”. Modify Figure 8.9 so that it runs even if no values for user and age are sent. You can modify the HTML form that calls the script to test what you have done.


Topic 9.6 Solving Problems 1: Making Change

Students often find it difficult to make the leap from knowing many things they can do with a programming language to using one to solve a problem. Of course, what you really should be able to do is take a problem statement that describes what a program should do and turn that into a program that does it.

This isn’t easy. In fact, most Computing Science courses focus on various ways to attack problems and implement good solutions. That doesn’t mean you can’t do any real problem solving without a Computing Science degree.

The two examples that follow illustrate a process to use. They will give you some tips you can use to tackle similar problems on your own.

Problem Statement

Suppose we want to create a program that will tell someone how to make change.

For example, if we want to give $2.53 in change, we should give the customer a $2 coin, two 25¢ coins, and three 1¢ coins. We would like a program that will figure this out for us.

Problem Statement: Given an amount of money, output the smallest collection of coins that make that amount of change. The denominations of the coins are $2, $1, 25¢, 10¢, 5¢ and 1¢. (We won’t worry about bills, but it would be easy to add them to the program at the end.)

Can we do it?

First, we need to work out a method or algorithm to solve this problem. Once we know how to solve the problem, we can try to create a program to implement it. We really do have to do this step first. If we don’t know how to solve the problem, there’s little hope that we can explain to the computer how to do it.

Anyone who has worked a cash register should know how to do this. You start with the largest denomination of money you have. If you need that much change (or more), you pick up the appropriate number of coins. Then you move on to the next largest denomination and repeat.

So, in the $2.53 example, we would start with the $2 coin. Since $2.53 ≥ $2, we pick up a $2 coin. Now we have $0.53 left to give. We can’t take a $1 coin since $0.53 < $1. We can pick up two quarters, since $0.53 ≥ 2 × 25¢. Finally, we would take no 10¢ or 5¢ coins and pick up three 1¢ coins.

Write a recipe

The next thing we’re going to do is write a step-by-step method of solving the problem. It is something that we could give to a person to follow. It will get us one step closer to writing a program.

The first problem we have when we start doing these calculations is switching between dollars and cents. From now on, we will think of every amount of money as a number of cents, so we will think of 200¢ and 100¢ coins.

We have coins worth 200, 100, 25, 10, 5 and 1 cent each. For each coin:

1. Figure out how many coins will fit in the total change we need to give.

2. Give back that many coins.

3. Subtract the amount of money you gave back from the total.

When you’re done, you should have 0¢ left to return.

Start Coding

As we said in Topic 7.9, you shouldn’t try to write the whole program all at once. Start with a small part and test that. Keep adding and testing until you get the entire program working.

The first thing we can do is create a Python list that holds the denominations of coins we have:

coins = [200, 100, 25, 10, 5, 1]

We can easily change the list later. We can then implement the “do this for each denomination” part of the recipe with a for loop:

for amount in coins:

coins = [200, 100, 25, 10, 5, 1]
# total change to give (replace with user input)
change = 253
# check each denomination of coin
for amount in coins:
    # can we give this coin?
    if amount <= change:
        print "Giving a", amount, "coin."
        change = change - amount
print "Change left to give:", change

Figure 9.10: Making change: first attempt

A reasonable first step in writing this program is shown in Figure 9.10. When we run this program, it’s obvious that it’s not working properly. The problem is that only one 25 cent coin is given, not two.
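To see the problem concretely, here is a sketch of the first attempt’s logic rewritten as a function that returns what it gave out instead of printing it. It is written in Python 3 syntax, unlike the Python 2 figures in this guide, and the function name first_attempt is an assumption for illustration; the logic is meant to match Figure 9.10.

```python
def first_attempt(change):
    """Give at most one coin of each denomination (the flawed version)."""
    coins = [200, 100, 25, 10, 5, 1]
    given = []
    for amount in coins:
        # each denomination is only considered once
        if amount <= change:
            given.append(amount)
            change = change - amount
    return given, change


given, left = first_attempt(253)
print("Coins given:", given)
print("Change left to give:", left)
```

Because the if statement runs only once per denomination, the second quarter and two of the pennies are never handed over, and some change is still owed when the loop ends.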

Figure 9.11 shows the program with this problem fixed. Figure 9.12 shows the completely finished program, with user input filled in and nice output.

There are some features of Python in Figure 9.12 that we haven’t discussed before. We’ll leave it to you to figure them out if you’re curious.

Check-Up Questions

• The % operator in Python is called the string formatting operator. Find out a little about how it’s used and try it yourself. (You won’t be tested on the string formatting operator, but learning about it is good practice in working with the Python resources you have.)

• Modify the program in Figure 9.12 so it outputs the denominations as we would say them: for example, “2 dimes” or “3 quarters.”

coins = [200, 100, 25, 10, 5, 1]
# total change to give (replace with user input)
change = 253
# check each denomination of coin
for amount in coins:
    # how many of this should be given?
    num = change/amount
    # give the change
    print "Giving", num, amount, "cent coins."
    change = change - num*amount

Figure 9.11: Making change: fixed multiple coins

coins = [200, 100, 25, 10, 5, 1]
# get the amount from the user and convert to cents
# (round before converting: float("2.53")*100 is slightly less than 253)
dollars = raw_input("How much change do you need? $")
change = int( round(float(dollars)*100) )
print "Here's your change:"
# check each denomination of coin
for amount in coins:
    # how many of this should be given?
    num = change/amount
    # give the change
    if num>0:
        print " %i x $%.2f" % (num, amount/100.0)
    change = change - num*amount
# if there aren't 0 cents left to give, we want to know.
assert change==0

Figure 9.12: Making change: details filled in
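The same approach can be packaged as functions that return their results, which makes the program easier to test. The sketch below is written for Python 3 and is not one of the original figures; the names to_cents and make_change are assumptions. It rounds when converting dollars to cents because a value such as 2.53 is not stored exactly as a floating-point number, and it uses the built-in divmod to get the coin count and the remaining change in one step.

```python
COINS = [200, 100, 25, 10, 5, 1]


def to_cents(dollars):
    """Convert a dollar amount such as "2.53" to a whole number of cents."""
    return int(round(float(dollars) * 100))


def make_change(cents, coins=COINS):
    """Return a list of (count, denomination) pairs, largest coins first."""
    result = []
    for amount in coins:
        # num is how many of this coin fit; cents becomes what is left over
        num, cents = divmod(cents, amount)
        if num > 0:
            result.append((num, amount))
    assert cents == 0  # holds as long as the list includes a 1 cent coin
    return result


for num, amount in make_change(to_cents("2.53")):
    print(" %i x $%.2f" % (num, amount / 100.0))
```

Keeping the calculation separate from the printing means the result can be checked directly, for example by comparing make_change(253) against the coins worked out by hand earlier in this topic.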

Topic 9.7 Solving Problems 2: Displaying HTML Source

The problem given here is the first step in writing an HTML validator. We want to first ask users for a URL on a web form. When they submit the form, a web script will download the specified URL and display the HTML source.

We could extend this program to do various operations on the HTML code we retrieve, but for now we’ll just retrieve the HTML and display it.

Problem Statement

We first need to get the URL from the user with a form. After that, we’ll have to retrieve the URL and format it for display on a web page.

Can we do it?

At the moment, no. We don’t know how to retrieve a URL. We need some way to go and get one from a web server.

Fortunately, we don’t have to do this by ourselves; it would be far too difficult for this course. There is a module built into Python called urllib that will do it for us. Once we have that taken care of, we can handle everything else.

Write a recipe

So, the outline of our program will look something like this:

1. Determine the URL from the user.

2. Use urllib to retrieve the URL’s contents.

3. Display the contents for the user.

import cgi
form = cgi.FieldStorage()
print """Content-type: text/html

<title>URI Display</title>
<h1>URI Display</h1>
<blockquote>
"""
print form["uri"].value
print "</blockquote>"

Figure 9.13: Displaying source: basic CGI skeleton

Start Coding

We can start by producing the general format for a CGI script as in Figure 9.13. The HTML produced isn’t valid; the version on the web site produces valid XHTML. It was cut down here to save space.

Now, we have to start dealing with the urllib module to get the page the URL refers to. There is a reference to all of the modules that come with Python in the online documentation.

Reading this documentation and figuring out how to use one of the module’s functions takes some experience, so don’t worry if these references seem opaque at first.

The way we open up the URL so we can read it is this function call:

page = urllib.urlopen( form["uri"].value )

This code will put an object holding information about the page in the variable page. We then have to grab the content from the URL. According to the documentation, we can use the read() function:

content = page.read()

Now, content will be a string variable holding all of the HTML page. Figure 9.14 uses urllib to read the HTML file.

import cgi, urllib
form = cgi.FieldStorage()
print """Content-type: text/html

<title>URI Display</title>
<h1>URI Display</h1>
<blockquote>
"""
page = urllib.urlopen(form["uri"].value)
content = page.read()
print content
print "</blockquote>"

Figure 9.14: Displaying source: using urllib

If you run the program in Figure 9.14, you’ll see that we have missed a major part of displaying HTML source: the HTML contains tags like <h1>. If we just print them out in the CGI script, the web browser will interpret them as HTML tags. We want to display the tag itself, not the results.

So, we have to replace the special characters with entities (for example, all of the “<” will become “&lt;”). Again, this isn’t something we know how to do by ourselves. It is something that we could do by ourselves by going through the string one character at a time checking to see if it needs to be converted to an entity. Luckily, we don’t have to.

In the string library, there’s a function replace that does exactly what we want. It replaces all occurrences of one substring with another. So, to make the substitution “<” to “&lt;” we need the line

content = string.replace(content, "<", "&lt;")

This starts with the string content and replaces all "<" with "&lt;" (the result replaces the old value of content).

Figure 9.15 uses the replace function to translate all of the HTML special characters to the correct markup. In Python \n is used to indicate a newline character. This is equivalent to pressing Enter in a text editor. We replace those with <br /> to preserve the line breaks when the HTML is displayed.

import cgi, urllib, string
form = cgi.FieldStorage()
print """Content-type: text/html

<title>URI Display</title>
<h1>URI Display</h1>
<blockquote>
"""
page = urllib.urlopen(form["uri"].value)
content = page.read()
content = string.replace(content, "<", "&lt;")
content = string.replace(content, ">", "&gt;")
content = string.replace(content, "&", "&amp;")
content = string.replace(content, "\n", "<br />")
print content
print "</blockquote>"

Figure 9.15: Displaying source: inserting entities

There is a problem with the code in Figure 9.15. When we enter a URL and display it with the web script in Figure 9.15, the output looks like Figure 9.16. Do you know why?

Suppose part of the code we retrieved was “<body>.” After the first two replace functions, “&lt;body&gt;” would appear instead. Then, the next function would replace the ampersands with the &amp; entity (making it “&amp;lt;body&amp;gt;”). This would create a display like Figure 9.16.

We can fix this problem by replacing the ampersands first. Any “&” in the original code should be replaced with &amp; but not any of the entities that we produce. The fixed code is shown in Figure 9.17.
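The effect of the ordering can be checked on a short string. This sketch uses Python 3, where replace is called as a method on the string rather than through the string module used in the figures; the function names and the sample text are assumptions for illustration. Python 3’s standard library also provides html.escape, which makes the same three substitutions when called with quote=False.

```python
import html


def escape_wrong_order(text):
    """Replace & last, which damages the entities created for < and >."""
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace("&", "&amp;")
    return text


def escape_right_order(text):
    """Replace & first, so only ampersands from the original are changed."""
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    return text


sample = "<body> A & B"
print(escape_wrong_order(sample))
print(escape_right_order(sample))
print(html.escape(sample, quote=False))
```

The first line of output contains doubled-up entities, while the second and third lines agree with each other.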

Figure 9.16: Sample display of Figure 9.15 in a browser

import cgi, urllib, string
form = cgi.FieldStorage()
print """Content-type: text/html

<title>URI Display</title>
<h1>URI Display</h1>
<blockquote>
"""
page = urllib.urlopen(form["uri"].value)
content = page.read()
content = string.replace(content, "&", "&amp;")
content = string.replace(content, "<", "&lt;")
content = string.replace(content, ">", "&gt;")
content = string.replace(content, "\n", "<br />")
print content
print "</blockquote>"

Figure 9.17: Displaying source: fixed entity replacement

Problems

This code still isn’t perfect.

If the user enters a URL of something that isn’t HTML, like a JPEG image, our program will dutifully convert it to text and display it. It should probably check the MIME type of the URL after it has been fetched. If it’s text/html, we can display it. If not, there should be an error message.

We can determine the MIME type with the urllib module by examining page.info()['Content-type']. It will return a string containing the MIME type that came from the Content-type header. We would have to check this to make sure it starts with text/html.
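A minimal sketch of that check, in Python 3 syntax rather than the Python 2 of the figures. The helper name is_html is an assumption, and it works on the header string only, so it can be tried without fetching anything. Real headers often carry extra parameters, such as text/html; charset=utf-8, which is why the test looks only at the start of the string.

```python
def is_html(content_type):
    """Return True if a Content-type header value describes an HTML page."""
    if content_type is None:      # the server sent no Content-type header
        return False
    return content_type.strip().lower().startswith("text/html")


for header in ["text/html", "text/html; charset=utf-8", "image/jpeg", None]:
    print(header, "->", is_html(header))
```

In the script, the value passed in would be the string taken from the Content-type header of the fetched page; when the function returns False, the script can print an error message instead of the page contents.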

The other possible problem with this program is the amount of memory it uses. This line reads all of the contents of the file into the web server’s memory.

content = page.read()

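If the user gives us a large page, it could take a huge amount of memory. We don’t want to allow this to happen. We can work around the problem by processing the file one line at a time. The readlines function gives us the lines to loop over (it still builds a list of all of them first; calling readline repeatedly keeps only one line in memory at a time):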

page = urllib.urlopen(form["uri"].value)
for line in page.readlines():
    line = string.replace(line, "&", "&amp;")
    line = string.replace(line, "<", "&lt;")
    line = string.replace(line, ">", "&gt;")
    line = string.replace(line, "\n", "<br />")
    print line

Check-Up Question

• Modify the last version of the program to fix the two problems described above.

Topic 9.8 Coding Style

Whenever you’re writing a computer program of any kind, there’s more to worry about than getting it to work. You should also make sure your program is easy for you or someone else to read and follow.

There are no fixed rules for creating a good program. It’s a matter of style, just like writing good essays. But, there are some guidelines you should follow to get yourself started in the right direction.

• Use descriptive names for all of your variables and functions. Don’t use variable names like x and x2. Try to describe what the variable or function is for: total, calc_tax.

• Include a docstring for every function that describes its purpose and its arguments.

• Include comments (lines that start with #) to describe parts of your code that are hard to follow. For example,

# get the user's input and make sure it's a number

• Use functions to break tasks up into smaller pieces. If you find you are writing several screens of code in a single function, you should stop and try to pick out a few distinct jobs that this chunk of code is doing. Create separate functions for each of these tasks.

All of these tips will make it easier for someone else to read your code and also make it easier for you to come back and work on your code later. In addition, if your code is easier to read, it will probably be easier to debug.
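As an illustration of these guidelines, here is a small function written with a descriptive name, a docstring, and a comment on the one step that is not obvious. It is a sketch in Python 3 syntax; the function calc_tax, its arguments, and the 7% rate in the example call are made up for the example.

```python
def calc_tax(subtotal_cents, rate):
    """Return the tax owed on a purchase, in whole cents.

    subtotal_cents -- price before tax, as an integer number of cents
    rate -- tax rate as a fraction (0.07 means 7%)
    """
    # work in cents and round once at the end, so we never return
    # a fraction of a cent
    return int(round(subtotal_cents * rate))


subtotal = 1999
total = subtotal + calc_tax(subtotal, 0.07)
print("Total in cents:", total)
```

Compare this with a version that names the function f and its arguments x and y: the code would run identically, but a reader would have to work out what it is for.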

Check-Up Question

• Look back at the code in your Assignment 3. How easy is it to read? Have a look at someone else’s code. Can you follow it?

Summary

This section should get you to the point that you can write programs to solve interesting problems. You should be able to apply the skills you’ve learned here to web programming as well.

Key Terms

• function

• arguments

• return value

• docstring

• local variable

• iteration

• for loop

• while loop

• lists

• exception

• style

Unit 10

More Web Programming

Learning Outcomes

• Identify areas of programs that can cause security problems.

• Describe how cookies are used on the web and how they can carry state information.

• Identify some basic steps that can be taken to improve the performance of a web site.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• (optional) Read Chapter 24 in the Internet Book.

Topic 10.1 Security Basics

Whenever you create web scripts, you have to take security into account. You have to make sure that the users of your site cannot disable it or do anything else that they shouldn’t be allowed to do.

Creating truly secure programs is very hard, so programmers should never assume they have thought of all of the possible things that could go wrong.

First, you must remember that you should never trust the user’s input. The user may intentionally give incorrect input in an attempt to find security holes in your programs, or they could accidentally give bad input.

For example, suppose you are creating a CGI script that takes input from a page with a control like this:

<input type="text" name="in" maxlength="80" />

You might assume that the value of form["in"].value in your script would be at most 80 characters. But, what if someone else creates a form that points to your CGI script and doesn’t include the maxlength attribute? Would a 100-character string cause your program to misbehave? Whether it does or not, you should make sure you remember the possibility.
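One way to guard against this is to check the length in the script itself rather than relying on the form. A sketch in Python 3 syntax, where the helper name limit_length and the choice to cut the value off (rather than reject it with an error message) are assumptions:

```python
MAX_LENGTH = 80


def limit_length(value, max_length=MAX_LENGTH):
    """Return value cut down to at most max_length characters."""
    return value[:max_length]


too_long = "x" * 100          # what a form without maxlength could send
print(len(limit_length(too_long)))
print(limit_length("short input"))
```

Input that is already short enough passes through unchanged, so the check costs nothing for well-behaved forms.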

You should also be careful about your output format. For example, when we created the HTML source displayer in Topic 9.7, we converted any HTML tags to entities before displaying them. If we hadn’t and a user tried to display a web page that included JavaScript or VBScript, the scripts might have been executed instead of displayed. Again, it may or may not have actually been a security problem, but it’s better to be safe than sorry.

You have to be particularly careful if you are using any input from the user to create a filename or to build code that will be executed. You have to be very careful that the user’s input doesn’t trick your program into writing to a file or running some code that does something dangerous.

For instance, you might be tempted to write a CGI script that will let the user type some Python code and show them the result after it is submitted. You should never do anything like this. The user could type code using the os.remove function that deletes files and remove any files in your home directory. You should never, never execute any code that is given to you by the user. There are too many possibilities for damage for you to catch them all.
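For the filename case mentioned earlier, a common defensive approach is to accept only names that match a strict pattern and reject everything else. The sketch below is in Python 3 syntax; the function name safe_filename, the .txt extension, and the particular rule (letters, digits, underscores and hyphens, at most 32 characters) are assumptions chosen for illustration, not a complete defence.

```python
import re

# only letters, digits, underscore and hyphen; between 1 and 32 characters
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]{1,32}")


def safe_filename(user_input):
    """Return a filename built from user_input, or None if it looks unsafe."""
    if ALLOWED_NAME.fullmatch(user_input) is None:
        return None
    return user_input + ".txt"


print(safe_filename("notes_2004"))
print(safe_filename("../../etc/passwd"))
```

Listing what is allowed, instead of trying to list everything that is dangerous, means that input nobody anticipated is rejected by default.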

Another common problem occurs when the user gives much more input than the programmer expected. This is less of a problem in Python than in other programming languages since strings can hold any amount of information (until the computer runs out of memory). In some programming languages, like C, you generally have to indicate a maximum length for a string. If you try to store more information than you have allocated, results are unpredictable. A clever cracker can use this to gain unpermitted access. This is known as a buffer overflow error.

The bottom line is that creating secure programs (both CGI scripts and other programs) is hard. It’s easy to forget some small little corner that would give a determined cracker somewhere to get in. Since we haven’t discussed writing to files or interfacing with other programs, you probably can’t create programs that are too dangerous, at this point.

Topic 10.2 Cookies

When a user accesses web pages from your site, there isn’t really any way for you to connect one page view to another. There’s no way of knowing if the same user is following links from one page to another; all you know is which pages are being loaded. You can only guess that they are being loaded by the same user. In short, the web is stateless: every page is loaded independently.

This fact can cause problems if you need to follow a particular user from one page to the next. For example, suppose you were creating an online shopping application. You would need to keep track of the items that users placed in their “shopping carts” so you could give them the right total when they eventually got to the “checkout” page.

One way to keep track of a user on the web is to use cookies. Cookies are small pieces of information that a web site can store on the user’s computer. Once a web server has stored a cookie, it can retrieve its value and use it to determine who the user is, what is in their shopping cart, and so on.

Because they can be used to track users, many people don’t like accepting cookies from all web sites. Some companies, Doubleclick in particular, have huge databases of people’s browsing habits that they have collected through their cookies.

Mozilla includes sophisticated tools for managing who can set a cookie in your web browser. It also contains a “Cookie Manager” that you can use to view and delete the cookies stored in your browser.

Cookies can be created when a page loads, as part of the HTTP conversation between the server and client. Cookies come with an expiry date. They are stored on your hard drive until they expire.

There is a Python module named Cookie that can be used in CGI scripts to create cookies. We won’t discuss it here, and you will not be expected to use it in this course.
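For the curious, the following sketch shows roughly what creating and reading a cookie looks like. It is optional material and uses Python 3, where the module is named http.cookies rather than Cookie; the cookie name user, its value, and the one-hour lifetime are made up for the example.

```python
from http.cookies import SimpleCookie

# build the header a script would send to store a cookie
outgoing = SimpleCookie()
outgoing["user"] = "1234"
outgoing["user"]["max-age"] = 3600      # lifetime in seconds (one hour)
set_cookie_header = outgoing.output()
print(set_cookie_header)

# parse the Cookie header a browser would send back on a later request
incoming = SimpleCookie("user=1234")
print(incoming["user"].value)
```

The first part produces a Set-Cookie header for the response; the second part shows the server recovering the stored value from a later request.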

An alternative to cookies that you can use when you are creating web sites is the hidden type of <input>. A hidden control is invisible to the user, but it contains a value that can be read when the form is submitted. You can create a hidden control on a page with information that will let you keep track of a user. For example, you could use a CGI script to create a page that contains this HTML:

<input type="hidden" name="user" value="1234" />

When the user submits this form, you will know that it came from user number 1234. You could then repeat this technique and keep track of the user through several forms.
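A script could generate such a control with a small helper. The sketch below is in Python 3 syntax, and the function name hidden_field is an assumption. It escapes the name and value with html.escape so that a value containing quotation marks cannot break out of the attribute.

```python
import html


def hidden_field(name, value):
    """Return the HTML for a hidden form control holding value."""
    return '<input type="hidden" name="%s" value="%s" />' % (
        html.escape(name, quote=True),
        html.escape(str(value), quote=True),
    )


print(hidden_field("user", 1234))
```

Keep in mind that a hidden value is still sent back by the browser, so it counts as user input: someone can change it before submitting the form, and the script should not trust it blindly.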

Check-Up Question

• Use Mozilla’s “Cookie Manager” to see what cookies you have stored.

Topic 10.3 Performance

When you are designing and testing a web site, you will often be the only one accessing it. Everything should be very fast when the web server only has to respond to one user. Will it still be fast if 1000 people try to access your site at once?

It’s really impossible to know the answer to this question until you have 1000 people trying to access your site, but there are a few things you can do to help keep your site running smoothly.

First and foremost, you shouldn’t use dynamic content unless it’s really necessary. It is much faster for a web server to pick an HTML file from the disk and send it out than it is to start a program and wait for its output. Basically, if you can get away without CGI scripts or other methods of creating dynamic content, you should.

A well-known technology news site, slashdot.org, often sends so many visitors to a site it links to that the web servers crash (or run so slowly it seems like they have crashed). Most sites that can’t handle this load have a lot of dynamic content. Servers with static pages typically only slow down when all of their bandwidth is used.

The second thing you can do is to make the code itself more efficient. Again, there are no fixed rules for how to do so.

There are two aspects that affect the efficiency of a program. The first is the amount of calculation it needs to do. You should make sure that the code inside of loops in your program doesn’t repeat any more than necessary.
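For example, a calculation whose result is the same every time through a loop can be moved in front of it. The sketch below is in Python 3 syntax, and the function names, prices and tax rate are made up; both versions return the same totals, but the second works out the tax multiplier once instead of once per item.

```python
def totals_slow(prices, rate):
    """Add tax to each price, recomputing the multiplier every time."""
    result = []
    for price in prices:
        multiplier = 1 + rate / 100.0    # same value on every pass
        result.append(price * multiplier)
    return result


def totals_fast(prices, rate):
    """Add tax to each price, computing the multiplier only once."""
    multiplier = 1 + rate / 100.0        # moved out of the loop
    result = []
    for price in prices:
        result.append(price * multiplier)
    return result


prices = [100, 250, 1999]
print(totals_slow(prices, 7) == totals_fast(prices, 7))
```

With three prices the difference is negligible; the saving only starts to matter when the loop runs many thousands of times or the repeated step is expensive.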

The second aspect is the amount of memory the program uses. Don’t store unnecessary information. A few numeric variables won’t matter, but long lists and strings can take up a lot of memory quickly. For example, in Topic 9.7, we mentioned that it would be better to read in one line at a time instead of the whole file at once.

There are many tips and techniques for speeding up Python code that we can’t go into here.

One of the big problems for high-volume sites is the time it takes to start up the web script. Every time you start a Python program, there is a little delay while the web server finds the Python interpreter and loads the program into memory. If you have hundreds of requests coming in, it can create a lot of work for the server.

There are ways to let the web server keep the web script in memory and only restart the appropriate part of the program when it is needed. These methods eliminate most of the startup time for each access to a script and often result in the web server being able to handle 10 times as many requests.

For Python programs, the Apache web server has an optional extension called “mod_python.” Using mod_python requires a change in the way your program accesses CGI data, so your code must be modified. We haven’t been using it on the course web server for this reason. If you were planning to create a large-scale web site with Python, mod_python would definitely be worth looking at.

Check-Up Question

• Go back to the programs you’ve written so far. Do you see any ways to make them run faster or use less memory?

Summary

This unit is much less technical than Unit 8. You probably won’t come away with specific programming skills. On the other hand, you should have learned more about what can be done with web scripts and some of the problems that can arise when creating them.

Key Terms

• buffer overflow

• cookie

• stateless

Unit 11

Internet Internals

Learning Outcomes

• Describe the way resources are transmitted with HTTP.

• Describe some of the other things that can be done with HTTP.

• Discuss the DNS system and describe how domain names are translated to IP addresses.

• Identify uses of encryption on the Internet and outline why it is necessary.

• Identify the parts of a URL and their uses.

Learning Activities

• Read this unit and do the “Check-Up Questions.”

• Browse through the links for this unit on the course web site.

• (optional) Browse Chapters 18, 29, and 30 in the Internet Book.

• Do Assignment 4.

Topic 11.1 HTTP

Recall way back in Unit 1 when we discussed the conversation that your web browser and a web server have when transferring a web page. The basics of the conversation can be seen in Figure 1.2.

In Topic 1.5, we got as far as indicating that the browser asks the server given in the URL for the path from the URL. The server responds with the content of the page or image itself. You will also remember that HTTP, the HyperText Transfer Protocol, is the way these conversations happen.

We’re going to use the same example. We’re fetching a web page and one image from it, starting with the HTML page at http://www.sfu.ca/about/index.html.

First, the browser has to get the HTML file. This is the URL that it would get either from the user or from a link on another page. The browser would look at the URL and determine that it must contact the web server named www.sfu.ca. Once it contacted the web server, it would send the HTTP request

GET /about/index.html HTTP/1.1
Host: www.sfu.ca

The first line here is the request line. It tells the web server that we want the file with the path /about/index.html. The web server must translate this path and find the actual file on the disk. If the request is for a CGI script, it must be executed; if it’s for a regular file, it will be read from the disk and sent back.

The “GET” is the HTTP method and indicates that we want to retrieve a page. There are several other HTTP methods that we won’t discuss. Finally, “HTTP/1.1” tells the server that we’re speaking version 1.1 of HTTP.

After the request line, there can be several HTTP headers. The only one here, Host, gives the server name.

You might think that the Host header is unnecessary because the web server already knows its own name. Usually, that’s the case, but it is possible for one computer to respond to requests for several different domains. For example, the same web server has been used for both www.cs.sfu.ca and www.fas.sfu.ca. For these web servers, Host can be necessary.
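To make the format concrete, the request above can be assembled as a string. This is a sketch in Python 3 syntax with an assumed function name, build_get_request; it only builds the text and does not contact any server. HTTP ends each line with a carriage return and a newline (written \r\n in Python), and a blank line marks the end of the headers.

```python
def build_get_request(host, path):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        "GET %s HTTP/1.1" % path,    # request line: method, path, version
        "Host: %s" % host,           # which site on this server we want
        "",                          # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines)


request = build_get_request("www.sfu.ca", "/about/index.html")
print(request)
```

A browser sends more headers than this, but the request line and the Host header are the two parts discussed in this topic.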

When the web server receives the request, it will send a response. The response in this case would look something like this:

HTTP/1.1 200 OK
Date: Thu, 01 Jan 2004 11:59:59 GMT
Last-Modified: Mon, 29 Dec 2003 01:00:00 GMT
Server: Apache/1.3.29 (Unix)
Content-Type: text/html

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>... (rest of the HTML for the page)

The first line here is the status line and indicates that it’s an HTTP version 1.1 response. The response code 200 indicates an “OK,” that the page has been found and it’s on its way. There are many other response codes that indicate various errors and messages to the browser.

After that, there are several headers that give various information about the response. The Server header is an indicator of what web server software is being used. The Apache web server is the most common on the internet, running on about two-thirds of all web servers.

The Content-type header indicates the MIME type of the resource. When we write web scripts, we have had to generate this header ourselves; for static files, it is generated by the web server.

The blank line separates the HTTP headers from the content itself. The stuff after the blank line is the contents of the HTML file that is sent to the browser.
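The structure just described (status line, headers, blank line, content) can be taken apart with a few string operations. The sketch below is in Python 3 syntax; the function name parse_response is an assumption, and it treats the response as text for simplicity, whereas a real client receives bytes and has to handle more cases than this.

```python
def parse_response(raw):
    """Split a raw HTTP response into (status code, headers, body)."""
    head, body = raw.split("\r\n\r\n", 1)     # split at the first blank line
    lines = head.split("\r\n")
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, value = line.split(":", 1)
        # header names are not case-sensitive, so store them in lowercase
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


raw = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Apache/1.3.29 (Unix)\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>...</html>"
)
code, headers, body = parse_response(raw)
print(code, headers["content-type"])
print(body)
```

Everything before the first blank line is information about the response; everything after it is the resource itself.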

Once the web browser gets the HTML page, it will notice an <img> tag that points to the image with URL http://www.sfu.ca/hp/images/sfu.jpg. Then, a similar conversation will take place to get the image. First, the browser will make its request:

GET /hp/images/sfu.jpg HTTP/1.1
Host: www.sfu.ca

The server will respond:

HTTP/1.1 200 OK
Date: Fri, 02 Jan 2004 00:00:01 GMT
Last-Modified: Mon, 29 Dec 2003 01:02:03 GMT
Server: Apache/1.3.29 (Unix)
Content-Length: 9325
Content-Type: image/jpeg

@PJFIF@AB@@$1=wW7
... (rest of the JPEG data)

The only real difference here is that the data that’s sent back has typeimage/jpeg instead of text/html. That will change the way the browser in-terprets the body of the response.

If you were to look at the whole body of the response the way it’s presented here, it would look like random characters. As far as the web server is concerned, any file that is being sent is just a string of bytes. When you look at it as text, you’re taking the bytes from the JPEG file and using ASCII to convert them to characters. This isn’t how you’re supposed to look at a JPEG file, so you won’t get much out of it. The web browser has to interpret these bytes and display them properly as pixels on the screen.
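The following short Python 3 sketch illustrates the idea of reading the same bytes two ways. The byte values are the usual opening bytes of a JPEG (JFIF) file; the choice of Latin-1 for the text view is an assumption made for the example, since plain ASCII only covers byte values 0 to 127.

```python
# The first bytes of a typical JPEG (JFIF) file: a marker, a length, then "JFIF".
data = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10]) + b"JFIF\x00"

# Forcing the bytes into characters gives nonsense around the one readable word.
# Latin-1 is used here because it assigns some character to every byte value.
as_text = data.decode("latin-1")
print(repr(as_text))

# A program that understands JPEG looks at the leading marker bytes instead.
looks_like_jpeg = data[:2] == b"\xff\xd8"
print(looks_like_jpeg)  # prints: True
```

The bytes themselves never change; only the interpretation does, which is why the Content-Type header matters so much.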

When you are surfing the Internet, you might occasionally click a link and see a screen of random characters when you expected an image, a movie, or something else. In this case, the web server has probably sent the wrong MIME type, so your browser tries to display the file as text. If you save the file to disk and open it with an appropriate program, it should be fine.

Topic 11.2 HTTP Tricks

So far, we have only thought of HTTP as a way to transfer web pages from the server to the client. This is its primary job and all you usually have to worry about. But there are many other things that can be done with HTTP that are worth knowing about, particularly if you want to create “real” web sites. You should keep these possibilities in mind and remember when they are needed.

They are all achieved by slight variations in the HTTP conversation we discussed in the previous topic. We won’t worry about the details of the conversations for each of these. If you’d like details, see the course web site.

Instructions about how to do these things on the course web server can also be found on the course web site.

Caching

If your web browser needs the same page twice, do you really need to transfer the whole thing over the network each time? You could store a copy on the hard drive and use it when you need it again. This is called caching.

Caching decreases the amount of network traffic and makes pages appear faster. It is even more important for files like graphics and style sheets that appear on every page.

Caching can cause a problem if a file changes. Suppose a web site’s style sheet changes between visits. How is your web browser supposed to know it needs to reload the style sheet?

A web browser can ask the web server to send only the HTTP headers for a file without its actual content. The response will include the Last-Modified header, which will tell the browser when the file was last changed. If the browser’s cached copy is older, it needs to be reloaded.
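The comparison a browser makes can be sketched in a few lines of Python 3. This is only an illustration: the function name cached_copy_is_stale and the second date are invented for the example, and real browsers use additional rules.

```python
from email.utils import parsedate_to_datetime

def cached_copy_is_stale(cached_last_modified, server_last_modified):
    """Return True if the server's file is newer than the cached copy."""
    # HTTP dates such as "Mon, 29 Dec 2003 01:02:03 GMT" use the same format
    # as email dates, so the standard email.utils parser can read them.
    cached = parsedate_to_datetime(cached_last_modified)
    current = parsedate_to_datetime(server_last_modified)
    return cached < current

# The cached copy was saved when the file had the first date; the server now
# reports the second (made-up) date, so the copy must be reloaded.
print(cached_copy_is_stale("Mon, 29 Dec 2003 01:02:03 GMT",
                           "Mon, 05 Jan 2004 09:30:00 GMT"))  # prints: True
```

If the two dates are equal, the function returns False and the cached copy can be used as it is.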

Usually, web browsers won’t ask for new headers for every file with every view. They usually cache files for at least a few hours and assume they haven’t changed. The user will have to click the “Reload” button to see the new version if there is a change.

In addition to your web browser’s cache, your ISP might have a cache server that stores web pages that their subscribers have recently accessed. This is one way they can decrease the amount of information that must be transmitted between them and the outside Internet and thus reduce their expenses. Cache servers can be entirely invisible to the users.

Web scripts don’t usually generate a Last-Modified header since their content could change every time they are viewed. So, web scripts aren’t usually cached.

When you create web pages, you should keep caching in mind. For example, if you use the same style sheet on every page, most users will only have to transfer it once. It will save them time and reduce your bandwidth costs. If you use a different style sheet on each page, every one will be loaded separately.

The same is true for images. If you use a single image with a logo on every one of your pages, users probably won’t notice the time it takes the image to transfer. If you use a different image on every page, the images will have to be loaded every time the user goes to a new page.

Redirects

As you have traveled the WWW, you have probably followed a link to a site and found a message like “We have reorganized our page, so you should go to. . . .” You have probably also seen many “Not Found” errors because pages have moved to another location.

This is a sign that people don’t understand HTTP or the features of their web server. People browsing a web site should never have to see messages like this.

The web server can handle moved pages transparently. When the user requests a page, the web server can indicate that the page has moved to another URL. The web browser will follow the redirect without the user ever seeing it. Search engines also understand automatic redirects and will take them into account when returning search results.

Without an automatic redirect, all links from other web sites or search engines will take users to the old page. When somebody else links to your web site, it is a good thing and you don’t want to break those links.

The way a redirect is set up depends on the web server you are using.
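Whatever the server, the response it sends for a moved page has roughly the shape sketched below in Python 3. The response codes 301 and 302 and the Location header are standard parts of HTTP; the function name redirect_response is made up for illustration, and the new URL is only an example.

```python
def redirect_response(new_url, permanent=True):
    """Build the text of an HTTP response that sends the browser to new_url."""
    # 301 means the page has moved for good; 302 means a temporary move.
    status = "301 Moved Permanently" if permanent else "302 Found"
    lines = [
        "HTTP/1.1 " + status,
        "Location: " + new_url,  # where the page lives now
        "Content-Length: 0",     # no body is needed
        "",                      # blank line that ends the headers
        "",
    ]
    return "\r\n".join(lines)

# Example only: pretend an old page has moved to this URL.
print(redirect_response("http://www.sfu.ca/about/index.html"))
```

When the browser sees a response like this, it immediately requests the URL in the Location header instead of showing anything to the user.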

Content Negotiation

As you should know by now, different web browsers have different capabilities. Some can display SVG images; some can’t. Some can display Unicode characters, and some can’t. It’s possible that every person browsing the web has a different combination of files that they consider acceptable.

There is a way of dealing with these differences using HTTP that isn’t very well known. It is possible for a web client and server to exchange information about what files are acceptable and decide on the best version. This process is called content negotiation.

When a web browser requests a web page, it can send along information about what it will accept in three areas:

• File type. The browser sends a list of MIME types that it knows how to handle. Usually browsers send information about the types that they can handle internally and indicate that they can take any other type of file if it’s all that’s available. So, the browser might prefer an HTML file, but if all that’s available is a PDF, it will accept that and try to find a program to open it. Browsers that can display an SVG image might be sent that, while other browsers would receive the PNG version of the same image.

• Language. The web browser should have a list of the languages that the user knows how to read and an order of preference. The web server will try to send a page that the browser can actually read. On multilingual web sites, using these preferences lets different people read the web site in different languages with no changes to the site and no effort for the browser.

• Character set. Some web browsers know about Unicode and some don’t. There are also several other character sets used for various reasons that browsers may or may not be able to handle.

When these three kinds of information are combined, content negotiation allows web authors to create web sites with many different visitors in mind. All visitors will see the content that is best suited to them.
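For the file type case, the decision can be sketched roughly as follows in Python 3. Browsers send their list of acceptable MIME types in the Accept request header, where an optional q value between 0 and 1 marks less preferred types. The function names and the sample header values here are invented for illustration, and real servers apply more detailed rules than this simplified version.

```python
def parse_accept(header):
    """Turn an Accept header into a list of (mime_type, preference) pairs."""
    prefs = []
    for item in header.split(","):
        parts = item.strip().split(";")
        mime = parts[0].strip()
        q = 1.0  # a type with no q value is fully preferred
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                q = float(value)
        prefs.append((mime, q))
    return prefs

def choose_variant(header, available):
    """Pick the available MIME type that the browser prefers most."""
    best, best_q = None, 0.0
    for mime, q in parse_accept(header):
        for candidate in available:
            exact = mime == candidate
            wildcard = mime == "*/*" or (
                mime.endswith("/*") and candidate.startswith(mime[:-1]))
            if (exact or wildcard) and q > best_q:
                best, best_q = candidate, q
    return best

variants = ["image/svg+xml", "image/png"]
# A browser that can display SVG gets the SVG version ...
print(choose_variant("image/svg+xml, image/png;q=0.8, */*;q=0.1", variants))
# ... while one that only knows PNG gets the PNG version.
print(choose_variant("image/png, */*;q=0.1", variants))
```

The "*/*" entry with a low q value corresponds to the browser saying it will take any other type of file if that is all that is available.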

Once content negotiation is activated in the web server, the only change that must be made on the web site is to link to files without an extension. For example, instead of linking to page.html, the link would simply point to page. The browser and server would then negotiate for the HTML file or another variant if one is available.

Content negotiation is not used frequently on the web, partially because most people don’t know it is possible. Even for those that do, content negotiation presents problems.

When web browsers are installed, they usually assume that the system’s default language is the only one that the user can read. If users don’t change this setting and go to a web site written in another language, they will get an error message that says there is no page that’s acceptable.

Content negotiation has been used on the course web site. If you have a non-English version of Windows or the MacOS, you probably had to change your browser’s settings to view the web site. Requiring such changes isn’t practical for commercial sites, so they don’t generally use content negotiation.

A side benefit of content negotiation is that you don’t have to use file extensions in your URLs. If you have used content negotiation and want to turn a static HTML page into a Python web script, you would just have to replace the page.html file with page.py. Since all of the links only point to page, they will all still work correctly.

Check-Up Question

• Can you determine the IP address of your ISP’s name servers? Can you determine your computer’s IP address (when you’re connected)?

Topic 11.3 DNS

In Topic 1.1, we talked about the names that computers on the Internet have, like www.sfu.ca and h24-84-78-194.vc.shawcable.net. These names are used by people to refer to a computer connected to the Internet.

Computers don’t use these names to refer to each other or to determine who to ask for specific information. Computers use IP addresses to communicate with each other. An IP address is a set of four numbers, each between 0 and 255. The IP address of www.sfu.ca is 142.58.200.82 as this guide is being written.
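As a small illustration, Python 3’s standard ipaddress module can check whether a piece of text is an address of this form. The helper name is_ipv4_address is made up for the example.

```python
import ipaddress

def is_ipv4_address(text):
    """Return True if text is four numbers from 0 to 255 separated by dots."""
    try:
        return ipaddress.ip_address(text).version == 4
    except ValueError:
        # Raised for anything that is not a valid IP address.
        return False

print(is_ipv4_address("142.58.200.82"))   # prints: True
print(is_ipv4_address("142.58.200.256"))  # prints: False (256 is too large)
print(is_ipv4_address("www.sfu.ca"))      # prints: False (a name, not a number)
```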

When you give your computer a domain name like www.sfu.ca, it first needs to translate it to the corresponding IP address. There are too many names on the Internet to store them all on your computer and they can change at any time, so you’d never be able to keep your list up to date.

The translation is done by computers on the Internet called domain name servers (DNS). These computers run DNS software that can respond to queries from your computer for name to number conversions.

Because there are so many names and IP addresses on the Internet, we still can’t expect a single DNS server to keep track of everything. Instead, the whole DNS system is broken up into a hierarchy.

When you connect your computer to the Internet, it will be given the IP addresses of two or three DNS servers run by your ISP. Sometimes, you have to enter these manually into your network settings. Your computer will ask these DNS servers any DNS questions it has. Here is an example of what happens when you need to do a name to IP address conversion:

1. You type the URL http://www.sfu.ca/about/index.html into your web browser. The web browser figures out that the server name here is www.sfu.ca. It asks your ISP’s DNS server for the number that goes with www.sfu.ca.

2. The ISP’s DNS server starts by looking up the .ca domain. It can find out who to ask about the .ca domain by asking one of the root name servers. The server must have been configured with a list of the root name servers. There are 13 root name servers spread across the world.

It will contact one of these root name servers and ask it where to find out about the .ca domain. There are currently seven DNS servers that can respond to queries about the .ca domain.

3. Your ISP’s DNS server will then connect to one of the .ca DNS servers and ask about .sfu.ca. It will be given the IP address of one of the three DNS servers for .sfu.ca.

4. Finally, the DNS server can ask one of the DNS servers for .sfu.ca about www.sfu.ca. It will be given the IP address 142.58.200.82.

5. Your computer can then connect to the web server on 142.58.200.82 and ask for the web page with path /about/index.html.

You may have noticed that there were several DNS servers that could be asked at each step, which minimizes the chance of the process failing. All of the servers at any level will give the same results for any query, so if one is down, the user won’t notice. In fact, the root servers are designed so that two-thirds of them can fail before Internet users would even notice there was a problem.
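The lookup steps above can be imitated with a toy model in Python 3. Nothing here touches the network: the ZONES table is a made-up stand-in for the real root, .ca, and .sfu.ca servers, and the function name resolve is hypothetical. Only the final address comes from the example in the text.

```python
# Toy data standing in for the real servers. Each "zone" knows either who to
# ask next ("refer") or the final address ("answer").
ZONES = {
    ".": {"ca": ("refer", "ca")},                           # root name servers
    "ca": {"sfu.ca": ("refer", "sfu.ca")},                  # .ca servers
    "sfu.ca": {"www.sfu.ca": ("answer", "142.58.200.82")},  # .sfu.ca servers
}

def resolve(name):
    """Follow referrals from the root down; return (ip_address, steps)."""
    labels = name.split(".")
    zone = "."
    steps = []
    # Ask about longer and longer endings of the name: ca, sfu.ca, www.sfu.ca.
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:])
        if suffix not in ZONES.get(zone, {}):
            raise LookupError("no information about " + suffix)
        kind, value = ZONES[zone][suffix]
        steps.append((zone, suffix))
        if kind == "answer":
            return value, steps
        zone = value  # a referral: ask the next level down
    raise LookupError("no address found for " + name)

address, steps = resolve("www.sfu.ca")
print(address)  # prints: 142.58.200.82
for zone, question in steps:
    print("asked the", zone, "servers about", question)
```

In the real system each entry in the table would be several servers rather than one, which is what makes the redundancy described above possible.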

If you have problems with DNS lookups, the cause is almost always a problem with your ISP’s DNS servers. Since the DNS lookup is the first thing that your computer has to do whenever you request anything on the Internet, a problem with DNS is often the first symptom of a problem with your network connection.

Topic 11.4 Security and Encryption

Have a look at Figure 1.1 again. As we have just seen, there are many other computers between your home computer and SFU that pass the data you send from one to the other. You probably have no idea who runs these intermediate computers. There’s no reason you would, and it’s not easy to determine. Most people don’t care.

All of the information that you send across the Internet passes through intermediate computers like these. It’s possible for any of these computers to watch all of the information that passes through them and scan for things like passwords and credit card numbers that you want to keep secret.

When the Internet was originally designed, there were only a few sites connected, mostly universities. Everybody knew and trusted the administrators at the various sites that might be responsible for passing along their data. Now, with millions of people sending information across the Internet at any time and thousands of service providers, blind trust isn’t really an option.

It is possible to keep secrets, even when you are passing information around this way. The information can be encrypted so that only the intended recipient can decrypt it and read the contents. The computers in between can pass the information along, but they can’t (easily) decrypt it to see what’s inside. The details of how these encryption methods work are beyond the scope of this course.

Most information that is transmitted over the Internet isn’t encrypted. HTTP (web) traffic and emails are sent without encryption.

Secure HTTP (HTTPS) is a version of HTTP where all information is encrypted when it is transmitted. Using HTTPS ensures that any sensitive information you send will only be seen by the intended receiver. URLs that begin with https:// instead of http:// use the encrypted version of HTTP. HTTPS isn’t always used because encrypting and decrypting data is extra work for web servers, so it costs more money in the long run.

Most web browsers have a small icon in the bottom of their window to indicate whether or not they are using a secure connection. You should keep your eye on this icon when sending passwords and other sensitive information.

You may have used FTP (File Transfer Protocol) to transfer files over the Internet. So, you may have been wondering why you can’t use FTP to transfer files to the course web server.

FTP isn’t encrypted. Since you have to provide your password to upload files into your account, it would be sent unencrypted every time you logged in. The technical staff in Computing Science don’t allow FTP for security reasons. The SCP (Secure CoPy) protocol, like the closely related SFTP, does use encryption, so it was chosen as an alternative.

It is also possible to encrypt email by adding an encryption program to your email program. The most common encryption program for email is PGP (Pretty Good Privacy).

Whenever you send information like credit card numbers or sensitive passwords over the Internet, you should make sure you are using some kind of encrypted connection.

Check-Up Questions

• You probably use some programs not listed here on the Internet. Do they encrypt the data they send? Are there secure versions?

• Try to find an encryption plug-in for your email client.

Topic 11.5 URLs

We have learned a lot about the web since we first discussed URLs in Topic 1.3. As we have discussed different things that can be done on the web, more has been added to our URLs.

Query String

You might have noticed that when we create forms and use them to access web scripts, the data from the form is transmitted as part of the URL. For example, if we had a form with two text inputs named name and age, as in Figure 8.8, we would submit it and see a URL like this in the browser’s location bar:

http://cmpt165.cs.sfu.ca/~student/cgi.py?name=Skippy&age=19

The part after the ? is called the query string. It contains an encoded version of the CGI data. The different fields are separated by an ampersand, &. Note that if you want to include a URL like this in an <a> tag, you need to use the &amp; entity to encode each ampersand.
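If you want to take a URL like this apart yourself, Python 3’s standard urllib.parse module can do it. The short sketch below is only an illustration; it also uses html.escape to show the form of the URL that belongs inside an <a> tag.

```python
import html
from urllib.parse import urlsplit, parse_qs

url = "http://cmpt165.cs.sfu.ca/~student/cgi.py?name=Skippy&age=19"

# Split the URL into its parts and pull out the query string.
parts = urlsplit(url)
print(parts.query)  # prints: name=Skippy&age=19

# Decode the query string into field names and their values.
fields = parse_qs(parts.query)
print(fields["name"], fields["age"])  # prints: ['Skippy'] ['19']

# Inside HTML, each & must be written as the &amp; entity.
print(html.escape(url))
```

Each field maps to a list because the same name can appear more than once in a query string.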

When you send CGI data this way, it will be recorded in logs of web traffic and will generally make a very long and awkward URL. You can avoid this problem by using the post method to transmit your form data. You just have to change the way your form submits:

<form action="sample.py" method="post">...</form>

This way there will be no query string in your URL, and the form data is transmitted out of sight of the user.

Fragment

In Topic 4.2, we discussed the id attribute and pointed out that these identifiers can be used as fragments or anchors. This is a way of creating a link to a position within a page.

To include an anchor in a URL, it is put at the very end (after any query string), starting with a #. Suppose we have a page at the URL http://cmpt165.cs.sfu.ca/~student/page.html with an identifier <h2 id="contents">. To create a link that scrolls to this <h2>, we would use this URL:

http://cmpt165.cs.sfu.ca/~student/page.html#contents

Fragments can also be used in relative URLs on a web page:

<a href="../page.html#contents">...</a>
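As a rough illustration, the Python 3 sketch below uses the standard urllib.parse module to pick the fragment out of a URL and to work out where a relative link with a fragment leads. The page notes/index.html used as the starting point is invented for the example.

```python
from urllib.parse import urlsplit, urljoin

url = "http://cmpt165.cs.sfu.ca/~student/page.html#contents"

# The fragment is whatever follows the # at the end of the URL.
print(urlsplit(url).fragment)  # prints: contents

# A relative link is resolved against the URL of the page that contains it.
# Here we assume (hypothetically) the link sits on notes/index.html.
base = "http://cmpt165.cs.sfu.ca/~student/notes/index.html"
print(urljoin(base, "../page.html#contents"))
# prints: http://cmpt165.cs.sfu.ca/~student/page.html#contents
```

The fragment is used by the browser to scroll the page; it is not sent to the web server as part of the request.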

URL Encoding

In HTML, entities are needed to let us display characters like < that are used for special purposes in the language. We run into similar problems in URLs. The characters ? and # are used to indicate the query string and fragment, respectively. What if we have a file name or CGI data that contains a question mark?

There are also characters that aren’t allowed in a URL, like spaces. In order to allow URLs to transmit these filenames and form data, these characters must be encoded.

Spaces are replaced with a plus sign, so if someone entered “John Smith” into a form, the URL including the query string would look like this:

http://cmpt165.cs.sfu.ca/~student/cgi.py?name=John+Smith

Other characters are encoded with a percent sign followed by the hexadecimal number of their ASCII value. If someone entered “Who?” in a form, it would be encoded like this:

http://cmpt165.cs.sfu.ca/~student/cgi.py?name=Who%3F
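Both examples can be reproduced with Python 3’s standard urllib.parse module, as in the short sketch below. It is meant only as a way to experiment with the encoding, not as something you need for the course.

```python
from urllib.parse import quote_plus, unquote_plus, urlencode

# Spaces become plus signs.
print(quote_plus("John Smith"))  # prints: John+Smith

# A question mark becomes % followed by its ASCII value in hexadecimal.
print(format(ord("?"), "X"))     # prints: 3F
print(quote_plus("Who?"))        # prints: Who%3F

# Decoding reverses the process.
print(unquote_plus("Who%3F"))    # prints: Who?

# urlencode builds a whole query string from field names and values.
print(urlencode({"name": "John Smith"}))  # prints: name=John+Smith
```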

Fortunately, you don’t usually have to worry about URL encoding. When you use web sites, it’s done automatically by the browser. If you ever need to create a link with encoded characters, you can just type it into the Location bar in Mozilla, which will convert it automatically. You can also type the problem characters into a form and get the encoded version from the query string after you submit.

Check-Up Questions

• Use the <a> tag and a URL with a query string to create a link to a CGI script without using a form.

• Try method="post" in a form.

• Use an id attribute on a tag on a web page and use a fragment to link to it.

• Experiment with URL encoding by typing some punctuation into an HTML form and look at the query string that is used when you submit the form.

Summary

After completing this unit, you should have a good understanding of how these parts of the Internet work. As in Unit 1, you should gain an understanding of what’s happening behind the scenes when you use the WWW.

Key Terms

• HTTP request

• HTTP headers

• HTTP response

• caching

• redirect

• content negotiation

• DNS

• IP address

• encryption

• query string

• fragment

• URL encoding

Part IV

Appendices

Appendix A

Technical Instructions

Learning Outcomes

This material is intended to help you get over the technical hurdles necessary to do the assignments. You won’t be tested on it.

Learning Activities

• Read this appendix and do the “Check-Up Questions.”

• Browse through the links for this appendix on the course web site.

• Install whatever software you need.

This Appendix explains how to install and work with some of the software that we will be using in the course. The instructions cover the software used with Windows. You can also do this course with a Mac or other operating system. See the course web page for more information.

The following Appendix B contains instructions on working with individual pieces of software.

The instructions assume that you have a basic understanding of how to use your computer: you can open programs, load and save files, use menus and buttons, and so on. If you don’t know all these things, you should first spend some time with the tutorial that came with your operating system or sit down with someone and have them show you the basics.

Topic A.1 Installing Software

Links to all of the software you need can be found on the course web site under “Tools.” Initially you should install at least a web browser, a secure file transfer client and a text editor.

The instructions here are for Windows installations. If you have a Mac, you can use the software mentioned in the Introduction and on the course web site. The installation procedures for these programs are fairly easy, but you will have to figure them out on your own.

WinZip

If you have a version of Windows earlier than Windows XP, you need WinZip to expand ZIP files that you download. If you need to work with a ZIP file, you must have it installed.

The installation is quite easy, consisting mostly of saying “Yes” to a number of questions. You should probably have the program start with its “Wizard” interface and have it search your entire hard drive for ZIP files.

Mozilla

We recommend Mozilla as the web browser for this course. It is the “open source” version of Netscape 7; the two programs are very similar.

Once you’ve downloaded the mozilla-. . . -installer.exe file, run it. The installation is straightforward; you can select the “Complete” installation and say “Okay” to everything else.

TextPad

TextPad is a powerful text editor. Using it instead of Notepad will make editing HTML and CSS easier.

Download and run the executable file. You can just accept all of the defaults for the installation.

SSH Client

The SSH Client software includes an SCP (secure copy) program that you will use to transfer your web pages to the web server. You cannot use FTP to transfer files to the course web server because FTP connections are not allowed; you will have to use SCP.

Once you have downloaded the installation file, you can run it and accept the defaults for the installation.

When you first start the “SSH Secure File Transfer Client,” you should try connecting to the course web server to make sure you can. Click “Quick Connect” and fill in the host name cmpt165.cs.sfu.ca and your SFU user name. You can leave the other settings at their defaults.

Type your password for cmpt165.cs.sfu.ca when you are prompted, and you should be connected to the server. If you connect successfully, a window titled “Add Profile” should appear. If you name the profile “cmpt165,” you can select that from the “Profiles” menu when you want to connect in the future.

You should use this profile whenever you transfer files into your web space.

The GIMP

The GIMP is distributed as a ZIP file. Unless you have Windows XP, you will need to use WinZip to uncompress it. When you unzip, you will get a single file called gimp-setup-. . . .exe. Run this program.

Once the installation has started, you have to click your way through several screens. You can choose the “Typical” installation.

Once it is installed, you will probably get several questions about file associations. They are of the form “The file type ‘.???’ has already been registered with Windows. Do you want to replace those settings?” If you want the GIMP to open when you double-click this file type, say “Yes.” It’s probably easiest to say “Yes” to all of these questions.

Once you’ve installed GIMP, you will be asked to install GTK+ as well. You must install it for the GIMP to work. All you have to do is click “Okay.”

The first time you start the GIMP, there is a short setup process. You can accept all of the defaults here as well. At one point during the setup, a Command window might pop up; you’ll have to switch to that window and press Enter to get things moving again.

Python

After you download the Python Windows installer, you need to run it.

You can accept the defaults for this installation too. If you want to save disk space, you won’t need the “Python test suite” for this course, so you don’t have to install it.

Topic A.2 SFU Computing Account

If your SFU account isn’t activated yet, you will need to do it before you start this course. Instructions on how to do so are given in the CDE student handbook.

Your campus email account will be receiving email from the course email list, so make sure you check it regularly. Open University students should email the course supervisor to have themselves added to the email list.

Topic A.3 CMPT 165 Server Account

In addition to your regular SFU computing account, you will have an account on a web server set up just for this course. You will use the file space on this server for your assignments. It is set up to make the web programming we will be doing in the second half of the course as straightforward as possible.

This account will have a different password than your SFU computing account. At the start of the semester, you can find your password for this account from http://my.sfu.ca. Go to the “myCourses” section of my.sfu.ca. Under the “CMPT165” heading, you should see your initial password for the server.

You should change this initial password immediately after you are connected. You can do this from the server’s front web page, http://cmpt165.cs.sfu.ca/.

On the server, you will also find links to various material for the course and updated instructions for using the server if the configuration has been changed since the Study Guide was written.

See Topic A.5 for information on transferring files to this web server.

You won’t be able to access this server after you’re done the course, so make sure you keep your own copies of the files you upload there if you think you might want them in the future.

Topic A.4 Creating Web Pages

In this course, we will be creating web pages with a text editor, not a WYSIWYG or other HTML editor. This is because we (a) want to learn HTML and (b) don’t want to pretend that HTML is WYSIWYG.

The easiest way to work is to open up your HTML page in a text editor. All you have to do to specify that the file you’re working on is HTML is to put “.html” at the end of the file name when you save it. Then, open your web browser and open the same file (keep both the browser and text editor open so you can switch back and forth). As you work on the HTML in the text editor, you can periodically save the file and look at the changes in the browser. When you switch to your browser, remember to click “Reload” or “Refresh” so you see the most recent changes.

When you need to write style sheets, create another file with the ending “.css” in your text editor. Python programs (both web scripts and regular programs) should end with “.py” and can be created with a regular text editor or in the Python IDLE environment.

When you’re done working on your page on your computer, you can transfer it to the web server as described below.

Topic A.5 Transferring Web Pages

First, you need to create a profile for cmpt165.cs.sfu.ca as described above for SSH Client’s Secure File Transfer program. For more information on using secure file transfer, see Topic B.5.

When you connect, you will see a list of the files in your home directory on cmpt165.cs.sfu.ca. In this list, you should see a directory called public_html.

The public_html directory is where you put all of the files you want to be available on the web. You can put any kind of file here; you can also create directories to organize your pages.

After you’ve transferred the files, remember to check and make sure you’ve transferred them correctly. If you put a file called stuff.html in your public_html directory, you should be able to access it at this URL, where userid is replaced with your user name:

http://cmpt165.cs.sfu.ca/~userid/stuff.html

You will have to submit URLs like this for each assignment.

Note that when you’re making web pages with graphics or style sheets, you have to upload all of the files: the HTML, the style sheet, and all of the images.

Check-Up Questions

• Start up the SCP client and make sure you can connect to cmpt165.cs.sfu.ca. Go into your public_html directory.

• Transfer an .html file and make sure you can access it as a web page.

Appendix B

Software

Learning Outcomes

This material will help you to learn how to use the software you need to do the assignments. You won’t be tested on it.

Learning Activities

• Read this appendix and do the “Check-Up Questions.”

• Browse through the links for this appendix on the course web site.

• Explore the software more on your own.

• Install whatever software you need.

This Appendix contains instructions for using the following software:

• Mozilla (a web browser)

• TextPad (a text editor)

• The GIMP (a bitmap graphics editor)

• Python (programming language environment)

• SSH Secure File Transfer Client

• HTML and CSS Validators.

The instructions provided apply only to Windows.


Topic B.1 Mozilla

We’re going to assume that you’ve used the web before, so you should have used some web browser. Using Mozilla is similar.

Note that you cannot use the web page Composer that is included with Mozilla for your assignments.

One thing that you must know is how to open web pages on your computer. From the “File” menu, select “Open File.” Here, you can select an HTML file and view it. It should look the same as it will when you put it on the web.

Topic B.2 TextPad

Spend a few minutes learning the basics of TextPad. TextPad is a text editor: it works like a word processor, but you can’t set fonts or other formatting.

When you load an HTML document into TextPad, it does syntax highlighting. That means that it automatically reads your document to figure out what is a tag, an attribute, and so forth. Each of these is in a different colour so you can tell them apart at a glance. It may look strange at first, but once you get used to it, you’ll wonder how anyone does without it.

The basics of TextPad are just opening, editing, and saving files. You should be able to figure these things out on your own. There are also some other features that might make editing HTML in TextPad easier. Feel free to explore them.

Topic B.3 The GIMP

The GIMP (GNU Image Manipulation Program) is a bitmap image editing program. The GIMP was created by a large group of people who wanted to make such a program freely available. The result is a full-featured free program.


Figure B.1: The main toolbox in the GIMP

When you open the GIMP, you will see the main toolbox (Figure B.1). Your toolbox might be wider or narrower, but the buttons should be the same. These buttons give you access to the main tools in the GIMP.

Some of the major features of the GIMP aren’t immediately obvious. The right-click menu is one example. When you’re working on an image, clicking the right mouse button on it will bring up a menu with several sub-menus. If the toolbox looks simple, that’s because most of the features are hidden in the right-click menu. Figure B.2 shows the right-click menu.

Check-Up Question

• Open up the GIMP and create a new image (by selecting New from the File menu on the main toolbox). Click the right mouse button on the image and have a look at some of the options in the right-click menu.

Once you have discovered this menu, you should be able to orient yourself if you have used other image editing programs. If you haven’t used such programs before, read on.

The Toolbox

Figure B.1 shows the main GIMP toolbox. In the bottom-left of the window, you can see the current colours, which are displayed as two overlapping rectangles. The top colour is the foreground colour; this is the colour that will be used for anything you draw. Behind it is the background colour that is used when you erase something. If you click on either of these, you will get a colour selection window where you can change to a new colour.

Figure B.2: The GIMP’s right-click menu with the option for saving the current file selected

Check-Up Question

• Click on the foreground colour and change it to another colour. Next, change the background.

Above the current colours are twenty-five tool buttons. These are shown in Figure B.3. We will discuss some of them here; you can play with the others on your own.

If you double-click any of the tools, you will activate the tool options window. Different tools have different options; feel free to explore them as well.

First, the “pencil” tool (A4) is used to draw. You can change the size of the line it draws by selecting a new “brush.” In the right-click menu, select “Dialogs” and then “Brushes.” You will see many different brush shapes and sizes that you can choose from. The Eraser (C4) works exactly like the pencil, but it uses the background colour, so it erases instead of drawing.

Figure B.3: The main tool buttons in the GIMP (a grid of buttons with columns labelled A to E and rows labelled 1 to 5)

Check-Up Questions

• Create a new image (the default options should be fine). Select the pencil tool and draw some lines. The lines should be drawn in the foreground colour you selected earlier.

• Open up the “Brush Selection” window and grab a different brush. Draw a little with that.

• Select the eraser tool and erase something.

• Double-click the pencil tool to display its “Tool Options.” Turn down its “Opacity” to about 25%. Draw some more and see what happens.

The buttons in the first row are used to select parts of the image. When part of the image is selected, that’s the only part that you can change. The “select rectangles” button (A1) and “select ellipses” button (B1) are used to select simple shapes.

Once you have made a selection, you can use the move tool (B2) to move it around. You can also use any of the other tools, but you will only be able to change the selected region. Creating and working with selections can also be done by using the “Select” and “Edit” options in the right-click menu.

Check-Up Questions

• Try some of the selection tools, particularly the rectangle and ellipse. The other selection tools do more complex things; try them as well, so you know what they do if you need them.

• Select something and use the move tool (B2) to move it around. Note that the background colour that you selected is what’s left behind when you move the selection away.

The “Zoom” tool (C2) is used to magnify or shrink your image on the screen to make it easier to work with. Clicking on part of the image magnifies the image more, keeping that part in the display. Holding down the control key while clicking zooms out.

The “Crop” tool (D2) is used to shrink the image, keeping only part of it.

Check-Up Questions

• Select the Crop tool, draw a rectangle around part of your image, and click the “Crop” button in the window that pops up.

• From the right-click menu, select “Edit,” then “Undo.”

The rest of the tools you can explore on your own. If you move the mouse pointer over a tool and leave it there for a few seconds, a short description of the tool will be displayed.

Check-Up Question

• Some of the other tools you might want to try are the “Text” tool (B3), the “Airbrush” (D4), and “Smudge” (D5). Select these and some of the other tools and see what they do.

Right-Click Menu Options

As in the case of the toolbox, there are far too many options in the right-click menus to cover them all here. We will explain the fundamental ones, and then you’re on your own to explore the rest.

You can change the colour depth of the image by selecting “Image” then “Mode” from the right-click menu. Colour depth is described in Topic 3.5. You have three options here. “RGB” refers to 24-bit colour; “Grayscale” refers to 8-bit colour consisting of 256 shades of gray; “Indexed” selects a palette of 8 bits or fewer.
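The bit depths quoted for these modes translate directly into a number of available colours: each extra bit doubles the count. The short sketch below works this out with Python’s `**` (exponentiation) operator. The function name `colours_available` is made up for the example, and the “Indexed” entry uses 8 bits only as the upper limit mentioned above.

```python
def colours_available(bits_per_pixel):
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits_per_pixel


# The three modes in the "Mode" menu, with the bit depths given in the text.
modes = [("RGB", 24), ("Grayscale", 8), ("Indexed (at most)", 8)]

for mode_name, bits in modes:
    print("%s: %d bits, %d colours" % (mode_name, bits, colours_available(bits)))
```

The 8-bit modes come out at 256, which matches the 256 shades of gray mentioned for “Grayscale.”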


The “Layers” menu in the right-click menu has many options. Layers are described in the text on pp. 77–80; basically, your image can be made of many layers that can be manipulated separately. Feel free to explore layers.

If you are working with an image and you find that some of the image just won’t change the way you think it should, it might be on another layer (occasionally, the GIMP creates layers at unexpected times). To get rid of the layers, select “Layers” and “Flatten Image” from the right-click menu. All of the layers will merge.

Something else that’s easy to overlook is the current selection. Remember that you can only change the pixels that are currently selected. To get rid of your current selection, select “Select” and “None” from the right-click menu.

Topic B.4 Python

There are two aspects to the Windows Python software. If you double-click .py files, they will be executed with the Python interpreter directly. A window will appear that runs the program and disappears as soon as it’s finished.

If you’re working on a program, you should use the Python IDLE (Integrated DeveLopment Environment). Whenever you start IDLE, you will see a Python interpreter window. You can type Python code directly here and execute it, as described in Unit 7.

You can also open up a program with IDLE and use the built-in text editor, which does syntax highlighting for Python code. Then, if you want to run the program, you can just press F5. The program will run in the interpreter window, and you can see its output.
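A small program to try this with might look like the sketch below. It is only an illustration: the function name `times_table` and the suggested file name `times.py` are made up for the example. Save it with a “.py” ending, open it in IDLE, and press F5; the lines it prints should appear in the interpreter window.

```python
def times_table(number, up_to=5):
    """Return the lines of a short multiplication table for the number."""
    lines = []
    for i in range(1, up_to + 1):
        lines.append("%d x %d = %d" % (i, number, i * number))
    return lines


# Print the table for 7, one line at a time.
for line in times_table(7):
    print(line)
```

If you double-click the same file instead of opening it in IDLE, expect the window to close as soon as the last line is printed, as described above.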

Topic B.5 Secure File Transfer

When you first start the SSH Secure File Transfer Client, you should create a profile for the cmpt165.cs.sfu.ca server, as described in Topic A.1. Then, you can use this profile whenever you need to transfer files to your web space.

Once you’re connected to a server, you will see two panes in the File Transfer Client window. The left pane allows you to explore files on the hard drive of your computer. This is where you will find the files you want to upload. In the right pane, you can see the files in your space on the server.


To transfer files from your computer to the server, just drag them from the left pane to the right. You can also select multiple files or a whole directory and drag them all at once.

You can also drag-and-drop files from Windows Explorer to the File Transfer Client window to transfer them. Just make sure you’re in the right directory on the server (probably public_html) first.

Topic B.6 Validators

The last things that you need to use in this course aren’t really “software.” The validators that you will use for some of the assignments are used online with a web page interface.

Links to the validators can be found on the course web site in the “Tools”section.

Both the HTML and CSS validators can work with files that are on either a web server or your computer. To validate a file that’s on a web server, you just give the validator the URL of the file.

To validate a file on your computer, you first have to enter that part of the validator site (follow the link “validate files on your computer,” “upload files,” or “by upload”). Then, click the “Browse” button and find the file you want to send.

All of the validators have options to include some warnings. It wouldn’t hurt to leave the warnings on; they help you find things that you overlooked. On the other hand, when an assignment indicates that pages should validate, you don’t have to worry about warnings, only real errors.

If you’re uploading files in the CSS validator, you should give it your CSS file (or the HTML file if you’re using a style sheet embedded with the <style> tag). If you’re submitting a file by URL, give it your HTML file, and it will find any style sheet that is linked to it.
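Some of the real errors a validator reports are simple structural slips, such as a tag that is never closed. As a rough local pre-check before uploading, and assuming your page is written as XHTML (so that it should parse as XML), something like the following sketch could be used. The helper name `is_well_formed` and the two sample pages are made up for the example. It only tests whether the text parses; it does not check that the tags and attributes are legal, so it is not a replacement for the validators.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(xhtml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        minidom.parseString(xhtml_text)
    except ExpatError:
        return False
    return True


# Two made-up pages: the second one never closes its <p> tag.
good_page = "<html><head><title>Test</title></head><body><p>Hello</p></body></html>"
bad_page = "<html><head><title>Test</title></head><body><p>Hello</body></html>"

print(is_well_formed(good_page))
print(is_well_formed(bad_page))
```

A page that fails this check will also fail in the validator, so it is worth fixing before you upload.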
