Transcript of Dr. Dobb's Journal, Volume 30, Issue 6, No. 373, June 2005 (76 pages)

Page 1 (cover)
Page 2 (cover)

http://www.ddj.com

#373 JUNE 2005

Dr. Dobb's Journal: Software Tools for the Professional Programmer

TESTING & DEBUGGING

Omniscient Debugging
Examining Software Testing Tools
Dissecting Error Messages
Debugging Production Software
Performance Monitoring

ASP.NET & Multiplatform Environments
Swaine on VB-not-net
XScale, C++, & Hardware-Assisted Breakpoints
Ed Nisley on Security
Jerry Pournelle on Spyware

Software Reuse: Does It Really Pay Off?
Loadable Modules & The Linux 2.6 Kernel

System Verification with SCV
Portability & Data Management
Inside TR1 C++ Library Extensions

Page 3

DR. DOBB’S JOURNAL (ISSN 1044-789X) is published monthly by CMP Media LLC, 600 Harrison Street, San Francisco, CA 94017; 415-947-6000. Periodicals Postage Paid at San Francisco and at additional mailing offices. SUBSCRIPTION: $34.95 for 1 year; $69.90 for 2 years. International orders must be prepaid. Payment may be made via Mastercard, Visa, or American Express; or via U.S. funds drawn on a U.S. bank. Canada and Mexico: $45.00 per year. All other foreign: $70.00 per year. U.K. subscribers contact Jill Sutcliffe at Parkway Gordon 01-49-1875-386. POSTMASTER: Send address changes to Dr. Dobb’s Journal, P.O. Box 56188, Boulder, CO 80328-6188. Registered for GST as CMP Media LLC, GST #13288078, Customer #2116057, Agreement #40011901. INTERNATIONAL NEWSSTAND DISTRIBUTOR: Worldwide Media Service Inc., 30 Montgomery St., Jersey City, NJ 07302; 212-332-7100. Entire contents © 2005 CMP Media LLC. Dr. Dobb’s Journal is a registered trademark of CMP Media LLC. All rights reserved.

http://www.ddj.com Dr. Dobb’s Journal, June 2005 3

C O N T E N T S

JUNE 2005, VOLUME 30, ISSUE 6

NEXT MONTH: We light up July with our coverage of Java.

F E A T U R E S

Omniscient Debugging 16
by Bil Lewis
With omniscient debugging, you know everything about the run of a program — from state changes to the value of variables at any point in time.

Examining Software Testing Tools 26
by David C. Crowther and Peter J. Clarke
Our authors examine class-based unit testing tools for Java and C#.

Dissecting Error Messages 34
by Alek Davis
Error messages are the most important information users get when encountering application failures.

Debugging Production Software 42
by John Dibling
The Production Software Debug library includes utilities designed to identify and diagnose bugs in production software.

System Verification with SCV 48
by George F. Frazier
The SystemC Verification Library speeds up verification of electronic designs.

Portability & Data Management 51
by Andrei Gorine
Following rules for developing portable code simplifies the reuse of data-management code in new environments.

Performance Monitoring with PAPI 55
by Philip Mucci, Nils Smeds, and Per Ekman
The Performance Application Programming Interface is a portable library of performance tools and instrumentation with wrappers for C, C++, Fortran, Java, and Matlab.

The Technical Report on C++ Library Extensions 67
by Matthew H. Austern
Matt looks at what the Technical Report on C++ Library Extensions means for C++ programmers.

Measuring the Benefits of Software Reuse 73
by Lior Amar and Jan Coffey
Does software reuse really pay off in the long run? How can you tell?

Loadable Modules & the Linux 2.6 Kernel 77
by Daniele Paolo Scarpazza
The Linux Kernel 2.6 introduces significant changes with respect to 2.4.

ASP.NET & Multiplatform Environments 81
by Marcia Gulesian
Running .NET web apps in the enterprise means accommodating myriad servers and browsers.

E M B E D D E D  S Y S T E M S

Hardware-Assisted Breakpoints 87
by Dmitri Leman
Dmitri explains how to access debug registers on XScale-based CPUs from C/C++ applications.

C O L U M N S

Programming Paradigms 91
by Michael Swaine

Embedded Space 93
by Ed Nisley

F O R U M

EDITORIAL 6
by Jonathan Erickson

LETTERS 10
by you

DR. ECCO’S OMNIHEURIST CORNER 12
by Dennis E. Shasha

NEWS & VIEWS 14
by Shannon Cochran

OF INTEREST 103
by Shannon Cochran

SWAINE’S FLAMES 104
by Michael Swaine

R E S O U R C E  C E N T E R

As a service to our readers, source code, related files, and author guidelines are available at http://www.ddj.com/. Letters to the editor, article proposals and submissions, and inquiries can be sent to [email protected], faxed to 650-513-4618, or mailed to Dr. Dobb’s Journal, 2800 Campus Drive, San Mateo CA 94403.

For subscription questions, call 800-456-1215 (U.S. or Canada). For all other countries, call 902-563-4753 or fax 902-563-4807. E-mail subscription questions to [email protected] or write to Dr. Dobb’s Journal, P.O. Box 56188, Boulder, CO 80322-6188. If you want to change the information you receive from CMP and others about products and services, go to http://www.cmp.com/feedback/permission.html or contact Customer Service at the address/number noted on this page.

Back issues may be purchased for $9.00 per copy (which includes shipping and handling). For issue availability, send e-mail to [email protected], fax to 785-838-7566, or call 800-444-4881 (U.S. and Canada) or 785-838-7500 (all other countries). Back issue orders must be prepaid. Please send payment to Dr. Dobb’s Journal, 4601 West 6th Street, Suite B, Lawrence, KS 66049-4189. Individual back articles may be purchased electronically at http://www.ddj.com/.

Chaos Manor 96
by Jerry Pournelle

Programmer’s Bookshelf 101
by Gregory V. Wilson

Page 4

PUBLISHER: Michael Goodman
EDITOR-IN-CHIEF: Jonathan Erickson

E D I T O R I A L
MANAGING EDITOR: Deirdre Blake
MANAGING EDITOR, DIGITAL MEDIA: Kevin Carlson
SENIOR PRODUCTION EDITOR: Monica E. Berg
NEWS EDITOR: Shannon Cochran
ASSOCIATE EDITOR: Della Wyser
ART DIRECTOR: Margaret A. Anderson
SENIOR CONTRIBUTING EDITOR: Al Stevens
CONTRIBUTING EDITORS: Bruce Schneier, Ray Duncan, Jack Woehr, Jon Bentley, Tim Kientzle, Gregory V. Wilson, Mark Nelson, Ed Nisley, Jerry Pournelle, Dennis E. Shasha
EDITOR-AT-LARGE: Michael Swaine
PRODUCTION MANAGER: Eve Gibson

I N T E R N E T  O P E R A T I O N S
DIRECTOR: Michael Calderon
SENIOR WEB DEVELOPER: Steve Goyette
WEBMASTERS: Sean Coady, Joe Lucca

A U D I E N C E  D E V E L O P M E N T
AUDIENCE DEVELOPMENT DIRECTOR: Kevin Regan
AUDIENCE DEVELOPMENT MANAGER: Karina Medina
AUDIENCE DEVELOPMENT ASSISTANT MANAGER: Shomari Hines
AUDIENCE DEVELOPMENT ASSISTANT: Melani Benedetto-Valente

M A R K E T I N G / A D V E R T I S I N G
ASSOCIATE PUBLISHER: Will Wise
SENIOR MANAGERS, MEDIA PROGRAMS (see page 82): Pauline Beall, Michael Beasley, Cassandra Clark, Ron Cordek, Mike Kelleher, Andrew Mintz
MARKETING DIRECTOR: Jessica Marty
SENIOR ART DIRECTOR OF MARKETING: Carey Perez

DR. DOBB’S JOURNAL
2800 Campus Drive, San Mateo, CA 94403; 650-513-4300. http://www.ddj.com/

C M P  M E D I A  L L C
Gary Marshall, President and CEO
John Day, Executive Vice President and CFO
Steve Weitzner, Executive Vice President and COO
Jeff Patterson, Executive Vice President, Corporate Sales & Marketing
Leah Landro, Executive Vice President, Human Resources
Mike Mikos, Chief Information Officer
Bill Amstutz, Senior Vice President, Operations
Sandra Grayson, Senior Vice President and General Counsel
Alexandra Raine, Senior Vice President, Communications
Kate Spellman, Senior Vice President, Corporate Marketing
Mike Azzara, Vice President, Group Director of Internet Business
Robert Faletra, President, Channel Group
Vicki Masseria, President, CMP Healthcare Media
Philip Chapnick, Vice President, Group Publisher, Applied Technologies
Michael Friedenberg, Vice President, Group Publisher, InformationWeek Media Network
Paul Miller, Vice President, Group Publisher, Electronics
Fritz Nelson, Vice President, Group Publisher, Network Computing Enterprise Architecture Group
Peter Westerman, Vice President, Group Publisher, Software Development Media
Joseph Braue, Vice President, Director of Custom Integrated Marketing Solutions
Shannon Aronson, Corporate Director, Audience Development
Michael Zane, Corporate Director, Audience Development
Marie Myers, Corporate Director, Publishing Services

Dr. Dobb's Journal: Software Tools for the Professional Programmer

American Business Press

Printed in the USA

4 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

Page 5

Quirky is fine with me. I’m a fan of all things slightly out of whack — people, places, movies, museums, books, you name it. My buddy Michael once took to wearing flea collars around his neck, wrists, and ankles when his apartment was invaded by fleas. Among my favorite books are The Tulip: The Story of a Flower That Has Made Men Mad and The Dog in British Poetry. Then there’s the Bily Brothers Clock Museum in Spillville, Iowa, and the movie Dancing Outlaw.

The latest to join the quirky club is The Collector’s Guide to Vintage Intel Microchips, by George M. Phillips, Jr. — an e-book (in PDF) on CD-ROM that has everything you ever wanted to know (and then some) about Intel processors, controllers, RAM, ROM, EPROMs, memory, support circuits, and the like. This includes 1300 pages worth of part numbers, photographs, data sheets, the names of the designers (and interviews with them, in some cases), and occasionally, the collectible value of the chip — all indexed, cross-linked, and in color (see http://www.vintagemicrochips.com/).

Actually, George isn’t alone in his fascination with silicon and circuits. It turns out that there’s a worldwide network that collects CPUs and microchips. For instance, in Poland, Marcin Majewski hosts his ABC CPU (http://www.abc-cpu.xn.pl/); in Germany, Christian Lederer puts up the CPU Museum (http://www.cpu-museum.de/); in Oregon, John Culver is the curator of the CPU Shack (http://www.cpushack.net/); in Corsica, Desideriu maintains the CPU Museu (http://cpu-museu.net/); and Lee Gallanger hosts his Vintage Chip Trader (http://www.vintagechiptrader.com/).

But when it comes down to it, collecting vintage microchips is no quirkier than collecting, say, vintage vacuum tubes. Bob Deuel’s collection, for instance, consists of tens of thousands of vacuum tubes, although he displays only 1200 or so in his home, including a 228-pound tube from a 50,000-watt broadcast radio transmitter. And yes, Bob’s not alone out there, at least when it comes to vacuum tubes. There’s the Tubepedia (http://www.aade.com/tubepedia/1collection/tubepedia.htm); the Tube Collectors Association (http://www.tubecollectors.org/); Kilokat’s antique light bulb site (http://www.bulbcollector.com/); Mike’s Electric Stuff (http://www.electricstuff.co.uk/); Ake’s Tubedata (http://www.tubedata.com/); and more.

But as you can see in The Collector’s Guide to Vintage Intel Microchips, there’s more to serious collecting than grabbing a microcontroller here and a vacuum tube there and dropping them into old cigar boxes (which, by the way, are also collectible; see http://galleries.cigarweekly.com/RickMG/album01). There’s all that information to be gathered — specifications, packaging, part numbers, and more. Some of the most difficult information for George to compile was the introduction dates of the more than 300 different chips Intel introduced before 1980. Recall that the 8008 was introduced in 1972, and Intel’s first microchip, the 3101 static RAM, was introduced around 1969. Of course, the easy way out would have been to just ask Intel, which maintains its own chip museum (http://www.intel.com/museum/). Unfortunately, it didn’t occur to Intel to begin documenting exact introduction dates of its chips until the mid-1980s. By then, many of the introduction dates could only be determined by asking engineers who had worked for Intel in the 1970s if they remembered when particular chips were introduced. The answer George usually got was something like, “I think we worked on that in late 1971, but it could have been 1972, or was it 1973?” He adds that it is even more difficult when tracking down introduction dates of the different versions of a chip — the 2107, 2107A, 2107B, 2107C, and so on.

Alternatively, you’d think you could just go to the library and simply peruse old Intel data catalogs. Alas, libraries don’t have them, and early Intel data catalogs are almost as rare as Gutenberg Bibles. There’s only one known copy of Intel’s first data catalog, printed in September 1972, for instance, and no known copies of the 1973 or 1974 data catalogs. This means (you guessed it) early Intel data catalogs are collectible, too, and it’s not uncommon for them to sell for hundreds of dollars on eBay. Early Intel MCS-4 and MCS-8 user manuals, memory books, sales brochures, and the like are also sought after by collectors. It’s no surprise that it took George about five years to put the e-book together.

So what possesses someone to devote this much time and energy to putting together such a quirky project? “Maybe the best answer is because it matters,” says George. He goes on to explain that he’s been a programmer for 20 years and has seen how computers have changed the world. However, he says that people who live through events often don’t realize the significance of those events as they are happening. It’s all still too new. But George believes that future generations will look back on this time as a monumental turning point in history, on par with the discovery of fire and the invention of writing — if they have a historical source to turn to. After all, no one ever said that just because something is quirky that it isn’t important.

Quirky Is Fine With Me

Jonathan Erickson
[email protected]

E D I T O R I A L

6 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

Page 6

OAF Again

Dear DDJ,
I have been looking at John Trono’s OAF algorithm (“Applying the Overtake & Feedback Algorithm,” DDJ, February 2004) with great interest. This is a fantastic piece of work, which I am very keen to understand and hopefully apply, and was wondering if John could please explain the following sentence with regards to working out the “Distance” factor a bit further: “…where distance is the largest integer multiple of SQRTN less than or equal to the separation between the two teams’ rankings, which in this case is floor(57/11) or 5.” For example, if there are 20 teams in my league and the two teams involved are ranked 3 and 12, what would the distance be? I live in Manchester, England, and was wondering whether I would be able to use this method to create some soccer ratings. The winning score margins for each game are going to be a lot lower than for American football, the most frequent being just 1 or 2. Should I multiply these scores by another constant to fit them into the algorithm? Also, what do you think should happen to each team’s rating if a Draw (Tie) result occurs?

Ian
[email protected]

John responds: Ian, thank you for your interest in (and kind words about) the OAF algorithm. Though I left out how tie games are handled in the brief summary that is on my web page (http://academics.smcvt.edu/jtrono/OAF.html), since they no longer are a possibility in NCAA football, such an outcome is treated almost like a loss. When two teams’ ratings are averaged in OAF, the victor gets the average +0.4 and the loser the average –0.4 as their updated ratings. When a tie occurs, the team that had the higher rating before the tie game has 0.4 added to the average, and the other team receives the average –0.4 as its new rating.

If there are 20 teams, then the integer used for √N is the floor of the rounded result of √(20), which in this case is 4. If a team ranked #3 loses to a team that is ranked #12, then the modified update rule would compute that team #12 is (12–3)/4 = 2 “intervals” outside of the region where the pure OAF update would be applied, and so a denominator of 4 would be used instead of 2 when computing the “average” of the two opponents’ ratings. (Teams 4–6 would use a denominator of 2, i.e., the true mean; 7–10 would use a denominator of 3; 11–14 would use 4; etc.) I hope this clears that up for you. Please go back and reread the last three paragraphs on that web page to see if you follow what I describe here. And remember, there is the caveat about using the larger update of the two that are computed, as mentioned in the penultimate paragraph on my web page that you quote from.
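John's interval rule can be sketched in a few lines of Python. This is a reconstruction from his reply only, not his actual code; the function names `oaf_denominator` and `oaf_update` are made up for illustration:

```python
import math

def oaf_denominator(rank_a, rank_b, n_teams):
    """Denominator for the modified OAF 'average': teams within sqrt(N)
    ranks of each other use 2 (the true mean); each further interval
    of sqrt(N) in separation adds 1."""
    sqrt_n = int(round(math.sqrt(n_teams)))      # sqrt(20) ~ 4.47 -> 4
    intervals = abs(rank_a - rank_b) // sqrt_n   # (12-3)//4 = 2
    return 2 + intervals

def oaf_update(r_winner, r_loser, denominator):
    # Victor gets the "average" +0.4, loser the "average" -0.4.
    avg = (r_winner + r_loser) / denominator
    return avg + 0.4, avg - 0.4
```

For Ian's example, `oaf_denominator(3, 12, 20)` returns 4, matching John's worked answer, and ranks 4–6, 7–10, and 11–14 relative to #3 fall out as denominators 2, 3, and 4.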

I would be curious to hear about your results when applying OAF to the soccer games in England. I don’t think you would have to do anything special to adjust the scores because of the smaller differentials; you might just end up with ratings that are closer, that’s all. (Since touchdowns in football are like goals in soccer, you might try multiplying all your scores by 7, but again, I am not sure if that would be necessary.)

Good luck; don’t hesitate to ask for clarification if my description above is unclear, and keep me posted about how your study goes when applying this to soccer.

Silent Update

Dear DDJ,
I appreciate the skill that went into Zuoliu Ding’s article “A Silent Component Update for Internet Explorer” (DDJ, April 2005). As a user, I ask, how do I prevent a software company or malicious agent from silently updating the software on my hard drive? Will a firewall like ZoneAlarm do?

Bill
[email protected]

Zuoliu responds: Bill, thanks for your interest in my article. Regarding your question, I would like to say there is nearly no way to prevent such an update with a third-party product. This is because such a silent update is triggered by specific software that has already been installed on your local machine based on your trust. The download and update is guided by the resident software. If a hacker hijacked such a company download site and knew the update mechanism, your computer would be vulnerable and defenseless. For some general invasion, you may take precautions like blocking some ActiveX downloads from the prompts by XP/SP2. But for such a silent update, the individual company should take full responsibility in designing secure updates, such as encoding download information, dynamizing update sites, and so on. Hope this helps.

Is Wind Power Hot Air?

Dear DDJ,
As part of Jonathan Erickson’s “looking back” perspective in the February 2005 issue of DDJ, he effused about the amount of energy being created via wind-generated electricity. However, nuclear energy is an idea whose time has come and gone — and has come again with a vengeance. Simply put, it is the cleanest, safest, best way to create energy for people on this Earth. Wind power doesn’t even come close to fitting the needs of this world. Note that a “wind farm” would require about 300 square miles of space to produce the equivalent output of one nuclear plant. The wind turbines slice-and-dice birds with a terrible regularity.

Don’t take my word for it. China researched it and is now committed to building a minimum of 25 new reactors in the next decade. All completely fail-safe. The design is called “Pebble Bed Reactors.” You can fire these puppies up and walk away from them forever. They won’t blow up and they won’t melt down. I refer you to the February 2005 issue of Wired magazine, which covered this issue in some detail. The biggest threat to the reintroduction of nuclear power to America is, of course, the Luddites — people who believe that “nuclear is bad.” Again, Wired covers these types pretty well. And so I ask Jonathan to reconsider the scientific evidence that bodes extremely well for nuclear power, and drop the hippie-cult thing with “wind power.” In the end, “wind power” is just a bunch of hot air.

Peter
[email protected]

Jonathan responds: Thanks for your note, Peter. Did I say that wind power ought to replace nuclear power? I don’t recall doing so. Should there be alternatives to coal-based power plants? Yes. Coal-based plants screw up the environment when you dig the coal and when you burn it. I don’t have a problem with nuclear power. But I also think that places rich in wind (Washington, DC doesn’t count) can harvest that energy and provide an economic boon to the local economy. This includes, say, western Kansas, the Dakotas, and other such windblown areas. It probably isn’t feasible to build nuclear plants in western Kansas, but wind turbines do make sense.

DDJ

L E T T E R S


10 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

Page 7

D R . E C C O ’ S O M N I H E U R I S T C O R N E R

The tall man who came knocking at Ecco’s apartment cut quite a figure. A titanium prosthetic had replaced his left leg below the knee, but this big man made his way around with a vigor that few fully legged men could match. He wore a dark bushy beard and shaggy hair. He looked quite scary except that his dark eyes twinkled and he sported very distinctive smile lines.

“Call me Scooper,” he said by way of introduction. “After my, er, accident, I got this leg and started walking around with a wooden cane. Soon after, my buddies gave me an eye patch, joking that I should put a jolly roger on my jeep’s antenna. I had some time on my hands, so I decided I would study pirates. I found some pretty modern ones, operating as late as the 1930s off the Carolina coasts and specializing in attacking yachts. They did this with stealth and intelligence. Never did they physically harm anyone. Sometimes their booty was difficult to divide, however.

“In one heist for which I have partial records, almost half the value of the theft was embodied in a single, very well-known diamond — so well known in fact that they couldn’t sell it right away. They decided to award the diamond to the pirate who could win the following contest. They were very mathematically adept pirates, so I’m convinced someone won. I want to know how.

“Here is the contest: There is a set of wooden planks P. Given P, each contestant is to construct a structure of planks that cantilevers off a flat dock.”

“Is someone going to walk the plank?” 11-year-old Tyler asked.

“No,” Scooper replied with a smile. “They didn’t want anyone to walk the plank yet, but they wanted to make the pile extend out as far as possible from the dock without requiring glue, ropes, or attachment. That is, they wanted to create a pile of planks and have them stay put, assuming no vibrations or wind.”

“If all the planks are the same size and weight, and each plank can sit on only one other plank and support only one plank, then this is the ‘Book Stacking Problem’ and is nicely analyzed at http://mathworld.wolfram.com/BookStackingProblem.html,” Liane volunteered.

“Interesting,” Scooper said. “Could youexplain how that goes?”

“The basic idea is simple,” Liane replied. “One plank can extend halfway out in a cantilever without tipping; if you have two planks, then the first one extends 1/4 of the way out and the other extends 1/2 of the way out beyond the first, as you can see in the drawing in Figure 1.

“Let’s analyze this. Suppose each plank weighs W and is of length L. The net weight of each plank is at its center. The torque around the dock edge is therefore +(L/4)W due to the bottom plank and –(L/4)W because of the top plank. So the net torque is zero. Similarly, the center of the top plank rests on the outer edge of the bottom plank, so it will not flip over. More planks allow an arbitrarily long cantilever, given enough planks.”
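Liane's equal-plank analysis is easy to check numerically. In the classic book-stacking arrangement, the k-th plank counting down from the top extends 1/(2k) of a plank length beyond the one below it, so the total overhang is a harmonic-style sum (this sketch is my illustration, not part of the pirates' puzzle):

```python
def max_overhang(n_planks):
    """Total cantilever, in plank lengths, for n equal planks stacked
    one atop another: 1/2 + 1/4 + 1/6 + ... + 1/(2n)."""
    return sum(1.0 / (2 * k) for k in range(1, n_planks + 1))
```

One plank gives 0.5 (cantilevered half its length off the dock); two planks give 0.75 (the 1/4 + 1/2 arrangement of Figure 1); and because the harmonic series diverges, enough planks make the overhang as long as you like, just as Liane says.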

“Physics is always a surprise,” Scooper said.

“1. Our pirates are not so bookish, however. They have 10 thick and rigid planks of lengths 1 through 10 meters and weighing 5 through 50 kilograms that they had just captured from a French yacht. Further, according to their description of the contest, all planks should share the same orientation — they should lie horizontally and their lengths should be perpendicular to the dock edge. To start with, they required that a plank could lie on only one other plank (or on the dock) and could support only one other plank. In that case, how far from the dock can a plank extend without tipping for the best structure?

“2. Now suppose that the pirates allow two or even more planks to lie on a supporting plank. Can you create a cantilever that goes even farther out?” (Hint: Yes, by more than a couple of meters.)

Liane and Tyler worked on these for anhour before coming up with an answer.

“Nice work,” Scooper said, looking at their drawings. “Here is the hardest problem. Suppose there are no constraints about how planks can lie on one another? They need not share the same orientation, many planks can support a single plank, and so on. Then how far can the cantilever go out using just these 10 planks?”

I never heard the answer to this one.

For the solution to last month’s puzzle, see page 92.

DDJ

The Pirates’ Cantilever

by Dennis E. Shasha

Dennis is a professor of computer science at New York University. His most recent books are Dr. Ecco’s Cyberpuzzles (2002) and Puzzling Adventures (2005), both published by W. W. Norton. He can be contacted at [email protected].

12 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

Figure 1: A dock with a single plank cantilevered 1/2 its length off the dock; with two planks, the bottom plank is cantilevered 1/4 of its length and the top plank extends 1/2 of its length beyond the first.

Page 8

14 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

Natural Language to Source Code

Researchers at the Massachusetts Institute of Technology are developing a language-to-code visualizer tool called “Metafor” that translates natural languages (like English) to source code. According to researchers Hugo Liu and Henry Lieberman, Metafor is a “brainstorming” editor that “interactively converts English sentences to partially specified program code, to be used as ‘scaffolding’ for a more detailed program” — an outliner, in other words. Metafor builds program skeletons in Python and other languages from parts of speech and language syntax, in which noun phrases are equivalent to objects, verbs to functions, and adjectives to object properties. A parser identifies the subjects, verbs, and object roles, and the Metafor software maps these English-language constructs to code structures. For more information, see http://web.media.mit.edu/~hugo/publications/papers/CHI2005-NLInterfaces.pdf.

Blue Gene Blazes On

IBM’s supercomputer-in-progress, Blue Gene/L, has eclipsed its own performance record. The partially assembled system was ranked the fastest in the world last November, when it performed 70.72 teraflops on the Linpack benchmark. Now, although the cluster is still only half finished, it has nearly doubled in speed — clocking in at 135.3 teraflops on the same benchmark, according to the Department of Energy. IBM estimates that when Blue Gene/L is complete, it will be capable of an unprecedented 360 teraflops (http://www.research.ibm.com/bluegene/). When finished, Blue Gene/L will consist of 65,536 dual Power PC 400 cores running at 700 MHz — 131,072 processors — with on-chip memory and two dual floating-point units to speed calculation. The system will be densely packaged into 64 racks and integrated with multiple interconnection networks. The Blue Gene/L supercomputer is being installed at Lawrence Livermore National Labs, where it will perform nuclear weapons simulations.

Hitachi Shows New Robot

Hitachi’s EMIEW (“Excellent Mobility and Interactive Existence as Workmate”) design is a walking, talking humanoid robot designed to give Honda’s ASIMO a run for its money. Hitachi has built two EMIEWs, dubbed “Pal” and “Chum.” The wheeled EMIEWs can move twice as fast as ASIMO, although unlike the other robot, they can’t climb stairs. While EMIEW may seem to be playing catch-up to ASIMO and Sony’s QRIO, the company is quick to point out that it has dabbled in nonindustrial robots before. Its first bipedal robot was shown at the 1985 Tsukuba Expo in Japan. For the EMIEWs, however, Hitachi chose to use a two-wheeled base resembling a Segway, which lets the robots keep up with a normal human’s walking pace.

The EMIEWs also feature two arms and hands capable of grasping objects, as well as humanoid torsos and heads. They can talk, and are capable of engaging in dialogue with humans, although their vocabularies are limited to about 100 words. Hitachi estimates that in five to six years of linguistic training, Pal and Chum could be ready for practical jobs as information-desk workers or office support staff.

Ultra-Fast Electrical Signals Captured

Researchers at the University of California, Los Angeles have for the first time captured and digitized electrical signals at the rate of 1 trillion times per second. Professor Bahram Jalali and graduate researcher Yan Han have developed a one-tera-sample-per-second single-shot digitizer that lets scientists see, analyze, and understand lightning-quick pulses. The digitizer uses light to first slow down the electrical waveforms, allowing the ultra-fast waveforms to be digitized at picosecond intervals — or one-millionth of one-millionth of a second. One application being studied is the development of defenses against microwave “e-bombs” that can destroy electronic devices. For more information, see http://www.newsroom.ucla.edu/page.asp?RelNum=6000.

Intel Science Talent Search Winners Announced

David Vigliarolo Bauer of Bronx, New York, has been awarded a $100,000 scholarship for being named the first-place winner of the 2005 Intel Science Talent Search (Intel STS) competition. Bauer, of Hunter College High School, designed a new method using “quantum dots” (fluorescent nanocrystals) to detect toxic agents that affect the nervous system. Second place and a $75,000 scholarship went to Timothy Frank Credo of the Illinois Mathematics and Science Academy in Highland Park, Illinois, for developing a more precise method to measure very brief intervals of time — picoseconds (trillionths of seconds) — over which charged secondary particles of light travel. The $50,000 third-place scholarship went to Kelley Harris of C.K. McClatchy High School in Sacramento, California, for her work on Z-DNA binding proteins, which may play a role in cell responses to certain virus infections. All in all, more than 1600 entries were submitted by students ranging in age from 15 to 18, with Intel awarding a total of $580,000 in prizes. For more information, see http://www.intel.com/education.

Eclipse Roadmap Released

The Eclipse Foundation has released Version 1.0 of its Eclipse Roadmap (http://www.eclipse.org/org/councils/roadmap.html), a document that outlines future directions for Eclipse. Among the themes and priorities outlined are issues related to scalability and enterprise-readiness; simplicity and extensibility; globalization; and attention to the rich client platform. Among the projects the organization will likely launch over the next year are more extensive coverage of the software development lifecycle, embedded development, multiple language support, and vertical-market technology frameworks.

New Largest Known Prime Number Discovered

Martin Nowak, an eye surgeon in Germany and a long-time volunteer in the Great Internet Mersenne Prime Search (GIMPS) distributed computing project (http://www.mersenne.org/prime.htm), has discovered the largest known prime number. Nowak used one of his business PCs and free software by George Woltman and Scott Kurowski. His computer is part of a worldwide array of tens of thousands of computers working together to make this discovery. The formula for the new prime number is 2^25,964,951 – 1. The number belongs to a special class of rare prime numbers called “Mersenne primes.” This is only the 42nd Mersenne prime found since Marin Mersenne, a 17th-century French monk, first studied these numbers over 350 years ago. Written out, the number has 7,816,230 digits, over half a million digits larger than the previous largest known prime number. It was discovered after more than 50 days of calculations on a 2.4-GHz Pentium 4 computer.
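Mersenne numbers are the ones GIMPS searches because they admit an especially simple primality check, the Lucas-Lehmer test. A minimal sketch (fine for small exponents; the real project uses heavily optimized FFT multiplication for exponents like Nowak's 25,964,951):

```python
def is_mersenne_prime(p):
    """Lucas-Lehmer test: for an odd prime p, M_p = 2**p - 1 is prime
    iff s == 0 after p-2 iterations of s -> s*s - 2 (mod M_p)."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0
```

For example, p = 13 passes (M13 = 8191 is prime), while p = 11 fails, since 2047 = 23 × 89.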

Dr. Dobb's News & Views, June 1, 2005

The term "omniscient debugging" describes the concept that debuggers should know everything about the run of a program, that they should remember every state change, and be able to present to you the value of any variable at any point in time. Essentially, omniscient debugging means that you can go backwards in time.

Omniscient debugging eliminates the worst problems with breakpoint debuggers: no "guessing" where to put breakpoints, no "extra steps" to debugging, no "Whoops, I went too far," no nondeterministic problems. And when the bug is found, there is no doubt that it is indeed the bug. Once the bug occurs, you will never have to run the program a second time.

Breakpoint debugging is deduction based. You place breakpoints according to a heuristic analysis of the code, then look at the state of the program each time it breaks. With omniscient debugging, there is no deduction. You simply follow a trail of "bad" values back to their source.

In 1969, Bob Balzer implemented a version of omniscient debugging for Fortran that ran on a mainframe and had a TTY interface (see "EXDAMS: Extendible Debugging and Monitoring System," ACM Spring Joint Computer Conference, 1969). Since then, there have been a half dozen other related projects that were also short-lived. Zstep (Lieberman) was a Lisp debugger that came closest to commercial production and shared a similar GUI philosophy (see "Debugging and the Experience of Immediacy," by H. Lieberman, Communications of the ACM, 4, 1997). With the advent of Java (and its beautiful, simple, bytecode machine!), there have been at least five related projects, including two commercial products: CodeGuide from OmniCore (http://www.omnicore.com/) and RetroVue from VisiComp (http://www.visicomp.com/).

The ODB I present here implements this concept in Java. The ODB is GPL'd and available (with source code) at http://www.LambdaCS.com/. It comes as a single jar file that includes a manual, some aliases (or .bat files), and three demo programs. The current implementation is fairly stable, though not perfect. It is pure Java and has been tested (more or less) on Solaris, Mac OS, and Windows 2000. I am working on integration with Eclipse and IntelliJ. You are welcome to try it out and encouraged to let me know what you think.

The basic implementation of the ODB is straightforward. In Java, it is sufficient to add a method call before every putfield bytecode that records the variable being changed, its new value, the current thread, and the line of code where it occurred. (The ODB records more than this, but this is the fundamental idea.)
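The recording call can be pictured as a static method like the following. This is a minimal sketch of the idea only; the class name, method signature, and event fields are my own invention, not the ODB's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical recorder: one global event list, locked on every record.
// Instrumented "putfield" sites would call change(...) just before the store.
public class Recorder {
    public static final class Event {
        public final String sourceLine;   // e.g. "Demo.java:44"
        public final String variable;     // name of the field being assigned
        public final Object newValue;
        public final String thread;

        Event(String sourceLine, String variable, Object newValue, String thread) {
            this.sourceLine = sourceLine;
            this.variable = variable;
            this.newValue = newValue;
            this.thread = thread;
        }
    }

    private static final List<Event> events = new ArrayList<Event>();

    // Called by instrumented code before every field assignment.
    public static synchronized void change(String sourceLine, String variable,
                                           Object newValue) {
        events.add(new Event(sourceLine, variable, newValue,
                             Thread.currentThread().getName()));
    }

    public static synchronized int eventCount() { return events.size(); }
}
```

Recording the thread name at call time is what later lets events be grouped per-thread in the trace display.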

With instrumentation taken care of, you can turn your attention to the real issues: how this information is going to be displayed and how you are going to navigate through time to find important events. All of this should help answer the question: "Does omniscient debugging work?"

Display and Navigation
The ODB GUI is based on a few fundamental principles:

• It displays all of the most relevant state information simultaneously. (You should not have to toggle between popups.)

• It provides easy-to-read, unambiguous print strings for objects. (<Thing_14> is a reasonable choice, corp.Thing@0a12ae0c isn't.)

• It makes the most frequent operations simple. (You shouldn't have to do text selections, use menus, and the like.)

• It doesn't surprise you. (Displays are consistent and you can always go back to where you were.)

The ODB records events in a strict order and assigns them timestamps. Every variable has a known value (possibly "not-yet-set") at every timestamp, and everything in the debugger window is always updated to show the proper data every time you revert the debugger to a different time. There is never any ambiguity.

16 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

Omniscient Debugging

Bil has written three books on multithreaded programming along with the GNU Emacs Lisp manual and numerous articles. He is currently a professor at Tufts University and can be contacted at [email protected].

BIL LEWIS

An easier way to find program bugs



You mainly want to know what the program state available to a chosen stack frame is: local variables, instance variables of the this object, and perhaps some static variables and details about other selected objects. You also need to know which line of code in which stack frame you are looking at. This is what the ODB displays. Such things as the build path, variable types, inheritance structures, and the like are rarely of interest during debugging and shouldn't take up valuable screen space.

For print strings, the ODB uses the format <Thing_34 Cat>, where the type (without the package) is shown with an instance counter and an optional, programmer-chosen instance variable. An important detail is that the print string must be immutable: it has to be the same at time 0 as it is at time 10,000. Print strings have a maximum width of 20 characters, so that wrap-around isn't an issue. This format also lends itself well to method traces:

<Person_3 John>.groom(<Pet_44 Dog>, 44.95) -> true
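Such immutable labels can be generated with a per-class instance counter. The helper below is my own sketch of one way to do it (all names invented), not ODB source; a production version would use an IdentityHashMap and capture the label at construction time:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical label generator producing strings like "<Thing_3 Cat>".
public class PrintStrings {
    private static final Map<Class<?>, Integer> counters =
            new HashMap<Class<?>, Integer>();
    private static final Map<Object, String> names =
            new HashMap<Object, String>();

    // Returns an immutable label; the optional instance-variable text is
    // captured once, so the label never changes as the object mutates.
    public static synchronized String label(Object o, String instanceVar) {
        String s = names.get(o);
        if (s == null) {
            Class<?> c = o.getClass();
            int n = counters.merge(c, 1, Integer::sum);  // per-class counter
            s = "<" + c.getSimpleName() + "_" + n
                    + (instanceVar == null ? "" : " " + instanceVar) + ">";
            if (s.length() > 20)                         // cap width at 20 chars
                s = s.substring(0, 19) + ">";
            names.put(o, s);
        }
        return s;
    }
}
```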

In Figure 1, method calls are shown per-thread and indented, making clear the calling structure of the program. Traces are wonderful for orienting you to where/when in the program you are.

The primary operations I usually want are "show the previous/next value of a variable," and sometimes the first/last value. This is what the arrow buttons do for the selected line (for instance, next context switch, next variable value, next line of code, next line of output, and so on). Threads, traces, output lines, and lines of code may be selected directly, in which case the current time reverts to when that line was executed/printed. On occasion, it is useful to see the entire set of values a variable ever had (Figure 2). Being a rarer operation, it is on a menu.

At the bottom of the window is a minibuffer where messages are printed and where typed interactions occur. When you change the current time, the minibuffer displays the type of the newly selected timestamp (for example, the local assignment as in Figure 2) and how far it is from the previously selected time. Incremental text searches through the trace pane entries (based on EMACS's I-search) are run here, as are expression evaluation and more elaborate event searches.

In addition to the variables in the current stack frame, it is common to keep an eye on specific objects. Any object displayed in any pane may be copied to the Objects Pane, where it remains as long as desired. As with all variables, the contents of those objects are updated as you revert the ODB to different times. Variables whose values have changed between the previously and currently selected time are marked with a "*" (sum in Figure 1). Those not yet set are displayed with a value of "-" (i). Similarly, thread <Sorter_4> has not yet executed its first instruction.

More Elaborate Interactions and Displays
There are times when more elaborate displays or interactions are useful. For example, when there are thousands (or millions!) of traces, it is useful to filter out uninteresting ones. You can filter the trace pane to show only those method calls in a particular package, or those not in that package, or only those at a depth less than a limit.

You can revert the debugger to any time desired, then evaluate a method call using the then-current values of the recorded objects. These methods will be executed and their traces will be collected in a different timeline, which won't affect the original data. In the secondary timeline, it is possible to change the value of instance variables before executing a method. (The primary timeline is inviolate.) The input format is identical to the trace format and even does name completion.

There are some cases where a different print format is more intuitive for you. Thus, a blocked thread acquires a pseudovariable to show the object it's waiting on (see _blockedOn in Figure 1), as will the objects whose locks are used. Notice that this information is duplicated in the threads pane.

The collection classes constitute another special case. In reality, a Vector comprises an array of type Object, an internal index, and some additional bookkeeping variables that are not of any interest to anyone other than the implementer. So the ODB displays collections in formats more consistent with the way they are used.

In addition to the simple text search, there is an event search command that lets you look for specific assignments, calls, and the like (based on the work of M. Ducassé; see "Coca: An Automated Debugger for C" in the Proceedings of the 21st International Conference on Software Engineering, 1999). A good example of its use is the quick sort demo that is part of the ODB distribution. The program sorts everything perfectly, except for elements 0 and 1. To search for this problem, you can issue an event search to find all calls to sort() that include elements 0 and 1.



Figure 1: The main ODB window.

Figure 2: Displaying all values of a variable (or array element).


port = call & callMethodName = "sort" & arg0 = 0 & arg1 >= 1

There are four matches, and sort(0, 2) is the obvious call to look at. Stepping through that method, it's easy to find the bug. Event searching has a long and honorable history of providing a "super intelligent" breakpoint facility. I find it a nice surprise that it's so useful here, too.

You can write ad hoc display methods for specific objects. For example, when debugging the ODB itself, I have a method that converts the bit fields of the timestamps (which are integers) into readable strings. Instead of seeing "5235365," I see "Thread-1 | a=b | Demo.java:44."

The Snake in the Grass
If you see a snake in the grass and you pull its tail, sooner or later you get to its head.

Under the ODB, a great number of bugs exhibit this kind of behavior. (By contrast, breakpoint debuggers suffer from the "lizard in the grass" problem. Even when you can see the lizard and grab its tail, the lizard breaks off its tail and gets away.) If a program prints something wrong ("The answer is 41" instead of 42), then you have a handle on the bug (you can grab the tail).

"Pulling the tail" consists of finding the improper value at each point, and following that value back to its source. It is not unusual to start the debugger, select the faulty output, and navigate to its source in 60 seconds. And you can do this with perfect confidence for bugs that take hours with conventional debuggers.

A good example of this is when I downloaded the Justice Java class verifier (http://jakarta.apache.org/bcel/). I found that it disallowed a legal instruction in a place where I needed it. Starting with the output "Illegal register store," it was possible to navigate through a dozen levels to the code that disallowed my instruction. I changed that code, recompiled, and confirmed success. This took 15 minutes (most of which was spent confirming the lack of side effects) in a complex code base of 100 files I had never seen before.

A happy side benefit of the ODB is how easy it makes it to see what a program is doing. I teach a class on multithreaded programming and I use the ODB regularly to explain what a program is doing. Students get to see which threads ran when



When I push zoom a second time, it displays the wrong data. Sometimes.

He had spent several hours looking for the bug with Eclipse. He was fairly sure that a certain ArrayList containing time-event pairs was being corrupted, but nothing more. He thought it would be interesting to try the ODB.

The EVolve visualization system comprises some 80K lines of Java and is designed to read in and display performance data on Symbian applications for mobile phones. It had been written over several years by several programmers, none of whom were available. I had never used the tool, nor had I ever seen a single line of the source code. It was exactly what I was looking for!

The first indication of the problem was a dialog box that displayed a zero time. So I did an incremental search through the traces in the AWT thread for the string "Start time." Of the eight events that contained that string, the second-to-last showed a zero time range.

I could see from the code that the string was constructed using the start and end instance variables from a Selection object. This particular object had the same value for both. Selecting that object and stepping back to its creation, I could see that RefDim.makeSelection() was calling new with those values. That method was called from Paladin.select(), which obtained those values by taking the difference between the start values of elements two and four of the aforementioned ArrayList. I noticed that the first five elements of the list were the same object.

Stepping backwards to find out who put those values in, I discovered this odd little loop, which ran in thread T1. (At this point, there is no clear connection between the creation of the list and its use.)

while ((x/Math.abs(interval) - xOffset) >= timeMap.size()) {
    timeMap.add(time2event);
}

It was clear that multiple identical entries in the list were allowed, but that these particular ones were wrong. After staring blindly at the loop for a while, I stepped back to the caller:

countEvents(x + xOffset*interval);

The programmer was adding the offset to the X value, only to remove it in the loop. Weird. I had noticed that the second selection only failed if it were in a low range (presumably 0 to xOffset*interval, which was also the range where the ArrayList values were identical).

Removing the offset eliminated the bug. The entire session lasted about an hour, during which I looked at 20 objects in a dozen files. Most of my time was spent trying to understand the intent of the code.

—B.L.

The ODB At Work


and if there's any confusion, they can just back up and look again. Some problems that used to be difficult suddenly become trivial.

Implementation
The ODB keeps a single array of timestamps. Each timestamp is a 32-bit int containing a thread index (8 bits), a source-line index (20 bits), and a type index (4 bits). An event for changing an instance variable value requires three words: a timestamp, the variable being changed, and the new value. An event for a method call requires a timestamp and a TraceLine object containing: the object, method name, arguments, and return value (or exception), along with housekeeping variables. This adds up to about 20 words. A matching ReturnLine is generated upon return, costing another 10 words.
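The 8/20/4 split is ordinary bit packing. The sketch below assumes one plausible field layout (thread in the high bits, type in the low bits); the ODB's actual encoding may differ:

```java
// Hypothetical 32-bit timestamp layout: [ thread:8 | sourceLine:20 | type:4 ]
public class Timestamps {
    public static int pack(int thread, int line, int type) {
        return (thread & 0xFF) << 24 | (line & 0xFFFFF) << 4 | (type & 0xF);
    }
    public static int thread(int ts) { return (ts >>> 24) & 0xFF; }
    public static int line(int ts)   { return (ts >>> 4) & 0xFFFFF; }
    public static int type(int ts)   { return ts & 0xF; }
}
```

Packing all three fields into one int is what keeps the per-event cost down to a few words.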

Every variable has a HistoryList object associated with it that is just a list of timestamp/value pairs. When the ODB wants to know what the value of a variable was at time 102, it just grabs the value at the closest previous time. The HistoryLists for local variables and arguments hang off the TraceLine; those for instance variables hang off a "shadow" object that resides in a hashtable.
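Because events are recorded in time order, the "closest previous time" lookup is a binary search over a sorted list. A minimal sketch (class and method names are mine, not the ODB's):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical history list: parallel lists of ascending timestamps and values.
public class HistoryList {
    private final List<Integer> times = new ArrayList<Integer>();
    private final List<Object> values = new ArrayList<Object>();

    // Events arrive in strictly increasing time order.
    public void record(int time, Object value) {
        times.add(time);
        values.add(value);
    }

    // Value at the closest recorded time <= t, or null ("not-yet-set") if none.
    public Object valueAt(int t) {
        int lo = 0, hi = times.size() - 1, best = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (times.get(mid) <= t) { best = mid; lo = mid + 1; }
            else hi = mid - 1;
        }
        return best < 0 ? null : values.get(best);
    }
}
```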

Every time an event occurs, the recording method locks the entire debugger, a new timestamp is generated and inserted into the list, and associated structures are built. Return values and exceptions are back-patched into the appropriate TraceLine as they are generated.

Code insertion is simple. The source bytecode is scanned,and instrumentation code is inserted before every assignmentand around every method call. A typical insertion looks like this:

289 aload 4           // new value
291 astore_1          // Local var X
292 ldc_w #404        <String "Micro4:Micro4.java:64">
295 ldc_w #416        <String "X">
298 aload_1           // X
299 aload_2           // parent TraceLine
300 invokestatic #14  <change(String, String, Object, Trace)>

where the original code was lines 289 and 291, assigning a value to the local variable X. The instrumentation creates an event that records that on source line 64 of Micro4.java (line 292), the local variable X (line 295), whose HistoryList can be found on the TraceLine in register 2 (line 299), was assigned the value in register 1 (line 298). The other kinds of events are similar.



Allen had finally decided that there was a bug in the JVM. Either that, or he was losing his mind. This was one of those bugs that "couldn't happen."

Allen used to like null pointer exceptions. The stack backtrace always identified the offending line, which usually indicated the offending variable, and then (eventually) he would find the bug. But lately he had started to dread the null pointer exception. "The stack backtrace helped me to understand the symptom of the bug, but the code containing the actual cause of the bug was often far removed from the exception-throwing code," he explained. "And by the time the exception was thrown, the underlying cause was sometime in the past. Who assigned null to the variable? And when? Where? That's the real bug, and by the time the exception occurs, it's usually way too late to pinpoint the cause."

In this particular case, the null value was contained in a field, so it wasn't too difficult using conventional tools to find all the places in the code where it was used. From that list, Allen whittled it down to a long list of assignments.

But upon examination of the code, it seemed none of them could be the cause. In each case, if the value were null, it would have blown up a few lines before the assignment, because in each case the value was dereferenced. Allen dutifully added some print statements, but they just confirmed what he had already surmised. He examined the constructors and added code to check the initialization values, but they, too, always verified that the value being assigned was nonnull.

After two days of tearing his hair out, he e-mailed his colleague Carl for help. "I don't think Allen really expected me to be able to help," Carl said. "I work remotely, and I wasn't familiar with Allen's code. But he sounded desperate." Carl asked Allen to run the program under RetroVue and send him the resulting journal file. Ten minutes after receiving it, Carl sent Allen information (see Figure 3) that he had captured from the RetroVue journal viewer and marked up. RetroVue clearly showed that, although the value of the argument was not null (line 115), the field named "configuration" was indeed being assigned a null value in GraphElement's constructor on line 119. That would explain the eventual null pointer exception. But why wasn't the argument value being assigned to the field? Allen examined the constructor one more time:

115 public GraphElement(GraphNode parent, Configuration configration) {
116     if (configration == null) { throw new IllegalArgumentException(); }
117     this.parent = parent;
118     this.children = new GraphNode[0];
119     this.configuration = configuration;
120 }

Finally, he noticed the spelling error. Because the configuration argument was missing a "u," it was never used in the constructor at all (except in the check for null, which Allen had added, inadvertently duplicating the spelling error with copy-and-paste). So line 119, which was supposed to copy the nonnull argument value into the field, simply copied the field to itself (as if the assignment statement had been written this.configuration = this.configuration;). And, of course, the field's initial value was null.

The incident convinced Allen to start using RetroVue himself. "We had found and fixed the bug, but now I was curious why it occurred intermittently. Using RetroVue, I got a clear understanding of exactly what was going on in these classes in less than an hour. And along the way, I found and fixed another potential bug, and discovered the cause of a performance problem that another engineer had been struggling with."

—Ron Hughes
http://www.visicomp.com/

RetroVue At Work

Figure 3: Information captured from the RetroVue journal viewer.


Performance
The most common remarks I hear when introducing omniscient debugging are that it takes too much memory and it is much too slow. These are legitimate concerns, but miss the main point: does it work? If it is effective, there are all sorts of clever techniques for reducing the cost of the monitoring. If it isn't, who cares?

In experience with the ODB, neither CPU overhead nor memory requirements have been a stumbling block. On a 700-MHz iBook, it takes the ODB 1µs to record an assignment. I have recorded 100 million events in a 31-bit address space. A tight loop may run 300 times slower in a naïve ODB, but such a loop can also be optimized to do no recording at all (it's recomputable). Ant recompiling itself runs 7 times slower and generates about 10 million events. (We're recording Ant, not the compiler!)

There are a number of possible optimizations:

• You can start/stop recording either by hand or by writing an event pattern. Most button-push bugs are amenable to this. (Let the program start up with recording off, turn it on, push the button, turn it off, find the bug.)

• The ODB has a garbage collector that throws out old events as memory gets filled up. Of course, these are not really garbage, but it does make it possible to maintain a sliding window of events around the bug. (I find this is a grand idea, but only seldom useful.)

• It is quite normal for a large percentage of a program to consist of well-known, "safe" code. (These are either recomputable methods, or methods we just don't want to see the interiors of.) By requesting that these methods not be instrumented, a great number of uninteresting events can be eliminated. The ODB lets you select arbitrary sets of methods, classes, or packages, which can then be instrumented or not. A good optimizer can do this automatically.

My estimate is that a well-optimized omniscient debugger would exhibit an absolute worst case of 10 times slowdown, with 2 times being far more common. True, there are programs for which this is still too long, but not many. I've yet to encounter a bug that I couldn't find with the current ODB.

Conclusion
With the advent of omniscient debugging, it is becoming easier to find bugs. Bugs that were completely intractable with traditional techniques have become child's play. As optimization techniques improve and omniscient debuggers come to handle larger and larger problems, bugs will find it harder and harder to hide.

DDJ



As the producers of an IDE that includes omniscient debugging facilities, naturally we use back-in-time debugging daily during development. CodeGuide works a bit differently than ODB in that it marries conventional debugging with back-in-time facilities. Thus, the developer can use conventional breakpoints when he wants to and go back in time when necessary. Once a breakpoint is set, CodeGuide instructs the debugged application to log execution details in the method containing the breakpoint and "related" methods. The rest of the application is not affected and can continue to run at full speed. If the bug is easy to fix, the developer can HotSwap his changes into the VM and test them without the need to restart the application.

Bugs in Java applications often manifest themselves in uncaught exceptions. The dreaded NullPointerException is probably the most common example.

Now, once a NullPointerException is thrown because the return value of a function is null, you usually have no clue why this function returned null. For example, I recently had to find a bug in some code that uses a cache to hold some per-file data. With omniscient debugging in CodeGuide, I just had to set a breakpoint on the main exception handler of this thread, and then I stepped backwards to the cause of the exception.

The code in which the NullPointerException was thrownlooked like this:

class Processor {
    private Cache cache;

    public String process(File file) {
        // ...
        Data data = cache.getData(file);
        // ...
        return data.getStringRepresentation(); // <- NPE thrown
    }
}

It was clear that the Cache.getData() method was at fault for returning null. But how could that happen? The Cache.getData() method was not supposed to ever return null. It was supposed to return dummy data instead:

class Cache {
    private Map<File, Data> cachedFiles = new HashMap<File, Data>();

    private Data getData(File file) {
        if (!cachedFiles.containsKey(file))
            return new DummyData();
        return cachedFiles.get(file);
    }
}

Stepping back into the Cache.getData() method revealed that cachedFiles.get(file) returned null, even though cachedFiles.containsKey(file) returned true. A peculiarity of HashMap is that it allows storing null values, in contrast to Hashtable, which does not. Thus, changing the code to not use HashMap.containsKey() fixed the problem:

private Data getData(File file) {
    Data data = cachedFiles.get(file);
    if (data == null)
        return new DummyData();
    else
        return data;
}
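The HashMap peculiarity behind this bug is easy to reproduce in isolation. This is a standalone sketch of my own, not code from the CodeGuide session:

```java
import java.util.HashMap;
import java.util.Map;

public class NullValueDemo {
    public static boolean lookupIsAmbiguous() {
        Map<String, String> m = new HashMap<String, String>();
        m.put("key", null);                // HashMap happily stores a null value
        // containsKey() says the entry exists, yet get() returns null --
        // indistinguishable from a missing key without a second check.
        return m.containsKey("key") && m.get("key") == null;
    }
}
```

(Hashtable, by contrast, throws a NullPointerException on put(key, null).)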

Could I have found this bug without omniscient debugging? Certainly. But it would have taken much, much longer and several debugging sessions with breakpoints in different locations to get to the root of this.

—Hans Kratz
http://www.omnicore.com/

CodeGuide At Work


Examining Software Testing Tools

Programmer's Toolchest

In a world where automatic updates and the latest service packs are quick fixes for common problems, the need for higher quality control in the software-development process is evident and the need for better testing tools to simplify and automate the testing process apparent. Why are so many developers seeking automated software testing tools? For one reason, an estimated 50 percent of the software development budget for some software projects is spent on testing [16]. Finding the right tool, however, can be a challenge. How is one tool measured against another? What testing features are important for a given software development environment? Does a testing tool really live up to its claims?

In this article, we analyze several software testing tools: JUnit, Jtest, Panorama for Java, csUnit, HarnessIt, and Clover.Net. The first three tools were developed to work with Java programs, while the latter three work on C# programs. We chose tools that ranged from freeware to commercial, and from basic to sophisticated. Our focus in selecting the tools, as well as the tests we performed, was on class-based testing. Our intention here is not to provide a comprehensive comparison of a set of testing tools, but to summarize the work completed in an independent study in an academic environment.

Software Testing Overview
Binder defines software testing as the execution of code using combinations of input and state selected to reveal bugs. Its role is limited purely to identifying bugs [2], not diagnosing or correcting them (debugging). The test input normally comes from test cases, which specify the state of the code being tested and its environment, the test inputs or conditions, and the expected results. A test suite is a collection of test cases, typically related by a testing goal or implementation dependency. A test run is an execution of a test suite with its results. Coverage is the percentage of elements required by a test strategy that have been traversed by a given test suite. Regression testing occurs when tests are rerun to ensure that the system does not regress after a change [3]. In other words, the system passes all the tests it did before the change.

Components usually need to interact with other components in a piece of software. As a result, it is common practice to create a partial component to mimic a required component. Two such instances of this are:

• Test driver, a class or utility program that applies test cases to a component to be tested.

• Test stub, a partial, temporary implementation of a component, which lets a dependent component be tested.

Additionally, a system of test drivers and other tools to support test execution is known as a "test harness," and an "oracle" is a mechanism to produce the predicted outcomes to compare with the actual outcomes of the software under test [8].
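These roles can be made concrete with a small invented example: a driver exercises a component through a stub, with a hard-coded expected value acting as the oracle. All class names here are illustrative, not taken from any of the tools under review:

```java
// Component under test depends on a PriceSource; the stub stands in for
// the real (perhaps unfinished) implementation.
interface PriceSource { double price(String item); }

class PriceSourceStub implements PriceSource {          // test stub
    public double price(String item) { return 10.0; }   // canned answer
}

class Checkout {                                        // component under test
    private final PriceSource source;
    Checkout(PriceSource source) { this.source = source; }
    double total(String item, int qty) { return source.price(item) * qty; }
}

public class CheckoutDriver {                           // test driver
    // Applies one test case and compares the result against the oracle.
    public static boolean runTestCase() {
        double actual = new Checkout(new PriceSourceStub()).total("widget", 3);
        double expected = 30.0;                         // oracle: predicted outcome
        return actual == expected;
    }
}
```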

Class-based testing is the process of operating a class under specified conditions, observing or recording the results, and making an evaluation of some aspect of the class [4]. This definition is based on the IEEE/ANSI definition of software testing [5]. Class-based testing is comparable

Class-based testing goes to work

DAVID C. CROWTHER AND PETER J. CLARKE

David is a graduate student and Peter an assistant professor in the School of Computer Science at Florida International University. They can be contacted at [email protected] and [email protected], respectively.




to unit testing used for procedural programs, where each class is tested individually against its specification. The two main types of class-based testing are specification based and implementation based.

Specification-based testing, also known as black box or functional testing, focuses on the input/output behavior or functionality of a component [3]. Some methods used for specification-based testing include equivalence, boundary, and state-based testing. In equivalence testing, values in the domain are partitioned into equivalence classes, in which any value tested should produce the same result as any other value in that class. Boundary testing focuses on testing the extreme input values, such as the minimum and maximum values along with other values within their proximity. Finally, state-based testing generates test cases from a finite state machine (a statechart, for instance), which models some functionality of the system [3, 7].
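For instance, for a hypothetical method validating a percentage in the range [0, 100], equivalence and boundary testing might pick values like these (an invented illustration, not from any of the reviewed tools):

```java
public class GradeValidator {
    public static boolean isValidPercent(int p) { return p >= 0 && p <= 100; }

    public static boolean runTests() {
        // Equivalence classes: below range, in range, above range --
        // one representative value from each.
        boolean eq = !isValidPercent(-50) && isValidPercent(50)
                  && !isValidPercent(150);
        // Boundary values: the extremes and their immediate neighbors.
        boolean bd = !isValidPercent(-1) && isValidPercent(0)
                  && isValidPercent(100) && !isValidPercent(101);
        return eq && bd;
    }
}
```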

Implementation-based testing, also known as white box or structural testing, focuses purely on the internal structure of a coded component [3]. The most common implementation-based techniques are based on data flow and control flow analysis of code, which generate test cases based on some coverage criteria [2]. In data flow analysis, each variable definition is matched up with the places in code where the variable is used, constituting a def-use pair. These pairs are used for measuring coverage; for instance, "All-Uses" checks if there is a path from every def to each use of a variable [6]. Control flow testing analyzes coverage based on the order in which statements are executed in the code. For example, branch testing seeks to ensure every outcome of a conditional statement is executed during testing, and statement coverage verifies the number of statements executed.

While both specification-based and implementation-based testing have their advantages and disadvantages, it is generally accepted that some combination of the techniques is most effective [1].

Criteria for Comparison
Here are the criteria we used to evaluate the selected software testing tools:

Support for the testing process. How does a testing tool support the actual testing process? For each of the following items, we note whether they are supported by each of the testing tools. Possible result values for each criterion are: Y (for Yes), N (No), and P (Partial).

1. Support for creation of stubs/drivers/harnesses.

a. Does the testing tool provide a mechanism for generating stubs?

b. Does the testing tool provide a mechanism for generating drivers?

c. Does the testing tool provide a mechanism for generating test harnesses?

2. Support for comparing test results with an oracle.

• Does the tool provide a mechanism for automatically comparing the results of the test against the expected results?

3. Support for documenting test cases.

• Does the tool assist in keeping documentation of test cases?

4. Support for regression testing.

• Does the tool provide the ability for automated regression testing?

Generalization of test information. What additional information can be provided from a testing tool, to provide further support for the testing process?

1. Support for the creation of test cases.

a. Does the tool provide help with creating test cases?

b. Does the tool provide help with the generation of implementation-based test cases?

c. Does the tool provide help with the generation of specification-based test cases?

2. Support for structural coverage.

a. Does the tool provide a measure of statement coverage?

b. Does the tool provide a measure of branch coverage?

c. Does the tool provide a measure of all-uses coverage?

Usability of testing tool. How easily can a given tool be installed, configured, and used? As these criteria are more subjective, these factors are rated for each of the selected tools. Each element listed here is given a ranking from 1 to 5, where 5 is always the positive or more desired result, 3 is an average or ordinary result, and 1 indicates a poor or very negative result. For example, “Learning Curve” is based on the time needed to become familiar with the tool before being able to use it in practice, and a shorter amount of time results in a higher score.

1. Ease of Use

a. How easy is it to install the tool?

b. How user friendly is the interface?

c. What is the quality of online help?

d. How helpful are the error messages?

2. Learning Curve

• How long should it take an average programmer to be able to use the tool efficiently in practice?

3. Support

a. How quickly can you receive technical support information?

b. How helpful is the technical support provided?

Requirements for testing tool. What requirements and capabilities of the tools help determine which one(s) might be suitable for your circumstances?

1. Programming Language(s).

• What programming language(s) will the tool work with?

2. Commercial Licensing.

• What licensing options are available and at what costs?

3. Type of testing environment.

• Is the testing environment command prompt, Windows GUI, or part of the development environment?

4. System Requirements.

a. What operating systems will the tool run on?

b. What software requirements does the tool necessitate?

c. What hardware requirements does the tool impose?

Testing Tools Analyzed
JUnit is a freely available regression testing framework for testing Java programs, written by Erich Gamma and Kent Beck (http://www.junit.org/) [9]. It allows automated unit testing of Java code and provides output to both text-based and Windows-based user interfaces. To set up test cases, you write a class that inherits from a provided test case class, and code test cases in the format specified. The test cases can then be added to test suites where they can be run in batches. Once the test cases and test suites have been configured, they can be run again and again (regression testing) with ease. JUnit runs on any platform, needing just the Java runtime environment and SDK installed.

Jtest is a commercially available automated error-prevention tool from Parasoft (http://www.parasoft.com/). It automates Java unit testing using the JUnit framework, as well as checks code against various coding standards [11]. After a Java project is imported into Jtest, Jtest analyzes the classes, then generates and executes JUnit test cases. The generated test cases attempt to maximize coverage, expose uncaught runtime exceptions, and verify requirements. Additional test cases can be added to those produced by Jtest for more specific testing criteria. Setting Jtest apart from other testing tools is its ability to check and help correct code against numerous coding standards.

Created by International Software Automation (ISA), Panorama for Java is a fully integrated, commercially available software engineering tool that assists you with software testing, quality assurance, maintenance, and reengineering [12] (http://www.softwareautomation.com/java/index.htm). Panorama provides testing analysis, including code branch test coverage analysis, method and branch execution frequency analysis, test case efficiency analysis, and test case minimization. Panorama provides a GUI capture/playback tool to let you execute test cases manually using the application’s interface and then later play them back again. In addition to its testing capabilities, Panorama generates various charts, diagrams, and documents, which help measure software quality and make complex programs easier to understand.

28 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

csUnit is a freely available unit testing framework for .NET, which allows testing of any .NET language conforming to the CLR, including C#, VB.NET, J#, and managed C++ [13] (http://www.csunit.org/). Test cases are written using the functionality provided by the framework, and the test cases and test suite are identified by adding attributes to the corresponding functions and class. To run the test cases, you launch csUnitRunner and point it at the project’s binary file; then the GUI displays the results of the test.

HarnessIt, developed by United Binaries, is another commercial unit testing framework for .NET (http://www.unittesting.com/). Like csUnit, it supports any .NET CLR-compliant language [14], and provides documentation on testing legacy code, such as unmanaged C++ classes. Test cases are written in the .NET language of choice using the provided framework and marked with the appropriate attribute. A GUI runs the test cases and displays the results. The interface can be run outside of Visual Studio or integrated to automatically run every time the program is debugged. While HarnessIt provides similar functionality to csUnit and other unit-testing software, as a commercial tool it has superior documentation, features, and flexibility, and a full support staff for assistance.

Clover.Net is a code coverage analysis tool for .NET from Cenqua (http://www.cenqua.com/clover.net/). As such, it highlights code that isn’t being adequately tested by unit test cases [15]. Clover.Net provides method, branch, and statement coverage for projects, namespaces, files, and classes, though it currently only supports the C# programming language. A Visual Studio .NET plug-in enables configuring, building, and displaying results of test coverage data. Clover can be used in conjunction with other unit testing tools, such as csUnit or HarnessIt, or with handwritten test code.

Comparative Study
The example class we selected to evaluate the testing tools was a stack, which was simple enough to make analyzing the tool output straightforward. At the same time, it had enough functionality to provide various places to “break” the code. We wrote the stack class in Java (Listing One) and C# (Listing Two), to accommodate testing tools of both languages. The code for the two languages was similar, differing only in which access modifiers were needed, case of built-in object operations, and the fact that the class was stored in a package in Java and a namespace in C#.

Each testing tool had its own unique features and functions for testing; nevertheless, all tools had the test case in common. The test case is the most basic unit involved in testing, yet all other aspects of testing depend on it. Some tools require you to write the code for the test cases, while other tools generate test cases for you automatically.

JUnit. To set up test cases in JUnit, you write a test class, which inherits from the JUnit TestCase class. Each function in the class represents a test case where the code creates one or more instances of the class to be tested and executes functions of the class using parameters as necessitated by the test case. Listing Three is a test case that tests the pop function of Stack. Here, two stacks are created: one to test with (stTested) and the other to hold the expected result (stExpected). After the object being tested has been exercised appropriately, one or more assert functions are executed to check if the expected result(s) matches the actual result(s). In this example, one assert tests that the correct value is being popped off, then another assert checks that the final object matches the expected object. The JUnit framework catches any exceptions that are thrown by the asserts and handles them accordingly.

Test cases are added to test suites where they can be executed in sets, or, if all test case names begin with “test,” the test runner automatically creates a test suite with all the test cases and runs them. JUnit provides both text and GUI interfaces for the test runner. Configuration of the test cases takes time, but the core functionality of JUnit is regression testing, where a configured test case can be reexecuted any number of times.

Jtest. Jtest automates the test case generation process by analyzing each class and creating test cases for each function of the class. These test cases attempt to maximize coverage, expose uncaught runtime exceptions, and verify requirements by using the JUnit framework. Listing Four is a sample test case generated by Jtest where a test stack is created from an empty object array and an object is pushed onto it. The test case then verifies that the resulting Stack is not null. There are many similar white box test cases generated by Jtest, which attempt to execute as much of the code as possible; however, black box testing is not necessarily as automated.

For Jtest to automate black box testing, the specifications must be embedded into the code using Object Constraint Language (OCL), a language embedded into code as comments, which allows a class’s constraints to be formally specified [3]. Otherwise, you can either modify the automatically generated test cases to test the specification, or write JUnit test cases manually outside of the Jtest-generated cases. Regression testing is performed by reusing code, as with JUnit.
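As a sketch of the idea, the class below embeds an OCL-style postcondition in a comment. The constraint syntax shown is illustrative only; the exact comment format Jtest expects may differ, and the class itself is our own example.

```java
/**
 * Hypothetical example of specification-as-comments. A tool that
 * understands the embedded constraint could generate a black box
 * test checking that each call really increases the count by one.
 */
public class Counter {
    private int count = 0;

    /*
     * OCL-style postcondition (illustrative syntax, not Jtest's exact format):
     *   post: count = count@pre + 1
     *   post: result = count
     */
    public int increment() {
        return ++count;
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        System.out.println(c.increment()); // 1
        System.out.println(c.increment()); // 2
    }
}
```

A side benefit the article notes: constraints written this way double as documentation of the class’s contract.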

There are many additional functionalities Jtest provides to enhance the testing process; for example, it automatically creates stub and driver code for running test cases. Also, Jtest measures code coverage to show completeness of test cases. Finally, it checks code against numerous coding standards and can automatically correct many such issues. A testing configuration is completely customizable to include or exclude any of the numerous features/subfeatures.

Panorama for Java. Creating unit test cases like those just described is not possible in Panorama. The way to create test cases in Panorama is to manually execute them in the application’s interface while they are recorded. These test cases could later be replayed, where Panorama would reexecute the same test cases using the application’s interface, and thereby perform regression testing. However, this tests the user interface, not the class itself.

csUnit. In csUnit, test code is marked with attributes that allow the framework to identify the various test components. Test cases are placed in a test class, which is preceded by a [TestFixture] attribute, while the test cases themselves are preceded by a [Test] attribute. An Assert class is provided by the framework to validate whether a test case passed. Listing Five is the testPopEmpty test case. Here, Stack st is instantiated as an empty stack. An Assert.True is then used to test that an attempted pop would return null.

Test cases are run by the csUnitRunner application, which can be started from the Program menu under csUnit or by choosing csUnitRunner from the Tools menu in Visual Studio. In csUnitRunner, test cases are loaded by choosing File|Assembly|Add and pointing to the binary file produced by building the solution to test. Then, clicking the Run icon or choosing Test|Run All starts csUnitRunner, executing each test case and displaying the subsequent results.


HarnessIt. HarnessIt test cases, like csUnit’s, are identified with an attribute; only in this case, the attribute is [TestMethod], and the class containing it is noted with a [TestClass] attribute. The method of evaluating a test case, though, is different. Instead of using an Assert class to test validity, a TestMethodRecord type is provided and passed to a test case function. After the test case has been set up, the RunTest method of the TestMethodRecord type is called and passed a Boolean statement, which can include unit test commands, comparison operators, and expected results. Listing Six is the test case for testPushFull, which fills Stack s1 with the maximum number of values. Stack s2 is instantiated as a duplicate of Stack s1. Then the code attempts to push another value onto s1. The TestMethodRecord verifies that s1 still equals s2, because no more values can be pushed onto a full stack.

Test files are run from the HarnessIt application, which can be opened from the HarnessIt option in the Program menu. Inside HarnessIt, choosing File|Add Assembly and pointing to the project’s binary loads the project for testing. The test cases can then be run by choosing File|Run from the menu or pressing F5. A window pops up with an indicator bar showing progress as the tests are being run. The indicator bar remains green as long as tests pass, and turns red when they fail. Once finished, a report is generated, displaying all test cases in a hierarchical view with their test execution status.

Clover.Net. Clover.Net, as a coverage analysis tool, does not provide a mechanism for creating test cases, but rather, assumes test cases have already been coded. Clover.Net analyzes the test cases when they are executed, to produce coverage results. Clover.Net provides a Visual Studio plug-in, which lets you configure testing options as well as view coverage results directly in Visual Studio. Clover can even highlight the code that hasn’t been tested, making it easy to find which areas were missed. In the Clover View, functions are marked green if they have been fully exercised, yellow if they have been partially exercised, and red if they have not been exercised at all. Statistics are given for the percentage of methods, branches, and statements covered, and these statistics can be broken down from the solution level to project, class, or even function level.

Test Results
Each tool was examined according to the criteria presented here. Tables 1 through 4 detail the results of these evaluations, which correspond to the four sets of criteria. Tables 1 and 2 deal with features that a tool may or may not have. The result cells for each tool contain a Y (Yes, it has this feature), N (No, it doesn’t support this), or P (Partial functionality of the feature available). Table 3 presents more subjective criteria, for which a score of 1 to 5 is assessed, where 5 = Excellent, 4 = Good, 3 = Average, 2 = Sub par, and 1 = Poor. Finally, Table 4 provides additional information on requirements of the tools.

Support for the testing process. These criteria deal with the additional support a tool can provide to make the testing process easier (see Table 1). All of the tools provided a test harness for running the test cases, and the ability for regression testing. Many of the tools scored a “P” in the Creation of Drivers category because, though they provided a framework for creating drivers, Jtest was the only tool that actually automated this process, along with the creation of stubs. All the unit testing tools gave the ability to check results with a user-supplied oracle. None of the tools provided help for documenting test cases, although some of the commercial tools may have an additional module for purchase to assist with this.

Generalization of test information. The criteria in this section deal with how well a tool supports the actual testing process, including automatic generation of test cases and various coverage measurements (Table 2). Not surprisingly, the freeware tools did not offer help in these areas, leaving it to you to perform these tasks manually. Jtest again scored well with the ability to automatically generate test cases, and provide branch and statement coverage results when running them. Panorama and Clover.Net both provided branch and statement coverage for test cases that were run.

Usability of testing tool. These criteria deal with ease of setup and use (Table 3). While the freeware tools scored well in these categories, bear in mind that they don’t offer the functionality of the other tools. To substantiate the results and eliminate bias, the tools were distributed to a group of computer science students at Florida International University who ranked the tools on the preselected criteria.


Table 1: Support for the testing process.

Criteria                             JUnit  Jtest  Panorama   csUnit  HarnessIt  Clover.Net
                                                   for Java
1a. Creation of Drivers              P      Y      N          P       P          N
1b. Creation of Stubs                N      Y      N          N       N          N
1c. Creation of Test Harness         Y      Y      Y          Y       Y          Y
2.  Compare Results with Oracle      Y      Y      N          Y       Y          N
3.  Help with Documenting
    Test Cases                       N      N      N          N       N          N
4.  Capable of Regression Testing    Y      Y      Y          Y       Y          Y

Table 2: Generalization of test information.

Criteria                             JUnit  Jtest  Panorama   csUnit  HarnessIt  Clover.Net
                                                   for Java
1a. Creation of Test Cases           N      Y      N          N       N          N
1b. Implementation-based
    Test Cases                       N      Y      N          N       N          N
1c. Specification-based
    Test Cases                       N      Y      N          N       N          N
2a. Branch Coverage                  N      Y      Y          N       N          Y
2b. Statement Coverage               N      Y      Y          N       N          Y
2c. All-uses Coverage                N      N      N          N       N          N

Table 3: Usability of testing tool.

Criteria                             JUnit  Jtest  Panorama   csUnit  HarnessIt  Clover.Net
                                                   for Java
1a. Ease of Installation             3.9    4.4    4.3        5       4.9        3.4
1b. User-friendliness of Interface   3.7    4.4    4.3        5       4          4
1c. Quality of Online Help           3.8    3.4    3.3        3.1     4.7        4.2
1d. Helpfulness of Error Messages    4.3    4.2    3.5        3.9     4.3        4.2
2.  Learning Curve                   3.9    3.4    3.8        4.6     3.8        3.2
3a. Responsiveness of
    Technical Support                N/A    3.5    3          N/A     5          4.7
3b. Helpfulness of
    Technical Support                N/A    3.5    3          N/A     5          4.7


Requirements of testing tool. Table 4 provides additional information on the requirements of each tool. In addition to the fact that several of the products can be used for multiple programming languages, others have similar products available for additional languages. The remaining requirements can help determine which tools would fit a particular environment, and whether hardware and software might need to be upgraded or purchased to use a given tool.

Analysis of Tools
JUnit provided a framework for creating, executing, and rerunning test cases. The documentation was easy to follow and gave thorough instructions on how to configure JUnit to work with existing code and set up test cases accordingly. The interface clearly showed which test cases failed and where, making it easy to find and correct problems. Once test cases had been set up, they could be executed over and over again for regression testing. Like other freely available tools, JUnit does not possess the automated features of tools such as Jtest and Panorama.

JUnit provides a good starting point for testing Java programs. As a free program, there is an obvious cost benefit over more sophisticated tools. Additionally, the straightforwardness of setting up test cases makes the learning curve minimal. Many programmers find this an adequate testing tool; those working on larger applications would most likely seek more advanced tools.

Jtest took the JUnit framework and added automation to it. Jtest was a bit more difficult to set up than JUnit, and we had a hard time getting it configured properly with the documentation provided. On the other hand, Parasoft’s technical support was very responsive, usually replying to e-mail within a day. We were given a direct number to a support agent, who remotely logged into our test computer and showed us how to configure Jtest to work with our code.

Jtest automated the process of creating test cases and stubs for our sample code, and provided an environment for running tests. Automation of white box test cases went seamlessly; however, for black box test cases, a little more work was required up front to insert OCL statements into the code. One advantage of doing so is that it produced code documentation along the way. In addition to testing functionality, the major selling point of this tool was its extensive ability to check code against numerous coding standards, and help bring the code into conformance.

Jtest is a full-featured, advanced testing tool, and as such, comes with a price tag. It is intended for advanced users in a sophisticated software development environment. It provides lots of functionality to assist testers and programmers in developing quality products.

Panorama for Java was simple to install and walk through using the sample program provided by Panorama. However, we were unable to get Panorama to run test cases for our sample class. This was because Panorama does not actually generate test cases, or provide a framework for creating them. What it does provide is the ability to record test cases that are manually executed, so that they can be played over again. This, however, is really used for testing the user interface, whereas we were looking to run test cases against the class itself in the code.

Panorama does provide sophisticated reports and graphs for measuring testing and the code itself. Perhaps in the future, Panorama will incorporate more capabilities for unit testing into its software suite.

csUnit provided a unit testing framework for .NET, in our case, a C# testing solution. The documentation was simple and easy to follow, though not overly extensive. The csUnitRunner interface was well designed, with a hierarchical view of the test cases available to navigate, and the standard color-coded results indicating if a test passed or failed. Regression testing was achieved in the same manner as the other unit testing tools.

As a freeware tool, csUnit is appealing for budget-conscious and novice testers. Its simplicity is another quality that makes it a good starting point for novices, as well as an acceptable test tool for time-constrained developers.

HarnessIt offered a testing product with similar functionality to csUnit at a nominal price. It supplied a full help menu, which specified the basics of how to test code with the provided framework, as well as more advanced topics, such as using HarnessIt to test legacy code. HarnessIt provided the standard unit testing functionality, and displayed test results in an easy-to-follow manner. HarnessIt can be integrated into an application such that whenever you run the application in debug mode, it automatically launches and runs the test cases.

Overall, we found that HarnessIt did not provide much more functionality than csUnit; however, this does not necessarily mean it is not worth the price. What you are paying for is organized, complete help documentation, as well as a knowledgeable support staff to help you get the tool working with your code.


We were able to integrate the Clover.Net tool into Visual Studio without much hassle by following the instructions in the Clover.Net manual. The manual was well laid out and contained all the necessary documentation for working with Clover. The Clover View provided coverage results within Visual Studio, making it simple to find which areas of the code needed further testing. Clover.Net does not include a framework for creating test cases, but works in conjunction with unit testing tools such as those mentioned previously.

As a commercial tool, Clover.Net offers significant additional functionality over the available freeware tools by providing coverage analysis. Additionally, the price tag isn’t much higher than HarnessIt’s. This tool would be ideal for a small to medium development environment that does not have the necessary budget or technical need for a tool on Parasoft’s level.

Conclusion
We have examined here just a few of the available testing tools.

As the functionality of software expands, and the underlying code becomes more complex, the often-neglected area of software testing will be called on more than ever to ensure reliability.

Acknowledgments
Thanks to the students in the Software Testing class at Florida International University, Fall 2004, for their participation in evaluating the tools. The members of the Software Testing Research Group also provided valuable comments that were incorporated in the article.

References
[1] J.D. McGregor and D.A. Sykes. A Practical Guide to Testing Object-Oriented Software. Addison-Wesley, 2001.

[2] R.V. Binder. Testing Object-Oriented Systems. Addison-Wesley, 2000.

[3] B. Bruegge and A.H. Dutoit. Object-Oriented Software Engineering Using UML, Patterns, and Java. Pearson Prentice Hall, 2004.

[4] P.J. Clarke. “A Taxonomy of Classes to Support Integration Testing and the Mapping of Implementation-Based Testing Techniques to Classes.” Ph.D. Thesis, Clemson University, August 2003.

[5] IEEE/ANSI Standards Committee, 1990. Std 610.12.1990.

[6] M.J. Harrold and G. Rothermel. “Performing Data Flow Testing on Classes,” Proceedings of the 2nd ACM SIGSOFT Symposium on Foundations of Software Engineering, pages 154–163. ACM, December 1994.

[7] D. Kung, Y. Lu, N. Venugopalan, P. Hsia, Y.Y. Toyoshima, C. Chen, and J. Gao. “Object State Testing and Fault Analysis for Reliable Software Systems,” Proceedings of the 7th International Symposium on Reliability Engineering, pages 239–242. IEEE, August 1996.

[8] BCS SIGIST. “Glossary of Terms Used in Software Testing,” http://www.testingstandards.co.uk/Gloss6_2.htm, June 2004.

[9] E. Gamma and K. Beck. JUnit. http://www.junit.org/, June 2004.

[10] Parasoft. Automating and Improving Java Unit Testing: Using Jtest with JUnit. http://www.ddj.com/documents/s=1566/ddj1027024311022/parasoft.pdf.

[11] Parasoft. Jtest. http://www.parasoft.com/, July 2004.

[12] International Software Automation. Panorama for Java. http://www.softwareautomation.com/java/index.htm, July 2004.

[13] Manfred Lange. csUnit. http://www.csunit.org/, August 2004.

[14] United Binary, LLC. HarnessIt. http://www.unittesting.com/, August 2004.

[15] Cenqua. Clover.Net. http://www.cenqua.com/clover.net/, August 2004.

[16] I. Sommerville. Software Engineering. Addison-Wesley, 2004.

DDJ


Table 4: Testing tool requirements.

1. Programming Language(s)
   JUnit: Java
   Jtest: Java (Parasoft does offer similar products for C/C++ and .NET)
   Panorama for Java: Java (ISA does offer equivalents for C/C++ and VB)
   csUnit: Any .NET CLR language
   HarnessIt: Any .NET CLR language
   Clover.Net: C# (Cenqua provides a similar product for Java)

2. Licensing
   JUnit: Open Software
   Jtest: See web site
   Panorama for Java: See web site
   csUnit: Open Software
   HarnessIt: See web site
   Clover.Net: See web site

3. Type of Environment
   JUnit: Integrates into existing Java development environment
   Jtest: Command prompt, GUI, or could be integrated into certain development environments
   Panorama for Java: GUI
   csUnit: GUI
   HarnessIt: GUI
   Clover.Net: GUI or command prompt

4a. Platforms
   JUnit: Any
   Jtest: Windows 2000, Windows XP, Windows 2003 Server, Solaris, Linux
   Panorama for Java: Windows 95/98/NT
   csUnit: Windows 2000 Pro, Windows 2000 Advanced Server, Windows XP Pro
   HarnessIt: Windows
   Clover.Net: Windows

4b. Software Requirements
   JUnit: Current JDK and JRE
   Jtest: Sun Microsystems JRE 1.3 or higher
   Panorama for Java: Current JDK and JRE
   csUnit: Microsoft .NET Framework 1.0 or 1.1
   HarnessIt: Microsoft .NET Framework 1.0 SP2 or higher
   Clover.Net: Microsoft .NET Framework 1.1, Visual Studio

4c. Hardware Requirements
   JUnit: None given
   Jtest: Intel Pentium III 1.0 GHz or higher recommended; SVGA (800x600 min., 1024x768 rec.); 512 MB RAM min., 1 GB RAM rec.
   Panorama for Java: At least 5 MB free hard drive space; at least 16 MB RAM
   csUnit: None given
   HarnessIt: None given
   Clover.Net: None given


Listing One
//Simple stack implementation - Java
public class Stack{
    protected static final int STACK_EMPTY = -1;
    protected static final int STACK_MAX = 1000;
    Object[] stackelements;
    int topelement = STACK_EMPTY;

    //create empty stack
    Stack(){
        stackelements = new Object[STACK_MAX];
    }
    //create stack with the objects stored in the array from bottom up
    Stack(Object[] o){
        stackelements = new Object[STACK_MAX];
        for(int i=0;i<o.length;i++){
            push(o[i]);
        }
    }
    //create stack as a duplicate of given stack
    Stack(Stack s){
        stackelements = s.stackelements;
        topelement = s.topelement;
    }
    //push object onto the stack
    void push(Object o){
        if(!isFull()){
            stackelements[++topelement] = o;
        }
    }
    //pop element off the stack
    Object pop(){
        if(isEmpty())
            return null;
        else
            return stackelements[topelement--];
    }
    //return true if stack is empty
    boolean isEmpty(){
        if(topelement == STACK_EMPTY)
            return true;
        else
            return false;
    }
    //return true if stack is full
    boolean isFull(){
        if(topelement == STACK_MAX-1)
            return true;
        else
            return false;
    }
}

Listing Two
//Simple stack implementation - C#
public class Stack{
    public const int STACK_EMPTY = -1;
    public const int STACK_MAX = 100;
    private Object[] stackelements;
    private int topelement = STACK_EMPTY;

    //create empty stack
    public Stack(){
        stackelements = new Object[STACK_MAX];
    }
    //create stack with the objects stored in the array from bottom up
    public Stack(Object[] o){
        stackelements = new Object[STACK_MAX];
        topelement = o.Length-1;
        for(int i=0;i<o.Length;i++){
            stackelements[i] = o[i];
        }
    }
    //create stack as a duplicate of given stack
    public Stack(Stack s){
        stackelements = s.stackelements;
        topelement = s.topelement;
    }
    //push object onto the stack
    public void push(Object o){
        if(!isFull()){
            stackelements[++topelement] = o;
        }
    }
    //pop element off the stack
    public Object pop(){
        if(isEmpty())
            return null;
        else
            return stackelements[topelement--];
    }
    //return true if stack is empty
    private bool isEmpty(){
        if(topelement == STACK_EMPTY)
            return true;
        else
            return false;
    }
    //return true if stack is full
    private bool isFull(){
        if(topelement == STACK_MAX-1)
            return true;
        else
            return false;
    }
}

Listing Three
public void testSimplePop(){
    String strA = "a";
    String[] st = new String[1];
    st[0] = strA;
    Stack stTested = new Stack(st);
    Stack stExpected = new Stack();
    Assert.assertTrue(strA.equals(stTested.pop()));
    Assert.assertTrue(stExpected.equals(stTested));
}

Listing Four
/**
 * Test for method: push(Object)
 * @see Stack#push(Object)
 * @author Jtest
 */
public void testPush2() {
    Object t0 = new Object();
    Object[] var0 = new Object[] {};
    Stack THIS = new Stack(var0);
    // jtest_tested_method
    THIS.push(t0);
    boolean var1 = THIS.equals((Stack) null);
    assertEquals(false, var1); // jtest_unverified
}

Listing Five
[Test]
public void testPopEmpty() {
    Stack st = new Stack();
    Assert.True(st.pop()==null);
}

Listing Six
[TestMethod]
public void testPushFull(TestMethodRecord tmr){
    String strA = "a";
    Stack s1 = new Stack();
    for(int i=0;i<Stack.STACK_MAX;i++){
        s1.push(strA + Convert.ToString(i));
    }
    Stack s2 = new Stack(s1);
    s1.push(strA);
    tmr.RunTest(s1.equals(s2), "Testing push to a full Stack");
}

DDJ


Error messages are the most important pieces of information users get when encountering application failures. Good error messages identify the root cause of problems and help users correct them. Bad error messages cause confusion. As an enterprise application developer who frequently participates in software troubleshooting, I see many error messages—and most of them are bad. They are either too generic, ambiguous, misleading, or just plain wrong. Bad error messages cause us to spend long hours investigating problems, which could’ve been corrected in a matter of minutes had they only been accompanied by meaningful error messages.

In this article, I explain how to build good error messages and present examples showing how to report errors in C/C++ and C#. In the first part of the article, I outline the guidelines covering the most important, but frequently neglected, aspects related to error reporting. The second part describes code samples.

Errors & Error Messages

The right approach to error handling begins with application design. Before writing code, the project team must define a common error-handling methodology, which makes it easier for each developer to process errors. This methodology must include coding standards specifying how errors should be treated (generated, captured, displayed, and stored) and a common library (API) that takes care of the low-level error-handling minutiae. Having a defined standard and available API not only improves developer productivity, but gives us more time to think about the content (meaning) of error messages instead of the implementation details.

Building meaningful error messages is just one aspect of good error handling. Error processing also involves tracing, logging, debugging, and more. Whether you are building Windows GUI applications or web sites, make sure you display error messages in a consistent manner. For Windows GUI apps, use a standard dialog box, which lets users view error details and suggests ways to correct the problem if this information is available.

Displaying error messages in web apps presents more challenges. Unless you have a better approach, show messages at the top of the page or in a JavaScript pop-up dialog. If you include error messages at the top of the web pages, always display them in the same location (relative to page layout) and clearly identify messages as errors. To minimize work, implement the error display area as a common control responsible for rendering error information.

If you choose the JavaScript option, you can follow the example in Listing One. When the server-side code detects an error condition in this example, it calls a shared function, which renders client-side script displaying an alert box containing error information (notice how the code handles quotes and other special JavaScript symbols). You can test this example using the ASPX project (available electronically; see "Resource Center," page 3).

Applications that run without user interfaces (such as server-side applications) normally save error messages in error logs. Examples of error logs can be text files, Windows Event Logs, and databases. Some client-side programs can also use error logs for storing extended error information, such as exception stacks or error traces.

Because a large number of entries in Windows Event Log can make it hard to navigate, server-side programs should use Event Logs only for catastrophic errors and store messages in log files or databases. When storing error information in database tables, make sure that you use space efficiently. For example, when saving an error message returned by a database provider (such as OLE DB) in a 255-character column, store the error description first and details (name of the provider, ANSI code, and so on) last; otherwise, there may not be enough space to hold the description.

When writing error messages, think of the users who read them. Error details that make sense to one group of users can totally confuse another. The users reading your error messages include end users, help desk personnel, and application developers. End users may not know much about the application functionality and technology in general. Help desk representatives (and by "help desk" I mean support organizations that do not have access to the source code) usually know more about software than end users, but not as much as the application developers. Application developers know everything (at least, they think they do).

Error messages displayed to end users must be nontechnical and must not contain information that only makes sense to developers (such as exception call stack, error numbers, names of source files, and line numbers). Nor should details of server-side

Dissecting Error Messages

Vigorous error handling and good error messages are the key

ALEK DAVIS

Alek is a senior application developer at Intel. He can be contacted at [email protected].




errors (say, failure of the application to retrieve data from a database) be passed to end users. For one thing, end users will not be able to correct server-side problems. Additionally, server-side error details can reveal information about the back end, thereby imposing a security risk. On the other hand, error messages displayed to end users must contain enough information to help correct the problems. For example, if an error occurs because users do not have the write permission to a local directory, the name of the directory and required permission must be included in the error message.

Most error messages read by support teams (both help desk and developers) typically come as a result of server-side errors. A server-side error message must include all details that help identify the root cause of the problem. Be careful with code-specific error details. Although application developers like to include exception stack, line numbers, and other code-specific information, these details are generally not helpful to most troubleshooters because they only make sense to programmers who have the source code in front of them. Because code-specific details can make it harder for nondevelopers to understand the meaning of an error message, I don't recommend mixing them with error descriptions. Either save these details in a separate file, or separate them from the human-readable error description.

What Are Good Error Messages?

Good error messages must identify the failed operation, describe the execution context, and specify the reason for failure. If the error reflects a common or anticipated problem, the message can also tell users how to fix it.

Consider an application that creates user accounts for a group of customers, where customer data are pulled from an external system. Account creation is a two-step process—creation of a user profile and a mailbox. Imagine that the application fails to create the mailbox for one user.

To identify the failed operation, an application must name the top-level task it has not been able to complete. Assuming that the problem does not cause failure for the whole batch, the top-level operation is user-account creation. The message must say something like: "Cannot create an account for user X," where X uniquely identifies the user. Identifying the user can help the support team to quickly pull customer data.

After describing the failed operation, the error message must specify the execution path (context) that caused the error. The execution context lists all logical operations that lead to the failure (excluding the topmost operation, which has already been mentioned). You can think of the execution path as the exception stack without code-specific details. In this example, the execution context can be: "Failed to create mailbox for Y," where Y specifies the customer's e-mail address. If the mailbox creation contains substeps, the description of all failed substeps on the execution stack must be appended. The description of each step in the execution context must be unique.

Finally, the reason causing the last step in the execution context to fail must be specified. This information is typically returned by the failed API. For example, it can be retrieved via GetLastError and FormatMessage, or obtained from the exception details. The complete error message must read: "Cannot create an account for user X. Failed to create mailbox for Y. Mailbox already exists." Notice that all parts of the error message are written in complete sentences separated by periods, making it easier for the user to follow the steps that caused the failure.

To build a good error message, you need to provide sufficient details, but still avoid duplicate information. Achieving this goal is not easy because errors normally occur deep in the call stack, so the type of information that must be returned by error-handling code may not be obvious. The challenge is to return sufficient information from every error handler on the stack.

For example, look at the pseudocode account creation example in Listing Two. If you are writing a function CreateMailbox, which message should the error handler of the CreateMailbox method pass to the caller? In particular, should it include "Failed to create mailbox for Y" or should this message be generated somewhere else? Keep in mind that a developer responsible for the implementation of the CreateMailbox method may not know who will call it.

If you face a similar dilemma, follow this rule: When returning an error message from a function, do not describe the main operation performed by this function. Instead, identify the step on which this function fails and include the error message returned by this step. If function A calls function B, which calls function C, and function C causes an error in step X, error messages returned by each of these functions should be similar to those in Table 1. Following this rule, the functions in the account creation example would return the error displayed in modified pseudocode in Listing Three.

Although I recommend not including code-specific details, sometimes they may be needed. If a problem is escalated to the development team, a developer may want to know such error details as the exception stack, line number, file name, and so on. To accomplish this, you can make the format of error messages customizable and adjust it at runtime.

Implementing Error Messages in C/C++

Compared to .NET languages, C/C++ does not offer rich error-handling facilities. You



Message returned from C:    Step X failed: Error reason.
Message returned from B:    Cannot do C. Step X failed: Error reason.
Message returned from A:    Cannot do B. Cannot do C. Step X failed: Error reason.
Message returned from main: Cannot do A. Cannot do B. Cannot do C. Step X failed: Error reason.

Table 1: Sample returned error messages.

BuildMessage* Creates a formatted message in a dynamically allocated buffer (on the heap).

DeleteMessage* Deletes memory allocated for a character string on the heap after verifying that it is not NULL.

TryDeleteMessage* Same as DeleteMessage, except executes within a try…catch block.

GetWin32ErrorMessage Generates an error description for a specific Windows error code or a current system error.

GetComErrorMessage Generates an error description for the specified COM error code (HRESULT) and the information retrieved from the COM Error object (IErrorInfo interface).

SetComError Sets COM error information for the current thread. If there is a pending COM error available via the COM Error object (IErrorInfo interface), this function will also include its description in the new error.

DebugLog Logs a formatted debug message to a file.

Table 2: Library functions related to error processing. *ANSI and Unicode versions of these functions are also available.


can pass error messages as function parameters or return them as exceptions using structured exception handling (SEH) or C++ exception handling. You can use GetLastError in conjunction with FormatMessage to get the description of the last failed system call. When making COM calls, you can obtain error information from the COM Error object or other COM sources, such as the OLE DB Error object.

The hard part about handling errors in C/C++ is memory management. To avoid dealing with memory allocation issues, you can use string classes available in MFC or STL, but this may not always be an optimal approach. In the following example, I show how to build messages in dynamically allocated memory buffers using the basic Win32 APIs (not relying on MFC or STL). I also explain how to process system and COM errors and pass error messages between functions.

The C/C++ sample project (available electronically) builds a static library that implements several methods that can be used to process error messages. To link this library to your project, add it to the project's library settings, making sure that you use the right release target (ANSI or Unicode). In the source code, include a reference to the common.h file, which is located in the Include folder of the project. It defines function prototypes and several helpful macros. Table 2 lists the library's functions related to error processing.

The BuildMessage function lets you create formatted messages of arbitrary sizes. It works similar to sprintf, but BuildMessage writes data to dynamically allocated memory (on the heap). The function takes three parameters: the address of a pointer to a memory buffer, which holds the formatted message; the message format; and optional message arguments. There are three versions of BuildMessage that work on TCHAR, ANSI, and Unicode strings.

BuildMessage dynamically allocates memory if needed. To free memory allocated by BuildMessage, you can call the corresponding version of the DeleteMessage (or TryDeleteMessage) function. This is how BuildMessage can be used:

TCHAR* pszErrMsg = NULL;
for (int i=0; i<5; i++)
{
    BuildMessage(&pszErrMsg, _T("#%d: Error in file %s..."), i, __FILENAME__);
    _putts(pszErrMsg);
}
TryDeleteMessage(&pszErrMsg);

When using BuildMessage, follow three simple rules:

• Never pass the address of a static character array (stack variable) as the first parameter. BuildMessage assumes that the first parameter references a heap variable, so it uses the _msize function to check the amount of memory allocated for it and calls realloc to allocate additional bytes if the buffer is too small. (If the memory buffer has already been allocated and is sufficient to hold the formatted message, BuildMessage reuses it.)

• Before passing a nonallocated memory buffer, always set the value of the pointer (not the address) to NULL; otherwise, the function causes a memory-access violation.

• Always free memory allocated by BuildMessage when you no longer need it.

Listing Four is pseudocode illustrating how to use BuildMessage to concatenate error information passed between function calls. I find it helpful to follow the convention of always using the first function parameter to pass error messages. If you prefer to pass error information via exceptions, you can still use BuildMessage to format message strings, just don't forget to free memory.

Processing Error Information in C/C++

Now that you have a method to easily format error messages, you can generate or retrieve error information via the GetWin32ErrorMessage, GetComErrorMessage, and SetComError functions. The first two methods can be used to retrieve error information from a system or COM call. SetComError provides a capability to return COM errors to the COM client via the COM Error object.

GetWin32ErrorMessage returns the formatted description of the specified Windows error code. If the description is not found, it generates a generic message that includes the provided error number. The function takes one required and two optional parameters. The first parameter is used to hold the generated error message (similar to BuildMessage). The second parameter is used to pass the error number. If the error number is not provided or set to zero, GetWin32ErrorMessage calls GetLastError to retrieve the system error code. The third parameter indicates whether the error number should be included in the error message.

GetComErrorMessage is similar to GetWin32ErrorMessage, except in addition to retrieving the description of the HRESULT error value passed as a second (optional) parameter, it also attempts to obtain information from the COM Error object (IErrorInfo interface). If the HRESULT value indicates success, it is ignored and the function only gets the information from the COM Error object. You must always free the memory buffer allocated by GetWin32ErrorMessage and GetComErrorMessage.


ApplicationExceptionInfo  Custom error message formatter for the System.ApplicationException class.
COMExceptionInfo  Custom error message formatter for the System.Runtime.InteropServices.COMException class.
ExceptionFactory  Generates a formatted error message and uses it as a description of an exception created via one of the three overloaded System.Exception constructors.
ExceptionInfo  Custom error message formatter for System.Exception and all exception classes that do not have explicitly defined custom error message formatters. This class also serves as a base class for other custom error message formatters.
ExternalExceptionInfo  Custom error message formatter for the System.Runtime.InteropServices.ExternalException class.
FormatFlags  Defines enumeration flags for exception details, which will be included in the error message.
Helper  Implements shared helper methods.
HttpExceptionInfo  Custom error message formatter for the System.Web.HttpException class.
IMessageFormatter  An interface defining the methods responsible for formatting error messages, which must be implemented by custom error message formatters.
MessageFormatter  Implements helper functions handling message formatting.
OdbcExceptionInfo  Custom error message formatter for the System.Data.Odbc.OdbcException class.
OracleExceptionInfo  Custom error message formatter for the System.Data.OracleClient.OracleException class.
SqlExceptionInfo  Custom error message formatter for the System.Data.SqlClient.SqlException class.
SystemExceptionInfo  Custom error message formatter for the System.SystemException class.
WebExceptionInfo  Custom error message formatter for the System.Net.WebException class.
Win32ExceptionInfo  Custom error message formatter for the System.ComponentModel.Win32Exception class.

Table 3: Library classes.


COM methods typically return error information via the COM Error object. SetComError lets you do it easily and also offers some extras. If you are writing a COM object, which encounters an error when making an internal COM call (such as when calling another COM object), you may want to combine your own error message with the error information retrieved from the failed COM call and pass the result to the client via the COM Error object. SetComError does just that. It first calls GetComErrorMessage to retrieve error information from the specified HRESULT code and the global COM Error object. Then it appends the returned error information—if any—to the message passed via the first parameter and sets this error message as the description field of the ICreateErrorInfo interface generated for the current thread. The last two optional parameters can be used to specify the class and interface IDs (CLSID and IID) of your COM object. When using GetComErrorMessage or SetComError, you must define the _WIN32_DCOM preprocessor directive in the project settings.

In addition to the three methods just mentioned, you can use the DebugLog method to print formatted debug messages to a log file. This function can be handy if you cannot step through the program's source code in the Visual Studio IDE. Source code comments describe how DebugLog works.

Implementing Error Messages in C#

The .NET Framework offers a much better error-handling methodology that is based on .NET exceptions. Although the concept of exception is not new—you can use exceptions in C/C++—.NET exceptions have a useful feature: They can be nested. This makes it possible to easily pass and retrieve error information. Reflection is another helpful .NET feature, which makes it possible to programmatically retrieve the context of an exception at runtime. And of course, using .NET you are free from memory-management hassles.

The C# sample (available electronically) builds a .NET class library that simplifies error reporting. Using this library, you can:

• Format error messages using error details retrieved from exception objects and inner exceptions.

• Get extended error information from complex exception classes, such as SqlException, WebException, and others.

• Customize the format of error messages to include or exclude such exception details as source, type, error code, method, and others.

• Extend the library to customize message formatting for your own exception classes.

Table 3 lists some of the library's classes, and Figure 1 is the class diagram. The C# project comes with an HTML help document.

ExceptionInfo is the primary class responsible for retrieving error information from exceptions. This is how you would use ExceptionInfo to retrieve error information from a SqlException object, including all inner exceptions and error details provided in the collection of SqlError objects:

try
{
    ...
}
catch (SqlException ex)
{
    Console.WriteLine(ExceptionInfo.GetMessages(ex, (int)FormatFlags.Detailed));
}

The GetMessages method uses the FormatFlags enumerator, which identifies the error details you want to retrieve. FormatFlags currently supports 14 error detail options and a number of flag combinations (see Table 4). If you want to retrieve more than one error detail, you can combine multiple format flags in a bitmask or use a predefined mask. You do not have to specify format flags on every call, but can instead set them only once via the SetFormat method.

SetFormat is implemented in the abstract MessageFormatter class, which is a parent of ExceptionInfo. I used the MessageFormatter class to separate message formatting logic from exception processing handled by ExceptionInfo. MessageFormatter includes a number of methods, which make it easier to build error messages from exception details. MessageFormatter implements the IMessageFormatter interface.

The IMessageFormatter interface defines two FormatMessage methods: One uses explicitly specified message format flags, the other uses the default setting. The FormatMessage method that uses the default format flags is already implemented in the


Enumeration Value  Description

Default  Include exception details identified by the default setting.
Detailed  Include all available exception details.
ErrorCollection  If an exception contains a collection of errors, such as System.Data.SqlClient.SqlException.Errors, include the information about each error in the collection. When this flag is set, all flags specified in the format bitmask will be applied to the error message describing each error in the collection.
Message  Include System.Exception.Message.
Source  Include System.Exception.Source. For System.Data.OleDb.OleDbException, this shows the name of the OLE DB provider; for System.Data.Odbc.OdbcException, it displays the name of the ODBC driver.
Method  Include name of the failed method.
Type  Include name of the exception type (without the namespace), such as "SqlException."
ErrorCode  Include exception-specific error code, such as System.Data.OleDb.OleDbException.ErrorCode, System.Data.OracleClient.OracleException.Code, System.Runtime.InteropServices.ExternalException.ErrorCode, and System.Net.WebException.Status.
Severity  Include error severity, such as System.Data.SqlClient.SqlException.Class.
State  Include database state, such as System.Data.SqlClient.SqlException.State.
MessageNumber  Include message number, such as System.Data.SqlClient.SqlException.Number.
Procedure  Include name of the failed stored procedure, such as System.Data.SqlClient.SqlException.Procedure.
Server  Include name of the server executing the failed call, such as System.Data.SqlClient.SqlException.Server.
NativeError  Include native error code, such as System.Data.OleDb.OleDbError.NativeError or System.Data.Odbc.OdbcError.NativeError.
AnsiCode  Include ANSI code, such as System.Data.OleDb.OleDbError.SQLState or System.Data.Odbc.OdbcError.SQLState.
StatusCode  Include status code, such as System.Web.HttpResponse.StatusCode.
Brief  Same as Message | ErrorCollection.
Details  Same as Source | Method | Type | ErrorCode.
DatabaseDetails  Same as Severity | State | MessageNumber | Procedure | LineNumber | Server | NativeError | AnsiCode.
HttpDetails  Same as StatusCode.

Table 4: FormatFlags error options.


MessageFormatter class; the other version of FormatMessage is an abstract method, which is implemented in ExceptionInfo and overridden in SqlExceptionInfo, OdbcExceptionInfo, and other exception-specific formatter classes.

The library provides two utility classes:

• The Helper class is a general-purpose wrapper for frequently used operations.

• ExceptionFactory can be used to throw any type of exception with a message that can be built using a message format string and optional message parameters (basically, it combines String.Format with throwing an exception).

The ExceptionInfo class serves as a custom formatter for the base Exception class. In addition, it handles the formatting of error messages for exception classes, which do not have custom formatters. By custom formatters, I mean helper classes such as SqlExceptionInfo, which know how to retrieve error details and format error messages for specific types of exceptions, such as SqlException.

ExceptionInfo can also be used to retrieve error information from any exception, including the ones that have custom formatters. It dynamically detects whether the exception class has a custom formatter; if so, it uses it to format the error message. If the exception class does not have a custom formatter, ExceptionInfo checks its parent and grandparents until it finds the one that has a custom formatter.

For example, say that ParentException is derived from ApplicationException, ChildException is derived from ParentException, and ParentExceptionInfo implements a custom formatter for ParentException (Figure 2). If you call ExceptionInfo.GetMessage(new ChildException()), ExceptionInfo uses the FormatMessage method implemented by ParentExceptionInfo to generate the error message.

How does ExceptionInfo find the right custom formatter for a given exception type? It uses a naming convention along with the .NET Framework feature that lets code invoke class instances whose types are determined at runtime. The dynamic invocation is accomplished via the Activator.CreateInstance method call that takes the name of the class as a parameter. ExceptionInfo assumes that the custom formatter is named after the exception class with the postfix "Info," such as SqlExceptionInfo for SqlException. The custom formatter class must belong to the same namespace as the ExceptionInfo class. To see how ExceptionInfo invokes the custom formatter, see the ExceptionInfo.GetFormatter method.

ExceptionInfo provides a number of helpful methods. In addition to GetMessage, which retrieves the error message for the current exception, it implements several overloaded GetMessages and GetMessageArray methods. Both GetMessages and GetMessageArray functions can retrieve error information, not only from the immediate exception, but also from inner exceptions. You can specify from which level of inner exceptions you want to start and how many levels the method must process. GetMessages returns error messages in a single string (messages are formatted as sentences separated by a single whitespace character with all unnecessary whitespace removed). GetMessageArray returns error messages in a string array. GetMessages and GetMessageArray are handy if you want to display only parts of error information. For example, you can use them to retrieve the information about inner exceptions in the error details field of the error message box.

Finally, ExceptionInfo implements the GetStackTraces and GetStackTraceArray methods, which work like GetMessages and GetMessageArray, only these methods retrieve exception stack information from the current and inner exceptions.

Custom Exception Formatters

The library provides a number of custom formatters for exception classes that contain more information than a basic exception. For example, exceptions returned by data providers (such as SqlException) can include such details as the name of the failed stored procedure, error severity, ANSI code, and others. A custom formatter knows how to retrieve the requested details from the exception object and display these details in the formatted error message. This is accomplished by overriding the FormatMessage method.

To understand how to implement a custom formatter for an exception class not covered by the library, consider one of the custom formatters, SqlExceptionInfo.

SqlExceptionInfo is a custom formatter for SqlException. It is derived from ExceptionInfo and defined under the same namespace. SqlExceptionInfo overrides one function: FormatMessage. This function receives two parameters: the exception object and the message format flags (bitmask).

To make it easier to access exception-specific members, FormatMessage typecasts the exception object from Exception to


Figure 1: Class diagram. [The diagram shows the IMessageFormatter interface and the FormatFlags type; the MessageFormatter class, which implements IMessageFormatter; ExceptionInfo, derived from MessageFormatter; the utility classes ExceptionFactory and Helper; and the custom formatters ApplicationExceptionInfo, ExternalExceptionInfo, Win32ExceptionInfo, COMExceptionInfo, SystemExceptionInfo, OracleExceptionInfo, OleDbExceptionInfo, SqlExceptionInfo, OdbcExceptionInfo, HttpExceptionInfo, and WebExceptionInfo.]


Listing One
//-------------------------------------------------------------------
// $Workfile: BasePage.aspx.cs $
// Description: Implements the BasePage class.
// $Log: $
//-------------------------------------------------------------------
using System;

namespace ErrorSample
{
/// <summary>
/// Base class implementing common utility functions reused by
/// different pages belonging to this Web application.
/// </summary>
/// <remarks>
/// Page classes on this site must derive from
/// <see cref="ErrorSample.BasePage"/>, not the usual
/// <see cref="System.Web.UI.Page"/>.
/// </remarks>
public class BasePage: System.Web.UI.Page
{
    // We need to keep the counters of the popup script blocks.
    // (Actually, we can distinguish between client-side and
    // start-up scripts, but why bother?)
    private int _errorScriptCount = 0;
    private string _errorScriptNameFormat = "_errorScript{0}";

    /// <summary>
    /// Formats string and replaces characters, which can break
    /// JavaScript, with their HTML codes.
    /// </summary>
    /// <param name="message">
    /// Message or message format.
    /// </param>
    /// <param name="args">
    /// Optional message parameters.
    /// </param>
    /// <returns>
    /// Formatted string which is JavaScript safe.
    /// </returns>
    private static string FormatJavaScriptMessage(
        string message,
        params object[] args
    )
    {
        // Make sure we have a valid error message.
        if (message == null)
            return String.Empty;

        // If we have message parameters, build a formatted string.
        if (args != null && args.Length > 0)
            message = String.Format(message, args).Trim();
        else
            message = message.Trim();

        // Make sure we have a valid error message.
        if (message.Length == 0)
            return String.Empty;

        // Back slashes, quotes (both single and double),
        // carriage returns, line feeds, and tabs must be escaped.
        return message.Replace(
            "\\", "\\\\").Replace(
            "'", "\\'").Replace(
            "\"", "\\\"").Replace(
            "\r", "\\r").Replace(
            "\n", "\\n").Replace(
            "\t", "\\t");
    }

    /// <summary>
    /// Shows a formatted error message in a client-side (JavaScript)
    /// popup dialog.
    /// </summary>
    /// <param name="message">
    /// Error message or message format.
    /// </param>
    /// <param name="args">
    /// Optional message arguments.
    /// </param>
    /// <remarks>
    /// Error popup will be rendered as the first element of the
    /// page (form).
    /// </remarks>
    public void ShowErrorPopup(
        string message, params object[] args
    )
    {
        ShowErrorPopup(true, message, args);
    }

    /// <summary>
    /// Shows a formatted error message in a client-side (JavaScript)
    /// popup dialog.
    /// </summary>
    /// <param name="showFirst">
    /// Flag indicating whether the error message must be rendered
    /// as the first element of the page (form).
    /// </param>
    /// <param name="message">
    /// Error message or message format.
    /// </param>
    /// <param name="args">
    /// Optional message arguments.

SqlException. Then it verifies the message format flags, and if the value of the bitmask indicates the default (FormatFlags.Default), it sets the bitmask to the value of the static member of the MessageFormatter class via the MessageFormatter.VerifyFormat method. Finally, it processes error information.

Error information available in SqlException can be retrieved from the class members, such as State, LineNumber, Procedure, and others, as well as from the collection of SqlError objects accessed via the items of the Errors member. According to the documentation, the first item in the SqlError object collection shares some of the details with the members of the SqlException object. For example, the value of the SqlException.Class member is the same as SqlException.Errors[0].Class. To avoid duplicate information, SqlExceptionInfo retrieves all information available from the Errors collection. After these errors are processed, it adds the details, which are only accessible via the main class members.

Before adding information about a particular exception detail, SqlExceptionInfo checks whether a corresponding flag in the specified message format is set; if it is not, the detail will not be included. SqlExceptionInfo uses methods inherited from ExceptionInfo (and MessageFormatter) to build formatted messages for exception details. For additional information, see the comments in the source code.

If you implement a custom formatter for an exception class not included in the library, extend the project:

1. Add a class for the custom formatter. Name this class by appending "Info" to the name of the corresponding exception class. It must be derived from IMessageFormatter (or any other class derived from IMessageFormatter, such as ExceptionInfo) and created under the same namespace.

2. Override the FormatMessage method and implement the logic to retrieve exception details and display them in a formatted error message.

If you need to specify additional fields in the message format bitmask, either extend the FormatFlags enumerator or set the unused bits (the format mask is handled as an integer value). I was planning on adding a custom formatter for SoapException, but decided not to because SoapException is very customizable (SOAP exception details can be passed via XML nodes). If your application makes SOAP calls, I suggest implementing your own version of the custom formatter.

Conclusion
Although you cannot totally eliminate errors from software, vigorous error handling and good error messages can help you reduce the impact of these errors on the users. The better job you do reporting errors, the less time your team will spend troubleshooting them.

DDJ

40 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

Figure 2: Custom formatter for ParentException.
(Diagram labels: ApplicationExceptionInfo, ApplicationException, ParentExceptionInfo, ParentException, ChildException)


    /// </param>
    public void ShowErrorPopup
    (
        bool showFirst,
        string message,
        params object[] args
    )
    {
        // Build message string which is safe to display in JavaScript code.
        message = FormatJavaScriptMessage(message, args);

        // If we did not get any message, we should not generate any output.
        if (message.Length == 0)
            return;

        // Generate a unique name of the start-up script.
        string scriptBlockName = String.Format(
            _errorScriptNameFormat, _errorScriptCount++);

        // Generate HTML for the script.
        string scriptHtml = String.Format(
            "{0}" +
            "<SCRIPT Language=\"JavaScript\">{0}" +
            "<!--{0}" +
            "alert(\"{1}\");{0}" +
            "-->{0}" +
            "</SCRIPT>{0}",
            Environment.NewLine, message);

        // Generate script opening a popup with error message.
        if (showFirst)
            RegisterStartupScript(scriptBlockName, scriptHtml);
        else
            RegisterClientScriptBlock(scriptBlockName, scriptHtml);
    }
}
}
//-------------------------------------------------------------------
// $Workfile: Default.aspx.cs $
// Description: Implements the DefaultPage class.
// $Log: $
//-------------------------------------------------------------------
using System;
using System.Collections;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Web;
using System.Web.SessionState;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.HtmlControls;

namespace ErrorSample
{
/// <summary>
/// Implements the default Web page.
/// </summary>
public class DefaultPage: BasePage
{
    protected System.Web.UI.WebControls.Button btnTest;

    private void Page_Load(object sender, System.EventArgs e)
    {
    }

    #region Web Form Designer generated code
    override protected void OnInit(EventArgs e)
    {
        // CODEGEN: This call is required by the ASP.NET Web Form Designer.
        InitializeComponent();
        base.OnInit(e);
    }

    /// <summary>
    /// Required method for Designer support - do not modify
    /// the contents of this method with the code editor.
    /// </summary>
    private void InitializeComponent()
    {
        this.btnTest.Click += new System.EventHandler(this.btnTest_Click);
        this.ID = "Form";
        this.Load += new System.EventHandler(this.Page_Load);
    }
    #endregion

    // Displays error messages when button is clicked.
    private void btnTest_Click(object sender, System.EventArgs e)
    {
        for (int i=0; i<4; i++)
        {
            // Display errors with odd IDs as the first elements of
            // the form and the rest as the last elements of the form.
            bool showFirst = (i & 0x1) == 1;
            ShowErrorPopup(
                showFirst,
                "Error '{0}' occurred at:{1}{1}{2}",
                i, Environment.NewLine, DateTime.Now);
        }
    }
}
}

Listing Two
bool CreateAccounts()
{
    UserListInfo = GetUserListInfo();
    foreach(UserInfo in UserListInfo)
    {
        if (not CreateAccount(UserInfo))
        {
            Report error.
        }
    }
}
bool CreateAccount(UserInfo)
{
    if (not CreateProfile(UserInfo))
    {
        Report error.
        return false;
    }
    if (not CreateMailbox(UserInfo))
    {
        Report error.
        return false;
    }
    return true;
}
bool CreateProfile(UserInfo)
{
    Create profile.
}
bool CreateMailbox(UserInfo)
{
    Create mailbox.
}

Listing Three
bool CreateAccounts()
{
    UserListInfo = GetUserListInfo();
    string msg, errMsg;
    foreach(UserInfo in UserListInfo)
    {
        if (not CreateAccount(msg, UserInfo))
        {
            errMsg = "Cannot create account for user " +
                UserInfo.UserID + ". " + msg;
            ReportError(errMsg);
        }
    }
}
bool CreateAccount(errMsg, UserInfo)
{
    string msg;
    if (not CreateProfile(msg, UserInfo))
    {
        errMsg = "Cannot create profile for " +
            UserInfo.ProfileName + ". " + msg;
        return false;
    }
    if (not CreateMailbox(msg, UserInfo))
    {
        errMsg = "Cannot create mailbox for " +
            UserInfo.MailboxName + ". " + msg;
        return false;
    }
    ...
    return true;
}
bool CreateProfile(errMsg, UserInfo)
{
    if (profile exists)
    {
        errMsg = "Profile already exists.";
        return false;
    }
    Create profile.
    ...
    return true;
}
bool CreateMailbox(errMsg, UserInfo)
{
    if (mailbox exists)
    {
        errMsg = "Mailbox already exists.";
        return false;
    }
    Create mailbox.
    ...
    return true;
}

Listing Four
bool DoThis(TCHAR** ptszErrMsg, ...)
{
    TCHAR* ptszMsg = NULL;
    if (!DoThat(&ptszMsg, ...))
    {
        BuildMessage(ptszErrMsg, _T("Cannot do that. %s"), ptszMsg);
        TryDeleteMessage(&ptszMsg);
        return false;
    }
    return true;
}
bool DoThat(TCHAR** ptszErrMsg, ...)
{
    TCHAR* ptszMsg = NULL;
    if (!DoTheOther(&ptszMsg, ...))
    {
        BuildMessage(ptszErrMsg, _T("Cannot do the other. %s"), ptszMsg);
        TryDeleteMessage(&ptszMsg);
        return false;
    }
    return true;
}
main()
{
    TCHAR* ptszMsg = NULL;
    TCHAR* ptszErrMsg = NULL;
    if (!DoThis(&ptszMsg, ...))
    {
        BuildMessage(&ptszErrMsg, "Cannot do this. %s", ptszMsg);
        puts(ptszErrMsg);
        TryDeleteMessage(&ptszMsg);
        TryDeleteMessage(&ptszErrMsg);
    }
}

DDJ



It's crucial that the software we write be as close to bulletproof as possible. But production environments are hostile, and it sometimes seems like they were designed to chew up software and spit out a smoking, mutilated mass of worthless bytes. Users often do things we do not expect them to, or worse, that we told them not to. When our software doesn't do something it was never designed to do, and do it perfectly, users cancel units, never to be seen or heard from again.

Nobody ever said programming was easy. Just for the record, I'll say it now: programming is hard. No matter how good you are, how much experience you have, how many books you read and classes you take, you can't escape inevitability. You will write bugs. I do it every day. I often tell my coworkers that if they aren't writing at least two new bugs each day, they probably aren't working hard enough. When I was a rookie C++ programmer, I thought that the key to writing code that was defect free was to know more C++, more techniques, more tricks. I don't think this anymore, and I'm much happier.

Debugging sessions are fine for detecting the major design flaws and little syntactic errors that crop up during development: buffer overruns, sending the wrong kind of message to some server somewhere, and the like. But what happens when bugs are detected by users in production software? Usually, angry users or administrators phone the help desk, but with little information that helps you debug the problem. You typically know what the symptoms were, because these are what told users there was a bug in the first place. But users are usually unreliable sources of objective or accurate information, and they generally cannot tell you what was happening before the bug occurred. Of course, this is the information you really need. So unless you are lucky and just happen to stumble across an obviously stupid piece of code such as this:

QUOTE* pQuote = new QUOTE;
delete pQuote;
pQuote->SetPrice(1.234f);

you will probably spend days looking for the bug. Once you get close enough to reproduce it, fixing the defect is usually comparatively simple, and often limited to around one line of code.

The problem is a lack of information. A bug is, by definition, an unknown quantity. Most language-level features that are designed to help diagnose problems are not intended for use in production software. Things like assert() become useless or worse in release builds. When you get a call about a bug in production software, it takes forever to identify and fix the problem. Most of that time is spent just trying to reproduce the problem. If you could identify the state of the universe more quickly, the effort needed to resolve bugs would go down a lot.

The Production Software Debug (PSD) library I present here is a library of utilities designed to identify and diagnose bugs in production software. There are only three main features in the library, but they pack a wallop. Used liberally in production code, the PSD library (available electronically; see "Resource Center," page 3) has helped to significantly reduce the amount of time it takes to fix bugs. Its three main features are:

• verify(), a better assert().
• static_verify(), a compile-time version of verify().
• OutputMessage(), a generic logging mechanism that is easy to use and extend.

verify(): A Better assert()
There are few C++ language-level features to help identify bugs, and what precious few do exist are not suitable for production software. One language feature that was added early in the language's evolution was the assert() macro. The idea was simple. When a function is executed, you expect the software to be in a sane state. Pointers point to the right thing. Sockets are open. The planets are aligned. assert() makes it possible to check these things easily at runtime, to add precondition and postcondition checks to blocks of code.

But assert() is contrived. If the expression sent to assert() is false, it kills your program. Back in the '70s, when software was written by the people who ran it, maybe this kind of behavior was okay. But today, if a wild pointer results in the application going poof, well, that's just not going to do at all. Pointers shouldn't be wild in the first place, but the main point is that no matter how much code you write to keep your pointers from being wild, it's not going to be enough. Sometimes they will go wild anyway. You must come to terms with this fact. It turns

Debugging Production Software
The PSD library provides a few powerful tools

JOHN DIBLING

John is a programmer from Chicago, Illinois. He can be reached at [email protected].


"A bug is, by definition, an unknown quantity"


out that assert() isn't really useful at all for dealing with wild pointers in code that was written and tested. It's only useful in testing code that's still in development.

There are three major problems with assert() that make it unsuitable for production code. The first one I already mentioned: it rips the bones from the back of your running program if a check fails. Second, it has no return value, so you cannot handle a failed check. Third, it makes no attempt to report the fact that an error occurred. The PSD library's runtime testing utilities address these problems. They are template functions that accept any parameter type that is compatible with operator!, and return bools: true for a successful check, false for a failed check. If the check fails, the test utilities simply return false and do not terminate the program or do anything similarly brutal. Like assert(), in a debug build, a failed verify() will also break into the debugger. But the most significant features of the verify() utilities are the tracing mechanisms.

The PSD library includes tracing utilities, and verify() uses these tracing utilities to dump a rich diagnostic message when a check fails. The message is automatically generated and output to a place where you can get it.
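The combination of a template function and a wrapping macro can be sketched in a few lines. This is a hypothetical simplification, not the PSD library's actual implementation: the real library routes the message through its configurable tracing destinations and can break into the debugger, while this sketch just writes to std::cerr.

```cpp
#include <cassert>
#include <iostream>

// Minimal sketch of a verify()-style check (hypothetical; the real
// PSD library adds tracing destinations and a debug-build breakpoint).
template <typename T>
bool Verify(const T& expr, const char* exprText,
            const char* file, int line)
{
    if (!expr) {
        // Report the failed check instead of terminating, unlike assert().
        std::cerr << "*WARNING* FAILED EXPRESSION: '" << exprText
                  << "' at " << file << ':' << line << '\n';
        return false;   // caller can handle the failure
    }
    return true;
}

// The macro captures the expression text and source location for free,
// which is why callers use verify() rather than Verify() directly.
#define verify(expr) Verify((expr), #expr, __FILE__, __LINE__)
```

Because the macro forwards to an ordinary function, the happy path costs one call and one branch, which is why such a check can stay enabled in release builds.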

The diagnostic message is rich, meaning it includes a lot of detailed information. Actually, there isn't much information to include, but all of it is included. Specifically, the message gives the exact location of the failed check, including source filename and line number, and the expression that failed, including the actual value of the expression. For example, suppose your program is logging a user into a server, and you have a runtime check to assert that the login succeeded:

if (!verify(LOGIN_OK == pServer->Login("jdibling", "password")))
{
    // handle a failed login attempt here
}

If the login did not succeed, this diag-nostic message is generated and logged:

*WARNING* DEBUG ASSERTION FAILED:
Application: 'MyApp.EXE', File: 'login.cpp', Line: 120,
Failed Expression: 'LOGIN_OK == pServer->Login("jdibling","password")'

This diagnostic message is sent to whatever destinations you configure (one or more), and you can configure whatever destinations you like, including your own proprietary logging utility. By default, the PSD library sends all such messages to three places: std::cerr; the MSVC 6.0 debugging window (which is visible in production runs using DbgView.EXE, a freely available utility at http://www.sysinternals.com/); and a cleartext log file. (The name and location of this file is configurable, but by default it is named "log.txt" and saved in the current working directory.) The diagnostic message is extremely helpful in diagnosing problems that occur on customers' machines. It is generally a much simpler matter to acquire a log file from a customer than to try to reproduce the error condition. In addition, it frequently is not enough to know just the failed expression and the location of the failed source code. Usually, you need to know how the universe got into such a state, and the previous output messages that accumulate when the PSD library is used liberally are of extraordinary significance. For example, I usually want to know exactly which version of my software the error occurred in, and I output that information to the log file using the tracing utilities in the PSD library. These two pieces of information taken together are often enough to know just what happened and why.

In the aforementioned code, verify is actually a preprocessor macro for the template function Verify (note the change in case). Generally, I don't like macros, but in this case, the benefits outweighed any detraction. You could call the Verify() template function directly, as it is included in the library interface, but there is little point and I have never seen a reason to do so. Also, if you call verify() (the macro version) and the check fails, the diagnostic message includes the filename and line number of the failed check. This is accomplished through macro black magic. If you call Verify() (the low-level template function) directly, you lose this benefit and are on your own in trying to figure out which Verify() check failed.

There are several other flavors of verify() as well, good for common special cases and for taking more control over its behavior. One flavor is noimpl(), a default handling placeholder for the black holes in your code. The most common example of this is the default handler in a switch statement. In the case where your intent in a switch is to handle every possibility, you often have a default handler to do some default handling when things go wrong. Adding a noimpl() call to these blocks triggers a call to verify(false). Many otherwise very-hard-to-detect bugs are simply flagged using this feature.

Another flavor of verify() is testme(), which is kind of like a bookmark. When writing new blocks of code that you intend to test by stepping through manually, just add a call to testme() at the beginning of the block. I have found that when I'm writing code that I plan to step




through, it is usually in lots of different places and I tend to lose track of them. testme() breaks into the debugger when it is run (just like verify()) and reminds you where to test.

static_verify()
static_verify() is a version of verify() that is "run" at compile time, rather than at runtime. The motivation for this device has existed for many years, but the design for it was derived from one presented in Andrei Alexandrescu's book Modern C++ Design.

static_verify() is especially useful at detecting when some critical implementation details have changed without your realizing it. Relying on the implementation details of some data structure or object is almost always a bad idea. But in the real world, it happens all the time. Older code, newer programmers, and plain bad designs are everywhere, and our job is to get all of this code to work first, and pontificate about how it isn't pristine later.

This code is guaranteed to work so long as the two user ID fields are the same size:

struct USER
{
    char m_cUID[10];
    char m_cPwd[10];
};

struct LOGIN_MESSAGE
{
    char m_cUID[10];
    char m_cPwd[10];
};
: :
static_verify( sizeof(USER::m_cUID) ==
               sizeof(LOGIN_MESSAGE::m_cUID) );
memcpy(user.m_cUID, login.m_cUID, sizeof(user.m_cUID));

Because it is doing a memcpy(), it's going to be fast. It does not matter what the size actually is, and it does not matter what the format of the char buffers is (for example, whether they are null terminated, space padded, or whatever). But if one of the char buffers is changed in size, static_verify() halts the compiler with an error message, and you can adjust your algorithm to work with the new disparity.
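A compile-time check of this kind can be built with template specialization, in the style Modern C++ Design describes. The sketch below is hypothetical (the PSD library's actual implementation is not shown in the article): only the true specialization is defined, so instantiating the false case is a compile error that halts the build at the offending line.

```cpp
// Defined only for 'true'; StaticChecker<false> is an incomplete type,
// so instantiating it stops the compiler with an error.
template <bool> struct StaticChecker;
template <>     struct StaticChecker<true> {};

// Hypothetical static_verify(): 'expr' must be a compile-time constant.
#define static_verify(expr) \
    { StaticChecker<((expr) != 0)> chk; (void)chk; }
```

A usage such as static_verify(sizeof(a) == sizeof(b)) therefore costs nothing at runtime; the check either compiles away entirely or refuses to compile at all.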

OutputMessage()
OutputMessage() and the other tracing utilities make it easy to generate messages that are sent wherever you want. Use OutputMessage() liberally to log the values of variables and parameters, trace the execution path of a function, and so on. Again, the runtime testing utilities also generate calls to OutputMessage().

OutputMessage() works like sprintf(), so it is easy to use, and chances are pretty good that you already know how to use it. There are flavors of OutputMessage() that take additional parameters specifying the destination of the message, options flags, and so on. But the general-purpose OutputMessage() takes just a format string and a variable parameter list, just like sprintf().

OutputMessage() can send messages wherever you want, and it is easy to get it to send a message somewhere new. Simply define a callback function, register it with the PSD library, set a global PSD library option to always send messages there, and you're done. From then on, every time OutputMessage() is called, messages will be sent to your routine. You can define numerous destinations and have messages sent to all, one, or none of them. You can also call OutputMessageEx() to send a specific message to a specific location.
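The fan-out pattern behind this can be sketched as follows. Everything here is hypothetical (the names RegisterSink and MessageSink are inventions for illustration; the PSD library's registration API differs): the point is only that a printf-style formatter plus a list of registered callbacks gives you pluggable destinations.

```cpp
#include <cstdarg>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical sketch of a printf-style logger with pluggable
// destinations; not the PSD library's actual interface.
typedef std::function<void(const char*)> MessageSink;

static std::vector<MessageSink>& Sinks()
{
    static std::vector<MessageSink> sinks;  // registered destinations
    return sinks;
}

void RegisterSink(const MessageSink& sink)
{
    Sinks().push_back(sink);
}

void OutputMessage(const char* fmt, ...)
{
    char buf[1024];
    va_list args;
    va_start(args, fmt);
    vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);
    // Fan the formatted message out to every registered destination.
    for (const MessageSink& sink : Sinks())
        sink(buf);
}
```

A sink can be anything callable: a wrapper around std::cerr, a file appender, or a bridge into a proprietary logging system.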

Conclusion
The PSD library was written in C++ using only standard-compliant features in its interface. It was originally intended for use on Windows platforms and the Microsoft Visual C++ 6.0 compiler, but there are no platform-specific features in the interface. The implementation of these features in many cases does make use of Windows-specific functions and primitives, as you might expect. But on the whole, it should be easily adaptable to other platforms and compilers.

In production code where the PSD library was used extensively, the time needed to diagnose, debug, and fix bugs was reduced drastically. There are two keys to reducing debug time:

• Using verify() to detect errors at runtime. It is possible to simply do a global search and replace in code that currently uses assert, changing all instances of assert to verify. Adding additional calls to verify also helps. The normal case of execution for a verify is testing a true expression, and this common case is executed fast. If the expression is true, verify consists of one function call, an if statement, and a return statement. Because of this, verify is appropriate to use in time-critical code.

• Logging the state of the running program before problems occur. To debug faulty code, in addition to knowing the failed expression, it is important to know the version of the software, the values of internal variables and function parameters, whether pointers are valid, and so on. Using OutputMessage() adds this information to the log and helps reduce debug time.

DDJ




Testing and debugging embedded software systems and hardware designs provides challenges similar to software test and debug. Logic is logic, whether it is implemented in software or silicon. However, although this is certainly not universally true, the stakes can be higher for hardware systems because of several factors. Market forces are driving much of the electronics industry into the profitable consumer arena. Consumer products need to be easy to use and cheap, and time-to-market is as critical today as ever. Finally, chip designs need to be innovative and differentiated from competitors. These goals, which are often at odds with each other, are running into another technological force: the ever-shrinking transistor. Despite rumors of its imminent demise, Moore's Law continues with frenzy and ever smaller geometries: Gate dimensions have marched down the "submicron" path from 0.65 microns, to 0.5 microns, to 0.25 microns, to 0.18 microns, and now 0.13 microns. The process continues. With each step, the factories that print chips have to retool, and that cost is passed along to chip designers. To provide reasonable cost and function, new chip designs are commonly entire "Systems-on-a-Chip," with previously uncoupled subsystems residing on a single chip that might total over 15 million gates. The cost of failure is high. If a chip comes back faulty from the fabricating plant, the average cost for a respin is approaching $1 million. Recent market research indicates more than half of all new chips require at least one respin, and an even higher percentage of final products have major functional flaws.

The Electronic Design Automation (EDA) industry provides tools that hardware and embedded systems designers use to produce new designs. EDA tools have evolved along with customer demands. On the design side, the last 15 years have been marked by the dominance of hardware-design languages, principally Verilog and VHDL, which support modeling at the RTL level of abstraction. The Register Transfer Level (RTL) tracks system changes at the clock cycle. This is turning out to be too computationally intense for the design and verification of Systems-on-a-Chip that contain over 10 million gates. A new level of abstraction, the Transaction Level (or more generally, the Electronic System Level, ESL), is emerging where simulation is viewed as a behavioral flow of data through a system. Granularity shifts from clock bursts to flow events such as data bursts across a bus. The single transfer of an MPEG frame from memory to a speaker can be modeled in one discrete event with an associated cost, rather than being broken down into the individual clock bursts and bit transfers that make up the physical event. This provides faster simulation and a quicker turnaround for design decisions.

This new ESL is not only important at the design level but is increasingly more critical at the verification stage that occurs before a chip design is masked into silicon. Historically, it was possible to exhaustively test a circuit model by trying all possible test vectors against a known matrix of results. This black-box testing has parallels in software testing, and the same issue arises there: time. With large designs, an exhaustive approach is impossible. Techniques familiar to software testers are being modified and translated to the hardware and embedded software arenas, with specific emphasis being placed on the particular problems of massive design spaces. Several ESL technologies have arisen to implement these techniques. I address one of them in this article: SystemC Verification (SCV).

System Verification With SCV

Logic is logic, and testing is testing

GEORGE F. FRAZIER

George works on ESL technologies and SystemC at Cadence Design Systems Inc. He can be reached at [email protected].


"SystemC was developed to standardize an ad hoc collection of C-based ESL technologies"


The SCV library is an open-source class library that works in conjunction with the open-source library SystemC. Both libraries are built on Standard C++. SystemC and SCV are governed by the Open SystemC Initiative (http://www.systemc.org/). SCV is being widely used to verify not only SystemC designs but also designs written in Verilog and VHDL, or a combination of all three. Note the similarity to software testing: Chip designs are written in programming languages that are expressive of hardware constructs, the design is verified at a certain level of abstraction, and then other programs translate (similar to "compile") or synthesize the higher level programmatic representations of system function down to a gate-level design that can be fabricated into a chip.

Verification at the ESL
SystemC was developed to standardize an ad hoc collection of C-based ESL technologies so that EDA vendors and IP designers could exchange models and standardize interfaces for better interoperability between products and product flows. This is important for system design, but especially important for system verification, where the goal is to build a reusable infrastructure of test stimuli and models. For a more detailed discussion of SystemC, see my article "SystemC: Hardware Constructs in C++" (C/C++ Users Journal, January 2005).

As systems become increasingly more complex, it is no longer possible to exhaustively test designs from a black-box perspective. Modern verification techniques are designed to be integrated with the development process and implemented by experts who know most about what should and should not be happening at any stage of the design. SystemC itself lacks some of these methods. However, C++ in general is a convenient language for generating test benches, even if the IP models are written in Verilog or VHDL. With SystemC, ESL verification can be done across the entire lifecycle of a project, and verification blocks can be reused between projects, both for block verification and for examination of design trade-offs. These facts led to the development of a separate set of libraries on top of SystemC for verification. Figure 1 illustrates a transaction-based verification scheme based on SCV. Here, Tests communicate via a Transactor with the model of the system design. Tasks are transaction-level events; that is, above the RTL level. If the design is at the RTL level because, for example, it is written in Verilog or VHDL, the signals between the Transactor and the model are RTL-level signals. A Transactor can be thought of as an adaptor that translates communications between the various levels of abstraction. For example, a Transactor can translate a high-level operation, such as a function call or bus transfer, into its component signals or data bursts that are clock accurate.

Randomization
Because it is not possible to test all combinations of possible inputs (or stimuli) of a design, a subset of stimuli is chosen for system verification. One approach to this is to construct tests by hand. This is helpful for bug tracking (a unit test is introduced to ensure that subsequent changes don't "unfix" the improvement), but for thorough coverage, hand-constructed tests can be both limiting and biased. Randomization lets test vectors be picked randomly, with values ascribed possibly based on certain bounding criteria.

Unconstrained randomization is unbounded: Data values have an equal probability of occurring anywhere in the legal space of the data type. This is similar to using C's rand() function to generate a random integer value between 0 and RAND_MAX.

Weighted randomization weights the probability so that the distribution of data values is not uniform. This is tunable. An example would be setting a 75 percent probability that an integer input will take on a value between 0 and 50 and a 25 percent probability that the input will be between 51 and 100.
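The idea can be illustrated in plain C++ without SCV (the WeightedBag class below is an invention for illustration, not SCV's scv_bag API): enter each value into a bag once per unit of weight, then draw uniformly over the bag, so each value's odds are proportional to its weight.

```cpp
#include <cstdlib>
#include <vector>

// Plain-C++ illustration of the weighted-randomization idea; this is
// a hypothetical helper, not part of the SCV library.
class WeightedBag {
public:
    void add(int value, int weight) {
        // One bag entry per unit of weight.
        for (int i = 0; i < weight; ++i)
            entries_.push_back(value);
    }
    int draw() const {
        // Uniform over entries, hence proportional to weight.
        return entries_[std::rand() % entries_.size()];
    }
private:
    std::vector<int> entries_;
};
```

With add(LOW_RANGE, 75) and add(HIGH_RANGE, 25), roughly three of every four draws land in the favored range, matching the 75/25 split described above.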

A more sophisticated randomization is constraint-based randomization, where an input is constrained to a range of its legal values by a set of rules specified by constraint expressions. The constraint expressions can be simple ranges or complex expressions that may include other variables under constraint, subexpressions, and so on.

The first class of importance in SCV randomization is the scv_smart_ptr<T> class, a container that provides multithreaded memory management and access, and automatic garbage collection, for both built-in C++ types and SystemC types such as sc_int, sc_uint, and the like. Smart pointers are essential in SCV because SystemC is a multithreaded library where memory might be accessed by more than one thread or even more than one process. The get_instance() method of scv_smart_ptr<T> provides direct access to the underlying data object (this should be used with caution). Because it is problematic to apply the core algorithms of randomization to data structures that can dynamically change in size, SCV randomization is limited to data types with a fixed size (this excludes lists, trees, and so on). However, users can create smart pointers for all extended types, including user-defined structs. Finally, SCV provides many general-purpose methods for manipulating smart pointers, such as copying a smart pointer (including "deep" copy).

Listing One is an example of using weighted randomization in SCV. A system is simulated where operations exist to RECEIVE, STORE, DECODE, MANIPULATE, ENCODE, and SEND a JPEG object. The operations are named in an enum that is used as a smart pointer. Weighted randomization is implemented in SCV with the class scv_bag, which creates distributions, and the smart pointer method set_mode(), which assigns a distribution to an input. I assign weights to each operation with the add() method of scv_bag (for convenience, a total of 100 is used, so it is obvious that, for example, there is a 40 percent probability in any event cycle of variable jpg taking on the value of MANIPULATE). For clarity, the example omits the extensions definition that would be required to do randomization on the user's enum. Weighted randomization is most useful when an input has a known set of legal values and it is desired to favor certain ranges differently than others.

The richness of randomization in SCV is achieved through the constraint classes, which are containers for randomization objects that have complex constraints. Constructing a class that contains both a data type and the constraints on that type separates the two and allows the use of C++ class inheritance. While a detailed treatment of constraint-based randomization in SCV and the mechanisms of an efficient constraint solver is beyond the scope of this article, I can share one small example.

Constraint classes derive from scv_constraint_base and must contain at least one scv_smart_ptr (each such smart pointer must be instantiated on a simple object; no nested smart pointers or hierarchy is allowed). Listing Two shows the creation and use of a constraint class. Here, a simple bounded integer is represented by three constraints. Note that a call to scv_constraint_base::next( ) generates instances of the constrained object. This call randomizes all smart pointers in the constraint class.

Transaction Monitoring and Recording
One of the basic debugging and analysis functions of verification is to record

http://www.ddj.com Dr. Dobb’s Journal, June 2005 49

Figure 1: A transaction-based verification scheme based on SCV. (Diagram: Test, Transactor, and Model blocks, connected by tasks and signals.)


Listing One
enum PROCESS_JPEG_EVENTS {RECEIVE, STORE, DECODE, MANIPULATE, ENCODE, SEND};

int jpeg_stream()
{
    scv_smart_ptr<PROCESS_JPEG_EVENTS> jpg;
    scv_bag<PROCESS_JPEG_EVENTS> jpg_dist;
    jpg_dist.add(RECEIVE, 10);
    jpg_dist.add(STORE, 10);
    jpg_dist.add(DECODE, 20);
    jpg_dist.add(MANIPULATE, 40);
    jpg_dist.add(ENCODE, 10);
    jpg_dist.add(SEND, 10);

    jpg->set_mode(jpg_dist);

    while (1) {
        jpg->next();
        switch (jpg->read()) {
            case RECEIVE:    jpg_receive();    break;
            case STORE:      jpg_store();      break;
            case DECODE:     jpg_decode();     break;
            case MANIPULATE: jpg_manipulate(); break;
            case ENCODE:     jpg_encode();     break;
            case SEND:       jpg_send();       break;
            default: return 1;
        }
    }
    return 1; // never reached
}

Listing Two
// An SCV constraint class with 3 constraints
class boundary_constraint_class : public scv_constraint_base {
public:
    scv_smart_ptr<sc_uint> burst;
    scv_smart_ptr<uint> lower;
    scv_smart_ptr<uint> upper;

    SCV_CONSTRAINT_CTOR(boundary_constraint_class) {
        SCV_CONSTRAINT( lower() > 100 );
        SCV_CONSTRAINT( upper() < 500 );
        SCV_CONSTRAINT( burst() >= lower() && burst() <= upper() );
    }
};

// using the boundary class
int use_boundary() {
    // The argument is a name string required by SystemC
    boundary_constraint_class bc ("boundary_constraint_instance");
    for (int i = 0; i < DESIRED_NUMBER_OF_CONSTRAINED_RANDOM_VALUES; ++i) {
        bc.next(); // generate values
        cout << "Value of burst: " << bc.burst() << endl;
    }
    return 0;
}

Listing Three
template <typename T> void fields(const T& obj)
{
    scv_extensions<T> ext = scv_get_extensions(obj);
    cout << "Our object has " << ext.get_num_fields() << " fields." << endl;
}
class aClass {
public:
    long I;
    int t;
    char *p;
};
SCV_EXTENSIONS(aClass) {
public:
    scv_extensions<long> I;
    scv_extensions<int> t;
    scv_extensions<char*> p;
    SCV_EXTENSIONS_CTOR(aClass) {
        SCV_FIELD(I);
        SCV_FIELD(t);
        SCV_FIELD(p);
    }
};
void example()
{
    aClass ac;
    fields(ac);
}

DDJ

and report the results of operations within the Transactor. This is done via transaction monitoring and recording with the SCV Transaction API. The output in SCV is text based. In SCV, you control what happens during transaction recording by registering callbacks. For example, to do text recording, users call scv_tr_text_init( ), which registers the appropriate callbacks for text recording. Similar strategies can be used to change how transactions are recorded. For instance, to record to an SST2 database (a vendor-specific signal database provided by Cadence Design Systems), you call scv_tr_sdi_init( ) to register those callbacks. Text recording can be slow, so this is a powerful way to extend transaction recording. Monitoring can be done dynamically, or the output can be dumped out for postprocessing.

Transactions are events that have a beginning and ending time and an associated data set. Important classes in SCV transaction recording are:

• scv_tr_generator, a class for generating transactions of a specific type.

• scv_tr_stream, a grouping of related or overlapping transactions.

• scv_tr_db, a transaction database that collects a group of transaction streams.

• scv_tr_handle, a handle to a transaction.

Data Introspection
An important facilitator for randomization, constrained randomization, and transaction recording is the ability to perform certain operations on complex C++ and SystemC data types. The C function rand( ) generates a random integer. SCV allows similar randomization of higher order C++, SystemC, and SCV classes and types with a form of data introspection. Introspection lets a program gain knowledge about the properties of an initially unknown data object. Thus, the rand( ) example can be extended to complex types by using introspection to determine the names and data types of the data members of an object, and applying type-appropriate customizations of rand( ) on those composite members (often wrapped in an scv_smart_ptr). In SCV, data introspection is implemented with template specialization so code can work with a data object without explicit type information at compile time. Without this powerful SCV feature, users would need to implement this with custom code for every class in the verification design.

The data introspection facility provides a standard abstract interface, scv_extensions_if, from which a data object can be analyzed and manipulated. The scv_extensions template extends data objects so that they support the abstract interface via partial template specialization. SystemC, C++, and C built-in types have corresponding data introspection types, including:

• Class scv_extensions<bool> for bool.
• Class scv_extensions<char> for char.
• Class scv_extensions<sc_string> for sc_string.
• Class scv_extensions<T*> for pointers.

scv_extensions has a host of member functions for introspecting the data type. Listing Three is an example where a user-defined type is queried for the number of fields. This is just a small primer on data introspection. Besides static analysis, such as in the aforementioned example, a rich implementation of dynamic analysis is also available in SCV.

Conclusion
SCV is gaining wide support from the ESL community and has already been used to verify chips that have made it to tapeout. It is not without competitors in the ESL verification space, namely SystemVerilog and the "e" language. However, as SystemC continues to gain prominence, SCV and the libraries that simplify its use should gain traction and help spur the adoption of Electronic System Level, "preimplementation" verification in hardware and embedded system design.

DDJ



Whether an embedded-systems database is developed for a specific application or as a commercial product, portability matters.

Most embedded data-management code is still homegrown, and when external forces drive an operating system or hardware change, data-management code portability saves significant development time. This is especially important because the lifespan of hardware is increasingly shorter than that of firmware. For database vendors, compatibility with the dozens of hardware designs, operating systems, and compilers used in embedded systems provides a major marketing advantage.

For real-time embedded systems, database code portability means more than the ability to compile and execute on different platforms: Portability strategies also tie into performance. Software developed for a specific OS, hardware platform, and compiler often performs poorly when moved to a new environment, and optimizations to remedy this are very time consuming. Truly portable embedded-systems data-management code carries its optimization with it, requiring the absolute minimum adaptation to deliver the best performance in new environments.

Using Standard C
Writing portable code traditionally begins with a commitment to use only ANSI C. But this is easier said than done. Even code written with the purest ANSI C intentions frequently makes assumptions about the target hardware and operating environment. In addition, programmers often tend to use available compiler extensions. Many of the extensions (prototypes, stronger type checking, and so on) enhance portability, but others may add to platform dependencies.

Platform assumptions are often considered necessary for performance reasons. Embedded code is intended to run optimally on targets ranging from the low-end 8051 family, to 32-bit DSP processors, to high-end Pentium-based SMP machines. Therefore, after the software has been successfully built for the specific target, it is customary to have a performance-tuning stage that concentrates on bringing out the best of the ported software on the particular platform. This process can be as straightforward as using compiler-specific flags and optimizations, but often becomes complex and time-consuming and involves patching the code with hardware-specific assembler. Even with C language patches, hardware-optimized code is often obscure and, more importantly, performs poorly on different machines.

Programmers also attempt to maintain portability through conditional code (#ifdef/#else) in a master version that is preprocessed to create platform-specific versions. Yet in practice, this method can create the customization and version-management headaches that portability is meant to eliminate. Another conditional-code approach, implementing if-else conditions to select a processor-specific execution path at runtime, results in both unmanageable code and wasted CPU cycles.

All told, it's better to stick to ANSI C and to use truly platform-independent data structures and access methods as much as possible to work around compiler- and platform-specific issues.

Portability & Data Management

Simplify the reuse of data-management code in new environments

ANDREI GORINE

Andrei is principal architect at McObject. He can be reached at [email protected].

"One proven technique is to avoid making assumptions about integer and pointer sizes"



In the process of creating the eXtremeDB in-memory embedded database at McObject (where I work), we developed several techniques that are useful for any developer seeking to write highly portable, maintainable, and efficient embedded code. Some of these techniques apply to embedded-systems portability generally, but are particularly important for data management. In many cases, an embedded application's database is its most complex component, and getting it right the first time (by implementing highly portable code) saves programmer-months down the road. Other techniques I present here, such as building lightweight database synchronization based on a user-mode spinlock, constitute specific key building blocks for portable embedded-systems databases.

Word Sizes
One proven technique is to avoid making assumptions about integer and pointer sizes. Defining the sizes of all base types used throughout the database engine code, and putting these typedefs in a separate header file, makes it much easier to change them when moving the code from one platform to another, or even when using a different compiler for the same hardware platform; see Listing One.

Defining a pointer size as sizeof(void*), and using that definition to calculate memory-layout offsets or in pointer-arithmetic expressions, avoids surprises when moving to a platform such as the ZiLOG eZ80 with 3-byte pointers:

#define PTRSIZE sizeof(void *)

The void* type is guaranteed to have enough bits to hold a pointer to any data object. (Strictly speaking, ANSI C guarantees this only for object pointers, not for pointers to functions.)

Data Alignment
Data alignment can be a portability killer. For instance, on various hardware architectures, a 4-byte integer may start at any address, or start only at an even address, or start only at a multiple-of-four address. In particular, a structure could have its elements at different offsets on different architectures, even if the element is the same size. To compensate, our in-memory data layout requires data-object allocation to start from a given position, and aligns elements via platform-independent macros. Listing Two aligns the position of the data object (pos) at a 4-byte boundary.

Another alignment-related pitfall is that, on some processors (such as SPARC), all data types must be aligned on their natural boundaries. Using Standard C data types, integers are aligned as follows:

• short integers are aligned on 16-bit boundaries.
• int integers are aligned on 32-bit boundaries.
• long integers are aligned on either 32-bit boundaries or 64-bit boundaries, depending on whether the data model of the kernel is 64-bit or 32-bit.
• long long integers are aligned on 64-bit boundaries.

Usually, the compiler handles these alignment issues and aligns the variables automatically; see Listing Three. But redefining the way a variable or a structure element is accessed, while possible and sometimes desirable, can be risky. For example, consider the declaration in Listing Four of an object handle (assuming the data-object size is N bytes). Such opaque handle declarations are commonly used to hide data-object representation details from applications that access the data object with an interface function, using the handle merely as the object identifier, as shown in Listing Five. Because d is a byte array, its address is not memory aligned. The handle is further used as an identifier of the object to the library:

void* function ( appData *handle);

Furthermore, internally the library "knows" about the details behind the handle and declares the object as a structure with the elements defined as short integers, long integers, references, and so on; see Listing Six.

Accessing object elements leads to a bus error because they are not correctly aligned. To avoid the problem, in Listing Seven we declare the object handle as an array of operands of the maximum size (as opposed to a byte array). In this case, the compiler automatically aligns the operands to their natural boundaries, preventing the bus error.

Word Endianness
Byte order is the way the processor stores multibyte numbers in memory. Big-endian machines, such as the Motorola 68K and SPARC, store the byte with the highest-value digits at the lowest address, while Little-endian machines (Intel 80x86) store it at the highest address. Furthermore, some CPUs can toggle between Big- and Little-endian operation by setting a processor register to the desired endian architecture (IBM PowerPC, MIPS, and Intel Itanium offer this flexibility). Therefore, code that depends on a particular orientation of bits in a data object is inherently nonportable and should be avoided. Portable, endian-neutral code should make no assumptions about the underlying processor architecture, instead wrapping the access to data and memory structures with a set of interfaces implemented via processor-independent macros, which automatically compile the code for a particular architecture.

Furthermore, a few simple rules help keep the internal data-access interfaces portable across different CPU architectures:

• Access data types natively; for instance, read an int as an integer number as opposed to reading 4 bytes.

• Always read/write byte arrays as byte arrays instead of as different data types.

• Bit fields defined across byte boundaries or smaller than 8 bits are nonportable. When necessary, to access a bit field that is not on a byte boundary, access the entire byte and use bit masks to obtain the desired bits.

• Pointer casts should be used with care. In endian-neutral code, casting pointers in ways that change the size of the pointed-to data must be avoided. For example, casting a pointer to the 32-bit value 0x12345678 to a byte pointer would point to 0x12 on a Big-endian machine and to 0x78 on a Little-endian machine.

Compiler Differences
Compiler differences often play a significant role in embedded-systems portability. Although many embedded environments are said to conform to ANSI Standards, it is well known that, in practice, many do not. These nonconformance cases are politely called "limitations." For example, although required by the Standard, some older compilers recognize void, but don't recognize void*. It is difficult to know in advance whether a compiler is in fact a strict ANSI C compiler, but it is very important for any portable code to follow the Standard. Many compilers allow extensions; however, even common extensions can lead to portability problems. In our development, we have come across several issues worth mentioning to avoid compiler-dependent problems.

When char types are used in expressions, some compilers treat them as unsigned, while others treat them as signed. Therefore, portable code requires that char variables be explicitly cast when used in expressions; see Listing Eight.

Some compilers cannot initialize automatic aggregate types. For example, Listing Nine may not be allowed by the compiler. The most portable solution is to add code that performs the initialization, as in Listing Ten.

C-Runtime Library
Databases in nonembedded settings make extensive use of the C runtime. However, embedded-systems developers commonly avoid using the C runtime to reduce memory footprint. In addition, in some embedded environments, C-runtime functions such as dynamic memory allocation/deallocation (malloc()/free()) are implemented so poorly as to be virtually useless.

An alternative, implementing the necessary C-runtime functionality within the database runtime itself, reduces memory overhead and increases portability. For main-memory databases, implementing dynamic memory management through the database runtime becomes vitally important because these engines' functionality and performance are based on the efficiency of memory-oriented algorithms. We incorporate a number of portable embedded memory-management components that neither rely on OS-specific, low-level memory-management primitives, nor make any fundamental assumptions about the underlying hardware architecture. Each of the memory managers employs its own algorithms and is used by the database runtime to accomplish a specific task.

• A dynamic memory allocator provides functionality equivalent to the standard C runtime-library functions malloc( ), calloc( ), free( ), and realloc( ), according to the POSIX Standard. The heap allocator is used extensively by the database runtime, but can also be used by applications.

• Another memory manager is a comprehensive data-layout page manager that implements an allocation strategy adapted to the database runtime's requirements. Special care is taken to avoid introducing unnecessary performance overhead associated with multithreaded access to the managed memory pools.

• A simple and fast single-threaded memory manager is used while parsing SQL query statements at runtime, and so on.

Synchronization
Databases must provide concurrent access across multiple simultaneously running tasks. Regardless of the database locking policies (optimistic or pessimistic, record level or table level, and the like), this mechanism is usually based on kernel synchronization objects, such as semaphores, provided by the underlying OS. While each operating system provides very similar basic synchronization objects, they do so with considerably different syntax and usage, making it nontrivial to write portable multithreaded synchronization code. In addition, an embedded-systems database must strive to minimize the expense associated with acquiring kernel-level objects. Operating-system semaphores and mutexes are usually too expensive, in performance terms, to be used in embedded settings.

In our case, the solution was to build up the database runtime synchronization mechanism based on a simple synchronization primitive, the test-and-set method, that is available on most hardware architectures. Foregoing the kernel for a hardware-based mechanism reduces overhead and increases portability. All we must do is port three functions to a specific target. This approach can also be used for ultra-low-overhead embedded systems where no operating system is present (hence, no kernel-based synchronization mechanism is available). Furthermore, the performance of the test-and-set "latch" in Listing Eleven remains the same regardless of the target operating system and depends only on the actual target's CPU speed. Listing Twelve(a) provides an implementation for Win32, Listing Twelve(b) for the Sun SPARC platform, and Listing Twelve(c) for the Green Hills INTEGRITY OS.

The concept of mutual exclusion is crucial in database development: It provides a foundation for the ACID properties that guarantee safe sharing of data. The synchronization approach just discussed slants toward the assumption that, for embedded-systems databases, it is often more efficient to poll for the availability of a lock rather than allow fair preemption of the task accessing the shared database.

It is important to note that even though this approach is portable in the sense that it provides consistent performance of the synchronization mechanism over multiple operating systems and targets, it does not protect against the "starvation" of tasks waiting for, but not getting, access to the data. Also, provisions must be made for the database system to clean itself up if the task holding the lock unexpectedly dies, so that other tasks in line for the spinlock do not wait eternally. In any case, embedded data management is often built entirely in memory, generally requires a low number of simultaneous transactions, and the transactions themselves are short in duration. Therefore, the chances of a resource conflict are low, and a task's wait to gain access to data is generally shorter than the time needed for a context switch.

Nonportable Features
While replacing C runtime-library functionality and memory managers, and implementing custom synchronization primitives, lead to greater data-management code portability, sometimes it is not possible or practical to overload the database with functionality (such as network communications or filesystem operations) that belongs to the operating system. A solution is to not use these services directly, nor recreate them in the database, but instead create an abstraction of them that is used throughout the database engine code. The actual implementation of the service is delegated to the application. This allows hooking up service implementations without changing the core engine code, which again contributes to portability.

For example, data-management solutions often include online backup/restore features that, by their nature, require filesystem or network interaction. Creating an abstraction of stream-based read and write operations, and using this abstraction layer within the database runtime during backup, allows the database to implement the backup/restore logic while staying independent of the actual I/O implementation. At the same time, this approach allows a file-based, socket-based, or other custom stream-based transport to be plugged in with no changes needed to the database runtime. Listing Thirteen illustrates such a plug-in interface. The application needs to implement the actual read-from-stream/write-to-stream functionality; see Listing Fourteen.

Another example of the database "outsourcing" services to the application involves network communications. Embedded databases often must provide a way to replicate data between several databases over a network. Embedded settings always demand highly configurable and often deterministic communication that is achieved using a great variety of media-access protocols and transports. Thus, as a practical matter, a database should be able to adopt the communication protocol used for any given embedded application, regardless of the underlying hardware or operating system. Instead of communicating directly with the transport or a protocol, the database runtime goes through a thin abstraction layer that provides the notion of a "communication channel." Like the backup/restore interfaces, the network communication channel can also be implemented via a stream-based transport; see Listing Fifteen. The database uses a set of API functions that provide the ability to initiate and close the channel, send and receive data, and so on.

Conclusion
By following general rules for developing portable code, such as using Standard C and avoiding assumptions about hardware-related parameters, you can greatly simplify the reuse of your data-management code in new environments. And new approaches to implementing standard database services, such as those presented here, can ensure that the old concept of a database delivers the portability, performance, and low resource consumption demanded for embedded systems.

DDJ
(Listings begin on page 54.)



Listing One
#ifndef BASE_TYPES_DEFINED
typedef unsigned char  uint1;
typedef unsigned short uint2;
typedef unsigned int   uint4;
typedef signed char    int1;
typedef short          int2;
typedef int            int4;
#endif

Listing Two
#define ALIGNEDPOS(pos, align) ( ((pos) + (align)-1) & ~((align)-1) )
pos = ALIGNEDPOS(pos, 4);

Listing Three
char c;
/* (padding) */
long l;  /* the address is aligned */

Listing Four
#define handle_size N
typedef uint1 hobject [handle_size];

Listing Five
typedef struct appData_ { hobject h; } appData;
char c;
appData d; /* d is not aligned */

Listing Six
typedef struct objhandle_t_
{
    ...
    obj_h po;
    ...
    uint4 mo;
    uint2 code;
    ...
} objhandle_t;

Listing Seven
#define handle_size N
#define handle_size_w ((( handle_size + (sizeof(void*) -1)) & ~(sizeof(void*) -1)) / sizeof(void*))

typedef void * hobject [handle_size_w];

Listing Eight
#if defined( CFG_CHAR_CMP_SIGNED )
#define CMPCHARS(c1,c2) ( (int)(signed char)(c1) - (int)(signed char)(c2) )
#elif defined( CFG_CHAR_CMP_UNSIGNED )
#define CMPCHARS(c1,c2) ( (int)(unsigned char)(c1) - (int)(unsigned char)(c2) )
#else
#define CMPCHARS(c1,c2) ( (int)(char)(c1) - (int)(char)(c2) )
#endif

Listing Nine
struct S { int i; int j; };
S s = {3,4};

Listing Ten
struct S { int i; int j; };
S s;
s.i = 3; s.j = 4;

Listing Eleven
/* this is the TAS (test-and-set) latch template */
void sys_yield()
{
    /* relinquish control to another thread */
}
void sys_delay(int msec)
{
    /* sleep */
}
int sys_testandset( /*volatile*/ long * p_spinlock)
{
    /* The spinlock size is up to a long;
     * this function performs the atomic swap (1, *p_spinlock) and returns
     * the previous spinlock value as an integer, which could be 1 or 0
     */
}

Listing Twelve(a)
#ifndef SYS_WIN32_H__
#define SYS_WIN32_H__

/* sys.h definitions for WIN32 */

#define WIN32_LEAN_AND_MEAN
#include <windows.h>
#include <process.h>

#define sys_yield()         SleepEx(0,1) /* yield() */
#define sys_delay(msec)     SleepEx(msec,1)
#define sys_testandset(ptr) InterlockedExchange(ptr,1)

#endif /* SYS_WIN32_H__ */

(b)
#ifndef SYS_SOL_H__
#define SYS_SOL_H__

/* sys.h definitions for Solaris */

#include <sys/time.h>
#include <unistd.h>
#include <sched.h>

int sys_testandset( /*volatile*/ long * p_spinlock)
{
    register char result = 1;
    volatile char *spinlock = ( volatile char * ) p_spinlock;
    __asm__ __volatile__(
        " ldstub [%2], %0 \n"
        : "=r"(result), "=m"(*spinlock)
        : "r"(spinlock));
    return (int) result;
}
void sys_yield()
{
    sched_yield();
}
void sys_delay(int msec) { /* */ }

#endif /* SYS_SOL_H__ */

(c)
#ifndef SYS_GHSI_H__
#define SYS_GHSI_H__

/* sys.h definitions for Green Hills INTEGRITY OS */
#include <INTEGRITY.h>

void sys_yield()
{
    Yield();
}
void sys_delay(int msec) {}
int sys_testandset(long * p_spinlock)
{
    return ! ( Success == TestAndSet(p_spinlock, 0, 1) );
}

#endif /* SYS_GHSI_H__ */

Listing Thirteen
/* abstraction of write and read stream interfaces;
 * a stream handle is a pointer to the implementation-specific data */
typedef int (*stream_write)( void *stream_handle, const void * from, unsigned nbytes);
typedef int (*stream_read) ( void *stream_handle, /*OUT*/ void * to, unsigned max_nbytes);

/* back up the database content to the output stream */
RETCODE db_backup( void * stream_handle, stream_write output_stream_writer, void * app_data);

/* restore the database from the input stream */
RETCODE db_load  ( void * stream_handle, stream_read input_stream_reader, void * app_data);

Listing Fourteen
int file_writer(void *stream_handle, const void * from, unsigned nbytes)
{
    FILE *f = (FILE*)stream_handle;
    int written = fwrite(from, 1, nbytes, f);
    return written;
}
int file_reader(void *stream_handle, void * to, unsigned max_nbytes)
{
    FILE *f = (FILE*)stream_handle;
    int nbytes = fread(to, 1, max_nbytes, f);
    return nbytes;
}

Listing Fifteen
typedef struct channel_t_ *channel_h;
typedef int (*xstream_write)(channel_h ch, const void * from,
                             unsigned nbytes, void * app_data);
typedef int (*xstream_read) (channel_h ch, void * to,
                             unsigned max_nbytes, void * app_data);
typedef struct channel_t_ {
    xstream_write fsend;
    xstream_read  frecv;
    ...
} channel_t;

DDJ



Contrary to widely held misconceptions, performance is portable. Virtually every optimization effort initially involves tuning for architectural features found on most nonvector processors. The code is modified to improve data reuse in the caches, reduce address misses in the TLB, and prevent register spills to memory, all of which can cause stalls in the processor's pipeline. But the questions often arise: Why should we optimize code at all? Why not just use the best available compiler options and move on? If the code runs too slowly, why not just upgrade our hardware and let Moore's Law do the work for us? There are many answers to these questions, the simplest being that the performance of most applications depends as much on the bandwidth of the memory subsystem as on the core clock speed of the processor. While we do see processor megahertz continuing to double every 18 months, DRAM speeds see 10 percent improvement in bandwidth at best. Another reason to optimize has to do with the average performance of the user's hardware. Many applications, such as an MP3 player or video-conferencing package, have a certain amount of work to do in a finite amount of time. Considering that most of the user community is running computers that are at least a generation or two behind, optimizing this software guarantees that more people will be able to run your code.

In the high-performance computing (HPC) community, optimization is of critical importance as it directly translates to better science. At HPC centers, computer time is "sold" to various parties of each institution. Take, for example, the National Energy Research Scientific Computing Center (NERSC), the computer center at Lawrence Berkeley National Laboratory (http://www.lbl.gov/). The center runs a variety of large machines and sells time to each of the different divisions of the lab, as well as other government and academic institutions around the country. On these systems, 99 percent of the workload is scientific simulation for running virtual experiments beyond physical or economical constraints. These applications range from the simulation of self-sustaining fusion reactors to climate modeling of the entire Earth down to a resolution of a few kilometers. The compute time of these models is always dependent on the resolution of the simulation. To fit within budget, experimenters must choose an appropriate resolution that resolves the phenomena of interest, yet still remains within their allocation. Optimization of these applications can result in a tremendous savings of compute time, especially when you consider

Performance Monitoring with PAPI

Using the Performance Application Programming Interface

PHILIP MUCCI WITH CONTRIBUTIONS FROM NILS SMEDS AND PER EKMAN

Philip is the original author and technical lead for PAPI. He is a research consultant for the Innovative Computing Laboratory at the University of Tennessee, Knoxville. Nils and Per are computer scientists at the Center for Parallel Computers (PDC) at the Royal Institute of Technology in Stockholm, Sweden, and are regular contributors to the PAPI project. They can be contacted at [email protected], [email protected], and [email protected], respectively.

“The hardest part of the optimization process is understanding the interaction of the system architecture, operating system, compiler, and runtime system”

http://www.ddj.com Dr. Dobb’s Journal, June 2005 55



The PerfSuite tool is from the National Center for Supercomputing Applications (http://perfsuite.ncsa.uiuc.edu/).

PerfSuite includes a number of tools and libraries for performance analysis of serial, MPI, and OpenMP applications. Of initial interest to most users is the PSRUN tool that, when run on an application, generates textual output of raw performance counter data as well as a wealth of derived statistics. Figure 1 is an example of a run of the genIDLEST application. PSRUN can also be used to generate flat statistical profiles by file, function, and source line of an application.

The HPCToolkit, from the Center for High Performance Software Research at Rice University (http://www.hipersoft.rice.edu/hpctoolkit/), includes hpcrun, a command-line tool to generate flat statistical profiles to be visualized with a Java-based GUI called “hpcview,” as shown in Figure 2. HPCToolkit is capable of generating and aggregating multiple simultaneous profiles such that, for a postanalysis, line-by-line profiles can be generated from multiple hardware metrics. Like PerfSuite, HPCToolkit supports serial and parallel execution models.

Often after an initial performance analysis, users wish to narrow the focus of their examination. This may involve making detailed measurements of regions of code and associating them with an application event, such as updates of a cell in a grid. TAU, the “Tuning and Analysis Utilities” from the University of Oregon, is such a tool (http://www.cs.uoregon.edu/research/paracomp/tau/tautools/). TAU is a comprehensive environment for performance analysis that includes modules for both source and binary instrumentation, experiment management of performance data, and scalable GUIs suitable for visualization of highly parallel programs. TAU runs just about everywhere and can even function without PAPI being available (although only time-based metrics are available without PAPI). Figure 3 is a TAU display of the performance of a 16-process MPI program.

—P.M., N.S., and P.E.

Open-Source Performance Analysis Tools

Figure 3: TAU display of the performance of a 16-process MPI program with multiple PAPI metrics, subroutine breakdown, and call graph display.

Figure 2: HPCview showing PAPI events gathered from hpcrun for a triply nested loop.

PerfSuite 1.0 summary for execution of gen.ver2.3.inte (PID=9867, domain=user)

Based on 800 MHz GenuineIntel CPU, CPU revision 6.000

Event  Counter Name                                        Counter Value
===================================================================
 0  Conditional branch instructions mispredicted.............    3006093956
 1  Conditional branch instructions correctly predicted......   32974709880
 2  Conditional branch instructions taken....................   26952022279
 3  Floating point instructions..............................   44525980237
 4  Total cycles.............................................  353262206234
 5  Instructions completed...................................  489764680025
 6  Level 1 data cache accesses..............................   56390921533
 7  Level 1 data cache hits..................................   41911206947
 8  Level 1 data cache misses................................   14615753570
 9  Level 1 load misses......................................   17611912424
10  Level 1 cache misses.....................................   17597248300
11  Level 2 data cache accesses..............................   53158617899
12  Level 2 data cache misses................................    8440205387
13  Level 2 data cache reads.................................   43528651785
14  Level 2 data cache writes................................   10240563775
15  Level 2 load misses......................................    3615923337
16  Level 2 store misses.....................................     667575973
17  Level 2 cache misses.....................................    8529931717
18  Level 3 data cache accesses..............................    3826843278
19  Level 3 data cache hits..................................    2799591986
20  Level 3 data cache misses................................     999714206
21  Level 3 data cache reads.................................    3573882130
22  Level 3 data cache writes................................     171800425
23  Level 3 load misses......................................     944624814
24  Level 3 store misses.....................................      49427000
25  Level 3 cache misses.....................................    1024569375
26  Load instructions........................................   84907675686
27  Load/store instructions completed........................   95346092870
28  Cycles Stalled Waiting for memory accesses...............  140032176122
29  Store instructions.......................................   10267472354
30  Cycles with no instruction issue.........................   67247126931
31  Data translation lookaside buffer misses.................       8365029

Statistics
===================================================================
Graduated instructions/cycle................................... 1.386406
Graduated floating point instructions/cycle.................... 0.126042
Graduated loads & stores/cycle................................. 0.269902
Graduated loads & stores/floating point instruction............ 2.141359
L1 Cache Line Reuse............................................ 5.523515
L2 Cache Line Reuse............................................ 0.731682
L3 Cache Line Reuse............................................ 7.442618
L1 Data Cache Hit Rate......................................... 0.846708
L2 Data Cache Hit Rate......................................... 0.422527
L3 Data Cache Hit Rate......................................... 0.881553
% cycles w/no instruction issue................................ 19.036037
% cycles waiting for memory access............................. 39.639729
Correct branch predictions/branches taken...................... 1.000000
MFLOPS......................................................... 100.833839

Figure 1: Output from the PSRUN tool of PerfSuite showing hardware events and derived metrics gathered via PAPI.


that these simulations are parallel codes; there may be thousands of processors executing similar code. A 30 percent decrease in runtime could allow for a 30 percent increase in the resolution of the simulation, possibly exposing subtleties not present in the former simulation. Such a difference directly translates into accelerating the pace of scientific discovery, and can even (as in the case of car crash simulations or airplane wing design) result in saved lives.

However, the hardest part of the optimization process is understanding the interaction of the system architecture, operating system, compiler, and runtime system, and how that affects the performance of the application. All these elements must be taken into consideration when attempting to optimize an application. Wouldn’t it be nice if there were a way to reduce the prerequisite knowledge and experience by having the processor tell you exactly where and why your code was losing performance? Well, it turns out that such a method does exist: the on-chip performance monitoring (PM) hardware found on almost every microprocessor in use today. The PM hardware consists of a small number of registers with connections to various other parts of the chip. Traditionally, this hardware was used exclusively for testing and verification. Now, however, the importance of the PM hardware has become widely understood, especially in the HPC community.

There are two methods of using the PM hardware: aggregate (direct) and statistical (indirect):

• Aggregate usage involves reading the counters before and after the execution of a region of code and recording the difference. This usage model permits explicit, highly accurate, fine-grained measurements. There are two subcases of aggregate counter usage: summation of the data from multiple executions of an instrumented location, and trace generation, where the counter values are recorded for every execution of the instrumentation.

• The second method is statistical profiling: The PM hardware is set to generate an interrupt when a performance counter reaches a preset value. This interrupt carries with it important contextual information about the state of the processor at the time of the event. Specifically, it includes the program counter (PC), the text address at which the interrupt occurred. By populating a histogram with this data, users obtain a probabilistic distribution of PM interrupt events across the address space of the application. This kind of profiling facilitates a good high-level understanding of where and why the bottlenecks are occurring. For instance, the questions “What code is responsible for most of the cache misses?” and “Where is the branch prediction hardware performing poorly?” can quickly be answered by generating a statistical profile.

In “Optimization Techniques” (DDJ, May 2004), Tim Kientzle describes how to use the real-time cycle counter (rdtsc) found on the x86 architecture. However, there are some problems with this technique. The first is due to the nature of a system designed for multiprocessing. No system can be considered quiet when examined at a sufficiently fine granularity. For example, the laptop on which this article is being written is busy servicing interrupts from a variety of sources, as well as delivering commands to the graphics coprocessor and the PCM audio chip. The consequence is that the cycle count from rdtsc( ) very likely includes a host of other, unrelated events. On a desktop system (a stock Red Hat 9 system with no “active” processes), vmstat reports about 120 interrupts and 50 context switches every 10 seconds. Doing cycle-by-cycle accounting of code is thus impossible; just one interrupt or one context switch during a measurement interval could mislead you in your quest for bottlenecks.

The next problem with timers is more severe. Timers don’t tell you anything about why the hardware is behaving the way it is. The only way to understand the



system is to attempt to reconstruct the processor’s pipeline as it executes your code. This is a virtually impossible task on today’s advanced CPUs, even for experts with detailed knowledge of the processor’s microarchitecture. The last problem is one of portability and interface semantics. While the rdtsc( ) code segment works nicely in the Intel/Microsoft environment, it doesn’t help when you migrate to another architecture, operating system, or even compiler!

These problems and more have been addressed with the development of the Performance Application Programming Interface (PAPI) library. PAPI (available at http://icl.cs.utk.edu/papi/) is intended to be completely portable from an API standpoint. That is, performance tools and instrumentation that work on one platform work seamlessly, after a recompile, on another. While the interface is portable, the generated data is not. Data from the PM hardware rarely has the same semantics from vendor to vendor, and often changes from model to model. A cache miss on one platform may be measured entirely differently on another; even definitions as “simple” as an instruction can become blurred on modern architectures (consider x86 instructions versus x86 µops).

PAPI is implemented on a wide variety of architectures and operating systems. The current release, PAPI 3.0.7, is supported on the Cray T3E and X1, Sun UltraSparc/Solaris, Alpha/Tru64, IBM Power 604e/2/3/4/AIX, MIPS R10k/IRIX, IA64/IA32/x86_64/Linux, and Windows/IA32 (not P4). Wrappers for PAPI exist for C, C++, Fortran, Java, and Matlab.

To facilitate the development of portable performance tools, PAPI provides interfaces to get information about the execution environment. It also provides methods to obtain a complete listing of which PM events are available for monitoring. PAPI supports two types of events: preset and native. Preset events have a symbolic name associated with them that is the same for every processor supported by PAPI. Native events, on the other hand, provide a means to access every possible event on a particular platform, regardless of whether there is a predefined PAPI event name for it.

For preset events, users can issue a query to PAPI to find out if the event is present on the current platform. While the exact semantics of the event might be different from processor to processor, the name for the event is the same (for example, PAPI_TOT_CYC for elapsed CPU cycles). Native events are specific to each processor and have their own symbolic name (usually taken from the vendor’s architecture manual or header file, should one exist). By their nature, native events are always present.

PAPI supports measurements per thread; that is, each measurement only contains counts generated by the thread performing the PAPI calls. Each thread has its own PM context, and thus can use PAPI completely independently from other threads. This is achieved through the operating system, which saves/restores the performance counter registers just like the rest of the processor state at a context switch. Using a lazy save-and-restore scheme effectively reduces the additional instruction overhead to a few dozen cycles for the processes/threads that are using the performance monitoring hardware. Most operating systems (AIX, IRIX, Tru64, Unicos, Linux/IA64, HP-UX, Solaris) have officially had this support for a significant time, motivated largely by the popularity of the PAPI library and associated tools. For Linux/IA32/x86_64, support exists in the form of a PerfCtr kernel patch (http://user.it.uu.se/~mikpe/linux/perfctr/), written by Mikael Pettersson of Uppsala University in Sweden. This patch has not yet been formally accepted into the mainline Linux 2.6 kernel tree. Users with concerns about stability and support should know that this patch has been installed and in use for many years at a majority of the U.S. Government’s Linux clusters on the Top 500 list, not to mention numerous other computationally intensive sites around the world. The notable exception to operating-system support is Microsoft Windows. Because Windows does not preserve the state of the performance counters across context switches, counting on Windows must be done system-wide, greatly reducing the usefulness of the PM hardware for application development and tuning.

PAPI has two different library interfaces. The first is the high-level interface, meant for use directly by application engineers. It consists of eight functions that make it easy to get started with PAPI. It provides start/read/stop functionality, as well as quick and painless ways to get information such as millions of floating-point operations per second (MFLOPS) and instructions per cycle. Listing One contains code that demonstrates a canonical performance problem: traversing memory with nonunit stride. We measure this code’s performance using the PAPI high-level interface. This example uses PAPI presets. They are portable in name, but might not be implemented on all platforms, and may in fact mean slightly different things on different platforms. Hardware restrictions also limit which events can be measured simultaneously. This simple example verifies that the hardware supports at least two simultaneous counters; it then starts the performance counters counting the preset events for L1 data cache misses (PAPI_L1_DCM) and the number of floating-point operations executed (PAPI_FP_OPS). Compiling and running this example on an Opteron (64-byte line size) with gcc 3.3.3 (-O -g) results in this output:

Total software flops = 41943040.000000
Total hardware flops = 42001076.000000
MFlop/s = 48.258228
L1 data cache misses is 12640205


Traditionally, performance monitoring hardware counts the occurrence or duration of events. However, recent CPUs, such as the Intel Itanium2 and IBM Power4/5, include more advanced features. The Itanium2 contains such features as:

• Address and opcode matching. With address matching, the event counting can be constrained to events that are triggered by instructions that either occupy a certain address range in the code or reference data in a certain address range. The first case enables you to measure, for instance, the number of mispredicted branches within a procedure in a program. In the second case, you can measure things like all the cache misses caused by references to a particular array.

Opcode matching constrains counting to events caused by instructions that match a given binary pattern. This allows the counting of instructions of a certain type, or execution on a particular functional unit.

• Branch traces. A CPU with a branch trace buffer (BTB) can store the source and target addresses of branch instructions in the PMU as they happen. On the Itanium2, the branch trace buffer consists of a circular buffer of eight registers in which source and target addresses are stored. Address capturing can be constrained by the type of branch, whether it was correctly predicted, and whether the branch was taken.

• Event addresses. With event address capturing, addresses and latencies associated with certain events are stored in Event Address Registers (EARs). This means that the PM hardware captures the data and instruction addresses that cause, for example, an L1 D-cache miss, together with the latency (in cycles) of the associated memory access.

—P.M., N.S., and P.E.

Advanced PM Features


There’s almost one cache miss for every four floating-point operations; performance is bound to be poor. Switching the for loops so that the memory is accessed sequentially should give us better cache behavior; the result is:

Total software flops = 41943040.000000
Total hardware flops = 42027880.000000
MFlop/s = 234.481339
L1 data cache misses is 2799387

Performance has more than quadrupled and the cache misses have been reduced by 80 percent. Note also that the number of hardware flops differs from run to run and does not exactly match the software flop count. Why? Using avail (Figure 4, edited here for brevity) helps answer this question. The PAPI_FP_INS preset is implemented as the vendor FP_MULT_AND_ADD_PIPE metric. This metric is speculative, meaning that the hardware counter includes instructions that were not retired.

The low-level interface is designed for power users and tool designers. It consists of 53 functions that range from providing information about the processor and executable to advanced features like counter multiplexing, callbacks on counter overflow, and advanced statistical profiling modes. Counter multiplexing is a useful feature in situations where the PM hardware has a very limited number of registers. This limitation can be overcome by trading accuracy and granularity of the measurements for an increase in the measured number of events. By rapidly switching the contents of the PM’s control registers during the course of a user’s run, the appearance of a great number of PM registers is given to users. For more on advanced features, see the accompanying text box entitled “Advanced PM Features.”

While PAPI provides the necessary infrastructure, the true power of PAPI lies in the tools that use it. There are a number of excellent open-source performance analysis tools available that have been implemented using PAPI. While a complete discussion of all these tools is beyond the scope of this article, we mention a few in the accompanying text box entitled “Open-Source Performance Analysis Tools.” For information about these and other tools, please refer to the PAPI web page.

PAPI’s goal is to expose real hardware performance information to users. By doing so, most of the guesswork regarding the root cause of a code’s performance problem can be eliminated. PAPI does not solve algorithmic design issues or diagnose inefficient parallelization. It can, however, diagnose poor usage of the available processor resources, a problem that, before now, was largely intractable. As of now, PAPI is an ad hoc standard, taking into account the input of the tool development and performance engineering community. In the future, we hope to see PAPI evolve into a true open standard, and to see vendors ship native versions of PAPI with each new release of processor and operating system. In the meantime, with continued cooperation from industry, academia, and the research community, PAPI will continue to evolve and further drive the science of performance engineering.

DDJ



Listing One

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/time.h>
#include "papi.h"

#define MX 1024
#define NITER 20
#define MEGA 1000000
#define TOT_FLOPS (2*MX*MX*NITER)

double *ad[MX];

/* Get actual CPU time in seconds */
float gettime() {
    return ((float)PAPI_get_virt_usec() / 1000000.0);
}

int main() {
    float t0, t1;
    int iter, i, j;
    int events[2] = { PAPI_L1_DCM, PAPI_FP_OPS }, ret;
    long_long values[2];

    if (PAPI_num_counters() < 2) {
        fprintf(stderr, "No hardware counters here, or PAPI not supported.\n");
        exit(1);
    }
    /* Allocate one row at a time; rows are contiguous, columns are not. */
    for (i = 0; i < MX; i++) {
        if ((ad[i] = malloc(sizeof(double)*MX)) == NULL) {
            fprintf(stderr, "malloc failed\n");
            exit(1);
        }
    }
    for (j = 0; j < MX; j++) {
        for (i = 0; i < MX; i++) {
            ad[i][j] = 1.0/3.0;   /* Initialize the data */
        }
    }
    t0 = gettime();
    if ((ret = PAPI_start_counters(events, 2)) != PAPI_OK) {
        fprintf(stderr, "PAPI failed to start counters: %s\n", PAPI_strerror(ret));
        exit(1);
    }
    /* Nonunit stride: the inner loop walks down a column. */
    for (iter = 0; iter < NITER; iter++) {
        for (j = 0; j < MX; j++) {
            for (i = 0; i < MX; i++) {
                ad[i][j] += ad[i][j] * 3.0;
            }
        }
    }
    if ((ret = PAPI_read_counters(values, 2)) != PAPI_OK) {
        fprintf(stderr, "PAPI failed to read counters: %s\n", PAPI_strerror(ret));
        exit(1);
    }
    t1 = gettime();

    printf("Total software flops = %f\n", (float)TOT_FLOPS);
    printf("Total hardware flops = %f\n", (float)values[1]);
    printf("MFlop/s = %f\n", (float)(TOT_FLOPS/MEGA)/(t1-t0));
    printf("L1 data cache misses is %lld\n", values[0]);
    return 0;
}

DDJ

./avail -e PAPI_FP_INS
Test case avail.c: Available events and hardware information.
----------------------------------------------------------------
Vendor string and code: AuthenticAMD (2)
Model string and code: AMD K8 Revision C (15)
CPU Revision: 8.000000
CPU Megahertz: 1993.626953
CPU's in this Node: 4
Nodes in this System: 1
Total CPU's: 4
Number Hardware Counters: 4
Max Multiplex Counters: 32
----------------------------------------------------------------
Event name: PAPI_FP_INS
Event Code: 0x80000034
Number of Native Events: 1
Short Description: |FP instructions|
Long Description: |Floating point instructions|
|Native Code[0]: 0x40000002 FP_MULT_AND_ADD_PIPE|
|Number of Register Values: 2|
|Register[0]: 0xf P3 Ctr Mask|
|Register[1]: 0x300 P3 Ctr Code|
|Native Event Description: |Dispatched FPU ops - Revision B and later revisions -
Speculative multiply and add pipe ops excluding junk ops|

Figure 4: Output from the avail tool from the PAPI distribution.


The C++ Standard Library is a large and successful project. The specification of the library is about 370 pages in length, longer than the parts of the C++ Standard that describe the core language itself (ISO, International Standard: Programming Languages - C++, ISO/IEC 14882:2003(E), 2003). There are now dozens of books explaining how to use the Standard Library (I’ve listed a few in the references). Nor is this just the domain of standardization bureaucrats or textbook writers. The C++ Standard was finalized more than six years ago, every major C++ compiler ships with a complete Standard Library implementation, and millions of programmers are using them. If a C++ program was written within the last five years (and often even if it was written earlier), it’s standard fare for it to use std::string, STL containers and algorithms, or IOStreams.

But if we look at it a little differently, the Standard Library doesn’t seem so large after all. Granted, it’s much larger than the C Standard Library (all of the 1990 C Library is included by reference!) and it includes an extensive collection of containers and algorithms (while the C Library has only qsort and bsearch). However, most of the problem areas the C++ Standard Library addresses (I/O and localization, character string manipulation, memory allocation, math functions) are in the C library as well. How does that compare to what other languages provide in their standard libraries?

The Perl, Python, and Java standard libraries include facilities for parsing XML, performing regular expression searches, manipulating image files, and sending data over the network. From that perspective, the C++ Standard Library looks like it’s only the beginning.

And everyone on the Standards committee knows that, of course. When the Standard was finished in 1998, we didn’t include everything that might have been a good idea; we only included what we could. Many potentially good ideas were left out for lack of time, lack of consensus, lack of implementation experience, or sometimes just lack of imagination. As Bjarne Stroustrup stated in “The Design of C++0x” (C/C++ Users Journal, May 2005), the next C++ Standard will see far more emphasis on library evolution than core language changes. Within the Standards committee, wishing for a larger Standard Library is no more controversial than wishing for world peace.

The question isn’t whether we need a more comprehensive library, but how we get one. Standards committees are good at evaluating ideas and working through corner cases, but they’re slow, and they’re poor at coming up with the ideas in the first place. “Design by committee,” including design by Standards committee, is rarely a success.

Historically, members of the Standards committee came up with two answers. At the 1998 Santa Cruz committee meeting, Beman Dawes, then chair of the committee’s Library Working Group, proposed that members of the C++ community work together informally to design and implement new library components, with the goal of advancing the state of the art and eventually building practice for a more extensive Standard Library. That proposal was wildly successful. It’s what eventually became Boost (http://www.boost.org/).

Matthew is a software engineer at Google, chair of the C++ Standardization Committee’s Library Working Group, project editor of the Technical Report on C++ Library Extensions, and author of Generic Programming and the STL. He can be contacted at [email protected].

The Technical Report on C++ Library Extensions


The Standards committee is nearly done with TR1

MATTHEW H. AUSTERN

“The question isn’t whether we need a more comprehensive library, but how we get one”


But that still wasn’t a complete answer. How does this informal library development, at Boost and elsewhere, get into something as formal as an official ISO Standard? At the 2001 Copenhagen meeting, Nico Josuttis and the rest of the German delegation came up with the second part: that we needed something in between the freedom of individual library development and the formality of an official ISO Standard. In 2001, a new revision of the C++ Standard seemed very far away, but many committee members were already working on libraries that they hoped could be part of that new Standard.

The fundamental issue is that, for obvious reasons, library development can go a lot faster than language development. The Standards committee decided to recognize that fact, and to start working on library extensions even before work on a new Standard began. We couldn’t publish an official Standard on new library components, but what we could do, and what we did decide to do, was publish the work on library extensions in the form of a technical report. This report isn’t “normative,” in the jargon of standards bureaucracy, which means that compiler vendors don’t have to include any of these new libraries to conform to the Standard. What it does is provide guidance to those vendors who do want to include them. In the meantime, it gives us a head start on describing new library components in “standardese,” gives us experience implementing and using new libraries, and will eventually make it easier to incorporate them into the next official Standard.

The idea of a library extensions technical report was first proposed at the 2001 Copenhagen meeting. The committee’s library working group (which I chair) began discussing specific ideas for new library extensions at the next meeting and started to solicit proposals. The first extension proposals were accepted at the 2002 Santa Cruz meeting, and the last at the 2003 Kona meeting. Since then, the committee has been filling gaps and fixing bugs and corner cases. At this writing, the C++ Standards committee has finalized the Proposed Draft Technical Report on C++ Library Extensions, ISO/IEC PDTR 19768, and forwarded it to our parent body. Most of the remaining steps are bureaucratic. There may be some minor changes before it is officially finalized, but the substantive work is now essentially complete; see http://www.open-std.org/jtc1/sc22/wg21/.

If you do download the technical report (TR) and read through it, you’ll find that it looks a lot like the library part of the C++ Standard. Like the Standard, it contains header synopses; the Standard describes the contents of headers such as <iterator> and <iostream>, just as the TR describes the contents of headers such as <tuple> and <regex>. (In some cases, such as <utility> and <functional>, the TR describes additions to existing headers.) Like the Standard, it contains classes and functions and type requirements. The style of specification in the TR is deliberately similar to the style in the Standard, because it is expected that much of what’s in the TR will eventually become part of a future Standard.

There is one obvious difference between what’s in the Standard and what’s in the TR: the Standard Library defines all of its names within namespace std, but the TR defines its names within std::tr1. Why std::tr1 instead of just std::tr? For the obvious reason: Someday there may be a namespace std::tr2! For similar reasons, instead of spelling out “Technical Report on C++ Library Extensions,” most people just say “TR1.” Work on new Standard Library components has started. It hasn’t ended.

The fact that TR1 components aren’t defined within namespace std, however, is a reminder of the real difference: as official as it looks, the TR isn’t a standard. Everything in it, by definition, is experimental; it’s there to get user and implementor experience. The expectation is that most of what’s in TR1 will make it into the “C++0x” Standard (the next official version of the C++ Standard), but it’s possible, once we’ve got more real-world experience, that there will still be some big changes along the way. So, while there’s a lot in the TR that makes C++ programming easier and more fun, and while it gives you a taste of what the next Standard will look like, be aware that using the libraries defined in the TR means living on the bleeding edge.

With that caveat in mind, let’s look at some of the high points.

Taking the STL Further: Containers

The STL, originally written by Alex Stepanov, Dave Musser, and Meng Lee, was the most innovative part of the C++ Standard Library (see Alexander Stepanov and Meng Lee, “The Standard Template Library,” HP Technical Report HPL-95-11 (R.1), 1995). Generic STL algorithms, such as sort and random_shuffle, can operate on objects stored in STL containers, such as vector and deque, on objects stored in built-in arrays, and in data structures that haven’t even been written yet. The only requirement is that these data structures provide access to their elements via iterators. Conversely, new algorithms, provided that they access the objects they operate on only through the iterator interface, can work with all existing and yet-to-exist STL data structures. For additional flexibility, STL algorithms parameterize some of their behavior in terms of “function objects” that can be invoked using function call syntax. For example, find_if, which searches for a particular element in a range, takes a function object argument that determines whether its argument matches the search, and sort takes a function object argument that determines whether one object is less than another.

The STL was designed to be extensible, and people immediately extended it. Some of the most obvious gaps to be filled were in the containers. The STL includes three containers for variable-sized sequences of elements: vector, list, and deque. It also includes set and dictionary classes based on balanced trees: the “associative containers” set, map, multiset, and multimap. A common reaction to that list is to see an obvious omission—hash tables. Most other languages, including Perl, Python, and Java, have hash-based dictionaries. Why doesn’t C++?

The only real reason was historical. There was a proposal to add hash tables to the STL as early as 1995 (see Javier Barreiro, Robert Fraley, and David R. Musser, “Hash Tables for the Standard Template Library,” X3J16/94-0218 and WG21/N0605, 1995), but that was already too late to make it into the Standard. Individual vendors filled that gap. Most major library implementations include some form of hash tables, including the Dinkumware standard library that ships with Microsoft Visual C++ and the GNU libstdc++ that ships with GCC. Something that’s provided by different vendors in not-quite-compatible versions is a natural candidate for standardization, so the technical report, at last, includes hash tables.

TR1 hash tables have many specialized features, but in simple cases they’re used much the same way as the standard associative containers, as Listing One shows. The name of this class might give you pause. Why is it called unordered_map, when all previous implementations used the name hash_map? Unfortunately, that history is the biggest reason for using a different name. It’s impossible to make TR1 hash tables compatible with all previous STL hash table implementations because they aren’t compatible with each other. Since the committee had to use a different interface, it felt that the right decision was to use a different name to go with it.

68 Dr. Dobb’s Journal, June 2005 http://www.ddj.com

The unordered associative containers, like all STL containers, are homogeneous. All of the elements in an std::vector<T>, or an std::tr1::unordered_set<T>, have the same type. But the Standard also has one heterogeneous container: std::pair<T, U>, which contains exactly two objects that may be of different types. Pairs are useful whenever two things need to be packaged together. One common use is for functions that return multiple values. For example, the associative containers set and map (and unordered_set and unordered_map) have a version of insert whose return type is pair<iterator, bool>. The second part of the return value is a flag that tells you whether you did actually insert a new element or whether there was already one there with the same key, and the first part points to either the preexisting element or the newly inserted one.

Pairs are useful for packaging multiple values, so long as “multiple” means two. But that seems like an arbitrary restriction. Just as functions sometimes need to return more than one value, so they sometimes need to return more than two. In mathematics, an n-tuple is an ordered collection of n values; pairs are the special case where n is restricted to be 2. TR1 introduces a new heterogeneous container template, std::tr1::tuple, which removes that restriction.

Implementing tuple requires some sophisticated programming techniques, but using it couldn’t be simpler. As with pair, you just supply the types as template parameters—except that with pair you supply exactly two template parameters, and with tuple you supply whatever number you like. (Up to some limit, but the limit should be large. On all implementations that I know of, it’s at least 10.) For example:

#include <tr1/tuple>
#include <string>

using namespace std;
using namespace std::tr1;
...
tuple<int, char, string> t = make_tuple(1, 'a', "xyz");

If a function returns multiple values as a tuple, then, of course, one way to get those values is one at a time, the same as with a pair:

tuple<int, int, int> tmp = foo();
int x = get<0>(tmp);
int y = get<1>(tmp);
int z = get<2>(tmp);

The syntax is a little different than pair’s first and second members, but the idea is similar. With tuples, however, there’s an easier way—you don’t even need that temporary variable. Instead, you can just write:

tie(x, y, z) = foo();

and let the library automatically handle all of this packing/unpacking.

Today, functions that need to return multiple values usually either pass in multiple reference parameters, or else define some ad hoc class to serve as a return type (think div_t). Now that we have tuple, which provides a dramatic improvement in usability, these clumsy workarounds might disappear.

Infrastructure: Smart Pointers and Wrappers

One reason tuple is useful is the same reason string is useful—it’s a primitive that higher level libraries, including other parts of the Standard Library, can use in their own interfaces. What’s important is that these types are a common vocabulary, so that two parts of a program, written independently of each other, can use them to communicate. The library extension technical report adds a number of other useful utility components of this nature.

One problem that appears in most programs is managing resource lifetimes. If you allocate memory or open a network socket, when does that memory get deallocated and when does the socket get closed? Two common solutions to that problem are automatic variables (“resource acquisition is initialization,” or RAII) and garbage collection. Both solutions are useful and important, and every C++ programmer should be familiar with them. However, neither is appropriate in every circumstance. RAII works best when resource lifetime is statically determined and tied to the program’s lexical structure, while garbage collection works better for memory than for other kinds of resources, and in any case is sometimes overkill. A third alternative, one that has been reinvented many times, is reference-counted smart pointers.

The basic idea behind reference-counted pointers is straightforward: Instead of trafficking in raw pointers of type T*, programs can use some kind of wrapper class that “looks like” a pointer. Just as with an ordinary pointer, you can dereference a smart pointer to access the T object it points to, or, more commonly, you can use the -> operator to access one of that object’s members. The difference is that the wrapper class can instrument its basic operations, its constructors and destructor and assignment operators, so that it can keep track of how many owners a particular object has.

The TR1 reference-counted pointer class is shared_ptr. In the simplest case, it works just like the standard auto_ptr class—you create an object the usual way with new, and use it to initialize the shared_ptr. If there aren’t any other shared_ptr instances that refer to that object, then the object is destroyed when the shared_ptr goes out of scope.

What’s more interesting is what happens if there are other instances that point to the same object—it doesn’t get destroyed until the last instance that refers to it goes away. You can confirm this by doing a simple test with a class that logs its constructors and destructors. In Listing Two, the two pointers, p1 and p2, both point to the same A object, and both of those pointers are destroyed when they go out of scope. But shared_ptr’s destructor keeps track of that, so it doesn’t destroy the A object until the last reference to it disappears.

Naturally, real examples are more complicated than this test case. You can also assign shared_ptrs to global variables, pass them to and return them from functions, and put them in STL containers. A vector<shared_ptr<my_class> > is one of the most convenient ways of managing containers of polymorphic objects (see my article “Containers of Pointers,” C/C++ Users Journal, October 2001).

Providing shared_ptr as part of TR1 has two major benefits:

• As with tuple and string, it gives programs a common vocabulary: If I want to write a function that returns a pointer to dynamically allocated memory, I can use a shared_ptr as my return type and be confident that the clients of my function will have shared_ptr available. If I had written my own custom reference-counting class, that wouldn’t have been true.

• Second (and perhaps more importantly), shared_ptr works. That’s not as trivial as it might seem! Many people have written reference-counting smart pointer classes but many fewer people have written ones that get all the corner cases right—especially in a multithreaded environment. Classes such as shared_ptr are surprisingly easy to get wrong, so you definitely want an implementation that has been well tested and that has seen lots of user experience.




Reference-counted pointers don’t completely remove the possibility of resource management bugs; some discipline by programmers is needed. There are two potential problems. First, suppose that two objects are pointing to each other. If x holds a shared_ptr to y, and y holds a shared_ptr to x, then neither reference count can ever drop to zero even if nothing else in the program points to either x or y. They form a cycle, and will eventually cause a memory leak. Second, suppose that you’re mixing memory-management policies, and that you have both a shared_ptr and a regular pointer to the same object:

my_class* p1 = new my_class;
shared_ptr<my_class> p2(p1);
...

If p1 outlives the last shared_ptr that’s a copy of p2, then p1 becomes a dangling pointer—and ends up pointing to an object that has already been destroyed. Trying to dereference p1 will probably make your program crash. Some smart pointer libraries try to prevent this by making it impossible to access a smart pointer’s underlying raw pointer, but shared_ptr doesn’t. In my opinion, that was the right design decision. A general-purpose smart pointer class has to expose an underlying pointer somehow (otherwise operator-> can’t work), and throwing up artificial syntactic barriers just makes legitimate uses harder.

These two sources of bugs go together because people commonly mix reference-counted pointers and raw pointers precisely to avoid cycles. If two objects need to point to each other, a common strategy is to choose one of those links as owning and the other as nonowning; the owning link can be represented as a shared_ptr, and the nonowning link as something else. This is a perfectly valid technique, especially if you resist the temptation to use a raw pointer for that “something else.” The TR1 smart pointer library provides a better alternative—weak_ptr. A weak_ptr points to an object that’s already being managed by a shared_ptr; it doesn’t prevent that object from being destroyed when the last shared_ptr goes out of scope, but, unlike a raw pointer, it also can’t dangle and cause a crash, and again unlike a raw pointer, it can safely be converted into a shared_ptr that shares ownership with whatever other shared_ptrs already exist. Listing Three is an example where it makes sense to combine shared_ptr and weak_ptr.

TR1 includes other primitives as well as smart pointers, including components that make it easier to use functions and function objects. One of the most useful is the new function wrapper class.

Suppose you want to write something that takes two arguments of type A and type B and returns a result of type C. C++ gives you lots of choices for how to express that operation! You might write an ordinary function:

C f(A a, B b) { ... }

Or, if A is your own class, you might express this as a member function:

class A {
    ...
    C g(B b);
};

Or, as the STL does with such classes as std::plus<T>, you might choose to write this operation as a function object:

struct h {
    C operator()(A, B) const {...}
};

There are syntactic and semantic differences between these options, and member functions, in particular, are invoked in a different way. You can encapsulate the syntactic differences using the Standard’s mem_fun adaptor (or more conveniently, the TR’s new mem_fn adaptor or bind adaptor), but there’s one thing this won’t help with—the types. We have three different ways of performing the same fundamental operation, A×B->C, and as far as the language is concerned they all have different types. It would be useful to have some single type that could represent all of these different versions.

That’s what the TR1 function wrapper does. The type function<C(A,B)> represents any kind of function that takes A and B and returns C, whether it’s an ordinary function, member function, or function object. Again, implementing function is quite difficult but all of that complexity is hidden. From the user’s point of view, it just does what you expect: You can instantiate the function template with any reasonable types (as usual there’s some limit on the number of parameters, but the limit should be large), you can assign anything to it that makes sense, and you invoke it using the ordinary function call syntax.

Use function<void(my_class)>, for example, to hold and invoke three different kinds of functions, all of which take a my_class argument and return nothing; see Listing Four. Putting these function<void(my_class)> objects into a vector may seem like an unnecessary complication. I did it to give a hint about why this is useful. In one word—callbacks. You now have a uniform mechanism that higher level libraries can use to hold the callbacks they’re passed by their clients. I expect that in the future we will see this mechanism in the interfaces of many new libraries, especially ones designed for dynamism and loose coupling.

Application Support: Regular Expressions

Low-level components that make it easier to write other libraries are important, but so are library components that directly solve programmers’ problems. Most programs have to work with text, and one of the classic techniques for text processing is pattern matching by regular expressions. Regular expressions are used in compilers, word processors, and any program that ever has to read a configuration file. Good support for regular expressions is one of the reasons that it’s so easy to write simple “throwaway” scripts in Perl. Conversely, the lack of support for regular expressions is one of C++’s greatest weaknesses. Fortunately, as of TR1, we now have that support.

TR1 regular expressions have many features and options but the basic model is quite simple; it should seem familiar if you’ve used regular expressions in languages like Python or Java. First, you create a tr1::regex object that represents the pattern you’d like to match, using the standard syntax from ECMA-262 (that is, the same syntax that JavaScript uses). Next, you use one of the regular expression algorithms (regex_match, regex_search, or regex_replace) to match that pattern against a string. The difference between “match” and “search” is that regex_match tests whether the string you’re trying to match is described by the regex pattern, while regex_search tests whether the string contains something that matches the pattern as a substring. Both regex_match and regex_search return bool, to tell you whether the match succeeded. You can also, optionally, pass in a match_results object to get more details.

Actually, you probably won’t use match_results explicitly. The TR1 regular expression library, like the standard IOStream library, is templatized, but most of the time you can ignore that feature. You probably use istream, and not basic_istream. Similarly, you will probably use regex, which is an abbreviation for basic_regex<char>, instead of using basic_regex directly. In the case of match_results, you will probably use one of two specializations: smatch if you’re searching through a string, and cmatch if you’re searching through an array of char.

Listing Five shows how you might use TR1 regular expressions to write the core of the UNIX grep utility. With do_grep, you’re only concerned with whether you have a match, not with any substructure. But one of the other main uses of regular expressions is to decompose compound strings into individual fields, as in Listing Six(a). In Listing Six(b) we take this further, using regular expressions to convert between American and European customs for writing dates.

When you use regex_search, it only shows you the first place where there’s a match, even if there may be more matches later on in the string. What if you want to find all of the matches? One answer would be to do a search, find the first match, examine the match_results to find the end, search through the rest of the string, and so on. But that’s harder than it needs to be. This is a case of iteration, so naturally you can just use an iterator. To collect all matches into a vector:

const string str = "a few words on regular expressions";
const regex pat("[a-zA-Z]+");
sregex_token_iterator first(str.begin(), str.end(), pat);
sregex_token_iterator last;
vector<string> words(first, last);

The Future

Clearly, the entire technical report is too much to cover in a single article. I mentioned shared_ptr and function, but I only alluded to reference_wrapper, result_of, mem_fn, and bind. I mentioned tuple and the unordered associative containers, but I left out the other new STL container in TR1, the fixed-size STL container array<T, N>. I entirely left out type traits, because it’s mostly useful to library writers (it’s very exciting if you do happen to be a library writer or if you do template metaprogramming!), and I left out the extensive work on random-number generation and on mathematical special functions. Either you care deeply about Bessel functions, hypergeometric functions, and the Riemann ζ function or you don’t; if you do, now you know where to find them. And finally, I left out the section on C99 compatibility. That’s for essentially the opposite reason as the others—it’s useful, it’s important, and it just works. C99 functions in C++ should work just the way you would expect.

At this writing, I’m not aware of any complete implementations of the TR1 libraries. Still, there is work being done:

• Metrowerks CodeWarrior 9.0 ships with a partial implementation of TR1, including such classes as function, shared_ptr, and tuple.

• Many parts of TR1, including the smart pointers, regular expressions, and random number generators, were originally Boost libraries (http://www.boost.org/). Boost releases are available, and free, for all popular compilers and platforms.

• Dinkumware is in the process of implementing the entire technical report. It is the only company I know of that’s currently working on the TR1 special functions, like cyl_bessel_j and riemann_zeta, with the goal of achieving accuracy comparable to today’s best implementations of functions like sin and cos.

• The GNU libstdc++ project, which writes the C++ library that ships with GCC, is actively working on implementing TR1. The next release of GCC, GCC 4.0, will ship with a partial implementation of TR1—exactly how partial is hard to say, since TR1 components are being added to libstdc++ on a daily basis.

TR1 is real, not vaporware. All of the code samples in this article are real; I compiled them. (Well, except for the “…” parts.) The GNU libstdc++ implementation is still experimental and incomplete, but already complete enough that I was able to use it to test the examples of unordered associative containers, tuple, functional, and smart pointers. Because a GNU implementation of TR1 regular expressions doesn’t yet exist, I used Boost.Regex. All I had to do was change the header and namespace names.

With the Standards committee nearly done with TR1 and implementation work underway, it’s time to think about the next round of library extensions. What can we expect to see? It’s too early to say. The Standards committee hasn’t started discussing proposals for TR2 yet. I’m maintaining a list of some of the extensions people have asked for (http://lafstern.org/matt/wishlist.html), but there’s a long way between a wish and a fully fleshed out proposal.

My own wish is better support for common practical tasks: parsing HTML and XML, manipulating GIF and JPEG images, reading directories on file systems, using HTTP. I’d like simple tasks, like trimming whitespace from a string or converting it to uppercase, to be simple. We’ve done an excellent job of creating general infrastructure to help library writers; now it’s time to use some of that power to improve the experience of day-to-day development.

What’s important to remember, though, is that Standards committees standardize; they don’t invent. The only things that have a chance of making it into TR2 will be the ones that individuals feel strongly enough about to do real work on. Perhaps you’ll be inspired by some of the entries on the “wish list,” or by my suggestions, or by libraries from Boost or some other organization. As I wrote a few years ago, when work on TR1 had just begun (see “And Now for Something Completely Different,” C/C++ Users Journal, January 2002), a library extension proposal should explain why this particular problem area is important, what the solution should look like, how your work relates to previous work in the same area, and how your work affects the rest of the library. It’s a nontrivial task, but it’s easier now than it was then: One thing we have now, that we didn’t before, is examples of what library extension proposals can look like. The proposals that were accepted into TR1 are collected at http://open-std.org/jtc1/sc22/wg21/docs/library_technical_report.html, and they can serve as models for TR2 proposals.

Now that the first Technical Report on C++ Library Extensions has essentially been completed, it’s time to start thinking about the next round of library extensions. What comes next is partly up to you!

References

Austern, Matthew H. Generic Programming and the STL: Using and Extending the C++ Standard Template Library, Addison-Wesley, 1998.

Josuttis, Nicolai. The C++ Standard Library: A Tutorial and Reference, Addison-Wesley, 1999.

Langer, Angelika and Klaus Kreft. Standard C++ IOStreams and Locales: Advanced Programmer’s Guide and Reference, Addison-Wesley, 2000.

Lischner, Ray. STL Pocket Reference, O’Reilly & Associates, 2003.

Meyers, Scott. Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library, Addison-Wesley, 2001.

Musser, David R., Gilmer Derge, and Atul Saini. STL Tutorial and Reference Guide: C++ Programming with the Standard Template Library, Second Edition, Addison-Wesley, 2001.

Plauger, P.J., Alexander A. Stepanov, Meng Lee, and David R. Musser. The C++ Standard Template Library, Prentice Hall, 2000.

DDJ

(Listings begin on page 72.)




Listing One

#include <tr1/unordered_map>
#include <iostream>
#include <string>

int main() {
    using namespace std;
    using namespace std::tr1;

    typedef unordered_map<string, unsigned long> Map;
    Map colors;

    colors["black"] = 0x000000ul;
    colors["red"] = 0xff0000ul;
    colors["green"] = 0x00ff00ul;
    colors["blue"] = 0x0000fful;
    colors["white"] = 0xfffffful;

    for (Map::iterator i = colors.begin(); i != colors.end(); ++i)
        cout << i->first << " -> " << i->second << endl;
}

Listing Two

#include <cassert>
#include <iostream>
#include <tr1/memory>

using namespace std;
using namespace std::tr1;

struct A {
    A() { cout << "Create" << endl; }
    A(const A&) { cout << "Copy" << endl; }
    ~A() { cout << "Destroy" << endl; }
};

int main() {
    shared_ptr<A> p1(new A);
    shared_ptr<A> p2 = p1;
    assert(p1 && p2 && p1 == p2);
}

Listing Three

class my_node {
public:
    ...
private:
    weak_ptr<my_node> parent;
    shared_ptr<my_node> left_child;
    shared_ptr<my_node> right_child;
};

Listing Four

#include <tr1/functional>
#include <vector>
#include <iostream>

using namespace std;
using namespace std::tr1;

struct my_class {
    void f() { cout << "my_class::f()" << endl; }
};

void g(my_class) {
    cout << "g(my_class)" << endl;
}

struct h {
    void operator()(my_class) const {
        cout << "h::operator()(my_class)" << endl;
    }
};

int main() {
    typedef function<void(my_class)> F;
    vector<F> ops;
    ops.push_back(&my_class::f);
    ops.push_back(&g);
    ops.push_back(h());

    my_class tmp;
    for (vector<F>::iterator i = ops.begin(); i != ops.end(); ++i)
        (*i)(tmp);
}

Listing Five

#include <regex>
#include <string>
#include <iostream>

using namespace std;
using namespace std::tr1;

bool do_grep(const string& exp, istream& in, ostream& out)
{
    regex r(exp);
    bool found_any = false;
    string line;

    while (getline(in, line))
        if (regex_search(line, r)) {
            found_any = true;
            out << line << '\n';
        }

    return found_any;
}

Listing Six

(a)

const string datestring = "10/31/2004";
const regex r("(\\d+)/(\\d+)/(\\d+)");
smatch fields;
if (!regex_match(datestring, fields, r))
    throw runtime_error("not a valid date");

const string month = fields[1];
const string day = fields[2];
const string year = fields[3];

(b)

const string date = "10/31/2004";
const regex r("(\\d+)/(\\d+)/(\\d+)");
const string date2 = regex_replace(date, r, "$2/$1/$3");

DDJ



Software reuse has long been on the radar of many companies because of its potential to deliver quantum leaps in production efficiencies. In fact, basic, or ad hoc, software reuse already exists within most organizations. This reuse of documents, coding styles, components, models, patterns, knowledge items, and source code is rarely discussed because it usually starts and ends as an informal grass roots effort, with management having little understanding of how it started, why it persists, and how they might proactively extract larger benefits from it.

With an understanding that some form of reuse very likely already exists within most, if not all, software development organizations, the questions emerge: “How can we measure the level of reuse that already exists?”, “What can be done to increase reuse benefits?”, and “How can we track our progress along the way?”

Finding a Unified Unit of Measure

The first step to being able to measure how instances of software reuse are impacting operations is to define a base unit of measure that you can use across all instances.

The primary issue in finding such a unified unit of measure lies in the fact that reuse is not limited to the source or machine code. In fact, all of the assets associated with software development are possible targets for reuse (components, source code, documents, models, web services, and the like). As a result, artifact-specific measures such as lines of code, pages of documentation, or classes in a diagram are simply not generic enough to be useful, and for the most part do not readily translate into real corporate costs. We suggest using hours as the base unit of measure. Work hours translate directly into costs for a software organization, are easily measurable, and can be universally applied across all artifacts.

Some have used “average developer hours” as the base unit of measure, with average developer hours defined as the number of productive hours that an average developer typically spends directly on software development (15 hours per week, for example). Because the organization pays for software developers regardless of whether they work on a task directly or indirectly related to the software project, we propose sticking to easily measurable worked hours as the base unit of measure since it is less subjective.

Another reason cited for using “average developer hours” is that there are cases of certain developers within the organization who are much more (or less) productive than the “average” developer. Presumably, however, developers who are extra productive will be recognized as such, and this trait will be reflected in their salary. Salary, which is a market-determined metric, is therefore likely the best and only unbiased measure that we can use to compensate for productivity differences between developers. As long as we use salaried rates for each resource as opposed to some average for the group, we should be able to implicitly keep track of these differences in productivity as we measure dollar cost savings.

Measuring Ad Hoc Software Reuse

Because there are no set-up or other costs associated with ad hoc reuse, the only costs to the enterprise relate to the time spent searching for and analyzing whether a particular reuse candidate can in fact help accelerate the development of a current task. If the search yields a positive result, there are also subsequent costs associated with modifying/integrating the reusable item into the current project. The risks associated with ad hoc reuse initiatives relate to the time spent to determine whether reuse candidates exist, because this time is nonrecoverable and is added to the total development time in the event that no reuse item is located.

Over multiple search and reuse iterations, the combined time spent searching, understanding, and integrating the found content into the current project must be less than the time to develop all of the integrated content from scratch for the reuse efforts to be judged as successful.

In mathematical terms, this is written as follows, where the expression on the left signifies the total time to develop the content over all reuse or attempted reuse iterations, and the expression on the right indicates the actual or expected time required to build all of the combined content from scratch:

(TLR+U)*N + i*SR*MOD + i*(1-SR)*BUILD < i*BUILD

In this case, TLR = Time to locate eachpotentially reusable item; U = Time to un-derstand suitability of each potentiallyreusable item for current task; N = Num-ber of items that were examined, includ-ing each of the items that finally get reused(if any); i = number of attempted instancesof reuse; SR = Search hit rate. Percentageof i that yielded a positive search result(for instance, the user discovered a suit-able reuse candidate that gets incorporat-ed into the project); MOD = Time to in-tegrate/modify the reused item for currentpurposes; and BUILD = Time to build anelement from scratch. This is the actual or

Examining threedifferent approaches tosoftware reuse

Lior is chief technology officer at OSTnet andcan be contacted at http://www.ostnet.com/.Jan is a software engineer at Inovant andcan be contacted at http://www.visa.com/.

http://www.ddj.com Dr. Dobb’s Journal, June 2005 73

Measuring the Benefits of Software Reuse

LIOR AMAR AND JAN COFFEY

“Some form of software reuse exists within most organizations”


estimated time spent building the software. To calculate expected time to project completion, developers can use any estimation methods currently in place internally (project plans, function point analyses, black magic voodoo).

Similarly, by taking the percentage difference between the no-reuse and ad hoc reuse scenarios (for instance, (no reuse - ad hoc reuse)/(no reuse)*100), you can arrive at the percentage of savings generated by ad hoc reuse in the enterprise. After simplifying, this equation looks as follows:

% Savings = [SR - (TLR+U)*(N/i)/BUILD - SR*(MOD/BUILD)]*100

In this instance, (TLR+U)*(N/i) is the average time spent searching for a reusable item before an item is found or the user decides to build from scratch. This number is typically less than five minutes. If the average size of the item that you are looking to build is over eight hours (a reasonable assumption), then this term is negligible compared to BUILD, and the ratio of the two terms is essentially 0.

MOD/BUILD is the relative cost of integrating an element versus building it from scratch. This value has been determined over numerous empirical studies to be in the range of 0.08 for black box component reuse to 0.85 for a small snippet of code.

We'll use an average search hit rate SR of 20 percent (for example, a user finds a useful item one out of every five times that he actually tries to locate something) and 0.75 for an average MOD/BUILD value. The MOD/BUILD value is on the high end of its normal range since the granularity of the things being reused in an ad hoc reuse initiative is typically small, as are the incremental benefits achieved. This is a fair assumption because the reuse initiative is not being managed and the developers' source for the content being reused is not optimized (that is, the content is taken from the Internet, friends, and other unmanaged sources).

Plugging the aforementioned assumptions into the equation, we find that ad hoc reuse generates savings equal to 5 percent of development costs. Although it appears small on a percentage basis, this number can actually be quite large in dollar terms given the high total cost of the development.

For example, if a company's total IT salaries are $5 million, the 5 percent increase in productivity would equate to $250,000 in annual savings.
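As a quick sanity check, the 5 percent figure and the dollar example can be reproduced directly from the simplified equation. This sketch is ours, not the authors'; the function name and parameter names are illustrative:

```python
def pct_savings_ad_hoc(sr, search_term, mod_over_build):
    """% Savings = [SR - (TLR+U)*(N/i)/BUILD - SR*(MOD/BUILD)] * 100,
    where search_term stands in for the (TLR+U)*(N/i)/BUILD ratio."""
    return (sr - search_term - sr * mod_over_build) * 100

# The article's assumptions: SR = 0.20, search term ~ 0, MOD/BUILD = 0.75.
savings = pct_savings_ad_hoc(0.20, 0.0, 0.75)
print(f"{savings:.0f}% of development costs")                      # 5%
print(f"${5_000_000 * savings / 100:,.0f} on $5M in IT salaries")  # $250,000
```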

Evolutionary Software Reuse

Regardless of the process or processes used to develop software within an organization, there are easy-to-implement improvements that can be initiated to enhance the returns currently being realized with ad hoc reuse. Although the tasks and the ways of measuring results will not change from one process to the next, the artifacts to be reused and the point at which the reuse-related tasks intervene in the process will vary. By way of example, companies following an RUP process will typically reuse such things as use cases, subsystems, components, source code, and documents, and these will be accessed at various points during the elaboration, construction, and transition phases.

Without significantly altering their core development process, companies can begin to benefit to a greater degree by actively managing their existing software assets. In an "evolutionary reuse" practice, users are encouraged to identify all potentially reusable items and make them available to others in the organization, without investing any time up-front to make them "reusable." During each instance of reuse, the individual reusing the asset is encouraged to refactor the reusable artifacts and republish the upgraded asset, thereby evolving it towards black box reuse.

By following this reuse methodology, no initial investment is required to generalize the asset in anticipation of reuse that may never occur. Each asset is only extended to the extent needed to accommodate the current requirements; thus there are no sunk costs on assets that were created for reuse but never reused.

To implement a more structured evolutionary reuse effort, companies need to:

• Provide better access to their own internal software content.

• Promote the development of well-factored software (a process that is already quite familiar to most software developers).

• Measure results and gradually refine the reused content to ensure growing incremental benefits with each new instance of reuse.

Looking at how we model and measure evolutionary software reuse, we first need to identify all incremental costs and benefits that are not present in an ad hoc initiative. These are:

• Users who locate a reusable asset will typically need to refactor the asset for current purposes. Most of this effort is captured in MOD, but there may occasionally be additional effort involved with restructuring the asset to ensure that it remains well-factored. Since this effort is only necessary when something is to be reused, the total incremental cost is i*SR*FACT, where FACT is the average incremental time to refactor assets for entry into the asset repository.

• In addition to ensuring that the reusable artifacts are well factored, there are additional costs associated with creating assets from your reusable artifacts (for instance, attaching metadata to make the artifacts easier to find, understand, and reuse) and managing a repository of assets, although selecting the right repository tool for your organization can minimize these costs. These costs are accounted for as REP for each new asset.

Inserting these terms into the ad hoc reuse equation and taking the percentage of savings, we get (after simplifying):

% Savings = [SR - (TLR+U)*(N/i)/BUILD - SR*((MOD+FACT)/BUILD) - REP/BUILD]*100

As before, (TLR+U)*(N/i)/BUILD is approximately 0 and can be ignored. Interestingly, the term (MOD+FACT)/BUILD in the evolutionary reuse scenario continues to vary between 0.08 and 0.85 and, as an average, actually is smaller than MOD/BUILD in an ad hoc scenario. By way of example, in an ad hoc reuse scenario, if two developers reuse the same artifact on separate occasions, their efforts will likely be duplicated because the improvements made by the first developer reusing the artifact will likely not be available to the second developer (unless they know of each other's work). If one spends 20 hours modifying the artifact for reuse, the other will also likely spend a similar amount of time, resulting in a combined MOD of 40 hours.

In an evolutionary reuse scenario, the first developer will likely spend a few more hours modifying and refactoring the artifact to make sure that its interfaces are clean and easily consumable. Because the first developer publishes this asset after he is done, the second developer will reuse the improved asset, thus requiring only a fraction of the time to understand, modify, and refactor it (eight hours, for instance). So if the first developer spent 22 hours modifying and refactoring the artifact, the total of MOD+FACT over the two reuse instances under the evolutionary reuse scenario will be only 30 hours. Over hundreds of reuse instances, it is easy to see how the average of MOD+FACT will continue to trend lower as the repository of software assets grows and matures. At the limit, when an asset in the repository is black boxed, (MOD+FACT)/BUILD will equal 0.08 because it will no longer be necessary to refactor the asset (FACT=0).

The term REP/BUILD in the equation relates primarily to the time required to publish assets as they are located. This time will vary depending on the workflow process used to publish assets and on the amount of metadata that the organization

74 Dr. Dobb’s Journal, June 2005 http://www.ddj.com


determines is necessary to accurately describe the asset. In general, this time is very small and its costs are more than offset by the reduction in the time others spend trying to understand what an artifact does when it is located.

By following an evolutionary reuse practice, the company very quickly has at its disposal a rich asset repository filled with reusable company content that:

• Is exclusively focused on its particular domain of operation.

• Has been tested and approved for use within the company.

As a result, developers looking to reuse will quickly be able to determine whether useful reusable artifacts exist and will also be able to locate more content, with greater precision, thus increasing the search hit rate. While we will use an increased search hit rate of 40 percent in the aforementioned equation, it should be noted that the search hit rate will continue to increase as the repository grows and more content becomes available for reuse.

We will use 0.5 for an average (MOD+FACT)/BUILD value, which is high since the most popular assets will be reused multiple times, resulting in many cases of black box reuse (0.08) and driving down the average. Plugging in the stated numbers, we find that the evolutionary reuse scenario generates very respectable savings of 20 percent. This will amount to a dollar savings of $1 million using salaries of $5 million, as above. Interestingly, this value can be extracted without a material initial investment in time and effort to get started.
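The 20 percent figure follows from the same arithmetic. A hypothetical check of our own (not the authors' code), using the stated assumptions of SR = 0.4, (MOD+FACT)/BUILD = 0.5, and search and REP terms of approximately 0:

```python
def pct_savings_evolutionary(sr, search_term, mod_fact_over_build, rep_over_build):
    """% Savings = [SR - (TLR+U)*(N/i)/BUILD - SR*((MOD+FACT)/BUILD)
                    - REP/BUILD] * 100."""
    return (sr - search_term - sr * mod_fact_over_build - rep_over_build) * 100

savings = pct_savings_evolutionary(0.40, 0.0, 0.5, 0.0)
print(f"{savings:.0f}% of development costs")                      # 20%
print(f"${5_000_000 * savings / 100:,.0f} on $5M in IT salaries")  # $1,000,000
```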

Systematic Software Reuse

When people refer to software reuse without qualifying further, they are typically speaking about traditional "systematic software reuse." Systematic software reuse is a highly structured practice that involves architects and developers identifying potentially reusable components in a project or family of projects in advance of their development.

Systematic software reuse efforts include "standards police," review committees, and/or special "tools teams" responsible for specifically developing reusable assets. Because it is believed that future modifications can be foreseen, developers practicing systematic software reuse build in abstractions to cover any number of possible mutations and implement "hooks" for future iterations.

The end goal of all of this up-front effort is to reduce the time required to integrate the reusable component into a new project by enabling black-box software reuse to the largest extent possible (for instance, MOD/BUILD=0.08). However, over-abstracting components ahead of time can make code harder for others to read and understand and is an inadvertent problem associated with this practice.

While the leverage associated with systematic software reuse is very large because each additional instance of reuse provides enormous benefits, the added up-front costs dramatically increase the risks associated with its implementation.

To properly measure the impact that systematic software reuse can have on a development environment, we begin with the ad hoc reuse approach and add all additional tasks and their resulting benefits into the equation. Of particular note:

• Because reusable components are "built" to be reusable, there are costs associated with building these components over and above what it would otherwise cost to build them for a given set of software requirements. Industry-accepted figures are that it typically costs anywhere between 50 percent and 150 percent extra to build a component for reuse, versus building it for a single use. We'll use RCOM to identify this extra effort in our equations (to be shown shortly). In the case of evolutionary reuse, this extra effort to make an asset reusable is only expended at the time of consumption by the person who is looking to reuse the component, and this effort is captured in the term (MOD+FACT).

• The cost of reusing a component built for reuse will be much lower than in other types of reuse, with MOD/BUILD ranging between 0.08 and 0.2.

• Because systematic reuse components are built for reuse, there will typically only be a small number of them available for reuse. Also, the availability of these components should be fairly easy to communicate within the organization, meaning that the search hit rate will be much higher in a systematic reuse effort, although the actual number of reuse instances i will be dramatically lower, especially in the early years.

For a systematic software reuse effort to be profitable, therefore, the following equation representing a systematic software reuse initiative must hold true:

(TLR+U)*N + i*SR*MOD + i*(1-SR)*BUILD + j*REP + j*(1+RCOM)*BUILD < i*BUILD

where j = number of reusable software components that have been built, and RCOM = extra time required to build a reusable software component versus building one with equivalent functionality that is not designed to be reusable.

Taking the percentage difference between the no-reuse and systematic software reuse scenarios, we can arrive at the % Savings generated by systematic software reuse in the enterprise. After simplifying, this equation looks like:

% Savings = [SR - (TLR+U)*(N/i)/BUILD - SR*MOD/BUILD - (j*REP)/(i*BUILD) - j*(1+RCOM)/i]*100

For demonstration purposes, and to simplify this equation, assume that the search hit rate SR approaches 1 and that RCOM is 50 percent, the low end of its industry-accepted value. As well, we'll use a favorable MOD/BUILD value of 0.08 and will assume that (TLR+U)*(N/i)/BUILD approximates 0, as was the case in each of the aforementioned scenarios. Finally, we'll assume that the expression (j*REP)/(i*BUILD) is also equal to zero, which should hold unless j (the number of reusable components that have been built and inserted into the catalog) is orders of magnitude greater than i (the number of reused elements), something that will not happen in all but the most disastrous scenarios.

Plugging in the favorable values just listed for each expression and reducing the equation, we get:

% Savings = [0.92 - 1.5*j/i]*100

What we can interpret from this equation is that the extra 50 percent spent to build each reusable component adds up very quickly and needs to be amortized over multiple reuse iterations for systematic software reuse to generate positive savings. In fact, if each item built is not reused on an average of at least 1.63 projects (that is, if i/j < 1.5/0.92, or about 1.63), then the reuse effort will fail to generate a positive return.
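The break-even point falls straight out of the reduced equation. A small sketch of our own (the names are illustrative, not the article's):

```python
def pct_savings_systematic(j_over_i, sr=1.0, mod_over_build=0.08, rcom=0.5):
    """Reduced form: % Savings = [SR - SR*MOD/BUILD - (1+RCOM)*(j/i)] * 100,
    with the search and (j*REP)/(i*BUILD) terms taken as ~0."""
    return (sr - sr * mod_over_build - (1 + rcom) * j_over_i) * 100

# Savings reach zero when 0.92 = 1.5*(j/i), i.e. when each component
# is reused on i/j = 1.5/0.92 ~ 1.63 projects.
break_even_reuses = 1.5 / 0.92
print(f"break-even at {break_even_reuses:.2f} reuses per component")  # 1.63
print(f"theoretical ceiling: {pct_savings_systematic(0.0):.0f}%")     # 92%
```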

Overall, systematic software reuse has the potential to generate very large savings (theoretically as high as 92 percent if one magical component were built that could be reused everywhere, which of course, is not really possible). On the negative side, systematic software reuse is highly sensitive to the ratio of j/i, meaning that participants in the initiative need to be highly skilled at predicting which reusable components need to get built to amortize them over the largest number of reuse instances. Failing to accurately pick the right components to build or mismanaging the systematic software reuse initiative has the potential to very quickly generate costly negative results.

Comparison

Using the aforementioned methods for calculating the costs and benefits of each of the three reuse implementation methods covered and deriving an ROI from each, we arrive at the ROI graph in Figure 1.

Again, systematic software reuse has the potential to be highly negative if the assets that are built are not quickly



reused on multiple projects. Systematic reuse does, however, have the highest slope in the early days, meaning that it can provide a very quick ROI if properly implemented.

Evolutionary reuse starts off with low incremental benefits to the organization but quickly begins to generate increasing value as content is refactored and made available to end users. It provides a nice compromise for companies looking to enhance the benefits they are currently getting from their ad hoc reuse efforts but who are unwilling or unable to invest the time required to set up and manage a structured systematic reuse effort.

Finally, ad hoc reuse currently generates modest benefits to an organization and it will continue to do so, although these benefits grow slowly and are far from being optimized.

Conclusion

Measuring productivity and changes in productivity is important when implementing any new software tool or initiative. To that end, the overall techniques just used to determine the costs and benefits related to different reuse practices can also be applied to measure savings associated with other initiatives. It is only in comparing these different returns using standard methods and units of measure that you will be able to make informed decisions and set quantifiable milestones for your company.

As a starting point, additional work needs to be done by most companies to gain a better understanding of where development efforts are currently being focused, which tasks are the most costly, which are being duplicated, and which can be altered to generate the highest incremental returns.

The returns just quantified relate directly to the savings that organizations can hope to gain through developer productivity enhancements. These savings are the minimum benefits realizable since they exclude all other costs (such as overhead) and tertiary benefits such as increased IT agility, reduced defects and maintenance costs, and the ability to deliver new products and services at an accelerated rate to establish or maintain key strategic competitive advantages. As we have seen, depending on the path chosen, establishing this advantage through reuse does not necessarily require a huge up-front investment in time and human resources.

DDJ


Figure 1: Return on investment analysis.


The heart of Linux is the kernel, which is in charge of scheduling tasks, managing memory, performing device I/O operations, servicing system calls, and carrying out other critical tasks. Unlike other UNIX-like operating systems that implement a pure monolithic architecture, Linux lets you dynamically load/unload portions of the kernel (modules). This lets you provide support for new devices and add system features without recompiling or rebooting the kernel, then unload them when they are not needed anymore.

The possibility of loading/unloading modules is a key feature for driver programmers because it lets you test drivers during development without rebooting the kernel at every change, thus dramatically speeding up the test-and-debug process.

Kernel 2.6 introduces significant changes with respect to kernel 2.4: New features were added, existing ones removed, and some marked as deprecated; the deprecated ones are still usable, but with severe limitations. Consequently, modules written for kernel 2.4 don't work anymore, or work with grave restrictions. In this article, I examine these changes.

A Minimal Module

Listing One is the shortest possible implementation of a module. Adhering to this template lets you write code that can operate equally as a module or statically linked into the kernel, without modifications or #ifdefs.

The initialization and cleanup functions can have arbitrary names, and must be registered via the module_init() and module_exit() macros. The module_init(f) macro declares that function f must be called at module insertion time if the file is compiled as a module, or otherwise at boot time. Similarly, macro module_exit(f) indicates that f must be called at module removal time (or never, if built-in). The specifier __init is effective only when the file is compiled into the kernel, and indicates that the initialization function can be freed after boot. On the other hand, __exit marks functions that are useful only for module unloading and, therefore, can be completely ignored if the file is not compiled as a module.

Compiling Modules

The 2.6 kernel's build mechanism ("kbuild") has been deeply reengineered, affecting how external kernel modules are compiled. In 2.4, module developers manually called GCC, including command-line preprocessor symbol definitions (such as MODULE or __KERNEL__), specifying include directories and optimization options. This approach is no longer recommended because external modules should be built as if they were part of the official kernel. Consequently, kbuild automatically defines preprocessor symbols, optimization options, and include directories. The only thing you need to do is create a one-line makefile:

obj-m := your_module.o

where your_module is the name of your module, whose source is in the file your_module.c. You then type a command line such as:

make -C /usr/src/linux-2.6.7 SUBDIRS=`pwd` modules

The output provided by the build process is:

make -C /usr/src/linux-2.6.7 SUBDIRS=/root/your_dir modules
make[1]: Entering directory '/usr/src/linux-2.6.7'
  CC [M]  /root/your_dir/your_module.o
  Building modules, stage 2.
  MODPOST
  CC      /root/your_dir/your_module.mod.o
  LD [M]  /root/your_dir/your_module.ko

Daniele is a Ph.D. student at Politecnico di Milano (Italy), where he currently works on source-level software energy estimation. He can be contacted at [email protected].

Loadable Modules & The Linux 2.6 Kernel


Changes to the kernel mean changes must be made elsewhere

DANIELE PAOLO SCARPAZZA

“The 2.6 module loader implements strict version checking”


make[1]: Leaving directory '/usr/src/linux-2.6.7'

In the end, a new kernel module is available in your build directory under the name of your_module.ko (the .ko extension distinguishes "kernel objects" from conventional objects). With a more elaborate Makefile (such as Listing Two), you can avoid typing this command line.

Module Versioning

The 2.6 module loader implements strict version checking, relying on "version magic" strings ("vermagics"), which are included both in the kernel and in each module at build time. A vermagic, which could look like "2.6.5-1.358 686 REGPARM 4KSTACKS gcc-3.3," contains critical information (for example, an extended kernel version identifier, the target architecture, compilation options, and compiler version) and guarantees compatibility between the kernel and a module. The module loader compares the module's and kernel's vermagics character-for-character, and refuses to load the module if differences are detected. The strictness of this check complicates things, but was advocated after compatibility problems arose when loading modules compiled with a different GCC version than the kernel.

When compiling modules for a running kernel that you may not want to recompile, when cross compiling for a deployment box that you do not want to reboot, or when preparing a module binary for a kernel provided with a given Linux distribution, your module's vermagic must exactly match your target kernel's vermagic. To do this, you must exactly duplicate, during module compilation, the build environment present at kernel compilation time. This is done by:

1. Using the same configuration file as the kernel (since the configuration file used to compile the kernel is available in most cases under /boot, a cp /boot/config-`uname -r` /usr/src/linux-`uname -r`/.config command is enough in most cases).

2. Using the same kernel top-level Makefile (again, it should be available under /lib/modules/2.6.x/build; therefore, the command cp /lib/modules/`uname -r`/build/Makefile /usr/src/linux-`uname -r` should do).

Module Licensing

The Linux kernel is released under the GNU General Public License (GPL), whose purpose is to grant users rights to copy, modify, and redistribute programs, and to ensure that those rights are preserved in derivative works:

6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License.

The practical case of a kernel module depending on a second module (a common case in Linux) is not explicitly mentioned in the GPL, yet some interpretations of its underlying philosophy postulate that a proprietary module should not depend on a GPL-licensed one, because the latter would restrict the rights granted to the user by the former. Module writers advocating this interpretation can now enforce this policy with the EXPORT_SYMBOL_GPL() macro in place of EXPORT_SYMBOL(), thus exporting symbols that can be linked only by modules specifying a GPL-compatible license.

With this in mind, all module writers are asked to declare the license under which their module is released, via the macro MODULE_LICENSE(). Table 1 lists the licenses and respective ident strings currently supported by the kernel (all ident strings indicate free software except for the last one). Additionally, the indication of license makes it possible for users to verify that their system is free, the free development community can ignore bug reports involving proprietary modules, and vendors can do likewise based on their own policies.

When no license is specified, a proprietary license is assumed. Modules with a proprietary license cause the following warning when loading:

your_module: module license 'Proprietary' taints kernel.

and force flags must be specified to have the module properly loaded.

The macro EXPORT_NO_SYMBOLS is deprecated and not needed anymore because a module exporting no symbols is the norm.

Parameter Passing

The old parameter-passing mechanism, based on the MODULE_PARM() macro, is obsolete. Modules should define their parameters via a call to the macro module_param(), whose arguments are:

• The name of the parameter (and associated variable).

• Its type (chosen among byte, short, ushort, int, uint, long, ulong, charp, bool, and invbool, or a custom typename; for example, named xxx, for which helper functions param_get_xxx() and param_set_xxx() must be provided).

• The permissions for the associated sysfs entry; 0 indicates that the attribute is not to be exposed via sysfs.

Example 1 presents two example declarations.
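Parameters declared with a nonzero permission argument show up in user space as files under /sys/module/<module>/parameters/. The following userspace sketch is ours (not from the article) and simply walks that directory; the helper name and the sysfs_root parameter are illustrative:

```python
import os

def read_module_params(module, sysfs_root="/sys"):
    """Return {name: value} for every parameter the given module exposes
    via sysfs (only parameters declared with nonzero permissions appear)."""
    param_dir = os.path.join(sysfs_root, "module", module, "parameters")
    if not os.path.isdir(param_dir):
        return {}  # module not loaded, or no exposed parameters
    params = {}
    for name in sorted(os.listdir(param_dir)):
        with open(os.path.join(param_dir, name)) as f:
            params[name] = f.read().strip()
    return params
```

For the declarations in Example 1, read_module_params("your_module") would return the current values of my_integer_parameter and my_string_parameter; writing to one of those files (given write permission) updates the parameter in the running module.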

Use Count

Module use counts protect against the removal of a module that is still in use. Modules designed for previous kernels called MOD_INC_USE_COUNT() and


Ident String                Meaning

GPL                         GNU Public License v2 or later.
GPL v2                      GNU Public License v2.
GPL and additional rights   GNU Public License v2 rights and more.
Dual BSD/GPL                GNU Public License v2 or BSD license choice.
Dual MPL/GPL                GNU Public License v2 or Mozilla license choice.
Proprietary                 Nonfree products.

Table 1: Ident strings for licenses currently recognized by the kernel and their respective license names.

OO Concept                 Kernel Object Concept

object                     kobject
class                      ktype
generic container          kset
class of a given object    ktype pointed to by field ktype in the given kobject
destructor                 function pointed to by field release in the given ktype
methods                    functions pointed to by fields sysfs_ops.show and sysfs_ops.store of the ktype
'this' pointer             first parameter (struct kobject *kobj) of the above functions

Table 2: Object-oriented/kernel object mappings.


MOD_DEC_USE_COUNT() to manipulate their use count. Since these macros could lead to unsafe conditions, they are now deprecated. They should now be avoided, for example, by setting the owner field of the file_operations structure, or replaced with try_module_get()/module_put() calls. Alternatively, you can provide your own locking mechanism in a custom function, and set the module's can_unload pointer to it. The function should return 0 for "yes," and -EBUSY or a similar error number for "no."

If used, the deprecated MOD_INC_USE_COUNT macro marks the current module as unsafe, thus making it impossible to unload (unless you enable the forced-unload kernel option and use rmmod -f).

The 2.6 Device Model and /sys Filesystem

Kernel 2.6 introduces an "integrated device model": a hierarchical representation of the system structure, originally intended to simplify power-management tasks. This model is exposed to user space through sysfs, a virtual filesystem (like /proc), usually mounted at /sys. By navigating sysfs, you can determine which devices make up the system, which power state they're in, what bus they're attached to, which driver they're associated with, and so on. sysfs is now the preferred and standardized way to expose kernel-space attributes; module writers should therefore avoid the soon-to-be-obsolete procfs.

Figure 1 (available electronically; see "Resource Center," page 3) is a typical sysfs tree. The tree is conceptually similar to the view provided by the Windows "hardware manager." The first-level entries in /sys are:

• block, which enumerates all the block devices, independently from the bus to which they are connected.

• bus, which describes the structure of the system in terms of buses and connections.

• class, which provides device localization based on device class (the mouse, for example) apart from its physical bus connection or device numbering.

• devices, which enumerates all the devices composing the system.

• firmware, which provides a facility for the dynamic management of firmware.

• power, which provides the ability to control the system-wide power state.

Given the first-level classification, the same device can appear multiple times in the tree. Symbolic links are widely used to connect identical or related entities; for example, the block device hda is represented by a directory entry /sys/block/hda, which contains a link named "device" pointing to /sys/devices/pci0000:00/0000:00:07.1/ide0/0.0. The same block device also happens to be the first device connected to the IDE bus; thus, entry /sys/bus/ide/devices/0.0 points to the same location. Conversely, a link is provided pointing to the block device associated with a given device; for example, in /sys/devices/pci0000:00/0000:00:07.1/ide0/0.0, a link named "block" points to /sys/block/hda.
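Because these cross-links are ordinary symbolic links, user-space code can resolve them with no special support. A small illustrative sketch of our own (the function name and sysfs_root parameter are assumptions, not part of sysfs):

```python
import os

def block_device_backing(dev, sysfs_root="/sys"):
    """Resolve the 'device' link under /sys/block/<dev> to the canonical
    /sys/devices/... directory that represents the physical device."""
    return os.path.realpath(os.path.join(sysfs_root, "block", dev, "device"))
```

On the tree described above, block_device_backing("hda") would yield the /sys/devices/pci0000:00/0000:00:07.1/ide0/0.0 directory.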

Exposing module attributes via sysfs requires a minimal understanding of


module_param(my_integer_parameter, int, S_IRUSR | S_IWUSR);
MODULE_PARM_DESC(my_integer_parameter, "An integer parameter");

module_param(my_string_parameter, charp, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
MODULE_PARM_DESC(my_string_parameter, "A character string parameter");

Example 1: Declaration of an integer and a string parameter.



the device model and of its underlyingkobjects, ktypes, and ksets concepts.Understanding those concepts is easierin an object-oriented perspective be-cause all are C-language structs that im-plement (with debated success) a rudi-mentary object-oriented framework.Table 2 is a mapping between OO andkobject concepts.

Each directory in sysfs corresponds toa kobject, and the attributes of a kobjectappear in it as files. Reading and writ-ing attributes corresponds to invoking ashow or a store method on a kobject,with the attribute as an argument. A kob-ject is a variable of type struct kobject.It has a name, reference count, pointer

to its parent, and ktype. C++ program-mers know that methods are not definedon an object basis; instead, all the ob-jects of a given class share the samemethods. The same happens here, theidea of class being represented by aktype. Each kobject is of exactly onektype, and methods are defined forktypes (usually functions to show andstore attributes, plus a function to dis-pose of the kobject when its referencecount reaches zero: a destructor, in OOterms). A kset corresponds to a genericlinked list, such as a Standard C++ Li-brary generic container. It contains kob-jects and can be treated as a kobject it-self. Additionally, handlers can be

associated with events of kobjects entering or leaving a set, thus providing a clean way to implement hot-plug operations. The cleanness of the design of such a framework is still debated.

sysfs_example.c (available electronically; see "Resource Center," page 3), a complete example of a kernel module, shows how to create and register kobject variables to expose three attributes to user space: a string and two integers, the first read and written as a decimal number and the second as a hexadecimal one. Example 2 is an example of interaction with that module.

Removed Features
Some features present in 2.4 have been removed. For instance, the system call table is no longer exported. The system call table (declared as int sys_call_table[];) is a vector containing, for each system call, a pointer to the routine to be invoked to carry out that call. In 2.4 kernels, this table was visible to, and, more important, writable by, any module. Any module could easily replace the implementation of any system call with a custom version, in a matter of three lines of code. Apart from possible race-condition issues (on SMP systems, a system call could be replaced while in use by an application on another processor), this implied putting an incredible amount of power (the ultimate heart of the OS) in the hands of any external module. If you're not convinced about the relevance of this danger, look into how easy it was to write and inject malicious code in the form of modules that replace sys_call_table entries. Implementing rootkits is possible in no more than 30 lines of code (see http://www.insecure.org/).

The concern is not only malicious modules, but also proprietary modules provided in binary form only, for which it is hard to tell exactly what they may do. The issue was eradicated in kernel 2.6: The system call table can only be modified by code built into the kernel, whose source is therefore available.

DDJ


Listing One

#include <linux/module.h>
#include <linux/config.h>
#include <linux/init.h>

MODULE_LICENSE("GPL");

static int __init minimal_init(void) {
    return 0;
}
static void __exit minimal_cleanup(void) {
}
module_init(minimal_init);
module_exit(minimal_cleanup);

Listing Two

obj-m := your_module.o
KDIR := /usr/src/linux-$(shell uname -r)
PWD := $(shell pwd)

default:
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules

install: default
	$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules_install

clean:
	rm -rf *.o *.ko .*.cmd *.mod.c .tmp_versions *~ core

DDJ

# make
make -C /usr/src/linux-2.6.5-1.358 SUBDIRS=/root/sysfs_example modules
make[1]: Entering directory '/usr/src/linux-2.6.5-1.358'
  CC [M]  /root/sysfs_example/sysfs_example.o
  Building modules, stage 2.
  MODPOST
  CC      /root/sysfs_example/sysfs_example.mod.o
  LD [M]  /root/sysfs_example/sysfs_example.ko
make[1]: Leaving directory '/usr/src/linux-2.6.5-1.358'
# insmod sysfs_example.ko
# cd /sys
# ls
block bus class devices firmware power sysfs_example
# cd sysfs_example
# ls
hex integer string
# cat hex
0xdeadbeef
# cat integer
123
# cat string
test
# echo 0xbabeface > hex
# echo 456 > integer
# echo hello > string
# cat hex
0xbabeface
# cat integer
456
# cat string
hello
# cd ..
# rmmod sysfs_example
# ls
block bus class devices firmware power

Example 2: Interactive session in which the module sysfs_example.c is compiled, inserted, and tested. Attributes are inspected, changed, and inspected again.


Running .NET web applications in the enterprise means accommodating a myriad of servers and browsers, many with distinct behaviors (see Figure 1). The traditional approach to building complex web apps for such environments is to write separate versions of your code, each meant to run correctly with an individual server and browser pair. In this article, I examine some of the challenges you face when creating a single version of a .NET web app, so that it functions the same way no matter which server it is deployed to and which client downloads it.

Thin clients, in which data is managed and stored centrally, are being adopted for a number of reasons. For one thing, federal regulations (Sarbanes-Oxley, the Homeland Security Act, and HIPAA, among others) dictate that internal documents and communications be secured to a heretofore-unseen level, and security is easier to achieve when data is managed and stored centrally. Second, there's the high total cost of ownership of the PC desktop. Additionally, networks have gotten faster, with most businesses running 100-Mbps Fast Ethernet or 54-Mbps 802.11g Wi-Fi networks (both more than fast enough for thin-client computing). Finally, many vendors are shipping thin clients, with or without embedded Windows XP but all with web browsers, with enough resources locally that you don't waste time waiting for your system to painfully render its GUI.

At the same time, Linux is being supported by vendors such as IBM and Oracle, and .NET apps, which can run on Linux (and other flavors of UNIX), are no longer dependent solely on the Windows IIS application server and your web browsers, a one-to-many relationship. Enterprise-based .NET apps are now running in many-to-many server-browser pair environments, as in Figure 1. Finally, Linux is appearing on mainframes and other powerful computers that manage and store data centrally for large numbers of users.

On the server side, the standardization of C# and .NET's Common Language Runtime (CLR) lets you use open-source tools that are based on a language that is an international standard and compatible with both Microsoft and various UNIXes. This has given rise to initiatives such as Mono, an open-source development platform based on the .NET Framework that lets you build cross-platform applications (http://www.mono-project.com/). Mono's .NET implementation is based on the ECMA standards for C# and the Common Language Infrastructure (CLI). While Mono includes both developer tools and the infrastructure needed to run .NET client and server applications, I focus here on ASP.NET apps developed with Microsoft's

Marcia is chief technology officer at Gulesian Associates and author of more than 100 articles on software development. She can be contacted at [email protected].

ASP.NET & Multiplatform Environments


WINDOWS/.NET DEVELOPER

Building blocks for .NET web apps

MARCIA GULESIAN

"The default ASP.NET 1.1 validation controls do not provide a working client-side script model for non-Microsoft browsers"



Visual Studio .NET and deployed to both Microsoft's IIS (Windows) and the Apache Software Foundation's Apache HTTP server (after the addition of the Mono module).

On the client side, .NET apps downloaded to different browsers (running on a "thin" or "thick" client) exhibit different behaviors as a function of both the browser and the server from which the app was downloaded. I first review how adjustments in the .NET configuration files can compensate for the problematic behavior of certain browsers when they download an app from the IIS application server (Windows). Then, I show how Mono can be used to mask the behaviors of these same browsers when downloading the same app from an Apache (Mono) server.

As Figure 1 suggests, the plethora of server-browser combinations is too large for a single article. However, I present a number of representative cases that can be used as building blocks for creating a single version of a .NET web app that functions the same way whichever server it is deployed to and whichever client downloads it.

Uplevel and Downlevel Browsers
Browsers are split into two distinct groups: uplevel and downlevel. These groups define the type of native support a browser offers, and generally determine the rendering and behavior of a page downloaded from a web server.

Browsers that are considered uplevel at minimum support ECMAScript (JScript, JavaScript) 1.2; HTML 4.0; the Microsoft Document Object Model (MSDOM); and Cascading Style Sheets (CSS).

On the other hand, downlevel browsers only support HTML 3.2 (http://aspnet.4guysfromrolla.com/demos/printPage.aspx?path=/articles/051204-1.aspx).

In practice, only modern Microsoft Internet Explorer versions fall into the uplevel category; most other browsers fall into the downlevel category.

Server controls such as dropdown lists and text boxes can behave differently for each browser type. If users have uplevel browsers, the server controls generate client-side JavaScript that manipulates the MSDOM and traps the action events directly on the client. If users have downlevel browsers, the controls generate standard HTML, requiring that browsers

Figure 1: Server and browser pairs, some with different behaviors.

Figure 2: WebForm rendered in IE 6.0.


perform round-trips to the server for triggered action events.

Because different web browsers, and different versions of the same browser, have different capabilities and behaviors, web developers usually have to change their applications based on which browser their code detects the user running. They use two general approaches to this problem:

• Code (typically JavaScript) is sent along with the page to be executed client-side.
• The user-agent string from the HTTP request headers is analyzed server-side, and only the appropriate HTML code is sent to the client.

Often a combination of the two is employed (see http://msdn.microsoft.com/asp.net/using/migrating/phpmig/whitepapers/compatibility.aspx?print=true and http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vbcon/html/vbconwebforms-controlsbrowsercapabilities.asp).

In addition, given the existence of Rhino (Mozilla.org's JavaScript implementation in Java) and IKVM.NET (a JVM implementation for the CLR), it should be possible to run JavaScript directly under Mono (see http://chimpen.com/things/archives/001427.php).

ASP.NET's Adaptive Rendering
Figure 2 shows a web app downloaded from a Microsoft IIS application server by an Internet Explorer 5.5 or later browser running on a PC. The rendering is the same as the original design of the app in the Visual Studio .NET IDE. However, .NET controls such as single- and multiline text boxes or labels appear distorted on the page when deployed to an IIS application server and downloaded by a downlevel browser such as Safari or Konqueror; see Figures 3(b) and 4(b). That's because the HTML rendered by .NET web controls depends on the browser requesting the ASP.NET web page, and, in this situation, Safari and Konqueror render HTML 3.2-compliant HTML by default. However, adding Listings One and Two to your web.config (or machine.config) file causes these browsers to render the .NET web controls of your app (or all apps running on the server) using HTML 4.0-compliant markup.

When Listings One and Two are added to the <browserCaps> section of the machine.config file on the IIS server where .NET web apps are running, the web controls in all apps running on that host render without distortion in Safari (Mac), Konqueror (Linux), and Internet Explorer (PC) browsers; see Figures 2, 3(a), and 4(a). What is most interesting is that the undistorted rendering in Figures 3(a) and 4(a) is also seen in all browsers when the same application is copied to and downloaded from the Mono server without the use of Listings One and Two! To help account for changes in the browser marketplace, cyScape (http://cyscape.com/browsercaps/) publishes regular updates for the browserCaps section of your machine.config file. It is important that you keep this data current; otherwise, pages that depend on browser detection may not operate as expected.


Figure 3: (a) WebForm rendered (correctly) in Safari 1.2 (Mac OS X) with Listing One in machine.config; (b) WebForm rendered malformed in Safari 1.2 (Mac OS X) without Listing One in machine.config.


ASP.NET Validation
The default ASP.NET 1.1 validation controls do not provide a working client-side script model for non-Microsoft browsers, because the proprietary script document.all[ID] is used in place of the standards-compliant document.getElementById(ID) for referencing an element in the HTML document. If you look at the View Source in a downlevel browser and compare the code with the View Source of the IE browser, the client-side code for the validation controls is absent in the downlevel browser's View Source. Client-side support can be added, but at the cost of recreating the validation controls. Fortunately, the work has already been provided by Paul Glavich (http://weblogs.asp.net/pglavich/), so you can use his DomValidators validation controls if you need to support client-side validation with non-Microsoft browsers.

Other solutions are provided by third-party tools such as Peter Blum's Validation Controls (http://www.peterblum.com/), which emit client-side validation code in Safari browsers. Blum's solution requires some work to install and configure before Visual Studio .NET 2003 can take advantage of these components, but they do work well. Another solution (my preference) is to hand-code custom server-side validation; or you can settle for client-side validation in IE and server-side validation in all other browsers. Or you can wait until ASP.NET 2.0 ships.

It's worth noting that this technique does have JavaScript issues. For instance, Listing Three works in IE, Safari, and Konqueror browsers, Listing Four works only in IE browsers, and Listing Five works in IE (PC) but not Safari (Mac OS X) or Konqueror. (Listing Five exploits a security bug in some browsers, IE for instance, that lets you close the current window even if it wasn't opened with client-side scripting.)

Conclusion
With Mono and most Linux distributions bundling Java support, it's important to include Java in any discussion that considers thin clients. In mid-2004, IBM began offering a Java-based thin software application, called "Workplace," intended for web-based applications. And the comparative examples of JavaScript code presented here apply equally well to servlet- and JSP-based Java apps.

However, it's also important not to compare apples with oranges. At the end of 2004, you were likely to have been developing with .NET 1.1 and/or Mono 1.0 and/or JDK 1.4. In the coming months, however, you can add .NET 2.0, Mono 1.2, and JDK 1.5 to the mix.

Of course, C# and Java are playing leapfrog. C# started out with many of Java's features and some useful improvements of its own, and now Java is taking a number of C# features (attributes, enums, foreach, and autoboxing) and adding generics, static imports, and extendable enums. With the release of ASP.NET 2.0, Microsoft will drastically reduce the amount of coding required for a normal web site, in some cases by more than 50 percent. Microsoft has also added to all the out-of-the-box User controls and Validation controls, and created the new concept of Master pages, which should reduce the size of your web site. With J2EE 5.0 (previously J2EE 1.5), the Java community is likewise making it easier for less-experienced developers to create applications.

The bottom line, as suggested by Figure 1, is that .NET apps have now followed J2EE apps into the world of multiplatform deployment, which calls for a new and expanding skill set on the part of .NET developers. While both the Mono and .NET Frameworks need to be considered during the planning stage of your next ".NET" web application, this consideration needs to include your ability to work with operating systems and browsers other than just Windows Server and Internet Explorer, respectively. Failure to do so can put you at a competitive disadvantage.

References
Mark Easton and Jason King, Cross-Platform .NET Development. Apress, 2004.
Brian Nantz, Open Source .NET Development. Addison-Wesley, 2005.

DDJ


Figure 4: (a) WebForm rendered correctly in Konqueror (SuSE Linux) with Listing Two in machine.config; (b) WebForm rendered malformed in Konqueror (SuSE Linux) without Listing Two in machine.config.


Listing One

<!-- AppleWebKit Based Browsers (Safari...) //-->
<case match="AppleWebKit/(?'version'(?'major'\d)(?'minor'\d+)(?'letters'\w*))">
    browser=AppleWebKit
    version=${version}
    majorversion=${major}
    minorversion=0.${minor}
    frames=true
    tables=true
    cookies=true
    javascript=true
    javaapplets=true
    ecmascriptversion=1.5
    w3cdomversion=1.0
    css1=true
    css2=true
    xml=true
    tagwriter=System.Web.UI.HtmlTextWriter
    <case match="AppleWebKit/(?'version'(?'major'\d)(?'minor'\d+)(?'letters'\w*))(\(KHTML, like Gecko\) )?(?'type'[^/\d]*)/.*$">
        type=${type}
    </case>
</case>

Listing Two

<!-- Konqueror //-->
<case match="Konqueror/(?'version'(?'major'\d+)(?'minor'\.\d+)(?'letters'));\w*(?'platform'[^\)]*)">
    browser=Konqueror
    version=${version}
    majorversion=${major}
    minorversion=${minor}
    platform=${platform}
    type=Konqueror
    frames=true
    tables=true
    cookies=true
    javascript=true
    javaapplets=true
    ecmascriptversion=1.5
    w3cdomversion=1.0
    css1=true
    css2=true
    xml=true
    tagwriter=System.Web.UI.HtmlTextWriter
</case>

Listing Three

function disableTextBox() {
    var selectElement = document.getElementById('ddlWhatever');
    var len = selectElement.options.length;
    for (var i = 0; i < len; i++) {
        var bln = selectElement.options[i].selected;
        var val = selectElement.options[i].value;
        if (bln == true) {
            if (val == 'ABC') {
                document.Form1.TextBox1.disabled = true;
                // Works in I.E. (PC), Safari 1.0.2 & 1.2.2, and Konqueror
                //document.Form1.TextBox1.readOnly = true;
                // Works in I.E. (PC), Safari 1.2.2, and Konqueror
                //document.getElementById("TextBox1").setAttribute("readOnly", true);
                // Works only in I.E. (PC)
            } else {
                document.Form1.TextBox1.disabled = true;
                // Works in I.E. (PC), Safari 1.0.2 & 1.2.2, and Konqueror
                //document.Form1.TextBox1.readOnly = true;
                // Works in I.E. (PC), Safari 1.2.2, and Konqueror
                //document.getElementById("TextBox1").setAttribute("readOnly", true);
                // Works only in I.E. (PC)
            }
        }
    }
}

Listing Four

function launchWindow() {
    if (document.getElementById("ddlWhatever").getAttribute("value") == 'ABC') {
        var dateWin;
        dateWin = window.open("Page1.aspx", '');
        dateWin.focus();
    } else {
        var dateWin;
        dateWin = window.open("Page2.aspx", '');
        dateWin.focus();
    }
}

Listing Five

window.opener = self;
window.close();

DDJ



When debugging applications and drivers on Pocket PC PDAs, I often miss being able to use hardware-assisted breakpoints. When such breakpoints are enabled, the CPU runs at normal speed, stopping only when data at a given address is accessed or modified (such data breakpoints are often called "watchpoints"). While these breakpoints are not for everyday use, they can dramatically speed up the debugging of corrupted data or the exploration of unfamiliar code. Unfortunately, the Microsoft eMbedded Visual C++ (EVC) debugger does not support hardware-assisted breakpoints (at least at this writing). Granted, EVC does provide a dialog for setting data breakpoints, but it appears to implement this feature by running the program step-by-step and checking the data, a process too slow to be useful. This is unfortunate because, while you can substitute for other debugger features, such as regular code breakpoints, by inserting trace statements or message boxes in the program itself, there's no substitute for hardware breakpoints.

I recently needed software-controlled data breakpoints when debugging a large and unfamiliar code base. I noticed that a local variable in certain functions sometimes changed its value erroneously. When I tried to step through the function in a debugger or insert a breakpoint within the function, the program timing was disrupted enough to hide the bug. Consequently, I decided to write a C++ class that would set the data breakpoint in its constructor, and remove it in the destructor, with minimal overhead. Then I would only need to instantiate the class in the function and run the program. After I wrote the class and ran the program for a few minutes, the data breakpoint was triggered and the debugger displayed the exact line that modified the variable in question. While I originally implemented this class for Pentium-based Windows NT using the SetThreadContext API, I recently implemented it on a PocketPC PDA based on the Intel XScale architecture.

In this article, I explain how to access debug registers on XScale-based CPUs from C/C++ applications. Using the code I present here (available electronically; see "Resource Center," page 3), you can easily set breakpoints on data reading and/or writing and catch exceptions generated by these breakpoints. Also, I show how to use another feature of XScale, the trace buffer, which lets you collect program execution history. I've tested the code on several off-the-shelf XScale-based PocketPC PDAs with the Windows Mobile 2003 and Windows Mobile 2003 Second Edition operating systems. To find out if your PDA is running an XScale, open the About Control Panel applet. XScale CPUs have names starting with PXA; for example, "PXA270." The code I present here won't work with ARM-compatible CPUs from manufacturers that do not support XScale debug extensions.

Intel's XScale Architecture
Intel's XScale architecture is a successor to StrongARM, which was originally designed by Digital Equipment Corporation. At this writing, most models of Windows Mobile/PocketPC 2003 and 2002 PDAs run XScale-based CPUs, while some older PocketPC 2002 PDAs used StrongARM-based processors. All these processors are based on the ARM architecture designed

Hardware-Assisted Breakpoints

"XScale-based CPUs normally run with debug functionality disabled"

Accessing XScale debug registers from C/C++

DMITRI LEMAN

Dmitri is a consultant in Silicon Valley specializing in embedded system integration, driver, and application development. He can be reached at [email protected].

EMBEDDED SYSTEMS



by ARM Limited. StrongARM was based on ARMv4 (ARM Version 4) and XScale on ARMv5TE. Compared to StrongARM, XScale has several extensions, such as support for the Thumb 16-bit instruction set (in addition to the 32-bit ARM instruction set), DSP extensions, and debug extensions. For user-mode applications, XScale maintains compatibility with StrongARM. To learn more about ARM architecture, registers, instructions, and addressing modes, see The ARM Architecture Reference Manual, Second Edition, edited by David Seal (Addison-Wesley, 2000). For a quick reference to XScale-supported instructions, see the XScale Microarchitecture Assembly Language Quick Reference Card (http://www.intel.com/design/iio/swsup/11139.htm). XScale-specific features, such as memory management, cache, configuration registers, performance monitoring, and debug extensions, are documented in Intel's XScale Core Developer's Manual (http://www.intel.com/design/intelxscale/273473.htm).

Using XScale Debug Extensions
Normally, XScale-based CPUs run with debug functionality disabled, but they may be configured to execute in one of two debug modes: Halt and Monitor. Halt mode can only be used with an external debugger connected to an XScale CPU through the JTAG interface. Since off-the-shelf PDAs are unlikely to have JTAG connectors, I focus here on Monitor debug mode, which can be used by software running on the CPU itself without any external hardware or software. Useful features in this mode include instruction breakpoints, data breakpoints, software breakpoints, and a trace buffer. Except for software breakpoints (which are generated by a special instruction inserted into the program), these features can be enabled and configured using debug registers.

Intel provides the XDB Browser, a powerful visual debugging tool (included with the Intel C++ compiler), which gives you full control of XScale CPU internals, including debug extensions. Unfortunately, this tool requires special debug code that's built into the Board Support Package (BSP), which was unavailable on most PDAs at the time of writing.

The debug registers in Table 1 belong to coprocessors 14 (CP14) and 15 (CP15). Coprocessors are modules inside the CPU that extend the core ARM architecture. The coprocessor registers are accessed using special commands. In the code accompanying this article, I use the commands MRC and MCR with the syntax MRC{cond} p<cpnum>, <op1>, Rd, CRn, CRm, <op2> to move from a coprocessor to an ARM register, and MCR{cond} p<cpnum>, <op1>, Rd, CRn, CRm, <op2> to move from the ARM register to a coprocessor.

• {cond} is an optional condition (in ARM, most instructions can be marked with a condition that specifies whether the instruction is executed or skipped depending on processor flags).
• p<cpnum> is either the p14 or p15 coprocessor name.
• Rd is a general-purpose ARM register.
• CRn and CRm identify the coprocessor register.
• <op1> and <op2> are opcodes (and are always 0 when working with debug registers).

For example:

MCR p15, 0, R0, c14, c0, 0   ; write R0 to DBR0
MRC p14, 0, R1, c10, c0, 0   ; read DCSR to R1

Software access to debug registers can be done from a privileged mode only; user-mode access generates exceptions. Fortunately, it appears that PocketPCs always run applications in Kernel mode. Windows Mobile-based Smartphones, on the other hand, run applications in user mode. Trusted applications (which are signed with a trusted certificate) can switch to Kernel mode using the SetKMode API. Because I don't have an XScale-based Smartphone, I focus here on the PocketPC.

I implemented the debug register access code in assembly language; see AccessCoproc.s (available electronically), which contains several short routines: SetDebugControlAndStatus, SetDataBreakPoint, SetCodeBreakPoint, ReadTraceBuffer, GetProgramStatusRegister, and ReadPID. The file Breakpoint.h contains declarations of these functions and related constants to let you call the functions from C or C++.

To enable debug functionality, bit 31 (Global Enable) should be set in the Debug Control and Status Register (DCSR). Bits 0 and 1 in this register are used to enable the trace buffer. See the SetDebugControlAndStatus implementation in Listing One. Applications should call DWORD dwOrigDCSR = SetDebugControlAndStatus(DEF_GlobalDebugEnabled, DEF_GlobalDebugEnabled) before setting any breakpoints, then save the result. Before exiting, applications should call SetDebugControlAndStatus(dwOrigDCSR, -1) to restore the DCSR to its original value; see the WinMain function in BreakPointSamples.cpp (available electronically).

There are two data breakpoint registers: DBR0 and DBR1. There is also the Data Breakpoint Control Register (DBCON), which lets you configure hardware breakpoints on up to two separate addresses or a single breakpoint on a range of addresses. The breakpoints can be configured for load-only, store-only, or any (load or store) access type. To set a breakpoint on a range, DBR0 should be set to the address and DBR1 to a mask. The breakpoint is triggered if a program accesses data at an address that matches the value in DBR0 while ignoring the bits that are set in the mask. I implemented an assembly routine SetDataBreakPoint (in AccessCoproc.s),


Register Name                              Purpose                                          CRn  CRm

(a)
TX Register (TX)                           Communication with JTAG debugger.                 8    0
RX Register (RX)                           Communication with JTAG debugger.                 9    0
Debug Control & Status Register (DCSR)     Flags for enabling debug mode.                   10    0
Trace Buffer Register (TBREG)              Reading from trace buffer.                       11    0
Checkpoint Register 0 (CHKPT0)             Reference address for use with trace buffer.     12    0
Checkpoint Register 1 (CHKPT1)             Reference address for use with trace buffer.     13    0
TXRX Control Register (TXRXCTRL)           Communication with JTAG debugger.                14    0

(b)
Process ID (PID)                           Remapping current process addresses to slot 0.   13    0
Instruction breakpoint register 0 (IBCR0)  Address of instruction breakpoint 0.             14    8
Instruction breakpoint register 1 (IBCR1)  Address of instruction breakpoint 1.             14    9
Data breakpoint register 0 (DBR0)          Address of data breakpoint 0.                    14    0
Data breakpoint register 1 (DBR1)          Address of data breakpoint 1.                    14    3
Data Breakpoint Control Register (DBCON)   Configures data breakpoints.                     14    4

Table 1: XScale debug registers and PID: (a) CP14 registers; (b) CP15 registers.


which assigns all three of these registers. The enum XScaleDataBreakpointFlags (in Breakpoint.h) defines configuration values for DBCON, which can be passed as the third argument to the function. For a convenient way to set breakpoints on local variables, use DataBreakPoint. The functions TestWriteBreakpoint, TestReadBreakpoint, and TestRangeBreakpoint in BreakPointSamples.cpp show an example. When a data breakpoint is hit, it generates a data abort exception.

The Instruction Breakpoint Address and Control Registers IBCR0 and IBCR1 can be used to set breakpoints on code execution at a specific address. Usually, debuggers insert a special instruction into the program to implement a code breakpoint. This lets you set an unlimited number of breakpoints, but the method does not work with code located in ROM or Flash. In these cases, the hardware-supported instruction breakpoints come in handy; however, there are only two of them. Unfortunately, instruction breakpoints appear to be useless because they generate a "prefetch abort" exception, which is not passed to the __try/__except handler or a debugger.

Register TBREG is for reading bytes from the trace buffer, and CHKPT0 and CHKPT1 are for associating execution history in the trace buffer with instruction addresses. Several other debug registers are for communication with the JTAG debugger and are not discussed here.

The Process ID (PID) register is not a debug register, but it is used when preparing addresses to be set in DBRx or IBCRx. Windows CE can run up to 32 processes, each occupying its own 32-MB address slot. The current process is also mapped to slot 0, which lets a DLL code section (located in ROM) access different data sections when the DLL is loaded in several processes. The ARM architecture provides the PID as direct and efficient support for such slot remapping. The value of the PID is equal to the address of the process slot. The CPU uses the high 7 bits (31:25) of the PID to replace the corresponding bits of virtual addresses when they are 0. The same operation has to be performed when preparing addresses for DBRx or IBCRx (see the macro MAP_PTR in Breakpoint.h).

Reporting Data Breakpoints
I present here three straightforward ways to handle the data abort exceptions generated when data breakpoints are triggered:

• Using an application debugger.
• Using a __try/__except construct.
• Writing a simple kernel debugger stub.

Application debuggers (such as EVC) handle data abort exceptions in any thread of the program under debug. They break execution and display the source line or instruction that triggered the breakpoint, along with registers, local variables, and the call stack.

However, the application debugger often cannot be used because a connection is not available or it's too slow. Also, the debugger cannot handle exceptions in a system process, such as device.exe (which hosts drivers) or gwes.exe (which hosts the user interface).

The second approach is to wrap code in __try{}__except(Expression){} exception-handling blocks. When an exception happens within the __try{} block, the system executes the Expression statement. I implemented the function ExceptionHandler in BreakPointSamples.cpp, which should be specified as the argument to __except. I call the _exception_info API to get useful information, such as the exception code, address, and CPU registers. ExceptionHandler displays this information in message boxes (to simplify integration of this code into various applications). Unfortunately, a __try/__except construct can only handle exceptions coming from a thread that executes code within the __try{} block or functions called from within the __try{} block. This is not a problem if you can insert __try/__except into the source code for all suspect threads in your program.

When printing information about an exception, it's best to print the stack trace. Printing the stack trace on an ARM is more difficult than on an x86 because the EVC compiler can generate several different types of function prologs and does not have an option to produce consistent stack frames (see "ARM Calling Standard" in the EVC help for details). Also, unlike the x86, which always pushes the return address onto the stack when calling a function, ARM code moves the return address into the register LR. Most functions start by storing the LR on the stack, but highly optimized code can keep it in any register. This means that on ARMs, it may not be possible to reliably reconstruct the stack trace without disassembling the code (which is beyond the scope of this article).

A Simple Kernel Debugger Stub
It is sometimes necessary to catch exceptions globally, in any thread of any process. The easiest way to achieve this is to register a DLL as a kernel debugger stub. I include here the minimal code (available electronically) capable of handling system-wide exceptions. In the days of Windows CE 3.0/Pocket PC 2002, you could register a regular user DLL as a kernel debugger stub and display exceptions in a regular message box (the whole code was just about 200 lines). Alas, in Windows CE 4.x/Pocket PC 2003, the kernel debugger must be loaded as a kernel module. The problem is that a DLL such as this cannot link to any other DLL, even coredll (which provides most of the CE API and the C/C++ runtime library functions). Consequently, I had to implement my own sprintf-like formatting routine as well as integer division-by-10 (both are normally imported from coredll). I also recycled my old HTrace library to write a trace to a shared memory buffer, which can be displayed from a separate application. You can find the code in the SimpleKDStub directory. To run it, copy SimpleKDStub.dll and KDViewer.exe to the PDA and start the program. It loads the stub, which starts listening for exceptions. Once an exception is caught, it is printed to the shared buffer and displayed in the application. This tool is useful for data breakpoints and for catching other exceptions in any application on the PDA.

XScale Trace Buffer
The XScale architecture implements a powerful debugging feature: the trace buffer. When enabled, it collects a history of executed instructions. The trace buffer is just 256 bytes long (built inside the CPU itself), but stores the history as a compact sequence of 1- or 5-byte entries representing control flow changes (exceptions and branches). Each entry has a 1-byte message, which indicates the type of entry (exception, direct, or indirect branch) and the count of instructions executed since the previous control flow change. If this count exceeds 15, then a special roll-over message is stored. Entries for indirect branches include

http://www.ddj.com Dr. Dobb’s Journal, June 2005 89

“Alas, in Windows CE 4.x/Pocket PC 2003, the kernel debugger must be loaded as a kernel module”

Page 65

Listing One
; SetDebugControlAndStatus writes (optionally) to
; Debug Control and Status Register (DCSR)
; and returns the original value of DCSR.
; parameters:
;   r0: flags to be set or reset in DCSR.
;   r1: mask - flags to be modified in DCSR, the rest is preserved.
; return value:
;   value of DCSR before the modification

        EXPORT  |SetDebugControlAndStatus|
|SetDebugControlAndStatus| PROC

        stmdb   sp!, {r2,lr}            ; save registers
        mrc     p14, 0, r2, c10, c0, 0  ; read DCSR to r2
        and     r0, r0, r1              ; r0 = r0 & r1 - clear flags not in mask
        bic     r1, r2, r1              ; r1 = r2 & ~r1 - leave flags not in mask
        orr     r0, r0, r1              ; r0 = r0 | r1 - combine flags
        cmp     r0, r2                  ; compare new with original
        mcrne   p14, 0, r0, c10, c0, 0  ; write DCSR if flags have changed
        mov     r0, r2                  ; prepare to return the original flags
        ldmia   sp!, {r2,pc}            ; restore the registers and return
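The register arithmetic in Listing One is just a masked merge. A C model of the same logic (my own restatement, not part of the original sources) makes the semantics easy to check:

```c
#include <stdint.h>

/* C model of Listing One: bits selected by mask are taken from flags,
   all other DCSR bits keep their old value. Returns the previous DCSR
   value, as the assembly routine does. */
static uint32_t set_flags(uint32_t *dcsr, uint32_t flags, uint32_t mask)
{
    uint32_t old = *dcsr;
    uint32_t merged = (flags & mask) | (old & ~mask);
    if (merged != old)   /* the asm skips the mcr when nothing changed */
        *dcsr = merged;
    return old;
}
```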

DDJ

an additional 4-byte target address. The buffer may be configured to work in wraparound or fill-once mode. Wraparound is appropriate when waiting for an exception (as I do in this article). Fill-once mode (which generates a "trace-buffer full break" exception once the buffer is full) may be used to record all code execution continuously (however, I have not tried it yet).

The content of the trace buffer is extracted by reading the TBREG register 256 times (this also erases the buffer). The CHKPTx registers are used to get an address of a starting point for the reconstruction. Unfortunately, the buffer does not contain enough information to reconstruct the execution history without disassembling the executed code, counting instructions, and examining branches. Such a program is beyond the scope of this article. However, I included electronically the function ShowTraceBuffer, which simply displays the list of entries in the buffer in a series of message boxes. You can use this information, along with the disassembly window of the EVC debugger, to recover the execution history prior to an exception. This may be a more powerful tool than a stack trace. Be aware that the trace buffer collects global execution information from all processes, the OS kernel, and interrupt handlers.
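Once the 256 bytes have been read out of TBREG, walking the entries is mechanical. The sketch below shows the shape of such a parser; the message-type nibble values are placeholders of my own for illustration, not the actual XScale encodings (consult the XScale core developer's manual for those):

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical type nibbles; the real values are in the XScale manual. */
enum { MSG_EXCEPTION = 0xD, MSG_DIRECT = 0x8, MSG_INDIRECT = 0x9,
       MSG_ROLLOVER = 0xF };

typedef struct {
    int type;          /* one of the MSG_* values above */
    int count;         /* instructions since previous flow change (0-15) */
    uint32_t target;   /* valid only for indirect-branch entries */
} TraceEntry;

/* Walks the extracted buffer front to back. Each entry starts with a
   1-byte message (type nibble + instruction count); indirect-branch
   messages are followed by a 4-byte target address, as the article
   describes. Returns the number of entries parsed. */
static size_t parse_trace(const uint8_t *buf, size_t len,
                          TraceEntry *out, size_t max)
{
    size_t i = 0, n = 0;
    while (i < len && n < max) {
        uint8_t msg = buf[i++];
        out[n].type   = msg >> 4;
        out[n].count  = msg & 0x0F;
        out[n].target = 0;
        if ((msg >> 4) == MSG_INDIRECT) {
            if (i + 4 > len) break;  /* truncated final entry */
            out[n].target = (uint32_t)buf[i]
                          | (uint32_t)buf[i + 1] << 8
                          | (uint32_t)buf[i + 2] << 16
                          | (uint32_t)buf[i + 3] << 24;
            i += 4;
        }
        n++;
    }
    return n;
}
```

A real ShowTraceBuffer-style tool would format each entry (as in the +1, IBr 121BC, +1, BR, +4, BR output of Figure 3) instead of returning structs.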

The function TestTraceBuffer in BreakPointSamples.cpp demonstrates using the trace buffer to record execution history. TestTraceBuffer sets a data breakpoint and enables the trace buffer, then it calls the function Test, which calls Test1, which triggers the breakpoint. Figure 1 is an annotated disassembly listing for these functions. The exception raised by the breakpoint is displayed in a message box in Figure 2, where you can see the address of the instruction that triggered the breakpoint (the value of register PC=1221C). Register R0=2C02FDFC is the address of the data and register R1=B (123) is the new value. Figure 3 displays the parsed trace buffer: +1, IBr 121BC, +1, BR, +4, BR. This lets you reconstruct the execution history: The function SetDebugControlAndStatus executed one instruction after enabling the trace buffer, then returned to address 121BC (in TestTraceBuffer), then one instruction was executed, then a branch (to Test), then four instructions and a branch (to Test1).

Further Improvements
A simple way to enhance the code I present here would be to print the module name and offset instead of the raw return address when printing exception information. A more difficult exercise would be to print the stack trace or enhance the trace-buffer printing with a disassembler to fully reconstruct the execution history. A completely new direction would be to implement a continuous execution recording tool using a fill-once trace buffer. I may post bug fixes and improvements on my web site (http://forwardlab.com/).

Conclusion
XScale-based CPUs provide powerful support for hardware-assisted debugging. Fortunately, it is not necessary to wait for application debuggers to provide access to all CPU features from the GUI. On Pentium-based systems, Visual Studio never managed to implement breakpoints on data reading or hardware breakpoints on local variables. Therefore, it is important for you to know the capabilities of the CPU and how to exploit them from an application. The tricks I present here may not be for everyday use, but every now and then, they can save hours (or days) of difficult debugging.

DDJ


Figure 2: Breakpoint triggered by function Test1.

; void TestTraceBuffer():...

280121B0  mov    r1, #1
280121B4  mov    r0, #1
280121B8  bl     |SetDebugControlAndStatus|  ; enable trace buffer
280121BC  add    r0, sp, #0x11, 28
280121C0  bl     |Test|                      ; call Test
...

; void Test(int * p)
280121F0  mov    r12, sp
280121F4  stmdb  sp!, {r0}
280121F8  stmdb  sp!, {r12, lr}
280121FC  ldr    r0, [sp, #8]
28012200  bl     |Test1|                     ; call Test1
28012204  ldmia  sp, {sp, pc}

; void Test1(int * p)
28012208  mov    r12, sp
2801220C  stmdb  sp!, {r0}
28012210  stmdb  sp!, {r12, lr}
28012214  ldr    r0, [sp, #8]
28012218  mov    r1, #0x7B
2801221C  str    r1, [r0]                    ; *p = 123 - triggers a data breakpoint
28012220  ldmia  sp, {sp, pc}

Figure 1: Annotated disassembly listing for functions TestTraceBuffer, Test, and Test1 used to demonstrate the XScale trace buffer.

Figure 3: Execution history before the breakpoint in function Test1.

Page 66

For some reason, all the topics in this month's column have something to do with death or legacy. Dark. And I haven't even seen Frank Miller's Sin City yet. I talk about an end-of-life language, two ex-Microsofties, a television Terminator, legacy code and coding legacy, and what comes after physics. To lighten things up, I allow some jokes to slip in. Fjordian slips.

It crossed my mind when these irreverent references were slipping into the text that some readers might consider them to be in poor taste. Monty Python jokes, such readers might say, are hardly appropriate in referring to matters of life and death, and none of the above belongs in a column about programming paradigms. And what are programming paradigms, anyway?

My three responses to such readers, if there are any among the Dr. Dobb's readership, are: (1) Good point. I hope you've mentioned your outrage over exploitation of death to Tom DeLay and Jesse Jackson. (2) I'm older than you and closer to death. I have a special dispensation to laugh about the subject. (3) May I suggest that you Google "tasteful humor" and, after slogging through the depressing results, then tell me that the phrase is not an oxymoron? (4) You're absolutely right. I'm so ashamed.

And now for something completely basic.

VB-not-net: Pushing Up Daisies
It's hard to believe that, nearly 50 years after John Kemeny and Thomas Kurtz democratized programming by creating the truly revolutionary programming language Basic, 30 years after Basic became the first consumer HLL available for a microcomputer (by a company then called "Micro-Soft"), and 29 years after two guys named Dennis Allison and Bob Albrecht decided that a magazine (then called Dr. Dobb's Journal of Tiny Basic Calisthenics and Orthodontia: Running Light without Overbyte) needed to be created to put a free and open version of Basic into the hands of microcomputer users, after all this time, we're still talking about Basic.

Anyway, I am. And I'm not alone. Microsoft is in the process of inflicting a new programming language implementation on its developers and, no doubt in response to a subliminal Gatesian mandate insinuated into the brain of anyone who drinks from the well on the Microsoft Campus, they are calling it Basic. Microsoft always has to have a Basic, like the Winchester widow had to keep adding onto her house, like sharks have to keep swimming, like Frank Miller needs to share his noir. On the day that Microsoft no longer has a product called Basic, Mount St. Helens will rain down ash on the Microsoft Campus to the tops of the cubicle walls and the tectonic plate beneath castle Gates will shift, sending Bill's house floating out Puget Sound to sea on the crest of a tsunami.

But Microsoft is as loose in what it calls Basic as I am in deciding what fits under the heading of programming paradigms. Visual Basic, or VB, is evolving into VB-dot-net, and VB-dot-net is arguably about as close to VB-not-net as VB-not-net is to Kemeny and Kurtz Basic, which is to say not very. And therein lies the rub that galls the kibe of the down-at-the-heels developer. Because Microsoft has informed the approximately six jillion VB-not-net developers that it will no longer support their EOL'd version of VB. Free support ended March 31 of this year; paid support will linger for another three years, or five if the U.S. Congress passes a special law.

But a lot of those developers claim that VB-dot-net is a substantially different development platform, and not one that they necessarily want to migrate all their code to. Thousands of VB-not-net developers petitioned Microsoft not to cut off their life support.

Microsoft's response: The not-net version of VB has passed on. It is no more. It has ceased to be. It's expired and gone to meet its maker. It's a stiff. Bereft of life, it rests in peace. Its metabolic processes are now history. It's kicked the bucket, it's shuffled off this mortal coil, rung down the curtain, and joined the bleedin' choir invisible. It is an ex-language implementation.

Or words to that effect.

Bring Out Your Dead Code
Enter the folks at REALbasic, who give VB's head three perfunctory taps with a golden hammer, wrest away the Fisherman's Ring, and immediately parade their language's palpability before the stunned conclave of VB-not-net developers.

It's too late to take advantage of the deal now, aren't I a big help, but REALbasic offered its cross-platform VB-like Basic to VB-not-net developers for a price that is hard to beat: nothing. Hard to top that for chutzpah, using one of Microsoft's own take-no-prisoners competitive tactics against it. The deal was set to end on April 15, though. As of April 1, more than 10,000 VB-not-net developers had taken advantage of it.

REALbasic is not a clone of Visual Basic exactly, and although the RB folks have

Pining for the Fjords

Michael Swaine

P R O G R A M M I N G P A R A D I G M S

Michael is editor-at-large for DDJ. He can be contacted at [email protected].


Page 67

made it as easy as they can to port VB-not-net programs to RB, it's still a job. One might find it a less onerous job than porting to VB-dot-net, though, because RB is much more philosophically in synch with VB-not-net than the dot-net version is. Which is certainly a strength but also arguably a great weakness. Because the philosophy in question is preobject-oriented, or at least subobject-oriented, a throwback to earlier programming models, like, well, like Basic. It could be argued, and Microsoft argues thus, that if these throwback developers had any gumption at all they'd want to move into the 21st century. Even though I am one of those throwback RB coders, I do get the point.

Maybe it is time to move on, although this old language does have such lovely plumage.

Life After Microsoft
A brief aside on an early Basic and a programmer's legacy: Among the many Basics that have been developed and/or sold by Microsoft, the Visual and the invisible (Excel Basic), the Quick and the dead (MS-BASIC), one of the best known is GW-BASIC. What is not so well known is what or who GW was. There is a fairly widespread belief that GW meant "Gee Whiz." This is not correct. Nor does GW stand for "Gates, William." Unless authorities in positions to know are deluded, GW-BASIC honors early (single-digit employee number) Microsoft employee Greg Whitten, who presumably had something to do with its development.

If the name honors Whitten, that's a lot more than fellow former-Microsoftie Joel Spolsky does in his essay "Two Stories" (http://www.joelonsoftware.com/articles/TwoStories.html). Joel more or less takes credit for the dismantling of Microsoft's Application Architecture group, which he perceived as a bunch of out-of-touch Ph.D.s arguing about how many macros can dance on the head of a pin. The Application Architecture group was headed by Ph.D. Greg Whitten.

I don't know how fair Joel is being, but this is all ancient history. According to Greg Whitten's bio, he "developed Microsoft's common cross-language compiler and runtime technology, the company-wide object-oriented software strategy and the software architecture strategy for the Office, Back Office, and Windows product lines," which sounds pretty impressive to me. But maybe Greg should be judged on the basis of what he's doing now, which is serving as the CEO and chairman of the board of NumeriX. (He also collects and races vintage cars, but it's NumeriX that will allow me to segue into the bit about computer mathematics and Mathematica.)

NumeriX provides software to technical people in financial institutions to help them make buying and selling decisions regarding exotic derivatives. Risk-assessment stuff, Monte Carlo methods. The company's client list is impressive, including the World Bank. I'm not qualified to judge the quality of their work: I don't know what an exotic derivative is, couldn't even tell you what a vanilla derivative is (although I do know what a Monte Carlo method is, and no, it doesn't involve fear, surprise, ruthless efficiency, and an almost fanatical devotion to the Pope). I do understand, though, that the company is packed with financial and mathematical Ph.D.s, which is probably a more comfortable environment for Whitten than Microsoft back in the day. Perhaps NumeriX, rather than GW-BASIC, will define Greg Whitten's legacy.

What Comes After Physics
Every so often I check the web site of the Public Library of Science, a peer-reviewed, open-access journal published by a nonprofit organization committed to making scientific and medical literature a public resource. It's such a fine idea that I want to support it, even though so far most of the articles are on medical or biological topics that are obscure to me. But recently I found an article titled "Mathematics Is Biology's Next Microscope, Only Better; Biology Is Mathematics' Next Physics, Only Better," by Joel E. Cohen of the Laboratory of Populations, Rockefeller and Columbia Universities, based on his keynote address at the NSF-NIH Joint Symposium on Accelerating Mathematical-Biological Linkages (http://biology.plosjournals.org/).

Cohen isn't merely saying that mathematics will be important in making new discoveries in biology. He's saying that the challenges in biology will drive the development of new mathematics. Most intriguing to me is the need in biology to deal with multilevel systems, in which events happening at higher or lower levels can have consequences on the current level. When you deal with cells within organs within people in human communities in physical, chemical, and biotic ecologies, and causality can stretch across all levels in nonlinear ways, well, maybe you should read the article. I know I found it inspiring.

DDJ


Solution to "Optimal Farming," DDJ, May 2005.

1. Observe that the minimum circumscribing circle for a square having side 2L has radius L√2 and area π×2×L², so the extra area is (π−2)×2×L². This is divided equally among the four sides, so each side gets (π−2)×L²/2 extra area. In the design of Figure 1, the small squares have side lengths 2L where L=1/4. The circles around them have 10 extra sides having a total area of 5×(π−2)×(1/16). The large square has L=1/2, so it has (π−2)/2 extra area. The total cost is slightly less than (13/16)×(π−2).

2. For the case where the circles all must have the same radius, try the design where there are four squares, each having side length 2L. Two are illustrated in Figure 2. Then there are circles covering each of these squares, so the radius of each circumscribing circle is L√2. (Note that the middle of the rectangle is 1−2L from each side square.) In addition, there is a circle having radius L√2 whose center is the middle of the rectangle. That circle reaches the near-center corner of the corner square because that distance is (by the Pythagorean theorem) √((1−2L)²+(1/2)²) = √(1−4L+4L²+1/4) = √(5/4+4L²−4L). The length is L×√2. Squaring that gives 2L². So 2L² = 4L²−4L+5/4, yielding the quadratic equation 2L²−4L+5/4 = 0. So, L = (4±√(16−10))/4 = 1−(√6)/4.
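Written out in standard notation, the algebra in that final step is:

```latex
\sqrt{(1-2L)^2 + \left(\tfrac{1}{2}\right)^2} = L\sqrt{2}
\;\Longrightarrow\; 4L^2 - 4L + \tfrac{5}{4} = 2L^2
\;\Longrightarrow\; 2L^2 - 4L + \tfrac{5}{4} = 0
\;\Longrightarrow\; L = \frac{4 \pm \sqrt{16-10}}{4} = 1 \pm \frac{\sqrt{6}}{4}
```

and only the smaller root, L = 1 − √6/4 ≈ 0.388, is less than 1/2 and therefore geometrically admissible.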

M.S. Gopal contributed greatly to these solutions.

Dr. Ecco Solution

[Figures 1 and 2 show the two covering designs; the dimension labels in the figures are 1, 2, 1/2, and 1−√(6)/4.]

Figure 1. Figure 2.

Page 68

Cryptography revolves around the keeping of secrets: knowing things that others should not. Although the only certain way of keeping a secret is to never tell it to anyone else, useful secrets require at least two people in the know. Keeping everyone else out of the know is the hard part.

Understanding the hardware and algorithms of a particular crypto scheme is easy compared to the challenge of actually deploying a secure system. Classic security can lock down point-to-point communications between a few relatively secure locations. We're discovering the inadequacy of those techniques applied to widely deployed embedded systems.

In recent months, I've read of several incidents that show how secrets sneak out of their crypto containers. Embedded systems with long lifetimes must keep secrets from their users, so these attacks reveal only the leading edge of the wedge.

Cracking RFID
In my January column, I related the story of how to get a month's free gas from a stolen SpeedPass token. Shortly after that column went read-only, a group of researchers from Johns Hopkins and RSA Laboratories described their SpeedPass crypto crack: You can now get a month's free gas without even touching the victim's keyring.

The SpeedPass payment system lets you buy gasoline by simply holding an RFID token near a reader in the fuel pump. Each token contains a Texas Instruments Digital Signature Transponder (DST) chip powered by the radio-frequency energy from the reader, so it can operate without batteries inside a sealed housing. Each DST has a unique, hard-coded 24-bit serial number and a programmable 40-bit signature (essentially a crypto key), a radio transponder, some special-purpose crypto hardware, and barely enough compute power to make it all work.

The 24-bit serial number identifying the token links to an account entry in ExxonMobil's customer database, which contains the 40-bit signature for that token. The RFID reader and token engage in a challenge-response handshake to verify that both sides know the same signature without exposing it to public view. If they agree, the reader activates the pump, the charge appears in your database entry, and away you go.
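The handshake idea can be sketched generically. The token's actual cipher is proprietary (reverse-engineering it was part of the researchers' work), so the keyed function below is a toy stand-in of my own, not TI's algorithm; it only illustrates the protocol shape:

```c
#include <stdint.h>

/* Toy keyed one-way function standing in for the DST's crypto hardware:
   the token computes a response from the shared 40-bit signature and a
   random challenge from the reader. NOT a real cipher. */
static uint32_t toy_respond(uint64_t signature, uint32_t challenge)
{
    uint64_t x = signature ^ challenge;
    x *= 0x9E3779B97F4A7C15ull;  /* arbitrary mixing constant */
    x ^= x >> 29;
    return (uint32_t)x ^ (uint32_t)(x >> 32);
}

/* Reader side: recompute the expected response from the signature on
   file for this serial number; a match activates the pump. The signature
   itself never crosses the air interface. */
static int verify(uint64_t signature, uint32_t challenge, uint32_t response)
{
    return toy_respond(signature, challenge) == response;
}
```

The security of such a scheme rests entirely on the keyed function resisting key recovery from observed challenge/response pairs, which is exactly where the real 40-bit DST fell to brute force.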

In pragmatic terms, this might not be a whole lot faster than swiping a credit card through the adjacent card reader, but it's attractive enough to be marketable. You can also buy junk food and other necessities using a SpeedPass, which might be faster than cycling your wallet at the head of a line at the counter.

The 40-bit signature is the system's only secret, despite a few other flourishes. Because formulating the challenge-response handshake requires knowledge of the signature, there should be no way to extract the signature without knowing it in the first place. Note that the customer does not know the secret, and in fact, the SpeedPass system design depends on that lack of knowledge.

However, because the customer possesses the hardware containing the signature, the initial attack can take as long as required. It turns out that extraction uses off-the-shelf hardware, a dash of cryptanalysis, and computational brute force. I won't repeat the researchers' analysis here, other than to observe that it's a great example of what notable experts can accomplish with a gaggle of talented graduate students.

The end result is a hardware system that can extract the 40-bit digital signature from any SpeedPass token without first knowing the signature, given just two challenge-response transactions. A token can handle eight transactions per second, so acquiring the data is essentially instantaneous and, because it's RF, doesn't require physical possession of the token. A few hours of computation on a special-purpose parallel processor, which costs a few kilobucks, suffice to crack the crypto and produce the signature.

The researchers project that some engineering and a touch of Moore's Law can increase the range to a few feet, stuff the machinery into an iPod-size case, and shrink the cracking time to a few minutes. The attack scenarios include sniffing SpeedPass tokens at a valet parking station (just

Security Remeasured

Ed Nisley

E M B E D D E D S P A C E

Ed's an EE, PE, and author in Poughkeepsie, NY. Contact him at [email protected] with "Dr Dobbs" in the subject to avoid spam filters.


Page 69

wave the victim's keyring near your pocket) and walking the length of a subway car or mall holding a neatly wrapped package containing a big antenna.

The SpeedPass system's fraud-detection logic trips on excessive purchases, impossible locations, or atypical usage. However, if you harvest a few thousand tokens and use each one exactly once, you can probably get free gas for a long, long time.

While longer keys and tougher crypto may delay the inevitable cracks, the basic principle seems to be true: You cannot keep a secret from someone if you give them the secret. Seems obvious, doesn't it?

As is typical of widespread embedded systems, quickly replacing or updating the entire SpeedPass infrastructure to use better token hardware is essentially impossible. The only short-term defense against this type of attack involves wrapping the token in aluminum foil to shield it from cracked readers.

Designers of always-on devices, takenote!

Cracking Trust
My February column discussed the mechanics behind the Trusted Platform Module (TPM) found in some recent laptops and desktops. Several readers pointed out that the "Trusted" part of the name has a peculiarly Orwellian definition: In actual fact, many software and media companies do not trust their customers. The companies depend on hardware to increase the effort required by customers who might otherwise easily steal their software, data, or (shudder) music.

The essential TPM feature is a secure hardware-storage mechanism, typically a single-chip micro or a few chips within an armored package, holding crypto keys, digital signatures, or hash values. Well-validated protocols allow external access only by trusted programs with the appropriate secrets of their own. You can't even examine the information without destroying the TPM, quite unlike secrets stored on disk.

Software running on the PC can validate itself using hashes stored within the TPM, authenticate itself to programs running on external servers using digital signatures from the TPM, then download and store data that requires further crypto keys for access. As long as the secrets stored within the TPM remain unknown to the PC's user, the whole chain of trust from musical source to hard drive remains unbroken.
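The chain-of-trust idea can be modeled loosely. A TPM accumulates measurements in platform configuration registers by hashing each new measurement together with the previous register value; this toy sketch is my own illustration, with a 64-bit FNV-1a hash standing in for the cryptographic hash a real TPM uses, and it shows why the accumulated value pins down both the measurements and their order:

```c
#include <stdint.h>
#include <stddef.h>

/* Toy non-cryptographic hash (FNV-1a) standing in for SHA in this model. */
static uint64_t fnv1a(const void *data, size_t len, uint64_t h)
{
    const uint8_t *p = (const uint8_t *)data;
    while (len--) { h ^= *p++; h *= 0x100000001B3ull; }
    return h;
}

/* Model of an "extend" operation: new register value = hash(old value
   concatenated with the new measurement). The register can never be set
   directly, only extended, so the final value commits to the whole
   sequence of measured components. */
static uint64_t pcr_extend(uint64_t pcr, const void *measurement, size_t len)
{
    uint64_t h = fnv1a(&pcr, sizeof pcr, 0xCBF29CE484222325ull);
    return fnv1a(measurement, len, h);
}
```

Extending with "boot loader" then "OS" yields a different final value than the reverse order, which is what lets a verifier detect a tampered boot sequence.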

Sounds iffy to you, too, huh?

To build a system using Trusted Platform Modules, manufacturers must have access to documentation and sample parts long before production begins. The Inquirer reports that Infineon will not support small manufacturers or system integrators, claiming that they will supply TPMs only to "qualified" customers. The story doesn't go into much detail, so we're left with suppositions rather than facts.

The researchers who cracked the SpeedPass had some support from Texas Instruments in the form of development kits and sample DST tokens, as well as their own SpeedPass tokens and car keys.

They did not have access to the DST's internal logic diagrams or other proprietary information, and in fact, discovering how the DST worked formed a major part of the effort.

Infineon may believe in "security through obscurity" or there may simply be licensing issues that we don't know about. In any event, if the security of the whole Trusted Platform Module infrastructure depends on keeping the documentation out of the hands of the bad guys, it doesn't stand a chance.

Cracking The Wall
Perhaps the single most obvious (and most touted) feature of Linux systems is their immunity to Windows security flaws. Linux and GNU software may present a compelling TCO justification, provide generally higher reliability software, and reduce the time to get bugs fixed, but security seems to be driving a broad-based change of opinion in their favor.

One unfortunate side effect reduces Linux system security to sound bites: "Linux is immune to viruses" and "Crackers don't bother Linux systems" and so forth. While Linux eliminates many of the common exposures, it cannot completely solve the problem.

One member of the Mid-Hudson Linux User Group noticed that his system had begun behaving strangely and asked for advice. His first post to the LUG's mailing list was titled "Have I been cracked?" and noted that:

I don't recall making an account called 'systens', but apparently, someone ssh'd into it from 200.96.xxx.yyy. 'host' returns this info about that address…

yyy.xxx.96.200.in-addr.arpa domain name pointer 200-096-xxx-yyy.cpece7021.dsl.brasiltelecom.net.br.

brasil telecom? uh oh.

Mainstream Linux distros install and enable software firewalls by default during installation. In this case, however:

I am running a firewall through my physical router. The inbound ports I open are for ssh, http, https, smtp, pop, 8080, and 81 for apache tomcat, ftp, and dns. […] I'm not running a software firewall on the box itself though.

By default, hardware firewalls block incoming packets that are not related to previous outbound messages. If you are running a server on your system, however, the firewall must pass incoming connections to a particular port directly to the server, which means the server is directly connected to the Internet. Any security flaw in the server provides a direct link into the system:

Now that you mention it. I had a few CMS packages running on there. Namely, tikiwiki, drupal, and blog::cms. I locked down one of the tikiwiki instances […]. The other instance was open to the public for use and anyone could use it - not as admin, but with rights to modify the wiki and add forum entries, etc.

Tiki systems let users create and update web pages from their browsers, which means anyone with a web browser can change files on the system. Any security flaw provides an opening:

Saw this vulnerability in the tikiwiki web site of mid January. […] The vulnerability, initially, lets a user get a sort of shell on the server under the web server user. From then on, it is just a matter of time.

With the Web and tiki servers exposed to the Internet, your "users" can, indeed, be anyone:

I bet that was it. I'll check my logs tonight when I get home. The 'systens' user account apparently was created on the 23rd - just one short week after this flaw was apparently reported.

Once an intruder has gained access to your system, becoming root isn't all that difficult, and after the intruder is root, all manner of things become possible. If you don't use secure passwords on all your internal accounts, things become even more interesting:


“The 40-bit signature is the system’s only secret”

Page 70

I ran across this in my "/var/log/auth.log" file…

Feb 19 22:13:03 debian sshd[3020]: Accepted password for root from 192.168.1.1 port 3064 ssh2

This is curious because 192.168.1.1 is my router. […] Is this just a bug in the router (I DO have NAT enabled) or something more I should perhaps worry about?

About a year ago, I described the process of cracking a router and uploading new firmware. I also observed that most users never change the admin account's password. That turns out to be a necessary, but not sufficient, security step:

My firmware is up-to-date, and I don't have remote access turned on. […] However, I did use the same password for one of my accounts as I did for the router setup. So, in theory if that rootkit could crack passwords it could also allow access to the router.

The consensus advice for cleaning up after an intrusion boils down to two steps: reformat the hard drive and reinstall everything from scratch. You cannot assume anything about the compromised system; any command or program may do anything, including spoofing innocent return values.

In fact, you cannot assume anything about the status of any systems connected to the local network. In this case, the compromised router provides a direct link to the internal network, so restoring the compromised system wouldn't eliminate the vulnerability.

The lesson to be learned from this adventure is the inadequacy of simply keeping the patches to an Internet-facing system up to date. You must also monitor the system logs, become familiar with “normal” operation, and track down any anomalies to their source. That this level of involvement far exceeds the abilities or interests of most PC users, alas, goes without saying.

A Windows XP SP1 system without a firewall will be compromised in minutes, while a firewall completely eliminates incoming attacks. A firewall with open ports requires meticulous system security practices on the systems exposed to the Internet. In the end, however, firewalls and up-to-the-minute patches form just the first line of defense. Attention to detail must provide defense in depth.

Pop Quiz: What do we do with always-connected embedded systems with no user interface? Essay: Describe the user manual’s section on network security.

Reentry Checklist

More on the SpeedPass RFID tag cracking adventure at http://rfidanalysis.org/. You should definitely read their preliminary research paper at http://rfidanalysis.org/DSTbreak.pdf, which does not provide quite all the details required to crack SpeedPasses on your own.

The Inquirer article on Infineon’s TPM policy is at http://www.the-inquirer.com/?article=21113. Everything Infineon has to say about its TPMs, at least to us, seems to start at http://www.infineon.com/cgi/ecrm.dll/ecrm/scripts/prod_ov.jsp?oid=29049.

An overview of Digital Rights Management and online music from a Canadian perspective is at http://www.pch.gc.ca/progs/ac-ca/progs/pda-cpb/pubs/online_music/tdm_e.cfm, which uses a different definition of “TPM” than you see here.

Magnatune carries music released under the Creative Commons license at http://www.magnatune.com/, entirely without DRM. Streamtuner for Linux simplifies access to myriad audio streams at http://www.nongnu.org/streamtuner/, which is completely different from whatever’s offered on the stub page at http://www.streamtuner.com/.

Writeups of the Linksys router flaw are at http://secunia.com/advisories/11754/ and http://www.wi-fiplanet.com/news/article.php/1494941. An experiment measuring “system time to live” on the Internet is at http://www.avantgarde.com/xxxxttln.pdf. Kevin Mitnick consulted on the study, should that name ring a bell.

Thanks to Alan Snyder, Sean Dague, Renier Morales, and MHVLUG for allowing me to slice up their threads. The original archives are at http://www.mhvlug.org/.

DDJ



The Spyware menace has gone beyond all reason. Spyware costs time and money. It costs you directly, and it destroys Aunt Minnie’s confidence in the Internet. Something must be done; indeed, something will be done, because if things go on as they are, the Internet itself is doomed.

And yet, despite the increasing Spyware/Adware/Malware assaults, there are spyware companies out there with lawyers sending warning messages to anyone who calls their malware by its right name. See, for example, the story at http://www.ahbl.org/notices/iSearch.php.

We have tools that can partially protect us against this plague, but they don’t entirely work, and they take time and effort to use. Meanwhile, the lawyers are having a field day defending the rights of their clients to invade and take over your computer. They claim that you have agreed to let them do it, and that they have every right.

Here is a typical “license agreement” that supposedly sane users have in theory “accepted”:

2. Functionality—Software delivers advertising and various information and promotional messages to your computer screen while you view Internet web pages. iSearch is able to provide you with Software free of charge as a result of your agreement to download and use Software, and accept the advertising and promotional messages it delivers.

By installing the Software, you understand and agree that the Software may, without any further prior notice to you, automatically perform the following: display advertisements of advertisers who pay a fee to iSearch and/or it’s [sic] partners, in the form of pop-up ads, pop-under ads, interstitial ads and various other ad formats; display links to and advertisements of related web sites based on the information you view and the web sites you visit; store nonpersonally identifiable statistics of the web sites you have visited; redirect certain URLs including your browser default 404-error page to or through the Software; provide advertisements, links or information in response to search terms you use at third-party web sites; provide search functionality or capabilities; automatically update the Software and install added features or functionality or additional software, including search clients and toolbars, conveniently without your input or interaction; install desktop icons and installation files; install software from iSearch affiliates; and install Third Party Software.

In addition, you further understand and agree, by installing the Software, that iSearch and/or the Software may, without any further prior notice to you, remove, disable or render inoperative other adware programs resident on your computer, which, in turn, may disable or render inoperative, other software resident on your computer, including software bundled with such adware, or have other adverse impacts on your computer.

I submit that no one in his right mind has ever agreed to this; that the only way it was “agreed” to was by stealth, not through anything like informed consent. There may be, out among the Aunt Minnies of this world, one or two who actually saw something like this and “agreed”; but how many DDJ readers would accept such a thing?

Drive-By Spyware

My own case is illustrative. I had a report from Associate Editor Dan Spisak on his not very successful attempts to remove the iSearch Toolbar and a number of other infections from a friend’s machine. He tried everything, and in the course of his efforts discovered that, while it goes without saying that Internet Explorer was hijacked, even the Firefox browser was affected.

I collected notes on this and other spyware subjects in OneNote on my TabletPC, then started another column section on Bill Gates’s recent speech at the Governor’s Conference. When I did a Google search for a particular quote to use, I found one likely source at a place called “Study World.” Fair warning: If you want to go look there, set your browser security level to HIGH and don’t agree to anything.

When I went to that location, up popped a warning from Microsoft that I didn’t read closely, and in a moment of sheer madness I clicked OK. Microsoft Anti-Spyware instantly popped up to warn me that something was trying to infect my system. Other warnings came thick and fast. Meanwhile, though, my Internet Explorer browser changed home pages. Popup advertisements of every kind began to appear. My system was in real trouble.

Microsoft Anti-Spyware said it was blocking this and that (about six messages, all stacked). I told it in each case to block the stuff, and I closed the browser. Eventually that flurry of messages stopped, but when I ran Microsoft Anti-Spyware, it found I was infected with WinTools, Toolbar Websearch, Network Essentials Browser Modifier, and CYDOOR adware. Microsoft Anti-Spyware offered to remove them, trundled, and then said it had removed them. Then Microsoft Anti-Spyware wanted me to reset the machine.

I wasn’t sure I wanted to do that just yet. Suspiciously, I ran Microsoft Anti-Spyware again. It produced precisely the same result, finding the same infections. I ran AdAware and Spybot Search and Destroy. They didn’t find anything wrong at all. Clearly, the infection had managed to bypass or compromise all my antispyware tools.

Now I was sure I was in trouble. I quickly opened a command window and used XCOPY to copy off to a thumb drive all the files in the places I keep documents, using the /e /s /d /y switches to get only those I hadn’t backed up recently. (I keep batch files for just that purpose.) With that done, I was ready to battle for possession of my machine.
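The idea behind those XCOPY switches (walk every subdirectory, copy only files newer than the existing backup, don’t prompt) can be sketched in a few lines of cross-platform Python. This is a minimal sketch of the same incremental-copy technique, not the actual batch files described above; the function name and paths are hypothetical.

```python
import shutil
from pathlib import Path

def incremental_backup(src: Path, dest: Path) -> int:
    """Copy the tree at src into dest, but only files that are missing
    from the backup or newer than the backup copy (roughly xcopy's /d);
    /e and /s are mimicked by walking every subdirectory."""
    copied = 0
    for path in src.rglob("*"):
        if not path.is_file():
            continue
        target = dest / path.relative_to(src)
        if target.exists() and target.stat().st_mtime >= path.stat().st_mtime:
            continue  # backup already up to date; skip, like /d
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 preserves timestamps
        copied += 1
    return copied
```

Run twice in a row, the second pass copies nothing, which is what makes the batch file fast enough to reach for in an emergency.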

Drive-By Spyware & Other Horrors

Jerry Pournelle

CHAOS MANOR

Jerry is a science-fiction writer and senior contributing editor to BYTE.com. You can contact him at [email protected].



Removing WinTools

The first move was to use Microsoft Anti-Spyware one more time and this time reboot. As I suspected, that did nothing at all: Although Anti-Spyware was very unhappy, WinTools was still in there, complete with the directory Program Files/Common Files/Wintools that WinTools creates and Microsoft Anti-Spyware thought it had deleted. Deleting that directory does nothing until you get the actual program that regenerates it; this is considerably harder because it hides deep in your file system in another directory entirely (in my case it was hiding in the System32/DRIVERS subdirectory, but infections use different hiding places). In fact, eliminating the source generator for the infection files is beyond the capabilities of any automatic program I can find.

Time to go to hell: that is, the web site PCHell (http://www.pchell.com), particularly to http://www.pchell.com/support/wintools.shtml, where there are complete instructions for getting rid of the Wintools infection. Well, almost: I would never have got rid of this thing without the PCHell folks, and they have my gratitude, but even their instructions didn’t do it all. They also led me to the wonderful HijackThis program (http://www.spychecker.com/program/hijackthis.html) which, despite the ominous name, is one great program. While you are thinking about it, go download that program so that it is on your computer. You may never need it, but if you do need it you will want it badly.

The solution to exorcising Wintools involves rebooting in Safe Mode. Once in Safe Mode, open a command window (or Norton Commander) and eliminate any directory called Wintools. Then edit the registry to remove every reference to Wintools. That done, use HijackThis several times. HijackThis will find things you don’t want removed, such as the Google toolbar, and offer to do other things you don’t want done, because its goal is to get your system as close to the default registry configuration as possible, meaning that you want to employ some intelligence when using the program. On the other hand, better safe than sorry. You can always reinstall things you want to keep if you have accidentally eliminated them.

When you run HijackThis, you get a list of registry entries the program doesn’t care for. There is an option to check items on that list one at a time and ask for more information. One item might be “This is a change from the default home page for your browser.” Another might be a reference to the Yahoo Toolbar. In each case, HijackThis has an option for “fixing” the problem. The fix in general will be to eliminate the registry key, or to restore it to the Windows XP default value.



I let HijackThis fix everything I didn’t understand. Once I had used Regedit and HijackThis, but before I left Safe Mode, I ran AdAware and Spybot Search and Destroy. Neither found anything but cookies, which Spybot wanted badly to remove for me. Then I ran Microsoft Anti-Spyware, and lo! it found references to CYDOOR, an advertising robot. (CYDOOR has a web site; you can go there and see what they claim to be doing. It won’t say anything about stealth infections. Before I went over there, I made sure my browser security setting was “HIGH,” and I wished there was a level above that; I’d call it “PARANOID.”)

I let Microsoft Anti-Spyware remove the CYDOOR references, then I went through the registry searching for Wintool, and found several more references. I think they were pretty well harmless by then, but I deleted them all, and then did a search for CYDOOR, but found nothing. Then I ran HijackThis one more time.

This time, HijackThis found nothing I didn’t understand. Neither did Microsoft Anti-Spyware.

All was well. I shut down the system, turned it off, counted to 60, and brought it up again. All is still well according to freshly reinstalled copies of AdAware, Spybot Search and Destroy, and Microsoft Anti-Spyware. If any spyware lurks in here, it’s not doing anything. HijackThis lists all running services, and there are none I don’t expect.

Trusting the Enemy

Many Adware/Spyware/Malware programs will direct you to a “removal site,” where you can download a program that, they promise, will remove their Adware and restore your system to its pristine condition.

To do that, you will have to run an executable program provided by a company that snuck up on you and installed its wares by stealth.

If you think this is a good idea, please contact me. I have a bridge I want to sell you.

Lessons Learned

First and foremost: When your system tells you something, listen. In this case, I was actually warned that something wanted to install, and I let it, thinking that Norton Anti-Virus and Microsoft Anti-Spyware would prevent any real problems. They didn’t. By the time Anti-Spyware got on the job, the damage had been done, and Norton didn’t do a thing.

Second: Least done, soonest mended. If you think you’ve been infected, stop what you are doing and deal with it. Don’t give it a chance to do any more damage. Pull your Ethernet connection plug and start disinfecting.

Third: Seriously consider using Firefox rather than Internet Explorer. I say this although there is evidence that some malware understands Firefox. I don’t make this an absolute because this malware didn’t exploit a hole in Internet Explorer at all. It talked me into letting it install. I am not sure what I thought I was allowing it to do, but I did okay something. I have no reason to believe Firefox can protect me against stupidity. If you let an executable program download and run, browser theft is likely to be the least of your problems.

Fourth: If you are considering participating in various software- and music-swapping schemes that give any kind of control over your system to the scheme, don’t do it. Malware and viruses and worms, O my! ride in with those file-swapping schemes. There ain’t no such thing as a free lunch. I got bit by a drive-by browser infector. Bringing in file-swapping schemes practically invites infection.

Fifth: Nothing finds it all. The Microsoft Anti-Spyware program is pretty good, but you’ll also want to keep AdAware and Spybot Search and Destroy. Don’t run them at the same time.

Finally: Well, when you come down to it, it cost me an hour. In my case, it gave me something to write about. But it wouldn’t have happened if I’d been doing all this on a Mac or a Linux box.

Windows XP 64-Bit Edition Is Coming!

Earlier this month, Microsoft released Release Candidate 2 of Windows XP 64-Bit Edition for AMD and Intel processors, along with a confirmation that the product would be released to retail stores by the end of April. If you happen to have an AMD 64-bit processor, such as the Opteron or Athlon 64, then this is the release you will want to experiment with before the retail version appears. We installed it on our NForce3 250-based motherboard without incident. More drivers were included in this release of the OS, as it managed to install drivers for the Marvell gigabit Ethernet chipset onboard this time, along with chipset drivers. (64-bit XP Release Candidate 1 didn’t recognize this chip.)

We can also successfully report that going to the Start Menu and selecting Windows Update now works as it should, without giving cryptic error messages. ATI and NVIDIA both have recent builds of drivers available for their video cards and motherboard chipsets for XP 64. Give the current release candidate a try if you have been waiting for a working version of the OS to test. This time it works pretty well.

Winding Down

I had expected the Game of the Month to be the new release of Sid Meier’s Pirates!. I very much enjoyed the game on my early Macintosh, and was disappointed when I couldn’t get any of the PC or Windows versions to run well. Consequently, when the new one came out, I leaped at it.

Alas, it has been a disappointment to me, largely because there is no game-speed control. Now, I realize I can jigger one up: I can run it on a slow machine and employ one of those programs that waste cycles, but I don’t really want to do that; perhaps it’s mere funk. For me, though, the game plays too fast, so that it’s more like a shooter than the delightful combination humor/role-playing game that the original Pirates! was.

Clearly, my view isn’t shared by all. The game has many excellent reviews, and indeed, except for the unchangeable too-fast game speed, I found little to dislike. It retains much of the flavor of the old game, but with better graphics. There’s a lot to like about it, but I find that it tires me to play it for long. Ah well, back to EverQuest II, except that I find I am putting too much time into that.

The first computer book of the month is Kathy Jacobs and Bill Jelen’s Life on OneNote (Holy Macro Press, 2004). If you are a TabletPC user, or thinking of becoming one, you’ll want Michael Linenberger’s Seize the Work Day: Using the TabletPC to Take Total Control of Your Work and Meeting Day (New Academy, 2004). It has a wealth of examples of TabletPC applications and how to use them. If you are building your own equipment, you already know you need Bob and Barbara Thompson’s Building the Perfect PC (O’Reilly & Associates, 2004). You’ll also want their PC Hardware Buyer’s Guide (O’Reilly & Associates, 2005) to help you choose components.

DDJ




PROGRAMMER’S BOOKSHELF

The older I get, the more I find myself trying to make sense of things. Crazy, I know, but I don’t just want my code to work anymore; I want to understand how it works from top to bottom, and how it fits into the grand scheme of things.

Judging from this month’s books, I’m not the only one who feels this way. On the top of the list is Joel Spolsky’s Joel on Software, which collects some of the witty, insightful articles he has written over the past four years. If you’re a developer, Spolsky’s weblog is a must-read: His observations on hiring programmers, measuring how well a dev team is doing its job, the API wars, and other topics are always entertaining and informative. Over the course of 45 short chapters, he ranges from the specific to the general and back again, tossing out pithy observations on the commoditization of the operating system, why you need to hire more testers, and why NIH (the “not-invented-here” syndrome) isn’t necessarily a bad thing. Most of this material is still available online, but having it in one place, edited, with an index, is probably the best $25.00 you’ll spend this year.

Budi Kurniawan and Paul Deck’s How Tomcat Works is a much narrower book, but seems to be driven by the same need to make sense of things. The book delivers exactly what its title promises: a detailed, step-by-step explanation of how the world’s most popular Java servlet container works. The authors start with a naïve web server that does nothing except serve static HTML pages until it’s told to stop. From that humble beginning, they build up to a full-blown servlet container one feature at a time. Each time they add code, they explain what it’s doing and (more importantly) why it’s needed. Their English is occasionally strained, and there were paragraphs I had to read several times to understand, but this book is nevertheless an invaluable resource for servlet programmers who want to know more about their world.
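The book’s code is Java, of course; purely to illustrate what that “humble beginning” looks like, here is a sketch (mine, not the authors’) of a server that does nothing but answer GET requests with static files. Everything in it is illustrative, and it does none of the path sanitizing a real server needs.

```python
import socket
from pathlib import Path

def build_response(request_line: str, docroot: Path) -> bytes:
    """Map a request line like 'GET /index.html HTTP/1.1' to a raw
    HTTP response for the corresponding file under docroot."""
    parts = request_line.split()
    if len(parts) < 2 or parts[0] != "GET":
        return b"HTTP/1.1 400 Bad Request\r\n\r\n"
    target = docroot / parts[1].lstrip("/")
    if not target.is_file():
        return b"HTTP/1.1 404 Not Found\r\n\r\n"
    body = target.read_bytes()
    header = f"HTTP/1.1 200 OK\r\nContent-Length: {len(body)}\r\n\r\n"
    return header.encode("ascii") + body

def serve(docroot: Path, port: int = 8080) -> None:
    """Accept one connection at a time and answer with a static file."""
    with socket.socket() as srv:
        srv.bind(("127.0.0.1", port))
        srv.listen()
        while True:  # runs until told to stop, like the book's first server
            conn, _ = srv.accept()
            with conn:
                request = conn.recv(4096).decode("latin-1")
                first_line = request.split("\r\n", 1)[0]
                conn.sendall(build_response(first_line, docroot))
```

Everything the book then adds, threading, servlet loading, connectors, grows out of exactly this loop, which is what makes its one-feature-at-a-time structure work.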

John Goerzen’s Foundations of Python Network Programming is superficially different, but at a deeper level, very similar. Where Kurniawan and Deck look at one way to handle one protocol, Goerzen looks at how to handle several common protocols, including HTTP, SMTP, and FTP. Goerzen also doesn’t delve as deeply into how servers work, concentrating instead on how to build clients that use these protocols.

The similarity lies in the approach. As with How Tomcat Works, Goerzen builds solutions to complex problems one step at a time, explaining each addition or modification along the way. He occasionally assumes more background knowledge than most readers of this book are likely to have, but only occasionally, and makes up for it by providing both clear code and clear explanations of why this particular function has to do things in a particular order, or why that one really ought to be multithreaded. I’ve already folded down the corners of quite a few pages and expect I’ll refer to this book often in the coming months.

DDJ

Greg on Software Books

Gregory V. Wilson

Greg is a DDJ contributing editor and can be contacted at [email protected].


Joel on Software
Joel Spolsky
APress, 2004
362 pp., $24.99
ISBN 1590593898

How Tomcat Works
Budi Kurniawan and Paul Deck
BrainySoftware.com, 2004
450 pp., $49.99
ISBN 097521280X

Foundations of Python Network Programming
John Goerzen
APress, 2004
512 pp., $44.99
ISBN 1590593715

Programmer’s Bookshelf

Submissions to Programmer’s Bookshelf can be sent via e-mail to [email protected] or mailed to DDJ, 2800 Campus Drive, San Mateo, CA 94403.


Electric Cloud has sponsored the creation of the GNU Make Standard Library (GMSL), aimed at providing a common set of tools for all users of GNU Make. The GMSL includes list and string manipulation, integer arithmetic, associative arrays, stacks, and debugging facilities. The GMSL is released under the General Public License and is hosted on SourceForge. Electric Cloud also offers Electric Cloud 2.1, designed to reduce build times by distributing the software build in parallel across scalable clusters of servers.

Electric Cloud Inc.
2307 Leghorn Street
Mountain View, CA 94043
650-968-2950
http://gmsl.sourceforge.net/

Pantero Software is designed to create rules and services that deliver valid data in integration projects. You use graphical tools to import data schemas or models, map data from one representation to another, define rules to ensure validity and consistency, and define error-handling behavior, with all of it captured as metadata. The Pantero runtime software implements these rules as web services or Java controls. Pantero Release 1.3 supports Eclipse and the JBoss application servers, as well as software from BEA and IBM.

Pantero Software
300 Fifth Avenue, Suite 630
Waltham, MA 02451
781-890-2890
http://www.pantero.com/

The 7.2 release of Amzi! Prolog + Logic Server incorporates the Eclipse 3.0 IDE to provide editing, debugging, and project management for logic programs, embedded logicbases, and remote logicbases on web servers. Amzi! 7.2 adds graphical icons for breakpoints indicating the four debug ports: call, fail, redo, and exit. In conjunction with the call stack, this function illustrates advanced logic-programming features such as backtracking, recursion, and unification. Amzi! is available on Windows, Linux, Solaris, and HP/UX.

Amzi! Inc.
47 Redwood Road
Asheville, NC 28804
828-350-0350
http://www.amzi.com/

The EngInSite PHP Editor is an IDE that automatically recognizes public and private functions of the class. EngInSite PHP Editor is “HTML-aware” and comes with CSV support. Its architecture emulates HTTP server behavior in the Editor’s output window. It also offers a breakpoint option and an option to upload PHP scripts to a server. SFTP, SSH1, and WebDAV are supported, as well as FTP: The program connects to a remote server on its own, automatically uploading only new or changed files.

LuckaSoft
Anholter Strasse 2B
D-46395 Bocholt, Germany
+49 2871 233 8 01
http://www.enginsite.com/

Recursion Software’s C++ Toolkits, a collection of C++ class libraries, has added support for Sun Microsystems’ Solaris 10 operating system, along with Linux, AIX, Tru64, HP-UX, IRIX, Red Hat Linux, Mac OS, VxWorks, Windows, PocketPC, and Suse/Linux. The C++ Toolkits enable development of multithreaded distributed applications, and include advanced class libraries for developing and deploying transactions in distributed and service-oriented architecture (SOA) environments.

Recursion Software Inc.
2591 North Dallas Parkway, Suite 200
Frisco, TX 75034
800-727-8674

ActiveSMS is an ActiveX DLL that lets you send and receive SMS through GSM/GPRS terminals. It handles voice calls and the phonebooks of both SIM and terminal. Integrable into any application that supports COM technology, ActiveSMS provides an interface that exports “events” and “methods” to handle reception and forwarding of text, Flash, or binary messages, as well as multiple forwarding, voice-call handling, and status-report handling. It supports serial connections, IrCOMM, IrDA, USB, and Bluetooth, and is able to communicate with the terminals both in “PDU mode” and in “Text Mode.”

Net Sphaera S.n.c.
Via Torre Della Catena, 150
Benevento, Italy
http://www.activesms.biz/eng

Helixoft has released VBdocman .NET 2.0, a Visual Basic .NET Add-In for generating technical documentation from VB.NET source files. It parses source code and automatically creates a table of contents, index, topics, cross references, and context-sensitive help. Users can create their own formats of output documentation: The predefined output formats are Help 2 (the Microsoft help technology used in Visual Studio .NET), CHM, RTF, HTML, and XML.

Helixoft
Tomasikova 14
080 01 Presov, Slovakia
http://www.vbdocman.com/net/

IT GlobalSecure has updated SecurePlay Version 2.1, its multiplayer state engine. SecurePlay implements a suite of cryptographic protocols to stop many kinds of cheating and piracy, and offers a programming interface for multiplayer, networked game development. SecurePlay 2.1 includes integrated networking and interoperability between Java and C++ on Windows and Linux. You receive a complete copy of the documented source code in Flash, Java, J2ME, or C++, with a PS2 release forthcoming.

IT GlobalSecure Inc.
1837 16th Street NW
Washington, DC 20009-3317
202-332-5878
http://www.secureplay.com/

Graphics & Scripting Tools has announced Vector Graphics ActiveX, a fully fledged vector-graphics platform for incorporating 2D vector graphics into an application-development cycle. The component is compatible with Visual C++, Visual Basic, and Delphi development tools and is designed to create professional-quality technical drawings, illustrations, charts, and diagrams. By establishing links between graphic shapes and real objects, the developer can connect to OPC servers to watch and modify processes in real time.

Graphics & Scripting Tools LLC
Kosm Strasse 10
394055 Voronezh, Russia
http://www.script-debugger.com/

DDJ

OF INTEREST


Dr. Dobb’s Software Tools Newsletter

What’s the fastest way of keeping up with new developer products and version updates? Dr. Dobb’s Software Tools e-mail newsletter, delivered once a month to your mailbox. This unique newsletter keeps you up-to-date on the latest in SDKs, libraries, components, compilers, and the like. To sign up now for this free service, go to http://www.ddj.com/maillists/.


“The prospect of hanging concentrates the mind wonderfully.” —Samuel Johnson

In his columns in Scientific American and in a book titled The Unexpected Hanging and Other Mathematical Diversions, Martin Gardner explores an intriguing paradox. It first saw print in 1948 and has been cast in many forms, involving surprise inspections, class-A blackouts, pop quizzes, hidden eggs, and card tricks. Patrick Hughes and George Brecht devote 20 pages of their book Vicious Circles and Infinities to the paradox.

Although the paradox has been richly analyzed, there is one approach that I’ve never seen. Perhaps I missed it, or perhaps the approach I have in mind is just wrong. But for what it’s worth, I thought I’d present my algorithmic analysis of the “Paradox of the Unexpected Hanging.”

The paradox: On a certain Saturday, a man is sentenced to be hanged. The judge, known to always keep his word, tells him that the hanging will take place at noon on one of the next seven days. The judge adds that the prisoner will not know which is the fateful day until he is so informed on the morning of the day he is to be hanged.

The prisoner is elated, convinced that he will not be hanged. He reasons as follows: If he is alive on Friday afternoon, he will know that he is to be hanged on Saturday. But this contradicts the judge’s assertion that he will not know his hanging day in advance. So his execution cannot be scheduled for Saturday. Therefore, Friday is the last day on which he can be hanged in keeping with what the (truthful) judge has said. But this means that if he is alive on Thursday afternoon, he knows at that point that he will be hanged on Friday, because Saturday has been conclusively eliminated. And by recursion, the prisoner reasons that he cannot be hanged on any day of the week. He is serene.

Thursday comes around and he is informed that he is to be hanged that day: a completely unexpected hanging. The judge spoke the truth. Where did the prisoner’s reasoning go wrong?

It is entertaining to see the subtle ways in which some have gone wrong in trying to unravel the paradox. It is not, as some think, trivial. The judge’s statements do not appear to be contradictory, the prisoner’s reasoning seems to be perfectly logical, and yet something must be wrong with one or the other. We are tempted to suspect the judge of predicting something impossible, but then it turns out to be true. What’s the resolution?

I think that the difficulty arises from confusing a prediction with a principle. The judge is not merely predicting that the prisoner will be surprised; he is saying that it is impossible for the prisoner to know the date of his execution in advance. This is a stronger claim and cannot be demonstrated by a single example.

In effect, the judge is claiming that there exists an algorithm for execution-day selection, and asserting that the prisoner cannot in principle determine the outcome of the algorithm early.

So can there be such an algorithm? I say no. Clearly the algorithm can’t be something like “Choose Thursday.” If that were the algorithm, the prisoner would know that Thursday was the day. So the algorithm must be probabilistic, such as “Choose Monday with probability 1/4, Tuesday with probability 1/4, or Wednesday with probability 1/2.”

If you’re thinking that there could be several algorithms like “Choose Thursday” and “Choose Wednesday,” and that the judge could choose one, you’re just multiplying algorithms, because then you have to specify his algorithm for choosing among these algorithms, and you’re back with a probabilistic algorithm. Thus there is only one algorithm, and thus the prisoner can, in principle, deduce what it is. It is against this fully informed, perfectly reasoning prisoner that the judge makes his strong claim.

But any such probabilistic algorithm has a last day to which a nonzero probability is assigned, and if the judge or his random-number generator picks that day, the prisoner will know his fate on the preceding afternoon.
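The argument is easy to check mechanically. Using the illustrative distribution quoted above (the function names and the choice of seed here are mine, purely for the sketch), every draw that lands on the last nonzero-probability day is exactly the case where the prisoner is not surprised:

```python
import random

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]
# The column's example distribution; the judge's "algorithm."
PROBS = {"Monday": 0.25, "Tuesday": 0.25, "Wednesday": 0.5}

def last_possible_day(probs):
    """The latest day carrying nonzero probability."""
    return max((d for d, p in probs.items() if p > 0), key=DAYS.index)

def prisoner_surprised(chosen, probs):
    """The prisoner fails to predict the day iff it is not the last
    nonzero-probability day: on the afternoon before that last day,
    he can rule out every other possibility."""
    return chosen != last_possible_day(probs)

def judge_picks(probs, rng):
    """Draw an execution day according to the distribution."""
    days = list(probs)
    return rng.choices(days, [probs[d] for d in days])[0]

rng = random.Random(373)
draws = [judge_picks(PROBS, rng) for _ in range(10_000)]
# Every unsurprising draw is Wednesday, the last nonzero-probability day.
unsurprised = [d for d in draws if not prisoner_surprised(d, PROBS)]
assert unsurprised and all(d == "Wednesday" for d in unsurprised)
```

No choice of probabilities escapes this: shifting weight off the last day only makes some other day the last one, so the judge’s claim fails with nonzero probability for every distribution.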

So there is no algorithm with the described properties, and when the prisoner is surprised, it only shows that he guessed wrong.

SWAINE’S FLAMES

The Hanging Algorithm

Michael [email protected]
