7/28/2019 WRL-TN-20
1/21
JULY 1991
WRL
Technical Note TN-20
How Digital Impedes Portability and Interoperability
Jeffrey C. Mogul
Digital Internal Use Only
digital Western Research Laboratory 250 University Avenue Palo Alto, California 94301 USA
The Western Research Laboratory (WRL) is a computer systems research group that was founded by Digital Equipment Corporation in 1982. Our focus is computer science research relevant to the design and application of high performance scientific computers. We test our ideas by designing, building, and using real systems. The systems we build are research prototypes; they are not intended to become products.
There is a second research laboratory located in Palo Alto, the Systems Research Center (SRC). Other Digital research groups are located in Paris (PRL) and in Cambridge, Massachusetts (CRL).
Our research is directed towards mainstream high-performance computer systems. Our prototypes are intended to foreshadow the future computing environments used by many Digital customers. The long-term goal of WRL is to aid and accelerate the development of high-performance uni- and multi-processors. The research projects within WRL will address various aspects of high-performance computing.
We believe that significant advances in computer systems do not come from any single technological advance. Technologies, both hardware and software, do not all advance at the same pace. System design is the art of composing systems which use each level of technology in an appropriate balance. A major advance in overall system performance will require reexamination of all aspects of the system.
We do work in the design, fabrication and packaging of hardware; language processing and scaling issues in system software design; and the exploration of new applications areas that are opening up with the advent of higher performance systems. Researchers at WRL cooperate closely and move freely among the various levels of system design. This allows us to explore a wide range of tradeoffs to meet system goals.
We publish the results of our work in a variety of journals, conferences, research reports, and technical notes. This document is a technical note. We use this form for rapid distribution of technical material. Usually this represents research in progress. Research reports are normally accounts of completed research and may include material from earlier technical notes.
Research reports and technical notes may be ordered from us. You may mail your order to:
Technical Report Distribution
DEC Western Research Laboratory, WRL-2
250 University Avenue
Palo Alto, California 94301 USA
Reports and notes may also be ordered by electronic mail. Use one of the following addresses:
Digital E-net: DECWRL::WRL-TECHREPORTS
Internet: [email protected]
UUCP: decwrl!wrl-techreports
To obtain more details on ordering by electronic mail, send a message to one of these addresses with the word "help" in the Subject line; you will receive detailed instructions.
How Digital Impedes Portability and Interoperability
Jeffrey C. Mogul
July, 1991
Abstract
Digital is emerging from its years as a vendor of proprietary systems with institutional attributes that impede the delivery of high-quality portable and interoperable software. In spite of our best intentions, we cannot succeed in today's market without recognizing these barriers. Some of the barriers (such as byte-order) are inherent in our systems and must be circumnavigated; other barriers (such as the "Not Invented Here" syndrome) are part of the culture and must be dismantled. Drawing on my experiences in porting software to Unix systems, and in working on IP/TCP interoperability problems, I describe a number of Digital's subtle organizational barriers and suggest some solutions.
Digital Internal Use Only
Copyright 1991 Digital Equipment Corporation
1. Introduction
In the 1980s, Digital made a lot of money selling VAX/VMS and DECnet systems. The company built up a way of doing things that succeeded in this proprietary environment.
In the 1990s, our customers have changed; they aren't willing to wait for us to provide the solutions that we think they want. If we don't deliver, they can probably find something better, sooner, from another vendor. The corporate ways that we evolved in the 1980s aren't going to work in this new world. We need to find new ways of producing software products, and we can't afford to hope that luck will save us.
In this paper I will describe a few of Digital's structural problems, loosely grouped under the categories "barriers to portability" and "barriers to interoperability." These might also be called "barriers to the creation of systems that support portable and interoperable software," since the problems lie chiefly in our systems, rather than the applications we sell.
One thing should be understood: I am not blaming anyone for the structural issues I will
describe in this paper. As I said, these structures worked well in the 1980s. What people should
be blamed for is failing, now, to realize that some of these structures are no longer helpful.
I would also like to apologize in advance if some of the things I say turn out to be inaccurate.
It is extremely hard to find out exactly what is available, or in progress, within Digital; this in
itself is something we should be working to fix. Also, there are many people who are already
doing great work to solve our portability and interoperability problems; they deserve our support.
2. Barriers to portability
In this section, I will look at the problems we face in creating systems that support software
portability; that is, systems to which and from which it is easy to port software.
Customers like systems that support software portability, because it means that they don't have to rely on one vendor. Established vendors used to hate software portability because they feared losing their captive audiences. Today, of course, we know that no sane customer would buy a system to which it is hard to port applications. What is not quite so obvious is that neither would any sane customer buy a new system that they couldn't escape from later on; so, no sane customer would buy a system from which it is hard to port their applications.
2.1. Avoid unnecessary differentiation
To a first approximation, this means that our systems have to be the same as everybody else's systems. We must be very careful about positioning proprietary "value added" features in otherwise standard systems. From the customer's point of view, such features must be valuable indeed in order to justify giving up the freedom of easy escape. From our point of view, adding such features may not be the proper use of our limited engineering resources.
This is not to say that we should not add value to otherwise standard systems. Some features
would clearly be worth it to the customer (such as a high-quality backup system). Features that
do not affect the external interface, such as SMP, may also be worth adding. PrestoServe is a
good example of useful added value that does not harm portability.
But unnecessary differentiation is an unmitigated mistake. For example, DECwrite was originally based on FrameMaker. Today, DECwrite and FrameMaker have diverged considerably. Maybe DECwrite has features that aren't available on the platforms of other vendors ... but most users probably don't need those features, and they would rather not have to learn a new system.
In summarizing the results of a recent Product Directions Forum, Patricia Ward of T&N wrote:
The customers are highly enthusiastic about NAS, viewing it as critical to the kinds of applications they will be building in the near future. However, questions regarding DEC's intentions to utilize industry standard APIs emerges as a paramount concern of these customers. They strongly prefer industry standard APIs over DEC APIs, even if the latter are developed for multiple platforms. They believe that DEC's current positioning of NAS does not clearly enough convey the commitment to comply with existing -- and in particular, emerging -- industry standards in all areas. Where standards are only partially complete, e.g. Motif, the customers still want DEC to follow the developing standards as closely as possible. They also maintain that DEC must not choose to develop proprietary standards even when facing irreconcilable differences with standards bodies.
2.2. Don't take too much advantage of our advantages
One common mistake is to write software that "takes full advantage" of the underlying hardware or operating system. This is a serious trap, because the tendrils of the system and the software get so entwined that portability is choked off. Compiler writers used to make this mistake; they saw the nifty VAX instructions for doing complex operations, and assumed that if the hardware designers went through the trouble of putting them in, the compiler would have to use them. Often, the resulting programs ran slower, and the compilers were more complex ... and entirely unportable to new architectures.
VMS suffers from the same problem. The entire Alpha project is an attempt to avoid solving this problem. It may even succeed, but we will still be stuck with an operating system that can't be ported. In theory, the reason why Alpha was necessary (instead of, say, using a MIPS instruction set) was that VMS relies on certain protection features of the VAX architecture, not present in MIPS. VAX/VMS might have been the right design for the 1970s and 1980s, but woe unto us if we assume that this kind of entanglement will work in the future.
2.3. Standards are not the entire answer
Another mistake is to assume that standards can save us. Standards are supposed to make portability easier. We can jump up and down and scream and wave standards at our customers all we want, but customers don't care about standards; they care about portability. If Sun systems dominate the workstation market, then it matters very little what official standards SunOS meets; what matters is that one can port software between ULTRIX and SunOS. Another way to say this is that de jure standards usually aren't worth the paper they are printed on; de facto standards are the only ones that count. To be successful, we have to be good at guessing where the de facto standards are headed. Sometimes we can set them ourselves, but often we will have to follow our competition at very short notice.
2.4. Excess stability is a trap
A mistaken reliance on standards is often coupled with a mistaken reliance on stability. Open systems evolve far more rapidly than our old proprietary systems do. If we can't keep up with the changes in our competition, our systems will end up too different. Yes, of course it is true that we should not jerk the customers around unnecessarily; but it is worse to change an interface two years after our competition does so than it is to change it as soon as the trend is clear.
For example, one of the most persistent portability problems with ULTRIX is the use of the
4.2BSD syslog interface, when everyone else is using the 4.3BSD version of the interface. Many
customers have complained to me that if it were not for this one difference, their programs would
port without any source code changes. (One customer even offered to visit ZK3 and change all
the ULTRIX applications to use the new syslog interface, if we would simply change over.)
2.5. Bug reports are good
Even less excusable than excess stability is our inability to repair bugs quickly. Software quality is one of our main advantages (I was told by a customer that our systems crash far less often than Sun's), but some bugs seem never to get fixed. When a bug can be fixed without disrupting existing applications, there is no excuse when it is not fixed in the first release following the bug report. Yet, several times I have reported a bug and supplied a simple fix, only to be told months later that it might be fixed in a future release. Perhaps we should not be waiting until the end of field-test to fix bugs that have been known about for months.
Bug reports should be welcomed, not shunned, because they provide us with a chance to improve our software. We should reward customers for reporting bugs, not charge them for the privilege (or make them type the reports onto antique five-part forms). This is especially important for our relationship with Independent Software Vendors (ISVs). When ISVs who discover ULTRIX bugs first have to fight to get anyone in Digital to listen, and then have to wait months for a fix, they aren't likely to choose ULTRIX as a platform for their applications.
Doug Clark, one of Digital's most successful computer designers, has written a paper [1] that shows how well the "bugs are good" philosophy works in hardware engineering. It should be just as successful in software engineering.
Unfortunately, the ULTRIX product groups seem to be unable to provide timely fixes to those bugs that do get reported. This, I'm told, is due partly to a lack of human resources dedicated to bug-fixing, and partly to an ancient software development environment that makes it unnecessarily difficult to fix bugs. If this is so, the fault lies with our engineering management hierarchy.
[1] Douglas W. Clark, "Bugs are Good: A Problem-Oriented Approach to the Management of Design Engineering," Research-Technology Management, May-June, 1990. This journal is available via the Digital Library Network.
2.6. Staying honest about portability
One of the most important lessons about portability is that it may be extremely hard to port an old program, but it is not hard at all to make a new program portable. That is, if portability is designed in from the start, it's usually almost free. For example, I've ported numerous network programs between big-endian machines (such as Suns) and little-endian machines (ours). In the cases where the original programmer was sensitive to the possibility of a byte-order mismatch, the programs had (almost) all the necessary ntohs (etc.) macros, even though these are no-ops on Suns. Porting these programs was easy. Other programmers were too lazy to think ahead; porting these programs is hard, not simply because the macros are missing, but because it often isn't at all obvious where to put them.
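As a minimal sketch of what designing byte-order handling in from the start looks like, the following Python fragment (standing in for the C ntohs family of macros) packs a two-field header in network byte order and decodes it two ways. The header layout and the names encode_header, decode_header, and decode_header_native are hypothetical, chosen only for illustration; they are not taken from any of the programs discussed here.

```python
import socket
import struct
import sys

def encode_header(port, length):
    """Pack two 16-bit fields in network (big-endian) byte order."""
    return struct.pack("!HH", port, length)

def decode_header(data):
    """Portable decode: '!' forces network byte order on every host."""
    return struct.unpack("!HH", data)

def decode_header_native(data):
    """Non-portable decode: '=' uses whatever byte order the host has."""
    return struct.unpack("=HH", data)

wire = encode_header(513, 20)   # 513 == 0x0201, 20 == 0x0014

# The portable decoder gives the same answer on any machine.
assert decode_header(wire) == (513, 20)

# The native decoder is only right on big-endian hosts; on a
# little-endian host each 16-bit field comes back byte-swapped.
native = decode_header_native(wire)
if sys.byteorder == "big":
    assert native == (513, 20)
else:
    assert native == (0x0102, 0x1400)

# socket.ntohs performs the network-to-host conversion: a no-op on
# big-endian hosts, a byte swap on little-endian ones.
fixed = tuple(socket.ntohs(field) for field in native)
assert fixed == (513, 20)
```

The point of the sketch is that the explicit conversion costs nothing on a big-endian host, which is exactly why it is easy to omit there and hard to retrofit later.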
It's also important to have something to keep the programmers honest about portability. Even when designed in, portability cannot be guaranteed without a lot of testing. The temptation is too great to make the software work on our own systems, or add new features, rather than looking ahead to the future.
I understand that, from the beginning, Sun tried to ensure that their kernel was portable by making sure that every release would run on their VAX. This surely added a little time to their release cycle, but it even more surely saved them a lot of time when they introduced their 386-based and SPARC-based systems.
Most of the code in the ULTRIX kernel is probably portable to, say, big-endian systems, if only because it came that way from Berkeley. I've looked at the kernel DECnet Phase IV code, though, and I would be extremely surprised if that code didn't have some nasty portability problems. We're lucky that the ACE initiative went with little-endian byte order, but if we ever want to move our DECnet code onto Sun's installed base, we may have some trouble with this. Fortunately, the DECnet group has learned its lesson and wrote portable code for Phase V; it would be sad if other groups had to learn this lesson the hard way.
2.7. Taking advantage of public-domain software
One of the great things about Unix is that there is a lot of public-domain software floating
around, which with a little effort can be ported to ULTRIX. Not only are these programs an
opportunity for us to provide some added value (by shipping pre-ported versions along with our
systems), they are also wonderful tests to see if our systems support software portability. We
should be finding useful (or weird) public-domain software and porting it to ULTRIX, as part of
the development cycle for ULTRIX releases. The people doing the ports would learn things that
should influence the design of the system.
Right now, it is too hard to import public-domain software. Aside from the usual Not Invented Here (NIH) syndrome, our organizational structures are not able to cope with the concept of shipping software from public sources. For example, when I ported tcpdump I gave the ULTRIX documentation folks the manual page, which has a prominent copyright notice saying that the documentation must continue to contain the copyright notice. This section doesn't meet ULTRIX documentation standards, so it was removed, and I had to fight to get it put back.
One of the benefits of public software is that we can rely on the public to improve it. In order to take advantage of these improvements, we cannot let our version of the sources diverge far from the public version. This would seem to stifle any improvement on our part, but in fact we could easily play in the game by making our own improvements public. Yet we can't bring ourselves to give software away, even if we got the original version for free (or nearly so). The continuing confusion over which version of sendmail to use is partly due to our inability to participate in the public arena.
Digital's Cambridge Research Lab (CRL) is working on producing a CD-ROM of public software. This would solve the problem of making such software easily available to customers, but we really should be integrating some of that software into our mainstream product process.
2.8. How do we make money by giving away software?
Digital used to make its profits on hardware, but in a marketplace where hardware is a commodity, we will have to make our money elsewhere. Many people have told me that they cannot understand how we can make money if we follow my advice to give away source code for public-domain programs.
There are several answers to this objection. The most important is that we do not sell each of our programs separately; we sell software products, some of which contain hundreds or thousands of individual programs. The quality of a software product depends upon the quality of the individual programs, and on how well they are integrated. We cannot charge separately for each incremental addition to the base ULTRIX system. Instead, by integrating functionality we make customers more likely to choose our software over the competition's.
Some of this base system functionality is best provided by public-domain programs. The only way to keep the most up-to-date versions of such software integrated into our systems is to participate in the open exchange of improvements. True, if we give away our improvements and if our competitors are also efficient at importing public-domain software, we won't have much of an advantage, but at least we won't be at a disadvantage.
If we don't participate, either our software will be obsolete or we will have to expend precious resources to maintain it. We have a dismal track record for internal maintenance of originally public-domain software, and there is no reason to expect that we can divert additional resources to that task. If we do participate, our software will be better and our time-to-market could be a lot shorter.
Remember also that for software that promotes interoperability, one never gains much of an
advantage by being different from the competition.
It is also important to understand that public-domain software, while often critical to the
smooth operation of our systems, constitutes a small fraction of our value-added software, and
almost none of our layered products. Giving away our changes will not cut into our cash cows.
2.9. The dangers of poor documentation
Customers and applications vendors need to know how to use all the interfaces of our systems, yet many of the features we have added to ULTRIX are poorly documented, or simply undocumented. (VMS programmers porting code to or from ULTRIX have the same problem.) If these interfaces aren't good enough to document, they probably shouldn't be there in the first place.
For example, ISVs sometimes need to know how to obtain the Ethernet hardware address associated with a workstation. The on-line manual says that the SIOCRPHYSADDR ioctl can be used to obtain this value, but I cannot find any documentation on the form of the call. (By reading some header files, one can infer the proper form for this call; in other cases, doing even that is not easy.)
In general, the organization of ULTRIX documentation reflects Digital's political structure, not the needs of the customer. For example, some of the features of the base ULTRIX system are covered only in the DECnet/ULTRIX documentation. Worse, there is no decent overview (tutorial) documentation on how to put everything together. This may be because the functional divisions between groups responsible for pieces of the system leave nobody to take care of the larger picture.
It also seems to be the case that there is no up-to-date, comprehensive documentation for how
to write kernel code to work with the SMP mechanisms now in ULTRIX; this is important for
people trying to port kernel mechanisms into ULTRIX from other BSD-based systems. I believe
that not even internally is such documentation available.
The only solution to the lack of documentation for these interfaces is to insist that writing, and maintaining, the documentation be part of the job of the implementors. Currently, our programmers are often reluctant tech-writers, and some of our document editors don't understand important issues about the systems they are documenting.
We are going to have to change the reward structures so that implementors produce honest, if perhaps unpolished, documentation. We are also going to have to find writers who truly know our systems. In either case, this means investing in additional training, and probably it means hiring more highly-skilled people. We can't afford to ship incompletely or improperly documented systems; people will not port to a system they cannot understand.
2.10. Providing tools for portability
If customers buy a Unix system from several of our major competitors, they will also get tools to help them port their VMS applications to Unix. These include clones of the VMS text editors, command interpreter, and certain library packages. Does Digital have it now? Apparently not. Is this because we are unwilling to make it easy for our customers to switch from high-markup VMS systems to low-markup ULTRIX systems? Too bad: they are switching anyway, but to HP and IBM and Sun.
Do we have the tools to allow people to port big-endian applications to our little-endian systems? We could have had a nearly seamless big-endian support environment to ship with
ULTRIX 4.2, given the availability of the R3000A CPU, but we don't. (Someone is working on it, but don't hold your breath.)
Do we provide to customers guidebooks on how to port software to ULTRIX (and from
ULTRIX) to various systems? Or does each application vendor have to learn this from trial and
error?
3. Barriers to interoperability
Digital's success in the 1980s was largely due to our ability to use DECnet to tie together large networks of VAX systems. In the 1990s, DECnet is no longer sufficient; our customers live in a multi-vendor, multi-organizational environment, and interoperability is the key to making it all work.
Perhaps the most important thing to remember is that although one may intend to create an interoperable system, this does not mean that the system will interoperate. We are not in control of the environment; this means that interoperability is a moving target. Intending to hit it is not enough; we must continually watch where it is going, and see whether our attempts hit the mark.
3.1. Standards do not mean interoperability
Interoperability does not come magically with "open network architectures." DECnet Phase IV was not, in fact, a proprietary protocol architecture, but we failed to spread it into the larger market. This may well be because Phase IV is not suitable for large, multi-organizational networks.
DECnet Phase V is premised on the openness of the ISO protocols, but after over a decade of
work and thousands of pages of ISO standards, there is still virtually no ISO networking in use.
Customers know that IP/TCP, with its many flaws, is the only truly interoperable networking
technology available today.
For years, people in the IP community have been searching unsuccessfully for an easy way of
testing IP/TCP implementations to ensure that they interoperate. Experience has shown that
standards, while necessary, are not sufficient; two competent implementors working from the
same standard often produce incompatible implementations. Our ability to use formal methods
is not yet, and may never be, sufficient to generate perfect implementations. Test suites are
helpful, but not sufficient, because they cannot simulate the whole range of bizarre behavior that
a robust implementation must handle.
3.2. Test early and often
The only accepted method for ensuring interoperability is to test each implementation against as many others as is possible. The IP community has numerous ways of doing this; for example, once a year Sun runs Connectathon to provide NFS and X vendors a chance to test their implementations against one another. The rule at Connectathon is that the results are kept secret; vendors (including Digital) go there to discover their own problems, not to obtain marketing ammunition against the competition.
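To give a sense of why a shared testing event is attractive, note that testing each implementation against every other grows quadratically with the number of implementations. The sketch below simply enumerates the pairings; the implementation names are placeholders and the function name pairings is hypothetical, not a list of actual participants or any real tool.

```python
from itertools import combinations

def pairings(implementations):
    """Return every unordered pair of distinct implementations."""
    return list(combinations(implementations, 2))

# Placeholder names, assumed for illustration only.
vendors = ["A", "B", "C", "D", "E"]
pairs = pairings(vendors)

# n implementations give n * (n - 1) / 2 pairings to test.
assert len(pairs) == len(vendors) * (len(vendors) - 1) // 2

for left, right in pairs:
    print(f"test {left} against {right}")
```

With five implementations there are already ten pairings; each vendor added to the pool adds one new test for every vendor already there.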
We also need to take better advantage of the Interop conference and trade show. Interop is billed as the only conference where all vendors are required to connect to the show-floor network, and has been an invaluable testbed for interoperability issues. Digital, as a company, has continually resisted allocating sufficient resources to its Interop participation. We have also been reluctant to stress multi-vendor interoperability, instead pushing VMS-to-ULTRIX interoperability as a major theme.
Interoperability testing should occur early in the product development process, when there is still time to fix problems. This means that we should be able to do a lot of it in-house, rather than waiting for field tests. Doing so requires that some product groups have access to our competitors' most up-to-date products, as well as older systems (our own and the competition's) that may still populate many customer sites. It isn't sufficient that these alien systems sit in the corner, to be hauled out only for testing; if they aren't heavily used by people who depend on them for their daily work, the real bugs won't be discovered. Perhaps the right approach is to make sure that the groups porting applications to other vendors' platforms become an integral part of the network interoperability testing process.
3.3. Multi-organizational networks are different
Interoperability is not just about linking systems from multiple vendors. It also means linking together organizations that are under different administrations, and that might be hostile to one another. Today, via the Internet, I can reach systems at universities, at all of our major competitors, and even in countries that a few years ago were considered enemy nations. This is a significantly different and more difficult environment than the single-company DECnet networks we built in the 1980s.
Multi-organizational networks are on a completely different scale than our old DECnet networks. We are proud of having one of the largest corporate networks in the world, but Digital's network is tiny compared with the Internet. (As of this writing, there are more than 27,000 assigned IP network numbers, each of which may have hundreds or thousands of hosts.) Algorithms and administrative procedures that work on a network with a mere 60,000 nodes under a single administration won't work on a network with millions of hosts under thousands of administrations.
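The difference in scale can be made concrete with a rough calculation. The sketch below uses only the two figures quoted in this section; the per-network host counts of 100 and 1,000 are illustrative assumptions standing in for "hundreds or thousands," and since not every assigned network is that large, the results are loose estimates rather than measurements.

```python
ASSIGNED_NETWORKS = 27_000   # assigned IP network numbers, from the text
DIGITAL_NODES = 60_000       # nodes on Digital's network, from the text

def estimated_hosts(hosts_per_network):
    """Rough Internet size if every network had this many hosts."""
    return ASSIGNED_NETWORKS * hosts_per_network

for hosts_per_network in (100, 1_000):   # assumed, for illustration
    total = estimated_hosts(hosts_per_network)
    ratio = total / DIGITAL_NODES
    print(f"{hosts_per_network} hosts/network: {total:,} hosts, "
          f"{ratio:.0f}x Digital's network")
```

Under these assumptions the Internet comes out somewhere between 45 and 450 times the size of the corporate network, which is the sense in which procedures tuned for 60,000 nodes cannot simply be scaled up.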
Digital does not have much ability to test our software in a multi-organizational environment before it gets to external field test. Mostly, this is because security constraints limit the ways in which we can allow direct connections between our internal hosts and the Internet. These constraints, alas, are not unreasonable. Perhaps in a number of years, we will be able to expose parts of our internal network without fear of hacking, but today our technology is not good enough.
This means that, in addition to improving our security technology, we must find other ways to
test our systems in the multi-organizational environment. One way would be to avoid the
temptation to put all of our internal networks under one management. Multiple managements
would complicate our internal networking, but we would certainly learn a lot.
It is interesting to note that, several years ago, IBM sought and won the contract to manage the
NSF network. IBM was not then, and probably still is not, recognized for its commitment to
IP/TCP, but by managing the core of the IP/TCP Internet, they probably learned an awful lot
more than we did about such networks.
3.4. Shortening the product cycle
Open systems live by de facto standards. Unlike official standards (such as IEEE or ISO standards), or proprietary standards (such as DECnet), de facto standards evolve rapidly. In the good old days, we had control over how fast DECnet evolved: the standards didn't evolve any faster than our ability to ship new implementations. Today, in the IP/TCP market, we do not have the luxury of a centralized release control process to ensure that nobody gets ahead of the pack.
We can only compete in this market by shortening the time it takes us to respond to changes in the de facto standards. Leadership does not come from being best in a market where uniformity is important; it comes from being first (or at least, never last).
I again quote from Patricia Ward:
[The] relative importance of [the twenty identified NAS] attributes is highly dependent on the application environments at each customer's site [...] However, [the customers] concur that DEC must excel in the areas of: timeliness of implementation; ease of use; high availability; performance; reliability; and security. Timeliness of implementation is a key issue, with customers expressing concern over time-to-market plans.
There are many ways to shorten our development cycles. One is to be agile at acquiring public-domain software, and shipping it as soon as possible. Another is to do quick prototypes of new software. Prototypes might not have all the features and performance of a carefully engineered system, but they allow us to discover problems with our specification, approach, and mindset early enough to do something about it. When conceptual problems are discovered only in field test, nobody is willing to make the necessary changes.
3.5. Misreading the market
To get ahead of the competition, or at least to avoid falling far behind, we need more depth in
understanding the market. Too often we decide to put our resources into doing something that
the customers really don't want, and meanwhile fail to do the things that they want. We can't
afford to let lost sales be our only indicator that the competition is ahead of us.
For example, take the curious availability of Kerberos support for ULTRIX. (Kerberos is the
MIT-Athena system for authenticating users in a distributed system.) MIT had already
developed kerberized versions of numerous user commands, including rlogin and rcp (which
in their original forms are scandalously insecure). However, ULTRIX still doesn't ship any
kerberized applications, except for the BIND/Hesiod name server. While Kerberos support in the
name server might well be necessary for certain customers, I understand that this configuration is
possible only in an all-ULTRIX environment, and relatively few of our customers are sufficiently
concerned about the integrity of their name service to use it. Meanwhile, almost all of them
could use the security of the kerberized commands, but we don't supply these. To me, this
reflects a misplaced understanding of the market. (Apparently, it also reflects a turf battle within
Digital ... which should have been resolved in order to satisfy the market. Note that blame for
our inability to respond to the market may not lie with any particular individual, but rather with a
corporate organization that diffuses control so that there is no focus for pushing the right
response.)
Another example of misreading the market was the decision, a few years ago, to support Sun's
YP name service but not the IP standard Domain Name System (DNS). Fortunately, the ULTRIX
group has since not only rectified this error, but has created a mechanism to support the
coexistence of both YP and DNS (something that Sun has yet to do, apparently). In fact, dealing with
coexistence of multiple mechanisms is one of the keys to success in the 1990s. The world will
be a mixture of DECnet, SNA, XNS, IP, OSI, and myriad PC networking technologies, and we
must allow a customer to use all of them at once.
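
One way to picture coexistence of several name services is an ordered search: ask each configured service in turn and take the first answer. The sketch below is only an illustration of that idea, not a description of the ULTRIX mechanism; the names `table_service`, `resolve`, `yp_like`, and `dns_like`, and all host names and addresses, are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional

# A lookup service maps a host name to an address, or returns None if it has no answer.
LookupService = Callable[[str], Optional[str]]


def table_service(table: Dict[str, str]) -> LookupService:
    """Build a lookup service backed by an in-memory table (a stand-in for YP, DNS, etc.)."""
    return lambda name: table.get(name.lower())


def resolve(name: str, services: Iterable[LookupService]) -> Optional[str]:
    """Ask each configured service in order and return the first answer found."""
    for service in services:
        address = service(name)
        if address is not None:
            return address
    return None


# Hypothetical data: two services that know about overlapping sets of hosts.
yp_like = table_service({"alpha": "192.0.2.10"})
dns_like = table_service({"alpha": "192.0.2.99", "beta.example.com": "192.0.2.20"})

# The search order is the policy: here the YP-like table is consulted first.
search_order = [yp_like, dns_like]
```

With this ordering, `resolve("alpha", search_order)` takes the answer from the first table, while a name known only to the second service still resolves. Keeping the order as data rather than code is what lets a site mix mechanisms without changing applications.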
Blind spots in our understanding of the market can hurt us in other ways. For example, last
year I helped review a new version of the Network Troubleshooting Guide. The previous
version covered only DECnet networks; I was gratified that the new version covered IP/TCP
networks. Unfortunately, while the authors were familiar only with Digital's own product set,
troubleshooting most IP/TCP networks demands an understanding of equipment from many
vendors. We still don't ship IP routers, for example, so any real customer installation will be full of
non-Digital routers. Also, troubleshooting on an Ethernet often requires the use of a network
monitor (such as Network General's Sniffer); we don't sell such a product. The Network
Troubleshooting Guide would be far more useful to customers if it covered non-Digital products,
but our documentation people simply don't have access to them.
For a final example, I simply note how long it has taken us to get an IP router onto the market
(not quite yet, as of this writing). The Internet Portal, while a nice design for a niche product, is
not adequate for most uses.
3.6. Exposing ourselves in public
We desperately need to improve the connection between our development organizations and
the customers. Mediating all such communication through the field organizations doesn't work;
sales and marketing people don't usually have the technical sophistication to see the real
technical shortcomings in our systems. Engineers and engineering managers can, and must, find
direct ways to communicate with our customers, and with our competitors' customers (or else
how do we ever get them back?).
Part of the problem is that, because of our proprietary internal infrastructure, Digital is discon-
nected from the outside world. The two main examples of this are our electronic mail system
and our bulletin board systems.
Outside the company, virtually all interorganizational mail flows via IP/TCP or UUCP
mechanisms. Both of these use mail headers that roughly correspond to RFC822, and address
formats that are understood by the majority of Unix users (and by all Unix system
administrators). Inside Digital, we use message headers that don't really interoperate with
RFC822, and a variety of address formats that make absolutely no sense to anyone outside of
Digital. The DECWRL electronic mail gateway goes through amazing contortions to paper over
these incompatibilities, but confusion inevitably results.
The situation with bulletin boards is even worse. Much of Digital's internal communication is
conducted via the Notes system. Notes has some very nice features, but connection with the
outside world is not one of them. As a result, Digital as a whole contemplates its navel, via
Notes, while ignoring the rest of the universe.
Meanwhile, everyone else uses the USENET news system. News is quite similar to Notes;
although there are some conceptual differences, both provide a set of public discussions or-
ganized by topic area. The important difference is that news easily supports the connection of
multiple organizations; Notes does not.
There is a tremendous amount of information in the news system, much of it relevant to
Digital's business [2]. Although many Digital employees do participate in various newsgroups
(there are fully-functional news clients for VMS), I am continually dismayed to find people in
the company who are ignorant of the whole concept, or who aren't interested in finding out what
is going on in the outside world. If each product manager spent some time reading the
newsgroups relevant to his or her product plans, we might be in a much better position.
We can and must change our internal infrastructure to be more like that of the real world. This
means getting rid of NODE::USERNAME addresses; more and more systems inside the com-
pany can exchange mail via TCP, and we should make username@host our preferred form of
mail address. It also means shifting from Notes to news whenever possible; I find it amazing
that our internal discussions about the IETF (the organization that sets IP standards) are carried
out in a notesfile, rather than a newsgroup.
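
As a rough illustration of what moving from NODE::USERNAME to username@host involves, the sketch below rewrites one address form into the other. It is a minimal sketch under stated assumptions: the mapping of a node name onto a host under a single domain, the lowercasing, the `example.com` domain, and the function name `decnet_to_internet` are all hypothetical, and this is not how the DECWRL gateway translates addresses.

```python
import re

# Assumed shape of an internal address: a node name, "::", then a user name.
_DECNET_ADDR = re.compile(r"^(?P<node>[A-Za-z0-9$_]+)::(?P<user>[A-Za-z0-9$_.\-]+)$")


def decnet_to_internet(address: str, domain: str = "example.com") -> str:
    """Rewrite NODE::USERNAME as username@node.domain (illustrative mapping only)."""
    match = _DECNET_ADDR.match(address.strip())
    if match is None:
        raise ValueError(f"not a NODE::USERNAME address: {address!r}")
    # Assumption: node and user names are case-insensitive, so lowercase both.
    node = match.group("node").lower()
    user = match.group("user").lower()
    return f"{user}@{node}.{domain}"
```

For example, `decnet_to_internet("NODE::USERNAME")` yields `username@node.example.com`, a form that an outside correspondent can recognize and reply to without knowing anything about the internal network.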
Our competition does not have this problem. Practically everyone at Sun uses Unix, TCP
mail, and news; they don't have proprietary in-house systems to isolate them from their
customers. HP brags about the size of their internal IP network. We shouldn't pretend that OSI will
save us; it will take too long, and when it arrives the winners will be those who have already
learned how to play in the open systems world.
3.7. Technical superiority is a red herring
One of the hardest lessons for us to learn is that, while technical excellence is necessary, it is
never sufficient. The world will not beat a path to your door simply because your mousetrap is
better. A technically superior product may fail because it isn't compatible with the customer's
existing systems, or it doesn't solve the problems that the customer wants solved, or it is simply
different from what the customer is used to. Most customers have realized that managing their
systems is the hardest problem of all; if we try to sell them something new, they may not want it
if they have to retrain people in order to use it.
Our efforts to induce the IP community to adopt the ISO IS-IS routing protocol, instead of the
OSPF protocol proposed by Proteon and others, should be instructive. Digital thought that IS-IS
was clearly superior, technically. The OSPF camp thought that OSPF was clearly superior. The
two camps were using different models of the world to make their judgements, and so agreement
on strictly technical grounds was never possible. (Also, the designers of the two protocols have
learned from each other's criticism, so neither protocol has many identifiable flaws.)
The OSPF camp appears to have won. The decision was not made only for technological
reasons; politics (which Digital misunderstood) had something to do with it. My guess is that
what gave OSPF the edge was that it was already available in products (nobody has yet shipped
an IS-IS product). Not only did Digital fail by being late to market with an IS-IS product; we
also failed by not anticipating OSPF's victory and getting an OSPF product to market. (Our first
IP router product is not likely to support OSPF for a while after FCS.)
[2] I recommend reading the newsgroup comp.unix.ultrix in particular.
3.8. Creating open standards
We should not let our past failures at creating open standards dissuade us from active par-
ticipation in the creation of future standards, but we must learn the right way to do it. The first
step is to realize that the OSI and POSIX standards processes might not be good models to follow.
A standard developed under this kind of bureaucratic process doesn't always achieve significant
market share, even if it does become a check-off item on large contracts.
Successful promotion of a de facto standard requires more flexibility. First, one must work
closely with customers and other vendors. Second, one must be willing to compromise elegance
for relatively short-term pragmatic features. Finally, one must get implementations into use as
quickly as possible, on all major-vendor platforms, even if the software has to be given away.
This is how Sun succeeded with NFS, and how all the successful Internet standards were created.
(Many Internet standards have languished for lack of widely-available implementations.)
Asking how we make money by giving implementations away misses the point. A standard is
no good to us if we are the only vendor to support it. Once the standard is widely accepted, we
make money by having the best (fastest, most robust, most manageable) implementation. We
cannot do that, though, until our competitors can interoperate using the standard.
4. Changing the system
It isn't hard to find problems with our software. Every large organization makes mistakes, and
we probably make fewer mistakes than most of our competitors. It is much harder to provide
constructive suggestions for improving things, especially now that we are far more resource-
limited than we once were.
If I were to pick one step to take first, it would be to spend more effort staying in touch with
the customers and the market. I don't mean participating in the standards process, which (while
necessary) is after all just a way to define a more useful dividing line between us and our
customers (or ISVs). I don't mean doing market surveys; they are also necessary, but they tend to
obscure the specific insights that one gets from individual voices.
Instead, I would like to see Digital switch more of its internal infrastructure to match the open
systems world that the customers are living in. This will make it much easier for us to speak the
same language as our customers, and it will also improve our ability to hear what they are saying
behind our back.
Opening up in this way leads to other changes: using more public-domain software, doing
more prototyping to test concepts of interoperability, doing broader portability and inter-
operability testing, and finding out about problems earlier than we do now.
We also need to remove some of the layers that isolate engineers and product developers from
the customers and ISVs. It is too hard for customers and ISVs to discover technical information
about our products, and much too hard for them to complain.
Finally, as an outsider to the product creation process, I believe that much of our trouble lies
with engineering management, and in particular a lopsided allocation of resources. It is hard to
believe that our product groups in the open systems arena are still severely understaffed, given
the obvious importance of these products to the health of the company. One cannot, of course,
simply add programmers to speed up a late project; we have to start hiring good people now if
we expect them to be useful in a year or two.
5. Conclusion
If we are going to make money, our product development process is going to have to do better
at supporting portability and interoperability. Our insular organizational structures are the main
barrier to success. We, meaning the people who actually do the work, must be willing to
innovate not only in technology but in approach. If this means breaking with the traditional DEC
way of doing things, people will have to take some risks.
We need to improve our communication patterns. We need to find out where the open sys-
tems market is heading, what the customers want, and how well our systems will satisfy them.
There is a tremendous resistance in Digital to the spread of bad news; employees who complain
about our products are sometimes even accused of disloyalty. Only when we are willing to face
the hard truths, as soon as possible, will we get competitive products to the market on time.
6. Acknowledgements
I would like to thank Mary Jo Doherty, Henry Petras, Win Treese, and Kathy Wilde for
commenting on drafts of this document, but of course they are not responsible for my errors.