Intrusion-tolerant middleware: the road to automatic security

On the HorizonEditor: O. Sami Saydjari, [email protected]

from happening—by developingsystems without vulnerabilities, forexample, or by detecting attacks andintrusions and deploying ad hoccountermeasures before any part ofthe system is damaged. But what ifwe could address both faults and at-tacks in a seamless manner, through acommon approach to security anddependability? This is the proposal ofintrusion tolerance, which assumes that

• systems remain somewhat faultyor vulnerable;

• attacks on components will some-times be successful; and

• automatic mechanisms ensure thatthe overall system nevertheless re-mains secure and operational.

No large-scale computer networkcan be completely protected fromattacks or intrusions. Just as chainsbreak at their weakest link, any in-conspicuous vulnerability left be-hind by firewall protection or anysubtle attack that goes unnoticedby intrusion detection will beenough to let a hacker defeat aseemingly powerful defense. Usingideas from fault tolerance that putemphasis on automatically detect-ing, containing, and recoveringfrom attacks, the European projectMAFTIA (Malicious-and Acci-dental-Fault Tolerance for Internet

Applications; www.maftia.org) setout to develop an architecture anda comprehensive set of mechanismsand protocols for tolerating bothaccidental faults and malicious at-tacks in complex systems. Here, wereport some of the advances madeby the several teams involved in thisproject, which brought togetherinternational expertise in the areasof information security and faulttolerance.

Intrusion tolerance in a nutshellBuilding an intrusion-tolerant sys-tem to arrive at some notion of in-trusion-tolerant middleware forapplication support presents multi-ple challenges. Surprising as it mightseem, intrusion tolerance isn’t justanother instantiation of accidentalfault tolerance.

To capture the essence of intru-sion tolerance, we must first con-sider that an intrusion is in fact amalicious fault that has two underly-ing causes: a weakness, flaw, or vul-nerability, or a malicious act or attackthat attempts to exploit the former.

Attacks, vulnerabilities,and intrusionsVulnerabilities are the primitivefaults within a system—in particu-lar, design or configuration faults—

that can be introduced during thesystem’s development or operation.For example, as a step in an overallplan of attack, an attacker might in-troduce vulnerabilities in the formof malware.

Attacks are malicious faults thatattempt to exploit one or more vul-nerabilities. An attack that success-fully exploits a vulnerability resultsin an intrusion, which is normallycharacterized by an erroneous sys-tem state (for example, a system filewith unwarranted access permis-sions for the attacker). If nothing isdone to handle these errors, a secu-rity failure will probably occur. At-tacks often assume the form ofinconsistent interactions with differ-ent legitimate participants in orderto confuse them. Resilient systemsshould be able to handle these so-called Byzantine faults.

Figure 1a represents a fundamen-tal sequence: attack � vulnerability� intrusion � error � failure. Thiswell-defined relationship is calledthe AVI fault model.

Classical security methodologiesmainly focus—quite successfully—on preventing intrusion. However,as reality painfully proves every day,it’s impossible, even infeasible, toguarantee perfect prevention: sim-ply put, we can’t handle all attacksbecause they aren’t all known, andnew ones appear constantly. As aconsequence, a few inconspicuousweaknesses are easy prey to hackers,and what’s worse, the resulting in-trusions that escape the intrusion-pre-vention barrier, as Figure 1b suggests,will go unnoticed and will likelycause security failures.

The last resort is intrusion toler-ance, which, as the name suggests, acts

PAULO E.VERISSIMO

AND NUNO F.NEVES

University ofLisbon,Portugal

CHRISTIAN

CACHIN AND

JONATHAN

PORITZ

IBM ZurichResearch

DAVID POWELL

AND YVES

DESWARTE

Laboratoryfor AnalysisandArchitectureof Systems,CNRS

ROBERT

STROUD AND

IAN WELCH

University ofNewcastleupon Tyne

The pervasive interconnection of systems throughout

the world has given computer services a significant

socioeconomic value that both accidental faults and

malicious activity can affect. The classical approach

to security has mostly consisted of trying to prevent bad things

Intrusion-Tolerant MiddlewareThe Road to Automatic Security

54 PUBLISHED BY THE IEEE COMPUTER SOCIETY ■ 1540-7993/06/$20.00 © 2006 IEEE ■ IEEE SECURITY & PRIVACY

On the Horizon

after intrusion and before failure. In-trusion-tolerance techniques are inessence automatic, relying on localmechanisms and distributed proto-cols, and assume combinations of de-tection (of corrupted hosts ortampered communications), recovery(neutralization of intruder activity), ormasking (use of spare components orreplicas, such that the whole resists theintrusion of a minority).

Trust and trustworthinessThe relationship between the no-tions of trust and trustworthiness is im-portant to understand how intrusiontolerance can lead to secure designs.

Let’s consider trust to be a com-ponent’s accepted dependence on aset of (desirable) properties of an-other component, subsystem, or sys-tem.1 If A trusts B, then A acceptsthat a violation of B’s propertiesmight compromise A’s correct oper-ation. It might also happen thatthose properties A trusts don’t corre-spond quantitatively or qualitativelyto B’s actual properties. Thus trust-worthiness is the measure in which acomponent (say, B) meets a set ofproperties. Clearly, a robust designimplies that trust in B should beplaced to the extent of B’s trustwor-thiness—that is, the relation “Atrusts B” should imply A’s substanti-ated belief that B is trustworthy inthe measure of B’s trustworthiness.

There is a separation of concernsbetween how to make a component

trustworthy (constructing the com-ponent) and what to do with the trustplaced in the component (buildingfault-tolerant algorithms). These iter-ative chains of trust–trustworthinessrelations, with the proper specifica-tion and verification tools—lead tovery clear arguments about system se-curity and dependability.2,3

MAFTIA architectureThe MAFTIA architecture selec-tively uses intrusion-tolerancemechanisms to build layers of pro-gressively more trusted compo-nents and middleware subsystemsfrom baseline untrusted compo-nents (hosts and networks). Thisleads to an automation of the

www.computer.org/security/ ■ IEEE SECURITY & PRIVACY 55

Related work in intrusion tolerance

One of the first architectural attempts to build a secure and

robust architecture was Delta-4, a system developed in a

European project in the 1980s.1 It provided distributed intrusion-

tolerant services for data storage, authentication, and authorization.

More recently, several projects have also addressed this

problem, providing secure middleware,2 secure group or broadcast

communication,3 or secure authentication.4 Oasis is a large US

program on intrusion-tolerance research comparable to MAFTIA.5

References

1. D. Powell et al., “The Delta-4 Approach to Dependability in Open Distrib-

uted Computing Systems,” Proc. 18th IEEE Int’l Symp. Fault-Tolerant Com-

puting, IEEE CS Press, 1988, pp. 246–251.

2. M.K. Reiter, “Distributing Trust with the Rampart Toolkit,” Comm. ACM,

vol. 39, no. 4, 1996, pp. 71–74.

3. Y. Amir et al., “Secure Group Communication in Asynchronous Networks

with Failures: Integration and Experiments,” Proc. 20th IEEE Int’l Conf. Dis-

tributed Computing Systems, IEEE CS Press, 2000, pp. 330–343.

4. L. Zhou, F. Schneider, and R. van Renesse, “COCA: A Secure Distributed

On-Line Certification Authority,” ACM Trans. Computer Systems, vol. 20,

no. 4, 2002, pp. 329–368.

5. J.H. Lala, ed., Foundations of Intrusion Tolerant Systems, IEEE CS Press, 2003.

Figure 1. Intrusion sequence. In the (a) attack-vulnerability-intrusion (AVI) fault model,an attack hits a vulnerability, causing an intrusion which, if not handled, will cause afailure; (b) intrusion tolerance, the last resort for protection.

Intrusionfault

Intrusion prevention

Intrusiontolerance

Vulnerabilityfault

(b)

Vulnerabilityprevention

Vulnerability removal

Attackprevention

Attack removal

Error Failure

Intruder

Vulnerabilityfault

Error Failure

Intruder

(a)

Intruder/designer/operator

Intrusionfault

Attack fault

Attackfault

Intruder/designer/operator

On the Horizon

process of building resilience: atlower layers, a trustworthy commu-nication subsystem is constructedwith basic intrusion-tolerancemechanisms. Higher-layer distrib-uted software can then trust thissubsystem for secure communica-tion among participants withoutworrying about network intrusionthreats. Alternatively, an even moretrustworthy higher layer can bebuilt on top of the communicationsubsystem—by incrementally usingintrusion-tolerance mechanisms—such as a replication managementprotocol that’s resilient against bothnetwork and host intrusions.

Architectural optionsA MAFTIA host’s structure relies ona few main architectural options,some of which are natural conse-quences of the discussions in the pre-vious section:

• The notion of trusted—versus un-trusted—hardware. Most of MAF-TIA’s hardware is untrusted, butsmall parts of it are trusted to the ex-tent of some quantifiable measure oftrustworthiness—for example, beingtamper-proof by construction.

• The notion of trusted support soft-ware that can execute a few functionscorrectly, albeit in an environmentsubjected to malicious faults.

• The notion of a runtime environ-ment that extends operating systemcapabilities and hides heterogeneityamong host operating systems byoffering a homogeneous API andframework.

• The notion of trusted distributedcomponents, materialized byMAFTIA middleware, which aremodular and multilayered. Eachlayer overcomes lower layers’ faultybehavior.

We can depict the MAFTIA ar-chitecture in at least three differentdimensions (see Figure 2). Thehardware dimension includes thehost and networking devices thatcompose the physical distributedsystem. Within each node, the op-erating system and runtime plat-form (which can vary from host tohost) provide local support services.Finally, MAFTIA provides distrib-uted software: the layers of middle-ware running on top of the runtimesupport both the mechanisms thateach host provides and MAFTIA’snative services—authorization, in-trusion detection, and trusted thirdparties. To operate securely acrossseveral hosts even in the presence ofmalicious faults, applications run-ning on MAFTIA use the abstrac-tions that the middleware andapplication services provide.

HardwareWe assume that the hardware in indi-vidual MAFTIA hosts is untrusted ingeneral. In fact, most of a host’s oper-ations run on untrusted hardware—such as the usual PC or workstationmachinery connected through thenormal networking infrastructure tothe Internet, which we call the pay-load channel. However, some hostsmight have pieces of hardware thatare trusted to the extent of seemingtamper-proof (that is, we assume in-truders don’t have direct access to theinside of these components). MAF-TIA features two incarnations ofsuch hardware, both of which areeasy to incorporate in standard ma-chines because they’re commercial-off-the-shelf (COTS) products. Oneis a smart card (actually a Java card),connected to the machine’s hardwareand interfaced by the operating sys-tem. The other is an appliance board,which has a processor and an adapterto a (trusted) control network. Theruntime support has specializedfunctions provided by the trustedsupport software and implementedin two components, the Java CardModule ( JCM) and the TrustedTimely Computing Base (TTCB).For less demanding configurations,we also designed a software-imple-mented TTCB.4

Local runtime supportThe MAFTIA architecture’s run-time support dimension essentiallyconsists of the operating system aug-mented with appropriate extensions.The middleware, service, and appli-cation software modules run on theJava virtual machine ( JVM) runtimeenvironment. The JCM assists theoperation of a reference monitor thatsupports the MAFTIA authorizationservice.5 This reference monitorchecks all accesses to local objects andautonomously manages all accessrights for local objects. We trust theJava card to be tamper-proof for ap-plication services whose value ismuch less than the effort—in meansor time—necessary to subvert it.

56 IEEE SECURITY & PRIVACY ■ JULY/AUGUST 2006

Figure 2. The MAFTIA architecture’s three dimensions. Hardware, local support, anddistributed software help applications operate securely across several hosts, even in thepresence of malicious faults.

Untrustedhardware

Trustedhardware

Payloadchannel

(Internet)

Controlchannel

AS

TTP

IDS

TTCB

Applications

Activity support services

Communicationsupport services

Multipoint networkTrustedsoftware

Hardware Distributed softwareLocal support

Runtimeenvironment(JVM + Appia)

Operatingsystem

On the Horizon

Distributed runtime supportThe TTCB is a distributed trustedcomponent responsible for provid-ing a basic set of trusted time and se-curity services to middlewareprotocols for communication andactivity support. The TTCB servicesare accessed locally through runtimesupport but can have global reach,such as making a value known to alllocal TTCB parts, thus limiting thepotential for Byzantine faults by ma-licious protocol participants, as wediscuss later. We can assume that theTTCB component is infeasible tosubvert, but it might be possible tointerfere with its software compo-nent interactions through the JVM.Although this exposes a local host tocompromise, it doesn’t underminethe distributed TTCB operation.

MiddlewareThe middleware layers implementfunctionality at different levels of ab-straction and make it accessible at theinterfaces of several middlewaremodules. These interactions occurthrough the runtime environmentvia predefined APIs.

As mentioned earlier, a middle-ware layer can overcome the faultseverity at lower layers and providecertain functions in a trustworthyway. A (distributed) transactionalservice, for example, trusts that a(distributed) atomic multicast com-ponent ensures typical properties(agreement and total order), regard-less of the fact that the underlyingenvironment can suffer maliciousByzantine attacks.

MAFTIA’s intrusion-tolerance strategiesGiven the variety of possible MAF-TIA applications, different archi-tectural strategies should beavailable to cope with different riskscenarios. MAFTIA offers severalintrusion-tolerance strategiesthrough a versatile combination ofadmissible failure assumptions. Sys-tem designers can apply these

strategies at several levels of abstrac-tion in the architecture and, mostimportant, in the implementationof the middleware and applicationservices. An extended discussionappears elsewhere.6

Ultimately, MAFTIA suppliesdifferent solutions for different levelsof threats and criticality (dependingon the value of services or informa-tion), keeping the best possibleperformance-resilience trade-off.However, anything less than “arbi-trary behavior” as an assumptionraises eyebrows among many securityand cryptography experts, so thisstatement deserves discussion.

A crucial aspect of any fault- orintrusion-tolerant architecture is thefault model on which the system ar-chitecture is conceived and compo-nent interactions are defined.Classically, making assumptionsabout hacker behavior isn’t very sen-sible, which is why many system de-signers tend to assume any behavioris possible (asynchronous, arbitrary,or Byzantine).

However, such weak assump-tions limit the system’s power andperformance—for example, could itstill fulfill a service-level agreement(SLA), which is a contract a serviceprovider makes with a client aboutquality of service (QoS)?

Architectural hybridizationUp until recently, increased systemperformance or QoS have meant lesssecurity. But MAFTIA has advancedthe state of the art, demonstratingthat it’s possible to build applicationsthat gather the best of both worlds:high resilience at the level of arbi-trary failure systems and high perfor-mance at the level of controlledfailure systems.

Through the innovative conceptof architectural hybridization, the archi-tecture simultaneously supportscomponents with different kinds andseverity of vulnerabilities, attacks,and intrusions.7 For example, part ofthe system might be assumed to be

subject to malicious attacks, whereasother parts are specifically designedin a different manner, to resist differ-ent sets of attacks. These hybrid fail-ure assumptions are in fact enforcedin specific parts of the architecture,by system component construction,and are thus substantiated. That is, inMAFTIA, trusting an architecturalcomponent doesn’t mean makingpossibly naive assumptions aboutwhat a hacker can or can’t do to thatcomponent. Instead, the componentis specifically constructed to resist awell-defined set of attacks.

Wormholes modelArchitecture isn’t enough to solvethe resilience problem, though. Thecorrectness arguments of the algo-rithms and protocols rely on thewormholes model, a hybrid distrib-uted-systems model that postulatesthe existence of enhanced parts (orwormholes) of a distributed systemcapable of providing stronger behav-ior than is assumed for the rest of thesystem. MAFTIA perfected this hy-brid distributed-system modelspecifically for Byzantine faults.7

Protocol participants exchangemessages in a world full of threats. Ifsome of them are malicious andcheat, a wormhole can implement adegree of trust for low-level opera-tions: as a local oracle whose infor-mation can be trusted, as a channelthat participants can use to get intouch with each other securely(even for rare moments and forscarce bits of information), or as aprocessor that reliably executes a fewspecific functions or synchroniza-tion actions. Systems using strands ofthis model have received increasingamounts of attention lately.

A wormhole in the particularuse of a trusted security componentmight look like a trusted comput-ing base with a reference monitor.In fact, the concept is more generalin two senses. First, trusted com-puting base’s philosophy was basedon system-level prevention of in-trusions, whereas wormhole mod-


On the Horizon

els have intrusion tolerance inmind: they would allow (and toler-ate) intrusions even in the re-ference-monitor-protected part,significantly reducing the part ofthe system about which strongclaims of tamper-proofness aremade. Second, a wormhole can im-plement any semantics, including areference monitor, but also simplermechanisms, such as random num-ber generators or key distribution,or innovative distributed agree-ment microprotocols.8

Real wormholesThe wormholes model can be real-istically implemented via thenotion of architectural hybridiza-tion.7 MAFTIA implements eachwormhole as a trusted–trustworthycomponent—a component thatcan be trusted because it’s “better”by construction.

Figure 3 shows a snapshot of areal system that uses TTCB worm-holes. The general, or payload, sub-systems are the normal machines andnetworks depicted in dark shading inthe figure. Each host contains thetypical software layers such as the op-erating system, runtime environ-ment, and middleware. Think of asmall appliance board connected tothe machine bus (these boards exist

as COTS components), implement-ing a set of useful functions in a pro-tected manner. This is the MAFTIATTCB wormhole, depicted withlight shading in Figure 3. This proof-of-concept prototype was builtusing simple prevention techniques.However, adequately designedMAFTIA wormholes can withstandextreme attack levels, even life-cycleattacks (insertion of malicious codeduring development), by resortingto recursive design: a wormhole canitself be a modular or distributedsubsystem designed with intrusion-tolerance techniques (replication,diversity, obfuscation, rejuvenation),to substantiate trustworthinessclaims as firmly as desired, such aseliminating single points of failure.

As we mentioned earlier, worm-holes can be local or distributed.The TTCB is distributed through acontrol channel, which is a privatenetwork. As a practical example,consider a Web server farm in a datacenter that’s tolerant to intrusionsfrom the Internet: servers are con-nected through the payload Ether-net to the Internet, but the localwormhole boards are isolated fromthe directly attackable servers. Theboards are interconnected through asecondary Ethernet that’s com-pletely isolated from the payload

Ethernet or Internet—this is the pri-vate control channel. Even if the ma-chine is corrupted, the hacker can’ttamper with the local wormholeboard or with the control channel.

Trusted–trustworthy componentsMAFTIA assumes a fairly severefault model, assuming that hosts andthe communication environmentare asynchronous and can all be in-truded upon. However, hosts canhave local trusted components im-plementing certain functions (suchas random number generation, sig-nature, and time) that can be in-voked at certain steps of theMAFTIA software’s operation andwhose result can be trusted as alwayscorrect, regardless of intrusions inthe rest of the system. The construc-tion of the MAFTIA authorizationservice followed this local trustedcomponents strategy, which is im-plemented around Java cards fitted insome hosts.5

The distributed trusted compo-nents strategy amplifies the scope oftrust. As such, certain global actionscan be trusted (such as global timeand block agreement), despite gen-erally malicious communication andhost environments. MAFTIA im-plements this strategy through theTTCB, which is in effect a securitykernel distributed across severalhosts. Several of the MAFTIA mid-dleware protocols follow this strat-egy, and in fact these protocolssupport the MAFTIA intrusion-tolerant transactional service.6

Arbitrary failure assumptions The hybrid failure approach, how-ever resilient, relies on trustedcomponent assumptions, or trust-worthiness. Several operations willhave a value or criticality such thatthe risk of failure due to possibleviolation of these assumptions,however small, can’t be incurred.

The only way to lower the riskeven further is by resorting to arbi-


Figure 3. Architecture with a Trusted Timely Computing Base (TTCB) wormhole. Thegeneral systems are depicted in dark shading, and the wormhole is in light shading.

k

R

Process calls a service of the local wormhole

P1 P2 P3 Pi

Pi

Control channelPayload network

Process uses the payload networkto send messages to other processes

Local wormhole exchanges datathrough the secure control channel

On the Horizon

trary failure modes, in which noth-ing is assumed about the way com-ponents could fail. Consequently,this is another strategy pursued inMAFTIA—arbitrary-failure-re-silient components—namely, com-munication protocols of theByzantine class that don’t make as-sumptions about the existence oftrusted components. Some of theprotocols the MAFTIA middlewareuses to follow this strategy are of theprobabilistic Byzantine class andoffer several qualities of service (bi-nary and multivalued Byzantineagreement and atomic broadcast).Some MAFTIA trusted-third-partyservices rely on them.9

MAFTIA middlewareFigure 4 shows the MAFTIA mid-dleware’s layers. The lowest layer isthe multipoint network (MN), whichis created on the physical infra-structure. Its main properties arethe provision of multipoint ad-dressing, basic secure channels, andmanagement communications, allof which hide the underlying net-work’s specificities.

The communication support services(CS) module implements basiccryptographic primitives, Byzantineagreement, group communicationwith several reliability and orderingguarantees, clock synchronization,and other core services. The CSmodule depends on the MN mod-ule to access the network. The activ-ity support services (AS) moduleimplements building blocks that as-sist participant activity, such as repli-cation management, leader election,transactional management, autho-rization, key management, and soforth. It depends on the services theCS module provides.

The block to the left of the fig-ure implements failure detectionand membership management.Failure detection assesses remotehosts’ connectivity and correctnessand local processes’ liveness. Mem-bership management, which de-pends on failure information, helps

create and modify group member-ships (registered members) and theview (currently active, nonfailed,or trusted members). Both the ASand CS modules depend on thisinformation.

As discussed earlier, an estab-lished way for achieving fault or in-trusion tolerance is to distribute aservice among a set of servers andthen use replication algorithms formasking faulty servers. No singleserver has to be trusted completely,and the overall system derives its in-tegrity from a majority of correctservers. Consequently, a very im-portant part of the MAFTIA archi-tecture is related to the algorithmicsuites that implement communica-tion and agreement among processesin different hosts.

Let’s look more closely at the twomain configurations of the MAF-TIA middleware and algorithms.6

Byzantine agreementin an arbitrary worldIn this configuration, the systemmodel doesn’t include timing as-sumptions and is characterized by astatic set of servers with point-to-point communication and the useof modern threshold cryptography.There are no a priori trusted com-ponents, and trusted applicationsare implemented by deterministic

state machines replicated on all theservers and initialized to the samestate. An atomic broadcast protocoldelivers client requests, imposing atotal order on all of them and guar-anteeing that the servers performthe same sequence of operations.The atomic broadcast is built froma randomized consensus proto-col—that is, a protocol for Byzan-tine agreement.

Model. This asynchronous model issubject to Fischer, Lynch, and Pater-son’s10 impossibility result of reach-ing consensus by deterministicprotocols (FLP). Many developers ofpractical systems seem to haveavoided this model in the past andbuilt systems that are weaker thanconsensus and Byzantine agree-ment. However, randomization cansolve Byzantine agreement in an ex-pected constant number of rounds,as MAFTIA does. We use Byzantineagreement as a primitive for imple-menting atomic broadcast, which inturn guarantees a total ordering of alldelivered messages.

Cryptography. To protect keys, weuse threshold cryptography, an intru-sion-tolerant form of secret sharing.Secret sharing lets a group of npartiesshare a secret such that t or fewer ofthem have no information about it,


Figure 4. Detail of the MAFTIA middleware, showing the different architectural blocks.

Network

Failuredetection

andmembership

Multipoint network (MN)

Applications

Middleware

Participantm

Participantn

Participantp

Activity support services (AS)

Communication support services (CS)

On the Horizon

but t + 1 or more can uniquely re-construct it. However, someonecan’t simply share a cryptosystem’ssecret key and reconstruct it to de-crypt a message because as soon as asingle corrupted party knows thekey, the cryptosystem becomes com-pletely insecure and unusable.

In a threshold public-key cryptosys-tem, for example, each party holds akey share for decryption, which isdone in a distributed way. Given a ci-phertext resulting from encrypting amessage, individual parties decrypt-ing it output a decryption share. Atleast t + 1 valid decryption shares arerequired to recover the message. An-other important cryptographic algo-rithm is the threshold coin-tossingscheme, which provides a source ofunpredictable random bits that onlya distributed protocol can query. It’sthe key to circumventing the FLPimpossibility result, and the ran-domized Byzantine agreement pro-tocol uses it in MAFTIA.

No timing assumptions. Workingin a completely asynchronous modelis attractive because the alternative ofspecifying timeout values actuallyconstitutes a system’s vulnerability.It’s sometimes easier, for example,for a malicious attacker to simplyblock communication with a serverthan subvert it, but a system withtimeouts would nevertheless classifythe server as faulty.

This is how attackers can fooltime- or timeout-based failure de-tectors into making an unlimitednumber of wrong failure suspicionsabout honest parties. The problemarises because a crucial assumptionunderlying the failure detector ap-proach—namely, that the commu-nication system is stable (failuredetection is accurate) for a suffi-ciently long period to allow protocoltermination—doesn’t hold against amalicious adversary.

Secure asynchronous agreementand broadcast. Several protocolsare used in this architecture configu-

ration, such as reliable and consistentbroadcast, atomic broadcast, and se-cure causal atomic broadcast. De-tailed descriptions appearelsewhere.9,11,12 These protocolswork under the optimal assumptionthat fewer than one-third of theprocesses become faulty at any time.They’re implemented in a modularand layered way.

Byzantine agreement requires allparties to agree on a binary valueproposed by an honest party. Ourrandomized protocol checks if theproposal value is unanimous or elseadopts a random value.11 Multival-ued Byzantine agreement is basedon the previously described protocoland provides agreement on valuesfrom large domains.12 A basic broad-cast protocol in a distributed systemwith failures is reliable broadcast,which provides a way for a party tosend a message to all other parties. Itrequires that all honest parties deliverthe same set of messages and that thisset includes all messages broadcast bysuch parties, but makes no assump-tions if a message’s sender is cor-rupted. An atomic broadcastguarantees a total order on messagessuch that honest parties deliver mes-sages in the same order. Any imple-mentation of atomic broadcast mustimplicitly reach agreement whetherto deliver a message sent by a cor-rupted party, and, intuitively, this iswhere the Byzantine agreementmodule is needed: the parties pro-ceed in global rounds and agree on aset of messages to deliver via multi-valued agreement.12 A secure causalatomic broadcast ensures a causalorder among all broadcast messages.It’s implemented by combiningatomic broadcast with a robustthreshold cryptosystem. Encryptionensures that messages remain secretup to the moment at which they’reguaranteed to be delivered, prevent-ing any violations of causal order by acorrupted party.

Reliable communicationOther MAFTIA middleware con-

figurations follow a strategy based ondistributed trusted components andarchitectural hybridization. In thisconfiguration we implemented sev-eral distributed protocols, such as re-liable multicast, atomic multicast,and consensus. These protocols’correctness arguments rely on thewormholes model.

Model. In this configuration, agroup of processes executes a proto-col, as Figure 3 suggests. Processesrun outside the wormhole (in thedark part) and communicate bysending messages through the pay-load network. At certain points oftheir execution, however, they canrequest trusted services from thewormhole by calling its interface.

The global system assumptions inthis configuration are weak: the sys-tem is assumed to be asynchronous,and processes and communicationcan suffer Byzantine faults. Conse-quently, this model is also bound tothe FLP impossibility result10 men-tioned earlier. The wormholesmodel lets processes invoke functionsthat have enough power to circum-vent the FLP impossibility, while stillmaintaining a generically weak andthus resilient model.7,13 In this case,the TTCB and the control channelprovide timely (synchronous) execu-tion and communication amongTTCB modules. In practical imple-mentations, these synchrony guaran-tees can be ensured (despite the restof the system being completely asyn-chronous) because the wormholehas complete control over its re-sources. Furthermore, it’s assumed tofail only by crashing: it either pro-vides its services as expected, or itsimply stops running.

Example TTCB wormhole ser-vices. In MAFTIA, the wormholesmetaphor is materialized by theTTCB, whose most important ser-vices are the local authentication ser-vice, which makes the necessaryinitializations and authenticates thelocal wormhole component before


On the Horizon

the process; the trusted time-stamp-ing service returns timestamps withthe current global time; and thetrusted block agreement service ap-plies an agreement function to a setof values and returns informationabout who proposed what.

Designing wormhole-aware pro-tocols. In the project, we designedseveral protocols to form a coherentByzantine-resilient communicationsuite, comprising reliable multi-cast,14 atomic multicast, simple,8 andvector consensus,13 and state ma-chine replication management.15

A correct use of the wormholesprinciple mandates that most of theprotocol execution occurs in thepayload subsystem. The wormholeservices are only invoked whenthere is an obvious trade-off be-tween what is obtained from thewormhole service and the com-plexity or cost of implementing it inthe payload subsystem.

As a quick example, let’s assume areliable broadcast execution. Theprotocol starts with the sender mul-ticasting a message through the pay-load channel; now the message’sintegrity and reception must bechecked. In classical protocols, thisentails some complexity or delay be-cause the sender might be maliciousor the network might be attacked orhave omission failures. Wormholescan help here: the sender and all re-cipients send a hash of the message tothe wormhole, which runs a simpleagreement on the hashes in its pro-tected environment, returning toeveryone the sender’s hash as a result.If all goes well, the protocol termi-nates in an extremely quick manner.In case of faults, additional informa-tion returned by the wormhole al-lows fast termination after a fewadditional interactions.

One achievement related to ourmodel is that most of the protocolshave lowered known bounds on therequired total number n of processesto tolerate a given number of Byzan-tine faults f, from n > 3f to n > f + 1

for reliable multicast14 and n > 2 f forstate-machine replication.15

Another achievement concernsperformance and complexity. Despitemaintaining Byzantine resilience andworking on essentially weak arbitraryand asynchronous settings, the proto-cols—thanks to wormholes—exhibitunusually high performance and lowcomplexity when compared to alter-native implementations.

Y ou can find a detailed descriptionof MAFTIA’s history at http://

istresults.cordis.lu/index.cfm/section/news/tpl/article/BrowsingType/Features/ID/69871.

AcknowledgmentsMAFTIA was a project of the EU’s IST (In-formation Society and Technology) program.The EC supported this work through projectIST-1999-11583.

References1. P. Veríssimo, N.F. Neves, and M.

Correia, “Intrusion-Tolerant Archi-tectures: Concepts and Design,”Architecting Dependable Systems,LNCS 2677, R. Lemos, C. Gacek,and A. Romanovsky, eds., Springer-Verlag, 2003, pp. 3–36.

2. D. Powell and R.J. Stroud, eds.,Conceptual Model and Architecture ofMAFTIA, Project MAFTIADeliverable D21, Jan. 2003;www.maftia.org/maftia/deliver-ables/D21.pdf.

3. R. Stroud et al., “A QualitativeAnalysis of the Intrusion-TolerantCapabilities of the MAFTIAArchitecture,” Proc. Int’l Conf. De-pendable Systems and Networks (DSN04), 2004, pp. 453–461.

4. M. Correia, P. Verissimo, and N.F.Neves, “The Design of a COTSReal-Time Distributed SecurityKernel,” Proc. 4th European Depend-able Computing Conf., Springer Ver-lag, 2002, pp. 234–252.

5. Y. Deswarte et al., “An Intrusion-Tolerant Authorization Scheme forInternet Applications,” Proc. Int’lConf. Dependable Systems and Net-

works, IEEE CS Press, 2002, pp.C1.1–C1.6.

6. P. Veríssimo et al., “Intrusion-Tol-erant Middleware: The MAFTIAApproach,” tech. report DI/FCULTR 04–14, Dept. of Informatics,Univ. of Lisbon, Nov. 2004.

7. P. Verissimo, “Uncertainty andPredictability: Can They Be Rec-onciled?” Future Directions in Dis-tributed Computing, LNCS 2584,Springer-Verlag, 2003, pp.108–113.

8. N.F. Neves, M. Correia, and P.Verissimo, “Solving Vector Con-sensus with a Wormhole,” IEEETrans. Parallel and Distributed Sys-tems, vol. 16, no. 12, 2005, pp.1120–1131.

9. C. Cachin and J.A. Poritz, “SecureIntrusion-Tolerant Replication onthe Internet,” Proc. Int’l Conf.Dependable Systems and Networks,IEEE CS Press, 2002, pp. 167–176.

10. M.J. Fischer, N.A. Lynch, and M.S.Paterson, “Impossibility of Dis-tributed Consensus with OneFaulty Process,” J. ACM, vol. 32,no. 2, 1985, pp. 374–382.

11. C. Cachin, K. Kursawe, and V.Shoup, “Random Oracles in Con-stantinople: Practical AsynchronousByzantine Agreement Using Cryp-tography,” Proc. 19th ACM Symp.Principles of Distributed Computing,ACM Press, 2000, pp. 123–132.

12. C. Cachin et al., “Secure and Effi-cient Asynchronous Broadcast Pro-tocols,” Advances in Cryptology,LNCS 2139, J. Kilian, ed., Springer-Verlag, 2001, pp. 524–541.

13. M. Correia et al., “Low Complex-ity Byzantine-Resilient Consen-sus,” Distributed Computing, vol. 17,no. 3, 2005, pp. 237–249.

14. M. Correia et al., “Efficient Byzan-tine-Resilient Reliable Multicast ona Hybrid Failure Model,” Proc. 21stIEEE Symp. Reliable Distributed Sys-tems, IEEE CS Press, 2002, pp. 2–11.

15. M. Correia, N.F. Neves, and P.Verissimo, “How to Tolerate HalfLess One Byzantine Nodes inPractical Distributed Systems,”Proc. 23rd IEEE Symp. Reliable Dis-


On the Horizon

tributed Systems, IEEE CS Press,2004, pp. 174–183.

Paulo E. Veríssimo is a professor at theUniversity of Lisbon Faculty of Sciencesand director of the LASIGE research lab-oratory. His research interests includearchitecture, middleware, and protocolsfor distributed, pervasive, and embeddedsystems, in the facets of real-time adapt-ability, security, and fault and intrusiontolerance (http://navigators.di.fc.ul.pt/).Contact him via www.di.fc.ul.pt/~pjv.

Nuno F. Neves is an assistant professorat the University of Lisbon, Portugal. Hisresearch interests include security andfault tolerance in parallel and distributedsystems. Neves has a PhD in computerscience from the University of Illinois atUrbana-Champaign. He is a member ofthe IEEE. Contact him via www.di.fc.ul.pt\~nuno.

Christian Cachin is a research staffmember at the IBM Zurich Research Lab-

oratory. His research interests includecryptographic protocols, security in dis-tributed systems, and steganography.Cachin has a PhD in computer sciencefrom ETH Zürich. He is a member of theIACR, ACM, and IEEE. Contact him viawww.zurich.ibm.com/~cca.

Jonathan Poritz is a lecturer at ColoradoState University. His research interestsinclude mathematical cryptography,quantum computing, privacy technology,trusted computing, and computer visual-ization of geometry. Poritz has a PhD inmathematics from the University ofChicago. He is a member of the ACM andthe American Mathematical Society. Con-tact him via www.poritz.net/jonathan.

David Powell is a “directeur deRecherche CNRS” at LAAS-CNRS. Hisresearch interests include distributedalgorithms for software-implementedfault-tolerance, ad hoc networked sys-tems, and critical autonomous roboticsystems. Powell has a PhD in computerscience from the Toulouse National Poly-

technic Institute. Contact him at [email protected].

Yves Deswarte is a “directeur derecherche” at the CNRS Laboratory for Analysis and Architecture of Systems. Hisresearch interests include fault-tolerance,security, and privacy in distributed com-puting systems. Contact him at [email protected].

Robert Stroud is a reader in computingscience at the University of Newcastleupon Tyne. His research interests includesecurity and fault-tolerance. Stroud has aPhD in computing science from Univer-sity of Newcastle upon Tyne. Contact himvia www.cs.ncl.ac.uk/people/r.j.stroud/.

Ian Welch is a lecturer at Victoria Uni-versity, New Zealand. His research interests include secure auctions, intru-sion detection, aspect-oriented develop-ment, and community computing. Welchhas a PhD from the University of New-castle upon Tyne. Contact him at [email protected].


Mid Atlantic (product/recruitment)Dawn BeckerPhone: +1 732 772 0160Fax: +1 732 772 0161Email: [email protected]

New England (product)Jody EstabrookPhone: +1 978 244 0192Fax: +1 978 244 0103Email: [email protected]

New England (recruitment)John RestchackPhone: +1 212 419 7578Fax: +1 212 419 7589Email: [email protected]

Connecticut (product)Stan GreenfieldPhone: +1 203 938 2418Fax: +1 203 938 3211Email: [email protected]

Midwest (product)Dave JonesPhone: +1 708 442 5633Fax: +1 708 442 7620Email: [email protected] HamiltonPhone: +1 269 381 2156Fax: +1 269 381 2556Email: [email protected] DiNardoPhone: +1 440 248 2456Fax: +1 440 248 2594Email: [email protected]

Southeast (recruitment)Thomas M. FlynnPhone: +1 770 645 2944Fax: +1 770 993 4423Email: [email protected]

Southeast (product)Bill HollandPhone: +1 770 435 6549Fax: +1 770 435 0243Email: [email protected]

Midwest/Southwest (recruitment)Darcy GiovingoPhone: +1 847 498-4520Fax: +1 847 498-5911Email: [email protected]

Southwest (product)Steve LoerchPhone: +1 847 498-4520Fax: +1 847 498-5911Email: [email protected]

Northwest (product)Peter D. ScottPhone: +1 415 421-7950Fax: +1 415 398-4156Email: [email protected]

Southern CA (product)Marshall RubinPhone: +1 818 888 2407Fax: +1 818 888 4907Email: [email protected]

Northwest/Southern CA (recruitment)Tim MattesonPhone: +1 310 836 4064Fax: +1 310 836 4067Email: [email protected]

JapanTim MattesonPhone: +1 310 836 4064Fax: +1 310 836 4067Email: [email protected]

Europe (product) Hilary TurnbullPhone: +44 1875 825700Fax: +44 1875 825701Email: [email protected]

A D V E R T I S E R / P R O D U C T I N D E X J U L Y / A U G U S T 2 0 0 6

Addison Wesley 5

ISSE 2006 11

LinuxWorld Conference & Expo 2006 13

Usenix Security Symposium 2006 Cover 4

Boldface denotes advertisements in this issue.

Advertising PersonnelAdvertiser Page Number

Marion DelaneyIEEE Media, Advertising DirectorPhone: +1 415 863 4717Email: [email protected] AndersonAdvertising CoordinatorPhone: +1 714 821 8380Fax: +1 714 821 4010Email: [email protected]

Sandy BrownIEEE Computer Society,Business Development ManagerPhone: +1 714 821 8380Fax: +1 714 821 4010Email: [email protected]

Advertising Sales Representatives

Intrusion-tolerant middleware: the road to automatic security

Documents

Transcript of Intrusion-tolerant middleware: the road to automatic security