CC–PROTECT: Cooperative Cross-layer Protection to Mitigate the Impact of Hardware Defects on Multimedia Applications

KYOUNGWOO LEE
University of California, Irvine
AVIRAL SHRIVASTAVA
Arizona State University
NIKIL DUTT
University of California, Irvine
and
NALINI VENKATASUBRAMANIAN
University of California, Irvine

Soft errors are threatening the system reliability in mobile devices but traditional hardware protection techniques incur significant overheads in terms of power and performance. Thus, incorporating reliability in resource-limited mobile devices poses significant challenges. This paper discusses a cooperative method that exploits existing error control schemes at an application layer to mitigate the impact of hardware defects such as soft errors for mobile multimedia systems. So we study heterogeneous specifics about two different errors at two different abstraction layers, and present a cooperative cross-layer approach to obtain low-cost reliability at the minimal degradation of QoS. In particular, we propose a cooperative approach to combat soft errors at data caches by using dual schemes – a Drop and Forward Recovery and an error-resilient video encoding – driven by intelligent middleware schemes. Experimental evaluation demonstrates that our cooperative, cross-layer method for a video encoding with different video streams improves performance by 60% and the energy consumption by 58% with even better reliability at the cost of 3% quality degradation on average, as compared to an ECC-based hardware protection technique. Combining intelligent schemes to select a recovery mechanism can guide system designers for trading off multiple constraints such as performance, power, reliability, and QoS. To explore the expanded design space, we present a heuristic exploration algorithm and it is more efficient by up to 3.95× than an exhaustive exploration approach to find a configuration with minimal composite metric under the quality constraint.

Author’s address: K. Lee, N. Dutt, and N. Venkatasubramanian, Department of Computer Science, University of California, Irvine, CA 92697. A. Shrivastava, Department of Computer Science and Engineering, Arizona State University, Tempe, AZ 85281.
This is an expanded version of a paper published in the Proceedings of the ACM International Conference on Multimedia (ACM MM) 2008. The current manuscript extends the previous paper by exploring the design space for CC-PROTECT with combined selective algorithms and proposing the heuristic algorithms to efficiently find the best configuration considering the multiple properties such as performance, power, reliability, and QoS.
Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.
© 20YY ACM 1529-3785/20YY/0700-0001 $5.00
ACM Transactions on Computational Logic, Vol. V, No. N, Month 20YY, Pages 1–0??.



1. INTRODUCTION

Soft errors, whose rate increases exponentially with technology scaling, will soon become an everyday concern [Hazucha and Svensson 2000; Wrobel et al. 2001]. Soft errors are transient faults with a variety of deep-submicron causes, including sudden voltage drops, signal interference, and random noise, but cosmic radiation strikes cause more soft errors than all other causes put together [Baumann 2005]. Soft errors are emerging as a problem in mobile multimedia devices, since these consumer devices, for competitive market reasons, increasingly deploy components manufactured using the latest technology and operate at low voltages for extended battery life. Both of these factors reduce the threshold charge $Q_{critical}$ required for a striking radiation particle to cause a soft error, which in turn greatly affects reliability because of the exponential relationship between $Q_{critical}$ and the soft error rate (SER) [Hazucha and Svensson 2000]. Indeed, Baumann [2005] predicted that the SER in next-generation SRAMs will be up to two orders of magnitude higher, and Ruckerbauer et al. [2007] measured the SER in 65nm SRAMs and found it higher than expected, mainly due to an unexpectedly high number of double-bit errors per single event upset induced by neutron radiation.

The occurrence of soft errors is directly proportional to the exposed area of the circuit [Cai et al. 2006]. Since memories occupy the majority of on-chip real estate, they are the most vulnerable to soft errors. Logic components are also becoming more susceptible to soft errors due to shrinking feature sizes and increasing pipeline depths [Mitra et al. 2005; Shivakumar et al. 2002]. Thus, mobile multimedia devices are especially prone to soft errors due to the sheer volume of data and the high complexity of algorithms in multimedia applications. The SER also depends strongly on where a device is operated. For example, the SER on airplanes is hundreds of times higher than at ground level due to the increasing saturation of neutrons at higher altitudes [Mastipuram and Wee 2004]. Mobile multimedia devices are likely to be used in pervasive computing environments with high SER, such as mountain tops and airplanes.

Combating soft errors in modern mobile multimedia devices is extremely challenging, owing to the multi-dimensional design requirements. Traditional reliability techniques attempt to provide the entire “fix” at one level, e.g., error correction codes (ECC) at the hardware level, packet retransmission at the network level, and triple modular redundancy (TMR) at the component level, and consequently have extremely high overheads. For example, TMR typically uses three functionally equivalent replicas of a logic circuit and a majority voter, but the hardware and power overheads of conventional TMR exceed 200% [Nieuwland et al. 2006]. Likewise, implementing an ECC-based scheme may raise access time by 95% [Li and Huang 2005] and power consumption by 22% [Phelan 2003] in the caches. Such high overheads are clearly not acceptable for mobile devices, which are extremely sensitive to performance, power, and cost overheads. On the other hand, an EDC (error detection code) based technique such as a parity code [Pradhan 1996] incurs much lower overheads than an ECC-based technique such as a Hamming code [Pradhan 1996]. For example, a parity code decreases the access delay by up to 47% and the power consumption by up to 73% compared to a Hamming code (38,32), according to Li et al. [2004]. However, an EDC scheme can only detect an error. Thus, to make a system tolerant of soft errors, an EDC scheme must be followed by an error recovery technique such as rollback recovery with checkpoints [Pradhan 1996]. Such backward error recovery (BER), however, is inappropriate for real-time applications, since its completion time is in general poorly predictable.

[Figure 1: block diagram of a mobile device with Application (Video Encoding), Middleware / Operating System, and Hardware (Data Cache) layers. Video input and video output pass through the network. External Data is Multimedia Data; Internal Data is Multimedia Data and Control Data.]

Fig. 1. System Model - Mobile Video Encoding System

This paper studies low-cost approaches to mitigate the impact of soft errors on data caches in a mobile video encoding system, as shown in Fig. 1. We propose a cooperative, cross-layer approach (CC-PROTECT). CC-PROTECT performs error detection rather than error correction at the hardware layer, since error detection is more cost-efficient than error correction. To correct an error, we propose and explore joint approaches of backward error recovery (BER) and drop and forward recovery (DFR) at the middleware layer. A DFR mechanism drops the erroneous frame once an error is detected, and reconstructs it by using data from adjacent frames [Wang and Zhu 1998]. While BER can result in significant power and performance overheads, DFR can result in a significant loss of QoS. To recover the QoS lost to frame drops, CC-PROTECT exploits an error-resilient video encoding scheme at the application layer; this scheme was developed to combat network errors and is used here to reduce the impact of soft errors on the video quality. CC-PROTECT extends the design space for mobile multimedia embedded systems. However, it is challenging to find effective configurations in terms of multiple constraints such as performance, power, reliability, and QoS in the greatly expanded tradeoff space. Thus, we present a composite metric to evaluate configurations considering multiple properties. We also propose a design space exploration heuristic (DSExplore) to find the interesting design points, especially QoS-aware configurations with minimal composite metric, i.e., minimal costs of power and performance with maximal reliability under a QoS constraint.

Contributions and results of this article are:

- We propose a cooperative, cross-layer protection technique named CC-PROTECT, which is very effective for achieving cost-efficient reliability in mobile multimedia applications.

- CC-PROTECT improves the performance by 61%, the energy consumption by 52%, and the reliability by about $2.1 \times 10^3$ times at the cost of less than 1 dB of video quality, as compared to no protection technique.

- We present several intelligent selective schemes, which extend the design space to trade off QoS for cost efficiency. Also, our proposed DSExplore efficiently finds interesting design points using the proposed composite metric.

- Our DSExplore heuristic can find interesting design configurations under a 32 dB video quality constraint for the video clip FOREMAN.QCIF 3.95 times faster than an exhaustive exploration approach in a limited tradeoff space.

2. CROSS-LAYER APPROACH

2.1 Motivation

Due to the rapid deployment of wireless communications, video applications on mobile devices such as video streaming and conferencing have grown dramatically. Error-prone networks, caused by congested routers and faded access points in wireless communications, incur packet delays and packet losses. It is well understood that mobile video applications have soft real-time constraints on data delivery. Packet losses in error-prone networks may degrade the video quality. Such degradation is acceptable to end users to some extent, mainly due to the fault-tolerant nature of video data. However, mobile devices are resource-constrained (limited buffer and battery), so the extent to which the soft real-time and fault-tolerant nature can be exploited is limited. To cope with network errors such as packet losses, error-resilient video encoding techniques have been investigated [Cheng and Zarki 2004; Kim et al. 2006; Wang et al. 2000; Worrall et al. 2001; Zhang et al. 2000]. These techniques allow the resilience level to be adapted to the network conditions [Cheng and Zarki 2004; Kim et al. 2006]. Note that the main objective of resilient techniques is to reduce the impact of packet losses on the video content. We refer to the video content as external data, i.e., the payload on which the video application is executed, as illustrated in Fig. 1. In contrast, internal data is defined as the data, program code, etc., residing inside the mobile device during execution, as shown in Fig. 1; internal data represents the programs and data implementing the application functionality, e.g., the video codec and associated data and variables.

The key observation is that errors on external data only cause quality degradation, while errors on internal data cause not only quality degradation but also system failures, as described in Table I. In particular, defects induced at a hardware component, e.g., soft errors on data caches, can result in system crashes, infinite loops, and memory segmentation faults, causing application failures. Techniques to protect internal data from hardware defects such as soft errors have been investigated, but conventional techniques such as TMR and ECC [Pradhan 1996] incur significant overheads in terms of power and performance. For example, the overheads of conventional TMR may exceed 200% [Nieuwland et al. 2006]. Cost-efficient techniques have been presented, but they still have overheads in terms of performance and power. For instance, PPC (Partially Protected Caches) [Lee et al. 2006] utilizes knowledge of content and device hardware capabilities to selectively place critical data in more reliable hardware (e.g., a protected cache), but it still incurs power and performance overheads at the protected cache.

Fig. 2 shows isolated approaches of error-resilient video encoding and error-protected data cache. Error-resilient video encoding is designed to protect external video data from packet losses in error-prone networks, while an error-protected data cache is designed to protect internal data from soft errors in error-prone hardware. They have been studied separately mainly because i) they target errors with different characteristics, as summarized in Table I, and ii) no method exists for making two schemes at different abstraction layers cooperate, for example, a way to convert the representation of an error rate between the different error control schemes.

This article investigates the effectiveness of coupling and coordinating different error control schemes across system abstraction layers in mobile devices. Our goal is to find effective solutions that achieve high reliability with minimal performance and power costs and minimal QoS degradation. We believe that addressing such performance, power, reliability, and QoS tradeoffs in the presence of hardware defects requires a cross-layer approach. First, we need to understand the characteristics of errors and the existing error control schemes at each abstraction layer, as summarized in Table I. This understanding lets us determine how and when errors become failures in mobile devices. We can then design appropriate schemes that make existing schemes cooperate to mitigate the impact of specific errors in a cost-efficient way. By judiciously coupling existing techniques, the cooperative, cross-layer approach is able to maximize the effectiveness of the techniques and to minimize the negative impact of errors in a cost-efficient way.

2.2 Our Cooperative Cross-layer Approach

Increasing exponentially with each technology scaling, hardware-induced soft errors pose a significant threat for the reliability of mobile multimedia devices. Our cooperative, cross-layer method implements a cost-efficient approach to mitigate the impacts of soft errors on mobile video encoding systems. Table I shows soft errors can cause not only quality degradation but also failures.

First, we exploit a PPC architecture with an error detection code rather than an error correction code at the hardware layer, which significantly reduces the performance and power overheads. However, error detection does not correct errors. Thus, to improve the reliability, drop and forward recovery (DFR) is used to recover from the detected errors. The DFR mechanism is similar to backward error recovery with checkpoints (BER), except that it moves forward to the next frame (in frame-based video processing) rather than rolling backward when an error is detected. Dropping frames may degrade the video quality, but it skips the expensive processing algorithms of video encoding, which significantly reduces execution time and power consumption. In this way we can save energy and improve performance while obtaining reliability as high as that of conventional error correction schemes.

Table I. Error models and error control schemes at different abstraction layers

Abstraction Layer | Application Layer                         | Hardware Layer
------------------+-------------------------------------------+-----------------------------------------------------
Error Model       | Packet Losses                             | Soft Errors
Data Perspective  | External Data                             | Internal Data
Causes            | Congestion and Noise                      | Radiation
Impacts           | Quality Degradation                       | Quality Degradation and System Failure
Protection        | Error-Resilience, Error-Concealment, etc. | Triple Modular Redundancy, Error Correction Codes, etc.
Error Metric      | Packet Loss Rate (%)                      | Soft Error Rate (FIT (a))
Time Perspective  | Dynamic                                   | Transient

(a) FIT (Failures In Time): the number of failures in $10^9$ operation hours

[Figure 2: a mobile video device between video input and video output. Error-resilient video encoding at the application layer handles packet loss from the error-prone network, and an error-protected data cache handles soft errors from the error-prone hardware, with no coupling and no cooperation between the two.]

Fig. 2. No cooperation has been studied to mitigate the impact of hardware defects by exploiting error resilient techniques at the application layer.

Soft errors and frame drops may degrade the video quality. In our cooperative, cross-layer approach, the video quality is managed by error-resilient video encoding techniques. Interestingly, PBPAIR [Kim et al. 2006] is not only error-resilient but also energy-efficient. We exploit PBPAIR at the application layer to mitigate the impact of hardware-induced soft errors on the QoS in a cost-efficient way. To exploit different approaches in a cooperative way, we develop middleware-driven schemes.

2.3 Related Work

Existing work already demonstrates the effectiveness of cross-layer methods for mobile multimedia as opposed to schemes isolated at a single abstraction layer [Kim et al. 2008; Mohapatra et al. 2003; Mohapatra et al. 2005; Yuan and Nahrstedt 2004]. Yuan et al. [2003] proposed an energy-efficient real-time scheduler (GRACE-OS) based on the statistical distribution of application cycle demands, and presented a practical voltage scaling algorithm [Yuan and Nahrstedt 2004] to coordinate the adaptation of multimedia applications and CPU speeds for mobile multimedia systems. Mohapatra et al. [2003] presented an integrated power management technique considering hardware-level power optimization and middleware-level adaptation to minimize the energy consumption while maintaining the user experience of video quality in mobile video applications. Recently, Kim et al. [2008] proposed a unified framework that allows coordinated interactions among sub-layer optimizers through constraint refinement in a compositional cross-layer manner to tune the system parameters.

Cross-layer methods in the OSI reference model have been widely investigated as promising optimization tools to efficiently reduce the transmission energy consumption in wireless multimedia communications [Bajic 2007; Vuran and Akyildiz 2006; van der Schaar and Turaga 2007]. Vuran et al. [2006] presented a cross-layer methodology to analyze error control schemes with respect to transmission power and end-to-end latency, especially the impacts of routing, medium access, and physical levels in wireless sensor networks. Schaar et al. [2007] proposed a joint cross-layer approach of application-layer packetization and MAC-layer retransmission strategy, and developed on-the-fly adaptive algorithms to improve the video quality under bandwidth and delay constraints for wireless multimedia transmission. Bajic [2007] developed cross-layer error control schemes considering joint source rate selection and power management for wireless video multicast.

Our work is novel in two respects. First, we address a broader notion of reliability than has been explored for error-resilient multimedia applications by specifically focusing on hardware induced defects and their impacts. As illustrated earlier, this issue is a leading concern for embedded architectures of the future. Secondly, we present how to exploit the cross-layer methodology to activate error control schemes at one abstraction layer to combat errors at a different abstraction layer.

3. COOPERATIVE CROSS-LAYER PROTECTION

3.1 Cooperative Protection Across System Layers

[Figure 3: CC-PROTECT for mobile video encoding. Hardware (Data Cache - PPC): an unprotected cache and an EDC-protected cache, providing error detection and SER. Middleware/OS: monitor and translate SER, support PPC and PBPAIR (data mapping), trigger selective DFR (error mitigation by BER or DFR). Application (Video Encoding - PBPAIR): error-resilient video encoding that takes raw video, constraints, and parameters (resilience level, FLR+PLR, with PLR feedback from the error-prone network) and produces error-aware video. In the example frame sequence f1 to f4, f2 is dropped due to soft errors (soft error induced frame loss) and f4 is lost due to packet losses (packet loss induced frame loss).]

Fig. 3. CC–PROTECT (Cooperative Cross-layer Protection)

In this section, we present CC-PROTECT, a middleware-driven approach for cooperative composition of cross-layer strategies to support error resilience. Fig. 3 illustrates our CC-PROTECT scheme, which exploits the error resilience of video encoding along with DFR-based error recovery mechanisms to mitigate the impact of soft errors at the hardware layer.

We instantiate two specific strategies for error recovery and error-resilience within CC-PROTECT. The specific error metric we use and evaluate in this article is soft error rate (SER). First, to mitigate the impact of soft errors on the video quality, we exploit a power-aware error-resilient video encoding technique, PBPAIR [Kim et al. 2006]. We present a simple, intuitive, and effective translation of SER into frame loss rate (FLR) used in turn by the error-resilient PBPAIR. Next, we exploit our prior work (partially protected caches or PPC [Lee et al. 2006]) to design a naive DFR mechanism. Using information captured in the middleware, we then extend the naive mechanism to achieve a balance between DFR and BER in Section 4.

3.2 Error Detection at Hardware

A PPC architecture consists of two caches at the same level of the memory hierarchy for unequal data protection; we refer to them as the unprotected cache and the protected cache. Typically, hardware-based ECC techniques are applied to the protected cache. In our design, the protected cache is equipped with an EDC that only detects errors and hence improves power and performance as compared to an ECC-equipped cache [Li et al. 2004]. Since multimedia data itself does not cause a system failure [Lee et al. 2006], multimedia data is mapped into the unprotected cache of the PPC and is thus exposed to soft errors, while the rest of the data is mapped into the protected cache. The mapping of data into the two caches is managed by the operating system, as shown in Fig. 3. To correct an error, we use drop and forward recovery at the middleware layer. Thus, an error detected in the protected cache is reported to the middleware, and the mitigation technique handles it, as shown in Fig. 3. In addition, to manage the quality degradation due to frame drops induced by soft errors on control data in the protected cache, the soft error rate is reported to the middleware, as shown in Fig. 3. CC-PROTECT thereby implements data protection using a PPC with an EDC scheme at minimal performance and power cost at the hardware layer.
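The difference between detecting and correcting an error, and the split of data between the two caches, can be illustrated with a small sketch. It assumes a single even-parity bit per word as the EDC; the function names and the `is_multimedia` flag are hypothetical and are not taken from the PPC design.

```python
from typing import Tuple


def parity_bit(word: int) -> int:
    """Even-parity bit of a non-negative integer word."""
    return bin(word).count("1") % 2


def store(word: int) -> Tuple[int, int]:
    """Store a word with its parity bit (hypothetical EDC-protected cache entry)."""
    return word, parity_bit(word)


def error_detected(entry: Tuple[int, int]) -> bool:
    """True if the stored word no longer matches its stored parity bit."""
    word, parity = entry
    return parity_bit(word) != parity


def choose_cache(is_multimedia: bool) -> str:
    """Assumed mapping rule: multimedia data goes to the unprotected cache,
    all other (control) data goes to the EDC-protected cache."""
    return "unprotected" if is_multimedia else "protected"


# A single flipped bit (a modelled soft error) is detected, but parity alone
# cannot tell which bit flipped, so a separate recovery step is still needed.
word, parity = store(0b10110010)
corrupted = (word ^ (1 << 3), parity)
print(error_detected((word, parity)))  # False
print(error_detected(corrupted))       # True
```

Note that a single parity bit misses an even number of bit flips in the same word, which is one reason the choice of EDC matters in practice.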

3.3 Drop and Forward Recovery at Middleware

Soft error rates, obtained by error detection techniques at the hardware layer, are communicated to the middleware, which then, as shown in Fig. 3:

(1) monitors errors, and maintains execution histories and video quality information,

(2) translates SER values to corresponding metrics used by other policies (frame loss rate or FLR in our case),

(3) initiates DFR/BER policies (discussed later) to avoid and bypass potential hardware failures, and

(4) adaptively fortifies multimedia content when hardware errors occur by triggering error-resilient encoding at the application layer.

Traditional error recovery techniques can be classified into Forward Error Recovery or FER (e.g., ECC) and Backward Error Recovery or BER (e.g., Checkpoints) [Pradhan 1996] according to when an error is recovered as shown in Fig. 4(a) and Fig. 4(b), respectively. We explore the use of Drop and Forward Recovery or DFR (See Fig. 4(c)) that combines error detection mechanisms with checkpoints to discontinue processing of the current frame and to initiate processing of the next checkpointed frame.

Our objective is to apply DFR techniques on a cache architecture that uses a PPC (partially protected cache). We first present a naive implementation of DFR in the PPC architecture. Here, checkpoints are taken just before the encoding of each frame K starts (similar to BER). The only difference is that DFR must save the values required for encoding the next frame (frame K+1) in Fig. 4(c). Whenever an error is detected on the (control) data in the protected data cache by the EDC mechanism in a PPC, the content for encoding the next frame is loaded into the protected data cache with the help of the operating system, averting a memory-based system failure. The expectation is that a frame drop induced by DFR generally does not cause a significant quality loss, mainly due to the inherent error tolerance of video data [Lee et al. 2008]. First, we exploit the error-resilient video encoding technique to recover the possible quality degradation due to frame drops. We next discuss extensions to the naive DFR scheme to overcome quality losses when they occur.
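The contrast between naive DFR and BER at frame granularity can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: errors are given as a set of frame indices, a BER retry is assumed to succeed on the second attempt, and the name `run_encoder` is hypothetical.

```python
from typing import Iterable, List, Tuple


def run_encoder(num_frames: int, error_frames: Iterable[int],
                policy: str) -> Tuple[List[int], List[int], int]:
    """Simulate frame encoding where an error is detected in `error_frames`.

    Returns (encoded frame indices, dropped frame indices, encoding attempts).
    """
    if policy not in ("DFR", "BER"):
        raise ValueError("policy must be 'DFR' or 'BER'")
    faulty = set(error_frames)
    encoded: List[int] = []
    dropped: List[int] = []
    attempts = 0
    for k in range(num_frames):
        # Checkpoint K is taken here, just before frame K is encoded.
        attempts += 1
        if k in faulty:
            if policy == "DFR":
                # Drop frame K and move forward to checkpoint K+1.
                dropped.append(k)
                continue
            # BER: roll back to checkpoint K and encode frame K again
            # (assumed to succeed on the retry).
            attempts += 1
        encoded.append(k)
    return encoded, dropped, attempts


print(run_encoder(5, [1, 3], "DFR"))  # ([0, 2, 4], [1, 3], 5)
print(run_encoder(5, [1, 3], "BER"))  # ([0, 1, 2, 3, 4], [], 7)
```

In this toy model DFR never spends more than one attempt per frame, at the price of dropped frames, whereas BER keeps every frame but pays for the re-encoding.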

[Figure 4: three timelines, each marking the point where an error is detected.
(a) Forward Error Recovery (FER): it detects and even corrects an error.
(b) Backward Error Recovery (BER): it rolls back to the last saved state (Checkpoint K) when it detects an error.
(c) Drop and Forward Recovery (DFR): it drops the current frame (Frame K, Checkpoint K) when an error occurs at control data, and moves forward to the next frame (Frame K+1, Checkpoint K+1) in case of frame-based multimedia processing.]

Fig. 4. Error Recovery Mechanisms

3.4 Error-Resilient Video Encoding at Application

Error-resilient video encoding techniques have been developed to reduce the impact of transmission errors, e.g., packet losses, on the video quality [Cheng and Zarki 2004; Kim et al. 2006; Worrall et al. 2001]. The PBPAIR (Probability-Based Power Aware Intra Refresh) technique [Kim et al. 2006] addresses the tradeoff between energy efficiency and compression efficiency based on given knowledge of network errors. PBPAIR is designed to increase the compression efficiency, i.e., to decrease the encoded file size, when the network is stable, and to decrease the compression efficiency by increasing the number of intra-coded macro-blocks in the video when the packet loss rate is high. Inherently, PBPAIR is energy-efficient (especially when packet loss rates are higher) and adaptive, since its resilience can be adjusted for various packet loss rates. PBPAIR takes two parameters: Packet Loss Rate (PLR) and Intra Threshold. PLR indicates the anticipated error rate in the network, and Intra Threshold can be adjusted according to the user's quality expectation. To make use of PBPAIR, our cross-layer approach converts SER into an equivalent loss rate for the PLR parameter, and selects Intra Threshold using the original method in PBPAIR. We present a simple conversion for SER. First, the number of soft errors, $N_{SE}$, during the encoding of one frame is calculated as $N_{SE} = S_{cache} \times N_{inst} \times R_{SE}$, where $S_{cache}$ is the size of a cache in KB, $N_{inst}$ is the number of instructions for encoding one frame, and $R_{SE}$ is the SER per instruction per KB. The $N_{SE}$ value is then converted to a percentage and used as the FLR (Frame Loss Rate) in our study. For example, if $S_{cache}$ is 32, $N_{inst}$ is $10^8$, and $R_{SE}$ is $10^{-10}$, then $N_{SE}$ becomes 0.32, so the FLR is 32%. Note that the FLR becomes 100% if $N_{SE}$ is larger than 1. PBPAIR can now generate compressed video data that is resilient against packet losses in networks (PLR) as well as against soft errors at the hardware layer (FLR), as shown in Fig. 3.
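The SER-to-FLR conversion described above is simple enough to write down directly. The sketch below follows the stated formula and the 100% cap; the function and parameter names are hypothetical.

```python
def frame_loss_rate(cache_kb: float, instructions_per_frame: float,
                    ser_per_inst_per_kb: float) -> float:
    """Translate a soft error rate into a frame loss rate in percent.

    N_SE = S_cache * N_inst * R_SE is the expected number of soft errors
    during one frame encoding; it is capped at 1, i.e. at an FLR of 100%.
    """
    n_se = cache_kb * instructions_per_frame * ser_per_inst_per_kb
    return min(n_se, 1.0) * 100.0


# Worked example from the text: 32 KB cache, 10^8 instructions per frame,
# and an SER of 10^-10 per instruction per KB.
print(round(frame_loss_rate(32, 1e8, 1e-10), 6))  # 32.0
```

The resulting percentage is what the middleware would pass on to the encoder alongside the network PLR.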

4. SELECTIVE ALGORITHMS AND EXTENDED DESIGN SPACE EXPLORATION

4.1 Intelligent Selective Algorithms

In a naive DFR approach, any single soft error at the hardware layer causes a frame drop whenever it occurs in the control data (non-multimedia data). However, naive DFR can significantly degrade the quality when consecutive frames are dropped. To prevent this, we present a family of intelligent schemes that select a policy balancing DFR and BER based on information available at the mobile device.

Slack-Aware DFR/BER

The main problem with BER is that it gives no guarantee of delivering the multimedia service in real time. However, if the time remaining before the deadline is long enough to re-encode the video frame when an error is detected, we can apply BER rather than DFR to improve the quality. Since the encoding time varies from frame to frame and it is hard to measure the remaining time accurately, our scheme provides a knob to select a policy based on the elapsed time relative to the ACET (Average Case Execution Time), as shown in Fig. 5. Our knob, $S$, indicates the portion of the ACET, $T_{ACET}$, that the system can endure. Thus, SA-DFR/BER (Slack-Aware DFR/BER) selects BER (lines 04-08 in Fig. 5) if the elapsed time from the start of the encoding of frame K to the time of the error occurrence, $T_{elapsed} = T_{error} - T_K$, is smaller than the given threshold time, $T_{threshold} = S \times T_{ACET}$. Otherwise, SA-DFR/BER selects DFR. For example, if $S = 0.2$ and $T_{ACET}$ = 100,000 cycles, $T_{threshold}$ becomes 20,000 cycles. Thus, an error occurring within 20,000 cycles of the start of the current frame encoding results in BER. A higher value of $S$ increases the probability that the BER policy is selected, and thus improves the video quality while incurring more performance and power overheads due to rollback recovery. Indeed, an infinite value of $S$ always results in the BER policy, and a zero value of $S$ always results in the DFR policy.

SINGLE DFR/BER(Algo)
01: policy = DFR
02: switch (Algo)
03:   case SLACK :
04:     T_elapsed = T_error − T_K
05:     T_threshold = S × T_ACET
06:     if (T_elapsed < T_threshold)
07:       policy = BER
08:     endIf
09:     break;
10:   case FRAME :
11:     F_importance = estimateImportance()
12:     if (F_importance > F_threshold)
13:       policy = BER
14:     endIf
15:     break;
16:   case QoS :
17:     Q_current = calcPSNR()
18:     if (Q_current < Q_threshold)
19:       policy = BER
20:     endIf
21:     break;
22: endSwitch
23: return policy

Fig. 5. Intelligent Selective Schemes (SINGLE DFR/BER)

Frame-Aware DFR/BER

Each frame has a different impact on the video quality. For example, I-frames are considered more important than P-frames from the perspective of video quality [Cheng and Zarki 2004; Kim et al. 2006]. Thus, if the frame in which a soft error is detected is important in terms of the video quality, FA-DFR/BER (Frame-Aware DFR/BER) rolls back and encodes this frame again (BER) to minimize the quality loss (lines 11-14 in Fig. 5). Otherwise, it drops the current frame and moves forward to the next frame (DFR). Based on the information available at the mobile device, the importance of a frame can be decided in several ways. The frame type, such as I-frame or P-frame, is one example: any I-frame is re-encoded until no soft error is detected. Other information, such as the recovery history (e.g., whether or not the previous frame was dropped due to a soft error), can also be used to decide a policy. If the previous frame was dropped, FA-DFR/BER prevents the current frame from being dropped, since consecutive frame drops may degrade the video quality significantly. Also, the difference between two consecutive frames can be used to estimate the importance of a frame in terms of the video quality. The intuition behind this approach is that a larger difference between two frames indicates a higher impact on the video quality if the current frame is lost. Thus, if the difference between them is larger than the given threshold value, FA-DFR/BER selects BER. Otherwise, DFR is selected.

This article exploits the difference between consecutive frames to determine the importance of frames.

QoS-Aware DFR/BER

The potential problem with a DFR mechanism is the significant degradation of the QoS due to several frame drops. Thus, QA-DFR/BER (QoS-Aware DFR/BER) selects BER when the delivered QoS fails to meet the QoS requirement. The current quality value, $Q_{current}$, for the frames that have been encoded so far can be calculated at the end of the encoding of each frame. QA-DFR/BER selects BER for the erroneous frame if $Q_{current}$ is worse than $Q_{threshold}$, the given threshold QoS value (lines 17-20 in Fig. 5). Otherwise, the default policy, DFR, is selected. The QoS delivered to the decoding end is also important, but an approach that considers transmission errors is beyond the scope of this paper.
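The three single-knob schemes of Fig. 5 can be sketched as small decision functions. This is an illustrative sketch only: the function names, the argument names, and the use of an enum for the policy are assumptions, and the importance and quality values are expected to be computed elsewhere.

```python
from enum import Enum


class Policy(Enum):
    DFR = "DFR"  # drop the frame and move forward
    BER = "BER"  # roll back and re-encode the frame


def slack_aware(t_error: float, t_frame_start: float,
                s: float, t_acet: float) -> Policy:
    """SA-DFR/BER: choose BER only if the error arrives early in the frame."""
    t_elapsed = t_error - t_frame_start
    t_threshold = s * t_acet
    return Policy.BER if t_elapsed < t_threshold else Policy.DFR


def frame_aware(f_importance: float, f_threshold: float) -> Policy:
    """FA-DFR/BER: choose BER only for frames deemed important."""
    return Policy.BER if f_importance > f_threshold else Policy.DFR


def qos_aware(q_current: float, q_threshold: float) -> Policy:
    """QA-DFR/BER: choose BER only if the quality so far is below target."""
    return Policy.BER if q_current < q_threshold else Policy.DFR


# Example from the text: S = 0.2 and T_ACET = 100,000 cycles give a
# threshold of 20,000 cycles; an error at cycle 15,000 selects BER.
print(slack_aware(15_000, 0, 0.2, 100_000))
print(slack_aware(25_000, 0, 0.2, 100_000))
```

Setting `s` to zero makes the slack-aware scheme behave like naive DFR, and setting it to `float("inf")` makes it behave like naive BER, which matches the limiting cases described above.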

4.2 Combined Selective Algorithms

To explore the tradeoff space at a finer granularity, we present combined selective algorithms such as double-knob selective algorithms and triple-knob selective algorithms.

COMBINED DFR/BER(Algo)
01: policy = DFR
02: if ((SINGLE DFR/BER(SLACK) == BER)
03:     and (SINGLE DFR/BER(FRAME) == BER)
04:     and (SINGLE DFR/BER(QoS) == BER))
05:   policy = BER
06: endIf
07: return policy

Fig. 6. Combined Hybrid Scheme of DFR/BER

SFA-DFR/BER, FQA-DFR/BER, QSA-DFR/BER

SFA-DFR/BER, FQA-DFR/BER, and QSA-DFR/BER are double-knob selective algorithms. If both conditions are satisfied, i.e., both single selective algorithms described in Fig. 5 select BER, they choose BER. Otherwise, DFR is chosen. Fig. 6 shows these algorithms. For example, SFA-DFR/BER calls SA-DFR/BER and FA-DFR/BER with their own knob values. Unless both select BER, SFA-DFR/BER goes with DFR. Note that double-knob selective algorithms call only two single-knob algorithms, and one of the three single-knob algorithms is ignored. For instance, QA-DFR/BER returns DFR in the example above.

SFQA-DFR/BER

SFQA-DFR/BER is the triple-knob selective algorithm, which selects BER only when all single-knob algorithms select BER, as shown in Fig. 6. Otherwise, SFQA-DFR/BER selects DFR. Since it considers all constraints, SFQA-DFR/BER has a lower chance of selecting BER than the single-knob and double-knob selective algorithms with the same constraints. Indeed, our experiments will show that the QoS of the triple-knob selective algorithm is worse than that of the double-knob selective algorithms, which in turn is worse than that of the single-knob selective algorithms under identical constraints. In contrast, cost overheads show the reverse order, i.e., the cost of the triple-knob selective algorithm is lower than that of the double-knob selective algorithms, which is lower than that of the single-knob selective algorithms.
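The combination rule of Fig. 6 is a logical AND over the participating single-knob decisions. The sketch below illustrates that rule under the assumption that each single-knob scheme has already returned the string "BER" or "DFR"; the function name is hypothetical.

```python
def combined_policy(decisions):
    """Return "BER" only if every participating single-knob scheme chose BER.

    `decisions` holds the outputs of the two (double-knob) or three
    (triple-knob) single-knob schemes that take part in the combination.
    """
    decisions = list(decisions)
    if not decisions:
        raise ValueError("at least one single-knob decision is required")
    return "BER" if all(d == "BER" for d in decisions) else "DFR"


# Double-knob (e.g., slack and frame): both must agree on BER.
print(combined_policy(["BER", "BER"]))
# Triple-knob: adding a third condition can only turn BER into DFR.
print(combined_policy(["BER", "BER", "DFR"]))
```

Because each extra condition can only veto BER, a combined scheme never selects BER more often than its constituent single-knob schemes at the same knob values, which is the source of the quality and cost ordering described above.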

4.3 Extended Design Space

4.3.1 DSE Problem. Our cooperative, cross-layer protection significantly extends the tradeoff space for multiple properties such as performance, energy consumption, reliability, and QoS. However, it is challenging to find the best configuration for a mobile video encoding system under those multiple constraints, mainly because of the interactions and conflicts among them in resource-limited mobile devices. For example, a redundancy technique such as an ECC scheme incurs high overheads in power and performance while increasing reliability and QoS. On the other hand, using no protection technique at any abstraction layer may reduce the power consumption and runtime delay, but it causes low reliability (i.e., a high failure rate) and quality degradation. Thus, finding an interesting design point is a multi-objective optimization problem in which we need to increase the reliability at minimal overheads of performance and energy consumption with minimal quality degradation. To show the effectiveness of CC-PROTECT and to efficiently explore the design space it greatly expands, we need a compound metric that considers multiple properties. Thus, we present a composite metric consisting of runtime, energy consumption, reliability, and video quality. Lower runtime and lower energy consumption are better, while higher reliability and higher video quality in PSNR are better. Thus, the composite metric models the cost paid for the video quality obtained by a design configuration. Each metric will be discussed in detail in Section 5.2.

Thus, our problem of design space exploration is formulated as: Given an allowable quality degradation, determine the configuration to minimize the composite metric.
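One compact way to write this formulation is given below. The notation is assumed here for illustration: $\mathcal{S}$ denotes the set of candidate configurations, $C(\cdot)$ the composite metric, $Q(\cdot)$ the resulting video quality, and $Q_{constraint}$ the quality floor implied by the allowable degradation.

```latex
\[
  cfg^{*} \;=\; \operatorname*{arg\,min}_{cfg \,\in\, \mathcal{S}} \; C(cfg)
  \qquad \text{subject to} \qquad Q(cfg) \;>\; Q_{constraint}
\]
```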

4.3.2 DSExplore Heuristics. In terms of video quality, $Q_{single}$ is better than $Q_{double}$, and $Q_{double}$ is better than $Q_{triple}$, where $Q_{single}$ is the quality of a single-knob selective algorithm and $Q_{double}$ ($Q_{triple}$) is the quality of a double-knob (triple-knob) selective algorithm generated from the single knobs. This is because combined algorithms such as SFA-DFR/BER and SFQA-DFR/BER require all conditions to be satisfied, which results in DFR being selected more frequently than BER. Thus, multiple-knob selective algorithms are worse than single-knob selective algorithms in terms of the video quality. On the other hand, more frequent DFR selections result in lower costs. Thus, in terms of the composite metric, $C_{triple}$ is better than $C_{double}$, and $C_{double}$ is better than $C_{single}$ in general. Based on these observations, we implement a design space exploration heuristic, DSExplore. DSExplore consists of two algorithms, DSExplore_single and DSExplore_combined, as shown in Fig. 7. First, DSExplore_single explores single-knob selective algorithms to satisfy the given quality requirement. Second, DSExplore_combined generates double-knob and triple-knob selective algorithms using the single knobs discovered by DSExplore_single, and finds the possible configurations with minimal costs compared to the single-knob selective algorithms.

Fig. 7(a) shows our DSExplore_single. First, each set consists of single knob values in ascending order of positive impact on QoS. For example, the knob values in $set_{slack}$ are ordered from 0% to 100%, since the estimated video quality of SA-DFR/BER with 100% slack is better than that of SA-DFR/BER with 0% slack. DSExplore_single finds the least knob value of each set for the single-knob selective algorithms. To this end, the first step pops one set out of the three sets $set_{slack}$, $set_{frame}$, and $set_{QoS}$. Next, DSExplore_single shifts one knob value at a time out of the set and evaluates the resulting configuration until it reaches the knob value satisfying the given quality constraint. Then, it saves this knob value in $set_{single}$ for later use by DSExplore_combined. DSExplore_single repeats the same evaluation for the remaining sets. Among the configurations, the best one in terms of the composite metric is maintained as the best configuration, $config_{best}$, and the minimal composite value, $C_{min}$, is updated as well. When no set with knob values remains, DSExplore_single ends and initiates DSExplore_combined.

Fig. 7. DSExplore – design space exploration heuristics: (a) DSExplore_single, (b) DSExplore_combined (flowcharts).

DSExplore_combined begins by generating a set of configurations using $set_{single}$ from DSExplore_single. By shifting out one configuration from the newly generated $set_{config}$, DSExplore_combined evaluates it in terms of the video quality and the composite metric, as shown in Fig. 7(b). If $Q_{constraint}$ is not satisfied, DSExplore_combined updates the set of configurations by increasing a knob value in the evaluated configuration. Otherwise, DSExplore_combined decreases a knob value to update $set_{config}$. For example, if QSA-DFR/BER with the 31 dB quality knob and the 10% slack knob does not satisfy $Q_{constraint}$, the new configurations are QSA-DFR/BER with 32 dB and QSA-DFR/BER with 20% slack. Note that our algorithm does not increase both knobs together at first, since doing so may lose the finer level of exploration. If a knob value cannot be increased or decreased any further, no configuration is added. Also, only unexplored configurations are added to $set_{config}$. DSExplore_combined updates the best configuration ($config_{best}$) and the minimal composite value ($C_{min}$) if the evaluated quality ($Q$) is larger than $Q_{constraint}$ and the composite metric value ($C$) is less than the minimal composite metric value ($C_{min}$), as shown in Fig. 7(b). If no more configurations exist in $set_{config}$, it terminates and returns the best configuration in terms of the composite metric. Since it may keep generating new configurations, we limit the number of rounds by $num_{round}$. $num_{up}$ and $num_{down}$ increase by one when the direction of UPDATE() is UP and DOWN, respectively, as shown in Fig. 7(b). If both $num_{up}$ and $num_{down}$ are equal to or larger than $num_{round}$, DSExplore_combined terminates and returns the best configuration discovered by DSExplore so far.
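The two phases described above can be sketched as follows. This is a simplified reading of Fig. 7, not the authors' implementation: the function signatures, the queue-based bookkeeping, and the `toy_evaluate` model (where the BER probability is the product of the knob levels) are all assumptions made purely for illustration.

```python
from fractions import Fraction
from itertools import combinations


def dsexplore_single(knob_sets, evaluate, q_constraint):
    """Find, per knob, the least value meeting the quality constraint."""
    set_single, best, c_min = {}, None, float("inf")
    for name, values in knob_sets.items():        # POP one knob set
        for value in values:                      # SHIFT the next knob value
            config = {name: value}
            quality, cost = evaluate(config)
            if quality > q_constraint:
                set_single[name] = value          # saved for the combined phase
                if cost < c_min:
                    best, c_min = config, cost
                break
    return set_single, best, c_min


def dsexplore_combined(knob_sets, set_single, evaluate, q_constraint,
                       best, c_min, num_round=10):
    """Explore double- and triple-knob configurations around set_single."""
    names = sorted(set_single)
    queue = [{n: set_single[n] for n in combo}    # GENERATE
             for k in range(2, len(names) + 1)
             for combo in combinations(names, k)]
    seen = {tuple(sorted(c.items())) for c in queue}
    num_up = num_down = 0
    while queue and not (num_up >= num_round and num_down >= num_round):
        config = queue.pop(0)                     # SHIFT
        quality, cost = evaluate(config)
        satisfied = quality > q_constraint
        if satisfied and cost < c_min:
            best, c_min = config, cost
        step = -1 if satisfied else 1             # UPDATE(DOWN) or UPDATE(UP)
        if satisfied:
            num_down += 1
        else:
            num_up += 1
        for name in config:                       # change one knob at a time
            values = knob_sets[name]
            idx = values.index(config[name]) + step
            if 0 <= idx < len(values):
                candidate = {**config, name: values[idx]}
                key = tuple(sorted(candidate.items()))
                if key not in seen:               # only unexplored configurations
                    seen.add(key)
                    queue.append(candidate)
    return best, c_min


def toy_evaluate(config):
    """Hypothetical model: BER probability is the product of knob levels / 3."""
    p_ber = Fraction(1)
    for level in config.values():
        p_ber *= Fraction(level, 3)
    return 30 + 3 * p_ber, 10 + 6 * p_ber         # (quality in dB, composite cost)


knobs = {"slack": [0, 1, 2, 3], "frame": [0, 1, 2, 3]}
single, best, c_min = dsexplore_single(knobs, toy_evaluate, Fraction(63, 2))
best, c_min = dsexplore_combined(knobs, single, toy_evaluate, Fraction(63, 2),
                                 best, c_min)
print(single, best, c_min)
```

In this toy model the cost is tied directly to the quality, so the combined phase explores the neighbourhood of the single-knob results without finding a cheaper configuration; with a real evaluation (simulation runs), the combined configurations are where the lower-cost points are expected to appear.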

5. EXPERIMENTAL FRAMEWORK

5.1 Simulation Setup

Fig. 8. Experimental Setup – Compiler/Simulator/Analyzer. (Block diagram: the compiler takes the application, either error-resilient or error-prone video encoding, and produces an executable and a page mapping; the cache simulator takes the video encoding parameters, video data, DFR parameters, SER, and the cache configuration with protected and unprotected cache parameters; the analyzer applies access penalties and power numbers to report QoS, performance, power, and reliability.)

We perform our study in an extensive simulation environment that we have built to model a processor-memory system like that of the HP iPAQ h5555 [Hewlett Packard ]. We have modified the sim-cache simulator from the SimpleScalar toolchain [Burger and Austin 1997] to model the PPC architecture and to inject soft errors as in [Lee et al. 2006]. To support unequal protection for a PPC architecture, the compiler shown in Fig. 8 generates not only an executable but also a page mapping table. The page mapping table lists the marked global variables (multimedia data), which are mapped into the unprotected data cache, while all other data are exclusively mapped into the protected data cache of the PPC during simulations. Note that all data are mapped into an unprotected cache in the case of an error-prone data cache.

As test video streams, AKIYO, FOREMAN, and COASTGUARD in QCIF format (176×144 pixels) are used for our simulation study; they represent video clips of low activity, medium activity, and high activity, respectively. To obtain cycle-accurate results within a reasonable amount of simulation time, the 300 frames of each video stream are chopped into 75 sequences of four frames (simulating a video encoding of 300 frames takes several hours on a Sun Sparc at 1.5 GHz). For example, the 300 frames of FOREMAN.QCIF are separated into FOREMAN0.QCIF, FOREMAN1.QCIF, . . ., and FOREMAN74.QCIF. We ran a simulation at least four times with each sequence, and thus at least 300 runs have been studied (300 runs = 4 runs × 75 sequences). DFR parameters are inputs for the selective DFR/BER schemes. For instance, a slack value ($S$) is given for SA-DFR/BER in Section 4.1.

The simulator models soft errors by randomly injecting single-bit errors and double-bit errors into an unprotected data cache according to the SERs. When an instruction is executed in the simulator, a single bit in the data cache is randomly chosen, and its value is inverted if a randomly generated number is less than the SER. Double-bit errors are injected similarly. Since a protected data cache is resilient against single-bit errors, only double-bit errors occur in it. To complete the experiments in a reasonable amount of time, accelerated SERs are used. The SER is set to $10^{-11}$ per KB per instruction for single-bit errors. Note that the SER for current technology (about $2.28 \times 10^{-17}$ at 90nm¹) is lower than this accelerated SER by several orders of magnitude, but it increases exponentially as technology scales [Baumann 2005; Hazucha and Svensson 2000; Mastipuram and Wee 2004; Wrobel et al. 2001]. However, we maintain the accurate ratio (about $10^{-2}$) between the double-bit SER and the single-bit SER; thus $10^{-13}$ per KB per instruction is used for double-bit errors.
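The injection step can be sketched as a function called once per simulated instruction. This is a minimal illustration, not the modified sim-cache code: the function name is hypothetical, the cache is modelled as a plain `bytearray`, and `ser` is assumed to be the per-instruction error probability for the whole cache (i.e., already scaled by the cache size in KB).

```python
import random


def maybe_inject_single_bit_error(cache: bytearray, ser: float,
                                  rng: random.Random) -> bool:
    """Flip one randomly chosen bit of `cache` with probability `ser`.

    Returns True if a bit was flipped for this instruction.
    """
    if rng.random() >= ser:
        return False
    bit = rng.randrange(len(cache) * 8)      # pick one bit in the cache
    cache[bit // 8] ^= 1 << (bit % 8)        # invert that bit
    return True


rng = random.Random(0)                       # fixed seed for repeatable runs
cache = bytearray(1024)                      # 1 KB of zeroed cache data
flips = sum(maybe_inject_single_bit_error(cache, 1e-3, rng)
            for _ in range(10_000))
print(f"{flips} single-bit errors injected over 10,000 instructions")
```

A double-bit injector could follow the same pattern with two distinct bit positions and the lower double-bit rate.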

5.2 Evaluation

Our simulator returns the number of accesses and the number of misses for each cache configuration. We analyze these statistics with the given power and performance numbers, and estimate the access time and energy consumption of the memory subsystem, as shown in Fig. 8. QoS is measured in PSNR using the encoded video output and the original video input.

Access Latency to Memory Subsystem for Performance : For the performance evaluation of each composition, we estimate the access latency to the memory subsystem. The access latency of the memory subsystem, $L$, is estimated as $L = (A_{cache} \times L_{access}) + (M_{cache} \times L_{miss}) + (N_{policy} \times L_{policy})$, where $A_{cache}$ is the number of accesses to a cache, $L_{access}$ is the cache access time, $M_{cache}$ is the number of misses to a cache, $L_{miss}$ is the cache miss penalty, i.e., the access penalty to a bus and a memory, $N_{policy}$ is the number of triggered policies such as DFR and BER, and $L_{policy}$ is the latency penalty for a policy. The delay overhead for ECC is estimated and synthesized using CACTI [Shivakumar and Jouppi 2001] and the Synopsys Design Compiler [Synopsys Inc. 2001] as in [Lee et al. 2006], and the delay overhead for EDC is calculated using the ratio between the delays of ECC and EDC from [Lee et al. 2006; Li et al. 2004]. Also, the delay overheads for DFR and BER are estimated through simulations, so that the overheads for context switches and checkpoints are added at the analysis stage in our simulation study, as shown in Fig. 8.

¹ It is projected using an exponentially increasing ratio from 1,000 FIT/Mbit at 180nm technology to 100,000 FIT/Mbit at 130nm technology [Baumann 2005; Mastipuram and Wee 2004].

Energy Consumption of Memory Subsystem for Power : We estimate the energy consumption of the memory subsystem using the power models presented in [Shrivastava et al. 2005]. The power overheads for a Hamming code (38,32) and a parity code are synthesized and estimated in the same way as the delay overheads. The power consumption penalty for a recovery policy such as DFR or BER is estimated through simulations. The energy consumption of the memory subsystem, $E$, is estimated as $E = (A_{cache} \times P_{access}) + (M_{cache} \times P_{miss}) + (N_{policy} \times P_{policy})$, where $P_{access}$ is the power consumption per cache access, $P_{miss}$ is the power penalty per cache miss, and $P_{policy}$ is the power penalty for a recovery policy. Due to the lack of space, the power and delay penalties are detailed in our technical report [Lee et al. 2008].
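The latency and energy estimates share the same three-term shape, so a single helper can illustrate both. The helper name and the sample numbers below are hypothetical; the actual per-access, per-miss, and per-policy penalties come from the synthesis results and the technical report cited above.

```python
def memory_subsystem_cost(accesses: int, misses: int, policy_triggers: int,
                          per_access: float, per_miss: float,
                          per_policy: float) -> float:
    """Three-term cost model used for both latency (L) and energy (E).

    cost = A_cache * per_access + M_cache * per_miss + N_policy * per_policy
    """
    return (accesses * per_access
            + misses * per_miss
            + policy_triggers * per_policy)


# Illustrative numbers only: 1,000 accesses, 10 misses, one DFR/BER trigger.
latency = memory_subsystem_cost(1000, 10, 1,
                                per_access=1, per_miss=20, per_policy=500)
energy = memory_subsystem_cost(1000, 10, 1,
                               per_access=0.5, per_miss=30, per_policy=200)
print(latency, energy)
```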

Vulnerability for Reliability : To estimate reliability, we evaluate the vulnerability of caches [Mukherjee et al. 2003; Wang et al. 2006; Zhang 2005]. Data in a cache is vulnerable only if it is eventually used by the processor or written back to memory. If an error occurs in data that will not be used, the error does not matter. Thus, vulnerability measures the vulnerable time, from when data is brought into the cache to when it is used by the processor or written back to memory. Vulnerability is an effective metric to represent reliability since it closely estimates the failure rate [Lee et al. pear]. Also, vulnerability is very efficient to evaluate, since it requires only one execution of an application while the failure rate requires several hundred executions [Lee et al. 2008]. For the unprotected cache, the vulnerability is estimated as the sum of all vulnerable times of data. To estimate the vulnerability of the protected cache, the sum of all vulnerable times of data is multiplied by the ratio between the DBE and SBE rates, i.e., $10^{-2}$, since all SBEs are detected by an error detection scheme.
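The vulnerability estimate can be sketched as a sum over vulnerable intervals. The interval representation (a start cycle when data enters the cache and an end cycle when it is last read or written back) and the function name are assumptions made for illustration.

```python
def vulnerability(intervals, protected: bool = False,
                  dbe_to_sbe_ratio: float = 1e-2) -> float:
    """Sum the vulnerable times (in cycles) of cached data.

    `intervals` is an iterable of (start_cycle, end_cycle) pairs. For a
    protected cache only double-bit errors matter, so the sum is scaled by
    the ratio between double-bit and single-bit error rates.
    """
    total = sum(end - start for start, end in intervals)
    return total * dbe_to_sbe_ratio if protected else total


intervals = [(0, 100), (50, 250)]            # two hypothetical cache lines
print(vulnerability(intervals))              # unprotected cache
print(vulnerability(intervals, protected=True))
```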

Video Quality for Quality of Service : We estimate the QoS in PSNR (Peak Signal to Noise Ratio). PSNR is defined in dB as $PSNR = 10 \log_{10}\left(\frac{MAX^{2}}{MSE}\right)$, where $MAX$ is the maximum pixel value and $MSE$ is the Mean Squared Error, which is the mean of the squared differences between the pixel values of the erroneous video output (due to soft errors and frame drops) and those of the correctly reconstructed output (without errors).
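The PSNR definition translates directly into code. The sketch below works on flat sequences of pixel values and assumes 8-bit pixels (a maximum value of 255) by default; the function name is hypothetical.

```python
import math


def psnr(reference, degraded, max_value: float = 255.0) -> float:
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(degraded) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf                      # identical frames
    return 10 * math.log10(max_value ** 2 / mse)


# Two pixels that are each off by 10 give an MSE of 100.
print(round(psnr([100, 100], [110, 90]), 2))
```

For a whole sequence, one common choice is to average the per-frame PSNR values; whether the paper averages per frame or computes a single MSE over all frames is not stated here.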

Composite Metric for Multiple Property Evaluation : It is a challenging task to find design points that consider multiple properties such as performance, power, reliability, and QoS all together. Thus, we present a composite metric to evaluate interesting design points, $composite = \frac{runtime \times energy \times vulnerability}{quality}$, since low runtime, low energy, and low vulnerability are good and high quality is good. If the composite metric of a design point is low, the design point shows relatively low runtime, low energy consumption, and low vulnerability for the quality obtained.
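As a small illustration, the composite metric can be computed as below. The example plugs in the normalized FOREMAN values reported for BASE and CC-PROTECT in Table III, treating the normalized memory access time as the runtime term; that substitution is an assumption made only for this sketch.

```python
def composite(runtime: float, energy: float, vulnerability: float,
              quality: float) -> float:
    """Composite metric: lower is better for the quality obtained."""
    if quality <= 0:
        raise ValueError("quality must be positive")
    return runtime * energy * vulnerability / quality


base = composite(1.0, 1.0, 1.0, 1.0)
cc_protect = composite(0.42, 0.51, 5.8e-4, 0.99)
print(base, cc_protect, cc_protect < base)
```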

Table II. System Compositions - Our CC-PROTECT is a middleware-driven, cooperative approach aware of hardware failures

| System Composition | Application | Middleware | Operating System | Hardware |
|---|---|---|---|---|
| BASE | GOP (error-prone) | None | None | Unprotected Cache (error-prone) |
| HW-PROTECT | GOP (error-prone) | None | ◦ Map pages to a PPC | PPC with ECC (error-protected) |
| APP-PROTECT | PBPAIR (error-resilient) | ◦ Monitor network errors & Inform PBPAIR of PLR | None | Unprotected Cache (error-prone) |
| MULTI-PROTECT | PBPAIR (error-resilient) | ◦ Monitor network errors & Inform PBPAIR of PLR | ◦ Map pages to a PPC | PPC with ECC (error-protected) |
| CC-PROTECT | PBPAIR (error-resilient) | ◦ Monitor network errors & Inform PBPAIR of PLR; • Translate SER; • Trigger Selective DFR (Drive cache update & Inform PBPAIR) | ◦ Map pages to a PPC; • Monitor soft errors | PPC with “EDC” (error-protected) |

5.3 System Compositions

To demonstrate the effectiveness of our cooperative, cross-layer scheme to combat soft errors, we develop five system compositions, shown in Table II:

(1) BASE: This is the default composition, which does not provide any error detection and/or correction. In this composition, we use GOP (Group of Picture) video encoding [Cheng and Zarki 2004]. For GOP, the first frame is encoded as an I-frame and the other frames are encoded as P-frames, and the quantization scale is set to 10. The middleware and operating system are unaware of soft errors, and the hardware has just a unified unprotected cache. The BASE composition does not incur protection overheads in terms of power and performance, but suffers from high failure rates and low multimedia quality because internal data is not protected from hardware defects.

(2) HW-PROTECT: In this composition, all error detection and correction are provided in hardware. This is implemented through the use of an Error Correction Code (ECC) in a Partially Protected Cache architecture [Lee et al. 2006]. Compared to protecting the whole cache, PPCs provide efficient reliability by protecting just the non-multimedia data against soft errors. This composition presents a low failure rate and high QoS, as it protects at the hardware level. However, it incurs overheads in terms of power and performance.

(3) APP-PROTECT: In this composition, all error detection and correction are provided in the application. For this, we use the error-resilient video encoding PBPAIR [Kim et al. 2006]. We set the PLR parameter in PBPAIR to 0% to isolate the effects of soft errors from those of network packet losses. Intra Threshold is selected through the original method of PBPAIR so that the compressed video has a size similar to that of GOP, ensuring a fair comparison with respect to the transmission cost.

(4) MULTI-PROTECT: In this composition, error correction is provided at all levels. We use error-resilient video encoding (PBPAIR) and a protected cache (a PPC with an ECC scheme). It implements both error resilience at the application layer and an ECC scheme at the hardware layer.

(5) CC-PROTECT: This is our proposed composition, in which we use error-resilient video encoding (PBPAIR) and a PPC with an EDC scheme, and support middleware-driven mechanisms aware of soft errors, such as translating SER for PBPAIR and triggering a hybrid scheme of DFR and BER.

Within our proposed composition, we study various selective schemes such as:

—Naive DFR - always triggers a DFR mechanism.

—Naive BER - always triggers a BER mechanism.

—No DFR/BER - never triggers a DFR or BER.

—Random DFR/BER - randomly triggers a DFR or BER.

—SA-DFR/BER - determines a DFR or BER based on slack.

—FA-DFR/BER - determines a DFR or BER based on frame importance.

—QA-DFR/BER - determines a DFR or BER based on QoS.

6. EXPERIMENTAL RESULTS

We present three sets of experiments. First, we demonstrate the effectiveness of our cooperative, cross-layer methods in achieving low-cost reliability at a slight cost of QoS for different video streams (Section 6.1). Second, we show the effectiveness of intelligent DFR/BER selection schemes in improving the video quality and extending the design space (Section 6.2). Lastly, we show the effectiveness of our design space exploration heuristic in finding the interesting tradeoff space (Section 6.3).

6.1 Effectiveness of CC-PROTECT

Fig. 9 shows that our cross-layer, error-aware approach increases the reliability at minimal costs in performance and energy consumption and with minimal degradation of video quality.

Fig. 9(a) demonstrates that our CC-PROTECT (PBPAIR, a DFR mechanism, and a PPC with EDC protection) reduces the vulnerability by a factor of more than $1.7 \times 10^{3}$ compared to BASE (GOP and an unprotected data cache). This reliability improvement is mainly due to the error detection and the DFR mechanism working in a cross-layered manner. While HW-PROTECT and MULTI-PROTECT have lower vulnerabilities than BASE, CC-PROTECT has a lower vulnerability than both. This is because it is exposed to soft errors for less time, owing to frame drops and to the better performance efficiency of PBPAIR compared to GOP. Note that vulnerability is the number of cycles during which the cache is vulnerable to soft errors, which has a linear relationship to the runtime. It is worth noting that APP-PROTECT (composed of PBPAIR and an unprotected data cache) shows a worse vulnerability than BASE, since the cache is exposed for longer in APP-PROTECT than in BASE, as shown in Fig. 9(b). Thus, our CC-PROTECT achieves the best reliability among all compositions.

Fig. 9. CC-PROTECT achieves the low-cost reliability at the minimal QoS degradation. Panels, each plotted over the system compositions BASE, HW-PROTECT, APP-PROTECT, MULTI-PROTECT, and CC-PROTECT: (a) Reliability: Vulnerability (vulnerable time, log scale); (b) Performance: Access Time to Memory Subsystem (cycles); (c) Power: Energy Usage of Memory Subsystem (nJoules); (d) QoS: Video Quality (PSNR in dB).

Fig. 9(b) shows that our CC-PROTECT is the best in terms of performance. It reduces the memory subsystem access time by 58% compared to BASE. This is very effective, since our CC-PROTECT reduces the vulnerability by a factor of about $1.7 \times 10^{3}$ and at the same time reduces the access latency of the memory subsystem compared to BASE. Note that none of the other compositions improves the performance, whereas CC-PROTECT does. This performance improvement comes from skipping intensive compression algorithms through the DFR mechanism and from the performance efficiency of the PBPAIR algorithms. However, the performance efficiency of PBPAIR is not well exploited in APP-PROTECT, which shows a 5% overhead compared to BASE, because PBPAIR favors compression efficiency over performance efficiency at a low PLR such as 0%. For the same reason, MULTI-PROTECT incurs about 4% overhead compared to BASE. In contrast, HW-PROTECT does not incur a performance overhead because a PPC achieves high performance by protecting only non-multimedia data [Lee et al. 2006].

From the perspective of the energy consumption of the memory subsystem, our CC-PROTECT saves 49%, 56%, 52%, and 57% of the energy as compared to BASE, HW-PROTECT, APP-PROTECT, and MULTI-PROTECT, respectively, as shown in Fig. 9(c). The cross-layer approach reduces the energy consumption of the memory subsystem due to (i) an EDC technique that is less expensive than ECC, (ii) skipping expensive compression algorithms through the cooperative DFR mechanism, and (iii) the energy efficiency of PBPAIR, which introduces more intra-coded macro-blocks in place of expensive inter-coded macro-blocks. Note that all other compositions incur overheads of performance and power compared to BASE except for CC-PROTECT. Thus, our CC-PROTECT can even reduce the power and access time of the memory subsystem while obtaining high reliability.

Table III. CC-PROTECT is very effective in terms of performance, power, and reliability at the minimal QoS degradation for different video streams (normalized result of each composition to that of BASE)

| Video Stream | System Composition | Access Time | Energy Consumption | Vulnerability | Video Quality |
|---|---|---|---|---|---|
| AKIYO (low activity) | BASE | 1 | 1 | 1 | 1 |
| | HW-PROTECT | 0.99 | 1.14 | 1.5E-3 | 1.02 |
| | APP-PROTECT | 0.87 | 0.89 | 9.6E-1 | 1.01 |
| | MULTI-PROTECT | 0.89 | 1.03 | 1.3E-3 | 1.02 |
| | CC-PROTECT | 0.27 | 0.34 | 6.3E-4 | 1.02 |
| FOREMAN (medium activity) | BASE | 1 | 1 | 1 | 1 |
| | HW-PROTECT | 1 | 1.15 | 1.6E-3 | 1.04 |
| | APP-PROTECT | 1.05 | 1.06 | 1.1 | 1.01 |
| | MULTI-PROTECT | 1.04 | 1.19 | 1.5E-3 | 1.03 |
| | CC-PROTECT | 0.42 | 0.51 | 5.8E-4 | 0.99 |
| COASTGUARD (high activity) | BASE | 1 | 1 | 1 | 1 |
| | HW-PROTECT | 0.99 | 1.14 | 1.6E-3 | 1.03 |
| | APP-PROTECT | 1.09 | 1.1 | 1.1 | 0.99 |
| | MULTI-PROTECT | 1.06 | 1.23 | 1.4E-3 | 1.02 |
| | CC-PROTECT | 0.49 | 0.58 | 3.2E-4 | 0.93 |

Our CC-PROTECT achieves video quality close to that of the other compositions, as shown in Fig. 9(d). While an EDC scheme protects the non-multimedia data in our CC-PROTECT, a frame drop due to the DFR mechanism degrades the video quality. Note that the PBPAIR algorithms can improve this video quality by increasing the resilience level at the cost of the compressed video size (which raises the transmission costs in power and delay). Nevertheless, CC-PROTECT saves at least 49% of power and performance with the minimal vulnerability, at a QoS cost of at most 1.41 dB (less than 5% quality degradation) compared to all the other compositions. Note that these results come from only one soft error in the protected cache of the PPC (and tens of errors in the unprotected cache), and the video quality may degrade significantly due to multiple frame drops resulting from multiple occurrences of soft errors. We present the experimental results for those cases in Section 6.2.

Table III summarizes the results of each composition, normalized to those of BASE, in terms of performance, power, reliability, and QoS for different video streams. This table shows that CC-PROTECT has the least costs of power and performance and the minimal vulnerability, with minimal QoS degradation, for all video streams. An interesting observation from this table is that we can even improve the video quality while still saving performance and power costs (73% and 66%, respectively) compared to BASE for the video stream AKIYO. This quality improvement (about 2%) arises because (i) a frame drop may not affect the video quality of a low-activity video stream such as AKIYO and (ii) the shorter execution time of PBPAIR results in less exposure of the data cache to soft errors. Indeed, the QoS impact of one frame drop for AKIYO is about 0.08% on average. On the other hand, for a high-activity video stream such as COASTGUARD, our CC-PROTECT degrades the video quality by about 6% in PSNR. Still, CC-PROTECT demonstrates the least access time and energy consumption with the minimal vulnerability. Note that all these results are evaluated under the condition of no errors in the network. We also observed similar results under various network conditions. For example, at 10% PLR, our CC-PROTECT reduces the access time of the memory subsystem by 58%, while APP-PROTECT reduces it by 32% (a higher error rate triggers more intra-coded macro-blocks, which raises the performance of the PBPAIR algorithms), as compared to BASE. Also, we have run simulations for different compositions, and similar results demonstrated the effectiveness of our CC-PROTECT. For instance, a composition of GOP and a 32 KB protected cache with ECC (forward error recovery) incurs 45% performance and 34% energy overheads as compared to BASE. This is because all data (multimedia data and control data) are protected from soft errors with an expensive ECC scheme. Due to the lack of space, more results are available in our technical report [Lee et al. 2008].

In summary, our cooperative, cross-layer methods exploit a DFR mechanism with an inexpensive EDC protection to decrease the vulnerability by a factor of about $2.1 \times 10^{3}$, and an error-resilient video encoding technique to limit the quality degradation to 2%, while significantly reducing the access time by 61% and the energy consumption by 52% on average over multiple video streams, as compared to BASE. Also, our cooperative, cross-layer approach achieves better reliability than a previously proposed PPC architecture with ECC protection at the cost of 3% QoS degradation, while reducing the access time by 60% and the energy consumption by 58%.

6.2 Effectiveness of Intelligent Selective Schemes

Our CC-PROTECT outperforms all possible compositions in terms of performance, power, and reliability while it slightly degrades the video quality, mainly due to frame drops when soft errors occur. Fig. 10 demonstrates that all intelligent selective schemes improve the video quality without significantly increasing performance and energy costs (still mostly lower than the costs of BASE).

In Fig. 10, the X-axis represents the selective mechanisms compared to BASE. Note that they all run PBPAIR on a PPC architecture with an EDC scheme, except for BASE (running GOP on an unprotected cache) and No DFR/BER (running PBPAIR on a PPC without any protection), which are included for comparison. We parameterize the policy selection based on the information available in a mobile device. The Naive DFR scheme shows the worst video quality, as shown in Fig. 10(b), at the least cost in terms of power and performance, as shown in Fig. 10(c) and Fig. 10(d). Note that Naive DFR in Fig. 10 results from multiple soft errors (1.7 errors on average) on the protected data cache in a PPC, which degrades the video quality more than that of CC-PROTECT in Fig. 9(d). On the other hand, the Naive BER scheme presents better video quality than Naive DFR while incurring the most expensive power and performance costs among the schemes. In terms of reliability, Naive BER shows worse vulnerability than Naive DFR, as shown in Fig. 10(a). This is mainly because Naive BER increases the execution time, leaving the PPC exposed to soft errors for longer. Clearly, No DFR/BER does not have a mechanism to protect a system from soft errors,

[Figure 10 appears here: four bar charts over the policies BASE, Naive DFR, Naive BER, No DFR/BER, Random DFR/BER, SA-DFR/BER, FA-DFR/BER, and QA-DFR/BER. (a) Reliability: vulnerability (vulnerable time, log scale). (b) QoS: video quality (PSNR in dB). (c) Performance: access time to the memory subsystem, normalized to BASE. (d) Power: energy usage of the memory subsystem, normalized to BASE.]

Fig. 10. Intelligent selective schemes maintain the video quality and reliability with minimal overheads of power and performance

causing very high vulnerability as shown in Fig. 10(a). Note that Fig. 10(b) shows higher video quality for No DFR/BER than for the others. This is because we measured the video quality in PSNR only for simulations that completed successfully, in which No DFR/BER does not skip any frame. Random DFR/BER provides good video quality at low power and performance costs. For SA-DFR/BER, the results have been profiled with the knob S, the portion of ACET, from 0% to 100% in 10% increments, and SA-DFR/BER with S = 60% is compared in Fig. 10 since it is the smallest value of the knob that recovers video quality better than that of BASE according to the profiled results. However, it is an expensive approach since it incurs high overheads in terms of power and performance, although it presents better video quality than Naive DFR. For the FA-DFR/BER scheme, our preliminary experiments show that the difference in PSNR between consecutive frames makes the 2nd, 3rd, and 4th frames the most important, in descending order of importance on average. Thus, FA-DFR/BER with the 2nd frame is studied as the setting that improves the video quality most, meaning that we select BER rather than DFR whenever a soft error occurs while encoding the 2nd frame. In these particular experiments, the FA-DFR/BER scheme is more effective than the SA-DFR/BER scheme since it has lower costs with better QoS (their vulnerabilities are close). For QA-DFR/BER, 31.79 dB is used as the QoS threshold since it is the average video quality of BASE. QA-DFR/BER provides lower video quality while incurring lower costs than the FA-DFR/BER scheme. Thus, each selective scheme has pros and cons in terms of performance, power, reliability, and QoS.
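The selective schemes described above can be read as a small decision function that chooses between DFR and BER each time a soft error is detected. The following Python sketch illustrates that reading only; it is not the authors' middleware. The names `Recovery`, `ErrorContext`, and `select_recovery`, the fields of the context, and the direction of the SA and QA comparisons are assumptions made for illustration. Only the FA rule (choose BER when the error hits the designated frame) and the 31.79 dB threshold are taken from the text.

```python
from dataclasses import dataclass
from enum import Enum
import random


class Recovery(Enum):
    DFR = "DFR"  # drop the affected frame and keep encoding
    BER = "BER"  # recover the affected frame at extra time and energy cost


@dataclass
class ErrorContext:
    """Hypothetical state sampled when a soft error is detected."""
    frame_position: int      # 1-based position of the frame being encoded
    elapsed_fraction: float  # time already spent on the frame / ACET (assumed)
    running_psnr_db: float   # average PSNR of frames encoded so far (assumed)


def select_recovery(policy, ctx, slack_knob=0.60, frame_knob=2,
                    quality_knob_db=31.79, rng=random):
    """Return the recovery mechanism for one detected soft error."""
    if policy == "naive-dfr":
        return Recovery.DFR
    if policy == "naive-ber":
        return Recovery.BER
    if policy == "random":
        return rng.choice([Recovery.DFR, Recovery.BER])
    if policy == "sa":
        # Assumption: recover only if the error hit early enough in the frame.
        use_ber = ctx.elapsed_fraction <= slack_knob
    elif policy == "fa":
        # From the text: choose BER when the error hits the designated frame.
        use_ber = ctx.frame_position == frame_knob
    elif policy == "qa":
        # Assumption: recover only while quality is below the threshold.
        use_ber = ctx.running_psnr_db < quality_knob_db
    else:
        raise ValueError(f"unknown policy: {policy}")
    return Recovery.BER if use_ber else Recovery.DFR
```

Under these assumptions, an error in the 2nd frame selects BER for the FA policy, while the same error with a running PSNR of 32.0 dB selects DFR for the QA policy because the quality is already above the threshold.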

Fig. 11 shows that CC-PROTECT with intelligent selective schemes significantly extends the tradeoff space in terms of QoS and performance as compared to existing compositions. Fig. 11(a) plots the limited tradeoff space of existing techniques and configurations such as BASE, HW-PROTECT, APP-PROTECT, MULTI-PROTECT, and Naive BER. Exploiting a Naive DFR mechanism and error-resilient video encoding in a cooperative, cross-layer approach yields a configuration (named Naive DFR in Fig. 11(b)) that extends the tradeoff space toward low-cost reliability at the cost of QoS degradation. The tradeoff space between Naive DFR and Naive BER can be explored by several single-knob selective algorithms, as shown in Fig. 11(c). For the single selective algorithms, we consider 0%, 10%, 60%, and 100% for SA-DFR/BER; the 2nd, 3rd, and 4th frames for FA-DFR/BER; and 31 dB, 32 dB, and 33 dB for QA-DFR/BER. Fig. 11(c) shows that QA-DFR/BER is one of the most effective algorithms since it generates configurations with higher QoS and lower costs compared to FA-DFR/BER and SA-DFR/BER. This space can be explored at a finer granularity of QoS and performance cost by applying combined selective algorithms such as SFA-DFR/BER, FQA-DFR/BER, QSA-DFR/BER, and SFQA-DFR/BER, as shown in Fig. 11(d). Thus, we can find interesting design points that satisfy the given quality at minimal performance cost by exploring this expanded tradeoff space with our heuristic algorithm, DSExplore. Fig. 11(d) shows 83% performance variation between Naive DFR and Naive BER at the cost of up to 8% quality degradation. With respect to vulnerability and energy consumption, our experiments show approximately 60% variation and 81% variation, respectively. These experimental results also demonstrate that CC-PROTECT with single/combined selective algorithms can reduce the memory access time by up to 47%, the memory energy consumption by up to 37%, and the vulnerability by up to 23% without losing QoS as compared to BASE.

In summary, selective DFR/BER mechanisms allow a system to maintain the video quality and reliability with minimal costs of power and performance. Also, hybrid selective schemes that combine DFR and BER expand the tradeoff space significantly, and guide designers in considering this largely expanded space.

6.3 Effectiveness of DSExplore

The composite metric captures the effectiveness of a configuration: the lower the composite metric, the better the configuration (low runtime, low energy consumption, low vulnerability, and high QoS). Fig. 12 shows that Naive DFR (DFR in the figure) is good when multiple constraints are considered, whereas Naive BER (BER in the figure), BASE, and SA-DFR/BER (S10, S60, and S100 in the figure) are bad in terms of the composite metric. Indeed, BASE is bad in terms of performance, power, vulnerability, and QoS as shown in Fig. 9, and SA-DFR/BER compares poorly with FA-DFR/BER and QA-DFR/BER since it obtains high QoS at higher costs than they do, as shown in Fig. 10 and Fig. 11(c). QA-DFR/BER and FA-DFR/BER are good selective algorithms since their configurations are distributed evenly in Fig. 12 and they are relatively more cost-efficient than SA-DFR/BER. Points that are not labeled on the X-axis represent combined configurations such as

[Figure 11 appears here: four scatter plots titled "Quality vs. Memory Access Time", with PSNR (dB) on the X-axis and memory subsystem access time (cycles) on the Y-axis. (a) A design space with existing techniques. (b) Naive DFR extends the design space with respect to QoS and performance. (c) Single selective algorithms (△) balance QoS and performance. (d) Combined selective algorithms extend the interesting design space at a finer granularity (⋄: double-knob combination, ◦: triple-knob combination).]

Fig. 11. CC-PROTECT with single/combined selective algorithms extends the design space significantly with respect to performance and QoS

[Figure 12 appears here: composite metric (Runtime*Energy*Vulnerability/QoS, log scale) per configuration. Labeled configurations on the X-axis: BASE, BER, S100, S60, F2, S10, Q33, Q32, Q31, F4, S0, DFR, and F3.]

Fig. 12. Composite metric evaluation differentiates good and bad configurations

SFA-DFR/BER, FQA-DFR/BER, QSA-DFR/BER, and SFQA-DFR/BER.

Pareto-optimal evaluations also demonstrate the effectiveness of our composite metric. We consider 82 configurations: the BASE, Naive DFR, and Naive BER configurations and 79 newly extended configurations from single/combined selective algorithms with 4 different slack values (0%, 10%, 60%, and 100%), 3 different frames (2nd, 3rd, and 4th), and 3 different QoS values (31 dB, 32 dB, and 33 dB), as shown in Fig. 13. A configuration is Pareto-optimal if no other configuration is at least as good in all four properties (runtime, energy consumption, vulnerability, and video quality) and strictly better in at least one. Out of 82 configurations, 12 points are Pareto-optimal: S0F2Q33, S10F4Q33, S60F4Q33, S10F3Q33, F4Q31, S100F4Q31, S60F3Q33, Q31, S100F3, Q32, S100F2, and Naive BER, where SlFmQn is SFQA-DFR/BER with a slack knob l (%) for SA-DFR/BER, a frame knob m (the mth frame) for FA-DFR/BER, and a video quality knob n (dB) for QA-DFR/BER. The Pareto-optimal points discovered using only QoS and the composite metric are the same as the Pareto-optimal points discovered above using all four properties, which demonstrates the effectiveness of the composite metric.
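The count of 79 extended configurations can be reproduced under one plausible reading of the knob values listed above: each knob is either unused or set to one of its listed values, and at least one knob is in use, which gives 5 × 4 × 4 − 1 = 79. The sketch below enumerates the labels under that assumption. The function names are hypothetical, and treating S0 as a knob setting distinct from "slack knob unused" is an assumption rather than something the text states.

```python
from itertools import product

# None means the knob is not used in that configuration (assumed encoding).
SLACK_VALUES = (None, 0, 10, 60, 100)  # S knob, percent of ACET
FRAME_VALUES = (None, 2, 3, 4)         # F knob, frame position
QOS_VALUES = (None, 31, 32, 33)        # Q knob, PSNR threshold in dB


def config_label(slack, frame, qos):
    """Build a label in the SlFmQn style, omitting unused knobs."""
    parts = []
    if slack is not None:
        parts.append(f"S{slack}")
    if frame is not None:
        parts.append(f"F{frame}")
    if qos is not None:
        parts.append(f"Q{qos}")
    return "".join(parts)


def extended_configurations():
    """List every single-, double-, and triple-knob configuration label."""
    return [
        config_label(s, f, q)
        for s, f, q in product(SLACK_VALUES, FRAME_VALUES, QOS_VALUES)
        if (s, f, q) != (None, None, None)  # skip the case with no knob at all
    ]


labels = extended_configurations()
print(len(labels))      # 79 extended configurations
print(len(labels) + 3)  # 82 after adding BASE, Naive DFR, and Naive BER
```

The enumeration contains 10 single-knob, 33 double-knob, and 36 triple-knob labels, including names that appear in the text such as Q32, S100F3, and S10F3Q33.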

[Figure 13 appears here: scatter plot titled "Quality vs. Memory Access Time", with PSNR (dB) on the X-axis and memory subsystem access time (cycles) on the Y-axis.]

Fig. 13. Pareto-optimal points with respect to QoS, performance, energy consumption, and vulnerability (on the QoS/performance plane)
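The Pareto-optimality test and the composite metric used in this section can be sketched in a few lines. The metric follows the axis label of Fig. 12 (Runtime*Energy*Vulnerability/QoS), and the dominance check uses the standard definition of Pareto optimality. The `Config` type, the function names, and the sample values are hypothetical and serve only as an illustration; the numbers are not measurements from the paper.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str
    runtime: float        # lower is better
    energy: float         # lower is better
    vulnerability: float  # lower is better
    qos: float            # PSNR in dB, higher is better


def composite_metric(c):
    """Runtime * Energy * Vulnerability / QoS; lower means a better configuration."""
    return c.runtime * c.energy * c.vulnerability / c.qos


def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    no_worse = (a.runtime <= b.runtime and a.energy <= b.energy
                and a.vulnerability <= b.vulnerability and a.qos >= b.qos)
    strictly_better = (a.runtime < b.runtime or a.energy < b.energy
                       or a.vulnerability < b.vulnerability or a.qos > b.qos)
    return no_worse and strictly_better


def pareto_front(configs):
    """Keep the configurations that no other configuration dominates."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Made-up values for illustration only.
sample = [
    Config("cheap", runtime=1.0, energy=1.0, vulnerability=1.0, qos=30.0),
    Config("sharp", runtime=2.0, energy=2.0, vulnerability=2.0, qos=33.0),
    Config("weak", runtime=2.0, energy=2.0, vulnerability=2.0, qos=31.0),
]
print([c.name for c in pareto_front(sample)])
print(min(sample, key=composite_metric).name)
```

In this toy set, "weak" is dominated by "sharp" (same costs, lower QoS), so only "cheap" and "sharp" remain on the front.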

Our heuristic algorithm explores the largely extended design space (79 configurations) to find the minimal composite metric with QoS satisfying Qconstraint = 32 dB, as shown in Fig. 14(a). Fig. 14(a) plots the design points explored by DSExplore (marked with ×) and by the exhaustive algorithm (marked with ◦) to find the least-cost configuration under the 32 dB Qconstraint. DSExplore is about 3.95× faster than an exhaustive exploration approach (20 configurations are explored by DSExplore, whereas all 79 configurations are explored by an exhaustive search, as shown in Fig. 14(a)). The best configuration (configbest = Q32) found by DSExplore is also the best one found by the exhaustive search algorithm. It is also the best configuration in terms of performance, power, and vulnerability while satisfying the QoS constraint. When we ran DSExplore with Qconstraint = 31 dB, we found a configuration (S10F3Q33) with the minimal composite metric value. This configuration is the same as the best configuration found by an exhaustive search in terms of the composite metric. However, DSExplore explores 54 configurations out of 79 in this case, because it explores more configurations in the crowded area, as shown in Fig. 14(b). This best configuration in terms of the composite metric differs by less than 5% from the best configuration in terms of performance, from that in terms of energy consumption, and from that in terms of vulnerability. This exploration takes 28 DOWN and 19 UP updates to the set of configurations in our DSExplore combined algorithm. If we limit the number of rounds, we can reduce the number of configurations explored by DSExplore, but the resulting configuration may not be the best configuration discovered by an exhaustive exploration algorithm. When we limit the number of rounds to 10 (= numround), the best configuration discovered by DSExplore differs by about 20% from the best one in terms of vulnerability.
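For reference, the exhaustive baseline that DSExplore is compared against amounts to scanning every configuration, keeping those that meet the quality constraint, and returning the one with the lowest composite metric. The sketch below shows that baseline together with the speedup arithmetic quoted above. It does not reproduce DSExplore itself; the function names, the dictionary keys, the reading of the constraint as PSNR ≥ Qconstraint, and the toy values are assumptions for illustration.

```python
def exhaustive_best(configs, q_constraint_db):
    """Scan all configurations; return the feasible one with the lowest metric."""
    feasible = [c for c in configs if c["psnr_db"] >= q_constraint_db]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c["composite"])


def exploration_speedup(total_configs, explored_configs):
    """Ratio of configurations evaluated by exhaustive search vs. a heuristic."""
    return total_configs / explored_configs


# Made-up values for illustration only; they are not results from the paper.
toy_space = [
    {"name": "X", "psnr_db": 30.5, "composite": 1.0e22},
    {"name": "Y", "psnr_db": 32.1, "composite": 4.0e22},
    {"name": "Z", "psnr_db": 32.6, "composite": 6.0e22},
]

best = exhaustive_best(toy_space, q_constraint_db=32.0)
print(best["name"])                           # Y: cheapest among feasible points
print(exploration_speedup(79, 20))            # 3.95 for the 32 dB case
print(round(exploration_speedup(79, 54), 2))  # 1.46 for the 31 dB case
```

The second ratio shows why the 3.95× figure is reported as an upper bound: under the 31 dB constraint DSExplore visits 54 of the 79 configurations, so the saving over exhaustive search is smaller.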

Our heuristic algorithm also works effectively for other video clips such as AKIYO (low activity) and COASTGUARD (high activity). With AKIYO under a 33 dB quality requirement, the best configuration (S60F4Q33) found by DSExplore differs by less than 1% from the best configuration in terms of performance, from that in terms of energy consumption, and from that in terms of vulnerability. With COASTGUARD under a 30 dB quality constraint, the best configuration (S60Q31) found by DSExplore is also the best in terms of performance and power, but its vulnerability is about 37% higher than that of the best configuration in terms of vulnerability. These exploration results demonstrate the effectiveness of DSExplore since it explores fewer configurations than the exhaustive exploration approach but still finds very interesting design points in terms of multiple properties such as performance, power, and vulnerability.

[Figure 14 appears here: two scatter plots titled "DSExplore vs. Exhaustive Exploration", with PSNR (dB) on the X-axis and the composite metric (Runtime*Energy*Vulnerability/QoS) on the Y-axis. Series: DSExplore, Exhaustive Exploration, and DSExplore-1; the best configuration is marked BEST. (a) Qconstraint = 32 dB. (b) Qconstraint = 31 dB.]

Fig. 14. DSExplore explores the tradeoff space effectively to find interesting configurations compared to an exhaustive exploration approach

In summary, DSExplore finds interesting design points in the largely expanded tradeoff space created by CC-PROTECT and hybrid selections between DFR and BER. DSExplore is up to 3.95× faster than an exhaustive exploration algorithm, and finds the best configuration in terms of the composite metric under the quality constraint.

7. CONCLUSION AND FUTURE WORK

Reliability is a paramount concern in mobile embedded systems, where resources such as power and performance are limited. To resolve the complexity of tradeoffs among multi-dimensional properties, a cross-layer approach from the hardware layer to the application layer should be taken into account, since traditional techniques are unable to address the impact of an approach on other properties at other layers, and are unable to drive the whole system's reliability in a power- and performance-efficient way.

In this context, this paper proposes a cooperative, cross-layer method for high reliability at low cost in mobile devices. In particular, we present a mechanism that combines the existing error-resilient video encoding (PBPAIR) with a DFR mechanism, which can also protect the internal data from soft errors in a PPC with minimal overheads of power and performance at the cost of QoS degradation. Note that this cross-layer, error-aware method can be combined with any error detection technique that guards components against soft errors or transient faults. Intelligent schemes are also discussed that adaptively choose between DFR and BER to minimize the quality loss in the worst-case scenario. Furthermore, the cross-layer, error-aware approach introduces a new design margin to further save resources with high reliability at the cost of QoS. To explore the largely expanded tradeoff space, we present an exploration heuristic, DSExplore. Our heuristic efficiently finds interesting design points considering multiple properties under the quality constraint.

Our future work includes extending the cross-layer approach to consider end-to-end devices in distributed real-time systems. Considering different error control schemes with different error models across system abstraction layers is another direction of our future work.

REFERENCES

Bajic, I. V. 2007. Efficient cross-layer error control for wireless video multicast. 53, 1 (Mar), 276–285.

Baumann, R. 2005. Soft errors in advanced computer systems. IEEE Design and Test of Computers, 258–266.

Burger, D. and Austin, T. M. 1997. The SimpleScalar Tool Set, version 2.0. SIGARCH Computer Architecture News 25, 3, 13–25.

Cai, Y., Schmitz, M. T., Ejlali, A., Al-Hashimi, B. M., and Reddy, S. M. 2006. Cache size selection for performance, energy and reliability of time-constrained systems. In ASP-DAC.

Cheng, L. and Zarki, M. E. 2004. PGOP: An error resilient techniques for low bit rate and low latency video communications. In Picture Coding Symposium (PCS).

Hazucha, P. and Svensson, C. 2000. Impact of CMOS technology scaling on the atmospheric neutron soft error rate. IEEE Trans. on Nuclear Science 47, 6, 2586–2594.

Hewlett Packard. HP iPAQ h5555 Series - System Specifications. Hewlett Packard, http://www.hp.com.

Kim, M., Dutt, N., Venkatasubramanian, N., and Talcott, C. 2008. xTune: Online verifiable cross-layer adaptation for distributed real-time embedded systems. ACM SIGBED Review: Special Issue on the RTSS Forum on Deeply Embedded Real-Time Computing 5, 1 (Jan).

Kim, M., Oh, H., Dutt, N., Nicolau, A., and Venkatasubramanian, N. 2006. PBPAIR: An energy-efficient error-resilient encoding using probability based power aware intra refresh. ACM SIGMOBILE Mobile Computing and Communications Review 10, 3 (July), 58–69.

Lee, K., Kim, M., Dutt, N., and Venkatasubramanian, N. 2008. Error exploiting video encoder to extend energy/QoS tradeoffs for mobile embedded systems. In DIPES.

Lee, K., Shrivastava, A., Dutt, N., and Venkatasubramanian, N. 2008, to appear. Data partitioning techniques for partially protected caches to reduce soft error induced failures. In 6th IFIP Working Conference on Distributed and Parallel Embedded Systems (DIPES).

Lee, K., Shrivastava, A., Issenin, I., Dutt, N., and Venkatasubramanian, N. 2006. Mitigating soft error failures for multimedia applications by selective data protection. In CASES.

Lee, K., Shrivastava, A., Kim, M., Dutt, N., and Venkatasubramanian, N. 2008. Cross-layer interactions of error control schemes in mobile multimedia systems. Tech. Rep. TR 08-09, University of California at Irvine. Jul.

Li, J.-F. and Huang, Y.-J. 2005. An error detection and correction scheme for RAMs with partial-write function. In IEEE International Workshop on Memory Technology, Design and Testing (MTDT). 115–120.

Li, L., Degalahal, V., Vijaykrishnan, N., Kandemir, M., and Irwin, M. J. 2004. Soft error and energy consumption interactions: A data cache perspective. In ISLPED.

Mastipuram, R. and Wee, E. C. 2004. Soft Errors' Impact on System Reliability. http://www.edn.com/article/CA454636.

Mitra, S., Seifert, N., Zhang, M., Shi, Q., and Kim, K. S. 2005. Robust system design with built-in soft-error resilience. IEEE Computer 38, 2 (Feb), 43–52.

Mohapatra, S., Cornea, R., Dutt, N., Nicolau, A., and Venkatasubramanian, N. 2003. Integrated power management for video streaming to mobile handheld devices. In ACM international conference on Multimedia.

Mohapatra, S., Cornea, R., Oh, H., Lee, K., Kim, M., Dutt, N., Gupta, R., Nicolau, A., Shukla, S., and Venkatasubramanian, N. 2005. A cross-layer approach for power-performance optimization in distributed mobile systems. In Next Generation Software Program in conjunction with IPDPS. 218.1.

Mukherjee, S. S., Weaver, C., Emer, J., Reinhardt, S. K., and Austin, T. 2003. A systematic methodology to compute the architectural vulnerability factors for a high-performance microprocessor. In MICRO.

Nieuwland, A. K., Jasarevic, S., and Jerin, G. 2006. Combinational logic soft error analysis and protection. In IOLTS06.

Phelan, R. 2003. Addressing soft errors in arm core-based designs. Tech. rep., ARM.

Pradhan, D. K. 1996. Fault-Tolerant Computer System Design. Prentice Hall. ISBN 0-1305-7887-8.

Ruckerbauer, F. X. and Georgakos, G. 2007. Soft error rates in 65nm SRAMs - analysis of new phenomena. In IEEE International On-Line Testing Symposium.

Shivakumar, P. and Jouppi, N. 2001. CACTI 3.0: An Integrated Cache Timing, Power, and Area Model. In WRL Technical Report 2001/2.

Shivakumar, P., Kistler, M., Keckler, S., Burger, D., and Alvisi, L. 2002. Modeling the effect of technology trends on soft error rate of combinational logic. In DSN02.

Shrivastava, A., Issenin, I., and Dutt, N. 2005. Compilation techniques for energy reduction in horizontally partitioned cache architectures. In CASES. 90–96.

Synopsys Inc. 2001. Design Compiler Reference Manual. Synopsys Inc., Mountain View, CA, USA.

van der Schaar, M. and Turaga, D. S. 2007. Cross-layer packetization and retransmission strategies for delay-sensitive wireless multimedia transmission. IEEE Transactions on Multimedia 9, 1 (Jan.), 185–197.

Vuran, M. C. and Akyildiz, I. F. 2006. Cross-layer analysis of error control in wireless sensor networks. In IEEE Communications Society on Sensor and Ad Hoc Communications and Networks (SECON). 585–594.

Wang, S., Hu, J., and Ziavras, S. G. 2006. On the characterization of data cache vulnerability in high-performance embedded microprocessors. In SAMOS.

Wang, Y., Wenger, S., Wen, J., and Katsaggelos, A. K. 2000. Review of error resilient coding techniques for real-time video communications. IEEE Signal Processing Magazine 17, 4 (July), 61–82.

Wang, Y. and Zhu, Q.-F. 1998. Error control and concealment for video communication: A review. 86, 5 (May), 974–997.

Worrall, S., Sadka, A., Sweeney, P., and Kondoz, A. 2001. Motion adaptive error resilient encoding for MPEG-4. In ICASSP.

Wrobel, F., Palau, J. M., Calvet, M. C., Bersillon, O., and Duarte, H. 2001. Simulation of nucleon-induced nuclear reactions in a simplified SRAM structure: Scaling effects on SEU and MBU cross sections. IEEE Trans. on Nuclear Science 48, 6.

Yuan, W. and Nahrstedt, K. 2003. Energy-efficient soft real-time CPU scheduling for mobile multimedia systems. 37, 5 (Dec), 149–163.

Yuan, W. and Nahrstedt, K. 2004. Practical voltage scaling for mobile multimedia devices. In ACM international conference on Multimedia. 924–931.

Zhang, R., Regunathan, S. L., and Rose, K. 2000. Video coding with optimal inter/intra-model switching for packet loss resilience. IEEE Journal on Selected Areas in Communications 18, 6 (June), 966–976.

Zhang, W. 2005. Computing cache vulnerability to transient errors and its implication. In DFT'05.