Dependability and Security Challenges in Emerging...

16
1 JAA 6/28/2007 JAA 6/28/2007 Dependability and Security Challenges Dependability and Security Challenges in Emerging Technologies in Emerging Technologies Jacob A. Abraham Jacob A. Abraham University of Texas at Austin University of Texas at Austin Workshop Panel Workshop Panel June 28, 2007 June 28, 2007

Transcript of Dependability and Security Challenges in Emerging...

1JAA 6/28/2007JAA 6/28/2007

Dependability and Security Challenges Dependability and Security Challenges in Emerging Technologiesin Emerging Technologies

Jacob A. AbrahamJacob A. AbrahamUniversity of Texas at AustinUniversity of Texas at Austin

Workshop PanelWorkshop Panel

June 28, 2007June 28, 2007

2JAA 6/28/2007JAA 6/28/2007

Nanoscale CMOS Trends

Moore's Law

3JAA 6/28/2007JAA 6/28/2007

Process Variations

Source: Intel

4JAA 6/28/2007JAA 6/28/2007

Nanoscale CMOS Example

Fin-FETs

5JAA 6/28/2007JAA 6/28/2007

Effects on Circuits and SystemsExperiments on chips today show that running some chips at rated speeds produce errors

Correct operation when running at normal speeds

Resistive opens (possible in copper interconnect) cause delay defects

Crosstalk effects could also cause errors

As technology scales down, chips in the future prone to erroneous operation due to:

Process variations (soft errors)Increasing defects (today's memories are an example)

6JAA 6/28/2007JAA 6/28/2007

New Nano SystemsCarbon Nanotubes

Co

urt

esy:

IBM

7JAA 6/28/2007JAA 6/28/2007

New Nano Systems

Inter-dot Barriers

Outer Barriers

Dot occupied by Electron

Dot unoccupied

1

“1” “0”

1

QCA Wire

1 0

QCA Inverter

Stable

Unstable

Quantum Cellular Automata

Quantum Devices

8JAA 6/28/2007JAA 6/28/2007

Dealing with ErrorsRedundancy

Hardware (duplication/retry, triplication)Time (recomputation)

Basic ideas in dealing with errors are not newMoore and Shannon, 1956:

“Reliable Circuits Using Less Reliable Relays”von Neumann, 1956:

“Probabilistic Logic and the Synthesis of Reliable Organisms from Unreliable Components”

Self-checking circuits (1970s)Algorithm-based fault tolerance (1980s)Software-based checks (control flow checks, etc.) (1980s)

9JAA 6/28/2007JAA 6/28/2007

Redundancy at Different Levels

Switch (Circuit) level

Module level

Voter

Functional

Module

Checker

Outputs

Error

Inputs

Self CheckingSystem

10JAA 6/28/2007JAA 6/28/2007

Protecting Computations at the System Level

ColumnChecksum

RowChecksum

ijA

B jk

1

2

3

4

1 2 3 4

Algorithm-BasedFault Tolerance

Overhead decreasesas system gets larger!

11JAA 6/28/2007JAA 6/28/2007

CEDA: Control-flow Error Detection through Assertions (Integrated with GCC)

S: global runtime signature register– updated at the beginning and end of each node– each update either an XOR or an AND operation– op. performed based on the program graph properties

S = S XOR 1011

Se = 0011

Se = 1000

br S != 0011 errS = S XOR 0110

Se = 0101 Se: expected value of S at each point in the program - calculated at compile time

Check point: S is checked against its expected value - detects CFE if one occurred - not required inside every node

Node signature: expected value of S inside a node

Node exit signature: expected value of S immediately after exiting a node

12JAA 6/28/2007JAA 6/28/2007

Dealing with FaultsIncreasing possibility of defectsDefect tolerance key for yieldWant system to start in a good state

Cannot produce cost-effective DRAMS without replacing faulty cells with spares

However, sparing cells is much more difficult for logic (very high cost for multiplexers, routing)

13JAA 6/28/2007JAA 6/28/2007

Defect Tolerance in Carbon Nanotube Circuits

Source: Mitra

14JAA 6/28/2007JAA 6/28/2007

What are some of the characteristics of future products?

Low-cost consumerproducts

High frequency,high resolutionsignals

System service (what matters is what customer sees)

Regulations0

200

400

600

800

1000

1200

1400

1991 '93

'95

'97

'99

2001 '03

'05

'07

'09

'11

SoC Market Size

World Wide Semiconductor Market Size

MS-SOC Contribution to the

SoC Market Size

Mar

ket S

ize 

(B$)

15JAA 6/28/2007JAA 6/28/2007

Objective of dependabilityGuarantee that the system meets customer or regulatory specifications

Need not check directly for the specificationsThis is becoming impossible to do with low cost

Only solution for systems of the future is indirect checks from which the specifications can be inferred accurately

Detector Output

10M Samples per Second

­2.5 ­2 ­1.5 ­1 ­0.5 0 0.5 1 1.5­2.5

­2

­1.5

­1

­0.5

0

0.5

1

1.5

Predicted IIP3 [dBm]

Mea

sure

d IIP

3 [d

Bm

]

LNA IIP3

IIP3y=x ref line

Third harmonic of 940 MHz RF system deducedfrom 10 MS/sec detector output

16JAA 6/28/2007JAA 6/28/2007

ConclusionsNeed to deal with soft errors (due to variations, etc.)

Detection and correction techniques

Tolerate defects in manufactureLots of devices, but efficiently using them is key

Level to apply solutions?Usually higher levels are betterMay find good solutions at low levels, too

Can utilize many “old” techniques

Need to look for “new” techniques