Page 1

Test Methodology for Characterizing the SEE Sensitivity of a Commercial IEEE 1394 Serial Bus (FireWire)

Christina Seidleck, Raytheon ITSS, Lanham, MD
Stephen Buchner, QSS, Landover, MD
Hak Kim, Jackson & Tull, Washington, DC
P.W. Marshall, Consultant, Brookneal, VA
Kenneth LaBel, NASA GSFC, Greenbelt, MD

Panels:
Abstract
Introduction
Typical PC-based Implementation
The Protocol Layers
What Was Tested?
Results LLC Irradiated Hard Errors
Results LLC Irradiated Soft Errors
Results PHY Irradiated Hard Errors
Conclusions
References
Example of a Soft Error
Example of a Hard Error
SEFIs Categorized by Steps Required to Start Communications
Results LLC Asynchronous Mode
Results PHY Asynchronous Mode
Radiation Test Hardware Diagram
Radiation Test Software
Software Flow for Asynchronous Mode
Software Flow for Isochronous Mode
Two Main Types of Error Observed
Modes of Operation
Packet-Based Transactions
Test Part Function and Accessibility
Radiation Characterization
Radiation Test Hardware Setup

Page 2

Abstract

The Single Event Effect (SEE) responses of two FireWire serial buses based on the IEEE 1394 standard were tested with heavy ions and protons. A unique approach to testing and categorizing the SEEs is presented.

Page 3

Introduction and Background

IEEE 1394 is the formal specification of the FireWire architecture, originally developed by Apple Computer. FireWire is an advanced serial bus for connecting multiple high-performance devices.

Why FireWire?

•Less Expensive Alternative to Parallel Buses - a variety of devices can connect directly to a single serial bus, ~4.5 meters allowed between devices (cable implementation)

•Backplane and Cable Implementations Supported - only the cable implementation is presented here

•Plug and Play Support - supports automatic configuration of devices without intervention from the host system

•Scalable Performance - support of transfer rates of 400Mb/s, 200Mb/s, and 100Mb/s

•Attachment Of Up To 63 Nodes On A Single Serial Bus

•Supports Two Transmission Modes: Isochronous and Asynchronous

•Peer to Peer Transfers - data can be transferred between individual nodes without intervention from the host system

Page 4

Typical PC-Based 1394 Implementation

[Diagram: a PC, digital VCR, CD-ROM, laser printer, and digital camera connected by 1394 cable.]

The serial bus allows a variety of high-speed peripheral devices to be attached and supported

Page 5

The Protocol Layers of the 1394

[Diagram of the protocol stack. Labels, from the software side down to the bus:]

Software Driver

Interfaces: Bus Management Interface, Asynchronous Transfer Interface, Isochronous Transfer Interface

Serial Bus Management Layer: Bus Manager, Isochronous Resource Manager, Cycle Master, Node Controller

Transaction Layer

Link Layer

Physical Layer

Serial Bus

Page 6

What Was Tested?

Commercial 1394 Development Board

Link Layer (LLC)
•FIFOs
•PCI Registers
•OHCI Registers

Physical Layer (PHY)
•16 Internal Registers

Vendor: TI
LLC Part Number: TSB12LV26PZT
PHY Part Number: TSB41AB3PFP
Development Board: TSBKOHCI403

Vendor: NSC
LLC Part Number: CS4210VJG
PHY Part Number: CS4103VHG
Development Board: CS4210A-DK

Lot Date Codes (LLC and PHY columns, in extraction order; assignment to parts unclear): CA-OAAO45T, VS052ABC4, OCC4RTT, VS052ABC4

Page 7

Modes of Operation

Asynchronous

•Data transfers target a particular node based on a unique address (one-to-one transfers)
•Data transfers do not require a constant data rate
•Asynchronous transfers as a group are guaranteed at least 20% of overall bus bandwidth
•Verifies data delivery with acknowledge, CRC checks and response codes
•Supports data retransmits
•Used when data integrity is required/critical

Isochronous

•Data transfers target nodes based on the channel number of the transfer (like a broadcast, one-to-many)
•Receiving nodes “listen” to channel numbers to receive data packets
•No error detection or retransmits
•Uses constant bandwidth, which is requested from the isochronous resource manager
•Up to 80% of bus bandwidth is available for isochronous data transfers
•Used for time-critical, error-tolerant data transfers
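The 20%/80% split between the two modes can be made concrete as a per-cycle bit budget. The following is a minimal sketch, assuming the nominal 125 µs bus cycle of IEEE 1394 (not stated on the slide) and nominal signalling rates; it ignores cycle-start packets, arbitration gaps, and packet overhead. The function name `cycle_budget` is hypothetical.

```python
CYCLE_US = 125  # assumption: nominal 1394 cycle period in microseconds


def cycle_budget(rate_mbps, iso_percent=80):
    """Split one bus cycle's bit budget into isochronous and asynchronous shares."""
    bits_per_cycle = rate_mbps * CYCLE_US  # (Mb/s) x (us) = bits
    iso_bits = bits_per_cycle * iso_percent // 100
    return {
        "total": bits_per_cycle,
        "isochronous_max": iso_bits,
        "asynchronous_min": bits_per_cycle - iso_bits,
    }


# Nominal rates listed in the introduction
for rate in (100, 200, 400):
    print(rate, "Mb/s:", cycle_budget(rate))
```

At the 100 Mb/s rate used in the radiation test, this leaves on the order of 2500 bits per cycle for the asynchronous register-polling traffic, before overhead.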

Page 8

Packet-Based Transactions

All transactions are transmitted over the bus in a packetized form. Different types of packets are defined for asynchronous and isochronous modes.

Asynchronous Packets:
•Reads
•Writes
•Locks

Sample Write Request Packet (fields shown): Destination Address, Source ID, Transaction Type, Label, Data, CRC

Sample Acknowledge Packet (fields shown): Acknowledge Code, Parity

Isochronous Packets:
•Stream Data

Sample Stream Packet (fields shown): Channel Number, Transaction Type, Data, CRC
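The acknowledge packet shown above carries only a code and a parity field. As a rough illustration, the sketch below packs the two into one byte, assuming 4-bit fields with the parity nibble equal to the bitwise complement of the code nibble (our reading of IEEE 1394-1995, not something the slide states). The function names are hypothetical.

```python
def make_ack(ack_code):
    """Pack a 4-bit acknowledge code and its complement into one byte."""
    if not 0 <= ack_code <= 0xF:
        raise ValueError("ack_code must fit in 4 bits")
    parity = ~ack_code & 0xF  # assumption: parity is the one's complement of the code
    return (ack_code << 4) | parity


def ack_is_consistent(byte):
    """True when the low nibble is the complement of the high nibble."""
    return ((byte >> 4) ^ (byte & 0xF)) == 0xF


ack = make_ack(0x1)
print(hex(ack), ack_is_consistent(ack))
print(ack_is_consistent(ack ^ 0x01))  # a single flipped bit is detected
```

Under this assumption any single bit flip in the acknowledge byte breaks the complement relation, which is why the receiver can reject a corrupted acknowledge.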

Page 9

Test Part Function and Accessibility

Link Layer (LLC)

Functions
•Forms packets for transmission
•Provides address decoding for incoming asynchronous packets
•Provides channel number decoding for incoming isochronous packets
•Performs CRC error checking

Registers Monitored
•FIFOs
•42 out of a possible 102 Open Host Controller Interface (OHCI) registers
•21 out of a possible 22 PCI registers

Physical Layer (PHY)

Functions
•Electrical and mechanical interface for transmission and reception of packets transferred across the bus
•Arbitration - ensures only one node at a time transmits on the bus

Registers Monitored
•Due to the volatility of the 16 registers on the PHY, none was monitored

Page 10

Radiation Characterization

• Protons (TRIUMF) and heavy ions (BNL and TAMU) were used to test parts from Texas Instruments and National Semiconductor.

• The PHY and LLC chips on the DUT board were irradiated separately.

• The National Semiconductor part underwent destructive latchup when irradiated with ions having an LET of 27 MeV·cm²/mg. Therefore, only the TI parts were fully characterized.

Page 11

Radiation Test Hardware Setup

•Two personal computers (PCs) with PCI slots were used in the test

•Each had an IEEE1394 board

•One of the PCs with the devices-under-test (DUTs) was placed in the beam line while the other was placed in a remote area

•The two PCs were connected by their 1394 interface via a 10 ft 1394 cable for data communication

•A PCI bus isolation card was placed between the DUT board and its host PC

•This card enables current consumption readings from the +5V supply to the DUT board from the host PC via the PCI interface

•An HP34401A Digital Multi-Meter (DMM) was used to read and record this supply current

Page 12

Radiation Test Hardware Diagram

[Diagram: in the target area, the host PC holds the 1394 DUT board, whose PHY and LLC are exposed to the beam. A 10 ft 1394 cable connects it to the 1394 board in the remote PC (CTRL). Each PC has a monitor, keyboard, and mouse. A PCI bus isolation card and an HP34401A DMM are attached on the DUT side; a laptop is also shown.]

Page 13

Radiation Test Software

•Custom device driver software was developed using C++ and Jungo’s WinDriver, targeting a PC running Windows NT 4.0

•The software was an interrupt-driven program that established continuous communications between DUT and CTRL at 100 Mbps

•For SEL testing at BNL no registers were monitored

•For proton testing at TRIUMF only asynchronous mode was implemented

•For heavy ion testing at TAMU both asynchronous and isochronous modes were implemented

Page 14

Software Flow for Asynchronous Mode

CTRLR

Setup
•Lockdown memory
•Set node ID
•Set Delay
•Enable receive buffers
•Turn on interrupts

Flow
•Determine test type (LLC or PHY)
•Form data request packet and send to DUT (register data request packet)
•Receive register data response packet in the Response Buffer
•Compare Data
•Log Errors
•Continue Test Loop

DUT

Setup
•Lockdown memory
•Set node ID
•Set Delay
•Enable receive buffers
•Turn on interrupts

Flow
•Wait for interrupt
•Receive register data request packet in the Request Buffer
•Determine test type requested
•Poll LLC or PHY registers
•Form Data Response Packet (register data response packet returned to CTRLR)
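The CTRLR's "compare data, log errors" step amounts to diffing each returned register value against a reference copy. The sketch below is one way that comparison could look; it is not the authors' C++/WinDriver code. The register names and values are hypothetical, and bit 0 here means the least significant bit.

```python
def diff_registers(expected, observed, width=32):
    """Return (name, bit, expected_bit, observed_bit) for every differing bit."""
    flips = []
    for name, exp_value in expected.items():
        obs_value = observed[name]
        changed = exp_value ^ obs_value  # 1 wherever the two values disagree
        for bit in range(width):
            if (changed >> bit) & 1:
                flips.append((name, bit, (exp_value >> bit) & 1, (obs_value >> bit) & 1))
    return flips


# Hypothetical reference snapshot and one polled response from the DUT
reference = {"reg_a": 0x00007000, "reg_b": 0xC0000000}
response = {"reg_a": 0x00007000, "reg_b": 0xC0000020}

for name, bit, before, after in diff_registers(reference, response):
    print(f"{name}: bit {bit} changed {before} -> {after}")
```

An empty result means the poll matched the reference and the test loop simply continues.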

Page 15

Software Flow for Isochronous Mode

CTRLR

Setup
• Lockdown memory
• Enable ARRS for bus reset packets
• Turn on Isoch receive buffer
• Turn on interrupts

Flow
• Form register data solicit packet (register data solicit stream packet sent to DUT)
• Receive register data stream packet in the Receive Buffer
• Compare register values
• Log errors
• Continue loop

DUT

Setup
• Lockdown memory
• Enable ARRS for bus reset packets
• Turn on Isoch receive buffer
• Turn on interrupts

Flow
• Wait for interrupt
• Receive register data solicit stream packet in the Receive Buffer
• Poll LLC registers
• Build data packet (register data stream packet sent to CTRLR)

Page 16

Two Main Types of Errors Observed

Soft Errors

•Bit flips in registers, FIFOs, or data that were logged by software and did not disrupt communications between the DUT and CTRLR during the test run

Hard Errors or SEFIs

•Errors occurring in registers which halted communications between the DUT and CTRLR during the test run

•Errors of this type required a series of software and/or operator steps in order to recover communications

•SEFIs were further classified by the steps taken to re-establish communications

Page 17

Example of a Soft Error

Asynchronous Request Filter Low Register on the LLC

Enables reception of asynchronous request packets on a per-node basis (handles lower node IDs). When an asynchronous request packet is received, the source node ID is examined. If the bit corresponding to the node ID is not set in this register, then the packet is not acknowledged and the request is not queued. In this example, the register is set up such that only asynchronous request packets from nodes 0 and 1 will be accepted.

Bits 0-31 (left to right): 11000000 00000000 00000000 00000000

If bit 26 transitions to a 1, asynchronous request packets from node 26 would incorrectly be accepted.
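The filter example can be replayed in a few lines. This is a minimal sketch that treats the register as the 32-character bit string from the slide, with bit 0 as the leftmost character; it makes no claim about how the hardware orders the bits within the 32-bit word. The helper names are hypothetical.

```python
# Bit string as drawn on the slide: index 0 is the leftmost character
FILTER_LO = "11000000000000000000000000000000"


def accepted_nodes(bits):
    """Node IDs whose asynchronous requests the filter would accept."""
    return [node for node, bit in enumerate(bits) if bit == "1"]


def flip_bit(bits, index):
    """Return a copy of the bit string with one bit inverted (a single upset)."""
    flipped = "0" if bits[index] == "1" else "1"
    return bits[:index] + flipped + bits[index + 1:]


print(accepted_nodes(FILTER_LO))                # [0, 1]
print(accepted_nodes(flip_bit(FILTER_LO, 26)))  # [0, 1, 26]
```

Communication between the two nodes in the test continues after such an upset, which is why it counts as a soft error.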

Page 18

Example of a Hard Error (SEFI)

Host Controller Control Register on the LLC

Provides flags for controlling the TSB12LV26

Bits 0-31 (left to right): 00000000 00000000 01110000 00000000

[The register diagram marks two bit ranges as Reserved.]

Bit 17 is the Link Enable bit. This bit is set to 1 when the system is ready to begin operation. If an upset cleared it to 0, the TSB12LV26 would be logically and immediately disconnected from the 1394 bus. No packets would be received or transmitted. Communications would be halted between the CTRLR and DUT.
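To contrast this case with a soft error, the sketch below labels a register change as "hard" when the Link Enable bit is cleared and "soft" for any other change. This rule is a deliberate simplification for illustration: in the test itself, errors were classified by whether communications halted, not by inspecting a single bit. The bit string follows the slide's numbering (bit 0 leftmost) and the helper names are hypothetical.

```python
HC_CONTROL = "00000000000000000111000000000000"  # bit 0 is the leftmost character
LINK_ENABLE_BIT = 17


def with_bit(bits, index, value):
    """Return a copy of the bit string with one bit forced to '0' or '1'."""
    return bits[:index] + value + bits[index + 1:]


def classify_upset(before, after):
    """Simplified rule: losing Link Enable is 'hard', any other change is 'soft'."""
    if before == after:
        return "none"
    if before[LINK_ENABLE_BIT] == "1" and after[LINK_ENABLE_BIT] == "0":
        return "hard"
    return "soft"


print(classify_upset(HC_CONTROL, with_bit(HC_CONTROL, LINK_ENABLE_BIT, "0")))
print(classify_upset(HC_CONTROL, with_bit(HC_CONTROL, 3, "1")))
```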

Page 19

SEFIs Categorized By Steps Required to Start Communications

Step  Action

1  SEU test loop is restarted on the CTRLR, i.e., a packet is sent to DUT requesting register information.

2  Software bus reset. Force CTRLR to be root, initiate bus reset in the PHY, reset node on LLC. Restore registers and flush FIFOs. Set bus Ops, IRMC, CMC, ISC, configuration ROM, enable transmit and receive. Implies step 1.

3  Reload software application. This refreshes the lockdown memory region shared by hardware and software. Implies steps 2, 1.

4  Able to verify CTRLR is sending register data solicit packets to DUT. Able to verify that DUT receives the packets and sends data response packet to CTRLR. CTRLR cannot see response packet from DUT. Power cycle the CTRLR. Implies steps 3, 2, 1.

5  Disconnect/reconnect the 1394 cable. This causes hard bus reset, tree ID process.

6  Step 5, followed by steps 3, 2, 1.

7  Step 6 followed by cold rebooting DUT followed by steps 3, 2, 1.

8  Cold reboot DUT followed by steps 3, 2, 1.

9  Step 5 followed by step 8.

10  Reboot CTRLR followed by steps 3, 2, 1.

11  Reboot both CTRLR and DUT PCs followed by steps 3, 2, 1.
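Because most steps are defined in terms of earlier ones, each SEFI category expands to a flat sequence of operator and software actions. The sketch below encodes the table as nested recipes and expands them recursively. It assumes that "followed by steps 3, 2, 1" means performing step 3, which itself implies steps 2 and 1; the short action strings and the `RECIPES` table are our paraphrase, not the authors' software.

```python
# Each recipe is a list of primitive actions (str) or references to other steps (int).
RECIPES = {
    1: ["restart test loop on CTRLR"],
    2: ["software bus reset", 1],
    3: ["reload software application", 2],
    4: ["power cycle CTRLR", 3],
    5: ["disconnect/reconnect 1394 cable"],
    6: [5, 3],
    7: [6, "cold reboot DUT", 3],
    8: ["cold reboot DUT", 3],
    9: [5, 8],
    10: ["reboot CTRLR", 3],
    11: ["reboot CTRLR and DUT PCs", 3],
}


def expand(step):
    """Flatten a recovery step into the ordered list of primitive actions."""
    actions = []
    for item in RECIPES[step]:
        if isinstance(item, int):
            actions.extend(expand(item))  # follow the reference to another step
        else:
            actions.append(item)
    return actions


for step in (3, 9):
    print(step, "->", expand(step))
```

The length of the expanded list gives a rough measure of how disruptive a given SEFI category is, from one action for step 1 to eight for step 7.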

Page 20

Results - LLC Running Asynchronous Mode

ERRORS IN LLC RUNNING ASYNCHRONOUS MODE

Columns (effective LET, MeV·cm²/mg): 3, 4.2, 8.39, 11.9, 27.7, 39.2, 51.6, 59.6, 73

“Soft” Errors

1  No errors observed, current jumped from 18mA to 44mA
2  Register error, self corrected and no change in current
3  Register error, self corrected, current jumped from 18mA to 44mA

“Hard” Errors

4  Restart communications from CTRLR
5  Software bus reset, current jumped from 18mA to 44mA
6  Reset CTRLR and/or DUT software
7  Software bus reset and reset software on DUT and CTRLR
8  CTRLR sends packet, does not listen; cold reboot CTRLR
9  Disconnect/reconnect cable (hard bus reset)
10  Disconnect/reconnect cable, reload bus DUT software
11  Reset cable and cold reboot DUT
12  Cold reboot DUT after lockup, but no change in current
13  Cold reboot DUT after lockup, current jumped from 18mA to 44mA
14  Reset cable, reboot DUT and software, delta I=0
15  Reset cable, reboot DUT and software, current jumped from 18mA to 44mA
16  Reboot CTRLR, reload software on bus, DUT and CTRLR
17  Reboot both computers, reset all software

Values for rows 1 to 3 (all nine columns; x as in the original):

1:  0  0  0  0  0  0  0  0  x
2:  1.3E-4  1.0E-5  4.6E-5  2.5E-5  8.8E-5  3.1E-4  2.4E-4  1.3E-4  x
3:  0  0  0  0  0  0  0  0  x

Values for the hard-error rows (columns 3 to 59.6; the column for 73 reads x). Ten rows survive intact, listed in extraction order; their row numbers are not recoverable:

0  0  0  8.3E-7  0  0  6.8E-6  0
2.6E-5  0  0  0  0  0  0  0
4.3E-6  4.2E-7  0  0  0  1.3E-5  0  0
0  0  4.3E-6  8.3E-7  2.3E-6  0  0  0
2.3E-6  0  0  0  0  0  0  0
0  0  0  0  0  0  0  0
6.8E-6  0  0  0  0  0  0  0
5.7E-5  2.3E-6  0  0  0  0  0  0
0  0  0  8.3E-7  4.5E-6  2.6E-5  1.4E-5  0
0  0  2.2E-6  1.7E-6  4.5E-6  0  1.4E-5  0

The remaining four hard-error rows survive only as isolated cells: 30 zeros, one 4.3E-6, and one 6.8E-6.

Page 21

Results - PHY Running Asynchronous Mode

ERRORS IN PHY RUNNING ASYNCHRONOUS MODE

Columns (effective LET, MeV·cm²/mg): 3, 4.2, 8.39, 11.9, 27.7, 39.2, 51.6, 59.6, 73

“Soft” Errors

1  No errors observed, current jumped from 18mA to 44mA
2  Register error, self corrected and no change in current
3  Register error, self corrected, current jumped from 18mA to 44mA

“Hard” Errors

4  Restart communications from CTRLR
5  Software bus reset, current jumped from 18mA to 44mA
6  Reset CTRLR and/or DUT software
7  Software bus reset and reset software on DUT and CTRLR
8  CTRLR sends packet, does not listen; cold reboot CTRLR
9  Disconnect/reconnect cable (hard bus reset)
10  Disconnect/reconnect cable, reload bus DUT software
11  Reset cable and cold reboot DUT
12  Cold reboot DUT after lockup, but no change in current
13  Cold reboot DUT after lockup, current jumped from 18mA to 44mA
14  Reset cable, reboot DUT and software, delta I=0
15  Reset cable, reboot DUT and software, current jumped from 18mA to 44mA
16  Reboot CTRLR, reload software on bus, DUT and CTRLR
17  Reboot both computers, reset all software

Values for rows 1 to 3 (all nine columns; x as in the original):

1:  0  0  x  0  0  0  0  x  x
2:  0  0  x  0  0  0  0  x  x
3:  0  0  x  0  0  0  0  x  x

Values for the hard-error rows (4 to 17): the cell order was lost in extraction. Nonzero entries that survive, in extraction order: 1.0E-4, 6.4E-5, 9.1E-6, 9.1E-8, 8.3E-7, 3.3E-6, 2.0E-4, 2.5E-6, 3.6E-5, 2.0E-4, 2.6E-4. All other cells read 0 or x.

Page 22

Results - LLC Irradiated, Hard Errors

[Plot: SEFI cross section (cm²/device), log scale from 10^-7 to 10^-4, versus effective LET (MeV·cm²/mg) from 0 to 60, for isochronous and asynchronous modes.]

Fitted curves from the legend:

y = 6e-5 * (1 - exp(-((x - 2)/20)^1.15))

y = 2e-5 * (1 - exp(-((x - 2)/40)^0.645))
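The two legend equations have the form of a Weibull curve, so they can be evaluated directly to read off an approximate cross section at any LET. The sketch below does this. The parameter names (`sat`, `onset`, `width`, `shape`) are our own labels, returning zero at or below the onset is an assumption, and the transcript does not make clear which fit belongs to which mode, so the fits are simply numbered.

```python
import math


def weibull_cross_section(let, sat, onset, width, shape):
    """Evaluate sat * (1 - exp(-((let - onset) / width) ** shape)); zero at or below onset."""
    if let <= onset:
        return 0.0
    return sat * (1.0 - math.exp(-(((let - onset) / width) ** shape)))


# Parameters copied from the two legend equations (x is effective LET)
FIT_1 = dict(sat=6e-5, onset=2.0, width=20.0, shape=1.15)
FIT_2 = dict(sat=2e-5, onset=2.0, width=40.0, shape=0.645)

for let in (3, 11.9, 27.7, 59.6):
    s1 = weibull_cross_section(let, **FIT_1)
    s2 = weibull_cross_section(let, **FIT_2)
    print(f"LET {let:>5}: fit 1 = {s1:.2e}, fit 2 = {s2:.2e} cm^2/device")
```

The `sat` value is the saturation cross section the curve approaches at high LET, and `onset` is the LET below which the fit predicts no events.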

Page 23

Results - LLC Irradiated, Soft Errors

[Plot: SEU cross section (cm²/device), log scale from 10^-7 to 10^-3, versus effective LET (MeV·cm²/mg) from 0 to 60, for isochronous and asynchronous modes.]

Fitted curves from the legend:

y = 0.0003 * (1 - exp(-((x - 2.5)/30)^1.1))

y = 0.0001 * (1 - exp(-((x - 2.5)/30)^1.15))

Page 24

Results - PHY Irradiated, Hard Errors

[Plot: SEFI cross section (cm²/device), log scale from 10^-6 to 10^-3, versus effective LET (MeV·cm²/mg) from 0 to 80, for isochronous and asynchronous modes.]

Page 25

Conclusions

• NSC part exhibited destructive latchup at LET = 27 MeV·cm²/mg

• TI part exhibited both SEUs (soft errors) and SEFIs (hard errors)

• At low LETs, the errors are mostly soft errors

• The presence of SEFIs that require rebooting the system makes this part problematic for space usage
– power cycling may be required

• An improved test would involve:
– automatic reboot
– another device

Page 26

References

Anderson, Don, and MindShare, Inc. FireWire System Architecture. Addison-Wesley: Reading, Massachusetts, 1999.

1394 Open Host Controller Interface Specification. Release 1.1, January 2000.

S. Buchner, et al. Radiation Testing of the 1394 FireWire. Presentation, SEU Symposium, Los Angeles, April 2002.

Sponsors

NEPP

NRL/NPOES

Special thanks to Kent Larson and Mike Worcester of Boeing
