[IEEE 2012 IEEE International Integrated Reliability Workshop (IIRW) - South Lake Tahoe, CA, USA...



[IEEE 2012 IEEE International Integrated Reliability Workshop (IIRW) - South Lake Tahoe, CA, USA (2012.10.14-2012.10.18)] 2012 IEEE International Integrated Reliability Workshop Final

DG Summary: Product/Circuit Reliability

Discussion Moderator: Aditya Bansal, IBM

Minutes: James Wu, PMCS

The discussion group (DG) on product/circuit reliability took place on the evening of Oct. 17, 2012. The moderator welcomed about twenty attendees to share their thoughts and experiences relevant to this topic.

Challenges of product reliability

Attendees shared their views on today’s challenges:

• Physics – it is crucial that research scientists get the real failure mechanisms right. Extrapolating stress data to use conditions with an incorrect model can severely impact reliability at the device and system level.

• Geometry scaling – statistical variation at the atomic scale is significantly impacting device reliability.

• Test vehicles – does the reliability of a handful of test structures help validate product reliability?

• Reliability models – accurate models can help designers close their designs with reliability degradation accounted for. How do we get them, and is there an industry standard for reliability models (BTI/HCI)?

• Reliability spec – mission profiles vary widely, from consumer-grade electronics to mission-critical components. Automotive, for example, is incorporating more consumer-grade electronics to suit lifestyle changes (infotainment), but overall system reliability must be maintained. How?

Reliability Risks

• Manufacturing and in-field failures

• Mismatch in design specs due to unforeseen use conditions

• Component vendors’ competence level in reliability assurance

Accelerated testing

In general, attendees agreed that tests such as burn-in (BI) may not be representative of long-term reliability at use conditions:

• traditional BI requires a cool-down before measurement, so some recovery effects are missed

• the field failure rate is actually lower than the prediction based on accelerated testing

• BSIM accurately predicts performance at time zero, but reliability is an extrapolation and is sensitive to the physics model used

Despite the shortfalls of accelerated testing, the fact remains that a fab cannot realistically collect enough data without acceleration.
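The sensitivity of extrapolation to the chosen model can be illustrated with a minimal sketch. It assumes two common voltage-acceleration forms, an exponential law and a power law, neither of which is named in the minutes. The stress points and the use voltage are made-up numbers chosen only for illustration. Both models pass exactly through the same two stress points, yet they disagree strongly once extrapolated to the use voltage.

```python
import math

# Hypothetical stress results (made-up numbers, not from the workshop):
# median time-to-failure in seconds at two accelerated stress voltages.
STRESS_POINTS = [(2.0, 1.0e4), (1.8, 1.0e5)]  # (volts, seconds)
USE_VOLTAGE = 1.0  # assumed use-condition voltage in volts
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def fit_exponential(points):
    """Fit t = t0 * exp(-gamma * V) exactly through two (V, t) points."""
    (v1, t1), (v2, t2) = points
    gamma = math.log(t2 / t1) / (v1 - v2)
    t0 = t1 * math.exp(gamma * v1)
    return lambda v: t0 * math.exp(-gamma * v)


def fit_power_law(points):
    """Fit t = c * V**(-n) exactly through two (V, t) points."""
    (v1, t1), (v2, t2) = points
    n = math.log(t2 / t1) / math.log(v1 / v2)
    c = t1 * v1 ** n
    return lambda v: c * v ** (-n)


exp_model = fit_exponential(STRESS_POINTS)
pow_model = fit_power_law(STRESS_POINTS)

# Extrapolate both fits from the stress voltages down to the use voltage.
t_exp = exp_model(USE_VOLTAGE)
t_pow = pow_model(USE_VOLTAGE)

print(f"exponential model: {t_exp / SECONDS_PER_YEAR:.1f} years")
print(f"power-law model:   {t_pow / SECONDS_PER_YEAR:.1f} years")
print(f"ratio power-law / exponential: {t_pow / t_exp:.1f}")
```

With only stress-voltage data, nothing in the fit itself distinguishes the two models; the choice has to come from the physics of the failure mechanism.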

Qualification vehicles

• Ring oscillators (ROs) with a variety of inverting stages (INV, NAND, NOR, etc.) have been shown to estimate the performance of general-purpose microprocessors

  o this may not be true for all circuit types

  o it is challenging to correlate the degradation in test structures to the products

• Why not stress the product to failure?

  o the challenge of translating accelerated stress tests to use conditions still exists

  o one attendee pointed out that ESD protection and tester time can hinder such an effort
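As a rough illustration of why ROs are used as aging monitors, the sketch below converts a threshold-voltage shift into an RO frequency loss. It assumes the alpha-power-law delay model (stage delay proportional to Vdd / (Vdd - Vth)^alpha), which the minutes do not mention, and it assumes every transistor in the stage shifts by the same amount. The function name and all numeric values are hypothetical.

```python
def ro_frequency_ratio(vdd, vth, delta_vth, alpha=1.3):
    """Return aged/fresh ring-oscillator frequency under the alpha-power law.

    Frequency is taken as proportional to (vdd - vth)**alpha / vdd, so the
    supply term cancels in the ratio. alpha=1.3 is an assumed value.
    """
    fresh_overdrive = vdd - vth
    aged_overdrive = vdd - (vth + delta_vth)
    if fresh_overdrive <= 0 or aged_overdrive <= 0:
        raise ValueError("gate overdrive must stay positive")
    return (aged_overdrive / fresh_overdrive) ** alpha


# Hypothetical example: 1.0 V supply, 0.3 V threshold, 30 mV aging shift.
ratio = ro_frequency_ratio(vdd=1.0, vth=0.3, delta_vth=0.03)
print(f"RO frequency after aging: {100 * ratio:.1f}% of fresh value")
```

A single uniform shift is exactly the kind of simplification the attendees questioned: product paths mix gate types, loads and stress duty cycles, so an RO result does not transfer to the product without a correlation study.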

Design for reliability

• The simplest method is to add a guard-band

• Add health monitors to the products and receive the information periodically (legal?)

• Understand the physics and change the reliability models accordingly to allow useful lifetime – accurate aging models are necessary at circuit, RTL and system level

• Increase device size to cover reliability [often done in memories for stability] – this is not desirable for logic density reasons

• Worst-case conditions are real, so “smart design” is a must – e.g. ensure that all logic circuits toggle at some point to avoid static stress
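The toggle requirement in the last item lends itself to a simple automated check. The sketch below assumes a simulation trace is available as a mapping from signal name to a list of sampled logic values; that format, the function name and the signal names are all hypothetical. It flags signals that never change value and would therefore sit under static stress for the whole trace.

```python
def find_static_signals(trace):
    """Return the sorted names of signals that never change value.

    `trace` maps a signal name to its sampled logic values (0 or 1).
    A signal with zero or one distinct value never toggled.
    """
    return sorted(
        name for name, samples in trace.items() if len(set(samples)) <= 1
    )


# Hypothetical trace: two signals are stuck, one toggles.
example_trace = {
    "clk_en": [1, 1, 1, 1],
    "data0": [0, 1, 0, 1],
    "scan_mode": [0, 0, 0, 0],
}

static = find_static_signals(example_trace)
print("signals under static stress:", static)
```

A check like this only covers the workload that was simulated, so a signal that toggles here may still be static under a different mission profile.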

Even though the field failure rate may be low, temporal reliability must still be taken care of, because failure carries a cultural stigma. Reliability scientists/engineers and circuit/architecture designers must work together.
