White.p.johnson.k

Dr. K. Preston White, University of Virginia
Kenneth L. Johnson, NASA/NESC, LaRC

ver. 1/16/2009, Prob Req Verification for PM Challenge 2009

Overview

• Motivation: why care?
• Probabilistic requirements
• SR&QA vs. engineering performance based
• How to write them
• Decision matrix: consumer’s vs. producer’s risk
• The “simplest” verification case: pass/fail
• Number of simulation trials needed to verify
• Reliability Test Planner (RTP)
• Tweaking the verification sampling plan
• More sophisticated verifications


Motivation

CxCEF (Constellation Chief Engineer’s Forum) asked for expert input on how to clarify and standardize the process of writing probabilistic requirements

• Address previous Review Item Discrepancies (RIDs) / no more RIDs on this topic
• Verify the hardware/process/whatever using modeling and simulation (M&S)
• Make sure you don’t have a problem
• The hardware will perform as required
• See how close an intermediate product is to verification


Constellation Program (Cx)

NASA’s next manned space program, scheduled to make its first flights early in the next decade.

Ares I and Ares V rockets

Orion spacecraft

The vision is to send human explorers back to the moon and then onward to Mars and other destinations in the solar system . . .

. . . SAFELY

Deterministic Engineering Design

Traditional approach
• Ignore uncertainties initially
• Use deterministic values to stand in for uncertain model parameters
• Apply factors of safety and similar constructs to outputs to account for uncertainty


Probabilistic Engineering Design

Rapidly gaining acceptance
• Confront uncertainties directly: use estimated probability distributions for model parameters
• Estimate probability distributions for outputs from test and/or via Monte Carlo sim and/or other methods
• Capture, describe and leverage uncertainty to help design robust, reliable systems

Advantages
• Better understanding of impacts of uncertainties (lower risks)
• Design “closer to the margin” (lower costs)


Monte Carlo Approach

Staggering range of applications: computational mathematics, science, social science, economics and finance, computer science, all branches of engineering

[Figure: Monte Carlo simulation. Random inputs with known probability distributions (panel “Input Distributions”, step “Draw Observations”) and fixed parameters and controlled inputs with known values feed the Simulation Model, which produces sample distributions of model outputs (panel “Output Distributions”, step “Calculate Observations”). A companion panel, “Sensitivity Analysis”, shows the Simulation Model driven by fixed parameters and controlled inputs with known values.]
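The Monte Carlo flow described on this slide can be sketched in a few lines of Python. This is only an illustration: `simulation_model`, its inputs, the chosen input distributions and the limit of 6.0 are all hypothetical stand-ins, not anything taken from the Constellation models.

```python
import random
import statistics


def simulation_model(random_input_a, random_input_b, fixed_gain=2.0):
    """Hypothetical stand-in for a real simulation model.

    fixed_gain plays the role of a fixed parameter with a known value.
    """
    return fixed_gain * random_input_a - random_input_b


def run_monte_carlo(n_trials, seed=0):
    """Draw random inputs, run the model once per trial, collect the outputs."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    outputs = []
    for _ in range(n_trials):
        a = rng.gauss(1.0, 1.0)    # assumed input distribution (normal)
        b = rng.uniform(0.0, 2.0)  # assumed input distribution (uniform)
        outputs.append(simulation_model(a, b))
    return outputs


outputs = run_monte_carlo(10_000)
limit = 6.0  # hypothetical requirement limit on the output
fraction_over = sum(y > limit for y in outputs) / len(outputs)

print("mean of output sample:   ", round(statistics.fmean(outputs), 3))
print("std dev of output sample:", round(statistics.stdev(outputs), 3))
print("fraction of trials over the limit:", fraction_over)
```

The list `outputs` is the sample distribution of the model output; the later slides are about how many such trials are needed before a pass/fail count on that sample can be trusted.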


Writing Probabilistic Requirements

Requirements which involve M&S run under uncertainty need particular treatment

• Wording needs to be clear
• Make sure to address the “goodness” of the M&S as well as whether the output specifically meets a number
• Need to address uncertainty inherent in M&S along with uncertainty within the model and assumptions themselves

Recommendations to CxCEF by Probabilistic Requirements Verification Team

JSC document EA4‐07‐005 dated 5/14/2007 with attachments

This is a probabilistic technology (PT)


M&S Design Aids and Checks

Part of a well‐written requirement

Goals include
• Make sure simulation model isn’t designed in a vacuum
• Make sure the simulation model is appropriate to the questions at hand
• Make sure uncertainties are correctly and fully addressed
• Peer review

Methods (neither exhaustive nor mutually exclusive)
• Six Steps (Suren Singhal, MSFC; attachment to EA4‐07‐005)
• NASA‐STD‐7009


Two Major Types of Probabilistic Requirements

Safety, Reliability and Quality Assurance (SR&QA)‐type, aka probabilistic risk assessment (PRA)‐type

Failure generally results in loss of crew and/ or loss of mission (LOC/LOM)

Engineering performance‐based (aka physics‐based)

Next block in failure scenario is generally non‐catastrophic

Project has ultimate decision of which type
• Talk to a statistician, PRA expert and/or requirements expert if not clear which to apply


SR&QA Type Requirements

Requirement: [CAxxxx‐PO] The XXX system shall limit its contribution to the risk of loss of crew (LOC) for a Xsssss mission to no greater than 1 in 200 (TBR‐xxx‐xxx).

Rationale: The 1 in 200 (TBR‐xxx‐xxx) means a 0.005 (or 0.5%) probability of LOC due to the XXX during any Xsssss mission. The baseline numbers were derived from a preliminary PRA within NASA‐TM‐2005‐214062, NASA's Exploration Systems Architecture Study. This requirement is driven by CxP 70003‐XXXXxx, Constellation Program Plan, Annex 1: Need, Goals, and Objectives (NGO), Safety Goal CxP‐Xxx: Provide a substantial increase in safety, crew survival and reliability of the overall system over legacy systems.


SR&QA Type Verification Statement

[CA0501V‐PO] Xsssss LOC due to XXX shall be verified by analysis. The simulation tools and analysis methodology, and the assumed non‐ideal model behavior and design data which is used in the analysis shall be developed and peer reviewed to ensure the potential causes for off‐nominal behavior are adequately identified and their probabilities properly quantified [ see note 1]. The requirement shall be considered satisfied when analysis results show there is at most a 0.5% (TBR‐xxx‐xxx) probability of LOC with the probability taken as a mean probability [ see note 2 ].

Notes: see backup


Physics‐Based Probabilistic Requirements

[CAyyyy‐PO] The Vehicle shall perform Ysssss action under Yrrrrrr conditions

Rationale: Establishes the Vehicle as the launch vehicle to perform Yssss action with sufficient remaining propellant to execute further necessary actions. The architecture design solution of using the Ynnnnn approach was a result of NASA‐TM‐yyyyyyyy.


Physics‐Based Verification Statement

[CAxxxxV‐PO] The performance of the Vehicle to perform Ysssss action under Yrrrrrr conditions shall be verified by analysis. The simulation tools and analysis methodology, and the assumed non‐ideal model behavior and design data which is used in the analysis shall be developed and peer reviewed to ensure the potential causes for off‐nominal behavior are adequately identified and their probabilities properly quantified [ see note 1 ]. The requirement shall be considered satisfied when analysis results show there is at least a 99.73% (TBR‐yyy‐yyy) probability of successfully achieving success criteria with a “consumer’s risk” of 10% [ see note 2 ].


The Notes

Note 1: See “6‐Step Process” for one satisfactory approach. Note that the "peer review" may generally be performed by the SIG's, Panels, and Engineering Review Boards responsible for the engineering effort related to the requirements being verified.

Note 2: “Consumer’s risk” is defined in the glossary. A 10% maximum is suggested, and consumer’s risk is specified because of the criticality of meeting this constraint to mission success. The term “β‐confidence” could be used if preferred where the value specified would then be 90%.


How To

Break down components of requirement
• Desired performance
• Desired probability/proportion of achieving desired performance
• Acceptable risk
• Sampling error
• Consumer’s (β, Type II) and producer’s (α, Type I) risks


Decision Matrix

                       Verification Procedure Determines that the Design:
The Actual Design:     Meets the Standard               Fails the Standard
Meets the Standard     Correct Determination            Producer’s Risk, Type I Error
                       (probability 1-α)                (probability α)
Fails the Standard     Consumer’s Risk, Type II Error   Correct Determination
                       (probability β)                  (probability 1-β)


The Courtroom Analogy

Threshold for a criminal trial
• Assume innocence / prove beyond a reasonable doubt
• Focuses on α risk: want to make sure that, given you found evidence of wrongdoing, you really are sure of the evidence
• Type I error: wrongful conviction
• Type II error: letting a guilty person go free
• American courts try not to convict based on finding a possibility that the defendant is guilty
• (Don’t look too closely: the analogy isn’t quite right; see web.bsu.edu/cob/econ/research/papers/bsuecwp200601liu.pdf)


Bare‐Bones Physics‐Based Verification Statement

The system will attain the success threshold 99.73% of the time with a “consumer’s risk” of 10%.

• 99.73% is a coverage probability, aka a percentile of a distribution, aka a reliability of the system
• Means you expect to be ρ = 99.73% reliable, and can deal with failure 27 times out of 10,000
• Generally flowed down from parent requirements


Bare‐Bones Physics‐Based Verification Statement

The system will attain the success threshold 99.73% of the time with a “consumer’s risk” of β = 10%.

• 10% is an expression of consumer’s risk
• Means the Program expects to be ρ = 99.73% reliable, but if the system is in actual reality ρ = 99.729% reliable, the Program can deal with accepting that condition β = ~10% of the time
• Suggest this is a programmatic decision, but probably needs thoughtful input from designers and systems analysts
• Often depends on economics
• β risk is for figuring out whether you’ve met the requirement and is not necessarily used for flowing down requirements to the next level


How do I prove I have met my requirement based on coverage and consumer’s risk?


Proof of Verification

The Pass/Fail Case (Binomial)

A typical case:
• n simulation trials are run
• The number of trials in which the simulated result failed the requirement is counted

How
• How do we determine n?
• How many of the n trials can exceed the requirement before we know we’ve failed?


Number of Trials Required

General idea
• Want to run enough trials to be able to claim, “We’ve looked exhaustively”
• “Exhaustively” is determined by the Project: consumer’s risk (the probability we say it’s good when it’s actually bad)

This is a well‐characterized statistical case
• Acceptance sampling
• Addressed in MIL‐HDBK‐781
• Watch it: many QC acceptance standards focus on producer’s risk (ANSI/ASQ Z1.4)
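For the simplest plan, where verification is granted only if every trial passes, the number of trials follows directly from the binomial model: a system whose true reliability is exactly ρ passes n independent trials with probability ρ^n, so n must be large enough that ρ^n ≤ β. The sketch below works that out; the function name is illustrative, and it assumes independent trials with a constant per-trial reliability.

```python
import math


def zero_failure_trials(reliability, consumers_risk):
    """Smallest n with reliability**n <= consumers_risk.

    Assumes independent pass/fail trials and acceptance only on zero failures.
    """
    if not (0.0 < reliability < 1.0 and 0.0 < consumers_risk < 1.0):
        raise ValueError("reliability and consumers_risk must be in (0, 1)")
    n = math.ceil(math.log(consumers_risk) / math.log(reliability))
    # Guard against floating-point rounding in the logarithms.
    while reliability ** n > consumers_risk:
        n += 1
    return n


n = zero_failure_trials(0.9973, 0.10)
print(n)  # 852
```

For ρ = 99.73% and β = 10% this gives 852 trials, which agrees with the (852, 0) sampling plan that the RTP tool reports later in the deck.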


Acceptance Sampling

Qualities of acceptance sampling as a best practice

Strengths
• Accepted national and international standard
• Easy to implement: good software and a “cookbook” to apply
• Sampling plan can be determined a priori, before any simulation runs are made

Weakness
• For high reliability (large ρ) with low risk (small β), requires thousands of runs


Why Do Math?

An easy‐to‐run (pretty much) sample size calculator is available for free online

• Gary Pryor for the US Army TRADOC, Ft. Leonard Wood
• Handles many cases accurately

Reliability Test Planner (RTP)
http://www.wood.army.mil/msbl/Reliability%20Test%20Planner/Reliability_Test_Planner.htm


Steps

1. Start up the program, then click the “Solver” tab
2. Select the “Binomial Dist” radio button
3. Choose whether you want to find a first‐cut plan by specifying number of trials or number of “acceptable” failures
4. Enter desired reliability (coverage) in the “reliability” box
5. Enter one minus the consumer’s risk in the “confidence” box (this is β confidence or power)
6. Enter either desired number of trials or acceptable failures
7. Press “Solve”
8. Read out the (n,c) sampling plan and its (within rounding) exact reliability ρ and β confidence (power) in the window
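The calculation behind an (n, c) plan can also be reproduced without the tool. The sketch below is not RTP's code; it is a plain binomial search under the assumption of independent trials, with illustrative function names. Given an acceptance number c, it finds the smallest n for which a system at exactly the required reliability would be accepted with probability no greater than β.

```python
def acceptance_probability(n, c, reliability):
    """P(at most c failures in n independent trials) at a given true reliability."""
    p_fail = 1.0 - reliability
    term = reliability ** n  # probability of exactly 0 failures
    total = term
    for k in range(c):
        # Binomial recurrence: P(k+1 failures) = P(k) * (n-k)/(k+1) * p/(1-p)
        term *= (n - k) / (k + 1) * p_fail / reliability
        total += term
    return total


def smallest_plan(c, reliability, consumers_risk):
    """Smallest n such that the (n, c) plan meets the stated consumer's risk."""
    n = c + 1
    while acceptance_probability(n, c, reliability) > consumers_risk:
        n += 1
    return n


print(smallest_plan(0, 0.9973, 0.10))  # 852, the zero-failure plan
print(smallest_plan(5, 0.9973, 0.10))  # plan that tolerates up to 5 failures
```

A linear search is used for clarity; for very large n a bisection search or a statistics library would be the more practical choice.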


Determining n: RTP Solver


Sampling plan indicated: (852,0)
• RTP doesn’t work for high reliabilities and other large numbers
• See a statistician if you need more

Now, Press “Implement”
• This creates the “power curve” for your sampling plan
• Also known as an operating characteristic (OC) curve
• X axis: true system reliability, an unknown value
• Y axis: probability that, given a true system reliability of X and your sampling plan, the system will be accepted


Power Curve Shows Your Plan

This plan says to run 852 sim trials and call the requirement verified if there are zero failures

A single failure means the system is not verified to the requirement

Given a true reliability of 99.73%, the power curve shows a 10% probability that you will verify (accept) using this sampling plan

This is the way we specified the plan


[Figure: power curve for the (852, 0) plan, with the point at 99.73% reliability and 10% consumer’s risk marked.]

Power Curve Details Your Plan

Given a true reliability of ~99.65%, we have a ~5% chance we’ll verify

• This should be rejected
• Can you live with accepting verification 5% of the time?

[Figure: power curve with the point at 99.65% reliability marked.]


Power Curve Details Your Plan

Given a true reliability of ~99.92%, we have a ~50% chance we’ll verify

• . . . And a ~50% chance we’ll reject
• This should be accepted
• 50% of the time, a perfectly good system will be rejected using this plan

[Figure: power curve with the point at 99.92% reliability marked.]
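The three points read off the power curve on these slides can be checked numerically. The sketch below evaluates the acceptance probability of the (852, 0) plan at the three true reliabilities discussed; it assumes independent trials, and the function name is illustrative rather than anything from RTP.

```python
def acceptance_probability(n, c, reliability):
    """P(at most c failures in n independent trials) at a given true reliability."""
    p_fail = 1.0 - reliability
    term = reliability ** n  # probability of exactly 0 failures
    total = term
    for k in range(c):
        term *= (n - k) / (k + 1) * p_fail / reliability
        total += term
    return total


plan = (852, 0)  # (n, c) from the slides
curve = {r: acceptance_probability(*plan, r) for r in (0.9965, 0.9973, 0.9992)}
for r, p_accept in curve.items():
    print(f"true reliability {r:.4f} -> P(accept) = {p_accept:.3f}")
```

The results come out at roughly 5%, 10% and 50%, matching the values quoted on the slides. Evaluating the same function over a fine grid of reliabilities traces out the whole power (OC) curve.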


Can the Plan Be Improved?

Definition of improvement: discrimination
• Want to be able to discriminate between good and bad systems as perfectly as possible
• Graphically: want the power curve to be vertical
• All systems <99.73% reliable are rejected 100% of the time
• All systems >99.73% reliable are accepted 100% of the time

[Figure: power curve of a sampling plan with perfect discrimination.]

Can I Get Closer?

• Goal: steeper curve, still going through the point (0.9973, 0.10)
• One way: brute force
• More trials (aka more samples)
• Conceptually: specify a second point you want your curve to go through

[Figure: power curve annotated “Bring this point . . . back this way”.]

Can I Get Closer?

RTP will generate related plans with constant consumer’s risk
• Press “Add Plans” in menu bar
• Select “Constant Consumer Risk” and type in the desired β risk
• Type in the number of plans you want to examine “Above” and “Below” the current curve and the increment of number of failures between each plan
• Press “OK” to get the sampling plans and curves


More Plans

Adding a lot more trials and accepting more failures gets us closer to a vertical power curve

• Best plan here is the (12114, 25) plan
• You can get better by running more trials, but clearly, there are diminishing returns
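The trade described here can be sketched numerically: hold the consumer’s risk at β = 10% for ρ = 99.73%, increase the acceptance number c, and watch how the probability of accepting a good system changes. The code assumes independent trials; the “good” reliability of 99.92% is borrowed from the earlier power-curve slide purely as an illustrative comparison point, and the function names are not from RTP.

```python
def acceptance_probability(n, c, reliability):
    """P(at most c failures in n independent trials) at a given true reliability."""
    p_fail = 1.0 - reliability
    term = reliability ** n
    total = term
    for k in range(c):
        term *= (n - k) / (k + 1) * p_fail / reliability
        total += term
    return total


def constant_risk_plans(acceptance_numbers, reliability, consumers_risk):
    """For each c, the smallest n that keeps the consumer's risk at or below target."""
    plans = []
    n = 1
    for c in sorted(acceptance_numbers):
        n = max(n, c + 1)  # the required n never shrinks as c grows
        while acceptance_probability(n, c, reliability) > consumers_risk:
            n += 1
        plans.append((n, c))
    return plans


GOOD_RELIABILITY = 0.9992  # illustrative "good system" value, an assumption
plans = constant_risk_plans([0, 5, 10, 15, 20, 25], 0.9973, 0.10)
for n, c in plans:
    p_good = acceptance_probability(n, c, GOOD_RELIABILITY)
    print(f"plan ({n}, {c}): P(accept | reliability {GOOD_RELIABILITY}) = {p_good:.4f}")
```

The trial counts can be compared against the “Length” column of the RTP listing in this deck. Each step up in c buys a smaller improvement in discrimination than the one before, which is the diminishing return the slide refers to.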


Plan comparisons: 10 sampling plans with fixed ρ=0.995 and β=0.05

[Figure: Number of Trials (n), left axis from 0 to 3500, and Producer's Risk (α), right axis from 0 to 0.5, plotted against Acceptance Number (c) from 0 to 9.]


Test Plan Description

• To get a better test plan description, select the “Detail” tab and press the square button on the tab
• Note that the “length” column contains decimal trials: these should be rounded up
• E.g. Plan no. 7 should be (12115, 25)

Plan Description: Original Fixed Length Test Plan
Plan Type: Fixed Length
Source: Original
Distribution: Binomial
Lower Test Value: 0.997
Consumers Risk (Beta): 9.990989E‐02
Upper Test Value: 0.999
Producers Risk (Alpha): 0.6836846
Primary Plan Length: 852 trials & 0 failures

Plan #   Length     Fail   Beta   Alpha
1        852.0      0      0.1    0.684
2        850.3      0      0.1    0.683
3        3,435.0    5      0.1    0.321
4        5,702.3    10     0.1    0.155
5        7,881.4    15     0.1    0.075
6        10,012.3   20     0.1    0.035
7        12,114.8   25     0.1    0.017


What If I Can’t Do That Many?

It may not be possible to run the number of trials required for verification

• Each sim trial takes a very long time
• Resources not available to run a lot of trials
• Team doesn’t want to run that many trials


Here’s What You Can Do

• Get a faster computer
• Use a simpler simulation model
• Go with the number of trials you can run
    • Examine risk with RTP or other correct statistical calculation
    • ACCEPT THAT RISK
    • May need to write a waiver
• Use acceptance sampling by variables technique
• Use a technique that searches the response space efficiently
    • Response surface methodology?
    • Probabilistic methods?
• Calculate an answer
    • Be sure calculations are correct or conservative
    • “Worst‐on‐worst”?
    • Make sure it really is worst‐on‐worst
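For the “go with the number of trials you can run” option, the risk being accepted can be put in numbers. The sketch below covers only the zero-failure case with independent trials; the budget of 300 trials is a made-up figure and the function names are illustrative.

```python
def consumers_risk_zero_failures(n_trials, reliability):
    """Chance that a system at exactly `reliability` passes all n_trials."""
    return reliability ** n_trials


def demonstrated_reliability(n_trials, consumers_risk):
    """Reliability that n failure-free trials demonstrate at the given consumer's risk."""
    return consumers_risk ** (1.0 / n_trials)


affordable_trials = 300  # hypothetical budget, far short of the 852 required

risk = consumers_risk_zero_failures(affordable_trials, 0.9973)
demo = demonstrated_reliability(affordable_trials, 0.10)

print(f"consumer's risk at 99.73% with {affordable_trials} trials: {risk:.3f}")
print(f"reliability demonstrated at 10% consumer's risk: {demo:.5f}")
```

With 300 failure-free trials the consumer’s risk at 99.73% comes out near 44% rather than 10%; equivalently, the run demonstrates only about 99.24% reliability at 10% risk. Either figure is the kind of number a waiver would need to state.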


But . . . I Know I’m OK!

“But I know I’m OK because all my trials’ output values are a long way from my requirement limit!”

• This is not the binomial (pass/fail) case, but it may be possible to deal with it another way
• NOT PASS/FAIL

Continuous variable output characterization generally requires far fewer samples than binomial case
• Sometimes a small fraction of the number of trials needed for pass/fail
• Depends on distribution

You still need to verify statistically or somehow convince the verification panel using engineering judgment that risk is sufficiently low


Variables Acceptance Plans

Basic idea: compare the mean of your sim’s output distribution to the requirement limit

Add a factor which accounts for sampling error
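As a rough illustration of the idea, the sketch below accepts when the sample mean plus k sample standard deviations stays under an upper requirement limit. It assumes the output is normally distributed, which is exactly the assumption this deck warns against making without evidence, so treat it as a teaching example only. The k factor uses a common closed-form approximation to the one-sided normal tolerance factor; the function names, sample data and limit are all hypothetical.

```python
import math
import random
import statistics


def one_sided_k_factor(n, coverage, confidence):
    """Approximate one-sided tolerance factor for normally distributed data."""
    z_p = statistics.NormalDist().inv_cdf(coverage)
    z_g = statistics.NormalDist().inv_cdf(confidence)
    a = 1.0 - z_g ** 2 / (2.0 * (n - 1))
    b = z_p ** 2 - z_g ** 2 / n
    return (z_p + math.sqrt(z_p ** 2 - a * b)) / a


def verify_by_variables(outputs, upper_limit, coverage=0.9973, consumers_risk=0.10):
    """Accept if mean + k * s does not exceed the upper requirement limit."""
    k = one_sided_k_factor(len(outputs), coverage, 1.0 - consumers_risk)
    mean = statistics.fmean(outputs)
    s = statistics.stdev(outputs)
    return mean + k * s <= upper_limit


rng = random.Random(1)
sim_outputs = [rng.gauss(10025.0, 1.25) for _ in range(50)]  # made-up sim results
k50 = one_sided_k_factor(50, 0.9973, 0.90)

print("k for n = 50:", round(k50, 3))
print("verified against a limit of 10035:", verify_by_variables(sim_outputs, 10035.0))
```

Note how k shrinks toward the plain normal quantile as n grows: that shrinking term is the allowance for sampling error, and it is why 50 well-behaved trials can sometimes do the work of hundreds of pass/fail trials.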


[Figure: “Required Characteristic”. Density of the sim output, shown as a Gamma (3-Parameter) distribution (Shape, Scale, Threshold: 400, 16, 10000), plotted with the requirement limit.]

However

Important caveat: you have to conservatively describe the distribution
• Therefore, you may need enough sim trials to produce enough data points to make sure you have the distribution you think you have
• Alternatives (may require a waiver)
    • History
    • Engineering judgment
    • Other Bayesian methods

DON’T just assume a normal distribution!


Methods: See a Statistician

Talk to a statistician if this is the route you want to take

• Conservative distribution fitting
• Correct sample size determination

Dr. White has assembled Excel sample size calculators for a significant number of distributions

Need beta (sic) testers to exercise the methods in real situations using real data

Characterization of β risk given your data may be possible post hoc


Summary

Allow correct risk decisions and avoid RIDs by using correct probabilistic verification language and methods
• Language recommendations available
• Use good M&S design and peer review methods

Physics‐based probabilistic requirements are all about diligence in searching for problems
• Consumer’s risk
• Calculators are available for many pass/fail verifications
• Other methods are available which allow for potentially less resource‐intensive verifications

Verification by variables requires more rigor in verification



Backup

The Search Analogy

In a test or experiment, the engineer wants to be sure the results prove the hypothesis

• Make sure the results aren’t due to chance
• Producer’s (α) risk: probability that what you found is really due to chance
• E.g. “95% confidence” means only α = 5% chance results could have occurred by chance and were not due to the controlled inputs

In accepting a lot of material, ideally you want to minimize your risk of accepting a bad lot
• Want to search diligently for bad material
• Consumer’s (β) risk: probability that you accepted bad material
• E.g. “10% consumer’s risk” means that there is a 10% probability that you would accept a lot of barely rejectable quality given your acceptance sampling plan
• 1-β is called the “power” of the sampling plan

α and β risks are not equivalent


Six Steps

1. Document the requirement, consequences of not meeting the requirements, and causes leading to the consequences.

2. Logic diagram such as a fault tree showing potential causes for not meeting the requirement.

3. Document the rationale supporting the methods (including analysis, software, testing, and inspection) selected to compute probability of failure at various gates of the logic diagram.

4. Compute probabilities at various gates of the logic diagram and results of the completed logic diagram analysis leading to verification of the requirement with specified probability and its associated risk (confidence level).

5. Peer review of the four steps stated above. An independent review shall be performed focusing on critical failure modes and events on the critical path.

6. A verification report should be submitted to the organization responsible for the requirement.
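Step 4 can be illustrated with a very small fault tree. The sketch below assumes the basic events are statistically independent, and every event name and probability in it is invented for the example; a real PRA would take these from the documented analysis in steps 1 to 3.

```python
import math


def or_gate(probabilities):
    """Output event occurs if any input occurs (inputs assumed independent)."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


def and_gate(probabilities):
    """Output event occurs only if all inputs occur (inputs assumed independent)."""
    return math.prod(probabilities)


# Hypothetical basic-event probabilities per mission.
p_sensor_failure = 1.0e-3
p_primary_pump_failure = 1.0e-2
p_backup_pump_failure = 2.0e-2

# Both pumps must fail to lose pumping; either that or a sensor failure
# causes the top event.
p_pumping_lost = and_gate([p_primary_pump_failure, p_backup_pump_failure])
p_top_event = or_gate([p_sensor_failure, p_pumping_lost])

requirement = 1.0 / 200.0  # the "1 in 200" form used in the SR&QA example
print(f"top event probability: {p_top_event:.6f}")
print("meets requirement:", p_top_event <= requirement)
```

This produces a single point value; note 2 for SR&QA requirements in this deck specifies that the probability compared with the requirement be taken as a mean probability, which would call for propagating uncertainty in the basic events as well.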


NASA‐STD‐7009

• M&S Standard development started in May 2005 in response to Diaz Action #4
• The permanent NASA M&S Standard was issued by the NASA Chief Engineer on July 11, 2008 as NASA‐STD‐7009
• Goal: credibility
• Contains requirements for various parts of M&S to achieve that goal
    • Construct a credible M&S
    • Obtain credible output
    • Peer assessment of credibility of M&S as a whole
    • Ensure that the credibility of the results from M&S is clearly and properly conveyed to those making critical decisions


The Notes for SR&QA Reqs

Note 1: See “6‐Step Process” for one satisfactory approach. Note that the "peer review" may generally be performed by the SIG's, Panels, and Engineering Review Boards responsible for the engineering effort related to the requirements being verified. If analysis is performed using Probabilistic Risk Assessment methodology, the analysis shall be performed in accordance with CxP 70017, Constellation Program Probabilistic Risk Assessment (PRA) Methodology Document. The team recommends, however, that this Methodology Document should be reviewed and modified to incorporate the salient features of the “6‐Step Process” if the Probabilistic Risk Assessment Methodology is used for this example and the resulting information be discussed with the design community.

Note 2: LOC and LOM conditions are defined in the glossary. The use of mean probabilities without confidence levels for these classes of requirements is specified to allow for allocation of requirements to subsystems.

