
Page 1

Software Anomaly Trends in JPL Missions

California Institute of Technology
Assurance Technology Program Office

Presented by Allen P. Nikora and Nelson W. Green
Jet Propulsion Laboratory/California Institute of Technology

Page 2

Agenda

• Overview
• Software Failure Intensities
• Software Failures vs. All Failures
• Software Failures by Criticality
• Discussion and Future Work
• Backup Material

Page 3

Overview

• Presentation based on work performed for the Ultra-Reliability¹ (UR) program
  – UR objective: achieve NASA-wide reliability one order of magnitude better than today
  – Definitions
    • Ultra-reliability: given a specific time frame, reliability one order of magnitude better than the current standard
    • Long life: missions with a design lifetime of 20 years or more
  – UR program elements
    • Integrated Systems Health Management with feedback for extremely long-term reliability
    • Reliability Roadmap
    • Software reliability
    • Reliability for extended missions
    • Workshop on Lunar and Mars mission reliability

¹ Ultra-Reliability Integration is a multi-center task funded by NASA OSMA
• Phil Napala – NASA Headquarters S&MA Sponsor
• Charles Barnes – ATPO Program Manager
• Andrew Shapiro – Program Element Manager

Page 4

Overview

Ultra Reliability Phases

[Diagram]

• Pre-Launch/Launch
  – Program planning: area identification
  – Reliability issue identification, mitigation strategies
• Transit
  – Initial task execution
  – Re-evaluation
  – New task identification
  – Re-evaluation
• Orbit/Descent
  – Infrastructure development
  – Strategies for new missions
• Surface
  – Ultra-Reliability by Design

Page 5

Overview

• UR program is NASA-wide:
  – to address different ultra-reliability needs in different NASA Enterprises
  – to leverage the wide variety of expertise across all of NASA
  – to get buy-in and make this a successful program
  – to develop a NASA-wide infrastructure (paramount)
  – to leverage overlapping issues
    • to take advantage of related ongoing NASA tasks
• There is a lead center for each major area, but many centers should participate and be funded in each area
• Metric for leveraging of internal S&MA research
• The development of reliability assessment is a key for success
  – Intelligent, consistent use of existing NASA methods and an opportunity to develop novel ways of assessing reliability

Page 6

Overview

• Results reported are based on work performed for the Software Reliability element of the UR program
  – UR program overall goal: improve the reliability of NASA systems by an order of magnitude
    • Reliability improvement goal includes software components
    • Achieving the goal requires knowledge of software reliability for current and historical missions
• Analyzed space mission software failures observed during mission operations to determine if and how software failure behavior changes from mission to mission.
  – How does software failure intensity change from mission to mission?
  – Does the proportion of anomalies due to software change from mission to mission?
  – Does the proportion of software anomalies associated with a specific criticality level change from mission to mission?

Page 7

Overview

[Figure: Flight and Ground Software Anomalies by Mission, Date]

Legend
• Flight Software Anomaly
• Ground Software Anomaly

The number of points on a given date represents the number of anomalies observed on that date.

• Left box edge represents launch date
• Right box edge represents
  – End of mission
  – Anomaly collection date (for current missions)

Page 8

Software Failure Intensities

• Observed increased software failure intensity during mission operations from mission to mission
• Computing software failure intensity
  – Collect ISAs (Incident/Surprise/Anomaly reports) for planetary missions
  – Identify software anomalies for a given project using the code in the “Cause” field
  – Compute failure intensity = number of failures / mission length
    • Completed missions length: (mission end date) – (mission launch date)
    • Current missions length: (ISA data collection date) – (mission launch date)
  – Flight and ground software failure intensities computed separately
    • Flight and ground software may be of different mission criticality
    • Different structural characteristics
    • Different development practices
  – Applied T4253H smoother to remove noise in anomaly data. Sources of noise include:
    • More thorough recording of failures for one mission than for another
    • Different skill and experience levels in different operations teams
    • Incorrect identification of anomaly cause (e.g., SW failure labeled as non-SW)
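The failure intensity computation described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function names, the choice of days as the time unit, and the dates in the usage note are all assumptions, since the slides do not state them.

```python
from datetime import date
from typing import Optional


def mission_length_days(launch: date,
                        end: Optional[date] = None,
                        collection: Optional[date] = None) -> int:
    """Mission length in days (the time unit is an assumption).

    Completed missions: (mission end date) - (mission launch date).
    Current missions:   (ISA data collection date) - (mission launch date).
    """
    stop = end if end is not None else collection
    if stop is None:
        raise ValueError("need a mission end date or an ISA collection date")
    length = (stop - launch).days
    if length <= 0:
        raise ValueError("mission length must be positive")
    return length


def failure_intensity(n_failures: int,
                      launch: date,
                      end: Optional[date] = None,
                      collection: Optional[date] = None) -> float:
    """Failure intensity = number of failures / mission length."""
    return n_failures / mission_length_days(launch, end, collection)
```

For a hypothetical completed mission with 10 software ISAs over 10 days, `failure_intensity(10, date(2000, 1, 1), end=date(2000, 1, 11))` gives 1.0 failures per day. Flight and ground software would each get their own count, since the slide computes the two intensities separately.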

Page 9

Software Failure Intensities

[Figure: software failure intensities by mission, raw data and T4253H-smoothed data. X axis: mission name in launch order (Mars Global Surveyor, Mars Pathfinder, Cassini, Mars Climate Orbiter, Mars Polar Lander, Stardust, Mars Odyssey, Genesis, Mars Exploration Rover, Deep Impact, Mars Reconnaissance Orbiter).]

Additional FSW failure intensities: see Backup Material.

Page 10

Software Failure Intensities

[Figure: software failure intensities by mission, raw data and T4253H-smoothed data. X axis: mission name in launch order (Mars Global Surveyor, Mars Pathfinder, Cassini, Mars Climate Orbiter, Mars Polar Lander, Stardust, Mars Odyssey, Genesis, Mars Exploration Rover, Deep Impact, Mars Reconnaissance Orbiter).]

Additional GSW failure intensities: see Backup Material.

Page 11

Software Failure Intensities

• Analysis indicates that failure intensities are increasing at a greater than linear rate from mission to mission.
  – New techniques to achieve the UR program reliability goal may need to be developed
  – Estimated failure intensity may be low. Detailed analysis of a small sample of ISAs from one project indicates that the number of SW ISAs may be undercounted by at least a factor of 2.
• Work underway to identify software/mission/development process characteristics associated with increasing failure intensity
  – Budget
  – Schedule
  – Mission complexity
  – Staffing/effort
  – In-house vs. subcontracted
  – Avionics complexity
  – Executable image size

Page 12

Software Failures vs. All Failures

• Analyzed SW ISAs for projects identified on slides 9 and 10 to determine trends in the proportion of SW anomalies to all anomalies.
• Results
  – Software anomalies represent an increasing proportion of mission anomalies
    • Increase in the proportion of anomalies due to SW between 1996 and 2003, especially ground software (next slide)
    • Overall increase in the proportion of anomalies due to SW for Mars missions, rising to nearly 70% (slide 14)
    • No trend apparent in the proportion of SW anomalies from 2003 to present (next slide)
  – Discrepancy between the proportions in slides 13 and 14
    • Different techniques used to identify SW anomalies: “Cause” field vs. detailed analysis of “Description” and “Corrective Action” fields
      – Inconsistent representation may indicate issues with problem reporting practices
    • Partial, but not complete, overlap between the missions analyzed for slide 13 and slide 14
    • Different computation of proportions: cumulative for slide 13, mission-by-mission for slide 14
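One of the listed reasons for the discrepancy, cumulative versus mission-by-mission computation, is easy to see with a small sketch. The function names and the anomaly counts below are made up for illustration and are not taken from the ISA data.

```python
from itertools import accumulate
from typing import List, Tuple

# Each entry is (software anomalies, all anomalies) for one mission,
# listed in launch order. The numbers are hypothetical.
Counts = List[Tuple[int, int]]


def per_mission_proportions(counts: Counts) -> List[float]:
    """Proportion of anomalies due to SW, computed mission by mission."""
    return [sw / total for sw, total in counts]


def running_proportions(counts: Counts) -> List[float]:
    """Cumulative proportion: all SW anomalies so far / all anomalies so far."""
    sw_so_far = accumulate(sw for sw, _ in counts)
    all_so_far = accumulate(total for _, total in counts)
    return [s / t for s, t in zip(sw_so_far, all_so_far)]


example: Counts = [(1, 10), (3, 10), (6, 10)]
by_mission = per_mission_proportions(example)   # rises to 0.6
cumulative = running_proportions(example)       # rises only to 10/30
```

With the same underlying counts, the mission-by-mission series ends at 0.6 while the cumulative series ends at about 0.33, because the cumulative figure is weighed down by earlier missions. A rising trend therefore looks much flatter when plotted as a running proportion.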

Page 13

Software Failures vs. All Failures

Proportion of SW to Non-SW ISAs – Running Average (Planetary Missions - Post Mars Observer)

Mission Name              Launch Date
Mars Global Surveyor      11/7/1996
Mars Pathfinder           12/4/1996
Cassini                   10/15/1997
Mars Climate Orbiter      12/11/1998
Mars Polar Lander         1/3/1999
Stardust                  2/7/1999
QuikScat                  6/19/1999
MISR                      12/18/1999
Acrimsat                  12/22/1999
Mars Odyssey              4/7/2001
Genesis                   8/8/2001
Jason                     12/7/2001
AIRS                      5/4/2002
GALEX                     4/28/2003
Mars Exploration Rover    6/10/2003
EMLS                      7/15/2004
TES                       7/15/2004
Deep Impact               1/12/2005
MRO                       8/12/2005

[Chart: Running Proportions - SW to Non-SW: Counted by Analysis of Cause Field. X axis: Incident Date, 01/01/1995 to 01/01/2007. Y axis: Proportion of ISAs due to SW, -0.05 to 0.4. Series: FSW, GSW, Total SW, Launches.]

Page 14

Software Failures vs. All Failures

[Figure: Proportion of All Anomalies due to SW for Selected Missions (raw data and smoothed). X axis: Mars Observer, Mars Global Surveyor, Mars Pathfinder, Mars Climate Orbiter, Mars Polar Lander, Mars Odyssey, Mars Exploration Rover.]

Adapted from results presented in “Anomaly Trends for Robotic Missions to Mars: Implications for Mission Reliability”, N. Green, A. Hoffman, T. Schow and H. Garrett, 44th AIAA Aerospace Sciences Meeting and Exhibit, Reno, Nevada, Jan. 9-12, 2006.

Only anomalies after launch and before MOI are included in this plot.

Page 15

Software Failures by Criticality

• Analyzed SW ISAs for projects identified on slides 9 and 10 to determine trends in the proportions of different criticality levels for SW anomalies.
• Results
  – FSW
    • Small decrease in Criticality 2 anomalies
    • Increase in Criticality 1 anomalies from ~5% to ~10%
    • Small increase in Criticality 3 anomalies
  – GSW
    • Significant decrease in proportion of Criticality 2 anomalies
    • No trend in Criticality 1 anomalies
    • Significant increase in proportion of Criticality 3 anomalies

Page 16

Software Failures by Criticality

Running Proportion of FSW ISAs by Criticality (Planetary Missions - Post Mars Observer)

[Chart: FSW ISAs - Running Proportion by Criticality. X axis: Incident Date, 01/01/1998 to 01/01/2007. Y axis: Running Proportion, 0 to 1. Series: Red Flag, Crit 1, Crit 2, Crit 3, Crit 4, No Crit Value.]

Page 17

Software Failures by Criticality

Running Proportion of GSW ISAs by Criticality (Planetary Missions - Post Mars Observer)

[Chart: GSW ISAs - Running Proportion by Criticality. X axis: Incident Date, 01/01/1998 to 01/01/2007. Y axis: Running Proportion, 0 to 1. Series: Red Flag, Crit 1, Crit 2, Crit 3, Crit 4, No Crit Value.]

Page 18

Discussion and Future Work

• Apparent increase in SW failure intensities and in the proportion of SW anomalies
  – Potential to affect future mission operations
    • Reduced science return
      – Missed observation opportunities
      – Damage to instruments
    • Increased effort required for
      – Contingency planning
      – Recovering from anomalies
  – Additional analysis in progress to verify trends and check the accuracy of estimated failure intensities
    • Detailed analysis of anomaly descriptions, anomaly verification, and corrective action descriptions from ISAs

Page 19

Discussion and Future Work

• Future Work
  – Identify relationships between the observed increase in failure intensities/proportion of failures due to SW and measurable characteristics of the software/mission/development process
    • Budget
    • Schedule
    • Mission complexity
    • Staffing/effort
    • In-house vs. subcontracted
    • Avionics complexity
    • Executable image size
  – Determine whether there are relationships between the numbers and types of SW failures observed during development testing and the SW failures observed during launch
  – Identify trends in the effort required to deal with SW anomalies
  – Monitor current/future missions to determine whether trends continue
  – Resolve discrepancies between the results reported on slide 13 and slide 14.
    • Different techniques used to identify software anomalies
      – Detailed analysis of description, verification, and corrective action, vs.
      – Identification via the “Cause” field in the ISA.
    • Indicates that problem reporting procedures may need to be modified to accurately identify SW anomalies.

Page 20

Backup Material

Page 21

T4253H Smoothing

• Description from the help system for SPSS 13.0
  – The smoother starts with a running median of 4, which is centered by a running median of 2. It then resmooths these values by applying a running median of 5, a running median of 3, and hanning (running weighted averages). Residuals are computed by subtracting the smoothed series from the original series. This whole process is then repeated on the computed residuals. Finally, the smoothed residuals are computed by subtracting the smoothed values obtained the first time through the process.
• References
  – P. F. Velleman, “Definition and Comparison of Robust Nonlinear Data Smoothing Algorithms,” Journal of the American Statistical Association, vol. 75, September 1980, pp. 609-615.
  – P. F. Velleman and D. C. Hoaglin, Applications, Basics, and Computing of Exploratory Data Analysis, Boston: Duxbury Press, 1981.
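The smoothing steps described above can be sketched in Python as follows. This is a simplified illustration and not the SPSS implementation. The end-point handling is an assumption throughout (end values are repeated for the median-of-4 step, windows shrink at the ends for the odd medians, and hanning leaves the two end values unchanged). The final step adds the smoothed residuals back to the first-pass values, which is the usual "twicing" step for this family of smoothers. All function names are hypothetical.

```python
from statistics import median
from typing import List, Sequence


def median4_then_2(xs: Sequence[float]) -> List[float]:
    """Running median of 4, re-centered by a running median (mean) of 2.

    Assumption: the series is padded by repeating each end value twice,
    so the output has the same length as the input.
    """
    padded = [xs[0]] * 2 + list(xs) + [xs[-1]] * 2
    m4 = [median(padded[j:j + 4]) for j in range(len(padded) - 3)]
    return [(m4[j] + m4[j + 1]) / 2 for j in range(len(m4) - 1)]


def running_median(xs: Sequence[float], span: int) -> List[float]:
    """Centered running median of odd span; windows shrink at the ends."""
    half = span // 2
    n = len(xs)
    return [median(xs[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def hanning(xs: Sequence[float]) -> List[float]:
    """Running weighted average with weights 1/4, 1/2, 1/4; ends copied."""
    out = list(xs)
    for i in range(1, len(xs) - 1):
        out[i] = 0.25 * xs[i - 1] + 0.5 * xs[i] + 0.25 * xs[i + 1]
    return out


def smooth_4253h(xs: Sequence[float]) -> List[float]:
    """One pass: medians of 4 (centered by 2), 5, and 3, then hanning."""
    return hanning(running_median(running_median(median4_then_2(xs), 5), 3))


def t4253h(xs: Sequence[float]) -> List[float]:
    """Smooth, smooth the residuals, and add them back to the first pass."""
    first = smooth_4253h(xs)
    residuals = [x - s for x, s in zip(xs, first)]
    return [s + r for s, r in zip(first, smooth_4253h(residuals))]
```

Because the steps are median-based, an isolated spike is removed entirely: `t4253h([1, 1, 1, 1, 100, 1, 1, 1, 1])` returns a series of 1.0 values under these end-point assumptions.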

Page 22

Software Failure Intensities
Planetary Missions Flight Software From Voyager to MRO

[Figure: flight software failure intensities by mission, raw data and T4253H-smoothed data. X axis: mission name in launch order (Voyager, Galileo, Ulysses, Mars Global Surveyor, Mars Pathfinder, Cassini, Mars Climate Orbiter, Mars Polar Lander, Stardust, Mars Odyssey, Genesis, Mars Exploration Rover, Deep Impact, Mars Reconnaissance Orbiter).]

Page 23

Software Failure Intensities
Planetary Missions Ground Software From Voyager to MRO

[Figure: ground software failure intensities by mission, raw data and T4253H-smoothed data. X axis: mission name in launch order (Voyager, Galileo, Ulysses, Mars Global Surveyor, Mars Pathfinder, Cassini, Mars Climate Orbiter, Mars Polar Lander, Stardust, Mars Odyssey, Genesis, Mars Exploration Rover, Deep Impact, Mars Reconnaissance Orbiter).]

Page 24

Software Failure Intensities
Analysis Summary

• Conducted curve fit analysis with SPSS 13.0 to determine whether failure intensities were increasing, decreasing, or showed no trends.
  – Best-fit curve for all data sets indicates super-linear growth in failure intensities.
    • Cubic curve with adjusted R² 0.7
  – 11 curves fitted to data
    • Compound
    • Cubic
    • Exponential
    • Growth
    • Inverse
    • Linear
    • Logarithmic
    • Logistic
    • Power
    • Quadratic
    • S-shaped
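For readers unfamiliar with the adjusted R² statistic quoted for the cubic fit, the sketch below computes it from observed and fitted values using the standard textbook definition. Whether SPSS 13.0 reports exactly this quantity for its curve estimation models is an assumption, and the function name is hypothetical. A cubic model has three predictor terms (t, t², t³).

```python
from typing import Sequence


def adjusted_r2(y: Sequence[float],
                y_hat: Sequence[float],
                n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    y            observed values (e.g. failure intensity per mission)
    y_hat        values predicted by the fitted curve
    n_predictors number of predictor terms p (3 for a cubic)
    """
    n = len(y)
    if n - n_predictors - 1 <= 0:
        raise ValueError("too few observations for this many predictors")
    mean_y = sum(y) / n
    ss_res = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
```

The adjustment penalizes extra terms, which matters when comparing a cubic against simpler candidates such as linear or quadratic on a small number of missions. A fit that explains nothing scores below zero, for example -1.5 for six observations and three predictors.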