05/05/23 1
California Institute of Technology
Software Anomaly Trends in JPL Missions
Assurance Technology Program Office
Presented by
Allen P. Nikora
Nelson W. Green
Jet Propulsion Laboratory/California Institute of Technology
Slide 2
Agenda
• Overview
• Software Failure Intensities
• Software Failures vs. All Failures
• Software Failures by Criticality
• Discussion and Future Work
• Backup Material
Slide 3
Overview
• Presentation based on work performed for the Ultra-Reliability¹ (UR) program
  – UR objective: achieve NASA-wide reliability one order of magnitude better than today
  – Definitions
    • Ultra-reliability
      – Given a specific time frame, reliability one order of magnitude greater than the current standard
    • Long life
      – Missions with a design lifetime of 20 years or more
  – UR program elements
    • Integrated Systems Health Management with feedback for extremely long-term reliability
    • Reliability roadmap
    • Software reliability
    • Reliability for extended missions
    • Workshop on lunar and Mars mission reliability
¹ Ultra-Reliability Integration is a multi-center task funded by NASA OSMA
• Phil Napala – NASA Headquarters S&MA Sponsor
• Charles Barnes – ATPO Program Manager
• Andrew Shapiro – Program Element Manager
Slide 4
Overview
Ultra Reliability Phases
• Pre-Launch/Launch
  – Program Planning: Area Identification
  – Reliability Issue Identification, Mitigation Strategies
• Transit
  – Initial Task Execution
  – Re-evaluation
  – New Task Identification
  – Re-evaluation
• Orbit/Descent
  – Infrastructure Development
  – Strategies for New Missions
• Surface
  – Ultra-Reliability by Design
Slide 5
Overview
• UR program is NASA-wide:
  – to address different ultra-reliability needs in different NASA Enterprises
  – to leverage the wide variety of expertise across all of NASA
  – to get buy-in and make this a successful program
  – to develop a NASA-wide infrastructure (paramount)
  – to leverage overlapping issues
    • to take advantage of related on-going NASA tasks
• There is a lead center for each major area, but many centers should participate and be funded in each area
• Metric for leveraging of internal S&MA research
• The development of reliability assessment is a key for success
  – Intelligent, consistent use of existing NASA methods and an opportunity to develop novel ways of assessing reliability
Slide 6
Overview
• Results reported are based on work performed for the Software Reliability element of the UR program
  – UR program overall goal: improve the reliability of NASA systems by an order of magnitude
    • Reliability improvement goal includes software components
    • Achieving goal requires knowledge of software reliability for current and historical missions
• Analyzed space mission software failures observed during mission operations to determine if and how software failure behavior changes from mission to mission.
  – How does software failure intensity change from mission to mission?
  – Does the proportion of anomalies due to software change from mission to mission?
  – Does the proportion of software anomalies associated with a specific criticality level change from mission to mission?
Slide 7
Overview
[Figure: Flight and Ground Software Anomalies by Mission, Date]
Legend
• Flight Software Anomaly
• Ground Software Anomaly
The number of points on a given date represents the number of anomalies observed on that date
• Left box edge represents launch date
• Right box edge represents
  • End of mission
  • Anomaly collection date (for current missions)
Slide 8
Software Failure Intensities
• Observed increased software failure intensity during mission operations from mission to mission
• Computing software failure intensity
  – Collect ISAs for planetary missions
  – Identify software anomalies for a given project using code in “Cause” field
  – Compute failure intensity = number of failures/mission length
    • Completed missions length: (mission end date) - (mission launch date)
    • Current missions length: (ISA data collection date) - (mission launch date)
  – Flight and ground software failure intensities computed separately
    • Flight and ground software may be of different mission criticality
    • Different structural characteristics
    • Different development practices
  – Applied T4253H smoother to remove noise in anomaly data
    • More thorough recording of failures for one mission than for another
    • Different skill, experience levels in different operations teams
    • Incorrect identification of anomaly cause (e.g., SW failure labeled as non-SW)
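The failure intensity calculation on this slide can be sketched in Python. This is a minimal illustration, not the authors' tooling: the function names, the choice of days as the time unit, and the anomaly counts are all assumptions made for the example.

```python
from datetime import date

def mission_length_days(launch, end):
    """Mission length in days (assumed unit).

    `end` is the mission end date for a completed mission, or the ISA data
    collection date for a mission that is still operating.
    """
    length = (end - launch).days
    if length <= 0:
        raise ValueError("end date must be after launch date")
    return length

def failure_intensity(n_failures, launch, end):
    """Failure intensity = number of failures / mission length."""
    return n_failures / mission_length_days(launch, end)

# Hypothetical mission and counts, not taken from any ISA database.
launch = date(2001, 1, 1)
end = date(2001, 4, 11)  # 100 days after launch

# Flight and ground software are computed separately, as on the slide.
fsw_intensity = failure_intensity(5, launch, end)
gsw_intensity = failure_intensity(20, launch, end)
print(fsw_intensity, gsw_intensity)  # 0.05 0.2
```

Keeping the end date as a parameter lets the same function serve completed and current missions; only the caller decides which date applies.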
Slide 9
Software Failure Intensities: Smoothed Data
[Figure: FSW failure intensities by mission name (in launch order). Series: Raw Data, T4253H Smoothed. Missions: Mars Global Surveyor, Pathfinder, CASSINI, Mars Climate Orbiter, Mars Polar Lander, Stardust, Mars Odyssey, Genesis, Mars Exploration Rover, Deep Impact, Mars Reconnaissance Orbiter]
Additional FSW failure intensities: see Backup Material
Slide 10
Software Failure Intensities: Smoothed Data
[Figure: GSW failure intensities by mission name (in launch order). Series: Raw Data, T4253H Smoothed. Missions: Mars Global Surveyor, Pathfinder, CASSINI, Mars Climate Orbiter, Mars Polar Lander, Stardust, Mars Odyssey, Genesis, Mars Exploration Rover, Deep Impact, Mars Reconnaissance Orbiter]
Additional GSW failure intensities: see Backup Material
Slide 11
Software Failure Intensities
• Analysis indicates that failure intensities are increasing at a greater than linear rate from mission to mission.
  – New techniques to achieve UR program reliability goal may need to be developed
  – Estimated failure intensity may be low. Detailed analysis of small sample of ISAs from one project indicates that number of SW ISAs may be undercounted by at least a factor of 2.
• Work underway to identify software/mission/development process characteristics associated with increasing failure intensity
  – Budget
  – Schedule
  – Mission complexity
  – Staffing/effort
  – In-house vs. subcontracted
  – Avionics complexity
  – Executable image size
Slide 12
Software Failures vs. All Failures
• Analyzed SW ISAs for projects identified on slides 9 and 10 to determine trends in the proportion of SW anomalies to all anomalies.
• Results
  – Software anomalies represent an increasing proportion of mission anomalies
    • Increase in the proportion of anomalies due to SW (slide 13) between 1996 and 2003 (especially ground software)
    • Overall increase in proportion of anomalies due to SW for Mars missions (slide 14), rising to nearly 70%.
    • No trend apparent in proportion of SW anomalies from 2003 to present (slide 13)
  – Discrepancy between proportions in slides 13 and 14
    • Different techniques used to identify SW anomalies: “Cause” field vs. detailed analysis of “Description” and “Corrective Action” fields.
      – Inconsistent representation may indicate issues with problem reporting practices
    • Partial, but not complete, overlap between missions analyzed for slide 13 and slide 14.
    • Different computation of proportions: cumulative for slide 13, mission-by-mission for slide 14.
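The difference between a cumulative proportion and a mission-by-mission proportion can be illustrated with a short Python sketch. The counts below are hypothetical, the function names are invented for the example, and grouping the cumulative figure by mission (rather than by incident date) is a simplification.

```python
def per_mission_proportions(counts):
    """Proportion of SW anomalies computed separately for each mission.

    `counts` is a list of (sw_anomalies, all_anomalies) pairs in launch order.
    """
    return [sw / total for sw, total in counts]

def cumulative_proportions(counts):
    """Proportion of SW anomalies over all missions seen so far."""
    out = []
    sw_sum = 0
    total_sum = 0
    for sw, total in counts:
        sw_sum += sw
        total_sum += total
        out.append(sw_sum / total_sum)
    return out

# Hypothetical counts for three missions, in launch order.
counts = [(1, 10), (3, 10), (6, 10)]

print(per_mission_proportions(counts))  # last mission alone: 0.6
print(cumulative_proportions(counts))   # all missions pooled: about 0.33
```

With these numbers the last mission alone shows 60% SW anomalies while the pooled figure is about 33%, so the two plots can disagree even when they draw on the same anomalies.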
Slide 13
Software Failures vs. All Failures
Proportion of SW to Non-SW ISAs – Running Average (Planetary Missions - Post Mars Observer)
Mission Name              Launch Date
Mars Global Surveyor      11/7/1996
Mars Pathfinder           12/4/1996
Cassini                   10/15/1997
Mars Climate Orbiter      12/11/1998
Mars Polar Lander         1/3/1999
Stardust                  2/7/1999
QuikScat                  6/19/1999
MISR                      12/18/1999
Acrimsat                  12/22/1999
Mars Odyssey              4/7/2001
Genesis                   8/8/2001
Jason                     12/7/2001
AIRS                      5/4/2002
GALEX                     4/28/2003
Mars Exploration Rover    6/10/2003
EMLS                      7/15/2004
TES                       7/15/2004
Deep Impact               1/12/2005
MRO                       8/12/2005
[Figure: Running Proportions - SW to Non-SW: Counted by Analysis of Cause Field. X axis: Incident Date (01/01/1995 to 01/01/2007). Y axis: Proportion of ISAs due to SW (-0.05 to 0.4). Series: FSW, GSW, Total SW, Launches]
Slide 14
Software Failures vs. All Failures: Smoothed Data
[Figure: Proportion of All Anomalies due to SW for Selected Missions (raw data and smoothed). Missions: Mars Observer, Mars Global Surveyor, Mars Pathfinder, Mars Climate Orbiter, Mars Polar Lander, Mars Odyssey, Mars Exploration Rover]
Adapted from results presented in “Anomaly Trends for Robotic Missions to Mars: Implications for Mission Reliability”, N. Green, A. Hoffman, T. Schow and H. Garrett, 44th AIAA Aerospace Sciences Meeting and Exhibit, Reno, Nevada, Jan. 9-12, 2006
Only anomalies after launch and before MOI are included in this plot
Slide 15
Software Failures by Criticality
• Analyzed SW ISAs for projects identified on slides 9 and 10 to determine trends in the proportions of different criticality levels for SW anomalies.
• Results
  – FSW
    • Small decrease in Criticality 2 anomalies
    • Increase in Criticality 1 anomalies from ~5% to ~10%
    • Small increase in Criticality 3 anomalies
  – GSW
    • Significant decrease in proportion of Criticality 2 anomalies
    • No trend in Criticality 1 anomalies
    • Significant increase in proportion of Criticality 3 anomalies
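A running proportion by criticality, as plotted on the two slides that follow, could be computed along these lines. This is a sketch under assumptions: the function name, the label strings, and the sample records are hypothetical, and the ISAs are assumed to be already sorted by incident date.

```python
from collections import Counter

def running_proportions(criticalities):
    """After each ISA, return the share of each criticality level seen so far.

    `criticalities` is a sequence of labels in incident-date order.
    """
    seen = Counter()
    history = []
    for n, crit in enumerate(criticalities, start=1):
        seen[crit] += 1
        history.append({level: count / n for level, count in seen.items()})
    return history

# Hypothetical ISA criticality labels, in incident-date order.
isas = ["Crit 2", "Crit 3", "Crit 3", "Crit 1"]

for step in running_proportions(isas):
    print(step)
```

After the fourth record the shares are 0.25 for Crit 2, 0.5 for Crit 3, and 0.25 for Crit 1; early values swing widely because each new ISA carries a large weight until the counts build up.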
Slide 16
Software Failures by Criticality
Running Proportion of FSW ISAs by Criticality (Planetary Missions - Post Mars Observer)
[Figure: FSW ISAs - Running Proportion by Criticality. X axis: Incident Date (01/01/1998 to 01/01/2007). Y axis: Running Proportion (0 to 1). Series: Red Flag, Crit 1, Crit 2, Crit 3, Crit 4, No Crit Value]
Slide 17
Software Failures by Criticality
Running Proportion of GSW ISAs by Criticality (Planetary Missions - Post Mars Observer)
[Figure: GSW ISAs - Running Proportion by Criticality. X axis: Incident Date (01/01/1998 to 01/01/2007). Y axis: Running Proportion (0 to 1). Series: Red Flag, Crit 1, Crit 2, Crit 3, Crit 4, No Crit Value]
Slide 18
Discussion and Future Work
• Apparent increase in SW Failure Intensities, Proportion of SW Anomalies
  – Potential to affect future mission operations
    • Reduced science return
      – Missed observation opportunities
      – Damage to instruments
    • Increased effort required for
      – Contingency planning
      – Recovering from anomalies
  – Additional analysis in progress to verify trends, check accuracy of estimated failure intensities
    • Detailed analysis of anomaly descriptions, anomaly verification, and corrective action descriptions from ISAs
Slide 19
Discussion and Future Work
• Future Work
  – Identify relationships between observed increase in failure intensities/proportion of failures due to SW and measurable characteristics of software/mission/development process
    • Budget
    • Schedule
    • Mission complexity
    • Staffing/effort
    • In-house vs. subcontracted
    • Avionics complexity
    • Executable image size
  – Determine whether there are relationships between numbers and types of SW failures observed during development testing and SW failures observed during launch
  – Identify trends in effort required to deal with SW anomalies
  – Monitor current/future missions to determine whether trends continue
  – Resolve discrepancies between results reported on slide 13 and slide 14.
    • Different techniques used to identify software anomalies
      – Detailed analysis of description, verification, and corrective action vs.
      – Identification via “Cause” field in ISA.
    • Indicates that problem reporting procedures may need to be modified to accurately identify SW anomalies.
Slide 20
Backup Material
Slide 21
T4253H Smoothing
• Description from help system for SPSS 13.0
  – The smoother starts with a running median of 4, which is centered by a running median of 2. It then resmoothes these values by applying a running median of 5, a running median of 3, and hanning (running weighted averages). Residuals are computed by subtracting the smoothed series from the original series. This whole process is then repeated on the computed residuals. Finally, the smoothed residuals are computed by subtracting the smoothed values obtained the first time through the process.
• References
  – P. F. Velleman, “Definition and Comparison of Robust Nonlinear Data Smoothing Algorithms,” Journal of the American Statistical Association, vol. 75, September 1980, pp. 609-615.
  – P. F. Velleman and D. C. Hoaglin, Applications, Basics, and Computing of Exploratory Data Analysis, Boston: Duxbury Press, 1981.
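The sequence of passes described above can be sketched in Python. This is an illustrative approximation, not the SPSS implementation. Two points are assumptions: end values are handled by repeating the first and last points, and the final step is read as the usual "twicing" step, in which the smoothed residuals are added back to the first-pass smooth. Results near the ends of a series may therefore differ from SPSS output.

```python
from statistics import median

def _pad(xs, k):
    """Repeat the first and last values k times (assumed end-point rule)."""
    return [xs[0]] * k + list(xs) + [xs[-1]] * k

def _running(xs, width, func):
    """Apply func to every full window of the given width."""
    return [func(xs[i:i + width]) for i in range(len(xs) - width + 1)]

def _hann(window):
    """Hanning: weighted average with weights 1/4, 1/2, 1/4."""
    a, b, c = window
    return 0.25 * a + 0.5 * b + 0.25 * c

def smooth_4253h_once(xs):
    """One pass: medians of 4 then 2, medians of 5 and 3, then hanning."""
    s = _running(_pad(xs, 2), 4, median)  # n + 1 values at half positions
    s = _running(s, 2, median)            # recentre: back to n values
    s = _running(_pad(s, 2), 5, median)
    s = _running(_pad(s, 1), 3, median)
    return _running(_pad(s, 1), 3, _hann)

def smooth_4253h_twice(xs):
    """Smooth, smooth the residuals, and add the two results (assumed twicing)."""
    first = smooth_4253h_once(xs)
    residuals = [x - s for x, s in zip(xs, first)]
    smoothed_residuals = smooth_4253h_once(residuals)
    return [s + r for s, r in zip(first, smoothed_residuals)]

# Hypothetical series with one isolated spike.
series = [1, 1, 1, 9, 1, 1, 1]
print(smooth_4253h_twice(series))  # the spike is removed: all values are 1.0
```

Running medians make the smoother resistant to isolated outliers, which matches the stated purpose of removing noise from the anomaly data.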
Slide 22
Software Failure Intensities: Planetary Missions Flight Software From Voyager to MRO
[Figure: failure intensities by mission name (in launch order). Series: Raw Data, T4253H Smoothed. Missions: Voyager, GALILEO, ULYSSES, Mars Global Surveyor, Pathfinder, CASSINI, Mars Climate Orbiter, Mars Polar Lander, Stardust, Mars Odyssey, Genesis, Mars Exploration Rover, Deep Impact, Mars Recon. Orbiter]
Slide 23
Software Failure Intensities: Planetary Missions Ground Software From Voyager to MRO
[Figure: failure intensities by mission name (in launch order). Series: Raw Data, T4253H Smoothed. Missions: Voyager, GALILEO, ULYSSES, Mars Global Surveyor, Pathfinder, CASSINI, Mars Climate Orbiter, Mars Polar Lander, Stardust, Mars Odyssey, Genesis, Mars Exploration Rover, Deep Impact, Mars Recon. Orbiter]
Slide 24
Software Failure Intensities: Analysis Summary
• Conducted Curve Fit Analysis with SPSS 13.0 to determine whether failure intensities were increasing, decreasing, or showed no trends.
  – Best-fit curve for all data sets indicates super-linear growth in failure intensities.
    • Cubic curve with adjusted R² 0.7
  – 11 curves fitted to data
    • Compound
    • Cubic
    • Exponential
    • Growth
    • Inverse
    • Linear
    • Logarithmic
    • Logistic
    • Power
    • Quadratic
    • S-shaped
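Comparing candidate curves by adjusted R² can be illustrated with a small pure-Python sketch. It covers only the polynomial models (linear, quadratic, cubic), not all 11 SPSS curve types, and it is not the SPSS procedure. The function names and data are hypothetical; the sample values grow cubically by construction.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns c[0] + c[1]*x + ... coefficients."""
    m = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            factor = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs

def adjusted_r2(xs, ys, coeffs):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1), p = polynomial degree."""
    n, p = len(ys), len(coeffs) - 1
    mean = sum(ys) / n
    fitted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Hypothetical failure intensities indexed by mission launch order.
xs = list(range(1, 9))
ys = [x ** 3 for x in xs]

for name, degree in [("linear", 1), ("quadratic", 2), ("cubic", 3)]:
    print(name, round(adjusted_r2(xs, ys, polyfit(xs, ys, degree)), 4))
```

Adjusted R² penalizes extra parameters, so a cubic is preferred over a linear fit only when the added terms explain enough additional variance.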