Mars Climate Orbiter
Kevin Henry, Rory MacKenzie
Introduction
• Launched in Dec 1998 as part of the Mars Surveyor Program
• Objective was to enter orbit around Mars and collect scientific data
• Crashed on entry to atmosphere in Sep 1999
• Metric mix-up meant Orbiter entered atmosphere at wrong altitude
• Mishap Investigation Board issued its Phase I report only six weeks later
• Second report followed in March 2000
• Mishap blamed on miscommunication and poor project management
Mission Overview
• Orbiter carried two instruments:
– Pressure Modulator Infrared Radiometer (PMIRR)
– Mars Colour Imager (MARCI)
• Science objectives:
– Monitor daily weather and atmospheric conditions
– Record surface changes due to wind and other effects
– Determine temperature profiles
– Monitor water vapor and dust content
– Look for evidence of past climate change
Mission Overview cont...
• Science mission to last 2 years, then act as relay station for 5 years
• Data relay station would be used by Mars Polar Lander and future Mars missions
Mission Timeline
• Expected Timeline:
– 1993: Mars Surveyor Program is launched
– 1995: Mars Surveyor Project ’98 missions are identified
– Dec 11, 1998: Launch
– September 23, 1999: Mars Orbit Insertion
– September 27, 1999: Mars Aerobraking Begins
– November 10, 1999: Mars Aerobraking Ends
– December 1, 1999: Transfer to Mapping Orbit
– December 3, 1999: Mars Polar Lander Support
– March 3, 2000: Mars Mapping Begins
– January 15, 2002: Mars Relay Mission Begins
– December 1, 2004: End of Primary Mission
• Projected cost: $327.6 Million for MCO and MPL
Thrusters, and the crash
• Thrusters used by the spacecraft to perform trajectory adjustments
• 4 thruster manoeuvres planned during the flight of the MCO
• Trajectory Correction Maneuver-4 executed as planned on Sep 15, 1999
• Mars Orbit Insertion planned for Sep 23
• Signal lost at 09:04:52, earlier than expected, and never reacquired
• Ground software feeding the trajectory models reported thruster impulse in English units of pound-seconds rather than the Metric units of newton-seconds the models expected
• Effect on spacecraft trajectory underestimated by a factor of 4.45
• Altitude for entry was 57 km instead of 220 km
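The factor of 4.45 is the conversion between pound-force and newtons. A minimal Python sketch of that arithmetic follows. The function names and the sample impulse value are illustrative assumptions and are not taken from the MCO flight or ground software.

```python
# Illustrative sketch only: names and sample values are assumptions,
# not taken from the actual MCO ground software.

# 1 pound-force is defined as exactly 4.4482216152605 newtons.
LBF_TO_N = 4.4482216152605


def lbf_s_to_n_s(impulse_lbf_s: float) -> float:
    """Convert an impulse from pound-force seconds to newton-seconds."""
    return impulse_lbf_s * LBF_TO_N


def underestimation_factor(impulse_lbf_s: float) -> float:
    """Ratio of the true impulse (in N·s) to the value obtained by wrongly
    reading the pound-force-second number as newton-seconds."""
    true_n_s = lbf_s_to_n_s(impulse_lbf_s)
    misread_n_s = impulse_lbf_s  # same number, wrong unit label
    return true_n_s / misread_n_s


if __name__ == "__main__":
    sample = 2.0  # hypothetical thruster impulse in lbf·s
    print(f"true impulse:  {lbf_s_to_n_s(sample):.3f} N·s")
    print(f"misread as:    {sample:.3f} N·s")
    print(f"underestimate: x{underestimation_factor(sample):.2f}")
```

Because the same number is read under the wrong unit label, the ratio is the same for every thruster event, which is why the error accumulated steadily rather than appearing as a single outlier.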
Companies Involved
• Jet Propulsion Laboratory (JPL) of California
– Lead flight centre
• Lockheed Martin Astronautics (LMA) of Denver, Colorado
– Prime contractor
– Design and development of spacecraft
– Flight system integration and testing
– Supporting launch operations
• Mars Surveyor Operations Project
– Created by JPL
– Responsible for MCO and MPL flight operations
Mishap Investigation Board
• Phase I Report released Nov 10 1999
• Focuses on issues that must be resolved before Mars Polar Lander (MPL) reaches the Martian surface
• Purpose: Determine root causes and contributing factors
• Recommendations to improve MPL operations
• Meetings conducted at Jet Propulsion Lab with members of JPL and LMA.
Root Cause
• Failure to use metric units in the coding of a ground software file, “Small Forces”
• Angular Momentum Desaturation (AMD) file contained output data from Small Forces
• Trajectory modellers assumed the data was in the correct units
• AMD events during the journey occurred 10-14 times more often than expected
• Small errors introduced in trajectory estimates over 9 months
• Discrepancies were only informally reported
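One way to avoid this kind of assumption is to make every value carry its unit, so that a consumer has to convert explicitly instead of trusting a bare number. The sketch below is a hypothetical illustration in Python. The `Impulse` class, the unit labels and the helper function are assumptions made for this example and do not reflect the real Small Forces or AMD file formats.

```python
from dataclasses import dataclass

# Conversion factors to newton-seconds, keyed by unit label.
# The labels are assumptions made up for this example.
_TO_NEWTON_SECONDS = {
    "N*s": 1.0,
    "lbf*s": 4.4482216152605,  # 1 lbf = 4.4482216152605 N exactly
}


@dataclass(frozen=True)
class Impulse:
    """An impulse value that always travels with its unit label."""
    value: float
    unit: str

    def __post_init__(self) -> None:
        # Reject unknown or missing units at construction time.
        if self.unit not in _TO_NEWTON_SECONDS:
            raise ValueError(f"unknown impulse unit: {self.unit!r}")

    def to_newton_seconds(self) -> float:
        """Return the impulse in SI units regardless of the input unit."""
        return self.value * _TO_NEWTON_SECONDS[self.unit]


def total_impulse_n_s(events: list[Impulse]) -> float:
    """Sum a series of thruster events in newton-seconds."""
    return sum(e.to_newton_seconds() for e in events)
```

With this design a value without a recognised unit cannot be created at all, so the question "which units is this in?" is answered by the data rather than by the reader's assumption.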
Contributing Causes
1) Undetected mis-modelling of spacecraft velocity changes
• AMD files unused for first four months
• When files were used the underestimation was noticed
2) Navigation team unfamiliar with spacecraft
• Operations navigation team not involved in key development stages
• Critical information not passed on
3) Trajectory correction manoeuvre number 5 not performed
• Contingency manoeuvre plan was in place but not prepared for
• TCM-5 was discussed verbally but not executed
4) System engineering process did not adequately address transition from development to operations
• Inadequate transition from development to operations
• Navigation team unfamiliar with spacecraft design characteristics
Contributing Causes cont...
5) Inadequate communications between project elements
• Development and operations teams; project management and technical teams; project and technical line management
• Assumptions were made and key knowledge not passed between project teams
6) Inadequate operations navigation team staffing
• Only 2 full time staff
7) Inadequate training
• Unaware of reporting procedure
• Not enough emphasis on end-to-end testing
8) Verification and validation
• Small forces file not validated
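A validation step of the kind item 8 found missing could compare the file output against an independent estimate. The Python sketch below is a hypothetical example, not the procedure the board recommended. It assumes the thrust (in newtons) and burn duration (in seconds) of an event are known independently, and the function name, tolerance and return strings are invented for illustration.

```python
import math

LBF_TO_N = 4.4482216152605  # newtons per pound-force (exact by definition)


def check_reported_impulse(
    reported: float,
    thrust_n: float,
    burn_s: float,
    rel_tol: float = 0.05,
) -> str:
    """Compare a reported impulse with thrust x burn time (in N·s).

    Returns "ok", "possible lbf*s/N*s mix-up", or "mismatch".
    """
    expected = thrust_n * burn_s
    if math.isclose(reported, expected, rel_tol=rel_tol):
        return "ok"
    # A reported value that is too small by a factor of about 4.45 suggests
    # pound-force seconds were written where newton-seconds were expected.
    if math.isclose(reported * LBF_TO_N, expected, rel_tol=rel_tol):
        return "possible lbf*s/N*s mix-up"
    return "mismatch"
```

A check like this does not prove the file is correct, but it turns a silent discrepancy into an explicit result that has to be reported and resolved.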
Contributing Causes cont...
• Throughout all the project elements there was an absence of peer reviews
• Those held were without key personnel
• Recommendations from these causes included the obvious (checking units) and changes to project structure
• Face to face meetings between elements and long term support to improve communications
Mars Polar Lander
• Launched Jan 3 1999
• Second of Mars Surveyor ’98 programme
• Expected to touch down on South Polar Region
• Purpose was to record weather conditions and collect samples from surface
• Communication lost during landing procedure on Dec 3 1999
• Software error is the most likely reason
• Incorrectly indicated the ‘touch down’ signal and cut off engines 40 metres above surface
Report on Project Management in NASA
• MCO mission was conducted under NASA’s “Faster, better, cheaper” philosophy.
• Failure to instil sufficient rigor in risk management throughout the mission lifecycle.
• Increased risk to an unacceptable level.
• Cuts in money and resources available to support MCO mission.
Project Management Issues
• Roles and responsibilities of team members on MCO mission were not clearly defined.
• Authority and accountability an issue
– Who is in charge?
– Who is the mission manager?
• Project plan did not provide a careful handover from the development project to the operations project.
• Inadequate training
• “The board found that the project management team appeared more focused on meeting mission cost and schedule objectives and did not adequately focus on mission risk.”
Recurring themes from failure investigations and studies
• Studies outline a lack of peer reviews across the majority of NASA projects
• Poor risk management
• Inadequate testing and quality control
• Poor intercommunication between teams
Lack of discipline in processes
• Processes used to develop, validate and operate the spacecraft were not sufficient to minimise the risks introduced by the cost and resource cuts.
– This risk compromised the mission to the point of mission failure.
• Mission deemed a success up until right before Mars orbit insertion.
• Processes should be in place to catch mistakes before they become detrimental to mission success.
• Led NASA to define a new philosophy for further projects – “MISSION SUCCESS FIRST”
MISSION SUCCESS FIRST
• Mission success must become the highest priority at all levels of the project and the organisation.
• New philosophy focuses on 4 primary concerns:
– People
– Process
– Execution
– Technology
Test Driven Development
• Under Mission Success First teams take full ownership of the development.
– Have to understand, document and communicate limitations of the system.
– Continuous reviews, internally and externally.
• “Test, test and test some more.” – Philosophy
• “Know what you build, Test what you build, Test what you fly, Test like you fly.”
Mission Success First Check List
Not catching the error
• “Our inability to recognise and correct this simple error has had major implications” – Edward Stone, Director of Jet Propulsion Laboratory
• “The problem here was not the error, it was the failure of NASA’s systems engineering, and the checks and balances in our processes, to detect the error. That’s why we lost the spacecraft.” – Edward Weiler, NASA associate administrator for space science
• “A single error should not bring down a $125 million mission.” – Thomas Gavin, Deputy director for space and earth science at NASA’s Jet Propulsion Laboratory
Conclusion
• Poor communication between teams
• Formal error reporting not followed
• Poor training of staff
• Cost-cuts resulting in fewer staff and resources
• Testing should try to prove it doesn’t work, instead of proving it does work
Learning from experience?
• Delayed launch of Mars Science Laboratory Rover Mission: was due to fly in 2009, now 2011
– Mission dogged by testing and hardware problems
– Will cost an extra $400m
– “Trying for '09 would require us to assume too much risk, more than I think is appropriate for a flagship mission,” – NASA administrator Michael Griffin