Tim.barth mark.nappi
Beyond Band-Aid Solutions: Proactively Reducing Mishap Risks
Project Management Challenge 2010
Presenters: Tim Barth, NASA Engineering and Safety Center; Mark Nappi, United Space Alliance
Used with Permission
2
Outline
• Background
• Methodology
• Event-Specific Risk Reduction Actions
• System-Level Risk Reduction Actions
• Summary
Background
• Safety performance in Shuttle ground operations over the history of the Shuttle Program is commendable
• Many safety improvements over the years
• We can still raise the bar
[Figure labels: Hardware/Software Systems; Workers; Tasks and Processes; Work Environment]
3
• KSC is proactively reducing mishap risks through Shuttle fly-out
• Significant numbers of hardware and software changes/challenges, process changes/challenges, and workforce changes/challenges are happening at the same time
• Systems will be stretched, especially with workforce challenges
• Making KSC organizational systems and processes more robust to handle these changes and challenges reduces the risks of mishaps, process escapes, and other adverse events
4
Background - continued
“We must challenge our assumptions, recognize our risks, and address each difficulty directly and openly so that we can operate more safely and more successfully than we did yesterday, or last month, or last year. We must always strive to be better, and to do better.” Chris Scolese, Day of Remembrance Memo, Jan. 29, 2009
“Space shuttle safety is not a random event. It is derived from carefully understanding and then controlling or mitigating known risks.” Richard Covey, Florida Today, Jan. 15, 2009
5
Staying on the Cutting Edge of Investigative Methods and Tools
• Mishaps, close-calls, and process escapes are learning opportunities
• Steady evolution of investigative techniques and capabilities over the past 20 years in Shuttle ground ops
   • Joint NASA/Contractor Human Factors Team
   • Perry Committee
   • Human factors model
   • Human factors reps on investigation teams
   • Industrial and Human Engineering Groups
   • Standing Accident Investigation Boards
   • Additional investigation teams
   • White papers
   • Corrective Action Engineering
   • Software and experts for root cause analysis
“No one wants to learn by mistakes, but we cannot learn enough from successes to go beyond the state of the art.”
Henry Petroski, To Engineer Is Human
6
Methodology Development and Implementation Timeline
[Figure: timeline spanning 2002 to 2009]
Study phases:
• Initial Research: Jan 2002 – Feb 2003
• Methodology Development & Validation: March 2003 – Aug 2006
• Mishap Study kicked off by PH & USA Mgmt: Aug 2006
• Shuttle Processing Mishap Study: Aug 2006 – May 2008 (mishaps from Feb 2003 – May 2008)
• KSC Shuttle Processing All-Star Off-site Meeting: June 2008
• Risk Reduction Action Dev & Process Escape Assessment: June 2008 – Jan 2009
• Risk Reduction Actions: Sept 2008 – present
Milestones:
• Columbia Tragedy: Feb 1, 2003
• NESC Established: Nov 1, 2003
• KSC Safety Stand-down: March 16, 2006
Shuttle missions:
• STS-114: July 26 – Aug 9, 2005
• STS-121: July 4-17, 2006
• STS-115: Sept 9-23, 2006
• STS-116: Dec 9-22, 2006
• STS-117: June 8-22, 2007
• STS-118: Aug 9-21, 2007
• STS-120: Oct 23 – Nov 6, 2007
• STS-122: Feb 7-20, 2008
• STS-123: March 11-26, 2008
• STS-124: May 31 – June 14, 2008
• STS-126: Nov 14-30, 2008
• STS-119: Mar 15-25, 2009
• STS-125: May 11-24, 2009
• STS-127: July 15-31, 2009
• STS-128: Aug 28 – Sept 11, 2009
• STS-129: Nov 16-27, 2009
“The NESC gains insight into the technical activities of programs/projects through…systems engineering reviews and independent trend or pattern analyses of program/project technical problems, technical issues, mishaps, and close calls within and across programs/projects”
NESC Management Plan, Feb. 2008
7
Fundamental (System-level) and Symptomatic Solutions
• Two “balancing processes” compete for control of a problem symptom
   • Proactive & reactive
   • Preventive & corrective
• Both solutions treat the symptom, but only the fundamental solution treats the system-level issue
   • Medical analogies: lung cancer, diabetes
• Symptomatic solution frequently has the side effect of deferring the fundamental solution, making it harder to achieve
from Peter Senge, “Systemic Leadership and Change”
8
Swiss Cheese Model of Defenses
Adapted from James Reason
[Figure: layers labelled Decision Makers; Line Management / Support Activities; Pre Conditions; Production Activities; Human Error Defenses; leading to an Adverse Event. Annotations: Latent Failures; Active Failures; Individual]
Error trajectory passes through corresponding holes in the layers of defenses, barriers, safeguards, and controls
Contributing Factors and Causes
[Figure: Indirect Contributing FACTORS; Direct Contributing FACTORS; CAUSES (Contributing Causes, Proximate Causes, Root or Probable Cause(s))]
• Influence chain assessments focus on DIRECT CONTRIBUTING FACTORS and CAUSES
• Mishap investigations focus on CAUSES
9
10
Influence Chain Mapping Methodology
• Specifically designed to step back from individual mishaps to evaluate trends and patterns in contributing factors/causes in order to identify the most significant system-level safety issues
   • Shuttle mishap “recurring cause” study
• Complements (does not replace) root cause analysis methods
• Explicitly models the influences between organizational systems and individual behaviors of front-line workers
• Emphasizes absent barriers/controls in addition to failed barriers/controls
Dual Role Model for Addressing System-Level Safety Issues
11
Dual Role Taxonomy of Contributing Factors and Causes
[Figure: taxonomy in three tiers: Control System Factors; Dual Role Factors; Local Resource Factors]
12
Notional Influence Chain
[Figure: notional influence chain drawn across Control System Factors; Dual Role Factors; Local Resource Factors]
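One way to make the notional influence chain concrete is to treat it as an ordered list of contributing factors, each tagged with one of the three taxonomy tiers. The Python sketch below is illustrative only. The class names, the factor labels and descriptions, and the assumption that influence runs from Control System Factors through Dual Role Factors to Local Resource Factors are not taken from the slides.

```python
from dataclasses import dataclass

# Tier names come from the slides; the direction of influence is an assumption.
TIERS = ("Control System", "Dual Role", "Local Resource")


@dataclass(frozen=True)
class Factor:
    label: str        # hypothetical short identifier
    tier: str         # one of TIERS
    description: str  # hypothetical free-text description

    def __post_init__(self):
        # Reject tiers that are not part of the three-tier taxonomy.
        if self.tier not in TIERS:
            raise ValueError(f"unknown tier: {self.tier}")


@dataclass
class InfluenceChain:
    name: str
    factors: list  # ordered from upstream influence to downstream effect

    def links(self):
        # Each influence link joins one factor to the next factor in the chain.
        return list(zip(self.factors, self.factors[1:]))


# Hypothetical chain; none of these factors come from an actual assessment.
chain = InfluenceChain(
    name="hypothetical chain",
    factors=[
        Factor("a", "Control System", "hypothetical organizational factor"),
        Factor("b", "Dual Role", "hypothetical intermediate factor"),
        Factor("c", "Local Resource", "hypothetical front-line factor"),
    ],
)

for upstream, downstream in chain.links():
    print(f"{upstream.label} ({upstream.tier}) -> {downstream.label} ({downstream.tier})")
```

Keeping factors and links separate mirrors the map key used later in the deck, which distinguishes an IC Contributing Factor from an IC Influence Link.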
13
Proactive Risk Reduction
14
Example Influence Chain Assessment: Mobile Crane Boom Impact With VAB
April 19, 2004
15
Mobile Crane Mishap: Completed Influence Chain (IC) Map
[Figure: influence chain map across Control System Factors; Dual Role Factors; Local Resource Factors. Key: IC Contributing Factor; IC Influence Link. Factors labelled 1a, 1b, 1c, 2a, 2b, 2c, 3a, 3b, 3c]
SUMMARY
• 3 influence chains (major issues)
• 9 contributing factors
16
“Swiss Cheese” Model for Mobile Crane Mishap
[Figure: layers of defenses leading to the adverse event, Crane Impact with Facility. Labels: Individuals; Material Resources & Work Environment; Support Information; Operational Procedures; Task Team; Quality Control; Supervision; Training Systems; Design & Dev Systems; Senior Leadership; Schedule Controls; Emotional Factors; Cognitive Factors; Team Comm; Support Equip Design; Procedure Design; Incomplete Procedures; Support Equip Feedback; Enabling Systems]
17
Mobile Crane Mishap: Influence Chain Cont. Factors + SAIB Findings + SAIB Recommendations
[Figure: influence chain map across Control System Factors; Dual Role Factors; Local Resource Factors, overlaid with SAIB findings F1.1, F1.2, F2.1, F2.2, F2.3, F3.1. Key: IC Contributing Factor; IC Influence Link; SAIB Contributing Factor; SAIB Corrective Action; SAIB Corr. Action Link. Annotation: 2 IA cf’s F1.2-through corrective action link]
18
19
Event-Specific Risk Reduction Recommendations
From a human-system integration perspective, a vulnerable system enables workers to make unintentional errors and/or cause collateral damage
A well designed (robust or resistant) system enables workers to avoid errors and collateral damage
[Figure: scale of system robustness and action effectiveness]
• Resistant (Most Effective): Design, Guard, Provide System Feedback
• Average: Inspect, Warn, Train, Add Procedure Details
• Vulnerable (Least Effective): Accept
“To err is human, but errors can be prevented.” National Institute of Medicine
20
Event-Specific Recommendations Examples
• Crane Boom Impact with VAB Structure (04/20/04): Install a sensor system with beepers and/or flashing lights on the mobile cranes that are activated when the cranes are moved in the destowed position, similar to backup beepers on trucks.
• Freedom Star Retrieval Ship Frustum Incident (12/10/06): Replace the polyester straps used to secure the frustum to the deck. Consider using steel cables and the frustum’s cable attach points used for VAB stacking operations.
21
Event-Specific Recommendations Additional Examples
• OPF HB1 Platform System Leak onto OV-104 RH OMS Pod TPS (10/26/04): Re-implement torque stripe requirements for facility KC/AA fittings.
• Crane Overturned on Pad B Surface (01/31/05): Install tire pressure indicators in crane tires to provide visual indications of low tire pressure and potential instability issues.
22
“Recurring Cause” Summary
• Influence chain assessments were completed for over 60% of Standing Accident Investigation Board (SAIB) reports from February 2003 through May 2008
• Observed similar trends and patterns in contributing factors/causes to process escapes and process catches from August 2008 through January 2009
• Results of aggregate data analysis were used to formulate system-level risk reduction actions
Aggregate Data Analysis Results: “Top 8” Proactive Risk Reduction Opportunities
[Figure: contributing factors across Control System Factors; Dual Role Factors; Local Resource Factors. Key: marked factor = proactive risk reduction opportunity for Shuttle ground operations]
Major Factors in Analysis:
• Control System or Dual Role Factor
• Non-design issue
• Frequency of occurrence
• Frequency unaddressed by SAIB
• Part of influence chains
• Emerging risk area
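The aggregate analysis can be pictured as a filter-and-count over the contributing factors recorded in each influence chain assessment. The Python sketch below is a minimal illustration under stated assumptions: the record layout, the mishap identifiers, and the category names are hypothetical, and only three of the listed analysis factors (tier, non-design issue, frequency of occurrence) are modeled. It is not the study's actual scoring method.

```python
from collections import Counter

# Hypothetical records: (mishap id, factor category, tier, is design issue).
# None of these values come from the study data.
records = [
    ("M1", "category A", "Dual Role", False),
    ("M1", "category B", "Dual Role", False),
    ("M2", "category A", "Dual Role", False),
    ("M2", "category D", "Dual Role", True),        # design issue: filtered out
    ("M3", "category C", "Control System", False),
    ("M3", "category A", "Dual Role", False),
    ("M3", "category E", "Local Resource", False),  # wrong tier: filtered out
    ("M4", "category C", "Control System", False),
]


def rank_factors(records, top_n=8):
    """Count non-design Control System / Dual Role factors, most frequent first."""
    counts = Counter(
        category
        for _mishap, category, tier, is_design in records
        if tier in ("Control System", "Dual Role") and not is_design
    )
    return counts.most_common(top_n)


for category, count in rank_factors(records):
    print(f"{category}: {count}")
```

The remaining analysis factors (frequency unaddressed by SAIB, membership in influence chains, emerging risk area) would need extra fields on each record and are left out here.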
23
24
Development of System-Level Risk Reduction Actions
• Selected Shuttle processing “all-stars” developed recommendations for actions focused on buying down the risk of mishaps and process escapes
   • Recognized leaders from Engineering, Shop, and Operations organizations in different facilities
   • Reviewed the data and applied their knowledge of operational practices
• Some recommendations were not practical to implement at this point in the Shuttle Program
• Results presented to Ground Operations Steering Committee
   • Multiple iterations of risk reduction action plans
Overview of System-Level Risk Reduction Actions
25
Performance Self-Assessments
• Designed to stimulate a two-way conversation between supervisors and employees to identify:
   • What went well (recognize successes)
   • Opportunities for improvement (identify and manage risk)
   • Positive behaviors (reinforce and encourage)
• Similar to post task de-briefings
• Minimum 1x/month
• Listen and learn: “together we’re smarter and safer”
26
“Do Not Use or Operate” Tags
• Visual operational constraint system to alert and inform personnel of the following conditions:
   • Out of configuration hardware with potential to be forgotten
   • In-process work unattended for more than one shift
• Replaces an ad hoc system (tape) for stationary GSE panel set-ups and portable GSE
• OSHA lock-out tag-out (LOTO)
• Operating procedure released
27
Systems Training for Loaned Personnel
• A new process to reduce risks of mishaps associated with personnel loaned to other facilities or Programs
   • Flight systems, unique facility systems, and GSE
• The need for support and applicable skills are matched to a group capabilities model
• Identifies requisite skills and provides management the opportunity to assure any deltas to equivalent training are addressed before work begins
• Prior to returning to the home department, the employee receives notification to review current policies/practices
28
Risk Assessment Enhancements
• Ground Operations Risk Assessment (GORA) performed for any first-time or infrequent task, unplanned task (especially unplanned work performed in previously closed out work areas), troubleshooting, hazardous jobs, or tasks with unusual test assemblies/setups
• Scope of each Process Failure Modes and Effects Analysis (PFMEA) and GORA includes pre-ops and close-out inspections
• Require an assessment of similar operations for associated mishaps or process escapes
• Technician and human factors engineering support
• Team members communicate identified risks
• NESC support to KSC Risk Review Board
[Photos: STS-128; STS-124; STS-117]
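The GORA criteria listed on this slide amount to a checklist: if any one condition applies to a task, a risk assessment is performed. The Python sketch below shows that rule under stated assumptions. The `Task` fields and function names are hypothetical, and how KSC actually records or evaluates these conditions is not described in the slides.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All field names are hypothetical; each flag mirrors one criterion on the slide.
    name: str
    first_time: bool = False
    infrequent: bool = False
    unplanned: bool = False
    troubleshooting: bool = False
    hazardous: bool = False
    unusual_setup: bool = False


def gora_triggers(task):
    """Return the list of slide criteria that apply to this task."""
    checks = {
        "first-time task": task.first_time,
        "infrequent task": task.infrequent,
        "unplanned task": task.unplanned,
        "troubleshooting": task.troubleshooting,
        "hazardous job": task.hazardous,
        "unusual test assembly/setup": task.unusual_setup,
    }
    return [reason for reason, applies in checks.items() if applies]


def requires_gora(task):
    # Any single criterion is enough to call for an assessment.
    return bool(gora_triggers(task))


# Hypothetical tasks for illustration only.
routine = Task("routine inspection")
rework = Task("rework in closed-out area", unplanned=True, hazardous=True)

print(requires_gora(routine), gora_triggers(routine))
print(requires_gora(rework), gora_triggers(rework))
```

Returning the list of reasons, rather than only a yes/no answer, keeps a record of why an assessment was called for.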
29
Problem Resolution Center and Flow Management Workshop
• Problem Resolution Center deploys floor engineers to "hot spots" to help resolve technical and scheduling issues real-time
   • Roving troubleshooters
• Joint NASA/USA Flow Management Workshop addressed the following issues (what to do, what NOT to do):
   • Workload vs. right resources
   • Constellation and Shuttle co-existence
   • Uncertainty
   • Critical skills and sharing resources
   • Maintaining focus and attention to detail
30
Crucial Conversations Training
Communication skills training to increase trust and dialog during Shuttle fly-out and transition
Focus is on making it safe to talk about anything by creating mutual purpose and mutual respect
USA Ground Operations and NASA Shuttle Processing managers and supervisors have received training
31
32
Summary
• Proactive risk reduction efforts will continue through Shuttle fly-out
• Influence chain methodology complements root cause analysis efforts
• Study results have also been applied to Constellation systems
   • Human factors engineering pathfinder for GSE designers
   • Ground support equipment (GSE) design reviews
   • Ground operations planning and operability enhancements
   • Orion assembly and processing
“Complex systems sometimes fail in complex ways. Sometimes you have to work pretty hard to pin down those complex failure mechanisms. But if you can do that, you will have done the system a great service.” Admiral Gehman, Chair of the Columbia Accident Investigation Board