How safe is safe enough? (and how do we demonstrate that?)
Dr David Pumfrey
High Integrity Systems Engineering Group
Department of Computer Science
University of York
Why System Safety?
Why do we strive to make systems safe?
Self interest
we wouldn’t want to be harmed by systems we develop and use
unsafe systems are bad business
We have to do so
required by law
required by standards
But what do the law and standards represent?
laws try to prevent what society finds morally unacceptable
ultimately assessed by the courts, as representatives of society
standards try to define what is acceptable practice
to discharge legal and moral responsibilities
Perception of Safety
Perception (and hence individual acceptance) of risk is affected by many factors
(Apparent) degree of control
Number of deaths in one accident (aircraft versus cars)
Familiarity vs. novelty
“Dreadness” of risk (“falling out of the sky”, nuclear radiation)
Voluntary vs. involuntary risk (hang gliding vs nuclear accident)
Politics and journalism
Frequency / profile of reporting of accidents / issues
Experience
Individual factors – age, sex, religion, culture
How do companies (engineers?) make decisions given diversity of views?
Getting it wrong – some examples
Ariane 5 (mis-use of legacy software)
A330 ADIRUs (software could not cope with hardware failure mode)
Boeing 777 (software error allowed switch back to ADIRU that had previously been detected as faulty)
Therac 25 (software errors contributed to radiotherapy overdose accidents)
Boeing 777
An incident of massive altitude fluctuations on a flight out of Perth
Problem caused by Air Data Inertial Reference Unit (ADIRU)
Software contained a latent fault which was revealed by a change
Problem was in fault management/dispatch logic
June 2001: accelerometer #5 fails with erroneous high output values, ADIRU discards output values
Power cycle on ADIRU occurs each occasion aircraft electrical system is restarted
Aug 2005: accelerometer #6 fails, latent software error allows use of previously failed accel #5
http://www.atsb.gov.au/publications/investigation_reports/2005/AAIR/aair200503722.aspx
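The way a fault-management defect can stay hidden through a first failure and only show itself on a second can be sketched in a few lines. This is a deliberately simplified illustration, not the ADIRU's actual software: the `FaultManager` class, its method names, the six-sensor set and the specific defect (reconfiguration that excludes only the newest failure) are all assumptions made for the example.

```python
class FaultManager:
    """Toy fault manager for a redundant accelerometer set (hypothetical)."""

    def __init__(self, sensor_ids, buggy):
        self.sensor_ids = tuple(sensor_ids)
        self.buggy = buggy
        self.failed_history = set()          # every sensor ever declared failed
        self.active = list(self.sensor_ids)  # sensors currently trusted

    def on_failure(self, sensor_id):
        """Reconfigure the active set after a sensor is declared failed."""
        self.failed_history.add(sensor_id)
        if self.buggy:
            # Assumed latent defect: only the newest failure is excluded,
            # so earlier exclusions are silently forgotten.
            self.active = [s for s in self.sensor_ids if s != sensor_id]
        else:
            # Intended behaviour: exclude every sensor that has ever failed.
            self.active = [s for s in self.sensor_ids
                           if s not in self.failed_history]


def demo(buggy):
    """Replay a two-failure sequence and return the active set after each."""
    manager = FaultManager(range(1, 7), buggy)  # six sensors: an assumption
    manager.on_failure(5)
    after_first = list(manager.active)
    manager.on_failure(6)
    after_second = list(manager.active)
    return after_first, after_second


if __name__ == "__main__":
    print("buggy  :", demo(True))
    print("correct:", demo(False))
```

With a single failure the buggy and correct variants behave identically, which is why testing that covers only single failures would not reveal the defect. Only the second failure brings the previously rejected sensor back into use.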
Therac 25
Therac 25 was a development of (safe, successful) earlier medical machines
Intended for operation on tumours
Uses linear accelerator to produce electron stream and generate X-rays (both can be used in treatments)
X-ray therapy requires about 100 times more electron beam current than electron therapy
this level of beam current is hazardous if patient exposed directly
Selection of treatment type controlled by a turntable
Therac 25 Schematic
[Schematic of the turntable. Labels: mirror, counterweight, electron mode scan target, X-ray mode target, position sense microswitch assembly, locking plunger]
Software in Therac-25
On older models, there were mechanical interlocks on turntable position and beam intensity
In Therac-25, mechanical interlocks were removed; turntable position and beam activation were both computer controlled
Older models required operator to enter data twice (at patient’s side and in shielded area); the two entries were then cross-checked
In Therac-25, data only entered once (to speed up therapy sessions)
Very poor user interface
Display updated so slowly experienced therapists could “type ahead”
Undocumented error codes which occurred so often the operators ignored them
Six over-dosage accidents (resulting in deaths)
May have been many cases where ineffective treatment was given
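The job the removed interlocks did (refusing to fire the beam unless the turntable is physically in the position the selected treatment needs) can be expressed as a small software check. The sketch below is illustrative only and is not Therac-25 code; the `Mode` and `Turntable` names and the `beam_permitted` function are hypothetical. The point it makes is that the check must use the position reported by the sensors, not the position the software last commanded.

```python
from enum import Enum


class Mode(Enum):
    ELECTRON = "electron therapy"
    XRAY = "x-ray therapy"


class Turntable(Enum):
    ELECTRON_POSITION = "electron mode position"
    XRAY_POSITION = "x-ray mode position"
    IN_TRANSIT = "between positions"


# Position the turntable must be sensed in before each mode may fire.
REQUIRED_POSITION = {
    Mode.ELECTRON: Turntable.ELECTRON_POSITION,
    Mode.XRAY: Turntable.XRAY_POSITION,
}


def beam_permitted(requested_mode, sensed_position):
    """Allow the beam only when the sensed position matches the mode."""
    return REQUIRED_POSITION[requested_mode] is sensed_position


if __name__ == "__main__":
    print(beam_permitted(Mode.XRAY, Turntable.XRAY_POSITION))      # True
    print(beam_permitted(Mode.XRAY, Turntable.ELECTRON_POSITION))  # False
```

A software check of this kind shares failure modes with the rest of the control software, which is one reason an independent hardware interlock gives stronger protection.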
Westland Helicopters Merlin (EH101)
Helicopter Electric Actuation Technology (HEAT) project
To replace traditional flight controls...
rods and links
power assistance from high pressure hydraulics
...with electrical actuation
smaller, lighter
reduction in fire risk
BUT
totally fly-by-wire – no mechanical reversion
flight control electronics become extremely critical
Eurofighter Typhoon: Display Processor Hardware
[Block diagram. Labels: Processor (x2), Second MMU (x2), Private RAM (x2), Private ROM (x2), Timers (x2), Private bus (x2), Local bus, Shared RAM, Shared ROM, I/O, Specialist hardware, Arbitration (x4), System bus]
Timing diagram
[Timing diagram of supervisor and user mode activity on processors CPE, IC and PE1 to PE6 across synchronisation points Sync 1 to Sync 7. Recoverable labels: Context Switch, System Health Monitor, Processor CBIT, SG BLT, Localise Bus Data, Global Bus Input Data, MIM Input / Output, Radar Transfer, Radar Interface Manager, Service Discrete I/F, HUD Monitor, CPE Requests, Generate Broadcast Interrupt, Check_SG Status, MC Timer Update, Timer Interrupt Level 6, Broadcast Interrupt Level 5, VME Bus Block Transfer Period, non-synchronised supervisor operations, acyclic interrupts (Radar EOF and IFF EOF interrupts at CPU level 5 with no latency, sync with data word at CPU level 2), warning interrupt, multi-mission data / PDS load]
Recursive Resource Dependency
[Dependency diagram. Resource groups: all resources, intrinsically critical resources, primary control resources. Labels: events, memory, RAM, ROM, CPU regs, I/O regs, program ROM, stack RAM, critical variables, interrupts, output events, master cycle clock, MMU registers, bus arbitration control registers, timer registers, interrupt configuration registers, initialisation routines]
Initialisation routines for primary control resources use system resources, and dependencies become cyclic.
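A cyclic dependency of this kind can be found mechanically once the "uses" relationships are written down as a graph. The sketch below uses a depth-first search to return one cycle if any exists. The edges in `example_deps` are invented for illustration (the slide does not give the actual relationships), and `find_cycle` is a hypothetical helper name.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of nodes, or None if acyclic.

    deps maps each resource to the resources it depends on.
    """
    WHITE, GREY, BLACK = 0, 1, 2     # unvisited, on current path, finished
    colour = {node: WHITE for node in deps}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in deps.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle is closed.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(deps):
        if colour[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Assumed edges, for illustration only.
example_deps = {
    "initialisation routines": ["program ROM", "stack RAM"],
    "stack RAM": ["MMU registers"],
    "MMU registers": ["initialisation routines"],
    "program ROM": [],
}

if __name__ == "__main__":
    print(find_cycle(example_deps))
```

Breaking such a cycle usually means identifying a resource that can be shown to be trustworthy without relying on anything else in the loop.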
Safety Cases: Who are they for?
Many people and organisations will have an interest in a safety case
supplier / manufacturer
operator
regulatory authorities
bodies that conduct acceptance trials
people who will work with the system and their representatives (unions)
“neighbours” (e.g. general public who live round an air base)
emergency services
May need more than one “presentation” of safety case to suit different audiences
Who has the greatest interest?
Goal Structuring Notation
Purpose of a Goal Structure
To show how goals are broken down into sub-goals, and eventually supported by evidence (solutions), whilst making clear the strategies adopted, the rationale for the approach (assumptions, justifications) and the context in which goals are stated
A Simple Goal Structure
[Goal structure diagram. Elements recoverable from the figure, grouped by type:]
Top goal: Control System is Safe
Context: Hazards identified from FHA (Ref Y)
Context: Tolerability targets (Ref Z)
Context: I.L. Process Guidelines defined by Ref X
Justification (J): 1 x 10^-6 p.a. limit for Catastrophic Hazards
Goal: All identified hazards eliminated / sufficiently mitigated
Goal: Software developed to I.L. appropriate to hazards involved
Goal: H1 has been eliminated
Goal: Probability of H2 occurring < 1 x 10^-6 per annum
Goal: Probability of H3 occurring < 1 x 10^-3 per annum
Goal: Primary Protection System developed to I.L. 4
Goal: Secondary Protection System developed to I.L. 2
Solution: Formal Verification
Solution: Fault Tree Analysis
Solution: Process Evidence of I.L. 4
Solution: Process Evidence of I.L. 2
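A goal structure is a tree of claims with evidence at the leaves, so it can be held as a simple data structure and checked for goals that nothing supports. The sketch below is one possible representation, not part of GSN itself: the `Goal` class and `unsupported` function are hypothetical, and the links between the goals and solutions named on the slide are assumed, since the transcript does not preserve the arrows.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    claim: str
    subgoals: list = field(default_factory=list)
    solutions: list = field(default_factory=list)   # evidence items


def unsupported(goal):
    """Return the claims of leaf goals that have no evidence attached."""
    if not goal.subgoals:
        return [] if goal.solutions else [goal.claim]
    missing = []
    for sub in goal.subgoals:
        missing.extend(unsupported(sub))
    return missing


# Links below are assumptions made for the example.
top = Goal("Control System is Safe", subgoals=[
    Goal("All identified hazards eliminated / sufficiently mitigated", subgoals=[
        Goal("H1 has been eliminated", solutions=["Formal Verification"]),
        Goal("Probability of H2 occurring < 1e-6 per annum",
             solutions=["Fault Tree Analysis"]),
        Goal("Probability of H3 occurring < 1e-3 per annum",
             solutions=["Fault Tree Analysis"]),
    ]),
    Goal("Software developed to I.L. appropriate to hazards involved", subgoals=[
        Goal("Primary Protection System developed to I.L. 4",
             solutions=["Process Evidence of I.L. 4"]),
        Goal("Secondary Protection System developed to I.L. 2",
             solutions=["Process Evidence of I.L. 2"]),
    ]),
])

if __name__ == "__main__":
    print(unsupported(top))
```

A check like this only shows that every leaf has some evidence attached. Whether that evidence is convincing remains a matter of judgement.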
HEAT: Developing the Argument
Top goal: Trials aircraft is acceptably safe to fly with HEAT/ACT fitted
System: HEAT/ACT system is acceptably safe
Clearance: Procedures for flight clearance and certification followed
Integration: Trials a/c remains acceptably safe with HEAT fitted
SMS: SMS implemented to DS00-56
Product: All identified hazards have been suitably addressed
Process: All relevant requirements and standards have been complied with
Progressive Development
[Three successive versions of the goal structure for the Hazard Log requirement, each adding detail to the last]
Version 1
G 1.1.4.7: Hazard Log requirement satisfied
Evidence: Hazard Log Application
Version 2
G 1.1.4.7: Hazard Log requirement satisfied
G 1.1.4.7.1: Hazard Log initiated
G 1.1.4.7.2: Hazard Log correctly maintained
G 1.1.4.7.3: Hazard Log used to assess levels of risk throughout project
Evidence: Hazard Log Application; Hazard Log Guidance Notes document; Safety Review minutes
Version 3
G 1.1.4.7: Hazard Log requirement satisfied
G 1.1.4.7.1: Hazard Log initiated
G 1.1.4.7.2: Hazard Log correctly maintained
G 1.1.4.7.2.1: Access rights to Hazard Log correctly controlled
G 1.1.4.7.2.2: Sign-off procedure and rights to Hazard Log correctly controlled
G 1.1.4.7.2.3: Hazard Log used consistently
G 1.1.4.7.2.4: Hazard Log update procedure understood and correctly followed
G 1.1.4.7.3: Hazard Log used to assess levels of risk throughout project
Evidence: Hazard Log Application; Hazard Log Guidance Notes document; ISAT Hazard Log audit report; Safety Review minutes
An analogy
Safety case like a legal case presented in court
Like a legal case, a safety case must:
be clear
be credible
be compelling
make best use of available evidence
Like a legal case, a safety case will always be subjective
There is no such thing as absolute safety
Safety can never be proved
Always making an argument of acceptability
What is a convincing argument?
Example: The Completeness Problem
[Goal structure diagram. Elements recoverable from the figure:]
G1.1.2.1.1: All relevant airworthiness requirements have been identified completely and correctly
AwComplete: Argument by showing extreme improbability of overlooking relevant requirements
AwCorrect: Argument by showing assumptions used to derive requirements were correct
G1.1.2.1.1.1: Airworthiness requirements specified
G1.1.2.1.1.2: Relevant airworthiness requirements satisfy mandated standards where applicable
G1.1.2.1.1.3: Relevance of airworthiness requirements to HEAT/ACT assessed by competent staff
G1.1.2.1.1.4: Assumptions are proven correct by flight test
BoC: Basis of Certification document
CompAwStaff: Competencies of staff used to filter airworthiness requirements
AwSigs: Competencies of specialists used to vet and approve requirements
FltTest: Assumptions proven by flight test
DS970: Def Stan 00-970
JAR 29
GRS: EH101 General Requirement Specification
How is evidence used?
Think about evidence used in a legal (court) case
Direct - Supports a conclusion with no “intermediate steps”
e.g. a witness testifies that he saw the suspect at point X at time Y.
Circumstantial - Requires an inference to be made to reach a conclusion
e.g. ballistics test proves the suspect’s gun fired the fatal shot.
Safety case evidence is similar
e.g. Testing is direct – shows how the system behaves in specific instances
Conformance to design rules is indirect – allows inference that system is fit for purpose (if rules have been proven)
Evidence may “stack up” in different ways:
Strong, specific – individually compelling, taken together show system properties
Weak, general – compelling in sum
Conclusions
Demonstrating safety is a challenge
We are building ever more complex systems
Much of the “bespoke” complexity is in software
Essential that safety is a design driver...
... and also, design for ability to demonstrate safety