EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software...
Transcript of EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software...
CPS Applications
Heechul Yun
1
Note: Some slides are adopted from Prof. Pellizzoni
Outline
• Avionics
• Automotive Systems
• Other CPS Applications
2
Avionics
• Electronic systems on an aircraft
– Avionics = Aviation + electronics
– Multiple subsystems: communications, navigation, display, flight control, management, etc.
• Modern avionics
– Increasingly computerized
• Safety culture
– Safety critical; conservative; regulated (FAA, EASA)
3
Fly-by-wire
• Modern aircrafts rely on computers to fly
• Pilots do not directly move flight control surfaces (ailerons, elevator, rudder)
• Instead, Electronic Flight Control System does.
4
FCS
YokeControl surfaces
Autopilot
• Specify desired track: heading, course, waypoints, altitude, airspeed, etc.
5
FCS
YokeControl surfaces
Increasing Complexity
6
Example: F-22
• In 2007, 12 F-22s were going from Hawaii to Japan.
• After crossing the IDL, all 12 experienced multiple crashes.– No navigation– No fuel subsystems– Limited
communications– Rebooting didn’t
help
• F-22 has 1.7 million lines of code
F-22 Raptor
Verification and Validation (V&V)
• Validation– “Are we building the right system?”
– Check if the system meet the requirements
• Verification– “Are we building the system right?”
– Check if the system meets the specification
• It is possible the a system is verified correct but not useful (not valid)
8
requirements
specification
implementation
V & V Cost
9Image credit: Dr. Guillaume Brat NASA Ames Research Center
Certification
• Convincing the certification authority that the validation process is correct
• Largely process driven– Shows that you followed a good process
– Document everything
– Review everything (with independence)
• Evidence driven– Use formal methods and automated tools (model
checkers, theorem provers, …) in place of independent reviewers
10
DO-178 B/C
• Software Considerations in Airborne Systems and Equipment Certification
• A document used by certification authorities (FAA, EASA) to certify avionics software
• Basic idea– Access the safety implications of failure modes– Map failure modes to 5 safety levels (A to E)
• A: catastrophic – failure may cause a crash• B: hazardous – failure has a negative impact on safety/perf.• …
– Must satisfy a set “objectives” (with independence)• E.g., algorithms are accurate, software partitioning is confirmed,
source code complies low-level requirements, …
11
DO-178C and Formal Methods
12Image credit: Dr. Lucas Wagner, Honeywell
Avionics Architecture
• Federated architecture
– A function = a computer (box)
– More functions more boxes (computers)
• flight management, fuel management, flight envelope protection, collision avoidance…
– Each box is uniquely designed for each specific aircraft
• custom hardware/software
• 100s km cabling
13Image credit: ARTIST2 - Integrated Modular Avionics A380
Avionics in Airbus
14Image credit: ARTIST2 - Integrated Modular Avionics A380
Integrated Modula Avionic (IMA)
• Use a set of standard computers
• Use a standard OS (ARINC 653)
• Use standard data communication network (AFDX)
• Multiple applications can be executed on the same computer
• Each computer can be configured to partition its resources to serve multiple functions
15Image credit: ARTIST2 - Integrated Modular Avionics A380
ARINC 653
• Avionics Application Standard Software Interface (think POSIX for avionics)
• The software base of IMA
• Main idea: integrate software partitions with different criticality levels on the same/communicating computational node.
• A set of OS/Hypervisor provisions for safe partitioning and associated API.
16
ARINC 653
• Time partitioning
17Image credit: http://www.cotsjournalonline.com/articles/view/100736
What about multicore?
• ARINC653 and time partitioning was designed single-core systems in mind.
• Problems of executing multiple partitions in parallel on multiple cores
– Cache, memory, bus are shared.
– Isolation is not guaranteed
– A critical partition may be delay by low critical partitions on different cores
18
Denial-of-Service Attack
• Delay execution time of time sensitive code– E.g., real-time control software of a car– Observed >21X execution time increase on Odroid XU4 (*)
• Even after cache partitioning is applied
– Observed >10X increase on RPi 3 (**)
• Of a realistic DNN-based real-time control program
19
LLC
Core1 Core2 Core3 Core4
bench co-runner(s)
(*) Prathap Kumar Valsan, Heechul Yun, Farzad Farshchi. “Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems.” In RTAS, IEEE, 2016. Best Paper Award(**) Michael Garrett Bechtel, Elise McEllhiney, Minje Kim, Heechul Yun. “DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car.” In RTCSA, IEEE, 2018
Denial-of-Service Attack
20[C] Michael Garrett Bechtel and Heechul Yun. Denial-of-Service Attacks on Shared Cache in Multicore: Analysis and Prevention. IEEE Intl. Conference on Real-Time and Embedded Technology and Applications Symposium (RTAS), IEEE, 2019. (to appear)
> 300X slowdown !!!
Multicore and Certification
• CAST32
– A position paper by FAA, EASA, and other certification agencies on multicore certification
– Not a definite rule or guideline, but
– Discuss “interference channels” of multicore
– State objectives to meet for certification
21
MCP_Planning_1
• Identify the specific MCP processor
• Identify the number of active cores,
• Identify the MCP software architecture
• Identify dynamic features in software
• Identify whether the MCP is for IMA
• Identify whether the MCP support “Robust Resource/Time Partitioning”
• Identify the methods and tools used for software development/verification
22
MCP_Planning_2
• Describe how MCP shared resources will be used, allocated, verified to avoid contention
• Identify hardware dynamic features
23
MCP_Resource_Usage_3
• The applicant has identified the interference channels that could permit interference to affect the software applications hosted on the MCP cores, and has verified the applicant’s chosen means of mitigation of the interference.
– Two cases: MCP w/ or w/o Robust Partitioning
24
Robust Resource Partitioning
• Software partitions cannot contaminate storage space for the code, I/O, data of other partitions (MMU, VM)
• Software partitions cannot consume more than allocated resources
• Failures of hardware unique to a software partition cannot cause adverse effect on the other software partitions.
25
Robust Time Partitioning
• No software partition consumes more than its allocated execution time on the core(s) on which it executes, irrespective of other partitions on different cores.
26
MCP_Resource_Usage_3
• Case 1: MCP Platforms With Robust Partitioning– “…may verify applications separately on the MCP and
determine their WCETs separately. “
• Case 2: All Other MCP Platforms – “ … should be tested on the target MCP with all
software components executing in the intended final configuration, …”
– “… WCET should be determined by analysis and confirmed by test on the target MCP with all software components executing in the intended final configuration.”
27
MCP_Resource_Usage_4
• The applicant has identified the available resources of the MCP and of its interconnect in the intended final configuration, has allocated the resources of the MCP to the software applications hosted on the MCP and has verifiedthat the demands for the resources of the MCP and of the interconnect do not exceed the available resources when all the hosted software is executing on the target processor.
• NOTE: The need to use Worst Case scenarios is implicit in this objective.
28
MCP_Software_1
• The applicant has verified that all the software components hosted by the MCP comply with the Applicable Software Guidance. In particular, the applicant has verified that all the hosted software components functioncorrectly and have sufficient time to complete their execution when all the hosted software is executing in the intended final configuration. – TL;DR: Need logical and temporal correctness
29
MCP_Software_2
• The applicant has verified that the data and control coupling between all the individual software components hosted on the same core or on different cores of the MCP has been exercised during software requirement-based testing, including exercising any interfaces between the applications via shared memory and any mechanisms to control the access to shared memory, and that the data and control coupling is correct. – TL;DR: need system-level testing
30
MCP_Error_Handling_1
• The applicant has identified the effects of failures that may occur within the MCP and has planned, designed, implemented and verified means (which may include a ‘safety net’ external to the MCP) commensurate with the safety objectives, by which to detect and handle those failures in a fail-safe manner that contains the effects of any failures within the equipment in which the MCP is installed.
31
Safety-Net
• Assumption: MCP can fail
• Goal: “fail-safe” operation
– can safely fly and land, but not at 100% performance
• How?
– passive monitoring functions
– active fault avoidance functions
– control functions for recovery
32
Multicore and Certification
• Major research topic
• Research goal
– A) Achieve time predictability and isolation
– B) Maximize throughput (as long as A. is met)
– For multicore based real-time embedded systems
– So that such systems can be certifiable
33
Outline
• Avionics
• Automotive Systems
• Other CPS Applications
34
Recap: Multicore and Certification
• CAST32
– A position paper by FAA, EASA, and other certification agencies on multicore certification
– Not a definite rule or guideline, but
– Discuss “interference channels” of multicore
– State objectives to meet for certification
• Robust Resource/Time Partitioning
35
Automotive Systems
• 100s of processors (ECU)
36
Image credit: Simon Fürst, BMW, EMCC2015 Munich, adopted from OSPERT2015 keynote
Automotive Systems
• ECUs, sensors, actuators
• Many subsystems
– Anti-lock breaking systems,Electronic stability control,Adaptive cruise control,Adaptive light control,Lane departure warning,infotainment, …
– For comports and safety
37
Image credit: Prof. Brandenburg
Increasing Complexity
• Today’s cars depend on software/computers
– Safe, dependable software is hard
38
Image source: https://hbr.org/resources/images/article_assets/hbr/1006/F1006A_B_lg.gif
Safety Challenge
• Unintended acceleration
– Caused several fatal accidents
• e.g., Aug. 2009 accident in California
– Toyota settled to pay $1.2B
• Potential causes
– Memory corruption
– Unsafe software design
39
Safety Challenge
40
Automotive System Development
• Each subsystem (function)’s software and hardware is built by a vendor (contractor)
• Car manufacturers (e.g., GM) integrate and validate
• Problems
– Size, weight, and power issue
– Lack of standards
– Difficult to interoperate, share code, refine
41
Size, Weight, and Power (SWaP)
• Maximum performance with minimal resources
– Cannot afford too many or too power hungry ECUs
42Figure source: OSPERT 2015 Keynote by Leibinger
AUTOSAR
• AUTOSAR – AUTomotive Open Systems ARchitecture
• Same motivation: cope with complex with standardization
• Define standard interfaces for software independent of hardware ECU
43
AUTOSAR
• Improve interoperability
44
Image credit: AUTOSAR tutorial at autosar.org
AUTOSAR RTE
• POSIX for AUTOSAR.
– Define services and APIs for applications
45Image credit: AUTOSAR tutorial at autosar.org
46slide credit: AUTOSAR tutorial at autosar.org
Controller Area Network (CAN)
• Multi-master serial bus for connecting ECUs
– Up to 1Mbps
• De-factor comm. standard in automotive
• Safety critical controls
– E.g.) steering, breaking, throttle, …
47Image credit: https://en.wikipedia.org/wiki/CAN_bus
CAN Networks and Vulnerability• Set of ECU connected by CAN buses.
• CAN buses are designed for real-time (fixed priority messages), but not security…
• Broadcast with no authentication field: any ECU connected to a CAN bus broadcasts to all other ECU on the same bus. No way to determine the sender.
• Weak Access Control: there is a challenge-response sequence but the codes must be known by all service centers to perform diagnostic = they are out in the open.
• ECU Firmware Update: the firmware of any ECU can be updated over the CAN bus.
• Bridge nodes: there are different CAN buses (critical / non-critical), but they are bridged by dedicated ECU nodes.
• Result: if you can hack any ECU, you can re-flash any other ECU…48
External I/O Channels
49Comprehensive Experimental Analyses of Automotive Attack Surfaces, USENIX Security, 2011
Autonomous Car
50
https://www.latimes.com/business/autos/la-fi-waymo-self-driving-california-20181030-story.html
Levels of Automation (SAE J3016)
• L1– Some cars today
• L2– Tesla
• L3– Uber
• L4– Waymo
• L5– None
51
(SAE, "Taxonomy and Definitions for Terms Related to On-Road Motor Vehicle Automated Driving Systems.")
52S. Kato, E. Takeuchi, Y. Ishiguro, Y. Ninomiya, K. Takeda, and T. Hamada. ``An Open Approach to Autonomous Vehicles,'' IEEE Micro, Vol. 35, No. 6, pp. 60-69, 2015. Link
Autoware
• Open-source software stack for self-driving54
https://github.com/CPFL/Autoware
56https://arxiv.org/abs/1901.08567
Autoware
• Limitations
– Require detailed 3D map
– Require accurate localization (~cm) in the map
– Heavily rely on expensive Lidar sensor
• Cameras are supplementary
– Not the way human drives
57
Outline
• Avionics
• Automotive Systems
• Other CPS Applications
58
FDA regulation and medical devices
• Federal Drug Administration regulations are somehow lacking compared to other federal agency (ex: FAA).
• April 2010 push: “Infusion Pump Improvement Initiative”
59
• Patient-controlled analgesia
– Nurses are expensive
– Patient feels bad, press button, gets a shot of morphine
– What if he presses the button too often?
Example: Ventilator
• Patient on ventilator (mechanically operates lungs) during surgery.
• Surgeon asks assistant to take x-ray.
– Can’t take x-ray while lungs are moving (doctor always ask you to stop breathing…).
– Anesthesiologist stops ventilator.
– Assistant has trouble with x-ray – room is a cable mess.
60
• Anesthesiologist go help with x-ray.
• Nobody turns the ventilator back on…
Heart and Pacemaker
61
Modern Pacemaker?
62
Heart and Pacemaker Models
PROCEEDINGS OF THE IEEE 4
several studies have focused on the electrical-biomechanical
function of the whole heart ([30], [31]).
While these high-fidelity functional models capture the
heart functions in great detail, the full electro-physiological
model of the heart derived from them is computationally too
expensive for implantable device V&V. Furthermore, during
test case generation, fitting patient data with the large number
of parameters (in the 100,000s) is nearly impossible and
unnecessary as the pacemaker has only two or three electrodes
to interface with the heart. Thus these models are therefore not
at the right level of abstraction for V&V and do not interface
with implantable cardiac devices.
Medtronic’s Virtual Interactive Patient simulator can be used
in closed-loop operation with real medical devices, but the lack
of clinical relevance of this signal generator allows it to be only
used as a training tool and not during the testing of device
software itself [18]. In 1989, Malik et. al. [32] extracted the
timing properties of the cardiac conduction system to model
the heart. Their model was able to do close-loop simulation
with pacemaker software for several clinically-relevant cases
and produce template-based ECG signals. The VHM platform
builds upon this modeling technique and allows for cardiac
device testing and interaction with real devices.
B. Requirements for Model-Based Closed-loop V&V
For model-based V&V it is necessary to develop a frame-
work wherein the device itself, or a model of the device, is
verified or tested in closed-loop with a model of the patient or
the organ of concern. Thus, the main part of the framework is
the model of the patient or the organ of concern (i.e., in this
case the heart) that satisfies the following requirements:
A. Model Fidelity: The design of the heart model must
cover the functioning heart (i.e., normal sinus rhythm) and
improper heart function including the most common and
potent arrhythmias. Our Virtual Heart Model (VHM) covers
the following conditions that capture a majority of closed-loop
test cases: normal sinus rhythm, sinus bradycardia, Wencke-
bach type heat block, AV nodal reentry tachycardia (AVNRT)
for supraventricular tachycardia, pacemaker mode-switch op-
eration and pacemaker mediated tachycardia condition. In
addition, pacemaker lead related issues such as crosstalk, lead
dislodgement and lead displacement can be modeled in the
spatial-temporal VHM model (for details see [33], [34], [35],
[36], [37]. These conditions, while not exhaustive, suffice to
demonstrate the closed-loop methodology for V&V.
B. Simplicity: A majority of the heart models currently
used are extremely high order with hundreds of thousands of
ordinary differential equations or millions of finite elements.
While these models are several orders of magnitude richer than
the one presented here, they are primarily concerned with the
mechanics of fluid flow and muscle contraction. Furthermore,
the simulation of the models are time-consuming and the
models do not interact with medical devices [38], [39], [40],
[41] and [42]. The VHM presents an abstraction of the timing
and electrical conduction only as these are the primary inputs
to the pacemaker.
C. Physical Test-bed: One of the potential problems with
medical device development is that the behavior of a man-
ufactured device might differ from the model used during
its development. Thus, it is necessary to provide a closed-
loop test-bed that can be used for testing of the physical
devices. In this case, the model of the organ of concern that
was used for model-based simulation and verification has to
be compatible with the device that mimics the organ during
physical testing. The VHM is implemented in Mathwork’s
Stateflow/Simulink models which allows extraction of VHDL
description of the heart. This enabled us to operate the heart
on VHDL-based FPGA platform for black-box closed-loop
testing, which complements the model-based V&V.
C. Overview of the VHM Approach
We developed Virtual Heart Model (VHM), an integrated
framework for implantable device validation and verification
(see Fig. 2). A formal model which captures the timing
properties of the electrical conduction system of the heart is
developed as a kernel. With a formal model of the device,
closed-loop verification can be done to evaluate device soft-
ware safety against safety requirements. Through a functional
interface, the heart model is able to perform closed-loop
device validation by generating synthetic electrogram signals
(detailed in the next section) to the devices and respond to a
functional pacing signal from the device. This combination of
physiological model and environmental model within a formal
model framework will assist with device certification, which
is based on a validated and verified device model.
Fig. 3 presents a high-level description of the approach we
used to V&V a pacemaker design. We started by formally
specifying the behavior of the heart, using a network of ex-
tended timed-automata (see Section IV). Afterward, the formal
specification was manually mapped into two types of Simulink
designs, a counter-based design and a Simulink design that
uses Stateflow temporal logic operators. Both models utilize
discrete-time Stateflow charts and can be used for Simulink
simulation of the closed-loop scenarios, where the VHM is
composed with a Simulink model of the device. However, the
main difference between the models is the approach that was
used to control execution of a Stateflow chart in terms of time.
The former model utilizes a set of counters that count the
number of global, periodically generated clock events. On the
other hand, the latter model utilizes absolute-time temporal
Fig. 2. Functional and Formal interfaces of the Virtual Heart Model [33].63
Multiple Models
64
Timed Automata• Example of verification tool: UPPAAL
• Based on the theory of Timed Automata
• Adds variables and channels.– Main idea: reduce the number of states and simplify
composition of automata
65
How are properties specified?• Typically two ways:
1. Use another timed automata. Composed the two. Check if you reach some state.
2. Use another formal language (ex: linear temporal logic).
• The verification engine is called a model-checker.– Maintains a representation of states, clocks, variables (explicit
or implicit)
– Compute which states can be reached from a given set of states (forward or backward)
66
Tissue Activation and Timed Automata Model
67
PROCEEDINGS OF THE IEEE 7
RRPt<=Trrp
ERPt<=Terp
t>=Trest
t:=0
Act_path(i)!
C(i):=1
Act_node(i)?
t:=0
Act_path(i)!
C(i):=1
Act_node(i)?
Terp:=g(f(t)), C(i):=f(t)
Act_path(i)!
t:=0
t>=Terp
t:=0
t>=Trrp
t:=0
Restt<=Trest
(a)
Antet1<=Tante
Idle
Retrot2<=Tretro
Confilict
Act_path(a)?
Tante:=h(C(a))
t1:=0
Double
Act_path(b)?
Tretro:=h(C(b))
t2:=0
t1>=Tante
Act_node(b)!t2>=Tretro
Act_node(a)!
Act_path(b)?
Tretro:=h(C(b))
t2:=0
Act_path(a)?
Tante:=h(C(a))
t1:=0
(b) (c)
Fig. 6. (a) Node automaton. Dotted transition is only valid for pacemaker tissue like SA node; (b) Path automaton; (c) Model of the electricalconduction system of the heart using a network of node & path automata [33].
expressing constraints on the clock values for the location. In
most models, a state invariant defines an upper bound on the
values that a clock can have while the state is active.
A transition guard is a condition on the values of clocks.
A typical guard is of the form t ≥ T , which provides a lower
bound for the clock value. A transition between locations is
enabled when the guard of the transition is true. However, a
transition between locations can occur at each moment when
it is enabled. Thus, to model deterministic transitions at a
particular time, state invariants are usually defined as the
closure of the complement of the guard (e.g., if the guard
is defined as t ≥ 1, then an invariant t < = 1 is added to the
state). Finally, when a transition occurs, associated actions are
taken, which involve updating local variables and/or reseting
clocks.
The VHM is modeled as a network of extended timed-
automata that includes synchronization primitives and shared
variables. In addition, in the VHM each automaton has its own
clocks and all clocks progress synchronously. Automata syn-
chronize with each other using broadcast channels. A channel
c synchronizes between a sender c! and an arbitrary number
of receivers c?. A transition with receiver c? is taken if c! is
available. Since shared variables are used, as with UPPAAL,
linear operations and conditions that include variables can be
used as a part of guards and reset actions. However, unlike
timed-automata semantics in UPPAAL, in our framework a
variable can be assigned with a value that is obtained using
a non-linear function of variables and clocks. This is done to
be able to model the refractory period and conduction delay
change. However, this aspect complicates the use of standard
verification tools based on the timed-automata framework.
B. Modeling the Electrical Conduction System of the Heart
Using the aforementioned semantics, we define node au-
tomaton that models the refractory properties of heart tissue,
and path automaton that models the propagation properties of
heart tissue. Heart tissue along a conduction path can then
be modeled using two node automata and one path automaton
connecting them. When one of the node automata is activated,
it will send an Act path! event to the path automaton. The
path automaton, which models a conduction delay, generates
an Act node! event after the conduction is completed. The
event will activate the node automaton at the other end of the
conduction path. The activation signal will keep propagating
if the node automaton is connected to other path automata.
In this manner, the electrical conduction system of the heart
can be modeled as a network of node and path automata (see
Fig. 6(c)).
1) Node automaton: The node automaton with index i
(presented in Fig. 6(a)) is used to mimic the timing behavior
of cellular action potential from Fig. 4(a). The refractory
periods are modeled as three states. The automaton starts from
Rest state, which corresponds to the resting potential of the
action potential. For the specialized tissue like SA node, the
corresponding node automaton will be self-activated into ERP
state after Trest. A broadcast event Act path(i)! is sent out
to all path automata that are connected to node automaton
i. In this case the shared variable C(i ) ∈ (0, 1], which is
shared among all paths connecting to node i, is updated to 1,
indicating a normal conduction delay. All node automata can
be activated by receiving event Act node(i)? after some path
connecting to it finishes conduction.
ERP state serves as a blocking period since the node does
not react to activation signals while the state is active. After
Terp time in ERP state, the transition to RRP state occurs. If no
external stimuli occurs, the node will return to Rest state after
Trrp time. If a node is activated during RRP state, the transition
to ERP state will occur, activating all paths connected to the
node. Before entering the ERP state, the variable used for
the the clock invariant of the state is modified. This behavior
corresponds to the ERP change due to early activation. The
conduction delay of the paths connecting to the node will
also be updated by the value of shared variable C(i). The
physiological basis and clinical data of this behavior has been
studied in [50] and [51] and an exponential approximation
of the changing trend has been made to approximate similar
behavior.
The functions f and g that are used to mimic the change
in Terp are defined as:
f (t) = 1 − t/ T r r p (1)
The AV node has a different profile than the other tissue.
The ERP period increases rather than decreases when activated
Path Model
PROCEEDINGS OF THE IEEE 7
RRPt<=Trrp
ERPt<=Terp
t>=Trest
t:=0
Act_path(i)!
C(i):=1
Act_node(i)?
t:=0
Act_path(i)!
C(i):=1
Act_node(i)?
Terp:=g(f(t)), C(i):=f(t)
Act_path(i)!
t:=0
t>=Terp
t:=0
t>=Trrp
t:=0
Restt<=Trest
(a)
Antet1<=Tante
Idle
Retrot2<=Tretro
Confilict
Act_path(a)?
Tante:=h(C(a))
t1:=0
Double
Act_path(b)?
Tretro:=h(C(b))
t2:=0
t1>=Tante
Act_node(b)!t2>=Tretro
Act_node(a)!
Act_path(b)?
Tretro:=h(C(b))
t2:=0
Act_path(a)?
Tante:=h(C(a))
t1:=0
(b) (c)
Fig. 6. (a) Node automaton. Dotted transition is only valid for pacemaker tissue like SA node; (b) Path automaton; (c) Model of the electricalconduction system of the heart using a network of node & path automata [33].
expressing constraints on the clock values for the location. In
most models, a state invariant defines an upper bound on the
values that a clock can have while the state is active.
A transition guard is a condition on the values of clocks.
A typical guard is of the form t ≥ T , which provides a lower
bound for the clock value. A transition between locations is
enabled when the guard of the transition is true. However, a
transition between locations can occur at each moment when
it is enabled. Thus, to model deterministic transitions at a
particular time, state invariants are usually defined as the
closure of the complement of the guard (e.g., if the guard
is defined as t ≥ 1, then an invariant t < = 1 is added to the
state). Finally, when a transition occurs, associated actions are
taken, which involve updating local variables and/or reseting
clocks.
The VHM is modeled as a network of extended timed-
automata that includes synchronization primitives and shared
variables. In addition, in the VHM each automaton has its own
clocks and all clocks progress synchronously. Automata syn-
chronize with each other using broadcast channels. A channel
c synchronizes between a sender c! and an arbitrary number
of receivers c?. A transition with receiver c? is taken if c! is
available. Since shared variables are used, as with UPPAAL,
linear operations and conditions that include variables can be
used as a part of guards and reset actions. However, unlike
timed-automata semantics in UPPAAL, in our framework a
variable can be assigned with a value that is obtained using
a non-linear function of variables and clocks. This is done to
be able to model the refractory period and conduction delay
change. However, this aspect complicates the use of standard
verification tools based on the timed-automata framework.
B. Modeling the Electrical Conduction System of the Heart
Using the aforementioned semantics, we define node au-
tomaton that models the refractory properties of heart tissue,
and path automaton that models the propagation properties of
heart tissue. Heart tissue along a conduction path can then
be modeled using two node automata and one path automaton
connecting them. When one of the node automata is activated,
it will send an Act path! event to the path automaton. The
path automaton, which models a conduction delay, generates
an Act node! event after the conduction is completed. The
event will activate the node automaton at the other end of the
conduction path. The activation signal will keep propagating
if the node automaton is connected to other path automata.
In this manner, the electrical conduction system of the heart
can be modeled as a network of node and path automata (see
Fig. 6(c)).
1) Node automaton: The node automaton with index i
(presented in Fig. 6(a)) is used to mimic the timing behavior
of cellular action potential from Fig. 4(a). The refractory
periods are modeled as three states. The automaton starts from
Rest state, which corresponds to the resting potential of the
action potential. For the specialized tissue like SA node, the
corresponding node automaton will be self-activated into ERP
state after Trest. A broadcast event Act path(i)! is sent out
to all path automata that are connected to node automaton
i. In this case the shared variable C(i ) ∈ (0, 1], which is
shared among all paths connecting to node i, is updated to 1,
indicating a normal conduction delay. All node automata can
be activated by receiving event Act node(i)? after some path
connecting to it finishes conduction.
ERP state serves as a blocking period since the node does
not react to activation signals while the state is active. After
Terp time in ERP state, the transition to RRP state occurs. If no
external stimuli occurs, the node will return to Rest state after
Trrp time. If a node is activated during RRP state, the transition
to ERP state will occur, activating all paths connected to the
node. Before entering the ERP state, the variable used for
the the clock invariant of the state is modified. This behavior
corresponds to the ERP change due to early activation. The
conduction delay of the paths connecting to the node will
also be updated by the value of shared variable C(i). The
physiological basis and clinical data of this behavior has been
studied in [50] and [51] and an exponential approximation
of the changing trend has been made to approximate similar
behavior.
The functions f and g that are used to mimic the change
in Terp are defined as:
f (t) = 1 − t/ T r r p (1)
The AV node has a different profile than the other tissue.
The ERP period increases rather than decreases when activated
68
Software Model
69
Other issues with medical equipment…
1. Interoperability– Currently equipment of vendor X only works with other
equipment of vendor X– Strong push for an open medical interoperability standard– Problem #1: if something goes wrong, who gets the blame
(weak FDA regulations)?– Problem #2: equipment vendors have nothing to gain…
2. Wireless Communication– Solve the cable mess– Problem: how to resist interference and jamming?– Some physical-layer techniques are promising (Ultra-
Wide Bandwidth, Dynamic Frequency Selection…)
70
Other issues with medical equipment…
• General solution: stand-alone safety
– Design each component such that it is safe in isolation
– Device must be able to maintain a safe state even if all communication is lost
– Device must be able to maintain a safe state even if it receives incorrect information from other devices
– Ex: ventilator should automatically turn on after a maximum amount of time!
71
Other CPS Examples
• Electric grid
• Manufacturing robots
• Nuclear plant control systems
• …
72
Automotive Systems
73
Image credit: Prof. Brandenburg
Example: Audi A4
• In 2014, Audi recalled 850,000 A4 cars due to a software bug that prevented timely airbag deployment
74http://www.autoblog.com/2014/10/23/audi-a4-airbag-recall/
What are the problems?1. Lack of care
– Most attacks use conventional buffers overflow– Software simply isn’t checking for malicious inputs– People writing safety software are not used to think about
security..
2. Interfaces are not clearly specified– ECU development delegated to subsystem providers– Interfaces between components are not specified well-
enough to check for malicious interaction
3. No separation among criticality levels– Systems with different criticality are clearly isolated from a
temporal perspective, but not a function one– Very hard to solve…
75
Autonomous Car
• Google, GM, ford, Audi, …
76
Source: http://on-demand.gputechconf.com/gtc/2015/presentation/S5870-Daniel-Lipinski.pdf
CAN Networks and Vulnerability• Set of ECU connected by CAN buses.
• CAN buses are designed for real-time (fixed priority messages), but not security…
• Broadcast with no authentication field: any ECU connected to a CAN bus broadcasts to all other ECU on the same bus. No way to determine the sender.
• Weak Access Control: there is a challenge-response sequence but the codes must be known by all service centers to perform diagnostic = they are out in the open.
• ECU Firmware Update: the firmware of any ECU can be updated over the CAN bus.
• Bridge nodes: there are different CAN buses (critical / non-critical), but they are bridged by dedicated ECU nodes.
• Result: if you can hack any ECU, you can re-flash any other ECU…77
Car Hacking• Comprehensive Experimental Analysis of Automotive Attack Surfaces
• Scary stuff: an attacker can very easily gain control over all electronics systems in your car
– Start/stop/rev up/rev down engine
– Brake/disable braking
– Open doors
– Determine your position through GPS
– Listen to whatever you say in the car (all without your knowledge)
• Infiniti Q50 has steer-by-wire, so an attacker can remotely start and drive your car from your parking lot to his safehouse without moving from his couch…
• At least handbrake is completely manual…
78
High Computational Demand
• “Approximately 1 GB of data will need to be processed each second in the car’s real-time operating system. This data will need to be analyzed quickly enough that the vehicle can react to changes in its surroundings in less than a second.”
79
Intel, “Technology and Computing Requirements for Self-Driving Cars”