EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software...

79
CPS Applications Heechul Yun 1 Note: Some slides are adopted from Prof. Pellizzoni

Transcript of EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software...

Page 1: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

CPS Applications

Heechul Yun

1

Note: Some slides are adopted from Prof. Pellizzoni

Page 2: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Outline

• Avionics

• Automotive Systems

• Other CPS Applications

2

Page 3: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Avionics

• Electronic systems on an aircraft

– Avionics = Aviation + electronics

– Multiple subsystems: communications, navigation, display, flight control, management, etc.

• Modern avionics

– Increasingly computerized

• Safety culture

– Safety critical; conservative; regulated (FAA, EASA)

3

Page 4: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Fly-by-wire

• Modern aircrafts rely on computers to fly

• Pilots do not directly move flight control surfaces (ailerons, elevator, rudder)

• Instead, Electronic Flight Control System does.

4

FCS

YokeControl surfaces

Page 5: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Autopilot

• Specify desired track: heading, course, waypoints, altitude, airspeed, etc.

5

FCS

YokeControl surfaces

Page 6: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Increasing Complexity

6

Page 7: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Example: F-22

• In 2007, 12 F-22s were going from Hawaii to Japan.

• After crossing the IDL, all 12 experienced multiple crashes.– No navigation– No fuel subsystems– Limited

communications– Rebooting didn’t

help

• F-22 has 1.7 million lines of code

F-22 Raptor

Page 8: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Verification and Validation (V&V)

• Validation– “Are we building the right system?”

– Check if the system meet the requirements

• Verification– “Are we building the system right?”

– Check if the system meets the specification

• It is possible the a system is verified correct but not useful (not valid)

8

requirements

specification

implementation

Page 9: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

V & V Cost

9Image credit: Dr. Guillaume Brat NASA Ames Research Center

Page 10: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Certification

• Convincing the certification authority that the validation process is correct

• Largely process driven– Shows that you followed a good process

– Document everything

– Review everything (with independence)

• Evidence driven– Use formal methods and automated tools (model

checkers, theorem provers, …) in place of independent reviewers

10

Page 11: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

DO-178 B/C

• Software Considerations in Airborne Systems and Equipment Certification

• A document used by certification authorities (FAA, EASA) to certify avionics software

• Basic idea– Access the safety implications of failure modes– Map failure modes to 5 safety levels (A to E)

• A: catastrophic – failure may cause a crash• B: hazardous – failure has a negative impact on safety/perf.• …

– Must satisfy a set “objectives” (with independence)• E.g., algorithms are accurate, software partitioning is confirmed,

source code complies low-level requirements, …

11

Page 12: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

DO-178C and Formal Methods

12Image credit: Dr. Lucas Wagner, Honeywell

Page 13: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Avionics Architecture

• Federated architecture

– A function = a computer (box)

– More functions more boxes (computers)

• flight management, fuel management, flight envelope protection, collision avoidance…

– Each box is uniquely designed for each specific aircraft

• custom hardware/software

• 100s km cabling

13Image credit: ARTIST2 - Integrated Modular Avionics A380

Page 14: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Avionics in Airbus

14Image credit: ARTIST2 - Integrated Modular Avionics A380

Page 15: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Integrated Modula Avionic (IMA)

• Use a set of standard computers

• Use a standard OS (ARINC 653)

• Use standard data communication network (AFDX)

• Multiple applications can be executed on the same computer

• Each computer can be configured to partition its resources to serve multiple functions

15Image credit: ARTIST2 - Integrated Modular Avionics A380

Page 16: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

ARINC 653

• Avionics Application Standard Software Interface (think POSIX for avionics)

• The software base of IMA

• Main idea: integrate software partitions with different criticality levels on the same/communicating computational node.

• A set of OS/Hypervisor provisions for safe partitioning and associated API.

16

Page 17: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

ARINC 653

• Time partitioning

17Image credit: http://www.cotsjournalonline.com/articles/view/100736

Page 18: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

What about multicore?

• ARINC653 and time partitioning was designed single-core systems in mind.

• Problems of executing multiple partitions in parallel on multiple cores

– Cache, memory, bus are shared.

– Isolation is not guaranteed

– A critical partition may be delay by low critical partitions on different cores

18

Page 19: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Denial-of-Service Attack

• Delay execution time of time sensitive code– E.g., real-time control software of a car– Observed >21X execution time increase on Odroid XU4 (*)

• Even after cache partitioning is applied

– Observed >10X increase on RPi 3 (**)

• Of a realistic DNN-based real-time control program

19

LLC

Core1 Core2 Core3 Core4

bench co-runner(s)

(*) Prathap Kumar Valsan, Heechul Yun, Farzad Farshchi. “Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems.” In RTAS, IEEE, 2016. Best Paper Award(**) Michael Garrett Bechtel, Elise McEllhiney, Minje Kim, Heechul Yun. “DeepPicar: A Low-cost Deep Neural Network-based Autonomous Car.” In RTCSA, IEEE, 2018

Page 20: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Denial-of-Service Attack

20[C] Michael Garrett Bechtel and Heechul Yun. Denial-of-Service Attacks on Shared Cache in Multicore: Analysis and Prevention. IEEE Intl. Conference on Real-Time and Embedded Technology and Applications Symposium (RTAS), IEEE, 2019. (to appear)

> 300X slowdown !!!

Page 21: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Multicore and Certification

• CAST32

– A position paper by FAA, EASA, and other certification agencies on multicore certification

– Not a definite rule or guideline, but

– Discuss “interference channels” of multicore

– State objectives to meet for certification

21

Page 22: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

MCP_Planning_1

• Identify the specific MCP processor

• Identify the number of active cores,

• Identify the MCP software architecture

• Identify dynamic features in software

• Identify whether the MCP is for IMA

• Identify whether the MCP support “Robust Resource/Time Partitioning”

• Identify the methods and tools used for software development/verification

22

Page 23: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

MCP_Planning_2

• Describe how MCP shared resources will be used, allocated, verified to avoid contention

• Identify hardware dynamic features

23

Page 24: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

MCP_Resource_Usage_3

• The applicant has identified the interference channels that could permit interference to affect the software applications hosted on the MCP cores, and has verified the applicant’s chosen means of mitigation of the interference.

– Two cases: MCP w/ or w/o Robust Partitioning

24

Page 25: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Robust Resource Partitioning

• Software partitions cannot contaminate storage space for the code, I/O, data of other partitions (MMU, VM)

• Software partitions cannot consume more than allocated resources

• Failures of hardware unique to a software partition cannot cause adverse effect on the other software partitions.

25

Page 26: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Robust Time Partitioning

• No software partition consumes more than its allocated execution time on the core(s) on which it executes, irrespective of other partitions on different cores.

26

Page 27: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

MCP_Resource_Usage_3

• Case 1: MCP Platforms With Robust Partitioning– “…may verify applications separately on the MCP and

determine their WCETs separately. “

• Case 2: All Other MCP Platforms – “ … should be tested on the target MCP with all

software components executing in the intended final configuration, …”

– “… WCET should be determined by analysis and confirmed by test on the target MCP with all software components executing in the intended final configuration.”

27

Page 28: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

MCP_Resource_Usage_4

• The applicant has identified the available resources of the MCP and of its interconnect in the intended final configuration, has allocated the resources of the MCP to the software applications hosted on the MCP and has verifiedthat the demands for the resources of the MCP and of the interconnect do not exceed the available resources when all the hosted software is executing on the target processor.

• NOTE: The need to use Worst Case scenarios is implicit in this objective.

28

Page 29: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

MCP_Software_1

• The applicant has verified that all the software components hosted by the MCP comply with the Applicable Software Guidance. In particular, the applicant has verified that all the hosted software components functioncorrectly and have sufficient time to complete their execution when all the hosted software is executing in the intended final configuration. – TL;DR: Need logical and temporal correctness

29

Page 30: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

MCP_Software_2

• The applicant has verified that the data and control coupling between all the individual software components hosted on the same core or on different cores of the MCP has been exercised during software requirement-based testing, including exercising any interfaces between the applications via shared memory and any mechanisms to control the access to shared memory, and that the data and control coupling is correct. – TL;DR: need system-level testing

30

Page 31: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

MCP_Error_Handling_1

• The applicant has identified the effects of failures that may occur within the MCP and has planned, designed, implemented and verified means (which may include a ‘safety net’ external to the MCP) commensurate with the safety objectives, by which to detect and handle those failures in a fail-safe manner that contains the effects of any failures within the equipment in which the MCP is installed.

31

Page 32: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Safety-Net

• Assumption: MCP can fail

• Goal: “fail-safe” operation

– can safely fly and land, but not at 100% performance

• How?

– passive monitoring functions

– active fault avoidance functions

– control functions for recovery

32

Page 33: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Multicore and Certification

• Major research topic

• Research goal

– A) Achieve time predictability and isolation

– B) Maximize throughput (as long as A. is met)

– For multicore based real-time embedded systems

– So that such systems can be certifiable

33

Page 34: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Outline

• Avionics

• Automotive Systems

• Other CPS Applications

34

Page 35: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Recap: Multicore and Certification

• CAST32

– A position paper by FAA, EASA, and other certification agencies on multicore certification

– Not a definite rule or guideline, but

– Discuss “interference channels” of multicore

– State objectives to meet for certification

• Robust Resource/Time Partitioning

35

Page 36: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Automotive Systems

• 100s of processors (ECU)

36

Image credit: Simon Fürst, BMW, EMCC2015 Munich, adopted from OSPERT2015 keynote

Page 37: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Automotive Systems

• ECUs, sensors, actuators

• Many subsystems

– Anti-lock breaking systems,Electronic stability control,Adaptive cruise control,Adaptive light control,Lane departure warning,infotainment, …

– For comports and safety

37

Image credit: Prof. Brandenburg

Page 38: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Increasing Complexity

• Today’s cars depend on software/computers

– Safe, dependable software is hard

38

Image source: https://hbr.org/resources/images/article_assets/hbr/1006/F1006A_B_lg.gif

Page 39: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Safety Challenge

• Unintended acceleration

– Caused several fatal accidents

• e.g., Aug. 2009 accident in California

– Toyota settled to pay $1.2B

• Potential causes

– Memory corruption

– Unsafe software design

39

Page 40: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Safety Challenge

40

Page 41: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Automotive System Development

• Each subsystem (function)’s software and hardware is built by a vendor (contractor)

• Car manufacturers (e.g., GM) integrate and validate

• Problems

– Size, weight, and power issue

– Lack of standards

– Difficult to interoperate, share code, refine

41

Page 42: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Size, Weight, and Power (SWaP)

• Maximum performance with minimal resources

– Cannot afford too many or too power hungry ECUs

42Figure source: OSPERT 2015 Keynote by Leibinger

Page 43: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

AUTOSAR

• AUTOSAR – AUTomotive Open Systems ARchitecture

• Same motivation: cope with complex with standardization

• Define standard interfaces for software independent of hardware ECU

43

Page 44: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

AUTOSAR

• Improve interoperability

44

Image credit: AUTOSAR tutorial at autosar.org

Page 45: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

AUTOSAR RTE

• POSIX for AUTOSAR.

– Define services and APIs for applications

45Image credit: AUTOSAR tutorial at autosar.org

Page 47: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Controller Area Network (CAN)

• Multi-master serial bus for connecting ECUs

– Up to 1Mbps

• De-factor comm. standard in automotive

• Safety critical controls

– E.g.) steering, breaking, throttle, …

47Image credit: https://en.wikipedia.org/wiki/CAN_bus

Page 48: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

CAN Networks and Vulnerability• Set of ECU connected by CAN buses.

• CAN buses are designed for real-time (fixed priority messages), but not security…

• Broadcast with no authentication field: any ECU connected to a CAN bus broadcasts to all other ECU on the same bus. No way to determine the sender.

• Weak Access Control: there is a challenge-response sequence but the codes must be known by all service centers to perform diagnostic = they are out in the open.

• ECU Firmware Update: the firmware of any ECU can be updated over the CAN bus.

• Bridge nodes: there are different CAN buses (critical / non-critical), but they are bridged by dedicated ECU nodes.

• Result: if you can hack any ECU, you can re-flash any other ECU…48

Page 49: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

External I/O Channels

49Comprehensive Experimental Analyses of Automotive Attack Surfaces, USENIX Security, 2011

Page 50: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Autonomous Car

50

https://www.latimes.com/business/autos/la-fi-waymo-self-driving-california-20181030-story.html

Page 51: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Levels of Automation (SAE J3016)

• L1– Some cars today

• L2– Tesla

• L3– Uber

• L4– Waymo

• L5– None

51

(SAE, "Taxonomy and Definitions for Terms Related to On-Road Motor Vehicle Automated Driving Systems.")

Page 52: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

52S. Kato, E. Takeuchi, Y. Ishiguro, Y. Ninomiya, K. Takeda, and T. Hamada. ``An Open Approach to Autonomous Vehicles,'' IEEE Micro, Vol. 35, No. 6, pp. 60-69, 2015. Link

Page 53: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Autoware

53https://www.youtube.com/watch?v=zujGfJcZCpQ

Page 54: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Autoware

• Open-source software stack for self-driving54

https://github.com/CPFL/Autoware

Page 55: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Autoware

55https://youtu.be/gq8El7-36z0?t=896

Page 56: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

56https://arxiv.org/abs/1901.08567

Page 57: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Autoware

• Limitations

– Require detailed 3D map

– Require accurate localization (~cm) in the map

– Heavily rely on expensive Lidar sensor

• Cameras are supplementary

– Not the way human drives

57

Page 58: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Outline

• Avionics

• Automotive Systems

• Other CPS Applications

58

Page 59: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

FDA regulation and medical devices

• Federal Drug Administration regulations are somehow lacking compared to other federal agency (ex: FAA).

• April 2010 push: “Infusion Pump Improvement Initiative”

59

• Patient-controlled analgesia

– Nurses are expensive

– Patient feels bad, press button, gets a shot of morphine

– What if he presses the button too often?

Page 60: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Example: Ventilator

• Patient on ventilator (mechanically operates lungs) during surgery.

• Surgeon asks assistant to take x-ray.

– Can’t take x-ray while lungs are moving (doctor always ask you to stop breathing…).

– Anesthesiologist stops ventilator.

– Assistant has trouble with x-ray – room is a cable mess.

60

• Anesthesiologist go help with x-ray.

• Nobody turns the ventilator back on…

Page 61: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Heart and Pacemaker

61

Page 62: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Modern Pacemaker?

62

Page 63: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Heart and Pacemaker Models

PROCEEDINGS OF THE IEEE 4

several studies have focused on the electrical-biomechanical

function of the whole heart ([30], [31]).

While these high-fidelity functional models capture the

heart functions in great detail, the full electro-physiological

model of the heart derived from them is computationally too

expensive for implantable device V&V. Furthermore, during

test case generation, fitting patient data with the large number

of parameters (in the 100,000s) is nearly impossible and

unnecessary as the pacemaker has only two or three electrodes

to interface with the heart. Thus these models are therefore not

at the right level of abstraction for V&V and do not interface

with implantable cardiac devices.

Medtronic’s Virtual Interactive Patient simulator can be used

in closed-loop operation with real medical devices, but the lack

of clinical relevance of this signal generator allows it to be only

used as a training tool and not during the testing of device

software itself [18]. In 1989, Malik et. al. [32] extracted the

timing properties of the cardiac conduction system to model

the heart. Their model was able to do close-loop simulation

with pacemaker software for several clinically-relevant cases

and produce template-based ECG signals. The VHM platform

builds upon this modeling technique and allows for cardiac

device testing and interaction with real devices.

B. Requirements for Model-Based Closed-loop V&V

For model-based V&V it is necessary to develop a frame-

work wherein the device itself, or a model of the device, is

verified or tested in closed-loop with a model of the patient or

the organ of concern. Thus, the main part of the framework is

the model of the patient or the organ of concern (i.e., in this

case the heart) that satisfies the following requirements:

A. Model Fidelity: The design of the heart model must

cover the functioning heart (i.e., normal sinus rhythm) and

improper heart function including the most common and

potent arrhythmias. Our Virtual Heart Model (VHM) covers

the following conditions that capture a majority of closed-loop

test cases: normal sinus rhythm, sinus bradycardia, Wencke-

bach type heat block, AV nodal reentry tachycardia (AVNRT)

for supraventricular tachycardia, pacemaker mode-switch op-

eration and pacemaker mediated tachycardia condition. In

addition, pacemaker lead related issues such as crosstalk, lead

dislodgement and lead displacement can be modeled in the

spatial-temporal VHM model (for details see [33], [34], [35],

[36], [37]. These conditions, while not exhaustive, suffice to

demonstrate the closed-loop methodology for V&V.

B. Simplicity: A majority of the heart models currently

used are extremely high order with hundreds of thousands of

ordinary differential equations or millions of finite elements.

While these models are several orders of magnitude richer than

the one presented here, they are primarily concerned with the

mechanics of fluid flow and muscle contraction. Furthermore,

the simulation of the models are time-consuming and the

models do not interact with medical devices [38], [39], [40],

[41] and [42]. The VHM presents an abstraction of the timing

and electrical conduction only as these are the primary inputs

to the pacemaker.

C. Physical Test-bed: One of the potential problems with

medical device development is that the behavior of a man-

ufactured device might differ from the model used during

its development. Thus, it is necessary to provide a closed-

loop test-bed that can be used for testing of the physical

devices. In this case, the model of the organ of concern that

was used for model-based simulation and verification has to

be compatible with the device that mimics the organ during

physical testing. The VHM is implemented in Mathwork’s

Stateflow/Simulink models which allows extraction of VHDL

description of the heart. This enabled us to operate the heart

on VHDL-based FPGA platform for black-box closed-loop

testing, which complements the model-based V&V.

C. Overview of the VHM Approach

We developed Virtual Heart Model (VHM), an integrated

framework for implantable device validation and verification

(see Fig. 2). A formal model which captures the timing

properties of the electrical conduction system of the heart is

developed as a kernel. With a formal model of the device,

closed-loop verification can be done to evaluate device soft-

ware safety against safety requirements. Through a functional

interface, the heart model is able to perform closed-loop

device validation by generating synthetic electrogram signals

(detailed in the next section) to the devices and respond to a

functional pacing signal from the device. This combination of

physiological model and environmental model within a formal

model framework will assist with device certification, which

is based on a validated and verified device model.

Fig. 3 presents a high-level description of the approach we

used to V&V a pacemaker design. We started by formally

specifying the behavior of the heart, using a network of ex-

tended timed-automata (see Section IV). Afterward, the formal

specification was manually mapped into two types of Simulink

designs, a counter-based design and a Simulink design that

uses Stateflow temporal logic operators. Both models utilize

discrete-time Stateflow charts and can be used for Simulink

simulation of the closed-loop scenarios, where the VHM is

composed with a Simulink model of the device. However, the

main difference between the models is the approach that was

used to control execution of a Stateflow chart in terms of time.

The former model utilizes a set of counters that count the

number of global, periodically generated clock events. On the

other hand, the latter model utilizes absolute-time temporal

Fig. 2. Functional and Formal interfaces of the Virtual Heart Model [33].63

Page 64: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Multiple Models

64

Page 65: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Timed Automata• Example of verification tool: UPPAAL

• Based on the theory of Timed Automata

• Adds variables and channels.– Main idea: reduce the number of states and simplify

composition of automata

65

Page 66: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

How are properties specified?• Typically two ways:

1. Use another timed automata. Composed the two. Check if you reach some state.

2. Use another formal language (ex: linear temporal logic).

• The verification engine is called a model-checker.– Maintains a representation of states, clocks, variables (explicit

or implicit)

– Compute which states can be reached from a given set of states (forward or backward)

66

Page 67: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Tissue Activation and Timed Automata Model

67

PROCEEDINGS OF THE IEEE 7

RRPt<=Trrp

ERPt<=Terp

t>=Trest

t:=0

Act_path(i)!

C(i):=1

Act_node(i)?

t:=0

Act_path(i)!

C(i):=1

Act_node(i)?

Terp:=g(f(t)), C(i):=f(t)

Act_path(i)!

t:=0

t>=Terp

t:=0

t>=Trrp

t:=0

Restt<=Trest

(a)

Antet1<=Tante

Idle

Retrot2<=Tretro

Confilict

Act_path(a)?

Tante:=h(C(a))

t1:=0

Double

Act_path(b)?

Tretro:=h(C(b))

t2:=0

t1>=Tante

Act_node(b)!t2>=Tretro

Act_node(a)!

Act_path(b)?

Tretro:=h(C(b))

t2:=0

Act_path(a)?

Tante:=h(C(a))

t1:=0

(b) (c)

Fig. 6. (a) Node automaton. Dotted transition is only valid for pacemaker tissue like SA node; (b) Path automaton; (c) Model of the electricalconduction system of the heart using a network of node & path automata [33].

expressing constraints on the clock values for the location. In

most models, a state invariant defines an upper bound on the

values that a clock can have while the state is active.

A transition guard is a condition on the values of clocks.

A typical guard is of the form t ≥ T , which provides a lower

bound for the clock value. A transition between locations is

enabled when the guard of the transition is true. However, a

transition between locations can occur at each moment when

it is enabled. Thus, to model deterministic transitions at a

particular time, state invariants are usually defined as the

closure of the complement of the guard (e.g., if the guard

is defined as t ≥ 1, then an invariant t < = 1 is added to the

state). Finally, when a transition occurs, associated actions are

taken, which involve updating local variables and/or reseting

clocks.

The VHM is modeled as a network of extended timed-

automata that includes synchronization primitives and shared

variables. In addition, in the VHM each automaton has its own

clocks and all clocks progress synchronously. Automata syn-

chronize with each other using broadcast channels. A channel

c synchronizes between a sender c! and an arbitrary number

of receivers c?. A transition with receiver c? is taken if c! is

available. Since shared variables are used, as with UPPAAL,

linear operations and conditions that include variables can be

used as a part of guards and reset actions. However, unlike

timed-automata semantics in UPPAAL, in our framework a

variable can be assigned with a value that is obtained using

a non-linear function of variables and clocks. This is done to

be able to model the refractory period and conduction delay

change. However, this aspect complicates the use of standard

verification tools based on the timed-automata framework.

B. Modeling the Electrical Conduction System of the Heart

Using the aforementioned semantics, we define node au-

tomaton that models the refractory properties of heart tissue,

and path automaton that models the propagation properties of

heart tissue. Heart tissue along a conduction path can then

be modeled using two node automata and one path automaton

connecting them. When one of the node automata is activated,

it will send an Act path! event to the path automaton. The

path automaton, which models a conduction delay, generates

an Act node! event after the conduction is completed. The

event will activate the node automaton at the other end of the

conduction path. The activation signal will keep propagating

if the node automaton is connected to other path automata.

In this manner, the electrical conduction system of the heart

can be modeled as a network of node and path automata (see

Fig. 6(c)).

1) Node automaton: The node automaton with index i

(presented in Fig. 6(a)) is used to mimic the timing behavior

of cellular action potential from Fig. 4(a). The refractory

periods are modeled as three states. The automaton starts from

Rest state, which corresponds to the resting potential of the

action potential. For the specialized tissue like SA node, the

corresponding node automaton will be self-activated into ERP

state after Trest. A broadcast event Act path(i)! is sent out

to all path automata that are connected to node automaton

i. In this case the shared variable C(i ) ∈ (0, 1], which is

shared among all paths connecting to node i, is updated to 1,

indicating a normal conduction delay. All node automata can

be activated by receiving event Act node(i)? after some path

connecting to it finishes conduction.

ERP state serves as a blocking period since the node does

not react to activation signals while the state is active. After

Terp time in ERP state, the transition to RRP state occurs. If no

external stimuli occurs, the node will return to Rest state after

Trrp time. If a node is activated during RRP state, the transition

to ERP state will occur, activating all paths connected to the

node. Before entering the ERP state, the variable used for

the the clock invariant of the state is modified. This behavior

corresponds to the ERP change due to early activation. The

conduction delay of the paths connecting to the node will

also be updated by the value of shared variable C(i). The

physiological basis and clinical data of this behavior has been

studied in [50] and [51] and an exponential approximation

of the changing trend has been made to approximate similar

behavior.

The functions f and g that are used to mimic the change

in Terp are defined as:

f (t) = 1 − t/ T r r p (1)

The AV node has a different profile than the other tissue.

The ERP period increases rather than decreases when activated

Page 68: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Path Model

PROCEEDINGS OF THE IEEE 7

RRPt<=Trrp

ERPt<=Terp

t>=Trest

t:=0

Act_path(i)!

C(i):=1

Act_node(i)?

t:=0

Act_path(i)!

C(i):=1

Act_node(i)?

Terp:=g(f(t)), C(i):=f(t)

Act_path(i)!

t:=0

t>=Terp

t:=0

t>=Trrp

t:=0

Restt<=Trest

(a)

Antet1<=Tante

Idle

Retrot2<=Tretro

Confilict

Act_path(a)?

Tante:=h(C(a))

t1:=0

Double

Act_path(b)?

Tretro:=h(C(b))

t2:=0

t1>=Tante

Act_node(b)!t2>=Tretro

Act_node(a)!

Act_path(b)?

Tretro:=h(C(b))

t2:=0

Act_path(a)?

Tante:=h(C(a))

t1:=0

(b) (c)

Fig. 6. (a) Node automaton. Dotted transition is only valid for pacemaker tissue like SA node; (b) Path automaton; (c) Model of the electricalconduction system of the heart using a network of node & path automata [33].

expressing constraints on the clock values for the location. In

most models, a state invariant defines an upper bound on the

values that a clock can have while the state is active.

A transition guard is a condition on the values of clocks.

A typical guard is of the form t ≥ T , which provides a lower

bound for the clock value. A transition between locations is

enabled when the guard of the transition is true. However, a

transition between locations can occur at each moment when

it is enabled. Thus, to model deterministic transitions at a

particular time, state invariants are usually defined as the

closure of the complement of the guard (e.g., if the guard

is defined as t ≥ 1, then an invariant t < = 1 is added to the

state). Finally, when a transition occurs, associated actions are

taken, which involve updating local variables and/or reseting

clocks.

The VHM is modeled as a network of extended timed-

automata that includes synchronization primitives and shared

variables. In addition, in the VHM each automaton has its own

clocks and all clocks progress synchronously. Automata syn-

chronize with each other using broadcast channels. A channel

c synchronizes between a sender c! and an arbitrary number

of receivers c?. A transition with receiver c? is taken if c! is

available. Since shared variables are used, as with UPPAAL,

linear operations and conditions that include variables can be

used as a part of guards and reset actions. However, unlike

timed-automata semantics in UPPAAL, in our framework a

variable can be assigned with a value that is obtained using

a non-linear function of variables and clocks. This is done to

be able to model the refractory period and conduction delay

change. However, this aspect complicates the use of standard

verification tools based on the timed-automata framework.

B. Modeling the Electrical Conduction System of the Heart

Using the aforementioned semantics, we define node au-

tomaton that models the refractory properties of heart tissue,

and path automaton that models the propagation properties of

heart tissue. Heart tissue along a conduction path can then

be modeled using two node automata and one path automaton

connecting them. When one of the node automata is activated,

it will send an Act path! event to the path automaton. The

path automaton, which models a conduction delay, generates

an Act node! event after the conduction is completed. The

event will activate the node automaton at the other end of the

conduction path. The activation signal will keep propagating

if the node automaton is connected to other path automata.

In this manner, the electrical conduction system of the heart

can be modeled as a network of node and path automata (see

Fig. 6(c)).

1) Node automaton: The node automaton with index i

(presented in Fig. 6(a)) is used to mimic the timing behavior

of cellular action potential from Fig. 4(a). The refractory

periods are modeled as three states. The automaton starts from

Rest state, which corresponds to the resting potential of the

action potential. For the specialized tissue like SA node, the

corresponding node automaton will be self-activated into ERP

state after Trest. A broadcast event Act path(i)! is sent out

to all path automata that are connected to node automaton

i. In this case the shared variable C(i ) ∈ (0, 1], which is

shared among all paths connecting to node i, is updated to 1,

indicating a normal conduction delay. All node automata can

be activated by receiving event Act node(i)? after some path

connecting to it finishes conduction.

ERP state serves as a blocking period since the node does

not react to activation signals while the state is active. After

Terp time in ERP state, the transition to RRP state occurs. If no

external stimuli occurs, the node will return to Rest state after

Trrp time. If a node is activated during RRP state, the transition

to ERP state will occur, activating all paths connected to the

node. Before entering the ERP state, the variable used for

the the clock invariant of the state is modified. This behavior

corresponds to the ERP change due to early activation. The

conduction delay of the paths connecting to the node will

also be updated by the value of shared variable C(i). The

physiological basis and clinical data of this behavior has been

studied in [50] and [51] and an exponential approximation

of the changing trend has been made to approximate similar

behavior.

The functions f and g that are used to mimic the change

in Terp are defined as:

f (t) = 1 − t/ T r r p (1)

The AV node has a different profile than the other tissue.

The ERP period increases rather than decreases when activated

68

Page 69: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Software Model

69

Page 70: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Other issues with medical equipment…

1. Interoperability– Currently equipment of vendor X only works with other

equipment of vendor X– Strong push for an open medical interoperability standard– Problem #1: if something goes wrong, who gets the blame

(weak FDA regulations)?– Problem #2: equipment vendors have nothing to gain…

2. Wireless Communication– Solve the cable mess– Problem: how to resist interference and jamming?– Some physical-layer techniques are promising (Ultra-

Wide Bandwidth, Dynamic Frequency Selection…)

70

Page 71: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Other issues with medical equipment…

• General solution: stand-alone safety

– Design each component such that it is safe in isolation

– Device must be able to maintain a safe state even if all communication is lost

– Device must be able to maintain a safe state even if it receives incorrect information from other devices

– Ex: ventilator should automatically turn on after a maximum amount of time!

71

Page 72: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Other CPS Examples

• Electric grid

• Manufacturing robots

• Nuclear plant control systems

• …

72

Page 73: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Automotive Systems

73

Image credit: Prof. Brandenburg

Page 74: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Example: Audi A4

• In 2014, Audi recalled 850,000 A4 cars due to a software bug that prevented timely airbag deployment

74http://www.autoblog.com/2014/10/23/audi-a4-airbag-recall/

Page 75: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

What are the problems?1. Lack of care

– Most attacks use conventional buffers overflow– Software simply isn’t checking for malicious inputs– People writing safety software are not used to think about

security..

2. Interfaces are not clearly specified– ECU development delegated to subsystem providers– Interfaces between components are not specified well-

enough to check for malicious interaction

3. No separation among criticality levels– Systems with different criticality are clearly isolated from a

temporal perspective, but not a function one– Very hard to solve…

75

Page 76: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Autonomous Car

• Google, GM, ford, Audi, …

76

Source: http://on-demand.gputechconf.com/gtc/2015/presentation/S5870-Daniel-Lipinski.pdf

Page 77: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

CAN Networks and Vulnerability• Set of ECU connected by CAN buses.

• CAN buses are designed for real-time (fixed priority messages), but not security…

• Broadcast with no authentication field: any ECU connected to a CAN bus broadcasts to all other ECU on the same bus. No way to determine the sender.

• Weak Access Control: there is a challenge-response sequence but the codes must be known by all service centers to perform diagnostic = they are out in the open.

• ECU Firmware Update: the firmware of any ECU can be updated over the CAN bus.

• Bridge nodes: there are different CAN buses (critical / non-critical), but they are bridged by dedicated ECU nodes.

• Result: if you can hack any ECU, you can re-flash any other ECU…77

Page 78: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

Car Hacking• Comprehensive Experimental Analysis of Automotive Attack Surfaces

• Scary stuff: an attacker can very easily gain control over all electronics systems in your car

– Start/stop/rev up/rev down engine

– Brake/disable braking

– Open doors

– Determine your position through GPS

– Listen to whatever you say in the car (all without your knowledge)

• Infiniti Q50 has steer-by-wire, so an attacker can remotely start and drive your car from your parking lot to his safehouse without moving from his couch…

• At least handbrake is completely manual…

78

Page 79: EECS 750: Advanced Operating Systemsheechul/courses/eecs753/S19/... · during software requirement-based testing, including exercising any interfaces between the applications via

High Computational Demand

• “Approximately 1 GB of data will need to be processed each second in the car’s real-time operating system. This data will need to be analyzed quickly enough that the vehicle can react to changes in its surroundings in less than a second.”

79

Intel, “Technology and Computing Requirements for Self-Driving Cars”