A Trusted Autonomic Architecture to Safeguard Cyber-Physical Control Leaf Nodes and Protect Process Integrity
Nayana Teja Chiluvuri
Thesis submitted to the Faculty of the
Virginia Polytechnic Institute and State University
in partial fulfillment of the requirements for the degree of
Master of Science
in
Computer Engineering
Cameron D. Patterson, Chair
William T. Baumann
Thomas L. Martin
July 17, 2015
Blacksburg, Virginia
Keywords: Process control systems, cyber-physical systems, autonomic systems,
programmable logic controller, remote terminal unit, human-machine interface, FPGA,
trust, configurable system-on-chip, heterogeneous computing, high-level synthesis
Copyright 2015, Nayana Teja Chiluvuri
A Trusted Autonomic Architecture to Safeguard Cyber-Physical Control
Leaf Nodes and Protect Process Integrity
Nayana Teja Chiluvuri
ABSTRACT
Cyber-physical systems are networked through IT infrastructure and susceptible to malware.
Threats targeting process control are much more safety-critical than those against traditional
computing systems since they jeopardize the integrity of physical infrastructure. Existing
defense mechanisms address security at the network nodes but do not protect the physical
infrastructure if network integrity is compromised. An interface guardian architecture is
implemented on cyber-physical control leaf nodes to maintain process integrity by enforcing
high-level safety and stability policies.
Preemptive detection schemes are implemented to monitor process behavior and anticipate
malicious activity before process safety and stability are compromised. Autonomic properties
are employed to automatically protect process integrity by initiating switch-over to a verified
backup controller. Subsystems adhere to strict trust requirements that safeguard them
from adversarial intrusion. The preemptive detection schemes, switch-over logic, backup
controller, and process communication are all trusted components that are separated from
the untrusted production controller.
The proposed architecture is applied to a rotary inverted pendulum experiment and implemented
on a Xilinx Zynq-7000 configurable SoC. The leaf node implementation is integrated
into a cyber-physical control topology. Simulated attack scenarios show strengthened
resilience to both network integrity and reconfiguration attacks. Threats attempting to disrupt
process behavior are successfully thwarted by having a backup controller maintain process
stability. The system ensures both safety and liveness properties even under adversarial
conditions.
Dedication
I would like to dedicate my thesis to my great grandfather, Dr. Ranganadha Raju Mudundi.
A witness of Mahatma Gandhi, he retired after 60 years of service as a medical doctor in
Bhimavaram, India. He is respected for his generosity towards the poor and his equal
treatment of patients in a time when poverty and caste were prevalent social issues in India.
He continues to have an open outlook on life and society which empowers and motivates me
three generations down.
Dr. Ranganadha Raju pursued higher education in a time when opportunities and encouragement
were lacking, and his service as a doctor exemplifies all aspects of the Hippocratic Oath
to which he swore. He leads a life of utmost humbleness and simplicity, qualities I aspire
towards. His lifelong dedication to his profession, affection towards people, attitude towards
human welfare, importance to education, and simple yet idealistic lifestyle are inspirational.
In my life, I hope to look back with as much achievement and satisfaction as he does.
Acknowledgments
I greatly appreciate the academic and research guidance of my adviser, Dr. Cameron
Patterson, who initially gave me the opportunity to work on this project. His helpful feed-
back and support were valuable to the completion of my thesis. I would like to thank
Dr. William Baumann and Dr. Thomas Martin for participating on my academic advisory
committee and inspiring my work. This study was made successful by the support of my
colleagues: Omkar Harshe, Vivek Gopal, Christopher McCarty, and Pallavi Deshmukh.
I am extremely thankful to my parents and family for first introducing me to engineering
and supporting me in pursuing my passion throughout my academic career. Their affection
continues to motivate me to excel at all aspects of life. I am grateful for the continuous
technical mentorship of Dr. Nitin Patil and Deepak Patil who first recognized my passion
for electronics and gave me a summer internship in eighth grade. The experience I obtained
from working at their company has been invaluable and their entrepreneurial spirit continues
to stimulate my passion to this day.
I would also like to thank all of the friends, classmates, and roommates that I have
accumulated as an undergraduate and graduate student at Virginia Tech. They have made my
life in Blacksburg enjoyable and provided me with unforgettable memories and relationships
that I will cherish forever.
This material is based upon work supported by the National Science Foundation under
Grant Number CNS-1222656. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reflect the views
of the National Science Foundation.
Zedboards and design tools were donated by Xilinx, Inc.
Contents
1 Introduction
  1.1 Cyber-Physical Control
  1.2 Contributions
  1.3 Thesis Organization
2 Background
  2.1 CPS Security Vulnerabilities
    2.1.1 Network Integrity Attack Space
    2.1.2 Reconfiguration Attack Space
  2.2 Autonomic Systems
3 TAIGA Overview
  3.1 Autonomic Requirements
  3.2 Control Strategy
    3.2.1 Controllers
    3.2.2 Guards
    3.2.3 Trigger Mechanism
  3.3 Trust Requirements
  3.4 Architecture
    3.4.1 Isolation of Trust
    3.4.2 Inter-Module Communication
    3.4.3 TAIGA Transparency
4 Rotary Inverted Pendulum
  4.1 Process Control Telemetry
  4.2 Control Algorithm
  4.3 Pendulum Guards
  4.4 Trigger Mechanisms
    4.4.1 Trivial
    4.4.2 Linear Online Prediction
    4.4.3 Neural Network Classification
5 TAIGA Implementation
  5.1 Target Platform
    5.1.1 Processing System
    5.1.2 Programmable Logic
  5.2 Production Controller
    5.2.1 FreeRTOS
    5.2.2 Linux
  5.3 Backup Controller
  5.4 Queues
    5.4.1 Inter-Module Communication Protocol
  5.5 I/O Intermediary
    5.5.1 Robust Process Control
    5.5.2 Supervisory Control and Process Monitor
    5.5.3 Trigger Mechanism
    5.5.4 Watch Dog Timer
  5.6 Controller Multiplexer
6 Integration of TAIGA in Cyber-Physical Control
  6.1 Remote Terminal Unit
    6.1.1 Remote Monitoring and Control Server
  6.2 Human-Machine Interface GUI
    6.2.1 Monitoring
    6.2.2 Control
  6.3 Remote Surveillance
7 Results
  7.1 Resilience to Simulated Attack Scenarios
    7.1.1 Denial-of-Service Attack
    7.1.2 Set-Point Attack
    7.1.3 Deception Attack
  7.2 Execution Time and Control Latency
  7.3 Resource Utilization
8 Conclusions
  8.1 Scope of TAIGA
  8.2 Future Work
Bibliography
List of Figures
1.1 Elements of CPSes and the relationship between them.
2.1 Abstracted cyber-physical control components containing supervisory and plant control loops.
2.2 Three-dimensional network integrity attack space for cyber-physical control.
2.3 Hierarchical topology of a DCS or SCADA system.
3.1 The control modules associated with TAIGA.
3.2 Output-feedback control loop with state estimation.
3.3 Black box view of CPS leaf nodes with TAIGA.
3.4 TAIGA's realization on a configurable SoC.
4.1 Photograph of the Quanser rotary inverted pendulum setup.
4.2 Interface for telemetry between the regulatory control and infrastructure layers of the Quanser RIP experiment.
4.3 SPI data transfer between master and slave controllers.
4.4 Inverted pendulum setup and sign conventions for θ and α.
5.1 The hardware implementation of TAIGA on a Zynq-7000 configurable SoC.
5.2 Internal layout of the Zynq processing system.
5.3 FPGA architecture and fabric composition.
5.4 Execution trace resembling FreeRTOS scheduling of concurrent tasks.
5.5 Software implementation of the FIFO interrupt handler.
5.6 Idle loop and WDT interrupt service routine of the IOI.
6.1 Integration of the TAIGA leaf node into a cyber-physical control topology.
6.2 GUI for remote monitoring and control of the RIP.
6.3 Live camera feed of RIP for remote surveillance.
7.1 Digital oscilloscope capture of plant response to a simulated DoS attack at time T_DoS = 30 seconds.
7.2 Plant response to a simulated supervisory attack at time T_attack = 60 seconds.
7.3 Digital oscilloscope capture of servo voltage saturation at actuation limits, ±10 volts, during voltage sweep of ±15 volts.
7.4 Post-implementation Zynq FPGA resource usage with and without TAIGA.
List of Tables
4.1 SPI ICs in Quanser pendulum interface board.
4.2 Variables in the state vector of the inverted pendulum control experiment.
4.3 Safety-critical and operational guard definitions for α and θ.
5.1 Specifications for ZYBO Zynq-7000 development board.
5.2 Transmit and receive data channel signals of the interface between the AXI-Stream FIFO and FIFO generator blocks.
5.3 FIFO queue packet structure. The portions shaded gray are relevant only for a PLANT command.
5.4 Syntax of an input operational set-point command on the supervisory UART bus.
5.5 Packet composition of process data transmission on UART from IOI.
5.6 Flag transmitted on the UART bus to report TAIGA's trigger state.
5.7 Signals of the controller queue multiplexer allocated to their respective masters.
7.1 Execution time of RIP control sequence.
7.2 Execution time of IOI idle loop.
7.3 Estimated slack for standalone production and TAIGA implementations on Zynq.
7.4 Estimated power consumption for standalone production and TAIGA implementations on Zynq.
Acronyms
ADC Analog to Digital Converter
AMBA Advanced Microcontroller Bus Architecture
AMP Asymmetric multiprocessing
APU Application processor unit
AR Autonomic requirement
AXI Advanced eXtensible Interface
BRAM Block RAM
BSP Board support package
BUFG Global buffer
CLB Configurable logic block
CPS Cyber-physical system
DAC Digital to Analog Converter
DCS Distributed control system
DMA Direct memory access
DoS Denial-of-service
DSP Digital signal processing
EMIO Extended multiplexed I/O
FF Flip-flop
FIFO First-in first-out
FPGA Field-programmable gate array
FPU Floating point unit
GPIO General purpose I/O
GUI Graphical user interface
HDL Hardware description language
HLS High-level synthesis
HMI Human-machine interface
IC Integrated circuit
ICS Industrial control system
IOB I/O block
IOI I/O intermediary
IP Internet Protocol
IT Information technology
JTAG Joint Test Action Group
LMB Local-memory bus
LQG Linear-quadratic-Gaussian
LUT Lookup table
MIO Multiplexed I/O
MLP Multilayer perceptron
MMU Memory management unit
MSR Machine status register
OS Operating system
OSI Open systems interconnection
PCS Process control system
PL Programmable logic
PLC Programmable logic controller
PLL Phase-locked loop
PMU Phasor monitoring unit
PS Processing system
RIP Rotary inverted pendulum
RTOS Real-time operating system
RTU Remote terminal unit
SACIB Sensory and control interface board
SCADA Supervisory control and data acquisition
SoC System-on-chip
SPI Serial peripheral interface
TAIGA Trustworthy Autonomic Interface Guardian Architecture
TCP Transmission Control Protocol
TPM Trusted Platform Module
TR Trust requirement
UART Universal asynchronous receiver/transmitter
UDP User Datagram Protocol
USB Universal Serial Bus
WDT Watch dog timer
Chapter 1
Introduction
The 19th century industrial revolution sparked a transition in manufacturing processes from
manual labor to machines. Today, machines have become pervasive in society and industrial
applications. Modern-day industrial control systems (ICSes) have evolved with the progress
in technology and are heavily automated to reduce human labor and significantly optimize
performance. The information revolution gave rise to sophisticated electronics and the
Internet, which further increase the capability of machines. Current cars, consumer devices, and
even process control systems (PCSes) are networked through information technology (IT)
infrastructure to allow for reliable, timely, and unbounded exchange of information between
humans and machines.
1.1 Cyber-Physical Control
Cyber-physical systems (CPSes) are defined as large-scale heterogeneous systems that
encapsulate PCSes, are networked through IT infrastructure, and contain control loops for
governing a physical process [13]. The processes involved in CPSes vary in application and
can range from nuclear fission to home automation. The interaction with physical
infrastructure makes PCSes safety-critical for certain applications. The IT infrastructure for CPS
process telemetry introduces a diverse set of security implications. Telemetry between the
IT infrastructure and PCSes warrants stringent security measures, arguably more critical
than those of traditional cyber security.
Figure 1.1: Elements of CPSes and the relationship between them.
CPSes contain three functional elements—communication, computation, and control—as
shown in Figure 1.1 [31]. Embedded controllers arbitrate the interaction between these
three elements in CPSes. Existing security measures typically focus on addressing intrusion
through the communication channels of the system by monitoring the cyber and systems
relationships between the communication and computation elements, and between the
communication and control elements, as depicted in Figure 1.1. Violation of network channel
integrity allows the
embedded controllers in CPSes to be compromised with latent malware.
CPS leaf nodes enable computational and control element interaction. Typically, these nodes
contain the control loops that govern the physical process and present a higher risk to process
safety. In Figure 1.1, the computation and control elements are responsible for cyber-physical
control as they interact with cyber elements externally and internally actuate the physical
process.
Violating the integrity of the embedded cyber-physical controllers in CPS leaf nodes can
cause internal threats to jeopardize the safety of the physical process. As a result, security
must be addressed not only on the cyber relationships, but also on the physical relationships of
CPSes. Anomaly detection and network monitoring enable external threats to be identified
but do not address internal threats from latent malware that can be introduced
through the network. Indirect statistical detection methods may yield false positives or
negatives.
Applications of CPSes such as ICSes require system availability to ensure safety- and mission-
critical services even in the presence of attacks. This thesis presents an implementation of
a Trustworthy Autonomic Interface Guardian Architecture (TAIGA) that maintains safety
and liveness properties in the presence of internal and external threats.
1.2 Contributions
Aspects of autonomic systems are implemented in TAIGA for preemptive detection of
malicious process behavior to ensure physical process safety and stability. TAIGA is implemented
at the leaf nodes of CPSes in order to safeguard interaction with physical processes. The
supervisory telemetry channels are assumed to be untrusted in order to address the network
integrity attack space. TAIGA does not trust the production controller in order to safeguard
the system from malicious reconfiguration.
TAIGA is successfully realized on a configurable system-on-chip (SoC) platform for a rotary
inverted pendulum application. Heterogeneous computing is leveraged with multiple isolated
processors operating on asynchronous clocks to execute independent functions in parallel.
TAIGA is integrated into a system resembling conventional cyber-physical control. Simulated
attack scenarios show added resilience to network integrity and reconfiguration attacks.
1.3 Thesis Organization
The thesis initially provides relevant background information establishing the motivation
and research goals in Chapter 2. Chapter 3 provides an overview of TAIGA including its
various components and properties. The rotary inverted pendulum application is introduced
in Chapter 4 along with the control algorithm used in the implementation. Chapter 5 focuses
on the implementation of all of the elements within TAIGA. The integration of the TAIGA
architecture in cyber-physical control is presented in Chapter 6. The TAIGA implementation
is evaluated in Chapter 7. Chapter 8 concludes with what was achieved in the research and
establishes the scope of TAIGA as well as future research directions.
Chapter 2
Background
Cyber-physical controllers govern a physical process based on operator conditions specified
through IT communication networks. To achieve timely information exchange, computation,
and process control, CPSes are organized into multi-layer hierarchies that contain numerous
embedded control nodes with specific functionalities. Distributed control system (DCS) and
supervisory control and data acquisition (SCADA) systems are widespread organizations for
CPSes, especially in industrial control applications.
Figure 2.1: Abstracted cyber-physical control components containing supervisory and plant control loops.
Controllers within these systems are composed of numerous control loops, both external and
internal, for exchanging information and controlling the process. External control loops exist
between the computation elements and the physical process. Internal control loops provide
interaction between the computation and communication elements of CPSes to exchange
process information and operating conditions. Modern cyber-physical control is divided into
four main components as shown in Figure 2.1:
• The plant contains the physical process with sensors for measuring process state and
actuators for altering the process behavior.
• The supervisor is responsible for remotely monitoring process behavior and providing
operational conditions of the plant.
• A programmable logic controller (PLC) is a computational element that interacts with
the plant to sense process behavior and execute the control algorithm for actuation.
• A remote terminal unit (RTU) is a computational and communication element that
facilitates the exchange of information between the supervisor and PLCs.
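The division of labor above lends itself to a compact illustration. The following sketch, which is not from the thesis, shows the sense-compute-actuate scan cycle a PLC runs for one contained control loop; the PID gains, time step, and function names are illustrative assumptions.

```python
def pid_step(error, state, kp=2.0, ki=0.5, kd=0.1, dt=0.01):
    """One step of a textbook PID law; `state` carries the integral and
    previous error between scan cycles."""
    integral, prev_error = state
    integral += error * dt
    derivative = (error - prev_error) / dt
    output = kp * error + ki * integral + kd * derivative
    return output, (integral, error)

def scan_cycle(read_sensor, write_actuator, set_point, state):
    """One PLC scan: sense the process, compute actuation, act on it."""
    measurement = read_sensor()        # infrastructure layer: sensor
    error = set_point - measurement    # supervisor-provided set-point
    actuation, state = pid_step(error, state)
    write_actuator(actuation)          # infrastructure layer: actuator
    return state
```

In a real PLC this cycle repeats at a fixed period, with the RTU relaying set-point changes from the supervisor between scans.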
Communication between the entities illustrated in Figure 2.1 varies in both complexity and
locality. Typically, a PLC governs a contained individual process control loop, which may
only be a sub-process within the entire plant. Telemetry between the RTU and PLC is local
with respect to the plant. In some ICS organizations, the RTU aggregates information from
multiple PLC nodes and facilitates data exchange between these nodes for control loops
containing inter-node dependencies. Supervisory interaction with the RTU is more often
than not networked with IT infrastructure, which allows remote access through the Internet.
2.1 CPS Security Vulnerabilities
The complex cyber-physical control structure and interaction with unregulated or susceptible
IT infrastructure introduce CPS security vulnerabilities. Cyber threats targeting critical
processes can damage physical infrastructure, have economic repercussions, and even
endanger human life. Many existing CPSes are not designed with security in mind and are deficient
in even trivial protections. Adversaries attempt to exploit the lack of sufficient security
measures or vulnerabilities in the IT infrastructure to gain access to ICSes and cause damaging
behavior. CPS attacks can be separated into network integrity attacks, which exploit IT
network vulnerabilities, and reconfiguration attacks, which maliciously reconfigure embedded
platforms.
2.1.1 Network Integrity Attack Space
PCSes are adaptive in nature and typically require interaction with operators. The perpetual
need for remote actuation and monitoring necessitates constant networked process telemetry.
IT infrastructure is adopted by ICSes for efficient and timely flow of information between the
operators and the process, and is customarily used in DCS and SCADA systems. Both DCS and
SCADA systems contain IT networks to interface the supervisory layer, which contains the
operators, to the control layer, which contains the PCSes. Network integrity violations allow
adversarial communication with the PCS. Attacks on the network integrity are represented
in the literature with a three-dimensional attack space as illustrated in Figure 2.2 [24].
Figure 2.2: Three-dimensional network integrity attack space for cyber-physical control.
The attack space maps network integrity attacks based on the types of resources available
to the adversary:
• Disclosure resources allow visibility of confidential sensor and actuator data to the
adversary.
• Disruption resources enable physical intervention and control of the system by the
adversary through transmission of sensor and controller data.
• System knowledge about the controller and the behavior of the physical process increases
the overall stealthiness of an attack.
The primary goal of an attacker is to disrupt the behavior of an autonomous process without
the CPS detecting or responding to the attack. With more resources available, an adversary
is not only able to execute a more effective attack, but also able to better conceal the attack.
Detection schemes for network integrity attacks mirror the traditional detection schemes of
IT cyber attacks. Both detection and response schemes in this realm are heavily researched
and implemented in CPSes [4].
With the availability of disruption resources, an adversary is able to execute a denial-of-service
(DoS) attack. As is the case with IT systems, CPS DoS attacks initially target the
communication interfaces and deactivate supervisory monitoring and interaction with the
PCSes. Once the adversary penetrates the CPS, other system functions or physical processes
are disabled [20]. Typically, DoS attacks are detected by the supervisor due to the lack of
communication or response from the attacked nodes in the system.
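This detection-by-silence pattern can be sketched as a heartbeat timeout check on the supervisor; the class, node identifiers, and two-second timeout below are illustrative assumptions rather than anything specified in the thesis.

```python
import time

class HeartbeatMonitor:
    """Flags nodes as potentially under DoS attack when no telemetry
    has arrived within `timeout` seconds (illustrative sketch)."""

    def __init__(self, timeout=2.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock        # injectable clock, eases testing
        self.last_seen = {}

    def record_packet(self, node_id):
        """Call whenever any telemetry arrives from a node."""
        self.last_seen[node_id] = self.clock()

    def unresponsive_nodes(self):
        """Nodes whose silence has exceeded the timeout."""
        now = self.clock()
        return [node for node, seen in self.last_seen.items()
                if now - seen > self.timeout]
```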
Disclosure resources coupled with disruption resources enable an adversary to disguise
intrusion into a CPS. In a replay attack on the network integrity channels, an adversary
hijacks the sensor and actuator values and re-transmits them to the supervisory system
while executing a disruptive attack on the plant. The supervisor responds to past data that
does not reflect the potentially dangerous behavior of the plant's current state. Detection
schemes for such an attack look for anomalies in the system operation or network channels
using methods such as a χ² detector [19].
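A windowed χ² detector of the kind cited here can be sketched as follows; the window size, noise variance, and threshold are illustrative assumptions (the threshold would normally be taken from the χ² distribution for a chosen false-alarm rate), not values from the thesis or [19].

```python
from collections import deque

class ChiSquaredDetector:
    """Sums variance-normalized squared innovations (measured minus
    predicted output) over a sliding window and alarms when the sum
    exceeds a threshold. Replayed or falsified telemetry that diverges
    from the model's prediction inflates the statistic."""

    def __init__(self, noise_var, window=10, threshold=18.3):
        # 18.3 is roughly the 95th percentile of a chi-squared
        # distribution with 10 degrees of freedom (illustrative).
        self.noise_var = noise_var
        self.threshold = threshold
        self.residuals = deque(maxlen=window)

    def update(self, measured, predicted):
        """Record one normalized squared residual; True means alarm."""
        self.residuals.append((measured - predicted) ** 2 / self.noise_var)
        return sum(self.residuals) > self.threshold
```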
Availability of system knowledge enables an adversary to engineer more stealthy attacks such
as a zero-dynamics or bias-injection attack [24][25]. These covert attacks attempt to evade
detection mechanisms whose details are disclosed to the adversary through significant system
knowledge. Vulnerabilities within the detection schemes themselves are exploited and the
system is pushed to its limits [4]. A well-designed covert attack adapts to the behavior of the
process and responds to the security safeguards that are implemented.
Although physical process integrity is not jeopardized, the presence of only disclosure
resources allows an adversary to eavesdrop and collect confidential process data. Plant response
and control parameters are extracted from the collected data by offline machine learning
algorithms. For example, data mining and decision trees are employed in phasor monitoring
units (PMUs) to compute real-time state estimation and provide critical feedback for power
plant operators [3]. Similar tactics on collected process data can be used to model plant
behavior, allowing an adversary to learn about the system and craft more covert attacks.
Aurora Vulnerability Case Study
In 2007, a cyber-physical network integrity attack, the Aurora vulnerability, was
experimentally demonstrated by Idaho National Laboratory. Adversarial red-team access to a US
Department of Energy diesel generator was gained by exploiting the lack of security in
network communication protocols such as Modbus. Modbus is pervasive in existing electrical
grid equipment and does not support even simple security methods such as authentication.
Once access is gained, the attack opens and closes circuit breakers out of synchronization with
the grid, causing elevated electrical and mechanical stress on the generator. Typical circuit
breakers contain synchronism checks that prevent out-of-synchronism closing. Aurora takes
advantage of the time delay between recognizing the out-of-synchronism relay closing and
the protective response to execute the attack and cause irreparable physical damage [32].
A dramatic video of the attack shows smoke coming out of the generator and severe damage
to the generator itself [32]. Continued execution of the attack has the potential to cause a
generator explosion. The distributed nature of electrical power grids causes even deeper
concern, since destruction of a single generator can have a cascading effect on additional nodes
in the power system, resulting in widespread disruption.
2.1.2 Reconfiguration Attack Space
CPSes contain various computing system layers organized in hierarchical schemes specific to
the application. Industrial control applications conventionally use DCS or SCADA systems
which contain a defined set of subsystems and protocols for communication between the
systems. The topology of a typical DCS or SCADA system is represented in Figure 2.3.
Such a system has three layers of control and computation:
• The infrastructure layer contains sensors, actuators, and the physical processes.
• The regulatory control layer includes the embedded controllers that automatically govern
the plant through interaction with the sensors and actuators of the infrastructure
layer via control loops between the two layers.
• The supervisory control layer allows operators to send control commands to the regulatory
controllers as well as monitor the system with human-machine interfaces (HMIs).
Figure 2.3: Hierarchical topology of a DCS or SCADA system.
Components in the regulatory control layer — RTUs and PLCs — are customarily embedded
systems that interact with the supervisor through IT network telemetry. A similar IT
network channel exists for embedded platform reconfiguration intended for firmware updates
and performance optimization. The IT network telemetry with these regulatory controllers
makes them susceptible to malicious reconfiguration that not only violates the integrity of the
network channel, but also threatens the integrity of the physical process, since the regulatory
layer interacts directly with components of the physical infrastructure.
Security measures to strengthen network integrity are implemented within the regulatory
controllers. Malicious reconfiguration may revoke the security barriers for network integrity
attacks in addition to disrupting controller operation. As a result, the reconfiguration attack
space subsumes the network integrity attack space and is much more critical. The ability to
adversely reconfigure controllers in the regulatory layer allows an attacker to not only evade
detection and response schemes, but also hijack the controller and alter its overall function.
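A minimal countermeasure at the reconfiguration interface can be sketched as cryptographic verification of firmware images before an update is accepted. The HMAC scheme and key handling below are illustrative assumptions, not the approach taken in this thesis, which instead isolates and distrusts the reconfigurable production controller; a real deployment would typically use asymmetric signatures so the device never holds a signing key.

```python
import hashlib
import hmac

def tag_firmware(image: bytes, key: bytes) -> bytes:
    """Compute the HMAC-SHA256 tag an authorized builder would ship
    alongside a firmware image."""
    return hmac.new(key, image, hashlib.sha256).digest()

def accept_update(image: bytes, tag: bytes, key: bytes) -> bool:
    """Accept a reconfiguration request only if the image's tag
    verifies; compare_digest avoids timing side channels."""
    return hmac.compare_digest(tag_firmware(image, key), tag)
```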
Stuxnet Case Study
A CPS reconfiguration attack is the most critical and covert. However, gaining system
reconfiguration capabilities more often than not requires gaining network access. Typically,
reconfiguration attacks are a result of multiple cumulative network integrity attacks that
advance an adversary’s intrusion into the system until reconfiguration resources are available.
Stuxnet is a prime example of such an attack.
The Stuxnet worm is perhaps the most sophisticated in the cyber-physical attack realm.
Originally designed to target Iranian nuclear facilities, the worm infected Siemens PLCs.
The virus exploits vulnerabilities in the Microsoft Windows operating system and initially
propagates through the IT network of the nuclear fuel enrichment facility, exploiting four
different zero-day flaws to gain elevated adversarial privileges [12]. The worm ultimately
infects Siemens software and reconfigures the PLCs controlling nuclear centrifuges. Once
reconfigured, all safety precautions, detection schemes, and security barriers are disabled [4].
The virus caused the uranium enrichment centrifuges to tear themselves apart [12].
Reportedly, Stuxnet destroyed up to one fifth of Iran’s enrichment centrifuges. As nuclear
facilities are considered one of the most safety critical infrastructures, the attack raises signif-
icant concern regarding the protection of CPSes. The architectural complexity, robustness,
and sheer brilliance of the Stuxnet worm motivate cyber-physical security professionals and
researchers to this day [12].
The network integrity and reconfiguration attack spaces raise significant concern about pro-
tecting critical infrastructure. The Aurora vulnerability and Stuxnet are just two examples
of attacks that have been successfully executed and showed the physical destruction that can
result from cyber attacks. The Stuxnet virus is classified by some professionals as a nation-
state attack. Such attacks and threats have raised significant national security concerns.
On February 12, 2013, United States President Barack Obama issued an executive order
titled “Improving Critical Infrastructure Cybersecurity” [21]. The order addresses the con-
cerns of cyber-physical attacks and states “national and economic security of the United
States depends on the reliable functioning of the Nation’s critical infrastructure in the face
of such threats” [21]. An implicit call for action is stated in this executive order to ad-
dress the cyber-physical threats that could jeopardize critical American infrastructure. Such
attacks are classified as a modern form of warfare that threatens targets on American soil.
2.2 Autonomic Systems
Traditionally, the cybersecurity model has been based on thwarting known attacks. Typi-
cal anti-virus software for IT infrastructure maintains databases of discovered viruses and
searches systems for footprints of such viruses. While firewalls attempt to keep intruders out,
there are always new exploits that make systems vulnerable. Cybersecurity engineers have a
never ending list of zero-day exploits that must be addressed within their systems. A similar
model is present in modern day CPSes where detection schemes and response methods are
implemented once exploits are discovered. While attacks on conventional IT infrastructure
may disrupt the flow of information or restrict access to confidential information, attacks
on CPSes can cause irreparable physical damage. Hence a reactive approach to CPS attacks
is undesirable.
Autonomic computing, first described in a 2001 IBM manifesto, relies on self-managing
resources that adapt to unforeseen circumstances without user intervention [10].
These systems are inspired by the human body’s autonomic nervous system which controls
bodily functions without conscious intervention. Such systems make decisions independently
using high-level policies. According to IBM, an autonomic system contains four properties
for self-management [10]:
• Self-configuring: High-level policies arbitrate automated and seamless configuration
of components in a system.
• Self-optimizing: Resources are monitored and automatic control ensures the most
efficient functional operation of the system.
• Self-healing: Software and hardware faults and abnormalities are automatically de-
tected and rectified.
• Self-protecting: The system proactively anticipates threats that jeopardize the func-
tional operation of the system and defends against system failure.
Typically, autonomic system properties are realized with a large set of closed control loops
that monitor a specific hardware or software resource and ensure the system maintains the
relevant parameters within a specified range. With respect to cybersecurity, the self-healing
and self-protecting properties of autonomic systems are the most relevant in protecting a
system from attacks while responding and recovering from malicious intrusion. Similar to
how the human body combats viruses and infections, an ideal autonomic system will protect
itself from adversarial intrusion that threatens the health of the system.
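The closed control loops described above can be sketched as a single monitor-and-act cycle. The policy bounds and action names below are illustrative placeholders, not taken from any particular autonomic framework:

```python
# One autonomic control loop: sense a resource, compare it against a
# high-level policy, and act without operator intervention.

POLICY = {"low": 20.0, "high": 80.0}  # hypothetical high-level policy bounds

def monitor_and_act(reading, policy):
    """Return the corrective action for one loop iteration."""
    if reading < policy["low"]:
        return "increase"   # self-optimizing: push the resource back up
    if reading > policy["high"]:
        return "throttle"   # self-protecting: defend against overload
    return "none"           # within policy: no intervention needed
```

A full autonomic system would run many such loops concurrently, one per monitored hardware or software resource.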
Chapter 3
TAIGA Overview
The primary objective of TAIGA is to preemptively detect malicious process behavior and
maintain plant safety and stability. As a result, TAIGA contains methods not only for
detecting attacks, but also for responding to and recovering from them.
Prior work presents the use of formally verified, application-specific hardware for monitoring
system operation in real-time at the lowest I/O pin level [15]. Leveraging run-time prediction
allows for forecasting the behavior of the plant and detecting a reconfiguration attack [16].
Initially, plant I/O and trusted components of the predictive architecture are implemented
in configurable hardware using high-level synthesis (HLS) [17].
TAIGA is introduced as an intermediary for controller I/O and isolates untrusted components
prone to malicious configuration by harnessing the advantages of a commercially available
configurable SoC [7]. TAIGA has evolved to not only isolate trust, but also provide a
generalized autonomic architecture that monitors plant behavior and supervisory commands
to preemptively respond to reconfiguration or network integrity attacks before they disrupt
process behavior [5]. TAIGA recognizes that an attack might have manifested itself within an
embedded platform and safeguards the system as a last line of defence before the attack
jeopardizes plant safety and stability.
3.1 Autonomic Requirements
Autonomic enforcement of high-level policies can help ensure plant safety and stability. The
self-protecting and self-healing nature of these systems enables adaptive defense mechanisms
against threats jeopardizing the operation of physical processes by not only detecting attacks
but also responding to and recovering from them.
In order to effectively govern the properties of autonomic computing, the system must be
implemented to exhibit the following characteristics and abilities which are formally defined
as the autonomic requirements (ARs):
AR1 Awareness is the ability to sense the operational parameters of the system that are
bounded with high-level policies.
AR2 Adaptive systems contain resources that can be functionally and operationally re-
configured based on the spatial and temporal context.
AR3 Automatic systems are self-contained in that the monitoring and reconfiguration is
initiated without any manual intervention.
3.2 Control Strategy
TAIGA incorporates three distinct control modules: the production controller, backup con-
troller, and trigger mechanism. These elements, illustrated in Figure 3.1, are isolated and
run concurrently within TAIGA to allow malicious process operation to be detected and
control of the plant to be regained automatically without violating operational limits. The
operational thresholds are defined as guards for the plant and are used to differentiate expected
and malicious operation.
Initially, the process is governed by the production controller as shown in Figure 3.1. The
trigger mechanism tracks the plant’s behavior in real-time by probing the telemetry between
the controller and plant. If the plant behavior is anticipated to violate specified guards, the
system switches to a trusted backup controller that provides stability and is not susceptible
to reconfiguration. The ability to automatically recognize malicious behavior and initiate
switch-over to reliable operation satisfies AR3.
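The switch-over decision just described can be summarized in a few lines. The `guard_ok` predicate and the latching behavior here are sketched assumptions; the actual trigger logic is detailed later in this chapter:

```python
def select_controller(forecast_states, guard_ok, using_backup):
    """Choose the controller for the next cycle. Once the backup is in
    charge it stays in charge: the untrusted production controller is
    never automatically restored."""
    if using_backup:
        return "backup"
    for x in forecast_states:
        if not guard_ok(x):     # anticipated guard violation
            return "backup"     # trigger asserted: switch over
    return "production"
```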
Figure 3.1: The control modules associated with TAIGA.
3.2.1 Controllers
The objective of a PCS is to maintain the physical process as close to the desired operating
conditions as possible. In industrial control applications, a feedback loop is typically used to
determine the discrepancy between the actual plant state and the desired state. A control
algorithm generates an actuating signal to compensate for this variation. Adaptive con-
trol algorithms adjust to the process behavior by altering control and actuator parameters
dynamically based on the feedback response of the plant. While adaptive controllers are
occasionally used in ICSes, they are much less common and contain a high degree of uncer-
tainty within the control parameters; TAIGA only considers conventional output-feedback
control loops.
An output-feedback controller, as depicted in Figure 3.2, is pervasive within ICS applications.
More often than not, the control algorithm requires a wider set of process states than those
that are provided through the sensors within the infrastructure layer. The state-estimation in
the feedback loop determines the larger set of states based on the known sensory values and
the expected dynamics of the plant. Figure 3.2 represents the control loop of the production
and backup controller.
Figure 3.2: Output-feedback control loop with state estimation.
Production Controller
The production controller exists within a traditional PLC in the leaf nodes of a CPS. The
need for optimization and controller updates makes this node prone to covert reconfiguration
attacks. The production control algorithm is optimized for performance: the control
parameters of f_C(r_k, x_{k+1}) are tuned aggressively to minimize the discrepancy between r_k
and y_k while optimizing controller response time. This controller is responsible for meeting
the performance and throughput specifications of the plant.
Backup Controller
In contrast, the backup controller is a high-assurance controller that is verified to provide
stable and reliable plant operation. This control algorithm instance trades off performance
for assurance. While the production controller may undergo numerous performance updates,
the backup controller is the initial “factory” controller that is verified to not have any latent
malware or malicious operation. Typically, the various layers of complexities are stripped
down in the implementation of the backup controller to ensure internal threats are not
present. The purpose of the backup controller is not to optimize plant deliverables, but to
preserve the safety and stability of the system.
3.2.2 Guards
The ability to preemptively detect malicious and disruptive process behavior depends on the
accurate identification and definition of guards for the physical process. Guards identify the
operational or safety bounds of the system and are categorized as follows:
• Safety critical guards define the limits at which a system may operate while ensuring
safety. In a conventional PCS, violation of a safety critical guard requires drastic
measures usually involving auxiliary system intervention to bring the system back to
safety.
• Operational guards define the normal limits of system operation. These guards can
be designed to prevent a safety critical guard violation from occurring under normal
operation.
Guards are application-specific and are used to enforce the parameters of an application’s
physical process. Since the primary purpose of TAIGA is to ensure process safety and sta-
bility, the operational guards must bound the system within a region of safe and stable
operation. However, covert attacks can sometimes maintain safe and stable plant operation
but degrade the performance of the plant. Performance guards define the thresholds of per-
formance for the physical process to meet the plant requirements and deliverables. Enforcing
performance guards in addition to the operational guards is suitable for application domains
in which performance degradation of the controller causes detrimental effects to the system
but does not threaten the plant’s safety.
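As a concrete (and entirely hypothetical) example, the two guard categories can be expressed as nested bounds, with the operational guard strictly inside the safety-critical one so that normal operation never reaches the safety limit:

```python
SAFETY_LIMIT = 30.0       # hypothetical safety-critical bound (degrees)
OPERATIONAL_LIMIT = 15.0  # hypothetical operational bound, strictly inside

def safety_ok(angle_deg):
    """Safety-critical guard: violation demands drastic intervention."""
    return abs(angle_deg) <= SAFETY_LIMIT

def operational_ok(angle_deg):
    """Operational guard: designed to trip before safety is ever at risk."""
    return abs(angle_deg) <= OPERATIONAL_LIMIT
```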
3.2.3 Trigger Mechanism
The trigger mechanism is responsible for initiating the switch-over from production to backup
control in the presence of a disruptive attack. More often than not, industrial processes
contain non-linearities and are inherently unstable. Initiating switch-over to the backup
controller once a guard has been violated does not ensure plant safety and stability. As a
result, the trigger mechanism anticipates the future plant behavior and initiates switch-over
to the backup controller if the plant’s trajectory shows disruptive behavior in the future. This
ensures a safe recovery to process stability under the governance of the backup controller.
Monitoring physical process sensors to detect faults and switch-over from a high-performance
controller to a high-assurance controller is considered by Sha [23]. In Sha’s architecture, de-
cision logic ensures that the plant governed by the high-performance controller stays within
the stable envelope of the high-assurance controller. In Sha's scheme, however, the fault is
detected only after the process has deviated from the allowed stable envelope, so the plant's
response during switch-over endangers the recovery of stability [16]. In contrast, TAIGA's
trigger mechanism forecasts the trajectory of the plant based on the current state vector to preemptively
detect a guard violation and effectively maintain the system within the operational bounds
during and after the switch-over process.
The trigger mechanism generates its decision from a state vector representing the plant's
current operation, derived exclusively from physical sensor measurements and state
estimation, which it uses to forecast the future behavior of the plant. The advantage of the
trigger mechanism is its ability to forecast the plant's future tendency and therefore preemptively detect
malicious behavior. In order to accurately and reliably forecast the plant’s tendency within
the TAIGA framework without false positives or negatives, the physical process must satisfy
the following attributes:
• The state vector is derived exclusively from the physical sensor measurements.
• A plant model accurately describes the dynamics of the physical process.
With the first attribute, the trigger mechanism is sufficiently aware of the physical process to
anticipate malicious behavior and satisfy AR1. A plant model is required to foresee disruptive
operating conditions. With an accurate plant model, the plant behavior is forecast using
two possible methods: online prediction and classification.
Prediction
The plant model connected to an instance of the control algorithm is run faster than real time
to estimate the plant’s trajectory in the online prediction method. For the current physical
state of the plant, the prediction method iterates several control cycles into the future to
anticipate the trajectory of the plant under the governance of the backup control algorithm.
If a guard is violated in a future iteration, the trigger mechanism initiates switch-over to the
backup controller.
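The prediction loop can be sketched as follows, where `step` stands in for one simulated control cycle of the plant model under the backup control law (the toy model in the usage note is illustrative, not the pendulum dynamics):

```python
def predict_violation(x0, step, guard_ok, horizon):
    """Run the plant model faster than real time under the backup control
    law; return True if any forecast state violates a guard."""
    x = x0
    for _ in range(horizon):
        x = step(x)             # one simulated control cycle into the future
        if not guard_ok(x):
            return True         # forecast violation: switch over preemptively
    return False
```

For example, with a toy unstable scalar model `step = lambda x: 1.1 * x` and guard `abs(x) <= 10`, a 30-cycle horizon flags an eventual violation that a 5-cycle horizon misses.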
Prediction is computationally intensive and requires a reasonably accurate plant model,
which is not always possible. Linear models may not accurately simulate the plant behavior
while the implementation of complex non-linear models is not feasible on embedded systems
due to computational limitations. As a result, an alternative to online prediction is sought.
Classification
Deciding whether the process will remain in a safe operating region can be considered a
classification problem. Machine-learning methods are used offline with simulated state data
from the plant model to bound the process in a region of safe return for each given state.
The classifier determines whether the system can regain safe operation under the governance
of the backup controller for the current process state vector. The instant the process state
vector is classified as operating in an unsafe region, the classifier-based trigger mechanism
initiates switch-over to the backup controller. The classifier is tuned offline using simulated
data. As a result, computational complexity is not an inhibiting limitation. Complex
non-linear models can also be used for more accurate classification.
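The division of labor between offline training and the cheap online check can be sketched with a deliberately trivial stand-in for the machine-learning step (a real classifier would be trained on simulated multi-dimensional state trajectories):

```python
def train_threshold(samples):
    """Toy stand-in for offline learning: from simulated (state, recovered)
    pairs, keep the largest |state| from which simulation showed the
    backup controller still recovers."""
    recovered = [abs(x) for x, ok in samples if ok]
    return max(recovered) if recovered else 0.0

def in_safe_region(x, threshold):
    """Cheap online check performed once per control cycle."""
    return abs(x) <= threshold
```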
3.3 Trust Requirements
CPS leaf nodes contain the embedded platforms for production control of physical processes.
As illustrated in Figure 3.3a, these platforms are susceptible to both network integrity at-
tacks as well as malicious reconfiguration. Traditionally, the production controller interacts
directly with the physical process. The internal and external threats to this controller jeop-
ardize the safety of the physical process since attacks can cause disruptive plant actuation.
As a result, TAIGA is introduced as an intermediary between the production controller and
the physical process as shown in Figure 3.3b. TAIGA operates on the assumption that the
production controller may have malicious code, and is therefore untrustworthy. The physical
process is protected from disruptive actuation since TAIGA components are not susceptible
to reconfiguration and the supervisory nodes do not have the capabilities to directly actuate
the process. An abstracted black box view of TAIGA is shown in Figure 3.3 containing just
the supervisory and control layers. Typically, there are multiple layers in between the leaf
and supervisory nodes such as RTUs as shown in the CPS topology in Figure 2.3.
(a) Conventional CPS leaf nodes.
(b) Leaf nodes with TAIGA.
Figure 3.3: Black box view of CPS leaf nodes with TAIGA.
The TAIGA black box in Figure 3.3b contains the backup controller, the I/O intermediary
(IOI) module, switch-over logic, and peripherals for communication with the production
controller, supervisor, and physical process. The trigger mechanism used to enforce the
guards on the system is located within the IOI. Trust is essential in the implementation
of these TAIGA modules to ensure robust operation of the system and prevent malicious
intrusion. Formal trust requirements (TRs) for each of the trusted components in TAIGA
are defined by Lerner [14]:
TR1 The source code and implementation for the entire component are analyzed.
TR2 The component uses private hardware resources for computation, internal commu-
nication, and memory, and does not invoke external components as sub-functions.
TR3 All external communication with untrusted components is through hardware-implemented,
bounded, and isolated queues.
TR4 The component cannot be bypassed or disabled, and has a fixed repertoire of essen-
tial services, such as I/O or cryptography.
TR5 Critical functionalities of the component, such as rule checking logic, cannot be
updated without provably secure or physical access.
The only commercial security apparatus fulfilling all five of these TRs is a Trusted Platform
Module (TPM), used primarily as a secure cryptoprocessor [27]. Similarly high standards
of trust are instituted within the trusted elements of TAIGA to prevent corruption. The
separation of trust between the production controller and the trusted elements of TAIGA
ensures that the trusted elements will operate correctly regardless of what happens to the
production controller.
3.4 Architecture
Architecturally, TAIGA mandates two capabilities. In order to switch between the pro-
duction and backup controller under the presence of an attack, the architecture must be
adaptive and satisfy AR2. Secondly, the trusted components of TAIGA must satisfy all of
the TRs and be isolated from the untrusted production controller. Figure 3.4 illustrates
TAIGA realized on a configurable SoC platform. The production and backup controllers
host the two instances of the control algorithm for actuating the process. The IOI hosts
the trigger mechanism that monitors the physical process and identifies malicious activity.
A trigger is asserted when malicious process behavior is anticipated; an asserted trigger
switches governance of the plant from the production to the backup via the controller queue
multiplexer.
Figure 3.4: TAIGA’s realization on a configurable SoC.
A configurable SoC is well suited for an autonomic system since the field-programmable
gate array (FPGA) fabric can be customized to adapt automatically based on adversarial
conditions while satisfying the ARs and TRs. TAIGA is implemented in a configurable SoC
to isolate trust, restrict inter-module communication, and maintain TAIGA’s transparency.
3.4.1 Isolation of Trust
The production controller, backup controller, and IOI host processes and algorithms that
are both computationally and arithmetically intensive. As a result, they are run in micro-
processors that can execute compiled software rather than low-level logic. The isolation of
trust is ensured by separating hardware resources between the production controller and the
trusted entities of TAIGA. Figure 3.4 shows the production controller implemented in the
dedicated processing cores of the configurable SoC. These processors have their own RAM,
cache, and peripheral controllers. Similarly, the backup controller and IOI are realized in
soft-core processors that also have isolated memory and peripheral resources instantiated
within the FPGA fabric. Cross-access between the hard-core and soft-core resources is
explicitly not permitted by the configurable SoC, thereby satisfying TR2.
Configurable SoCs typically contain a processing system (PS), which hosts the primary
processing cores, and the programmable logic (PL) which contains the FPGA fabric. The
separation of trust illustrated in Figure 3.4 bisects these two systems. The reconfiguration
network has access only to the processing system and can modify applications that are
executed within the processing cores. The PL is hardware-defined and does not include any
ports or methods for remote reconfiguration. The backup controller and IOI, the two trusted
entities of TAIGA, can only be reconfigured through physical access to the platform, which
satisfies TR5.
3.4.2 Inter-Module Communication
In order to limit the scope for malicious reconfiguration and communication, telemetry with
trusted entities of TAIGA is restricted. The inter-module communication of the three pro-
cessing blocks described in Figure 3.4 is limited to queues. These queues are implemented
within the FPGA fabric, and do not allow external tampering from the production con-
troller. Furthermore, both the production and backup controllers have their own set of
isolated queues such that no communication resources are shared between the untrusted and
trusted elements. The controllers communicate with the IOI over the queues using a pre-
defined protocol that limits the scope of interaction. This implementation of inter-module
communication satisfies TR3.
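TAIGA's queues are hardware-implemented, but their essential behavior can be modeled in software. The sketch below captures the two properties TR3 relies on, fixed capacity and no shared state outside the queue itself; the class and method names are illustrative:

```python
from collections import deque

class BoundedQueue:
    """Software model of a hardware queue: fixed capacity, one writer,
    one reader, no shared state beyond the queue itself."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._q = deque()

    def put(self, word):
        if len(self._q) >= self.capacity:
            return False        # full: the write is refused, never overflows
        self._q.append(word)
        return True

    def get(self):
        return self._q.popleft() if self._q else None  # empty: nothing read
```

Because the writer can only append and the reader can only remove, a compromised production controller cannot corrupt the trusted side through this channel; at worst it fills its own queue.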
The IOI module hosts the trigger mechanism and is responsible for monitoring process be-
havior, detecting malicious activity, and initiating switch-over to the backup controller. All
interaction with the supervisor and physical process is channeled through the IOI module
as shown in Figure 3.4. In order to interact with the plant, the production controller must
communicate via the IOI using queues. TR4 is satisfied since the IOI cannot be bypassed
and autonomically determines process control between the production and backup controller.
3.4.3 TAIGA Transparency
The black box representing TAIGA in Figure 3.3b contains the trusted components of
TAIGA: the backup controller, the IOI, and the controller multiplexer. From the perspective
of the production control algorithm, these TAIGA components are transparent as interaction
with the physical process is not bypassed or intercepted by the I/O intermediary module,
but merely probed and arbitrated. Typically in a PCS, the low-level peripheral controllers
responsible for process sensing and actuation are implemented within the production con-
troller itself. These low-level drivers are relocated to the IOI with the TAIGA framework
as the production controller uses queues rather than direct sensor and actuator telemetry to
interact with the process. Scalability of the production controller is ensured since only the
low-level drivers of the production controller need to be modified for queue communication.
Chapter 4
Rotary Inverted Pendulum
The rotary inverted pendulum (RIP) experiment is a classical electro-mechanical controls
challenge which incorporates nonlinearity, stability, actuation limits, and noise. These con-
cerns are representative of those found in industrial control applications. The table-top setup
of the RIP experiment makes it an ideal fit for evaluating TAIGA. The Quanser RIP system
is used [22].
The Quanser RIP system contains two electrical subsystems: the linear voltage amplifier and
the rotary servo base unit. The RIP base unit, shown in Figure 4.1, contains a servo motor
for actuating the arm of the pendulum radially, and two high-resolution optical encoders
for sensing the pendulum and servo arm positions. The base unit also contains an analog
potentiometer for sensing the servo arm position, but it is not used in this experiment
since the digital encoder achieves the same purpose.
The linear voltage amplifier is responsible for driving the servo motor and accepts a voltage
within the range of ±10 volts for rotating the servo motor in the clockwise or counterclockwise
directions. Each of the optical encoders contains two digital signals: channel A and channel
B. These signals toggle at each encoder tick based on the rotation of the encoder shaft. The
high-resolution encoders contain 4096 counts per revolution. The phase differential between
the two encoder channels and the toggle frequency of the digital signal are used to determine
the direction and implicit velocity of the encoder respectively.
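The quadrature decoding just described can be sketched as a lookup over valid channel transitions. The sign convention (which phase ordering counts as positive rotation) is an arbitrary choice here:

```python
# Valid Gray-code transitions of the (A, B) channel pair; each is one
# signed tick of the 4096-count encoder.
STEP = {
    (0b00, 0b01): +1, (0b01, 0b11): +1, (0b11, 0b10): +1, (0b10, 0b00): +1,
    (0b00, 0b10): -1, (0b10, 0b11): -1, (0b11, 0b01): -1, (0b01, 0b00): -1,
}

def count_ticks(samples):
    """Accumulate signed ticks from successive (A << 1) | B samples; the
    phase ordering gives direction, the sample rate gives velocity."""
    ticks = 0
    for prev, cur in zip(samples, samples[1:]):
        ticks += STEP.get((prev, cur), 0)   # unchanged/invalid pairs add 0
    return ticks
```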
Figure 4.1: Photograph of the Quanser rotary inverted pendulum setup.
4.1 Process Control Telemetry
Physical process sensors and actuators do not directly interface to the controllers in the reg-
ulatory layer of CPSes. Typically, the interaction between the regulatory and infrastructure
layers goes through defined telemetry interfaces. The discrepancy between process control
commands and the telemetry capabilities of the embedded controller is addressed by this
interface. Modern embedded systems operate on digital logic rather than analog for com-
munication and transmission of information with external entities. Various digital protocols
exist for bidirectional communication between embedded controllers and peripherals. Some
pervasive embedded peripheral protocols include I2C, SPI, CAN, and UART; these protocols
are defined by established standardization bodies and favor certain applications over others.
For the Quanser RIP, an external sensory and control interface board (SACIB) is designed
for telemetry between the embedded controller and the physical pendulum process. The
high-level interface between the pendulum process and embedded controller is illustrated in
Figure 4.2. TAIGA assumes integrity of all entities within the physical infrastructure layer.
In the RIP application, the serial peripheral interface (SPI) bus and SACIB are trusted and
physically protected with perimeter security measures.
Figure 4.2: Interface for telemetry between the regulatory control and infrastructure layers of the Quanser RIP experiment.
Serial Peripheral Interface
SACIB communicates with the embedded controller via a SPI bus and can sense and actuate
the pendulum. SPI is a synchronous serial communication protocol originally developed
by Motorola [26]. It contains one master device that governs the bus and multiple slaves
each with a dedicated slave select (SS) signal. A slave is enabled for communication by
asserting its corresponding slave select (active low); only one slave can be enabled at a time.
Communication in the SPI protocol is achieved with three signals [26]:
• SCK is the serial clock generated by the SPI master and corresponds to the data rate
of the serial communication.
• MOSI is the output data from the master and the input to the slave.
• MISO is the input data to the master and the output from the slave.
The SPI bus acts as an inter-chip circular buffer between the master and slave devices as
illustrated in Figure 4.3. On each SCK clock cycle, a bit from the master device is transmitted
on the MOSI signal line while a bit from the slave device is transmitted on the MISO signal
line. Internally, the buffer of bits in the master and slave SPI transfer registers are bit
shifted so that words are transmitted with multiple clock cycles. The hardware SPI transfer
registers are updated with new data by the embedded controller. Since data is exchanged
bidirectionally between the master and slave at each SCK clock cycle, SPI does not follow a
client-request and server-handler data flow like other serial protocols.
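The circular-buffer behavior of the two shift registers can be modeled directly; this is a bit-level sketch of the protocol, not driver code for any particular controller:

```python
def spi_transfer(master_word, slave_word, bits=8):
    """Model of one full-duplex SPI exchange: on each SCK cycle the MSB of
    each shift register moves across the bus (MOSI one way, MISO the
    other) while the incoming bit is shifted in."""
    mask = (1 << bits) - 1
    m, s = master_word, slave_word
    for _ in range(bits):
        mosi = (m >> (bits - 1)) & 1        # master MSB out on MOSI
        miso = (s >> (bits - 1)) & 1        # slave MSB out on MISO
        m = ((m << 1) & mask) | miso
        s = ((s << 1) & mask) | mosi
    return m, s   # after `bits` clocks the two words have swapped
```

The return comment makes the circular-buffer property explicit: a full word-length transfer leaves each side holding the other's original word.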
Figure 4.3: SPI data transfer between master and slave controllers.
Sensory and Control Interface
SACIB contains four core integrated circuits (ICs), defined in Table 4.1, that are slaves
on the SPI; the embedded controller is the master. Since the linear voltage amplifier and
potentiometer operate at a different voltage range (±10 volts) than the digital ICs (3.3
volts) on the interface board, operational amplifiers are used to scale the voltage.
Table 4.1: SPI ICs in the Quanser pendulum interface board.

MCP4921, Digital-to-Analog Converter (DAC): Actuator output for interacting with the
linear voltage amplifier and for mobilizing the pendulum arm.

MCP3202, Analog-to-Digital Converter (ADC): Sensory input of the absolute potentiometer
position of the pendulum arm. Not used in this experiment.

LS7366R, 2×32-bit Quadrature Counters: Sensory input that keeps count of the encoder
ticks to determine the position of the servo and pendulum arms.
Neither the ADC nor the DAC IC requires software configuration. The quadrature counters
require an initialization process in which internal registers are configured for sensing the
encoder’s radial position:
1. The counter is cleared with the pendulum position pointing downwards in a free hang-
ing position with no oscillation by writing the CLR CNTR op-code on the SPI bus.
2. The operation mode is configured by writing the following mask to the MDR0 register:
QUADRX4 | FREE RUN | DISABLE INDX | FILTER 2.
3. The operation mode is configured further by writing the following mask to the
MDR1 register: NO FLAGS | BYTE 2 | EN CNTR.
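The three initialization steps above can be sketched as follows. `spi_write` is an assumed byte-level primitive, and the op-code and flag values are the ones commonly used in LS7366R driver libraries; they should be verified against the LS7366R datasheet before use:

```python
# Hypothetical op-codes and mode flags (verify against the LS7366R
# datasheet); spi_write is an assumed byte-level SPI primitive.
CLR_CNTR, WR_MDR0, WR_MDR1 = 0x20, 0x88, 0x90
QUADRX4, FREE_RUN, DISABLE_INDX, FILTER_2 = 0x03, 0x00, 0x00, 0x80
NO_FLAGS, BYTE_2, EN_CNTR = 0x00, 0x02, 0x00

def init_counter(spi_write):
    # 1. Clear the count with the pendulum hanging at rest.
    spi_write([CLR_CNTR])
    # 2. Configure the operating mode in MDR0.
    spi_write([WR_MDR0, QUADRX4 | FREE_RUN | DISABLE_INDX | FILTER_2])
    # 3. Configure the remaining mode bits in MDR1.
    spi_write([WR_MDR1, NO_FLAGS | BYTE_2 | EN_CNTR])
```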
4.2 Control Algorithm
The RIP contains two mechanical parts: the servo arm and the pendulum as shown in
Figure 4.4. The servo arm is radially actuated with the servo motor by applying a control
voltage. The position of the servo arm (θ) is sensed via an optical encoder. The pendulum
freely pivots along the end of the servo arm and its position (α) is sensed by the position
of the second encoder shaft attached to the freely rotating pendulum’s pivot point [1]. The
sign conventions of both α and θ are illustrated in Figure 4.4.
Figure 4.4: Inverted pendulum setup and sign conventions for θ and α.
The control objective is to balance the pendulum at a commanded servo arm position. The
inverted pendulum experiment, like many other physical processes, is a continuous system.
Embedded platforms are digital electronics that do not operate in continuous time like analog
devices, but rather, operate on a discrete time interval. As a result, the dynamics of the
pendulum are modeled in discrete time k using Equation 4.1 [9].
x_{k+1} = A·x_k + B·u_k + w_k
y_k = C·x_k + v_k        (4.1)
The state vector is defined by x ∈ ℝ⁴ and contains the radial pendulum and servo arm
positions and velocities. The control voltage applied to the servo arm is represented by
u ∈ ℝ. The process output vector is defined by y ∈ ℝ². The state vector is influenced
by process noise while the output is influenced by measurement noise, modeled by w and v
respectively [8]. The resolution of the encoder is 4096 counts per revolution which results in
a 0.088◦ tolerance for error in sensing the radial position of the pendulum and servo arm.
Measurement noise is set to half a count of the encoder resolution and taken into account
by v [8]. The four-dimensional state vector of the pendulum is represented in Table 4.2.
Table 4.2: Variables in the state vector of the inverted pendulum control experiment.
State  Description
θ    Servo arm radian angle position derived from encoder measurement.
α    Freely pivoting pendulum radian angle derived from encoder measurement.
θ̇    Velocity of the servo arm derived using state estimation.
α̇    Velocity of the pendulum derived using state estimation.
A linear-quadratic-Gaussian (LQG) feedback controller is implemented as the control
algorithm for the pendulum [8]. The LQG controller computes an optimal control law by
minimizing a quadratic cost function that places large penalties on deviations of θ and α. This control algorithm
satisfies the overall goal of the controller by reducing the discrepancy of the pendulum and
servo from the desired positions while maintaining the control voltage of the servo within
the actuator limits.
The LQG controller is a combination of a linear-quadratic regulator and a Kalman filter.
The Kalman filter serves two purposes: state estimation and noise immunity. A 1 millisecond
discrete control cycle time interval is used to sense and actuate the pendulum with the
control algorithm. The first two parameters of the state vector in Table 4.2 are derived
exclusively from sensory measurements. The last two parameters, velocities of the servo arm
and pendulum, are estimated using the Kalman filter. The Kalman filter establishes the
linear relationship between the change in sensor measurements and the sampling time
interval to estimate the associated velocities. Furthermore, noisy sensor measurements
are suppressed by the Kalman filter.
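One control cycle of this LQG loop can be sketched as follows: a Kalman predict step using the discrete model of Equation 4.1, a measurement update, and the state-feedback control law. The matrices A, B, C and the gains L and K shown here are illustrative placeholders, not the tuned values from [8].

```c
#include <string.h>

#define NX 4  /* states: theta, alpha, theta_dot, alpha_dot */
#define NY 2  /* measurements: theta, alpha */

/* Placeholder model and gain matrices (the tuned values are not shown). */
static const double A[NX][NX] = {{1,0,1e-3,0},{0,1,0,1e-3},{0,0,1,0},{0,0,0,1}};
static const double B[NX]     = {0, 0, 1e-3, 0};
static const double C[NY][NX] = {{1,0,0,0},{0,1,0,0}};
static const double L[NX][NY] = {{0.5,0},{0,0.5},{10,0},{0,10}};  /* Kalman gain */
static const double K[NX]     = {2.0, 30.0, 1.0, 3.0};            /* LQR gain */

/* Predict x = A x + B u, correct with measurement y, return control voltage. */
double lqg_step(double x[NX], const double y[NY], double u_prev) {
    double xp[NX] = {0};
    for (int i = 0; i < NX; i++) {
        for (int j = 0; j < NX; j++) xp[i] += A[i][j] * x[j];
        xp[i] += B[i] * u_prev;
    }
    for (int i = 0; i < NY; i++) {            /* innovation y - C x */
        double innov = y[i];
        for (int j = 0; j < NX; j++) innov -= C[i][j] * xp[j];
        for (int j = 0; j < NX; j++) xp[j] += L[j][i] * innov;
    }
    memcpy(x, xp, sizeof xp);
    double u = 0;                             /* feedback law u = -K x */
    for (int i = 0; i < NX; i++) u -= K[i] * x[i];
    return u;
}
```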
4.3 Pendulum Guards
Process stability for the inverted pendulum is defined as continuous upright pendulum
balance. This is used as the primary basis for defining the α guard. The desired operating position for
α is constant and set to 0°, the upright position. A safety-critical guard of
αSCG = ±15° is set to keep recovery of process stability within the actuator limits.
Pendulum deviation larger than 15◦ from the inverted pendulum position requires a control
voltage larger than the capabilities of the servo to regain pendulum balance [8]. Since the
desired pendulum position is always constant and does not vary with operational conditions,
an operational guard is not enforced on α.
The desired servo arm position at which pendulum balance is maintained is defined as an
operational parameter that can be modified during run-time by supervisory operators. This
θdesired is defined as the operational set-point. In order to differentiate between safe and
malicious operation, the servo arm position is bounded by both safety-critical and operational guards.
The guards are summarized in Table 4.3.
Table 4.3: Safety-critical and operational guard definitions for α and θ.
Guard  Value  Description
αSCG   ±15°   Safety-critical guard on the pendulum.
θOG    ±35°   Operational guard on the servo arm.
θSCG   ±50°   Safety-critical guard on the servo arm.
4.4 Trigger Mechanisms
Servo arm operation beyond 50° in either direction is considered unsafe, while operation
beyond 35° is considered unstable according to the guard definitions in Table 4.3. The
trigger mechanism maintains both process stability and safety and thus must enforce the
operational and safety-critical guards on α and θ. Three different trigger mechanisms are
developed for the evaluation of TAIGA: trivial, linear online prediction, and neural network
classification [9][8].
4.4.1 Trivial
The trivial trigger mechanism is a control mechanism used as a baseline standard for
comparison with the other trigger mechanisms. Unlike the prediction and classification methods,
the trivial trigger mechanism does not preemptively forecast a guard violation, but rather
asserts the trigger once the operational guard is violated. This mechanism is the simplest
to implement and represents current process monitoring schemes routinely found in CPSes
where hard-defined operational limits are enforced by the process controllers.
4.4.2 Linear Online Prediction
A linearized model of the pendulum is used with an instance of the LQG control algorithm
to forecast future plant behavior in the linear online prediction method. Two sets of
process states are maintained: one resembling the real-time physical process and one for
the prediction unit used to forecast future behavior. Initially, the prediction unit's states
are synchronized with the physical process. The prediction algorithm accelerates the
behavior of the plant several control cycles into the future, using the linear plant model
to estimate process behavior and the control algorithm to determine the actuation signal
at each iteration. The trigger is asserted if a guard is violated at any future iteration.
The settling time of the pendulum is approximately 1.2 seconds which translates to 1200
iterations with a 1 millisecond control cycle [8]. Computational limitations on the embedded
controller may not allow for 1200 iterations of the prediction unit within the 1 ms control
cycle time; in this case, the iterations are broken up and partitioned among several control
cycles. The state vector of the prediction unit is synchronized with the actual physical
process once all iterations are complete.
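The partitioning scheme described above can be sketched as follows. Here model_step() and guard_violated() are mock stand-ins for one iteration of the linear plant model under backup LQG control and for the Table 4.3 guard checks, and the per-cycle iteration budget is likewise illustrative.

```c
#define HORIZON         1200  /* ~1.2 s settling time at 1 ms per iteration */
#define ITERS_PER_CYCLE  100  /* forecast steps budgeted per control cycle */

typedef struct { double x[4]; } PredState;

/* Mock plant model and guard check standing in for the real ones. */
static void model_step(PredState *s) { s->x[0] += 0.001; }
static int guard_violated(const PredState *s) { return s->x[0] > 0.61; }

static PredState pred;        /* prediction unit's copy of the state */
static int iter = HORIZON;    /* forces a resync on the first cycle */

/* Called once per 1 ms control cycle; returns 1 to assert the trigger. */
int prediction_cycle(const PredState *actual) {
    if (iter >= HORIZON) {    /* all iterations done: resynchronize */
        pred = *actual;
        iter = 0;
    }
    for (int i = 0; i < ITERS_PER_CYCLE && iter < HORIZON; i++, iter++) {
        model_step(&pred);
        if (guard_violated(&pred))
            return 1;         /* a future guard violation is anticipated */
    }
    return 0;
}
```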
The control algorithm used by the prediction unit is the backup instance of the LQG
controller. The backup control algorithm is trusted and responsible for regaining stable and
safe pendulum operation. Hence, predicting with the backup control algorithm ensures the
backup controller is capable of recovery if a guard violation is anticipated and the trigger
is asserted. However, the production controller is initially governing the physical process
prior to trigger assertion. As a result, the trajectory forecasted by the linear online
prediction method may not accurately resemble the future behavior of the physical process if
the operating conditions of the production controller vary from those of the backup or are
compromised. Rather, the prediction method attempts to foresee whether process stability
and safety can be regained and maintained with the governance of the backup controller at
each control cycle.
4.4.3 Neural Network Classification
A classification algorithm is derived offline using a nonlinear model of the pendulum. The
parameters of the state vector are incrementally permuted to obtain an initial set of state
vectors. The non-linear model is used to simulate the behavior of the pendulum under
backup control for each given initial state vector. The large set of simulation results is
used to train a neural network-based classifier.
A multilayer perceptron (MLP), a neural network model, maps four input neurons, one
per state of the RIP, to outputs through a nonlinear activation function [9]. The MLP is
trained using the simulation results with a nonlinear least squares
method. The resulting classification algorithm determines whether or not a given input
state vector is in a region of safe and stable operation, which is defined as the backup
controller’s ability to maintain pendulum balance at a predefined servo position without a
guard violation.
The neural network classifier bounds the physical process within the four-dimensional state
vector space. It does not require multiple iterations online but rather the linear algebraic
computations of the classification algorithm, which makes it much less computationally intensive
and better suited to embedded platforms.
Chapter 5
TAIGA Implementation
TAIGA necessitates hardware and software coherence to be effective. Since fabrication of a
custom architecture is infeasible, TAIGA is implemented on a commercially available
configurable SoC. The Xilinx Zynq-7000 allows hardware customization and has tight
integration between the software and hardware design flows. The hardware realization of
TAIGA, initially proposed in Figure 3.4, is implemented on the target Zynq platform as depicted in
Figure 5.1. The specifics of this platform, along with the details regarding the hardware and
software implementation of TAIGA for the RIP application are described in the following
sections.
Figure 5.1: The hardware implementation of TAIGA on a Zynq-7000 configurable SoC.
The source code, IP, and system design for TAIGA are maintained in the following GitHub
repository: https://github.com/tejachil/TAIGA.git.
5.1 Target Platform
An embedded platform that is adaptive, customizable, and high-performance is essential for
a robust TAIGA implementation. A Xilinx Zynq-7000 All Programmable SoC is a
suitable commercially available chip with reasonable support and readily available development
platforms [30]. TAIGA is implemented for the RIP application on a ZYBO development
board which contains a Zynq-7010 IC [6]. The specifications for this platform are defined in
Table 5.1.
Table 5.1: Specifications for ZYBO Zynq-7000 development board.
Zynq-7010
  ZYNQ (PS): XC7Z010-1CLG400C
  Processor: Dual-Core ARM Cortex-A9
  Frequency: 650 MHz
  FPGA (PL): Artix-7
  Logic Cells: 28K
  BRAM: 240 KB
  DSP: 80 slices
ZYBO
  Serial Flash: 128 MB
  RAM Capacity: 512 MB DDR3
  RAM Speed: 1050 Mbps
  Pmod: 1 MIO, 1 ADC, 4 EMIO
The Zynq-7000 configurable SoC is partitioned into two subsystems: the PS and the PL.
The PS contains a dual-core ARM processor while the PL contains an FPGA fabric that
resembles that of a Xilinx 7-series FPGA as specified in Table 5.1. The Zynq platform
maintains a separation of resources, memory, and peripherals between the PL and PS.
TAIGA is implemented with the Xilinx Vivado tool suite. Custom intellectual property
blocks are generated using the Vivado HLS tool. The PS and PL are configured with various
functional blocks within the Vivado design tool. Once the hardware is synthesized and
implemented, a bitstream is generated.
Software is developed, compiled, and deployed as applications on available processing cores
using the Xilinx SDK. Each application requires a board support package (BSP) which
provides the specific libraries and support code for the processor's hardware profile. The
hardware peripherals and details are abstracted from the software with the BSP. Applications
are classified in the Xilinx SDK as either:
• Bare-metal applications use the standalone BSP and allow the software implementation
to directly interact with the hardware and the available peripherals with the drivers
and libraries defined by the BSP.
• Kernel applications use a specific BSP targeting an operating system and contain
complex methods such as scheduling that can host and execute multiple external
applications while arbitrating interaction between the software and hardware.
5.1.1 Processing System
The Zynq’s PS contains an application processor unit (APU), a set of peripheral I/O con-
trollers, and an independent memory hierarchy. The PS closely resembles a microcontroller
with the added computational power of ARM processing cores. The internal structure of the
PS is represented in Figure 5.2.
APU and Memory Hierarchy
The APU contains two ARM cores, each with a distinct floating point unit (FPU), memory
management unit (MMU), and L1 caches. Two types of L1 caches are present, one for data
and one for instructions. In addition to maintaining local data sets, each ARM core can
execute independent instruction streams asynchronously. Xilinx's asymmetric
multiprocessing (AMP) configuration enables concurrent multi-OS or application support across the two
cores. The L2 cache and on-chip SRAM are shared among both cores along with various
internal peripherals such as timers and interrupt controllers, and external peripherals such as
off-chip memory, flash, and I/O. The shared memory structures of the APU provide robust
communication between applications running on the two cores in the AMP configuration.
The ZYBO platform contains off-chip RAM as specified in Table 5.1. This memory is ac-
cessed through the direct memory access (DMA) driver within the APU. Typically, programs
for execution are initially loaded into external DDR3 memory. However, the ZYBO
development board also contains an SD card slot and external serial flash memory. An SD peripheral
Figure 5.2: Internal layout of the Zynq processing system.
controller is enabled within the PS to interface access to the SD card. The SD card is
advantageous for storing non-volatile boot images, which can be loaded into DDR3 memory
during start-up, or hosting file systems for complex operating systems such as Linux. The
SD card and external serial flash memory, although relatively slow in terms of data access,
are the only non-volatile ZYBO memory systems that can preserve data across power cycles.
Peripherals
The ZYBO platform contains a variety of peripherals that interact with the Zynq; the
controllers for these peripherals are configured within the Zynq PS and highlighted in the
I/O peripheral controllers module of Figure 5.2. The ARM Advanced Microcontroller Bus
Architecture (AMBA) defines a standard for on-chip interconnections of functional blocks
within a SoC [18]. The Advanced eXtensible Interface (AXI) protocol is a part of AMBA and
is used to facilitate interaction between the PS and functional peripheral blocks instantiated
within the PL in the Zynq architecture. Peripheral, controller, and hardware accelerator
blocks are instantiated in the PL as AXI slave or AXI master devices and communicate with
the PS via the AXI ports shown in Figure 5.2.
Clock Generation
Sequential functional blocks and peripherals within the PL operate on a clock signal from
the PS’s clock generation unit as represented in Figure 5.2. The CPU clock frequency for
the two ARM cores in the ZYBO platform is 650 MHz as specified in Table 5.1. The Zynq
processor contains four PL fabric clocks configurable between 100-250 MHz by the clock
generation unit.
TAIGA’s PL-implemented functional blocks use one of these PL fabric clocks, FCLK CLK0,
configured to 144.4 MHz satisfying the timing constraints of the functional blocks in the PL.
The fabric clocks are sourced by one of three available phase-locked loops (PLLs): the
ARM PLL, the I/O PLL, and the DDR PLL. A fabric clock frequency is generated by
integer division of its source PLL frequency. The requested frequency to satisfy the PL's
timing requirements is 145 MHz. However, none of the three PLL source frequencies is an
exact multiple of this requested frequency. As a result, the ARM PLL is used to generate
144.4 MHz, the closest achievable frequency to the request.
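The divisor selection can be sketched as below, under the assumption (not stated in the text) that the ARM PLL runs at 1300 MHz, i.e. twice the 650 MHz CPU clock; dividing by 9 then yields approximately 144.44 MHz, matching the reported 144.4 MHz.

```c
#include <math.h>

/* Pick the integer divisor of the source PLL that lands closest to the
 * requested fabric clock frequency.  The 1300 MHz ARM PLL frequency used
 * in the test below is an assumption, not a value from the text. */
unsigned closest_divisor(double f_pll, double f_req) {
    unsigned best = 1;
    double best_err = INFINITY;
    for (unsigned div = 1; div <= 64; div++) {
        double err = fabs(f_pll / div - f_req);
        if (err < best_err) { best_err = err; best = div; }
    }
    return best;
}
```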
5.1.2 Programmable Logic
The PL contains a large set of configurable logic blocks (CLBs), block RAM (BRAM), digital
signal processing (DSP) slices, I/O blocks (IOBs), and registers as specified in Table 5.1.
Figure 5.3 illustrates an abstracted view of a typical FPGA fabric. CLBs are groupings of
lookup tables (LUTs) configured for combinational logic requirements, and flip-flops (FFs)
for sequential logic. The interconnections and components represented in Figure 5.3 are
configurable for different applications. Typically, FPGAs are configured using a hardware
description language (HDL) such as VHDL or Verilog. In addition, the Vivado design flow
enables FPGA configuration using existing high-level functional blocks.
The Zynq PS is configured and instantiated in the Vivado block diagram as a functional
block. With just the PS, programmable access to the two ARM cores is enabled. However,
the PL is necessary for enabling external controllers, accelerators, peripherals, or functional
blocks that do not exist within the PS. Most functional blocks enabled in the PL interact
Figure 5.3: FPGA architecture and fabric composition.
with the PS via the AXI peripheral buses. Interaction with external peripherals on the
ZYBO are typically channeled through Zynq general purpose I/O (GPIO) pins. The Zynq
contains two banks of I/O pins as shown in Figure 5.2:
• Multiplexed I/O (MIO) bank contains a set of GPIOs that are dedicated to the PS and
cannot be accessed from the PL.
• Extended multiplexed I/O (EMIO) bank contains a set of GPIOs that are available to
the PL and can also be accessed by the PS.
These banks contain a wide array of internal I/Os that are multiplexed to the pins of the
Zynq package. Certain peripherals on the ZYBO board, such as the universal asynchronous
receiver/transmitter (UART) connector, Ethernet port, and SD card, are connected to signals
on the MIO bank and cannot be accessed by the PL. Any external peripherals needed by the
PL must connect through the EMIO bank, usually via one of the ZYBO Pmod connectors.
Block RAM
As shown in Figure 5.2, the off-chip flash memory and SD card are connected via the MIO
bank while the volatile memory within the APU is local to the APU. As a result, these
memory systems are internal to the PS and inaccessible from the PL. Rather, the PL
contains its own BRAM distributed memory as specified in Table 5.1. The FPGA contains
multiple blocks of dedicated two-port memory, each block containing 36Kbits.
One or more BRAMs may be instantiated within the FPGA and addressed individually.
BRAM is typically allocated for functional blocks such as a first-in first-out (FIFO) which
requires memory for a queue. AXI peripherals exist to expose sectors of BRAM to the PS;
without such a peripheral, BRAM cannot be accessed from the PS.
Digital Signal Processing Slices
In addition to logic cells and BRAM, the PL contains a set of DSP slices to provide fast
execution of common arithmetic and logical operations. They are a level of abstraction
higher than the logic cells and are optimized for high-performance digital signal processing
and computation. These DSP slices are configurable within the PL and are instantiated as
hardware accelerators for functional blocks requiring high-performance arithmetic.
MicroBlaze
Occasionally, processes destined for implementation within the PL are better suited to a
processing core executing compiled code. MicroBlaze is a Xilinx proprietary soft processor
core with a RISC instruction set, designed for implementation exclusively in FPGA fabric [28].
The MicroBlaze uses a local-memory bus (LMB) to efficiently interact with BRAM memory
instantiated alongside the processing core within the PL. The size of BRAM dedicated to
the processor memory is configurable; MicroBlaze also has support for instruction and data
caches similar to the APU within the PS. All process control and instruction registers are
addressed within the PL and are local to the MicroBlaze processor.
In TAIGA, the backup controller and IOI are hosted within independent MicroBlaze cores.
MicroBlaze supports a maximum clock rate of 150 MHz with the lowest performance speed
grade of the ZYBO platform’s Zynq. The backup controller and IOI necessitate hardware
accelerators to satisfy arithmetic needs. The addition of these accelerators raises negative
slack errors during PL routing when clocked at 150 MHz. As a result, 145 MHz is
requested to meet the PL timing requirements, and FCLK_CLK0 is accordingly configured
to the closest possible frequency of 144.4 MHz sourced from the ARM PLL.
The two MicroBlaze cores are configured for maximum frequency with a five stage pipeline
and customized with the following parameters:
• Hardware barrel shifter enabled to allow multiple bit shifts within one clock cycle.
• Extended FPU enabled in hardware to improve single-precision IEEE-754 standard
floating point arithmetic performance.
• Hardware integer multiplier enabled to improve performance for 32-bit integers.
• Hardware integer divider enabled for increased performance in integer division.
• Additional machine status register (MSR) instructions enabled for faster bit
modifications to the process control register of the architecture.
• Branch target cache size of 1024 entries stored in BRAM for better branch prediction.
A MicroBlaze configuration can trade off performance and resource usage. The configuration
parameters used in TAIGA primarily improve the computation performance. MicroBlaze
supports both data and instruction caches to hide memory access latency. However, these
caches are implemented in BRAM similar to the local instruction and data memory which
does not yield a significant performance increase. The PL real estate is a particularly precious
resource. In TAIGA, BRAM is a bottleneck resource primarily due to the large number of
functional units requiring it. As a result, MicroBlaze optimizations are FF, LUT, and DSP
intensive in order to obtain performance improvements while conserving BRAM.
5.2 Production Controller
The ZYBO embedded controller interacts with the pendulum via the SACIB SPI bus
connected to a dedicated ZYBO Pmod connector. As a standalone apparatus without TAIGA,
the production controller directly senses and actuates the pendulum with the PS
resembling a microcontroller. In this configuration the internal SPI peripheral controller (SPI 0),
as shown in Figure 5.2, is enabled. Initially, the production controller algorithm is
implemented as a bare-metal application in one of the ARM cores using an internal hardware
timer to initiate the one millisecond control cycle. This implementation without TAIGA is
used to verify proper and reliable control of the RIP.
The separation of resources between the PS and PL is leveraged by TAIGA to implement the
untrusted production controller in the PS as illustrated in Figure 5.1. With the incorporation
of TAIGA, the production controller no longer interacts directly with the SPI bus, but
rather through queues interfaced by the IOI module. The SPI controller internal to the PS
is disabled and an external AXI FIFO stream peripheral is instantiated within the PL to
interface communication with the queues as shown in Figure 5.1. The implementation of the
queues is described in greater detail in Section 5.4. Xilinx’s AMP solution is implemented
within the APU to support a real-time operating system (RTOS) on one core and a high-level
operating system (OS) on the other.
5.2.1 FreeRTOS
The production control algorithm and methods for controlling the pendulum process are
implemented in a FreeRTOS framework on ARM core 1. In general, the use of an operating
system greatly increases the scalability of the application's software and also enriches the
capabilities of the processor by abstracting hardware specifics away from the software
implementation. An RTOS achieves these goals while preserving the strong and direct interaction
between the hardware and software.
FreeRTOS is a hybrid between a bare-metal and kernel application. It contains a scheduler
that is used to execute internal processes, called tasks, that are implemented, compiled,
and executed with the kernel itself. This close-knit integration between the kernel and the
tasks ensures integrity between the hardware and software interaction which is essential in
real-time process control applications. FreeRTOS has the following advantages compared to
bare-metal and kernel applications [2]:
• Smaller memory footprint and overhead than traditional operating systems.
• Enriched with useful OS features such as a configurable scheduler, software-defined
timers, task manager with multitasking capabilities, and interrupt handlers.
• Efficient inter-task communication through the implementation of message queues.
• Shared memory access between tasks with mutex and semaphore implementations.
• No distinct kernel layer to separate hardware from the software.
The core pendulum control sequence is implemented as a timer callback function with a
one millisecond software timer. The tick rate of FreeRTOS is configured to 1000 Hz in the
FreeRTOS BSP such that the pendulum control callback function can be executed every
tick. The pendulum control sequence is executed as follows:
1. Request an operational θ set-point at which to maintain the servo arm.
2. Request servo arm and pendulum position measurements from the physical process.
3. If the pendulum position is within the controllable region (|α| < 15◦), compute the
LQG control algorithm and Kalman filter to generate a control voltage for actuation.
4. Write the control voltage to the servo.
The control algorithm is implemented as a C function and contains the control parameters
of the LQG algorithm for high-performance pendulum balance. The methods for physical
process interaction are implemented as low-level drivers and utilities. Traditionally, these
methods interact directly with the SPI bus. They are modified to instead interact with the
IOI over the FIFO message queues using a strict packet syntax defined in Section 5.4.
Multitasking
FreeRTOS implements multitasking with a single processing core by rapidly switching
between tasks. The execution of three concurrent tasks using this time-slicing method is
illustrated in Figure 5.4a. The perceived parallel execution of tasks achieved by rapid and
efficient switching is illustrated in Figure 5.4b. A task's execution is impeded when the
task suspends itself with a yield, or when a higher priority task, interrupt, or timer must
be serviced.
Each task is allocated its own dedicated stack which contains local memory blocks for storage
of task-specific variables. Inter-task memory access is efficiently orchestrated through the
use of locks and semaphores, which protect the integrity of the variables accessed by multiple
concurrent tasks. Message queues are another method for inter-task data transfer.
A timer callback function is essentially treated as a task in FreeRTOS. A timer handle
for the pendulum control sequence is created and started with the highest priority such
(a) Time-sliced scheduling of tasks. (b) Perceived parallel execution of tasks.
Figure 5.4: Execution trace resembling FreeRTOS scheduling of concurrent tasks.
that the method can interrupt any other tasks dispatched by the scheduler. Typically,
introducing other tasks in a bare-metal implementation may require modifications to the
pendulum control sequence or external task handling methods. However, the priority-driven
concurrent task capabilities of FreeRTOS ensure strict execution of the pendulum control
sequence at every millisecond as long as it maintains highest priority. In the implementation
of the production controller, the FreeRTOS framework allows scalable implementation of
other tasks for arbitrating reconfiguration, control parameter optimization, or facilitating
interaction with external peripherals without disrupting the control process.
5.2.2 Linux
Linux is implemented on ARM core 0 primarily to provide high-level network and file transfer
services to support reconfiguration. The leaf nodes of cyber-physical control require
interaction with hierarchical elements in both the regulatory and supervisory layers, such as
RTUs, other PLCs, and supervisory controllers. Since communication between the IOI and
production controller is restricted to bounded queues, reconfiguration capabilities through the
IOI module are not practical. As illustrated in Figure 5.1, the reconfiguration network
interacts directly with the production controller via Ethernet or UART.
The ZYBO platform contains a shared UART and Joint Test Action Group (JTAG)
Universal Serial Bus (USB) port. The JTAG interface is used for programming and debugging
the Zynq's hardware and software. Because JTAG port access allows reconfiguration of the
Zynq architecture, it is blocked to safeguard the integrity of TAIGA. Rather than using
ZYBO's shared UART and JTAG USB port as the reconfiguration network, the PS's UART 1
peripheral controller is routed to an external USB-to-UART converter board connected to a dedicated
MIO Pmod port. The USB-to-UART converter uses a FTDI FT232RQ IC which interfaces
the production controller’s UART bus with a USB connector to provide bidirectional serial
communication.
Direct communication to the PS allows the reconfiguration network to not only modify
certain parameters, but also execute firmware updates of the entire PS image. Typically,
such updates require modification to the boot images stored within the SD card and file
transfers with external networks over Ethernet or UART. The peripheral controllers for
interfacing with the SD card, Ethernet, and UART are enabled within the PS as shown in
Figure 5.1.
Access to the Ethernet, SD card, and UART peripherals is available to the core running
FreeRTOS. However, the implementation of file transfer and networking methods within
the FreeRTOS framework is tedious, hardware specific, and messy. Linux is implemented
to manage this process since the Linux OS already contains Ethernet, serial, and SD card
drivers as well as services for interacting with these peripherals. Lower-level methods for
making parameter optimizations or alterations to the control sequence are implemented in
the FreeRTOS framework as lower priority tasks relative to the process control sequence. The
use of a high-level OS like Linux to facilitate production controller reconfiguration conforms
to existing industrial control PLCs.
The integration of Linux within the AMP framework, interaction with external peripherals,
and networking are investigated independently and made feasible in the TAIGA
implementation. The software routines to interface and arbitrate production controller
reconfiguration are outside the scope of this thesis and considered an extension to the work presented.
5.3 Backup Controller
Similar to the production controller, the backup controller is implemented in C and involves
arithmetic complexity necessitating intensive computation. As a result, an instruction set
processor suits the implementation of the backup controller. In order to ensure trust and
prevent malicious sharing or access of resources, the backup controller is implemented within
the PL on a MicroBlaze soft processor. Two BRAM modules are allocated to the backup
controller: 32 KB for local instruction memory and 32 KB for local data memory.
The backup controller is primarily a simplified version of the production controller with an
LQG control algorithm tuned for high assurance rather than performance. The controller
only contains one routine, the pendulum control sequence:
1. Request servo arm and pendulum position measurements from the physical process.
2. If the pendulum position is within the controllable region (|α| < 15◦), compute the
LQG control algorithm and Kalman filter to generate a control voltage for actuation.
3. Write the control voltage to the servo.
The control sequence for the backup controller is almost identical to that defined for
production. Unlike the production controller, the backup controller does not request a supervisory
set-point, but rather, operates on one that is pre-configured. The operational set-point for
the backup controller is set to θdesired = 0◦.
Under production control, the supervisor dictates the operational set-point through inter-
action with the IOI. The production controller then requests the supervisor’s operating
conditions from the IOI module. In a typical PCS, the operating conditions are intended to
actuate the system to satisfy plant deliverables. Since the backup controller’s intention is to
maintain process safety and stability rather than satisfy plant deliverables, it operates on a
pre-configured operating condition that is verified for safe and stable process control.
Operating under pre-defined conditions, the backup controller does not interact with CPS
hierarchical nodes, and only interacts with the physical process. This eliminates untrusted
inputs to the backup controller, which can be thought of as an independent and isolated
physical control leaf node. This attribute is essential in ensuring backup controller trust.
The backup control sequence is implemented using an AXI timer peripheral with a 32-bit
configurable counter and an interrupt controller for detecting counter overflow and initiating
a reset. The timer merely acts as a counter incrementing at each clock cycle. In software, the
counter is configured to reset to a preconfigured value, RESET COUNT, when the maximum
count is reached. The calculation of RESET COUNT is shown in Equation 5.1.
Ttimer = (MAX COUNT − RESET COUNT) / fCLK
RESET COUNT = MAX COUNT − Ttimer · fCLK        (5.1)
Since the counter hardware is 32 bits wide, MAX COUNT = 0xFFFFFFFF. The PL fabric
clock FCLK CLK0 clocks the timer module at fCLK = 144.4 MHz. The desired period of
the timer, Ttimer, is the one millisecond control cycle time of the pendulum control sequence.
With these parameters, Equation 5.1 yields RESET COUNT = 0xFFFDCBC2, generating a one millisecond interrupt for executing the pendulum control sequence.
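The arithmetic can be checked with a few lines of C. The exact FCLK CLK0 frequency is assumed here to be 144,444,444 Hz (the nominal 144.4 MHz), with the count rounded up so the timer period is never short:

```c
#include <stdint.h>

#define MAX_COUNT 0xFFFFFFFFu   /* 32-bit counter rolls over at this value */

/* Counts needed for a period of t_us microseconds at f_clk_hz, rounded up. */
static uint32_t timer_counts(uint64_t f_clk_hz, uint64_t t_us)
{
    return (uint32_t)((f_clk_hz * t_us + 999999u) / 1000000u);
}

/* Equation 5.1: RESET_COUNT = MAX_COUNT - Ttimer * fCLK. */
static uint32_t reset_count(uint64_t f_clk_hz, uint64_t t_us)
{
    return MAX_COUNT - timer_counts(f_clk_hz, t_us);
}
```

With f_clk_hz = 144444444 and t_us = 1000, reset_count() evaluates to 0xFFFDCBC2, matching the value programmed into the AXI timer.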
The backup controller is implemented as a standalone bare-metal application within the
MicroBlaze processing core primarily to eliminate unnecessary kernel overheads and com-
plications. Since the backup controller only hosts the process control sequence, a kernel or
OS is unnecessary. The bare-metal implementation also makes the backup controller much
leaner and suitable for a MicroBlaze processing core which does not have the computational
power of an ARM processor. Lastly, formal verification and analysis of code implemented in
trusted elements is necessary to satisfy TR1; this is much more practical without the added
complexity of a kernel.
5.4 Queues
Queues are effective for robust inter-module communication in embedded systems. Typically,
FIFOs are used for inter-chip communication through serializer/deserializer telemetry links.
In the TAIGA implementation, FIFOs are used for inter-module communication between the controllers and the IOI module internal to the Zynq. In the Zynq architecture, FIFOs are implemented with a set of BRAMs which exist in the PL, protecting the integrity of
the queue. The ordered and buffered nature of FIFOs allows for efficient data transactions
between the IOI and controller.
The allocation of BRAM and implementation of the queue is configured through a FIFO
generator block in Vivado. As illustrated in Figure 5.1, each controller is interfaced to the
IOI module with an independent pair of FIFOs. This isolates the telemetry between the
production controller and IOI, and the backup controller and IOI. The integrity of trusted
communication between the backup controller and IOI is ensured since production read/write
access to the backup’s FIFOs is prevented.
The FIFO generator IP contains three types of FIFO interfaces: native, AXI memory
mapped, and AXI stream. The native interface type is primarily intended for interfac-
ing the queues with custom functional blocks within the PL. In the AXI memory mapped
protocol, all data is addressed within system memory and referenced from the FIFO. This is not suitable for TAIGA, which must preserve the isolation and locality of PS memory and PL BRAM.
The AXI stream protocol is used to interface the processing cores with the respective set
of FIFOs. The AXI stream protocol is part of ARM’s AMBA AXI4 interface specification
and follows a data-flow paradigm [29]. Rather than addressing system memory, data is
moved from local memory into the allocated memory of the FIFO on an enqueue method,
and moved to the destination core's local memory on a dequeue method. Each FIFO generator block is configured with the following parameters:
• AXI stream interface type
• Common clock connected to FCLK CLK0 which is 144.4 MHz
• Word width set to 32 bits, TDATA is 4 bytes wide
• TLAST enabled for indication of packet termination
• FIFO implemented in common clock BRAM as a data FIFO application for low-latency
memory access (2 clock cycles)
• FIFO depth configured for 512 words
At the lowest implementation level, the FIFO generator block contains numerous input and
output signals which require precise timing for writing (enqueuing) to the FIFO and receiving
(dequeuing) from the FIFO. The AXI-Stream FIFO block used to interface the processing
cores with the FIFO generator handles all of the required signal manipulation for read and
write operations which are executed using an AXI-Stream FIFO software library in the Xilinx
SDK.
Each FIFO generator block is connected to one AXI stream peripheral at the receiving end,
and another at the transmitting end. Each AXI stream peripheral contains one queue for
transmission and another for receiving as depicted in Figure 5.1. The FIFO stream signals for
both enqueuing and dequeuing messages to the FIFO generator are described in Table 5.2.
Table 5.2: Transmit and receive data channel signals of the interface between the AXI-Stream FIFO and FIFO generator blocks.

Enqueue
Signal       Direction  Description
TDATA[31:0]  Output     Used to write a word to the FIFO.
TLAST        Output     Indicates the boundary of the packet. Set high when the current word for transmission is the last of the packet.
TREADY       Input      Indicates the slave FIFO generator can accept a new enqueue in the current clock cycle when asserted high.
TVALID       Output     Asserted high when the master AXI-Stream block is ready to initiate an enqueue to the FIFO block.

Dequeue
Signal       Direction  Description
TDATA[31:0]  Input      Used to receive a word from the FIFO.
TLAST        Input      Indicates the boundary of the packet. Set high by the FIFO when the last word of a packet is received.
TREADY       Output     Indicates that the master FIFO generator is ready to accept a dequeue message in the current clock cycle.
TVALID       Input      Asserted high by the FIFO generator indicating that valid dequeue data is present on the data channel.
5.4.1 Inter-Module Communication Protocol
As described in Table 5.2, data transfer occurs in parallel on the TDATA lines when both the TREADY and TVALID signals are asserted, indicating both the master and slave are ready. Typically, FIFOs act as a buffered stream of words with a pre-configured width. However, by enabling the TLAST signal, the implemented FIFOs in TAIGA are able to stream packets with a varying number of words, with the boundary of each packet indicated by the assertion of TLAST.
The dequeue method of the IOI module is interrupt-driven as shown in Figure 5.1. An
interrupt is raised once a new packet, possibly containing multiple words, is available on
the FIFO. The controllers enqueue packets to the IOI requesting the execution of various
commands. As a result, the IOI acts as a server of requests made by the controllers. The
packet syntax used for interaction between the controller and IOI is defined in Table 5.3.
Table 5.3: FIFO queue packet structure. The Transfer Bytes, Slave Select, and DATA fields are relevant only for a PLANT command.

Header word fields:
  Bits [31:24]  Command        — PLANT, SET POINT, or STATE VECTOR
  Bits [23:16]  Operation      — WRITE, READ, or STATE VECTOR
  Bits [15:8]   Transfer Bytes — 0, 1, 2, 3, or 4
  Bits [7:0]    Slave Select   — NO SLAVE, ADC, DAC, ENCODER P, or ENCODER S
Subsequent words: DATA[i] ...
The first word of each packet transmitted by a controller is considered the header and
contains specific information regarding the type of interaction required by the controller. As
identified in Table 5.3, three types of commands are accepted by the IOI:
• A PLANT command allows interaction with the pendulum process by either writing to or reading from the SPI bus. The number of bytes to transfer on the SPI bus is selected along with the slave device. The data bytes to transfer are appended to the packet header. If a READ operation is requested, a packet of the same length as the number of bytes transferred is returned.
• A SET POINT command returns a single word containing the current operational
set-point of the pendulum.
• The STATE VECTOR command returns four words containing the locally stored copy
of the state vector reflecting the current state of the pendulum.
With these three command types, the controller is able to interact with the physical pendulum process and the supervisor through the IOI. This packet structure and its parameters are made available to the production controller, backup controller, and IOI module through a globally
shared header file. The IOI module’s interrupt handler method for controller requests is
described in detail in Section 5.5.
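As an illustration, the 32-bit header word of Table 5.3 can be packed as below. The numeric encodings of the command, operation, and slave identifiers are hypothetical, since the thesis defines only their names in the shared header file:

```c
#include <stdint.h>

/* Hypothetical encodings; the real values live in the shared header file. */
enum command   { CMD_PLANT = 0x01, CMD_SET_POINT = 0x02, CMD_STATE_VECTOR = 0x03 };
enum operation { OP_WRITE = 0x01, OP_READ = 0x02 };
enum slave     { SLV_NONE = 0x00, SLV_ADC = 0x01, SLV_DAC = 0x02,
                 SLV_ENCODER_P = 0x03, SLV_ENCODER_S = 0x04 };

/* Pack the header word: command [31:24], operation [23:16],
   transfer bytes [15:8], slave select [7:0]. */
static uint32_t pack_header(uint8_t cmd, uint8_t op, uint8_t nbytes, uint8_t slv)
{
    return ((uint32_t)cmd << 24) | ((uint32_t)op << 16) |
           ((uint32_t)nbytes << 8) | (uint32_t)slv;
}
```

Under these assumed encodings, a header requesting a two-byte read of the pendulum encoder would be pack_header(CMD_PLANT, OP_READ, 2, SLV_ENCODER_P).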
5.5 I/O Intermediary
Similar to the backup controller, the IOI is a trusted entity within TAIGA and contains a
variety of methods and processes that are computationally and arithmetically intensive. As
a result, the IOI is implemented in the PL using a second MicroBlaze soft processor core
with similar configurations to the backup controller.
In contrast to the backup controller, the IOI contains more methods and processes requiring
larger local BRAM allocation: 128 KB is allocated for local instruction memory and 128
KB is allocated for local data memory. While these two blocks of memory are more than
sufficient to implement the methods described here, they provide excess storage intended for
the implementation of future detection schemes, monitoring systems, or response methods
addressing additional cyber-physical security needs.
5.5.1 Robust Process Control
The IOI serves a variety of functions. Within TAIGA, the controller must robustly and
reliably sense and actuate the physical process without noticeable degradation in process
control performance. The IOI is responsible for efficiently channeling sensor and actuator
commands between the RIP on the SPI bus, and the controller on the FIFOs. The IOI is
implemented to ensure expected RIP control even with the added complexity of TAIGA.
Similar to the backup and production controllers, the IOI interacts with the FIFOs via an
AXI FIFO stream peripheral. While the two controllers interact with the FIFO stream using
a polling method for receiving messages, the IOI’s FIFO stream module is interfaced through
an interrupt controller configured to generate an interrupt, as illustrated in Figure 5.1,
when new packets are available to dequeue. The IOI acts as a server for all controller I/O
requests. The interrupt routine handles requests that conform to the packet syntax described
in Table 5.3. The packet handler method is illustrated in Figure 5.5.
On each control cycle, the production controller requests an operational set-point with the
SET POINT command, requests sensor measurements of the servo arm and pendulum with
the PLANT command, and writes a voltage to the servo with the PLANT command. Every
time a new sensor measurement is requested, the current sensor values are updated locally
within the IOI to reflect the current state of the pendulum.
Figure 5.5: Software implementation of the FIFO interrupt handler.
The backup controller contains a process control routine similar to the production controller
but without a request of the operational set-point. Rather, the backup controller requests
the current state vector using the STATE VECTOR command when it is first enabled to
ensure a bump-less transition in the presence of an anticipated guard violation. The method
described in Figure 5.5 is implemented as a case statement within the FIFO interrupt handler.
As illustrated in Figure 5.1, a hardware AXI SPI peripheral is instantiated alongside the IOI
to communicate on the SPI bus. The block acts as a master on the SPI bus and is configured
with an 8-bit transaction width and four slaves connected to each of the ICs on SACIB. The
AXI SPI peripheral is sourced by the PL fabric clock, FCLK CLK0, which is divided by a
frequency ratio to generate the SCK as shown in Equation 5.2.
fSCK = fCLK / Fratio        (5.2)
The SPI protocol is not limited by a maximum clock speed and typically operates at very fast
data rates. In order to ensure clean and sharp clock edges, the source frequency is divided by
a frequency ratio of Fratio = 16 × 5 = 80. With fCLK = 144.4 MHz, the resulting SPI clock is configured to fSCK = 1.805 MHz. This clock rate is verified to provide reliable communication between
the ZYBO and SACIB.
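Equation 5.2 with these parameters can be checked in a line of C:

```c
/* SPI clock from Equation 5.2: f_SCK = f_CLK / F_ratio. */
static double spi_clock_hz(double f_clk_hz, double f_ratio)
{
    return f_clk_hz / f_ratio;
}
```

spi_clock_hz(144.4e6, 16.0 * 5.0) evaluates to 1,805,000 Hz, the 1.805 MHz SCK quoted above.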
SPI Filter
A SPI filter, as identified in Figure 5.5, is implemented to validate commands that attempt
to interact with the physical pendulum process. Controllers are only able to read encoders
or write voltages to the servo through the SPI bus. Upon boot-up, the IOI module executes
a pendulum initialization sequence that resets and configures the encoder counter ICs. The
SPI filter prevents latent software from re-initializing the system during operation. Re-initializing the encoder counters during operation would result in offset readings, which would jeopardize not only production controller operation but also the integrity of the backup controller. The SPI filter also prevents the controller from transmitting other configuration commands on the SPI bus that can disrupt the operation of ICs in SACIB.
By restricting the scope of interaction with the physical process to known and expected com-
mands, the SPI filter addresses bias-injection attacks executed by the production controller.
It also ensures actuator commands are maintained within the operational limits of the servo.
Any voltage write command to the servo is decoded within the IOI’s SPI filter and saturated
to +3.3 volts, which is the unscaled actuator limit of the DAC IC responsible for driving the
servo.
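The saturation step can be sketched as below. The +3.3 V upper clamp is from the text; the 0 V lower clamp and the helper name are assumptions for a single-supply DAC:

```c
/* Sketch of the SPI filter's saturation of decoded servo voltage writes.
   The +3.3 V ceiling is the unscaled DAC limit from the text; the 0 V
   floor is an assumption for a single-supply DAC. */
#define DAC_V_MAX 3.3f
#define DAC_V_MIN 0.0f

static float saturate_servo_voltage(float v)
{
    if (v > DAC_V_MAX) return DAC_V_MAX;
    if (v < DAC_V_MIN) return DAC_V_MIN;
    return v;
}
```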
5.5.2 Supervisory Control and Process Monitor
Cyber-physical control leaf nodes require bidirectional communication with hierarchical entities. In the TAIGA implementation, this communication channel is referred to as the
supervisory I/O, identified in Figure 5.1. The supervisory layer within a CPS contains
multiple nodes of HMIs that allow operators to monitor the current process behavior.
A USB-to-UART peripheral, similar to the one used by the reconfiguration network, is
connected to a Pmod port of the ZYBO board as the supervisory I/O peripheral. The
Pmod is routed to the PL using EMIO pins and connected to an AXI UART peripheral. The
AXI UART peripheral is configured with 8 data bits and a baud rate of 921,600. The fastest supported baud rate is used to provide the transmission speed necessary for sending the entire
state vector at each control cycle. The methods for interfacing bidirectional communication
with the supervisory network and other tasks specific to TAIGA are implemented in an
infinite idle loop within the IOI framework. This loop is depicted in Figure 5.6 and executes as
a background process that is occasionally interrupted by the FIFO packet interrupt handler.
Figure 5.6: Idle loop and WDT interrupt service routine of the IOI.
At each iteration of the idle loop, the supervisory UART port is polled for new supervisory inputs. The IOI accepts a new operational set-point command specifying the position to which the system is actuated. An operational set-point command is a three-byte message on the UART port with
the syntax described in Table 5.4. The set-point command is specified with two header bytes
containing the characters “SP” to identify a set-point command. The last byte contains the
magnitude of the set-point in the least significant 7 bits, and the most significant bit specifies
the sign of the operational set-point.
Table 5.4: Syntax of an input operational set-point command on the supervisory UART bus.

Byte 1 [7:0]: 'S'
Byte 2 [7:0]: 'P'
Byte 3, bit 7: sign of the set-point (+ → 0, − → 1)
Byte 3, bits [6:0]: magnitude of the set-point in degrees (0–127)
Since only 7 bits are allocated to represent the magnitude, the set-point is restricted to a
range within ±127◦. While the servo arm can be actuated within a range of ±180◦, the
guard specifications limit the operational mobility within ±35◦ as defined by the operational
guard in Table 4.3.
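A sketch of the decode in C follows. The helper name is illustrative, and enforcing the ±35° operational guard at decode time is an assumption about where the bound is checked:

```c
#include <stdint.h>

/* Decode the three-byte set-point command of Table 5.4. Returns 1 and
   writes the signed set-point in degrees on success; returns 0 on a bad
   header or when the magnitude exceeds the assumed ±35 degree operational
   guard of Table 4.3. */
static int decode_set_point(const uint8_t msg[3], int *degrees)
{
    if (msg[0] != 'S' || msg[1] != 'P') return 0;
    int magnitude = msg[2] & 0x7F;            /* bits 6:0            */
    int sign = (msg[2] & 0x80) ? -1 : 1;      /* bit 7: 1 = negative */
    if (magnitude > 35) return 0;             /* operational guard   */
    *degrees = sign * magnitude;
    return 1;
}
```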
The IOI contains a control cycle flag that determines when a control sequence is completed by
the production controller. The flag is asserted when a pendulum encoder read, servo encoder
read, and a control voltage write to the actuator are recorded. As portrayed in Figure 5.6,
the state vector is written on the supervisory UART once a control cycle is completed.
The current process state is maintained by probing the encoder reads in the FIFO inter-
rupt handler. Upon completion of the control cycle, the Kalman filter control algorithm is
executed to estimate the entire state vector prior to transmission on the supervisory I/O
bus. The state vector is stored as a four-dimensional floating point array in which each state
variable is 32 bits. Each byte of the state vector along with the output control voltage, also
stored as a floating point variable, is transmitted on the UART bus and reassembled by the
external recipient.
Table 5.5: Packet composition of process data transmission on UART from IOI.

Data    Bytes   Byte order
θ       0–3     Upper, Upper-Middle, Lower-Middle, Lower
α       4–7     Upper, Upper-Middle, Lower-Middle, Lower
θ̇       8–11    Upper, Upper-Middle, Lower-Middle, Lower
α̇       12–15   Upper, Upper-Middle, Lower-Middle, Lower
u       16–19   Upper, Upper-Middle, Lower-Middle, Lower
Footer  20–23   Assertion State, '–', '–', '\n'
The syntax of the transmission packet is represented in Table 5.5. Each of the 32-bit floating point variables is divided into four bytes. A total of 24 bytes are transmitted on the supervisory UART at each control cycle by the IOI. The last three bytes are a constant delimiter
representing the end of each packet. This delimiter is used for establishing boundaries be-
tween multiple packets received on the UART. The UART transmission is handled by the
AXI UART peripheral in hardware and is non-blocking. Once space is available, bytes are
loaded to the buffer and transmitted by the UART controller without hanging the IOI idle
loop as represented in Figure 5.6. As a result, the transmission of the 24 bytes happens in
parallel to the execution of other idle task methods. The time required to transmit a packet
is calculated using Equation 5.3.
Tpacket = bytes × (data bits + overhead bits) / baud rate        (5.3)
There are a total of 24 bytes in each packet as shown in Table 5.5. The AXI UART peripheral
is configured for 8 data bits with no parity. However, the UART protocol specifies a start and
stop bit during the transmission of each byte. As a result, there are two additional overhead bits transmitted with each byte. With a baud rate of 921,600, the transmission of a packet on the supervisory I/O bus takes Tpacket = 0.2604 milliseconds. This is only a fraction of the one millisecond control cycle time, ensuring no delay or lag in the transmission of real-time process data on the supervisory I/O bus.
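Equation 5.3 with these parameters can be checked directly:

```c
/* Packet transmission time from Equation 5.3, in milliseconds. */
static double packet_time_ms(int bytes, int data_bits, int overhead_bits,
                             double baud_rate)
{
    return bytes * (data_bits + overhead_bits) / baud_rate * 1000.0;
}
```

packet_time_ms(24, 8, 2, 921600.0) evaluates to approximately 0.2604 ms, the figure quoted above.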
As depicted in Figure 5.6, a trigger is asserted by two mechanisms: the trigger mechanism
or watch dog timer (WDT). These mechanisms are the primary safeguards of TAIGA and
are enabled with a physical button press on the ZYBO once a proper boot sequence is
initiated and the pendulum is balanced upright under stable production control operation.
A single byte trigger assertion state, specified in Table 5.5, is sent to the supervisor for
reporting whether or not the trigger has been asserted. This flag reports TAIGA-specific
states represented by single characters as described in Table 5.6.
Table 5.6: Flag transmitted on the UART bus to report TAIGA's trigger state.

Flag  Description
'P'   Production control of the RIP without IOI trigger mechanisms enabled.
'S'   Production control of the RIP with IOI trigger mechanisms enabled.
'G'   The trigger is asserted by the trigger mechanism, which preemptively detected a guard violation.
'W'   Trigger assertion due to expiration of the WDT counter.
'T'   Trigger assertion by both the WDT and trigger mechanism.
5.5.3 Trigger Mechanism
TAIGA’s IOI is responsible for responding to malicious plant behavior by maintaining process
states, anticipating malicious operation, and initiating switch-over to the backup controller.
The IOI hosts a trigger mechanism for forecasting the trajectory of the pendulum based
on the current process states. As illustrated in Figure 5.6, a trigger mechanism method —
trivial, prediction, or classification — is executed upon completion of the state estimation in
order to use the real-time process states for detecting malicious behavior. If a guard violation
is anticipated by the trigger mechanism, the trigger is asserted which initiates a switch-over to
the backup controller via the controller multiplexer. The three trigger mechanisms developed
for the inverted pendulum application are implemented as C functions:
• The trivial trigger mechanism accepts the first two states, α and θ, as inputs and
checks if they are confined within the operational and safety-critical guard boundaries
defined in Table 4.3.
• The online prediction mechanism accepts the current process state vector and the number of iterations as input arguments. A local prediction state vector is maintained within the method and periodically synchronized with the current process state. The LQG control algorithm and plant model are applied to the prediction state vector, and the trivial trigger mechanism is evaluated for each future iteration.
• The classification method is an arithmetic function that accepts the current process
state vector and returns a boolean value indicating whether or not the pendulum
process is within a region of safety.
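The trivial mechanism can be sketched as below. The guard bounds are assumed from the values quoted elsewhere in the text (|α| < 15° controllable region, |θ| ≤ 35° operational guard); the actual limits are those of Table 4.3:

```c
#include <math.h>

/* Sketch of the trivial trigger mechanism. Guard bounds are assumed:
   |alpha| < 15 deg (controllable region) and |theta| <= 35 deg
   (operational guard). Returns 1 when a violation is detected. */
static int trivial_trigger(float alpha_deg, float theta_deg)
{
    return (fabsf(alpha_deg) >= 15.0f) || (fabsf(theta_deg) > 35.0f);
}
```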
5.5.4 Watch Dog Timer
The process states in the IOI are only updated and maintained upon the completion of a
control cycle. The IOI does not independently initiate sensor reads or voltage writes on
the process control bus, but rather probes the sensory measurements from process control
requests initiated by the controller. Under a covert DoS attack, a compromised process
control sequence in the production controller may disable any communication with the IOI.
In order to detect the absence of control, a WDT is implemented to ensure strict conformity
to the one millisecond control cycle period.
As illustrated in Figure 5.1, an AXI WDT peripheral is implemented in the PL with an
interrupt controller for detecting the timer expiration. The WDT acts as a hardware counter
that generates an interrupt when it overflows. The WDT counter width is configured in
hardware with the period calculated using Equation 5.4.
TWDT = 2^(WDT width) / fCLK        (5.4)
The WDT expires when a one millisecond control cycle is exceeded by the production con-
troller. As a result, TWDT ≥ 1 ms to allow some threshold for imprecision within the control
cycle. The width of the WDT is configured to 18 bits with the WDT clocked by the 144.4
MHz PL fabric clock (FCLK CLK0) which results in a WDT period of TWDT = 1.8148 ms.
As illustrated in Figure 5.6, the WDT counter is reset every time a control cycle is completed
by the controller. When the period of the production controller's control cycle exceeds 1.8148 milliseconds, a trigger is asserted, initiating switch-over to backup control.
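Equation 5.4 can be checked the same way; the exact fabric clock frequency is assumed here to be 144,444,444 Hz (the nominal 144.4 MHz):

```c
#include <stdint.h>

/* WDT period from Equation 5.4: T_WDT = 2^width / f_CLK, in milliseconds. */
static double wdt_period_ms(unsigned width_bits, double f_clk_hz)
{
    return (double)((uint64_t)1 << width_bits) / f_clk_hz * 1000.0;
}
```

wdt_period_ms(18, 144444444.0) evaluates to approximately 1.8148 ms, the period quoted above.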
The idle loop shown in Figure 5.6 is a background process that can be preempted only by an interrupt. The WDT addresses DoS attacks which are not detected by the trigger mechanism
since process states within the IOI are not updated. Another covert attack leverages the
idle loop interruption by flooding FIFO requests on the queue such that the idle loop is
continuously interrupted by the FIFO interrupt handler. Since the WDT is also interrupt
driven and external to the control loop, such an attack is detected by the WDT’s interrupt
service once the WDT’s counter is not routinely reset as expected.
The idle task and WDT represented in Figure 5.6 contain mechanisms that assert the trigger
to switch to backup control when malicious or unexpected control and process behavior are
detected. No automated process or mechanism exists within the IOI or TAIGA system that
clears the trigger to resume production control. In order to satisfy TR5, physical and manual
operator intervention is required to reset the TAIGA system and regain production control.
5.6 Controller Multiplexer
The controller queue multiplexer is implemented within the PL, as illustrated in Figure 5.1,
and acts as a bidirectional multiplexer for all signals in the backup and production controller
FIFOs. An input controller switch select signal is used to select between the production
or backup controller queues. When cleared to logic low, the enqueue and dequeue signals of
the IOI are connected to the enqueue and dequeue FIFO generator blocks of the production
controller. When set to logic high, the enqueue and dequeue signals of the IOI are connected
to the enqueue and dequeue FIFO generator blocks of the backup controller.
The controller queue multiplexer block is created using HLS. The Xilinx HLS design flow
allows a high-level implementation, in C/C++, of a combinational or sequential circuit
with PL resources. The C/C++ code is translated to HDL which is then synthesized and
implemented using the Vivado tool. Furthermore, Xilinx interface protocols such as AXI are
included in the HLS design flow as libraries and pragmas which allow for simpler interfacing
with external functional blocks by abstracting away the bit-level and timing details of the
interface.
Table 5.7: Signals of the controller queue multiplexer allocated to their respective masters.

         Production          Backup              IOI
Input    rx data a[31:0]     rx data b[31:0]     tx data[31:0]
         rx valid a          rx valid b          tx valid
         rx tlast a          rx tlast b          tx tlast
         tx ready a          tx ready b          rx ready
                                                 switch select
Output   tx data a[31:0]     tx data b[31:0]     rx data[31:0]
         tx valid a          tx valid b          rx valid
         tx tlast a          tx tlast b          rx tlast
         rx ready a          rx ready b          tx ready
The inputs and outputs of the controller queue multiplexer module are represented in Ta-
ble 5.7. These signals, described in Table 5.2, are defined as ports to the HLS block with
the following directive:
#pragma HLS INTERFACE ap_none port=signal

The ap_none parameter of the HLS directive specifies that the specific port does not contain an I/O protocol such as AXI or FIFO. The input signals of each controller are connected to
the output signals of the IOI and vice versa based on the switch select signal. This function
is implemented as an if-then statement with the switch select signal as the condition. Once
translated, the controller multiplexer block is synthesized and implemented using 105 LUTs.
The port interfaces and functional implementation do not require any logic storage elements
or sequential logic design. The functional block is thus implemented as a combinational
circuit without any clock inputs.
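The if-then routing can be sketched in C close to the HLS source. Port names follow Table 5.7 with underscores restored; the interface pragmas are omitted, and driving idle values to the deselected controller's ports is an assumption:

```c
#include <stdint.h>

/* Combinational controller queue multiplexer, sketched as HLS-style C.
   switch_select = 0 routes the IOI queues to the production controller (a),
   switch_select = 1 to the backup controller (b). */
void controller_mux(
    /* inputs from controllers */
    uint32_t rx_data_a, int rx_valid_a, int rx_tlast_a, int tx_ready_a,
    uint32_t rx_data_b, int rx_valid_b, int rx_tlast_b, int tx_ready_b,
    /* inputs from IOI */
    uint32_t tx_data, int tx_valid, int tx_tlast, int rx_ready,
    int switch_select,
    /* outputs to controllers */
    uint32_t *tx_data_a, int *tx_valid_a, int *tx_tlast_a, int *rx_ready_a,
    uint32_t *tx_data_b, int *tx_valid_b, int *tx_tlast_b, int *rx_ready_b,
    /* outputs to IOI */
    uint32_t *rx_data, int *rx_valid, int *rx_tlast, int *tx_ready)
{
    if (switch_select == 0) {       /* production controller selected */
        *tx_data_a = tx_data; *tx_valid_a = tx_valid;
        *tx_tlast_a = tx_tlast; *rx_ready_a = rx_ready;
        *rx_data = rx_data_a; *rx_valid = rx_valid_a;
        *rx_tlast = rx_tlast_a; *tx_ready = tx_ready_a;
        /* deselected backup controller sees idle inputs (assumption) */
        *tx_data_b = 0; *tx_valid_b = 0; *tx_tlast_b = 0; *rx_ready_b = 0;
    } else {                        /* backup controller selected */
        *tx_data_b = tx_data; *tx_valid_b = tx_valid;
        *tx_tlast_b = tx_tlast; *rx_ready_b = rx_ready;
        *rx_data = rx_data_b; *rx_valid = rx_valid_b;
        *rx_tlast = rx_tlast_b; *tx_ready = tx_ready_b;
        /* deselected production controller sees idle inputs (assumption) */
        *tx_data_a = 0; *tx_valid_a = 0; *tx_tlast_a = 0; *rx_ready_a = 0;
    }
}
```

Because the body is a pure function of its inputs, HLS maps it to combinational logic with no clock, matching the implementation described above.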
Chapter 6

Integration of TAIGA in Cyber-Physical Control
TAIGA is applied to CPS leaf nodes and acts as a last line of defence in protecting process integrity. Typically, CPS leaf nodes are interfaced through other regulatory controllers for interaction with the IT infrastructure which monitors and operates the plant. Conforming to conventional cyber-physical control, the TAIGA implementation for the RIP application is integrated into a CPS topology. Figure 6.1 illustrates the implementation of the regulatory controllers and the integration of the ZYBO platform with the cyber network as initially sketched in Figure 2.1.
Figure 6.1: Integration of the TAIGA leaf node into a cyber-physical control topology.
The ZYBO board resembles a PLC leaf node with supervisory interaction interfaced through
an RTU. As presented in Figure 6.1, an RTU is introduced to facilitate interaction be-
tween the ZYBO and IT infrastructure. This chapter elaborates on the implementation
of the RTU and various software services that enable remote monitoring and actuation of
the system. The source code of both client and server side services for cyber-physical con-
trol is maintained in the following GitHub repository: https://github.com/tejachil/
RPi-Remote-Terminal-Unit.git.
6.1 Remote Terminal Unit
In DCS or SCADA systems, the PLCs are the lowest-level embedded platforms that do not
directly interact with the IT supervisory network. An RTU is responsible for interacting
with the PLC on a simpler telemetry channel, aggregating multiple PLC nodes, and relaying
messages on the IT network. A Raspberry Pi running an optimized flavor of Linux, Raspbian,
is used in the implementation of the RTU as illustrated in Figure 6.1. Containing a high-
performance processor capable of interaction with IT networking services as well as lower-
level telemetry channels, the Raspberry Pi is a suitable embedded platform.
As identified in the implementation of TAIGA, Figure 5.1, the reconfiguration network and
supervisory I/O network are both implemented on dedicated UARTs using USB-to-UART
converters. These UARTs are connected to the Raspberry Pi’s USB ports and are supported
as serial devices in Linux as shown in Figure 6.1. The UART is isolated from the shared
UART/JTAG port to prevent FPGA reconfiguration via the Raspberry Pi. The JTAG USB
port is not networked to supervisory controllers and is used to securely update the PL of
TAIGA with physical access to the controller.
6.1.1 Remote Monitoring and Control Server
The IOI routinely transmits process states through the supervisory I/O network at each
control cycle based on the packet specifications in Table 5.5. Intended for supervisory mon-
itoring of the pendulum, the RTU relays this information through a local area network.
The Raspberry Pi hosts a web server that transmits the process data to any remote clients
requesting it.
In the open systems interconnection (OSI) model, the Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are the most prevalent transport layer protocols for information exchange on the Internet. The one millisecond control cycle time is aggressive for relaying the 24 byte data packet specified in Table 5.5. For efficient real-time process monitoring, a robust communication protocol with high bandwidth is necessary. Data transfer with TCP follows a request and response scheme: new information is requested by the client and the server responds. The bidirectional communication ensures reliable packet exchange but is relatively slow.
UDP is a minimal network transmission model that transmits data unidirectionally to a destination IP and port. The lack of handshaking and bidirectional acknowledgements means packet delivery is not guaranteed, making UDP relatively unreliable in comparison to TCP. In the cyber-physical control domain, obtaining real-time process data is more crucial for process monitoring than the reliability of transmission. UDP is much faster and capable of streaming the RIP process states unidirectionally, minimizing unnecessary overheads and ensuring timely delivery of data that represents the real-time process state.
A UDP web server is implemented on the Raspberry Pi using Node.js, a JavaScript run-time
environment with an event-driven programming model. The environment is well supported
and allows easy implementation of lightweight, fast, and scalable network applications which
suit the Raspberry Pi platform. The web server receives packets on the supervisory I/O serial
port, assembles the process states, and transmits them to clients requesting the information.
Initially, a client transmits a message to the web server which then adds the client’s IP and
port to an internal list. At each control cycle, the state vector is streamed to all clients on
the web server’s list.
Supervisory control commands to update the operational set-point, as specified in Table 5.4,
are accepted via the UDP port and transmitted on the supervisory UART. The web server
also keeps track of the IOI’s state by monitoring the state flag described in Table 5.6.
Anytime the state of the flag is changed, a message is logged to the serial terminal window
signifying the start of trigger mechanisms or a trigger assertion. Providing bidirectional
communication on the supervisory UART, the UDP web server allows real-time remote
process monitoring and operation of the RIP.
The Raspberry Pi is a private node connected to a router with a static IP address,
128.173.52.36. The UDP web server is opened on port 32392. The service transmits
process data to clients and receives data requests and control commands on this port. Public
access to the Raspberry Pi is granted by forwarding this port via the router to the private
Raspberry Pi’s IP making the web server accessible to any clients on the Internet.
6.2 Human-machine interface GUI
The implementation of the web server within the RTU is standalone and isolated from the
requirements and applications of the client. Client software may be customized to the
monitoring and control needs of the supervisor, and because implementations can vary, the
overall system remains portable. For the RIP application, a graphical user interface (GUI) is designed to
visually represent the pendulum state and intuitively control the RIP’s operating conditions.
Figure 6.2: GUI for remote monitoring and control of the RIP.
The GUI, illustrated in Figure 6.2, is implemented in the Processing development environ-
ment. Processing is a community-driven, open-source programming language built on Java
with support for OpenGL and other interactive graphics libraries. The Processing environ-
ment suits and simplifies the pendulum’s visual representation.
6.2.1 Monitoring
By pressing ‘a’ on the GUI window, the client transmits a message to the UDP server running
on the Raspberry Pi requesting a process data stream. The client’s IP address and port are
added to the server’s client list and data is transmitted to the client at each consecutive
control cycle. The state vector and control voltage values received from the server are
quantitatively displayed on the GUI. The current state of the IOI module, as described in
Table 5.6, is also displayed to indicate whether or not a trigger has been asserted.
The pendulum is projected into 2-D to visually represent the RIP operation as illustrated
in Figure 6.2. The black line radiating from the center of the white semi-circle represents
the servo arm and rotates along the center axis based on the θ position. The thick blue line
represents the pendulum and protrudes from the end of the servo arm vertically. Similar to
the servo arm, the pendulum also oscillates from the vertical based on the current α position.
The trigger mechanisms within the IOI idle loop are started by a physical button press once
the pendulum is in stable operation. In order to quantitatively determine stable operation,
the classification algorithm implemented as a trigger mechanism in the IOI is ported to a
Java function and implemented in the Processing GUI. The algorithm uses the process states
derived over the supervisory I/O to determine if the process is in safe and stable operation
before starting the IOI trigger mechanism. Unsafe classification is signalled to the operator
on the GUI. This offline classification prevents unnecessary guard triggers during setup.
6.2.2 Control
The GUI provides an intuitive visual input for selecting the operational set-point of the RIP.
The green line, as shown in Figure 6.2, is draggable radially along the θ semi-circle and
used to input new operational set-points. Once the mouse is released on a desired position,
the new set-point is displayed on the GUI and a new set-point command is formulated with
the syntax specified in Table 5.4 and transmitted to the UDP server. The last byte of
the set-point command is encoded using UTF-8 to allow usage of all 8 bits as opposed to
ASCII encoding.
User control capabilities are limited based on the current state of the IOI module. If the
trigger mechanisms are not started and the process is governed by the production controller,
actuation limits for the draggable set-point selection line are set to ±90◦. Once the trigger
mechanisms are started within the IOI, the acceptable region for set-point selection is limited
to the operational guards of ±35◦ and represented by the red lines. This client side limiter
on acceptable set-points prevents transmission of malicious set-points that will be ignored
anyway by the IOI.
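The mode-dependent limiter can be sketched with the ±90° and ±35° limits from the text; this is an illustrative Python version, while the actual GUI logic is written in Processing/Java.

```python
def clamp_setpoint(theta_desired: float, triggers_started: bool) -> float:
    """Clamp a requested set-point to the acceptable region (sketch).

    Before the trigger mechanisms start, the full actuation range of
    +/-90 degrees is allowed; afterwards the operational guard of
    +/-35 degrees applies, mirroring the IOI's own check."""
    limit = 35.0 if triggers_started else 90.0
    return max(-limit, min(limit, theta_desired))
```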
6.3 Remote Surveillance
More often than not, safety-critical processes are under constant remote surveillance using
networked cameras for supervisory operator feedback and enforcement of perimeter security.
A Logitech C52 USB webcam with 720p HD resolution and auto-focus is used to remotely
surveil the RIP setup during operation. The camera is connected to a USB port of the
Raspberry Pi which broadcasts a live camera feed to the Internet.
Motion is a Linux service primarily intended for motion detection and video processing of
camera feeds. The service also contains a lightweight web server that broadcasts the live
camera feed to a specified network port. The motion service is started on Raspberry Pi’s
Linux and configured to broadcast on port 8081. Similar to the UDP server, the webcam
server’s port is forwarded to the public IP via the network router.
The live feed is accessed remotely via the Internet by directing a browser to the webcam
server’s IP and port: 128.173.52.36:8081. An example of the live camera stream from
the client side is shown in Figure 6.3. Surveillance of the webcam feed provides a secondary
external source for monitoring the RIP behavior that does not rely on the process state data
transmitted by the supervisory UART and relayed by the UDP server.
Adversarial access into the RTU compromises the process data transmission on the UDP
server, inherently shutting off communication between the leaf node and supervisor. Replay
attacks at this level transmit old process data to the supervisor while mounting an attack
Figure 6.3: Live camera feed of RIP for remote surveillance.
on the PLC. The TAIGA framework protects process safety and stability but the supervisor
is unaware of the attempted attack or backup operation of the PLC. The webcam acts as
an external visual sensor for assessing the state of the RIP. Replay attacks are much harder
with remote surveillance since they require replaying and synchronizing not only process state
data but also the camera stream, which is difficult and impractical even with the availability
of network resources.
Chapter 7
Results
TAIGA’s effectiveness is evaluated in the cyber-physical control domain on the RIP ex-
periment. The primary purpose of TAIGA is to strengthen security at the leaf nodes of
cyber-physical control. The effectiveness of the trigger mechanism in protecting the RIP
safety and stability is validated under simulated network integrity and reconfiguration at-
tack scenarios. For network integrity attacks, the supervisory control GUI and RTU are
used to launch malicious operating conditions to the leaf node. Reconfiguration attacks are
simulated by loading compromised production control code containing latent methods that
threaten process safety and stability.
Since TAIGA is applied to embedded systems, the added cost in terms of resources, com-
putation, and power are important considerations. The addition of TAIGA is compared to
a typical leaf node implementation. Resource usage, latency, and execution time of various
methods are relevant performance metrics.
Process data for attack responses are collected in real-time using the supervisory network.
Several I/O pins are routed to a Pmod and used primarily for testing and debugging. These
pins are toggled and analyzed on a digital oscilloscope to obtain execution and timing results.
7.1 Resilience to Simulated Attack Scenarios
The RIP implementation of TAIGA is rigorously tested under network integrity and recon-
figuration attacks. The reconfiguration attack space subsumes the attacks possible through
the network channels and is thus much more dangerous. By strengthening resilience to re-
configuration attacks on the system, network integrity attacks are also addressed. Network
protection, although complementary to TAIGA, is not the primary focus of this study. The
attack scenarios investigated attempt to disrupt process behavior and do not attempt to
violate network integrity since TAIGA operates under the assumption that the network and
untrusted modules are already compromised.
7.1.1 Denial-of-Service Attack
In this implementation, the RIP is governed independently by the controllers within TAIGA
and does not require constant supervisory input. Supervisory commands only update the
operating condition of the pendulum and are not integrated directly into the process control
loop. As a result, DoS from the supervisory network does not jeopardize pendulum safety
or stability.
Rather, a DoS attack originates from the production controller and thus requires intrusion
through the reconfiguration network. In the simulated attack, the production controller is
infected with latent malware that denies execution of the process control sequence and shuts
off all communication with the FIFOs. Since new sensor measurements are not updated in
the IOI, the trigger mechanism is no longer aware of the current process state and cannot an-
ticipate guard violations. Rather, the IOI’s WDT expires once the control cycle time period
exceeds 1.8148 ms and asserts the trigger, initiating switch-over to the backup controller.
Figure 7.1 shows the response of the system under the simulated DoS attack. Initially, the
production controller is governing the process with a strict millisecond control cycle time
interval. At TDoS = 30 ms, the production controller stops executing the control sequence
and shuts off all communication with the IOI. Approximately 0.8148 ms later, the WDT
counter expires, generating an interrupt in the IOI that initiates switch-over to backup
control, which operates the RIP without loss of safety or stability.
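The watchdog logic can be sketched as follows; the 1.8148 ms expiry value comes from the text, while the class structure is purely illustrative of the IOI's behavior.

```python
class WatchdogTrigger:
    """Sketch of the IOI's WDT-based DoS detection."""

    EXPIRY_MS = 1.8148  # assert the trigger if a control cycle stalls past this

    def __init__(self):
        self.elapsed_ms = 0.0
        self.trigger_asserted = False

    def kick(self):
        # Reset when the production controller completes a control cycle.
        self.elapsed_ms = 0.0

    def tick(self, dt_ms: float):
        # Advance the watchdog; expiry asserts the trigger, which
        # initiates switch-over to the backup controller.
        self.elapsed_ms += dt_ms
        if self.elapsed_ms > self.EXPIRY_MS:
            self.trigger_asserted = True
```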
The backup and production controller operate asynchronously on separate clocks. However,
the timer routine for the backup does not initiate until it is invoked by a trigger assertion. In
the implementation of the backup controller, an iteration of the pendulum control sequence
is executed after process states are retrieved from the IOI and before starting the timer. This
ensures backup control starts at the instant of trigger assertion as observed in Figure 7.1.
Figure 7.1: Digital oscilloscope capture of plant response to a simulated DoS attack at time TDoS = 30 seconds.
Under a DoS attack, a 0.8148 ms delay is caused in the control cycle during the switch-over
process since the expiration period of the WDT is greater than the control cycle time. In the
results presented in Figure 7.1, a 1.884 ms time interval is measured between the time of DoS
and the start of backup control. This delay is negligible relative to the pendulum response
time of 1.2 seconds, and ensures process stability during the switch-over.
Livelock
Another stealthy DoS attack also originating from malicious reconfiguration requires system
knowledge of TAIGA and exploits the interrupt-driven FIFO handler within the IOI. The
trigger mechanism exists in the IOI’s idle loop and is routinely interrupted by new packets
on the FIFO. An attacker can take advantage of this interruption by continuously flooding
the FIFO with new packet requests. The execution of the idle task containing the trigger
mechanism is constantly interrupted, causing livelock in which process control is maintained
but execution of other IOI methods is prevented. Such an attack is also addressed in a
manner similar to a DoS attack since the WDT expiration is interrupt-driven and able to
assert the trigger. The WDT is only reset once the trigger mechanism is executed, enforcing
strict trigger mechanism validation at each millisecond control cycle time.
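The ordering that defeats the livelock attack can be sketched as follows; the callables are placeholders, not the actual IOI routines.

```python
def idle_loop_step(service_fifo, run_trigger_mechanism, kick_wdt):
    """Sketch of the livelock defence: servicing FIFO traffic never
    resets the watchdog; only completing the trigger mechanism does,
    so an attacker flooding the FIFO cannot starve the trigger check
    without eventually causing WDT expiry."""
    service_fifo()            # may be flooded/preempted by an attacker
    run_trigger_mechanism()   # must execute once per control cycle
    kick_wdt()                # WDT reset happens only after the trigger check
```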
7.1.2 Set-Point Attack
The operational set-point of the production controller originates from supervisory control.
The production controller requests an updated set-point from the IOI at each control cy-
cle. As a result, a set-point attack can originate from either the supervisory network or a
maliciously reconfigured process control routine in the production controller.
Figure 7.2: Plant response to a simulated supervisory attack at time Tattack = 60 seconds.
A set-point attack originating from the supervisory network is checked by the IOI module
to be within the operational guard, −35◦ < θdesired < 35◦. A stealthy supervisory set-
point attack is executed in which the supervisor changes the operational set-point to slightly
less than the operational guard boundary, θdesired = 34◦, at time Tattack = 60 seconds to evade
detection of the set-point verification method within the IOI.
The trigger mechanisms are the primary safeguard for recognizing this attack and are evalu-
ated in the results presented by Figure 7.2. For the trivial, prediction, and classifier trigger
mechanisms, the trigger is asserted at Tt = 60.344, Tp = 60.032, and Tc = 60.1138 seconds
respectively. The production controller operates safely at a set-point of 10◦ prior to the
attack. The backup controller is pre-configured to operate at 0◦.
Trivial Trigger Mechanism
The trigger is asserted by the trivial trigger mechanism at the moment the operational guard
is violated. The RIP undergoes a lightly damped response when repositioning the servo arm
to the backup set-point of θbackup = 0◦ causing an overshoot that violates both operational
and safety critical guards as shown in Figure 7.2. Since the trigger mechanism does not
preemptively anticipate the guard violation and take corrective action ahead of time, both
process safety and stability are violated.
Online Prediction Trigger Mechanism
The prediction trigger mechanism forecasts the RIP trajectory and preemptively detects a
guard violation. For a short instant following the attack, the prediction trigger mechanism
synchronizes its states with the real-time process states. Accelerated execution of the plant
model and backup control algorithm foresees a guard violation and the trigger is asserted.
At each control cycle, 50 iterations of the prediction trigger mechanism are executed to
maintain the execution time of the prediction unit within the one millisecond control cycle
time. In actuality, the guard violation occurs approximately 344 milliseconds after the
time of attack, as demonstrated by the trajectory of the trivial trigger mechanism.
Theoretically, the prediction trigger mechanism is able to predict a guard violation within
seven control cycles after state synchronization if it follows a similar trajectory as the trivial
trigger mechanism. In the results presented in Figure 7.2, the trigger is asserted in about 32
control cycles after the time of attack which maintains the RIP well within the operational
and safety-critical bounds.
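The per-cycle prediction loop can be sketched as follows, assuming placeholder `step_model` and `violates_guard` functions in place of the actual RIP plant model under backup control and its guard checks.

```python
def prediction_trigger(state, step_model, violates_guard, iters=50):
    """Sketch of the online prediction trigger mechanism.

    From the latest synchronized state, the plant model is stepped
    `iters` times per control cycle (50 in the text, keeping execution
    within the 1 ms budget); the trigger asserts preemptively if any
    forecast state violates a guard."""
    forecast = state
    for _ in range(iters):
        forecast = step_model(forecast)
        if violates_guard(forecast):
            return True   # assert trigger before the real violation occurs
    return False
```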
Classification Trigger Mechanism
After Tattack, the pendulum arm velocity is dramatically increased to actuate the system
to the malicious set-point, as seen by the increase in slope of the classifier’s trajectory at
Tc = 60.1138 seconds. The classifier bounds all four dimensions of the state vector within
a region of safety and stability. The drastic increase in pendulum arm velocity at a θ
position that is approaching the operational guard causes the classifier to assert the trigger
by identifying the RIP operation to be outside of a safe state vector region.
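The classifier's region test can be sketched as a bound on each dimension of the state vector; the actual classifier's region and bound values are not specified here, so the axis-aligned form and numbers below are illustrative only.

```python
def classifier_trigger(state, bounds):
    """Sketch of the classification trigger mechanism: the four-dimensional
    state vector (theta, alpha, theta_dot, alpha_dot) must lie inside a
    pre-characterized region of safe, recoverable operation. Returns True
    (trigger asserted) if any dimension leaves its bound."""
    return any(not (lo <= x <= hi) for x, (lo, hi) in zip(state, bounds))
```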
The classifier asserts the trigger at the last possible point of safe recovery under backup
control. While the trigger is not asserted as preemptively as the online prediction method,
the pendulum is maintained well within the system’s operational guards. Process safety and
stability are not compromised, the primary objective of the trigger mechanism.
Reconfiguration Set-Point Attack
Under supervisory attacks, the prediction unit has knowledge of the production controller’s
desired operational set-point since it propagates through the IOI. A more covert set-point at-
tack originates from a compromised reconfiguration network in which latent malware within
the process control sequence ignores the operational conditions of the supervisor from the
IOI and attempts to operate the RIP at a malicious set-point.
In such a scenario, the prediction algorithm does not have knowledge of the desired oper-
ational set-point and predicts trajectory based on the backup controller’s set-point. The
prediction trigger mechanism is not able to assert the trigger as preemptively as the tra-
jectory shown in Figure 7.2. Rather, the prediction trigger mechanism follows an attack
response trajectory similar to the classifier’s and still maintains process safety and stability.
The classifier does not require knowledge of the desired operating conditions since it is
testing whether the backup controller can return the system to the backup set-point without
incurring a guard violation. The classifier trigger mechanism responds similarly to set-point
attacks originating from the supervisory or reconfiguration network; this response is captured
in Figure 7.2.
7.1.3 Deception Attack
Deception attacks are a class of attacks that are executed independently within process
control loops to disrupt plant behavior without the need of disclosure resources or online
network availability [24]. In the RIP framework, deception attacks propagate through a
compromised reconfiguration network and require malicious reconfiguration of the production
controller to gain sufficient disruption resources to modify the process control loop. Two
types of deception attack scenarios are investigated: a zero-dynamics attack in which the
encoder counter is reset arbitrarily under stable operation, and a bias-injection attack in
which dangerous voltages are written to the servo.
Zero-Dynamics Attack
Upon boot-up, the IOI initializes the ICs in the SACIB and the encoder counters are reset
with the servo arm stationary and pendulum pointing downwards. This position is the zero-
frame of reference for all encoder reads during process control for both the production and
backup controller. Resetting the encoder counters upon mobile operation of the RIP would
result in an offset frame of reference for sensory inputs jeopardizing not only process stability
for the production controller, but also the backup.
The IOI’s SPI filter limits the scope of interaction with the controller and plant. A SPI
command on the FIFO containing encoder counter reset data is ignored by the SPI filter and
is not transmitted on the SPI bus. Only sensory reads and servo voltage writes are relayed
to the physical process thus addressing zero-dynamic attacks on the RIP experiment.
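The SPI filter's whitelist behavior can be sketched as follows; the opcodes are invented for illustration, since the actual SPI command encoding is not given in this chapter.

```python
# Illustrative opcodes only; the real SPI command encoding is not specified here.
READ_ENCODER, WRITE_VOLTAGE, RESET_COUNTER = 0x01, 0x02, 0x03
ALLOWED = {READ_ENCODER, WRITE_VOLTAGE}

def spi_filter(command_opcode: int) -> bool:
    """Sketch of the IOI's SPI filter: only sensory reads and servo voltage
    writes are relayed to the SPI bus; anything else, such as an encoder
    counter reset, is silently dropped, defeating the zero-dynamics attack."""
    return command_opcode in ALLOWED
```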
Bias-injection Attack
Another class of deception attacks is false control data injection to the process control loop.
In the RIP experiment, malicious reconfiguration of the process control sequence allows
the adversary to write voltages that do not control the pendulum or are outside of the
servo actuator limits. Typically, lack of proper pendulum control is detected by the trigger
mechanisms and control is switched over to backup. However, momentary voltage writes
outside of the actuator limits can instantaneously damage the servo hardware before switch-
over to backup control.
Figure 7.3: Digital oscilloscope capture of servo voltage saturation at actuation limits, ±10 volts, during voltage sweep of ±15 volts.
The SPI filter addresses false voltage bias injection attacks by enforcing servo actuator
limits. The DAC ICs and operational amplifier circuit in the SACIB are sourced with a ±12
volt supply but tuned to operate at the ±10 volt range, the servo actuator limits. While the
SACIB circuitry addresses hard actuator limits, any voltage write on the SPI bus is verified
to be within ±10 volts and saturated to this range by the IOI as well. Figure 7.3 shows
the saturation of the servo enforced by both the IOI and SACIB infrastructure to protect
the servo. The trigger mechanisms are disabled in order to obtain these results in which
the production control sequence is maliciously reconfigured and attempts to sweep a servo
voltage actuation between ±15 volts.
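The IOI-side saturation can be sketched as a simple clamp to the ±10 volt actuator limits described above.

```python
def saturate_voltage(v: float, limit: float = 10.0) -> float:
    """Sketch of the IOI's servo voltage saturation: any commanded voltage
    outside the +/-10 V actuator limits is clamped before it reaches the
    SPI bus, mirroring the behavior captured in Figure 7.3."""
    return max(-limit, min(limit, v))
```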
7.2 Execution Time and Control Latency
In process control loops, minimizing latency between sensory feedback and actuator response
is necessary to maintain efficient and robust control. Embedded platforms are digital elec-
tronics that inherently have computation and processing latencies caused by the execution
of various processes. For the RIP, the execution time of TAIGA-specific methods are mea-
sured with a digital oscilloscope by probing debug I/Os toggled in software. The additional
latencies caused by TAIGA are analyzed as a key metric for ensuring robust process control.
Without TAIGA, the production controller interacts directly with the RIP on a SPI bus
for sensing and actuation. Transferring two bytes of data on the SPI bus takes 17.04 µs,
approximately 8.51 µs per byte. Based on fSCK = 1.805 MHz clock rate for the SPI bus,
the actual transmission of one byte takes 8 clock cycles which is approximately 4.4308 µs.
The remaining measured transfer time is attributed to loading the transfer register and
communication with the AXI peripheral, which is unavoidable.
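The per-byte timing decomposes as a quick check of the figures above; the variable names are illustrative.

```python
f_sck_hz = 1.805e6                    # SPI clock rate from the text
byte_shift_us = 8 * 1e6 / f_sck_hz    # 8 SCK periods to shift one byte, ~4.43 us
measured_per_byte_us = 8.51           # measured transfer time per byte
# The remainder is attributed to loading the transfer register and AXI traffic.
overhead_us = measured_per_byte_us - byte_shift_us
```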
In the TAIGA framework, the low-level SPI drivers are moved from the production controller
to the IOI, and communication is directed through FIFOs. The transmission of packets
through the FIFO for process interaction causes a propagation delay of approximately 3.7 µs.
While this added latency associated with the TAIGA implementation is large relative to the
SPI transfer of a single byte, all routines interacting with the process typically require more
than a single byte of transfer. Sensor reads require four byte transfers while voltage writes
require two. As a result, the added latency of the FIFOs and IOI does not drastically impact
the execution time of process interactions within the pendulum control sequence.
Table 7.1: Execution time of RIP control sequence.

             Get Set-Point   Read SPI Encoders        Control Algorithm   Write SPI Voltage   Total
Production   6.93 µs         2 × 42.7 µs = 85.4 µs    2.27 µs             26.8 µs             121.4 µs
Backup       0 µs            2 × 42.7 µs = 85.4 µs    13.48 µs            26.8 µs             125.68 µs
The execution times of the RIP control sequence methods are measured and presented in
Table 7.1. The set-point command does not require peripheral I/O; the execution time is
primarily attributed to the FIFO latency arising from bidirectional enqueue and dequeue
between the controller and the IOI. Interaction with the physical plant requires relatively
slow SPI transfers with respect to the FIFO latency as reflected by both the encoder read
and servo voltage write operations. The control algorithm execution deviates between the
production and backup controllers due to the difference in clock frequency and floating point
performance between the ARM and MicroBlaze processors.
The execution time of the entire process control sequence is less than 130 µs for both
controllers, consuming only about 13% of the one millisecond control cycle time. In the
TAIGA framework, the remaining control cycle time is used for updating process states,
executing the trigger mechanism, and interacting with the supervisor. These processes are
implemented in the IOI’s idle loop and execution time is measured and presented in Table 7.2.
Table 7.2: Execution time of IOI idle loop.

Method               Execution Time
UART Receive         2.32 µs
Kalman Filter        16.3 µs
Trigger Mechanism    Trivial: 4.05 µs; 50 Predictions: 828 µs; Classifier: 557 µs
UART Transmit        190 µs
The UART receive method polls the UART message buffer for new control command data
and is not a time-intensive process as depicted in Table 7.2. The execution of the Kalman
filter algorithm is necessary after the completion of each control cycle for state estimation
in the IOI. Since the computational performance of the IOI is the same as the backup
controller, the time required to calculate the Kalman filter is similar to the backup control
algorithm calculation. At each control cycle, 24 bytes of data are transmitted on the super-
visory UART bus as specified in Table 5.5. This transmission is non-blocking and executed
concurrently by the UART hardware controller but requires 190 µs to complete the UART
transmission methods. The actual transmission time is greater, as calculated by Equation 5.3.
As a result, the trigger mechanisms are run in parallel with the UART transmission.
The two preemptive trigger mechanisms, prediction and classification, are computationally
intensive as demonstrated in Table 7.2. The online prediction trigger mechanism is run for
an experimentally determined 50 iterations at each control cycle to ensure the method does
not exceed the allotted time. In contrast, the classifier takes much less time to execute.
Running the prediction algorithm for 50 iterations at each control cycle would require syn-
chronization with real-time process states every 24 control cycles to forecast trajectory for
the entire RIP response time of 1.2 seconds. Perfectly timed attacks that attempt to violate
operational guards within this 24-control-cycle synchronization window can evade detection
by the prediction method, making it non-ideal.
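The 24-cycle synchronization interval follows directly from the numbers above.

```python
control_cycle_ms = 1.0      # one millisecond control cycle
response_time_ms = 1200.0   # RIP response time, 1.2 seconds
iters_per_cycle = 50        # prediction iterations executed per control cycle

# Model steps needed to forecast over the full response time, and how often
# the prediction must resynchronize with real-time process states.
horizon_steps = response_time_ms / control_cycle_ms   # 1200 model steps
sync_interval_cycles = horizon_steps / iters_per_cycle
```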
The classifier method is a much more suitable trigger mechanism due to its smaller execution
time. A one millisecond control cycle is aggressive even in comparison to typical industrial
control applications. The ability to successfully implement TAIGA and preemptive detection
schemes without jeopardizing this control cycle time validates the effectiveness of TAIGA.
7.3 Resource Utilization
The additional resources utilized by TAIGA are an important metric for assessing cost. A
standalone production controller for the RIP experiment is implemented with just a
microcontroller containing built-in peripheral resources. In the Zynq processor, this implementa-
tion is contained within the PS. TAIGA is realized primarily in the PL with the addition
of the backup controller, IOI, FIFOs, controller multiplexer, and numerous other functional
and peripheral blocks for interfacing with internal and external components. A comparison
of resources with a standalone production controller and TAIGA is illustrated in Figure 7.4.
The standalone production controller uses no FPGA resources; it only uses I/O and a single
global buffer (BUFG) for interfacing and control of the SPI bus. As illustrated in Figure 7.4,
TAIGA uses a large amount of FPGA resources. The FIFO multiplexer designed using HLS
Figure 7.4: Post-implementation Zynq FPGA resource usage with and without TAIGA.
is synthesized to use 105 LUTs for the combinational circuit. The hardware optimizations for
the MicroBlaze processor each use five DSPs. A large amount of FFs are used during routing
for registers and buffers. Lastly, TAIGA is BRAM-intensive since BRAMs are allocated for
the instruction and local data memories of the two MicroBlazes and the four FIFOs. While
a significant portion of the FPGA resources are used, the Zynq-7010 used in the ZYBO
platform has fewer resources available than other chips in the family. For the cost and
relatively low performance of the platform, TAIGA is successfully implemented without
requiring any excess resources.
Satisfying timing constraints is crucial for accurate and reliable operation of FPGA designs.
TAIGA’s implementation contains a mixture of both combinational and sequential design.
The sequential portions of the PL are clocked by the FCLK_CLK0 fabric clock at 144.4 MHz.
For proper digital design, it is essential that the propagation delay through combinational
logic is less than the clock period of the sequential circuit elements.
The difference between the combinational propagation delay and sequential clock period is
known as slack. Negative slack is a critical timing violation since it implies combinational
propagation delay exceeds the clock period of sequential logic. The Vivado design tool es-
timates propagation through combinational circuit paths to determine slack. MicroBlaze
configurations form the critical paths in the TAIGA implementation. The fabric clock fre-
quency and MicroBlaze configuration are tuned to prevent negative slack. The estimated
slack for the standalone production and TAIGA implementation are specified in Table 7.3.
Table 7.3: Estimated slack for standalone production and TAIGA implementations on Zynq.

                    Standalone Production   TAIGA
Setup Slack         -                       0.205 ns
Hold Slack          -                       0.027 ns
Pulse Width Slack   7.845 ns                2.211 ns
The standalone production controller only contains one path in the PL between combina-
tional and sequential logic as shown in the pulse width slack. The TAIGA implementation,
on the other hand, contains 25712 setup and hold sensitive paths and 9338 pulse width sen-
sitive paths. Larger slack is favorable since it guarantees better circuit integrity. However,
in order to achieve maximum performance from the MicroBlaze processors, the fabric clock
is maximized without incurring negative slack. While the TAIGA slack is low, it is still
positive and satisfies the timing requirements for the Zynq’s PL.
Table 7.4: Estimated power consumption for standalone production and TAIGA implementations on Zynq.

                Standalone Production   TAIGA
Dynamic Power   1.56 W                  1.766 W
Static Power    0.133 W                 0.141 W
Total Power     1.693 W                 1.907 W
Power consumption is another cost metric relevant to embedded systems. The Vivado de-
sign tool contains a power estimation tool based on the number of resources used by the
FPGA design. Table 7.4 describes the power consumed by a standalone production con-
troller implementation and TAIGA. TAIGA consumes approximately 12.6% more power
than a standalone implementation of the production controller. Static power remains ap-
proximately the same on the Zynq even with the increase in resource usage. However, the
additional resources occupy a larger portion of the Zynq IC, and activity in this larger
die area increases the dynamic power. While power is pertinent to
embedded systems, it is not of critical concern in large control applications. The power con-
sumption of the embedded controllers is negligible relative to other power hungry electrical
components in most industrial applications.
Chapter 8
Conclusions
Incorporating aspects of autonomic systems and enforcing stringent trust requirements,
TAIGA is successfully implemented on an embedded configurable SoC platform to address
security at cyber-physical leaf nodes. Traditional security practices focus on protecting the
channels of intrusion at the CPS network layers. Compromise to these security precautions
and further infiltration to leaf nodes jeopardizes process safety and stability.
In TAIGA, security measures are implemented from a controls perspective to maintain pro-
cess integrity. The RIP experiment resembles industrial control applications with similar
process control concerns. Guards are defined on the pendulum system to establish bound-
aries of safe and stable operation. All production control process interaction is monitored
and arbitrated through the IOI. Trigger mechanisms are tailored for the RIP to preemp-
tively detect guard violations and switch-over to a trusted backup controller. The backup
controller, IOI, and trigger mechanism adapt to detected threats autonomically and are
protected by the TRs safeguarding them from malicious reconfiguration.
Responses to simulated attack scenarios show increased resilience to malicious reconfiguration
and network integrity attacks. DoS attacks originating from malicious reconfiguration
are successfully detected by the WDT within the IOI. Set-point attacks, classified as either
network integrity or reconfiguration attacks, are preemptively detected by the prediction
or classification trigger mechanisms, maintaining process safety and stability. The SPI filter
ignores attempted deception attacks, originating from a compromised production control
sequence, that could damage process infrastructure.
In contrast to existing cyber-physical threat response schemes, TAIGA maintains safety and
liveness when malicious behavior is detected by initiating a bump-less transition to backup
control. The preemptive trigger mechanisms satisfy the self-protecting property of autonomic
systems by proactively defending against anticipated threats. TAIGA autonomically
reconfigures control to the backup controller, which returns the system to stable operation
and makes the system self-healing.
8.1 Scope of TAIGA
More often than not, ICSes contain numerous sub-processes with many interconnected con-
trol nodes and intertwined control loops. Process control at a specific leaf node can influence
the behavior of other leaf nodes. For example, a PLC responsible for opening a valve based
on a pressure measurement in a chemical plant can cause disruptive temperature variation
at a different location. While the pressure measurement may have been contained within
defined operational guards, the valve opening can cause a guard violation at a different node.
The autonomic nature of TAIGA mandates strong awareness of the entire process, which is
not always feasible in industrial control applications. Inter-node communication between
ICS leaf nodes is usually carried over networks that are untrusted in the TAIGA framework.
As a result, implementing TAIGA in large interdependent CPSes is considerably more difficult.
TAIGA’s effectiveness relies on the integrity of the infrastructure layer and trust of all process
sensors and actuators. It is implemented in leaf nodes to enforce this process integrity by
maintaining process safety and liveness properties. However, certain safety-critical industrial
control applications do not guarantee trust below the leaf nodes. Power systems are an
example of such CPSes: they are physically distributed, which precludes efficient perimeter
security measures. TAIGA is more suitable for application domains in which the infrastructure
is protected with physical perimeter security.
The RIP implementation contains only a single process control loop. TAIGA can scale to
larger leaf nodes and additional process control loops as long as each leaf node can
accurately enforce high-level safety and stability properties, or inter-node dependencies
are contained locally to ensure trust within the telemetry channels. The proposed TAIGA
framework for cyber-physical leaf nodes is best suited for consolidated control networks
(such as aircraft, drones, and automobiles) rather than ICSes.
8.2 Future Work
TAIGA has been successfully implemented and applied to the RIP apparatus. However, to
better integrate the leaf node into a cyber-physical control topology, efficient
reconfiguration processes need to be implemented. Currently, simulated reconfiguration
attacks are loaded into the production controller as latent malware before bootup. In
deployed CPSes, controllers are remotely updated in real time while the system is active.
The controller parameter optimization and firmware update methods will be implemented
within the AMP framework to provide a remote channel for reconfiguration.
IOI methods protect the RIP system from threats targeting process safety and stability.
However, attacks that degrade process performance also cause plant disruption. The IOI is a
process-aware framework that is safeguarded by the TRs from common external threats. As
a result, additional trigger mechanisms, process monitors, and attack detection schemes
could be hosted within the IOI to address other threats or to respond more proactively to
specific network integrity or reconfiguration attacks.
TAIGA, from an architectural perspective, supports extension to encompass additional con-
trol modules. In the RIP application, the production and backup controllers are sufficient
to enforce process stability. However, more complex processes may require additional aux-
iliary, performance, or other controllers to ensure recovery from application-specific guard
violations.
TAIGA is well suited for independent or contained CPSes. Modern automobiles contain a
large number of computing and network elements for remote monitoring and control, making
them more vulnerable than ever before [11]. TAIGA could protect the individual controllers
of automotive systems by enforcing safety guards that can prevent accidents under network
attacks. Like automobiles, aircraft contain a large number of interconnected and networked
computation and control elements, making them susceptible to malicious intrusion. Aircraft
are perhaps even more safety-critical than automobiles and need protection from a wide
range of adversaries and threats.
Bibliography
[1] A. Al-Jodah, H. Zargarzadeh, and M. Abbas. Experimental verification and comparison
of different stabilizing controllers for a rotary inverted pendulum. In Control System,
Computing and Engineering (ICCSCE), 2013 IEEE International Conference on, pages
417–423, Nov 2013.
[2] R. Barry. FreeRTOS Reference Manual - API Functions and Configuration Options.
[3] E. Bernabeu, J. Thorp, and V. Centeno. Methodology for a security/dependability
adaptive protection scheme based on data mining. Power Delivery, IEEE Transactions
on, 27(1):104–111, Jan 2012.
[4] A. A. Cardenas, S. Amin, Z.-S. Lin, Y.-L. Huang, C.-Y. Huang, and S. Sastry. Attacks
against process control systems: Risk assessment, detection, and response. In Pro-
ceedings of the 6th ACM Symposium on Information, Computer and Communications
Security, ASIACCS ’11, pages 355–366, New York, NY, USA, 2011. ACM.
[5] N. T. Chiluvuri, O. A. Harshe, C. D. Patterson, and W. T. Baumann. Using heteroge-
neous computing to implement a trust isolated architecture for cyber-physical control
systems. In Proceedings of the 1st ACM Workshop on Cyber-Physical System Security,
CPSS ’15, pages 25–35, New York, NY, USA, 2015. ACM.
[6] Digilent. ZYBO Reference Manual, Feb 2014.
[7] Z. Franklin, C. Patterson, L. Lerner, and R. Prado. Isolating trust in an industrial
control system-on-chip architecture. In Resilient Control Systems (ISRCS), 2014 7th
International Symposium on, pages 1–6, Aug 2014.
[8] O. A. Harshe. Preemptive detection of cyber attacks in industrial control systems. Mas-
ter’s thesis, Virginia Tech, Bradley Department of Electrical and Computer Engineering,
Blacksburg, VA, Apr 2015.
[9] O. A. Harshe, N. Teja Chiluvuri, C. D. Patterson, and W. T. Baumann. Design and
implementation of a security framework for industrial control systems. In Industrial
Instrumentation and Control (ICIC), 2015 International Conference on, pages 127–132,
May 2015.
[10] J. Kephart and D. Chess. The vision of autonomic computing. Computer, 36(1):41–50,
Jan 2003.
[11] K. Koscher, A. Czeskis, F. Roesner, S. Patel, T. Kohno, S. Checkoway, D. McCoy,
B. Kantor, D. Anderson, H. Shacham, and S. Savage. Experimental security analysis of
a modern automobile. In Security and Privacy (SP), 2010 IEEE Symposium on, pages
447–462, May 2010.
[12] D. Kushner. The real story of Stuxnet. Spectrum, IEEE, 50(3):48–53, March 2013.
[13] E. A. Lee. Computing foundations and practice for cyber-physical systems: A prelim-
inary report. Technical Report UCB/EECS-2007-72, EECS Department, University of
California, Berkeley, May 2007.
[14] L. Lerner. Trustworthy Embedded Computing for Cyber-Physical Control. PhD thesis,
Virginia Tech, Bradley Department of Electrical and Computer Engineering, Blacks-
burg, VA, Jan 2015.
[15] L. Lerner, Z. Franklin, W. Baumann, and C. Patterson. Application-level autonomic
hardware to predict and preempt software attacks on industrial control systems. In
Dependable Systems and Networks (DSN), 2014 44th Annual IEEE/IFIP International
Conference on, pages 136–147, June 2014.
[16] L. W. Lerner, M. M. Farag, and C. D. Patterson. Run-time prediction and preemption
of configuration attacks on embedded process controllers. In Proceedings of the First
International Conference on Security of Internet of Things, SecurIT ’12, pages 135–144,
New York, NY, USA, 2012. ACM.
[17] L. W. Lerner, Z. R. Franklin, W. T. Baumann, and C. D. Patterson. Using high-
level synthesis and formal analysis to predict and preempt attacks on industrial control
systems. In Proceedings of the 2014 ACM/SIGDA International Symposium on Field-
programmable Gate Arrays, FPGA ’14, pages 209–212, New York, NY, USA, 2014.
ACM.
[18] X. Liao, J. Zhou, and X. Liu. Exploring AMBA AXI on-chip interconnection for
TSV-based 3D SoCs. In 3D Systems Integration Conference (3DIC), 2011 IEEE
International, pages 1–4, Jan 2012.
[19] Y. Mo and B. Sinopoli. Secure control against replay attacks. In Communication,
Control, and Computing, 2009. Allerton 2009. 47th Annual Allerton Conference on,
pages 911–918, Sept 2009.
[20] T. H. Morris and W. Gao. Industrial control system cyber attacks. Proceedings of
the 1st International Symposium for ICS & SCADA Cyber Security Research, page 22,
2013.
[21] B. Obama. Executive order – improving critical infrastructure cybersecurity. The White
House, 2013.
[22] M. Roman, E. Bobasu, and D. Sendrescu. Modelling of the rotary inverted pendu-
lum system. In Automation, Quality and Testing, Robotics, 2008. AQTR 2008. IEEE
International Conference on, volume 2, pages 141–146, May 2008.
[23] L. Sha. Using simplicity to control complexity. Software, IEEE, 18(4):20–28, Jul 2001.
[24] A. Teixeira, D. Perez, H. Sandberg, and K. H. Johansson. Attack models and scenarios
for networked control systems. In Proceedings of the 1st International Conference on
High Confidence Networked Systems, HiCoNS ’12, pages 55–64, New York, NY, USA,
2012. ACM.
[25] A. Teixeira, I. Shames, H. Sandberg, and K. Johansson. Revealing stealthy attacks
in control systems. In Communication, Control, and Computing (Allerton), 2012 50th
Annual Allerton Conference on, pages 1806–1813, Oct 2012.
[26] T. D. Shingare and R. T. Patil. SPI implementation on FPGA. International Journal of
Innovative Technology and Exploring Engineering (IJITEE), 2(2):7–9, Jan 2013.
[27] Trusted Computing Group, Incorporated. TPM Main Specification Level 2 Version 1.2,
Revision 116 Part 1 Design Principles, Mar 2011.
[28] Xilinx. MicroBlaze Processor Reference Guide, Apr 2014.
[29] Xilinx. Vivado Design Suite - AXI Reference, Nov 2014.
[30] Xilinx. Zynq-7000 All Programmable SoC Technical Reference Manual, Feb 2015.
[31] L. Yongfu, S. Dihua, L. Weining, and Z. Xuebo. A service-oriented architecture for
the transportation cyber-physical systems. In Control Conference (CCC), 2012 31st
Chinese, pages 7674–7678, July 2012.
[32] M. Zeller. Myth or reality–does the Aurora vulnerability pose a risk to my generator?
In Protective Relay Engineers, 2011 64th Annual Conference for, pages 130–136, April
2011.