A Trusted Autonomic Architecture to Safeguard Cyber-Physical Control Leaf Nodes and Protect Process Integrity
Nayana Teja Chiluvuri
Thesis submitted to the Faculty of the
Virginia Polytechnic Institute and State University
in partial fulfillment of the requirements for the degree of
Master of Science
in
Computer Engineering
Cameron D. Patterson, Chair
William T. Baumann
Thomas L. Martin
July 17, 2015
Blacksburg, Virginia
Keywords: Process control systems, cyber-physical systems, autonomic systems,
programmable logic controller, remote terminal unit, human-machine interface, FPGA,
trust, configurable system-on-chip, heterogeneous computing, high-level synthesis
Copyright 2015, Nayana Teja Chiluvuri
A Trusted Autonomic Architecture to Safeguard Cyber-Physical Control
Leaf Nodes and Protect Process Integrity
Nayana Teja Chiluvuri
ABSTRACT
Cyber-physical systems are networked through IT infrastructure and susceptible to malware.
Threats targeting process control are much more safety-critical than those against traditional
computing systems since they jeopardize the integrity of physical infrastructure. Existing
defense mechanisms address security at the network nodes but do not protect the physical
infrastructure if network integrity is compromised. An interface guardian architecture is
implemented on cyber-physical control leaf nodes to maintain process integrity by enforcing
high-level safety and stability policies.
Preemptive detection schemes are implemented to monitor process behavior and anticipate
malicious activity before process safety and stability are compromised. Autonomic properties
are employed to automatically protect process integrity by initiating switch-over to a verified
backup controller. Subsystems adhere to strict trust requirements that safeguard them
from adversarial intrusion. The preemptive detection schemes, switch-over logic, backup
controller, and process communication are all trusted components that are separated from
the untrusted production controller.
The proposed architecture is applied to a rotary inverted pendulum experiment and implemented
on a Xilinx Zynq-7000 configurable SoC. The leaf node implementation is integrated
into a cyber-physical control topology. Simulated attack scenarios show strengthened
resilience to both network integrity and reconfiguration attacks. Threats attempting to disrupt
process behavior are successfully thwarted by having a backup controller maintain process
stability. The system ensures both safety and liveness properties even under adversarial
conditions.
Dedication
I would like to dedicate my thesis to my great grandfather, Dr. Ranganadha Raju Mudundi.
A witness of Mahatma Gandhi, he retired after 60 years of service as a medical doctor in
Bhimavaram, India. He is respected for his generosity towards the poor and his equal
treatment of patients in a time when poverty and caste were prevalent social issues in India.
He continues to have an open outlook on life and society which empowers and motivates me
three generations down.
Dr. Ranganadha Raju pursued higher education in a time when opportunities and encouragement
were lacking, and his service as a doctor exemplifies all aspects of the Hippocratic Oath
to which he swore. He leads a life of utmost humbleness and simplicity, qualities I aspire
towards. His lifelong dedication to his profession, affection towards people, attitude towards
human welfare, importance to education, and simple yet idealistic lifestyle are inspirational.
In my life, I hope to look back with as much achievement and satisfaction as he does.
Acknowledgments
I greatly appreciate the academic and research guidance of my adviser, Dr. Cameron
Patterson, who initially gave me the opportunity to work on this project. His helpful feed-
back and support were valuable to the completion of my thesis. I would like to thank
Dr. William Baumann and Dr. Thomas Martin for participating on my academic advisory
committee and inspiring my work. This study was made successful by the support of my
colleagues: Omkar Harshe, Vivek Gopal, Christopher McCarty, and Pallavi Deshmukh.
I am extremely thankful to my parents and family for first introducing me to engineering
and supporting me in pursuing my passion throughout my academic career. Their affection
continues to motivate me to excel at all aspects of life. I am grateful for the continuous
technical mentorship of Dr. Nitin Patil and Deepak Patil who first recognized my passion
for electronics and gave me a summer internship in eighth grade. The experience I obtained
from working at their company has been invaluable and their entrepreneurial spirit continues
to stimulate my passion to this day.
I would also like to thank all of the friends, classmates, and roommates that I have
accumulated as an undergraduate and graduate student at Virginia Tech. They have made my
life in Blacksburg enjoyable and provided me with unforgettable memories and relationships
that I will cherish forever.
This material is based upon work supported by the National Science Foundation under
Grant Number CNS-1222656. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reflect the views
of the National Science Foundation.
Zedboards and design tools were donated by Xilinx, Inc.
Contents
1 Introduction
  1.1 Cyber-Physical Control
  1.2 Contributions
  1.3 Thesis Organization
2 Background
  2.1 CPS Security Vulnerabilities
    2.1.1 Network Integrity Attack Space
    2.1.2 Reconfiguration Attack Space
  2.2 Autonomic Systems
3 TAIGA Overview
  3.1 Autonomic Requirements
  3.2 Control Strategy
    3.2.1 Controllers
    3.2.2 Guards
    3.2.3 Trigger Mechanism
  3.3 Trust Requirements
  3.4 Architecture
    3.4.1 Isolation of Trust
    3.4.2 Inter-Module Communication
    3.4.3 TAIGA Transparency
4 Rotary Inverted Pendulum
  4.1 Process Control Telemetry
  4.2 Control Algorithm
  4.3 Pendulum Guards
  4.4 Trigger Mechanisms
    4.4.1 Trivial
    4.4.2 Linear Online Prediction
    4.4.3 Neural Network Classification
5 TAIGA Implementation
  5.1 Target Platform
    5.1.1 Processing System
    5.1.2 Programmable Logic
  5.2 Production Controller
    5.2.1 FreeRTOS
    5.2.2 Linux
  5.3 Backup Controller
  5.4 Queues
    5.4.1 Inter-Module Communication Protocol
  5.5 I/O Intermediary
    5.5.1 Robust Process Control
    5.5.2 Supervisory Control and Process Monitor
    5.5.3 Trigger Mechanism
    5.5.4 Watch Dog Timer
  5.6 Controller Multiplexer
6 Integration of TAIGA in Cyber-Physical Control
  6.1 Remote Terminal Unit
    6.1.1 Remote Monitoring and Control Server
  6.2 Human-Machine Interface GUI
    6.2.1 Monitoring
    6.2.2 Control
  6.3 Remote Surveillance
7 Results
  7.1 Resilience to Simulated Attack Scenarios
    7.1.1 Denial-of-Service Attack
    7.1.2 Set-Point Attack
    7.1.3 Deception Attack
  7.2 Execution Time and Control Latency
  7.3 Resource Utilization
8 Conclusions
  8.1 Scope of TAIGA
  8.2 Future Work
Bibliography
List of Figures
1.1 Elements of CPSes and the relationship between them.
2.1 Abstracted cyber-physical control components containing supervisory and plant control loops.
2.2 Three-dimensional network integrity attack space for cyber-physical control.
2.3 Hierarchical topology of a DCS or SCADA system.
3.1 The control modules associated with TAIGA.
3.2 Output-feedback control loop with state estimation.
3.3 Black box view of CPS leaf nodes with TAIGA.
3.4 TAIGA's realization on a configurable SoC.
4.1 Photograph of the Quanser rotary inverted pendulum setup.
4.2 Interface for telemetry between the regulatory control and infrastructure layers of the Quanser RIP experiment.
4.3 SPI data transfer between master and slave controllers.
4.4 Inverted pendulum setup and sign conventions for θ and α.
5.1 The hardware implementation of TAIGA on a Zynq-7000 configurable SoC.
5.2 Internal layout of the Zynq processing system.
5.3 FPGA architecture and fabric composition.
5.4 Execution trace resembling FreeRTOS scheduling of concurrent tasks.
5.5 Software implementation of the FIFO interrupt handler.
5.6 Idle loop and WDT interrupt service routine of the IOI.
6.1 Integration of the TAIGA leaf node into a cyber-physical control topology.
6.2 GUI for remote monitoring and control of the RIP.
6.3 Live camera feed of RIP for remote surveillance.
7.1 Digital oscilloscope capture of plant response to a simulated DoS attack at time T_DoS = 30 seconds.
7.2 Plant response to a simulated supervisory attack at time T_attack = 60 seconds.
7.3 Digital oscilloscope capture of servo voltage saturation at actuation limits, ±10 volts, during voltage sweep of ±15 volts.
7.4 Post-implementation Zynq FPGA resource usage with and without TAIGA.
List of Tables
4.1 SPI ICs in Quanser pendulum interface board.
4.2 Variables in the state vector of the inverted pendulum control experiment.
4.3 Safety-critical and operational guard definitions for α and θ.
5.1 Specifications for ZYBO Zynq-7000 development board.
5.2 Transmit and receive data channel signals of the interface between the AXI-Stream FIFO and FIFO generator blocks.
5.3 FIFO queue packet structure. The portions shaded gray are relevant only for a PLANT command.
5.4 Syntax of an input operational set-point command on the supervisory UART bus.
5.5 Packet composition of process data transmission on UART from IOI.
5.6 Flag transmitted on the UART bus to report TAIGA's trigger state.
5.7 Signals of the controller queue multiplexer allocated to their respective masters.
7.1 Execution time of RIP control sequence.
7.2 Execution time of IOI idle loop.
7.3 Estimated slack for standalone production and TAIGA implementations on Zynq.
7.4 Estimated power consumption for standalone production and TAIGA implementations on Zynq.
Acronyms
ADC Analog to Digital Converter
AMBA Advanced Microcontroller Bus Architecture
AMP Asymmetric multiprocessing
APU Application processor unit
AR Autonomic requirement
AXI Advanced eXtensible Interface
BRAM Block RAM
BSP Board support package
BUFG Global buffer
CLB Configurable logic block
CPS Cyber-physical system
DAC Digital to Analog Converter
DCS Distributed control system
DMA Direct memory access
DoS Denial-of-service
DSP Digital signal processing
EMIO Extended multiplexed I/O
FF Flip-flop
FIFO First-in first-out
FPGA Field-programmable gate array
FPU Floating point unit
GPIO General purpose I/O
GUI Graphical user interface
HDL Hardware description language
HLS High-level synthesis
HMI Human-machine interface
IC Integrated circuit
ICS Industrial control system
IOB I/O block
IOI I/O intermediary
IP Internet Protocol
IT Information technology
JTAG Joint Test Action Group
LMB Local-memory bus
LQG Linear-quadratic-Gaussian
LUT Lookup table
MIO Multiplexed I/O
MLP Multilayer perceptron
MMU Memory management unit
MSR Machine status register
OS Operating system
OSI Open systems interconnection
PCS Process control system
PL Programmable logic
PLC Programmable logic controller
PLL Phase-locked loop
PMU Phasor monitoring unit
PS Processing system
RIP Rotary inverted pendulum
RTOS Real-time operating system
RTU Remote terminal unit
SACIB Sensory and control interface board
SCADA Supervisory control and data acquisition
SoC System-on-chip
SPI Serial peripheral interface
TAIGA Trustworthy Autonomic Interface Guardian Architecture
TCP Transmission Control Protocol
TPM Trusted Platform Module
TR Trust requirement
UART Universal asynchronous receiver/transmitter
UDP User Datagram Protocol
USB Universal Serial Bus
WDT Watch dog timer
Chapter 1
Introduction
The 19th century industrial revolution sparked a transition in manufacturing processes from
manual labor to machines. Today, machines have become pervasive in society and industrial
applications. Modern-day industrial control systems (ICSes) have evolved with the progress
in technology and are heavily automated to reduce human labor and significantly optimize
performance. The information revolution gave rise to sophisticated electronics and the
Internet, which further increase the capability of machines. Current cars, consumer devices, and
even process control systems (PCSes) are networked through information technology (IT)
infrastructure to allow for reliable, timely, and unbounded exchange of information between
humans and machines.
1.1 Cyber-Physical Control
Cyber-physical systems (CPSes) are defined as large-scale heterogeneous systems that
encapsulate PCSes, are networked through IT infrastructure, and contain control loops for
governing a physical process [13]. The processes involved in CPSes vary in application and
can range from nuclear fission to home automation. The interaction with physical
infrastructure makes PCSes safety-critical for certain applications. The IT infrastructure for CPS
process telemetry introduces a diverse set of security implications. Telemetry between the
IT infrastructure and PCSes warrants stringent security measures, arguably more critical
than those of traditional cyber security.
Figure 1.1: Elements of CPSes and the relationship between them.
CPSes contain three functional elements—communication, computation, and control—as
shown in Figure 1.1 [31]. Embedded controllers arbitrate the interaction between these
three elements in CPSes. Existing security measures typically focus on addressing intrusion
through the communication channels of the system by monitoring the cyber and systems
relationships between the communication and computation elements, and between the
communication and control elements, as depicted in Figure 1.1. Violation of network channel
integrity allows the
embedded controllers in CPSes to be compromised with latent malware.
CPS leaf nodes enable computational and control element interaction. Typically, these nodes
contain the control loops that govern the physical process and present a higher risk to process
safety. In Figure 1.1, the computation and control elements are responsible for cyber-physical
control as they interact with cyber elements externally and internally actuate the physical
process.
Violating the integrity of the embedded cyber-physical controllers in CPS leaf nodes can
cause internal threats to jeopardize the safety of the physical process. As a result, security
must be addressed not only on the cyber relationships, but also on the physical relationships of
CPSes. Anomaly detection and network monitoring enable external threats to be identified
but do not address internal threats from latent malware that can be introduced
through the network. Indirect statistical detection methods may yield false positives or
negatives.
Applications of CPSes such as ICSes require system availability to ensure safety- and mission-
critical services even in the presence of attacks. This thesis presents an implementation of
a Trustworthy Autonomic Interface Guardian Architecture (TAIGA) that maintains safety
and liveness properties in the presence of internal and external threats.
1.2 Contributions
Aspects of autonomic systems are implemented in TAIGA for preemptive detection of
malicious process behavior to ensure physical process safety and stability. TAIGA is implemented
at the leaf nodes of CPSes in order to safeguard interaction with physical processes. The
supervisory telemetry channels are assumed to be untrusted in order to address the network
integrity attack space. TAIGA does not trust the production controller in order to safeguard
the system from malicious reconfiguration.
TAIGA is successfully realized on a configurable system-on-chip (SoC) platform for a rotary
inverted pendulum application. Heterogeneous computing is leveraged with multiple isolated
processors operating on asynchronous clocks to execute independent functions in parallel.
TAIGA is integrated into a system resembling conventional cyber-physical control. Simulated
attack scenarios show added resilience to network integrity and reconfiguration attacks.
1.3 Thesis Organization
The thesis initially provides relevant background information establishing the motivation
and research goals in Chapter 2. Chapter 3 provides an overview of TAIGA including its
various components and properties. The rotary inverted pendulum application is introduced
in Chapter 4 along with the control algorithm used in the implementation. Chapter 5 focuses
on the implementation of all of the elements within TAIGA. The integration of the TAIGA
architecture in cyber-physical control is presented in Chapter 6. The TAIGA implementation
is evaluated in Chapter 7. Chapter 8 concludes with what was achieved in the research and
establishes the scope of TAIGA as well as future research directions.
Chapter 2
Background
Cyber-physical controllers govern a physical process based on operator conditions specified
through IT communication networks. To achieve timely information exchange, computation,
and process control, CPSes are organized into multi-layer hierarchies that contain numerous
embedded control nodes with specific functionalities. Distributed control system (DCS) and
supervisory control and data acquisition (SCADA) systems are widespread organizations for
CPSes, especially in industrial control applications.
Figure 2.1: Abstracted cyber-physical control components containing supervisory and plant control loops.
Controllers within these systems are composed of numerous control loops, both external and
internal, for exchanging information and controlling the process. External control loops exist
between the computation elements and the physical process. Internal control loops provide
interaction between the computation and communication elements of CPSes to exchange
process information and operating conditions. Modern cyber-physical control is divided into
four main components as shown in Figure 2.1:
• The plant contains the physical process with sensors for measuring process state and
actuators for altering the process behavior.
• The supervisor is responsible for remotely monitoring process behavior and providing
operational conditions of the plant.
• A programmable logic controller (PLC) is a computational element that interacts with
the plant to sense process behavior and execute the control algorithm for actuation.
• A remote terminal unit (RTU) is a computational and communication element that
facilitates the exchange of information between the supervisor and PLCs.
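The division of labor above lends itself to a compact illustration. The following sketch, which is not from the thesis, shows the sense-compute-actuate scan cycle a PLC runs for one contained control loop; the PID gains, time step, and function names are illustrative assumptions.

```python
def pid_step(error, state, kp=2.0, ki=0.5, kd=0.1, dt=0.01):
    """One step of a textbook PID law; `state` carries the integral and
    previous error between scan cycles."""
    integral, prev_error = state
    integral += error * dt
    derivative = (error - prev_error) / dt
    output = kp * error + ki * integral + kd * derivative
    return output, (integral, error)

def scan_cycle(read_sensor, write_actuator, set_point, state):
    """One PLC scan: sense the process, compute actuation, act on it."""
    measurement = read_sensor()        # infrastructure layer: sensor
    error = set_point - measurement    # supervisor-provided set-point
    actuation, state = pid_step(error, state)
    write_actuator(actuation)          # infrastructure layer: actuator
    return state
```

In a real PLC this cycle repeats at a fixed period, with the RTU relaying set-point changes from the supervisor between scans.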
Communication between the entities illustrated in Figure 2.1 varies in both complexity and
locality. Typically, a PLC governs a contained individual process control loop, which may
only be a sub-process within the entire plant. Telemetry between the RTU and PLC is local
with respect to the plant. In some ICS organizations, the RTU aggregates information from
multiple PLC nodes and facilitates data exchange between these nodes for control loops
containing inter-node dependencies. Supervisory interaction with the RTU is more often
than not networked with IT infrastructure, which allows remote access through the Internet.
2.1 CPS Security Vulnerabilities
The complex cyber-physical control structure and interaction with unregulated or susceptible
IT infrastructure introduce CPS security vulnerabilities. Cyber threats targeting critical
processes can damage physical infrastructure, have economic repercussions, and even
endanger human life. Many existing CPSes are not designed with security in mind and are deficient
in even trivial protections. Adversaries attempt to exploit the lack of sufficient security
measures or vulnerabilities in the IT infrastructure to gain access to ICSes and cause damaging
behavior. CPS attacks can be separated into network integrity attacks, which exploit IT
network vulnerabilities, and reconfiguration attacks, which maliciously reconfigure embedded
platforms.
2.1.1 Network Integrity Attack Space
PCSes are adaptive in nature and typically require interaction with operators. The perpetual
need for remote actuation and monitoring necessitates constant networked process telemetry.
IT infrastructure is adopted by ICSes for efficient and timely flow of information between the
operators and the process, and is customarily used in DCS and SCADA systems. Both DCS and
SCADA systems contain IT networks to interface the supervisory layer, which contains the
operators, to the control layer, which contains the PCSes. Network integrity violations allow
adversarial communication with the PCS. Attacks on the network integrity are represented
in the literature with a three-dimensional attack space as illustrated in Figure 2.2 [24].
Figure 2.2: Three-dimensional network integrity attack space for cyber-physical control.
The attack space maps network integrity attacks based on the types of resources available
to the adversary:
• Disclosure resources allow visibility of confidential sensor and actuator data to the
adversary.
• Disruption resources enable physical intervention and control of the system by the
adversary through transmission of sensor and controller data.
• System knowledge about the controller and the behavior of the physical process increases
the overall stealthiness of an attack.
The primary goal of an attacker is to disrupt the behavior of an autonomous process without
the CPS detecting or responding to the attack. With more resources available, an adversary
is not only able to execute a more effective attack, but also able to better conceal the attack.
Detection schemes for network integrity attacks mirror the traditional detection schemes of
IT cyber attacks. Both detection and response schemes in this realm are heavily researched
and implemented in CPSes [4].
With the availability of disruption resources, an adversary is able to execute a denial-of-service
(DoS) attack. As is the case with IT systems, CPS DoS attacks initially target the
communication interfaces and deactivate supervisory monitoring and interaction with the
PCSes. Once the adversary penetrates the CPS, other system functions or physical processes
are disabled [20]. Typically, DoS attacks are detected by the supervisor due to the lack of
communication or response from the attacked nodes in the system.
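This detection-by-silence pattern can be sketched as a heartbeat timeout check on the supervisor; the class, node identifiers, and two-second timeout below are illustrative assumptions rather than anything specified in the thesis.

```python
import time

class HeartbeatMonitor:
    """Flags nodes as potentially under DoS attack when no telemetry
    has arrived within `timeout` seconds (illustrative sketch)."""

    def __init__(self, timeout=2.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock        # injectable clock, eases testing
        self.last_seen = {}

    def record_packet(self, node_id):
        """Call whenever any telemetry arrives from a node."""
        self.last_seen[node_id] = self.clock()

    def unresponsive_nodes(self):
        """Nodes whose silence has exceeded the timeout."""
        now = self.clock()
        return [node for node, seen in self.last_seen.items()
                if now - seen > self.timeout]
```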
Disclosure resources coupled with disruption resources enable an adversary to disguise
intrusion into a CPS. In a replay attack on the network integrity channels, an adversary
hijacks the sensor and actuator values and re-transmits them to the supervisory system
while executing a disruptive attack on the plant. The supervisor responds to past data that
does not reflect the potentially dangerous behavior of the plant's current state. Detection
schemes for such an attack look for anomalies in the system operation or network channels
using methods such as a χ² detector [19].
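A windowed χ² detector of the kind cited here can be sketched as follows; the window size, noise variance, and threshold are illustrative assumptions (the threshold would normally be taken from the χ² distribution for a chosen false-alarm rate), not values from the thesis or [19].

```python
from collections import deque

class ChiSquaredDetector:
    """Sums variance-normalized squared innovations (measured minus
    predicted output) over a sliding window and alarms when the sum
    exceeds a threshold. Replayed or falsified telemetry that diverges
    from the model's prediction inflates the statistic."""

    def __init__(self, noise_var, window=10, threshold=18.3):
        # 18.3 is roughly the 95th percentile of a chi-squared
        # distribution with 10 degrees of freedom (illustrative).
        self.noise_var = noise_var
        self.threshold = threshold
        self.residuals = deque(maxlen=window)

    def update(self, measured, predicted):
        """Record one normalized squared residual; True means alarm."""
        self.residuals.append((measured - predicted) ** 2 / self.noise_var)
        return sum(self.residuals) > self.threshold
```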
Availability of system knowledge enables an adversary to engineer more stealthy attacks such
as a zero-dynamics or bias-injection attack [24][25]. These covert attacks attempt to evade
detection mechanisms whose details are disclosed to the adversary through significant system
knowledge. Vulnerabilities within the detection schemes themselves are exploited and the
system is pushed to its limits [4]. A well-designed covert attack adapts to the behavior of the
process and responds to the security safeguards that are implemented.
Although physical process integrity is not jeopardized, the presence of only disclosure
resources allows an adversary to eavesdrop and collect confidential process data. Plant response
and control parameters are extracted from the collected data by offline machine learning
algorithms. For example, data mining and decision trees are employed in phasor monitoring
units (PMUs) to compute real-time state estimation and provide critical feedback for power
plant operators [3]. Similar tactics on collected process data can be used to model plant
behavior, allowing an adversary to learn about the system and craft more covert attacks.
Aurora Vulnerability Case Study
In 2007, a cyber-physical network integrity attack, the Aurora vulnerability, was
experimentally demonstrated by Idaho National Laboratory. Adversarial red-team access to a US
Department of Energy diesel generator was gained by exploiting the lack of security in
network communication protocols such as Modbus. Modbus is pervasive in existing electrical
grid equipment and does not support even simple security methods such as authentication.
Once access is gained, the attack opens and closes circuit breakers out of synchronization with
the grid, causing elevated electrical and mechanical stress on the generator. Typical circuit
breakers contain synchronism checks that prevent out-of-synchronism closing. Aurora takes
advantage of the time delay between recognizing the out-of-synchronism relay closing and
the protective response to execute the attack and cause irreparable physical damage [32].
A dramatic video of the attack shows smoke coming out of the generator and severe damage
to the generator itself [32]. Continued execution of the attack has the potential to cause a
generator explosion. The distributed nature of electrical power grids causes even deeper
concern, since destruction of a single generator can have a cascading effect on additional nodes
in the power system, resulting in widespread disruption.
2.1.2 Reconfiguration Attack Space
CPSes contain various computing system layers organized in hierarchical schemes specific to
the application. Industrial control applications conventionally use DCS or SCADA systems
which contain a defined set of subsystems and protocols for communication between the
systems. The topology of a typical DCS or SCADA system is represented in Figure 2.3.
Such a system has three layers of control and computation:
• The infrastructure layer contains sensors, actuators, and the physical processes.
• The regulatory control layer includes the embedded controllers that automatically govern
the plant through interaction with the sensors and actuators of the infrastructure
layer via control loops between the two layers.
• The supervisory control layer allows operators to send control commands to the regulatory
controllers as well as monitor the system with human-machine interfaces (HMIs).
Figure 2.3: Hierarchical topology of a DCS or SCADA system.
Components in the regulatory control layer — RTUs and PLCs — are customarily embedded
systems that interact with the supervisor through IT network telemetry. A similar IT
network channel exists for embedded platform reconfiguration intended for firmware updates
and performance optimization. The IT network telemetry with these regulatory controllers
makes them susceptible to malicious reconfiguration that not only violates the integrity of the
network channel, but also threatens the integrity of the physical process, since the regulatory
layer interacts directly with components of the physical infrastructure.
Security measures to strengthen network integrity are implemented within the regulatory
controllers. Malicious reconfiguration may revoke the security barriers for network integrity
attacks in addition to disrupting controller operation. As a result, the reconfiguration attack
space subsumes the network integrity attack space and is much more critical. The ability to
adversely reconfigure controllers in the regulatory layer allows an attacker to not only evade
detection and response schemes, but also hijack the controller and alter its overall function.
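A minimal countermeasure at the reconfiguration interface can be sketched as cryptographic verification of firmware images before an update is accepted. The HMAC scheme and key handling below are illustrative assumptions, not the approach taken in this thesis, which instead isolates and distrusts the reconfigurable production controller; a real deployment would typically use asymmetric signatures so the device never holds a signing key.

```python
import hashlib
import hmac

def tag_firmware(image: bytes, key: bytes) -> bytes:
    """Compute the HMAC-SHA256 tag an authorized builder would ship
    alongside a firmware image."""
    return hmac.new(key, image, hashlib.sha256).digest()

def accept_update(image: bytes, tag: bytes, key: bytes) -> bool:
    """Accept a reconfiguration request only if the image's tag
    verifies; compare_digest avoids timing side channels."""
    return hmac.compare_digest(tag_firmware(image, key), tag)
```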
Stuxnet Case Study
A CPS reconfiguration attack is the most critical and covert. However, gaining system
reconfiguration capabilities more often than not requires gaining network access. Typically,
reconfiguration attacks are a result of multiple cumulative network integrity attacks that
advance an adversary’s intrusion into the system until reconfiguration resources are available.
Stuxnet is a prime example of such an attack.
The Stuxnet worm is perhaps the most sophisticated in the cyber-physical attack realm.
Originally designed to target Iranian nuclear facilities, the worm infected Siemens PLCs.
The virus exploits vulnerabilities in the Microsoft Windows operating system and initially
propagates through the IT network of the nuclear fuel enrichment facility, exploiting four
different zero-day flaws to gain elevated adversarial privileges [12]. The worm ultimately
infects Siemens software and reconfigures the PLCs controlling nuclear centrifuges. Once
reconfigured, all safety precautions, detection schemes, and security barriers are disabled [4].
The virus caused the uranium enrichment centrifuges to tear themselves apart [12].
Reportedly, Stuxnet destroyed up to one fifth of Iran’s enrichment centrifuges. As nuclear
facilities are considered one of the most safety critical infrastructures, the attack raises signif-
icant concern regarding the protection of CPSes. The architectural complexity, robustness,
and sheer brilliance of the Stuxnet worm motivate cyber-physical security professionals and
researchers to this day [12].
The network integrity and reconfiguration attack spaces raise significant concern about pro-
tecting critical infrastructure. The Aurora vulnerability and Stuxnet are just two examples
of attacks that have been successfully executed and showed the physical destruction that can
result from cyber attacks. The Stuxnet virus is classified by some professionals as a nation-
state attack. Such attacks and threats have raised significant national security concerns.
On February 12, 2013, United States President Barack Obama issued an executive order
titled “Improving Critical Infrastructure Cybersecurity” [21]. The order addresses the con-
cerns of cyber-physical attacks and states “national and economic security of the United
States depends on the reliable functioning of the Nation’s critical infrastructure in the face
of such threats” [21]. An implicit call for action is stated in this executive order to ad-
dress the cyber-physical threats that could jeopardize critical American infrastructure. Such
attacks are classified as a modern form of warfare that threatens targets on American soil.
2.2 Autonomic Systems
Traditionally, the cybersecurity model has been based on thwarting known attacks. Typi-
cal anti-virus software for IT infrastructure maintains databases of discovered viruses and
searches systems for footprints of such viruses. While firewalls attempt to keep intruders out,
there are always new exploits that make systems vulnerable. Cybersecurity engineers have a
never ending list of zero-day exploits that must be addressed within their systems. A similar
model is present in modern day CPSes where detection schemes and response methods are
implemented once exploits are discovered. While attacks on conventional IT infrastructure
may disrupt the flow of information or restrict access to confidential information, attacks
on CPSes can cause irreparable physical damage. Hence a reactive approach to CPS attacks
is undesirable.
Autonomic computing, first described in a 2001 IBM manifesto, relies on self-managing
resources that adapt to unforeseen circumstances without user intervention [10].
These systems are inspired by the human body’s autonomic nervous system which controls
bodily functions without conscious intervention. Such systems make decisions independently
using high-level policies. According to IBM, an autonomic system contains four properties
for self-management [10]:
• Self-configuring: High-level policies arbitrate automated and seamless configuration
of components in a system.
• Self-optimizing: Resources are monitored and automatic control ensures the most
efficient functional operation of the system.
• Self-healing: Software and hardware faults and abnormalities are automatically de-
tected and rectified.
• Self-protecting: The system proactively anticipates threats that jeopardize the func-
tional operation of the system and defends against system failure.
Typically, autonomic system properties are realized with a large set of closed control loops
that monitor a specific hardware or software resource and ensure the system maintains the
relevant parameters within a specified range. With respect to cybersecurity, the self-healing
and self-protecting properties of autonomic systems are the most relevant in protecting a
system from attacks while responding and recovering from malicious intrusion. Similar to
how the human body combats viruses and infections, an ideal autonomic system will protect
itself from adversarial intrusion that threatens the health of the system.
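The closed control loops described above can be sketched as a single monitor-and-act cycle. The policy bounds and action names below are illustrative placeholders, not taken from any particular autonomic framework:

```python
# One autonomic control loop: sense a resource, compare it against a
# high-level policy, and act without operator intervention.

POLICY = {"low": 20.0, "high": 80.0}  # hypothetical high-level policy bounds

def monitor_and_act(reading, policy):
    """Return the corrective action for one loop iteration."""
    if reading < policy["low"]:
        return "increase"   # self-optimizing: push the resource back up
    if reading > policy["high"]:
        return "throttle"   # self-protecting: defend against overload
    return "none"           # within policy: no intervention needed
```

A full autonomic system would run many such loops concurrently, one per monitored hardware or software resource.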
Chapter 3
TAIGA Overview
The primary objective of TAIGA is to preemptively detect malicious process behavior and
maintain plant safety and stability. As a result, TAIGA contains methods not only for
detecting attacks, but also for responding to and recovering from them.
Prior work presents the use of formally verified, application-specific hardware for monitoring
system operation in real-time at the lowest I/O pin level [15]. Leveraging run-time prediction
allows for forecasting the behavior of the plant and detecting a reconfiguration attack [16].
Initially, plant I/O and trusted components of the predictive architecture are implemented
in configurable hardware using high-level synthesis (HLS) [17].
TAIGA is introduced as an intermediary for controller I/O and isolates untrusted components
prone to malicious configuration by harnessing the advantages of a commercially available
configurable SoC [7]. TAIGA has evolved to not only isolate trust, but also provide a
generalized autonomic architecture that monitors plant behavior and supervisory commands
to preemptively respond to reconfiguration or network integrity attacks before they disrupt
process behavior [5]. TAIGA recognizes that an attack might have manifested itself within an
embedded platform and safeguards the system as a last line of defence before the attack
jeopardizes plant safety and stability.
3.1 Autonomic Requirements
Autonomic enforcement of high-level policies can help ensure plant safety and stability. The
self-protecting and self-healing nature of these systems enables adaptive defense mechanisms
against threats jeopardizing the operation of physical processes by not only detecting attacks
but also responding to and recovering from them.
In order to effectively govern the properties of autonomic computing, the system must be
implemented to exhibit the following characteristics and abilities which are formally defined
as the autonomic requirements (ARs):
AR1 Awareness is the ability to sense the operational parameters of the system that are
bounded with high-level policies.
AR2 Adaptive systems contain resources that can be functionally and operationally re-
configured based on the spatial and temporal context.
AR3 Automatic systems are self-contained in that the monitoring and reconfiguration is
initiated without any manual intervention.
3.2 Control Strategy
TAIGA incorporates three distinct control modules: the production controller, backup con-
troller, and trigger mechanism. These elements, illustrated in Figure 3.1, are isolated and
run concurrently within TAIGA to allow malicious process operation to be detected and
control of the plant to be regained automatically without violating operational limits. The
operational thresholds are defined as guards for the plant and are used to differentiate expected
and malicious operation.
Initially, the process is governed by the production controller as shown in Figure 3.1. The
trigger mechanism tracks the plant’s behavior in real-time by probing the telemetry between
the controller and plant. If the plant behavior is anticipated to violate specified guards, the
system switches to a trusted backup controller that provides stability and is not susceptible
to reconfiguration. The ability to automatically recognize malicious behavior and initiate
switch-over to reliable operation satisfies AR3.
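The switch-over decision just described can be summarized in a few lines. The `guard_ok` predicate and the latching behavior here are sketched assumptions; the actual trigger logic is detailed later in this chapter:

```python
def select_controller(forecast_states, guard_ok, using_backup):
    """Choose the controller for the next cycle. Once the backup is in
    charge it stays in charge: the untrusted production controller is
    never automatically restored."""
    if using_backup:
        return "backup"
    for x in forecast_states:
        if not guard_ok(x):     # anticipated guard violation
            return "backup"     # trigger asserted: switch over
    return "production"
```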
Figure 3.1: The control modules associated with TAIGA.
3.2.1 Controllers
The objective of a PCS is to maintain the physical process as close to the desired operating
conditions as possible. In industrial control applications, a feedback loop is typically used to
determine the discrepancy between the actual plant state and the desired state. A control
algorithm generates an actuating signal to compensate for this variation. Adaptive con-
trol algorithms adjust to the process behavior by altering control and actuator parameters
dynamically based on the feedback response of the plant. While adaptive controllers are
occasionally used in ICSes, they are much less common and contain a high degree of uncer-
tainty within the control parameters; TAIGA only considers conventional output-feedback
control loops.
An output-feedback controller, as depicted in Figure 3.2, is pervasive within ICS applications.
More often than not, the control algorithm requires a wider set of process states than those
that are provided through the sensors within the infrastructure layer. The state-estimation in
the feedback loop determines the larger set of states based on the known sensory values and
the expected dynamics of the plant. Figure 3.2 represents the control loop of the production
and backup controller.
Figure 3.2: Output-feedback control loop with state estimation.
Production Controller
The production controller exists within a traditional PLC in the leaf nodes of a CPS. The
need for optimization and controller updates makes this node prone to covert reconfiguration
attacks. The production control algorithm is optimized for performance: the control
parameters of f_C(r_k, x_{k+1}) are tuned aggressively to minimize the discrepancy between r_k
and y_k while optimizing controller response time. This controller is responsible for meeting
the performance and throughput specifications of the plant.
Backup Controller
In contrast, the backup controller is a high-assurance controller that is verified to provide
stable and reliable plant operation. This control algorithm instance trades off performance
for assurance. While the production controller may undergo numerous performance updates,
the backup controller is the initial “factory” controller that is verified to not have any latent
malware or malicious operation. Typically, the various layers of complexities are stripped
down in the implementation of the backup controller to ensure internal threats are not
present. The purpose of the backup controller is not to optimize plant deliverables, but to
preserve the safety and stability of the system.
3.2.2 Guards
The ability to preemptively detect malicious and disruptive process behavior depends on the
accurate identification and definition of guards for the physical process. Guards identify the
operational or safety bounds of the system and are categorized as follows:
• Safety critical guards define the limits at which a system may operate while ensuring
safety. In a conventional PCS, violation of a safety critical guard requires drastic
measures usually involving auxiliary system intervention to bring the system back to
safety.
• Operational guards define the normal limits of system operation. These guards can
be designed to prevent a safety critical guard violation from occurring under normal
operation.
Guards are application-specific and are used to enforce the parameters of an application’s
physical process. Since the primary purpose of TAIGA is to ensure process safety and sta-
bility, the operational guards must bound the system within a region of safe and stable
operation. However, covert attacks can sometimes maintain safe and stable plant operation
but degrade the performance of the plant. Performance guards define the thresholds of per-
formance for the physical process to meet the plant requirements and deliverables. Enforcing
performance guards in addition to the operational guards is suitable for application domains
in which performance degradation of the controller causes detrimental effects to the system
but does not threaten the plant’s safety.
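As a concrete (and entirely hypothetical) example, the two guard categories can be expressed as nested bounds, with the operational guard strictly inside the safety-critical one so that normal operation never reaches the safety limit:

```python
SAFETY_LIMIT = 30.0       # hypothetical safety-critical bound (degrees)
OPERATIONAL_LIMIT = 15.0  # hypothetical operational bound, strictly inside

def safety_ok(angle_deg):
    """Safety-critical guard: violation demands drastic intervention."""
    return abs(angle_deg) <= SAFETY_LIMIT

def operational_ok(angle_deg):
    """Operational guard: designed to trip before safety is ever at risk."""
    return abs(angle_deg) <= OPERATIONAL_LIMIT
```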
3.2.3 Trigger Mechanism
The trigger mechanism is responsible for initiating the switch-over from production to backup
control in the presence of a disruptive attack. More often than not, industrial processes
contain non-linearities and are inherently unstable. Initiating switch-over to the backup
controller once a guard has been violated does not ensure plant safety and stability. As a
result, the trigger mechanism anticipates the future plant behavior and initiates switch-over
to the backup controller if the plant’s trajectory shows disruptive behavior in the future. This
ensures a safe recovery to process stability under the governance of the backup controller.
Monitoring physical process sensors to detect faults and switch-over from a high-performance
controller to a high-assurance controller is considered by Sha [23]. In Sha’s architecture, de-
cision logic ensures that the plant governed by the high-performance controller stays within
the stable envelope of the high-assurance controller. In Sha's scheme, however, the fault is
detected only after the process has deviated from the allowed stable envelope, so the plant's
response during switch-over endangers the recovery of stability [16]. In contrast, TAIGA's
trigger mechanism forecasts the trajectory of the plant based on the current state vector to preemptively
detect a guard violation and effectively maintain the system within the operational bounds
during and after the switch-over process.
The trigger mechanism generates its decision from a state vector representing the plant's
current operation, derived exclusively from physical sensor measurements and state
estimation, which it uses to forecast the future behavior of the plant. The advantage of the
trigger mechanism is its ability to forecast the plant's future tendency and therefore preemptively detect
malicious behavior. In order to accurately and reliably forecast the plant’s tendency within
the TAIGA framework without false positives or negatives, the physical process must satisfy
the following attributes:
• The state vector is derived exclusively from the physical sensor measurements.
• A plant model accurately describes the dynamics of the physical process.
With the first attribute, the trigger mechanism is sufficiently aware of the physical process to
anticipate malicious behavior and satisfy AR1. A plant model is required to foresee disruptive
operating conditions. With an accurate plant model, the plant behavior is forecast using
two possible methods: online prediction and classification.
Prediction
The plant model connected to an instance of the control algorithm is run faster than real time
to estimate the plant’s trajectory in the online prediction method. For the current physical
state of the plant, the prediction method iterates several control cycles into the future to
anticipate the trajectory of the plant under the governance of the backup control algorithm.
If a guard is violated in a future iteration, the trigger mechanism initiates switch-over to the
backup controller.
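The prediction loop can be sketched as follows, where `step` stands in for one simulated control cycle of the plant model under the backup control law (the toy model in the usage note is illustrative, not the pendulum dynamics):

```python
def predict_violation(x0, step, guard_ok, horizon):
    """Run the plant model faster than real time under the backup control
    law; return True if any forecast state violates a guard."""
    x = x0
    for _ in range(horizon):
        x = step(x)             # one simulated control cycle into the future
        if not guard_ok(x):
            return True         # forecast violation: switch over preemptively
    return False
```

For example, with a toy unstable scalar model `step = lambda x: 1.1 * x` and guard `abs(x) <= 10`, a 30-cycle horizon flags an eventual violation that a 5-cycle horizon misses.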
Prediction is computationally intensive and requires a reasonably accurate plant model,
which is not always possible. Linear models may not accurately simulate the plant behavior
while the implementation of complex non-linear models is not feasible on embedded systems
due to computational limitations. As a result, an alternative to online prediction is sought.
Classification
Deciding whether the process will remain in a safe operating region can be considered a
classification problem. Machine-learning methods are used offline with simulated state data
from the plant model to bound the process in a region of safe return for each given state.
The classifier determines whether the system can regain safe operation under the governance
of the backup controller for the current process state vector. The instant the process state
vector is classified as operating in an unsafe region, the classifier-based trigger mechanism
initiates switch-over to the backup controller. The classifier is tuned offline using simulated
data. As a result, computational complexity is not an inhibiting limitation. Complex
non-linear models can also be used for more accurate classification.
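The division of labor between offline training and the cheap online check can be sketched with a deliberately trivial stand-in for the machine-learning step (a real classifier would be trained on simulated multi-dimensional state trajectories):

```python
def train_threshold(samples):
    """Toy stand-in for offline learning: from simulated (state, recovered)
    pairs, keep the largest |state| from which simulation showed the
    backup controller still recovers."""
    recovered = [abs(x) for x, ok in samples if ok]
    return max(recovered) if recovered else 0.0

def in_safe_region(x, threshold):
    """Cheap online check performed once per control cycle."""
    return abs(x) <= threshold
```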
3.3 Trust Requirements
CPS leaf nodes contain the embedded platforms for production control of physical processes.
As illustrated in Figure 3.3a, these platforms are susceptible to both network integrity at-
tacks as well as malicious reconfiguration. Traditionally, the production controller interacts
directly with the physical process. The internal and external threats to this controller jeop-
ardize the safety of the physical process since attacks can cause disruptive plant actuation.
As a result, TAIGA is introduced as an intermediary between the production controller and
the physical process as shown in Figure 3.3b. TAIGA operates on the assumption that the
production controller may have malicious code, and is therefore untrustworthy. The physical
process is protected from disruptive actuation since TAIGA components are not susceptible
to reconfiguration and the supervisory nodes do not have the capabilities to directly actuate
the process. An abstracted black box view of TAIGA is shown in Figure 3.3 containing just
the supervisory and control layers. Typically, there are multiple layers in between the leaf
and supervisory nodes such as RTUs as shown in the CPS topology in Figure 2.3.
(a) Conventional CPS leaf nodes.
(b) Leaf nodes with TAIGA.
Figure 3.3: Black box view of CPS leaf nodes with TAIGA.
The TAIGA black box in Figure 3.3b contains the backup controller, the I/O intermediary
(IOI) module, switch-over logic, and peripherals for communication with the production
controller, supervisor, and physical process. The trigger mechanism used to enforce the
guards on the system is located within the IOI. Trust is essential in the implementation
of these TAIGA modules to ensure robust operation of the system and prevent malicious
intrusion. Formal trust requirements (TRs) for each of the trusted components in TAIGA
are defined by Lerner [14]:
TR1 The source code and implementation for the entire component are analyzed.
TR2 The component uses private hardware resources for computation, internal commu-
nication, and memory, and does not invoke external components as sub-functions.
TR3 All external communication with untrusted components is through hardware-implemented,
bounded, and isolated queues.
TR4 The component cannot be bypassed or disabled, and has a fixed repertoire of essen-
tial services, such as I/O or cryptography.
TR5 Critical functionalities of the component, such as rule checking logic, cannot be
updated without provably secure or physical access.
The only commercial security apparatus fulfilling all five of these TRs is a Trusted Platform
Module (TPM), used primarily as a secure cryptoprocessor [27]. Similarly high standards
of trust are instituted within the trusted elements of TAIGA to prevent corruption. The
separation of trust between the production controller and the trusted elements of TAIGA
ensures that the trusted elements will operate correctly regardless of what happens to the
production controller.
3.4 Architecture
Architecturally, TAIGA mandates two capabilities. In order to switch between the pro-
duction and backup controller under the presence of an attack, the architecture must be
adaptive and satisfy AR2. Secondly, the trusted components of TAIGA must satisfy all of
the TRs and be isolated from the untrusted production controller. Figure 3.4 illustrates
TAIGA realized on a configurable SoC platform. The production and backup controllers
host the two instances of the control algorithm for actuating the process. The IOI hosts
the trigger mechanism that monitors the physical process and identifies malicious activity.
A trigger is asserted when malicious process behavior is anticipated; an asserted trigger
switches governance of the plant from the production to the backup via the controller queue
multiplexer.
Figure 3.4: TAIGA’s realization on a configurable SoC.
A configurable SoC is well suited for an autonomic system since the field-programmable
gate array (FPGA) fabric can be customized to adapt automatically based on adversarial
conditions while satisfying the ARs and TRs. TAIGA is implemented in a configurable SoC
to isolate trust, restrict inter-module communication, and maintain TAIGA’s transparency.
3.4.1 Isolation of Trust
The production controller, backup controller, and IOI host processes and algorithms that
are both computationally and arithmetically intensive. As a result, they are run in micro-
processors that can execute compiled software rather than low-level logic. The isolation of
trust is ensured by separating hardware resources between the production controller and the
trusted entities of TAIGA. Figure 3.4 shows the production controller implemented in the
dedicated processing cores of the configurable SoC. These processors have their own RAM,
cache, and peripheral controllers. Similarly, the backup controller and IOI are realized in
soft-core processors that also have isolated memory and peripheral resources instantiated
within the FPGA fabric. Cross-access between the hard-core and soft-core resources is
explicitly not permitted by the configurable SoC, thereby satisfying TR2.
Configurable SoCs typically contain a processing system (PS), which hosts the primary
processing cores, and the programmable logic (PL) which contains the FPGA fabric. The
separation of trust illustrated in Figure 3.4 bisects these two systems. The reconfiguration
network has access only to the processing system and can modify applications that are
executed within the processing cores. The PL is hardware-defined and does not include any
ports or methods for remote reconfiguration. The backup controller and IOI, the two trusted
entities of TAIGA, can only be reconfigured through physical access to the platform, which
satisfies TR5.
3.4.2 Inter-Module Communication
In order to limit the scope for malicious reconfiguration and communication, telemetry with
trusted entities of TAIGA is restricted. The inter-module communication of the three pro-
cessing blocks described in Figure 3.4 is limited to queues. These queues are implemented
within the FPGA fabric, and do not allow external tampering from the production con-
troller. Furthermore, both the production and backup controllers have their own set of
isolated queues such that no communication resources are shared between the untrusted and
trusted elements. The controllers communicate with the IOI over the queues using a pre-
defined protocol that limits the scope of interaction. This implementation of inter-module
communication satisfies TR3.
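TAIGA's queues are hardware-implemented, but their essential behavior can be modeled in software. The sketch below captures the two properties TR3 relies on, fixed capacity and no shared state outside the queue itself; the class and method names are illustrative:

```python
from collections import deque

class BoundedQueue:
    """Software model of a hardware queue: fixed capacity, one writer,
    one reader, no shared state beyond the queue itself."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._q = deque()

    def put(self, word):
        if len(self._q) >= self.capacity:
            return False        # full: the write is refused, never overflows
        self._q.append(word)
        return True

    def get(self):
        return self._q.popleft() if self._q else None  # empty: nothing read
```

Because the writer can only append and the reader can only remove, a compromised production controller cannot corrupt the trusted side through this channel; at worst it fills its own queue.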
The IOI module hosts the trigger mechanism and is responsible for monitoring process be-
havior, detecting malicious activity, and initiating switch-over to the backup controller. All
interaction with the supervisor and physical process is channeled through the IOI module
as shown in Figure 3.4. In order to interact with the plant, the production controller must
communicate via the IOI using queues. TR4 is satisfied since the IOI cannot be bypassed
and autonomically determines process control between the production and backup controller.
3.4.3 TAIGA Transparency
The black box representing TAIGA in Figure 3.3b contains the trusted components of
TAIGA: the backup controller, the IOI, and the controller multiplexer. From the perspective
of the production control algorithm, these TAIGA components are transparent as interaction
with the physical process is not bypassed or intercepted by the I/O intermediary module,
but merely probed and arbitrated. Typically in a PCS, the low-level peripheral controllers
responsible for process sensing and actuation are implemented within the production con-
troller itself. These low-level drivers are relocated to the IOI with the TAIGA framework
as the production controller uses queues rather than direct sensor and actuator telemetry to
interact with the process. Scalability of the production controller is ensured since only the
low-level drivers of the production controller need to be modified for queue communication.
Chapter 4
Rotary Inverted Pendulum
The rotary inverted pendulum (RIP) experiment is a classical electro-mechanical controls
challenge which incorporates nonlinearity, stability, actuation limits, and noise. These con-
cerns are representative of those found in industrial control applications. The table-top setup
of the RIP experiment makes it an ideal fit for evaluating TAIGA. The Quanser RIP system
is used [22].
The Quanser RIP system contains two electrical subsystems: the linear voltage amplifier and
the rotary servo base unit. The RIP base unit, shown in Figure 4.1, contains a servo motor
for actuating the arm of the pendulum radially, and two high-resolution optical encoders
for sensing the pendulum and servo arm positions. The base unit also contains an analog
potentiometer for sensing the servo arm position, but it is not used in this experiment
since the digital encoder achieves the same purpose.
The linear voltage amplifier is responsible for driving the servo motor and accepts a voltage
within the range of ±10 volts for rotating the servo motor in the clockwise or counterclockwise
directions. Each of the optical encoders contains two digital signals: channel A and channel
B. These signals toggle at each encoder tick based on the rotation of the encoder shaft. The
high-resolution encoders contain 4096 counts per revolution. The phase differential between
the two encoder channels and the toggle frequency of the digital signal are used to determine
the direction and implicit velocity of the encoder respectively.
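The quadrature decoding just described can be sketched as a lookup over valid channel transitions. The sign convention (which phase ordering counts as positive rotation) is an arbitrary choice here:

```python
# Valid Gray-code transitions of the (A, B) channel pair; each is one
# signed tick of the 4096-count encoder.
STEP = {
    (0b00, 0b01): +1, (0b01, 0b11): +1, (0b11, 0b10): +1, (0b10, 0b00): +1,
    (0b00, 0b10): -1, (0b10, 0b11): -1, (0b11, 0b01): -1, (0b01, 0b00): -1,
}

def count_ticks(samples):
    """Accumulate signed ticks from successive (A << 1) | B samples; the
    phase ordering gives direction, the sample rate gives velocity."""
    ticks = 0
    for prev, cur in zip(samples, samples[1:]):
        ticks += STEP.get((prev, cur), 0)   # unchanged/invalid pairs add 0
    return ticks
```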
Figure 4.1: Photograph of the Quanser rotary inverted pendulum setup.
4.1 Process Control Telemetry
Physical process sensors and actuators do not directly interface to the controllers in the reg-
ulatory layer of CPSes. Typically, the interaction between the regulatory and infrastructure
layers goes through defined telemetry interfaces. The discrepancy between process control
commands and the telemetry capabilities of the embedded controller is addressed by this
interface. Modern embedded systems operate on digital logic rather than analog for com-
munication and transmission of information with external entities. Various digital protocols
exist for bidirectional communication between embedded controllers and peripherals. Some
pervasive embedded peripheral protocols include I2C, SPI, CAN, and UART; these protocols
are defined by established standardization bodies and favor certain applications over others.
For the Quanser RIP, an external sensory and control interface board (SACIB) is designed
for telemetry between the embedded controller and the physical pendulum process. The
high-level interface between the pendulum process and embedded controller is illustrated in
Figure 4.2. TAIGA assumes integrity of all entities within the physical infrastructure layer.
In the RIP application, the serial peripheral interface (SPI) bus and SACIB are trusted and
physically protected with perimeter security measures.
Figure 4.2: Interface for telemetry between the regulatory control and infrastructure layers of the Quanser RIP experiment.
Serial Peripheral Interface
SACIB communicates with the embedded controller via a SPI bus and can sense and actuate
the pendulum. SPI is a synchronous serial communication protocol originally developed
by Motorola [26]. It contains one master device that governs the bus and multiple slaves
each with a dedicated slave select (SS) signal. A slave is enabled for communication by
asserting its corresponding slave select (active low); only one slave can be enabled at a time.
Communication in the SPI protocol is achieved with three signals [26]:
• SCK is the serial clock generated by the SPI master and corresponds to the data rate
of the serial communication.
• MOSI is the output data from the master and the input to the slave.
• MISO is the input data to the master and the output from the slave.
The SPI bus acts as an inter-chip circular buffer between the master and slave devices as
illustrated in Figure 4.3. On each SCK clock cycle, a bit from the master device is transmitted
on the MOSI signal line while a bit from the slave device is transmitted on the MISO signal
line. Internally, the buffer of bits in the master and slave SPI transfer registers are bit
shifted so that words are transmitted with multiple clock cycles. The hardware SPI transfer
registers are updated with new data by the embedded controller. Since data is exchanged
bidirectionally between the master and slave at each SCK clock cycle, SPI does not follow a
client-request and server-handler data flow like other serial protocols.
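The circular-buffer behavior of the two shift registers can be modeled directly; this is a bit-level sketch of the protocol, not driver code for any particular controller:

```python
def spi_transfer(master_word, slave_word, bits=8):
    """Model of one full-duplex SPI exchange: on each SCK cycle the MSB of
    each shift register moves across the bus (MOSI one way, MISO the
    other) while the incoming bit is shifted in."""
    mask = (1 << bits) - 1
    m, s = master_word, slave_word
    for _ in range(bits):
        mosi = (m >> (bits - 1)) & 1        # master MSB out on MOSI
        miso = (s >> (bits - 1)) & 1        # slave MSB out on MISO
        m = ((m << 1) & mask) | miso
        s = ((s << 1) & mask) | mosi
    return m, s   # after `bits` clocks the two words have swapped
```

The return comment makes the circular-buffer property explicit: a full word-length transfer leaves each side holding the other's original word.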
Figure 4.3: SPI data transfer between master and slave controllers.
Sensory and Control Interface
SACIB contains four core integrated circuits (ICs), defined in Table 4.1, that are slaves
on the SPI; the embedded controller is the master. Since the linear voltage amplifier and
potentiometer operate at a different voltage range (±10 volts) than the digital ICs (3.3
volts) on the interface board, operational amplifiers are used to scale the voltage.
Table 4.1: SPI ICs in the Quanser pendulum interface board.

MCP4921, Digital-to-Analog Converter (DAC): Actuator output for interacting with the
linear voltage amplifier and for mobilizing the pendulum arm.

MCP3202, Analog-to-Digital Converter (ADC): Sensory input of the absolute potentiometer
position of the pendulum arm. Not used in this experiment.

LS7366R, 2×32-bit Quadrature Counters: Sensory input that keeps count of the encoder
ticks to determine the position of the servo and pendulum arms.
Neither the ADC nor the DAC IC requires software configuration. The quadrature counters
require an initialization process in which internal registers are configured for sensing the
encoder’s radial position:
1. The counter is cleared with the pendulum position pointing downwards in a free hang-
ing position with no oscillation by writing the CLR CNTR op-code on the SPI bus.
2. The operation mode is configured by writing the following mask to the MDR0 register:
QUADRX4 | FREE RUN | DISABLE INDX | FILTER 2.
3. The operation mode is configured further by writing the following mask to the
MDR1 register: NO FLAGS | BYTE 2 | EN CNTR.
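The three initialization steps above can be sketched as follows. `spi_write` is an assumed byte-level primitive, and the op-code and flag values are the ones commonly used in LS7366R driver libraries; they should be verified against the LS7366R datasheet before use:

```python
# Hypothetical op-codes and mode flags (verify against the LS7366R
# datasheet); spi_write is an assumed byte-level SPI primitive.
CLR_CNTR, WR_MDR0, WR_MDR1 = 0x20, 0x88, 0x90
QUADRX4, FREE_RUN, DISABLE_INDX, FILTER_2 = 0x03, 0x00, 0x00, 0x80
NO_FLAGS, BYTE_2, EN_CNTR = 0x00, 0x02, 0x00

def init_counter(spi_write):
    # 1. Clear the count with the pendulum hanging at rest.
    spi_write([CLR_CNTR])
    # 2. Configure the operating mode in MDR0.
    spi_write([WR_MDR0, QUADRX4 | FREE_RUN | DISABLE_INDX | FILTER_2])
    # 3. Configure the remaining mode bits in MDR1.
    spi_write([WR_MDR1, NO_FLAGS | BYTE_2 | EN_CNTR])
```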
4.2 Control Algorithm
The RIP contains two mechanical parts: the servo arm and the pendulum as shown in
Figure 4.4. The servo arm is radially actuated with the servo motor by applying a control
voltage. The position of the servo arm (θ) is sensed via an optical encoder. The pendulum
freely pivots along the end of the servo arm and its position (α) is sensed by the position
of the second encoder shaft attached to the freely rotating pendulum’s pivot point [1]. The
sign conventions of both α and θ are illustrated in Figure 4.4.
Figure 4.4: Inverted pendulum setup and sign conventions for θ and α.
The control objective is to balance the pendulum at a commanded servo arm position. The
inverted pendulum experiment, like many other physical processes, is a continuous system.
Embedded platforms are digital electronics that do not operate in continuous time like analog
devices, but rather, operate on a discrete time interval. As a result, the dynamics of the
pendulum are modeled in discrete time k using Equation 4.1 [9].
x_{k+1} = A·x_k + B·u_k + w_k
y_k = C·x_k + v_k        (4.1)
The state vector is defined by x ∈ ℝ⁴ and contains the radial pendulum and servo arm
positions and velocities. The control voltage applied to the servo arm is represented by
u ∈ ℝ. The process output vector is defined by y ∈ ℝ². The state vector is influenced
by process noise while the output is influenced by measurement noise, modeled by w and v
respectively [8]. The resolution of the encoder is 4096 counts per revolution which results in
a 0.088◦ tolerance for error in sensing the radial position of the pendulum and servo arm.
Measurement noise is set to half a count of the encoder resolution and taken into account
by v [8]. The four-dimensional state vector of the pendulum is represented in Table 4.2.
Table 4.2: Variables in the state vector of the inverted pendulum control experiment.
State  Description
θ    Servo arm radian angle position derived from encoder measurement.
α    Freely pivoting pendulum radian angle derived from encoder measurement.
θ̇    Velocity of the servo arm derived using state estimation.
α̇    Velocity of the pendulum derived using state estimation.
A linear-quadratic-Gaussian (LQG) feedback controller is implemented as the control
algorithm for the pendulum [8]. The LQG controller computes an optimal control law by
minimizing a quadratic cost function that places large penalties on deviations of θ and α. This control algorithm
satisfies the overall goal of the controller by reducing the discrepancy of the pendulum and
servo from the desired positions while maintaining the control voltage of the servo within
the actuator limits.
The LQG controller is a combination of a linear-quadratic regulator and a Kalman filter.
The Kalman filter serves two purposes: state estimation and noise immunity. A 1 millisecond
discrete control cycle time interval is used to sense and actuate the pendulum with the
control algorithm. The first two parameters of the state vector in Table 4.2 are derived
exclusively from sensory measurements. The last two parameters, velocities of the servo arm
and pendulum, are estimated using the Kalman filter. The Kalman filter establishes the
linear relationship between the change in sensor measurements and the sampling time
interval to estimate the associated velocities. Furthermore, noisy sensor measurements
are suppressed by the Kalman filter.
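One control cycle of this LQG loop can be sketched as follows: a Kalman predict step using the discrete model of Equation 4.1, a measurement update, and the state-feedback control law. The matrices A, B, C and the gains L and K shown here are illustrative placeholders, not the tuned values from [8].

```c
#include <string.h>

#define NX 4  /* states: theta, alpha, theta_dot, alpha_dot */
#define NY 2  /* measurements: theta, alpha */

/* Placeholder model and gain matrices (the tuned values are not shown). */
static const double A[NX][NX] = {{1,0,1e-3,0},{0,1,0,1e-3},{0,0,1,0},{0,0,0,1}};
static const double B[NX]     = {0, 0, 1e-3, 0};
static const double C[NY][NX] = {{1,0,0,0},{0,1,0,0}};
static const double L[NX][NY] = {{0.5,0},{0,0.5},{10,0},{0,10}};  /* Kalman gain */
static const double K[NX]     = {2.0, 30.0, 1.0, 3.0};            /* LQR gain */

/* Predict x = A x + B u, correct with measurement y, return control voltage. */
double lqg_step(double x[NX], const double y[NY], double u_prev) {
    double xp[NX] = {0};
    for (int i = 0; i < NX; i++) {
        for (int j = 0; j < NX; j++) xp[i] += A[i][j] * x[j];
        xp[i] += B[i] * u_prev;
    }
    for (int i = 0; i < NY; i++) {            /* innovation y - C x */
        double innov = y[i];
        for (int j = 0; j < NX; j++) innov -= C[i][j] * xp[j];
        for (int j = 0; j < NX; j++) xp[j] += L[j][i] * innov;
    }
    memcpy(x, xp, sizeof xp);
    double u = 0;                             /* feedback law u = -K x */
    for (int i = 0; i < NX; i++) u -= K[i] * x[i];
    return u;
}
```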
4.3 Pendulum Guards
Process stability for the inverted pendulum is defined as continuous upright pendulum
balance. This is used as the primary basis for defining the α guard. The desired operating position for
α is constant and set to 0°, the upright position. A safety-critical guard of
αSCG = ±15° is set to keep recovery of process stability within the actuator limits.
Pendulum deviation larger than 15◦ from the inverted pendulum position requires a control
voltage larger than the capabilities of the servo to regain pendulum balance [8]. Since the
desired pendulum position is always constant and does not vary with operational conditions,
an operational guard is not enforced on α.
The desired servo arm position at which pendulum balance is maintained is defined as an
operational parameter that can be modified during run-time by supervisory operators. This
θdesired is defined as the operational set-point. In order to differentiate between safe and
malicious operation, the servo arm position is bounded by both safety-critical and operational guards.
The guards are summarized in Table 4.3.
Table 4.3: Safety-critical and operational guard definitions for α and θ.
Guard  Value  Description
αSCG   ±15°   Safety-critical guard on the pendulum.
θOG    ±35°   Operational guard on the servo arm.
θSCG   ±50°   Safety-critical guard on the servo arm.
4.4 Trigger Mechanisms
Servo arm operation beyond 50° in either direction is considered unsafe, while operation
beyond 35° is considered unstable according to the guard definitions in Table 4.3. The
trigger mechanism maintains both process stability and safety and thus must enforce the
operational and safety-critical guards on α and θ. Three different trigger mechanisms are
developed for the evaluation of TAIGA: trivial, linear online prediction, and neural network
classification [9][8].
4.4.1 Trivial
The trivial trigger mechanism is a control mechanism used as a baseline standard for
comparison with the other trigger mechanisms. Unlike the prediction and classification methods,
the trivial trigger mechanism does not preemptively forecast a guard violation, but rather
asserts the trigger once the operational guard is violated. This mechanism is the simplest
to implement and represents current process monitoring schemes routinely found in CPSes
where hard-defined operational limits are enforced by the process controllers.
4.4.2 Linear Online Prediction
A linearized model of the pendulum is used with an instance of the LQG control algorithm
to forecast future plant behavior in the linear online prediction method. Two sets of
process states are maintained: one resembling the real-time physical process and one for
the prediction unit used to forecast future behavior. Initially, the prediction unit's states
are synchronized with the physical process. The prediction algorithm accelerates the
behavior of the plant several control cycles into the future, using the linear plant model
to estimate process behavior and the control algorithm to determine the actuation signal
at each iteration. The trigger is asserted if a guard is violated at any future iteration.
The settling time of the pendulum is approximately 1.2 seconds which translates to 1200
iterations with a 1 millisecond control cycle [8]. Computational limitations on the embedded
controller may not allow for 1200 iterations of the prediction unit within the 1 ms control
cycle time; in this case, the iterations are broken up and partitioned among several control
cycles. The state vector of the prediction unit is synchronized with the actual physical
process once all iterations are complete.
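The partitioning scheme described above can be sketched as follows. Here model_step() and guard_violated() are mock stand-ins for one iteration of the linear plant model under backup LQG control and for the Table 4.3 guard checks, and the per-cycle iteration budget is likewise illustrative.

```c
#define HORIZON         1200  /* ~1.2 s settling time at 1 ms per iteration */
#define ITERS_PER_CYCLE  100  /* forecast steps budgeted per control cycle */

typedef struct { double x[4]; } PredState;

/* Mock plant model and guard check standing in for the real ones. */
static void model_step(PredState *s) { s->x[0] += 0.001; }
static int guard_violated(const PredState *s) { return s->x[0] > 0.61; }

static PredState pred;        /* prediction unit's copy of the state */
static int iter = HORIZON;    /* forces a resync on the first cycle */

/* Called once per 1 ms control cycle; returns 1 to assert the trigger. */
int prediction_cycle(const PredState *actual) {
    if (iter >= HORIZON) {    /* all iterations done: resynchronize */
        pred = *actual;
        iter = 0;
    }
    for (int i = 0; i < ITERS_PER_CYCLE && iter < HORIZON; i++, iter++) {
        model_step(&pred);
        if (guard_violated(&pred))
            return 1;         /* a future guard violation is anticipated */
    }
    return 0;
}
```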
The control algorithm used by the prediction unit is the backup instance of the LQG
controller. The backup control algorithm is trusted and responsible for regaining stable and
safe pendulum operation. Hence, predicting with the backup control algorithm ensures the
backup controller is capable of recovery if a guard violation is anticipated and the trigger
is asserted. However, the production controller is initially governing the physical process
prior to trigger assertion. As a result, the trajectory forecasted by the linear online
prediction method may not accurately resemble the future behavior of the physical process if
the operating conditions of the production controller vary from those of the backup or are
compromised. Rather, the prediction method attempts to foresee whether process stability
and safety can be regained and maintained with the governance of the backup controller at
each control cycle.
4.4.3 Neural Network Classification
A classification algorithm is derived offline using a nonlinear model of the pendulum. The
parameters of the state vector are incrementally permuted to obtain an initial set of state
vectors. The non-linear model is used to simulate the behavior of the pendulum under
backup control for each given initial state vector. The large set of simulation results is
used to train a neural network-based classifier.
A multilayer perceptron (MLP), a neural network model, maps four input neurons, one
per state of the RIP, to outputs through a nonlinear activation function [9]. The MLP is
trained using the simulation results with a nonlinear least squares
method. The resulting classification algorithm determines whether or not a given input
state vector is in a region of safe and stable operation, which is defined as the backup
controller’s ability to maintain pendulum balance at a predefined servo position without a
guard violation.
The neural network classifier bounds the physical process within the four-dimensional state
vector space. It does not require multiple iterations online but rather the linear algebraic
computations of the classification algorithm, which makes it much less computationally intensive
and better suited to embedded platforms.
Chapter 5
TAIGA Implementation
TAIGA necessitates hardware and software coherence to be effective. Since fabrication of a
custom architecture is infeasible, TAIGA is implemented on a commercially available
configurable SoC. The Xilinx Zynq-7000 allows hardware customization and has tight
integration between the software and hardware design flows. The hardware realization of
TAIGA, initially proposed in Figure 3.4, is implemented on the target Zynq platform as depicted in
Figure 5.1. The specifics of this platform, along with the details regarding the hardware and
software implementation of TAIGA for the RIP application are described in the following
sections.
Figure 5.1: The hardware implementation of TAIGA on a Zynq-7000 configurable SoC.
The source code, IP, and system design for TAIGA are maintained in the following GitHub
repository: https://github.com/tejachil/TAIGA.git.
5.1 Target Platform
An embedded platform that is adaptive, customizable, and high-performance is essential for
a robust TAIGA implementation. A Xilinx Zynq-7000 All Programmable SoC is a
suitable commercially available chip with reasonable support and readily available development
platforms [30]. TAIGA is implemented for the RIP application on a ZYBO development
board which contains a Zynq-7010 IC [6]. The specifications for this platform are defined in
Table 5.1.
Table 5.1: Specifications for ZYBO Zynq-7000 development board.
Zynq-7010
  ZYNQ (PS): XC7Z010-1CLG400C
  Processor: Dual-Core ARM Cortex-A9
  Frequency: 650 MHz
  FPGA (PL): Artix-7
  Logic Cells: 28K
  BRAM: 240 KB
  DSP: 80 slices
ZYBO
  Serial Flash: 128 MB
  RAM Capacity: 512 MB DDR3
  RAM Speed: 1050 Mbps
  Pmod: 1 MIO, 1 ADC, 4 EMIO
The Zynq-7000 configurable SoC is partitioned into two subsystems: the PS and the PL.
The PS contains a dual-core ARM processor while the PL contains an FPGA fabric that
resembles that of a Xilinx 7-series FPGA as specified in Table 5.1. The Zynq platform
maintains a separation of resources, memory, and peripherals between the PL and PS.
TAIGA is implemented with the Xilinx Vivado tool suite. Custom intellectual property
blocks are generated using the Vivado HLS tool. The PS and PL are configured with various
functional blocks within the Vivado design tool. Once the hardware is synthesized and
implemented, a bitstream is generated.
Software is developed, compiled, and deployed as applications on available processing cores
using the Xilinx SDK. Each application requires a board support package (BSP) which
provides the specific libraries and support code for the processor's hardware profile. The
hardware peripherals and details are abstracted from the software with the BSP. Applications
are classified in the Xilinx SDK as either:
• Bare-metal applications use the standalone BSP and allow the software implementation
to directly interact with the hardware and the available peripherals with the drivers
and libraries defined by the BSP.
• Kernel applications use a specific BSP targeting an operating system and contain
complex methods such as scheduling that can host and execute multiple external
applications while arbitrating interaction between the software and hardware.
5.1.1 Processing System
The Zynq’s PS contains an application processor unit (APU), a set of peripheral I/O con-
trollers, and an independent memory hierarchy. The PS closely resembles a microcontroller
with the added computational power of ARM processing cores. The internal structure of the
PS is represented in Figure 5.2.
APU and Memory Hierarchy
The APU contains two ARM cores, each with a distinct floating point unit (FPU), memory
management unit (MMU), and L1 caches. Two types of L1 caches are present, one for data
and one for instructions. In addition to maintaining local data sets, each ARM core can
execute independent instruction streams asynchronously. Xilinx's asymmetric
multiprocessing (AMP) configuration enables concurrent multi-OS or application support across the two
cores. The L2 cache and on-chip SRAM are shared among both cores along with various
internal peripherals such as timers and interrupt controllers, and external peripherals such as
off-chip memory, flash, and I/O. The shared memory structures of the APU provide robust
communication between applications running on the two cores in the AMP configuration.
The ZYBO platform contains off-chip RAM as specified in Table 5.1. This memory is ac-
cessed through the direct memory access (DMA) driver within the APU. Typically, programs
for execution are initially loaded into external DDR3 memory. However, the ZYBO
development board also contains an SD card slot and external serial flash memory. An SD peripheral
Figure 5.2: Internal layout of the Zynq processing system.
controller is enabled within the PS to interface access to the SD card. The SD card is
advantageous for storing non-volatile boot images, which can be loaded into DDR3 memory
during start-up, or hosting file systems for complex operating systems such as Linux. The
SD card and external serial flash memory, although relatively slow in terms of data access,
are the only non-volatile ZYBO memory systems that can preserve data across power cycles.
Peripherals
The ZYBO platform contains a variety of peripherals that interact with the Zynq; the
controllers for these peripherals are configured within the Zynq PS and highlighted in the
I/O peripheral controllers module of Figure 5.2. The ARM Advanced Microcontroller Bus
Architecture (AMBA) defines a standard for on-chip interconnections of functional blocks
within a SoC [18]. The Advanced eXtensible Interface (AXI) protocol is a part of AMBA and
is used to facilitate interaction between the PS and functional peripheral blocks instantiated
within the PL in the Zynq architecture. Peripheral, controller, and hardware accelerator
blocks are instantiated in the PL as AXI slave or AXI master devices and communicate with
the PS via the AXI ports shown in Figure 5.2.
Clock Generation
Sequential functional blocks and peripherals within the PL operate on a clock signal from
the PS’s clock generation unit as represented in Figure 5.2. The CPU clock frequency for
the two ARM cores in the ZYBO platform is 650 MHz as specified in Table 5.1. The Zynq
processor contains four PL fabric clocks configurable between 100-250 MHz by the clock
generation unit.
TAIGA’s PL-implemented functional blocks use one of these PL fabric clocks, FCLK CLK0,
configured to 144.4 MHz satisfying the timing constraints of the functional blocks in the PL.
The fabric clocks are sourced by one of three available phase-locked loops (PLLs): the
ARM PLL, the I/O PLL, and the DDR PLL. A fabric clock frequency is generated by
integer division of its source PLL frequency. The requested frequency to satisfy the PL's
timing requirements is 145 MHz. However, none of the three PLL source frequencies is an
exact multiple of this requested frequency. As a result, the ARM PLL is used to generate
144.4 MHz, the closest achievable frequency to the request.
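The divisor selection can be sketched as below, under the assumption (not stated in the text) that the ARM PLL runs at 1300 MHz, i.e. twice the 650 MHz CPU clock; dividing by 9 then yields approximately 144.44 MHz, matching the reported 144.4 MHz.

```c
#include <math.h>

/* Pick the integer divisor of the source PLL that lands closest to the
 * requested fabric clock frequency.  The 1300 MHz ARM PLL frequency used
 * in the test below is an assumption, not a value from the text. */
unsigned closest_divisor(double f_pll, double f_req) {
    unsigned best = 1;
    double best_err = INFINITY;
    for (unsigned div = 1; div <= 64; div++) {
        double err = fabs(f_pll / div - f_req);
        if (err < best_err) { best_err = err; best = div; }
    }
    return best;
}
```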
5.1.2 Programmable Logic
The PL contains a large set of configurable logic blocks (CLBs), block RAM (BRAM), digital
signal processing (DSP) slices, I/O blocks (IOBs), and registers as specified in Table 5.1.
Figure 5.3 illustrates an abstracted view of a typical FPGA fabric. CLBs are groupings of
lookup tables (LUTs) configured for combinational logic requirements, and flip-flops (FFs)
for sequential logic. The interconnections and components represented in Figure 5.3 are
configurable for different applications. Typically, FPGAs are configured using a hardware
description language (HDL) such as VHDL or Verilog. In addition, the Vivado design flow
enables FPGA configuration using existing high-level functional blocks.
The Zynq PS is configured and instantiated in the Vivado block diagram as a functional
block. With just the PS, programmable access to the two ARM cores is enabled. However,
the PL is necessary for enabling external controllers, accelerators, peripherals, or functional
blocks that do not exist within the PS. Most functional blocks enabled in the PL interact
Figure 5.3: FPGA architecture and fabric composition.
with the PS via the AXI peripheral buses. Interaction with external peripherals on the
ZYBO are typically channeled through Zynq general purpose I/O (GPIO) pins. The Zynq
contains two banks of I/O pins as shown in Figure 5.2:
• Multiplexed I/O (MIO) bank contains a set of GPIOs that are dedicated to the PS and
cannot be accessed from the PL.
• Extended multiplexed I/O (EMIO) bank contains a set of GPIOs that are available to
the PL and can also be accessed by the PS.
These banks contain a wide array of internal I/Os that are multiplexed to the pins of the
Zynq package. Certain peripherals on the ZYBO board, such as the universal asynchronous
receiver/transmitter (UART) connector, Ethernet port, and SD card, are connected to signals
on the MIO bank and cannot be accessed by the PL. Any external peripherals needed by the
PL must connect through the EMIO bank, usually via one of the ZYBO Pmod connectors.
Block RAM
As shown in Figure 5.2, the off-chip flash memory and SD card are connected via the MIO
bank while the volatile memory within the APU is local to the APU. As a result, these
memory systems are internal to the PS and inaccessible from the PL. Rather, the PL
contains its own BRAM distributed memory as specified in Table 5.1. The FPGA contains
multiple blocks of dedicated two-port memory, each block containing 36Kbits.
One or more BRAMs may be instantiated within the FPGA and addressed individually.
BRAM is typically allocated for functional blocks such as a first-in first-out (FIFO) which
requires memory for a queue. AXI peripherals exist to expose sectors of BRAM to the PS;
without such a peripheral, BRAM cannot be accessed from the PS.
Digital Signal Processing Slices
In addition to logic cells and BRAM, the PL contains a set of DSP slices to provide fast
execution of common arithmetic and logical operations. They are a level of abstraction
higher than the logic cells and are optimized for high-performance digital signal processing
and computation. These DSP slices are configurable within the PL and are instantiated as
hardware accelerators for functional blocks requiring high-performance arithmetic.
MicroBlaze
Occasionally, processes destined for implementation within the PL are better suited to a
processing core executing compiled code. MicroBlaze is a Xilinx proprietary soft processor
core with a RISC instruction set, designed for implementation exclusively in FPGA fabric [28].
The MicroBlaze uses a local-memory bus (LMB) to efficiently interact with BRAM memory
instantiated alongside the processing core within the PL. The size of BRAM dedicated to
the processor memory is configurable; MicroBlaze also has support for instruction and data
caches similar to the APU within the PS. All process control and instruction registers are
addressed within the PL and are local to the MicroBlaze processor.
In TAIGA, the backup controller and IOI are hosted within independent MicroBlaze cores.
MicroBlaze supports a maximum clock rate of 150 MHz with the lowest performance speed
grade of the ZYBO platform’s Zynq. The backup controller and IOI necessitate hardware
accelerators to satisfy arithmetic needs. The addition of these accelerators raises negative
slack errors during PL routing when clocked at 150 MHz. As a result, 145 MHz is
requested to meet the PL timing requirements, and FCLK_CLK0 is accordingly configured
to the closest possible frequency of 144.4 MHz sourced from the ARM PLL.
The two MicroBlaze cores are configured for maximum frequency with a five stage pipeline
and customized with the following parameters:
• Hardware barrel shifter enabled to allow multiple bit shifts within one clock cycle.
• Extended FPU enabled in hardware to improve single-precision IEEE-754 standard
floating point arithmetic performance.
• Hardware integer multiplier enabled to improve performance for 32-bit integers.
• Hardware integer divider enabled for increased performance in integer division.
• Additional machine status register (MSR) instructions enabled for faster bit
modifications to the process control register of the architecture.
• Branch target cache size of 1024 entries stored in BRAM for better branch prediction.
A MicroBlaze configuration can trade off performance and resource usage. The configuration
parameters used in TAIGA primarily improve the computation performance. MicroBlaze
supports both data and instruction caches to hide memory access latency. However, these
caches are implemented in BRAM similar to the local instruction and data memory which
does not yield a significant performance increase. The PL real estate is a particularly precious
resource. In TAIGA, BRAM is a bottleneck resource primarily due to the large number of
functional units requiring it. As a result, MicroBlaze optimizations are FF, LUT, and DSP
intensive in order to obtain performance improvements while conserving BRAM.
5.2 Production Controller
The ZYBO embedded controller interacts with the pendulum via the SACIB SPI bus
connected to a dedicated ZYBO Pmod connector. As a standalone apparatus without TAIGA,
the production controller directly senses and actuates the pendulum with the PS
resembling a microcontroller. In this configuration the internal SPI peripheral controller (SPI 0),
as shown in Figure 5.2, is enabled. Initially, the production controller algorithm is
implemented as a bare-metal application in one of the ARM cores using an internal hardware
timer to initiate the one millisecond control cycle. This implementation without TAIGA is
used to verify proper and reliable control of the RIP.
The separation of resources between the PS and PL is leveraged by TAIGA to implement the
untrusted production controller in the PS as illustrated in Figure 5.1. With the incorporation
of TAIGA, the production controller no longer interacts directly with the SPI bus, but
rather through queues interfaced by the IOI module. The SPI controller internal to the PS
is disabled and an external AXI FIFO stream peripheral is instantiated within the PL to
interface communication with the queues as shown in Figure 5.1. The implementation of the
queues is described in greater detail in Section 5.4. Xilinx’s AMP solution is implemented
within the APU to support a real-time operating system (RTOS) on one core and a high-level
operating system (OS) on the other.
5.2.1 FreeRTOS
The production control algorithm and methods for controlling the pendulum process are
implemented in a FreeRTOS framework on ARM core 1. In general, the use of an operating
system greatly increases the scalability of the application's software and also enriches the
capabilities of the processor by abstracting hardware specifics away from the software
implementation. An RTOS achieves these goals while preserving the strong and direct interaction
between the hardware and software.
FreeRTOS is a hybrid between a bare-metal and kernel application. It contains a scheduler
that is used to execute internal processes, called tasks, that are implemented, compiled,
and executed with the kernel itself. This close-knit integration between the kernel and the
tasks ensures integrity between the hardware and software interaction which is essential in
real-time process control applications. FreeRTOS has the following advantages compared to
bare-metal and kernel applications [2]:
• Smaller memory footprint and overhead than traditional operating systems.
• Enriched with useful OS features such as a configurable scheduler, software-defined
timers, task manager with multitasking capabilities, and interrupt handlers.
• Efficient inter-task communication through the implementation of message queues.
• Shared memory access between tasks with mutex and semaphore implementations.
• No distinct kernel layer to separate hardware from the software.
The core pendulum control sequence is implemented as a timer callback function with a
one millisecond software timer. The tick rate of FreeRTOS is configured to 1000 Hz in the
FreeRTOS BSP such that the pendulum control callback function can be executed every
tick. The pendulum control sequence is executed as follows:
1. Request an operational θ set-point at which to maintain the servo arm.
2. Request servo arm and pendulum position measurements from the physical process.
3. If the pendulum position is within the controllable region (|α| < 15◦), compute the
LQG control algorithm and Kalman filter to generate a control voltage for actuation.
4. Write the control voltage to the servo.
The control algorithm is implemented as a C function and contains the control parameters
of the LQG algorithm for high-performance pendulum balance. The methods for physical
process interaction are implemented as low-level drivers and utilities. Traditionally, these
methods interact directly with the SPI bus. They are modified to instead interact with the
IOI over the FIFO message queues using a strict packet syntax defined in Section 5.4.
Multitasking
FreeRTOS implements multitasking with a single processing core by rapidly switching
between tasks. The execution of three concurrent tasks using this time-slicing method is
illustrated in Figure 5.4a. The perceived parallel execution of tasks achieved by rapid and
efficient switching is illustrated in Figure 5.4b. A task's execution is impeded when the
task suspends itself with a yield, or when a higher priority task, interrupt, or timer must
be serviced.
Each task is allocated its own dedicated stack which contains local memory blocks for storage
of task-specific variables. Inter-task memory access is efficiently orchestrated through the
use of locks and semaphores, which protect the integrity of the variables accessed by multiple
concurrent tasks. Message queues are another method for inter-task data transfer.
A timer callback function is essentially treated as a task in FreeRTOS. A timer handle
for the pendulum control sequence is created and started with the highest priority such
(a) Time-sliced scheduling of tasks. (b) Perceived parallel execution of tasks.
Figure 5.4: Execution trace resembling FreeRTOS scheduling of concurrent tasks.
that the method can interrupt any other tasks dispatched by the scheduler. Typically,
introducing other tasks in a bare-metal implementation may require modifications to the
pendulum control sequence or external task handling methods. However, the priority-driven
concurrent task capabilities of FreeRTOS ensure strict execution of the pendulum control
sequence at every millisecond as long as it maintains highest priority. In the implementation
of the production controller, the FreeRTOS framework allows scalable implementation of
other tasks for arbitrating reconfiguration, control parameter optimization, or facilitating
interaction with external peripherals without disrupting the control process.
5.2.2 Linux
Linux is implemented on ARM core 0 primarily to provide high-level network and file transfer
services to support reconfiguration. The leaf nodes of cyber-physical control require
interaction with hierarchical elements in both the regulatory and supervisory layers, such as
RTUs, other PLCs, and supervisory controllers. Since communication between the IOI and
production controller is restricted to bounded queues, reconfiguration capabilities through the
IOI module are not practical. As illustrated in Figure 5.1, the reconfiguration network
interacts directly with the production controller via Ethernet or UART.
The ZYBO platform contains a shared UART and Joint Test Action Group (JTAG)
Universal Serial Bus (USB) port. The JTAG interface is used for programming and debugging
the Zynq's hardware and software. Because JTAG port access allows reconfiguration of the
Zynq architecture, it is blocked to safeguard the integrity of TAIGA. Rather than using
ZYBO's shared UART and JTAG USB port as the reconfiguration network, the PS's UART 1
peripheral controller is routed to an external USB-to-UART converter board connected to a dedicated
MIO Pmod port. The USB-to-UART converter uses a FTDI FT232RQ IC which interfaces
the production controller’s UART bus with a USB connector to provide bidirectional serial
communication.
Direct communication to the PS allows the reconfiguration network to not only modify
certain parameters, but also execute firmware updates of the entire PS image. Typically,
such updates require modification to the boot images stored within the SD card and file
transfers with external networks over Ethernet or UART. The peripheral controllers for
interfacing with the SD card, Ethernet, and UART are enabled within the PS as shown in
Figure 5.1.
Access to the Ethernet, SD card, and UART peripherals is available to the core running
FreeRTOS. However, the implementation of file transfer and networking methods within
the FreeRTOS framework is tedious, hardware specific, and messy. Linux is implemented
to manage this process since the Linux OS already contains Ethernet, serial, and SD card
drivers as well as services for interacting with these peripherals. Lower-level methods for
making parameter optimizations or alterations to the control sequence are implemented in
the FreeRTOS framework as lower priority tasks relative to the process control sequence. The
use of a high-level OS like Linux to facilitate production controller reconfiguration conforms
to existing industrial control PLCs.
The integration of Linux within the AMP framework, interaction with external peripherals,
and networking are investigated independently and made feasible in the TAIGA
implementation. The software routines to interface and arbitrate production controller
reconfiguration are outside the scope of this thesis and considered an extension to the work presented.
5.3 Backup Controller
Similar to the production controller, the backup controller is implemented in C and involves
arithmetic complexity necessitating intensive computation. As a result, an instruction set
processor suits the implementation of the backup controller. In order to ensure trust and
prevent malicious sharing or access of resources, the backup controller is implemented within
the PL on a MicroBlaze soft processor. Two BRAM modules are allocated to the backup
controller: 32 KB for local instruction memory and 32 KB for local data memory.
The backup controller is primarily a simplified version of the production controller with an
LQG control algorithm tuned for high assurance rather than performance. The controller
only contains one routine, the pendulum control sequence:
1. Request servo arm and pendulum position measurements from the physical process.
2. If the pendulum position is within the controllable region (|α| < 15◦), compute the
LQG control algorithm and Kalman filter to generate a control voltage for actuation.
3. Write the control voltage to the servo.
The control sequence for the backup controller is almost identical to that defined for
production. Unlike the production controller, the backup controller does not request a supervisory
set-point, but rather, operates on one that is pre-configured. The operational set-point for
the backup controller is set to θdesired = 0◦.
Under production control, the supervisor dictates the operational set-point through inter-
action with the IOI. The production controller then requests the supervisor’s operating
conditions from the IOI module. In a typical PCS, the operating conditions are intended to
actuate the system to satisfy plant deliverables. Since the backup controller’s intention is to
maintain process safety and stability rather than satisfy plant deliverables, it operates on a
pre-configured operating condition that is verified for safe and stable process control.
Operating under pre-defined conditions, the backup controller does not interact with CPS
hierarchical nodes, and only interacts with the physical process. This eliminates untrusted
inputs to the backup controller, which can be thought of as an independent and isolated
physical control leaf node. This attribute is essential in ensuring backup controller trust.
The backup control sequence is implemented using an AXI timer peripheral with a 32-bit
configurable counter and an interrupt controller for detecting counter overflow and initiating
a reset. The timer merely acts as a counter incrementing at each clock cycle. In software, the
counter is configured to reset to a preconfigured value, RESET COUNT, when the maximum
count is reached. The calculation of RESET COUNT is shown in Equation 5.1.
Ttimer = (MAX COUNT − RESET COUNT) / fCLK
RESET COUNT = MAX COUNT − Ttimer · fCLK        (5.1)
Since the counter hardware is 32 bits wide, MAX COUNT = 0xFFFFFFFF. The PL fabric
clock FCLK CLK0 clocks the timer module at fCLK = 144.4 MHz. The desired period of
the timer, Ttimer, is the one millisecond control cycle time of the pendulum control sequence.
With these parameters, Equation 5.1 yields RESET COUNT = 0xFFFDCBC2, generating a one millisecond interrupt for executing the pendulum control sequence.
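The arithmetic can be checked with a few lines of C. The exact FCLK CLK0 frequency is assumed here to be 144,444,444 Hz (the nominal 144.4 MHz), with the count rounded up so the timer period is never short:

```c
#include <stdint.h>

#define MAX_COUNT 0xFFFFFFFFu   /* 32-bit counter rolls over at this value */

/* Counts needed for a period of t_us microseconds at f_clk_hz, rounded up. */
static uint32_t timer_counts(uint64_t f_clk_hz, uint64_t t_us)
{
    return (uint32_t)((f_clk_hz * t_us + 999999u) / 1000000u);
}

/* Equation 5.1: RESET_COUNT = MAX_COUNT - Ttimer * fCLK. */
static uint32_t reset_count(uint64_t f_clk_hz, uint64_t t_us)
{
    return MAX_COUNT - timer_counts(f_clk_hz, t_us);
}
```

With f_clk_hz = 144444444 and t_us = 1000, reset_count() evaluates to 0xFFFDCBC2, matching the value programmed into the AXI timer.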
The backup controller is implemented as a standalone bare-metal application within the
MicroBlaze processing core primarily to eliminate unnecessary kernel overheads and com-
plications. Since the backup controller only hosts the process control sequence, a kernel or
OS is unnecessary. The bare-metal implementation also makes the backup controller much
leaner and suitable for a MicroBlaze processing core which does not have the computational
power of an ARM processor. Lastly, formal verification and analysis of code implemented in
trusted elements is necessary to satisfy TR1; this is much more practical without the added
complexity of a kernel.
5.4 Queues
Queues are effective for robust inter-module communication in embedded systems. Typically,
FIFOs are used for inter-chip communication through serializer/deserializer telemetry links.
In the TAIGA implementation, FIFOs are used for inter-module communication between the controllers and the IOI module internal to the Zynq. In the Zynq architecture, FIFOs are implemented with a set of BRAMs which exist in the PL, protecting the integrity of
the queue. The ordered and buffered nature of FIFOs allows for efficient data transactions
between the IOI and controller.
The allocation of BRAM and implementation of the queue is configured through a FIFO
generator block in Vivado. As illustrated in Figure 5.1, each controller is interfaced to the
IOI module with an independent pair of FIFOs. This isolates the telemetry between the
production controller and IOI, and the backup controller and IOI. The integrity of trusted
communication between the backup controller and IOI is ensured since production read/write
access to the backup’s FIFOs is prevented.
The FIFO generator IP contains three types of FIFO interfaces: native, AXI memory
mapped, and AXI stream. The native interface type is primarily intended for interfac-
ing the queues with custom functional blocks within the PL. In the AXI memory mapped
protocol, all data is addressed within system memory and referenced from the FIFO. This is not suitable for TAIGA, which must preserve the isolation and locality of PS memory and PL BRAM.
The AXI stream protocol is used to interface the processing cores with the respective set
of FIFOs. The AXI stream protocol is part of ARM’s AMBA AXI4 interface specification
and follows a data-flow paradigm [29]. Rather than addressing system memory, data is
moved from local memory into the allocated memory of the FIFO on an enqueue method,
and moved to the destination core's local memory on a dequeue method. Each FIFO generator block is configured with the following parameters:
• AXI stream interface type
• Common clock connected to FCLK CLK0 which is 144.4 MHz
• Word width set to 32 bits, TDATA is 4 bytes wide
• TLAST enabled for indication of packet termination
• FIFO implemented in common clock BRAM as a data FIFO application for low-latency
memory access (2 clock cycles)
• FIFO depth configured for 512 words
At the lowest implementation level, the FIFO generator block contains numerous input and
output signals which require precise timing for writing (enqueuing) to the FIFO and receiving
(dequeuing) from the FIFO. The AXI-Stream FIFO block used to interface the processing
cores with the FIFO generator handles all of the required signal manipulation for read and
write operations which are executed using an AXI-Stream FIFO software library in the Xilinx
SDK.
Each FIFO generator block is connected to one AXI stream peripheral at the receiving end,
and another at the transmitting end. Each AXI stream peripheral contains one queue for
transmission and another for receiving as depicted in Figure 5.1. The FIFO stream signals for
both enqueuing and dequeuing messages to the FIFO generator are described in Table 5.2.
Table 5.2: Transmit and receive data channel signals of the interface between the AXI-Stream FIFO and FIFO generator blocks.

Enqueue
Signal       Direction  Description
TDATA[31:0]  Output     Used to write a word to the FIFO.
TLAST        Output     Indicates the boundary of the packet. Set high when the current word for transmission is the last of the packet.
TREADY       Input      Indicates the slave FIFO generator can accept a new enqueue in the current clock cycle when asserted high.
TVALID       Output     Asserted high when the master AXI-Stream block is ready to initiate an enqueue to the FIFO block.

Dequeue
Signal       Direction  Description
TDATA[31:0]  Input      Used to receive a word from the FIFO.
TLAST        Input      Indicates the boundary of the packet. Set high by the FIFO when the last word of a packet is received.
TREADY       Output     Indicates that the master FIFO generator is ready to accept a dequeue message in the current clock cycle.
TVALID       Input      Asserted high by the FIFO generator indicating that valid dequeue data is present on the data channel.
5.4.1 Inter-Module Communication Protocol
As described in Table 5.2, data transfer occurs in parallel on the TDATA lines when both the TREADY and TVALID signals are asserted, indicating both the master and slave are ready. Typically, FIFOs act as a buffered stream of words with a pre-configured width. However, by enabling the TLAST signal, the implemented FIFOs in TAIGA are able to stream packets with a varying number of words, with the boundary of each packet indicated by the assertion of TLAST.
The dequeue method of the IOI module is interrupt-driven as shown in Figure 5.1. An
interrupt is raised once a new packet, possibly containing multiple words, is available on
the FIFO. The controllers enqueue packets to the IOI requesting the execution of various
commands. As a result, the IOI acts as a server of requests made by the controllers. The
packet syntax used for interaction between the controller and IOI is defined in Table 5.3.
Table 5.3: FIFO queue packet structure. The Transfer Bytes, Slave Select, and DATA fields are relevant only for a PLANT command.

Header word fields:
  Bits [31:24]  Command        — PLANT, SET POINT, or STATE VECTOR
  Bits [23:16]  Operation      — WRITE, READ, or STATE VECTOR
  Bits [15:8]   Transfer Bytes — 0, 1, 2, 3, or 4
  Bits [7:0]    Slave Select   — NO SLAVE, ADC, DAC, ENCODER P, or ENCODER S
Subsequent words: DATA[i] ...
The first word of each packet transmitted by a controller is considered the header and
contains specific information regarding the type of interaction required by the controller. As
identified in Table 5.3, three types of commands are accepted by the IOI:
• A PLANT command allows interaction with the pendulum process by either writing to or reading from the SPI bus. The number of bytes to transfer on the SPI bus is selected along with the slave device. The data bytes to transfer are appended to the packet header. If a READ operation is requested, a packet of the same length as the number of bytes transferred is returned.
• A SET POINT command returns a single word containing the current operational
set-point of the pendulum.
• The STATE VECTOR command returns four words containing the locally stored copy
of the state vector reflecting the current state of the pendulum.
With these three command types, the controller is able to interact with the physical pendulum process and the supervisor through the IOI. This packet structure and its parameters are made available to the production controller, backup controller, and IOI module through a globally
shared header file. The IOI module’s interrupt handler method for controller requests is
described in detail in Section 5.5.
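As an illustration, the 32-bit header word of Table 5.3 can be packed as below. The numeric encodings of the command, operation, and slave identifiers are hypothetical, since the thesis defines only their names in the shared header file:

```c
#include <stdint.h>

/* Hypothetical encodings; the real values live in the shared header file. */
enum command   { CMD_PLANT = 0x01, CMD_SET_POINT = 0x02, CMD_STATE_VECTOR = 0x03 };
enum operation { OP_WRITE = 0x01, OP_READ = 0x02 };
enum slave     { SLV_NONE = 0x00, SLV_ADC = 0x01, SLV_DAC = 0x02,
                 SLV_ENCODER_P = 0x03, SLV_ENCODER_S = 0x04 };

/* Pack the header word: command [31:24], operation [23:16],
   transfer bytes [15:8], slave select [7:0]. */
static uint32_t pack_header(uint8_t cmd, uint8_t op, uint8_t nbytes, uint8_t slv)
{
    return ((uint32_t)cmd << 24) | ((uint32_t)op << 16) |
           ((uint32_t)nbytes << 8) | (uint32_t)slv;
}
```

Under these assumed encodings, a header requesting a two-byte read of the pendulum encoder would be pack_header(CMD_PLANT, OP_READ, 2, SLV_ENCODER_P).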
5.5 I/O Intermediary
Similar to the backup controller, the IOI is a trusted entity within TAIGA and contains a
variety of methods and processes that are computationally and arithmetically intensive. As
a result, the IOI is implemented in the PL using a second MicroBlaze soft processor core
with similar configurations to the backup controller.
In contrast to the backup controller, the IOI contains more methods and processes requiring
larger local BRAM allocation: 128 KB is allocated for local instruction memory and 128
KB is allocated for local data memory. While these two blocks of memory are more than
sufficient to implement the methods described here, they provide excess storage intended for
the implementation of future detection schemes, monitoring systems, or response methods
addressing additional cyber-physical security needs.
5.5.1 Robust Process Control
The IOI serves a variety of functions. Within TAIGA, the controller must robustly and
reliably sense and actuate the physical process without noticeable degradation in process
control performance. The IOI is responsible for efficiently channeling sensor and actuator
commands between the RIP on the SPI bus, and the controller on the FIFOs. The IOI is
implemented to ensure expected RIP control even with the added complexity of TAIGA.
Similar to the backup and production controllers, the IOI interacts with the FIFOs via an
AXI FIFO stream peripheral. While the two controllers interact with the FIFO stream using
a polling method for receiving messages, the IOI’s FIFO stream module is interfaced through
an interrupt controller configured to generate an interrupt, as illustrated in Figure 5.1,
when new packets are available to dequeue. The IOI acts as a server for all controller I/O
requests. The interrupt routine handles requests that conform to the packet syntax described
in Table 5.3. The packet handler method is illustrated in Figure 5.5.
On each control cycle, the production controller requests an operational set-point with the
SET POINT command, requests sensor measurements of the servo arm and pendulum with
the PLANT command, and writes a voltage to the servo with the PLANT command. Every
time a new sensor measurement is requested, the current sensor values are updated locally
within the IOI to reflect the current state of the pendulum.
Figure 5.5: Software implementation of the FIFO interrupt handler.
The backup controller contains a process control routine similar to the production controller
but without a request of the operational set-point. Rather, the backup controller requests
the current state vector using the STATE VECTOR command when it is first enabled to
ensure a bump-less transition in the presence of an anticipated guard violation. The method
described in Figure 5.5 is implemented as a case statement within the FIFO interrupt handler.
As illustrated in Figure 5.1, a hardware AXI SPI peripheral is instantiated alongside the IOI
to communicate on the SPI bus. The block acts as a master on the SPI bus and is configured
with an 8-bit transaction width and four slaves connected to each of the ICs on SACIB. The
AXI SPI peripheral is sourced by the PL fabric clock, FCLK CLK0, which is divided by a
frequency ratio to generate the SCK as shown in Equation 5.2.
fSCK = fCLK / Fratio        (5.2)
The SPI protocol is not limited by a maximum clock speed and typically operates at very fast
data rates. In order to ensure clean and sharp clock edges, the source frequency is divided by
a frequency ratio of Fratio = 16 × 5 = 80. With fCLK = 144.4 MHz, the resulting SPI clock is configured to fSCK = 1.805 MHz. This clock rate is verified to provide reliable communication between
the ZYBO and SACIB.
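Equation 5.2 with these parameters can be checked in a line of C:

```c
/* SPI clock from Equation 5.2: f_SCK = f_CLK / F_ratio. */
static double spi_clock_hz(double f_clk_hz, double f_ratio)
{
    return f_clk_hz / f_ratio;
}
```

spi_clock_hz(144.4e6, 16.0 * 5.0) evaluates to 1,805,000 Hz, the 1.805 MHz SCK quoted above.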
SPI Filter
A SPI filter, as identified in Figure 5.5, is implemented to validate commands that attempt
to interact with the physical pendulum process. Controllers are only able to read encoders
or write voltages to the servo through the SPI bus. Upon boot-up, the IOI module executes
a pendulum initialization sequence that resets and configures the encoder counter ICs. The
SPI filter prevents latent software from re-initializing the system during operation. Re-initializing the encoder counters during operation would result in offset readings, which would jeopardize not only production controller operation but also the integrity of the backup controller. The SPI filter also prevents the controller from transmitting other configuration commands on the SPI bus that can disrupt the operation of ICs in SACIB.
By restricting the scope of interaction with the physical process to known and expected com-
mands, the SPI filter addresses bias-injection attacks executed by the production controller.
It also ensures actuator commands are maintained within the operational limits of the servo.
Any voltage write command to the servo is decoded within the IOI’s SPI filter and saturated
to +3.3 volts, which is the unscaled actuator limit of the DAC IC responsible for driving the
servo.
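The saturation step can be sketched as below. The +3.3 V upper clamp is from the text; the 0 V lower clamp and the helper name are assumptions for a single-supply DAC:

```c
/* Sketch of the SPI filter's saturation of decoded servo voltage writes.
   The +3.3 V ceiling is the unscaled DAC limit from the text; the 0 V
   floor is an assumption for a single-supply DAC. */
#define DAC_V_MAX 3.3f
#define DAC_V_MIN 0.0f

static float saturate_servo_voltage(float v)
{
    if (v > DAC_V_MAX) return DAC_V_MAX;
    if (v < DAC_V_MIN) return DAC_V_MIN;
    return v;
}
```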
5.5.2 Supervisory Control and Process Monitor
Cyber-physical control leaf nodes require bidirectional communication with hierarchical entities. In the TAIGA implementation, this communication channel is referred to as the
supervisory I/O, identified in Figure 5.1. The supervisory layer within a CPS contains
multiple nodes of HMIs that allow operators to monitor the current process behavior.
A USB-to-UART peripheral, similar to the one used by the reconfiguration network, is
connected to a Pmod port of the ZYBO board as the supervisory I/O peripheral. The
Pmod is routed to the PL using EMIO pins and connected to an AXI UART peripheral. The
AXI UART peripheral is configured with 8 data bits and a baud rate of 921,600. The fastest supported baud rate is used to provide the transmission speed necessary for sending the entire
state vector at each control cycle. The methods for interfacing bidirectional communication
with the supervisory network and other tasks specific to TAIGA are implemented in an
infinite idle loop within the IOI framework. This loop is depicted in Figure 5.6 and executes as
a background process that is occasionally interrupted by the FIFO packet interrupt handler.
Figure 5.6: Idle loop and WDT interrupt service routine of the IOI.
At each iteration of the idle loop, the supervisory UART port is polled for new supervisory inputs. The IOI accepts a new operational set-point command specifying the position to which the system is actuated. An operational set-point command is a three-byte message on the UART port with
the syntax described in Table 5.4. The set-point command is specified with two header bytes
containing the characters “SP” to identify a set-point command. The last byte contains the
magnitude of the set-point in the least significant 7 bits, and the most significant bit specifies
the sign of the operational set-point.
Table 5.4: Syntax of an input operational set-point command on the supervisory UART bus.

Byte 1 [7:0]: 'S'
Byte 2 [7:0]: 'P'
Byte 3, bit 7: sign of the set-point (+ → 0, − → 1)
Byte 3, bits [6:0]: magnitude of the set-point in degrees (0–127)
Since only 7 bits are allocated to represent the magnitude, the set-point is restricted to a
range within ±127◦. While the servo arm can be actuated within a range of ±180◦, the
guard specifications limit the operational mobility within ±35◦ as defined by the operational
guard in Table 4.3.
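A sketch of the decode in C follows. The helper name is illustrative, and enforcing the ±35° operational guard at decode time is an assumption about where the bound is checked:

```c
#include <stdint.h>

/* Decode the three-byte set-point command of Table 5.4. Returns 1 and
   writes the signed set-point in degrees on success; returns 0 on a bad
   header or when the magnitude exceeds the assumed ±35 degree operational
   guard of Table 4.3. */
static int decode_set_point(const uint8_t msg[3], int *degrees)
{
    if (msg[0] != 'S' || msg[1] != 'P') return 0;
    int magnitude = msg[2] & 0x7F;            /* bits 6:0            */
    int sign = (msg[2] & 0x80) ? -1 : 1;      /* bit 7: 1 = negative */
    if (magnitude > 35) return 0;             /* operational guard   */
    *degrees = sign * magnitude;
    return 1;
}
```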
The IOI contains a control cycle flag that determines when a control sequence is completed by
the production controller. The flag is asserted when a pendulum encoder read, servo encoder
read, and a control voltage write to the actuator are recorded. As portrayed in Figure 5.6,
the state vector is written on the supervisory UART once a control cycle is completed.
The current process state is maintained by probing the encoder reads in the FIFO inter-
rupt handler. Upon completion of the control cycle, the Kalman filter control algorithm is
executed to estimate the entire state vector prior to transmission on the supervisory I/O
bus. The state vector is stored as a four-dimensional floating point array in which each state
variable is 32 bits. Each byte of the state vector along with the output control voltage, also
stored as a floating point variable, is transmitted on the UART bus and reassembled by the
external recipient.
Table 5.5: Packet composition of process data transmission on UART from IOI.

Data    Bytes   Byte order
θ       0–3     Upper, Upper-Middle, Lower-Middle, Lower
α       4–7     Upper, Upper-Middle, Lower-Middle, Lower
θ̇       8–11    Upper, Upper-Middle, Lower-Middle, Lower
α̇       12–15   Upper, Upper-Middle, Lower-Middle, Lower
u       16–19   Upper, Upper-Middle, Lower-Middle, Lower
Footer  20–23   Assertion State, '–', '–', '\n'
The syntax of the transmission packet is represented in Table 5.5. Each of the 32-bit floating point variables is divided into four bytes. A total of 24 bytes are transmitted on the supervisory UART at each control cycle by the IOI. The last three bytes are a constant delimiter
representing the end of each packet. This delimiter is used for establishing boundaries be-
tween multiple packets received on the UART. The UART transmission is handled by the
AXI UART peripheral in hardware and is non-blocking. Once space is available, bytes are
loaded to the buffer and transmitted by the UART controller without hanging the IOI idle
loop as represented in Figure 5.6. As a result, the transmission of the 24 bytes happens in
parallel to the execution of other idle task methods. The time required to transmit a packet
is calculated using Equation 5.3.
Tpacket = bytes × (data bits + overhead bits) / baud rate        (5.3)
There are a total of 24 bytes in each packet as shown in Table 5.5. The AXI UART peripheral
is configured for 8 data bits with no parity. However, the UART protocol specifies a start and
stop bit during the transmission of each byte. As a result, there are two additional overhead bits transmitted with each byte. With a baud rate of 921,600, the transmission of a packet on the supervisory I/O bus takes Tpacket = 0.2604 milliseconds. This is only a fraction of the one millisecond control cycle time, ensuring no delay or lag in the transmission of real-time process data on the supervisory I/O bus.
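Equation 5.3 with these parameters can be checked directly:

```c
/* Packet transmission time from Equation 5.3, in milliseconds. */
static double packet_time_ms(int bytes, int data_bits, int overhead_bits,
                             double baud_rate)
{
    return bytes * (data_bits + overhead_bits) / baud_rate * 1000.0;
}
```

packet_time_ms(24, 8, 2, 921600.0) evaluates to approximately 0.2604 ms, the figure quoted above.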
As depicted in Figure 5.6, a trigger is asserted by two mechanisms: the trigger mechanism
or watch dog timer (WDT). These mechanisms are the primary safeguards of TAIGA and
are enabled with a physical button press on the ZYBO once a proper boot sequence is
initiated and the pendulum is balanced upright under stable production control operation.
A single byte trigger assertion state, specified in Table 5.5, is sent to the supervisor for
reporting whether or not the trigger has been asserted. This flag reports TAIGA-specific
states represented by single characters as described in Table 5.6.
Table 5.6: Flag transmitted on the UART bus to report TAIGA's trigger state.

Flag  Description
'P'   Production control of the RIP without IOI trigger mechanisms enabled.
'S'   Production control of the RIP with IOI trigger mechanisms enabled.
'G'   The trigger is asserted by the trigger mechanism, which preemptively detected a guard violation.
'W'   Trigger assertion due to expiration of the WDT counter.
'T'   Trigger assertion by both the WDT and trigger mechanism.
5.5.3 Trigger Mechanism
TAIGA’s IOI is responsible for responding to malicious plant behavior by maintaining process
states, anticipating malicious operation, and initiating switch-over to the backup controller.
The IOI hosts a trigger mechanism for forecasting the trajectory of the pendulum based
on the current process states. As illustrated in Figure 5.6, a trigger mechanism method —
trivial, prediction, or classification — is executed upon completion of the state estimation in
order to use the real-time process states for detecting malicious behavior. If a guard violation
is anticipated by the trigger mechanism, the trigger is asserted which initiates a switch-over to
the backup controller via the controller multiplexer. The three trigger mechanisms developed
for the inverted pendulum application are implemented as C functions:
• The trivial trigger mechanism accepts the first two states, α and θ, as inputs and
checks if they are confined within the operational and safety-critical guard boundaries
defined in Table 4.3.
• The online prediction mechanism accepts the current process state vector and the number of iterations as input arguments. A local prediction state vector is maintained within the method and periodically synchronized with the current process state. The LQG control algorithm and plant model are applied to the prediction state vector, and the trivial trigger mechanism is evaluated for each future iteration.
• The classification method is an arithmetic function that accepts the current process
state vector and returns a boolean value indicating whether or not the pendulum
process is within a region of safety.
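The trivial mechanism can be sketched as below. The guard bounds are assumed from the values quoted elsewhere in the text (|α| < 15° controllable region, |θ| ≤ 35° operational guard); the actual limits are those of Table 4.3:

```c
#include <math.h>

/* Sketch of the trivial trigger mechanism. Guard bounds are assumed:
   |alpha| < 15 deg (controllable region) and |theta| <= 35 deg
   (operational guard). Returns 1 when a violation is detected. */
static int trivial_trigger(float alpha_deg, float theta_deg)
{
    return (fabsf(alpha_deg) >= 15.0f) || (fabsf(theta_deg) > 35.0f);
}
```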
5.5.4 Watch Dog Timer
The process states in the IOI are only updated and maintained upon the completion of a
control cycle. The IOI does not independently initiate sensor reads or voltage writes on
the process control bus, but rather probes the sensory measurements from process control
requests initiated by the controller. Under a covert DoS attack, a compromised process
control sequence in the production controller may disable any communication with the IOI.
In order to detect the absence of control, a WDT is implemented to ensure strict conformity
to the one millisecond control cycle period.
As illustrated in Figure 5.1, an AXI WDT peripheral is implemented in the PL with an
interrupt controller for detecting the timer expiration. The WDT acts as a hardware counter
that generates an interrupt when it overflows. The WDT counter width is configured in
hardware with the period calculated using Equation 5.4.
TWDT = 2^(WDT width) / fCLK        (5.4)
The WDT expires when a one millisecond control cycle is exceeded by the production con-
troller. As a result, TWDT ≥ 1 ms to allow some threshold for imprecision within the control
cycle. The width of the WDT is configured to 18 bits with the WDT clocked by the 144.4
MHz PL fabric clock (FCLK CLK0) which results in a WDT period of TWDT = 1.8148 ms.
As illustrated in Figure 5.6, the WDT counter is reset every time a control cycle is completed
by the controller. When the period of the production controller's control cycle exceeds 1.8148 milliseconds, a trigger is asserted, initiating switch-over to backup control.
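Equation 5.4 can be checked the same way; the exact fabric clock frequency is assumed here to be 144,444,444 Hz (the nominal 144.4 MHz):

```c
#include <stdint.h>

/* WDT period from Equation 5.4: T_WDT = 2^width / f_CLK, in milliseconds. */
static double wdt_period_ms(unsigned width_bits, double f_clk_hz)
{
    return (double)((uint64_t)1 << width_bits) / f_clk_hz * 1000.0;
}
```

wdt_period_ms(18, 144444444.0) evaluates to approximately 1.8148 ms, the period quoted above.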
The idle loop shown in Figure 5.6 is a background process that can be preempted only by an interrupt. The WDT addresses DoS attacks which are not detected by the trigger mechanism
since process states within the IOI are not updated. Another covert attack leverages the
idle loop interruption by flooding FIFO requests on the queue such that the idle loop is
continuously interrupted by the FIFO interrupt handler. Since the WDT is also interrupt
driven and external to the control loop, such an attack is detected by the WDT’s interrupt
service once the WDT’s counter is not routinely reset as expected.
The idle task and WDT represented in Figure 5.6 contain mechanisms that assert the trigger
to switch to backup control when malicious or unexpected control and process behavior are
detected. No automated process or mechanism exists within the IOI or TAIGA system that
clears the trigger to resume production control. In order to satisfy TR5, physical and manual
operator intervention is required to reset the TAIGA system and regain production control.
5.6 Controller Multiplexer
The controller queue multiplexer is implemented within the PL, as illustrated in Figure 5.1,
and acts as a bidirectional multiplexer for all signals in the backup and production controller
FIFOs. An input controller switch select signal is used to select between the production
or backup controller queues. When cleared to logic low, the enqueue and dequeue signals of
the IOI are connected to the enqueue and dequeue FIFO generator blocks of the production
controller. When set to logic high, the enqueue and dequeue signals of the IOI are connected
to the enqueue and dequeue FIFO generator blocks of the backup controller.
The controller queue multiplexer block is created using HLS. The Xilinx HLS design flow
allows a high-level implementation, in C/C++, of a combinational or sequential circuit
with PL resources. The C/C++ code is translated to HDL which is then synthesized and
implemented using the Vivado tool. Furthermore, Xilinx interface protocols such as AXI are
included in the HLS design flow as libraries and pragmas which allow for simpler interfacing
with external functional blocks by abstracting away the bit-level and timing details of the
interface.
Table 5.7: Signals of the controller queue multiplexer allocated to their respective masters.

         Production          Backup              IOI
Input    rx data a[31:0]     rx data b[31:0]     tx data[31:0]
         rx valid a          rx valid b          tx valid
         rx tlast a          rx tlast b          tx tlast
         tx ready a          tx ready b          rx ready
                                                 switch select
Output   tx data a[31:0]     tx data b[31:0]     rx data[31:0]
         tx valid a          tx valid b          rx valid
         tx tlast a          tx tlast b          rx tlast
         rx ready a          rx ready b          tx ready
The inputs and outputs of the controller queue multiplexer module are represented in Ta-
ble 5.7. These signals, described in Table 5.2, are defined as ports to the HLS block with
the following directive:
#pragma HLS INTERFACE ap_none port=signal

The ap_none parameter of the HLS directive specifies that the specific port does not contain an I/O protocol such as AXI or FIFO. The input signals of each controller are connected to
the output signals of the IOI and vice versa based on the switch select signal. This function
is implemented as an if-then statement with the switch select signal as the condition. Once
translated, the controller multiplexer block is synthesized and implemented using 105 LUTs.
The port interfaces and functional implementation do not require any logic storage elements
or sequential logic design. The functional block is thus implemented as a combinational
circuit without any clock inputs.
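The if-then routing can be sketched in C close to the HLS source. Port names follow Table 5.7 with underscores restored; the interface pragmas are omitted, and driving idle values to the deselected controller's ports is an assumption:

```c
#include <stdint.h>

/* Combinational controller queue multiplexer, sketched as HLS-style C.
   switch_select = 0 routes the IOI queues to the production controller (a),
   switch_select = 1 to the backup controller (b). */
void controller_mux(
    /* inputs from controllers */
    uint32_t rx_data_a, int rx_valid_a, int rx_tlast_a, int tx_ready_a,
    uint32_t rx_data_b, int rx_valid_b, int rx_tlast_b, int tx_ready_b,
    /* inputs from IOI */
    uint32_t tx_data, int tx_valid, int tx_tlast, int rx_ready,
    int switch_select,
    /* outputs to controllers */
    uint32_t *tx_data_a, int *tx_valid_a, int *tx_tlast_a, int *rx_ready_a,
    uint32_t *tx_data_b, int *tx_valid_b, int *tx_tlast_b, int *rx_ready_b,
    /* outputs to IOI */
    uint32_t *rx_data, int *rx_valid, int *rx_tlast, int *tx_ready)
{
    if (switch_select == 0) {       /* production controller selected */
        *tx_data_a = tx_data; *tx_valid_a = tx_valid;
        *tx_tlast_a = tx_tlast; *rx_ready_a = rx_ready;
        *rx_data = rx_data_a; *rx_valid = rx_valid_a;
        *rx_tlast = rx_tlast_a; *tx_ready = tx_ready_a;
        /* deselected backup controller sees idle inputs (assumption) */
        *tx_data_b = 0; *tx_valid_b = 0; *tx_tlast_b = 0; *rx_ready_b = 0;
    } else {                        /* backup controller selected */
        *tx_data_b = tx_data; *tx_valid_b = tx_valid;
        *tx_tlast_b = tx_tlast; *rx_ready_b = rx_ready;
        *rx_data = rx_data_b; *rx_valid = rx_valid_b;
        *rx_tlast = rx_tlast_b; *tx_ready = tx_ready_b;
        /* deselected production controller sees idle inputs (assumption) */
        *tx_data_a = 0; *tx_valid_a = 0; *tx_tlast_a = 0; *rx_ready_a = 0;
    }
}
```

Because the body is a pure function of its inputs, HLS maps it to combinational logic with no clock, matching the implementation described above.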
Chapter 6

Integration of TAIGA in Cyber-Physical Control
TAIGA is applied to CPS leaf nodes and acts as a last line of defence in protecting process integrity. Typically, CPS leaf nodes are interfaced through other regulatory controllers for interaction with the IT infrastructure which monitors and operates the plant. Conforming to conventional cyber-physical control, the TAIGA implementation for the RIP application is integrated into a CPS topology. Figure 6.1 illustrates the implementation of the regulatory controllers and the integration of the ZYBO platform with the cyber network as initially sketched in Figure 2.1.
Figure 6.1: Integration of the TAIGA leaf node into a cyber-physical control topology.
The ZYBO board resembles a PLC leaf node with supervisory interaction interfaced through
an RTU. As presented in Figure 6.1, an RTU is introduced to facilitate interaction be-
tween the ZYBO and IT infrastructure. This chapter elaborates on the implementation
of the RTU and various software services that enable remote monitoring and actuation of
the system. The source code of both client and server side services for cyber-physical con-
trol is maintained in the following GitHub repository: https://github.com/tejachil/
RPi-Remote-Terminal-Unit.git.
6.1 Remote Terminal Unit
In DCS or SCADA systems, the PLCs are the lowest-level embedded platforms that do not
directly interact with the IT supervisory network. An RTU is responsible for interacting
with the PLC on a simpler telemetry channel, aggregating multiple PLC nodes, and relaying
messages on the IT network. A Raspberry Pi running an optimized flavor of Linux, Raspbian,
is used in the implementation of the RTU as illustrated in Figure 6.1. Containing a high-
performance processor capable of interaction with IT networking services as well as lower-
level telemetry channels, the Raspberry Pi is a suitable embedded platform.
As identified in the implementation of TAIGA, Figure 5.1, the reconfiguration network and
supervisory I/O network are both implemented on dedicated UARTs using USB-to-UART
converters. These UARTs are connected to the Raspberry Pi’s USB ports and are supported
as serial devices in Linux as shown in Figure 6.1. The UART is isolated from the shared
UART/JTAG port to prevent FPGA reconfiguration via the Raspberry Pi. The JTAG USB
port is not networked to supervisory controllers and is used to securely update the PL of
TAIGA with physical access to the controller.
6.1.1 Remote Monitoring and Control Server
The IOI routinely transmits process states through the supervisory I/O network at each
control cycle based on the packet specifications in Table 5.5. Intended for supervisory mon-
itoring of the pendulum, the RTU relays this information through a local area network.
The Raspberry Pi hosts a web server that transmits the process data to any remote clients
requesting it.
In the open systems interconnection (OSI) model, the Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) are the most prevalent transport layer protocols for information exchange on the Internet. The one millisecond control cycle time is aggressive for relaying the 24 byte data packet specified in Table 5.5. For efficient real-time process monitoring, a robust communication protocol with high bandwidth is necessary. Data transfer with TCP follows a request and response scheme: new information is requested by the client and the server responds. The bidirectional communication ensures reliable packet exchange but is relatively slow.
UDP is a minimal network transmission model that transmits data unidirectionally to a destination IP and port. The lack of handshaking and bidirectional acknowledgements means packet delivery is not guaranteed, making UDP relatively unreliable in comparison to TCP. In the cyber-physical control domain, obtaining real-time process data is more crucial for process monitoring than the reliability of transmission. UDP is much faster and capable of streaming the RIP process states unidirectionally, minimizing unnecessary overheads and ensuring timely delivery of data that represents the real-time process state.
A UDP web server is implemented on the Raspberry Pi using Node.js, a JavaScript run-time
environment with an event-driven programming model. The environment is well supported
and allows easy implementation of lightweight, fast, and scalable network applications which
suit the Raspberry Pi platform. The web server receives packets on the supervisory I/O serial
port, assembles the process states, and transmits them to clients requesting the information.
Initially, a client transmits a message to the web server which then adds the client’s IP and
port to an internal list. At each control cycle, the state vector is streamed to all clients on
the web server’s list.
Supervisory control commands to update the operational set-point, as specified in Table 5.4,
are accepted via the UDP port and transmitted on the supervisory UART. The web server
also keeps track of the IOI’s state by monitoring the state flag described in Table 5.6.
Anytime the state of the flag is changed, a message is logged to the serial terminal window
signifying the start of trigger mechanisms or a trigger assertion. Providing bidirectional
communication on the supervisory UART, the UDP web server allows real-time remote
process monitoring and operation of the RIP.
The Raspberry Pi is a private node connected to a router with a static IP address,
128.173.52.36. The UDP web server is opened on port 32392. The service transmits
process data to clients and receives data requests and control commands on this port. Public
access to the Raspberry Pi is granted by forwarding this port via the router to the private
Raspberry Pi’s IP making the web server accessible to any clients on the Internet.
6.2 Human-machine interface GUI
The implementation of the web server within the RTU is standalone and isolated from the
requirements and applications of the client. Client software may be customized to the
monitoring and control needs of the supervisor, and because implementations can vary, the
overall system remains portable. For the RIP application, a graphical user interface (GUI) is designed to
visually represent the pendulum state and intuitively control the RIP’s operating conditions.
Figure 6.2: GUI for remote monitoring and control of the RIP.
The GUI, illustrated in Figure 6.2, is implemented in the Processing development environ-
ment. Processing is a community-driven, open-source programming language built on Java
with support for OpenGL and other interactive graphics libraries. The Processing environ-
ment suits and simplifies the pendulum’s visual representation.
6.2.1 Monitoring
By pressing ‘a’ on the GUI window, the client transmits a message to the UDP server running
on the Raspberry Pi requesting a process data stream. The client’s IP address and port are
added to the server’s client list and data is transmitted to the client at each consecutive
control cycle. The state vector and control voltage values received from the server are
quantitatively displayed on the GUI. The current state of the IOI module, as described in
Table 5.6, is also displayed to indicate whether or not a trigger has been asserted.
The pendulum is projected into 2-D to visually represent the RIP operation as illustrated
in Figure 6.2. The black line radiating from the center of the white semi-circle represents
the servo arm and rotates along the center axis based on the θ position. The thick blue line
represents the pendulum and protrudes from the end of the servo arm vertically. Similar to
the servo arm, the pendulum also oscillates from the vertical based on the current α position.
The trigger mechanisms within the IOI idle loop are started by a physical button press once
the pendulum is in stable operation. In order to quantitatively determine stable operation,
the classification algorithm implemented as a trigger mechanism in the IOI is ported to a
Java function and implemented in the Processing GUI. The algorithm uses the process states
derived over the supervisory I/O to determine if the process is in safe and stable operation
before starting the IOI trigger mechanism. Unsafe classification is signalled to the operator
on the GUI. This offline classification prevents unnecessary guard triggers during setup.
6.2.2 Control
The GUI provides an intuitive visual input for selecting the operational set-point of the RIP.
The green line, as shown in Figure 6.2, is draggable radially along the θ semi-circle and
used to input new operational set-points. Once the mouse is released on a desired position,
the new set-point is displayed on the GUI and a new set-point command is formulated with
the syntax specified in Table 5.4 and transmitted to the UDP server. The last byte of
the set-point command is encoded using UTF-8 to allow usage of all 8 bits as opposed to
ASCII encoding.
User control capabilities are limited based on the current state of the IOI module. If the
trigger mechanisms are not started and the process is governed by the production controller,
actuation limits for the draggable set-point selection line are set to ±90◦. Once the trigger
mechanisms are started within the IOI, the acceptable region for set-point selection is limited
to the operational guards of ±35◦ and represented by the red lines. This client side limiter
on acceptable set-points prevents transmission of malicious set-points that will be ignored
anyway by the IOI.
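The mode-dependent limiter can be sketched with the ±90° and ±35° limits from the text; this is an illustrative Python version, while the actual GUI logic is written in Processing/Java.

```python
def clamp_setpoint(theta_desired: float, triggers_started: bool) -> float:
    """Clamp a requested set-point to the acceptable region (sketch).

    Before the trigger mechanisms start, the full actuation range of
    +/-90 degrees is allowed; afterwards the operational guard of
    +/-35 degrees applies, mirroring the IOI's own check."""
    limit = 35.0 if triggers_started else 90.0
    return max(-limit, min(limit, theta_desired))
```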
6.3 Remote Surveillance
More often than not, safety-critical processes are under constant remote surveillance using
networked cameras for supervisory operator feedback and enforcement of perimeter security.
A Logitech C52 USB webcam with 720p HD resolution and auto-focus is used to remotely
surveil the RIP setup during operation. The camera is connected to a USB port of the
Raspberry Pi which broadcasts a live camera feed to the Internet.
Motion is a Linux service primarily intended for motion detection and video processing of
camera feeds. The service also contains a lightweight web server that broadcasts the live
camera feed to a specified network port. The motion service is started on Raspberry Pi’s
Linux and configured to broadcast on port 8081. Similar to the UDP server, the webcam
server’s port is forwarded to the public IP via the network router.
The live feed is accessed remotely via the Internet by directing a browser to the webcam
server’s IP and port: 128.173.52.36:8081. An example of the live camera stream from
the client side is shown in Figure 6.3. Surveillance of the webcam feed provides a secondary
external source for monitoring the RIP behavior that does not rely on the process state data
transmitted by the supervisory UART and relayed by the UDP server.
Adversarial access into the RTU compromises the process data transmission on the UDP
server, inherently shutting off communication between the leaf node and supervisor. Replay
attacks at this level transmit old process data to the supervisor while mounting an attack
Figure 6.3: Live camera feed of RIP for remote surveillance.
on the PLC. The TAIGA framework protects process safety and stability but the supervisor
is unaware of the attempted attack or backup operation of the PLC. The webcam acts as
an external visual sensor for assessing the state of the RIP. Replay attacks are much harder
with remote surveillance since they require replaying and synchronizing not only process state
data but also the camera stream, which is difficult and impractical even with the availability
of network resources.
Chapter 7
Results
TAIGA’s effectiveness is evaluated in the cyber-physical control domain on the RIP ex-
periment. The primary purpose of TAIGA is to strengthen security at the leaf nodes of
cyber-physical control. The effectiveness of the trigger mechanism in protecting the RIP
safety and stability is validated under simulated network integrity and reconfiguration at-
tack scenarios. For network integrity attacks, the supervisory control GUI and RTU are
used to launch malicious operating conditions to the leaf node. Reconfiguration attacks are
simulated by loading compromised production control code containing latent methods that
threaten process safety and stability.
Since TAIGA is applied to embedded systems, the added cost in terms of resources, com-
putation, and power are important considerations. The addition of TAIGA is compared to
a typical leaf node implementation. Resource usage, latency, and execution time of various
methods are relevant performance metrics.
Process data for attack responses are collected in real-time using the supervisory network.
Several I/O pins are routed to a Pmod and used primarily for testing and debugging. These
pins are toggled and analyzed on a digital oscilloscope to obtain execution and timing results.
7.1 Resilience to Simulated Attack Scenarios
The RIP implementation of TAIGA is rigorously tested under network integrity and recon-
figuration attacks. The reconfiguration attack space subsumes the attacks possible through
the network channels and is thus much more dangerous. By strengthening resilience to re-
configuration attacks on the system, network integrity attacks are also addressed. Network
protection, although complementary to TAIGA, is not the primary focus of this study. The
attack scenarios investigated attempt to disrupt process behavior and do not attempt to
violate network integrity since TAIGA operates under the assumption that the network and
untrusted modules are already compromised.
7.1.1 Denial-of-Service Attack
In this implementation, the RIP is governed independently by the controllers within TAIGA
and does not require constant supervisory input. Supervisory commands only update the
operating condition of the pendulum and are not integrated directly into the process control
loop. As a result, DoS from the supervisory network does not jeopardize pendulum safety
or stability.
Rather, a DoS attack originates from the production controller and thus requires intrusion
through the reconfiguration network. In the simulated attack, the production controller is
infected with latent malware that denies execution of the process control sequence and shuts
off all communication with the FIFOs. Since new sensor measurements are not updated in
the IOI, the trigger mechanism is no longer aware of the current process state and cannot an-
ticipate guard violations. Rather, the IOI’s WDT expires once the control cycle time period
exceeds 1.8148 ms and asserts the trigger, initiating switch-over to the backup controller.
Figure 7.1 shows the response of the system under the simulated DoS attack. Initially, the
production controller is governing the process with a strict millisecond control cycle time
interval. At TDoS = 30 ms, the production controller stops executing the control sequence
and shuts off all communication with the IOI. Approximately 0.8148 ms later, the WDT
counter expires, generating an interrupt in the IOI that initiates switch-over to backup
control, which operates the RIP without loss of safety or stability.
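The watchdog logic can be sketched as follows; the 1.8148 ms expiry value comes from the text, while the class structure is purely illustrative of the IOI's behavior.

```python
class WatchdogTrigger:
    """Sketch of the IOI's WDT-based DoS detection."""

    EXPIRY_MS = 1.8148  # assert the trigger if a control cycle stalls past this

    def __init__(self):
        self.elapsed_ms = 0.0
        self.trigger_asserted = False

    def kick(self):
        # Reset when the production controller completes a control cycle.
        self.elapsed_ms = 0.0

    def tick(self, dt_ms: float):
        # Advance the watchdog; expiry asserts the trigger, which
        # initiates switch-over to the backup controller.
        self.elapsed_ms += dt_ms
        if self.elapsed_ms > self.EXPIRY_MS:
            self.trigger_asserted = True
```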
The backup and production controller operate asynchronously on separate clocks. However,
the timer routine for the backup does not initiate until it is invoked by a trigger assertion. In
the implementation of the backup controller, an iteration of the pendulum control sequence
is executed after process states are retrieved from the IOI and before starting the timer. This
ensures backup control starts at the instant of trigger assertion as observed in Figure 7.1.
Figure 7.1: Digital oscilloscope capture of plant response to a simulated DoS attack at time TDoS = 30 seconds.
Under a DoS attack, a 0.8148 ms delay is caused in the control cycle during the switch-over
process since the expiration period of the WDT is greater than the control cycle time. In the
results presented in Figure 7.1, a 1.884 ms time interval is measured between the time of DoS
and the start of backup control. This delay is negligible relative to the pendulum response
time of 1.2 seconds, and ensures process stability during the switch-over.
Livelock
Another stealthy DoS attack also originating from malicious reconfiguration requires system
knowledge of TAIGA and exploits the interrupt-driven FIFO handler within the IOI. The
trigger mechanism exists in the IOI’s idle loop and is routinely interrupted by new packets
on the FIFO. An attacker can take advantage of this interruption by continuously flooding
the FIFO with new packet requests. The execution of the idle task containing the trigger
mechanism is constantly interrupted, causing livelock in which process control is maintained
but execution of other IOI methods is prevented. Such an attack is also addressed in a
manner similar to a DoS attack since the WDT expiration is interrupt-driven and able to
assert the trigger. The WDT is only reset once the trigger mechanism is executed, enforcing
strict trigger mechanism validation at each millisecond control cycle time.
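The ordering that defeats the livelock attack can be sketched as follows; the callables are placeholders, not the actual IOI routines.

```python
def idle_loop_step(service_fifo, run_trigger_mechanism, kick_wdt):
    """Sketch of the livelock defence: servicing FIFO traffic never
    resets the watchdog; only completing the trigger mechanism does,
    so an attacker flooding the FIFO cannot starve the trigger check
    without eventually causing WDT expiry."""
    service_fifo()            # may be flooded/preempted by an attacker
    run_trigger_mechanism()   # must execute once per control cycle
    kick_wdt()                # WDT reset happens only after the trigger check
```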
7.1.2 Set-Point Attack
The operational set-point of the production controller originates from supervisory control.
The production controller requests an updated set-point from the IOI at each control cy-
cle. As a result, a set-point attack can originate from either the supervisory network or a
maliciously reconfigured process control routine in the production controller.
Figure 7.2: Plant response to a simulated supervisory attack at time Tattack = 60 seconds.
A set-point attack originating from the supervisory network is checked by the IOI module
to be within the operational guard, −35◦ < θdesired < 35◦. A stealthy supervisory set-
point attack is executed in which the supervisor changes the operational set-point to slightly
less than the operational guard boundary, θdesired = 34◦, at time Tattack = 60 seconds to evade
detection of the set-point verification method within the IOI.
The trigger mechanisms are the primary safeguard for recognizing this attack and are evalu-
ated in the results presented by Figure 7.2. For the trivial, prediction, and classifier trigger
mechanisms, the trigger is asserted at Tt = 60.344, Tp = 60.032, and Tc = 60.1138 seconds
respectively. The production controller operates safely at a set-point of 10◦ prior to the
attack. The backup controller is pre-configured to operate at 0◦.
Trivial Trigger Mechanism
The trigger is asserted by the trivial trigger mechanism at the moment the operational guard
is violated. The RIP undergoes a lightly damped response when repositioning the servo arm
to the backup set-point of θbackup = 0◦ causing an overshoot that violates both operational
and safety critical guards as shown in Figure 7.2. Since the trigger mechanism does not
preemptively anticipate the guard violation and take corrective action ahead of time, both
process safety and stability are violated.
Online Prediction Trigger Mechanism
The prediction trigger mechanism forecasts the RIP trajectory and preemptively detects a
guard violation. For a short instant following the attack, the prediction trigger mechanism
synchronizes its states with the real-time process states. Accelerated execution of the plant
model and backup control algorithm foresees a guard violation and the trigger is asserted.
At each control cycle, 50 iterations of the prediction trigger mechanism are executed to
maintain the execution time of the prediction unit within the one millisecond control cycle
time. In actuality, the guard violation occurs approximately 344 milliseconds after the
time of attack, as demonstrated by the trajectory of the trivial trigger mechanism.
Theoretically, the prediction trigger mechanism is able to predict a guard violation within
seven control cycles after state synchronization if it follows a similar trajectory as the trivial
trigger mechanism. In the results presented in Figure 7.2, the trigger is asserted in about 32
control cycles after the time of attack which maintains the RIP well within the operational
and safety-critical bounds.
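The per-cycle prediction loop can be sketched as follows, assuming placeholder `step_model` and `violates_guard` functions in place of the actual RIP plant model under backup control and its guard checks.

```python
def prediction_trigger(state, step_model, violates_guard, iters=50):
    """Sketch of the online prediction trigger mechanism.

    From the latest synchronized state, the plant model is stepped
    `iters` times per control cycle (50 in the text, keeping execution
    within the 1 ms budget); the trigger asserts preemptively if any
    forecast state violates a guard."""
    forecast = state
    for _ in range(iters):
        forecast = step_model(forecast)
        if violates_guard(forecast):
            return True   # assert trigger before the real violation occurs
    return False
```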
Classification Trigger Mechanism
After Tattack, the pendulum arm velocity is dramatically increased to actuate the system
to the malicious set-point, as seen by the increase in slope of the classifier’s trajectory at
Tc = 60.1138 seconds. The classifier bounds all four dimensions of the state vector within
a region of safety and stability. The drastic increase in pendulum arm velocity at a θ
position that is approaching the operational guard causes the classifier to assert the trigger
by identifying the RIP operation to be outside of a safe state vector region.
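The classifier's region test can be sketched as a bound on each dimension of the state vector; the actual classifier's region and bound values are not specified here, so the axis-aligned form and numbers below are illustrative only.

```python
def classifier_trigger(state, bounds):
    """Sketch of the classification trigger mechanism: the four-dimensional
    state vector (theta, alpha, theta_dot, alpha_dot) must lie inside a
    pre-characterized region of safe, recoverable operation. Returns True
    (trigger asserted) if any dimension leaves its bound."""
    return any(not (lo <= x <= hi) for x, (lo, hi) in zip(state, bounds))
```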
The classifier asserts the trigger at the last possible point of safe recovery under backup
control. While the trigger is not asserted as preemptively as the online prediction method,
the pendulum is maintained well within the system’s operational guards. Process safety and
stability are not compromised, the primary objective of the trigger mechanism.
Reconfiguration Set-Point Attack
Under supervisory attacks, the prediction unit has knowledge of the production controller’s
desired operational set-point since it propagates through the IOI. A more covert set-point at-
tack originates from a compromised reconfiguration network in which latent malware within
the process control sequence ignores the operational conditions of the supervisor from the
IOI and attempts to operate the RIP at a malicious set-point.
In such a scenario, the prediction algorithm does not have knowledge of the desired oper-
ational set-point and predicts trajectory based on the backup controller’s set-point. The
prediction trigger mechanism is not able to assert the trigger as preemptively as the tra-
jectory shown in Figure 7.2. Rather, the prediction trigger mechanism follows an attack
response trajectory similar to the classifier’s and still maintains process safety and stability.
The classifier does not require knowledge of the desired operating conditions since it is
testing whether the backup controller can return the system to the backup set-point without
incurring a guard violation. The classifier trigger mechanism responds similarly to set-point
attacks originating from the supervisory or reconfiguration network; this response is captured
in Figure 7.2.
7.1.3 Deception Attack
Deception attacks are a class of attacks that are executed independently within process
control loops to disrupt plant behavior without the need of disclosure resources or online
network availability [24]. In the RIP framework, deception attacks propagate through a
compromised reconfiguration network and require malicious reconfiguration of the production
controller to gain sufficient disruption resources to modify the process control loop. Two
types of deception attack scenarios are investigated: a zero-dynamics attack in which the
encoder counter is reset arbitrarily under stable operation, and a bias-injection attack in
which dangerous voltages are written to the servo.
Zero-Dynamics Attack
Upon boot-up, the IOI initializes the ICs in the SACIB and the encoder counters are reset
with the servo arm stationary and pendulum pointing downwards. This position is the zero-
frame of reference for all encoder reads during process control for both the production and
backup controller. Resetting the encoder counters upon mobile operation of the RIP would
result in an offset frame of reference for sensory inputs jeopardizing not only process stability
for the production controller, but also the backup.
The IOI’s SPI filter limits the scope of interaction with the controller and plant. A SPI
command on the FIFO containing encoder counter reset data is ignored by the SPI filter and
is not transmitted on the SPI bus. Only sensory reads and servo voltage writes are relayed
to the physical process thus addressing zero-dynamic attacks on the RIP experiment.
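The SPI filter's whitelist behavior can be sketched as follows; the opcodes are invented for illustration, since the actual SPI command encoding is not given in this chapter.

```python
# Illustrative opcodes only; the real SPI command encoding is not specified here.
READ_ENCODER, WRITE_VOLTAGE, RESET_COUNTER = 0x01, 0x02, 0x03
ALLOWED = {READ_ENCODER, WRITE_VOLTAGE}

def spi_filter(command_opcode: int) -> bool:
    """Sketch of the IOI's SPI filter: only sensory reads and servo voltage
    writes are relayed to the SPI bus; anything else, such as an encoder
    counter reset, is silently dropped, defeating the zero-dynamics attack."""
    return command_opcode in ALLOWED
```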
Bias-injection Attack
Another class of deception attacks is false control data injection to the process control loop.
In the RIP experiment, malicious reconfiguration of the process control sequence allows
the adversary to write voltages that do not control the pendulum or are outside of the
servo actuator limits. Typically, lack of proper pendulum control is detected by the trigger
mechanisms and control is switched over to backup. However, momentary voltage writes
outside of the actuator limits can instantaneously damage the servo hardware before switch-
over to backup control.
Figure 7.3: Digital oscilloscope capture of servo voltage saturation at actuation limits, ±10 volts, during voltage sweep of ±15 volts.
The SPI filter addresses false voltage bias injection attacks by enforcing servo actuator
limits. The DAC ICs and operational amplifier circuit in the SACIB are sourced with a ±12
volt supply but tuned to operate at the ±10 volt range, the servo actuator limits. While the
SACIB circuitry addresses hard actuator limits, any voltage write on the SPI bus is verified
to be within ±10 volts and saturated to this range by the IOI as well. Figure 7.3 shows
the saturation of the servo enforced by both the IOI and SACIB infrastructure to protect
the servo. The trigger mechanisms are disabled in order to obtain these results in which
the production control sequence is maliciously reconfigured and attempts to sweep a servo
voltage actuation between ±15 volts.
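The IOI-side saturation can be sketched as a simple clamp to the ±10 volt actuator limits described above.

```python
def saturate_voltage(v: float, limit: float = 10.0) -> float:
    """Sketch of the IOI's servo voltage saturation: any commanded voltage
    outside the +/-10 V actuator limits is clamped before it reaches the
    SPI bus, mirroring the behavior captured in Figure 7.3."""
    return max(-limit, min(limit, v))
```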
7.2 Execution Time and Control Latency
In process control loops, minimizing latency between sensory feedback and actuator response
is necessary to maintain efficient and robust control. Embedded platforms are digital elec-
tronics that inherently have computation and processing latencies caused by the execution
of various processes. For the RIP, the execution time of TAIGA-specific methods are mea-
sured with a digital oscilloscope by probing debug I/Os toggled in software. The additional
latencies caused by TAIGA are analyzed as a key metric for ensuring robust process control.
Without TAIGA, the production controller interacts directly with the RIP on a SPI bus
for sensing and actuation. Transferring two bytes of data on the SPI bus takes 17.04 µs,
approximately 8.51 µs per byte. Based on fSCK = 1.805 MHz clock rate for the SPI bus,
the actual transmission of one byte takes 8 clock cycles which is approximately 4.4308 µs.
The remaining measured transfer time is attributed to loading the transfer register and
communication with the AXI peripheral, which is unavoidable.
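The per-byte timing decomposes as a quick check of the figures above; the variable names are illustrative.

```python
f_sck_hz = 1.805e6                    # SPI clock rate from the text
byte_shift_us = 8 * 1e6 / f_sck_hz    # 8 SCK periods to shift one byte, ~4.43 us
measured_per_byte_us = 8.51           # measured transfer time per byte
# The remainder is attributed to loading the transfer register and AXI traffic.
overhead_us = measured_per_byte_us - byte_shift_us
```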
In the TAIGA framework, the low-level SPI drivers are moved from the production controller
to the IOI, and communication is directed through FIFOs. The transmission of packets
through the FIFO for process interaction causes a propagation delay of approximately 3.7 µs.
While this added latency associated with the TAIGA implementation is large relative to the
SPI transfer of a single byte, all routines interacting with the process typically require more
than a single byte of transfer. Sensor reads require four byte transfers while voltage writes
require two. As a result, the added latency of the FIFOs and IOI does not drastically impact
the execution time of process interactions within the pendulum control sequence.
Table 7.1: Execution time of RIP control sequence.

             Get Set-Point   Read SPI Encoders        Control Algorithm   Write SPI Voltage   Total
Production   6.93 µs         2 × 42.7 µs = 85.4 µs    2.27 µs             26.8 µs             121.4 µs
Backup       0 µs            2 × 42.7 µs = 85.4 µs    13.48 µs            26.8 µs             125.68 µs
The execution times of the RIP control sequence methods are measured and presented in
Table 7.1. The set-point command does not require peripheral I/O; the execution time is
primarily attributed to the FIFO latency arising from bidirectional enqueue and dequeue
between the controller and the IOI. Interaction with the physical plant requires relatively
slow SPI transfers with respect to the FIFO latency as reflected by both the encoder read
and servo voltage write operations. The control algorithm execution deviates between the
production and backup controllers due to the difference in clock frequency and floating point
performance between the ARM and MicroBlaze processors.
The execution time of the entire process control sequence is less than 130 µs for both
controllers, consuming only about 13% of the one millisecond control cycle time. In the
TAIGA framework, the remaining control cycle time is used for updating process states,
executing the trigger mechanism, and interacting with the supervisor. These processes are
implemented in the IOI’s idle loop and execution time is measured and presented in Table 7.2.
Table 7.2: Execution time of IOI idle loop.

Method               Execution Time
UART Receive         2.32 µs
Kalman Filter        16.3 µs
Trigger Mechanism    Trivial: 4.05 µs; 50 Predictions: 828 µs; Classifier: 557 µs
UART Transmit        190 µs
The UART receive method polls the UART message buffer for new control command data
and is not a time-intensive process as depicted in Table 7.2. The execution of the Kalman
filter algorithm is necessary after the completion of each control cycle for state estimation
in the IOI. Since the computational performance of the IOI is the same as the backup
controller, the time required to calculate the Kalman filter is similar to the backup control
algorithm calculation. At each control cycle, 24 bytes of data are transmitted on the super-
visory UART bus as specified in Table 5.5. This transmission is non-blocking and executed
concurrently by the UART hardware controller but requires 190 µs to complete the UART
transmission methods. The actual transmission time is greater, as calculated by Equation 5.3.
As a result, the trigger mechanisms are run in parallel with the UART transmission.
The two preemptive trigger mechanisms, prediction and classification, are computationally
intensive as demonstrated in Table 7.2. The online prediction trigger mechanism is run for
an experimentally determined 50 iterations at each control cycle to ensure the method does
not exceed the allotted time. In contrast, the classifier takes much less time to execute.
Running the prediction algorithm for 50 iterations at each control cycle would require syn-
chronization with real-time process states every 24 control cycles to forecast trajectory for
the entire RIP response time of 1.2 seconds. Perfectly timed attacks that attempt to violate
operational guards within this 24-control-cycle synchronization window can evade detection
by the prediction method, making it non-ideal.
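The 24-cycle synchronization interval follows directly from the numbers above.

```python
control_cycle_ms = 1.0      # one millisecond control cycle
response_time_ms = 1200.0   # RIP response time, 1.2 seconds
iters_per_cycle = 50        # prediction iterations executed per control cycle

# Model steps needed to forecast over the full response time, and how often
# the prediction must resynchronize with real-time process states.
horizon_steps = response_time_ms / control_cycle_ms   # 1200 model steps
sync_interval_cycles = horizon_steps / iters_per_cycle
```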
The classifier method is a much more suitable trigger mechanism due to its smaller execution
time. A one millisecond control cycle is aggressive even in comparison to typical industrial
control applications. The ability to successfully implement TAIGA and preemptive detection
schemes without jeopardizing this control cycle time validates the effectiveness of TAIGA.
7.3 Resource Utilization
The additional resources utilized by TAIGA are an important metric for assessing cost. A
standalone production controller for the RIP experiment is implemented with just a
microcontroller containing built-in peripheral resources. In the Zynq processor, this implementa-
tion is contained within the PS. TAIGA is realized primarily in the PL with the addition
of the backup controller, IOI, FIFOs, controller multiplexer, and numerous other functional
and peripheral blocks for interfacing with internal and external components. A comparison
of resources with a standalone production controller and TAIGA is illustrated in Figure 7.4.
The standalone production controller uses no FPGA resources; it only uses I/O and a single
global buffer (BUFG) for interfacing and control of the SPI bus. As illustrated in Figure 7.4,
TAIGA uses a large amount of FPGA resources. The FIFO multiplexer designed using HLS
Figure 7.4: Post-implementation Zynq FPGA resource usage with and without TAIGA.
is synthesized to use 105 LUTs for the combinational circuit. The hardware optimizations for
the MicroBlaze processor each use five DSPs. A large amount of FFs are used during routing
for registers and buffers. Lastly, TAIGA is BRAM-intensive since BRAMs are allocated for
the instruction and local data memories of the two MicroBlazes and the four FIFOs. While
a significant portion of the FPGA resources are used, the Zynq-7010 used in the ZYBO
platform has fewer resources available than other chips in the family. For the cost and
relatively low performance of the platform, TAIGA is successfully implemented without
requiring any excess resources.
Satisfying timing constraints is crucial for accurate and reliable operation of FPGA designs.
TAIGA’s implementation contains a mixture of both combinational and sequential design.
The sequential portions of the PL are clocked by the FCLK_CLK0 fabric clock at 144.4 MHz.
For proper digital design, it is essential that the propagation delay through combinational
logic is less than the clock period of the sequential circuit elements.
The difference between the combinational propagation delay and sequential clock period is
known as slack. Negative slack is a critical timing violation since it implies combinational
propagation delay exceeds the clock period of sequential logic. The Vivado design tool es-
timates propagation through combinational circuit paths to determine slack. MicroBlaze
configurations form the critical paths in the TAIGA implementation. The fabric clock fre-
quency and MicroBlaze configuration are tuned to prevent negative slack. The estimated
slack for the standalone production and TAIGA implementation are specified in Table 7.3.
Table 7.3: Estimated slack for standalone production and TAIGA implementations on Zynq.

                    Standalone Production   TAIGA
Setup Slack         -                       0.205 ns
Hold Slack          -                       0.027 ns
Pulse Width Slack   7.845 ns                2.211 ns
The standalone production controller only contains one path in the PL between combina-
tional and sequential logic as shown in the pulse width slack. The TAIGA implementation,
on the other hand, contains 25712 setup and hold sensitive paths and 9338 pulse width sen-
sitive paths. Larger slack is favorable since it guarantees better circuit integrity. However,
in order to achieve maximum performance from the MicroBlaze processors, the fabric clock
is maximized without incurring negative slack. While the TAIGA slack is low, it is still
positive and satisfies the timing requirements for the Zynq’s PL.
Table 7.4: Estimated power consumption for standalone production and TAIGA implementations on Zynq.

                Standalone Production   TAIGA
Dynamic Power   1.56 W                  1.766 W
Static Power    0.133 W                 0.141 W
Total Power     1.693 W                 1.907 W
Power consumption is another cost metric relevant to embedded systems. The Vivado de-
sign tool contains a power estimation tool based on the number of resources used by the
FPGA design. Table 7.4 describes the power consumed by a standalone production con-
troller implementation and TAIGA. TAIGA consumes approximately 12.6% more power
than a standalone implementation of the production controller. Static power remains ap-
proximately the same on the Zynq even with the increase in resource usage. However, the
additional resources occupy a larger portion of the Zynq IC, and activity in this larger
die area increases the dynamic power. While power is pertinent to
embedded systems, it is not of critical concern in large control applications. The power con-
sumption of the embedded controllers is negligible relative to other power hungry electrical
components in most industrial applications.
Chapter 8
Conclusions
Incorporating aspects of autonomic systems and enforcing stringent trust requirements,
TAIGA is successfully implemented on an embedded configurable SoC platform to address
security at cyber-physical leaf nodes. Traditional security practices focus on protecting the
channels of intrusion at the CPS network layers. Compromise to these security precautions
and further infiltration to leaf nodes jeopardizes process safety and stability.
In TAIGA, security measures are implemented from a controls perspective to maintain pro-
cess integrity. The RIP experiment resembles industrial control applications with similar
process control concerns. Guards are defined on the pendulum system to establish bound-
aries of safe and stable operation. All production control process interaction is monitored
and arbitrated through the IOI. Trigger mechanisms are tailored for the RIP to preemp-
tively detect guard violations and switch-over to a trusted backup controller. The backup
controller, IOI, and trigger mechanism adapt to detected threats autonomically and are
protected by the TRs safeguarding them from malicious reconfiguration.
Responses to simulated attack scenarios show increased resilience to malicious reconfiguration
and network integrity attacks. DoS attacks originating from malicious reconfiguration
are successfully detected by the WDT within the IOI. Set-point attacks, classified as either
network integrity or reconfiguration attacks, are preemptively detected by the prediction
or classification trigger mechanisms, maintaining process safety and stability. The SPI filter
ignores attempted deception attacks, originating from a compromised production control
sequence, that could damage process infrastructure.
In contrast to existing cyber-physical threat response schemes, TAIGA maintains safety and
liveness when malicious behavior is detected by initiating a bump-less transition to backup
control. The preemptive trigger mechanisms satisfy the self-protecting property of autonomic
systems by proactively defending against anticipated threats. TAIGA autonomically
reconfigures control to the backup controller, which returns the system to stable operation
and makes the system self-healing.
8.1 Scope of TAIGA
More often than not, ICSes contain numerous sub-processes with many interconnected con-
trol nodes and intertwined control loops. Process control at a specific leaf node can influence
the behavior of other leaf nodes. For example, a PLC responsible for opening a valve based
on a pressure measurement in a chemical plant can cause disruptive temperature variation
at a different location. While the pressure measurement may have been contained within
defined operational guards, the valve opening can cause a guard violation at a different node.
The autonomic nature of TAIGA mandates strong awareness of the entire process, which is
not always feasible in industrial control applications. Inter-node communication between
ICS leaf nodes is usually carried over networks that are untrusted in the TAIGA framework.
As a result, implementing TAIGA in large interdependent CPSes is considerably more difficult.
TAIGA’s effectiveness relies on the integrity of the infrastructure layer and trust of all process
sensors and actuators. It is implemented in leaf nodes to enforce this process integrity by
maintaining process safety and liveness properties. However, certain safety-critical industrial
control applications do not guarantee trust below the leaf nodes. Power systems are an
example of such CPSes: they are physically distributed, which precludes efficient perimeter
security measures. TAIGA is more suitable for application domains in which the infrastructure
is protected with physical perimeter security.
The RIP implementation contains only a single process control loop. TAIGA can scale to
larger leaf nodes and additional process control loops as long as each leaf node can
accurately enforce high-level safety and stability properties, or inter-node dependencies
are contained locally to ensure trust within the telemetry channels. The proposed TAIGA
framework for cyber-physical leaf nodes is best suited for consolidated control networks
(such as aircraft, drones, and automobiles) rather than ICSes.
8.2 Future Work
TAIGA has been successfully implemented and applied to the RIP apparatus. However, to
better integrate the leaf node into a cyber-physical control topology, efficient
reconfiguration processes need to be implemented. Currently, simulated reconfiguration
attacks are loaded into the production controller as latent malware before bootup. In
deployed CPSes, controllers are remotely updated in real time while the system is active.
The controller parameter optimization and firmware update methods will be implemented
within the AMP framework to provide a remote channel for reconfiguration.
IOI methods protect the RIP system from threats targeting process safety and stability.
However, attacks that degrade process performance also cause plant disruption. The IOI is a
process-aware framework that is safeguarded by the TRs from common external threats. As
a result, additional trigger mechanisms, process monitors, and attack detection schemes
could be hosted within the IOI to address other threats or to respond more proactively to
specific network integrity or reconfiguration attacks.
TAIGA, from an architectural perspective, supports extension to encompass additional con-
trol modules. In the RIP application, the production and backup controllers are sufficient
to enforce process stability. However, more complex processes may require additional aux-
iliary, performance, or other controllers to ensure recovery from application-specific guard
violations.
TAIGA is well suited for independent or contained CPSes. Modern automobiles contain a
large number of computing and network elements for remote monitoring and control, making
them more vulnerable than ever before [11]. TAIGA could protect the individual controllers
of automotive systems by enforcing safety guards that can prevent accidents under network
attacks. Like automobiles, aircraft contain a large number of interconnected and networked
computation and control elements, making them susceptible to malicious intrusion. Aircraft
are perhaps even more safety-critical than automobiles and need protection from a wide
range of adversaries and threats.
Bibliography
[1] A. Al-Jodah, H. Zargarzadeh, and M. Abbas. Experimental verification and comparison
of different stabilizing controllers for a rotary inverted pendulum. In Control System,
Computing and Engineering (ICCSCE), 2013 IEEE International Conference on, pages
417–423, Nov 2013.
[2] R. Barry. FreeRTOS Reference Manual - API Functions and Configuration Options.
[3] E. Bernabeu, J. Thorp, and V. Centeno. Methodology for a security/dependability
adaptive protection scheme based on data mining. Power Delivery, IEEE Transactions
on, 27(1):104–111, Jan 2012.
[4] A. A. Cardenas, S. Amin, Z.-S. Lin, Y.-L. Huang, C.-Y. Huang, and S. Sastry. Attacks
against process control systems: Risk assessment, detection, and response. In Pro-
ceedings of the 6th ACM Symposium on Information, Computer and Communications
Security, ASIACCS ’11, pages 355–366, New York, NY, USA, 2011. ACM.
[5] N. T. Chiluvuri, O. A. Harshe, C. D. Patterson, and W. T. Baumann. Using heteroge-
neous computing to implement a trust isolated architecture for cyber-physical control
systems. In Proceedings of the 1st ACM Workshop on Cyber-Physical System Security,
CPSS ’15, pages 25–35, New York, NY, USA, 2015. ACM.
[6] Digilent. ZYBO Reference Manual, Feb 2014.
[7] Z. Franklin, C. Patterson, L. Lerner, and R. Prado. Isolating trust in an industrial
control system-on-chip architecture. In Resilient Control Systems (ISRCS), 2014 7th
International Symposium on, pages 1–6, Aug 2014.
[8] O. A. Harshe. Preemptive detection of cyber attacks in industrial control systems. Mas-
ter’s thesis, Virginia Tech, Bradley Department of Electrical and Computer Engineering,
Blacksburg, VA, Apr 2015.
[9] O. A. Harshe, N. Teja Chiluvuri, C. D. Patterson, and W. T. Baumann. Design and
implementation of a security framework for industrial control systems. In Industrial
Instrumentation and Control (ICIC), 2015 International Conference on, pages 127–132,
May 2015.
[10] J. Kephart and D. Chess. The vision of autonomic computing. Computer, 36(1):41–50,
Jan 2003.
[11] K. Koscher, A. Czeskis, F. Roesner, S. Patel, T. Kohno, S. Checkoway, D. McCoy,
B. Kantor, D. Anderson, H. Shacham, and S. Savage. Experimental security analysis of
a modern automobile. In Security and Privacy (SP), 2010 IEEE Symposium on, pages
447–462, May 2010.
[12] D. Kushner. The real story of Stuxnet. Spectrum, IEEE, 50(3):48–53, March 2013.
[13] E. A. Lee. Computing foundations and practice for cyber-physical systems: A prelim-
inary report. Technical Report UCB/EECS-2007-72, EECS Department, University of
California, Berkeley, May 2007.
[14] L. Lerner. Trustworthy Embedded Computing for Cyber-Physical Control. PhD thesis,
Virginia Tech, Bradley Department of Electrical and Computer Engineering, Blacks-
burg, VA, Jan 2015.
[15] L. Lerner, Z. Franklin, W. Baumann, and C. Patterson. Application-level autonomic
hardware to predict and preempt software attacks on industrial control systems. In
Dependable Systems and Networks (DSN), 2014 44th Annual IEEE/IFIP International
Conference on, pages 136–147, June 2014.
[16] L. W. Lerner, M. M. Farag, and C. D. Patterson. Run-time prediction and preemption
of configuration attacks on embedded process controllers. In Proceedings of the First
International Conference on Security of Internet of Things, SecurIT ’12, pages 135–144,
New York, NY, USA, 2012. ACM.
[17] L. W. Lerner, Z. R. Franklin, W. T. Baumann, and C. D. Patterson. Using high-
level synthesis and formal analysis to predict and preempt attacks on industrial control
systems. In Proceedings of the 2014 ACM/SIGDA International Symposium on Field-
programmable Gate Arrays, FPGA ’14, pages 209–212, New York, NY, USA, 2014.
ACM.
[18] X. Liao, J. Zhou, and X. Liu. Exploring AMBA AXI on-chip interconnection for
TSV-based 3D SoCs. In 3D Systems Integration Conference (3DIC), 2011 IEEE
International, pages 1–4, Jan 2012.
[19] Y. Mo and B. Sinopoli. Secure control against replay attacks. In Communication,
Control, and Computing, 2009. Allerton 2009. 47th Annual Allerton Conference on,
pages 911–918, Sept 2009.
[20] T. H. Morris and W. Gao. Industrial control system cyber attacks. Proceedings of
the 1st International Symposium for ICS & SCADA Cyber Security Research, page 22,
2013.
[21] B. Obama. Executive order – improving critical infrastructure cybersecurity. The White
House, 2013.
[22] M. Roman, E. Bobasu, and D. Sendrescu. Modelling of the rotary inverted pendu-
lum system. In Automation, Quality and Testing, Robotics, 2008. AQTR 2008. IEEE
International Conference on, volume 2, pages 141–146, May 2008.
[23] L. Sha. Using simplicity to control complexity. Software, IEEE, 18(4):20–28, Jul 2001.
[24] A. Teixeira, D. Perez, H. Sandberg, and K. H. Johansson. Attack models and scenarios
for networked control systems. In Proceedings of the 1st International Conference on
High Confidence Networked Systems, HiCoNS ’12, pages 55–64, New York, NY, USA,
2012. ACM.
[25] A. Teixeira, I. Shames, H. Sandberg, and K. Johansson. Revealing stealthy attacks
in control systems. In Communication, Control, and Computing (Allerton), 2012 50th
Annual Allerton Conference on, pages 1806–1813, Oct 2012.
[26] T. D. Shingare and R. T. Patil. SPI implementation on FPGA. International Journal of
Innovative Technology and Exploring Engineering (IJITEE), 2(2):7–9, Jan 2013.
[27] Trusted Computing Group, Incorporated. TPM Main Specification Level 2 Version 1.2,
Revision 116 Part 1 Design Principles, Mar 2011.
[28] Xilinx. MicroBlaze Processor Reference Guide, Apr 2014.
[29] Xilinx. Vivado Design Suite - AXI Reference, Nov 2014.
[30] Xilinx. Zynq-7000 All Programmable SoC Technical Reference Manual, Feb 2015.
[31] L. Yongfu, S. Dihua, L. Weining, and Z. Xuebo. A service-oriented architecture for
the transportation cyber-physical systems. In Control Conference (CCC), 2012 31st
Chinese, pages 7674–7678, July 2012.
[32] M. Zeller. Myth or reality–does the Aurora vulnerability pose a risk to my generator?
In Protective Relay Engineers, 2011 64th Annual Conference for, pages 130–136, April
2011.