EEC 688/788 Secure and Dependable Computing Lecture 9 Wenbing Zhao Department of Electrical and...
EEC 688/788 Secure and Dependable Computing
Lecture 9
Wenbing Zhao
Department of Electrical and Computer Engineering
Cleveland State University
[email protected]@ieee.org
04/20/23 EEC688/788: Secure & Dependable Computing Wenbing Zhao
Outline
  Schedule change:
    Guest seminar: Oct 30, Wed
    Lecture 10 was eliminated: Lecture 11 => 10, lecture 12 => 11, lab 4 moved to Oct 23
    Discussion #2 to Oct 28. Midterm #2 unchanged: Nov 4
  Lab reports overdue!
  Time to work on the project!
    Read 5 research papers: select papers from the end-of-chapter references in my book
Project Report Requirement
  Report format: IEEE Transactions format. 4-10 pages
    MS Word Template
      http://www.ieee.org/portal/cms_docs/pubs/transactions/TRANS-JOUR.DOC
    LaTeX Template
      http://www.ieee.org/portal/cms_docs/pubs/transactions/IEEEtran.zip (main text)
      http://www.ieee.org/portal/cms_docs/pubs/transactions/IEEEtranBST.zip (bibliography)
  Final Report due: Dec 9th midnight (no extensions!)
    Must upload to turnitin.com account (as early as possible to see plagiarism report)
  Project outline due: Nov 11th in class (hardcopy, no extension!)
    Topic, title, list of 5 papers
Data and Service Replication
  Replication resorts to the use of space redundancy to achieve high availability
    Instead of running a single copy of the service, multiple copies are used
    Usually deployed across a group of physical nodes for fault isolation
  Data and service replication
    Usually use different approaches
    Transactional data replication
    Optimistic replication (omitted)
    Balance consistency and performance: CAP theorem (omitted)
Data and Service Replication
  Service replication: State machine replication
    Each replica is modeled as a state machine: state, interface, deterministic state change via interface
    Replica consistency issue: coordination needed
      Total order of requests to the server replicas
      Sequential execution of requests
  Data replication: Direct access on data
    Operation on data: read or write
    Context: transaction processing => concurrent access to replicated data essential
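
The state machine model above can be illustrated with a minimal sketch. The `CounterReplica` class, its `add`/`mul` operations, and the `run` helper are hypothetical names chosen for this example, not part of the lecture. It shows that replicas applying the same requests in the same order reach the same state, while a different order can diverge.

```python
from dataclasses import dataclass


@dataclass
class CounterReplica:
    """Hypothetical replica modeled as a deterministic state machine."""
    state: int = 0

    def apply(self, request):
        """Interface: the only way to change state; same input, same transition."""
        op, amount = request
        if op == "add":
            self.state += amount
        elif op == "mul":
            self.state *= amount
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.state


def run(requests):
    """Execute requests sequentially on a fresh replica; return the final state."""
    replica = CounterReplica()
    for request in requests:
        replica.apply(request)
    return replica.state


ordered = [("add", 2), ("mul", 5), ("add", 1)]

# Three replicas that see the same total order end in the same state.
same_order_states = [run(ordered) for _ in range(3)]

# A replica that sees the same requests in another order diverges.
reordered_state = run([("mul", 5), ("add", 2), ("add", 1)])
```

Here `add` and `mul` do not commute, which is why the order of requests matters and coordination among replicas is needed.
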
Service Replication
  State is encapsulated
  Clients interact with exported interfaces (APIs)
  Replication algorithm used to coordinate replicas (for consistency)
  Fault tolerance middleware
Replication Styles
  Active replication
    Every input (request) is executed by every replica
    Every replica generates the outputs (replies)
    Voting is needed to cope with non-fail-stop faults
  Passive replication
    One of the replicas is designated as the primary replica
    Only the primary replica executes requests
    The state of the primary replica is transferred to the backups periodically or after every request processing
  Semi-active replication
    One of the replicas is designated as the leader (or primary)
    The leader determines the order of execution
    Every input is executed by every replica per the leader’s instruction
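
As one illustration of the styles listed above, here is a minimal sketch of passive replication with a state transfer after every request. The `Replica` and `PassiveGroup` classes and the key/value request format are assumptions made for this example. Failure detection and primary election are reduced to dropping the first replica.

```python
import copy


class Replica:
    """Hypothetical replica holding a small key/value state."""

    def __init__(self, name):
        self.name = name
        self.state = {}

    def execute(self, key, value):
        self.state[key] = value
        return f"{key}={value}"


class PassiveGroup:
    """Hypothetical passive replication group; replicas[0] acts as the primary."""

    def __init__(self, names):
        self.replicas = [Replica(n) for n in names]

    @property
    def primary(self):
        return self.replicas[0]

    def handle(self, key, value):
        # Only the primary executes the request.
        reply = self.primary.execute(key, value)
        # State transfer to the backups after every request.
        for backup in self.replicas[1:]:
            backup.state = copy.deepcopy(self.primary.state)
        return reply

    def fail_primary(self):
        # Simplification: the next replica in the list takes over.
        self.replicas.pop(0)


group = PassiveGroup(["r0", "r1", "r2"])
group.handle("x", 1)
group.handle("y", 2)
group.fail_primary()                  # r0 crashes; r1 becomes the primary
recovered_state = group.primary.state
```

Transferring state periodically instead of after every request would lower the overhead, at the cost of replaying or losing the requests processed since the last transfer.
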
Active Replication
  [Figure: actively replicated client object A and actively replicated server object B, each replica with an RM. Labels: duplicate invocation suppressed, duplicate responses suppressed]
Active Replication with Voting
  Question: to cope with f faults (non-malicious), how many replicas are needed?
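
A minimal sketch of a voter follows, under the assumption that replies are compared by simple majority voting. With that assumption, 2f + 1 replicas are commonly used so that the f + 1 correct replies outvote at most f wrong ones. The function name `vote` and the sample replies are illustrative only.

```python
from collections import Counter


def vote(replies, f):
    """Return the value reported by at least f + 1 replicas, or None.

    Assumption: at most f replicas return a wrong value, so a value with
    f + 1 matching replies was produced by at least one correct replica.
    """
    if not replies:
        return None
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= f + 1 else None


# f = 1 with 2f + 1 = 3 replicas: one wrong reply is outvoted.
decided = vote([42, 42, 7], f=1)

# f = 1 with only 2 replicas: the voter cannot tell which reply is correct.
undecided = vote([42, 7], f=1)
```
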
Passive Replication
  [Figure: passively replicated client object A and passively replicated server object B, each replica with an RM. Labels: primary replica, invocation, response, state transfer]
  Question: can passive replication tolerate non-fail-stop faults?
Semi-Active Replication
  [Figure: semi-actively replicated client object A and semi-actively replicated server object B, each replica with an RM. Labels: primary replica, invocation, response, ordering info]
Implementation of Service Replication: Ensuring Strong Replica Consistency
  For active replication, use a group communication system or a consensus algorithm that guarantees total ordering of all messages (plus deterministic processing in each replica)
  Passive replication with systematic checkpointing
  Semi-active replication
  Use two-phase commit
Total Ordering of Messages
  What is total ordering of messages?
    All replicas receive the same set of messages in the same order
    Atomic multicast: if a message is delivered to one replica, it is also delivered to all non-faulty replicas
  With replication, we need to ensure total ordering of messages sent by a group of replicas to another group of replicas
    FIFO ordering between one sender and a group is not sufficient
  [Figure: delivery of messages m1 and m2 to the replicas]
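
The total ordering requirement can be sketched with a fixed sequencer, which is one simple technique among several and is an assumption of this example. The lecture itself names group communication systems and consensus algorithms without prescribing a mechanism. The `Sequencer` and `Replica` classes are hypothetical. Each message gets a global sequence number, and each replica delivers strictly in sequence-number order regardless of arrival order.

```python
class Sequencer:
    """Hypothetical fixed sequencer: stamps messages with a global sequence number."""

    def __init__(self):
        self.next_seq = 0

    def stamp(self, message):
        seq = self.next_seq
        self.next_seq += 1
        return seq, message


class Replica:
    """Hypothetical replica that delivers messages in sequence-number order."""

    def __init__(self):
        self.pending = {}        # seq -> message, received but not yet deliverable
        self.next_expected = 0
        self.delivered = []

    def receive(self, seq, message):
        self.pending[seq] = message
        # Hold back messages until every earlier sequence number has arrived.
        while self.next_expected in self.pending:
            self.delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1


sequencer = Sequencer()
stamped_m1 = sequencer.stamp("m1")    # sent by one sender
stamped_m2 = sequencer.stamp("m2")    # sent by a different sender

replica_a, replica_b = Replica(), Replica()

# The network hands the two messages to the replicas in different orders.
replica_a.receive(*stamped_m1)
replica_a.receive(*stamped_m2)
replica_b.receive(*stamped_m2)
replica_b.receive(*stamped_m1)
```

Both replicas end up with the same delivery order even though `replica_b` received `m2` first. The sketch does not handle sequencer failure or message loss, which real group communication systems must address.
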
Potential Sources of Non-determinism
  Multithreading
    The order of accesses of shared data by different threads might not be the same at different replicas
  System calls/library calls
    A call at one replica might succeed while the same call might fail at another replica. E.g., memory allocation, file access
  Host/process specific information
    Host name, process id, etc.
    Local clocks: gettimeofday()
  Interrupts
    Delivered and handled asynchronously: a big problem
Data Replication
  Transactional data replication
    Read/write ops on a set of data items within the scope of a transaction
    At the transaction level, executions appear to be sequential (one-copy serializable)
    Actual ops on each data item often concurrent
  Optimistic data replication
    Eventual consistency: eventually, all updates will be propagated to all data items
Transactional Data Replication
  One-copy serializable
    A transactional data replication algorithm should ensure that the replicated data appear to the clients as a single copy
    The interleaved execution of the transactions must be equivalent to a sequential execution of those transactions on a single copy of the data
  Make read ops cheaper than updates: read ops are more prevalent
  It is challenging to design sound replication algorithms
Wrong Data Replication Algorithms
  Write-all
    A read op on a data item x can be mapped to any replica of x
    A write on x must be applied to all replicas of x
    Problem: what if a replica becomes faulty?
      Blocking! Any single replica fault brings down the entire system!
Wrong Data Replication Algorithms
  Write-all-available
    A read op on a data item x can be mapped to any replica of x
    A write on x is applied to the available replicas of x
    Problem: cannot ensure one-copy serializable execution!
Attempting to Fix Write-All-Available
  Problem caused by accessing the not-fully-recovered replica => how about preventing this?
  Still won’t work
    Ti: R(x), W(y)
    Tj: R(y), W(x)
    Ti does not precede Tj because Tj reads y before Ti writes to y
    Tj does not precede Ti because Ti reads x before Tj writes to x
    Hence, Ti and Tj are not serializable!
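
The argument above can be checked mechanically with a precedence (conflict) graph, a standard serializability test. The sketch assumes one interleaving consistent with the slide, in which both reads happen before both writes. The helper names `precedence_edges` and `has_cycle` are hypothetical. An edge T1 -> T2 means T1 must come before T2 in any equivalent serial order, and a cycle means no such order exists.

```python
def precedence_edges(schedule):
    """Return edges (earlier_txn, later_txn) for each pair of conflicting ops.

    Two ops conflict when they come from different transactions, touch the
    same data item, and at least one of them is a write.
    """
    edges = set()
    for i, (t1, op1, item1) in enumerate(schedule):
        for t2, op2, item2 in schedule[i + 1:]:
            if t1 != t2 and item1 == item2 and "W" in (op1, op2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True          # back edge: cycle found
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


# Assumed interleaving: Ti reads x and Tj reads y before either write.
interleaved = [("Ti", "R", "x"), ("Tj", "R", "y"), ("Ti", "W", "y"), ("Tj", "W", "x")]
# For comparison: Ti runs to completion before Tj starts.
serial = [("Ti", "R", "x"), ("Ti", "W", "y"), ("Tj", "R", "y"), ("Tj", "W", "x")]

interleaved_edges = precedence_edges(interleaved)
not_serializable = has_cycle(interleaved_edges)
serial_ok = not has_cycle(precedence_edges(serial))
```
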
Insight into the Problem
  The problem is caused by the fact that conflicting operations are performed at different replicas
  We must prevent this from happening
  A solution: use quorum-based consensus
  What is a quorum?
    Given a system with n processes, a quorum is formed by a subset of the processes in the system
    Any two quorums must intersect in at least one process
    Read quorum: a quorum formed for read ops
    Write quorum: a quorum formed for write ops
A Quorum-Based Replication Algorithm
  Basic idea:
    Write ops apply to a write quorum
    Read ops apply to a read quorum
    Fault tolerance: given the total number of replicas N and write quorum size W (>= read quorum size R), can tolerate up to N-W failures
  Quorum rule
    Each replica is assigned a positive weight, e.g., 1
    A read quorum has a min total weight RT
    A write quorum has a min total weight WT
    RT + WT > total weight and 2WT > total weight
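
The quorum rule can be checked by brute force on a small configuration. This is a sketch under the assumption of five replicas with weight 1 each. The replica names and helper functions are hypothetical. It enumerates every read and write quorum and confirms that the two inequalities are what force them to intersect.

```python
from itertools import combinations


def valid_quorum_config(weights, rt, wt):
    """Quorum rule: RT + WT > total weight and 2 * WT > total weight."""
    total = sum(weights.values())
    return rt + wt > total and 2 * wt > total


def quorums(weights, threshold):
    """All subsets of replicas whose total weight reaches the threshold."""
    names = sorted(weights)
    return [
        set(group)
        for size in range(1, len(names) + 1)
        for group in combinations(names, size)
        if sum(weights[name] for name in group) >= threshold
    ]


def all_intersect(groups_a, groups_b):
    """True if every group in groups_a shares a replica with every group in groups_b."""
    return all(a & b for a in groups_a for b in groups_b)


# Assumed configuration: N = 5 replicas, each with weight 1.
weights = {"r1": 1, "r2": 1, "r3": 1, "r4": 1, "r5": 1}

# RT = 2, WT = 4 satisfies the rule; writes tolerate N - W = 1 failure.
good = valid_quorum_config(weights, rt=2, wt=4)
good_rw = all_intersect(quorums(weights, 2), quorums(weights, 4))
good_ww = all_intersect(quorums(weights, 4), quorums(weights, 4))

# RT = 2, WT = 3 violates RT + WT > 5: a read can miss the latest write.
bad = valid_quorum_config(weights, rt=2, wt=3)
bad_rw = all_intersect(quorums(weights, 2), quorums(weights, 3))
```

The first inequality makes every read quorum overlap every write quorum, so a read always reaches at least one replica holding the latest write. The second makes any two write quorums overlap, so two writes cannot proceed on disjoint sets of replicas.
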
A Quorum-Based Replication Algorithm
  Since an update is applied only to a quorum of replicas, we need to track which replica has the latest value => use version numbers
    The version number is incremented after each update
  Read rule
    A read on data x is mapped to a read quorum of replicas of x
    Each replica returns both the value of x and its version number
    The client selects the value that has the highest version number
A Quorum-Based Replication Algorithm
  Write rule
    A write op on data x is mapped to a write quorum of replicas of x
    First, retrieve version numbers from the replicas, set v = vmax + 1 for this write op
    Write to the replicas (in the write quorum) with the new value and version # v. A replica overwrites both its value and its version number with the new value and v
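
The read and write rules can be sketched together for a single data item x. This is a minimal single-process illustration that assumes five replicas, a read quorum of size 2, and a write quorum of size 4. The `Replica` class and the `quorum_read` and `quorum_write` helpers are hypothetical. Concurrency control and failures are left out.

```python
class Replica:
    """Hypothetical replica of one data item x with a version number."""

    def __init__(self):
        self.value = None
        self.version = 0


def quorum_read(read_quorum):
    """Read rule: collect (value, version) pairs and keep the highest version."""
    newest = max(read_quorum, key=lambda replica: replica.version)
    return newest.value, newest.version


def quorum_write(write_quorum, value):
    """Write rule: set v = vmax + 1, then overwrite value and version."""
    v = max(replica.version for replica in write_quorum) + 1
    for replica in write_quorum:
        replica.value, replica.version = value, v
    return v


# Assumed configuration: N = 5, read quorum size 2, write quorum size 4.
replicas = [Replica() for _ in range(5)]

first_version = quorum_write(replicas[0:4], "a")    # replicas 0..3 get ("a", 1)
second_version = quorum_write(replicas[1:5], "b")   # replicas 1..4 get ("b", 2)

# Replica 0 missed the second write and is stale, but any read quorum of
# size 2 includes at least one replica from the last write quorum.
stale = (replicas[0].value, replicas[0].version)
read_result = quorum_read([replicas[0], replicas[1]])
```

Because the two write quorums overlap in replicas 1 to 3, the second write sees version 1 and picks version 2, which is what lets the reader distinguish the stale copy from the latest one.
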