RCU
Uploaded by: bergwolf
Category: Technology
Transcript of RCU
![Page 2: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/2.jpg)
Agenda
- What is RCU?
- Why RCU?
- RCU Primitives
- RCU List Operations
- Sleepable RCU
- User-Level RCU
- Q&A
![Page 3: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/3.jpg)
What is RCU?
- Read-copy-update
- An alternative to rwlock
- Allows low-overhead, wait-free reads
- Updates can be expensive: old copies must be maintained while they are still in use
![Page 4: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/4.jpg)
Why RCU?
Without a lock, this is broken because of compiler optimizations and CPU out-of-order execution:

    struct foo {
        int a;
        int b;
        int c;
    };
    struct foo *gp = NULL;

    /* . . . */

    p = kmalloc(sizeof(*p), GFP_KERNEL);
    p->a = 1;
    p->b = 2;
    p->c = 3;
    gp = p;    /* nothing orders the three stores above before this one */

![Page 5: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/5.jpg)
Why RCU?
- Mutex: no concurrent readers
- Spinlock: ditto
- Rwlock: allows concurrent readers. The right choice?
![Page 6: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/6.jpg)
Why RCU?
- rwlock is expensive
- Even read_lock has more overhead than spin_lock
- If write_lock is not really rare, rwlock contention is much worse than spin_lock contention
![Page 7: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/7.jpg)
RCU Basics
- Split each update into removal and reclamation phases
- Removal is performed immediately, while reclamation is deferred until all readers active during the removal phase have completed
- Takes advantage of the fact that writes to single aligned pointers are atomic on modern CPUs
![Page 8: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/8.jpg)
RCU Terminology
- Read-side critical section: code delimited by rcu_read_lock() and rcu_read_unlock(); it MUST NOT sleep
- Quiescent state: any code not within an RCU read-side critical section
- Grace period: any time period during which each thread resides in at least one quiescent state
![Page 9: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/9.jpg)
RCU Terminology
- More on grace periods: after a full grace period, all pre-existing RCU read-side critical sections have completed.
![Page 10: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/10.jpg)
RCU Update Sequence
1. Remove pointers to a data structure, so that subsequent readers cannot gain a reference to it
2. Wait for all previous readers to complete their RCU read-side critical sections (that is, wait for a grace period to pass)
3. At this point, no readers can hold references to the data structure, so it may now safely be reclaimed (e.g., in another thread)
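
The three steps can be sketched in userspace C. This is only a toy under stated assumptions: a single shared reader counter stands in for the kernel's grace-period machinery, and every `toy_` name is hypothetical rather than a kernel API. The shared counter is exactly the kind of reader-side overhead real RCU avoids, and it can starve the updater if readers never drain, so treat it as an illustration of the ordering only.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct foo {
    int a;
};

static _Atomic(struct foo *) gp;   /* the protected pointer, initially NULL */
static atomic_int toy_readers;     /* readers currently inside a critical section */

static void toy_read_lock(void)   { atomic_fetch_add(&toy_readers, 1); }
static void toy_read_unlock(void) { atomic_fetch_sub(&toy_readers, 1); }

/* Toy grace period: spin until the reader count is observed at zero. */
static void toy_synchronize(void)
{
    while (atomic_load(&toy_readers) != 0)
        ;
}

/* Publish a new version and reclaim the old one; returns 0, or -1 if malloc fails. */
static int toy_update(int new_a)
{
    struct foo *newp = malloc(sizeof(*newp));
    struct foo *oldp;

    if (newp == NULL)
        return -1;
    newp->a = new_a;
    oldp = atomic_exchange(&gp, newp);  /* step 1: new readers can no longer find oldp */
    toy_synchronize();                  /* step 2: wait for pre-existing readers */
    free(oldp);                         /* step 3: reclaim (free(NULL) is a no-op) */
    return 0;
}

/* Reader: returns the current value of a, or -1 if nothing is published. */
static int toy_read(void)
{
    struct foo *p;
    int val = -1;

    toy_read_lock();
    p = atomic_load(&gp);
    if (p != NULL)
        val = p->a;
    toy_read_unlock();
    return val;
}
```

A reader that entered its critical section before the exchange may still be using the old copy, which is why the free happens only after the wait.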
![Page 11: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/11.jpg)
When Does a Grace Period Pass?
- RCU readers are not permitted to block, switch to user-mode execution, or enter the idle loop.
- As soon as a CPU is seen passing through any of these three states, we know that that CPU has exited any previous RCU read-side critical sections.
- If we remove an item from a linked list, and then wait until all CPUs have switched context, executed in user mode, or executed in the idle loop, we can safely free up that item.
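
One way to picture this detection is a per-CPU counter that is bumped at every quiescent state. The sketch below is a simplified illustration, not the kernel's algorithm: `TOY_NR_CPUS` and all `toy_` functions are hypothetical, and the "CPUs" are just array slots.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* One counter per simulated CPU, incremented at each quiescent state. */
static atomic_ulong toy_qs_count[TOY_NR_CPUS];

/* Called when a CPU context-switches, runs user-mode code, or idles. */
static void toy_note_quiescent_state(int cpu)
{
    atomic_fetch_add(&toy_qs_count[cpu], 1);
}

/* Start of a grace period: snapshot every CPU's counter. */
static void toy_gp_start(unsigned long snap[TOY_NR_CPUS])
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        snap[cpu] = atomic_load(&toy_qs_count[cpu]);
}

/* The grace period has elapsed once every counter has moved past its snapshot. */
static bool toy_gp_elapsed(const unsigned long snap[TOY_NR_CPUS])
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        if (atomic_load(&toy_qs_count[cpu]) == snap[cpu])
            return false;
    return true;
}
```

An updater would take the snapshot right after removing the item and free it only once `toy_gp_elapsed()` returns true.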
![Page 12: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/12.jpg)
Core RCU APIs
- rcu_read_lock()
- rcu_read_unlock()
- synchronize_rcu() / call_rcu()
- rcu_assign_pointer()
- rcu_dereference()
![Page 13: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/13.jpg)
Wait for Readers
- synchronize_rcu(): waits only for all ongoing RCU read-side critical sections to complete
- call_rcu(): registers a function and argument that are invoked after all ongoing RCU read-side critical sections have completed
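
The difference between the two is blocking versus deferral. The sketch below illustrates only the deferral idea in single-threaded userspace C: callbacks are queued and run later, once the caller knows a grace period has elapsed. `struct toy_rcu_head`, `toy_call_rcu()` and `toy_invoke_callbacks()` are hypothetical stand-ins, not the kernel's implementation, and the queue has no locking.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct toy_rcu_head {
    struct toy_rcu_head *next;
    void (*func)(struct toy_rcu_head *head);
};

static struct toy_rcu_head *toy_pending;  /* callbacks awaiting a grace period */

/* Asynchronous: queue the callback and return immediately. */
static void toy_call_rcu(struct toy_rcu_head *head,
                         void (*func)(struct toy_rcu_head *head))
{
    head->func = func;
    head->next = toy_pending;
    toy_pending = head;
}

/* Run after a grace period has elapsed; returns the number of callbacks invoked. */
static int toy_invoke_callbacks(void)
{
    struct toy_rcu_head *list = toy_pending;
    int n = 0;

    toy_pending = NULL;
    while (list != NULL) {
        struct toy_rcu_head *next = list->next;  /* func may free the node */
        list->func(list);
        list = next;
        n++;
    }
    return n;
}

struct foo {
    int a;
    struct toy_rcu_head rcu;  /* embedded so the callback can find the object */
};

static int toy_freed;  /* counts reclaimed objects, for demonstration only */

static void toy_free_foo(struct toy_rcu_head *head)
{
    /* Recover the enclosing struct foo from its embedded head. */
    struct foo *p = (struct foo *)((char *)head - offsetof(struct foo, rcu));

    free(p);
    toy_freed++;
}
```

The updater never blocks here; the cost is that each object carries an extra head and reclamation happens at some later point.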
![Page 14: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/14.jpg)
Assign & Retrieve
- rcu_assign_pointer(): assigns a new value to an RCU-protected pointer
- rcu_dereference(): fetches an RCU-protected pointer, which is safe to dereference until rcu_read_unlock()
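
A rough userspace analogue of this pair can be written with C11 atomics. This is a sketch under assumptions, not the kernel macros: `toy_assign_pointer()` and `toy_dereference()` are hypothetical names, and an acquire load is used as a conservative stand-in for the dependency ordering that the kernel's rcu_dereference() relies on.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct foo {
    int a;
    int b;
    int c;
};

static _Atomic(struct foo *) gp;  /* initially NULL */

/* Release store: the initialization of *p is ordered before the publication. */
static void toy_assign_pointer(struct foo *p)
{
    atomic_store_explicit(&gp, p, memory_order_release);
}

/* Acquire load: a reader that sees the pointer also sees the initialized fields. */
static struct foo *toy_dereference(void)
{
    return atomic_load_explicit(&gp, memory_order_acquire);
}

/* Allocate, initialize, then publish; returns 0, or -1 if malloc fails. */
static int toy_publish(void)
{
    struct foo *p = malloc(sizeof(*p));

    if (p == NULL)
        return -1;
    p->a = 1;
    p->b = 2;
    p->c = 3;
    toy_assign_pointer(p);
    return 0;
}

/* Returns a + b + c of the published structure, or -1 if none is published. */
static int toy_sum(void)
{
    struct foo *p = toy_dereference();

    return p != NULL ? p->a + p->b + p->c : -1;
}
```

With these orderings a reader can never observe the pointer without also observing the three field stores that preceded it.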
![Page 15: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/15.jpg)
RCU List Insert
- list_add_rcu()
- list_add_tail_rcu()
- list_replace_rcu()
- Updates must be protected by some lock.
![Page 16: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/16.jpg)
Sample Code

    struct foo {
        struct list_head list;          /* embedded list linkage */
        int a;
        int b;
        int c;
    };
    LIST_HEAD(head);

    /* . . . */
    p = kmalloc(sizeof(*p), GFP_KERNEL);
    p->a = 1;
    p->b = 2;
    p->c = 3;
    spin_lock(&list_lock);              /* serialize against other updaters */
    list_add_rcu(&p->list, &head);      /* publish p at the head of the list */
    spin_unlock(&list_lock);

![Page 17: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/17.jpg)
RCU List Traversal
- list_for_each_entry_rcu()
- rcu_read_lock() and rcu_read_unlock() must be called, but they never spin or block
- Allows list_add_rcu() to execute concurrently
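
Why traversal can run alongside insertion is easier to see on a small singly linked list. The following is a userspace sketch with C11 atomics, assuming hypothetical `toy_` names; it is not the kernel's list implementation. Nodes are never removed here, so no grace period is needed, and updaters are assumed to be serialized by a lock that is omitted.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct toy_node {
    int val;
    _Atomic(struct toy_node *) next;
};

static _Atomic(struct toy_node *) toy_head;  /* initially NULL */

/* Updater: insert at the head; returns 0, or -1 if malloc fails. */
static int toy_list_add(int val)
{
    struct toy_node *n = malloc(sizeof(*n));

    if (n == NULL)
        return -1;
    n->val = val;
    /* Fully initialize the node, including its link, before publishing it. */
    atomic_init(&n->next,
                atomic_load_explicit(&toy_head, memory_order_relaxed));
    /* Release store: readers that see n also see its val and next. */
    atomic_store_explicit(&toy_head, n, memory_order_release);
    return 0;
}

/* Reader: walks the list without taking any lock. */
static int toy_list_sum(void)
{
    int sum = 0;
    struct toy_node *n;

    for (n = atomic_load_explicit(&toy_head, memory_order_acquire);
         n != NULL;
         n = atomic_load_explicit(&n->next, memory_order_acquire))
        sum += n->val;
    return sum;
}
```

A reader racing with an insertion sees either the old head or the new, fully initialized node, never a half-built one.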
![Page 18: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/18.jpg)
RCU List Removal
- list_del_rcu() removes an element from the list. Must be protected by some lock.
- But when to free it?
- synchronize_rcu() blocks until all read-side critical sections that began before synchronize_rcu() was called have completed
- call_rcu() runs its callback after all read-side critical sections that began before call_rcu() was called have completed
![Page 19: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/19.jpg)
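Sample Code

    spin_lock(&mylock);
    p = search(head, key);
    if (p == NULL) {
        spin_unlock(&mylock);
    } else {
        list_del_rcu(&p->list);     /* removal: new readers cannot find p */
        spin_unlock(&mylock);
        synchronize_rcu();          /* wait for pre-existing readers */
        kfree(p);                   /* reclamation */
    }
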
![Page 20: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/20.jpg)
Sleepable RCU
- Why? Realtime kernels that require spinlock critical sections to be preemptible also require RCU read-side critical sections to be preemptible.
![Page 21: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/21.jpg)
SRCU Implementation Strategy
- Goal: prevent any given task sleeping in an RCU read-side critical section from blocking an unbounded number of RCU callbacks
- Refuse to provide asynchronous grace-period interfaces, such as Classic RCU's call_rcu() API
- Isolate grace-period detection within each subsystem using SRCU
![Page 22: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/22.jpg)
SRCU Grace Period?
- Grace periods are detected by summing per-CPU counters.
- Readers manipulate CPU-local counters.
- Two sets of per-CPU counters are used to do read-copy-update.
![Page 23: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/23.jpg)
SRCU Data Structure
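
    struct srcu_struct {
        int completed;                                  /* grace-period count; low bit selects the counter set */
        struct srcu_struct_array __percpu *per_cpu_ref; /* per-CPU reader counters */
        struct mutex mutex;                             /* serializes grace periods */
    };
    struct srcu_struct_array {
        int c[2];                                       /* one reader count per counter set */
    };
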
![Page 24: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/24.jpg)
Wait for Grace Period
synchronize_srcu():
1. Flip the completed counter, so that new readers will use the other set of per-CPU counters.
2. Wait for the old count to drain to zero.
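
The flip-and-drain idea can be sketched as follows. This is a deliberately simplified toy, not the kernel's SRCU: it assumes one global pair of counters instead of per-CPU pairs, all `toy_` names are hypothetical, and it ignores the race in which a reader samples the index but is delayed before incrementing its counter, which a real implementation must handle.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_srcu {
    atomic_int completed;  /* low bit selects the active counter */
    atomic_int c[2];       /* readers registered under each index */
};

/* Enter a read-side critical section; the returned index must be passed to unlock. */
static int toy_srcu_read_lock(struct toy_srcu *sp)
{
    int idx = atomic_load(&sp->completed) & 1;

    atomic_fetch_add(&sp->c[idx], 1);
    return idx;
}

static void toy_srcu_read_unlock(struct toy_srcu *sp, int idx)
{
    atomic_fetch_sub(&sp->c[idx], 1);
}

/* Step 1: flip, so new readers use the other counter; returns the old index. */
static int toy_srcu_flip(struct toy_srcu *sp)
{
    return atomic_fetch_add(&sp->completed, 1) & 1;
}

/* Step 2 condition: have all readers registered under idx finished? */
static bool toy_srcu_drained(struct toy_srcu *sp, int idx)
{
    return atomic_load(&sp->c[idx]) == 0;
}

/* Flip, then spin until the old counter drains to zero. */
static void toy_synchronize_srcu(struct toy_srcu *sp)
{
    int old = toy_srcu_flip(sp);

    while (!toy_srcu_drained(sp, old))
        ;
}
```

Because readers arriving after the flip count against the other index, the updater waits only for readers that were already in flight.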
![Page 25: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/25.jpg)
SRCU APIs

    int init_srcu_struct(struct srcu_struct *sp);
    void cleanup_srcu_struct(struct srcu_struct *sp);
    int srcu_read_lock(struct srcu_struct *sp) __acquires(sp);
    void srcu_read_unlock(struct srcu_struct *sp, int idx);
    void synchronize_srcu(struct srcu_struct *sp);
    void synchronize_srcu_expedited(struct srcu_struct *sp);
    long srcu_batches_completed(struct srcu_struct *sp);

![Page 26: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/26.jpg)
Userspace RCU
- Available at http://lttng.org/urcu
- git clone git://git.lttng.org/userspace-rcu.git
- Debian: aptitude install liburcu-dev
- Examples
![Page 27: RCU](https://reader031.fdocuments.in/reader031/viewer/2022022215/546c9238b4af9f842c8b5146/html5/thumbnails/27.jpg)
Q & A