- 快召唤伙伴们来围观吧
- 微博 QQ QQ空间 贴吧
- 文档嵌入链接
- 复制
- 微信扫一扫分享
- 已成功复制到剪贴板
Read Copy Update (RCU)
展开查看详情
1 .Advanced Operating Systems (CS 202) Read Copy Update (RCU) Feb. 3, 2017
2 . Linux Synch. Primitives Technique Description Scope Per-CPU Duplicate a data structure All CPUs variables among CPUs Atomic operation Atomic read-modify-write All instruction Memory barrier Avoid instruction re-ordering Local CPU Spin lock Lock with busy wait All Semaphore Lock with blocking wait (sleep) All Seqlocks Lock based on access counter All Local interrupt Forbid interrupt on a single CPU Local disabling Local softirq Forbid deferrable function on a Local disabling single CPU Read-copy- Lock-free access to shared data All update (RCU) through pointers
3 . Why are we reading this paper? • Example of a synchronization primitive that is: – Lock free (mostly/for reads) – Tuned to a common access pattern – Making the common case fast • What is this common pattern? – A lot of reads – Writes are rare – Prioritize writes – Ok to read a slightly stale copy • But that can be fixed too 3
4 .Traditional OS locking designs • complex • poor concurrency • Fail to take advantage of event-driven nature of operating systems 4
5 . Motivation • Locks have acquire and release cost – Each uses atomic operations which are expensive – Can dominate cost for short critical regions – Locks become the bottleneck • Readers/writers lock is also expensive – uses atomic increment/decrement for reader count 5
6 . Lock free data structures • Do not require locks • Good if contention is rare • But difficult to create and error prone • RCU is a mixture – Concurrent changes to pointers a challenge for lock-free – RCU serializes writers using locks – Win if most of our accesses are reads 6
7 .Race Between Teardown and Use of Service Can fix with locking, but we have to use the lock every operation 7
8 .Read-Copy Update Handling Race When Cannot be context switched inside RCU quiescent state 8
9 .Read-copy update works best when • divide an update into two phases • proceed on stale data for common- case operations (e.g. continuing to handle operations by a module being unloaded) • destructive updates are very infrequent. • Often used to update linked lists – Which are used all over kernels 9
10 .Typical RCU update sequence • Replace pointers to a data structure with pointers to a new version – Is this replacement atomic? • Wait for all previous reader to complete their RCU read-side critical sections. • at this point, there cannot be any readers who hold reference to the data structure, so it now may safely be reclaimed. 10
11 .Read-Copy Deletion 11 18
12 .Read-Copy Deletion (delete B) 12
13 .the first phase of the update 13 18
14 .Read-Copy Deletion When 14
15 .Read-Copy Deletion 15
16 .Read-Copy Search 16
17 . Assumptions • Read intensive – the update fraction f < 1/ |CPU| • Grace period – reading tasks can see stale data • requires that the modification be compatible with lock-free access – linked-listinsertion, deletion, and replacement are compatible 17
18 . Simple Implementation • Wait_for_rcu() – waits for a grace period to expire • Kfree_rcu() – waits for a grace period before freeing a specified block of memory. 18
19 . Read-Copy Update Grace Period non-preemptible kernel execution Quiescent state execution 19
20 .Simple Grace-Period Detection 20
21 .wait_for_rcu() I 21
22 .wait_for_rcu() II 22
23 . Implementations of Quiescent State 1. simply execute onto each CPU in turn. 2. use context switch, execution in the idle loop, execution in user mode, system call entry, trap from user mode as the quiescent states. 3. voluntary context switch as the sole quiescent state 4. tracks beginnings and ends of operations 23
24 . Implementation (option 4) • Generation counter for each RCU region • Generation updated on write • Every read increments generation counter going in – And decrements it going out • Quiescence = counter is zero 24
25 . RCU usage in Linux 25 Source: http://www.rdrop.com/users/paulmc/RCU/linuxusage.html
26 .RCU as percentage of all locking in linux 26 Source: http://www.rdrop.com/users/paulmc/RCU/linuxusage.html
27 . SeqLock • Another special synchronization primitive • Goal is to avoid writer starvation in reader writer locks • Has a lock and a sequence number – Lock for writers only – Writer increments sequence number after acquiring lock and before releasing lock • Readers do not block • But can check sequence number 27
28 . Shortcomings • Does not work in a preemptible kernel unless preemption is suppressed in all read-side critical sections • Cannot be called from an interrupt handler • Should not be called while holding a spinlock or with interrupts disabled • Relatively slow 28
29 . Addressing • The K42 and Tornado implementations of RCU are such that read-side critical sections can block as well as being preempted—solve 1 • Call_rcu() --solve 2、3 • Kfree_rcu() --solve 2、3 • High-Performance Design for RCU –solve 2、3、4 29