CS510 Concurrent Systems

28
CS510 Concurrent Systems “What is RCU, Fundamentally?” By Paul McKenney and Jonathan Walpole Daniel Mansour (Slides adapted from Professor Walpole’s)

description

CS510 Concurrent Systems. “What is RCU, Fundamentally?” By Paul McKenney and Jonathan Walpole Daniel Mansour (Slides adapted from Professor Walpole’s). Review of Concurrency. The Story So Far Critical Sections Shared global objects Blocking/Non-Blocking Reader/Writers - PowerPoint PPT Presentation

Transcript of CS510 Concurrent Systems

Page 1: CS510 Concurrent Systems

CS510 Concurrent Systems

“What is RCU, Fundamentally?”By Paul McKenney and Jonathan Walpole

Daniel Mansour(Slides adapted from Professor Walpole’s)

Page 2: CS510 Concurrent Systems

2

Review of Concurrency

The Story So Faro Critical Sections

• Shared global objects• Blocking/Non-Blocking

o Reader/Writers• Reads are “safer”• Writes are Fewer

o Hardware/Compiler Optimizations• Operations are not always in order

o Memory Reclamation• Deleted Objects can still be read

Page 3: CS510 Concurrent Systems

3

Read-Copy Update (RCU)

Read/Writeso Do not allow reads while writes are occurringo RCU solves this by allowing multiple versions

Insertiono Publish-Subscribe Mechanism

Deletiono Wait for Readers to completeo Non-Blocking

Page 4: CS510 Concurrent Systems

4

Publish-Subscribe Mechanism

1 struct foo { 2 int a; 3 int b; 4 int c; 5 }; 6 struct foo *gp = NULL; 7 8 /* . . . */ 9 10 p = kmalloc(sizeof(*p), GFP_KERNEL); 11 p->a = 1; 12 p->b = 2; 13 p->c = 3; 14 gp = p;

Page 5: CS510 Concurrent Systems

5

Publish-Subscribe Mechanism

1 struct foo { 2 int a; 3 int b; 4 int c; 5 }; 6 struct foo *gp = NULL; 7 8 /* . . . */ 9 10 p = kmalloc(sizeof(*p), GFP_KERNEL); 11 gp = p; 12 p->a = 1; 13 p->b = 2; 14 p->c = 3;

Reordered, concurrent readerswould get uninitialized values

Page 6: CS510 Concurrent Systems

6

Publish-Subscribe Mechanism

Publish the Objecto rcu_assign_pointer

1 p->a = 1; 2 p->b = 2; 3 p->c = 3; 4 rcu_assign_pointer(gp, p);

Page 7: CS510 Concurrent Systems

7

Publish-Subscribe Mechanism

Readerso Ordering can still be an issue

o p->a can be read before p in some compiler optimizations

1 p = gp; 2 if (p != NULL) { 3 do_something_with(p->a, p->b, p-c); 4 }

Page 8: CS510 Concurrent Systems

8

Publish-Subscribe Mechanism

Subscribe to the objecto rcu_dereference

o Implements any necessary memory barriers

1 rcu_read_lock() 2 p = rcu_dereference(gp) 3 if (p != NULL { 4 do_something_with(p->a, p->b, p->c); 5 } 6 rcu_read_unlock();

Page 9: CS510 Concurrent Systems

9

Implementation

Doubly Linked Listso list_head

o Head links to tail with prev and vice versa

Page 10: CS510 Concurrent Systems

10

Implementation - list_head

Publishing 1 struct foo { 2 struct list_head list; 3 int a; 4 int b; 5 int c; 6 }; 7 LIST_HEAD(head); 8 9 /* . . . */ 10 11 p = kmalloc(sizeof(*p), GFP_KERNEL); 12 p->a = 1; 13 p->b = 2; 14 p->c = 3; 15 list_add_rcu(&p->list, &head);

Still needs to be protected

Page 11: CS510 Concurrent Systems

11

Implementation - list_head

Subscribing 1 rcu_read_lock(); 2 list_for_each_entry_rcu(p, head, list) { 3 do_something_with(p->a, p->b, p->c); 4 } 5 rcu_read_unlock();

Page 12: CS510 Concurrent Systems

12

Implementation - hlist_head/hlist_node

Doubly Linked Listso hlist_head/hlist_node

o Linear list

Page 13: CS510 Concurrent Systems

13

Implementation - hlist_head/hlist_node

Publishing 1 struct foo { 2 struct hlist_node *list; 3 int a; 4 int b; 5 int c; 6 }; 7 HLIST_HEAD(head); 8 9 /* . . . */ 10 11 p = kmalloc(sizeof(*p), GFP_KERNEL); 12 p->a = 1; 13 p->b = 2; 14 p->c = 3; 15 hlist_add_head_rcu(&p->list, &head);

Page 14: CS510 Concurrent Systems

14

Implementation - list_head

Subscribing 1 rcu_read_lock(); 2 hlist_for_each_entry_rcu(p, q, head, list) { 3 do_something_with(p->a, p->b, p->c); 4 } 5 rcu_read_unlock();

Iteration requires checking for nullInstead of p = head

Page 15: CS510 Concurrent Systems

15

Reclaiming Memory

Waitingo rcu_read_lock and rcu_read_unlock

• Must not block or sleep

Page 16: CS510 Concurrent Systems

Reclaiming Memory

Basic Process

1. Alter the data (replace/delete an element)

2. Wait for all pre-existing RCU reads to completely finish• synchronize_rcu• New RCU reads have no way to access removed element

3. Reclaim/Recycle memory

16

Page 17: CS510 Concurrent Systems

Reclaiming Memory

Example

17

1 struct foo { 2 struct list_head list; 3 int a; 4 int b; 5 int c; 6 }; 7 LIST_HEAD(head); 8 9 /* . . . */ 10 11 p = search(head, key); 12 if (p == NULL) { 13 /* Take appropriate action, unlock, and return. */ 14 } 15 q = kmalloc(sizeof(*p), GFP_KERNEL); 16 *q = *p; 17 q->b = 2; 18 q->c = 3; 19 list_replace_rcu(&p->list, &q->list); 20 synchronize_rcu(); 21 kfree(p);

Page 18: CS510 Concurrent Systems

Reclaiming Memory

synchronize_rcuo rcu_read_lock and rcu_read_unlock do nothing on non-

CONFIG-PREEMPT kernelso Recall that code in rcu_read blocks can not sleep or block,

preventing context switcheso When a context switch occurs, we can deduce that the read

has completedo synchronize_rcu waits for each cpu to context switch

conceptually:

18

1 for_each_online_cpu(cpu) 2 run_on(cpu);

Page 19: CS510 Concurrent Systems

Maintaining Multiple Versions - Deletion

During deletion

Initial State

19

1 p = search(head, key); 2 if (p != NULL) { 3 list_del_rcu(&p->list); 4 synchronize_rcu(); 5 kfree(p); 6 }

Page 20: CS510 Concurrent Systems

Maintaining Multiple Versions - Deletion

list_del_rcu(&p->list)

p has been deleted, but other readers can still be accessing it

20

Page 21: CS510 Concurrent Systems

Maintaining Multiple Versions - Deletion

synchronize_rcu()

kfree(p)

21

Page 22: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

During replacement

Initial State

22

1 q = kmalloc(sizeof(*p), GFP_KERNEL); 2 *q = *p; 3 q->b = 2; 4 q->c = 3; 5 list_replace_rcu(&p->list, &q->list); 6 synchronize_rcu(); 7 kfree(p);

Page 23: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

kmalloc()

23

Page 24: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

*q = *p

24

Page 25: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

q->b = 2; q->c = 3

25

Page 26: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

list_replace_rcu

There are now two versions of the list, but both are well formed.

26

Page 27: CS510 Concurrent Systems

Maintaining Multiple Versions - Replacement

synchronize_rcu

kfree()

27

Page 28: CS510 Concurrent Systems

Issues

Very much a read/write implementation, not ideally suited for rapidly updated objects

Responsibility on the programmer to make sure that no sleeping/blocking occurs in rcu_read_locks

RCU updaters can still slow readers via cache invalidation

28