A Design of User-Level Distributed Shared Memory

Zhi Zhai, Feng Shen
Computer Science and Engineering, University of Notre Dame
Oct. 27, 2009

Progress Report

Outline
• Part I: General Ideas
• Part II: Related Work
• Part III: Implementation
  – Client/Server Processes
  – C/S Page Tables
  – Page Fault Handler
  – Consistency Mechanism
• Part IV: Accomplished Work
• Part V: Future Work

General Ideas
DSM Characteristics:
• Physically: distributed memory
• Logically: a single shared address space

[Figure: a software DSM layer spanning processors P1, P2, P3, ..., Pn-1 and their memories M1, M2, M3, ..., Mn-1]

Structure of DSM
[Figure: DSM structure. Labels: CPU ... CPU, Memory Bus, Memory, DSM, Hardware, Network]

+ Simpler abstraction
+ Possibly better performance: larger memory space, no need to do paging on disk
+ Process migration simplified: a process can easily be moved to a different machine since all machines share the same address space
- Long shared memory accesses can be a bottleneck!

Related Work
Models and Main Features:
• IVY (Yale) - Divided space: shared & private space
• Mirage (UCLA) - Time interval d: avoids page thrashing
• TreadMarks (Rice) - Lazy release consistency: improves efficiency
• SAM (Stanford)

Sample Operation
[Figure: sample operation. Steps shown: connect, connect, Get Addr., Fetch Page]

Implementation

• Server Process and Client Process
• Server Page Table and Client Page Table
• Page Fault Handler
• Consistency Mechanism

Client/Server Process

• Client Process

C/S Process
• Server Process
  – Listening Thread: listens for new incoming requests
  – Page Table Thread: page table management
  – C/S Communication: processes requests from Client 1, Client 2, ..., Client n

C/S Page Table
• Server Page Table
  – Client data
    • Client IDs, IP addresses
  – Page info for all pages
    • Sets the number of pages/frames
    • Owners / Prot bits / frame mappings (all)
  – Does not care about the underlying storage on each client
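
The server-side bookkeeping listed above could be modeled roughly as follows. This is a minimal sketch, not the authors' code: the class name `ServerPageTable`, its field and method names, the `PROT_*` stand-in constants, and the example address are all assumptions made for illustration. The table records where each page lives but never holds page contents.

```python
from dataclasses import dataclass, field

# Stand-ins for the POSIX protection bits named on the slides (assumed values).
PROT_NONE, PROT_READ, PROT_WRITE = 0, 1, 2


@dataclass
class ServerPageTable:
    """Hypothetical server-side table: who is connected and where every page lives."""
    npages: int
    clients: dict = field(default_factory=dict)  # client ID -> IP address
    owners: dict = field(default_factory=dict)   # page # -> owning client ID
    frames: dict = field(default_factory=dict)   # page # -> frame # on the owner
    bits: dict = field(default_factory=dict)     # page # -> protection bits

    def register(self, client_id, ip):
        self.clients[client_id] = ip

    def place(self, page, owner, frame):
        """Record that `page` lives in `frame` on client `owner`."""
        if not 0 <= page < self.npages:
            raise ValueError("page out of range")
        self.owners[page] = owner
        self.frames[page] = frame
        self.bits[page] = PROT_NONE

    def locate(self, page):
        """Return (owner IP, frame #), which is what a faulting client needs."""
        owner = self.owners[page]
        return self.clients[owner], self.frames[page]


table = ServerPageTable(npages=8)
table.register(1, "10.0.0.1")          # made-up address
table.place(page=3, owner=1, frame=0)
```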

C/S Page Table
• Client Page Table
  – Storage info
    • Pointer to the physical memory
    • Address space of the virtual memory
  – Page info for local pages
    • Self-owned pages
    • Cached pages
    • Owners / Prot bits / frame mappings (local)
  – Does not track pages that have not been visited
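
One way to picture a client table that only knows about local pages is a sparse map in which unvisited pages simply have no entry. The sketch below is an assumption-laden illustration: `ClientPageTable`, `add`, `prot`, and `is_cached` are hypothetical names, and the `PROT_*` constants are local stand-ins.

```python
PROT_NONE, PROT_READ, PROT_WRITE = 0, 1, 2  # stand-ins for the POSIX bits


class ClientPageTable:
    """Hypothetical client-side table that tracks only locally present pages."""

    def __init__(self, client_id):
        self.client_id = client_id
        self.mapping = {}  # page # -> local frame #
        self.owners = {}   # page # -> owning client ID
        self.bits = {}     # page # -> protection bits

    def add(self, page, frame, owner, bits):
        self.mapping[page] = frame
        self.owners[page] = owner
        self.bits[page] = bits

    def prot(self, page):
        # Pages never visited have no entry and are treated as PROT_NONE.
        return self.bits.get(page, PROT_NONE)

    def is_cached(self, page):
        # Cached page: present locally but owned by another client.
        return page in self.owners and self.owners[page] != self.client_id


pt = ClientPageTable(client_id=1)
pt.add(page=0, frame=0, owner=1, bits=PROT_READ | PROT_WRITE)  # self-owned
pt.add(page=5, frame=1, owner=2, bits=PROT_READ)               # cached copy
```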

Page Fault Handler

• Fetch the IP address and frame #
• Clone the demanded page
• Update Prot bits
• Execute operations A B C…
• Write back dirty pages (on writes)
• Restore Prot bits
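
The six steps above can be walked through with an in-memory simulation. This is only a sketch under stated assumptions: no real page fault is trapped, `handle_fault` and all of its parameters are hypothetical, and the dictionaries stand in for the server lookup and the remote owner's memory.

```python
PROT_NONE, PROT_READ, PROT_WRITE = 0, 1, 2  # stand-ins for the POSIX bits


def handle_fault(page, directory, remote, local, bits, op, write=False):
    """Walk the six steps for one faulting page (simulation only).

    directory: page # -> (owner address, frame #), as the server would report
    remote:    (owner address, frame #) -> bytearray holding the owner's copy
    local:     page # -> bytearray, this client's cloned pages
    bits:      page # -> protection bits on this client
    op:        callable applied to the local copy (the "operations A B C")
    """
    saved = bits.get(page, PROT_NONE)
    addr, frame = directory[page]                    # 1. fetch IP address and frame #
    local[page] = bytearray(remote[(addr, frame)])   # 2. clone the demanded page
    bits[page] = PROT_READ | (PROT_WRITE if write else 0)  # 3. update prot bits
    try:
        result = op(local[page])                     # 4. execute operations
        if write and local[page] != remote[(addr, frame)]:
            remote[(addr, frame)][:] = local[page]   # 5. write back the dirty page
    finally:
        bits[page] = saved                           # 6. restore prot bits
    return result


def overwrite(buf):
    buf[:3] = b"new"
    return bytes(buf)


directory = {7: ("10.0.0.2", 0)}                     # made-up owner address
remote = {("10.0.0.2", 0): bytearray(b"old!")}
local, bits = {}, {}
out = handle_fault(7, directory, remote, local, bits, overwrite, write=True)
```

The `try`/`finally` is a design choice of the sketch: it restores the protection bits even if the operation raises.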

Consistency Mechanism
• Single Writer / Multiple Readers
  – Snapshot, one writer allowed
• Page Modification
  – Two-fold role of local frames
    • Two reads should return the same value
  – Occurrences
    • Write operations
    • Page replacement
  – Block modifications to pages that are in use
    • Variable: use_counts (> 0 ? wait or redo : modify OK)
    • fcntl lock (modifications suspended)
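
As a rough illustration of the fcntl-lock idea, the sketch below suspends competing modifications by taking an exclusive advisory lock around an update. It assumes a Unix system and uses a temporary file as a hypothetical stand-in for one shared page; `locked_update` is not a name from the slides. fcntl locks are advisory and held per process, so they only block other processes that also take the lock.

```python
import fcntl
import os
import tempfile


def locked_update(path, offset, data):
    """Overwrite bytes at `offset` while holding an exclusive fcntl lock."""
    with open(path, "r+b") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)      # blocks while another process holds the lock
        try:
            f.seek(offset)
            f.write(data)
            f.flush()
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)  # let waiting modifiers proceed


# Hypothetical backing file standing in for one shared page.
fd, page_file = tempfile.mkstemp()
os.write(fd, b"\0" * 16)
os.close(fd)

locked_update(page_file, 4, b"DSM")
with open(page_file, "rb") as f:
    contents = f.read()
os.remove(page_file)
```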

Accomplished Work
Client/Server
• Ready (bima.helios.nd.edu)
• Communicating
  – Client 1 (chopin.helios.nd.edu)
  – Client 2 (mozart.helios.nd.edu)

Future Work

• Page Fault Handler Implementation
• Testing Plan
  – Inspired by JUMP (Univ. of Hong Kong)
    • Similar mechanism: file locks to keep consistency
    • Source code available
    • Relatively new: 2001
  – Compare the performance of the same application on:
    • DSM vs. single machine
    • Different settings
    • Different DSM systems

Appendix

Algorithms

Implementation
• Central Server Algorithm
• Migration Algorithm
• Read-Replication Algorithm
• Full-Replication Algorithm

               Non-Replicated    Replicated
Non-Migrated   Central           Full Replication
Migrated       Migration         Read Replication

Consistency Model

• Strict Consistency
• Causal Consistency
• Weak Consistency
• Release Consistency

Granularity
• Granularity: size of the shared memory unit
• Large page size:
  + Less per-page overhead, since there are fewer pages to manage
  - Greater chance of contention, since many processes access the same page
• Smaller page size:
  + Less apt to cause contention (reduces the likelihood of false sharing)
  - Higher overhead
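
The trade-off can be made concrete with a little arithmetic. In this sketch the addresses, page sizes, and region size are invented for illustration, and `page_of` and `falsely_shared` are hypothetical helpers.

```python
def page_of(address, page_size):
    """Page number that contains a byte address."""
    return address // page_size


def falsely_shared(addr_a, addr_b, page_size):
    """True when two distinct variables land on the same page."""
    return addr_a != addr_b and page_of(addr_a, page_size) == page_of(addr_b, page_size)


# Two unrelated variables 1 KiB apart (addresses are made up).
x_addr, y_addr = 0x1000, 0x1400
large = falsely_shared(x_addr, y_addr, page_size=4096)  # both on one 4 KiB page
small = falsely_shared(x_addr, y_addr, page_size=512)   # on different 512 B pages

# Pages to track for a 64 KiB shared region at each granularity.
pages_large = 65536 // 4096
pages_small = 65536 // 512
```

With 4 KiB pages the two variables contend for one page but only 16 pages need tracking; with 512 B pages the contention disappears at the cost of tracking 128 pages.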

C/S Page Table
• Server Page Table Entries
  – npages, nframes
    • nframes: set by the server owner
    • npages: decided by the number of clients connected
  – client_addresses
    • Accessed when a page fault occurs
  – Full_page_mappings
    • Records the frame # in which each page is located
  – Full_page_bits
    • Indicates the usage status of each page

C/S Page Table
• Client Page Table Entries
  – Client_id (int)
    • Assigned by the server
  – nframes, npages (int)
    • Constant values configured on the server
  – Physmem
    • Actual allocated physical address / frame space
    • PROT_READ|PROT_WRITE
  – Virtmem
    • Virtual memory address range
    • PROT_NONE, MAP_NORESERVE

C/S Page Table
• Client Page Table Entries (continued)
  – Local_Page_mapping
    • Page # -> frame # the page is loaded in
  – local_page_bits (PROT_NONE for unknown pages)
    • PROT_NONE, PROT_WRITE, PROT_READ, PROT_READ|PROT_WRITE
  – local_page_owners
    • Separates the pages owned by the local client from the pages loaded from other clients
  – Use_status (int)
    • Indicates whether the owned pages are being cached or written by other clients
  – Page_fault_handler
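
The Physmem and page-mapping entries above might look like the following frame pool. This is a sketch, not the authors' implementation: it uses an anonymous read/write mapping from Python's `mmap` module for the frames, `NFRAMES`, `load_page`, and `read_page` are hypothetical, and the PROT_NONE Virtmem range with its fault trapping is not modeled here.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE
NFRAMES = 4  # assumed; the slides say nframes is configured on the server

# Anonymous mapping standing in for Physmem (readable and writable).
physmem = mmap.mmap(-1, NFRAMES * PAGE_SIZE)
local_page_mapping = {}  # page # -> frame #


def load_page(page, frame, data):
    """Copy page contents into a frame and record the page -> frame mapping."""
    if not 0 <= frame < NFRAMES or len(data) > PAGE_SIZE:
        raise ValueError("bad frame number or oversized page data")
    start = frame * PAGE_SIZE
    physmem[start:start + len(data)] = data
    local_page_mapping[page] = frame


def read_page(page, length):
    """Read the first `length` bytes of a locally loaded page."""
    start = local_page_mapping[page] * PAGE_SIZE
    return physmem[start:start + length]


load_page(page=12, frame=2, data=b"hello")
```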

Page Fault Handler

• Read attempts
  – Send PAGE_REQ to the server
  – Fetch the corresponding client address and the frame #
  – Modify the server page table
    • page_bits -> PROT_READ
  – Clone the page from the page owner to the local frame x
  – Modify the local page table
    • Page_mapping -> frame x
    • Page_owner -> remote client ID
    • Page_bits -> PROT_READ
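
The read path above can be traced with two small in-process objects. Everything here is a simulation under assumptions: there are no sockets, the `Server` and `Client` classes and their fields are hypothetical, and only the `PAGE_REQ` message name comes from the slides.

```python
PROT_READ = 1  # stand-in for the POSIX bit


class Server:
    def __init__(self):
        self.addresses = {}  # client ID -> address
        self.location = {}   # page # -> (owner ID, frame #)
        self.page_bits = {}  # page # -> protection bits

    def handle(self, msg, page):
        if msg != "PAGE_REQ":
            raise ValueError(msg)
        owner, frame = self.location[page]
        self.page_bits[page] = PROT_READ   # server table: page is being read
        return owner, self.addresses[owner], frame


class Client:
    def __init__(self, cid, nframes):
        self.cid = cid
        self.frames = [b""] * nframes
        self.page_mapping, self.page_owner, self.page_bits = {}, {}, {}

    def read_fault(self, page, server, peers, local_frame):
        owner, addr, frame = server.handle("PAGE_REQ", page)
        self.frames[local_frame] = peers[addr].frames[frame]  # clone from the owner
        self.page_mapping[page] = local_frame
        self.page_owner[page] = owner
        self.page_bits[page] = PROT_READ
        return self.frames[local_frame]


owner = Client(cid=1, nframes=2)
owner.frames[1] = b"page-9 data"
reader = Client(cid=2, nframes=2)
server = Server()
server.addresses = {1: "10.0.0.1", 2: "10.0.0.2"}  # made-up addresses
server.location[9] = (1, 1)
peers = {"10.0.0.1": owner, "10.0.0.2": reader}
data = reader.read_fault(9, server, peers, local_frame=0)
```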

Memory Consistency

• Writing
  – Snapshot
    • Client processes that have cloned the page to local memory will not see the change until they are notified after the write completes
  – Only one concurrent writer
    • The server page table bits are set to PROT_READ|PROT_WRITE; other write requests to the same page are delayed until the writing program exits
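
A single-writer rule of this kind might be enforced on the server roughly as sketched below. The `WriteArbiter` class, the FIFO ordering of delayed writers, and resetting the bits to PROT_READ after a write are all assumptions of this sketch, not details given on the slides.

```python
from collections import deque

PROT_READ, PROT_WRITE = 1, 2  # stand-ins for the POSIX bits


class WriteArbiter:
    """Hypothetical server-side gate: one writer per page, others wait in FIFO order."""

    def __init__(self):
        self.bits = {}     # page # -> protection bits in the server table
        self.writer = {}   # page # -> client ID currently writing
        self.waiting = {}  # page # -> deque of delayed client IDs

    def request_write(self, page, client):
        if page in self.writer:
            self.waiting.setdefault(page, deque()).append(client)
            return False                   # delayed until the writer finishes
        self.writer[page] = client
        self.bits[page] = PROT_READ | PROT_WRITE
        return True

    def finish_write(self, page, client):
        if self.writer.get(page) != client:
            raise ValueError("client is not the current writer")
        del self.writer[page]
        self.bits[page] = PROT_READ        # assumed state once the write is done
        queue = self.waiting.get(page)
        if queue:
            nxt = queue.popleft()
            self.request_write(page, nxt)  # promote the next delayed writer
            return nxt
        return None


gate = WriteArbiter()
first = gate.request_write(4, client=1)    # granted
second = gate.request_write(4, client=2)   # delayed
promoted = gate.finish_write(4, client=1)  # client 2 becomes the writer
```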

Memory Consistency

• Page Replacement
  – Two consecutive reads of a page should return the same value if no write is requested
  – A local frame being read or written by other clients will not be replaced
  – Use_status
    • == 0: no other clients are using this page
    • > 0: the number of clients that are reading or writing this page
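
The replacement rule above amounts to skipping any frame whose page has a non-zero use count. A minimal sketch, assuming a first-fit scan in frame order (the slides do not name a replacement policy) and a hypothetical `pick_victim` helper:

```python
def pick_victim(frames, use_status):
    """Return the first frame whose page no other client is using, else None.

    frames:     frame # -> page # currently loaded there
    use_status: page # -> number of other clients reading or writing it
    """
    for frame, page in sorted(frames.items()):
        if use_status.get(page, 0) == 0:
            return frame
    return None


frames = {0: 10, 1: 11, 2: 12}
use_status = {10: 2, 11: 0, 12: 1}
victim = pick_victim(frames, use_status)     # frame 1 holds the only unused page

use_status[11] = 1                           # another client starts using page 11
none_free = pick_victim(frames, use_status)  # nothing can be replaced now
```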