October 17, 2001. MuPC Run Time System for UPC. Steve Seidel, Phil Merkey, Jeevan Savant, Kian Gap Lee...


Page 1: MuPC Run Time System for UPC

October 17, 2001

Steve Seidel, Phil Merkey, Jeevan Savant, Kian Gap Lee
Department of Computer Science
Michigan Technological University

Brian Wibecan, Program PI
Phil Becker, Program Manager
Kevin Harris, Bruce Trull, and Daniel Christians
Compaq UPC Development

Page 2: UPC designed by Carlson et al.

A “lightweight” extension of C for parallelism

A shared memory, multithreaded model

Arrays and pointers can be shared

Array distribution is semi-automatic

Remote references are automatically resolved

Parallel constructs include
– forall
– fence and split barrier

Built-ins for
– memory allocation/free
– locks

Page 3: Compaq's UPC compiler

UPC object code
– front end translates UPC source to EDG IL
– lowering phase converts UPC-specifics to standard EDG IL
– middle end converts EDG IL to GEM-compatible IL
– GEM back end converts GEM IL to Alpha object code

Each of the intermediate phases above has some UPC-specific components.

Alternative: “Bail out” after the lowering phase to produce C code that includes calls to a run time system.

Under discussion: EDG front end for UPC

Page 4: Run Time System Interface

The RTS interface is an evolving set of data objects and methods that captures the semantics of “UPC minus C”.

An RTS "reference implementation" was suggested by Harris.

A publicly available reference implementation will
– promote UPC code base, user base and platform base
– challenge MPI and OpenMP
– foster RTS evolution
– promote support for UPC tools

MuPC is MTU's run time system for UPC

Page 5: Run Time System Structure

Run time structures describing shared objects and globals are maintained.

References to nonlocal shared objects are made through get and put.

UPC barriers and fences are passed directly to the RTS.

The same is true of UPC calls to other built-in functions that provide locks and dynamic memory allocation.

Page 6: Available compiler technology

Proprietary Compaq compiler supports a proprietary RTS.

Reference compiler is not currently available, but ...

Compaq will provide a compiler that supports the reference RTS.

Page 7: MuPC Design Goals

Public availability

Wide platform base

Open source maintained by MTU

User-level implementation

Quick delivery

Efficiency is not a primary goal

Page 8: Available Platforms

MTU (on site):
– Beowulf cluster (64 nodes)
– Sun Enterprise 4500 (12 processors)
– SGI Origin 2000 (4 processors)
– Sun workstation networks (various)
– Linux workstation networks (various)
– AlphaServer and 2 workstations (provided by Compaq)

Remote:
– AlphaServer SC (Compaq)
– T3E (Cray)

Page 9: Transport vehicle selection

Candidates (with the main drawback of each)
– MPI: no one-sided communication
– MPI-2: incomplete implementations
– Pthreads: no multiprocessor support
– OpenMP: expensive, possibly incompatible
– shmem: limited platform base
– VIA: limited platform base
– ARMCI: limited user base
– TCP/IP: too low-level

Selection criteria
– Portability and availability: MPI, Pthreads, TCP/IP
– Technical shortcomings can be overcome

Page 10: MPI/Pthreads hybrid transport vehicle

MPI provides process control and interprocessor communication.

Pthreads provides multithreading within each process to handle asynchronous remote accesses.

The following are equivalent in MuPC:
– one MPI process
– one UPC thread (from the user’s point of view)
– one user Pthread + one MPI send/recv Pthread

Thread safety is provided by isolating all MPI calls in the send/recv Pthread.

Page 11: upcrun -np 3 upc-demo

[Figure: upcrun starts three processes. Each process calls MPI_init and pthread_create, then runs a user UPC thread alongside a send/recv thread, and ends with upc_finalize.]

Page 12: Example: Nonlocal array reference x=a[k];

    // User shared array
    shared int a[10][THREADS];
    // Frontend-generated temporary pointer
    shared int *UPC_RTS_ptr;
    ...
    // UPC source code:
    // x=a[k];
    // Front end computes address,
    // phase and thread of remote reference.
    UPC_RTS_ptr = (vaddr,phase,thread);
    // Call is made to get a[k]
    x = MuPC_get_sync_int(UPC_RTS_ptr);
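The (vaddr, phase, thread) triple in the example above can be illustrated with a small C sketch. The slides do not give the actual layout of the RTS pointer, so the `rts_ptr` struct and the `rts_locate` helper are hypothetical names, and the arithmetic assumes the standard UPC block-cyclic distribution (block size 1 is the default for a shared array declared without a layout qualifier).

```c
#include <stddef.h>

/* Hypothetical stand-in for the RTS pointer; the real MuPC layout
 * is not shown in the slides. */
typedef struct {
    size_t vaddr;   /* element offset within the owning thread's local storage */
    size_t phase;   /* position of the element inside its block */
    size_t thread;  /* UPC thread that has affinity to the element */
} rts_ptr;

/* Map a linear element index to (vaddr, phase, thread), assuming a
 * block-cyclic distribution with `block` elements per block (block >= 1)
 * dealt round-robin over `threads` threads (threads >= 1). */
static rts_ptr rts_locate(size_t index, size_t block, size_t threads)
{
    rts_ptr p;
    size_t blk = index / block;            /* global block number */

    p.phase  = index % block;              /* offset inside that block */
    p.thread = blk % threads;              /* blocks are dealt round-robin */
    p.vaddr  = (blk / threads) * block + p.phase;
    return p;
}
```

With 4 threads and block size 1, `rts_locate(7, 1, 4)` places element 7 on thread 3 at local offset 1 with phase 0. A front end would fill in such a triple before calling a get or put routine.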

Page 13: x = MuPC_get_sync_int(UPC_RTS_ptr p);

Pthread lock structs: send_lock, recv_lock

MuPC_get_sync_int (user thread):

    send_lock.type=GET
    send_lock.ptr=p
    wait on recv_lock.done
    x=recv_lock.data

Send/Recv Thr (requesting process):

    while (threads)
      case
        ...
        GET:   MPI_Send(p,RECV)
        ...
        REPLY: MPI_Recv(y)
               recv_lock.data=y
               recv_lock.done=T
        ...
    end while

Send/Recv Thr (remote process):

    while (threads)
      case
        ...
        RECV:  MPI_Recv(p)
               MPI_Send(*p,REPLY)
        ...
    end while
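The handoff between the user thread and the send/recv thread can be sketched in plain C with Pthreads. This is a single-process illustration, not MuPC code: the struct layouts, `demo_get_sync_int`, `demo_run`, and `fake_remote_read` are hypothetical, and `fake_remote_read` stands in for the MPI_Send/MPI_Recv round trip to the remote process. It assumes one user thread with at most one outstanding request.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mailboxes named after the slide's send_lock and recv_lock. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    bool pending;        /* a GET request is waiting to be serviced */
    bool shutdown;       /* tells the send/recv thread to exit */
    const int *ptr;      /* address of the value to fetch */
} send_box;

typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    bool done;           /* reply has arrived */
    int  data;           /* fetched value */
} recv_box;

static send_box send_lock = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER,
                              false, false, NULL };
static recv_box recv_lock = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER,
                              false, 0 };

/* Stand-in for the MPI round trip to the remote send/recv thread. */
static int fake_remote_read(const int *p) { return *p; }

/* Only this thread would touch the transport layer. */
static void *send_recv_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&send_lock.mu);
        while (!send_lock.pending && !send_lock.shutdown)
            pthread_cond_wait(&send_lock.cv, &send_lock.mu);
        if (!send_lock.pending) {            /* shutdown, nothing queued */
            pthread_mutex_unlock(&send_lock.mu);
            return NULL;
        }
        const int *p = send_lock.ptr;
        send_lock.pending = false;
        pthread_mutex_unlock(&send_lock.mu);

        int y = fake_remote_read(p);         /* GET ... REPLY */

        pthread_mutex_lock(&recv_lock.mu);
        recv_lock.data = y;
        recv_lock.done = true;
        pthread_cond_signal(&recv_lock.cv);
        pthread_mutex_unlock(&recv_lock.mu);
    }
}

/* User-thread side: post the request, then block until the reply arrives. */
static int demo_get_sync_int(const int *p)
{
    pthread_mutex_lock(&send_lock.mu);
    send_lock.ptr = p;
    send_lock.pending = true;
    pthread_cond_signal(&send_lock.cv);
    pthread_mutex_unlock(&send_lock.mu);

    pthread_mutex_lock(&recv_lock.mu);
    while (!recv_lock.done)
        pthread_cond_wait(&recv_lock.cv, &recv_lock.mu);
    int x = recv_lock.data;
    recv_lock.done = false;
    pthread_mutex_unlock(&recv_lock.mu);
    return x;
}

/* Start the send/recv thread, perform one get, shut down. Returns -1 on failure. */
static int demo_run(int value)
{
    pthread_t comm;
    int remote = value;

    send_lock.shutdown = false;
    if (pthread_create(&comm, NULL, send_recv_thread, NULL) != 0)
        return -1;
    int x = demo_get_sync_int(&remote);

    pthread_mutex_lock(&send_lock.mu);
    send_lock.shutdown = true;
    pthread_cond_signal(&send_lock.cv);
    pthread_mutex_unlock(&send_lock.mu);
    pthread_join(comm, NULL);
    return x;
}
```

Calling `demo_run(42)` returns 42 once the send/recv thread has serviced the request. Condition variables are used here for the "wait on recv_lock.done" step; the slides do not say which waiting mechanism MuPC uses.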


Page 18: Synthetic Testing

Pseudo-code walkthroughs of all MuPC functions

Synthetic test codes are C/MPI programs that call MuPC RTS routines directly.

Shared data is artificially allocated.

    // THREAD 0
    int a[10];
    ...
    // a[12]=42;
    index=12%10;
    thread=12/10;
    MuPC_put_integer(a,index,thread,42);
    ...

    // THREAD 1
    int a[10];
    ...
    // outcome is
    // a[2]=42;
    ...
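The index arithmetic in this synthetic test can be tried out in a single process. In the sketch below each row of `demo_mem` models one thread's private `int a[10]`. The names `demo_mem`, `demo_put_integer`, and `demo_global_store` are hypothetical, and `demo_put_integer` only mimics the role of `MuPC_put_integer` (it drops the array argument and does no communication).

```c
#include <stddef.h>

#define DEMO_THREADS 2    /* assumed number of UPC threads */
#define DEMO_BLOCK   10   /* elements held by each thread, as in "int a[10]" */

/* Row t models thread t's artificially allocated local array. */
static int demo_mem[DEMO_THREADS][DEMO_BLOCK];

/* Hypothetical stand-in for MuPC_put_integer: store `value` at
 * (thread, index). Returns 0 on success, -1 if out of range. */
static int demo_put_integer(size_t index, size_t thread, int value)
{
    if (thread >= DEMO_THREADS || index >= DEMO_BLOCK)
        return -1;
    demo_mem[thread][index] = value;
    return 0;
}

/* Translate the global assignment a[g] = value into a put, using the
 * same arithmetic as the slide: index = g % 10, thread = g / 10. */
static int demo_global_store(size_t g, int value)
{
    return demo_put_integer(g % DEMO_BLOCK, g / DEMO_BLOCK, value);
}
```

`demo_global_store(12, 42)` writes 42 into `demo_mem[1][2]`, matching the slide's outcome of `a[2]=42` on thread 1.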

Page 19: Integration Testing

Wrap gets, puts and notify/wait to conform to the RTS interface.

Integrate MuPC with front end ...
– ... data structures and globals
– ... initialization and finalization

Rewrite synthetic tests in UPC and compare to previous results.

Add built-in functions for
– locks
– memory allocation

Page 20: Full-scale Testing

MTU test kernels

GWU UPC test suite

Contributed UPC codes

Page 21: Documentation, Delivery, and Distribution

MuPC source

Front-end binaries for targeted platforms

Makefiles, release notes, etc.

Serve these items from MTU MuPC web site

Publish a description of MuPC

Page 22: Preliminary Work, Summer, 2001

RTS header files provided by Compaq

MPI-2 one-sided communication proposed as primary transport vehicle but current implementations do not meet full standard

MPI/Pthreads hybrid selected

Studied intermediate output of Compaq's UPC front end

Compaq hardware and software delivered

Single-threaded working environment verified

Accounts on AlphaServer SC also provided

Page 23: August 20-21, Nashua

Participants:
– Bill Carlson, Brian Wibecan, Kevin Harris, Phil Becker, Daniel Christians, Jim Bovay, Savant, Merkey, Seidel

Discussed RTS definition and UPC features per Wibecan's agenda.

Outcomes:
– MPI/Pthreads hybrid design feasible
– MuPC will include upccc and upcrun MPI wrappers
– Agreed on RTS and UPC feature interpretations
– MuPC efficiency and performance not highest priority
– Written meeting summary submitted to Compaq (Sept. 23, 2001)

Page 24: Current Work

Recent improvements:
– isolating MPI calls for thread safety
– send/recv threads yield control when there are no pending requests

Skeleton implementations of get/put, barrier, fence, and finalize have been scaled to over 30 nodes on MTU’s Beowulf cluster.

Page 25: Project Work Plan

Start date June 28, 2001

This plan is based on the Project Work Items specified in the March 27 RFP from Compaq and on the March 30 MTU Proposal.

Page 26: Completed Work Items (per MTU proposal)

1: (a) Review implementation methodologies
   (b) Identify development platforms
   (c) Align resources (staff and platforms)
   (d) Identify target platforms
   (e) Conclusion memo (sent 9/23/01)

2: Formal Work Plan and Agreement
– (Written version of this document)

4: Initial Design of Run Time System
– Design presented in Nashua on August 20, 2001

Page 27: Remaining Work Items (w/completion dates)

5: Development of remaining primary components (Jan. 1, 2002)
– (d) locks
– (e) complete gets and puts
– (b) memory allocation
– (f) utility functions

3: Test design and documentation (Feb. 1, 2002)
– This testing will be done concurrently with Item 5 above.
– (a) Synthetic testing
– (b) Integration testing
– (c) Full-scale testing

Page 28: Remaining Work Items (w/completion dates), continued

6: Public Interface development (April 1, 2002)
– (a) Makefiles, release notes, installation notes, etc.
– (b) Bundle all necessary software
– (c) Provide MTU-authored test codes and results
– (d) Release advance copies for review and comment

7: System Refinement and Delivery (June 1, 2002)
– (a) Release MuPC to the UPC Developers' Group
– (b) Maintain MuPC website at MTU
– (c) Publish description of MuPC

8: Completion Certification (June 28, 2002)
– (a) Final MuPC release by MTU

Page 29: MuPC Project Staff

Jeevan Savant, M.S. Graduate Student
– MuPC design and implementation
– (Items 5(b,d,e,f), 6(a,d), and 7(a,c) above)
– Support: 9 months, half-time

Kian Gap (Mark) Lee, M.S. Graduate Student
– MuPC testing and platform integration
– (Items 3(a,b,c), 6(b,c), 7(b,c) above)
– Support: 9 months, half-time

Phillip Merkey, Research Assistant Professor

Steven Seidel, Associate Professor

Page 30: Additional MTU UPC projects

Charles Wallace, Assistant Professor
– UPC memory models

Xiaodi (Lisa) Li, M.S. Graduate Student
– Benchmarking MuPC using one or two NAS parallel benchmarks

Yi (Leon) Liang, M.S. Graduate Student
– Pthreads-only MuPC RTS

Yongsheng Huang, M.S. Graduate Student
– UPC memory models, improving MuPC efficiency

Zhang Zhang, Ph.D. Graduate Student
– UPC memory models, improving MuPC efficiency