MPJ: The second generation ‘MPI for Java’

MPJ: The second generation ‘MPI for Java’. Aamir Shafi, 26th April, 2005. Distributed Systems Group, http://dsg.port.ac.uk

Transcript of MPJ: The second generation ‘MPI for Java’

Page 1: MPJ: The second generation ‘MPI for Java’

MPJ: The second generation ‘MPI for Java’

Aamir Shafi
26th April, 2005

Distributed Systems Group
http://dsg.port.ac.uk

Page 2: MPJ: The second generation ‘MPI for Java’

April 19, 2023 2

People
• Aamir Shafi
• Bryan Carpenter:
  – Open Middleware Infrastructure Institute (OMII)
• Mark Baker

Page 3: MPJ: The second generation ‘MPI for Java’

Presentation outline
• Introduction
• Design and implementation of MPJ
• The runtime infrastructure
• Implementation issues
• Conclusion

Page 4: MPJ: The second generation ‘MPI for Java’

Introduction
• MPI was introduced in June 1994 as a standard message passing API for parallel scientific computing:
  – Language bindings for C, C++, and Fortran
• The ‘Java Grande Message Passing Workgroup’ defined Java bindings in 1998
• Previous efforts follow two approaches:
  – JNI approach
  – Pure Java approach:
    • Remote Method Invocation (RMI)
    • Sockets

Page 5: MPJ: The second generation ‘MPI for Java’

Introduction: Pure Java approach
• RMI:
  – Meant for client-server applications
• Java Sockets:
  – Java New I/O (NIO) package:
    • Adds non-blocking I/O to the Java language
    • Direct buffers:
      – Allocated in native OS memory, where the JVM attempts to provide faster I/O
• Communication performance:
  – Comparison of Java NIO and C Netpipe drivers
  – Java performs similarly to C on Fast Ethernet:
    • A very naïve comparison
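
The two NIO features named on this slide can be sketched as follows. This is an illustrative loopback example only, not code from MPJ or from the Netpipe drivers; the class name `NioLoopbackSketch` and the method `roundTrip` are made up for this sketch. It sends a small message through a direct buffer and reads it back on a channel switched to non-blocking mode.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: direct buffers plus a non-blocking read over loopback.
class NioLoopbackSketch {

    // Sends a small message over a loopback connection and returns what was read.
    // Intended for short messages only: the sender finishes writing before the
    // receiver starts reading, so the payload must fit in the socket buffers.
    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: any free port
            try (SocketChannel sender = SocketChannel.open(server.getLocalAddress());
                 SocketChannel receiver = server.accept()) {

                // Direct buffer: allocated outside the Java heap.
                ByteBuffer out = ByteBuffer.allocateDirect(payload.length);
                out.put(payload);
                out.flip();
                while (out.hasRemaining()) {
                    sender.write(out);
                }

                // Non-blocking mode: read() returns 0 instead of waiting for data.
                receiver.configureBlocking(false);
                ByteBuffer in = ByteBuffer.allocateDirect(payload.length);
                while (in.hasRemaining()) {
                    int n = receiver.read(in);
                    if (n < 0) {
                        break;          // peer closed the connection
                    }
                    if (n == 0) {
                        Thread.yield(); // nothing yet; real code would use a Selector
                    }
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Polling in a loop keeps the sketch short; a production device would more likely register the channel with a `java.nio.channels.Selector` and react to readiness events.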

Page 6: MPJ: The second generation ‘MPI for Java’

1. The latency is ~250 microseconds

2. After 1k, the latency starts increasing due to fragmentation of packets

3. Netpipe is a single-threaded simple benchmark

Page 7: MPJ: The second generation ‘MPI for Java’

1. Max throughput is ~90 Mbps

2. It would be great if MPJ, with all its complexities, could reach ~80 Mbps

Page 8: MPJ: The second generation ‘MPI for Java’

Introduction: JNI approach
• Importance of JNI cannot be ignored:
  – Where Java fails, JNI makes it work
• HPC communication hardware has continued to advance:
  – Network latency has been reduced to a couple of microseconds
• ‘Pure Java’ looks like an impractical solution:
  – In the presence of Myrinet, no application developer/user would opt for Fast Ethernet
• Cons:
  – Not in keeping with the Java philosophy of ‘write once, run anywhere’

Page 9: MPJ: The second generation ‘MPI for Java’

Introduction
• For Java messaging:
  – There is no ‘one size fits all’ approach
• Portability and high performance are often contradictory requirements:
  – Portability: Pure Java
  – High Performance: JNI
• The choice between portability and high performance should best be left to application developers
• The challenging issue is how to manage these contradictory requirements:
  – How to provide a flexible mechanism to help applications swap communication protocols?

Page 10: MPJ: The second generation ‘MPI for Java’

Presentation outline
• Introduction
• Design and implementation
• The runtime infrastructure
• Implementation issues
• Conclusion

Page 11: MPJ: The second generation ‘MPI for Java’

Design
• Aims:
  – Support swapping various communication devices
• Two device levels:
  – The MPJ Device level (mpjdev):
    • Separates native MPI device from all other devices
    • ‘native MPI’ device is a special case:
      – Possible to cut through and make use of native implementation of advanced MPI features
  – The xdev Device level (xdev):
    • ‘gmdev’ – xdev based on GM 2.x comms library
    • ‘niodev’ – xdev based on Java NIO API
    • ‘smpdev’ – xdev based on Threads API
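
One way to picture device swapping is a small interface with implementations selected by name at startup. The sketch below is an assumption made for illustration: `CommDevice`, `InMemoryDevice`, and `DeviceFactory` are hypothetical names, not the actual mpjdev or xdev API, and only a toy in-memory device is registered under the name ‘smpdev’.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Supplier;

// Hypothetical device abstraction; NOT the real xdev interface.
interface CommDevice {
    void send(int destRank, byte[] message);
    byte[] recv(int myRank);
}

// Toy device: one in-memory mailbox per rank, all inside a single JVM.
class InMemoryDevice implements CommDevice {
    private final Map<Integer, BlockingQueue<byte[]>> mailboxes = new ConcurrentHashMap<>();

    private BlockingQueue<byte[]> mailbox(int rank) {
        return mailboxes.computeIfAbsent(rank, r -> new LinkedBlockingQueue<>());
    }

    @Override
    public void send(int destRank, byte[] message) {
        mailbox(destRank).add(message);
    }

    @Override
    public byte[] recv(int myRank) {
        try {
            return mailbox(myRank).take(); // blocks until a message arrives
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while receiving", e);
        }
    }
}

// Maps a device name (as a user might pass on the command line) to an implementation.
class DeviceFactory {
    private static final Map<String, Supplier<CommDevice>> REGISTRY = new HashMap<>();

    static {
        // Only the toy device is registered; other names would be added the same way.
        REGISTRY.put("smpdev", InMemoryDevice::new);
    }

    static CommDevice create(String name) {
        Supplier<CommDevice> supplier = REGISTRY.get(name);
        if (supplier == null) {
            throw new IllegalArgumentException("Unknown device: " + name);
        }
        return supplier.get();
    }
}
```

Code written against `CommDevice` does not change when a different name is passed to `DeviceFactory.create`, which is the property the design aims for.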

Page 12: MPJ: The second generation ‘MPI for Java’

MPJ design (figure: layered architecture diagram). Labels recovered from the figure:
• MPJ API
• MPJ collective communications (High level)
• MPJ point to point communications (Base level)
• mpjdev (MPJ Device level)
• xdev: niodev, gmdev, smpdev
• Native MPI
• Java NIO, Threads API, JNI
• Java Virtual Machine (JVM)
• Hardware (NIC, Memory etc)

Page 13: MPJ: The second generation ‘MPI for Java’

Implementation
• Point to point communications
• Collective communications
• Groups, communicators, and contexts
• Derived datatypes:
  – Vector, Indexed, Contiguous, and Struct
  – Explicit packing and unpacking
• Process Topologies:
  – Cartesian
  – Graph
• Possible to cut through to the native MPI implementation
• As of today, three methods (Dims_create, Cancel, and Wtick) are left unimplemented

Page 14: MPJ: The second generation ‘MPI for Java’

Presentation outline
• Introduction
• Design and implementation
• The runtime infrastructure
• Implementation issues
• Conclusion

Page 15: MPJ: The second generation ‘MPI for Java’

The runtime infrastructure

• All MPI libraries face the task of bootstrapping MPI processes over networked computers:
  – RSH/SSH based scripts are the most common
• LAM/MPI daemons and runtime system work on UNIX based OSes:
  – No version of LAM for Windows
• MPICH has recently introduced SMPD (Super Multi Purpose Daemon):
  – According to docs:
    • Works on Linux and Windows
• Difficult (if not impossible) to interface with Java

Page 16: MPJ: The second generation ‘MPI for Java’

Runtime: MPJDaemon and MPJStarter modules
• Consists of two modules:
  – The daemon that runs on compute nodes (MPJDaemon)
  – The starter module that runs on head nodes (MPJStarter)
• Installing MPJDaemon on compute nodes:
  – RSH/SSH based scripts can easily install the daemon on UNIX based OSes:
    • Could be installed as services (/etc/init.d)
  – Two files are required to install as a service on Windows

Page 17: MPJ: The second generation ‘MPI for Java’

Runtime: MPJDaemon on UNIX based OSes
• $MPJ_HOME/bin/mpjdaemon is an rc shell script that starts and stops the daemon
• Installation as an app:
  – ‘cd $MPJ_HOME/bin’
  – ‘./mpjdaemon start’
  – Could use RSH/SSH script to install on whole UNIX cluster
• Installation as a service:
  – ‘cp $MPJ_HOME/bin/mpjdaemon /etc/init.d’
  – Adding to the default runlevel:
    • ‘rc-update add mpjdaemon default’ (Gentoo Linux)
  – ‘/etc/init.d/mpjdaemon start/stop/status’

Page 18: MPJ: The second generation ‘MPI for Java’

Runtime: MPJDaemon on Windows

• ‘cd %MPJ_HOME%/bin’
• ‘InstallMPJDaemon-NT.bat’:
  – This bat file installs the daemon as a service

Page 19: MPJ: The second generation ‘MPI for Java’

Runtime: MPJDaemon as services
• Apache Commons Daemon:
  – The source bundle does not even compile
  – The project is no longer active
  – Spent a week trying to make it work on Windows:
    • Gave up!
• Java Service Wrapper:
  – Simple and does what it says
  – Support available for almost all platforms (where you can run Java)
  – Distributed under MIT License:
    • Redistribute without any restrictions

Page 20: MPJ: The second generation ‘MPI for Java’

Runtime: JMX M&M
• Provides monitoring and management of Java apps:
  – Start the Java app with the following switch:
    • -Dcom.sun.management.jmxremote
• Run ‘jconsole’:
  – Possible to connect to remote and local JVMs
• Useful if the application is an MBean:
  – Application attributes can be read and set remotely
• Possibility:
  – MPJDaemon could be operated remotely
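
To make the MBean idea concrete, here is a minimal sketch of a standard MBean whose attribute is read and changed through the platform MBean server, which is the same path a tool such as jconsole uses. The names `DaemonSettings`, `DaemonSettingsMBean`, the `Port` attribute, and the object name `example.mpj:type=DaemonSettings` are assumptions made up for this example; nothing here is taken from the MPJDaemon source.

```java
import java.lang.management.ManagementFactory;
import javax.management.Attribute;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxSketch {

    // Standard MBean convention: the interface is named <ImplClass>MBean and is public.
    public interface DaemonSettingsMBean {
        int getPort();
        void setPort(int port);
    }

    // Hypothetical managed object exposing one read/write attribute, "Port".
    public static class DaemonSettings implements DaemonSettingsMBean {
        private volatile int port;

        public DaemonSettings(int port) {
            this.port = port;
        }

        @Override
        public int getPort() {
            return port;
        }

        @Override
        public void setPort(int port) {
            this.port = port;
        }
    }

    // Registers the MBean, sets the attribute via the MBean server, and reads it back.
    static int registerAndUpdate(int initialPort, int newPort) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("example.mpj:type=DaemonSettings");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name); // keep the sketch re-runnable
            }
            server.registerMBean(new DaemonSettings(initialPort), name);
            server.setAttribute(name, new Attribute("Port", newPort));
            return (Integer) server.getAttribute(name, "Port");
        } catch (Exception e) {
            throw new IllegalStateException("JMX operation failed", e);
        }
    }
}
```

With the JVM started using the switch shown on the slide, the same attribute would appear in jconsole under the chosen object name.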

Page 21: MPJ: The second generation ‘MPI for Java’

JMX M&M: Connection GUI

Page 22: MPJ: The second generation ‘MPI for Java’

JMX M&M: Connection summary

Page 23: MPJ: The second generation ‘MPI for Java’

JMX M&M: JVM memory

Page 24: MPJ: The second generation ‘MPI for Java’

JMX M&M: JVM threads

Page 25: MPJ: The second generation ‘MPI for Java’

JMX M&M: JVM info.

Page 26: MPJ: The second generation ‘MPI for Java’

Runtime: Dynamic class loading (1)

• The application (parallel program) and the MPJ library are dynamically loaded into the daemon JVM:
  – No need to copy jar files
  – No shared file system assumption
• MPJStarter starts a light-weight HTTP server (Jetty), which serves the jar file containing the parallel program

Page 27: MPJ: The second generation ‘MPI for Java’

Runtime: Dynamic class loading (2)

• For example, ‘HiMPJ.java’ is a parallel program:
  – Requires mpj.jar to compile and run
• Bundle it into a jar file, specifying a manifest file with a Class-Path attribute pointing to mpj.jar:
  – Write the manifest file:
    • Manifest-Version: 1.0
    • Main-Class: HiMPJ
    • Class-Path: mpj.jar
  – ‘jar -cfm himpj.jar manifest HiMPJ.class’
• Copy it to the $MPJ_HOME/lib directory
• Executing MPJStarter:
  – ‘cd $MPJ_HOME/bin’
  – ‘starter.[sh/bat] 2 himpj.jar ../lib -xdev niodev’
• JarClassLoader will load himpj.jar and mpj.jar into the daemon's JVM
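
The load-then-invoke step can be sketched with the standard `URLClassLoader` and reflection. This is not the MPJ JarClassLoader; `JarLaunchSketch`, `loaderFor`, `invokeMain`, and the stand-in program `Hello` are hypothetical names used only to show the mechanism of loading a class through a given loader and calling its `main` method.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

class JarLaunchSketch {

    // Stand-in for a parallel program; it only records its first argument.
    public static class Hello {
        public static volatile String lastArg;

        public static void main(String[] args) {
            lastArg = args.length > 0 ? args[0] : null;
        }
    }

    // Builds a loader that fetches classes from the given jar URLs,
    // for example jars served over HTTP by a starter process.
    static URLClassLoader loaderFor(URL... jarUrls) {
        return new URLClassLoader(jarUrls, JarLaunchSketch.class.getClassLoader());
    }

    // Loads mainClass through the given loader and calls its static main(String[]).
    static void invokeMain(ClassLoader loader, String mainClass, String[] args) {
        try {
            Class<?> cls = Class.forName(mainClass, true, loader);
            Method main = cls.getMethod("main", String[].class);
            main.invoke(null, (Object) args); // cast so args is passed as one parameter
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not start " + mainClass, e);
        }
    }
}
```

In a daemon, the loader passed to `invokeMain` would come from `loaderFor` with the URL of the served jar, and the class name would be the jar's Main-Class entry; here the current class loader and the nested `Hello` class are enough to exercise the reflection path.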

Page 28: MPJ: The second generation ‘MPI for Java’

Presentation outline
• Introduction
• Design and implementation
• The runtime infrastructure
• Implementation issues
• Conclusion

Page 29: MPJ: The second generation ‘MPI for Java’

Issue 1: Shared memory device

• Based on Java Threads API:
  – Each thread is an MPI process
  – Communicates with other threads by sending messages
• All threads run in the same JVM:
  – Cannot have static variables in the parallel program
  – Static variables within the MPJ library require synchronized access
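
The thread-per-process model can be illustrated with a small ring exchange. This is a sketch under stated assumptions, not the smpdev implementation: `ThreadRanksSketch` and `ringExchange` are hypothetical names, and each rank's mailbox is a plain `BlockingQueue`. Note that every per-rank value is held in a local variable, since a static field would be shared by all ranks in the JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ThreadRanksSketch {

    // Starts `size` threads, each acting as one rank. Every rank sends its own
    // rank number to its right-hand neighbour and receives from its left-hand one.
    // Returns, for each rank, the value that rank received.
    static int[] ringExchange(int size) {
        List<BlockingQueue<Integer>> mailboxes = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            mailboxes.add(new LinkedBlockingQueue<>());
        }

        int[] received = new int[size];
        Thread[] threads = new Thread[size];
        for (int r = 0; r < size; r++) {
            final int rank = r; // per-rank state lives in locals, not statics
            threads[r] = new Thread(() -> {
                try {
                    mailboxes.get((rank + 1) % size).put(rank);  // send to neighbour
                    received[rank] = mailboxes.get(rank).take(); // receive own message
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[r].start();
        }

        for (Thread t : threads) {
            try {
                t.join(); // join makes each thread's writes visible to the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return received;
    }
}
```

With four ranks, rank 0 receives 3 and ranks 1 to 3 receive 0, 1, and 2 respectively.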

Page 30: MPJ: The second generation ‘MPI for Java’

Issue 2: Synchronization problems with threads in smpdev
• Each MPJDaemon is assigned a number of processes to be executed:
  – In case of smpdev, all processes run on the same machine
• MPJDaemon loads the parallel program:
  – ‘JarClassLoader.loadClass(parallelProgramName)’
• Once loaded, the program is started as follows:
  – ‘JarClassLoader.invokeClass(pClass, args)’

Page 31: MPJ: The second generation ‘MPI for Java’

Issue 2: Synchronization problems with threads in smpdev

• For example, MPJStarter requests MPJDaemons to start 2 processes (threads)
• MPJDaemon starts two threads, each of which first loads and then starts the program:
  – Processes (threads) started in this way do not share static variables and cannot synchronize
• In order to share static variables and synchronize on them, the class should be loaded just once and executed N times:
  – It was implemented in this way because niodev requires the exact opposite behaviour: no sharing of static variables
• Currently, the user specifies which device should be used:
  – In case of niodev, the loading is done twice
  – In case of smpdev, the loading is done only once

Page 32: MPJ: The second generation ‘MPI for Java’

Issue 3: ‘cygwin’
• If running MPJ on cygwin:
  – ‘chmod o+w $MPJ_HOME/logs’
  – ‘chmod a+x $MPJ_HOME/lib/*.dll’
• Is MPJDaemon a Windows service, or a Linux service on cygwin?

Page 33: MPJ: The second generation ‘MPI for Java’

(Future) Issue 4: Specifying multiple devices
• Currently, only one device can be specified:
  – Either niodev or smpdev will be selected as the primary comms device
• But for SMP clusters, it would be ideal:
  – To use smpdev on an SMP node
  – To use niodev/gmdev for internode comms

Page 34: MPJ: The second generation ‘MPI for Java’

(Future) Issue 5: Starting MPJ with native MPI device
• mpiJava/native MPI device uses ‘mpirun’ to bootstrap MPI processes:
  – To bring it in line with other devices, the native MPI device will have to be started by the MPJ runtime infrastructure

Page 35: MPJ: The second generation ‘MPI for Java’

Issue 6: Multiple users running MPJDaemons at the same time

• Install daemons as an app
• Agree on the port numbers

Page 36: MPJ: The second generation ‘MPI for Java’

Presentation outline
• Introduction
• Design and implementation
• The runtime infrastructure
• Implementation issues
• Conclusion

Page 37: MPJ: The second generation ‘MPI for Java’

Summary
• The key issue for Java messaging is not the debate between the pure Java and JNI approaches:
  – But providing a flexible mechanism to swap various comm protocols
• MPJ has a pluggable architecture:
  – We are implementing ‘niodev’, ‘gmdev’, ‘smpdev’, and the native MPI device
• MPJ runtime infrastructure allows bootstrapping MPI processes across various platforms:
  – MPJDaemons can be installed as native OS services

Page 38: MPJ: The second generation ‘MPI for Java’

Conclusions

• We are slowly but surely moving towards the first release of MPJ, the next generation of ‘MPI for Java’

• Current Status:
  – Unit Testing
• MPJ follows the same API as mpiJava:
  – The parallel applications built on top of mpiJava will work with MPJ
  – There are some differences in the API:
    • Bsend, and explicit packing/unpacking (see release docs for more details)

• Arguably, the first MPI library for Java that implements real messaging stuff in pure Java

Page 39: MPJ: The second generation ‘MPI for Java’

Questions?