Breaking Through Serial Barriers: Scalable Hard Particle...

THE GLOTZER GROUP Breaking Through Serial Barriers: Scalable Hard Particle Monte Carlo Simulations with HOOMD-Blue Joshua A. Anderson, M. Eric Irrgang, Sharon C. Glotzer Anderson, J. A. et al., JCP 254, 27-38 (2013) Thursday, April 17, 14

Transcript of Breaking Through Serial Barriers: Scalable Hard Particle...

Page 1: Breaking Through Serial Barriers: Scalable Hard Particle Monte Carlo (on-demand.gputechconf.com/gtc/2014/presentations/S...)

THE GLOTZER GROUP

Breaking Through Serial Barriers: Scalable Hard Particle Monte Carlo Simulations with HOOMD-Blue

Joshua A. Anderson, M. Eric Irrgang, Sharon C. Glotzer

Anderson, J. A. et al., JCP 254, 27-38 (2013)

Page 2:

HPMC - Massively parallel MC on the GPU

• Hard Particle Monte Carlo plugin for HOOMD-blue
• 2D Shapes
  • Disk
  • Convex (Sphero)polygon
  • Concave polygon
  • Ellipse
• 3D Shapes
  • Sphere
  • Ellipsoid
  • Convex (Sphero)polyhedron
• NVT and NPT ensembles
• Frenkel-Ladd free energy
• Parallel execution on a single GPU
• Domain decomposition across multiple nodes (CPUs or GPUs)

[Figure panels with labels: β-Mn cP20 (A13), #P04, [100]. Sources: Damasceno et al., Science (2012); Engel M. et al., PRE 87, 042134 (2013); Damasceno, P. F. et al., ACS Nano 6, 609 (2012)]

Page 6:

GPU parallel Monte Carlo

• Store lists of active cells
• Compute cell list
• Compute extended cell list
• For i in [0...nselect)
  • Loop through checkerboards in a shuffled order
    • For each active cell c in parallel
      • RNG rng(c, i, step)
      • Choose one particle p
      • Choose a random trial move p’
      • Check all particles in the extended cell list for overlaps with p’
      • If p’ remains in cell and no overlaps
        • p’ -> p
• Translate all particles by a random vector
• Challenges: expensive overlap checks, precision, highly divergent execution, auto-tuning
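The checkerboard scheme above can be illustrated with a small serial emulation for hard disks in a periodic 2D box. This is only a sketch under stated assumptions, not the HPMC implementation: the names `build_cells`, `has_overlap`, `sweep`, and `count_overlaps` are hypothetical, the cells of one checkerboard are visited in a plain loop instead of in parallel GPU threads, a 3x3 block of cells stands in for the extended cell list, and the final random shift is drawn from half a cell width.

```python
import numpy as np

BOARDS = [(0, 0), (0, 1), (1, 0), (1, 1)]  # the four 2D checkerboard parities

def build_cells(pos, box, ncell):
    """Map each cell (cx, cy) to the indices of the disks inside it."""
    idx = np.floor(pos / (box / ncell)).astype(int) % ncell
    cells = {}
    for k, (cx, cy) in enumerate(idx):
        cells.setdefault((int(cx), int(cy)), []).append(k)
    return cells

def has_overlap(pos, k, trial, cells, cell, box, ncell, sigma):
    """Test the trial position of disk k against the 3x3 block of cells around it."""
    cx, cy = cell
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in cells.get(((cx + dx) % ncell, (cy + dy) % ncell), ()):
                if j == k:
                    continue
                d = pos[j] - trial
                d -= box * np.round(d / box)  # minimum-image convention
                if d @ d < sigma * sigma:
                    return True
    return False

def sweep(pos, box, ncell, sigma, d_max, nselect, step, seed=0):
    """One checkerboard sweep; modifies pos in place and returns the accepted count."""
    # Assumption: an even number of cells per side, each at least one diameter wide,
    # so moves in different cells of the same checkerboard cannot interact.
    assert ncell % 2 == 0 and box / ncell >= sigma
    w = box / ncell
    cells = build_cells(pos, box, ncell)
    sweep_rng = np.random.default_rng([seed, step])
    accepted = 0
    for i in range(nselect):
        for b in sweep_rng.permutation(4):  # checkerboards in shuffled order
            for cell, members in cells.items():  # would run in parallel on a GPU
                if (cell[0] % 2, cell[1] % 2) != BOARDS[b]:
                    continue
                # Per-cell generator, mirroring "RNG rng(c, i, step)".
                rng = np.random.default_rng([seed, step, i, cell[0], cell[1]])
                k = members[rng.integers(len(members))]
                trial = pos[k] + rng.uniform(-d_max, d_max, size=2)
                if tuple(np.floor(trial / w).astype(int)) != cell:
                    continue  # reject: the trial move leaves the cell
                if not has_overlap(pos, k, trial, cells, cell, box, ncell, sigma):
                    pos[k] = trial
                    accepted += 1
    # Translate all particles by one random vector, then wrap into the box.
    pos += sweep_rng.uniform(-w / 2, w / 2, size=2)
    pos %= box
    return accepted

def count_overlaps(pos, box, sigma):
    """Brute-force count of overlapping pairs, used only to check the result."""
    d = pos[:, None, :] - pos[None, :, :]
    d -= box * np.round(d / box)
    r2 = (d ** 2).sum(axis=-1)
    return int((r2[np.triu_indices(len(pos), k=1)] < sigma ** 2).sum())
```

Calling `sweep` repeatedly with an increasing `step` on a non-overlapping start configuration should leave `count_overlaps` at zero. Because accepted moves never leave their cell, the cell list built at the start of a sweep stays valid until the global shift.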

Page 7:

Overlap checks

• Disk/sphere - trivial
• Convex polygons - separating axis
• Concave polygons - brute force
• Spheropolygons - XenoCollide/GJK
• Convex polyhedra - XenoCollide/GJK
• Ellipsoid / Ellipse: Matrix method
• Compute delta in double, convert to single for expensive overlap check
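The last bullet is about precision: when coordinates are large compared with their separation, rounding each position to single precision before subtracting discards low-order digits of the difference. The sketch below compares the two orderings with NumPy on hypothetical random coordinates near 1000; the function name `delta_errors` and all parameter values are assumptions made for illustration.

```python
import numpy as np

def delta_errors(offset=1000.0, n=1000, seed=0):
    """Return (max error subtracting in double, max error subtracting in single)."""
    rng = np.random.default_rng(seed)
    a = offset + rng.uniform(0.0, 1.0, n)  # float64 coordinates far from the origin
    b = offset + rng.uniform(0.0, 1.0, n)
    reference = a - b                      # float64 difference used as the reference

    # Subtract in double, then convert the small difference to single.
    double_first = (a - b).astype(np.float32)
    # Convert each coordinate to single first, then subtract.
    single_first = a.astype(np.float32) - b.astype(np.float32)

    err_double_first = np.abs(double_first.astype(np.float64) - reference).max()
    err_single_first = np.abs(single_first.astype(np.float64) - reference).max()
    return err_double_first, err_single_first
```

With coordinates near 1000, single-precision values are spaced about 6e-5 apart, so the second ordering can carry an error of that order into the overlap check, while the first keeps the error near single-precision rounding of the small difference itself.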

[Figure labels: Separating planes; XenoCollide; 1001.842 - 1000.967 = 0.875]
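For the convex polygon case, the separating axis test can be sketched in a few lines of NumPy. This is a generic textbook version, not the HPMC kernel: the function name `convex_polygons_overlap` is hypothetical, vertices are assumed to be listed in order around each polygon, and polygons that merely touch are treated as overlapping.

```python
import numpy as np

def convex_polygons_overlap(verts_a, verts_b):
    """Separating axis test for two convex polygons given as (n, 2) vertex lists."""
    a = np.asarray(verts_a, dtype=float)
    b = np.asarray(verts_b, dtype=float)
    for poly in (a, b):
        # Edge vectors between consecutive vertices, wrapping around at the end.
        edges = np.roll(poly, -1, axis=0) - poly
        # A perpendicular to each edge is a candidate separating axis.
        normals = np.column_stack((edges[:, 1], -edges[:, 0]))
        for n in normals:
            proj_a = a @ n
            proj_b = b @ n
            # A gap between the projected intervals proves the shapes are disjoint.
            if proj_a.max() < proj_b.min() or proj_b.max() < proj_a.min():
                return False
    # No edge normal of either polygon separates them, so they overlap.
    return True
```

The edge normals of both polygons must be tested. A square and a rotated square can have overlapping bounding boxes and still be separated along one of the rotated square's edge normals.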

Page 8:

Example job script

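from hoomd_script import *
from hoomd_plugins import hpmc

# Read the initial configuration
init.read_xml(filename='init.xml')

# d: maximum trial translation, a: maximum trial rotation
mc = hpmc.integrate.convex_polygon(seed=10, d=0.25, a=0.3)
# Particle type 'A' is a unit square
mc.shape_param.set('A', vertices=[(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)])

run(10e3)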

Page 9:

Single GPU performance

CPU: 8-core Intel Xeon E5-2670 (Sandy Bridge)
System: 65k particle dense fluid

[Figure: bar chart of speedup (axis 0 to 80) by shape: Disk, Square, Hexagon, Rounded Square, Sphere, Ellipsoid, Tetrahedron, Cube, Trunc. Octahedron, Rounded Tetrahedron, Rounded Trunc. Octahedron, Dart. Series: 1 CPU core, 1 CPU (8 cores), Tesla M2070, Tesla K20X, Tesla K40. Annotations: 38-64x, 18-28x]

Page 12:

Multi-node scaling - squares (2D)

GPU: Tesla K20X, CPU: Xeon E5-2680 (XSEDE Stampede)

[Figure: performance (log axis, 10^6 to 10^9) versus P, the number of GPUs or CPU cores (1 to 4096), with GPU and CPU curves for N=4,194,304, N=65,536, and N=4,096. Annotation: 41x]

Page 13:

Multi-GPU scaling bottlenecks - squares (2D)

[Figure: time in ms (log axis, 0.20 to 5.00) versus P (1 to 64). Series: Compute, Communicate, Ideal compute]

Page 14:

Multi-node scaling - truncated octahedra (3D)

GPU: Tesla K20X, CPU: Xeon E5-2680 (XSEDE Stampede)

[Figure: performance (log axis, 10^5 to 10^9) versus P, the number of GPUs or CPU cores (1 to 4096), with GPU and CPU curves for N=4,096,000, N=64,000, and N=4,096. Annotation: 20x]

Page 15:

Questions?

Funding / Resources

• DOD NSSEFF grant: N00244-09-1-0062
• This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number OCI-1053575.

email: [email protected]

• Code not yet publicly available, will eventually be released as part of HOOMD-blue http://codeblue.umich.edu/hoomd-blue
• Paper on disks: Anderson, J. A. et al., JCP 254, 27-38 (2013)
• Paper on 3D & shapes: coming soon
