DISTRIBUTED INTERACTIVE RAY TRACING FOR LARGE VOLUME VISUALIZATION
Dave DeMarle, May 1, 2003


Transcript:

Page 1:

DISTRIBUTED INTERACTIVE RAY TRACING FOR LARGE VOLUME VISUALIZATION

Dave DeMarle

May 1, 2003

Page 2:

Thesis:

It is possible to visualize multi-gigabyte datasets interactively using ray tracing on a cluster.

Page 3:

Outline

Background.

Related work.

Communication.

Ray tracing with replicated data.

Distributed shared memory.

Ray tracing large volumes.

Page 4:

Ray Tracing

For every pixel, compute a ray from the viewpoint into space and test it for intersection with every object. Take the nearest hit object’s color for the pixel. Shadows, reflections, refractions, and photorealistic effects simply require more rays.
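As a minimal sketch of that loop (Camera, Scene, Ray, Hit, shade(), and background() are illustrative stand-ins, not *-Ray’s actual classes):

    // Hedged sketch of the per-pixel loop described above.
    void renderFrame(const Scene& scene, const Camera& camera, Image& image) {
      for (int y = 0; y < image.height(); ++y)
        for (int x = 0; x < image.width(); ++x) {
          Ray ray = camera.rayThrough(x, y);   // viewpoint through pixel (x,y)
          Hit nearest;                         // tracks the closest hit so far
          for (const Object& obj : scene.objects())
            obj.intersect(ray, nearest);       // updates 'nearest' if closer
          // Shadows, reflections, and refractions would spawn more rays here.
          image.set(x, y, nearest.valid() ? shade(scene, nearest)
                                          : background());
        }
    }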

Page 5:

Page 6:

Interactive Ray Tracing

1998: *-Ray

Image-parallel renderer optimized for the SGI Origin shared-memory supercomputer.

My work moves this program to a cluster to make it less expensive.

Page 7:

[Diagram: screen tiles divided among CPU 1 through CPU 4.]

Page 8:

Ray-traced forest scene, showing task distribution.

Page 9:

Cluster Computing

Connect inexpensive machines.

Advantages: cheaper; faster growth curve in the commodity market.

Disadvantages: slower network; separate memory.

Page 10:

                      Ray (SGI Origin)         Nebula (cluster)
Cost                  ~$1.5 million            ~$150 thousand
CPUs                  32 x 0.39 GHz R12K       2 x 32 x 1.7 GHz Xeon
RAM                   16 GB (shared)           32 GB (1 GB per node)
Network               NUMA hypercube           switched Gbit Ethernet
Round-trip latency    335 ns average           34,000 ns average
Bandwidth             12.8 Gbit/sec            0.6 Gbit/sec

Page 11:

Related Work

2001: Saarland Renderer

Trace 4 rays with SIMD operations.

Obtain data from a central server.

Limited to triangular data.

My work keeps *-Ray’s flexibility and uses distributed ownership.

Page 12:

Related Work

1993: Corrie and Mackerras

Volume rendering on a Fujitsu AP1000.

My work uses recent hardware, and multithreading on each node, to achieve interactivity.

Page 13:

Outline

Background.

Related work.

Communication.

Ray tracing with replicated data.

Distributed shared memory.

Ray tracing large volumes.

Page 14:

Communication

Legion.

Goal 1: reduce library overhead. Built on top of TCP.

Goal 2: reduce wait time. A dedicated communication thread handles incoming traffic.

Page 15:

Inbound: select(), read the header, call the handler function. Outbound: protected with a mutex for thread safety.

[Diagram: compute threads 1..T on node 0 call Communicator::send(); the communicator thread select()s on the network and dispatches incoming messages to handler_1()..handler_h().]
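A hedged sketch of that structure (not Legion’s actual code; readFully/writeFully are assumed helpers that loop until all bytes move):

    #include <sys/select.h>
    #include <cstddef>
    #include <mutex>

    struct MsgHeader { unsigned char type; unsigned length; };
    void readFully(int fd, void* buf, std::size_t n);         // assumed helper
    void writeFully(int fd, const void* buf, std::size_t n);  // assumed helper

    struct Communicator {
      fd_set all_sockets_; int max_fd_;
      void (*handlers_[256])(int fd, const MsgHeader& h);  // one per msg type
      int socket_to_[64];                                  // per-peer sockets
      std::mutex send_mutex_;

      // Inbound: select(), read the header, dispatch to the handler.
      void inboundLoop() {
        for (;;) {
          fd_set ready = all_sockets_;
          select(max_fd_ + 1, &ready, nullptr, nullptr, nullptr);
          for (int fd = 0; fd <= max_fd_; ++fd)
            if (FD_ISSET(fd, &ready)) {
              MsgHeader h;
              readFully(fd, &h, sizeof h);
              handlers_[h.type](fd, h);
            }
        }
      }

      // Outbound: compute threads share sockets, so a mutex serializes sends.
      void send(int peer, const void* buf, std::size_t n) {
        std::lock_guard<std::mutex> lock(send_mutex_);
        writeFully(socket_to_[peer], buf, n);
      }
    };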

Page 16:

Outline

Background.

Related work.

Communication.

Ray tracing with replicated data.

Distributed shared memory.

Ray tracing large volumes.

Page 17:

Distributed Ray Tracer Implementation

Image-parallel ray tracer.

Supervisor/worker program structure.

Each node runs a multithreaded application.

Replicate data if it fits in each node’s memory.

Use Distributed Shared Memory (DSM) for larger volumetric data.

Page 18:

[Diagram: the supervisor hands tasks to Workers 1-3, each running RenderThread 1 and RenderThread 2; the assembled image goes to the user.]

Page 19:

Supervisor Program

[Diagram: the supervisor (node 0) holds a Communicator plus the Scene, Frame, and Task state, with a display thread and auxiliary display threads assembling the image.]

Page 20:

Worker Program

[Diagram: a worker (node N) holds a Communicator, the Scene and Frame state, and the scene data; a TaskManager feeds a TaskQueue drained by render threads 1..N, alongside a ViewManager.]

Page 21:

Render State

Data that *-Ray communicated by reference between functional units is now transferred over the network.

SceneState – constant over a session: acceleration structure type, number of workers, …

FrameState – can change each frame: camera position, image resolution, …

TaskState – changes during a frame: pixel tile assignments.
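Sketched as plain structs (the field lists are drawn from the examples above; *-Ray’s real members differ):

    // Illustrative state bundles shipped over the network.
    struct SceneState {   // constant over a session
      int accelStructType;
      int numWorkers;
    };
    struct FrameState {   // may change every frame
      float cameraPos[3];
      int   imageWidth, imageHeight;
    };
    struct TaskState {    // changes during a frame
      int tileX, tileY, tileW, tileH;   // a pixel tile assignment
    };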

Page 22:

TaskManager keeps a local queue of tasks.

Two semaphores guard the queue.

[Diagram: the supervisor streams tiles to Worker 1's TaskManager; tiles accumulate in the TaskQueue, render threads 1 and 2 pop them, and finished tiles return to the image.]
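A sketch of such a queue (C++20; the class and names are illustrative, not *-Ray’s): 'space' blocks the producer when the queue is full, 'filled' blocks render threads when it is empty.

    #include <deque>
    #include <mutex>
    #include <semaphore>

    template <typename Tile, int Cap = 64>
    class TaskQueue {
      std::counting_semaphore<Cap> space{Cap};  // free slots remaining
      std::counting_semaphore<Cap> filled{0};   // tiles waiting
      std::mutex m;
      std::deque<Tile> q;
    public:
      void push(Tile t) {                 // communicator thread: tile arrived
        space.acquire();
        { std::lock_guard<std::mutex> g(m); q.push_back(t); }
        filled.release();
      }
      Tile pop() {                        // render thread: take the next tile
        filled.acquire();
        Tile t;
        { std::lock_guard<std::mutex> g(m); t = q.front(); q.pop_front(); }
        space.release();
        return t;
      }
    };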

Page 23:

Network Limitation

Maximum frame rate is determined by the network.

19 μs of queuing per tile, 600 Mbit/sec bandwidth.
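As a sanity check on that ceiling (assuming the 512x512 images used later in the talk): 8x8 tiles give 64 x 64 = 4096 tasks per frame, and at 19 μs of queuing per tile the network path alone needs 4096 x 19 μs ≈ 78 ms, capping the frame rate near 13 f/s no matter how many CPUs render. Quadrupling the tile area to 16x16 raises that cap to roughly 51 f/s.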

[Plot: frames/sec (0-80) vs CPUs (1-31) for tile sizes 32x32, 16x16, 8x8, and 4x4, each plotted against its network-imposed limit.]

Page 24:

Replicated Comparison

Page 25:

Machine Comparison with Replicated Data

[Plot: frames/sec (0-10) vs CPUs (1-31) with replicated data, for 16x16 and 8x8 tiles on the SGI and on the cluster.]

Page 26:

Outline

Background.

Related work.

Communication.

Ray tracing with replicated data.

Distributed shared memory.

Ray tracing large volumes.

Page 27:

Large Volumes

Richtmyer-Meshkov instability simulation from Lawrence Livermore National Laboratory.

1920 x 2048 x 2048 voxels at 8 bits each.

Page 28:

Legion’s DSM

DataServer class. Compute threads call acquire to obtain blocks of memory; the DataServer finds and returns the requested block. Compute threads call release to let the DataServer reuse the space.

The DataServer uses Legion to transfer blocks over the network. Each node owns the blocks in its resident_set area and caches remotely owned blocks in its local_cache area.

Five DataServer flavors: single-threaded, multithreaded direct-mapped, associative, mmap from disk, and writable.
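A minimal sketch of that protocol from a compute thread’s point of view (the acquire/release names follow the slide; the signatures and block-ID type are assumptions):

    #include <cstdint>

    // Assumed minimal interface for the protocol described above.
    struct DataServer {
      void* acquire(int blockID);   // find the block, fetching it if remote
      void  release(int blockID);   // done; the space may be reused
    };

    // A compute thread reads one voxel from a block it may not own.
    uint8_t readVoxel(DataServer& ds, int blockID, int offset) {
      uint8_t* block = static_cast<uint8_t*>(ds.acquire(blockID));
      uint8_t value = block[offset];  // valid only while the block is held
      ds.release(blockID);            // DataServer may now evict or reuse it
      return value;
    }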

Page 29:

[Diagram: three nodes, each with a DataServer dividing memory into a resident_set of owned blocks and a local_cache of remotely owned blocks; compute threads call get_data()/release_data(), and the communicator threads move blocks between nodes over the network.]

Page 30:

Outline

Background.

Related work.

Communication.

Ray tracing with replicated data.

Distributed shared memory.

Ray tracing large volumes.

Page 31:

Large Volumes

Use distributed versions of *-Ray’s templated volume classes, which treat the DataServer as a 3D array.

[Diagram: DISOVolume and DMIPVolume sit on DBrickArray3, which maps a Data(x,y,z) request to block Q, offset R in the DataServer.]
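A sketch of the Data(x,y,z) to (block Q, offset R) mapping the diagram implies, assuming cubic bricks of edge B laid out in a bx * by * bz grid (all names illustrative):

    struct BrickAddr { int block; int offset; };  // Q and R in the diagram

    // Map a voxel coordinate to its brick and its position inside the brick.
    BrickAddr address(int x, int y, int z, int B, int bx, int by) {
      int qx = x / B, qy = y / B, qz = z / B;   // brick coordinates
      int rx = x % B, ry = y % B, rz = z % B;   // coordinates within the brick
      return { (qz * by + qy) * bx + qx,        // Q: linearized brick index
               (rz * B + ry) * B + rx };        // R: linearized offset
    }

DBrickArray3 can then serve Data(x,y,z) as, roughly, readVoxel(ds, a.block, a.offset) using the DataServer sketch above.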

Page 32:

Isosurface of the Visible Female dataset, showing data ownership.

Page 33:

Optimized Data Access for Large Volumes

Use three-level bricking for memory coherence: a 64-byte cache line, a 4 KB OS page, and a 4 KB * L^3 network transfer size.

Third-level bricks = DataServer blocks.

Use a macrocell hierarchy to reduce the number of accesses.
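As a sketch of why macrocells cut accesses (the structure here is assumed; the real hierarchy differs): each macrocell caches the min/max voxel value of the region beneath it, so an isosurface ray can skip that whole region without touching the DSM.

    struct Macrocell { unsigned char vmin, vmax; };  // value range below

    // True when the isosurface cannot pass through this macrocell's region,
    // so every data access beneath it can be skipped.
    inline bool canSkip(const Macrocell& mc, unsigned char isovalue) {
      return isovalue < mc.vmin || isovalue > mc.vmax;
    }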

Page 34:

Results with Distributed Data

Hit time of 6.86 μs or higher; the associative DataServer takes longer. Miss time of 390 μs or higher; larger bricks take longer.

Empirically, if the local cache is larger than 10% of the data size, hit rates exceed 95% for isosurfacing and MIP rendering.

Investigated techniques to increase the hit rate and reduce the number of accesses.
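For context, at a 95% hit rate the expected cost per acquire is about 0.95 x 6.86 μs + 0.05 x 390 μs ≈ 26 μs; as the hit rate climbs past 99% the expected cost approaches the 6.86 μs hit time itself, which is why the next slide treats hit time as the limiting factor.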

Page 35:

Consolidated Access

Hit time is usually the limiting factor.

Reduce the number of DSM accesses.

Eliminate redundant accesses.

When a ray needs data, sort its accesses so that all of the needed data inside a brick is gathered with one DSM access.
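A sketch of that consolidation, reusing the hypothetical DataServer and BrickAddr types from the earlier slides: group a ray’s samples by brick so each brick is acquired once rather than once per sample.

    #include <cstddef>
    #include <map>
    #include <vector>

    // Read all of a ray's samples with one DSM acquire per touched brick.
    void consolidatedRead(DataServer& ds,
                          const std::vector<BrickAddr>& samples,
                          std::vector<unsigned char>& out) {
      std::map<int, std::vector<std::size_t>> byBrick;  // brick Q -> samples
      for (std::size_t i = 0; i < samples.size(); ++i)
        byBrick[samples[i].block].push_back(i);
      out.resize(samples.size());
      for (const auto& [block, idxs] : byBrick) {   // one acquire per brick
        auto* data = static_cast<unsigned char*>(ds.acquire(block));
        for (std::size_t i : idxs) out[i] = data[samples[i].offset];
        ds.release(block);
      }
    }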

Page 36:

Consolidated Access

[Diagram: a ray crossing Bricks 1-6 beneath a macrocell; this and the next two slides animate the consolidation of its accesses.]

Page 37:

Consolidated Access

[Diagram: the same bricks and macrocell; the animation continues.]

Page 38:

Consolidated Access

[Diagram: the same bricks and macrocell; the animation concludes.]

Page 39:

[Plot: 2 GB volume; acquires/node/frame (0-500,000) and frames/sec (0-8) for consolidation settings Access 1, Access 8, and Access X.]

Page 40:

Machine Comparison

Use the Richtmyer-Meshkov data set to compare the distributed ray tracer with *-Ray.

To determine how data sharing affects the cluster program.

Page 41:

[Plot: frames/sec (0-14) vs frame number over a recorded session: Ray with 31 CPUs averages 4.7 f/s, Nebula with 62 CPUs 1.7 f/s, and Nebula with 32 CPUs 1.1 f/s.]

Page 42:

Traffic

When the entire volume is in view, it takes a few frames for the caches to load, which slows the renderer.

When only a portion is in view, the working set is small and network traffic is not an issue.

Page 43:

[Plot: frames/sec (0-2.5) and MB/node transferred (0-20) as the isovalue and viewpoint change over a session.]

Page 44:

[Plot: frames/sec (0-3.5) vs frame number, comparing the recorded rate with the loaded rate.]

Page 45:

Images

Treepot scene, 2 million polygons.

512x512, 1 hard shadow, ~1 f/s.

CPU bound, not network bound

Page 46:

Images

Richtmyer-Meshkov, timestep 270, 1920x2048x2048.

512x512, 1-2 f/s with 1 hard shadow.

CPU or network bound, depending on the viewpoint.

Page 47:

Images

Focusing in…

Page 48:

Images

Focusing in…

Page 49:

Images

Focusing in…

Page 50:

Images

Focusing in…

Page 51:

Images

Focusing in…

Page 52:

Images

Focusing in…

Page 53:

Conclusion

Confirmed that interactive ray tracing on a cluster is possible. Scaling and the ultimate frame rate are limited by latency; the number of tasks in the image determines the maximum frame rate. With reasonably complex scenes the renderer is CPU bound, even with 62 processors. With tens of processors, the cluster is comparable to the supercomputer.

Page 54:

Conclusion

Data sets that exceed the memory space of any one node can be managed with a DSM. For isosurfacing and MIP rendering, hit time is the limiting factor, not network time. The longer data access time makes the cluster slower than the supercomputer, but it is still interactive.

Page 55:

Future Work

Render realistic images interactively by making everything faster: a faster network layer, a faster DSM, faster ray tracing.

Direct volume rendering.

Distributed polygonal data sets.

Page 56:

Acknowledgments

NSF Grants 9977218, 9978099.

DOE Views.

NIH Grants.

My committee: Steve, Chuck, and Pete.

Patti DeMarle.

Thanks to everyone else, for making this a great place to live and work!