Fine-grained Visualization Pipelines and Lazy Functional Languages D.J. Duke 1, M. Wallace 2, R....

Fine-grained Visualization Pipelines and Lazy Functional Languages

D.J. Duke (1), M. Wallace (2), R. Borgo (1), & C. Runciman (2)

(1) Visualization and Virtual Reality Group, University of Leeds, UK

(2) Programming Languages and Systems Group, University of York, UK

Overview

Motivation

• why are we doing functional programming?

• a lazy polytypic grid

• Haskell 101

Marching cubes

Functional approaches

• array-based

• streaming

Evaluation

• performance

• software engineering

Outlook

Why (pure) functional programming?

Practical concern: grid-enabled visualization

• new technologies for generic programming (polytypism)

• staged computation

• laziness – natural demand-driven evaluation and streaming

• migration to parallel evaluation

Theoretical concern: abstractions for software development

• problem decomposition ↔ program composition

• J. Hughes, Why Functional Programming Matters

• correctness – mathematically tractable, concise

• different way of thinking about problems

What is (pure) functional programming?

Functions are first-class citizens

• passed as parameters and/or returned as results

• higher-order functions implement patterns of computation

Expressive type systems

Laziness

Example

    map :: (a -> b) -> [a] -> [b]
    map f [] = []
    map f (a:as) = (f a) : (map f as)

Type declaration: optional

Higher-order: map takes another function as parameter

Function body: list of equations

Curried functions: a -> b -> c rather than (a,b) -> c; makes partial application easy, e.g. (+1), or map (+4)

Pattern matching: case analysis of list constructor

Laziness: (f a) etc. only evaluated as needed

Type variables: this function works for any choice of types “a” and “b”
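The laziness and partial-application points can be tried out with a minimal sketch. The names `myMap`, `firstSquares` and `addFour` are illustrative only and do not appear in the slides; `myMap` is the slide's `map` renamed to avoid clashing with the Prelude.

```haskell
-- The slide's definition of map, renamed to avoid the Prelude's map.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []
myMap f (a:as) = f a : myMap f as

-- Mapping over an infinite list is fine under lazy evaluation:
-- only the prefix demanded by take is ever computed.
firstSquares :: Int -> [Integer]
firstSquares n = take n (myMap (^ 2) [1 ..])

-- Partial application: (+ 4) is a function, and so is myMap (+ 4).
addFour :: [Int] -> [Int]
addFour = myMap (+ 4)
```

In a strict language the call inside `firstSquares` would never return, because the whole infinite list would be mapped before `take` ran.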

Haskell 101

Other useful higher-order functions

(.) :: (b -> c) -> (a -> b) -> (a -> c)

(f . g) x = f (g x)

zipWith2 :: (a->b->c) -> [a] -> [b] -> [c]

zipWith2 f [] _ = []

zipWith2 f _ [] = []

zipWith2 f (a:as) (b:bs) = (f a b):(zipWith2 f as bs)

($) :: (a -> b) -> a -> b

f $ x = f x [Why? Write f . g $ x instead of (f . g) x]

Local definitions

    largest :: [a] -> a
    largest (a:as) = largest1 a as
      where largest1 a [] = a
            largest1 a (b:bs) | a > b     = largest1 a bs
                              | otherwise = largest1 b bs

Type classes

    largest :: (Ord a) => [a] -> a
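A hedged, compilable version of `largest` with its `Ord` constraint, combined with `(.)` and `($)`, might look as follows. The empty-list case and the helper `largestSquare` are additions made for this sketch, not part of the slides.

```haskell
-- largest compares elements with (>), so the element type must be in Ord.
largest :: Ord a => [a] -> a
largest []     = error "largest: empty list"   -- case added for this sketch
largest (a:as) = go a as
  where
    go m []     = m
    go m (b:bs)
      | m > b     = go m bs
      | otherwise = go b bs

-- (.) builds the pipeline, ($) applies it without extra parentheses.
largestSquare :: [Int] -> Int
largestSquare xs = largest . map (^ 2) $ xs
```

Because of the type class, the same `largest` works on numbers, characters, or any other ordered type.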

Pipelines and functions

pipeline architecture widespread in visualization

supports distribution and streaming

However

• Streaming is ad-hoc and coarse grained

• Algorithms depend on mesh type

• Data traversed multiple times

[Figure: example pipeline with stages reader (ozone levels), reader (temperature), isosurface, normals, geo-reference, display]

Note: analogy of pipeline composition and function composition: f . g
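The analogy can be made concrete with a toy pipeline. Every name here (`Sample`, `reader`, `threshold`, `countInside`, `pipeline`) is a hypothetical stand-in chosen for this sketch; none of them is a stage from the talk or from VTK.

```haskell
-- Hypothetical stand-ins for pipeline stages; each stage is just a function.
type Sample = Double

-- A "reader" producing a small list of samples.
reader :: [Sample]
reader = [0.0, 0.5 .. 3.0]

-- A filter stage: classify each sample against a threshold.
threshold :: Sample -> [Sample] -> [Bool]
threshold th = map (> th)

-- A sink stage: count the samples above the threshold.
countInside :: [Bool] -> Int
countInside = length . filter id

-- The pipeline is the composition of its stages, read right to left.
pipeline :: [Sample] -> Int
pipeline = countInside . threshold 1.0
```

Connecting two stages is then ordinary function composition, and lazy evaluation means a downstream stage pulls only the data it needs from upstream.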

In the future ... a lazy polytypic grid

Grid enabling: distribution of the run-time system and on-demand streaming of arbitrary data.

1 Algorithms: written once, based on generic pattern of data types, then instantiated for any type.

2 Through fusion laws, multiple traversals on a single resource are folded into one pass.

3 Specialization: adapt programs to utilize resources available – data or computational.

[Figure: pipeline with stages reader (ozone levels), reader (temperature), isosurface, normals, geo-reference, display, annotated with callouts 1 to 3]

Marching Cubes

Why do it?

• explore functional visualization

• well known, important algorithm

For each cell

• compare point samples with threshold

• generate case-index

• lookup table to find intersected edges

• interpolate surface-edge intersection

• group intersection points into triangles
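The first two per-cell steps (threshold comparison and case-index generation) can be sketched as below. The function name `caseIndex`, the list representation of a cell, and the bit ordering of the vertices are all assumptions made for illustration; a real lookup table fixes its own vertex numbering.

```haskell
import Data.Bits (setBit)
import Data.Word (Word8)

-- Case index of one cell: bit i is set when vertex i lies above the threshold.
-- Only the first eight samples are used; the vertex order is an assumption.
caseIndex :: Ord a => a -> [a] -> Word8
caseIndex th vertices =
  foldl setBit 0 [ i | (i, v) <- zip [0 ..] (take 8 vertices), v > th ]
```

The resulting byte takes one of 256 values and is what indexes the table of intersected edges in the third step.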

Ambiguity problem – various solutions

• “MC33” approach

• tri-linear interpolant

Functional arrays

Basic types

    type XYZ = (Int,Int,Int)
    type Num a => Dataset a = Array XYZ a
    type Cell a = (a,a,a,a,a,a,a,a)

Dataset traversal

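    isoA :: (Ord a, Integral a) => a -> Dataset a -> [Triangle]
    isoA th arr = concat $ zipWith1 (mcubeA th (lookup arr)) addrs
      where
        -- the eight corner samples of the cell whose lowest corner is (x,y,z)
        lookup arr (x,y,z) = (arr!(x,y,z), arr!(x+1,y,z), .., arr!(x+1,y+1,z+1))
        -- one address per cell, i varying fastest
        addrs = [ (i,j,k) | k <- [1..ksz-1], j <- [1..jsz-1], i <- [1..isz-1] ]
        -- bounds returns (lower, upper); only the upper corner is needed
        (_, (isz,jsz,ksz)) = bounds arr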

Cell-surface intersection

    mcubeA th lookup xyz = group3 . map (interp th cell xyz) . (mctable!) . toByte . map8 (>th) $ cell
      where cell = lookup xyz

Note: not pseudocode
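A self-contained sketch of the array-based traversal, using `Data.Array`, is given here under stated assumptions: the cell is returned as a list rather than an 8-tuple, the corner ordering is arbitrary, and `sampleGrid`, `cellAt` and `cellAddrs` are hypothetical names rather than the authors' functions.

```haskell
import Data.Array (Array, bounds, listArray, range, (!))

type XYZ = (Int, Int, Int)

-- A small test dataset: the sample stored at (x,y,z) is x + 10*y + 100*z.
sampleGrid :: XYZ -> Array XYZ Int
sampleGrid hi = listArray bnds [ x + 10 * y + 100 * z | (x, y, z) <- range bnds ]
  where bnds = ((1, 1, 1), hi)

-- The eight corner samples of the cell whose lowest corner is (x,y,z).
-- The corner ordering is an assumption made for this sketch.
cellAt :: Array XYZ a -> XYZ -> [a]
cellAt arr (x, y, z) =
  [ arr ! (x + dx, y + dy, z + dz) | dz <- [0, 1], dy <- [0, 1], dx <- [0, 1] ]

-- Addresses of all cells: every lowest corner that still has a +1 neighbour.
cellAddrs :: Array XYZ a -> [XYZ]
cellAddrs arr =
  [ (i, j, k) | k <- [k0 .. ksz - 1], j <- [j0 .. jsz - 1], i <- [i0 .. isz - 1] ]
  where ((i0, j0, k0), (isz, jsz, ksz)) = bounds arr
```

Mapping `cellAt arr` over `cellAddrs arr` visits every cell, but each interior sample is fetched eight times and the whole array must be resident in memory.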

Can we do better?

entire dataset must be in-core

repeated threshold comparison (once per cell with common vertex)

repeated computation of edge interpolant

• interpolant used by more than one triangle in cell

• surface intersects edge of more than one cell

A window onto samples

Insight – we only need a constant-sized window onto dataset

• window size = plane + line + 1 = (jsz+1)*isz + 1

• move from cell-to-cell => advance window by 1

• values read lazily as needed – and gc'd automatically

Haskell representation - a list of values

• data Num a => D XYZ [a]
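The window-size formula is simple enough to check directly. The function name `windowSize` is an assumption for this sketch.

```haskell
-- Window size in samples: one plane, plus one line, plus one extra sample.
windowSize :: (Int, Int, Int) -> Int
windowSize (isz, jsz, _ksz) = plane + line + 1
  where
    line  = isz
    plane = isz * jsz
```

For a 64×64×64 grid (262,144 samples) this gives 64*64 + 64 + 1 = 4,161, which is the Window value the performance tables list for the 262,144-sample dataset. The window depends only on the first two dimensions, not on the number of planes.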

Streams of cells

    stream :: XYZ -> [a] -> [Cell a]
    stream (isz,jsz,ksz) origin =
        zip8 origin                       (drop 1 origin)
             (drop (line+1) origin)       (drop line origin)
             (drop plane origin)          (drop (plane+1) origin)
             (drop (planeline+1) origin)  (drop planeline origin)
      where
        line      = isz
        plane     = isz * jsz
        planeline = plane + line

[Figure: the sample list and its shifted copies are zipped into a stream of 8-tuple cells]
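A runnable version of the cell stream is sketched here. `zip8` is not provided by `Data.List` (which stops at `zip7`), so it is defined by hand; that definition is an assumption of this sketch rather than the authors' code.

```haskell
type XYZ = (Int, Int, Int)
type Cell a = (a, a, a, a, a, a, a, a)

-- Zip eight lists into 8-tuples, stopping at the shortest list.
zip8 :: [a] -> [a] -> [a] -> [a] -> [a] -> [a] -> [a] -> [a] -> [Cell a]
zip8 (a:as) (b:bs) (c:cs) (d:ds) (e:es) (f:fs) (g:gs) (h:hs) =
  (a, b, c, d, e, f, g, h) : zip8 as bs cs ds es fs gs hs
zip8 _ _ _ _ _ _ _ _ = []

-- Pair the sample list with seven shifted copies of itself.
stream :: XYZ -> [a] -> [Cell a]
stream (isz, jsz, _ksz) origin =
  zip8 origin                        (drop 1 origin)
       (drop (line + 1) origin)      (drop line origin)
       (drop plane origin)           (drop (plane + 1) origin)
       (drop (planeline + 1) origin) (drop planeline origin)
  where
    line      = isz
    plane     = isz * jsz
    planeline = plane + line
```

For a 2×2×2 grid the samples `[0..7]` yield the single cell `(0,1,3,2,4,5,7,6)`. On larger grids the zip also produces tuples that straddle the end of a line or plane; those do not correspond to real cells and have to be skipped by the consumer.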

From array to stream

Array

    isoA :: (Ord a, Integral a, Fractional b) => a -> Dataset a -> [Triangle b]
    isoA th sampleArr = concat $ zipWith1 (mcubeA th (lookup sampleArr)) addrs
      where
        lookup arr (x,y,z) = (arr!(x,y,z), arr!(x+1,y,z), .., arr!(x+1,y+1,z+1))
        addrs = [ (i,j,k) | k <- [1..ksz-1], j <- [1..jsz-1], i <- [1..isz-1] ]

    mcubeA th lookup xyz = group3 . map (interp th cell xyz) . (mctable!) . toByte . map8 (>th) $ cell
      where cell = lookup xyz

Stream

    isoS th samples = concat $ zipWith2 (mcubeS th) addrs cells
      where
        cells = stream size samples
        addrs = [ (i,j,k) | k <- [1..ksz-1], j <- [1..jsz-1], i <- [1..isz-1] ]

    mcubeS :: (Ord a, Integral a, Fractional b) => a -> XYZ -> Cell a -> [Triangle b]
    mcubeS th xyz cell = group3 . map (interp th cell xyz) . (mctable!) . toByte . map8 (>th) $ cell

Sharing vertex comparison

Stream

    isoS th samples = concat $ zipWith2 (mcubeS th) addrs cells
      where
        cells = stream size samples
        addrs = [ (i,j,k) | k <- [1..ksz-1], j <- [1..jsz-1], i <- [1..isz-1] ]

    mcubeS :: (Ord a, Integral a, Fractional b) => a -> XYZ -> Cell a -> [Triangle b]
    mcubeS th xyz cell = group3 . map (interp th cell xyz) . (mctable!) . toByte . map8 (>th) $ cell

Indices

    isoT th samples = concat $ zipWith3 (mcubeT th) addrs cells indices
      where
        -- compare each sample once, then window the Booleans into case indices
        indices = map toByte . stream size . map (>th) $ samples
        cells = ...
        addrs = ...

    mcubeT :: (Ord a, Integral a, Fractional b) => a -> XYZ -> Cell a -> Byte -> [Triangle b]
    mcubeT th xyz cell index = group3 . map (interp th cell xyz) . (mctable!) $ index
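The effect of moving the comparison in front of the windowing step is easiest to see in one dimension, where a "cell" is just a pair of adjacent samples. This is an analogue constructed for illustration, with hypothetical names `cells1D`, `indicesNaive` and `indicesShared`; it is not the 3D code from the slides.

```haskell
-- One-dimensional analogue: a "cell" is a pair of adjacent samples.
cells1D :: [a] -> [(a, a)]
cells1D xs = zip xs (drop 1 xs)

-- Naive: window first, then compare. Each interior sample is compared
-- twice, once for each cell it belongs to.
indicesNaive :: Ord a => a -> [a] -> [(Bool, Bool)]
indicesNaive th = map (\(a, b) -> (a > th, b > th)) . cells1D

-- Shared: compare every sample once, then window the stream of Booleans.
-- Both components of neighbouring pairs refer to the same evaluated Bool.
indicesShared :: Ord a => a -> [a] -> [(Bool, Bool)]
indicesShared th = cells1D . map (> th)
```

Both functions return the same result; the second performs one comparison per sample instead of one per cell corner. In three dimensions the saving is a factor of up to eight.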

Sharing edge interpolants

Indices

    isoT th samples = concat $ zipWith3 (mcubeT th) addrs cells indices
      where
        indices = map toByte . stream size . map (>th) $ samples
        cells = ...
        addrs = ...

    mcubeT :: (Num a, Fractional b) => a -> XYZ -> Cell a -> Byte -> [Triangle b]
    mcubeT th xyz cell index = group3 . map (interp th cell xyz) . (mctable!) $ index

Interpolants

    isoI th (D size samples) = concat $ zipWith3 mcubeI addrs indices edges
      where
        edges = disContinuities size . mkCellEdges th size $ samples
        indices = ...
        addrs = ...

    mcubeI :: (Fractional b) => XYZ -> Byte -> CellEdge b -> [Triangle b]
    mcubeI xyz index edges = group3 . map (selectEdge edges xyz) . (mctable!) $ index

    type CellEdge a = (a, a, a, a, a, a, a, a, a, a, a, a)

    mkCellEdges :: (Integral a, Fractional b) => a -> XYZ -> [a] -> [CellEdge b]
    mkCellEdges thresh (XYZ isz jsz ksz) stream =
        zipWith12 CellEdge
            inter_x (drop line inter_x) (drop plane inter_x) (drop (plane+line) inter_x)
            inter_y (drop 1 inter_y) (drop plane inter_y) (drop (plane+1) inter_y)
            inter_z (drop 1 inter_z) (drop line inter_z) (drop (line+1) inter_z)
      where
        line = isz
        plane = isz*jsz
        -- interpolant between each sample and its neighbour at the given offset
        offset d = zipWith2 interpolate stream d
        inter_x = offset (drop 1 stream)
        inter_y = offset (drop line stream)
        inter_z = offset (drop plane stream)
        interpolate v0 v1 = fromIntegral (thresh-v0) / fromIntegral (v1-v0)
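The interpolation at the heart of `mkCellEdges` can be exercised in isolation. The sketch lifts `interpolate` to the top level with the threshold as an explicit argument, and `interX` is a hypothetical name for the x-direction stream; both are assumptions for illustration.

```haskell
-- Fractional position along an edge at which the threshold is crossed.
-- Meaningful only when the edge is actually intersected (v0 /= v1).
interpolate :: (Integral a, Fractional b) => a -> a -> a -> b
interpolate thresh v0 v1 = fromIntegral (thresh - v0) / fromIntegral (v1 - v0)

-- Interpolants along x for a whole sample stream: each sample is paired
-- with its +1 neighbour, so every edge is interpolated exactly once.
interX :: (Integral a, Fractional b) => a -> [a] -> [b]
interX thresh samples = zipWith (interpolate thresh) samples (drop 1 samples)
```

A result between 0 and 1 means the surface crosses that edge; values outside the range belong to edges that are not intersected. Because the list is lazy, interpolants for edges that no triangle selects are never computed.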

Performance - time

Comparison with VTK pipeline

• [raw reader] -> vtkMarchingCubes; outputs retained

• forced evaluation of Haskell output

• comparison ≃ f(surface size)

Time (seconds) is given in the Array, Stream and VTK columns; the last column is the ratio of Stream time to VTK time.

    Dataset         Input   Window      Surface    Array   Stream    VTK  Stream/VTK
    Silicium      113,288    3,341      103,200    0.626    0.386   0.19        2.03
    Neghip        262,144    4,161      131,634    1.088    0.852   0.29        2.94
    Hydrogen    2,097,152   16,513      134,592    8.638    6.694   0.51       13.13
    Lobster     5,461,344   97,826    1,373,196   25.370   18.420   5.69        3.24
    Engine      8,388,608   65,793    1,785,720   44.510   28.060   5.29         5.3
    StatueLeg  10,814,133  116,623      553,554   48.780   78.340   54.2        1.45
    Aneurism   16,777,216   65,793    1,098,582   72.980   54.440   5.69        9.57
    Skull      16,777,216   65,793   18,415,053   79.500   57.190  79.03        0.72
    Stent8     45,613,056  262,657    8,082,312  287.500  154.900  33.17        4.67
    Vertebra8 134,217,728  262,657  197,497,908  703.000  517.100    755        0.68

Performance - space

Peak live memory usage

• Haskell usage measured by heap profiling

• Stream space use ≃ f(window size)

• Scalability without moving to out-of-core methods

    Dataset         Input   Window      Surface   Array  Stream     VTK  VTK/Stream
    Silicium      113,288    3,341      103,200    0.12    0.12     1.1        9.17
    Neghip        262,144    4,161      131,634    0.27    0.14     1.4        9.86
    Hydrogen    2,097,152   16,513      134,592    2.10    0.55       3        5.45
    Lobster     5,461,344   97,826    1,373,196    5.45    3.10    19.5        6.29
    Engine      8,388,608   65,793    1,785,720    8.25    2.10    25.4        12.1
    StatueLeg  10,814,133  116,623      553,554   11.00    3.72    15.9        4.27
    Aneurism   16,777,216   65,793    1,098,582   17.00    2.10    28.1       13.38
    Skull      16,777,216   65,793   18,415,053   17.00    2.13  1855.3      871.03
    Stent8     45,613,056  262,657    8,082,312   46.00    8.35   119.1       14.26
    Vertebra8 134,217,728  262,657  197,497,908  137.00    8.35  1300.9       155.8

“Ifs”, “buts”, and “maybe”s

Comparison of apples and oranges?

• only intended as a reality check ...

• vtkMarchingCubes not the fastest VTK surface extraction filter

• VTK pipeline could be fine-tuned, e.g. drop intermediate results

• other implementations

However ...

• our approach can also be tuned further ...

• elegant Haskell not necessarily efficient Haskell

• compiler work, e.g. fusion laws: (map f).(map g) = map (f.g)
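The fusion law quoted above can be illustrated with a pair of definitions that compute the same result. The names `twoPasses` and `onePass` are chosen for this sketch.

```haskell
-- Two traversals, with an intermediate list between them.
twoPasses :: [Int] -> [Int]
twoPasses = map (* 2) . map (+ 1)

-- One traversal, obtained from the law  map f . map g = map (f . g).
onePass :: [Int] -> [Int]
onePass = map ((* 2) . (+ 1))
```

The two are equal for every input list, so a compiler is free to rewrite the first into the second and avoid building the intermediate list.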

Other criteria

Streaming criteria (Law et al., Proceedings Vis'99)

• caching: streamed data access + memoisation

• demand-driven:

    • natural product of call-by-need (“lazy evaluation”)

    • data pulled in – and transformed – only to the extent needed

• hardware architecture independence

    • polymorphic types supported by type predicates

• component based:

    • library of fine-grained visualization combinators

Clarity

• algorithms used in decision making should be evidently correct

• strong typing, concise expression, equational reasoning

Outlook

Achievements

• novel re-construction of fundamental algorithms

• fine-grained streaming - scalability

• demonstrated that FP can be (surprisingly) efficient

Current work

• extending stream approach to other mesh organizations

• exploring polytypism to abstract from organization

• distributing (functional) pipeline over a grid

    • existing distributed FP approaches – GRID-GUM, Eden

    • York bytecode compiler (YHC)

Finally

Thanks to:

• EPSRC “Fundamental Computing for e-Science” Programme

• Anonymous reviewers

Further information

• VVR group: www.comp.leeds.ac.uk/vvr/

• Plasma group: www.cs.york.ac.uk/plasma/

• Repository: hackage.haskell.org/trac/PolyFunViz/

• Haskell: www.haskell.org

Questions?
