Data Structures and Performance for Scientific Computing with Hadoop and Dumbo (ICME MR 2012)

30
Matrix storage Data Example: outputting many small matrices Example: Cholesky QR Data Structures and Performance for Scientific Computing with Hadoop and Dumbo Austin R. Benson Computer Sciences Division, UC-Berkeley ICME, Stanford University May 15, 2012

Transcript of Data Structures and Performance for Scientific Computing with Hadoop and Dumbo (ICME MR 2012)

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Data Structures and Performance for ScientificComputing with Hadoop and Dumbo

Austin R. Benson

Computer Sciences Division, UC-BerkeleyICME, Stanford University

May 15, 2012

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

1

1 Matrix storage

2 Data

3 Example: outputting many small matrices

4 Example: Cholesky QR

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Dense matrix storage

A =

11 12 13 1421 22 23 2431 32 33 3441 42 42 44

How do we store the matrix in HDFS?

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Dense matrix storage

A =

11 12 13 1421 22 23 2431 32 33 3441 42 42 44

In HDFS:

〈1, [11, 12, 13, 14]〉

〈2, [21, 22, 23, 24]〉

〈3, [31, 32, 33, 34]〉

〈4, [41, 42, 43, 44]〉

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Two rows per record

or we might use:

〈1, [[11, 12, 13, 14], [21, 22, 23, 24]]〉

〈3, [[31, 32, 33, 34], [41, 42, 43, 44]]〉

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Flattened list

or maybe

〈1, [11, 12, 13, 14, 21, 22, 23, 24]〉

〈3, [31, 32, 33, 34, 41, 42, 43, 44]〉

... but we do lose information here (maybe it’s not important)

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Full matrix

or maybe

〈1, [[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34], [41, 42, 43, 44]]〉

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

What is the ”best” way?

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

What is the ”best” way?

Depends on the application... we will look at an example later.

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

2

1 Matrix storage

2 Data

3 Example: outputting many small matrices

4 Example: Cholesky QR

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Data Serialization

Small optimizations → 2.5x speedup!

*all data from the NERSC Magellan cluster

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Data Serialization

Same experiment but different matrix size (200 columns):

Again, 2.5x speedup!

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Languages

Switching from Python to C++...

same general trend

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

More speedups

Algorithm performance isn’t the only place where we see speedups

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Why can we expect these speedups?

These are not high-performance implementations. We care aboutI/O performance.

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

3

1 Matrix storage

2 Data

3 Example: outputting many small matrices

4 Example: Cholesky QR

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Suppose we need to write many small matrices to disk.

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Code

Code:

git clone git://github.com/icme/mapreduce-workshop.gitcd mapreduce-workshop/arbenson

Files:

speed test.py (tester)

small matrix test.py (driver)

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

4

1 Matrix storage

2 Data

3 Example: outputting many small matrices

4 Example: Cholesky QR

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Algorithm

Cholesky QR: R = chol(ATA, ’upper’)

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Implementation for MapReduce

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Mapper implementation

Which of these implementations is better?

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Mapper implementation

Which of these implementations is better?

Answer: the one on the left (usually)

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Why?

1 Shuffle time

2 Reduce bottleneck

However, the left implementation could run out of memory.

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Mapper implementation

Can we do better? Yes

Matrix storage Data Example: outputting many small matrices Example: Cholesky QR

Questions?

Austin R. [email protected]

https://github.com/arbenson/mrtsqr