Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Using Accumulo to Implement Confidentiality Protection in Message Queuing Dr. Rod Moten Chief Scientist PROARC, Inc. 8/25/22 PROARC, Inc. | 300 E. Lombard Suite 640 Baltimore MD 21202 | [email protected] | 410-665-2230 1

description

Accumulo is primarily used as a Big Data storage facility in a clustered environment. Accumulo’s columnar arrangement of rows, key-value pair indices, and cell-level security make it attractive for non-Big Data applications as well. In this talk, we describe how to use Accumulo to implement message queuing that provides confidentiality protection. One feature of message queuing is broadcasting messages from a producer to multiple consumers. The messages could be part of a stream that the producer is providing to multiple consumers. In some cases, not all consumers should see every message in the stream. In a traditional queuing system, separate queues would be created for different levels of access, so the messages would be duplicated for each level of access. In this talk, we show how to use Accumulo to create a queuing system that does not require duplication. We also present results from experiments testing the performance of such a system under different loads, and results comparing the performance of streaming messages through a queuing system based on Accumulo with that of traditional queuing systems, such as Apache Qpid.

Transcript of Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Page 1: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Using Accumulo to Implement Confidentiality Protection in Message Queuing

Dr. Rod Moten
Chief Scientist
PROARC, Inc.

6/6/2014
PROARC, Inc. | 300 E. Lombard Suite 640 Baltimore MD 21202 | [email protected] | 410-665-2230

Page 2: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Confidentiality Protection

Ensure confidential information is accessible only to those with the correct privileges

Example
◦ Ensure only people with Secret clearances can read Secret documents


PROARC, INC. PROPRIETARY INFORMATION: The information contained herein may not be used in whole or in part except for the limited purpose for which it was furnished. Do not distribute, duplicate, or reproduce in whole or in part without the prior written consent of an authorized official of PROARC, Inc) 2

Page 3: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

A Policy for Confidentiality Protection

Artifacts are tagged with attributes that specify their confidentiality level

Portions of a single artifact can have different confidentiality levels

Entire artifact will be protected at the highest level of its parts

Reduce confidentiality level by stripping out portions with higher levels

Example

Protection level of this document is Trade Secret

(Public) Sweeping fingers in shapes across the screen of a smartphone or tablet can be used to unlock devices.
(Confidential) The CEO of Acme uses the same shape for all his devices.
(Trade Secret) When near a CEO, exploit the Bluetooth bleed bug to send a fake notification to his device and study his gesture.
(Public) The free-form gestures have an inherent appeal as passwords.
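
The policy on this slide can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the `Level` enum, the function names, and the assumption that the three levels form a strict order (Public below Confidential below Trade Secret) are all hypothetical, and the portion texts are abbreviated.

```python
from enum import IntEnum


class Level(IntEnum):
    # Assumed total order; the slide only names the three levels.
    PUBLIC = 0
    CONFIDENTIAL = 1
    TRADE_SECRET = 2


def artifact_level(portions):
    """The whole artifact is protected at the highest level of its parts."""
    return max(level for level, _ in portions)


def strip_above(portions, ceiling):
    """Lower the artifact's level by dropping portions above `ceiling`."""
    return [(level, text) for level, text in portions if level <= ceiling]


# Abbreviated version of the example document on the slide.
document = [
    (Level.PUBLIC, "Sweeping fingers in shapes can be used to unlock devices."),
    (Level.CONFIDENTIAL, "The CEO of Acme uses the same shape for all his devices."),
    (Level.TRADE_SECRET, "Exploit the Bluetooth bleed bug to study his gesture."),
    (Level.PUBLIC, "The free-form gestures have an inherent appeal as passwords."),
]

print(artifact_level(document).name)
redacted = strip_above(document, Level.CONFIDENTIAL)
print(artifact_level(redacted).name, len(redacted))
```

Stripping the Trade Secret portion leaves three portions and lowers the protection level of what remains to Confidential.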

Page 4: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Implementing Confidentiality Protection for Data Streams

Mark each frame or collection of frames with a confidentiality level
◦ Consumers can only receive frames that they are privileged to read

Consumers cannot directly transfer frames to producers.
◦ A broker is required

Use a traditional message queuing system with access control, such as Qpid.
◦ Queue per Confidentiality Level

Page 5: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Queue per Confidentiality Level

A separate queue for each protection level
Consumers read all frames from the queue to which they have access

[Diagram: a stream of Frame 1 (A,B), Frame 2 (A), Frame 3 (A,B), and Frame 4 (A,B) is copied into queues labeled "Queue for Confidentiality Level A", "Queue for Confidentiality Level B", and "Queue for A, but Not B". Frames 1, 3, and 4 appear more than once; Frame 2 is shown under "Queue for A, but Not B".]
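
The storage cost of this approach can be illustrated with a small sketch. It is a simplified model rather than Qpid code: `FRAMES` and `publish_per_level` are hypothetical names, and the model keeps exactly one queue per level (A and B), so it does not reproduce the separate "Queue for A, but Not B" from the diagram.

```python
from collections import defaultdict

# The four frames from the slide, each with the levels allowed to read it.
FRAMES = [
    ("Frame 1", {"A", "B"}),
    ("Frame 2", {"A"}),
    ("Frame 3", {"A", "B"}),
    ("Frame 4", {"A", "B"}),
]


def publish_per_level(frames):
    """Copy each frame into the queue of every level allowed to read it."""
    queues = defaultdict(list)
    for name, levels in frames:
        for level in sorted(levels):
            queues[level].append(name)
    return dict(queues)


queues = publish_per_level(FRAMES)
stored = sum(len(queue) for queue in queues.values())
print(queues)
print(f"{len(FRAMES)} frames produced, {stored} copies stored")
```

In this model the broker stores seven copies for four produced frames, and the overhead grows with the number of levels attached to each frame.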

Page 6: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Single Queue for All Confidentiality Levels

A single queue contains all frames for all confidentiality levels

Consumers only read frames for which they have access.

Page 7: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Single Queue for All Confidentiality Levels

A single queue contains all frames for all protection levels
Consumers only read frames for which they have access.

[Diagram: a single queue holds Frame 1 (A,B), Frame 2 (A), Frame 3 (A,B), and Frame 4 (A,B). It is read by two groups labeled "Consumers with Access to A" and "Consumers with Access to B".]
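
A minimal sketch of the single-queue idea follows, assuming that a label such as "A,B" means the frame is readable by consumers holding A or B. In Accumulo this kind of filtering is performed by the server, which compares each cell's column visibility (for example the expression `A|B`) with the authorizations supplied for the scan. The plain-Python `read_visible` function below is a hypothetical stand-in for that check, not the proof-of-concept's code.

```python
# One shared queue: every frame is stored once, with its readable levels.
QUEUE = [
    ("Frame 1", {"A", "B"}),
    ("Frame 2", {"A"}),
    ("Frame 3", {"A", "B"}),
    ("Frame 4", {"A", "B"}),
]


def read_visible(queue, authorizations):
    """Return the frames whose label shares a level with the consumer's authorizations."""
    return [name for name, levels in queue if levels & authorizations]


print(read_visible(QUEUE, {"A"}))
print(read_visible(QUEUE, {"B"}))
```

Each frame is stored once regardless of how many levels may read it; only the read path differs between consumers.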

Page 8: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Using Accumulo for Single Queue Approach

Treat queue as an unbounded buffer
◦ Single writer, multiple readers

Buffer implemented as an Accumulo table
◦ Technically it is a very large bounded buffer
◦ Theoretically it can hold 26^32 ≈ 1.9 x 10^45 entries

Each row contains a frame

Row ID
◦ String of 32 characters from the set [a-z]
◦ 26^32 frames ≈ 1.9 x 10^45 frames
◦ 1st frame: aaa…aaa
◦ 2nd frame: aaa…aab
◦ 27th frame: aaa…aba

Security label
◦ Confidentiality level
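
The row ID scheme amounts to writing the frame's position as a fixed-width base-26 number. The sketch below shows one way to generate such IDs; the function names are hypothetical and the talk does not show its own implementation. Because the width is fixed, lexicographic order of the IDs matches the order in which frames were written, which is what a sorted key-value store needs for sequential reads.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
WIDTH = 32
CAPACITY = len(ALPHABET) ** WIDTH  # number of distinct row IDs, 26**32


def row_id(index):
    """Row ID for the frame at zero-based `index`: 32 base-26 digits, a=0 ... z=25."""
    if not 0 <= index < CAPACITY:
        raise ValueError("index out of range for a 32-character row ID")
    digits = []
    for _ in range(WIDTH):
        index, remainder = divmod(index, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def row_index(rid):
    """Inverse of row_id: recover the zero-based frame position."""
    value = 0
    for ch in rid:
        value = value * len(ALPHABET) + ALPHABET.index(ch)
    return value


print(row_id(0))   # 1st frame
print(row_id(1))   # 2nd frame
print(row_id(26))  # 27th frame
print(f"{CAPACITY:.1e}")
```

Evaluating `26**32` gives roughly 1.9 x 10^45 distinct row IDs.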

Page 9: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Organization of Columns

The frame is stored as the values of one or more columns.
◦ A frame will be partitioned into multiple values if it is large.

Column Family
◦ Contains the column index number

Column Qualifier
◦ First column: total size of frame
◦ Subsequent columns: size of value

Example: 1KB frame divided into two columns

Row ID     Column Family   Column Qualifier   Value
aaa…aaa    0               1024
aaa…aaa    1               512                <512 bytes>
aaa…aaa    2               512                <512 bytes>
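
The column layout can be sketched as a pair of helper functions that split a frame into cells and reassemble it. This is an illustration under assumptions: the function names and the tuple representation are hypothetical, the header cell's value is assumed to be empty because the table shows none, and no Accumulo client calls are made.

```python
def frame_to_cells(row, frame, chunk_size):
    """Split `frame` (bytes) into (row, family, qualifier, value) cells.

    Family 0 is a header whose qualifier is the total frame size; families
    1..n carry the chunks, each with its own size as the qualifier.
    """
    cells = [(row, "0", str(len(frame)), b"")]
    for offset in range(0, len(frame), chunk_size):
        chunk = frame[offset:offset + chunk_size]
        family = str(offset // chunk_size + 1)
        cells.append((row, family, str(len(chunk)), chunk))
    return cells


def cells_to_frame(cells):
    """Reassemble a frame and check it against the size in the header cell."""
    # Sort numerically by column index; a plain string sort would put "10" before "2".
    header, *parts = sorted(cells, key=lambda cell: int(cell[1]))
    frame = b"".join(value for _, _, _, value in parts)
    if len(frame) != int(header[2]):
        raise ValueError("frame is incomplete")
    return frame


frame = bytes(1024)  # a 1KB frame of zero bytes
cells = frame_to_cells("a" * 32, frame, chunk_size=512)
for _, family, qualifier, value in cells:
    print(family, qualifier, len(value))
```

The total size in the header lets a reader detect that it has not yet seen every chunk of a frame.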

Page 10: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Design of Proof-of-Concept

[Diagram: components of the proof-of-concept, with their annotations]

Producer: one producer per buffer
Unbounded Buffer Writer
Writer’s State: local persistent storage of last row written, etc.
Authorization Service: contains security labels for each Producer and Consumer
Accumulo: single node instance of Accumulo
Expired Row Deleter: deletes rows older than N seconds
Unbounded Buffer Reader
Reader’s State: local persistent storage of last row read, etc.
Consumer: multiple consumers per buffer
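
How these components interact can be modeled in memory. The sketch below is an assumption-laden illustration, not the proof-of-concept itself: `BufferModel` stands in for the Accumulo table, writer and reader state are held in plain attributes instead of local persistent storage, authorizations are passed in directly instead of coming from the Authorization Service, and timestamps are supplied by the caller so the behavior is deterministic.

```python
class BufferModel:
    """In-memory stand-in for the table: row index -> (timestamp, labels, frame)."""

    def __init__(self):
        self.rows = {}
        self.next_index = 0  # writer's state: the next row to write

    def write(self, frame, labels, now):
        self.rows[self.next_index] = (now, frozenset(labels), frame)
        self.next_index += 1

    def delete_expired(self, now, max_age):
        """Expired row deleter: drop rows older than `max_age` seconds."""
        expired = [i for i, (ts, _, _) in self.rows.items() if now - ts > max_age]
        for i in expired:
            del self.rows[i]
        return len(expired)


class Reader:
    def __init__(self, buffer, authorizations):
        self.buffer = buffer
        self.authorizations = frozenset(authorizations)
        self.last_read = -1  # reader's state: the last row this reader has seen

    def poll(self):
        """Return the new frames this reader is authorized to see."""
        frames = []
        for i in sorted(k for k in self.buffer.rows if k > self.last_read):
            _, labels, frame = self.buffer.rows[i]
            if labels & self.authorizations:
                frames.append(frame)
            self.last_read = i
        return frames


buffer = BufferModel()
reader_a = Reader(buffer, {"A"})
reader_b = Reader(buffer, {"B"})
buffer.write(b"frame-1", {"A", "B"}, now=0)
buffer.write(b"frame-2", {"A"}, now=1)
print(reader_a.poll(), reader_b.poll())
```

Each reader advances its own position, so any number of consumers can share one buffer without coordinating with each other or with the single writer.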

Page 11: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Possible Improvements

Batch writing of rows
◦ Currently, Writers flush after writing one row.

Reduce polling
◦ Currently, a Reader polls for a new row when it has reached the end of the buffer
◦ Writers can notify Readers via multicast when a row is written

Page 12: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Multiple Queues vs Single Queue

Comparison between Qpid and our POC messaging system
◦ Compare the average time to read and write a frame at a specific rate

Frame sizes: 2MB and 8KB
Frame rate: one frame every 50 ms
Number of Consumers: 1, 10, 100, 1000
Number of confidentiality levels: 1 and 5

We didn’t make any special configurations to Qpid or Accumulo.

Page 13: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

1 Consumer – 50ms Frame Rate

Accumulo

# of Levels   Frame Size   Avg. Write Time   Avg. Read Time
1             8KB          0.18ms            4.3ms
1             2MB          111ms             196ms
5             8KB          0.18ms            4.3ms
5             2MB          111ms             196ms

The number of access levels had no impact on the read and write times.

Qpid

# of Levels   Frame Size   Avg. Write Time   Avg. Read Time
1             8KB          0.93ms            47ms
1             2MB          129ms             3.98s
5             8KB          2.21ms            47ms
5             2MB          3.58s             3.98s

As expected, duplicating the frame for each confidentiality level slows down writes.

Page 14: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

100 Consumers – 50ms Frame Rate

Accumulo

# of Levels   Frame Size   Avg. Write Time   Avg. Read Time
1             8KB          0.21ms            28.3ms
1             2MB          236ms             2.23s
5             8KB          0.21ms            28.3ms
5             2MB          236ms             2.23s

Impacted by the number of consumers.

Qpid

# of Levels   Frame Size   Avg. Write Time   Avg. Read Time
1             8KB          0.93ms            47ms
1             2MB          129ms             3.98s
5             8KB          2.21ms            47ms
5             2MB          3.58s             3.98s

The read and write times for 1 and 100 consumers were so close we only show the results from 1 consumer.

Page 15: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

1000 Consumers - Accumulo

# of Levels   Frame Size   Avg. Write Time   Avg. Read Time   Frame Rate
1 & 5         8KB          2.43ms            209ms            50 ms
1 & 5         2MB          12.9s             11.4s            50 ms
1 & 5         2MB          512ms             18.6s            Write: 50 ms, Read: 30 s

Read times impacted by multiple consumers on the same VM and disk contention.

We didn’t test Qpid with 1000 Consumers because the queues are kept in RAM and we didn’t have enough RAM for 1000 consumers.

Page 16: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

8KB Frames

[Chart: read and write times in milliseconds (axis 0 to 250) against the number of consumers (1, 10, 100, 1000), with series "Read" and "Write". Data labels shown on the chart: 4.3, 5.38, 28.3, 209.]

Evidence that sharing NIC may be impacting performance

Read times are almost the same when there is only 1 consumer per VM.

Write times remain flat while read times increase as the number of consumers increases on the same VM.

Page 17: Accumulo Summit 2014: Using Accumulo to Implement Confidentiality Protection in Message Queuing

Conclusion

Accumulo may be suitable as the backbone for a message queuing system
◦ Accumulo outperforms Qpid for complex attribute policies.
◦ A messaging system based on Accumulo isn’t restricted by RAM like Qpid.
◦ Drawback: May require a lot of polling.

Large frames
◦ Small number of consumers and no more than 5 frames per second.

Small frames
◦ 100’s of consumers per buffer and no more than 40 frames per second.