A Scalable Complex Event Processing Framework for ... · A Scalable Complex Event Processing...

36
0 A Scalable Complex Event Processing Framework for Combination of SQL-based Continuous Queries and C/C++ Functions Takashi Takenaka, Masamichi Takagi Hiroaki Inoue NEC Corporation August 30, 2012 FPL2012 0

Transcript of A Scalable Complex Event Processing Framework for ... · A Scalable Complex Event Processing...

0

A Scalable Complex Event Processing Framework for Combination of SQL-based Continuous Queries

and C/C++ Functions

Takashi Takenaka, Masamichi Takagi Hiroaki Inoue

NEC Corporation

August 30, 2012

FPL2012

0

1

Today’s Key Message

Synthesizing

SQL with C

to event processing

on FPGAs 1

2

Outline

• Background

• Our work

• Evaluation

• Conclusion 2

Complex Event Processing(CEP)

Retrieve important

information

Real-Time Events

Decision Making

Response

Control

Patterns from Historical Data

Real-world

3

CEP

IDC: $10B (software) by 2014

CEP Applications Financial Algorithmic Trading

Active diagnostics of facilities

Fraud detection: web commerce, credit card

Compliance reporting and monitoring

Track and Trace: Patients, packages

4

CEP in financial treading

5

(stock_id, price, volume) CEP

Stock exchanges

t

volu

me 1.0

0.8

VWAP high

Capture a trend: volume is starting high then it reaches 80%, then Calculate “volume-weighted average of price(V WAP)” during the period.

Decision Market information

Capture meaningful trend and calculate financial benchmarks

Performance requirement

6

monitoring Manufacturing Financial

Latency

Events per second

1ms

1s

Web analysis

Data warehouse

1 h

1K 10K 1M

Relational Database

Limit of SW

solution

(Torsten Grabs, “Introduction to Microsoft SQL Server 2008 R2 StreamInsight”)

Application domain of CEP

7

Hardware CEP Software CEP Hardware CEP [17,21]

Software CEP Hardware CEP

Performance Low (0.12Gbps) [12]

High (20Gbps)

Application range Broad Limited

Events Server

Events FPGA

NIC

Server

8

Major HW CEP Requirements

• Programmability

• Scalability

9

Programmability

Oracle CQL

Sybase CCL

IBM SPL

StreamBase StreamSQL

EsperTech EPL

... etc.

Software CEPs

SQL +

User-defined functions (C/Java)

Scalability

10

CEP R

esults

Event sources

10k Listed issues

in NYSE

1k sensor nodes

100 sensor nodes

multiple streams

Bridge monitoring

Factory monitoring

Financial Trading

Existing HW CEPs

11

Woods [21] Inoue[17]

Language SQL-based C-based

Performance Good (1Gbps)

Good (20Gbps)

Programmability No user functions

No SQL interface

Scalability Limited (< 1K) No

Summary of background

Programmability

Scal

abili

ty

SQL-based [21]

C-based [17]

1k

1

10k

13

Outline

• Background

• Our work

• Evaluation

• Conclusion 13

Our Goal

? SQL-based

[21]

C-based [17]

Programmability

Scal

abili

ty

1k

1

10k

SQL with C

15

Two Technical Points

1. SQL interface on the top of C-to-HDL compiler

2. Scalable architecture for multiple streams

15

Performance came from …

16

SQL to HDL compiler

SQL-based CEP language

C to HDL compiler

C-based CEP language

C-based: 20Gbps SQL-based: 1Gbps

Current “C-to-HDL compiler” technology generates well-optimized circuits

Use industry tools No industry tool

<

17

Basic Strategy

C to HDL compiler

C-based CEP Language

C to HDL compiler

C-based CEP Language

C-based [17]

This work

SQL to C interface

SQL to HDL compiler

SQL-based CEP language

SQL-based [21]

SQL’s primitives

18

SELECT * WHERE volume > 3200 From Stock

SELECT stock_id, SUM(volume) AS sum From Stock [ROWS 4 PRECEDING]

SELECT stock_id, calc_vwap() AS vwap From Stock [ROWS 4 PRECEDING]

Selection

Window, Aggregation

User functions

Translation rule: selection

19

SELECT * WHERE volume > 3200 From Stock

while(true) { entry = event_fifo_in.read(); if ( entry.volume > 3200 ) { event_fifo_out.write(entry); } }

SQL

C

Finding events whose volume is greater than 3,200.

Translation rule: window

20

SELECT stock_id, SUM(volume) AS sum From Stock [ROWS 4 PRECEDING]

#define WINDOW_SIZE 4 evin_t win[WINDOW_SIZE]; while(true) { for(i=1;i<WINDOW_SIZE;i++) win[i] = win[i-1]; win[0] = event_fifo_in.read(); result.sum = calc_sum_volume(win); event_fifo_out.write(result); }

SQL

C

Calculating sum of volume of latest 4 events.

Translation rule: User-function

21

SELECT stock_id, calc_vwap() AS vwap From Stock [ROWS 4 PRECEDING]

while(true) { ... result.vwap = calc_vwap(win); event_fifo_out.write(result); }

SQL

C

Calculating “volume-weighted average of price” of latest 4 events.

After translated to C, …

22

C-to-HDL compilers generate well-parallelized and pipelined circuits.

+

+

+

Result

Events

win[]

for(i=0;i<WINDOW_SIZE;i++) sum += win[i]; result = sum;

Summary of SQL-to-C interface

23

SQL-based [21]

C-based [17] Ours

Selection Yes No Yes Window Yes No Yes

Matching Yes Yes Yes Aggregation Limited Yes Yes

User function No Yes Yes Multiple streams Limited No ?

24

Two Technical Points

1. SQL interface on the top of a C-to-HDL compiler

2. Scalable architecture for multiple streams

24

Multiple streams

25

CEP Engine

CEP Engine

CEP Engine

Stream for stock A

Stream for stock B

Stream for stock C

Naive replication is not applicable for > 100 streams

CEP is required to receive multiple streams and to perform event processing.

Observation

26

Multiple streams are usually interleaved into a stream on a high-speed link an event arrives at a time

Stream for stock A

Stream for stock B

Stream for stock C

MUX C B A B A

A A

B B

C C

Interleaved multiple-context architecture

27

C B A B A Interleaved multiple streams

Events Stream Partitioner

CEP Core Logic

Matching contexts Stream ID

Summary of our work

28

SQL-based [21]

C-based [17] Ours

Selection Yes No Yes Window Yes No Yes

Matching Yes Yes Yes Aggregation Limited Yes Yes

User function No Yes Yes Multiple streams Limited No Yes

29

Outline

• Background

• Our work

• Evaluation

• Conclusion 29

30

Evaluation platform FPGA Xilinx XC5VLX330T-2 CAD NEC CyberWorkBench, Xilinx ISE 12.2

30

31

Setup

PCIe DMA

CEP logic

Network Server

NIC

156MHz x 128b=19.968Mbps

MAC &

UDP

31

Query

t

volu

me 1.0

0.8

VWAP high Captures a trend where volume is starting high then it reaches 80%, then calculates “volume-weighted average of price(VWAP)” during the period.

Query

32

SELECT stock_id, vwap FROM Stock MATCH_RECOGNIZE ( PARTITION BY stock_id MEASURES C.stock_id AS stock_id C.vwap AS vwap PATTERN (A B+ C) DEFINE A AS vol_high() B AS pri_stbl() C AS vol_down() )

SQL bool vol_high(evin_t ev, evarg_t &arg) { arg.stock_id = ev.stock_id; arg.volume = ev.volume arg.sum_volume = ev.volume; arg.sum_w_price = ev.price * ev.volume; return ev.volume > 1000; }

bool pri_stbl(evin_t ev, evarg_t &arg) { arg.sum_volume += ev.volume; arg.sum_w_price += ev.price * ev.volume arg.vwap=arg.sum_w_price/arg.sum_volume; return (ev.price > vwap); }

bool vol_down(evin_t ev, evarg_t &arg) { return ev.volume < 0.8 * arg.volume; }

User functions

Scalability

0

10

20

30

40

50

60

70

80

90

100Proposed Naive replication

FPG

A U

sage

# of streams

500x improvement

FPGA Usage = max(block mem usage/block mem avail, slice usage/slice avail) * 100

34

Outline

• Background

• Our work

• Evaluation

• Conclusion 34

Our work

Woods [21]

Inoue [17]

Programmability

Scal

abili

ty

1k

1

10k

SQL with C