Can a Divorced MOM and DAD take care of the CHILD ?


Transcript of Can a Divorced MOM and DAD take care of the CHILD ?

Page 1: Can a Divorced MOM and DAD take care of the CHILD ?

© 2003 IBM Corporation

Can a Divorced MOM and DAD take care of the CHILD ?

MOM – Message Oriented Middleware.

DAD – Direct Access to Data (DBMSs).

CHILD – Correlating Historical or In-transit Large-scale Data-stream.

Page 2: Can a Divorced MOM and DAD take care of the CHILD ?

Business Unit or Product Name

In this talk …

Introduce CHILD - Correlating Historical or In-transit Large-scale Data-streams.

Compare CHILD and current Stream Processing Engines.

How can DAD and MOM help, and how might they work together?

Summary.

Page 3: Can a Divorced MOM and DAD take care of the CHILD ?

The Supply Chain Example

Some funny DAD characteristics:

• DADs are corporate custodians of truth.

• DADs generally maintain a single version of truth - the recent truth.

• DADs are optimized to answer questions for a single version of truth.

• The truths can be atomically evaluated to answer the questions.

• There is only one answer to the question.

• DADs do not remember the answers provided to previously asked questions.

Page 4: Can a Divorced MOM and DAD take care of the CHILD ?

Supply Chain Evolves to Accommodate Emerging Business Practices

Some of the Characteristics of MOM

Allows asynchronous communication between disconnected systems within and across organizations.

Provides Message Filtering and Message Correlation.

Persistence and Guaranteed Delivery Mechanism.

Message enrichment can be achieved by referencing static datasets during routing.
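The filtering and enrichment characteristics above can be sketched roughly as follows. This is only an illustration: the message fields, the `SUPPLIERS` lookup table and the `route` function are hypothetical names, not part of any particular MOM product.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical static reference dataset consulted during routing.
SUPPLIERS = {"S1": {"region": "EU"}, "S2": {"region": "US"}}

def route(messages: Iterable[dict],
          keep: Callable[[dict], bool]) -> Iterator[dict]:
    """Filter messages, then enrich the survivors from the static dataset."""
    for msg in messages:
        if not keep(msg):
            continue  # message filtering
        extra = SUPPLIERS.get(msg["supplier"], {})
        yield {**msg, **extra}  # message enrichment

inbox = [
    {"supplier": "S1", "qty": 5},
    {"supplier": "S2", "qty": 0},
    {"supplier": "S3", "qty": 7},  # unknown supplier: passes through unenriched
]
routed = list(route(inbox, keep=lambda m: m["qty"] > 0))
```

Note that the lookup table never changes while messages flow, which is what distinguishes this kind of enrichment from correlating one stream with another.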

Page 5: Can a Divorced MOM and DAD take care of the CHILD ?

Proactive Supply Chain Management

In a proactive case: each system creates its own unique view of the state of interest and receives information about changes to that state.

There may not be a complete truth. Facts may arrive over a period of time.

The answers to the questions change as new facts become available.

The aim is to reduce the time to re-compute most recent answers.

Page 6: Can a Divorced MOM and DAD take care of the CHILD ?

Scenario: London Congestion Charging (+ security)

[Diagram labels: Sensor Reading; Real-time processing; Command & Control; DB; Retrospective processing; Billing; Security/fraud alerts]

Charging (and security) rules

• vehicle license plate, owner, owner residency, fee paid?

• entry and exit times of vehicle, time of day, day of week, charging, residency

• reentry within 3 hours is free

• fraud: vehicle enters the zone and is not seen; security: grouped tanker trucks

• 100,000s of vehicle observations per hour
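As a minimal sketch of how the "reentry within 3 hours is free" rule might be evaluated over a stream of observations, consider the code below. It assumes that the 3 hours are measured from the vehicle's last exit, that events arrive in time order, and that a flat fee applies; the `FEE` value, the event layout and the `charge` function are all hypothetical.

```python
from datetime import datetime, timedelta

FREE_REENTRY = timedelta(hours=3)  # from the rules: reentry within 3 hours is free
FEE = 5.0                          # hypothetical flat fee, not from the talk

def charge(events):
    """Return {plate: total fee} for time-ordered (plate, kind, time) events."""
    last_exit = {}  # plate -> time of most recent exit
    totals = {}
    for plate, kind, when in events:
        if kind == "exit":
            last_exit[plate] = when
        elif kind == "entry":
            prev = last_exit.get(plate)
            # Charge the first entry, and any entry more than 3 hours after the last exit.
            if prev is None or when - prev > FREE_REENTRY:
                totals[plate] = totals.get(plate, 0.0) + FEE
    return totals

t0 = datetime(2003, 6, 2, 8, 0)
events = [
    ("ABC123", "entry", t0),
    ("ABC123", "exit", t0 + timedelta(hours=1)),
    ("ABC123", "entry", t0 + timedelta(hours=2)),  # 1 hour after exit: free
    ("ABC123", "exit", t0 + timedelta(hours=3)),
    ("ABC123", "entry", t0 + timedelta(hours=7)),  # 4 hours after exit: charged
    ("XYZ789", "entry", t0),
]
totals = charge(events)
```

Even this simple rule needs per-vehicle state that outlives a single observation, which is why the scenario mixes real-time and retrospective processing.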

Page 7: Can a Divorced MOM and DAD take care of the CHILD ?

Example of CHILD Applications

RFID

Sensor Networks

Stock Quotes

Database Notification

Content Routing Networks

RSS Aggregators

Page 8: Can a Divorced MOM and DAD take care of the CHILD ?

CHILD – Correlating Historical or In-transit Large-Scale Data Streams

Characteristics:

1. Append Only Data.

2. Push Paradigm – Stream of Data (truths), static set of queries (questions).

3. Continuous processing requirements.

4. Correlation requirements.
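The push paradigm in the list above inverts the usual database arrangement: the queries stay put and the data flows past them. A toy sketch, assuming in-memory predicates as the "questions" and a Python list as the append-only store (the class and method names are hypothetical):

```python
class ContinuousQueryEngine:
    """Standing queries evaluated against each tuple as it is pushed."""

    def __init__(self):
        self.queries = {}  # name -> (predicate, list of matching tuples)
        self.log = []      # append-only record of every tuple seen

    def register(self, name, predicate):
        self.queries[name] = (predicate, [])

    def push(self, item):
        self.log.append(item)  # data is only ever appended
        for predicate, matches in self.queries.values():
            if predicate(item):
                matches.append(item)

    def results(self, name):
        return list(self.queries[name][1])

engine = ContinuousQueryEngine()
engine.register("hot", lambda r: r["temp"] > 30)
engine.register("sensor_a", lambda r: r["sensor"] == "a")
for reading in [
    {"sensor": "a", "temp": 25},
    {"sensor": "b", "temp": 35},
    {"sensor": "a", "temp": 40},
]:
    engine.push(reading)
```

This sketch covers only characteristics 1 to 3; correlation across tuples requires the queries to keep state, as the query types on the following slides suggest.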

Page 9: Can a Divorced MOM and DAD take care of the CHILD ?

CHILD – Correlating Historical or In-transit Large-Scale Data Streams - 2

[Diagram: a stream as a sequence of states and transitions: S’ ∆’ S’’ ∆’’ S’’’ ∆’’’ S* ∆*]

All queries have an associated time constraint specified in terms of windowing functions.

Query Type 1: Query when states S’, S’’, S’’’, S* are reached. (DB Notification)

Query Type 2: Query when S’’’ is reached after S’ and S’’ (Sensor Networks)

Query Type 3: Query when S* is reached within 2 transitions from S’. (BI)

Query Type 4: Get an aggregate of (∆) (Sensor Network)

Query Type 5: Query when S’, S’’ were observed in the past N time windows. (Fraud detection Networks)

Query Type 6: Query when ∆’, ∆’’, ∆’’’ resulted in exact changes from S’ to S’’ to S’’’. (ESB)

Query Type 7: Query when S’,S’’,S’’’ …∆’, ∆’’, ∆’’’… were not observed. (Fraud Detection)
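To make one of these concrete, Query Type 2 (a state reached after two earlier states, within a window) might be sketched as below. The window is assumed to be a count of the most recent observations, the states are plain strings standing in for S’, S’’ and S’’’, and the function names are hypothetical.

```python
from collections import deque

def is_subsequence(pattern, items):
    """True if `pattern` occurs in `items` in order, not necessarily adjacent."""
    it = iter(items)
    # `p in it` advances the iterator, so each match must follow the previous one.
    return all(p in it for p in pattern)

def detect_sequence(stream, pattern, window):
    """Yield positions where `pattern` completes within the last `window` states."""
    recent = deque(maxlen=window)  # sliding window over the stream
    for pos, state in enumerate(stream):
        recent.append(state)
        if state == pattern[-1] and is_subsequence(pattern, recent):
            yield pos

stream = ["S1", "S2", "X", "S3", "S3", "S1", "S3"]
hits = list(detect_sequence(stream, ["S1", "S2", "S3"], window=4))
```

Only the first "S3" counts as a hit; by the time the later ones arrive, "S1" and "S2" have slid out of the window. A Type 7 query (absence) would need the opposite: a timer that fires when the window closes without a match.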

Page 10: Can a Divorced MOM and DAD take care of the CHILD ?

CHILD – Correlating Historical or In-transit Large-Scale Data Streams - 3

All queries have an associated time constraint specified in terms of windowing functions.

Query Type 8: Evaluate a join on S, P states. (Almost all use cases)

Query Type 9: Co-evaluate a filter on S, P, … (Almost all use cases)

Query Type 10: Evaluate a join/filter on S(t), S(t-T). (Sensor Networks, BI)

Query Type 11: Evaluate P between states S’ and S’’’. (Sensor Networks, Stock Ticks)

[Diagram: two streams as sequences of states and transitions: S’ ∆’ S’’ ∆’’ S’’’ ∆’’’ S* ∆* and P’ δ’ P’’ δ’’ P’’’ δ’’’ P* δ*]
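A join across two streams S and P under a time constraint (Query Type 8) could look roughly like the sketch below. It assumes each tuple is `(time, key, value)`, that both inputs are available as finite lists for the demonstration, and that the window is a maximum timestamp difference; `window_join` is a hypothetical name.

```python
def window_join(s_stream, p_stream, window):
    """Join S and P tuples with equal keys whose times differ by at most `window`."""
    merged = sorted(
        [(t, "S", k, v) for t, k, v in s_stream]
        + [(t, "P", k, v) for t, k, v in p_stream],
        key=lambda row: row[0],  # process both streams in time order
    )
    seen = {"S": [], "P": []}  # tuples retained per stream
    out = []
    for t, side, key, value in merged:
        other = "P" if side == "S" else "S"
        # Drop tuples from the other stream that have fallen out of the window.
        seen[other] = [x for x in seen[other] if t - x[0] <= window]
        for _, okey, ovalue in seen[other]:
            if okey == key:
                pair = (value, ovalue) if side == "S" else (ovalue, value)
                out.append((key, *pair))  # always (key, S value, P value)
        seen[side].append((t, key, value))
    return out

S = [(1, "a", "s1"), (5, "b", "s2"), (10, "a", "s3")]
P = [(2, "a", "p1"), (9, "b", "p2")]
joined = window_join(S, P, window=3)
```

The `seen` lists are the retained state: widening the window improves the chance of a match but increases memory and work per tuple.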

Page 11: Can a Divorced MOM and DAD take care of the CHILD ?

Stream Systems – Academic Projects

AURORA

BOREALIS

STREAMDB

TELEGRAPHCQ

NIAGARACQ

Page 12: Can a Divorced MOM and DAD take care of the CHILD ?

CHILD and Stream Processing – Some Observations.

Temporal dimension is not always the predominant one.

For business processing all facts are retained.

An event is in the eye of the beholder, so every tuple is a message until observed in a context. Queries need to have context.

Being “Turing complete”, SQL allows one to specify arbitrary data manipulations; the tradeoff is how much state we retain vs. resource usage vs. throughput.

A declarative stream manipulation language needs to be developed.

A conceptual data model for manipulating append only data should be the focus - not limited to the engineering aspect of the systems.

Additionally, smart summarization techniques are required for correlating and mining historic data.

Page 13: Can a Divorced MOM and DAD take care of the CHILD ?

CHILD and Stream Processing – Observations contd.

Real-time performance is critical ONLY in some cases.

Providing a common abstraction for sequence analysis on the data items appearing in the stream and across the streams remains critical.

Typical stream systems are restricted to 20-to-30 operators and require resource augmentation to handle higher workloads, which in turn requires capabilities similar to MOMs.

For handling queries over historic data and correlation with historic data, CHILD requires capabilities equivalent to DADs.

Page 14: Can a Divorced MOM and DAD take care of the CHILD ?

[Diagram labels: STREAM; SPE with DAD (optional); SPE-1, SPE-2, SPE-3, SPE-4, SPE-5]

Is this not MOM with Content Routing Operators?

Page 15: Can a Divorced MOM and DAD take care of the CHILD ?

[Diagram labels: SPE-1, SPE-2, SPE-3, SPE-4, SPE-5]

What is missing?

Ability to create an ad hoc network of content routers given a list of streams and queries.

Ability to describe and support smart subscriptions.

Ability to scale simultaneous evaluation of multiple expressions.

Page 16: Can a Divorced MOM and DAD take care of the CHILD ?

Classification of MOMs and DADs

| MOMs / DADs | Divorced DAD | Relational DAD | Active DAD | Temporal DAD |
| Divorced MOM | Stream Processing Systems | Triggers and Database Notification Systems | Rule based subscription evaluation | Context Analysis |
| Independent MOM | Content Routing Systems | Message Enrichments with static data | ECA (Event Condition Action) with data dissemination | Temporal Event Correlation |
| Transactional MOM | Proprietary Queue based Systems | Secured Pub/Sub with transactional capability | Database as a rule-based Content Provider | Traceability Analysis + Event Correlation |

Page 17: Can a Divorced MOM and DAD take care of the CHILD ?

Summary

Stream Processing is just one aspect of the emerging paradigm of processing append only data with support for continuous queries.

These systems need a new representational model. SQL or SQL extensions are not sufficient.

If not careful, we may redevelop parts of MOM and DAD in the process of creating support for CHILD.

Page 18: Can a Divorced MOM and DAD take care of the CHILD ?

DAD: There is one and only one truth that I know. For previous versions of truth see my log…

MOM: I do not need to know the truth, I just GOSSIP. I GOSSIP about facts !!!

CHILD: But MOM, DAD, I do not need to know the complete truth. I want to take decisions now, I will correct them when I know more.

DAD: Well, I can provide you with triggers if you want.

CHILD: Ahhh !!! As if they scale.

MOM: Well I can talk with other MOMs and enrich the contents on the fly.

CHILD: Oh, is that so!! Can you also enrich it on the fly? Or tell me when three red marbles are followed by four green ones?

MOM: Only if I know what marbles are. Maybe with my content routing hat on I can do that.

CHILD: Yeah Right !!!

CHILD: Can uncle Active (Database) help?

MOM: Oh no, he suffers from Rule Termination Problem.

DAD: Well, if you ask that temporally aware brother of mine, he can help you relate things in the past.

CHILD: But DAD, time is just one axis; I also consider the value axis. I want to purchase MOBIL OIL stock only when the fuel price has risen after a REFINERY BLOWUP. It's not time but the context that matters.

MOM: You know my sister STREAM PROCESSING ENGINE can help.

CHILD: Oh sure, with the ability to provide only 20-30 operators, in-memory operations only, optional recovery, undefined semantics and a NON DECLARATIVE interface, I will be in great hands!!! YUCK!!

MOM: Oh, we need to provide him with a mix, or else he will replicate our behaviors.

DAD: DOH !!!