Introduction to Microsoft SQL Server 2008 R2 StreamInsight
Torsten Grabs, Senior Program Manager Lead, Microsoft Corporation
PDC09-SVR07
What are Event-Driven Applications?
Complex Event Processing (CEP) is the continuous and incremental processing of event streams from multiple sources based on declarative query and pattern specifications with near-zero latency.
                  Database Applications         Event-driven Applications
Query Paradigm    Ad-hoc queries or requests    Continuous standing queries
Latency           Seconds, hours, days          Milliseconds or less
Data Rate         Hundreds of events/sec        Tens of thousands of events/sec or more
[Figure: a database application answers a request with a response; an event-driven application turns an input stream of events into an output stream.]
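The contrast between the two query paradigms can be sketched in a few lines. This is a minimal plain-Python illustration, not StreamInsight code; the function names `ad_hoc_average` and `standing_average` are hypothetical.

```python
from typing import Iterable, Iterator, List


def ad_hoc_average(stored: List[float]) -> float:
    """Database style: one request scans stored data and returns one response."""
    return sum(stored) / len(stored)


def standing_average(stream: Iterable[float]) -> Iterator[float]:
    """Event-driven style: the result is updated incrementally per arriving event."""
    count, total = 0, 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count  # one output event per input event


readings = [10.0, 20.0, 60.0]
print(ad_hoc_average(readings))          # 30.0
print(list(standing_average(readings)))  # [10.0, 15.0, 30.0]
```

The standing query never rescans earlier events: it keeps only a count and a running total, which is what makes low latency at high data rates plausible.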
Scenarios for Event-Driven Applications
[Figure: application classes plotted by aggregate data rate (events/sec; axis from 0 to ~1 million) against latency (axis from months down to < 1 ms). Classes shown: Relational Database Applications; Data Warehousing Applications; Web Analytics Applications; Operational Analytics Applications, e.g., Logistics; Monitoring Applications; Manufacturing Applications; Financial Trading Applications. One region is labeled "CEP Target Scenarios".]
Example Scenarios
Manufacturing:
• Sensor on plant floor
• React through device controllers
• Aggregated data
• 10,000 events/sec
Web Analytics:
• Click-stream data
• Online customer behavior
• Page layout
• 100,000 events/sec
Financial Services:
• Stock & news feeds
• Algorithmic trading
• Patterns over time
• Super-low latency
• 100,000 events/sec
Power, Utilities:
• Energy consumption
• Outages
• Smart grids
• 100,000 events/sec
Query types:
• Threshold queries
• Event correlation from multiple sources
• Pattern queries
Uses:
• Visual trend-line and KPI monitoring
• Batch & product management
• Automated anomaly detection
• Real-time customer segmentation
• Algorithmic trading
• Proactive condition-based maintenance
[Figure: asset instrumentation for data acquisition and subscriptions to data feeds supply a data stream to the event processing engine, which emits a data stream; the engine performs lookups against asset specs & parameters and is connected to a stream data store & archive.]
Virtuous Cycle: Monitor, Manage, Mine
Industry trends
• Data acquisition costs are negligible
• Raw storage costs are small and continue to decrease
• Processing costs are non-negligible
• Data loading costs continue to be significant
CEP advantage
• Process data incrementally, i.e., while it is in flight
• Avoid loading while still doing the processing you want
• Seamless querying for monitoring, managing and mining
[Figure: a cycle with the steps Monitor KPIs; Record raw data (history); Mine historical data; Devise new KPIs; Manage business via KPI-triggered actions.]
Solutions with StreamInsight CEP
[Figure: data sources (operations, assets, feeds, sensors, devices) supply input data streams to a CEP engine running queries f(x) and g(y). Monitor & Record: the streams are recorded in an operational data store & archive. Mine & Design: history is mined and new queries are deployed, so that the CEP engine runs f(x), f'(x), g(y) and h(x,y). Manage & Benefit: results such as f'(x) and h(x,y) leave as output data streams.]
Overview: Microsoft StreamInsight
[Figure: StreamInsight architecture. Event sources (devices and sensors, web servers, event stores & databases, stock tickers & news feeds) feed input adapters; the CEP engine runs standing queries over the events and can use static reference data (a table with columns C_ID, C_NAME, C_ZIP); output adapters deliver events to event targets (event stores & databases, pagers & monitoring devices, KPI dashboards and SharePoint UI, trading stations). Two areas are labeled: "CEP Application at Runtime" and "CEP Application Development" (IDE, .NET, C#, LINQ).]
Events
> Events expose different temporal characteristics
  > Point in time events
  > Interval events with fixed duration
  > Interval events with initially unknown duration
> Rich payloads capture all properties of an event
[Figure: events a to e drawn against a time axis (t1 to t5) and a payload/value axis.]
Event Types
> Events in Microsoft's CEP platform use the .NET type system
> Events are structured and can have multiple fields
  > Fields are typed using the .NET framework types
> CEP engine provisioned timestamp fields capture all the different temporal event characteristics
> Event sources populate time stamp fields
[Example event layout: Timestamps/Metadata | Long pumpID | String Type | String Location | Double flow | Double pressure]
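As a rough illustration of a typed payload combined with the three temporal shapes, here is a plain-Python sketch. It is not the StreamInsight type system: the class names, the sample values, and the convention that a point event has equal start and end times are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class PumpReading:
    """Payload with the fields from the example layout (types mapped to Python)."""
    pump_id: int
    type: str
    location: str
    flow: float
    pressure: float


@dataclass
class Event:
    """A payload plus the timestamps that define its temporal shape."""
    payload: PumpReading
    start: datetime
    end: Optional[datetime] = None  # None: duration not known yet

    def shape(self) -> str:
        if self.end is None:
            return "interval, duration initially unknown"
        if self.end == self.start:  # assumption of this sketch
            return "point in time"
        return "interval, fixed duration"


t0 = datetime(2009, 11, 17, 9, 0, 0)
reading = PumpReading(7, "centrifugal", "plant A", 12.5, 3.2)  # made-up values

point = Event(reading, t0, t0)
interval = Event(reading, t0, t0 + timedelta(seconds=30))
open_ended = Event(reading, t0)
open_ended_shape = open_ended.shape()
open_ended.end = t0 + timedelta(minutes=2)  # a later update fixes the duration
```

The last line mirrors the idea that a stream can carry changes to event durations as well as insertions of new events.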
Event Streams & Adapters
> A stream is a possibly infinite sequence of events
  > Insertions of new events
  > Changes to event durations
> Stream characteristics:
  > Event/data arrival patterns
    > Steady rate with end-of-stream indication
    > Intermittent, random, or in bursts
  > Out of order events: Order of arrival of events does not match the order of their application timestamps
> Adapters
  > Receive/get events from the data source
  > Enqueue events for processing in the engine
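One common way to cope with out-of-order arrival is a small reorder buffer. The sketch below is plain Python and does not show how StreamInsight handles it internally; the `max_delay` bound and the function name `reorder` are assumptions for illustration.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Event = Tuple[int, str]  # (application timestamp, payload)


def reorder(arrivals: Iterable[Event], max_delay: int) -> Iterator[Event]:
    """Release events in timestamp order, tolerating lateness up to max_delay.

    An event is released once the newest timestamp seen so far is at least
    max_delay ahead of it. Events that arrive later than that bound are not
    handled correctly by this sketch.
    """
    pending: List[Event] = []
    newest = None
    for event in arrivals:
        heapq.heappush(pending, event)
        newest = event[0] if newest is None else max(newest, event[0])
        while pending and pending[0][0] <= newest - max_delay:
            yield heapq.heappop(pending)
    while pending:  # end-of-stream indication: flush what is left
        yield heapq.heappop(pending)


arrived = [(1, "a"), (3, "c"), (2, "b"), (5, "e"), (4, "d")]
print(list(reorder(arrived, max_delay=2)))
# [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
```

A larger `max_delay` tolerates more disorder at the price of higher latency, since events wait longer in the buffer before release.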
Typical CEP Queries
> Typical CEP queries require a combination of functionality
  > Complex type describes event properties
  > Calculations introduce additional event properties
  > Grouping by one or more event properties
  > Aggregation for each event group over a pre-defined period of time, typically a window
  > Multiple event groups monitored by the same query
  > Correlate event streams
  > Check for absence of activity with a data source
  > Enrich events with reference data
  > Collection of assets may change over time
> We want to make writing and maintaining those queries easy or even effortless
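To make one item from the list concrete, the absence-of-activity check can be modeled as follows. This is a plain-Python sketch with hypothetical names (`silent_sources`, `timeout`); it only covers sources that have reported at least once.

```python
from typing import Dict, Iterable, List, Tuple


def silent_sources(events: Iterable[Tuple[int, str]], now: int, timeout: int) -> List[str]:
    """Return sources whose most recent event is older than `timeout` at time `now`."""
    last_seen: Dict[str, int] = {}
    for ts, source in events:
        last_seen[source] = max(ts, last_seen.get(source, ts))
    return sorted(s for s, ts in last_seen.items() if now - ts > timeout)


heartbeats = [(0, "pump-1"), (0, "pump-2"), (5, "pump-1"), (9, "pump-1")]
print(silent_sources(heartbeats, now=10, timeout=5))  # ['pump-2']
```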
StreamInsight Query Features
> Operators over streams
  > Calculations (PROJECT)
  > Correlation of streams from different data sources (JOIN)
  > Check for absence of activity with a data source (EXISTS)
  > Selection of events from streams (FILTER)
  > Stream partitioning (GROUP & APPLY)
  > Aggregation (SUM, COUNT, …)
  > Ranking and heavy hitters (TOP-K)
  > Temporal operations: hopping window, sliding window
  > Extensibility – to add new domain-specific operators
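Three of the operators above (FILTER, PROJECT, JOIN) can be modeled on interval events in plain Python. This is a sketch under assumptions, not the engine's implementation: events are `(start, end, payload)` tuples, the join pairs events with equal keys whose validity intervals overlap, and the right-hand input is materialized in memory rather than streamed.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# (start, end, payload): the event is valid over the half-open interval [start, end)
Event = Tuple[int, int, Dict[str, object]]


def filter_events(stream: Iterable[Event], predicate) -> Iterator[Event]:
    """FILTER: keep events whose payload satisfies the predicate."""
    return (e for e in stream if predicate(e[2]))


def project(stream: Iterable[Event], fn) -> Iterator[Event]:
    """PROJECT: compute a new payload, leaving the timestamps alone."""
    return ((start, end, fn(payload)) for start, end, payload in stream)


def join(left: Iterable[Event], right: Iterable[Event], key: str) -> Iterator[Event]:
    """JOIN: pair events with equal keys whose validity intervals overlap."""
    right_events: List[Event] = list(right)
    for l_start, l_end, l_payload in left:
        for r_start, r_end, r_payload in right_events:
            start, end = max(l_start, r_start), min(l_end, r_end)
            if l_payload[key] == r_payload[key] and start < end:
                yield (start, end, {**l_payload, **r_payload})


quotes = [(0, 10, {"id": 1, "f1": 5.0, "f2": "foo"}),
          (0, 10, {"id": 2, "f1": 7.0, "f2": "bar"})]
trades = [(5, 15, {"id": 1, "f4": "buy"}),
          (20, 30, {"id": 1, "f4": "sell"})]

joined = join(filter_events(quotes, lambda p: p["f2"] == "foo"), trades, "id")
result = list(project(joined, lambda p: {"f1": p["f1"], "f4": p["f4"]}))
print(result)  # [(5, 10, {'f1': 5.0, 'f4': 'buy'})]
```

Only the first trade is paired, because the second one does not overlap the quote in time even though the keys match.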
LINQ Query Examples
LINQ Example – GROUP&APPLY, WINDOW:
    from e3 in MyStream3
    group e3 by e3.i into SubStream
    from win in SubStream.HoppingWindow(FiveMinutes, ThreeSeconds)
    select new { i = SubStream.Key, a = win.Avg(e => e.f) };
Operators: Grouping, Window, Project & Aggregate

LINQ Example – JOIN, PROJECT, FILTER:
    from e1 in MyStream1
    join e2 in MyStream2 on e1.ID equals e2.ID
    where e1.f2 == "foo"
    select new { e1.f1, e2.f4 };
Operators: Join, Filter, Project
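The GROUP&APPLY example with a hopping window can be approximated outside the engine. The sketch below is plain Python over point events; it assumes integer timestamps, windows aligned to time 0 with none starting earlier, and small illustrative sizes instead of the slide's five-minute window and three-second hop. The name `hopping_window_avg` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[int, str, float]  # (timestamp in seconds, group key i, value f)


def hopping_window_avg(events: Iterable[Event], size: int, hop: int
                       ) -> Dict[Tuple[str, int], float]:
    """Average f per group key and per hopping window [start, start + size)."""
    buckets: Dict[Tuple[str, int], List[float]] = defaultdict(list)
    for ts, key, value in events:
        # The latest window containing ts starts at the hop boundary at or before ts.
        start = (ts // hop) * hop
        # Walk back through every earlier window that still covers ts.
        while start > ts - size and start >= 0:
            buckets[(key, start)].append(value)
            start -= hop
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}


events = [(1, "a", 10.0), (4, "a", 20.0), (4, "b", 5.0), (7, "a", 30.0)]
averages = hopping_window_avg(events, size=6, hop=3)
for (key, start), avg in averages.items():
    print(key, start, avg)
```

Because the window is longer than the hop, each event contributes to more than one window, which is the defining difference from a tumbling window.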
Writing StreamInsight Queries
demo
Extensibility SDK
> Built-in operators do not cover all functionality
  > Need for domain-specific extensions
  > Integrate with functionality from existing libraries
> Support for extensions in the CEP platform:
  > User-defined operators, functions, aggregates
  > Code written in .NET, deployed as .NET assembly
  > Query operators and LINQ can refer to functionality of the assembly
> Temporal snap-shot operator framework
  > Interface to implement user-defined operators
  > Manages operator state and snapshot changes
  > Framework does the heavy lifting to deal with intricate temporal behavior such as out-of-order events
User-defined Extensions
> Temporal aspects managed by the framework
> Access to temporal metadata only if necessary
> Incremental state management for aggregates
> Easy to invoke from LINQ:
    var outputStream =
        from win in input.TumblingWindow(TimeSpan.FromSeconds(10))
        select new { v = win.TimeWeightedAverage(e => e.dVal) };
TWAvg Implementation in C#
    public class TimeWeightedAverage : CepTimeSensitiveAggregate<double, double>
    {
        public override double GenerateOutput(IEnumerable<IntervalEvent<double>> events,
                                              WindowDescriptor wd)
        {
            double avg = 0;
            foreach (IntervalEvent<double> e in events)
            {
                avg += e.Payload * (e.EndTime - e.StartTime).Ticks;
            }
            return avg / (wd.EndTime - wd.StartTime).Ticks;
        }
    }
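The arithmetic of the aggregate is easy to check by hand. The following plain-Python sketch mirrors the same formula on `(start, end, value)` tuples; the tuple layout and function name are assumptions for illustration, and the sketch assumes every event interval lies inside the window.

```python
from typing import Iterable, Tuple

Interval = Tuple[float, float, float]  # (start, end, payload value)


def time_weighted_average(events: Iterable[Interval],
                          window_start: float, window_end: float) -> float:
    """Sum of value * duration over the events, divided by the window length."""
    weighted = sum(value * (end - start) for start, end, value in events)
    return weighted / (window_end - window_start)


# A 10-second window: value 2.0 holds for 4 s, then 5.0 holds for 6 s.
print(time_weighted_average([(0, 4, 2.0), (4, 10, 5.0)], 0, 10))  # 3.8
```

The unweighted mean of the two values would be 3.5; the weighted result is higher because the larger value holds for longer.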
Managing CEP Data Sources & Queries
> Typical CEP Application
  > Multiple different data sources and adapters
  > Several queries
> Intellectual property is in the queries
> Protect investment in queries over time
  > Re-use existing queries as newer versions of an asset, device or data source become available
  > Re-use existing queries for new data source types
  > Seamlessly switch between live data feeds and historical data
Query Binding
> Binding a query with data sources yields a query instance
> Configure multiple query instances for a query
> Similar to data independence: the same query can be bound to different data sources as long as they deliver the required event type
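[Figure: "Data Sources" on one side and "Analytics & Queries" on the other. Adapter types AT1, AT2 and AT3 and query template QT1 are combined inside the CEP engine (input adapters, standing queries, output adapters) into the query instances Q1, Q1’ and Q1’’.]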
StreamInsight Query Configuration
> Query binding:
  > Coupling adapters with a query
  > Event types required by the query need to be matched by types delivered by adapters
  > Query can be re-used for all data sources of the same type
  > No changes to query necessary for re-use
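The binding idea can be sketched without any engine: a query template is a function over a stream, and binding it to adapters yields a runnable instance. Everything named here (`bind`, `high_pressure`, `live_feed`, `archive`, the 4.0 threshold) is hypothetical and only illustrates the re-use argument; it is not the StreamInsight binding API.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, float]
QueryTemplate = Callable[[Iterable[Event]], Iterator[Event]]


def high_pressure(stream: Iterable[Event]) -> Iterator[Event]:
    """Query template: needs events with 'pump_id' and 'pressure' fields."""
    return (e for e in stream if e["pressure"] > 4.0)


def bind(template: QueryTemplate, input_adapter: Callable[[], Iterable[Event]],
         output_adapter: Callable[[Event], None]) -> Callable[[], None]:
    """Couple a template with adapters; the result is a runnable query instance."""
    def run() -> None:
        for event in template(input_adapter()):
            output_adapter(event)
    return run


def live_feed() -> Iterable[Event]:   # hypothetical live source
    return [{"pump_id": 1, "pressure": 4.5}, {"pump_id": 2, "pressure": 3.0}]


def archive() -> Iterable[Event]:     # hypothetical historical source
    return [{"pump_id": 1, "pressure": 5.1}]


live_out: List[Event] = []
history_out: List[Event] = []
bind(high_pressure, live_feed, live_out.append)()
bind(high_pressure, archive, history_out.append)()
print(live_out)     # [{'pump_id': 1, 'pressure': 4.5}]
print(history_out)  # [{'pump_id': 1, 'pressure': 5.1}]
```

The template `high_pressure` is unchanged between the two instances; only the adapters differ, which is the point about switching between live feeds and historical data.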
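[Figure: a query formed by binding a query template (Join, Proj.) to two input adapters and one output adapter; each connection is annotated with the event type on the adapter side and on the template side.]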
StreamInsight Deployment Alternatives
[Figure: CEP engines deployed in tiers, from data sources (devices, sensors, web servers, feeds) through aggregation & correlation to complex analytics & mining.]
Roles of the CEP engines:
• CEP for lightweight processing and filtering
• CEP for aggregation and correlation of in-flight events
• CEP for complex analytics including historical data
Event processing engines are deployed at multiple places on different scales
• At the edge – close to the data source
• In the mid-tier – consolidate related data sources
• In the data center – historical archive, mining, large scale correlation
Managing StreamInsight Metadata
> Persisted metadata: SQL CE provider
> Non-persisted metadata: in-mem provider

Programming/Deployment | Implicit Server | Explicit Server | IObservable
Description | Abstracts from CEP server, adapters and binding | Full control over CEP server, adapters and binding | Use reactive framework interfaces
In-process | Supported | Supported | Supported
Separate processes | N/A | Supported | Supported
Trade-offs | Easy to use; no query template re-use | Full control; query template re-use | Easiest to use; data sources need IObservable interface
Financial Market Monitor
demo
Market Monitor: Architecture
[Figure: quotes (MSFT, IBM) flow through three StreamInsight instances, each with input adapters and output adapters. One instance applies grouping and aggregation; the other two apply filter and grouping.]

Sample rows, grouping and aggregation:
Asset Class | Ticker | Region | Country | Exchange | SUM Volume | SUM Bid | SUM Ask
Stock | MSFT | NA | US | NASDAQ | 100 | 100 | 100
Stock | IBM | NA | US | NASDAQ | 200 | 200 | 200
Deriv. | IBM | EMEA | CH | SWIX | 10 | 10 | 10

Sample rows, filter and grouping:
Ticker | Exchange | SUM Volume
MSFT | NASDAQ | 100

Sample rows, filter and grouping:
Exchange | SUM Bid | SUM Ask
NASDAQ | 100 | 100
NASDAQ | 200 | 200
Recap: Microsoft StreamInsight
[Figure: the StreamInsight architecture diagram again: event sources, input adapters, CEP engine with standing queries and static reference data (C_ID, C_NAME, C_ZIP), output adapters, event targets, and CEP application development.]
CEP platform from Microsoft to build event-driven applications
Development experience with .NET, C#, LINQ and Visual Studio 2008
Event-driven applications are fundamentally different from traditional database applications: queries are continuous, consume and produce streams, and compute results incrementally
Flexible adapter SDK with high performance to connect to different event sources and sinks
The CEP platform does the heavy lifting for you to deal with temporal characteristics of event stream data
Deployment Scenarios
Scenario 1: Custom CEP Application Dev
  Custom CEP application on the CEP engine, with reference data (.NET, C#, LINQ)
Scenario 2: Embed CEP in Application
  ISV application with CEP engine and reference data
Scenario 3: CEP Enabled Device
  Device with embedded CEP engine
Scenario 4: Operational Intelligence w/ CEP
  Madison; ETL pipeline with CEP engine; several CEP engines; KPIs and KPI mining
StreamInsight Roadmap
> Focus on
  > Custom development platform for CEP applications
  > CEP platform for Microsoft partners
> CTP (Community Technology Preview) and TAP (Technology Adopter Program) launched in August 2009
> Product released in first half 2010 with SQL Server 2008 R2
Timeline:
  May 2009: TechEd announcement
  Aug 2009: August CTP, TAP program
  Nov 2009: November CTP, TAP continues
  2010: Release
question & answer
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.