Big data and messaging with Spring

Post on 26-Jan-2015

© 2013 Pivotal Inc. All rights reserved. Do not distribute without permission.

with Artem Bilan

by Pivotal

Big Data & Messaging

Artem Bilan

abilan@gopivotal.com

Spring Integration Team

Spring Framework, Spring AMQP

Spring XD, Reactor

https://github.com/artembilan
http://www.linkedin.com/in/cleric

https://spring.io/team/artembilan
http://stackoverflow.com/users/2756547

Messaging and why do I care?

Messaging is very simple

Headers

Payload

public static void main(String[] args) {...}
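As a minimal sketch of the headers-plus-payload idea, a message can be modeled as an immutable payload together with a read-only map of headers. The class name `SimpleMessage` and its accessors are hypothetical and chosen only for illustration; this is not the Spring messaging API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class, not Spring's own Message abstraction.
final class SimpleMessage<T> {
    private final Map<String, Object> headers;
    private final T payload;

    SimpleMessage(T payload, Map<String, Object> headers) {
        this.payload = payload;
        // Defensive copy: later changes to the caller's map do not leak in.
        this.headers = Collections.unmodifiableMap(new LinkedHashMap<>(headers));
    }

    // The business data carried by the message.
    T getPayload() {
        return payload;
    }

    // Metadata about the message (for example an id or a content type).
    Map<String, Object> getHeaders() {
        return headers;
    }
}
```

Keeping the message immutable means it can be handed from one component to the next without either side worrying about concurrent modification.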

Messaging in Spring IO

Spring Integration

Meet IoC!

[Diagram; surviving labels: Uses, Written]

The heart of Spring Integration

– Endpoints (Filters) connected through
– Channels (Pipes) exchanging
– Messages
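The pipes-and-filters arrangement above can be sketched with plain Java collections. All names here (`PipesAndFilters`, `Channel`, `filterEndpoint`) are hypothetical; this is a toy model under the assumption of a single-threaded, in-memory channel, not the Spring Integration API.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;

// Hypothetical names throughout; this is not the Spring Integration API.
final class PipesAndFilters {

    // A channel (pipe) is modeled as a simple in-memory FIFO queue.
    static final class Channel<T> {
        private final Queue<T> queue = new ArrayDeque<>();

        void send(T message) {
            queue.add(message);
        }

        // Returns null when the channel is empty.
        T receive() {
            return queue.poll();
        }
    }

    // An endpoint (filter) drains its input channel and forwards
    // only the messages accepted by the predicate.
    static <T> void filterEndpoint(Channel<T> in, Channel<T> out, Predicate<T> accept) {
        for (T m = in.receive(); m != null; m = in.receive()) {
            if (accept.test(m)) {
                out.send(m);
            }
        }
    }
}
```

The endpoint knows only its two channels, not who produced the messages or who will consume them, which is the decoupling the pattern is after.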

Spring Integration Message Flow

How does it work?

What is a Big Data application? Not just writing a few MR or Hive/Pig jobs.

The full lifecycle involves

• Ingestion

• Stream Processing

• Workflow Orchestration

• Enterprise Integration

• Export

• Horizontally scalable deployment

How do you write one of these?

• Status quo is to combine different projects – not ideal

Spring XD

• Unified Platform
• Developer Productivity
• Modular Extensibility
• Distributed Architecture
• Portable Runtime
• Hadoop Distribution Agnostic
• Proven Foundation
• XD = ‘eXtreme Data’

Spring XD

[Diagram: Spring XD Runtime. Labels recovered from the diagram: Ingest, Streams, Taps, Jobs, Compute, Workflow, Export, HDFS, Files, Sensors, Mobile, Social, RDBMS, NoSQL, "R, SAS", Redis, Gemfire, Predictive modeling, Spring XD Shell.]

http | filter | file

Stream Processing Model

How can we make this easier?

Non-linear stream definitions also supported
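To make the `http | filter | file` notation concrete, here is a minimal sketch that splits a linear definition into a source, zero or more processors, and a sink. The class `StreamDefinition` and its `parse` method are hypothetical; the real Spring XD DSL also handles module options, quoting, and non-linear definitions, none of which this sketch attempts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical parser for illustration only; not the Spring XD DSL parser.
final class StreamDefinition {
    final String source;
    final List<String> processors;
    final String sink;

    private StreamDefinition(String source, List<String> processors, String sink) {
        this.source = source;
        this.processors = processors;
        this.sink = sink;
    }

    static StreamDefinition parse(String dsl) {
        // Limit -1 keeps trailing empty segments so "http |" is rejected below.
        String[] parts = dsl.split("\\|", -1);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected at least 'source | sink': " + dsl);
        }
        List<String> names = new ArrayList<>();
        for (String part : parts) {
            String name = part.trim();
            if (name.isEmpty()) {
                throw new IllegalArgumentException("Empty module name in: " + dsl);
            }
            names.add(name);
        }
        // First name is the source, last is the sink, anything between is a processor.
        List<String> middle = new ArrayList<>(names.subList(1, names.size() - 1));
        return new StreamDefinition(names.get(0),
                Collections.unmodifiableList(middle),
                names.get(names.size() - 1));
    }
}
```

For example, `StreamDefinition.parse("http | filter | file")` yields the source `http`, the single processor `filter`, and the sink `file`.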

Runtimes

[Diagram: two ways to run the stream http | filter | file.
Distributed: XD Admin plus clustered nodes, one each for the HTTP, Filter, and File modules; transport: Rabbit, Redis (pluggable).
Single node: all modules in one node; transport: in-memory.]

Streams

Message Store
HTTP, Tail, File, Mail
Twitter, Gemfire, Syslog
TCP, JMS
RabbitMQ, MQTT
Filter, Transformer
Splitter, Aggregator
Groovy Script, Counters
HTTP, JSON
Java Code
File, HDFS, JDBC, TCP, Mail
RabbitMQ, Gemfire, Splunk, MQTT
Dynamic Router

Taps

“Listen” to data from another stream

• Other stream is unaffected by the tap and unaware of its presence

• EAI ‘Wiretap’
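The wiretap idea can be sketched as follows, assuming a hypothetical `TappableStream` class in which taps receive each message before it continues downstream and a failing tap is deliberately ignored. This illustrates the pattern only; it is not how Spring XD implements taps.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical illustration of the EAI wiretap pattern.
final class TappableStream<T> {
    private final Consumer<T> downstream;
    private final List<Consumer<T>> taps = new CopyOnWriteArrayList<>();

    TappableStream(Consumer<T> downstream) {
        this.downstream = downstream;
    }

    // Taps can be attached at any time; the main flow is not reconfigured.
    void addTap(Consumer<T> tap) {
        taps.add(tap);
    }

    void send(T message) {
        for (Consumer<T> tap : taps) {
            try {
                tap.accept(message);
            } catch (RuntimeException e) {
                // A failing tap must not disturb the tapped stream.
            }
        }
        // The original consumer always receives the message.
        downstream.accept(message);
    }
}
```

The downstream consumer is passed in once and never learns whether any taps exist, which mirrors the "unaffected and unaware" property stated above.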

Analytics

• Simple Counter
• Field Value Counter
  – Count occurrences of named fields
• Aggregate Counter
  – Pre-aggregate counts in time buckets
• Gauge
  – Last value
• Rich Gauge
  – Last value, running average, min/max
• Abstract API
• Implementations for
  – Memory
  – Redis
• Looking at
  – Pattern
  – JPMML
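As one illustration of these metrics, a rich gauge that tracks the last value, running average, and min/max could look like the sketch below. The class `RichGauge` and its method names are hypothetical, and the plain arithmetic mean used here is an assumption; this is an in-memory toy, not the Spring XD analytics API.

```java
// Hypothetical in-memory rich gauge; names and averaging choice are assumptions.
final class RichGauge {
    private long count = 0;
    private double last = Double.NaN;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private double average = 0.0;

    void set(double value) {
        count++;
        last = value;
        min = Math.min(min, value);
        max = Math.max(max, value);
        // Incremental mean: avoids storing every value that was ever set.
        average += (value - average) / count;
    }

    double getLast() { return last; }
    double getMin() { return min; }
    double getMax() { return max; }
    double getAverage() { return average; }
    long getCount() { return count; }
}
```

A simple gauge would keep only `last`; the rich variant trades a few extra fields for summary statistics that can be read at any time without replaying the stream.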

Jobs

Jobs are a directed graph of steps

Steps

• Copy or Process data

• Files, Databases, MR, Pig, Hive, Cascading

Step executions are persisted

• Checkpointing with restart

• Rich error handling capabilities

Single node or distributed with data partitioning

Jobs can be triggered from streams

Executing jobs generate a stream of event data
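The checkpoint-and-restart behaviour can be sketched with a simplified job in which the step graph is reduced to a linear sequence and the record of completed steps lives in memory. The class `RestartableJob` is hypothetical and is not the Spring Batch or Spring XD API; a real job repository would persist step executions outside the process.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a linear job whose completed steps are remembered.
final class RestartableJob {
    private final Map<String, Runnable> steps = new LinkedHashMap<>();
    // Stand-in for a persistent store of step executions.
    private final Set<String> completed = new LinkedHashSet<>();

    RestartableJob step(String name, Runnable body) {
        steps.put(name, body);
        return this;
    }

    // Runs steps in order, skipping those already completed.
    // Returns true if every step has finished, false if one failed.
    boolean run() {
        for (Map.Entry<String, Runnable> e : steps.entrySet()) {
            if (completed.contains(e.getKey())) {
                continue; // checkpoint: do not redo finished work
            }
            try {
                e.getValue().run();
            } catch (RuntimeException ex) {
                return false; // stop here; a later run() resumes at this step
            }
            completed.add(e.getKey());
        }
        return true;
    }

    Set<String> completedSteps() {
        return Collections.unmodifiableSet(completed);
    }
}
```

Calling `run()` again after a failure resumes at the failed step rather than starting over, which is the point of persisting step executions.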

Spring XD

Demo

Learn More. Stay Connected.

• Spring IO: https://spring.io/platform
• GitHub: https://github.com/spring-projects
• Spring Integration: http://projects.spring.io/spring-integration
• Spring XD: http://projects.spring.io/spring-xd
• Reactor: https://github.com/reactor
• EIP: http://www.eaipatterns.com
• Spring Batch: http://projects.spring.io/spring-batch
• Spring for Hadoop: http://projects.spring.io/spring-hadoop

Thank you