StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit 2013

Our customers

#StratioBD

StratioDeep

An efficient data mining solution

“Two and two are four? Sometimes… Sometimes they are five.”

G. Orwell

Why we use Cassandra

One user – Lots of data

Case A

Many users – Little data

Case B

Many users – Lots of data

Case C

Why we also need Spark

• In Cassandra, you need to design the schema with the query in mind.

• Every other type of query is either very inefficient or impossible to resolve.
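The point about query-driven schemas can be illustrated with a minimal sketch. It is plain Java, not Cassandra code: `EventsByUser`, `Event`, and the field names are hypothetical, and the in-memory map only mimics how rows are grouped by a partition key chosen at design time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a table partitioned by user id.
// It only mimics the access pattern; it does not talk to Cassandra.
final class EventsByUser {
    static final class Event {
        final String userId;
        final String country;
        final String payload;

        Event(String userId, String country, String payload) {
            this.userId = userId;
            this.country = country;
            this.payload = payload;
        }
    }

    // Rows are grouped by the partition key picked when the schema was designed.
    private final Map<String, List<Event>> partitions = new HashMap<>();

    void insert(Event e) {
        partitions.computeIfAbsent(e.userId, k -> new ArrayList<>()).add(e);
    }

    // The query the schema was designed for: a single partition lookup.
    List<Event> findByUser(String userId) {
        return partitions.getOrDefault(userId, new ArrayList<>());
    }

    // A query the schema was not designed for: every partition is scanned.
    List<Event> findByCountry(String country) {
        List<Event> result = new ArrayList<>();
        for (List<Event> partition : partitions.values()) {
            for (Event e : partition) {
                if (e.country.equals(country)) {
                    result.add(e);
                }
            }
        }
        return result;
    }
}
```

`findByUser` touches one partition, while `findByCountry` has to visit all of them. That full scan is the kind of work a parallel engine such as Spark can spread across a cluster.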

Challenge accepted

StratioDeep features (I)

• Supports CQL3 features

• Use of secondary indexes

• Small codebase (fewer bugs)

StratioDeep features (II)

Provides a Java-friendly API:

• Developers map column families to custom serializable POJOs.

• StratioDeep wraps the complexity of performing Spark calculations directly over the user-provided POJOs.

• SQL-like domain-specific language
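As a rough illustration of mapping a column family to a serializable POJO, the sketch below uses plain Java only. `Tweet`, `TweetMapper`, and the column names are hypothetical; this is not StratioDeep's actual API, which the slides do not show.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical POJO standing in for one row of a "tweets" column family.
// It is Serializable so instances could be shipped between cluster workers.
final class Tweet implements Serializable {
    private static final long serialVersionUID = 1L;

    private String author;
    private String text;
    private int retweets;

    public String getAuthor() { return author; }
    public void setAuthor(String author) { this.author = author; }

    public String getText() { return text; }
    public void setText(String text) { this.text = text; }

    public int getRetweets() { return retweets; }
    public void setRetweets(int retweets) { this.retweets = retweets; }
}

// Hypothetical mapper: turns a generic row (column name -> value) into a Tweet.
final class TweetMapper {
    static Tweet fromRow(Map<String, Object> row) {
        Tweet tweet = new Tweet();
        tweet.setAuthor((String) row.get("author"));
        tweet.setText((String) row.get("text"));
        // A missing counter column is treated as zero in this sketch.
        Object retweets = row.get("retweets");
        tweet.setRetweets(retweets == null ? 0 : (Integer) retweets);
        return tweet;
    }
}
```

Once rows arrive as typed objects like `Tweet`, calculations can be written against getters instead of raw column names and byte buffers.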

Stratio DSL (I)

SQL-like domain-specific language:

• Built on top of Spark’s API.

• SQL + LINQ abstractions.

• A single interface to all Stratio platform modules.
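To give a feel for what "SQL + LINQ abstractions" can look like, here is a minimal LINQ-style sketch over an in-memory list. `FluentQuery` and its methods are hypothetical names made up for this illustration. It is not the Stratio DSL, and it runs on plain Java collections, not on Spark.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical LINQ-style wrapper over an in-memory list of rows.
final class FluentQuery<T> {
    private final List<T> rows;

    private FluentQuery(List<T> rows) {
        this.rows = rows;
    }

    // FROM: start a query over a copy of the given rows.
    static <T> FluentQuery<T> from(List<T> rows) {
        return new FluentQuery<>(new ArrayList<>(rows));
    }

    // WHERE: keep only the rows that satisfy the condition.
    FluentQuery<T> where(Predicate<? super T> condition) {
        List<T> kept = new ArrayList<>();
        for (T row : rows) {
            if (condition.test(row)) {
                kept.add(row);
            }
        }
        return new FluentQuery<>(kept);
    }

    // SELECT: project every row into a new shape.
    <R> FluentQuery<R> select(Function<? super T, ? extends R> projection) {
        List<R> projected = new ArrayList<>();
        for (T row : rows) {
            projected.add(projection.apply(row));
        }
        return new FluentQuery<>(projected);
    }

    // Materialize the result as a new list.
    List<T> toList() {
        return new ArrayList<>(rows);
    }
}
```

For example, `FluentQuery.from(words).where(w -> w.length() > 1).select(String::length).toList()` reads much like `SELECT length(w) FROM words WHERE length(w) > 1`.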

Stratio DSL (II)

Stratio RT extension:

• Built on top of the Spark Streaming API.

Stratio BUS extension:

• Registration of new channels, consumers, and producers.

Cross-module integration with StratioMeta:

• Lets us create flows of data between StratioDeep and StratioRT.

• Materialized views, live queries, alerts, etc.

Use case A

Use case C

Conclusion

THANKS

Luca Rosellini @luca_rosellini

Alvaro Agea @alvaroagea
