StratioDeep: an Integration Layer Between Spark and Cassandra - Spark Summit 2013

StratioDeep: an integration layer between Spark and Cassandra

Our customers

#StratioBD

StratioDeep

An efficient data mining solution

“Two and two are four?

Sometimes… Sometimes they are five.”

G. Orwell

Why we use Cassandra

One User – Lots of data

Case A

Many users – Little data

Case B

Many users – Lots of data

Case C

Why we also need Spark

• In Cassandra, you need to design the schema with the query in mind.
• Every other type of query is either very inefficient or impossible to resolve.
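
To make the point above concrete, here is a minimal Java sketch. It does not talk to Cassandra at all: it models a table as an in-memory map keyed by a hypothetical partition key (`userId`). The query the layout was designed for is a single lookup, while a query on any other field (`country` here) has to visit every row. All class and field names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row type; the field names are assumptions made for this sketch.
final class VisitRow {
    final String userId;   // plays the role of the partition key
    final String country;  // a field the layout was NOT designed to query by
    final int pageViews;

    VisitRow(String userId, String country, int pageViews) {
        this.userId = userId;
        this.country = country;
        this.pageViews = pageViews;
    }
}

// In-memory stand-in for a table whose layout was designed around "find by userId".
final class VisitsByUser {
    private final Map<String, List<VisitRow>> partitions = new HashMap<>();

    void insert(VisitRow row) {
        partitions.computeIfAbsent(row.userId, k -> new ArrayList<>()).add(row);
    }

    // The query the layout was designed for: one direct lookup by key.
    List<VisitRow> findByUser(String userId) {
        return partitions.getOrDefault(userId, new ArrayList<>());
    }

    // Any other query: every partition has to be scanned.
    int totalPageViewsForCountry(String country) {
        int total = 0;
        for (List<VisitRow> partition : partitions.values()) {
            for (VisitRow row : partition) {
                if (row.country.equals(country)) {
                    total += row.pageViews;
                }
            }
        }
        return total;
    }
}
```

The second method is the kind of full scan that a parallel engine such as Spark can spread across a cluster, which is the motivation this slide gives for pairing the two systems.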

Challenge Accepted

StratioDeep features (I)

• Supports CQL3 features
• Use of secondary indexes
• Small codebase (fewer bugs)

StratioDeep features (II)

Provides a Java-friendly API:
• Developers map Column Families to custom serializable POJOs.
• StratioDeep wraps the complexity of performing Spark calculations directly over the user-provided POJOs.
• SQL-like domain-specific language
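
The slides do not show the actual StratioDeep API, so the following is only a sketch of the idea of mapping a column family to a serializable POJO. `TweetEntity`, `TweetRowMapper`, and the column names are hypothetical and are not StratioDeep classes. The round-trip helper uses plain Java serialization to show why the POJO must implement `Serializable`: objects handled by Spark may need to be shipped between JVMs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;

// Hypothetical POJO standing in for one row of a "tweets" column family.
final class TweetEntity implements Serializable {
    private static final long serialVersionUID = 1L;

    private String tweetId;
    private String author;
    private int retweets;

    String getTweetId() { return tweetId; }
    void setTweetId(String tweetId) { this.tweetId = tweetId; }
    String getAuthor() { return author; }
    void setAuthor(String author) { this.author = author; }
    int getRetweets() { return retweets; }
    void setRetweets(int retweets) { this.retweets = retweets; }
}

// Hypothetical helper: copies column values (name -> value) into the POJO.
final class TweetRowMapper {
    // Assumes all three columns are present and have the expected types.
    static TweetEntity fromColumns(Map<String, Object> columns) {
        TweetEntity tweet = new TweetEntity();
        tweet.setTweetId((String) columns.get("tweet_id"));
        tweet.setAuthor((String) columns.get("author"));
        tweet.setRetweets((Integer) columns.get("retweets"));
        return tweet;
    }

    // Serializes and deserializes the object, as would happen when it crosses JVMs.
    static TweetEntity roundTrip(TweetEntity original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TweetEntity) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```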

Stratio DSL (I)

SQL-like domain-specific language:
• Built on top of Spark’s API.
• SQL + LINQ abstractions.
• Single interface to all Stratio platform modules.
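
The slides give no DSL syntax, so the snippet below is not the Stratio DSL. It is only a sketch of what "SQL + LINQ abstractions" usually means in practice: a fluent chain of operators that reads like a SQL statement. Here `java.util.stream` stands in for Spark's API, and `Post` and `LinqStyleQuery` are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical record type used only for this illustration.
final class Post {
    final String author;
    final String lang;
    final int retweets;

    Post(String author, String lang, int retweets) {
        this.author = author;
        this.lang = lang;
        this.retweets = retweets;
    }
}

final class LinqStyleQuery {
    // Roughly: SELECT author, SUM(retweets) FROM posts WHERE lang = ? GROUP BY author
    static Map<String, Integer> retweetsByAuthor(List<Post> posts, String lang) {
        return posts.stream()
                .filter(p -> p.lang.equals(lang))                      // WHERE
                .collect(Collectors.groupingBy(
                        (Post p) -> p.author,                          // GROUP BY
                        TreeMap::new,                                  // sorted result keys
                        Collectors.summingInt((Post p) -> p.retweets)  // SUM
                ));
    }
}
```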

Stratio DSL (II)

Stratio RT extension
• Built on top of the Spark Streaming API.

Stratio BUS extension
• Registration of new channels/consumers/producers

Cross-module integration with StratioMeta
• Lets us create flows of data between StratioDeep and StratioRT
• Materialized views, live queries, alerts, etc.

Conclusion

Use case A
Use case C

THANKS

Luca Rosellini @luca_rosellini

Alvaro Agea @alvaroagea