Infinispan The Future Of Os Data Grids Manik Surtani

MANIK SURTANI | [email protected] | twitter.com/maniksurtani Infinispan The future of open source data grids Manik Surtani [email protected] Principal Software Engineer Red Hat Inc. 1 Wednesday, 30 September 2009

description

In this talk, Infinispan founder and JBoss Cache project lead Manik Surtani introduces the role of data grids in today's cloud-computing environment. The extreme scalability offered by data grids is powering the largest and most high-profile of today's applications. Surtani covers Infinispan, the new open source data grid platform, and its motivations and evolution as a project. Distributed data structures are discussed in depth, including strategies for managing and deploying data grids on cloud services such as Amazon AWS using toolkits such as JClouds.

Transcript of Infinispan The Future Of Os Data Grids Manik Surtani

Page 1: Infinispan The Future Of Os Data Grids   Manik Surtani

Infinispan
The future of open source data grids

Manik Surtani
[email protected]
Principal Software Engineer
Red Hat Inc.

Page 2: Infinispan The Future Of Os Data Grids   Manik Surtani

Who is Manik?
  Founder and project lead, Infinispan
  Project lead, JBoss Cache
  Contributor and committer on various OSS projects
    JBoss AS
    JGroups
    Hibernate
    etc.

Page 3: Infinispan The Future Of Os Data Grids   Manik Surtani

Agenda
  Cloud computing and data grids
    And why YOU should care
  Introducing Infinispan
    And how this relates to JBoss Cache
  The path ahead for Infinispan
    Roadmap
    Feature set

Page 4: Infinispan The Future Of Os Data Grids   Manik Surtani

Clouds are today!
  Clouds are happening
    *aaS: SaaS, PaaS, IaaS
  You cannot escape them!
    Public: Amazon, Google, GoGrid, Rackspace
    Private: Eucalyptus, VMware, IBM, Red Hat
  Traditional datacenters marginalized to niche deployments
    Clouds become mainstream

Page 5: Infinispan The Future Of Os Data Grids   Manik Surtani

Why are clouds popular?
  Piecemeal cost
  Pay for what you use
  Massive, global data centers mean high availability, instant backups
  Everyone benefits from economies of scale
  Ability to scale on demand
  Very fast provisioning
  Proven charging model
  Remember timesharing on mainframes?

Page 6: Infinispan The Future Of Os Data Grids   Manik Surtani

So why now?
  We’re in a perfect storm!
  Bandwidth is cheap and plentiful
  OS virtualization is mature
  ... and we’re in a financial crisis!
    Everyone wants to cut costs, be more efficient!
    Making changes is easier now

Page 7: Infinispan The Future Of Os Data Grids   Manik Surtani

Why should I care?
  The platforms I use will still be relevant:
    Java, Java EE
    Python, Ruby, .NET
    ... whatever!!
  The OS I use will still be relevant
    Linux
    Solaris
    etc.

Page 8: Infinispan The Future Of Os Data Grids   Manik Surtani

Data Storage
  Databases on clouds don't make sense
    Traditional modes of data storage won't work
    Clouds are inherently stateless, ephemeral
  Scalability is crucial
    Databases are still a bottleneck
    ... and a single point of failure!

Page 9: Infinispan The Future Of Os Data Grids   Manik Surtani

Trying to make DBs work in the cloud
E.g. 1: with AWS, store data files on EBS
  EBS is not guaranteed to be durable
  Still needs to be backed up
    Snapshots expose data loss windows
    Locks the volume from being written to
      Mitigated using certain FS, e.g., XFS
  Can only be mounted by one EC2 node at a time
    Single point of failure, bottleneck

Page 10: Infinispan The Future Of Os Data Grids   Manik Surtani

Trying to make DBs work in the cloud
E.g. 2: with AWS, store data files on a mounted S3 bucket
    HUGE latency!
    Web service or REST based comms!
  Native database clustering
    Notoriously slow and non-scalable
    Unreliable
    Expensive! Need special hardware
      e.g., SAN for Oracle RAC

Page 11: Infinispan The Future Of Os Data Grids   Manik Surtani

The solution: Data Grids!
  Data grids are perfect for clouds
    Highly scalable
    No single point of failure
    Works with ephemeral cloud nodes
    Very low latency
  Data grids
    Amazon SimpleDB uses Dynamo
    Infinispan, etc.
    Many other commercial and OSS offerings

Page 12: Infinispan The Future Of Os Data Grids   Manik Surtani

Data Grids - Speed!
  Data grids give you speed!
  Very low latency due to minimal disk lookup
  Memory 2 orders of magnitude faster than disk
  Especially for frequently used data
  Far greater concurrency
  Disk IO is always a concurrency bottleneck
  Memory offers far greater concurrency

Page 13: Infinispan The Future Of Os Data Grids   Manik Surtani

Introducing Infinispan

Page 14: Infinispan The Future Of Os Data Grids   Manik Surtani

Introducing Infinispan
  Highly scalable data grid platform
    100% open source licensed (LGPL)
  Based on some JBoss Cache code
  But mostly all-new!
    JBoss Cache is a clustered caching library
    Infinispan is a data grid platform
    JBoss Cache uses a tree-structured API
    Infinispan is a Map. Like JSR-107’s JCACHE
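
To make the Map comparison concrete, here is a minimal sketch in Java. It assumes only that the cache behaves like a java.util.concurrent.ConcurrentMap. A plain ConcurrentHashMap stands in for the grid, and the class name MapStyleCacheSketch, the helper flatKey and the key format are illustrative assumptions, not Infinispan API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustration only: a local ConcurrentHashMap stands in for a clustered cache.
class MapStyleCacheSketch {

    // Hypothetical factory; a real data grid would hand out a clustered Map instead.
    static ConcurrentMap<String, String> newCache() {
        return new ConcurrentHashMap<>();
    }

    // A tree-structured API addresses data by node path plus attribute name.
    // With a flat Map the same information can be folded into a single key.
    static String flatKey(String nodePath, String attribute) {
        return nodePath + "#" + attribute;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = newCache();
        cache.put(flatKey("/users/42", "name"), "Manik");
        // Atomic conditional operations come for free with the ConcurrentMap contract.
        cache.putIfAbsent(flatKey("/users/42", "role"), "project lead");
        cache.replace(flatKey("/users/42", "role"), "project lead", "founder");
        System.out.println(cache);
    }
}
```

The point of the sketch is the shape of the calls: one key, one value, and the familiar put, putIfAbsent and replace operations, rather than navigating nodes in a tree.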

Page 15: Infinispan The Future Of Os Data Grids   Manik Surtani

Infinispan != JBoss Cache 4
  Internal data container design completely different
  APIs completely different
  Not backward-compatible
    Although a code-level compatibility layer is available

Page 16: Infinispan The Future Of Os Data Grids   Manik Surtani

More scalable than JBoss Cache
  Internal structures more memory efficient
  Data organised in Map-like structures
    As opposed to a tree
  Making better use of CAS
    minimising synchronized blocks, mutexes
  Containers are naturally ordered
    makes eviction much more efficient
  Uses JBoss Marshalling
    smaller payloads + poolable streams = much faster remote calls
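
Two of the points on this slide can be sketched with JDK classes alone. This is not Infinispan's internal code, and ScalabilitySketch, addClamped and lruContainer are hypothetical names. The first method replaces a synchronized block with a compare-and-swap (CAS) retry loop. The second shows why an ordered container makes eviction cheap: the eviction candidate is always the eldest entry, so no scan is needed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class ScalabilitySketch {

    // Lock-free "add delta, but never drop below zero".
    // Instead of taking a mutex, read the value, compute the result, and retry
    // if another thread changed the value in between.
    static long addClamped(AtomicLong counter, long delta) {
        while (true) {
            long current = counter.get();
            long next = Math.max(0, current + delta);
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Access-ordered map: iteration order is least recently used first, so the
    // entry to evict is simply the eldest one.
    // Note: LinkedHashMap is not thread-safe; this only illustrates the ordering idea.
    static <K, V> Map<K, V> lruContainer(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public static void main(String[] args) {
        AtomicLong size = new AtomicLong(1);
        System.out.println(addClamped(size, -5));

        Map<String, Integer> lru = lruContainer(2);
        lru.put("a", 1);
        lru.put("b", 2);
        lru.get("a");     // touch "a" so that "b" becomes the eldest entry
        lru.put("c", 3);  // exceeds the limit, so the eldest entry is evicted
        System.out.println(lru.keySet());
    }
}
```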

Page 17: Infinispan The Future Of Os Data Grids   Manik Surtani

“Borrowed” from JBoss Cache
  JTA transactions
  Replicated data structure
  Eviction, cache persistence
  Notifications and eventing API
  JMX reporting
  Fine-grained replication
  MVCC locking
  Non-blocking state transfer techniques
  Query API
  Custom (non-JDK) marshalling

Page 18: Infinispan The Future Of Os Data Grids   Manik Surtani

... and new features!
  Consistent hash based data distribution
  Much simpler Map API (JSR-107 compliant)
  JPA API
  Client/server module with memcached compatibility
    REST API - REST-* caching spec effort
    Ability to be consumed by non-JVM platforms
  JOPR based GUI management console
  Distributed executors
    Map/reduce programming model made easy!
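
The distributed executor and map/reduce idea can be illustrated on a single JVM. In the sketch below each inner list plays the role of the data held by one grid node, and a thread pool stands in for remote execution. The names MapReduceSketch, countWords and wordCount are hypothetical, and nothing here is the Infinispan executor API (which the roadmap places in a later release).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MapReduceSketch {

    // Map phase: runs "next to the data" and counts words in one node's partition.
    static Map<String, Integer> countWords(List<String> partition) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : partition) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Submits one task per partition, then reduces the partial results by summing.
    static Map<String, Integer> wordCount(List<List<String>> partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<Map<String, Integer>>> partials = new ArrayList<>();
            for (List<String> partition : partitions) {
                Callable<Map<String, Integer>> task = () -> countWords(partition);
                partials.add(pool.submit(task));
            }
            Map<String, Integer> total = new HashMap<>();
            for (Future<Map<String, Integer>> partial : partials) {
                for (Map.Entry<String, Integer> e : partial.get().entrySet()) {
                    total.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("map phase failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("data grid", "grid"),
                Arrays.asList("grid cloud"));
        System.out.println(wordCount(partitions));
    }
}
```

The design choice worth noting is that only the small partial results travel back to the caller; the bulk data stays where it is stored.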

Page 19: Infinispan The Future Of Os Data Grids   Manik Surtani

Data distribution
  Consistent hash based data distribution
    Will allow us to scale to bigger clusters
    Goal of efficient scaling to 1000s of nodes
  Lightweight, “L1” cache for efficient reads
    On writes, “L1” gets invalidated
  Dynamic rebalancing
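
As a rough illustration of consistent hash based distribution, the sketch below places nodes on a hash ring and assigns each key to the next nodes clockwise from the key's hash. It is a generic textbook version under stated assumptions, not Infinispan's implementation: CRC32 is just a convenient stand-in hash, the virtual node count is arbitrary, and ConsistentHashSketch, addNode and owners are hypothetical names. The "L1" cache is not modelled.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.zip.CRC32;

class ConsistentHashSketch {
    // Ring position -> node name. Each node appears at several positions
    // ("virtual nodes") so that keys spread more evenly.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashSketch(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // Stand-in hash function, chosen only because it ships with the JDK.
    static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    // Walks clockwise from the key's position and collects distinct nodes.
    List<String> owners(String key, int numOwners) {
        Set<String> found = new LinkedHashSet<>();
        long h = hash(key);
        collect(ring.tailMap(h, true).values(), found, numOwners);
        collect(ring.headMap(h, false).values(), found, numOwners); // wrap around
        return new ArrayList<>(found);
    }

    private static void collect(Iterable<String> nodes, Set<String> found, int limit) {
        for (String node : nodes) {
            if (found.size() >= limit) {
                return;
            }
            found.add(node);
        }
    }

    public static void main(String[] args) {
        ConsistentHashSketch ch = new ConsistentHashSketch(64);
        ch.addNode("A");
        ch.addNode("B");
        ch.addNode("C");
        System.out.println(ch.owners("user:42", 2));
    }
}
```

The property that matters for rebalancing is that when a node joins, a key either keeps its previous primary owner or moves to the new node; it never shuffles between the existing nodes.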

Page 20: Infinispan The Future Of Os Data Grids   Manik Surtani

JPA API and fine-grained replication
  Successor to POJO Cache
  JPA interface: persist, find, remove...
  Will not rely on AOP, Javassist, etc.
    More robust and easier to use/debug
  Familiar JPA interface
    Easy migration from existing, “traditional” datastores!

Page 21: Infinispan The Future Of Os Data Grids   Manik Surtani

Management
  Uses JOPR
    Simple WAR file
    Rich web-based GUI
    Open Source (LGPL)
  Infinispan exposes all data, operations in JMX
  Infinispan-JOPR plugin represents this graphically in JOPR
  Other plugins can be built for other tools
    HP OpenView, Hyperic, etc.
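
To show what exposing data and operations in JMX looks like in general, here is a small sketch using only the standard javax.management API. It is not how Infinispan registers its components: the class CacheStatsSketch, the attribute names Hits and Misses, the operation resetStatistics and the ObjectName used in main are all made up for illustration. Any JMX-aware tool can read an MBean registered this way, which is what lets management plugins be built for other tools.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.Attribute;
import javax.management.AttributeList;
import javax.management.AttributeNotFoundException;
import javax.management.DynamicMBean;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanOperationInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical statistics component exposed as a read-only dynamic MBean.
class CacheStatsSketch implements DynamicMBean {
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    void recordHit() { hits.incrementAndGet(); }
    void recordMiss() { misses.incrementAndGet(); }

    @Override
    public Object getAttribute(String name) throws AttributeNotFoundException {
        if ("Hits".equals(name)) return hits.get();
        if ("Misses".equals(name)) return misses.get();
        throw new AttributeNotFoundException(name);
    }

    @Override
    public void setAttribute(Attribute attribute) throws AttributeNotFoundException {
        throw new AttributeNotFoundException("read-only: " + attribute.getName());
    }

    @Override
    public AttributeList getAttributes(String[] names) {
        AttributeList list = new AttributeList();
        for (String name : names) {
            try {
                list.add(new Attribute(name, getAttribute(name)));
            } catch (AttributeNotFoundException ignored) {
                // Unknown attributes are simply left out of the result.
            }
        }
        return list;
    }

    @Override
    public AttributeList setAttributes(AttributeList attributes) {
        return new AttributeList(); // nothing is writable
    }

    @Override
    public Object invoke(String operation, Object[] params, String[] signature) {
        if ("resetStatistics".equals(operation)) {
            hits.set(0);
            misses.set(0);
            return null;
        }
        throw new UnsupportedOperationException(operation);
    }

    @Override
    public MBeanInfo getMBeanInfo() {
        MBeanAttributeInfo[] attrs = {
            new MBeanAttributeInfo("Hits", "long", "cache hits", true, false, false),
            new MBeanAttributeInfo("Misses", "long", "cache misses", true, false, false)
        };
        MBeanOperationInfo[] ops = {
            new MBeanOperationInfo("resetStatistics", "zero the counters",
                    null, "void", MBeanOperationInfo.ACTION)
        };
        return new MBeanInfo(getClass().getName(), "Hypothetical cache statistics",
                attrs, null, ops, null);
    }

    // Registers the bean with the JVM's platform MBean server.
    static ObjectName register(CacheStatsSketch stats, String name) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName objectName = new ObjectName(name);
            server.registerMBean(stats, objectName);
            return objectName;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads an attribute back through the MBean server, as a management tool would.
    static Object read(ObjectName name, String attribute) {
        try {
            return ManagementFactory.getPlatformMBeanServer().getAttribute(name, attribute);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CacheStatsSketch stats = new CacheStatsSketch();
        stats.recordHit();
        ObjectName name = register(stats, "sketch.cache:type=Statistics,name=demo");
        System.out.println(read(name, "Hits"));
    }
}
```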

Page 22: Infinispan The Future Of Os Data Grids   Manik Surtani

So why is Infinispan sexy?

Page 23: Infinispan The Future Of Os Data Grids   Manik Surtani

Why is Infinispan sexy?
  Transparent horizontal scalability
    In both directions
  Fast, low latency data access
  Ability to address a very large heap
  Cloud-ready
  Consumable from many different platforms!
  Free and doesn't suck!

Page 24: Infinispan The Future Of Os Data Grids   Manik Surtani

The path ahead

Page 25: Infinispan The Future Of Os Data Grids   Manik Surtani

What happens to JBoss Cache?
  JBoss Cache in maintenance mode
  Currently released as 3.2.0
  Only critical bug fixes
  No new development
And where is Infinispan?
  Currently in Beta
  Expecting a final release very soon
  Very stable already, over 2x as fast as JBC and 4x more memory-efficient

Page 26: Infinispan The Future Of Os Data Grids   Manik Surtani

Roadmap
  Infinispan 4.0.0
    New Map API, Async API
    Distributed cache
    New marshalling code
    Management tooling
    REST API
  Infinispan 4.1.0
    Client/server API, memcached module, language bindings
    Query API

Page 27: Infinispan The Future Of Os Data Grids   Manik Surtani

Roadmap
  Infinispan 5.0.0
    JPA API
    Fine-grained replication
  Infinispan 5.1.0
    Distributed executors, map/reduce model
    Dynamic provisioning
  Infinispan 5.2.0
    Distributed querying based on map/reduce

Page 28: Infinispan The Future Of Os Data Grids   Manik Surtani

To sum it up
  Clouds are becoming mainstream
    Developers need to think about challenges involved
  DBs and clouds pose many challenges
  Data grids offer a good alternative
  Infinispan, a new open source data grid
    Viable cloud data store
    Also helps remove bottlenecks and single points of failure in non-cloud environments!

Page 29: Infinispan The Future Of Os Data Grids   Manik Surtani

How can YOU participate?
  Download and try it out!
  Report bugs. Not just in code, even docs, wikis, etc.
  Suggest new features!
  Test with your own use cases
    We love to hear how people use our stuff!!
  Lend a hand with development
    Open and democratic dev process
    Helps prioritise features you want!
    Several non-Red Hat core committers already!

Page 30: Infinispan The Future Of Os Data Grids   Manik Surtani

QUESTIONS?

Project site: http://www.infinispan.org

Blog: http://blog.infinispan.org

Twitter: http://twitter.com/infinispan #infinispan