Infinispan - Galder Zamarreno - October 2010

[email protected] | twitter.com/galderz | zamarreno.com Monday, October 18, 2010


Infinispan: New Kid on the NoSQL Block

Galder Zamarreño
Senior Engineer, Red Hat
14th October 2010, Lausanne JUG


“There is a need for a viable cloud-ready data store. People need to rethink the way they organize, store and access data.”


Who is Galder?

• R&D engineer (Red Hat Inc):

• Infinispan developer

• JBoss Cache developer

• Contributor and committer:

• JBoss AS, Hibernate, JGroups, JBoss Portal, etc.

• Blog: zamarreno.com

• Twitter: @galderz


Agenda

• Cloud computing and data storage

• And why you should care!

• Data grids and cloud storage

• Introducing Infinispan


Clouds are today!

• Clouds are happening

• *aaS

• You cannot escape them!

• Public: Amazon, Google, Rackspace, ...

• Private: Red Hat, Oracle, VMWare, ...

• Clouds will become mainstream

• Traditional data centers become marginalized


Why are clouds popular?

• Piecemeal costs, perfect utilization

• Pay for what you use, no more!

• Massive economies of scale

• High availability = Implicit backups!

• Very fast provisioning -> Elasticity

• Familiar charging model, controllable costs

• Operational expenditure versus capital expenditure


Why should I care?

• My favorite platform is still relevant

• Java, Java EE

• Python, Ruby, .NET,... whatever!

• My favorite OS is still relevant:

• Linux

• Solaris, etc.


Data Storage

• Databases on clouds:

• not a match made in heaven!

• Traditional modes of data storage won't work

• Clouds are inherently stateless, ephemeral

• Cloud deployments should scale

• ... but databases still are a bottleneck

• … and single point of failure!


RDBMS on clouds: your options

• Non-ephemeral storage

• Restrictive

• Highly specialized hardware

• E.g., a SAN for Oracle RAC, ExaLogic?

• Hardly commodity hardware!

• Native database clustering

• Unreliable, expensive


Another solution: Data Grids!

• Data grids are perfect for clouds

• Highly scalable

• No single point of failure

• Works with ephemeral cloud nodes

• Very low latency


Data Grids and other vendors

• Data grids

• Amazon SimpleDB uses Dynamo

• Google BigTable

• Infinispan

• Many other commercial and OSS offerings


In-Memory Data Grids - Speed!

• Low latency

• minimal disk lookup

• Memory 2 orders of magnitude faster than disk

• especially for frequently used data

• Concurrency, hardware threads

• Disk IO is always a concurrency bottleneck

• Memory offers far greater concurrency


Introducing


Introducing Infinispan

• Scalable data grid platform

• open source - LGPL

• based on some JBoss Cache code ... but mostly all-new

• JBoss Cache...

• ... is a clustered caching library

• ... exposes a tree-structured API

• Infinispan has a Map-like API - (JSR-107 JCACHE)

• ... so, primarily key/value NoSQL
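To make the "Map-like API" point concrete: Infinispan's `Cache` interface extends `java.util.concurrent.ConcurrentMap`, so code written against that JDK interface carries over. The sketch below is only an illustration under one assumption: a plain `ConcurrentHashMap` stands in for a real cache so that it runs without Infinispan on the classpath. The class name `MapApiSketch` and the helper `newCache()` are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class MapApiSketch {
    // Hypothetical stand-in for obtaining a cache; a real deployment would
    // ask a cache manager for a Cache, which is also a ConcurrentMap.
    static ConcurrentMap<String, String> newCache() {
        return new ConcurrentHashMap<>();
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = newCache();
        cache.put("jug", "Lausanne");
        // Atomic conditional operations come from ConcurrentMap itself.
        cache.putIfAbsent("jug", "Geneva");          // no effect: key already present
        cache.replace("jug", "Lausanne", "Zurich");  // succeeds: old value matches
        System.out.println(cache.get("jug"));
        cache.remove("jug");
        System.out.println(cache.containsKey("jug"));
    }
}
```

Because the calling code only sees `ConcurrentMap`, swapping the stand-in for a clustered cache would not change the key/value logic.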


Infinispan != JBoss Cache 4

• New architecture

• Brand new data container design

• Cutting edge algorithms

• New, completely different, APIs

• Not backward-compatible

• Although a code-level compatibility layer is available

• New expectations

• Designed for a far wider scope of purpose


More scalable than JBC

• Internal structures more memory-efficient

• Data organised in Map-like dictionaries

• As opposed to a tree

• Making better use of CAS

• Minimizing synchronized blocks, mutexes

• Highly precise and low overhead data eviction

• Uses JBoss Marshalling

• smaller payloads + poolable streams = faster RPC
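The "better use of CAS" and "minimizing synchronized blocks" bullets refer to a general lock-free pattern: read the current value, compute the new one, and publish it with compare-and-set, retrying if another thread got there first. The sketch below shows that pattern with the JDK's `AtomicLong`; it is not taken from Infinispan's source, and `CasCounter` is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CasCounter {
    private final AtomicLong value = new AtomicLong();

    // Lock-free bounded increment: no synchronized block, no mutex.
    boolean incrementIfBelow(long max) {
        while (true) {
            long current = value.get();
            if (current >= max) {
                return false;
            }
            if (value.compareAndSet(current, current + 1)) {
                return true;
            }
            // Another thread changed the value first; loop and retry.
        }
    }

    long get() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CasCounter counter = new CasCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    counter.incrementIfBelow(2500);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(counter.get());
    }
}
```

Threads never block each other here; a losing thread simply retries, which tends to scale better than a contended lock on multi-core hardware.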


“Borrowed” from JBoss Cache

• JTA transactions

• Replicated data structure

• Fine-grained replication

• Eviction, cache persistence

• Notifications and eventing API

• JMX reporting and Query API

• MVCC locking

• Non-blocking state transfer techniques


… and new features!

• Consistent hash based data distribution

• Much simpler Map API (JSR-107 compliant)

• Ability to be consumed by non-JVM platforms

• Client/server module

• Memcached compatibility

• HotRod - binary protocol supporting “smart clients”

• Javascript access via Websocket server

• REST API


… and new features!

• JOPR based GUI management console

• JPA-like API

• Distributed execution

• Map/reduce made easy!
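As a rough picture of the map/reduce idea mentioned above, the sketch below counts words across several partitions: each partition is mapped independently (as it would be on the node that owns the data), and the partial results are then reduced into one total. It uses only JDK executors as a stand-in for a cluster; `MapReduceSketch` and its method names are hypothetical and do not represent Infinispan's distributed execution API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class MapReduceSketch {
    // Map phase: count words within one partition (one node's local data).
    static Map<String, Integer> mapPartition(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Reduce phase: merge the partial counts from every partition.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, count) -> total.merge(word, count, Integer::sum));
        }
        return total;
    }

    // One thread per partition plays the role of one node per partition.
    static Map<String, Integer> wordCount(List<List<String>> partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<Map<String, Integer>>> futures = new ArrayList<>();
            for (List<String> partition : partitions) {
                futures.add(pool.submit(() -> mapPartition(partition)));
            }
            List<Map<String, Integer>> partials = new ArrayList<>();
            for (Future<Map<String, Integer>> future : futures) {
                partials.add(future.get());
            }
            return reduce(partials);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<String>> partitions = List.of(
                List.of("data grid", "grid"),
                List.of("Grid cloud"));
        System.out.println(wordCount(partitions));
    }
}
```

The point of the model is that the map step moves to where the data lives, so only the small partial results travel over the network.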


Data distribution

• Consistent hash based data distribution

• Locating entries very efficient

• No network calls, no need for metadata

• Will allow us to scale to bigger clusters

• Goal of efficient scaling to 1000s of nodes

• Lightweight, “L1” cache for efficient reads

• On writes, “L1” gets invalidated

• Dynamic rebalancing
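The "no network calls, no need for metadata" claim follows from how consistent hashing works: every node can compute a key's owners locally from the key's hash and the list of cluster members. The sketch below is a generic hash ring, not Infinispan's actual implementation. Its assumptions: CRC32 as a convenient JDK hash, virtual nodes to smooth the distribution, and hypothetical names (`HashRingSketch`, `locate`, `numOwners`).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

final class HashRingSketch {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRingSketch(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // CRC32 is an assumption for the sketch, chosen because it ships with the JDK.
    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Each node occupies several positions on the ring.
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    // Walk clockwise from the key's position, collecting distinct owners.
    List<String> locate(String key, int numOwners) {
        List<String> owners = new ArrayList<>();
        long h = hash(key);
        collect(ring.tailMap(h, true), owners, numOwners);   // from the key onwards
        collect(ring.headMap(h, false), owners, numOwners);  // wrap around the ring
        return owners;
    }

    private static void collect(SortedMap<Long, String> segment, List<String> owners, int numOwners) {
        for (String node : segment.values()) {
            if (owners.size() >= numOwners) {
                return;
            }
            if (!owners.contains(node)) {
                owners.add(node);
            }
        }
    }

    public static void main(String[] args) {
        HashRingSketch ring = new HashRingSketch(100);
        ring.addNode("node-a");
        ring.addNode("node-b");
        ring.addNode("node-c");
        System.out.println(ring.locate("user:42", 2));
    }
}
```

When a node leaves, only the keys it owned move to a neighbour; keys owned by other nodes stay where they are, which is what keeps rebalancing cheap as the cluster grows.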


JPA-like API, fine-grained replication

• Successor to POJO Cache

• JPA-like interface: persist, find, remove...

• Will not rely on AOP, javassist, etc

• More robust and easier to use/debug

• Familiar JPA-like interface

• Easy migration from existing, “traditional” data stores!
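To give a feel for what a "persist, find, remove" interface over a key/value store might look like, here is a purely hypothetical sketch. The slide describes a planned feature, so none of the names below (`EntityStoreSketch`, its method signatures, the key format) come from Infinispan; a `ConcurrentHashMap` stands in for the cache, and no AOP or bytecode manipulation is involved.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class EntityStoreSketch {
    // Stand-in for the underlying cache (assumption for the sketch).
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    // Hypothetical key scheme: entity type plus identifier.
    private static String key(Class<?> type, Object id) {
        return type.getName() + "#" + id;
    }

    void persist(Object id, Object entity) {
        cache.put(key(entity.getClass(), id), entity);
    }

    // Returns null when no entity of that type is stored under the id.
    <T> T find(Class<T> type, Object id) {
        return type.cast(cache.get(key(type, id)));
    }

    boolean remove(Class<?> type, Object id) {
        return cache.remove(key(type, id)) != null;
    }

    public static void main(String[] args) {
        EntityStoreSketch store = new EntityStoreSketch();
        store.persist(1, "Galder");
        System.out.println(store.find(String.class, 1));
        store.remove(String.class, 1);
        System.out.println(store.find(String.class, 1));
    }
}
```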


Management

• Uses JOPR, a rich web-based GUI

• Simple WAR file

• Open Source (LGPL)

• Infinispan exposes data, operations in JMX

• Infinispan-JOPR plugin represents this graphically

• Other plugins can be built for other tools

• HP OpenView, Hyperic, etc.
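Exposing "data and operations in JMX" generally means registering an MBean whose getters become attributes and whose other methods become operations, which any JMX-aware tool can then read. The sketch below shows that mechanism with the JDK's platform MBean server. The bean, its attribute, and the object name `sketch.cache:type=Statistics` are hypothetical and are not the names Infinispan registers.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxSketch {
    // Standard MBean convention: interface name = implementation name + "MBean".
    public interface CacheStatsMBean {
        long getHits();          // exposed as the read-only attribute "Hits"
        void resetStatistics();  // exposed as an operation
    }

    public static final class CacheStats implements CacheStatsMBean {
        private long hits;

        public synchronized void recordHit() { hits++; }
        @Override public synchronized long getHits() { return hits; }
        @Override public synchronized void resetStatistics() { hits = 0; }
    }

    static ObjectName register(CacheStats stats, String cacheName) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("sketch.cache:type=Statistics,name=" + cacheName);
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(stats, name);
            return name;
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the attribute the way a management tool would, through the MBean server.
    static long readHits(ObjectName name) {
        try {
            return (Long) ManagementFactory.getPlatformMBeanServer().getAttribute(name, "Hits");
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CacheStats stats = new CacheStats();
        stats.recordHit();
        ObjectName name = register(stats, "demo");
        System.out.println(name + " Hits=" + readHits(name));
    }
}
```

A console plugin only needs the object name and attribute names, which is why the same JMX data can feed different management tools.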


So why is Infinispan sexy?


Why is Infinispan sexy?

• Transparent horizontal scalability

• Elastic in both directions

• Fast, low latency data access

• Ability to address a very large heap

• Cloud-ready datastore

• Not just for Java

• Free and doesn't suck!


The path ahead


Roadmap

• Infinispan 4.0.0 Starobrno (Released Feb 2010)

• New Map API

• Async API

• Distributed cache mode

• Management tooling

• REST API

• Hibernate 2nd level cache


Roadmap

• Infinispan 4.1.0 Radegast (Released August 2010)

• Client/server

• Memcached protocol

• Hot Rod protocol

• Smart clients using HotRod

• Websocket server

• Lucene Directory

• LIRS adaptive, recency-based eviction
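For readers unfamiliar with eviction, the sketch below shows the simplest recency-based policy, plain LRU, using the JDK's `LinkedHashMap` in access order. This is deliberately not LIRS: LIRS additionally tracks how recently an entry was reused so that one-off scans do not push out hot entries, and it is too long for a slide-sized example. `LruCacheSketch` and `maxEntries` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCacheSketch(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, Integer> cache = new LruCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // touch "a" so "b" becomes the eldest
        cache.put("c", 3);   // exceeds the limit, evicts "b"
        System.out.println(cache.keySet());
    }
}
```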


Roadmap

• Infinispan 4.2.0 Ursus

• Collocated nodes in DIST

• Cassandra based cache store

• Infinispan 5.0.0 Pagoa

• JPA-like API + fine-grained replication

• Distributed executors

• Map/reduce programming model


To sum it up

• Clouds are becoming mainstream

• Need to think about challenges

• DBs and clouds pose many challenges

• Data grids offer a good alternative

• Infinispan, a new open source data grid

• Viable cloud data store but not just for clouds

• removes bottlenecks, single points of failure in non-cloud architectures too


How can YOU participate?

• Download and try it out!

• Report bugs in code, even docs, wikis, etc.

• Suggest new features!

• Test with your own use cases and tell us how you use it!!

• Lend a hand with development

• Open and democratic dev process

• Helps prioritize features you want!

• Several non-Red Hat core committers already!
