Infinispan - Galder Zamarreno - October 2010
[email protected] | twitter.com/galderz | zamarreno.com
Monday, October 18, 2010
Infinispan: New Kid on the NoSQL Block
Galder Zamarreño, Senior Engineer, Red Hat
14th October 2010, Lausanne JUG
“There is a need for a viable cloud-ready data store. People need to rethink the way they organize, store and access data.”
Who is Galder?
• R&D engineer (Red Hat Inc):
• Infinispan developer
• JBoss Cache developer
• Contributor and committer:
• JBoss AS, Hibernate, JGroups, JBoss Portal, etc.
• Blog: zamarreno.com
• Twitter: @galderz
Agenda
• Cloud computing and data storage
• And why you should care!
• Data grids and cloud storage
• Introducing Infinispan
Clouds are today!
• Clouds are happening
• *aaS
• You cannot escape them!
• Public: Amazon, Google, Rackspace, ...
• Private: Red Hat, Oracle, VMware, ...
• Clouds will become mainstream
• Traditional data centers become marginalized
Why are clouds popular?
• Piecemeal costs, perfect utilization
• Pay for what you use, no more!
• Massive economies of scale
• High availability = Implicit backups!
• Very fast provisioning -> Elasticity
• Familiar charging model, controllable costs
• Operational expenditure versus capital expenditure
Why should I care?
• My favorite platform is still relevant
• Java, Java EE
• Python, Ruby, .NET,... whatever!
• My favorite OS is still relevant:
• Linux
• Solaris, etc.
Data Storage
• Databases on clouds:
• not a match made in heaven!
• Traditional modes of data storage won't work
• Clouds are inherently stateless, ephemeral
• Cloud deployments should scale
• ... but databases are still a bottleneck
• ... and a single point of failure!
RDBMS on clouds: your options
• Non-ephemeral storage
• Restrictive
• Highly specialized hardware
• E.g., a SAN for Oracle RAC, ExaLogic?
• Hardly commodity hardware!
• Native database clustering
• Unreliable, expensive
Another solution: Data Grids!
• Data grids are perfect for clouds
• Highly scalable
• No single point of failure
• Works with ephemeral cloud nodes
• Very low latency
Data Grids and other vendors
• Data grids
• Amazon SimpleDB uses Dynamo
• Google BigTable
• Infinispan
• Many other commercial and OSS offerings
In-Memory Data Grids - Speed!
• Low latency
• minimal disk lookup
• Memory is about 2 orders of magnitude faster than disk
• especially for frequently used data
• Concurrency, hardware threads
• Disk IO is always a concurrency bottleneck
• Memory offers far greater concurrency
Introducing Infinispan
• Scalable data grid platform
• open source - LGPL
• based on some JBoss Cache code ... but mostly all-new
• JBoss Cache...
• ... is a clustered caching library
• ... exposes a tree-structured API
• Infinispan has a Map-like API - (JSR-107 JCACHE)
• ... so, primarily key/value NoSQL
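The "Map-like API" bullet can be pictured with the JDK's own `ConcurrentMap`, which Infinispan's `Cache` interface extends. The sketch below is only an illustration under that assumption: a plain `ConcurrentHashMap` stands in for a cache, nothing is clustered, and the class name `MapApiSketch` is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo class: a local ConcurrentHashMap stands in for a cache.
class MapApiSketch {

    // Runs a few key/value operations and returns the resulting map.
    static ConcurrentMap<String, String> demo() {
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        cache.put("speaker", "Galder");                // plain write
        cache.putIfAbsent("speaker", "someone else");  // ignored, key already present
        cache.putIfAbsent("venue", "Lausanne JUG");    // inserted, key was absent
        // Conditional update: only applies if the current value matches.
        cache.replace("venue", "Lausanne JUG", "JUG Lausanne");
        cache.remove("missing");                       // removing an absent key is harmless
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the operations are the standard `ConcurrentMap` ones, code written against a local map like this needs little change to target a key/value cache with the same interface.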
Infinispan != JBoss Cache 4
• New architecture
• Brand new data container design
• Cutting edge algorithms
• New, completely different, APIs
• Not backward-compatible
• Although a code-level compatibility layer is available
• New expectations
• Designed for a far wider scope of purpose
More scalable than JBC
• Internal structures more memory-efficient
• Data organised in Map-like dictionaries
• As opposed to a tree
• Making better use of CAS
• Minimizing synchronized blocks, mutexes
• Highly precise and low overhead data eviction
• Uses JBoss Marshalling
• smaller payloads + poolable streams = faster RPC
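The point about CAS versus synchronized blocks can be shown with a small lock-free counter. This is a generic sketch of the compare-and-set retry pattern using the JDK's `AtomicLong`, not Infinispan's internal code; the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example of a compare-and-set (CAS) retry loop.
class CasCounterSketch {
    private final AtomicLong hits = new AtomicLong();

    // Lock-free increment: read, compute, then publish only if nobody
    // else changed the value in between; otherwise retry.
    long increment() {
        while (true) {
            long current = hits.get();
            long next = current + 1;
            if (hits.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    long value() {
        return hits.get();
    }

    // Increments one shared counter from several threads and returns the total.
    static long runThreads(int threads, int perThread) {
        CasCounterSketch counter = new CasCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10_000));
    }
}
```

No thread ever blocks holding a mutex here; a thread that loses the race simply retries, which is why CAS-based structures tend to behave better under contention.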
“Borrowed” from JBoss Cache
• JTA transactions
• Replicated data structure
• Fine-grained replication
• Eviction, cache persistence
• Notifications and eventing API
• JMX reporting and Query API
• MVCC locking
• Non-blocking state transfer techniques
… and new features!
• Consistent hash based data distribution
• Much simpler Map API (JSR-107 compliant)
• Ability to be consumed by non-JVM platforms
• Client/server module
• Memcached compatibility
• Hot Rod - binary protocol supporting “smart clients”
• JavaScript access via WebSocket server
• REST API
… and new features!
• JOPR based GUI management console
• JPA-like API
• Distributed execution
• Map/reduce made easy!
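To give a feel for the map/reduce bullet, here is a minimal word-count sketch. It assumes a local thread pool standing in for cluster nodes and in-memory lists standing in for each node's data; it does not use Infinispan's distributed execution API, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: each partition plays the role of one node's local data.
class MapReduceSketch {

    // Map phase: count the words held in a single partition.
    static Map<String, Integer> mapPartition(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Runs the map phase in parallel, then reduces the partial counts.
    static Map<String, Integer> wordCount(List<List<String>> partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<Map<String, Integer>>> partials = new ArrayList<>();
            for (List<String> partition : partitions) {
                Callable<Map<String, Integer>> task = () -> mapPartition(partition);
                partials.add(pool.submit(task));
            }
            // Reduce phase: merge per-partition results into one map.
            Map<String, Integer> total = new HashMap<>();
            for (Future<Map<String, Integer>> partial : partials) {
                partial.get().forEach((word, n) -> total.merge(word, n, Integer::sum));
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList(
                Arrays.asList("data grid", "data"),
                Arrays.asList("grid cloud"))));
    }
}
```

The idea carried over to a data grid is that the map task travels to where the data lives, and only the small partial results cross the network for the reduce step.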
Data distribution
• Consistent hash based data distribution
• Locating entries is very efficient
• No network calls, no need for metadata
• Will allow us to scale to bigger clusters
• Goal of efficient scaling to 1000s of nodes
• Lightweight, “L1” cache for efficient reads
• On writes, “L1” gets invalidated
• Dynamic rebalancing
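Why locating an entry needs "no network calls, no need for metadata" can be seen in a textbook consistent-hash ring: any node that knows the cluster membership can compute the owners of a key locally. The sketch below is a generic ring with virtual nodes, not Infinispan's actual hashing algorithm; the class name, the MD5-based hash and the `numOwners` parameter are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical textbook hash ring; not Infinispan's implementation.
class ConsistentHashSketch {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashSketch(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    // Each node is placed on the ring at several pseudo-random positions.
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i));
        }
    }

    // Walks clockwise from the key's position, collecting distinct owners.
    List<String> locate(String key, int numOwners) {
        Set<String> owners = new LinkedHashSet<>();
        long h = hash(key);
        for (String node : ring.tailMap(h, true).values()) {
            if (owners.size() == numOwners) break;
            owners.add(node);
        }
        for (String node : ring.headMap(h, false).values()) {  // wrap around
            if (owners.size() == numOwners) break;
            owners.add(node);
        }
        return new ArrayList<>(owners);
    }

    // First 8 bytes of an MD5 digest, used as a position on the ring.
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xff);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A property worth noting is that removing a node that does not own a key leaves that key's owners untouched, which is what keeps rebalancing limited to a fraction of the data when the cluster grows or shrinks.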
JPA-like API, fine-grained replication
• Successor to POJO Cache
• JPA-like interface: persist, find, remove...
• Will not rely on AOP, javassist, etc
• More robust and easier to use/debug
• Familiar JPA-like interface
• Easy migration from existing, “traditional” data stores!
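As a rough picture of what a "JPA-like interface: persist, find, remove" layered over a key/value store could look like, the sketch below wraps a plain `ConcurrentMap`. The slides do not show the planned API, so every name and signature here (`EntityStoreSketch`, `persist(id, entity)`, the key format) is a hypothetical assumption, and no fine-grained replication is modelled.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical facade; not the API that Infinispan planned or shipped.
class EntityStoreSketch {
    // A local map stands in for the underlying cache.
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

    // Simple example entity.
    static final class Person {
        final String name;

        Person(String name) {
            this.name = name;
        }
    }

    // Stores the entity under a key derived from its type and id.
    void persist(Object id, Object entity) {
        cache.put(key(entity.getClass(), id), entity);
    }

    // Returns the entity, or null when nothing is stored under that id.
    <T> T find(Class<T> type, Object id) {
        return type.cast(cache.get(key(type, id)));
    }

    void remove(Class<?> type, Object id) {
        cache.remove(key(type, id));
    }

    private static String key(Class<?> type, Object id) {
        return type.getName() + ":" + id;
    }

    public static void main(String[] args) {
        EntityStoreSketch store = new EntityStoreSketch();
        store.persist(1, new Person("Galder"));
        System.out.println(store.find(Person.class, 1).name);
    }
}
```

The appeal named on the slide is familiarity: callers work with typed entities and ids rather than raw keys and values, without bytecode weaving.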
Management
• Uses JOPR, a rich web-based GUI
• Simple WAR file
• Open Source (LGPL)
• Infinispan exposes data, operations in JMX
• Infinispan-JOPR plugin represents this graphically
• Other plugins can be built for other tools
• HP OpenView, Hyperic, etc.
So why is Infinispan sexy?
Why is Infinispan sexy?
• Transparent horizontal scalability
• Elastic in both directions
• Fast, low latency data access
• Ability to address a very large heap
• Cloud-ready datastore
• Not just for Java
• Free and doesn't suck!
Roadmap
• Infinispan 4.0.0 Starobrno (Released Feb 2010)
• New Map API
• Async API
• Distributed cache mode
• Management tooling
• REST API
• Hibernate 2nd level cache
Roadmap
• Infinispan 4.1.0 Radegast (Released August 2010)
• Client/server
• Memcached protocol
• Hot Rod protocol
• Smart clients using Hot Rod
• WebSocket server
• Lucene Directory
• LIRS adaptive, recency-based eviction
Roadmap
• Infinispan 4.2.0 Ursus
• Collocated nodes in DIST
• Cassandra based cache store
• Infinispan 5.0.0 Pagoa
• JPA-like API + fine-grained replication
• Distributed executors
• Map/reduce programming model
To sum it up
• Clouds are becoming mainstream
• Need to think about challenges
• DBs and clouds pose many challenges
• Data grids offer a good alternative
• Infinispan, a new open source data grid
• Viable cloud data store but not just for clouds
• Removes bottlenecks and single points of failure in non-cloud architectures too
How can YOU participate?
• Download and try it out!
• Report bugs in code, even docs, wikis, etc.
• Suggest new features!
• Test with your own use cases and tell us how you use it!!
• Lend a hand with development
• Open and democratic dev process
• Helps prioritize features you want!
• Several non-Red Hat core committers already!
Questions?
infinispan.org
blog.infinispan.org
twitter.com/infinispan
#infinispan