Infinispan, a distributed in-memory key/value data grid and cache


Transcript of Infinispan, a distributed in-memory key/value data grid and cache

Page 1: Infinispan, a distributed in-memory key/value data grid and cache

Infinispan

Distributed in-memory key/value data grid and cache

@infinispan

Page 2: Infinispan, a distributed in-memory key/value data grid and cache

Agenda

• Introduction

• Part 1

• Hash Tables

• Distributed Hash Tables

• Consistent Hashing

• Chord Lookup Protocol

• Part 2

• Data Grids

• Infinispan

• Architecture

• Consistent Hashing / Split Clusters

• Other features

Page 3: Infinispan, a distributed in-memory key/value data grid and cache

Part I – A (very) short introduction to distributed hash tables

Page 4: Infinispan, a distributed in-memory key/value data grid and cache

Hash Tables

Source: Wikipedia http://commons.wikimedia.org/wiki/File:Hash_table_5_0_1_1_1_1_1_LL.svg#/media/File:Hash_table_5_0_1_1_1_1_1_LL.svg

Page 5: Infinispan, a distributed in-memory key/value data grid and cache

Distributed Hash Tables (DHT)

Source: Wikipedia - http://commons.wikimedia.org/wiki/File:DHT_en.svg#/media/File:DHT_en.svg

Page 6: Infinispan, a distributed in-memory key/value data grid and cache

Distributed Hash Tables (DHT)

• Decentralized Hash Table functionality

• Interface

• put(K,V)

• get(K) -> V

• Nodes can fail, join and leave

• The system has to scale

Page 7: Infinispan, a distributed in-memory key/value data grid and cache

Simple solutions

• Flooding in N nodes

• put() – store in any node O(1)

• get() – send query to all nodes O(N)

• Full replication in N nodes

• put() – store in all nodes O(N)

• get() – check any node O(1)
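The trade-off between the two simple solutions can be sketched in a few lines of Java. This is only an illustration: the class name SimpleDhtDemo, the use of in-memory maps as "nodes", and the messages counter standing in for network cost are all assumptions, not part of the talk.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo: each "node" is an in-memory map, and every node contact
// is counted as one message.
class SimpleDhtDemo {
    final List<Map<String, String>> nodes = new ArrayList<>();
    int messages = 0;

    SimpleDhtDemo(int n) {
        for (int i = 0; i < n; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // Flooding: store in any one node (node 0 here), so put costs O(1).
    void floodPut(String key, String value) {
        messages++;
        nodes.get(0).put(key, value);
    }

    // Flooding: the reader does not know where the key lives, so it asks all N nodes.
    String floodGet(String key) {
        String found = null;
        for (Map<String, String> node : nodes) {
            messages++;
            if (node.containsKey(key)) {
                found = node.get(key);
            }
        }
        return found;
    }

    // Full replication: store in all N nodes, so put costs O(N).
    void replicatedPut(String key, String value) {
        for (Map<String, String> node : nodes) {
            messages++;
            node.put(key, value);
        }
    }

    // Full replication: any single node can answer, so get costs O(1).
    String replicatedGet(String key) {
        messages++;
        return nodes.get(nodes.size() - 1).get(key);
    }

    public static void main(String[] args) {
        SimpleDhtDemo flood = new SimpleDhtDemo(4);
        flood.floodPut("a", "1");
        flood.floodGet("a");
        System.out.println("flooding contacts: " + flood.messages);

        SimpleDhtDemo repl = new SimpleDhtDemo(4);
        repl.replicatedPut("a", "1");
        repl.replicatedGet("a");
        System.out.println("replication contacts: " + repl.messages);
    }
}
```

Neither variant scales: flooding makes every read touch the whole cluster, and full replication makes every write do so while also storing N copies of each entry.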

Page 8: Infinispan, a distributed in-memory key/value data grid and cache

Fixed Hashing

NodeID = hash(key) % TotalNodes.

Page 9: Infinispan, a distributed in-memory key/value data grid and cache

Fixed Hashing with High Availability

NodeID = hash(key) % TotalNodes.

Page 10: Infinispan, a distributed in-memory key/value data grid and cache

Fixed Hashing and Scalability

NodeID = hash(key) % (TotalNodes + 1).

Page 11: Infinispan, a distributed in-memory key/value data grid and cache

2 Nodes, Key Space = {0,1,2,3,4,5}

NodeID = hash(key) % 2

N0 (key mod 2 = 0): 0, 2, 4
N1 (key mod 2 = 1): 1, 3, 5

After adding a third node: NodeID = hash(key) % 3

N0 (key mod 3 = 0): 0, 3
N1 (key mod 3 = 1): 1, 4
N2 (key mod 3 = 2): 2, 5
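The cost of growing a fixed-hashing cluster can be checked with a short Java sketch. It assumes, as the slide does, that keys are small integers and that hash(key) is the key itself; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class FixedHashingDemo {
    // Fixed hashing with hash(key) = key, matching the slide's example.
    static int nodeFor(int key, int totalNodes) {
        return key % totalNodes;
    }

    // Returns the keys whose owning node changes when the cluster is resized.
    static List<Integer> movedKeys(int keySpaceSize, int oldNodes, int newNodes) {
        List<Integer> moved = new ArrayList<>();
        for (int key = 0; key < keySpaceSize; key++) {
            if (nodeFor(key, oldNodes) != nodeFor(key, newNodes)) {
                moved.add(key);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Prints: Keys that change node: [2, 3, 4, 5]
        System.out.println("Keys that change node: " + movedKeys(6, 2, 3));
    }
}
```

Four of the six keys move when a single node is added, which is the problem consistent hashing is meant to reduce.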

Page 12: Infinispan, a distributed in-memory key/value data grid and cache

Consistent Hashing

Page 13: Infinispan, a distributed in-memory key/value data grid and cache

Consistent Hashing – The Hash Ring

[Figure: hash ring with origin 0, nodes N0, N1, N2 and keys K1 to K6 placed around the ring]

Page 14: Infinispan, a distributed in-memory key/value data grid and cache

Consistent Hashing – Nodes Joining, Leaving

Source: http://www.griddynamics.com/distributed-algorithms-in-nosql-databases/
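A hash ring lookup can be sketched with a sorted map. This is a minimal illustration, not the implementation used by any particular product: the class name HashRingDemo, the ring positions, and the convention that a key belongs to the first node at or after its position are assumptions made for the example.

```java
import java.util.Map;
import java.util.TreeMap;

class HashRingDemo {
    // Ring position -> node name, kept sorted by position.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(String name, int position) {
        ring.put(position, name);
    }

    void removeNode(int position) {
        ring.remove(position);
    }

    // Owner is the first node at or after the key's position, wrapping to the
    // lowest position when the key lies past the last node.
    // Assumes the ring contains at least one node.
    String ownerOf(int keyPosition) {
        Map.Entry<Integer, String> entry = ring.ceilingEntry(keyPosition);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        HashRingDemo ring = new HashRingDemo();
        ring.addNode("N0", 10);
        ring.addNode("N1", 40);
        ring.addNode("N2", 70);
        int[] keys = {5, 20, 35, 50, 65, 90};
        for (int key : keys) {
            System.out.println("key@" + key + " -> " + ring.ownerOf(key));
        }
        // A node joining at position 55 only takes over keys in (40, 55].
        ring.addNode("N3", 55);
        for (int key : keys) {
            System.out.println("after join: key@" + key + " -> " + ring.ownerOf(key));
        }
    }
}
```

With these positions, only the key at 50 changes owner when N3 joins; all other keys stay where they were.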

Page 15: Infinispan, a distributed in-memory key/value data grid and cache

Chord: Peer-to-peer Lookup Protocol

• Load Balance – distributed hash function, spreading keys evenly over nodes

• Decentralization – fully distributed, no single point of failure (SPOF)

• Scalability – logarithmic growth of lookup cost with the number of nodes, large systems are feasible

• Availability – automatically adjusts its internal tables to ensure the node responsible for a key is always found

• Flexible naming – key-space is flat (flexibility in how to map names to keys)

Page 16: Infinispan, a distributed in-memory key/value data grid and cache

Chord – Lookup O(N)

Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan

Page 17: Infinispan, a distributed in-memory key/value data grid and cache

Chord – Lookup O(logN)

Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan

• K = 6, so identifiers lie in $[0, 2^6 - 1]$

• Finger[i] = first node that succeeds $(N + 2^{i-1}) \bmod 2^K$, where $1 \le i \le K$

• Successor/Predecessor – the next/previous node on circle
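The finger table definition can be tried out with a small Java sketch. The ring membership below (nodes 1, 8, 14, 21, 32, 38, 42, 48, 51, 56 with K = 6) is taken from the example ring in the Chord paper and is used here only as illustrative input; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.TreeSet;

class ChordFingerDemo {
    // First node whose identifier equals or follows id on the ring, wrapping around.
    static int successor(TreeSet<Integer> nodes, int id) {
        Integer s = nodes.ceiling(id);
        return s != null ? s : nodes.first();
    }

    // finger[i-1] = successor((n + 2^(i-1)) mod 2^k) for 1 <= i <= k.
    static int[] fingerTable(TreeSet<Integer> nodes, int n, int k) {
        int ringSize = 1 << k;
        int[] finger = new int[k];
        for (int i = 1; i <= k; i++) {
            int start = (n + (1 << (i - 1))) % ringSize;
            finger[i - 1] = successor(nodes, start);
        }
        return finger;
    }

    public static void main(String[] args) {
        TreeSet<Integer> nodes =
            new TreeSet<>(Arrays.asList(1, 8, 14, 21, 32, 38, 42, 48, 51, 56));
        // Prints: [14, 14, 14, 21, 32, 42]
        System.out.println(Arrays.toString(fingerTable(nodes, 8, 6)));
    }
}
```

Because each finger roughly doubles the distance covered, a lookup can halve the remaining distance to the target at every hop, which is where the O(log N) bound comes from.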

Page 18: Infinispan, a distributed in-memory key/value data grid and cache

Chord – Node Join

Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications. Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan

• Node 26 joins the system between nodes 21 and 32.

• (a) Initial state: node 21 points to node 32;

• (b) node 26 finds its successor (i.e., node 32) and points to it;

• (c) node 26 copies all keys less than 26 from node 32;

• (d) the stabilize procedure updates the successor of node 21 to node 26.
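Step (c) of the join can be sketched as a key transfer between two sorted maps. This is a simplification under stated assumptions: the class name ChordJoinDemo and the sample keys 24 and 30 are hypothetical, the identifier interval between predecessor and new node does not wrap around zero, and a key whose identifier equals the new node's identifier is treated as belonging to the new node.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class ChordJoinDemo {
    // Transfers every key with identifier <= newNodeId from the successor's store
    // to the joining node and returns the joining node's store.
    static TreeMap<Integer, String> join(int newNodeId, TreeMap<Integer, String> successorStore) {
        NavigableMap<Integer, String> toMove = successorStore.headMap(newNodeId, true);
        TreeMap<Integer, String> newStore = new TreeMap<>(toMove);
        toMove.clear(); // headMap is a live view, so this removes the keys from the successor
        return newStore;
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> node32 = new TreeMap<>();
        node32.put(24, "value-24");
        node32.put(30, "value-30");

        TreeMap<Integer, String> node26 = join(26, node32);
        System.out.println("node 26 holds " + node26.keySet());
        System.out.println("node 32 holds " + node32.keySet());
    }
}
```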

Page 19: Infinispan, a distributed in-memory key/value data grid and cache

The world of DHTs …

• CAN (Hypercube), Chord (Ring), Pastry (Tree+Ring), Tapestry (Tree+Ring), Viceroy, Kademlia, Skipnet, Symphony (Ring), Koorde, Apocrypha, Land, Bamboo, ORDI …

Page 20: Infinispan, a distributed in-memory key/value data grid and cache

Part II – A short introduction to Infinispan

Page 21: Infinispan, a distributed in-memory key/value data grid and cache

Where do we store data? One size does not fit all...

Page 22: Infinispan, a distributed in-memory key/value data grid and cache
Page 23: Infinispan, a distributed in-memory key/value data grid and cache

Infinispan – History

• 2002 – JBoss App Server needed a clustered solution for HTTP and EJB session state replication for HA clusters. JGroups (open source group communication suite) had a replicated map demo, which was expanded to a tree data structure, with eviction and JTA transactions added.

• 2003 – this was moved to JBoss AS code base

• 2005 – JBoss Cache was extracted and became a standalone project

… JBoss Cache evolved into Infinispan, core parts redesigned

• 2009 – JBoss Cache 3.2 and Infinispan 4.0.0.ALPHA1 were released

• 2015 – Infinispan 7.2.0.Alpha1

• Check the Infinispan RoadMap for more details

Page 24: Infinispan, a distributed in-memory key/value data grid and cache

Code?

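<dependency>
    <groupId>org.infinispan</groupId>
    <artifactId>infinispan-embedded</artifactId>
    <version>7.1.0.Final</version>
</dependency>

import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.manager.EmbeddedCacheManager;

// Start a cache manager with default settings and use its default cache
EmbeddedCacheManager cacheManager = new DefaultCacheManager();
Cache<String, String> cache = cacheManager.getCache();
cache.put("Hello", "World!");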

Page 25: Infinispan, a distributed in-memory key/value data grid and cache

Usage Modes

• Embedded / library mode

• clustering for apps and frameworks (e.g. JBoss session replication)

• Local mode single cache

• JSR 107: JCACHE - Java Temporary Caching API

• Transactional local cache

• Eviction, expiration, write through, write behind, preloading, notifications, statistics

• Cluster of caches

• Invalidation, Hibernate 2nd level cache

• Server mode – remote data store

• REST, MemCached, HotRod, WebSocket (*)

Page 26: Infinispan, a distributed in-memory key/value data grid and cache

Code?

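import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.eviction.EvictionStrategy;

// Distributed, synchronous cache: 100 hash segments, 3 owners per entry,
// L1 (near-cache) entries live for 25 seconds
Configuration distConfig = new ConfigurationBuilder()
    .clustering()
        .cacheMode(CacheMode.DIST_SYNC)
        .sync()
        .l1().lifespan(25000L)
        .hash().numSegments(100).numOwners(3)
    .build();

// LRU eviction above 20000 entries; the expiration thread wakes every 5 seconds
// and entries idle for more than 120 seconds expire
Configuration evictionConfig = new ConfigurationBuilder()
    .eviction()
        .maxEntries(20000).strategy(EvictionStrategy.LRU)
    .expiration()
        .wakeUpInterval(5000L)
        .maxIdle(120000L)
    .build();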

Page 27: Infinispan, a distributed in-memory key/value data grid and cache

Infinispan – Core Architecture

[Figure: two identical nodes serving remote and embedded applications]

Remote clients: Remote App 1 (C++), Remote App 2 (Java), Remote App 3 (.NET)

Network (TCP) between remote clients and nodes

Each Node (JVM) contains:

• MemCached, HotRod, REST, WebSocket (*)

• Embedded App (Java)

• Transport (JGroups)

• Notification

• Transactions / XA

• Query

• Map / Reduce

• Monitoring

• Storage Engine (RAM + Overflow)

Nodes communicate with each other over TCP/UDP

Page 28: Infinispan, a distributed in-memory key/value data grid and cache

Infinispan Clustering and Consistent Hashing

• JGroups Views

• Each node has a unique address

• View changes when nodes join, leave

• Keys are hashed using MurmurHash3 algorithm

• Hash Space is divided into segments

• Key > Segment > Owners

• Primary and Backup Owners
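The Key > Segment > Owners chain can be illustrated with a short Java sketch. It is deliberately simplified and is not Infinispan's implementation: hashCode() stands in for MurmurHash3, a modulo stands in for dividing the hash space into segment ranges, and the round-robin owner assignment, the class name and the method names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SegmentOwnersDemo {
    // Key -> segment. Stand-in hash: Infinispan uses MurmurHash3, while
    // hashCode() keeps this sketch free of external dependencies.
    static int segmentOf(Object key, int numSegments) {
        return Math.floorMod(key.hashCode(), numSegments);
    }

    // Segment -> owners. Hypothetical round-robin assignment: the first entry
    // is the primary owner, the remaining entries are backup owners.
    static List<String> ownersOf(int segment, List<String> nodes, int numOwners) {
        List<String> owners = new ArrayList<>();
        int copies = Math.min(numOwners, nodes.size());
        for (int i = 0; i < copies; i++) {
            owners.add(nodes.get((segment + i) % nodes.size()));
        }
        return owners;
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("A", "B", "C", "D");
        int segment = segmentOf("Hello", 100);
        List<String> owners = ownersOf(segment, nodes, 2);
        System.out.println("segment " + segment + ", primary " + owners.get(0)
            + ", backups " + owners.subList(1, owners.size()));
    }
}
```

Mapping keys to a fixed number of segments first means that a view change only needs to reassign whole segments to nodes, rather than rehash every key.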

Page 29: Infinispan, a distributed in-memory key/value data grid and cache

Does it scale?

• 320 nodes, 3000 caches, 20 TB RAM

• Largest cluster formed: 1000 nodes

Page 30: Infinispan, a distributed in-memory key/value data grid and cache

Empty Cluster

[Figure: an empty cluster]

Page 31: Infinispan, a distributed in-memory key/value data grid and cache

Add 1 Entry

[Figure: cluster holding a single entry, K1]

Page 32: Infinispan, a distributed in-memory key/value data grid and cache

Primary and Backup

[Figure: cluster holding two copies of K1, a primary and a backup]

Page 33: Infinispan, a distributed in-memory key/value data grid and cache

Add another one

[Figure: cluster holding two copies of K1 and a new entry, K2]

Page 34: Infinispan, a distributed in-memory key/value data grid and cache

Primary And Backup

[Figure: cluster holding two copies each of K1 and K2]

Page 35: Infinispan, a distributed in-memory key/value data grid and cache

A cluster with more keys

[Figure: cluster holding two copies each of K1 to K5]

Page 36: Infinispan, a distributed in-memory key/value data grid and cache

A node dies…

[Figure: cluster holding two copies each of K1 to K5]

Page 37: Infinispan, a distributed in-memory key/value data grid and cache

The cluster heals

[Figure: cluster holding two copies each of K1 to K5]

Page 38: Infinispan, a distributed in-memory key/value data grid and cache

If multiple nodes fail…

• CAP Theorem to the rescue:

• Formulated by Eric Brewer in 1998

• C - Consistency

• A - High Availability

• P - Tolerance to Network Partitions

• Can only satisfy 2 at the same time:

• Consistency + Availability: The Ideal World where network partitions do not exist

• Partitioning + Availability: Data might be different between partitions

• Partitioning + Consistency: Do not corrupt data!

Page 39: Infinispan, a distributed in-memory key/value data grid and cache

Infinispan Partition Handling Strategies

• In the presence of network partitions

• Prefer availability (partition handling DISABLED)

• Prefer consistency (partition handling ENABLED)

• Split Detection with partition handling ENABLED:

• Ensure stable topology

• Nodes lost >= numOwners OR no simple majority -> Degraded

• Check segment ownership

• Mark partition as Available / Degraded

• Send PartitionStatusChangedEvent to listeners
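The split-detection rule can be sketched as a small decision function. This is a simplification, not Infinispan's code: the class, enum and parameter names are hypothetical, and the sketch assumes that losing numOwners or more nodes is the point at which some segment may have lost every copy, and that a simple majority means more than half of the nodes in the last stable topology.

```java
class PartitionStatusDemo {
    enum Status { AVAILABLE, DEGRADED }

    // stableSize: nodes in the last stable topology
    // partitionSize: nodes still visible in this partition
    // numOwners: configured number of copies per entry
    static Status evaluate(int stableSize, int partitionSize, int numOwners) {
        int lost = stableSize - partitionSize;
        boolean hasMajority = partitionSize >= stableSize / 2 + 1;
        boolean mayHaveLostSegment = lost >= numOwners;
        return (mayHaveLostSegment || !hasMajority) ? Status.DEGRADED : Status.AVAILABLE;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(5, 4, 2)); // one node lost, majority kept
        System.out.println(evaluate(5, 3, 2)); // two nodes lost with numOwners = 2
        System.out.println(evaluate(4, 2, 3)); // exactly half the nodes, no majority
    }
}
```

A real implementation would check actual segment ownership, as the slide notes, rather than rely only on node counts.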

Page 40: Infinispan, a distributed in-memory key/value data grid and cache

Cluster Partitioning – No data lost

[Figure: cluster split into Partition1 and Partition2, holding two copies each of K1 to K5]

Page 41: Infinispan, a distributed in-memory key/value data grid and cache

Cluster Partitioning – Lost data

[Figure: cluster split into Partition1 and Partition2, holding two copies each of K1 to K5]

Page 42: Infinispan, a distributed in-memory key/value data grid and cache

Merging Split Clusters

• Split Clusters see each other again

• Step1: Ensure stable topology

• Step2: Automatic: based on partition state

• 1 Available -> attempt merge

• All Degraded -> attempt merge

• Step3: Manual

• Data was lost

• Custom listener on Merge

• Application decides

Page 43: Infinispan, a distributed in-memory key/value data grid and cache

Querying Infinispan

• Apache Lucene Index

• Native Query API (Query DSL)

• Hibernate Search and Apache Lucene to index and search

• Native Map/Reduce

• Index-less

• Distributed Execution Framework

• Hadoop Integration (WIP)

• Run existing map/reduce jobs on Infinispan data

Page 44: Infinispan, a distributed in-memory key/value data grid and cache

Map Reduce:

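import java.util.Map;
import org.infinispan.distexec.mapreduce.MapReduceTask;

// Run a map/reduce job over the entries of wordCache
MapReduceTask<String, String, String, Integer> mapReduceTask =
    new MapReduceTask<>(wordCache);

mapReduceTask
    .mappedWith(new WordCountMapper())
    .reducedWith(new WordCountReducer());

Map<String, Integer> wordCountMap = mapReduceTask.execute();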
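The slide does not show WordCountMapper or WordCountReducer. The sketch below mimics the two phases they would presumably implement, using plain Java collections instead of the Infinispan Mapper and Reducer interfaces; the class WordCountSketch, its method names and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordCountSketch {
    // Map phase: emit (word, 1) for every word in one cache value.
    static void map(String value, Map<String, List<Integer>> collector) {
        for (String word : value.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                collector.computeIfAbsent(word, w -> new ArrayList<>()).add(1);
            }
        }
    }

    // Reduce phase: sum all counts emitted for one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Runs both phases locally; a data grid would run map() on the node that
    // owns each entry and ship only the intermediate results.
    static Map<String, Integer> wordCount(Map<String, String> wordCache) {
        Map<String, List<Integer>> intermediate = new HashMap<>();
        for (String value : wordCache.values()) {
            map(value, intermediate);
        }
        Map<String, Integer> result = new HashMap<>();
        intermediate.forEach((word, counts) -> result.put(word, reduce(counts)));
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> wordCache = new HashMap<>();
        wordCache.put("1", "hello world");
        wordCache.put("2", "hello infinispan");
        System.out.println(wordCount(wordCache));
    }
}
```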

Page 45: Infinispan, a distributed in-memory key/value data grid and cache

Query DSL:

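import java.util.List;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.infinispan.query.CacheQuery;
import org.infinispan.query.Search;

// Parse a Lucene query string; "default" is the field used for unprefixed terms
QueryParser qp = new QueryParser("default", new StandardAnalyzer());
Query luceneQ = qp.parse("+station.name:airport +year:2014 +month:12 +(avgTemp < 0)");

// Wrap the Lucene query in a cache query restricted to DaySummary entries
CacheQuery cq = Search.getSearchManager(cache)
    .getQuery(luceneQ, DaySummary.class);

List<Object> results = cq.list();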

Page 46: Infinispan, a distributed in-memory key/value data grid and cache

Other features

• JMX Management

• RHQ (JBoss Enterprise Management Solution)

• CDI Support

• JSR 107 (JCACHE) integration

• Custom interceptors

• Runs on Amazon Web Services Platform

• Command line client

• JTA with JBoss TM, Bitronix, Atomikos

• GridFS (experimental API), CloudTM, Cross Site Replication

Page 47: Infinispan, a distributed in-memory key/value data grid and cache

DEMO

Q & A

Page 48: Infinispan, a distributed in-memory key/value data grid and cache

Thank you!

Resources:

http://www.griddynamics.com/distributed-algorithms-in-nosql-databases/
http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
http://pdos.csail.mit.edu/papers/ton:chord/paper-ton.pdf
http://www.martinbroadhurst.com/Consistent-Hash-Ring.html
http://infinispan.org/docs/7.2.x/user_guide/user_guide.html
https://github.com/infinispan/infinispan/wiki