Pivotal's effort on Apache Geode

Apache Geode (incubating), and Pivotal's leadership role in open sourcing GemFire

Nitin Lamba

Pivotal’s Open Source strategy

What is Apache Geode?

History

Differentiators

Basic Concepts

Resources

Agenda

In 2015, Pivotal granted the components of its Big Data Suite to open source

6 million lines of code

4 new open source communities

May 2015, Sept 2015, Sept 2015, Oct 2015

From GEMFIRE to GEODE…

A distributed, memory-based data management platform for data-oriented apps that need:

• high performance, scalability, resiliency and continuous availability

• fast access to critical data sets

• location-aware distributed data processing

• event-driven data architecture

What is GEODE?

• 1000+ systems in production (real customers)

• Cutting edge use cases

Incubating but ROCK solid…

Timeline: <2000, 2004, 2008, 2012, 2016

Early drivers

• Data volumes

• Margins/transactions

• IT maintenance costs

• Elasticity needs

Real-time needs

• Real-time response

• Time to market needs

• Flexible data models

• Persistent + in-memory

Global Data

• Visibility across DC

• Fast ingest

• Device to enterprise

• Uptime (always on)

Open Source!

• Apache Incubation

• GemFire > Geode

• Geode M1 release

• 1st Geode Summit

Financial Services

US DoD

Trade Clearing

Travel Portal

Online Gambling

Telcos

Manufacturing

Auto Insurance

Payroll processing

Rail systems

…with both SCALE and SPEED, …

40K transactions per second

3 TB data in-memory

17B records in-memory

120K concurrent

… and impacting a LOT of people!

China Railway Corporation

Indian Railways

36% of the world population

High-level Architecture

Powerful app development kit

• APIs: Java & REST

• Adapters: Redis, Lucene*, Spark*, …

Multiple persistence options

• Filesystem, RDBMS or HDFS*

• Sync: read-through, write-through

• Async: write-behind

Durable <K,V> cache/store

• Data replicated or partitioned

• Redundant storage in-memory/disk

• Flexible data retention policies

A Peer-2-Peer in-memory Distributed System

* Experimental and awaiting community feedback
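The synchronous and asynchronous persistence options listed on this slide can be illustrated with a small sketch. This is not Geode code: `BackingStore` and `SketchCache` are hypothetical names, and a plain dictionary stands in for the filesystem or RDBMS. It only shows how read-through, write-through and write-behind differ in when the external store is touched.

```python
from collections import deque


class BackingStore:
    """Hypothetical stand-in for an external store (filesystem, RDBMS, ...)."""

    def __init__(self):
        self.rows = {}

    def load(self, key):
        return self.rows.get(key)

    def save(self, key, value):
        self.rows[key] = value


class SketchCache:
    """In-memory map showing the three persistence modes named on the slide."""

    def __init__(self, store, write_behind=False):
        self.store = store
        self.write_behind = write_behind
        self.data = {}
        self.pending = deque()  # queued writes for the asynchronous mode

    def get(self, key):
        # Read-through: on a miss, load from the backing store and keep a copy.
        if key not in self.data:
            value = self.store.load(key)
            if value is not None:
                self.data[key] = value
        return self.data.get(key)

    def put(self, key, value):
        self.data[key] = value
        if self.write_behind:
            # Write-behind: return immediately, persist later in a batch.
            self.pending.append((key, value))
        else:
            # Write-through: persist synchronously before returning.
            self.store.save(key, value)

    def flush(self):
        # Drain queued writes in arrival order.
        while self.pending:
            key, value = self.pending.popleft()
            self.store.save(key, value)


store = BackingStore()
store.save("k0", "from-store")

sync_cache = SketchCache(store)
read_through_value = sync_cache.get("k0")
sync_cache.put("k1", "v1")
stored_after_write_through = store.load("k1")

async_cache = SketchCache(store, write_behind=True)
async_cache.put("k2", "v2")
stored_before_flush = store.load("k2")
async_cache.flush()
stored_after_flush = store.load("k2")
```

In the write-behind case the store does not see `k2` until `flush()` runs, which is the trade-off the slide's "Async" label points at: lower write latency in exchange for a window in which the external store lags the cache.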

• Minimize copying

• Minimize contention points

• Run user code in-process

• Partitioning & parallelism

• Avoid disk seeks

• Automated benchmarks

What makes it go FAST?
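The "Partitioning & parallelism" point can be sketched as follows. This is an illustration under stated assumptions, not Geode's actual placement algorithm: the bucket count, the CRC32 hash and the round-robin assignment are all choices made for the example, and the member names are hypothetical.

```python
import zlib

NUM_BUCKETS = 8  # illustrative value, not a Geode default


def bucket_for(key):
    # Stable hash so every member computes the same bucket for a key.
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


def assign_buckets(members, redundant_copies=1):
    """Round-robin each bucket to a primary plus redundant_copies secondaries."""
    layout = {}
    for bucket in range(NUM_BUCKETS):
        owners = [members[(bucket + i) % len(members)]
                  for i in range(redundant_copies + 1)]
        layout[bucket] = {"primary": owners[0], "secondaries": owners[1:]}
    return layout


members = ["server1", "server2", "server3"]
layout = assign_buckets(members, redundant_copies=1)


def owners_of(key):
    entry = layout[bucket_for(key)]
    return entry["primary"], entry["secondaries"]


primary, secondaries = owners_of("customer-42")

# Count how many buckets each member serves as primary.
primary_counts = {m: 0 for m in members}
for entry in layout.values():
    primary_counts[entry["primary"]] += 1
```

Because each key maps to exactly one bucket and buckets are spread across members, operations on different keys can proceed on different servers in parallel, while the secondary copy provides redundancy if a primary is lost.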

• Cache

• Region

• Member

• Client Cache

• Persistence

• Functions

Let’s talk about a few BASIC CONCEPTS…

• In-memory storage and management for your data

• Configurable through XML, Java API or CLI

• Collection of Regions

What is a CACHE?

• Distributed java.util.Map on steroids (Key/Value)

• Consistent API regardless of where or how data is stored

• Observable (reactive)

• Highly available, redundant on cache Member(s)

What is a REGION?
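The "map on steroids" and "observable" points can be illustrated with a minimal, single-process sketch. `SketchRegion` and its event names are hypothetical and chosen for the example; a real region is distributed and its listener interface differs, so treat this only as a picture of the idea that every change to the map produces an event that registered callbacks receive.

```python
class SketchRegion:
    """Dict-like region that notifies listeners after each change."""

    def __init__(self, name):
        self.name = name
        self._entries = {}
        self._listeners = []

    def add_listener(self, callback):
        # callback(event_type, key, old_value, new_value)
        self._listeners.append(callback)

    def put(self, key, value):
        old = self._entries.get(key)
        event = "update" if key in self._entries else "create"
        self._entries[key] = value
        self._notify(event, key, old, value)

    def get(self, key):
        return self._entries.get(key)

    def remove(self, key):
        if key in self._entries:
            old = self._entries.pop(key)
            self._notify("destroy", key, old, None)

    def _notify(self, event, key, old, new):
        for callback in self._listeners:
            callback(event, key, old, new)


events = []
orders = SketchRegion("orders")
orders.add_listener(lambda *event: events.append(event))

orders.put("o1", "pending")
orders.put("o1", "shipped")
orders.remove("o1")
```

After these three calls, `events` holds one create, one update and one destroy record for key `o1`, in that order.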

• Local, Replicated or Partitioned

• In-memory or persistent

• Redundant

• LRU

• Overflow

Region: Types & Options

LOCAL, LOCAL_HEAP_LRU, LOCAL_OVERFLOW, LOCAL_PERSISTENT, LOCAL_PERSISTENT_OVERFLOW

PARTITION, PARTITION_HEAP_LRU, PARTITION_OVERFLOW, PARTITION_PERSISTENT, PARTITION_PERSISTENT_OVERFLOW, PARTITION_PROXY, PARTITION_PROXY_REDUNDANT, PARTITION_REDUNDANT, PARTITION_REDUNDANT_HEAP_LRU, PARTITION_REDUNDANT_OVERFLOW, PARTITION_REDUNDANT_PERSISTENT, PARTITION_REDUNDANT_PERSISTENT_OVERFLOW

REPLICATE, REPLICATE_HEAP_LRU, REPLICATE_OVERFLOW, REPLICATE_PERSISTENT, REPLICATE_PERSISTENT_OVERFLOW, REPLICATE_PROXY
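The LRU and Overflow options can be pictured with the sketch below. It is a simplification under stated assumptions: `LruOverflowRegion` is a hypothetical class, eviction is triggered by an entry count purely to keep the example short (it is not how the shortcuts above are configured), and a second dictionary stands in for disk.

```python
from collections import OrderedDict


class LruOverflowRegion:
    """Keeps at most max_in_memory entries in memory; the rest overflow."""

    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.memory = OrderedDict()  # least recently used entry first
        self.overflow = {}           # stand-in for values moved to disk

    def put(self, key, value):
        self.overflow.pop(key, None)
        self.memory[key] = value
        self.memory.move_to_end(key)
        self._evict()

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)
            return self.memory[key]
        if key in self.overflow:
            # Fault the value back into memory on access.
            value = self.overflow.pop(key)
            self.memory[key] = value
            self._evict()
            return value
        return None

    def _evict(self):
        # Move least recently used entries out until the limit is respected.
        while len(self.memory) > self.max_in_memory:
            old_key, old_value = self.memory.popitem(last=False)
            self.overflow[old_key] = old_value


region = LruOverflowRegion(max_in_memory=2)
region.put("a", 1)
region.put("b", 2)
region.get("a")      # "a" becomes most recently used
region.put("c", 3)   # "b" is now least recently used and overflows
memory_keys = list(region.memory)
overflow_keys = list(region.overflow)
```

The point of overflow, as opposed to plain eviction, is that the entry remains readable: a later `get("b")` still returns its value, at the cost of pulling it back into memory.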

• Durability

• WAL for efficient writing

• Consistent recovery

• Compaction

Persistent Regions

[Diagram: Server 1 through Server N]

• A process that has a connection to the system

• A process that has created a cache

• Embeddable within your application

What is a MEMBER?

[Diagram: Client, Locator, Server]

• A process connected to the Geode server(s)

• Can have a local copy of the data

• Run OQL queries on local data

• Can be notified about events on the servers

What is a CLIENT CACHE?

Persistence - Shared Nothing

[Diagram: Server 1, Server 2, Server 3 with Primary and Secondary copies]

Server 1 waits for others when it starts

Fetches missed operations on restart

Persistence - Operational Logs

Member 1: Put k6->v6

Append to operation log

Oplog files: Oplog1.crf, Oplog2.crf

Log entries: Create k1->v1, Create k2->v2, Modify k1->v3, Create k4->v4, Modify k1->v5, Create k6->v6

Persistence - Operational Logs: Compaction

Member 1: Put k6->v6

Append to operation log

Oplog files: Oplog1.crf, Oplog2.crf

Log entries: Create k1->v1, Create k2->v2, Modify k1->v3, Create k4->v4, Modify k1->v5, Create k6->v6

Copy live data forward
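The append and compaction steps on these two slides can be sketched as below, using the same six operations. Several things here are assumptions made for the example: `OplogSketch` is a hypothetical class, in-memory lists stand in for the `.crf` files, the roll-over threshold of four records per log is arbitrary (the slides do not say which entries sit in which file), and deletes are not modeled.

```python
class OplogSketch:
    """Append-only operation logs with roll-over and compaction."""

    def __init__(self, max_records_per_log):
        self.max_records_per_log = max_records_per_log
        self.logs = [[]]  # logs[-1] is the active log

    def append(self, op, key, value):
        if len(self.logs[-1]) >= self.max_records_per_log:
            self.logs.append([])  # roll over to a new log
        self.logs[-1].append((op, key, value))

    def live_records(self, log_index):
        """Records in one log that no later record for the same key supersedes."""
        latest = {}
        for i, log in enumerate(self.logs):
            for j, (_, key, _) in enumerate(log):
                latest[key] = (i, j)
        return [rec for j, rec in enumerate(self.logs[log_index])
                if latest[rec[1]] == (log_index, j)]

    def compact_oldest(self):
        """Copy live records from the oldest log forward, then drop that log."""
        if len(self.logs) < 2:
            return
        live = self.live_records(0)
        self.logs.pop(0)
        for record in live:
            self.append(*record)

    def recover(self):
        """Replay all logs in order to rebuild the in-memory map."""
        state = {}
        for log in self.logs:
            for _, key, value in log:
                state[key] = value
        return state


oplog = OplogSketch(max_records_per_log=4)
for op, key, value in [
    ("create", "k1", "v1"),
    ("create", "k2", "v2"),
    ("modify", "k1", "v3"),
    ("create", "k4", "v4"),
    ("modify", "k1", "v5"),
    ("create", "k6", "v6"),
]:
    oplog.append(op, key, value)

logs_before = len(oplog.logs)
state_before = oplog.recover()
oplog.compact_oldest()
logs_after = len(oplog.logs)
state_after = oplog.recover()
```

Writes only ever append, which avoids disk seeks, and compaction reclaims space without changing what recovery reconstructs: the two stale `k1` records are dropped while `k2` and `k4` are copied forward.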

• Used for distributed concurrent processing (Map/Reduce, stored procedure)

• Highly available

• Data oriented

• Member oriented

Functions
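The difference between data-oriented and member-oriented function execution can be sketched as follows. Nothing here is Geode API: the member names, the sample data and the helper functions are hypothetical, and threads in one process stand in for servers. The idea shown is that the function travels to where the data lives, each member computes over its local entries, and the caller combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitioned data: each member holds its own slice of a region.
member_data = {
    "server1": {"t1": 120.0, "t4": 35.5},
    "server2": {"t2": 80.0},
    "server3": {"t3": 10.0, "t5": 4.5},
}


def sum_local(entries):
    """The function being shipped: runs against one member's local entries."""
    return sum(entries.values())


def execute_on_region(function, data_by_member):
    """Data-oriented: run on every member hosting data, then collect results."""
    with ThreadPoolExecutor(max_workers=len(data_by_member)) as pool:
        futures = {member: pool.submit(function, entries)
                   for member, entries in data_by_member.items()}
        return {member: future.result() for member, future in futures.items()}


def execute_on_members(function, members, data_by_member):
    """Member-oriented: run only on the explicitly named members."""
    return {m: function(data_by_member[m]) for m in members}


partials = execute_on_region(sum_local, member_data)
total = sum(partials.values())
one_member = execute_on_members(sum_local, ["server2"], member_data)
```

Only the small partial sums cross the network in this model, which is why the slide compares functions to Map/Reduce and stored procedures.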

• Check out: http://geode.incubator.apache.org

• Subscribe: user-subscribe@geode.incubator.apache.org

• Download: http://geode.incubator.apache.org/releases/

Join the Community!

Thank you!

Additional Slides

Built for PERFORMANCE…

[Chart: YCSB workloads, Cassandra vs. Geode; axis from 200,000 to 1,000,000]

…and horizontal, consistent SCALABILITY!

Horizontal scaling for reads, consistent latency and CPU

[Chart: speedup, latency (ms) and CPU % versus number of server hosts (2 to 10)]

• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers

• Partitioned region with redundancy and 1K data size

High Availability
