In-Memory Performance, Durability of Disk · apachecon.com/euroadshow18/ignite-spark-iot.pdf · Ignite...

Post on 14-Aug-2020

© 2018 GridGain Systems, Inc.

In-Memory Performance, Durability of Disk

Apache Ignite and Apache Spark

Where Fast Data Meets the IoT

Akmal Chaudhri, GridGain Systems

Agenda

• IoT Demands to Software
• IoT Software Stack
• Device OS/RTOS
• Data Collection and Enrichment
• NewSQL Database
• Application APIs
• Demo

IoT Demands to Software

Real-time Processing

SQL, Geo-Spatial

Analytics (BI, ML)

High-Availability

Simple Scalability

IoT Software Stack

Device OS/Real-Time OS

Data Collection and Enrichment

NewSQL Database

Application APIs

Apache IoT Software Stack

Device OS/Real-Time OS

Data Collection and Enrichment

NewSQL Database

Application APIs

Apache Mynewt

Open Source RTOS

Cortex M, MIPS

Bluetooth, Wi-Fi, TCP/IP

Secured Bootloader

Remote Firmware Upgrade

Data Collection and Enrichment

[Diagram: Ignite Cluster; node labels: DURABLE MEMORY]

Apache Ignite Database, Caching and Processing Platform

Memory-Centric Storage

Ignite Native Persistence (Flash, SSD, Intel 3D XPoint)

Third-Party Persistence (RDBMS, HDFS, NoSQL)

SQL, Transactions, Compute, Services, ML, Streaming, Key/Value

IoT, Financial Services, Pharma & Healthcare, E-Commerce, Travel & Logistics, Telco

Ignite and Spark Integration

[Diagram: Spark Application; three Spark Workers, each running two Spark Jobs; Yarn, Mesos, Docker, HDFS; In-Memory Shared RDD or DataFrame; three GridGain Nodes]

Share state and data among Spark jobs

No data movement

Boost DataFrame and SQL Performance

SQL on top of RDDs

In-place query execution

SQL Queries Execution Flow

1. Initial Query
2. Query execution over local data
3. Reduce multiple results in one

[Diagram: two Ignite Nodes; one holds Canada (Toronto, Ottawa, Montreal, Calgary), the other holds India (Mumbai, New Delhi); arrows numbered 1, 2, 2, 3]
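
The three steps of the execution flow can be illustrated with a small simulation. This is a minimal sketch in plain Python, not the Ignite API: the `NODES` dictionary, the node names, and the helper functions `map_count`, `map_sorted_names`, and `run_query` are all hypothetical, and the rows are simply the city names from the diagram. It assumes each node evaluates the query against its own partition and a single reducer merges the partial results.

```python
import heapq

# Hypothetical partitions: each "node" holds the rows shown in the diagram.
NODES = {
    "node-1": [("Canada", "Toronto"), ("Canada", "Ottawa"),
               ("Canada", "Montreal"), ("Canada", "Calgary")],
    "node-2": [("India", "Mumbai"), ("India", "New Delhi")],
}


def map_count(rows):
    """Step 2: a node counts only the rows it stores locally."""
    return len(rows)


def map_sorted_names(rows):
    """Step 2: a node sorts only its local city names."""
    return sorted(city for _country, city in rows)


def run_query(map_fn, reduce_fn):
    """Step 1: send the query to every node. Step 3: reduce the partial results."""
    partials = [map_fn(rows) for rows in NODES.values()]
    return reduce_fn(partials)


# Roughly "SELECT COUNT(*) FROM city": partial counts are summed.
total_cities = run_query(map_count, sum)

# Roughly "SELECT name FROM city ORDER BY name": sorted partial lists are merged.
ordered_cities = run_query(
    map_sorted_names, lambda parts: list(heapq.merge(*parts))
)

print(total_cities)
print(ordered_cities)
```

The point of the sketch is that only small partial results (a count, a pre-sorted list) travel to the reducer, while the rows themselves stay on the node that owns them.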

Comparing Ignite and Spark

Ignite
• Distributed memory-centric database
• Fully fledged compute platform: SQL, transactions, key-value, collocated processing, ML/DL
• OLAP and OLTP

Spark
• Ingests data from HDFS or another storage
• Streaming and compute engine
• Inclined towards OLAP and focused on MR payloads

Ignite and Spark Together

Ignite is a memory-centric store for Spark

• No data movement from Ignite to Spark
• In-place query execution
• Boost DataFrame and SQL performance
• Share state and data among Spark jobs
• Faster data and streaming analytics

DEMO

Any Questions?

Thank you for joining us. Follow the conversation.

http://ignite.apache.org

#apacheignite