Druid: Sub-Second OLAP queries over Petabytes of Streaming Data

Nishant Bangarwa, Hortonworks

Druid Committer, PMC; Superset Incubator PPMC

June 2017



© Hortonworks Inc. 2011 – 2016. All Rights Reserved

Agenda

History and Motivation

Introduction

Demo

Druid Architecture – Indexing and Querying Data

Druid In Production

Recent Improvements


HISTORY AND MOTIVATION

Druid was open-sourced in late 2012

Initial Use case

Power ad-tech analytics product

Requirements

Arbitrary queries

Scalability : trillions of events/day

Interactive : low latency queries

Real-time : data freshness

High Availability

Rolling Upgrades


MOTIVATION

Business Intelligence Queries

Arbitrary slicing and dicing of data

Interactive real time visualizations on Complex data streams

Answer BI questions

– How many unique male visitors visited my website last month?

– How many products were sold last quarter, broken down by a demographic and product category?

Not interested in dumping entire dataset


Introduction


What is Druid ?

Column-oriented distributed datastore

Sub-Second query times

Realtime streaming ingestion

Arbitrary slicing and dicing of data

Automatic Data Summarization

Approximate algorithms (HyperLogLog, theta sketches)

Scalable to petabytes of data

Highly available


Demo


Demo: Wikipedia Real-Time Dashboard (Accelerated 30x)


Wikipedia Real-Time Dashboard: How it Works

[Diagram labels: Wikipedia Edits Data Stream, Java Stream Reader, Write, Read, Exactly-Once Ingestion]


Druid Architecture


Druid Architecture

[Diagram labels: Streaming Data, Event, Realtime Nodes, Realtime Index Tasks, Handoff, Batch Data, Historical Nodes, Broker Nodes]


Druid Architecture

[Diagram labels: Streaming Data, Realtime Index Tasks, Handoff, Batch Data, Historical Nodes, Broker Nodes, Queries, Coordinator Nodes, Metadata Store, Zookeeper, Optional Distributed Query Cache]


Indexing Data


Indexing Service

Indexing is performed by

Overlord

Middle Managers

Peons

Middle Managers spawn peons, which run ingestion tasks

Each peon runs 1 task

Task definition defines which task to run and its properties


Streaming Ingestion : Realtime Index Tasks

Ability to ingest streams of data

Stores data in write-optimized structure

Periodically converts the write-optimized structure to read-optimized segments

Events are queryable as soon as they are ingested

Both push and pull based ingestion
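
The write-optimized and read-optimized split described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not Druid's implementation: the class name `RealtimeIndex`, the row-count threshold `max_buffer_rows`, and the `count` query are all hypothetical, and a real task persists on its own schedule rather than after three rows.

```python
class RealtimeIndex:
    """Toy model: an append-only buffer plus immutable, sorted segments."""

    def __init__(self, max_buffer_rows=3):
        self.max_buffer_rows = max_buffer_rows  # hypothetical persist threshold
        self.buffer = []    # write-optimized: plain appends
        self.segments = []  # read-optimized: immutable tuples sorted by timestamp

    def ingest(self, timestamp, page):
        self.buffer.append((timestamp, page))
        if len(self.buffer) >= self.max_buffer_rows:
            self.persist()

    def persist(self):
        # Convert the current buffer into a sorted, immutable segment.
        if self.buffer:
            self.segments.append(tuple(sorted(self.buffer)))
            self.buffer = []

    def count(self, page):
        # A query reads persisted segments and the live buffer,
        # so an event is visible as soon as it has been ingested.
        persisted = sum(1 for seg in self.segments for _, p in seg if p == page)
        live = sum(1 for _, p in self.buffer if p == page)
        return persisted + live


index = RealtimeIndex(max_buffer_rows=3)
for ts, page in [
    ("00:01:35", "Justin Bieber"),
    ("00:04:51", "Justin Bieber"),
    ("00:05:35", "Ke$ha"),
    ("00:06:41", "Ke$ha"),
]:
    index.ingest(ts, page)

print(index.count("Ke$ha"))
```

After the four events, one row for Ke$ha sits in a persisted segment and one is still in the buffer; the query counts both.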


Streaming Ingestion : Tranquility

Helper library for coordinating streaming ingestion

Simple API to send events to Druid

Transparently Manages

Realtime index Task Creation

Partitioning and Replication

Schema Evolution

Can be used with Flink, Samza, Spark, Storm, or any other ETL framework


Kafka Indexing Service (experimental)

Supports exactly-once ingestion

Messages pulled by Kafka Index Tasks

Each Kafka Index Task consumes from a set of partitions with start and end offset

Each message verified to ensure sequence

Kafka Offsets and corresponding segments persisted in same metadata transaction atomically

Kafka Supervisor

Embedded inside the Overlord

Manages Kafka index tasks

Retries failed tasks
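
The exactly-once guarantee above rests on committing the Kafka offsets and the segment record in one metadata transaction. The sketch below illustrates that idea with an in-memory SQLite database. The table names, column names, and the `publish` helper are hypothetical and are not Druid's metadata schema; the start-offset check is an assumption about how a replayed range could be rejected.

```python
import sqlite3


def open_metadata_store():
    # Hypothetical two-table metadata store, kept in memory for the example.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE segments (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE offsets (part INTEGER PRIMARY KEY, next_offset INTEGER)"
    )
    conn.commit()
    return conn


def publish(conn, segment_id, part, start_offset, end_offset):
    """Record a segment and advance the offset atomically; True on success."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            row = conn.execute(
                "SELECT next_offset FROM offsets WHERE part = ?", (part,)
            ).fetchone()
            committed = row[0] if row else 0
            if committed != start_offset:
                # The task read a range that does not start where the last
                # commit ended, so publishing would duplicate or skip data.
                raise ValueError("offset mismatch")
            conn.execute("INSERT INTO segments (id) VALUES (?)", (segment_id,))
            conn.execute(
                "INSERT OR REPLACE INTO offsets (part, next_offset) VALUES (?, ?)",
                (part, end_offset),
            )
        return True
    except (ValueError, sqlite3.IntegrityError):
        return False


conn = open_metadata_store()
first = publish(conn, "seg-a", 0, 0, 100)     # offsets 0..100 committed
replay = publish(conn, "seg-b", 0, 0, 100)    # same range again: rejected
second = publish(conn, "seg-b", 0, 100, 250)  # continues from offset 100
```

Because the segment row and the offset row are written in the same transaction, a retried task either sees its work already committed or starts from the last committed offset.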


Batch Ingestion

HadoopIndexTask

Peon launches Hadoop MR job

Mappers read data

Reducers create Druid segment files

Index Task

Runs in a single JVM, i.e. the peon

Suitable for small data sizes (< 1 GB)

Integrations with Apache Hive and Spark for batch ingestion


Querying Data


Querying Data from Druid

Druid supports

JSON Queries over HTTP

Built-in SQL (experimental)

Querying libraries available for

Python

R

Ruby

JavaScript

Clojure

PHP


JSON Over HTTP

HTTP REST API

Queries and results expressed in JSON

Multiple Query Types

Time Boundary

Timeseries

TopN

GroupBy

Select

Segment Metadata
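
As an illustration of the JSON-over-HTTP interface, the sketch below builds a Timeseries query and the HTTP request that would carry it. The broker address, the `wikipedia` datasource name, and the `added` metric are assumptions taken from the example dataset in this deck; the request is only constructed here, not sent.

```python
import json
from urllib import request

# Assumed broker address; adjust host and port for a real cluster.
BROKER_URL = "http://localhost:8082/druid/v2/"


def timeseries_query(datasource, interval, metric):
    """Build a native Timeseries query that sums one metric per hour."""
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "hour",
        "intervals": [interval],
        "aggregations": [
            {"type": "longSum", "name": f"sum_{metric}", "fieldName": metric}
        ],
    }


def build_request(query, url=BROKER_URL):
    """Wrap the query in an HTTP POST with a JSON body."""
    body = json.dumps(query).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


query = timeseries_query("wikipedia", "2011-01-01/2011-01-03", "added")
req = build_request(query)
# Against a live broker, the result could be read with:
#   json.load(request.urlopen(req))
```

The other query types follow the same pattern: a different `queryType` value with its own fields, posted to the same endpoint.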


Built-in SQL (experimental)

Apache Calcite based parser and planner

Ability to connect Druid to any BI tool that supports JDBC

SQL via JSON over HTTP

Supports Approximate queries

APPROX_COUNT_DISTINCT(col)

Ability to do Fast Approx TopN queries

APPROX_QUANTILE(column, probability)
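
A minimal sketch of the "SQL via JSON over HTTP" path, assuming SQL is enabled on the broker: the statement is wrapped in a JSON object under a `query` key and posted to the SQL endpoint. The broker address, the `wikipedia` table, and the `userid` column are assumptions for illustration, and the payload is only built, not sent.

```python
import json

# Assumed broker address for the SQL endpoint.
SQL_URL = "http://localhost:8082/druid/v2/sql/"


def sql_payload(sql):
    """Encode a SQL statement as the JSON body for the SQL endpoint."""
    return json.dumps({"query": sql}).encode("utf-8")


# Approximate distinct users per page, highest first.
sql = (
    "SELECT page, APPROX_COUNT_DISTINCT(userid) AS unique_users "
    "FROM wikipedia GROUP BY page ORDER BY unique_users DESC LIMIT 10"
)
payload = sql_payload(sql)
```

A query of this shape (one grouping column, ordered by an aggregate, with a limit) is the kind that can be served by the approximate TopN path mentioned above.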


Integrated with multiple Open Source UI tools

Superset

– Developed at Airbnb

– In Apache Incubation since May 2017

Grafana – Druid plugin (https://github.com/grafana-druid-plugin/druidplugin)

Metabase

With built-in SQL, connect to any BI tool that supports JDBC


Superset

Python backend

Flask app builder

Authentication

Pandas for rich analytics

SQLAlchemy as the SQL toolkit

JavaScript frontend

React, NVD3

Deep integration with Druid


Superset Rich Dashboarding Capabilities: Treemaps


Superset Rich Dashboarding Capabilities: Sunburst


Superset UI Provides Powerful Visualizations

Rich library of dashboard visualizations:

Basic: Bar Charts, Pie Charts, Line Charts

Advanced: Sankey Diagrams, Treemaps, Sunburst, Heatmaps

And More!


Druid in Production


Production readiness

Is Druid suitable for my Use case ?

Will Druid meet my performance requirements at scale ?

How complex is it to Operate and Manage Druid cluster ?

How to monitor a Druid cluster ?

High Availability ?

How to upgrade Druid cluster without downtime ?

Security ?

Extensibility for future Use cases ?


Suitable Use Cases

Powering Interactive user facing applications

Arbitrary slicing and dicing of large datasets

User behavior analysis

measuring distinct counts

retention analysis

funnel analysis

A/B testing

Exploratory analytics/root cause analysis


Performance and Scalability : Fast Facts

Most Events per Day: 300 Billion Events / Day (Metamarkets)

Most Computed Metrics: 1 Billion Metrics / Min (Jolata)

Largest Cluster: 200 Nodes (Metamarkets)

Largest Hourly Ingestion: 2 TB per Hour (Netflix)


Performance

Query Latency

– average: 500 ms

– 90th percentile < 1 sec

– 95th percentile < 5 sec

– 99th percentile < 10 sec

Query Volume

– 1000s of queries per minute

Benchmarking code

https://github.com/druid-io/druid-benchmark


Performance : Approximate Algorithms

Ability to store approximate data sketches for high-cardinality columns, e.g. userid

Reduced storage size

Use Cases

Fast approximate distinct counts

Approximate Top-K queries

Approximate histograms

Funnel/retention analysis

Limitation

Not possible to do exact counts or to filter on individual row values


Simplified Druid Cluster Management with Ambari

Install, configure and manage Druid and all external dependencies from Ambari

Easy to enable HA, Security, Monitoring …



Monitoring a Druid Cluster

Each Druid Node emits metrics for

Query performance

Ingestion Rate

JVM Health

Query Cache performance

System health

Emitted as JSON objects to a runtime log file or over HTTP to other services

Emitters available for Ambari Metrics Server, Graphite, StatsD, Kafka

Easy to implement your own metrics emitter


Monitoring using Ambari Metrics Server

HDP 2.6.1 contains pre-defined Grafana dashboards

Health of Druid Nodes

Ingestion

Query performance

Easy to create new dashboards and set up alerts

Auto configured when both Druid and Ambari Metrics Server are installed



High Availability

Deploy Coordinator/Overlord on multiple instances

Leader election in ZooKeeper

Broker – install multiple brokers

Use the Druid Router or any load balancer to route queries to brokers

Realtime Index Tasks – create redundant tasks

Historical Nodes – create load rule with replication factor >= 2 (default = 2)


Rolling Upgrades

Maintain backwards compatibility

Data redundancy

Shared Nothing Architecture

Rolling upgrades

No Downtime



Security

Supports authentication via Kerberos/SPNEGO

Easy wizard-based Kerberos security enablement via Ambari

[Diagram labels: User, Browser, KDC server, Druid; step 1: kinit user; step 2: Token]


Extending Core Druid

Plugin Based Architecture

Leverages Guice to load extensions at runtime

Extensions can

Add a new deep storage implementation

Add a new Firehose

Add Aggregators

Add Complex metrics

Add new Query types

Add new Jersey resources

Bundle your extension with all the other Druid extensions


Companies Using Druid


Recent Improvements


Druid 0.10.0

Kafka Indexing Service – exactly-once ingestion

Built-in SQL support (CLI, HTTP, JDBC)

Numeric Dimensions

Kerberos Authentication

Performance improvements

Optimized filters with large numbers of AND/OR clauses when using Concise bitmaps

Index-based evaluation of simple regex filters such as ‘foo%’

~30% improvement on non-time groupBys

Apache Hive Integration – Supports full SQL, Large Joins, Batch Indexing

Apache Ambari Integration – Easy deployments and Cluster management


Future Work

Improved schema definition & management

Improvements to Hive/Druid integration

Materialized views, pushing down more filters, support for complex columns, etc.

Performance improvements

Select query performance improvements

JIT-friendly topN queries

Security enhancements

Row/Column level security

Integration with Apache Ranger

And much more…


Community

User Google group - [email protected]

Dev Google group - [email protected]

GitHub - druid-io/druid

IRC - #druid-dev on irc.freenode.net


Summary

Easy installation and management via Ambari

Real-time

– Ingestion latency < seconds.

– Query latency < seconds.

Slice and dice big data arbitrarily, like a ninja

– No more pre-canned drill-downs.

– Query with more fine-grained granularity.

High availability and Rolling deployment capabilities

Secure and Production ready

Vibrant and Active community

Available as Tech Preview in HDP 2.6.1


Thank you! Questions?

Twitter - @NishantBangarwa

Email - [email protected]

LinkedIn - https://www.linkedin.com/in/nishant-bangarwa


AtScale + Hive + Druid

Leverage AtScale cubing capabilities

Store aggregate tables in Druid

Updatable dimensions in Hive


Storage Format


Druid: Segments

Data in Druid is stored in Segment Files.

Partitioned by time

Ideally, segment files are each smaller than 1GB.

If files are large, smaller time partitions are needed.

Time

Segment 1: Monday

Segment 2: Tuesday

Segment 3: Wednesday

Segment 4: Thursday

Segment 5_1: Friday

Segment 5_2: Friday


Example Wikipedia Edit Dataset

timestamp             page           language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      …  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      …  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      …  32     45
2011-01-01T00:05:35Z  Ke$ha          en        Calgary  CA       …  17     87
2011-01-01T00:06:41Z  Ke$ha          en        Calgary  CA       …  43     99
2011-01-02T00:08:35Z  Selena Gomes   en        Calgary  CA       …  12     53

Column groups: Timestamp (timestamp), Dimensions (page, language, city, country, …), Metrics (added, deleted)


Data Rollup

timestamp             page           language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      …  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      …  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      …  32     45
2011-01-01T00:05:35Z  Ke$ha          en        Calgary  CA       …  17     87
2011-01-01T00:06:41Z  Ke$ha          en        Calgary  CA       …  43     99
2011-01-02T00:08:35Z  Selena Gomes   en        Calgary  CA       …  12     53

timestamp             page           language  city     country  count  sum_added  sum_deleted  min_added  max_added  …
2011-01-01T00:00:00Z  Justin Bieber  en        SF       USA      3      57         172          10         32
2011-01-01T00:00:00Z  Ke$ha          en        Calgary  CA       2      60         186          17         43
2011-01-02T00:00:00Z  Selena Gomes   en        Calgary  CA       1      12         53           12         12

Rollup by hour
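
The rollup shown in the two tables can be reproduced with a few lines of Python. This is a sketch of the idea rather than Druid's ingestion code: the function name `rollup_by_hour` and the tuple layout are hypothetical, and the timestamp is truncated as a string so that the slide's sample values can be used verbatim.

```python
# Raw events from the example table:
# (timestamp, page, language, city, country, added, deleted)
EVENTS = [
    ("2011-01-01T00:01:35Z", "Justin Bieber", "en", "SF", "USA", 10, 65),
    ("2011-01-01T00:03:63Z", "Justin Bieber", "en", "SF", "USA", 15, 62),
    ("2011-01-01T00:04:51Z", "Justin Bieber", "en", "SF", "USA", 32, 45),
    ("2011-01-01T00:05:35Z", "Ke$ha", "en", "Calgary", "CA", 17, 87),
    ("2011-01-01T00:06:41Z", "Ke$ha", "en", "Calgary", "CA", 43, 99),
    ("2011-01-02T00:08:35Z", "Selena Gomes", "en", "Calgary", "CA", 12, 53),
]


def rollup_by_hour(events):
    """Group rows by (hour, dimensions) and pre-aggregate the metrics."""
    rolled = {}
    for ts, page, language, city, country, added, deleted in events:
        hour = ts[:13] + ":00:00Z"  # keep "YYYY-MM-DDTHH", zero the rest
        key = (hour, page, language, city, country)
        agg = rolled.get(key)
        if agg is None:
            rolled[key] = {
                "count": 1,
                "sum_added": added,
                "sum_deleted": deleted,
                "min_added": added,
                "max_added": added,
            }
        else:
            agg["count"] += 1
            agg["sum_added"] += added
            agg["sum_deleted"] += deleted
            agg["min_added"] = min(agg["min_added"], added)
            agg["max_added"] = max(agg["max_added"], added)
    return rolled


rolled = rollup_by_hour(EVENTS)
for key, agg in rolled.items():
    print(key, agg)
```

Six raw rows collapse into three stored rows; the individual events are no longer recoverable, only their aggregates.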


Dictionary Encoding

Create and store Ids for each value

e.g. page column

Values - Justin Bieber, Ke$ha, Selena Gomes

Encoding - Justin Bieber : 0, Ke$ha: 1, Selena Gomes: 2

Column Data - [0 0 0 1 1 2]

city column - [0 0 0 1 1 1]

timestamp             page           language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      …  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      …  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      …  32     45
2011-01-01T00:05:35Z  Ke$ha          en        Calgary  CA       …  17     87
2011-01-01T00:06:41Z  Ke$ha          en        Calgary  CA       …  43     99
2011-01-02T00:08:35Z  Selena Gomes   en        Calgary  CA       …  12     53
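
Dictionary encoding as described on this slide can be sketched as follows. The helper name `dictionary_encode` is hypothetical, and ids are assigned in order of first appearance, which is an assumption chosen because it reproduces the slide's numbers; a real storage engine may order its dictionary differently.

```python
def dictionary_encode(values):
    """Map each distinct value to a small integer id and encode the column."""
    dictionary = {}
    encoded = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary)  # next unused id
        encoded.append(dictionary[value])
    return dictionary, encoded


page_column = [
    "Justin Bieber", "Justin Bieber", "Justin Bieber",
    "Ke$ha", "Ke$ha", "Selena Gomes",
]
city_column = ["SF", "SF", "SF", "Calgary", "Calgary", "Calgary"]

page_dict, page_ids = dictionary_encode(page_column)
city_dict, city_ids = dictionary_encode(city_column)
print(page_dict, page_ids)
print(city_dict, city_ids)
```

Each string is stored once in the dictionary, and the column itself becomes a compact array of integers.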


Bitmap Indices

Store Bitmap Indices for each value

Justin Bieber -> [0, 1, 2] -> [1 1 1 0 0 0]

Ke$ha -> [3, 4] -> [0 0 0 1 1 0]

Selena Gomes -> [5] -> [0 0 0 0 0 1]

Queries

Justin Bieber or Ke$ha -> [1 1 1 0 0 0] OR [0 0 0 1 1 0] -> [1 1 1 1 1 0]

language = en and country = CA -> [1 1 1 1 1 1] AND [0 0 0 1 1 1] -> [0 0 0 1 1 1]

Indexes compressed with Concise or Roaring encoding

timestamp             page           language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA      …  10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA      …  15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA      …  32     45
2011-01-01T00:05:35Z  Ke$ha          en        Calgary  CA       …  17     87
2011-01-01T00:06:41Z  Ke$ha          en        Calgary  CA       …  43     99
2011-01-02T00:08:35Z  Selena Gomes   en        Calgary  CA       …  12     53
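
The bitmap lookups and the two example filters on this slide can be checked with a short sketch. Bitmaps are plain Python lists of 0 and 1 here, purely for readability; the helper names are hypothetical and nothing is compressed with Concise or Roaring.

```python
def build_bitmaps(column):
    """Return one bitmap per distinct value: bit i is 1 if row i has it."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[row] = 1
    return bitmaps


def bitmap_or(a, b):
    return [x | y for x, y in zip(a, b)]


def bitmap_and(a, b):
    return [x & y for x, y in zip(a, b)]


page = ["Justin Bieber"] * 3 + ["Ke$ha"] * 2 + ["Selena Gomes"]
language = ["en"] * 6
country = ["USA"] * 3 + ["CA"] * 3

page_bm = build_bitmaps(page)
language_bm = build_bitmaps(language)
country_bm = build_bitmaps(country)

# page = 'Justin Bieber' OR page = 'Ke$ha'
either_page = bitmap_or(page_bm["Justin Bieber"], page_bm["Ke$ha"])
# language = 'en' AND country = 'CA'
en_and_ca = bitmap_and(language_bm["en"], country_bm["CA"])
print(either_page, en_and_ca)
```

A filter is answered by combining bitmaps, and only the rows whose bit is set need to be read afterwards.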


Approximate Sketch Columns

timestamp             page           userid       language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  user1111111  en        SF       USA      …  10     65
2011-01-01T00:03:63Z  Justin Bieber  user1111111  en        SF       USA      …  15     62
2011-01-01T00:04:51Z  Justin Bieber  user2222222  en        SF       USA      …  32     45
2011-01-01T00:05:35Z  Ke$ha          user3333333  en        Calgary  CA       …  17     87
2011-01-01T00:06:41Z  Ke$ha          user4444444  en        Calgary  CA       …  43     99
2011-01-02T00:08:35Z  Selena Gomes   user1111111  en        Calgary  CA       …  12     53

timestamp             page           language  city     country  count  sum_added  sum_deleted  min_added  userid_sketch  …
2011-01-01T00:00:00Z  Justin Bieber  en        SF       USA      3      57         172          10         {sketch}
2011-01-01T00:00:00Z  Ke$ha          en        Calgary  CA       2      60         186          17         {sketch}
2011-01-02T00:00:00Z  Selena Gomes   en        Calgary  CA       1      12         53           12         {sketch}

Rollup by hour
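
To make the `{sketch}` column concrete, here is a minimal K-minimum-values sketch, the family of structures that theta sketches belong to. It is an illustrative stand-in, not the sketch library Druid uses: the class `KmvSketch`, its default `k`, and the SHA-1 based hash are all assumptions for the example.

```python
import hashlib


class KmvSketch:
    """Keep the k smallest hash values seen; estimate distinct count from them."""

    def __init__(self, k=1024, hashes=()):
        self.k = k
        self.hashes = sorted(set(hashes))[:k]

    @staticmethod
    def _hash(value):
        # Map a string to a float in [0, 1) using the first 8 bytes of SHA-1.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def add(self, value):
        self.hashes = sorted(set(self.hashes) | {self._hash(value)})[: self.k]

    def union(self, other):
        # Sketches merge without access to the raw values.
        merged = set(self.hashes) | set(other.hashes)
        return KmvSketch(min(self.k, other.k), merged)

    def estimate(self):
        if len(self.hashes) < self.k:
            return float(len(self.hashes))  # fewer than k distinct values: exact
        return (self.k - 1) / self.hashes[-1]


# One sketch per rolled-up row, built from the userid values in that row.
bieber = KmvSketch()
for user in ["user1111111", "user1111111", "user2222222"]:
    bieber.add(user)
kesha = KmvSketch()
for user in ["user3333333", "user4444444"]:
    kesha.add(user)
gomes = KmvSketch()
gomes.add("user1111111")

all_pages = bieber.union(kesha).union(gomes)
print(bieber.estimate(), kesha.estimate(), gomes.estimate(), all_pages.estimate())
```

The union counts user1111111 once even though that user appears in two rolled-up rows, which a plain sum of per-row counts could not do. With more than `k` distinct values the answer becomes an estimate rather than an exact count.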


Approximate Sketch Columns

Better rollup for high-cardinality columns, e.g. userid

Reduced storage size

Use Cases

Fast approximate distinct counts

Approximate histograms

Funnel/retention analysis

Limitation

Not possible to do exact counts or to filter on individual row values