Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
Nishant Bangarwa, Hortonworks
Druid Committer, PMC
Superset Incubator PPMC
June 2017
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda
History and Motivation
Introduction
Demo
Druid Architecture – Indexing and Querying Data
Druid In Production
Recent Improvements
HISTORY AND MOTIVATION
Druid was open sourced in late 2012
Initial Use case
Power ad-tech analytics product
Requirements
Arbitrary queries
Scalability : trillions of events/day
Interactive : low latency queries
Real-time : data freshness
High Availability
Rolling Upgrades
MOTIVATION
Business Intelligence Queries
Arbitrary slicing and dicing of data
Interactive real-time visualizations on complex data streams
Answer BI questions
– How many unique male visitors visited my website last month?
– How many products were sold last quarter, broken down by demographic and product category?
Not interested in dumping entire dataset
Introduction
What is Druid ?
Column-oriented distributed datastore
Sub-Second query times
Realtime streaming ingestion
Arbitrary slicing and dicing of data
Automatic Data Summarization
Approximate algorithms (HyperLogLog, theta sketches)
Scalable to petabytes of data
Highly available
Demo
Demo: Wikipedia Real-Time Dashboard (Accelerated 30x)
Wikipedia Real-Time Dashboard: How it Works
[Diagram labels: Wikipedia Edits Data, Java Stream Reader, Stream (Write / Read), Exactly-Once Ingestion]
Druid Architecture
[Architecture diagram, built up over three slides. Labeled components: Streaming Data, Event, Realtime Nodes, Realtime Index Tasks, Batch Data, Historical Nodes, Handoff, Broker Nodes, Queries, Coordinator Nodes, Metadata Store, Zookeeper, Optional Distributed Query Cache]
Indexing Data
Indexing Service
Indexing is performed by
Overlord
Middle Managers
Peons
Middle Managers spawn peons, which run ingestion tasks
Each peon runs one task
Task definition defines which task to run and its properties
Streaming Ingestion : Realtime Index Tasks
Ability to ingest streams of data
Stores data in write-optimized structure
Periodically converts the write-optimized structure to read-optimized segments
Events are queryable as soon as they are ingested
Both push and pull based ingestion
Streaming Ingestion : Tranquility
Helper library for coordinating streaming ingestion
Simple API to send events to Druid
Transparently manages
Realtime index task creation
Partitioning and replication
Schema evolution
Can be used with Flink, Samza, Spark, Storm, or any other ETL framework
Kafka Indexing Service (experimental)
Supports exactly-once ingestion
Messages pulled by Kafka Index Tasks
Each Kafka Index Task consumes from a set of partitions with start and end offsets
Each message is verified to ensure sequence
Kafka offsets and the corresponding segments are persisted atomically in the same metadata transaction
Kafka Supervisor
Embedded inside the Overlord
Manages Kafka index tasks
Retries failed tasks
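The idea of persisting offsets and segments in one metadata transaction can be sketched as follows. This is a minimal illustration only, using an in-memory SQLite database as a stand-in for the metadata store; the table names, the `publish` function, and the schema are hypothetical and are not Druid's actual implementation.

```python
import sqlite3


def open_metadata_store():
    """Create an in-memory stand-in for the metadata store (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE segments (id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE offsets (part INTEGER PRIMARY KEY, next_offset INTEGER)")
    return db


def publish(db, segment_id, partition, start_offset, end_offset):
    """Record a segment and advance the partition offset in ONE transaction.

    Returns True if committed. Returns False, leaving the store unchanged,
    if the stored offset does not equal start_offset (replay or gap) or if
    the segment id already exists.
    """
    try:
        with db:  # commits on success, rolls back if an exception escapes
            row = db.execute(
                "SELECT next_offset FROM offsets WHERE part = ?", (partition,)
            ).fetchone()
            current = row[0] if row else 0
            if current != start_offset:
                raise ValueError("offset out of sequence")
            db.execute(
                "INSERT OR REPLACE INTO offsets (part, next_offset) VALUES (?, ?)",
                (partition, end_offset),
            )
            db.execute("INSERT INTO segments (id) VALUES (?)", (segment_id,))
        return True
    except (ValueError, sqlite3.IntegrityError):
        return False
```

Because both writes share one transaction, a retried task that replays an already-published offset range is rejected, and a failed segment insert also undoes the offset update.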
Batch Ingestion
HadoopIndexTask
Peon launches Hadoop MR job
Mappers read data
Reducers create Druid segment files
Index Task
Runs in a single JVM, i.e. the peon
Suitable for small data sizes (< 1 GB)
Integrations with Apache Hive and Spark for batch ingestion
Querying Data
Querying Data from Druid
Druid supports
JSON Queries over HTTP
Built-in SQL (experimental)
Querying libraries available for
Python
R
Ruby
JavaScript
Clojure
PHP
JSON Over HTTP
HTTP REST API
Queries and results expressed in JSON
Multiple Query Types
Time Boundary
Timeseries
TopN
GroupBy
Select
Segment Metadata
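As an illustration of a query expressed in JSON, the sketch below builds a Timeseries query body in Python. The data source name `wikipedia`, the interval, and the helper name `timeseries_query` are assumptions made for this example; the body would typically be sent with an HTTP POST to a broker's query endpoint (commonly `/druid/v2/`), which is not done here.

```python
import json


def timeseries_query(data_source, interval, granularity, sum_fields):
    """Build a Druid Timeseries query that sums each listed metric per time bucket."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": granularity,
        "intervals": [interval],  # ISO-8601 interval, start/end
        "aggregations": [
            {"type": "longSum", "name": "sum_" + field, "fieldName": field}
            for field in sum_fields
        ],
    }


# Hypothetical example: hourly sums of added/deleted over two days.
query = timeseries_query(
    "wikipedia", "2011-01-01/2011-01-03", "hour", ["added", "deleted"]
)
body = json.dumps(query, indent=2)  # JSON text to use as the POST body
```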
Built-in SQL (experimental)
Apache Calcite based parser and planner
Ability to connect Druid to any BI tool that supports JDBC
SQL via JSON over HTTP
Supports Approximate queries
APPROX_COUNT_DISTINCT(col)
Ability to do Fast Approx TopN queries
APPROX_QUANTILE(column, probability)
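A minimal sketch of "SQL via JSON over HTTP": the SQL text is wrapped in a JSON object and posted to the broker. The broker address `http://localhost:8082`, the table `wikipedia`, the column `userid`, and the helper name `build_sql_request` are assumptions for illustration. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_sql_request(broker_url, sql):
    """Wrap a SQL string in a JSON body and target the broker's SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        broker_url.rstrip("/") + "/druid/v2/sql/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical query: approximate distinct users per page.
request = build_sql_request(
    "http://localhost:8082",
    "SELECT page, APPROX_COUNT_DISTINCT(userid) AS users "
    "FROM wikipedia GROUP BY page",
)
# urllib.request.urlopen(request) would send it to a running broker.
```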
Integrated with multiple Open Source UI tools
Superset –
Developed at Airbnb
In Apache Incubation since May 2017
Grafana – Druid plugin (https://github.com/grafana-druid-plugin/druidplugin)
Metabase
With built-in SQL, connect to any BI tool that supports JDBC
Superset
Python backend
Flask-AppBuilder
Authentication
Pandas for rich analytics
SQLAlchemy as the SQL toolkit
JavaScript frontend
React, NVD3
Deep integration with Druid
Superset Rich Dashboarding Capabilities: Treemaps
Superset Rich Dashboarding Capabilities: Sunburst
Superset UI Provides Powerful Visualizations
Rich library of dashboard visualizations:
Basic:
• Bar Charts
• Pie Charts
• Line Charts
Advanced:
• Sankey Diagrams
• Treemaps
• Sunburst
• Heatmaps
And More!
Druid in Production
Production readiness
Is Druid suitable for my use case?
Will Druid meet my performance requirements at scale?
How complex is it to operate and manage a Druid cluster?
How to monitor a Druid cluster?
High availability?
How to upgrade a Druid cluster without downtime?
Security?
Extensibility for future use cases?
Suitable Use Cases
Powering Interactive user facing applications
Arbitrary slicing and dicing of large datasets
User behavior analysis
measuring distinct counts
retention analysis
funnel analysis
A/B testing
Exploratory analytics/root cause analysis
Performance and Scalability : Fast Facts
Most Events per Day: 300 Billion Events / Day (Metamarkets)
Most Computed Metrics: 1 Billion Metrics / Min (Jolata)
Largest Cluster: 200 Nodes (Metamarkets)
Largest Hourly Ingestion: 2 TB per Hour (Netflix)
Performance
Query Latency
– average: 500 ms
– 90th percentile < 1 sec
– 95th percentile < 5 sec
– 99th percentile < 10 sec
Query Volume
– 1000s of queries per minute
Benchmarking code
https://github.com/druid-io/druid-benchmark
Performance : Approximate Algorithms
Ability to store approximate data sketches for high-cardinality columns, e.g. userid
Reduced storage size
Use Cases
Fast approximate distinct counts
Approximate Top-K queries
Approximate histograms
Funnel/retention analysis
Limitations
Not possible to do exact counts
Not possible to filter on individual row values
Simplified Druid Cluster Management with Ambari
Install, configure and manage Druid and all external dependencies from Ambari
Easy to enable HA, Security, Monitoring …
Monitoring a Druid Cluster
Each Druid Node emits metrics for
Query performance
Ingestion Rate
JVM Health
Query Cache performance
System health
Emitted as JSON objects to a runtime log file or over HTTP to other services
Emitters available for Ambari Metrics Server, Graphite, StatsD, Kafka
Easy to implement your own metrics emitter
Monitoring using Ambari Metrics Server
HDP 2.6.1 contains pre-defined Grafana dashboards
Health of Druid Nodes
Ingestion
Query performance
Easy to create new dashboards and set up alerts
Auto configured when both Druid and Ambari Metrics Server are installed
High Availability
Deploy Coordinator/Overlord on multiple instances
Leader election in ZooKeeper
Broker – install multiple brokers
Use the Druid Router or any load balancer to route queries to brokers
Realtime Index Tasks – create redundant tasks
Historical Nodes – create a load rule with replication factor >= 2 (default = 2)
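For the Historical Nodes item, a load rule with a replication factor can be sketched as a small JSON object. The helper name `load_forever_rule` is hypothetical, and the use of a "load forever" rule on the default tier is an assumption chosen for illustration; other rule types (for example, period-based rules) follow the same pattern.

```python
import json


def load_forever_rule(replicas, tier="_default_tier"):
    """Build a load rule that keeps `replicas` copies of every segment on a tier."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return {"type": "loadForever", "tieredReplicants": {tier: replicas}}


# Two replicas means a segment stays queryable if one historical node is lost.
rules = [load_forever_rule(2)]
rules_json = json.dumps(rules)
```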
Rolling Upgrades
Maintain backwards compatibility
Data redundancy
Shared Nothing Architecture
Rolling upgrades
No Downtime
Security
Supports authentication via Kerberos/SPNEGO
Easy wizard-based Kerberos security enablement via Ambari
[Diagram labels: User, Browser, KDC server, Druid; step 1: kinit user; step 2: Token]
Extending Core Druid
Plugin Based Architecture
Leverages Guice to load extensions at runtime
Possible to add extensions to
Add a new deep storage implementation
Add a new Firehose
Add Aggregators
Add Complex metrics
Add new Query types
Add new Jersey resources
Bundle your extension with all the other Druid extensions
Companies Using Druid
Recent Improvements
Druid 0.10.0
Kafka Indexing Service – exactly-once ingestion
Built-in SQL support (CLI, HTTP, JDBC)
Numeric Dimensions
Kerberos Authentication
Performance improvements
Optimized large numbers of AND/OR operations with Concise bitmaps
Index-based handling of simple regex filters like ‘foo%’
~30% improvement on non-time groupBys
Apache Hive Integration – Supports full SQL, Large Joins, Batch Indexing
Apache Ambari Integration – Easy deployments and Cluster management
Future Work
Improved schema definition & management
Improvements to Hive/Druid integration
Materialized Views, Push down more filters, support complex columns etc…
Performance improvements
Select query performance improvements
JIT-friendly topN queries
Security enhancements
Row/Column level security
Integration with Apache Ranger
And much more……
Community
User google group - [email protected]
Dev google group - [email protected]
Github - druid-io/druid
IRC - #druid-dev on irc.freenode.net
Summary
Easy installation and management via Ambari
Real-time
– Ingestion latency < seconds.
– Query latency < seconds.
Arbitrary slicing and dicing of big data, like a ninja
– No more pre-canned drill-downs.
– Query with more fine-grained granularity.
High availability and Rolling deployment capabilities
Secure and Production ready
Vibrant and Active community
Available as Tech Preview in HDP 2.6.1
Thank you! Questions?
Twitter - @NishantBangarwa
Email - [email protected]
Linkedin - https://www.linkedin.com/in/nishant-bangarwa
AtScale + Hive + Druid
Leverage AtScale cubing capabilities
Store aggregate tables in Druid
Updatable dimensions in Hive
Storage Format
Druid: Segments
Data in Druid is stored in Segment Files.
Partitioned by time
Ideally, segment files are each smaller than 1GB.
If files are large, smaller time partitions are needed.
[Diagram: segments along the Time axis]
Segment 1: Monday
Segment 2: Tuesday
Segment 3: Wednesday
Segment 4: Thursday
Segment 5_1: Friday
Segment 5_2: Friday
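The time partitioning shown above, including a busy day split into two segments, can be sketched as follows. This is an illustrative simplification: it uses a row-count limit as a stand-in for the 1 GB size guideline, and the function name `assign_segments` and the `(day, partition)` keys are hypothetical rather than Druid's real segment identifiers.

```python
from collections import defaultdict
from datetime import datetime


def assign_segments(timestamps, max_rows_per_segment):
    """Bucket ISO timestamps by day, splitting oversized days into partitions.

    Returns a dict mapping (day, partition_number) to the list of timestamps
    stored in that segment.
    """
    by_day = defaultdict(list)
    for ts in timestamps:
        day = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").date().isoformat()
        by_day[day].append(ts)

    segments = {}
    for day in sorted(by_day):
        rows = by_day[day]
        for start in range(0, len(rows), max_rows_per_segment):
            partition = start // max_rows_per_segment
            segments[(day, partition)] = rows[start:start + max_rows_per_segment]
    return segments


events = [
    "2017-06-12T01:00:00Z",
    "2017-06-12T02:00:00Z",
    "2017-06-12T03:00:00Z",
    "2017-06-13T01:00:00Z",
]
segments = assign_segments(events, max_rows_per_segment=2)
```

With a limit of two rows, the first day is split into two partitions while the second day fits in one, mirroring the two Friday segments in the diagram.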
Example Wikipedia Edit Dataset
timestamp             page           language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA         10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA         15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA         32     45
2011-01-01T00:05:35Z  Ke$ha          en        Calgary  CA          17     87
2011-01-01T00:06:41Z  Ke$ha          en        Calgary  CA          43     99
2011-01-02T00:08:35Z  Selena Gomes   en        Calgary  CA          12     53
Column groups: Timestamp (timestamp); Dimensions (page, language, city, country); Metrics (added, deleted)
Data Rollup
Raw events:
timestamp             page           language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA         10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA         15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA         32     45
2011-01-01T00:05:35Z  Ke$ha          en        Calgary  CA          17     87
2011-01-01T00:06:41Z  Ke$ha          en        Calgary  CA          43     99
2011-01-02T00:08:35Z  Selena Gomes   en        Calgary  CA          12     53
Rollup by hour:
timestamp             page           language  city     country  count  sum_added  sum_deleted  min_added  max_added  …
2011-01-01T00:00:00Z  Justin Bieber  en        SF       USA      3      57         172          10         32
2011-01-01T00:00:00Z  Ke$ha          en        Calgary  CA       2      60         186          17         43
2011-01-02T00:00:00Z  Selena Gomes   en        Calgary  CA       1      12         53           12         12
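The hourly rollup on this slide can be reproduced with a short sketch. It is an illustration of the idea, not Druid's ingestion code: the function name `rollup_by_hour` is hypothetical, and timestamps are truncated as strings so the sample values are used exactly as they appear on the slide.

```python
ROWS = [
    # (timestamp, page, language, city, country, added, deleted)
    ("2011-01-01T00:01:35Z", "Justin Bieber", "en", "SF", "USA", 10, 65),
    ("2011-01-01T00:03:63Z", "Justin Bieber", "en", "SF", "USA", 15, 62),
    ("2011-01-01T00:04:51Z", "Justin Bieber", "en", "SF", "USA", 32, 45),
    ("2011-01-01T00:05:35Z", "Ke$ha", "en", "Calgary", "CA", 17, 87),
    ("2011-01-01T00:06:41Z", "Ke$ha", "en", "Calgary", "CA", 43, 99),
    ("2011-01-02T00:08:35Z", "Selena Gomes", "en", "Calgary", "CA", 12, 53),
]


def rollup_by_hour(rows):
    """Aggregate rows that share the same hour and dimension values."""
    rolled = {}
    for ts, page, language, city, country, added, deleted in rows:
        hour = ts[:13] + ":00:00Z"  # keep "YYYY-MM-DDTHH", zero the rest
        key = (hour, page, language, city, country)
        agg = rolled.setdefault(
            key,
            {"count": 0, "sum_added": 0, "sum_deleted": 0,
             "min_added": added, "max_added": added},
        )
        agg["count"] += 1
        agg["sum_added"] += added
        agg["sum_deleted"] += deleted
        agg["min_added"] = min(agg["min_added"], added)
        agg["max_added"] = max(agg["max_added"], added)
    return rolled


rolled = rollup_by_hour(ROWS)
```

Six raw events collapse into three stored rows, which is where the storage and query-time savings come from.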
Dictionary Encoding
Create and store IDs for each value
e.g. page column
Values - Justin Bieber, Ke$ha, Selena Gomes
Encoding - Justin Bieber : 0, Ke$ha: 1, Selena Gomes: 2
Column Data - [0 0 0 1 1 2]
city column - [0 0 0 1 1 1]
timestamp             page           language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA         10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA         15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA         32     45
2011-01-01T00:05:35Z  Ke$ha          en        Calgary  CA          17     87
2011-01-01T00:06:41Z  Ke$ha          en        Calgary  CA          43     99
2011-01-02T00:08:35Z  Selena Gomes   en        Calgary  CA          12     53
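The encoding on this slide can be sketched in a few lines. As an assumption for illustration, IDs are assigned in order of first appearance, which happens to reproduce the numbers shown; the function name `dictionary_encode` is hypothetical.

```python
def dictionary_encode(values):
    """Replace each string with a small integer ID; return (dictionary, encoded)."""
    ids = {}
    encoded = []
    for value in values:
        if value not in ids:
            ids[value] = len(ids)  # next unused ID
        encoded.append(ids[value])
    return ids, encoded


page = ["Justin Bieber", "Justin Bieber", "Justin Bieber",
        "Ke$ha", "Ke$ha", "Selena Gomes"]
city = ["SF", "SF", "SF", "Calgary", "Calgary", "Calgary"]

page_ids, page_column = dictionary_encode(page)
city_ids, city_column = dictionary_encode(city)
```

Storing small integers instead of repeated strings shrinks the column and makes comparisons cheap.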
Bitmap Indices
Store Bitmap Indices for each value
Justin Bieber -> [0, 1, 2] -> [1 1 1 0 0 0]
Ke$ha -> [3, 4] -> [0 0 0 1 1 0]
Selena Gomes -> [5] -> [0 0 0 0 0 1]
Queries
Justin Bieber or Ke$ha -> [1 1 1 0 0 0] OR [0 0 0 1 1 0] -> [1 1 1 1 1 0]
language = en and country = CA -> [1 1 1 1 1 1] AND [0 0 0 1 1 1] -> [0 0 0 1 1 1]
Indexes compressed with Concise or Roaring encoding
timestamp             page           language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  en        SF       USA         10     65
2011-01-01T00:03:63Z  Justin Bieber  en        SF       USA         15     62
2011-01-01T00:04:51Z  Justin Bieber  en        SF       USA         32     45
2011-01-01T00:05:35Z  Ke$ha          en        Calgary  CA          17     87
2011-01-01T00:06:41Z  Ke$ha          en        Calgary  CA          43     99
2011-01-02T00:08:35Z  Selena Gomes   en        Calgary  CA          12     53
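The bitmap lookups and the two example queries can be sketched with plain Python lists. This is a simplified illustration: the lists are uncompressed stand-ins for Concise or Roaring bitmaps, and the helper names are hypothetical.

```python
def build_bitmaps(column):
    """Map each distinct value to a 0/1 list marking the rows that contain it."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps.setdefault(value, [0] * len(column))[row] = 1
    return bitmaps


def bitmap_or(a, b):
    """Rows matching either filter."""
    return [x | y for x, y in zip(a, b)]


def bitmap_and(a, b):
    """Rows matching both filters."""
    return [x & y for x, y in zip(a, b)]


page = ["Justin Bieber", "Justin Bieber", "Justin Bieber",
        "Ke$ha", "Ke$ha", "Selena Gomes"]
language = ["en"] * 6
country = ["USA", "USA", "USA", "CA", "CA", "CA"]

pages = build_bitmaps(page)
bieber_or_kesha = bitmap_or(pages["Justin Bieber"], pages["Ke$ha"])
en_and_ca = bitmap_and(build_bitmaps(language)["en"], build_bitmaps(country)["CA"])
```

A filter is answered by combining bitmaps, so only the matching rows need to be read from the metric columns.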
Approximate Sketch Columns
Raw events:
timestamp             page           userid       language  city     country  …  added  deleted
2011-01-01T00:01:35Z  Justin Bieber  user1111111  en        SF       USA         10     65
2011-01-01T00:03:63Z  Justin Bieber  user1111111  en        SF       USA         15     62
2011-01-01T00:04:51Z  Justin Bieber  user2222222  en        SF       USA         32     45
2011-01-01T00:05:35Z  Ke$ha          user3333333  en        Calgary  CA          17     87
2011-01-01T00:06:41Z  Ke$ha          user4444444  en        Calgary  CA          43     99
2011-01-02T00:08:35Z  Selena Gomes   user1111111  en        Calgary  CA          12     53
Rollup by hour:
timestamp             page           language  city     country  count  sum_added  sum_deleted  min_added  userid_sketch  …
2011-01-01T00:00:00Z  Justin Bieber  en        SF       USA      3      57         172          10         {sketch}
2011-01-01T00:00:00Z  Ke$ha          en        Calgary  CA       2      60         186          17         {sketch}
2011-01-02T00:00:00Z  Selena Gomes   en        Calgary  CA       1      12         53           12         {sketch}
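To make the `{sketch}` column concrete, here is a minimal K-minimum-values sketch, a simple relative of theta sketches. It is a stand-in written for illustration, not the HyperLogLog or sketch implementation that Druid ships; the class name `KMVSketch` and the choice of `k` are assumptions.

```python
import hashlib


class KMVSketch:
    """Keeps the k smallest hash values seen and estimates distinct counts from them."""

    def __init__(self, k=256):
        self.k = k
        self.hashes = set()

    @staticmethod
    def _hash(value):
        # Map a string to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2 ** 64

    def _trim(self):
        if len(self.hashes) > self.k:
            self.hashes = set(sorted(self.hashes)[: self.k])

    def add(self, value):
        self.hashes.add(self._hash(value))
        self._trim()

    def merge(self, other):
        """Combine two sketches, e.g. when rolled-up rows are aggregated at query time."""
        merged = KMVSketch(self.k)
        merged.hashes = self.hashes | other.hashes
        merged._trim()
        return merged

    def estimate(self):
        if len(self.hashes) < self.k:
            return float(len(self.hashes))  # below capacity the count is exact
        return (self.k - 1) / max(self.hashes)


def sketch_of(user_ids):
    sketch = KMVSketch()
    for user_id in user_ids:
        sketch.add(user_id)
    return sketch


# One sketch per rolled-up row from the table above.
bieber = sketch_of(["user1111111", "user1111111", "user2222222"])
kesha = sketch_of(["user3333333", "user4444444"])
selena = sketch_of(["user1111111"])
all_pages = bieber.merge(kesha).merge(selena)
```

Each rolled-up row stores a fixed-size sketch instead of the raw user IDs, and sketches merge across rows without double-counting user1111111. The individual IDs are no longer stored, which is why filtering on them is not possible afterwards.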
Approximate Sketch Columns
Better rollup for high-cardinality columns, e.g. userid
Reduced storage size
Use Cases
Fast approximate distinct counts
Approximate histograms
Funnel/retention analysis
Limitations
Not possible to do exact counts
Not possible to filter on individual row values