HBaseConEast2016: OpenTSDB+BigTable

OpenTSDB + Bigtable: Integrating time series database with Google Cloud Bigtable
Danil Zburivsky, Big Data Practice Lead, Pythian
Max Luebbe, SRE, Google


Pythian specializes in design, implementation, and management of systems that directly contribute to revenue and business success.

History: 19 years in business

Growing at 30+% per year

400+ employees

300+ customers worldwide

HQ Ottawa, Canada - global reach

Technology agnostic = trusted advisor

Deep expertise: Oracle, Oracle Apps, MySQL, AWS, SQL Server, Cassandra/DataStax, Azure, PostgreSQL, Cloudera, MapR, Hortonworks etc.

Google Premier Partner Status (as of end Aug)

5 Certified Developers (soon to be 12)

Dedicated Google Technical Champion

Launch partner for: Kubernetes, Dataflow, Cloud SQL, Dataproc

Integrated OpenTSDB with Bigtable

DW Explorers Program Partner

Upcoming BigQuery & Cloud ML Launch Partner

Time series data

• (time, metric, value)
• OS and apps metrics
• Industrial equipment
• Web traffic

Storing time series data is a challenge

• Volume can be explosive
• Data arrival and access patterns are different

OpenTSDB Architecture

[Diagram: servers send metrics to the TSDs over TSD RPC; the TSDs store data in HBase or Bigtable over HBase RPC or the HBase API; the Web UI and scripts/alerting reach the TSDs over HTTP or TSD RPC.]

OpenTSDB

• Open source
• Uses HBase as a data store
• Data model optimized for time series (TS)
• REST API
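As a rough illustration of how a (time, metric, value) point might be handed to a TSD, the sketch below formats one data point both as a telnet-style `put` line and as a JSON body of the kind accepted by the HTTP `/api/put` endpoint. The metric name, tag names, and values are made up for the example, and the helper functions `put_line` and `put_json` are hypothetical, not part of OpenTSDB.

```python
import json


def put_line(metric, timestamp, value, tags):
    """Format one data point as a telnet-style `put` command line."""
    # Tags are sorted only to make the output deterministic in this sketch.
    tag_part = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_part}"


def put_json(metric, timestamp, value, tags):
    """Build a JSON body suitable for an HTTP POST to /api/put."""
    body = {
        "metric": metric,
        "timestamp": timestamp,
        "value": value,
        "tags": tags,
    }
    return json.dumps(body, sort_keys=True)


# Example data point (all names and numbers are illustrative assumptions).
point = ("sys.cpu.user", 1356998400, 42.5, {"host": "web01", "cpu": "0"})
print(put_line(*point))
print(put_json(*point))
```

Every point carries at least one tag, which is what later allows the same metric to be sliced by host, CPU, and so on.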

<metric_uid><timestamp><tagk1><tagv1>[...<tagkN><tagvN>]

<col_t+1>[...<col_t+N>]
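The two patterns above describe the row key and the columns within a row. A minimal sketch of how such a key could be assembled follows, assuming 3-byte UIDs, a 4-byte base timestamp aligned to the hour, and columns addressed by the offset in seconds from that base. These widths match OpenTSDB's defaults as far as we know, but treat them as assumptions here; the real column qualifier also packs type and length flags, which this sketch leaves out. The function names and UID values are hypothetical.

```python
import struct

UID_WIDTH = 3    # bytes per metric/tag UID (assumed default width)
ROW_SPAN = 3600  # seconds of data stored in one row (assumed: one hour)


def uid(n):
    """Encode an integer UID as a fixed-width big-endian byte string."""
    return n.to_bytes(UID_WIDTH, "big")


def row_key(metric_uid, timestamp, tags):
    """Build <metric_uid><base_timestamp><tagk1><tagv1>...<tagkN><tagvN>.

    `tags` maps tag-key UIDs to tag-value UIDs. Pairs are sorted by tag-key
    UID so the same tag set always produces the same key.
    """
    base = timestamp - (timestamp % ROW_SPAN)  # align to the row boundary
    key = uid(metric_uid) + struct.pack(">I", base)
    for tagk, tagv in sorted(tags.items()):
        key += uid(tagk) + uid(tagv)
    return key


def column_offset(timestamp):
    """Seconds since the row's base timestamp; selects the column."""
    return timestamp % ROW_SPAN


ts = 1356998400 + 125  # 125 seconds past an hour boundary
print(row_key(1, ts, {1: 1}).hex())
print(column_offset(ts))
```

Because the metric UID leads the key and the timestamp follows it, all rows for one metric sit next to each other in time order, so a time-range query for a metric becomes a contiguous scan.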

Origins of Bigtable

• Updating search index in bulk: too slow!
• Rather than a file system (GFS), needed random access
• Generic & scalable, now powers many diverse products

Bigtable decouples compute & storage...

[Diagram: three layers. Clients: six clients. Processing: two nodes. Storage: a distributed filesystem holding data A, B, C, D, E.]

...making scalability fast & seamless

[Diagram: the same three layers with a third node added to the processing layer; the clients and the data A, B, C, D, E on the distributed filesystem are unchanged.]

Google Cloud Bigtable

[Diagram labels: Cloud Bigtable, Bigtable Service]

async-bigtable

● Uses standard HBase 1.0 API
● BufferedMutator
● Thread pool

https://github.com/OpenTSDB/asyncbigtable
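The general pattern behind a buffered mutator combined with a thread pool can be sketched as follows: writes accumulate in a local buffer, and full batches are handed to worker threads so the caller does not wait on each write. This is a generic illustration in Python, not the asyncbigtable code and not the HBase `BufferedMutator` API; the class `BufferedWriter`, its parameters, and the `send_batch` callback are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class BufferedWriter:
    """Batches mutations and sends each full batch on a thread pool.

    Assumes a single producer thread calls put(); send_batch may run
    concurrently on several worker threads.
    """

    def __init__(self, send_batch, max_buffered=3, workers=2):
        self._send_batch = send_batch  # callable taking a list of mutations
        self._max = max_buffered
        self._buffer = []
        self._pending = []             # futures for batches in flight
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def put(self, mutation):
        """Queue one mutation; flush when the buffer is full."""
        self._buffer.append(mutation)
        if len(self._buffer) >= self._max:
            self.flush()

    def flush(self):
        """Hand the current buffer to the pool without blocking the caller."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._pending.append(self._pool.submit(self._send_batch, batch))

    def close(self):
        """Flush the remainder and wait for every batch to complete."""
        self.flush()
        for future in self._pending:
            future.result()            # re-raises any error from send_batch
        self._pool.shutdown()


sent = []                              # stands in for the remote table
writer = BufferedWriter(sent.append, max_buffered=3)
for i in range(7):
    writer.put(i)
writer.close()
print(sorted(len(batch) for batch in sent))
```

The caller only blocks in `close()`, which is also where write errors surface in this sketch; a production client would need a policy for reporting failures earlier and for bounding the number of batches in flight.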

Future work

● Native Bigtable API
● Fully asynchronous
● Performance

Demo time!