Google Cloud Bigtable · 2016/02/03  · Google Research Publications. Google Research...

Google Cloud Bigtable And what it's awesome at


Jen Tong, Developer Advocate, Google Cloud Platform

@MimmingCodes


Agenda

1 Research

2 A story about bigness

3 How it works

4 When it's awesome


Google Research Publications


Managed Cloud Versions

• Bigtable → Cloud Bigtable

• Flume → Dataflow

• Dremel → BigQuery


Cloud Bigtable


Google’s mission is to organize the world’s information and make it universally accessible and useful.


Bigness


Google Internal Bigtable in Numbers

• Storage: 100s of PB

• Throughput: 1,000,000s of QPS

• Bandwidth: 100s of GB/sec


How much is that?

Several Datas' worth

Photo credit: jdhancock


How much is that?

Millennia of DVD video

Photo credit: illinoislibrary
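As a rough check on that comparison, the storage figure can be converted into DVD playing time. This is only a back-of-the-envelope sketch: the 4.7 GB disc capacity and the two hours of video per disc are assumptions rather than numbers from the slides, and 100 PB is taken as the low end of "100s of PB".

```python
PB = 10**15            # bytes in a petabyte (decimal)
DVD_BYTES = 4.7e9      # assumption: single-layer DVD capacity
DVD_HOURS = 2.0        # assumption: about two hours of video per disc
HOURS_PER_YEAR = 24 * 365


def dvd_video_years(storage_bytes):
    """Years of continuous DVD video that fit in storage_bytes."""
    discs = storage_bytes / DVD_BYTES
    return discs * DVD_HOURS / HOURS_PER_YEAR


# Low end of the "100s of PB" figure from the slide.
print(round(dvd_video_years(100 * PB)), "years of video")
```

Under these assumptions even 100 PB comes to several thousand years of continuous playback, which is consistent with "millennia".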


Bigtable

Plus Hundreds of Internal Services

Influence

Google is not affiliated with or endorsed by any of these companies. Apache HBase, Apache Cassandra and Apache Accumulo are trademarks of The Apache Software Foundation. Hypertable is a trademark of Hypertable Inc.


Engineering


Hundreds of engineer-years' worth


Bigtable - The early years

• Jeff Dean and Sanjay Ghemawat decided to build a database service that could scale linearly across thousands and thousands of commodity servers

○ Systems will fail; retain performance at scale

• Abandon the traditional relational model

• The first generation was about:

○ Prototyping and building the service through its first round of scaling

○ Migrating the initial applications to Bigtable

○ Figuring out replication, and the first multi-tenant version of Bigtable


Bigtable - Stabilized

• From batch only, to serving web traffic

○ Low latency for 99th percentile of requests

• Polish the Bigtable service

○ React better to abusive usage

○ Mixed media clusters - mixture of SSD + HDD storage with configurable affinity

○ Bring tablet server recovery time from 10s of seconds to 1 second or less

○ Easier replication


Google Cloud Bigtable

• A fully-managed service

• Focus more on your business, less on infrastructure

• Straightforward pricing model


Data model
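The data model slides are diagrams, so here is a minimal sketch of the model as the published Bigtable paper describes it: a sparse, sorted map from (row key, column, timestamp) to an uninterpreted value. The class name `ToyTable` and its methods are hypothetical names chosen for illustration; this is not the Cloud Bigtable API, and plain strings stand in for byte values.

```python
from bisect import bisect_left, insort


class ToyTable:
    """Toy in-memory model of a sorted (row, column, timestamp) -> value map."""

    def __init__(self):
        self._rows = {}         # row key -> {column -> {timestamp -> value}}
        self._sorted_keys = []  # row keys kept in lexicographic order

    def put(self, row, column, value, timestamp):
        if row not in self._rows:
            insort(self._sorted_keys, row)  # keep rows sorted for range scans
            self._rows[row] = {}
        self._rows[row].setdefault(column, {})[timestamp] = value

    def get(self, row, column):
        """Return the newest version of a cell, or None if it is absent."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, prefix):
        """Yield (row key, columns) for rows starting with prefix, in key order."""
        start = bisect_left(self._sorted_keys, prefix)
        for key in self._sorted_keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._rows[key]


table = ToyTable()
table.put("com.cnn.www", "contents:", "<html>v1</html>", timestamp=1)
table.put("com.cnn.www", "contents:", "<html>v2</html>", timestamp=2)
table.put("org.apache", "contents:", "<html>hi</html>", timestamp=1)
table.put("com.example", "contents:", "<html>hi</html>", timestamp=1)

print(table.get("com.cnn.www", "contents:"))       # newest version of the cell
print([key for key, _ in table.scan("com.")])      # contiguous range of row keys
```

Because rows are kept in key order, related rows that share a key prefix sit next to each other and can be read with one range scan.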


How it works


HBase Architecture

• HBase Client

• ZooKeeper

• HBase Cluster

○ Master

○ Region Servers, each with: Bloom filter, Memory Table, WAL, Block Cache, Regions

• HDFS


Bigtable Architecture

• HBase Client

• Chubby

• Bigtable Cell

○ Master

○ Tablet servers, each with: Bloom filter, Memtable, Shared log, Block Cache, Tablets

• Colossus


Life of Bigtable data
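These slides are animations, but the components named on the architecture slides (shared log, memtable, Bloom filter) match the log-structured write and read path described in the Bigtable paper. Below is a toy sketch of that path, assuming a single tablet, an in-memory list in place of the shared log, plain dicts in place of immutable sorted files on Colossus, and an exact key set standing in for a Bloom filter. `ToyTablet` and `flush_threshold` are hypothetical names, not part of any real API.

```python
class ToyTablet:
    """Toy model of one tablet's write and read path (not the real implementation)."""

    def __init__(self, flush_threshold=2):
        self.log = []        # stand-in for the shared commit log
        self.memtable = {}   # recent writes, held in memory
        self.sstables = []   # flushed, immutable (key_set, data) pairs, oldest first
        self.flush_threshold = flush_threshold  # hypothetical tuning knob

    def write(self, key, value):
        self.log.append((key, value))  # 1. record the mutation in the log first
        self.memtable[key] = value     # 2. then apply it to the memtable
        if len(self.memtable) >= self.flush_threshold:
            self._flush()

    def _flush(self):
        # 3. Freeze the memtable into a sorted, immutable file.
        data = dict(sorted(self.memtable.items()))
        key_set = set(data)            # exact set standing in for a Bloom filter
        self.sstables.append((key_set, data))
        self.memtable = {}
        self.log.clear()               # flushed entries no longer need replaying

    def read(self, key):
        # Newest data wins: check the memtable, then files from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for key_set, data in reversed(self.sstables):
            if key in key_set:         # skip files that cannot contain the key
                return data[key]
        return None


tablet = ToyTablet(flush_threshold=2)
tablet.write("row1", "a")
tablet.write("row2", "b")   # memtable is full, so it is flushed
tablet.write("row1", "c")   # the newer value lives only in the memtable
print(tablet.read("row1"), tablet.read("row2"), tablet.read("row3"))
```

Writes never modify a flushed file in place; a newer value simply shadows the older one at read time. Merging those files back together is the compaction work that the management slide says is handled for you in Cloud Bigtable.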


When it's awesome


Management

● Who in the audience has used HBase before?

● Things you will not see in Cloud Bigtable:

○ Compactions

○ Pre-splitting

○ Lots of configuration settings

○ 1 minute regionserver outages

○ Coprocessors (for now)


Throughput

Charts: Write Throughput (MB/s); Mixed Read/Write Throughput (MB/s)


Latency

Chart: latency (ms) at 99%, for read and update
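The 99% on this chart refers to tail latency: the value that 99 out of 100 requests come in at or under. Here is a minimal sketch of how such a figure could be computed from raw samples using the nearest-rank method. The function name and the sample values are made up for illustration and are not measurements from the talk.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies_ms = [4, 5, 5, 6, 6, 7, 7, 8, 9, 120]
print("p50:", percentile(latencies_ms, 50))
print("p99:", percentile(latencies_ms, 99))
```

A single slow request barely moves the median but sets the 99th percentile, which is why the tail is the number that matters when serving web traffic.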


Cloud Bigtable Use Cases

• Financial Services: faster risk analysis, credit card fraud/abuse

• Marketing / Digital Media: user engagement, clickstream analysis, real-time adaptive content

• Internet of Things: sensor data dashboards and anomaly detection

• Telecommunications: sampled traffic patterns, metric collection and reporting

• Energy: oil well sensors, anomaly detection, predictive modeling

• Biomedical: genomics sequencing data analysis


When not to use it

• Relational joins, such as for online transaction processing - use Cloud SQL

• Interactive querying - use BigQuery

• Blobs over 10 MB - use Cloud Storage

• ACID transactions - use Datastore

• Automatic cross-zone replication - use Datastore

• You don't have much data yet - use Datastore, Firebase, or Cloud SQL


Thank you!

Jen Tong, Developer Advocate, Google Cloud Platform

@MimmingCodes

little418.com
