Google Cloud Bigtable · 2016/02/03  · Google Research Publications. Google Research...

Google Cloud Bigtable

And what it's awesome at

Jen Tong, Developer Advocate, Google Cloud Platform

@MimmingCodes

Agenda

1 Research

2 A story about bigness

3 How it works

4 When it's awesome

Google Research Publications

Managed Cloud Versions

• Bigtable → Cloud Bigtable

• Flume → Dataflow

• Dremel → BigQuery

Google’s mission is to organize the world’s information and make it universally accessible and useful.

Bigness

Google Internal Bigtable in Numbers

• Storage: 100s of PB

• Throughput: 1,000,000s of QPS

• Bandwidth: 100s of GB/sec

How much is that?

Several Datas' worth (Photo credit: jdhancock)

How much is that?

Millennia of DVD video (Photo credit: illinoislibrary)
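The "millennia" claim can be sanity-checked with a little arithmetic. The sketch below is not from the talk: it assumes 100 PB as a lower bound for "100s of PB", a 4.7 GB single-layer DVD, and about 2 hours of video per disc.

```python
# Back-of-envelope check of "millennia of DVD video".
# Assumptions (not from the slides): 100 PB as a lower bound for "100s of PB",
# a 4.7 GB single-layer DVD, and about 2 hours of video per disc.
STORAGE_BYTES = 100 * 10**15      # 100 PB, decimal units
DVD_BYTES = 4.7 * 10**9           # single-layer DVD capacity
HOURS_PER_DVD = 2                 # assumed playing time per disc
HOURS_PER_YEAR = 24 * 365.25


def years_of_dvd_video(storage_bytes: float) -> float:
    """Years of continuous playback if the storage held only DVD video."""
    discs = storage_bytes / DVD_BYTES
    return discs * HOURS_PER_DVD / HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_of_dvd_video(STORAGE_BYTES):,.0f} years")
```

Under these assumptions, even the low end of the range works out to several thousand years of continuous playback, so "millennia" holds.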

Bigtable

Plus Hundreds of Internal Services

Influence

Google is not affiliated with or endorsed by any of these companies. Apache HBase, Apache Cassandra and Apache Accumulo are trademarks of The Apache Software Foundation. Hypertable is the trademark of Hypertable Inc.

Engineering

Hundreds of engineer-years' worth

Bigtable - The early years

• Jeff and Sanjay decided to build a database service that could scale linearly across thousands and thousands of commodity servers

○ Systems will fail; retain performance at scale

• Abandon the traditional relational model

• The first generation was about:

○ Prototyping and building the service through its first scaling

○ Migrating initial applications to Bigtable

○ Figuring out replication and the first multi-tenant version of Bigtable

Bigtable - Stabilized

• From batch only, to serving web traffic

○ Low latency for 99th percentile of requests

• Polish the Bigtable service

○ React better to abusive usage

○ Mixed media clusters - mixture of SSD + HDD storage with configurable affinity

○ Reduce tablet server recovery time from tens of seconds to 1 second or less

○ Easier replication

Google Cloud Bigtable

• A fully-managed service

• Focus more on your business, less on infrastructure

• Straightforward pricing model

Data Model

Data model
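The slide's diagram did not survive in this transcript. As a rough illustration, Bigtable's published data model is a sparse, sorted map from (row key, column, timestamp) to an uninterpreted value. The toy class below is a sketch of that idea only. `ToyTable` and its methods are hypothetical names, not the Cloud Bigtable or HBase client API.

```python
from bisect import bisect_left, insort
from typing import Dict, Iterator, List, Optional, Tuple


class ToyTable:
    """Toy in-memory model: (row key, "family:qualifier", timestamp) -> value."""

    def __init__(self) -> None:
        self._row_keys: List[bytes] = []  # kept sorted so range scans are cheap
        self._rows: Dict[bytes, Dict[str, List[Tuple[int, bytes]]]] = {}

    def put(self, row: bytes, column: str, ts: int, value: bytes) -> None:
        if row not in self._rows:
            insort(self._row_keys, row)
            self._rows[row] = {}
        cells = self._rows[row].setdefault(column, [])
        cells.append((ts, value))
        cells.sort(key=lambda c: c[0], reverse=True)  # newest version first

    def get(self, row: bytes, column: str) -> Optional[bytes]:
        """Return the newest value for the cell, or None if it is absent."""
        cells = self._rows.get(row, {}).get(column)
        return cells[0][1] if cells else None

    def scan(self, start: bytes, end: bytes) -> Iterator[bytes]:
        """Yield row keys in [start, end) in sorted order."""
        i = bisect_left(self._row_keys, start)
        while i < len(self._row_keys) and self._row_keys[i] < end:
            yield self._row_keys[i]
            i += 1
```

Because rows are kept in key order, related rows that share a key prefix sit next to each other and can be read with one range scan.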

How it works

HBase Architecture

HBase Cluster

• HBase Client

• ZooKeeper

• Master

• Region Servers (five shown), hosting Regions, with Bloom filter, memory table, WAL, and block cache

• HDFS

Bigtable Architecture

Bigtable Cell

• HBase Client

• Chubby

• Master

• Tablet servers (five shown), hosting Tablets, with Bloom filter, memtable, shared log, and block cache

• Colossus
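In the published Bigtable design, each tablet in the diagram covers a contiguous range of row keys, and a tablet server hosts many tablets. A minimal sketch of routing a row key to its tablet follows. The split points, server names, and the `locate` helper are all made up for illustration. The real system keeps this mapping in its own metadata, not in a Python list.

```python
from bisect import bisect_left
from typing import List, Tuple

# Each entry: (inclusive end row key of the tablet, tablet server hosting it).
# The last tablet uses a sentinel end key assumed to sort after every real key.
LAST = b"\xff" * 8
TABLETS: List[Tuple[bytes, str]] = [
    (b"g", "tabletserver-1"),
    (b"p", "tabletserver-2"),
    (LAST, "tabletserver-3"),
]


def locate(row_key: bytes) -> str:
    """Return the server for the first tablet whose end key is >= row_key."""
    end_keys = [end for end, _ in TABLETS]
    return TABLETS[bisect_left(end_keys, row_key)][1]
```

Splitting a tablet or moving it to another server only changes entries in this mapping, which is one reason the design can rebalance load without rewriting data.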

Life of Bigtable data
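The diagrams for these slides are missing from the transcript. Based on the component names in the architecture slide (memtable, shared log) and the published Bigtable design, the write and read path can be sketched roughly as below. `ToyTabletServer` is a hypothetical name, the flush threshold is arbitrary, and the linear file scan stands in for what Bloom filters and the block cache speed up in a real system.

```python
from typing import Dict, List, Optional, Tuple


class ToyTabletServer:
    """Toy write/read path: log first, then memtable, flush to immutable sorted files."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.log: List[Tuple[bytes, bytes]] = []             # stands in for the shared log
        self.memtable: Dict[bytes, bytes] = {}               # recent writes, in memory
        self.sstables: List[List[Tuple[bytes, bytes]]] = []  # flushed files, newest last
        self.memtable_limit = memtable_limit

    def write(self, key: bytes, value: bytes) -> None:
        self.log.append((key, value))   # 1. record the mutation in the log
        self.memtable[key] = value      # 2. apply it in memory
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self) -> None:
        # 3. freeze the memtable into a sorted, immutable file
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def read(self, key: bytes) -> Optional[bytes]:
        if key in self.memtable:                 # newest data wins
            return self.memtable[key]
        for table in reversed(self.sstables):    # then newest file to oldest
            for k, v in table:
                if k == key:
                    return v
        return None
```

Writes never modify a flushed file. A newer value simply shadows an older one until background merging, which the managed service handles, cleans it up.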

When it's awesome

Management

● Who in the audience has used HBase before?

● Things you will not see in Cloud Bigtable:

○ Compactions

○ Pre-splitting

○ Lots of configuration settings

○ 1-minute region server outages

○ Coprocessors (for now)

Throughput

[Charts: Write Throughput (MB/s); Mixed Read/Write Throughput (MB/s)]

Latency

[Chart: latency (ms) at the 99th percentile, read and update]
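For readers unfamiliar with the metric, 99th-percentile latency is the value that 99% of requests come in at or under. The sketch below uses the nearest-rank method. The slides do not say which percentile definition the chart used, so treat the method and the `percentile` helper as assumptions.

```python
import math
from typing import Sequence


def percentile(samples_ms: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples_ms or not 0 < p <= 100:
        raise ValueError("need samples and 0 < p <= 100")
    ordered = sorted(samples_ms)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

With 98 requests at 5 ms, one at 6 ms, and one at 250 ms, the median and the 99th percentile stay low while the maximum is 250 ms, which is why tail percentiles are reported separately from averages.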

Cloud Bigtable Use Cases

• Financial Services: Faster risk analysis, credit card fraud/abuse

• Marketing / Digital Media: User engagement, clickstream analysis, real-time adaptive content

• Internet of Things: Sensor data dashboards and anomaly detection

• Telecommunications: Sampled traffic patterns, metric collection and reporting

• Energy: Oil well sensors, anomaly detection, predictive modeling

• Biomedical: Genomics sequencing data analysis

When not to use it

• Relational joins, like for online transaction processing - Cloud SQL

• Interactive querying - BigQuery

• Blobs over 10MB - Cloud Storage

• ACID transactions - Datastore

• Automatic cross-zone replication - Datastore

• You don't have much data yet - Datastore, Firebase, or Cloud SQL

Thank you!

Jen Tong, Developer Advocate, Google Cloud Platform

@MimmingCodes

little418.com