BigTable and Google File System
Presented by: Ayesha Fawad

10/07/2014

Pages 2-12: Overview

Google File System: Basics, Design, Chunks, Replicas, Clusters, Client, Chunk server, Master server, Shadow Master, Read Request Workflow, Write Request Workflow, Built-in Functions, Limitations.

BigTable: Introduction, What is BigTable?, Design (with examples), Rows and Tablets, Columns and Column Families, Timestamp, Cells, Data Structure, SSTables and Logs, Tablet, Table, Cluster, Chubby, How to find a row, Mutations, BigTable Implementation, BigTable Building Blocks, Architecture, Master server, Tablet server, Client Library, In case of failure, Recovery Process, Compactions, Refinement, Interactions between GFS and BigTable, API, Why use BigTable?, Why not any other database?, Application Design, CAP, Google Services using BigTable, BigTable Derivatives, Colossus, Comparison.

Google App Engine: Introduction, GAE Datastore, Unsupported Actions, Entities, Models, Queries, Indexes, GQL, Transactions, Datastore Software Stack, GUI (Main), Datastore options, Competitors, Hard Limits, Free Quotas, Cloud Data Storage options.

Page 13: Google File System

Page 14: Basics

GFS originated in 2003. It is designed for system-to-system interaction, not user-to-system interaction, and runs on a network of inexpensive machines using the Linux operating system.

Page 15: Design

GFS relies on distributed computing to provide users the infrastructure they need to create, access, and alter data.

Distributed computing is about networking several computers together and taking advantage of their individual resources in a collective way. Each computer contributes some of its resources, such as memory, processing power, and hard drive space, to the overall network. This turns the entire network into a massive computer, with each individual machine acting as a processor and data storage device.

Page 16: Design

Autonomic computing: a concept in which computers are able to diagnose problems and solve them in real time without the need for human intervention. The challenge for the GFS development team was to design an autonomic monitoring system that could work across a huge network of computers.

Simplification: GFS offers basic commands like open, create, read, write, and close, plus some specialized commands like append and snapshot.

Page 17: Design

Checkpoints can include application-level checksums. Readers verify and process only the file region up to the last checkpoint, which is known to be in a defined state. Checkpointing allows writers to restart incrementally and keeps readers from processing successfully written file data that is still incomplete from the application's point of view.

GFS relies on appends rather than overwrites.

Page 18: Chunks

Files on GFS tend to be very large (in the multi-gigabyte range). GFS handles this by breaking files up into chunks of 64 MB each, which is good for scans, streams, archives, and shared queues. Each chunk has a unique 64-bit ID number called a chunk handle.

Fixed-size chunks simplify resource allocation: because all file chunks are the same size, the master can check which computers are near capacity, check which computers are underused, and balance the workload by moving chunks from one resource to another (see the sketch below).
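For illustration, a minimal Python sketch (not GFS code) of the arithmetic that the fixed 64 MB chunk size makes trivial: mapping a byte offset within a file to a chunk index and an offset inside that chunk.

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the fixed GFS chunk size

def chunk_coordinates(byte_offset):
    """Return (chunk index, offset inside that chunk) for a byte offset in a file."""
    return byte_offset // CHUNK_SIZE, byte_offset % CHUNK_SIZE

print(chunk_coordinates(200 * 1000 * 1000))  # (2, 65782272): byte 200,000,000 falls in chunk 2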

Page 19: Replicas

Replicas fall into two categories. The primary replica is the chunk that a chunk server sends to a client; secondary replicas serve as backups on other chunk servers.

The master decides which chunks will act as primary or secondary. Based on client changes to the data in the chunk, the master server informs chunk servers holding secondary replicas that they have to copy the new chunk off the primary chunk server to stay current.

Page 20: Design (diagram)

Page 21: Clusters

Google has organized GFS into simple networks of computers called clusters. A cluster contains three kinds of entities:
1. Clients
2. Master server
3. Chunk servers

Page 22: Client

A client is any entity making a request. GFS was developed by Google for its own use, so clients can be other computers or computer applications.

Page 23: Chunk server

Chunk servers are the workhorses: they store the 64 MB file chunks and send requested chunks directly to the client. The number of replicas is configurable.

Page 24: Master server

The master server is the coordinator for a cluster. It maintains the operation log and keeps track of metadata describing chunks, and it handles chunk garbage collection, re-replication on chunk server failures, and chunk migration to balance load and disk space. The master does not store the actual chunks.

Page 25: Master server

Upon startup, the master server polls all the chunk servers. The chunk servers respond with information about the data they contain, location details, and space details.

Page 26: Shadow Master

Shadow master servers contact the primary master server to stay up to date, via the operation log and by polling chunk servers. If anything goes wrong with the primary master, a shadow master can take over. GFS ensures shadow master servers are stored on different machines (in case of hardware failure). Shadow masters lag behind the primary master server by fractions of a second. They provide limited services in parallel with the master; these services are limited to reads.

Page 27: Shadow Master (diagram)

Pages 28-31: Read Request Workflow

1. The client sends a read request for a particular file to the master.
2. The master responds with the location of the primary replica, where the client can find that particular file.
3. The client contacts the chunk server directly.
4. The chunk server sends the replica data to the client.
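A toy, in-memory Python simulation of these four steps; the class and method names are illustrative stand-ins, not the real GFS interfaces.

CHUNK_SIZE = 64 * 1024 * 1024

class Master:
    """Holds only metadata: filename -> list of (chunk handle, chunk server name)."""
    def __init__(self):
        self.files = {"/logs/web.0": [("handle-17", "cs-a"), ("handle-42", "cs-b")]}

    def lookup(self, filename, chunk_index):          # steps 1-2: reply with replica location
        return self.files[filename][chunk_index]

class Chunkserver:
    """Holds the actual chunk bytes, keyed by chunk handle."""
    def __init__(self, chunks):
        self.chunks = chunks

    def read(self, handle, offset, length):           # step 4: return the requested bytes
        return self.chunks[handle][offset:offset + length]

def client_read(master, chunkservers, filename, byte_offset, length):
    chunk_index = byte_offset // CHUNK_SIZE
    handle, server_name = master.lookup(filename, chunk_index)    # steps 1-2
    server = chunkservers[server_name]                            # step 3: contact the chunk server directly
    return server.read(handle, byte_offset % CHUNK_SIZE, length)  # step 4

chunkservers = {"cs-a": Chunkserver({"handle-17": b"hello from chunk 0"}),
                "cs-b": Chunkserver({"handle-42": b"hello from chunk 1"})}
print(client_read(Master(), chunkservers, "/logs/web.0", 6, 4))   # b'from'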

Pages 32-38: Write Request Workflow

1. The client sends the request to the master server.
2. The master responds with the locations of the primary and secondary replicas.
3. The client sends the write data to all the replicas, closest one first, regardless of primary or secondary (pipelined).
4. Once the data is received by all replicas, the client instructs the primary replica to begin the write. The primary assigns consecutive serial numbers to each of the file changes (mutations).
5. After the primary applies the mutations to its own data, it forwards the write request to all the secondary replicas.
6. The secondary replicas complete the write and report back to the primary replica.
7. The primary sends confirmation to the client. If that doesn't work, the master will identify the affected replica as garbage.
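A toy Python walk-through of these write steps, again with made-up classes rather than the real GFS interfaces. The point is the ordering: data is pushed to every replica first (closest first), and only then does the primary apply the mutation, assign a serial number, and forward it.

class Replica:
    def __init__(self, name):
        self.name, self.buffer, self.data = name, None, []

    def push(self, payload):                 # step 3: data pushed to the replica, not yet applied
        self.buffer = payload

    def apply(self, serial):                 # steps 5-6: apply the buffered mutation
        self.data.append((serial, self.buffer))
        return "ack"

class Primary(Replica):
    def __init__(self, name, secondaries):
        Replica.__init__(self, name)
        self.secondaries, self.next_serial = secondaries, 0

    def commit(self):                        # step 4: client asks the primary to write
        self.next_serial += 1
        self.apply(self.next_serial)         # step 5: primary applies locally first
        acks = [s.apply(self.next_serial) for s in self.secondaries]  # steps 5-6
        return "ok" if all(a == "ack" for a in acks) else "retry"     # step 7

secondaries = [Replica("cs-b"), Replica("cs-c")]
primary = Primary("cs-a", secondaries)
for replica in sorted([primary] + secondaries, key=lambda r: r.name):  # "closest first" stand-in
    replica.push(b"new record")              # step 3
print(primary.commit())                      # steps 4-7 -> "ok"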

Page 39: Mutations (diagram)

Page 40: Mutations

Consistent: a file region is consistent if all clients will always see the same data, regardless of which replica is being read.

Defined: a region is defined after a file data mutation if it is consistent and clients will see what the mutation wrote in its entirety.

Page 41: Built-in Functions

Master and chunk replication, a streamlined recovery process, rebalancing, stale replica detection, garbage removal (configurable), and checksumming.

Checksumming: each 64 MB chunk is broken into 64 KB blocks, and each block has its own 32-bit checksum. The master monitors and compares checksums to prevent data corruption (see the sketch below).
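A minimal Python sketch of the block-level checksumming described above; zlib.crc32 stands in here for whichever 32-bit checksum GFS actually uses.

import zlib

BLOCK_SIZE = 64 * 1024  # 64 KB blocks within a chunk

def block_checksums(chunk_bytes):
    """Return a list of 32-bit checksums, one per 64 KB block of the chunk."""
    return [zlib.crc32(chunk_bytes[i:i + BLOCK_SIZE]) & 0xFFFFFFFF
            for i in range(0, len(chunk_bytes), BLOCK_SIZE)]

def verify(chunk_bytes, expected):
    """Compare freshly computed checksums against stored ones to detect corruption."""
    return block_checksums(chunk_bytes) == expected

chunk = b"x" * (3 * BLOCK_SIZE + 100)         # a small stand-in for a chunk
stored = block_checksums(chunk)
print(verify(chunk, stored))                  # True
print(verify(chunk[:-1] + b"y", stored))      # False: the last block no longer matches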

Page 42: Limitations

GFS is suited for batch-oriented applications that prefer high sustained bandwidth over low latency, e.g. web crawling. A single point of failure is unacceptable for latency-sensitive applications such as Gmail or YouTube. The single master is also a scanning bottleneck, and there are consistency problems.

Page 43: BigTable

Page 44: Introduction

BigTable was created by Google in 2005 and is maintained as a proprietary, in-house technology. Some technical details were disclosed at a USENIX symposium in 2006. It has been used by Google services since 2005.

Page 45: What is BigTable?

BigTable is a distributed storage system: it can be spread across multiple nodes yet appears to be one large table. It is not a database design; it is a storage design model.

Page 46: What is BigTable?

Map: BigTable is a collection of (key, value) pairs, where the key identifies a row and the value is the set of columns.

Page 47: What is BigTable?

Sparse: different rows in the table may use different columns, with many of the columns empty for a particular row.

Page 48: What is BigTable?

Column-oriented: BigTable can operate on a set of attributes (columns) for all tuples, and it stores each column contiguously on disk. This allows more records per disk block and reduces disk I/O. The underlying assumption is that in most cases not all columns are needed for data access. In an RDBMS implementation, each "row" is usually stored contiguously on disk.

Page 49: Example: webpages (figure)

Page 50: Example

webpages {
  "com.cnn.www" => {
    "contents" => "html…",
    "anchor" => {
      "cnnsi.com"  => "CNN",
      "my.look.ca" => "CNN.com"
    }
  }
}

Page 51: What is BigTable?

Semi-structured: BigTable is a map (key-value pairs), different rows in the same table can have different columns, and the key is a string, so it is not required to be sequential, unlike an array index.

Page 52: What is BigTable?

Lexicographically sorted: data is sorted by key, so keys should be structured in a way that sorting brings related data together, e.g.:

edu.villanova.cs
edu.villanova.law
edu.villanova.www

Page 53: What is BigTable?

Persistent: when a certain amount of data is collected in memory, BigTable makes it persistent by storing the data in the Google File System.

Page 54: What is BigTable?

Multi-dimensional: data is indexed by row key, column name, and timestamp. It is like a table with many rows (keys) and many columns, each cell carrying timestamps; it acts like a map. For example:

Row keys: URLs
Column names: metadata of web pages
Column contents: the contents of the web page
Timestamps: when the page was fetched

Page 55: Design

(row: string, column: string, time: int64) -> string

Data is indexed by row key, column name, and timestamp. Each value in the map is an uninterpreted array of bytes. BigTable offers clients some control over data layout and format: a careful choice of schema can control the locality of data, and the client decides how to serialize the data.
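A tiny illustration, using ordinary Python dictionaries rather than BigTable itself, of the (row, column, time) -> value map defined above, with multiple timestamped versions per cell.

webtable = {
    "com.cnn.www": {                        # row key
        "contents:": {                      # column (family "contents", empty qualifier)
            1234567892: "<html>version 3",
            1234567891: "<html>version 2",  # older versions kept under older timestamps
        },
        "anchor:cnnsi.com": {1234567890: "CNN"},
    }
}

def read(row, column, timestamp=None):
    """Return the newest value at or before `timestamp` (newest overall if None)."""
    versions = webtable[row][column]
    candidates = [t for t in versions if timestamp is None or t <= timestamp]
    return versions[max(candidates)]

print(read("com.cnn.www", "contents:"))               # <html>version 3
print(read("com.cnn.www", "contents:", 1234567891))   # <html>version 2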

Page 56: Row

A row key can be up to 64 KB. Row ranges for a table are dynamically partitioned, and each row range is called a tablet, the unit of distribution and load balancing.

Clients can select row keys to get better locality of data accesses: reads of short row ranges are efficient and typically require communication with only a small number of machines.

Page 57: Row

Every read or write of data under a single row key is atomic (regardless of the different columns being read or written in the row); there is no such guarantee across rows. BigTable supports single-row transactions to perform atomic read-modify-write sequences on data stored under a single row key, but it does not support general transactions across row keys.

Page 58: Row with Example (figure highlighting a ROW)

Page 59: Columns and Column Families

Column keys are grouped together into sets called column families, named as family:qualifier. Data stored in the same column family usually has the same data type, and data in the same column family is indexed and compressed together. The number of distinct column families is kept small, e.g. the language used on a web page.

Page 60: Column with Example (figure highlighting COLUMNS)

Page 61: Column Families with Example (figure highlighting a COLUMN FAMILY, named family:qualifier)

Page 62: Timestamp

Timestamps are 64-bit integers. Multiple timestamps exist in each cell to record various versions of the data (created, modified), and the most recent version is accessible first. Clients can choose garbage-collection options or request specific timestamps. Timestamps are assigned either by BigTable (in microseconds) or by the client application.

Page 63: Timestamp with Example (figure highlighting TIMESTAMPS)

Page 64: Cells (figure highlighting CELLS)

Page 65: Mutations

First, mutations are logged in a log file, which is stored in GFS. Then the mutations are applied to an in-memory version called the memtable.

Page 66: Mutations (diagram)

Page 67: Data Structure

GFS supports two data structures: logs and sorted string tables (SSTables). The data structure is defined using protocol buffers (a data description language), which avoids the inefficiency of converting data from one format to another, e.g. between data formats in Java and .NET.

Page 68: SSTables and Logs

In memory, BigTable provides mutable key-value storage. Once the log or in-memory table reaches a certain limit, the changes are made persistent (and immutable) in GFS.

All transactions in memory are saved in GFS as segments called logs. After the changes reach a certain size (the amount you want to keep in memory), they are cleaned out: the data is compacted into a series of SSTables and then sent out as chunks to GFS.

Page 69: SSTables and Logs

An SSTable provides a persistent, immutable, ordered map from keys to values. A sequence of blocks forms an SSTable, and each SSTable stores one block index. When an SSTable is opened, the index is loaded into memory; it specifies block locations.
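A simplified Python sketch of an SSTable read (not Google's format): the in-memory block index is searched to find the one block that may hold the key, and only that block is read and scanned.

import bisect

class SSTable:
    def __init__(self, blocks):
        # blocks: list of sorted (key, value) lists; the index holds each block's first key
        self.blocks = blocks
        self.index = [block[0][0] for block in blocks]

    def get(self, key):
        pos = bisect.bisect_right(self.index, key) - 1   # candidate block from the index
        if pos < 0:
            return None
        for k, v in self.blocks[pos]:                    # read and scan only that block
            if k == key:
                return v
        return None

table = SSTable([[("apple", 1), ("boat", 2)], [("cat", 3), ("dog", 4)]])
print(table.get("cat"))    # 3
print(table.get("zebra"))  # None: not present in the candidate block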

Page 70: SSTables and Logs (figure)

An SSTable is a sequence of 64 KB blocks followed by an index of block ranges.

Page 71: Tablet

Tablets are a range of rows of a table. A tablet contains multiple SSTables, and tablets are assigned to tablet servers.

(Figure: a tablet with Start: aardvark and End: apple, built from two SSTables, each made of 64 KB blocks plus an index.)

Page 72: Table

Multiple tablets form a table; SSTables can overlap, but tablets do not.

(Figure: a table whose tablets span aardvark to apple and apple to boat, backed by four SSTables.)

Page 73: Cluster

A BigTable cluster stores tables, and each table consists of tablets. Initially a table contains one tablet; as the table grows, multiple tablets are created. Tablets are assigned to tablet servers: each tablet exists at only one server, while a server holds multiple tablets. Each tablet is 100-200 MB.

Page 74: How to find a Row? (diagram)

Page 75: How to find a Row?

The client reads the location of the root tablet from the Chubby file. The root tablet contains the locations of the METADATA tablets (the root tablet never splits), and the METADATA tablets contain the locations of the user tablets.
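A sketch of this three-level lookup using plain Python dictionaries in place of Chubby, the root tablet, and the METADATA tablets; in the real system the METADATA tablet is chosen by key range rather than by scanning.

chubby_file = {"root_tablet_location": "tabletserver-1"}   # level 1: stored in Chubby

root_tablet = {                       # level 2: points to the METADATA tablets
    "metadata-A": "tabletserver-2",
    "metadata-B": "tabletserver-3",
}

metadata_tablets = {                  # level 3: (table, row key) -> user tablet location
    "metadata-A": {("usertable", "aardvark"): "tabletserver-4"},
    "metadata-B": {("usertable", "mango"): "tabletserver-5"},
}

def find_tablet_server(table, row_key):
    """Walk Chubby -> root tablet -> METADATA tablet -> user tablet location."""
    root_location = chubby_file["root_tablet_location"]   # the client would contact this server
    for metadata_name in root_tablet:                      # the real system picks this by key range
        location = metadata_tablets[metadata_name].get((table, row_key))
        if location is not None:
            return location
    return None

print(find_tablet_server("usertable", "mango"))            # tabletserver-5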

Page 76: BigTable Architecture (figure)

Logical view: a master and a Chubby server coordinate tablet servers, each serving several tablets. Physical layout: each tablet server's tablets are stored as SSTables (with replicas) on GFS chunkservers.

Page 77: BigTable Implementation

BigTable has three components: a master server; tablet servers, which are dynamically added or removed to handle the workload; and the Chubby client library, which links the master server, the many tablet servers, and all clients.

Page 78: BigTable Implementation (diagram)

Page 79: BigTable Building Blocks

Google File System: stores persistent state.
Scheduler: schedules the jobs involved in serving BigTable.
Lock service: master election and location bootstrapping.
MapReduce: used to read/write BigTable data.

Page 80: Chubby

Chubby is a distributed lock service. Its namespace consists of directories and files, which are used as locks, and it provides mutual exclusion. It is highly available: one master is elected from five active replicas, and Paxos maintains consistency among the replicas. Reads and writes are atomic.

Page 81: Chubby

Chubby is responsible for: ensuring there is only one active master, storing the bootstrap location of BigTable data, discovering tablet servers, storing BigTable schema information, and storing access control lists.

Page 82: Chubby Client Library

The Chubby client library is responsible for providing consistent caching of Chubby files. Each Chubby client maintains a session with a Chubby service, and every client session has a lease expiration time. If the client is unable to renew its session lease within the given time, the session expires and all locks and open handles are lost.

Page 83: Master server

At startup, the master acquires a unique master lock in Chubby, discovers live tablet servers in Chubby, discovers existing tablet assignments, and scans the METADATA table to learn the set of tablets.

It is responsible for adding or deleting tablet servers based on demand, assigning tablets to tablet servers, monitoring and balancing tablet server load, garbage collection of files in GFS, and checking each tablet server for the status of its lock.

In case of failure: if its session with Chubby is lost, the master kills itself, and an election can take place to find a new master.

Page 84: Tablet server

At startup, a tablet server acquires an exclusive lock on a uniquely named file in a specific Chubby directory.

It is responsible for managing tablets, splitting tablets that grow beyond a certain size, and serving reads and writes; clients communicate directly with the tablet server.

In case of failure: if it loses its exclusive lock, the tablet server stops serving. If the file still exists, it will attempt to reacquire the lock; if the file no longer exists, the tablet server kills itself, restarts, and joins the pool of unassigned tablet servers.

Pages 85-86: Tablet Server Failure (figures)

When a tablet server fails, other tablet servers are drafted to serve the "abandoned" tablets: the master sends a message to a tablet server, the backup copy of each tablet is made primary, and an extra replica of the tablet's SSTables is created automatically by GFS.

Page 87: Tablet Server Recovery Process

The tablet server reads the tablet's metadata, which contains the list of SSTables that comprise the tablet and a set of redo points; redo points are pointers into the commit logs. It then applies the redo points, reconstructing the memtable from the updates in the commit log.

Page 88: Tablet Server Recovery Process

Read and write requests at the tablet server are checked to make sure they are well formed, and a permission file in Chubby is checked to ensure authorization. For a write operation, all mutations are written to the commit log, and a group commit is used. For a read operation, the request is executed on a merged view of the sequence of SSTables and the memtable (see the sketch below).
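A minimal Python sketch of such a merged read, using plain dictionaries: the memtable is consulted first, then the SSTables from newest to oldest, so the most recent value wins.

def merged_read(key, memtable, sstables_newest_first):
    if key in memtable:                      # most recent, still only in memory
        return memtable[key]
    for sstable in sstables_newest_first:    # persistent, immutable layers
        if key in sstable:
            return sstable[key]
    return None

memtable = {"row3": "v3-new"}
sstables = [{"row2": "v2", "row3": "v3-old"}, {"row1": "v1"}]
print(merged_read("row3", memtable, sstables))   # v3-new: the memtable shadows the SSTable
print(merged_read("row1", memtable, sstables))   # v1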

Page 89: Compactions

Minor compaction: when the memtable is full, convert it into an SSTable. This reduces memory usage and reduces log traffic on restart.

Merging compaction: reduces the number of SSTables; a good place to apply a policy such as "keep only N versions."

Major compaction: a merging compaction that results in only one SSTable, containing no deletion records, only live data.

Page 90: Refinement

Locality groups: clients can group multiple column families together into a locality group.
Compression: applied to each SSTable block separately, using Bentley and McIlroy's scheme and a fast compression algorithm.
Caching for read performance: uses a Scan Cache and a Block Cache.
Bloom filters: reduce the number of disk accesses (see the sketch after this list).
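A minimal Bloom filter sketch in Python showing why it saves disk accesses: if the filter reports a key as absent from an SSTable, that SSTable is never read. The sizes and hash construction here are illustrative only.

import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, hashes=3):
        self.size, self.hashes, self.bits = size_bits, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.md5(("%d:%s" % (i, key)).encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= (1 << pos)

    def might_contain(self, key):
        # False means "definitely not present", so the disk read can be skipped.
        return all(self.bits & (1 << pos) for pos in self._positions(key))

bf = BloomFilter()
bf.add("com.cnn.www")
print(bf.might_contain("com.cnn.www"))   # True
print(bf.might_contain("org.example"))   # almost certainly False: skip this SSTable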

Page 91: Refinement

Commit-log implementation: rather than one log per tablet, use one log per tablet server.

Exploiting SSTable immutability: there is no need to synchronize accesses to the file system when reading SSTables, concurrency control over rows is efficient, deletes work like garbage collection by removing obsolete SSTables, and it enables quick tablet splits (child tablets share the parent's SSTables).

Page 92: Interactions between GFS and BigTable

The persistent state of a collection of rows (a tablet) is stored in GFS.

Writes: incoming writes are recorded in memory in a memtable, where they are sorted and buffered. After they reach a certain size, they are written out as a sequence of SSTables (persistent storage in GFS).

Page 93: Interactions between GFS and BigTable

Reads: the requested information can be in the memtable or in SSTables, so stale information must be avoided; since all tables are sorted, it is easy to find the most recent version.

Recovery: to recover a tablet, the tablet server reconstructs the memtable by reading the tablet's metadata and redo points.

Page 94: API

The BigTable API provides functions for creating and deleting tables and column families, and for changing cluster, table, and column-family metadata such as access control rights.

Client applications can write or delete values, look up values from individual rows, and iterate over a subset of the data.

It also supports single-row transactions, allows cells to be used as integer counters, and can execute client-supplied scripts in the address space of the servers.
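The real BigTable API is an internal C++ library, so the following is only an in-memory Python stand-in for the call pattern described above (single-row mutations applied atomically, plus iteration over a subset of rows); every class and method name here is made up.

class RowMutation:
    def __init__(self, row_key):
        self.row_key, self.ops = row_key, []

    def set(self, column, value):
        self.ops.append(("set", column, value))

    def delete(self, column):
        self.ops.append(("delete", column))

class Table:
    def __init__(self):
        self.rows = {}                                   # row key -> {column: value}

    def apply(self, mutation):                           # all ops for one row commit together
        row = self.rows.setdefault(mutation.row_key, {})
        for op, column, *rest in mutation.ops:
            if op == "set":
                row[column] = rest[0]
            else:
                row.pop(column, None)

    def scan(self, prefix):
        return ((k, v) for k, v in sorted(self.rows.items()) if k.startswith(prefix))

table = Table()
m = RowMutation("com.cnn.www")
m.set("anchor:cnnsi.com", "CNN")
m.delete("anchor:www.abc.com")
table.apply(m)
print(list(table.scan("com.cnn")))   # [('com.cnn.www', {'anchor:cnnsi.com': 'CNN'})]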

Page 95: Why use BigTable?

The scale is large: more than 100 TB of satellite image data; millions of users and thousands of queries per second, with latency to manage; billions of URLs, billions and billions of pages, and many versions of each page.

Page 96: Why not any other Database?

An in-house solution is always cheaper: the scale is too large for most databases, the cost would be too high, and the same system can be reused across different projects, which lowers the cost further. With relational databases we expect ACID transactions, but it is impossible to guarantee consistency while also providing high availability and network partition tolerance.

Pages 97-98: CAP (figures)

Page 99: Application Design

Reminders: the timestamp is an int64, so the application needs to plan for multiple clients updating the same cell at the same time. At the application level, you also need to know the data structures supported by GFS, to avoid format conversion.

Page 100: Google Services using BigTable

Used as a database by Google Analytics, Google Earth, the Google App Engine Datastore, and Google Personalized Search.

Page 101: BigTable Derivatives

Apache HBase, a database built to run on top of the Hadoop Distributed File System (HDFS); Cassandra, which originated at Facebook Inc.; and Hypertable, an open source alternative to HBase.

Page 102: Colossus

GFS is better suited to batch operations; Colossus is a revamped file system suited to real-time operations. Colossus underpins a new search infrastructure called "Caffeine," which enables Google to update its search index in real time. In Colossus there are many masters operating at the same time. A number of changes have already been made to open-source Hadoop to make it look more like Colossus.

Pages 103-114: Comparison (charts)

Page 115: Google App Engine

Page 116: Introduction

Google App Engine is also known as GAE or App Engine. The preview started in April 2008, and it came out of preview in September 2011. It is a PaaS (platform as a service) that allows developing and hosting web applications in Google-managed data centers. The default choice for storage is a NoSQL solution.

Page 117: Introduction

Language independent, with plans to support more languages. Automatic scaling: GAE automatically allocates more resources to handle additional demand. It is free up to a certain level of resources (storage, bandwidth, or instance hours) required by the application. It does not allow joins.

Page 118: Introduction

Applications are sandboxed across multiple servers: a security mechanism for running untested code with restricted resources, for the safety of the host system.

Reliable: a Service Level Agreement of 99.5% uptime, and it can sustain multiple data center failures.

Page 119: GAE Datastore

The GAE Datastore is built on top of BigTable. It follows a hierarchical structure and is a schema-less object datastore designed to scale for high performance. Queries are served by pre-built indexes, and entities of the same kind are not required to have the same set of properties.

Page 120: Does Not Support

Join operations, inequality filtering on multiple properties, and filtering data based on the results of a subquery.

Page 121: Entities

Entities are also known as objects in the App Engine Datastore, and each entity is uniquely identified by its own key. An entity's ancestor path begins with a root entity and proceeds from parent to child, and every entity belongs to an entity group.

Page 122: Models

Model is the superclass for data model definitions, defined in google.appengine.ext.db. Entities of a given kind are represented by instances of the corresponding model class.
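For example, a typical model definition with the google.appengine.ext.db API mentioned above; the Greeting kind and its properties are illustrative.

from google.appengine.ext import db

class Greeting(db.Model):
    author = db.StringProperty()
    content = db.StringProperty(multiline=True)
    date = db.DateTimeProperty(auto_now_add=True)

# Each instance of the class becomes one entity of kind "Greeting":
greeting = Greeting(author="ayesha", content="hello datastore")
greeting.put()   # stores the entity; its key identifies it uniquely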

Page 123: Queries

A Datastore query retrieves entities from the Datastore whose values or keys meet a specified set of conditions. The Datastore API provides a Query class for constructing queries and a PreparedQuery class for fetching and returning entities from the Datastore. Filters and sort orders can be applied to queries.
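A short example of this Query interface, reusing the Greeting model from the Models slide; the filter values are illustrative.

from google.appengine.ext import db

q = db.Query(Greeting)            # or Greeting.all(); assumes the Greeting model defined earlier
q.filter("author =", "ayesha")    # property filter
q.order("-date")                  # sort order, newest first
recent = q.fetch(limit=10)        # fetch up to 10 matching entities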

Page 124: Indexes

An index is defined on a list of properties of an entity kind. An index table contains a column for every property specified in the index's definition. The Datastore identifies the index that corresponds to the query's kind, filter properties, filter operators, and sort orders. App Engine predefines an index on each property of each kind; these indexes are sufficient for simple queries.

Page 125: GQL

GQL is a SQL-like language for retrieving entities or keys from the App Engine Datastore.
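For example, a similar query expressed in GQL (again reusing the Greeting kind; the values are illustrative):

from google.appengine.ext import db

q = db.GqlQuery("SELECT * FROM Greeting WHERE author = :1 ORDER BY date DESC",
                "ayesha")
for greeting in q.run(limit=5):
    print(greeting.content)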

Page 126: Transactions

A transaction is a set of Datastore operations on one or more entities. Transactions are atomic, meaning they are never partially applied, and they provide isolation and consistency. They are needed when users may be attempting to create or update an entity with the same string ID at the same time. It is also possible to queue transactions.
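A sketch of a transactional read-modify-write using db.run_in_transaction; the Counter model and key are illustrative.

from google.appengine.ext import db

class Counter(db.Model):
    count = db.IntegerProperty(default=0)

def increment(counter_key):
    counter = db.get(counter_key)      # read
    counter.count += 1                 # modify
    counter.put()                      # write; all-or-nothing within the transaction

counter_key = Counter(key_name="page-views").put()
db.run_in_transaction(increment, counter_key)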

Page 127: Datastore Software Stack (diagram)

Page 128: Datastore Software Stack

App Engine Datastore: schema-less storage, advanced query engine.
Megastore: multi-row transactions, simple indexes/queries, strict schema.
BigTable: distributed key/value store.
Next-generation distributed file system underneath.

Page 129: GUI

https://appengine.google.com. Everything done through the console can also be done through the command line (appcfg).

Page 130: GUI

The console is organized into Main, Data, Administration, and Billing sections.

Pages 131-138: GUI (Main)

Dashboard: all metrics related to your application, such as versions, resources and usage, and more.
Instances: total number of instances, availability (e.g. dynamic), average latency, average memory, and more.
Logs: detailed information that helps in resolving issues.
Versions: number of versions, the default setting, deployment information, and the ability to delete a specific version.
Backends: like a worker role, a piece of business logic that does not have a user interface.
Cron Jobs: time-based jobs, defined in an xml or yaml file.
Task Queues: multiple task queues can be created (the first one is the default), defined in an xml or yaml file (see the example after this list).
Quota Details: detailed metrics of the resources being used (e.g. storage, memcache, mail), the daily quota, and the rates the client is billed for.
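For example, enqueueing work on the default task queue with the taskqueue API; the URL and parameters are illustrative.

from google.appengine.api import taskqueue

taskqueue.add(url="/tasks/send-summary", params={"user_id": "42"})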

Page 139: Datastore Options

High Replication: uses the Paxos algorithm with multi-master reads and writes, and provides the highest level of availability (99.999% SLA). Certain queries are eventually consistent, and there is some latency due to multi-master writing. Reads come from the fastest (local) source, and reads are transactional.

Page 140: Datastore Options

Master/Slave: offers strong consistency over availability for all reads and queries. Data is written to a single master data center and then replicated asynchronously to the other (slave) data centers. It has a 99.9% SLA, and reads are served from the master only.

Page 141: Competitors

App Engine offers better infrastructure for hosting applications in terms of administration and scalability, while other hosting services offer more flexibility for applications in terms of languages and configuration.

Page 142: Hard Limits (table)

Page 143: Free Quotas (table)

Page 144: Cloud Data Storage Options (table)