CSC590 Selected Topics


Page 1: CSC590 Selected Topics

CSC590 Selected Topics

Bigtable: A Distributed Storage System for Structured Data

Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach

Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber

by Haifa Alyahya 432920323

Page 2: CSC590 Selected Topics

Outline

• Introduction
• Data Model
• APIs
• Building Blocks
• Implementation
• Refinements
• Performance
• Real Applications
• Conclusion

Page 3: CSC590 Selected Topics

Discussion

• Bigtable (Bt) is a distributed storage system for managing structured data, designed to scale to a very large size.

• Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance.

Page 4: CSC590 Selected Topics

Introduction

• Bigtable is designed to reliably scale to petabytes of data and thousands of machines.

• Bigtable has achieved several goals:
  – Wide applicability
  – Scalability
  – High performance
  – High availability

Page 5: CSC590 Selected Topics

Motivation

• Scale problem
  – Lots of data
  – Millions of machines
  – Different projects/applications
  – Hundreds of millions of users

• Storage for (semi-)structured data

• No commercial system big enough
  – Couldn't afford one even if it existed

• Low-level storage optimizations help performance significantly
  – Much harder to do when running on top of a database layer

Page 6: CSC590 Selected Topics

Data Model

• A sparse, distributed, persistent, multi-dimensional sorted map

(row, column, timestamp) -> cell contents
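
As a rough illustration of the mapping above, here is a minimal Python sketch that models a table as a dictionary keyed by (row, column, timestamp). The class TinyTable, its method names, and the sample keys are hypothetical and chosen for illustration only. It is an in-memory, single-process toy, not Bigtable's implementation.

```python
class TinyTable:
    """Toy in-memory model of (row, column, timestamp) -> cell contents."""

    def __init__(self):
        # Sparse: only cells that were actually written take up space.
        self._cells = {}

    def put(self, row, column, timestamp, value):
        self._cells[(row, column, timestamp)] = value

    def get(self, row, column, timestamp):
        # Returns None for a cell that was never written.
        return self._cells.get((row, column, timestamp))

    def sorted_keys(self):
        # Sorted map: rows in lexicographic order, then columns,
        # then timestamps with the newest version first.
        return sorted(self._cells, key=lambda k: (k[0], k[1], -k[2]))


# Hypothetical sample data.
table = TinyTable()
table.put("com.cnn.www", "contents:", 6, "<html>v6")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
table.put("com.aaa.www", "contents:", 1, "<html>a")

for key in table.sorted_keys():
    print(key, "->", table.get(*key))
```

Sorting at read time keeps the sketch short; a real sorted map would keep its keys ordered as they are written.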

Page 7: CSC590 Selected Topics

Data Model

• Rows
  – Arbitrary string
  – Access to data in a row is atomic
  – Ordered lexicographically

Page 8: CSC590 Selected Topics

Data Model

• Column

– Two-level name structure:

• family:qualifier

– Column Family is the unit of access control
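
The two-level naming and family-level access control can be sketched as follows. This is a minimal illustration under assumptions: the function names, the READ_ACL table, and the user names are hypothetical and do not come from Bigtable.

```python
def split_column(column_key):
    """Split 'family:qualifier' into (family, qualifier).

    The qualifier may be empty, as in 'contents:'.
    """
    family, _, qualifier = column_key.partition(":")
    return family, qualifier


# Hypothetical ACL: permissions attach to column families,
# not to individual columns.
READ_ACL = {
    "anchor": {"alice", "bob"},
    "contents": {"alice"},
}


def can_read(user, column_key):
    """Check read permission using only the family part of the column key."""
    family, _ = split_column(column_key)
    return user in READ_ACL.get(family, set())


print(split_column("anchor:cnnsi.com"))
print(can_read("bob", "anchor:cnnsi.com"), can_read("bob", "contents:"))
```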

Page 9: CSC590 Selected Topics

Data Model

• Timestamps

– Store different versions of data in a cell
– Lookup options
    • Return most recent K values
    • Return all values
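
The two lookup options can be sketched in a few lines of Python. The names write and read_versions are hypothetical, and integer timestamps are assumed for simplicity.

```python
from collections import defaultdict

# (row, column) -> {timestamp: value}; one entry per stored version.
versions = defaultdict(dict)


def write(row, column, timestamp, value):
    versions[(row, column)][timestamp] = value


def read_versions(row, column, k=None):
    """Return [(timestamp, value)] newest first.

    With k=None all versions are returned; otherwise only the most recent k.
    """
    cell = versions.get((row, column), {})
    newest_first = sorted(cell.items(), reverse=True)
    return newest_first if k is None else newest_first[:k]


write("row1", "contents:", 1, "old")
write("row1", "contents:", 2, "newer")
write("row1", "contents:", 3, "newest")

print(read_versions("row1", "contents:", k=2))
print(read_versions("row1", "contents:"))
```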

Page 10: CSC590 Selected Topics

Data Model

• The row range for a table is dynamically partitioned
• Each row range is called a tablet
• The tablet is the unit of distribution and load balancing
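
Because rows are kept in sorted order, finding the tablet that holds a row reduces to a binary search over the tablet boundaries. The sketch below assumes a fixed set of hypothetical split keys and server names. In Bigtable the partitioning is dynamic, which this toy does not model.

```python
import bisect

# Hypothetical split points. Tablet 0 holds rows <= "g",
# tablet 1 holds rows in ("g", "p"], and tablet 2 holds everything after "p".
SPLIT_KEYS = ["g", "p"]
TABLET_SERVERS = ["server-a", "server-b", "server-c"]


def locate_tablet(row_key):
    """Return (tablet index, server name) for the tablet covering row_key."""
    index = bisect.bisect_left(SPLIT_KEYS, row_key)
    return index, TABLET_SERVERS[index]


for key in ["apple", "google", "zebra"]:
    print(key, "->", locate_tablet(key))
```

Moving a tablet to another server for load balancing would then only change an entry in TABLET_SERVERS, not the row data itself.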

Page 11: CSC590 Selected Topics

APIs

• Metadata operations
  – Create/delete tables and column families, change metadata

• Writes
  – Set(): write cells in a row
  – DeleteCells(): delete cells in a row
  – DeleteRow(): delete all cells in a row

• Reads
  – Scanner: read arbitrary cells in a Bigtable
    • Each row read is atomic
    • Can restrict returned rows to a particular range
    • Can ask for data from just one row, all rows, etc.
    • Can ask for all columns, just certain column families, or specific columns
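
To make the write and read operations above concrete, here is a toy Python stand-in. The class ToyBigtable and its method signatures are assumptions that only mirror the operation names on the slide; they are not the real client API. Timestamps are omitted to keep the sketch short.

```python
class ToyBigtable:
    """In-memory stand-in for the write and scan operations listed above."""

    def __init__(self):
        self._rows = {}  # row key -> {column key: value}

    def set(self, row, column, value):
        """Write one cell in a row."""
        self._rows.setdefault(row, {})[column] = value

    def delete_cells(self, row, column):
        """Delete one cell; drop the row if it becomes empty."""
        cells = self._rows.get(row, {})
        cells.pop(column, None)
        if not cells:
            self._rows.pop(row, None)

    def delete_row(self, row):
        """Delete all cells in a row."""
        self._rows.pop(row, None)

    def scan(self, start_row="", end_row=None, families=None):
        """Yield (row, cells) in row order; end_row is exclusive.

        families, if given, restricts the result to those column families.
        """
        for row in sorted(self._rows):
            if row < start_row or (end_row is not None and row >= end_row):
                continue
            cells = {
                column: value
                for column, value in self._rows[row].items()
                if families is None or column.split(":", 1)[0] in families
            }
            if cells:
                yield row, cells


t = ToyBigtable()
t.set("a", "anchor:x", "link text")
t.set("a", "contents:", "<html>")
t.set("b", "anchor:x", "other link")
print(list(t.scan(end_row="b", families={"anchor"})))
```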

Page 12: CSC590 Selected Topics

APIs

Page 13: CSC590 Selected Topics

Building Blocks

• Google File System (GFS)
  – Stores persistent data (SSTable file format)

• Scheduler
  – Schedules jobs onto machines

• Chubby
  – Lock service: distributed lock manager
  – Master election, location bootstrapping

• MapReduce (optional)
  – Data processing
  – Read/write Bigtable data

Page 14: CSC590 Selected Topics

Chubby

• {lock/file/name} service
• Coarse-grained locks
• Each client has a session with Chubby.
  – The session expires if the client is unable to renew its session lease within the lease expiration time.
• 5 replicas; a majority must be running for the service to be active
• Also an OSDI ’06 paper
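
The two rules above, lease-based session expiry and the majority requirement, can be sketched in Python. This is a toy model under assumptions: the Session class, the has_quorum function, and the lease length are hypothetical, and time is passed in explicitly so the behaviour is deterministic. It does not reflect Chubby's actual protocol.

```python
REPLICAS = 5


def has_quorum(live_replicas, total=REPLICAS):
    """The service counts as active only while a majority of replicas is up."""
    return live_replicas > total // 2


class Session:
    """Client session that must be renewed before its lease runs out."""

    def __init__(self, lease_seconds, now):
        self.lease_seconds = lease_seconds
        self.expires_at = now + lease_seconds

    def is_expired(self, now):
        return now >= self.expires_at

    def renew(self, now):
        """Extend the lease; returns False if the session has already expired."""
        if self.is_expired(now):
            return False
        self.expires_at = now + self.lease_seconds
        return True


session = Session(lease_seconds=10, now=0)
print(session.renew(now=5))       # renewed in time, lease now ends at 15
print(session.is_expired(now=15))
print(has_quorum(3), has_quorum(2))
```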

Page 15: CSC590 Selected Topics

Implementation

• The Bigtable implementation has three major components:
  – A library that is linked into every client
  – One master server
  – Many tablet servers

Page 16: CSC590 Selected Topics

Tablet Location Management

Page 17: CSC590 Selected Topics

Refinements

• Locality groups:
  – Clients can group multiple column families together into a locality group.

• Compression:
  – Uses Bentley and McIlroy's scheme followed by a fast compression algorithm.

• Caching for read performance:
  – Uses a Scan Cache and a Block Cache.

• Bloom filters:
  – Reduce the number of disk accesses.
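
To show why a Bloom filter reduces accesses, here is a minimal Python sketch. If the filter says a key is definitely absent, the corresponding file does not need to be read at all. The class, its parameters, and the "row/column" key format are assumptions for illustration, not Bigtable's implementation.

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent", so the lookup can be skipped.
        return all(self.bits[pos] for pos in self._positions(key))


bloom = BloomFilter()
bloom.add("row1/anchor:x")
if bloom.might_contain("row1/anchor:x"):
    print("possibly present: read the file")
```

A True answer still requires reading the file, since the filter can report keys that were never added.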

Page 18: CSC590 Selected Topics

Performance Evaluation

Page 19: CSC590 Selected Topics

Real Applications

• Google Analytics
  – http://analytics.google.com

• Google Earth
  – http://earth.google.com

• Personalized search
  – www.google.com/psearch

Page 20: CSC590 Selected Topics

Conclusions

• Users like…
  – the performance and high availability provided by the Bigtable implementation
  – that they can scale the capacity of their clusters by simply adding more machines to the system as their resource demands change over time

• There are significant advantages to building a custom storage solution

• Challenges…
  – User adoption and acceptance of a new interface
  – Implementation issues