BlobSeer in NoSQL world


Page 1: BlobSeer in NoSQL world

BlobSeer: Architecture

Clients: perform fine-grained blob accesses

Providers: store the pages of the blob

Provider manager: monitors the providers; favours data load balancing

Metadata providers: store information about page location

Version manager: ensures concurrency control

[Figure: architecture diagram showing clients, providers, metadata providers, provider manager and version manager]

Page 2: BlobSeer in NoSQL world

BlobSeer: What may be refined

Hotspots/fault tolerance: fixed single version manager; fixed provider manager

Load balancing: version manager and provider manager may become hotspots; fixed metadata providers

Page 3: BlobSeer in NoSQL world

BlobSeer: What I am thinking of

Page 4: BlobSeer in NoSQL world

Background: lightweight DHT (the term may not be accurate). Uses consistent hashing to distribute keys

Provides load balancing, fault tolerance and elasticity

Lookup cost: O(1). Based on a gossip overlay (borrowed from the NoSQL world), or on the Kelips P2P prototype (I have only just learned about it). Given a key, a node knows the destination exactly in most cases. Overhead: acceptable, as seen in the NoSQL world (Facebook Cassandra, Amazon Dynamo, Voldemort)

I will try to solve the problems listed earlier by building BlobSeer on top of this DHT
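To make the consistent-hashing idea concrete, here is a minimal sketch of a hash ring with virtual nodes. It is not BlobSeer code or the code of any NoSQL store: the `Ring` class, the `h` helper, the node names and the choice of SHA-1 with 64 virtual nodes are all assumptions made for illustration. Each node knows the full ring, so a lookup needs no routing hops.

```python
import bisect
import hashlib


def h(key: str) -> int:
    """Map a string to a 64-bit position on the ring (SHA-1 is an arbitrary choice)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


class Ring:
    """Hypothetical consistent-hashing ring; every node holds the whole table."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self._points = []  # sorted list of (position, node name)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns several points so that load spreads more evenly.
        for i in range(self.vnodes):
            bisect.insort(self._points, (h(f"{node}#{i}"), node))

    def remove(self, node):
        self._points = [p for p in self._points if p[1] != node]

    def lookup(self, key):
        # First point clockwise from the key's position, wrapping around.
        i = bisect.bisect_left(self._points, (h(key), "")) % len(self._points)
        return self._points[i][1]


ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.lookup("blob-42")
ring.remove("node-b")  # only keys owned by node-b move to other nodes
```

Removing or adding a node only remaps the keys adjacent to that node's points, which is where the elasticity claim comes from.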

Page 5: BlobSeer in NoSQL world

Distributed version managers

Distributed version managers: two levels

Splitting the BLOB_ID namespace: DHT-based. Fortunately, blobs are independent of each other. Hash(BLOB_ID) => ID of the version manager server

Splitting the version ID space per BLOB: easy; rely on DHT replication. Hash(BLOB_ID) => {neighbouring version managers}

Lookup cost = O(1), equal to BlobSeer
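The two mappings above can be sketched as a single function that returns the master version manager for a blob followed by its neighbours on the ring. This is only an illustration under stated assumptions: the function name `version_manager_group`, the manager names, the replica count of 3 and the use of SHA-1 are hypothetical, not part of BlobSeer.

```python
import hashlib


def h(key: str) -> int:
    """64-bit ring position for a string (hash function chosen arbitrarily)."""
    return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")


def version_manager_group(blob_id, managers, replicas=3):
    """Return [master, neighbour, ...] for a blob.

    The master is the first manager clockwise from Hash(BLOB_ID); the
    following managers on the ring hold replicas of the version info.
    """
    ring = sorted(managers, key=h)
    key = h(blob_id)
    # Index of the first manager at or after the key; wrap to 0 if none.
    start = next((i for i, m in enumerate(ring) if h(m) >= key), 0)
    count = min(replicas, len(ring))
    return [ring[(start + j) % len(ring)] for j in range(count)]


managers = [f"vm-{i}" for i in range(5)]
group = version_manager_group("blob-42", managers)
master, neighbours = group[0], group[1:]
```

Because every node can compute this locally from the membership list, finding the version manager for a blob needs no extra network hop.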

Page 6: BlobSeer in NoSQL world

Concurrent writes/appends need to be serialized on the master: Blob.getlatest(), Blob.write(), Blob.append()

Access to historical versions goes to a random member of {master, slaves}: Blob.read(), Blob.getsize(). Ask the master only when necessary

Either the master periodically PUTS versions or the slaves PULL them to do serialization. Version info is quite tiny
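The master/slave split described above could look roughly like the following single-process sketch. Everything here is an assumption for illustration: the `VersionGroup` class, its method names, and the choice to store only a blob size per version. Real replicas would live on separate nodes and exchange messages.

```python
import random
import threading


class VersionGroup:
    """Hypothetical version manager group: one master, several slaves."""

    def __init__(self, n_slaves=2, seed=0):
        self.master = {}  # version number -> blob size (a tiny record)
        self.slaves = [dict() for _ in range(n_slaves)]
        self._lock = threading.Lock()  # serializes writers on the master
        self._rng = random.Random(seed)

    def append(self, nbytes):
        # Concurrent appends are serialized here, on the master only.
        with self._lock:
            latest = max(self.master, default=0)
            self.master[latest + 1] = self.master.get(latest, 0) + nbytes
            return latest + 1

    def get_latest(self):
        with self._lock:
            return max(self.master, default=0)

    def pull(self):
        # Slaves copy the master's version table (the PULL variant).
        with self._lock:
            snapshot = dict(self.master)
        for slave in self.slaves:
            slave.update(snapshot)

    def get_size(self, version):
        # Historical reads go to a random replica; fall back to the
        # master only when the chosen replica has not seen the version yet.
        replica = self._rng.choice([self.master] + self.slaves)
        if version in replica:
            return replica[version]
        return self.master[version]


group = VersionGroup()
v1 = group.append(10)
v2 = group.append(5)
group.pull()
size_at_v2 = group.get_size(v2)
```

Since each record is just a version number and a size, shipping the whole table on every pull is cheap, which matches the remark that version info is tiny.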

Page 7: BlobSeer in NoSQL world

Eliminate the provider manager

Provider manager keeps cluster state to answer clients’ requests: lookup costs O(1); becomes a hotspot as the number of clients and providers increases

Providers can learn about the system state themselves: load and load balancing??; lookup costs O(1); use the presented DHT overlay to propagate providers’ states

Gossip-based (limited to clusters of around 1000 nodes, but that is still good), or a lightweight P2P overlay (e.g. Kelips)

Client randomly asks any provider
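As a rough illustration of how providers might learn the cluster state without a provider manager, here is a small push-pull gossip simulation. It is a sketch under assumptions, not the protocol of Cassandra, Kelips or BlobSeer: the `Provider` class, the `(version, load)` entries and the `least_loaded` helper are invented for this example.

```python
import random


class Provider:
    """Hypothetical provider that gossips a table of (version, load) entries."""

    def __init__(self, name, load):
        self.name = name
        self.view = {name: (1, load)}  # provider name -> (version, load)

    def set_load(self, load):
        version, _ = self.view[self.name]
        self.view[self.name] = (version + 1, load)  # newer version wins on merge

    def merge(self, other_view):
        for name, entry in other_view.items():
            if name not in self.view or entry[0] > self.view[name][0]:
                self.view[name] = entry


def gossip_round(providers, rng):
    # Each provider exchanges its table with one random peer.
    for p in providers:
        peer = rng.choice([q for q in providers if q is not p])
        peer.merge(p.view)  # push
        p.merge(peer.view)  # pull


def least_loaded(providers, rng):
    # A client asks any provider at random and uses that provider's view.
    view = rng.choice(providers).view
    return min(view, key=lambda name: view[name][1])


rng = random.Random(1)
cluster = [Provider(f"p{i}", load=i) for i in range(8)]
for _ in range(200):
    gossip_round(cluster, rng)
target = least_loaded(cluster, rng)
```

The answer a client gets may be slightly stale, since it reflects whatever the contacted provider has heard so far; that is the trade-off for removing the central hotspot.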

Page 8: BlobSeer in NoSQL world

However !!!

We will not want to use consistent hashing

Page 9: BlobSeer in NoSQL world

Architecture

Version managers, metadata managers, providers, clients

DHT with consistent hashing

Distributed membership management

Gossip based

ZooKeeper (like Google’s Chubby)

Replication, fault tolerance, leader election

Page 10: BlobSeer in NoSQL world

Access scenarios

Reading: hash the blob ID to find its associated version manager; go down the metadata tree; access the providers. Each step costs O(1), equal to the current BlobSeer design

Writing: the same as in BlobSeer, but without the provider manager
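The "go down the metadata tree" step of a read can be sketched as a descent through a binary tree whose leaves record which provider stores each page. This is a simplified, single-machine illustration: the `Node` class, the `build` and `pages_for_range` helpers, and the round-robin page assignment are assumptions, and the tree here is neither versioned nor distributed across metadata providers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    offset: int
    size: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    provider: Optional[str] = None  # set on leaves (one leaf per page)


def build(offset, size, page_size, providers):
    """Build a tree over [offset, offset + size); size must be page_size * 2**k."""
    if size == page_size:
        # Illustrative placement: pages assigned to providers round-robin.
        owner = providers[(offset // page_size) % len(providers)]
        return Node(offset, size, provider=owner)
    half = size // 2
    return Node(offset, size,
                build(offset, half, page_size, providers),
                build(offset + half, half, page_size, providers))


def pages_for_range(node, start, length):
    """Return (page offset, provider) for every page overlapping the range."""
    if node.offset + node.size <= start or start + length <= node.offset:
        return []  # no overlap with this subtree
    if node.provider is not None:
        return [(node.offset, node.provider)]
    return (pages_for_range(node.left, start, length)
            + pages_for_range(node.right, start, length))


root = build(0, 8, 1, ["a", "b", "c"])
pages = pages_for_range(root, 2, 3)  # pages at offsets 2, 3 and 4
```

Once the client has the list of (page, provider) pairs, it contacts the providers directly, so no provider manager is involved in a read.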

Page 11: BlobSeer in NoSQL world

Overview of the implementation

Gossip-based DHT

We need 3 hash namespaces: version managers, metadata providers, providers

Elasticity: inherent if we use consistent hashing for the DHT

Fault tolerance: DHT-based

Load balancing: DHT-based

Page 12: BlobSeer in NoSQL world

Advantages

Keeps the current nice features of BlobSeer

Monolithic design: each node provides all capabilities, acting as a client, a version manager, a metadata manager and a provider. Simpler configuration and deployment (autonomic feature?)

Load balancing

Fault tolerance

Elasticity

Compared to a NoSQL key/value store: efficient for a single key whose value is of TB size (versioning, throughput)

Page 13: BlobSeer in NoSQL world

Some more discussions

If the client is outside the BlobSeer storage cloud, it randomly chooses one node to communicate with. That node acts as a proxy server (as in Cassandra)

We may need only a small number of version managers and metadata managers. Leader election can be based on Apache ZooKeeper. If we fix them, we will reduce overhead at the DHT level

[Figure: a client and the BlobSeer cloud]

Page 14: BlobSeer in NoSQL world

BlobSeer in NoSQL paradigm

Document stores

Column stores

Page 15: BlobSeer in NoSQL world

{pages} distribution

BlobSeer’s approach: distribute {pages} over different providers; {pages} are mapped directly to the physical addresses of providers

DHT’s approach: the DHT is used only to know who has the {pages}, not to route them. We must find a good way: should the {pages} of a single write be distributed over different providers? [YES or NO] Hopefully, page keys are picked by the client in BlobSeer

DHT load balancing; DHT fault tolerance; lookup cost: O(1)
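One possible answer to the open question above, sketched under assumptions: because the client picks the page keys, it can also pick distinct providers for the pages of one write and record only the "who has it" mapping in the DHT. The `place_pages` function, the key format and the plain dictionary standing in for the DHT are all hypothetical.

```python
import hashlib


def place_pages(write_id, n_pages, providers, directory):
    """Spread the pages of one write over providers and record their location.

    `directory` stands in for the DHT: it only answers "who has this page",
    it does not route the page data itself.
    """
    digest = hashlib.sha1(write_id.encode()).digest()
    # Different writes start at different providers to balance load.
    start = int.from_bytes(digest[:8], "big") % len(providers)
    for i in range(n_pages):
        key = f"{write_id}/page-{i}"  # page key chosen by the client
        directory[key] = providers[(start + i) % len(providers)]
    return directory


providers = ["p0", "p1", "p2", "p3", "p4"]
directory = place_pages("write-7", 4, providers, {})
holder = directory["write-7/page-2"]
```

With this scheme, a write of up to `len(providers)` pages touches that many distinct providers, so the pages can be sent in parallel.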

Page 16: BlobSeer in NoSQL world

Eliminate the provider manager

Provider manager keeps cluster state to answer clients’ requests: lookup costs O(1); becomes a hotspot as the number of clients and providers increases

Providers can learn about the system state themselves: lookup costs O(1); use the presented DHT overlay to propagate providers’ states

Gossip-based (limited to clusters of around 1000 nodes, but that is still good), or a lightweight P2P overlay (e.g. Kelips)

Need a good way to distribute the {pages} of each separate write operation over the DHT?

BlobSeer’s approach / DHT’s approach