
Database Replication Policies for Dynamic Content Applications
Gokul Soundararajan, Cristiana Amza, Ashvin Goel
University of Toronto

EuroSys 2006: Leuven, Belgium
April 19, 2006

2

Dynamic Content Web Server

[Diagram: Internet → Web Server → Application Server → Database Server]

3

Dynamic Content Web Server

[Diagram: Internet → Web Server (Stateless) → Application Server (Stateless) → Database Server (Stateful)]

4

Today’s Server Farms

Data centers can run multiple applications (e.g., IBM, HP)
Service providers can multiplex resources (e.g., applications have peaks at different times)
Challenge: the database server becomes the bottleneck

[Diagram: Internet → Web Server (Stateless) → Application Server (Stateless) → Database Server (Stateful)]

5

Motivation

Scale the database backend on clusters:
  Handle more clients
  Run multiple applications
  Handle failures in the backend

Our approach: database replication with dynamic replica allocation
  Adapt to changing load or failures

6

Database Replication

Read-one, write-all replication:
  Plattner & Alonso, MW 04
  Lin et al., SIGMOD 05
  Amza et al., ICDE 05

[Graph: Scaling for E-Commerce (TPC-W)]

7

Dynamic Replication

Assume a cluster hosts two applications:
  App1 (Red) using 2 machines
  App2 (Blue) using 2 machines
Assume App1 has a load spike

[Diagram: machines 1, 2, 3, 4]
Replica Set(App1) = {1,2}, Replica Set(App2) = {3,4}

8

Dynamic Replication

Choose the number of replicas to allocate to App1
Say we adapt by allocating one more replica
Then there are two options:
  App2 still uses two replicas (overlapping replica sets)
  App2 loses one replica (disjoint replica sets)

[Diagram: machines 1, 2, 3, 4]
Replica Set(App1) = {1,2}, Replica Set(App2) = {3,4}

9

Dynamic Replication

Choose the number of replicas to allocate to App1
Say we adapt by allocating one more replica
Then there are two options:
  App2 still uses two replicas (overlapping replica sets)
  App2 loses one replica (disjoint replica sets)

[Diagram: machines 1, 2, 3, 4; machine 3 shared by both applications]
Replica Set(App1) = {1,2,3}, Replica Set(App2) = {3,4}

10

Dynamic Replication

Choose the number of replicas to allocate to App1
Say we adapt by allocating one more replica
Then there are two options:
  App2 still uses two replicas (overlapping replica sets)
  App2 loses one replica (disjoint replica sets)

[Diagram: machines 1, 2, 3, 4]
Replica Set(App1) = {1,2,3}, Replica Set(App2) = {4}
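As an illustration of the two options, here is a minimal Python sketch under assumed names (replica_sets, grow_overlapping, and grow_disjoint are not from the paper): growing App1's set either overlaps with App2's set or forces App2 to give a machine up.

```python
# Hypothetical sketch of the two allocation options for a 4-machine cluster.
replica_sets = {"App1": {1, 2}, "App2": {3, 4}}

def grow_overlapping(sets, app, machine):
    """Overlapping replica sets: `app` gains `machine` without taking it from anyone."""
    sets[app].add(machine)                  # App1 -> {1,2,3}, App2 keeps {3,4}

def grow_disjoint(sets, app, machine):
    """Disjoint replica sets: `app` gains `machine` and every other app gives it up."""
    for other, replicas in sets.items():
        if other != app:
            replicas.discard(machine)       # App2 shrinks to {4}
    sets[app].add(machine)                  # App1 -> {1,2,3}

grow_overlapping(replica_sets, "App1", 3)   # or grow_disjoint(replica_sets, "App1", 3)
print(replica_sets)                         # {'App1': {1, 2, 3}, 'App2': {3, 4}}
```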

11

Challenges

Adding a replica can take time:
  Bring the replica up to date
  Warm up its memory
Can avoid adaptation with fully overlapped replica sets

[Spectrum: Disjoint → slow adaptation … Full Overlap → no adaptation]

12

Challenges

However, overlapping applications compete for memory, causing interference
Can avoid interference with disjoint replica sets

[Spectrum: Disjoint → no interference … Full Overlap → high interference]

13

Challenges

However, overlapping applications compete for memory, causing interference
Can avoid interference with disjoint replica sets

[Spectrum: Disjoint → no interference … Full Overlap → high interference]

Tradeoff between adaptation delay and interference

14

Insight for Dynamic Content Apps

Database reads are much heavier than writes:
  Reads are multi-table joins
  Writes are single-row updates
Overlapping reads – high interference
Overlapping writes – little interference

15

Insight for Dynamic Content Apps

Database reads are much heavier than writes:
  Reads are multi-table joins
  Writes are single-row updates
Overlapping reads – high interference
Overlapping writes – little interference

Solution: separate reads and overlap writes

16

Our Solution – Partial Overlap

Reads of the applications are sent to disjoint replica sets, which avoids interference
Read-Set: the set of replicas where an application's reads are sent

[Diagram: machines 1, 2, 3, 4]
Read-Set(App1) = {1,2}, Read-Set(App2) = {3,4}

17

Our Solution – Partial Overlap

Writes of the applications are sent to overlapping replica sets, which reduces replica addition time
Write-Set: the set of replicas where an application's writes are sent

[Diagram: machines 1, 2, 3, 4]
Read-Set(App1) = {1,2}, Write-Set(App1) = {1,2,3}
Read-Set(App2) = {3,4}, Write-Set(App2) = {2,3,4}
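A minimal sketch of how a query scheduler could apply this partial-overlap policy, assuming illustrative routing tables and a hypothetical route() helper (the talk does not show the scheduler's actual interface): reads are load-balanced within the application's read-set, while each write goes to every replica in its larger, overlapping write-set.

```python
import random

# Illustrative partial-overlap routing tables for two applications.
READ_SET  = {"App1": {1, 2},    "App2": {3, 4}}
WRITE_SET = {"App1": {1, 2, 3}, "App2": {2, 3, 4}}

def route(app, is_write):
    """Return the replica id(s) that should execute the next query of `app`."""
    if is_write:
        # Write-all within the write-set: the overlap replicas stay up to date
        # for both applications, so one can be promoted quickly on a load spike.
        return sorted(WRITE_SET[app])
    # Read-one within the read-set: the two applications' heavy read queries
    # never share a replica's buffer pool, so they do not interfere.
    return [random.choice(sorted(READ_SET[app]))]

print(route("App1", is_write=False))   # e.g. [2]
print(route("App2", is_write=True))    # [2, 3, 4]
```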

18

Optimization

For a given application:
  Replicas in the Write-Set – fully up to date
  Other replicas – periodic batch updates

[Diagram: replicas 1, 2, 3 fully up to date; replica 4 receives periodic updates]
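A rough sketch of this optimization under assumed names and an invented batching period (not the paper's implementation): write-set replicas apply each write immediately, while replicas outside the write-set accumulate the same writes and apply them in periodic batches.

```python
import queue
import threading
import time

def execute_on(replica, sql):
    print(f"replica {replica}: {sql}")          # stand-in for a real database call

pending = {4: queue.Queue()}                    # replica 4 is outside the write-set

def apply_write(write_set, sql):
    for replica in sorted(write_set):           # e.g. {1, 2, 3}: kept fully up to date
        execute_on(replica, sql)
    for q in pending.values():                  # other replicas: defer to the next batch
        q.put(sql)

def batch_updater(period_s=30):
    while True:                                 # periodically drain the queued writes
        time.sleep(period_s)
        for replica, q in pending.items():
            while not q.empty():
                execute_on(replica, q.get())    # bring replica 4 up to date in one batch

threading.Thread(target=batch_updater, daemon=True).start()
apply_write({1, 2, 3}, "UPDATE items SET stock = stock - 1 WHERE id = 42")
```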

19

When do we adapt?

Add a replica when the application's requirements are not met, due to either load spikes or failures
Remove a replica when it is no longer needed
Application requirements are defined through a Service Level Agreement (SLA)

20

Resource Manager Feedback Loop

[Diagram: Global Resource Manager loop: Monitor → Analyze → Request Add/Remove → Execute]

21

Resource Manager Feedback Loop

[Diagram: Global Resource Manager loop: Monitor → Analyze → Request Add/Remove → Execute]

When does the feedback loop end?
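One way to picture the loop is the Python skeleton below, with invented thresholds and stub functions (the talk does not specify the manager's interfaces): monitoring feeds an SLA-based analysis that requests replica additions or removals.

```python
import random
import time

SLA_LATENCY_MS = 600                     # query latency bound used in the experiments

def monitor(app):
    """Stand-in for latency measurements collected at the query schedulers."""
    return random.uniform(100, 900)      # average query latency in ms

def analyze(latency_ms):
    """Decide an action for one application; the remove threshold is invented."""
    if latency_ms > SLA_LATENCY_MS:
        return "add"                     # SLA violated: load spike or failure
    if latency_ms < 0.5 * SLA_LATENCY_MS:
        return "remove"                  # ample headroom: a replica may be freed
    return "keep"

def resource_manager_loop(apps, interval_s=10):
    """Monitor -> Analyze -> Request Add/Remove -> Execute, repeatedly."""
    while True:
        for app in apps:
            action = analyze(monitor(app))
            if action != "keep":
                print(f"{app}: request {action}")   # stand-in for the Execute step
        time.sleep(interval_s)

# resource_manager_loop(["App1", "App2"])   # runs until stopped
```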

22

Possible Oscillations

A change is not seen immediately: replica addition takes time (bring the replica fully up to date, warm up its memory)
This may trigger more adds
Oscillations cause interference between applications

[Diagram: Global Resource Manager loop: Monitor → Analyze → Request Add/Remove → Execute]

23

Avoiding Oscillations

Delay-awareness: use load balance as a heuristic for stabilization after replica addition
Removes are conservative: tentative removes

[Diagram: Global Resource Manager loop: Monitor → Analyze → Request Add/Remove → Execute]
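A hedged sketch of these two ideas; the class, method names, and tolerance are assumptions for illustration, not the paper's code. After requesting an addition, further adds are suppressed until load looks balanced across the replicas (i.e., the new replica has warmed up), and removals are only marked tentative so a replica can be reinstated cheaply.

```python
def is_stabilized(load_per_replica, tolerance=0.2):
    """Load-balance heuristic: the last addition is considered complete once
    every replica's load is within `tolerance` of the per-replica average."""
    avg = sum(load_per_replica) / len(load_per_replica)
    return all(abs(load - avg) <= tolerance * avg for load in load_per_replica)

class DelayAwareAllocator:
    """Illustrative delay-aware add/remove gating for one application."""

    def __init__(self):
        self.add_pending = False        # an addition is still warming up
        self.tentative_removes = set()  # replicas scheduled, but not yet removed

    def may_add(self, load_per_replica):
        if self.add_pending and not is_stabilized(load_per_replica):
            return False                # suppress further adds to avoid oscillation
        self.add_pending = True
        return True

    def request_remove(self, replica):
        # Conservative remove: only mark the replica, so it can be taken back
        # quickly if the load returns before the removal is made final.
        self.tentative_removes.add(replica)
```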

24

Cluster Architecture

[Diagram: Users → Web/Application Servers → Auctions Query Scheduler and E-Commerce Query Scheduler → Database Tier, with a global Resource Manager]

25

Experimental Setup

Hardware:
  AMD Athlon 2600+ running at 2.1 GHz
  512 MB of RAM
  60 GB hard drive
Software:
  RedHat Fedora Core 2 Linux
  Apache 1.3.31 with PHP 4.0
  MySQL 4.0.16 with InnoDB tables
Benchmarks:
  TPC-W: E-Commerce Retail Store
  RUBIS: Online Bidding

26

Outline of Results

SLA defined in terms of a query latency bound: query latency < 600 ms
Cluster size: up to 8 database replicas, 10 web/application servers
Experiments:
  Interference between workloads
  Adapting to load changes
  Adapting to faults

27

Disjoint

[Diagram: placement of the two applications on replicas 1 and 2]

28

Partial Overlap

[Diagram: placement of the two applications on replicas 1 and 2]

29

Full Overlap

[Diagram: placement of the two applications on replicas 1 and 2]

30

Interference

[Bar chart: increase in latency (0 to 5) under Disjoint, Partial Overlap, and Full Overlap, for TPC-W and RUBIS]

31

Adaptation to Load Changes

32

Adapting to Load Changes

Three schemes:
  Disjoint – 4/4 (4 replicas per application, no sharing)
  Dynamic allocation using Partial Overlap
  Full Overlap – 8/8 (all 8 replicas shared by both applications)

33

Disjoint

[Graphs: TPC-W and RUBIS]

34

Full Overlap

[Graphs: TPC-W and RUBIS]

35

Partial Overlap

[Graphs: TPC-W and RUBIS]

36

Adaptation to Faults

37

Adaptation to Faults

38

More Results - In the Paper

More complex load scenarios, including overload
Effect of delay-awareness in avoiding oscillations

39

Conclusion

Database replication: handle more clients
Dynamic replica allocation:
  Handle multiple workloads with different peaks
  Handle faults

40

Thanks!