Hypervisor-Assisted Application Checkpointing for High Availability


Page 1: Hypervisor-Assisted Application  Checkpointing  for High Availability

Hypervisor-Assisted Application Checkpointing for High Availability

Min Lee

Joint work with A. S. Krishnakumar, P. Krishnan, Navjot Singh, Shalini Yajnik

Page 2: Hypervisor-Assisted Application  Checkpointing  for High Availability

© 2009 Avaya Inc. All rights reserved. 2

Introduction

Virtualization technology
– Is widely adopted
– Has proven its usefulness
– Most applications run well
• They run natively

Some important applications don’t run well
– Certain operations cannot run natively
– Instead they use hypercalls
– Our target: application checkpointing

Page 3: Hypervisor-Assisted Application  Checkpointing  for High Availability

Xen Virtual Machine Monitor

[Diagram: applications run on modified guest OSes inside virtual machines. The Xen hypervisor sits between the virtual hardware presented to the guests (vCPU, vDisk, vNIC, vMemory, etc.) and the physical hardware (CPU, disk, NIC, memory, etc.).]

(Taken/adapted from ‘Xen and co.’ slides)

Page 4: Hypervisor-Assisted Application  Checkpointing  for High Availability

High Availability Approaches

Categories
– Application-transparent
• No changes to application or guest
• Xen-specific: Remus, Kemari
– Application-assisted
• Application implements the checkpointing logic
• Flexible and lightweight

We are targeting
– Application-assisted checkpointing under virtualization
• Xen-specific
• Applicable to general hypervisors

Page 5: Hypervisor-Assisted Application  Checkpointing  for High Availability

Hypervisor-Assisted Application Checkpointing

Application checkpointing
– Provides transactional properties to the traditional heap
• Makes the heap highly available
– Processes survive failures
– Has performance issues in Xen

Our technique improves application-checkpointing performance in Xen

Page 6: Hypervisor-Assisted Application  Checkpointing  for High Availability

High Availability

[Diagram: the running process calls List_add() and List_del(), and its changes are propagated to a “magical mirror”. When the process crashes at a List_add(), the mirror takes over and continues with List_add().]

Page 7: Hypervisor-Assisted Application  Checkpointing  for High Availability

Transaction APIs

List of dirty pages
– Written pages

mprotect() system call
– Write-protect
– SIGSEGV signal

APIs:
int declare(addr, size);
void undeclare(Tid);
void Tstart(Tid);
void Tend(Tid, dirty_pages);

Examples:
List_add();
becomes
Tstart(); List_add(); Tend();

List_add(); List_del(); List_add(); List_del();
becomes
Tstart(); List_add(); List_del(); List_add(); List_del(); Tend();

Page 8: Hypervisor-Assisted Application  Checkpointing  for High Availability

PT – Existing Approach

Get dirty pages

Tstart();
List_add();   (writes page 5)
List_add();   (writes page 7)
Tend();
…

handler() {
mprotect(unprotect);
add_to_dirty_pages();
}

Declare() {}
Undeclare() {}

[Diagram: process’ view of virtual pages 1 to 12; the writes to pages 5 and 7 reach handler(), which adds both pages to the dirty-page list.]
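The PT approach on this slide can be sketched in user space as follows. This is a minimal illustration for Linux, not the authors' implementation: the names pt_declare, pt_tstart, and pt_tend are hypothetical stand-ins for declare(), Tstart(), and Tend(), and the sketch tracks a single heap without transaction IDs.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_DIRTY 1024

static char  *heap_base;               /* start of the tracked heap */
static size_t heap_len;                /* length in bytes */
static size_t page_size;
static size_t dirty_pages[MAX_DIRTY];  /* page indices written in this transaction */
static volatile size_t n_dirty;

/* SIGSEGV handler: a write hit a write-protected page. Record the page and
   unprotect it, so the faulting store is retried and later writes to the
   same page run at full speed. (mprotect is not on the POSIX list of
   async-signal-safe functions; calling it here is common Linux practice.) */
static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;
    size_t page;
    (void)sig;
    (void)ctx;
    if (addr < heap_base || addr >= heap_base + heap_len || n_dirty == MAX_DIRTY)
        _exit(1);                      /* not a fault this sketch can handle */
    page = (size_t)(addr - heap_base) / page_size;
    dirty_pages[n_dirty] = page;
    n_dirty = n_dirty + 1;
    mprotect(heap_base + page * page_size, page_size, PROT_READ | PROT_WRITE);
}

/* Hypothetical declare(): map a heap of n_pages and install the handler. */
char *pt_declare(size_t n_pages)
{
    struct sigaction sa;
    page_size = (size_t)sysconf(_SC_PAGESIZE);
    heap_len = n_pages * page_size;
    heap_base = mmap(NULL, heap_len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(heap_base != MAP_FAILED);
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);
    return heap_base;
}

/* Hypothetical Tstart(): forget old dirty pages, write-protect the heap. */
void pt_tstart(void)
{
    n_dirty = 0;
    mprotect(heap_base, heap_len, PROT_READ);
}

/* Hypothetical Tend(): hand back the dirty-page list, return its length. */
size_t pt_tend(const size_t **pages)
{
    *pages = dirty_pages;
    return n_dirty;
}
```

Each first write to a page costs a fault, a signal, and an mprotect() call, which is why the cost of this approach follows the number of unique pages written rather than the number of writes.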

Page 9: Hypervisor-Assisted Application  Checkpointing  for High Availability

PT Call-Flow

[Diagram: call flow of the pure user-level approach across the User, OS, and Hypervisor layers. Shown are two mprotect() calls, each with a TLB flush, and a page fault plus signal for every dirty page.]

Page 10: Hypervisor-Assisted Application  Checkpointing  for High Availability

Approaches

                      PT-based        Emulation-based   Scan-based
Pure user space       PT (existing)
Hypervisor-assisted

Page 11: Hypervisor-Assisted Application  Checkpointing  for High Availability

Approaches

                      PT-based        Emulation-based   Scan-based
Pure user space       PT (existing)   Emulation
Hypervisor-assisted   PTxen           Emulxen           Scanxen

Our approaches

Page 12: Hypervisor-Assisted Application  Checkpointing  for High Availability


Our Approaches

Page 13: Hypervisor-Assisted Application  Checkpointing  for High Availability

Emulation

Under the condition
– Most transactions are small

Tstart() {}
List_add();
List_add();
Tend();
…

handler() {
emulate();
log_to_write_buffer();
}

Write buffer: (Addr1, 100), (Addr2, 200)

Declare();
Undeclare();

[Diagram: process’ view of virtual pages 1 to 12; each List_add() write reaches handler() and is logged in the write buffer as an (address, value) pair.]
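The write buffer on this slide can be illustrated with a small sketch. It covers only the logging and replay side: logged_store is a hypothetical stand-in for emulate() plus log_to_write_buffer(), and no instruction decoding is attempted. The struct layout, the 32-bit value width, and the replay function are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOG_CAPACITY 256

/* One logged write: where it landed in the heap and what was stored. */
struct write_entry {
    size_t   offset;   /* byte offset from the start of the transactional heap */
    uint32_t value;    /* the 32-bit value that was written */
};

struct write_buffer {
    struct write_entry entries[LOG_CAPACITY];
    size_t             count;
};

/* Perform the store on the primary heap and remember it in the log.
   Returns 0 on success, -1 if the log is full. */
int logged_store(struct write_buffer *wb, unsigned char *heap,
                 size_t offset, uint32_t value)
{
    assert(wb != NULL && heap != NULL);
    if (wb->count == LOG_CAPACITY)
        return -1;
    memcpy(heap + offset, &value, sizeof value);
    wb->entries[wb->count].offset = offset;
    wb->entries[wb->count].value = value;
    wb->count++;
    return 0;
}

/* At the end of a transaction, replay the log in order on the backup copy
   of the heap, then empty the log for the next transaction. */
void replay_on_backup(struct write_buffer *wb, unsigned char *backup)
{
    for (size_t i = 0; i < wb->count; i++)
        memcpy(backup + wb->entries[i].offset, &wb->entries[i].value,
               sizeof wb->entries[i].value);
    wb->count = 0;
}
```

The log grows by one entry per write, so the work done is proportional to the number of writes in a transaction, which fits the slide's condition that most transactions are small.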

Page 14: Hypervisor-Assisted Application  Checkpointing  for High Availability

Hypervisor-Assisted: User-to-Hypervisor Call

Overhead of going through the OS is unnecessary
– Talk to Xen directly

Move checkpointing to the Xen level
– Add a new interrupt vector
• 0x80: system call
• 0x82: hypercall from guest OS
• 0x84: hypercall from user (newly added)

The Xen-based approaches need no changes to the guest OS.

Page 15: Hypervisor-Assisted Application  Checkpointing  for High Availability

Hypervisor-Assisted: User-to-Hypervisor Call

[Diagram: user-to-hypervisor call]

Page 16: Hypervisor-Assisted Application  Checkpointing  for High Availability

PTxen

Implement PT in Xen

Tstart() {}
List_add();   (writes page 5)
List_add();   (writes page 7)
Tend();
…

Declare();
Undeclare() {}

----- Xen -----

Process1, (1-12)

page_fault() {
mprotect(unprotect);
add_to_dirty_pages();
}

[Diagram: process’ view of virtual pages 1 to 12; the writes to pages 5 and 7 fault into Xen’s page_fault(), which adds both pages to the dirty-page list kept for Process1.]

Page 17: Hypervisor-Assisted Application  Checkpointing  for High Availability

Emulxen

Emulation in Xen

Tstart() {}
List_add();
List_add();
Tend();
…

Declare();
Undeclare();

----- Xen -----

Process1, (1-12)

page_fault() {
emulate();
log_to_write_buffer();
}

Write buffer: (Addr1, 100), (Addr2, 200)

[Diagram: process’ view of virtual pages 1 to 12; each List_add() write faults into Xen’s page_fault() and is logged in the write buffer.]

Page 18: Hypervisor-Assisted Application  Checkpointing  for High Availability

Scanxen

Idea
– Scan the page table rather than trapping writes
– Hardware marks the dirty bit

Tstart() {}
List_add();   (writes page 5)
List_add();   (writes page 7)
Tend();
…

Declare();
Undeclare();

----- Xen -----

Process1, (1-12)

scan_page_table() {
collect_dirty_bit();
add_to_dirty_pages();
}

[Diagram: process’ view of virtual pages 1 to 12, with the dirty bit in the page table marked for pages 5 and 7.]
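The scan on this slide can be sketched against a simulated page table. The array of 64-bit entries and the simulate_write helper are assumptions standing in for the real page table and for the hardware; the only borrowed fact is that x86 keeps the dirty flag in bit 6 of a page-table entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_DIRTY ((uint64_t)1 << 6)   /* dirty flag of an x86 page-table entry */

/* Stand-in for the hardware: a write to a page sets its dirty flag. */
void simulate_write(uint64_t *pte, size_t page)
{
    pte[page] |= PTE_DIRTY;
}

/* Walk all n_pages entries, record the index of every dirty page, and clear
   the flag so the next transaction starts clean. dirty_pages must have room
   for n_pages indices. Returns the number of dirty pages found. */
size_t scan_page_table(uint64_t *pte, size_t n_pages, size_t *dirty_pages)
{
    size_t found = 0;
    assert(pte != NULL && dirty_pages != NULL);
    for (size_t i = 0; i < n_pages; i++) {
        if (pte[i] & PTE_DIRTY) {
            dirty_pages[found++] = i;
            pte[i] &= ~PTE_DIRTY;
        }
    }
    return found;
}
```

The loop visits every entry whether or not it was written, so the cost depends on the size of the heap and not on how many writes occurred. On real hardware, clearing a dirty flag also requires invalidating the matching TLB entry, which this simulation leaves out.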

Page 19: Hypervisor-Assisted Application  Checkpointing  for High Availability

Microbenchmark

10000 transactions
10MB heap size

Page 20: Hypervisor-Assisted Application  Checkpointing  for High Availability

Microbenchmark

Transactional heap size
– For simplicity, the whole heap is protected

Transaction
– Writes per page (wpp)
• # of writes per page
– Pages per transaction (ppt)
• # of unique pages written
– # of writes = wpp * ppt

Scanxen
– Impacted only by heap size
– Not by wpp, ppt, or transaction size
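The workload parameters above can be made concrete with a small generator. This is a guess at the shape of one microbenchmark transaction, not the authors' benchmark code; run_transaction and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One microbenchmark transaction: write wpp times to each of ppt distinct
   pages, starting at page index first_page. Returns the number of writes
   issued, which is wpp * ppt. */
size_t run_transaction(unsigned char *heap, size_t page_size,
                       size_t first_page, size_t ppt, size_t wpp)
{
    size_t writes = 0;
    assert(wpp <= page_size);           /* each write goes to its own byte */
    for (size_t p = 0; p < ppt; p++) {
        volatile unsigned char *page = heap + (first_page + p) * page_size;
        for (size_t w = 0; w < wpp; w++) {
            page[w] = (unsigned char)(w + 1);
            writes++;
        }
    }
    return writes;
}
```

With wpp = 16 and ppt = 8 this issues 128 writes on 8 pages; with wpp = 4 and ppt = 32 it issues the same 128 writes on 32 pages, which is the kind of contrast that separates the PT-based approaches (sensitive to ppt) from the emulation-based ones (sensitive to the total number of writes).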

Page 21: Hypervisor-Assisted Application  Checkpointing  for High Availability

PT vs PTxen

PTxen shows a 10x speedup
PT and PTxen are both impacted by ppt

[Chart: time in seconds (0 to 14) against ppt (1 to 8) for PT and PTxen; for each approach the wpp = 4, 8, and 16 curves overlap.]

Page 22: Hypervisor-Assisted Application  Checkpointing  for High Availability

Emulation vs Emulxen

Emulation-based approaches are impacted by transaction size
Emulxen shows a 4x speedup

[Chart: time in seconds (0 to 45) against transaction size for emul and emulxen at wpp = 4, 8, and 16.]

Transaction size : 16 32 48 64 80 96 112 128
ppt (wpp=16) : 1 2 3 4 5 6 7 8
ppt (wpp=8) : 2 4 6 8 10 12 14 16
ppt (wpp=4) : 4 8 12 16 20 24 28 32

Page 23: Hypervisor-Assisted Application  Checkpointing  for High Availability

PT Call-Flow

[Diagram: two call flows across the User, OS, and Hypervisor layers. Pure user-level: two mprotect() calls, each with a TLB flush, and a page fault plus signal for every dirty page. Xen-assisted: declare(), a page fault taken in the hypervisor for every dirty page, and a TLB flush.]

Page 24: Hypervisor-Assisted Application  Checkpointing  for High Availability

Evaluation

Source code from the book “Data Structures and Algorithm Analysis in C (Second Edition)” by Mark Allen Weiss

Page 25: Hypervisor-Assisted Application  Checkpointing  for High Availability

Data Structures, OPS_PER_T=1

                                                Writes                Pages
Data structure                          Op      avg       min  max    avg     min  max
aa (AA-trees)                           insert  21.9836   5    63     4.9481  1    7
                                        delete  20.4053   2    63     6.0642  1    9
avl (AVL trees)                         insert  30.5609   6    39     5.1021  1    9
bin (Binomial queues)                   insert  27.9985   25   64     2.0735  1    10
dsl (Deterministic skip list)           insert  10.4176   7    23     3.1421  1    5
hashquad (Quadratic probing hash)       insert  11.3983   2    47023  1.0146  1    68
hashsepchain (Separate chaining hash)   insert  4         4    4      1.9696  1    3
leftheap (Leftist heap)                 insert  23.5673   5    31     3.0665  1    6
                                        delete  34.0132   0    59     9.2518  0    15
heap (Binary heaps)                     insert  2.8693    2    14     2.4009  1    5
                                        delete  12.5523   2    15     2.7349  1    5
list (Linked list)                      insert  4         4    4      1.0029  1    2
                                        delete  1         1    1      1       1    1
queue (Queues)                          insert  3         3    3      1.8984  1    2
                                        delete  2         2    2      1       1    1
rb (Red black tree)                     insert  13.7011   10   28     4.6102  1    9
splay (Splay trees)                     insert  20.0851   4    5262   4.7745  1    34
                                        delete  7.7604    3    15001  3.0258  1    40
tree (Binary search tree)               insert  720.7852  4    1436   5.4576  1    10
                                        delete  1.7139    0    3      1.7139  0    3

Page 26: Hypervisor-Assisted Application  Checkpointing  for High Availability

Evaluation Results 1

[Charts: four panels of time in seconds for mprotect (PT), emul, emulxen, and mprotxen (PTxen). Panel 1: queue insert, queue delete, list insert, list delete (axis 0 to 0.4). Panel 2: hashquad insert, hashsepchain insert (axis 0 to 0.5). Panel 3: dsl insert (axis 0 to 0.7). Panel 4: bin insert (axis 0 to 1).]

Page 27: Hypervisor-Assisted Application  Checkpointing  for High Availability

Evaluation Results 2

[Charts: four panels of time in seconds for mprotect (PT), emul, emulxen, and mprotxen (PTxen). Panel 1: splay insert, splay delete (axis 0 to 1). Panel 2: aa insert, aa delete (axis 0 to 1.4). Panel 3: tree insert, tree delete (axis 0 to 25). Panel 4: leftheap insert, leftheap delete (axis 0 to 2).]

Page 28: Hypervisor-Assisted Application  Checkpointing  for High Availability

Evaluation Results 3

[Charts: two panels of time in seconds for mprotect (PT), emul, emulxen, and mprotxen (PTxen). Panel 1: heap insert, heap delete (axis 0 to 0.6). Panel 2: rb insert, avl insert (axis 0 to 1.2).]

Scanxen shows an almost constant 2.5 sec across all benchmarks

Page 29: Hypervisor-Assisted Application  Checkpointing  for High Availability

Evaluation Summary

Emulxen has up to a 4x speedup compared to emulation
PTxen has up to a 13x speedup compared to PT

[Chart: speedup (1 = 100%, axis 0 to 16) of emulxen and mprotxen (PTxen) for queue-insert, queue-delete, list-insert, list-delete, hashquad-insert, hashsepchain-delete, dsl-insert, bin-insert, splay-insert, splay-delete, aa-insert, aa-delete, tree-insert, tree-delete, leftheap-insert, leftheap-delete, heap-insert, heap-delete, rb-insert, and avl-insert.]

Page 30: Hypervisor-Assisted Application  Checkpointing  for High Availability

Transaction Aggregation

OPT (operations per transaction) = 1
– A single operation (e.g. an insert or a delete)

OPT=5
– Multiple operations merged into one transaction
– # of writes increases linearly
– # of unique pages touched remains the same in most cases

It should benefit PT-based approaches
– Because of their heavy dependence on ppt
– Details in the paper
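The effect of aggregation on the two cost drivers can be shown with a small model. The footprint representation (a 64-page bitmask plus a write count per operation) and all names here are assumptions chosen for illustration; they are not taken from the paper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Footprint of one operation: the pages it touched, as a bitmask over a
   heap of up to 64 pages, and the number of writes it performed. */
struct op_footprint {
    uint64_t page_mask;
    unsigned writes;
};

struct txn_cost {
    unsigned writes;        /* drives the emulation-based approaches */
    unsigned unique_pages;  /* drives the PT-based approaches */
};

/* Count set bits by clearing the lowest one each round. */
static unsigned popcount64(uint64_t x)
{
    unsigned n = 0;
    for (; x != 0; x &= x - 1)
        n++;
    return n;
}

/* Cost of merging ops[0..n_ops) into a single transaction: writes add up,
   while pages touched by several operations are counted once. */
struct txn_cost aggregate(const struct op_footprint *ops, size_t n_ops)
{
    uint64_t pages = 0;
    struct txn_cost cost = {0, 0};
    assert(ops != NULL || n_ops == 0);
    for (size_t i = 0; i < n_ops; i++) {
        pages |= ops[i].page_mask;
        cost.writes += ops[i].writes;
    }
    cost.unique_pages = popcount64(pages);
    return cost;
}
```

In this model, five operations that each write four times into the same two or three pages produce twenty writes but only three unique pages, so the per-page cost is spread over five operations while the per-write cost is unchanged.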

Page 31: Hypervisor-Assisted Application  Checkpointing  for High Availability

Conclusion

A family of application-checkpointing techniques was introduced

Emulation-based techniques
– Useful for small transactions (fewer writes)

Hypervisor-assisted application checkpointing
– 4x to 13x faster than the user-space implementations

Page 32: Hypervisor-Assisted Application  Checkpointing  for High Availability


Thank you!

Page 33: Hypervisor-Assisted Application  Checkpointing  for High Availability


Extra Slides

Page 34: Hypervisor-Assisted Application  Checkpointing  for High Availability

Emulation vs PT

Emulation-based approaches are good for small transactions
– The breakeven points are roughly wpp=5 and wpp=1.3

[Chart 1: time in seconds (0 to 35) against writes per page (1 to 12) for emul and mprotect at ppt = 4 and ppt = 8.]

[Chart 2: time in seconds (0 to 3.5) against wpp (1 to 5) for emulxen and mprotxen at ppt = 4 and ppt = 8.]

Note the scale difference

Page 35: Hypervisor-Assisted Application  Checkpointing  for High Availability

Scanxen vs PT

For a small buffer and large ppt, scanxen might be better
– Not the case in our experiments

[Charts: two panels of time in seconds against pages per transaction (1 to 8), one with axis 0 to 14 and one with axis 0 to 1.4, comparing PT and PTxen with scanxen at heap sizes of 1MB to 5MB and of 40KB, 80KB, and 120KB.]

Note the scale difference

Page 36: Hypervisor-Assisted Application  Checkpointing  for High Availability

Scanxen vs Emulation

Scanxen might be better than emulation
– For big transactions

[Chart: time in seconds (0 to 45) against transaction size (16 to 128) for scanxen, emul, and emulxen at wpp = 4.]

Page 37: Hypervisor-Assisted Application  Checkpointing  for High Availability

[Chart: time in seconds (0 to 0.25) for No-HA and mprotxen on queue-insert, list-insert, hashquad-insert, dsl-insert, splay-insert, aa-insert, tree-insert, leftheap-insert, heap-insert, and rb-insert.]

Page 38: Hypervisor-Assisted Application  Checkpointing  for High Availability

[Charts: two panels comparing NoTLBFlush, TLBFlush, and AreaFlush on queue and list insert and delete. Panel 1: PTxen (axis 0 to 0.06). Panel 2: scanxen (axis 0 to 3).]

Page 39: Hypervisor-Assisted Application  Checkpointing  for High Availability

Operations per transaction
– OPT=5, merging transactions
• No impact on emulation-based approaches
• Some slowdown for scanxen
– Merging transactions
• Total # of pages written goes down effectively
• PT and PTxen become much better than emul/emulxen
• Still a 13x improvement from PT to PTxen

Page 40: Hypervisor-Assisted Application  Checkpointing  for High Availability

Evaluation

[Charts: two panels of time in seconds (0 to 1.2) for mprotect, emul, emulxen, and mprotxen on rb insert and avl insert. Panel 1: OPT=5, 2000 transactions. Panel 2: OPT=1, 10000 transactions.]

Page 41: Hypervisor-Assisted Application  Checkpointing  for High Availability

Bandwidth: Amount

[Charts: amount of data sent in KB for queue-insert, list-insert, hashquad-insert, dsl-insert, splay-insert, aa-insert, tree-insert, leftheap-insert, heap-insert, and rb-insert. Panel 1: mprotect-based (axis 0 to 400000). Panel 2: emul-based (axis 0 to 6000).]

Note that tree-insert in the emul-based chart is 56311.34375, which is off the scale.

Emul-based approaches mostly send less than 2MB
– No ‘diff’ process for emul-based approaches

Page 42: Hypervisor-Assisted Application  Checkpointing  for High Availability

Bandwidth: Time

[Charts: time in seconds for queue-insert, list-insert, hashquad-insert, dsl-insert, splay-insert, aa-insert, tree-insert, leftheap-insert, heap-insert, and rb-insert. Panel 1: mprotxen, mprotect, scanxen (axis 0 to 0.09). Panel 2: emulxen, emul (axis -0.005 to 0.02).]

Emul-based approaches mostly take less than 5ms

Page 43: Hypervisor-Assisted Application  Checkpointing  for High Availability

Bandwidth: Percentage

[Charts: percentage for queue-insert, list-insert, hashquad-insert, dsl-insert, splay-insert, aa-insert, tree-insert, leftheap-insert, heap-insert, and rb-insert. Panel 1: mprotxen (axis 0 to 70). Panel 2: emulxen, emul, mprotect, scanxen (axis -2 to 12).]

Relatively small fraction
– Except for PTxen, because its total runtime is so short

Page 44: Hypervisor-Assisted Application  Checkpointing  for High Availability

Microbenchmark

[Chart: scanxen, PT, and PTxen]

Page 45: Hypervisor-Assisted Application  Checkpointing  for High Availability

[Chart: emulxen, emul, and scanxen]

Page 46: Hypervisor-Assisted Application  Checkpointing  for High Availability

[Chart: time in seconds (0 to 60) against transaction size (Tsize, 16 to 128) for mprotect (PT), mprotxen (PTxen), scanxen, emul, and emulxen at wpp = 4.]

Page 47: Hypervisor-Assisted Application  Checkpointing  for High Availability

Microbenchmark

[Diagram: writes landing on dirty pages in the transactional heap between Tstart() and Tend(). With PT, the dirty pages shown take three separate mprotect() calls; with PTxen, a single PTxen() call covers them.]

Page 48: Hypervisor-Assisted Application  Checkpointing  for High Availability

[Diagram: the main process passes each dirty page to a diff process, which sends the diff over the network to the backup process.]
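The diff step on this slide can be sketched as a byte-run comparison between the previous and current contents of a dirty page. The run-based format and the names diff_page and apply_diff are assumptions; the slides do not say how the diff is encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One run of consecutive bytes that differ between two versions of a page. */
struct diff_run {
    size_t offset;
    size_t length;
};

/* Compare the previous and current contents of a dirty page and record the
   changed byte runs. Stops after max_runs runs, so a caller that gets back
   max_runs should assume the diff may be incomplete and send the whole page
   instead. Returns the number of runs recorded. */
size_t diff_page(const unsigned char *old_page, const unsigned char *new_page,
                 size_t page_size, struct diff_run *runs, size_t max_runs)
{
    size_t n = 0, i = 0;
    assert(old_page != NULL && new_page != NULL && runs != NULL);
    while (i < page_size && n < max_runs) {
        size_t start;
        if (old_page[i] == new_page[i]) {
            i++;
            continue;
        }
        start = i;
        while (i < page_size && old_page[i] != new_page[i])
            i++;
        runs[n].offset = start;
        runs[n].length = i - start;
        n++;
    }
    return n;
}

/* On the backup side, patch the stale copy of the page with the changed
   bytes. Here the new bytes are read from new_page; over a network they
   would travel alongside each run. */
void apply_diff(unsigned char *backup_page, const unsigned char *new_page,
                const struct diff_run *runs, size_t n_runs)
{
    for (size_t k = 0; k < n_runs; k++)
        memcpy(backup_page + runs[k].offset, new_page + runs[k].offset,
               runs[k].length);
}
```

Sending runs instead of whole pages trades CPU time in the diff process for network bandwidth. The emulation-based approaches already hold the exact writes in their buffer, which matches the earlier note that they need no diff process.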