Replay Debugging for Distributed Applications D. Geels, G. Altekar, S. Shenker and I. Stoica...


Page 1

Replay Debugging for Distributed Applications

D. Geels, G. Altekar, S. Shenker and I. Stoica

Presented by: Olusanya Soyannwo

Page 2

Outline

• Introduction
• Design
• Challenges
• Limitations
• Evaluation
• Related Work
• Conclusion

Page 3

Introduction

Goal
• Find non-deterministic failures in deployed, distributed applications

Motivation
• Growth of distributed applications
• Limitations of existing tools
  • Network inconsistency
  • Inadequacy of simulations
• Reproduction difficulty

Page 4

Introduction

Deterministic Replay
• Remote debugging latency
• Continuous interaction
• Connection problems

Continuous Logging
• Performance concerns

Consistent Group Replay
• Multiple snapshots

Mixed Environment
• Determine (non-)cooperating peers

Page 5

Introduction

Liblog
• Provides consistent replay in mixed environments
• No additional hardware or patches
• Works on unmodified C/C++ applications
• Simple
  • Startup script
  • GDB interface

Page 6

Design

Shared Library Implementation
• Intercepts calls to libc and vice versa
• Less complicated

Message Tagging and Capture
• Log messages
• Time stamps

Central Replay
• Local replay
• Network bandwidth, matching h/w, data accessibility
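The interception and message-tagging ideas on this slide can be illustrated with a small sketch. It is not Liblog's implementation (Liblog is a shared library for C/C++ programs); it only mimics the idea in Python. The names `CallLog`, `intercept`, and `LamportClock` are hypothetical, and the use of a Lamport-style logical clock for the "time stamps" is an assumption, since the slide does not say which kind of timestamp is meant.

```python
import time


class CallLog:
    """Records results of nondeterministic calls, or replays them in order.

    Hypothetical illustration only; not Liblog's actual log format.
    """

    def __init__(self, entries=None):
        # No entries means record mode; given entries means replay mode.
        self.replaying = entries is not None
        self.entries = list(entries) if entries is not None else []
        self.cursor = 0

    def intercept(self, name, real_call):
        """Return a wrapper that routes real_call through the log."""
        def wrapper(*args):
            if self.replaying:
                logged_name, result = self.entries[self.cursor]
                if logged_name != name:
                    raise RuntimeError(
                        f"replay diverged: log has {logged_name}, got {name}")
                self.cursor += 1
                return result  # the real call is never made during replay
            result = real_call(*args)
            self.entries.append((name, result))
            return result
        return wrapper


class LamportClock:
    """Logical clock used to tag messages (assumed timestamp scheme)."""

    def __init__(self):
        self.time = 0

    def tag(self, payload):
        # Advance the clock and attach it to the outgoing message.
        self.time += 1
        return {"clock": self.time, "payload": payload}

    def receive(self, message):
        # Move past the sender's clock so the receive is ordered after the send.
        self.time = max(self.time, message["clock"]) + 1
        return message["payload"]


if __name__ == "__main__":
    recording = CallLog()
    now = recording.intercept("time", time.time)
    original = now()

    replay = CallLog(recording.entries)
    replayed_now = replay.intercept("time", time.time)
    assert replayed_now() == original  # replay returns the logged value
```

During replay the wrapper returns the logged result instead of calling the real function, which is what makes the run repeatable; the clock tags give a consistent ordering of sends and receives across processes.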

Page 7

Challenges

Multi-threaded applications
• Problem: Shared memory
• Solution: Implement new scheduler

Illegal memory accesses
• Problem: Heap/stack corruption
• Solution: Zero out memory*

TCP limitation
Querying for non-cooperating peers
GDB uniprocess restriction

Page 8

Limitations

• Log storage
• Host requirements
• Scheduling semantics
• Network overhead
• Limited consistency
• Completeness
• Soundness

Page 9

Evaluation

Experiments
• Dual 3.06 GHz Pentium 4 Xeon, 512K L2 cache
• 2 GB of RAM, 80 GB 7500 rpm ATA/100 disk
• Broadcom 1000TX gigabit Ethernet

Page 10

Evaluation

Page 11

Evaluation

Page 12

Evaluation

Page 13

Conclusion

Related Work
• Liblog is similar to several others (DejaVu, Jockey, Flashback)

Useful for select applications
Needs a lot of enhancements

Page 14

Ideas/Issues

• Useful for simulations
• Restricted to non-resource-intensive applications
• No significant comparison
• How long can logging occur for?
  • 4MB/hr
• Inadequate citations/references
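A rough back-of-the-envelope answer to the "how long can logging occur for?" question can be sketched as follows. It assumes the 4MB/hr figure is a constant log growth rate, that the whole 80 GB disk from the Evaluation slide is available for logs, and that the units are decimal; none of these assumptions is stated on the slides, and the helper name `hours_until_full` is hypothetical.

```python
MB = 10**6  # decimal megabyte (assumption)
GB = 10**9  # decimal gigabyte (assumption)


def hours_until_full(disk_bytes, log_bytes_per_hour):
    """Hours of continuous logging before the disk fills at a constant rate."""
    return disk_bytes / log_bytes_per_hour


if __name__ == "__main__":
    hours = hours_until_full(80 * GB, 4 * MB)
    print(f"{hours:.0f} hours, about {hours / 24:.0f} days")
```

Under these assumptions the disk would last on the order of 20,000 hours; a busier application with a higher log rate would shorten that proportionally.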