Undo tech overview_201410

54
Copyright Undo Ltd, 2014 Introduction to UndoDB Greg Law Co-founder and CEO, Undo Ltd

description

Overview of the UndoDB debugger for C/C++ programs on Linux

Transcript of Undo tech overview_201410

Page 1: Undo tech overview_201410

Copyright Undo Ltd, 2014

Introduction to UndoDB

Greg Law

Co-founder and CEO, Undo Ltd

Page 2: Undo tech overview_201410

Copyright Undo Ltd, 2014

Undo Overview

•Founded in 2005, with 25 man years development in debug technology Founder : Greg Law, Ph.D , Acorn, Solarflare, Nexwave,etc

Founder: Julian Smith, Ph.D, Acorn, E-14, Broadcom, etc

•Well funded with $1.25 million raised in round 3. − Backed financially by Cambridge Angels and Jaan Tallinn, co-founder Skype

− Currently 11 full time staff

•Market Momentum − Revenue and headcount doubled in 2012 and 2013.

− OEMs with ARM (DS5 v5.16) and Rogue Wave (Totalview)

− Selected as Gartner Cool Vendor in Application Development for 2014.

− Customers include : NASA, Lawrence Livermore Labs, Cadence Design

Systems, Mentor Graphics, Synopsys Inc.

Page 3: Undo tech overview_201410

Copyright Undo Ltd, 2014

Debugging Involves Backwards Reasoning

“Reason back from the state of the crashed program to determine

what could have caused this. Debugging involves backwards

reasoning, like solving murder mysteries. Something impossible

occurred, and the only solid information is that it really did occur. So

we must think backwards from the result to discover the reasons.”

Kernighan & Pike's The Practice of Programming

Page 4: Undo tech overview_201410

Copyright Undo Ltd, 2014

Debugging Involves Backwards Reasoning

“Reason back from the state of the crashed program to determine

what could have caused this. Debugging involves backwards

reasoning, like solving murder mysteries. Something impossible

occurred, and the only solid information is that it really did occur. So

we must think backwards from the result to discover the reasons.”

Kernighan & Pike's The Practice of Programming

Page 5: Undo tech overview_201410

Copyright Undo Ltd, 2014

Debugging Involves Backwards Reasoning

“Reason back from the state of the crashed program to determine

what could have caused this. Debugging involves backwards

reasoning, like solving murder mysteries. Something impossible

occurred, and the only solid information is that it really did occur. So

we must think backwards from the result to discover the reasons.”

Kernighan & Pike's The Practice of Programming

• Regular debuggers give you pause/continue

Page 6: Undo tech overview_201410

Copyright Undo Ltd, 2014

Debugging Involves Backwards Reasoning

“Reason back from the state of the crashed program to determine

what could have caused this. Debugging involves backwards

reasoning, like solving murder mysteries. Something impossible

occurred, and the only solid information is that it really did occur. So

we must think backwards from the result to discover the reasons.”

Kernighan & Pike's The Practice of Programming

• Regular debuggers give you pause/continue

• UndoDB gives you a rewind button, etc

• CCTV for your code

Page 7: Undo tech overview_201410

Copyright Undo Ltd, 2014

Reversible Debugging for Real Code

A holy-grail of tools research for decades.

UndoDB is the first and only reversible debugger that is effective on

real-world, complex software

• Used on many of the world’s most complex software projects

Scientific supercomputing clusters (NASA Ames, LLNL, etc)

EDA, Enterprise (several fortune 500 customers)

High performance, scalable, for compiled code on Linux

"I found the idea of the product amazing and a boon to my

productivity... I have already been able to fix a deadlock

that was driving me crazy for a week in only ten minutes!”

Page 8: Undo tech overview_201410

Copyright Undo Ltd, 2014

Performance is key

Patented checkpoint + replay approach

• Exploit the natural determinism of computers

• Reduces time/space overheads by many orders of magnitude

• Highly optimised JIT binary translation

Benchmark

• Time to ‘gzip’ a 16MB* file

(gzip is mostly CPU bound, with some IO)

* GDB times extrapolated from 16K file

Native UndoDB GDB “process record”

Time 1.49s 2.61s (1.75x) 21 hours (>50,000x)

Space N/A 17.8MB 63GB

Page 9: Undo tech overview_201410

Copyright Undo Ltd, 2014

The Event Log

All non-deterministic events (syscalls, signals,

thread switches) stored in the Event Log

− In record mode: operations executed as normal, and

results stored in event log

− In replay mode: operations not executed, but results

synthesized from the contents of the event log

Replay is therefore completely deterministic

No interaction with outside world in replay mode

Page 10: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

Page 11: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

Page 12: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

Page 13: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

Page 14: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

Page 15: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

Page 16: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

Page 17: Undo tech overview_201410

Copyright Undo Ltd, 2014

undodb-gdb: error: UndoDB's

event log is full, so no more

history can be recorded.

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

When it’s full:

Page 18: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

When it’s full:

Go back in time

reverse-next

Page 19: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

When it’s full:

Go back in time

reverse-next

Page 20: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

When it’s full:

Go back in time, or

Increase its size

undodb-set-max-event-log-size 768M

Page 21: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

When it’s full:

Go back in time, or

Increase its size, or

Select circular mode

undodb-set-event-log-mode circular

Page 22: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

When it’s full:

Go back in time, or

Increase its size, or

Select circular mode

undodb-set-event-log-mode circular; continue

Page 23: Undo tech overview_201410

Copyright Undo Ltd, 2014

The event log

Event log is a fixed size

in memory

Fills up as your program

runs

When it’s full:

Go back in time, or

Increase its size, or

Select circular mode

undodb-gdb: error: UndoDB's event log is full...

Page 24: Undo tech overview_201410

Copyright Undo Ltd, 2014

The Event Log

Event log is a fixed size

in memory

Fills up as your program

runs

When it’s full:

Go back in time, or

Increase its size, or

Select circular mode

Page 25: Undo tech overview_201410

Copyright Undo Ltd, 2014

Snapshots

Page 26: Undo tech overview_201410

Copyright Undo Ltd, 2014

Snapshots

Page 27: Undo tech overview_201410

Copyright Undo Ltd, 2014

Snapshots

Page 28: Undo tech overview_201410

Copyright Undo Ltd, 2014

Snapshots

Page 29: Undo tech overview_201410

Copyright Undo Ltd, 2014

Threads

Page 30: Undo tech overview_201410

Copyright Undo Ltd, 2014

Threads

Page 31: Undo tech overview_201410

Copyright Undo Ltd, 2014

Threads

Scheduling is serialized, but preemptive

− Many race conditions can be reproduced

• But not all – e.g. missing or incorrect memory barrier

− Bigger impact on performance

Record: OS scheduler selects which thread runs next

Replay: UndoDB forces scheduling to mirror record

Page 32: Undo tech overview_201410

Copyright Undo Ltd, 2014

UndoDB Save/Load

Collaborate

Save a recording and send it to a colleague.

Load the recording onto another machine

Defer and prioritise

Capture now, debug later Save a recording for analysis later.

Capture sporadic/intermittent bugs and save them for later.

Protect against losing your data

Resilient to power or system failures -- never reproduce a bug a second time.

Page 33: Undo tech overview_201410

Copyright Undo Ltd, 2014

Record In-The-Field Failures

Undo Flight Recorder is a new product to record

failures at the customer site

A library you link into your code and ship to your

customers

By default, library is dormant. C API allows your

program to enable recording

Page 34: Undo tech overview_201410

Copyright Undo Ltd, 2014

Flight Recorder

Your

Executable Ship to customer

Page 35: Undo tech overview_201410

Copyright Undo Ltd, 2014

Flight Recorder

Your

Executable Ship to customer

Page 36: Undo tech overview_201410

Copyright Undo Ltd, 2014

Flight Recorder

Your

Executable

undofr.so

Ship to customer

Page 37: Undo tech overview_201410

Copyright Undo Ltd, 2014

Flight Recorder

Your

Executable

undofr.so

Ship to customer

recording Back to you – load into UndoDB

Page 38: Undo tech overview_201410

Copyright Undo Ltd, 2014

Good Kinds of Bug for UndoDB

1. Long run times

2. Frequently-called functions

3. Intermittent bugs

4. Stack corruption

5. Memory leaks

6. Real-time network protocols

7. Race conditions

8. Data structure corruption

9. Dynamic code

Page 39: Undo tech overview_201410

Copyright Undo Ltd, 2014

Good Kinds of Bug for UndoDB

1. Long run times

2. Frequently-called functions

3. Intermittent bugs

4. Stack corruption

5. Memory leaks

6. Real-time network protocols

7. Race conditions

8. Data structure corruption

9. Dynamic code

Page 40: Undo tech overview_201410

Copyright Undo Ltd, 2014

Good Kinds of Bug for UndoDB

1. Long run times

2. Frequently-called functions

3. Intermittent bugs

4. Stack corruption

5. Memory leaks

6. Real-time network protocols

7. Race conditions

8. Data structure corruption

9. Dynamic code

Demo #1

(square roots)

Demo #1

(square roots)

Page 41: Undo tech overview_201410

Copyright Undo Ltd, 2014

Good Bugs for UndoDB

1. Long run times

2. Frequently-called functions

3. Intermittent bugs

4. Stack corruption

5. Memory leaks

6. Real-time network protocols

7. Race conditions

8. Data structure corruption

9. Dynamic code

Demo #2

(stack smash)

Page 42: Undo tech overview_201410

Copyright Undo Ltd, 2014

Features and benefits

Works on unmodified Linux x86 and ARM systems and programs

• No kernel patches or modules, or special hardware requirements

• No recompilation or special libraries for program being debugged

• No restrictions on application to debug: threads, signals, shared memory, etc

Greatest value is on the bugs where:

• Effects manifest themselves a long time after underlying bug

• Non-deterministic “once in a blue moon” failures

• Large, unfamiliar codebases (“how did that happen?”)

Full-featured debugger

• Attach, watchpoints, etc.

• Fits seamlessly into your existing workflow (gdb / Eclipse / Emacs, etc)

Page 43: Undo tech overview_201410

Copyright Undo Ltd, 2014

Competition Time

Get the code at: http://undo-software.com/bugmug

Can you find the bug?

#bugmug @undosoft

[email protected]

Page 44: Undo tech overview_201410

Copyright Undo Ltd, 2014

Support

Questions or problems: [email protected]

If you suspect a bug in UndoDB:

− Run with --undodb-asserts-1 on the command line, or

set UNDODB_debug_level_internal=2 env var to enable

internal logging/extra checking

− On UndoDB crash, files a la undodb_log.1234 will be

created; send these as attachments to Undo

− Alternatively, use maint-undodb-dump-internal-log

command to dump the logs immediately

Page 45: Undo tech overview_201410

Copyright Undo Ltd, 2014

Part II UndoDB master class

Tips and tricks for better debugging

Page 46: Undo tech overview_201410

Copyright Undo Ltd, 2014

Timeline

“What time is it?”, or “Goto” a particular time:

undodb-get-n

undodb-goto-n

undodb-show-event-log-extent

Precise control of timeline:

undodb-set-bookmark and

undodb-goto-bookmark

Tip: Use these commands to binary chop

Page 47: Undo tech overview_201410

Copyright Undo Ltd, 2014

Record Mode and Replay Mode

When you first run, you’re

in Record Mode

If you step forwards, you

stay in Record Mode

Issue a reverse operation

puts you in Replay Mode

Page 48: Undo tech overview_201410

Copyright Undo Ltd, 2014

Record/replay continued

Replaying forwards to the end of the event log takes

you back into record mode (debugger will stop)

Use undodb-goto-record-mode to jump straight to

record mode

− As with undodb-goto-n, will not hit breakpoints

State (registers, memory) can be modified in record

mode, but not in replay mode

User issued function-calls (e.g. ‘call’ command from

gdb prompt) can be issued from replay mode or record

mode (see below)

Page 49: Undo tech overview_201410

Copyright Undo Ltd, 2014

The Event Log

All “non-deterministic” inputs are recorded

− System-calls

− Thread-switches

− Signals

− Non-deterministic instructions

− Shared memory reads (incl. DMA)

List events

maint-undodb-show-events b,e

maint-undodb-show-eventstats b,e

Page 50: Undo tech overview_201410

Copyright Undo Ltd, 2014

More on the Event Log

Increase the event-log size dynamically undodb-set-max-event-log-size

Stop when the event log gets full (default) undodb-event-log-mode straight

Record forever with a circular event log undodb-event-log-mode circular

Page 51: Undo tech overview_201410

Copyright Undo Ltd, 2014

Performance tuning

Start recording later:

− Start with --undodb-defer-recording option

− Then when you’re ready: undodb-enable-record

− Note: you cannot go back to record recording was enabled

Overview of performance metrics

− maint-undodb-show-exec-summary

− Make sure your instrumentation heap is big enough:

--undodb-instr-heapsize 512M

− Look out for executing code in writable memory

approx 2x slower than executing for read-only memory

Page 52: Undo tech overview_201410

Copyright Undo Ltd, 2014

Snapshots

List existing snapshots

maint-undodb-show-snapshots

Beware memory consumption

e.g. --undodb-snapshots 5

Note that reverse opterations may be slower

Hint to create a snapshot soon

maint-undodb-snapshot

(will create a snapshot at next basic-block boundary)

Page 53: Undo tech overview_201410

Copyright Undo Ltd, 2014

More tips and tricks

No symbols/no backtrace?

reverse-stepi or reverse-finish

undodb-goto-n -1000

Set default command-line options ~/.undodb-gdb.defaults

Calling functions in debuggee/inferior

− Side-effects are discarded after function is run

Function is run in a temporary process

− Use our `infcall’ gdb patch to make this bullet-proof

e.g. break foo.c:42 if !memcmp( a, b, 32)

Page 54: Undo tech overview_201410

Copyright Undo Ltd, 2014

More tips and tricks

Autotrace feature

Attach to an executable by name rather than pid

--undodb-autotrace /path/to/myexe cmd

cmd is the command as executed; if it or a child process exec’s

/path/to/myexec then undodb will attach to that

The path must be complete path or a glob

Work differently

− Don’t try to anticipate where to put breakpoints; run until the

bug happens, then “think backwards”