Safe Programming of Asynchronous Interaction: Can we do it for real? Shaz Qadeer


Page 1: Safe Programming of Asynchronous Interaction: Can we do it for real? Shaz Qadeer

Safe Programming of Asynchronous Interaction: Can we do it for real?

Shaz Qadeer
Research in Software Engineering
Microsoft Research

Page 2

Asynchronous interaction

• Collection of state machines communicating asynchronously via message buffers
  – distributed algorithms
  – cloud infrastructure, services, and applications
  – event-driven JavaScript/AJAX programs
  – device drivers
  – …

Page 3

Challenging characteristics

• Decomposition of a logical task into pieces

• Temporally overlapped execution of tasks

• Failure tolerance is important

• Coordination via protocols

Page 4

Safety-critical is so 20th century

• Software should just “work”
  – as cloud computing becomes common
  – as devices get embedded into everyday life

• First-order concerns
  – software reliability
  – programming, testing, and debugging productivity
  – cost of achieving reliability and productivity

• Need programming techniques to improve reliability and productivity

Page 5

Outline

• Formal design of USB device driver stack in Windows 8

• Challenges (or inspiration) for the future

• Domain-specific language, compiler, and verifier for protocol programming

Page 6

What is USB?

• Universal Serial Bus

• Primary mechanism for connecting peripherals to PCs
  – 2 billion USB devices sold every year (as of 2008)
  – voted most important PC innovation of all time (PC Magazine)

• Timeline: USB 1.0 (1996), USB 2.0 (2000), USB 3.0 (2008)

Page 8

Design methodology (Aull-Gupta)

[Diagram: design flow. Boxes, as labeled on the slide:
  – State Machine in Visio
  – State Table, Transitions and State Entry Functions in C
  – Operations in C
  – State Machine Engine in C
  – State Table, Transitions and State Entry Functions in Zing
  – State Machine Engine in Zing
  – Document Operations, Rules and Assumptions
  – Program Operations, Rules and Assumptions in Zing
The label "Script" appears twice in the diagram.]

Page 9

Assumptions/Guarantees

• Upon calling TimerStart(), machine could receive TimerFired event
  – S1, S2, and S3 need to handle TimerFired

• Upon receiving TimerFired, machine will not receive TimerFired
  – S4 does not need to handle TimerFired

[Diagram: states S1 (labeled TimerStart()), S2, S3, and S4, with transitions labeled X, TimerFired, and Y.]

Page 10

Timer state machine

[Diagram: Timer state machine.
  – States: WaitingForCommand, StartingTimer, WaitingForTimerToExpire, StoppingTimer, SignallingTimerCompletion, WaitingForTimerToFlushOnStop
  – Functions attached to states: EmptyFunction(), UsbTimerStart(), UsbTimerStop(), SignalTimerCompletion()
  – Events on transitions: StartTimer, StopTimer, TimerFired, OperationSuccess, OperationFailure]

Page 11

Zing error trace

Check failed
*******************************************************************************
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___StartTimer')
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___StartTimer')
AttributeEvent: Handled Event ___StartTimer, Old State: ___WaitingForCommand, New State: ___StartingTimer
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___TimerFired')
Send(chan='Microsoft.Zing.Application+___EVENT_CHAN(12)', data='___StopTimer')
AttributeEvent: Handled Event ___OperationSuccess, Old State: ___StartingTimer, New State: ___WaitingForTimerToExpire
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___TimerFired')
AttributeEvent: Handled Event ___TimerFired, Old State: ___WaitingForTimerToExpire, New State: ___SignallingTimerCompletion
AttributeEvent: Handled Event ___OperationSuccess, Old State: ___SignallingTimerCompletion, New State: ___WaitingForCommand
Receive(chan='Microsoft.Zing.Application+___EVENT_CHAN(12', data='___StopTimer')
AttributeEvent: HSM-1: Unhandled Event ___StopTimer, State ___WaitingForCommand]

Error in state:
Zing Assertion failed:
Expression: false
Comment: Unhandled Event

Depth on error 208
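The failing interleaving in this trace can be replayed with a small Python sketch. The transition table holds only the four transitions that the trace itself confirms; the real Timer state machine has more states and transitions. The names `replay`, `TRANSITIONS`, and `BAD_ORDER` are hypothetical and not part of Zing.

```python
class UnhandledEvent(Exception):
    """Raised when the current state has no transition for an event."""


# Only the transitions confirmed by the trace; the real machine has more.
TRANSITIONS = {
    ("WaitingForCommand", "StartTimer"): "StartingTimer",
    ("StartingTimer", "OperationSuccess"): "WaitingForTimerToExpire",
    ("WaitingForTimerToExpire", "TimerFired"): "SignallingTimerCompletion",
    ("SignallingTimerCompletion", "OperationSuccess"): "WaitingForCommand",
}


def replay(events, state="WaitingForCommand"):
    """Feed events to the machine in order and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise UnhandledEvent(f"Unhandled Event {event}, State {state}")
        state = TRANSITIONS[key]
    return state


# Order in which the trace handles events: the timer fires and the machine
# returns to WaitingForCommand before the queued StopTimer is received.
BAD_ORDER = ["StartTimer", "OperationSuccess", "TimerFired",
             "OperationSuccess", "StopTimer"]

try:
    replay(BAD_ORDER)
except UnhandledEvent as err:
    print(err)  # Unhandled Event StopTimer, State WaitingForCommand
```

The sketch reproduces only the last step of the trace: StopTimer was sent while the timer was still pending, but it is received after the machine has already returned to WaitingForCommand.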

Page 12

Impact

• Unprecedented use of formal design in Windows
• Model is the Source
• Over 200 rules to catch regression bugs even before C code is compiled
• Over 300 bugs found and fixed
  – unhandled messages, property violations

State machine   # states   # transitions   # bugs
HSM                  196             361       90
PSM 3.0              295             752       12
PSM 2.0              457            1386       97
DSM                 1919            4238      120

Page 13

Benefits

• Model verification complements testing
  – validates states that are hard to reach with testing
  – debugging is significantly easier

• Explicit specification of contracts
  – solid design
  – better documentation and maintenance

Page 14

Difficulties faced by programmers

• Visio inadequate container for state diagrams

• Semantics of modeling language embedded inside scripts

• No automation for managing properties, models, and lemmas

Page 15

From modeling to programming

• State machine models are programs in a domain-specific language (DSL)

• Develop a modern programming environment for a DSL inspired by state machines
  – Simple syntax/semantics for programs and properties
  – Code generator and runtime library for execution
  – Verifier for property checking

Page 16

Ping Pong

machine Ping receives pong {
  var x: Pong

  state
    ( start, x := new Pong(y = this); raise unit )
    ( ping1, send(x, ping); return )

  transition
    ( start, unit, ping1 )
    ( ping1, pong, ping1 )
}

machine Pong receives ping {
  var y: Ping

  state
    ( start, return )
    ( pong1, send(y, pong); raise unit )

  transition
    ( start, ping, pong1 )
    ( pong1, unit, start )
}
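One way to read the Ping Pong program is as two objects with input buffers. The Python sketch below is a simulation written under assumptions, not the DSL's actual runtime: `raise unit` is modeled as an entry function returning the raised event, `return` as returning `None`, and `send` as appending to the target's buffer. The `Machine` class and the helper names are hypothetical.

```python
from collections import deque


class Machine:
    """One state machine with an input buffer (a sketch, not the real runtime)."""

    def __init__(self, name, entry, transitions, log):
        self.name = name
        self.entry = entry              # state -> entry function
        self.transitions = transitions  # (state, event) -> next state
        self.buffer = deque()           # asynchronous input buffer
        self.peer = None                # plays the role of x in Ping, y in Pong
        self.state = None
        self.log = log

    def enter(self, state):
        """Run entry functions, following raised events until one returns."""
        while state is not None:
            self.state = state
            self.log.append((self.name, state))
            raised = self.entry[state](self)   # None models "return"
            state = self.transitions[(state, raised)] if raised else None

    def step(self):
        """Dequeue one event and take the matching transition."""
        event = self.buffer.popleft()
        self.enter(self.transitions[(self.state, event)])


def send(target, event):
    target.buffer.append(event)


def ping_start(m):
    pong = Machine("Pong", PONG_ENTRY, PONG_TRANSITIONS, m.log)
    pong.peer = m            # new Pong(y = this)
    m.peer = pong            # x := ...
    pong.enter("start")
    return "unit"            # raise unit


def ping_ping1(m):
    send(m.peer, "ping")     # send(x, ping)
    return None              # return


def pong_start(m):
    return None              # return


def pong_pong1(m):
    send(m.peer, "pong")     # send(y, pong)
    return "unit"            # raise unit


PING_ENTRY = {"start": ping_start, "ping1": ping_ping1}
PING_TRANSITIONS = {("start", "unit"): "ping1", ("ping1", "pong"): "ping1"}
PONG_ENTRY = {"start": pong_start, "pong1": pong_pong1}
PONG_TRANSITIONS = {("start", "ping"): "pong1", ("pong1", "unit"): "start"}

log = []
ping = Machine("Ping", PING_ENTRY, PING_TRANSITIONS, log)
ping.enter("start")
pong = ping.peer
for _ in range(3):           # three ping/pong round trips
    pong.step()
    ping.step()
```

After each round trip Ping is back in `ping1` with a fresh `ping` waiting in Pong's buffer, so the exchange can continue indefinitely.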

Pages 17 to 24

[Animation: step-by-step execution of the Ping Pong program, one frame per page.
  – Ping machine: one state runs "x := new Pong; raise unit"; on unit it moves to a state that runs "send(x, ping); return"; that state has a transition on pong.
  – Pong machine (shown from page 18 on): one state runs "return"; on ping it moves to a state that runs "send(that, pong); raise unit"; on unit it moves back.
  – Page 20 additionally shows a ping message; pages 22 and 23 additionally show a pong message.]

Page 25

Unhandled events

• Suppose state s only provides the transitions (s, e1, s1) and (s, e2, s2)

• Retrieving e3 from input queue results in UnhandledEventException

• Absence of UnhandledEventException must be verified

Page 26

Deferred events

• State (s, Stmt, {e1, e2})
• s is in the middle of critical processing waiting for e
• Presence of e1 and e2 in the buffer does not cause UnhandledEventException
• e1 and e2 are skipped over while retrieving e
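The interplay between unhandled and deferred events can be sketched as a dequeue operation. This is an illustrative Python model, assuming that deferred events stay in the buffer in their original order; the function name `retrieve` and its parameters are hypothetical.

```python
from collections import deque


class UnhandledEventException(Exception):
    """The first non-deferred event has no transition in the current state."""


def retrieve(queue, handled, deferred):
    """Remove and return the first event in `queue` that is not deferred.

    Deferred events are skipped and left in the queue. The first
    non-deferred event must be handled by the current state, otherwise the
    machine fails. Returns None if nothing retrievable is queued.
    """
    for index, event in enumerate(queue):
        if event in deferred:
            continue                  # skip over it, keep it buffered
        if event not in handled:
            raise UnhandledEventException(event)
        del queue[index]
        return event
    return None


# State s waits for e and defers e1 and e2.
queue = deque(["e1", "e2", "e", "e1"])
event = retrieve(queue, handled={"e"}, deferred={"e1", "e2"})
print(event, list(queue))  # e ['e1', 'e2', 'e1']
```

With an empty deferred set, the same call would fail on the first `e1`, which is the situation the verifier has to rule out.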

Page 27

Sub-state machines

• Statement “call s” pushes state s on the machine stack
  – s will handle a sub-protocol

• Sub-computation inherits deferred events from the caller

• Caller given a chance to handle UnhandledEventException
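One possible reading of these three bullets, sketched in Python. The slide does not spell out the semantics, so the following are assumptions: the machine stack is a list of state names, an event that the top state cannot handle pops that state so the caller can try, and the deferred set in effect is the union over all states on the stack. All names here are hypothetical.

```python
class UnhandledEventException(Exception):
    """No state on the machine stack can handle the event."""


def call(stack, state):
    """Model of "call s": push state s on the machine stack."""
    return stack + [state]


def inherited_deferred(stack, deferred):
    """Events deferred by any state on the stack (inherited by callees)."""
    result = set()
    for state in stack:
        result |= deferred.get(state, set())
    return result


def dispatch(stack, event, transitions):
    """Let the top state handle `event`; on failure pop and let the caller try."""
    frames = list(stack)
    while frames:
        target = transitions.get((frames[-1], event))
        if target is not None:
            frames[-1] = target       # the handling state takes its transition
            return frames
        frames.pop()                  # sub-computation ends, caller gets a chance
    raise UnhandledEventException(event)


transitions = {("sub", "a"): "sub_done", ("main", "b"): "main_done"}
stack = call(["main"], "sub")

print(dispatch(stack, "a", transitions))  # ['main', 'sub_done']
print(dispatch(stack, "b", transitions))  # ['main_done']
print(inherited_deferred(stack, {"main": {"x"}, "sub": {"y"}}) == {"x", "y"})  # True
```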

Page 28

Memory management

• When is it safe to free up the memory for a state machine?

• Reference counting: Increment, Decrement

• A machine is freed only when
  – its reference count is zero
  – it is quiescent

• Accessing a freed machine causes IllegalAccessException whose absence must be verified
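The freeing rule can be illustrated with a short Python sketch. Two points are assumptions not stated on the slide: "quiescent" is taken to mean "not executing and with an empty input buffer", and the creator is taken to hold the first reference. The class `MachineHandle` and its methods are hypothetical names.

```python
class IllegalAccessException(Exception):
    """A freed machine was accessed."""


class MachineHandle:
    def __init__(self):
        self.ref_count = 1    # assumption: the creator holds one reference
        self.running = False
        self.buffer = []
        self.freed = False

    def quiescent(self):
        # Assumption: quiescent = not executing and no pending events.
        return not self.running and not self.buffer

    def _check(self):
        if self.freed:
            raise IllegalAccessException("machine already freed")

    def increment(self):
        self._check()
        self.ref_count += 1

    def decrement(self):
        self._check()
        self.ref_count -= 1
        self.try_free()

    def send(self, event):
        self._check()
        self.buffer.append(event)

    def try_free(self):
        """Free only when both conditions hold: zero references and quiescence."""
        if self.ref_count == 0 and self.quiescent():
            self.freed = True


m = MachineHandle()
m.send("e")
m.decrement()          # count is zero, but an event is still pending
print(m.freed)         # False
m.buffer.clear()       # the pending event has been processed
m.try_free()
print(m.freed)         # True
```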

Page 29

Runtime library

• Provides support for
  – machine creation and deletion
  – input buffer management
  – execution of transitions and entry functions

• Reactive event-driven computation piggybacked on external threads
  – locking for coordination among multiple external threads executing within the runtime

Page 30

Verification

• How do we verify the absence of UnhandledEventException and IllegalAccessException?

• How do we verify program-specific properties?

• How do we specify interfaces?

Page 31

Automata

Automata are used to model implementation and specification.

An automaton is a tuple $(S, \Sigma, \delta, i)$:
  – $S$: set of states
  – $\Sigma$: alphabet
  – $\delta$: transitions
  – $i$: initial state

[Diagram: example automaton with transitions labeled A and B; its alphabet is { A, B }.]

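Page 32

Parallel composition

Parallel composition is the synchronous product (trace intersection).

Shared transition:

$$\frac{s \xrightarrow{\alpha} s' \qquad t \xrightarrow{\alpha} t'}{(s, t) \xrightarrow{\alpha} (s', t')}$$

Local transition:

$$\frac{s \xrightarrow{\alpha} s' \qquad \alpha \notin \Sigma_T}{(s, t) \xrightarrow{\alpha} (s', t)} \qquad \frac{\alpha \notin \Sigma_S \qquad t \xrightarrow{\alpha} t'}{(s, t) \xrightarrow{\alpha} (s, t')}$$

[Diagram: automaton $S$ with transitions labeled A and B, automaton $T$ with transitions labeled A and C, and their composition $S \parallel T$.]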
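The shared and local transition rules translate directly into a product construction. The Python sketch below assumes explicit finite automata; the `Automaton` tuple, the `compose` function, and the two example automata (one over {A, B}, one over {A, C}) are illustrative assumptions rather than the talk's own definitions.

```python
from collections import namedtuple

Automaton = namedtuple("Automaton", "states alphabet transitions initial")


def compose(S, T):
    """Synchronous product: shared letters move both sides, others move one."""
    transitions = set()
    for s in S.states:
        for t in T.states:
            for (p, a, p2) in S.transitions:
                if p != s:
                    continue
                if a in T.alphabet:
                    # Shared transition: T must take the same letter.
                    for (q, b, q2) in T.transitions:
                        if q == t and b == a:
                            transitions.add(((s, t), a, (p2, q2)))
                else:
                    # Local transition of S: T stays put.
                    transitions.add(((s, t), a, (p2, t)))
            for (q, b, q2) in T.transitions:
                if q == t and b not in S.alphabet:
                    # Local transition of T: S stays put.
                    transitions.add(((s, t), b, (s, q2)))
    states = {(s, t) for s in S.states for t in T.states}
    return Automaton(states, S.alphabet | T.alphabet, transitions,
                     (S.initial, T.initial))


# S alternates A and B; T alternates A and C; they synchronize on A.
S = Automaton({0, 1}, {"A", "B"}, {(0, "A", 1), (1, "B", 0)}, 0)
T = Automaton({0, 1}, {"A", "C"}, {(0, "A", 1), (1, "C", 0)}, 0)
P = compose(S, T)

for transition in sorted(P.transitions):
    print(transition)
```

In the product, A can fire only from (0, 0), where both components are ready for it, while B and C interleave freely afterwards.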

Page 33

Properties

Specifications are monitors that define the set of allowed traces. An implementation is correct if it refines the specifications. Refinement is trace inclusion.

[Diagram: two automata with transitions labeled A and B, related by refinement (≼).]
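For finite automata, trace inclusion can be checked by exploring the implementation together with the set of specification states reachable on the same trace. The Python sketch below assumes that every finite path counts as a trace (no accepting states) and that both automata use the same alphabet; `refines` and the two example automata are hypothetical.

```python
def refines(impl, spec):
    """Return True if every finite trace of impl is also a trace of spec.

    Each automaton is a pair (initial_state, transitions), where transitions
    is a set of (source, letter, target) triples.
    """
    impl_init, impl_trans = impl
    spec_init, spec_trans = spec
    start = (impl_init, frozenset([spec_init]))
    seen, todo = {start}, [start]
    while todo:
        state, spec_states = todo.pop()
        for (p, a, p2) in impl_trans:
            if p != state:
                continue
            # Spec states reachable by the same letter (subset construction).
            nxt = frozenset(q2 for (q, b, q2) in spec_trans
                            if q in spec_states and b == a)
            if not nxt:
                return False          # impl can do `a` here, spec cannot
            pair = (p2, nxt)
            if pair not in seen:
                seen.add(pair)
                todo.append(pair)
    return True


# impl strictly alternates A and B; spec allows one or more B after each A.
impl = (0, {(0, "A", 1), (1, "B", 0)})
spec = (0, {(0, "A", 1), (1, "B", 1), (1, "B", 0)})

print(refines(impl, spec))  # True
print(refines(spec, impl))  # False: the trace A B B is not allowed by impl
```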

Page 34

Semantic gap

• How do we connect a program to a finite collection of automata communicating via rendezvous over a finite alphabet?

• Challenges
  – dynamic creation of machines
  – asynchronous message passing
  – unbounded input buffers

Page 35

Solution

• Dynamic machine creation
  – finite verification scenario

• Asynchronous message passing
  – separate events for sending and receiving
  – events tagged by sender and receiver machine ids

Page 36

Implementations (machines and channels)

[Diagram: four automata named Ping, Ping Buffer, Pong, and Pong Buffer, with transitions labeled Send A, Receive A, Send B, and Receive B.]

Page 37

Solution

• Dynamic machine creation
  – finite verification scenario

• Asynchronous message passing
  – separate events for sending and receiving
  – events tagged by sender and receiver machine ids

• Unbounded input buffers
  – compositional verification
  – finite-state buffer abstractions

Page 38

Compositional verification

[…] is a set of specification automata. […] is a set of implementation automata.

We want to prove […] (difficult).

Compositional verification tells us how we can do:

[…]

where […] are subsets of […] and […] are subsets of […]

Page 39

Hierarchical compositional rule

Simple hierarchical case

Page 40

[Diagram: two groups of automata, labeled "Implementations (machines and channels)" and "Specification", with transitions labeled Send A, Receive A, Send B, and Receive B.]

Page 41

Decomposing by weakening

[Diagram: an automaton S with transitions labeled A and B, shown next to Weaken(S, A), the result of weakening S by A.]

S = Weaken(S, A) || Weaken(S, B)

Page 42

Circular compositional rule

Given a spec S, and a set of implementation machines I:

If for all E in alphabet of S, there is […] such that […]

Then […].

Page 43

[Diagram: several automata with transitions labeled Send A, Receive A, Send B, and Receive B, and a "refines" relation involving the Pong machine.]

Page 44

Review

• A domain-specific language for programming protocol aspects of asynchronous computations
  – operational semantics
  – compiler/runtime for device driver domain
  – verification

Page 45

Work in progress

• Deliver working prototype to Windows and third-party driver developers

• Other applications
  – cloud infrastructure, services, and applications
  – networking software
  – asynchronous web programming
  – …

Page 46

Opportunity

• Transform protocol design and implementation across a variety of application domains

• Target the greatest threat to software reliability in the era of pervasive devices and pervasive distributed computing