How to find lots of bugs with system-specific static analysis Dawson Engler Andy Chou, Ben Chelf,...

How to find lots of bugs with system-specific static analysis

Dawson EnglerAndy Chou, Ben Chelf, Seth Hallem,

Ken AshcraftStanford University

This talk is about how you can find lots of bugs in real code by making compilers aggressively system specific

This talk is about how you can find lots of bugs in real code by making compilers aggressively system specific

Checking systems software Systems software has many ad-hoc restrictions:

– “acquire lock L before accessing shared variable X”– “do not allocate large variables on 6K kernel stack”

Error = crashed system. How to find errors?– Formal verification

» + rigorous» - costly + expensive. *Very* rare to do for software

– Testing:» + simple, few false positives» - requires running code: doesn’t scale & can be

impractical – Manual inspection

» + flexible» - erratic & doesn’t scale well.

– What to do??

Never do X (do not use floating point, allocate large vars on 6K kernel stack) Always do X before/after Y (acquire lock before use, release after use)

Never do X (do not use floating point, allocate large vars on 6K kernel stack) Always do X before/after Y (acquire lock before use, release after use)

Some brave pioneers in the audience, but Practically intractable: far too strenous, and even if you do, spec isn’t code. Heuristic: if you see someone trying to verify sw can count on them being an academic trying to get tenure.

Some brave pioneers in the audience, but Practically intractable: far too strenous, and even if you do, spec isn’t code. Heuristic: if you see someone trying to verify sw can count on them being an academic trying to get tenure.

Method of choice if you build systems for money: test. Problem: O(paths) = exponential in length of code. So if you build systems, what you wind up with is a system that only crashes after a week. Further, mapping crash back to cause can be *really* hard.

Method of choice if you build systems for money: test. Problem: O(paths) = exponential in length of code. So if you build systems, what you wind up with is a system that only crashes after a week. Further, mapping crash back to cause can be *really* hard.

Inspection is dead: modify code, have to do it again.

Inspection is dead: modify code, have to do it again.

Another approach Observation: rules can be checked with a compiler

– scan source for “relevant” acts check if they make sense E.g., to check “disabled interrupts must be re-enabled:” scan for calls to disable()/enable(), check that they match, not done twice

Main problem:– compiler has machinery to automatically check, but

not knowledge– implementor has knowledge but not machinery

Metacompilation (MC): – give implementors a framework to add easily-

written, system-specific compiler extensionsWant to make compilers aggressively system specific: if you design a system or interface and see a use of it, you invariably see ways of doing it better: give you way to articulate this knowledge and have compiler do it for you automatically

Want to make compilers aggressively system specific: if you design a system or interface and see a use of it, you invariably see ways of doing it better: give you way to articulate this knowledge and have compiler do it for you automatically

Implementation: – Extensions dynamically linked into GNU gcc compiler– Applied down all paths in input program source

– Scalable: handles millions of lines of code – Precise: says exactly what error was– Immediate: finds bugs without having to execute

path– Effective: 1500+ errors in Linux source code

Metacompilation (MC)

ent->data = kmalloc(..)if(!ent->data)

free(ent);goto out;

… out: return ent;

Linuxfs/proc/generic.c

GNU C compiler

free checker“using ent after free!”

The main results: it works really well, and its easyThe main results: it works really well, and its easy

No X after Y: do not use freed memory

sm free_checker { state decl any_pointer v; decl any_pointer x; start: { kfree(v); } ==> v.freed ; v.freed: { v == x } | { v != x } ==> { /* suppress fp */ } | { v } ==> { err(“Use after free!”); ;}

start

v.freed

error

use(v)

kfree(v)

/* 2.4.4:drivers/isdn/isdn_ppp.c */if (!(ippp_table[i] = kmalloc(…)) for (j = 0; j < i; j++) kfree(ippp_table[i]);

Simple. Have had freshman write these and post bugs to linux groups.

Three parts: start state. Pattern, match does a transition, callouts. Scales with sophistication of analysis.

System will kill variables, track when they are assigned to others.



System will kill variables, track when they are assigned to others.

Talk Overview Overview: metacompilation [OSDI ’00, ASPLOS

‘00] Next: three examples

– Temporal rule: sanitize user data before use– Contextual rule: don’t block with interrupts off– Moving dynamic to static: assert checking

Broader checking: Inferring rules [SOSP ’01]– Find inconsistencies in program belief systems– Great lever: find errors without knowing truth

Deeper checking [ISCA ’01]– Extract formal model from raw C code– Run through model checker

“X before Y”: sanitize integers before use Security: OS must check user integers before use MC checker: Warn when unchecked integers from

untrusted sources reach trusting sinks

– Global; simple to retarget (text file with 2 srcs&12 sinks)

– Linux: 125 errors, 24 false; BSD: 12 errors, 4 false

array[v]while(i < v) …

v.clean

Use(v)v.tainted

Syscall param

Network packet

copyin(&v, p, len)

any<= v <= any

memcpy(p, q, v)copyin(p,q,v)copyout(p,q,v)

ERROR

User supplies base functions, we check the rest (9/2 sources, 15/12 sinks). Interesting: written by an undergrad, no compiler course, probably has close to the world record of security holes found.

User supplies base functions, we check the rest (9/2 sources, 15/12 sinks). Interesting: written by an undergrad, no compiler course, probably has close to the world record of security holes found.

Some big, gaping security holes.

No checks

Unexpected overflow

Weird security implications

/* 2.4.9: drivers/net/wan/farsync.c */copy_from_user(&wrthdr, addr, sizeof wrthdr);if ( wrthdr.size + wrthdr.offset > FST_MEMSIZE ) return -ENXIO;copy_from_user(card->mem+wrthdr.offset,data,wrthdr.size)

/* 2.4.1/kernel/sysctl.c:455:do_sysctl_strategy */ get_user(len, oldlenp); if (len > table->maxlen) len = table->maxlen; copy_to_user(oldval, table->data, len);

2.4.5-ac8/drivers/usb/se401.c: copy_from_user(&frame, arg, sizeof(int));ret=se401_newframe(se401, frame);se401->frame[frame].grabstate = FRAME_UNUSED;

Good example: understood once by someone, writes checker and then imposed on everyone. People know in the abstract that they

have fixed sized integers; be hard pressed to find anyone that admitted otherwise. However, they prompty program as if they are

arbitrarily sized.

Good example: understood once by someone, writes checker and then imposed on everyone. People know in the abstract that they

have fixed sized integers; be hard pressed to find anyone that admitted otherwise. However, they prompty program as if they are

arbitrarily sized.

Some more big, gaping security holes. Remote exploit, no checks

A more subtle overflow/* 2.4.9-ac7/fs/intermezzo/psdev.c:presto_psdev_ioctl */error = copy_from_user(&input, (char *)arg, sizeof(input));...input.path = kmalloc(input.path_len + 1, GFP_KERNEL);if ( !input.path ) return -ENOMEM;error =copy_from_user(input.path,user_path, input.path_len);

/* 2.4.9/drivers/isdn/act2000/capi.c:actcapi_dispatch */isdn_ctrl cmd;...while ((skb = skb_dequeue(&card->rcvq))) { msg = skb->data; ... memcpy(cmd.parm.setup.phone,msg->msg.connect_ind.addr.num, msg->msg.connect_ind.addr.len - 1);

Good example: understood once by someone, writes checker and then imposed on everyone.

Good example: understood once by someone, writes checker and then imposed on everyone.

“In context Y, don’t do X”: blocking Linux: if interrupts are disabled, or spin lock held,

do not call an operation that could block:– Compute transitive closure of all

potentially blocking fn’s – Hit disable/lock: warn of any calls– 123 errors, 8 false pos

– Heavy clustering: » net/atm: 152 checks, 22 bugs (exp 1.9) P

=3.1x10^-15» drivers/i2o: 692 checks, 35 bugs (exp 8.8) P=

2.6x10^-10

/* drivers/net/pcmcia/wavelan_cs.c */spin_lock_irqsave (&lp->lock, flags);/* 1889 */switch(cmd)... case SIOCGIWPRIV: /* 2304 */ if(copy_to_user(wrq->u.data.pointer, …))

clean

disable()

error

lock(l)

NoBlock

enable()

unlock(l)

Block call

400 lines later have this violation. This is a common pattern: implementor just doesn’t know about the rule, so keeps violating it. Happens: since rules manually enforced and poorly documented. Heavy clustering: 14 in one, 19 in another, 34

400 lines later have this violation. This is a common pattern: implementor just doesn’t know about the rule, so keeps violating it. Happens: since rules manually enforced and poorly documented. Heavy clustering: 14 in one, 19 in another, 34

Example: statically checking assert Assert(x) used to check “x” at runtime. Abort if

false– compiler oblivious, so cannot analyze statically– Use MC to build an assert-aware extension

Result: found 5 errors in FLASH. – Common: code cut&paste from other context– Manual detection questionable: 300-line path

explosion between violation and check

General method to push dynamic checks to static

msg.len = 0;...assert(msg.len !=0);

assert checker line 211:assert failure!

Another way to use MC is to push dynamic checks to static. Usually have some amount of dynamic type checking going on where you have a series of if statements at the beginning of your routine to check for error conditions. So just pull into the compiler and check.

Another way to use MC is to push dynamic checks to static. Usually have some amount of dynamic type checking going on where you have a series of if statements at the beginning of your routine to check for error conditions. So just pull into the compiler and check.

Metacompilation: – Correctness rules map clearly to concrete source

actions– Check by making compilers aggressively system-

specific

– Easy: digest sentence fragment, write checker. – Result: precise, immediate error diagnosis As outsiders found errors in every system looked at 1000s bugs, many capable of crashing system

Next:– Inferring errors by checking program belief systems– Deeper checking

Summary

Given a set of uses of some interface you’ve built, you invariably see better ways of doing things. This gives you a way to articulate this knowlege and have the compiler do it for you automatically. Let one person do it.

Given a set of uses of some interface you’ve built, you invariably see better ways of doing things. This gives you a way to articulate this knowlege and have the compiler do it for you automatically. Let one person do it.

Easier to write code to check than it is to write code that obeys

Easier to write code to check than it is to write code that obeys

Problem: what are the rules?!?!– 100-1000s of rules in 100-1000s of subsystems. – To check, must answer: Must a() follow b()? Can

foo() fail? Does bar(p) free p? Does lock l protect x?

– Manually finding rules is hard. So don’t. Instead infer what code believes, cross check for contradiction

Intuition: how to find errors without knowing truth?– Contradiction. To find lies: cross-examine. Any

contradiction is an error.– Deviance. To infer correct behavior: if 1 person

does X, might be right or a coincidence. If 1000s do X and 1 does Y, probably an error.

– Crucial: we know contradiction is an error without knowing the correct belief!

Goal: find as many serious bugs as possible

Reduced to playing where’s waldo with grep on millions of line of code, or documentation, hoping you can find all cases

Reduced to playing where’s waldo with grep on millions of line of code, or documentation, hoping you can find all cases

MUST beliefs: – Inferred from acts that imply beliefs code *must*

have.

– Check using internal consistency: infer beliefs at different locations, then cross-check for contradiction

MAY beliefs: could be coincidental – Inferred from acts that imply beliefs code *may*

have

– Check as MUST beliefs; rank errors by belief confidence.

Cross-checking program belief systems

x = *p / z; // MUST belief: p not null // MUST: z != 0unlock(l); // MUST: l acquiredx++; // MUST: x not protected by l

// MAY: A() and B()// must be paired

Specification = checkable redundancy. Can cross check code against itself for same effect. Others: that

x was not already equal to value.

Specification = checkable redundancy. Can cross check code against itself for same effect. Others: that

x was not already equal to value.

B(); // MUST: B() need not // be preceded by A()

A();…B();

A();…B();

A();…B();

A();…B();

Trivial consistency: NULL pointers *p implies MUST belief:

– p is not null A check (p == NULL) implies two MUST beliefs:

– POST: p is null on true path, not null on false path– PRE: p was unknown before check

Cross-check these for three different error types.

Check-then-use (79 errors, 26 false pos)/* 2.4.1: drivers/isdn/svmb1/capidrv.c */if(!card) printk(KERN_ERR, “capidrv-%d: …”, card->contrnr…)

Hello world of internal consistency checkers. Show because it is one of the simplest possible checkers, and because it finds hundreds of errors. This is a case where the rule is obvious, but we commonly don’t know the state of the system --- instead infer what state the code believes the system is in.

Hello world of internal consistency checkers. Show because it is one of the simplest possible checkers, and because it finds hundreds of errors. This is a case where the rule is obvious, but we commonly don’t know the state of the system --- instead infer what state the code believes the system is in.

Any NULL pointer analysis is going to get this one. It’s good we do too, but not wildly impressive. The next errors are where the action is.

Any NULL pointer analysis is going to get this one. It’s good we do too, but not wildly impressive. The next errors are where the action is.

Null pointer fun Use-then-check: 102 bugs, 4 false

Contradiction/redundant checks (24 bugs, 10 false)

/* 2.4.7: drivers/char/mxser.c */struct mxser_struct *info = tty->driver_data;unsigned flags;if(!tty || !info->xmit_buf)

return 0;

/* 2.4.7/drivers/video/tdfxfb.c */ fb_info.regbase_virt = ioremap_nocache(...); if(!fb_info.regbase_virt) return -ENXIO; fb_info.bufbase_virt = ioremap_nocache(...); /* [META: meant fb_info.bufbase_virt!] */ if(!fb_info.regbase_virt) { iounmap(fb_info.regbase_virt);

Had been doing this for a year, but had missed these errors --- as soon as you put in terms of beliefs, becomes obvious.

Had been doing this for a year, but had missed these errors --- as soon as you put in terms of beliefs, becomes obvious.

Aside: redundancy checking Assume: code supposed to be useful

– Like types: high-level bugs map to low-level redundancies

Identity operations: “x = x”, “1 * y”, “x & x”, “x | x”

Assignments never read (126 bugs, 26 fp, 1.8K uninsp):

/* 2.4.5-ac8/net/appletalk/aarp.c */ da.s_node = sa.s_node; da.s_net = da.s_net;

/* 2.4.5-ac8/net/decnet/af_decnet.c:dn_wait_run */do { if (signal_pending(current)) { err = -ERESTARTSYS; break; } ...} while(scp->state != DN_RUN);return 0;

Can look for redundancy in general: deadcode elim is an error finder. Can look for: writes never read, lock acquired that protects nothing. Redundant transition means we’re missing something with analysis.


Redundancy checking Dead code (66 bugs, 26 false):

Detect incomplete specifications:– Detect missed sinks in range checker: flag when

data read from untrusted source, sanitized, but then not used for any dangerous operation.

– Lock checker: critical section with no shared state, lock with no bound variables

for(entry=priv->lec_arp_tables[i];entry != NULL; entry=next){

next = entry->next; if (…) lec_arp_remove(priv->lec_arp_tables, entry); lec_arp_unlock(priv); return 0;}



Internal Consistency: finding security holes Applications are bad:

– Rule: “do not dereference user pointer <p>”– One violation = security hole– Detect with static analysis if we knew which were

“bad”– Big Problem: which are the user pointers???

Sol’n: forall pointers, cross-check two OS beliefs– “*p” implies safe kernel pointer– “copyin(p)/copyout(p)” implies dangerous user

pointer– Error: pointer p has both beliefs.– Implemented as a two pass global checker

Result: 24 security bugs in Linux, 18 in OpenBSD– (about 1 bug to 1 false positive)

First pass: mark all pointers treated as user pointers. Second pass: make sure they are never dereferenced.

First pass: mark all pointers treated as user pointers. Second pass: make sure they are never dereferenced.

Still alive in linux 2.4.4:

– Tainting marks “rt” as a tainted pointer, checking warns that rt is passed to a routine that dereferences it

– 3 other examples in same routine Can combine with earlier range checker (12

errors):

/* drivers/net/appletalk/ipddp.c:ipddp_ioctl */case SIOFCINDIPDDPRT:

if(copy_to_user(rt, ipddp_find_route(rt), sizeof(struct ipddp_route)))

return –EFAULT;

An exampleMarked as tainted because passed as the first argument to copy_to_user, which is used

to access potentientially bad user pointers. Does global analysis to detect that the pointer will be dereferenced by ippd_…

Marked as tainted because passed as the first argument to copy_to_user, which is used to access potentientially bad user pointers. Does global analysis to detect that the

pointer will be dereferenced by ippd_…

/* 2.4.9/drivers/telephony/ixj.c:ixj_ioctl */ case IXJCTL_INIT_TONE: copy_from_user(&ti, (char *) arg, sizeof(ti)); ... case IXJCTL_INTERCOM_START: ... ixj[arg]->intercom = board;

Common: multiple implementations of same interface.– Beliefs of one implementation can be checked

against those of the others! User pointer (3 errors):

If one implementation taints its argument, all others must

– How to tell? Routines assigned to same function pointer

– More general: infer execution context, arg preconditions…

– Interesting q: what spec properties can be inferred?

Cross checking beliefs related abstractly

–Parameter features: Can a param be null? What are legal values of integer parameter Return code: What are allowable error code to return & when? Execution context: Are interrupts off or on when code runs? When it exits? Does it run concurrently?

–Parameter features: Can a param be null? What are legal values of integer parameter Return code: What are allowable error code to return & when? Execution context: Are interrupts off or on when code runs? When it exits? Does it run concurrently?

foo_write(void *p, void *arg,…){ copy_from_user(p, arg, 4); disable(); … do something … enable(); return 0;}

bar_write(void *p, void *arg,…){ *p = *(int *)arg; … do something … disable(); return 0;}If one does it right, we can cross check all: if

one dev gets it right we are in great shape.

If one does it right, we can cross check all: if one dev gets it right we are in great shape.

Handling MAY beliefs MUST beliefs: only need a single contradiction MAY beliefs: need many examples to separate

fact from coincidence. General approach:– Assume MAY beliefs are MUST beliefs & check

them– Count number of times belief passed check – Count number of times belief failed check – Use the test statistic to rank errors based on ratio

of checks (n) to errors (err):

– Intuition: the most likely errors are those where n is large, and err is small.

– BAD idea: pick threshold t, if z(n,c) > t treat as MUST

z(n, err) = ((n-err)/n-p0)/sqrt(p0*(1-p0)/n)

Intuition: the more often x is obeyed correctly, the more likely it is to be a valid instance.

Intuition: the more often x is obeyed correctly, the more likely it is to be a valid instance.

Statistical: Deriving deallocation routines Use-after free errors are horrible.

– Problem: lots of undocumented sub-system free functions

– Soln: derive behaviorally: pointer “p” not used after call “foo(p)” implies MAY belief that “foo” is a free function

Conceptually: Assume all functions free all arguments– (in reality: filter functions that have suggestive

names)– Emit a “check” message at every call site.– Emit an “error” message at every use

– Rank errors using z test statistic: z(checks, errors)– E.g., foo.z(3, 3) < bar.z(3, 1) so rank bar’s error first – Results: 23 free errors, 11 false positives

foo(p);*p = x;

foo(p);*p = x;

foo(p);*p = x;

bar(p);p = 0;

bar(p);p = 0;

bar(p);*p = x;

Can cross-correlate: free is on error path, has dealloc in name, etc, bump up ranking. Foo has 3 errors, and 3 checks. Bar, 3 checks, one error. Essentially every passed check implies belief held, every error =

not held

Can cross-correlate: free is on error path, has dealloc in name, etc, bump up ranking. Foo has 3 errors, and 3 checks. Bar, 3 checks, one error. Essentially every passed check implies belief held, every error =

not held

Ranked free errorsKfree[0]: 2623 checks, 60 errors, z= 48.87 2.4.1/drivers/sound/sound_core.c:sound_insert_unit: ERROR:171:178: Use-after-free of 's'! set by 'kfree‘ ... kfree_skb[0]: 1070 checks, 13 errors, z = 31.92 2.4.1/drivers/net/wan/comx-proto-fr.c:fr_xmit: ERROR:508:510: Use-after-free of 'skb'! set by 'kfree_skb‘ ...[FALSE] page_cache_release[0] ex=117, counter=3, z = 10.3dev_kfree_skb[0]: 109 checks, 4 errors, z=9.67 2.4.1/drivers/atm/iphase.c:rx_dle_intr: ERROR:1321:1323: Use-after-free of 'skb'! set by 'dev_kfree_skb_any‘ ...cmd_free[1]: 18 checks, 1 error, z=3.77 2.4.1/drivers/block/cciss.c:667:cciss_ioctl: ERROR:663:667: Use-after-free of 'c'! set by 'cmd_free[1]'drm_free_buffer[1] 15 checks, 1 error, z = 3.35 2.4.1/drivers/char/drm/gamma_dma.c:gamma_dma_send_buffers: ERROR:Use-after-free of 'last_buf'! [FALSE] cmd_free[0] 18 checks, 2 errors, z = 3.2

Stratified error reports: rank all errors for different classes. See that there is a few clear ones, then a longer tail. At the top, 2.6K ok checks and 60 violations (2% error?) the third function was bogus . The next few were good, then there was a tail so we stopped. You decide how deeply to go down. Good for both discovery and for validation that you have everything.

Stratified error reports: rank all errors for different classes. See that there is a few clear ones, then a longer tail. At the top, 2.6K ok checks and 60 violations (2% error?) the third function was bogus . The next few were good, then there was a tail so we stopped. You decide how deeply to go down. Good for both discovery and for validation that you have everything.

A bad free error

/* drivers/block/cciss.c:cciss_ioctl */ if (iocommand.Direction == XFER_WRITE){

if (copy_to_user(...)) { cmd_free(NULL, c); if (buff != NULL) kfree(buff); return( -EFAULT); } } if (iocommand.Direction == XFER_READ) { if (copy_to_user(...)) { cmd_free(NULL, c); kfree(buff); } } cmd_free(NULL, c); if (buff != NULL) kfree(buff);

Example inferring free checker

sm free_checker { state decl any_pointer v; decl any_pointer x; decl any_fn_call call; decl any_args args; start: { call(v) } { char *n = mc_identifier(call); if(strstr(n, “free”) || strstr(n, “dealloc”) || … ) { mc_v_set_state(v, freed); mc_v_set_data(v, n); note(“NOTE: %s”, n); } }; v.freed: { v == x } | { v != x } { /* suppress fp */ } | { v } { err(“Use after free %s!”, mc_v_get_data(v)); ;





Traditional: – Use global analysis to track which routines return

NULL– Problem: false positives when pre-conditions hold,

difficult to tell statically (“return p->next”?) Instead: see how often programmer checks.

– Rank errors based on number of checks to non-checks. Algorithm: Assume *all* functions can return NULL

– If pointer checked before use, emit “check” message– If pointer used before check, emit “error”

– Sort errors based on ratio of checks to errors Result: 152 bugs, 16 false.

Statistical: deriving routines that can fail

p = foo(…);*p = x;

p = bar(…);if(!p) return;*p = x;



p = bar(…);*p = x;

Can also use consistency: if a routine calls a routine that fails, then it to can fail. Similarly, if a routine checks foo for failure, but calls bar, which does not, is a type error. (In a sense can use witnesses: take good code and see what it does, reapply to unknown code)

Can also use consistency: if a routine calls a routine that fails, then it to can fail. Similarly, if a routine checks foo for failure, but calls bar, which does not, is a type error. (In a sense can use witnesses: take good code and see what it does, reapply to unknown code)

The worst bug Starts with weird way of checking failure:

So why are we looking for “seg_alloc”?/* ipc/shm.c:750:newseg: */if (!(shp = seg_alloc(...)) return -ENOMEM;id = shm_addid(shp);

/* 2.3.99: ipc/shm.c:1745:map_zero_setup */if (IS_ERR(shp = seg_alloc(...))) return PTR_ERR(shp);

static inline long IS_ERR(const void *ptr) { return (unsigned long)ptr > (unsigned long)-1000L; }

int ipc_addid(…* new…) { ... new->cuid = new->uid =…; new->gid = new->cgid = … ids->entries[id].p = new;

Deriving “A() must be followed by B()” “a(); … b();” implies MAY belief that a() follows b()

– Programmer may believe a-b paired, or might be a coincidence.

Algorithm:– Assume every a-b is a valid pair (reality: prefilter

functions that seem to be plausibly paired)– Emit “check” for each path that has a() then b()– Emit “error” for each path that has a() and no b()

– Rank errors for each pair using the test statistic» z(foo.check, foo.error) = z(2, 1)

Results: 23 errors, 11 false positives.

foo(p, …) bar(p, …);

“check foo-bar”

x();y();

“check x-y”

foo(p, …);…

“error:foo, no bar!”

Checking derived lock functions Evilest:

And the award for best effort:

/* 2.4.1: drivers/sound/trident.c: trident_release: lock_kernel(); card = state->card; dmabuf = &state->dmabuf; VALIDATE_STATE(state);

/* 2.4.0:drivers/sound/cmpci.c:cm_midi_release: */ lock_kernel(); if (file->f_mode & FMODE_WRITE) { add_wait_queue(&s->midi.owait, &wait); ... if (file->f_flags & O_NONBLOCK) { remove_wait_queue(&s->midi.owait, &wait); set_current_state(TASK_RUNNING); return –EBUSY; … unlock_kernel();

Key ideas: – Check code beliefs: find errors without knowing

truth.– Beliefs code MUST have: Contradictions = errors– Beliefs code MAY have: check as MUST beliefs and

rank errors by belief confidence

Secondary ideas:– High-level errors map to low-level redundancies– Specification is a checkable redundancy: code has

many redundant uses that can be leveraged in same way.

– Can use statistical ranking to help traditional analysis!

Summary: Belief Analysis

Code has many similar redundancies: can cross check all callers to routine for consistency, can check multiple implementations of routine or interface to make sure they make the same assumptions.

Code has many similar redundancies: can cross check all callers to routine for consistency, can check multiple implementations of routine or interface to make sure they make the same assumptions.

We’d like real assurances of correctness– Verification? Coders don’t write docs, much less

specs…– Observation: spec clearly mirrors code. Auto-

extract!

Deeper checking

void PILocalGet(void) { /* ... Boilerplate setup code ... */ nh.len = LEN_CACHELINE; if (!hl.Pending) { if (!hl.Dirty) { /* ... 37 lines deleted ... */ ASSERT(!hl.IO); // The commented out ASSERT is // true 99.99% of the time, // but is not always // ASSERT(hl.Local); /*... deleted 15 lines ... */ PI_SEND(F_DATA, F_FREE, F_SWAP, F_NOWAIT, F_DEC, 1); hl.Local = 1;

Rule "PI Local Get”Cache.State = Invalid & ! Cache.Wait& ! DH.Pending & ! DH.Dirty ==> Begin Assert !DH.Local; DH.Local := true; CC_Put(Home, Memory);EndRule;

Interesting: checking high level things easier than low level. Will have set of data structures, function calls, naming conventions. Strip these out. In reality, the code is 18K lines and the specification is 2-3K and takes months

Interesting: checking high level things easier than low level. Will have set of data structures, function calls, naming conventions. Strip these out. In reality, the code is 18K lines and the specification is 2-3K and takes months

Key: abstract models are clearly embedded in code– Implementors use extensions to mark these

features – System rips them out & translates to formal model– Implementors can guide translation to rewrite +

augment

Example: verifying FLASH protocol – Hard core, asm strewn C. – Tested for 6+ years, manually “verified”– We found 8 errors. – Bonus: Automatically found bugs in manual spec

(it’s code)

Overview: Automatic extraction

code mark printslicer spec

model

murphi bugs

Interesting: checking high level things easier than low level. Will have set of data structures, function calls, naming conventions. Strip these out. Esseniall wriing an absracion funcion.

Interesting: checking high level things easier than low level. Will have set of data structures, function calls, naming conventions. Strip these out. Esseniall wriing an absracion funcion.

So hard that they got a paper; so easy that we got one.

So hard that they got a paper; so easy that we got one.

A simple abstraction functionsm len slicer { /* wildcard variables for pattern matching */ decl any_expr type, data, keep, swp, wait, nl;

/* match all uses of the length field. */ pat length = { HG_header.nh.len } ; /* match sends */ pat sends = { NI_SEND(type, data, keep, swp, wait, nl) } | { PI_SEND(type, data, keep, swp, wait, nl) } ; /* match accesses to directory entries */ pat entries = { HG_h.hl.Local } | { HG_h.hl.Dirty } ; /* mark patterns for MC slicer */ all: length | sends | entries ==> { mgk_tag(mgk_s); } ;}

Tool-based checking– PREfix/PREfast– Slam– ESP

Higher level languages– TypeState, Vault– Foster et al’s type qualifier work.

Derivation:– Houdini to infer some ESC specs– Ernst’s Daikon for dynamic invariants– Larus et al dynamic temporal inference

Spec extraction– Bandera– Slam

Related work

MC: Effective static analysis of real code – Write small extension, apply to code, find 100s-

1000s of bugs in real systems– Result: Static, precise, immediate error diagnosis

Belief analysis: broader checking– Infer system rules and state using code beliefs– Key feature: find errors without knowing truth.

Model extraction: deeper checking– Common: abstract models clearly embedded in C

code– Automatically extract these using extensions– Model check result

Summary

How to find lots of bugs with system-specific static analysis Dawson Engler Andy Chou, Ben Chelf,...

Documents

Transcript of How to find lots of bugs with system-specific static analysis Dawson Engler Andy Chou, Ben Chelf,...