C Programming Pitfalls
©2005 Check Point Software Technologies Ltd. Proprietary & Confidential
C Programming Pitfalls
Uri Goren
Kernel Stateful Enforcement I/S
December 2005
Part I: C Macros
And how they can damage your code
Problems Caused by Macros
This presentation shows several ways in which macros can cause unexpected problems in your code.
Goals:
– Make you scared. You should think twice before using macros.
– If you do write macros, help you do it safely.
– If you run into macro-related trouble, help you figure out what’s wrong.
I have suggested solutions to each problem. But:
– Each solution solves just one problem. You may need to combine several solutions.
– Some solutions don’t completely solve even one problem.
– Some can’t be implemented in some cases.
– Some solutions contradict other solutions.
– Some solutions make your code ugly.
So: DON’T use this as a programming guide. Just use it to know the risks.
Operator Order #1
Here’s a nice macro:
    #define double(x) x+x
Let’s try to use it:
    int a=5;
    printf("a=%d a*6=%d\n", a, double(a)*3);
Code after preprocessing:
    printf("a=%d a*6=%d\n", a, a+a*3);
Result: 20 instead of 30.
Solution:
    #define double(x) (x+x)
Operator Order #2
Let’s try this one:
    #define sqr(x) (x*x)
And use it:
    int a=3, b=2;
    printf("sqr(a+b)=%d\n", sqr(a+b));
Code after preprocessing:
    printf("sqr(a+b)=%d\n", (a+b*a+b));
Result: 11 instead of 25.
Solution:
    #define sqr(x) ((x) * (x))
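The effect of the two parenthesization rules can be checked side by side. This is a minimal sketch; the names SQR_BAD, SQR_OK and the two wrapper functions are made up for illustration and do not come from the slides.

```c
/* Illustrative names only, not part of the original presentation. */
#define SQR_BAD(x) (x*x)          /* argument is not parenthesized */
#define SQR_OK(x)  ((x) * (x))    /* argument and result are parenthesized */

/* Expands to (a+b*a+b): the multiplication binds first. */
static int sqr_sum_bad(int a, int b)
{
    return SQR_BAD(a+b);
}

/* Expands to ((a+b) * (a+b)): the intended square. */
static int sqr_sum_ok(int a, int b)
{
    return SQR_OK(a+b);
}
```

With a=3 and b=2, the first function yields 11 and the second 25, matching the slide.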
Multiple Evaluation
Here’s the fixed version of the previous macro:
    #define sqr(x) ((x)*(x))
How many squares need to be added to get 1000?
    int n=1, sum=0;
    while (sum<1000) sum += sqr(n++);
After preprocessing:
    sum += ((n++)*(n++));
Result: n is incremented twice in each iteration. (Strictly, modifying n twice in one expression is undefined behavior in C, so even that result isn’t guaranteed.)
Solutions:
– Don’t pass parameters with side effects to macros.
– Don’t evaluate a parameter twice. This may be hard without a braced block, which isn’t possible when the macro returns a value.
  • With gcc it’s possible (using its statement-expression extension). But we’re writing cross-platform code.
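One option the slide does not list is to replace the macro with a plain function when only one argument type is needed, since a function evaluates its argument exactly once. A hedged sketch follows; sqr_int and squares_needed are hypothetical names, and the function only covers int, unlike the type-generic macro.

```c
/* A plain function evaluates its argument exactly once. */
static int sqr_int(int x)
{
    return x * x;
}

/* Counts how many squares (1, 4, 9, ...) must be added to reach limit. */
static int squares_needed(int limit)
{
    int n = 1, sum = 0;

    while (sum < limit)
        sum += sqr_int(n++);   /* n is incremented once per iteration */
    return n - 1;              /* number of squares that were added */
}
```

The trade-off is the loss of genericity: a separate function would be needed for each argument type.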
Multiple / Late Evaluation
We implement pools, and have this macro:
    #define move(srcpool, dstpool, num) { \
        move_the_elements; \
        srcpool->size -= num; \
        dstpool->size += num; \
    }
Let’s move everything from pool1 to pool2:
    move(pool1, pool2, pool1->size);
Code after preprocessing:
    move_the_elements;
    pool1->size -= pool1->size;   /* Set to 0 – OK */
    pool2->size += pool1->size;   /* Add 0 – Bug! */
Result: pool2->size isn’t changed.
Solution:
    #define move(srcpool, dstpool, num) { \
        int move_num = num; \
        move_the_elements; \
        srcpool->size -= move_num; \
        dstpool->size += move_num; \
    }
Variable Name Conflicts
Here’s one:
    #define add_square(x, a) { /* Add a*a to x */ \
        int s = a; /* Avoid multiple eval */ \
        x += s*s; \
    }
Let’s try to use it:
    int s, t;
    add_square(s, t);
After preprocessing:
    { int s = t; s += s*s; }
Result: the caller’s ‘s’ is shadowed, and isn’t changed.
Solutions:
– Don’t define variables in macros. But this contradicts previous suggestions.
– Enable compiler warnings about shadowing (as if we look at them).
– Use more descriptive names. This only reduces the chances.
– Use names “nobody uses”, with many ___underscores.
  • Everybody uses names which “nobody uses”.
Flow Control #1
Here’s a useful macro:
    #define dbgprint(msg) if (dbgflag) printf(msg);
And let’s use it:
    if (x==3)
        dbgprint("very bad\n");
    else
        dbgprint("very good\n");
After preprocessing:
    if (x==3)
        if (dbgflag) printf("very bad\n");;
    else
        if (dbgflag) printf("very good\n");;
Result: because of the extra “;”, the compiler can’t attach the “else” to the “if”, and we get a compilation error.
Solutions:
– Make sure to use braces around the conditional statement, even if there’s only one. But as the macro writer, you can’t ensure this.
– Omit the “;” from the macro and leave it to the caller.
Flow Control #2
Look at this debugging macro:
    #define dbgprint(msg) if (dbgflag) printf(msg)
Suppose we use it this way:
    if (x==3)
        dbgprint("very bad\n");
    else
        dbgprint("very good\n");
After preprocessing:
    if (x==3)
        if (dbgflag) printf("very bad\n");
    else
        if (dbgflag) printf("very good\n");
Result: whose “else” is it? The else binds to if (dbgflag), not to if (x==3), so we get wrong results.
Solutions:
– Use braces with the if. But the macro writer doesn’t control it.
– Use one of these structures:
  • #define dbgprint(msg) if (!dbgflag) {} else printf(msg)
  • #define dbgprint(msg) do {if (dbgflag) printf(msg);} while(0)
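The do/while(0) form can be tried in an unbraced if/else like the one on the slide. This is only a sketch: DBGPRINT, dbgflag and else_branch_taken are illustrative names, and fputs to stderr stands in for the real debug output.

```c
#include <stdio.h>

static int dbgflag = 0;   /* hypothetical debug switch, off by default */

/* The do/while(0) wrapper makes the macro behave as a single statement,
   so a following "else" still belongs to the caller's "if". */
#define DBGPRINT(msg) do { if (dbgflag) fputs((msg), stderr); } while (0)

/* Returns 1 when the caller's else branch ran, 0 otherwise. */
static int else_branch_taken(int x)
{
    int taken = 0;

    if (x == 3)
        DBGPRINT("very bad\n");
    else
        taken = 1;
    return taken;
}
```

With the original macro and dbgflag off, the else would have attached to the inner if and run for x == 3; here it runs only when x != 3.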
Macro / Symbol Name Collision
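In some header, we define a nice macro:
    #define sum(a, b) ((a)+(b))
In some unrelated code, which just happens to include this header:
    int sum(int x, int y);
Result: the name sum, followed by “(”, is treated as a macro invocation, and we get a compilation error. (A plain variable such as “int sum = 3;” is not expanded, because a function-like macro is only recognized when followed by “(”.)
Solution: prefix the name with the component name:
    #define cp_math_sum(a, b) ((a)+(b))
– Quite annoying: the name is too long.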
Part II: Integer Arithmetic Pitfalls
Preview
Do we really understand integers?
– We take mathematical operations for granted.
– We assume that things work just like we learned in elementary school.
– We treat integer arithmetic as something basic that doesn’t require any attention.
Question: which integers satisfy the condition (x == -x)?
– In normal math, there’s only one: 0.
– In fact (with 32-bit integers) there are two: 0 and 0x80000000. Check for yourself.
– Conclusion: integers are not as simple as you may think.
In this presentation you will find:
– Many functions and code segments doing integer arithmetic.
– All are “mathematically correct”: if integers were simple numbers, they would give correct results.
– All are buggy: they fail because of how integers work.
Intermediate Results
When evaluating a complex expression, there are intermediate results.
– Normally, we ignore them and look at the big picture.
Intermediate results have their own types and data ranges.
– They are not stored with arbitrary size and precision.
– It’s just as if you had declared them explicitly:
– int x = a + b * c; is the same as:
    int temp1 = b * c;   (assuming that b, c are ints)
    int x = a + temp1;
What if the intermediate result wraps, but the final result doesn’t?
– With addition and subtraction, it’s usually OK.
  • In “(1-2)+3”, though 1-2 is 0xffffffff, we eventually get 2, which is correct.
– With multiplication and division, it usually isn’t.
  • In “2GB * 3 / 10”, 2GB * 3 will overflow, and division won’t fix it.
Unsigned Integers – loop counter
Here’s a nice loop:
    int a[SIZE];
    unsigned int pos;
    for (pos=SIZE-1; pos>=0; pos--)
        a[pos] = 555;
Any problem?
– “pos >= 0” is a meaningless condition!
– pos will go down from SIZE-1 to 0, then wrap to 0xffffffff. This is still non-negative, so the loop never ends and writes outside the array.
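One common way to count down with an unsigned index is to test before decrementing. The sketch below assumes a small SIZE and a helper name, fill_backwards, both chosen only for the example.

```c
#define SIZE 8   /* assumed array size for the example */

/* Fills the array from the last element down to index 0. */
static void fill_backwards(int a[SIZE], int value)
{
    unsigned int pos;

    /* The comparison uses the value of pos before the decrement, so the
       body sees SIZE-1 down to 0 and the loop exits once pos was 0. */
    for (pos = SIZE; pos-- > 0; )
        a[pos] = value;
}
```

Using a signed index is the other obvious option, at the cost of mixing signed and unsigned values elsewhere.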
Unsigned Integers Subtraction #1
Here’s a nice macro:
    #define DIFF(x,y) (x-y)
    #define PDIFF(x,y) (DIFF(x,y) > 0 ? DIFF(x,y) : 0)
Very nice and simple: returns the difference, if it’s positive.
Now let’s use it:
    unsigned int x=3, y=5;
    printf("%d\n", PDIFF(x,y));
We get -2!!!
– x, y are unsigned, so (x-y) is unsigned.
– “(x-y) > 0” is the same as “(x-y) != 0”.
Unsigned integers Subtraction #2
Let’s fix the macro above:
    #define PDIFF(x,y) ((int)DIFF(x,y) > 0 ? (int)DIFF(x,y) : 0)
Now, does it work?
    unsigned int x = 3U*1024*1024*1024 + 3;
    unsigned int y = 3;
    printf("%u\n", PDIFF(x,y));
We get 0!
– x is greater than y, but (x-y), when viewed as a signed integer, is negative.
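Both subtraction examples go wrong because the subtraction happens before the comparison. A sketch that compares first, assuming 32-bit unsigned int as in the slides; the function name pdiff is illustrative:

```c
/* Positive difference of two unsigned values: x - y if x > y, else 0.
   The operands are compared directly, so the wrapped value of x - y
   is never inspected. */
static unsigned int pdiff(unsigned int x, unsigned int y)
{
    return (x > y) ? x - y : 0u;
}
```

Here pdiff(3, 5) gives 0, and the 3GB case gives the full unsigned difference instead of 0.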
Unsigned Integers Overflow
Take a look at this function:
    int f(unsigned int x, unsigned int y) {
        if (x+y > 1000) return TRUE;
        return FALSE;
    }
We expect it to return FALSE only if both x and y are pretty small.
Now how about this:
    unsigned int a = 2U * 1024 * 1024 * 1024;
    if (!f(a, a)) printf("boom!\n");
Obviously, a is very large, so a+a must also be large.
– However, a+a equals 0!
– The function will return FALSE.
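A check like this can usually be rearranged so that the sum is never formed. The following is a sketch under that idea; LIMIT and sum_exceeds_limit are names invented for the example.

```c
#define LIMIT 1000u   /* threshold used on the slide */

/* Returns 1 if x + y would exceed LIMIT, without ever adding x and y. */
static int sum_exceeds_limit(unsigned int x, unsigned int y)
{
    if (x > LIMIT)
        return 1;
    return y > LIMIT - x;   /* LIMIT - x cannot wrap, since x <= LIMIT here */
}
```

With a = 2GB, sum_exceeds_limit(a, a) reports 1, where the original f(a, a) returned FALSE.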
Signed Integers Boundaries #1
Boundary checking is important. As in:
    int f(int x) {
        static int y[SIZE] = { … };
        if (x >= SIZE) return ERROR;
        return y[x];
    }
But does it check the argument properly?
How about f(-1)?
– The error won’t be caught.
Signed Integers Boundaries #2
Here’s a lovely function:
    int dy(int x) {
        static int y[SIZE] = { ... };
        if (x<0 || x+1>=SIZE) return ERROR;
        return (y[x+1] - y[x]);
    }
This time, we carefully check the arguments, so we won’t exceed the array boundaries.
– Do we?
How about “dy(0x7fffffff)”?
– The condition is now:
    if (0x7fffffff < 0 || 0x80000000 >= SIZE) return ERROR;
– 0x80000000 is negative! It’s not greater than SIZE.
– The function will not catch the error!
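The range check can be written so that no arithmetic is done on x before it has been validated. In this sketch SIZE, ERROR and the table contents are assumptions made for the example, since the slide elides them.

```c
#define SIZE  16     /* assumed table size */
#define ERROR (-1)   /* assumed error code; a real one must not collide
                        with a legitimate difference */

static int y[SIZE] = { 0, 1, 4, 9 };   /* sample data, remaining entries are 0 */

/* Returns y[x+1] - y[x], or ERROR when x or x+1 is outside the table. */
static int dy(int x)
{
    /* Move the "+1" to the constant side: SIZE - 1 cannot overflow,
       while x + 1 can when x is 0x7fffffff. */
    if (x < 0 || x >= SIZE - 1)
        return ERROR;
    return y[x + 1] - y[x];
}
```

The same rearrangement covers the earlier f(-1) case as well, because the lower bound is checked explicitly.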
Using Constants #1
Consider the following function:
    int big_enough(int size) {
        return (size > sizeof(int));
    }
Tells you whether a given size is big enough.
– Obviously, a negative size is not big enough.
– Or is it?
big_enough(-1) will return true!
– When comparing, size is converted to unsigned (the type of sizeof).
– So we’re comparing 0xffffffff with 4: certainly big enough!
Constants have a type, and can be signed or unsigned:
– Signed constants, e.g. 100, 100L.
– Unsigned constants, e.g. 100U, sizeof(anything).
Using Constants #2
Here’s another function:
    int big_enough2(int size) {
        return ((size - sizeof(int)) > 100);
    }
Tells you if the size is big enough for something.
– Again, negative sizes are surely not big enough.
– And again…
big_enough2(-1) will return true!
– When subtracting sizeof(int), the result is unsigned.
– So we’re comparing 0xfffffffb with 100: certainly big enough!
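Both big_enough functions can be made to behave by rejecting negative sizes before the value meets sizeof. This is a hedged sketch that keeps the slide’s function names; returning 0 for negative input is an assumption about the intended behavior.

```c
#include <stddef.h>

/* Returns 1 if size is larger than sizeof(int); negative sizes give 0. */
static int big_enough(int size)
{
    if (size < 0)
        return 0;
    return (size_t)size > sizeof(int);   /* explicit, value-preserving cast */
}

/* Returns 1 if size exceeds sizeof(int) by more than 100. */
static int big_enough2(int size)
{
    if (size < 0)
        return 0;
    /* Add on the constant side instead of subtracting from size,
       so nothing can wrap below zero. */
    return (size_t)size > sizeof(int) + 100;
}
```

The explicit cast does the same conversion the compiler would do silently, but only after the negative case has been excluded.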
Calculating Average
Here’s a simple exercise:
– Write a function to calculate the average of two numbers.
– Actually two exercises: signed and unsigned.
Solutions:
    unsigned int u_avg(unsigned int x, unsigned int y) {
        return (x + y) / 2;
    }
    int s_avg(int x, int y) {
        return (x + y) / 2;
    }
Now let’s check if it works:
– Unsigned:
    unsigned int x = 2U * 1024 * 1024 * 1024;
    printf("%u\n", u_avg(x, x));
– We get 0!
  • x+x equals 0, so (x+x)/2 does also.
– Signed:
    int x = 0x7fffffff;
    printf("%d\n", s_avg(x, x));
– We get -1!
  • x+x equals 0xfffffffe, which is -2. So (x+x)/2 is -1.
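One possible overflow-free formulation of the two averages is sketched below. It is not from the slides, and its rounding can differ by one from (x + y) / 2 for some inputs: the same-sign branch rounds toward x rather than toward zero.

```c
/* Average of two unsigned values without forming x + y. */
static unsigned int u_avg(unsigned int x, unsigned int y)
{
    /* Halve each operand first, then add back the 1 that is lost
       when both operands are odd. */
    return x / 2 + y / 2 + (x & y & 1u);
}

/* Average of two signed values without overflowing. */
static int s_avg(int x, int y)
{
    if ((x < 0) == (y < 0))
        return x + (y - x) / 2;   /* same sign: y - x cannot overflow */
    return (x + y) / 2;           /* opposite signs: x + y cannot overflow */
}
```

For the two failing cases on the slide, these return 2GB and 0x7fffffff respectively.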
Percentage #1
How much is 30% of something?
– That’s easy. Can you program it?
Sure. Let’s do it nice, clean and modular:
    #define PCT(p) (p / 100)
    int f(int x) { return PCT(30) * x; }
Oops, it never works.
– “30 / 100” is 0. We always return 0.
Percentage #2
The last percentage function was stupid. Let’s write a better one:
    int f(int x) { return x * 30 / 100; }
Now it works.
– Always?
Company X is worth $143,165,600. I have 30% of the shares. What’s my fortune?
– Using the above function we get $7. Not very exciting.
– Why? “x * 30” is more than 4G, so it wraps around. Division can’t fix it any more.
Percentage #3
Writing a percentage function can’t be that hard. This time, we’ll do it right:
    int f(int x) { return x / 100 * 30; }
Let’s see how it does with the last example:
– f(143165600) returns 42,949,680.
– I like this one much better.
But how about something easier?
– How much is 30% of 10?
– f(10) returns 0!
  • “10 / 100” is 0.
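When a wider integer type is available, multiplying first in that type avoids both the wrap-around and the early truncation. The sketch assumes a compiler that supports long long (C99, and a common extension before that); pct30 is a made-up name.

```c
/* 30% of x, using a 64-bit intermediate so that x * 30 cannot wrap. */
static int pct30(int x)
{
    long long wide = (long long)x * 30;   /* assumes long long is 64 bits wide */

    return (int)(wide / 100);             /* the result fits back in an int */
}
```

This handles both the large company and the value 10, at the cost of relying on a wider type that not every platform of the time provided.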
Percentage #4
Here’s a more general percentage function:
    int p(int x, unsigned int p) {
        if (x>1000 || x<-1000 || p>100)
            return OUT_OF_RANGE;
        return x * p / 100;
    }
We don’t support large x, so we can’t overflow. But…
How much is 50% of -30? Let’s try p(-30, 50):
– We get 42949657. How come?
x is signed, p is unsigned (makes sense).
– In C, it means x*p is unsigned.
– We put -1500 in an unsigned integer, and it wraps around.
– Division treats it as a large positive number, and returns a smaller positive number.
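Since p has already been range-checked, it can be converted to int explicitly so that the whole expression stays signed. A sketch, with percent_of and the OUT_OF_RANGE value chosen only for illustration:

```c
#define OUT_OF_RANGE (-100000)   /* assumed sentinel, outside the valid results */

/* Returns p percent of x for |x| <= 1000 and p <= 100. */
static int percent_of(int x, unsigned int p)
{
    if (x > 1000 || x < -1000 || p > 100)
        return OUT_OF_RANGE;
    /* p <= 100 here, so the cast is value-preserving and keeps
       the multiplication and division in signed arithmetic. */
    return x * (int)p / 100;
}
```

With this cast, 50% of -30 comes out as -15.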
Percentage #5
Here’s a harder question: what percentage is 30 out of 50?
– Or generally, what percentage is x out of y?
Here are all the simple ways to calculate it:
– (x / y * 100)
  • Returns 0 when x < y (all normal cases).
– (x * 100 / y)
  • Overflows when x is large (what’s 5M out of 8M?)
– x / (y / 100)
  • Crashes when y is small (what’s 5 out of 8?)
  • Inaccurate when y is not very large (what’s 500 of 599?)
– 100 / y * x
  • Inaccurate when y is less than 100.
  • 0 when y is more than 100.
Signed/Unsigned Division
Here’s a nice function:
    int cut(int x, unsigned int factor) {
        return x / factor;
    }
How much is a half of -6? Try cut(-6, 2):
– We get 2147483645. How come?
x is signed, factor is unsigned.
– So before dividing, x is converted to unsigned, and we get 4G-6.
– After dividing, we get 2G-3. Converting it back to signed keeps it a large positive number.
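One way to keep the division signed is to do it in a type wide enough to hold both operands. This sketch assumes long long is wider than unsigned int and that the caller never passes factor == 0.

```c
/* Divides x by factor in signed 64-bit arithmetic.
   The caller must make sure factor is not 0. */
static int cut(int x, unsigned int factor)
{
    /* Both conversions are value-preserving, so the sign of x survives. */
    return (int)((long long)x / (long long)factor);
}
```

Half of -6 now comes out as -3.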
Bit Fields
Bit fields are very nice: they save memory.
Here’s a program that uses them:
    struct x {
        int flag:1;
        int count:31;
    };
    const char *flag_set(struct x *s) {
        const char *n[] = { "FALSE", "TRUE" };
        return n[s->flag];
    }
What happens if we set flag to 1 and call flag_set?
– It returns an invalid string!
– flag is signed, so it can hold either 0 or -1.
– So our program returns n[-1].
This example is platform dependent.
– In Solaris cc, bit fields are unsigned (unless explicitly signed).
  • This program works fine on Solaris (if compiled with cc).
– In gcc (on all platforms), bit fields are signed.
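Declaring the bit fields explicitly unsigned removes the dependence on the compiler’s default. A sketch of the same program with that one change (the const qualifiers are an addition for the example):

```c
struct x {
    unsigned int flag:1;     /* explicitly unsigned: holds 0 or 1 */
    unsigned int count:31;
};

/* Returns "TRUE" or "FALSE" according to the flag bit. */
static const char *flag_set(const struct x *s)
{
    static const char *const n[] = { "FALSE", "TRUE" };

    return n[s->flag];       /* the index is always 0 or 1 */
}
```

This should behave the same way under both compilers mentioned above.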
Shifting
Check out this function:
    void printb(int x, char *buf) {
        buf[0] = '\0';
        for (; x != 0; x >>= 1)
            strcat(buf, (x & 1) ? "1" : "0");
    }
It creates a string with x in binary (reversed).
How about printb(0x80000000, buf)?
– We expect a lot of zeros, ending with 1.
The function will loop infinitely (until it crashes)!
– Shifting right a signed integer duplicates the high order bit!
– 0x80000000 >> 1 == 0xc0000000. (8=1000b, c=1100b).
– Shift right is like division: it preserves the sign.
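Taking the argument as unsigned makes the right shift bring in zeros, so the loop terminates. A sketch assuming 32-bit unsigned int, which means the caller’s buffer needs at least 33 bytes:

```c
#include <string.h>

/* Writes the bits of x into buf, least significant bit first.
   buf must have room for 32 digits plus the terminating '\0'. */
static void printb(unsigned int x, char *buf)
{
    buf[0] = '\0';
    for (; x != 0; x >>= 1)                  /* unsigned shift fills with 0 */
        strcat(buf, (x & 1u) ? "1" : "0");
}
```

For 0x80000000 this yields 31 zeros followed by a single 1, as originally expected.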
Conclusions
Be aware. Remember that:
– Your code may not mean what you think it means.
– The variable’s type and valid range are important.
– Intermediate data has a type and valid range.
  • Especially important with multiplication and division.
Unsigned integers are more dangerous:
– The wrap-around value (0) is closer to the expected values.
– A single unsigned makes the whole expression unsigned.
Test your arithmetic:
– Copy the arithmetic into a simple test program, and test all cases.
  • Much easier to cover all possible values this way.
– Even code that seems simple and correct may surprise you.
Use types explicitly:
– When types matter, don’t let the compiler convert automatically. Cast yourself, to make things clear.
– Use variables for intermediate results, even when not needed.
  • This may remind you of the intermediate values’ importance.
Part III: Miscellaneous C Pitfalls
Uri Goren
August 2005
Alignment
Consider the following function:
    char buf[SIZE];
    void write_num(int off, int num) {
        int *p = (int *)&buf[off];
        *p = num;
    }
It writes a number at a given offset within a buffer.
What if the offset isn’t a multiple of 4?
– Intel-based platforms: will work a bit slower.
– Sun Sparc (Solaris): crash!
So pay attention to alignment.
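A common way to store a value at an arbitrary byte offset is to copy its bytes with memcpy instead of dereferencing a cast pointer. In this sketch SIZE and the read_num helper are assumptions added for the example, and the caller is trusted to keep off + sizeof(int) within the buffer.

```c
#include <string.h>

#define SIZE 64   /* assumed buffer size */

static char buf[SIZE];

/* Stores num at byte offset off, whatever its alignment. */
static void write_num(int off, int num)
{
    memcpy(&buf[off], &num, sizeof num);
}

/* Reads back an int stored at byte offset off. */
static int read_num(int off)
{
    int num;

    memcpy(&num, &buf[off], sizeof num);
    return num;
}
```

The value is stored in the platform’s native byte order, exactly as the pointer version would have stored it.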
Operator Precedence
We all know the precedence of some operators:
– Multiplication and division before addition and subtraction.
  • “a * b + c” is the same as “(a * b) + c”.
– Assignment after almost everything:
  • “a = x + y” is the same as “a = (x + y)”.
  • Not “(a = x) + y”!
But do we always know the precedence?
– a + b << 2
– a ^ b & c
– a > 3 ? c = d : x = y
You can find the full precedence table easily.
– Don’t do it!
– When you’re not 100% sure, use parentheses.
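For the first two expressions in the quiz, the possible groupings can be written out explicitly and compared. The function names below are invented for the sketch; the comments say which grouping C actually applies to the unparenthesized form.

```c
/* "a + b << 2" means this: addition binds tighter than shift. */
static int add_then_shift(int a, int b) { return (a + b) << 2; }

/* What a reader might expect instead. */
static int shift_then_add(int a, int b) { return a + (b << 2); }

/* "a ^ b & c" means this: & binds tighter than ^. */
static int xor_of_and(int a, int b, int c) { return a ^ (b & c); }

/* The other possible reading. */
static int and_of_xor(int a, int b, int c) { return (a ^ b) & c; }
```

For a=1, b=2 the two shift readings give 12 and 9; for a=6, b=3, c=5 the two bitwise readings give 7 and 5, so a wrong guess changes the result.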
The Importance of Prototypes
Prototypes are great, but optional.
– They allow the compiler to catch more errors.
– Omitting them just causes a warning.
– The code works fine without them.
  • Most of the time…
Look at this case:
    /* char *get_name(int id); No prototype! */
    printf("%s\n", get_name(MY_ID));
Will this work?
– On 64bit platforms, the returned value will be assumed to be “int”.
– The higher 32 bits will be ignored.
– If the string is located above 4GB, it will crash.
Sometimes we get away with it.
– In the Solaris 64bit kernel, all global and static variables are located below 4GB.
– The problem is when returning a pointer to dynamic memory.
Arrays with Offset -1
Normally, we can access only non-negative array offsets.
But look at this trick:
    int _a[SIZE+1];
    int *a = &_a[1];
Now we can access a[-1] to a[SIZE-1].
But it will fail under two conditions:
– The index is an unsigned variable.
– The index is of a type smaller than a pointer.
  • u_char or u_short on 32 bits.
  • On 64 bits, also u_int.
So this trick should be done carefully (or not at all).
Part IV: Examples from Our Code
Array With a Negative Index
Here’s an array, defined in fwdrv.c:
    static struct fwiftab _fwiftab[MAXIFP+1];
    struct fwiftab *fwiftab = &_fwiftab[1];
This should allow access to fwiftab[-1].
But what if the index is unsigned?
– It will crash on 64bit platforms.
– It will crash if the index is u_char or u_short.
In practice:
– It’s always called with a signed int.
– -1 is possible only on Nokia, which isn’t 64bit.
– We’re lucky.
Macro Affecting Flow Control
Here’s a very useful kernel macro:
    #define FW_ASSERT(caller, cond, msg) { \
        if (fw_assert_on && !(cond)) { \
            kdprintf("FW-1: %s: %s (%s:%d)\n", \
                caller, msg, __FILE__, __LINE__); \
            fw_panic(msg); \
        } \
    }
What happens when used in an “if”?
    if (x > 0)
        FW_ASSERT(rname, y > 0, "too small");
    else
        printf("OK\n");
This will not compile!
– The semicolon after FW_ASSERT will “break” the if statement.
So use FW_ASSERT carefully.
Macro – Parameter Names
Look what I’ve found in fwlddist.c:
    #define FWSYNC_FCU_SET_TMOUT_TTL(timeout, ttl) \
        do { info.timeout = &(timeout); \
             info.ttl = &(ttl); \
        } while (0)
The macro was written carefully:
– “do {} while(0)” used: works fine with if-else.
– Parentheses around all parameters.
One real bug:
– “timeout” and “ttl” are both parameters and structure members.
– FWSYNC_FCU_SET_TMOUT_TTL(3, 4) won’t compile.
In practice:
– It’s called many times, always with variables named timeout and ttl.
– This is the only case where the macro can work.
Possible Overflow
Here’s a piece of code from fwatom.c:
    u_int fw_hmem_size_new, fw_hmem_maxsize_new;
    ...
    if (fw_hmem_size_new * 2 > fw_hmem_maxsize_new)
        fw_hmem_size_new = fw_hmem_maxsize_new / 2;
Makes sure that the new size doesn’t exceed half the new limit.
– Both sizes are in bytes.
But what if the size is 2GB or more?
– “fw_hmem_size_new * 2” will wrap around.
– The size won’t be decreased.
In practice:
– The size can’t be more than 2GB minus something.
– This is because we currently can’t use more than 2GB.
– The bug is just around the corner.
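The comparison can be rearranged to divide the limit instead of multiplying the size, which gives the same answer for integers and cannot wrap. This is a standalone sketch with a hypothetical helper name, not the actual fwatom.c code.

```c
/* Returns size clamped to at most half of maxsize.
   "size * 2 > maxsize" and "size > maxsize / 2" agree for unsigned
   integers, but only the second form is safe for sizes of 2GB or more. */
static unsigned int clamp_to_half(unsigned int size, unsigned int maxsize)
{
    if (size > maxsize / 2)
        size = maxsize / 2;
    return size;
}
```

With a size of 0x90000000 and a limit of 0xC0000000, the size is now reduced to 0x60000000 instead of being left unchanged.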
Wrong Parameter Checking
A function from fwdrv.c:
    char *fw_func_getname(int func_id)
    {
        if (func_id < fwfuncs.nfunc)
            return fwfuncs.funcdesc[func_id].funcname;
        return NULL;
    }
What if func_id is negative?
– It will return a bad pointer.
In practice:
– func_id isn’t negative, unless there’s another bug.
– The string is used only if debug is enabled.