Winter 2017 CS107 Practice Final Exam (Practice #2)

This practice exam is based on the actual final exam for the winter 2016 quarter. It does not include a cache design problem (see practice exam #1 for one). The assembly also uses the compiler setting -O0 instead of -Og, which results in somewhat more complicated handling of the stack frame than we are used to (our exam will have the usual -Og settings, but this one should be solvable for you as practice).

CS107, Cynthia Lee


You have 3 hours to complete all problems. You don’t need to #include any header files, and you needn’t use assert to guard against any errors. Understand that the majority of points are awarded for concepts taught in CS107: more points for *(void **), fewer for for-loop syntax.

Note: Because your exam pages will be unstapled and separated for scanning (for online grading), it is critical that you write your SUNetID on every page of the exam. Thanks for your cooperation!

SUNet ID (username): ____________________________ (REQUIRED on every page)

Problem 1: AMD64 and Optimizations [23 points]

Presented below is the AMD64 assembly generated as a result of feeding the ham function to gcc. There’s an unoptimized version on the left (compiled with -O0) and an optimized version on the right (compiled with -O2).

Unoptimized: <ham>:

push %rbp

mov %rsp,%rbp

sub $0x30,%rsp

mov %rdi,-0x28(%rbp)

mov -0x28(%rbp),%rax

mov %rax,%rdi

callq <strlen@plt>

mov %eax,-0x14(%rbp)

movl $0x6,-0x10(%rbp)

mov -0x14(%rbp),%edx

mov %edx,%eax

shl $0x5,%eax

add %edx,%eax

mov %eax,-0xc(%rbp)

movl $0x1c,-0x8(%rbp)

mov -0x10(%rbp),%edx

mov -0x8(%rbp),%eax

imul %edx,%eax

cmp $0xffffffff,%eax

jge L1


mov %rax,%rdi

callq <ham>

L1: movl $0x0,-0x18(%rbp)

jmp L3

L2: mov -0x18(%rbp),%edx


add %rdx,%rax

movb $0x0,(%rax)

addl $0x1,-0x18(%rbp)

L3: mov -0x14(%rbp),%eax

cmp -0x18(%rbp),%eax

ja L2

mov -0x8(%rbp),%eax

leaveq

retq

Optimized: <ham>:

push %rbx

mov %rdi,%rbx

callq <strlen@plt>

xor %edx,%edx

test %eax,%eax

mov %eax,%ecx

je L2

L1: movb $0x0,(%rbx,%rdx,1)

add $0x1,%rdx

cmp %edx,%ecx

ja L1

L2: mov $0x1c,%eax

pop %rbx

retq


a) [13pts] First, fill in the blanks below so that ham is programmatically consistent with the unoptimized (-O0) assembly on the left of the previous page. Your code should refer to variables, not register names. You should write constants in the types we usually use for good C coding style, namely: write int constants in decimal (base 10), char constants as character literals, pointers/memory addresses in hexadecimal (base 16), and so on (do this even when writing constants in other formats/types would work). Note that the C code is nonsense and should just be a faithful reverse engineering of the unoptimized assembly. You may not typecast anything.

int ham(char *burr) {

int peggy = ____________________________________________;

int george[3];

george[0] = ____________________________________________;

george[1] = peggy * (_________________); //do not move parens—this is one value

george[2] = ____________________________________________;

if (____________________________________________) {

____________________________________________;

}

for (unsigned int i = _________; ________________; ________________) {

____________________________________________;

}

return _________________________________________;

}


The following are some optimizations we discussed in class. In the next parts of the question, please refer to them by their Roman numerals in this list:

i. Constant folding
ii. Common subexpression elimination
iii. Dead code
iv. Strength reduction
v. Code motion
vi. Tail recursion
vii. Loop unrolling
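As a reminder of what two of these transformations look like in source form, here is a minimal sketch. The function names and the constants (3, 4, 9) are hypothetical and chosen only for illustration; they are not taken from ham, and a real compiler is free to pick a different instruction sequence.

```c
#include <assert.h>

/* Constant folding: both operands are known at compile time, so the
   compiler may replace the expression with the literal 12. */
unsigned fold_example(void) {
    return 3u * 4u;
}

/* Strength reduction: a multiply by 9 rewritten as a shift plus an add.
   Unsigned arithmetic wraps, so the two forms agree for every x. */
unsigned times9_mul(unsigned x)   { return x * 9u; }
unsigned times9_shift(unsigned x) { return (x << 3) + x; }

/* Hypothetical self-check tying the pairs together. */
void check_equivalences(void) {
    assert(fold_example() == 12u);
    for (unsigned x = 0; x < 1000u; x++) {
        assert(times9_mul(x) == times9_shift(x));
    }
}
```

Calling check_equivalences() from a main function should complete without tripping an assertion.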

b) [2pts] Even in what we are calling the “unoptimized” code (-O0), we can observe gcc applying some optimizations. One example of this can be seen in gcc’s “unoptimized” (-O0) output corresponding to the C code line where george[1] is assigned a value. Identify the optimization and explain how it applies in this case.

Optimization (write Roman numeral identifier from list above): _______

c) [2pts] The unoptimized code on the left has an imul, while the optimized code on the right does not. Identify the optimization that led to the imul being missing from the optimized code, and explain how it applies in this case.

Optimization (write Roman numeral identifier from list above): _______

d) [2pts] The unoptimized code on the left has a recursive call to ham, while the optimized code on the right does not. Identify and describe the optimization that led to the recursive ham call being missing from the optimized code, and explain how it applies in this case.

Optimization (write Roman numeral identifier from list above): _______


e) [3pts] Consider the value that the optimized code on the right is returning. On the line below, copy the instruction from the optimized code that shows what it is returning (i.e., copy the instruction that writes the return value to the register corresponding to the return value):

Instruction: ____________________________________________

Based on this instruction, we note that the compiler has determined that the function always returns the same (constant) value. Yet the ham function is still doing some additional work, besides just returning that constant. What other work is the optimized ham function still doing?

Is the optimized code writing values to the george array: YES NO
Is the optimized code writing values to the burr string/array: YES NO

f) [1pt] The optimized code in this problem is much shorter than the equivalent unoptimized version on the left, in terms of number of instructions in the assembly code (about 16 lines of assembly code, compared to about 36 lines for the unoptimized version). Considering the gcc optimizations we studied, is the opposite ever true? That is, is it ever the case that optimized code is longer (more instructions) than the unoptimized version? (circle) YES NO


Problem 2: Who Complains? [18 points]

For this problem, you are presented with six small programs, along with the commands that in theory can be used to build and execute them. You should write something about each of the six programs, as follows:

Assume we compile using: gcc -Wall -o prog prog.c, and attempt to run the prog executable (if one was generated) with no arguments.

If any warnings are issued during the build process, you should identify which component(s) issues the warning(s) by CIRCLING the appropriate response(s), and then under “Explain,” explain what the warning(s) are.

If an error is issued during the build process, you should identify which component issued the error (thereby halting the build process) by CIRCLING the appropriate response, and then under “Explain,” explain what the error is.

If everything builds (with or without warning(s)) but the program crashes (either it may crash or it is sure to crash), you should say so by CIRCLING the appropriate response, and then under “Explain,” explain why.

Explanations can be very brief. I gave more space than should be necessary. Clarity is more key than length.

The programs aren’t intended to do anything meaningful, as they’re contrived to exercise your understanding of the build process and the C runtime. Take careful note of when command-line arguments are used and when they aren’t. You shouldn’t worry about the program’s output or return value when it executes without crashing. Also, note that when #include directives you might expect are NOT used in a program, that’s quite intentional!

a.) [3pts] Consider the following program, and respond according to the instructions given above.

#define HAMILTON main
int main(int argc, const char *argv[]) {
    if (argc > 0) return HAMILTON(argc-1, argv);
    else return 0;
}

Preprocessor: ERROR WARNING
Compiler: ERROR WARNING N/A (earlier error)
Linker: ERROR WARNING N/A (earlier error)
Running: NO CRASH MAY CRASH SURE CRASH N/A (earlier error)
Explain:


b.) [3pts] Consider the following program, and respond according to the instructions given above.

int main(int argc, const char *argv[]) {
    char * str = "Aaron Burr";
    printf("%s, sir.\n", str);
    return 0;
}

Preprocessor: ERROR WARNING
Compiler: ERROR WARNING N/A (earlier error)
Linker: ERROR WARNING N/A (earlier error)
Running: NO CRASH MAY CRASH SURE CRASH N/A (earlier error)
Explain:

c.) [3pts] Consider the following program, and respond according to the instructions given above.

#define NOT_THROWING_AWAY myshot

int main(int argc, const char *argv[]) {

return NOT_THROWING_AWAY * 2;

}

Preprocessor: ERROR WARNING
Compiler: ERROR WARNING N/A (earlier error)
Linker: ERROR WARNING N/A (earlier error)
Running: NO CRASH MAY CRASH SURE CRASH N/A (earlier error)
Explain:


d.) [3pts] Consider the following program, and respond according to the instructions given above.

#include <stdio.h>
char *satisfied();
int main(int argc, const char *argv[]) {
    printf("May you always be %s\n", satisfied());
    return 0;
}

Preprocessor: ERROR WARNING
Compiler: ERROR WARNING N/A (earlier error)
Linker: ERROR WARNING N/A (earlier error)
Running: NO CRASH MAY CRASH SURE CRASH N/A (earlier error)
Explain:

e.) [3pts] Consider the following program, and respond according to the instructions given above.

int main(int argc, const char *argv[]) {
    int * sisters = NULL;
    *sisters = 3;
    return 0;
}

Preprocessor: ERROR WARNING
Compiler: ERROR WARNING N/A (earlier error)
Linker: ERROR WARNING N/A (earlier error)
Running: NO CRASH MAY CRASH SURE CRASH N/A (earlier error)
Explain:


Problem 3: Bits and Bytes [15 points]

(a) Professor Aiken's STOKE stochastic optimizer used Hamming distance to assess how close an experimental result was to the correct answer.¹ Hamming distance is the count of positions where two bit patterns differ. For example, the bit patterns 01010 and 11110 have Hamming distance 2. Implement the function ham_close to return true if its two inputs have a Hamming distance <= 1 and false otherwise. A full-credit solution must operate efficiently using only bitwise/integer operators and straight-line code (no if/switch/loop).

bool ham_close(unsigned int a, unsigned int b)
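To make the definition concrete, here is a reference sketch that computes the full Hamming distance. It deliberately uses a loop, so it is not a full-credit form for ham_close; the name hamming_distance and the check function are hypothetical and exist only to illustrate the definition.

```c
#include <assert.h>

/* Reference illustration of the definition: XOR leaves a 1 bit in every
   position where a and b differ, and the loop counts those 1 bits. */
unsigned int hamming_distance(unsigned int a, unsigned int b) {
    unsigned int diff = a ^ b;
    unsigned int count = 0;
    while (diff != 0) {
        count += diff & 1u;   /* add the lowest bit */
        diff >>= 1;           /* move on to the next position */
    }
    return count;
}

/* The example from the problem text: 01010 vs 11110 differ in 2 places. */
void check_hamming_example(void) {
    assert(hamming_distance(0x0Au, 0x1Eu) == 2u);
}
```

Here 0x0A and 0x1E are the hexadecimal spellings of 01010 and 11110; their XOR is 10100, which has two 1 bits.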

(b) Write the signed (two’s complement) 8-bit binary number 10011110 in decimal:

(c) Write the unsigned 8-bit binary number 10011110 in decimal:

¹ Note: you don’t need to have been in attendance at the lecture to answer this question. The information is self-contained in the problem.
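Parts (b) and (c) read one 8-bit pattern in two different ways. The sketch below shows the two readings on a different pattern (11110000, chosen arbitrarily); the function names are hypothetical, and the signed reading is computed arithmetically rather than by casting, so it does not rely on implementation-defined conversions.

```c
#include <assert.h>

/* Value of an 8-bit pattern read as unsigned: just the pattern itself. */
int as_unsigned8(unsigned char bits) {
    return bits;
}

/* Value of the same pattern read as two's complement: the top bit
   carries weight -128 instead of +128, i.e. subtract 256 when it is set. */
int as_signed8(unsigned char bits) {
    return (bits & 0x80) ? (int)bits - 256 : (int)bits;
}

void check_interpretations(void) {
    assert(as_unsigned8(0xF0) == 240);   /* 11110000 read as unsigned */
    assert(as_signed8(0xF0) == -16);     /* same bits, two's complement */
    assert(as_signed8(0x7F) == 127);     /* top bit clear: same value */
}
```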


Problem 4: A Revolutionary Heap [18 points]

Consider a heap allocator implementation designed as follows:

Each block’s total size (in bytes) must be a multiple of 8.

All blocks, allocated and free, have a pre-node header (4 bytes) and a post-node footer (4 bytes), so the smallest possible block size (including header, payload, and footer) is 16 bytes.

The pre-node header and the post-node footer have the same format, which is as follows: (1) most significant bit is an allocated bit (1=allocated, 0=free); (2) the next 31 bits give the size of the block in words, where one word is 8 bytes (i.e., the size is the number of bytes in the block divided by 8).

There is a single free list, implemented as a sorted linked list of free blocks. Free blocks appear in the list in increasing order of block size. Thus, free nodes have an additional field overlaying the first 8 bytes of the payload, which is a pointer to the header of the next free block, or NULL at the end of the list.

Example 1: To fulfill a request to malloc 20 bytes, you would use a block of size 32 bytes: 4 bytes of header, 24 bytes of payload (20 bytes + 4 bytes of unused/wasted space because the nearest multiple of 8 larger than 20 is 24), and 4 bytes of footer. The 32 bits of the header and of the footer would each be “10000000000000000000000000000100” (1 for allocated, then 100 in binary = 4 in decimal, and 8*4 = 32 byte block size). This block would be found in the free list, sorted by the size 32.
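The arithmetic in Example 1 can be sketched in C as follows. This is only an illustration under the rules stated above (4-byte header, 4-byte footer, multiple-of-8 total, 16-byte minimum); the helper names example_block_bytes and example_pack_header are hypothetical and are not part of the allocator's given interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: total block bytes for a request. Add 8 bytes for
   header + footer, round up to a multiple of 8, and never go below the
   16-byte minimum block size. */
size_t example_block_bytes(size_t req_bytes) {
    size_t total = req_bytes + 8;
    total = (total + 7) & ~(size_t)7;   /* round up to multiple of 8 */
    return total < 16 ? 16 : total;
}

/* Hypothetical helper: build the 32-bit header/footer value. The top bit
   is the allocated flag; the low 31 bits hold the size in 8-byte words. */
uint32_t example_pack_header(int allocated, uint32_t size_words) {
    uint32_t alloc_bit = allocated ? UINT32_C(0x80000000) : 0;
    return alloc_bit | (size_words & UINT32_C(0x7FFFFFFF));
}

void check_example1(void) {
    assert(example_block_bytes(20) == 32);   /* 4 + 24 + 4 */
    /* 32 bytes / 8 = 4 words, allocated: 1000...0100 in binary */
    assert(example_pack_header(1, 4) == UINT32_C(0x80000004));
}
```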

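Example 2: Here is a diagram of a portion of a heap:

Address   Field     Contents
0x100     header    alloc 1, size 3
0x104     payload   16 bytes (allocated)
0x114     footer    alloc 1, size 3
0x118     header    alloc 0, size 3
0x11C     payload   16 bytes (free; next = 0x130)
0x12C     footer    alloc 0, size 3
0x130     header    alloc 0, size 3
0x134     payload   16 bytes (free; next = 0x148)
0x144     footer    alloc 0, size 3
0x148     header    alloc 0, size 7
          payload   48 bytes (free; next = NULL)
...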

In the above diagram, addresses are in hexadecimal, and each header and footer occupies 4 bytes. So the block with base address 0x100 is holding an allocated 16-byte payload with base address 0x104, and the total block size is 24 bytes (“size” = 3, and 8*3 = 24). The three blocks after that are all free (payloads of 16 bytes, 16 bytes, and 48 bytes, respectively). The block at address 0x118 has a “next” pointer to the block at 0x130, the block at 0x130 has a “next” pointer to the block at 0x148, and the block at 0x148 is the last block in the free list. Notice that the blocks in the free list are sorted in ascending order by size (24, 24, 56).
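The addresses in Example 2 follow directly from the block sizes. As a rough check, assuming blocks are laid out back to back with no gaps, the sketch below recomputes them; footer_addr and right_neighbor_addr are hypothetical names used only for this illustration, and they work on plain integers rather than real heap pointers.

```c
#include <assert.h>
#include <stdint.h>

/* Address of a block's footer, given the header address and the "size"
   field (in 8-byte words): the footer is the last 4 bytes of the block. */
uintptr_t footer_addr(uintptr_t header, uint32_t size_words) {
    return header + (uintptr_t)size_words * 8 - 4;
}

/* Address of the next block's header, assuming blocks are contiguous. */
uintptr_t right_neighbor_addr(uintptr_t header, uint32_t size_words) {
    return header + (uintptr_t)size_words * 8;
}

void check_example2(void) {
    assert(footer_addr(0x100, 3) == 0x114);
    assert(right_neighbor_addr(0x100, 3) == 0x118);
    assert(footer_addr(0x118, 3) == 0x12C);
    assert(right_neighbor_addr(0x118, 3) == 0x130);
    assert(footer_addr(0x130, 3) == 0x144);
    assert(right_neighbor_addr(0x130, 3) == 0x148);
}
```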

Assume the following global typedefs, constants, and variables have already been set up:

#define ALLOC (1 << 31)   /* mask used to isolate allocated bit */
static void *heapStart;   /* base address of entire heap segment */
static size_t heapSize;   /* number of bytes in heap segment */
void *free_list;          /* pointer to the header of the first block in the free list */

In this problem, you will be asked to write code to complete parts of this implementation of a heap allocator. We’ll start with some basic helper functions useful for malloc and free (parts a-b), and then consider more complex components of free (parts c-e).



a) [3pts] When the user calls malloc, we will need to translate their requested number of bytes into the appropriate minimum possible block size for our heap allocator system, in units of 8-byte words. Write a helper function that takes as input the requested bytes (the input to malloc) and returns the block size the heap allocator will use for its internal header/footer storage. Here are some sample inputs -> outputs: 16 -> 3, 2 -> 2, 8 -> 2, 20 -> 4.

int calc_block_size_words(size_t req_bytes)

{

}

b) [2pts] Write a helper function that will assist in reading data from your block’s header/footer format. Given a pointer to the header or footer of a block, this function returns the total size of the block in bytes (not words).

size_t block_size_bytes(void * block)

{

}

c) [2pts] Write a helper function that will assist in navigating your heap for coalescing (there will also be a corresponding helper to get the right block, which you are not required to write). Given the address of the current block (i.e., the address of the header of the current block), this helper will return the address of the header of the block to the left (smaller memory address). If there is no left block (i.e., at the boundaries of the heap), return NULL.

void * left_block(void * curr)

{

}


d) [3pts] Write a helper function that will assist in navigating your free list. Given a pointer to the header of a block that is in the free list, this function returns a pointer to the next block in the free list (or NULL if the block is the last block in the free list).

void * next_free(void * curr)

{

}

e) [3pts] This helper function takes a pointer to the header of a newly freed block and adds it to the free list as the new first element of the free list. Assume we have already determined that this is where it belongs in sorted order. Assume the contents of the header and the footer have already been set appropriately before the call to this function.

void add_to_free_list_beginning(void * block)

{

}

f) [3pts] This helper function takes a pointer to the header of a newly freed block and adds it to the free list so it appears after the provided ‘prev’ node. Assume we have already determined that this is where it belongs in sorted order. Assume the contents of the header and the footer have already been set appropriately before the call to this function.

void add_to_free_list_after(void * prev, void * block)

{

}


g) [2pts] This helper function takes a pointer to the header of a newly freed block and adds it to the free list in appropriately sorted order. Assume the contents of the header and the footer have already been set appropriately before the call to this function.

void add_to_free_list(void * block)

{

void * curr = free_list;

size_t blocksize = block_size_bytes(block);

if (curr == NULL) {

add_to_free_list_beginning(block);

return;

}

size_t currsize = block_size_bytes(curr);

if (blocksize <= currsize) {

add_to_free_list_beginning(block);

return;

}

void * prev = free_list;

while (curr != NULL) {

currsize = block_size_bytes(curr);

if (blocksize <= currsize) {

add_to_free_list_after(prev, block);

return;

}

prev = curr;

curr = next_free(curr);

}

// write code here to handle adding block to the end of the free list

}
