Lab sessions Mon Feb 16 to Thu Feb 19
Lab written by Julie Zelenski
This lab is designed to give you a chance to:
Find an open computer and somebody new to sit with. Introduce yourself and celebrate/commiserate how things went for you on last week's midterm.
Get started. Clone the starter project using the command
hg clone /afs/ir/class/cs107/repos/lab6/shared lab6
This creates the lab6 directory which contains source files and a Makefile.
Pull up the online lab checkoff and have it open in a browser so you'll be able to jot down things as you go. At the end of the lab period, you will submit that sheet and have the TA check off your work.
Dissecting the stack. Use gdb to make some observations about the runtime stack. Open stack.c
in your editor and peruse the C code for the various inky
functions. Build the stack
program and run gdb on it.
dinky
and run the program. When the breakpoint is hit, use backtrace
or where
command to see the current frames on the runtime stack.up
, down
, and frame n
commands allow you to change the selected frame. These commands don't change the state of program execution (the execution remains where it stopped in the lowest frame), but they allow you to examine the runtime state from the perspective of another stack frame. For example, you change to the frame for winky
or main
, you will then be able to print the variables/parameters that are visible only in that scope.info frame
command tells you the story about this stack frame. The info args
and info locals
provide information about the parameters and local variables, respectively. Try changing frames and using the various info
commands to see results in different contexts.You can dump the raw memory from the stack using the x
command.
(gdb) x/10wx $esp examines 10 hex words from stack starting at current esp
Do a stack dump like this and identify how the contents of the stack memory map to the various parts of the stack frame (locals, return address, saved bp, parameters).
Disassemble some of the *inky
functions to see the coordination between the caller and callee for a function call. Who puts the parameters on the stack (caller or callee?) What action puts the return address on the stack? Who saves the base pointer and changes the base pointer for the new function? Who makes space on the stack for the locals? Who cleans up each of these things put onto the stack? How is the return value communicated? How is a struct passed as a parameter? What about a pointer to a struct?
main
function prints the address of its stack frame when run. Run the program a few times from the shell and note the addresses that are printed--- is the stack always being placed in the same location or not? Is the stack in the same location when you run the program under gdb?
Parameter passing by "channeling". The code in the channeling
function in stack.c
is based on a bug I once helped a novice student with.
channeling
function calls the init_array
function to initialize an array followed by a call to print_array
to print it. Run the program. You'll see that the code seems to work just fine despite the fact that neither function takes any parameters nor returns any values! Just exactly how are these functions communicating?channeling
(between init and print calls) and compile and re-execute. Now what happens? How can you explain the change in outcome?Perhaps this exercise will help you make sense of some previous situation where your program's behavior was reproducibly yet inexplicably altered by adding/removing/changing some innocent and seemingly unrelated code!
Love thy neighbor. The clear_array
function in stack.c
shows a bug made by those unaccustomed to C's zero-based array indexing. Run the program and allow it to execute clear_array
. What do you observe? Run the program under gdb and poke around to make sense of what is happening. How does the observed symptom relate to the underlying bug?
In this function, the value of i
changes as a result of code that doesn't directly assign to it. These kind of bugs are calling "stomping" or "clobbering" memory. A variable can be stomped on due to an under/overrun of a neighboring region, using a pointer into a deallocated stack frame, referring to freed heap memory, or using an uninitialized pointer. These bugs are often difficult to track down since the variable being changed is not named in the code that changes it and it may have been stomped long before you notice the effect. It's worth learning about the useful gdb watchpoint feature. The gdb watch
command is given an expression to watch such as a variable or a memory location. It then sets up a special kind of breakpoint that stops your program whenever there is a change to that variable or memory location.
(gdb) watch i // report when i changes
(gdb) watch *0xff8b1234 // report when specified memory location changes
Run under gdb, set a breakpoint on clear_array
and once hit, set up a watchpoint to monitor changes to i
. Verify that it reports on all changes to the memory at that location, including when it is clobbered. Watchpoints can can be a useful tool for tracking down those bugs that make mysterious changes to memory (and may be a useful tool in reverse engineering a bomb, too. Hmm...)
Recursion. The factorial
function in stack.c
is a classic recursive formulation of the factorial function.
factorial(n)
become erroneous? Why does that happen?factorial
function (e.g. a 1000-element array), re-compile and run in gdb again on the -1 case. What changes?factorial
stack frame is including the space for saved bp and return address. Multiply the size of the frame by the number of frames to get an estimate of the maximum possible size of the runtime stack. What is the maximum value factorial you can attempt to compute before you seem to run out of stack space?limit
(or ulimit -a
for sh shells) reports the process limits, including the stack size. Does the stack size reported by limit match up with the calculation you did above?Change the process limit on stack size using the limit command shown below. Re-run the program, what changes?
limit stacksize 100 meg
Edit the Makefile to change the optimization level used to compile the program. Look for the line that begins stack: CFLAGS +=
which is specifying the compiler flags for compiling the stack program. The current setting should be -O0
which compiles with no optimization. Change -O0
to -O2
to apply moderately aggressive optimizations. Re-compile and run again. Woah! What happened? Did it actually "fix" the infinite recursion? Disassemble the optimized factorial
to see what the compiler did with the code. This fancy optimization is called "tail-recursion elimination".
Stack smashing. The check_name
function in attack.c
shows an all-too-common bug of reading into a stack-allocated buffer naively assuming the buffer will also be big enough for what is being read.
attack
program. (Note warning about "dangerous", remember this for later...) Run the program and when it asks for your name, enter "Pat" and all is fine. Now run again and respond that your name is "John Jacob Jingleheimer Schmidt". What happens?attack
under gdb, enter the very long name to reproduce the crash and examine the backtrace---how odd and unhelpful! Set a breakpoint on the check_name
function. Run the program and when it stops at this breakpoint, examine the 8 bytes of stack housekeeping using x/2wx $ebp
. Uses next
to walk through the gets
call, enter the long name, then examine the stack housekeeping again. What happened? Try x/s $ebp
to get a different view of it. What has happened to the saved bp and return address? What will now happen when check_name
tries to return?attack.o: CFLAGS +=
. Edit that line to change -fnostack-protector
into -fstack-protector
. Recompile and re-run. Now what happens when you enter a long name? The stack protector is a special bit of code that halts the program on stack buffer overruns.gets
-- read its man page for advice on what to use instead.
Optional extra challenge for those curious about stack-smashing evilware. Poorly-written functions like check_name
can be exploited by malicious programs in a buffer overrun attack. The basic idea is to supply just the "right" input when overflowing an unprotected stack-allocated buffer.
main
. Where is the code supposed to return after a call to check_name
? Where would you rather it return instead? What kind of input could you supply to check_name
that could get the program to return to that location instead? Note that an overrun not only overwrites the return address, it also trashes the saved base pointer. Usually this causes problems when later unwinding the stack to the previous frame. However, in this particular program, the remaining instructions of main
don't refer to the base pointer again, so we can get away with it. Fixing up the base pointer is tricky.attack
must be compiled with no optimization and no stack protection (In the Makefile, be sure the line reads attack: CFLAGS += -O0 -fnostack-protector
and recompile). Feed Sneaky's input to the program like this ./attack < sneaky_input
. How does Sneaky Guy manage to get an A? It may help to use od -t x1 sneaky_input
to see a raw dump of the file contents.Before you leave, be sure to submit your checkoff sheet (in the browser) and have lab TA come by and confirm so you will be properly credited for lab If you don't finish everything before lab is over, we strongly encourage you to finish the remainder on your own!