Lab sessions Mon Feb 16 to Thu Feb 19
Lab written by Julie Zelenski
This lab is designed to give you a chance to:
Find an open computer and somebody new to sit with. Introduce yourself and celebrate/commiserate how things went for you on last week's midterm.
Get started. Clone the starter project using the command
hg clone /afs/ir/class/cs107/repos/lab6/shared lab6
This creates the lab6 directory which contains source files and a Makefile.
Pull up the online lab checkoff and have it open in a browser so you'll be able to jot down things as you go. At the end of the lab period, you will submit that sheet and have the TA check off your work.
Dissecting the stack. Use gdb to make some observations about the runtime stack. Open
stack.c in your editor and peruse the C code for the various
inky functions. Build the
stack program and run gdb on it.
dinkyand run the program. When the breakpoint is hit, use
wherecommand to see the current frames on the runtime stack.
frame ncommands allow you to change the selected frame. These commands don't change the state of program execution (the execution remains where it stopped in the lowest frame), but they allow you to examine the runtime state from the perspective of another stack frame. For example, you change to the frame for
main, you will then be able to print the variables/parameters that are visible only in that scope.
info framecommand tells you the story about this stack frame. The
info localsprovide information about the parameters and local variables, respectively. Try changing frames and using the various
infocommands to see results in different contexts.
You can dump the raw memory from the stack using the
(gdb) x/10wx $esp examines 10 hex words from stack starting at current esp
Do a stack dump like this and identify how the contents of the stack memory map to the various parts of the stack frame (locals, return address, saved bp, parameters).
Disassemble some of the
*inky functions to see the coordination between the caller and callee for a function call. Who puts the parameters on the stack (caller or callee?) What action puts the return address on the stack? Who saves the base pointer and changes the base pointer for the new function? Who makes space on the stack for the locals? Who cleans up each of these things put onto the stack? How is the return value communicated? How is a struct passed as a parameter? What about a pointer to a struct?
mainfunction prints the address of its stack frame when run. Run the program a few times from the shell and note the addresses that are printed--- is the stack always being placed in the same location or not? Is the stack in the same location when you run the program under gdb?
Parameter passing by "channeling". The code in the
channeling function in
stack.c is based on a bug I once helped a novice student with.
channelingfunction calls the
init_arrayfunction to initialize an array followed by a call to
print_arrayto print it. Run the program. You'll see that the code seems to work just fine despite the fact that neither function takes any parameters nor returns any values! Just exactly how are these functions communicating?
channeling(between init and print calls) and compile and re-execute. Now what happens? How can you explain the change in outcome?
Perhaps this exercise will help you make sense of some previous situation where your program's behavior was reproducibly yet inexplicably altered by adding/removing/changing some innocent and seemingly unrelated code!
Love thy neighbor. The
clear_array function in
stack.c shows a bug made by those unaccustomed to C's zero-based array indexing. Run the program and allow it to execute
clear_array. What do you observe? Run the program under gdb and poke around to make sense of what is happening. How does the observed symptom relate to the underlying bug?
In this function, the value of
i changes as a result of code that doesn't directly assign to it. These kind of bugs are calling "stomping" or "clobbering" memory. A variable can be stomped on due to an under/overrun of a neighboring region, using a pointer into a deallocated stack frame, referring to freed heap memory, or using an uninitialized pointer. These bugs are often difficult to track down since the variable being changed is not named in the code that changes it and it may have been stomped long before you notice the effect. It's worth learning about the useful gdb watchpoint feature. The gdb
watch command is given an expression to watch such as a variable or a memory location. It then sets up a special kind of breakpoint that stops your program whenever there is a change to that variable or memory location.
(gdb) watch i // report when i changes (gdb) watch *0xff8b1234 // report when specified memory location changes
Run under gdb, set a breakpoint on
clear_array and once hit, set up a watchpoint to monitor changes to
i. Verify that it reports on all changes to the memory at that location, including when it is clobbered. Watchpoints can can be a useful tool for tracking down those bugs that make mysterious changes to memory (and may be a useful tool in reverse engineering a bomb, too. Hmm...)
factorial function in
stack.c is a classic recursive formulation of the factorial function.
factorial(n)become erroneous? Why does that happen?
factorialfunction (e.g. a 1000-element array), re-compile and run in gdb again on the -1 case. What changes?
factorialstack frame is including the space for saved bp and return address. Multiply the size of the frame by the number of frames to get an estimate of the maximum possible size of the runtime stack. What is the maximum value factorial you can attempt to compute before you seem to run out of stack space?
ulimit -afor sh shells) reports the process limits, including the stack size. Does the stack size reported by limit match up with the calculation you did above?
Change the process limit on stack size using the limit command shown below. Re-run the program, what changes?
limit stacksize 100 meg
Edit the Makefile to change the optimization level used to compile the program. Look for the line that begins
stack: CFLAGS += which is specifying the compiler flags for compiling the stack program. The current setting should be
-O0 which compiles with no optimization. Change
-O2 to apply moderately aggressive optimizations. Re-compile and run again. Woah! What happened? Did it actually "fix" the infinite recursion? Disassemble the optimized
factorial to see what the compiler did with the code. This fancy optimization is called "tail-recursion elimination".
Stack smashing. The
check_name function in
attack.c shows an all-too-common bug of reading into a stack-allocated buffer naively assuming the buffer will also be big enough for what is being read.
attackprogram. (Note warning about "dangerous", remember this for later...) Run the program and when it asks for your name, enter "Pat" and all is fine. Now run again and respond that your name is "John Jacob Jingleheimer Schmidt". What happens?
attackunder gdb, enter the very long name to reproduce the crash and examine the backtrace---how odd and unhelpful! Set a breakpoint on the
check_namefunction. Run the program and when it stops at this breakpoint, examine the 8 bytes of stack housekeeping using
x/2wx $ebp. Uses
nextto walk through the
getscall, enter the long name, then examine the stack housekeeping again. What happened? Try
x/s $ebpto get a different view of it. What has happened to the saved bp and return address? What will now happen when
check_nametries to return?
attack.o: CFLAGS +=. Edit that line to change
-fstack-protector. Recompile and re-run. Now what happens when you enter a long name? The stack protector is a special bit of code that halts the program on stack buffer overruns.
gets-- read its man page for advice on what to use instead.
Optional extra challenge for those curious about stack-smashing evilware. Poorly-written functions like
check_name can be exploited by malicious programs in a buffer overrun attack. The basic idea is to supply just the "right" input when overflowing an unprotected stack-allocated buffer.
main. Where is the code supposed to return after a call to
check_name? Where would you rather it return instead? What kind of input could you supply to
check_namethat could get the program to return to that location instead? Note that an overrun not only overwrites the return address, it also trashes the saved base pointer. Usually this causes problems when later unwinding the stack to the previous frame. However, in this particular program, the remaining instructions of
maindon't refer to the base pointer again, so we can get away with it. Fixing up the base pointer is tricky.
attackmust be compiled with no optimization and no stack protection (In the Makefile, be sure the line reads
attack: CFLAGS += -O0 -fnostack-protectorand recompile). Feed Sneaky's input to the program like this
./attack < sneaky_input. How does Sneaky Guy manage to get an A? It may help to use
od -t x1 sneaky_inputto see a raw dump of the file contents.
Before you leave, be sure to submit your checkoff sheet (in the browser) and have lab TA come by and confirm so you will be properly credited for lab If you don't finish everything before lab is over, we strongly encourage you to finish the remainder on your own!