Lab sessions Mon Feb 09 to Thu Feb 12
Lab written by Julie Zelenski
This lab is designed to give you a chance to:
Find an open computer and somebody new to sit with. Introduce yourself and share your suggestions about how to best prep for the upcoming midterm.
Get started. Clone the lab starter project using the command
hg clone /afs/ir/class/cs107/repos/lab5/shared lab5
This creates the lab5 directory which contains source files and a Makefile. Bring up our guide to IA32 basics and this handy IA32 cheat sheet (stolen from The Laboratory for Software Technology) in your browser for reference during lab. And as always, have the online lab checkoff ready so you can jot down things as you go. At the end of the lab period, you will submit the sheet and have the TA check off your work.
Disassembling with objdump.
objdump is a tool that operates on object files (i.e. files containing compiled machine code). It can dig out all sorts of information from the object file (the objdump man page is a good resource for learning more), but one of the more common uses is to disassemble object code into assembly form. Let's try it out!
objdump -d disassembles an object file. The output is a list of binary-encoded machine instructions, each alongside its assembly equivalent (this format is like the one you used for disassemble() in assignment 4). If the object file was compiled with debugging information, adding the -S flag to objdump will intersperse the original C source with the assembly. Thus, objdump -S -d shows each C construct matched with its compiled translation into assembly. This sort of dump is called a deadlist ("dead" to distinguish from the study of "live" assembly as it executes). The deadlist of an entire program can be long and a bit overwhelming.
The countops.py utility from this lab reports the assembly instructions most heavily used in a given object file. Try out countops.py trace.o for an example. This python program operates by invoking objdump to disassemble the file, tallying instructions by opcode, and reporting the top 10 most frequent. Try it out on a few executables (your spellcheck or reassemble, or programs like emacs or gcc and so on) to get an idea of what the mix of IA32 instructions tends to look like.
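To make the "disassemble, tally, report" idea concrete, here is a minimal sketch of how such a tally could work. It is not the actual countops.py source; the function names tally_opcodes and count_ops are made up for illustration, and the parsing assumes the usual tab-separated layout of objdump -d output (address, raw bytes, instruction).

```python
import subprocess
from collections import Counter


def tally_opcodes(disassembly, top=10):
    """Count opcode mnemonics in `objdump -d` text; return the `top` most common."""
    counts = Counter()
    for line in disassembly.splitlines():
        # Assumed layout of an instruction line: "  addr:<TAB>hex bytes<TAB>mnemonic operands"
        fields = line.split("\t")
        if len(fields) < 3 or not fields[0].strip().endswith(":"):
            continue  # skip headers, labels, blank lines, byte-only continuation lines
        words = fields[2].split()
        if words:
            counts[words[0]] += 1  # first word is the mnemonic (or a prefix such as rep)
    return counts.most_common(top)


def count_ops(path, top=10):
    """Hypothetical wrapper: disassemble `path` with objdump and tally its opcodes."""
    result = subprocess.run(["objdump", "-d", path],
                            capture_output=True, text=True, check=True)
    return tally_opcodes(result.stdout, top)
```

Calling count_ops("trace.o") would return a list of (opcode, count) pairs, most frequent first, assuming objdump is on your path.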
Assembly-level debugging in gdb. Our debugger has great support for working with code at the assembly level. Load the trace program in gdb and try out the gdb commands listed below that allow you to poke around at the assembly level. To learn more about a command available within gdb, use gdb's built-in help.
The disassemble command. With no arguments, it prints the disassembled instructions for the currently executing function. You can also give an optional argument of what to disassemble: a function name, address, or range of addresses.
disassemble/m will intersperse assembly with the original C source, which can be helpful when trying to match up the two.
(gdb) disassemble main
Dump of assembler code for function main:
   0x08048596 <+0>:     push   %ebp
   0x08048597 <+1>:     mov    %esp,%ebp
   0x08048599 <+3>:     and    $0xfffffff0,%esp
   ...
In the disassembly as printed by gdb, the hex number in the leftmost column is the address in memory of that instruction, and the number in angle brackets is the offset of that instruction relative to the start of the function. You may notice minor differences in presentation between the disassembled instructions as printed by gdb and the output from objdump: for example, one may show movl where the other shows mov, negative signed values may display as large unsigned values, and so on.
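The signed-versus-unsigned display difference is just two readings of the same 32 bits. As a small illustration (the helper names to_signed32 and to_unsigned32 are made up for this sketch), the immediate 0xfffffff0 from the and instruction in the main listing above is -16 when read as a signed value:

```python
def to_signed32(value):
    """Reinterpret the low 32 bits of `value` as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def to_unsigned32(value):
    """Reinterpret a (possibly negative) integer as an unsigned 32-bit value."""
    return value & 0xFFFFFFFF


print(to_signed32(0xFFFFFFF0))    # -16
print(hex(to_unsigned32(-16)))    # 0xfffffff0
```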
The x command (examine memory) includes an instruction format: x/i addr will decode the binary-encoded instruction at a given address and print its disassembled translation:
(gdb) x/i main          prints first instruction of main
(gdb) x/8i main         prints first 8 instructions of main
You can set a breakpoint at a specific machine instruction by specifying its address (b *address) or an offset within a function (b *main+6). Note that the latter is not 6 instructions into main, but 6 bytes' worth of instructions into main. Given the variable-length encoding of IA32 instructions, 6 bytes can correspond to one or several instructions.
(gdb) b *0x08048375     break at specified address
(gdb) b *main+6         break at instruction 6 bytes past start of main
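To see why a byte offset is not an instruction count, here is a rough sketch that maps a byte offset to an instruction index given each instruction's length. The function instruction_at_offset is made up for illustration. The first two lengths follow from the <+0>, <+1>, <+3> offsets in the gdb listing of main shown earlier; the 3-byte length for the and instruction and the 5-byte fourth instruction are assumptions, not taken from the lab files.

```python
def instruction_at_offset(lengths, offset):
    """Return the 0-based index of the instruction that starts at byte `offset`,
    or None if `offset` lands mid-instruction or past the listed instructions."""
    start = 0
    for index, length in enumerate(lengths):
        if start == offset:
            return index
        start += length
    return None


# push (1 byte), mov (2 bytes), and (assumed 3 bytes), hypothetical 5-byte instruction
main_lengths = [1, 2, 3, 5]

print(instruction_at_offset(main_lengths, 6))   # 3, i.e. the fourth instruction
print(instruction_at_offset(main_lengths, 2))   # None, offset 2 is inside the mov
```

Under these assumptions, main+6 is the fourth instruction, not the seventh. Checking offsets against the disassemble output before setting a breakpoint avoids landing in the middle of an instruction.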
The stepi and nexti commands allow you to single-step through assembly instructions. These are the assembly-level equivalents of the source-level step and next commands. (They can be abbreviated si and ni.)
(gdb) stepi             executes next single machine instruction
(gdb) nexti             executes next machine instruction (proceed over fn calls)
The info reg command will print the values of the eight integer registers and the condition codes. info all-reg includes floating point and vector registers. You can refer to an individual register by name to view or change the register's value. Within gdb, a register name is prefixed with $ instead of the % used in the assembly.
(gdb) info reg
(gdb) p $ebp            show current value in %ebp register
(gdb) set $eax = 9      change current value in %eax register
You can add a display expression to print the current value of a given expression each time your program stops in the debugger. One useful expression to display when stepping is the next instruction to be executed. The eip register holds the address of the next instruction to be executed, so setting it to display before you stepi will print each instruction before executing it. The display command works for other expressions, too (variables, parameters, arithmetic, and so on). Very handy!
(gdb) display/i $eip
Last, but certainly not least, this is a great time to try out the tui (text user interface) I have been using in lecture. Tui mode splits your session into panes for simultaneously viewing the C source, assembly translation, and/or current register state. The gdb layout command puts the debugger into tui mode. The layout argument specifies which pane(s) you want (e.g. split). Tui mode is a great tool for tracing and visualizing execution, but sadly it can also be a nuisance at times (garbling the display and/or misleading you about the current state of affairs). If your tui window has gotten whacked, the refresh command sometimes works to clean it up. If things get really out of hand, ctrl-x a will exit tui mode and return you to ordinary non-graphical gdb.
Reading assembly. Read over the C code in trace.c. Compile the program and use objdump -d -S trace.o to deadlist the generated assembly interspersed with the original C source (or use disassemble/m fn_name in gdb to do the same by function name). There are several interesting observations you can make by comparing the C code to its translation. Study the disassembled output in order to answer the following questions.
... nums array initialized? What happened to the strlen call on the string constant to init the last array element?
... count? What does this tell you about the ...
... nums and compare to the use of ptr. Accessing an element using ptr[i] requires one more memory access than via nums[i]; can you identify where that happens in the instruction stream? Do you understand why there is an additional memory load in this case?
... add/sub instruction does the correct thing for both unsigned and signed arithmetic?
... stepi in the debugger. Can you explain what's going on in this case and why it's more complex?
... sar) or logical (...
... for loop versus the while loop? For the ...
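Two of the questions above concern how one add/sub instruction can serve both signed and unsigned arithmetic, and how an arithmetic shift (sar) differs from a logical one. The sketch below models 32-bit registers with Python integers to illustrate both ideas. It is an illustration under that modeling assumption, not the trace.c code, and the helper names add32, shr32, and sar32 are made up; shift counts are assumed to be in the range 0 to 31.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(bits):
    """Read a 32-bit pattern as a two's-complement signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def add32(a, b):
    """Model a 32-bit add: add the bit patterns and discard the carry out."""
    return (a + b) & MASK32


def shr32(bits, n):
    """Logical right shift: vacated high bits are filled with zeros."""
    return (bits & MASK32) >> n


def sar32(bits, n):
    """Arithmetic right shift: vacated high bits are copies of the sign bit."""
    return (to_signed32(bits) >> n) & MASK32


# One addition, two readings of the same result bits.
total = add32(0xFFFFFFF0, 0x8)
print(hex(total))            # 0xfffffff8  (unsigned: 4294967280 + 8)
print(to_signed32(total))    # -8          (signed: -16 + 8)

# The two right shifts differ only when the sign bit is set.
print(hex(shr32(0xFFFFFFF0, 2)))   # 0x3ffffffc
print(hex(sar32(0xFFFFFFF0, 2)))   # 0xfffffffc, i.e. -4 as signed
```

The addition produces the same bit pattern either way; only the interpretation of those bits (and of the condition codes) depends on signedness, whereas a right shift has to choose what to shift in at the top.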
Reverse-engineering and hand-generation. The program solve makes several calls to the function mystery and prints its results. Let's look into this mystery further! The mystery function was written directly in assembly, not compiled from C. Open the mystery.s file to read the assembly. Now use gdb to stepi through the execution of a call to mystery and observe its operation. Once you understand it, jot down an equivalent C version of mystery. And lastly, try out hand-generating a little assembly by editing the mystery.s file to change the implementation of mystery to instead return the negation (i.e. -value) of the smaller of the two arguments. Compile and test to verify that your assembly code is correct.
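When testing your edited assembly, it can help to have an independent reference for the expected answers. Below is a small sketch of such a reference for the modified behavior only (it says nothing about what the original mystery computes). It assumes both arguments are signed 32-bit integers, which the lab text does not state, and the name expected_result is made up.

```python
def to_signed32(bits):
    """Read the low 32 bits of `bits` as a two's-complement signed integer."""
    bits &= 0xFFFFFFFF
    return bits - (1 << 32) if bits & 0x80000000 else bits


def expected_result(a, b):
    """Negation of the smaller of two signed 32-bit arguments, wrapped to 32 bits."""
    smaller = min(to_signed32(a), to_signed32(b))
    return to_signed32(-smaller)


print(expected_result(3, 7))          # -3
print(expected_result(-5, 2))         # 5
print(expected_result(-2**31, 0))     # -2147483648, negating INT_MIN wraps around
```

The last case is worth including in your tests: the most negative 32-bit value has no positive counterpart, so negating it yields the same value back.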
Visualize awesomeness. Three super-cool CS107 alumni (Thank you Julia, Kat, and Constance!) developed a web-based interactive simulator/visualizer designed for students learning IA32. I think it's pretty nifty! Visit Rainbow Onion to check it out. You can use it as a tutorial by choosing one of the topics from the tutorial menu and walking through its guided example, using the included self-test exercises along the way to confirm your understanding. You can also use it as a simulator for experimentation and play. Edit the assembly code in the main pane and then "Step" through it, while observing updates to the registers and condition codes. Does the visualization help you better understand what's happening at the machine level?
Before you leave, be sure to submit your checkoff sheet (in the browser) and have your lab TA come by and confirm so you will be properly credited. If you don't finish everything before lab is over, we strongly encourage you to finish the remainder on your own!