Due date: Mon Mar 02 11:59 pm - Hard deadline: Fri Mar 06 11:59 pm
Assignment by Julie Zelenski
After completing this assignment, you can proudly say that you have learned how to:
When a program crashes outside of the debugger, its dying words, often nothing more than Segmentation fault, are little help in pinpointing the problem. With knowledge of the runtime stack and the executable file layout, you can come to the rescue by writing a fault handler. The handler can be linked into a program and will intercede on a crash and walk up the stack to give a symbolic backtrace of where the crash occurred. Fault handlers are often built into commercial applications to capture crash reports out in the field.
The assignment consists of two tasks:
- A namelist program that prints the list of functions contained in an object file. This program is a simplified version of the nm utility. This intermediate milestone verifies your handling of the ELF function symbol data.
- A crash reporter library that builds on the symbol handling from namelist.

Both namelist and crash reporter require you to do a little dissection of an object file. An object file is the result of compiling, assembling, and possibly linking C source code. It contains global variables, string constants, function names, IA32 code, etc., all encoded as binary data. Object files on our Linux machines are in the Executable and Linking Format (ELF). The elf man page and sections 7.3-7.5 of Bryant and O'Hallaron provide explanation and diagrams for ELF. (And for the insomniacs, here is the full 100-page ELF specification.) The helpful readelf command can be used to print the contents of a specific part of an ELF file to see how the data is stored. And here is a neat ELF file diagram that passed by HackerNews the other day. (Thanks to Dmitri and Daltron for sharing!)
For this assignment, you will dig into the ELF file to extract the function symbol information. The relevant ELF data includes:
- readelf -h filename prints the ELF file header.
- readelf -S filename prints the section header table.
- readelf -s filename prints the symbol table section.
- readelf -x index filename prints the contents of the section at the specified index (index is a number) in the section header table; this will dump the entire string table section.

The diagram below shows the parts of the ELF format you will access (uint is used as an abbreviation for unsigned int and uintptr_t types):
An ELF file is designed to be directly accessed with minimal translation from its on-disk representation. For example, the section header table is a contiguous sequence of section headers, where each section header is the same size and has the same fields in the same order. Thus, the section header table is laid out as an array of section header structures. Similarly, the symtab section is an array of symbol structs. This means that you can apply a typecast to the location of the data within the file and treat it like an array, directly accessing entries using ordinary array notation.
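To make the typecast idea concrete, here is a minimal sketch that treats the section header table as an ordinary C array. It assumes the whole file is already in memory (for example via mmap), that it is a 32-bit ELF file, and that the abbreviated field names used in this handout correspond to the standard <elf.h> fields e_shoff, e_shnum, and sh_type. The function name find_section_by_type is made up for illustration.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: returns the first section header whose type equals
 * `type`, or NULL if there is none. `file_start` points to the first byte
 * of a 32-bit ELF file that has been read or mapped into memory.
 * Assumption: the handout's "offset" and "nsectionheaders" file header
 * fields are e_shoff and e_shnum in <elf.h>, and "type" is sh_type. */
const Elf32_Shdr *find_section_by_type(const void *file_start, Elf32_Word type)
{
    assert(file_start != NULL);
    const Elf32_Ehdr *ehdr = file_start;

    /* e_shoff counts bytes from the start of the file, so do the pointer
     * arithmetic on a char * before casting to the struct type. */
    const Elf32_Shdr *shdrs =
        (const Elf32_Shdr *)((const char *)file_start + ehdr->e_shoff);

    for (size_t i = 0; i < ehdr->e_shnum; i++) {
        if (shdrs[i].sh_type == type)
            return &shdrs[i];      /* plain array indexing on file data */
    }
    return NULL;
}
```

A call such as find_section_by_type(file_start, SHT_SYMTAB) would return the symtab section header if the file has one.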
Here are the specific steps to read the symbols from an ELF file:
1. Use the offset and nsectionheaders fields from the file header to identify the location and length of the section header table.
2. Search the section header table for the section header whose type field is equal to SHT_SYMTAB. This is the header for the symbol table (symtab) section. An ELF file will have at most one symtab section.
3. Use the offset and size fields from the symtab section header to identify the location and size of the symtab section data. The offset for a section header is the number of bytes between the beginning of the ELF file and the first byte of the section data.
4. Read the symtab section header's strtab_index field, which is the index into the section header table for the companion string table section header. There can be more than one string table section in an ELF file, so be sure to use strtab_index to access the correct one.
5. Use the offset and size fields from the strtab section header to identify the location and size of the strtab section data.
6. For each symbol in the symtab, the name field identifies where to find the symbol's name. The name offset is expressed as the number of bytes from the start of the strtab section data to the start of the symbol's name string.

Your first task is to write code to extract the function symbols from an ELF file. Both namelist and crash reporter build on this functionality, so you will put this code into a shared module that is compiled into both. You are to design the public interface of the symbols module. The goal of any interface is to provide useful functionality via routines that are sensibly designed and easy to use. A client should be able to make a simple request to get the desired information, which is returned in a tidy package. Your interface should have the necessary flexibility to support its two known clients (namelist and crash reporter), but you don't need to anticipate needs of other potential clients in an attempt to predict the future.
Just because the internals of the symbols module have to deal with ELF in its ugly, native format doesn't mean it should return data to the clients in this raw state. The symbols module can, and should, abstract away the goopy details and provide the requested information in a form that doesn't require the client to get entangled in the low-level technicalities of ELF.
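As an illustration of the low-level work the symbols module would hide, the symbol-reading steps listed earlier might look roughly like the sketch below. It assumes a 32-bit ELF file already in memory and maps the handout's field names onto <elf.h> as follows: strtab_index is sh_link, offset and size are sh_offset and sh_size, and name is st_name. The helper names symbol_count and symbol_name are hypothetical, not part of any required interface.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Number of entries in the symtab section described by `symtab`.
 * The section data is a packed array of Elf32_Sym structs. */
size_t symbol_count(const Elf32_Shdr *symtab)
{
    assert(symtab != NULL && symtab->sh_type == SHT_SYMTAB);
    return symtab->sh_size / sizeof(Elf32_Sym);
}

/* Returns a pointer to the name of symbol number `i`. The pointer refers
 * directly into the in-memory file; nothing is copied.
 * Assumption: strtab_index/name in the handout are sh_link/st_name. */
const char *symbol_name(const void *file_start, const Elf32_Shdr *symtab,
                        size_t i)
{
    const char *base = file_start;
    const Elf32_Ehdr *ehdr = file_start;
    const Elf32_Shdr *shdrs = (const Elf32_Shdr *)(base + ehdr->e_shoff);

    /* The symtab header names its companion string table by index. */
    const Elf32_Shdr *strtab = &shdrs[symtab->sh_link];

    const Elf32_Sym *syms = (const Elf32_Sym *)(base + symtab->sh_offset);
    const char *strings = base + strtab->sh_offset;

    assert(i < symbol_count(symtab));
    assert(syms[i].st_name < strtab->sh_size);

    /* st_name is a byte offset from the start of the strtab data. */
    return strings + syms[i].st_name;
}
```

Because the returned name points into the file data, a real module has to decide whether to keep the file in memory or copy the strings it needs; that choice is part of the interface design left to you.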
The namelist program prints the function symbols from an object file. It is a simplified version of the standard nm utility.

If the file has no symbol table section (e.g. strip was used), namelist prints a message indicating there was no symbol table and exits. If the file has a symbol table section but it contains no defined function symbols, namelist prints nothing.

If the optional address argument was not given, namelist prints every defined function symbol from the symbol table. The symbols are printed one per line, in case-insensitive lexicographic order by name. The four columns are the symbol address, size, binding (T for extern or t for static), and name. The required format is shown below. For both the address and the size, use the printf format %08x, which outputs 8 hex digits padded with leading zeros.
00000064 00000014 T binky
00000000 00000015 t crash_here
00000036 0000002e t dinky
00000078 00000017 T main
00000029 0000000d T pinky
00000015 00000014 t winky
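One way to get this ordering and format is sketched below, using qsort with a strcasecmp comparator and the %08x format. The struct func_sym record and the function names are hypothetical stand-ins for whatever your symbols module returns; the strcmp tie-break is an added assumption that just makes the order deterministic for names differing only in case.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical record for one function symbol. */
struct func_sym {
    unsigned int addr;
    unsigned int size;
    char binding;        /* 'T' for extern, 't' for static */
    const char *name;
};

/* qsort comparator: case-insensitive by name, with a case-sensitive
 * tie-break (an assumption, not a handout requirement). */
static int compare_by_name(const void *a, const void *b)
{
    const struct func_sym *x = a, *y = b;
    int diff = strcasecmp(x->name, y->name);
    return diff != 0 ? diff : strcmp(x->name, y->name);
}

void sort_symbols(struct func_sym *syms, size_t n)
{
    assert(syms != NULL);
    qsort(syms, n, sizeof *syms, compare_by_name);
}

/* Writes one listing line (without a newline) into buf. Returns the
 * value of snprintf, i.e. the length the full line requires. */
int format_symbol(char *buf, size_t buflen, const struct func_sym *s)
{
    return snprintf(buf, buflen, "%08x %08x %c %s",
                    s->addr, s->size, s->binding, s->name);
}
```

Formatting into a buffer rather than printing directly is only a convenience for testing; calling printf with the same format string works equally well.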
Namelist prints only defined function symbols with static or extern binding. Non-function symbols (i.e. symbol's type != STT_FUNC), symbols with other bindings (i.e. symbol's binding is neither STB_GLOBAL nor STB_LOCAL), and undefined symbols (i.e. symbol's section_tag == SHN_UNDEF) are all discarded.
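The filter can be written as a small predicate. This sketch uses the ELF32_ST_TYPE and ELF32_ST_BIND macros from <elf.h> to unpack the symbol's st_info byte, and assumes the handout's section_tag field is st_shndx. The function names are hypothetical.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* True if `sym` is a defined function with global or local binding.
 * Assumption: the handout's "section_tag" is st_shndx in <elf.h>. */
bool is_defined_function(const Elf32_Sym *sym)
{
    assert(sym != NULL);
    unsigned char type = ELF32_ST_TYPE(sym->st_info);
    unsigned char bind = ELF32_ST_BIND(sym->st_info);

    if (type != STT_FUNC)
        return false;                       /* not a function */
    if (bind != STB_GLOBAL && bind != STB_LOCAL)
        return false;                       /* e.g. weak symbols */
    if (sym->st_shndx == SHN_UNDEF)
        return false;                       /* declared but not defined here */
    return true;
}

/* Listing letter for a symbol that passed the filter above:
 * 'T' for extern (global) binding, 't' for static (local) binding. */
char binding_letter(const Elf32_Sym *sym)
{
    return ELF32_ST_BIND(sym->st_info) == STB_GLOBAL ? 'T' : 't';
}
```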
The optional address argument is a hexadecimal address. If given, namelist searches for a matching function symbol for that address. The address matches a symbol if it is within the range spanned by the symbol's start and size. If symbol binky starts at 0xa3 and has size 16 bytes, all addresses from 0xa3 to 0xb2 are within its range. If a match is found, it prints the symbol name and hex offset (distance from the function start to the matched address). If the address is not within any function range, namelist responds that no match was found. The required format for found and not-found is shown below. Your program should exactly match this wording and print both the address and offset using format %#x.
> namelist buggy.o 0x6a
Address 0x6a matches binky+0x6
> namelist buggy.o 0x107
Address 0x107 not found in any symbol range
You'll discover some symbols are "synonyms": two or more symbol names with identical or overlapping ranges. If an address is within the range of more than one symbol, namelist matches it to the one whose name is first lexicographically.
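Here is a minimal sketch of the address lookup. It assumes the symbols are already stored in an array sorted by name (the same order used for the listing), so the first range that contains the address is also the lexicographically first match. The struct func_sym record and match_address are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one function symbol. */
struct func_sym {
    unsigned int addr;
    unsigned int size;
    char binding;
    const char *name;
};

/* Searches `syms` (assumed sorted by name) for the first symbol whose
 * range [addr, addr + size) contains `target`. On success stores the
 * distance from the function start in *offset and returns the symbol;
 * returns NULL if no range contains the address. */
const struct func_sym *match_address(const struct func_sym *syms, size_t n,
                                     unsigned int target, unsigned int *offset)
{
    assert(offset != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Unsigned subtraction wraps around when target < addr, giving a
         * huge value that fails the size comparison, so one test covers
         * both ends of the range. */
        unsigned int delta = target - syms[i].addr;
        if (delta < syms[i].size) {
            *offset = delta;
            return &syms[i];
        }
    }
    return NULL;
}
```

A caller could then report a match with printf("Address %#x matches %s+%#x\n", target, sym->name, offset). A linear scan is the simplest choice; whether something faster is worthwhile is a design decision for your module.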
Once you can reliably produce a list of function symbols and match an address to a symbol, you're ready for the fun job: building on this work to create a crash reporter!
When a program crashes outside the debugger, it leaves few clues to follow up on. If the bug is reproducible, running again under gdb can get you a backtrace. But what about those elusive bugs that aren't repeatable or only occur outside of gdb? A crash reporter can provide critical information about a crash without requiring a debugger.
You are to write the libreporter library that provides a crash reporter tool that can be linked into any program. Once initialized, the crash reporter monitors the executing program and on fatal error, it intercedes to produce a symbolic backtrace of the runtime stack before terminating. How does the crash reporter work?
When an exceptional event occurs (memory access violation, divide by zero, etc.), the kernel sends the process a signal. The default action for fatal signals is to terminate the program. Alternatively, you can register a function as the callback (called a signal handler) to instead process the signal. That callback might attempt error recovery, do cleanup, give information to the user, and so on. Our starting code in reporter.c shows the boilerplate code required to set up a signal handler. If you're curious, you can read more about signals from the man page for sigaction and in section 8.5 of Bryant and O'Hallaron.
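For a feel of what registering a handler involves, here is a generic sigaction sketch. It is not the starter code from reporter.c (which is not reproduced here); the function names are made up, and the handler only records the signal number rather than printing a report.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Written by the handler, read by ordinary code. sig_atomic_t is the
 * type that can be safely accessed from a signal handler. */
static volatile sig_atomic_t last_signal = 0;

/* Three-argument handler form, selected by the SA_SIGINFO flag. The
 * `context` argument carries the saved machine state at the time of
 * the signal; this sketch ignores it. */
static void on_signal(int signum, siginfo_t *info, void *context)
{
    (void)info;
    (void)context;
    last_signal = signum;
}

/* Hypothetical helper: installs on_signal for `signum`.
 * Returns 0 on success, -1 on failure (same as sigaction). */
int install_handler(int signum)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}

int last_signal_seen(void)
{
    return last_signal;
}
```

A harmless way to try it is install_handler(SIGUSR1) followed by raise(SIGUSR1). A handler for a fatal signal such as SIGSEGV should not simply return, since the faulting instruction would then run again.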
The crash reporter will register a signal handler for fatal signals that prints information about the event (signal number/name, faulting instruction) along with a stack backtrace. A sample crash report looks like this:
Program received signal 11 (Segmentation fault)
Faulting instruction at [0x08048f00] crash_here (+0x10)
[0xf778d410] <unknown>
[0x08048f52] dinky (+0x2c)
[0x08048f17] winky (+0x12)
[0x08048f24] pinky (+0xb)
[0x08048f3f] dinky (+0x19)
[0x08048f66] binky (+0x12)
[0x08048f78] main (+0x10)
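The address lines in the report can be produced with the %#010x format for the address and %#x for the offset, the formats this handout requires for the report. The sketch below formats one such line; format_frame is a hypothetical helper, and passing NULL for the name to mean "no matching symbol" is an assumption made for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Formats one backtrace line (without a newline) into buf. Pass
 * name == NULL when the address matched no symbol. Returns the value
 * of snprintf, i.e. the length the full line requires. */
int format_frame(char *buf, size_t buflen, unsigned int addr,
                 const char *name, unsigned int offset)
{
    assert(buf != NULL);
    if (name == NULL)
        return snprintf(buf, buflen, "[%#010x] <unknown>", addr);

    /* %#010x: "0x" prefix, zero-padded to 10 characters in total. */
    return snprintf(buf, buflen, "[%#010x] %s (+%#x)", addr, name, offset);
}
```

For example, format_frame(line, sizeof line, 0x08048f52, "dinky", 0x2c) fills line with [0x08048f52] dinky (+0x2c).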
Here is what is expected from your libreporter:
- A client program must link with libreporter.a and make a call to init_reporter(). A client program typically makes this call somewhere early in main, but it is valid to initiate crash reporting later in execution as well.
- init_reporter harvests the symbol information and stashes it for later use. We make an exception to our standard prohibition against global variables and allow you to store this data globally within the reporter module, but you should confine the global state to the minimum needed and be sure to mark it static so it is not broadcast into the global namespace.
- The path "/proc/self/exe" refers to the executable file of the currently executing program. You can open this path as an ELF file to get the symbol data for the currently executing program.
- If the executable has no symbol table, init_reporter does nothing. Without symbol information, it will not be able to produce a symbolic backtrace and it does not register for any signal-handling. In this way, a client can selectively disable all crash reporting for an executable by stripping its symbol table, instead of requiring a recompile. If init_reporter is called from a stripped executable, it behaves as a no-op: no error message is printed, no signal-handling is installed, and the program runs without crash detection.
- The signal handler reads the saved value of the %eip register (see starter code). This will be the address of the faulting instruction.
- Print each address using the %#010x format and print the offset using %#x.
- If an address does not match any function symbol, print its name as <unknown>. Symbol names will not be present for dynamically-linked libraries or selectively stripped symbols.
- The backtrace ends at main. There can be other frames behind main in the stack, but stop here; do not attempt to trudge through those frames or print them. You may assume that the symbol table will always contain a symbol named main.
- After printing the report, the handler terminates the program by calling exit, which halts and tears down the entire address space. You may also ignore cleanup of the crash reporter globals even if the client program runs to completion without error.

The crash reporter can be misled when the stack itself has been corrupted, such as when a buffer overflow has stomped on the return address and base pointer. An invalid return address will cause the reporter to misidentify the function for a particular stack frame, which is not a big deal, but dereferencing an invalid saved base pointer can turn the backtrace into total garbage or, more likely, crash the reporter during its stack crawl. The reporter should make an attempt to detect stack trouble by including a heuristic that stops the backtrace if it reaches a saved base pointer that appears whacked. A simple heuristic for identifying trouble is to reject a base pointer that points outside the bounds of the current extent of the stack. Here is an example crash report that shows the required error message to print when halting the backtrace.
Program received signal 11 (Segmentation fault)
Faulting instruction at [0x0804923a] crash_here (+0xb)
[0xf7731410] <unknown>
[0x08049267] dinky (+0x19)
Halting backtrace: cannot follow saved base pointer [0x30000000] not within stack segment
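The stack crawl with its bounds check can be sketched in a machine-independent way by modeling each frame as a saved base pointer followed by a return address, which is the conventional layout when the base pointer is saved in the function prologue. Everything here is an assumption made for illustration: the struct frame type, the crawl_frames name, passing the stack bounds in as parameters, and ending the chain at a NULL base pointer (the real reporter stops at main instead).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A conventional frame as seen from the base pointer: word 0 holds the
 * caller's saved base pointer, word 1 holds the return address. */
struct frame {
    struct frame *saved_bp;
    uintptr_t return_addr;
};

/* True if a whole frame starting at p lies within [lo, hi). */
static int within_stack(const void *p, const void *lo, const void *hi)
{
    uintptr_t v = (uintptr_t)p, l = (uintptr_t)lo, h = (uintptr_t)hi;
    return h - l >= sizeof(struct frame) &&
           v >= l && v <= h - sizeof(struct frame);
}

/* Follows the chain of saved base pointers starting at bp and stores up
 * to `max` return addresses in out. Returns how many were stored. If a
 * base pointer outside the stack bounds is reached, the crawl stops
 * WITHOUT dereferencing it and *bad_bp is set to that pointer;
 * otherwise *bad_bp is NULL. */
size_t crawl_frames(const struct frame *bp,
                    const void *stack_lo, const void *stack_hi,
                    uintptr_t *out, size_t max, const struct frame **bad_bp)
{
    assert(out != NULL && bad_bp != NULL);
    *bad_bp = NULL;
    size_t n = 0;
    while (bp != NULL && n < max) {
        if (!within_stack(bp, stack_lo, stack_hi)) {
            *bad_bp = bp;          /* looks corrupted: halt the backtrace */
            break;
        }
        out[n++] = bp->return_addr;
        bp = bp->saved_bp;         /* step to the caller's frame */
    }
    return n;
}
```

When *bad_bp comes back non-NULL, the reporter would print the "Halting backtrace" message with that pointer value. How to obtain the starting base pointer and the actual stack bounds is deliberately left out of this sketch.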
There are various other ill-behaved programs (one with infinite recursion that attempts to grow the stack beyond the maximum size, one that stomps global data and destroys your saved function symbols, etc.) that cause problems that are not easily detected or handled. You are not expected to devise handling for these, nor will we test on them.
Some functions are compiled with -fomit-frame-pointer, which means they may not include the standard function prologue/epilogue to save and restore %ebp using the stack. These functions are called frameless. If a frameless function appears in the backtrace, your crash reporter is likely to be misled into skipping over this call or halting the backtrace due to this non-conforming use of the %ebp register. You do not need to take precautions or make a special case of this; just unwind the backtrace assuming all functions store a base pointer on the stack in the expected location. We will not test crash reporter on backtraces containing frameless functions.

The starter project files include:
- Makefile: The Makefile has three targets that are built by default: namelist, libreporter.a, and buggy, a sample client program that links with the libreporter library. The Makefile makes it easy to extend with additional client programs. Custom sanity check can be used to compare your output against the solution's. Very handy!
- symbols.c: This module extracts the function symbols from an ELF file and provides functions to manipulate that symbol data. This module is part of both the namelist and libreporter targets. You are to design an appropriate interface in symbols.h and implement it in symbols.c.
- namelist.c: This program prints and looks up function symbols. This file will contain only a trivial amount of code.
- reporter.c: This module is where you write the code for the signal handling and stack crawling.
- buggy.c: This sample program links with libreporter and deliberately causes a fatal error. You can use/change/cannibalize this program to test your reporter.

Output format. Please pay careful attention to the output requirements for both namelist and the crash report, especially the required wording and printf formats. Use sanity check to verify that your output is conforming.
Background information on how we grade assignments.
Functionality (100 points)
buggy program, as well as more complex cases (deeper stacks, different signals, unknown addresses).
Code quality (buckets weighted to contribute ~20 points)
Here we will read and evaluate your code in areas such as:
Check out a copy of the starter project from your cs107 repository using the command
hg clone /afs/ir/class/cs107/repos/assign6/$USER assign6
The assign6 samples directory is linked in your repo as slink and includes a sample namelist program and solution version of the crash reporter library.
There is a sanity check that verifies output conformance of both namelist and crash reporter. Be sure that all debugging print statements are removed/disabled as extraneous output can interfere with the autotester. When finished, submit your code for grading. If needed, be sure to familiarize yourself with our late policy.