Assignment 7 advice

Written by Julie Zelenski

Choosing your design

Unlike our previous assignments which have dictated narrow design requirements, the allocator design is very open-ended. You will choose your own data structures, strategies, and implementation without our direction or interference. It is a wonderful opportunity (but a little frightening, too). The design process can be fun and creative but it is best to start early to have enough time to think through the options, weigh trade-offs, make decisions, experiment, and iterate until your design solidifies into a success.

Here are just a few of the decisions you will have to consider:

Section 9.9 of Bryant and O'Hallaron contains helpful reading on heap implementation strategies and presents partial code for an allocator case study. This allocator uses boundary tags and an implicit free list and demonstrates some worthwhile motifs such as abstracting the pointer arithmetic/typecasts (although we strongly recommend using inline functions instead of macros) and strategically installing a few extra blocks to avoid making special cases. However, this design is not a particularly great performer, the code is missing some key pieces, and I know of at least one bug. Consider it code you can learn from, but be wary of adopting from it without careful scrutiny of what you're getting. We expect you are itching to put your hard-earned CS107 superpowers to use creating your own version from scratch than follow along the text anyway! If you do need to borrow from it to get a leg up, we have a specific carve-out in our Honor Code policy that allows you to incorporate code from the textbook (with appropriate citation). Use of all other code is entirely off-limits -- you should not be reviewing or adopting any code for your allocator from any other resource.

Design choices and performance trade-offs

Our performance test will measure utilization and throughput on a set of varied script files. These grading scripts will be similar in spirit to the published samples, i.e., a mix of real-world use cases and generated patterns, but will not be identical. Your allocator should be designed to perform well in a variety of situations, rather than be tailored to fit exactly one set of known inputs.

The sample scripts reflect a range of scenarios for how the allocator might be used. Studying the relative performance provides guidance about in which situations your allocator operates efficiently and those it does not, allowing you to identify opportunities for optimization. For example, if you have bottlenecks in realloc, you will see it reflected in poorer performance on those scripts that make heavy use of realloc as opposed to the other scripts that have little or no use. Sometimes improving the results for one particular script comes at the expense of degrading others, so you'll have to weigh the results on balance. The per-script results tell you where you have strengths/weaknesses, but your allocator will be evaluated on aggregate performance, not any one script in isolation.

Some suggestions to help get you thinking about designing for performance:

Optimizing for performance

No amount of optimization can compensate for a design/algorithm that is irredeemably slow, so first shore up weaknesses in your design before putting on your optimization hat. Once you have a solid design that is correctly implemented, you can then work to reduce the constant factors by applying code optimization techniques. The first step in any optimization effort is measuring. The performance summary given by the test harness gives you a high-level view of your performance of your design as a whole, but you'll need more specific information to make effective optimizations in the code itself. You can run the alloctest program itself under the Valgrind callgrind tool to profile the code and obtain instruction counts and cache hit/miss counts to identify the heavily-traveled code passages or those that are particularly cache-unfriendly. This tells you where to focus your attention for maximum gain. Once you have identified critical passage(s), you apply optimization techniques to streamline code and reduce constant factors. Even just a slight decrease in the instruction count or cache miss rate for a high-traffic area can pay off handsomely in absolute results.

Some notes on optimization:

Script files for alloctest

The alloctest harness runs on script files. A script file contains a sequence of allocator requests in a simple text-based format. The three request types are a (allocation), r(reallocation), and f (free). Each allocation result is assigned an id-number that can be referred to in a subsequent realloc or free.

a id-number size
r id-number size
f id-number

A script file containing:

a 0 24
a 1 100
f 0
r 1 300
f 1

is translated by alloctest into these calls to your allocator:

void *ptr0 = mymalloc(24);
void *ptr1 = mymalloc(100);
free(ptr0);
ptr1 = myrealloc(ptr1, 300);
free(ptr1);

You can open any script file in your text editor. A script can begin with a comment that summarizes the scenario presented by this script. You can also peruse the entire sequence to see the pattern of requests in more detail. You can make your own simple scripts using a text editor and create larger, complex scripts programmatically.

Testing and debugging

Logistics for success

Our thoughts on how to make your last assignment a productive and joyful experience:

Code quality

Program facts

Some facts that might help you gauge what to expect.