Assignment 7: Heap allocator

Due date: Tue Mar 11 11:59 pm - Hard deadline: Thu Mar 13 11:59 pm

Assignment by Julie Zelenski, based on original assignment given by Randal Bryant & David O'Hallaron (CMU)

[Quick Links: Implementation details, Advice page, Grading]

Learning goals

In completing this assignment, you will:

  1. study the internals of the heap allocator
  2. experiment with the process of evaluating and choosing among alternative designs
  3. apply optimization techniques to find and mitigate runtime/memory inefficiencies
  4. explore performance trade-offs and weigh improvements against the complexity of code required to achieve them
  5. bring together all of your CS107 skills and knowledge into a satisfying capstone experience!

Your assignment

You are to write an allocator that manages the heap and implements the dynamic allocation functions malloc, realloc, and free. The allocator grabs a large memory segment using the low-level OS memory routines and parcels out this segment to service dynamic memory requests. Your code will manage the heap segment, track the available/in-use space, and update your heap data structures in response to requests. The key goals for the allocator are correctness, tight memory utilization, and high throughput.
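To make the shape of the task concrete, here is a minimal sketch of the kind of interface such an allocator exposes. The names and signatures below are illustrative assumptions, not the starter project's actual prototypes; the starter's header file is authoritative.

/* Hypothetical allocator interface -- names/signatures are assumptions
   for illustration; see the starter project's header for the real ones. */
#include <stdbool.h>
#include <stddef.h>

bool myinit(void *segment_start, size_t segment_size);  /* prepare heap bookkeeping */
void *mymalloc(size_t requested_size);                  /* service an allocation request */
void *myrealloc(void *old_ptr, size_t new_size);        /* resize a previous allocation */
void myfree(void *ptr);                                 /* return a block to the free pool */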

The starter code contains a sample allocator. This trivial allocator is largely correct but it is heinously inefficient. Your goal is to replace this allocator with one that uses intelligent data structures and efficient algorithms. The challenge is to strike a balance between tight memory utilization and high throughput, while maintaining correctness in all situations.
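As one example of what "intelligent data structures" can look like, the classic design from the Bryant & O'Hallaron textbook prefixes every block with a small header recording the block's size and whether it is in use. The sketch below illustrates that idea under the assumption of 8-byte-aligned block sizes; it is a starting point, not the required design.

/* Sketch of a classic header-per-block scheme (illustrative only).
   If block sizes are multiples of 8, the low 3 bits of the size
   field are always zero and can be reused as status flags. */
#include <stdbool.h>
#include <stddef.h>

#define IN_USE 0x1

typedef struct {
    size_t size_and_flags;  /* block size with status packed into low bits */
} block_header;

static size_t block_size(const block_header *hdr) {
    return hdr->size_and_flags & ~(size_t)0x7;  /* mask off flag bits */
}

static bool block_in_use(const block_header *hdr) {
    return hdr->size_and_flags & IN_USE;
}

Walking the heap header-to-header gives an implicit list of blocks; threading explicit next/prev pointers through the payloads of free blocks is a common refinement for throughput.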

Just three little functions? How hard can that be? :-) It is possible to write an efficient allocator with a couple hundred lines of code. However, it will require your most skilled and sophisticated coding and you may end up more proud of those three functions than any code you've written heretofore. What a great way to finish off an awesome quarter!

For this assignment, you have the option of working in a team of two or working individually. The project requirements are the same either way. Working in a team has advantages (two heads are better than one) and disadvantages (scheduling, coordinating, arguing). Choosing to partner won't cut your work in half, but the collaboration may contribute to a higher-quality result and better mastery of the material. A team makes one submission and both partners receive the same score. Read our advice for successful partnership.

Implementation details

Here are the specific requirements for the allocator:

Using the alloctest program and script files

You will need to do thorough testing of your allocator. An ordinary C program that makes targeted calls to the allocator functions is one means of testing; the simple.c program in the starter project can be used in this way. The starter project also includes alloctest, a script-based test program. A script file is a text file containing a sequence of allocator requests. The alloctest program reads a script file, executes its requests, attempts to verify they were properly serviced, and measures allocator performance. A sample report on one script is shown below.

alloctest -f tiny1.script
Evaluating allocator on tiny1....done.

 script name     correct?    utilization   requests       secs        Kreq/sec
-------------------------------------------------------------------------------
tiny1                Y            17%           12       0.000018        665
-------------------------------------------------------------------------------

Aggregate 1 of 1                  17%           12       0.000018        665
17% (utilization) 6% (throughput, expressed relative to performance target)
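The exact script grammar is documented in the starter project. As a purely hypothetical illustration, a script pairs each request with an id so that later lines can refer back to earlier allocations; a tiny script might look something like:

a 0 24        (allocate 24 bytes, call the result block 0)
a 1 100       (allocate 100 bytes, call the result block 1)
r 1 200       (resize block 1 to 200 bytes)
f 0           (free block 0)
f 1           (free block 1)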

Below is an overview of how alloctest operates. You can review the code in alloctest.c for further details.
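As a rough sketch of that flow (an assumption about the general structure, not alloctest's actual types or code), the heart of such a harness is a dispatch loop over the parsed requests, built on the allocator entry points sketched earlier:

/* Hypothetical dispatch loop for a script-driven harness (illustrative;
   alloctest.c is the authoritative reference). Assumes the mymalloc/
   myrealloc/myfree prototypes sketched above. */
#include <stddef.h>

typedef enum { OP_ALLOC, OP_REALLOC, OP_FREE } op_t;
typedef struct { op_t op; int id; size_t size; } request_t;

void execute_requests(const request_t *reqs, int nreqs, void *blocks[]) {
    for (int i = 0; i < nreqs; i++) {
        const request_t *r = &reqs[i];
        switch (r->op) {
            case OP_ALLOC:   blocks[r->id] = mymalloc(r->size); break;
            case OP_REALLOC: blocks[r->id] = myrealloc(blocks[r->id], r->size); break;
            case OP_FREE:    myfree(blocks[r->id]); blocks[r->id] = NULL; break;
        }
        /* A real harness would also verify each result here -- alignment,
           containment in the heap segment, no overlap with live blocks,
           payload integrity -- while timing the run for throughput. */
    }
}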

Grading

Read the background information on how we grade assignments. This assignment focuses on efficiency, and the majority of the points will be awarded on the basis of performance trials.

Correctness (48 points)

Performance (80 points)

We will measure the utilization and throughput of your allocator on a set of mixed scripts that are similar, but not identical, to the published samples. These two measurements will be converted to scores according to this scale:

The values U and T will each be a number in the range [0, 50]. Performance is scored using the formula points = 1.5*min(U,T) + 0.5*max(U,T). The performance points are thus not awarded on the straight sum U + T, but on a blend that rewards balanced optimization: the idea is not to go to extremes optimizing one measure at the expense of the other, but to moderate between the two. 80 points is the full-credit benchmark, and you can earn up to 20 additional bonus points. Submissions with bugs that interfere with performance testing will earn points at a reduced rate, based on the performance formula applied to those scripts that do operate correctly, with a penalty adjustment for the failures.
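For example, a balanced result of U = 40 and T = 40 earns 1.5*40 + 0.5*40 = 80 points, the full-credit benchmark, while a lopsided U = 50 and T = 30 earns only 1.5*30 + 0.5*50 = 70, even though both have the same straight sum of 80.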

Utilization is generally stable on the same inputs, but throughput varies with CPU/cache contention, so it is normal to see small ups and downs from run to run, and even larger discrepancies when the system is under heavy load (use top or uptime to view activity/load). During grading, we will evaluate performance on an isolated, quiescent host to reduce such artifacts. All myths have identical software, but some have skimpier hardware (e.g. a 4M L2 cache versus 6M, or a lower clock rate), which can affect your results. You can view a host's specs in the file /proc/cpuinfo if you're curious. We will measure performance for all submissions on the same myth under the same conditions to ensure consistency in grading.

Your allocator's performance on the published samples is predictive of the measurement we will make for grading, but not an exact match. The goal is not to tune your allocator to perform at its best on exactly and only the samples, but to develop a design that fares well in a wide variety of scenarios, of which the samples are representative possibilities. The scripts for performance grading will be selected from the samples, along with a few new scripts of comparable scope and scale. If your allocator performs consistently across most or all of the sample scripts, then taking a few out and replacing them with others should produce only a minor change in overall performance; but if you have wider variability from script to script, your performance may swing more significantly when we mix things up in grading. You can estimate your worst-case outcome by calculating how your allocator would fare on a mix constructed from the published samples with your top few performers removed and replaced by copies of your weakest.
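As a concrete (hypothetical) illustration: if you run on a dozen sample scripts, recompute your aggregate utilization and throughput after dropping your two strongest scripts and counting your two weakest twice; the result approximates a pessimistic grading mix.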

Code quality (buckets weighted to contribute ~20 points)

Design documentation (32 points)

A number of points are allocated for the readme file, in which you will document your allocator design along with your process and results. You are to address the issues below with clear, complete, and concise writing. The intended audience is a programming-literate peer.

We will evaluate your design document on its thoughtfulness and completeness in describing, analyzing, and evaluating your design and process. Readme credit is not tied to successful performance results. If your results are not what you'd hoped for, use the readme to tell us about it, i.e. the observed problems, what their root causes seem to be, what you tried that didn't pan out, what you learned in the process, and so on.

Getting started

Check out a copy of the starter project from your cs107 repository using the command

hg clone /afs/ir/class/cs107/repos/assign7/$USER assign7

The starter repo contains these files:

Special attention to the Honor Code

Before you begin coding, you'll want to consider a range of design options. The textbook contains good foundation material, which you are encouraged to review as a starting point. You may also find it helpful to talk over implementation options and trade-offs with classmates. Your readme must properly cite any external resources or discussions that influenced your design. When it comes to code-level specifics, your allocator must be your independent work. Under no circumstances should you be studying code from outside sources nor incorporating such code. If your investigations accidentally stumble across allocator code, we expect you to immediately about-face and get back to designing and writing your own allocator. We have a zero-tolerance policy for submissions containing code that has been adopted/borrowed from others. Lacking citation, this act constitutes plagiarism and will be handled as a violation of the Honor Code.

Finishing

When finished, don't forget to submit your files for grading. Only one partner makes the submission, not both. No late submissions will be accepted after the hard deadline (Thu Mar 13). Late days count individually per-partner.

It's definitely time to celebrate: congratulations on an awesome quarter!