Due date: Wed Mar 11 11:59 pm - Hard deadline: Fri Mar 13 11:59 pm
Assignment by Julie Zelenski, based on original assignment given by Randal Bryant & David O'Hallaron (CMU)
[Quick Links: Implementation details, Advice page, Grading]
In completing this assignment, you will:
You are to write an allocator that manages the heap and implements the dynamic allocation functions malloc
, realloc
, and free
. The allocator grabs a large memory segment using the low-level OS memory routines and parcels out this segment to service dynamic memory requests. Your code will manage the heap segment, track the available/in-use space, and update your heap data structures in response to requests. The key goals for the allocator include:
The starter code contains a sample allocator. This trivial allocator is largely correct but it is heinously inefficient. Your goal is to replace this allocator with one that uses intelligent data structures and efficient algorithms. The challenge is to strike a balance between tight memory utilization and high throughput, while maintaining correctness in all situations.
Just three little functions? How hard can that be? :-) It is possible to write an efficient allocator with a couple hundred lines of code. However, it will require your most skilled and sophisticated coding and you may end up more proud of those three functions than any code you've written heretofore. What a great way to finish off an awesome quarter!
For this assignment, you have the option of working in a team of two or working individually. The project requirements are the same either way. Working in a team has advantages (two heads are better than one) and disadvantages (scheduling, coordinating, arguing). Choosing to partner won't cut your work in half, but the collaboration may contribute to a higher-quality result and better mastery of the material. A team makes one submission and both partners receive the same score. Read our advice for successful partnership.
Here are the specific requirements for the allocator:
malloc
man pages to learn what constitutes a well-formed request and the expected allocator behavior. You may ignore the esoteric details from the NOTES section of the man page.mymalloc
, myrealloc
, and myfree
to avoid clashes with the libc version. You will also implement the myinit
function that configures the initial state and the validate_heap
debugging function that is used to verify internal heap consistency.ALIGNMENT
constant (8). A payload address could be 0xa5b8 or 0xa5b0, but not 0xa5b4 or 0xa5b1. The alignment requirement applies to the address of the payload; whatever headers/footers you have placed around it are your own business.segment.h
in the starter code for more information.segment.h
. By "memory-management" we specifically mean those operations that allocate or deallocate memory, so absolutely no calls to malloc, realloc, free, calloc, sbrk, brk, mmap, or related variants. The use of other library functions is fine, e.g. memmove
and memset
can be used as they do not allocate or deallocate memory.validate_heap
function. This function is called from our test harness between requests to verify the internal consistency of the heap data structures. This debugging hook is intended to help surface problems in your heap sooner rather than later. Note that validate_heap will not be called during any performance testing, so there is no concern about its efficiency (or lack thereof).You will need to do thorough testing of your allocator. An ordinary C program that makes targeted calls to the allocator functions is one means to test. The simple.c
program in the starter project can be used in this way. The starter project also includes alloctest
, a script-based test program. A script file is a text file containing a sequence of allocator requests. The alloctest program reads a script file, executes its requests, attempts to verify they were properly serviced, and measures allocator performance. A sample report on one script is shown below.
alloctest -f tiny1.script
Evaluating allocator on tiny1....done.
script name correct? utilization requests secs Kreq/sec
-------------------------------------------------------------------------------
tiny1 Y 17% 12 0.000018 665
-------------------------------------------------------------------------------
Aggregate 1 of 1 17% 12 0.000018 665
17% (utilization) 6% (throughput, expressed relative to performance target)
Below is an overview of how alloctest operates. You can review the code in alloctest.c
for further details.
alloctest
will no arguments, it will run all the scripts in the directory /afs/ir/class/cs107/samples/assign7
. You can use the -f argument to change which scripts are being run. The -f argument is the path to a single script file or a directory of script files.alloctest
program parses the script file and make calls to mymalloc and so on as requested in the script./afs/ir/class/cs107/samples/assign7
contains a collection of varied scripts. These scripts are best suited for testing the performance of an already debugged implementation as they are very large (thousands of requests). The ones named x-trace were constructed by tracing the allocation calls made by a running program, e.g. reassemble-trace traces our assignment 1. (For the curious: see man page for ltrace
on how to snoop on a program's library calls.) These represent "real world" testing patterns. The scripts named x-pattern were mechanically generated via patterns such as alternately allocating a small and large chunk.alloctest
to selectively verify correctness (-c) or measure performance (-p). The default (no flags) is to do both.validate_heap
between each request. If any of these checks turn up trouble, it prints an error message to draw attention to the problem.alloctest
program is that it calls the allocator's myinit
function after executing one script and before starting another. Take care to properly implement the myinit
function to wipe the slate clean, removing any previous heap contents and starting fresh, lest you encounter bugs due to ghost data left behind in the heap from the previous script.Background information on how we grade assignments. This assignment focuses on efficiency and the majority of the points will be awarded on the basis of performance trials.
Correctness (48 points)
Performance (80 points)
We will measure the utilization and throughput of your allocator on a set of mixed scripts that are similar, but not identical, to the published samples. The those two measurements will be converted according to this scale:
U = utilization - 25
. The full-credit benchmark U = 40
is reached at utilization >= 65%, up to 10 bonus is added for exceeding it.T = throughput / 2
. The full-credit benchmark T = 40
is reached at throughput >= 80%, up to 10 bonus is added for exceeding it.The value for U
and T
will be a number in the range [0, 50]. Performance is scored using the formula points = 1.5*min(U,T) + .5*max(U,T)
. The performance points are not awarded on the straight sum U + T
, but instead a blend that rewards a balanced optimization. The idea is to not go to extremes to optimize one at the expense of the other, but moderate between the two. 80 points is the full-credit benchmark, and you can earn up to 20 additional bonus points. Submissions with bugs that interfere with performance testing will earn points at a reduced rate, based on the performance formula applied to the limited scripts which operate correctly and a penalty adjustment for the failures.
Utilization is generally stable on the same inputs, but throughput varies with CPU/cache contention, so it is normal to see small ups and downs from run-to-run and even larger discrepancies when the system is under heavy load (use top
or uptime
to view activity/load). During grading, we will evaluate performance using an isolated quiescent host to reduce artifacts. All myths
have identical software, but some have skimpier hardware (e.g. 4M L2 cache versus 6M, lower clock) which can tweak your results. You can view a host's specs in the file /proc/cpuinfo
if you're curious. We will measure performance for all submissions on the same myth under the same conditions to ensure consistency in grading.
Your allocator's performance on the published samples is predictive of the measurement we will make for grading, but not an exact match. The goal is not to tune your allocator to perform at its best on exactly and only the samples, but instead develop a design that fares well in a wide variety of scenarios, for which the samples are representative possibilities. The scripts for performance grading will be selected from the samples and include a few new scripts of comparable scope/scale. If your allocator has consistent performance across most/all sample scripts, then taking a few out and replacing with others should result in only minor change in overall performance, but if you have wider variability script-to-script, then your allocator performance may experience a more significant swing when we mix things up in grading. You can estimate your worst-case outcome by calculating how your allocator would fare on a mix constructed from the published samples having removed your top few best performers and replacing with copies of your weakest performers.
Code quality (buckets weighted to contribute ~20 points)
validate_heap
routine.Design documentation (32 points)
A number of points are allocated for the readme file where you will document your allocator design, along with your process and results. You are to address the issues below with clear, complete, and concise writing. The intended audience is a programmer-literate peer.
We will evaluate your design document on its thoughtfulness and completeness in describing, analyzing, and evaluating your design and process. Readme credit is not tied to successful performance results. If your results are not what you'd hoped for, use the readme to tell us about it, i.e. the observed problems, what its root cause seems to be, what you tried that didn't pan out, what you learned in the process, and so on.
Check out a copy of the starter project from your cs107 repository using the command
hg clone /afs/ir/class/cs107/repos/assign7/$USER assign7
The starter repo contains these files:
simple
and alloctest
. You may edit the definition of ALLOCATOR_EXTRA_CFLAGS
in the Makefile to configure the best build settings for your allocator, but should make no other changes to the Makefile.allocator.c
module is where you will write your allocator implementation.segment.c
module provides the low-level memory allocator. You will use these functions in writing your allocator. Read the header file to see what is available. Do not edit these files.fcyc.c
module is the cycle-counting timer from Chapter 5 of the Bryant and O'Hallaron. It is used by alloctest to count cycles used by your allocator. Do not edit these files.simple.c
module is a sample program that manipulates linked lists and strings using dynamic allocation. You can use/change/cannibalize this program for testing.alloctest.c
module contains the script-driven test harness to evaluate your allocator. Do not edit this file.Before you begin coding, you'll want to consider a range of design options. The textbook contains good foundation material which you are encouraged to review as a starting point. You may also find it helpful to talk over implementation options and trade-offs with classmates. Your readme must properly cite any external resources or discussions that influenced your design. When it comes to code-level specifics, your allocator must be your independent work. Under no circumstances should you be studying code from outside sources nor incorporating such code. If your investigations accidentally stumble across allocator code, we expect you to immediately about-face from it and get back to designing and writing your own allocator. We have a zero-tolerance policy for submissions containing code that has been adopted/borrowed from others. Lacking citation, this act constitutes plagiarism and will handled as a violation of the Honor Code.
When finished, don't forget to submit your files for grading. Only one partner makes the submission, not both. No late submissions will be accepted after the hard deadline (Sunday Dec 7th) Late days count individually per-partner.
It's definitely time to celebrate-- congratulations on an awesome quarter!