Lab 4: Malloc Lab

Aaron Bauer

May 9, 2022

Lab 4: Writing a Dynamic Storage Allocator1

Introduction

In this lab you will be writing a dynamic storage allocator for C programs, i.e., your own version of the malloc, free, and (for extra credit) realloc routines. You are encouraged to explore the design space creatively and implement an allocator that is correct, efficient and fast.

The goals of this lab are

Logistics

You can do this assignment individually or with a partner. You must fill out the partner survey regardless of whether you will be working with a partner. The survey gives you the option of requesting to be matched with a partner. Being matched is not guaranteed, as an odd number of people may request a partner. Late days may be used on this lab only if both partners have them remaining.

Hand Out Instructions

You can download the files you need for this lab with
    wget http://cs.carleton.edu/faculty/awb/cs208/s22/handouts/lab4-handout.tar
Then run the command
    tar xvf lab4-handout.tar

This will create a directory called lab4-handout that contains a number of files. The only file you will be modifying and handing in is mm.c. The mdriver.c program is a driver program that allows you to evaluate the performance of your solution. Use the command make to generate the driver code and run it with the command ./mdriver -V. (The -V flag displays helpful summary information.) The command make test will provide nicely formatted information about your overall lab grade. Detailed usage information for the mdriver testing executable is described later.

The extracted directory contains the following:

When you have completed the lab, you will hand in only one file (mm.c), which contains your solution. The allocator you build might work on similar platforms (no guarantees!), but it must work on Linux for grading. Compile the allocator and test driver with make to produce an executable called mdriver.

Advice

This assignment requires careful bit-level manipulation of memory using many of C’s unsafe pointer casting features without much structure. Your allocator code will not be enormous, but it will be subtle. Take planning seriously and work methodically. Ignoring this advice could cost hours or days. Also keep in mind that performance is a component of the grading for this assignment.

We recommend this path for preparation and development of a successful allocator:

Tasks

You will modify mm.c to implement a memory allocator with the interface declared in mm.h and the functionality described below.

    int   mm_init(void);
    void *mm_malloc(size_t size);
    void  mm_free(void *ptr);

The provided mm.c file partially implements an allocator based on an implicit free list. It provides skeletons of several helper functions beyond the interface above and an implementation of mm_malloc that uses these helper functions. Your job is to complete the helper functions (and possibly add more of your own) as well as mm_free to implement a memory allocator with good balance of memory utilization and throughput. You may use any implementation strategy that works (subject to programming rules), but performance of your memory allocator is a large component of the grade. There are many opportunities for extensions, whether in increasingly sophisticated allocator implementations for better performance or in analysis and error-checking tools.

Grading considers design, documentation, style, correctness, and performance.

The remainder of this document includes:

Preparatory Exercises

As you read this document, complete these exercises to familiarize yourself with heap layout, block layout, and the provided starter code. These exercises will help you design your allocator implementation before diving into messy implementation details. Good planning is key when developing low-level systems software. Feel free to discuss these exercises with your classmates. You can find solutions to these exercises on Moodle.

These exercises will help you plan your allocator implementation and avoid wasting time writing C code before you understand what you are doing. Tips:

A. Heap and Block Layout

Read mm.c to learn about the block layout. Assume words are 8 bytes. Pay special attention to the early comments about block and heap layout, as well as the code in mm_init.

For the following questions, draw memory as an array of words (but not to scale when using big numbers). Use the block layout rules from mm.c and especially the heap setup code in mm_init. Follow the style of heap drawings used in the CSAPP book (Figures 9.36-9.38), but remember that block details will vary. Always draw the heap header word, the heap footer word, and the headers (and footers as needed) for all blocks in the heap.

B. Simulate the Starter Allocator

Simulate the starter allocator code by hand starting from an heap generated by a call to mm_init. Update a drawing of the heap as you go, following the drawing guidelines from part A. Show exactly what this starter allocator does, not your idea of what an allocator should do. This will force you to read and understand the provided code carefully in detail. Add comments as you go if it helps you to keep notes in addition to the provided comments.

Starting from a fresh initial heap, simulate the following requests and update your drawing based on what the starter code allocator does:

  p0 = mm_malloc(12);
  p1 = mm_malloc(16);
  p2 = mm_malloc(16);
  mm_free(p0);
  mm_free(p1);
  p3 = mm_malloc(24);

Starting from fresh initial heaps, simulate the traces short1-bal.rep and short2-bal.rep from the traces directory in the starter code. The format of trace files is described here.

C. Starter Functions

D. Add and Simulate Pseudocode Features

For each of the following features, add the feature by sketching pseudocode comments within mm.c. Then, before working on the next feature, simulate the allocator with this feature on the sample traces from the previous part, drawing heap state as you go.

Write pseudocode at a sufficient level of detail that you can simulate clearly, but do not get tangled up in C-level pointer work. Well-written comments at this stage can remain as documentation when you move on to implementation in C.

Specification

Function Specification

The three main memory management functions should work as follows:

These semantics match the the semantics of the corresponding C library malloc and free routines. Run man malloc in the terminal for complete documentation.

Support Routines

The memlib.c package simulates the memory system for your dynamic memory allocator. You can invoke the following functions in memlib.c:

Programming Rules}

Implementation

You may take any approach to implementing the specified allocator behavior above. Your implementation will be evaluated based on correctness, performance, documentation, and style. This section suggests an incremental implementation strategy that allows you to choose what balance of implementation sophistication and performance you prefer.

Start with an implicit free list allocator implementation. If you wish to improve the performance of your allocator once you have implemented and tested an implicit free list approach, consider an explicit free list or other search, coalescing, and splitting strategies as next steps.

Implicit Free List

Convert your pseudocode feature implementations to C code one at a time, testing each before starting C code for the next feature.

Run tests on individual traces for [debugging] and on all traces for general testing and performance evaluation.

Explicit Free List

Memory Order vs List Order

Explicit free-list allocators must distinguish memory order from list order. Memory order refers to the order of blocks as arranged in the memory address space. We use the terms adjacent, preceding, predecessor, following, and successor to refer to memory order. List order refers to the order of blocks in the explicit free list. We use the terms next and previous to refer to list order. Confusing these orders will lead to tricky bugs.

The best way to develop an explicit free list allocator is to first develop a working implicit free list allocator and extend it to use explicit free lists. Save a copy of your implicit free list allocator before you start working on improving it, so you have a working allocator as a reference (and a fallback).

Some suggestions as you implement explicit free lists:

Alternative Policies and Beyond

A clean and efficient working explicit free list allocator should achieve a respectable performance index. To achieve even better performance, you could consider implementing deferred coalescing, the boundary tag optimization described in the textbook, alternative ways of splitting, or alternative search strategies. Segregated free lists (CSPP section 9.9.14) or approaches using self-balancing trees can very extremely efficient, but correspondingly tricky to get right. If you embark on this path, weigh difficulty of implementation against likely effect on performance index.

Compiling and Testing

The mdriver program tests your mm.c implementation for correctness, space utilization, and throughput. Build mdriver with make and run it with the command ./mdriver -V. (The -V flag displays helpful summary information as described below).

mdriver uses trace files to simulate memory management workloads using your mm.c implementation. A trace is a sequence of allocate and free events. mdriver simulates a trace by calling mm_malloc and mm_free for each corresponding event in the trace in order.

The mdriver executable accepts the following command line arguments:

Common Testing Tasks

Trace Format

When learning about the starter code, you will simulate small traces. When testing and debugging, you may find it useful to write and test your own small traces.

Traces used by mdriver summarize the execution of a program as a sequence of mm_malloc and mm_free calls in a simple format.

A trace file contains 4 header lines:

Remaining lines after the header give a sequence of memory management events, one per line. Each event is either an allocate event or a free event:

The following example C code would generate the corresponding trace below it.

C code:

p0 = malloc(12);
p1 = malloc(16);
p2 = malloc(16);
free(p0);
free(p1);
p3 = malloc(24);
A corresponding trace:
128
4
6
1
a 0 12
a 1 16
a 2 16
f 0
f 1
a 3 24

Heap Consistency Checker

Dynamic memory allocators are notoriously tricky beasts to program correctly and efficiently. They are difficult to program correctly because they involve a lot of untyped pointer manipulation.
You will find it very helpful to use the provided heap checker function check_heap that scans the heap and checks it for consistency. Make sure to remove all calls to check_heap before submitting your code. It will definitely affect performance evaluation. check_heap takes a line number as a parameter, so that any error messages can be tagged with the line where check_heap was called. You can do this with
  check_heap(__LINE__);

The compiler will automatically fill in __LINE__ with the correct line number.

A print_heap function is also provided that prints out the current state of the implicit free list.

You may also find it helpful to extend the check_heap function to check more detailed consistency properties. This becomes most interesting when considering an explicit free list (or more sophisticated) allocator. check_heap should return a nonzero value if and only if your heap is consistent according to the conditions you check.

Example checks to add:

Feel free to rename or split check_heap into multiple static helper functions. When you submit mm.c, make sure to disable any calls to check_heap as they will impact performance.

Performance

Use mdriver.opt for performance evaluation.

The mdriver.opt executable is a second version of mdriver that is compiled with aggressive compiler optimizations enabled and assertions disabled. Run make mdriver.opt to create it. Use mdriver.opt for performance evaluation, but use the normal mdriver during testing and debugging. Enabling optimizations and disabling assertions may be necessary to reach the highest levels of performance, but they can make debugging more difficult.

For the most part, a correct implementation based on our provided code will yield passable performance. Two performance metrics will be used to evaluate your solution:

The driver program summarizes the performance of your allocator by computing a performance index, \(P\), which is a weighted sum of the space utilization and throughput \[ P = 0.6\times U_{mm.c} + 0.4\times \min\left(1, \frac{T_{mm.c}}{T_{libc}}\right) \] where \(U_{mm.c}\) is your space utilization, \(T_{mm.c}\) is your throughput, and \(T_{libc}\) is the estimated throughput of libc malloc on your system on the default traces.2 The performance index values both good space utilization and good throughput, with slight preference toward space efficiency.

Your allocator must balance utilization and throughput. A performance index of 70-80 or above (out of 100) is pretty good. For reference, an implicit free list with splitting and coalescing should achieve an index in the 50s.

How To Work

Low-level unguarded memory manipulation code like this allocator is prone to subtle and frustrating bugs. There’s a reason you often use higher-level languages with more protections! The following strategies help to work effectively and responsibly to avoid frustration and confusion.

Plan Carefully

Review allocator concepts, complete the preparatory exercises, and read the provided code carefully. Never write code until you have a good idea of what you are doing and what is already there.

Code Defensively

Expect things to go wrong. Anywhere your code relies on an assumption, check it explicitly. Explain to your partner why your code should work. Write error-prone code once carefully and reuse it.

Implement and Test Incrementally

Implement and test one feature (or sub-feature!) at a time, testing each before you start implementing the next. This will make it much easier to understand what code is involved in a bug.

Debug Methodically

gdb, check_heap, and especially print_heap functions are the tools of choice. Here are some strategies and other tips.

What To Turn In

Before submitting, remove calls to check_heap, print_heap, and any diagnostic printing that you added outside of these functions.

Submit your mm.c file to Lab 4 on Moodle. You do not need to turn in anything else. If you are working with a partner, only one person needs to submit (make sure both of your names are in the top comment).

Grading

Your grade will be calculated from 100 points as follows:

You may use any implementation strategy that works (subject to programming rules). Allocator performance is a significant component of the grade for this lab. Do the arithmetic: a high-quality implicit free list allocator can achieve a reasonable grade.

OPTIONAL Extra Credit: Implement realloc

Implement a final memory allocation-related function: mm_realloc. The signature for this function, which you will find in your mm.h file, is:

extern void* mm_realloc(void* ptr, size_t size);

The mm_realloc function returns a pointer to an allocated region of at least size bytes with the following constraints.

To run tests with mm_realloc, switch out mdriver.c with mdriver-realloc.c (to do this rename mdriver-realloc.c to mdriver.c, and delete or rename the original). To run mm_realloc tracefiles, run the driver with the -f flag to specify a tracefile, or first edit config.h to include additional realloc tracefiles in the default list. The realloc traces will not be evaluated for performance, only correctness. Each of the two test cases is worth 3 points of extra credit if it passes.


  1. This lab is adapted from the Malloc Lab developed for Computer Systems: A Programmer’s Perspective by Randal E. Bryant and David R. O’Hallaron, Carnegie Mellon University, available here.↩︎

  2. The value for \(T_{libc}\) is a constant in the driver (6000 Kops/s) that was established by testing the throughput of malloc and free on mantis.↩︎