CS 208 w20 lecture 20 outline

1 Introduction

#include <stdlib.h>

// static global data, size fixed at compile time, exists for the lifetime of the program
int array[1024];

void foo(int n) {
    // stack-allocated data, known lifetime (deallocated on return)
    int tmp;
    int local_array[n]; // variable-length array: C99 allows dynamically-sized stack allocation

    // dynamic (heap) data, size and lifetime known only at runtime
    int* dyn = (int*) malloc(n * sizeof(int));
    // good practices:
    //    sizeof makes code more portable
    //    void* is implicitly converted to any pointer type; explicit typecast will help you
    //      catch coding errors when pointer types don’t match
    //    check the result against NULL before use, and free the block when done with it
    free(dyn);
}

Two big questions we'll focus on for the next two weeks:

1. how do we manage the scarce resource of physical memory while providing all processes as much virtual memory as they need?
   - implemented by the operating system kernel and hardware
2. how do we handle malloc and free quickly and efficiently?
   - implemented by the C library

Lab 5 will task you with the second question, so we will start there.

2 Dynamic Memory Allocation

We don't always know ahead of time how much space we will need for certain data structures. Hardcoded sizes everywhere can lead to problems and become a maintenance nightmare. Hence, we want the ability to allocate memory as we go along (dynamically).

2.1 Types of Allocators

Allocator organizes heap as a collection of variable-sized blocks, which are either allocated or free
explicit allocators: require manual requests and frees (malloc package in C)
implicit allocators: unused blocks are automatically detected and freed (garbage collection)

2.1.1 `malloc` package

malloc does no initialization; use calloc to request zero-initialized memory
- returns pointer to the beginning of allocated block, NULL indicates failed request
- typically 16-byte aligned on x86-64
change the size of previously allocated block with realloc
underneath, allocation can use mmap and munmap or use the void *sbrk(intptr_t incr) function
- sbrk grows or shrinks the heap by adding incr to the kernel's brk pointer, returns old value of brk

free takes a pointer to the beginning of a block previously allocated by malloc, calloc, or realloc
- behavior undefined on other arguments, no indication of error
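A minimal sketch of how the calls in the malloc package fit together. The function name `malloc_package_demo` and the values it stores are made up for illustration; only `calloc`, `realloc`, and `free` come from the C library.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical demo: returns 0 on success, -1 if any request fails. */
int malloc_package_demo(size_t n) {
    assert(n > 0);

    /* calloc returns zero-initialized memory; malloc would leave it uninitialized */
    int *counts = calloc(n, sizeof(int));
    if (counts == NULL) {
        return -1;                 /* NULL indicates a failed request */
    }
    counts[n - 1] = 42;

    /* realloc may move the block; the old contents are preserved */
    int *bigger = realloc(counts, 2 * n * sizeof(int));
    if (bigger == NULL) {
        free(counts);              /* on failure the old block is still valid */
        return -1;
    }

    int ok = (bigger[n - 1] == 42);
    free(bigger);                  /* pass the pointer to the start of the block */
    return ok ? 0 : -1;
}
```

Note that the result of realloc goes into a separate variable: overwriting `counts` directly would leak the original block if the request failed.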

2.2 Spreadsheet Example

https://docs.google.com/spreadsheets/d/1Tj8LTKBBqLodX5K8GbqCVUSWOFur6KzXZshQhx0bF4E/edit?usp=sharing

2.3 Allocator Goals and Requirements

2.3.1 Requirements

handling arbitrary request sequences: cannot make any assumptions about the sequencing of allocate and free requests
making immediate responses to requests: no reordering or batching of requests to improve performance
using only the heap: allocator's internal data structures must also be stored on the heap
aligning blocks: blocks need to meet alignment requirements for any type of data they might hold
not modifying allocated blocks: cannot modify or move blocks when they are allocated

2.3.2 Goals

maximizing throughput: maximize requests per second
maximizing memory utilization: use the greatest possible fraction of the space allocated for the heap. Peak utilization metric is the maximum such fraction achieved over the course of \(n\) requests.
these are in tension: respond to a request faster vs respond with a better choice of block
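One way to make the utilization metric concrete, assuming the common textbook formulation: take the largest aggregate payload seen after any of the \(n\) requests and divide it by the heap size at the end (the heap is assumed never to shrink). The function name and the sample numbers are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* payload[i] is the total number of payload bytes allocated after request i.
 * heap_size is the size of the heap after the last request (assumed nondecreasing). */
double peak_utilization(const size_t *payload, size_t n, size_t heap_size) {
    assert(heap_size > 0);
    size_t max_payload = 0;
    for (size_t i = 0; i < n; i++) {
        if (payload[i] > max_payload) {
            max_payload = payload[i];
        }
    }
    return (double)max_payload / (double)heap_size;
}
```

For example, aggregate payloads of 16, 48, and 32 bytes on a 64-byte heap give a peak utilization of 48/64 = 0.75.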

2.4 Fragmentation

internal fragmentation: allocated block is larger than the amount requested (due to minimum block size or alignment requirements)
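A small sketch of where internal fragmentation comes from. The constants are assumptions chosen for illustration (16-byte alignment, an 8-byte header, a 32-byte minimum block), not values fixed by these notes, and here every byte of the block that is not payload is counted as fragmentation.

```c
#include <assert.h>
#include <stddef.h>

static const size_t ALIGNMENT = 16;       /* assumed alignment requirement */
static const size_t HEADER_SIZE = 8;      /* assumed per-block header */
static const size_t MIN_BLOCK_SIZE = 32;  /* assumed minimum block size */

/* Round n up to the nearest multiple of `multiple`. */
size_t round_up(size_t n, size_t multiple) {
    assert(multiple > 0);
    return (n + multiple - 1) / multiple * multiple;
}

/* Size of the block an allocator would carve out for a payload request. */
size_t block_size_for(size_t payload) {
    size_t size = round_up(payload + HEADER_SIZE, ALIGNMENT);
    return size < MIN_BLOCK_SIZE ? MIN_BLOCK_SIZE : size;
}

/* Bytes in the block that do not hold requested payload. */
size_t internal_fragmentation(size_t payload) {
    return block_size_for(payload) - payload;
}
```

Under these assumptions a 1-byte request occupies a 32-byte block (31 bytes lost), while a 24-byte request fits the same block with only the 8-byte header as overhead.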

POLL
external fragmentation: when allocation/free pattern leaves "holes" between blocks
- symptom: there is enough total free memory to satisfy a request, but no single free block is large enough
- example: if final request on spreadsheet example were for 48 bytes
- difficult to quantify as it depends on future requests
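The symptom above can be written as a simple check. This is only a sketch: the function names and the free-block sizes in the example are hypothetical and are not taken from the spreadsheet.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of all free block sizes. */
size_t total_free(const size_t *free_blocks, size_t n) {
    size_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += free_blocks[i];
    }
    return total;
}

/* Size of the largest single free block (0 if there are none). */
size_t largest_free(const size_t *free_blocks, size_t n) {
    size_t largest = 0;
    for (size_t i = 0; i < n; i++) {
        if (free_blocks[i] > largest) {
            largest = free_blocks[i];
        }
    }
    return largest;
}

/* True when there is enough free memory in total, but no single block fits. */
int blocked_by_external_fragmentation(const size_t *free_blocks, size_t n, size_t request) {
    assert(free_blocks != NULL || n == 0);
    return total_free(free_blocks, n) >= request &&
           largest_free(free_blocks, n) < request;
}
```

With free blocks of 16, 32, and 16 bytes, a 48-byte request fails even though 64 bytes are free in total, while a 32-byte request succeeds.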
POLL

2.5 Implementation Issues

free block organization: how to keep track of them
placement: which free block should we choose to fulfill a request
splitting: what do we do with the remainder of a free block after part of it is allocated
coalescing: what do we do when a block is freed

2.6 Implicit Free Lists

since block size will always be a multiple of 16 due to alignment, the four low-order bits of the size will always be 0
we can make efficient use of the header by storing the allocated/free status in one of these otherwise unused bits
the block size includes the payload and any padding, and is thus an implicit pointer to the start of the next block
we will need some special block to mark the end of the free list (e.g., allocated bit set, size 0)
implicit free list is simple, but operations are linear in the number of blocks since we have to traverse the list
alignment and block format together impose a minimum block size
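A sketch of the header encoding and list traversal described above. To keep it short, the heap is modeled as an array of 8-byte words indexed by position instead of real addresses, and payload alignment is ignored; the helper names (`pack`, `block_size`, `is_allocated`, `count_free_blocks`) are illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALLOC_BIT ((uint64_t)0x1)      /* lowest bit: 1 = allocated, 0 = free */
#define SIZE_MASK (~(uint64_t)0xF)     /* clears the four low-order bits */

/* Combine a block size (multiple of 16 bytes) and an allocated flag into one header word. */
uint64_t pack(uint64_t size, int allocated) {
    assert(size % 16 == 0);
    return size | (allocated ? ALLOC_BIT : 0);
}

uint64_t block_size(uint64_t header) { return header & SIZE_MASK; }
int is_allocated(uint64_t header) { return (header & ALLOC_BIT) != 0; }

/* Walk the implicit list from word 0 to the end marker (size 0),
 * using each block's size as the offset to the next header. */
size_t count_free_blocks(const uint64_t *heap) {
    size_t count = 0;
    size_t i = 0;
    while (block_size(heap[i]) != 0) {
        if (!is_allocated(heap[i])) {
            count++;
        }
        i += block_size(heap[i]) / sizeof(uint64_t);
    }
    return count;
}
```

The traversal visits every block, allocated or free, which is why operations on an implicit free list are linear in the total number of blocks.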

2.7 Placing Allocated Blocks

an allocator's placement policy determines the choice of free block to fulfill allocation
- first fit: take the first free block that's big enough
  - end up with lots of small "splinters" near the start of the list, large blocks near the end
- next fit: like first fit, but start search where the previous one left off
  - unclear if this is better than first fit in practice
- best fit: choose the smallest free block that fits
  - better memory utilization, but requires exhaustive search
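First fit and best fit can be sketched side by side over the same toy heap model (an array of 8-byte header words, size in the upper bits, allocated flag in bit 0, end marker of size 0). The names and the convention of returning -1 when nothing fits are assumptions for this example. Next fit would be the same loop as first fit, started from a saved index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

uint64_t size_of(uint64_t header) { return header & ~(uint64_t)0xF; }
int allocated(uint64_t header) { return (header & 1) != 0; }

/* First fit: return the word index of the first free block of at least `need` bytes. */
ptrdiff_t first_fit(const uint64_t *heap, uint64_t need) {
    assert(need > 0);
    for (size_t i = 0; size_of(heap[i]) != 0; i += size_of(heap[i]) / 8) {
        if (!allocated(heap[i]) && size_of(heap[i]) >= need) {
            return (ptrdiff_t)i;       /* stop at the first match */
        }
    }
    return -1;
}

/* Best fit: scan every block and keep the smallest free block that is big enough. */
ptrdiff_t best_fit(const uint64_t *heap, uint64_t need) {
    assert(need > 0);
    ptrdiff_t best = -1;
    for (size_t i = 0; size_of(heap[i]) != 0; i += size_of(heap[i]) / 8) {
        if (!allocated(heap[i]) && size_of(heap[i]) >= need &&
            (best < 0 || size_of(heap[i]) < size_of(heap[best]))) {
            best = (ptrdiff_t)i;
        }
    }
    return best;
}
```

On a heap holding a free 48-byte block followed later by a free 32-byte block, a 32-byte request goes to the 48-byte block under first fit and to the 32-byte block under best fit.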
if the fit is not good, an allocator might opt to split the free block into two parts
if necessary, an allocator will ask the kernel for additional heap memory, inserting a new free block into the list
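A sketch of the splitting decision, again on a toy heap of 8-byte header words. The 32-byte minimum block size and the name `place` are assumptions: the block is split only if the remainder could stand as a block of its own, otherwise the whole block is handed out and the extra space becomes internal fragmentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static const uint64_t MIN_BLOCK = 32;   /* assumed minimum block size in bytes */

/* Allocate `need` bytes (a multiple of 16) from the free block whose header is at word i. */
void place(uint64_t *heap, size_t i, uint64_t need) {
    uint64_t have = heap[i] & ~(uint64_t)0xF;
    assert((heap[i] & 1) == 0);          /* block must be free */
    assert(need % 16 == 0 && have >= need);

    uint64_t rest = have - need;
    if (rest >= MIN_BLOCK) {
        heap[i] = need | 1;              /* allocated front part */
        heap[i + need / 8] = rest;       /* remainder becomes a new free block */
    } else {
        heap[i] = have | 1;              /* remainder too small: allocate it all */
    }
}
```

Placing a 32-byte request in a 64-byte free block yields an allocated 32-byte block followed by a free 32-byte block. The same request in a 48-byte block takes the whole block, since a 16-byte remainder is below the assumed minimum.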

2.8 Coalescing Free Blocks

adjacent free blocks cause false fragmentation: free memory is chopped up into many small free blocks that are individually unusable
to address this an allocator must merge (coalesce) adjacent free blocks
- immediate coalescing: merge blocks whenever one is freed
  - simple, constant time
  - can lead to a form of thrashing where a block is coalesced and split repeatedly
- deferred coalescing: wait and coalesce at a later time (e.g., scan the heap when a request fails)
  - often the more efficient choice
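Deferred coalescing can be sketched as a single pass over the heap that merges each run of adjacent free blocks. This uses the same toy model as before (8-byte header words, allocated flag in bit 0, end marker of size 0); the name `coalesce_all` and the returned merge count are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

uint64_t size_of(uint64_t header) { return header & ~(uint64_t)0xF; }
int allocated(uint64_t header) { return (header & 1) != 0; }

/* Scan the whole heap once, merging every free block with a free successor.
 * Returns the number of merges performed. */
size_t coalesce_all(uint64_t *heap) {
    assert(heap != NULL);
    size_t merges = 0;
    size_t i = 0;
    while (size_of(heap[i]) != 0) {
        size_t next = i + size_of(heap[i]) / 8;
        if (!allocated(heap[i]) && size_of(heap[next]) != 0 && !allocated(heap[next])) {
            heap[i] += size_of(heap[next]);   /* absorb the next block; stay at i */
            merges++;
        } else {
            i = next;                          /* nothing to merge here; move on */
        }
    }
    return merges;
}
```

Because the scan only moves forward, it needs nothing beyond the size in each header. The cost is a full traversal, paid only when the allocator decides to run it.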

2.8.1 Boundary Tags

with a simple header containing the block size, we can reach the next block, but what about the previous block?
- we'd have to traverse the free list until we reached the current block
instead, add a footer to each block that's the same as the header
- together, the header and footer form boundary tags
- if we go back one word from the header, we get to the footer of the previous block

four cases for coalescing:
1. previous and next blocks are both allocated
2. previous block is allocated, next block is free
3. previous block is free, next block is allocated
4. previous and next blocks are both free
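The four cases can be handled in one routine once every block carries a footer. In this sketch the heap is an array of 8-byte words, word 0 is an assumed sentinel marked allocated so the first block never looks "behind" the heap, and `set_block` and `free_and_coalesce` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

uint64_t size_of(uint64_t tag) { return tag & ~(uint64_t)0xF; }
int allocated(uint64_t tag) { return (tag & 1) != 0; }

/* Write matching header and footer for the block whose header is at word i. */
void set_block(uint64_t *heap, size_t i, uint64_t size, int alloc) {
    assert(size >= 16 && size % 16 == 0);
    uint64_t tag = size | (alloc ? 1 : 0);
    heap[i] = tag;                      /* header */
    heap[i + size / 8 - 1] = tag;       /* footer: last word of the block */
}

/* Free the block at word i and merge it with any free neighbors.
 * Returns the word index of the resulting free block's header. */
size_t free_and_coalesce(uint64_t *heap, size_t i) {
    uint64_t size = size_of(heap[i]);
    size_t next = i + size / 8;
    int prev_free = !allocated(heap[i - 1]);   /* footer of the previous block */
    int next_free = !allocated(heap[next]);    /* header of the next block */

    if (next_free) {                           /* cases 2 and 4: absorb next */
        size += size_of(heap[next]);
    }
    if (prev_free) {                           /* cases 3 and 4: extend previous */
        uint64_t prev_size = size_of(heap[i - 1]);
        i -= prev_size / 8;
        size += prev_size;
    }
    set_block(heap, i, size, 0);               /* case 1: just mark this block free */
    return i;
}
```

Each call reads a constant number of words no matter how many blocks the heap holds, which is what makes immediate coalescing constant time with boundary tags.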
boundary tags can introduce significant memory overhead in the case of many small blocks
- fortunately, we only need to have footers in free blocks, since we only care about the size of the previous block when it is free and we want to merge it
- we still need a way to tell if the previous block is allocated, so we'll add that information to one of the unused low-order bits of the header
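A sketch of the footer optimization, assuming bit 0 of the header is the allocated flag and bit 1 records whether the previous block is allocated. The bit assignments and the helper names `mark_free` and `prev_header` are illustrative choices, not a layout these notes prescribe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALLOC      ((uint64_t)0x1)   /* this block is allocated */
#define PREV_ALLOC ((uint64_t)0x2)   /* the previous block is allocated */

uint64_t size_of(uint64_t header) { return header & ~(uint64_t)0xF; }
int prev_allocated(uint64_t header) { return (header & PREV_ALLOC) != 0; }

/* Mark the block at word i free: only now does it get a footer,
 * and the next block's header is told that its predecessor is free. */
void mark_free(uint64_t *heap, size_t i) {
    uint64_t size = size_of(heap[i]);
    uint64_t tag = size | (heap[i] & PREV_ALLOC);   /* keep our own prev bit, clear ALLOC */
    heap[i] = tag;                                   /* header */
    heap[i + size / 8 - 1] = tag;                    /* footer, overlapping old payload */
    heap[i + size / 8] &= ~PREV_ALLOC;               /* update the next block's header */
}

/* Word index of the previous block's header; only valid when that block is free,
 * because only free blocks have a footer to read. */
size_t prev_header(const uint64_t *heap, size_t i) {
    assert(!prev_allocated(heap[i]));
    return i - size_of(heap[i - 1]) / 8;
}
```

Allocated blocks can use their last word for payload, since nobody reads a footer there: the PREV_ALLOC bit in the following header already says not to look.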