CS 332 w22 — Condition Variables

1 Waiting for a Change

  • locks let us protect a critical section, but present a question:
    • if a thread needs to wait for a lock to become available, what should it do?
      • spinlock says "burn the CPU"
      • there's probably a better way
  • many other situations arise where we need to wait for another thread to take some action
    • in a thread-safe queue, we might want to wait for an element to be inserted if we try to remove from an empty queue
    • a web server waits for a request, a word processor waits for a keypress, a maps app waits for a user command or new data before updating the view

1.1 Polling

  • unsatisfying solution: waiting thread repeatedly checks shared state to see if it has changed
// Repeatedly try to remove an item from a thread-safe
// queue, busy-waiting until one becomes available.
int
removeOrBlock(TSQueue *q) {
    int item;
    bool success;

    do {
        success = tryRemove(q, &item);
    } while (!success);
    return item;
}
  • this busy-waiting is very inefficient: the waiting thread burns CPU time that other threads (including the one it is waiting for) could be using

2 Condition Variable

  • a synchronization primitive that facilitates efficient waiting
  • three methods:
    • condvar_wait(struct condvar *cv, Lock *lock)
      • atomically releases the lock and suspends the calling thread
      • places the calling thread on the condition variable's waiting list
      • when the calling thread is later re-scheduled, it re-acquires the lock before returning from wait
    • condvar_signal(struct condvar *cv)
      • takes one thread off the condition variable's waiting list and makes it eligible to run (i.e., adds it to the scheduler's ready list)
      • no effect if no threads are waiting
    • condvar_broadcast(struct condvar *cv)
      • takes all threads off the condition variable's waiting list and makes them eligible to run
      • no effect if no threads are waiting
  • note that Unix has unrelated system calls, also named wait and signal, that operate on processes
  • a condition variable has no state apart from its waiting list (sketched below)
  • a condition variable is used to wait for a change to shared state, and a lock must always protect updates to shared state
    • all three methods should only be called when holding the associated lock
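
For a concrete picture, a condition variable can be represented by nothing more than a queue of waiting threads. A minimal sketch (illustrative only; struct list, list_init, and the field name waiters are assumptions, not osv's actual definitions):

// Illustrative sketch: a condition variable needs nothing beyond
// a queue of the threads currently blocked in wait.
struct condvar {
    struct list waiters;   // threads currently suspended in condvar_wait
};

void
condvar_init(struct condvar *cv) {
    list_init(&cv->waiters);   // assumed list helper; list starts empty
}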

2.1 Standard Design Pattern

void some_method_that_waits() {
    lock_acquire(&lock);

    // Read and/or write shared state here.

    while (!test_on_shared_state(shared_data)) {
        condvar_wait(&cv, &lock);
    }
    assert(test_on_shared_state(shared_data));

    // Read and/or write shared state here.

    lock_release(&lock);
}

void some_method_that_signals() {
    lock_acquire(&lock);

    // Read and/or write shared state here.

    // If state has changed in a way that
    // could allow another thread to make
    // progress, signal (or broadcast).

    condvar_signal(&cv);

    lock_release(&lock);
}

2.2 Atomic wait

  • wait is always called while holding a lock
  • call must atomically release the lock and add the thread to the waiting list
  • what could happen if this wasn't atomic (one way to provide the atomicity is sketched after this list):
    • thread releases the lock before waiting [1]
    • thread waits before releasing the lock [2]
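
One way an implementation can provide this atomicity (a simplified sketch, not osv's actual code; sched_lock, sched_sleep_and_unlock, and list_append are hypothetical helpers):

// Hypothetical sketch: the thread puts itself on the waiting list and
// blocks while holding the scheduler's own lock, so a signaler cannot
// slip in between "release the caller's lock" and "start waiting".
void
condvar_wait(struct condvar *cv, struct spinlock *lock) {
    sched_lock();                        // hypothetical scheduler lock
    list_append(&cv->waiters, thread_current());
    spinlock_release(lock);              // signalers may now run, but waking
                                         // this thread also requires sched_lock
    sched_sleep_and_unlock();            // mark thread blocked, switch away,
                                         // and drop the scheduler lock atomically
    spinlock_acquire(lock);              // re-acquire before returning to caller
}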

2.3 wait in a loop

  • Why did the design pattern above have the thread waiting on the condition variable inside a loop?
  • when a thread is re-enabled via signal or broadcast, it may not run immediately
    • it's just added to the ready list with no special priority
  • when it finally does run, other threads may have acquired the lock and changed shared state in the meantime
    • whatever was true when signal or broadcast was called may no longer hold (e.g., in the blocking queue below, another consumer may have already removed the item whose insertion triggered the signal)
  • thus, wait must always be called from within a loop
    • there is no guarantee of atomicity between signal or broadcast and the return from the call to wait

2.3.1 simplifies the implementation

  • this lack of atomicity actually helps simplify the implementation of condition variables (without making them more complex to use)
  • no special scheduling code: signal puts a waiting thread onto the ready list and lets the scheduler choose when to run it
  • no special code is needed to re-acquire the lock at the end of wait; the thread simply calls acquire as usual
  • some implementations even warn that wait can return when no signal or broadcast has occurred
    • from Java documentation:

When waiting upon a Condition, a "spurious wakeup" is permitted to occur, in general, as a concession to the underlying platform semantics. This has little practical impact on most application programs as a Condition should always be waited upon in a loop, testing the state predicate that is being waited for. An implementation is free to remove the possibility of spurious wakeups but it is recommended that applications programmers always assume that they can occur and so always wait in a loop.

2.3.2 improves modularity

  • waiting in a loop makes it easier to reason about when a thread will continue
    • we only have to look at the loop condition
    • we don't need to examine the code to understand where and why signal or broadcast are called
  • this also means that signaling at the wrong time will never cause a waiting thread to proceed when it should not
    • signal and broadcast can be seen as hints that it might be a good time to proceed
    • may matter for performance, but extra calls won't harm correctness
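
For example, the following sketch (hypothetical calling code, reusing the lock and cv from the pattern above) broadcasts unconditionally; at worst the waiters wake up, retest their predicates, and go back to sleep:

// Hypothetical example: an "extra" broadcast is wasteful but harmless,
// because every waiter rechecks its condition in a loop before proceeding.
void nudge_waiters() {
    lock_acquire(&lock);
    condvar_broadcast(&cv);   // waiters wake, retest their loop condition,
                              // and wait again if it still does not hold
    lock_release(&lock);
}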

3 Blocking Bounded Queue

// BBQ.h 
// Thread-safe blocking queue.

#ifndef _BBQ_H_
#define _BBQ_H_
#include <kernel/synch.h>

#define MAX 512   // capacity of the bounded buffer

typedef struct {
  // Synchronization variables
    struct spinlock lock;
    struct condvar item_added;
    struct condvar item_removed;

  // State variables
    int items[MAX];
    int front;
    int next_empty;
} BBQ;

BBQ* bbq_init();
void bbq_free(BBQ *q);
void bbq_insert(BBQ *q, int item);
int bbq_remove(BBQ *q);
#endif

// BBQ.c
// Thread-safe blocking queue implementation.

#include <kernel/kmalloc.h>
#include "BBQ.h"

// static prevents this variable from being visible outside this file
static struct kmem_cache *bbq_allocator;

// Wait until there is room and
// then insert an item.
void
bbq_insert(BBQ *q, int item) {
    spinlock_acquire(&q->lock);
    while ((q->next_empty - q->front) == MAX) {
        condvar_wait(&q->item_removed, &q->lock);
    }
    q->items[q->next_empty % MAX] = item;
    q->next_empty++;
    condvar_signal(&q->item_added);
    spinlock_release(&q->lock);
}

// Wait until there is an item and 
// then remove an item.
int
bbq_remove(BBQ *q) {
    int item;

    spinlock_acquire(&q->lock);
    while (q->front == q->next_empty) {
        condvar_wait(&q->item_added, &q->lock);
    }
    item = q->items[q->front % MAX];
    q->front++;
    condvar_signal(&q->item_removed);
    spinlock_release(&q->lock);
    return item;
}

// Initialize the queue to empty,
// the lock to free, and the
// condition variables to empty.
BBQ* bbq_init() {
  BBQ *q;

  // if the allocator has not been created yet, do so now
  if (bbq_allocator == NULL) {
    if ((bbq_allocator = kmem_cache_create(sizeof(BBQ))) == NULL) {
      return NULL;
    }
  }

  // allocate the BBQ struct
  if ((q = kmem_cache_alloc(bbq_allocator)) == NULL) {
    return NULL;
  }

  q->front = 0;
  q->next_empty = 0;
  spinlock_init(&q->lock);
  condvar_init(&q->item_added);
  condvar_init(&q->item_removed);
  return q;
}

void bbq_free(BBQ *q) {
  kmem_cache_free(bbq_allocator, q);
}
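
As a usage sketch (assumed calling code, not part of the provided files; thread creation is omitted since it depends on osv's thread API), one producer and one consumer can share a single queue:

// Hypothetical usage: the producer blocks while the queue is full,
// the consumer blocks while it is empty.
static BBQ *queue;   // assume queue = bbq_init() ran during setup

void producer_loop() {
    for (int i = 0; i < 1000; i++) {
        bbq_insert(queue, i);           // may wait on item_removed
    }
}

void consumer_loop() {
    for (int i = 0; i < 1000; i++) {
        int item = bbq_remove(queue);   // may wait on item_added
        (void) item;                    // ...process the item here...
    }
}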

4 Sleeplock

osv uses a spinlock and a condition variable to implement an efficient lock for general use.

void
sleeplock_acquire(struct sleeplock* lock)
{
    if (!synch_enabled) {
        return;
    }
    kassert(lock);
    spinlock_acquire(&lock->lk);
    while (lock->holder != NULL) {
        condvar_wait(&lock->waiters, &lock->lk);
    }
    lock->holder = thread_current();
    spinlock_release(&lock->lk);
}

void
sleeplock_release(struct sleeplock* lock)
{
    if (!synch_enabled) {
        return;
    }
    kassert(lock && lock->holder == thread_current());
    spinlock_acquire(&lock->lk);
    lock->holder = NULL;
    condvar_signal(&lock->waiters);
    spinlock_release(&lock->lk);
}
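
The fields this code relies on can be read off from acquire and release; a sketch of the struct (the exact declaration in osv's synch.h may differ):

// Sketch inferred from the code above, not copied from osv's headers.
struct sleeplock {
    struct spinlock lk;        // guards holder and the waiting list
    struct condvar waiters;    // threads blocked until the lock is free
    struct thread *holder;     // owning thread, or NULL when the lock is free
};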

5 Reading: The Producer/Consumer Problem

Read Section 30.2 (p. 374–382) of the OSTEP book. It shows how locks and condition variables can be used to address a classic operating systems scenario: the producer/consumer problem. By walking through flawed solutions, it should help you build your intuition for how to design thread-safe code using the synchronization primitives we've discussed so far.

Footnotes:

[1] After it releases the lock, but before it waits, another thread preempts it, acquires the lock, makes the changes the original thread is waiting for, and calls signal (which has no effect on an empty waiting list). Eventually the original thread is scheduled again, whereupon it puts itself on the waiting list and suspends, forever waiting for a signal that already came.

[2] The thread could be suspended while still holding the lock, which would prevent any other thread from making the state change it is waiting for.