CS 332 w22 — Caches and Address Translation

1 Caching

1.1 Definitions

  • Cache
    • Copy of data that is faster to access than the original
    • Hit: if cache has copy
      • Desired data is resident in the cache
    • Miss: if cache does not have copy
      • Load missing data into the cache
      • Overwrite (evict) resident data to make room, if necessary
  • Cache block
    • Unit of cache storage (range of memory locations)
  • Temporal locality
    • Programs tend to reference the same memory locations multiple times
    • Example: instructions in a loop
  • Spatial locality
    • Programs tend to reference nearby locations
    • Example: sequential accesses to array data in a loop (see the sketch below)
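
Both kinds of locality show up in even the simplest loops. Below is a minimal C sketch (the function name and array are purely illustrative, not from the notes): the loop's instructions and the running total are reused every iteration, and the array elements sit in adjacent memory locations.

  #include <stddef.h>
  #include <stdio.h>

  /* Sum an array. The loop instructions and 'total' are reused on every
   * iteration (temporal locality); a[0], a[1], ... are adjacent in memory,
   * so the block loaded on one miss also brings in the next few elements
   * (spatial locality). */
  long sum(const int *a, size_t n) {
      long total = 0;
      for (size_t i = 0; i < n; i++)
          total += a[i];
      return total;
  }

  int main(void) {
      int a[8] = {1, 2, 3, 4, 5, 6, 7, 8};
      printf("%ld\n", sum(a, 8));
      return 0;
  }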

1.2 Reading from a Cache

  • Cache logic implemented in hardware
  • Completely transparent to the program
    • But performance impact can be huge!

cache.png

1.3 Writing to a Cache

  • Two kinds of cache write behavior (both are sketched in code after the figure below)
    • Write through: changes sent immediately to next level of storage
    • Write back: changes stored in cache until cache block is replaced

cacheWrite.png
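
The difference between the two policies can be sketched in C. The struct fields and the next_level_write() stub below are illustrative assumptions, not part of any real hardware interface:

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical cache line: one block of data plus bookkeeping bits. */
  struct cache_line {
      uint64_t tag;
      bool     valid;
      bool     dirty;                 /* only meaningful for write back */
      uint8_t  data[64];
  };

  /* Stand-in for sending a block to the next level of storage. */
  static void next_level_write(const struct cache_line *line) {
      (void)line;
  }

  /* Write through: update the cache and immediately push the change down,
   * so the next level always holds the latest copy. */
  void write_through(struct cache_line *line, size_t offset, uint8_t value) {
      line->data[offset] = value;
      next_level_write(line);
  }

  /* Write back: update only the cache and mark the line dirty; the change
   * reaches the next level later, when the block is replaced. */
  void write_back(struct cache_line *line, size_t offset, uint8_t value) {
      line->data[offset] = value;
      line->dirty = true;
  }

  /* On replacement, a write-back cache must flush dirty lines first. */
  void evict(struct cache_line *line) {
      if (line->valid && line->dirty)
          next_level_write(line);
      line->valid = false;
      line->dirty = false;
  }

  int main(void) {
      struct cache_line line = { .tag = 0x1, .valid = true };
      write_back(&line, 0, 42);       /* dirty bit is now set          */
      evict(&line);                   /* flushes before the line is reused */
      return 0;
  }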

1.4 Example Cache Architectures

corei7caches.png

  • Intel Core i7 (pictured above)
    • L1 caches are 32 or 48 KB
    • L2 caches range from 256 KB to 2 MB depending on specific model
    • L3 caches range from 4 MB to 24 MB
  • Apple M1 chip:
    • 4 high-performance cores and 4 energy-efficient cores
      • M1 Pro and M1 Max have 8 high-performance and 2 energy-efficient cores
    • Each performance core has a 192 KB instruction cache and 128 KB data cache (L1)
    • Each efficient core has a 128 KB instruction cache and 64 KB data cache (L1)
    • Performance cores share a 12 MB L2 cache, efficient cores share a 4 MB L2 cache
    • The entire system on a chip (SoC) has a shared 16 MB cache

1.5 Memory Hierarchy

mem-hierarchy.png

  • Going up the hierarchy: memory gets smaller, faster, and more expensive (per byte)
  • Going down the hierarchy: memory gets bigger, slower, and cheaper (per byte)

1.6 When Caches Work and When They Do Not

cache-working-set.png

Events like context switches can cause a burst of misses:

cache-phase-change.png

1.7 Types of Caches

  • Fully Associative cache
    • Cache checks address against every entry and returns matching value, if any

      cache-fullyassoc.png

  • Direct Mapped cache
    • Cache hashes address to determine which location to check for a match

      cache-directmapped.png

  • Set Associative cache
    • Cache hashes address to determine which set to check
    • Checks each block (way) in that set in parallel for a match (see the sketch after the figure below)

      cache-setassoc.png
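
As a concrete picture of the "hash" step, the sketch below splits an address into offset, index, and tag fields the way a direct-mapped cache would; the block size and number of sets are illustrative choices, not taken from the figures above. A set associative cache uses the same index bits to pick a set, then compares the tag against every block in that set at once.

  #include <stdint.h>
  #include <stdio.h>

  /* Illustrative geometry: 64-byte blocks and 256 sets. */
  #define BLOCK_BITS 6    /* 2^6 = 64-byte blocks */
  #define INDEX_BITS 8    /* 2^8 = 256 sets       */

  int main(void) {
      uint64_t addr = 0x7ffe1234;     /* example address */

      /* Low bits pick a byte within the block, the next bits pick the set,
       * and the remaining high bits form the tag that is compared against
       * the tag(s) stored in that set. */
      uint64_t offset = addr & ((1u << BLOCK_BITS) - 1);
      uint64_t index  = (addr >> BLOCK_BITS) & ((1u << INDEX_BITS) - 1);
      uint64_t tag    = addr >> (BLOCK_BITS + INDEX_BITS);

      printf("offset=%llu index=%llu tag=0x%llx\n",
             (unsigned long long)offset,
             (unsigned long long)index,
             (unsigned long long)tag);
      return 0;
  }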

2 Address Translation

  • Conversion from the memory address the program thinks it is referencing to the physical location of that memory cell
    • Virtual address to physical address
    • The range of memory addresses a program can use is called its address space
  • Program behaves correctly even though its memory is stored somewhere completely different from where it thinks it is stored
  • A useful fiction, transparent to the programmer
  • Goals:
    • Memory protection
    • Memory sharing: shared libraries, interprocess communication
    • Sparse addresses: multiple regions of dynamic allocation (stack/heap)
    • Efficiency: memory placement, runtime lookup, compact translation tables
    • Portability

address-translation.png

2.1 Base and Bound Hardware Translation

baseandbound.png

  • Virtual addresses range from 0 to bound
  • Physical addresses from base to base + bound
  • base and bound specific to each process, stored in special registers
    • Saved on context switch
  • In a nutshell: each process occupies a single contiguous range of physical memory (see the sketch after this list)
  • Pros:
    • Simple, fast, safe
    • Can relocate physical memory without changing process
  • Cons:
    • Can't prevent process from overwriting its own code (the only check is that an address does not exceed the bound)
    • Can't share code/data with other processes
    • Can't grow stack/heap as needed
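
A minimal sketch of the translation in C, assuming a hypothetical struct for the base and bound registers. Real hardware performs this check on every memory access and raises an exception to the OS rather than exiting:

  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Illustrative per-process translation state; in hardware these are
   * special registers saved and restored on a context switch. */
  struct base_bound {
      uint64_t base;    /* start of the process's physical memory region */
      uint64_t bound;   /* size of the region: valid virtual addresses
                           run from 0 up to bound                        */
  };

  /* Translate a virtual address, trapping if it is out of range. */
  uint64_t translate(const struct base_bound *bb, uint64_t vaddr) {
      if (vaddr >= bb->bound) {
          fprintf(stderr, "protection fault at 0x%llx\n",
                  (unsigned long long)vaddr);
          exit(EXIT_FAILURE);   /* stand-in for raising an exception */
      }
      return bb->base + vaddr;  /* physical address = base + virtual address */
  }

  int main(void) {
      struct base_bound bb = { .base = 0x100000, .bound = 0x4000 };
      printf("vaddr 0x2a0 -> paddr 0x%llx\n",
             (unsigned long long)translate(&bb, 0x2a0));
      return 0;
  }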

3 Reading: Address Translation

OSTEP Chapter 15 (p. 151–161) provides a good walkthrough of address translation with examples and additional context.