CS 332 w22 — Segmentation and Paging

1 Base and Bound

  • Pros:
    • Simple, fast, safe
      • Just two hardware registers!
    • Can relocate a process in physical memory without changing the process (only the base register changes)
  • Cons:
    • Can't prevent process from overwriting its own code (hardware only checks whether an access exceeds the bound)
    • Can't share code/data with other processes
    • Can't grow stack/heap as needed
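
A minimal sketch of the base-and-bound check, written in Python for illustration. The names translate_base_bound, base, bound, and BoundsError are hypothetical; real hardware performs this check with its two registers on every memory access.

```python
class BoundsError(Exception):
    """Raised when a virtual address falls outside the process's bound."""


def translate_base_bound(vaddr, base, bound):
    """Translate a virtual address using one base and one bound register."""
    if vaddr < 0 or vaddr >= bound:
        raise BoundsError(f"address {vaddr} exceeds bound {bound}")
    return base + vaddr


# Relocating the process only changes the base register; the virtual
# addresses used by the program stay the same.
print(translate_base_bound(100, base=4096, bound=1024))  # 4196
print(translate_base_bound(100, base=8192, bound=1024))  # 8292
```

Note that the check says nothing about what kind of access is being made, which is why a process can still overwrite its own code.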

2 Segmentation

Key idea: multiple bases and bounds to divide memory into segments

segment.png

  • Segment is a contiguous region of virtual memory
  • Each process has a segment table (stored in hardware)
    • Each entry in the table defines a segment
  • Segment can be located anywhere in physical memory
    • Each segment has: start (base), length (bound), access permissions
  • Just as with base and bound, a memory access to a segment exceeding the bound triggers an exception
    • On UNIX systems, called a segmentation fault
  • Common theme: virtual address split into index (Segment) and offset
  • Base value from table added to Offset to compute physical address
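
The lookup described above can be sketched in Python. The 12-bit offset width, the Segment fields, and the names translate_segmented and SegmentationFault are assumptions chosen for illustration, not a description of any particular hardware.

```python
from dataclasses import dataclass

OFFSET_BITS = 12  # assumption: low 12 bits are the offset, higher bits the segment index


class SegmentationFault(Exception):
    """Raised on an out-of-bounds or disallowed access."""


@dataclass
class Segment:
    base: int        # start of the segment in physical memory
    bound: int       # length of the segment in bytes
    writable: bool   # access permission (reads are always allowed in this sketch)


def translate_segmented(table, vaddr, write=False):
    """Split vaddr into (segment index, offset), check, then add the base."""
    index = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if index >= len(table):
        raise SegmentationFault(f"no segment {index}")
    seg = table[index]
    if offset >= seg.bound:
        raise SegmentationFault(f"offset {offset:#x} exceeds bound {seg.bound:#x}")
    if write and not seg.writable:
        raise SegmentationFault(f"segment {index} is read-only")
    return seg.base + offset


table = [
    Segment(base=0x4000, bound=0x800, writable=False),  # code
    Segment(base=0x9000, bound=0x400, writable=True),   # heap
]
print(hex(translate_segmented(table, 0x0010)))              # 0x4010
print(hex(translate_segmented(table, 0x1004, write=True)))  # 0x9004
```

Unlike plain base and bound, the per-segment permission lets the check reject a write to the code segment.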

2.1 Segmentation Facilitates Sharing

segmentShared.png

  • Same base, bound, same/different access permissions
  • For example, two instances of the same program sharing a code segment

2.2 Copy-on-Write

We can use segmentation to avoid unnecessary copying

  • UNIX fork
    • Makes a complete copy of a process
    • If child never uses parent's memory (e.g., just calls exec), a lot of unnecessary copying
  • Segments allow a more efficient implementation
    • Copy segment table into child
    • Mark parent and child segments read-only
    • Start child process; return to parent
    • If child or parent writes to a segment (e.g., stack or heap)
      • Trap into kernel
      • Make a copy of the segment, change permissions, and resume

2.3 Zero-on-Reference

  • Need to zero out contents of memory before allocation
    • Avoid accidentally leaking information!
    • But only want to bother if memory will actually be used
  • How much physical memory is needed for the stack or heap?
    • Only what is currently in use (could be anywhere from KBs to GBs)
    • Reserve larger space, but only zero the first few KB
    • Set the segment bound to just the zeroed portion
  • When program uses memory beyond end of stack or expands the heap
    • Segmentation fault into OS kernel
    • Kernel allocates some memory (extends bound)
    • Zeros the memory
    • Modifies the segment table
    • Resumes the process
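
The fault-and-extend steps above can be sketched as a small simulation. The GrowableSegment class, the 4 KB CHUNK growth size, and the 0xAA "stale data" filler are all assumptions made for illustration.

```python
CHUNK = 4096  # assumption: the kernel grows the segment 4 KB at a time


class GrowableSegment:
    def __init__(self, reserved, initial):
        # 0xAA stands in for stale data left behind by a previous owner.
        self.memory = bytearray(b"\xAA" * reserved)
        self.reserved = reserved
        self.bound = 0
        self._extend(initial)

    def _extend(self, new_bound):
        """Kernel path: zero the newly exposed bytes, then raise the bound."""
        self.memory[self.bound:new_bound] = bytes(new_bound - self.bound)
        self.bound = new_bound

    def read(self, offset):
        if not 0 <= offset < self.reserved:
            raise IndexError("access outside the reserved region")
        if offset >= self.bound:
            # Fault: extend the bound to the end of the chunk containing offset.
            self._extend(min(self.reserved, (offset // CHUNK + 1) * CHUNK))
        return self.memory[offset]


seg = GrowableSegment(reserved=16 * 1024, initial=CHUNK)
print(seg.bound)       # 4096
print(seg.read(5000))  # 0
print(seg.bound)       # 8192
```

Memory past the bound still holds the stale filler, but the process can never observe it: any access that reaches it first goes through the zeroing path.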

2.4 Pros and Cons

  • Pros:
    • Can share code/data segments between processes
    • Can protect code segment from being overwritten
    • Can transparently grow stack/heap as needed
    • Can detect if need to copy-on-write
  • Cons:
    • Complex memory management
      • Need to find chunk of a particular size (e.g., some programs have small data segment, some large)
    • May need to rearrange memory from time to time to make room for new segment or growing segment
      • Compacting free memory can create a lot of processor overhead (i.e., moving segments around to consolidate free space)
      • External fragmentation in physical memory due to wasted space between different-sized chunks

3 Paging

Key idea: divide memory into fixed-size pages

  • Manage physical memory in fixed size units, or pages
    • Both virtual and physical memory use this same fixed page size
  • Finding a free page is easy
    • Bitmap allocation
      • Each bit represents one physical page frame, indicates free or allocated
        • Bitmaps also used in file systems to represent free disk blocks (also fixed-size units)
      • With 0 = free and 1 = allocated, 0011111100000001100 indicates 2 free pages, then 6 allocated pages, then 7 free pages, then 2 allocated, then 2 free
  • Each process has its own page table
    • Stored in physical memory
  • What hardware registers do we need?
    • Pointer to page table start
    • Page table length
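
As an illustration of the bitmap allocation mentioned above, here is a short Python sketch that reads the example bitmap and allocates the first free frame. It keeps the bitmap as a string of '0' (free) and '1' (allocated) to match the notes; the function names describe_runs and allocate_frame are hypothetical, and a real kernel would use packed machine words instead of a string.

```python
from itertools import groupby


def describe_runs(bitmap):
    """Return (state, length) runs, where '0' is free and '1' is allocated."""
    return [
        ("free" if bit == "0" else "allocated", len(list(group)))
        for bit, group in groupby(bitmap)
    ]


def allocate_frame(bitmap):
    """Return (frame_number, new_bitmap) for the first free frame."""
    frame = bitmap.find("0")
    if frame == -1:
        raise MemoryError("no free page frames")
    return frame, bitmap[:frame] + "1" + bitmap[frame + 1:]


bitmap = "0011111100000001100"
print(describe_runs(bitmap))
# [('free', 2), ('allocated', 6), ('free', 7), ('allocated', 2), ('free', 2)]
print(allocate_frame(bitmap))
# (0, '1011111100000001100')
```

Because every frame is the same size, any free bit will do; there is no need to search for a hole of a particular length.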

A program's memory may be scattered throughout physical memory

  • Each process sees its virtual address space as a neatly ordered deck of cards
  • Physical memory is all the decks from all the processes shuffled together

logicalPages.png

3.1 Address Translation

As before, the virtual address is used to look up the physical address

  • Again, part of the virtual address is an index, and part is an offset
    • Since the offset is the lower-order bits, it specifies a location within a contiguous virtual page or physical frame
  • Page table contains access permissions like segment table

paged-pte.png

Example:

64-byte virtual address space (64 = 2^6  ->  6-bit virtual addresses)
16-byte pages

Virtual Memory
4 pages (64/16 = 4), 2 bits to specify virtual page number (VPN)
VPN
00  -> page for code,  addresses 00...15
01  -> page for heap,  addresses 16...31
10  -> <unused>,       addresses 32...47
11  -> page for stack, addresses 48...63

Virtual addresses on page with VPN 00:
000000
000001
000010
000011
000100
000101
000110
000111
001000
001001
001010
001011
001100
001101
001110
001111

Top two bits are always the virtual page number: 00

Page Table
  use VPN as index, each PTE stores
  physical page number (PPN), access permissions
  valid bit indicates whether physical page exists
+-------+-------+--------+
| valid |  PPN  | access |
+-------+-------+--------+
|   1   |  001  |   R    | (code)
|   1   |  100  |  R/W   | (heap)
|   0   |  ---  |  ---   | (unused)
|   1   |  111  |  R/W   | (stack)
+-------+-------+--------+

mov 17,%rdx
virtual address: 0x11 (010001)
split into VPN=01 and offset=0001
VPN corresponds to PTE |   1   |  100  |  R/W   | (heap)
So physical address is PPN concatenated with offset = 1000001 or 0x41

Physical Memory (128 bytes)
PPN
000 -> reserved for kernel
001 -> stores virtual page 00
010 -> free
011 -> free
100 -> stores virtual page 01
101 -> free
110 -> free
111 -> stores virtual page 11

Where would the page table be stored?
In the kernel's memory! (Physical page 000)
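
The translation in this example can be checked with a short Python sketch. The page table contents come from the example above; the function name translate_paged and the use of LookupError for a page fault are assumptions made for illustration.

```python
OFFSET_BITS = 4   # 16-byte pages -> 4-bit offset
VPN_BITS = 2      # 4 virtual pages -> 2-bit virtual page number

# Page table from the example: the index is the VPN,
# each entry is (valid, PPN, access).
PAGE_TABLE = [
    (True, 0b001, "R"),     # code
    (True, 0b100, "R/W"),   # heap
    (False, None, None),    # unused
    (True, 0b111, "R/W"),   # stack
]


def translate_paged(vaddr):
    """Translate a 6-bit virtual address to a 7-bit physical address."""
    if not 0 <= vaddr < (1 << (VPN_BITS + OFFSET_BITS)):
        raise ValueError("virtual address out of range")
    vpn = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    valid, ppn, _access = PAGE_TABLE[vpn]
    if not valid:
        raise LookupError(f"page fault: virtual page {vpn:02b} is not mapped")
    # Concatenate PPN and offset: shift the PPN left past the offset bits.
    return (ppn << OFFSET_BITS) | offset


paddr = translate_paged(17)
print(f"{paddr:07b} = {paddr:#x}")  # 1000001 = 0x41
```

The offset passes through unchanged; only the page number is replaced by the table lookup.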

3.2 Copy-on-Write

  • Does copy-on-write still work with pages?
    • Set entries in both page tables to point to same page frames
    • Need core map of page frames to track which processes are pointing to which page frames (e.g., reference count)
  • UNIX fork with copy on write
    • Copy page table of parent into child process
    • Mark all pages (in new and old page tables) as read-only
    • Trap into kernel on write (in child or parent)
    • Copy page
    • Mark both as writeable
    • Resume execution
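
A minimal simulation of copy-on-write with pages, assuming a hypothetical Memory class whose refcount dictionary plays the role of the core map. One simplification to note: this sketch marks only the faulting process's entry writable. The other process's entry becomes writable on its own next write fault, when the reference count shows it is the sole owner. It also treats every read-only page as copy-on-write, whereas a real kernel would keep genuinely read-only pages (such as code) read-only.

```python
class Memory:
    def __init__(self):
        self.frames = {}     # frame number -> page contents
        self.refcount = {}   # core map: frame number -> number of page tables using it
        self.next_frame = 0

    def alloc(self, data):
        """Allocate a new frame holding a private copy of data."""
        frame = self.next_frame
        self.next_frame += 1
        self.frames[frame] = bytearray(data)
        self.refcount[frame] = 1
        return frame


def cow_fork(mem, parent_table):
    """Copy the page table, share the frames, mark both sides read-only."""
    child_table = {}
    for vpn, entry in parent_table.items():
        entry["writable"] = False
        child_table[vpn] = dict(entry)
        mem.refcount[entry["frame"]] += 1
    return child_table


def store(mem, table, vpn, offset, value):
    """Write one byte, copying the page first if it is still shared."""
    entry = table[vpn]
    if not entry["writable"]:
        # Simulated trap into the kernel on a write to a read-only page.
        frame = entry["frame"]
        if mem.refcount[frame] > 1:
            mem.refcount[frame] -= 1
            entry["frame"] = mem.alloc(mem.frames[frame])
        entry["writable"] = True
    mem.frames[entry["frame"]][offset] = value


mem = Memory()
parent = {0: {"frame": mem.alloc(b"abcd"), "writable": True}}
child = cow_fork(mem, parent)
store(mem, child, 0, 0, ord("X"))
print(bytes(mem.frames[parent[0]["frame"]]))  # b'abcd'
print(bytes(mem.frames[child[0]["frame"]]))   # b'Xbcd'
```

Pages that neither process writes are never copied, which is the saving when the child immediately calls exec.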

3.3 Demand Paging

Can I start running a program before its code and data are in physical memory?

  • Set all page table entries to invalid
  • When a page is referenced for the first time, trap into the kernel
  • Kernel brings page in from disk
    • Page in on demand
  • Resume execution
  • Remaining pages can be transferred in the background while program is running
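
The steps above can be sketched as a small simulation. The DemandPagedProcess class is hypothetical, a list of byte strings stands in for the program's pages on disk, and background prefetching of the remaining pages is left out.

```python
class DemandPagedProcess:
    def __init__(self, disk_pages):
        self.disk = disk_pages                       # simulated pages on disk
        self.page_table = [None] * len(disk_pages)   # None = invalid entry
        self.faults = 0

    def read(self, vpn, offset):
        if self.page_table[vpn] is None:
            # Simulated kernel trap: bring the page in from disk on demand.
            self.faults += 1
            self.page_table[vpn] = bytearray(self.disk[vpn])
        return self.page_table[vpn][offset]


proc = DemandPagedProcess([b"code", b"data"])
print(chr(proc.read(0, 1)), proc.faults)  # o 1
print(chr(proc.read(0, 2)), proc.faults)  # d 1
print(proc.page_table[1])                 # None
```

Only the first reference to a page pays the cost of the fault; the second page is never loaded because nothing touched it.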

3.4 Pros and Cons

  • Pros:
    • Fixed-size pages avoid external fragmentation (can always use free memory)
    • Finding a free page is easy, no need to rearrange to make room
    • Flexible, enables sparse address spaces
  • Cons:
    • Page table is in memory, so every memory access requires an extra memory access to first retrieve the page table entry
    • What if virtual address space is large?
      • Need a page table entry for each virtual page
      • 32-bit addresses, 4KB pages => 2^32 / 2^12 = 2^20, about 1 million page table entries
      • 64-bit addresses, 4KB pages => 2^64 / 2^12 = 2^52, about 4.5 quadrillion page table entries
    • Reduce page table size with larger pages?
      • Internal fragmentation: wasted space if we don’t need all of the space inside a large fixed size chunk
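
The number of page table entries is the size of the virtual address space divided by the page size, which is easy to check in Python. The function name page_table_entries is hypothetical, and the 2 MB page size is included only as an example of a larger page.

```python
def page_table_entries(address_bits, page_size):
    """Number of virtual pages, and so entries in a flat page table."""
    return 2**address_bits // page_size


KB = 1024
MB = 1024 * KB

print(page_table_entries(32, 4 * KB))  # 1048576
print(page_table_entries(64, 4 * KB))  # 4503599627370496
print(page_table_entries(32, 2 * MB))  # 2048
```

Larger pages shrink the table quickly, at the cost of more wasted space inside each partly used page.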

4 Reading: Introduction to Paging

Read OSTEP chapter 18 (p. 197–208) introducing paging. It provides additional examples and context, making it a good complement to the video.