CS 332 w22 — Segmentation and Paging

1. Base and Bound
2. Segmentation
3. Paging
4. Reading: Introduction to Paging

1 Base and Bound

Pros:
- Simple, fast, safe
  - Just two hardware registers!
- Can relocate physical memory without changing process
Cons:
- Can't prevent process from overwriting its own code (hardware only checks whether an address exceeds the bound)
- Can't share code/data with other processes
- Can't grow stack/heap as needed
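
The base-and-bound check can be sketched in a few lines. This is only an illustration of the idea, not how any particular CPU implements it; the names translate_base_bound and BoundsError are made up for this example.

```python
class BoundsError(Exception):
    """Stand-in for the hardware exception raised on an out-of-bounds access."""


def translate_base_bound(vaddr, base, bound):
    """Translate a virtual address using one base and one bound register."""
    # The only check is against the bound. Nothing distinguishes code from
    # data, which is why a process can still overwrite its own code.
    if not 0 <= vaddr < bound:
        raise BoundsError(f"address {vaddr:#x} is outside bound {bound:#x}")
    return base + vaddr


# Relocating the process only changes the base register, not the program.
before = translate_base_bound(0x100, base=0x4000, bound=0x1000)
after = translate_base_bound(0x100, base=0x9000, bound=0x1000)
print(hex(before), hex(after))
```

The same virtual address 0x100 lands at a different physical address after the move, without the process noticing.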

2 Segmentation

Key idea: multiple bases and bounds to divide memory into segments

Segment is a contiguous region of virtual memory
Each process has a segment table (stored in hardware)
- Each entry in the table defines a segment
Segment can be located anywhere in physical memory
- Each segment has: start (base), length (bound), access permissions
Just as with base and bound, a memory access whose offset exceeds the segment's bound triggers an exception
- On UNIX systems, called a segmentation fault
Common theme: virtual address split into index (Segment) and offset
Base value from table added to Offset to compute physical address
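
A minimal sketch of segmented translation, assuming a 12-bit offset (the notes do not fix a particular split). Segment, SegmentationFault, and translate_segmented are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

OFFSET_BITS = 12  # assumption: low 12 bits are the offset, the rest the index


class SegmentationFault(Exception):
    """Stand-in for the exception raised on an illegal segment access."""


@dataclass
class Segment:
    base: int       # start of the segment in physical memory
    bound: int      # length of the segment in bytes
    writable: bool  # access permission (read is always allowed here)


def translate_segmented(vaddr, table, write=False):
    """Split vaddr into (segment index, offset) and add the segment base."""
    index = vaddr >> OFFSET_BITS
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    if index >= len(table):
        raise SegmentationFault(f"no segment {index}")
    seg = table[index]
    if offset >= seg.bound:
        raise SegmentationFault(f"offset {offset:#x} exceeds bound")
    if write and not seg.writable:
        raise SegmentationFault(f"segment {index} is read-only")
    return seg.base + offset


table = [
    Segment(base=0x4000, bound=0x800, writable=False),  # code
    Segment(base=0x9000, bound=0x1000, writable=True),  # data
]
print(hex(translate_segmented(0x1010, table)))  # segment 1, offset 0x010
```

Unlike base and bound, the permission check lets the hardware reject a write to the code segment.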

2.1 Segmentation Facilitates Sharing

Same base, bound, same/different access permissions
For example, two instances of the same program sharing a code segment

2.2 Copy-on-Write

We can use segmentation to avoid unnecessary copying

UNIX fork
- Makes a complete copy of a process
- If child never uses parent's memory (e.g., just calls exec), a lot of unnecessary copying
Segments allow a more efficient implementation
- Copy segment table into child
- Mark parent and child segments read-only
- Start child process; return to parent
- If child or parent writes to a segment (ex: stack, heap)
  - Trap into kernel
  - Make a copy of the segment, change permissions, and resume

2.3 Zero-on-Reference

Need to zero out contents of memory before allocation
- Avoid accidentally leaking information!
- But only want to bother if memory will actually be used
How much physical memory is needed for the stack or heap?
- Only what is currently in use (could be anywhere from KBs to GBs)
- Reserve larger space, but only zero the first few KB
- Set the segment bound to just the zeroed portion
When program uses memory beyond end of stack or expands the heap
- Segmentation fault into OS kernel
- Kernel allocates some memory (extends bound)
- Zeros the memory
- Modifies the segment table
- Resumes the process
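
The steps above can be sketched as a toy model. It assumes the kernel grows the segment by a fixed step on each fault; GROW_STEP and the GrowableSegment class are invented for this example, and the 0xAA fill simply stands in for stale data left by a previous owner.

```python
GROW_STEP = 16  # hypothetical: extra bytes the kernel zeroes per fault


class GrowableSegment:
    def __init__(self, reserved, initial):
        # 0xAA marks memory that has not been zeroed yet.
        self.memory = bytearray(b"\xaa" * reserved)
        self.reserved = reserved  # size of the reserved region
        self.bound = 0            # only [0, bound) is zeroed and usable
        self._extend(initial)

    def _extend(self, new_bound):
        """Kernel side: zero the new portion, then raise the bound."""
        new_bound = min(new_bound, self.reserved)
        self.memory[self.bound:new_bound] = bytes(new_bound - self.bound)
        self.bound = new_bound

    def read(self, offset):
        if not 0 <= offset < self.reserved:
            raise IndexError("outside the reserved region: a real fault")
        if offset >= self.bound:
            # Access beyond the bound: trap, extend, and resume the access.
            self._extend(offset + GROW_STEP)
        return self.memory[offset]


seg = GrowableSegment(reserved=64, initial=8)
print(seg.bound, seg.read(20), seg.bound)
```

The process never observes the stale bytes, because every byte is zeroed before the bound is moved past it.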

2.4 Pros and Cons

Pros:
- Can share code/data segments between processes
- Can protect code segment from being overwritten
- Can transparently grow stack/heap as needed
- Can detect if need to copy-on-write
Cons:
- Complex memory management
  - Need to find chunk of a particular size (e.g., some programs have small data segment, some large)
- May need to rearrange memory from time to time to make room for new segment or growing segment
  - Compacting free memory can create a lot of processor overhead (i.e., moving segments around to consolidate free space)
  - External fragmentation: free physical memory ends up wasted in gaps between different-sized chunks

3 Paging

Key idea: divide memory into fixed-size pages

Manage physical memory in fixed-size units, or pages (a physical page is also called a page frame)
- Both virtual and physical memory use this same fixed page size
Finding a free page is easy
- Bitmap allocation
  - Each bit represents one physical page frame, indicates free or allocated
    - Bitmaps also used in file systems to represent free disk blocks (also fixed-size units)
  - 0011111100000001100 indicates 2 free pages, then 6 allocated pages, then 7 free pages, then 2 allocated, then 2 free
Each process has its own page table
- Stored in physical memory
What hardware registers do we need?
- Pointer to page table start
- Page table length
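
A small sketch of the bitmap allocation described above, using the same example bitmap. A real kernel would pack the bits into machine words; a Python list of booleans is used here only to keep the idea visible. The function names are made up for this example.

```python
def parse_bitmap(bits):
    """Turn a string such as '0011' into a list where True means allocated."""
    return [b == "1" for b in bits]


def allocate_frame(bitmap):
    """Return the number of the first free frame and mark it allocated."""
    for frame, allocated in enumerate(bitmap):
        if not allocated:
            bitmap[frame] = True
            return frame
    return None  # no free frame left


def free_frame(bitmap, frame):
    """Mark a frame as free again."""
    bitmap[frame] = False


bitmap = parse_bitmap("0011111100000001100")
print(bitmap.count(False), "free frames")
print("allocated frames:", [allocate_frame(bitmap) for _ in range(3)])
```

Because every frame is the same size, any free bit will do; there is no search for a chunk of a particular size.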

A program's memory may be scattered throughout physical memory

Each process sees its virtual address space as a neatly ordered deck of cards
Physical memory is all the decks from all the processes shuffled together

3.1 Address Translation

As before, the virtual address is used to look up the physical address

Again, part of the virtual address is an index, and part is an offset
- The offset is the low-order bits, so it specifies a location within a contiguous virtual page or physical page frame
Page table contains access permissions like segment table

Example:

64-byte virtual address space (64 = 2^6  ->  6-bit virtual addresses)
16-byte pages

Virtual Memory
4 pages (64/16 = 4), 2 bits to specify virtual page number (VPN)
VPN
00  -> page for code,  addresses 00...15
01  -> page for heap,  addresses 16...31
10  -> <unused>,       addresses 32...47
11  -> page for stack, addresses 48...63

Virtual addresses on page with VPN 00:
000000
000001
000010
000011
000100
000101
000110
000111
001000
001001
001010
001011
001100
001101
001110
001111

Top two bits are always the virtual page number: 00

Page Table
  use VPN as index, each PTE stores
  physical page number (PPN), access permissions
  valid bit indicates whether physical page exists
+-------+-------+--------+
| valid |  PPN  | access |
+-------+-------+--------+
|   1   |  001  |   R    | (code)
|   1   |  100  |  R/W   | (heap)
|   0   |  ---  |  ---   | (unused)
|   1   |  111  |  R/W   | (stack)
+-------+-------+--------+

mov 17,%rdx   (load from virtual address 17)
virtual address: 0x11 (010001)
split into VPN=01 and offset=0001
VPN corresponds to PTE |   1   |  100  |  R/W   | (heap)
So physical address is PPN concatenated with offset = 1000001 or 0x41

Physical Memory (128 bytes)
PPN
000 -> reserved for kernel
001 -> stores virtual page 00
010 -> free
011 -> free
100 -> stores virtual page 01
101 -> free
110 -> free
111 -> stores virtual page 11

Where would the page table be stored?
In the kernel's memory! (Physical page 000)
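
The example translation can be checked with a short sketch that mirrors the page table above (6-bit virtual addresses, 16-byte pages). The names PAGE_TABLE, translate, PageFault, and ProtectionFault are chosen for this illustration only.

```python
OFFSET_BITS = 4  # 16-byte pages -> 4 offset bits
VADDR_BITS = 6   # 64-byte virtual address space

# One entry per virtual page: (valid, PPN, access), indexed by VPN.
PAGE_TABLE = [
    (True, 0b001, "R"),   # VPN 00: code
    (True, 0b100, "RW"),  # VPN 01: heap
    (False, None, ""),    # VPN 10: unused
    (True, 0b111, "RW"),  # VPN 11: stack
]


class PageFault(Exception):
    """Access to a virtual page whose valid bit is 0."""


class ProtectionFault(Exception):
    """Write to a page without write permission."""


def translate(vaddr, write=False):
    if not 0 <= vaddr < (1 << VADDR_BITS):
        raise ValueError("not a 6-bit virtual address")
    vpn = vaddr >> OFFSET_BITS                   # top 2 bits
    offset = vaddr & ((1 << OFFSET_BITS) - 1)    # low 4 bits
    valid, ppn, access = PAGE_TABLE[vpn]
    if not valid:
        raise PageFault(f"VPN {vpn:02b} is not mapped")
    if write and "W" not in access:
        raise ProtectionFault(f"VPN {vpn:02b} is read-only")
    return (ppn << OFFSET_BITS) | offset         # PPN concatenated with offset


print(hex(translate(17)))  # the mov example: 0x11 -> 0x41
```

Only the page number is translated; the offset passes through unchanged.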

3.2 Copy-on-Write

Does copy-on-write still work with pages?
- Set entries in both page tables to point to same page frames
- Need core map of page frames to track which processes are pointing to which page frames (e.g., reference count)
UNIX fork with copy-on-write
- Copy page table of parent into child process
- Mark all pages (in new and old page tables) as read-only
- Trap into kernel on write (in child or parent)
- Copy page
- Mark both as writeable
- Resume execution
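
A toy model of the fork steps above, with a per-frame reference count standing in for the core map. It makes two simplifying assumptions: every page was writable before the fork, and the process that did not fault regains write access lazily on its own next write (when the count is back to 1) rather than both entries being updated at once. Frame, PTE, fork_table, and store are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    data: bytearray
    refcount: int = 1  # how many page table entries point at this frame


@dataclass
class PTE:
    frame: Frame
    writable: bool = True


def fork_table(parent_table):
    """Copy the page table; share every frame and mark both sides read-only."""
    child_table = []
    for pte in parent_table:
        pte.writable = False
        pte.frame.refcount += 1
        child_table.append(PTE(pte.frame, writable=False))
    return child_table


def store(table, vpn, offset, value):
    """Write one byte, handling the copy-on-write trap if needed."""
    pte = table[vpn]
    if not pte.writable:
        # Trap into the kernel: copy only if someone else still shares it.
        if pte.frame.refcount > 1:
            shared = pte.frame
            shared.refcount -= 1
            pte.frame = Frame(bytearray(shared.data))
        pte.writable = True
    pte.frame.data[offset] = value


parent = [PTE(Frame(bytearray(b"abcd")))]
child = fork_table(parent)
store(child, 0, 0, ord("X"))
print(parent[0].frame.data, child[0].frame.data)
```

If the child calls exec without writing, no page is ever copied; only the page table was duplicated.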

3.3 Demand Paging

Can I start running a program before its code and data are in physical memory?

Set all page table entries to invalid
When a page is referenced for the first time, the access traps into the kernel
Kernel brings page in from disk
- Page in on demand
Resume execution
Remaining pages can be transferred in the background while program is running
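
A minimal sketch of demand paging, assuming the program's image on disk is just a bytes object and that None marks an invalid page table entry. The background transfer of remaining pages is not modeled. DemandPagedProcess and its fields are invented for this example.

```python
PAGE_SIZE = 16  # matches the 16-byte pages of the earlier example


class DemandPagedProcess:
    def __init__(self, disk_image):
        self.disk = disk_image
        num_pages = (len(disk_image) + PAGE_SIZE - 1) // PAGE_SIZE
        # Every entry starts out invalid: nothing is in physical memory yet.
        self.page_table = [None] * num_pages
        self.faults = 0

    def read(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if self.page_table[vpn] is None:
            # Trap into the kernel: bring the page in from disk on demand.
            self.faults += 1
            start = vpn * PAGE_SIZE
            page = self.disk[start:start + PAGE_SIZE]
            self.page_table[vpn] = bytearray(page.ljust(PAGE_SIZE, b"\0"))
        # Resume: the access now succeeds.
        return self.page_table[vpn][offset]


proc = DemandPagedProcess(bytes(range(40)))
print(proc.read(17), proc.read(18), proc.faults)
```

The second read hits a page that is already resident, so only the first access to each page pays the cost of a fault.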

3.4 Pros and Cons

Pros:
- Fixed-size pages avoid external fragmentation (can always use free memory)
- Finding a free page is easy, no need to rearrange to make room
- Flexible, enables sparse address spaces
Cons:
- Page table is in memory, so every memory access requires an extra memory access to first retrieve the page table entry
- What if virtual address space is large?
  - Need a page table entry for each virtual page
  - 32-bit addresses, 4 KB pages => 2^32 / 2^12 = 2^20, about 1 million page table entries
  - 64-bit addresses, 4 KB pages => 2^52, about 4.5 quadrillion page table entries
- Reduce page table size with larger pages?
  - Internal fragmentation: wasted space if we don’t need all of the space inside a large fixed size chunk
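
The page table sizes above follow from dividing the size of the virtual address space by the page size. A quick check, where the 4-byte entry size is an assumption made only to turn entry counts into bytes:

```python
def page_table_entries(address_bits, page_size_bytes):
    """One entry per virtual page: address space size / page size."""
    return (2 ** address_bits) // page_size_bytes


PTE_BYTES = 4  # assumption: size of one page table entry

for bits in (32, 64):
    entries = page_table_entries(bits, 4096)
    print(f"{bits}-bit: {entries:,} entries, {entries * PTE_BYTES:,} bytes")

# Larger pages shrink the table but waste more space inside each page.
print(page_table_entries(32, 4 * 1024 * 1024), "entries with 4 MB pages")
```

Under that assumed entry size, a flat 32-bit table already costs 4 MB per process, which is why a single linear table does not scale to 64-bit address spaces.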

4 Reading: Introduction to Paging

Read OSTEP chapter 18 (p. 197–208) introducing paging. It provides additional examples and context, making it a good complement to the video.