CS 332 w22 — Segmentation and Paging
1 Base and Bound
- Pros:
- Simple, fast, safe
- Just two hardware registers!
- Can relocate physical memory without changing process
- Cons:
- Can't prevent process from overwriting its own code (hardware only checks whether an address exceeds the bound)
- Can't share code/data with other processes
- Can't grow stack/heap as needed
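A minimal sketch of base-and-bound translation, to make the "two hardware registers" point concrete. The function name, exception name, and the example register values are assumptions made for illustration; real hardware does this check in the memory management unit, not in software.

```python
class BoundError(Exception):
    """Raised when a virtual address falls outside the process's bound."""


def translate(virtual_addr: int, base: int, bound: int) -> int:
    """Translate a virtual address using one base and one bound register."""
    if not 0 <= virtual_addr < bound:
        raise BoundError(f"address {virtual_addr} exceeds bound {bound}")
    return base + virtual_addr


# Hypothetical process loaded at physical address 4096 with 1024 bytes.
physical = translate(100, base=4096, bound=1024)

# Relocating the process only changes the base register; the process
# keeps using the same virtual address.
relocated = translate(100, base=8192, bound=1024)
```

Note that nothing here distinguishes code from data, which is why a process can still overwrite its own code.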
2 Segmentation
Key idea: multiple bases and bounds to divide memory into segments
- Segment is a contiguous region of virtual memory
- Each process has a segment table (stored in hardware)
- Each entry in the table defines a segment
- Segment can be located anywhere in physical memory
- Each segment has: start (base), length (bound), access permissions
- Just as with base and bound, a memory access to a segment exceeding the bound triggers an exception
- On UNIX systems, called a segmentation fault
- Common theme: virtual address split into index (Segment) and offset
- Base value from table added to Offset to compute physical address
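The index/offset split can be sketched as follows. This is an illustration under assumptions, not any particular architecture: the 12-bit offset width, the permission strings, and the example table contents are all made up for the example.

```python
from dataclasses import dataclass


class SegmentationFault(Exception):
    """Raised on an out-of-bounds or disallowed access."""


@dataclass
class Segment:
    base: int    # start of the segment in physical memory
    bound: int   # length of the segment in bytes
    perms: str   # allowed accesses, e.g. "rx" or "rw"


OFFSET_BITS = 12  # assumed split: low 12 bits are the offset


def translate(table, virtual_addr, access):
    """Split the address into (segment index, offset) and add the base."""
    index = virtual_addr >> OFFSET_BITS
    offset = virtual_addr & ((1 << OFFSET_BITS) - 1)
    if index >= len(table):
        raise SegmentationFault("no such segment")
    seg = table[index]
    if offset >= seg.bound:
        raise SegmentationFault("offset exceeds bound")
    if access not in seg.perms:
        raise SegmentationFault("access not permitted")
    return seg.base + offset


# Hypothetical two-entry segment table.
table = [
    Segment(base=0x4000, bound=0x800, perms="rx"),  # code
    Segment(base=0x0000, bound=0x400, perms="rw"),  # data
]
data_addr = translate(table, 0x1010, "w")  # segment 1, offset 0x010
code_addr = translate(table, 0x0004, "r")  # segment 0, offset 0x004
```

Unlike plain base and bound, the permission check lets the table protect the code segment from writes.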
2.1 Segmentation Facilitates Sharing
- Same base, bound, same/different access permissions
- For example, two instances of the same program sharing a code segment
2.2 Copy-on-Write
We can use segmentation to avoid unnecessary copying
- UNIX fork
- Makes a complete copy of a process
- If child never uses parent's memory (e.g., just calls exec), a lot of unnecessary copying
- Segments allow a more efficient implementation
- Copy segment table into child
- Mark parent and child segments read-only
- Start child process; return to parent
- If child or parent writes to a segment (e.g., stack, heap)
- Trap into kernel
- Make a copy of the segment, change permissions, and resume
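The fork steps above can be sketched as a small simulation. Everything here is an assumption for illustration: a bytearray stands in for a region of physical memory, the `cow` flag is a hypothetical way to tell "read-only because shared" apart from "truly read-only", and the `write` function plays the role of both the faulting store and the kernel's trap handler.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    memory: bytearray   # stands in for a region of physical memory
    writable: bool
    cow: bool = False   # read-only only because of copy-on-write


def fork(parent_table):
    """Copy the segment table, not the memory; mark writable segments read-only."""
    child_table = []
    for seg in parent_table:
        if seg.writable:
            seg.writable = False
            seg.cow = True
        child_table.append(Segment(seg.memory, seg.writable, seg.cow))
    return child_table


def write(table, index, offset, value):
    """Store one byte, copying the segment first if it is shared."""
    seg = table[index]
    if not seg.writable:
        if not seg.cow:
            raise PermissionError("segment is read-only")
        # "Trap into kernel": copy the segment, change permissions, resume.
        seg.memory = bytearray(seg.memory)
        seg.writable = True
        seg.cow = False
    seg.memory[offset] = value


parent = [
    Segment(bytearray(b"code"), writable=False),
    Segment(bytearray(b"heap"), writable=True),
]
child = fork(parent)
shared_after_fork = child[1].memory is parent[1].memory
write(child, 1, 0, ord("H"))  # only now is the heap segment copied
```

In this simplified version the parent also copies on its first write, even if the child has already made its own copy; tracking how many processes share a segment would avoid that.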
2.3 Zero-on-Reference
- Need to zero out contents of memory before allocation
- Avoid accidentally leaking information!
- But only want to bother if memory will actually be used
- How much physical memory is needed for the stack or heap?
- Only what is currently in use (could be anywhere from KBs to GBs)
- Reserve larger space, but only zero the first few KB
- Set the segment bound to just the zeroed portion
- When program uses memory beyond end of stack or expands the heap
- Segmentation fault into OS kernel
- Kernel allocates some memory (extends bound)
- Zeros the memory
- Modifies the segment table
- Resumes the process
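A rough simulation of the steps above, assuming a 4 KB growth granularity and a hypothetical `GrowableSegment` class. The `_fault` method stands in for the kernel's fault handler; in a real system the hardware raises the fault and the kernel updates the segment table.

```python
class GrowableSegment:
    """A segment whose bound covers only the memory zeroed so far."""

    def __init__(self, reserved, initial=4096):
        self.reserved = reserved            # size of the reserved region
        self.memory = bytearray(initial)    # only this much is zeroed up front
        self.bound = initial
        self.faults = 0

    def _fault(self, offset, chunk=4096):
        """Kernel handler: allocate, zero, extend the bound, then resume."""
        if offset >= self.reserved:
            raise MemoryError("access beyond the reserved region")
        new_bound = min(self.reserved, (offset // chunk + 1) * chunk)
        self.memory.extend(bytes(new_bound - self.bound))  # zero-filled
        self.bound = new_bound
        self.faults += 1

    def store(self, offset, value):
        if offset >= self.bound:
            self._fault(offset)
        self.memory[offset] = value

    def load(self, offset):
        if offset >= self.bound:
            self._fault(offset)
        return self.memory[offset]


heap = GrowableSegment(reserved=1 << 20)
heap.store(10, 7)            # within the initial bound, no fault
value = heap.load(10000)     # beyond the bound: fault, grow, zero, resume
```

The process never sees stale data: the first read of newly added memory returns zero.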
2.4 Pros and Cons
- Pros:
- Can share code/data segments between processes
- Can protect code segment from being overwritten
- Can transparently grow stack/heap as needed
- Can detect if need to copy-on-write
- Cons:
- Complex memory management
- Need to find chunk of a particular size (e.g., some programs have small data segment, some large)
- May need to rearrange memory from time to time to make room for new segment or growing segment
- Compacting free memory can create a lot of processor overhead (i.e., moving segments around to consolidate free space)
- External fragmentation in physical memory due to wasted space between different-sized chunks
3 Paging
Key idea: divide memory into fixed-size pages
- Manage physical memory in fixed size units, or pages
- Both virtual and physical memory use this same fixed page size
- Finding a free page is easy
- Bitmap allocation
- Each bit represents one physical page frame, indicates free or allocated
- Bitmaps also used in file systems to represent free disk blocks (also fixed-size units)
0011111100000001100
indicates 2 free pages, then 6 allocated pages, then 7 free pages, then 2 allocated, then 2 free
- Each process has its own page table
- Stored in physical memory
- What hardware registers do we need?
- Pointer to page table start
- Page table length
A program's memory may be scattered throughout physical memory
- Each process sees its virtual address space as a neatly ordered deck of cards
- Physical memory is all the decks from all the processes shuffled together
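To illustrate why finding a free page is easy, here is a minimal sketch of bitmap allocation using the example bitmap from above. The function names are hypothetical, and a real kernel would pack the bits into machine words instead of using a Python list.

```python
def allocate_frame(bitmap):
    """Return the index of the first free frame and mark it allocated."""
    for frame, used in enumerate(bitmap):
        if not used:
            bitmap[frame] = 1
            return frame
    raise MemoryError("no free page frames")


def free_frame(bitmap, frame):
    """Mark a frame as free again."""
    bitmap[frame] = 0


# 0 = free, 1 = allocated, one bit per physical page frame.
bitmap = [int(bit) for bit in "0011111100000001100"]
first = allocate_frame(bitmap)   # frame 0
second = allocate_frame(bitmap)  # frame 1
third = allocate_frame(bitmap)   # frames 2-7 are taken, so frame 8
free_frame(bitmap, first)
```

Any free frame will do, since all frames are the same size; there is no need to search for a chunk of a particular length.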
3.1 Address Translation
As before, the virtual address is used to look up the physical address
- Again, part of the virtual address is an index, and part is an offset
- Since the offset is the low-order bits, it specifies a location within a contiguous virtual page or physical frame
- Page table contains access permissions like segment table
Example:
64-byte virtual address space (64 = 2^6 -> 6-bit virtual addresses)
16-byte pages

Virtual Memory
  4 pages (64/16 = 4), 2 bits to specify virtual page number (VPN)
  VPN
  00 -> page for code, addresses 00...15
  01 -> page for heap, addresses 16...31
  10 -> <unused>, addresses 32...47
  11 -> page for stack, addresses 48...63

  Virtual addresses on page with VPN 00:
  000000 000001 000010 000011
  000100 000101 000110 000111
  001000 001001 001010 001011
  001100 001101 001110 001111
  Top two bits are always the virtual page number: 00

Page Table
  use VPN as index, each PTE stores physical page number (PPN), access permissions
  valid bit indicates whether physical page exists
  +-------+-------+--------+
  | valid | PPN   | access |
  +-------+-------+--------+
  | 1     | 001   | R      | (code)
  | 1     | 100   | R/W    | (heap)
  | 0     | ---   | ---    | (unused)
  | 1     | 111   | R/W    | (stack)
  +-------+-------+--------+

mov 17,%rdx
  virtual address: 0x11 (010001)
  split into VPN=01 and offset=0001
  VPN corresponds to PTE | 1 | 100 | R/W | (heap)
  So physical address is PPN concatenated with offset = 1000001 or 0x41

Physical Memory (128 bytes)
  PPN
  000 -> reserved for kernel
  001 -> stores virtual page 00
  010 -> free
  011 -> free
  100 -> stores virtual page 01
  101 -> free
  110 -> free
  111 -> stores virtual page 11

Where would the page table be stored?
  In the kernel's memory! (Physical page 000)
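The translation in this example can be checked with a short sketch. The page table contents are copied from the example; the function name, the `PageFault` exception, and the lowercase permission strings are assumptions for illustration.

```python
PAGE_SIZE = 16     # bytes per page, from the example
OFFSET_BITS = 4    # log2(PAGE_SIZE)
NUM_PAGES = 4      # 64-byte virtual address space / 16-byte pages

# (valid, PPN, access), indexed by VPN, as in the example page table.
PAGE_TABLE = [
    (True, 0b001, "r"),    # code
    (True, 0b100, "rw"),   # heap
    (False, None, ""),     # unused
    (True, 0b111, "rw"),   # stack
]


class PageFault(Exception):
    """Raised when no valid physical page backs the address."""


def translate(virtual_addr, access="r"):
    """Look up the VPN in the page table and concatenate PPN with offset."""
    vpn = virtual_addr >> OFFSET_BITS
    offset = virtual_addr & (PAGE_SIZE - 1)
    if not 0 <= vpn < NUM_PAGES:
        raise PageFault("address outside the virtual address space")
    valid, ppn, perms = PAGE_TABLE[vpn]
    if not valid:
        raise PageFault("no physical page for this virtual page")
    if access not in perms:
        raise PermissionError("access not permitted")
    return (ppn << OFFSET_BITS) | offset


physical = translate(17)  # VPN 01, offset 0001 -> PPN 100, offset 0001
```

Because pages are a power of two in size, the "concatenation" is just a shift and a bitwise OR; no addition or bounds check on the offset is needed.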
3.2 Copy-on-Write
- Does copy-on-write still work with pages?
- Set entries in both page tables to point to same page frames
- Need core map of page frames to track which processes are pointing to which page frames (e.g., reference count)
- UNIX fork with copy on write
- Copy page table of parent into child process
- Mark all pages (in new and old page tables) as read-only
- Trap into kernel on write (in child or parent)
- Copy page
- Mark both as writeable
- Resume execution
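A sketch of page-level copy-on-write with a reference-counting core map. It is a simulation under assumptions: the `frames` and `refcount` dictionaries stand in for physical memory and the core map, every page is assumed to have been writable before the fork, and the `write` function doubles as the kernel's trap handler. It also differs slightly from the steps above: instead of marking both entries writable at once, the last remaining sharer regains write access on its own next write.

```python
from dataclasses import dataclass


@dataclass
class PTE:
    frame: int       # physical page frame number
    writable: bool


frames = {}     # frame number -> bytearray (simulated physical memory)
refcount = {}   # core map: frame number -> number of PTEs pointing at it


def new_frame(data):
    """Allocate a frame holding a copy of data."""
    frame = len(frames)
    frames[frame] = bytearray(data)
    refcount[frame] = 1
    return frame


def fork(parent):
    """Copy the page table; share every frame read-only."""
    child = []
    for pte in parent:
        pte.writable = False
        refcount[pte.frame] += 1
        child.append(PTE(pte.frame, False))
    return child


def write(table, vpn, offset, value):
    """Store one byte, copying the page first if it is still shared."""
    pte = table[vpn]
    if not pte.writable:
        # "Trap into kernel" on a write to a read-only page.
        if refcount[pte.frame] > 1:
            refcount[pte.frame] -= 1
            pte.frame = new_frame(frames[pte.frame])
        pte.writable = True
    frames[pte.frame][offset] = value


parent = [PTE(new_frame(b"heap"), writable=True)]
child = fork(parent)
write(child, 0, 0, ord("H"))   # child gets a private copy of the page
write(parent, 0, 1, ord("E"))  # parent is now the only user, so no copy
```

Only the pages that are actually written get copied, which is a finer granularity than copying a whole segment.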
3.3 Demand Paging
Can I start running a program before its code and data are in physical memory?
- Set all page table entries to invalid
- When a page is referenced for the first time, trap into the kernel
- Kernel brings page in from disk
- Page in on demand
- Resume execution
- Remaining pages can be transferred in the background while program is running
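The demand paging steps can be sketched as follows, assuming 16-byte pages and a hypothetical `Process` class whose `disk` attribute stands in for the program's contents on disk. Background transfer of the remaining pages is not modeled.

```python
PAGE_SIZE = 16  # assumed page size for the example


class Process:
    """A process whose pages are loaded only when first referenced."""

    def __init__(self, disk_image):
        self.disk = disk_image
        num_pages = -(-len(disk_image) // PAGE_SIZE)   # round up
        self.page_table = [None] * num_pages           # None = invalid entry
        self.faults = 0

    def _page_in(self, vpn):
        """Kernel handler: bring the page in from disk and mark it valid."""
        start = vpn * PAGE_SIZE
        page = bytearray(self.disk[start:start + PAGE_SIZE])
        page.extend(bytes(PAGE_SIZE - len(page)))      # pad a short last page
        self.page_table[vpn] = page
        self.faults += 1

    def read(self, virtual_addr):
        vpn, offset = divmod(virtual_addr, PAGE_SIZE)
        if self.page_table[vpn] is None:   # invalid entry: trap to the kernel
            self._page_in(vpn)
        return self.page_table[vpn][offset]


proc = Process(bytes(range(48)))   # a 3-page "program"
a = proc.read(17)                  # first touch of page 1: page fault
b = proc.read(18)                  # page 1 is already resident: no fault
```

The program starts with no pages in memory, and pages that are never referenced are never loaded.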
3.4 Pros and Cons
- Pros:
- Fixed-size pages avoid external fragmentation (can always use free memory)
- Finding a free page is easy, no need to rearrange to make room
- Flexible, enables sparse address spaces
- Cons:
- Page table is in memory, so every memory access requires an extra memory access to first retrieve the page table entry
- What if virtual address space is large?
- Need a page table entry for each virtual page
- 32 bits, 4KB pages => 2^32 / 2^12 = 2^20, about 1 million page table entries
- 64 bits, 4KB pages => 2^64 / 2^12 = 2^52, about 4.5 quadrillion page table entries
- Reduce page table size with larger pages?
- Internal fragmentation: wasted space if we don’t need all of the space inside a large fixed size chunk
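The trade-off between page table size and internal fragmentation can be worked through with a little arithmetic. The function names and the 5000-byte allocation and 4 MB page size are assumptions chosen for illustration.

```python
def num_page_table_entries(address_bits, page_size):
    """One entry per virtual page in a flat page table."""
    return 2 ** address_bits // page_size


def internal_fragmentation(alloc_size, page_size):
    """Bytes wasted when alloc_size is rounded up to whole pages."""
    pages = -(-alloc_size // page_size)   # round up
    return pages * page_size - alloc_size


KB = 1024
MB = 1024 * KB

entries_32 = num_page_table_entries(32, 4 * KB)      # 2**20
entries_64 = num_page_table_entries(64, 4 * KB)      # 2**52
entries_large = num_page_table_entries(32, 4 * MB)   # 2**10

waste_small = internal_fragmentation(5000, 4 * KB)   # two 4 KB pages
waste_large = internal_fragmentation(5000, 4 * MB)   # one 4 MB page
```

Going from 4 KB to 4 MB pages shrinks the 32-bit page table from about a million entries to 1024, but a 5000-byte allocation then wastes almost the entire 4 MB page.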
4 Reading: Introduction to Paging
Read OSTEP chapter 18 (p. 197–208) introducing paging. It provides additional examples and context, making it a good complement to the video.