Logging Practice Problems

Explain the purpose of the checkpoint mechanism. How often should checkpoints be performed? How does the frequency of checkpoints affect:¹
- System performance when no failure occurs?
- The time it takes to recover from a system crash?
- The time it takes to recover from a media (disk) failure?
Explain how the database may become inconsistent if some log records pertaining to a block are not output to disk before the block is output to disk.²
Outline the drawbacks of the no-steal and force buffer management policies.³
How does a write-ahead log support both undo and redo operations?⁴

Footnotes:

¹ Checkpointing is done with log-based recovery schemes to reduce the time required for recovery after a crash. If there is no checkpointing, then the entire log must be searched after a crash, and all transactions must be undone/redone from the log. If checkpointing is performed, then most of the log records prior to the checkpoint can be ignored at the time of recovery. Another reason to perform checkpoints is to clear log records from stable storage as it gets full. If the amount of disk storage available is limited, frequent checkpointing is necessary to keep the log size small.

Since checkpoints cause some loss in performance while they are being taken, their frequency should be reduced if fast recovery is not critical. If we need fast recovery, checkpointing frequency should be increased. Checkpoints have no effect on recovery from a disk crash; archival dumps are the equivalent of checkpoints for recovery from disk crashes.
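The trade-off can be illustrated with a deliberately simplified model. The sketch below assumes a checkpoint is taken after every fixed number of log records and that no transaction is active when a checkpoint is taken, so recovery only needs the records written after the last checkpoint. The function names and the fixed-interval policy are hypothetical, chosen only for illustration.

```python
def records_to_scan(log_length, checkpoint_interval=None):
    """Log records recovery must examine after a crash (simplified model)."""
    if checkpoint_interval is None:
        # No checkpointing: the entire log must be searched.
        return log_length
    # Only the records written after the most recent checkpoint matter.
    return log_length % checkpoint_interval


def checkpoints_taken(log_length, checkpoint_interval=None):
    """Checkpoints paid for during normal operation (the performance cost)."""
    if checkpoint_interval is None:
        return 0
    return log_length // checkpoint_interval


# Crash after 1234 log records, with no, rare, and frequent checkpoints.
for interval in (None, 500, 50):
    print(interval, checkpoints_taken(1234, interval), records_to_scan(1234, interval))
```

In this model, going from an interval of 500 to an interval of 50 raises the number of checkpoints from 2 to 24 while cutting the records to scan from 234 to 34. That is the cost during normal operation traded against crash recovery time.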

² Consider a banking scheme and a transaction which transfers $50 from account $A$ to account $B$. The transaction has the following steps:

R(A)
A = A - 50
W(A)
R(B)
B = B + 50
W(B)

Suppose the system crashes after the transaction commits, but before its log records are flushed to stable storage. Further assume that at the time of the crash the update of $A$ has been flushed to disk, whereas the buffer page containing $B$ was not yet written to disk. When the system comes up it is in an inconsistent state, but recovery is not possible because there are no log records corresponding to this transaction on disk.
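A minimal simulation of this scenario is sketched below. It assumes both accounts start at 100, models the disk, buffer, and log as plain Python containers, and assumes that no commit record reaches stable storage before the crash, so recovery undoes whatever updates it finds in the stable log. The function `run_transfer` and its `write_ahead` flag are hypothetical names used only for this illustration.

```python
def run_transfer(write_ahead):
    """Transfer 50 from A to B, crash after page A is output, then recover."""
    disk = {"A": 100, "B": 100}   # database pages on disk (assumed balances)
    buffer = dict(disk)           # in-memory copies of the pages
    log_buffer = []               # log records still in main memory
    stable_log = []               # log records on stable storage

    # The transaction updates both pages in the buffer and logs (item, old, new).
    for item, delta in (("A", -50), ("B", +50)):
        old = buffer[item]
        buffer[item] = old + delta
        log_buffer.append((item, old, buffer[item]))

    # The buffer manager outputs the page holding A.
    if write_ahead:
        # Write-ahead rule: first flush the log records pertaining to that page.
        stable_log.extend(rec for rec in log_buffer if rec[0] == "A")
    disk["A"] = buffer["A"]

    # Crash: buffer and log_buffer are lost; only disk and stable_log survive.
    # Recovery undoes the updates it can see, newest first.
    for item, old, _new in reversed(stable_log):
        disk[item] = old
    return disk


print(run_transfer(write_ahead=False))  # A was debited, B never credited
print(run_transfer(write_ahead=True))   # the debit of A is undone
```

Without the write-ahead rule the recovered database holds A = 50 and B = 100, so 50 has vanished and there is no log record from which to repair it. With the rule, the old value of A is on stable storage and the partial update can be rolled back.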

³ Drawback of the no-steal policy: The no-steal policy does not work with transactions that perform a large number of updates, since the buffer may fill up with updated pages that cannot be evicted to disk, and the transaction then cannot proceed.
Drawback of the force policy: The force policy may slow down the commit of transactions, since it forces all modified blocks to be flushed to disk before commit. It also performs more output operations, because frequently updated blocks are output multiple times, once for each transaction that updates them. Further, the policy results in more random I/O, since the blocks written out are not likely to be sequential on disk.
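The extra output operations under the force policy can be counted in a toy model. The sketch below assumes each transaction is represented only by the set of blocks it modifies, and that under no-force the buffer is large enough to hold every dirty block until one later flush. The function name and the workload are hypothetical.

```python
def count_block_outputs(transactions, force):
    """Count block outputs for a sequence of committed transactions."""
    if force:
        # Force: every commit outputs all blocks that the transaction modified.
        return sum(len(blocks) for blocks in transactions)
    # No-force (assumed large buffer): each dirty block is output once, later.
    dirty = set()
    for blocks in transactions:
        dirty |= set(blocks)
    return len(dirty)


# Three transactions that all update the hot block P1.
workload = [{"P1"}, {"P1", "P2"}, {"P1"}]
print(count_block_outputs(workload, force=True))
print(count_block_outputs(workload, force=False))
```

For this workload the force policy issues 4 block outputs, with P1 written three times, while the no-force model issues 2.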

⁴ The log supports redo because the system can scan the log forwards starting at the most recent checkpoint and re-apply the recorded updates. The log supports undo because the system can scan the log backwards from the end and undo the recorded updates (using the original value part of each log record).
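The two scans can be sketched as follows. This is a minimal illustration, not a full recovery algorithm: it assumes the log passed in starts at the most recent checkpoint, that update records have the hypothetical form `("update", txn, item, old, new)`, and that commit records have the form `("commit", txn)`.

```python
def recover(disk, log):
    """Redo all logged updates, then undo those of uncommitted transactions."""
    db = dict(disk)
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo pass: scan forwards and re-apply the new value of every update.
    for rec in log:
        if rec[0] == "update":
            _kind, _txn, item, _old, new = rec
            db[item] = new

    # Undo pass: scan backwards and restore the old value of every update
    # made by a transaction that has no commit record.
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] not in committed:
            _kind, _txn, item, old, _new = rec
            db[item] = old
    return db


log = [
    ("update", "T1", "A", 100, 50),
    ("update", "T1", "B", 100, 150),
    ("commit", "T1"),
    ("update", "T2", "C", 10, 20),
]
# Disk state at the crash: B's update was never output, C's uncommitted one was.
disk_at_crash = {"A": 50, "B": 100, "C": 20}
print(recover(disk_at_crash, log))
```

Here the redo pass brings B up to 150 using the new value in the log, and the undo pass restores C to 10 using the old value. Both directions are possible because each update record carries both values.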