Lecture 04: Memory & Disk Management¶

约 558 个字 4 张图片预计阅读时间 3 分钟共被读过 3 次

15-445/645 Database Systems (Spring 2025)
Carnegie Mellon University
Prof. Jignesh Patel

📌 Key Concepts¶

1. Disk-Oriented DBMS¶

Primary Storage: Persistent storage (HDD/SSD).
Buffer Pool Manager: Moves data between disk and memory.
Illusion: Database appears fully in memory, even if larger than available RAM.

Optimization Goals:
1. Spatial Control: Keep related pages physically close on disk (improves prefetching).
2. Temporal Control: Minimize disk I/O stalls by managing when pages are loaded/evicted.

2. Buffer Pool Organization¶

Structure: Array of fixed-size frames (each holds a page copy).
Write-Back Cache: Dirty pages are buffered, not immediately written to disk.
Page Table: In-memory hash table mapping page IDs → buffer pool frames.

Meta-Data per Page:
- Dirty Flag: Indicates if modified (needs write-back).
- Pin/Reference Counter: Tracks active accesses (pinned pages cannot be evicted).
- Access Tracking: Used for replacement policies (e.g., LRU timestamps).

3. Locks vs. Latches¶

Locks 🔒	Latches 🔩
Protect logical DB content (e.g., tuples).	Protect internal data structures (e.g., hash tables).
Held for transaction duration.	Held for operation duration.
Support rollbacks.	No rollback needed.
Example: Row-level locks.	Example: Mutexes for page table access.

4. Page Table vs. Page Directory¶

Page Directory: On-disk mapping of page IDs → physical locations.
Page Table: In-memory mapping of page IDs → buffer pool frames.

Example:
- Page Directory entry: Page#5 → Disk Block 1023.
- Page Table entry: Page#5 → Frame#7 (Dirty: True, Pin Count: 2).

5. Why Not Use OS mmap? 🚨¶

Problems:
1. Transaction Safety: OS may flush dirty pages prematurely.
2. I/O Stalls: Threads stall on page faults (DBMS can’t predict which pages are in memory).
3. Error Handling: SIGBUS errors on invalid page access.
4. Performance: TLB shootdowns and OS contention.

DBMS Advantages:
- Control over flushing order, prefetching, replacement policies, and thread scheduling.

OS Workarounds (e.g., madvise, mlock, msync) are as complex as manual management.

🔄 Buffer Replacement Policies¶

Goals:¶

Correctness: Avoid evicting pinned/dirty pages.
Accuracy: Predict future access patterns.
Speed: Low overhead.
Low Meta-Data: Minimal per-page storage.

1. Least Recently Used (LRU)¶

Mechanism: Track last access timestamp; evict oldest page.
Implementation: Sorted list or priority queue.
Example:
Pages: [A (ts=10), B (ts=5), C (ts=15)] → Evict B.

Weakness:
- Sequential Flooding: Scans pollute the buffer pool (e.g., SELECT * FROM table evicts useful pages).

2. CLOCK (Approximate LRU)¶

Mechanism:
Each page has a reference bit (1 if accessed recently).
Clock hand sweeps frames; evicts pages with reference bit = 0.
Example:
Pages: [A (ref=1), B (ref=0), C (ref=1)] → Evict B.

3. LRU-K¶

Mechanism: Track last K accesses to predict future use.
Example:
K=2: Page A accessed at times [t=5, t=10] → Interval = 5.
Prefer evicting pages with longer intervals (less frequently accessed).

Tradeoff: Higher meta-data overhead.

🧩 Handling Dirty Pages¶

Fast Path: Non-dirty pages are simply dropped.
Slow Path: Dirty pages must be written to disk before eviction.

Optimization:
- Background Writing: Periodically flush dirty pages to reduce eviction latency.

🚀 Buffer Pool Optimizations¶

1. Multiple Buffer Pools¶

Purpose: Reduce contention (e.g., separate pools for indexes and tables).
Mapping: Use hashing or object IDs to assign pages to pools.

2. Pre-Fetching¶

Mechanism: Load pages needed for future operations (e.g., sequential scans).
Example:
During a B+Tree index scan, prefetch sibling leaf pages.

Mechanism: Multiple queries reuse a single scan cursor.
Example:
Query 1 scans Table X; Query 2 attaches to the same scan.

⚠️ Key Takeaways¶

Avoid OS mmap: DBMS needs full control over memory management.
Replacement Policies: Balance accuracy and overhead (CLOCK > LRU for large pools).
Optimize I/O: Use Direct I/O, prefetching, and multiple pools.

📊 Final Thought: The OS is not your friend! 🛑