This repository contains the implementation of various components of a simple database management system (SimpleDB) as part of the coursework for the Database class CS186 at UC Berkeley. The project is divided into four labs, each focusing on different aspects of database systems.
- Lab 1: Basic Components
- Lab 2: Filter and Join Operations
- Lab 3: B+ Tree Indexing
- Lab 4: Transactions and Concurrency Control
- Git Commit History
-
TupleDesc.java and Tuple.java:
- Implemented using arrays to store field descriptions and values.
- Iterators for field descriptions and tuple fields.
-
Catalog:
- Stores the database schema using an ArrayList.
- Supports lookup by table ID and name.
-
BufferPool:
- Manages the reading and writing of pages into memory.
- Implements a fixed-size buffer pool with methods to fetch pages from disk.
-
HeapFile, HeapPage, and RecordID:
- HeapFile: Handles file storage and retrieval.
- HeapPage: Manages individual pages within a HeapFile.
- RecordID: Uniquely identifies tuples within pages.
- Implementing page-level operations such as reading from and writing to disk.
- Ensuring thread safety in buffer pool operations.
-
Filter:
- Uses a predicate to filter tuples based on specified conditions.
-
Join:
- Implemented nested loop join.
- Iterates over tuples in the outer relation and matches them with tuples in the inner relation.
-
Aggregate:
- Uses a HashMap to store grouped aggregate results.
- Supports various aggregation operations like AVG, SUM, MIN, and MAX.
-
HeapFile Mutability:
- Added methods to insert and delete tuples from a HeapPage.
- Managed space allocation within pages for new tuples.
- Efficiently implementing the nested loop join.
- Handling different aggregate operations, especially AVG, which requires additional bookkeeping.
-
BTreePage:
- Represents nodes in the B+ tree.
- Stores entries and pointers to child nodes.
-
Search and Insert:
- Recursive method to find the appropriate leaf page for insertion.
- Splits internal pages when they exceed the maximum number of entries.
-
Delete:
- Redistributes entries between sibling pages to maintain balance.
- Merges pages when necessary to prevent underflow.
- Implementing page splitting and merging while maintaining the tree structure.
- Ensuring that all parent and child pointers are updated correctly.
-
LockState and LockManager:
- Manages locks on pages to ensure safe concurrent access.
- Implements strict two-phase locking (2PL).
-
Lock Lifetime:
- Transactions hold locks until they commit or abort.
- Implemented a mechanism to release locks after transactions complete.
-
NO STEAL Policy:
- Ensures that dirty pages are not written to disk before the transaction commits.
-
Deadlocks and Aborts:
- Implemented a timeout mechanism to detect and handle deadlocks.
- Transactions that exceed the timeout are aborted.
- Managing concurrent access to shared resources.
- Detecting and resolving deadlocks efficiently.
- The detailed commit history for each lab is available in the repository.