Skip to content

This repository contains the implementation of various components of a simple database management system (SimpleDB) as part of the coursework for the Database class CS186 at UC Berkeley.

Notifications You must be signed in to change notification settings

chun77/SimpleDB

Repository files navigation

SimpleDB Project

This repository contains the implementation of various components of a simple database management system (SimpleDB) as part of the coursework for the Database class CS186 at UC Berkeley. The project is divided into four labs, each focusing on different aspects of database systems.

Table of Contents

Lab 1: Basic Components

Design and Implementation

  1. TupleDesc.java and Tuple.java:

    • Implemented using arrays to store field descriptions and values.
    • Iterators for field descriptions and tuple fields.
  2. Catalog:

    • Stores the database schema using an ArrayList.
    • Supports lookup by table ID and name.
  3. BufferPool:

    • Manages the reading and writing of pages into memory.
    • Implements a fixed-size buffer pool with methods to fetch pages from disk.
  4. HeapFile, HeapPage, and RecordID:

    • HeapFile: Handles file storage and retrieval.
    • HeapPage: Manages individual pages within a HeapFile.
    • RecordID: Uniquely identifies tuples within pages.

Challenges

  • Implementing page-level operations such as reading from and writing to disk.
  • Ensuring thread safety in buffer pool operations.

Lab 2: Filter and Join Operations

Design and Implementation

  1. Filter:

    • Uses a predicate to filter tuples based on specified conditions.
  2. Join:

    • Implemented nested loop join.
    • Iterates over tuples in the outer relation and matches them with tuples in the inner relation.
  3. Aggregate:

    • Uses a HashMap to store grouped aggregate results.
    • Supports various aggregation operations like AVG, SUM, MIN, and MAX.
  4. HeapFile Mutability:

    • Added methods to insert and delete tuples from a HeapPage.
    • Managed space allocation within pages for new tuples.

Challenges

  • Efficiently implementing the nested loop join.
  • Handling different aggregate operations, especially AVG, which requires additional bookkeeping.

Lab 3: B+ Tree Indexing

Design and Implementation

  1. BTreePage:

    • Represents nodes in the B+ tree.
    • Stores entries and pointers to child nodes.
  2. Search and Insert:

    • Recursive method to find the appropriate leaf page for insertion.
    • Splits internal pages when they exceed the maximum number of entries.
  3. Delete:

    • Redistributes entries between sibling pages to maintain balance.
    • Merges pages when necessary to prevent underflow.

Challenges

  • Implementing page splitting and merging while maintaining the tree structure.
  • Ensuring that all parent and child pointers are updated correctly.

Lab 4: Transactions and Concurrency Control

Design and Implementation

  1. LockState and LockManager:

    • Manages locks on pages to ensure safe concurrent access.
    • Implements strict two-phase locking (2PL).
  2. Lock Lifetime:

    • Transactions hold locks until they commit or abort.
    • Implemented a mechanism to release locks after transactions complete.
  3. NO STEAL Policy:

    • Ensures that dirty pages are not written to disk before the transaction commits.
  4. Deadlocks and Aborts:

    • Implemented a timeout mechanism to detect and handle deadlocks.
    • Transactions that exceed the timeout are aborted.

Challenges

  • Managing concurrent access to shared resources.
  • Detecting and resolving deadlocks efficiently.

Git Commit History

  • The detailed commit history for each lab is available in the repository.

About

This repository contains the implementation of various components of a simple database management system (SimpleDB) as part of the coursework for the Database class CS186 at UC Berkeley.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published