
STAT154

Course project for STAT 154 at UC Berkeley

Trade-off: Lightweight BERT for QA

Overview

"Trade-off" explores a lightweight BERT model for Chinese Question Answering (QA), demonstrating its effectiveness and efficiency compared to Large Language Models (LLMs).

Dataset

  • DRCD: Delta Reading Comprehension Dataset, a traditional-Chinese extractive QA dataset distributed in SQuAD-style JSON (a minimal loading sketch follows this list).
  • ODSQA: Open-Domain Spoken Question Answering Dataset, a Chinese QA dataset whose questions come from ASR-transcribed speech.
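
Because DRCD follows the SQuAD JSON schema, a few lines suffice to flatten it into (question, context, answer) examples. A minimal sketch in Python; the file path is illustrative:

```python
import json

def load_squad_style(path):
    """Flatten a SQuAD-style JSON file (the format DRCD uses) into examples."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)["data"]
    examples = []
    for article in data:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                examples.append({
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": context,
                    # each answer carries "text" and a character-level "answer_start"
                    "answers": qa["answers"],
                })
    return examples

# Usage (path is illustrative):
# train_examples = load_squad_style("DRCD_training.json")
```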

Models

  • BERT and its variants (ALBERT, RoBERTa); all three share one loading interface (see the sketch after this list).
  • Comparative analysis against larger LLMs (Qwen-7B, Baichuan 2).
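
Since ALBERT and RoBERTa expose BERT's interface in Hugging Face transformers, the encoder can be swapped behind a single loading call. A minimal sketch; the checkpoint names are illustrative public Chinese checkpoints, not necessarily the ones this project fine-tuned:

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Illustrative checkpoint; "bert-base-chinese" or a Chinese ALBERT
# checkpoint would load the same way.
name = "hfl/chinese-roberta-wwm-ext"
tokenizer = AutoTokenizer.from_pretrained(name)
# Adds a randomly initialized span-prediction (start/end) head on the encoder.
model = AutoModelForQuestionAnswering.from_pretrained(name)
print(model.num_parameters())  # ~100M for a base-size encoder, versus 7B for the LLM baselines
```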

Methodology

Fine-tuning BERT-style encoders for extractive QA: preprocessing tokenizes each question-context pair and locates the answer span, training optimizes start/end span prediction, and postprocessing maps predicted token spans back to answer text.
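
A minimal end-to-end sketch of these stages with Hugging Face transformers, assuming the SQuAD-style examples from the loading sketch above; the checkpoint and hyperparameters are illustrative assumptions, not values confirmed from the repository:

```python
from transformers import (AutoModelForQuestionAnswering, AutoTokenizer,
                          Trainer, TrainingArguments, default_data_collator)

MODEL_NAME = "hfl/chinese-roberta-wwm-ext"  # illustrative checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)

def preprocess(example):
    """Tokenize one (question, context) pair and locate the answer span.

    Simplified: assumes the answer fits in a single 512-token window;
    a full pipeline would slide a window (stride) over long contexts.
    """
    enc = tokenizer(example["question"], example["context"],
                    truncation="only_second", max_length=512,
                    padding="max_length", return_offsets_mapping=True)
    answer = example["answers"][0]
    start_char = answer["answer_start"]
    end_char = start_char + len(answer["text"])

    start_tok = end_tok = 0  # position 0 ([CLS]) means "answer not in window"
    seq_ids = enc.sequence_ids()
    for i, (s, e) in enumerate(enc["offset_mapping"]):
        if seq_ids[i] != 1:  # skip question and special tokens
            continue
        if s <= start_char < e:
            start_tok = i
        if s < end_char <= e:
            end_tok = i
    enc["start_positions"] = start_tok
    enc["end_positions"] = end_tok
    enc.pop("offset_mapping")
    return enc

# train_examples comes from the dataset-loading sketch above.
train_features = [preprocess(ex) for ex in train_examples]

args = TrainingArguments(output_dir="drcd-qa",  # hyperparameters are illustrative
                         learning_rate=3e-5, num_train_epochs=2,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=train_features,
        data_collator=default_data_collator).train()

# Postprocessing (at inference): take the argmax over start/end logits and map
# the winning token span back to context characters via the offset mapping.
```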

Results

BERT variants, RoBERTa in particular, achieved high answer accuracy, outperforming some of the much larger LLMs on specific tasks.
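
For context, extractive QA is conventionally scored with exact match (EM) and F1 overlap; for Chinese, F1 is usually computed at the character level since there is no whitespace tokenization. A sketch of these standard definitions (assumed here, not taken from the repository):

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the predicted answer string matches the reference exactly."""
    return float(prediction.strip() == reference.strip())

def char_f1(prediction: str, reference: str) -> float:
    """Character-level F1: token-overlap F1 adapted to unsegmented Chinese."""
    common = Counter(prediction) & Counter(reference)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(prediction)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)

# e.g. char_f1("台北市", "台北") == 0.8
```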

Future Work

Expanding the dataset size, diversifying the QA tasks, and adjusting language settings to support a more comprehensive analysis.
