This repository contains the implementation of a neural model designed to assess the relevance of answers in French-language question-answering systems. The model was developed as part of a Kaggle-style hackathon and is documented in detail in the accompanying paper.
The project focuses on scoring the relevance of answers to French-language queries using a hierarchical architecture built on the pre-trained CamemBERT model. The solution combines dimensionality reduction, dropout regularization, and layer normalization to produce relevance scores on a scale from 0 to 1.
- Base Model: CamemBERT (`camembert-base`) for contextual French-language embeddings.
- Data Source: French Question Answering Dataset (FQuAD).
- Balanced Dataset: Positive and negative pairs were generated for effective training.
- Loss Function: Binary Cross-Entropy with Logits Loss.
- Optimization: AdamW optimizer with weight decay correction.
- Training: Mixed precision training for memory efficiency and faster computation.
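A minimal sketch of a single training step under these settings is shown below. It uses the stock `CamembertForSequenceClassification` head as a stand-in for the custom head described in the architecture section, and the learning rate, weight decay, and data handling are illustrative assumptions rather than the project's actual values.

```python
import torch
from torch.cuda.amp import GradScaler, autocast
from transformers import CamembertForSequenceClassification

# Illustrative setup; hyperparameters are assumptions, not the project's values.
model = CamembertForSequenceClassification.from_pretrained(
    "camembert-base", num_labels=1
).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
criterion = torch.nn.BCEWithLogitsLoss()
scaler = GradScaler()  # gradient scaling for mixed precision training


def training_step(input_ids, attention_mask, labels):
    """One optimization step; tensors are assumed to already be on the GPU."""
    optimizer.zero_grad()
    with autocast():  # run the forward pass in float16 where safe
        logits = model(input_ids=input_ids, attention_mask=attention_mask).logits.squeeze(-1)
        loss = criterion(logits, labels.float())
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```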
The architecture includes:
- Base Language Model: CamemBERT for extracting embeddings.
- Feature Processing Layers:
  - Dense layer to reduce dimensions.
  - Dropout layer for regularization.
  - Layer normalization for stabilized training.
- Output Layer: Single-unit layer producing raw logits, converted into relevance scores via the sigmoid function.
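A minimal PyTorch sketch of such a scoring head follows. The hidden size, dropout rate, and the choice of the `<s>` token embedding as the pair representation are assumptions, not the repository's exact implementation.

```python
import torch
import torch.nn as nn
from transformers import CamembertModel


class RelevanceScorer(nn.Module):
    """Sketch of the scoring head; layer sizes and pooling are assumptions."""

    def __init__(self, hidden_dim=256, dropout=0.1):
        super().__init__()
        self.encoder = CamembertModel.from_pretrained("camembert-base")
        self.reduce = nn.Linear(self.encoder.config.hidden_size, hidden_dim)  # dimensionality reduction
        self.dropout = nn.Dropout(dropout)                                    # regularization
        self.norm = nn.LayerNorm(hidden_dim)                                  # stabilized training
        self.output = nn.Linear(hidden_dim, 1)                                # single-unit raw logit

    def forward(self, input_ids, attention_mask):
        # Use the <s> (CLS-equivalent) token embedding as the question-article representation.
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state[:, 0]
        x = self.norm(self.dropout(self.reduce(hidden)))
        return self.output(x)  # apply torch.sigmoid outside to obtain a 0-1 score
```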
At inference time, the pipeline works as follows:
- Tokenize the input question-article pair.
- Process the pair through the trained model.
- Apply sigmoid activation to the output logits.
- Return a relevance score between 0 and 1.
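Put together, a scoring helper following these steps could look like the sketch below, reusing the `RelevanceScorer` class outlined above; the tokenizer settings (truncation, `max_length`) are assumptions.

```python
import torch
from transformers import CamembertTokenizer

tokenizer = CamembertTokenizer.from_pretrained("camembert-base")


@torch.no_grad()
def score(model, question: str, article: str) -> float:
    """Return a relevance score in [0, 1] for a question-article pair."""
    enc = tokenizer(question, article, return_tensors="pt",
                    truncation=True, max_length=512)
    device = next(model.parameters()).device
    enc = {k: v.to(device) for k, v in enc.items()}
    logit = model(enc["input_ids"], enc["attention_mask"])
    return torch.sigmoid(logit).item()  # sigmoid maps the raw logit to a 0-1 score
```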
Clone this repository and install the required dependencies:
```bash
git clone [repository_url]
cd [repository_name]
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
To run the model on custom question-answer pairs:
- Prepare the input data in the required format (an illustrative layout is sketched below).
- Use the provided script to load the model and make predictions:
```bash
python predict.py --input_file your_input_file.json --output_file predictions.json
```
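The exact schema expected by `predict.py` is defined by the repository; as a purely hypothetical illustration, an input file could be assembled like this (field names and structure are assumptions):

```python
import json

# Hypothetical input layout for predict.py; the actual schema may differ.
pairs = [
    {
        "question": "Qui a fondé la Croix-Rouge ?",
        "article": "Henry Dunant est à l'origine de la création de la Croix-Rouge...",
    },
]
with open("your_input_file.json", "w", encoding="utf-8") as f:
    json.dump(pairs, f, ensure_ascii=False, indent=2)
```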
The training data is based on FQuAD, with positive pairs (original question-article pairs) labeled as relevant and negative pairs (randomly sampled unrelated articles) labeled as irrelevant.
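As an illustration of this pair construction (not necessarily the project's exact procedure), a balanced set of labeled pairs could be built from FQuAD-style records as follows:

```python
import random

def build_pairs(examples):
    """examples: list of dicts with 'question' and 'context' keys (FQuAD-style records)."""
    contexts = [ex["context"] for ex in examples]
    pairs = []
    for ex in examples:
        pairs.append((ex["question"], ex["context"], 1.0))  # positive: original question-article pair
        negative = random.choice([c for c in contexts if c != ex["context"]])
        pairs.append((ex["question"], negative, 0.0))        # negative: randomly sampled unrelated article
    random.shuffle(pairs)
    return pairs
```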
The model produces continuous relevance scores that effectively capture the semantic relationships between questions and potential answers while maintaining computational efficiency.
Developed by: Zakaria El Alaoui
This project is licensed under the MIT License. See the LICENSE file for details.
For any questions or feedback, feel free to contact me at [email protected].