
AI Ethics in Large Language Models 🤖

Hi there, welcome! 👋

This repository contains the resources and notebooks for our project on integrating ethical considerations into AI systems, particularly Large Language Models (LLMs) such as LLaMA. We explore how AI systems handle ethical dilemmas under different ethical theories, combining technical and philosophical approaches.

Project Structure 📚

Details are available on our project wiki:

  • Week-by-week plan documenting our progress.
  • Milestones marking key stages in our research and development process.
  • Comprehensive notes on ethical theories and AI model training considerations.

Dataset and Model 👾

The ETHICS dataset, encompassing scenarios representing justice, virtue, deontology, utilitarianism, and common-sense morality, forms the core of our model training and evaluation.

We use LLaMA 2 as our baseline and fine-tune it with QLoRA, aiming to balance high performance with alignment to human ethical values.
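
For orientation, below is a minimal sketch of how a LLaMA 2 checkpoint can be prepared for QLoRA fine-tuning with the Hugging Face transformers, peft, and bitsandbytes stack. The model ID and LoRA hyperparameters are illustrative assumptions; the actual configuration lives in one_by_one_train.py.

# Minimal QLoRA setup sketch (illustrative values; see one_by_one_train.py for the real configuration)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint

# Load the base model in 4-bit precision to keep GPU memory low (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Attach small trainable LoRA adapters; only these weights are updated during fine-tuning
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()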

Examples

[Image: util — example model outputs for the utilitarianism subset]

[Image: deon — example model outputs for the deontology subset]

Model Performance

[Image: accuracy results]

Contents 🗞️

  • data: Contains the ETHICS and red-teaming (from Anthropic) datasets, stored in the ethics/ and anthropics/ subfolders respectively.
  • 1. preprocessing: Notebooks for preparing and structuring the datasets for model training and evaluation.
  • 2. modelling: one_by_one_train.py for fine-tuning LLaMA on sharded data; detailed instructions are in the Tuning Instructions section below.
  • 3. evaluation: data_evaluation.py demonstrates how to process and evaluate the model's outputs (see the scoring sketch after this list).
  • 4. results: The outputs generated by the model, together with a comprehensive analysis of its performance before and after fine-tuning.
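
As a quick illustration of the evaluation step, the sketch below computes accuracy from a CSV of model predictions against gold ETHICS labels. The file path and column names ("prediction", "label") are assumptions for the example; the actual pipeline is in data_evaluation.py.

# Illustrative scoring sketch; the path and column names are assumed, not taken from data_evaluation.py
import pandas as pd

def accuracy(results_csv: str) -> float:
    """Fraction of rows where the model's binary prediction matches the gold ETHICS label."""
    df = pd.read_csv(results_csv)
    return (df["prediction"] == df["label"]).mean()

print(f"Accuracy: {accuracy('results/util_eval.csv'):.3f}")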

Installation and Usage 🛠️

  • Clone the repository: git clone https://github.com/yirencao/Ethical-AI.git
  • Create a virtual environment: conda create -n v1 python=3.8
  • Install required packages in the virtual environment: pip install -r requirements.txt
  • Follow the notebooks in order for a step-by-step guide through the project.

Tuning Instructions 🪓

Adapt one_by_one_train.py for your setup as follows:

# Replace 'your_gpu_ids_here' with the GPU IDs you want to use,
# e.g. '0,1' for two GPUs or '0' for a single GPU.
os.environ['CUDA_VISIBLE_DEVICES'] = 'your_gpu_ids_here'

# Replace 'your_huggingface_account_name' with your Hugging Face account name.
# This is used for saving the fine-tuned model to, and loading it from, your Hugging Face account.
model_name = f"your_huggingface_account_name/4_ethics_{i-1}"
new_model = f"your_huggingface_account_name/4_ethics_{i}"
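
For context, the two f-strings above chain checkpoints across data shards: shard i starts from the model produced by shard i-1 and pushes its result under a new name. A rough illustration of that loop (the shard count and account name are placeholders, not values from the script):

# Illustration only: how the chained checkpoint names fit together across shards
HF_ACCOUNT = "your_huggingface_account_name"  # replace with your Hugging Face username
NUM_SHARDS = 4  # placeholder; use however many shards your data is split into

for i in range(1, NUM_SHARDS + 1):
    model_name = f"{HF_ACCOUNT}/4_ethics_{i-1}"  # checkpoint produced by the previous shard
    new_model = f"{HF_ACCOUNT}/4_ethics_{i}"     # checkpoint this shard will produce
    # ... load model_name, fine-tune on shard i, push the result as new_model ...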

Prepare your environment for tuning:

# Activate virtual environment
conda activate v1

# Set up Hugging Face token
export HUGGINGFACE_TOKEN=[your_huggingface_token]
huggingface-cli login --token $HUGGINGFACE_TOKEN

# Start model tuning
python one_by_one_train.py

Happy coding! 🎉

About

For digital humanities, EPFL
