
AI Ethics in Large Language Models 🤖

Hi there, welcome! 👋

This repository contains the resources and notebooks for our project on integrating ethical considerations into AI systems, particularly Large Language Models (LLMs) such as LLaMA. We explore how AI systems handle ethical dilemmas under different ethical theories, combining technical and philosophical approaches.

Project Structure 📚

Details are available on our project wiki:

  • Week-by-week plan documenting our progress.
  • Milestones marking key stages in our research and development process.
  • Comprehensive notes on ethical theories and AI model training considerations.

Dataset and Model 👾

The ETHICS dataset, encompassing scenarios representing justice, virtue, deontology, utilitarianism, and common-sense morality, forms the core of our model training and evaluation.

We use LLaMA 2 as our baseline and fine-tune it with QLoRA, aiming to balance high performance with alignment to human ethical values.
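
For orientation, below is a minimal sketch of how a LLaMA 2 checkpoint can be prepared for QLoRA fine-tuning with the Hugging Face transformers, peft, and bitsandbytes stack. The model ID and LoRA hyperparameters are illustrative assumptions; the actual configuration lives in one_by_one_train.py.

# Minimal QLoRA setup sketch (illustrative values; see one_by_one_train.py for the real configuration)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint

# Load the base model in 4-bit precision to keep GPU memory low (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Attach small trainable LoRA adapters; only these weights are updated during fine-tuning
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()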

Examples

[Image: util — example model outputs for the utilitarianism subset]

[Image: deon — example model outputs for the deontology subset]

Model Performance

[Image: accuracy results]

Contents 🗞️

  • data: Contains the ETHICS and red-teaming (from Anthropic) datasets, stored in the ethics/ and anthropics/ subfolders respectively.
  • 1. preprocessing: Notebooks for preparing and structuring the datasets for model training and evaluation.
  • 2. modelling: one_by_one_train.py for fine-tuning LLaMA on sharded data; detailed instructions are in the Tuning Instructions section below.
  • 3. evaluation: data_evaluation.py demonstrates how to process and evaluate the model's outputs (see the scoring sketch after this list).
  • 4. results: The outputs generated by the model, together with a comprehensive analysis of its performance before and after fine-tuning.
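
As a quick illustration of the evaluation step, the sketch below computes accuracy from a CSV of model predictions against gold ETHICS labels. The file path and column names ("prediction", "label") are assumptions for the example; the actual pipeline is in data_evaluation.py.

# Illustrative scoring sketch; the path and column names are assumed, not taken from data_evaluation.py
import pandas as pd

def accuracy(results_csv: str) -> float:
    """Fraction of rows where the model's binary prediction matches the gold ETHICS label."""
    df = pd.read_csv(results_csv)
    return (df["prediction"] == df["label"]).mean()

print(f"Accuracy: {accuracy('results/util_eval.csv'):.3f}")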

Installation and Usage 🛠️

  • Clone the repository: git clone https://github.com/yirencao/Ethical-AI.git
  • Create a virtual environment: conda create -n v1 python=3.8
  • Install required packages in the virtual environment: pip install -r requirements.txt
  • Follow the notebooks in order for a step-by-step guide through the project.

Tuning Instructions 🪓

Adapt one_by_one_train.py for your setup as follows:

# Replace 'your_gpu_ids_here' with the GPU IDs you want to use,
# e.g. '0,1' for two GPUs or '0' for a single GPU.
os.environ['CUDA_VISIBLE_DEVICES'] = 'your_gpu_ids_here'

# Replace 'your_huggingface_account_name' with your Hugging Face account name.
# This is used for saving the fine-tuned model to, and loading it from, your Hugging Face account.
model_name = f"your_huggingface_account_name/4_ethics_{i-1}"
new_model = f"your_huggingface_account_name/4_ethics_{i}"
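
For context, the two f-strings above chain checkpoints across data shards: shard i starts from the model produced by shard i-1 and pushes its result under a new name. A rough illustration of that loop (the shard count and account name are placeholders, not values from the script):

# Illustration only: how the chained checkpoint names fit together across shards
HF_ACCOUNT = "your_huggingface_account_name"  # replace with your Hugging Face username
NUM_SHARDS = 4  # placeholder; use however many shards your data is split into

for i in range(1, NUM_SHARDS + 1):
    model_name = f"{HF_ACCOUNT}/4_ethics_{i-1}"  # checkpoint produced by the previous shard
    new_model = f"{HF_ACCOUNT}/4_ethics_{i}"     # checkpoint this shard will produce
    # ... load model_name, fine-tune on shard i, push the result as new_model ...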

Prepare your environment for tuning:

# Activate virtual environment
conda activate v1

# Set up Hugging Face token
export HUGGINGFACE_TOKEN=[your_huggingface_token]
huggingface-cli login --token $HUGGINGFACE_TOKEN

# Start model tuning
python one_by_one_train.py

Happy coding! 🎉

About

For digital humanities, EPFL
