LLM Project

Overview

Welcome to the repository for my thesis project! This work focuses on developing, deploying, and evaluating large language models on Text-to-SQL. Below, you'll find a breakdown of the folder structure and what each file and directory is used for.

Directory and File Descriptions

convert-to-pdf/
- convert_to_pdf.ipynb: A handy notebook that converts .py and .ipynb files into PDFs.
deployment/
- prompts.txt: This file contains all the prompts I used during the deployment and evaluation phases.
- streamlit-openai-chat.py: This script sets up a Streamlit app that lets you chat with the language model in real time, using the LiteLLM Proxy server.
evaluation-pipeline/
- clean_and_commented_json.py: This script evaluates various language models on a Text-to-SQL task using DSPy. It configures the models, handles dataset splitting, runs evaluations with different metrics, and logs everything to an Excel file. It also includes some optimization techniques like LabeledFewShot and BootstrapFewShotWithRandomSearch.
- clean_and_commented_json_openai.py: Same as the previous script, but this one’s specifically for working with OpenAI.
- evaluation/
  - evaluation.ipynb: A notebook that dives into analyzing the log_evaluations from the evaluation pipeline.
  - figures/: A place to store all the figures generated during evaluation.
  - latex/: Likely for any LaTeX files I might need for creating report-ready figures or tables.
  - log_evaluations.xlsx: An Excel file that tracks logs for metrics like Match, Correctness, and Execution during evaluation.
  - model_sizes.xlsx: Lists the sizes of the different models I used.
- logs/: A directory where all the log files from the evaluation process are kept.
- optimized_programs/: Contains the optimized programs generated by DSPy.
- utils_evaluate.py: A script that adapts DSPy’s evaluation process to handle multiple metrics at once.
litellm-docker/
- config_base.yaml: The config file for the Student Models.
- config_evaluator.yaml: The config file for the Judge Model.
- docker-compose-eval.yml: Docker Compose setup for running LiteLLM for the Judge Model.
- docker-compose_base.yml: Docker Compose setup for running LiteLLM for the Student Models.
phoenix-docker/
- docker-compose.yml: Docker Compose file for deploying the Phoenix logging framework.
unsloth_tunes/
- Alpaca_+_Llama_3_8b_full_example_sql_edit_checkpoint.ipynb: Fine-tunes the Llama-8b-text model with the gretelai/synthetic_text_to_sql dataset.
- Llama_3_8b_chat_template_Unsloth_2x_faster_finetuning_edit.ipynb: Fine-tunes the Llama-8b-instruct model using the same dataset.
- Phi_3_Medium_4K_Instruct_Unsloth_2x_faster_finetuning.ipynb: Fine-tunes the Phi-3-medium-4k-instruct model, again with the gretelai/synthetic_text_to_sql dataset.
- req_conda_unsloth.txt: A list of the Conda environment requirements for the fine-tuning process.
- restore_artifacts.ipynb: A notebook to restore artifacts that were backed up by WANDB.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LLM Project

Overview

Directory and File Descriptions

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 83 Commits
.vscode		.vscode
convert-to-pdf		convert-to-pdf
deployment		deployment
evaluation-pipeline		evaluation-pipeline
litellm-docker		litellm-docker
phoenix-docker		phoenix-docker
unsloth_tunes		unsloth_tunes
.gitignore		.gitignore
readme.md		readme.md

felixdsml/llm

Folders and files

Latest commit

History

Repository files navigation

LLM Project

Overview

Directory and File Descriptions

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages