
Evaluation for Geoscience LLM

This folder contains the evaluation scripts for language models taking geoscience exams.

.
├── multiple_choice_samples_wa.txt # prompt (with 'the answer is') for 5-shot eval
├── multiple_choice_samples.txt # naive prompt for 5-shot eval
├── post_process.py # benchmark preprocessing scripts
└── memtra # Memorizing transformers
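
The two prompt files above differ in whether the solved examples end with the phrase 'the answer is'. The sketch below illustrates one way such a few-shot prompt could be assembled and how a choice letter could be read back from a generation. It is a sketch under assumptions: the function names, the dictionary layout of the examples, and the four-option A to D format are all hypothetical and are not taken from the sample files or from `run_eval.py`.

```python
import re

# Hypothetical helper: the real sample files may use a different layout.
def build_few_shot_prompt(examples, question, choices, with_answer_phrase=True):
    """Join solved examples and one unsolved question into a single prompt."""
    def render(q, opts):
        lines = [f"Question: {q}"]
        lines += [f"{label}. {text}" for label, text in zip("ABCD", opts)]
        return "\n".join(lines)

    blocks = []
    for ex in examples:
        if with_answer_phrase:
            answer = f"The answer is {ex['answer']}."
        else:
            answer = f"Answer: {ex['answer']}"
        blocks.append(render(ex["question"], ex["choices"]) + "\n" + answer)

    # The final block is left open so the model completes the answer.
    cue = "The answer is" if with_answer_phrase else "Answer:"
    blocks.append(render(question, choices) + "\n" + cue)
    return "\n\n".join(blocks)


# Assumed answer format: a single letter A-D right after "the answer is".
ANSWER_RE = re.compile(r"the answer is\s*\(?([A-D])\b", re.IGNORECASE)

def extract_choice(generation):
    """Return the chosen letter, or None if the phrase is not found."""
    match = ANSWER_RE.search(generation)
    return match.group(1).upper() if match else None
```

Passing five solved examples gives a 5-shot prompt; `extract_choice` returns `None` when the model does not follow the expected phrasing, so such cases can be counted separately rather than guessed.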

We will release an end-to-end version at the end of October, along with Geo-Eval.

Usage

Here are some examples:

For a k2 series model with LoRA weights:

python run_eval.py --model_name k2_ni --base_model /home/daven/llm/qokori/llama-2023-05-07-15-10/checkpoint/ --lora_weights /home/daven/llm/qokori/qokori-sft/outputs/geo_llama/

For a k2 series model without LoRA weights:

python run_eval.py --model_name geollama --base_model /home/daven/llm/qokori/llama-2023-05-07-15-10/checkpoint/

For a non-k2 series model:

python run_eval.py --model_name gpt2_xl --base_model gpt2-xl
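
The three invocations above differ only in which flags are passed: `--model_name` and `--base_model` appear in every case, while `--lora_weights` appears only in the LoRA case. The following sketch shows how such flags could be declared so that `--lora_weights` is optional. It assumes `argparse` and is not the actual contents of `run_eval.py`, which may parse its arguments differently; `parse_args` and `describe_run` are hypothetical names.

```python
import argparse

def parse_args(argv=None):
    # Hypothetical flag declarations mirroring the commands shown above.
    parser = argparse.ArgumentParser(description="Sketch of the run_eval.py flags")
    parser.add_argument("--model_name", required=True,
                        help="label for the model being evaluated")
    parser.add_argument("--base_model", required=True,
                        help="checkpoint directory or model identifier")
    parser.add_argument("--lora_weights", default=None,
                        help="optional directory with LoRA weights")
    return parser.parse_args(argv)

def describe_run(args):
    """Summarize which of the usage cases a set of flags corresponds to."""
    if args.lora_weights:
        return (f"{args.model_name}: base model {args.base_model} "
                f"with LoRA weights {args.lora_weights}")
    return f"{args.model_name}: base model {args.base_model} without LoRA weights"

if __name__ == "__main__":
    print(describe_run(parse_args()))
```

Leaving `--lora_weights` unset by default keeps the second and third commands valid without an extra flag.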
