Hypothesis Search: Inductive Reasoning with Language Models

This is the implementation of the paper Hypothesis Search: Inductive Reasoning with Language Models. [Paper]

Installation

conda create -n hypothesis_search python=3.8.16
conda activate hypothesis_search
pip install -r requirements.txt
# unzip the prompt cache file
unzip resources/cache.json.zip

The code is tested on Ubuntu 20.04.

Important: All of our results are cached in resources/cache.json, so every experiment can be run without an OpenAI API key. During the feedback stage, our pipeline includes the error messages from Python program execution in the prompt. To reproduce all results from the cache, update line 250 of utils.py so that the path strings in error messages are mapped to environment-independent strings:

...
current_root = os.getcwd()
path_mapping = {
    # mapping machine-specific paths to generic paths
    current_root: '/root',
    '/data2/rcwang/miniconda3/': '/envs',
    # add your own python package mapping here
    # an example:
    '/home/rcwang/miniconda3/envs/hypothesis_search/': '/envs',
...
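
The snippet below is a minimal sketch of why this mapping matters: a cached prompt is only found again if the error message embedded in it is identical across machines. The function name normalize_paths, the example paths, and the longest-prefix-first ordering are assumptions made for illustration; they are not taken from utils.py.

```python
def normalize_paths(message: str, path_mapping: dict) -> str:
    """Replace machine-specific path prefixes with generic ones."""
    # Apply longer prefixes first so a nested path (e.g. a conda env inside
    # a home directory) is not partially rewritten by a shorter parent prefix.
    for local_prefix in sorted(path_mapping, key=len, reverse=True):
        message = message.replace(local_prefix, path_mapping[local_prefix])
    return message


# Hypothetical paths, for illustration only.
example_mapping = {
    "/home/alice/hypothesis_search": "/root",
    "/home/alice/miniconda3/envs/hypothesis_search": "/envs",
}
error = 'File "/home/alice/hypothesis_search/temp/prog.py", line 3, in transform'
print(normalize_paths(error, example_mapping))
# prints: File "/root/temp/prog.py", line 3, in transform
```

If a cached experiment unexpectedly reports a cache miss, an unmapped local path in an error message is a likely cause.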

Running experiments

Main Results

Main results can be obtained by running the scripts in scripts/main:

scripts/
└── main
    ├── 1darc
    │   ├── full.sh
    │   └── program_only.sh
    └── arc
        ├── direct.sh
        ├── human_selected_hypothesis.sh
        ├── human_written_hypothesis.sh
        ├── program_only.sh
        └── summarized_hypothesis.sh

To get results with a different number of feedback rounds, change the num_feedbacks $NUMBER argument in the script. Currently, we only include results on the ARC and 1D-ARC datasets.

Results Visualization

For each experiment, you can set the flag visualize True to render the generated hypotheses, programs, and execution results as a webpage. The webpage is saved in the results directory. We provide an example in results/naive/gpt4-0613/summarized_hypothesis/vis/index.html.

100 Randomly Sampled Tasks of ARC

The task IDs of the 100 randomly sampled ARC tasks used in our experiments are stored in resources/splits/random_100.json. We use the task IDs in LARC. The corresponding JSON file name of each task in the original ARC dataset is stored in resources/id2json.json.
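
A short sketch of how the two files could be combined to look up the original ARC file for each sampled task. It assumes that random_100.json holds a flat list of task IDs and that id2json.json is an object keyed by task ID; the actual layout of these files may differ, and the function names are made up for illustration.

```python
import json
from pathlib import Path


def map_tasks_to_files(task_ids, id2json):
    """Return {task_id: ARC JSON file name} for the given task IDs."""
    # JSON object keys are always strings, so look up by str(task_id)
    # in case the split file stores the IDs as integers.
    return {task_id: id2json[str(task_id)] for task_id in task_ids}


def load_task_files(split_path, id2json_path):
    """Load both files from disk and build the mapping."""
    task_ids = json.loads(Path(split_path).read_text())
    id2json = json.loads(Path(id2json_path).read_text())
    return map_tasks_to_files(task_ids, id2json)
```

Under these assumptions, load_task_files("resources/splits/random_100.json", "resources/id2json.json") would return one file name per sampled task.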

Hypothesis Generation / Summarization

For the selected 100 tasks, the generated hypotheses can be dumped by running sh scripts/main/arc/dump_hypothesis.sh; 64 hypotheses for each task will be saved in resources/generated_languages/gpt4_0613_100_64.json. Run python language_summarization.py to summarize the 64 hypotheses into 8 hypotheses; the results will be saved in resources/generated_languages/gpt4_0613_100_synthesized.json.

Running New Tasks

By default, the code only runs examples that are already in the cache; OpenAI API calls are disabled to prevent unnecessary cost. To generate new results with the OpenAI API, put your API key in the openai_api_key field in resources/configs/default.yaml and set only_allow_cache to false. We strongly recommend setting num_feedbacks $NUMBER to at most 1 to reduce cost.
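
The cache-only behaviour can be pictured with the sketch below. It is illustrative only: the function cached_completion, the use of the raw prompt as the cache key, and the call_api callback are assumptions, and the real logic in gpt.py may be organised differently.

```python
def cached_completion(prompt, cache, only_allow_cache=True, call_api=None):
    """Return a cached response, calling the API only when that is allowed."""
    if prompt in cache:
        return cache[prompt]
    if only_allow_cache or call_api is None:
        # Mirrors the default setup: unseen prompts fail instead of costing money.
        raise KeyError("prompt not found in cache and API calls are disabled")
    response = call_api(prompt)
    cache[prompt] = response  # store the response so a rerun needs no API call
    return response


cache = {"known prompt": "cached answer"}
print(cached_completion("known prompt", cache))
# prints: cached answer
```

Passing only_allow_cache=False together with a real API callback corresponds to the configuration change described above.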
