How to add a new model
robertvacareanu committed Apr 11, 2024
1 parent 9bc55bf commit c1a1c8d
Showing 2 changed files with 49 additions and 0 deletions.
README.md — 9 additions, 0 deletions
@@ -117,3 +117,12 @@ Selected LLMs, both private (e.g., Claude 3 Opus, GPT-4) and open (e.g., DBRX) c


### Adaptation


## How to

### How to add a new dataset?
Please check `how_to_dataset.md`.

### How to add a new model?
Please check `how_to_model.md`.
how_to_model.md — 40 additions, 0 deletions
@@ -0,0 +1,40 @@
# How to add a new model

Out of the box, there is code for the following types of models.

API Requests
- (1) OpenAI
- (2) DeepInfra
- (3) OpenRouter
- (4) Fireworks

Local Models
- (1) Deployed with TGI
- (2) Deployed as `AutoModelForCausalLM`


Running additional experiments with models from any of the services above, including models not already used, is straightforward. For example, the code below uses `gpt-4-0125-preview`. To switch to `gpt-3.5-turbo-1106`, just change the `model_name` parameter of `ChatOpenAI`.
```python
llm = ChatOpenAI(model_name="gpt-4-0125-preview", temperature=0)
model_name = 'gpt4-turbo'
```

To use non-chat (completion) models, use `OpenAI(model_name="davinci-002", temperature=0)` instead. Note, however, that this swap might entail additional changes elsewhere in the code.

## Add a new model (LLM as a service)

If you want to add a model from a service that is not already supported in the code, you will need to write a new file in `src/regressors`. That file should contain a function with the following signature:
```python
def llm_regression(llm, x_train, x_test, y_train, y_test, encoding_type, add_instr_prefix=False, instr_prefix='The task is to provide your best estimate for "Output". Please provide that and only that, without any additional text.\n\n\n\n\n'):
```

The first parameter is the LLM. Inside this function, you call the LLM with the prompt:
```python
if add_instr_prefix:
    inpt = instr_prefix + fspt.format(**x)
else:
    inpt = fspt.format(**x)
output = llm.call_as_llm(inpt, stop=['\n'], max_tokens=12).strip().split('\n')[0].strip()
```
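Putting the pieces together, a minimal self-contained sketch of such a regressor is below. The `StubLLM` class, the simple few-shot template, and the numeric parsing are illustrative assumptions, not the repository's actual implementation (the real signature also takes `encoding_type`, omitted here for brevity):

```python
# Minimal sketch of an `llm_regression`-style function (illustrative only).
# StubLLM, the few-shot template, and the float parsing are assumptions.

class StubLLM:
    """Stand-in for a chat model exposing `call_as_llm`."""
    def call_as_llm(self, prompt, stop=None, max_tokens=12):
        # A real model would complete the prompt; here we echo a constant.
        return "42.0\ntext after the stop sequence"

def llm_regression(llm, x_train, x_test, y_train, y_test,
                   add_instr_prefix=False,
                   instr_prefix='The task is to provide your best estimate for "Output". '
                                'Please provide that and only that, without any additional text.\n\n'):
    # Build a few-shot prompt from the train split, then append each test point.
    shots = ''.join(f'Feature 0: {x}\nOutput: {y}\n\n' for x, y in zip(x_train, y_train))
    preds = []
    for x in x_test:
        inpt = shots + f'Feature 0: {x}\nOutput:'
        if add_instr_prefix:
            inpt = instr_prefix + inpt
        # Keep only the first line of the completion, as in the snippet above.
        output = llm.call_as_llm(inpt, stop=['\n'], max_tokens=12).strip().split('\n')[0].strip()
        preds.append(float(output))
    return preds

preds = llm_regression(StubLLM(), [1.0, 2.0], [3.0], [2.0, 4.0], [6.0])
print(preds)  # [42.0]
```

Swapping `StubLLM` for a real service client is then just a matter of exposing the same `call_as_llm` interface.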

Depending on the LLM, `llm.call_as_llm` might not work, or you might need to write a dedicated class when that service is not supported out of the box by LangChain. There is an example of this in `src/regressors/openrouter_llm_regressor.py`, where I wrote specific code for OpenRouter (`ChatOpenRouter`).
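If you need such a dedicated class, one way to structure it is to hide the service's HTTP API behind the same `call_as_llm` interface the regressors expect. The sketch below is illustrative only: the payload shape, the injected `transport` hook, and all names are assumptions, not how the repository's `ChatOpenRouter` (which builds on LangChain) actually works:

```python
import json

class SimpleServiceLLM:
    """Sketch of a service wrapper exposing `call_as_llm`.
    The payload shape and the `transport` hook are assumptions."""

    def __init__(self, model_name, api_key, transport):
        self.model_name = model_name
        self.api_key = api_key
        # `transport` is an injected callable (e.g. wrapping requests.post),
        # so the class can be exercised without network access.
        self.transport = transport

    def call_as_llm(self, prompt, stop=None, max_tokens=12):
        payload = {
            "model": self.model_name,
            "prompt": prompt,
            "stop": stop or [],
            "max_tokens": max_tokens,
        }
        raw = self.transport(json.dumps(payload))
        # Assume an OpenAI-style completion response.
        return json.loads(raw)["choices"][0]["text"]

# A fake transport standing in for the HTTP call.
def fake_transport(body):
    assert "prompt" in body
    return json.dumps({"choices": [{"text": "3.5"}]})

llm = SimpleServiceLLM("some-model", "sk-placeholder", fake_transport)
print(llm.call_as_llm("Feature 0: 1.0\nOutput:"))  # 3.5
```

Injecting the transport keeps the service-specific plumbing separate from the regressor logic, which only ever sees `call_as_llm`.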
