-
Notifications
You must be signed in to change notification settings - Fork 152
Build your own plugin
Optillm supports a simple plugin system to extend the capabilities of the proxy. You can use it to run any code as part of the request <--> response call to a LLM.
To do so, you need to create a python file and put it in the /plugins
folder.
The plugin needs only two things:
- SLUG (a unique string that will be the name of the plugin. This is what the user will put in to use the plugin from the proxy)
- Implement the
run
method (the method will get the initial query, system prompt, a API client (optional) and model (optional) and return a tuple with the final response and tokens used)
SLUG = "your_plugin_name"
def run(system_prompt, initial_query: str, client=None, model=None) -> Tuple[str, int]:
# Implement your code here
return final_response, tokens_used
When the proxy starts it will load all the files from the /plugins
folder. It will then route the requests based on the SLUG
you set and call the run
method in the file. In the above example running the proxy with your_plugin_name-gpt-4o-mini
will route the request to the plugin's run
method. There are several plugins that are already implemented as shown here. You can use them as reference.
Note
Plugins may do anything, they do not have to call an LLM or return the response. E.g. the readurls plugin fetches the content of all the URLs in the request and adds it to the context.
Plugins can also be chained together with &
or |
operators. For instance, if you like to read the content of all URLs in a request and then process the request with memory as the context may become very large to process directly, you can combine them as readurls&memory-gpt-4o-mini
. The &
operator will run the plugins one after the other in a pipeline, taking the output of the previous stage as the input to the next one.
On the other hand the |
operator will run both the plugins/approaches in parallel and return a response with multiple completions in a list. E.g. if you want to run the request with rto
approach and also want to run it with executecode
plugin, you can use rto|executecode-gpt-4o-mini
. This will return a list with 2 completions one for rto
and another for executecode
.