# LLProxy

LLProxy is a flexible, auto-discovering proxy server for Large Language Models (LLMs).

## Features
- Automatic discovery of LLM models and endpoints:
  - HTTP port scan (finds local API endpoints)
  - SSH scan (finds processes with a `PORT` environment variable set)
  - HTTPS managed endpoint scan (supports API keys)
- Unified completions and chat completions API for multiple LLM models
- Clients don't need to worry about API keys or where a model is hosted
- Periodic model discovery updates
- Status page for active models
- Model filtering (import only some models into the proxy)
- Hot-reloading of configuration
## Prerequisites

- Node.js (v20)

## Installation
1. Clone this repository:

   ```bash
   git clone https://github.com/the-crypt-keeper/llproxy.git
   cd llproxy
   ```

2. Install dependencies:

   ```bash
   npm install
   ```
3. Configure the `config.json` file with your LLM endpoints.

## Configuration

Edit the `config.json` file to set up your LLM endpoints:
```json
{
  "port": 3333,
  "interval": 30000,
  "endpoints": [
    {
      "hostname": "supremacy",
      "ssh_username": "mike",
      "env_var": "PORT",
      "tags": []
    },
    {
      "hostname": "100.106.238.128",
      "port_start": 8080,
      "port_end": 8090,
      "tags": []
    },
    {
      "url": "https://api.groq.com/openai",
      "tags": [],
      "filter": ["llama3", "gemma", "mixtral"],
      "apikey": "your_api_key_here"
    }
  ]
}
```
Configuration options:

- `port`: The port on which LLProxy will run
- `interval`: The interval (in milliseconds) for periodic model discovery
- `endpoints`: An array of LLM endpoints to discover. There are three types of endpoint configuration:
  - SSH-based discovery:
    - `hostname`: The hostname for the SSH connection
    - `ssh_username`: The SSH username for authentication
    - `env_var`: The environment variable to search for a port number
  - HTTP scan discovery (sketched below):
    - `hostname`: The IP address or hostname of the endpoint
    - `port_start`: The starting port number for the scan range
    - `port_end`: The ending port number for the scan range
  - Managed provider import:
    - `url`: The URL of the managed provider's API endpoint
    - `apikey`: The API key for authentication with the managed provider
    - `filter`: An array of strings to filter model names
    - `models`: An explicit array of model IDs (skips the `/v1/models` discovery call)
  - Common options (all types):
    - `tags`: Optional tags to append to model names
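To make the HTTP scan concrete, here is a minimal sketch of how port-range discovery could work, assuming each backend exposes an OpenAI-compatible `GET /v1/models`, that `filter` entries are substring matches, and that `tags` are appended to model names. The function and field names are illustrative, not LLProxy's actual internals:

```javascript
// Hypothetical sketch of HTTP scan discovery -- illustrative only.
// Assumes every backend exposes an OpenAI-compatible GET /v1/models.
async function scanHttpEndpoint({ hostname, port_start, port_end, filter = [], tags = [] }) {
  const discovered = [];
  for (let port = port_start; port <= port_end; port++) {
    try {
      const res = await fetch(`http://${hostname}:${port}/v1/models`, {
        signal: AbortSignal.timeout(1000), // skip ports that do not answer promptly
      });
      if (!res.ok) continue; // something is listening, but it is not an LLM API
      const { data } = await res.json();
      for (const model of data ?? []) {
        // Assumed filter semantics: keep a model if any filter string is a substring of its id.
        if (filter.length > 0 && !filter.some((f) => model.id.includes(f))) continue;
        // Assumed tag semantics: append configured tags to the advertised model name.
        const name = [model.id, ...tags].join('-');
        discovered.push({ name, hostname, port });
      }
    } catch {
      // Closed port or timeout: nothing to discover here, move on.
    }
  }
  return discovered;
}
```

SSH-based discovery works analogously, except that instead of probing a port range it locates the port by inspecting the remote process's `env_var` over SSH.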
## Usage
1. Start the server:

   ```bash
   node app.js
   ```

   The server will start on the configured port (default: 3333).
2. Access the API endpoints (these return JSON):

   - `GET /v1/models`: List all available models
   - `POST /v1/completions`: Get completions for a prompt
   - `POST /v1/chat/completions`: Get chat completions
3. Access the monitoring endpoints (these return HTML):

   - `GET /status`: View the status page for active models
## API Examples

List available models:

```http
GET http://localhost:3333/v1/models
```
Request a completion:

```http
POST http://localhost:3333/v1/completions
Content-Type: application/json

{
  "model": "model_name",
  "prompt": "Once upon a time",
  "max_tokens": 50
}
```
Request a chat completion:

```http
POST http://localhost:3333/v1/chat/completions
Content-Type: application/json

{
  "model": "model_name",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me a joke."}
  ]
}
```
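Since the proxy speaks this OpenAI-style request shape, any HTTP client can drive it. Below is a minimal sketch using Node 20's built-in `fetch` (save it as, e.g., `client.mjs` so top-level `await` works); `model_name` is a placeholder for an id returned by `GET /v1/models`:

```javascript
// Minimal chat-completion client for LLProxy using Node 20's global fetch.
// "model_name" is a placeholder; substitute an id from GET /v1/models.
const res = await fetch('http://localhost:3333/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'model_name',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Tell me a joke.' },
    ],
  }),
});

const data = await res.json();
// Assumes an OpenAI-style response body with a choices array.
console.log(data.choices[0].message.content);
```

Note that the client never supplies an API key: LLProxy holds provider keys server-side and routes each request to wherever the model is actually hosted.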
## Project Status

This project is functionally complete; no further features are planned.

## License

This project is licensed under the MIT License.