Use AI models in Neovim for completions or chat. Build prompts programmatically with Lua. Designed for those who want to customize their prompts, experiment with multiple providers, or use local models.
- 🎪 Provider agnostic. Comes with:
- OpenAI ChatGPT (and compatible APIs)
- hosted: Google PaLM, together, huggingface
- local: llama.cpp, ollama
- 🎨 Programmatic prompts in Lua
- customize everything
- async and multistep prompts
- starter examples
- 🌠 Streaming completions
- directly in buffer
- transform/extract text
- append/replace/insert modes
- 🦜 Chat in `mchat` filetype buffer
- edit settings or messages at any point
- take conversations to different models
- basic syntax highlights and folds
If you have any questions feel free to ask in discussions
- Nvim 0.8.0 or higher
- curl
With lazy.nvim
require('lazy').setup({
{
'gsuuon/model.nvim',
-- Don't need these if lazy = false
cmd = { 'M', 'Model', 'Mchat' },
init = function()
vim.filetype.add({
extension = {
mchat = 'mchat',
}
})
end,
ft = 'mchat',
keys = {
{'<C-m>d', ':Mdelete<cr>', mode = 'n'},
{'<C-m>s', ':Mselect<cr>', mode = 'n'},
{'<C-m><space>', ':Mchat<cr>', mode = 'n' }
},
-- To override defaults add a config field and call setup()
-- config = function()
-- require('model').setup({
-- prompts = {..},
-- chats = {..},
-- ..
-- })
--
-- require('model.providers.llamacpp').setup({
-- binary = '~/path/to/server/binary',
-- models = '~/path/to/models/directory'
-- })
--end
}
})
model.nvim comes with some starter prompts and makes it easy to build your own prompt library. For an example of a more complex, agent-like multi-step prompt (curl for an OpenAPI schema, ask GPT for the relevant endpoint, then include that in a final prompt), look at the openapi starter prompt.
Prompts can have 5 different modes which determine what happens to the response: append, insert, replace, buffer, insert_or_replace. The default is to append, and with no visual selection the default input is the entire buffer, so your response will be at the end of the file. Modes are configured on a per-prompt basis.
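As a rough illustration of per-prompt modes, the sketch below configures one prompt that replaces the input and another that opens the response in a new buffer. The prompt names (`fix_grammar`, `explain`), the instruction strings and the `with_instruction` helper are hypothetical; the `mode` export and the openai provider are the ones referenced elsewhere in this README.

```lua
local openai = require('model.providers.openai')
local mode = require('model').mode

-- Hypothetical helper: builds a chat-style request from a fixed
-- system instruction plus the editor input as the user message.
local function with_instruction(instruction)
  return function(input)
    return {
      messages = {
        { role = 'system', content = instruction },
        { role = 'user', content = input },
      }
    }
  end
end

require('model').setup({
  prompts = {
    -- Response replaces the visual selection (or the whole buffer)
    fix_grammar = {
      provider = openai,
      mode = mode.REPLACE,
      builder = with_instruction('Fix grammar and spelling. Return only the corrected text.'),
    },
    -- Response is written to a new buffer, leaving the input untouched
    explain = {
      provider = openai,
      mode = mode.BUFFER,
      builder = with_instruction('Explain what this text or code does.'),
    },
  }
})
```

With this in place, `:M fix_grammar` over a visual selection would rewrite it in place, while `:M explain` keeps the original text as-is.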
Run a completion prompt
`:Model [name]` or `:M [name]` - Start a completion of either the visual selection or the current buffer. Uses the default prompt if no prompt name is provided.
Start a new chat
`:Mchat [name] [instruction]` - Start a new chat buffer with the `name` ChatPrompt. Provide an optional instruction override. If currently in an `mchat` buffer, use `-` to re-use the same instruction (e.g. `:Mchat openai -`).
Run a chat buffer
`:Mchat` - Request the assistant response in a chat buffer. You can save an `mchat` buffer as `my_conversation.mchat`, reload it later and run `:Mchat` with your next message to continue where you left off. You'll need to have the same ChatPrompt configured in setup.
Responses are inserted with extmarks, so once the buffer is closed the responses become normal text and won't work with the following commands.
Select response
`:Mselect` - Select the response under the cursor.
Delete response
`:Mdelete` - Delete the response under the cursor. If `prompt.mode == 'replace'`, then replace with the original text.
Cancel response
`:Mcancel` - Cancel the active response under the cursor.
Show response
`:Mshow` - Flash the response under the cursor if there is one.
Setup and usage
- Python 3.10+
pip install numpy openai tiktoken
Check the module functions exposed in store. This uses the OpenAI embeddings api to generate vectors and queries them by cosine similarity.
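For intuition about how stored entries are ranked against a query, here is a small pure-Lua sketch of cosine similarity between two vectors. It only illustrates the metric; the store itself relies on the Python packages listed above, and the function name `cosine_similarity` is made up for this example.

```lua
-- Cosine similarity: dot(a, b) / (|a| * |b|).
-- 1 means same direction, 0 means orthogonal.
local function cosine_similarity(a, b)
  assert(#a == #b, 'vectors must have the same length')
  local dot, norm_a, norm_b = 0, 0, 0
  for i = 1, #a do
    dot = dot + a[i] * b[i]
    norm_a = norm_a + a[i] * a[i]
    norm_b = norm_b + b[i] * b[i]
  end
  if norm_a == 0 or norm_b == 0 then
    return 0 -- convention chosen here for zero vectors
  end
  return dot / (math.sqrt(norm_a) * math.sqrt(norm_b))
end
```

A similarity cutoff then simply keeps the entries whose score against the query vector is at or above the chosen threshold.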
To add items call into the `model.store` Lua module functions, e.g.
:lua require('model.store').add_lua_functions()
:lua require('model.store').add_files('.')
Look at `store.add_lua_functions` for an example of how to use treesitter to parse files to nodes and add them to the local store.
To get query results call `store.prompt.query_store` with your input text, desired count and similarity cutoff threshold (0.75 seems to be decent). It returns a list of {id: string, content: string}:
builder = function(input, context)
---@type {id: string, content: string}[]
local store_results = require('model.store').prompt.query_store(input, 2, 0.75)
-- add store_results to your messages
end
:Mstore [command]
- `:Mstore init` - initialize a store.json file at the closest git root directory
- `:Mstore query <query text>` - query a store.json
All setup options are optional. Add new prompts to `options.prompts.[name]` and chat prompts to `options.chats.[name]`.
require('model').setup({
default_prompt = {},
prompts = {...},
chats = {...},
hl_group = 'Comment',
join_undo = true,
})
Prompts go in the `prompts` field of the setup table and are run by the command `:Model [prompt name]` or `:M [prompt name]`. The commands tab-complete with the available prompts.
With lazy.nvim:
{
'gsuuon/model.nvim',
config = function()
require('model').setup({
prompts = {
instruct = { ... },
code = { ... },
ask = { ... }
}
})
end
}
A prompt entry defines how to handle a completion request: it takes in the editor input (either an entire file or a visual selection) and some context, and produces the API request data, merging with any defaults. It also defines how to handle the API response; for example, it can replace the selection (or file) with the response or insert it at the cursor position.
Check out the starter prompts to see how to create prompts. Type definitions are in provider.lua.
Chat prompts go in `setup({ prompts = {..}, chats = { [name] = { <chat prompt> }, .. } })` next to `prompts`. Defaults to the starter chat prompts.
Use `:Mchat [name]` to create a new mchat buffer with that chat prompt. The command will tab-complete with available chat prompts. You can prefix the command with a modifier, as in `:horizontal Mchat [name]` or `:tab Mchat [name]`, to create the buffer in a horizontal split or new tab.
A brand new `mchat` buffer might look like this:
openai
---
{
params = {
model = "gpt-4-1106-preview"
}
}
---
> You are a helpful assistant
Count to three
Run `:Mchat` in the new buffer (with no name argument) to get the assistant response. You can edit any of the messages, params, options or system instruction (the first line, if it starts with `>`) as necessary throughout the conversation. You can also copy/paste to a new buffer, `:set ft=mchat` and run `:Mchat`.
You can save the buffer with an `.mchat` extension to continue the chat later using the same settings shown in the header. `mchat` comes with some syntax highlighting and folds to show the various chat parts: the name of the chat prompt runner, options and params in the header, and a system message.
You can use `require('model.util').module.autoload` instead of a naked `require` to always re-require a module on use. This makes the feedback loop for developing prompts faster:
require('model').setup({
- prompts = require('prompt_library')
+ prompts = require('model.util').module.autoload('prompt_library')
})
I recommend setting this only during active prompt development, and switching to a normal `require` otherwise.
OpenAI ChatGPT (default)
Set the `OPENAI_API_KEY` environment variable to your API key.
OpenAI prompts can take an additional option field to talk to compatible APIs.
compat = vim.tbl_extend('force', openai.default_prompt, {
options = {
url = 'http://127.0.0.1:8000/v1/'
}
})
- `url?: string` - (Optional) Custom URL to use for API requests. Defaults to 'https://api.openai.com/v1/'. If `url` is provided then the environment key will not be sent; you'll need to include `authorization`.
- `endpoint?: string` - (Optional) Endpoint to use in the request URL. Defaults to 'chat/completions'.
- `authorization?: string` - (Optional) Authorization header to include in the request. Overrides any authorization given through the environment key.
This provider uses the llama.cpp server.
You can start the server manually or have it autostart when you run a llamacpp prompt. To autostart the server call `require('model.providers.llamacpp').setup({})` in your config function and set a `model` in the prompt options (see below). Leave `model` empty to not autostart. The server restarts if the prompt model or args change.
- Build llama.cpp
- Download the model you want to use, e.g. Zephyr 7b beta
- Set up the llamacpp provider if you plan to use autostart:
config = function()
  require('model').setup({ .. })
  require('model.providers.llamacpp').setup({
    binary = '~/path/to/server/binary',
    models = '~/path/to/models/directory'
  })
end
- Use the llamacpp provider in a prompt:
local llamacpp = require('model.providers.llamacpp')
require('model').setup({
  prompts = {
    zephyr = {
      provider = llamacpp,
      options = {
        model = 'zephyr-7b-beta.Q5_K_M.gguf',
        args = { '-c', 8192, '-ngl', 35 }
      },
      builder = function(input, context)
        return {
          prompt = '<|system|>'
            .. (context.args or 'You are a helpful assistant')
            .. '\n</s>\n<|user|>\n'
            .. input
            .. '</s>\n<|assistant|>',
          stops = { '</s>' }
        }
      end
    }
  }
})
Setup `require('model.providers.llamacpp').setup({})`
- `binary: string` - path to the llamacpp server binary executable
- `models: string` - path to the parent directory of the models (joined with `prompt.model`)
- `model: string (optional)` - The path to the LLM model file to use with server autostart. If not specified, the default model will be used.
- `args: string[] (optional)` - An array of additional arguments to pass to the LLM server at startup.
- `url: string (optional)` - The URL to connect to the LLM server instead of using the default one. This can be useful for connecting to a remote LLM server or a customized local one.
This is a llama.cpp based provider specialized for codellama infill / Fill in the Middle. Only 7B and 13B models support FIM, and the base models (not Instruct) seem to work better. Start the llama.cpp server example with one of the two supported models before using this provider.
This uses the ollama REST server's `/api/generate` endpoint. `raw` defaults to true, and `stream` is always true.
Example prompt with starling:
['ollama/starling'] = {
provider = ollama,
params = {
model = 'starling-lm'
},
builder = function(input)
return {
prompt = 'GPT4 Correct User: ' .. input .. '<|end_of_turn|>GPT4 Correct Assistant: '
}
end
},
Set the `PALM_API_KEY` environment variable to your API key.
Check the palm prompt in starter prompts for a reference. The PaLM provider defaults to the chat model (`chat-bison-001`). The builder's return params can include `model = 'text-bison-001'` to use the text model instead.
Params should be either a generateMessage body by default, or a generateText body if using `model = 'text-bison-001'`.
['palm text completion'] = {
provider = palm,
builder = function(input, context)
return {
model = 'text-bison-001',
prompt = {
text = input
},
temperature = 0.2
}
end
}
Set the `TOGETHER_API_KEY` environment variable to your API key. Params go to the inference endpoint.
Set the `HUGGINGFACE_API_KEY` environment variable to your API key.
Set the model field on the params returned by the builder (or the static params in `prompt.params`). Set `params.stream = false` for models which don't support it (e.g. `gpt2`). Check the Hugging Face API docs for per-task request body types.
['huggingface bigcode'] = {
provider = huggingface,
params = {
model = 'bigcode/starcoder'
},
builder = function(input)
return { inputs = input }
end
}
For older models that don't work with llama.cpp, koboldcpp might still support them. Check their repo for setup info.
Providers implement a simple interface so it's easy to add your own. Just set your provider as the `provider` field in a prompt. Your provider needs to kick off the request and call the handlers as data streams in, finishes, or errors. Check the hf provider for a simpler example supporting server-sent events streaming. If you don't need streaming, just make a request and call `handler.on_finish` with the result.
Basic provider example:
local test_provider = {
request_completion = function(handlers, params, options)
vim.notify(vim.inspect({params=params, options=options}))
handlers.on_partial('a response')
handlers.on_finish()
end
}
require('model').setup({
prompts = {
test_prompt = {
provider = test_provider,
builder = function(input, context)
return {
input = input,
context = context
}
end
}
}
})
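To make the non-streaming case concrete, here is a minimal sketch of a provider that produces its whole result in one step, calls `on_finish` once, and returns a cancel callback as the Provider type describes. The provider name `shout_provider` and its upper-casing behavior are invented for illustration; a real provider would make an actual request instead.

```lua
-- Hypothetical provider: "completes" by upper-casing params.input in one shot.
local shout_provider = {
  request_completion = function(handlers, params, options)
    if type(params) ~= 'table' or type(params.input) ~= 'string' then
      -- Report bad input through the error handler, with an optional label
      handlers.on_error('expected params.input to be a string', 'shout_provider')
      return function() end
    end

    -- No streaming: hand over the complete text in a single call
    handlers.on_finish(params.input:upper())

    -- Nothing is in flight any more, so cancelling is a no-op
    return function() end
  end
}
```

It can be used like the basic example: set `provider = shout_provider` on a prompt whose builder returns `{ input = input }`.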
The following are types and the fields they contain:
Setup `require('model').setup(SetupOptions)`
- `default_prompt?: string` - The default prompt to use with `:Model` or `:M`. Default is the openai starter.
- `prompts?: {string: Prompt}` - A table of custom prompts to use with `:M [name]`. Keys are the names of the prompts. Default are the starters.
- `chats?: {string: ChatPrompt}` - A table of chat prompts to use with `:Mchat [name]`. Keys are the names of the chats.
- `hl_group?: string` - The default highlight group for in-progress responses. Default is `'Comment'`.
- `join_undo?: boolean` - Whether to join streaming response text as a single undo command. When true, unrelated edits during streaming will also be undone. Default is `true`.
`params` are generally data that go directly into the request sent by the provider (e.g. content, temperature). `options` are used by the provider to know how to handle the request (e.g. server url or model name if a local LLM).
Setup `require('model').setup({prompts = { [prompt name] = Prompt, .. }})`
Run `:Model [prompt name]` or `:M [prompt name]`
- `provider: Provider` - The provider for this prompt, responsible for requesting and returning completion suggestions.
- `builder: ParamsBuilder` - Converts input (either the visual selection or entire buffer text) and context to request parameters. Returns either a table of params or a function that takes a callback with the params.
- `transform?: fun(string): string` - Optional function that transforms completed response text after on_finish, e.g. to extract code.
- `mode?: SegmentMode | StreamHandlers` - Response handling mode. Defaults to 'append'. Can be one of 'append', 'replace', 'buffer', 'insert', or 'insert_or_replace'. Can be a table of StreamHandlers to manually handle the provider response.
- `hl_group?: string` - Highlight group of active response.
- `params?: table` - Static request parameters for this prompt.
- `options?: table` - Optional options for the provider.
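As an example of what a `transform` function might look like, the sketch below pulls the body of the first fenced code block out of a response and falls back to the full text when there is no fence. The helper name `extract_code` is hypothetical and the pattern is deliberately simple; it is not the plugin's own extraction logic.

```lua
-- Three backticks, built with string.rep so this snippet stays easy to embed
local FENCE = string.rep('`', 3)

-- Matches: opening fence, optional language tag, newline, body, closing fence
local FENCED_BLOCK = FENCE .. '[^\n]*\n(.-)\n?' .. FENCE

-- Hypothetical transform: return the first fenced block's body,
-- or the response unchanged when no fenced block is present.
local function extract_code(response)
  local code = response:match(FENCED_BLOCK)
  return code or response
end
```

Setting `transform = extract_code` on a prompt would then apply it to the completed response text.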
Provider
- `request_completion: fun(handler: StreamHandlers, params?: table, options?: table): function` - Requests a completion stream from the provider and returns a cancel callback. Feeds completion parts back to the prompt runner using handler methods and calls on_finish after completion is done.
- `default_prompt?: Prompt` - Default prompt for this provider (optional).
- `adapt?: fun(prompt: StandardPrompt): table` - Adapts a standard prompt to params for this provider (optional).
ParamsBuilder (function)
`fun(input: string, context: Context): table | fun(resolve: fun(params: table))` - Converts input (either the visual selection or entire buffer text) and context to request parameters. Returns either a table of params or a function that takes a callback with the params.
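The callback form can be sketched as follows: the builder returns a function, and the params are delivered later through `resolve`. Here `fetch_instruction` is a hypothetical stand-in for some asynchronous step (a job, an HTTP request, a user prompt); it calls back immediately only to keep the example self-contained, and the chat-style `messages` shape is an assumption borrowed from the OpenAI examples in this README.

```lua
-- Hypothetical stand-in for an asynchronous lookup
local function fetch_instruction(callback)
  callback('Answer concisely.')
end

-- Builder returning a function instead of a params table:
-- the prompt runner passes in `resolve` and waits for the params.
local function builder(input, context)
  return function(resolve)
    fetch_instruction(function(instruction)
      resolve({
        messages = {
          { role = 'system', content = instruction },
          { role = 'user', content = input },
        }
      })
    end)
  end
end
```

This shape is what makes multistep prompts possible: each intermediate step finishes before `resolve` is called with the final params.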
SegmentMode (enum)
Exported as `local mode = require('model').mode`
- `APPEND = 'append'` - Append to the end of input.
- `REPLACE = 'replace'` - Replace input.
- `BUFFER = 'buffer'` - Create a new buffer and insert.
- `INSERT = 'insert'` - Insert at the cursor position.
- `INSERT_OR_REPLACE = 'insert_or_replace'` - Insert at the cursor position if no selection, or replace the selection.
StreamHandlers
- `on_partial: fun(partial_text: string): nil` - Called by the provider to pass partial incremental text completions during a completion request.
- `on_finish: fun(complete_text?: string, finish_reason?: string): nil` - Called by the provider when the completion is done. Takes an optional argument for the completed text (`complete_text`) and an optional argument for the finish reason (`finish_reason`).
- `on_error: fun(data: any, label?: string): nil` - Called by the provider to pass error data and an optional label during a completion request.
`params` are generally data that go directly into the request sent by the provider (e.g. content, temperature). `options` are used by the provider to know how to handle the request (e.g. server url or model name if a local LLM).
Setup `require('model').setup({chats = { [chat name] = ChatPrompt, .. }})`
Run `:Mchat [chat name]`
- `provider: Provider` - The provider for this chat prompt.
- `create: fun(input: string, context: Context): string | ChatContents` - Converts input and context into the first message text or ChatContents, which are written into the new chat buffer.
- `run: fun(messages: ChatMessage[], config: ChatConfig): table | fun(resolve: fun(params: table): nil )` - Converts chat messages and configuration into completion request params. This function returns a table containing the required params for generating completions, or it can return a function that takes a callback to resolve the params.
- `system?: string` - Optional system instruction used to provide specific instructions for the provider.
- `params?: table` - Static request parameters that are provided to the provider during completion generation.
- `options?: table` - Provider options, which can be customized by the user to modify the chat prompt behavior.
ChatMessage
- `role: 'user' | 'assistant'` - Indicates whether this message was generated by the user or the assistant.
- `content: string` - The actual content of the message.
ChatConfig
- `system?: string` - Optional system instruction used to provide context or specific instructions for the provider.
- `params?: table` - Static request parameters that are provided to the provider during completion generation.
- `options?: table` - Provider options, which can be customized by the user to modify the chat prompt behavior.
ChatContents
- `config: ChatConfig` - Configuration for this chat buffer, used by `chatprompt.run`. This includes information such as the system instruction, static request parameters, and provider options.
- `messages: ChatMessage[]` - Messages in the chat buffer.
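Putting the chat types together, a `create`/`run` pair for a chat prompt might be sketched like this. It assumes the provider accepts OpenAI-style `messages` with a leading system message, which will not hold for every provider; the function bodies are illustrative rather than the starter chat prompts' actual code.

```lua
-- create: use the editor input as the first user message text
local function create(input, context)
  return input
end

-- run: turn the buffer's messages and config into request params.
-- Assumes an OpenAI-style message list (an assumption, see above).
local function run(messages, config)
  local request_messages = {}
  if config.system then
    table.insert(request_messages, { role = 'system', content = config.system })
  end
  for _, message in ipairs(messages) do
    table.insert(request_messages, { role = message.role, content = message.content })
  end
  return { messages = request_messages }
end
```

A ChatPrompt would then combine these with a real provider, e.g. `{ provider = openai, create = create, run = run, system = 'You are a helpful assistant' }`.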
Context
- `before: string` - The text present before the selection or cursor.
- `after: string` - The text present after the selection or cursor.
- `filename: string` - The filename of the buffer containing the selected text.
- `args: string` - Any additional command arguments provided to the plugin.
- `selection?: Selection` - An optional `Selection` object representing the selected text, if available.
Selection
- `start: Position` - The starting position of the selection within the buffer.
- `stop: Position` - The ending position of the selection within the buffer.
Position
- `row: number` - The 0-indexed row of the position within the buffer.
- `col: number or vim.v.maxcol` - The 0-indexed column of the position within the line. If `vim.v.maxcol` is provided, it indicates the end of the line.
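A builder can use these context fields to enrich the request. The sketch below is one possible approach, not a starter prompt: it takes the system instruction from the command arguments when given, and tells the model which file and (1-indexed) line range the input came from. The message wording and the OpenAI-style `messages` shape are assumptions.

```lua
local function builder(input, context)
  -- Use command arguments as the instruction when present
  local instruction = 'You are a helpful assistant'
  if context.args and #context.args > 0 then
    instruction = context.args
  end

  -- Rows are 0-indexed, so add 1 for a human-readable line range
  local scope = 'whole buffer'
  if context.selection then
    scope = ('lines %d-%d'):format(
      context.selection.start.row + 1,
      context.selection.stop.row + 1
    )
  end

  return {
    messages = {
      { role = 'system', content = instruction },
      {
        role = 'user',
        content = ('File: %s (%s)\n\n%s'):format(context.filename, scope, input),
      },
    }
  }
end
```

With a prompt using this builder, running it with extra arguments over a visual selection would send those arguments as the system message along with the selected lines.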
require('model').setup({
prompts = {
['prompt name'] = ...
}
})
Ask for additional user instruction
ask = {
provider = openai,
params = {
temperature = 0.3,
max_tokens = 1500
},
builder = function(input)
local messages = {
{
role = 'user',
content = input
}
}
return util.builder.user_prompt(function(user_input)
if #user_input > 0 then
table.insert(messages, {
role = 'user',
content = user_input
})
end
return {
messages = messages
}
end, input)
end,
}
Create a commit message based on `git diff --staged`
['commit message'] = {
provider = openai,
mode = mode.INSERT,
builder = function()
local git_diff = vim.fn.system {'git', 'diff', '--staged'}
return {
messages = {
{
role = 'system',
content = 'Write a short commit message according to the Conventional Commits specification for the following git diff: ```\n' .. git_diff .. '\n```'
}
}
}
end,
}
Modify input to append messages
--- Looks for `<llm:` at the end and splits into before and after
--- returns all text if no directive
local function match_llm_directive(text)
local before, _, after = text:match("(.-)(<llm:)%s?(.*)$")
if not before and not after then
before, after = text, ""
elseif not before then
before = ""
elseif not after then
after = ""
end
return before, after
end
local instruct_code = 'You are a highly competent programmer. Include only valid code in your response.'
return {
['to code'] = {
provider = openai,
builder = function(input)
local text, directive = match_llm_directive(input)
local msgs ={
{
role = 'system',
content = instruct_code,
},
{
role = 'user',
content = text,
}
}
-- match_llm_directive returns '' (truthy in Lua) when there is no directive
if directive ~= '' then
table.insert(msgs, { role = 'user', content = directive })
end
return {
messages = msgs
}
end,
mode = segment.mode.REPLACE
},
code = {
provider = openai,
builder = function(input)
return {
messages = {
{
role = 'system',
content = instruct_code,
},
{
role = 'user',
content = input,
}
}
}
end,
},
}
Replace text with Spanish
local openai = require('model.providers.openai')
local segment = require('model.util.segment')
require('model').setup({
prompts = {
['to spanish'] =
{
provider = openai,
hl_group = 'SpecialComment',
builder = function(input)
return {
messages = {
{
role = 'system',
content = 'Translate to Spanish',
},
{
role = 'user',
content = input,
}
}
}
end,
mode = segment.mode.REPLACE
}
}
})
Notifies each stream part and the complete response
local openai = require('model.providers.openai')
require('model').setup({
prompts = {
['show parts'] = {
provider = openai,
builder = openai.default_builder,
mode = {
on_finish = function (final)
vim.notify('final: ' .. final)
end,
on_partial = function (partial)
vim.notify(partial)
end,
on_error = function (msg)
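-- error data can be any type, so only concatenate strings directly
vim.notify('error: ' .. (type(msg) == 'string' and msg or vim.inspect(msg)))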
end
}
},
}
})
You can move prompts into their own file and use `util.module.autoload` to quickly iterate on prompt development.
Setup
local openai = require('model.providers.openai')
-- configure default model params here for the provider
openai.initialize({
model = 'gpt-3.5-turbo-0301',
max_tokens = 400,
temperature = 0.2,
})
local util = require('model.util')
require('model').setup({
hl_group = 'Substitute',
prompts = util.module.autoload('prompt_library'),
default_prompt = {
provider = openai,
builder = function(input)
return {
temperature = 0.3,
max_tokens = 120,
messages = {
{
role = 'system',
content = 'You are a helpful assistant.',
},
{
role = 'user',
content = input,
}
}
}
end
}
})
Prompt library
local openai = require('model.providers.openai')
local segment = require('model.util.segment')
return {
code = {
provider = openai,
builder = function(input)
return {
messages = {
{
role = 'system',
content = 'You are a 10x super elite programmer. Continue only with code. Do not write tests, examples, or output of code unless explicitly asked for.',
},
{
role = 'user',
content = input,
}
}
}
end,
},
['to spanish'] = {
provider = openai,
hl_group = 'SpecialComment',
builder = function(input)
return {
messages = {
{
role = 'system',
content = 'Translate to Spanish',
},
{
role = 'user',
content = input,
}
}
}
end,
mode = segment.mode.REPLACE
},
['to javascript'] = {
provider = openai,
builder = function(input, ctx)
return {
messages = {
{
role = 'system',
content = 'Convert the code to javascript'
},
{
role = 'user',
content = input
}
}
}
end,
},
['to rap'] = {
provider = openai,
hl_group = 'Title',
builder = function(input)
return {
messages = {
{
role = 'system',
content = "Explain the code in 90's era rap lyrics"
},
{
role = 'user',
content = input
}
}
}
end,
}
}
New starter prompts, providers and bug fixes are welcome! If you've figured out some useful prompts and want to share, check out the discussions.
I'm hoping to eventually add the following features - I'd appreciate help with any of these.
The basics are here - a simple json vectorstore based on the git repo, querying, cosine similarity comparison. It just needs a couple more features to improve the DX of using it from prompts.
Make treesitter and LSP info available in prompt context.
A split buffer for chat usage. I still just use web UIs for chat, but having RAG enhanced chats would be nice.