A bunny that sits on top of a llama (and controls it).
bunny-llama is a tool that uses bun to interact with llama.cpp. It lets you quickly develop and test code against llama.cpp bindings. The project supports hot module reloading, so code changes are reflected in the running application without a manual restart.
This also means the model does not have to be reloaded every time you change and recompile your custom prompting functions.
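A custom prompting function that benefits from hot reloading might look like this. This is a minimal sketch; the file name, function name, and prompt template are illustrative assumptions, not bunny-llama's actual API:

```typescript
// prompt.ts -- hypothetical module; edit it while the app is running and
// hot module reloading swaps it in without reloading the model weights.

// Build a chat-style prompt string from a system message and a user message.
// The template format below is an illustrative assumption.
export function buildPrompt(system: string, user: string): string {
  return `### System:\n${system}\n### User:\n${user}\n### Assistant:\n`;
}
```

Tweaking the template in a file like this and saving it is the whole edit loop: the model stays in memory while the new prompt logic is hot-swapped in.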
make api-llama.so && curl localhost:1337
bun clone
bun make
bun ride
bun clean
bun install
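Once the server from the quickstart is running, you can also query it from TypeScript instead of curl. A minimal sketch, assuming the server listens on localhost:1337 and accepts a llama.cpp-style JSON completion body (the endpoint path and field names are assumptions):

```typescript
// Build a completion request for the local server. The /completion path and
// the prompt/n_predict fields mirror llama.cpp's HTTP server and are
// assumptions here, not confirmed bunny-llama API.
function completionRequest(prompt: string, nPredict = 64) {
  return {
    url: "http://localhost:1337/completion",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt, n_predict: nPredict }),
    },
  };
}

// Usage (requires the server to be up):
// const { url, init } = completionRequest("Why do bunnies ride llamas?");
// const res = await fetch(url, init);
// console.log(await res.json());
```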
(Most likely you already have git and zig.)
Install the correct version of zig:
bun install -g @oven/zig
or update it as described here
For people with NVIDIA GPUs:
Install conda.
conda create -n bunny
conda activate bunny
conda install cuda -c nvidia
Then build the llama with CUDA, like so:
bun clone
bun make-cuda
bun ride.ts
Now you have a special CUDA-enabled llama.
If you closed your shell and want to build the CUDA llama again, activate the conda environment first:
conda activate bunny
bun make-cuda