bunny-llama

What is this?

A bunny that sits on top of a llama (and controls it).

On a more serious note:

bunny-llama is a tool that uses bun to interact with llama.cpp. It lets you quickly develop and test code against the llama.cpp bindings, and it supports hot module reloading: changes to your code are reflected in the running application without a manual restart.

This also means the model does not have to be reloaded every time you make a change and recompile your custom prompting functions.
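For a sense of what calling into the library looks like, here is a minimal sketch using Bun's built-in bun:ffi module. It assumes the shared library sits at ./api-llama.so (the make target below); the exported hello_llama function is a hypothetical placeholder, not the project's actual API, so check the repository sources for the real symbols.

import { dlopen, FFIType } from "bun:ffi";

// Open the shared library built by `make api-llama.so`.
// `hello_llama` is a hypothetical exported C function, used here
// only to illustrate the binding shape.
const lib = dlopen("./api-llama.so", {
  hello_llama: {
    args: [FFIType.ptr],      // pointer to a NUL-terminated prompt string
    returns: FFIType.cstring, // C string returned by the library
  },
});

const prompt = Buffer.from("Hello, llama!\0", "utf8");
console.log(lib.symbols.hello_llama(prompt));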

Hot module reloading

Rebuild the shared library and curl the dev server; the running process picks up the change without a restart:

make api-llama.so && curl localhost:1337
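Why this works, as a minimal sketch: a dev server can reopen the shared library whenever the file on disk changes, so the next request runs the freshly compiled code while the bun process (and anything it holds in memory) keeps running. The hello_llama symbol is again a hypothetical placeholder, and the actual ride.ts may organize this differently.

import { dlopen, FFIType } from "bun:ffi";
import { statSync } from "node:fs";

const LIB_PATH = "./api-llama.so";
const SYMBOLS = {
  // hypothetical exported function, for illustration only
  hello_llama: { args: [FFIType.ptr], returns: FFIType.cstring },
};

let lib = dlopen(LIB_PATH, SYMBOLS);
let loadedAt = statSync(LIB_PATH).mtimeMs;

Bun.serve({
  port: 1337,
  fetch() {
    // If `make api-llama.so` rewrote the library, reload it before answering.
    const mtime = statSync(LIB_PATH).mtimeMs;
    if (mtime !== loadedAt) {
      lib.close(); // drop the stale handle so dlopen loads the new file
      lib = dlopen(LIB_PATH, SYMBOLS);
      loadedAt = mtime;
    }
    const prompt = Buffer.from("Hello, llama!\0", "utf8");
    return new Response(lib.symbols.hello_llama(prompt));
  },
});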

To run:

bun clone
bun make
bun ride
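Once bun ride is up, the dev server can be exercised from code as well as from curl. A tiny smoke test, with a made-up file name (smoke.ts) and the port taken from the curl example above:

// smoke.ts: run with `bun smoke.ts` while the dev server is running
const res = await fetch("http://localhost:1337");
console.log(await res.text());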

To clean:

bun clean

To install dependencies:

bun install

(most likely you already have git and zig installed)

Install the right version of zig:

bun install -g @oven/zig

or update it as described here

Nvidia llama

For people with Nvidia GPUs:

Install conda, then set up an environment with the CUDA toolkit:

conda create -n bunny
conda activate bunny
conda install cuda -c nvidia

Then make the llama with CUDA, like so:

bun clone
bun make-cuda
bun ride.ts

Now you have a special CUDA-enabled llama.

If you closed your shell and want to build the CUDA llama again, you need to activate the conda environment first:

conda activate bunny
bun make-cuda
