
Starred repositories: 8 stars written in C++
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk (see the usage sketch after this list)
Connect home devices into a powerful cluster to accelerate LLM inference. More devices mean faster inference.
High-performance stateful serverless runtime based on WebAssembly
Java Bindings for llama.cpp - A Port of Facebook's LLaMA model in C/C++
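
The approximate-nearest-neighbors entry above matches the description of Spotify's Annoy library. Assuming that is the repository in question, here is a minimal Python sketch of its build/save/load workflow; the dimensionality, metric, tree count, and file name are illustrative, not taken from the list itself.

```python
import random
from annoy import AnnoyIndex

f = 40  # vector dimensionality (illustrative)
index = AnnoyIndex(f, 'angular')  # metric: 'angular', 'euclidean', 'manhattan', 'hamming', or 'dot'

# Add some random vectors to the index.
for i in range(1000):
    index.add_item(i, [random.gauss(0, 1) for _ in range(f)])

index.build(10)          # build 10 trees; more trees give better recall at the cost of build time
index.save('vectors.ann')  # persist the index to disk

# Load the index back; Annoy memory-maps the file, which is what
# "optimized for memory usage and loading/saving to disk" refers to.
index2 = AnnoyIndex(f, 'angular')
index2.load('vectors.ann')
print(index2.get_nns_by_item(0, 10))  # the 10 approximate nearest neighbours of item 0
```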