- The llama.cpp backend (CPU, Metal) now requires re-downloading GGUF models due to upstream format changes: TabbyML#645 ggerganov/llama.cpp#3252
- Switch cpu backend to llama.cpp: TabbyML#638
- Add `server.completion_timeout` to control the code completion interface timeout: TabbyML#637
- Switch cuda backend to llama.cpp: TabbyML#656
- Switch tokenizer to llama.cpp, so Tabby no longer needs to download an additional tokenizer file: TabbyML#683
- Supports golang: TabbyML#553
- Supports ruby: TabbyML#597
- Supports using a local directory for `Repository.git_url`: use `file:///path/to/repo` to specify a local directory.
- A new UI design for webserver.
- Improve snippet retrieval by deduplicating candidates against existing content and snippets: TabbyML#582
- Fix GPU OOM issue caused by parallelism: TabbyML#541, TabbyML#587
- Fix git safe directory check in docker: TabbyML#569
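
For the local-directory form of `Repository.git_url` mentioned in the list above, the value is an ordinary `file://` URL. The snippet below is only an illustrative sketch of how such a URL can be built from an absolute POSIX path; the helper name `local_repo_url` is hypothetical and is not part of Tabby.

```python
import posixpath
from urllib.parse import quote


def local_repo_url(path: str) -> str:
    """Build a file:// URL from an absolute POSIX directory path.

    Hypothetical helper for illustration only; not part of Tabby.
    """
    # A file:// URL needs an absolute path; reject relative ones early.
    if not posixpath.isabs(path):
        raise ValueError(f"expected an absolute path, got {path!r}")
    # Percent-encode characters such as spaces; '/' is left untouched.
    return "file://" + quote(path)


print(local_repo_url("/path/to/repo"))  # file:///path/to/repo
```

Paths containing spaces or other reserved characters come out percent-encoded, which is the form a `file://` URL expects.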
The currently supported languages are:
- Rust
- Python
- JavaScript / JSX
- TypeScript / TSX
A blog series detailing the technical aspects of Retrieval-Augmented Code Completion will be published soon. Stay tuned!
- Fix Issue #511 by marking ggml models as optional.
- Improve stop words handling by combining RegexSet into Regex for efficiency.
- Fix a critical issue that might cause request deadlocking in the ctranslate2 backend (under heavy load)
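
The stop-word item above describes folding a set of patterns into a single regex. Tabby does this in Rust, so the following is only a rough Python analogue of the idea, not the project's code: the function names and the sample stop words are assumptions made for illustration.

```python
import re


def combine_stop_words(patterns):
    """Compile several stop-word patterns into one alternation regex."""
    # Wrap each pattern in a non-capturing group so that a pattern
    # containing its own `|` keeps its meaning inside the alternation.
    return re.compile("|".join(f"(?:{p})" for p in patterns))


def truncate_at_stop_word(text, stop_re):
    """Cut the text at the earliest stop-word match, if there is one."""
    match = stop_re.search(text)
    return text if match is None else text[: match.start()]


# Sample stop words (assumed for this sketch).
stop_re = combine_stop_words([r"\n\n", r"\ndef ", r"\nclass "])
print(repr(truncate_at_stop_word("return x\n\ndef next():", stop_re)))
```

A single combined pattern lets one scan over the text report both whether a stop word occurred and where, instead of testing each pattern separately.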
We have introduced a new argument, `--chat-model`, which allows you to specify the model for the chat playground located at http://localhost:8080/playground
To utilize this feature, use the following command in the terminal:
tabby serve --device metal --model TabbyML/StarCoder-1B --chat-model TabbyML/Mistral-7B
Mainland Chinese users have been facing challenges accessing Hugging Face due to various reasons. The Tabby team is actively working to address this issue by mirroring models to a hosting provider in mainland China called modelscope.cn.
# Download from the Modelscope registry
TABBY_REGISTRY=modelscope tabby download --model TabbyML/WizardCoder-1B
- Implemented more accurate UTF-8 incremental decoding in the GitHub pull request.
- Fixed the stop words implementation by utilizing RegexSet to isolate the stop word group.
- Improved model downloading logic; Tabby now attempts to fetch the latest model version when there is a remote change and the local cache key has become stale.
- Set the default `num_replicas_per_device` for the ctranslate2 backend to increase parallelism.
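
To illustrate why the UTF-8 item above matters: when output arrives as byte chunks, a multi-byte character can be split across two chunks, and decoding each chunk on its own would fail or produce replacement characters. The sketch below shows the general technique with Python's standard `codecs` incremental decoder. It is not Tabby's implementation, and `decode_stream` is a hypothetical name.

```python
import codecs


def decode_stream(chunks):
    """Yield decoded text from byte chunks that may split UTF-8 characters."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in chunks:
        # Incomplete trailing bytes are buffered until the next chunk.
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush; raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


data = "héllo 世界".encode("utf-8")
# Split inside "é" and inside "世" on purpose.
chunks = [data[:2], data[2:8], data[8:]]
print("".join(decode_stream(chunks)))
```

Each yielded piece contains only complete characters, so partial output can be shown to the user as it arrives.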