- Vancouver, Canada
(UTC -07:00)
Stars
Rust library for concurrent data access, using memory-mapped files, zero-copy deserialization, and wait-free synchronization.
Enforce the output format (JSON Schema, regex, etc.) of a language model
A Virtual Machine Monitor for modern Cloud workloads. Features include CPU, memory and device hotplug, support for running Windows and Linux guests, device offload with vhost-user and a minimal com…
Official inference framework for 1-bit LLMs
Simple, safe way to store and distribute tensors
Serverless LLM Serving for Everyone.
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference
SGLang is a fast serving framework for large language models and vision language models.
QJL: 1-Bit Quantized JL transform for KV Cache Quantization with Zero Overhead
A high-throughput and memory-efficient inference and serving engine for LLMs
A secure container runtime with CRI/OCI interface
`std::execution`, the proposed C++ framework for asynchronous and parallel programming.
A decentralized application (dApp) for simple online ticket booking, built with Solidity and Hardhat.
A tool for enriching the output of nvidia-smi.
Teaching design patterns in plain, everyday language (in Persian)
Extracts static code features from OpenCL kernels for use in machine learning.