Stars
Yi-1.5 is an upgraded version of Yi, delivering stronger performance in coding, math, reasoning, and instruction-following capability.
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…
Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.
Development repository for the Triton language and compiler
Top free VPN (ClashX & V2Ray proxy) with subscription links. [Free VPN, free proxy tools, free censorship circumvention, free subscription links, free nodes, curated picks, ClashX & V2Ray tutorials]
Stable Diffusion AI client app for Android
Compose Multiplatform app generates images using Stability AI
Stable Diffusion in NCNN with C++, supporting txt2img and img2img
LLM deployment project based on MNN. This project has been merged into MNN.
CPU INFOrmation library (x86/x86-64/ARM/ARM64, Linux/Windows/Android/macOS/iOS)
ppl.cv is a high-performance image processing library of openPPL supporting various platforms.
High-efficiency floating-point neural network inference operators for mobile, server, and Web
A debugging and profiling tool that can trace and visualize Python code execution
Use GraphicBuffer class from Android native code
Stable Diffusion with Core ML on Apple Silicon
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.
Protocol Buffers - Google's data interchange format
A curated list of awesome things related to HarmonyOS, Huawei's operating system.