  • Moreh, Inc.
  • Ha Noi
  • 20:13 (UTC +07:00)
  • LinkedIn in/ducviet00

Starred repositories

Optical character recognition for Japanese text, with the main focus being Japanese manga

Python 1,979 96 Updated Jan 1, 2025

Memray is a memory profiler for Python

Python 13,787 402 Updated Mar 17, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 1,304 94 Updated Mar 13, 2025

Load compute kernels from the Hub

Python 99 4 Updated Mar 21, 2025

Two conversational AI agents switching from English to sound-level protocol after confirming they are both AI agents

TypeScript 4,016 326 Updated Mar 12, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,054 518 Updated Mar 16, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,684 101 Updated Mar 7, 2025

[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

Cuda 1,011 67 Updated Mar 21, 2025

dotfiles

Lua 12 Updated Nov 5, 2024

macOS system monitor in your menu bar

Swift 30,050 956 Updated Mar 22, 2025

🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

Python 2,143 135 Updated Mar 22, 2025

WebAssembly binding for llama.cpp - Enabling on-browser LLM inference

TypeScript 631 35 Updated Mar 13, 2025

Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels

C++ 758 57 Updated Mar 22, 2025

Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.

243 14 Updated Mar 3, 2025

The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention

Python 2,383 173 Updated Mar 18, 2025

A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.

Python 308 20 Updated Mar 10, 2025

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 3,754 226 Updated Mar 21, 2025

The ML4W Dotfiles for Hyprland - An advanced and full-featured configuration for the dynamic tiling window manager Hyprland including an easy to use installation script for Arch and Fedora based Li…

Shell 2,139 182 Updated Mar 18, 2025

LLM KV cache compression made easy

Python 441 31 Updated Mar 19, 2025

MLX: An array framework for Apple silicon

C++ 19,713 1,124 Updated Mar 22, 2025

👻 Ghostty is a fast, feature-rich, and cross-platform terminal emulator that uses platform-native UI and GPU acceleration.

Zig 28,599 744 Updated Mar 21, 2025

AI-powered tools to enhance Anki flashcards with explanations, mnemonics, illustrations, and adaptive learning for medical school and beyond

Python 712 23 Updated Feb 14, 2025

Python tool for converting files and office documents to Markdown.

Python 41,109 1,942 Updated Mar 21, 2025

Nvidia Instruction Set Specification Generator

Python 253 11 Updated Jul 9, 2024

CUDA/Metal accelerated language model inference

C 529 23 Updated Mar 9, 2025

Cuda 28 5 Updated Jan 6, 2025

Python 29 1 Updated Dec 2, 2024

Fastest kernels written from scratch

Cuda 199 29 Updated Mar 7, 2025

Tutorials on tinygrad

356 26 Updated Feb 26, 2025