Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.

Python 1,271 86 Updated Jan 30, 2025

An Aspiring Drop-In Replacement for NumPy at Scale

Python 823 79 Updated Jan 6, 2025

Code I wrote for my AI & LLM workshops

Jupyter Notebook 380 132 Updated Jan 24, 2025

Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by the Qwen team, Alibaba Cloud.

Python 4,077 296 Updated Jan 17, 2025
TypeScript 9,633 533 Updated Jan 30, 2025

🌈 React for interactive command-line apps

TypeScript 27,533 628 Updated Nov 29, 2024

🦄 Record your terminal and generate animated gif images or share a web player

JavaScript 15,481 503 Updated Aug 29, 2024

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream d…

Python 685 49 Updated Jan 29, 2025

Experimental projects related to TensorRT

MLIR 86 14 Updated Jan 31, 2025

📖 A curated list of Awesome LLM/VLM Inference Papers with code, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉

3,311 226 Updated Jan 24, 2025

Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models

Python 2,656 364 Updated Jan 7, 2025

An AI search engine inspired by Perplexity

TypeScript 1,325 200 Updated Jan 17, 2025

CUDA checkpoint and restore utility

C 277 15 Updated Jan 27, 2025

Checkpoint/Restore tool

C 3,069 617 Updated Jan 29, 2025

The visual editor for React

TypeScript 5,907 365 Updated Jan 30, 2025
Jupyter Notebook 22 4 Updated Mar 5, 2024

A framework for evaluating function calls made by LLMs

Python 36 4 Updated Jul 23, 2024

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

Python 11,721 1,037 Updated Jan 29, 2025
Jupyter Notebook 497 23 Updated Aug 23, 2024

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python 2,148 160 Updated Jan 30, 2025

Mamba SSM architecture

Python 13,860 1,195 Updated Jan 18, 2025

[ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia

JavaScript 157 14 Updated Jul 30, 2024

Contrastive Chain-of-Thought Prompting

Python 57 4 Updated Nov 18, 2023

Evaluate the accuracy of LLM generated outputs

Jupyter Notebook 590 63 Updated Jan 30, 2025

The Triton TensorRT-LLM Backend

Python 760 113 Updated Jan 30, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,270 1,086 Updated Jan 30, 2025

Fast and memory-efficient exact attention

Python 15,232 1,440 Updated Jan 30, 2025

GitHub Action for advanced repository traffic analysis and reporting

Python 327 41 Updated Oct 1, 2023