matichon-vultureprime

Organizations

@vultureprime

30 results for source starred repositories

Make PyTorch models up to 40% faster! Thunder is a source to source compiler for PyTorch. It enables using different hardware executors at once; across one or thousands of GPUs.

Python 1,278 86 Updated Feb 7, 2025

An Aspiring Drop-In Replacement for NumPy at Scale

Python 825 79 Updated Jan 6, 2025

Code I wrote for my AI & LLM workshops

Jupyter Notebook 382 135 Updated Feb 7, 2025

Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.

Python 4,354 343 Updated Jan 17, 2025
TypeScript 9,854 546 Updated Feb 7, 2025

🌈 React for interactive command-line apps

TypeScript 27,571 629 Updated Nov 29, 2024

🦄 Record your terminal and generate animated gif images or share a web player

JavaScript 15,489 503 Updated Aug 29, 2024

TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream d…

Python 692 50 Updated Jan 31, 2025

Experimental projects related to TensorRT

MLIR 88 14 Updated Feb 6, 2025

📖 A curated list of Awesome LLM/VLM Inference Papers with codes: WINT8/4, Flash-Attention, Paged-Attention, Parallelism, etc. 🎉🎉

3,366 230 Updated Jan 31, 2025

Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models

Python 2,660 365 Updated Jan 7, 2025

An AI search engine inspired by Perplexity

TypeScript 1,343 207 Updated Jan 17, 2025

CUDA checkpoint and restore utility

C 286 15 Updated Jan 27, 2025

Checkpoint/Restore tool

C 3,078 623 Updated Feb 4, 2025

The visual editor for React

TypeScript 6,055 373 Updated Feb 3, 2025
Jupyter Notebook 22 5 Updated Mar 5, 2024

A framework for evaluating function calls made by LLMs

Python 36 4 Updated Jul 23, 2024

Gorilla: Training and Evaluating LLMs for Function Calls (Tool Calls)

Python 11,753 1,037 Updated Feb 6, 2025
Jupyter Notebook 499 23 Updated Aug 23, 2024

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Python 2,287 165 Updated Feb 4, 2025

Mamba SSM architecture

Python 13,901 1,200 Updated Jan 18, 2025

[ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia

JavaScript 158 14 Updated Jul 30, 2024

Contrastive Chain-of-Thought Prompting

Python 57 4 Updated Nov 18, 2023

Evaluate the accuracy of LLM generated outputs

Jupyter Notebook 602 65 Updated Feb 2, 2025

The Triton TensorRT-LLM Backend

Python 765 113 Updated Feb 7, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,328 1,088 Updated Feb 7, 2025

Fast and memory-efficient exact attention

Python 15,328 1,443 Updated Feb 4, 2025

GitHub Action for advanced repository traffic analysis and reporting

Python 328 41 Updated Oct 1, 2023