jcao-ai

JCao jcao-ai

46 followers · 6 following

Lepton.ai

Achievements

x3 x2

Stars

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 4,419 441 Updated Feb 8, 2025

hkust-nlp / simpleRL-reason

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 2,281 171 Updated Feb 7, 2025

Jiayi-Pan / TinyZero

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 9,170 1,194 Updated Feb 1, 2025

ggarber / rt-llm-proxy

Real Time (WebRTC & WebTransport) Proxy for LLM WebSocket APIs

Python 23 2 Updated Jan 17, 2025

Ind1x1 / GPUdirect_rdma_Access

This repository is based on Mellanox/gpu_direct_rdma_access. Some errors in the code have been fixed, some methods have been optimized, and some features have been added

C 2 1 Updated Sep 5, 2024

x42005e1f / aiologic

GIL-powered* locking library for Python

Python 20 2 Updated Feb 8, 2025

bentoml / BentoDiffusion

BentoDiffusion: A collection of diffusion models served with BentoML

Python 348 25 Updated Dec 23, 2024

efeslab / Nanoflow

A throughput-oriented high-performance serving framework for LLMs

Cuda 724 28 Updated Sep 21, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 34,178 3,697 Updated Jan 25, 2025

grabowskiadrian / shopify-products-scraper

This is a Shopify products scraper. The script retrieves data from the products.json file of a Shopify shop. Then, for each product, it makes an additional query to the product page to retrieve data fr…

Python 18 3 Updated Nov 24, 2024

IST-DASLab / marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 697 56 Updated Sep 4, 2024

openai / grok

Python 4,104 544 Updated Mar 19, 2024

mit-han-lab / distrifuser

[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Python 652 29 Updated Dec 2, 2024

NVIDIA / cutlass

CUDA Templates for Linear Algebra Subroutines

C++ 6,168 1,057 Updated Feb 7, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 15,359 1,446 Updated Feb 8, 2025

leptonai / search_with_lepton

Building a quick conversation-based search demo with Lepton AI.

TypeScript 7,971 1,018 Updated Jan 14, 2025

punica-ai / punica

Serving multiple LoRA-finetuned LLMs as one

Python 1,022 47 Updated May 8, 2024

state-spaces / mamba

Mamba SSM architecture

Python 13,906 1,200 Updated Jan 18, 2025

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,350 1,091 Updated Feb 8, 2025

thakkarparth007 / copilot-explorer

Hacky repo to see what the Copilot extension sends to the server

JavaScript 656 73 Updated Apr 21, 2023

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,174 499 Updated May 3, 2024

FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,405 163 Updated Jun 25, 2024

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,491 481 Updated Feb 7, 2025

huggingface / trl

Train transformer language models with reinforcement learning.

Python 11,279 1,507 Updated Feb 8, 2025

allenai / FineGrainedRLHF

Python 267 21 Updated Jan 6, 2025

acheong08 / EdgeGPT

Reverse engineered API of Microsoft's Bing Chat AI

Python 8,033 904 Updated Aug 3, 2023

princeton-nlp / MeZO

[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333

Python 1,078 66 Updated Jan 11, 2024

jaymody / picoGPT

An unnecessarily tiny implementation of GPT-2 in NumPy.

Python 3,305 428 Updated Apr 24, 2023

tatsu-lab / stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,790 4,058 Updated Jul 17, 2024

meta-llama / llama

Inference code for Llama models

Python 57,530 9,693 Updated Jan 26, 2025
