JCao jcao-ai

ML

40 followers · 6 following

Lepton.ai

Stars

bentoml / BentoDiffusion

BentoDiffusion: A collection of diffusion models served with BentoML

Python 331 25 Updated Aug 26, 2024

efeslab / Nanoflow

A throughput-oriented high-performance serving framework for LLMs

Cuda 568 23 Updated Sep 21, 2024

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 31,213 3,387 Updated Sep 21, 2024

grabowskiadrian / shopify-products-scraper

This is Shopify products Scraper. The script retrieves data from the products.json file of Shopify shop. Then, for each product, it makes an additional query to the product page to retrieve data fr…

Python 17 Updated Apr 8, 2024

IST-DASLab / marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 575 45 Updated Sep 4, 2024

openai / grok

Python 4,078 515 Updated Mar 19, 2024

mit-han-lab / distrifuser

[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Python 561 21 Updated Aug 17, 2024

NVIDIA / cutlass

CUDA Templates for Linear Algebra Subroutines

C++ 5,457 924 Updated Sep 25, 2024

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python 13,646 1,250 Updated Oct 6, 2024

leptonai / search_with_lepton

Building a quick conversation-based search demo with Lepton AI.

TypeScript 7,763 988 Updated Sep 18, 2024

punica-ai / punica

Serving multiple LoRA finetuned LLM as one

Python 964 45 Updated May 8, 2024

state-spaces / mamba

Mamba SSM architecture

Python 12,743 1,074 Updated Sep 26, 2024

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,339 936 Updated Oct 1, 2024

thakkarparth007 / copilot-explorer

Hacky repo to see what the Copilot extension sends to the server

JavaScript 624 71 Updated Apr 21, 2023

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 7,711 454 Updated May 3, 2024

FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,234 153 Updated Jun 25, 2024

InternLM / lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 4,344 390 Updated Sep 28, 2024

huggingface / trl

Train transformer language models with reinforcement learning.

Python 9,608 1,207 Updated Oct 5, 2024

allenai / FineGrainedRLHF

Python 251 21 Updated Nov 22, 2023

acheong08 / EdgeGPT

Reverse engineered API of Microsoft's Bing Chat AI

Python 8,077 910 Updated Aug 3, 2023

princeton-nlp / MeZO

[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333

Python 1,029 62 Updated Jan 11, 2024

jaymody / picoGPT

An unnecessarily tiny implementation of GPT-2 in NumPy.

Python 3,204 412 Updated Apr 24, 2023

tatsu-lab / stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,388 4,031 Updated Jul 17, 2024

meta-llama / llama

Inference code for Llama models

Python 55,851 9,512 Updated Aug 18, 2024

peng-zhihui / L-ink_Card

Smart NFC & ink-Display Card

C 7,315 1,798 Updated Jan 10, 2021

unlir / XDrive

Stepper motor with multi-function interface and closed-loop function.

C 1,240 444 Updated May 12, 2024

jcchurch13 / Mechaduino-Hardware

Mechaduino hardware design files. Project logs:

Eagle 345 124 Updated May 4, 2017

viktorvano / STM32-Bootloader

STM32 bootloader example that can jump to 2 apps.

C 251 66 Updated Jul 27, 2021

arttupii / CLSMB

Closed Loop Step Motor Controller

C++ 24 6 Updated May 28, 2022

Misfittech / nano_stepper

Stepper feedback controller

C++ 423 179 Updated Apr 27, 2024