Stars
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
A library for advanced large language model reasoning
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
O1 Replication Journey: A Strategic Progress Report – Part I
Related papers for Continual Reinforcement Learning.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
[NeurIPS 2022] PerfectDou: Dominating DouDizhu with Perfect Information Distillation
[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | DouDizhu AI
Official repo for paper DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning.
This project aims to collect the latest "call for reviewers" links from various top CS/ML/AI conferences/journals
Proximal Policy Optimization Algorithm applied to MountainCar in discrete environment
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Isaac Gym Reinforcement Learning Environments
A concise and elegant dictionary and translation macOS app. Works out of the box, supports offline OCR, and supports Youdao Dictionary, 🍎 Apple System Dictionary, 🍎 Apple System Translation, OpenAI, Gemini, DeepL, Google, Bing, Tencent, Baidu, Alibaba, NiuTrans, Caiyun, and Volcano translation.
TPAMI2020 "Unsupervised Multi-Class Domain Adaptation: Theory, Algorithms, and Practice"
Code released for ICML 2019 paper "Bridging Theory and Algorithm for Domain Adaptation".
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks