Stars
WinoGrande: An Adversarial Winograd Schema Challenge at Scale
Arena-Hard-Auto: An automatic LLM benchmark.
A framework for the evaluation of autoregressive code generation language models.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Official code for "Large Language Models as Optimizers"
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
[ECCV 2024] 💐Official implementation of the paper "Diffusion Reward: Learning Rewards via Conditional Video Diffusion"
HIQL: Offline Goal-Conditioned RL with Latent States as Actions (NeurIPS 2023)
Graphormer is a general-purpose deep learning backbone for molecular modeling.
DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diver…
Using advances in generative modeling to learn reward functions from unlabeled videos.
Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://arxiv.org/abs/2305.18499
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
Mastering Diverse Domains through World Models
Package for working with hypernetworks in PyTorch.
Implementation of 🦩 Flamingo, the state-of-the-art few-shot visual question answering attention net from DeepMind, in PyTorch
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics.
A PyTorch implementation of Implicit Q-Learning