NagisaZj

6 followers · 0 following

Stars

allenai / winogrande

WinoGrande: An Adversarial Winograd Schema Challenge at Scale

Python 91 10 Updated Mar 13, 2020

lmarena / arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.

Python 734 91 Updated Dec 29, 2024

bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.

Python 882 228 Updated Oct 31, 2024

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 4,534 447 Updated Feb 11, 2025

meta-llama / llama3

The official Meta Llama 3 GitHub site

Python 28,262 3,267 Updated Jan 26, 2025

openai / simple-evals

Python 2,298 202 Updated Feb 8, 2025

tml-epfl / icl-alignment

Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]

Python 29 3 Updated Jan 23, 2025

openai / evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Python 15,474 2,658 Updated Dec 18, 2024

WHU-ZQH / DUP

Python 14 1 Updated May 29, 2024

google-deepmind / opro

Official code for "Large Language Models as Optimizers"

Python 496 57 Updated Dec 4, 2024

real-stanford / universal_manipulation_interface

Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots

Python 787 147 Updated Dec 18, 2024

real-stanford / diffusion_policy

[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Python 1,992 377 Updated Dec 24, 2024

TEA-Lab / diffusion_reward

[ECCV 2024] 💐Official implementation of the paper "Diffusion Reward: Learning Rewards via Conditional Video Diffusion"

Python 90 8 Updated Jul 2, 2024

seohongpark / HIQL

HIQL: Offline Goal-Conditioned RL with Latent States as Actions (NeurIPS 2023)

Python 76 5 Updated Dec 1, 2024

microsoft / Graphormer

Graphormer is a general-purpose deep learning backbone for molecular modeling.

Python 2,206 345 Updated Jun 7, 2024

XuGW-Kevin / DrM

DrM, a visual RL algorithm, minimizes the dormant ratio to guide exploration-exploitation trade-offs, achieving significant improvements in sample efficiency and asymptotic performance across diver…

Python 70 11 Updated May 27, 2024

Alescontrela / viper_rl

Using advances in generative modeling to learn reward functions from unlabeled videos.

Jupyter Notebook 119 12 Updated Feb 12, 2024

thuml / ContextWM

Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://arxiv.org/abs/2305.18499

Python 58 3 Updated Sep 29, 2024

octo-models / octo

Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.

Python 1,045 189 Updated Jul 31, 2024

danijar / dreamerv3

Mastering Diverse Domains through World Models

Python 1,490 244 Updated Jan 30, 2025

chrhenning / hypnettorch

Package for working with hypernetworks in PyTorch.

Python 121 14 Updated Sep 7, 2023

lucidrains / flamingo-pytorch

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

Python 1,229 58 Updated Oct 18, 2022

Vision-CAIR / MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,554 2,925 Updated Sep 2, 2024

kzl / decision-transformer

Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Python 2,476 465 Updated Apr 29, 2024

matthias-weissenbacher / KFC

Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics.

Python 20 1 Updated Sep 11, 2023

chinganc / lightATAC

Python 2 1 Updated Jul 25, 2024

ethanluoyc / optimal_transport_reward

Python 14 4 Updated Apr 11, 2024

gwthomas / IQL-PyTorch

A PyTorch implementation of Implicit Q-Learning

Python 71 9 Updated Oct 23, 2021

pcchenxi / LAPO-offlienRL

Python 14 1 Updated Jun 5, 2023

ikostrikov / implicit_q_learning

Python 247 39 Updated Jan 23, 2022