kouroshHakha

  • ray Public

    Forked from ray-project/ray

    An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyp…

    Python Apache License 2.0 Updated Feb 22, 2025
  • Sky-T1: Train your own O1 preview model within $450

    Python Apache License 2.0 Updated Feb 13, 2025
  • verl Public

    Forked from volcengine/verl

    veRL: Volcano Engine Reinforcement Learning for LLM

    Python Apache License 2.0 Updated Feb 10, 2025
  • Python Updated Feb 10, 2025
  • OpenRLHF Public

    Forked from OpenRLHF/OpenRLHF

    An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

    Python Apache License 2.0 Updated Feb 4, 2025
  • A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

    Python MIT License Updated Jan 26, 2025
  • TinyZero Public

    Forked from Jiayi-Pan/TinyZero

    Clean, accessible reproduction of DeepSeek R1-Zero

    Python Apache License 2.0 Updated Jan 26, 2025
  • Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

    Python Apache License 2.0 Updated Dec 31, 2024
  • ⚡ Building applications with LLMs through composability ⚡

    Python MIT License Updated Jun 22, 2023
  • jumbo Public

    Python 2 2 Other Updated May 5, 2023
  • SCSS Updated Apr 12, 2023
  • Python 12 3 BSD 3-Clause "New" or "Revised" License Updated Jan 7, 2023
  • fist Public

    Python 10 4 BSD 3-Clause "New" or "Revised" License Updated Aug 2, 2022
  • d3rlpy Public

    Forked from takuseno/d3rlpy

    An offline deep reinforcement learning library

    Python MIT License Updated May 20, 2022
  • Shell 16 7 BSD 3-Clause "New" or "Revised" License Updated Mar 29, 2022
  • genetic and neural net optimization for circuit design

    Python 18 9 Apache License 2.0 Updated Mar 29, 2022
  • Python 1 Updated Mar 29, 2022
  • Python Updated Mar 21, 2022
  • bb_envs Public

    Python 1 1 Updated Mar 17, 2022
  • Python 8 4 BSD 3-Clause "New" or "Revised" License Updated Mar 11, 2022
  • PyTorch implementation of Soft Actor-Critic (SAC)

    Jupyter Notebook 1 MIT License Updated Mar 9, 2022
  • PyTorch implementation of Neural Processes for functions and images 🎆

    Jupyter Notebook MIT License Updated Mar 9, 2022
  • MLutils Public

    Python 1 Updated Jan 26, 2022
  • Keeping track of RL experiments

    Apache License 2.0 Updated Sep 7, 2021
  • d4rl Public

    Forked from kpertsch/d4rl

    A benchmark for offline reinforcement learning.

    Python Apache License 2.0 Updated Apr 14, 2021
  • spirl Public

    Forked from clvrai/spirl

    Official implementation of "Accelerating Reinforcement Learning with Learned Skill Priors", Pertsch et al., CoRL 2020

    Python 1 Updated Apr 1, 2021
  • editsql Public

    Forked from ryanzhumich/editsql
    Python MIT License Updated Feb 15, 2021
  • RoBO Public

    Forked from automl/RoBO

    RoBO: a Robust Bayesian Optimization framework

    Python BSD 3-Clause "New" or "Revised" License Updated Dec 30, 2020
  • NASBench: A Neural Architecture Search Dataset and Benchmark

    Python Apache License 2.0 Updated Nov 9, 2020
  • hiro Public

    Python Updated Nov 4, 2020