A template for starting reproducible Python machine-learning projects with hardware acceleration. Find an example at https://github.com/CLAIRE-Labo/no-representation-no-trust

Shell 88 4 Updated Mar 3, 2025

Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity"

Python 26 1 Updated Feb 16, 2025

Training Large Language Models to Reason in a Continuous Latent Space

Python 941 80 Updated Jan 24, 2025

VC-FB and MC-FB algorithms from "Zero-Shot Reinforcement Learning from Low Quality Data" (NeurIPS 2024)

Python 13 1 Updated Jan 14, 2025

An easy-to-use Python framework to generate adversarial jailbreak prompts.

Python 565 47 Updated Sep 2, 2024

verl: Volcano Engine Reinforcement Learning for LLMs

Python 4,399 409 Updated Mar 7, 2025

A curated collection of prompts for using ChatGPT and other LLM tools more effectively.

HTML 121,225 16,297 Updated Mar 3, 2025

[COLM 2024] A Survey on Deep Learning for Theorem Proving

170 12 Updated Feb 10, 2025

The official implementation of Self-Play Preference Optimization (SPPO)

Python 494 47 Updated Jan 23, 2025

Deep learning for audio processing

Jupyter Notebook 626 111 Updated Dec 27, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,543 365 Updated Mar 6, 2025

Dataset Reset Policy Optimization

Python 30 1 Updated Apr 12, 2024

Efficient Triton Kernels for LLM Training

Python 4,565 276 Updated Mar 7, 2025

A library for advanced large language model reasoning

Python 2,008 178 Updated Feb 21, 2025

Code for the paper "Toward Optimal LLM Alignments Using Two-Player Games".

Python 16 1 Updated Jun 20, 2024

[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning

Python 41 5 Updated Jul 28, 2024

This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.

Jupyter Notebook 291 29 Updated Aug 6, 2024

Monte Carlo tree search in JAX

Python 2,433 198 Updated Dec 11, 2024

Library for industrial alignment.

Python 384 29 Updated Mar 4, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,344 532 Updated Mar 6, 2025

[ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluating

Jupyter Notebook 38 7 Updated Mar 2, 2025

Recipes to train reward model for RLHF.

Python 1,225 88 Updated Feb 9, 2025

Training Sparse Autoencoders on Language Models

Jupyter Notebook 648 147 Updated Feb 25, 2025

🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

Python 2,060 126 Updated Mar 5, 2025

Minimal but scalable implementation of large language models in JAX

Python 34 Updated Nov 2, 2024

Kolmogorov Arnold Networks

Jupyter Notebook 15,473 1,458 Updated Jan 19, 2025

A JAX research toolkit for building, editing, and visualizing neural networks.

Python 1,736 57 Updated Dec 16, 2024