Boris Shaposhnikov borisshapa

🌸

AI researcher at T-Bank. ex @VKCOM. PhD student.

57 followers · 64 following

St Petersburg
https://t.me/borisshapa

Stars

CLAIRE-Labo / python-ml-research-template

A template for starting reproducible Python machine-learning projects with hardware acceleration. Find an example at https://github.com/CLAIRE-Labo/no-representation-no-trust

Shell 88 4 Updated Mar 3, 2025

corl-team / lime

Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity"

Python 26 1 Updated Feb 16, 2025

facebookresearch / coconut

Training Large Language Models to Reason in a Continuous Latent Space

Python 941 80 Updated Jan 24, 2025

enjeeneer / zero-shot-rl

VC-FB and MC-FB algorithms from "Zero-Shot Reinforcement Learning from Low Quality Data" (NeurIPS 2024)

Python 13 1 Updated Jan 14, 2025

EasyJailbreak / EasyJailbreak

An easy-to-use Python framework to generate adversarial jailbreak prompts.

Python 565 47 Updated Sep 2, 2024

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 4,399 409 Updated Mar 7, 2025

f / awesome-chatgpt-prompts

This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.

HTML 121,225 16,297 Updated Mar 3, 2025

zhaoyu-li / DL4TP

[COLM 2024] A Survey on Deep Learning for Theorem Proving

170 12 Updated Feb 10, 2025

uclaml / SPPO

The official implementation of Self-Play Preference Optimization (SPPO)

Python 494 47 Updated Jan 23, 2025

markovka17 / dla

Deep learning for audio processing

Jupyter Notebook 626 111 Updated Dec 27, 2024

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,543 365 Updated Mar 6, 2025

Cornell-RL / drpo

Dataset Reset Policy Optimization

Python 30 1 Updated Apr 12, 2024

GAIR-NLP / weak-to-strong-reasoning

Python 59 3 Updated Sep 2, 2024

linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training

Python 4,565 276 Updated Mar 7, 2025

maitrix-org / llm-reasoners

A library for advanced large language model reasoning

Python 2,008 178 Updated Feb 21, 2025

ruizheng20 / gpo

Code for the paper "Toward Optimal LLM Alignments Using Two-Player Games".

Python 16 1 Updated Jun 20, 2024

TianduoWang / DPO-ST

[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning

Python 41 5 Updated Jul 28, 2024

YuxiXie / MCTS-DPO

This is the repository that contains the source code for the Self-Evaluation Guided MCTS for online DPO.

Jupyter Notebook 291 29 Updated Aug 6, 2024

google-deepmind / mctx

Monte Carlo tree search in JAX

Python 2,433 198 Updated Dec 11, 2024

turbo-llm / turbo-alignment

Library for industrial alignment.

Python 384 29 Updated Mar 4, 2025

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 5,344 532 Updated Mar 6, 2025

RodkinIvan / associative-recurrent-memory-transformer

Forked from booydar/recurrent-memory-transformer

[ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluating

Jupyter Notebook 38 7 Updated Mar 2, 2025

vwxyzjn / summarize_from_feedback_details

RLHFlow / RLHF-Reward-Modeling

angie-chen55 / pref-learning-ranking-acc

jbloomAus / SAELens

fla-org / flash-linear-attention

young-geng / mintext

KindXiaoming / pykan

google-deepmind / penzai