🎯 Focusing
PhD in Reinforcement Learning, LLM Alignment, RLHF
- University of Cambridge
- https://holarissun.github.io/
- @HolarisSun
Highlights
- Pro
Pinned
- RewardModelingBeyondBradleyTerry (Public): Official implementation of the ICLR 2025 paper "Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives"
- RewardShifting (Public): Code for the NeurIPS 2022 paper "Exploiting Reward Shifting in Value-Based Deep RL"
- Prompt-OIRL (Public): Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
- embedding-based-llm-alignment (Public): Codebase for the paper "Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs"
478 contributions in the last year
Activity overview
Contributed to holarissun/RewardModelingBeyondBradleyTerry, holarissun/HandsOnTransformers, holarissun/embedding-based-llm-alignment, and 3 other repositories
Contribution activity
April 2025
Created 16 commits in 4 repositories
Created 1 repository
- holarissun/Inverse-RLignment (Apr 15)
2 contributions in private repositories (Apr 15)