Stars
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
Fully open reproduction of DeepSeek-R1
Robust recipes to align language models with human and AI preferences
Curated list of datasets and tools for post-training.
WildEval / ZeroEval
Forked from allenai/WildBench
A simple unified framework for evaluating LLMs
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
A framework for few-shot evaluation of language models.
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
Ongoing research training transformer models at scale
High-performance, multi-platform VNC client and server
An educational resource to help anyone learn deep reinforcement learning.
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
800,000 step-level correctness labels on LLM solutions to MATH problems
A library for benchmarking the long-term memory and continual-learning capabilities of LLM-based agents, with all the tests and code you need to evaluate your own agents. See more in the blog post:
Build context-aware reasoning applications
Training Sparse Autoencoders on Language Models
shehper / scaling_laws
Forked from karpathy/nanoGPT
An open-source implementation of Scaling Laws for Neural Language Models using nanoGPT
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
A library for mechanistic interpretability of GPT-style language models
Modeling, training, eval, and inference code for OLMo
Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024)
Everything about the SmolLM2 and SmolVLM family of models
Minimalistic large language model 3D-parallelism training