
Training Large Language Model to Reason in a Continuous Latent Space

Python 370 16 Updated Jan 16, 2025

Minimalistic 4D-parallelism distributed training framework for educational purposes

Python 644 45 Updated Jan 16, 2025
262 6 Updated Sep 29, 2024

A library for mechanistic interpretability of GPT-style language models

Python 1,749 317 Updated Jan 16, 2025

Infinity ∞ : Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Python 868 31 Updated Jan 12, 2025

Official Code Repository for LM-Steer Paper: "Word Embeddings Are Steers for Language Models" (ACL 2024 Outstanding Paper Award)

Python 73 12 Updated Oct 1, 2024

A Telegram bot to recommend arXiv papers

Python 220 15 Updated Jan 8, 2025

DSPy: The framework for programming—not prompting—language models

Python 21,056 1,590 Updated Jan 14, 2025

[ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"

Python 377 19 Updated Oct 16, 2024

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

Python 634 61 Updated Jun 1, 2024

Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓

2,274 129 Updated Dec 17, 2024

Fast low-bit matmul kernels in Triton

Python 187 15 Updated Jan 7, 2025

Code used in Novy-Marx and Velikov (2024), AI-Powered (Finance) Scholarship

Python 25 10 Updated Jan 8, 2025

🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch

Python 2,082 52 Updated Nov 27, 2024

Recipes to scale inference-time compute of open models

Python 932 82 Updated Jan 16, 2025

My learning notes/codes for ML SYS.

Python 366 13 Updated Jan 14, 2025

Code for BLT research paper

Python 1,315 95 Updated Jan 14, 2025

The HELMET Benchmark

Python 103 13 Updated Jan 14, 2025
Python 3 Updated Dec 10, 2024

Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch

Python 145 8 Updated Dec 31, 2024
Python 52 1 Updated Jan 10, 2025

A library for advanced large language model reasoning

Python 1,661 145 Updated Jan 14, 2025

Optimisers.jl defines many standard optimisers and utilities for learning loops.

Julia 82 24 Updated Jan 7, 2025

Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead

Python 210 6 Updated Jan 4, 2025

APOLLO: SGD-like Memory, AdamW-level Performance

Python 81 2 Updated Jan 3, 2025
Python 404 40 Updated Jul 19, 2024

Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure (NeurIPS 2024) + Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count (a…

Python 9 Updated Dec 27, 2024

Stochastic Automatic Differentiation library for PyTorch.

Python 195 5 Updated Aug 30, 2024