SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2…

Python 14,293 1,454 Updated Jan 27, 2025

EQ-bench / EQ-Bench

A benchmark for emotional intelligence in large language models

Python 216 19 Updated Jul 26, 2024

normster / llm_rules

RuLES: a benchmark for evaluating rule-following in language models

Python 214 15 Updated Jan 22, 2025

zjunlp / LLMAgentPapers

Must-read Papers on LLM Agents.

2,064 117 Updated Nov 12, 2024

princeton-nlp / LLMBar

[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following

Python 119 8 Updated Jul 8, 2024

ionic-team / ionic-docs

HTML 592 3,073 Updated Jan 27, 2025

hkust-nlp / AgentBoard

An Analytical Evaluation Board of Multi-turn LLM Agents

SAS 272 28 Updated May 20, 2024

huggingface / nanotron

Minimalistic large language model 3D-parallelism training

Python 1,406 140 Updated Jan 27, 2025

nexusflowai / NexusRaven-V2

Jupyter Notebook 400 32 Updated Feb 13, 2024

Junjie-Ye / RoTBench

[EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning

Python 13 Updated Sep 20, 2024

facebookresearch / cruxeval

CRUXEval: Code Reasoning, Understanding, and Execution Evaluation

Python 122 14 Updated Oct 11, 2024

ml-explore / mlx

MLX: An array framework for Apple silicon

C++ 18,607 1,070 Updated Jan 28, 2025

FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,393 161 Updated Jun 25, 2024

apple / axlearn

An Extensible Deep Learning Library

Python 1,927 281 Updated Jan 28, 2025

dabit3 / openai-functions-god-app

TypeScript 240 33 Updated Jul 5, 2023

instructor-ai / instructor

structured outputs for llms

Python 9,123 713 Updated Jan 27, 2025

developersdigest / function-chain

The FunctionChain is a tool that simplifies and organizes the process of invoking OpenAI functions in your Node.js applications. With this toolkit, you can easily scaffold out and isolate all the O…

JavaScript 55 10 Updated Jul 10, 2023

JohannLai / openai-function-calling-tools

🛠 openai function calling tools for JS/TS

TypeScript 290 26 Updated Jan 30, 2024

abetlen / ggml-python

Python bindings for ggml

Python 136 11 Updated Sep 2, 2024

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 35,143 5,347 Updated Jan 28, 2025

ctlllll / LLM-ToolMaker

Jupyter Notebook 1,025 100 Updated May 29, 2023

OpenNMT / CTranslate2

Fast inference engine for Transformer models

C++ 3,548 314 Updated Dec 18, 2024

Guoli Yin gyin94
