SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2…

Python 14,169 1,444 Updated Jan 14, 2025

EQ-bench / EQ-Bench

A benchmark for emotional intelligence in large language models

Python 212 19 Updated Jul 26, 2024

normster / llm_rules

RuLES: a benchmark for evaluating rule-following in language models

Python 214 15 Updated Jan 14, 2025

zjunlp / LLMAgentPapers

Must-read Papers on LLM Agents.

2,033 111 Updated Nov 12, 2024

princeton-nlp / LLMBar

[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following

Python 120 8 Updated Jul 8, 2024

ionic-team / ionic-docs

HTML 590 3,068 Updated Jan 13, 2025

hkust-nlp / AgentBoard

An Analytical Evaluation Board of Multi-turn LLM Agents

SAS 270 26 Updated May 20, 2024

huggingface / nanotron

Minimalistic large language model 3D-parallelism training

Python 1,384 138 Updated Jan 14, 2025

nexusflowai / NexusRaven-V2

Jupyter Notebook 400 32 Updated Feb 13, 2024

Junjie-Ye / RoTBench

[EMNLP 2024] RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning

Python 12 Updated Sep 20, 2024

facebookresearch / cruxeval

CRUXEval: Code Reasoning, Understanding, and Execution Evaluation

Python 119 14 Updated Oct 11, 2024

ml-explore / mlx

MLX: An array framework for Apple silicon

C++ 18,310 1,056 Updated Jan 14, 2025

FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,387 166 Updated Jun 25, 2024

apple / axlearn

An Extensible Deep Learning Library

Python 1,920 280 Updated Jan 14, 2025

dabit3 / openai-functions-god-app

TypeScript 240 33 Updated Jul 5, 2023

instructor-ai / instructor

Structured outputs for LLMs

Python 8,879 701 Updated Jan 14, 2025

developersdigest / function-chain

The FunctionChain is a tool that simplifies and organizes the process of invoking OpenAI functions in your Node.js applications. With this toolkit, you can easily scaffold out and isolate all the O…

JavaScript 55 11 Updated Jul 10, 2023

JohannLai / openai-function-calling-tools

🛠 OpenAI function-calling tools for JS/TS

TypeScript 289 26 Updated Jan 30, 2024

abetlen / ggml-python

Python bindings for ggml

Python 136 11 Updated Sep 2, 2024

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 33,733 5,156 Updated Jan 14, 2025

ctlllll / LLM-ToolMaker

Jupyter Notebook 1,025 100 Updated May 29, 2023

OpenNMT / CTranslate2

Fast inference engine for Transformer models

C++ 3,523 310 Updated Dec 18, 2024


Guoli Yin gyin94
