Xinzhe Li
Our GitHub repository follows the selection criteria below:
- Allowing Coherent Understanding: Papers can be systematically categorized into the unified framework in my survey, according to their use of LLM-Profiled Roles (LMPRs).
- A general survey (Accepted at COLING 2025): A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning
- High Quality: Papers are published at ICML, ICLR, NeurIPS, *ACL (including EMNLP), or COLING, or are unpublished papers that contain useful analysis and insightful novelty.
- Unpublished papers are marked with 💡 and will be updated upon publication. ⭐️ STAR this repo to stay updated!
- Paper Reviews: Links to OpenReview are always given when available. I often learn a great deal from, and resonate with, the reviews, and I use them to evaluate some rejected papers. (That's why I particularly like NeurIPS/ICLR papers.)
- Exhaustive Review of Search Workflows
- A corresponding survey: A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks (an updated version will be released on 6 Mar 2025)
Other GitHub repositories summarize related papers with less constrained selection criteria:
- AGI-Edgerunners/LLM-Agents-Papers
- zjunlp/LLMAgentPapers
- Paitesanshi/LLM-Agent-Survey
- woooodyy/llm-agent-paper-list
- Autonomous-Agents
Other GitHub repositories summarize related papers, focusing on specific perspectives:
- nuster1128/LLM_Agent_Memory_Survey: Focus on memory
- teacherpeterpan/self-correction-llm-papers: Focus on feedback learning (Self Correction)
- git-disl/awesome-LLM-game-agent-papers: Focus on gaming applications
- Surveys
- Tool Use
- Planning
- Feedback Learning
- Composition
- World Modeling
- Benchmarks
- Citation
- A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning, COLING 2025 [paper]
- A Survey on Large Language Model based Autonomous Agents, Frontiers of Computer Science 2024 [paper] | [code]
- Augmented Language Models: a Survey, TMLR [paper]
- Understanding the planning of LLM agents: A survey, arXiv [paper] 💡
- The Rise and Potential of Large Language Model Based Agents: A Survey, arXiv [paper] 💡
- A Survey on the Memory Mechanism of Large Language Model based Agents, arXiv [paper] 💡
- ReAct: Synergizing Reasoning and Acting in Language Models, ICLR 2023 [paper]
- Toolformer: Language Models Can Teach Themselves to Use Tools, NeurIPS 2023 [paper]
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023 [paper]
- API-Bank: A Benchmark for Tool-Augmented LLMs, EMNLP 2023 [paper]
- ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings, NeurIPS 2023 [paper]
- MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting, ACL 2023 [paper]
- ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models, EMNLP 2023 [paper]
- ART: Automatic multi-step reasoning and tool-use for large language models, arXiv:2303.09014 [paper] 💡
- TALM: Tool Augmented Language Models, arXiv:2205.12255 [paper] 💡
- On the Tool Manipulation Capability of Open-source Large Language Models, arXiv:2305.16504 [paper] 💡
- Large Language Models as Tool Makers, arXiv:2305.17126 [paper] 💡
- GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution, arXiv:2307.08775 [paper] 💡
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs, arXiv:2307.16789 [paper] 💡
- Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models, arXiv:2308.00675 [paper] 💡
- MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback, arXiv:2309.10691 [paper] 💡
- Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning, arXiv:2309.10814 [paper] 💡
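Most of the tool-use papers above share an interleaved reason-act-observe loop in which the LLM decides when to call a tool and the tool's output is fed back into the context. The sketch below is a minimal, hypothetical illustration of such a loop (ReAct-style); `llm()` and the entries in `TOOLS` are placeholders for a real LLM API and real tools, not code from any of the papers.

```python
# Minimal, illustrative ReAct-style tool-use loop (a sketch, not any paper's code).
# Assumptions: llm() wraps a real LLM API and returns the next Thought/Action or a
# "Final Answer:"; the model formats tool calls as `Action: tool[argument]`.
import re

def llm(prompt: str) -> str:
    """Placeholder for a real LLM call that continues the trace."""
    raise NotImplementedError

TOOLS = {
    "calculator": lambda expr: str(eval(expr)),            # toy arithmetic tool
    "search": lambda query: f"(top result for: {query})",  # stub retriever
}

def react_agent(question: str, max_steps: int = 5) -> str:
    trace = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(trace)                      # model emits a Thought and an Action
        trace += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[-1].strip()
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match:
            tool, arg = match.group(1), match.group(2)
            observation = TOOLS.get(tool, lambda _: "unknown tool")(arg)
            trace += f"Observation: {observation}\n"       # feed tool output back in
    return trace                               # give up after max_steps
```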
- On the Planning Abilities of Large Language Models -- A Critical Investigation, NeurIPS 2023 [paper]
Details are provided on a dedicated page (to be published soon).
- Alphazero-like Tree-Search can guide large language model decoding and training, ICML 2024 [paper]
- Search Algorithm: MCTS
- Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models, ICML 2024 [paper]
- Search Algorithm: MCTS
- When is Tree Search Useful for LLM Planning? It Depends on the Discriminator, ACL 2024 [paper]
- Search Algorithm: MCTS
- Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation, ACL findings 2024 [paper]
- Search Algorithm: MCTS
- Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs, ACL 2024 [paper]
- Search Algorithm: BFS/DFS
- LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning, EMNLP findings 2024 [paper] | [code]
- Search Algorithm: A*
- LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models, COLM 2024 [paper] | [code]
- Large Language Model Guided Tree-of-Thought, arXiv:2305.08291 [paper] 💡
- Tree Search for Language Model Agents, Under Review [paper] 💡
- Search Algorithm: Best-First Search
- Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning, Under Review [paper] 💡
- Search Algorithm: A*
- Planning with Large Language Models for Code Generation, ICLR 2023 [paper]
- Search Algorithm: MCTS
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models, NeurIPS 2023 [paper]
- Search Algorithm: BFS/DFS
- LLM-MCTS: Large Language Models as Commonsense Knowledge for Large-Scale Task Planning, NeurIPS 2023 [paper] | [code]
- Search Algorithm: MCTS
- Self-Evaluation Guided Beam Search for Reasoning, NeurIPS 2023 [paper]
- Search Algorithm: BFS/DFS
- PathFinder: Guided Search over Multi-Step Reasoning Paths, NeurIPS 2023 R0-FoMo [paper]
- Search Algorithm: Beam Search
- Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts, EMNLP 2023 [paper]
- RAP: Reasoning with Language Model is Planning with World Model, EMNLP 2023 [paper]
- Search Algorithm: MCTS
- Prompt-Based Monte-Carlo Tree Search for Goal-oriented Dialogue Policy Planning, EMNLP 2023 [paper]
- Search Algorithm: MCTS
- Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design, EMNLP findings 2023 [paper]
- Search Algorithm: MCTS
- Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents, arXiv:2408.07199 [paper] 💡
- Search Algorithm: MCTS
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023 [paper] | [code]
- Least-to-Most Prompting Enables Complex Reasoning in Large Language Models, NeurIPS 2023 [paper]
- Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning, NeurIPS 2023 [paper] | [code]
- On the Planning Abilities of Large Language Models - A Critical Investigation, NeurIPS 2023 [paper] | [code]
- PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change, NeurIPS 2023 [paper] | [code]
- LLM+P: Empowering Large Language Models with Optimal Planning Proficiency, arXiv:2304.11477 [paper] 💡
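A common pattern across the search-based papers above is to profile one LLM call as a thought generator and another as an evaluator, then run a classical search algorithm (MCTS, BFS/DFS, A*, beam search) over the generated thoughts. The sketch below shows a minimal beam-limited BFS in the spirit of Tree of Thoughts; `propose()` and `score()` are hypothetical LLM wrappers, and the actual algorithms and reward signals vary by paper.

```python
# Minimal, illustrative BFS over LLM-proposed thoughts (Tree-of-Thoughts-style sketch).
# Assumptions: propose() and score() wrap LLM calls profiled as generator and evaluator.
from typing import List

def propose(state: str, k: int = 3) -> List[str]:
    """Placeholder: ask the LLM for k candidate next thoughts given a partial solution."""
    raise NotImplementedError

def score(state: str) -> float:
    """Placeholder: ask the LLM (as evaluator) how promising a partial solution is."""
    raise NotImplementedError

def tot_bfs(question: str, depth: int = 3, beam: int = 5) -> str:
    frontier = [question]
    for _ in range(depth):
        # expand every partial solution with new thoughts, then keep the best `beam`
        candidates = [s + "\n" + t for s in frontier for t in propose(s)]
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return frontier[0]  # highest-scored reasoning path found within the depth budget
```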
- Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023 [paper]
- Self-Refine: Iterative Refinement with Self-Feedback, NeurIPS 2023 [paper]
- SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning, ICLR 2024 [paper] | [code]
- Learning From Correctness Without Prompting Makes LLM Efficient Reasoner, COLM 2024 [paper]
- Learning From Mistakes Makes LLM Better Reasoner, arXiv [paper] | [code] 💡
- LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback, ACL 2024 [paper]
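The feedback-learning papers above largely share a generate-critique-refine loop, where feedback comes from the LLM itself, a tool, or the environment. Below is a minimal, hypothetical sketch of such a loop (in the spirit of Self-Refine); the three prompt wrappers are placeholders rather than any paper's released code.

```python
# Minimal, illustrative generate-critique-refine loop (a sketch, not released code).
# Assumptions: the three functions wrap LLM calls with different prompts, i.e.
# LLM-profiled generator, feedback provider, and refiner.
def generate(task: str) -> str:
    raise NotImplementedError  # placeholder: initial draft from the LLM

def critique(task: str, draft: str) -> str:
    raise NotImplementedError  # placeholder: LLM-written feedback on the draft

def refine(task: str, draft: str, feedback: str) -> str:
    raise NotImplementedError  # placeholder: revised draft conditioned on the feedback

def self_refine(task: str, max_rounds: int = 3) -> str:
    draft = generate(task)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if "no issues" in feedback.lower():  # naive stopping criterion for the sketch
            break
        draft = refine(task, draft, feedback)
    return draft
```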
- AdaPlanner: Adaptive Planning from Feedback with Language Models, NeurIPS 2023 [paper]
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing, ICLR 2024 [paper]
- ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning, arXiv:2308.13724 [paper] 💡
- ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search, ICLR 2024 [paper]
- TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents, FMDM @ NeurIPS 2023 [paper]
- TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems, LLMAgents @ ICLR 2024 [paper]
- Can Language Models Serve as Text-Based World Simulators?, ACL 2024 [paper] | [code]
- Making Large Language Models into World Models with Precondition and Effect Knowledge, arXiv [paper] 💡
- Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning, NeurIPS 2023 [paper] | [code]
- ByteSized32: A Corpus and Challenge Task for Generating Task-Specific World Models Expressed as Text Games, EMNLP 2023 [paper] | [code]
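In the world-modeling papers above, an LLM is profiled as a (text-based) world model that predicts the next state from the current state and an action, so that candidate plans can be simulated before execution. A minimal, hypothetical sketch of that interface (not taken from any of the papers):

```python
# Minimal, illustrative LLM-as-world-model interface (a sketch under assumed prompts).
def llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for a real LLM API call

def predict_next_state(state: str, action: str) -> str:
    """Ask the LLM-profiled world model to simulate one environment step."""
    prompt = (
        "You are simulating a text-based environment.\n"
        f"Current state: {state}\n"
        f"Action taken: {action}\n"
        "Describe the resulting state:"
    )
    return llm(prompt)

def rollout(initial_state: str, plan: list[str]) -> list[str]:
    """Simulate a candidate plan step by step before executing it for real."""
    states = [initial_state]
    for action in plan:
        states.append(predict_next_state(states[-1], action))
    return states
```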
- MetaTool Benchmark: Deciding Whether to Use Tools and Which to Use, arXiv:2310.03128 [paper] 💡
- TaskBench: Benchmarking Large Language Models for Task Automation, arXiv:2311.18760 [paper] 💡
- Large Language Models Still Can't Plan (A Benchmark for LLMs on Planning and Reasoning about Change), NeurIPS 2023 [paper]
If you find our work helpful, you can cite the following papers:
@inproceedings{li2024review,
  title={A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning},
  author={Li, Xinzhe},
  booktitle={Proceedings of the 31st International Conference on Computational Linguistics},
  year={2025}
}

@article{li2025survey,
  title={A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks},
  author={Li, Xinzhe},
  journal={arXiv preprint arXiv:2501.10069},
  year={2025}
}