Xinzhe Li
Our GitHub repository follows the selection criteria below:
- Allowing Coherent Understanding: Papers can be systematically categorized into the unified framework in my survey, according to their use of LLM-Profiled Roles (LMPRs).
- A general survey (Accepted at COLING 2025): A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning
- High Quality: Papers are published at ICML, ICLR, NeurIPS, *ACL (including EMNLP), or COLING, or are unpublished papers that contain useful analysis and insightful novelty.
- Unpublished papers are marked with 💡 and will be updated upon publication. ⭐️ STAR this repo to stay updated!
- Paper Reviews: Links to OpenReview are always given when available. I often learn a great deal from, and resonate with, the reviews, and I use them to evaluate some rejected papers. (That's why I particularly like NeurIPS/ICLR papers.)
- Exhaustive Review of Search Workflows
- A corresponding survey: A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks (an updated version will be released on 6 Mar 2025)
Other GitHub repositories summarize related papers with less constrained selection criteria:
- AGI-Edgerunners/LLM-Agents-Papers
- zjunlp/LLMAgentPapers
- Paitesanshi/LLM-Agent-Survey
- woooodyy/llm-agent-paper-list
- Autonomous-Agents
Other GitHub repositories summarize related papers, focusing on specific perspectives:
- nuster1128/LLM_Agent_Memory_Survey: Focus on memory
- teacherpeterpan/self-correction-llm-papers: Focus on feedback learning (Self Correction)
- git-disl/awesome-LLM-game-agent-papers: Focus on gaming applications
- Surveys
- Tool Use
- Planning
- Feedback Learning
- Composition
- World Modeling
- Benchmarks
- Citation
- A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning, COLING 2025 [paper]
- A Survey on Large Language Model based Autonomous Agents, Frontiers of Computer Science 2024 [paper] | [code]
- Augmented Language Models: a Survey, TMLR [paper]
- Understanding the planning of LLM agents: A survey, arXiv [paper] 💡
- The Rise and Potential of Large Language Model Based Agents: A Survey, arXiv [paper] 💡
- A Survey on the Memory Mechanism of Large Language Model based Agents, arXiv [paper] 💡
- ReAct: Synergizing Reasoning and Acting in Language Models, ICLR 2023 [paper]
- Toolformer: Language Models Can Teach Themselves to Use Tools, NeurIPS 2023 [paper]
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023 [paper]
- API-Bank: A Benchmark for Tool-Augmented LLMs, EMNLP 2023 [paper]
- ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings, NeurIPS 2023 [paper]
- MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting, ACL 2023 [paper]
- ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models, EMNLP 2023 [paper]
- ART: Automatic multi-step reasoning and tool-use for large language models, arXiv:2303.09014 [paper] 💡
- TALM: Tool Augmented Language Models, arXiv:2205.12255 [paper] 💡
- On the Tool Manipulation Capability of Open-source Large Language Models, arXiv:2305.16504 [paper] 💡
- Large Language Models as Tool Makers, arXiv:2305.17126 [paper] 💡
- GEAR: Augmenting Language Models with Generalizable and Efficient Tool Resolution, arXiv:2307.08775 [paper] 💡
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs, arXiv:2307.16789 [paper] 💡
- Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models, arXiv:2308.00675 [paper] 💡
- MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback, arXiv:2309.10691 [paper] 💡
- Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning, arXiv:2309.10814 [paper] 💡
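Most of the tool-use papers above share an interleaved reason-act-observe loop in which the LLM decides when to call a tool and the tool's output is fed back into the context. The sketch below is a minimal, hypothetical illustration of such a loop (ReAct-style); `llm()` and the entries in `TOOLS` are placeholders for a real LLM API and real tools, not code from any of the papers.

```python
# Minimal, illustrative ReAct-style tool-use loop (a sketch, not any paper's code).
# Assumptions: llm() wraps a real LLM API and returns the next Thought/Action or a
# "Final Answer:"; the model formats tool calls as `Action: tool[argument]`.
import re

def llm(prompt: str) -> str:
    """Placeholder for a real LLM call that continues the trace."""
    raise NotImplementedError

TOOLS = {
    "calculator": lambda expr: str(eval(expr)),            # toy arithmetic tool
    "search": lambda query: f"(top result for: {query})",  # stub retriever
}

def react_agent(question: str, max_steps: int = 5) -> str:
    trace = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(trace)                      # model emits a Thought and an Action
        trace += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:")[-1].strip()
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match:
            tool, arg = match.group(1), match.group(2)
            observation = TOOLS.get(tool, lambda _: "unknown tool")(arg)
            trace += f"Observation: {observation}\n"       # feed tool output back in
    return trace                               # give up after max_steps
```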
- On the Planning Abilities of Large Language Models -- A Critical Investigation, NeurIPS 2023 [paper]
Details are provided on a dedicated page (to be published soon).
- Alphazero-like Tree-Search can guide large language model decoding and training, ICML 2024 [paper]
- Search Algorithm: MCTS
- Language Agent Tree Search Unifies Reasoning, Acting, and Planning in Language Models, ICML 2024 [paper]
- Search Algorithm: MCTS
- When is Tree Search Useful for LLM Planning? It Depends on the Discriminator, ACL 2024 [paper]
- Search Algorithm: MCTS
- Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation, ACL findings 2024 [paper]
- Search Algorithm: MCTS
- Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs, ACL 2024 [paper]
- Search Algorithm: BFS/DFS
- LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning, EMNLP findings 2024 [paper] | [code]
- Search Algorithm: A*
- LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models, COLM 2024 [paper] | [code]
- Large Language Model Guided Tree-of-Thought, arXiv:2305.08291 [paper] 💡
- Tree Search for Language Model Agents, Under Review [paper] 💡
- Search Algorithm: Best-First Search
- Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning, Under Review [paper] 💡
- Search Algorithm: A*
- Planning with Large Language Models for Code Generation, ICLR 2023 [paper]
- Search Algorithm: MCTS
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models, NeurIPS 2023 [paper]
- Search Algorithm: BFS/DFS
- LLM-MCTS: Large Language Models as Commonsense Knowledge for Large-Scale Task Planning, NeurIPS 2023 [paper] | [code]
- Search Algorithm: MCTS
- Self-Evaluation Guided Beam Search for Reasoning, NeurIPS 2023 [paper]
- Search Algorithm: BFS/DFS
- PathFinder: Guided Search over Multi-Step Reasoning Paths, NeurIPS 2023 R0-FoMo [paper]
- Search Algorithm: Beam Search
- Plan, Verify and Switch: Integrated Reasoning with Diverse X-of-Thoughts, EMNLP 2023 [paper]
- RAP: Reasoning with Language Model is Planning with World Model, EMNLP 2023 [paper]
- Search Algorithm: MCTS
- Prompt-Based Monte-Carlo Tree Search for Goal-oriented Dialogue Policy Planning, EMNLP 2023 [paper]
- Search Algorithm: MCTS
- Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design, EMNLP findings 2023 [paper]
- Search Algorithm: MCTS
- Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents, arXiv:2408.07199 [paper] 💡
- Search Algorithm: MCTS
- HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face, NeurIPS 2023 [paper] | [code]
- Least-to-Most Prompting Enables Complex Reasoning in Large Language Models, NeurIPS 2023 [paper]
- Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning, NeurIPS 2023 [paper] | [code]
- On the Planning Abilities of Large Language Models - A Critical Investigation, NeurIPS 2023 [paper] | [code]
- PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change, NeurIPS 2023 [paper] | [code]
- LLM+P: Empowering Large Language Models with Optimal Planning Proficiency, arXiv:2304.11477 [paper] 💡
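A common pattern across the search-based papers above is to profile one LLM call as a thought generator and another as an evaluator, then run a classical search algorithm (MCTS, BFS/DFS, A*, beam search) over the generated thoughts. The sketch below shows a minimal beam-limited BFS in the spirit of Tree of Thoughts; `propose()` and `score()` are hypothetical LLM wrappers, and the actual algorithms and reward signals vary by paper.

```python
# Minimal, illustrative BFS over LLM-proposed thoughts (Tree-of-Thoughts-style sketch).
# Assumptions: propose() and score() wrap LLM calls profiled as generator and evaluator.
from typing import List

def propose(state: str, k: int = 3) -> List[str]:
    """Placeholder: ask the LLM for k candidate next thoughts given a partial solution."""
    raise NotImplementedError

def score(state: str) -> float:
    """Placeholder: ask the LLM (as evaluator) how promising a partial solution is."""
    raise NotImplementedError

def tot_bfs(question: str, depth: int = 3, beam: int = 5) -> str:
    frontier = [question]
    for _ in range(depth):
        # expand every partial solution with new thoughts, then keep the best `beam`
        candidates = [s + "\n" + t for s in frontier for t in propose(s)]
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return frontier[0]  # highest-scored reasoning path found within the depth budget
```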
- Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023 [paper]
- Self-Refine: Iterative Refinement with Self-Feedback, NeurIPS 2023 [paper]
- SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning, ICLR 2024 [paper] | [code]
- Learning From Correctness Without Prompting Makes LLM Efficient Reasoner, COLM 2024 [paper]
- Learning From Mistakes Makes LLM Better Reasoner, arXiv [paper] | [code] 💡
- LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback, ACL 2024 [paper]
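The feedback-learning papers above largely share a generate-critique-refine loop, where feedback comes from the LLM itself, a tool, or the environment. Below is a minimal, hypothetical sketch of such a loop (in the spirit of Self-Refine); the three prompt wrappers are placeholders rather than any paper's released code.

```python
# Minimal, illustrative generate-critique-refine loop (a sketch, not released code).
# Assumptions: the three functions wrap LLM calls with different prompts, i.e.
# LLM-profiled generator, feedback provider, and refiner.
def generate(task: str) -> str:
    raise NotImplementedError  # placeholder: initial draft from the LLM

def critique(task: str, draft: str) -> str:
    raise NotImplementedError  # placeholder: LLM-written feedback on the draft

def refine(task: str, draft: str, feedback: str) -> str:
    raise NotImplementedError  # placeholder: revised draft conditioned on the feedback

def self_refine(task: str, max_rounds: int = 3) -> str:
    draft = generate(task)
    for _ in range(max_rounds):
        feedback = critique(task, draft)
        if "no issues" in feedback.lower():  # naive stopping criterion for the sketch
            break
        draft = refine(task, draft, feedback)
    return draft
```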
- AdaPlanner: Adaptive Planning from Feedback with Language Models, NeurIPS 2023 [paper]
- CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing, ICLR 2024 [paper]
- ISR-LLM: Iterative Self-Refined Large Language Model for Long-Horizon Sequential Task Planning, arXiv:2308.13724 [paper] 💡
- ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search, ICLR 2024 [paper]
- TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents, FMDM @ NeurIPS 2023 [paper]
- TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems, LLMAgents @ ICLR 2024 [paper]
- Can Language Models Serve as Text-Based World Simulators?, ACL 2024 [paper] | [code]
- Making Large Language Models into World Models with Precondition and Effect Knowledge, arXiv [paper] 💡
- Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning, NeurIPS 2023 [paper] | [code]
- ByteSized32: A Corpus and Challenge Task for Generating Task-Specific World Models Expressed as Text Games, EMNLP 2023 [paper] | [code]
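In the world-modeling papers above, an LLM is profiled as a (text-based) world model that predicts the next state from the current state and an action, so that candidate plans can be simulated before execution. A minimal, hypothetical sketch of that interface (not taken from any of the papers):

```python
# Minimal, illustrative LLM-as-world-model interface (a sketch under assumed prompts).
def llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for a real LLM API call

def predict_next_state(state: str, action: str) -> str:
    """Ask the LLM-profiled world model to simulate one environment step."""
    prompt = (
        "You are simulating a text-based environment.\n"
        f"Current state: {state}\n"
        f"Action taken: {action}\n"
        "Describe the resulting state:"
    )
    return llm(prompt)

def rollout(initial_state: str, plan: list[str]) -> list[str]:
    """Simulate a candidate plan step by step before executing it for real."""
    states = [initial_state]
    for action in plan:
        states.append(predict_next_state(states[-1], action))
    return states
```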
- MetaTool Benchmark: Deciding Whether to Use Tools and Which to Use, arXiv:2310.03128 [paper] 💡
- TaskBench: Benchmarking Large Language Models for Task Automation, arXiv:2311.18760 [paper] 💡
- Large Language Models Still Can't Plan (A Benchmark for LLMs on Planning and Reasoning about Change), NeurIPS 2023 [paper]
If you find our work helpful, you can cite the following papers:
@inproceedings{li2024review,
  title={A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning},
  author={Li, Xinzhe},
  booktitle={Proceedings of the 31st International Conference on Computational Linguistics},
  year={2025}
}

@article{li2025survey,
  title={A Survey on LLM Test-Time Compute via Search: Tasks, LLM Profiling, Search Algorithms, and Relevant Frameworks},
  author={Li, Xinzhe},
  journal={arXiv preprint arXiv:2501.10069},
  year={2025}
}