✨ MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

MindSearch is an open-source AI search engine framework with performance comparable to Perplexity.ai Pro. You can deploy your own Perplexity.ai-style search engine with either closed-source LLMs (GPT, Claude) or open-source LLMs (the InternLM2.5 series is specifically optimized to provide superior performance within the MindSearch framework; other open-source models have not been specifically tested). It offers the following features:

  • 🤔 Ask everything you want to know: MindSearch is designed to answer any question in your life using web knowledge.
  • 📚 In-depth Knowledge Discovery: MindSearch browses hundreds of web pages to answer your question, drawing on a deeper and wider knowledge base.
  • 🔍 Detailed Solution Path: MindSearch exposes all details, allowing users to check everything they want. This greatly improves the credibility and usability of its final response.
  • 💻 Optimized UI Experience: MindSearch provides several interfaces, including React, Gradio, Streamlit, and Terminal. Choose whichever fits your needs.
  • 🧠 Dynamic Graph Construction Process: MindSearch decomposes the user query into atomic sub-questions, represented as nodes in a graph, and progressively extends the graph based on the search results from WebSearcher.
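
The dynamic graph construction described in the last bullet can be pictured with a small toy example. This is only a sketch under stated assumptions, not the MindSearch implementation: the `QueryGraph` and `Node` classes, the `fake_search` function, and the example questions are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Node:
    question: str
    answer: Optional[str] = None  # filled in once the node has been searched
    children: List[str] = field(default_factory=list)


class QueryGraph:
    """Toy directed graph of sub-questions (hypothetical, not the MindSearch API)."""

    def __init__(self, user_query: str, search: Callable[[str], str]) -> None:
        self.search = search
        # The user query is the root; sub-questions hang off it.
        self.nodes: Dict[str, Node] = {"root": Node(question=user_query)}

    def add_node(self, name: str, question: str,
                 parents: Sequence[str] = ("root",)) -> Node:
        if name in self.nodes:
            raise ValueError(f"duplicate node: {name}")
        missing = [p for p in parents if p not in self.nodes]
        if missing:
            raise KeyError(f"unknown parent(s): {missing}")
        # Search the sub-question as soon as it is added to the graph.
        node = Node(question=question, answer=self.search(question))
        self.nodes[name] = node
        for parent in parents:
            self.nodes[parent].children.append(name)
        return node


def fake_search(question: str) -> str:
    # Stand-in for a real web searcher; returns a canned string.
    return f"[summary of web results for: {question}]"


graph = QueryGraph("Which is taller, building A or building B?", fake_search)
graph.add_node("height_a", "How tall is building A?")
graph.add_node("height_b", "How tall is building B?")
# Extend the graph once the earlier results are available.
graph.add_node("compare", "Given both heights, which building is taller?",
               parents=("height_a", "height_b"))

for name, node in graph.nodes.items():
    print(name, "->", node.children)
```

In this sketch the graph grows one node at a time, and each new node may depend on nodes added earlier, which mirrors the "progressively extends the graph" idea in spirit only.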

⚡️ MindSearch vs other AI Search Engines

Comparison of human preference based on the depth, breadth, and factuality of the responses generated by ChatGPT-Web, Perplexity.ai (Pro), and MindSearch. Results are obtained on 100 human-crafted real-world questions and evaluated by 5 human experts*.

* All experiments were done before July 7, 2024.

⚽️ Build Your Own MindSearch

Step1: Dependencies Installation

git clone https://github.com/InternLM/MindSearch
cd MindSearch
pip install -r requirements.txt

Step2: Setup MindSearch API

Set up the FastAPI server:

python -m mindsearch.app --lang en --model_format internlm_server --search_engine DuckDuckGoSearch
  • --lang: language of the model, en for English and cn for Chinese.
  • --model_format: format of the model.
    • internlm_server for InternLM2.5-7b-chat served locally. (InternLM2.5-7b-chat has been better optimized for Chinese.)
    • gpt4 for GPT-4. If you want to use other models, please modify models.
  • --search_engine: search engine to use.
    • DuckDuckGoSearch for the DuckDuckGo search engine.
    • BingSearch for the Bing search engine.
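
If you launch the server from a script, a thin wrapper can assemble the command above and catch typos in the flag values before the server starts. The following is a minimal sketch: `build_server_command` is a hypothetical helper, and its allow-lists contain only the values documented in this step, so they would need extending if you add other models or search engines.

```python
import sys
from typing import List

# Values documented in this README only; extend these if you add more.
LANGS = ("en", "cn")
MODEL_FORMATS = ("internlm_server", "gpt4")
SEARCH_ENGINES = ("DuckDuckGoSearch", "BingSearch")


def build_server_command(lang: str = "en",
                         model_format: str = "internlm_server",
                         search_engine: str = "DuckDuckGoSearch") -> List[str]:
    """Return the argv list for `python -m mindsearch.app` (hypothetical helper)."""
    checks = (
        ("--lang", lang, LANGS),
        ("--model_format", model_format, MODEL_FORMATS),
        ("--search_engine", search_engine, SEARCH_ENGINES),
    )
    for flag, value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"{flag}={value!r} is not one of {allowed}")
    return [
        sys.executable, "-m", "mindsearch.app",
        "--lang", lang,
        "--model_format", model_format,
        "--search_engine", search_engine,
    ]


# Print the assembled command instead of running it.
print(" ".join(build_server_command(lang="cn", model_format="gpt4",
                                    search_engine="BingSearch")))
```

To actually start the server, the returned list could be passed to `subprocess.run(cmd, check=True)` from the repository root.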

Step3: Setup MindSearch Frontend

The following frontend interfaces are provided:

  • React
# Install Node.js and npm
# for Ubuntu
sudo apt install nodejs npm

# for Windows
# download from https://nodejs.org/zh-cn/download/prebuilt-installer

# Install dependencies

cd frontend/React
npm install
npm start

Details can be found in React

  • Gradio
python frontend/mindsearch_gradio.py
  • Streamlit
streamlit run frontend/mindsearch_streamlit.py

🌐 Change Web Search API

To use a different type of web search API, modify the searcher_type attribute in the searcher_cfg located in mindsearch/agent/__init__.py. Currently supported web search APIs include:

  • GoogleSearch
  • DuckDuckGoSearch
  • BraveSearch
  • BingSearch

For example, to change to the Brave Search API, you would configure it as follows:

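import os  # needed for os.environ below

BingBrowser(
    searcher_type='BraveSearch',
    topk=2,
    # Read the key from the BRAVE_API_KEY environment variable;
    # the second argument is only a placeholder fallback.
    api_key=os.environ.get('BRAVE_API_KEY', 'YOUR BRAVE API')
)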
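
One drawback of a placeholder fallback is that a missing key only surfaces when the first search request fails. A small helper can validate the configuration up front instead. The sketch below is an illustration under stated assumptions: `build_searcher_kwargs` is a hypothetical function that is not part of MindSearch, and which environment variable (if any) holds the key for each engine is left to the caller, since only `BRAVE_API_KEY` appears above.

```python
import os
from typing import Any, Dict, Mapping, Optional

# Searcher types listed in this README.
SUPPORTED_SEARCHERS = ("GoogleSearch", "DuckDuckGoSearch", "BraveSearch", "BingSearch")


def build_searcher_kwargs(searcher_type: str,
                          api_key_env: Optional[str] = None,
                          topk: int = 2,
                          environ: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build keyword arguments for the browser config (hypothetical helper)."""
    if searcher_type not in SUPPORTED_SEARCHERS:
        raise ValueError(f"unsupported searcher_type: {searcher_type!r}")
    kwargs: Dict[str, Any] = {"searcher_type": searcher_type, "topk": topk}
    if api_key_env is not None:
        api_key = environ.get(api_key_env)
        if not api_key:
            # Fail early rather than sending a placeholder key to the API.
            raise KeyError(f"environment variable {api_key_env} is not set")
        kwargs["api_key"] = api_key
    return kwargs


# Example with an explicit mapping so the snippet runs without real credentials.
demo_env = {"BRAVE_API_KEY": "demo-key"}
print(build_searcher_kwargs("BraveSearch", "BRAVE_API_KEY", environ=demo_env))
```

The resulting dictionary could then be unpacked into the browser constructor shown above, for example `BingBrowser(**kwargs)`.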

🐞 Debug Locally

python -m mindsearch.terminal

📝 License

This project is released under the Apache 2.0 license.

Citation

If you find this project useful in your research, please consider citing:

@article{chen2024mindsearch,
  title={MindSearch: Mimicking Human Minds Elicits Deep AI Searcher},
  author={Chen, Zehui and Liu, Kuikun and Wang, Qiuchen and Liu, Jiangning and Zhang, Wenwei and Chen, Kai and Zhao, Feng},
  journal={arXiv preprint arXiv:2407.20183},
  year={2024}
}

Our Projects

Explore our additional research on large language models, focusing on LLM agents.

  • Lagent: A lightweight framework for building LLM-based agents
  • AgentFLAN: An innovative approach for constructing and training with high-quality agent datasets (ACL 2024 Findings)
  • T-Eval: A fine-grained tool utilization evaluation benchmark (ACL 2024)