Encountered a bug: TypeError: list indices must be integers or slices, not str #120

Open
chenyihang1993 opened this issue Mar 6, 2025 · 1 comment


@chenyihang1993

Execute: deepsearcher --query "小组成员有哪些?" (the query asks "Who are the group members?")
Result:

<think> Select agent [ChainOfRAG] to answer the query [小组成员有哪些?] </think>

>> Iteration: 1

<think> Perform search [谁是小组的成员?] on the vector DB collections: ['deepsearcher'] </think>

<search> Search [谁是小组的成员?] in [deepsearcher]...  </search>

supported_doc_indices:  ['\\Document 0']
Traceback (most recent call last):
  File "/opt/conda/bin/deepsearcher", line 8, in <module>
    sys.exit(main())
  File "/root/deep-searcher/deepsearcher/cli.py", line 61, in main
    final_answer, refs, consumed_tokens = query(args.query, max_iter=args.max_iter)
  File "/root/deep-searcher/deepsearcher/online_query.py", line 10, in query
    return default_searcher.query(original_query, max_iter=max_iter)
  File "/root/deep-searcher/deepsearcher/agent/rag_router.py", line 69, in query
    answer, retrieved_results, n_token_retrieval = agent.query(query, **kwargs)
  File "/root/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 197, in query
    all_retrieved_results, n_token_retrieval, additional_info = self.retrieve(query, **kwargs)
  File "/root/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 182, in retrieve
    supported_retrieved_results, n_token2 = self._get_supported_docs(
  File "/root/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 167, in _get_supported_docs
    supported_retrieved_results = [retrieved_results[i] for i in supported_doc_indices]
  File "/root/deep-searcher/deepsearcher/agent/chain_of_rag.py", line 167, in <listcomp>
    supported_retrieved_results = [retrieved_results[i] for i in supported_doc_indices]
TypeError: list indices must be integers or slices, not str
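For context, the crash happens because the model returned a document label like '\Document 0' instead of a bare integer index, so the list comprehension at chain_of_rag.py:167 ends up indexing a list with a string. Below is a minimal sketch of the failure and one defensive way to coerce such output; the helper and sample values are illustrative, not the project's actual code:

import re

# Stand-ins mirroring the names in the traceback; the values are made up.
retrieved_results = ["chunk about the team", "unrelated chunk"]
supported_doc_indices = ['\\Document 0']  # what the small LLM returned instead of integers

# This is effectively the failing expression: indexing a list with the string '\Document 0'
# raises TypeError: list indices must be integers or slices, not str
# supported = [retrieved_results[i] for i in supported_doc_indices]

# One defensive option (an assumption, not the library's API): pull the first integer
# out of each item and drop anything unparsable or out of range.
def coerce_indices(raw_indices, n_docs):
    indices = []
    for item in raw_indices:
        match = re.search(r"\d+", str(item))
        if match and 0 <= int(match.group()) < n_docs:
            indices.append(int(match.group()))
    return indices

supported = [retrieved_results[i]
             for i in coerce_indices(supported_doc_indices, len(retrieved_results))]
print(supported)  # ['chunk about the team']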

config.yaml:

provide_settings:
  llm:
    # provider: "OpenAI"
    # config:
    #   model: "gpt-4o-mini"  # "gpt-o1-mini"
#      api_key: "sk-xxxx"  # Uncomment to override the `OPENAI_API_KEY` set in the environment variable
#      base_url: ""

#    provider: "DeepSeek"
#    config:
#      model: "deepseek-chat"  # "deepseek-reasoner"
##      api_key: "sk-xxxx"  # Uncomment to override the `DEEPSEEK_API_KEY` set in the environment variable
##      base_url: ""

#    provider: "SiliconFlow"
#    config:
#      model: "deepseek-ai/DeepSeek-V3"
##      api_key: "xxxx"  # Uncomment to override the `SILICONFLOW_API_KEY` set in the environment variable
##      base_url: ""

#    provider: "PPIO"
#    config:
#      model: "deepseek/deepseek-v3/community"
##      api_key: "sk_xxxxxx"  # Uncomment to override the `PPIO_API_KEY` set in the environment variable
##      base_url: ""

#    provider: "TogetherAI"
#    config:
#      model: "deepseek-ai/DeepSeek-V3"
##      api_key: "xxxx"  # Uncomment to override the `TOGETHER_API_KEY` set in the environment variable

#    provider: "AzureOpenAI"
#    config:
#      model: ""
#      api_version: ""
##      azure_endpoint: "xxxx"  # Uncomment to override the `AZURE_OPENAI_ENDPOINT` set in the environment variable
##      api_key: "xxxx"  # Uncomment to override the `AZURE_OPENAI_KEY` set in the environment variable

   provider: "Ollama"
   config:
     model: "qwen2.5:3b"
#      base_url: ""

  embedding:
  #   provider: "OpenAIEmbedding"
  #   config:
  #     model: "text-embedding-ada-002"
#      api_key: ""  # Uncomment to override the `OPENAI_API_KEY` set in the environment variable
#      base_url: "" # Uncomment to override the `OPENAI_BASE_URL` set in the environment variable
#      dimension: 1536 # Uncomment to customize the embedding dimension 


   provider: "MilvusEmbedding"
   config:
     model: "BAAI/bge-large-zh-v1.5"

#    provider: "VoyageEmbedding"
#    config:
#      model: "voyage-3"
##      api_key: ""  # Uncomment to override the `VOYAGE_API_KEY` set in the environment variable

#    provider: "BedrockEmbedding"
#    config:
#      model: "amazon.titan-embed-text-v2:0"
##      aws_access_key_id: ""  # Uncomment to override the `AWS_ACCESS_KEY_ID` set in the environment variable
##      aws_secret_access_key: ""  # Uncomment to override the `AWS_SECRET_ACCESS_KEY` set in the environment variable
    
#    provider: "SiliconflowEmbedding"
#    config:
#      model: "BAAI/bge-m3"
##      api_key: ""  # Uncomment to override the `SILICONFLOW_API_KEY` set in the environment variable

  file_loader:
    provider: "PDFLoader"
    config: {}

#    provider: "JsonFileLoader"
#    config:
#      text_key: ""

#    provider: "TextLoader"
#    config: {}

#    provider: "UnstructuredLoader"
#    config: {}

  web_crawler:
    provider: "FireCrawlCrawler"
    config: {}

    # provider: "Crawl4AICrawler"
    # config: # Uncomment to custom browser configuration for Crawl4AI
    #   browser_config:
    #     headless: false
    #     proxy: "http://127.0.0.1:7890"
    #     chrome_channel: "chrome"
    #     verbose: true
    #     viewport_width: 800
    #     viewport_height: 600
    
#    provider: "JinaCrawler"
#    config: {}

  vector_db:
    provider: "Milvus"
    config:
      default_collection: "deepsearcher"
      uri: "./milvus.db"
      token: "root:Milvus"
      db: "default"

query_settings:
  max_iter: 3

load_settings:
  chunk_size: 1500
  chunk_overlap: 100

@zc277584121
Collaborator

@chenyihang1993 The LLM is too small to reliably follow the format-parsing prompt; see #99 (comment)
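Concretely, the qwen2.5:3b model configured above tends not to emit the bare integer indices the agent's prompt asks for, which is what produces the '\Document 0' string seen in the log. Keeping the same Ollama provider but pointing it at a larger instruct model is the usual workaround; the model name below is only an example, any model that follows the formatting prompt should work:

provide_settings:
  llm:
    provider: "Ollama"
    config:
      model: "qwen2.5:14b"  # example only; substitute any sufficiently capable instruct model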
