When using the Qwen2-VL-72B-Instruct model from siliconCloud to enable visual functions, an error occurred while calling the API. #14245

Closed
5 tasks done
xinshenghao opened this issue Feb 24, 2025 · 10 comments
Labels
🐞 bug Something isn't working

Comments

@xinshenghao

Self Checks

  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

1.0.0-beta.1

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

I have a simple workflow.

Image

Step 1: Upload the image and name it "file".

Step 2: Call the model to perform the analysis.

Image

The error is:

[siliconflow] Error: PluginInvokeError: {"args":{"description":"[models] Error: API request failed with status code 400: {\"code\":20040,\"message\":\"Image url should be a valid url or should like -BASE64-CONTENT\",\"data\":null}"},"error_type":"InvokeError","message":"[models] Error: API request failed with status code 400: {\"code\":20040,\"message\":\"Image url should be a valid url or should like -BASE64-CONTENT\",\"data\":null}"}

✔️ Expected Behavior

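The uploaded image should reach the model with its base64 content intact, and the API call should succeed.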

❌ Actual Behavior

No response

@dosubot dosubot bot added the 🐞 bug Something isn't working label Feb 24, 2025

dosubot bot commented Feb 24, 2025

Hey, @xinshenghao! I'm here to help you with bugs, questions, and contributions. Let's tackle this issue together!

I found a similar discussion that might be relevant to your issue: I can't upload the image on dify workflow via api, which is still open and unsolved [1].

Regarding the error you're encountering, it seems to be related to the format of the image input. The error message indicates that the image URL should be a valid URL or in the format -BASE64-CONTENT. In Dify, the correct format for image URLs or base64 content when using the Qwen2-VL-72B-Instruct model is specified by the MULTIMODAL_SEND_IMAGE_FORMAT configuration. The default format is base64, but you can also use url. If you choose the url format, ensure that the FILES_URL is configured to an externally accessible address so that the multimodal model can access the images [2].

Make sure that your image input is correctly formatted according to these specifications to resolve the error.
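The two settings mentioned above can be sanity-checked with a small script. This is a minimal sketch, not part of Dify: the function name `check_image_settings` is hypothetical, and treating loopback hosts as "not externally accessible" is an assumption made only for illustration.

```python
from urllib.parse import urlparse


def check_image_settings(env):
    """Return a list of human-readable problems with the image-related settings."""
    problems = []
    # The thread states that base64 is the default format.
    fmt = env.get("MULTIMODAL_SEND_IMAGE_FORMAT", "base64").lower()
    if fmt not in ("base64", "url"):
        problems.append(f"unknown MULTIMODAL_SEND_IMAGE_FORMAT: {fmt!r}")
    if fmt == "url":
        host = urlparse(env.get("FILES_URL", "")).hostname
        if host is None:
            problems.append("FILES_URL is missing or not a URL")
        # Assumption: a hosted model provider cannot reach a loopback address.
        elif host in ("localhost", "127.0.0.1", "::1"):
            problems.append(f"FILES_URL points at a loopback host: {host}")
    return problems
```

Calling `check_image_settings(os.environ)` inside the API container would report an empty list when nothing looks wrong. With the format set to base64, as in this issue, `FILES_URL` is not consulted at all.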

To continue talking to Dosu, mention @dosu.



@xinshenghao
Author

@dosu MULTIMODAL_SEND_IMAGE_FORMAT is base64


dosubot bot commented Feb 24, 2025

Great! Since your MULTIMODAL_SEND_IMAGE_FORMAT is set to base64, ensure that the image data you are sending is correctly encoded in base64 format. Double-check that the image string starts with data:image/TYPE;base64, where TYPE is the MIME subtype of your image (e.g., jpeg, png), and that the encoded data follows the comma.

If the format is correct and the issue persists, verify that the base64 string is complete and not truncated. You might also want to test with a different image to rule out any issues with the specific file you're using. If you continue to face issues, consider logging the base64 string to ensure it's being generated and sent correctly.
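The checks described above can be automated with the standard library. The helper below is a rough sketch; the name `check_image_data_url` and the wording of the returned messages are made up for this example.

```python
import base64
import binascii


def check_image_data_url(url):
    """Return None if url looks like a usable base64 image data URL, else a reason."""
    prefix, sep, payload = url.partition(";base64,")
    # The prefix must be "data:image/" followed by a non-empty subtype.
    if not sep or not prefix.startswith("data:image/") or prefix == "data:image/":
        return "missing 'data:image/TYPE;base64,' prefix"
    if not payload:
        return "prefix present but base64 payload is empty"
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(payload, validate=True)
    except binascii.Error:
        return "payload is not valid base64 (possibly truncated)"
    return None
```

A string cut at an arbitrary point often fails the padding check, but a truncation that happens to land on a 4-character boundary still decodes, so a passing result does not prove the image is complete.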


@xinshenghao
Author

@dosu I added print statements to the LLM plugin source code and found that the base64 content of the image is missing: only the data URL prefix is sent. The data sent to SiliconCloud is as follows:

Image

{
    "model": "Qwen/Qwen2-VL-72B-Instruct",
    "stream": true,
    "temperature": 0.7,
    "messages": [
        {
            "role": "system",
            "content": "Recognize the image and extract its content."
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/png;base64,",
                        "detail": "low"
                    }
                }
            ]
        }
    ]
}

How can I fix it?


dosubot bot commented Feb 25, 2025

It looks like the base64 image data is incomplete in the JSON payload you're sending. The image_url field should contain the full base64-encoded image data, but it currently only has the prefix data:image/png;base64, without the actual encoded content.
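A request body can be scanned for this condition before it is sent. The sketch below assumes the OpenAI-style message layout seen in the logged payload; `find_empty_image_parts` is a hypothetical helper, and the sample payload is a trimmed copy with placeholder system text.

```python
def find_empty_image_parts(payload):
    """Return (message_index, part_index) for image parts whose data URL has no payload."""
    hits = []
    for m_idx, message in enumerate(payload.get("messages", [])):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no image parts
        for p_idx, part in enumerate(content):
            if part.get("type") != "image_url":
                continue
            url = part.get("image_url", {}).get("url", "")
            # A data URL keeps its payload after the first comma.
            if url.startswith("data:") and url.partition(",")[2] == "":
                hits.append((m_idx, p_idx))
    return hits


payload = {
    "model": "Qwen/Qwen2-VL-72B-Instruct",
    "messages": [
        {"role": "system", "content": "Describe the image."},  # placeholder text
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": "data:image/png;base64,", "detail": "low"},
                }
            ],
        },
    ],
}

print(find_empty_image_parts(payload))  # [(1, 0)]
```

Running a check like this at both ends of a hop (for example, before and after the data is handed to the plugin) narrows down where the content is being dropped.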

To fix this, ensure that the base64 string is correctly generated and appended to the image_url. The complete base64 string should follow the data:image/png;base64, prefix. Here's what you should check:

  1. Base64 Encoding: Make sure the image is being properly encoded to base64. You can use Python's base64 module to encode the image data.

  2. Appending the Data: After encoding, append the base64 string to the data:image/png;base64, prefix.

  3. Verify the Data: Double-check that the entire base64 string is being included in the JSON payload.

Here's a simple example of how you might encode an image to base64 in Python:

import base64

# Read the image file
with open('path_to_image.png', 'rb') as image_file:
    # Encode the image to base64
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')

# Construct the full data URL
image_data_url = f"data:image/png;base64,{base64_image}"

Ensure that image_data_url is correctly set in your JSON payload. This should resolve the issue with the abnormal base64 data content.


@xinshenghao
Author

@dosu I added a print statement in /api/core/plugin/manager/model.py (method invoke_llm, line 145) of dify-api and found that the base64 image data sent to the plugin module is complete. I suspect the problem is in the plugin module.

Image


dosubot bot commented Feb 25, 2025

Hey @xinshenghao! It sounds like you're diving deep into the code, which is awesome! However, I'm just a bot designed to help with bugs, technical questions, and engineering knowledge. I can't really speculate on issues with specific plugin modules. If you think I should be able to help with this, feel free to reach out to the Dosu Team. They'll be happy to hear your feedback!


@xinshenghao
Author

@dosu I found this console output in the docker-api-1 container:
/app/api/.venv/lib/python3.12/site-packages/pydantic/main.py:390: UserWarning: Pydantic serializer warnings:
PydanticSerializationUnexpectedValue: Expected str but got list with value [ImagePromptMessageConten...=<DETAIL.HIGH: 'high'>)] - serialized value may not be as expected
PydanticSerializationUnexpectedValue: Expected PromptMessageContent but got ImagePromptMessageContent with value ImagePromptMessageContent...l=<DETAIL.HIGH: 'high'>) - serialized value may not be as expected
return self.pydantic_serializer.to_python(


dosubot bot commented Feb 25, 2025

The warnings you're encountering indicate that Pydantic is expecting a str or PromptMessageContent type, but it's receiving a list or ImagePromptMessageContent instead. This mismatch can lead to unexpected serialization results.

To resolve these warnings, ensure that the data being serialized matches the expected types. Here are some steps you can take:

  1. Check Data Types: Verify that the data structures you are passing to Pydantic models are of the expected types. If a str is expected, ensure that you are not passing a list. Similarly, if PromptMessageContent is expected, ensure that you are not passing ImagePromptMessageContent directly.

  2. Adjust Serialization Logic: If your use case requires passing a list or a specific subclass like ImagePromptMessageContent, adjust the serialization logic to handle these cases appropriately. You might need to convert the list to a string or ensure that the correct base class is used.

  3. Review Model Definitions: Check the Pydantic model definitions to ensure they are correctly set up to handle the types of data you are working with. You may need to update the model to accept lists or specific subclasses if that aligns with your application's requirements.

By aligning the data types with the expected model definitions, you should be able to resolve these warnings and ensure that the serialization process works as intended.
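To see why a type mismatch at serialization time can matter, consider what happens when a value is written out using only the fields of its declared base type. The sketch below uses plain dataclasses as an analogy; it is not Pydantic's or Dify's actual code, the class and function names are hypothetical, and the thread does not confirm that this is what caused the empty payload.

```python
from dataclasses import dataclass, fields


@dataclass
class Content:  # hypothetical stand-in for a base content type
    type: str


@dataclass
class ImageContent(Content):  # hypothetical stand-in for an image subclass
    base64_data: str = ""


def serialize_as(declared_cls, value):
    """Serialize value using only the fields known to the declared class."""
    return {f.name: getattr(value, f.name) for f in fields(declared_cls)}


image = ImageContent(type="image", base64_data="aGVsbG8=")

print(serialize_as(Content, image))       # {'type': 'image'}
print(serialize_as(ImageContent, image))  # {'type': 'image', 'base64_data': 'aGVsbG8='}
```

In this analogy the data is intact in memory and disappears only on the way out, which would be consistent with seeing complete base64 data on the API side and an empty payload on the plugin side.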


@xinshenghao
Author

See #14326.
