
Fix: Handle Error for CUDA Configuration and API Request #6572

Open · wants to merge 1 commit into base: main
Conversation

@skywinder skywinder commented Dec 12, 2024

Issue

Running the following curl request:

curl -X POST http://localhost:7860/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello, how are you?",
    "max_tokens": 0
  }'

results in an error:

raise AssertionError("Torch not compiled with CUDA enabled")

Steps to Reproduce
1. Start the API using the following command:

./start_macos.sh --api --api-port 7860 --verbose

2. Test the endpoint with:

curl -X POST http://localhost:7860/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello, how are you?",
    "max_tokens": 0
  }'

Expected Behavior

The request should return a clear error message, or the API should handle the case where CUDA is not available (for example, by falling back to CPU) instead of crashing with an unhandled assertion.
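One way to meet this expectation is to catch the assertion at the request boundary and return a structured error instead of crashing. The sketch below is illustrative, not the project's actual code; `safe_generate` and `generate_fn` are hypothetical names standing in for the real handler and model call.

```python
# Hedged sketch: convert the CUDA assertion into a structured
# error response instead of letting it crash the request handler.
# `generate_fn` is a hypothetical stand-in for the model call.
def safe_generate(generate_fn, prompt: str) -> dict:
    try:
        return {"ok": True, "text": generate_fn(prompt)}
    except (AssertionError, RuntimeError) as exc:
        # "Torch not compiled with CUDA enabled" is raised as an
        # AssertionError by torch when CUDA ops hit a CPU-only build.
        return {"ok": False, "error": str(exc)}
```

With this shape, the API can map the `"ok": False` case to an HTTP error response with a readable message rather than a stack trace.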

Proposed Solution
• Add a preflight check during the startup process to verify CUDA compatibility.
• Ensure the API gracefully handles requests even when CUDA is unavailable.
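A minimal sketch of what the preflight check could look like (assuming PyTorch; the function name `pick_device` is illustrative, not from the codebase). Note that `torch.cuda.is_available()` returns False on CPU-only builds rather than raising, so it is safe to call before any `.cuda()` or `.to("cuda")` operation.

```python
# Sketch of a startup preflight check (illustrative, not the
# project's actual code). Selects "cuda" only when PyTorch is
# installed and was compiled with CUDA support; otherwise falls
# back to "cpu" with a warning instead of asserting later.
def pick_device() -> str:
    try:
        import torch
    except ImportError:
        # PyTorch missing entirely: still fall back to CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    print("Warning: Torch not compiled with CUDA enabled; falling back to CPU.")
    return "cpu"
```

Running this once at startup (e.g., from `start_macos.sh`'s entry point) would surface the CUDA problem before the first request arrives.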

Impact
• Prevents crashes due to unhandled assertions.
• Improves debugging experience with clear error messages.


@jfmherokiller

I applied your fix locally because it allows the system to fall back to CPU when I am playing a game that makes heavy use of my GPU.

@skywinder
Author

> I applied your fix locally because it allows the system to fall back to CPU when I am playing a game that makes heavy use of my GPU.

Thanks, that's one more useful use case. I hope it will be merged.
