CUDA 11.8 not working #27

Open
erictbar opened this issue Jan 23, 2025 · 3 comments

@erictbar

I have an NVIDIA GeForce RTX 3060 Ti that I have gotten to work in WSL2 with CUDA 11.8 in other projects, such as those involving whisper and immich-machine-learning.

After installing the following in a virtualenv with Python 3.12:

pip install audiblez
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/kokoro-v0_19.onnx
wget https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.json
pip install onnxruntime-gpu

and then running:

audiblez Katamari\ Damacy\ -\ L\ E\ Hall.epub -l en-us -v am_michael -s 1.1 --providers CUDAExecutionProvider

Traceback (most recent call last):
  File "/mnt/d/Temporary/TextToSpeech/audiblez/audenv/bin/audiblez", line 8, in <module>
    sys.exit(cli_main())
             ^^^^^^^^^^
  File "/mnt/d/Temporary/TextToSpeech/audiblez/audenv/lib/python3.12/site-packages/audiblez.py", line 213, in cli_main
    main(kokoro, args.epub_file_path, args.lang, args.voice, args.pick, args.speed, args.providers)
  File "/mnt/d/Temporary/TextToSpeech/audiblez/audenv/lib/python3.12/site-packages/audiblez.py", line 34, in main
    kokoro.sess.set_providers(providers)
  File "/mnt/d/Temporary/TextToSpeech/audiblez/audenv/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 221, in set_providers
    self._reset_session(providers, provider_options)
  File "/mnt/d/Temporary/TextToSpeech/audiblez/audenv/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 565, in _reset_session
    self._create_inference_session(providers, provider_options)
  File "/mnt/d/Temporary/TextToSpeech/audiblez/audenv/lib/python3.12/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 537, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
RuntimeError: /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:129 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; SUCCTYPE = cudaError; std::conditional_t<THRW, void, common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, SUCCTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; SUCCTYPE = cudaError; std::conditional_t<THRW, void, common::Status> = void] CUDA failure 100: no CUDA-capable device is detected ; GPU=-1 ; hostname=My-PCs-Name; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=282 ; expr=cudaSetDevice(info.device_id);

I have confirmed that CUDA is running:

nvidia-smi
Thu Jan 23 16:57:04 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.72                 Driver Version: 566.14         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060 Ti     On  |   00000000:06:00.0  On |                  N/A |
| 30%   31C    P0             36W /  200W |    1775MiB /   8192MiB |      2%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

My GPU only supports CUDA 11.8, and I was wondering if maybe I am missing something, if the script needs updating to support 11.8 (I saw CUDA support was only recently added), or if I will have to use the CPU?
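
A quick way to see what the runtime itself detects, independent of nvidia-smi (these are standard onnxruntime calls, nothing audiblez-specific):

python -c "import onnxruntime as ort; print(ort.get_device())"
python -c "import onnxruntime as ort; print(ort.get_available_providers())"

If CUDAExecutionProvider is missing from that list, the installed wheel cannot find a usable CUDA runtime at all.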

@patrickdeanbrown
Contributor

Hey!

I haven't tried other versions of onnxruntime-gpu, but it installs by default for CUDA 12. onnxruntime-gpu does have legacy CUDA 11.8 packages, but they must be manually specified. There are directions here: https://onnxruntime.ai/docs/install/#requirements
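For reference, at the time of writing that page gives roughly the following for the legacy CUDA 11.8 builds (check the page for the exact, current index URL):

pip uninstall onnxruntime-gpu
pip install onnxruntime-gpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-11/pypi/simple/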

Also, I'm using a 2060 on Windows 11, and CUDA 12 installs and runs without issue. It appears to support GPUs back to the 2060, so it's worth trying to see if CUDA 12 now supports your 3060 Ti.

Hope this helps.

@erictbar
Author

erictbar commented Jan 24, 2025

I tried a bunch of things to get CUDA 12 working. I reinstalled all NVIDIA-related packages and installed cudnn-cuda-12, which I don't think was installed before; I also installed cuda-compat-12-8, as I believe 12.7 is the highest version my GPU normally supports. I also added this to ~/.bashrc:

export PATH=/usr/local/cuda-12.8/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
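
To confirm the new paths take effect in a fresh shell, standard toolkit and loader checks like these should point at the 12.8 install:

nvcc --version
ldconfig -p | grep libcudart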

What I think finally got it working was referencing #29: I moved torch to v0.20.2, which I had installed, but during pip install torch I also noticed Could not build wheels for ebooklib, pylatexenc, which is required to install pyproject.toml-based projects, and ran:

pip install --upgrade pip setuptools wheel
pip install ebooklib pylatexenc
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

at the suggestion of Copilot.
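
To sanity-check that torch itself can see the GPU after this (a standard torch call, nothing audiblez-specific):

python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"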

With a combination of these things, I got the following warnings while running:
python audiblez/audiblez.py book.epub -l en-us -v am_michael -s 1.1 --providers CUDAExecutionProvider

2025-01-24 10:38:35.396249882 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 39 Memcpy nodes are added to the graph main_graph for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
2025-01-24 10:38:35.407806861 [W:onnxruntime:, session_state.cc:1168 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-01-24 10:38:35.407845158 [W:onnxruntime:, session_state.cc:1170 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
2025-01-24 10:38:35.803724777 [W:onnxruntime:Default, scatter_nd.h:51 ScatterNDWithAtomicReduction] ScatterND with reduction=='none' only guarantees to be correct if indices are not duplicated.

but it is using the GPU now and looks to be significantly faster 👍

@santinic
Owner

santinic commented Jan 29, 2025

Fixed in v3: it now uses Torch. pip install --upgrade audiblez and it should use CUDA by default.
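
If it still falls back to CPU after upgrading, a quick check that Torch can see the GPU (a standard torch call) is:

python -c "import torch; print(torch.cuda.is_available())"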
