Loading extension module slstm_HS128BS8NH4NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0... #47

weiqiang13 · 2024-08-16T02:03:38Z

{'verbose': True, 'with_cuda': True, 'extra_ldflags': ['-L/home/junlong/anaconda3/envs/xlstm/lib', '-lcublas'], 'extra_cflags': ['-DSLSTM_HIDDEN_SIZE=128', '-DSLSTM_BATCH_SIZE=8', '-DSLSTM_NUM_HEADS=4', '-DSLSTM_NUM_STATES=4', '-DSLSTM_DTYPE_B=float', '-DSLSTM_DTYPE_R=nv_bfloat16', '-DSLSTM_DTYPE_W=nv_bfloat16', '-DSLSTM_DTYPE_G=nv_bfloat16', '-DSLSTM_DTYPE_S=nv_bfloat16', '-DSLSTM_DTYPE_A=float', '-DSLSTM_NUM_GATES=4', '-DSLSTM_SIMPLE_AGG=true', '-DSLSTM_GRADIENT_RECURRENT_CLIPVAL_VALID=false', '-DSLSTM_GRADIENT_RECURRENT_CLIPVAL=0.0', '-DSLSTM_FORWARD_CLIPVAL_VALID=false', '-DSLSTM_FORWARD_CLIPVAL=0.0', '-U__CUDA_NO_HALF_OPERATORS', '-U__CUDA_NO_HALF_CONVERSIONS', '-U__CUDA_NO_BFLOAT16_OPERATORS', '-U__CUDA_NO_BFLOAT16_CONVERSIONS', '-U__CUDA_NO_BFLOAT162_OPERATORS__', '-U__CUDA_NO_BFLOAT162_CONVERSIONS__'], 'extra_cuda_cflags': ['-Xptxas="-v"', '-gencode', 'arch=compute_80,code=compute_80', '-res-usage', '--use_fast_math', '-O3', '-Xptxas -O3', '--extra-device-vectorization', '-DSLSTM_HIDDEN_SIZE=128', '-DSLSTM_BATCH_SIZE=8', '-DSLSTM_NUM_HEADS=4', '-DSLSTM_NUM_STATES=4', '-DSLSTM_DTYPE_B=float', '-DSLSTM_DTYPE_R=nv_bfloat16', '-DSLSTM_DTYPE_W=nv_bfloat16', '-DSLSTM_DTYPE_G=nv_bfloat16', '-DSLSTM_DTYPE_S=nv_bfloat16', '-DSLSTM_DTYPE_A=float', '-DSLSTM_NUM_GATES=4', '-DSLSTM_SIMPLE_AGG=true', '-DSLSTM_GRADIENT_RECURRENT_CLIPVAL_VALID=false', '-DSLSTM_GRADIENT_RECURRENT_CLIPVAL=0.0', '-DSLSTM_FORWARD_CLIPVAL_VALID=false', '-DSLSTM_FORWARD_CLIPVAL=0.0', '-U__CUDA_NO_HALF_OPERATORS', '-U__CUDA_NO_HALF_CONVERSIONS', '-U__CUDA_NO_BFLOAT16_OPERATORS', '-U__CUDA_NO_BFLOAT16_CONVERSIONS', '-U__CUDA_NO_BFLOAT162_OPERATORS__', '-U__CUDA_NO_BFLOAT162_CONVERSIONS__']}
Using /home/junlong/.cache/torch_extensions/py311_cu121 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/junlong/.cache/torch_extensions/py311_cu121/slstm_HS128BS8NH4NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0/build.ninja...
Building extension module slstm_HS128BS8NH4NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
ninja: no work to do.
Loading extension module slstm_HS128BS8NH4NS4DBfDRbDWbDGbDSbDAfNG4SA1GRCV0GRC0d0FCV0FC0d0...
How can I solve this problem?

Ninlawat-Puhu · 2024-08-22T07:37:26Z

I encountered the same problem.

Ninlawat-Puhu · 2024-08-22T07:38:17Z

@kpoeppel Hi, could you advise how to solve this?

vivianawoo · 2024-09-01T12:45:11Z

Me too. How can this be solved?

matiashaggman · 2024-09-03T07:57:17Z

I have this problem as well. It only occurs if you include the sLSTM module in the xLSTM stack; using only mLSTM works. I tested this on the Lightning platform with an NVIDIA L4 GPU.

calliope-pro · 2024-09-11T02:05:53Z

Same here

f-krause · 2024-09-29T12:26:25Z

Same problem here (Linux, Ubuntu)!

A workaround is to set backend="vanilla" in sLSTMLayerConfig; however, this will of course result in very slow training.
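
For anyone who wants to try this, here is a minimal sketch of switching the backend. The helper name slstm_layer_kwargs and the num_heads / conv1d_kernel_size values are only illustrative assumptions (the values mirror the example configuration in the xlstm README), so adapt them to your own model.

```python
def slstm_layer_kwargs(use_cuda_kernel: bool) -> dict:
    """Build keyword arguments for xlstm's sLSTMLayerConfig.

    Hypothetical helper: only the `backend` switch matters for the
    workaround; the other values are placeholders for illustration.
    """
    return {
        # "cuda" compiles and loads the custom kernel (the step that
        # appears to hang); "vanilla" uses the pure PyTorch path.
        "backend": "cuda" if use_cuda_kernel else "vanilla",
        "num_heads": 4,
        "conv1d_kernel_size": 4,
    }


kwargs = slstm_layer_kwargs(use_cuda_kernel=False)
print(kwargs["backend"])

# With xlstm installed, the kwargs can then be passed on, e.g.:
#   from xlstm import sLSTMLayerConfig
#   layer_cfg = sLSTMLayerConfig(**kwargs)
```

Keeping the switch in one place makes it easy to flip back to "cuda" once the extension builds and loads correctly.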

kpoeppel · 2024-10-10T11:39:19Z

Is it stuck while loading? I see no error in what you shared. If the module fails to load, you can clear your torch_extensions cache (typically $HOME/.cache/torch_extensions).
In any case, make sure that your GPU has compute capability >= 8.0 (Ampere). This is needed for bfloat16.
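
A rough sketch of both checks, in case it helps. It assumes the cache layout visible in the log above (<root>/<py_cu tag>/<module name>); the function names, the "slstm_" prefix filter and the dry_run flag are my own illustrative choices, not part of xlstm or PyTorch.

```python
import shutil
from pathlib import Path

# Default location mentioned above; adjust if your setup differs.
DEFAULT_ROOT = Path.home() / ".cache" / "torch_extensions"


def clear_cached_builds(root=DEFAULT_ROOT, prefix="slstm_", dry_run=True):
    """List (and, if dry_run is False, delete) cached extension builds.

    Assumes builds live in <root>/<py_cu tag>/<module name>, as in the log.
    Returns the sorted names of the matching build directories.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    targets = [p for p in root.glob(f"*/{prefix}*") if p.is_dir()]
    if not dry_run:
        for build_dir in targets:
            shutil.rmtree(build_dir)
    return sorted(p.name for p in targets)


def supports_bfloat16_kernels(capability):
    """True if a (major, minor) compute capability is at least 8.0.

    `capability` is the tuple returned by
    torch.cuda.get_device_capability(), or None when no GPU is present.
    """
    return capability is not None and tuple(capability) >= (8, 0)
```

Calling clear_cached_builds() with the defaults only lists what would be removed; pass dry_run=False to actually delete the directories so the extension is rebuilt on the next run. With PyTorch installed, supports_bfloat16_kernels(torch.cuda.get_device_capability()) gives the capability check.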

f-krause · 2024-10-10T16:43:01Z

Same problem here (Linux, Ubuntu)!

I made it work on my machine by changing the ninja version.

This combination of versions currently works for me (though training is very slow on an A100 compared to transformers/Mamba/torch LSTM):

Ubuntu
Python 3.10.14
cuda 11.8
cudatoolkit=11.8.0
cudatoolkit-dev=11.7.0
pytorch 2.4.1
ninja 1.11.1.1
gcc 11.2.0
gxx_impl_linux-64 11.2.0
gxx_linux-64 11.2.0
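
To compare your own setup against the list above, something like the following sketch can print the relevant versions. The function names are hypothetical, it uses only the Python standard library, and it does not cover the CUDA toolkit packages.

```python
import platform
import shutil
import subprocess
from importlib import metadata


def installed_version(dist_name):
    """Version of an installed Python distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def tool_version(executable):
    """First line of `<executable> --version`, or None if unavailable."""
    path = shutil.which(executable)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = (result.stdout or result.stderr).strip().splitlines()
    return lines[0] if lines else None


def build_environment():
    """Collect the versions most relevant to building the extension."""
    return {
        "python": platform.python_version(),
        "torch": installed_version("torch"),
        "ninja (pip)": installed_version("ninja"),
        "ninja (PATH)": tool_version("ninja"),
        "gcc": tool_version("gcc"),
    }


if __name__ == "__main__":
    for name, version in build_environment().items():
        print(f"{name}: {version or 'not found'}")
```

Both ninja entries are reported because the pip package and the binary found on PATH can differ, which may matter when the ninja version is the thing being changed.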
