PyTorch 1.6-1.8 compatibility - CUDA 11/3090 ready #92

MatthewHowe · 2020-11-13T13:17:16Z

Modified from @half-potato's pull request for compatibility with torch 1.6. Replaced the THBlas functions with ATen tensor functions.
Tested with torch 1.7 and 1.8, with CUDA 10 and 11.
Worked on RTX 2080 and RTX 3090.
@XDynames

MatthewHowe · 2020-11-13T13:20:48Z

#90 #89 #88 #74

jerryhitit · 2020-11-14T03:06:17Z

Hi Matthew!
I have an RTX 3090 and cloned your project as you modified it 14 hours ago.
However, ./make.sh still fails with this error:
nvcc fatal : Unsupported gpu architecture 'compute_86'

I am on Ubuntu 18.04 with CUDA 11.1, PyTorch 1.7, and gcc 7.5.0 / g++ 7.5.0.

I guess the error is probably caused by CUDA. Any help would be appreciated!

XDynames · 2020-11-14T03:15:45Z

Jerry, did you install a nightly binary for your PyTorch? https://discuss.pytorch.org/t/rtx-3000-support/98158

I have built this in a docker container using Nvidia's CUDA 11.1 base image, then used the pip command in the link to install PyTorch compiled with RTX 3000 support, and it seems to work well (@MatthewHowe what base image did you use?)

From some googling, it looks like it could also be conflicting versions of different Nvidia packages: nvcc, cuDNN, etc.
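
One way to check the nvcc side of this, as a rough sketch: nvcc only accepts compute_86 from the CUDA 11.1 toolkit onward, so an "Unsupported gpu architecture 'compute_86'" message usually means the nvcc that actually runs is older than 11.1, whatever else is installed. The helper names and the small table below are made up for illustration, and the sample string just imitates the "release X.Y" line printed by nvcc --version.

```python
import re

# Minimum CUDA toolkit release whose nvcc accepts each architecture.
# Only the two architectures that appear in this thread are listed.
MIN_TOOLKIT_FOR_ARCH = {
    "compute_80": (11, 0),
    "compute_86": (11, 1),  # RTX 30-series
}


def parse_nvcc_release(version_text):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError("no 'release X.Y' field found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def nvcc_supports(version_text, arch):
    """Return True if this nvcc release is new enough for the given arch."""
    return parse_nvcc_release(version_text) >= MIN_TOOLKIT_FOR_ARCH[arch]


# Illustrative input; on a real machine, paste the output of `nvcc --version`.
sample = "Cuda compilation tools, release 11.0, V11.0.221"
print(nvcc_supports(sample, "compute_86"))
```

If this prints False for the nvcc found first on PATH, the build is picking up an older toolkit than the one you think is installed.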

jerryhitit · 2020-11-14T03:52:56Z

Thanks, @XDynames.
I originally installed PyTorch with:
pip install torch==1.7.0+cu110 torchvision==0.8.1+cu110 torchaudio===0.7.0 -f https://download.pytorch.org/whl/torch_stable.html

Following your suggestion to use a nightly binary, I then ran this pip command:
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html

My PyTorch version now looks like this:

And now I get a ninja-related error like this:

which results in RuntimeError: Error compiling objects for extension

I will double-check all the NVIDIA packages and find a way to solve the ninja problem.
Thanks again!

MatthewHowe · 2020-11-14T07:38:54Z

I used this docker image: docker pull nvidia/cuda:11.1-devel-ubuntu18.04. I installed conda, then torch-nightly.
I then cloned and compiled DCNv2. This could be an issue with CUDA 11.0 or some other conflicting packages.
When DCN doesn't compile, the error from the actual cause is usually above your screen cap. If you run ./make.sh again, the parts that already compiled will not be rebuilt, which makes it clearer what is causing the issue.
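
If scrolling back is awkward, a small filter over a saved build log does the same job. This is only a sketch: the function name and pattern are invented for illustration, and the sample log is abbreviated from messages posted in this thread.

```python
import re

# Lines worth looking at first: compiler/linker errors, ninja's FAILED
# markers, and fatal nvcc messages. Python exception names such as
# "RuntimeError:" are deliberately not matched (the pattern is case-sensitive).
ERROR_LINE = re.compile(r"\berror:|^FAILED:|nvcc fatal")


def first_errors(log_text, limit=5):
    """Return the first few lines of a build log that name a real failure."""
    hits = [line for line in log_text.splitlines() if ERROR_LINE.search(line)]
    return hits[:limit]


SAMPLE_LOG = """Compiling objects...
FAILED: build/dcn_v2_cuda.o
dcn_v2_cuda.cu(126): error: identifier "THCudaBlas_SgemmBatched" is undefined
ninja: build stopped: subcommand failed.
RuntimeError: Error compiling objects for extension
"""

for line in first_errors(SAMPLE_LOG):
    print(line)
```

To use it on a real build, redirect the output of ./make.sh to a file and pass the file's contents to first_errors.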

jerryhitit · 2020-11-15T06:12:27Z

Hi @MatthewHowe, thanks for your great advice!

I double-checked my CUDA installation and nvcc settings. After setting those environment variables properly, I no longer get errors like the ['nvcc', '-v'] one.

However, ninja still reports a failure involving 'THCudaBlas_SgemmBatched'.
It seems to be a new problem.

The log looks like this:

FAILED: /home/liurui/DCNv2/build/temp.linux-x86_64-3.7/home/liurui/DCNv2/DCN/src/cuda/dcn_v2_cuda.o
/usr/local/cuda-11.1/bin/nvcc -DWITH_CUDA -I/home/liurui/DCNv2/DCN/src -I/home/liurui/anaconda3/envs/FairMOT/lib/python3.7/site-packages/torch/include -I/home/liurui/anaconda3/envs/FairMOT/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/liurui/anaconda3/envs/FairMOT/lib/python3.7/site-packages/torch/include/TH -I/home/liurui/anaconda3/envs/FairMOT/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda-11.1/include -I/home/liurui/anaconda3/envs/FairMOT/include/python3.7m -c -c /home/liurui/DCNv2/DCN/src/cuda/dcn_v2_cuda.cu -o /home/liurui/DCNv2/build/temp.linux-x86_64-3.7/home/liurui/DCNv2/DCN/src/cuda/dcn_v2_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=sm_80 -ccbin g++ -std=c++14
/home/liurui/DCNv2/DCN/src/cuda/dcn_v2_cuda.cu(126): error: identifier "THCudaBlas_SgemmBatched" is undefined

Sorry, I fixed this problem by downgrading PyTorch from the 1.8 nightly binary to the 1.7 stable version. THCudaBlas_SgemmBatched was changed in the recent version, which caused this problem.

It works well and compiles successfully.

Thanks again for Matthew's great work!

XDynames · 2020-11-15T09:06:36Z

Just looked into this: ATen lost this definition on 13 Nov.

Maybe we should look into replacing SgemmBatched with a non-deprecated version for 1.8 support?
pytorch/pytorch#47987

Shank2358 · 2020-11-16T09:51:12Z

pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html

the same error

jerryhitit · 2020-11-16T10:25:26Z

You can try downgrading PyTorch to the 1.7 stable version; it works fine for me.

Shank2358 · 2020-11-16T11:33:18Z

Thank you. I will try it again.

Shank2358 · 2020-11-17T02:08:01Z

I have compiled successfully using pytorch1.7. Thanks. @jerryhitit @MatthewHowe

duanzhiihao · 2020-11-30T13:29:47Z

I successfully compiled on Windows 10, CUDA 11.1 (RTX3090), and PyTorch 1.7. Thank you so much!

KiedaTamashi · 2020-12-02T14:32:49Z

@MatthewHowe Hi Matthew, I failed to compile with PyTorch 1.7; the build ends with RuntimeError: Error compiling objects for extension.

I used your latest version, which supports PyTorch 1.7.
My environment (using anaconda virtual env):

gcc 7.5.0
ninja 1.10.2
ubuntu 18.04
python 3.7
pytorch 1.7
cudatoolkit 10.2

torch.cuda.is_available() returns True and CUDA_HOME is not None.

Error Message:
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1522, in _run_ninja_build
env=env)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/subprocess.py", line 481, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "setup.py", line 69, in <module>
cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/core.py", line 148, in setup
dist.run_commands()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 79, in run
_build_ext.run(self)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 339, in run
self.build_extensions()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 653, in build_extensions
build_ext.build_extensions(self)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 448, in build_extensions
self._build_extensions_serial()
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 473, in _build_extensions_serial
self.build_extension(ext)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/setuptools/command/build_ext.py", line 196, in build_extension
_build_ext.build_extension(self, ext)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/distutils/command/build_ext.py", line 533, in build_extension
depends=ext.depends)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 482, in unix_wrap_ninja_compile
with_cuda=with_cuda)
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1238, in _write_ninja_file_and_compile_objects
error_prefix='Error compiling objects for extension')
File "/NAS/home01/tanzhenwei/anaconda3/envs/torch17/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1538, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

Could you help me?

XDynames · 2020-12-04T05:13:27Z

Double check that your versions all line up - if you want to use CUDA 10.2 make sure CUDNN is the correct version and the pytorch binary you are using is compiled with CUDA 10.2
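
As a rough sketch of that check: compare the CUDA version PyTorch was built with (torch.version.cuda) against the toolkit reported by nvcc --version. The function names are hypothetical, and the three result labels are just one way of reporting the comparison, not something PyTorch itself prints.

```python
def major_minor(version):
    """Turn a version string such as '10.2' or '11.1.105' into (major, minor)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def describe_cuda_pair(torch_cuda, toolkit_cuda):
    """Compare the CUDA version PyTorch was built with against the local toolkit."""
    built, local = major_minor(torch_cuda), major_minor(toolkit_cuda)
    if built == local:
        return "match"
    if built[0] == local[0]:
        return "minor mismatch"
    return "major mismatch"


# First argument: the value of torch.version.cuda.
# Second argument: the release reported by `nvcc --version`.
print(describe_cuda_pair("10.2", "10.2"))
print(describe_cuda_pair("11.0", "11.1"))
print(describe_cuda_pair("10.2", "11.1"))
```

A major mismatch is the case most likely to break an extension build; a minor mismatch is worth noting but may still work.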

KiedaTamashi · 2020-12-07T10:29:48Z

Hi @XDynames, I solved this by modifying my Python interpreter's file "anaconda3/envs/py37/lib/python3.7/site-packages/torch/utils/cpp_extension.py".

But then I ran into another g++ problem:

running install
running bdist_egg
running egg_info
writing DCNv2.egg-info/PKG-INFO
writing dependency_links to DCNv2.egg-info/dependency_links.txt
writing top-level names to DCNv2.egg-info/top_level.txt
reading manifest file 'DCNv2.egg-info/SOURCES.txt'
writing manifest file 'DCNv2.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building '_ext' extension
Emitting ninja build file /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.10.2
g++ -pthread -shared -B /NAS/home01/tanzhenwei/anaconda3/envs/py37/compiler_compat -L/NAS/home01/tanzhenwei/anaconda3/envs/py37/lib -Wl,-rpath=/NAS/home01/tanzhenwei/anaconda3/envs/py37/lib -Wl,--no-as-needed -Wl,--sysroot=/ /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/vision.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cpu/dcn_v2_cpu.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cpu/dcn_v2_im2col_cpu.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cpu/dcn_v2_psroi_pooling_cpu.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_cuda.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.o /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_psroi_pooling_cuda.o -L/NAS/home01/tanzhenwei/anaconda3/envs/py37/lib/python3.7/site-packages/torch/lib -L/usr/local/cuda/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda 
-o build/lib.linux-x86_64-3.7/_ext.cpython-37m-x86_64-linux-gnu.so
g++: error: /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_cuda.o: No such file or directory
g++: error: /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.o: No such file or directory
g++: error: /NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/build/temp.linux-x86_64-3.7/NAS/project01/rzimmerm_substitles/FairMot_compressing/src/lib/models/networks/DCNv2/DCN/src/cuda/dcn_v2_psroi_pooling_cuda.o: No such file or directory
error: command 'g++' failed with exit status 1

Do you have any advice? The environment is the same.

ConnerWK · 2020-12-17T16:24:06Z

@MatthewHowe Hi Matthew. Thanks for your great work; I successfully compiled on Ubuntu 18.04.5, CUDA 11.1 (RTX 3090), and PyTorch 1.7.0.
There are still some other packages that I need to compile manually. I wonder if there are guidelines, principles, or rules for modifying source code written for CUDA 10 (or even earlier versions) so that it compiles under CUDA 11 in my current environment. Although I browsed the changed files, I still have no idea how to do it properly.
Would you mind providing some guidance? Looking forward to your reply.

XDynames · 2020-12-17T22:58:55Z

@ConnerWK Not to put too fine a point on it, but the code for DCN has become a bit messy. What we did was replace the low-level BLAS and CUDA BLAS function calls with higher-level ATen equivalents.

We view this as a band-aid, so we've started working on a pure PyTorch nn.Module-based solution that will not require compiling. Currently we have deformable convolution V1/V2 passing all the unit tests from this code, but have yet to break ground on ROI pooling.

Let me know if this is something you'd be interested in
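
To illustrate the kind of BLAS-to-ATen replacement described above: a BLAS gemm call computes C = alpha * A * B + beta * C, and torch.addmm (at::addmm on the C++ side) evaluates the same expression on tensors. The pure-Python reference below only spells out that arithmetic; gemm_reference is a hypothetical name, transposes and memory layout are ignored, and which ATen functions the PR actually uses is not confirmed here.

```python
def gemm_reference(alpha, a, b, beta, c):
    """Compute alpha * (a @ b) + beta * c for matrices given as nested lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [
            beta * c[i][j] + alpha * sum(a[i][k] * b[k][j] for k in range(inner))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(gemm_reference(2, a, b, 3, c))  # [[41, 47], [89, 103]]
```

The tensor form of the same call would be torch.addmm(c, a, b, beta=3, alpha=2), which is why a hand-written gemm call can often be swapped for a single ATen function.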

WangJian981002 · 2021-01-06T18:37:56Z

Can you help solve this problem? I compiled successfully with CUDA 11 and PyTorch 1.7 (RTX 3090), but I get the following error at run time. Thank you very much, @MatthewHowe.

error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1607370156314/work/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=700 : an illegal memory access was encountered
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /opt/conda/conda-bld/pytorch_1607370156314/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8
error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
/opt/conda/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 33 leaked semaphores to clean up at shutdown
len(cache))
Traceback (most recent call last):
File "C_ddp.py", line 349, in <module>
main()
File "C_ddp.py", line 109, in main
mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "/home/wj/Detection/CenterNetV2/C_ddp.py", line 277, in main_worker
center_loss, center_fuse_loss, scale_loss, offset_loss = model({'img':img , 'label':label , 'heatmap_t':heatmap_t , 'hm_FuseClass_t':hm_FuseClass_t})
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/wj/Detection/CenterNetV2/nets/resnet_dcn_model.py", line 35, in forward
out=self.backbone(x)[0]
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/wj/Detection/CenterNetV2/networks/resnet_dcn.py", line 261, in forward
x = self.deconv_layers(x)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
input = module(input)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 929, in forward
output_padding, self.groups, self.dilation)
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.allow_tf32 = True
data = torch.randn([32, 256, 40, 40], dtype=torch.float, device='cuda', requires_grad=True)
net = torch.nn.Conv2d(256, 256, kernel_size=[4, 4], padding=[1, 1], stride=[2, 2], dilation=[1, 1], groups=1)
net = net.cuda().float()
out = net(data)
out.backward(torch.randn_like(out))
torch.cuda.synchronize()

ConvolutionParams
data_type = CUDNN_DATA_FLOAT
padding = [1, 1, 0]
stride = [2, 2, 0]
dilation = [1, 1, 0]
groups = 1
deterministic = true
allow_tf32 = true
input: TensorDescriptor 0x55923cf1b4e0
type = CUDNN_DATA_FLOAT
nbDims = 4
dimA = 32, 256, 40, 40,
strideA = 409600, 1600, 40, 1,
output: TensorDescriptor 0x55923cf1d1b0
type = CUDNN_DATA_FLOAT
nbDims = 4
dimA = 32, 256, 20, 20,
strideA = 102400, 400, 20, 1,
weight: FilterDescriptor 0x55923cf4cec0
type = CUDNN_DATA_FLOAT
tensor_format = CUDNN_TENSOR_NCHW
nbDims = 4
dimA = 256, 256, 4, 4,
Pointer addresses:
input: 0x7fcf70a80000
output: 0x7fcf6fe00000
weight: 0x7fd1d1700000
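
A note on the first line of this log: "no kernel image is available for execution on the device" is CUDA's way of saying the compiled extension holds no code for the GPU it is running on. One thing to try, as an assumption rather than a confirmed fix, is rebuilding DCNv2 with the TORCH_CUDA_ARCH_LIST environment variable set, which PyTorch's extension builder reads to choose target architectures. The helper below is a hypothetical sketch that only formats a value for that variable; which architectures a given PyTorch release accepts differs between releases, so the numbers are illustrative.

```python
def format_arch_list(capabilities, ptx=False):
    """Build a TORCH_CUDA_ARCH_LIST value from (major, minor) capability tuples."""
    entries = [
        "{}.{}".format(major, minor) for major, minor in sorted(set(capabilities))
    ]
    if ptx and entries:
        # Also embed PTX for the newest listed architecture, so newer GPUs
        # can JIT-compile the kernels at load time.
        entries[-1] += "+PTX"
    return ";".join(entries)


# torch.cuda.get_device_capability() reports the tuple for the installed GPU.
print(format_arch_list([(7, 5), (8, 0)], ptx=True))  # 7.5;8.0+PTX
```

The resulting string would be exported as TORCH_CUDA_ARCH_LIST before running ./make.sh, after deleting the old build directory so everything is recompiled.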

WangJian981002 · 2021-01-07T17:19:49Z

@KiedaTamashi Did you solve the g++ "No such file or directory" problem with the dcn_v2_cuda.o, dcn_v2_im2col_cuda.o, and dcn_v2_psroi_pooling_cuda.o objects? I ran into the same issue.

sparkfax · 2021-01-21T06:35:22Z

PyTorch version: 1.7 stable
gcc 8.3.1 20191121 (Red Hat 8.3.1-5) (GCC)
CUDA Version: 11.0
I can run PyTorch in other projects, so the PyTorch and CUDA versions should match.

make returns the following error:
Emitting ninja build file /home/opt/mot/DCNv2/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
FAILED: /home/opt/mot/DCNv2/build/temp.linux-x86_64-3.8/home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.o
/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/home/opt/mot/DCNv2/src -I/root/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/include -I/root/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/root/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/include/TH -I/root/anaconda3/envs/FairMOT/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/root/anaconda3/envs/FairMOT/include/python3.8 -c -c /home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.cu -o /home/opt/mot/DCNv2/build/temp.linux-x86_64-3.8/home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_75,code=sm_75 -ccbin g++ -std=c++14
/home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.cu(107): error: identifier "THCState_getCurrentStream" is undefined
/home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.cu(279): error: identifier "THCState_getCurrentStream" is undefined
/home/opt/mot/DCNv2/src/cuda/dcn_v2_cuda.cu(324): error: identifier "THCudaBlas_Sgemv" is undefined
3 errors detected in the compilation of "/tmp/tmpxft_0011fd41_00000000-6_dcn_v2_cuda.cpp1.ii".

sparkfax · 2021-01-21T08:52:09Z

I switched to this version, https://github.com/lbin/DCNv2, and the 'THCState_getCurrentStream is undefined' error is solved.

hhcs9527 · 2021-02-09T12:59:10Z

Is there a solution for compiling this branch with PyTorch 1.8 and CUDA 11.1 (from torch.version.cuda)?

XDynames · 2021-02-09T21:50:08Z

@hhcs9527 Not yet. We have a version of deformable convolution (not ROI pooling) that does work with those versions, but it currently does not work well in multi-GPU training (very slow).
You might be able to patch what is here again by working out suitable ATen functions to replace the deprecated BLAS calls. We felt like we'd be doing this forever, after literally having some of the functions we used as replacements deprecated in the next version of PyTorch (which dropped a day after we submitted this pull request).

DrakeSkytecn · 2021-03-18T12:59:16Z

I still had many errors on Windows. Does this work on Windows?
Edit: Windows 10, torch==1.7.1, cuda 11.0
Edit 2: Add errors

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(28): error: calling a __host__ function("__floorf") from a __device__ function("dmcn_im2col_bilinear_cuda") is not allowed

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(28): error: identifier "__floorf" is undefined in device code

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(29): error: calling a __host__ function("__floorf") from a __device__ function("dmcn_im2col_bilinear_cuda") is not allowed

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(29): error: identifier "__floorf" is undefined in device code

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(65): error: calling a __host__ function("__floorf") from a __device__ function("dmcn_get_gradient_weight_cuda") is not allowed

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(65): error: identifier "__floorf" is undefined in device code

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(66): error: calling a __host__ function("__floorf") from a __device__ function("dmcn_get_gradient_weight_cuda") is not allowed

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(66): error: identifier "__floorf" is undefined in device code

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(92): error: calling a __host__ function("__floorf") from a __device__ function("dmcn_get_coordinate_weight_cuda") is not allowed

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(92): error: identifier "__floorf" is undefined in device code

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(93): error: calling a __host__ function("__floorf") from a __device__ function("dmcn_get_coordinate_weight_cuda") is not allowed

DCNv2/DCN/src/cuda/dcn_v2_im2col_cuda.cu(93): error: identifier "__floorf" is undefined in device code

I hit the same error as you.
Windows 10, Python 3.8, torch 1.7.0, CUDA 10.2

DrakeSkytecn · 2021-03-18T14:40:22Z

Finally built successfully!
Clone the code from this version: https://github.com/lbin/DCNv2/tree/pytorch_1.7
Then replace every floor(...) with floorf(...),
ceil(...) with ceilf(...), and
round(...) with roundf(...).
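The renaming described above can be scripted instead of done by hand. The following is a minimal sketch, assuming every floor/ceil/round call in the .cu files takes a float argument; the helper names `use_float_math` and `patch_tree` are hypothetical and are not part of DCNv2.

```python
import re
from pathlib import Path

# Matches floor(, ceil( or round( only when the name is not the tail of a
# longer identifier, so floorf( and __floorf( are left untouched.
_CALL = re.compile(r"(?<![A-Za-z0-9_])(floor|ceil|round)\(")


def use_float_math(source: str) -> str:
    """Rewrite floor/ceil/round calls to their single-precision variants."""
    return _CALL.sub(r"\1f(", source)


def patch_tree(root: str) -> list:
    """Patch every .cu file under `root` in place and return the changed paths."""
    changed = []
    for path in sorted(Path(root).rglob("*.cu")):
        old = path.read_text(encoding="utf-8")
        new = use_float_math(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed.append(str(path))
    return changed
```

Calling `patch_tree("DCN/src/cuda")` (the directory name is an assumption about the checkout layout) would rewrite the files in place, so it is worth reviewing the result with `git diff` before building.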

DrakeSkytecn · 2021-03-25T05:47:29Z

try this one https://github.com/jinfagang/DCNv2_latest

Quoted email notification (Wednesday, March 24, 2021, 6:54 PM): @rathaROG commented on this pull request, in DCN/src/cpu/dcn_v2_cpu.cpp:

@@ -1,5 +1,6 @@ #include <vector>

@haruishi43 Thanks for your reply! Can you help verify this? This is what I did:

git clone https://github.com/CharlesShang/DCNv2.git
cd DCNv2
git remote add tteepe https://github.com/tteepe/DCNv2.git
git fetch tteepe
git checkout origin/master
python setup.py build develop

I also made some changes in dcn_v2_im2col_cuda.cu and dcn_v2_psroi_pooling_cuda.cu:

ceil() to ceilf()
floor() to floorf()
round() to roundf()

My system: Windows 10, CUDA 11.1.1, cuDNN 8.1.1.33, Anaconda Python 3.6.12 with these packages:

Cython @ file:///C:/ci/cython_1614014892888/work
cython-bbox==0.1.3
torch==1.8.0
torchaudio==0.8.0
torchvision==0.9.0

All errors:

cpu\dcn_v2_cpu.cpp(82): error C3861: 'THFloatBlas_gemm': identifier not found
cpu\dcn_v2_cpu.cpp(101): error C3861: 'THFloatBlas_gemm': identifier not found
cpu\dcn_v2_cpu.cpp(176): error C3861: 'THFloatBlas_gemm': identifier not found
cpu\dcn_v2_cpu.cpp(216): error C3861: 'THFloatBlas_gemm': identifier not found
cpu\dcn_v2_cpu.cpp(224): error C3861: 'THFloatBlas_gemv': identifier not found
cuda/dcn_v2_cuda.cu(107): error: identifier "THCState_getCurrentStream" is undefined
cuda/dcn_v2_cuda.cu(126): error: identifier "THCudaBlas_SgemmBatched" is undefined
cuda/dcn_v2_cuda.cu(273): error: identifier "THCudaBlas_Sgemm" is undefined
cuda/dcn_v2_cuda.cu(279): error: identifier "THCState_getCurrentStream" is undefined
cuda/dcn_v2_cuda.cu(324): error: identifier "THCudaBlas_Sgemv" is undefined

What did I miss? Please help me. I really want to make it work.

DrakeSkytecn · 2021-03-25T12:55:44Z

Upgrade your Visual Studio to 2019.

Quoted email notification (Thursday, March 25, 2021, 7:31 PM): @rathaROG commented on this pull request, in DCN/src/cpu/dcn_v2_cpu.cpp:

@@ -1,5 +1,6 @@ #include <vector>

Hi @haruishi43, I still had one more problem:

[4/4] "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64/link.exe" dcn_v2_cuda.o dcn_v2_cpu.o dcn_v2_im2col_cpu.o dcn_v2_psroi_pooling_cpu.o dcn_v2_cuda.cuda.o dcn_v2_im2col_cuda.cuda.o dcn_v2_psroi_pooling_cuda.cuda.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda_cu.lib ***@***.***@at@@***@***.***@***@***.***_N1@Z torch_cuda_cpp.lib ***@***.***@at@@yahxz torch.lib /LIBPATH:C:\dev\exc\Anaconda3\envs\DEFT\lib\site-packages\torch\lib torch_python.lib /LIBPATH:C:\dev\exc\Anaconda3\envs\DEFT\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib/x64" cudart.lib /out:DCNv2_gpu.pyd
FAILED: DCNv2_gpu.pyd
"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64/link.exe" dcn_v2_cuda.o dcn_v2_cpu.o dcn_v2_im2col_cpu.o dcn_v2_psroi_pooling_cpu.o dcn_v2_cuda.cuda.o dcn_v2_im2col_cuda.cuda.o dcn_v2_psroi_pooling_cuda.cuda.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda_cu.lib ***@***.***@at@@***@***.***@***@***.***_N1@Z torch_cuda_cpp.lib ***@***.***@at@@yahxz torch.lib /LIBPATH:C:\dev\exc\Anaconda3\envs\DEFT\lib\site-packages\torch\lib torch_python.lib /LIBPATH:C:\dev\exc\Anaconda3\envs\DEFT\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\lib/x64" cudart.lib /out:DCNv2_gpu.pyd
Creating library DCNv2_gpu.lib and object DCNv2_gpu.exp
MSVCRT.lib(loadcfg.obj) : error LNK2001: unresolved external symbol __enclave_config
DCNv2_gpu.pyd : fatal error LNK1120: 1 unresolved externals
ninja: build stopped: subcommand failed.

rathaROG · 2021-03-25T22:45:46Z

Upgrade your Visual Studio to 2019.

Thanks for the clue! I already had the latest version of VS2019, but I realized that I had not added the path to VS2019's cl.exe to the system PATH variable. In case you're interested, I also made a Windows-ready repo here:
https://github.com/rathaROG/DCNv2_Windows
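Before rebuilding, it can help to confirm which compiler the build will actually find on PATH. The snippet below is a small sketch using only the Python standard library; the helper name `find_compiler` is hypothetical, and the expectation that the printed path should point into a Visual Studio 2019 directory is an assumption based on the fix described above.

```python
import shutil


def find_compiler(name: str = "cl"):
    """Return the full path of the first `name` executable on PATH, or None."""
    # On Windows, shutil.which also tries the extensions in PATHEXT,
    # so "cl" resolves to cl.exe when it is reachable.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_compiler()
    if path is None:
        print("cl was not found on PATH; add the VS2019 compiler directory first.")
    else:
        print(f"The build will pick up: {path}")
```

If the printed path points into an older Visual Studio directory, that older toolchain is found first and the PATH order needs adjusting.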

DrakeSkytecn · 2021-03-26T03:07:53Z

Thanks for the repo 👍

JohnPekl · 2021-04-09T23:22:31Z

Hi! Matthew,
I have a RTX3090, and cloned your project that you modified 14 hours ago.
While ./make.sh still get the error about:
nvcc fatal : Unsupported gpu architecture 'compute_86'

I got a Ubuntu 18.04, CUDA 11.1 pytorch 1.7, and gcc 7.5.0 / g++ 7.5.0

I guess it's probably the CUDA caused error, ANY HELP WOULD BE APPRECIATED!!

I had the same issue and fixed it with the following steps.
My computer: NVIDIA-SMI 460.39, Driver Version: 460.39, CUDA Version: 11.2; nvcc -V reports Build cuda_11.0_bu.TC445_37.28540450_0

1. Install PyTorch 1.7.1 (py3.8_cuda11.0.221_cudnn8.0.5_0): conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch -c conda-forge
2. Clone the latest source from DCNv2_latest.
3. Add the flags '--gpu-architecture=compute_75','--gpu-code=sm_75' to the nvcc arguments in setup.py:

extra_compile_args["nvcc"] = [ "-DCUDA_HAS_FP16=1", "-D__CUDA_NO_HALF_OPERATORS__", "-D__CUDA_NO_HALF_CONVERSIONS__", "-D__CUDA_NO_HALF2_OPERATORS__", '--gpu-architecture=compute_75','--gpu-code=sm_75' ]

4. Run ./make.sh

haruishi43 · 2021-04-10T01:55:31Z

@JohnPekl have you tried running export TORCH_CUDA_ARCH_LIST='8.0+PTX' before running make.sh? It's only a temporary workaround but it should allow it to compile.
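The same workaround can be applied from Python when the build is launched by a script rather than an interactive shell. This is a minimal sketch: `build_env` and `run_build` are hypothetical helper names, and it assumes `./make.sh` is the build entry point, as in the comments above. `TORCH_CUDA_ARCH_LIST` is the environment variable read by PyTorch's C++/CUDA extension builder.

```python
import os
import subprocess


def build_env(arch_list: str = "8.0+PTX", base_env=None) -> dict:
    """Return a copy of the environment with TORCH_CUDA_ARCH_LIST set."""
    env = dict(os.environ if base_env is None else base_env)
    env["TORCH_CUDA_ARCH_LIST"] = arch_list
    return env


def run_build(script: str = "./make.sh", arch_list: str = "8.0+PTX"):
    """Run the build script with the architecture list applied to this run only."""
    return subprocess.run([script], env=build_env(arch_list), check=True)
```

Because the variable is set only in the child process environment, it does not leak into other builds started from the same shell.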

JohnPekl · 2021-04-10T02:55:38Z

@JohnPekl have you tried running export TORCH_CUDA_ARCH_LIST='8.0+PTX' before running make.sh? It's only a temporary workaround but it should allow it to compile.

@haruishi43, I haven't tried running export TORCH_CUDA_ARCH_LIST='8.0+PTX'. The four steps mentioned above are all that I did.

hhd-shuai · 2021-04-14T09:05:44Z

I have compiled successfully using pytorch1.7. Thanks. @jerryhitit @MatthewHowe

I downgraded PyTorch to 1.7 stable, but it still doesn't work for me.
Do you have any suggestions? Thank you in advance.

bryanbocao · 2021-06-15T20:49:30Z

I used this docker image: docker pull nvidia/cuda:11.1-devel-ubuntu18.04, then installed conda and torch-nightly.
I then cloned and compiled DCNv2. This could be an issue with CUDA 11.0 or some other conflicting packages.
When DCN doesn't compile, the error that caused it is usually above your screen cap. If you run ./make.sh again, the parts that already compiled will not be rebuilt, which makes it clearer what is causing the issue.

Hi @MatthewHowe I appreciate your work! Do you have specific commands to compile DCNv2? Thanks!

Ada1223 · 2021-07-31T09:41:07Z

@Xpangz this is a different error
In your case the g++ linker is failing to find compiled objects it expects to be created by the first build stage
Double check all your versions, if you used the above pip install to get the pytorch binary compiled with cuda 11 it will not be compatible with your install version of cuda 10.2
Your cuda version, cuda toolkit binary and pytorch (what cuda/cudnn it was compiled with) all have to agree for this to build

@XDynames thank you for your reply, I used conda install pytorch environment again, and it get solved.

Hello, I hit the same error as you. Could you explain in more detail how you used conda install? I recreated the environment and reinstalled PyTorch with: conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch,
but it still doesn't work. I'm going crazy.
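The advice quoted above (the CUDA toolkit and the CUDA version PyTorch was compiled with have to agree) can be checked mechanically. The sketch below compares the release reported by `nvcc --version` with a PyTorch CUDA version string. The helper names are hypothetical, and requiring an exact major.minor match is an assumption that is stricter than the build itself may need.

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output: str):
    """Extract (major, minor) from the 'release X.Y' part of nvcc --version."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def versions_agree(torch_cuda, nvcc_output: str) -> bool:
    """True when PyTorch's CUDA version and the nvcc release match exactly."""
    if not torch_cuda:  # CPU-only PyTorch builds report None
        return False
    major, minor = torch_cuda.split(".")[:2]
    return parse_nvcc_release(nvcc_output) == (int(major), int(minor))


def report() -> bool:
    """Compare the installed PyTorch with the nvcc found on PATH."""
    import torch  # assumed to be installed in the active environment

    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    print("torch CUDA:", torch.version.cuda, "| nvcc:", parse_nvcc_release(out))
    return versions_agree(torch.version.cuda, out)
```

Running `report()` inside the conda environment used for the build prints both versions, which is also the information other commenters in this thread asked people to share.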

Dhagash4 · 2021-10-27T15:39:05Z

My system specs:
Ubuntu 20.04

In my conda environment I am using PyTorch 1.7.0 with CUDA 10.2. testcpu.py gives no error, but when I run testcuda.py it prints this warning:

NVIDIA GeForce RTX 3060 Ti with CUDA capability sm_86 is not compatible with the current PyTorch installation. The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37.

It then pauses for a long time and gives this error:

Unable to find a valid cuDNN algorithm to run convolution

If I upgrade PyTorch, DCNv2 fails to build. Any solutions or suggestions would be much appreciated.

Thank you for your time.
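The warning above lists the architectures the installed PyTorch binary was built for. As a rough sketch of why sm_86 is rejected, the function below checks whether a list of that form contains a native binary usable on a given device, using the rule that a binary built for sm_XY runs on devices of the same major version with an equal or higher minor version. The function name is hypothetical, and PTX entries (compute_XY) are deliberately ignored, which is a simplification.

```python
def has_native_binary(device_sm: int, arch_list) -> bool:
    """Check for an sm_XY entry that can run natively on the given device.

    device_sm is the compute capability as an integer, e.g. 86 for sm_86.
    Only plain numeric entries such as "sm_75" are handled.
    """
    major, minor = divmod(device_sm, 10)
    for arch in arch_list:
        kind, _, number = arch.partition("_")
        if kind != "sm":
            continue  # skip PTX entries such as compute_37
        arch_major, arch_minor = divmod(int(number), 10)
        if arch_major == major and arch_minor <= minor:
            return True
    return False


# Architecture list copied from the warning text above.
supported = "sm_37 sm_50 sm_60 sm_61 sm_70 sm_75 compute_37".split()
print(has_native_binary(86, supported))
```

With that list there is no entry in the 8.x family, which is consistent with the warning: a PyTorch build compiled against CUDA 11 with sm_80 or sm_86 support is needed for this GPU.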

GeLink9999 · 2021-11-28T05:25:47Z

seems ok after using https://github.com/tteepe/DCNv2

Steinwang · 2021-12-05T07:20:23Z

Your cuda version, cuda toolkit binary and pytorch (what cuda/cudnn it was compiled with) all have to agree for this to build

Hello, could you please share your versions of PyTorch and cudatoolkit, plus the output of gcc -v and nvcc -V? I keep hitting "command 'g++' failed with exit status 1" and it is driving me crazy. @XDynames

fkjslee · 2021-12-27T09:58:36Z

Downgrading PyTorch to 1.7.0 stable did not fix it for me. Sad...

unbeliveyu · 2022-09-01T01:44:55Z

pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html
Running the command above gave me the following error:
ERROR: torch has an invalid wheel, .dist-info directory not found

unbeliveyu · 2022-09-01T02:53:56Z

pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html

I get the same error.

How did you fix it?
Please help me.

unbeliveyu · 2022-09-04T04:14:25Z

I successfully compiled on Windows 10, CUDA 11.1 (RTX3090), and PyTorch 1.7. Thank you so much!

Can you solve this problem? I compiled successfully with CUDA 11 and PyTorch 1.7 (RTX 3090). Thank you very much, @MatthewHowe.

error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1607370156314/work/aten/src/THC/THCCachingHostAllocator.cpp line=278 error=700 : an illegal memory access was encountered
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /opt/conda/conda-bld/pytorch_1607370156314/work/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8
error in modulated_deformable_im2col_cuda: no kernel image is available for execution on the device
/opt/conda/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 33 leaked semaphores to clean up at shutdown
  len(cache))
Traceback (most recent call last):
  File "C_ddp.py", line 349, in <module>
    main()
  File "C_ddp.py", line 109, in main
    mp.spawn(main_worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))
  File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
    while not context.join():
  File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 118, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
    fn(i, *args)
  File "/home/wj/Detection/CenterNetV2/C_ddp.py", line 277, in main_worker
    center_loss, center_fuse_loss, scale_loss, offset_loss = model({'img':img, 'label':label, 'heatmap_t':heatmap_t, 'hm_FuseClass_t':hm_FuseClass_t})
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wj/Detection/CenterNetV2/nets/resnet_dcn_model.py", line 35, in forward
    out=self.backbone(x)[0]
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wj/Detection/CenterNetV2/networks/resnet_dcn.py", line 261, in forward
    x = self.deconv_layers(x)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
    input = module(input)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 929, in forward
    output_padding, self.groups, self.dilation)
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR
You can try to repro this exception using the following code snippet. If that doesn't trigger the error, please include your original repro script when reporting this issue.

import torch
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.benchmark = True
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.allow_tf32 = True
data = torch.randn([32, 256, 40, 40], dtype=torch.float, device='cuda', requires_grad=True)
net = torch.nn.Conv2d(256, 256, kernel_size=[4, 4], padding=[1, 1], stride=[2, 2], dilation=[1, 1], groups=1)
net = net.cuda().float()
out = net(data)
out.backward(torch.randn_like(out))
torch.cuda.synchronize()

ConvolutionParams
    data_type = CUDNN_DATA_FLOAT
    padding = [1, 1, 0]
    stride = [2, 2, 0]
    dilation = [1, 1, 0]
    groups = 1
    deterministic = true
    allow_tf32 = true
input: TensorDescriptor 0x55923cf1b4e0
    type = CUDNN_DATA_FLOAT
    nbDims = 4
    dimA = 32, 256, 40, 40,
    strideA = 409600, 1600, 40, 1,
output: TensorDescriptor 0x55923cf1d1b0
    type = CUDNN_DATA_FLOAT
    nbDims = 4
    dimA = 32, 256, 20, 20,
    strideA = 102400, 400, 20, 1
weight: FilterDescriptor 0x55923cf4cec0
    type = CUDNN_DATA_FLOAT
    tensor_format = CUDNN_TENSOR_NCHW
    nbDims = 4
    dimA = 256, 256, 4, 4,
Pointer addresses:
    input: 0x7fcf70a80000
    output: 0x7fcf6fe00000
    weight: 0x7fd1d1700000

How did you solve this problem?
After I successfully compiled DCNv2, this problem also occurred when compiling the whole program. Please help me.

unbeliveyu · 2022-09-04T04:19:40Z

I successfully compiled on Windows 10, CUDA 11.1 (RTX3090), and PyTorch 1.7. Thank you so much!

Excuse me, how did you compile it successfully? Did you limit the compute capability of the graphics card to 75? Can you help me?

3846chs · 2022-11-08T06:48:07Z

I forked https://github.com/MatthewHowe/DCNv2, which adds compatibility with torch 1.7 / 1.8 and CUDA 10 / 11.
Its folder structure is slightly different from the original, which can cause errors in several projects that use DCNv2.
So I made the folder structure the same as the original: https://github.com/3846chs/DCNv2.git

My environment:
torch 1.7.1+cu110
cuda 11
RTX 3090

yellowjs0304 · 2022-11-22T00:20:42Z

Hi, I hit the same error (compute_86) with "torch 1.7 / 1.8 with cuda 10 / 11".
Did you use an Anaconda environment? @3846chs

+) Fix
I'm using an Anaconda environment (torch 1.9.1+cu111 / CUDA 11.2 / GPU 3080).
I don't know why this fixed it, but I reinstalled CUDA following the official CUDA installation guide. I also installed cuDNN the official way (unpacked cudnn.tar and copied the cudnn*.h and cudnn.so.* files to /usr/local/cuda-11.2/include/ and /usr/local/cuda-11.2/lib64 respectively), then updated the PATH environment variable in ~/.bashrc.

In my case, the issue was solved once I stopped using conda install cudatoolkit.

junmuzi · 2023-05-26T22:40:31Z

Hi, can someone help me to see why I get the following error here?

error: command '/usr/local/cuda-11.0/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/bin/nvcc' failed: No such file or directory: '/usr/local/cuda-11.0/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/bin/nvcc'

The full error message is as follows:
$ ./make.sh
...
/home/sda/lijun/Moving-object-detection-DSFNet/lib/models/DCNv2-master/DCN/src/cpu/dcn_v2_psroi_pooling_cpu.cpp:398:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES’
398 | AT_DISPATCH_FLOATING_TYPES(out_grad.type(), "dcn_v2_psroi_pooling_cpu_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/junmuzi/anaconda3/envs/mod6/lib/python3.7/site-packages/torch/include/ATen/Tensor.h:3,
from /home/junmuzi/anaconda3/envs/mod6/lib/python3.7/site-packages/torch/include/ATen/Context.h:4,
from /home/junmuzi/anaconda3/envs/mod6/lib/python3.7/site-packages/torch/include/ATen/ATen.h:9,
from /home/sda/lijun/Moving-object-detection-DSFNet/lib/models/DCNv2-master/DCN/src/cpu/dcn_v2_psroi_pooling_cpu.cpp:15:
/home/junmuzi/anaconda3/envs/mod6/lib/python3.7/site-packages/torch/include/ATen/core/TensorBody.h:363:7: note: declared here
363 | T * data() const {
| ^~~~
/usr/local/cuda-11.0/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/bin/nvcc -DWITH_CUDA -I/home/sda/lijun/Moving-object-detection-DSFNet/lib/models/DCNv2-master/DCN/src -I/home/junmuzi/anaconda3/envs/mod6/lib/python3.7/site-packages/torch/include -I/home/junmuzi/anaconda3/envs/mod6/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/junmuzi/anaconda3/envs/mod6/lib/python3.7/site-packages/torch/include/TH -I/home/junmuzi/anaconda3/envs/mod6/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda-11.0/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/include -I/home/junmuzi/anaconda3/envs/mod6/include/python3.7m -c /home/sda/lijun/Moving-object-detection-DSFNet/lib/models/DCNv2-master/DCN/src/cuda/dcn_v2_cuda.cu -o build/temp.linux-x86_64-cpython-37/home/sda/lijun/Moving-object-detection-DSFNet/lib/models/DCNv2-master/DCN/src/cuda/dcn_v2_cuda.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin g++ -std=c++14
error: command '/usr/local/cuda-11.0/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/bin/nvcc' failed: No such file or directory: '/usr/local/cuda-11.0/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/usr/local/cuda-11.7/:/bin/nvcc'
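The failing command has the form '<dir>:<dir>:...:/bin/nvcc', which suggests that the variable holding the CUDA root directory was set to a PATH-style list of several directories instead of a single directory. The sketch below checks for that. It assumes the variable in question is `CUDA_HOME` (the name PyTorch's extension builder consults), and `check_cuda_home` is a hypothetical helper name.

```python
import os


def check_cuda_home(value: str) -> list:
    """Return a list of human-readable problems found in a CUDA root setting."""
    if not value:
        return ["variable is empty or unset"]
    problems = []
    if os.pathsep in value:
        # A CUDA root must be one directory, not a PATH-style list.
        problems.append(f"contains several paths joined by {os.pathsep!r}")
    nvcc = os.path.join(value, "bin", "nvcc")
    if not os.path.isfile(nvcc):
        problems.append(f"no nvcc found at {nvcc}")
    return problems


if __name__ == "__main__":
    for problem in check_cuda_home(os.environ.get("CUDA_HOME", "")):
        print("CUDA_HOME:", problem)
```

If the check reports several joined paths, setting the variable to exactly one toolkit directory (for example the one matching the PyTorch build) before running ./make.sh is the likely fix, although the thread does not confirm this.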

QingZhuanya · 2024-09-26T02:35:15Z

I forked from https://github.com/MatthewHowe/DCNv2 which fixes to be compatible with [torch 1.7 / 1.8 with cuda 10 / 11] This folder structure is slightly different from the original, which can cause errors in several projects using DCNv2. So, I made the folder structure the same as the original: https://github.com/3846chs/DCNv2.git

My environment: torch 1.7.1+cu110 cuda 11 RTX 3090

Thank you, wonderful work.

MatthewHowe added 3 commits November 13, 2020 23:25:
8248636 half-potato:master pull request
f50ec52 Removing legacy files
194f573 From THBlas to torch implementation

This was referenced Nov 17, 2020:
Error compiling objects for extension in CUDA11.1, pytorch1.7 or pytorch-nightly(1.8), RTX3090 #90 (Closed)
RuntimeError: Error compiling objects for extension lucasjinreal/DCNv2_latest#7 (Closed)

nostayup mentioned this pull request Nov 30, 2020:
Could the author release a version that supports pytorch1.7? lucasjinreal/DCNv2_latest#9 (Closed)

yawara18 mentioned this pull request Apr 8, 2021:
add f yawara18/DCNv2#1 (Merged)

eshoyuan mentioned this pull request Apr 14, 2021:
setup.py build_ext fail RuntimeError: Error compiling objects for extension skyhehe123/SA-SSD#79 (Open)

unbeliveyu mentioned this pull request Sep 1, 2022:
pip install --pre torch torchvision -f https://download.pytorch.org/whl/nightly/cu110/torch_nightly.html #133 (Open)
