Command-line deployment error
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] void turbomind::LogitsProcessorLayer<T>::freeBuffer() [with T = float] stop
[TM][DEBUG] turbomind::LogitsProcessorLayer<T>::~LogitsProcessorLayer() [with T = float] stop
[TM][DEBUG] turbomind::SamplingLayer<T>::~SamplingLayer() [with T = float] start
[TM][DEBUG] void turbomind::SamplingLayer<T>::freeBuffer() [with T = float] start
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] void turbomind::SamplingLayer<T>::freeBuffer() [with T = float] stop
[TM][DEBUG] turbomind::SamplingLayer<T>::~SamplingLayer() [with T = float] stop
[TM][DEBUG] turbomind::StopCriteriaLayer<T>::~StopCriteriaLayer() [with T = float] start
[TM][DEBUG] void turbomind::StopCriteriaLayer<T>::freeBuffer() [with T = float] start
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] void turbomind::StopCriteriaLayer<T>::freeBuffer() [with T = float] stop
[TM][DEBUG] turbomind::StopCriteriaLayer<T>::~StopCriteriaLayer() [with T = float] stop
[TM][DEBUG] void turbomind::UnifiedDecoder<T>::freeBuffer() [with T = __nv_bfloat16]
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] Free buffer 0x6a492a000
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
[TM][DEBUG] Free buffer 0x7fb5b3000000
[TM][DEBUG] virtual void turbomind::Allocator<turbomind::AllocatorType::CUDA>::free(void**, bool)
Aborted (core dumped)
Reproduction

lmdeploy serve api_server /home/software/checkpoint --server-port 80 --log-level DEBUG --tp 1
Environment

lmdeploy check_env
sys.platform: linux
Python: 3.12.8 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 16:31:09) [GCC 11.2.0]
CUDA available: True
MUSA available: False
numpy_random_seed: 2147483648
GPU 0: NVIDIA A10
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 12.4, V12.4.131
GCC: gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PyTorch: 2.4.0+cu121
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.4.2 (Git Hash 1137e04ec0b5251ca2b4400a4fd3c667ce843d67)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 12.1
  - NVCC architecture flags: -gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_90,code=sm_90
  - CuDNN 90.2 (built against CUDA 12.5)
    - Built with CuDNN 90.1
  - Magma 2.6.1
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=9.1.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.4.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF
TorchVision: 0.19.0+cu121
LMDeploy: 0.6.4+
transformers: 4.47.1
gradio: Not Found
fastapi: 0.115.6
pydantic: 2.10.4
triton: 3.0.0
NVIDIA Topology:
        GPU0  CPU Affinity  NUMA Affinity  GPU NUMA ID
GPU0     X    0-7           0              N/A

Legend:
  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks
Error traceback

No response
Can you try the following command?
lmdeploy serve api_server /home/software/checkpoint --chat-template internvl2_5 --server-port 8000 --log-level INFO --tp 1
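For context, `lmdeploy serve api_server` exposes an OpenAI-compatible HTTP API, so once the server starts cleanly, a single chat request is a quick smoke test. Below is a minimal sketch using only the Python standard library; the port (8000, taken from the suggested command above), the model name, and the prompt are illustrative placeholders, not values from this issue.

```python
import json
from urllib import request


def build_chat_payload(model: str, prompt: str) -> dict:
    # Minimal OpenAI-style chat-completions request body.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }


def chat_once(base_url: str, payload: dict) -> dict:
    # POST the payload to the server's /v1/chat/completions endpoint
    # and return the decoded JSON response.
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


# Example (assumes the server from the command above is up; the model
# name is hypothetical):
#   reply = chat_once("http://localhost:8000",
#                     build_chat_payload("internvl2_5", "Hello"))
```

If this request also aborts the server, the crash is reproducible independently of the CLI chat path, which helps narrow the problem to the TurboMind backend.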
lvhan028