[Debug][PaddleV3] Test the hang when importing the inference model #1077

Open
wants to merge 28 commits into
base: develop

megemini
Contributor

[Debug][PaddleV3] Test the hang when importing the inference model

Related: #1064

@PaddlePaddle PaddlePaddle locked and limited conversation to collaborators Oct 21, 2024
@PaddlePaddle PaddlePaddle unlocked this conversation Oct 21, 2024
@megemini
Contributor Author

@vivienfanghuagood

Following the approach in #1064, logging is now enabled. The 效率云 CI seems to have a problem; you can take a look here:

https://xly.bce.baidu.com/paddlepaddle/x2paddle-ci/newipipe/detail/11736915/job/27809560

2024-10-21 12:13:00 [2024-10-21 04:13:00,355] [    INFO] convert.py:303 - Now translating model from onnx to paddle.
2024-10-21 12:13:00 Converting node 1 ...     
2024-10-21 12:13:00 Converting node 2 ...     [2024-10-21 04:13:00,361] [    INFO] convert.py:325 - Model optimizing ...
2024-10-21 12:13:00 [2024-10-21 04:13:00,370] [    INFO] convert.py:329 - Model optimized.
2024-10-21 12:13:00 /usr/local/lib/python3.9/dist-packages/paddle/framework/io.py:939: UserWarning: The input state dict is empty, no need to save.
2024-10-21 12:13:00   warnings.warn("The input state dict is empty, no need to save.")
2024-10-21 12:13:00 /usr/local/lib/python3.9/dist-packages/paddle/jit/dy2static/program_translator.py:699: UserWarning: full_graph=False don't support input_spec arguments. It will not produce any effect.
2024-10-21 12:13:00 You can set full_graph=True, then you can assign input spec.
2024-10-21 12:13:00   warnings.warn(
2024-10-21 12:13:00 I1021 04:13:00.460772    76 op_desc.cc:1112] CompileTime infer shape on abs
2024-10-21 12:13:00 I1021 04:13:00.460808    76 infershape_utils.cc:546] *******: op kernel signature - Kernel Signature - name: abs; inputs: X; attributes: ; outputs: Out
2024-10-21 12:13:00 I1021 04:13:00.463881    76 eager.cc:118] Tensor(cuda_graph) have not GradNode, add ******* for it.
2024-10-21 12:13:00 I1021 04:13:00.479213    76 op_desc.cc:1112] CompileTime infer shape on scale
2024-10-21 12:13:00 I1021 04:13:00.479271    76 infershape_utils.cc:546] *******: op kernel signature - Kernel Signature - name: scale; inputs: X; attributes: scale, bias, bias_after_scale; outputs: Out
2024-10-21 12:13:00 /usr/local/lib/python3.9/dist-packages/paddle/static/io.py:581: UserWarning: no variable in your model, please ensure there are any variables in your model to save
2024-10-21 12:13:00   warnings.warn(
2024-10-21 12:13:00 I1021 04:13:00.514019    76 onednn_context.cc:104] Clearing DNNL cache.
2024-10-21 12:13:00 I1021 04:13:00.514046    76 onednn_context.cc:122] Resetting Paddle data layout to NCHW.
2024-10-21 12:13:00 [2024-10-21 04:13:00,515] [    INFO] convert.py:331 - Successfully exported Paddle static graph model!
2024-10-21 12:13:00 [2024-10-21 04:13:00,515] [    INFO] convert.py:348 - ================================================
2024-10-21 12:13:00 [2024-10-21 04:13:00,516] [    INFO] convert.py:349 - 
2024-10-21 12:13:00 [2024-10-21 04:13:00,516] [    INFO] convert.py:350 - Model Converted! Fill this survey to help X2Paddle better, https://iwenjuan.baidu.com/?code=npyd51 
2024-10-21 12:13:00 [2024-10-21 04:13:00,516] [    INFO] convert.py:353 - 
2024-10-21 12:13:00 [2024-10-21 04:13:00,517] [    INFO] convert.py:354 - ================================================
2024-10-21 12:13:00 [2024-10-21 04:13:00,517] [    INFO] onnxbase.py:207 - >>> onnx2paddle finished ...
2024-10-21 12:13:00 [2024-10-21 04:13:00,517] [    INFO] onnxbase.py:333 - >>> _mk_onnx_res *******...
2024-10-21 12:13:00 [2024-10-21 04:13:00,520] [    INFO] onnxbase.py:339 - >>> sess.run ...
2024-10-21 12:13:00 [2024-10-21 04:13:00,525] [    INFO] onnxbase.py:213 - >>> _mk_paddle_res ...
2024-10-21 12:13:00 I1021 04:13:00.526403    76 eager.cc:118] Tensor(generated_tensor_1) have not GradNode, add ******* for it.
2024-10-21 12:13:00 [2024-10-21 04:13:00,527] [    INFO] onnxbase.py:247 - >>> NOT self.run_dynamic...
2024-10-21 12:13:00 [2024-10-21 04:13:00,528] [    INFO] onnxbase.py:260 - >>> config.enable_use_gpu...
2024-10-21 12:13:00 [2024-10-21 04:13:00,528] [    INFO] onnxbase.py:265 - >>> config.enable_use_gpu finished...
2024-10-21 12:13:00 [2024-10-21 04:13:00,528] [    INFO] onnxbase.py:271 - >>> enable_memory_optim finished...
2024-10-21 12:13:00 [2024-10-21 04:13:00,529] [    INFO] onnxbase.py:276 - >>> config.disable_glog_info...
2024-10-21 12:13:00 [2024-10-21 04:13:00,529] [    INFO] onnxbase.py:281 - >>> config.pass_builder...
2024-10-21 12:13:00 [2024-10-21 04:13:00,530] [    INFO] onnxbase.py:285 - >>> create_predictor(config)...

The log stops here, and no glog output follows. Could you please take a look? 🙏🙏🙏

@vivienfanghuagood

It looks like it is stuck in the OneDNN conversion. Try config.DisableMKLDNN(), then run again and check the log.
Also, @zhanglirong1999, do you have any suggestions?

@megemini
Contributor Author

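It looks like it is stuck in the OneDNN conversion. Try config.DisableMKLDNN(), then run again and check the log. Also, @zhanglirong1999, do you have any suggestions?

That does not seem to work. It reports that the attribute does not exist: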

2024-10-21 17:36:17 E
2024-10-21 17:36:17 ======================================================================
2024-10-21 17:36:17 ERROR: test (__main__.TestAbsConvert)
2024-10-21 17:36:17 ----------------------------------------------------------------------
2024-10-21 17:36:17 Traceback (most recent call last):
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/test_auto_scan_abs.py", line 62, in test
2024-10-21 17:36:17     self.run_and_statis(max_examples=30)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 105, in run_and_statis
2024-10-21 17:36:17     loop_func()
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 96, in run_test
2024-10-21 17:36:17     return self.run_test(configs=configs)
2024-10-21 17:36:17   File "/usr/local/lib/python3.9/dist-packages/hypothesis/core.py", line 1469, in wrapped_test
2024-10-21 17:36:17     raise the_error_hypothesis_found
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 96, in run_test
2024-10-21 17:36:17     return self.run_test(configs=configs)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/auto_scan_test.py", line 227, in run_test
2024-10-21 17:36:17     obj.run()
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/onnxbase.py", line 419, in run
2024-10-21 17:36:17     paddle_res[str(v)] = self._mk_paddle_res(ver=v)
2024-10-21 17:36:17   File "/workspace/X2Paddle/test_autoscan/onnx/onnxbase.py", line 281, in _mk_paddle_res
2024-10-21 17:36:17     config.DisableMKLDNN()
2024-10-21 17:36:17 AttributeError: 'paddle.base.libpaddle.AnalysisConfig' object has no attribute 'DisableMKLDNN'

@luotao1 luotao1 added the contributor External developers label Oct 23, 2024
@vivienfanghuagood

In Python, use config.disable_mkldnn() instead.
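The naming mismatch above (DisableMKLDNN in C++ versus disable_mkldnn in the Python binding) can be absorbed by a small helper in the test harness. This is only a sketch: the two method names come from this thread, while disable_onednn and FakeConfig are hypothetical names introduced for illustration, and FakeConfig merely stands in for the real config object so the snippet runs without Paddle installed.

```python
def disable_onednn(config):
    """Turn off oneDNN (MKLDNN) on a config object, whichever spelling it exposes.

    Returns the name of the method that was called.
    """
    # Both names are taken from this thread: disable_mkldnn is the Python
    # spelling suggested above, DisableMKLDNN is the C++ spelling.
    for name in ("disable_mkldnn", "DisableMKLDNN"):
        method = getattr(config, name, None)
        if callable(method):
            method()
            return name
    raise AttributeError(
        f"{type(config).__name__} exposes no known method to disable oneDNN"
    )


class FakeConfig:
    """Hypothetical stand-in for the real config, used only to exercise the helper."""

    def __init__(self):
        self.mkldnn_enabled = True

    def disable_mkldnn(self):
        self.mkldnn_enabled = False


if __name__ == "__main__":
    cfg = FakeConfig()
    print(disable_onednn(cfg), cfg.mkldnn_enabled)
```

In onnxbase.py the helper would be called on the real config right before create_predictor(config). Whether disabling oneDNN actually avoids the hang is exactly what the thread is trying to find out.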

@zhanglirong1999

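Execution does seem to have reached onednn_context, but there is no more specific information after that, so I have no further suggestions for now. If the test passes with oneDNN disabled, that confirms it is a oneDNN problem, and the oneDNN side will follow up later if needed.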

@megemini
Contributor Author

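Execution does seem to have reached onednn_context, but there is no more specific information after that, so I have no further suggestions for now. If the test passes with oneDNN disabled, that confirms it is a oneDNN problem, and the oneDNN side will follow up later if needed.

CI still hangs at create_predictor.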
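Since the log goes silent once create_predictor is entered, one generic way to get more information out of a hung CI job is Python's standard faulthandler module, which can dump every thread's Python stack after a timeout. The sketch below is not part of this PR: hang_watchdog and suspect_call are hypothetical names, and suspect_call is a placeholder for the real call. The dump shows Python frames only, so it confirms where the interpreter is blocked but not what the native code is doing.

```python
import contextlib
import faulthandler
import sys


@contextlib.contextmanager
def hang_watchdog(timeout_s, stream=None):
    """Dump all Python thread stacks if the enclosed block runs longer than timeout_s."""
    target = sys.stderr if stream is None else stream
    # exit=False: only report the stacks, do not kill the process.
    faulthandler.dump_traceback_later(timeout_s, repeat=False, file=target, exit=False)
    try:
        yield
    finally:
        # The block finished (or raised), so the pending dump is no longer needed.
        faulthandler.cancel_dump_traceback_later()


def suspect_call():
    """Placeholder for the call that hangs, e.g. create_predictor(config)."""
    return "predictor"


if __name__ == "__main__":
    with hang_watchdog(timeout_s=60):
        result = suspect_call()
    print(result)
```

Passing exit=True instead would terminate the process after the dump, which may be preferable on CI so the job fails quickly rather than waiting for the platform timeout.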

Labels
contributor External developers
4 participants