Skip to content

[SYCL] Support UR_DEVICE_INFO_USM_CONTEXT_MEMCPY_SUPPORT_EXP on Cuda #19267

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 18, 2025

Conversation

jchlanda
Copy link
Contributor

@jchlanda jchlanda commented Jul 2, 2025

This is an optimisation that uses a direct memcopy from host to device.

@jchlanda jchlanda requested a review from a team as a code owner July 2, 2025 13:51
@jchlanda jchlanda requested a review from steffenlarsen July 2, 2025 13:51
@jchlanda
Copy link
Contributor Author

jchlanda commented Jul 2, 2025

It's tested by an existing sycl_device_globals.cpp test.

@jchlanda jchlanda force-pushed the jakub/device_globals_gpu branch from 5ded23f to 1561006 Compare July 2, 2025 13:56
@jchlanda jchlanda temporarily deployed to WindowsCILock July 2, 2025 13:56 — with GitHub Actions Inactive
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

void *pDst,
const void *pSrc,
size_t Size) {
UR_CHECK_ERROR(cuMemcpyHtoD((CUdeviceptr)pDst, pSrc, Size));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious - where in the API does it say that pSrc is a host pointer and pDst is a device pointer? I can't find much information about the extension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems you're right, the docs for the function declaration don't mention it explicitly. Are you worried that this call could potentially handle situation where both src and dest are device pointers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, or host->host or device->host. It's not clear to me what this API is meant to handle.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've discussed it offline, you are right, the extension does not give any guarantees about the location of the pointers. I'll put it back to draft to address this issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be bale to trace the pointers with cuPointerGetAttribute and CU_POINTER_ATTRIBUTE_MEMORY_TYPE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved to cuMemcpy that automatically handles the provenience of the src/dst pointers.

@jchlanda jchlanda temporarily deployed to WindowsCILock July 2, 2025 16:00 — with GitHub Actions Inactive
@jchlanda jchlanda temporarily deployed to WindowsCILock July 2, 2025 16:00 — with GitHub Actions Inactive
@jchlanda jchlanda marked this pull request as draft July 3, 2025 10:27
@jchlanda jchlanda force-pushed the jakub/device_globals_gpu branch from b07d3fc to e84cf2b Compare July 7, 2025 09:22
@jchlanda jchlanda temporarily deployed to WindowsCILock July 7, 2025 09:22 — with GitHub Actions Inactive
@jchlanda jchlanda marked this pull request as ready for review July 7, 2025 09:24
@jchlanda jchlanda temporarily deployed to WindowsCILock July 7, 2025 09:43 — with GitHub Actions Inactive
@jchlanda jchlanda force-pushed the jakub/device_globals_gpu branch from e84cf2b to 8d70bec Compare July 7, 2025 10:59
@jchlanda jchlanda temporarily deployed to WindowsCILock July 7, 2025 10:59 — with GitHub Actions Inactive
@jchlanda jchlanda temporarily deployed to WindowsCILock July 7, 2025 11:20 — with GitHub Actions Inactive
@jchlanda jchlanda force-pushed the jakub/device_globals_gpu branch from 8d70bec to 40707c8 Compare July 7, 2025 13:16
@jchlanda jchlanda temporarily deployed to WindowsCILock July 7, 2025 13:17 — with GitHub Actions Inactive
Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@jchlanda jchlanda force-pushed the jakub/device_globals_gpu branch from 40707c8 to cba6baa Compare July 8, 2025 05:57
@jchlanda jchlanda temporarily deployed to WindowsCILock July 8, 2025 05:57 — with GitHub Actions Inactive
@jchlanda jchlanda temporarily deployed to WindowsCILock July 8, 2025 06:18 — with GitHub Actions Inactive
@jchlanda jchlanda temporarily deployed to WindowsCILock July 8, 2025 06:18 — with GitHub Actions Inactive
@jchlanda jchlanda force-pushed the jakub/device_globals_gpu branch from cba6baa to 5eee551 Compare July 17, 2025 13:28
@jchlanda
Copy link
Contributor Author

This should be ready to roll @intel/llvm-gatekeepers. Thank you.

@sommerlukas sommerlukas merged commit d0279af into intel:sycl Jul 18, 2025
34 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants