Skip to content

[SYCL][Docs] Add sycl_ext_named_sub_group_sizes kernel properties #19795

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: sycl
Choose a base branch
from

Conversation

steffenlarsen
Copy link
Contributor

This PR adds the kernel properties required for sycl_ext_named_sub_group_sizes. When sub_group_size_primary is attached to a kernel, the kernel has its !intel_reqd_sub_group_size contain the value -1. When sub_group_size_automatic is attached to a kernel, the kernel does not receive any !intel_reqd_sub_group_size metadata. From an optional kernel features/sycl-post-link perspective, these behaviors are aligned with the specification, allowing sub_group_size_automatic marked kernels to be bundled with kernels with no attached sub_group_size property, while kernels marked with sub_group_size_primary are bundled separately with respect to other kernels marked with a required sub group size.

@steffenlarsen
Copy link
Contributor Author

This is a revival of #12335. Tag @jzc for awareness.

@steffenlarsen steffenlarsen changed the title [SYCL] Add sycl_ext_named_sub_group_sizes kernel properties [SYCL][Docs] Add sycl_ext_named_sub_group_sizes kernel properties Aug 14, 2025
Comment on lines +32 to +34
// CHECK: spir_kernel void @{{.*}}SGSizePrimaryKernelFunctor()
// CHECK-SAME: !intel_reqd_sub_group_size ![[SGSizeAttr:[0-9]+]]
Q.parallel_for(NDRange, SGSizePrimaryKernelFunctor{});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add a test here that the kernel actually executes with the primary sub-group size, as returned by info::device::primary_sub_group_size? Or is that covered by another test somewhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it looks like this device info descriptor is still missing. I will add it. As an aside, should it be renamed to ext_oneapi_primary_sub_group_size?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to you on this -- we should do whatever is consistent with our other extensions.

sycl::info isn't an enum, so I think we could define this as sycl::ext::oneapi::info::device::primary_sub_group_size if we wanted to. But sycl::info::ext_oneapi_primary_sub_group_size is fine by me, if that's what we usually do.

Signed-off-by: Larsen, Steffen <[email protected]>
@gmlueck
Copy link
Contributor

gmlueck commented Aug 14, 2025

Does this PR implement the entire extension?

The spec says the following about the default sub-group size:

If no sub-group size property appears on a kernel or SYCL_EXTERNAL function, the default behavior of an implementation must be to compile and execute the kernel or function using a device’s primary sub-group size.

I didn't see any code that changes the default behavior. Has our default always been to use the primary sub-group size? If so, should we add a test that verifies this?

The spec also says that the DPC++ implementation supports a command line option named -fsycl-default-sub-group-size. I don't see anything in this PR that adds that option. Was it added before? Do we have a test for it?

@steffenlarsen steffenlarsen marked this pull request as draft August 14, 2025 14:17
@steffenlarsen
Copy link
Contributor Author

The spec says the following about the default sub-group size:

If no sub-group size property appears on a kernel or SYCL_EXTERNAL function, the default behavior of an implementation must be to compile and execute the kernel or function using a device’s primary sub-group size.

I didn't see any code that changes the default behavior. Has our default always been to use the primary sub-group size? If so, should we add a test that verifies this?

Admittedly, this was a revival of an old PR, but it sounds like I need to give the extension another read. I don't think we pick the primary at the moment. We let the implementation pick any sub-group size it sees fit for any given kernel. I am a little concerned that changing it will have an effect on user code, but I suppose their code should be robust towards it if they don't specify a size themselves.

The spec also says that the DPC++ implementation supports a command line option named -fsycl-default-sub-group-size. I don't see anything in this PR that adds that option. Was it added before? Do we have a test for it?

We have that option from the old sub-group extension, so that should be fine. Mostly just frontend testing for it though, so that could be improved indeed.

@gmlueck
Copy link
Contributor

gmlueck commented Aug 14, 2025

Admittedly, this was a revival of an old PR, but it sounds like I need to give the extension another read. I don't think we pick the primary at the moment.

One concern I have about this part is the compatibility with older drivers. If DPC++ by default generates some new SPIR-V decoration on kernels, will the old LTS driver understand this decoration? I think it's OK if we generate new SPIR-V when the application uses a new SYCL feature. It's reasonable to require the user to use an updated driver when they use a new feature. However, it's not OK if unmodified SYCL code suddenly requires a new driver just because it's compiled with a new DPC++.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants