Alexandr-Konovalov
Contributor

Make it possible to delay creating the copy of HostKernel until it is used outside the submit stack, i.e. by the scheduler. Type-erase the kernel lambda via the vptr in HostKernelRefBase.

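A minimal sketch of the idea described above, not the PR's actual code: a non-owning reference wrapper is type-erased through a virtual base, and the kernel is copied to the heap only when a consumer such as the scheduler asks for ownership. The class bodies, the `call` method, `scheduleLater`, and `runDeferredCopyDemo` are hypothetical simplifications, and returning `std::unique_ptr` is an assumption.

```cpp
#include <cassert>
#include <memory>

// Hypothetical, simplified stand-in for the owning, type-erased kernel.
struct HostKernelBase {
  virtual ~HostKernelBase() = default;
  virtual void call(int Id) = 0; // illustrative entry point, not the real API
};

template <typename KernelType>
class HostKernel : public HostKernelBase {
  KernelType MKernel; // owning copy of the kernel lambda
public:
  explicit HostKernel(const KernelType &Kernel) : MKernel(Kernel) {}
  void call(int Id) override { MKernel(Id); }
};

// Type-erased, non-owning view of a kernel that lives on the caller's stack.
struct HostKernelRefBase {
  virtual ~HostKernelRefBase() = default;
  // Copies the kernel to the heap only when it must outlive the stack frame.
  virtual std::unique_ptr<HostKernelBase> takeOrCopyOwnership() const = 0;
};

template <typename KernelType>
class HostKernelRef : public HostKernelRefBase {
  const KernelType &MKernel; // reference only, no copy made here
public:
  explicit HostKernelRef(const KernelType &Kernel) : MKernel(Kernel) {}
  std::unique_ptr<HostKernelBase> takeOrCopyOwnership() const override {
    return std::make_unique<HostKernel<KernelType>>(MKernel);
  }
};

// Non-template consumer: it sees only the base class, so the kernel type
// is erased behind the vptr.
std::unique_ptr<HostKernelBase> scheduleLater(const HostKernelRefBase &Ref) {
  return Ref.takeOrCopyOwnership();
}

int runDeferredCopyDemo() {
  int Sum = 0;
  auto Kernel = [&Sum](int Id) { Sum += Id; };
  HostKernelRef<decltype(Kernel)> Ref(Kernel);                // no copy yet
  std::unique_ptr<HostKernelBase> Owned = scheduleLater(Ref); // copy happens here
  Owned->call(2);
  Owned->call(3);
  return Sum;
}
```

If the consumer never calls `takeOrCopyOwnership()`, no heap allocation or kernel copy takes place in this sketch.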
Comment on lines 261 to 266
#ifndef __INTEL_PREVIEW_BREAKING_CHANGES
// This function is needed for host-side compilation to keep kernels
// instantiated. This is important for debuggers to be able to associate
// kernel code instructions with source code lines.
// NOTE: InstantiateKernelOnHost() should not be called.
void InstantiateKernelOnHost() override {
Contributor

I think this can be empty. HostKernelRef<...> instantiates HostKernel<...> on line 256, and its InstantiateKernelOnHost already does the right thing (outside preview). And for preview we need a mechanism that doesn't require copy-paste.

Contributor Author

It seems it can be empty. I tried to describe the reason in a comment.

"And for preview we need a mechanism that doesn't require copy-paste."

Are we talking about the GetInstantiateKernelOnHostPtr() call? It requires a template parameter, so it's unclear what we can do other than adding the call to the templated constructor.

virtual char *getPtr() override {
return const_cast<char *>(reinterpret_cast<const char *>(&MKernel));
}
virtual std::shared_ptr<HostKernelBase> takeOrCopyOwnership() const override {
Contributor

IMO, this should be std::unique_ptr because it has no overhead, and one can always easily create a shared_ptr from it via unique_ptr::release.

Contributor Author

I don't totally agree. The overhead is the creation/destruction of unique_ptr, while the caller needs a shared_ptr. (And for shared_ptr we return 2 pointers vs. 1 for unique_ptr, so it's hard to judge.) Is there a chance that the caller will someday need a unique_ptr?

Contributor

"Overhead is the creation/destruction of unique_ptr" is exactly zero with optimizations enabled: https://godbolt.org/z/fcaos1Wr7

That is not true for std::shared_ptr (which not only needs an extra memory allocation but also involves atomics).
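To illustrate the point about returning std::unique_ptr, here is a small sketch of a caller turning the result into a std::shared_ptr only when it needs shared ownership. `DemoKernel`, `makeKernel`, and `sharedFromUniqueDemo` are hypothetical names, and `HostKernelBase` is reduced to a stub.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stub base class; the real HostKernelBase has a different interface.
struct HostKernelBase {
  virtual ~HostKernelBase() = default;
  virtual int id() const = 0;
};

struct DemoKernel : HostKernelBase {
  int id() const override { return 42; }
};

// The factory hands out unique ownership: one pointer, no control block.
std::unique_ptr<HostKernelBase> makeKernel() {
  return std::make_unique<DemoKernel>();
}

int sharedFromUniqueDemo() {
  std::unique_ptr<HostKernelBase> Unique = makeKernel();
  // shared_ptr has a converting constructor from unique_ptr&&; the control
  // block is allocated here, and only because this caller wants sharing.
  std::shared_ptr<HostKernelBase> Shared = std::move(Unique);
  assert(Unique == nullptr);       // ownership has been transferred
  assert(Shared.use_count() == 1); // single shared owner so far
  return Shared->id();
}
```

Moving the unique_ptr into the shared_ptr also carries over any custom deleter, whereas constructing a shared_ptr from the raw pointer returned by `release()` falls back to plain `delete`.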

Contributor Author

Thanks, good point. Done.


- std::shared_ptr<detail::HostKernelBase> HostKernel = std::make_shared<
-     detail::HostKernel<KernelType, TransformedArgType, Dims>>(KernelFunc);
+ HostKernelRef<KernelType, TransformedArgType, Dims> HostKernel(KernelFunc);
Contributor

I wish we could do

const HostKernelRefBase &TypeErasedKernel = HostKernelRef<...>{KernelFunc};

(https://godbolt.org/z/h9v9s3TrG), but getPtr() isn't marked as const 😞
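A sketch of what that pattern could look like if getPtr() were const-qualified. This is an assumption for illustration: the `const char *` return type, the reduced class bodies, and `constRefBindingDemo` are hypothetical and do not match the PR's code.

```cpp
#include <cassert>

struct HostKernelRefBase {
  virtual ~HostKernelRefBase() = default;
  // Assumed const-qualified here so it can be called through a const reference.
  virtual const char *getPtr() const = 0;
};

template <typename KernelType>
class HostKernelRef : public HostKernelRefBase {
  const KernelType &MKernel;
public:
  HostKernelRef(const KernelType &Kernel) : MKernel(Kernel) {}
  const char *getPtr() const override {
    return reinterpret_cast<const char *>(&MKernel);
  }
};

bool constRefBindingDemo() {
  auto KernelFunc = [](int) {};
  // Binding the temporary to a const reference extends its lifetime to the
  // end of this scope, and only the type-erased base is named afterwards.
  const HostKernelRefBase &TypeErasedKernel =
      HostKernelRef<decltype(KernelFunc)>{KernelFunc};
  return TypeErasedKernel.getPtr() ==
         reinterpret_cast<const char *>(&KernelFunc);
}
```

With a non-const getPtr(), the call through `TypeErasedKernel` would not compile, which is the obstacle mentioned in the comment.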

const KernelType &MKernel;

public:
HostKernelRef(const KernelType &Kernel) : MKernel(Kernel) {}
Contributor

Should we delete copy ctor here as well?

Contributor Author

We need to create HostKernelRef from a const reference, as in sycl/include/sycl/queue.hpp, so we can't.

  HostKernelRef<KernelType, KernelTypeUniversalRef, TransformedArgType, Dims>
      HostKernel(std::forward<KernelTypeUniversalRef>(KernelFunc));

Contributor

I mean adding HostKernelRef(const HostKernelRef&) = delete; which still leaves the constructor from const KernelType & available.

Contributor Author

Can we explicitly delete the copy ctor in HostKernelRefBase instead?
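A minimal sketch of that option, assuming C++17 and heavily reduced class bodies: deleting the copy operations in the base makes the implicitly declared copy constructor of every derived HostKernelRef deleted as well, while construction from a const kernel reference keeps working. `DemoKernel`, `DemoRef`, `get`, and `deletedCopyDemo` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

struct HostKernelRefBase {
  HostKernelRefBase() = default;
  // Deleting copy in the base implicitly deletes it in derived classes too.
  HostKernelRefBase(const HostKernelRefBase &) = delete;
  HostKernelRefBase &operator=(const HostKernelRefBase &) = delete;
  virtual ~HostKernelRefBase() = default;
};

template <typename KernelType>
class HostKernelRef : public HostKernelRefBase {
  const KernelType &MKernel;
public:
  HostKernelRef(const KernelType &Kernel) : MKernel(Kernel) {}
  const KernelType &get() const { return MKernel; }
};

struct DemoKernel {
  int Value = 7;
};
using DemoRef = HostKernelRef<DemoKernel>;

// Construction from a const kernel reference is still allowed...
static_assert(std::is_constructible_v<DemoRef, const DemoKernel &>,
              "constructible from const KernelType &");
// ...but copying the reference wrapper itself is rejected.
static_assert(!std::is_copy_constructible_v<DemoRef>,
              "copy constructor is implicitly deleted");

int deletedCopyDemo() {
  const DemoKernel Kernel{};
  DemoRef Ref(Kernel); // fine: converting constructor, not the copy ctor
  return Ref.get().Value;
}
```

Placing the deletion in the base keeps the rule in one spot, so each template instantiation does not need to repeat it.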
