Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor C++ language extensions and C++ support #3673

Open
wants to merge 1 commit into
base: docs/develop
Choose a base branch
from

Conversation

MKKnorr
Copy link

@MKKnorr MKKnorr commented Nov 15, 2024

No description provided.

@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch 2 times, most recently from e911a63 to da93aea Compare November 15, 2024 14:34
@neon60 neon60 force-pushed the hip_runtime_api_how-to branch from 0c52420 to 31fa125 Compare November 18, 2024 16:16
@neon60 neon60 deleted the branch docs/develop November 18, 2024 17:39
@neon60 neon60 closed this Nov 18, 2024
@neon60 neon60 reopened this Nov 18, 2024
@MKKnorr MKKnorr changed the base branch from hip_runtime_api_how-to to docs/develop November 19, 2024 09:34
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch 2 times, most recently from 760259e to 6d6c430 Compare November 21, 2024 17:01
@MKKnorr MKKnorr marked this pull request as ready for review November 21, 2024 17:01
@MKKnorr MKKnorr requested a review from neon60 November 21, 2024 17:02
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch 3 times, most recently from 058fdde to 3eddfa4 Compare November 22, 2024 10:43
@MKKnorr MKKnorr changed the base branch from docs/develop to landing_page_2 November 22, 2024 10:43
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch from 3eddfa4 to c485169 Compare November 22, 2024 10:45
@neon60
Copy link
Contributor

neon60 commented Nov 24, 2024

@MKKnorr Please rebase on docs/develop.

@MKKnorr MKKnorr changed the base branch from landing_page_2 to docs/develop November 25, 2024 09:42
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch 2 times, most recently from ac2b239 to 761c1f8 Compare November 25, 2024 09:48
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch 2 times, most recently from 58960c9 to 4b37cf5 Compare November 25, 2024 15:10
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch from 4b37cf5 to 0cf8414 Compare November 26, 2024 13:05
Copy link

@randyh62 randyh62 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok to me. I left a few comments.

docs/how-to/hip_cpp_language_extensions.rst Outdated Show resolved Hide resolved
docs/how-to/hip_cpp_language_extensions.rst Show resolved Hide resolved
docs/how-to/hip_cpp_language_extensions.rst Outdated Show resolved Hide resolved
docs/how-to/hip_cpp_language_extensions.rst Outdated Show resolved Hide resolved
docs/how-to/hip_cpp_language_extensions.rst Show resolved Hide resolved
HIP supports ``__threadfence()``, ``__threadfence_block()`` and
``__threadfence_system()``.

On AMD devices, ``__threadfence_system()``, has restrictions and therefore needs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it is referring to some implicit knowledge the user must have relative to threadfence_system? What are the "restrictions"? And is it appropriate to call it a workaround in the docs? Seems like a workaround is good for an issue or bug report, but not for user docs. Perhaps "requires the following process"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish I had more information on this, but I couldn't find information proving or disproving this, so I just reformatted the already existing content

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requires the following process?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll ask around what the current status on this feature is, maybe we can get rid of this section altogether, but I'll postpone it for now

docs/how-to/kernel_language_cpp_support.rst Outdated Show resolved Hide resolved
docs/how-to/kernel_language_cpp_support.rst Outdated Show resolved Hide resolved
docs/how-to/kernel_language_cpp_support.rst Show resolved Hide resolved
docs/how-to/kernel_language_cpp_support.rst Show resolved Hide resolved
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch from ef46f97 to 00a1e34 Compare November 28, 2024 13:26
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch from 00a1e34 to b91f075 Compare December 4, 2024 11:28
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch from b91f075 to 4ffc403 Compare December 4, 2024 13:28
@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch from 4ffc403 to 5cc8184 Compare December 5, 2024 15:54

.. code-block:: cpp

#include <hip/hip_runtime.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we modify this to add host and device functions as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example is specifically about launching a kernel and I'd like to keep it as concise as possible

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense, but this kind of felt like a build up, there is description of host, device and global functions but example only shows global function. I think user reading through it, when they reach this section should get a complete picture of all three in action.

I'll leave it up to you, but if I was reading this, I wish there would be a more complete example which encompasses all that I just read.

docs/how-to/hip_cpp_language_extensions.rst Show resolved Hide resolved
docs/how-to/hip_cpp_language_extensions.rst Show resolved Hide resolved

GPU multiprocessors have a fixed pool of resources (primarily registers and
shared memory) which are shared by the actively running warps. Using more
resources can increase IPC of the kernel but reduces the resources available for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can increase IPC of the kernel

What does IPC mean here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instructions per cycle. Rephrased the sentence to replace the acronym

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use such words (IPC incase of kernels?). Can we measure IPC for kernels? I would recommend using something like 'occupancy' which is a measurable metric.

automatically reduce shared memory usage. The compilation fails, if the compiler
can not generate code that satisfies the launch bounds.

On NVCC this parameter maps to the ``.maxntid`` PTX directive.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are mentioning PTX directive, should we also mention the amdgpu_flat_work_group_size function attribute?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but this also is not the appropriate place in general. I already opened an issue to move this to the compiler documentation, as this does not directly relate to the language

Porting from CUDA __launch_bounds
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

CUDA defines the ``__launch_bounds`` qualifier which works similar to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does CUDA support __launch_bounds (without the trailing underscore?) can you point me to the CUDA documentation for it, cant seem to find it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could have sworn I've seen it somewhere, but it seems you're right, it also has trailing underscores. Fixed it.

docs/how-to/hip_cpp_language_extensions.rst Show resolved Hide resolved
Managed memory is a special qualifier, that makes the marked memory available on
the device and on the host. For more details see :ref:`unified_memory`.

__restrict__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isnt restrict more of a clang extension than HIP, should we include it in HIP documentation? or alteast mention that here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is somewhat of a language extension, but also compiler specific/a hint to the compiler. However since CUDA requires this, I'm quite sure HIP also has to support it, so I'd say it has its place in this documentation

reduced support of the C++ standard library in device code, as functions are
only compiled for the host by default. There are ongoing efforts to implement
C++ standard library functionality with `libhipcxx
<https://github.com/ROCm/libhipcxx>`_.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something worth mentioning here, that STL features which are guaranteed to resolve at compile time (for example type_traits) are supported on device code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is also noted under C++11 support - constexpr, but I added an additional note here.

@MKKnorr MKKnorr force-pushed the refactor-cpp-language-extensions branch from 5cc8184 to 7ee5328 Compare December 16, 2024 15:50
@neon60 neon60 force-pushed the refactor-cpp-language-extensions branch from 7ee5328 to 92410f6 Compare December 19, 2024 16:32
@neon60 neon60 force-pushed the refactor-cpp-language-extensions branch from 92410f6 to e9a6c15 Compare December 19, 2024 17:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants