Commit
Backport several fixes into 2.7.x. (#2579)
* Fix `common_type` specialization for extended floating point types (#2483)

  * Fix `common_type` specialization for extended floating point types

    The machinery we had in place was not really suited to specialize `common_type` because it would take precedence over the actual implementation of `common_type`.

    In that case, we only specialized `common_type<__half, __half>` but not `common_type<__half, __half&>` and so on.

    This shows how brittle the whole thing is and that it is not extensible.

    Rather than putting another bandaid over it, add a proper 5th step in the `common_type` detection that properly treats combinations of an extended floating point type with an arithmetic type.

    Allowing arithmetic types is necessary to keep machinery like `pow(__half, 2)` working.

    Fixes [BUG]: `is_common_type` trait is broken when mixing rvalue references #2419

  * Work around MSVC declval bug

* Disable system header for narrowing conversion check (#2465)

  There is an incredible compiler bug reported in nvbug4867473 where the use of a system header changes the way some types are instantiated.

  The culprit seems to be that within a system header the compiler accepts narrowing conversions that it should not accept.

  Work around it by moving `__is_non_narrowing_convertible` to its own header that is included before we define the system header machinery.

* Drop 2 relative includes that snuck in (#2492)

* Fix popc.h when architecture is not x86 on MSVC. (#2524)

  * Fix popc when architecture is not x86
  * Update libcudacxx/include/cuda/std/__bit/popc.h

  ---------

  Co-authored-by: Michael Schellenberger Costa <[email protected]>

* Make `bit_cast` play nice with extended floating point types (#2434)

  * Move `__is_nvbf16` and `__is_nvfp16` to their own file
  * Make `bit_cast` play nice with extended floating point types

  ---------

  Co-authored-by: Michael Schellenberger Costa <[email protected]>
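To illustrate why an exact-match specialization such as `common_type<__half, __half>` is brittle, here is a minimal host-only sketch. Everything in the `sketch` namespace is hypothetical: `half_like` stands in for `__half`, and `exact_match_common` / `mixed_common` are illustration-only traits, not the libcu++ implementation. In particular, promoting through `float` for the mixed case is an assumption; the commit message does not say which type the real detection step yields.

```cpp
#include <type_traits>

namespace sketch
{
// Hypothetical stand-in for an extended floating point type such as `__half`.
struct half_like
{
  float value;
};

template <class T>
struct is_extended_fp : std::false_type {};
template <>
struct is_extended_fp<half_like> : std::true_type {};

// Brittle approach: only the exact pair that was spelled out gets a `type`.
template <class T, class U>
struct exact_match_common {};
template <>
struct exact_match_common<half_like, half_like>
{
  using type = half_like;
};

// Helper that detects whether a trait exposes a nested `type`.
template <class...>
using void_t = void;
template <class T, class = void>
struct has_type : std::false_type {};
template <class T>
struct has_type<T, void_t<typename T::type>> : std::true_type {};

// More robust approach: dispatch on type categories after decaying both inputs.
template <class T, class U, class = void>
struct mixed_common {};

// Same extended floating point type on both sides.
template <class T>
struct mixed_common<T, T, typename std::enable_if<is_extended_fp<T>::value>::type>
{
  using type = T;
};

// Extended floating point type mixed with an arithmetic type.
// ASSUMPTION: promote through `float`; chosen only for this illustration.
template <class T, class U>
struct mixed_common<T, U,
                    typename std::enable_if<is_extended_fp<T>::value && std::is_arithmetic<U>::value>::type>
{
  using type = typename std::common_type<float, U>::type;
};

template <class T, class U>
struct mixed_common<T, U,
                    typename std::enable_if<std::is_arithmetic<T>::value && is_extended_fp<U>::value>::type>
{
  using type = typename std::common_type<T, float>::type;
};

// Decay strips references and cv-qualifiers before the category dispatch.
template <class T, class U>
using mixed_common_t =
  typename mixed_common<typename std::decay<T>::type, typename std::decay<U>::type>::type;
} // namespace sketch

// The exact-match trait misses the reference-qualified pair ...
static_assert(sketch::has_type<sketch::exact_match_common<sketch::half_like, sketch::half_like>>::value, "");
static_assert(!sketch::has_type<sketch::exact_match_common<sketch::half_like, sketch::half_like&>>::value, "");
// ... while the category-based trait handles it, as well as arithmetic operands.
static_assert(std::is_same<sketch::mixed_common_t<sketch::half_like, sketch::half_like&>, sketch::half_like>::value, "");
static_assert(std::is_same<sketch::mixed_common_t<sketch::half_like&, int>, float>::value, "");
```

The point of the sketch is only the shape of the fix: a single rule keyed on type categories covers every cv-ref combination, whereas exact-match specializations need one entry per combination.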
Showing 19 changed files with 410 additions and 128 deletions.
73 changes: 73 additions & 0 deletions
libcudacxx/include/cuda/std/__cccl/is_non_narrowing_convertible.h
```diff
@@ -0,0 +1,73 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of libcu++, the C++ Standard Library for your entire system,
+// under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+// SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef __CCCL_IS_NON_NARROWING_CONVERTIBLE_H
+#define __CCCL_IS_NON_NARROWING_CONVERTIBLE_H
+
+#include <cuda/std/__cccl/compiler.h>
+
+//! There is compiler bug that results in incorrect results for the below `__is_non_narrowing_convertible` check.
+//! This breaks some common functionality, so this *must* be included outside of a system header. See nvbug4867473.
+#if defined(_CCCL_FORCE_SYSTEM_HEADER_GCC) || defined(_CCCL_FORCE_SYSTEM_HEADER_CLANG) \
+  || defined(_CCCL_FORCE_SYSTEM_HEADER_MSVC)
+# error \
+    "This header must be included only within the <cuda/std/__cccl/system_header>. This most likely means a mix and match of different versions of CCCL."
+#endif // system header detected
+
+namespace __cccl_internal
+{
+
+#if defined(_CCCL_CUDA_COMPILER) && (defined(__CUDACC__) || defined(_NVHPC_CUDA) || defined(_CCCL_COMPILER_NVRTC))
+template <class _Tp>
+__host__ __device__ _Tp&& __cccl_declval(int);
+template <class _Tp>
+__host__ __device__ _Tp __cccl_declval(long);
+template <class _Tp>
+__host__ __device__ decltype(__cccl_internal::__cccl_declval<_Tp>(0)) __cccl_declval() noexcept;
+
+// This requires a type to be implicitly convertible (also non-arithmetic)
+template <class _Tp>
+__host__ __device__ void __cccl_accepts_implicit_conversion(_Tp) noexcept;
+#else // ^^^ CUDA compilation ^^^ / vvv no CUDA compilation
+template <class _Tp>
+_Tp&& __cccl_declval(int);
+template <class _Tp>
+_Tp __cccl_declval(long);
+template <class _Tp>
+decltype(__cccl_internal::__cccl_declval<_Tp>(0)) __cccl_declval() noexcept;
+
+// This requires a type to be implicitly convertible (also non-arithmetic)
+template <class _Tp>
+void __cccl_accepts_implicit_conversion(_Tp) noexcept;
+#endif // no CUDA compilation
+
+template <class...>
+using __cccl_void_t = void;
+
+template <class _Dest, class _Source, class = void>
+struct __is_non_narrowing_convertible
+{
+  static constexpr bool value = false;
+};
+
+// This also prohibits narrowing conversion in case of arithmetic types
+template <class _Dest, class _Source>
+struct __is_non_narrowing_convertible<_Dest,
+                                      _Source,
+                                      __cccl_void_t<decltype(__cccl_internal::__cccl_accepts_implicit_conversion<_Dest>(
+                                                      __cccl_internal::__cccl_declval<_Source>())),
+                                                    decltype(_Dest{__cccl_internal::__cccl_declval<_Source>()})>>
+{
+  static constexpr bool value = true;
+};
+
+} // namespace __cccl_internal
+
+#endif // __CCCL_IS_NON_NARROWING_CONVERTIBLE_H
```
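For readers unfamiliar with the detection idiom used in this header, the following host-only sketch mirrors its structure with standard library facilities. The names in the `sketch` namespace are hypothetical and chosen for illustration; it uses `std::declval` in place of the header's own `__cccl_declval` and omits the CUDA annotations and the system header guard.

```cpp
#include <type_traits>
#include <utility>

namespace sketch
{
// Only callable when the argument is implicitly convertible to T.
template <class T>
void accepts_implicit_conversion(T) noexcept;

template <class...>
using void_t = void;

// Primary template: the conversion is rejected by default.
template <class Dest, class Source, class = void>
struct is_non_narrowing_convertible : std::false_type {};

// Selected only if both expressions are well-formed:
//  1. Source converts implicitly to Dest (function argument passing), and
//  2. Dest{source} is valid, which list-initialization rejects when narrowing.
template <class Dest, class Source>
struct is_non_narrowing_convertible<
  Dest,
  Source,
  void_t<decltype(accepts_implicit_conversion<Dest>(std::declval<Source>())),
         decltype(Dest{std::declval<Source>()})>> : std::true_type {};
} // namespace sketch

// Widening conversions are accepted.
static_assert(sketch::is_non_narrowing_convertible<double, float>::value, "");
static_assert(sketch::is_non_narrowing_convertible<long long, int>::value, "");
// Narrowing conversions are rejected.
static_assert(!sketch::is_non_narrowing_convertible<float, double>::value, "");
static_assert(!sketch::is_non_narrowing_convertible<int, double>::value, "");
```

The second `decltype` is the part the commit message describes as sensitive: if a compiler accepts narrowing inside braces (reportedly what happens within a system header per nvbug4867473), the partial specialization is selected for narrowing conversions as well and the trait wrongly reports `true`.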