`"cuda"` target in numba.vectorize not working correctly? #3179

lgray · 2024-07-10T00:18:47Z

Version of Awkward Array

2.6.6

Description and code to reproduce

The following code fails:

import awkward as ak
import cupy as cp
import numba as nb

ak.numba.register_and_check()

@nb.vectorize(
    [
        nb.float32(nb.float32),
        nb.float64(nb.float64),
    ]
)
def _square(x):
    return x * x

@nb.vectorize(
    [
        nb.float32(nb.float32),
        nb.float64(nb.float64),
    ],
    target="cuda",
)
def _square_cuda(x):
    return x * x

counts = cp.random.poisson(lam=3, size=50)
flat_values = cp.random.normal(size=int(counts.sum()))

values = ak.unflatten(flat_values, counts)

values2_cpu = _square(ak.to_backend(values, "cpu"))

print(values2_cpu)

values2 = _square_cuda(values)

print(values2)

resulting in the output:

[[0.045, 1.83], [1.55, 0.224, 0.621, ..., 1.24, 3.87], ..., [0.153, 0.0017]]

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[12], line 33
     29 values2_cpu = _square(ak.to_backend(values, "cpu"))
     31 print(values2_cpu)
---> 33 values2 = _square_cuda(values)
     35 print(values2)

File ~/.conda/envs/coffea-gpu/lib/python3.11/site-packages/numba/cuda/vectorizers.py:28, in CUDAUFuncDispatcher.__call__(self, *args, **kws)
     17 def __call__(self, *args, **kws):
     18     """
     19     *args: numpy arrays or DeviceArrayBase (created by cuda.to_device).
     20            Cannot mix the two types in one call.
   (...)
     26                   the input arguments.
     27     """
---> 28     return CUDAUFuncMechanism.call(self.functions, args, kws)

File ~/.conda/envs/coffea-gpu/lib/python3.11/site-packages/numba/np/ufunc/deviceufunc.py:254, in UFuncMechanism.call(cls, typemap, args, kws)
    252 # Begin call resolution
    253 cr = cls(typemap, args)
--> 254 args = cr.get_arguments()
    255 resty, func = cr.get_function()
    257 outshape = args[0].shape

File ~/.conda/envs/coffea-gpu/lib/python3.11/site-packages/numba/np/ufunc/deviceufunc.py:202, in UFuncMechanism.get_arguments(self)
    198 def get_arguments(self):
    199     """Prepare and return the arguments for the ufunc.
    200     Does not call to_device().
    201     """
--> 202     self._fill_arrays()
    203     self._fill_argtypes()
    204     self._resolve_signature()

File ~/.conda/envs/coffea-gpu/lib/python3.11/site-packages/numba/np/ufunc/deviceufunc.py:100, in UFuncMechanism._fill_arrays(self)
     98     self.scalarpos.append(i)
     99 else:
--> 100     self.arrays[i] = np.asarray(arg)

File [~/coffea-gpu/awkward/src/awkward/highlevel.py:1434](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/highlevel.py#line=1433), in Array.__array__(self, dtype)
   1429 with ak._errors.OperationErrorContext(
   1430     "numpy.asarray", (self,), {"dtype": dtype}
   1431 ):
   1432     from awkward._connect.numpy import convert_to_array
-> 1434     return convert_to_array(self._layout, dtype=dtype)

File [~/coffea-gpu/awkward/src/awkward/_connect/numpy.py:481](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/_connect/numpy.py#line=480), in convert_to_array(layout, dtype)
    480 def convert_to_array(layout, dtype=None):
--> 481     out = ak.operations.to_numpy(layout, allow_missing=False)
    482     if dtype is None:
    483         return out

File [~/coffea-gpu/awkward/src/awkward/_dispatch.py:64](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/_dispatch.py#line=63), in named_high_level_function.<locals>.dispatch(*args, **kwargs)
     62 # Failed to find a custom overload, so resume the original function
     63 try:
---> 64     next(gen_or_result)
     65 except StopIteration as err:
     66     return err.value

File [~/coffea-gpu/awkward/src/awkward/operations/ak_to_numpy.py:48](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/operations/ak_to_numpy.py#line=47), in to_numpy(array, allow_missing)
     45 yield (array,)
     47 # Implementation
---> 48 return _impl(array, allow_missing)

File [~/coffea-gpu/awkward/src/awkward/operations/ak_to_numpy.py:60](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/operations/ak_to_numpy.py#line=59), in _impl(array, allow_missing)
     57 backend = NumpyBackend.instance()
     58 numpy_layout = layout.to_backend(backend)
---> 60 return numpy_layout.to_backend_array(allow_missing=allow_missing)

File [~/coffea-gpu/awkward/src/awkward/contents/content.py:1020](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/contents/content.py#line=1019), in Content.to_backend_array(self, allow_missing, backend)
   1018 else:
   1019     backend = regularize_backend(backend)
-> 1020 return self._to_backend_array(allow_missing, backend)

File [~/coffea-gpu/awkward/src/awkward/contents/listoffsetarray.py:2072](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/contents/listoffsetarray.py#line=2071), in ListOffsetArray._to_backend_array(self, allow_missing, backend)
   2070     return buffer.view(np.dtype(("S", max_count)))
   2071 else:
-> 2072     return self.to_RegularArray()._to_backend_array(allow_missing, backend)

File [~/coffea-gpu/awkward/src/awkward/contents/listoffsetarray.py:283](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/contents/listoffsetarray.py#line=282), in ListOffsetArray.to_RegularArray(self)
    278 _size = Index64.empty(1, self._backend.index_nplike)
    279 assert (
    280     _size.nplike is self._backend.index_nplike
    281     and self._offsets.nplike is self._backend.index_nplike
    282 )
--> 283 self._backend.maybe_kernel_error(
    284     self._backend[
    285         "awkward_ListOffsetArray_toRegularArray",
    286         _size.dtype.type,
    287         self._offsets.dtype.type,
    288     ](
    289         _size.data,
    290         self._offsets.data,
    291         self._offsets.length,
    292     )
    293 )
    294 size = self._backend.index_nplike.index_as_shape_item(_size[0])
    295 length = self._offsets.length - 1

File [~/coffea-gpu/awkward/src/awkward/_backends/backend.py:67](https://analytics-hub.fnal.gov/user/lagray/lab/tree/coffea-gpu/coffea-gpu/awkward/src/awkward/_backends/backend.py#line=66), in Backend.maybe_kernel_error(self, error)
     65     return
     66 else:
---> 67     raise ValueError(self.format_kernel_error(error))

ValueError: cannot convert to RegularArray because subarray lengths are not regular (in compiled code: https://github.com/scikit-hep/awkward/blob/awkward-cpp-35/awkward-cpp/src/cpu-kernels/awkward_ListOffsetArray_toRegularArray.cpp#L22)

This error occurred while calling

    numpy.asarray(
        <Array [[0.2120250193289826, ...], ...] type='50 * var * float64'>
        dtype = None
    )

lgray · 2024-07-10T16:04:06Z

Ah, I was missing an awkward.numba.register_and_check() call! That was it. It seems to work fine now.

lgray · 2024-07-10T16:06:43Z

Ah, no, that was for a flattened array! I still get the error above even after calling register_and_check().

lgray · 2024-07-10T16:38:53Z

If I wrap the call in a flatten/unflatten, I'm able to get this to work, which is a bit clunky. It seems like some aspects of the Awkward Array are lost when dealing with the "cuda" target of numba.vectorize?

Here's the code that works:

import awkward as ak
import cupy as cp
import numba as nb

ak.numba.register_and_check()

@nb.vectorize(
    [
        nb.float32(nb.float32),
        nb.float64(nb.float64),
    ]
)
def _square(x):
    return x * x

@nb.vectorize(
    [
        nb.float32(nb.float32),
        nb.float64(nb.float64),
    ],
    target="cuda",
)
def _square_cuda(x):
    return x * x

def square_cuda_wrapped(x):
    counts = x.layout.offsets.data[1:] - x.layout.offsets.data[:-1]
    return ak.unflatten(cp.array(_square_cuda(ak.flatten(x))), counts)

counts = cp.random.poisson(lam=3, size=5000000)
flat_values = cp.random.normal(size=int(counts.sum()))

values = ak.unflatten(flat_values, counts)

values2_cpu = _square(ak.to_backend(values, "cpu"))

print(values2_cpu)

values2 = square_cuda_wrapped(values)

print(values2)
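
The wrapper above works because the element-wise kernel only ever sees a flat buffer, while the list structure (the counts derived from the offsets) is carried around it and reapplied afterwards. A minimal plain-Python sketch of that pattern follows. The helper names `counts_from_offsets`, `unflatten`, and `apply_flat` are hypothetical, and nested lists stand in for the Awkward and CuPy arrays, so this only illustrates the idea and is not how Awkward implements it.

```python
def counts_from_offsets(offsets):
    """Per-list lengths, i.e. offsets[1:] - offsets[:-1] written for plain lists."""
    return [stop - start for start, stop in zip(offsets[:-1], offsets[1:])]


def unflatten(flat, counts):
    """Split a flat sequence back into sublists of the given lengths."""
    out, start = [], 0
    for n in counts:
        out.append(flat[start:start + n])
        start += n
    return out


def apply_flat(kernel, flat, offsets):
    """Apply an element-wise kernel to the flat content, then restore the list structure."""
    counts = counts_from_offsets(offsets)
    # The list comprehension stands in for the vectorized kernel call on the flat buffer.
    return unflatten([kernel(v) for v in flat], counts)


# Three lists of lengths 2, 0 and 3 stored as one flat buffer plus offsets.
offsets = [0, 2, 2, 5]
flat = [1.0, 2.0, 3.0, 4.0, 5.0]
squared = apply_flat(lambda x: x * x, flat, offsets)
print(squared)  # [[1.0, 4.0], [], [9.0, 16.0, 25.0]]
```

In the real wrapper, reading `x.layout.offsets` assumes the top-level layout is a list type that exposes offsets; `ak.num(x, axis=1)` is probably a more general way to obtain the same counts.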

lgray · 2024-07-24T19:47:45Z

@ianna Have you been able to gain any understanding as to what is going on here? Mostly just curious, really. It's holding up progress with getting coffea going on GPUs.

ianna · 2024-07-31T20:11:37Z

@lgray and @jpivarski - it looks like there could be a simple solution to this issue. However, it needs to be implemented in Numba. I'm checking with the developers and will keep you posted.

jpivarski · 2024-07-31T20:44:31Z

(I'm not too surprised that the fix needs to go into Numba itself, since Awkward doesn't get much control over how functions that are called on an Awkward Array get dispatched. If the calling function doesn't call __array_ufunc__ or some equivalent, we have no entry point into the code that eventually fails.)
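
To make the dispatch point concrete, here is a toy sketch in plain Python. Every class and function in it (`as_rectangular`, `Ragged`, `CoercingDispatcher`, `OverridableDispatcher`) is hypothetical and greatly simplified; none of it is Numba's or Awkward's actual code. It only contrasts a dispatcher that coerces its argument straight to a rectangular array, as the traceback above shows with `np.asarray(arg)`, with one that first gives the argument's type a chance to handle the call through `__array_ufunc__`.

```python
def as_rectangular(obj):
    """Mimic a strict array conversion: all rows must have the same length."""
    rows = [list(r) for r in obj.rows]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("subarray lengths are not regular")
    return rows


class Ragged:
    """Toy ragged container that knows how to apply a ufunc-like object to itself."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Apply the scalar function row by row, preserving the list structure.
        return Ragged([[ufunc.scalar_func(v) for v in row] for row in self.rows])


class CoercingDispatcher:
    """Converts its argument to a rectangular array and never consults it."""

    def __init__(self, scalar_func):
        self.scalar_func = scalar_func

    def __call__(self, arg):
        rows = as_rectangular(arg)
        return [[self.scalar_func(v) for v in row] for row in rows]


class OverridableDispatcher(CoercingDispatcher):
    """Checks for __array_ufunc__ first and only then falls back to coercion."""

    def __call__(self, arg):
        override = getattr(type(arg), "__array_ufunc__", None)
        if override is not None:
            return override(arg, self, "__call__", arg)
        return super().__call__(arg)


def square(x):
    return x * x


data = Ragged([[1.0, 2.0], [3.0]])

try:
    CoercingDispatcher(square)(data)
    coercion_error = None
except ValueError as err:
    coercion_error = str(err)

result = OverridableDispatcher(square)(data)
print(coercion_error)  # subarray lengths are not regular
print(result.rows)     # [[1.0, 4.0], [9.0]]
```

With the first dispatcher the container has no say in what happens to it, which is the situation described here; with the second, the library that owns the container gets an entry point.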

lgray added the bug (unverified) label Jul 10, 2024

lgray closed this as completed Jul 10, 2024

lgray reopened this Jul 10, 2024

ianna self-assigned this Jul 10, 2024

ianna mentioned this issue Jul 31, 2024

Allow libraries that implement __array_ufunc__ to override DUFunc.__c… numba/numba#8995

Merged

This was referenced Aug 1, 2024

[FEA] Allow libraries that implement __array_ufunc__ to override CUDAUFuncDispatcher NVIDIA/numba-cuda#36

Closed

[FEA] Allow libraries that implement __array_ufunc__ to override CUDAUFunc NVIDIA/numba-cuda#37

Open

ianna linked a pull request Aug 1, 2024 that will close this issue

test: add vectorize with target cuda test #3194

Draft
