Unexpected None behavior when working with computation ak.cartesian #3197

Open
yimuchen opened this issue Aug 1, 2024 · 0 comments
Labels
bug (unverified) The problem described would be a bug, but needs to be triaged

yimuchen commented Aug 1, 2024

Version of Awkward Array

2.6.6

Description and code to reproduce

When doing pair-wise computations with ak.cartesian, if a collection was reordered using a None-able index array, the computation can cause an unused field to be evaluated as None.

In the case below, the array idx can be either None-able (signature 20 * var * ?int64) or not None-able (signature 20 * var * int64) depending on whether the fill_none line is enabled, even though the numerical values of the array are unchanged. The final output of the unused field col1.y will change depending on:

  • Whether idx was a None-able array
  • Whether we attempt to modify the collection with col1["r"].
import awkward as ak
import numpy as np

print(ak.__version__)

rng = np.random.default_rng(seed=1234)

n_events = 20

# Making the first collection with a handful of entries
n_col1 = rng.choice([2, 3], size=n_events)
col1 = ak.zip(
    {
        "x": rng.normal(size=ak.sum(n_col1)),
        "y": rng.normal(size=ak.sum(n_col1)),
        "z": rng.normal(size=ak.sum(n_col1)),
    },
)
col1 = ak.unflatten(col1, n_col1)

# Making second collection with few entries
n_col2 = rng.choice([0, 1], size=n_events)
col2 = ak.zip(
    {
        "x": rng.normal(size=ak.sum(n_col2)),
        "y": rng.normal(size=ak.sum(n_col2)),
        "z": rng.normal(size=ak.sum(n_col2)),
    }
)
col2 = ak.unflatten(col2, n_col2)

# Ordering the objects with a None-able index array
pre_x = col1.x  # Keep the original values to compare against after reordering
idx = ak.pad_none(ak.local_index(col1), 4, axis=1)
idx = idx[ak.local_index(col1)]  # Making a None-able version of the array
# idx = ak.fill_none(idx, 4)  # Should crash if None actually exists
col1 = col1[idx]

print(ak.sum(col1.x != pre_x) == 0)  # Showing that the array values are indeed identical before and after ordering
print(ak.to_list(col1.y), "\n", col1.y.__repr__())

# Making pair-wise computations
d = {"c1": col1, "c2": col2}
pairs = ak.cartesian(d, axis=1, nested=True)
diff_pairs = pairs.c1.x - pairs.c2.x

# Pushing results back into collection
col1["r"] = ak.min(diff_pairs, axis=2)

print(ak.to_list(col1.y), "\n", col1.y.__repr__())
@yimuchen yimuchen added the bug (unverified) The problem described would be a bug, but needs to be triaged label Aug 1, 2024