Tags: martindurant/awkward
ByteMaskedArray, BitMaskedArray, and tomask operation (scikit-hep#143)
* [WIP] Add a 'tomask' operation to make masked data, rather than filtering.
* Added stubs for ByteMaskedArray and BitMaskedArray.
* Linked ByteMaskedArray and BitMaskedArray to Python.
* [skip ci] Save work.
* [skip ci] Save work.
* Added UnmaskedArray stubs.
* Some of 'getitem' for ByteMaskedArray is done.
* ByteMaskedArray::asslice.
* ByteMaskedArray, BitMaskedArray, UnmaskedArray integrated into Python (everywhere there had been a reference to IndexedOptionArray64).
* Found a jagged indexing case that hadn't been covered before.
* [skip ci] save work
* [skip ci] save work; added 'carry'.
* [skip ci] This isn't working: 'jagged getitem on an array containing Nones'.
* Removed projections and carrying of SliceItems.
* Masked jagged arrays may now be used to slice masked jagged arrays (though the missing values have to be in the same places, of course).
* Finally, we're triggering ByteMaskedArray::getitem_next_jagged_generic
* And ByteMaskedArray::getitem_next_jagged_generic was easy.
* ByteMaskedArray::setidentities is done.
* ByteMaskedArray::deep_copy implemented (without testing).
* ByteMaskedArray::validityerror is implemented.
* ByteMaskedArray::num is implemented and tested.
* ByteMaskedArray::offsets_and_flattened is implemented and tested.
* ByteMaskedArray::rpad/rpad_and_clip are implemented and tested, fixing IndexedOptionArray::rpad/rpad_and_clip in the process.
* ByteMaskedArray::reducers are implemented and tested.
* ByteMaskedArray::localindex is implemented and tested.
* ByteMaskedArray::choose is implemented and tested, fixing List/ListOffset/RegularArray::choose in the process.
* Finished merging functions for option type arrays, but all other type arrays have to check for the new option type arrays.
* All other type arrays now check for the new option type arrays.
* The option-type and pass-through type 'simplify' methods are all aware of each other.
* Renamed to 'simplify_optiontype' and 'simplify_uniontype' and introduced a 'shallow_simplify'. (Not sure if we'll ever need a 'deep_simplify'...)
* Defined BitMaskedArray::toByteMaskedArray (and verified lsb_order against awkward0).
* Defined (but didn't test) toIndexedOptionArray64 as well.
* Implemented getitem_at/iteration and it agrees with conversions to ByteMaskedArray/IndexedOptionArray64.
* BitMaskedArray::bytemask uses the same code as BitMaskedArray::toByteMaskedArray.
* BitMaskedArray::getitem_range_nowrap remains a BitMaskedArray only if start % 8 == 0.
* BitMaskedArray is done.
* UnmaskedArray has been fully implemented, though the tests are minimal.
* Byte/Bit/UnmaskedArray boxing and unboxing in Numba is done and tested; needs 'hasfield', 'getitem_at', and 'lower_getitem_at'.
* ByteMaskedArray for Numba is done.
* BitMaskedArray for Numba is done.
* UnmaskedArray for Numba is done.
* All cases of IndexedOptionArray in Python that would be better served by ByteMaskedArray have been changed.
* Found all the places where ByteMaskedArray was needed in C++ and fixed the validwhen/lsb_order conventions.
* Implemented and tested tomask.
* Stubs for the 'semigroup' parameter.
* Remove those 'semigroup' stubs because it's already there as 'mask'. The high-level parameter name has been renamed to 'maskidentity' to be a little more clear. All I need to do is replace awkward_numpyarray_reduce_mask_indexedoptionarray64 with a ByteMaskedArray version.
* The whole 'semigroup'/'maskidentity' thing turned out to be just changing the already-implemented IndexedOptionArray64 into a ByteMaskedArray. Done with the PR.
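The bit-to-byte mask conversion, the valid_when/lsb_order conventions, and the difference between masking and filtering mentioned above can be illustrated with a small pure-Python sketch. This is not the C++ implementation from the PR: the function names (`bits_to_bytemask`, `to_option_list`, `select`, `range_stays_bit_packed`) are hypothetical, and plain lists stand in for the array classes.

```python
def bits_to_bytemask(packed, length, lsb_order=True):
    """Expand packed mask bytes into one 0/1 entry per element.

    With lsb_order=True, element i lives in bit (i % 8) of byte (i // 8),
    counting from the least significant bit; otherwise from the most
    significant bit.
    """
    out = []
    for i in range(length):
        byte = packed[i // 8]
        shift = (i % 8) if lsb_order else 7 - (i % 8)
        out.append((byte >> shift) & 1)
    return out


def to_option_list(content, bytemask, valid_when=True):
    """Keep entries whose mask equals valid_when; others become None."""
    return [x if bool(m) == valid_when else None
            for x, m in zip(content, bytemask)]


def select(content, keep, tomask=False):
    """Filtering drops rejected entries; masking keeps their positions as None."""
    if tomask:
        return [x if k else None for x, k in zip(content, keep)]
    return [x for x, k in zip(content, keep) if k]


def range_stays_bit_packed(start):
    """A slice can reuse the packed bytes only if it starts on a byte boundary."""
    return start % 8 == 0


packed = [0b00000101, 0b00000010]          # two mask bytes, ten elements
content = list(range(10))

bytemask_lsb = bits_to_bytemask(packed, 10, lsb_order=True)
bytemask_msb = bits_to_bytemask(packed, 10, lsb_order=False)
options = to_option_list(content, bytemask_lsb, valid_when=True)
filtered = select([1, 2, 3, 4], [True, False, True, False])
masked = select([1, 2, 3, 4], [True, False, True, False], tomask=True)
```

Masking preserves the length of the array, which is why the result lines up element by element with the unselected data; filtering does not.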
Implement 'argcross' and 'cross'. (scikit-hep#159)
* [WIP] Implement 'argcross' and 'cross'.
* The 'cross' operation has been implemented for axis > 0.
* The 'cross' operation has been implemented for axis == 0.
* Fix Python 2.7.
* Really fix Python 2.7.
* Fix Python 3.5 also; they both have unordered dicts.
* Stubs for 'localindex'.
* ListOffsetArray::localindex is done.
* argcross has been defined in terms of localindex.
* RegularArray::localindex is done.
* All other 'localindex' implementations are trivial; done, compiles, but not explicitly tested.
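The statement that argcross is defined in terms of localindex can be sketched on nested Python lists. This is an illustration of the idea for a single level of nesting only, not the library's code; the three function names mirror the operation names in the commit but are defined here from scratch.

```python
from itertools import product


def localindex(lists):
    """Replace each inner item by its position within its own list."""
    return [list(range(len(inner))) for inner in lists]


def cross(a, b):
    """Per-entry Cartesian product of two equally long jagged arrays."""
    return [list(product(x, y)) for x, y in zip(a, b)]


def argcross(a, b):
    """Same as cross, but pairing local positions instead of values."""
    return cross(localindex(a), localindex(b))


a = [[1, 2], [], [3]]
b = [["a"], ["b"], ["c", "d"]]

pairs = cross(a, b)
index_pairs = argcross(a, b)
```

Each index pair in `index_pairs` points back into the corresponding inner lists of `a` and `b`, so indexing with it reproduces `pairs`.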
Add the ak.pandas.multiindex(array) function. (scikit-hep#154)
* [WIP] Add the ak.pandas.multiindex(array) function.
* Stubs for the function and tests.
* Keep third-party connectors in awkward1._connect and generate the user-facing modules.
* Forgot __init__.py
* Seems to be working; now I need good tests.
* Entry/subentry numbers are now correct (had to be localindex, not parents).
* Fix Numba entrypoint.
* I think this is done.
* Regularize the JSON for Python 3.5 (which must be picking up an old Pandas).
* The old Pandas made broken JSON; fixing that, too.
* Very bad JSON...
* Very, very bad JSON...
* Also replace 'z' in that JSON...
* Give up trying to fix old Pandas's bad JSON.
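One plausible reading of the entry/subentry note is that the subentry level of the index is the position of each item inside its own list (a local index), while "parents" would repeat the outer list number. The sketch below computes both from list offsets; `parents`, `local_positions`, and `entry_subentry` are hypothetical helpers written for this illustration, and no Pandas call is made.

```python
def parents(offsets):
    """For each flattened item, the number of the list it came from."""
    out = []
    for entry in range(len(offsets) - 1):
        out.extend([entry] * (offsets[entry + 1] - offsets[entry]))
    return out


def local_positions(offsets):
    """For each flattened item, its position within its own list."""
    out = []
    for entry in range(len(offsets) - 1):
        out.extend(range(offsets[entry + 1] - offsets[entry]))
    return out


def entry_subentry(offsets):
    """(entry, subentry) tuples, one per flattened item."""
    return list(zip(parents(offsets), local_positions(offsets)))


# Three lists of lengths 3, 0, and 2.
offsets = [0, 3, 3, 5]
index_tuples = entry_subentry(offsets)
```

Tuples of this shape are what a two-level row index needs: one row per inner item, with the empty list contributing no rows.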
Finish the count/sizes/num operation and the flatten operation (scikit-hep#152)
* [skip ci] Comment out original implementation of count/sizes to merge things in deliberately.
* [skip ci] Save for now.
* Skipping 17 tests, the remainder passes. Now to get those tests working again.
* [skip ci] Savepoint.
* [skip ci] Got edge cases right for NumpyArray.
* This is a good definition for EmptyArray and NumpyArray.
* ListArray and ListOffsetArray work.
* RegularArray works.
* [skip ci] Working on IndexedOptionArray.
* IndexedOptionArray works.
* Refactored a common technique in IndexedOptionArray.
* RecordArray works.
* UnionArray works; removed old tests of 'sizes' because the meaning of axis=0 has changed.
* Rename 'sizes' to 'num', as in 'ak.num(muons) <= 2'...
* Merge master and move the 'rpad_axis0' method.
* Restructuring 'flatten' to also return offsets, since these would be needed by UnionArray and axis > 1.
* Commented out operations associated with 'flatten'.
* [skip ci] This is how it can work.
* flatten works for ListOffsetArray, ListArray, and RegularArray for all positive 'axis' values.
* Removed commented-out code for ListArray flatten.
* [skip ci] Save work.
* [skip ci] compiles...
* More IndexedArray cases are working.
* [skip ci] Cleaned up code.
* IndexedArray::flatten works at all depths (ISOPTION and not ISOPTION).
* [skip ci] Save notes.
* RecordArray::flatten is done.
* [skip ci] Save work.
* [skip ci] Save work.
* UnionArray::flatten is done.
* Fixed flattening of sliced ListArrays.
* Fixed flattening of sliced IndexedArrays.
* This PR seems to be done.
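As a rough illustration of 'num' and of a 'flatten' that also returns offsets, the sketch below works on an offsets-plus-content layout like the one ListOffsetArray uses. It is a simplified model, assuming one level of nesting; `num` and `flatten_with_offsets` are names chosen for this example and are not the C++ methods. The sliced case, where offsets do not start at zero, is the one the "sliced ListArrays" fix is concerned with.

```python
def num(offsets):
    """Length of each inner list, given list boundaries."""
    return [offsets[i + 1] - offsets[i] for i in range(len(offsets) - 1)]


def flatten_with_offsets(offsets, content):
    """Drop one level of nesting and report where each list ended up.

    Only the part of content that the offsets actually cover is kept,
    and the returned offsets are shifted to start at zero.
    """
    start, stop = offsets[0], offsets[-1]
    flat = content[start:stop]
    new_offsets = [o - start for o in offsets]
    return new_offsets, flat


content = list(range(10))
offsets = [2, 4, 4, 7]     # a sliced view: lists [2, 3], [], [4, 5, 6]

counts = num(offsets)
new_offsets, flat = flatten_with_offsets(offsets, content)
at_most_two = [n <= 2 for n in counts]   # in the spirit of 'ak.num(muons) <= 2'
```

Returning the offsets alongside the flattened content lets a caller rebuild or regroup the lists later, which a bare flattened array would not allow.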
Merge all the rpad work (scikit-hep#114) into new environment. (scikit-hep#132)
* [WIP] Merge all the rpad work (scikit-hep#114) into new environment.
* Stubs for moving Yana's work into the new PR.
* [skip ci] new test
* localbuild.py now has a '--no-dependencies' option.
* Remove glob(recursive) dependence in localbuild.py.
* Add tests.
* [skip ci] update
* [skip ci] numpy rpad done
* [skip ci] numpy rpad
* [skip ci] fix test
* [skip ci]
* [skip ci] test modified
* [skip ci] regular array done
* [skip ci] test regular array
* [skip ci] print array
* [skip ci] fix
* [skip ci] fix test
* [skip ci] clip fix
* [skip ci] fix
* [skip ci] fix operations in axis 1
* rpad/rpad_and_clip signature now includes 'depth'.
* [skip ci] update
* [skip ci] update regular array
* [skip ci] fix
* [skip ci] one more fix
* [skip ci] one more test
* [skip ci] regular array
* [skip ci] remove an empty file
* [skip ci] regular array final update from yesterday
* The rest of the classes migrated. The old tests pass. Now I need to add the type checks and make sure they are correct in the tests. After that, I'll clean the tests.
* add runtime error exception for negative axis
* Add type check in tests
* rawarray done
* add rpad study
* Final cleanup
* Removing tests that currently raise RuntimeError: these will have to be re-thought, not patched-up.
* [skip ci] Added some tests that fail to indicate what should be corrected.
* [skip ci] address Jim's comments
* [skip ci] make a rpad_axis0
* tests pass, tidy up operations
* merged to master
* add more tests
* High-level Python interface to 'rpad'.
Co-authored-by: Ianna Osborne <[email protected]>
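The difference between 'rpad' and 'rpad_and_clip' can be shown on nested Python lists. This sketch assumes the usual meaning of the two names (pad on the right with None up to a target length, with or without truncating longer lists) and handles only one level of nesting; it is not the code from the PR, and the 'depth' and 'axis' handling mentioned in the commits is left out.

```python
def rpad(lists, target):
    """Right-pad each inner list with None to at least `target` items."""
    # [None] * n is empty when n <= 0, so longer lists pass through unchanged.
    return [inner + [None] * (target - len(inner)) for inner in lists]


def rpad_and_clip(lists, target):
    """Right-pad with None, then truncate, so every list has exactly `target` items."""
    return [(inner + [None] * (target - len(inner)))[:target] for inner in lists]


data = [[1, 2, 3], [], [4]]

padded = rpad(data, 2)
clipped = rpad_and_clip(data, 2)
```

Only the clipped form guarantees equally long inner lists, which is what makes the result representable as a regular (rectangular) array.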
RecordArray should use its length_ parameter, regardless of contents_.size() (scikit-hep#147)
* RecordArray has too many constructors; start by removing two that were never used.
* RecordArray has the same constructor for content.empty() and not-content.empty() and all tests pass.
Make __typestr__ a behavior, not a data property. (scikit-hep#140)
* [WIP] Make __typestr__ a behavior, not a data property.
* Removed 'astype' and everything that depended on it/used it.
* Still propagating 'typestrs' to all types; compiles and passes tests at this save-point.
* [skip ci] Work in progress: adding typestr to all Type constructors.
* Propagated 'typestrs' to all types. All Type objects take a 'typestr' as a parameter.
* [skip ci] Connected through to produce type strings, but tests fail now.
* [skip ci] More tests are working.
* Done. Start VERSION_INFO numbering at 0.2.0 to break the connection between release number and PR number.
* Fixed Python 2.7.
Fix Windows wheel and add auditwheel. (scikit-hep#142)
* [WIP] Fix Windows wheel and add auditwheel.
* Try to use system compiler, auditwheel show
* [skip ci] WOWW (Working on Windows Wheel) baseline triage: verify that the problem is still there.
* [skip ci] WOWW: remove the extraneous files before they get copied over.
* [skip ci] WOWW: 'pyd', not 'dll'.
* Quick fix for missing first item selection linux/macOS
* WOWW: the problem is fixed. Windows wheels are correct now.
* Bump to start Azure. This is probably done.
Co-authored-by: Henry Schreiner <[email protected]>
Strings in Numba (scikit-hep#144)
* [WIP] Strings in Numba
* Done.
* Fix 32-bit errors.
* Really fix 32-bit errors this time.
* Really, really fix 32-bit errors this time.