Consistent Naming Standard When Using Dict in GroupBy .agg #21806
git diff upstream/master -u -- "*.py" | flake8 --diff
This is a WIP: all of the tests in the groupby directory pass, but some tests outside of it still need touch-ups. This is a relatively aggressive change, so I figured I'd get buy-in before going outside of that directory to modify tests.
This not only addresses the referenced issue but provides consistent expectations around columns returned when using a `dict` argument in a `DataFrameGroupBy`, namely that the returned object will have a `MultiIndex` where the top level is the column name and the subsequent level is the aggregation performed.

To play devil's advocate, this may be considered undesirable as it now returns a `MultiIndex` in some cases where that was flat before. However, I'd counter that this:
Curious to hear others' feedback @jreback @TomAugspurger @jorisvandenbossche
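For reviewers who want to see the two column shapes side by side, here is a minimal sketch. The frame and the names `key` and `val` are made up for illustration, and the `proposed` frame is built by hand to mimic the layout described above rather than produced by this branch. On released pandas versions, as far as I know, a dict with a scalar aggregation gives flat columns, while a dict with a list gives a `MultiIndex`.

```python
import pandas as pd

# Hypothetical data, only to illustrate the column layouts.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})
grouped = df.groupby("key")

# Dict with a scalar aggregation: flat columns on released pandas.
flat = grouped.agg({"val": "sum"})

# Dict with a list of aggregations: MultiIndex columns (column, aggregation).
nested = grouped.agg({"val": ["sum"]})

# Hand-built approximation of the layout described in this PR:
# the scalar form relabelled with (column name, aggregation) tuples.
proposed = flat.copy()
proposed.columns = pd.MultiIndex.from_tuples([("val", "sum")])

print(list(flat.columns))
print(list(nested.columns))
print(list(proposed.columns))
```

Under the behaviour described in this PR, the scalar form would come back shaped like `nested`, so callers would select with `result[("val", "sum")]` in both cases instead of branching on the input type.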