I can't use .shift() on columns that hold lists as values. #2207

Open
amorimds opened this issue Oct 28, 2021 · 2 comments
Labels
bug Something isn't working

Comments


amorimds commented Oct 28, 2021

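import databricks.koalas as ks
kdf = ks.DataFrame({'A': [1, 1, 2, 2], 'B': [[1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2]]}, columns=['A', 'B'])
# Column 'B' holds a list in every row; this grouped shift is the call that does not work.
kdf.groupby('A')['B'].shift(1)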

amorimds (Author) commented

I was able to work around this problem with:

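import databricks.koalas as ks
kdf = ks.DataFrame({'A': [1, 1, 2, 2], 'B': [[1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2]]}, columns=['A', 'B'])
# Cast the list column to strings so that the grouped shift works on it.
kdf['B'] = kdf['B'].astype(str)
# Shift within each group, then parse each string back (eval runs arbitrary code, so use it on trusted data only).
kdf.groupby('A')['B'].shift(1).apply(lambda x: eval(str(x)))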
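
To make the round trip in this workaround easier to follow, here is a minimal plain-Python sketch of the same idea, without Koalas: serialize each list to a string, shift within each group, then parse the strings back. It is only an illustration under stated assumptions. `grouped_shift` is a hypothetical helper written for this sketch, not a Koalas or pandas API, and `ast.literal_eval` is swapped in for `eval` because it accepts only Python literals.

```python
import ast
from collections import defaultdict


def grouped_shift(keys, values, periods=1):
    """Shift values down by `periods` rows within each group.

    Hypothetical helper for illustration only. Rows that have no
    earlier value in their group get None.
    """
    if periods < 0:
        raise ValueError("this sketch only supports periods >= 0")
    seen = defaultdict(list)  # values seen so far, per group key
    shifted = []
    for key, value in zip(keys, values):
        history = seen[key]
        if periods == 0:
            shifted.append(value)
        elif len(history) >= periods:
            # Take the value `periods` rows back within the same group.
            shifted.append(history[-periods])
        else:
            shifted.append(None)
        history.append(value)
    return shifted


# Same data as in the report: column A is the group key, column B holds lists.
a = [1, 1, 2, 2]
b = [[1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 2, 2]]

# Step 1: serialize each list to its string form.
b_as_str = [str(v) for v in b]

# Step 2: shift the strings within each group.
shifted_str = grouped_shift(a, b_as_str, periods=1)

# Step 3: parse the strings back into lists, keeping missing values as None.
result = [None if s is None else ast.literal_eval(s) for s in shifted_str]
print(result)  # [None, [1, 1, 2, 2], None, [1, 1, 2, 2]]
```

The first row of each group has nothing to shift from, so it comes back as None, which is why the parse step has to handle missing values explicitly.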

itholic added the bug label Dec 9, 2021
itholic (Contributor) commented Dec 9, 2021

Thanks for the report, @amorimds.

The Koalas project is currently in maintenance mode only, so responses may be quite delayed.

Koalas is now developed more actively within PySpark under the name "pandas API on Spark" (you can reuse your existing Koalas code by changing the import to: import pyspark.pandas as ks).
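
For code that has to run in both old and new environments, the import swap could be wrapped as in the sketch below. This is only one possible approach and an assumption on my part, not something the Koalas or Spark projects prescribe; `load_pandas_api` is a hypothetical helper name.

```python
def load_pandas_api():
    """Return pandas API on Spark if available, else Koalas, else None.

    Hypothetical helper for illustration only.
    """
    try:
        import pyspark.pandas as ks  # shipped with PySpark 3.2 and later
        return ks
    except ImportError:
        pass
    try:
        import databricks.koalas as ks  # legacy Koalas package
        return ks
    except ImportError:
        return None


ks = load_pandas_api()
if ks is None:
    print("Neither pyspark.pandas nor databricks.koalas is installed")
else:
    print("Using", ks.__name__)
```

Binding the module to the name `ks` either way means the rest of the existing Koalas code can stay unchanged.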

So if you plan to keep using Koalas, I recommend switching to PySpark! (You will likely get a quicker response if you report the issue to the Apache Spark JIRA.)
