Improve df.corr: remove numpy; add method='spearman'; add labels; make consistent with pl.corr #20957

Open
Julian-J-S opened this issue Jan 28, 2025 · 2 comments · May be fixed by #20967
Labels
enhancement New feature or an improvement of an existing feature

Comments

@Julian-J-S
Contributor

Description

Correlation analysis is a very frequently used data-science method.

The current df.corr implementation could be improved along the lines suggested in a few existing issues:

Make consistent with pl.corr: Remove numpy dependency & add method=spearman

  • df.corr (link)
    • based on NumPy (corrcoef), according to the documentation
    • supports only method pearson
  • pl.corr (link)
    • native Rust
    • supports both pearson and spearman

Add parameter to add row labels

  • when the correlation matrix is large, it would be very handy to optionally add a row-label column that identifies which column each row of the correlation matrix corresponds to

Julian-J-S added the enhancement label on Jan 28, 2025
@kevuoft

kevuoft commented Feb 1, 2025

Can we also have kendall correlation to match Pandas.DataFrame.corr()?

@fraruscio

> Can we also have kendall correlation to match Pandas.DataFrame.corr()?

I also think kendall would be more useful than spearman for many practical cases, and it would be really helpful to have a native implementation for large dataframes.
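
For reference, here is a minimal pure-Python sketch of what Kendall's tau measures (the tau-b variant, which adjusts for ties). The function name kendall_tau_b is hypothetical, and this naive version compares every pair of observations, so it is quadratic in the number of rows; that cost is why a native implementation would matter for large dataframes. It is not a proposal for how Polars should implement it.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Naive O(n^2) Kendall tau-b; illustrative only."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted in neither tie total
        elif dx == 0:
            ties_x += 1  # tied only in x
        elif dy == 0:
            ties_y += 1  # tied only in y
        elif dx * dy > 0:
            concordant += 1  # pair ordered the same way in x and y
        else:
            discordant += 1  # pair ordered oppositely in x and y

    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    # If one variable is constant, the statistic is undefined.
    return (concordant - discordant) / denom if denom else float("nan")


print(kendall_tau_b([1, 2, 3, 4, 5], [5, 3, 4, 1, 2]))
```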
