Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review the need of make_scalar_function for functions #14835

Open
jayzhan211 opened this issue Feb 23, 2025 · 7 comments
Open

Review the need of make_scalar_function for functions #14835

jayzhan211 opened this issue Feb 23, 2025 · 7 comments
Labels
enhancement New feature or request

Comments

@jayzhan211
Copy link
Contributor

Is your feature request related to a problem or challenge?

The current code converts scalars to arrays and then back after the function calculation. However, this conversion is unnecessary and can be optimized, especially in cases involving scalars. The conversion process duplicates values in the array, which doesn't add any value.

This approach was likely implemented when Scalar was not yet introduced in arrow-rs, but that’s no longer the case. With Scalar available, the conversion to arrays is redundant.

For example, the gcd function does not require an array and can instead operate directly on an i64 value #14834

Describe the solution you'd like

Revisit the functions that use make_scalar_function and identify those where it is no longer necessary. If there are functions that no longer require it, remove the usage entirely to streamline the code.

Describe alternatives you've considered

No response

Additional context

No response

@jayzhan211 jayzhan211 added the enhancement New feature or request label Feb 23, 2025
@alamb
Copy link
Contributor

alamb commented Feb 23, 2025

Revisit the functions that use make_scalar_function and identify those where it is no longer necessary. If there are functions that no longer require it, remove the usage entirely to streamline the code.

  • I think trying out @findepi 's Simple Functions  #12635 would be an excellent candidate for these functions. The ones that use make_scalar_function are all fairly simple and would benefit from having a special case for constant.

Ensuring #12635 can handle the basic math functions would be a good validation of the API in general

@jayzhan211
Copy link
Contributor Author

I think trying out @findepi 's Simple Functions #12635 would be an excellent candidate for these functions. The ones that use make_scalar_function are all fairly simple and would benefit from having a special case for constant.

I agree, but it is still far from ready.

@jayzhan211
Copy link
Contributor Author

it seems this issue is accidentally added in blog sub-issue

@alamb
Copy link
Contributor

alamb commented Feb 24, 2025

it seems this issue is accidentally added in blog sub-issue

Fixed -- sorry about that -- I am not sure how I did that but will be more careful

@findepi
Copy link
Member

findepi commented Feb 24, 2025

I think trying out @findepi 's Simple Functions #12635 would be an excellent candidate for these functions. The ones that use make_scalar_function are all fairly simple and would benefit from having a special case for constant.

I agree, but it is still far from ready.

i am not questioning it (i know very well how much more I'd like to do there), but I am curious about outsider's perspective.
in your eyes, what makes you say "far from ready", @jayzhan211 ?

@jayzhan211
Copy link
Contributor Author

To replace the existing function without needing to maintain it, we must ensure that all its current functionality is fully supported. Among everything we have so far, extensibility seems to be the biggest challenge. The more extensible we make the function, the less simple it becomes. Since I haven't seen a viable solution for this in #14668, I believe it is still far from being ready.

@alamb
Copy link
Contributor

alamb commented Feb 24, 2025

I think trying out @findepi 's Simple Functions #12635 would be an excellent candidate for these functions. The ones that use make_scalar_function are all fairly simple and would benefit from having a special case for constant.

I agree, but it is still far from ready.

i am not questioning it (i know very well how much more I'd like to do there), but I am curious about outsider's perspective. in your eyes, what makes you say "far from ready", @jayzhan211 ?

FWIW in my opinion, getting to the level of replacing make_scalar_function (and improving the resulting implementations) seems like a good first step towards a more general solution for different types of functions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants