Question: should NumericSort skip leading zeroes when comparing numbers? #1413

Closed
goran-w opened this issue Jun 11, 2025 · 6 comments

@goran-w
Contributor

goran-w commented Jun 11, 2025

Enhancement #597 added "numerical" sorting, where digit sub-string parts in sorted items are treated separately (from non-digit sub-string parts) in order to sort 2 before 10 (for example, ascending).

However, leading zeroes are included when doing a "trivial" number-length pre-check in the NumericSort code, which means 2 is still sorted before 01 (for example, ascending).

If desired, it would be (relatively) easy to correct this, by skipping any leading zeroes from the two numerical sub-strings being compared, before doing the actual number-comparison...
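The proposed change can be sketched roughly as follows. This is only an illustration in Python, not the project's actual NumericSort code; the function name `compare_digit_runs` and the `skip_leading_zeroes` flag are hypothetical, and the function assumes both arguments are non-empty runs of digits.

```python
def compare_digit_runs(a: str, b: str, skip_leading_zeroes: bool) -> int:
    """Compare two all-digit substrings; return -1, 0, or 1."""
    if skip_leading_zeroes:
        # Drop leading zeroes, but keep one digit so "000" becomes "0".
        a = a.lstrip("0") or "0"
        b = b.lstrip("0") or "0"
    # Length pre-check: the shorter digit run sorts first.
    # This matches numeric order only when neither run has leading zeroes.
    if len(a) != len(b):
        return -1 if len(a) < len(b) else 1
    # Same length: character-wise comparison agrees with numeric order.
    return (a > b) - (a < b)


# Without skipping, "2" sorts before "01" because it is shorter.
print(compare_digit_runs("2", "01", skip_leading_zeroes=False))
# With skipping, "01" is compared as "1" and sorts before "2".
print(compare_digit_runs("2", "01", skip_leading_zeroes=True))
```

With skipping enabled, runs such as 7 and 007 compare as equal, so a real implementation would still need some tie-break rule for them.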

@love-linger
Collaborator

love-linger commented Jun 11, 2025

Do you mean that you want to sort names as follows?

Before

001
0001
00001
002
0002
0003
003
00002
0004

After

001
0001
00001
002
0002
00002
003
0003
0004

or

001
002
003
0001
0002
0003
0004
00001
00002

@goran-w
Contributor Author

goran-w commented Jun 11, 2025

It depends on what we're aiming for. But the current implementation is not a "truly" numeric sort, since it assumes that leading zeroes should "non-numerically" contribute to the sorting order (since substring length currently takes precedence over actual numeric value). Below is a (hopefully) clarifying example...

NOTE: I've inserted extra line breaks where the numeric sequence-ordering breaks!

The following list is sorted alphabetically (comparing individual digit characters, not the numbers they form):

0001
0002

001
0010

002
0020

01
010
0100

02
020
0200

1
10
100
1000

2
20
200
2000

This is what it should look like in a "truly" numeric order (i.e. after converting each numeric substring to an integer and then sorting by those integers):

1
01
001
0001
2
02
002
0002
10
010
0010
20
020
0020
100
0100
200
0200
1000
2000
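The "truly" numeric order listed above can be reproduced with a small Python sketch. The helper name `numeric_key` is hypothetical, and the tie-break (fewer leading zeroes first among equal values) is an assumption inferred from the listing, not something stated elsewhere in the thread. It only handles names that consist entirely of digits.

```python
def numeric_key(name: str) -> tuple[int, int]:
    # Primary key: the integer value, so leading zeroes do not matter.
    # Secondary key (assumed): shorter spelling first among equal values.
    return (int(name), len(name))


names = [
    "0001", "0002", "001", "0010", "002", "0020",
    "01", "010", "0100", "02", "020", "0200",
    "1", "10", "100", "1000", "2", "20", "200", "2000",
]

for name in sorted(names, key=numeric_key):
    print(name)
```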

However, using the current implementation of NumericSort.Compare(), this is what the sorted list looks like instead:

1
2

01
02
10
20

001
002
010
020
100
200

0001
0002
0010
0020
0100
0200
1000
2000

Maybe this is what is intended, though, hence the question in the title of this issue...
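For comparison, the length-first ordering shown above can be mimicked by sorting on digit count before digit content. This is a simplified sketch of the behavior as described in this thread, not the real NumericSort.Compare() code; the helper name `length_first_key` is hypothetical, and it only handles names that consist entirely of digits.

```python
def length_first_key(name: str) -> tuple[int, str]:
    # Substring length takes precedence; leading zeroes count toward it.
    # Within the same length, character order equals numeric order.
    return (len(name), name)


names = [
    "0001", "0002", "001", "0010", "002", "0020",
    "01", "010", "0100", "02", "020", "0200",
    "1", "10", "100", "1000", "2", "20", "200", "2000",
]

for name in sorted(names, key=length_first_key):
    print(name)
```

Names with the same number of digit columns stay grouped together, which is why 2 ends up before 01 in this scheme.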

@love-linger
Collaborator

love-linger commented Jun 11, 2025 via email

@goran-w
Contributor Author

goran-w commented Jun 11, 2025

Thanks, I see - just wanted to make sure this was "by-design" and not a bug...

It will be identical to a "true" numeric sort-order, as long as we use either NO leading zeroes OR a consistent number of digit-columns when naming files/branches/tags.

@goran-w goran-w closed this as completed Jun 11, 2025
@love-linger
Collaborator

love-linger commented Jun 11, 2025 via email

@goran-w
Contributor Author

goran-w commented Jun 11, 2025

Moreover, I don't think it's right to split names with the same number of digits into different groups. For example, if there are xxx-001 and xxx-002 in your example, it is more reasonable to keep them next to each other in the same group.

For a "truly" numeric sort, these would be next to each other, EXCEPT that other representations of the very same numbers 1 and 2 (with a different number of leading zeroes) might be sorted in between them, depending on how otherwise identical numbers of different substring lengths are ordered...

NOTE: I'm fine with the current NumericSort method - just wanted to make sure I understood it (and its limits and purpose etc).
