`char_indices` is slower than while loop over ascii `Chars`, but faster for unicode #141855

hkBst · 2025-06-01T13:20:00Z

I tried this code:

use std::ops::Range;

#[derive(Debug, PartialEq, Eq)]
pub enum EscapeError {
    /// Raw '\r' encountered.
    BareCarriageReturn,
    /// Raw '\r' encountered in raw string.
    BareCarriageReturnInRawString,
}

pub fn check_raw_str_while(
    src: &str,
    mut callback: impl FnMut(Range<usize>, Result<char, EscapeError>),
) {
    let mut chars = src.chars();

    while let Some(c) = chars.next() {
        let start = src.len() - chars.as_str().len() - c.len_utf8();
        let res = match c {
            '\r' => Err(EscapeError::BareCarriageReturn),
            _ => Ok(c),
        };
        let end = src.len() - chars.as_str().len();
        callback(start..end, res);
    }
}

pub fn check_raw_str_char_indices(
    src: &str,
    mut callback: impl FnMut(Range<usize>, Result<char, EscapeError>),
) {
    src.char_indices().for_each(|(pos, c)| {
        callback(
            pos..pos + c.len_utf8(),
            if c == '\r' {
                Err(EscapeError::BareCarriageReturn)
            } else {
                Ok(c)
            },
        );
    });
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn same() {
        let s = "abcdefghijklmnopqrstuvwxyz0123456789";
        let (mut r1, mut r2) = (vec![], vec![]);
        check_raw_str_while(&s, |r, c| r1.push((r, c)));
        check_raw_str_char_indices(&s, |r, c| r2.push((r, c)));
        assert_eq!(r1, r2);
    }
}

with these benches:

#![feature(test)]

extern crate test;

use bench_char_indices::*;

use std::iter::repeat_n;
const LEN: usize = 10_000;

macro_rules! fn_bench_check_raw {
    ($name:ident, $unit:ty, $check_raw:ident) => {
        fn $name(b: &mut test::Bencher, s: &str, expected: $unit) {
            let input: String = test::black_box(repeat_n(s, LEN).collect());
            assert_eq!(input.len(), LEN * s.len());
            b.iter(|| {
                let mut output = vec![];

                $check_raw(&input, |range, res| output.push((range, res)));
                assert_eq!(output.len(), LEN);
                assert_eq!(output[0], ((0..s.len()), Ok(expected)));
            });
        }
    };
}

fn_bench_check_raw!(bench_check_raw_str_while, char, check_raw_str_while);
fn_bench_check_raw!(
    bench_check_raw_str_char_indices,
    char,
    check_raw_str_char_indices
);

#[bench]
fn bench_check_raw_str_ascii_while(b: &mut test::Bencher) {
    bench_check_raw_str_while(b, "a", 'a');
}

#[bench]
fn bench_check_raw_str_ascii_char_indices(b: &mut test::Bencher) {
    bench_check_raw_str_char_indices(b, "a", 'a');
}

I expected to see this happen: NO performance difference

Instead, there is a more than 10% difference:

test bench_check_raw_str_ascii_char_indices ... bench:      27,733.96 ns/iter (+/- 103.24)
test bench_check_raw_str_ascii_while        ... bench:      24,525.01 ns/iter (+/- 582.28)

Tested on: rustc 1.88.0-nightly (2e6882a 2025-05-05)

The text was updated successfully, but these errors were encountered:

hkBst · 2025-06-01T13:20:35Z

@rustbot: label +I-slow

hkBst · 2025-06-01T13:22:54Z

Remarkably, if you remove the unused second option in the EscapeError enum, namely BareCarriageReturnInRawString, then the results become even more different:

test bench_check_raw_str_ascii_char_indices ... bench:      28,308.71 ns/iter (+/- 69.40)
test bench_check_raw_str_ascii_while        ... bench:      17,260.44 ns/iter (+/- 60.75)

solson · 2025-06-01T13:49:09Z

I was curious so I tried this on my machine and got opposite results:

test bench_check_raw_str_ascii_char_indices ... bench:      11,329.42 ns/iter (+/- 870.58)
test bench_check_raw_str_ascii_while        ... bench:      13,474.87 ns/iter (+/- 553.02)

(on rustc 1.89.0-nightly (4d08223c0 2025-05-31), slightly newer)

hkBst · 2025-06-01T13:52:01Z

@solson Interesting! I'm running on Linux x86_64. You?

In fact adding unicode benches I get the opposite as well:

#[bench]
fn bench_check_raw_str_unicode_while(b: &mut test::Bencher) {
    bench_check_raw_str_while(b, "🦀", '🦀');
}

#[bench]
fn bench_check_raw_str_unicode_char_indices(b: &mut test::Bencher) {
    bench_check_raw_str_char_indices(b, "🦀", '🦀');
}

test bench_check_raw_str_ascii_char_indices   ... bench:      27,720.16 ns/iter (+/- 128.21)
test bench_check_raw_str_ascii_while          ... bench:      23,548.37 ns/iter (+/- 898.86)
test bench_check_raw_str_unicode_char_indices ... bench:      35,898.00 ns/iter (+/- 509.55)
test bench_check_raw_str_unicode_while        ... bench:      39,119.22 ns/iter (+/- 134.90)

solson · 2025-06-01T13:55:31Z

Yep, Linux x86_64 here too.

test bench_check_raw_str_ascii_char_indices   ... bench:      11,355.42 ns/iter (+/- 107.96)
test bench_check_raw_str_ascii_while          ... bench:      13,397.35 ns/iter (+/- 87.63)
test bench_check_raw_str_unicode_char_indices ... bench:      24,234.65 ns/iter (+/- 311.14)
test bench_check_raw_str_unicode_while        ... bench:      24,189.68 ns/iter (+/- 214.49)

hkBst · 2025-06-01T14:07:25Z

After updating to the new nightly (1.89.0-nightly (4d08223 2025-05-31)), I indeed get different results:

test bench_check_raw_str_ascii_char_indices   ... bench:      27,876.82 ns/iter (+/- 60.59)
test bench_check_raw_str_ascii_while          ... bench:      29,549.55 ns/iter (+/- 1,212.18)
test bench_check_raw_str_unicode_char_indices ... bench:      36,247.43 ns/iter (+/- 559.77)
test bench_check_raw_str_unicode_while        ... bench:      41,801.26 ns/iter (+/- 736.20)

mostly because the while variant has become slower in this version???

hkBst · 2025-06-04T13:27:24Z

On rustc 1.89.0-nightly (59aa1e8 2025-06-03) results are back to "normal":

test bench_check_raw_str_ascii_char_indices   ... bench:      27,674.92 ns/iter (+/- 148.05)
test bench_check_raw_str_ascii_while          ... bench:      23,855.94 ns/iter (+/- 627.58)
test bench_check_raw_str_unicode_char_indices ... bench:      36,272.82 ns/iter (+/- 1,479.99)
test bench_check_raw_str_unicode_while        ... bench:      39,030.70 ns/iter (+/- 119.96)

hkBst · 2025-06-04T14:13:45Z

Now if I add strings which alternate between ascii and non-ascii, and change the benchmark code to accomodate that, like so:

#![feature(test)]

extern crate test;

use bench_char_indices::*;

use std::iter::repeat_n;
const LEN: usize = 10_000;

macro_rules! fn_bench_check_raw {
    ($name:ident, $unit:ty, $check_raw:ident) => {
        fn $name(b: &mut test::Bencher, s: &str, expected: Vec<$unit>) {
            let input: String = test::black_box(repeat_n(s, LEN).collect());
            assert_eq!(input.len(), LEN * s.len());
            b.iter(|| {
                let mut output = vec![];

                $check_raw(&input, |range, res| output.push((range, res)));
                assert_eq!(output.len(), LEN * s.chars().count());
                for ((i, &e), (p, c)) in expected.iter().enumerate().zip(s.char_indices()) {
                    assert_eq!(output[i], ((p..p + c.len_utf8()), Ok(e)));
                }
            });
        }
    };
}

fn_bench_check_raw!(bench_check_raw_str_while, char, check_raw_str_while);
fn_bench_check_raw!(
    bench_check_raw_str_char_indices,
    char,
    check_raw_str_char_indices
);

#[bench]
fn bench_check_raw_str_ascii_while(b: &mut test::Bencher) {
    bench_check_raw_str_while(b, "a", repeat_n('a', LEN).collect());
}

#[bench]
fn bench_check_raw_str_ascii_char_indices(b: &mut test::Bencher) {
    bench_check_raw_str_char_indices(b, "a", repeat_n('a', LEN).collect());
}

#[bench]
fn bench_check_raw_str_unicode_while(b: &mut test::Bencher) {
    bench_check_raw_str_while(b, "🦀", repeat_n('🦀', LEN).collect());
}

#[bench]
fn bench_check_raw_str_unicode_char_indices(b: &mut test::Bencher) {
    bench_check_raw_str_char_indices(b, "🦀", repeat_n('🦀', LEN).collect());
}

#[bench]
fn bench_check_raw_str_mixed_while(b: &mut test::Bencher) {
    bench_check_raw_str_while(
        b,
        "a🦀",
        (0..2 * LEN)
            .map(|i| if i % 2 == 0 { 'a' } else { '🦀' })
            .collect(),
    );
}

#[bench]
fn bench_check_raw_str_mixed_char_indices(b: &mut test::Bencher) {
    bench_check_raw_str_char_indices(
        b,
        "a🦀",
        (0..2 * LEN)
            .map(|i| if i % 2 == 0 { 'a' } else { '🦀' })
            .collect(),
    );
}

then I get these puzzling results:

test bench_check_raw_str_ascii_char_indices   ... bench:      22,114.24 ns/iter (+/- 1,117.53)
test bench_check_raw_str_ascii_while          ... bench:      26,663.63 ns/iter (+/- 118.35)
test bench_check_raw_str_mixed_char_indices   ... bench:      57,742.56 ns/iter (+/- 529.01)
test bench_check_raw_str_mixed_while          ... bench:      58,221.94 ns/iter (+/- 568.38)
test bench_check_raw_str_unicode_char_indices ... bench:      37,808.79 ns/iter (+/- 259.83)
test bench_check_raw_str_unicode_while        ... bench:      37,642.51 ns/iter (+/- 701.60)

Note how now bench_check_raw_str_ascii_char_indices is suddenly much faster than bench_check_raw_str_ascii_while, seemingly exchanging timings with it...

lolbinarycat · 2025-06-05T21:55:27Z

I'd be curious to compare the length of the generated assembly, as longer asm with less jumps tends to perform better in microbenchmarks but suffers in real application code because of icache congestion.

hkBst added the C-bug Category: This is a bug. label Jun 1, 2025

rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jun 1, 2025

This was referenced Jun 1, 2025

Remove unreachable rust-lang/literal-escaper#8

Open

A moderate rustc/gcc performance difference #141853

Open

hkBst changed the title ~~char_indices is slower than manual while loop~~ char_indices is slower than while loop over ascii Chars, but faster for unicode Jun 4, 2025

lolbinarycat added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jun 5, 2025

lolbinarycat added A-Unicode Area: Unicode C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such and removed C-bug Category: This is a bug. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Jun 5, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`char_indices` is slower than while loop over ascii `Chars`, but faster for unicode #141855

`char_indices` is slower than while loop over ascii `Chars`, but faster for unicode #141855

hkBst commented Jun 1, 2025

hkBst commented Jun 1, 2025

Uh oh!

hkBst commented Jun 1, 2025

Uh oh!

solson commented Jun 1, 2025

Uh oh!

hkBst commented Jun 1, 2025 •

edited

Loading

Uh oh!

solson commented Jun 1, 2025

Uh oh!

hkBst commented Jun 1, 2025 •

edited

Loading

Uh oh!

hkBst commented Jun 4, 2025

Uh oh!

hkBst commented Jun 4, 2025

Uh oh!

lolbinarycat commented Jun 5, 2025

Uh oh!

char_indices is slower than while loop over ascii Chars, but faster for unicode #141855

char_indices is slower than while loop over ascii Chars, but faster for unicode #141855

Comments

hkBst commented Jun 1, 2025

hkBst commented Jun 1, 2025

Uh oh!

hkBst commented Jun 1, 2025

Uh oh!

solson commented Jun 1, 2025

Uh oh!

hkBst commented Jun 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

solson commented Jun 1, 2025

Uh oh!

hkBst commented Jun 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hkBst commented Jun 4, 2025

Uh oh!

hkBst commented Jun 4, 2025

Uh oh!

lolbinarycat commented Jun 5, 2025

Uh oh!

`char_indices` is slower than while loop over ascii `Chars`, but faster for unicode #141855

`char_indices` is slower than while loop over ascii `Chars`, but faster for unicode #141855

hkBst commented Jun 1, 2025 •

edited

Loading

hkBst commented Jun 1, 2025 •

edited

Loading