Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Text.find slows down with the number of invocations #11859

Closed
4e6 opened this issue Dec 13, 2024 · 11 comments · Fixed by #11945
Closed

Text.find slows down with the number of invocations #11859

4e6 opened this issue Dec 13, 2024 · 11 comments · Fixed by #11945
Assignees
Labels
-language-server -libs Libraries: New libraries to be implemented

Comments

@4e6
Copy link
Contributor

4e6 commented Dec 13, 2024

I encountered the issue when solving Advent Of Code day 13.

I created a project reproducing the issue Text_Find_Bench.zip It contains a bench.txt file with a sequence of lines to parse:

Button A: X+46, Y+89
Button B: X+99, Y+32
Prize: X=5826, Y=7443

Button ...

Then the function parse_input parses those lines, printing how many lines is left to parse:

parse_input lines =
    go input acc =
        if input.is_empty then acc else
            p = input.take 4
            m = Main.parse_machine p
            IO.println ('lines remain: ' + input.length.to_text)
            @Tail_Call go (input.drop 4) acc+1
    go lines 0

When running the program, you can see how the lines remain output slows down with the number of lines parsed:

Text_Find_Bench.mp4
@4e6 4e6 added the -libs Libraries: New libraries to be implemented label Dec 13, 2024
@JaroslavTulach JaroslavTulach self-assigned this Dec 17, 2024
@JaroslavTulach JaroslavTulach moved this from ❓New to 📤 Backlog in Issues Board Dec 17, 2024
@JaroslavTulach
Copy link
Member

JaroslavTulach commented Dec 17, 2024

@4e6, I am converting your code to a benchmark like this:

diff --git test/Benchmarks/src/Text/Contains.enso test/Benchmarks/src/Text/Contains.enso
index 75b7694c31..dd682d668a 100644
--- test/Benchmarks/src/Text/Contains.enso
+++ test/Benchmarks/src/Text/Contains.enso
@@ -1,4 +1,5 @@
 from Standard.Base import all
+from Standard.Base.Runtime import assert
 
 from Standard.Test import Bench, Faker
 
@@ -33,7 +34,7 @@ create_big_random character_template faker =
     Vector.new 200 _-> faker.text_value big_template
 
 
-collect_benches = Bench.build builder->
+collect_benches aoc_count=2 = Bench.build builder->
     bench_contains group_name character_template =
 
         data = Data.create character_template Faker.new
@@ -47,5 +48,66 @@ collect_benches = Bench.build builder->
 
     bench_contains "Text_Contains" (Faker.upper_case_letters + Faker.lower_case_letters + 'ąę\u{301}\u{302}\u{303}\u{321}'.char_vector)
 
-
-main = collect_benches . run_main
+    aoc_options = Bench.options . set_warmup (Bench.phase_conf 10 3) . set_measure (Bench.phase_conf aoc_count 3)
+
+    builder.group "AoC_2024_Day_13" aoc_options group_builder->
+        parse_button node1 =
+            any2 = node1.find 'Button \\w: X[+](\\d+), Y[+](\\d+)'
+            vector2 = any2.groups
+            node2 = vector2.at 1
+            node3 = vector2.at 2
+            vector3 = [node2, node3]
+            vector3
+
+        parse_prize node3 =
+            any3 = node3.find 'Prize: X=(\\d+), Y=(\\d+)'
+            vector2 = any3.groups
+            node4 = vector2.at 1
+            node5 = vector2.at 2
+            vector4 = [node4, node5]
+            vector4
+
+        parse_machine any1 =
+            node1 = any1.at 0
+            node2 = any1.at 1
+            any2 = parse_button node2
+            node3 = any1.at 2
+            vector4 = parse_prize node3
+            vector3 = parse_button node1
+            vector1 = [vector3, any2, vector4]
+            vector2 = vector1.flatten
+            node4 = vector2.map .parse_integer
+            node4
+
+        parse_input (lines:Vector Text) -> Vector =
+            go input acc:Vector =
+                if input.is_empty then acc else
+                    p = input.take 4
+                    m = parse_machine p
+                    @Tail_Call go (input.drop 4) acc+[m]
+            go lines []
+
+        group_builder.specify "find" <|
+            lines = """
+                Button A: X+94, Y+34
+                Button B: X+22, Y+67
+                Prize: X=8400, Y=5400
+
+                Button A: X+26, Y+66
+                Button B: X+67, Y+21
+                Prize: X=12748, Y=12176
+
+                Button A: X+17, Y+86
+                Button B: X+84, Y+37
+                Prize: X=7870, Y=6450
+
+                Button A: X+69, Y+23
+                Button B: X+27, Y+71
+                Prize: X=18641, Y=10279
+
+            res = parse_input (lines.split '\n')
+
+            assert res.length==4 "Expecting four "+res.to_text
+
+main filter=Nothing aoc_count=2 =
+    collect_benches aoc_count . run_main filter

the benchmark is configurable - one can specify the amount of iterations to perform. The following is result of ten iterations:

sbt:enso> runEngineDistribution --run test/Benchmarks/src/Text/Contains.enso .*find.* 10
Measurement avg time:    0.292 ms (+-0.178)

the following is result of one hundred iterations:

sbt:enso> runEngineDistribution --run test/Benchmarks/src/Text/Contains.enso .*find.* 100
Measurement avg time:    0.257 ms (+-0.166)

e.g. I see no slowdown with additional iterations. Btw. this is the output from VisualVM polyglot profiler:

Polyglot sampler

I am not sure what else to investigate. Passing back.

@JaroslavTulach JaroslavTulach assigned 4e6 and unassigned JaroslavTulach Dec 17, 2024
@4e6
Copy link
Contributor Author

4e6 commented Dec 17, 2024

I am converting your code to a benchmark like this
I see no slowdown with additional iterations.

@JaroslavTulach well, have you tried to run the original program? I can still reproduce the issue there

@JaroslavTulach
Copy link
Member

I'd say there is too many nested ArraySlice elements:

obrazek

@JaroslavTulach
Copy link
Member

There is a special code that prevents wrapping ArraySlice in another ArraySlice, but this time we have slice over generic over slice and so on:
obrazek

@JaroslavTulach
Copy link
Member

This change:

diff --git engine/runtime/src/main/java/org/enso/interpreter/runtime/data/vector/ArraySlice.java engine/runtime/src/main/java/org/enso/interpreter/runtime/data/vector/ArraySlice.java
index 67b47a94e3..d305502780 100644
--- engine/runtime/src/main/java/org/enso/interpreter/runtime/data/vector/ArraySlice.java
+++ engine/runtime/src/main/java/org/enso/interpreter/runtime/data/vector/ArraySlice.java
@@ -30,7 +30,7 @@ final class ArraySlice extends EnsoObject {
   private final long end;
 
   private ArraySlice(Object storage, long start, long end) {
-    if (storage instanceof ArraySlice slice) {
+    if (findStorage(storage) instanceof ArraySlice slice) {
       this.storage = slice.storage;
       this.start = slice.start + start;
       this.end = Math.min(slice.end, slice.start + end);
@@ -49,6 +49,16 @@ final class ArraySlice extends EnsoObject {
     }
   }
 
+  private static Object findStorage(Object storage) {
+    for (; ; ) {
+      return switch (storage) {
+        case Vector.Generic gen -> findStorage(gen.toArray());
+        case ArraySlice slice -> slice;
+        default -> storage;
+      };
+    }
+  }
+
   static Vector createOrNull(Object storage, long start, long this_length, long end) {
     long slice_start = Math.max(0, start);
     long slice_end = Math.min(this_length, end);

speeds the sample project up.

@JaroslavTulach JaroslavTulach moved this from 📤 Backlog to 🔧 Implementation in Issues Board Dec 27, 2024
@enso-bot
Copy link

enso-bot bot commented Dec 28, 2024

Jaroslav Tulach reports a new STANDUP for yesterday (2024-12-27):

Progress: .

@JaroslavTulach JaroslavTulach moved this from 🔧 Implementation to 👁️ Code review in Issues Board Dec 28, 2024
@enso-bot
Copy link

enso-bot bot commented Dec 29, 2024

Jaroslav Tulach reports a new STANDUP for yesterday (2024-12-28):

Progress: .

GitHub
Pull Request Description This PR will address #11846 by introducing an internal MultiType replacing Type[], but guaranteeing quick (via ==) comparisons necessary to built various inline caches. Che...

@enso-bot
Copy link

enso-bot bot commented Dec 30, 2024

Jaroslav Tulach reports a new STANDUP for today (2024-12-30):

Progress: .

@enso-bot
Copy link

enso-bot bot commented Jan 4, 2025

Jaroslav Tulach reports a new STANDUP for yesterday (2025-01-03):

Progress: .

@mergify mergify bot closed this as completed in #11945 Jan 4, 2025
@mergify mergify bot closed this as completed in b759d75 Jan 4, 2025
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Jan 4, 2025
@enso-bot
Copy link

enso-bot bot commented Jan 5, 2025

Jaroslav Tulach reports a new STANDUP for yesterday (2025-01-04):

Progress: .

Discord
Discord is great for playing games and chilling with friends, or even building a worldwide community. Customize your own space to talk, play, and hang out.

@enso-bot
Copy link

enso-bot bot commented Jan 7, 2025

Jaroslav Tulach reports a new STANDUP for yesterday (2025-01-06):

Progress: .

Discord
Discord is great for playing games and chilling with friends, or even building a worldwide community. Customize your own space to talk, play, and hang out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
-language-server -libs Libraries: New libraries to be implemented
Projects
Status: 🟢 Accepted
Development

Successfully merging a pull request may close this issue.

2 participants