-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Speed type error testing #119678
ESQL: Speed type error testing #119678
Conversation
This shaves a few minutes off of the ESQL build: ``` 14m 50s -> 12m 38s ``` It does so by moving the type error testing from parameterized tests to a single, stand alone test per scalar that checks the errors for all unsupported types. It gets the list from the parameterized tests the same way as we were doing. But it's *fast*. AND, this will let us test a huge number of combinations without nearly as much overhead as we had before. In the worse case, unary math functions, this doesn't save any time. Maybe .1 second per function. For binary math functions it saves a *little* time. About a second per function. But for non-math, multivalued functions: wow. IpPrefix is ternary and it's test goes from 56.8 seconds to 2.6 seconds! Here are a few examples. | name | before | after | before| after | | -----------------: | -----: | -----------: | ----: | ----: | | Sin | 2.6s | 2.5s | 400 | 291 | | ATan2 | 17.4s | 16.1s | 8270 | 5961 | | IpPrefix | 56.8s | 🎉 2.6s | 40650 | 191 | | Equals | 69.9s | 50.6s | 30130 | 28131 | | NotEquals | 67.1s | 46.8s | 30100 | 28101 | | GreaterThan | 63.7s | 57.8s | 29940 | 27791 | | GreaterThanOrEqual | 61.1s | 61.6s | 29940 | 27791 | | LessThan | 63.7s | 61.3s | 29940 | 27791 | | LessThanOrEqual | 61.1s | 59.8s | 29940 | 27791 | | Case | 115.3s | 🎉 45.1s | 63756 | 13236 | | DateDiff | 3.4s | 4.0s?| 507 | 271 | | DateExtract | 12.1s | 3.4s | 3406 | 156 | | DateFormat | 8.1s | 2.4s | 2849 | 100 | | DateParse | 10.6s | 2.8s | 2992 | 276 | | DateTrunc | 10.9s | 3.4s | 3320 | 790 | | ByteLength | 5.7s | 4.0s | 520 | 391 | | EndsWith | 13.7s | 7.2s | 3880 | 1411 | | Hash | 30.7s | 17.4s | 3980 | 1511 | | LTrim | 27.1s | 29.0s?| 2840 | 2711| | Locate | 85.3s | 🎉 10.3s | 44310 | 1461 | | Replace | 96.5s | 🎉 10.1s | 42010 | 1711 | | RTrim | 15.6s | 20.0s?| 2840 | 2711 | | Split | 6.6s | 4.0s | 3360 | 397 | | StartsWith | 5.5s | 🎉 0.7s | 2800 | 330 | | Substring | 115.2s | 🎉 2.7s | 85386 | 483 | | Trim | 17.4s | 17.8s | 2840 | 2710 | Gradle Enterprise is also not happy with the raw *number* of tests ESQL runs. So lowering the overall number is important. See the table above. This strategy is *super* effective for that. It takes us ``` 769459 -> 470429 ```
Pinging @elastic/es-analytical-engine (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
/** | ||
* Build the expected error message for an invalid type signature. | ||
*/ | ||
protected static String typeErrorMessage( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For those following along, this is copied out of AbstactFunctionTestCase
but modified. I copied it because I think that once we apply this pattern globally we'll delete the function I copied it from.
return ordinal + "argument of [" + source + "] must be [" + expectedTypeString + "], found value [] type [" + name + "]"; | ||
} | ||
|
||
protected static String errorMessageStringForBinaryOperators( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is moved from AbstractFunctionTestCase
and updated for the slightly different error messages this functions builds. the + source +
bit.
@@ -191,7 +191,6 @@ protected static List<TestCaseSupplier> anyNullIsNull( | |||
ExpectedType expectedType, | |||
ExpectedEvaluatorToString evaluatorToString | |||
) { | |||
typesRequired(testCaseSuppliers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's now impossible to build test cases without types so this test doesn't do anything.
@@ -366,10 +370,6 @@ private static List<DataType> append(List<DataType> orig, DataType extra) { | |||
return longer; | |||
} | |||
|
|||
protected static Stream<DataType> representable() { | |||
return DataType.types().stream().filter(DataType::isRepresentable); | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unused.
+ "]"; | ||
|
||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved.
This shaves a few minutes off of the ESQL build: ``` 14m 50s -> 12m 38s ``` It does so by moving the type error testing from parameterized tests to a single, stand alone test per scalar that checks the errors for all unsupported types. It gets the list from the parameterized tests the same way as we were doing. But it's *fast*. AND, this will let us test a huge number of combinations without nearly as much overhead as we had before. In the worse case, unary math functions, this doesn't save any time. Maybe .1 second per function. For binary math functions it saves a *little* time. About a second per function. But for non-math, multivalued functions: wow. IpPrefix is ternary and it's test goes from 56.8 seconds to 2.6 seconds! Here are a few examples. | name | before | after | before| after | | -----------------: | -----: | -----------: | ----: | ----: | | Sin | 2.6s | 2.5s | 400 | 291 | | ATan2 | 17.4s | 16.1s | 8270 | 5961 | | IpPrefix | 56.8s | 🎉 2.6s | 40650 | 191 | | Equals | 69.9s | 50.6s | 30130 | 28131 | | NotEquals | 67.1s | 46.8s | 30100 | 28101 | | GreaterThan | 63.7s | 57.8s | 29940 | 27791 | | GreaterThanOrEqual | 61.1s | 61.6s | 29940 | 27791 | | LessThan | 63.7s | 61.3s | 29940 | 27791 | | LessThanOrEqual | 61.1s | 59.8s | 29940 | 27791 | | Case | 115.3s | 🎉 45.1s | 63756 | 13236 | | DateDiff | 3.4s | 4.0s?| 507 | 271 | | DateExtract | 12.1s | 3.4s | 3406 | 156 | | DateFormat | 8.1s | 2.4s | 2849 | 100 | | DateParse | 10.6s | 2.8s | 2992 | 276 | | DateTrunc | 10.9s | 3.4s | 3320 | 790 | | ByteLength | 5.7s | 4.0s | 520 | 391 | | EndsWith | 13.7s | 7.2s | 3880 | 1411 | | Hash | 30.7s | 17.4s | 3980 | 1511 | | LTrim | 27.1s | 29.0s?| 2840 | 2711| | Locate | 85.3s | 🎉 10.3s | 44310 | 1461 | | Replace | 96.5s | 🎉 10.1s | 42010 | 1711 | | RTrim | 15.6s | 20.0s?| 2840 | 2711 | | Split | 6.6s | 4.0s | 3360 | 397 | | StartsWith | 5.5s | 🎉 0.7s | 2800 | 330 | | Substring | 115.2s | 🎉 2.7s | 85386 | 483 | | Trim | 17.4s | 17.8s | 2840 | 2710 | Gradle Enterprise is also not happy with the raw *number* of tests ESQL runs. So lowering the overall number is important. See the table above. This strategy is *super* effective for that. It takes us ``` 769459 -> 470429 ```
💚 Backport successful
|
This applies that pattern from elastic#119678 to the `convert` tests. It doesn't buy us any speed, sadly. This is because conversions are *unary* and we rarely get much bonus from unary functions for this. Still, we do marginally reduce the number of test cases which is good for gradle: ``` 13862 -> 11948 (14%) ```
* ESQL: Speed type error testing (#119678) This shaves a few minutes off of the ESQL build: ``` 14m 50s -> 12m 38s ``` It does so by moving the type error testing from parameterized tests to a single, stand alone test per scalar that checks the errors for all unsupported types. It gets the list from the parameterized tests the same way as we were doing. But it's *fast*. AND, this will let us test a huge number of combinations without nearly as much overhead as we had before. In the worse case, unary math functions, this doesn't save any time. Maybe .1 second per function. For binary math functions it saves a *little* time. About a second per function. But for non-math, multivalued functions: wow. IpPrefix is ternary and it's test goes from 56.8 seconds to 2.6 seconds! Here are a few examples. | name | before | after | before| after | | -----------------: | -----: | -----------: | ----: | ----: | | Sin | 2.6s | 2.5s | 400 | 291 | | ATan2 | 17.4s | 16.1s | 8270 | 5961 | | IpPrefix | 56.8s | 🎉 2.6s | 40650 | 191 | | Equals | 69.9s | 50.6s | 30130 | 28131 | | NotEquals | 67.1s | 46.8s | 30100 | 28101 | | GreaterThan | 63.7s | 57.8s | 29940 | 27791 | | GreaterThanOrEqual | 61.1s | 61.6s | 29940 | 27791 | | LessThan | 63.7s | 61.3s | 29940 | 27791 | | LessThanOrEqual | 61.1s | 59.8s | 29940 | 27791 | | Case | 115.3s | 🎉 45.1s | 63756 | 13236 | | DateDiff | 3.4s | 4.0s?| 507 | 271 | | DateExtract | 12.1s | 3.4s | 3406 | 156 | | DateFormat | 8.1s | 2.4s | 2849 | 100 | | DateParse | 10.6s | 2.8s | 2992 | 276 | | DateTrunc | 10.9s | 3.4s | 3320 | 790 | | ByteLength | 5.7s | 4.0s | 520 | 391 | | EndsWith | 13.7s | 7.2s | 3880 | 1411 | | Hash | 30.7s | 17.4s | 3980 | 1511 | | LTrim | 27.1s | 29.0s?| 2840 | 2711| | Locate | 85.3s | 🎉 10.3s | 44310 | 1461 | | Replace | 96.5s | 🎉 10.1s | 42010 | 1711 | | RTrim | 15.6s | 20.0s?| 2840 | 2711 | | Split | 6.6s | 4.0s | 3360 | 397 | | StartsWith | 5.5s | 🎉 0.7s | 2800 | 330 | | Substring | 115.2s | 🎉 2.7s | 85386 | 483 | | Trim | 17.4s | 17.8s | 2840 | 2710 | Gradle Enterprise is also not happy with the raw *number* of tests ESQL runs. So lowering the overall number is important. See the table above. This strategy is *super* effective for that. It takes us ``` 769459 -> 470429 ``` * Compile plz
This applies that pattern from #119678 to the `convert` tests. It doesn't buy us any speed, sadly. This is because conversions are *unary* and we rarely get much bonus from unary functions for this. Still, we do marginally reduce the number of test cases which is good for gradle: ``` 13862 -> 11948 (14%) ```
This applies that pattern from elastic#119678 to the `convert` tests. It doesn't buy us any speed, sadly. This is because conversions are *unary* and we rarely get much bonus from unary functions for this. Still, we do marginally reduce the number of test cases which is good for gradle: ``` 13862 -> 11948 (14%) ```
This applies that pattern from #119678 to the `convert` tests. It doesn't buy us any speed, sadly. This is because conversions are *unary* and we rarely get much bonus from unary functions for this. Still, we do marginally reduce the number of test cases which is good for gradle: ``` 13862 -> 11948 (14%) ```
This reduces the number of test cases in ESQL a little more ala elastic#119678. It migrates a few random tests and all of the multivalue functions: ``` 92775 -> 43760 3m45 -> 4m04 ``` This adds a few more error test cases that were missing to make sure it all lines up well. And it fixes a few error messages in a few functions. That's *likely* where the extra time goes.
This shaves a few minutes off of the ESQL build:
It does so by moving the type error testing from parameterized tests to a single, stand alone test per scalar that checks the errors for all unsupported types. It gets the list from the parameterized tests the same way as we were doing. But it's fast.
AND, this will let us test a huge number of combinations without nearly as much overhead as we had before.
In the worse case, unary math functions, this doesn't save any time. Maybe .1 second per function. For binary math functions it saves a little time. About a second per function.
But for non-math, multivalued functions: wow. IpPrefix is ternary and it's test goes from 56.8 seconds to 2.6 seconds! Here are a few examples.
Gradle Enterprise is also not happy with the raw number of tests ESQL runs. So lowering the overall number is important. See the table above. This strategy is super effective for that. It takes us