Create an array-based version of ChunkedToXContentBuilder #119063

thecoop · 2024-12-19T11:29:24Z

Use fixed-size arrays, but keep the methods and semantics of ChunkedToXContentBuilder.

original-brownbear · 2024-12-19T13:27:43Z

I'm sorry @thecoop but I fail to see what this is trying to achieve.
Yes, this is probably somewhat faster than the previous version, but it is also obviously slower than the original version without the builder. Again, the chunked x-content logic is only applicable to a very narrow set of use cases, and any API that makes it look nicer (somewhat subjective to begin with, but I get where this is coming from) by adding more push-style logic before the pull-style iteration will slow things down.
Is it really worth all this noise and cost to iterate on a builder implementation when the previous implementation will always be faster? This is transport-thread logic and about as hot as anything ever gets in ES, so speed is pretty much all that matters. If anything, I'd rather look into writing custom code for the search response and bulk response instead of chaining iterators, since both of these are considerable CPU consumers.
Sorry, but looking at benchmarks and profiling in general, we want faster here, and any abstraction layer means slower (and slower means less stable because we're on transport threads!).
=> In light of the cost of all of this, is there any reason not to just revert to the old version and be done with this? :)
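
For readers unfamiliar with the push-style versus pull-style distinction raised above, here is a minimal sketch. `PushPullSketch`, `pull`, `Builder`, and `drain` are hypothetical names and the chunks are plain strings; none of this is the Elasticsearch API. It only shows that a push-style builder accumulates chunks into a growable list before anything is iterated, while the pull-style version hands back an iterator over a fixed set of chunks directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class PushPullSketch {

    /** Pull-style: hand back an iterator over a fixed set of chunks; the caller drives it. */
    static Iterator<String> pull(String name, String value) {
        return Arrays.asList("{", name + "=" + value, "}").iterator();
    }

    /** Push-style: callers push chunks into a growable list before any iteration starts. */
    static final class Builder {
        private final List<String> chunks = new ArrayList<>();

        Builder startObject() { chunks.add("{"); return this; }

        Builder field(String name, String value) { chunks.add(name + "=" + value); return this; }

        Builder endObject() { chunks.add("}"); return this; }

        Iterator<String> build() { return chunks.iterator(); }
    }

    /** Drains an iterator into one string so the two styles can be compared. */
    static String drain(Iterator<String> it) {
        StringBuilder out = new StringBuilder();
        while (it.hasNext()) {
            out.append(it.next());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String pulled = drain(pull("took", "3"));
        String pushed = drain(new Builder().startObject().field("took", "3").endObject().build());
        System.out.println(pulled + " " + pushed);
    }
}
```

Both paths produce the same chunk sequence; the difference is the extra list and the extra layer of calls in front of the iteration, and the two paths also return different concrete iterator classes.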

thecoop · 2024-12-19T13:35:02Z

Can you point to how this will be slower? The objects returned are the same types returned by the previous ChunkedToXContentHelper methods.

original-brownbear · 2024-12-19T13:46:24Z

This one in isolation isn't much slower, but even this still has the negative effect of duplicating existing implementations in ChunkedToXContentHelper (and reintroducing a method previously deleted from it), which is just needless overhead as well (a larger table to look up from for the virtual calls).
My point here really was just: we had something that worked reasonably fast, so why not go back to that, be happy with it, and optimize further rather than go through all this noise? :) Also, note that we started from issues with slow search response serialization; this doesn't remove the slow builder or address search responses in any way?
A fix for #118647 should deal with the slow builder, not just (re-)add duplicate code for one specific path?

thecoop · 2024-12-19T13:55:10Z

This is a WIP - the code here is an example of the changes I would like to make. I am continuing to refactor the old builder to the new one now.

I want to make this change because it makes the code easier to understand and less error-prone, especially given how opaque some of the exceptions are when you mismatch start and end blocks. The methods here are designed to minimise the chance of that by doing the start and end for you, in the same style as ChunkedToXContentBuilder (as I encountered when making some changes in elasticsearch-internal).
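
As a rough illustration of the "do the start and end for you" idea, here is a self-contained sketch. `ChunkSketch`, `Chunk`, `ArrayIterator`, `object`, `field`, and `render` are hypothetical stand-ins, not the real Elasticsearch types, and the output is a token list rather than real x-content. The only point carried over from the description above is that the helper emits the opening and closing chunks itself and returns an iterator over a fixed-size array.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class ChunkSketch {

    /** Hypothetical stand-in for a chunk that writes part of a response. */
    interface Chunk {
        void writeTo(StringBuilder out);
    }

    /** Iterator over a fixed-size array: one concrete type for every caller. */
    static final class ArrayIterator implements Iterator<Chunk> {
        private final Chunk[] chunks;
        private int index;

        ArrayIterator(Chunk[] chunks) { this.chunks = chunks; }

        @Override public boolean hasNext() { return index < chunks.length; }

        @Override public Chunk next() {
            if (index >= chunks.length) throw new NoSuchElementException();
            return chunks[index++];
        }
    }

    /** Wraps the body in a start chunk and an end chunk so the pair cannot be mismatched. */
    static Iterator<Chunk> object(String name, Chunk... body) {
        Chunk[] all = new Chunk[body.length + 2];
        all[0] = out -> out.append("START_OBJECT(").append(name).append(')');
        System.arraycopy(body, 0, all, 1, body.length);
        all[all.length - 1] = out -> out.append("END_OBJECT");
        return new ArrayIterator(all);
    }

    static Chunk field(String name, int value) {
        return out -> out.append("FIELD(").append(name).append('=').append(value).append(')');
    }

    /** Pulls every chunk and joins the tokens with single spaces. */
    static String render(Iterator<Chunk> chunks) {
        StringBuilder out = new StringBuilder();
        while (chunks.hasNext()) {
            if (out.length() > 0) out.append(' ');
            chunks.next().writeTo(out);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(object("shards", field("total", 5), field("failed", 0))));
    }
}
```

Because the caller never writes a start or end chunk by hand, there is no way to forget or misorder one in this sketch; whether the real helper methods give the same guarantee depends on the code in this PR.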

I also made these changes because there was little documentation or commentary specifying the performance-critical nature of this code. May I suggest adding assertions for specific types, and extensive comments explaining what is going on, if these are necessary to ensure certain calls do not turn megamorphic?
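
One possible shape for such an assertion, as a sketch only: `HotPathSketch`, `ArrayIter`, `isExpectedType`, and `checkedTopLevel` are hypothetical names, and the assumption that a single concrete iterator class should reach the hot call site is mine, not something confirmed for the real code. The assertion does not itself keep a call site monomorphic; it only makes tests fail loudly when an unexpected iterator type shows up.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class HotPathSketch {

    /** Hypothetical array-backed iterator that the hot call site is expected to see. */
    static final class ArrayIter<T> implements Iterator<T> {
        private final T[] items;
        private int index;

        ArrayIter(T[] items) { this.items = items; }

        @Override public boolean hasNext() { return index < items.length; }

        @Override public T next() {
            if (index >= items.length) throw new NoSuchElementException();
            return items[index++];
        }
    }

    @SafeVarargs
    static <T> Iterator<T> of(T... items) {
        return new ArrayIter<>(items);
    }

    /** True only for the exact class, not for subclasses or other Iterator implementations. */
    static boolean isExpectedType(Iterator<?> it) {
        return it.getClass() == ArrayIter.class;
    }

    /**
     * PERFORMANCE-CRITICAL (assumed for this sketch): callers downstream are expected to see
     * only ArrayIter, so that the JIT has a chance to inline hasNext()/next(). The check runs
     * only when assertions are enabled (-ea), so it adds no cost in production.
     */
    static <T> Iterator<T> checkedTopLevel(Iterator<T> it) {
        assert isExpectedType(it) : "unexpected iterator type " + it.getClass().getName();
        return it;
    }
}
```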

original-brownbear · 2024-12-26T16:27:04Z

@thecoop I finally got around to profiling this stuff in some detail.
Now I understand why even this PR doesn't get us back to "normal" for search performance. In fact, even though you'd think it'd return to the old numbers, it doesn't.
The reason is that prior to introducing the builder we had a situation where we pretty much always had the array-based iterator at the top level of the chunking, and it inlined. We already got regressions from moving bulk to the builder in early October (regressions in search, though!) because that polluted the profile. The regression was just relatively small and we missed it because of the ongoing Lucene 10 work :(

=> There's no point in messing with this stuff on a per-response basis. I'd suggest we revert the builder stuff ASAP for the time being. Reverting to the old code brings back better-than-before performance numbers in my testing, because part of the Lucene 10 improvements became invisible with the builder introduction on the bulk response.

thecoop added the WIP, :Core/Infra/Core, and >refactoring labels Dec 19, 2024

thecoop requested a review from original-brownbear December 19, 2024 11:29

thecoop requested a review from a team as a code owner December 19, 2024 11:29

elasticsearchmachine added the v9.0.0 label Dec 19, 2024

thecoop mentioned this pull request Dec 19, 2024

Don't expand ChunkedXContent objects until we actually iterate #118649

Closed

Create an array-based version of ChunkedToXContentBuilder

919f514

thecoop force-pushed the array-xcontent-chunks branch from 11e0d0e to 919f514 Compare December 19, 2024 11:41

thecoop added 4 commits December 19, 2024 14:05

Convert a few more objects

974f709

Scrap ifThen

473b823

More conversions

fe7d420

Convert more chunked objects

67b578c

thecoop closed this Dec 30, 2024
