-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance: detector body sorting #1329
Comments
The builtin v8 sort doesn't use quicksort any longer, and when it did, it used insertion sort on arrays of less than ~20 elements, which was optimal at the time. I'm pretty sure most implementations now use timsort.
Please share your benchmarking code you used to come to this conclusion.
It shouldn't be a question of opinion, just a matter of proper benchmarking. That said, the chances are that a custom sort implementation will outperform the native v8 is extremely low, I would imagine. |
My usecase is a topdown game where a body moves through space and other objects e.g. walk around "relatively" slowly. The temporal coherence in this case is quite high. In other words, the y-order of AABBs is relatively stable frame-to-frame vs. simulating e.g. gas particles bouncing around inside of a volume. I don't have an isolated benchmark available. A simulation with hundreds of relatively-stable y-ordered bounding boxes is clearly more efficiently sorted by y-order with e.g. insertion sort. I'm not talking about < 20 elements. I assure you that I measured it for my own simulation! It's trivial to outperform the built-in array sort. I agree that opinion doesn't really factor in. The only relevant opinion is whether the library itself should be optimized for more high-coherence use cases. Personally I suspect that this is, on average, the majority of use cases. I proposed making it easier to simply swap out the sort algorithm without having to write the entire P.S. I don't even sort the bodies in my current implementation. I have written a custom matter-js/src/collision/Detector.js Line 39 in f9208df
.sort operates in-place and care is taken to not mutate it directly (though I wonder whether that care is strictly necessary).
Correction: The broadphase is actually done on the X axis, not the Y axis. The points remain the same though! |
I looked into the sorting algorithm and you're right as far as I can tell: v8 uses an algorithm called timsort which is actually quite fast on mostly-sorted data. I think the actual reason it's possible to beat the built-in implementation easily isn't because of the sorting algorithm at all, it's more what I mentioned in my previous comment: every world update completely smashes the previously-sorted array by copying the bodies array. If that copy never took place and the bodies were simply sorted in-place continuously, the performance would probably increase greatly. |
The burden of proof is on you here, since the builtin v8 array sort is highly optmized by teams of uber-experts in algorithms, and written in low-level C++ (Torque?) code to be as fast as possible. As you point out, Timsort is optimized on mostly-sorted arrays, but even if it wasn't, heavily-optimized insertion sort has been used in quicksort implementations for the leaf recursive nodes since time immemorial. There's basically never been a time in recent history where quicksort was used in totally pure form in any JS engine (guessing, but seems like a safe bet). Similarly, I don't know much about the internals of MJS, so I'll defer to those that do, but for any optimization proposals, a runnable, reproducible benchmark is pretty much standard practice to provide to kick off the conversation. |
You're not seeing the whole picture. In isolation, sort and copy are optimized, but every time you are copy that array you are blowing away the previous sort. This is why it is trivial to outperform the built-in implementation. It's fine if you don't want to take my word for it. I can just hack the library to fix the performance bottlenecks (which is what I'm doing)! Open source is about sharing what we find instead of keeping it to ourselves, but I'm certainly not interested in forcing the issue. FYI, copy doesn't necessarily happen on every frame. It happens when the bodies configuration of the world changes. This, of course, may indeed be every frame based on the usage of the engine. As an aside, it is almost unbelievable that the idea of copying the entire world bodies array every frame would be handwaved away as "should be fast". This seems to imply a very fundamental difference between the kind of performance targets the two of us pursue. |
This topic is just too silly to ignore :) I deleted my last comment since I had a bug in the insertion sort. Here's a benchmark:
This is not benchmarking matter in particular, because the problem I'm discussing is actually not even matter-specific. The benchmark runs 500 iterations over 1000 "bodies". 4 cases are run:
The results:
Take-aways:
It's trivial to show over an order of magnitude improvement; hopefully this is illuminating. Of course, if you disagree with the benchmark methodology, feel free to suggest improvements. Like I already noted, none of this will affect me either way since I have implemented a custom detector! I just think an order of magnitude is worth considering for the stock implementation. |
Thanks for the benchmark. You're proposing a change to Matter.js, so the specific use case does matter. I don't know the internals so I can't review it, but I'm assuming you can't get rid of the copy, and in that case, insertion sort is destroyed every time by the builtin, likely ending the debate in favor of keeping the builtin. So I'd imagine adopting insertion sort boils down to:
Assuming the copy isn't needed at all, the next step I'd take is to actually integrate the insertion sort in MJS and run a number of real-world MJS apps (e.g. create hundreds of thousands of bodies and click around for a minute) and use the browser's profiler to compare performance between the two sorts. Again, I don't know MJS internals, just trying to move the proof along, so there's probably not much more I can offer. |
Two points moving forward:
Moving forward, let's just forget about the insertion sort. As I noted in my original comment, the small API change leaves that to developers anyway. |
matter-js/src/collision/Detector.js
Line 72 in f9208df
The above code is largely inefficient most of the time in practice. This is because quicksort is generally less efficient on mostly-sorted arrays.
Funny as it may sound, the humble insertion sort is a great candidate for sorting mostly-sorted arrays. This is the case a lot of the time in practice: most objects do not move so far every frame as to essentially randomize the bounds sorting.
Replacing the code above with
runs almost 25% faster.
You may not agree with this and I'm curious what your opinion is.
Regardless of whether you think this sorting strategy makes sense in a general case, I suggest that the original line of code above is moved to a new method on
Detector
:sortBodies
. If this was done, it would be much easier to swap out a sort that makes sense for your simulation by simply overridingDetector.sortBodies
. As it stands today, the entireDetector.collisions
method most be overridden to change the sorting strategy for bodies.Thanks for your work!
The text was updated successfully, but these errors were encountered: