sqs: Optimize for concurrency #226

Merged · merged 2 commits on Dec 4, 2020

Conversation

@antoineco (Contributor) commented on Dec 1, 2020

Reopens #225, which was accidentally merged (I reverted it).

Closes #222

@antoineco (Contributor, Author) commented on Dec 1, 2020

Performance on 2 cores with 2 receiver|processor|deleter per thread. No CPU throttling was observed with the current CPU limit of 1.

[screenshot: throughput results]

@antoineco changed the title from "Sqs optimize" to "sqs: Optimize for concurrency" on Dec 1, 2020
@antoineco (Contributor, Author) commented

Here is the result of another load test with dynamic concurrency settings (+1 receiver|processor|deleter every 10s), virtually unlimited buffers (size > total messages in the SQS queue), and TriggerMesh's default CPU limit (500m):

[screenshot: throughput results]

Things become relatively unstable beyond 6 receiver|processor|deleter, which is 3 per thread.

The default limit of 500m is causing a fair amount of CPU throttling:

[screenshot: CPU throttling metrics]

@antoineco (Contributor, Author) commented

The same load test as above, but without a CPU limit. This time the performance remains quite stable up to 12 receiver|processor|deleter (6 per thread), but the figures did not double compared to the previous experiment because CloudEvents are still sent sequentially after messages are received from the queue.

[screenshot: throughput results]

As observed before, the CPU usage remains below 400m:

[screenshot: CPU usage metrics]
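
For illustration, here is a minimal, self-contained Go sketch of the sequential send mentioned above: within each processor, the CloudEvents of a received batch are emitted one after the other, so the sink's round-trip time bounds per-processor throughput. All names (`message`, `sendCloudEvent`, `processBatch`) are hypothetical stand-ins, not the adapter's actual code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type message struct{ id string }

// sendCloudEvent stands in for the blocking HTTP delivery of one CloudEvent
// to the event sink (hypothetical helper, not the adapter's real API).
func sendCloudEvent(ctx context.Context, m message) error {
	time.Sleep(10 * time.Millisecond) // simulated network round trip
	return nil
}

// processBatch emits the events of one received batch one after the other,
// so per-processor throughput is bounded by the send latency.
func processBatch(ctx context.Context, batch []message) {
	for _, m := range batch {
		if err := sendCloudEvent(ctx, m); err != nil {
			continue // the message stays in the queue and is retried later
		}
		// In the adapter, the message would then be handed off for deletion.
	}
}

func main() {
	batch := []message{{"1"}, {"2"}, {"3"}}
	start := time.Now()
	processBatch(context.Background(), batch)
	fmt.Println("batch processed in", time.Since(start))
}
```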

@antoineco requested a review from @sebgoa on December 2, 2020
@antoineco (Contributor, Author) commented on Dec 2, 2020

@sebgoa the decision is now about defining the defaults per Pod.

Do we want to

  1. Assume maximum performance per Pod is always a goal and raise the default request+limit to achieve that with 6 message processors per thread? (-> 450 msg/s)

    Trade-off: we won't be able to schedule as many replicas per node, and it might be overkill for more moderate traffic (500 msg/s feels like a lot to me).

  2. Take a more moderate stance and stick to this PR's values with 2 message processors per thread? (-> 150 msg/s)

    Trade-off: scaling to higher rates requires horizontal auto-scaling, which we have to enable in a separate PR (this statement can apply to all cases, actually).

  3. Meet halfway, e.g. raise the default to 3 message processors per thread? (-> 250 msg/s)

    Trade-off: we won't be able to schedule as many replicas per node as in option 2, but still a few more than in option 1.

I would personally vote for option 3, then follow up with auto-scaling (#230), and maybe also #227 to spawn/terminate goroutines dynamically based on the number of messages being received.
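
For the sake of discussion, here is a minimal sketch of how a per-thread default of 3 (option 3) could translate into a per-Pod processor count. The `PROCESSORS_PER_THREAD` variable name and the default value are assumptions for illustration, not the adapter's actual configuration.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
)

// processorsPerPod derives the total number of message processors from a
// per-thread factor (hypothetical default of 3, matching option 3 above),
// optionally overridden by the PROCESSORS_PER_THREAD environment variable.
func processorsPerPod() int {
	perThread := 3
	if v, err := strconv.Atoi(os.Getenv("PROCESSORS_PER_THREAD")); err == nil && v > 0 {
		perThread = v
	}
	return perThread * runtime.GOMAXPROCS(0)
}

func main() {
	fmt.Println("message processors for this pod:", processorsPerPod())
}
```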

@antoineco marked this pull request as ready for review on December 2, 2020
Rewrite of the source adapter to spawn concurrent message processors
instead of executing a single loop sequentially.
The number of receivers, senders and deleters is based on the number of
available CPU cores.
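
To make the description above concrete, here is a hedged Go sketch of the fan-out pattern it describes: per-core receiver, sender and deleter goroutines connected by channels. The helper functions, channel sizes and the one-goroutine-per-core factor are assumptions, not the adapter's actual implementation.

```go
package main

import (
	"context"
	"runtime"
	"sync"
	"time"
)

type message struct{ id, body string }

// Hypothetical stand-ins for the SQS receive, CloudEvents send and SQS delete calls.
func receiveBatch(ctx context.Context) []message {
	time.Sleep(50 * time.Millisecond) // placeholder for a long-polling receive
	return nil
}
func sendCloudEvent(ctx context.Context, m message) error { return nil }
func deleteMessage(ctx context.Context, m message) error  { return nil }

func run(ctx context.Context) {
	// One receiver, one sender and one deleter per available CPU thread
	// (the exact factor used by the adapter is an assumption here).
	n := runtime.GOMAXPROCS(0)

	toSend := make(chan message, n)
	toDelete := make(chan message, n)

	var recvWG, sendWG, delWG sync.WaitGroup

	for i := 0; i < n; i++ {
		recvWG.Add(1)
		go func() { // receiver: polls the SQS queue for message batches
			defer recvWG.Done()
			for ctx.Err() == nil {
				for _, m := range receiveBatch(ctx) {
					toSend <- m
				}
			}
		}()

		sendWG.Add(1)
		go func() { // sender: turns each message into a CloudEvent
			defer sendWG.Done()
			for m := range toSend {
				if err := sendCloudEvent(ctx, m); err == nil {
					toDelete <- m
				}
			}
		}()

		delWG.Add(1)
		go func() { // deleter: removes successfully sent messages from the queue
			defer delWG.Done()
			for m := range toDelete {
				_ = deleteMessage(ctx, m)
			}
		}()
	}

	// Drain the pipeline in stages once the receivers stop.
	recvWG.Wait()
	close(toSend)
	sendWG.Wait()
	close(toDelete)
	delWG.Wait()
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	run(ctx)
}
```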