Remote clusters can be located in different regions or zones, so transport requests between clusters can have high latency. For instance, the latency between us-west and eu-west can exceed 100ms. When using the _search API with CCS and ccs_minimize_roundtrips=true, only one cross-cluster transport request is needed per remote cluster. However, ES|QL with CCS requires multiple cross-cluster transport requests, including field-caps requests, cluster-compute requests, and exchange requests.
I investigated an issue where a remote cluster generated 70 small pages, with a latency greater than 120ms between clusters. Each page fetch pays one cross-cluster round trip, and we have only three concurrent clients fetching pages, so this process alone could take more than 2.8 seconds (70/3 * 120ms).
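The estimate above can be reproduced with a small back-of-the-envelope model. This is only a sketch: the function name `estimate_fetch_seconds` and its parameters are made up for illustration, and it assumes every page fetch costs exactly one round trip with no overlap beyond the concurrent clients.

```python
import math


def estimate_fetch_seconds(pages, concurrent_clients, round_trip_ms):
    """Estimate the time spent only on fetching pages across clusters.

    Returns (lower_bound, whole_rounds):
      lower_bound  - pages / clients * latency, the figure used in the issue
      whole_rounds - same, but rounding up to full rounds of fetches
    """
    lower_bound = pages / concurrent_clients * round_trip_ms / 1000
    whole_rounds = math.ceil(pages / concurrent_clients) * round_trip_ms / 1000
    return lower_bound, whole_rounds


# Numbers from the investigation: 70 pages, 3 clients, 120ms latency.
lower, rounded = estimate_fetch_seconds(pages=70, concurrent_clients=3, round_trip_ms=120)
print(f"{lower:.2f}s to {rounded:.2f}s")  # prints "2.80s to 2.88s"
```

In this model the cost grows linearly with the number of pages, which is why reducing the number of pages that cross the cluster boundary matters more than speeding up any single request.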
Node-level reduction should help alleviate this problem, but it is currently disabled by default and only enabled in snapshot builds. If the query targets a large number of nodes in the remote cluster, cluster-level reduction will also be necessary to avoid latency issues. The implementation of cluster-level reduction can be similar to that of node-level reduction. However, it should weigh the latency savings against the cost of duplicating reduction work on both the coordinator in the querying cluster and the remote clusters.
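To make the trade-off concrete, here is a toy model of a top-N style query. It is not how ES|QL implements reduction: the helper names, the use of plain integers as rows, and the assumption that each data node ships its own top-N are all illustrative.

```python
import heapq


def top_n(rows, n):
    """Return the n smallest rows, standing in for a sort plus limit."""
    return heapq.nsmallest(n, rows)


def rows_sent_without_cluster_reduction(node_results, n):
    """Each remote data node ships its own top-n across the cluster boundary."""
    return sum(len(top_n(rows, n)) for rows in node_results)


def rows_sent_with_cluster_reduction(node_results, n):
    """The remote cluster merges per-node top-n first and ships only the merged top-n."""
    per_node = [row for rows in node_results for row in top_n(rows, n)]
    return len(top_n(per_node, n))


# Hypothetical remote cluster: 4 data nodes holding 100 rows each.
nodes = [list(range(i, 400, 4)) for i in range(4)]
print(rows_sent_without_cluster_reduction(nodes, 10))  # prints 40
print(rows_sent_with_cluster_reduction(nodes, 10))     # prints 10
```

In this model the merge on the remote cluster repeats work that the coordinator in the querying cluster performs again on the combined results, which is the duplicate work mentioned above. The saving is that fewer rows, and so fewer pages, cross the high-latency link.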