[8.x] [kbn-test] retry 5xx in saml callback (#208977) #211023
Merged
# Backport

This will backport the following commits from `main` to `8.x`:
 - [kbn-test] retry 5xx in saml callback (#208977)

### Questions ?
Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Source commit on `main` (v9.1.0, merged): [kbn-test] retry 5xx in saml callback (#208977), `2b5bbf8f86f0c6e0e05ab5e6381bba4919c64e33`. The `8.x` target is labeled v8.19.0.

## Summary

When we run Scout tests in parallel, we call SAML authentication in parallel too. Because the `.security-profile-8` index does not exist by default, we periodically get a 503 response:

```
 proc [kibana] [2025-01-29T11:13:10.420+01:00][ERROR][plugins.security.user-profile] 
Failed to activate user profile: {"error":{"root_cause":[{"type":"unavailable_shards_exception","reason":
"at least one search shard for the index [.security-profile-8] is unavailable"}],
"type":"unavailable_shards_exception","reason":"at least one search shard
for the index [.security-profile-8] is unavailable"},"status":503}. {"service":{"node":
{"roles":["background_tasks","ui"]}}}
```

The solution is to retry the SAML callback, on the assumption that the index will have been created by the next attempt and the error will no longer occur.
We agreed with Kibana-Security to retry only **5xx** errors, because for **4xx** we most likely have to restart the authentication from the beginning.

For reviewers: the failure is not 100% reproducible, so I added unit tests to verify that the retry logic is applied only to 5xx responses. Please let me know if I missed anything.

The retry was verified locally; you may see log output like this:

```
 proc [kibana] [2025-01-30T18:40:41.348+01:00][ERROR][plugins.security.user-profile] Failed to activate user profile:
{"error":{"root_cause":[{"type":"unavailable_shards_exception","reason":"at least one search shard for the index
[.security-profile-8] is unavailable"}],"type":"unavailable_shards_exception","reason":"at least one search shard
for the index [.security-profile-8] is unavailable"},"status":503}. {"service":{"node":{"roles":["background_tasks","ui"]}}}
 proc [kibana] [2025-01-30T18:40:41.349+01:00][ERROR][plugins.security.authentication] Login attempt with "saml"
provider failed due to unexpected error: {"error":{"root_cause":[{"type":"unavailable_shards_exception","reason":
"at least one search shard for the index [.security-profile-8] is unavailable"}],"type":"unavailable_shards_exception",
"reason":"at least one search shard for the index [.security-profile-8] is unavailable"},"status":503}
{"service":{"node":{"roles":["background_tasks","ui"]}}}
 proc [kibana] [2025-01-30T18:40:41.349+01:00][ERROR][http] 500 Server Error {"http":{"response":{"status_code":500},"request":{"method":"post","path":"/api/security/saml/callback"}},"error":
{"message":"unavailable_shards_exception\n\tRoot causes:\n\t\tunavailable_shards_exception: at least one
search shard for the index [.security-profile-8] is
 ERROR [scout] SAML callback failed: expected 302, got 500
 Waiting 939 ms before the next attempt
 proc [playwright]
 info [o.e.c.r.a.AllocationService] [scout] current.health="GREEN" message="Cluster health status changed
from [YELLOW] to [GREEN] (reason: [shards started [[.security-profile-8][0]]])."
previous.health="YELLOW" reason="shards started [[.security-profile-8][0]]"
```

To reproduce:

```
node scripts/scout.js run-tests --stateful --config x-pack/platform/plugins/private/discover_enhanced/ui_tests/parallel.playwright.config.ts
```

---------

Co-authored-by: kibanamachine <[email protected]>
Co-authored-by: Dzmitry Lemechko
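
For readers who want the retry rule from the summary in concrete form, here is a minimal sketch of "retry the SAML callback on 5xx only". It is not the code from this PR: `callWithRetryOn5xx`, `isServerError`, `RetryOptions`, and the injected `sendCallback`, `delayMs`, and `sleep` parameters are all hypothetical names chosen for illustration. The only details taken from the description are that a successful callback returns 302, that 5xx responses are retried after a wait, and that 4xx responses are not retried.

```typescript
// Illustrative sketch only; every name here is hypothetical, not the kbn-test API.
type CallbackFn = () => Promise<{ status: number }>;

interface RetryOptions {
  maxAttempts: number; // total attempts, including the first one
  delayMs: (attempt: number) => number; // wait before the next attempt
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait
}

// Only server-side failures are assumed to be transient (e.g. the index not being ready yet).
const isServerError = (status: number): boolean => status >= 500 && status <= 599;

async function callWithRetryOn5xx(
  sendCallback: CallbackFn,
  options: RetryOptions
): Promise<{ status: number }> {
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 1; ; attempt++) {
    const response = await sendCallback();
    // The SAML callback is expected to answer with a redirect on success.
    if (response.status === 302) {
      return response;
    }
    const message = `SAML callback failed: expected 302, got ${response.status}`;
    // 4xx (or anything that is not 5xx) fails immediately: authentication has to restart.
    if (!isServerError(response.status) || attempt >= options.maxAttempts) {
      throw new Error(message);
    }
    await sleep(options.delayMs(attempt));
  }
}
```

Injecting `sleep` and `delayMs` keeps the sketch deterministic under unit tests, which matters here because the original failure is not reliably reproducible.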