Added additional entries for troubleshooting unhealthy cluster #119914
base: 8.17
Conversation
Reordered "Re-enable shard allocation" because it is not as common as other causes. Added additional causes of yellow statuses. Changed the watermark command to include both the high and low watermarks so users can get their cluster operating again.
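For context, a minimal sketch of the kind of watermark command the description refers to, using the cluster settings API; the specific threshold values and the use of persistent (rather than transient) settings here are assumptions, not necessarily what the updated docs show:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%"
  }
}
```

Raising both thresholds temporarily gives a disk-constrained cluster room to allocate shards again; the settings can be reset to `null` once the underlying disk pressure is resolved.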
Documentation preview:
cc @georgewallace (can't personally add reviewers at the moment)
Pinging @elastic/es-docs (Team:Docs)
Drive-by copyedit with suggestions for concision and some formatting fixes.
* If you manually restart a node, then it will temporarily cause an unhealthy cluster until the node has recovered.
* If you have a node that is overloaded or has stopped operating for any reason, then it will temporarily cause an unhealthy cluster. Nodes may disconnect because of prolonged garbage collection (GC) pauses, which can result from "out of memory" errors or high memory usage due to intensive search operations. See <<fix-cluster-status-jvm,Reduce JVM memory pressure>> for more JVM related issues.
* If nodes cannot reliably communicate due to networking issues, they may lose contact with one another. This can cause shards to become out of sync. You can often identify this issue by checking the logs for repeated messages about nodes leaving and rejoining the cluster.
Suggested change:
* A manual node restart will cause a temporary unhealthy cluster state until the node recovers.
* Node overload or failure causes a temporary unhealthy cluster state. Prolonged garbage collection (GC) pauses, caused by out-of-memory errors or high memory usage during intensive searches, can trigger this state. See <<fix-cluster-status-jvm,Reduce JVM memory pressure>> for more JVM-related issues.
* Network issues can prevent reliable node communication, causing shards to become out of sync. Check the logs for repeated messages about nodes leaving and rejoining the cluster.
copyedit for concision
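The checks these bullets describe can also be run from the API; a hedged sketch, assuming the standard cluster health and cat nodes endpoints (the column selection is illustrative and not from this PR):

```console
# Overall status (green, yellow, or red) plus the count of unassigned shards
GET _cluster/health

# Per-node heap pressure; a node missing from this list may have left the cluster
GET _cat/nodes?v=true&h=name,master,heap.percent,cpu
```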
I like these edits but "Node overload or failure causes a temporary unhealthy cluster state" isn't clear to me.
How do you feel about
"When a node becomes overloaded or fails, it can temporarily disrupt the cluster’s health, leading to an unhealthy state."
run docs-build
…fixes. Co-authored-by: Liam Thompson <[email protected]>
Co-authored-by: Liam Thompson <[email protected]>
Co-authored-by: Liam Thompson <[email protected]>
…r-status.asciidoc Co-authored-by: shainaraskas <[email protected]>
Co-authored-by: Liam Thompson <[email protected]>
…r-status.asciidoc Co-authored-by: Liam Thompson <[email protected]>
Reordered "Re-enable shard allocation" because it is not as common as other causes
Added additional causes of yellow statuses
Changed watermark command to include high and low watermark so users can make their cluster operate once again.