Grafana Alert Rule Fires when a Kubernetes Pod Is in Crash Loop
Grafana Alert Rule Fires when a Kubernetes Pod Is in Crash Loop
Grafana Alert Rule Fires when a Kubernetes Pod Is in Crash Loop
Grafana Alert Rule Fires when a Kubernetes Pod Is in Crash Loop
Verify that a Grafana alert rule alerts you when pods are not ready to accept traffic for a certain time.
Motivation
Kubernetes features a readiness probe to determine whether your pod is ready to accept traffic. If it isn't becoming ready, Kubernetes tries to solve it by restarting the underlying container and hoping to achieve its readiness eventually. If this isn't working, Kubernetes will eventually back off to restart the container, and the Kubernetes resource remains non-functional.
Structure
First, check that the Grafana alert rule responsible for tracking non-ready containers is in an 'okay' state. As soon as one of the containers is crash looping, caused by the crash loop attack, the Grafana alert rule should fire and escalate it to your on-call team.
Solution Sketch
How to use this template?
Import via Hub Connection
Steadybit’s Reliability Hub is already connected to your platform. If you are an admin, you can just easily import templates with just one click.
Are you on-prem?
This is how you import Templates