Steadybit Resilience Hub

Alert-Rule Check

Check

Collects information about the alert-rule state and optionally verifies that the monitor has an expected state.

Introduction

You can drag and drop the alert rule check step into the experiment editor. It then collects information about the state of Grafana alert rules and, optionally, verifies that they are in the expected state.

Experiments can be aborted and marked as failed when the actual state of a Grafana alert rule diverges from the expected state. This helps implement pre-/post-conditions and invariants, for example, starting an experiment only when the system is healthy.

Finally, to help you understand the alert rule states and how they evolved, the run view also contains a state visualization. It shows which states the alert rules had throughout the experiment execution.

Use Cases

  • Pre-/postcondition or invariant for any experiment.
  • Verify that alerts are triggered during incidents.

Parameters

Parameter          Description                                                                          Default
Duration           How long the state of the alert rule should be checked.                              30s
Expected State     The expected state of the alert rule. One of Inactive, Normal, Pending, Firing.      (none)
State Check Mode   How often the expected state must be observed: "At least once" or "All the time".    "All the time"
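
The difference between the two State Check Mode values can be illustrated with a small sketch. This is not the extension's implementation; `CheckMode`, `check_passes`, and the handling of an empty observation list are hypothetical and only model the behavior the parameters describe.

```python
from enum import Enum
from typing import Iterable, Optional


class CheckMode(Enum):
    # Labels taken from the parameter table; the enum itself is hypothetical.
    AT_LEAST_ONCE = "At least once"
    ALL_THE_TIME = "All the time"


def check_passes(observed: Iterable[str], expected: Optional[str],
                 mode: CheckMode = CheckMode.ALL_THE_TIME) -> bool:
    """Return True if the states observed during the check satisfy the expectation."""
    if expected is None:
        # No expected state configured: the step only collects information.
        return True
    states = list(observed)
    if not states:
        # Assumption: without any observation the expectation cannot be confirmed.
        return False
    matches = [state == expected for state in states]
    if mode is CheckMode.AT_LEAST_ONCE:
        return any(matches)
    return all(matches)
```

With the observations `["Normal", "Firing", "Normal"]` and an expected state of `Firing`, this sketch passes in "At least once" mode and fails in "All the time" mode.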

Tags: Grafana, Kubernetes, Observability, Monitoring
Homepage: hub.steadybit.com/extension/com.steadybit.extension_grafana
License: MIT
Maintainer: Steadybit

Useful Templates

Grafana alert rule fires when a Kubernetes pod is in crash loop

Verify that a Grafana alert rule alerts you when pods are not ready to accept traffic for a certain time.

Motivation

Kubernetes features a readiness probe to determine whether your pod is ready to accept traffic. If a container keeps failing (for example, it crashes or fails its liveness probe), Kubernetes restarts it in the hope that it eventually becomes ready. If the restarts don't help, Kubernetes backs off, waiting increasingly longer between restart attempts, and the Kubernetes resource remains non-functional.

Structure

First, check that the Grafana alert rule responsible for tracking non-ready containers is in an 'okay' state. As soon as the crash loop attack causes one of the containers to crash loop, the Grafana alert rule should fire and escalate to your on-call team.
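
The structure above can be sketched as a check over recorded alert rule states. This is an illustrative model only, not Steadybit's implementation: the `Sample` type, `verify_crash_loop_alert`, and the mapping of the 'okay' state to `Normal` are assumptions.

```python
from typing import List, Tuple

# (seconds since experiment start, alert rule state); hypothetical sample format.
Sample = Tuple[float, str]


def verify_crash_loop_alert(samples: List[Sample], attack_start: float,
                            healthy_state: str = "Normal",
                            alert_state: str = "Firing") -> List[str]:
    """Return failure messages; an empty list means the run met expectations."""
    failures = []
    before = [state for t, state in samples if t < attack_start]
    after = [state for t, state in samples if t >= attack_start]
    # Precondition: the rule must be healthy the whole time before the attack.
    if not before or any(state != healthy_state for state in before):
        failures.append("precondition failed: alert rule was not healthy before the attack")
    # Expectation: the rule fires at least once after the crash loop starts.
    if alert_state not in after:
        failures.append("alert rule never fired after the crash loop started")
    return failures
```

For example, a run that records `Normal` before the attack and reaches `Firing` afterwards returns an empty list, while a run in which the rule stays `Normal` throughout reports that the alert never fired.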

Solution Sketch

  • Kubernetes liveness, readiness, and startup probes

Tags: Crash loop, Harden Observability, Restart, Grafana, Kubernetes

  • Grafana alert rules
  • Kubernetes cluster
  • Kubernetes pods

© 2024 Steadybit GmbH. All rights reserved.