Steadybit Resilience Hub

Cause Crash Loop

Attack

Causes a crash loop in a pod

Introduction

Use this step to repeatedly kill all containers, or one named container, in a selected pod.
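The behavior described above can be pictured as a simple loop. The following is a minimal sketch, not the extension's actual implementation: the function names, the `kill` callback, and the `interval_s` pause between rounds are hypothetical and exist only for illustration.

```python
import time
from typing import Callable, Iterable, List, Optional


def select_containers(containers: Iterable[str], name: Optional[str] = None) -> List[str]:
    """Return every container, or only the one called `name` if a name is given."""
    names = list(containers)
    if name is None:
        return names
    return [c for c in names if c == name]


def crash_loop(
    containers: Iterable[str],
    kill: Callable[[str], None],
    duration_s: float,
    container: Optional[str] = None,
    interval_s: float = 1.0,  # hypothetical pause between kill rounds
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `kill` on the selected containers repeatedly until the duration ends.

    Returns the number of kill calls that were made.
    """
    targets = select_containers(containers, container)
    deadline = clock() + duration_s
    kills = 0
    while clock() < deadline:
        for target in targets:
            kill(target)
            kills += 1
        sleep(interval_s)
    return kills
```

Passing `clock` and `sleep` as arguments keeps the loop testable without waiting in real time; a real attack would replace `kill` with whatever mechanism terminates the container's process.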

Use Cases

  • Simulate containers failing at startup and Kubernetes backing off between restart attempts

Known limitations

  • Pods using hostPID=true are currently not supported
  • Containers without a kill binary are currently not supported

Rollback

No rollback necessary.

Parameters

Parameter   Required   Description                                                                         Default
Duration    true       How long should the attack run?                                                     60s
Container   false      Name of a container which should be killed. By default all containers are killed.
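As an illustration of how these two parameters might be represented and validated in a script, here is a small sketch. The `CrashLoopParams` class and `parse_duration` helper are hypothetical names, and the accepted unit suffixes are an assumption; the source only shows the value `60s`.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed unit suffixes; only "60s" appears in the parameter table.
_UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text: str) -> float:
    """Convert a string such as '60s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]


@dataclass(frozen=True)
class CrashLoopParams:
    """Hypothetical container for the two documented parameters."""

    duration: str = "60s"            # required; default taken from the table
    container: Optional[str] = None  # None means all containers are killed

    @property
    def duration_seconds(self) -> float:
        return parse_duration(self.duration)
```

Leaving `container` unset mirrors the documented default of killing all containers.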

Tags: Kubernetes
Homepage: hub.steadybit.com/extension/com.steadybit.extension_kubernetes
License: MIT
Maintainer: Steadybit

Useful Templates (4 of 5)

Datadog alerts when a Kubernetes pod is in crash loop

Verify that a Datadog monitor alerts you when pods are not ready to accept traffic for a certain time.

Motivation

Kubernetes uses a readiness probe to determine whether your pod is ready to accept traffic. When a container keeps failing, Kubernetes restarts it in the hope that it eventually becomes ready. If the restarts keep failing, Kubernetes backs off, waiting longer between restart attempts, and the Kubernetes resource remains non-functional.
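To make the back-off concrete: the Kubernetes documentation describes an exponential restart delay that starts at 10 seconds, doubles with each restart, and is capped at 300 seconds. The sketch below computes that schedule. The function name is hypothetical, and the actual values can differ with cluster version and configuration.

```python
from typing import List


def restart_backoff_delays(restarts: int, base_s: int = 10, cap_s: int = 300) -> List[int]:
    """Return the delay in seconds before each of the first `restarts` restarts.

    Each delay doubles the previous one and is capped at `cap_s`.
    """
    return [min(base_s * 2 ** i, cap_s) for i in range(restarts)]


print(restart_backoff_delays(7))  # [10, 20, 40, 80, 160, 300, 300]
```

With these defaults, a container that keeps crashing reaches the five-minute cap after five restarts, which is why an alert threshold shorter than a few minutes may fire before the back-off is fully visible.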

Structure

First, check that the Datadog monitor responsible for tracking non-ready containers is in an 'okay' state. As soon as one of the containers is crash looping, caused by the crash loop attack, the Datadog monitor should alert and escalate it to your on-call team.
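The structure above follows a check, attack, verify pattern. The following sketch shows that flow with injected callables. It does not call Datadog or Steadybit; `monitor_state` and `start_attack` are hypothetical hooks, and the state names `"OK"` and `"Alert"` are placeholders for whatever your monitor reports.

```python
import time
from typing import Callable


def run_alert_experiment(
    monitor_state: Callable[[], str],
    start_attack: Callable[[], None],
    timeout_s: float,
    poll_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if the monitor alerts within `timeout_s` after the attack starts."""
    # Step 1: the monitor must be healthy before the experiment begins.
    if monitor_state() != "OK":
        raise RuntimeError("precondition failed: monitor is not in an OK state")

    # Step 2: start the crash loop attack.
    start_attack()

    # Step 3: poll until the monitor alerts or the timeout expires.
    deadline = clock() + timeout_s
    while clock() < deadline:
        if monitor_state() == "Alert":
            return True
        sleep(poll_s)
    return False
```

A `False` result means the monitor never fired in time, which is the observability gap this template is meant to reveal.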

Solution Sketch

  • Kubernetes liveness, readiness, and startup probes

Tags: Crash loop, Harden Observability, Datadog, Restart, Kubernetes

Datadog monitors

Kubernetes cluster

Kubernetes pods

Dynatrace should detect a crash loop as a problem

Verify that Dynatrace alerts you on pods not being ready to accept traffic for a certain amount of time.

Motivation

Kubernetes uses a readiness probe to determine whether your pod is ready to accept traffic. When a container keeps failing, Kubernetes restarts it in the hope that it eventually becomes ready. If the restarts keep failing, Kubernetes backs off, waiting longer between restart attempts, and the Kubernetes resource remains non-functional.

Structure

First, check that Dynatrace reports no problems for the entity and is not already alerting on non-ready containers. As soon as one of the containers is crash looping, caused by the Steadybit crash loop attack, Dynatrace should detect the problem and alert so that your on-call team takes action.
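When comparing what the monitoring tool reports against the cluster itself, it can help to read the crash loop state directly from a pod's status. The sketch below inspects a pod object as returned by the Kubernetes API in JSON form, where a backed-off container has a `waiting` state with the reason `CrashLoopBackOff`. The function name and the sample pod are hypothetical.

```python
from typing import Any, Dict, List


def crash_looping_containers(pod: Dict[str, Any]) -> List[str]:
    """Return the names of containers whose waiting reason is CrashLoopBackOff."""
    statuses = (pod.get("status") or {}).get("containerStatuses") or []
    result = []
    for status in statuses:
        waiting = (status.get("state") or {}).get("waiting") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            result.append(status["name"])
    return result


# Hypothetical sample data, shaped like a pod object in JSON form.
sample_pod = {
    "status": {
        "containerStatuses": [
            {"name": "app", "ready": False, "restartCount": 6,
             "state": {"waiting": {"reason": "CrashLoopBackOff"}}},
            {"name": "sidecar", "ready": True, "restartCount": 0,
             "state": {"running": {}}},
        ]
    }
}

print(crash_looping_containers(sample_pod))  # ['app']
```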

Solution Sketch

  • Kubernetes liveness, readiness, and startup probes

Tags: Crash loop, Dynatrace, Harden Observability, Kubernetes

Kubernetes cluster

Kubernetes pods

Grafana alert rule fires when a Kubernetes pod is in crash loop

Verify that a Grafana alert rule alerts you when pods are not ready to accept traffic for a certain time.

Motivation

Kubernetes uses a readiness probe to determine whether your pod is ready to accept traffic. When a container keeps failing, Kubernetes restarts it in the hope that it eventually becomes ready. If the restarts keep failing, Kubernetes backs off, waiting longer between restart attempts, and the Kubernetes resource remains non-functional.

Structure

First, check that the Grafana alert rule responsible for tracking non-ready containers is in an 'okay' state. As soon as one of the containers is crash looping, caused by the crash loop attack, the Grafana alert rule should fire and escalate it to your on-call team.

Solution Sketch

  • Kubernetes liveness, readiness, and startup probes

Tags: Crash loop, Harden Observability, Restart, Grafana, Kubernetes

Grafana alert rules

Kubernetes cluster

Kubernetes pods

Instana should detect a crash loop as an incident

Intent

Verify that Instana alerts you when pods are not ready to accept traffic for some time.

Motivation

Kubernetes uses a readiness probe to determine whether your pod is ready to accept traffic. When a container keeps failing, Kubernetes restarts it in the hope that it eventually becomes ready. If the restarts keep failing, Kubernetes backs off, waiting longer between restart attempts, and the Kubernetes resource remains non-functional.

Structure

First, check that Instana shows no critical events for the application perspective. As soon as one of the containers is crash looping, caused by the Steadybit crash loop attack, Instana should report a critical event so that your on-call team takes action.

Solution Sketch

  • Kubernetes liveness, readiness, and startup probes

Tags: Crash loop, Instana, Harden Observability, Kubernetes

Instana application perspectives

Kubernetes cluster

Kubernetes pods

© 2024 Steadybit GmbH. All rights reserved.