Kubernetes Deployments

Target

Discovery of Kubernetes Deployments

Kubernetes API Server

Our agent can communicate directly with the Kubernetes API Server to get more details about containers and pods. This optional feature is enabled when deploying the agent DaemonSet by setting up a ServiceAccount and limiting its access with RBAC authorization.

Additional information is provided by:

Deployments
ReplicaSets
StatefulSets
DaemonSets

Our central platform prepares this additional information and uses it to identify new potential targets. Thus it is possible to attack a dedicated Kubernetes Deployment or to cause failures in a ReplicaSet.
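As an illustration of limiting the agent's access through RBAC, the following minimal sketch builds a read-only ClusterRole as JSON, which `kubectl apply -f` accepts. The role name `agent-read-only` and the exact set of resources and verbs are assumptions for illustration; the source does not state which permissions the agent actually requires.

```python
import json

# Read-only verbs; nothing here lets the holder modify cluster state.
READ_VERBS = ["get", "list", "watch"]


def read_only_cluster_role(name):
    """Build a ClusterRole manifest that only permits reading workloads.

    The resource list mirrors the types named in the text above plus pods;
    it is an assumption, not the agent's documented requirement.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            # Pods live in the core API group, written as "".
            {"apiGroups": [""], "resources": ["pods"], "verbs": READ_VERBS},
            # Workload controllers live in the "apps" API group.
            {
                "apiGroups": ["apps"],
                "resources": [
                    "deployments",
                    "replicasets",
                    "statefulsets",
                    "daemonsets",
                ],
                "verbs": READ_VERBS,
            },
        ],
    }


if __name__ == "__main__":
    # "agent-read-only" is a hypothetical name.
    print(json.dumps(read_only_cluster_role("agent-read-only"), indent=2))
```

The role would still need a ClusterRoleBinding to the agent's ServiceAccount before it takes effect.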

Supported Actions

Deployment Pod Count

Verifies Kubernetes Deployment pod counts

Kubernetes Deployments
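A pod count check of this kind can be sketched as a comparison of desired and ready replicas. The function below works on a Deployment object as returned by `kubectl get deployment <name> -o json`. Comparing `status.readyReplicas` against `spec.replicas` is an assumption; the source does not say which counts the action compares.

```python
def pod_count_matches(deployment):
    """Return True when the number of ready pods equals the desired count.

    `deployment` is a dict shaped like the JSON of a Deployment object.
    """
    # spec.replicas defaults to 1 when it is not set.
    desired = deployment.get("spec", {}).get("replicas", 1)
    # status.readyReplicas is omitted from the object while it is 0.
    ready = deployment.get("status", {}).get("readyReplicas", 0)
    return ready == desired


if __name__ == "__main__":
    example = {"spec": {"replicas": 3}, "status": {"readyReplicas": 2}}
    print(pod_count_matches(example))
```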

Deployment Rollout Status

Check the rollout status of the deployment. The check succeeds when no rollout is pending, i.e., kubectl rollout status exits with status code 0.

Kubernetes Deployments
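The exit-code condition described above can be reproduced outside the platform with a small wrapper around kubectl. This is a sketch, not the extension's implementation: the function names are hypothetical, and the timeout value is an assumed default. It requires kubectl on the PATH and a configured cluster context.

```python
import subprocess


def rollout_status_command(name, namespace, timeout_seconds=30):
    """Build the kubectl invocation that waits for a rollout to finish."""
    return [
        "kubectl",
        "rollout",
        "status",
        f"deployment/{name}",
        "--namespace",
        namespace,
        # kubectl exits non-zero if the rollout is still pending after this.
        f"--timeout={timeout_seconds}s",
    ]


def rollout_is_complete(name, namespace, timeout_seconds=30, run=subprocess.run):
    """Return True when kubectl reports the rollout as done (exit code 0).

    `run` is injectable so the function can be exercised without a cluster.
    """
    result = run(
        rollout_status_command(name, namespace, timeout_seconds),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Calling `rollout_is_complete("web", "default")` would check a hypothetical Deployment named `web` in the `default` namespace.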

Rollout Restart Deployment

Execute a rollout restart for a Kubernetes Deployment.

Kubernetes Deployments
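For context, `kubectl rollout restart` triggers a new rollout by changing an annotation on the pod template, which makes the Deployment controller replace the pods. The sketch below builds an equivalent patch body. Whether the Steadybit action uses this exact mechanism is not stated in the source, and the function name is hypothetical.

```python
import json
from datetime import datetime, timezone

# Annotation that kubectl sets on the pod template during a rollout restart.
RESTART_ANNOTATION = "kubectl.kubernetes.io/restartedAt"


def restart_patch(now=None):
    """Build a patch that changes the pod template and so starts a rollout."""
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "spec": {
            "template": {
                "metadata": {"annotations": {RESTART_ANNOTATION: timestamp}}
            }
        }
    }


if __name__ == "__main__":
    # The JSON could be passed to: kubectl patch deployment <name> -p '<json>'
    print(json.dumps(restart_patch()))
```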

Scale Deployment

Scale a Kubernetes Deployment up or down.

Kubernetes Deployments

Set Image

Set the image of a Deployment

Kubernetes Deployments

Recommended Advice (4 of 13)

Image Pull Policy Set To Always

Ensure that all of your containers always run the same, latest container image.

Image Pull Policy

Kubernetes Daemonsets

Kubernetes Deployments

Kubernetes Statefulsets

Image Version Explicitly Configured

Validates that container images use explicit version tags and avoid the latest tag.

Explicit Versioning

Kubernetes Deployments
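A check like this can be sketched as a small parser for image references. The function below is an illustration, not the advice's actual rule set. Treating a digest-pinned image as explicitly versioned is an assumption.

```python
def uses_explicit_version(image):
    """Return True when an image reference pins a version other than latest.

    Handles registry ports (localhost:5000/app) by only looking for a tag
    in the last path segment.
    """
    # A digest (name@sha256:...) pins the exact image content.
    if "@" in image:
        return True
    last_segment = image.rsplit("/", 1)[-1]
    # No colon in the last segment means no tag, which resolves to latest.
    if ":" not in last_segment:
        return False
    tag = last_segment.split(":", 1)[1]
    return tag not in ("", "latest")


if __name__ == "__main__":
    for ref in ["nginx", "nginx:latest", "nginx:1.25.3", "localhost:5000/app"]:
        print(ref, uses_explicit_version(ref))
```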

Limit CPU Resources

Validates that your Kubernetes resources have a CPU limit configured.

Kubernetes Daemonsets

Kubernetes Deployments

Kubernetes Statefulsets

Limit Ephemeral Storage Resources

Validates that your Kubernetes resources have an ephemeral storage limit configured.

Ephemeral Storage

Kubernetes Daemonsets

Kubernetes Deployments

Kubernetes Statefulsets
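Resource-limit validations such as the CPU and ephemeral storage checks can be sketched as a scan over the containers in a workload's pod template. The function name is hypothetical and the logic is an assumption about how such a check might work. The field path `spec.template.spec.containers[].resources.limits` and the resource names `cpu` and `ephemeral-storage` are standard Kubernetes.

```python
def containers_missing_limit(workload, resource):
    """List the names of containers that have no limit for `resource`.

    `workload` is a dict shaped like the JSON of a Deployment, DaemonSet,
    or StatefulSet; `resource` is e.g. "cpu" or "ephemeral-storage".
    """
    pod_spec = workload.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        # resources and limits may be absent or null in the manifest.
        limits = (container.get("resources") or {}).get("limits") or {}
        if resource not in limits:
            missing.append(container.get("name", "<unnamed>"))
    return missing


if __name__ == "__main__":
    example = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {"name": "app", "resources": {"limits": {"cpu": "500m"}}},
                        {"name": "sidecar"},
                    ]
                }
            }
        }
    }
    print(containers_missing_limit(example, "cpu"))
    print(containers_missing_limit(example, "ephemeral-storage"))
```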

Useful Templates (4 of 29)

Kubernetes deployment survives Redis latency

Verify that your application handles an increased latency in a Redis cache properly, allowing for increased processing time while maintaining throughput.

Motivation

Latency issues in Redis can lead to degraded system performance, longer response times, and potentially lost or delayed data. By testing your system's resilience to Redis latency, you can ensure that it can handle increased processing time and maintain its throughput during increased latency. Additionally, you can identify any potential bottlenecks or inefficiencies in your system and take appropriate measures to optimize its performance and reliability.

Structure

We will verify that a load-balanced user-facing endpoint fully works while having all pods ready. As soon as we simulate Redis latency, we expect the system to maintain its throughput and indicate unavailability appropriately. We can introduce delays in Redis operations to simulate latency. The experiment aims to ensure that your system can handle increased processing time and maintain its throughput during increased latency. The performance should return to normal after the latency has ended.

Kubernetes deployment survives Redis downtime

Check that your application gracefully handles a Redis cache downtime and continues to deliver its intended functionality. The cache downtime may be caused by an unavailable Redis instance or a complete cluster.

Motivation

Redis downtime can lead to degraded system performance, lost data, and potentially long system recovery times. By testing your system's resilience to Redis downtime, you can ensure that it can handle the outage gracefully and continue to deliver its intended functionality. Additionally, you can identify any potential weaknesses in your system and take appropriate measures to improve its performance and resilience.

Structure

We will verify that a load-balanced user-facing endpoint fully works while having all pods ready. As soon as we simulate Redis downtime, we expect the system to indicate unavailability appropriately and maintain its throughput. We can block the traffic to the Redis instance to simulate downtime. The experiment aims to ensure that your system can gracefully handle the outage and continue delivering its intended functionality. The performance should return to normal after the Redis instance is available again.
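One common way for an application to pass this kind of experiment is to treat a cache outage as a cache miss and fall back to the source of truth. The sketch below illustrates that idea with hypothetical callables `cache_get` and `load_from_source`; it is not taken from the template. It assumes the cache client raises `OSError`-family errors such as `ConnectionError` or `TimeoutError`, so a real Redis client's own exception types would need to be added.

```python
def read_through_cache(key, cache_get, load_from_source):
    """Read `key` from the cache, falling back to the source on failure.

    cache_get and load_from_source are hypothetical callables supplied by
    the application; a cache miss is signalled by None.
    """
    try:
        cached = cache_get(key)
    except OSError:
        # ConnectionError and TimeoutError are OSError subclasses.
        # Treat an unreachable cache like a miss instead of failing.
        cached = None
    if cached is not None:
        return cached
    return load_from_source(key)


if __name__ == "__main__":
    def broken_cache(key):
        raise ConnectionError("cache unreachable")

    print(read_through_cache("user:1", broken_cache, lambda k: f"db value for {k}"))
```

With this shape, requests get slower during the outage because every read hits the source, which is why the experiment also watches throughput.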

Network outage for Kubernetes nodes in an availability zone

Achieve high availability of your Kubernetes cluster via redundancy across different Availability Zones. Check what happens to your Kubernetes cluster when one of the zones is down.

Motivation

Cloud providers host your deployments and services across multiple locations worldwide. From a reliability standpoint, regions and availability zones are most interesting. While the former refers to separate geographic areas spread worldwide, the latter refers to an isolated location within a region. For most use cases, applying deployments across availability zones is sufficient. Given that failures may happen at this level quite frequently, you should verify that your applications are still working in case of an outage.

Structure

We leverage the block traffic attack to simulate a full network loss in an availability zone. While the zone outage happens, we observe changes in the Kubernetes cluster with Steadybit's built-in visibility. Once the zone outage is over, we expect that all deployments will recover again within a specified time.
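Before running a zone outage, it can help to confirm that a deployment's pods are actually spread across zones. The sketch below counts pods per zone from pod and node objects as returned by `kubectl get pods -o json` and `kubectl get nodes -o json`. The function names are hypothetical. The node label `topology.kubernetes.io/zone` is the standard well-known label, but whether your nodes carry it depends on the cluster.

```python
from collections import Counter

# Well-known node label that holds the availability zone.
ZONE_LABEL = "topology.kubernetes.io/zone"


def pods_per_zone(pods, nodes):
    """Count scheduled pods per availability zone.

    `pods` and `nodes` are lists of dicts shaped like Kubernetes objects.
    Pods that are unscheduled or on nodes without a zone label are skipped.
    """
    zone_by_node = {
        node["metadata"]["name"]: node["metadata"].get("labels", {}).get(ZONE_LABEL)
        for node in nodes
    }
    counts = Counter()
    for pod in pods:
        node_name = pod.get("spec", {}).get("nodeName")
        zone = zone_by_node.get(node_name)
        if zone is not None:
            counts[zone] += 1
    return dict(counts)


def spans_multiple_zones(pods, nodes):
    """Return True when pods run in at least two distinct zones."""
    return len(pods_per_zone(pods, nodes)) >= 2
```

Running in two zones is a necessary condition only; the remaining zone must also have enough capacity to carry the load.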

Solution Sketch

AWS Regions and Zones
Azure Regions and Zones
GCP Regions and Zones
Kubernetes liveness, readiness, and startup probes

Availability Zone

Network loss for Kubernetes node's outgoing traffic in an availability zone

Achieve high availability of your Kubernetes cluster via redundancy across different Availability Zones. Check what happens to your Kubernetes cluster when one of the zones suffers from a network loss.

Motivation

Cloud providers host your deployments and services across multiple locations worldwide. From a reliability standpoint, regions and availability zones are most interesting. While the former refers to separate geographic areas spread worldwide, the latter refers to an isolated location within a region. For most use cases, applying deployments across availability zones is sufficient. Given that failures may happen at this level quite frequently, you should verify that your applications are still working in case of an outage.

Structure

We leverage the drop outgoing traffic attack to simulate network loss in an availability zone. If you want to test for a full outage of the zone, configure it to 100% loss. While the network loss happens, we observe changes in the Kubernetes cluster with Steadybit's built-in visibility. Once the network loss is over, we expect that all deployments will recover again within a specified time.

Solution Sketch

AWS Regions and Zones
Azure Regions and Zones
GCP Regions and Zones
Kubernetes liveness, readiness, and startup probes

Availability Zone

Tags

Kubernetes

AKS (Azure Kubernetes Service)

EKS (AWS Elastic Kubernetes Service)

AWS

Azure

GCP

Advice

Homepage

hub.steadybit.com/extension/com.steadybit.extension_kubernetes

GitHub

steadybit/extension-kubernetes

ghcr.io

steadybit/extension-kubernetes

License

Maintainer

Steadybit

support@steadybit.com

https://steadybit.com

© 2026 Steadybit GmbH. All rights reserved.