
Create Monitor Downtime

Creates a downtime for a Datadog monitor.

Targets: Datadog monitors


Introduction

When executing chaos experiments, you may want to mute your Datadog monitors so that you don't bother your ops colleagues with alerts you triggered on purpose. This action lets you do exactly that.

The Create Monitor Downtime step can be dragged and dropped into the experiment editor. Afterwards, you can select the monitors that should be muted by creating a downtime.

Use Cases

  • Avoid false positives in your monitoring system
  • Avoid alerting your ops colleagues during chaos experiments

Parameters

Parameter | Description | Default
Duration | How long should the downtime exist? | 30s
Notify after Downtime if unhealthy | Should Datadog notify after the downtime if the monitor is still in an unhealthy state? | yes
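
Under the hood, creating a monitor downtime boils down to a single request against Datadog's downtime API. The sketch below shows a roughly equivalent call using Python and Datadog's public v1 downtime endpoint; the monitor id is hypothetical and the 30-second duration mirrors the action's default. The Steadybit action manages all of this for you.

```python
import os
import time

import requests

# A minimal sketch of what the action effectively does, assuming Datadog's
# public v1 downtime endpoint (POST /api/v1/downtime). MONITOR_ID is
# hypothetical; the 30-second duration mirrors the action's default.
DD_API = "https://api.datadoghq.com/api/v1"
MONITOR_ID = 123456          # hypothetical: the monitor to mute
DURATION_SECONDS = 30        # default "Duration" parameter of the action

now = int(time.time())
response = requests.post(
    f"{DD_API}/downtime",
    headers={
        "DD-API-KEY": os.environ["DD_API_KEY"],
        "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
    },
    json={
        "scope": ["*"],                 # mute the monitor across all scopes
        "monitor_id": MONITOR_ID,       # restrict the downtime to one monitor
        "start": now,
        "end": now + DURATION_SECONDS,  # downtime expires on its own afterwards
        "message": "Muted during a Steadybit chaos experiment",
    },
    timeout=10,
)
response.raise_for_status()
print("Created downtime", response.json()["id"])
```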

Useful Templates (4 of 7)

Datadog alerts when a Kubernetes pod is in crash loop

Verify that a Datadog monitor alerts you when pods are not ready to accept traffic for a certain time.

Motivation

Kubernetes features a readiness probe to determine whether your pod is ready to accept traffic. If the pod doesn't become ready, Kubernetes tries to solve this by restarting the underlying container, hoping it eventually becomes ready. If this doesn't work, Kubernetes eventually backs off from restarting the container, and the Kubernetes resource remains non-functional.

Structure

First, check that the Datadog monitor responsible for tracking non-ready containers is in an OK state. As soon as one of the containers starts crash looping, caused by the crash loop attack, the Datadog monitor should alert and escalate to your on-call team.
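
To make the expected check concrete, the sketch below polls the monitor's state via Datadog's public monitor API (GET /api/v1/monitor/{id}) until it reaches the Alert state. The monitor id and the five-minute deadline are illustrative assumptions; in the template itself, the built-in Datadog monitor check performs this verification for you.

```python
import os
import time

import requests

# A rough sketch of the check this template performs: wait until the
# Datadog monitor leaves its OK state after the crash-loop attack starts.
# MONITOR_ID and the deadline are hypothetical; the endpoint and its
# 'overall_state' field are part of Datadog's public v1 monitor API.
DD_API = "https://api.datadoghq.com/api/v1"
MONITOR_ID = 123456  # hypothetical monitor id
HEADERS = {
    "DD-API-KEY": os.environ["DD_API_KEY"],
    "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
}

deadline = time.time() + 300  # give the monitor up to 5 minutes to alert
while time.time() < deadline:
    monitor = requests.get(
        f"{DD_API}/monitor/{MONITOR_ID}", headers=HEADERS, timeout=10
    ).json()
    state = monitor["overall_state"]
    print("monitor state:", state)
    if state == "Alert":
        break
    time.sleep(10)
else:
    raise AssertionError("Datadog monitor never alerted on the crash loop")
```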

Solution Sketch

  • Kubernetes liveness, readiness, and startup probes
Crash loop
Harden Observability
Datadog
Restart
Kubernetes
Datadog monitors
Kubernetes cluster
Kubernetes pods
Linux host losing network connection is detected by Datadog

When a host suddenly loses its connection to the network and, with it, to your system, Datadog should alert about this. Eventually, everything should recover once the network is back again.

Motivation

When you're working in a less volatile system environment, a loss of network connectivity can be critical, as there is likely no backup host to enable a faster recovery. Thus, you should check that your observability tools catch this.

Structure

Before blocking a host from the network, we verify that the Datadog monitor is in an OK state. Afterward, we block all traffic to and from the host and expect Datadog to alert about the isolated host. Eventually, when the host is back online, we expect the Datadog monitor to return to an OK state. While experimenting, we create a downtime for the monitor so that it does not escalate due to the ongoing alert.

Legacy
VM
Host
Linux
Datadog
Datadog monitors
Linux Hosts
Linux Host reboot is alerted by Datadog

When a Linux host is suddenly missing from your system, Datadog should alert you to this. Eventually, everything should recover on its own, since the host is only being rebooted.

Motivation

When you're working in a less volatile system environment, where you expect hosts to always be running, you should validate that you notice whenever a host reboots.

Structure

Before restarting a host, we verify that the Datadog monitor is in an OK state. Afterward, we trigger the shutdown of the host and expect Datadog to alert about the missing host. Eventually, the host should come back and the Datadog monitor should return to an OK state. While experimenting, we create a downtime for the monitor so that it does not escalate due to the ongoing alert.

Linux
Legacy
VM
Host
Datadog
Datadog monitors
Linux Hosts
Windows host losing network connection is detected by Datadog

When a host suddenly loses its connection to the network and, with it, to your system, Datadog should alert about this. Eventually, everything should recover once the network is back again.

Motivation

When you're working in a less volatile system environment, a loss of network connectivity can be critical, as there is likely no backup host to enable a faster recovery. Thus, you should check that your observability tools catch this.

Structure

Before blocking a host from the network, we verify that the Datadog monitor is in an OK state. Afterward, we block all traffic to and from the host and expect Datadog to alert about the isolated host. Eventually, when the host is back online, we expect the Datadog monitor to return to an OK state. While experimenting, we create a downtime for the monitor so that it does not escalate due to the ongoing alert.

Legacy
VM
Host
Windows
Datadog
Datadog monitors
Windows Hosts


Statistics
Stars: -
Tags: Datadog, Observability, Monitoring
Homepage: hub.steadybit.com/extension/com.steadybit.extension_datadog
License: MIT
Maintainer: Steadybit