AWS

Extension

A Steadybit discovery and action implementation to inject faults into various AWS services.

Install now

AWS

A Steadybit discovery and action implementation to inject faults into various AWS services.

Extension

Install now

AWS

Extension

A Steadybit discovery and action implementation to inject faults into various AWS services.

Install now

AWS

A Steadybit discovery and action implementation to inject faults into various AWS services.

Extension

Install now

Go back to list

YouTube content is not loaded by default for privacy reasons.

Introduction

The AWS (Amazon Web Services) extension bundles various attacks, target discovery, and check capabilities for AWS managed services. For example, you can use the AWS extension to change the state of EC2 instances, trigger reboot or failover for RDS instances, mess around with ECS tasks and services, or inject failures into lambdas.

The AWS extension is, in essence, an adapter for the AWS APIs.

To set up the extension and the needed IAM permissions, please consult the steadybit/extension-aws/README.md

Further Support for Managed Services

While the AWS extension provides integrations to managed services via AWS APIs, we also offer deeper integration for the following services based on the underlying technology.

Elastic Kubernetes Service (EKS)

Benefit in AWS EKS from the same integration we offer for unmanaged Kubernetes clusters by installing the following extensions in your Kubernetes cluster: extension-kubernetes, extension-container, extension-host.

Elastic Compute Cloud (EC2)

When using Linux-based hosts in EC2, you can also benefit from extension-host's capabilities.

Provided Target Discovery

See all

Application Load Balancers

AWS Lambdas

EC2-instances

ECS Services

ECS Tasks

Elasticaches

FIS experiment templates

MSK Brokers

RDS clusters

RDS instances

Subnets

Zones

Provided Actions

See all

Useful Templates (4 of 5)

See all

Load balancer covers an AWS EC2 restart

EC2 is part of the AWS Elastic Compute Cloud, which acquires and releases resources depending on the traffic demand. Check whether your application is elastic as well by rebooting an EC2 instance.

Motivation

Depending on your traffic demand, you can use AWS cloud's ability to acquire and release resources automatically. Some services, such as S3 and SQS, do that automatically, while others, such as EC2, integrate with AWS Auto Scaling. Once configured, it boils down to fluctuating EC2 instances starting or shutting down frequently. Even when not using AWS Autoscaling, your EC2 instances may need to be restarted occasionally for maintenance and updating purposes. Thus, it is best practice to validate your application's behavior.

Structure

We ensure that a load-balanced user-facing endpoint fully works while having all EC2 instances available. While restarting an EC2 instance, the HTTP endpoint continues operating but may suffer from degraded performance (e.g., lower success rate or higher response time). The performance should recover to a 100% success rate once all EC2 instances are back.

Solution Sketch

AWS Well-Architected Framework
Kubernetes liveness, readiness, and startup probes

Scalability

Redundancy

Elasticity

AWS

EC2-instances

New Relic detects an incident for CPU spikes in an ECS task

Validate your observability to detect a CPU spike in your AWS ECS cluster

Motivation

When you have New Relic configured to detect CPU spikes in your AWS ECS cluster, you can easily validate your observability strategy with this experiment template.

Structure

First, we validate whether New Relic has no ongoing incident. After that, we inject the CPU spike for an ECS service and expected that New Relic detect this as an incident within the given time frame of 3 minutes.

New Relic

AWS ECS

CPU

ECS Tasks

New Relic Accounts

AWS ECS Service Is Scaled up Within Reasonable Time

Verify that your ECS service is scaled up on increased CPU usage.

Motivation

Important ECS services should be scaled up within a reasonable time for an elastic and resilient cloud infrastructure. Undetected high CPU spikes and long startup times are undesirable in these infrastructures.

Structure

First, we ensure that all ECS service's tasks are ready to serve traffic. Afterward, we inject high CPU usage into the ECS task and expect that within a reasonable amount of time, ECS increases the number of ECS tasks and they become ready to handle incoming traffic.

Scalability

CPU

AWS ECS

AWS

ECS Services

ECS Tasks

Scaling up of ECS Service Within Given Time

Ensure that you can scale up your ECS service in a reasonable time.

Motivation

For an elastic and resilient cloud infrastructure, ensure you can scale up your ECS services within a reasonable time. Long startup times are undesirable but sometimes unnoticed and unexpected.

Structure

Validate that all ECS tasks of an ECS service are running. Once we scale the ECS service up, the newly scheduled task should be ready within a reasonable time.

Scalability

Elasticity

AWS ECS

AWS

ECS Services

Start Using Steadybit Today

Get started with Steadybit, and you’ll get access to all of our features to discover the full power of Steadybit. Available for SaaS and on-prem!

Are you unsure where to begin?

No worries, our reliability experts are here to help: book a demo with them!

Statistics

-Stars

AWS

AWS

AWS

AWS

YouTube content is not loaded by default for privacy reasons.

Further Support for Managed Services

Elastic Kubernetes Service (EKS)

Elastic Compute Cloud (EC2)

Application Load Balancers

AWS Lambdas

EC2-instances

ECS Services

ECS Tasks

Elasticaches

FIS experiment templates

MSK Brokers

RDS clusters

RDS instances

Subnets

Zones

AWS FIS Experiment

Blackhole Subnet

Blackhole Zone

Block DNS

Block TCP Connections

Block Traffic

change EC2 instance state

Delay Outgoing Traffic

Drop Outgoing Traffic

Fill Disk

Fill Diskspace

Inject Exception

Inject Latency

Inject Status Code

reboot RDS instance

Return static response

Scale Service

Service Event Log

Service Task Count

Stop Process

Stop Task

Stress CPU

Stress IO

Stress Memory

Trigger DB Cluster Failover

Trigger DB Instance Stop

Trigger Elasticache Node Group Failover

Trigger MSK Broker Reboot

Load balancer covers an AWS EC2 restart

Motivation

Structure

Solution Sketch

New Relic detects an incident for CPU spikes in an ECS task

Motivation

Structure

AWS ECS Service Is Scaled up Within Reasonable Time

Motivation

Structure

Scaling up of ECS Service Within Given Time

Motivation

Structure