Blackhole Zone
Attack
Zones
Blackhole Zone
Attack
Zones
Load balancer covers an AWS zone outage
AWS achieves high availability via redundancy across different Availability Zones. Ensure that failover works seamlessly by simulating Zone outages.
Motivation
AWS hosts your deployments and services across multiple locations worldwide. From a reliability standpoint, AWS regions and Availability Zones are most interesting. While the former refers to separate geographic areas spread worldwide, the latter refers to an isolated location within a region. For most use cases, applying deployments across AWS availability zone is sufficient. Given that failures may happen at this level quite frequently, you should verify that your applications are still working in case of an outage.
Structure
We leverage the AWS blackhole attack to simulate an AWS availability zone outage. Before the simulated outage, we ensure that a load-balanced user-facing endpoint works appropriately. During an AWS availability zone's unavailability, the HTTP endpoint must continue operating but may suffer from degraded performance (e.g., lower success rate or higher response time). The performance should recover as soon as the zone is back again.
Solution Sketch
- Regions and Zones
- Kubernetes liveness, readiness, and startup probes
Zones