Don’t Let the AWS Outage Erode Your Trust in the Cloud

Resilience — not panic or repatriation — should guide your next move after cloud disruptions. 

The recent AWS outage on October 20, 2025, was a wake-up call, but not a reason to abandon ship

At roughly 3 a.m. ET, the Northern Virginia (US-East-1) region was hit by a DNS issue that rippled across core services like DynamoDB. For many organizations, this meant hours of degraded performance or outright downtime.

But here’s what matters most: This wasn’t an unprecedented event or evidence that “the cloud” is inherently unreliable. In fact, it followed a familiar pattern we’ve seen from all hyperscale cloud providers over the last decade: a regional event lasting less than a day, driven by a network dependency, but not affecting running compute instances themselves.

Moving workloads out of AWS (repatriation) or to smaller sovereign clouds (geopatriation) won’t magically shield you from future outages. In fact, these moves often introduce new risks and may even slow down your recovery when things do go wrong.

Resist knee-jerk reactions. Building true resilience requires architectural discipline

No cloud provider can promise zero downtime. What sets IT leaders apart isn’t whether they avoid incidents altogether; it’s how they prepare and respond.

Design for failure (because it will happen)

Modern cloud-native apps should distribute workloads across multiple availability zones and be ready to fail over quickly to another region when needed. It’s not about eliminating risk; it’s about reducing blast radius and recovery time. That means understanding service dependencies (like databases or DNS), maintaining up-to-date runbooks and practicing failover drills before disaster strikes.
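To make the failover idea concrete, here is a minimal sketch of priority-ordered region selection. It is an illustration under stated assumptions, not a production pattern: the `pick_region` helper, the region list and the simulated health probe are all hypothetical, and a real deployment would typically rely on managed health checks and traffic routing rather than application code like this.

```python
from typing import Callable, Iterable, Optional


def pick_region(regions: Iterable[str],
                is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first region, in priority order, whose health probe passes."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except Exception:
            # A probe that raises (for example, a DNS resolution failure)
            # is treated the same as an unhealthy region.
            continue
    return None  # No healthy region: time to open the runbook.


# Hypothetical priority list: primary region first, failover target second.
REGIONS = ["us-east-1", "us-west-2"]


def simulated_probe(region: str) -> bool:
    # Stand-in for a real dependency check; the primary is simulated as down.
    if region == "us-east-1":
        raise ConnectionError("simulated DNS failure")
    return True


active = pick_region(REGIONS, simulated_probe)
print(active)  # prints "us-west-2"
```

The point of the sketch is that the failover decision is simple once dependencies are probed explicitly; the hard work is keeping the secondary region ready and rehearsing the switch in drills.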

Legacy apps need extra attention

If you’re still running critical legacy workloads in the cloud, don’t assume resilience comes “out of the box.” Make sure backups are available in secondary regions and that you can actually restore them under pressure. Test disaster recovery regularly so your team knows exactly what to do if their primary region goes dark.
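One way to turn "make sure backups are available" into a routine check is to compare the age of each workload's newest secondary-region backup against a recovery point objective (RPO). The sketch below assumes a hypothetical inventory of backup timestamps and a 24-hour RPO; neither comes from the incident itself, and in practice the timestamps would come from your backup tooling.

```python
from datetime import datetime, timedelta, timezone


def stale_backups(last_backup_at: dict[str, datetime],
                  rpo: timedelta,
                  now: datetime) -> list[str]:
    """Return workloads whose newest secondary-region backup is older than the RPO."""
    return sorted(
        name for name, taken in last_backup_at.items() if now - taken > rpo
    )


# Hypothetical inventory: workload name -> time of latest backup copied
# to the secondary region.
now = datetime(2025, 10, 20, 7, 0, tzinfo=timezone.utc)
inventory = {
    "billing-db": now - timedelta(hours=2),
    "legacy-erp": now - timedelta(hours=30),
}

print(stale_backups(inventory, rpo=timedelta(hours=24), now=now))
# prints ['legacy-erp']
```

A freshness check like this only shows that a recent copy exists. It does not prove the copy can be restored, which is why regular restore tests remain necessary.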

Transparency builds trust

AWS has been open about its global dependencies and has worked since its December 2021 outage to reduce single points of failure. The October 2025 incident was completely confined to US-East-1, showing progress in fault isolation. This transparency gives CIOs actionable data for risk management instead of leaving them guessing.

Avoid multicloud complexity unless your regulators demand it

It’s tempting to think multicloud is the answer. But Gartner research shows that pursuing multicloud resilience can cost more than it saves, introducing technical complexity without truly eliminating systemic risk. 

Maximize single-cloud resilience first

For most organizations, even those facing strict regulations like the Digital Operational Resilience Act (DORA), investing in robust architectures within one cloud delivers better uptime and simpler operations than trying to juggle multiple providers with different APIs and processes.

Focus on business continuity through substitutability

If your business absolutely cannot tolerate downtime for certain functions, consider application substitutability — having alternative platforms or manual workarounds ready to go if your primary system fails. This pragmatic approach often satisfies both business needs and regulatory scrutiny without ballooning IT budgets.

Don’t let headlines drive strategy. Let data do the talking

Cloud outages make headlines because they affect so many people at once, but context matters. Every major provider has experienced similar events, from Microsoft Azure to Google Cloud Platform. The real differentiator is how well your organization plans for and recovers from inevitable disruption.

The bottom line is clear: Public cloud remains the best option for scalable infrastructure if you invest in resilience upfront — or correct existing deployments if necessary. Don’t let fear steer you toward costly or ineffective alternatives; instead, double down on architecture, process discipline and transparent partnerships with your providers.
