Multi-cloud strategies distribute workloads across two or more cloud providers, usually for resilience, regulatory compliance, or pricing leverage in vendor negotiations. The cost is significant complexity: each cloud has its own IAM model, networking primitives, and managed service APIs. Key cost traps include egress fees (~$0.08–$0.09/GB between clouds), data gravity (large datasets are expensive to move, so applications and services accumulate around wherever the data already lives), and operational overhead (two sets of certifications, tooling, and runbooks). True workload portability requires abstracting cloud-specific primitives, typically with Kubernetes for compute and Terraform for provisioning, although Terraform resource definitions remain provider-specific.
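
To make the egress figure concrete, the arithmetic can be sketched in a few lines of Python. This is a minimal sketch, not a pricing tool: it assumes a flat per-GB rate (real provider pricing is tiered and changes over time) and decimal units (1 PB = 1,000,000 GB). The function name `egress_cost_usd` is hypothetical.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 1,000,000 GB


def egress_cost_usd(data_gb: float, rate_per_gb: float) -> float:
    """Flat-rate egress cost: data volume times the per-GB price.

    Assumption: a single flat rate; real bills use volume tiers.
    """
    if data_gb < 0 or rate_per_gb < 0:
        raise ValueError("volume and rate must be non-negative")
    return data_gb * rate_per_gb


# Cost of moving 1 PB across clouds at both ends of the quoted range.
low = egress_cost_usd(GB_PER_PB, 0.08)
high = egress_cost_usd(GB_PER_PB, 0.09)
print(f"1 PB cross-cloud egress: ${low:,.0f} to ${high:,.0f}")
```

At the quoted rates, a one-time 1 PB transfer lands between roughly $80,000 and $90,000, and any recurring cross-cloud replication repeats that cost on every cycle.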

Key Points

  • Data gravity: moving 1 PB between AWS and Azure at $0.09/GB costs $90,000 in egress alone — data architecture decisions have long-term cost lock-in implications.
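  • Multi-cloud networking: Megaport and Equinix Fabric provide private cross-cloud connectivity with predictable latency by bridging each provider's dedicated links (AWS Direct Connect, Azure ExpressRoute, Google Cloud Interconnect); AWS Cloud WAN and Azure Virtual WAN then manage routing within their own cloud.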
  • Kubernetes enables multi-cloud compute portability — the same Deployment YAML runs on EKS, AKS, or GKE, but networking, storage classes, and load balancer annotations differ.
  • Cloud-agnostic managed services (Confluent, Databricks, MongoDB Atlas, Snowflake) reduce lock-in by abstracting underlying cloud infrastructure — at a premium over native services.
  • Distributed Cloud (AWS Outposts, Azure Arc, GCP Anthos/Distributed Cloud) extends the primary cloud's control plane to on-premises or edge — often a better trade-off than true multi-cloud.
  • Governance challenge: multi-cloud requires unified identity (OIDC federation), unified observability (OpenTelemetry → Grafana/Datadog), and unified policy (OPA) — each cloud's native tools do not cross boundaries.
  • Latency between cloud regions over the public internet is typically 20–100 ms, which is too slow for synchronous database calls on a request path; design multi-cloud workloads to be eventually consistent or region-isolated.
  • Most enterprises use multi-cloud opportunistically (different clouds for different workloads/acquisitions) rather than actively (same workload split across clouds) — the latter is rare and operationally expensive.
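
The Kubernetes portability point above can be illustrated with a short Python sketch: one provider-neutral Service definition, plus a small per-platform overlay for the load balancer annotation that differs. The service name `web`, the ports, and the helper `render_service` are hypothetical. The annotation keys are the ones documented for requesting an internal load balancer on each platform as far as I know, but they vary by controller and cluster version, so treat them as assumptions to verify against the provider documentation.

```python
import copy
import json

# Provider-neutral Service definition shared by every cluster.
# Assumption: the name, selector, and ports are illustrative only.
BASE_SERVICE = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web", "annotations": {}},
    "spec": {
        "type": "LoadBalancer",
        "selector": {"app": "web"},
        "ports": [{"port": 80, "targetPort": 8080}],
    },
}

# Per-platform annotation requesting an internal load balancer.
# Assumption: keys may differ by controller and version; verify before use.
INTERNAL_LB_ANNOTATIONS = {
    "eks": {"service.beta.kubernetes.io/aws-load-balancer-internal": "true"},
    "aks": {"service.beta.kubernetes.io/azure-load-balancer-internal": "true"},
    "gke": {"networking.gke.io/load-balancer-type": "Internal"},
}


def render_service(platform: str) -> dict:
    """Return a copy of the base Service with one platform's annotations."""
    if platform not in INTERNAL_LB_ANNOTATIONS:
        raise ValueError(f"unsupported platform: {platform}")
    service = copy.deepcopy(BASE_SERVICE)  # leave the shared base untouched
    service["metadata"]["annotations"].update(INTERNAL_LB_ANNOTATIONS[platform])
    return service


# Show only the part that differs between platforms.
for platform in INTERNAL_LB_ANNOTATIONS:
    annotations = render_service(platform)["metadata"]["annotations"]
    print(platform, json.dumps(annotations))
```

The `spec` is identical on all three platforms; only the annotations change. In practice the same split is usually expressed with Kustomize overlays or Helm values rather than hand-written Python, and storage classes need a similar per-platform layer.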

Real-World Example

Goldman Sachs runs a dual-cloud strategy using AWS for its consumer banking platform (Marcus) and GCP for its Marquee analytics platform, unified by Terraform for IaC, Kubernetes for compute, and Datadog for observability.