Deployment strategies define how new application versions replace old ones in production, and each one trades off risk, downtime, rollback speed, and infrastructure cost differently. Blue-green and canary deployments are widely used for zero-downtime releases at companies such as Amazon and Netflix. The right strategy depends on SLA requirements, infrastructure budget, and traffic characteristics. Feature flags complement every strategy by decoupling code deployment from feature activation.

Key Points

  • Blue-Green: maintain two identical environments; switch load balancer from blue (current) to green (new) — instant rollback by reverting the switch.
  • Canary: route a small percentage (1–5%) of traffic to the new version, monitor error rates and latency, then progressively increase the share. Argo Rollouts (a Kubernetes controller) and AWS CodeDeploy both support this natively.
  • Rolling: replace instances incrementally (e.g., 25% at a time); both old and new versions run simultaneously — requires backward-compatible APIs.
  • Recreate: terminate all old instances, then start new ones — simplest strategy but incurs downtime; suitable for non-production or batch jobs.
  • Shadow/Dark Launch: duplicate production traffic to the new version and discard its responses instead of returning them to users. This validates behavior under real load before cutover.
  • Feature flags (LaunchDarkly, Flagsmith) decouple deployment from release — code ships dark, features toggle on independently per user segment.
  • Rollback triggers: automated rollback fires when the error rate exceeds a threshold (e.g., >1% 5xx responses) or p99 latency exceeds its SLO target.
  • Database migrations must be backward-compatible during rolling/canary deployments — expand-contract pattern avoids schema-breaking changes.
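A percentage rollout behind a feature flag, as described in the feature-flag point above, can be sketched with deterministic hashing. This is a minimal illustration only and is not the LaunchDarkly or Flagsmith API; the function names bucket and is_enabled, the flag name, and the user IDs are all hypothetical.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """Return True if the user falls inside the rollout percentage.

    The same user always lands in the same bucket, so raising the
    percentage only ever adds users; nobody is switched back off.
    """
    return bucket(flag, user_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical flag and users: the code is already deployed, and the
    # percentage alone decides who sees the feature.
    for user in ("user-1", "user-2", "user-3"):
        print(user, is_enabled("new-checkout", user, rollout_percent=10))
```

Hashing the flag name together with the user ID keeps rollouts of different flags independent, so the same users are not always the first to receive every new feature.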

Strategy    Downtime  Risk Level             Rollback Ease                     Infrastructure Cost
Recreate    Minutes   High                   Slow (redeploy old version)       Low (single environment)
Rolling     Zero      Medium                 Medium (roll back incrementally)  Low (replaces in-place)
Blue-Green  Zero      Low                    Instant (flip load balancer)      High (2x infrastructure)
Canary      Zero      Very Low               Fast (reduce canary to 0%)        Medium (small canary fleet)
Shadow      Zero      None (no user impact)  N/A (no user-facing traffic)      High (duplicate traffic)
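
The canary row and the rollback triggers from the key points can be combined into a small control loop. The sketch below is one possible shape, not how Argo Rollouts or AWS CodeDeploy implement it; the Metrics type, the read_metrics callback, and the 500 ms p99 target are assumptions chosen for illustration, while the 1% error threshold comes from the example above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Metrics:
    """Health snapshot for the canary fleet (hypothetical shape)."""
    error_rate: float      # fraction of requests returning 5xx, e.g. 0.004
    p99_latency_ms: float  # 99th percentile latency in milliseconds


def should_roll_back(m: Metrics, max_error_rate: float = 0.01,
                     p99_slo_ms: float = 500.0) -> bool:
    """Return True when either rollback trigger fires."""
    return m.error_rate > max_error_rate or m.p99_latency_ms > p99_slo_ms


def run_canary(steps: List[int],
               read_metrics: Callable[[int], Metrics]) -> int:
    """Walk through traffic percentages and return the final canary weight.

    Returns the last step reached if every check passes, or 0 if a
    trigger forced a rollback.
    """
    current = 0
    for percent in steps:
        current = percent
        # A real system would shift traffic here and wait for a bake
        # period before reading metrics; this sketch reads them directly.
        if should_roll_back(read_metrics(percent)):
            return 0  # rollback: all traffic returns to the old version
    return current


if __name__ == "__main__":
    def healthy(percent: int) -> Metrics:
        return Metrics(error_rate=0.002, p99_latency_ms=310.0)

    print(run_canary([1, 5, 25, 50, 100], healthy))  # prints 100
```

Returning 0 on failure mirrors the "reduce canary to 0%" rollback in the table: the old version never stopped serving, so rollback is only a weight change.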

Real-World Example

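On AWS, canary releases are typically built on weighted traffic shifting. A Route 53 weighted routing policy can split DNS traffic between two environments, and AWS CodeDeploy can shift a small share of Lambda or ECS traffic (for example, 1%) to the new version. CloudWatch alarms attached to the deployment can be configured to trigger an automatic rollback if the error rate spikes during the bake window (for example, the first 5 minutes).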
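
The 1% weighted split in this example can be simulated in-process to see how little traffic a canary actually receives. The sketch below does not call Route 53 or any AWS API; pick_version, the version names, and the request count are hypothetical, and weights are treated as relative integers in the way weighted routing policies generally are.

```python
import random
from collections import Counter
from typing import Dict


def pick_version(weights: Dict[str, int], rng: random.Random) -> str:
    """Choose a version with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def simulate(weights: Dict[str, int], requests: int, seed: int = 42) -> Counter:
    """Route a number of simulated requests and count hits per version."""
    rng = random.Random(seed)  # seeded so repeated runs give the same counts
    return Counter(pick_version(weights, rng) for _ in range(requests))


if __name__ == "__main__":
    # 99:1 split between the stable version and the canary.
    counts = simulate({"stable": 99, "canary": 1}, requests=100_000)
    print(counts)
```

At a 1% weight the canary sees roughly one request in a hundred, so a low-traffic service may need a longer bake window before its error rate is statistically meaningful.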