Storage
Object (S3/Blob), block (EBS), file (EFS/Azure Files) storage
Cloud storage is categorised by access pattern: object storage (S3, Azure Blob, GCS) for unstructured data at virtually unlimited scale; block storage (EBS, Azure Managed Disks, GCP Persistent Disk) as virtual disks attached to VMs; and file storage (EFS, Azure Files, Filestore) for shared filesystems mounted over NFS or SMB. Choosing the wrong type wastes money: S3 Standard costs ~$0.023/GB/month versus ~$0.10/GB/month for EBS (gp2), a difference of more than 4× that compounds at petabyte scale.
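To make the cost gap concrete, here is a minimal sketch that prices one petabyte at the two rates quoted above. The prices are illustrative list prices only; the decimal petabyte and the helper names `monthly_cost` and `cost_ratio` are assumptions made for this example, and request, retrieval, and transfer fees are ignored.

```python
# Illustrative list prices in USD per GB-month, taken from the figures above.
# Real prices vary by region, tier, and volume discounts.
PRICE_PER_GB_MONTH = {
    "object_s3_standard": 0.023,
    "block_ebs": 0.10,
}

GB_PER_PB = 1_000_000  # decimal petabyte, assumed here for round numbers


def monthly_cost(size_gb: float, price_per_gb: float) -> float:
    """Return the monthly storage cost in USD (storage charge only)."""
    return size_gb * price_per_gb


def cost_ratio(expensive: str, cheap: str) -> float:
    """Return how many times more one storage type costs per GB than another."""
    return PRICE_PER_GB_MONTH[expensive] / PRICE_PER_GB_MONTH[cheap]


if __name__ == "__main__":
    size_gb = 1 * GB_PER_PB
    for name, price in PRICE_PER_GB_MONTH.items():
        print(f"{name}: ${monthly_cost(size_gb, price):,.0f}/month")
    print(f"ratio: {cost_ratio('block_ebs', 'object_s3_standard'):.1f}x")
```

At these rates, one petabyte costs about $23,000 per month on object storage and about $100,000 per month on block storage.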
Key Points
- S3 storage classes (per GB-month): Standard (~$0.023), Intelligent-Tiering (automatically moves objects between frequent and infrequent access tiers), Glacier Instant Retrieval (~$0.004), Glacier Deep Archive (~$0.00099). Choose based on the retrieval SLA you need.
- EBS volume types: gp3 (general purpose, 3,000 baseline IOPS, ~$0.08/GB-month) and io2 Block Express (up to 256,000 IOPS, ~$0.125/GB-month plus provisioned-IOPS charges). Use io2 for latency-sensitive OLTP databases.
- EFS (Elastic File System) provides NFS v4.0/4.1 shared access across multiple EC2 instances, and throughput scales automatically. It is useful for shared application state and CMS assets.
- S3 strong consistency (since December 2020) guarantees read-after-write consistency for all operations including overwrite PUTs and DELETEs.
- Azure Blob access tiers: Hot, Cool, Cold, Archive. Moving from Hot to Archive cuts the storage charge by roughly 90-95% (region-dependent), but Archive data must be rehydrated before it can be read, which takes hours and incurs retrieval fees.
- Block storage IOPS limits are a common database bottleneck: EBS gp3 caps at 16,000 IOPS; io2 supports up to 256,000 IOPS with multi-attach for clustered databases.
- Object storage lifecycle rules automate tiering/deletion — essential for log retention compliance (e.g., move CloudTrail logs to Glacier after 90 days, delete after 7 years).
- S3 multipart upload is mandatory for objects >5 GB and recommended above 100 MB to enable parallel upload and resume on failure.
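The lifecycle example above (archive logs after 90 days, delete after 7 years) could be expressed roughly as follows. This is a sketch, not a compliance-ready policy: the `AWSLogs/` prefix, the rule ID, and the helper names are hypothetical, and 7 years is approximated as 7 × 365 days. The dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`.

```python
import json


def build_log_retention_rule(prefix: str,
                             archive_after_days: int = 90,
                             retain_years: int = 7) -> dict:
    """Build an S3 lifecycle configuration: transition to Glacier, then expire."""
    return {
        "Rules": [
            {
                "ID": "archive-then-expire-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                # Assumption: leap days are ignored, so 7 years = 2555 days.
                "Expiration": {"Days": retain_years * 365},
            }
        ]
    }


def apply_rule(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder runs without the AWS SDK

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


if __name__ == "__main__":
    # "AWSLogs/" is an assumed prefix for CloudTrail log objects.
    print(json.dumps(build_log_retention_rule("AWSLogs/"), indent=2))
```

Keeping the rule builder separate from the API call lets the policy be reviewed or unit-tested before it touches a real bucket.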
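The multipart guidance above can be turned into a small planning function. The sketch below assumes the 100 MB recommendation is applied as a 100 MiB threshold and relies on two documented S3 limits: a 5 MiB minimum part size (except for the last part) and at most 10,000 parts per upload. The function name `plan_upload` and the 64 MiB default part size are choices made for illustration.

```python
MIB = 1024 ** 2

MIN_PART_SIZE = 5 * MIB          # S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts per upload
MULTIPART_THRESHOLD = 100 * MIB  # recommendation from the text, as MiB


def plan_upload(object_size: int,
                preferred_part_size: int = 64 * MIB) -> tuple[int, int]:
    """Return (part_size, part_count) in bytes for an object of object_size bytes.

    A part_count of 1 means a single PUT is sufficient.
    """
    if object_size <= MULTIPART_THRESHOLD:
        return object_size, 1

    # Smallest part size that keeps the upload within MAX_PARTS (integer ceil).
    size_for_part_limit = -(-object_size // MAX_PARTS)
    part_size = max(preferred_part_size, MIN_PART_SIZE, size_for_part_limit)
    part_count = -(-object_size // part_size)
    return part_size, part_count


if __name__ == "__main__":
    for size in (50 * MIB, 1024 * MIB, 1024 ** 4):
        part_size, count = plan_upload(size)
        print(f"{size} bytes -> {count} part(s) of up to {part_size} bytes")
```

In practice, SDK helpers such as boto3's `upload_file` with a `TransferConfig` handle this splitting automatically. The sketch only shows why very large objects force larger parts.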
Real-World Example
Dropbox originally stored user file content on S3. By 2016 it had migrated roughly 90% of that data, about 500 petabytes, to its in-house Magic Pocket storage system, which shows how strongly per-GB storage cost drives architecture decisions at that scale.