Amazon SageMaker
Managed ML platform: Studio, training jobs, endpoints, pipelines, Feature Store
Amazon SageMaker is AWS's fully managed ML platform covering the entire machine learning lifecycle: data preparation, model training, evaluation, deployment, and monitoring. It abstracts infrastructure management so that data scientists can focus on model development.
Key Points
- SageMaker Studio: IDE for the entire ML workflow (notebooks, experiments, pipelines, models)
- Data Wrangler: visual data preparation and feature engineering — connects to S3, Redshift, Athena
- Feature Store: centralised repository of ML features for training and real-time inference
- Training Jobs: managed distributed training on EC2 (including GPU instances); supports Managed Spot Training to reduce cost
- Built-in Algorithms: 18+ optimised algorithms (XGBoost, Linear Learner, Object Detection, etc.)
- Automatic Model Tuning (Hyperparameter Optimisation): Bayesian search over the hyperparameter space by default; random and grid search are also available
- Autopilot: AutoML — automatically tries different algorithms and features; provides explainability
- Endpoints: real-time inference via HTTPS; supports A/B testing (weighted production variants) and auto-scaling
- Batch Transform: run inference on large datasets offline
- Pipelines: CI/CD for ML — orchestrate data prep → training → evaluation → deployment
- Model Monitor: detect data drift, model quality issues, and bias in production
- Clarify: bias detection and explainability (SHAP values) for fairness reporting
- JumpStart: pre-trained models and fine-tuning templates (Stable Diffusion, LLaMA, etc.)
- SageMaker Canvas: no-code ML for business analysts
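The A/B testing listed under Endpoints is configured through weighted production variants: each variant receives a share of traffic equal to its weight divided by the sum of all weights. The Python sketch below shows only that arithmetic. The dictionary keys mirror the `ProductionVariants` entries accepted by boto3's `create_endpoint_config`; the model names, the default instance type, and the helpers `make_variant` and `traffic_shares` are hypothetical, chosen for illustration.

```python
def make_variant(name, model_name, weight, instance_type="ml.m5.large", count=1):
    """Build one entry for the ProductionVariants list of an endpoint config."""
    return {
        "VariantName": name,
        "ModelName": model_name,          # hypothetical model name
        "InstanceType": instance_type,    # assumed instance type
        "InitialInstanceCount": count,
        "InitialVariantWeight": weight,
    }


def traffic_shares(variants):
    """Return the fraction of requests each variant is expected to receive."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# A 90/10 split between the current model and a candidate.
variants = [
    make_variant("champion", "churn-model-v1", weight=9.0),
    make_variant("challenger", "churn-model-v2", weight=1.0),
]
shares = traffic_shares(variants)
print(shares)
```

With AWS credentials in place, a list shaped like `variants` could be passed as `ProductionVariants=variants` when creating an endpoint configuration; that call is left out here so the sketch runs without an AWS account.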
| SageMaker Tool | Purpose | Who Uses It |
|---|---|---|
| Studio | Unified ML IDE | All ML practitioners |
| Data Wrangler | Visual data prep | Data scientists |
| Feature Store | Centralised features | ML engineering teams |
| Training Jobs | Managed training at scale | Data scientists |
| Autopilot | AutoML | Analysts, citizen developers |
| Endpoints | Model serving | ML engineers |
| Pipelines | ML CI/CD | MLOps engineers |
| Model Monitor | Production monitoring | MLOps engineers |
| Canvas | No-code ML | Business analysts |
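Model Monitor's data-drift detection rests on comparing production data against a baseline derived from training data. The sketch below illustrates that idea with the Population Stability Index (PSI), a common drift statistic. PSI is a stand-in chosen for illustration, not the computation Model Monitor performs internally; the function name, the bin count, and the 0.2 alert threshold are assumptions.

```python
import math


def psi(baseline, current, bins=5):
    """Population Stability Index between two numeric samples."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor keeps the logarithm defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


baseline = [i / 100 for i in range(100)]   # training-time feature values
shifted = [x + 0.5 for x in baseline]      # production values after a shift

DRIFT_THRESHOLD = 0.2  # assumed rule-of-thumb alert level
print("no drift:", psi(baseline, baseline) > DRIFT_THRESHOLD)
print("drift:   ", psi(baseline, shifted) > DRIFT_THRESHOLD)
```

Identical samples give a PSI of zero, while the shifted sample lands well above the assumed threshold, which is the kind of signal a monitoring schedule would raise as an alert.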
Real-World Example
Intuit uses SageMaker to train and deploy hundreds of ML models across TurboTax, QuickBooks, and Mint — including models for expense categorisation, refund predictions, and customer support routing. SageMaker Pipelines automates the full lifecycle from data ingestion to model deployment.
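The data prep → training → evaluation → deployment flow that Pipelines orchestrates can be pictured as a chain of steps with a quality gate before deployment. The following plain-Python sketch shows that control flow only. It does not use the SageMaker SDK, and every function name, the toy threshold "model", and the 0.8 accuracy gate are hypothetical.

```python
def prepare(raw):
    """Data prep: drop records with missing labels."""
    return [r for r in raw if r["label"] is not None]


def train(rows):
    """Training: a toy 'model' that learns a threshold on one feature."""
    positives = [r["x"] for r in rows if r["label"] == 1]
    negatives = [r["x"] for r in rows if r["label"] == 0]
    return {"threshold": (min(positives) + max(negatives)) / 2}


def evaluate(model, rows):
    """Evaluation: accuracy of the threshold rule.

    For brevity this scores the training rows; a real pipeline would
    evaluate on held-out data.
    """
    hits = sum((r["x"] > model["threshold"]) == bool(r["label"]) for r in rows)
    return hits / len(rows)


def run_pipeline(raw, min_accuracy=0.8):
    """Run the steps in order; mark for deployment only if the gate passes."""
    rows = prepare(raw)
    model = train(rows)
    accuracy = evaluate(model, rows)
    return {"model": model, "accuracy": accuracy, "deployed": accuracy >= min_accuracy}


raw = [
    {"x": 1.0, "label": 0},
    {"x": 2.0, "label": 0},
    {"x": 5.0, "label": 1},
    {"x": 6.0, "label": 1},
    {"x": 3.0, "label": None},  # removed by the data prep step
]
result = run_pipeline(raw)
print(result)
```

In SageMaker Pipelines a gate of this kind is typically expressed as a condition step that checks an evaluation metric before the model is registered or deployed.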