A production ML pipeline is far more than just model training. It includes data ingestion, validation, feature engineering, training, evaluation, deployment, and monitoring — all automated, versioned, and reproducible. MLOps is the discipline that operationalises this lifecycle.
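To make the lifecycle above concrete, here is a minimal sketch of a pipeline as an ordered list of stages that share a context dictionary. Everything in it is hypothetical and for illustration only: the stage functions, the toy in-memory dataset, the threshold "model", and the 0.9 accuracy gate are assumptions, not part of any particular framework. For brevity the toy evaluates on its own training rows, which a real pipeline would not do.

```python
from statistics import mean
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]
Stage = Tuple[str, Callable[[Context], Context]]


def ingest(ctx: Context) -> Context:
    # Hypothetical data source: (feature, label) rows, one with a missing feature.
    ctx["rows"] = [(0.0, 0), (1.0, 0), (2.0, 1), (3.0, 1), (None, 1)]
    return ctx


def validate(ctx: Context) -> Context:
    # Drop rows with missing features; fail fast if too much data is lost.
    clean = [row for row in ctx["rows"] if row[0] is not None]
    if len(clean) < 0.5 * len(ctx["rows"]):
        raise ValueError("validation failed: too many invalid rows")
    ctx["rows"] = clean
    return ctx


def train(ctx: Context) -> Context:
    # Toy "model": a threshold halfway between the two class means.
    negatives = [x for x, y in ctx["rows"] if y == 0]
    positives = [x for x, y in ctx["rows"] if y == 1]
    ctx["threshold"] = (mean(negatives) + mean(positives)) / 2
    return ctx


def evaluate(ctx: Context) -> Context:
    correct = sum((x > ctx["threshold"]) == bool(y) for x, y in ctx["rows"])
    ctx["accuracy"] = correct / len(ctx["rows"])
    return ctx


def deploy(ctx: Context) -> Context:
    # Evaluation gate: refuse to deploy a model below the assumed 0.9 bar.
    if ctx["accuracy"] < 0.9:
        raise RuntimeError("evaluation gate failed; keeping the current model")
    ctx["deployed"] = True
    return ctx


STAGES: List[Stage] = [
    ("ingest", ingest),
    ("validate", validate),
    ("train", train),
    ("evaluate", evaluate),
    ("deploy", deploy),
]


def run_pipeline(stages: List[Stage]) -> Context:
    """Run each stage in order, recording which stages completed."""
    ctx: Context = {"log": []}
    for name, stage in stages:
        ctx = stage(ctx)
        ctx["log"].append(name)
    return ctx


result = run_pipeline(STAGES)
```

Because any stage can raise, a failed validation or evaluation step stops the run before deployment, which is the behaviour an automated pipeline needs.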

Key Points

  • MLOps = DevOps principles applied to ML: automation, versioning, CI/CD, monitoring
  • Data Versioning: track changes to datasets (DVC, Delta Lake); treat training data as a versioned artefact, just like code
  • Feature Store: centralised repository of curated features shared across models (Feast, Hopsworks)
  • Experiment Tracking: log hyperparameters, metrics, and artefacts for every run (MLflow, W&B)
  • Model Registry: store, version, and stage models (staging → production) with metadata
  • CI/CD for ML: automated pipeline from data validation → training → evaluation → deployment
  • Model Serving: REST endpoints (FastAPI, TF Serving), batch scoring, streaming inference
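  • Monitoring: detect data drift (input distributions shift), concept drift (the relationship between inputs and target changes), and performance decay
  • A/B Testing & Shadow Mode: compare a new model against the current one on real production traffic; in shadow mode the new model's predictions are logged but not served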
  • Retraining Triggers: scheduled, event-driven (drift detected), or performance threshold breach
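
The monitoring and retraining points above can be sketched together. The example below computes the Population Stability Index (PSI), one common data drift score, for a single numeric feature and feeds it into a retraining decision. This is a minimal sketch under stated assumptions: the function names, the equal-width binning, the 0.2 PSI threshold (a widely quoted rule of thumb), the 0.9 accuracy floor, and the 30-day schedule are all illustrative choices, not values from any specific tool.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], current: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor each proportion so empty bins do not cause log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def should_retrain(
    psi_value: float,
    accuracy: float,
    days_since_training: int,
    *,
    psi_threshold: float = 0.2,  # assumed rule-of-thumb drift threshold
    min_accuracy: float = 0.9,   # assumed performance floor
    max_age_days: int = 30,      # assumed retraining schedule
) -> List[str]:
    """Return the list of triggers that fired; an empty list means no retraining."""
    reasons = []
    if days_since_training >= max_age_days:
        reasons.append("scheduled")
    if psi_value > psi_threshold:
        reasons.append("drift")
    if accuracy < min_accuracy:
        reasons.append("performance")
    return reasons


reference = [i / 100 for i in range(100)]        # training-time feature values
shifted = [0.5 + i / 200 for i in range(100)]    # production values, shifted upward
drift_score = psi(reference, shifted)
triggers = should_retrain(drift_score, accuracy=0.95, days_since_training=3)
```

Returning the reasons rather than a bare boolean keeps the decision auditable: a monitoring job can log which of the three trigger types caused a retraining run.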

MLOps Stage          Tools                                      Key Concern
-------------------  -----------------------------------------  --------------------------------
Data Management      DVC, Great Expectations, Delta Lake        Validation, versioning, lineage
Experiment Tracking  MLflow, Weights & Biases, Comet            Reproducibility, comparison
Training             SageMaker, Vertex AI, Databricks           Scale, cost, GPU utilisation
Model Registry       MLflow Registry, SageMaker Model Registry  Versioning, staging, rollback
Serving              FastAPI, TF Serving, Triton, BentoML       Latency, throughput, scalability
Monitoring           Evidently AI, WhyLabs, Grafana             Data drift, model decay
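
The registry concerns named above (versioning, staging, rollback) can be illustrated with a toy in-memory registry. This is a sketch only: the `ModelRegistry` class, its method names, and the stage labels are hypothetical and do not mirror the API of MLflow, SageMaker, or any other product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

STAGES = ("none", "staging", "production", "archived")


@dataclass
class ModelVersion:
    version: int
    artifact_uri: str
    metrics: Dict[str, float]
    stage: str = "none"


@dataclass
class ModelRegistry:
    versions: List[ModelVersion] = field(default_factory=list)

    def register(self, artifact_uri: str, metrics: Dict[str, float]) -> ModelVersion:
        """Store a new model with an auto-incremented version number."""
        mv = ModelVersion(len(self.versions) + 1, artifact_uri, dict(metrics))
        self.versions.append(mv)
        return mv

    def _get(self, version: int) -> ModelVersion:
        for mv in self.versions:
            if mv.version == version:
                return mv
        raise KeyError(f"unknown model version {version}")

    def promote(self, version: int, stage: str) -> None:
        """Move a version to a stage; only one version may be in production."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage {stage!r}")
        target = self._get(version)
        if stage == "production":
            for mv in self.versions:
                if mv.stage == "production":
                    mv.stage = "archived"
        target.stage = stage

    def production(self) -> Optional[ModelVersion]:
        return next((mv for mv in self.versions if mv.stage == "production"), None)

    def rollback(self) -> Optional[ModelVersion]:
        """Re-promote the highest-numbered archived version, if there is one."""
        archived = [mv for mv in self.versions if mv.stage == "archived"]
        if not archived:
            return None
        self.promote(archived[-1].version, "production")
        return self.production()


registry = ModelRegistry()
v1 = registry.register("models/churn/v1", {"auc": 0.90})  # hypothetical paths and metrics
v2 = registry.register("models/churn/v2", {"auc": 0.93})
registry.promote(1, "production")
registry.promote(2, "staging")
registry.promote(2, "production")   # v1 is archived automatically
registry.rollback()                 # v2 misbehaves, so v1 is restored
```

Keeping the previous production version as "archived" instead of deleting it is what makes rollback a metadata change rather than a retraining job.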

Real-World Example

Uber's Michelangelo platform processes petabytes of ride data to train hundreds of ML models — surge pricing, ETA estimation, fraud detection. The platform automates the full lifecycle: feature engineering → training → deployment → monitoring at massive scale.