Supervised Learning
Regression, classification; linear models, SVMs, decision trees, ensembles
Supervised learning trains a model on labelled examples (input-output pairs) so that it can predict outputs for new, unseen inputs. It is the most widely used ML paradigm and covers both regression (continuous output) and classification (discrete output).
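As a minimal sketch of the train-then-predict pattern, the snippet below fits a one-feature least-squares line to a handful of labelled pairs and predicts the output for a new input. The data and the helper names `fit_line` and `predict` are made up for illustration, and the code uses plain Python rather than any particular ML library.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) minimising squared error for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form ordinary least squares for a single input feature.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labelled examples: floor area (m^2) -> price (thousands).
areas = [50.0, 70.0, 90.0, 110.0]
prices = [150.0, 210.0, 270.0, 330.0]  # exactly price = 3 * area

model = fit_line(areas, prices)      # training on input-output pairs
new_price = predict(model, 100.0)    # prediction for an unseen input
print(model, new_price)
```

Because the toy data is exactly linear, the fitted line recovers a slope of 3 and an intercept of 0. Real data is noisy, so the fit would only approximate the underlying relationship.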
Key Points
- Regression: predict a continuous value (house price, temperature, stock return)
- Classification: predict a category (spam/not-spam, cat/dog/bird, malignant/benign)
- Linear Regression: fits a line (or hyperplane) minimising squared error
- Logistic Regression: outputs a probability between 0 and 1 (despite the name, it is a classifier)
- Decision Trees: hierarchical if-else splits on feature values; interpretable but prone to overfitting
- Random Forest: ensemble of decision trees; reduces variance by averaging predictions
- Gradient Boosting (XGBoost, LightGBM): builds trees sequentially, each correcting prior errors
- Support Vector Machine (SVM): finds the maximum-margin separating hyperplane; works well on small to medium datasets
- k-Nearest Neighbours (kNN): classifies based on the k most similar training examples
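To make one of the algorithms above concrete, here is a rough sketch of k-nearest-neighbours classification in plain Python. The function name `knn_predict`, the Euclidean distance metric, and the toy data are assumptions for illustration; libraries typically offer other distance metrics and faster neighbour search.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training example.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature training set with two well-separated classes.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["cat", "cat", "cat", "dog", "dog", "dog"]

label_a = knn_predict(train_X, train_y, (1.1, 1.0))
label_b = knn_predict(train_X, train_y, (5.0, 5.2))
print(label_a, label_b)
```

kNN has no training step beyond storing the examples, so all the work happens at prediction time. Because it relies on distances, features usually need to be on comparable scales.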
| Algorithm | Type | Strengths | Weaknesses |
|---|---|---|---|
| Linear Regression | Regression | Fast, interpretable | Only linear relationships |
| Logistic Regression | Classification | Probabilistic output, fast | Linear decision boundary |
| Decision Tree | Both | Interpretable, no scaling | Overfits easily |
| Random Forest | Both | Robust; some implementations handle missing data | Slower than a single tree, hard to interpret |
| XGBoost | Both | Often state-of-the-art on tabular data | Many hyperparameters |
| SVM | Both | Effective in high dimensions | Slow on large datasets |
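The logistic regression row can be illustrated with a small sketch: a one-feature model trained by batch gradient descent, whose output is a probability and whose decision boundary is a single threshold on the input. The learning rate, epoch count, data, and function names are assumptions chosen for this toy example, not recommended settings.

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit w, b for p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: hours studied -> passed the exam (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(hours, passed)
p_low = sigmoid(w * 1.0 + b)    # estimated pass probability after 1 hour
p_high = sigmoid(w * 4.0 + b)   # estimated pass probability after 4 hours
print(round(p_low, 3), round(p_high, 3))
```

Thresholding the probability at 0.5 turns the model into a classifier. With one feature that threshold is a single cut point on the input; with several features it becomes a hyperplane, which is the linear decision boundary listed as the weakness in the table.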
Real-World Example
Gradient-boosted trees, notably XGBoost, have featured in many winning Kaggle solutions and are used in production by banks for credit scoring, retailers for demand forecasting, and hospitals for readmission risk prediction.
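The idea behind gradient boosting, building weak learners sequentially so that each one corrects the errors left by the previous ones, can be sketched in plain Python. This is a simplified illustration using one-split regression stumps and squared error, not the XGBoost algorithm itself, which adds regularisation, second-order gradient information, and many engineering optimisations. All names and data below are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error."""
    best = None
    # Skip the largest value so the right-hand side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * stump_predict(stump, x)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def boosted_predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of the ensemble on the given data."""
    squared = [(boosted_predict(model, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared) / len(xs)


# Hypothetical one-feature regression data with three rough levels.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]

one_round = fit_boosted(xs, ys, n_rounds=1)
many_rounds = fit_boosted(xs, ys, n_rounds=20)
print(mse(one_round, xs, ys), mse(many_rounds, xs, ys))
```

Training error falls as rounds are added, because every new stump is fitted to what the ensemble still gets wrong. On real data that same property can lead to overfitting, which is why the learning rate and the number of rounds are usually tuned on held-out data.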