ML Algorithm Cheatsheet

Choosing the right ML algorithm is one of the most common challenges in practice. The answer depends on four key factors: the kind of problem you have (classification, regression, clustering, anomaly detection), how much labelled data you have, the trade-off you can accept between speed and accuracy, and whether the relationships in your data are linear.

This cheatsheet condenses the decision logic from the Microsoft Azure ML Algorithm Cheat Sheet — a widely used reference for algorithm selection.

Key Points

Start with the simplest algorithm that could work — complexity is not always better
No single "best" algorithm — try 2-3 candidates and compare on a validation set
Data size matters: small data → simpler models; large data → deep learning or boosting
Linearly separable data → logistic regression / linear SVM; non-linear → tree-based or neural net
Accuracy vs speed: Random Forest often beats Logistic Regression on accuracy when the data is non-linear, but it is slower to train
Interpretability needed → Decision Tree, Logistic Regression, Linear Regression
Tabular data → Gradient Boosting (XGBoost/LightGBM) is usually the strongest starting point and a frequent competition winner
Image / audio / text → Deep Learning (CNN, Transformer)
No labels available → Unsupervised (K-Means, DBSCAN, PCA, Autoencoders)
Rare events or fraud → Anomaly Detection (Isolation Forest, One-Class SVM)
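
The advice above to try 2-3 candidates and compare them can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not a definitive recipe: the data is synthetic (`make_classification`), 5-fold cross-validation stands in for a single validation split, scikit-learn's `GradientBoostingClassifier` stands in for XGBoost/LightGBM, and the helper name `compare_candidates` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_candidates(X, y, cv=5):
    """Return the mean cross-validated accuracy of each candidate model."""
    candidates = {
        # Simple, fast, interpretable baseline (features scaled for the solver).
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        # Non-linear tree ensemble; usually slower to train than the baseline.
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
        # Boosted trees; used here as a stand-in for XGBoost/LightGBM.
        "gradient_boosting": GradientBoostingClassifier(random_state=0),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in candidates.items()
    }


# Synthetic stand-in for a real labelled dataset (an assumption of this sketch).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

results = compare_candidates(X, y)
for name, score in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")

# Name of the candidate with the highest mean accuracy.
best_name = max(results, key=results.get)
```

If the simplest model scores within noise of the more complex ones, the first key point suggests keeping the simple one.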

Algorithm selection flowchart — based on the Microsoft Azure ML Cheat Sheet

Problem Type	Best Algorithms	When to Use	Avoid When
Binary Classification	Logistic Regression, SVM, XGBoost	Spam/not-spam, churn yes/no, fraud yes/no	Output needs to be a continuous value
Multi-class Classification	Random Forest, XGBoost, Neural Network	3+ categories: sentiment, topic, digit recognition	Only 2 classes — use binary instead
Regression	Linear Regression, XGBoost, Neural Network	Price, temperature, demand forecasting	Target is a category not a number
Clustering	K-Means, DBSCAN, Hierarchical	Customer segmentation, topic discovery, no labels	You have labels — use classification instead
Anomaly Detection	Isolation Forest, One-Class SVM, Autoencoder	Fraud, defects, network intrusions (rare events)	Anomalies are about as common as normal events
Recommendation	Collaborative Filtering, Matrix Factorisation	Product, movie, content recommendations	No user interaction history exists
Time Series	ARIMA, LSTM, Prophet, LightGBM	Demand forecast, stock prices, sensor data	Data has no temporal dependency
NLP / Text	TF-IDF + LR, BERT, Transformer	Sentiment, NER, classification, generation	Data is not text/sequence based
Image / Vision	CNN, ResNet, ViT (Vision Transformer)	Object detection, image classification, OCR	Dataset is too small (fewer than a few thousand images)
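
As a rough sketch, the table above can be turned into a small lookup that returns a shortlist of algorithms to try first. The dictionary keys, the function name `suggest_algorithms`, and the `limit` parameter are hypothetical names chosen for this illustration; only the algorithm lists come from the table.

```python
# Candidate algorithms per problem type, transcribed from the table above.
# The snake_case keys are hypothetical labels chosen for this sketch.
CANDIDATES = {
    "binary_classification": ["Logistic Regression", "SVM", "XGBoost"],
    "multi_class_classification": ["Random Forest", "XGBoost", "Neural Network"],
    "regression": ["Linear Regression", "XGBoost", "Neural Network"],
    "clustering": ["K-Means", "DBSCAN", "Hierarchical"],
    "anomaly_detection": ["Isolation Forest", "One-Class SVM", "Autoencoder"],
    "recommendation": ["Collaborative Filtering", "Matrix Factorisation"],
    "time_series": ["ARIMA", "LSTM", "Prophet", "LightGBM"],
    "nlp_text": ["TF-IDF + LR", "BERT", "Transformer"],
    "image_vision": ["CNN", "ResNet", "ViT"],
}


def suggest_algorithms(problem_type, limit=3):
    """Return up to `limit` candidate algorithms for a problem type."""
    # Accept inputs such as "Time Series" or "multi-class classification".
    key = problem_type.strip().lower().replace("-", "_").replace(" ", "_")
    if key not in CANDIDATES:
        raise ValueError(
            f"Unknown problem type {problem_type!r}; expected one of {sorted(CANDIDATES)}"
        )
    return CANDIDATES[key][:limit]


print(suggest_algorithms("regression"))
# prints ['Linear Regression', 'XGBoost', 'Neural Network']
print(suggest_algorithms("Time Series", limit=2))
# prints ['ARIMA', 'LSTM']
```

A lookup like this only produces a shortlist; the candidates still need to be compared on held-out data before one is chosen.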

Real-World Example

Winning Kaggle solutions for structured/tabular data very often rely on XGBoost or LightGBM, while text and image tasks are usually won with fine-tuned Transformers. The Microsoft Azure ML Cheat Sheet is a common starting point for enterprise teams deciding which algorithm to try first; it has been downloaded millions of times.
