Comprehensive list of Machine Learning Algorithms
- Soham Chausalkar
- Aug 7, 2023
- 5 min read
Table of Contents
Supervised Learning:
Linear Models:
Linear Regression
Ridge Regression
Lasso Regression
Elastic Net
Polynomial Regression
Bayesian Linear Regression
Locally Weighted Linear Regression
Logistic Regression
Multinomial Logistic Regression
Support Vector Machines (SVM) - Linear Kernel
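Most of the linear models above share the same fit/score interface in scikit-learn. A minimal sketch (assuming scikit-learn is installed; the diabetes dataset and the alpha values are purely illustrative) comparing ordinary least squares with Ridge and Lasso:

```python
# Minimal sketch: ordinary, Ridge, and Lasso regression in scikit-learn.
# The diabetes dataset and alpha values are just stand-ins.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for model in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=0.1)):
    model.fit(X_train, y_train)
    print(type(model).__name__, round(model.score(X_test, y_test), 3))
```

Logistic regression (LogisticRegression) and linear-kernel SVMs (LinearSVC, or SVC with kernel="linear") follow the same fit/score pattern.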
Nearest Neighbor Methods:
k-Nearest Neighbors (k-NN)
Radius Neighbors
Nearest Centroid Classifier
Decision Trees and Ensembles:
Decision Trees
Random Forest
Extra Trees Classifier/Regressor
AdaBoost (Adaptive Boosting)
Gradient Boosting (e.g., XGBoost, LightGBM, CatBoost)
Histogram-Based Gradient Boosting
LogitBoost
BrownBoost
TotalBoost
Extreme Gradient Boosting (XGBoost)
LightGBM (Light Gradient Boosting Machine)
CatBoost
Gradient Boosting Regression for Survival Data (GBRS)
MART (Multiple Additive Regression Trees)
RankBoost
Stochastic Gradient Boosting
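Most of these tree ensembles also come ready-made. A minimal sketch, assuming scikit-learn and its toy breast-cancer dataset (the hyperparameters are illustrative), comparing a random forest with gradient boosting:

```python
# Minimal sketch: random forest vs. gradient boosting in scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
gb = GradientBoostingClassifier(random_state=0)

for name, clf in [("RandomForest", rf), ("GradientBoosting", gb)]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(name, round(scores.mean(), 3))
```

XGBoost, LightGBM, and CatBoost ship as separate packages (xgboost, lightgbm, catboost) but expose scikit-learn-compatible wrappers, so the loop above carries over almost unchanged.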
Linear Classifiers:
Perceptron
Support Vector Machines (SVM) - Non-linear Kernels (e.g., Polynomial, RBF)
Linear Discriminant Analysis (LDA)
Quadratic Discriminant Analysis (QDA)
Naive Bayes Classifier
Neural Networks:
Feedforward Neural Networks (FNN)
Convolutional Neural Networks (CNNs)
Recurrent Neural Networks (RNNs)
Long Short-Term Memory (LSTM)
Gated Recurrent Unit (GRU)
Bidirectional RNNs
Attention Mechanisms (e.g., Transformer)
Sequence-to-Sequence Models
Capsule Networks (CapsNets)
Generative Adversarial Networks (GANs) for Classification
Transfer Learning with Pre-trained Neural Networks
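To make the first entry above concrete: a two-layer feedforward network in PyTorch. This is a minimal sketch assuming PyTorch is installed; the data is random and the 20-32-2 architecture is arbitrary.

```python
# Minimal sketch: a two-layer feedforward network (MLP) in PyTorch on random data.
import torch
from torch import nn

torch.manual_seed(0)
X = torch.randn(256, 20)                   # 256 samples, 20 features (synthetic)
y = (X[:, 0] + X[:, 1] > 0).long()         # toy binary labels

model = nn.Sequential(
    nn.Linear(20, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

accuracy = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"training accuracy: {accuracy:.2f}")
```

CNNs, RNNs, and the other architectures in this list swap in different layer types but keep the same forward/loss/backward/step loop.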
Probabilistic Models:
Gaussian Naive Bayes
Multinomial Naive Bayes
Bernoulli Naive Bayes
Gaussian Process Classification
Conditional Random Fields (CRF)
Tree-Based Methods:
Decision Trees
Random Forest
Extra Trees Classifier/Regressor
Isolation Forest
Tree Augmented Naive Bayes (TAN)
Bayesian Networks (with Decision Trees)
Chi-Squared Automatic Interaction Detection (CHAID)
Conditional Decision Trees
Rule-Based Classifiers:
RIPPER (Repeated Incremental Pruning to Produce Error Reduction)
PART (Partial Decision Trees)
OneR (One Rule)
Kernel Methods:
Support Vector Machines (SVM) - Non-linear Kernels (e.g., Polynomial, RBF)
Kernel Ridge Regression
Instance-Based Learning:
k-Nearest Neighbors (k-NN)
Learning Vector Quantization (LVQ)
Bayesian Methods:
Naive Bayes Classifier
Gaussian Naive Bayes
Multinomial Naive Bayes
Bernoulli Naive Bayes
Bayesian Linear Regression
Bayesian Network Classifiers
Regularization Methods:
L1 Regularization (Lasso)
L2 Regularization (Ridge)
Elastic Net
Ordinal Regression:
Proportional Odds Model
Continuation Ratio Model
Ordinal Ridge Regression
Multi-Label Classification:
Binary Relevance
Label Powerset
Classifier Chains
Random k-Labelsets
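Binary relevance and classifier chains are thin wrappers around any base classifier. A minimal sketch, assuming scikit-learn and a synthetic multi-label dataset:

```python
# Minimal sketch: binary relevance vs. classifier chains for multi-label data.
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

X, Y = make_multilabel_classification(n_samples=500, n_labels=3, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

base = LogisticRegression(max_iter=1000)
for name, clf in [("binary relevance", MultiOutputClassifier(base)),
                  ("classifier chain", ClassifierChain(base, random_state=0))]:
    clf.fit(X_train, Y_train)
    print(name, round(clf.score(X_test, Y_test), 3))   # exact-match accuracy
```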
Deep Learning for Classification:
Wide & Deep Learning
Deep Convolutional Neural Networks (CNNs)
Recurrent Neural Networks (RNNs)
Transformer-based Models (e.g., BERT, GPT)
Unsupervised Learning:
Clustering:
k-Means Clustering
k-Medians Clustering
k-Medoids Clustering
k-Prototypes Clustering
Hierarchical Clustering
Agglomerative Clustering
Divisive Clustering
ROCK (RObust Clustering using linKs)
CLARA (Clustering Large Applications)
CLARANS (Clustering Large Applications based on RANdomized Search)
Chameleon (Hierarchical Clustering Using Dynamic Modeling)
Self-Organizing Maps (SOM)
Neural Gas Clustering
BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies)
CURE (Clustering Using Representatives)
DENCLUE (DENsity-based CLUstEring)
SNN (Shared Nearest Neighbor)
COBWEB (Incremental Conceptual Clustering)
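Many of the algorithms above follow scikit-learn's fit_predict convention. A minimal sketch (the blob data and k=4 are illustrative) comparing k-means, agglomerative clustering, and BIRCH:

```python
# Minimal sketch: k-means, agglomerative clustering, and BIRCH on synthetic blobs.
from sklearn.cluster import AgglomerativeClustering, Birch, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

X, y_true = make_blobs(n_samples=500, centers=4, random_state=0)

for name, algo in [("k-means", KMeans(n_clusters=4, n_init=10, random_state=0)),
                   ("agglomerative", AgglomerativeClustering(n_clusters=4)),
                   ("BIRCH", Birch(n_clusters=4))]:
    labels = algo.fit_predict(X)
    print(name, round(adjusted_rand_score(y_true, labels), 3))
```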
Dimensionality Reduction:
Principal Component Analysis (PCA)
Independent Component Analysis (ICA)
Non-Negative Matrix Factorization (NMF)
Laplacian Eigenmaps
Multi-Dimensional Scaling (MDS)
Local Linear Embedding (LLE)
Hessian LLE
Isomap (Isometric Mapping)
t-Distributed Stochastic Neighbor Embedding (t-SNE)
Curvilinear Component Analysis (CCA)
Elastic Embedding
Maximum Variance Unfolding (MVU)
Diffusion Maps
Sammon Mapping
Autoencoders
Robust Principal Component Analysis (RPCA)
Canonical Correlation Analysis (CCA)
Generalized Low Rank Approximations of Matrices (GLRAM)
Probabilistic PCA (PPCA)
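PCA and t-SNE are the usual first stops. A minimal sketch, assuming scikit-learn and its digits dataset, projecting 64-dimensional digit images down to 2-D:

```python
# Minimal sketch: PCA and t-SNE projections of the 64-dimensional digits dataset.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)

X_pca = PCA(n_components=2).fit_transform(X)
X_tsne = TSNE(n_components=2, init="pca", random_state=0).fit_transform(X)

print("PCA projection:", X_pca.shape)     # (1797, 2)
print("t-SNE projection:", X_tsne.shape)  # (1797, 2)
```

Kernel PCA, Isomap, LLE, and MDS (in sklearn.decomposition and sklearn.manifold) plug into the same fit_transform call.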
Association Rule Learning:
Apriori Algorithm
Eclat Algorithm
FP-Growth Algorithm
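A minimal Apriori sketch using mlxtend (a third-party library: pip install mlxtend); the basket data is made up:

```python
# Minimal sketch: mining frequent itemsets with Apriori via mlxtend.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

baskets = [["bread", "milk"], ["bread", "butter"], ["milk", "butter"],
           ["bread", "milk", "butter"], ["bread", "milk"]]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit_transform(baskets), columns=te.columns_)

# Itemsets appearing in at least 40% of baskets; mlxtend's association_rules()
# can then rank these by confidence or lift.
itemsets = apriori(onehot, min_support=0.4, use_colnames=True)
print(itemsets)
```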
Anomaly Detection:
Isolation Forest
Local Outlier Factor (LOF)
One-Class SVM (Support Vector Machine)
Autoencoders for Anomaly Detection
Cluster-Based Outlier Detection
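Isolation Forest and LOF are both one-liners in scikit-learn. A minimal sketch on synthetic data with a few injected outliers (the contamination value is an assumption you would tune):

```python
# Minimal sketch: Isolation Forest and Local Outlier Factor flagging injected outliers.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(300, 2)),    # inliers
               rng.uniform(-6, 6, size=(10, 2))])  # a few injected outliers

iso = IsolationForest(contamination=0.03, random_state=0).fit_predict(X)
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.03).fit_predict(X)

print("Isolation Forest flagged:", int((iso == -1).sum()))  # -1 marks outliers
print("LOF flagged:", int((lof == -1).sum()))
```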
Generative Models:
Gaussian Mixture Models (GMM)
Hidden Markov Models (HMM)
Latent Dirichlet Allocation (LDA)
Variational Autoencoders (VAEs)
Restricted Boltzmann Machines (RBMs)
Generative Adversarial Networks (GANs)
Autoencoders for Data Generation
Matrix Factorization:
Singular Value Decomposition (SVD)
Non-Negative Matrix Factorization (NMF)
Probabilistic Matrix Factorization
Feature Learning:
Autoencoders
Self-Taught Learning
Sparse Coding
Graph-Based Methods:
Graph Clustering
Community Detection
Graph Embedding
Non-parametric Density Estimation:
Kernel Density Estimation (KDE)
Parzen Windows
Expectation-Maximization (EM):
Gaussian Mixture Models (GMM)
Hidden Markov Models (HMM)
Self-Organizing Maps (SOM):
Kohonen Maps
Growing Neural Gas (GNG)
Independent Component Analysis (ICA):
FastICA
JADE (Joint Approximate Diagonalization of Eigenmatrices)
Principal Component Analysis (PCA):
Kernel PCA
Incremental PCA
Sparse PCA
Hierarchical Temporal Memory (HTM):
Neural Network Model for Learning Sequences and Patterns
Word Embeddings:
Word2Vec
FastText
GloVe
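A minimal Word2Vec sketch using gensim (a third-party library: pip install gensim; the parameters shown follow the gensim 4.x API) on a toy three-sentence corpus:

```python
# Minimal sketch: training a tiny Word2Vec model with gensim on a toy corpus.
from gensim.models import Word2Vec

sentences = [["machine", "learning", "is", "fun"],
             ["deep", "learning", "uses", "neural", "networks"],
             ["neural", "networks", "learn", "representations"]]

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=50)
print(model.wv["learning"].shape)                  # a 50-dimensional embedding
print(model.wv.most_similar("learning", topn=2))   # nearest words in the toy corpus
```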
Transfer Learning and Domain Adaptation:
Pre-trained Models (e.g., Word2Vec, GPT, BERT)
Domain Adaptation Algorithms
Semi-Supervised Learning:
Self-Training
Co-Training
Multi-View Learning
Multi-Instance Learning
Semi-Supervised Support Vector Machines (S3VM)
Transductive Support Vector Machines (TSVM)
Temporal Ensembling
Mean Teacher (Temporal Ensembling with Exponential Moving Average)
Virtual Adversarial Training (VAT)
Consistency Regularization
MixMatch
VAT + Entropy Minimization
Noisy Student Training
Pseudo-Labeling
Tri-Training
Self-Ensemble
MentorNet
Combination of Labeled and Unlabeled Data (CLUD)
Entropy-Regularized Self-Training
Self-Paced Learning
Label Propagation
Label Spreading
Manifold Regularization
Self-Taught Learning
Data Programming
Unsupervised Data Augmentation (UDA)
Self-Labeling
Joint Unsupervised Learning (JULE)
Ladder Networks
Deep Generative Models with Semi-Supervised Learning
Semi-Supervised Clustering Algorithms
Semi-Supervised Anomaly Detection Techniques
Semi-Supervised Reinforcement Learning (e.g., S3RL)
Semi-Supervised Sequence Learning
Few-Shot Learning with Semi-Supervised Techniques
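Self-training and label spreading are available directly in scikit-learn. A minimal sketch on the digits dataset with 90% of the labels hidden (-1 is scikit-learn's marker for "unlabeled"); the base classifier and kernel choices are illustrative:

```python
# Minimal sketch: self-training and label spreading with mostly-unlabeled digits.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelSpreading, SelfTrainingClassifier
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.9] = -1          # hide 90% of the labels

self_training = SelfTrainingClassifier(SVC(probability=True, gamma="scale"))
self_training.fit(X, y_partial)
spreading = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y_partial)

# Accuracy against the full (originally known) labels, labeled points included.
print("self-training accuracy:", round(self_training.score(X, y), 3))
print("label spreading accuracy:", round(spreading.score(X, y), 3))
```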
Reinforcement Learning:
Model-Free Algorithms:
Q-Learning
SARSA (State-Action-Reward-State-Action)
DDPG (Deep Deterministic Policy Gradient)
TRPO (Trust Region Policy Optimization)
PPO (Proximal Policy Optimization)
A3C (Asynchronous Advantage Actor-Critic)
ACKTR (Actor-Critic using Kronecker-Factored Trust Region)
D4PG (Distributed Distributional Deterministic Policy Gradients)
TD3 (Twin Delayed Deep Deterministic Policy Gradient)
SAC (Soft Actor-Critic)
Hindsight Experience Replay
Rainbow DQN (Combining DQN Improvements)
C51 (Categorical DQN)
IQN (Implicit Quantile Network)
QR-DQN (Quantile Regression DQN)
R2D2 (Recurrent Experience Replay in DQN)
CACLA (Continuous Actor-Critic Learning Automaton)
FQF (Fully Parameterized Quantile Function)
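The simplest of these is tabular Q-learning. A minimal sketch on a tiny hand-rolled chain environment (the environment, learning rate, and discount are all illustrative):

```python
# Minimal sketch: tabular Q-learning with epsilon-greedy exploration on a
# 5-state chain environment that only rewards reaching the right end.
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def step(state, action):
    nxt = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == n_states - 1 else 0.0
    return nxt, reward, nxt == n_states - 1          # next state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection
        action = rng.integers(n_actions) if rng.random() < epsilon else int(Q[state].argmax())
        nxt, reward, done = step(state, action)
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[state, action] += alpha * (reward + gamma * Q[nxt].max() - Q[state, action])
        state = nxt

print("greedy policy (0=left, 1=right):", Q.argmax(axis=1))
```

DQN and its descendants replace the table with a neural network and this inner loop with replay-buffer minibatches.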
Model-Based Algorithms:
Monte Carlo Methods
Value Iteration
Policy Iteration
DDP (Differential Dynamic Programming)
MPC (Model Predictive Control)
MBMF (Model-Based RL with Model-Free Fine-Tuning)
Dreamer (Learning Behaviors by Latent Imagination)
PlaNet (Planning Network)
MBPO (Model-Based Policy Optimization)
SLAC (Stochastic Latent Actor-Critic)
Exploration Strategies:
Epsilon-Greedy Exploration
Boltzmann Exploration
Upper Confidence Bound (UCB) Exploration
Thompson Sampling
Noisy Networks for Exploration
Count-Based Exploration
Bootstrapped DQN (Deep Exploration via Bootstrapped Ensembles)
Bayesian Exploration
Random Network Distillation
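As a concrete example of one exploration strategy, a minimal UCB1 sketch on a 3-armed bandit with made-up payoff probabilities:

```python
# Minimal sketch: Upper Confidence Bound (UCB1) exploration on a 3-armed
# Bernoulli bandit with made-up arm payoffs.
import numpy as np

true_probs = np.array([0.2, 0.5, 0.8])   # hidden arm payoffs (illustrative)
counts = np.zeros(3)                     # pulls per arm
values = np.zeros(3)                     # running mean reward per arm
rng = np.random.default_rng(0)

for t in range(1, 2001):
    if t <= 3:
        arm = t - 1                      # pull each arm once to initialise
    else:
        ucb = values + np.sqrt(2 * np.log(t) / counts)
        arm = int(ucb.argmax())
    reward = float(rng.random() < true_probs[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean

print("pull counts per arm:", counts.astype(int))          # the best arm dominates
```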
Imitation Learning:
Behavioral Cloning
DAgger (Dataset Aggregation)
GAIL (Generative Adversarial Imitation Learning)
BCQ (Batch-Constrained Q-Learning)
ILQR (Iterative Linear Quadratic Regulator)
IRL (Inverse Reinforcement Learning)
Multi-Agent Reinforcement Learning:
MARL (Multi-Agent Reinforcement Learning)
MADDPG (Multi-Agent Deep Deterministic Policy Gradient)
COMA (Counterfactual Multi-Agent Policy Gradients)
QMIX (Q-value Mixing Network)
IQL (Independent Q-Learning)
ATOC (Attentional Communication Model)
MASAC (Multi-Agent Soft Actor-Critic)
SMAC (StarCraft Multi-Agent Challenge)
Generative Models:
Variational Autoencoders (VAEs):
Vanilla VAE
Conditional VAE
Adversarial Autoencoder (AAE)
Beta-VAE
InfoVAE
VQ-VAE (Vector Quantized VAE)
CVAE-GAN (Conditional VAE-GAN)
Ladder VAE
Generative Adversarial Networks (GANs):
Vanilla GAN
DCGAN (Deep Convolutional GAN)
CGAN (Conditional GAN)
WGAN (Wasserstein GAN)
WGAN-GP (Wasserstein GAN with Gradient Penalty)
LSGAN (Least Squares GAN)
EBGAN (Energy-Based GAN)
CycleGAN
StarGAN
Progressive GAN
BigGAN
StyleGAN
StyleGAN2
StyleGAN3
GauGAN
Pix2Pix (Conditional GAN for Image-to-Image Translation)
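A minimal vanilla-GAN sketch in PyTorch, fitting a 1-D Gaussian rather than images so it stays short; the architectures and hyperparameters are arbitrary:

```python
# Minimal sketch: a vanilla GAN learning a 1-D Gaussian (mean 4, std 1).
import torch
from torch import nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))   # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))   # discriminator (logits)
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(3000):
    real = torch.randn(64, 1) + 4.0                 # samples from the target distribution
    fake = G(torch.randn(64, 8))

    # discriminator update: push real -> 1, fake -> 0
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # generator update: try to make the discriminator output 1 on fakes
    g_loss = bce(D(G(torch.randn(64, 8))), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

samples = G(torch.randn(1000, 8)).detach()
print(f"generated mean ~ {samples.mean().item():.2f}, std ~ {samples.std().item():.2f}")
```

The named variants above mostly change the architecture, the loss (e.g. Wasserstein, least-squares), or the conditioning, but keep this two-player training loop.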
Autoencoders and Variants:
Denoising Autoencoder
Contractive Autoencoder
Sparse Autoencoder
Stacked Autoencoder
Convolutional Autoencoder
Variational Autoencoders (VAEs)
Adversarial Autoencoder (AAE)
Wasserstein Autoencoder (WAE)
Beta-VAE
InfoVAE
Ladder Network
VQ-VAE (Vector Quantized VAE)
CVAE (Conditional VAE)
VAE-GAN (Combining VAE and GAN)
Normalizing Flows:
Real NVP (Real Non-Volume Preserving)
Glow (Generative Flow with Invertible 1x1 Convolutions)
FFJORD (Free-Form Jacobian of Reversible Dynamics)
MAF (Masked Autoregressive Flow)
IAF (Inverse Autoregressive Flow)
Neural Spline Flows
Other Generative Models:
Adversarial Variational Bayes (AVB)
Adversarially Learned Inference (ALI)
BetaGAN
BiGAN (Bidirectional GAN)
Boundary Equilibrium GAN (BEGAN)
Context Encoders
Generative Moment Matching Networks (GMMN)
Generative Query Network (GQN)
GLO (Generative Latent Optimization)
LatentGAN
Neural Processes
Neuromorphic Generative Models
Noise-Contrastive Estimation (NCE)
Recurrent Temporal GAN (RT-GAN)
Sobolev GAN
Variational Information Maximizing Exploration (VIME)
Ensemble Learning:
Bagging Algorithms:
Bagging (Bootstrap Aggregating)
Random Forest
Extra Trees Classifier/Regressor
Random Subspace Method
Boosting Algorithms:
AdaBoost (Adaptive Boosting)
Gradient Boosting (e.g., XGBoost, LightGBM, CatBoost)
LogitBoost
LPBoost (Linear Programming Boosting)
BrownBoost
TotalBoost
MadaBoost (Modified AdaBoost)
RUSBoost (Random Undersampling Boosting)
GBM (Gradient Boosting Machine)
DART (Dropouts meet Multiple Additive Regression Trees)
Stacking and Blending:
Stacking (Meta-Ensembling)
Blending
Super Learner
Weighted Averaging
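Stacking and soft voting are both built into scikit-learn. A minimal sketch over three base learners (the choice of learners and the breast-cancer dataset are illustrative):

```python
# Minimal sketch: stacking vs. soft voting over three base learners.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
base = [("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svc", SVC(probability=True, random_state=0)),
        ("lr", LogisticRegression(max_iter=5000))]

stack = StackingClassifier(estimators=base, final_estimator=LogisticRegression(max_iter=5000))
vote = VotingClassifier(estimators=base, voting="soft")   # probability averaging

for name, clf in [("stacking", stack), ("soft voting", vote)]:
    print(name, round(cross_val_score(clf, X, y, cv=5).mean(), 3))
```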
Other Ensemble Techniques:
Bag of Little Bootstraps (BLB)
Bootstrapped Ensembles
Bayesian Model Averaging
Bayesian Model Combination
Rotation Forest
Error-Correcting Output Codes (ECOC)
Heterogeneous Ensembles
Dynamic Classifier Selection
Ensemble Selection
Cluster Ensembles
Feature-based Ensemble Methods
Rank Ensembling
Majority Voting
Simultaneous Boosting and Model Selection
Ensemble Approaches in Deep Learning:
Snapshot Ensembles
Stochastic Weight Averaging (SWA)
Adversarial Training Ensembles
Bag of Tricks for Training Neural Networks
Anomaly Detection:
Statistical Methods:
Z-Score
Modified Z-Score
Mahalanobis Distance
Dixon's Q Test
Grubbs' Test
Hampel Identifier
Tukey's Test
Interquartile Range (IQR)
Box-Cox Transformation
Exponential Smoothing
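The first few statistical tests reduce to a few lines of NumPy. A minimal sketch flagging outliers with a |z| > 3 rule and with Tukey's IQR fences on synthetic data:

```python
# Minimal sketch: z-score and IQR (Tukey's fences) outlier flagging.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(10, 2, size=200), [30.0, -15.0]])   # two injected outliers

z = (x - x.mean()) / x.std()
z_outliers = x[np.abs(z) > 3]                                      # classic |z| > 3 rule

q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
iqr_outliers = x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)]      # Tukey's fences

print("z-score outliers:", z_outliers)
print("IQR outliers:", iqr_outliers)
```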
Density-Based Methods:
Isolation Forest
Local Outlier Factor (LOF)
DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
HBOS (Histogram-Based Outlier Score)
ABOD (Angle-Based Outlier Detection)
COF (Connectivity-Based Outlier Factor)
CBLOF (Clustering-Based Local Outlier Factor)
LOCI (Local Correlation Integral)
LoOP (Local Outlier Probabilities)
Distance-Based Methods:
k-Nearest Neighbors (k-NN)
k-Means Clustering
Distance-Based Outlier Detection (DOD)
Angle-Based Outlier Detection (ABOD)
Model-Based Methods:
Gaussian Mixture Models (GMM)
One-Class SVM (Support Vector Machine)
Autoencoders for Anomaly Detection
Isolation Support Vector Machines (iSVM)
Markov Chain Models
Hidden Markov Models (HMM)
Generative Adversarial Networks (GANs) for Anomaly Detection
Variational Autoencoders (VAEs) for Anomaly Detection
Long Short-Term Memory (LSTM) for Anomaly Detection
Ensemble Methods:
Isolation Forest Ensembles
LOF Ensembles
Autoencoder Ensembles
Bagging-Based Anomaly Detection
Meta-Learning Approaches:
Meta-Anomaly Detection
Learning to Detect Anomalies
Transfer Learning for Anomaly Detection
Time Series Anomaly Detection:
Seasonal Hybrid ESD (S-H-ESD)
Twitter's AnomalyDetection R Package
Prophet
ARIMA (AutoRegressive Integrated Moving Average) with Anomalies
Deep Learning-Based Approaches:
Convolutional Autoencoders for Image Anomaly Detection
LSTM Autoencoders for Sequence Anomaly Detection
GANs for Image and Data Anomaly Detection
Neural Network Architectures:
Feedforward Neural Networks (FNN):
Single-Layer Perceptron
Multi-Layer Perceptron (MLP)
Deep Feedforward Networks
Cascade-Correlation Neural Network
Radial Basis Function Networks (RBFN)
Extreme Learning Machines (ELM)
Functional Link Neural Network (FLNN)
Probabilistic Neural Network (PNN)
Generalized Regression Neural Network (GRNN)
Hierarchical Temporal Memory (HTM)
Convolutional Neural Networks (CNNs):
LeNet-5
AlexNet
VGG (Visual Geometry Group)
GoogLeNet (Inception)
ResNet (Residual Networks)
DenseNet
MobileNet
EfficientNet
Xception
SqueezeNet
Inception-ResNet
ShuffleNet
NASNet
SENet (Squeeze-and-Excitation Networks)
MnasNet
ResNeXt
HRNet (High-Resolution Networks)
GhostNet
EfficientDet (Efficient Object Detection)
RegNet (Networks from Regular Design Spaces)
Recurrent Neural Networks (RNNs):
Vanilla RNN
LSTM (Long Short-Term Memory)
GRU (Gated Recurrent Unit)
Bi-directional RNNs
Attention Mechanisms (e.g., Transformer)
U-Net (Used for Segmentation)
WaveNet (Used for Text-to-Speech)
Neural Turing Machine
Differentiable Neural Computer (DNC)
Dynamic Time Warping Networks
Sequence-to-Sequence Models:
Seq2Seq (Sequence-to-Sequence)
Encoder-Decoder Architectures
Attention Mechanisms (e.g., Transformer)
Pointer Networks
Transformer (e.g., BERT, GPT)
T5 (Text-to-Text Transfer Transformer)
XLNet
RoBERTa
ELECTRA
GPT-3
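Pre-trained Transformers are easiest to try through Hugging Face's transformers library (third-party: pip install transformers; the default sentiment checkpoint is downloaded on first use). A minimal sketch:

```python
# Minimal sketch: using a pre-trained Transformer via the transformers pipeline API.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
print(classifier("This list of algorithms is surprisingly thorough."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```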
Generative Models:
Generative Adversarial Networks (GANs)
Variational Autoencoders (VAEs)
Wasserstein GAN (WGAN)
Conditional GAN (cGAN)
CycleGAN
Progressive Growing of GANs (PGGAN)
StyleGAN
StyleGAN2
StyleGAN3
BigGAN
VQ-VAE (Vector Quantized VAE)
GPT-2 (OpenAI's Generative Pre-trained Transformer 2)
GPT-3.5
GPT-4
Siamese Networks:
Siamese Neural Networks for Similarity Learning
Triplet Networks
Neural Architecture Search (NAS):
Evolutionary Algorithms for Neural Architecture Search
Reinforcement Learning-based NAS
Gradient-Based NAS
DARTS (Differentiable Architecture Search)
ENAS (Efficient Neural Architecture Search)
Graph Neural Networks (GNNs):
Graph Convolutional Networks (GCNs)
GraphSAGE (Graph SAmple and aggreGatE)
GAT (Graph Attention Networks)
Graph Isomorphism Networks (GIN)
ChebNet (Chebyshev Networks)
GraphSIF (Graph Structural Isomorphism Fingerprinting)
Diffusion Convolutional Neural Networks (DCNN)
GraphSAINT (Graph SAmpling based INductive learning meThod)
Gated Graph Neural Networks (GGNN)
Graph Neural Ordinary Differential Equations (Graph Neural ODE)
Heterogeneous Graph Neural Networks
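At their core, GCN-style networks repeat one propagation step, H' = ReLU(D^-1/2 (A + I) D^-1/2 H W). A minimal NumPy sketch of that step on a toy 4-node graph with random features and weights:

```python
# Minimal sketch: one graph-convolution (GCN) propagation step in NumPy.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)   # adjacency of a small undirected graph
H = rng.normal(size=(4, 3))                  # node features (4 nodes, 3 features)
W = rng.normal(size=(3, 2))                  # layer weights (random here, learned in practice)

A_hat = A + np.eye(4)                        # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
H_next = np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)   # ReLU

print(H_next)   # new 2-dimensional embedding for each of the 4 nodes
```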
Temporal Networks:
Temporal Convolutional Networks (TCNs)
Time Series Prediction Models using LSTM/GRU
ST-ResNet (Spatiotemporal Residual Networks)
STRNN (Spatiotemporal Recurrent Neural Network)
Hybrid Architectures:
Autoencoders with Convolutional or Recurrent Layers
Attention Mechanisms in Various Architectures
Capsule Networks in Combination with CNNs



