Pros & Cons of Popular Machine Learning Models

Thomas Bustos

Data Scientist | Data Engineer | ML Engineer

This blog post will help you understand why to use a specific model instead of another by walking through the pros and cons of each one. In this post we cover Linear Regression, Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbours, K-Means, Principal Component Analysis, and Naive Bayes.

Linear Regression

Good

  • Simple to implement and efficient to train
  • Overfitting can be reduced by regularization
  • Performs well when the relationship between the features and the target is linear

Bad

  • Assumes that the observations are independent, which is rare in real life
  • Prone to noise and overfitting
  • Sensitive to outliers

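To make this concrete, here is a minimal sketch of fitting a linear model. This and the following examples use scikit-learn with synthetic or toy data, an assumed setup for illustration only:

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data: the target depends linearly on the features, plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

model = LinearRegression().fit(X, y)
print(model.coef_)  # estimated weights, close to [2.0, -1.0, 0.5]

# Regularized variant (Ridge) to reduce overfitting
ridge = Ridge(alpha=1.0).fit(X, y)
print(ridge.coef_)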

Logistic Regression

Good

  • Less prone to overfitting, though it can still overfit on high-dimensional datasets
  • Efficient when the dataset has features that are linearly separable
  • Easy to implement and efficient to train

Bad

  • Should not be used when the number of observations is less than the number of features
  • Assumes a linear relationship between the features and the log-odds, which rarely holds in practice
  • Can only be used to predict discrete outcomes (classes), not continuous values

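A minimal sketch of logistic regression on synthetic data (assumed setup), showing that it predicts discrete classes backed by probabilities:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification problem
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:5]))        # discrete class labels
print(clf.predict_proba(X[:5]))  # the underlying class probabilities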

Support Vector Machine (SVM)

Good

  • Effective on high-dimensional data
  • Can work on small datasets
  • Can solve non-linear problems (with the right kernel)

Bad

  • Inefficient on large datasets
  • Requires picking the right kernel

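To make the kernel point concrete, a minimal sketch on the synthetic two-moons problem (assumed data): an RBF kernel solves a non-linear problem that a linear kernel cannot:

from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-circles: not linearly separable
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)

# The RBF kernel should score noticeably higher here
print("linear:", linear_svm.score(X, y))
print("rbf:   ", rbf_svm.score(X, y))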

Decision Tree

Good

  • Can solve non-linear problems
  • Can work on high-dimensional data with good accuracy
  • Easy to visualize and explain

Bad

  • Prone to overfitting, which can often be mitigated by ensembles such as random forests
  • A small change in the data can lead to a large change in the structure of the optimal decision tree
  • Training can become computationally expensive on large datasets

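A minimal sketch on the Iris dataset (assumed for illustration): capping the depth is one simple guard against overfitting, and the fitted tree prints as human-readable rules, which is why trees are easy to explain:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Capping max_depth is a simple way to limit overfitting
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)

# Print the learned decision rules as plain text
print(export_text(tree, feature_names=iris.feature_names))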

K-Nearest Neighbours (KNN)

Good

  • Requires no training phase; predictions come straight from the stored data (lazy learning)
  • Prediction is a simple O(n) scan over the stored training examples
  • Can be used for both classification and regression

Bad

  • Does not work well with large datasets
  • Sensitive to noisy data, missing values and outliers
  • Needs feature scaling
  • Choosing the right value of K is not obvious

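Because KNN needs feature scaling, the usual pattern is a pipeline that scales before classifying; a minimal sketch (assumed setup), where K is the n_neighbors parameter:

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Scale first: KNN distances are meaningless across features on different scales
knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
knn.fit(X, y)  # "training" mostly just stores the scaled examples
print(knn.predict(X[:5]))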

K-Means

Good

  • Simple to implement
  • Scales to large data sets
  • Guaranteed to converge (though only to a local optimum)
  • Easily adapts to new examples
  • Can be generalized to clusters of different shapes and sizes (e.g. elliptical clusters)

Bad

  • Sensitive to outliers
  • Choosing the value of k manually is difficult
  • Dependent on the initial values of the centroids
  • Effectiveness degrades as the number of dimensions increases

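A minimal sketch of K-Means on synthetic blobs (assumed data). The n_init parameter reruns the algorithm from several starting centroids, precisely because the result depends on the initial values:

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated clusters
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# n_init=10 reruns from 10 random initializations and keeps the best result
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(kmeans.cluster_centers_)  # the learned centroids
print(kmeans.predict(X[:5]))    # new examples are assigned to clusters cheaply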

Principal Component Analysis

Good

  • Removes correlated features (the resulting components are uncorrelated)
  • Improves performance by reducing the number of dimensions
  • Reduces overfitting

Bad

  • Principal components are less interpretable
  • Information loss
  • Must standardize data before implementing PCA

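A minimal sketch (assumed setup): standardize first, then fit PCA and check how much variance each component explains:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# Standardize before PCA, since the components chase directions of high variance
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)
print(pca.explained_variance_ratio_)  # share of variance kept by each component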

Naive Bayes

Good

  • Training is fast
  • Better suited to categorical inputs
  • Easy to implement

Bad

  • Assumes that all features are independent, which rarely holds in real life
  • Suffers from the zero-frequency problem: a categorical value unseen in training gets zero probability (usually fixed with Laplace smoothing)
  • Its probability estimates can be unreliable in some cases

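A minimal sketch of multinomial Naive Bayes on toy count data (assumed for illustration). The alpha parameter is Laplace smoothing, the standard fix for the zero-frequency problem:

import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Toy word-count matrix: 4 documents x 3 vocabulary words
X = np.array([[2, 1, 0],
              [3, 0, 0],
              [0, 2, 3],
              [0, 1, 4]])
y = np.array([0, 0, 1, 1])  # document class labels

# alpha=1.0 applies Laplace smoothing so no unseen count yields zero probability
clf = MultinomialNB(alpha=1.0).fit(X, y)
print(clf.predict(np.array([[1, 0, 3]])))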

These articles are meant to remind you of key concepts on your journey to becoming a GREAT data scientist ;). Feel free to share your thoughts on the article, ideas for future posts, and feedback.

Have a nice day!
