The Difference between Generative and Discriminative Classifiers
If you have ever built a classification model, you have most likely used either a generative or a discriminative algorithm.
For example, Naive Bayes, Hidden Markov Models (HMM), and Linear Discriminant Analysis (LDA) are generative models, whereas Logistic Regression, Support Vector Machines (SVM), and Nearest Neighbours are discriminative models.
But why is it important to learn about the difference?
Knowing the difference helps you understand which models suit your particular dataset and what you can expect to achieve from your analysis. While you can find the best model through trial and error, the growing number of available models makes this approach increasingly challenging and time-consuming.
The fundamental difference between generative and discriminative
Fundamentally,
- Generative models learn the probability distribution of individual classes
- Discriminative models learn the decision boundary between classes
And they can be presented visually as below:
Calculation differences between generative and discriminative
While both types of models are used to predict \(P(y|x)\), the calculation methods for both models are somewhat different.
A generative model will first learn the joint probability distribution \(P(x,y)\), then use Bayes’ Theorem to calculate \(P(y|x)\). In contrast, a discriminative model will learn the conditional probability \(P(y|x)\) directly.
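The two routes can be sketched in a few lines of Python. This is only a minimal illustration, not a real classifier: the toy sample `samples` and the helper names `posterior_generative` and `posterior_discriminative` are made up for this sketch, and every probability is estimated by simple counting.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy sample of (x, y) pairs, invented for this sketch.
samples = [("sunny", "play"), ("sunny", "play"), ("sunny", "stay"),
           ("rainy", "stay"), ("rainy", "stay"), ("rainy", "play")]

n = len(samples)
pair_counts = Counter(samples)
x_counts = Counter(x for x, _ in samples)
y_counts = Counter(y for _, y in samples)


def posterior_generative(y, x):
    """P(y|x) via Bayes' theorem: P(x|y) * P(y) / P(x)."""
    p_y = Fraction(y_counts[y], n)                            # prior P(y)
    p_x_given_y = Fraction(pair_counts[(x, y)], y_counts[y])  # likelihood P(x|y)
    p_x = Fraction(x_counts[x], n)                            # evidence P(x)
    return p_x_given_y * p_y / p_x


def posterior_discriminative(y, x):
    """P(y|x) estimated directly: share of y among the samples with this x."""
    return Fraction(pair_counts[(x, y)], x_counts[x])


for x in ("sunny", "rainy"):
    for y in ("play", "stay"):
        print(x, y, posterior_generative(y, x), posterior_discriminative(y, x))
```

With counting estimates like these, both functions return the same value. Real models differ because a generative model makes assumptions about \(P(x|y)\) (Naive Bayes, for instance, assumes the features are independent given the class), while a discriminative model parameterizes \(P(y|x)\) directly.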
For example, let’s say our data follows the format \((x,y)\), where we want to use \(x\) to predict \(y\).
Suppose our data set is as follows: \(\{(0, 1), (0, 1), (1, 0), (1, 1)\}\). The joint probability \(P(x,y)\) would be:
 | y = 0 | y = 1 |
x = 0 | 0 | 1/2 |
x = 1 | 1/4 | 1/4 |
And \(P(y|x)\) would be as follows:
 | y = 0 | y = 1 |
x = 0 | 0 | 1 |
x = 1 | 1/2 | 1/2 |
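The two tables can be checked by counting. The snippet below is a small sketch that uses the four \((x,y)\) pairs from the text; the variable names `joint` and `conditional` are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction

data = [(0, 1), (0, 1), (1, 0), (1, 1)]  # the (x, y) pairs from the text

pair_counts = Counter(data)
x_counts = Counter(x for x, _ in data)

# Joint P(x, y): count of each pair over the total number of samples.
joint = {(x, y): Fraction(pair_counts[(x, y)], len(data))
         for x in (0, 1) for y in (0, 1)}

# Conditional P(y|x): count of each pair over the count of its x value.
conditional = {(x, y): Fraction(pair_counts[(x, y)], x_counts[x])
               for x in (0, 1) for y in (0, 1)}

for x in (0, 1):
    print(f"x = {x}",
          "joint:", [str(joint[(x, y)]) for y in (0, 1)],
          "conditional:", [str(conditional[(x, y)]) for y in (0, 1)])
```

Each row of `conditional` sums to 1, while the whole `joint` table sums to 1. That is the practical difference between the two tables: the conditional one answers "which \(y\) for this \(x\)?" row by row.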
As we can see above, the conditional distribution \(P(y|x)\) allows us to classify \(y\) directly given \(x\). This is where the name discriminative comes from: the model discriminates (classifies) directly based on its observations.
A generative model, in contrast, models \(P(x,y)\), which is then used to calculate \(P(y|x)\) using Bayes’ theorem:
\[P(x,y) = P(x|y) \cdot P(y) \\ P(x,y) = P(y|x) \cdot P(x)\]
Because \(P(x,y) = P(x|y) \cdot P(y) = P(y|x) \cdot P(x)\), we can find \(P(y|x)\):
\[P(y|x) \cdot P(x) = P(x|y) \cdot P(y) \\ P(y|x) =\frac{P(x|y) \cdot P(y)}{P(x)}\]
Summary: Which is better? Generative or Discriminative Classifiers
When choosing between a discriminative and a generative model for classification, discriminative models are usually considered the better choice: given enough training data, they tend to reach higher accuracy, and they tend to use fewer resources.
However, a generative model can often achieve good accuracy with a comparatively small amount of data. As a caveat, good predictions on current data do not guarantee that a model will perform well on future data.
And while discriminative models may seem like the way to go for classification problems, generative models can be used in areas where discriminative models cannot, including image generation, image-to-image translation, text-to-image translation, and many more.
If generative models interest you, I would recommend reading Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
Further Reading
Books
- Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play (O’Reilly), by David Foster
(A book that teaches you how to teach machine learning algorithms to create art from scratch)
- Practical Statistics for Data Scientists (O’Reilly), by Peter Bruce and Andrew Bruce
(A must-have book for aspiring data scientists on applying statistical methods to data science)
References
- Ng, A. Y., & Jordan, M. I. (2002). On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. In Advances in neural information processing systems (pp. 841-848).
- Bouchard, G., & Triggs, B. (2004). The tradeoff between generative and discriminative classifiers. In IASC International Symposium on Computational Statistics (COMPSTAT).
- Bishop, C. M., & Lasserre, J. (2007). Generative or discriminative? Getting the best of both worlds. In J. M. Bernardo, M. J. Bayarri, J. O. Berger, A. P. Dawid, D. Heckerman, A. F. M. Smith, & M. West (Eds.), Bayesian Statistics 8 (pp. 3-24).
- ebony1 (2017), Generative vs discriminative models (in Bayesian context). In Stats StackExchange.
- Stompchicken (2009), What is the difference between a generative and a discriminative algorithm?. In StackOverflow.