What is ConfusionMatrixDisplay in scikit-learn (Python)? Code Example

ConfusionMatrixDisplay is a scikit-learn class used to plot a confusion matrix. A confusion matrix, in turn, is used to evaluate the output of a classifier (for example, one trained on the iris dataset) by comparing its predictions with the actual labels.

This confusion matrix is divided into two segments: diagonal blocks and other blocks.

Diagonal blocks represent the count of successful predictions, i.e. cases where the actual value equals the predicted value.

The other (off-diagonal) blocks represent the count of failed predictions.

So, for the best results, the diagonal blocks should hold values as high as possible and the other blocks values as close to zero as possible.
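
As a small illustration of the diagonal versus off-diagonal split, the sketch below counts both with NumPy. The 3×3 matrix is made-up example data, not the output of any real classifier, and the variable names are only illustrative.

```python
import numpy as np

# Made-up 3x3 confusion matrix (rows: actual, columns: predicted).
cm = np.array([
    [5, 1, 0],
    [2, 6, 1],
    [0, 0, 7],
])

correct = int(np.trace(cm))       # sum of the diagonal blocks
wrong = int(cm.sum()) - correct   # everything off the diagonal
accuracy = correct / cm.sum()     # share of successful predictions

print(correct, wrong, round(float(accuracy), 3))
```

The closer accuracy gets to 1, the more of the total count sits on the diagonal.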

ConfusionMatrixDisplay

The basic syntax of ConfusionMatrixDisplay is –

from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

ConfusionMatrixDisplay(confusion_matrix, *, display_labels=None)

It is part of the metrics module of the sklearn package.

Check out this code example –

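from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
import matplotlib.pyplot as plt

labels = [1,3,6,9,10,14,2,4]

# Pass labels= so the rows and columns follow the same order as display_labels
# (by default confusion_matrix sorts the labels).
cm = confusion_matrix(labels, labels, labels=labels)
cmp = ConfusionMatrixDisplay(cm, display_labels=labels)

fig, ax = plt.subplots(figsize=(10,10))
cmp.plot(ax=ax)
plt.show()  # display the figure when run as a script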

In this code we used a simple 1-D list and passed it as both the true data and the predicted data to the confusion_matrix function.

Since the true data and the predicted data are the same, every sample lands on the diagonal: each diagonal block is 1 and every other block is 0.

After that, we passed the output of confusion_matrix to ConfusionMatrixDisplay and plotted it.

Check out the output –

ConfusionMatrixDisplay output plot

How to read ConfusionMatrixDisplay?

You will have two arrays: the actual data and the predicted data.

Suppose there are three different classes in those arrays, for example ironman, hulk and thor.

Let the actual array be –

actual = ["ironman", "hulk", "thor", "hulk", "hulk", "thor", "ironman"]

and the predicted array be –

predicted = ["ironman", "hulk", "thor", "thor", "ironman", "thor", "hulk"]

You can see that not all predicted values are the same as the actual ones. So, we will definitely have some non-zero values in the non-diagonal places of the confusion matrix.

To see this clearly, let’s put both arrays together –

actual = ["ironman", "hulk", "thor", "hulk", "hulk", "thor", "ironman"]
predicted = ["ironman", "hulk", "thor", "thor", "ironman", "thor", "hulk"]

Now we will draw a 3×3 board of zeros, because we have 3 distinct values in the actual and predicted arrays.

          ______ ______ ______
Ironman  |  0   |   0  |  0   |
         |______|______|______|
Hulk     |  0   |   0  |  0   |
         |______|______|______|
Thor     |  0   |   0  |  0   |
         |______|______|______|
        Ironman   Hulk   Thor

The vertical axis (rows) is the actual value.
The horizontal axis (columns) is the predicted value.

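We will pick the first value, i.e. ironman, and look at every position where the actual array has ironman. For each such position, add 1 to the block in the ironman row and the column of the predicted value. Here one ironman is predicted as ironman, so we add 1 to the (0,0) block, and one is predicted as hulk, so we add 1 to the (0,1) block.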

actual = ["ironman", "..", "..", "..", "..", "..", "ironman"]
predicted = ["ironman", "..", "..", "..", "..", "..", "hulk"]

          ______ ______ ______
Ironman  | 0+1  | 0+1  |  0   |
         |______|______|______|
Hulk     |  0   |   0  |  0   |
         |______|______|______|
Thor     |  0   |   0  |  0   |
         |______|______|______|
        Ironman   Hulk   Thor

Similarly for hulk

actual = ["..", "hulk", "..", "hulk", "hulk", "..", ".."]
predicted = ["..", "hulk", "..", "thor", "ironman", "..", ".."]

          ______ ______ ______
Ironman  |  0   |   0  |  0   |
         |______|______|______|
Hulk     |  0+1 | 0+1  | 0+1  |
         |______|______|______|
Thor     |  0   |   0  |  0   |
         |______|______|______|
        Ironman   Hulk   Thor

Now for thor

actual = ["..", "..", "thor", "..", "..", "thor", ".."]
predicted = ["..", "..", "thor", "..", "..", "thor", ".."]

          ______ ______ ______
Ironman  |  0   |   0  |  0   |
         |______|______|______|
Hulk     |  0   |   0  |  0   |
         |______|______|______|
Thor     |  0   |   0  |0+1+1 |
         |______|______|______|
        Ironman   Hulk   Thor

Let’s combine all three results –

actual = ["ironman", "hulk", "thor", "hulk", "hulk", "thor", "ironman"]
predicted = ["ironman", "hulk", "thor", "thor", "ironman", "thor", "hulk"]

          ______ ______ ______
Ironman  |  1   |   1  |  0   |
         |______|______|______|
Hulk     |  1   |   1  |  1   |
         |______|______|______|
Thor     |  0   |   0  |  2   |
         |______|______|______|
        Ironman   Hulk   Thor
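
The same tally can be sketched in a few lines of plain Python. The names classes, index and board are only illustrative; the row and column order follows the boards above.

```python
actual = ["ironman", "hulk", "thor", "hulk", "hulk", "thor", "ironman"]
predicted = ["ironman", "hulk", "thor", "thor", "ironman", "thor", "hulk"]

# Row/column order used in the boards above.
classes = ["ironman", "hulk", "thor"]
index = {name: i for i, name in enumerate(classes)}

# Start from a 3x3 board of zeros.
board = [[0] * len(classes) for _ in classes]

# Rows are actual values, columns are predicted values.
for a, p in zip(actual, predicted):
    board[index[a]][index[p]] += 1

for name, row in zip(classes, board):
    print(f"{name:8}", row)
```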

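Finally we got the confusion matrix. You can see that every actual thor was predicted correctly, because the diagonal block is the only non-zero value in the Thor row. The Thor column, however, also contains a 1 in the Hulk row, which means one hulk was wrongly predicted as thor.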
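
To cross-check the hand-built matrix, the sketch below runs the same two arrays through scikit-learn's confusion_matrix. Passing labels=classes is needed to get the ironman, hulk, thor order. The per-row and per-column ratios (commonly called recall and precision) are an extra added here for illustration and are not part of the walkthrough above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

actual = ["ironman", "hulk", "thor", "hulk", "hulk", "thor", "ironman"]
predicted = ["ironman", "hulk", "thor", "thor", "ironman", "thor", "hulk"]
classes = ["ironman", "hulk", "thor"]

# labels= fixes the row/column order; without it scikit-learn sorts the
# class names (hulk, ironman, thor), which would not match the boards above.
cm = confusion_matrix(actual, predicted, labels=classes)

diag = np.diag(cm)
recall = diag / cm.sum(axis=1)      # share of each actual class predicted correctly
precision = diag / cm.sum(axis=0)   # share of each predicted class that was right

for name, r, p in zip(classes, recall, precision):
    print(f"{name:8} recall={r:.2f} precision={p:.2f}")
```

To plot this matrix with matching axis labels, pass the same list as display_labels, i.e. ConfusionMatrixDisplay(cm, display_labels=classes).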

Conclusion

In this article we learned about the ConfusionMatrixDisplay class. It is used to plot the data of a confusion matrix. We also discussed how this matrix is created from actual and predicted data.