A confusion matrix is a table used to evaluate the performance of a classification model in machine learning.
It's a simple yet powerful tool that helps us understand how well our model is doing.
The confusion matrix is made up of four quadrants: true positives, false positives, true negatives, and false negatives.
In the context of a binary classification problem, these quadrants represent the number of correct and incorrect predictions made by the model.
A true positive occurs when the model predicts a positive outcome and it's actually correct.
False positives happen when the model predicts a positive outcome but the actual outcome is negative.
True negatives occur when the model predicts a negative outcome and it's correct.
False negatives happen when the model predicts a negative outcome but the actual outcome is positive.
What is a Confusion Matrix?
A confusion matrix is a performance evaluation tool in machine learning that helps us understand how well a classification model is doing. It's like a report card for your model.
A confusion matrix is an N x N matrix, where N is the total number of target classes. For example, if we're dealing with a binary classification problem, we'd have a 2 x 2 matrix.
The matrix compares the actual target values with those predicted by the machine learning model, giving us a holistic view of how well our classification model is performing. This is especially useful when we want to identify misclassifications and improve predictive accuracy.
The four terms that make up the matrix, TP, TN, FP, and FN, are defined in the next section.
Why Do We Need It?
A confusion matrix displays the number of true positives, true negatives, false positives, and false negatives, which together describe how a classification model behaves on each class.
The matrix is particularly helpful for evaluating a model beyond a single accuracy number, especially when the dataset has an uneven class distribution.
Accuracy on its own can be misleading. In one widely cited example, a model that mostly predicted people would not get sick scored 96% accuracy, yet the sick people it missed kept spreading the virus. A confusion matrix gives a clearer view of a model's precision, recall, and overall ability to distinguish between classes.
Here are the key metrics that a confusion matrix provides:
- True Positive (TP): The model correctly predicted a positive outcome (the actual outcome was positive).
- True Negative (TN): The model correctly predicted a negative outcome (the actual outcome was negative).
- False Positive (FP): The model incorrectly predicted a positive outcome (the actual outcome was negative). Also known as a Type I error.
- False Negative (FN): The model incorrectly predicted a negative outcome (the actual outcome was positive). Also known as a Type II error.
These metrics are crucial in understanding the performance of a classification model and identifying areas for improvement.
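To make the point about uneven class distributions concrete, here is a minimal sketch (with made-up labels, assuming scikit-learn is available) of a model that always predicts the negative class: its accuracy looks excellent while it catches no positive cases at all.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Hypothetical, heavily imbalanced dataset: 96 healthy people (0) and 4 sick people (1).
y_true = [0] * 96 + [1] * 4
# A lazy model that predicts "healthy" for everyone.
y_pred = [0] * 100

print(accuracy_score(y_true, y_pred))   # 0.96 -- looks great
print(recall_score(y_true, y_pred))     # 0.0  -- every sick person is missed
print(confusion_matrix(y_true, y_pred))
# [[96  0]
#  [ 4  0]]
```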
Calculating Confusion Matrix Values
Calculating the confusion matrix values is a crucial step in evaluating a classification model. The four values you need are True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN).
To start, you'll need to know the actual target values and the predicted values by your machine learning model. For a binary classification problem, the confusion matrix will be a 2x2 matrix.
Each of the four values is simply a count of samples:
- TP: The number of samples that were correctly predicted as positive.
- FP: The number of samples that were incorrectly predicted as positive.
- FN: The number of samples that were incorrectly predicted as negative.
- TN: The number of samples that were correctly predicted as negative.
As an example, suppose a 2-class model produced the following results on 100 samples:

|                 | Predicted Positive | Predicted Negative |
|-----------------|--------------------|--------------------|
| Actual Positive | 12 (TP)            | 5 (FN)             |
| Actual Negative | 3 (FP)             | 80 (TN)            |

From this table, the values are:
- TP: 12 (samples correctly predicted as positive)
- FP: 3 (samples incorrectly predicted as positive)
- FN: 5 (samples incorrectly predicted as negative)
- TN: 80 (samples correctly predicted as negative)
By calculating these values, you can get a clear picture of how well your classification model is performing.
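If you already have the actual and predicted labels, you don't need to count these by hand. Here is a minimal sketch using scikit-learn's confusion_matrix with illustrative labels (substitute your own arrays):

```python
from sklearn.metrics import confusion_matrix

# Illustrative binary labels (1 = positive, 0 = negative).
y_true = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# For binary labels [0, 1], scikit-learn lays the matrix out as
# [[TN, FP],
#  [FN, TP]], so ravel() unpacks the counts in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp}, FP={fp}, FN={fn}, TN={tn}")  # TP=4, FP=1, FN=1, TN=4
```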
Types of Confusion Matrices
Confusion matrices come in different sizes depending on how many classes the model predicts. For binary classification, the matrix is a 2x2 confusion matrix; a common illustration is an image-recognition task where the model decides whether each image is a dog.
In that setting, a True Positive (TP) is the total count of instances where both the predicted and actual values are 'Dog', while a False Negative (FN) is the total count of instances where the prediction is 'Not Dog' but the actual value is 'Dog'.
Binary
Binary classification is a fundamental concept in machine learning, and it's essential to understand how to evaluate its performance using a confusion matrix.
A 2X2 confusion matrix is commonly used for binary classification, as seen in the image recognition example. This matrix helps us understand the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
True positives occur when both the predicted and actual values are positive, such as when a dog is correctly identified as a dog. In the image recognition example, there are 8 true positives.
True negatives happen when both the predicted and actual values are negative, like when a non-dog is correctly identified as a non-dog. The image recognition example shows 7 true negatives.
False positives occur when the model predicts the positive class for a negative instance, such as when a non-dog image is predicted as a dog.
False negatives happen when the model predicts the negative class for a positive instance, such as when a dog is predicted as a non-dog. The image recognition example shows 3 false negatives.
Here's a summary of the confusion matrix terms:

|                 | Predicted: Dog      | Predicted: Not Dog  |
|-----------------|---------------------|---------------------|
| Actual: Dog     | True Positive (TP)  | False Negative (FN) |
| Actual: Not Dog | False Positive (FP) | True Negative (TN)  |
Understanding these terms is crucial for evaluating the performance of a binary classification model.
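To reproduce a matrix like this in code, confusion_matrix also accepts string labels, and its labels argument fixes the row and column order. The snippet below is only a sketch: the 8 true positives, 3 false negatives, and 7 true negatives come from the example above, while the 2 false positives are an assumed value, since the text doesn't give one.

```python
from sklearn.metrics import confusion_matrix

# Counts from the example above (TP=8, FN=3, TN=7); the FP=2 is an assumption.
y_true = ["Dog"] * 8 + ["Dog"] * 3 + ["Not Dog"] * 2 + ["Not Dog"] * 7
y_pred = ["Dog"] * 8 + ["Not Dog"] * 3 + ["Dog"] * 2 + ["Not Dog"] * 7

# labels=... pins "Dog" to the first row/column, so the layout is
# [[TP, FN],
#  [FP, TN]] with "Dog" treated as the positive class.
print(confusion_matrix(y_true, y_pred, labels=["Dog", "Not Dog"]))
# [[8 3]
#  [2 7]]
```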
Multi-Class
In a multi-class classification problem, we have more than two possible classes for our model to predict. This is in contrast to binary classification, where we only have two classes.
The confusion matrix for multi-class classification expands to accommodate these additional classes. The rows represent the actual classes (ground truth) in our dataset, while the columns represent the predicted classes by our model.
Each cell within the matrix shows the count of instances with a particular actual class (row) and predicted class (column), so the diagonal holds the correct predictions and the off-diagonal cells hold the misclassifications. A 3x3 confusion matrix, covering three classes, is a common example.
To calculate the true positive (TP), false negative (FN), false positive (FP), and true negative (TN) values for a given class, read them off the matrix as follows:
- TP: the diagonal cell for that class.
- FN: the sum of the remaining cells in that class's row.
- FP: the sum of the remaining cells in that class's column.
- TN: the sum of all cells outside that class's row and column.
In a multi-class classification problem, we can use metrics like precision, recall, and f1-score to evaluate the model's performance for each class. Here's an example classification report for a 3-class problem:

| Class   | Precision | Recall | F1-score | Support |
|---------|-----------|--------|----------|---------|
| Class 1 | 0.80      | 0.80   | 0.80     | 10      |
| Class 2 | 0.77      | 0.83   | 0.80     | 12      |
| Class 3 | 0.89      | 0.80   | 0.84     | 10      |
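Here is a hedged sketch of the per-class calculation for a 3-class problem. The label lists are made up for illustration, but the TP/FN/FP/TN bookkeeping follows the row/column rule described above:

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Illustrative 3-class labels; substitute your own model's output.
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 0, 1]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0, 0, 1]

cm = confusion_matrix(y_true, y_pred)   # 3x3 matrix: rows = actual, columns = predicted

# Per-class values, following the rule described above.
tp = np.diag(cm)                        # diagonal cell for each class
fn = cm.sum(axis=1) - tp                # rest of each class's row
fp = cm.sum(axis=0) - tp                # rest of each class's column
tn = cm.sum() - (tp + fn + fp)          # everything outside the row and column
print(tp, fn, fp, tn)

# Per-class precision, recall, and f1-score in one call.
print(classification_report(y_true, y_pred))
```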
Precision and Recall
Precision tells us how many of the cases predicted as positive actually turned out to be positive: Precision = TP / (TP + FP). In the worked sketch below, 50% of the predicted positive cases are truly positive.
Precision is the metric to watch when a False Positive is a bigger concern than a False Negative. This is especially true in music or video recommendation systems, e-commerce websites, and similar applications, where wrong recommendations can lead to customer churn and hurt the business.
Recall, on the other hand, tells us how many of the actual positive cases the model predicted correctly: Recall = TP / (TP + FN). In the same sketch, 75% of the positives are successfully identified.
Recall is the metric to watch when a False Negative is worse than a False Positive. This is particularly important in medical screening, where it doesn't matter much if we raise a false alarm, but the actual positive cases must not go undetected!
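As a worked sketch of those two formulas, assume counts chosen so the percentages above fall out (TP=30, FP=30, FN=10; these numbers are an assumption for illustration, not taken from the tables in this article):

```python
# Assumed counts, chosen only so that precision = 50% and recall = 75%.
tp, fp, fn = 30, 30, 10

precision = tp / (tp + fp)   # share of predicted positives that are truly positive
recall = tp / (tp + fn)      # share of actual positives that the model caught

print(f"precision = {precision:.2f}")  # 0.50
print(f"recall    = {recall:.2f}")     # 0.75
```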
F1-Score and Other Metrics
The F1-score is the harmonic mean of Precision and Recall, combining the two metrics into a single number. For a given total of precision and recall, it is highest when the two are equal.
One drawback is that the F1-score by itself doesn't tell us what the classifier is trading off: a middling score could come from high precision and low recall, or the other way around.
We therefore use the F1 score in combination with other evaluation metrics to get a complete picture of the result. It is often more informative than accuracy, especially when the class distribution is uneven.
Because it is a harmonic mean, the F1 score takes both false positives and false negatives into account.
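Continuing the sketch above, the formula F1 = 2 * (precision * recall) / (precision + recall) gives:

```python
precision, recall = 0.50, 0.75   # values from the precision/recall sketch above

# Harmonic mean of precision and recall
f1 = 2 * (precision * recall) / (precision + recall)
print(round(f1, 2))  # 0.6
```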
Example and Implementation
Let's dive into an example of a confusion matrix, with a small pandas implementation at the end of this section.
In a Confusion Matrix, we have four main values: True Positive (TP), False Negative (FN), False Positive (FP), and True Negative (TN).
For instance, if we have a model that predicts whether an image is an apple or grapes, a True Positive means the actual value and the predicted value are the same, like when the actual value is an apple and the model predicts it as an apple.
A False Negative occurs when the actual value is positive, but the model predicts it as negative, such as when the actual value is an apple but the model predicts it as grapes.
False Positive happens when the actual value is negative, but the model predicts it as positive, like when the actual value is grapes but the model predicts it as an apple.
True Negative means the actual value and the predicted value are the same, like when the actual value is grapes and the model predicts it as grapes.
Here's a breakdown of the values mentioned in the example: TP=5, FN=3, FP=2, TN=5.
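Here is a minimal pandas sketch that reproduces those counts with pd.crosstab; the apple/grapes labels and their ordering are purely illustrative:

```python
import pandas as pd

# Labels arranged to reproduce TP=5, FN=3, FP=2, TN=5, with "apple" as the positive class.
actual    = ["apple"] * 5 + ["apple"] * 3 + ["grapes"] * 2 + ["grapes"] * 5
predicted = ["apple"] * 5 + ["grapes"] * 3 + ["apple"] * 2 + ["grapes"] * 5

confusion = pd.crosstab(
    pd.Series(actual, name="Actual"),
    pd.Series(predicted, name="Predicted"),
)
print(confusion)
# Predicted  apple  grapes
# Actual
# apple          5       3
# grapes         2       5
```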
Important Terms and Concepts
In a confusion matrix, there are four important terms to understand: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).
A True Positive (TP) occurs when the predicted class matches the actual class and the actual value is positive.
A True Negative (TN) occurs when the predicted class matches the actual class and the actual value is negative.
A False Positive (FP) happens when the actual value is negative but the model predicts a positive value.
A False Negative (FN) occurs when the actual value is positive but the model predicts a negative value.
Here's a brief summary of these terms in a table:

| Term                | Predicted | Actual   |
|---------------------|-----------|----------|
| True Positive (TP)  | Positive  | Positive |
| True Negative (TN)  | Negative  | Negative |
| False Positive (FP) | Positive  | Negative |
| False Negative (FN) | Negative  | Positive |
Sources
- https://www.sharpsightlabs.com/blog/sklearn-confusion_matrix-explained/
- https://www.jcchouinard.com/confusion-matrix-in-scikit-learn/
- https://www.analyticsvidhya.com/articles/confusion-matrix-in-machine-learning/
- https://www.geeksforgeeks.org/confusion-matrix-machine-learning/
- https://www.analyticsvidhya.com/blog/2021/06/confusion-matrix-for-multi-class-classification/