confusion_matrix: Confusion Matrix Metric¶
The confusion_matrix function computes the confusion matrix for classification tasks. This metric provides a summary of prediction results, showing how many samples were correctly or incorrectly classified for each class.
Overview¶
A confusion matrix is a table that summarizes the performance of a classification model by counting samples for every pair of actual and predicted class:
- Each row represents the actual class.
- Each column represents the predicted class.
- Entry (i, j) is the number of samples with true label i and predicted label j.
This metric helps you understand the types of errors your classifier is making and is essential for evaluating classification accuracy, precision, recall, and other related metrics.
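The row and column convention described above can be illustrated with a minimal NumPy sketch. This is not the machinegnostics implementation: `build_confusion_matrix` is a hypothetical helper, and its handling of `labels` (sorted union by default, pairs with unlisted labels skipped) is an assumption made only for illustration.

```python
import numpy as np

def build_confusion_matrix(y_true, y_pred, labels=None):
    """Illustrative only: count (true, predicted) pairs into a square matrix."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if labels is None:
        # Assumption: default labels are the sorted union of both inputs.
        labels = np.unique(np.concatenate([y_true, y_pred]))
    else:
        labels = np.asarray(labels)
    # Map each label to its row/column position.
    index = {label: i for i, label in enumerate(labels.tolist())}
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true.tolist(), y_pred.tolist()):
        # Pairs with a label outside `labels` are skipped in this sketch.
        if t in index and p in index:
            cm[index[t], index[p]] += 1  # row = true label, column = predicted label
    return cm

cm = build_confusion_matrix([2, 0, 2, 2, 0, 1], [0, 0, 2, 2, 0, 2])
print(cm)
# [[2 0 0]
#  [0 0 1]
#  [1 0 2]]
```

Reading the last row: of the three samples whose true label is 2, one was predicted as 0 and two were predicted as 2.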
Parameters¶
Parameter | Type | Description | Default |
---|---|---|---|
y_true | array-like or pandas Series | Ground truth (correct) target values. Shape: (n_samples,) | Required |
y_pred | array-like or pandas Series | Estimated targets as returned by a classifier. Shape: (n_samples,) | Required |
labels | array-like | List of labels to index the matrix. If None, uses all labels in sorted order. | None |
verbose | bool | If True, enables detailed logging for debugging. | False |
Returns¶
- ndarray: Confusion matrix of shape (n_classes, n_classes). Entry (i, j) is the number of samples with true label i and predicted label j.
Raises¶
- ValueError: If the input arrays are not the same shape, are empty, are not 1D, or contain NaN/Inf.
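The documented error conditions can be mirrored in a short validation sketch. `check_inputs` below is a hypothetical helper, not part of machinegnostics. The order of the checks, the error messages, and the choice to skip the NaN/Inf test for non-numeric labels are all assumptions.

```python
import numpy as np

def check_inputs(y_true, y_pred):
    """Hypothetical validator mirroring the documented ValueError conditions."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.ndim != 1 or y_pred.ndim != 1:
        raise ValueError("y_true and y_pred must be 1D")
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    if y_true.size == 0:
        raise ValueError("inputs must not be empty")
    for name, arr in (("y_true", y_true), ("y_pred", y_pred)):
        # np.isfinite only applies to numeric dtypes; other label types are skipped here.
        if np.issubdtype(arr.dtype, np.number) and not np.all(np.isfinite(arr)):
            raise ValueError(f"{name} contains NaN or Inf")
    return y_true, y_pred

try:
    check_inputs([0, 1, 2], [0, 1])  # mismatched lengths
except ValueError as err:
    print(err)  # y_true and y_pred must have the same shape
```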
Example Usage¶
import numpy as np
from machinegnostics.metrics import confusion_matrix
# Example 1: Basic usage
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred)
print(cm)
# Output:
# [[2 0 0]
#  [0 0 1]
#  [1 0 2]]
# Example 2: With custom labels
cm_custom = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(cm_custom)
Notes¶
- The function supports input as lists, numpy arrays, or pandas Series.
- Both y_true and y_pred must be 1D, have the same shape, and must not be empty or contain NaN/Inf.
- If labels is not provided, all unique labels in y_true and y_pred are used in sorted order.
- The confusion matrix is essential for computing other metrics such as accuracy, precision, recall, and F1-score.
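As a rough illustration of how those metrics follow from the matrix, the sketch below derives accuracy and per-class precision, recall, and F1 from the matrix in Example 1 using plain NumPy. It does not call any machinegnostics metric function, and reporting 0.0 for a class with no predictions or no true samples is an assumed convention.

```python
import numpy as np

# Confusion matrix from Example 1 (rows: true labels, columns: predicted labels).
cm = np.array([[2, 0, 0],
               [0, 0, 1],
               [1, 0, 2]])

tp = np.diag(cm).astype(float)   # correct predictions per class
predicted = cm.sum(axis=0)       # column sums: how often each class was predicted
actual = cm.sum(axis=1)          # row sums: how often each class truly occurs

accuracy = tp.sum() / cm.sum()
# Assumed convention: undefined ratios (zero denominator) are reported as 0.0.
precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
denom = precision + recall
f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)

print(round(accuracy, 3))        # 0.667
print(np.round(precision, 3))    # [0.667 0.    0.667]
print(np.round(recall, 3))       # [1.    0.    0.667]
print(np.round(f1, 3))           # [0.8   0.    0.667]
```

Class 1 is never predicted, so its column sum is zero and its precision falls back to the assumed 0.0.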
Author: Nirmal Parmar
Date: 2025-09-24