
confusion_matrix: Confusion Matrix Metric

The confusion_matrix function computes the confusion matrix for classification tasks. This metric provides a summary of prediction results, showing how many samples were correctly or incorrectly classified for each class.


Overview

A confusion matrix is a table that is often used to describe the performance of a classification model.
  • Each row represents the actual class.
  • Each column represents the predicted class.
  • Entry (i, j) is the number of samples with true label i and predicted label j.

This metric helps you understand the types of errors your classifier is making and is essential for evaluating classification accuracy, precision, recall, and other related metrics.
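
The counting rule described above can be sketched in a few lines of NumPy. This is only an illustrative reference implementation of the general idea, not the code used by machinegnostics.metrics:

import numpy as np

def confusion_matrix_sketch(y_true, y_pred, labels=None):
    # Use the provided labels, otherwise all labels seen in either array,
    # in sorted order (np.unique returns sorted values).
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if labels is None:
        labels = np.unique(np.concatenate([y_true, y_pred]))
    index = {label: i for i, label in enumerate(labels)}

    # Entry (i, j) counts samples with true label i and predicted label j.
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    return cm

print(confusion_matrix_sketch([2, 0, 2, 2, 0, 1], [0, 0, 2, 2, 0, 2]))
# [[2 0 0]
#  [0 0 1]
#  [1 0 2]]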


Parameters

  • y_true : array-like or pandas Series (required)
    Ground truth (correct) target values. Shape: (n_samples,).
  • y_pred : array-like or pandas Series (required)
    Estimated targets as returned by a classifier. Shape: (n_samples,).
  • labels : array-like (default: None)
    List of labels to index the matrix. If None, all labels are used in sorted order.
  • verbose : bool (default: False)
    If True, enables detailed logging for debugging.

Returns

  • ndarray
    Confusion matrix of shape (n_classes, n_classes). Entry (i, j) is the number of samples with true label i and predicted label j.

Raises

  • ValueError
    If input arrays are not the same shape, are empty, are not 1D, or contain NaN/Inf.

Example Usage

import numpy as np
from machinegnostics.metrics import confusion_matrix

# Example 1: Basic usage
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred)
print(cm)
# Output:
# array([[2, 0, 0],
#        [0, 0, 1],
#        [1, 0, 2]])

# Example 2: With custom labels
cm_custom = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(cm_custom)
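
# Example 3: Wrap the result in a pandas DataFrame for labelled output.
# (Usage sketch only; assumes pandas is installed alongside numpy.)
import pandas as pd

cm_df = pd.DataFrame(cm_custom, index=[0, 1, 2], columns=[0, 1, 2])
print(cm_df)
#    0  1  2
# 0  2  0  0
# 1  0  0  1
# 2  1  0  2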

Notes

  • The function supports input as lists, numpy arrays, or pandas Series.
  • Both y_true and y_pred must be 1D, have the same shape, and must not be empty or contain NaN/Inf.
  • If labels is not provided, all unique labels in y_true and y_pred are used in sorted order.
  • The confusion matrix is essential for computing other metrics such as accuracy, precision, recall, and F1-score.
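
As a quick illustration of the last point, these metrics can be read directly off the matrix. The following is a minimal NumPy sketch using the matrix from Example 1 above; it is not part of the machinegnostics API:

import numpy as np

cm = np.array([[2, 0, 0],
               [0, 0, 1],
               [1, 0, 2]])

# Overall accuracy: correctly classified samples / all samples.
accuracy = np.trace(cm) / cm.sum()

# Per-class precision and recall (rows are true labels, columns are predictions).
# A class that is never predicted gives a 0/0 division, which becomes NaN.
with np.errstate(divide="ignore", invalid="ignore"):
    precision = np.diag(cm) / cm.sum(axis=0)
    recall = np.diag(cm) / cm.sum(axis=1)
    f1 = 2 * precision * recall / (precision + recall)

print(accuracy)   # ≈ 0.667
print(precision)  # ≈ [0.667   nan 0.667]
print(recall)     # ≈ [1.0   0.0   0.667]
print(f1)         # ≈ [0.8     nan 0.667]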

Author: Nirmal Parmar
Date: 2025-09-24