classification_report: Classification Metrics Summary
The `classification_report` function generates a comprehensive summary of key classification metrics (precision, recall, F1 score, and support) for each class in your dataset. It supports both string and dictionary output formats, making it suitable for human-readable reports as well as programmatic analysis.
Overview
This function provides a detailed breakdown of classifier performance for each class, including:
- Precision: Proportion of positive identifications that were actually correct.
- Recall: Proportion of actual positives that were correctly identified.
- F1 Score: Harmonic mean of precision and recall.
- Support: Number of true instances for each class.
It also computes weighted averages across all classes.
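To make these definitions concrete, the sketch below computes the per-class counts and metrics by hand for the small example used later on this page. It is an illustration of the formulas above, not the library's internal implementation.

```python
from collections import Counter

y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 2, 0]

support = Counter(y_true)  # number of true instances per class

for cls in sorted(set(y_true) | set(y_pred)):
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))  # true positives
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0  # correct share of predicted positives
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of actual positives recovered
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)           # harmonic mean of precision and recall

    print(cls, round(precision, 2), round(recall, 2), round(f1, 2), support[cls])
```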
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
`y_true` | array-like or pandas Series | — | Ground truth (correct) target values. Shape: (n_samples,) |
`y_pred` | array-like or pandas Series | — | Estimated target values as returned by a classifier. Shape: (n_samples,) |
`labels` | array-like or None | None | List of labels to include in the report. If None, uses the sorted unique labels from `y_true` and `y_pred`. |
`target_names` | list of str or None | None | Optional display names matching the labels (same order). |
`digits` | int | 2 | Number of digits for formatting output. |
`output_dict` | bool | False | If True, return output as a dict. If False, return a formatted string. |
`zero_division` | {0, 1, 'warn'} | 0 | Value to return when there is a zero division (no predicted samples for a class). |
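The optional parameters can be combined to control which classes appear in the report and how they are displayed. The call below is a sketch that assumes the keyword arguments behave as described in the table; the exact formatting of the resulting report may differ between versions.

```python
from machinegnostics.metrics import classification_report

y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 2, 0]

# Report only classes 0 and 2, displayed under custom names, with three
# decimal places; zero_division=0 silences undefined-metric cases.
print(classification_report(
    y_true,
    y_pred,
    labels=[0, 2],
    target_names=["class_0", "class_2"],
    digits=3,
    zero_division=0,
))
```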
Returns
- report: `str` or `dict`. Text summary or dictionary of the precision, recall, F1 score, and support for each class.
Example Usage
```python
from machinegnostics.metrics import classification_report

y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 2, 0]

# String report
print(classification_report(y_true, y_pred))

# Dictionary report
report_dict = classification_report(y_true, y_pred, output_dict=True)
print(report_dict)
```
Output Example
String Output:
```
Class        Precision    Recall  F1-score   Support
==========================================================
0                 0.67      1.00      0.80         2
1                 0.00      0.00      0.00         1
2                 1.00      1.00      1.00         2
==========================================================
Avg/Total         0.67      0.80      0.72         5
```
Dictionary Output:
```python
{
    '0': {'precision': 0.67, 'recall': 1.0, 'f1-score': 0.8, 'support': 2},
    '1': {'precision': 0.0, 'recall': 0.0, 'f1-score': 0.0, 'support': 1},
    '2': {'precision': 1.0, 'recall': 1.0, 'f1-score': 1.0, 'support': 2},
    'avg/total': {'precision': 0.67, 'recall': 0.8, 'f1-score': 0.72, 'support': 5}
}
```
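Because the dictionary uses plain nested keys ('precision', 'recall', 'f1-score', 'support', plus 'avg/total'), it is easy to consume programmatically. The snippet below is a sketch of two common patterns: pulling out a single metric and tabulating the whole report with pandas (pandas is not required by the function itself).

```python
import pandas as pd
from machinegnostics.metrics import classification_report

y_true = [0, 1, 2, 2, 0]
y_pred = [0, 0, 2, 2, 0]

report_dict = classification_report(y_true, y_pred, output_dict=True)

# Single metric for one class.
f1_class_0 = report_dict['0']['f1-score']

# Weighted-average F1 across all classes.
weighted_f1 = report_dict['avg/total']['f1-score']
print(f1_class_0, weighted_f1)

# One row per class (and the average row), one column per metric.
df = pd.DataFrame(report_dict).T
print(df)
```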
Notes
- The function uses `precision_score`, `recall_score`, and `f1_score` from the Machine Gnostics metrics module for consistency.
- If `target_names` is provided, its length must match the number of labels.
- For imbalanced datasets, the weighted average provides a more informative summary than the unweighted mean.
- The `zero_division` parameter controls the behavior when a class has no predicted samples.
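As a follow-up to the last note, here is a minimal sketch of that behavior. It assumes the dictionary keys shown earlier and that `zero_division` substitutes the given value whenever a class has no predicted samples, as described in the parameters table.

```python
from machinegnostics.metrics import classification_report

# Class 1 appears in y_true but is never predicted, so its precision
# involves a division by zero (no predicted samples for that class).
y_true = [0, 1, 1, 0]
y_pred = [0, 0, 0, 0]

# With zero_division=0 the undefined precision is reported as 0.0.
report_zero = classification_report(y_true, y_pred, output_dict=True, zero_division=0)
print(report_zero['1']['precision'])  # expected: 0.0

# With zero_division=1 it is reported as 1.0 instead.
report_one = classification_report(y_true, y_pred, output_dict=True, zero_division=1)
print(report_one['1']['precision'])  # expected: 1.0
```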