robr2: Robust R-squared (RobR2) Metric
The robr2 function computes the Robust R-squared (RobR2) value for evaluating the goodness of fit between observed data and model predictions. Unlike the classical R-squared metric, RobR2 incorporates sample weights, so outlying points can be down-weighted. This makes it better suited to noisy or irregular datasets.
Overview
Robust R-squared (RobR2) measures the proportion of variance in the observed data explained by the fitted data, while reducing sensitivity to outliers. This is achieved by using a weighted formulation, which allows for more reliable model evaluation in real-world scenarios where data may not be perfectly clean.
Formula
\[
\text{RobR2} = 1 - \frac{\sum_i w_i (e_i - \bar{e})^2}{\sum_i w_i (y_i - \bar{y})^2}
\]
Where:
- \(e_i = y_i - \hat{y}_i\) (residuals)
- \(\bar{e}\) = weighted mean of residuals
- \(\bar{y}\) = weighted mean of observed data
- \(w_i\) = weight for each data point
If weights are not provided, equal weights are assumed.
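To make the formula concrete, here is a minimal NumPy sketch that evaluates it term by term. The name `robr2_sketch` is hypothetical and this is not the library's implementation; it assumes 1D inputs of equal length and a non-zero denominator.

```python
import numpy as np

def robr2_sketch(y, y_fit, w=None):
    """Illustrative weighted R-squared following the formula above."""
    y = np.asarray(y, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    # Equal weights are assumed when none are provided
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)

    e = y - y_fit                       # residuals e_i
    e_bar = np.sum(w * e) / np.sum(w)   # weighted mean of residuals
    y_bar = np.sum(w * y) / np.sum(w)   # weighted mean of observed data

    numerator = np.sum(w * (e - e_bar) ** 2)
    denominator = np.sum(w * (y - y_bar) ** 2)
    return 1.0 - numerator / denominator

y = np.array([1.0, 2.0, 3.0, 4.0])
y_fit = np.array([1.1, 1.9, 3.2, 3.8])
value = robr2_sketch(y, y_fit)
print(round(value, 2))  # 0.98
```

Because the residuals are centered on their weighted mean, a constant offset between `y` and `y_fit` does not lower the value in this sketch.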
Parameters
Parameter | Type | Description |
---|---|---|
y | np.ndarray | Observed data (ground truth). 1D array of numerical values. |
y_fit | np.ndarray | Fitted data (model predictions). 1D array, same shape as y. |
w | np.ndarray or None | Optional weights for data points. 1D array, same shape as y. If None, equal weights are used. |
Returns
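- float The computed Robust R-squared (RobR2) value. A value of 1 indicates a perfect fit and a value near 0 indicates no explanatory power. Going by the formula above, the value is not bounded below: it becomes negative when the weighted variance of the residuals exceeds that of the observed data.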
Raises
- ValueError
  - If y and y_fit do not have the same shape.
  - If w is provided and does not have the same shape as y.
  - If y or y_fit are not 1D arrays.
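The conditions listed above could be checked along the following lines. This is only a sketch: `check_inputs` and `raises_value_error` are hypothetical helpers, and the order of the checks and the error messages are assumptions, not the library's actual behavior.

```python
import numpy as np

def check_inputs(y, y_fit, w=None):
    """Hypothetical helper mirroring the documented ValueError conditions."""
    y = np.asarray(y)
    y_fit = np.asarray(y_fit)
    if y.ndim != 1 or y_fit.ndim != 1:
        raise ValueError("y and y_fit must be 1D arrays.")
    if y.shape != y_fit.shape:
        raise ValueError("y and y_fit must have the same shape.")
    if w is not None and np.asarray(w).shape != y.shape:
        raise ValueError("w must have the same shape as y.")

def raises_value_error(*args):
    """Return True if check_inputs rejects the given arguments."""
    try:
        check_inputs(*args)
    except ValueError:
        return True
    return False

shape_mismatch = raises_value_error([1.0, 2.0, 3.0], [1.0, 2.0])
not_1d = raises_value_error([[1.0, 2.0]], [[1.0, 2.0]])
bad_weights = raises_value_error([1.0, 2.0], [1.0, 2.0], [1.0])
valid_rejected = raises_value_error([1.0, 2.0], [1.1, 1.9], [1.0, 1.0])
print(shape_mismatch, not_1d, bad_weights, valid_rejected)  # True True True False
```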
Example Usage
import numpy as np
from machinegnostics.metrics import robr2
y = np.array([1.0, 2.0, 3.0, 4.0])
y_fit = np.array([1.1, 1.9, 3.2, 3.8])
w = np.array([1.0, 1.0, 1.0, 1.0])
result = robr2(y, y_fit, w)
print(result) # Example output: 0.98
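The example value can be checked by hand against the formula. With equal weights, the residuals are \(e = (-0.1,\ 0.1,\ -0.2,\ 0.2)\), so \(\bar{e} = 0\) and \(\bar{y} = 2.5\):

\[
\begin{aligned}
\sum_i w_i (e_i - \bar{e})^2 &= 0.01 + 0.01 + 0.04 + 0.04 = 0.10 \\
\sum_i w_i (y_i - \bar{y})^2 &= 2.25 + 0.25 + 0.25 + 2.25 = 5.00 \\
\text{RobR2} &= 1 - \frac{0.10}{5.00} = 0.98
\end{aligned}
\]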
Comparison with Classical R-squared
- Classical R-squared: Assumes equal weights and is sensitive to outliers.
- RobR2: Incorporates weights and is robust to outliers, making it more reliable for datasets with irregularities or noise.
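The difference can be illustrated with a small sketch that applies the documented formula twice to data containing one outlier: once with equal weights, which reduces to the classical behavior, and once with the outlier given zero weight. The helper `weighted_r2` is hypothetical and the weights are hand-picked for illustration. How weights are obtained in practice is not covered here.

```python
import numpy as np

def weighted_r2(y, y_fit, w):
    """Illustrative weighted R-squared following the documented formula."""
    e = y - y_fit
    e_bar = np.average(e, weights=w)   # weighted mean of residuals
    y_bar = np.average(y, weights=w)   # weighted mean of observed data
    return 1.0 - np.sum(w * (e - e_bar) ** 2) / np.sum(w * (y - y_bar) ** 2)

# The last observation is an outlier that the model does not follow
y = np.array([1.0, 2.0, 3.0, 4.0, 20.0])
y_fit = np.array([1.1, 1.9, 3.2, 3.8, 5.0])

equal_w = np.ones_like(y)                       # classical behavior
down_w = np.array([1.0, 1.0, 1.0, 1.0, 0.0])    # outlier weighted out by hand

classical = weighted_r2(y, y_fit, equal_w)
robust = weighted_r2(y, y_fit, down_w)
print(round(classical, 4), round(robust, 4))  # 0.2796 0.98
```

With equal weights the single outlier dominates both sums and drags the score down. Once it is weighted out, the score reflects the fit on the remaining points.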
References
- Kovanic P., Humber M.B. (2015). The Economics of Information - Mathematical Gnostics for Data Analysis, Chapter 19.
Notes
- If weights are not provided, the metric defaults to equal weighting for all data points.
- RobR2 is particularly useful for robust regression and model evaluation in the presence of outliers.