correlation: Gnostic Correlation Metric¶
The correlation
function computes the Gnostic correlation coefficient between two data samples. This metric provides a robust, assumption-free measure of association, leveraging Mathematical Gnostics to deliver reliable results even in the presence of noise, outliers, or non-Gaussian data.
Overview¶
The Gnostic correlation metric measures the association between two arrays using either estimation geometry (EGDF) or quantifying geometry (QGDF):
- Case
'i'
: Uses estimation geometry (EGDF) for correlation. - Case
'j'
: Uses quantifying geometry (QGDF) for correlation.
Unlike classical correlation coefficients, the Gnostic approach is resilient to data uncertainty and is meaningful for both small and large datasets.
Parameters¶
Parameter | Type | Description |
---|---|---|
X |
array-like | Feature data sample (1D array or single column from 2D array, no NaN/Inf). |
y |
array-like | Target data sample (1D array, no NaN/Inf). |
case |
str | 'i' for estimation geometry, 'j' for quantifying geometry. Default: 'i' . |
verbose |
bool | If True, enables detailed logging for debugging. Default: False . |
Returns¶
- float
The Gnostic correlation coefficient between the two data samples.
Raises¶
- TypeError
IfX
ory
are not array-like. - ValueError
If input arrays are empty, contain NaN/Inf, are not 1D, have mismatched shapes, or ifcase
is not'i'
or'j'
.
Example Usage¶
from machinegnostics.metrics import correlation
import numpy as np
# Example 1: Simple 1D arrays
X = np.array([1, 2, 3, 4, 5])
y = np.array([5, 4, 3, 2, 1])
corr = correlation(X, y, case='i')
print(corr)
# Example 2: Multi-column X
X = np.array([[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]])
y = np.array([5, 4, 3, 2, 1])
for i in range(X.shape[1]):
corr = correlation(X[:, i], y)
print(f"Correlation for column {i}: {corr}")
Notes¶
- Both
X
andy
must be 1D arrays of the same length, with no missing or infinite values. - For multi-column
X
, pass each column separately (e.g.,X[:, i]
). - The metric is robust to outliers and provides meaningful estimates even for small or noisy datasets.
- If data homogeneity is not met, the function will adjust parameters and issue a warning for best results.
- For optimal results, ensure your data is preprocessed and cleaned.
Gnostic vs. Classical Correlation¶
Note: Unlike classical correlation metrics that rely on statistical means and linear relationships, the Gnostic correlation uses irrelevance measures derived from gnostic theory. This approach is assumption-free and designed to reveal true data relationships, even in the presence of outliers or non-normal distributions.
Author: Nirmal Parmar
Date: 2025-09-24