Gnostic Tests¶
This tutorial covers advanced data diagnostics in the Machine Gnostics framework:
- Data Homogeneity Test: Analyze if your data sample is homogeneous using EGDF and PDF analysis.
- Data Scedasticity Test: Diagnose homoscedasticity or heteroscedasticity using gnostic variance and regression.
- Data Membership Test: Check if a value can be considered a member of a homogeneous sample.
Data Homogeneity Test¶
Analyze data homogeneity for EGDF objects using probability density function analysis.
The homogeneity criterion is based on the mathematical properties and expected PDF behavior of EGDF according to gnostic theory principles. Homogeneous data should produce a distribution with a single density maximum, while non-homogeneous data will exhibit multiple maxima or negative density values.
Homogeneity Test with Outlier
import numpy as np
from machinegnostics.magcal import DataHomogeneity, EGDF
# Example: homogeneous data with an outlier
data = np.array([-3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Example: non-homogeneous data
data = np.array([-13.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
egdf = EGDF(verbose=False)
egdf.fit(data=data)
dh = DataHomogeneity(gdf=egdf, verbose=False, flush=True)
is_homogeneous = dh.fit(plot=True)
print("Is the data homogeneous? ", is_homogeneous)
dh.results()
Output:
Is the data homogeneous? False
{'has_negative_pdf': np.True_,
'num_maxima': 1,
'extrema_type': 'maxima',
'gdf_type': 'egdf',
'is_homogeneous': False,
...}
Data Scedasticity Test¶
Gnostic Scedasticity Test for Homoscedasticity and Heteroscedasticity
This class provides a method to check for homoscedasticity and heteroscedasticity in data, inspired by fundamental principles rather than standard statistical tests. It uses gnostic variance and gnostic linear regression, which are based on the Machine Gnostics framework.
Scedasticity Test
import numpy as np
from machinegnostics.magcal import DataScedasticity
X = np.array([0., 0.4, 0.8, 1.2, 1.6, 2. ])
y = np.array([17.89408548, 69.61586934, -7.19890572, 9.37670866, -10.55673099, 16.57855348])
ds = DataScedasticity(verbose=False)
is_homoscedastic = ds.fit(x=X, y=y)
print("Is the data homoscedastic? ", is_homoscedastic)
Output:
Data Membership Test¶
Test whether a value can be considered a member of a homogeneous data sample using the EGDF framework.
Data Membership Test
import numpy as np
from machinegnostics.magcal import DataMembership, EGDF
data = np.array([-3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
egdf = EGDF(verbose=False)
egdf.fit(data=data)
dm = DataMembership(egdf=egdf, verbose=False)
lsb, usb = dm.fit()
print(f"Lower and Upper Sample Bounds to check Data Membership: LSB: {lsb}, USB: {usb}")
Output:
Tips¶
- All diagnostic classes (
DataHomogeneity
,DataScedasticity
,DataMembership
) follow a similar API: create an object, fit your data, and inspect results. - Use the
plot=True
option in.fit()
for visual diagnostics where available. - For advanced usage and parameter tuning, see the API Reference, scedasticity, and membership.
Next: Explore more tutorials and real-world examples in the Examples section!