IntervalAnalysis: Marginal Interval Analysis (Machine Gnostics)

The IntervalAnalysis class provides robust, adaptive interval estimation with diagnostics for Gnostic Distribution Functions (GDFs) such as ELDF, EGDF, QLDF, and QGDF. It estimates meaningful data intervals (the tolerance interval and the typical interval) from the behavior of the GDF's central parameter (Z0) as the data is extended, enforces ordering constraints on the resulting bounds, and records detailed diagnostics.
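To make "the behavior of Z0 as the data is extended" more concrete, here is a toy sketch. It is not the library's algorithm. The central value is approximated by the mode of a plain Gaussian kernel density found by grid search, which only stands in for the GDF's Z0. The names `toy_center` and `toy_intervals`, the bandwidth, and the sample data are all hypothetical. One extra datum is swept across a range: the lowest and highest central values reached play the role of the tolerance interval, and the datum positions that produce those extremes play the role of the typical interval. This is one common reading of gnostic interval analysis that this page does not spell out.

```python
import math

def toy_center(values, grid, bandwidth=1.0):
    """Grid point with the highest Gaussian kernel density (a stand-in for Z0)."""
    def density(x):
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
    return max(grid, key=density)

def toy_intervals(data, bandwidth=1.0, n_grid=301, n_candidates=121):
    """Sweep one extra datum and track how the toy central value responds."""
    lo, hi = min(data), max(data)
    grid = [lo + i * (hi - lo) / (n_grid - 1) for i in range(n_grid)]
    # Sweep well beyond the data so that far-away candidates have no influence.
    start, stop = lo - 10 * bandwidth, hi + 10 * bandwidth
    candidates = [start + i * (stop - start) / (n_candidates - 1)
                  for i in range(n_candidates)]
    z0 = toy_center(data, grid, bandwidth)
    # Central value of the data extended by each candidate datum in turn.
    centers = [toy_center(data + [z], grid, bandwidth) for z in candidates]
    z0l, z0u = min(centers), max(centers)
    zl = candidates[centers.index(z0l)]  # first datum position giving the lowest center
    zu = candidates[centers.index(z0u)]  # first datum position giving the highest center
    return z0, zl, z0l, z0u, zu

# Hypothetical sample with one distant value.
data = [1.0, 2.0, 2.5, 3.0, 3.2, 4.0, 9.0]
z0, zl, z0l, z0u, zu = toy_intervals(data)
print(f"toy Z0 = {z0:.2f}")
print(f"toy tolerance interval = [{z0l:.2f}, {z0u:.2f}]")
print(f"toy typical interval   = [{zl:.2f}, {zu:.2f}]")
```

A kernel mode is used here rather than a mean or median because a distant extra datum barely moves it, so the central value has finite extremes as the datum is swept. The real class fits a GDF instead and adds convergence checks and diagnostics on top.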


Overview

IntervalAnalysis orchestrates the complete process of fitting GDFs, checking homogeneity, and computing robust data intervals using the DataIntervals engine. It is designed for reliability, diagnostics, and adaptive interval estimation in scientific and engineering data analysis.

Gnostic vs. Statistical Interval Analysis: Gnostic interval analysis does not rely on probabilistic or statistical assumptions. Instead, it uses algebraic and geometric properties of the data and distribution functions, providing deterministic, reproducible, and interpretable intervals even for small, noisy, or non-Gaussian datasets. This is fundamentally different from classical statistical interval estimation, which depends on distributional assumptions and sampling theory.

  • Assumption-Free: No parametric or probabilistic assumptions.
  • Robust: Handles outliers, heterogeneity, and bounded/unbounded domains.
  • Adaptive: Intervals adapt to data structure and central parameter behavior.
  • Diagnostics: Tracks warnings, errors, and intermediate results.
  • Visualization: Built-in plotting for distributions and intervals.
  • Memory-Efficient: Optional flushing of intermediate arrays.

Key Features

  • End-to-end interval estimation for GDFs
  • Automatic homogeneity testing and diagnostics
  • Adaptive tolerance and typical interval computation
  • Handles weighted, bounded, and unbounded data
  • Detailed error and warning logging
  • Visualization of fitted distributions and intervals
  • Deterministic and reproducible results

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| DLB | float or None | None | Data Lower Bound (absolute minimum, optional) |
| DUB | float or None | None | Data Upper Bound (absolute maximum, optional) |
| LB | float or None | None | Lower Probable Bound (practical lower limit, optional) |
| UB | float or None | None | Upper Probable Bound (practical upper limit, optional) |
| S | float or 'auto' | 'auto' | Scale parameter for distribution |
| z0_optimize | bool | True | Optimize central parameter Z0 during fitting |
| tolerance | float | 1e-9 | Convergence tolerance for optimization |
| data_form | str | 'a' | Data form: 'a' (additive), 'm' (multiplicative) |
| n_points | int | 100 | Number of points for distribution evaluation |
| homogeneous | bool | True | Assume data homogeneity (enables homogeneity testing) |
| catch | bool | True | Store warnings/errors and intermediate results |
| weights | np.ndarray or None | None | Prior weights for data points |
| wedf | bool | False | Use Weighted Empirical Distribution Function |
| opt_method | str | 'L-BFGS-B' | Optimization method (scipy.optimize) |
| verbose | bool | False | Print detailed progress and diagnostics |
| max_data_size | int | 1000 | Max data size for smooth GDF generation |
| flush | bool | True | Flush intermediate arrays after fitting |
| dense_zone_fraction | float | 0.4 | Fraction of domain near Z0 for dense interval search |
| dense_points_fraction | float | 0.7 | Fraction of search points in dense zone |
| convergence_window | int | 15 | Window size for convergence detection |
| convergence_threshold | float | 1e-6 | Threshold for Z0 convergence |
| min_search_points | int | 30 | Minimum search points before checking convergence |
| boundary_margin_factor | float | 0.001 | Margin factor to avoid searching at boundaries |
| extrema_search_tolerance | float | 1e-6 | Tolerance for detecting extrema in Z0 variation |
| gdf_recompute | bool | False | Recompute GDF for each candidate datum in interval search |
| gnostic_filter | bool | False | Apply gnostic clustering to filter outlier Z0 values |
| cluster_bounds | bool | True | Estimate cluster bounds using DataCluster |
| membership_bounds | bool | True | Estimate membership bounds using DataMembership |
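The search-related parameters (dense_zone_fraction, dense_points_fraction, boundary_margin_factor, convergence_window, convergence_threshold, min_search_points) are easier to picture with a small sketch. The code below is only one plausible reading of their descriptions, not the library's implementation. The helpers `linspace`, `build_search_points`, and `has_converged` are hypothetical names, and the sketch assumes Z0 lies well inside the search domain.

```python
def linspace(start, stop, n):
    """n evenly spaced values from start to stop inclusive; [start] * n when n < 2."""
    if n < 2:
        return [start] * n
    step = (stop - start) / (n - 1)
    pts = [start + i * step for i in range(n)]
    pts[-1] = stop  # pin the endpoint against rounding drift
    return pts

def build_search_points(lower, upper, z0, n_points=100,
                        dense_zone_fraction=0.4, dense_points_fraction=0.7,
                        boundary_margin_factor=0.001):
    """Hypothetical layout: most candidate points are placed in a zone around z0."""
    span = upper - lower
    margin = boundary_margin_factor * span           # stay off the exact boundaries
    lo, hi = lower + margin, upper - margin
    half = 0.5 * dense_zone_fraction * span          # half-width of the dense zone
    dense_lo, dense_hi = max(lo, z0 - half), min(hi, z0 + half)
    n_dense = round(n_points * dense_points_fraction)
    n_left = (n_points - n_dense) // 2
    n_right = n_points - n_dense - n_left
    left = linspace(lo, dense_lo, n_left + 1)[:-1]   # dense zone owns its own edges
    dense = linspace(dense_lo, dense_hi, n_dense)
    right = linspace(dense_hi, hi, n_right + 1)[1:]
    return left + dense + right

def has_converged(z0_history, window=15, threshold=1e-6, min_points=30):
    """Hypothetical rule: Z0 moved less than threshold over the last `window` values."""
    if len(z0_history) < max(min_points, window):
        return False
    recent = z0_history[-window:]
    return max(recent) - min(recent) < threshold

points = build_search_points(0.0, 10.0, z0=5.0)
in_dense = sum(1 for p in points if 3.0 <= p <= 7.0)
print(len(points), "search points,", in_dense, "inside the dense zone")
```

With the defaults from the table, 70% of the candidate positions fall in the central 40% of the domain, which is where Z0 is expected to react most strongly to an added datum.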

Attributes

  • params: dict Stores all warnings, errors, and diagnostic information from the analysis.

Methods

fit(data, plot=False)

Runs the complete interval analysis workflow on the input data.

  • data: np.ndarray, shape (n_samples,). 1D NumPy array of input data for interval analysis
  • plot: bool (optional) If True, automatically generates diagnostic plots after fitting

Returns: dict — Estimated interval bounds and diagnostics


results()

Returns a dictionary of estimated interval results and bounds. This step is also called 'Data Certification'.

Returns: dict — Contains keys such as 'LB', 'LSB', 'DLB', 'LCB', 'LSD', 'ZL', 'Z0L', 'Z0', 'Z0U', 'ZU', 'USD', 'UCB', 'DUB', 'USB', 'UB'

  • LB: Lower Bound. The practical lower limit for the interval (may be set by user or inferred).
  • LSB: Lower Sample (Membership) Bound. The lowest value for which data is homogeneous.
  • DLB: Data Lower Bound. The absolute minimum value present in the data.
  • LCB: Lower Cluster Bound. The lower edge of the main data cluster.
  • LSD: Lower Standard Deviation Bound. The lowest value as per gnostic standard deviation.
  • ZL: Z0 Lower Interval. The lower bound of the typical interval.
  • Z0L: Z0 Lower Bound. The lower bound of the tolerance interval.
  • Z0: Central Value (Gnostic Mean). The central parameter of the distribution (gnostic mean).
  • Z0U: Z0 Upper Bound. The upper bound of the tolerance interval.
  • ZU: Z0 Upper Interval. The upper bound of the typical interval.
  • USD: Upper Support/Domain Bound. The highest value in the support or domain of the fitted distribution.
  • UCB: Upper Cluster Bound. The upper edge of the main data cluster.
  • DUB: Data Upper Bound. The absolute maximum value present in the data.
  • USB: Upper Sample (Membership) Bound. The highest value for which data is homogeneous (membership analysis).
  • UB: Upper Bound. The practical upper limit for the interval (may be set by user or inferred).

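Because the class is described as enforcing ordering constraints, it can be useful to sanity-check a results dictionary before using it downstream. The helper below is a sketch under one assumption: that the order in which the keys are listed for results() is the ordering the bounds are meant to satisfy. This page implies that but does not state it. `EXPECTED_ORDER` and `ordering_violations` are hypothetical names, and the example dictionary contains made-up values, not output from the library.

```python
# Assumed ordering, copied from the key list documented for results().
EXPECTED_ORDER = ['LB', 'LSB', 'DLB', 'LCB', 'LSD', 'ZL', 'Z0L', 'Z0',
                  'Z0U', 'ZU', 'USD', 'UCB', 'DUB', 'USB', 'UB']

def ordering_violations(results, order=EXPECTED_ORDER):
    """Return (earlier_key, later_key) pairs whose values are out of order.

    Keys that are missing or None are skipped; the remaining neighbours are
    compared pairwise, which is enough to verify a non-decreasing sequence.
    """
    present = [(k, results[k]) for k in order if results.get(k) is not None]
    return [(ka, kb)
            for (ka, va), (kb, vb) in zip(present, present[1:])
            if va > vb]

# Made-up values for illustration only.
example = {'LB': -20.0, 'LSB': -15.0, 'DLB': -13.5, 'LCB': 0.0, 'LSD': 1.0,
           'ZL': 3.0, 'Z0L': 4.5, 'Z0': 5.0, 'Z0U': 5.5, 'ZU': 7.0,
           'USD': 9.0, 'UCB': 10.0, 'DUB': 10.0, 'USB': 11.0, 'UB': 15.0}
broken = dict(example, ZL=4.8)  # typical-interval bound pushed past Z0L

print(ordering_violations(example))
print(ordering_violations(broken))
```

In practice the dictionary returned by results() would be passed in place of `example`. Any reported pair would then be a reason to inspect the diagnostics stored in the params attribute.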

plot(GDF=True, intervals=True)

Visualizes the fitted GDFs and the estimated intervals.

  • GDF: bool (default: True) Plot the fitted ELDF (local distribution function)
  • intervals: bool (default: True) Plot the estimated intervals and Z0 variation

Returns: None (displays plot)


Example Usage

import numpy as np
from machinegnostics.magcal import IntervalAnalysis

# Example data
data = np.array([-13.5, 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])

# Initialize IntervalAnalysis
ia = IntervalAnalysis(verbose=True)

# Fit and get interval results
ia.fit(data, plot=True)
print(ia.results())

# Visualize results
ia.plot(GDF=True, intervals=True)

Notes

  • Gnostic interval analysis is fundamentally different from statistical interval analysis: it does not rely on probability or sampling theory, but on algebraic and geometric properties of the data and distribution functions.
  • Homogeneity of the data is checked automatically; warnings are issued if violated.
  • For best results, use with ELDF/EGDF and set wedf=False for interval estimation.
  • Suitable for scientific, engineering, and reliability applications.
  • All warnings and errors are stored in the params attribute for later inspection.

Author: Nirmal Parmar
Date: 2025-09-24