Machine Gnostics Logistic Regression

The LogisticRegressor is a robust and flexible binary classification model built on the Machine Gnostics framework. It is designed to handle outliers, heavy-tailed distributions, and non-Gaussian noise, making it suitable for real-world data challenges. The model supports polynomial feature expansion, robust weighting, early stopping, and seamless MLflow integration for experiment tracking and deployment.

Key Features:

  • Robust to outliers and non-Gaussian noise
  • Polynomial feature expansion (configurable degree)
  • Flexible probability output: gnostic or sigmoid
  • Customizable data scaling (auto or manual)
  • Early stopping based on residual entropy or log loss
  • Full training history tracking (loss, entropy, coefficients, weights)
  • MLflow integration for model tracking and deployment
  • Save and load models with joblib

1. Overview

Machine Gnostics LogisticRegressor brings deterministic, event-level modeling to binary classification. By leveraging gnostic algebra and geometry, it provides robust, interpretable, and reproducible results, even in challenging scenarios.

Highlights:

  • Outlier Robustness: Gnostic weighting reduces the impact of noisy or corrupted samples.
  • Polynomial Feature Expansion: Configurable degree for nonlinear decision boundaries.
  • Flexible Probability Output: Choose between gnostic-based or standard sigmoid probabilities.
  • Early Stopping: Efficient training via monitoring of loss and entropy.
  • MLflow Integration: Supports experiment tracking and deployment.
  • Model Persistence: Save and load models easily with joblib (see the quick-start sketch below).
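
The following is a minimal quick-start sketch tying these highlights together. It assumes only the constructor arguments (degree, early_stopping, max_iter) and the fit/predict API demonstrated later on this page; plain joblib.dump/joblib.load is used for persistence, though the model may also provide its own save/load helpers.

import joblib
import numpy as np
from machinegnostics.models.classification import LogisticRegressor

# Toy data: two Gaussian blobs (illustrative only)
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Fit a robust logistic model with degree-2 polynomial features
model = LogisticRegressor(degree=2, early_stopping=True, max_iter=100)
model.fit(X, y)

# Persist with joblib and reload
joblib.dump(model, "logistic_regressor.joblib")
restored = joblib.load("logistic_regressor.joblib")
print(restored.predict(X[:5]))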

2. Gnostic Logistic Regression

Let’s see how to use the Machine Gnostics LogisticRegressor for robust binary classification on a synthetic dataset.

Moon Data

import numpy as np
import matplotlib.pyplot as plt

def make_moons_manual(n_samples=100, noise=0.1):
    n_samples_out = n_samples // 2
    n_samples_in = n_samples - n_samples_out

    # First half moon
    outer_circ_x = np.cos(np.linspace(0, np.pi, n_samples_out))
    outer_circ_y = np.sin(np.linspace(0, np.pi, n_samples_out))

    # Second half moon
    inner_circ_x = 1 - np.cos(np.linspace(0, np.pi, n_samples_in))
    inner_circ_y = -np.sin(np.linspace(0, np.pi, n_samples_in)) - 0.5

    X = np.vstack([
        np.stack([outer_circ_x, outer_circ_y], axis=1),
        np.stack([inner_circ_x, inner_circ_y], axis=1)
    ])
    y = np.array([0] * n_samples_out + [1] * n_samples_in)

    # Add Gaussian noise
    X += np.random.normal(scale=noise, size=X.shape)

    return X, y

# Example usage
X, y = make_moons_manual(n_samples=300, noise=0.4)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Set1)
plt.title("Custom make_moons")
plt.show()

[Figure: scatter plot of the noisy two-moons dataset, colored by class]

Basic Logistic Regression (Sigmoid Probability)

import matplotlib.pyplot as plt
import numpy as np
from machinegnostics.models.classification import LogisticRegressor
from machinegnostics.metrics import accuracy_score, confusion_matrix, classification_report, f1_score, precision_score, recall_score

# Using a gnostic-influenced sigmoid function for probability estimation
model = LogisticRegressor(degree=3, verbose=True, early_stopping=True, proba='sigmoid', tol=0.1, max_iter=100)
model.fit(X, y)
proba_gnostic = model.predict_proba(X)
y_pred_gnostic = model.predict(X)

# --- Plot probability contour and predictions ---
fig, ax = plt.subplots(figsize=(7, 6))

def plot_proba_contour(ax, model, X, title):
    x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
    y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300),
                         np.linspace(y_min, y_max, 300))
    grid = np.c_[xx.ravel(), yy.ravel()]
    zz = model.predict_proba(grid)
    zz = zz.reshape(xx.shape)

    im = ax.imshow(zz, extent=(x_min, x_max, y_min, y_max), origin='lower',
                   aspect='auto', cmap='Greens', alpha=0.5, vmin=0, vmax=1)
    plt.colorbar(im, ax=ax, label='Predicted Probability')
    ax.contour(xx, yy, zz, levels=[0.5], colors='k', linewidths=2)
    ax.set_xlim(x_min, x_max)
    ax.set_ylim(y_min, y_max)
    ax.set_title(title)

plot_proba_contour(ax, model, X, "Gnostic Logistic Regression (sigmoid probability)")
ax.scatter(X[:, 0], X[:, 1], c=y, cmap='bwr', edgecolor='k', s=40, label='True label', alpha=0.9)
ax.scatter(X[:, 0], X[:, 1], c=y_pred_gnostic, cmap='cool', marker='x', s=60, label='Predicted class', alpha=0.7)
ax.legend()
ax.grid(True)
plt.tight_layout()
plt.show()

# --- Evaluation ---
print('Gnostic Logistic Regression Evaluation:')
print("Accuracy:", accuracy_score(y, y_pred_gnostic))
print("Precision:", precision_score(y, y_pred_gnostic))
print("Recall:", recall_score(y, y_pred_gnostic))
print("F1-score:", f1_score(y, y_pred_gnostic))
print("\nConfusion Matrix:\n", confusion_matrix(y, y_pred_gnostic))
print("\nClassification Report:\n", classification_report(y, y_pred_gnostic))

Output

[Figure: sigmoid-probability contour with the 0.5 decision boundary, true labels, and predicted classes]

Gnostic Logistic Regression Evaluation:
Accuracy: 0.9733333333333334
Precision: 0.9797297297297297
Recall: 0.9666666666666667
F1-score: 0.9731543624161074

Confusion Matrix:
 [[147   3]
 [  5 145]]

Classification Report:
 Class           Precision    Recall  F1-score   Support
========================================================
0                    0.97      0.98      0.97       150
1                    0.98      0.97      0.97       150
========================================================
Avg/Total            0.97      0.97      0.97       300
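
Key Features above mentions full training-history tracking (loss, entropy, coefficients, weights). The sketch below shows one way such a history could be inspected after fitting; note that the attribute name history and its keys are hypothetical placeholders, so check the API Reference for the names LogisticRegressor actually exposes.

# Hypothetical sketch: plot the recorded training history.
# NOTE: the attribute name `history` and the keys 'loss' and 'entropy'
# are assumptions for illustration; consult the API Reference for the
# attributes actually exposed by LogisticRegressor.
history = getattr(model, 'history', None)
if history is not None:
    plt.plot(history['loss'], label='log loss')
    plt.plot(history['entropy'], label='residual entropy')
    plt.xlabel('Iteration')
    plt.legend()
    plt.title('Training history')
    plt.show()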

3. Gnostic Probability Output

The proba argument in LogisticRegressor can be set to 'gnostic' to use gnostic-based probability estimation, which is designed to be more robust to outliers and non-Gaussian data.

Advanced: Gnostic Probability Output

# Using gnostic probability estimation
model = LogisticRegressor(degree=3, verbose=True, early_stopping=True, max_iter=100, proba='gnostic', tol=0.1)
model.fit(X, y)
proba_gnostic = model.predict_proba(X)
y_pred_gnostic = model.predict(X)

# --- Plot probability contour and predictions ---
fig, ax = plt.subplots(figsize=(7, 6))

# Reuse plot_proba_contour defined in the previous example

plot_proba_contour(ax, model, X, "Gnostic Logistic Regression (gnostic probability)")
ax.scatter(X[:, 0], X[:, 1], c=y, cmap='bwr', edgecolor='k', s=40, label='True label', alpha=0.9)
ax.scatter(X[:, 0], X[:, 1], c=y_pred_gnostic, cmap='cool', marker='x', s=60, label='Predicted class', alpha=0.7)
ax.legend()
ax.grid(True)
plt.tight_layout()
plt.show()

# --- Evaluation ---
print('Gnostic Logistic Regression Evaluation:')
print("Accuracy:", accuracy_score(y, y_pred_gnostic))
print("Precision:", precision_score(y, y_pred_gnostic))
print("Recall:", recall_score(y, y_pred_gnostic))
print("F1-score:", f1_score(y, y_pred_gnostic))
print("\nConfusion Matrix:\n", confusion_matrix(y, y_pred_gnostic))
print("\nClassification Report:\n", classification_report(y, y_pred_gnostic))

Output

[Figure: gnostic-probability contour with the 0.5 decision boundary, true labels, and predicted classes]

Gnostic Logistic Regression Evaluation:
Accuracy: 0.9433333333333334
Precision: 0.965034965034965
Recall: 0.92
F1-score: 0.9419795221843004

Confusion Matrix:
 [[145   5]
 [ 12 138]]

Classification Report:
 Class           Precision    Recall  F1-score   Support
========================================================
0                    0.92      0.97      0.94       150
1                    0.97      0.92      0.94       150
========================================================
Avg/Total            0.94      0.94      0.94       300
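
The Key Features list mentions MLflow integration. The model's built-in MLflow hooks are not covered in this tutorial; as a hedged sketch, the run above can be tracked with the generic mlflow client API, assuming mlflow is installed and configured.

import mlflow

# Hedged sketch: log the run above with the generic MLflow client API.
# The model's own MLflow hooks may offer richer integration than this.
with mlflow.start_run(run_name="gnostic-logistic-regression"):
    mlflow.log_param("degree", 3)
    mlflow.log_param("proba", "gnostic")
    mlflow.log_metric("accuracy", accuracy_score(y, y_pred_gnostic))
    mlflow.log_metric("f1_score", f1_score(y, y_pred_gnostic))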

Note:

  • The proba argument controls the probability estimation method: 'sigmoid' (default) for standard logistic regression, 'gnostic' for robust, gnostic-based probabilities.
  • Use gnostic probabilities on datasets with outliers or non-Gaussian noise for more reliable classification (a side-by-side comparison sketch follows below).
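
The sketch below fits one model per proba setting on the two-moons data from this tutorial and compares accuracy; it is a minimal sketch using only the constructor arguments and metrics already shown above.

# Compare sigmoid vs. gnostic probability estimation on the same data
for proba in ('sigmoid', 'gnostic'):
    m = LogisticRegressor(degree=3, early_stopping=True,
                          proba=proba, tol=0.1, max_iter=100)
    m.fit(X, y)
    print(f"proba={proba}: accuracy={accuracy_score(y, m.predict(X)):.3f}")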

Tips

  • Use LogisticRegressor for robust, interpretable binary classification, especially when data may contain outliers or non-Gaussian noise.
  • Set proba='gnostic' for robust probability estimation.
  • Adjust the degree parameter for nonlinear decision boundaries (see the sweep sketch after this list).
  • Enable early_stopping for efficient training.
  • For more advanced usage and parameter tuning, see the API Reference.
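
Continuing the session above, a small sweep like the following can help choose the degree parameter. This is a minimal sketch using only the API shown in this tutorial; it reports training-set accuracy for brevity, but a held-out split is preferable in practice.

# Sketch: sweep the polynomial degree and compare training accuracy.
# In practice, evaluate on a held-out split rather than the training set.
for degree in (1, 2, 3, 4):
    m = LogisticRegressor(degree=degree, early_stopping=True,
                          max_iter=100, tol=0.1)
    m.fit(X, y)
    print(f"degree={degree}: accuracy={accuracy_score(y, m.predict(X)):.3f}")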

Next: Explore more tutorials and real-world examples in the Examples section!