geomexp package

Subpackages

Module contents

Geometric Expectile Clustering package.

class geomexp.BaseClusterer(n_clusters, max_iter=100, tol=1e-07, random_state=None)[source]

Bases: ABC

Abstract base class for clustering algorithms.

Defines the interface that all clustering algorithms must implement. Subclasses provide _initialize, _fit_iteration, _compute_objective, and _extract_result.

The fit method contains the outer optimisation loop with convergence checking based on objective change and an optional partition stability hook.

Parameters:
  • n_clusters (int)

  • max_iter (int)

  • tol (float)

  • random_state (int | None)

n_clusters

Number of clusters to form.

max_iter

Maximum number of iterations to perform.

tol

Convergence tolerance for objective function change.

random_state

Random seed for reproducibility.

fit(X)[source]

Fit the clustering algorithm to data.

The outer loop follows Algorithm 1 from the thesis: iterate until the objective decrease falls below tol, an additional convergence criterion is met, or max_iter iterations are reached.

Parameters:

X (ndarray) – Data array of shape (n_samples, n_features).

Return type:

ClusterResult

Returns:

ClusterResult containing assignments, centers, and convergence info.

Raises:

ValueError – If X has fewer samples than n_clusters or invalid shape.

class geomexp.BestResponseIndexStrategy(geometry=None)[source]

Bases: IndexStrategy

Adaptive best-response index vectors (equation 2.2).

Each index vector is set to minimise the cluster-wise risk for fixed partition and centroids. The closed-form solution in a Hilbert space \(H\) is:

\[u_k^* = -r \frac{s_k}{\|s_k\|_H}, \quad s_k = \sum_{i : X_i \in C_k} \|X_i - c_k\|_H \, (X_i - c_k).\]

This captures the length-weighted residual direction (skew) of the cluster. When \(s_k = 0\), the current index is retained.

Parameters:

geometry (HilbertGeometry | None)

initialize(n_clusters, n_features, index_radius, rng)[source]

Create initial index vectors.

Parameters:
  • n_clusters (int) – Number of clusters.

  • n_features (int) – Dimensionality of the data.

  • index_radius (float) – Radius constraint \(r\) with \(\|u_k\|_H = r\).

  • rng (RandomState) – Random state for stochastic initialization.

Return type:

ndarray

Returns:

Array of shape (n_clusters, n_features).

update(X, assignments, centers, indices, index_radius)[source]

Update index vectors given the current clustering state.

Parameters:
  • X (ndarray) – Data array of shape (n_samples, n_features).

  • assignments (ndarray) – Cluster assignments of shape (n_samples,).

  • centers (ndarray) – Cluster centers of shape (n_clusters, n_features).

  • indices (ndarray) – Current index vectors of shape (n_clusters, n_features).

  • index_radius (float) – Radius constraint \(r\).

Return type:

ndarray

Returns:

Updated index vectors of shape (n_clusters, n_features).

class geomexp.ClusterResult(assignments, centers, objective, n_iterations, converged=True, metadata=None)[source]

Bases: object

Container for clustering results.

Parameters:
assignments

Array of shape (n_samples,) containing cluster indices for each point.

centers

Array of shape (n_clusters, n_features) containing cluster centers.

objective

Final objective function value achieved by the algorithm.

n_iterations

Number of iterations performed until convergence.

converged

Whether the algorithm converged within max_iter iterations.

metadata

Optional dictionary containing algorithm-specific information.

assignments: ndarray
centers: ndarray
objective: float
n_iterations: int
converged: bool = True
metadata: dict[str, object] | None = None
class geomexp.ClusterSpecificIndexStrategy(directions, geometry=None)[source]

Bases: IndexStrategy

Fixed per-cluster index vectors.

Distinct fixed directions are prescribed for different clusters, each normalised to \(\|u_k\|_H = r\). Appropriate when prior information about cluster-specific asymmetry directions is available.

Zero-norm rows in the provided directions result in \(u_k = 0\).

Parameters:
initialize(n_clusters, n_features, index_radius, rng)[source]

Create initial index vectors.

Parameters:
  • n_clusters (int) – Number of clusters.

  • n_features (int) – Dimensionality of the data.

  • index_radius (float) – Radius constraint \(r\) with \(\|u_k\|_H = r\).

  • rng (RandomState) – Random state for stochastic initialization.

Return type:

ndarray

Returns:

Array of shape (n_clusters, n_features).

update(X, assignments, centers, indices, index_radius)[source]

Update index vectors given the current clustering state.

Parameters:
  • X (ndarray) – Data array of shape (n_samples, n_features).

  • assignments (ndarray) – Cluster assignments of shape (n_samples,).

  • centers (ndarray) – Cluster centers of shape (n_clusters, n_features).

  • indices (ndarray) – Current index vectors of shape (n_clusters, n_features).

  • index_radius (float) – Radius constraint \(r\).

Return type:

ndarray

Returns:

Updated index vectors of shape (n_clusters, n_features).

class geomexp.CustomIndexStrategy(update_fn, init_fn=None)[source]

Bases: IndexStrategy

User-provided callable for custom index vector updates.

Allows full flexibility by accepting arbitrary callables for initialisation and update of index vectors.

The update callable should have the signature:

def update_fn(
    X: np.ndarray,            # (n_samples, n_features)
    assignments: np.ndarray,   # (n_samples,)
    centers: np.ndarray,       # (n_clusters, n_features)
    indices: np.ndarray,       # (n_clusters, n_features)
    index_radius: float,
) -> np.ndarray:               # (n_clusters, n_features)

The optional init callable should have the signature:

def init_fn(
    n_clusters: int,
    n_features: int,
    index_radius: float,
    rng: np.random.RandomState,
) -> np.ndarray:               # (n_clusters, n_features)
Parameters:
  • update_fn (Callable[..., np.ndarray])

  • init_fn (Callable[..., np.ndarray] | None)

initialize(n_clusters, n_features, index_radius, rng)[source]

Create initial index vectors.

Parameters:
  • n_clusters (int) – Number of clusters.

  • n_features (int) – Dimensionality of the data.

  • index_radius (float) – Radius constraint \(r\) with \(\|u_k\|_H = r\).

  • rng (RandomState) – Random state for stochastic initialization.

Return type:

ndarray

Returns:

Array of shape (n_clusters, n_features).

update(X, assignments, centers, indices, index_radius)[source]

Update index vectors given the current clustering state.

Parameters:
  • X (ndarray) – Data array of shape (n_samples, n_features).

  • assignments (ndarray) – Cluster assignments of shape (n_samples,).

  • centers (ndarray) – Cluster centers of shape (n_clusters, n_features).

  • indices (ndarray) – Current index vectors of shape (n_clusters, n_features).

  • index_radius (float) – Radius constraint \(r\).

Return type:

ndarray

Returns:

Updated index vectors of shape (n_clusters, n_features).

class geomexp.EuclideanGeometry[source]

Bases: HilbertGeometry

Standard Euclidean geometry: \(\langle a, b \rangle = \sum_i a_i b_i\).

Equivalent to identity weights. This is the default geometry when none is specified.

inner(a, b)[source]

Compute the inner product \(\langle a, b \rangle_H\) along the last axis.

Parameters:
  • a (ndarray) – Array of shape (..., d).

  • b (ndarray) – Array of shape (..., d), broadcastable with a.

Return type:

ndarray

Returns:

Inner product values of shape (...).

norm(a)[source]

Compute \(\|a\|_H\) along the last axis.

Parameters:

a (ndarray) – Array of shape (..., d).

Return type:

ndarray

Returns:

Norm values of shape (...).

squared_norm(a)[source]

Compute \(\|a\|_H^2\) along the last axis.

Default implementation returns inner(a, a). Override for efficiency if the inner product is expensive.

Return type:

ndarray

Parameters:

a (ndarray)

class geomexp.GeometricExpectileClustering(n_clusters, index_radius=0.5, index_strategy=None, tie_break_rule=None, empty_cluster_rule=None, geometry=None, n_init=10, max_iter=100, tol=1e-07, center_lr=0.1, center_steps=50, random_state=None)[source]

Bases: BaseClusterer

Geometric expectile clustering with directional asymmetry (Algorithm 1).

A Lloyd-type block coordinate descent algorithm that minimises the empirical risk based on the geometric expectile loss:

\[\hat{\mathcal{R}}(\Pi, C, U) = \frac{1}{n} \sum_{k=1}^K \sum_{i=1}^n \ell_{u_k}(X_i - c_k)\,\mathbb{1}_{\{X_i \in C_k\}}, \quad \|u_k\| = r \;\forall k.\]

Each iteration consists of:

  1. Assignment – assign each point to the cluster minimising \(\ell_{u_k}(X_i - c_k)\).

  2. Empty-cluster handling – reinitialise empty clusters, then re-assign.

  3. Centroid update – for each cluster, minimise the cluster-wise risk over \(c\) via gradient descent.

  4. Index update – update \(U\) via the chosen IndexStrategy.

Setting \(r = 0\) recovers standard K-means.

Parameters:
n_clusters

Number of clusters.

index_radius

Radius constraint \(r \in [0, 1)\).

index_strategy

Strategy for index vector selection.

tie_break_rule

Rule for breaking ties in assignment.

empty_cluster_rule

Policy for handling empty clusters.

n_init

Number of random restarts (best result is kept).

max_iter

Maximum number of outer iterations.

tol

Convergence tolerance on objective change.

center_lr

Learning rate for gradient descent on centers.

center_steps

Number of gradient steps per centroid update.

random_state

Random seed.

Example

>>> import numpy as np
>>> X = np.random.randn(100, 2)
>>> model = GeometricExpectileClustering(n_clusters=3, index_radius=0.5)
>>> result = model.fit(X)
>>> print(result.metadata["indices"].shape)
(3, 2)
fit(X)[source]

Fit geometric expectile clustering with multiple random restarts.

Runs n_init independent trials with different random seeds and returns the result with the lowest objective.

Parameters:

X (ndarray) – Data array of shape (n_samples, n_features).

Return type:

ClusterResult

Returns:

ClusterResult from the best restart.

class geomexp.GlobalIndexStrategy(direction, geometry=None)[source]

Bases: IndexStrategy

Fixed single global index vector shared by all clusters.

Sets \(u_1 = \cdots = u_K = u\) for a given direction \(u\). Appropriate when a meaningful asymmetry direction is known a priori; one implicitly assumes a common anisotropy mechanism across clusters.

The provided direction is normalised to have Hilbert norm equal to \(r\).

Parameters:
initialize(n_clusters, n_features, index_radius, rng)[source]

Create initial index vectors.

Parameters:
  • n_clusters (int) – Number of clusters.

  • n_features (int) – Dimensionality of the data.

  • index_radius (float) – Radius constraint \(r\) with \(\|u_k\|_H = r\).

  • rng (RandomState) – Random state for stochastic initialization.

Return type:

ndarray

Returns:

Array of shape (n_clusters, n_features).

update(X, assignments, centers, indices, index_radius)[source]

Update index vectors given the current clustering state.

Parameters:
  • X (ndarray) – Data array of shape (n_samples, n_features).

  • assignments (ndarray) – Cluster assignments of shape (n_samples,).

  • centers (ndarray) – Cluster centers of shape (n_clusters, n_features).

  • indices (ndarray) – Current index vectors of shape (n_clusters, n_features).

  • index_radius (float) – Radius constraint \(r\).

Return type:

ndarray

Returns:

Updated index vectors of shape (n_clusters, n_features).

class geomexp.GramGeometry(gram_matrix)[source]

Bases: HilbertGeometry

General Gram-matrix geometry: \(\langle a, b \rangle_G = a^\top G\, b\).

Suitable for basis coefficient representations where \(G_{ij} = \langle \varphi_i, \varphi_j \rangle\).

Parameters:

gram_matrix (ndarray) – Symmetric positive-definite matrix of shape (d, d).

inner(a, b)[source]

Compute the inner product \(\langle a, b \rangle_H\) along the last axis.

Parameters:
  • a (ndarray) – Array of shape (..., d).

  • b (ndarray) – Array of shape (..., d), broadcastable with a.

Return type:

ndarray

Returns:

Inner product values of shape (...).

norm(a)[source]

Compute \(\|a\|_H\) along the last axis.

Parameters:

a (ndarray) – Array of shape (..., d).

Return type:

ndarray

Returns:

Norm values of shape (...).

squared_norm(a)[source]

Compute \(\|a\|_H^2\) along the last axis.

Default implementation returns inner(a, a). Override for efficiency if the inner product is expensive.

Return type:

ndarray

Parameters:

a (ndarray)

class geomexp.HilbertGeometry[source]

Bases: ABC

Abstract Hilbert space geometry defining inner product and norm.

All clustering geometry (assignment costs, center gradients, index strategies) flows through this interface, ensuring that no part of the algorithm accidentally falls back to Euclidean operations.

abstractmethod inner(a, b)[source]

Compute the inner product \(\langle a, b \rangle_H\) along the last axis.

Parameters:
  • a (ndarray) – Array of shape (..., d).

  • b (ndarray) – Array of shape (..., d), broadcastable with a.

Return type:

ndarray

Returns:

Inner product values of shape (...).

abstractmethod norm(a)[source]

Compute \(\|a\|_H\) along the last axis.

Parameters:

a (ndarray) – Array of shape (..., d).

Return type:

ndarray

Returns:

Norm values of shape (...).

squared_norm(a)[source]

Compute \(\|a\|_H^2\) along the last axis.

Default implementation returns inner(a, a). Override for efficiency if the inner product is expensive.

Return type:

ndarray

Parameters:

a (ndarray)

class geomexp.IndexStrategy[source]

Bases: ABC

Abstract base class for index vector selection strategies.

Index vectors \(u_k\) with \(\|u_k\|_H = r\) determine the directional asymmetry of the geometric expectile loss. Different strategies offer different trade-offs between flexibility and the need for prior knowledge.

abstractmethod initialize(n_clusters, n_features, index_radius, rng)[source]

Create initial index vectors.

Parameters:
  • n_clusters (int) – Number of clusters.

  • n_features (int) – Dimensionality of the data.

  • index_radius (float) – Radius constraint \(r\) with \(\|u_k\|_H = r\).

  • rng (RandomState) – Random state for stochastic initialization.

Return type:

ndarray

Returns:

Array of shape (n_clusters, n_features).

abstractmethod update(X, assignments, centers, indices, index_radius)[source]

Update index vectors given the current clustering state.

Parameters:
  • X (ndarray) – Data array of shape (n_samples, n_features).

  • assignments (ndarray) – Cluster assignments of shape (n_samples,).

  • centers (ndarray) – Cluster centers of shape (n_clusters, n_features).

  • indices (ndarray) – Current index vectors of shape (n_clusters, n_features).

  • index_radius (float) – Radius constraint \(r\).

Return type:

ndarray

Returns:

Updated index vectors of shape (n_clusters, n_features).

class geomexp.KMeans(n_clusters, max_iter=100, tol=0.0001, random_state=None)[source]

Bases: IterativeClusterer

K-means clustering algorithm.

Partitions data into \(K\) clusters by minimising the within-cluster sum of squared distances via Lloyd’s alternating minimisation:

\[J = \sum_{k=1}^K \sum_{i \in C_k} \|x_i - m_k\|^2\]

where \(C_k\) denotes the set of points assigned to cluster \(k\), and \(m_k\) is the cluster mean.

Parameters:
  • n_clusters (int)

  • max_iter (int)

  • tol (float)

  • random_state (int | None)

n_clusters

Number of clusters to form.

max_iter

Maximum number of iterations.

tol

Convergence tolerance on objective change.

random_state

Random seed for reproducibility.

Example

>>> import numpy as np
>>> X = np.random.randn(100, 2)
>>> kmeans = KMeans(n_clusters=3, random_state=42)
>>> result = kmeans.fit(X)
>>> print(result.assignments.shape)
(100,)
class geomexp.Kernel[source]

Bases: ABC

Abstract base class for kernel functions.

A kernel \(k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}\) implicitly defines a feature map \(\varphi\) into a reproducing kernel Hilbert space \(\mathcal{H}\) such that \(k(x, y) = \langle \varphi(x), \varphi(y) \rangle_{\mathcal{H}}\).

gram_matrix(X)[source]

Compute the full Gram matrix \(K_{ij} = k(X_i, X_j)\).

Parameters:

X (ndarray) – Data array of shape (n, d).

Return type:

ndarray

Returns:

Symmetric positive semi-definite matrix of shape (n, n).

class geomexp.KernelGeometricExpectileClustering(n_clusters, kernel, index_radius=0.5, n_init=10, max_iter=100, tol=1e-07, center_lr=0.1, center_steps=50, random_state=None)[source]

Bases: BaseClusterer

Kernelised geometric expectile clustering (Algorithm 2).

In the feature space \(\mathcal{H}\) induced by a kernel \(k\), centroids and index vectors lie in the span of the data (Proposition 3.1):

\[c_k = \sum_{j=1}^n \beta_{kj}\,\varphi(X_j), \quad u_k = \sum_{j=1}^n \alpha_{kj}\,\varphi(X_j).\]

All norms and inner products reduce to operations on the Gram matrix \(K_{ij} = k(X_i, X_j)\):

  • Squared residual norm: \(d_{ik}^2 = K_{ii} - 2(K\beta_k)_i + \beta_k^\top K \beta_k\)

  • Index-residual inner product: \(p_{ik} = (K\alpha_k)_i - \alpha_k^\top K \beta_k\)

Setting \(r = 0\) recovers kernel K-means.

Parameters:
n_clusters

Number of clusters.

kernel

Kernel function.

index_radius

Radius constraint \(r \in [0, 1)\).

n_init

Number of random restarts (best result is kept).

max_iter

Maximum number of outer iterations.

tol

Convergence tolerance on objective change.

center_lr

Learning rate for gradient descent on centroid weights.

center_steps

Number of gradient steps per centroid update.

random_state

Random seed.

Example

>>> import numpy as np
>>> from geomexp.kernels import KernelGeometricExpectileClustering, RBFKernel
>>> X = np.random.randn(100, 2)
>>> model = KernelGeometricExpectileClustering(
...     n_clusters=3, kernel=RBFKernel(gamma=0.5), index_radius=0.5
... )
>>> result = model.fit(X)
>>> print(result.metadata["index_weights"].shape)
(3, 100)
fit(X)[source]

Fit kernelised geometric expectile clustering with multiple random restarts.

The Gram matrix is computed once and reused across all n_init restarts.

Parameters:

X (ndarray) – Data array of shape (n_samples, n_features).

Return type:

ClusterResult

Returns:

ClusterResult from the best restart.

class geomexp.KernelKMeans(n_clusters, kernel, n_init=10, max_iter=100, tol=1e-07, random_state=None)[source]

Bases: BaseClusterer

Kernel K-means clustering (dual representation).

Minimises the within-cluster sum of squared distances in the feature space \(\mathcal{H}\) induced by a kernel \(k\):

\[J = \sum_{k=1}^K \sum_{i \in C_k} \|\varphi(X_i) - c_k\|_{\mathcal{H}}^2\]

where \(c_k = \frac{1}{|C_k|} \sum_{j \in C_k} \varphi(X_j)\) is the cluster centroid in feature space. Since \(\varphi\) may not have an explicit form, all computations are expressed through the Gram matrix \(K_{ij} = k(X_i, X_j)\):

\[\|\varphi(X_i) - c_k\|^2 = K_{ii} - \frac{2}{|C_k|} \sum_{j \in C_k} K_{ij} + \frac{1}{|C_k|^2} \sum_{j,l \in C_k} K_{jl}\]

With a linear kernel this recovers standard K-means.

Parameters:
n_clusters

Number of clusters.

kernel

Kernel function.

n_init

Number of random restarts (best result is kept).

max_iter

Maximum number of iterations.

tol

Convergence tolerance.

random_state

Random seed.

Example

>>> import numpy as np
>>> from geomexp.kernels import KernelKMeans, RBFKernel
>>> X = np.random.randn(100, 2)
>>> model = KernelKMeans(n_clusters=3, kernel=RBFKernel(gamma=0.5))
>>> result = model.fit(X)
>>> print(result.assignments.shape)
(100,)
fit(X)[source]

Fit kernel K-means with multiple random restarts.

The Gram matrix is computed once and reused across all n_init restarts.

Parameters:

X (ndarray) – Data array of shape (n_samples, n_features).

Return type:

ClusterResult

Returns:

ClusterResult from the best restart.

class geomexp.LinearKernel[source]

Bases: Kernel

Linear (inner product) kernel: \(k(x, y) = \langle x, y \rangle\).

class geomexp.MaternKernel(nu=1.5, lengthscale=1)[source]

Bases: Kernel

Matern kernel implementing the Matern covariance function.

\[k(x, y) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{\|x - y\|}{\ell}\right)^{\nu} K_{\nu}\!\left(\frac{\|x - y\|}{\ell}\right)\]

where \(K_{\nu}\) is the modified Bessel function of the second kind. Special cases: \(\nu = 0.5\) gives the exponential kernel, \(\nu = 1.5\) and \(\nu = 2.5\) have closed-form expressions, and \(\nu \to \infty\) recovers the RBF (squared-exponential) kernel.

Parameters:
nu

Smoothness parameter \(\nu > 0\).

lengthscale

Lengthscale parameter \(\ell > 0\).

class geomexp.PolynomialKernel(degree=3, coef0=1)[source]

Bases: Kernel

Polynomial kernel.

\[k(x, y) = (\langle x, y \rangle + c)^p\]
Parameters:
degree

Polynomial degree \(p\).

coef0

Additive constant \(c\).

class geomexp.RBFKernel(gamma=1)[source]

Bases: Kernel

Radial basis function (Gaussian) kernel.

\[k(x, y) = \exp\!\bigl(-\gamma \|x - y\|^2\bigr)\]
Parameters:

gamma (float)

gamma

Bandwidth parameter \(\gamma > 0\).

class geomexp.WeightedEuclideanGeometry(weights)[source]

Bases: HilbertGeometry

Diagonal-weighted geometry: \(\langle a, b \rangle_w = \sum_i w_i\, a_i\, b_i\).

Suitable for functional data discretised on a grid with quadrature weights \(w_i\), approximating

\[\langle f, g \rangle_{L^2} = \int f(t)\, g(t)\, \mathrm{d}t \;\approx\; \sum_i w_i\, f(t_i)\, g(t_i).\]
Parameters:

weights (ndarray) – Positive quadrature weight array of shape (d,).

inner(a, b)[source]

Compute the inner product \(\langle a, b \rangle_H\) along the last axis.

Parameters:
  • a (ndarray) – Array of shape (..., d).

  • b (ndarray) – Array of shape (..., d), broadcastable with a.

Return type:

ndarray

Returns:

Inner product values of shape (...).

norm(a)[source]

Compute \(\|a\|_H\) along the last axis.

Parameters:

a (ndarray) – Array of shape (..., d).

Return type:

ndarray

Returns:

Norm values of shape (...).

squared_norm(a)[source]

Compute \(\|a\|_H^2\) along the last axis.

Default implementation returns inner(a, a). Override for efficiency if the inner product is expensive.

Return type:

ndarray

Parameters:

a (ndarray)

geomexp.expectile_loss(residuals, index_vectors, geometry=None)[source]

Compute the geometric expectile loss.

The loss function is:

\[\ell_u(t) = \frac{1}{2}\bigl(\|t\|_H^2 + \|t\|_H\,\langle u, t \rangle_H\bigr)\]

for \((t, u) \in \mathcal{X} \times \mathcal{B}\), as introduced by Herrmann et al. (2018). When geometry is None the standard Euclidean inner product is used.

Parameters:
  • residuals (ndarray) – Residual vectors \(t = x - c\), shape (n, d) or (d,).

  • index_vectors (ndarray) – Index vector \(u\), broadcastable to residuals. Shape (d,) for a single shared index, or (n, d) for per-sample indices.

  • geometry (HilbertGeometry | None) – Hilbert space geometry. None (default) uses Euclidean.

Return type:

ndarray

Returns:

Loss values of shape (n,) (or scalar if input was 1-D).