geomexp.kernels package

Submodules

geomexp.kernels.kernels module

Kernel functions and kernelised clustering algorithms.

This module provides kernel function abstractions, a fully implemented kernel K-means algorithm, and the kernelised geometric expectile clustering algorithm. In the dual representation, centroids and index vectors are expressed as linear combinations of feature-mapped data points, so all computations reduce to operations on the Gram matrix \(K_{ij} = k(X_i, X_j)\).

class geomexp.kernels.kernels.Kernel[source]

Bases: ABC

Abstract base class for kernel functions.

A kernel \(k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}\) implicitly defines a feature map \(\varphi\) into a reproducing kernel Hilbert space \(\mathcal{H}\) such that \(k(x, y) = \langle \varphi(x), \varphi(y) \rangle_{\mathcal{H}}\).

gram_matrix(X)[source]

Compute the full Gram matrix \(K_{ij} = k(X_i, X_j)\).

Parameters:

X (ndarray) – Data array of shape (n, d).

Return type:

ndarray

Returns:

Symmetric positive semi-definite matrix of shape (n, n).

class geomexp.kernels.kernels.LinearKernel[source]

Bases: Kernel

Linear (inner product) kernel: \(k(x, y) = \langle x, y \rangle\).

class geomexp.kernels.kernels.RBFKernel(gamma=1)[source]

Bases: Kernel

Radial basis function (Gaussian) kernel.

\[k(x, y) = \exp\!\bigl(-\gamma \|x - y\|^2\bigr)\]
Parameters:

gamma (float)

gamma

Bandwidth parameter \(\gamma > 0\).
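Independent of this class's API, an RBF Gram matrix can be computed directly in NumPy from pairwise squared distances (a sketch, not the library's internals); the result is symmetric and positive semi-definite, with unit diagonal since \(k(x, x) = 1\):

```python
import numpy as np

def rbf_gram(X, gamma=0.5):
    """Gram matrix K[i, j] = exp(-gamma * ||X[i] - X[j]||^2)."""
    # Pairwise squared distances via ||x - y||^2 = ||x||^2 - 2<x, y> + ||y||^2.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] - 2.0 * X @ X.T + sq[None, :]
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative round-off

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
K = rbf_gram(X, gamma=0.5)

assert np.allclose(K, K.T)                  # symmetric
assert np.allclose(np.diag(K), 1.0)         # k(x, x) = 1
assert np.linalg.eigvalsh(K).min() > -1e-8  # positive semi-definite
```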

class geomexp.kernels.kernels.PolynomialKernel(degree=3, coef0=1)[source]

Bases: Kernel

Polynomial kernel.

\[k(x, y) = (\langle x, y \rangle + c)^p\]
Parameters:
degree

Polynomial degree \(p\).

coef0

Additive constant \(c\).
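For \(p = 2\) and \(c = 1\) in two dimensions the kernel equals an ordinary inner product under the explicit feature map \(\varphi(x) = (1, \sqrt{2}x_1, \sqrt{2}x_2, x_1^2, x_2^2, \sqrt{2}x_1 x_2)\). A quick numerical check of that identity (a standalone sketch, not part of the class API):

```python
import numpy as np

def poly_kernel(x, y, degree=2, coef0=1.0):
    return (x @ y + coef0) ** degree

def phi(x):
    # Explicit feature map for (<x, y> + 1)^2 in two dimensions.
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1**2, x2**2, s * x1 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
print(poly_kernel(x, y))  # 144.0
print(phi(x) @ phi(y))    # 144.0
```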

class geomexp.kernels.kernels.MaternKernel(nu=1.5, lengthscale=1)[source]

Bases: Kernel

Matérn kernel, implementing the Matérn covariance function.

\[k(x, y) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{\sqrt{2\nu}\,\|x - y\|}{\ell}\right)^{\nu} K_{\nu}\!\left(\frac{\sqrt{2\nu}\,\|x - y\|}{\ell}\right)\]

where \(K_{\nu}\) is the modified Bessel function of the second kind. Special cases: \(\nu = 0.5\) gives the exponential kernel, \(\nu = 1.5\) and \(\nu = 2.5\) have closed-form expressions, and \(\nu \to \infty\) recovers the RBF (squared-exponential) kernel.

Parameters:
nu

Smoothness parameter \(\nu > 0\).

lengthscale

Lengthscale parameter \(\ell > 0\).
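The \(\nu = 3/2\) closed form is \(k(r) = (1 + \sqrt{3}\,r/\ell)\exp(-\sqrt{3}\,r/\ell)\). The sketch below verifies it against the general Bessel-function expression, assuming the standard \(\sqrt{2\nu}\,r/\ell\) scaling (this implementation's exact parameterisation may differ):

```python
import numpy as np
from scipy.special import gamma, kv

def matern_bessel(r, nu=1.5, lengthscale=1.0):
    """General Matérn via the modified Bessel function of the second kind,
    with the standard sqrt(2*nu) * r / lengthscale scaling."""
    s = np.sqrt(2.0 * nu) * r / lengthscale
    return 2.0 ** (1.0 - nu) / gamma(nu) * s**nu * kv(nu, s)

def matern_15(r, lengthscale=1.0):
    """Closed form for nu = 3/2: (1 + sqrt(3) r / l) * exp(-sqrt(3) r / l)."""
    s = np.sqrt(3.0) * r / lengthscale
    return (1.0 + s) * np.exp(-s)

r = np.linspace(0.1, 3.0, 20)
assert np.allclose(matern_bessel(r, nu=1.5, lengthscale=1.2),
                   matern_15(r, lengthscale=1.2))
```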

class geomexp.kernels.kernels.KernelKMeans(n_clusters, kernel, n_init=10, max_iter=100, tol=1e-07, random_state=None)[source]

Bases: BaseClusterer

Kernel K-means clustering (dual representation).

Minimises the within-cluster sum of squared distances in the feature space \(\mathcal{H}\) induced by a kernel \(k\):

\[J = \sum_{k=1}^K \sum_{i \in C_k} \|\varphi(X_i) - c_k\|_{\mathcal{H}}^2\]

where \(c_k = \frac{1}{|C_k|} \sum_{j \in C_k} \varphi(X_j)\) is the cluster centroid in feature space. Since \(\varphi\) may not have an explicit form, all computations are expressed through the Gram matrix \(K_{ij} = k(X_i, X_j)\):

\[\|\varphi(X_i) - c_k\|^2 = K_{ii} - \frac{2}{|C_k|} \sum_{j \in C_k} K_{ij} + \frac{1}{|C_k|^2} \sum_{j,l \in C_k} K_{jl}\]

With a linear kernel this recovers standard K-means.
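The distance expansion above vectorises neatly: with a one-hot assignment matrix of shape (n, K), all three terms reduce to a few matrix products. A sketch (names hypothetical, not the class's internals), checked against the explicit feature-space computation under a linear kernel, where centroids are ordinary cluster means:

```python
import numpy as np

def kernel_dist2(K, labels, n_clusters):
    """D[i, k] = ||phi(X_i) - c_k||^2, computed from the Gram matrix alone."""
    n = K.shape[0]
    H = np.zeros((n, n_clusters))
    H[np.arange(n), labels] = 1.0
    counts = H.sum(axis=0)                    # |C_k|
    term2 = (K @ H) / counts                  # (1/|C_k|) sum_{j in C_k} K_ij
    term3 = np.diag(H.T @ K @ H) / counts**2  # (1/|C_k|^2) sum_{j,l in C_k} K_jl
    return np.diag(K)[:, None] - 2.0 * term2 + term3[None, :]

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 2))
labels = rng.integers(0, 3, size=40)
K = X @ X.T                                   # linear kernel

D = kernel_dist2(K, labels, 3)

# Under a linear kernel the feature-space centroids are the cluster means.
means = np.stack([X[labels == k].mean(axis=0) for k in range(3)])
D_ref = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
assert np.allclose(D, D_ref)
```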

Parameters:
n_clusters

Number of clusters.

kernel

Kernel function.

n_init

Number of random restarts (best result is kept).

max_iter

Maximum number of iterations.

tol

Convergence tolerance.

random_state

Random seed.

Example

>>> import numpy as np
>>> from geomexp.kernels import KernelKMeans, RBFKernel
>>> X = np.random.randn(100, 2)
>>> model = KernelKMeans(n_clusters=3, kernel=RBFKernel(gamma=0.5))
>>> result = model.fit(X)
>>> print(result.assignments.shape)
(100,)
fit(X)[source]

Fit kernel K-means with multiple random restarts.

The Gram matrix is computed once and reused across all n_init restarts.

Parameters:

X (ndarray) – Data array of shape (n_samples, n_features).

Return type:

ClusterResult

Returns:

ClusterResult from the best restart.

class geomexp.kernels.kernels.KernelGeometricExpectileClustering(n_clusters, kernel, index_radius=0.5, n_init=10, max_iter=100, tol=1e-07, center_lr=0.1, center_steps=50, random_state=None)[source]

Bases: BaseClusterer

Kernelised geometric expectile clustering (Algorithm 2).

In the feature space \(\mathcal{H}\) induced by a kernel \(k\), centroids and index vectors lie in the span of the data (Proposition 3.1):

\[c_k = \sum_{j=1}^n \beta_{kj}\,\varphi(X_j), \quad u_k = \sum_{j=1}^n \alpha_{kj}\,\varphi(X_j).\]

All norms and inner products reduce to operations on the Gram matrix \(K_{ij} = k(X_i, X_j)\):

  • Squared residual norm: \(d_{ik}^2 = K_{ii} - 2(K\beta_k)_i + \beta_k^\top K \beta_k\)

  • Index-residual inner product: \(p_{ik} = (K\alpha_k)_i - \alpha_k^\top K \beta_k\)

Setting \(r = 0\) recovers kernel K-means.
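Both dual-form quantities can be checked numerically: under a linear kernel, \(c_k = X^\top \beta_k\) and \(u_k = X^\top \alpha_k\) are explicit, so \(d_{ik}^2\) must equal \(\|X_i - c_k\|^2\) and \(p_{ik}\) must equal \(\langle u_k, X_i - c_k\rangle\). A sketch with arbitrary (hypothetical) weight matrices, not the fitted values:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, Kc = 30, 2, 3
X = rng.standard_normal((n, d))
K = X @ X.T                              # linear kernel, for an explicit check

beta = rng.standard_normal((Kc, n)) / n  # centroid weights (arbitrary here)
alpha = rng.standard_normal((Kc, n)) / n # index weights (arbitrary here)

# Dual-form quantities, one cluster at a time.
d2 = np.empty((n, Kc))
p = np.empty((n, Kc))
for k in range(Kc):
    Kb = K @ beta[k]
    d2[:, k] = np.diag(K) - 2.0 * Kb + beta[k] @ Kb   # K_ii - 2(Kb)_i + b'Kb
    p[:, k] = K @ alpha[k] - alpha[k] @ Kb            # (Ka)_i - a'Kb

# Explicit feature-space check: c_k = X^T beta_k, u_k = X^T alpha_k.
C = beta @ X                             # (Kc, d) centroids
U = alpha @ X                            # (Kc, d) index vectors
resid = X[:, None, :] - C[None, :, :]    # (n, Kc, d) residuals
assert np.allclose(d2, (resid**2).sum(axis=-1))
assert np.allclose(p, np.einsum('kd,ikd->ik', U, resid))
```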

Parameters:
n_clusters

Number of clusters.

kernel

Kernel function.

index_radius

Radius constraint \(r \in [0, 1)\).

n_init

Number of random restarts (best result is kept).

max_iter

Maximum number of outer iterations.

tol

Convergence tolerance on objective change.

center_lr

Learning rate for gradient descent on centroid weights.

center_steps

Number of gradient steps per centroid update.

random_state

Random seed.

Example

>>> import numpy as np
>>> from geomexp.kernels import KernelGeometricExpectileClustering, RBFKernel
>>> X = np.random.randn(100, 2)
>>> model = KernelGeometricExpectileClustering(
...     n_clusters=3, kernel=RBFKernel(gamma=0.5), index_radius=0.5
... )
>>> result = model.fit(X)
>>> print(result.metadata["index_weights"].shape)
(3, 100)
fit(X)[source]

Fit kernelised geometric expectile clustering with multiple random restarts.

The Gram matrix is computed once and reused across all n_init restarts.

Parameters:

X (ndarray) – Data array of shape (n_samples, n_features).

Return type:

ClusterResult

Returns:

ClusterResult from the best restart.

Module contents

Kernel functions and kernelised clustering algorithms.

class geomexp.kernels.Kernel[source]

Bases: ABC

Abstract base class for kernel functions.

A kernel \(k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}\) implicitly defines a feature map \(\varphi\) into a reproducing kernel Hilbert space \(\mathcal{H}\) such that \(k(x, y) = \langle \varphi(x), \varphi(y) \rangle_{\mathcal{H}}\).

gram_matrix(X)[source]

Compute the full Gram matrix \(K_{ij} = k(X_i, X_j)\).

Parameters:

X (ndarray) – Data array of shape (n, d).

Return type:

ndarray

Returns:

Symmetric positive semi-definite matrix of shape (n, n).

class geomexp.kernels.KernelGeometricExpectileClustering(n_clusters, kernel, index_radius=0.5, n_init=10, max_iter=100, tol=1e-07, center_lr=0.1, center_steps=50, random_state=None)[source]

Bases: BaseClusterer

Kernelised geometric expectile clustering (Algorithm 2).

In the feature space \(\mathcal{H}\) induced by a kernel \(k\), centroids and index vectors lie in the span of the data (Proposition 3.1):

\[c_k = \sum_{j=1}^n \beta_{kj}\,\varphi(X_j), \quad u_k = \sum_{j=1}^n \alpha_{kj}\,\varphi(X_j).\]

All norms and inner products reduce to operations on the Gram matrix \(K_{ij} = k(X_i, X_j)\):

  • Squared residual norm: \(d_{ik}^2 = K_{ii} - 2(K\beta_k)_i + \beta_k^\top K \beta_k\)

  • Index-residual inner product: \(p_{ik} = (K\alpha_k)_i - \alpha_k^\top K \beta_k\)

Setting \(r = 0\) recovers kernel K-means.

Parameters:
n_clusters

Number of clusters.

kernel

Kernel function.

index_radius

Radius constraint \(r \in [0, 1)\).

n_init

Number of random restarts (best result is kept).

max_iter

Maximum number of outer iterations.

tol

Convergence tolerance on objective change.

center_lr

Learning rate for gradient descent on centroid weights.

center_steps

Number of gradient steps per centroid update.

random_state

Random seed.

Example

>>> import numpy as np
>>> from geomexp.kernels import KernelGeometricExpectileClustering, RBFKernel
>>> X = np.random.randn(100, 2)
>>> model = KernelGeometricExpectileClustering(
...     n_clusters=3, kernel=RBFKernel(gamma=0.5), index_radius=0.5
... )
>>> result = model.fit(X)
>>> print(result.metadata["index_weights"].shape)
(3, 100)
fit(X)[source]

Fit kernelised geometric expectile clustering with multiple random restarts.

The Gram matrix is computed once and reused across all n_init restarts.

Parameters:

X (ndarray) – Data array of shape (n_samples, n_features).

Return type:

ClusterResult

Returns:

ClusterResult from the best restart.

class geomexp.kernels.KernelKMeans(n_clusters, kernel, n_init=10, max_iter=100, tol=1e-07, random_state=None)[source]

Bases: BaseClusterer

Kernel K-means clustering (dual representation).

Minimises the within-cluster sum of squared distances in the feature space \(\mathcal{H}\) induced by a kernel \(k\):

\[J = \sum_{k=1}^K \sum_{i \in C_k} \|\varphi(X_i) - c_k\|_{\mathcal{H}}^2\]

where \(c_k = \frac{1}{|C_k|} \sum_{j \in C_k} \varphi(X_j)\) is the cluster centroid in feature space. Since \(\varphi\) may not have an explicit form, all computations are expressed through the Gram matrix \(K_{ij} = k(X_i, X_j)\):

\[\|\varphi(X_i) - c_k\|^2 = K_{ii} - \frac{2}{|C_k|} \sum_{j \in C_k} K_{ij} + \frac{1}{|C_k|^2} \sum_{j,l \in C_k} K_{jl}\]

With a linear kernel this recovers standard K-means.

Parameters:
n_clusters

Number of clusters.

kernel

Kernel function.

n_init

Number of random restarts (best result is kept).

max_iter

Maximum number of iterations.

tol

Convergence tolerance.

random_state

Random seed.

Example

>>> import numpy as np
>>> from geomexp.kernels import KernelKMeans, RBFKernel
>>> X = np.random.randn(100, 2)
>>> model = KernelKMeans(n_clusters=3, kernel=RBFKernel(gamma=0.5))
>>> result = model.fit(X)
>>> print(result.assignments.shape)
(100,)
fit(X)[source]

Fit kernel K-means with multiple random restarts.

The Gram matrix is computed once and reused across all n_init restarts.

Parameters:

X (ndarray) – Data array of shape (n_samples, n_features).

Return type:

ClusterResult

Returns:

ClusterResult from the best restart.

class geomexp.kernels.LinearKernel[source]

Bases: Kernel

Linear (inner product) kernel: \(k(x, y) = \langle x, y \rangle\).

class geomexp.kernels.MaternKernel(nu=1.5, lengthscale=1)[source]

Bases: Kernel

Matérn kernel, implementing the Matérn covariance function.

\[k(x, y) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{\sqrt{2\nu}\,\|x - y\|}{\ell}\right)^{\nu} K_{\nu}\!\left(\frac{\sqrt{2\nu}\,\|x - y\|}{\ell}\right)\]

where \(K_{\nu}\) is the modified Bessel function of the second kind. Special cases: \(\nu = 0.5\) gives the exponential kernel, \(\nu = 1.5\) and \(\nu = 2.5\) have closed-form expressions, and \(\nu \to \infty\) recovers the RBF (squared-exponential) kernel.

Parameters:
nu

Smoothness parameter \(\nu > 0\).

lengthscale

Lengthscale parameter \(\ell > 0\).

class geomexp.kernels.PolynomialKernel(degree=3, coef0=1)[source]

Bases: Kernel

Polynomial kernel.

\[k(x, y) = (\langle x, y \rangle + c)^p\]
Parameters:
degree

Polynomial degree \(p\).

coef0

Additive constant \(c\).

class geomexp.kernels.RBFKernel(gamma=1)[source]

Bases: Kernel

Radial basis function (Gaussian) kernel.

\[k(x, y) = \exp\!\bigl(-\gamma \|x - y\|^2\bigr)\]
Parameters:

gamma (float)

gamma

Bandwidth parameter \(\gamma > 0\).