ot.gmm : numerical errors
Describe the bug
Function ot.gmm.gaussian_pdf is numerically unstable for high-dimensional Gaussians, e.g. when the determinant of the covariance matrix becomes very small. This can lead to inaccurate densities, underflow, or nan values. The issue arises from the direct computation of $\det C$ and $\exp (-0.5 ...)$, both of which can leave the representable floating-point range when the covariance matrix is poorly scaled.
To Reproduce
- Define a high-dimensional diagonal covariance with small entries
- Compute the density using ot.gmm.gaussian_pdf
- Observe that the computed values are inaccurate
Code sample
import numpy as np
import ot.gmm

# Example input
d = 512  # dimension
x = np.random.randn(10, d)  # samples
m = np.zeros(d)  # mean
C = np.eye(d) * 0.01  # covariance

# Compute PDF
pdf = ot.gmm.gaussian_pdf(x, m, C)
print("Computed PDF values:", pdf)
Output
Computed PDF values: [nan nan nan nan nan nan nan nan nan nan]
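The nan values are consistent with the determinant underflowing to zero. The sketch below reproduces the effect with plain NumPy, assuming the density is evaluated with the direct formula (normalisation constant times exponential). It mirrors that formula only and is not necessarily the exact POT code; the variable names are illustrative.

```python
import numpy as np

d = 512
C = np.eye(d) * 0.01
rng = np.random.default_rng(0)
x = rng.standard_normal(d)  # a single sample, zero mean assumed

# det C = 0.01**512 = 1e-1024, far below the smallest float64, so it becomes 0.0
det_C = np.linalg.det(C)

# The log-determinant is perfectly representable
sign, logdet_C = np.linalg.slogdet(C)

with np.errstate(divide="ignore", invalid="ignore"):
    # Normalisation constant: division by a zero determinant gives inf
    z = (2 * np.pi) ** (-d / 2) * det_C ** (-0.5)
    # Exponential term: the quadratic form is about 512 / 0.01, so exp() gives 0.0
    quad = x @ np.linalg.inv(C) @ x
    e = np.exp(-0.5 * quad)
    pdf = z * e  # inf * 0.0 = nan

print("det C:", det_C, "| log det C:", logdet_C)
print("z:", z, "| exp term:", e, "| pdf:", pdf)
```

Both factors fall outside the float64 range on their own, while their logarithms are of moderate size, which suggests that working in the log domain would avoid the problem.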
Environment (please complete the following information):
POT installed with pip
macOS-15.0-arm64-arm-64bit
Python 3.9.6 (default, Feb 3 2024, 15:58:27) [Clang 15.0.0 (clang-1500.3.9.4)]
NumPy 1.25.2
SciPy 1.13.1
POT 0.9.5
Additional context
This numerical instability breaks downstream functions like gmm_ot_apply_map, where ratios of densities are computed using gaussian_pdf. Even when the densities stay finite, dividing two very small values can introduce further inaccuracies.
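One possible direction, sketched below, is to evaluate the log-density through a Cholesky factorisation and to form density ratios as differences of logs. The helper gaussian_logpdf is hypothetical and not part of POT; it uses NumPy only and assumes a single symmetric positive-definite covariance, so it ignores POT's backend abstraction and any batching over components.

```python
import numpy as np

def gaussian_logpdf(x, m, C):
    """Log-density of N(m, C) at the rows of x (hypothetical helper, not a POT function)."""
    d = x.shape[-1]
    L = np.linalg.cholesky(C)                   # C = L @ L.T
    y = np.linalg.solve(L, (x - m).T)           # whitened residuals, shape (d, n)
    quad = np.sum(y ** 2, axis=0)               # (x - m)^T C^{-1} (x - m)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))   # log det C without forming det C
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

rng = np.random.default_rng(0)
d = 512
x = rng.standard_normal((10, d))
m = np.zeros(d)
C = np.eye(d) * 0.01

logp = gaussian_logpdf(x, m, C)

# Ratio of two densities, kept in the log domain as a difference
log_ratio = logp - gaussian_logpdf(x, m, np.eye(d) * 0.02)

print("Log-density values:", logp)
print("Log-ratio values:", log_ratio)
```

With the same input as the code sample above, every log-density is finite. Exponentiating individual values would still underflow, so a caller that needs normalised weights would have to normalise in the log domain before exponentiating.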