[MRG] Sparse emd implementation by nathanneike · Pull Request #778

[MRG] Sparse emd implementation by nathanneike · Pull Request #778 · PythonOT/POT

added 2 commits

October 28, 2025 16:01

  - Implement sparse bipartite graph EMD solver in C++
  - Add Python bindings for sparse solver (emd_wrap.pyx, _network_simplex.py)
  - Add unit tests to verify sparse and dense solvers produce identical results
  - Tests use augmented k-NN approach to ensure fair comparison
  - Update setup.py to include sparse solver compilation

  Both test_emd_sparse_vs_dense() and test_emd2_sparse_vs_dense() verify:
    * Identical costs between sparse and dense solvers
    * Marginal constraint satisfaction for both solvers

  This PR implements a sparse bipartite graph EMD solver for memory-efficient
  optimal transport when the cost matrix has many infinite or forbidden edges.

  Changes:
  - Implement sparse bipartite graph EMD solver in C++
  - Add Python bindings for sparse solver (emd_wrap.pyx, _network_simplex.py)
  - Add unit tests to verify sparse and dense solvers produce identical results
  - Tests use augmented k-NN approach to ensure fair comparison

  Tests verify correctness:
    * test_emd_sparse_vs_dense() - verifies identical costs and marginal constraints
    * test_emd2_sparse_vs_dense() - verifies cost-only version

  Status: WIP - seeking feedback on implementation approach
  TODO: Add example script and documentation

rflamary changed the title ~~Sparse emd implementation~~ [WIP] Sparse emd implementation

Oct 28, 2025

…trix parameter from emd and fix linting issues

- Remove tuple format support for sparse matrices (use scipy.sparse only)
- Change index types from int64_t to uint64_t throughout (indices are never negative)
- Refactor emd() and emd2() with clear sparse/dense code path separation
- Add sparse_bipartitegraph.h to MANIFEST.in to fix build
- Add test_emd_sparse_backends() to verify backend compatibility

Refactor sparse optimal transport implementation to work across
different backends (NumPy/scipy.sparse, PyTorch/torch.sparse).

Key changes:
- Add `sparse_coo_data()` method to backend layer for uniform sparse
  matrix handling across NumPy, PyTorch, JAX, and TensorFlow backends
- Update `emd()` and `emd2()` to return transport plans in backend-native
  sparse format (scipy.sparse for NumPy, torch.sparse for PyTorch)
- Refactor `plot2D_samples_mat()` to efficiently visualize both dense
  and sparse transport plans by detecting format and iterating only
  over non-zero entries for sparse matrices
- Update `plot_sparse_emd.py` example to use new plotting function
- Add comprehensive tests for sparse EMD across backends
- Update documentation to reflect backend-agnostic sparse support

- Preserve PyTorch sparse tensors through numpy conversion for autograd
- Verify gradient w.r.t. M equals transport plan
- Add sparse backend compatibility checks and teststhrow error when unsupported backend used for sparse"

- Use sklearn.NearestNeighbors in dist_knn() (1.4x faster)
- Remove redundant test code (~50 lines)
- Migrate coo_matrix → coo_array
- Fix parameter ordering consistency

… proper backend adaptation

rflamary changed the title ~~[WIP] Sparse emd implementation~~ [MRG] Sparse emd implementation

Nov 28, 2025