[MRG] EMD and Wasserstein 1D by rtavenar · Pull Request #89 · PythonOT/POT
I started coding a dedicated EMD solver for the one-dimensional case (i.e. when sorting both arrays is enough).
Docs are missing for the moment (will do that asap), but a basic implementation that covers the non-uniform weight case is already there, along with tests that check that the results are consistent with EMD.
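To illustrate why sorting is enough in 1D, here is a minimal pure-Python sketch of the idea, not the code in this PR. The function name `emd_1d_sketch`, its argument order, and the exponent parameter `p` are hypothetical and chosen only for this example; the sketch assumes a cost of the form |x - y|^p with p >= 1 and weights that sum to the same total on both sides. Empty weight lists are treated as uniform, mirroring the `[]` arguments used in the benchmark.

```python
def emd_1d_sketch(a, b, x, y, p=2):
    """Illustrative 1D optimal transport (hypothetical helper, not POT's API).

    a, b : weights of the source and target samples ([] means uniform)
    x, y : 1D sample positions
    Returns (cost, plan), where plan is a list of (i, j, mass) triples
    indexed in the ORIGINAL (unsorted) order of x and y.
    """
    n, m = len(x), len(y)
    a = list(a) if len(a) else [1.0 / n] * n
    b = list(b) if len(b) else [1.0 / m] * m

    # Sorting both supports is the only non-linear step: O(n log n + m log m).
    order_x = sorted(range(n), key=lambda i: x[i])
    order_y = sorted(range(m), key=lambda j: y[j])

    i = j = 0
    rem_a, rem_b = a[order_x[0]], b[order_y[0]]  # mass left at current points
    cost, plan = 0.0, []

    # Sweep both sorted lists once, always matching the leftmost
    # remaining source mass with the leftmost remaining target mass.
    while True:
        xi, yj = order_x[i], order_y[j]
        mass = min(rem_a, rem_b)
        if mass > 0:
            plan.append((xi, yj, mass))
            cost += mass * abs(x[xi] - y[yj]) ** p
        if rem_a <= rem_b:
            # Source point exhausted: move to the next source point.
            rem_b -= rem_a
            i += 1
            if i == n:
                break
            rem_a = a[order_x[i]]
        else:
            # Target point exhausted: move to the next target point.
            rem_a -= rem_b
            j += 1
            if j == m:
                break
            rem_b = b[order_y[j]]
    return cost, plan


cost, plan = emd_1d_sketch([], [], [3.0, 1.0], [2.0, 4.0], p=2)
print(cost, plan)
```

The plan has at most n + m - 1 non-zero entries, so a sparse representation like the list of triples above avoids building the dense n × m matrix that the general solver needs.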
>>> n = 20000
>>> m = 3000
>>> u = np.random.randn(n, 1)
>>> v = np.random.randn(m, 1)
>>> ot.tic(); ot.emd_1d([], [], u, v, metric='sqeuclidean'); ot.toc()
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
Elapsed time : 2.3728668689727783 s
2.3728668689727783
>>> ot.tic(); M = ot.dist(u, v, metric='sqeuclidean'); ot.emd([], [], M); ot.toc()
RESULT MIGHT BE INACURATE
Max number of iteration reached, currently 100000. Sometimes iterations go on in cycle even though the solution has been reached, to check if it's the case here have a look at the minimal reduced cost. If it is very close to machine precision, you might actually have the correct solution, if not try setting the maximum number of iterations a bit higher
/Users/tavenard_r/Documents/costel/src/POT/ot/lp/__init__.py:104: UserWarning: numItermax reached before optimality. Try to increase numItermax.
  result_code_string = check_result(result_code)
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
Elapsed time : 8.67806887626648 s