Fix interleave_datasets with all_exhausted_without_replacement strategy by prathamk-tw · Pull Request #7955 · huggingface/datasets
When using `interleave_datasets` with `stopping_strategy="all_exhausted_without_replacement"` and `probabilities=None`, the function was incorrectly falling into the undersampling branch, causing it to stop at `min(lengths)` instead of continuing until all datasets were exhausted.

This fix adds a specific branch to handle the `all_exhausted_without_replacement` case when `probabilities=None`. The new logic cycles through all datasets round by round, adding elements from each dataset until all are exhausted, ensuring each element appears exactly once.

Example fix:
- Input: `d1=[0,1,2]`, `d2=[10,11,12,13]`, `d3=[20,21,22]`
- Before: `[0, 10, 20, 1, 11, 21, 2, 12, 22]`
- After: `[0, 10, 20, 1, 11, 21, 2, 12, 22, 13]`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
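For reviewers, the round-by-round cycling described above can be sketched in plain Python. This is only an illustration of the intended ordering, not the code in this PR: `interleave_without_replacement` is a hypothetical helper that works on in-memory lists, whereas the library operates on dataset indices.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def interleave_without_replacement(datasets: Sequence[Sequence[T]]) -> List[T]:
    """Hypothetical helper: round-robin over the inputs until all are exhausted.

    Each element is emitted exactly once; inputs that run out are skipped
    in later rounds instead of ending the interleave early.
    """
    result: List[T] = []
    # The number of rounds is set by the longest input, not the shortest.
    longest = max((len(d) for d in datasets), default=0)
    for round_idx in range(longest):
        for d in datasets:
            # Skip inputs that have no element left for this round.
            if round_idx < len(d):
                result.append(d[round_idx])
    return result


d1 = [0, 1, 2]
d2 = [10, 11, 12, 13]
d3 = [20, 21, 22]
interleaved = interleave_without_replacement([d1, d2, d3])
print(interleaved)  # [0, 10, 20, 1, 11, 21, 2, 12, 22, 13]
```

With the inputs from the example, the sketch runs four rounds (the length of `d2`) rather than three, which is why `13` appears at the end of the output.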