Fix interleave_datasets with all_exhausted_without_replacement strategy by prathamk-tw · Pull Request #7955 · huggingface/datasets

When interleave_datasets was called with stopping_strategy="all_exhausted_without_replacement"
and probabilities=None, it incorrectly fell into the undersampling branch, so the output
stopped after min(lengths) rounds instead of continuing until every dataset was exhausted.

This fix adds a dedicated branch for the all_exhausted_without_replacement case when
probabilities=None. The new logic cycles through the datasets round by round, taking one
element from each dataset that still has elements left, until all are exhausted. Each element
appears exactly once.

Example:
- Input: d1=[0,1,2], d2=[10,11,12,13], d3=[20,21,22]
- Before (stops after min(lengths) = 3 rounds, dropping 13): [0, 10, 20, 1, 11, 21, 2, 12, 22]
- After (continues until d2 is exhausted): [0, 10, 20, 1, 11, 21, 2, 12, 22, 13]
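The round-by-round behavior can be illustrated with a minimal pure-Python sketch. This is not the code from the PR: the helper name round_robin_indices and its exhaust_all flag are hypothetical, and plain lists stand in for datasets. It only assumes that the output is built by indexing into the concatenation of the inputs.

```python
from typing import List


def round_robin_indices(lengths: List[int], exhaust_all: bool) -> List[int]:
    """Return indices into the concatenation of the inputs, in interleaved order.

    Hypothetical helper for illustration only, not the library's implementation.
    exhaust_all=False mimics the old behavior (stop after min(lengths) rounds);
    exhaust_all=True mimics the fixed behavior (run until every input is used up).
    """
    if not lengths:
        return []

    # Starting position of each input inside the concatenated sequence.
    offsets = [0]
    for n in lengths[:-1]:
        offsets.append(offsets[-1] + n)

    rounds = max(lengths) if exhaust_all else min(lengths)
    indices = []
    for r in range(rounds):
        for offset, n in zip(offsets, lengths):
            # Skip inputs that have already been exhausted in an earlier round.
            if r < n:
                indices.append(offset + r)
    return indices


d1, d2, d3 = [0, 1, 2], [10, 11, 12, 13], [20, 21, 22]
concatenated = d1 + d2 + d3
lengths = [len(d1), len(d2), len(d3)]

before = [concatenated[i] for i in round_robin_indices(lengths, exhaust_all=False)]
after = [concatenated[i] for i in round_robin_indices(lengths, exhaust_all=True)]

print(before)  # [0, 10, 20, 1, 11, 21, 2, 12, 22]
print(after)   # [0, 10, 20, 1, 11, 21, 2, 12, 22, 13]
```

With exhaust_all=True every index of the concatenation is emitted exactly once, which is the "without replacement" guarantee described above.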

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>