Fix prevent duplicate GPU usage in distributed processing by ved1beta · Pull Request #3526 · huggingface/accelerate
What does this PR do?
Same PR as last time, with a test added.
Test Code
from accelerate import PartialState
from accelerate.utils import gather_object
import os
import torch

# Print CUDA information
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"Number of CUDA devices: {torch.cuda.device_count()}")
    print(f"Current CUDA device: {torch.cuda.current_device()}")
    print(f"CUDA device name: {torch.cuda.get_device_name()}")

# Initialize distributed state first
distributed_state = PartialState()

# Create some test data
prompts = [str(i) for i in range(4)]  # ["0", "1", "2", "3"]
batch_size = 2

# Split into batches
tokenized_prompts = [prompts[i : i + batch_size] for i in range(0, len(prompts), batch_size)]

completions_per_process = []
with distributed_state.split_between_processes(tokenized_prompts, apply_padding=True) as batched_prompts:
    for batch in batched_prompts:
        generated_text = [f"{distributed_state.device}: {t}" for t in batch]
        completions_per_process.extend(generated_text)

# Gather results from all processes
completions_gather = gather_object(completions_per_process)
print(completions_gather)
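To make it easier to reason about which batches each process receives in the test above, here is a pure-Python sketch of the splitting-with-padding idea. It is an approximation written for illustration, not the code from `accelerate`; the function name `split_with_padding` and the exact padding rule (repeat the last item until every process holds the same number of items) are assumptions.

```python
def split_with_padding(items, num_processes, process_index):
    """Return the share of `items` for one process, padded to a common length.

    Hypothetical helper for illustration only; it approximates what
    split_between_processes(..., apply_padding=True) is expected to do.
    """
    per_process, extras = divmod(len(items), num_processes)
    # The first `extras` processes each take one additional item.
    start = process_index * per_process + min(process_index, extras)
    end = start + per_process + (1 if process_index < extras else 0)
    # A process past the end of the data falls back to the last item (assumption).
    chunk = list(items[start:end]) if start < len(items) else list(items[-1:])
    # Pad by repeating the last item so all processes hold the same count.
    target = per_process + (1 if extras else 0)
    if chunk:
        chunk += [chunk[-1]] * (target - len(chunk))
    return chunk


batches = [["0", "1"], ["2", "3"]]
print(split_with_padding(batches, 1, 0))  # one process sees both batches
print(split_with_padding(batches, 2, 0))  # first of two processes
print(split_with_padding(batches, 2, 1))  # second of two processes
```

With one process, all four prompts are handled by the same device, which is consistent with the single-GPU output listed under Expected Output.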
How to Test
- Single GPU test:
CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc-per-node 1 test.py
- CPU mode test (for distributed testing):
CUDA_VISIBLE_DEVICES="" torchrun --nproc-per-node 2 test.py
Expected Output
For single GPU test:
CUDA available: True
Number of CUDA devices: 1
Current CUDA device: 0
CUDA device name: NVIDIA GeForce RTX 3050 6GB Laptop GPU
Local rank: 0
Using device: cuda:0
['cuda:0: 0', 'cuda:0: 1', 'cuda:0: 2', 'cuda:0: 3']
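One way to inspect the gathered list is to count how many entries each device label produced. The sketch below assumes the `"<device>: <prompt>"` entry format produced by the test script; the helper name `items_per_device` is hypothetical and not part of `accelerate`.

```python
from collections import Counter


def items_per_device(gathered):
    """Count how many completions carry each device label (hypothetical helper)."""
    # Each entry looks like "cuda:0: 3"; the device label is the part
    # before the last ": " separator.
    return Counter(entry.rsplit(": ", 1)[0] for entry in gathered)


single_gpu = ['cuda:0: 0', 'cuda:0: 1', 'cuda:0: 2', 'cuda:0: 3']
print(items_per_device(single_gpu))  # Counter({'cuda:0': 4})
```

In a run with several processes on several GPUs, one would presumably expect more than one device label in this count; a single label there could indicate that processes are sharing a device.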
Fixes #3485
Before submitting
- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests?