Mojo function
cp_async_bulk_wait_group[n: Int32, read: Bool = True]()
Waits for completion of asynchronous bulk memory transfer groups.
This function blocks the executing thread until at most n of the most recent bulk async-groups are still pending; all earlier bulk async-groups have completed by the time it returns. It provides synchronization control for bulk memory transfers on NVIDIA GPUs.
Note: This functionality is only available on NVIDIA GPUs. Attempting to use this function on a non-NVIDIA GPU results in a compile-time error.
Example:
from gpu.sync.sync import cp_async_bulk_wait_group
# Wait until at most 2 async groups are pending
cp_async_bulk_wait_group[2]()
# Wait for all async groups to complete
cp_async_bulk_wait_group[0]()
Parameters:
- n (Int32): The number of most recent bulk async-groups allowed to remain pending. When n=0, waits for all prior bulk async-groups to complete.
- read (Bool): If True, indicates that subsequent reads to the transferred memory are expected, enabling optimizations for read access patterns. Defaults to True.
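A wait only has something to wait on once earlier bulk copies have been committed into a bulk async-group. The sketch below shows one way the two steps might be paired inside a GPU kernel. It assumes a companion function `cp_async_bulk_commit_group` is importable from the same module as the example above; that import, and the helper name `commit_and_wait_all`, are illustrative assumptions rather than something this page documents.

```mojo
# Assumption: cp_async_bulk_commit_group lives alongside cp_async_bulk_wait_group.
from gpu.sync.sync import cp_async_bulk_commit_group, cp_async_bulk_wait_group


# Hypothetical helper, intended to be called from GPU kernel code.
@always_inline
fn commit_and_wait_all():
    # Close the bulk copies issued so far by this thread into one
    # bulk async-group (assumed behavior of the commit function).
    cp_async_bulk_commit_group()
    # n = 0: return only once no bulk async-group is still pending.
    cp_async_bulk_wait_group[0]()
```

In a pipelined kernel, a nonzero n (for example `cp_async_bulk_wait_group[1]()`) would instead let the most recently committed group stay in flight while earlier groups are guaranteed to be complete.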