ARROW-1392: [C++] Add GPU IO interfaces for CUDA by wesm · Pull Request #985 · apache/arrow
This makes it easy to write from host to device and read from device to host. We also need a zero-copy device reader for IPC purposes (where we don't want to move any data to the host); that can be done in a subsequent patch.
I'm going to start a quick cuda-benchmark since I'm curious if buffering writes affects performance at all
Write buffering is important when individual writes are small; it will be important to enable when writing to GPU memory on IPC hot paths.
In this benchmark, a total of 128MB is written to a GPU buffer using a chunk size varying from 256 bytes to 64K. In the Buffered case, an 8MB host buffer created with cudaMallocHost is used.
$ ./release/cuda-benchmark
Run on (8 X 4200 MHz CPU s)
2017-08-22 15:33:53
Benchmark                                                     Time             CPU   Iterations
----------------------------------------------------------------------------------------
BM_Writer_Buffered/256/min_time:1.000/real_time        24026339 ns     24027197 ns           58   5.20262GB/s
BM_Writer_Buffered/4k/min_time:1.000/real_time         23202300 ns     23202817 ns           60   5.3874GB/s
BM_Writer_Buffered/64k/min_time:1.000/real_time        23167510 ns     23168351 ns           60   5.39549GB/s
BM_Writer_Unbuffered/256/min_time:1.000/real_time    1691322141 ns   1691377175 ns            1   75.6804MB/s
BM_Writer_Unbuffered/4k/min_time:1.000/real_time      125177161 ns    125181662 ns           11   1022.55MB/s
BM_Writer_Unbuffered/64k/min_time:1.000/real_time      26685025 ns     26685828 ns           54   4.68428GB/s
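To illustrate why buffering helps with small writes, here is a minimal sketch of the staging idea: small writes accumulate in a host buffer and reach the device in a few large copies. This is not the code from this PR. `FakeDevice` and `BufferedDeviceWriter` are hypothetical names, and the "device" is ordinary host memory so the example runs without a GPU; in real CUDA code the copy would be a `cudaMemcpy` with `cudaMemcpyHostToDevice`, and the staging buffer would be pinned memory from `cudaMallocHost`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for device memory (assumption: not part of Arrow).
// It counts copy calls, since each real host-to-device copy has a fixed cost.
class FakeDevice {
 public:
  explicit FakeDevice(size_t size) : memory_(size), num_copies_(0) {}

  // In real CUDA code this would be cudaMemcpy(..., cudaMemcpyHostToDevice).
  void CopyToDevice(size_t offset, const uint8_t* src, size_t nbytes) {
    assert(offset + nbytes <= memory_.size());
    std::memcpy(memory_.data() + offset, src, nbytes);
    ++num_copies_;
  }

  const std::vector<uint8_t>& memory() const { return memory_; }
  int64_t num_copies() const { return num_copies_; }

 private:
  std::vector<uint8_t> memory_;
  int64_t num_copies_;
};

// Hypothetical buffered writer: stages small writes on the host and
// flushes them to the device in one copy when the staging buffer fills.
class BufferedDeviceWriter {
 public:
  BufferedDeviceWriter(FakeDevice* device, size_t staging_size)
      : device_(device), staging_(staging_size), buffered_(0), position_(0) {
    assert(staging_size > 0);
  }

  void Write(const uint8_t* data, size_t nbytes) {
    if (nbytes >= staging_.size()) {
      // Large write: staging gains nothing, so flush and copy directly.
      Flush();
      device_->CopyToDevice(position_, data, nbytes);
      position_ += nbytes;
      return;
    }
    if (buffered_ + nbytes > staging_.size()) {
      Flush();  // Make room before staging this write.
    }
    std::memcpy(staging_.data() + buffered_, data, nbytes);
    buffered_ += nbytes;
  }

  // Push any staged bytes to the device in a single copy.
  void Flush() {
    if (buffered_ == 0) return;
    device_->CopyToDevice(position_, staging_.data(), buffered_);
    position_ += buffered_;
    buffered_ = 0;
  }

  // Logical write position, including bytes not yet flushed.
  size_t Tell() const { return position_ + buffered_; }

 private:
  FakeDevice* device_;
  std::vector<uint8_t> staging_;
  size_t buffered_;  // Bytes currently held in the staging buffer.
  size_t position_;  // Bytes already copied to the device.
};
```

With a 64K staging buffer, writing 1MB in 256-byte chunks issues 16 device copies instead of 4096, which is the effect the Unbuffered/256 versus Buffered/256 rows reflect.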
I still need to add reader tests. On contemplation I think this should be zero-copy and return device pointers; that would be much more useful at the moment.
cc @m1mc, I can copy you on GPU related PRs if you want to help review
OK, I will merge this on a green build. These interfaces will see hardening when used on the IPC code paths. The idea is that arrow::gpu::CudaBufferReader can be used for zero-copy IPC reads, and the reconstructed record batches will contain device pointers internally. The user can unbox these pointers into whatever data structure they want.
@xhochy I added const to the Tell method on files. This probably should have been done in the first place. The impact on downstream users is nil, but any third parties that have created their own implementations will have to add const. This seems unlikely, so I don't think a deprecation cycle is needed for this.
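A small sketch of what the const change means for implementers follows. The interface here is hypothetical and simplified (it returns the position directly and is not Arrow's actual class or signature); it only shows why a subclass must repeat `const` once the base method has it.

```cpp
#include <cstdint>

// Hypothetical, simplified file interface (assumption: not Arrow's real API).
class FileInterface {
 public:
  virtual ~FileInterface() = default;
  // Tell does not modify the file, so it is declared const.
  virtual int64_t Tell() const = 0;
};

// A third-party implementation must also mark Tell as const. Without it,
// the method no longer overrides the base declaration: `override` fails to
// compile and the class stays abstract.
class ThirdPartyFile : public FileInterface {
 public:
  explicit ThirdPartyFile(int64_t position) : position_(position) {}
  int64_t Tell() const override { return position_; }

 private:
  int64_t position_;
};

// Callers benefit: Tell can now be used through a const reference.
int64_t ReportPosition(const FileInterface& file) { return file.Tell(); }
```

Code that only calls Tell is unaffected, which matches the comment that the impact on downstream users is nil.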
wesm deleted the ARROW-1392 branch