gpucoder.sort
Optimized GPU implementation of the MATLAB sort function
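Syntax
B = gpucoder.sort(A)
B = gpucoder.sort(A,dim)
B = gpucoder.sort(A,direction)
[B,I] = gpucoder.sort(A,...)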
Description
B = gpucoder.sort(A) sorts the elements of an array A, and the generated CUDA® code from gpucoder.sort performs the sort operation on the GPU. The function sorts A along the first non-singleton dimension. For example:
If A is a vector, gpucoder.sort(A) sorts the elements of A.
If A is a matrix, gpucoder.sort(A) sorts the elements of each column of A.
The sorted output in B has the same type and size as A.
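As a quick illustration of the first-non-singleton-dimension rule, the following sketch calls gpucoder.sort on a vector and on a matrix. It assumes GPU Coder is installed; when called from MATLAB rather than from generated code, gpucoder.sort uses the built-in sort function. The variable names v, M, Bv, and BM are illustrative only.

```matlab
% Vector input: all elements are sorted.
v = [3 1 2];
Bv = gpucoder.sort(v);      % Bv is [1 2 3]

% Matrix input: each column is sorted independently.
M = [3 6 5; 7 2 4; 1 0 9];
BM = gpucoder.sort(M);      % BM is [1 0 4; 3 2 5; 7 6 9]
```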
B = gpucoder.sort(A,dim) has the optional argument dim that specifies the dimension along which the sort operation is performed.
B = gpucoder.sort(A,direction) has the optional argument direction that specifies the sort direction. direction can take one of two values:
'ascend' - Sorts in ascending order. This is the default option.
'descend' - Sorts in descending order.
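The dim and direction arguments can be sketched on a small matrix as follows. This is a minimal illustration that assumes GPU Coder is installed and that the call is made from MATLAB, where gpucoder.sort uses the built-in sort function. The combined three-argument form matches the one used in the example on this page; the variable names are illustrative.

```matlab
M = [3 6 5; 7 2 4];

% Sort along dimension 2 (within each row), ascending by default.
Brow = gpucoder.sort(M, 2);               % [3 5 6; 2 4 7]

% Sort along the first non-singleton dimension (columns), descending.
Bdesc = gpucoder.sort(M, 'descend');      % [7 6 5; 3 2 4]

% Both arguments together: sort each row in descending order.
Bboth = gpucoder.sort(M, 2, 'descend');   % [6 5 3; 7 4 2]
```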
[B,I] = gpucoder.sort(A,...) returns a sort index I, which specifies how the elements of A were rearranged to obtain the sorted output B.
If A is a vector, then B = A(I).
If A is an m-by-n matrix and dim = 1, then
for j = 1:n
    B(:,j) = A(I(:,j),j);
end
The sort ordering is stable. Namely, when more than one element has the same value, the order of the equal elements is preserved in the sorted output B and the indices I relating to equal elements are ascending.
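A small sketch of the stability property, assuming GPU Coder is installed and the call is made from MATLAB. The input contains repeated values so that the ordering of the indices for equal elements is visible.

```matlab
A = [2 1 2 1];
[B, I] = gpucoder.sort(A);

% B is [1 1 2 2] and I is [2 4 1 3].
% The two 1s keep their original relative order (positions 2, then 4),
% and so do the two 2s (positions 1, then 3).
reconstructed = A(I);   % equals B for a vector input
```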
When gpucoder.sort is called from MATLAB®, it uses the built-in sort function.
Examples
This example generates CUDA code to sort the columns of a matrix in descending order.
In one file, write an entry-point function mySort that
accepts a matrix input A. Use the
gpucoder.sort function to sort the columns of
A in descending order.
function B = mySort(A)
    B = gpucoder.sort(A, 1, 'descend');
end
Use the codegen function to generate a
CUDA MEX function.
codegen -config coder.gpuConfig('mex') -args {ones(1024,1024,'double')} -report mySort
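Once code generation succeeds, the generated MEX function can be compared against the built-in sort. The sketch below assumes a CUDA-capable GPU is available and that codegen used its default MEX name for this entry point, mySort_mex, in the current folder.

```matlab
% Input must match the size and class given in the -args specification.
A = rand(1024, 1024);

% Run the generated CUDA MEX function.
B = mySort_mex(A);

% Sorting only rearranges values, so an exact comparison is appropriate.
assert(isequal(B, sort(A, 1, 'descend')));
```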
Input Arguments
A - Input array, specified as a vector, matrix, or multidimensional array.
Data Types: double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char
dim - Dimension to operate along, specified as a positive integer scalar. If no value is specified, then the default is the first array dimension whose size does not equal 1.
sort returns A if
dim is greater than ndims(A).
dim is not supported when A is a
cell array, that is, sort only operates along the first
array dimension whose size does not equal 1.
Data Types: double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
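The case where dim exceeds ndims(A) can be sketched as follows. This assumes GPU Coder is installed and the call is made from MATLAB, where gpucoder.sort uses the built-in sort function.

```matlab
A = [3 1 2; 9 8 7];

% ndims(A) is 2, so sorting along dimension 3 leaves A unchanged.
B = gpucoder.sort(A, 3);
```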
direction - Sorting direction, specified as 'ascend' or
'descend'. direction is not
supported when A is a cell array, that is,
sort only sorts in ascending order.
Output Arguments
B - Sorted array, returned as a vector, matrix, or multidimensional array.
B is the same size and type as A.
The order of the elements in B preserves the order of
equal elements in A.
Data Types: double | single | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | char
I - Sort index, returned as a vector, matrix, or multidimensional array.
I is the same size as A. The index
vectors are oriented along the same dimension that sort
operates on. For example, if A is a 2-by-3 matrix, then
[B,I] = sort(A,2) sorts the elements in each row of
A. The output I is a collection of
1-by-3 row index vectors describing the rearrangement of each row of
A.
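The 2-by-3 case described above can be sketched with concrete values. This assumes GPU Coder is installed and the call is made from MATLAB; the loop variable r and the output name rebuilt are illustrative.

```matlab
A = [3 1 2; 9 8 7];
[B, I] = gpucoder.sort(A, 2);

% B is [1 2 3; 7 8 9] and I is [2 3 1; 3 2 1].
% Each row of I indexes into the corresponding row of A.
rebuilt = zeros(size(A));
for r = 1:size(A, 1)
    rebuilt(r, :) = A(r, I(r, :));
end
```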
Limitations
gpucoder.sort does not support complex numbers.
gpucoder.sort does not support the 'MissingPlacement' and 'ComparisonMethod' name-value pairs supported by the MATLAB sort function.
Version History
Introduced in R2018b
If you enable the GPU Memory Manager and use CUDA Toolkit version 11.6 or newer, generated CUDA code from the gpucoder.sort function has improved
performance. The GPU Memory Manager allocates,
frees, and manages GPU memory. For more information about the GPU Memory Manager,
see How Shared GPU Memory Manager Improves Performance of Generated MEX.
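A possible way to turn on the GPU memory manager before generating code is sketched below. It assumes that the GPU code configuration object exposes the setting as the EnableMemoryManager property of cfg.GpuConfig; check the property name against the documentation for your release. The mySort entry point is the one from the example on this page.

```matlab
% Create a MEX configuration object for GPU code generation.
cfg = coder.gpuConfig('mex');

% Assumed property name: enables the GPU memory manager.
cfg.GpuConfig.EnableMemoryManager = true;

% Generate the CUDA MEX function with this configuration.
codegen -config cfg -args {ones(1024,1024,'double')} -report mySort
```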
See Also
Apps
Functions
codegen | coder.gpu.kernel | coder.gpu.kernelfun | stencilfun | coder.gpu.constantMemory | gpucoder.reduce