coder.gpu.constantMemory - Pragma that maps a variable to the constant memory on GPU - MATLAB
Syntax
coder.gpu.constantMemory(v)
Description
coder.gpu.constantMemory(v) maps the
variable v to the constant memory space on the GPU device. Place
this pragma within a parallelizable loop. If GPU Coder™ generates a kernel for the loop, it loads v into a
device constant memory variable and replaces every
access to v within the kernel with an access to the constant memory variable.
Within the kernel, the variable v must be read-only. Otherwise,
GPU Coder ignores this pragma. Use this pragma when every thread accesses every
element of the parameter array or matrix.
This function is a code generation function. It has no effect in MATLAB®.
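As a minimal sketch of the access pattern described above, the hypothetical function below evaluates a cubic polynomial whose four coefficients are read by every loop iteration and never written. The names applyCoeffs, x, and c are illustrative assumptions, not part of this reference page. Whether GPU Coder actually creates a kernel for the loop depends on its own analysis.

```matlab
function y = applyCoeffs(x, c) %#codegen
% Hypothetical example: x is a numeric vector, c is a read-only 1x4 vector.
y = coder.nullcopy(zeros(size(x)));
coder.gpu.kernel();
for n = 1:numel(x)
    % Request constant memory for c; c is only read inside the loop.
    coder.gpu.constantMemory(c);
    % Every iteration reads every element of c (Horner evaluation).
    y(n) = ((c(1)*x(n) + c(2))*x(n) + c(3))*x(n) + c(4);
end
end
```

Because the pragma has no effect in MATLAB, calling applyCoeffs(x, c) in MATLAB should agree with polyval(c, x) up to rounding.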
Examples
This example shows how to map an input to the constant memory
space on the GPU by using the coder.gpu.constantMemory
pragma.
Write an entry-point function myFun that accepts two
inputs: a of size 256x256 and a constant
k of size 1x3. The function uses
nested for-loops to add the three constants to each element
of a. To create a kernel, place the
coder.gpu.kernel() pragma outside the nested
for-loops, immediately before the outer loop. The
coder.gpu.constantMemory(k) pragma places the read-only
input k into the constant memory of the GPU.
function b = myFun(a,k)
    b = coder.nullcopy(zeros(size(a)));
    coder.gpu.kernel();
    for j = 1:256
        for i = 1:256
            coder.gpu.constantMemory(k);
            b(i,j) = a(i,j) + k(1) + k(2) + k(3);
        end
    end
end
Create a configuration object for MEX code generation.
cfg = coder.gpuConfig('mex');
Define a cell array input that declares the size and
data type of the inputs a,k to the function
myFun.
input = {ones(256),ones(1,3)}
Generate a MEX function myFun_mex by using
-config, -args, and
-report options to specify configuration, provide
input arguments, and generate a code generation report.
codegen -config cfg -args input -report myFun
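Before inspecting the report, the generated MEX function can be checked against the MATLAB function. The following is a minimal sketch, assuming code generation succeeded, a CUDA-capable GPU is available, and myFun_mex is on the MATLAB path. The test data and the tolerance are illustrative choices.

```matlab
% Illustrative test data matching the types used for -args:
% a is 256x256 double, k is 1x3 double.
a = rand(256);
k = [1 2 3];

bRef = myFun(a, k);      % runs in MATLAB, where the pragmas have no effect
bGpu = myFun_mex(a, k);  % runs the generated CUDA code

% Compare the two results element by element.
assert(isequal(size(bGpu), size(bRef)));
assert(max(abs(bGpu(:) - bRef(:))) < 1e-12);
```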
In the report, on the C code tab, click
myFun.cu.
The read-only variable k is declared as
const_k by using the __constant__
qualifier as shown in the code snippet.
/* Variable Definitions */
__constant__ real_T const_k[3];
The cudaMemcpyToSymbol call copies the value of
k from the host to the device constant memory
const_k.
cudaMemcpyToSymbol(const_k, k, 24U, 0U, cudaMemcpyHostToDevice);
cudaMemcpy(gpu_a, a, 524288U, cudaMemcpyHostToDevice);
myFun_kernel1<<<dim3(128U, 1U, 1U), dim3(512U, 1U, 1U)>>>(gpu_a, gpu_b);
cudaMemcpy(b, gpu_b, 524288U, cudaMemcpyDeviceToHost);
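The byte counts and launch dimensions in these calls follow from the input sizes. The short check below reproduces that arithmetic, assuming real_T is an 8-byte double, which matches the double inputs declared for code generation. The variable names are illustrative.

```matlab
bytesPerDouble = 8;                    % assumed size of real_T (double)

kBytes = 3 * bytesPerDouble;           % k is 1x3: size passed to cudaMemcpyToSymbol
aBytes = 256 * 256 * bytesPerDouble;   % a and b are 256x256: size passed to cudaMemcpy
numThreads = 128 * 512;                % 128 blocks of 512 threads each

assert(kBytes == 24);
assert(aBytes == 524288);
assert(numThreads == 256 * 256);       % one thread per element of a
```

Under these assumptions, the launch provides exactly one thread for each of the 65536 elements of a.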
The kernel body reads the three elements of const_k and adds
them to each element of a.
static __global__ __launch_bounds__(512, 1) void myFun_kernel1(const real_T *a,
real_T *b)
{
int32_T i;
int32_T j;
int32_T threadIdX;
threadIdX = (int32_T)(blockDim.x * blockIdx.x + threadIdx.x);
i = threadIdX / 256;
j = threadIdX - i * 256;
if ((!(j >= 256)) && (!(i >= 256))) {
b[i + (j << 8)] = ((a[i + (j << 8)] + const_k[0]) + const_k[1]) + const_k[2];
}
}
Input Arguments
v: Name of the variable to map to the constant memory space on the GPU device.
Version History
Introduced in R2017b
See Also
Apps
Functions
codegen | coder.gpu.kernel | coder.gpu.kernelfun | stencilfun | coder.gpu.nokernel | coder.gpu.persistentMemory