How can I load arrays into constant memory space?

noah_r · January 23, 2013, 3:36pm

I’m confused about how to load read-only array values into the device constant memory space versus global memory space. The moderately sized array nodes_x in the kernel below meets all the criteria for constant memory, so some hardware should perform better if the array is placed there instead of in global memory.

__kernel void sequenceKernel_00(  constant const float * nodes_x, global float * const restrict density_n)
{ /* etc. */ }

My code already works with the first nodes_x argument being declared global. I have been passing the flag CL_MEM_READ_ONLY to clCreateBuffer().
I’m surprised that clCreateBuffer() does not take an additional argument that specifies the desired memory space.

So it seems insufficient to me that the only change needed to move the nodes_x array into constant memory is switching the kernel argument declaration from global to constant.

As an experiment, I changed the flag for my nodes_x buffer creation to CL_MEM_WRITE_ONLY. Nevertheless, clSetKernelArg() still reports CL_SUCCESS when setting the constant nodes_x argument to that buffer. This is clearly wrong and makes me think I’m missing something.

Since cl_mem buffers are created independently of any particular kernel, and clEnqueueWriteBuffer() can be called before clSetKernelArg(), I don’t see how any late binding can be in play.

(running Mac OS X 10.8 and an NVIDIA Tesla GPU)

matthiasv · January 23, 2013, 11:47pm

But late binding is in play. When the kernel is executed and the memory object is larger than CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE, you will get an error. Moreover, if you create the buffer with CL_MEM_WRITE_ONLY and then read it inside the kernel, you get “undefined behaviour” as per section 5.2.1 of the OpenCL 1.1 specification.
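
Since the size check only fails when the kernel runs, it can help to test the size on the host first. Below is a minimal sketch of that check, not code from this thread. The helper names floats_fit_in_constant and query_max_constant_bytes and the build macro HAVE_OPENCL are hypothetical; only clGetDeviceInfo() and CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE come from OpenCL itself. The device query is compiled only when HAVE_OPENCL is defined, so the size check can be built without an OpenCL SDK.

```c
#include <stddef.h>

/* Define HAVE_OPENCL (a hypothetical build macro, not part of OpenCL)
   when compiling against an OpenCL SDK to enable the device query. */
#ifdef HAVE_OPENCL
#ifdef __APPLE__
#include <OpenCL/opencl.h>
#else
#include <CL/cl.h>
#endif

/* Ask the device how many bytes a single constant argument may hold.
   Returns 0 if the query fails. */
static unsigned long long query_max_constant_bytes(cl_device_id device)
{
    cl_ulong max_bytes = 0;
    cl_int err = clGetDeviceInfo(device, CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE,
                                 sizeof(max_bytes), &max_bytes, NULL);
    return (err == CL_SUCCESS) ? (unsigned long long)max_bytes : 0ULL;
}
#endif

/* Hypothetical helper: returns 1 if an array of `count` floats fits
   within `max_constant_bytes`, 0 otherwise. The first test guards
   against overflow when computing the byte size. */
static int floats_fit_in_constant(size_t count,
                                  unsigned long long max_constant_bytes)
{
    if (count > (size_t)-1 / sizeof(float))
        return 0;
    return (unsigned long long)(count * sizeof(float)) <= max_constant_bytes;
}
```

On the host you would call something like floats_fit_in_constant(n_nodes, query_max_constant_bytes(device)) before binding the buffer to the constant nodes_x argument, where n_nodes is an assumed element count. If the array does not fit, keeping the argument declared global, as in the original working code, avoids the run-time error.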