Arrays to the kernel

I just went through section 5.7.2, Setting Kernel Arguments, of the OpenCL specification, and I am guessing that I cannot pass an array to the kernel without placing it in global memory. Put another way, to pass an array to the kernel, it has to be placed in global memory and then passed as a pointer. Is this right?

– Bharath

Put another way, to pass an array to the kernel, it has to be placed in global memory and then passed as a pointer. Is this right?

You can put it in either __global memory or, if the array is small enough, __constant memory. Query CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE to find out how much __constant memory your device exposes.
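A minimal sketch of that choice, assuming a float array and two hypothetical kernels (scale_constant and scale_global) embedded as source strings, as host programs commonly do. The helper pick_kernel_source and all names in it are illustrative, not part of OpenCL; max_constant_bytes is assumed to be the value the host obtained from clGetDeviceInfo for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE.

```c
#include <stddef.h>

/* Hypothetical kernel: the coefficient array is read from __constant memory. */
static const char *SCALE_CONSTANT_SRC =
    "__kernel void scale_constant(__constant float *coeffs,\n"
    "                             __global float *data)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    data[i] *= coeffs[i];\n"
    "}\n";

/* Same hypothetical kernel, with the array in read-only __global memory. */
static const char *SCALE_GLOBAL_SRC =
    "__kernel void scale_global(__global const float *coeffs,\n"
    "                           __global float *data)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    data[i] *= coeffs[i];\n"
    "}\n";

/* Illustrative helper (an assumption, not an OpenCL API):
 * pick the __constant variant only when the array fits within the limit
 * reported for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE, otherwise fall back
 * to the __global variant. */
static const char *pick_kernel_source(size_t array_bytes,
                                      unsigned long long max_constant_bytes)
{
    return (array_bytes <= max_constant_bytes) ? SCALE_CONSTANT_SRC
                                               : SCALE_GLOBAL_SRC;
}
```

On the host side nothing else changes between the two variants: in both cases the array goes into a buffer created with clCreateBuffer, and that buffer is passed with clSetKernelArg. Only the address-space qualifier in the kernel signature differs.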

You can put it in either __global memory or, if the array is small enough, __constant memory. Query CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE to find out how much __constant memory your device exposes.

The constant memory here… Is it in any way faster than the global memory? Or does it depend on the GPU/vendor?

From the specification [6.5.3 __constant (or constant)], the constant memory is said to be a part of the global memory, which is read-only for the kernel.

So, does it mean there is no way to pass an array to the (per-thread) local or the (per-workgroup) shared memory?

– Bharath

The constant memory here… Is it in any way faster than the global memory?

I don’t know if there might be some strange scenario in which it can be slower, but generally it will be faster.

From the specification [6.5.3 __constant (or constant)], the constant memory is said to be a part of the global memory, which is read-only for the kernel.

Right, I should have made it clear that constant memory is read-only for kernels.

So, does it mean there is no way to pass an array to the (per-thread) local or the (per-workgroup) shared memory?

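That’s also right. There’s no way to pre-load local memory from the host. Each work-group has to initialize its own local memory inside the kernel, typically by copying the data in from __global or __constant memory.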
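A rough sketch of that pattern, assuming a one-dimensional range of floats. The kernel name stage_local, its argument order, and the helper local_tile_bytes are all hypothetical. The host can ask for a __local allocation by calling clSetKernelArg with a size and a NULL value, but it cannot fill that allocation, so the kernel copies the data in itself.

```c
#include <stddef.h>

/* Hypothetical kernel: each work-group copies its slice of the input into
 * __local memory, synchronizes, and only then reads elements that were
 * loaded by other work-items in the same group. */
static const char *STAGE_LOCAL_SRC =
    "__kernel void stage_local(__global const float *in,\n"
    "                          __local float *tile,\n"
    "                          __global float *out)\n"
    "{\n"
    "    size_t gid = get_global_id(0);\n"
    "    size_t lid = get_local_id(0);\n"
    "    size_t n   = get_local_size(0);\n"
    "    tile[lid] = in[gid];             /* each work-item loads one element */\n"
    "    barrier(CLK_LOCAL_MEM_FENCE);    /* wait until the whole tile is filled */\n"
    "    out[gid] = tile[(lid + 1) % n];  /* safe to read a neighbour's element */\n"
    "}\n";

/* Illustrative helper (an assumption, not an OpenCL API):
 * bytes to request for the __local argument, one float per work-item.
 * The host would pass it as clSetKernelArg(kernel, 1, bytes, NULL);
 * the NULL value means "allocate only", the contents start uninitialized. */
static size_t local_tile_bytes(size_t work_group_size)
{
    return work_group_size * sizeof(float);
}
```

The barrier is the important part: without it, a work-item could read a tile entry before the work-item responsible for it has written it.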