CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE can be used to get the max number of arguments declared with the __constant qualifier in a
kernel. Also, max size of a constant buffer can be retrieved…
But is there a limitation for global variables declared in the program source with the __constant qualifier?
coleb, true, I pasted wrong enum.
matrem, I know constant variables reside in global memory and are cached, I just though maybe there was another limitation. Thanks.
Only the size CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE (minimum of 64KB) and the number of arguments in the kernel definition CL_DEVICE_MAX_CONSTANT_ARGS (minimum of 8).
The way I read the spec though you can have more __constant pointers in your code like the following:
So if you’re coming up against the CL_DEVICE_MAX_CONSTANT_ARGS limit you can pack your data into one argument and then split it out again on the device into multiple pointers.
My understanding is that a kernel can not use more than the max constant buffer size. At least in PTX I believe this is a separate memory space which has different allocation limits. (I may both be wrong and this may change with future hardware, particularly with Fermi which appears to unify all the address spaces.)
What’s the difference between configured L1 and “Unified Cache”? And if constant caches are a thing of the past do the new caches still have the same problem that two work-items in a warp accessing different addresses will cause a serialization?
Not to mention the question, how does one “configure” the L1 cache? Hopefully this done based on the shared memory needs of the specific kernel being launched. But then again, how does this balance out with the ability to run multiple kernels on the device at once? Or is it still only one kernel per SM at a time?
My constants are defined in global program source e.g.: __constant int something = 1;
It sucks if I’m limited to use only CL_DEVICE_MAX_CONSTANT_ARGS of these from kernels, because they are not in the kernel parameter list. You think OpenCL automatically sees them as arguments ?
The standard seems to indicate a difference between constant kernel arguments and the __constant variables at program scope. Though it’s not clear what’s the safe way to initialize a program scope constant variable.