max work item and max workgroup sizes for CPUs (MacOS / AMD)

I’m getting a CL_INVALID_WORK_GROUP_SIZE error when trying to launch a kernel on the CPU (Intel Core Duo) in MacOS. I’m setting the local work size to {11,11,1} and my kernel relies on the get_local_id() values to run properly.

When I run the code below, worksize returns from clGetDeviceInfo() with the values {1,1,1}, and maxWorkgroupSize==1. On an AMD CPU running under Windows, the same clGetDeviceInfo() functions return worksize=={1024,1024,1024} and maxWorkgroupSize==1024.

I’m wondering if this is logical – that AMD has the ability to serialize the local work items on the CPU and Apple hasn’t included that functionality. I’d appreciate any thoughts on the matter. Cheers!


size_t worksize[3];
cl_int err = clGetDeviceInfo(cl_getDeviceId(), CL_DEVICE_MAX_WORK_ITEM_SIZES, sizeof(worksize), worksize, NULL);

/* CL_DEVICE_MAX_WORK_GROUP_SIZE is returned as a size_t, not a cl_uint */
size_t maxWorkgroupSize;
err = clGetDeviceInfo(cl_getDeviceId(), CL_DEVICE_MAX_WORK_GROUP_SIZE, sizeof(maxWorkgroupSize), &maxWorkgroupSize, NULL);

-Chris

If clGetDeviceInfo() returns {1,1,1} and a max work-group size of 1, it must be a CPU. Try calling clGetDeviceIDs() with CL_DEVICE_TYPE_GPU.
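To see why {11,11,1} is rejected against those limits, here is a minimal sketch of the check the runtime effectively applies. The function name local_size_fits is made up for illustration and is not part of OpenCL. The two limit arguments are assumed to hold the values you got back from CL_DEVICE_MAX_WORK_ITEM_SIZES and CL_DEVICE_MAX_WORK_GROUP_SIZE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL call).
 * Returns 1 if the requested local size fits the device limits, 0 otherwise.
 *   max_item_sizes: value reported for CL_DEVICE_MAX_WORK_ITEM_SIZES
 *   max_group_size: value reported for CL_DEVICE_MAX_WORK_GROUP_SIZE */
int local_size_fits(const size_t local[3],
                    const size_t max_item_sizes[3],
                    size_t max_group_size)
{
    assert(local != NULL && max_item_sizes != NULL);

    size_t total = 1;
    for (int d = 0; d < 3; ++d) {
        /* Each dimension must stay within its per-dimension limit. */
        if (local[d] == 0 || local[d] > max_item_sizes[d])
            return 0;
        /* The product of all dimensions must stay within the group limit. */
        total *= local[d];
        if (total > max_group_size)
            return 0;
    }
    return 1;
}
```

With the limits from the question, {11,11,1} fails against {1,1,1} / 1 and passes against {1024,1024,1024} / 1024, since 11*11*1 = 121. The limit for a specific kernel can be lower than the device limit; clGetKernelWorkGroupInfo() with CL_KERNEL_WORK_GROUP_SIZE reports it.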
Another reason for getting CL_INVALID_WORK_GROUP_SIZE could be that the global work size is not a multiple of the local work size. Set the environment variable CL_LOG_ERRORS=stdout to get more meaningful error messages.
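If the global size turns out to be the problem, the usual workaround is to pad it up to the next multiple of the local size in each dimension. A rough sketch follows; round_up_global is a name I made up, not an OpenCL function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an OpenCL call).
 * Returns the smallest multiple of `local` that is >= `global`. */
size_t round_up_global(size_t global, size_t local)
{
    assert(local > 0);
    size_t rem = global % local;
    return rem == 0 ? global : global + (local - rem);
}
```

For example, a 100-wide problem with a local size of 11 would be launched with a global size of 110. The kernel then needs a bounds check on get_global_id() so the 10 padding work items do nothing.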