Atomic operations in OpenCL 1.0

I was having trouble with atomic operations in my kernel for computing
grey-level co-occurrence matrices, so I tried running the simple kernel below:


#pragma OPENCL EXTENSION cl_khr_global_int32_base_atomics : enable
#pragma OPENCL EXTENSION cl_khr_local_int32_base_atomics : enable
__kernel void
atomic(__global int* num)
{

   atom_inc(&num[0]);
}

But all I’m getting at runtime is this:


[CL_BUILD_PROGRAM_FAILURE] : OpenCL Error : clBuildProgram failed: could not build program for device 0 (0x100683f70) (-44)
Break on OpenCLErrorBreak to debug.
[CL_BUILD_ERROR] : OpenCL Build Error : Compiler build log:
Error while compiling the ptx module: CLH_ERROR_NO_BINARY_FOR_GPU
PTX Info log: 
PTX Error log: 
[CL_INVALID_PROGRAM_EXECUTABLE] : OpenCL Error : Failed to create kernel! Could not find a device with a built executable for this kernel.
Break on OpenCLErrorBreak to debug.

I'm running Mac OS X 10.6.2 with a GeForce 9600M GT.
Any suggestions, please?

You should try it on Linux or Windows :wink:

I have the same problem with atomics, but on a GT 120 running OS X 10.6 Snow Leopard. Any fixes?

Have you checked that your OpenCL 1.0 implementation actually supports atomics? In OpenCL 1.0, atomics are provided as optional extensions.

Check your CL_DEVICE_EXTENSIONS string and see whether cl_khr_global_int32_base_atomics is returned in the list.
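
A minimal sketch of that check in plain C, assuming you have already copied the extension string into a buffer (typically with clGetDeviceInfo and CL_DEVICE_EXTENSIONS). The helper name has_extension is hypothetical, not part of the OpenCL API. It matches whole space-separated tokens, so a name that is only a prefix of another extension does not count as a hit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an OpenCL API call): returns true if `name`
 * appears as a complete space-separated token in `extensions`.
 * `extensions` is assumed to be the NUL-terminated string obtained from
 * clGetDeviceInfo(..., CL_DEVICE_EXTENSIONS, ...). */
static bool has_extension(const char *extensions, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the hit only if it is delimited on both sides. */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token)
            return true;
        p += len; /* skip this partial match and keep searching */
    }
    return false;
}
```

You would call it once per extension your kernel enables, for example has_extension(ext, "cl_khr_local_int32_base_atomics"), and leave the matching #pragma and any atomics on __local memory out of the kernel when it returns false.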

Thanks for the reply! Here are my available extensions:

cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_APPLE_gl_sharing cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions

I just tried atom_add on __global memory and that actually worked. Does this mean atomic operations on __local memory are not supported? Cheers.

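The cl_khr_local_int32_base_atomics and cl_khr_local_int32_extended_atomics extensions are not in your list, so atomic operations on __local memory are not supported: the GeForce 9600M GT cannot do them. The pragma in your kernel that enables cl_khr_local_int32_base_atomics is therefore the likely cause of the build failure.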

The problem with this simple program may go away if you change the OS from Mac OS 10.6.2 to Linux or Windows. Kernel-based code usually works best on Linux, so the OS may be the root cause of the problem. It is better to get some insight into the problem by going through the handbook and finding out which versions of that particular kernel are suited to that particular OS.

The best solution to the problem would be to simply switch to Windows.

Some people don’t have a Windows or Linux machine available. And in my experience each platform has its problems, so you may be solving one issue and creating another. Even the profiler and debugger now work on Mac OS, so there is no need to blame it for problems.

Anyway, since I started this topic… My initial problem was that my card didn’t support local atomics, which I overlooked.