clBuildProgram returns CL_INVALID_BINARY for certain code

I have a .cl file that causes problems (I have reduced it to the smallest program that still triggers the issue):


unsigned char lookUp(__global unsigned char *pData, unsigned int dimx, unsigned int dimy, unsigned int x, unsigned int y)
{
	/* Both coordinates must be in range; rows are dimx elements wide. */
	if ((x < dimx) && (y < dimy))
		return(pData[y*dimx+x]);
	else
		return(0);
}

__kernel void foo(__global unsigned char *pInData, __global unsigned char *pOutData, unsigned int dimx, unsigned int dimy)
{
	unsigned int x = (unsigned int)get_global_id(0);
	unsigned int y = (unsigned int)get_global_id(1);

	/* Start of output row y (the original used an undeclared "row"). */
	__global unsigned char *pOut = pOutData + y * dimx;

	unsigned int sum = 0;

	if ((x < dimx) && (y < dimy))
	{
		sum = ((unsigned int)lookUp(pInData, dimx, dimy, x, y));
		sum += ((unsigned int)lookUp(pInData, dimx, dimy, x+1, y));
		pOut[x] = (unsigned char)(sum / 2);
	}
}

This kernel causes clBuildProgram to return CL_INVALID_BINARY. My question is: why?
Common syntax errors result in a return value of 1, as far as I have observed.
When I delete the line sum += ((unsigned int)lookUp(pInData, dimx, dimy, x+1, y)); it works.
Need help :wink:

Might be useful to know which platform you’re on.

According to the spec, CL_INVALID_BINARY can only be returned “if (the) program is created with clCreateProgramWithBinary”, so if you’re supplying source, you shouldn’t get that error.
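To make the numeric return value of clBuildProgram easier to read, here is a minimal sketch of a lookup helper. The helper name cl_error_name and the EX_ prefixed constants are made up for this example; the numeric values are the ones defined in the standard cl.h header, copied here only so the snippet compiles without the OpenCL SDK. In real host code you would include CL/cl.h and use its constants directly.

```c
#include <string.h>

/* Hypothetical local copies of a few error codes from cl.h. */
enum {
    EX_CL_SUCCESS                = 0,
    EX_CL_COMPILER_NOT_AVAILABLE = -3,
    EX_CL_BUILD_PROGRAM_FAILURE  = -11,
    EX_CL_INVALID_VALUE          = -30,
    EX_CL_INVALID_DEVICE         = -33,
    EX_CL_INVALID_BINARY         = -42,
    EX_CL_INVALID_BUILD_OPTIONS  = -43,
    EX_CL_INVALID_PROGRAM        = -44,
    EX_CL_INVALID_OPERATION      = -59
};

/* Map an error code (as returned by clBuildProgram) to its symbolic name. */
const char *cl_error_name(int err)
{
    switch (err) {
    case EX_CL_SUCCESS:                return "CL_SUCCESS";
    case EX_CL_COMPILER_NOT_AVAILABLE: return "CL_COMPILER_NOT_AVAILABLE";
    case EX_CL_BUILD_PROGRAM_FAILURE:  return "CL_BUILD_PROGRAM_FAILURE";
    case EX_CL_INVALID_VALUE:          return "CL_INVALID_VALUE";
    case EX_CL_INVALID_DEVICE:         return "CL_INVALID_DEVICE";
    case EX_CL_INVALID_BINARY:         return "CL_INVALID_BINARY";
    case EX_CL_INVALID_BUILD_OPTIONS:  return "CL_INVALID_BUILD_OPTIONS";
    case EX_CL_INVALID_PROGRAM:        return "CL_INVALID_PROGRAM";
    case EX_CL_INVALID_OPERATION:      return "CL_INVALID_OPERATION";
    default:                           return "unknown error code";
    }
}
```

Printing cl_error_name(err) next to the raw number right after the clBuildProgram call makes it obvious whether you are looking at a source build failure (CL_BUILD_PROGRAM_FAILURE) or something like CL_INVALID_BINARY.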

Which device are you compiling this kernel for? Is it a GPU and if so, which GPU? This looks like you are doing a byte write, which requires the cl_khr_byte_addressable_store extension to be enabled in your kernel with #pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable. If you are running on a device that does not support this extension, then clBuildProgram will fail with the error you are reporting.

Use clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, … ) to see if the cl_khr_byte_addressable_store extension is supported by the device you are building the program for. If not, then your kernel will not work on this device and will need to be modified to use 32-bit values.
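The string returned for CL_DEVICE_EXTENSIONS is a space-separated list of extension names, so a plain substring search can give false positives when one name is a prefix of another. A minimal sketch of a whole-token check follows; the function name has_extension is made up for this example, and it assumes you have already copied the extension string into a null-terminated buffer via clGetDeviceInfo.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `name` occurs as a complete space-separated token
   in `ext_list` (the CL_DEVICE_EXTENSIONS string of a device). */
bool has_extension(const char *ext_list, const char *name)
{
    size_t len = strlen(name);
    const char *p = ext_list;

    if (len == 0)
        return false;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or after a space ... */
        bool starts = (p == ext_list) || (p[-1] == ' ');
        /* ... and end at the end of the string or before a space. */
        bool ends = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends)
            return true;
        p += len;  /* skip this partial match and keep searching */
    }
    return false;
}
```

If has_extension(extensions, "cl_khr_byte_addressable_store") returns false for your device, the byte write in the kernel is the first thing to change.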

I am using an NVIDIA GeForce 9800 GT.
I will check the available extensions, try to enable it, and see if this solves my problem. Thanks for now!

Do you get back a compiler error log?

The error is gone with the NVIDIA computing SDK 3.0 and the corresponding driver. It seems that this was a bug in earlier versions.