Type casting question

I am writing a convolution algorithm in C#, using the Cloo library. I run the convolution on the GPU and test it against a serial version of the same algorithm. Unfortunately, I am getting minuscule errors back in my data: 1243 of the 64407 outputs are inconsistent, each off by 1, which leads me to think that something different is going on with truncation.

Is there an inconsistency in the way that OpenCL truncates or approximates vs C#?

Unfortunately, I am getting minuscule errors back in my data.

Chances are that the floating-point capabilities of your GPU are not identical to those of your CPU. You can query the single-precision floating-point capabilities of an OpenCL device (for example, which rounding modes it supports and whether it handles denormals) by calling clGetDeviceInfo(…, CL_DEVICE_SINGLE_FP_CONFIG, …).

All OpenCL-certified devices actually pass some very strict tests regarding their numerical accuracy.