Hi,
I have some float variables whose product comes out as NaN on the GPU.
Multiplying the float variable “val” by any component (x, y or z) of the
Vec3 “dp” always yields “(-)1.#QNAN”, but only when the kernel runs on
the GPU. The same kernel works fine on the CPU.
I have triple-checked the individual values by writing them to the finalResult buffer:
both “val” and “dp” hold the correct values (compared to the sequential implementation).
Multiplying “val” or any component of “dp” by other values also works fine;
only “val*dp.x”, “val*dp.y” and “val*dp.z” fail on the GPU.
The function “Mul(Vec3 v, float f)” also works fine on the GPU in several other kernels.
Here is the relevant part of the kernel:
for (unsigned int j=0; j<3; ++j) {
    unsigned int k=(j+1)%3;
    unsigned int l=(j+2)%3;
    Vec3 dp = Sub(curPositions[l], curPositions[k]); // Vec3f subtraction
    float val = deltaL2[k]*gamma[l] + deltaL2[l]*gamma[k];
    Vec3 fForce = Mul(dp, val); // NaN on GPU, OK on CPU
    results[j].x = dp.x*val; // NaN on GPU, OK on CPU
    results[j].y = val; // OK on GPU and CPU
    results[j].z = dp.x; // OK on GPU and CPU
    // results[j] = dp; // OK on GPU and CPU
}
finalResult[gid].v[0] = results[0];
finalResult[gid].v[1] = results[1];
finalResult[gid].v[2] = results[2];
Does anyone have an idea why a float multiplication would fail on the GPU while
working fine on the CPU?
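
In case it helps with reproducing this, here is a minimal sketch in plain C of the sequential reference for one triangle. It is only an illustration: the Vec3 layout (three floats) and the bodies of Sub and Mul are assumptions, since the real definitions are not shown above, and reference_forces and float_bits are hypothetical helper names. float_bits returns the raw bit pattern of a float, so operands that print as equal can be compared exactly.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: three packed floats. The real Vec3 may differ. */
typedef struct { float x, y, z; } Vec3;

/* Assumed component-wise implementations of the helpers used in the kernel. */
static Vec3 Sub(Vec3 a, Vec3 b) {
    Vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

static Vec3 Mul(Vec3 v, float f) {
    Vec3 r = { v.x * f, v.y * f, v.z * f };
    return r;
}

/* Hypothetical helper: raw IEEE-754 bit pattern of a float, for exact comparison. */
static uint32_t float_bits(float f) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: same loop as the kernel, for one triangle.
 * Fills forces[0..2] and returns how many of them contain a NaN or infinity. */
static int reference_forces(const Vec3 pos[3], const float deltaL2[3],
                            const float gamma[3], Vec3 forces[3]) {
    int bad = 0;
    for (unsigned int j = 0; j < 3; ++j) {
        unsigned int k = (j + 1) % 3;
        unsigned int l = (j + 2) % 3;
        Vec3 dp = Sub(pos[l], pos[k]);
        float val = deltaL2[k] * gamma[l] + deltaL2[l] * gamma[k];
        forces[j] = Mul(dp, val);
        if (!isfinite(forces[j].x) || !isfinite(forces[j].y) ||
            !isfinite(forces[j].z)) {
            ++bad;
        }
    }
    return bad;
}
```

Feeding this the inputs of a single work-item and comparing float_bits of “val” and “dp.x” against the values read back from the GPU would show whether the operands really are bit-identical on both devices.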