OpenCL code fails when an array is accessed twice in a kernel function

Hi,
I’m creating a kernel function to update the vertex positions of a 3D object when a sphere collides with and deforms it. Basically, my function is:


/*
* center - position of the center of the sphere
* vList - list of vertices
* fList - list of faces of the object.
* Each face has 3 vertex indices (vIndex1, vIndex2, vIndex3) that refer to the list of vertices
* vNewList - list of new vertices to be read back
*/
__kernel void
update(Point3f force, Point3f center, float radius, __global Point3f * vList, __global Face * fList, __global Point3f * vNewList)
{
...
      int nIndex = get_global_id(0);
...
      // calculating the new positions for vertices of a face and store in p1 p2 p3
      // and update the vertex list to vNewList
     vNewList[fList[nIndex].vIndex1] = p1;
     vNewList[fList[nIndex].vIndex2] = p2;
     vNewList[fList[nIndex].vIndex3] = p3;
}

However, the vNewList array elements were all NaN. I tried commenting out some lines in my code:


       vNewList[fList[nIndex].vIndex1] = p1;
//     vNewList[fList[nIndex].vIndex2] = p2;
//     vNewList[fList[nIndex].vIndex3] = p3;

With that change, some vertices had correct values and the others were all zeros. Then I tested my code by writing to vIndex1 three times (I also tested with two writes):


       vNewList[fList[nIndex].vIndex1] = p1;
       vNewList[fList[nIndex].vIndex1] = p2; // access vIndex1 only
       vNewList[fList[nIndex].vIndex1] = p3; //

The result was the same as in the first trial (all NaN).

I guess I’ve caused some memory conflict, but I don’t know how to fix it. Could you help me with this? Thank you very much!

Have you checked whether any function calls return an error code? Have you passed a notify function to clCreateContext()?
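As a rough sketch of checking every return code, the helper below wraps each call's result. `cl_check` is a hypothetical name, not part of OpenCL, and it takes a plain `int` so the snippet compiles without the OpenCL headers. Keep in mind that clEnqueueNDRangeKernel() only queues the kernel, so a failure during execution may only be reported by a later call such as clFinish() or a blocking read.

```c
#include <stdio.h>

/* Hypothetical helper, not part of OpenCL: logs the failing call and
 * returns 0 when an OpenCL function reports an error, 1 otherwise.
 * CL_SUCCESS is defined as 0, so any non-zero code counts as a failure. */
static int cl_check(int err, const char *what)
{
    if (err != 0) {
        fprintf(stderr, "%s failed with error code %d\n", what, err);
        return 0;
    }
    return 1;
}

/* Intended use in the host code (hCmdQueue is assumed to be the command
 * queue that the kernel was enqueued on):
 *
 *   if (!cl_check(clFinish(hCmdQueue), "clFinish")) { ... handle it ... }
 */
```

Checking the call that waits for the kernel, and the call that reads the buffer back, is the part most likely to reveal an invalid memory access.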

This looks like a page fault due to an invalid memory access. Do what I suggested above and also double-check that the values from fList[nIndex].vIndexX are in range. You could, for example, do something like this:


__kernel void
update(Point3f force, Point3f center, float radius, __global Point3f * vList, __global Face * fList, __global uint * vNewList)
{
...
      int nIndex = get_global_id(0);
...
     vNewList[nIndex] = fList[nIndex].vIndex1;
}
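The same range check can also be done on the host before the buffers are uploaded. This is only a sketch: the thread does not show the real definition of Face, so the struct layout below is an assumption, and `count_bad_faces` is a hypothetical helper name.

```c
#include <stddef.h>

/* Assumed layout; the actual Face definition is not shown in the thread. */
typedef struct {
    unsigned int vIndex1, vIndex2, vIndex3;
} Face;

/* Hypothetical helper: returns how many faces reference at least one
 * vertex index that is outside [0, vertexCount). */
static size_t count_bad_faces(const Face *faces, size_t faceCount,
                              unsigned int vertexCount)
{
    size_t bad = 0;
    for (size_t i = 0; i < faceCount; ++i) {
        if (faces[i].vIndex1 >= vertexCount ||
            faces[i].vIndex2 >= vertexCount ||
            faces[i].vIndex3 >= vertexCount) {
            ++bad;
        }
    }
    return bad;
}
```

If this returns anything other than 0 for the data passed as fList, the kernel writes outside vNewList for those faces.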

Thank you for your reply. However, I also tested some other cases to make sure the rest of the program was running well (I’m not sure I’ve covered all the cases).
Test case 1:


vNewList[fList[nIndex].vIndex1] = p1; // p2 and p3 are all tested

The above code ran well, so I think there is no problem with p1, p2, p3. But when I added one more line:


vNewList[fList[nIndex].vIndex1] = p1;
vNewList[fList[nIndex].vIndex1] = p1;

==> It didn’t work anymore: all the values in the vNewList array became NaN.
Then I tried to make sure the indices were correct by testing like this:


vNewList[fList[nIndex].vIndex1].x = fList[nIndex].vIndex1;
vNewList[fList[nIndex].vIndex1].y = fList[nIndex].vIndex2;
vNewList[fList[nIndex].vIndex1].z = fList[nIndex].vIndex3;
// already tested for vNewList[fList[nIndex].vIndex2] and vNewList[fList[nIndex].vIndex3]

The result was an array in which some elements got the assigned values. But again, if I do the assignment one more time, all the values become NaN:


vNewList[fList[nIndex].vIndex1].x = fList[nIndex].vIndex1;
vNewList[fList[nIndex].vIndex1].y = fList[nIndex].vIndex2;
vNewList[fList[nIndex].vIndex1].z = fList[nIndex].vIndex3;

vNewList[fList[nIndex].vIndex1].x = fList[nIndex].vIndex1;
vNewList[fList[nIndex].vIndex1].y = fList[nIndex].vIndex2;
vNewList[fList[nIndex].vIndex1].z = fList[nIndex].vIndex3;

I’m not sure that I’ve covered all of the test cases, but the above tests made me pretty sure that there is nothing wrong with the indices. Moreover, I check the result of each step in the C++ code:


...
hContext = clCreateContext(0, 1, &cdDevice, NULL, NULL, &error);
printf("context test= %d \n", error);
...
error = clBuildProgram(hProgram, 0, 0, 0, 0, 0);
printf("build test = %d \n", error);
...
error = clEnqueueNDRangeKernel(hCmdQueue, hKernel, 1, 0, globalSize, 0, 0, 0, 0);
printf("launching testing = %d \n", error);

And all the results were zero, which is fine. Thank you very much for your reply!

I still think it would be a good idea to try passing a notify function pointer to clCreateContext() instead of NULL.
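A minimal sketch of such a notify function is shown below. The names `cl_notify` and `g_notify_count` are hypothetical. The parameter list follows the pfn_notify argument of clCreateContext(), but it is written with plain C types so the snippet compiles without the OpenCL headers; in real host code it would normally be declared with the CL_CALLBACK qualifier.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical counter so the host code can tell whether the runtime
 * reported anything through the callback. */
static int g_notify_count = 0;

/* Called by the OpenCL implementation to report errors in the context.
 * With the OpenCL headers included, declare it as
 *   void CL_CALLBACK cl_notify(...)
 * so the calling convention matches on every platform. */
static void cl_notify(const char *errinfo, const void *private_info,
                      size_t cb, void *user_data)
{
    (void)private_info;  /* implementation-specific binary data, unused here */
    (void)cb;            /* size of private_info in bytes */
    (void)user_data;     /* pointer given as the user_data argument */
    ++g_notify_count;
    fprintf(stderr, "OpenCL notify: %s\n", errinfo);
}

/* Intended use, replacing the first NULL in the call from this thread:
 *
 *   hContext = clCreateContext(0, 1, &cdDevice, cl_notify, NULL, &error);
 */
```

The callback may be invoked asynchronously by the implementation, so it should stay short and avoid touching state that the rest of the program modifies without synchronization.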

That said, from your investigation it looks like you’ve found a compiler bug. If you can reduce the kernel to fewer lines of code and send it to your OpenCL vendor, I’m sure they would appreciate it and try to fix the problem.