Problem in the kernel output!!

Please see this kernel and the corresponding output. When I execute the kernel, it works for the first thread, but the other threads give wrong output. Can anyone help me figure out what the problem is?


__kernel void rmsCalculation(const __global float *a,
                             const __global float *C,
                             __global float *O,
                             const int col)
{
    // One work-item per row of "a"; each row holds "col" elements.
    const int ar = get_global_id(0);

    float R = 0;
    float I = 0;
    float c = 0;
    bool totalSch = true;
    float sum = 0;

    for (int j = 0; j < col; ++j)
    {
        c = C[j] * a[ar * col + j];
        I = 0;

        do
        {
            R = I + c;
            I = 0;

            // Note: T is not declared anywhere in the code as posted.
            if (R > T[j])
            {
                totalSch = false;
                break;
            }
            else
            {
                for (int k = 0; k < j; ++k)
                {
                    I = I + C[k] * a[ar * col + k];
                }
            }
        } while (I + c > R);

        sum = sum + R;

        if (totalSch == false)
        {
            break;
        }
    } // end for (j = 0 ...

    O[ar] = sum;
}


In the output, the first element of the “O[]” array is calculated correctly, but the other elements are wrong, as shown below:
0= 11
1= -9.99199e+18
2= -9.99199e+18
3= -9.99199e+18
4= -9.99199e+18
5= -9.99199e+18
6= -9.99199e+18
7= -9.99199e+18
8= -9.99199e+18
9= -9.99199e+18
10= -9.99199e+18
11= -9.99199e+18
12= -9.99199e+18
13= -9.99199e+18
14= -9.99199e+18
15= -9.99199e+18
16= -9.99199e+18
17= -9.99199e+18
18= -9.99199e+18
19= -9.99199e+18

So what is the problem in the code?

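I solved the problem. It was not in the kernel; it was in the buffer size for array “a”. The buffer should hold “row*col” elements, but I had created it with only “col” elements. Since each thread reads a[ar * col + j], only the first thread (ar = 0) stayed inside the buffer, and every other thread read past its end. :smile: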
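
For anyone hitting the same symptom, here is a minimal host-side sketch of the size arithmetic, in plain C with no OpenCL calls. The helper names (a_buffer_elems, a_buffer_bytes, work_item_in_bounds) are made up for illustration and are not from my original host code. The byte count is what would go into the size argument of clCreateBuffer for “a”.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of floats the kernel expects in "a": one row of `col`
 * values per work-item, `rows` work-items in total. */
static size_t a_buffer_elems(size_t rows, size_t col)
{
    /* Guard against overflow of rows * col. */
    assert(col == 0 || rows <= SIZE_MAX / col);
    return rows * col;
}

/* Size in bytes to request for the "a" buffer. */
static size_t a_buffer_bytes(size_t rows, size_t col)
{
    size_t elems = a_buffer_elems(rows, col);
    assert(elems <= SIZE_MAX / sizeof(float));
    return elems * sizeof(float);
}

/* Returns 1 if work-item `ar` stays inside a buffer of `n_elems`
 * floats, 0 otherwise. The kernel reads a[ar * col + j] for
 * 0 <= j < col, so the highest index touched is ar * col + col - 1. */
static int work_item_in_bounds(size_t ar, size_t col, size_t n_elems)
{
    if (col == 0) {
        return 1; /* the kernel loop body never runs */
    }
    return (ar * col + col - 1) < n_elems;
}
```

With a buffer of only col elements, work_item_in_bounds is true for work-item 0 and false for every other work-item, which matches the output above. A check like this before enqueueing the kernel would have caught my mistake.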