Working with threads in a kernel

Hello!
I want to do the following:
I have an array “Signal” with length = 1000, but I want to process it in parts. In other words, on my PC with 4 cores,
I need each core to process 250 consecutive elements:
Core 1: 0-249
Core 2: 250-499, etc.
I have this kernel, but it doesn’t work. Why?

int id = get_global_id(0);
int i,e;
e=id*250;
for(i=e;i<250;i++)
{
....
}

In the app the code is this:

size_t local_work_size=1;
size_t global_work_size=4;
clEnqueueNDRangeKernel(queueGPU, kernel_Notch_Notch, 1, NULL, &global_work_size, &local_work_size, 0, NULL, NULL);

Please help me!

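Your for-loop is wrong: i starts at id*250, but the condition is still i<250, so only the work-item with id 0 ever enters the loop. Try something like this: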


int id = get_global_id(0); // will be 0, 1, 2, or 3
int start = id * 250; // index of the first element for this block
for(int i = 0; i < 250; i++){
    // input[start+i] will be the value of the i'th element in this block
}
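
To see the difference without an OpenCL device, here is a small plain C sketch that emulates the four work-items on the host. The names iterations_original, visit_block and covers_signal_once are made up for this illustration and are not part of OpenCL; the id parameter stands in for the value get_global_id(0) would return in the kernel.

```c
#include <assert.h>

#define SIGNAL_LENGTH 1000
#define BLOCK_SIZE 250
#define NUM_WORK_ITEMS (SIGNAL_LENGTH / BLOCK_SIZE)

/* Counts the iterations of the original loop for one work-item:
 * it starts at id * 250 but stops at the absolute index 250. */
int iterations_original(int id)
{
    int count = 0;
    int e = id * BLOCK_SIZE;
    for (int i = e; i < BLOCK_SIZE; i++)
        count++;
    return count;
}

/* Emulates one work-item of the corrected loop: increments a visit
 * counter for each element of the block that belongs to this id. */
void visit_block(int id, int *visits)
{
    assert(id >= 0 && id < NUM_WORK_ITEMS);
    int start = id * BLOCK_SIZE;  /* first index of this block */
    for (int i = 0; i < BLOCK_SIZE; i++)
        visits[start + i]++;
}

/* Returns 1 if running all work-items touches every element exactly once. */
int covers_signal_once(void)
{
    int visits[SIGNAL_LENGTH] = {0};
    for (int id = 0; id < NUM_WORK_ITEMS; id++)
        visit_block(id, visits);
    for (int k = 0; k < SIGNAL_LENGTH; k++)
        if (visits[k] != 1)
            return 0;
    return 1;
}
```

With the original loop only id 0 does any work (250 iterations, the other ids do none), while the corrected indexing covers all 1000 elements with no overlap.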

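Or keep your original loop and fix its upper bound, so that each work-item stops at the end of its own block:


int id = get_global_id(0);
int i,e;
e=id*250;             // first index of this work-item's block
for(i=e;i<e+250;i++)  // stop after 250 elements, not at index 250
{
....
}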
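
If the array length is not always an exact multiple of the number of work-items, hard-coding 250 gets fragile. Below is a rough sketch in plain C of one way to compute the bounds; block_range is a hypothetical helper, not an OpenCL function. In a kernel, id would presumably come from get_global_id(0) and num_items from get_global_size(0), while length would have to be passed in as a kernel argument.

```c
#include <assert.h>

/* Hypothetical helper: computes the half-open range [*begin, *end)
 * that work-item `id` should process out of `length` elements. */
void block_range(int id, int num_items, int length, int *begin, int *end)
{
    assert(num_items > 0 && id >= 0 && id < num_items);
    assert(length >= 0);

    /* Ceiling division, so that all blocks together cover the array. */
    int block = (length + num_items - 1) / num_items;

    int b = id * block;
    int e = b + block;

    /* Clamp: the last block may be shorter (or empty). */
    if (b > length) b = length;
    if (e > length) e = length;

    *begin = b;
    *end = e;
}
```

For 1000 elements and 4 work-items this gives the same blocks as above (0-249, 250-499, and so on); the loop then becomes for(i = begin; i < end; i++).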
