How to pass an array to the kernel

Hi, I’m new to OpenCL and I’m trying to process an array with my GPU.
So, I wrote a simple program to do some tests.

First, I defined an array in the host program as

int *a = new int[81];

I just gave each array element an arbitrary value, like a[i] = i;
Then, I created a buffer for this array and passed it to the kernel as an argument:

cl_mem memIn = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(a), a, &status);

status = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&memIn);

And here is my kernel program (I have set the global work size to 1):

__kernel void process(__global int* a, __global int* b) { // b is not used
    unsigned int id = get_global_id(0);
    int c[81];
    for (int i = 0; i < 81; i++) {
        c[i] = *(a + i);
        printf("%d ", c[i]); // just to check whether the array was passed in successfully
    }
}

The result is that all elements of c[] are 0 (not the same as a[]).
However, if I define the array in the host program as

int a[81]; // not using "new"

it all works. Since I may need a really big array in the real program, I think I'll have to use the first method (heap allocation) to avoid a stack overflow.
So, can someone tell me where I'm going wrong? Or is it impossible to pass an array allocated that way to a kernel?

Thanks!

The issue is that you are incorrectly using sizeof(a) to determine the size of your array. The sizeof operator computes the size of the data type you give it, not the size of the allocation. In your first case (with new), the type of a is int*, so sizeof(a) will be either 4 or 8 bytes, depending on the architecture. In the second case, the type is int[81], and sizeof(a) will be 81*sizeof(int) bytes, which is why that version works.

So, instead of using sizeof(a), use num_elements*sizeof(cl_int).
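Applied to the buffer creation above, the corrected call would look something like this (a sketch; num_elements is an assumed name for the element count):

```c
const size_t num_elements = 81;
cl_mem memIn = clCreateBuffer(context,
                              CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                              num_elements * sizeof(cl_int),  /* not sizeof(a)! */
                              a, &status);
```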

Hi, jprice. Thank you for your reply. I did what you said and it worked. (What a stupid mistake :doh:) Thanks again.

Hi forum,

How do I do the array indexing if I create a buffer of a vector type as follows:



#define DIM 256

cl_int errNum;
buffer1 = clCreateBuffer(context, CL_MEM_READ_WRITE, DIM * sizeof(cl_float2), NULL, &errNum);
...............................

errNum = clSetKernelArg(kernel1, 0, sizeof(cl_mem), &buffer1);

Above we have created a buffer of data type cl_float2, and in the kernel I have the following:


__kernel void func(__global float2* a)
{
    // I am not sure about the indexing to do on a. Should it be something like:
    // a[i].x and a[i].y
}

If it is a linear buffer we are creating, then I believe we have to devise a linearized data index for this.

Any thoughts folks?

Thanks

You’re right: a[i].x and a[i].y are the right way to access data in your buffer.
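For example, a minimal kernel sketch (the kernel name and the scaling operation are made up for illustration):

```c
// OpenCL C: each work-item handles one float2 element of the buffer.
__kernel void scale2(__global float2* a)
{
    unsigned int i = get_global_id(0);
    a[i].x = a[i].x * 2.0f;  // first component of element i
    a[i].y = a[i].y * 2.0f;  // second component of element i
}
```

Since float2 is a vector type, you can also operate on the whole element at once, e.g. a[i] = a[i] * 2.0f; does the same thing componentwise.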