I am trying the following simple code, but it doesn't work. It raises an access violation when the kernel runs (which could mean the kernel arguments are not initialized).
Can you point out what is wrong with the code below?
Host Code:
int *bs = new int[2000 * 4];
for(int i = 0; i < 8000;i++) {
bs[i] = 10;
}
cl_mem tem = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_USE_HOST_PTR, (2000 * 4) * sizeof(cl_int), bs, &error);
cl_int status;
cl_event event;
status = clSetKernelArg(
scaleKernel, 0,
sizeof(cl_mem), &tem);
size_t globalThreadsStep1[1];
size_t localThreadsStep1[1];
globalThreadsStep1[0] = 2064;
localThreadsStep1[0] = 64;
status = clEnqueueNDRangeKernel(commandQueue, testKernel,
1, NULL,globalThreadsStep1,
localThreadsStep1, 0, NULL,
&event);
/* wait for the kernel call to finish execution */
status = clWaitForEvents(1, &event);
Kernel Code:
#pragma OPENCL EXTENSION cl_amd_printf : enable
kernel void
test_code(global int4 *data) {
int id = get_global_id(0);
int pc = data[id].s0;
printf("id: %d pc:%d", id, pc);
}
Both status and error are CL_SUCCESS up to the end of the code, except for clWaitForEvents.
The error is: the kernel tries to read uninitialized memory when it executes, and the printf shows that it happens on the first OpenCL thread. Access violation reading at 0Xffffff
BTW, this is the AMD OpenCL implementation running on the CPU.
It seems we need aligned memory. I had only gone through the OpenCL basics and assumed I could allocate memory for 4 ints and it would work.
The memory doesn't need to be aligned for the code to work correctly. Alignment only matters when you pass CL_MEM_USE_HOST_PTR at buffer creation. In that case, passing aligned memory can help the OpenCL driver avoid memory copies.
Again, the code would work even if the memory was unaligned. It must be something else.
Thinking a bit more about it, I think I was wrong. Section 6.1.5 of the OpenCL specification says that “A data item declared to be a data type in memory is always aligned to the size of the data type in bytes”. Also, section 6.2.5 says that pointer casting “represents an unchecked assertion that the address is correctly aligned”.
Since your kernel argument is a pointer to int4, the buffer it points to must be aligned to sizeof(int4), which is 16 bytes.
In other words, you were right and your OpenCL implementation is fine.
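For what it's worth, here is a minimal sketch of one way to get a host buffer that satisfies that alignment before handing it to clCreateBuffer with CL_MEM_USE_HOST_PTR. The names HostInt4, is_aligned and make_host_buffer are hypothetical and not part of OpenCL. The sketch assumes a C++17 compiler, where std::vector honors alignas for its element type. It is not the only way to do this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side mirror of the kernel's int4: four 32-bit ints,
// forced to 16-byte alignment so it matches sizeof(int4) on the device.
struct alignas(16) HostInt4 {
    std::int32_t s[4];
};

static_assert(sizeof(HostInt4) == 16, "HostInt4 must be the size of int4");

// Returns true when ptr is a multiple of `alignment` bytes.
inline bool is_aligned(const void *ptr, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Builds `count` int4-sized elements, each component set to `value`.
// Assumption: C++17, so the vector's storage respects alignof(HostInt4).
inline std::vector<HostInt4> make_host_buffer(std::size_t count,
                                              std::int32_t value) {
    std::vector<HostInt4> buf(count, HostInt4{{value, value, value, value}});
    // Sanity check before the pointer is ever given to the OpenCL runtime.
    assert(buf.empty() || is_aligned(buf.data(), alignof(HostInt4)));
    return buf;
}
```

With something like this, buf.data() and buf.size() * sizeof(HostInt4) would take the place of bs and (2000 * 4) * sizeof(cl_int) in the clCreateBuffer call above. The vector has to stay alive for as long as the buffer object uses it.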