OpenCL error on multiple kernel runs

Hello,

I have a very simple kernel that adds a constant to every element of an array and writes the result to an output array. It looks as follows:


__kernel void add_constant_to_vec(__global const float *cpp,
                                  __global float *out,
                                  float offset,
                                  int num_elements)
{
    int gid = get_global_id(0);
    if (gid >= num_elements) return;
    out[gid] = cpp[gid] + offset;    
}
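To check the kernel's output on the host, a plain C++ reference version of the same computation can be handy. This is only a sketch: the function name add_constant_reference and the use of std::vector are assumptions for illustration and are not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for the add_constant_to_vec kernel:
// returns a new vector where every element of `in` has `offset` added.
std::vector<float> add_constant_reference(const std::vector<float>& in, float offset)
{
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = in[i] + offset;  // same arithmetic as out[gid] = cpp[gid] + offset
    }
    return out;
}
```

The result can be compared element by element against the buffer read back from the device.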

I run this kernel as follows, using the C++ wrapper around the OpenCL API that I got from Khronos. Some bits are removed for brevity, but the kernel runs fine and the output is correct.


cl::Buffer input_buffer(OCL::cl_context, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR, sizeof(float)*NUM_ELEMENTS, input);
cl::Buffer output_buffer(OCL::cl_context, CL_MEM_WRITE_ONLY, sizeof(float)*NUM_ELEMENTS);

int num_elements = 100000;
float const_add = 100.0f;
const int NUM_WORK_ITEMS = 512;
const int GLOBAL_WORK_SIZE = round_up(NUM_WORK_ITEMS, num_elements);

cl::Event event;
cl::CommandQueue queue(OCL::cl_context, OCL::cl_context_devices[0]);
queue.enqueueNDRangeKernel(m_kernels[0], cl::NullRange, cl::NDRange(GLOBAL_WORK_SIZE), cl::NDRange(NUM_WORK_ITEMS), NULL, &event);
event.wait();
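The round_up helper used in the host code above is not shown in the post. A minimal sketch of what it presumably does, assuming the argument order (work-group size, number of elements) seen in the call and that it returns the smallest multiple of the work-group size that covers all elements:

```cpp
#include <cassert>
#include <cstddef>

// Assumed behaviour of round_up(group_size, global_size): pad global_size
// up to the next multiple of group_size, so the global work size is evenly
// divisible by the local work size.
std::size_t round_up(std::size_t group_size, std::size_t global_size)
{
    assert(group_size > 0);  // a zero work-group size would divide by zero
    const std::size_t remainder = global_size % group_size;
    return remainder == 0 ? global_size
                          : global_size + (group_size - remainder);
}
```

With 512 work-items per group and 100000 elements this gives 100352, which is why the kernel needs the `gid >= num_elements` guard for the padded work-items.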

Now, a single run of this kernel works just fine. However, when I do this:


for (int i = 0; i < 1000000; ++i)
{
    queue.enqueueNDRangeKernel(m_kernels[0], cl::NullRange, cl::NDRange(GLOBAL_WORK_SIZE), cl::NDRange(NUM_WORK_ITEMS), NULL, &event);
    event.wait();
}

Then it fails with
OpenCL error: clEnqueueNDRangeKernel
Error code: -6
which I think translates to out of memory on the host (CL_OUT_OF_HOST_MEMORY), but I am a bit surprised that this should be the case. Is there something I am doing wrong here? This is my first attempt at OpenCL, so I would not be surprised if I am doing something obviously wrong :)

Also, any advice on optimising this very simple kernel further? :)

Thanks,
/xarg

OK, after messing about a bit, it seems that the cl::Event object cannot be reused across enqueue calls. When I create a new cl::Event object inside the loop instead, it works fine. Strange.

Does anyone know how I can set the cl::Event object to the correct state after the wait() returns?

Thanks,
xarg

Try calling “clReleaseEvent( event )” once you have finished using it…