So I think I’m supposed to pass a pointer to an Event object.
However, inside the enqueueNDRangeKernel method the Event pointer is cast to a cl_event pointer, like this:
While debugging, I confirmed that this does not release the previous cl_event stored in the Event wrapper, so if you put this code in a loop it will fill up your RAM. There is also no direct way to release the previous Event, since the Wrapper::release method is protected; the quickest workaround is to assign a null Event.
Why are we casting an Event pointer to a cl_event pointer? I don’t see any operator& overload in the wrapper that would permit this. I’m no C++ guru, so if somebody can explain it to me I’d be glad to understand. Is it because the cl_event is the first and only field in the class (and therefore starts exactly where the Event object starts)?
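To make the layout question concrete, here is a minimal sketch of why such a cast can work when the handle is the first and only data member. FakeEventImpl, fake_cl_event, FakeWrapper, FakeEvent and fake_enqueue are hypothetical stand-ins invented for illustration, not the real OpenCL types or the actual binding source; the field name obj_ follows the name used in this thread.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins; these are NOT the real OpenCL types.
struct FakeEventImpl { int id; };
typedef FakeEventImpl* fake_cl_event;      // plays the role of cl_event

class FakeWrapper {
public:
    FakeWrapper() : obj_(nullptr) {}
    fake_cl_event get() const { return obj_; }
protected:
    fake_cl_event obj_;                    // first and only data member
};

class FakeEvent : public FakeWrapper {};   // adds no data members

// The cast is only meaningful if the wrapper has standard layout and is
// exactly one handle wide, so the object and its field share an address.
static_assert(std::is_standard_layout<FakeEvent>::value,
              "wrapper must be standard layout");
static_assert(sizeof(FakeEvent) == sizeof(fake_cl_event),
              "wrapper must contain nothing but the handle");

// C-style API: writes a new handle through an out-parameter.
void fake_enqueue(fake_cl_event* out) {
    static FakeEventImpl impl = {42};
    *out = &impl;                          // overwrites whatever was there
}

int wrapped_id() {
    FakeEvent e;
    // Same kind of cast as in the bindings: wrapper pointer -> handle pointer.
    fake_enqueue(reinterpret_cast<fake_cl_event*>(&e));
    assert(e.get() != nullptr);
    return e.get()->id;
}
```

Under these assumptions no operator& overload is needed: the write through the cast pointer lands directly in obj_. It also bypasses every member function of the wrapper, which is why any handle already stored there is overwritten without being released.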
Can you post a more complete example of the memory leak, including which OpenCL implementation you are using? With the following code, memory usage stays rock-solid constant on Apple’s implementation on a CPU device:
In this code you declare “cl::Event event” inside the loop, so event.obj_ starts out null on every iteration. You then pass it to enqueueNDRangeKernel, which overwrites it; no problem there, since there is no previous handle to lose.
Finally you wait for the event, and when its scope ends the cl::Event destructor runs on “event” and frees the resource by calling Wrapper::release.
If you instead try code like this:
cl::Event event;
for (unsigned int i = 0; i < 10000000; i++)
{
    cl::CommandQueue queue(context, *d);
    queue.enqueueNDRangeKernel(kernel,
                               cl::NullRange,
                               cl::NDRange(1000),
                               cl::NullRange,
                               NULL,
                               &event);
    event.wait();
}
Here nothing inside the loop calls the Wrapper::release method, because enqueueNDRangeKernel doesn’t do that; it simply overwrites the event.obj_ field.
The code I wrote should reproduce my memory leak, even though I’d expect it to behave the same as the code you posted.
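The difference between the two loops can be reproduced without OpenCL at all. The sketch below is a mock, not the binding source: fake_handle, g_refs, FakeEvent and fake_enqueue are hypothetical names, and the only behaviour assumed is what is described above, namely that the enqueue call overwrites obj_ without releasing it and that the destructor releases whatever handle it holds.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; none of this is the real OpenCL API.
typedef int fake_handle;              // -1 plays the role of a null cl_event
static std::vector<int> g_refs;       // g_refs[h] = reference count of handle h

fake_handle fake_create() {
    g_refs.push_back(1);              // a new handle starts with one reference
    return static_cast<fake_handle>(g_refs.size()) - 1;
}

void fake_release(fake_handle h) {
    if (h >= 0) { assert(g_refs[h] > 0); --g_refs[h]; }
}

int live_handles() {                  // handles that still hold a reference
    int n = 0;
    for (int r : g_refs) if (r > 0) ++n;
    return n;
}

class FakeEvent {
public:
    FakeEvent() : obj_(-1) {}
    ~FakeEvent() { fake_release(obj_); }      // releases only the current handle
    FakeEvent(const FakeEvent&) = delete;
    FakeEvent& operator=(const FakeEvent&) = delete;
    fake_handle obj_;
};

// Mimics the enqueue call: overwrites obj_ and never releases the old value.
void fake_enqueue(FakeEvent* e) { e->obj_ = fake_create(); }

// One event reused across iterations, as in the leaking loop.
int leaked_with_reuse(int n) {
    int before = live_handles();
    {
        FakeEvent event;
        for (int i = 0; i < n; ++i) fake_enqueue(&event);
    }                                 // destructor releases the last handle only
    return live_handles() - before;
}

// A fresh event per iteration, as in the loop that does not leak.
int leaked_with_scoped(int n) {
    int before = live_handles();
    for (int i = 0; i < n; ++i) {
        FakeEvent event;
        fake_enqueue(&event);
    }                                 // destructor runs every iteration
    return live_handles() - before;
}
```

In this mock, reusing one event for n enqueues leaves n - 1 handles unreleased, while the scoped version leaves none.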
Ah yes, I see it now. As a workaround for the time being, you can pass a temporary cl::Event object and then assign it back; operator= will do the proper thing with retain and release.
cl::Event event;
for (unsigned int i = 0; i < 10000000; i++)
{
    cl::CommandQueue queue(context, *d);
    cl::Event tmp;
    queue.enqueueNDRangeKernel(kernel,
                               cl::NullRange,
                               cl::NDRange(1000),
                               cl::NullRange,
                               NULL,
                               &tmp);
    event = tmp;
    event.wait();
}
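For anyone wondering why the assignment fixes it, here is a rough mock of a retain/release style operator=. It is an assumption about the general pattern, not a copy of the binding source, and fake_handle, g_refs, FakeEvent and fake_enqueue are hypothetical names. It also covers the other workaround mentioned earlier in the thread, assigning a null Event before each enqueue, since that goes through the same operator.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; none of this is the real OpenCL API.
typedef int fake_handle;              // -1 plays the role of a null cl_event
static std::vector<int> g_refs;       // g_refs[h] = reference count of handle h

fake_handle fake_create() {
    g_refs.push_back(1);
    return static_cast<fake_handle>(g_refs.size()) - 1;
}
void fake_retain(fake_handle h)  { if (h >= 0) ++g_refs[h]; }
void fake_release(fake_handle h) {
    if (h >= 0) { assert(g_refs[h] > 0); --g_refs[h]; }
}

int live_handles() {                  // handles that still hold a reference
    int n = 0;
    for (int r : g_refs) if (r > 0) ++n;
    return n;
}

class FakeEvent {
public:
    FakeEvent() : obj_(-1) {}
    FakeEvent(const FakeEvent& o) : obj_(o.obj_) { fake_retain(obj_); }
    ~FakeEvent() { fake_release(obj_); }
    FakeEvent& operator=(const FakeEvent& rhs) {
        if (this != &rhs) {
            fake_release(obj_);       // drop the handle we held before
            obj_ = rhs.obj_;
            fake_retain(obj_);        // share ownership of the new one
        }
        return *this;
    }
    fake_handle obj_;
};

// Mimics the enqueue call: overwrites obj_ without releasing it.
void fake_enqueue(FakeEvent* e) { e->obj_ = fake_create(); }

// Workaround 1: enqueue into a temporary, then assign it back.
int live_after_tmp_workaround(int n) {
    int before = live_handles();
    {
        FakeEvent event;
        for (int i = 0; i < n; ++i) {
            FakeEvent tmp;
            fake_enqueue(&tmp);
            event = tmp;              // releases the previous handle
        }
    }
    return live_handles() - before;
}

// Workaround 2: assign a null event before each enqueue.
int live_after_null_assign(int n) {
    int before = live_handles();
    {
        FakeEvent event;
        for (int i = 0; i < n; ++i) {
            event = FakeEvent();      // releases the previous handle
            fake_enqueue(&event);
        }
    }
    return live_handles() - before;
}
```

In both variants the old handle is released before it can be overwritten, so the mock ends each run with no live handles left over.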
I’ll work on getting a fix into the bindings for the next version.