Event returned by clEnqueueNDRangeKernel broken with NVIDIA

The cl_event returned by clEnqueueNDRangeKernel in the latest NVIDIA implementation on Windows seems to be broken.

If I don’t call clWaitForEvents, its execution status stays at CL_QUEUED.
And if I try to get the command queue with clGetEventInfo, the returned cl_command_queue isn’t valid (the address seems to be shifted by a few bytes), and calling clRetainCommandQueue on it crashes.

There is no problem with the AMD implementation.

Does anyone else have the same problem?

I’m experiencing some trouble with that too.

Technically, clEnqueueNDRangeKernel only needs to return CL_SUCCESS if the kernel was successfully queued; it is not responsible for starting execution of the queue. Execution of the queued commands is started with clFlush. Most people do not need to call it directly, because blocking commands like clWaitForEvents flush implicitly.

If you do not wish to block, call clFlush directly. The clGetEventInfo documentation does not say that a clFlush is required for it to return valid data, but the last paragraph of the clFlush documentation says that it must be called before using any event object that refers to enqueued commands.

Flushing the command queue solves the problem of the commands not executing.
But the cl_command_queue retrieved from the cl_event is still broken…

By the way, can clEnqueueNDRangeKernel and clFlush be synchronous on some implementations? (I mean waiting for the kernel/commands to finish executing before returning.)

In theory they should not be synchronous. In practice it’s difficult to enforce. If you are seeing this, I would recommend raising the issue with the vendor whose implementation shows it.

It still crashes with the 257.15 beta …