How to flush buffered clEnqueueNDRangeKernel commands to a device queue?

I enqueue several clEnqueueNDRangeKernel commands on a device queue. Usually they are all executed, one after another.
But I found that on some computers some of these commands stay buffered, and only part of them are executed.
It is very important to me that all of the commands are executed, because they drive the queuing of new tasks.
I think buffering like this is very common, for example in OpenGL. But I can’t find a command like glFlush that forces the OpenCL commands in the queue to be executed.
Can anyone tell me how to deal with this problem?
Thanks in advance.

Aha, there is a clFlush() in OpenCL. I’ll try it.