clEnqueueNDRangeKernel returns -6 CL_OUT_OF_HOST_MEMORY

I have already found out that this error code can signal a lack of any host resource, not necessarily memory (the name suggests memory only for legacy reasons).
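For reference, here is a minimal sketch of how the numeric status can be turned into its symbolic name. `cl_error_name` and `cl_report` are hypothetical helpers, not part of the OpenCL API. The numeric values are the ones defined in the Khronos `CL/cl.h` header, and only a handful of codes are listed. Plain `int` is used in place of `cl_int` so that the snippet compiles without the OpenCL headers.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the OpenCL API): maps a few OpenCL
   status codes to their symbolic names. The numeric values are the ones
   defined in the Khronos CL/cl.h header; the list is deliberately partial. */
const char *cl_error_name(int code)
{
    switch (code) {
    case 0:   return "CL_SUCCESS";
    case -4:  return "CL_MEM_OBJECT_ALLOCATION_FAILURE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -52: return "CL_INVALID_KERNEL_ARGS";
    case -54: return "CL_INVALID_WORK_GROUP_SIZE";
    case -55: return "CL_INVALID_WORK_ITEM_SIZE";
    default:  return "unknown OpenCL status";
    }
}

/* Hypothetical helper: on failure, prints the call name together with the
   numeric and symbolic status to stderr. Returns 1 on CL_SUCCESS, 0 otherwise. */
int cl_report(const char *call, int status)
{
    if (status == 0)
        return 1;
    fprintf(stderr, "%s failed: %d (%s)\n", call, status, cl_error_name(status));
    return 0;
}
```

With the status from my call, `cl_report("clEnqueueNDRangeKernel", -6)` prints `clEnqueueNDRangeKernel failed: -6 (CL_OUT_OF_HOST_MEMORY)`. This only names the code; it says nothing about which resource ran out.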

How do I find *which* resource?

Is there any way to turn on extended debug output so that the actual cause can be diagnosed exactly? Is there a setting or tweak that enables it?

When I comment out parts of the kernel code more or less arbitrarily, clEnqueueNDRangeKernel succeeds, which makes me think the problem has something to do with the kernel size. Information on the maximum kernel size is scarce, but I recall it being 2,000,000 instructions, and my kernel is far below that limit.

Any suggestions?

Platform: GeForce GTX 570, driver 301.42, Windows 7 64-bit, NVIDIA GPU Computing SDK 4.2