View Full Version : 3-D vs 1-D Global workspace

02-20-2013, 03:13 AM
This is yet another query about global workspace.

Basically, I'm getting a graphics card crash when I specify a large number of threads in the global workspace.
When I specify a 1-D global workspace, I have noticed that, in my case, I can specify a number up to 2^32. Doing so, the kernel runs absolutely fine. It takes a while to get through that many threads, but all is good. If I specify (2^32) + 1 threads, I get a crash (which I would expect).

My problem is with a 3-D global workspace, where the number of threads is still very large but is in fact quite a bit less than 2^32.
For example, specifying the workspace as global(837, 1098, 352) and local(1,1,1)
causes a crash, with an invalid command queue error.
But (837 * 1098 * 352) < 2^32.
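
For reference, here is a quick sanity check of that arithmetic. It is only a minimal Python sketch; the helper name `total_work_items` is made up for illustration and is not part of any OpenCL API.

```python
def total_work_items(global_size):
    """Return the product of all dimensions of an NDRange global size."""
    total = 1
    for dim in global_size:
        total *= dim
    return total

# The 1-D size that still worked, as described in this post.
LIMIT = 2 ** 32

failing = total_work_items((837, 1098, 352))
print(failing)          # total number of work items requested
print(failing < LIMIT)  # whether it is below the 1-D size that worked
```

The failing 3-D request comes to 323,497,152 work items, roughly 13 times fewer than 2^32.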

I have tried removing all code from the kernel, and still get a crash when specifying this size of global workspace.

My max work item sizes are [1024, 1024, 64].
I have tried using [1050, 1050, 100] and this works fine, but, say, [1050, 1050, 500] does not.
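
To line the three sizes up against those maxima, here is a rough Python sketch. The helper names `dims_over_limit` and `total` are hypothetical, and the outcomes are simply the ones reported in this post.

```python
# Per-dimension maxima as reported in this post.
MAX_WORK_ITEM_SIZES = (1024, 1024, 64)

def dims_over_limit(size, limits=MAX_WORK_ITEM_SIZES):
    """Return the indices of the dimensions where size exceeds limits."""
    return [i for i, (s, m) in enumerate(zip(size, limits)) if s > m]

def total(size):
    """Return the total number of work items in a global size."""
    n = 1
    for s in size:
        n *= s
    return n

# Global sizes tried so far and what happened with each.
observed = {
    (1050, 1050, 100): "works",
    (1050, 1050, 500): "crashes",
    (837, 1098, 352): "crashes",
}

for size, outcome in observed.items():
    print(size, outcome, dims_over_limit(size), total(size))
```

The size that works exceeds the reported maxima in all three dimensions, so a per-dimension comparison alone does not separate the working case from the crashing ones. The totals (110,250,000 versus 323,497,152 and 551,250,000) would be consistent with some threshold on the overall count, but that is only a guess from three data points.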

Any ideas?

02-20-2013, 05:39 PM
This seems like a bug in the OpenCL implementation you are running on. Suggest you contact the vendor and file a bug.

02-21-2013, 01:20 AM
It did sound like it to me, but I was hoping it wasn't. Thanks for your reply.

(Using OpenCL 1.1, NVIDIA Quadro 2000, on version 311.15 drivers)

02-23-2013, 05:58 PM
You put the reason right in your message. Your device only accepts maximum dimensions of [1024, 1024, 64], yet you are passing [837, 1098, 352]. Since 352 > 64, you are asking for something the device cannot do.

Furthermore, you are setting a local work-group size of [1,1,1], which means your GPU is mostly idle, running over 300 million work items one per work group. That's not going to be very fast.
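
As an illustration of a larger work group, here is a minimal Python sketch of the host-side arithmetic. The local size (8, 8, 4) is just an assumed example: it stays within [1024, 1024, 64] per dimension, but its 256 work items would also need to fit the device's maximum work-group size, which is not given in this thread. In OpenCL 1.1 the global size must be evenly divisible by the local size, so the global size is padded up and the kernel would need a bounds check to skip the extra work items. The helper names are hypothetical.

```python
def round_up_global(global_size, local_size):
    """Round each global dimension up to the next multiple of the local one."""
    return tuple(-(-g // l) * l for g, l in zip(global_size, local_size))

def num_work_groups(global_size, local_size):
    """Count work groups; assumes global_size is divisible by local_size."""
    n = 1
    for g, l in zip(global_size, local_size):
        n *= g // l
    return n

wanted = (837, 1098, 352)   # problem size from the original post
local = (8, 8, 4)           # assumed example local size, 256 work items

padded = round_up_global(wanted, local)
print(padded)                               # padded global size
print(num_work_groups(padded, local))       # work groups with local (8, 8, 4)
print(num_work_groups(wanted, (1, 1, 1)))   # work groups with local (1, 1, 1)
```

With these numbers the padded global size is (840, 1104, 352), and the work-group count drops from 323,497,152 to 1,275,120.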