A kernel running on the GPU sometimes ends up running for a very long time. Depending on the vendor and OS combination, this can lead to anything from an application crash to a full system freeze, which is obviously undesirable.
Consider how an API like C++ AMP deals with this. C++ AMP kernels do not lead to system hangs. If a kernel runs for too long, the system terminates it and throws an exception, which the application can catch and respond to. Even if the developer is sloppy and does not handle the exception, the result is only an application crash, not a crash of the whole system.
Thus, I want a well-defined way to enforce a maximum timeout on GPU kernels, and well-defined error codes for the case where the timeout does occur.
The current behaviour is completely unacceptable.
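As a rough illustration of the request, here is a minimal sketch of what "a maximum timeout plus a well-defined error code" could look like from the host side. Everything in it is an assumption made for illustration: `KernelStatus` and `run_with_timeout` are hypothetical names, not part of any existing GPU API, and the "kernel" is just a CPU function that polls a stop flag. A real GPU kernel cannot be stopped cooperatively like this; the driver or runtime would have to terminate it.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <future>

// Hypothetical status codes; not taken from any real GPU API.
enum class KernelStatus { Success, TimedOut };

// Hypothetical helper: runs `kernel` on a worker thread and asks it to stop
// once `limit` has elapsed. The kernel receives a stop flag that it must poll.
// This is a CPU stand-in for a runtime that would terminate the kernel itself.
KernelStatus run_with_timeout(
    const std::function<void(const std::atomic<bool>&)>& kernel,
    std::chrono::milliseconds limit) {
    std::atomic<bool> stop{false};

    // Launch the work asynchronously so the caller can wait with a deadline.
    std::future<void> done =
        std::async(std::launch::async, [&] { kernel(stop); });

    if (done.wait_for(limit) == std::future_status::ready) {
        done.get();  // rethrows if the kernel itself threw
        return KernelStatus::Success;
    }

    // Deadline passed: request termination and wait for the worker to exit,
    // so the caller gets an error code instead of a hang.
    stop.store(true);
    done.wait();
    return KernelStatus::TimedOut;
}
```

The point of the sketch is only the contract: the caller always gets control back within a bounded time, and the outcome is reported as a defined status rather than as a crash or a frozen system. A kernel that finishes just after the deadline is still reported as `TimedOut` here, which is one of the details a real specification would need to pin down.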
Thanks,
Rahul Garg
PhD student (CS), McGill University