For loops inside kernels

11-02-2012, 08:06 AM
I'm using the book "OpenCL in Action" to learn OpenCL programming. In this book the author claims that for loops inside kernel functions are a bad idea because comparison statements are time-consuming on GPUs, which I can understand given general GPU architecture. However, his matrix examples, and matrix examples from other sources, use for loops quite extensively inside kernels. Isn't this against the whole idea behind using OpenCL? If I need many for loops, why dispatch kernels at all instead of just writing normal C/C++ code?

Could someone please shed some light on these issues for me? It would be of great help before I start implementing my own algorithms.

11-03-2012, 06:20 PM
For loops in kernels are fine if all work items in the work-group loop the same number of times. If each work item loops a different number of times, you get divergence, and that's what can really slow things down.
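
To make the divergence point concrete, here is a rough sketch in plain C (not OpenCL, and not a real GPU timing model). It assumes a group of work items that executes in lockstep, so the group keeps stepping until its slowest work item has finished its loop. The function names `lockstep_steps`, `useful_iterations` and `utilization_percent` are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Steps a lockstep group needs: it runs until the work item with the
 * largest loop trip count is done (simplifying assumption). */
unsigned lockstep_steps(const unsigned *trip_counts, size_t n)
{
    assert(n > 0);
    unsigned max = trip_counts[0];
    for (size_t i = 1; i < n; ++i) {
        if (trip_counts[i] > max) {
            max = trip_counts[i];
        }
    }
    return max;
}

/* Loop iterations that do real work: the sum of all trip counts. */
unsigned useful_iterations(const unsigned *trip_counts, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; ++i) {
        sum += trip_counts[i];
    }
    return sum;
}

/* Share of (work item, step) slots doing real work, as a whole
 * percentage (rounded down). */
unsigned utilization_percent(const unsigned *trip_counts, size_t n)
{
    unsigned steps = lockstep_steps(trip_counts, n);
    if (steps == 0) {
        return 100; /* nobody loops, so nothing is wasted */
    }
    return (100u * useful_iterations(trip_counts, n)) / (steps * (unsigned)n);
}
```

With four work items that all loop 8 times, this model gives 8 steps and 100% utilization. With trip counts of 1, 2, 3 and 8 it still takes 8 steps, but only 14 of the 32 slots do useful work, so utilization drops to 43%. Real hardware is more complicated than this, but it shows why the loop itself isn't the problem and uneven trip counts are.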