Thread: goto statements in OpenCL

  1. #1
    Junior Member
    Join Date
    Jul 2015
    Posts
    12

    goto statements in OpenCL

    Are goto statements supported in OpenCL?

    I saw a few posts about this, but they seem to be pretty old, so I am hoping there are updates. For context: I am thinking of running finite state machine C code on the GPU. The code is generated at runtime on the CPU side from a given regex, and it contains goto statements.

    Please let me know.
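For readers who have not seen this style of generated code, here is a minimal sketch of what a goto-based matcher might look like. The regex `ab*c`, the function name and the state labels are assumptions made up for illustration; they are not taken from the poster's generator.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical generated matcher for the regex "ab*c" (whole-string match).
   Each label is one DFA state and every transition is a plain goto. */
bool match_ab_star_c(const char *s)
{
    size_t i = 0;

    /* start state: expect 'a' */
    if (s[i] == 'a') { i++; goto s1; }
    goto reject;

s1: /* seen 'a', followed by zero or more 'b' */
    if (s[i] == 'b') { i++; goto s1; }
    if (s[i] == 'c') { i++; goto s2; }
    goto reject;

s2: /* seen the final 'c': accept only at end of input */
    if (s[i] == '\0') return true;
    goto reject;

reject:
    return false;
}
```

This is plain host-side C. Whether a given OpenCL kernel compiler accepts the same control flow is exactly the question asked here.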

  2. #2
    Senior Member
    Join Date
    Apr 2015
    Posts
    283
    I cannot find anything in the standard that restricts goto relative to normal C. It might perform very poorly on GPUs, but I assume you're aware of that.

  3. #3
    Junior Member
    Join Date
    Jul 2015
    Posts
    12
    Thank you for the reply. Yes, I am aware that goto statements perform poorly on GPUs, but I am trying to understand the reason behind the poor performance.

    Can you please tell me what makes goto statements perform poorly in GPUs?

  4. #4
    Senior Member
    Join Date
    Apr 2015
    Posts
    283
    Goto by itself is fine and efficient; it is the "finite state machine" part that concerns me. On typical GPUs, OpenCL work-items run in lockstep groups of 32 or 64 threads that share one instruction pointer. This means that even for a simple if-else statement in which one half of the threads takes one route and the other half takes the other, the group executes both branches one after the other, so the execution time effectively doubles. Somewhat like this:
    bool x = condition();
    turnOffThreadsNotFittingCriteria(x);    // only threads where x is true stay active
    // code for "if"
    ...
    turnOffThreadsNotFittingCriteria(!x);   // only threads where x is false stay active
    // code for "else"
    ...
    // all threads are active again
    This will be much worse for a state machine, unless you can guarantee that most of the time every thread in a 64-thread group is in the same state. In that case the GPU will simply jump over the unused branches.
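The cost argument above can be made concrete with a small host-side model. This is a simplified sketch, not a description of any particular GPU: it assumes that one FSM step costs one serialized pass per distinct state present in a lockstep group. `NUM_STATES` and `passes_for_step` are hypothetical names.

```c
#include <stddef.h>

#define NUM_STATES 4   /* hypothetical FSM size; states are 0..NUM_STATES-1 */

/* Simplified cost model: the number of serialized passes one lockstep group
   needs for a single FSM step equals the number of distinct states among
   its lanes. Each lane_state[i] must be in the range [0, NUM_STATES). */
int passes_for_step(const int *lane_state, size_t lanes)
{
    int seen[NUM_STATES] = {0};
    int passes = 0;

    for (size_t i = 0; i < lanes; i++) {
        int s = lane_state[i];
        if (!seen[s]) {      /* first lane found in this state */
            seen[s] = 1;
            passes++;        /* the group must execute this state's code once */
        }
    }
    return passes;
}
```

Under this model a group whose lanes all sit in the same state needs one pass, while a group spread over all four states needs four passes for the same amount of useful work per lane.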

    Then again, it may turn out to be uncharted territory: rarely used and therefore poorly tested. Perfectly valid code may refuse to work due to bugs in the kernel compiler (*coughing sound* AMD *coughing sound*).

  5. #5
    Junior Member
    Join Date
    Jul 2015
    Posts
    12
    Thank you for the explanation!

    I am getting the gist of what's happening in this case. Would you agree that the same applies when using switch-case statements?

    Also, could you please share any reading material on this topic?

    Thanks!

  6. #6
    Senior Member
    Join Date
    Apr 2015
    Posts
    283
    "Would you agree that it is a similar case when using switch-case statements?"
    Switch-case is essentially goto, so yeah.
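To illustrate the point, here is a minimal sketch of a state machine written with switch-case instead of labels. The regex `ab*c`, the enum and the function name are assumptions chosen for illustration. The switch dispatches on a per-thread state variable, so threads in different states still take different branches, just as they would with goto.

```c
#include <stdbool.h>
#include <stddef.h>

enum state { S_START, S_BODY, S_END };

/* Hypothetical matcher for the regex "ab*c" (whole-string match),
   with the current state held in a variable and dispatched via switch. */
bool match_ab_star_c_switch(const char *s)
{
    enum state st = S_START;

    for (size_t i = 0; ; i++) {
        char ch = s[i];
        switch (st) {
        case S_START:                 /* expect 'a' */
            if (ch == 'a') { st = S_BODY; break; }
            return false;
        case S_BODY:                  /* zero or more 'b', then 'c' */
            if (ch == 'b') break;
            if (ch == 'c') { st = S_END; break; }
            return false;
        case S_END:                   /* accept only at end of input */
            return ch == '\0';
        }
    }
}
```

Every state returns on the terminating `'\0'`, so the loop never reads past the end of the string.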

    You should find plenty of info on this topic and more by googling "CUDA optimization guide", "AMD GPU optimization guide" or "Intel GPU optimization guide". Any of the three will do.
