Thread: barrier() crashes on Intel HD Graphics 630 + Apple OpenCL 1.2

  1. #1
    Junior Member
    Join Date
    Feb 2017
    Posts
    14

    barrier() crashes on Intel HD Graphics 630 + Apple OpenCL 1.2

    Simply adding the line "barrier(CLK_GLOBAL_MEM_FENCE);" to a function causes an error when run on Intel HD Graphics 630.
    - Without a printf the function takes a very long time to execute (~10 seconds) and does not appear to execute subsequent statements.
    - With a printf included the program will crash when flush() is called.
    However, on the same system the function runs quickly and correctly on both the CPU and the discrete GPU.

    Setup:
    macOS Sierra 10.12.6
    MacBookPro (15-inch, 2017)
    3.1 GHz Intel Core i7
    Intel HD Graphics 630
    Radeon Pro 560

    The same problem is encountered when calling barrier(CLK_LOCAL_MEM_FENCE).
    However, when calling an atomic (e.g. atomic_inc) the behavior is correct.

    Is there some special way I should be handling barrier() calls?

  2. #2
    Senior Member
    Join Date
    Dec 2011
    Posts
    253
    First rule is: All work items in the work group must hit the barrier() call.
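
    One common way this rule gets broken, offered only as an illustration and not as a diagnosis of this thread's problem: OpenCL 1.x requires the global work size to be a multiple of the work-group size, so hosts often pad the global size and the kernel guards out-of-range IDs with an early `return`. If that `return` sits before a `barrier()`, the padding work-items never reach it. The helper names below (`padded_global_size`, `padding_work_items`) are hypothetical, and the kernel shape in the comment is an assumed pattern, not code from the original poster.

```c
#include <stddef.h>

/*
 * Assumed kernel pattern (OpenCL C, shown as a comment only):
 *
 *   size_t gid = get_global_id(0);
 *   bool active = gid < n;          // do NOT `return` here
 *   if (active) { ... write local memory ... }
 *   barrier(CLK_LOCAL_MEM_FENCE);   // reached by every work-item in the group
 *   if (active) { ... read local memory ... }
 */

/* Round n up to the next multiple of local_size (local_size must be nonzero). */
size_t padded_global_size(size_t n, size_t local_size)
{
    return ((n + local_size - 1) / local_size) * local_size;
}

/* Number of extra work-items the padding creates. Each of them must still
 * reach every barrier() in the kernel, even though it has no data to process. */
size_t padding_work_items(size_t n, size_t local_size)
{
    return padded_global_size(n, local_size) - n;
}
```

    For example, 1000 elements with a work-group size of 64 gives a padded global size of 1024, so 24 work-items in the last group have no element of their own but still have to hit the barrier.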

  3. #3
    Junior Member
    Join Date
    Feb 2017
    Posts
    14
    @Dithermaster: all work items reach the barrier. Also, the same function with the same inputs runs to completion on the Intel CPU and on the AMD GPU. Only the integrated Intel GPU fails.

  4. #4
    Newbie
    Join Date
    Jan 2018
    Posts
    2
    I'm facing the same error when adding the line barrier(CLK_GLOBAL_MEM_FENCE); have you been able to figure out how to work around it?

  5. #5
    Junior Member
    Join Date
    Feb 2017
    Posts
    14
    @garryjoshi I'm glad to hear that I'm not alone in hitting this issue.

    Unfortunately, I never found a resolution to the problem - I simply excluded the integrated GPU from the context.
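
    A minimal sketch of that workaround, assuming the host has already queried each device with clGetDeviceInfo (CL_DEVICE_TYPE and CL_DEVICE_VENDOR) and copied the results into a small struct. The `device_summary` type, the function names, and the "Intel" substring match are all hypothetical; the actual vendor string reported by the Apple driver should be checked on the target machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of one device, filled by the host from
 * clGetDeviceInfo(CL_DEVICE_TYPE) and clGetDeviceInfo(CL_DEVICE_VENDOR). */
typedef struct {
    bool is_gpu;         /* true if the device type includes CL_DEVICE_TYPE_GPU */
    const char *vendor;  /* vendor string as reported by the driver */
} device_summary;

/* Assumption: the integrated GPU is the only GPU whose vendor string
 * contains "Intel". The Intel CPU device is not a GPU, so it is kept. */
bool is_intel_gpu(const device_summary *d)
{
    assert(d != NULL && d->vendor != NULL);
    return d->is_gpu && strstr(d->vendor, "Intel") != NULL;
}

/* Compact the array in place, dropping Intel GPUs; returns the new count. */
size_t exclude_intel_gpus(device_summary *devs, size_t count)
{
    size_t kept = 0;
    for (size_t i = 0; i < count; ++i) {
        if (!is_intel_gpu(&devs[i])) {
            devs[kept++] = devs[i];
        }
    }
    return kept;
}
```

    In a real host program the same filter would be applied to the cl_device_id list before it is passed to clCreateContext, so the context only ever contains the CPU and the discrete GPU.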

  6. #6
    Newbie
    Join Date
    Jan 2018
    Posts
    2
    Hi Gabriel, have you tried contacting the Khronos support team? I emailed them almost two weeks ago and haven't had a response yet.

    --
    Garry Joshi

  7. #7
    Senior Member
    Join Date
    Apr 2015
    Posts
    316
    Khronos won't help you with Intel drivers. Intel support or the Intel developer forum is a better bet, though I'm not sure by how much.

  8. #8
    Administrator khronos
    Join Date
    Jun 2002
    Location
    Montreal
    Posts
    104
    Quote Originally Posted by garryjoshi
    Hi Gabriel, have you tried contacting the Khronos support team? I emailed them almost two weeks ago and haven't had a response yet.

    Who did you contact at Khronos? Feel free to email me at webmaster at khronos.org.
    Last edited by khronos; 01-11-2018 at 09:20 AM. Reason: Add quote from original poster.

  9. #9
    Administrator khronos
    Join Date
    Jun 2002
    Location
    Montreal
    Posts
    104
    If you feel this is a bug in OpenCL, you are welcome to post an issue on our issue tracker on GitHub. This may very well be an issue with the Intel implementation, and as Salabar says, the Intel forums might be a better place to post.

  10. #10
    Senior Member
    Join Date
    Apr 2015
    Posts
    316
    I realize this is not a very timely response, but it just came to me. If your kernel is very big, try splitting it into multiple smaller kernels. On Radeon GPUs, if your kernel is too big and the compiler has to resort to register spilling, I can bloody guarantee you that your GPU will not do whatever you expect it to do. It won't crash, but the cause can be the same. Big kernels are bad for performance and vendors assume you ain't writing them, therefore no one tests this properly and this part of the compiler is a bug-ridden toxic wasteland.
