Simply adding the line "barrier(CLK_GLOBAL_MEM_FENCE);" to a function causes an error when run on Intel HD Graphics 630.
- Without a printf, the function takes a very long time to execute (~10 seconds) and does not appear to execute subsequent statements.
- With a printf included, the program crashes when flush() is called.
However, on the same system the function runs quickly and correctly on both the CPU and the discrete GPU.

macOS Sierra 10.12.6
MacBook Pro (15-inch, 2017)
3.1 GHz Intel Core i7
Intel HD Graphics 630
Radeon Pro 560

The same problem is encountered when calling barrier(CLK_LOCAL_MEM_FENCE).
However, when calling an atomic (e.g. atomic_inc) the behavior is correct.
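
For illustration, here is a minimal sketch of the pattern described above. It is not the original code: the kernel name, its argument, and the statements around the barrier are hypothetical placeholders. It assumes every work-item in the work-group reaches the barrier, with no divergent control flow around it.

```c
// Hypothetical kernel for illustration only; the name, argument and body
// are placeholders, not the code from the actual program.
__kernel void barrier_test(__global int *out)
{
    size_t gid = get_global_id(0);
    out[gid] = (int)gid;

    // The call in question. It is reached unconditionally by every
    // work-item in the work-group.
    barrier(CLK_GLOBAL_MEM_FENCE);

    // A statement after the barrier, standing in for the "subsequent
    // statements" that do not appear to execute on the Intel HD Graphics 630.
    out[gid] += 1;
}
```

Replacing CLK_GLOBAL_MEM_FENCE with CLK_LOCAL_MEM_FENCE in this sketch corresponds to the second case mentioned above.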

Is there some special way I should be handling barrier() calls?