Simply adding the line "barrier(CLK_GLOBAL_MEM_FENCE);" to a function causes an error when run on Intel HD Graphics 630.
- Without a printf the function takes a very long time to execute (~10 seconds) and does not appear to execute subsequent statements.
- With a printf included the program will crash when flush() is called.
However, on the same system the function runs quickly and correctly on both the CPU and the discrete GPU.

Setup:
macOS Sierra 10.12.6
MacBook Pro (15-inch, 2017)
3.1 GHz Intel Core i7
Intel HD Graphics 630
Radeon Pro 560

The same problem is encountered when calling barrier(CLK_LOCAL_MEM_FENCE).
However, when calling an atomic (e.g. atomic_inc) the behavior is correct.
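
A minimal sketch of the pattern described above may help narrow this down. The actual kernel is not shown in this post, so the kernel name, the argument, and the body are hypothetical placeholders. It only illustrates a barrier that every work-item reaches unconditionally, with the atomic variant left as a comment for comparison.

```c
// Hypothetical reproduction sketch: barrier_repro and out are illustrative
// names, not taken from the original code.
__kernel void barrier_repro(__global int *out)
{
    const size_t gid = get_global_id(0);

    out[gid] = (int)gid;

    // Reached unconditionally: every work-item in the work-group must
    // execute the same barrier() call. A barrier placed inside a branch or
    // loop that only some work-items enter has undefined behavior.
    // Note that barrier() synchronizes work-items within one work-group only.
    barrier(CLK_GLOBAL_MEM_FENCE);

    // Statement after the barrier, to check whether execution continues.
    printf("after barrier: %d\n", (int)gid);

    // Variant reported above as behaving correctly (used in place of the
    // barrier):
    //   atomic_inc(&out[0]);
}
```

If a kernel of this shape already fails on the HD Graphics 630, divergent control flow around the barrier can be ruled out as the cause. If it works, the next thing to check is whether the real function calls barrier() from inside a conditional or an early-return path.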

Is there some special way I should be handling barrier() calls?