The OpenCL C++ bindings version 1.2 use a GCC builtin (__sync_val_compare_and_swap) that is not provided by the PGI C++ compiler for x86/x86_64 Linux. (Apparently, some other compiler vendors such as Intel do provide it for compatibility.)
Has anyone found a reasonable workaround? My attempts to provide an implementation of this builtin as a function (inlined or otherwise) cause the templated method that uses the builtin to be outlined, leading to multiply-defined symbol link errors in the code I’m working on.
Would defining __sync_val_compare_and_swap as a macro help at all? I don’t work with OpenCL 1.2, since I target both NVIDIA and AMD, so I have no experience there.
I replied a few days ago but apparently didn’t hit the right button to submit.
I tried defining __sync_val_compare_and_swap both as a macro and as a C++ inline function. In both cases, the class method in the C++ bindings that uses the __sync_val_compare_and_swap builtin ended up being outlined. Since the code I’m working with includes the C++ bindings header from several source files, this caused link errors due to multiply-defined symbols for the outlined function.
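For anyone following along, here is a rough sketch of the kind of shim described above. It is not the exact code from my project, and several things in it are assumptions: the name compat_val_compare_and_swap is made up, the __PGI check assumes the PGI compiler predefines that macro, and the x86 branch assumes the compiler accepts GCC-style extended inline assembly and defines __i386__ or __x86_64__. On other targets the sketch simply defers to the builtin. Declaring the shim static inline keeps each translation unit's copy private, but that only protects the shim itself; it does nothing about the bindings' own method being outlined.

```cpp
#include <cassert>

// Hypothetical shim (the name is an assumption, not part of the OpenCL
// bindings or any compiler). Atomically: if *ptr == expected, store desired.
// Always returns the value *ptr held before the operation.
static inline int compat_val_compare_and_swap(volatile int* ptr,
                                              int expected, int desired)
{
#if defined(__i386__) || defined(__x86_64__)
    int previous;
    // lock cmpxchg compares EAX with *ptr; on a match it stores desired,
    // otherwise it loads *ptr into EAX. Either way EAX ends up holding
    // the old value of *ptr.
    __asm__ __volatile__("lock; cmpxchgl %2, %1"
                         : "=a"(previous), "+m"(*ptr)
                         : "r"(desired), "0"(expected)
                         : "memory", "cc");
    return previous;
#else
    // Other targets: assume the compiler supplies the builtin.
    return __sync_val_compare_and_swap(ptr, expected, desired);
#endif
}

// Assumption: __PGI identifies the PGI compiler. The macro must be defined
// before the C++ bindings header is included.
#if defined(__PGI)
#define __sync_val_compare_and_swap(ptr, expected, desired) \
    compat_val_compare_and_swap((ptr), (expected), (desired))
#endif

// Small usage check showing the expected semantics.
static inline void compat_cas_self_check()
{
    volatile int flag = 0;
    assert(compat_val_compare_and_swap(&flag, 0, 1) == 0); // match: swapped
    assert(flag == 1);
    assert(compat_val_compare_and_swap(&flag, 0, 2) == 1); // no match
    assert(flag == 1);
}
```

With something like this in a common header included ahead of the bindings, the shim itself compiles and behaves as expected; the link errors come from the bindings method that calls it.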
I don’t understand why the compiler decided to outline that function. I don’t think it should have, but I couldn’t find a way to prevent it.
I may be stuck with modifying all our code to remove its use of the C++ bindings. That would be disappointing, since overall I like the way the C++ bindings raise the level of abstraction when programming for OpenCL devices.