I updated the CUDA SDK to v5.5 and got this build error:

cl.hpp(2996) : error C2664: 'clEnqueueNativeKernel' : cannot convert parameter 2 from 'void (__cdecl *)(void *)' to 'void (__stdcall *)(void *)'
1> This conversion requires a reinterpret_cast, a C-style cast or function-style cast

I switched back to CUDA SDK 3.2 and everything builds OK.

Then I compared the cl.hpp I use with the one provided on the Khronos site for OpenCL 1.1/1.0: they are the same.
So I decided that the cl.h from the CUDA 5.5 SDK is the "wrong" one.
But it has no differences from the cl.h for OpenCL 1.1 on the Khronos site (!). So are cl.h and cl.hpp, both taken for OpenCL 1.1, incompatible with each other?
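
To make the mismatch concrete, here is a minimal sketch of what I think the compiler is complaining about. Nothing in it comes from the real headers: DEMO_CALLBACK, demo_enqueue and the kernel functions are hypothetical stand-ins, and my guess that the __stdcall comes from a calling-convention macro in the platform header is unverified. It only shows that a parameter declared as a __stdcall function pointer will not accept a default (__cdecl) function in a 32-bit MSVC build, and that a thin wrapper with the matching convention gets around it.

```cpp
#include <cassert>

// Hypothetical stand-in for a calling-convention macro: __stdcall under MSVC,
// nothing elsewhere (the keyword is MSVC-specific; it is ignored on x64).
#if defined(_MSC_VER)
#  define DEMO_CALLBACK __stdcall
#else
#  define DEMO_CALLBACK
#endif

// Hypothetical stand-in for clEnqueueNativeKernel: it only accepts a pointer
// to a function that uses the DEMO_CALLBACK convention, then calls it.
void demo_enqueue(void (DEMO_CALLBACK *user_func)(void *), void *args) {
    user_func(args);
}

// Declared with the same convention as the parameter, so it converts cleanly.
void DEMO_CALLBACK matching_kernel(void *args) {
    *static_cast<int *>(args) += 1;
}

// No annotation: with MSVC's default settings this is __cdecl, which is the
// "from" type in the C2664 message.
void plain_kernel(void *args) {
    *static_cast<int *>(args) += 10;
}

// Thunk with the expected convention that forwards to the plain function.
void DEMO_CALLBACK plain_kernel_thunk(void *args) {
    plain_kernel(args);
}

int run_demo() {
    int value = 0;
    demo_enqueue(matching_kernel, &value);     // conventions match
    assert(value == 1);
    demo_enqueue(plain_kernel_thunk, &value);  // goes through the thunk
    // demo_enqueue(plain_kernel, &value);     // expected to fail with C2664
    //                                         // in a 32-bit MSVC build
    return value;
}
```

If this is the cause, then the two headers disagree about the convention of that one function-pointer parameter, and the fix would be on the cl.hpp side (or in a wrapper like the thunk above) rather than in cl.h.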
Has anyone tried this?