OpenCL ICD infrastructure questions

Hello.

I have a few general questions regarding the OpenCL ICD.

  1. Who installs the OpenCL.DLL on the system? I assume the goal is to be able to run applications using OpenCL on a machine that does not have any additional software installed (Stream / GPU compute SDK). The best solution would be that the DLL gets installed with the OS. That is fairly unrealistic though. The second best solution is that the DLL gets installed with the display driver. Nvidia is already doing that, ATI is not (at least in some cases). The least convenient solution is to install it with the application.

  2. Who installs the IHV OpenCL driver DLL (nvcuda.dll, atiocl.dll/atiocl64.dll) on the system? Again, I assume the goal is to be able to run applications using OpenCL on a machine that does not have any of the Stream / GPU compute SDKs installed. The best solution is that they get installed with the display driver. Nvidia is already doing that, ATI is not (at least in some cases).

  3. Who installs the IHV OpenCL driver DLL for the CPU case? Could this be installed as part of a “CPU driver”?

  4. What is the calling convention for the ICD DLL? I found that Nvidia is using CDECL on Windows 7 64-bit and stdcall on XP 32-bit. Comparing Khronos’ cl_platform.h with the one that Nvidia ships in their GPU compute SDK makes me think matters are somewhat in flux, hence my question. I assume the goal is to be consistent across vendors and across 32-bit and 64-bit Windows. OpenGL32.dll is using CDECL. Is the plan to use CDECL for OpenCL as well? Would it be possible for Khronos to provide Windows 32- and 64-bit binaries of the OpenCL.DLL ICD along with the stub libs to get this standardized quickly?

If there’s a better forum to ask these questions or a place where I can find the answers, please let me know. I’ve searched here for a while and all I found is that there are others running into the same or similar issues.

Since there is no ‘house owner’ for OpenCL on the Windows platform, every OpenCL implementor installs its components independently. Your application should verify that all the subcomponents it needs (e.g. drivers, runtimes, library DLLs) are installed before it runs.

4.) The calling convention is stdcall. Before Khronos set the calling convention, Nvidia released a driver that uses cdecl. However, if you upgrade to GPU SDK 3.0 beta and driver 196.21, you can work with stdcall.

Thanks for your reply. I should have explained what I want to do more clearly. I would like to deploy Windows applications that take advantage of OpenCL-capable hardware (mostly interested in GPU at this point) if it is available on the end-user’s machine. With the OpenCL ICD, the binaries that are necessary to run an OpenCL program on Windows are:

A) application.exe (links to stub lib OpenCL.lib from SDK)
B) OpenCL.dll
C) OpenCL ICD (nvcuda.dll for nvidia and atiocl.dll/atiocl64.dll for ATI)

I’d like to know who is responsible for installing B and C. Nvidia is installing both with their display driver. I think this is the preferred approach. ATI is installing both with their SDK.

I had a look at Nvidia’s 3.0 beta (the 64-bit SDK) and you’re right, for 32-bit binaries the calling convention is stdcall now. For the 64-bit binaries the calling convention is still cdecl. Are you saying that OpenCL will switch to stdcall for 64-bit binaries?

Both should be installed by the display drivers. For OpenCL.dll, if a system does not have an OpenCL device, there is no point in installing one because there will still be no OpenCL, so I don’t think it’s necessary for an application to install that.

It’s stdcall for both 32-bit and 64-bit in the current SDKs (for both NVIDIA and ATI).

True. Alas AMD does not see it that way, or at least not yet.

It is not stdcall for 64-bit in Nvidia’s most current Cuda toolkit (3.0 beta), which is available from this thread:

http://forums.nvidia.com/index.php?showtopic=149959

The cl_platform.h that it installs has the following lines in it (matches Khronos’ cl_platform.h):

#if defined(_WIN32)
#define CL_API_ENTRY
#define CL_API_CALL __stdcall
#else
#define CL_API_ENTRY
#define CL_API_CALL
#endif
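To make the role of these two macros concrete, here is a minimal sketch of how a declaration and a matching function pointer type pick up the calling convention from CL_API_CALL. The demo_ names and the simplified types are hypothetical stand-ins so the snippet compiles without the OpenCL headers; the real declarations live in cl.h and are not reproduced here.

```c
#include <stddef.h>

/* Same macro logic as the cl_platform.h excerpt quoted above. */
#if defined(_WIN32)
#define CL_API_ENTRY
#define CL_API_CALL __stdcall
#else
#define CL_API_ENTRY
#define CL_API_CALL
#endif

/* Hypothetical stand-ins for the OpenCL scalar and handle types. */
typedef int demo_int;
typedef unsigned int demo_uint;
typedef struct demo_platform *demo_platform_id;

/* Declaration pattern: CL_API_ENTRY before the return type,
   CL_API_CALL between the return type and the function name. */
CL_API_ENTRY demo_int CL_API_CALL
demoGetPlatformIDs(demo_uint num_entries,
                   demo_platform_id *platforms,
                   demo_uint *num_platforms);

/* A function pointer type must repeat CL_API_CALL, otherwise a 32-bit
   Windows build would call through it with the wrong convention. */
typedef demo_int (CL_API_CALL *demoGetPlatformIDs_fn)(demo_uint,
                                                      demo_platform_id *,
                                                      demo_uint *);

/* Dummy definition (illustration only): reports zero platforms. */
CL_API_ENTRY demo_int CL_API_CALL
demoGetPlatformIDs(demo_uint num_entries,
                   demo_platform_id *platforms,
                   demo_uint *num_platforms)
{
    (void)num_entries;
    (void)platforms;
    if (num_platforms != NULL)
        *num_platforms = 0;
    return 0;
}
```

On a non-Windows compiler both macros expand to nothing, so the same source builds unchanged there.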

Looking at the binaries in the Cuda toolkit, I see that the exports in …/Win32/OpenCL.lib have names of the form _clBuildProgram@24 etc. (indicates stdcall), and the exports in …/x64/OpenCL.lib have names of the form clBuildProgram etc. (indicates cdecl).

Which Nvidia SDK are you using?

_WIN32 is defined for all Windows programs, including x86-64 binaries.
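A small sketch of the point above: because _WIN32 is predefined for 64-bit Windows targets too, code that needs to tell the two apart has to test _WIN64 first. The function names target_name and win32_macro_defined are made up for this illustration.

```c
#include <string.h>

/* Reports which kind of target this translation unit was compiled for.
   _WIN64 is checked first because _WIN32 is also defined on x64 builds. */
static const char *target_name(void)
{
#if defined(_WIN64)
    return "Windows 64-bit";
#elif defined(_WIN32)
    return "Windows 32-bit";
#else
    return "not Windows";
#endif
}

/* Returns 1 when the #if defined(_WIN32) branch of cl_platform.h
   would be taken, 0 otherwise. */
static int win32_macro_defined(void)
{
#if defined(_WIN32)
    return 1;
#else
    return 0;
#endif
}
```

So on any Windows build, 32-bit or 64-bit, the `#if defined(_WIN32)` branch of cl_platform.h is the one that applies.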

Ah… the x64 binaries have undecorated exports – the leading underscore which indicates cdecl is actually not there. pcchen, you’re right: _WIN32 is #defined for both win32 and x64 builds.
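For readers comparing export names the same way, here is a rough sketch of the 32-bit x86 decoration rules for C functions as I understand them: stdcall names get a leading underscore plus @ and the argument size in bytes, cdecl names get only the leading underscore, and x64 C names are left undecorated. The helper names are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* 32-bit x86 stdcall decoration: '_' + name + '@' + argument bytes.
   Returns 1 on success, 0 if the buffer is too small. */
static int stdcall_decorated_name(char *out, size_t out_size,
                                  const char *name, unsigned arg_bytes)
{
    int n = snprintf(out, out_size, "_%s@%u", name, arg_bytes);
    return n >= 0 && (size_t)n < out_size;
}

/* 32-bit x86 cdecl decoration: '_' + name, with no size suffix. */
static int cdecl_decorated_name(char *out, size_t out_size,
                                const char *name)
{
    int n = snprintf(out, out_size, "_%s", name);
    return n >= 0 && (size_t)n < out_size;
}
```

clBuildProgram takes six arguments that are each 4 bytes wide on 32-bit x86, so `stdcall_decorated_name(buf, sizeof buf, "clBuildProgram", 6 * 4)` yields `_clBuildProgram@24`, matching the Win32 export seen in the toolkit.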

Thanks to pcchen and tzachi for their replies to my questions!

To summarize: The calling convention is stdcall for all vendors and for both 32- and 64-bit OSs. Nvidia got it wrong with their GPU compute SDK 2.3a, but the 3.0 beta has the problem corrected.

Also, I’m fairly certain AMD will be installing their ICD and the OpenCL.DLL with their display driver in the near future.

“if a system does not have an OpenCL device, there is no point in installing one because there will still be no OpenCL, so I don’t think it’s necessary for an application to install that”

It could be good to install OpenCL.dll because some applications link with this library; they need to call clGetPlatformIDs to find out that there is no OpenCL.

It can be the responsibility of the application to install this DLL.

I still think it’s a bad idea for an application to install OpenCL.dll on its own, even if it checks for an existing OpenCL.dll beforehand. The problem is, there are too many ways to mess with DLLs in the Windows system directories.

Of course, this can cause some problems with programs linked with OpenCL.dll if it’s not there. A better way is to load the library dynamically in the program at run time, so it’s possible to check whether OpenCL.dll is available or not. Unfortunately, it can be somewhat cumbersome to do that.

When I said “installing”, I was in fact thinking of “providing” (in the application’s local folder). An application must clearly not overwrite OpenCL.dll.

Loading the library at execution time is another solution, but a bit more complicated :). Furthermore, I hope it will be available everywhere, as OpenGL is. And nowadays nobody thinks about loading the OpenGL library at execution time ^^’.

Providing its own OpenCL.dll is not an ideal solution. The current ICD from Khronos is probably not redistributable, though I’m not sure about this. Furthermore, if there are future versions of the ICD, old applications with an older ICD may have compatibility problems.

Right now I think the best solution is to dynamically load OpenCL.dll at execution time. It shouldn’t be difficult to write a helper library to do so (someone just has to do the hard job of manually loading every OpenCL function from the ICD… :P ). This also has an upside of being able to support OpenCL “seamlessly”: if a system has no OpenCL, the application will run normally without GPGPU, and it only tries to use OpenCL when it’s available.

OpenGL is a different story, because the ICD is provided by Microsoft and is installed by default on Windows NT (although the original Windows 95 does not have it by default, IIRC).

I agree with both of you: OpenCL.DLL cannot be expected to be installed on the system (unlike OpenGL32.dll, which comes with the OS). I also don’t think it’s a good idea to distribute it with the application.

I’m loading the DLL and getting the entry points (GetProcAddress) dynamically at runtime. If the DLL can’t be loaded (because it can’t be found anywhere in the process’ path), OpenCL support is disabled in the application. Some of this stuff is needed anyway to support extensions.
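A minimal sketch of that approach, not the poster’s actual code: it probes for OpenCL.dll with LoadLibraryA, resolves clGetPlatformIDs with GetProcAddress, and reports how many platforms are present. The demo_ types and the opencl_platform_count name are hypothetical stand-ins so the snippet builds without the OpenCL headers. It assumes the DLL exports the undecorated name clGetPlatformIDs, and on non-Windows builds it simply reports OpenCL as unavailable.

```c
#include <stddef.h>

#if defined(_WIN32)
#include <windows.h>
#define DEMO_CL_CALL __stdcall
#else
#define DEMO_CL_CALL
#endif

/* Hypothetical stand-ins for cl_int, cl_uint and cl_platform_id. */
typedef int demo_cl_int;
typedef unsigned int demo_cl_uint;
typedef struct demo_cl_platform *demo_cl_platform_id;

/* Pointer type for clGetPlatformIDs, using the stdcall convention
   settled on in this thread. */
typedef demo_cl_int (DEMO_CL_CALL *clGetPlatformIDs_fn)(
    demo_cl_uint, demo_cl_platform_id *, demo_cl_uint *);

/* Returns the number of OpenCL platforms, or -1 when OpenCL cannot be
   used (DLL missing, entry point missing, or the query failed). */
static int opencl_platform_count(void)
{
#if defined(_WIN32)
    HMODULE dll = LoadLibraryA("OpenCL.dll");
    clGetPlatformIDs_fn get_ids;
    demo_cl_uint count = 0;
    int result = -1;

    if (dll == NULL)
        return -1; /* not found anywhere on the DLL search path */

    /* Assumption: the export is the plain, undecorated name. */
    get_ids = (clGetPlatformIDs_fn)GetProcAddress(dll, "clGetPlatformIDs");
    if (get_ids != NULL && get_ids(0, NULL, &count) == 0 /* CL_SUCCESS */)
        result = (int)count;

    /* A real application would keep the DLL loaded while it uses OpenCL;
       this probe releases it immediately. */
    FreeLibrary(dll);
    return result;
#else
    return -1; /* sketch covers the Windows case only */
#endif
}
```

An application could call `opencl_platform_count()` once at startup and disable its OpenCL code path whenever the result is negative or zero.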

Just for reference, there’s another mechanism that can be used to solve this issue: delay loading the DLL.