Compile for other GPU

I have a question: is it possible to compile OpenCL source code for a different GPU? Some time ago I ran tests with NVIDIA cards and different drivers, but it did not work well in either direction (old driver to new, or new driver to old). I can understand that code compiled on an AMD card will not run on an NVIDIA card, but if both cards are from one manufacturer (ATI or NVIDIA) this should work, shouldn’t it? The target PC should need neither the SDK nor the source file. How does this work?

Thanks for your help.

Yes. Every time your application runs, it submits the OpenCL kernel source for compilation on the selected platform and device. So if you move from a machine with an AMD GPU to a machine with an NVIDIA GPU, the change of compiler happens automatically.

Yes, I know that this works if you use the source file with clCreateProgramWithSource(). But I want to create the PTX code once and use it later with clCreateProgramWithBinary(). I assume it is not possible to compile the PTX code on NVIDIA and use it (with clCreateProgramWithBinary()) on AMD.

The problem is this: I have, for example, an NVIDIA GTX580 and create the PTX code starting from clCreateProgramWithSource(). That code should then run on, for example, an NVIDIA GTX690 via clCreateProgramWithBinary(). Does this work, and how?

I see. Sorry, I misread your question.

It is up to NVIDIA. Their OpenCL Programming Guide (4.1 from 1/3/2012) says:

“Currently, the PTX intermediate representation can be obtained by calling
clGetProgramInfo() with CL_PROGRAM_BINARIES. It can be passed to
clCreateProgramWithBinary() to create a program object only if it is
produced and consumed by the same driver. This will likely not be supported in
future versions.”

So, no.
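The "same driver" restriction in the quote suggests tagging any saved binary with the device and driver that produced it, and refusing to reuse it anywhere else. Below is a minimal sketch in plain C. It assumes the two strings would come from clGetDeviceInfo() with CL_DEVICE_NAME and CL_DRIVER_VERSION; the helper names make_binary_key and binary_is_reusable and the "name|version" tag format are made up for illustration. Including the device name in the tag is a deliberately conservative choice, not something the guide requires.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a tag of the form "<device name>|<driver version>"
 * that records which device and driver produced a saved program binary.
 * In a real host program the two strings would be queried with
 * clGetDeviceInfo() using CL_DEVICE_NAME and CL_DRIVER_VERSION.
 * Returns 0 on success, -1 if the tag does not fit into out. */
int make_binary_key(const char *device_name, const char *driver_version,
                    char *out, size_t out_size)
{
    assert(device_name != NULL && driver_version != NULL && out != NULL);
    int n = snprintf(out, out_size, "%s|%s", device_name, driver_version);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Hypothetical helper: a saved binary is reused only when the tag stored
 * next to it matches the tag of the current device and driver exactly. */
int binary_is_reusable(const char *stored_key, const char *current_key)
{
    assert(stored_key != NULL && current_key != NULL);
    return strcmp(stored_key, current_key) == 0;
}
```

When the tags differ, the application would have to rebuild from source (or pick another precompiled binary) instead of calling clCreateProgramWithBinary() with the stale one.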

Thank you so much for your answer. I see: if the GTX690 uses the same driver as the GTX580 it can work, but even that is not guaranteed.

But how can I give OpenCL code to our customers? They must not see the source code! Precompiling the code for all driver versions does not sound like much fun.

OpenCL 1.2 adds some ability to ship object code (instead of source), but it is not supported by NVIDIA. You don’t have to ship the source as files on disk; it can be compressed or encrypted inside your application. However, it has to be plain source in memory when it is passed to clCreateProgramWithSource(), and it is possible to intercept that call.
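The idea of keeping the source hidden inside the application can be sketched roughly like this: the kernel text is stored obfuscated in the executable and decoded in memory just before it would be handed to clCreateProgramWithSource(). This is a minimal sketch in plain C; the function name xor_transform and the single-byte XOR key are illustrative assumptions, and XOR is only obfuscation, not real encryption. The decoded text can still be captured by anyone who intercepts the API call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: XOR every byte of buf with key, in place.
 * Applying it twice with the same key restores the original bytes, so the
 * same function obfuscates the kernel text at build time and decodes it
 * at run time. It only hides the text from casual inspection of the
 * executable. */
void xor_transform(unsigned char *buf, size_t len, unsigned char key)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; ++i)
        buf[i] ^= key;
}
```

After decoding, the buffer and its length would be passed through the strings and lengths arguments of clCreateProgramWithSource(), so the text does not need to be NUL-terminated.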

Sounds good, but we want to support both AMD and NVIDIA, so that is not a solution for our problem.

I know that, and this is why we want to use binaries. So we have to compile binaries for all the GPUs that our customers use.

Anyway, many thanks for your help.

Hi Johannes,

You might have a look at the clusterchimp.org oclelf/oclcc/… guide. They have worked on this kind of problem as well.