compileWithBinaries and calling kernels

Hi,

Does anyone have a simple example of how the clCreateProgramWithBinary function works?

Does anyone know how much overhead there is when a kernel is called?
For execution time, is it better to put all the code in a single kernel so that there is only one call, or to split the code across several kernels, which means more kernel calls in the application?

Thanks,

Luiz Drumond.

Does anyone have a simple example of how the clCreateProgramWithBinary function works?

I can’t easily give you an example. However, if you write some code I can help you debug it. What follows is the general idea on how to do it:

After you build a program from source successfully, you can call clGetProgramInfo(…, CL_PROGRAM_BINARY_SIZES, …) to find out how big the program binary is for each device. You then need to allocate enough memory for each one using malloc() or similar. Then you can obtain the actual program binaries with clGetProgramInfo(…, CL_PROGRAM_BINARIES, …), passing an array with one pointer per device. Now you can store the program binaries in a file or database for the next time the user runs your application.

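The next time your application runs you can pass those program binaries to clCreateProgramWithBinary() and then call clBuildProgram(). A binary is only valid for the device it was built for, so keep the build-from-source path as a fallback.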

Does anyone know how much overhead there is when a kernel is called?

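That depends on the implementation. You can measure it by creating your command queue with CL_QUEUE_PROFILING_ENABLE and then reading the timestamps of the event returned by clEnqueueNDRangeKernel() with clGetEventProfilingInfo().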

For execution time, is it better to put all the code in a single kernel so that there is only one call, or to split the code across several kernels, which means more kernel calls in the application?

I would organize my kernels in a way that makes logical sense and forget about any other reasons. Typically the number of kernels you execute comes naturally from the algorithm you are running.