When profiling an OpenCL application on a GPU with compute capability 6.1 and the CUDA 8.0 toolkit, using the command-line nvprof.exe, I run into the following error:
==9652== Profiling application: myopencl.exe
==9652== Profiling result:
No kernels were profiled.
I couldn’t find anything online that seemed to work. Please note this is for OPENCL and not CUDA.
I tried an older version of the toolkit, 7.5, and that did not work either. Do you know whether OpenCL support is coming back in newer versions of the toolkit, with OpenCL 2.0?
Interestingly, I can get the GUI version to work with 8.0, just not the command-line version. Does anyone know whether the GUI version saves the numbers somewhere? I have looked around in nvreport, nvactivity, etc. None of them hold any useful text data.
It may not give you quite as much detail as the IHV-provided tools, but the “Device Performance Timing” capabilities of the Intercept Layer for OpenCL Applications will give you some profiling information about your OpenCL application, such as the total / min / max / average time for each kernel, and it works on any OpenCL implementation:
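To make the "total / min / max / average time for each kernel" idea concrete, here is a minimal sketch of that kind of summary. It is not the Intercept Layer's implementation. It assumes you already have per-launch start and end timestamps in nanoseconds, for example the values OpenCL event profiling returns through clGetEventProfilingInfo with CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END on a queue created with CL_QUEUE_PROFILING_ENABLE. The function name `summarize_kernel_times`, the kernel names, and the sample numbers are all made up for illustration.

```python
from collections import defaultdict


def summarize_kernel_times(samples):
    """Aggregate per-kernel timing statistics.

    samples: iterable of (kernel_name, start_ns, end_ns) tuples,
    one tuple per kernel launch (hypothetical input format).
    """
    durations = defaultdict(list)
    for name, start_ns, end_ns in samples:
        durations[name].append(end_ns - start_ns)

    summary = {}
    for name, values in durations.items():
        summary[name] = {
            "calls": len(values),
            "total_ns": sum(values),
            "min_ns": min(values),
            "max_ns": max(values),
            "avg_ns": sum(values) / len(values),
        }
    return summary


# Made-up timestamps, standing in for real event profiling data.
samples = [
    ("vector_add", 1000, 1500),
    ("vector_add", 2000, 2700),
    ("reduce", 3000, 3400),
]

summary = summarize_kernel_times(samples)
for name, stats in summary.items():
    print(name, stats)
```

Timing measured this way only covers kernel execution as reported by the OpenCL runtime. It does not replace the PC sampling or hardware counters that a vendor profiler provides.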
Do you mind sharing the steps/settings you used to get nvvp to profile OpenCL?
I could not even get that to work: my program does run in nvvp, but it does not produce a timeline or provide any useful information (I am interested in PC sampling profiling, like for my CUDA code).