I am trying to measure my kernel's execution time with clGetEventProfilingInfo(), but I am getting strange results. My host code is listed below.
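cl_ulong time_begin, time_finish, result;
//…initializing
// Profiling must be enabled on the queue, otherwise no timestamps are recorded.
cmdQueue = clCreateCommandQueue(context, devices[0], CL_QUEUE_PROFILING_ENABLE, &err);
//kernel executing
// Wait until the kernel's event has completed before reading its timestamps.
clWaitForEvents(1, &fEvt1[0]);
timeCount(fEvt1[0], &time_begin, &time_finish);
result = time_finish - time_begin;   // elapsed time in nanoseconds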
}
cl_int timeCount(cl_event event, cl_ulong *time_begin, cl_ulong *time_finish)
{
    cl_int err;
    *time_begin = 0;
    *time_finish = 0;
    // Alternative start point used for comparison:
    // err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), time_begin, NULL);
    err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), time_begin, NULL);
    if (err != CL_SUCCESS)
        return err;   // do not let the second call overwrite a failure
    return clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), time_finish, NULL);
}
When I use clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), time_begin, NULL) as the start point, the result is about 2,000,000 ns, which seems correct. I also tried clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_SUBMIT, sizeof(cl_ulong), time_begin, NULL), and that result seems correct too.
However, if I use clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), time_begin, NULL), the result is 0, which is definitely wrong: time_begin comes back with the same value as time_finish. I am very confused by this. Could anyone please help me figure this out?
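To make the comparison between the four counters concrete, here is a minimal sketch of how the intervals between QUEUED, SUBMIT, START and END can be broken down once all four values have been read. It is only an illustration: profile_times, profile_breakdown, breakdown() and print_breakdown() are hypothetical names that do not appear in my real code, and uint64_t stands in for cl_ulong so the sketch compiles without the OpenCL headers. The sample numbers in the usage note are made up to mirror the symptom, not measured.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the four profiling counters of one event.
   All values are in nanoseconds; uint64_t stands in for cl_ulong. */
typedef struct {
    uint64_t queued;   /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;   /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;    /* CL_PROFILING_COMMAND_START  */
    uint64_t end;      /* CL_PROFILING_COMMAND_END    */
} profile_times;

/* Hypothetical result type: the gaps between consecutive counters. */
typedef struct {
    int      ordered;             /* 1 if queued <= submit <= start <= end */
    uint64_t queue_to_submit_ns;  /* time spent waiting in the host queue  */
    uint64_t submit_to_start_ns;  /* time between submission and execution */
    uint64_t start_to_end_ns;     /* execution time reported by the device */
    uint64_t queued_to_end_ns;    /* total time from enqueue to completion */
} profile_breakdown;

/* Compute the gaps. If the counters are out of order, the gaps are left
   at zero and ordered is 0, so unsigned subtraction never wraps around. */
profile_breakdown breakdown(profile_times t)
{
    profile_breakdown b = {0, 0, 0, 0, 0};
    b.ordered = (t.queued <= t.submit) &&
                (t.submit <= t.start) &&
                (t.start <= t.end);
    if (b.ordered) {
        b.queue_to_submit_ns = t.submit - t.queued;
        b.submit_to_start_ns = t.start - t.submit;
        b.start_to_end_ns    = t.end - t.start;
        b.queued_to_end_ns   = t.end - t.queued;
    }
    return b;
}

/* Print every gap so it is visible which interval holds the time. */
void print_breakdown(profile_breakdown b)
{
    if (!b.ordered) {
        printf("counters are not in QUEUED <= SUBMIT <= START <= END order\n");
        return;
    }
    printf("queued -> submit: %" PRIu64 " ns\n", b.queue_to_submit_ns);
    printf("submit -> start : %" PRIu64 " ns\n", b.submit_to_start_ns);
    printf("start  -> end   : %" PRIu64 " ns\n", b.start_to_end_ns);
    printf("queued -> end   : %" PRIu64 " ns\n", b.queued_to_end_ns);
}
```

For example, filling profile_times with the made-up values {1000, 1500, 2001000, 2001000} and passing the result of breakdown() to print_breakdown() would show the whole 2,000,000 ns sitting before START and a START to END gap of 0, which is the pattern I am seeing.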
Many thanks!