OpenCL clGetEventProfilingInfo() problem: CL_PROFILING_COMMAND_START returns the same value as CL_PROFILING_COMMAND_END

I am trying to measure my kernel's execution time with clGetEventProfilingInfo(), but I am getting some strange results. My host code is listed below.

cl_ulong time_begin, time_finish, result;
//…initializing

// Profiling must be enabled on the queue for clGetEventProfilingInfo() to work
cmdQueue = clCreateCommandQueue(context, devices[0], CL_QUEUE_PROFILING_ENABLE, &err);

//kernel executing

// Block until the kernel's event has completed
clWaitForEvents(1, &fEvt1[0]);

timeCount(fEvt1[0], &time_begin, &time_finish);
result = time_finish - time_begin;  // elapsed time in nanoseconds

}

// Reads the event's begin and end timestamps (in nanoseconds)
void timeCount(cl_event event, cl_ulong *time_begin, cl_ulong *time_finish)
{
    *time_begin = 0;
    *time_finish = 0;
    //err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), time_begin, NULL);
    err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), time_begin, NULL);
    err = clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_END, sizeof(cl_ulong), time_finish, NULL);
}

When I take the begin time from CL_PROFILING_COMMAND_QUEUED, i.e. clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_QUEUED, sizeof(cl_ulong), time_begin, NULL), the result is about 2,000,000 ns, which seems correct. I also tried clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_SUBMIT, sizeof(cl_ulong), time_begin, NULL), and that result seems correct too.

However, if I use clGetEventProfilingInfo(event, CL_PROFILING_COMMAND_START, sizeof(cl_ulong), time_begin, NULL), the result is 0, which is definitely wrong: time_begin comes back with the same value as time_finish. I am very confused by this. Could anyone please help me figure it out?
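To make the comparison between the four timestamps concrete, here is a minimal sketch of how they can be split into phases. It assumes the QUEUED, SUBMIT, START and END values have already been read into plain 64-bit integers (cl_ulong is a 64-bit unsigned type). The names event_times, event_phases and split_phases are hypothetical and not part of OpenCL, and the sketch only does the arithmetic; it does not call the OpenCL API.

```c
#include <stdint.h>

/* Hypothetical container for the four event timestamps, in nanoseconds.
   In real host code each field would be filled by clGetEventProfilingInfo(). */
typedef struct {
    uint64_t queued;  /* CL_PROFILING_COMMAND_QUEUED */
    uint64_t submit;  /* CL_PROFILING_COMMAND_SUBMIT */
    uint64_t start;   /* CL_PROFILING_COMMAND_START  */
    uint64_t end;     /* CL_PROFILING_COMMAND_END    */
} event_times;

/* Hypothetical result: the time spent in each phase, in nanoseconds. */
typedef struct {
    uint64_t queue_wait_ns;   /* QUEUED -> SUBMIT */
    uint64_t launch_wait_ns;  /* SUBMIT -> START  */
    uint64_t exec_ns;         /* START  -> END    */
} event_phases;

/* Splits the timestamps into phases.
   Returns 0 on success, or -1 if the timestamps are not in the
   expected non-decreasing order (which would make the unsigned
   subtractions wrap around). */
int split_phases(const event_times *t, event_phases *out)
{
    if (t->queued > t->submit || t->submit > t->start || t->start > t->end)
        return -1;
    out->queue_wait_ns  = t->submit - t->queued;
    out->launch_wait_ns = t->start  - t->submit;
    out->exec_ns        = t->end    - t->start;
    return 0;
}
```

With made-up values that mimic what I see (END minus QUEUED is about 2,000,000 ns, and START equals END), exec_ns comes out as 0 and almost all of the time lands in launch_wait_ns, i.e. between SUBMIT and START.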

Many thanks!