Handling event objects for synchronization points

Hello everyone, I’m quite new to OpenCL and there are some things that I don’t understand. I hope you can make them clearer.
I would like to use event objects as synchronization points in my program. However, I am having a hard time avoiding memory leaks.
In the function cl_int clEnqueueNDRangeKernel (cl_command_queue command_queue, cl_kernel kernel, cl_uint work_dim, const size_t *global_work_offset, const size_t *global_work_size, const size_t *local_work_size, cl_uint num_events_in_wait_list, const cl_event *event_wait_list, cl_event *event), I think the event object event should increment its status when the function is called and decrement it after the command has completed. Am I right?

First, I want kernel2 to wait for kernel1. This is how I see it:

clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);
clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);

In this case, how will event1 and event2 behave? Is it dangerous to generate event2 even though I’m not sure it will ever be used?

Now, I would like to call my first two kernels again, and each should start as soon as the previous one is done. I mean that when kernel1 is done, kernel2 should start and, at the same time, another kernel1. And when kernel2 is done, kernel2 should be called again. This is how I see it, but I don’t think it is right:

clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);
clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);
clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event1);
clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event2, &event2);

Can I do things like that, or do you have something better in mind? Do I need to use the function clReleaseEvent() to free the memory?

I also thought about this version, but I have no idea whether it is any better:

clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);
wait_event1=event1;
clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event2);
wait_event2=event2;
clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &wait_event1, &event1);
clEnqueueNDRangeKernel (command_queue, kernel2, work_dim, global_work_offset, global_work_size, local_work_size,  1, &wait_event2, &event2);

Now imagine that we would like to call the same kernel many times, and each call should wait for the previous one. How should we do it without memory leaks?
Is it something like this:

clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);
for(i = 0; i < 1000; i++)
{
     clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &event1, &event1);
}

or maybe this one:

clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  0, NULL, &event1);
for(i = 0; i < 1000; i++)
{
     wait_event1 = event1;
     clEnqueueNDRangeKernel (command_queue, kernel1, work_dim, global_work_offset, global_work_size, local_work_size,  1, &wait_event1, &event1);
}

Or do I also need to add some calls to release the events, and if so, how can I do that while being certain that the kernel has completed?
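To make the question more concrete, here is a small plain-C model of what I suspect happens to the event handles in the two loops above. It does not call OpenCL at all: model_event, model_enqueue, model_release and the two chain functions are hypothetical names I made up, and the only thing modelled is that every enqueue hands back a new event whose single reference belongs to the caller. As far as I read the specification of clReleaseEvent, a released event is only deleted once its command has completed and no queued command still waits on it, which is why I assume it is safe to release the old handle right after passing it in a wait list. Please correct me if that assumption is wrong.

```c
#include <assert.h>

/* Hypothetical stand-in for cl_event: an index into a table of reference
   counts. Nothing here calls OpenCL; only handle ownership is modelled. */
#define MAX_EVENTS 4096
typedef int model_event;

static int refcount[MAX_EVENTS];
static int created = 0;

/* Forget all events from a previous run of the model. */
static void model_reset(void)
{
    for (int i = 0; i < created; i++) refcount[i] = 0;
    created = 0;
}

/* Models the event returned by an enqueue call: a new object whose single
   reference belongs to the caller. The wait-list event is accepted only to
   mirror the shape of the real call; it is not modified. */
static model_event model_enqueue(model_event wait_for)
{
    (void)wait_for;
    assert(created < MAX_EVENTS);
    refcount[created] = 1;
    return created++;
}

/* Models clReleaseEvent: drop one reference. */
static void model_release(model_event e)
{
    assert(e >= 0 && e < created && refcount[e] > 0);
    refcount[e]--;
}

/* Number of events that still hold at least one reference. */
static int model_live(void)
{
    int live = 0;
    for (int i = 0; i < created; i++)
        if (refcount[i] > 0) live++;
    return live;
}

/* Pattern from my loops: the same variable is reused as wait list and
   output, so the previous handle is overwritten without being released. */
int leaky_chain(int n)
{
    model_reset();
    model_event ev = model_enqueue(-1);      /* -1 stands for "no wait list" */
    for (int i = 0; i < n; i++)
        ev = model_enqueue(ev);              /* old handle is lost here */
    model_release(ev);                       /* only the last one is released */
    return model_live();
}

/* Alternative: keep the old handle, enqueue, then release the old handle. */
int released_chain(int n)
{
    model_reset();
    model_event prev = model_enqueue(-1);
    for (int i = 0; i < n; i++) {
        model_event next = model_enqueue(prev);
        model_release(prev);                 /* caller is done with this handle */
        prev = next;
    }
    model_release(prev);
    return model_live();
}
```

In this model, leaky_chain(1000) ends with 1000 events that were never released, while released_chain(1000) ends with none. If the real runtime behaves the same way, the second pattern with clReleaseEvent in place of model_release would be what I am looking for.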

Thank you for your consideration. I’m sure I am not the only one having difficulties with event objects and memory leaks.