Can I call the same kernel function multiple times in a loop?

I have two nested for loops, and the outer loop needs the result of the inner loop. I want to parallelize the inner loop with OpenCL, so the outer loop has to run the inner loop (the kernel) multiple times.

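Yes. You can launch the same kernel any number of times from inside any kind of loop in the host code. Each iteration simply enqueues it again with clEnqueueNDRangeKernel, and you can change its arguments between launches with clSetKernelArg.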
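The loop structure can be sketched in plain C, with the kernel launch emulated by an ordinary function so that it runs without an OpenCL device. The names launch_double_kernel and run_rounds are hypothetical, and the doubling work is just a placeholder. In real host code the call to launch_double_kernel would be a clEnqueueNDRangeKernel call followed by a read-back such as clEnqueueReadBuffer.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for one kernel launch (hypothetical name).
   Each "work-item" i doubles data[i]; on a device these would run in parallel. */
void launch_double_kernel(int *data, size_t global_size)
{
    assert(data != NULL);
    for (size_t i = 0; i < global_size; ++i) {  /* one iteration per work-item */
        data[i] *= 2;
    }
}

/* The outer loop stays on the host. It launches the same kernel `rounds`
   times and uses the result of each launch before starting the next one. */
long run_rounds(int *data, size_t n, int rounds)
{
    long total = 0;
    for (int r = 0; r < rounds; ++r) {
        launch_double_kernel(data, n);          /* same kernel, launched again */
        for (size_t i = 0; i < n; ++i) {        /* host consumes this round's result */
            total += data[i];
        }
    }
    return total;
}
```

Starting from {1, 2, 3} and running three rounds, the data ends up as {8, 16, 24} and the accumulated total is 84.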

Can we have print statements inside the kernel function?

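Not all vendors support that feature. Look for the printf extension in the AMD and Intel SDKs (cl_amd_printf and cl_intel_printf). As far as I know, it will not work on NVIDIA. From OpenCL 1.2 onward printf is a built-in function, so it also depends on which OpenCL version the driver exposes.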

I am writing OpenCL code that finds the minimum value in an array in parallel.
I compare every element with every other element, and for each pair I set M[j] to 1 for the larger element. At the end, exactly one element of M should still be 0, and its index gives the position of the minimum value in A.
Here is my kernel function:

__kernel void array(__global int *A, __global int *M) {
    // Indices of the two elements to compare (needs a 2D global range)
    int i = get_global_id(0);
    int j = get_global_id(1);

    //barrier(CLK_LOCAL_MEM_FENCE);

    // A[j] is larger than A[i], so A[j] cannot be the minimum
    if (A[i] < A[j] && i != j) {
        M[j] = 1;
    }
}
All the values in the M array come back as 0.
The expected output is that exactly one value in M is 0, and its index is the index of the minimum number in A.
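One way to check the kernel logic separately from the host setup is to emulate the launch in plain C. The sketch below is only an emulation, and the function names (kernel_body, emulate_2d_launch, emulate_1d_launch, count_unmarked, first_unmarked) are hypothetical. It assumes M is zero-initialized before the launch. It also shows one possible cause of the symptom, which is only a guess since the host code is not shown: if the kernel is enqueued with work_dim = 1, get_global_id(1) returns 0 for every work-item, so j is always 0.

```c
#include <assert.h>
#include <stddef.h>

/* Body of the kernel from the question, with the two global IDs passed in
   explicitly so that it can be called from ordinary C. */
static void kernel_body(const int *A, int *M, size_t i, size_t j)
{
    if (A[i] < A[j] && i != j) {
        M[j] = 1;  /* A[j] is larger than some element, so it is not the minimum */
    }
}

/* Emulates a 2D launch with global size n x n: every (i, j) pair runs once. */
void emulate_2d_launch(const int *A, int *M, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            kernel_body(A, M, i, j);
        }
    }
}

/* Emulates a 1D launch with global size n: the second ID is 0 everywhere. */
void emulate_1d_launch(const int *A, int *M, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; ++i) {
        kernel_body(A, M, i, 0);
    }
}

/* Number of entries of M that are still 0. */
size_t count_unmarked(const int *M, size_t n)
{
    size_t count = 0;
    for (size_t k = 0; k < n; ++k) {
        if (M[k] == 0) {
            ++count;
        }
    }
    return count;
}

/* Index of the first entry of M that is still 0, or n if there is none. */
size_t first_unmarked(const int *M, size_t n)
{
    for (size_t k = 0; k < n; ++k) {
        if (M[k] == 0) {
            return k;
        }
    }
    return n;
}
```

With A = {7, 3, 9, 5} and M zeroed, the emulated 2D launch leaves only M[1] at 0, which is the index of the minimum. With A = {1, 3, 9, 5}, the emulated 1D launch leaves every entry of M at 0, which matches the reported output. Other causes worth ruling out are an M buffer that is never read back to the host or never initialized. Note also that if the minimum value occurs more than once, every copy stays at 0, so more than one entry of M can be 0 even with a correct 2D launch.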