local memory within two different kernels in the same file?

Hello,

Is local memory not accessible from two kernels in the same file?

I have the following problem:

I have one .cl file with two kernels in it. The first kernel has a __local parameter in its parameter list. The second one does not have any __local parameter. In this case everything works fine.
Now I decided to work with local memory in the second kernel as well. I didn't change the first kernel. I just added a __local parameter to the parameter list of the second kernel and updated the argument list in my .cpp source file before calling the kernels (I also updated the argument indices before enqueueing). But when I do so, it doesn't work and I get error code -47 for the first kernel function, which I never changed. Why?

This should be fine. You will need to set the kernel argument to the appropriate size and NULL pointer as you did with the first kernel, though.

-47 is CL_INVALID_KERNEL_DEFINITION, which is a bit strange. It might indicate that you are trying to use too much local memory for the device. Remember that kernels may use local memory on their own, so you need to explicitly query for how much local memory the kernel is using (CL_KERNEL_LOCAL_MEM_SIZE) and subtract that from the device maximum to determine how much you can allocate. If you are getting this at compilation you should check the error log for the compiler.
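As a rough sketch of the two points above, the helpers below map a few status codes to their names and compute how much local memory is left for __local arguments. The function names cl_error_name and local_mem_budget are made up for this illustration and are not part of OpenCL. The sketch deliberately avoids the OpenCL headers, so the two sizes are plain parameters; in real host code they would presumably come from clGetDeviceInfo with CL_DEVICE_LOCAL_MEM_SIZE and clGetKernelWorkGroupInfo with CL_KERNEL_LOCAL_MEM_SIZE.

```c
/* Names for a few OpenCL status codes, using the numeric values from cl.h.
   Only a handful are listed; anything else falls through to the default.
   cl_error_name is a hypothetical helper, not an OpenCL function. */
const char *cl_error_name(int err)
{
    switch (err) {
    case 0:   return "CL_SUCCESS";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
    case -46: return "CL_INVALID_KERNEL_NAME";
    case -47: return "CL_INVALID_KERNEL_DEFINITION";
    case -48: return "CL_INVALID_KERNEL";
    case -49: return "CL_INVALID_ARG_INDEX";
    case -50: return "CL_INVALID_ARG_VALUE";
    case -51: return "CL_INVALID_ARG_SIZE";
    default:  return "unknown OpenCL error";
    }
}

/* Bytes of local memory still available for __local kernel arguments.
   device_max   : the device limit (CL_DEVICE_LOCAL_MEM_SIZE)
   kernel_usage : what the kernel already uses (CL_KERNEL_LOCAL_MEM_SIZE)
   Both are passed in as plain numbers here (an assumption of this sketch).
   Returns 0 when the kernel already uses the whole limit or more. */
unsigned long long local_mem_budget(unsigned long long device_max,
                                    unsigned long long kernel_usage)
{
    if (kernel_usage >= device_max)
        return 0ULL;
    return device_max - kernel_usage;
}
```

For example, on a hypothetical device with 16384 bytes of local memory and a kernel that already uses 1024 bytes, local_mem_budget(16384, 1024) gives the room left for buffers passed through clSetKernelArg.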

You can also use the context notify function (the callback you pass to clCreateContext) to get back details about the error from OpenCL. On Snow Leopard, Apple provides a few ready-made ones in the header files, or you can set CL_LOG_ERRORS=stdout in the environment when you run. This will give you much more detailed information on what is going wrong. I'm not sure what is supported on other platforms.
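A notify callback could look roughly like the sketch below. The parameter list matches the pfn_notify argument of clCreateContext, but the ErrorLog struct and the name log_cl_error are invented for this example, and the OpenCL headers are left out so it compiles on its own. In real host code the function would be declared with the CL_CALLBACK macro where the headers provide it, and passed to clCreateContext together with a pointer to the ErrorLog as user_data.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical state handed to the callback through user_data. */
typedef struct {
    FILE *out;        /* where to print messages, or NULL to stay silent */
    unsigned count;   /* number of errors reported so far */
    char last[256];   /* copy of the most recent message (may be truncated) */
} ErrorLog;

/* Same parameter list as the pfn_notify callback of clCreateContext.
   errinfo is a human-readable message; private_info and cb describe an
   implementation-specific binary blob that this sketch ignores.
   Note: an implementation may call this asynchronously, so a real
   callback would need to be thread-safe; this one is not. */
void log_cl_error(const char *errinfo, const void *private_info,
                  size_t cb, void *user_data)
{
    ErrorLog *log = (ErrorLog *)user_data;
    (void)private_info;
    (void)cb;

    if (log == NULL || errinfo == NULL)
        return;

    log->count++;
    snprintf(log->last, sizeof log->last, "%s", errinfo);
    if (log->out != NULL)
        fprintf(log->out, "OpenCL error: %s\n", errinfo);
}
```

Keeping the state in user_data instead of a global means each context can get its own log.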

Note that what you cannot do is share local memory between kernels. The contents of local memory are undefined when a kernel starts, so you'll end up with garbage.

Thank you for your answers and information. I still cannot figure it out; the problem is too strange to me. Let me restate it.

I have two kernels (independent of each other) which are empty and do not do anything. They look like this:


__kernel void mutualInformation(   
    __global float  *d_Dst,
    __global float  *d_Entropies1,
    __global float  *d_Entropies2, 
    __global float  *d_EntropiesJoint,    
    __local float   *ldata,
    unsigned int histoSize,
    unsigned int normalizeMI

){   
  // an empty kernel...
}


__kernel void mutualInformation2(
    __global float  *d_Src,
    __global float  *d_Entropies1,
    __global float  *d_Entropies2,
    __global float  *d_EntropiesJoint,           
    unsigned int histoSize,
    float densitySize
){

//another empty kernel

}

The code above works fine and I don't get any errors. In particular, I don't get any errors after the following statements:


    shrLog(LOGBOTH, 0.0, "Building program...\n");
    errcode = clBuildProgram(cpProgram, 0, NULL, NULL, NULL, NULL);
    shrCheckError(errcode, CL_SUCCESS);

    shrLog(LOGBOTH, 0.0, "Creating kernels...\n");
    ckMutualInformation = clCreateKernel(cpProgram, "mutualInformation", &errcode);
    shrCheckError(errcode, CL_SUCCESS);
    ckMutualInformation2 = clCreateKernel(cpProgram, "mutualInformation2", &errcode);
    shrCheckError(errcode, CL_SUCCESS);

Here everything is fine; in the debugger, errcode is 0 after every call.

Now I add one more parameter at the end of the second kernel.


__kernel void mutualInformation(   
    __global float  *d_Dst,
    __global float  *d_Entropies1,
    __global float  *d_Entropies2, 
    __global float  *d_EntropiesJoint,    
    __local float   *ldata,
    unsigned int histoSize,
    unsigned int normalizeMI

){   
  // an empty kernel...
}


__kernel void mutualInformation2(
    __global float  *d_Src,
    __global float  *d_Entropies1,
    __global float  *d_Entropies2,
    __global float  *d_EntropiesJoint,           
    unsigned int histoSize,
    float densitySize,    
    __local float   *ldata
){

//another empty kernel

}

Here I also set the seventh kernel argument of the second kernel (index 6, since argument indices are zero-based) with an appropriate local memory size of 1 KB:


clSetKernelArg(ckMutualInformation2, 6, 256 * sizeof(cl_float), NULL);

But now, if I do so, I get an error code for the FIRST kernel, which I didn't touch.


    shrLog(LOGBOTH, 0.0, "Building program...\n");
    errcode = clBuildProgram(cpProgram, 0, NULL, NULL, NULL, NULL);
    shrCheckError(errcode, CL_SUCCESS);

    shrLog(LOGBOTH, 0.0, "Creating kernels...\n");
    ckMutualInformation = clCreateKernel(cpProgram, "mutualInformation", &errcode);
    shrCheckError(errcode, CL_SUCCESS);    // (DEBUG POINT)
    ckMutualInformation2 = clCreateKernel(cpProgram, "mutualInformation2", &errcode);
    shrCheckError(errcode, CL_SUCCESS);

Now, in the debugger, I see that errcode is -47 at (DEBUG POINT), which is right after clCreateKernel for the first kernel. But I didn't change anything in the first kernel. I really cannot understand why I get this error for the first kernel, which is still the same, and not for the second kernel. And if I change the order of the calls like this:


    shrLog(LOGBOTH, 0.0, "Building program...\n");
    errcode = clBuildProgram(cpProgram, 0, NULL, NULL, NULL, NULL);
    shrCheckError(errcode, CL_SUCCESS);

    shrLog(LOGBOTH, 0.0, "Creating kernels...\n");
    ckMutualInformation2 = clCreateKernel(cpProgram, "mutualInformation2", &errcode);
    shrCheckError(errcode, CL_SUCCESS);
    ckMutualInformation = clCreateKernel(cpProgram, "mutualInformation", &errcode);
    shrCheckError(errcode, CL_SUCCESS);    // (DEBUG POINT)


Then I still don't get error code -47 for the second kernel (I get 0), but again -47 for the first kernel only.

This problem is very strange to me and I still cannot figure out why.

OK, PROBLEM IS SOLVED.

The kernel names are mutualInformation and mutualInformation2, and OpenCL probably has some problem with the naming. Simply renaming the second kernel from mutualInformation2 to inkero solved the problem.

Maybe someone knows more about this and can explain why?

That sounds like a bug in the OpenCL implementation. Whose OpenCL are you using? I would suggest filing a bug against the vendor.

Had this exact same problem, thanks for the tip :)
Using the NVIDIA CUDA SDK 3.0 on Kubuntu 9.10 x64.