I am trying to develop an image-processing algorithm that runs on the GPU.
So far I have been able to implement convolution on the GPU.
The algorithm now needs to call that convolution many times from inside another kernel.
For example:
__kernel void xyz(some parameters) // xyz must also be executed on the GPU
{
    convolution(some parameters); // this function is called many times
}
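// pInput, tempB, img, nInWidth and nFilterWidth are assumed to be among the parameters.
// The result is written to img, so the function is declared void rather than float*.
void convolution(some parameters)
{
    const int nWidth = get_global_size(0); // output width
    const int xOut = get_global_id(0);     // output column of this work-item
    const int yOut = get_global_id(1);     // output row of this work-item
    const int xInTopLeft = xOut;
    const int yInTopLeft = yOut;
    float temp1 = 0.0f;
    for (int r = 0; r < nFilterWidth; r++)
    {
        const int idxFtmp = r * nFilterWidth;               // start of filter row r
        const int yIn = yInTopLeft + r;
        const int idxIntmp = yIn * nInWidth + xInTopLeft;   // start of the matching input row
        for (int c = 0; c < nFilterWidth; c++)
        {
            const int idxF = idxFtmp + c;
            const int idxIn = idxIntmp + c;
            temp1 += tempB[idxF] * pInput[idxIn];
        }
    } // for r
    const int idxOut = yOut * nWidth + xOut;
    img[idxOut] = temp1;
}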
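To make the structure I am after more concrete, here is a minimal sketch in plain C99 (the subset OpenCL C is based on), so it can be compiled and checked on the CPU. The names convolve_at and xyz_work_item, the two filters, and the explicit x/y arguments are hypothetical and only for illustration. In a real kernel I would expect x and y to come from get_global_id and the pointers to carry address-space qualifiers such as __global or __constant; whether this carries over as-is is exactly what I am unsure about.

```c
/* Hypothetical helper: computes ONE output pixel at (x_out, y_out), using the
 * same indexing as the convolution code above, and returns the sum instead of
 * writing it to an output image. */
static float convolve_at(const float *input, int in_width,
                         const float *filter, int filter_width,
                         int x_out, int y_out)
{
    float sum = 0.0f;
    for (int r = 0; r < filter_width; r++) {
        const int idx_f_row = r * filter_width;                 /* start of filter row r */
        const int idx_in_row = (y_out + r) * in_width + x_out;  /* matching input row */
        for (int c = 0; c < filter_width; c++) {
            sum += filter[idx_f_row + c] * input[idx_in_row + c];
        }
    }
    return sum;
}

/* Hypothetical stand-in for the body of kernel xyz for a single work-item
 * (x, y): it calls the helper more than once and combines the results. */
static void xyz_work_item(const float *input, int in_width,
                          const float *filter_a, const float *filter_b,
                          int filter_width,
                          float *output, int out_width, int x, int y)
{
    const float a = convolve_at(input, in_width, filter_a, filter_width, x, y);
    const float b = convolve_at(input, in_width, filter_b, filter_width, x, y);
    output[y * out_width + x] = a + b;
}
```

The caller has to keep x + filter_width and y + filter_width within the input image, since the helper does no bounds checking (just like the code above).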
Is there a way to do this, i.e. to call a function such as convolution from inside a kernel? Or is there a standard approach for this kind of thing?
Any help would be appreciated.
Thanks in advance.
Regards