Hello,
Reaction-diffusion systems are a bit like cellular automata, except that each cell holds floating-point values and its new value is computed from a differential equation. I would have thought this was an ideal workload for OpenCL.
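For reference, these are the Gray-Scott equations as I understand them, written with the same symbols as the kernel below (u and v are the two concentrations; I'm assuming a grid spacing of 1, so the Laplacian is just the 5-point stencil):

```latex
\begin{aligned}
\frac{\partial u}{\partial t} &= D_u \nabla^2 u - u v^2 + F (1 - u) \\
\frac{\partial v}{\partial t} &= D_v \nabla^2 v + u v^2 - (F + k)\, v
\end{aligned}
```

Each iteration takes one explicit Euler step of size delta_t, i.e. new u = u + delta_t * (right-hand side), and likewise for v.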
Here’s an example of one of our OpenCL kernels:
__kernel void grayscott_compute(
    __global float *U, __global float *V,
    __global float *U2, __global float *V2,
    float k, float F, float D_u, float D_v, float delta_t)
{
    const int x = get_global_id(0);
    const int y = get_global_id(1);
    const int X = get_global_size(0);
    const int Y = get_global_size(1);
    const int i = x*Y + y;

    const float u = U[i];
    const float v = V[i];

    // compute the Laplacians of U and V
    const int xm1 = max(x-1, 0);
    const int xp1 = min(x+1, X-1);
    const int ym1 = max(y-1, 0);
    const int yp1 = min(y+1, Y-1);
    const int iLeft  = xm1*Y + y;
    const int iRight = xp1*Y + y;
    const int iUp    = x*Y + ym1;
    const int iDown  = x*Y + yp1;

    // Standard 5-point stencil
    const float nabla_u = U[iLeft] + U[iRight] + U[iUp] + U[iDown] - 4.0f*u;
    const float nabla_v = V[iLeft] + V[iRight] + V[iUp] + V[iDown] - 4.0f*v;

    // compute the new rate of change (Gray-Scott)
    const float delta_u = D_u * nabla_u - u*v*v + F*(1.0f-u);
    const float delta_v = D_v * nabla_v + u*v*v - (F+k)*v;

    // apply the change (to the new buffer)
    U2[i] = u + delta_t * delta_u;
    V2[i] = v + delta_t * delta_v;
}
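In case it helps anyone check the kernel, here is a minimal plain-C sketch of the same update on the CPU. It is not one of the implementations from our repository: the names grayscott_step_ref and grayscott_run_ref are made up for this post, and the pointer swap between steps is only my assumption about a reasonable way to alternate the two buffer pairs.

```c
#include <assert.h>

/* Hypothetical CPU reference for the kernel above: one explicit Euler step
   on an X-by-Y grid, with the same x*Y+y indexing, the same clamped edges
   and the same 5-point stencil. */
void grayscott_step_ref(const float *U, const float *V,
                        float *U2, float *V2, int X, int Y,
                        float k, float F, float D_u, float D_v,
                        float delta_t)
{
    assert(X > 0 && Y > 0);
    for (int x = 0; x < X; ++x) {
        for (int y = 0; y < Y; ++y) {
            const int i = x * Y + y;
            const float u = U[i];
            const float v = V[i];

            /* clamp neighbour coordinates at the grid edges */
            const int xm1 = x > 0 ? x - 1 : 0;
            const int xp1 = x < X - 1 ? x + 1 : X - 1;
            const int ym1 = y > 0 ? y - 1 : 0;
            const int yp1 = y < Y - 1 ? y + 1 : Y - 1;

            /* 5-point Laplacians */
            const float lap_u = U[xm1 * Y + y] + U[xp1 * Y + y]
                              + U[x * Y + ym1] + U[x * Y + yp1] - 4.0f * u;
            const float lap_v = V[xm1 * Y + y] + V[xp1 * Y + y]
                              + V[x * Y + ym1] + V[x * Y + yp1] - 4.0f * v;

            /* Gray-Scott rates of change, then the Euler update */
            const float delta_u = D_u * lap_u - u * v * v + F * (1.0f - u);
            const float delta_v = D_v * lap_v + u * v * v - (F + k) * v;
            U2[i] = u + delta_t * delta_u;
            V2[i] = v + delta_t * delta_v;
        }
    }
}

/* Runs n_steps steps, swapping the roles of (U,V) and (U2,V2) after each
   step instead of copying. Returns 0 if the latest state ends up in (U,V),
   1 if it ends up in (U2,V2). */
int grayscott_run_ref(float *U, float *V, float *U2, float *V2,
                      int X, int Y, float k, float F,
                      float D_u, float D_v, float delta_t, int n_steps)
{
    float *src_u = U, *src_v = V, *dst_u = U2, *dst_v = V2;
    for (int s = 0; s < n_steps; ++s) {
        grayscott_step_ref(src_u, src_v, dst_u, dst_v, X, Y,
                           k, F, D_u, D_v, delta_t);
        /* the buffers just written become the source for the next step */
        float *t = src_u; src_u = dst_u; dst_u = t;
        t = src_v; src_v = dst_v; dst_v = t;
    }
    return src_u == U ? 0 : 1;
}
```

The idea would be to run this on the same initial U and V as the kernel and compare the buffers element by element, allowing for small floating-point differences.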
Comparing our OpenCL implementations with CPU versions (e.g. using SSE), we’re finding that OpenCL doesn’t give us the speed-up we expected. Here’s a page comparing the speeds, with links to the different implementations:
http://code.google.com/p/reaction-diffusion/wiki/SpeedComparisons
You can see that we’ve been trying different things, including image2d_t and float4 (both of these help). We try to keep all the data on the card over many iterations. Managing local memory by hand made things slower (GrayScott_OpenCL_Local). Using an NDRange work-group size of local(8,8) helps a lot compared to local(1,1).
Is there something obvious we can do to improve our OpenCL code? Were my expectations that OpenCL would work well for reaction-diffusion wrong?
Thanks,
Tim