Hi. I have a kernel that works fine when run on the CPU.
However, when I run the kernel on the GPU, my system profiler shows about 1 MB/sec of system memory (RAM) leaking. This is still the case after I modify the kernel to return immediately.
Does anyone have an idea what might be going on? My kernel has many arguments, but I checked: my hardware allows up to 9 __constant arguments, and I'm within that limit. I've included some specifics at the end of this post. Cheers, Dave.
KERNEL FUNCTION
This looks like a lot of arguments, but the total memory required for all the __constant and __local buffers is only around 1 KB.
// Execute one complete step of hier-leap
// This kernel should run on any number of threads: 1 to MAX_LOCAL_THREADS
__kernel
void hierLeap(global int *globalState, int stateSize,
              global float *time,
              constant int *cellStartIdx,
              constant float *rates,
              constant int2 *inpSpec,
              constant int2 *outSpec,
              constant int *highDeltaS, constant int *lowDeltaS,
              constant struct DeltaX *scDeltaHighCell, int nScdHighCell,
              constant struct DeltaX *scDeltaLowCell, int nScdLowCell,
              global unsigned int *randomInts, int randIdx,
              int leap,
              local int *highBoundS, local int *highBoundCellS, local int *state,
              local int *lowBoundCellS, local int *lowBoundS,
              local float *DboundHigh, local float *DboundCellHigh, local float *D,
              local float *DboundCellLow, local float *DboundLow,
              local int *rxnEvents, local int *orderedRxnEvents, local float *taus)
{
return;
}
CPU SIDE CODE
Called many times per second; omitting this call makes the leak go away.
<...>
size_t localSize = 1; // other values still produce the memory leak
err |= clEnqueueNDRangeKernel(_cmd_queue, _hierLeap, 1, NULL,
&localSize, &localSize, 0, NULL, NULL);
assert(err == CL_SUCCESS);
clFinish(_cmd_queue);
<...>
HARDWARE
MacBook Pro running OS X 10.6 with an NVIDIA GeForce GT 330M.
A few key values queried from the GPU: CL_DEVICE_MAX_PARAMETER_SIZE = 4352, CL_DEVICE_MAX_CONSTANT_ARGS = 9, CL_DEVICE_LOCAL_MEM_SIZE = 16384.