02-03-2011, 06:07 PM
Hi. I have a kernel that works fine when run on the CPU.

However, when I run the kernel on a GPU, my system profiler shows system memory (RAM) leaking at about 1 MB/sec. The leak persists even after I modify the kernel to return immediately.

Any of you folks have any idea what might be going on? My kernel has many arguments, but I checked and my hardware allows up to 9 __constant arguments. I've included some specifics at the end of this post. Cheers. Dave.

This looks like a lot of args, but the total memory required for all __constant and __local arguments is probably around 1KB.

// Execute one complete step of hier-leap
// This kernel should run on any number of threads: 1 to MAX_LOCAL_THREADS
kernel void hierLeap(global int *globalState, int stateSize,
    global float *time,
    constant int *cellStartIdx,
    constant float *rates,
    constant int2 *inpSpec,
    constant int2 *outSpec,
    constant int *highDeltaS, constant int *lowDeltaS,
    constant struct DeltaX *scDeltaHighCell, int nScdHighCell,
    constant struct DeltaX *scDeltaLowCell, int nScdLowCell,
    global unsigned int *randomInts, int randIdx,
    int leap,
    local int *highBoundS, local int *highBoundCellS, local int *state, local int *lowBoundCellS, local int *lowBoundS,
    local float *DboundHigh, local float *DboundCellHigh, local float *D, local float *DboundCellLow, local float *DboundLow,
    local int *rxnEvents, local int *orderedRxnEvents, local float *taus)

The host code below is called many times per second; omitting this call makes the leak go away.

size_t localSize = 1; // different values still result in memory leak
// global work size and local work size are both localSize: a single work-group
err |= clEnqueueNDRangeKernel(_cmd_queue, _hierLeap, 1, NULL,
                              &localSize, &localSize, 0, NULL, NULL);
assert(err == CL_SUCCESS);
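
For anyone who wants to confirm the leak from inside the process rather than with the system profiler, here is a minimal sketch that samples peak resident memory around a repeated call. The names peakRss, rssGrowth, noopStep and leakyStep are made up for illustration and are not from the original program; leakyStep just leaks on purpose to show what growth looks like. Note that ru_maxrss is reported in bytes on OS X but in kilobytes on Linux.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Peak resident set size of this process so far.
 * Units are platform-dependent: bytes on OS X, kilobytes on Linux. */
long peakRss(void)
{
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;
    return ru.ru_maxrss;
}

/* Run step() `iterations` times and return how much peak RSS grew.
 * Peak RSS never decreases, so the result is always >= 0. */
long rssGrowth(void (*step)(void), int iterations)
{
    long before = peakRss();
    for (int i = 0; i < iterations; ++i)
        step();
    return peakRss() - before;
}

/* Hypothetical stand-in: a step that allocates nothing. */
void noopStep(void)
{
}

/* Volatile sink so the compiler cannot optimize the leak away. */
static void *volatile leakSink;

/* Hypothetical stand-in: deliberately leaks 1 MB per call. */
void leakyStep(void)
{
    size_t bytes = 1024 * 1024;
    void *p = malloc(bytes);
    if (p != NULL) {
        memset(p, 1, bytes); /* touch the pages so they become resident */
        leakSink = p;
    }
}
```

In the real program, the step function would presumably wrap the clEnqueueNDRangeKernel call above followed by clFinish on the same queue, so that each iteration has actually completed before memory is sampled. Steady growth across a few thousand iterations, compared against a no-op step, would point at the enqueue path.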

MacBook Pro, OS X 10.6, with NVIDIA GeForce GT 330M.
A few key values queried from GPU: MaxParamSize(4352), MaxConstArgs(9), LocalMemSize(16384).
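
As a sanity check on the argument counts, here is a small sketch that tallies the hierLeap arguments by address space and compares the __constant count to the MaxConstArgs value above. The classification is read straight off the kernel signature in this post; the enum and the helper names (ArgSpace, countArgs, withinConstArgLimit) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Address-space class of a kernel argument (illustrative names). */
enum ArgSpace { ARG_GLOBAL, ARG_CONSTANT, ARG_LOCAL, ARG_SCALAR };

/* The arguments of hierLeap, in declaration order. */
static const enum ArgSpace hierLeapArgs[] = {
    ARG_GLOBAL, ARG_SCALAR,      /* globalState, stateSize */
    ARG_GLOBAL,                  /* time */
    ARG_CONSTANT, ARG_CONSTANT,  /* cellStartIdx, rates */
    ARG_CONSTANT, ARG_CONSTANT,  /* inpSpec, outSpec */
    ARG_CONSTANT, ARG_CONSTANT,  /* highDeltaS, lowDeltaS */
    ARG_CONSTANT, ARG_SCALAR,    /* scDeltaHighCell, nScdHighCell */
    ARG_CONSTANT, ARG_SCALAR,    /* scDeltaLowCell, nScdLowCell */
    ARG_GLOBAL, ARG_SCALAR,      /* randomInts, randIdx */
    ARG_SCALAR,                  /* leap */
    /* highBoundS, highBoundCellS, state, lowBoundCellS, lowBoundS */
    ARG_LOCAL, ARG_LOCAL, ARG_LOCAL, ARG_LOCAL, ARG_LOCAL,
    /* DboundHigh, DboundCellHigh, D, DboundCellLow, DboundLow */
    ARG_LOCAL, ARG_LOCAL, ARG_LOCAL, ARG_LOCAL, ARG_LOCAL,
    /* rxnEvents, orderedRxnEvents, taus */
    ARG_LOCAL, ARG_LOCAL, ARG_LOCAL
};

#define HIER_LEAP_NARGS (sizeof hierLeapArgs / sizeof hierLeapArgs[0])

/* Count how many of the n arguments live in the given address space. */
size_t countArgs(const enum ArgSpace *args, size_t n, enum ArgSpace space)
{
    size_t count = 0;
    assert(args != NULL);
    for (size_t i = 0; i < n; ++i)
        if (args[i] == space)
            ++count;
    return count;
}

/* Nonzero if the number of __constant arguments is within the device limit. */
int withinConstArgLimit(const enum ArgSpace *args, size_t n, size_t maxConstArgs)
{
    return countArgs(args, n, ARG_CONSTANT) <= maxConstArgs;
}
```

Calling withinConstArgLimit(hierLeapArgs, HIER_LEAP_NARGS, 9) with the reported MaxConstArgs of 9 passes, since the signature has 8 __constant arguments. That is consistent with the leak not being a simple limit violation.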

02-07-2011, 10:34 AM
Thanks to everyone who has looked into this.

I reduced the kernel to take only one __constant argument, and the memory leak persists. This thread suggests that there is a bug in the drivers:

http://discussions.apple.com/thread.jspa?messageID=11451666

I guess for now I'll use __global memory in place of __constant.