clCreateSubBuffer and clReleaseMemObject

Hi, I think the answer to my question is obvious, but I can’t find it stated explicitly in the OpenCL specification.

Imagine I create a subbuffer:


cl_buffer_region posRegion;
posRegion.origin = 0;
posRegion.size = numGridCells*sizeof(cl_int);
cl_mem testDevice = clCreateSubBuffer(positioningDevice, CL_MEM_READ_WRITE, CL_BUFFER_CREATE_TYPE_REGION, &posRegion, &errcode_ret);

Then testDevice refers to the first numGridCells ints of positioningDevice.

Now:


cl_int ret = clReleaseMemObject(positioningDevice);

Does testDevice still contain the first numGridCells ints of positioningDevice? I think the obvious answer is no, because otherwise it would mean that clCreateSubBuffer makes a copy of positioningDevice, which makes no sense for a function called clCreateSubBuffer.

Anyway, I just want to be sure that I’m right.

Another question that I think is obvious: is clCreateSubBuffer faster than creating a new buffer and then calling clEnqueueCopyBuffer?

Thanks

Does testDevice still contain the first numGridCells ints of positioningDevice?

According to the spec, the memory associated with a parent buffer object (here positioningDevice) is not deleted until all of its sub-buffers have been deleted as well.

So, to answer your question: the sub-buffers (e.g. testDevice) will continue to work normally even after the parent object (positioningDevice in this case) is released.

I think the obvious answer is no, because otherwise it would mean that clCreateSubBuffer makes a copy of positioningDevice, which makes no sense for a function called clCreateSubBuffer.

Parent buffers and their sub-buffers share the same memory. The fact that the sub-buffers remain valid after the parent buffer has been released doesn’t mean that a copy was made. It’s simply that deleting the memory associated with the parent object is deferred until all of its sub-buffers have been released.
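If it helps to picture the deferred deletion, here is a toy reference-counting model in plain C. It is only a sketch of the observable behaviour described above, not how any OpenCL runtime is actually implemented, and every name in it (ToyBuffer, toy_create_buffer, toy_create_sub_buffer, toy_release, toy_live_storage) is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: all names are hypothetical and this is NOT OpenCL code. */
typedef struct ToyBuffer {
    int refcount;              /* references held by the app and by sub-buffers */
    struct ToyBuffer *parent;  /* NULL for a top-level buffer */
    unsigned char *storage;    /* points into the top-level buffer's allocation */
    size_t size;               /* size of this buffer or region in bytes */
} ToyBuffer;

/* Number of backing allocations currently alive (for observation only). */
static int toy_live_storage = 0;

ToyBuffer *toy_create_buffer(size_t size) {
    assert(size > 0);
    ToyBuffer *b = malloc(sizeof *b);
    assert(b != NULL);
    b->refcount = 1;
    b->parent = NULL;
    b->storage = malloc(size);
    assert(b->storage != NULL);
    b->size = size;
    toy_live_storage++;
    return b;
}

ToyBuffer *toy_create_sub_buffer(ToyBuffer *parent, size_t origin, size_t size) {
    assert(parent->parent == NULL);         /* only top-level buffers can be parents */
    assert(origin + size <= parent->size);  /* region must lie inside the parent */
    ToyBuffer *s = malloc(sizeof *s);
    assert(s != NULL);
    s->refcount = 1;
    s->parent = parent;
    parent->refcount++;                     /* the sub-buffer keeps the parent alive */
    s->storage = parent->storage + origin;  /* shared memory, no copy */
    s->size = size;
    return s;
}

void toy_release(ToyBuffer *b) {
    assert(b->refcount > 0);
    if (--b->refcount > 0)
        return;                             /* someone still holds a reference */
    ToyBuffer *parent = b->parent;
    if (parent == NULL) {
        free(b->storage);                   /* last reference gone: free the memory */
        toy_live_storage--;
    }
    free(b);
    if (parent != NULL)
        toy_release(parent);                /* drop the sub-buffer's hold on the parent */
}
```

In this model, releasing the parent while two sub-buffers exist only drops one reference; the backing allocation is freed when the last sub-buffer is released, which is the same order of events the spec describes for clReleaseMemObject.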

Does it make sense now?

Is clCreateSubBuffer faster than creating a new buffer and then calling clEnqueueCopyBuffer?

clCreateSubBuffer() doesn’t copy or allocate any new memory, so yes, it is faster than allocating new memory and doing a copy.

That makes a lot of sense :)

So if I understand correctly:


cl_buffer_region posRegion;
posRegion.origin = 0;
posRegion.size = numGridCells*sizeof(cl_int);
cl_mem testDevice1 = clCreateSubBuffer(positioningDevice, CL_MEM_READ_WRITE, CL_BUFFER_CREATE_TYPE_REGION, &posRegion, &errcode_ret);
posRegion.origin = numGridCells*sizeof(cl_int);
cl_mem testDevice2 = clCreateSubBuffer(positioningDevice, CL_MEM_READ_WRITE, CL_BUFFER_CREATE_TYPE_REGION, &posRegion, &errcode_ret);

clReleaseMemObject(positioningDevice);

// testDevice1 and testDevice2 are still usable; the memory behind positioningDevice has not been deleted yet
clReleaseMemObject(testDevice1);
// testDevice2 is still usable; testDevice1 has been released and deleted, but the memory behind positioningDevice has not been deleted yet
clReleaseMemObject(testDevice2);
// testDevice1, testDevice2 and positioningDevice have all been released and deleted

Thank you for your help.

Yes, that’s how I understand the spec as well.
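One thing worth checking in the two-sub-buffer snippet: the origin of a sub-buffer has to be aligned to the device's CL_DEVICE_MEM_BASE_ADDR_ALIGN value (which the spec reports in bits), otherwise clCreateSubBuffer fails with CL_MISALIGNED_SUB_BUFFER_OFFSET. An origin of numGridCells*sizeof(cl_int) may or may not satisfy that, depending on numGridCells and the device. A minimal sketch of rounding an origin up to the required alignment could look like this; the helper name align_sub_buffer_origin is hypothetical, and the alignment value is assumed to come from clGetDeviceInfo.

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` (in bytes) up to the next multiple of the device's base
 * address alignment. `align_bits` is assumed to be the value queried with
 * clGetDeviceInfo(..., CL_DEVICE_MEM_BASE_ADDR_ALIGN, ...), given in bits.
 * The helper name is hypothetical, not part of OpenCL. */
size_t align_sub_buffer_origin(size_t offset, unsigned int align_bits) {
    assert(align_bits >= 8 && align_bits % 8 == 0);  /* whole number of bytes */
    size_t align_bytes = align_bits / 8;
    return (offset + align_bytes - 1) / align_bytes * align_bytes;
}
```

With an alignment of 1024 bits (128 bytes), for example, an origin of 100 bytes would be rounded up to 128. Keep in mind that rounding the origin up leaves a gap after the first region, so the parent buffer has to be large enough to hold both regions plus the padding.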