Host memory allocation with CL_MEM_ALLOC_HOST_PTR

I have a question about how cl::Buffer works when created with the CL_MEM_ALLOC_HOST_PTR flag. I create it using a NULL host pointer:


cl::Buffer myBuffer(myContext, CL_MEM_ALLOC_HOST_PTR, n * sizeof(double));

Then inside a loop I map it to a host memory area:


double *hostPtr = NULL;
hostPtr = static_cast<double *>(myQueue.enqueueMapBuffer(myBuffer, CL_FALSE, CL_MAP_WRITE, 0, n * sizeof(double)));

This should allocate pinned host memory, map device memory to pinned host memory and return the host pointer. I fill the host memory with the data to be crunched and then send it to the device with:


myQueue.enqueueUnmapMemObject(myBuffer, hostPtr);

At this point I enqueue my kernel for execution and retrieve processed data with:


hostPtr = static_cast<double *>(myQueue.enqueueMapBuffer(myBuffer, CL_TRUE, CL_MAP_READ, 0, n * sizeof(double)));
// Do some stuff
myQueue.enqueueUnmapMemObject(myBuffer, hostPtr);

and then iterate the loop. I fear that calling enqueueMapBuffer with a NULL host pointer as argument may trigger a host memory allocation every time, so that in my loop hostPtr is allocated twice per iteration. Is this true? If so, is it connected to calling enqueueUnmapMemObject after enqueueMapBuffer? I think that for this continuous data transfer between host and device it could be sufficient to simply call enqueueMapBuffer, alternating CL_MAP_WRITE and CL_MAP_READ properly, and then call enqueueUnmapMemObject only after exiting the loop, but I may be misunderstanding something.
Thanks.

The constructor of Buffer allocates memory. enqueueMapBuffer doesn’t.

enqueueMapBuffer maps the buffer object into host memory. Calling it several times will not allocate several buffers.

Furthermore, you cannot use a buffer as an argument of a kernel while it is mapped (except if the buffer is mapped for reading only and the kernel only reads from the buffer). Cf. §5.2.4 of the OpenCL specification:

"If a memory object is currently mapped for writing, the application must ensure that the memory object is unmapped before any enqueued kernels or commands that read from or write to this memory object or any of its associated memory objects (sub-buffer or 1D image buffer objects) or its parent object (if the memory object is a sub-buffer or 1D image buffer object) begin execution; otherwise the behavior is undefined.

If a memory object is currently mapped for reading, the application must ensure that the memory object is unmapped before any enqueued kernels or commands that write to this memory object or any of its associated memory objects (sub-buffer or 1D image buffer objects) or its parent object (if the memory object is a sub-buffer or 1D image buffer object) begin execution; otherwise the behavior is undefined."

OK, it’s much clearer now, thanks. What I found a bit misleading is that the pointer to the host memory is returned by enqueueMapBuffer and not by the constructor of cl::Buffer, so one would guess that it’s enqueueMapBuffer that does the actual host memory allocation.

If the buffer is allocated in device memory, there is no such thing as a pointer to host memory.

Even when the buffer is allocated in host memory, if it resides in pinned memory, a call to enqueueMapBuffer will be necessary to create a temporary mapping between physical and virtual memory addresses.