Hi,
I have a buffer (float*) that represents an image of let’s say 120x120 pixels.
I create on the device a buffer that represents an image of 100x100.
What I want to do is to take the center of the first image (host) to fill the device one.
clEnqueueWriteBufferRect seems to be the perfect solution…
Let’s have a look at the documentation of clEnqueueWriteBufferRect:
cl_int clEnqueueWriteBufferRect(
cl_command_queue command_queue,
cl_mem buffer,
cl_bool blocking_write,
const size_t buffer_origin[3],
const size_t host_origin[3],
const size_t region[3],
size_t buffer_row_pitch,
size_t buffer_slice_pitch,
size_t host_row_pitch,
size_t host_slice_pitch,
void *ptr,
cl_uint num_events_in_wait_list,
const cl_event *event_wait_list,
cl_event *event)
I have no comments about command_queue, buffer, blocking_write, buffer_row_pitch, buffer_slice_pitch, host_row_pitch, host_slice_pitch, ptr, num_events_in_wait_list, event_wait_list, and event. Now, because the device buffer will be entirely filled, we must have:
size_t buffer_origin[3] = {0, 0, 0};
Only two parameters remain: host_origin and region. Here is what the documentation says about them:
host_origin : The (x, y, z) offset in the memory region pointed to by ptr. For a 2D rectangle region, the z value given by host_origin[2] should be 0. The offset in bytes is computed as host_origin[2] * host_slice_pitch + host_origin[1] * host_row_pitch + host_origin[0].
region : The (width, height, depth) in bytes of the 2D or 3D rectangle being read or written. For a 2D rectangle copy, the depth value given by region[2] should be 1.
So, in my case I should use :
size_t input_offset[3] = {10, 10, 0};
size_t region[3] = {100*sizeof(float), 100*sizeof(float), sizeof(float)};
Of course, it doesn’t work. Let’s focus on the input_offset parameter.
The documentation says that “the offset in bytes is computed as host_origin[2] * host_slice_pitch + host_origin[1] * host_row_pitch + host_origin[0]”. Since host_slice_pitch and host_row_pitch are given in bytes, host_origin[2] and host_origin[1] must be plain counts (of slices and rows) and host_origin[0] must be in bytes. No? Otherwise the offset is wrong, or host_slice_pitch and host_row_pitch must not be given in bytes. Why are the parameters not consistent?
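To make the units argument concrete, here is a minimal sketch in plain C (no OpenCL involved) that evaluates the quoted formula for the 120x120 float image. The function names rect_offset_bytes and pixel_offset_bytes are hypothetical helpers made up for this illustration, and it assumes host_row_pitch is 120*sizeof(float).

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset computed exactly as the quoted formula states.
   origin[0] is added as-is, so it has to be expressed in bytes already,
   while origin[1] and origin[2] are multiplied by pitches given in bytes. */
size_t rect_offset_bytes(const size_t origin[3], size_t row_pitch, size_t slice_pitch)
{
    return origin[2] * slice_pitch + origin[1] * row_pitch + origin[0];
}

/* Hypothetical helper: byte offset of pixel (x, y) in a tightly packed
   2D float image that is `width` pixels wide. */
size_t pixel_offset_bytes(size_t x, size_t y, size_t width)
{
    assert(x < width);
    return (y * width + x) * sizeof(float);
}
```

With a row pitch of 120*sizeof(float), the origin {10*sizeof(float), 10, 0} gives the same offset as pixel_offset_bytes(10, 10, 120), whereas {10, 10, 0} gives an offset that is not even a multiple of sizeof(float), so it would point into the middle of a pixel.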
Now, the region parameter. I agree that region[0] must be in bytes so that we know how many bytes to copy per row.
However, region[1] and region[2] must be given as a “number of rows” and a “number of slices”; otherwise, how would we know how many lines to copy? In any case, if region[1] and region[2] are given in bytes, the program crashes. Again, why are the parameters not consistent?
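A quick arithmetic check of the two interpretations, as a plain C sketch. The helper rect_total_bytes is a hypothetical name, and it assumes region[0] is a width in bytes while region[1] and region[2] count rows and slices. Under that assumption it gives the number of bytes the transfer would touch.

```c
#include <assert.h>
#include <stddef.h>

/* Total bytes touched by a rectangle transfer, assuming region[0] is a
   width in bytes and region[1], region[2] are counts of rows and slices. */
size_t rect_total_bytes(const size_t region[3])
{
    assert(region[2] >= 1); /* a 2D rectangle still has a depth of 1 */
    return region[0] * region[1] * region[2];
}
```

For {100*sizeof(float), 100, 1} this is exactly the size of the 100x100 float device buffer. For {100*sizeof(float), 100*sizeof(float), sizeof(float)} it is 16 times larger (with a 4-byte float), well beyond both the device buffer and the 120x120 host image, which would plausibly explain the crash.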
Following these remarks, I use:
size_t input_offset[3] = {10*sizeof(float), 10, 0};
size_t region[3] = {100*sizeof(float), 100, 1};
and it works perfectly.
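For anyone who wants to check the semantics without a device, here is a host-only emulation of the rectangle copy, written under the interpretation above (x in bytes, y in rows, z in slices). It is only a sketch: copy_rect and crop_center are hypothetical names, nothing here calls OpenCL, and the pitches (120*sizeof(float) for the host image, 100*sizeof(float) for the destination) are my assumption about what one would pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Host-only emulation of a rectangle write: copies region[1] rows of
   region[0] bytes each, for region[2] slices. Not an OpenCL call. */
void copy_rect(unsigned char *dst, const unsigned char *src,
               const size_t dst_origin[3], const size_t src_origin[3],
               const size_t region[3],
               size_t dst_row_pitch, size_t dst_slice_pitch,
               size_t src_row_pitch, size_t src_slice_pitch)
{
    assert(dst_row_pitch >= region[0] && src_row_pitch >= region[0]);
    for (size_t z = 0; z < region[2]; ++z) {
        for (size_t y = 0; y < region[1]; ++y) {
            /* Same offset formula as in the documentation, applied per row. */
            size_t d = (dst_origin[2] + z) * dst_slice_pitch
                     + (dst_origin[1] + y) * dst_row_pitch + dst_origin[0];
            size_t s = (src_origin[2] + z) * src_slice_pitch
                     + (src_origin[1] + y) * src_row_pitch + src_origin[0];
            memcpy(dst + d, src + s, region[0]);
        }
    }
}

/* Copies the central 100x100 pixels of a 120x120 float image into dst. */
void crop_center(float *dst, const float *src)
{
    const size_t dst_origin[3] = {0, 0, 0};
    const size_t src_origin[3] = {10 * sizeof(float), 10, 0};
    const size_t region[3] = {100 * sizeof(float), 100, 1};
    const size_t src_row_pitch = 120 * sizeof(float);
    const size_t dst_row_pitch = 100 * sizeof(float);
    copy_rect((unsigned char *)dst, (const unsigned char *)src,
              dst_origin, src_origin, region,
              dst_row_pitch, 100 * dst_row_pitch,
              src_row_pitch, 120 * src_row_pitch);
}
```

After crop_center(dst, src), dst[0] holds the source pixel at row 10, column 10, and the last element of dst holds the source pixel at row 109, column 109, which is the center crop I was after.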
So my question is: what am I doing wrong? And if I’m not doing anything wrong, don’t you think the documentation is wrong?
Thanks