Hey, so I’m reading data from a file using mmap() as follows:
unsigned char* mapped;
mapped = mmap(0,size,PROT_READ,MAP_PRIVATE,input,0);
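For reference, here is a minimal sketch of that mapping step with the error checks spelled out. The helper name map_file_readonly and its signature are made up for illustration (they are not from my program), and it assumes a POSIX system where the whole file is mapped from offset 0:

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only.
 * Returns NULL on failure; on success stores the file size in *size_out. */
unsigned char *map_file_readonly(const char *path, size_t *size_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED) /* mmap reports failure with MAP_FAILED, not NULL */
        return NULL;

    *size_out = (size_t)st.st_size;
    return (unsigned char *)p;
}
```

The returned pointer plays the role of mapped above and the stored size the role of size; the caller releases it with munmap() when done.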
Then I create a pinned host buffer and a device buffer:
cl_mem pinned_buffer_input = clCreateBuffer(context, CL_MEM_READ_ONLY | CL_MEM_ALLOC_HOST_PTR, size, mapped, NULL);
cl_mem buffer_input = clCreateBuffer(context, CL_MEM_READ_ONLY, input_size, NULL, NULL);
Inside a for loop I am:
[ul]
[li]mapping the pinned buffer:[/li]
void *pinnedMemory = clEnqueueMapBuffer(cmd_queue, pinned_buffer_input, CL_TRUE, CL_MAP_WRITE, header[3]+b*input_size, input_size_cur, 0, NULL, &ev, NULL);
[li]enqueuing the write of the mapped block to the device buffer:[/li]
clEnqueueWriteBuffer(cmd_queue, buffer_input, CL_FALSE, 0, input_size_cur, pinnedMemory, 0, NULL, &ev);
[li]unmapping the object:[/li]
clEnqueueUnmapMemObject(cmd_queue, pinned_buffer_input, pinnedMemory, 0, NULL, &ev);
[/ul]
Here mapped contains the whole file and is size bytes long. What I want is to send the data in blocks of input_size bytes (or input_size_cur, which is the same thing to simplify). So the offset is header[3]+b*input_size, where b is incremented on each loop iteration, but the wrong data gets copied.
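To make the block arithmetic explicit, here is a small sketch of how the offset and length of block b can be computed. The struct block_range and the function block_at are hypothetical names for illustration only. It assumes data_start stands for header[3], block_size for input_size, and the returned length for input_size_cur, with the last block allowed to be shorter than the others:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor for one block of the payload. */
struct block_range {
    size_t offset; /* byte offset from the start of the file */
    size_t length; /* bytes in this block; 0 if b is past the last block */
};

/* data_start plays the role of header[3], block_size that of input_size,
 * and the returned length that of input_size_cur. */
struct block_range block_at(size_t file_size, size_t data_start,
                            size_t block_size, size_t b)
{
    struct block_range r = {0, 0};
    assert(block_size > 0);

    if (data_start >= file_size)
        return r; /* no payload after the header */

    size_t payload = file_size - data_start;
    size_t nblocks = payload / block_size + (payload % block_size != 0);
    if (b >= nblocks)
        return r; /* index past the last block */

    size_t rel = b * block_size; /* offset inside the payload */
    r.offset = data_start + rel;
    r.length = (payload - rel < block_size) ? payload - rel : block_size;
    return r;
}
```

With this, offset + length never exceeds file_size, which matters because the offset and size passed to clEnqueueMapBuffer() have to stay within the bounds of the buffer object being mapped.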
If I don’t initialize pinned_buffer_input with mapped, I can get a pointer to the host buffer with clEnqueueMapBuffer() and copy the data from mapped into it:
memcpy(pinnedMemory, mapped+header[3]+b*input_size, input_size_cur);
That works, but I want to avoid the memcpy because it sits inside the for loop and adds huge delays to my program. To get around it I tried to use the offset parameter of clEnqueueMapBuffer(), but that gives wrong results.
With CL_MEM_COPY_HOST_PTR instead of CL_MEM_ALLOC_HOST_PTR the result is correct, but creating pinned_buffer_input takes ages.