I don’t understand. You allocate two uninitialized buffers, “pBuf” and “pBuf2”. Then you copy the uninitialized contents of “pBuf” into the memory object “mem”. Finally, you copy the contents of “mem”, which now contains the same uninitialized data as “pBuf”, into “pBuf2”.
At this point the contents of “pBuf”, “mem” and “pBuf2” are identical, but since you never initialized “pBuf” in the first place, the data is garbage instead of zeroes. Is that what you are seeing?
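To illustrate the point, here is a minimal host-only sketch in C. It is not the original poster's code: the two memcpy calls only stand in for clEnqueueWriteBuffer and clEnqueueReadBuffer, "mem" is modeled as a plain host array, and the byte element type and the function name round_trip_is_zero are assumptions made for the example. The only real change is that "pBuf" is zero-filled (via calloc) before anything is copied out of it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Host-only model of the write/read round trip.
 * Returns 1 if pBuf2 ends up all zero, 0 if not, -1 if allocation fails.
 * The names pBuf, mem and pBuf2 follow the thread; the byte element type
 * and the memcpy stand-ins are assumptions. */
int round_trip_is_zero(size_t nCount)
{
    assert(nCount > 0);

    /* calloc zero-fills, so the source is initialized before any copy. */
    unsigned char *pBuf  = calloc(nCount, 1);
    unsigned char *mem   = malloc(nCount);  /* stand-in for the cl_mem object */
    unsigned char *pBuf2 = malloc(nCount);  /* destination, left uninitialized */
    if (!pBuf || !mem || !pBuf2) {
        free(pBuf); free(mem); free(pBuf2);
        return -1;
    }

    memcpy(mem, pBuf, nCount);   /* stands in for clEnqueueWriteBuffer */
    memcpy(pBuf2, mem, nCount);  /* stands in for clEnqueueReadBuffer  */

    /* Every byte of pBuf2 should now match the zeroed source. */
    int all_zero = 1;
    for (size_t i = 0; i < nCount; ++i) {
        if (pBuf2[i] != 0) { all_zero = 0; break; }
    }

    free(pBuf); free(mem); free(pBuf2);
    return all_zero;
}
```

With the source zero-filled, calling round_trip_is_zero(1024) should return 1. If "pBuf" came from a plain malloc instead, the result would depend on whatever happened to be in that memory.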
Thank you both!
The code I posted does work: after running it, the memory of pBuf2 is indeed zero. It failed on my computer before because I had set nCount=100000000.
I think that is too much memory for the GPU, so the problem occurred.
But I have another question: does OpenCL provide any technique to address the bottleneck of memory copies between the GPU and the host?
Thanks for your answer, and sorry for my English.
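On the nCount=100000000 case: one way to catch this early is to compare the requested size against the device's per-allocation limit, which OpenCL reports through clGetDeviceInfo with CL_DEVICE_MAX_MEM_ALLOC_SIZE. The sketch below shows only the arithmetic, in plain C. The helper name fits_in_one_buffer is hypothetical, and the element size and limit passed to it are assumptions, since the thread does not state either one.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if nCount elements of elemSize bytes fit within maxAllocBytes,
 * 0 otherwise (including when the byte count would overflow 64 bits).
 * maxAllocBytes is assumed to come from CL_DEVICE_MAX_MEM_ALLOC_SIZE. */
int fits_in_one_buffer(uint64_t nCount, uint64_t elemSize, uint64_t maxAllocBytes)
{
    assert(elemSize > 0);

    /* Guard the multiplication against overflow before doing it. */
    if (nCount > UINT64_MAX / elemSize)
        return 0;

    return nCount * elemSize <= maxAllocBytes;
}
```

For example, assuming 4-byte elements and a hypothetical 256 MiB limit, fits_in_one_buffer(100000000, 4, 256ULL * 1024 * 1024) returns 0, because 400,000,000 bytes exceeds 268,435,456. It is also worth checking the error codes returned by clCreateBuffer and the clEnqueue* calls, since a failed allocation or transfer would otherwise go unnoticed.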
No, OpenCL doesn’t help you here. It’s your job to copy data between devices and host. All you can do is keep memory transfers to a minimum when you write your programs.