Buffer Strategy

I’m starting my second project using OpenCL and am curious about others’ opinions and experience with buffer sizing. Here is my question: for kernels that use small amounts of input data and as a result might require many kernel calls from the host, is it worth setting up a “managed buffer” system that packs the data into larger buffers and handles fragmentation, etc.? That might let me process a large number of items without a lot of buffer changes. Or, in general, is it better to just create a lot of smaller buffers as needed? I understand this is a very general question. Thanks.
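
For context, here is a rough sketch of the “many small buffers” pattern I’d be replacing: one cl_mem and one small kernel launch per item. The names (process_item, the argument index, the work size) are just placeholders and error checking is stripped out, so treat it as illustrative only.

```c
/* Placeholder sketch of the one-buffer-per-item pattern described above;
 * process_item and the kernel argument layout are made up, and error
 * checking is omitted. */
#include <CL/cl.h>

void process_item(cl_context ctx, cl_command_queue queue, cl_kernel kernel,
                  const void *item_data, size_t item_bytes)
{
    cl_int err;

    /* A fresh small buffer for every item, copied from host memory. */
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                                item_bytes, (void *)item_data, &err);

    /* Bind it and launch one small kernel call for this item. */
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &buf);
    size_t global = item_bytes;   /* e.g. one work-item per byte */
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global, NULL,
                           0, NULL, NULL);

    /* Released (and reallocated) for every single item. */
    clReleaseMemObject(buf);
}
```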

The trade-off is that if the composite buffers are too large, the implementation might have trouble placing them in device memory, leading to stalls while other data is shuffled back to the host. On the other hand, if there is a large number of small buffers, the overhead of managing them all (allocations, releases, and separate transfer calls) might be higher than necessary.
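
To make the packed alternative concrete, here is a minimal sketch of one way to do it with OpenCL 1.1 sub-buffers: allocate one large buffer, write all the item data in a single transfer, and carve out per-item views with clCreateSubBuffer. NUM_ITEMS and ITEM_BYTES are made-up sizes, error handling is minimal, and each sub-buffer origin has to respect the device’s CL_DEVICE_MEM_BASE_ADDR_ALIGN alignment, so take it as a starting point rather than a drop-in solution.

```c
/* Minimal sketch: pack NUM_ITEMS items into one large buffer and expose
 * per-item regions as sub-buffers. Sizes are made up for illustration. */
#include <CL/cl.h>
#include <stddef.h>

#define NUM_ITEMS   1024
#define ITEM_BYTES  4096   /* must satisfy CL_DEVICE_MEM_BASE_ADDR_ALIGN */

cl_mem pack_items(cl_context ctx, cl_command_queue queue,
                  const void *host_data, cl_mem sub_bufs[NUM_ITEMS],
                  cl_int *err)
{
    /* One allocation instead of NUM_ITEMS small ones. */
    cl_mem big = clCreateBuffer(ctx, CL_MEM_READ_ONLY,
                                (size_t)NUM_ITEMS * ITEM_BYTES, NULL, err);
    if (*err != CL_SUCCESS) return NULL;

    /* Single transfer of all packed item data. */
    *err = clEnqueueWriteBuffer(queue, big, CL_TRUE, 0,
                                (size_t)NUM_ITEMS * ITEM_BYTES,
                                host_data, 0, NULL, NULL);

    /* Carve out a per-item view so each kernel call still sees "its" data. */
    for (int i = 0; i < NUM_ITEMS; ++i) {
        cl_buffer_region region = { (size_t)i * ITEM_BYTES, ITEM_BYTES };
        sub_bufs[i] = clCreateSubBuffer(big, CL_MEM_READ_ONLY,
                                        CL_BUFFER_CREATE_TYPE_REGION,
                                        &region, err);
    }
    return big;
}
```

Whether sub-buffer views or an explicit offset passed as a kernel argument works better will depend on how the kernels consume the data; either way, the point is one large allocation and one transfer instead of thousands.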

Thanks for the reply. Am I correct in assuming this is roughly equivalent to VBOs in OpenGL? My understanding (and I’m no expert) of buffers in OpenGL is that there is always some overhead associated with switching them, and it is almost always faster to execute calls with a few large VBOs than with a bunch of small ones. I guess the thing to do would be to run some test cases with various buffer sizes versus numbers of buffers. I was thinking someone might have done this already and might share their results. Thanks again.
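
In case it helps anyone running a similar test, this is the sort of timing helper I had in mind; it just times a single host-to-device write using event profiling and assumes the command queue was created with CL_QUEUE_PROFILING_ENABLE. The sweep over buffer sizes and buffer counts would be up to the caller, and the numbers will obviously be specific to one’s own hardware and driver.

```c
/* Rough timing sketch: measure one blocking host-to-device write of
 * `bytes` bytes, in milliseconds, via OpenCL event profiling. Assumes
 * the queue has CL_QUEUE_PROFILING_ENABLE set. */
#include <CL/cl.h>

double time_write_ms(cl_context ctx, cl_command_queue queue,
                     const void *src, size_t bytes)
{
    cl_int err;
    cl_event ev;
    cl_ulong t_start = 0, t_end = 0;

    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_ONLY, bytes, NULL, &err);
    if (err != CL_SUCCESS) return -1.0;

    /* Blocking write so the event is complete when we query it. */
    clEnqueueWriteBuffer(queue, buf, CL_TRUE, 0, bytes, src, 0, NULL, &ev);

    clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_START,
                            sizeof(t_start), &t_start, NULL);
    clGetEventProfilingInfo(ev, CL_PROFILING_COMMAND_END,
                            sizeof(t_end), &t_end, NULL);

    clReleaseEvent(ev);
    clReleaseMemObject(buf);

    return (double)(t_end - t_start) * 1e-6;  /* nanoseconds -> milliseconds */
}
```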