Maybe new functions aren’t needed if I can accomplish my goal of 2D and 3D buffers using images, provided there is an appropriate cl_image_format, say a channel order of CL_R or CL_A and a channel data type of CL_FLOAT or CL_DOUBLE?
Your intuition was good, because that’s exactly what I was going to recommend.
You could create a 2D real matrix with something like this:
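/* One float per element: single channel (CL_R), 32-bit float data. */
cl_image_format image_format = {CL_R, CL_FLOAT};
cl_int errcode = CL_SUCCESS;
/* ctx, width and height are assumed to be set up elsewhere. */
cl_mem myMatrix = clCreateImage2D(ctx, CL_MEM_READ_WRITE, &image_format, width, height, /*row_pitch*/ 0, /*host_ptr*/ NULL, &errcode);
if (errcode != CL_SUCCESS)
{
    /* do_not_panic() is a placeholder for your own error handling. */
    do_not_panic(errcode);
}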
Images won’t help with double precision, though: the image channel data types that OpenCL defines do not include a CL_DOUBLE. For doubles you would need to use buffer objects and treat them as 2D manually. Keep in mind that treating a 1D buffer as a 2D surface is not very difficult. Essentially all you have to do is choose whether you want it to be in row-major order or column-major order and write a few utility functions to create, read, write and copy them.
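To make the "few utility functions" idea concrete, here is a minimal host-side sketch in C, assuming row-major order. The matrix2d struct and the matrix2d_* names are made up for illustration and are not part of OpenCL; the same index arithmetic would apply inside a kernel that receives the buffer as a flat pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical host-side view of a 2D matrix stored in one flat array. */
typedef struct {
    size_t rows;   /* m: height */
    size_t cols;   /* n: width */
    size_t pitch;  /* elements between the starts of consecutive rows, >= cols */
    float *data;   /* rows * pitch elements */
} matrix2d;

/* Row-major layout: element (r, c) lives at r * pitch + c. */
static size_t matrix2d_index(const matrix2d *m, size_t r, size_t c)
{
    assert(r < m->rows && c < m->cols);
    return r * m->pitch + c;
}

/* Allocates a zero-filled, densely packed matrix. Returns 0 on success. */
static int matrix2d_create(matrix2d *m, size_t rows, size_t cols)
{
    m->rows = rows;
    m->cols = cols;
    m->pitch = cols;
    m->data = calloc(rows * cols, sizeof *m->data);
    return m->data != NULL ? 0 : -1;
}

static float matrix2d_get(const matrix2d *m, size_t r, size_t c)
{
    return m->data[matrix2d_index(m, r, c)];
}

static void matrix2d_set(matrix2d *m, size_t r, size_t c, float value)
{
    m->data[matrix2d_index(m, r, c)] = value;
}

static void matrix2d_destroy(matrix2d *m)
{
    free(m->data);
    m->data = NULL;
}
```

Switching to column-major order would only change matrix2d_index (c * pitch + r, with pitch counting elements per column), which is the main reason to route every access through one helper.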
In linear algebra, matrices/tensors and sub-matrices/tensors are typically defined by m (height), n (width), and lda (row_pitch), which aren’t communicated well using the current clEnqueue{Read, Write}Buffer. The clEnqueue{Read, Write}Image seems to be a better match, but I’m not sure about the internal structure of cl_mem (or * _cl_mem).
Right. When you read/write/copy strided data using clEnqueue{Read,Write}Buffer() you need to account for the stride. A couple of support functions in your app would do the trick.
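As a rough sketch of such a support function, the C code below reads an m-by-n sub-matrix out of a row-major parent with lda elements per row, one row at a time. To keep it runnable without an OpenCL device it copies from host memory with memcpy; the comment shows the clEnqueueReadBuffer call that would take its place. The names submatrix_row_offset and read_submatrix, and the queue and buf variables in the comment, are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset of row i of a sub-matrix whose top-left corner sits at
   (row0, col0) of a row-major parent with lda elements per row. */
static size_t submatrix_row_offset(size_t row0, size_t col0, size_t i,
                                   size_t lda, size_t elem_size)
{
    return ((row0 + i) * lda + col0) * elem_size;
}

/* Copies m rows of n floats from the strided parent into a densely
   packed destination (n elements per row). */
static void read_submatrix(const float *src, size_t lda,
                           size_t row0, size_t col0,
                           size_t m, size_t n, float *dst)
{
    size_t row_bytes = n * sizeof(float);
    assert(col0 + n <= lda);  /* the sub-matrix must fit inside a parent row */
    for (size_t i = 0; i < m; ++i) {
        size_t offset = submatrix_row_offset(row0, col0, i, lda, sizeof(float));
        /* With a device buffer this copy would instead be something like:
           clEnqueueReadBuffer(queue, buf, CL_TRUE, offset, row_bytes,
                               dst + i * n, 0, NULL, NULL); */
        memcpy(dst + i * n, (const char *)src + offset, row_bytes);
    }
}
```

Writing works the same way with clEnqueueWriteBuffer and the source and destination roles swapped. On OpenCL 1.1 and later, clEnqueueReadBufferRect and clEnqueueWriteBufferRect can do this kind of strided transfer in a single call.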
The internal structure of cl_mem is different in each OpenCL implementation, which is why the OpenCL API exposes it as an opaque handle.