Interface layer

I’m interested in implementing an interface layer so that clGetPlatformInfo returns all the devices on the platform, instead of just those under the control of the specific implementation. In other words, on Linux right now, as far as I can tell, a program can link to either the AMD or the NVIDIA implementation, but not both. The goal of this project would be to let a program be agnostic about which implementations exist on the machine, and leave it to the interface to communicate with them. This would allow a single process to send multiple kernels to multiple devices.

I’m fairly sure that this is the case on Apple systems at the moment, so I’d like to get the same functionality on Linux.

Any recommendations on the best way to do this? If possible, it would be great if at runtime the interface could search for multiple versions of libOpenCL.so, load them, and multiplex the functions appropriately. I apologise; I’m a bit ignorant when it comes to linkers and compilers. I would be open to more creative ways of interfacing with multiple implementations too; this just seemed like the most efficient and elegant approach.

Thanks a lot,
George
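A minimal sketch of the runtime-loading idea asked about above, written in Python with ctypes purely for illustration. The library paths in CANDIDATE_LIBS and the helper names load_implementations and count_platforms are hypothetical; real install locations vary by vendor and distribution. The only OpenCL entry point used is clGetPlatformIDs. On Linux, ctypes normally opens libraries with local symbol visibility, so two libraries exporting the same function names should not clash, but that is an assumption worth checking on the target system.

```python
import ctypes

# Hypothetical candidate paths; adjust for the actual vendor installs.
CANDIDATE_LIBS = [
    "/usr/lib/nvidia/libOpenCL.so",
    "/opt/amd/lib/libOpenCL.so",
]


def load_implementations(paths, loader=ctypes.CDLL):
    """Try to load each candidate library and skip the ones that fail."""
    loaded = {}
    for path in paths:
        try:
            loaded[path] = loader(path)
        except OSError:
            # Library is missing or could not be loaded; ignore it.
            continue
    return loaded


def count_platforms(lib):
    """Ask one loaded implementation how many platforms it exposes."""
    # cl_int clGetPlatformIDs(cl_uint num_entries,
    #                         cl_platform_id *platforms,
    #                         cl_uint *num_platforms)
    lib.clGetPlatformIDs.restype = ctypes.c_int
    lib.clGetPlatformIDs.argtypes = [
        ctypes.c_uint,
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_uint),
    ]
    num = ctypes.c_uint(0)
    status = lib.clGetPlatformIDs(0, None, ctypes.byref(num))
    return num.value if status == 0 else 0  # 0 is CL_SUCCESS


if __name__ == "__main__":
    for path, lib in load_implementations(CANDIDATE_LIBS).items():
        print(path, count_platforms(lib))
```

A full interface layer would go further than this: it would keep one function table per loaded library and route each call to the library that owns the platform, device, or context handle being passed in.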

George,
The platform layer is supposed to provide much of this functionality. Having only used Apple’s platform, however, I do not know how much of this interoperability has actually been implemented, but I believe the plan is to get it in place. That will address the ability to load multiple vendors’ implementations simultaneously, but not the harder part, which is synchronizing data between them. The OpenCL spec does not cover that portion, so it will be left to the implementer, who will have to copy through host memory with explicit synchronization. There is potentially a fair amount of copy/synchronization overhead there, though.

Thanks dbs2,
Unfortunately, on Linux at the moment NVIDIA and AMD each already have their own platform layer implemented, and you have to link to one or the other. So I guess I’m asking what people think would be the best way to implement a platform layer on top of the NVIDIA and AMD implementations, if it’s possible at all. This would allow functions like clGetPlatformIDs to return multiple platforms, which as far as I can tell is impossible on Linux as of now.

Thanks,
George

P.S. I had never thought about the fact that there is no inter-device memory transfer specified in OpenCL. That’s interesting.

I believe the platform layer is intended to become vendor-neutral at some point in the future, which will address much of that concern. Of course, the vendors don’t have much incentive to make that work, so it might take a while.

The inter-device memory transfer is specified as being automatic within the constraints of a weak-memory model. (E.g., the runtime is responsible for moving dirty data when needed, but the user is responsible for making sure multiple things aren’t accessing data at the same time.) So this does specify what happens for all devices in a platform. For example, on Apple’s platform, data is automatically moved between AMD and NVIDIA GPUs and the CPU as needed. (Those devices all show up with clGetDeviceIDs, so they are all in one platform.) This breaks down when you have multiple platforms, since there is no specified interface for the drivers to communicate the data. Having a platform layer that automatically moved data via the host between platforms would be very nice, but it would undoubtedly have to be conservative in some cases, resulting in extra copies.
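To make the host-staging idea concrete, here is a toy model in Python. It makes no OpenCL calls. The StagedBuffer class, its method names, and the platform labels are all hypothetical. It only tracks which platform holds the newest data and counts the copies a conservative layer would make through host memory.

```python
class StagedBuffer:
    """Toy model of a buffer shared across platforms via a host copy."""

    def __init__(self, data):
        self.host = bytes(data)   # host-side staging copy
        self.copies = {}          # platform name -> data held on that platform
        self.dirty_on = None      # platform holding newer data than the host
        self.transfers = 0        # host<->platform copies performed so far

    def _flush_to_host(self):
        """Copy dirty data back to the host and drop stale platform copies."""
        if self.dirty_on is not None:
            self.host = self.copies[self.dirty_on]
            self.transfers += 1
            self.copies = {self.dirty_on: self.host}
            self.dirty_on = None

    def acquire(self, platform):
        """Make the newest data available on `platform` and return it."""
        if self.dirty_on == platform:
            return self.copies[platform]
        self._flush_to_host()
        if platform not in self.copies:
            self.copies[platform] = self.host
            self.transfers += 1
        return self.copies[platform]

    def write(self, platform, data):
        """Record that a kernel on `platform` overwrote the buffer."""
        # Conservative: assume the kernel may read before it writes,
        # so the newest data is brought over first.
        self.acquire(platform)
        self.copies[platform] = bytes(data)
        self.dirty_on = platform


if __name__ == "__main__":
    buf = StagedBuffer(b"abc")
    buf.acquire("amd")
    buf.write("amd", b"xyz")
    print(buf.acquire("nvidia"), buf.transfers)
```

The conservative step in write is where the extra copies come from: a layer that cannot tell whether a kernel reads the buffer has to move the data over even when the kernel only overwrites it. As in the weak-memory model described above, the caller is still responsible for not touching the buffer from two platforms at once.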