Is it possible in OpenCL for one GPU to communicate directly with another GPU over PCIe? I understand that the CPU runs the "host", which does all of the cleanup and some of the computation, and that the GPUs are used almost as co-processors with their own "localities", but is that as far as it can go?

I run a CrossFire setup, where the GPUs are linked together. Could it be better NOT to run them in CrossFire, and instead run a "sub-host" on one GPU that does some "local hosting" to offload or speed up some of the CPU-to-GPU work? Maybe dual-GPU cards could have one "master" chip and one "slave" chip, as determined by the OpenCL app, each running different threads, with one chip handling duties such as all of the reads and writes to global memory, almost like a GPU cache.

