GPGPU and rendering calculations on same GPU?

Hello. I’ve been investigating white papers, podcasts, etc. on OpenCL and have read some papers on GPGPU development in general, but have not yet found an answer to one burning question. Surely it is feasible to have one GPU handle both hardware-accelerated rendering and GPGPU calculations, but would performance be adequate? Is it a better idea to have two GPUs in a system and keep these concerns separate? How do most people approach this situation? I’m in the research phase of a master’s project and am concerned about whether I will have to drop a load of cash on a machine capable of running two solid graphics cards.

Any thoughts and knowledge on this topic would be greatly appreciated.

This is a tough question. Apple has a demo app on their website that does an n-body simulation in OpenCL and uses OpenGL to render it. Depending on the configuration you’re running, it can be faster to do the compute on one GPU and the rendering on another. (E.g., if you have a GTX 285 and a GT 120, you’ll get the best performance doing compute on the 285 and rendering on the 120, rather than trying to do compute on both and rendering on the 120.) Of course, this problem is somewhat special because the amount of data transferred from the compute card to the render card is small (~60k points). If you are doing a computation that generates a lot of data for visualization, the cost of moving it from one card’s VRAM to another might outweigh the computational benefit of not doing rendering on the same card. In general you will have to experiment with your particular problem and setup to see what is most efficient.
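To get a feel for why ~60k points counts as "small," here's a rough back-of-envelope sketch. The numbers are my own assumptions, not from the demo: each point is a float4 position (16 bytes), and the bus sustains something like 8 GB/s (roughly the PCIe 2.0 x16 theoretical peak; real transfers also pay a fixed latency on top of this).

```python
# Rough estimate of the per-frame transfer cost for the n-body demo.
# Assumed (not from the thread): float4 positions, ~8 GB/s sustained bus bandwidth.
NUM_POINTS = 60_000
BYTES_PER_POINT = 16            # float4: x, y, z, w as 32-bit floats
BUS_BANDWIDTH = 8e9             # bytes/second, optimistic PCIe 2.0 x16 figure

payload_bytes = NUM_POINTS * BYTES_PER_POINT      # ~0.96 MB per frame
transfer_seconds = payload_bytes / BUS_BANDWIDTH  # ~0.12 ms per frame

print(f"{payload_bytes / 1e6:.2f} MB, {transfer_seconds * 1e3:.3f} ms/frame")
```

Around a tenth of a millisecond per frame is cheap next to the compute. But scale the payload up to, say, a full-resolution volume or a dense mesh, and the same arithmetic quickly eats the budget of a 16 ms frame, which is when keeping everything on one card starts to win.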

Thanks so much for the thoughtful and interesting response! You mentioned that a likely factor in performance is the amount of data to transfer from one GPU to another. I have indeed read that one of the most expensive parts of GPU processing is the latency involved in transferring data across the bus. Does this sound about right? Are there any other significant factors that would contribute to performance costs?

It’s a combination of load-balancing and transfer latency. If your rendering is so computationally heavy that you can amortize the transfer cost by doing more computation while rendering on the other card, then you will win. If not, then you’re better off rendering and computing on the same device. Unfortunately this varies between applications and systems/cards, so I can’t give you any more concrete answers. I know multi-device processing today is not as optimized on various platforms as one might hope, so you will want to make sure you are doing a lot of compute for any data transfer.
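The amortization argument above can be written down as a toy cost model. This is just a sketch with made-up frame timings; the real numbers have to come from profiling your own app on your own hardware:

```python
# Toy cost model for the single- vs. dual-GPU decision.
# All timings (milliseconds) are hypothetical examples, not measurements.

def frame_time_single(t_compute, t_render):
    """One GPU does both jobs, so the two stages serialize."""
    return t_compute + t_render

def frame_time_dual(t_compute, t_render, t_transfer):
    """Compute and render overlap on two GPUs, but every frame pays
    the cost of shipping the results across the bus."""
    return max(t_compute, t_render) + t_transfer

# Heavy compute + heavy render, small transfer: the overlap amortizes
# the transfer, so two cards win.
assert frame_time_dual(10.0, 8.0, 2.0) < frame_time_single(10.0, 8.0)

# Same workload but lots of visualization data to move: the transfer
# swamps the overlap, so one card wins.
assert frame_time_dual(10.0, 8.0, 9.0) > frame_time_single(10.0, 8.0)
```

The model ignores real-world effects like transfer/compute overlap via asynchronous copies and per-transfer fixed latency, but it captures the basic break-even point: two cards help only while the transfer time stays below the stage time you hide by overlapping.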

Interesting. Even though theorizing about specific situations in the abstract is not very productive, knowing the fundamental principles helps in making good decisions when concrete cases come up, so your insight was very helpful. Thanks!