Thread: Using a buffer on multiple devices.

    Jul 2012

    Using a buffer on multiple devices.

    Hello everybody,

    I have searched the web for similar questions but couldn't find the answer I need...

    I am writing a dynamic scheduler (following AMD's OpenCL guide) to handle multiple GPUs. However, I am having trouble with the way OpenCL handles memory.

    Basically, I have five buffers; I'll just call them A, B, C, D and E.
    I am executing two kernels on two devices:

    Device 1 : A = f(B,C) [ does not modify B or C ]
    Device 2 : D = f(B,E) [ does not modify B or E ]

    I am using one host thread per queue, and there is only one queue per device.
    The problem is that if Device 1 executes first, Device 2 does not start its task until B is available (i.e. until Device 1 is done). So, in the end, everything is serialized.
    I have tried using READ_ONLY and WRITE_ONLY buffers to tell the OpenCL implementation that B is not modified, but I ran into the same problem.
    Is there any AMD- and NVIDIA-compatible way of enqueueing these two tasks concurrently without having to duplicate B?

    Thank you very much!

    Edit: my tests were run on an NVIDIA platform.

    Sep 2002
    Santa Clara

    Re: Using a buffer on multiple devices.

    Just wanted to confirm: when you enqueue the kernel on device 2 (the one that does not modify B or E), you are not passing the event for the kernel enqueued on device 1 (the one that does not modify B or C) in the event_wait_list argument, correct?

    If no event dependencies are specified, both kernels should execute in parallel. You should take this up with the folks at AMD on their developer forum.

    Dec 2011

    Re: Using a buffer on multiple devices.

    Did you try using a duplicate of B to see if it works like you want?

