I'm trying to build a program that applies a number of operations to an image, one after another.
To boost performance I would like to reuse the output of the first operation as the input of the second operation, without reading it back from GPU memory to the host in between.

For example, operation1 takes IMG1 as input and stores IMG1OUT as output, which stays in GPU memory.
I then want to use that same memory as the input for operation2, and so on until all operations are done.
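To make the idea concrete, here is a minimal sketch in plain Java of the chaining I have in mind. It makes no OpenCL calls at all. `PingPongPlan`, `Step`, `plan` and `resultSlot` are names I made up for illustration, and the sketch assumes two images that stay on the device and swap input/output roles after each operation. Whether the memory flags allow that swap is exactly the part I am unsure about.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JOCL: it only plans which of two
// device-side images each operation reads from and writes to.
final class PingPongPlan {

    // One operation in the chain: indices into a two-element array of images.
    static final class Step {
        final int inputSlot;
        final int outputSlot;

        Step(int inputSlot, int outputSlot) {
            this.inputSlot = inputSlot;
            this.outputSlot = outputSlot;
        }
    }

    // Builds the input/output slot for each of operationCount operations.
    // Assumption: slot 0 holds the uploaded source image before the first step.
    static List<Step> plan(int operationCount) {
        if (operationCount < 0) {
            throw new IllegalArgumentException("operationCount must be >= 0");
        }
        List<Step> steps = new ArrayList<>();
        int in = 0;
        for (int i = 0; i < operationCount; i++) {
            int out = 1 - in;            // write to the other image
            steps.add(new Step(in, out));
            in = out;                    // the output becomes the next input
        }
        return steps;
    }

    // Slot that holds the final result once all operations have run.
    static int resultSlot(int operationCount) {
        return operationCount % 2;
    }
}
```

With three operations, `PingPongPlan.plan(3)` goes 0 to 1, 1 to 0, 0 to 1, so only the image in slot 1 would have to be read back at the very end.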

The problem I ran into is an INVALID_KERNEL_ARG error. I found the cause, but I do not know how to solve it:

To create the first input image I use the following:
Code :
memObject = clCreateImage2D(platform.getContext(), CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, new cl_image_format[]{imgformat}, width, height, width * Sizeof.cl_uint, Pointer.to(dataSrc), null);

As you can see, the input image is created with CL_MEM_READ_ONLY, while the output image I pass as an argument is created with CL_MEM_WRITE_ONLY:
Code :
this.output = CL.clCreateImage2D(
                platform.getContext(), CL_MEM_WRITE_ONLY,
                new cl_image_format[]{imageFormat}, imgWidth[0], imgHeight[0],
                0, null, null);

As far as I understand, the two need different flags because of some OpenCL-specific restriction. I read that somewhere.

Anyway, how can I use the output directly as the input for the next operation? It would be wasteful to read the image back into a BufferedImage in host memory and then allocate and upload it again to GPU memory.
That is just more data transfer than needed, I think.

There has to be a way to change the read/write flags of an existing image, or something along those lines.
I hope you can help me.