Hi,
This is my problem: I have 3 functions in my .cl file and I call them from a kernel. They are image processing functions, so I apply 3 transformations to the image. Right now I'm working in global memory, but I want to work in local memory.
What I want to do is transfer work-groups to local memory, do the 3 operations, then write the result back to global memory. But since each transformation is a separate function, is this possible?
I can detail more if needed.
Hi,
You need to change the signature of your functions so that they take pointers to local memory:
void Fun(__local float2* t1, __local float* t2, float eta, unsigned int v)
If I understand you correctly, you should:
- Load the work-group's data from global memory into local memory (preferably in a coalesced fashion; I'm assuming you're targeting a GPU architecture), then synchronize with barrier(CLK_LOCAL_MEM_FENCE) before any work-item reads data loaded by another work-item
- Process the data in local memory by invoking your functions with the appropriate __local pointers
- When done, write the results back to global memory
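The three steps above can be sketched as a single kernel. This is only a minimal sketch under assumptions: the stage functions (stage_scale, stage_offset, stage_clamp), the kernel name process_image and all parameter names are hypothetical stand-ins for your three transformations, the image is treated as a flat 1D array of floats, and the tile buffer is assumed to hold one float per work-item.

```c
// Hypothetical stages standing in for the three transformations.
// Each one works in place on the tile held in local memory.
void stage_scale(__local float *tile, float gain)
{
    const size_t lid = get_local_id(0);
    tile[lid] *= gain;
}

void stage_offset(__local float *tile, float bias)
{
    const size_t lid = get_local_id(0);
    tile[lid] += bias;
}

void stage_clamp(__local float *tile)
{
    const size_t lid = get_local_id(0);
    tile[lid] = clamp(tile[lid], 0.0f, 1.0f);
}

__kernel void process_image(__global const float *src,
                            __global float *dst,
                            __local float *tile,  // assumed: one float per work-item
                            const uint n,         // number of pixels
                            const float gain,
                            const float bias)
{
    const size_t gid = get_global_id(0);
    const size_t lid = get_local_id(0);

    // 1. Load: neighbouring work-items read neighbouring addresses,
    //    which is the access pattern GPUs can coalesce.
    tile[lid] = (gid < n) ? src[gid] : 0.0f;
    barrier(CLK_LOCAL_MEM_FENCE);

    // 2. Process in local memory. The barriers between stages only matter
    //    if a stage reads elements written by other work-items (e.g. a
    //    filter that looks at neighbouring pixels); they are kept here to
    //    show where they would go.
    stage_scale(tile, gain);
    barrier(CLK_LOCAL_MEM_FENCE);
    stage_offset(tile, bias);
    barrier(CLK_LOCAL_MEM_FENCE);
    stage_clamp(tile);

    // 3. Write back: each work-item stores the element it owns.
    if (gid < n) {
        dst[gid] = tile[lid];
    }
}
```

On the host side, the local buffer would be sized by calling clSetKernelArg for argument index 2 with a size of (work-group size * sizeof(float)) and a NULL value pointer. Note that every work-item reaches every barrier, which is why the bounds check guards the loads and stores instead of returning early.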
I don’t quite get “(…) transfer work-groups to local memory”. A work-group is a unit of execution, not a piece of memory. Work-groups consist of work-items executing in parallel, and every work-item in a work-group executes the same kernel. Local memory is shared by all work-items within one work-group, while global memory is accessible to all work-items from all work-groups.
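To make the distinction concrete, here is a small illustrative kernel (the name show_ids and both output buffers are hypothetical, and a 1D range with no global offset is assumed). Work-item 0 of each work-group stores its global id in a __local variable, and after a barrier every work-item of that group reads the same value back, while work-items in other groups see their own group's copy.

```c
__kernel void show_ids(__global uint *group_of,    // group index of each work-item
                       __global uint *leader_gid)  // global id of the group's first work-item
{
    // One copy of this variable exists per work-group.
    __local uint first_gid;

    const size_t gid = get_global_id(0);
    const size_t lid = get_local_id(0);

    // With no global offset:
    //   gid == get_group_id(0) * get_local_size(0) + lid
    if (lid == 0) {
        first_gid = (uint)gid;
    }
    // Make the write by work-item 0 visible to the rest of the group.
    barrier(CLK_LOCAL_MEM_FENCE);

    group_of[gid]   = (uint)get_group_id(0);
    leader_gid[gid] = first_gid;
}
```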
In general you should batch and minimize your transfers from/to global memory and work as much as possible on fast local memory.