Error when building the program executable

When I build the program, the following error occurs:
error: kernel pointer arguments must point to addrSpace global, local, or constant
__kernel void cell_traverse_level_non_leafs_kernel_func(int max_depth, int i, global struct _FttRootCell * RootD, global struct _FttOct * tree, unsigned int v)
^

My kernel is:
__kernel void cell_traverse_level_non_leafs_kernel_func(int max_depth, int i, global struct _FttRootCell * RootD, global struct _FttOct * tree, unsigned int v) {


}

The weird thing is that the kernel arguments that are pointers do have the global address space qualifier.

I don't know if it matters, but my device is a CPU on the AMD Accelerated Parallel Processing platform.

The structs are:
struct _FttRootCell {
struct _FttCell cell;

struct _FttCellNeighbors neighbors;
struct _FttVector pos;
unsigned int level;
global void * parent;
};

struct _FttOct {
unsigned int level;
global struct _FttCell * parent;
struct _FttCellNeighbors neighbors;
struct _FttVector pos;

struct _FttCell cell[FTT_CELLS];
};

Thank you

I found this in the OpenCL Programming Guide:

The struct type cannot contain any pointers if the struct or pointer to
a struct is used as an argument type to a kernel function.

The struct type can contain pointers only if the struct or pointer
to a struct is used as an argument type to a non-kernel function or
declared as a variable inside a kernel or non-kernel function.

So how can I do it?
Maybe I could declare the kernel arguments as void * and then cast them inside the kernel to (struct _FttOct *) and (struct _FttRootCell *)… I don't know…

I think you have to define the size of every struct member. I can't see the sizes of the structs nested inside your structs. Make sure that the full byte size of your struct is defined.
And tell me if I'm wrong, but shouldn't it be __global (you are missing the two underscores)?

Thanks for the answer, clint3112!
I am doing this for my dissertation and I have gotten confused.

I think you have to define the size of every struct member. I can't see the sizes of the structs nested inside your structs. Make sure that the full byte size of your struct is defined.

What do you mean? I suppose you mean that I should check whether the struct members have the same size on the host and on the device. Unfortunately, they don't. I use an AMD CPU as the device, so I can use printf inside a kernel.

Let's take it from the beginning…
I am trying to traverse a tree, so I use a contiguous memory block on the host and rewrite the pointers relative to the base address of the buffer object. So my question is:
Does tree_d = clCreateBuffer(…); return the base address of the buffer object, or should I do something else?
More specifically, I rewrite the addresses as follows:
struct _FttOct *tree_new_addresses_gpu = malloc(…);
tree_new_addresses_gpu[i].neighbors.c[j] = (struct _FttCell *)((size_t)tree_d + ((size_t)t[i].neighbors.c[j] - (size_t)tree));
and then I transfer tree_new_addresses_gpu to the device.
When I print the addresses of the tree on the device, they are not correct.

The structs used for the traversal are:

struct _FttVector {
  double x, y, z;
};

struct _FttCell {
  unsigned int  flags;
  global void * data;
  global struct _FttOct * parent, * children;
};

struct _FttCellNeighbors {
  global  struct _FttCell * c[FTT_NEIGHBORS];
};

struct _FttCellChildren {
  global struct _FttCell * c[FTT_CELLS];
};

struct _FttRootCell {
  struct _FttCell cell;

  struct _FttCellNeighbors neighbors;
  struct _FttVector pos;
  unsigned int level;
  global void * parent;
};

struct _FttOct {
  unsigned int level;
  global struct _FttCell * parent;
  struct _FttCellNeighbors neighbors;
  struct _FttVector pos;

  struct _FttCell cell[FTT_CELLS];
};

struct _FttCellFace {
  global struct _FttCell * cell, * neighbor;
  FttDirection d;
};

struct _GfsFaceStateVector {
  double un;
  double v;
};

struct _GfsStateVector {
  /* temporary face variables */
  struct _GfsFaceStateVector f[FTT_NEIGHBORS];

  /* solid boundaries */
  global struct _GfsSolidVector * solid;

  double place_holder;
};

struct _GfsSolidVector {
  double s[FTT_NEIGHBORS];
  double a, fv;
  global struct _FttCell * merged;
  struct _FttVector cm, ca, v;
};

struct _GfsGradient {
  double a, b;
};

I find differences in the sizes of the structs:
host: sizeof(struct _GfsStateVector) = 80 and device: sizeof(struct _GfsStateVector) = 112
host: sizeof(struct _FttCellNeighbors) = 32 and device: sizeof(struct _FttCellNeighbors) = 48
host: sizeof(struct _FttCell) = 32 and device: sizeof(struct _FttCell) = 32, which are the same…
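
One possible reading of these numbers, offered as a guess rather than a diagnosis: they are exactly what you would get if pointers and doubles are 8 bytes on both sides, but FTT_NEIGHBORS is 4 in the host build and 6 in the source compiled for the device. The thread never states the value of FTT_NEIGHBORS on either side, so this is an assumption. The plain C sketch below recomputes the sizes for both values; the struct names with the _4 and _6 suffixes and the helper functions are hypothetical stand-ins, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for _GfsFaceStateVector, _FttCellNeighbors and
   _GfsStateVector, parameterised on the neighbor count N.
   Assumption: 8-byte pointers and 8-byte doubles (typical 64-bit build). */
#define DEFINE_STRUCTS(N)                              \
    struct FaceState_##N { double un; double v; };     \
    struct Neighbors_##N { void *c[N]; };              \
    struct StateVector_##N {                           \
        struct FaceState_##N f[N];                     \
        void *solid;                                   \
        double place_holder;                           \
    };

DEFINE_STRUCTS(4)   /* assumed host value of FTT_NEIGHBORS   */
DEFINE_STRUCTS(6)   /* assumed device value of FTT_NEIGHBORS */

/* Two extra neighbors add two pointers to the neighbor table and two
   face-state entries to the state vector, whatever the pointer size. */
static void check_relationship(void)
{
    assert(sizeof(struct Neighbors_6) - sizeof(struct Neighbors_4)
           == 2 * sizeof(void *));
    assert(sizeof(struct StateVector_6) - sizeof(struct StateVector_4)
           == 2 * sizeof(struct FaceState_4));
}

/* Print the sizes so they can be compared with the values in the post. */
static void report_sizes(void)
{
    printf("N=4: neighbors=%zu state=%zu\n",
           sizeof(struct Neighbors_4), sizeof(struct StateVector_4));
    printf("N=6: neighbors=%zu state=%zu\n",
           sizeof(struct Neighbors_6), sizeof(struct StateVector_6));
}
```

If the sizes printed for N=4 and N=6 match the host and device values above, it would be worth checking that the same FTT_NEIGHBORS definition reaches both the host compiler and the OpenCL program build options.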

So I am confused!
I found this:

The compiler should handle data alignment for you. As for the structure's size, if you put a pointer in it, its size will be implementation dependent. That being said, using pointers and expecting them to persist across kernels is dangerous.

On AMD GPUs, memory addresses can map differently across kernel calls which will screw up your data structures. I would recommend using integer offsets instead of pointers, as integers have a predictable size and you can reindex into the buffer they were allocated in at runtime and get the real address.
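
To make the integer-offset suggestion concrete, here is a minimal sketch in plain C. It assumes all nodes live in one contiguous array, so a link can be stored as an index into that array instead of an address. The struct NodeIdx, the NO_LINK sentinel, and the helper names are hypothetical and much simpler than the _FttOct and _FttCell structs in this thread. Note also that the cl_mem returned by clCreateBuffer is an opaque handle, not a device address, which is one reason host-side pointer arithmetic on it does not give usable device pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_LINK (-1)   /* hypothetical sentinel that plays the role of NULL */

/* Hypothetical, simplified node. Links are indices into one array, so the
   struct has the same size and meaning on the host and on the device. */
struct NodeIdx {
    uint32_t level;
    int32_t  parent;     /* index of the parent node, or NO_LINK */
    int32_t  child[4];   /* indices of the children, or NO_LINK  */
};

/* Host side: turn a pointer into the contiguous block into an index. */
static int32_t ptr_to_index(const struct NodeIdx *base, const struct NodeIdx *p)
{
    return p ? (int32_t)(p - base) : NO_LINK;
}

/* Either side: turn an index back into an address, relative to whatever
   base the block has right now. In an OpenCL kernel, base would be the
   __global pointer passed as the kernel argument. */
static struct NodeIdx *index_to_ptr(struct NodeIdx *base, int32_t idx)
{
    assert(idx >= NO_LINK);
    return idx == NO_LINK ? NULL : &base[idx];
}

/* Example traversal: count the steps from a node up to the root. */
static int depth_of(struct NodeIdx *base, int32_t idx)
{
    int depth = 0;
    for (struct NodeIdx *n = index_to_ptr(base, idx);
         n != NULL && n->parent != NO_LINK;
         n = index_to_ptr(base, n->parent))
        depth++;
    return depth;
}
```

Because the links are relative to the start of the block, the array can be copied to a buffer as is, with no address rewriting on the host, and the traversal still works if the block ends up at a different address.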

I also found this in a lecture:

OpenCL does not allow device pointer arithmetic on the host:
Must use integer offsets
I do not understand this…

So I cannot use pointers as members of structs? It is not convenient for me to use integer offsets instead.

P.S. I found in the OpenCL Programming Guide that global and __global are the same.

One thing to keep in mind for your structs is that they have to be aligned to float or float4 sizes. I'm not really sure about that; look at what the spec says about user-defined structs.

For the pointer, you could use clCreateBuffer with the CL_MEM_ALLOC_HOST_PTR flag to let OpenCL allocate the memory for you. Then you can use clEnqueueMapBuffer to map that buffer into your host application.

I don't have time to look that up at the moment, but it could give you a hint.

Thank you for the answer…

With the CL_MEM_ALLOC_HOST_PTR flag, the application asks the OpenCL implementation to allocate the memory from host-accessible memory; it is like pinned memory in CUDA. But I don't want this.
I need to copy the whole tree to the device, but first I have to rewrite the addresses on the host:
(size_t)base address of the memory object on the device + ((size_t)Ptr - (size_t)HostPtr).
For this I need the base address of the memory object on the device. Is there any way to get it?

I found the following, but it does not work:

__kernel void getPtr( __global void *ptr, __global void  *out )
{
 out = ptr;
}
...

cl_mem auxBuf = clCreateBuffer( context, CL_MEM_READ_WRITE, sizeof(void*), NULL, NULL );
void *gpuPtr;

clSetKernelArg( getterKernel, 0, sizeof(cl_mem), &myBuf );
clSetKernelArg( getterKernel, 1, sizeof(cl_mem), &auxBuf );
clEnqueueTask( commandQueue, getterKernel, 0, NULL, NULL );
clEnqueueReadBuffer( commandQueue, auxBuf, CL_TRUE, 0, sizeof(void*), &gpuPtr, 0, NULL, NULL );

clReleaseMemObject(auxBuf);

...

Now “gpuPtr” should contain the address of the beginning of “myBuf” in GPU memory space.

Just to make that clear: you want to get the memory address on the GPU? Why do you need that? I don't think you can read that memory without the OpenCL DLL. All I can imagine is syncing device and host memory for memory allocated by OpenCL, or copying to memory you have allocated yourself.

Or do you want to let the GPU point to another memory location?
0x0001 <-- ptr1 points here
0x00ff <-- ptr2 points here
__kernel foo(ptr1, ptr2)
and then
ptr1 = 0x00ff
ptr2 = 0x0001
__kernel foo(ptr1, ptr2)

so that you change the memory offset by changing your pointer?