casting char* to struct

I’m passing in an uchar array to my kernel then trying to cast to a struct. The GPU driver crashes after a few frames. Here’s the OpenCL code. I’ve stripped out everything else to simplify it and it still crashes. What am I missing?

typedef struct
{
uchar r, g, b;
} Pixel;

__kernel void main (__global uchar* framebuffer, int width, int height)
{
int index = get_global_id(1) * width + get_global_id(0);
Pixel* p = (Pixel* )&framebuffer[index * 3];
p->r = 255;
p->g = 0;
p->b = 0;
}

Assuming that you have already verified that width and height match the global work-size that you passed to clEnqueueNDRangeKernel() and assuming that you have also verified that the buffer object you passed to the kernel is large enough, my guess would be a problem with alignment.

There’s a good chance that sizeof(Pixel) is 4 bytes if not larger. The spec is a bit vague here, although it makes the following two interesting statements.

Section 6.1.5:

A data item declared to be a data type in memory is always aligned to the size of the data type in bytes.

Section 6.2.5.:

Pointers to old and new types may be cast back and forth to each other. Casting a pointer to a new type represents an unchecked assertion that the address is correctly aligned.

In other words, by casting the uchar pointer to Pixel your code is asserting that the pointer is aligned to sizeof(Pixel). If it’s not the case, the code may and will fail to run properly.

Thinking about it a bit more, I’m not convinced that the alignment of the struct would be an issue from the viewpoint of standard OpenCL. As long as the members of the struct are properly aligned when you dereference the struct pointer, it should be fine.

The bug reproducer you have is nice and short. Have you tried sending it to your hardware vendor?

The OpenCL extension cl_khr_byte_addressable_store removes certain restrictions on built-in types char, uchar, char2, uchar2, short, and half. An application that wants to be able to write to elements of a pointer (or struct) that are of type char, uchar, char2, uchar2, short, ushort, and half will need to include the #pragma OPENCL EXTENSION cl_khr_byte_addressable_store : enable directive before any code that performs writes that may not be supported.

http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/cl_khr_byte_addressable_store.html

I’ve already included the cl_khr_byte_addressable_store extension. It still crashes. I’ve also tried adding the alignment and byte packing attributes to the Pixel struct but to no avail.

typedef struct
{
uchar r attribute ((packed));
uchar g attribute ((packed));
uchar b attribute ((packed));
} Pixel attribute ((aligned (1)));

On my NVIDIA Geforce 480 it freezes and crashes. On my AMD Radeon 5850 it actually throws a compiler error stating “invalid type conversion: Pixel* p = (Pixel* )&colorBuffer[index*3];”

EDIT

I mistakenly left out the __global modifier when declaring the Pixel variable. It all works fine now.

__global Pixel* p = (__global Pixel* )&colorBuffer[index * 3];