OpenCL struct alignment on host and device

I have a question about the byte alignment of structures in OpenCL. I read somewhere that I should make sure the size of a structure is the same on the host and the device by using __attribute__ ((aligned (X))).

My question: is it sufficient to do something like this (on both host and device):
struct sampleStruct
{
    float value_ONE;
    float value_TWO;
} __attribute__ ((aligned (8)));

or do I need to do the following (align each individual member):
struct sampleStruct
{
    float value_ONE __attribute__ ((aligned (8)));
    float value_TWO __attribute__ ((aligned (8)));
};

I ask because I want to understand the case where, by default, a float takes 8 bytes on the host and 4 bytes on the device. Do I need to handle this kind of scenario myself, or will the OpenCL compiler take care of it?

Thanks

Basic data types in OpenCL (integer and floating-point types) have defined sizes and alignment requirements (refer to section 6.1.5 of the OpenCL 1.1 specification). To make sure the layout matches between the host and the OpenCL kernel that uses these types on the device, use the corresponding cl_<type name> types on the host when declaring structs that will be shared by the host and device (refer to the table after table 6.1 in the 1.1 spec).
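As a rough illustration of the point above, here is a minimal host-side sketch of the struct from the question declared with a cl_ type. The typedef of cl_float is only a stand-in so the snippet compiles without the OpenCL headers; real host code would include <CL/cl.h> instead. The compile-time checks assume a host where float is 4 bytes.

```c
#include <stddef.h>

/* Stand-in for the typedef normally provided by the OpenCL headers
   (assumption made only so this sketch compiles on its own). */
typedef float cl_float;

/* Host-side view of the struct shared with the kernel. */
struct sampleStruct {
    cl_float value_ONE;   /* expected at offset 0 */
    cl_float value_TWO;   /* expected at offset 4 */
};

/* The kernel source would declare the same members with the built-in
   type:  struct sampleStruct { float value_ONE; float value_TWO; };  */

/* Fail the build if the host layout is not the 8 bytes the kernel expects. */
_Static_assert(sizeof(cl_float) == 4, "cl_float must be 4 bytes");
_Static_assert(offsetof(struct sampleStruct, value_TWO) == 4,
               "value_TWO must sit directly after value_ONE");
_Static_assert(sizeof(struct sampleStruct) == 8,
               "struct must be exactly two floats");
```

With checks like these in the host source, a mismatch shows up as a compile error rather than as corrupted data in the kernel.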

Note that this only works for integer and floating-point data types. The size and alignment guarantees do not extend to pointers, whose size can differ between host and device. This is one reason why pointers cannot be embedded inside a struct in OpenCL.

Affie, isn’t it still possible for both the host and the device compiler to introduce additional padding between struct members and at the end of the struct (which is important for AoS)? Section 6.7.2.1 p13 and p15 of the C99 spec talk about this.

The problem is, how do you guarantee that the host and device compiler will introduce the additional padding in the same place? Note that the spec says that padding may be added. Compilers will then choose to add padding as they believe appropriate, and there is not even a guarantee that two different host compilers will add padding in the same place, let alone a host and device compiler which are working on effectively different architectures.

This is exactly why hand-specifying the alignment (and thus forcing the compiler to pad as appropriate) is needed.
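A small sketch of what hand-specifying the alignment can look like, assuming a GCC- or Clang-style host compiler (OpenCL C accepts the same __attribute__ syntax on the device side; other host compilers spell it differently). The struct name Vec3Padded is made up for illustration. Aligning the struct to 16 bytes forces the compiler to round its size up to 16, so the tail padding is no longer left to the compiler's discretion.

```c
#include <stddef.h>

/* Three floats occupy 12 bytes; the aligned(16) attribute on the struct
   raises its alignment to 16 and therefore pads its size to 16 as well. */
struct Vec3Padded {
    float x;   /* offset 0 */
    float y;   /* offset 4 */
    float z;   /* offset 8, followed by 4 bytes of tail padding */
} __attribute__ ((aligned (16)));

/* Compile-time checks: the build fails if the layout is not as intended. */
_Static_assert(offsetof(struct Vec3Padded, z) == 8, "z must be at offset 8");
_Static_assert(sizeof(struct Vec3Padded) == 16, "size must be padded to 16");
_Static_assert(_Alignof(struct Vec3Padded) == 16, "alignment must be 16");
```

Declaring the struct the same way in the kernel source gives both sides the same element stride in an array of structs.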

One technique is to put the largest data members first, then work your way down in size. Otherwise, when you mix data sizes, you can get different padding on different platforms.
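To illustrate the ordering technique, here is a sketch with two hypothetical structs holding the same four members. The names Mixed and Sorted are invented for this example, and the offsets in the comments assume 8-byte double, 4-byte float, and natural alignment for the smaller members.

```c
#include <stddef.h>
#include <stdint.h>

/* Members in arbitrary order: the compiler has to insert padding before
   'weight' and before 'value', and how much depends on the platform's
   alignment rules for double. */
struct Mixed {
    uint8_t  flag;
    double   weight;
    uint16_t count;
    float    value;
};

/* Same members, largest first: every member already starts on a multiple
   of its own size, so no padding is needed between members. */
struct Sorted {
    double   weight;   /* offset 0  */
    float    value;    /* offset 8  */
    uint16_t count;    /* offset 12 */
    uint8_t  flag;     /* offset 14, then 1 byte of tail padding */
};

/* Number of bytes saved by reordering on the current platform. */
size_t bytes_saved_by_sorting(void)
{
    return sizeof(struct Mixed) - sizeof(struct Sorted);
}
```

Only the tail padding of Sorted is left to the compiler, and that can be pinned down with an explicit alignment attribute or a trailing pad member.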