Hi everybody,

I'm currently working with OpenCL and I'm running into problems with a high number of registers per thread in my main kernel.

The main kernel uses quite a lot of float4 values, but most of them could actually be float3. I know that cl_float3 is a typedef of cl_float4 on the host side, and I also know that float3 on the device side is a 16-byte type.
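To make the layout concrete, here is a minimal host-side sketch in plain C (no OpenCL headers needed). The names padded_float3, packed_float3 and wasted_bytes are hypothetical, made up for this illustration; padded_float3 only imitates the 16-byte size and alignment of a float3, it is not the real OpenCL type. It shows the memory footprint only and says nothing about register allocation, which is decided by the device compiler.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical stand-in for a float3: three floats, 16-byte aligned,
   so the compiler pads the struct to 16 bytes. */
typedef struct {
    alignas(16) float x;
    float y;
    float z;
} padded_float3;

/* Tightly packed alternative: three floats, no padding. */
typedef struct {
    float x;
    float y;
    float z;
} packed_float3;

/* Compile-time checks of the sizes discussed above
   (assumes a 4-byte float, as on common platforms). */
static_assert(sizeof(padded_float3) == 16, "padded float3 takes 16 bytes");
static_assert(sizeof(packed_float3) == 12, "packed float3 takes 12 bytes");

/* Bytes spent on padding when storing n padded elements instead of packed ones. */
static size_t wasted_bytes(size_t n) {
    return n * (sizeof(padded_float3) - sizeof(packed_float3));
}
```

For example, wasted_bytes(1000) gives the padding overhead of a 1000-element buffer under these assumptions.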

Am I right in thinking that the extra, unused float wastes a register?

If so, is there a way to work around this problem?

Sorry for my bad English.