Hi,

There shouldn't be such a huge difference in execution time. The main reason float4 is faster on a GPU is that the GPU architecture is optimized for float4 data. The memory...