What do I need the vector load and store functions for?

Aloha,

What are the vector load and store functions for?
I couldn’t find good examples.

Do they give a significant performance boost? If yes, in which situation?
Can they be replaced?

Best Wishes

They are equivalent to a simple assignment operator, except that you guarantee the addresses are aligned to a certain boundary. Don’t bother with them on GPU devices. As far as I can tell, on CPUs they should map directly to the SSE/AVX aligned load/store instructions, which give you around 5% better performance than the unaligned ones. Google “SSE intrinsics” if you want to read about it in more detail.
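
To make the aligned versus unaligned distinction concrete, here is a rough sketch in plain C using SSE intrinsics. It is not the vector load/store functions from the question, only an illustration of the CPU instructions I assume they map to. The names copy4_aligned and copy4_unaligned are made up for this example, and the memcpy fallback is only there so the code still builds on non-x86 machines.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* _mm_load_ps, _mm_loadu_ps, _mm_store_ps, _mm_storeu_ps */
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: copy 4 floats using the ALIGNED load/store.
 * Both pointers must be 16-byte aligned, otherwise the SSE
 * instructions fault. The assert documents that promise. */
void copy4_aligned(float *dst, const float *src)
{
    assert(((uintptr_t)src % 16) == 0);
    assert(((uintptr_t)dst % 16) == 0);
#if HAVE_SSE
    _mm_store_ps(dst, _mm_load_ps(src));
#else
    memcpy(dst, src, 4 * sizeof(float));   /* portable fallback */
#endif
}

/* Hypothetical helper: copy 4 floats using the UNALIGNED load/store.
 * The pointers only need the normal alignment of a float. */
void copy4_unaligned(float *dst, const float *src)
{
#if HAVE_SSE
    _mm_storeu_ps(dst, _mm_loadu_ps(src));
#else
    memcpy(dst, src, 4 * sizeof(float));   /* portable fallback */
#endif
}
```

Both helpers produce the same result; the only difference is the alignment promise you make to the compiler and the hardware. Whether the aligned variant is measurably faster depends on the CPU, so it is worth benchmarking on your own target.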