16-bit Integer Atomics

This came up recently and seems somewhat silly, given that we have 32-bit and 64-bit atomics. It was frustrating to have to send over a much larger buffer when all I needed was 16-bit unsigned shorts.

I doubt that support for 16-bit atomics is widespread on GPU hardware.

Why? 16-bit computations are commonplace these days.

You can always use masking combined with 32-bit atomic ops to work with a 16-bit buffer.

Hmmm… unless I am just crazy, I don’t see how that would work with atomic_inc?

All atomic operations can be implemented on top of atom_cmpxchg(), albeit less efficiently: read the 32-bit word containing your 16-bit value, compute the modified word, and compare-and-swap it back, retrying if another thread changed the word in the meantime. In particular, using masking like ljbade suggests may be a good option.