I’ve been spending some time with the Vulkan database registry, comparing the various memory types used by different vendors. But there are some setups that I can’t wrap my head around.
Zero memory flags
Memory types have flags, defined by VkMemoryPropertyFlagBits. For each memory type, the propertyFlags field describing it must match one of the specified sets of flags.
One of these sets is 0. My question is… why? What does such a memory type actually mean and when would you allocate through it?
Such memory is not DEVICE_LOCAL, so allocating through it won't achieve the best performance. Such memory is not HOST_VISIBLE, so mapping it is not possible. This means you have to treat it as though it were DEVICE_LOCAL in terms of access (i.e., you have to use staging and transfers), but you don't get the performance out of it.
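To make that reasoning concrete, here is a minimal sketch of how I'd classify a memory type from its flags alone. The two constants are stand-ins I define myself so the snippet builds without the Vulkan SDK (their values match VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT and VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT); UploadPath, upload_path, is_device_local and check_zero_flags are hypothetical names, not part of any API.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in constants; values match VkMemoryPropertyFlagBits in the Vulkan spec. */
#define DEVICE_LOCAL_BIT 0x1u
#define HOST_VISIBLE_BIT 0x2u

/* How the application would have to get data into an allocation. */
typedef enum { UPLOAD_MAP, UPLOAD_STAGING } UploadPath;

/* Only HOST_VISIBLE memory can be mapped; everything else needs a staging copy. */
static UploadPath upload_path(uint32_t propertyFlags) {
    return (propertyFlags & HOST_VISIBLE_BIT) ? UPLOAD_MAP : UPLOAD_STAGING;
}

/* Nonzero if the type advertises itself as device-local. */
static int is_device_local(uint32_t propertyFlags) {
    return (propertyFlags & DEVICE_LOCAL_BIT) != 0;
}

/* The case from the text: propertyFlags == 0 needs staging but is not device-local. */
static void check_zero_flags(void) {
    assert(upload_path(0) == UPLOAD_STAGING);
    assert(!is_device_local(0));
}
```

So a flag-less type lands in the "staging required, not device-local" corner, which is exactly the combination I can't find a use for.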
Several pieces of NVIDIA hardware, using recent drivers, expose memory types that have no flags set. In these cases, the un-flagged memory types are all associated with the CPU-memory heap.
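For anyone who wants to check their own device, this is roughly the scan I have in mind. It is only a sketch: MemType, MemHeap and MemProps are made-up stand-ins whose fields mirror VkMemoryType, VkMemoryHeap and VkPhysicalDeviceMemoryProperties, so the snippet compiles without the SDK. In a real program you would fill the real struct with vkGetPhysicalDeviceMemoryProperties and run the same loops over it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch builds without the Vulkan SDK. The bit values match
   VkMemoryPropertyFlagBits and VkMemoryHeapFlagBits; the struct names are
   made up, the field names mirror VkPhysicalDeviceMemoryProperties. */
enum {
    PROP_DEVICE_LOCAL  = 0x1,
    PROP_HOST_VISIBLE  = 0x2,
    PROP_HOST_COHERENT = 0x4,
    HEAP_DEVICE_LOCAL  = 0x1
};
#define MAX_TYPES 32 /* VK_MAX_MEMORY_TYPES */
#define MAX_HEAPS 16 /* VK_MAX_MEMORY_HEAPS */

typedef struct { uint32_t propertyFlags; uint32_t heapIndex; } MemType;
typedef struct { uint64_t size; uint32_t flags; } MemHeap;
typedef struct {
    uint32_t memoryTypeCount;
    MemType  memoryTypes[MAX_TYPES];
    uint32_t memoryHeapCount;
    MemHeap  memoryHeaps[MAX_HEAPS];
} MemProps;

/* Collect the indices of all memory types whose propertyFlags are 0.
   Returns how many were found; out must have room for MAX_TYPES entries. */
static uint32_t find_flagless_types(const MemProps *p, uint32_t *out) {
    assert(p != NULL && out != NULL);
    assert(p->memoryTypeCount <= MAX_TYPES);
    uint32_t n = 0;
    for (uint32_t i = 0; i < p->memoryTypeCount; ++i) {
        if (p->memoryTypes[i].propertyFlags == 0) {
            out[n++] = i;
        }
    }
    return n;
}

/* Nonzero if the heap backing the given memory type is flagged device-local. */
static int type_heap_is_device_local(const MemProps *p, uint32_t typeIndex) {
    assert(p != NULL && typeIndex < p->memoryTypeCount);
    uint32_t heap = p->memoryTypes[typeIndex].heapIndex;
    assert(heap < p->memoryHeapCount);
    return (p->memoryHeaps[heap].flags & HEAP_DEVICE_LOCAL) != 0;
}
```

If every index returned by find_flagless_types reports a heap that is not device-local, you are looking at the same situation as in the NVIDIA reports.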
That last part suggests a purpose. It could be for streaming or times when there is contention. But this would only be reasonable if the non-HOST_VISIBLE types were in some way faster for the GPU to access or copy from; otherwise, you'd just make them HOST_VISIBLE and be done with it.
Anybody got any insight into why implementers expose such memory types?
Double-device local
AMD hardware presents another interesting memory setup. Lots of AMD hardware has two sets of device local memory heaps.
It seems like they carve 256MB out of their GPU memory and set it aside in a special memory pool. This pool is accessible through a memory type that is DEVICE_LOCAL and HOST_VISIBLE (and HOST_COHERENT).
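Picking that type out programmatically is the usual "first memory type that has all the flags I need" search. The sketch below assumes the same kind of stand-in structs as the real API (MemType, MemHeap and MemProps are my own names mirroring VkMemoryType, VkMemoryHeap and VkPhysicalDeviceMemoryProperties), and it leaves out the memoryTypeBits filter you would normally apply from VkMemoryRequirements. find_memory_type and heap_size_for_type are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch builds without the Vulkan SDK; bit values match
   VkMemoryPropertyFlagBits, field names mirror the real structs. */
enum {
    PROP_DEVICE_LOCAL  = 0x1,
    PROP_HOST_VISIBLE  = 0x2,
    PROP_HOST_COHERENT = 0x4
};
#define MAX_TYPES 32 /* VK_MAX_MEMORY_TYPES */
#define MAX_HEAPS 16 /* VK_MAX_MEMORY_HEAPS */

typedef struct { uint32_t propertyFlags; uint32_t heapIndex; } MemType;
typedef struct { uint64_t size; uint32_t flags; } MemHeap;
typedef struct {
    uint32_t memoryTypeCount;
    MemType  memoryTypes[MAX_TYPES];
    uint32_t memoryHeapCount;
    MemHeap  memoryHeaps[MAX_HEAPS];
} MemProps;

/* Index of the first memory type that has every bit in `required` set,
   or -1 if no type qualifies. Extra flags on the type are allowed. */
static int find_memory_type(const MemProps *p, uint32_t required) {
    assert(p != NULL && p->memoryTypeCount <= MAX_TYPES);
    for (uint32_t i = 0; i < p->memoryTypeCount; ++i) {
        if ((p->memoryTypes[i].propertyFlags & required) == required) {
            return (int)i;
        }
    }
    return -1;
}

/* Size in bytes of the heap that backs the given memory type. */
static uint64_t heap_size_for_type(const MemProps *p, uint32_t typeIndex) {
    assert(p != NULL && typeIndex < p->memoryTypeCount);
    uint32_t heap = p->memoryTypes[typeIndex].heapIndex;
    assert(heap < p->memoryHeapCount);
    return p->memoryHeaps[heap].size;
}
```

Asking for PROP_DEVICE_LOCAL | PROP_HOST_VISIBLE and then checking heap_size_for_type on the result is how you would see the 256MB limit before trying to allocate from that pool.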
The thing is, I'm not sure what you would use that for. Even though it's visible directly, the fact that it's DEVICE_LOCAL probably means that such accesses are not fast for the CPU. So staging would probably best be done using the CPU memory pool instead of this carve-out of GPU memory.
What's the use case for this memory pool? Images/buffers you frequently access on both the CPU and GPU? Is this intended for things like UBOs, so that they can be in faster memory without you having to DMA data?