I am having trouble understanding how shader input/output locations work.
According to the specification, the maximum number of output locations a vertex shader can have is “maxVertexOutputComponents / 4”, and the maximum number of input locations for a fragment shader is “maxFragmentInputComponents / 4”.
On my GPU both maxVertexOutputComponents and maxFragmentInputComponents are 128, so I should have 32 locations available (0 through 31).
In my shaders I have some input/output blocks similar to this:
Vertex Shader:
#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 6
#define BLOCK3_LOCATION 7

struct DataBlock1
{
    vec3 v1;
    vec3 v2;
    vec3 v3;
    vec3 v4;
    vec2 v5;
    vec3 v6;
};
layout(location = BLOCK1_LOCATION) out DataBlock1 vs_out1;

layout(location = BLOCK2_LOCATION) out DataBlock2
{
    vec2 v1;
} vs_out2;

struct DataBlock3
{
    vec3 v1;
    vec3 v2;
    vec4 v3;
};
layout(location = BLOCK3_LOCATION) out DataBlock3 vs_out3[8];
Fragment Shader:
#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 6
#define BLOCK3_LOCATION 7

struct DataBlock1
{
    vec3 v1;
    vec3 v2;
    vec3 v3;
    vec3 v4;
    vec2 v5;
    vec3 v6;
};
layout(location = BLOCK1_LOCATION) in DataBlock1 fs_in1;

layout(location = BLOCK2_LOCATION) in DataBlock2
{
    vec2 v1;
} fs_in2;

struct DataBlock3
{
    vec3 v1;
    vec3 v2;
    vec4 v3;
};
layout(location = BLOCK3_LOCATION) in DataBlock3 fs_in3[8];
The specification says this about location consumption:
Inputs and outputs of the following types consume a single interface location:
- 32-bit scalar and vector types, and
- 64-bit scalar and 2-component vector types.
64-bit three- and four-component vectors consume two consecutive locations.
If a declared input or output is an array of size n and each element takes m locations, it will be assigned m × n consecutive locations starting with the location specified.
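To make my reading of the quoted rule explicit, here is a small C++ sketch of it. `InterfaceVar` and the `locationsConsumed` helpers are hypothetical names made up for this post, not part of Vulkan or glslang, and the sketch assumes that struct and block members are simply counted one after another in declaration order.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical description of one shader interface variable (not a Vulkan type).
struct InterfaceVar {
    std::uint32_t bitsPerComponent; // 32 for float/int/uint, 64 for double
    std::uint32_t components;       // 1 for a scalar, 2 to 4 for vectors
    std::uint32_t arraySize;        // 1 when the variable is not an array
};

// Locations for a single element, following the quoted rule:
// only 64-bit vectors with 3 or 4 components need two locations.
constexpr std::uint32_t locationsPerElement(const InterfaceVar& v) {
    return (v.bitsPerComponent == 64 && v.components >= 3) ? 2u : 1u;
}

// An array of n elements at m locations each takes m * n consecutive locations.
constexpr std::uint32_t locationsConsumed(const InterfaceVar& v) {
    return locationsPerElement(v) * v.arraySize;
}

// Sum over the members of a struct or block, optionally arrayed.
// Assumption: members are counted individually, in declaration order.
template <std::size_t N>
constexpr std::uint32_t locationsConsumed(const InterfaceVar (&members)[N],
                                          std::uint32_t arraySize = 1) {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += locationsConsumed(members[i]);
    }
    return total * arraySize;
}

// DataBlock1 from the shaders above: five vec3 members and one vec2.
constexpr InterfaceVar kDataBlock1[] = {
    {32, 3, 1}, {32, 3, 1}, {32, 3, 1}, {32, 3, 1}, {32, 2, 1}, {32, 3, 1}};

// DataBlock3 from the shaders above: vec3, vec3, vec4.
constexpr InterfaceVar kDataBlock3[] = {{32, 3, 1}, {32, 3, 1}, {32, 4, 1}};

static_assert(locationsConsumed(kDataBlock1) == 6, "DataBlock1 should take 6 locations");
static_assert(locationsConsumed(kDataBlock3, 8) == 24, "DataBlock3[8] should take 24 locations");
```

Under these assumptions a dvec3 (`{64, 3, 1}`) would count as two locations, while every float vector in my blocks counts as one.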
Since all of my types are either vec2, vec3 or vec4, they should each consume just 1 location, correct?
So, that means:
Block #1:
- Start location: 0
- 6 vectors = 6 locations are consumed
Block #2:
- Start location: 6
- 1 vector = 1 location is consumed
Block #3:
- Start location: 7
- 8 array elements × 3 vectors = 24 locations are consumed
In total, 31 locations are consumed (locations 0 through 30), which is just within the limit of 32.
However, if I use these locations, ‘createGraphicsPipelines’ fails with ‘eErrorInitializationFailed (-3)’. Neither the validation layers nor the glslang validator report any warnings.
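The tally above can be double-checked with a short C++ sketch. `LocationRange`, `fitsWithoutOverlap` and the other names are hypothetical helpers written for this post, and the per-block counts (6, 1 and 24) are my own assumption about how the locations are consumed, not something reported by the driver or the validation layers.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record for one interface variable:
// its first location and how many consecutive locations it consumes (count >= 1).
struct LocationRange {
    std::uint32_t first;
    std::uint32_t count;
};

constexpr std::uint32_t lastLocation(const LocationRange& r) {
    return r.first + r.count - 1;
}

// Total number of locations consumed by all ranges.
template <std::size_t N>
constexpr std::uint32_t totalLocations(const LocationRange (&ranges)[N]) {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += ranges[i].count;
    }
    return total;
}

// True when every range stays below `limit` and no two ranges share a location.
template <std::size_t N>
constexpr bool fitsWithoutOverlap(const LocationRange (&ranges)[N], std::uint32_t limit) {
    for (std::size_t i = 0; i < N; ++i) {
        if (lastLocation(ranges[i]) >= limit) {
            return false;
        }
        for (std::size_t j = i + 1; j < N; ++j) {
            const bool disjoint = lastLocation(ranges[i]) < ranges[j].first ||
                                  lastLocation(ranges[j]) < ranges[i].first;
            if (!disjoint) {
                return false;
            }
        }
    }
    return true;
}

// Limit on my GPU: 128 components / 4 components per location.
constexpr std::uint32_t kMaxLocations = 128 / 4;

// Blocks 1, 2 and 3 with the start locations and counts listed above.
constexpr LocationRange kLayout[] = {{0, 6}, {6, 1}, {7, 24}};

static_assert(totalLocations(kLayout) == 31, "31 locations in total");
static_assert(lastLocation(kLayout[2]) == 30, "highest location used is 30");
static_assert(fitsWithoutOverlap(kLayout, kMaxLocations), "layout fits in 32 locations");
```

By this count the three blocks are tightly packed, do not overlap, and end at location 30, so the layout should be legal if my understanding of the rules is right.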
If I instead choose some semi-arbitrary locations, e.g.:
#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 20
#define BLOCK3_LOCATION 30
With these locations it does not throw any errors and works just fine, at least on an AMD card. On an Nvidia card I have to choose different locations again to get it to work. Obviously this is very fishy.
What could be causing this? Am I misunderstanding how the locations should be specified? Can they conflict with other shader data (Uniforms, push constants…)? Is there some sort of alignment that needs to be adhered to?