Understanding shader input/output locations

I have some trouble understanding how shader input/output locations work.
According to the specification, the maximum number of output locations you can have in a vertex shader is “maxVertexOutputComponents / 4” and the maximum number of input locations for a fragment shader is “maxFragmentInputComponents / 4”.
The maximum vertex output and fragment input component number for my GPU is 128, so I should have 32 locations available.

In my shaders I have some input/output blocks similar to this:
Vertex Shader:

#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 6
#define BLOCK3_LOCATION 7

struct DataBlock1
{
	vec3 v1;
	vec3 v2;
	vec3 v3;
	vec3 v4;
	vec2 v5;
	vec3 v6;
};
layout(location = BLOCK1_LOCATION) out DataBlock1 vs_out1;

layout(location = BLOCK2_LOCATION) out DataBlock2
{
	vec2 v1;
} vs_out2;

struct DataBlock3
{
	vec3 v1;
	vec3 v2;
	vec4 v3;
};

layout(location = BLOCK3_LOCATION) out DataBlock3 vs_out3[8];

Fragment Shader:

#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 6
#define BLOCK3_LOCATION 7

struct DataBlock1
{
	vec3 v1;
	vec3 v2;
	vec3 v3;
	vec3 v4;
	vec2 v5;
	vec3 v6;
};
layout(location = BLOCK1_LOCATION) in DataBlock1 fs_in1;

layout(location = BLOCK2_LOCATION) in DataBlock2
{
	vec2 v1;
} fs_in2;

struct DataBlock3
{
	vec3 v1;
	vec3 v2;
	vec4 v3;
};

layout(location = BLOCK3_LOCATION) in DataBlock3 fs_in3[8];

The specification says this about location consumption:

Inputs and outputs of the following types consume a single interface location:

32-bit scalar and vector types, and
64-bit scalar and 2-component vector types.
64-bit three- and four-component vectors consume two consecutive locations.

If a declared input or output is an array of size n and each element takes m locations, it will be assigned m × n consecutive locations starting with the location specified.

Since all of my types are either vec2, vec3 or vec4, they should each consume just 1 location, correct?

So, that means:
Block #1:

  • Start location: 0
  • 6 vectors = 6 locations are consumed

Block #2:

  • Start location: 6
  • 1 vector = 1 location is consumed

Block #3:

  • Start location: 7
  • 8 × 3 vectors = 24 locations are consumed

In total, 31 locations are consumed, which is just within the limits.
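
The tally above can be double-checked with a few lines of Python. This is only a sketch of the arithmetic, not part of the shaders: the helper names (`locations_for`, `block_range`) are made up for illustration, and it encodes nothing beyond the rules quoted from the specification (each 32-bit vector takes one location, an array of n elements takes m × n).

```python
# Hypothetical helpers: count interface locations using only the quoted rules.
MAX_COMPONENTS = 128                 # limit reported for this GPU (from the post)
MAX_LOCATIONS = MAX_COMPONENTS // 4  # 32 locations, numbered 0..31

def locations_for(members, array_size=1):
    """Locations consumed by a struct/block whose members are 32-bit vectors."""
    return len(members) * array_size

def block_range(start, members, array_size=1):
    """Inclusive (first, last) location occupied by a block starting at `start`."""
    count = locations_for(members, array_size)
    return start, start + count - 1

blocks = {
    "DataBlock1": block_range(0, ["vec3", "vec3", "vec3", "vec3", "vec2", "vec3"]),
    "DataBlock2": block_range(6, ["vec2"]),
    "DataBlock3": block_range(7, ["vec3", "vec3", "vec4"], array_size=8),
}

for name, (first, last) in blocks.items():
    print(f"{name}: locations {first}..{last}")

highest = max(last for _, last in blocks.values())
print(f"highest location used: {highest}, highest allowed: {MAX_LOCATIONS - 1}")
```

Under these assumptions the three blocks occupy locations 0..5, 6 and 7..30 without overlapping, so the highest location used is 30 out of an allowed 31.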
However, if I use these locations, ‘createGraphicsPipelines’ fails with an ‘eErrorInitializationFailed (-3)’ error. There are no warnings from the validation layers or from the glslang validator.
If I instead choose some semi-arbitrary locations, e.g.:


#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 20
#define BLOCK3_LOCATION 30

It does not throw any errors and works just fine, at least on an AMD card. On Nvidia I have to choose different locations to get it to work. Obviously this is very fishy.
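
For comparison, the counting rules quoted from the specification can be applied to these semi-arbitrary locations as well. The sketch below is Python used purely for the arithmetic; `last_location` and `over_limit` are illustrative names, and the per-element counts (6, 1 and 3 vectors) are taken from the block declarations above.

```python
MAX_LOCATIONS = 128 // 4  # 32 locations, numbered 0..31

# name -> (start location, locations per element, array size)
layout = {
    "BLOCK1": (0, 6, 1),
    "BLOCK2": (20, 1, 1),
    "BLOCK3": (30, 3, 8),
}

def last_location(start, per_element, array_size):
    """Highest location a block occupies under the m x n rule."""
    return start + per_element * array_size - 1

def over_limit(layout, max_locations):
    """Blocks whose last location falls outside 0..max_locations-1."""
    result = {}
    for name, spec in layout.items():
        last = last_location(*spec)
        if last >= max_locations:
            result[name] = last
    return result

print(over_limit(layout, MAX_LOCATIONS))  # {'BLOCK3': 53}
```

By this arithmetic the third block would run from location 30 to 53, well past the 32 locations the limit suggests. That does not explain why this layout is accepted, but it shows that the working configuration is not the one that fits the documented limit.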

What could be causing this? Am I misunderstanding how the locations should be specified? Can they conflict with other shader data (Uniforms, push constants…)? Is there some sort of alignment that needs to be adhered to?

The SPIR-V output of your code seems weird to me…

Have you tried an equivalent variation, e.g.:


#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 6
#define BLOCK3_LOCATION 7
 
struct DataBlock1
{
	vec3 v1;
	vec3 v2;
	vec3 v3;
	vec3 v4;
	vec2 v5;
	vec3 v6;
};

struct DataBlock2
{
	vec2 v1;
};

struct DataBlock3
{
	vec3 v1;
	vec3 v2;
	vec4 v3;
};

layout(location = 0) out DataBlock{
	DataBlock1 vs_out1;
	DataBlock2 vs_out2;
	DataBlock3 vs_out3[8];
} vs_out;

Also, there is a fix in the commit log from not long ago that seems to touch this. Try building glslangValidator yourself, fresh from the sources on GitHub.