Understanding shader input/output locations

Silverlan · July 17, 2016, 2:01am

I have some trouble understanding how shader input/output locations work.
According to the specification, the maximum number of output locations you can have in a vertex shader is “maxVertexOutputComponents / 4”, and the maximum number of input locations for a fragment shader is “maxFragmentInputComponents / 4”.
The maximum vertex output and fragment input component count for my GPU is 128, so I should have 32 locations available.

In my shaders I have some input/output blocks similar to this:
Vertex Shader:

#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 6
#define BLOCK3_LOCATION 7

struct DataBlock1
{
	vec3 v1;
	vec3 v2;
	vec3 v3;
	vec3 v4;
	vec2 v5;
	vec3 v6;
};
layout(location = BLOCK1_LOCATION) out DataBlock1 vs_out1;

layout(location = BLOCK2_LOCATION) out DataBlock2
{
	vec2 v1;
} vs_out2;

struct DataBlock3
{
	vec3 v1;
	vec3 v2;
	vec4 v3;
};

layout(location = BLOCK3_LOCATION) out DataBlock3 vs_out3[8];

Fragment Shader:

#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 6
#define BLOCK3_LOCATION 7

struct DataBlock1
{
	vec3 v1;
	vec3 v2;
	vec3 v3;
	vec3 v4;
	vec2 v5;
	vec3 v6;
};
layout(location = BLOCK1_LOCATION) in DataBlock1 fs_in1;

layout(location = BLOCK2_LOCATION) in DataBlock2
{
	vec2 v1;
} fs_in2;

struct DataBlock3
{
	vec3 v1;
	vec3 v2;
	vec4 v3;
};

layout(location = BLOCK3_LOCATION) in DataBlock3 fs_in3[8];

The specification says this about location consumption:

Inputs and outputs of the following types consume a single interface location:

32-bit scalar and vector types, and
64-bit scalar and 2-component vector types.
64-bit three- and four-component vectors consume two consecutive locations.

If a declared input or output is an array of size n and each element takes m locations, it will be assigned m × n consecutive locations starting with the location specified.
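
The consumption rule quoted above can be sketched as a small helper. This is an illustrative Python sketch, not part of the specification or of any Vulkan tooling: the function names `locations_per_element` and `locations_consumed` are hypothetical, and a type is described only by its bit width and component count.

```python
def locations_per_element(bit_width: int, components: int) -> int:
    """Locations taken by one scalar or vector, following the quoted rule."""
    if bit_width not in (32, 64) or not 1 <= components <= 4:
        raise ValueError("expected a 32- or 64-bit scalar or 2- to 4-component vector")
    # Only 64-bit three- and four-component vectors need a second location.
    return 2 if bit_width == 64 and components >= 3 else 1


def locations_consumed(bit_width: int, components: int, array_size: int = 1) -> int:
    """An array of n elements that each take m locations consumes m * n."""
    return locations_per_element(bit_width, components) * array_size


print(locations_consumed(32, 3))     # vec3     -> 1
print(locations_consumed(64, 4))     # dvec4    -> 2
print(locations_consumed(32, 4, 8))  # vec4[8]  -> 8
```

Under this reading, every vec2, vec3 and vec4 takes one location regardless of its component count.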

Since all of my types are either vec2, vec3 or vec4, they should each consume just 1 location, correct?

So, that means:
Block #1:

Start location: 0
6 vectors = 6 locations are consumed

Block #2:

Start location: 6
1 vector = 1 location is consumed

Block #3:

Start location: 7
8 × 3 vectors = 24 locations are consumed

In total, 31 locations are consumed (locations 0 through 30), which is just within the limit of 32.
However, if I use these locations, ‘createGraphicsPipelines’ fails with an ‘eErrorInitializationFailed (-3)’ error. There are no warnings from the validation layers or from the glslang validator.
If I instead choose some semi-arbitrary locations, e.g.:


#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 20
#define BLOCK3_LOCATION 30

With these locations it does not throw any errors and works just fine, at least on an AMD card. On Nvidia I have to choose different locations to get it to work. Obviously this is very fishy.
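
The location tally above, for both sets of start locations, can be checked with a few lines of arithmetic. This is a minimal Python sketch, assuming every member takes exactly one location as the quoted rule suggests; the helper names and the `BLOCKS` table are made up for illustration.

```python
MAX_COMPONENTS = 128                 # limit reported for the GPU in this post
MAX_LOCATIONS = MAX_COMPONENTS // 4  # 32 locations, indices 0..31

# (name, members per element, array size). Assumption: every member is a
# 32-bit vec2/vec3/vec4 and therefore takes exactly one location.
BLOCKS = [
    ("DataBlock1", 6, 1),
    ("DataBlock2", 1, 1),
    ("DataBlock3", 3, 8),
]


def location_ranges(starts):
    """Return (name, first, last) location for each block, given start locations."""
    return [
        (name, start, start + members * count - 1)
        for (name, members, count), start in zip(BLOCKS, starts)
    ]


def ranges_overlap(ranges):
    """True if any two blocks share a location."""
    ordered = sorted(ranges, key=lambda r: r[1])
    return any(prev[2] >= cur[1] for prev, cur in zip(ordered, ordered[1:]))


def fits(ranges):
    """True if no blocks overlap and every location index is below the limit."""
    highest = max(last for _, _, last in ranges)
    return not ranges_overlap(ranges) and highest < MAX_LOCATIONS


for starts in [(0, 6, 7), (0, 20, 30)]:
    ranges = location_ranges(starts)
    print(starts, ranges, "fits" if fits(ranges) else "does not fit")
```

By this count the start locations (0, 6, 7) occupy locations 0 to 30 and stay within the 32 available, while (0, 20, 30) would put the last array at locations 30 to 53, past the limit. That only reflects the arithmetic under the stated assumption; it does not by itself explain the driver behaviour.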

What could be causing this? Am I misunderstanding how the locations should be specified? Can they conflict with other shader data (uniforms, push constants, …)? Is there some sort of alignment that needs to be adhered to?

krOoze · July 18, 2016, 1:54pm

The SPIR-V output of your code seems weird to me…

Have you tried an equivalent variation, e.g.:


#define BLOCK1_LOCATION 0
#define BLOCK2_LOCATION 6
#define BLOCK3_LOCATION 7
 
struct DataBlock1
{
	vec3 v1;
	vec3 v2;
	vec3 v3;
	vec3 v4;
	vec2 v5;
	vec3 v6;
};

struct DataBlock2
{
	vec2 v1;
};

struct DataBlock3
{
	vec3 v1;
	vec3 v2;
	vec4 v3;
};

layout(location = 0) out DataBlock{
	DataBlock1 vs_out1;
	DataBlock2 vs_out2;
	DataBlock3 vs_out3[8];
} vs_out;

Also, the commit log shows a fix from not long ago that seems to touch this. Try building glslangValidator yourself, fresh from GitHub.