Class memory alignment/padding and Buffers

imported_EmJayJay · January 25, 2015, 1:16am

Quite simple. I have a class which contains one GLuint and one GLushort member variable and no other members except global constants. The size of the class is 8 bytes even though the members only add up to 6, because padding adds 2 empty bytes, probably before the GLushort (due to the 64-bit implementation, I think). Now I have a vector which will contain 262144 (2^18, 64³) objects of this class. I will put them in a buffer which could be read with the GL_UNSIGNED_SHORT type. Do I have to change the number of components read from 3 to 4 to take these 2 extra bytes of padding into account, in order to keep the memory reads coherent? I don’t do anything with those 2 extra bytes, but I guess they are still included, so I have to take them into account when defining a vertex attribute that reads from the buffer, right?

Osbios · January 25, 2015, 2:01am

In C/C++ you can simply change the packing alignment to 1 byte.

[code="CPP"]
#pragma pack(push) // save the current packing alignment
#pragma pack(1)    // pack members on 1-byte boundaries
class myClass
{
    GLuint x;
    GLushort y;
};
#pragma pack(pop)  // restore the saved alignment

myClass ca[262144];
[/code]


On 32-bit it would align to 4 bytes, so without this you would still get an 8-byte object.
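As a rough illustration of what the pragma changes, the sketch below compares the default and the packed layout. It assumes that std::uint32_t and std::uint16_t match GLuint and GLushort, and the struct names are made up for the example. The sizes checked here are what mainstream compilers produce on typical desktop targets; the pragma itself is a compiler extension.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-ins for GLuint / GLushort (assumed to be 32-bit and 16-bit unsigned).
using UInt = std::uint32_t;
using UShort = std::uint16_t;

// Default layout: the size is rounded up to a multiple of the strictest
// member alignment (4 bytes here), so 2 padding bytes are added.
struct PaddedVoxel {
    UInt x;
    UShort y;
};

#pragma pack(push, 1)   // save the current packing and switch to 1 byte
struct PackedVoxel {
    UInt x;
    UShort y;
};
#pragma pack(pop)       // restore the previous packing

static_assert(sizeof(PaddedVoxel) == 8, "4 + 2 bytes of data plus 2 bytes of padding");
static_assert(offsetof(PaddedVoxel, y) == 4, "with x declared first, the padding follows y");
static_assert(sizeof(PackedVoxel) == 6, "no padding when packed to 1 byte");
static_assert(alignof(PackedVoxel) == 1, "a packed struct may start at any address");
```

Because the packed struct has an alignment of 1, every second element of an array of it places x on an address that is not a multiple of 4.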

imported_EmJayJay · January 25, 2015, 3:00am

Can you tell me how to keep this effect within this class only? I used it and it worked, but as a side effect it started to affect something else: a string comparison now causes SIGSEGV because one of the string variables is <optimized out reference> during the comparison (it’s always the same variable; I tested it by swapping the sides of the comparison).

EDIT: I missed that #pragma pack(pop). My bad. Thanks still.

Alfonse_Reinheart · January 25, 2015, 7:19am

[QUOTE=Osbios]In C/C++ you can simply change the packing alignment to 1 byte.[/QUOTE]

Those pragmas are not part of either the C or C++ standards. That code only works on some compilers.

Granted, “some” in this case means VC++, GCC, Clang, and Intel CC. But it’s always good to know when you’re dealing with code outside the standard.
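For completeness, a standard-only way to get the tightly packed 6-byte layout is to copy the members into a byte buffer yourself. This is only a minimal sketch: the Voxel struct and the packVoxels function are hypothetical names, and std::uint32_t / std::uint16_t are assumed to match GLuint / GLushort.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical element type with the default (padded) layout.
struct Voxel {
    std::uint32_t x;
    std::uint16_t y;
};

// Bytes of actual data per element: 4 + 2 = 6.
constexpr std::size_t kPackedSize = sizeof(std::uint32_t) + sizeof(std::uint16_t);

// Copy each member into a tightly packed byte array, skipping the padding.
// Byte order is whatever the host uses, as it would be with #pragma pack.
std::vector<unsigned char> packVoxels(const std::vector<Voxel>& voxels) {
    std::vector<unsigned char> bytes(voxels.size() * kPackedSize);
    unsigned char* out = bytes.data();
    for (const Voxel& v : voxels) {
        std::memcpy(out, &v.x, sizeof v.x);
        out += sizeof v.x;
        std::memcpy(out, &v.y, sizeof v.y);
        out += sizeof v.y;
    }
    // Every byte of the output must have been written exactly once.
    assert(out == bytes.data() + bytes.size());
    return bytes;
}
```

The CPU-side objects stay naturally aligned this way; only the copy that is handed to the buffer is packed.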

GClements · January 30, 2015, 10:50pm

The padding would typically be the same on a 32-bit system.

The GLuint typically needs to be aligned to a 32-bit boundary, either as a hard requirement or to avoid performance penalties. Some CPUs (in fact, most CPUs except x86/x86-64) will either generate an exception or return bad data for unaligned reads (the compiler can work around this, but the performance impact is generally unacceptable). Even if the CPU can handle unaligned reads, there’s a performance cost as the field can straddle cache lines and even page boundaries.

[QUOTE=imported_EmJayJay]Now I have a vector which will contain 262144 (2^18, 64³) objects of this class. I will put them in a buffer which could be read with the GL_UNSIGNED_SHORT type. Do I have to change the number of components read from 3 to 4 to take these 2 extra bytes of padding into account, in order to keep the memory reads coherent? I don’t do anything with those 2 extra bytes, but I guess they are still included, so I have to take them into account when defining a vertex attribute that reads from the buffer, right?[/QUOTE]

The buffer size will need to be 8 bytes per structure, and the stride parameter to glVertexAttribPointer will need to be 8.
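To make those numbers concrete, here is a small sketch of how the stride and buffer size could be derived from the struct itself rather than hard-coded. The Voxel struct and the constant names are made up for the example, std::uint32_t / std::uint16_t are assumed to match GLuint / GLushort, and the GL calls are only shown as comments because they need a live context.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the class in question.
struct Voxel {
    std::uint32_t x;
    std::uint16_t y;
    // the compiler adds 2 bytes of tail padding here
};

constexpr std::size_t kVoxelCount = 262144;                  // 64 * 64 * 64
constexpr std::size_t kStride = sizeof(Voxel);               // 8, padding included
constexpr std::size_t kBufferBytes = kVoxelCount * kStride;  // 2097152 bytes (2 MiB)

static_assert(kStride == 8, "layout assumed by the calls sketched below");
static_assert(offsetof(Voxel, x) == 0, "x starts the struct");
static_assert(offsetof(Voxel, y) == 4, "y follows x; the padding comes after y");

// With a GL context and the buffer bound to GL_ARRAY_BUFFER, the upload and
// attribute setup would look roughly like this (attribute index 0 is arbitrary,
// and voxels is assumed to be a std::vector<Voxel>):
//
//   glBufferData(GL_ARRAY_BUFFER, kBufferBytes, voxels.data(), GL_STATIC_DRAW);
//   glVertexAttribPointer(0, 4, GL_UNSIGNED_SHORT, GL_FALSE,
//                         kStride, (const void*)0);
//
// Reading 4 unsigned shorts per element covers all 8 bytes: the first two are
// the two halves of x, the third is y, and the fourth is the unused padding.
```

If the shader input is an integer type, glVertexAttribIPointer takes the same stride and offset but leaves the values unconverted.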

I suggest that you don’t follow Osbios’ advice and ask the compiler to pack the structure. The difference in memory usage (2 MB versus 1.5 MB) isn’t enough to justify the performance hit. If the array was large enough that the memory consumed by padding mattered, you would be better off replacing your array of structs with a pair of arrays (one of GLuint, one of GLushort) so that the GLuint values remain correctly aligned (for both the CPU and GPU).
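A pair of arrays could be sketched along these lines. VoxelArrays and its members are hypothetical names, and std::uint32_t / std::uint16_t are again assumed to match GLuint / GLushort.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Structure-of-arrays layout: one tightly packed array per member, so every
// x stays 4-byte aligned and no padding is stored between elements.
struct VoxelArrays {
    std::vector<std::uint32_t> xs;
    std::vector<std::uint16_t> ys;

    explicit VoxelArrays(std::size_t count) : xs(count), ys(count) {}

    // Write both members of element i.
    void set(std::size_t i, std::uint32_t x, std::uint16_t y) {
        assert(i < xs.size());
        xs[i] = x;
        ys[i] = y;
    }

    // Total payload in bytes: 4 + 2 per element.
    std::size_t payloadBytes() const {
        return xs.size() * sizeof(std::uint32_t) + ys.size() * sizeof(std::uint16_t);
    }
};
```

For 262144 elements this comes to 1572864 bytes (1.5 MB) instead of 2 MB. Each array would then go into its own buffer (or its own region of one buffer) and get its own attribute, with a stride of 4 and 2 bytes respectively.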