I am John, and new to this forum. I want to start by saying, I love OpenGL ES. There are few things that the more I use, the more I fall in love with; C and OpenGL ES are among them.
Here is my question:
I have been running a lot of tests to decide how I should set up the base rendering engine. Right now I am not using buffers, but I have been considering them for quite a while. I have found that glMapBufferOES is slower than glBufferSubData, which in turn is slower than glBufferData, and I am wondering why that is.
My test environment: OpenGL ES 1.1
iPad (first generation) - PowerVR SGX
iPhone 3G - PowerVR MBX
To my understanding, glBufferSubData should always be faster than glBufferData, because glBufferData reallocates the memory each time it is called. So if your size doesn’t change, use glBufferSubData; otherwise use glBufferData. What I have found is that glBufferSubData runs at about 68% of the speed of using only glBufferData.
I also understand that glMapBufferOES is an extension, but I have found that it runs slower than both glBufferSubData and glBufferData. It is, in fact, the slowest way of updating vertex information.
There might be inefficiencies in the implementation of glMapBufferOES and glBufferSubData on the iPad. If things are optimal I would expect:
glMapBufferOES ≥ glBufferSubData ≥ glBufferData (fastest first)
Note that some drivers optimize uploads done with glBufferData when the size is the same as the previous one, which allows them to skip the re-allocation. You’re likely hitting that optimized case. Otherwise, it would be much slower than glBufferSubData and glMapBufferOES. You can try removing or adding one vertex each frame, and I bet you’d see a difference.
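To make the same-size optimization concrete, here is a toy model in plain C. It is only a sketch under assumptions: `ToyBuffer`, `toy_buffer_data` and `toy_buffer_sub_data` are made-up names for illustration, and nothing here reflects how the PowerVR driver is actually written. It just shows why a glBufferData-style call with an unchanged size could end up costing no more than a glBufferSubData-style call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of a driver-side buffer object. Hypothetical: this is NOT the
   real driver, only an illustration of the optimization described above. */
typedef struct {
    unsigned char *storage;  /* backing memory owned by the "driver" */
    size_t size;             /* current size of the data store in bytes */
    unsigned reallocs;       /* how many times storage was (re)allocated */
} ToyBuffer;

/* Models a glBufferData-style call: replaces the whole data store.
   The assumed optimization: keep the existing allocation when the
   requested size is unchanged, so only the copy remains.
   Returns 0 on success, -1 if allocation fails. */
static int toy_buffer_data(ToyBuffer *b, size_t size, const void *data)
{
    if (b->storage == NULL || b->size != size) {
        unsigned char *fresh = malloc(size ? size : 1);
        if (fresh == NULL)
            return -1;
        free(b->storage);
        b->storage = fresh;
        b->size = size;
        b->reallocs++;
    }
    if (data != NULL)
        memcpy(b->storage, data, size);
    return 0;
}

/* Models a glBufferSubData-style call: never reallocates, only
   overwrites a byte range inside the existing data store. */
static void toy_buffer_sub_data(ToyBuffer *b, size_t offset, size_t size,
                                const void *data)
{
    assert(b->storage != NULL);
    assert(offset <= b->size && size <= b->size - offset);
    memcpy(b->storage + offset, data, size);
}
```

In this model, calling `toy_buffer_data` repeatedly with the same size leaves `reallocs` at 1, while changing the size by even one vertex bumps it on every call. That is the difference the add-or-remove-a-vertex experiment is meant to expose.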
I took your suggestion and randomized the quantity of data I sent to OpenGL (seeding prior to each test, of course). Unfortunately, I came up with the same results.
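For anyone who wants to reproduce this, the seeding step could look roughly like the sketch below. It is not the actual test code: `fill_vertex_counts` is a hypothetical helper, and the range of vertex counts is an assumption. The point is only that re-seeding with the same value before each method's run hands every method the same sequence of sizes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: fills counts[0..frames-1] with pseudo-random vertex
   counts in [min_count, max_count]. Re-seeding with the same seed before
   each method's run reproduces the same sequence on the same platform. */
static void fill_vertex_counts(unsigned seed, int min_count, int max_count,
                               int *counts, int frames)
{
    assert(min_count >= 0 && min_count <= max_count);
    srand(seed);
    for (int i = 0; i < frames; i++) {
        /* Modulo bias is acceptable here; only repeatability matters. */
        counts[i] = min_count + rand() % (max_count - min_count + 1);
    }
}
```

Each upload method would then be timed against the same `counts` array, so any difference in frame rate comes from the upload path and not from the amount of data.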
Note: The values below are Frames Per Second that were averaged over 300 tests.
Key:
Upload method: I = Not buffered, M = glMapBufferOES, S = glBufferSubData, V = glBufferData
Primitive: P = Points, Q = Quad
Texturing: N = Not textured, Y = Textured
This should give glBufferSubData an advantage on half of the tests (unless glBufferData is optimized as jpilon mentioned). Even when the full buffer is updated, though, shouldn’t it be at least as fast?
If you want, I can post the code, I don’t mind sharing :-).