OpenGL ES rendering

Hello everyone,

I have some questions about shader usage with OpenGL ES 2.0 and 3.0:
- Can we execute more than one shader at a time?
- Mobile device GPUs have many cores; can we address these cores directly? I mean, execute a shader on each GPU core.
- Are we sure that only one shader can run at a time on one GPU core?
- Can a running shader be preempted by another shader?

Is there a way to debug GLSL code running in a shader?

Thanks.

[QUOTE=ulrich;37800]Hello everyone,

I have some questions about shader usage with OpenGL ES 2.0 and 3.0.[/QUOTE]

The only shader type that you explicitly “execute” is a compute shader, available in ES 3.1 and above. All other shaders are executed as an effect of the process of rendering, and they are executed only to serve a particular purpose in the rendering pipeline.

As such, the frequency of execution of these shaders is driven purely by the geometry being rendered. If you send 10 vertices, you get 10 vertex shader invocations (possibly fewer if some vertices are reused). If a particular triangle covers 20 pixels, you get roughly 20 fragment shader invocations for that triangle (again, give or take; it’s complicated).
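As a rough illustration (just a sketch; it assumes a program and vertex data are already bound, and the exact counts are up to the implementation):

[code]
#include <GLES3/gl3.h>

/* Assumes a linked program and vertex data for 10 vertices are already bound. */
void draw_ten_vertices(void)
{
    /* Roughly one vertex shader invocation per vertex sent, and one fragment
       shader invocation per covered sample of each resulting triangle. */
    glDrawArrays(GL_TRIANGLES, 0, 10);
}
[/code]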

Because of all this, questions about how you can explicitly execute a shader can only apply to compute shaders. So I will only answer them with regard to those.
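For concreteness, here is a minimal sketch of what explicitly running a compute shader looks like in ES 3.1. This is illustrative only: it assumes a valid ES 3.1 context, and all error checking is omitted.

[code]
#include <GLES3/gl31.h>

/* A trivial compute shader: doubles every element of an SSBO. */
static const char *cs_src =
    "#version 310 es\n"
    "precision highp float;\n"
    "layout(local_size_x = 64) in;\n"
    "layout(std430, binding = 0) buffer Data { float values[]; };\n"
    "void main() {\n"
    "    uint i = gl_GlobalInvocationID.x;\n"
    "    if (i < uint(values.length())) values[i] *= 2.0;\n"
    "}\n";

GLuint make_compute_program(void)
{
    GLuint shader  = glCreateShader(GL_COMPUTE_SHADER);
    GLuint program = glCreateProgram();
    glShaderSource(shader, 1, &cs_src, NULL);
    glCompileShader(shader);
    glAttachShader(program, shader);
    glLinkProgram(program);
    glDeleteShader(shader);
    return program;
}

void run_compute(GLuint program, GLuint ssbo, GLuint num_elements)
{
    glUseProgram(program);
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo);
    /* Each work group covers 64 invocations (local_size_x above). */
    glDispatchCompute((num_elements + 63u) / 64u, 1, 1);
}
[/code]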

[QUOTE=ulrich;37800]Can we execute more than one shader at a time?[/QUOTE]

That’s difficult to answer.

OpenGL ES operates under an in-order, coherent execution model. If you issue a rendering command, you can immediately attempt to read the results back on the CPU, and you will get those results if you try. The OpenGL ES abstraction requires this to work; it requires that data one command writes is visible to operations from a subsequent command (note: it does not require this to work fast, so please don’t actually try to read results immediately after sending off that work).
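For example (a sketch, assuming a complete framebuffer, program, and vertex data are already set up):

[code]
#include <GLES3/gl3.h>

void draw_and_read_back(void)
{
    GLubyte pixel[4];

    glDrawArrays(GL_TRIANGLES, 0, 3);

    /* Legal under the coherent, in-order model: this returns the pixels the
       draw above produced, but it stalls the CPU until the GPU has finished. */
    glReadPixels(0, 0, 1, 1, GL_RGBA, GL_UNSIGNED_BYTE, pixel);
}
[/code]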

However, ES 3.1 introduces the desktop OpenGL concept of image load/store. As a companion to this is an incoherent memory model for such operations. This means that writes to images via image load/store are not necessarily visible to later commands simply because they were issued later. You have to do something special to explicitly make them visible.

And the only way for a compute shader to write any results is via image load/store or SSBOs (which use the same memory model). So in effect, all compute shaders operate under this incoherent model.
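In practice, that “something special” is glMemoryBarrier(). A hedged sketch (the program handle and work-group count are placeholders):

[code]
#include <GLES3/gl31.h>

/* After a dispatch that writes through SSBOs or images, a memory barrier is
   required before a later command can safely read those writes. */
void dispatch_then_consume(GLuint program, GLuint group_count)
{
    glUseProgram(program);
    glDispatchCompute(group_count, 1, 1);

    /* Without this, a subsequent draw or dispatch is not guaranteed to see
       the compute shader's SSBO / image writes. */
    glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT |
                    GL_SHADER_IMAGE_ACCESS_BARRIER_BIT);

    /* ...issue the draw or dispatch that reads the results here... */
}
[/code]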

Therefore, if you have two compute shader operations that are completely independent of each other (one isn’t reading from data being written by another), it is entirely possible for the OpenGL ES implementation to run both concurrently. So long as the first one “completes” before the second, you won’t be able to tell.

But that’s the key point: you can’t tell. You can’t force ES to do it, and you can’t detect when ES does it on its own.

Generally speaking, what you should do is issue compute work to ES using the fewest memory barriers and synchronization operations you can, and let the ES implementation figure out how to use the GPU most effectively.
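For instance, two independent compute passes need no barrier between themselves, only one barrier before whatever consumes their output (again a sketch; all handles are placeholders):

[code]
#include <GLES3/gl31.h>

/* Two unrelated compute passes writing to different SSBOs. Since neither
   reads the other's output, no barrier separates them, and the driver is
   free to overlap their execution. */
void run_independent_passes(GLuint prog_a, GLuint prog_b,
                            GLuint ssbo_a, GLuint ssbo_b,
                            GLuint groups_a, GLuint groups_b)
{
    glUseProgram(prog_a);
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo_a);
    glDispatchCompute(groups_a, 1, 1);

    glUseProgram(prog_b);
    glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, ssbo_b);
    glDispatchCompute(groups_b, 1, 1);

    /* One barrier, issued before whatever reads ssbo_a and ssbo_b. */
    glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
}
[/code]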

[QUOTE=ulrich;37800]Mobile device GPUs have many cores; can we address these cores directly? I mean, execute a shader on each GPU core.[/QUOTE]

As previously stated, no. You cannot force OpenGL ES to execute compute shaders on specific sections of the hardware. It is up to the implementation to decide how many resources to allocate to any particular compute operation.

[QUOTE=ulrich;37800]Are we sure that only one shader can run at a time on one GPU core?[/QUOTE]

What you call a “GPU core” is a hardware unit that contains the program code for a particular GPU operation, a register file, several relatively large blocks of memory, and various other things. Each “core” can itself execute multiple separate shader invocations simultaneously via SIMD (single instruction, multiple data): each invocation executes the same instructions, but on different input data, producing different results that are stored to different memory locations.

So while one “GPU core” can only be running a single shader at a time, that core will actually be computing anywhere from 4 to 32 separate invocations of it (the exact width is hardware-dependent).

So the answer to your question depends on what you mean by “one shader”.

[QUOTE=ulrich;37800]Can a running shader be preempted by another shader?[/QUOTE]

You’re now asking questions about the specifics of a particular ES implementation, which can only be answered by the people who wrote it.

OpenGL ES does not forbid an implementation from preempting the execution of one compute shader with another. All it requires is that the first compute shader operation complete before the later one. As long as the ES implementation complies with that, it can do whatever it wants behind the scenes.

Even so, it’s really not something you should be worried about. By the nature of how shader invocations work, it’s generally cheaper to continue executing the same code than to start up a whole new operation on the same compute unit (the cost of task switching is huge in shaders, relative to their usual length). So while it is possible, it’s not a practical reality most of the time.

[QUOTE=ulrich;37800]Is there a way to debug GLSL code running in a shader?[/QUOTE]

Google may be able to help you find tools for debugging OpenGL ES (such tools typically include at least some form of GLSL debugger).


Thank you for taking the time to reply to my questions; I appreciate the detailed explanation.
From your answers, I understand that OpenGL ES 3.0 manages all the resources needed to use the GPU as effectively as possible: the driver parallelizes shaders into threads and maps them somehow onto the GPU cores.
With OpenGL ES 3.1, do you know if we can manage GPU cores the way OpenCL or CUDA do with their kernels? I know that OpenGL ES 3.1 is still a fresh development.
I have another question: in OpenGL ES 3.0, is there a way (if it’s possible at all) to share data between fragment shaders other than through textures?
I mean from one rendering pass (vertex + fragment shader) to another.
Thanks.

[QUOTE=ulrich]With OpenGL ES 3.1, do you know if we can manage GPU cores the way OpenCL or CUDA do with their kernels? I know that OpenGL ES 3.1 is still a fresh development.[/QUOTE]

ES 3.1 is a minor expansion of 3.0 that brings it closer to desktop OpenGL, nothing more. For lower-level control you should wait for Vulkan, but even there you will have to rely on the GPU’s scheduler to distribute the workload among its cores.
