Questions about minimizing draw calls and rendering system design in general

sage12 · January 24, 2019, 6:15pm

Hi. I’m currently developing a 2D/3D indie game (an isometric game with 3D character models and 2D levels) on my own engine, and now I’m thinking about minimizing draw calls.
For level design we use a level editor called Tiled. The average level consists of 6-20 layers, each representing a part of the floor, roof and so on.
When I load a level, my engine creates a tileset atlas (the level’s tileset images glued together), then I build a vertex mesh for each layer (each tile is represented by 4 vertices, and each vertex contains a texture coordinate into the tileset atlas). Some of the 2D stuff (e.g. walls) also has a dedicated normal map (stored in another texture atlas), which is used to give the player an impression of volume when the object is lit by a light source.
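A minimal sketch of the per-layer mesh build described above, written under several assumptions that are not from the post: the atlas is a regular grid of equally sized tiles, a negative tile index marks an empty cell, and the names `TileVertex`, `AtlasGrid` and `buildLayerMesh` are hypothetical. It only shows how 4 vertices with atlas texture coordinates could be emitted per tile; it is not the engine's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: corner position plus atlas texture coordinate.
struct TileVertex {
    float x, y;  // corner position, in tile units
    float u, v;  // texture coordinate inside the tileset atlas, 0..1
};

// Assumed atlas description: a regular grid of equally sized tiles.
struct AtlasGrid {
    int columns;  // tiles per atlas row
    int rows;     // tiles per atlas column
};

// Emits 4 vertices per non-empty tile. `tiles` holds one atlas tile index per
// cell in row-major order; a negative index marks an empty cell (assumption).
std::vector<TileVertex> buildLayerMesh(const std::vector<int>& tiles,
                                       int layerWidth, int layerHeight,
                                       const AtlasGrid& atlas) {
    assert(layerWidth > 0 && layerHeight > 0);
    assert(tiles.size() == static_cast<std::size_t>(layerWidth) * layerHeight);
    assert(atlas.columns > 0 && atlas.rows > 0);

    const float du = 1.0f / atlas.columns;  // width of one tile in UV space
    const float dv = 1.0f / atlas.rows;     // height of one tile in UV space

    std::vector<TileVertex> mesh;
    mesh.reserve(tiles.size() * 4);
    for (int ty = 0; ty < layerHeight; ++ty) {
        for (int tx = 0; tx < layerWidth; ++tx) {
            const int tile = tiles[ty * layerWidth + tx];
            if (tile < 0) continue;  // empty cell, nothing to draw
            assert(tile < atlas.columns * atlas.rows);
            const float u0 = (tile % atlas.columns) * du;
            const float v0 = (tile / atlas.columns) * dv;
            const float x0 = static_cast<float>(tx);
            const float y0 = static_cast<float>(ty);
            mesh.push_back({x0,        y0,        u0,      v0});
            mesh.push_back({x0 + 1.0f, y0,        u0 + du, v0});
            mesh.push_back({x0 + 1.0f, y0 + 1.0f, u0 + du, v0 + dv});
            mesh.push_back({x0,        y0 + 1.0f, u0,      v0 + dv});
        }
    }
    return mesh;
}
```

For a 2x2 layer with one empty cell, this produces 12 vertices; the result could then be uploaded as one contiguous range of the shared VBO.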

Thus, I also need a normal map atlas (made up of smaller normal map images, one for each regular tileset). All the 2D objects in the engine share the same vertex data type and are stored in one VBO (even the ones without normal maps; for such sprites the corresponding area of the normal map atlas is left transparent). All the interactive/movable 2D objects that are not part of the level geometry are also in this VBO. All the 2D stuff uses one shader, in which a series of if…else statements decides how exactly the current primitive should be rendered (I use the QUADS primitive for 2D objects). The world-space position of each 2D object is set as a uniform variable.

The 3D models are animated (which means I need to set animation matrices in the shader as uniforms). The textures of the models are placed in a similar texture atlas while the level is loading (there are around 10 different model textures per level). All of the models are stored in one big VBO and use one shader that works similarly to my 2D shader: a bunch of if…else statements inside the shader (together with the model’s material) decides how exactly to draw an object.

I keep all of the mentioned texture atlases in GPU texture memory at once, so I don’t need to change texture units when drawing different models or upload other textures to GPU memory.

So, I have several questions:

1. First off, is my rendering system design any good?
2. Sometimes I have multiple characters that use the same mesh. Is it better to store several identical copies in the VBO, or to draw one buffer multiple times? (I’ve heard about instancing and I don’t want to use it, because I regularly have 2 to 10 identical objects, not hundreds.)
3. Currently I draw each object (i.e. one 3D model, level layer or 2D sprite) with a single draw call, but I want to minimize draw calls. The best-case scenario is to have only two draw calls per frame: one for the 2D stuff and one for 3D. The best idea I’ve come up with is to pass two uniform arrays to the shader, holding the position and orientation of each object, and build the transformation matrices directly in the shader. But in that case I don’t know how to identify the right object in the shader (except, maybe, by having an integer in the vertex data struct of the meshes that holds the index of the current object in the uniform array, but this means a lot of repeated data in the VBO). Another idea is to have an additional VBO with the position of each object, but then I need a VBO entry for EVERY vertex I render, so, again, it means (without instancing) a hell of a lot of repeated data.
4. Maybe I just worry too much about draw calls? Usually I have around 30-70 objects per screen, not thousands. Maybe I should leave it as it is, with a single draw call per object?

Please, give me some advice.

Dark_Photon · January 25, 2019, 5:37am

First, the goal is really about meeting some performance target, not minimizing draw calls. What is your performance target? Where are you now relative to that target?

As with all profiling and optimizing, first determine if you’ve met your target with reasonable buffer to spare. If you have, stop. If not, profile to locate the biggest bottleneck with your rendering and optimize that. You’ll get the biggest bang for the buck (most perf gain for the least amount of your time).

If you’ve determined that you are CPU limited on GPU dispatch, then consider minimizing state changes first, and then minimizing draw calls.
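As a rough CPU-side illustration of "minimizing state changes first", the sketch below sorts a draw list so that draws sharing a shader and texture become adjacent, then counts how many binds a renderer would issue if it rebinds only on change. The `DrawItem` record, the function names, and the assumption that shader switches are costlier than texture switches are all hypothetical; a real engine would use its own state keys and measure the actual costs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical per-draw record; the ids stand in for real GL object names.
struct DrawItem {
    std::uint32_t shader;
    std::uint32_t texture;
    std::uint32_t mesh;
};

// Orders draws so that items sharing state end up next to each other.
// Assumption: shader switches cost more than texture switches, so the
// shader id is the most significant part of the sort key.
void sortByState(std::vector<DrawItem>& items) {
    std::stable_sort(items.begin(), items.end(),
                     [](const DrawItem& a, const DrawItem& b) {
                         return std::tie(a.shader, a.texture, a.mesh) <
                                std::tie(b.shader, b.texture, b.mesh);
                     });
}

// Counts the shader and texture binds issued when walking the list in
// order and rebinding only when the value differs from the previous draw.
int countStateChanges(const std::vector<DrawItem>& items) {
    int changes = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i == 0 || items[i].shader != items[i - 1].shader) ++changes;
        if (i == 0 || items[i].texture != items[i - 1].texture) ++changes;
    }
    return changes;
}
```

With a single shader and shared atlases, as in the setup described in the first post, this count is already close to its minimum, which is one more reason to profile before restructuring anything.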

As for your specific question about reducing draw calls: there have been a few recent threads that you might want to check out. For instance:

- Setting up Batch Rendering with Scene Graph
- Problem with offsetting gl_DrawID
- GPU vertex dispatch for MultiDrawIndirect and/or Instanced draw calls

MDI (multi-draw indirect) and/or geometry instancing can be great solutions for reducing draw calls, unless you have trivially few vertices per MDI sub-draw or per instance.

However, if your instances or MDI sub-draws do have very few vertices, then you can do partial or full pseudo-instancing (inlining a number of instances end-to-end in your vertex arrays) so that there is more vertex work running at the same time to fill up the pipeline. Mesh shaders are reportedly also an option to avoid this inefficiency with small models, but they’re only available on the newest NVIDIA GPUs.
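A minimal sketch of full pseudo-instancing on the CPU side, assuming a simple position-only vertex and hypothetical names (`MeshVertex`, `BatchedVertex`, `buildPseudoInstancedBatch`). Each copy of the mesh is appended end-to-end, its indices are rebased, and every vertex carries the index of the object it belongs to so a vertex shader could look up that object's transform in an array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical source vertex: position only, to keep the sketch short.
struct MeshVertex {
    float x, y, z;
};

// Vertex as stored in the batched buffer: the original data plus the index
// of the object (copy) it belongs to.
struct BatchedVertex {
    float x, y, z;
    std::uint32_t objectIndex;  // index into a per-object transform array
};

struct PseudoInstancedBatch {
    std::vector<BatchedVertex> vertices;
    std::vector<std::uint32_t> indices;
};

// Inlines `copies` instances of one indexed mesh end-to-end, so the whole
// batch can be drawn with a single indexed draw call.
PseudoInstancedBatch buildPseudoInstancedBatch(
        const std::vector<MeshVertex>& meshVertices,
        const std::vector<std::uint32_t>& meshIndices,
        std::uint32_t copies) {
    PseudoInstancedBatch batch;
    batch.vertices.reserve(meshVertices.size() * copies);
    batch.indices.reserve(meshIndices.size() * copies);
    for (std::uint32_t c = 0; c < copies; ++c) {
        // First vertex of this copy inside the batched vertex array.
        const auto base = static_cast<std::uint32_t>(batch.vertices.size());
        for (const MeshVertex& v : meshVertices) {
            batch.vertices.push_back({v.x, v.y, v.z, c});
        }
        for (std::uint32_t i : meshIndices) {
            assert(i < meshVertices.size());
            batch.indices.push_back(base + i);  // rebase into this copy
        }
    }
    return batch;
}
```

The cost is one extra integer per vertex and a duplicated mesh per copy, which is the data repetition raised in the question; for small meshes and a handful of copies that overhead is usually modest, but it is worth comparing against real instancing on the target GPUs.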

To your questions:

#1: Sounds like you’re well positioned to minimize draw calls.

#2: Don’t duplicate if you don’t have to. Unless your meshes are trivially simple, try rendering them with GPU instancing first. To ensure you’re not missing out on perf, try rendering them with pseudo-instancing as well and compare frame time consumption between them on the target GPUs you care about.

#3: See those threads. In particular use of gl_DrawID or other user-defined metadata passed into the shader (e.g. in base instance or other vtx attrib). Re obj position, what you need is a transform per object. Then you only need 1 entry for each instance rather than each vertex.

#4: Maybe. I’d back up and consider what your larger goal is that got you even caring about draw calls and why. Then re-evaluate whether draw calls are what you want to focus on first.
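To make the "one entry per instance" point concrete, here is a hedged sketch that groups scene objects by mesh and emits one indirect draw command per mesh plus a tightly packed per-instance array. The `DrawElementsIndirectCommand` struct mirrors the field layout OpenGL defines for indexed indirect draws; everything else (`MeshRange`, `SceneObject`, `InstanceData`, `buildIndirectBatch`, and using a bare position instead of a full transform) is a hypothetical simplification. No GL calls are made here; the intent is that the commands would be submitted with glMultiDrawElementsIndirect and the instance array bound as a per-instance attribute (divisor 1), which the base instance offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same field order and sizes as OpenGL's indexed indirect draw command.
struct DrawElementsIndirectCommand {
    std::uint32_t count;          // number of indices in the mesh
    std::uint32_t instanceCount;  // how many objects use this mesh
    std::uint32_t firstIndex;     // offset into the shared index buffer
    std::int32_t  baseVertex;     // offset into the shared vertex buffer
    std::uint32_t baseInstance;   // first slot in the per-instance array
};
static_assert(sizeof(DrawElementsIndirectCommand) == 20, "unexpected padding");

// Hypothetical: where one mesh lives inside the big shared buffers.
struct MeshRange {
    std::uint32_t indexCount;
    std::uint32_t firstIndex;
    std::int32_t  baseVertex;
};

// Hypothetical scene object; a real one would carry a full transform.
struct SceneObject {
    std::uint32_t meshId;
    float x, y, z;
};

struct InstanceData {
    float x, y, z;  // one entry per object, not per vertex
};

struct IndirectBatch {
    std::vector<DrawElementsIndirectCommand> commands;
    std::vector<InstanceData> instances;
};

IndirectBatch buildIndirectBatch(const std::vector<MeshRange>& meshes,
                                 const std::vector<SceneObject>& objects) {
    // Count how many objects reference each mesh.
    std::vector<std::uint32_t> counts(meshes.size(), 0);
    for (const SceneObject& obj : objects) {
        assert(obj.meshId < meshes.size());
        ++counts[obj.meshId];
    }

    // One command per mesh in use; baseInstance is a running prefix sum.
    IndirectBatch batch;
    std::vector<std::uint32_t> cursor(meshes.size(), 0);
    std::uint32_t running = 0;
    for (std::size_t m = 0; m < meshes.size(); ++m) {
        cursor[m] = running;
        if (counts[m] > 0) {
            batch.commands.push_back({meshes[m].indexCount, counts[m],
                                      meshes[m].firstIndex,
                                      meshes[m].baseVertex, running});
        }
        running += counts[m];
    }

    // Scatter each object's data into the slot range owned by its mesh.
    batch.instances.resize(objects.size());
    for (const SceneObject& obj : objects) {
        batch.instances[cursor[obj.meshId]++] = {obj.x, obj.y, obj.z};
    }
    return batch;
}
```

With 30-70 objects per screen the instance array stays tiny compared with per-vertex duplication, and the number of commands equals the number of distinct meshes in use rather than the number of objects.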