Render vegetation as quads - What render method

I want to render vegetation - trees, grass patches - using textured quads. I guess I would render no more than like 15K such quads per frame. What method do I use for drawing them if I want best performance? (OpenGL 3.3)

-use the geometry shader, storing 1 position and 2 vectors for orientation for each quad
-use non-indexed drawing: either store 6 positions per quad or use triangle strips with 4 + 1 points (for degenerate triangles)
-use indexed drawing, which I think would be faster than non-indexed because of the vertex cache, but I also need to store the indices (either 6 per quad, or 4 plus 1 primitive-restart index with strips)

(+I also need to store the texture and other data in each case)

I’ve seen game engines where things like grass patches always face the camera, as if they were point sprites or something, but I don’t know much about that.
Maybe trees that are closer to the camera and have many such quads should be rendered as instances?
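
For reference, camera-facing quads (billboards) are commonly built by offsetting a center point along the camera's right and up vectors. The following is only a minimal sketch of that idea; `Vec3`, `billboardCorners` and the corner order are hypothetical names and choices made for illustration, and the camera vectors are assumed to be unit length and given in world space.

```cpp
#include <array>

// Hypothetical minimal vector type, for illustration only.
struct Vec3 { float x, y, z; };

inline Vec3 add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
inline Vec3 scale(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }

// Returns the four corners of a quad centered at `center` that faces the camera.
// Order: bottom-left, bottom-right, top-right, top-left.
// Assumes camRight and camUp are unit-length world-space camera axes.
std::array<Vec3, 4> billboardCorners(Vec3 center, Vec3 camRight, Vec3 camUp,
                                     float width, float height) {
    const Vec3 r = scale(camRight, width * 0.5f);   // half-extent to the right
    const Vec3 u = scale(camUp, height * 0.5f);     // half-extent upward
    return {{
        add(add(center, scale(r, -1.0f)), scale(u, -1.0f)),
        add(add(center, r), scale(u, -1.0f)),
        add(add(center, r), u),
        add(add(center, scale(r, -1.0f)), u),
    }};
}
```

The same offsets could be applied in a vertex or geometry shader instead of on the CPU; which is preferable depends on the rest of the pipeline.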

Yes, go for instancing.
Here is a tutorial that covers it.

EDIT: I’m not sure I understand your last paragraph. It seems you already know instancing, so maybe I missed your whole point…

EDIT2: I think now I understand.
So, use a vertex attribute to set the size of the quad; that way you can change the size per quad. Use vertex attributes to set the modelview matrices too.
For texturing, if you have an atlas, you can store an offset for the texture coordinates to easily select the right sub-part of the image.
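
The atlas offset mentioned above could be computed roughly like this. It is a sketch that assumes the atlas is a regular grid of equally sized tiles; `UvRect`, `atlasTile` and the row-by-row tile numbering are hypothetical and not taken from any library.

```cpp
// Offset and scale that map a quad's base UVs (0..1) into one atlas tile.
// Hypothetical helper type, for illustration only.
struct UvRect {
    float offsetU, offsetV;  // position of the tile's origin corner in the atlas
    float scaleU, scaleV;    // size of one tile in atlas UV units
};

// Assumes a regular grid of `columns` x `rows` tiles, numbered row by row
// starting at the row where V = 0.
UvRect atlasTile(int tileIndex, int columns, int rows) {
    UvRect r;
    r.scaleU = 1.0f / static_cast<float>(columns);
    r.scaleV = 1.0f / static_cast<float>(rows);
    r.offsetU = static_cast<float>(tileIndex % columns) * r.scaleU;
    r.offsetV = static_cast<float>(tileIndex / columns) * r.scaleV;
    return r;
}
```

The shader would then compute something like `uv = offset + baseUv * scale`, with the offset (and, if tiles differ in size, the scale) supplied as a vertex attribute.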

Maybe I’m wrong, but I think instancing is generally a good idea performance-wise (as Silence said) if you have many vertices (and 15K is many). Here’s a good site with different tutorials; maybe it helps you: https://developer.nvidia.com/gpugems/GPUGems/gpugems_ch07.html
I might be wrong again, but I think you can reduce the number of quads with that method and still have an amazing environment.

Thank you! I know instancing but AFAIK it is not efficient if my instances are single quads. If I want to render a field of sunflowers that are single textured quads I would rather pack them in a buffer and draw them with a single non-instanced draw call.

If these are static and the graphics card has enough memory for all of them, then you can. But then you’ll rely only on the GPU for culling/occlusion.

Why not use GL_POINTS as point sprites? They’re fast. (GL_POINT_SPRITE only needs enabling in a compatibility context; in a core profile points are always treated as sprites.)
Use gl_PointSize in the vertex shader to size the textured point (enable GL_PROGRAM_POINT_SIZE, also known as GL_VERTEX_PROGRAM_POINT_SIZE).
Use gl_PointCoord in the fragment shader to look up the texture (or a texture atlas if you want to, say, switch between rotated versions of the texture).

I wouldn’t use point sprites because they have odd scaling issues that might not exactly match with your perspective projection used for other geometry.
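
To illustrate the scaling question: a point's size is specified in pixels, so matching a perspective projection means converting a world-space size to pixels per vertex. Below is a rough sketch of that conversion, assuming a standard symmetric perspective projection where the clip-space w equals the view-space distance along the view axis. `pointSizePixels` and its parameters are hypothetical names; `proj11` stands for the projection matrix element equal to 1 / tan(fovY / 2).

```cpp
// Approximate on-screen height in pixels of an object `worldSize` units tall.
// Assumptions (not from the thread): symmetric perspective projection,
// proj11 = 1 / tan(fovY / 2), clipW = clip-space w of the point's center (> 0).
float pointSizePixels(float worldSize, float proj11, float clipW,
                      float viewportHeight) {
    // worldSize * proj11 / clipW is the height in NDC units; NDC spans 2 units
    // over viewportHeight pixels, hence the factor of 0.5.
    return viewportHeight * 0.5f * proj11 * worldSize / clipW;
}
```

Even with a size computed this way, points remain screen-aligned squares, and the size is limited to an implementation-dependent range, which may be part of the mismatch described above.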

Looking at the size of your data set, vertex/index counts are unlikely to be your bottleneck; you’re far more likely to bottleneck on fillrate, meaning that there’s not actually going to be a huge world of difference coming from which method you choose.

That said…

Triangle strips with degenerate triangles are so 1998; the fast path on desktop hardware has been indexed triangle soup since then, so that’s where I’d be looking first.
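
As an illustration of indexed triangle soup for quads, here is a minimal sketch that generates 6 indices for every 4 vertices. The function name `buildQuadIndices` and the corner ordering (0-1-2, 2-3-0) are assumptions made for this example; the winding has to match however the vertex data is actually laid out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a 16-bit index list for `quadCount` quads stored as 4 consecutive
// vertices each. Each quad becomes two triangles: (0,1,2) and (2,3,0).
std::vector<std::uint16_t> buildQuadIndices(std::size_t quadCount) {
    // 16-bit indices can address at most 65536 vertices.
    assert(quadCount * 4 <= 65536);

    static const std::uint16_t pattern[6] = {0, 1, 2, 2, 3, 0};
    std::vector<std::uint16_t> indices;
    indices.reserve(quadCount * 6);
    for (std::size_t q = 0; q < quadCount; ++q) {
        const std::size_t base = q * 4;  // first vertex of this quad
        for (std::uint16_t p : pattern) {
            indices.push_back(static_cast<std::uint16_t>(base + p));
        }
    }
    return indices;
}
```

The result would be uploaded to a buffer bound to GL_ELEMENT_ARRAY_BUFFER and drawn with glDrawElements using GL_TRIANGLES and GL_UNSIGNED_SHORT. Since the pattern never changes, the index buffer can be built once and reused.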

Comparison of indexed vs non-indexed: with your data size you can use 16-bit indices. Assume 24-byte vertices (3 x float position, 2 x float texcoord, 4 x byte color). 15k quads at 6 vertices per quad is ~2 MB. 15k quads at 4+1 vertices per quad is ~1.75 MB. 15k quads at 4 vertices + 6 indices per quad is ~1.5 MB, smaller than both non-indexed cases. It’s easy to forget that even though you have indices as well, and even though you have more indices than vertices, a typical index is much smaller than a typical vertex.
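
The arithmetic behind those figures can be written out as a small sketch. The constant and function names are made up for illustration; the vertex layout and quad count are the ones assumed in the post.

```cpp
#include <cstddef>

// 3 floats position + 2 floats texcoord + 4 bytes color = 24 bytes,
// assuming 4-byte floats.
constexpr std::size_t kVertexBytes = 3 * 4 + 2 * 4 + 4;
constexpr std::size_t kIndexBytes = 2;  // 16-bit indices
constexpr std::size_t kQuads = 15000;

// Non-indexed triangle list: 6 vertices per quad.
constexpr std::size_t triangleListBytes() { return kQuads * 6 * kVertexBytes; }

// Non-indexed strips: 4 vertices + 1 extra per quad, as counted in the post.
constexpr std::size_t stripBytes() { return kQuads * 5 * kVertexBytes; }

// Indexed: 4 vertices + 6 indices per quad.
constexpr std::size_t indexedBytes() {
    return kQuads * (4 * kVertexBytes + 6 * kIndexBytes);
}

static_assert(kVertexBytes == 24, "layout from the post");
static_assert(kQuads * 4 <= 65536, "all vertices addressable with 16-bit indices");
```

That gives 2,160,000 bytes, 1,800,000 bytes and 1,620,000 bytes respectively, matching the rounded figures above.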