Depth texture bound to FBO and texture unit?

Hi,

Is it possible to have a depth texture bound to a texture unit for reading in a fragment shader and also have the depth texture assigned to an FBO’s depth attachment as long as depth writes are disabled?

This works for me on NVIDIA, but after reading “Feedback Loops Between Textures and the Framebuffer” (4.4.3) in the spec, I’m not sure if it should work. If the texture were assigned to a draw framebuffer attachment, it definitely wouldn’t work.
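
Roughly, the setup looks like this (simplified sketch; the names, sizes and texture parameters are just placeholders):

GLuint depthTex, fbo;
GLsizei width = 1024, height = 1024;

// Create the depth texture.
glGenTextures(1, &depthTex);
glBindTexture(GL_TEXTURE_2D, depthTex);
glTexImage2D(GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT24, width, height, 0,
             GL_DEPTH_COMPONENT, GL_UNSIGNED_INT, NULL);

// Attach it as the FBO's depth attachment (color attachments omitted).
glGenFramebuffers(1, &fbo);
glBindFramebuffer(GL_FRAMEBUFFER, fbo);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, depthTex, 0);

// Later, while the FBO is still bound: leave the texture on a texture unit
// so the fragment shader can sample it, but never write depth.
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, depthTex);
glDepthMask(GL_FALSE);
drawPassThatSamplesDepth();   // placeholder for my actual draw call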

Thanks,
Patrick

In my opinion, it ought to work, since you have no real feedback loop - you don’t feed (i.e. write to) the texture… you just read from it.

This is officially supported only on GL4 class hardware, but in fact I agree with skynet: it will most probably work on earlier cards as well, as long as you disable depth writes.

The problem with texturing from and rendering to the same depth buffer is, afaik, that the GPU usually keeps an on-chip compressed form of the depth buffer (Hi-Z) while rendering to it, and in order to use it as a texture the GPU most probably has to perform a decompression step, at least on pre-GL4 hardware; GL4 hardware may have a special hardware path for it.

I’m also not sure whether, on pre-GL4 hardware, using the same depth buffer as a texture disables Hi-Z and early-Z. However, this is such a common use case that I’m pretty sure IHVs have a solution for it.

Is it possible to have a depth texture bound to a texture unit for reading in a fragment shader and also have the depth texture assigned to an FBO’s depth attachment as long as depth writes are disabled?

That depends.

It is legal to bind the same texture object as you use for a render target. What gives rise to undefined behavior is accessing an image from a texture that you are also rendering to. Whether turning off depth writes is enough to flush the various caches so that you can read from the texture, I don’t know.

According to the letter of the specification, what you are trying to do will result in undefined behavior. The masking flags, like depth mask, are considered irrelevant by the specification.

However, a more important question is this: is there some reason you can’t just remove the texture from the FBO and ensure that your code works that way?
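
Detaching it is essentially a one-liner between draw calls; something along these lines, assuming a 2D depth texture on the draw framebuffer:

// Detach the depth texture from the draw FBO before the pass that samples it...
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, 0, 0);
// ...and re-attach it when it needs to act as the depth buffer again.
glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_TEXTURE_2D, depthTex, 0);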

This is officially supported only on GL4 class hardware

It isn’t officially supported on anything. Not without using extensions. The GL 4.x spec uses the same wording as the GL 3.x spec with regard to doing this sort of thing.

The problem with texturing from and rendering to the same depth buffer is, afaik, that the GPU usually keeps an on-chip compressed form of the depth buffer (Hi-Z) while rendering to it, and in order to use it as a texture the GPU most probably has to perform a decompression step, at least on pre-GL4 hardware; GL4 hardware may have a special hardware path for it.

What are you talking about? You have been able to create depth textures, use them as render targets, and then bind them to a texture unit since EXT_FBO, let alone the current form. The only issue the OP is concerned with is leaving the texture bound to the depth slot of the FBO, which is not allowed on any version, regardless of the masking state.

Whether turning off depth writes is enough to flush the various caches so that you can read from the texture, I don’t know.

Sampling a depth texture, that you’ve previously written to, surely will cause the driver/GPU to flush relevant caches. That is normal operation and nothing special.

What are you talking about?

His concern is that when a depth texture is used as a depth buffer, a different in-memory representation of the depth values might be used than for depth textures that only get sampled. In that case the driver would have to take care of a conversion step whenever you switch from rendering into a depth texture to sampling from it.
But again, the driver can very easily infer whether a depth texture gets sampled in the next draw call and do any appropriate conversion steps. Even if that means there are now two memory representations of the same depth buffer, both are in sync at this point.

The GL spec defines when a rendering feedback loop is created:

Doing so could lead to the creation of a rendering feedback loop between the writing of pixels by GL rendering operations and the simultaneous reading of those same pixels when used as texels in the currently bound texture.

If you don’t write to the depth texture (the glDepthMask() state is called GL_DEPTH_WRITEMASK!) while reading from it, no feedback loop is being created.

It isn’t officially supported on anything. Not without using extensions. The GL 4.x spec uses the same wording as the GL 3.x spec with regard to doing this sort of thing.

I was not saying that the GL 4.x spec allows it officially. I said that GL4 class hardware (i.e. Fermi and Evergreen) officially supports it. Just check DX11, which allows rendering to a depth texture that is currently bound for texturing.

What are you talking about?

I was indeed referring to the issue discussed in more detail by skynet. As he said, it is not an unsolvable issue, it just means an additional burden for the driver.

Thanks everyone for the input.

Just to clarify my original question: I have an FBO with a texture depth attachment. I issue two draw calls while the FBO is bound. The first has depth writes and the depth test disabled (I didn’t realize when writing my original post that the depth test was also disabled), but it has the depth texture bound to a texture unit and reads from it in the fragment shader. The second draw call has the depth test enabled, but not depth writes, and does not read directly from the depth texture in the shader.
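
In code the two draw calls look roughly like this (the draw functions are stand-ins for my actual passes):

glBindFramebuffer(GL_FRAMEBUFFER, fbo);   // depth texture is attached to this FBO

// Draw call 1: depth test and depth writes off; samples the depth texture in the shader.
glDisable(GL_DEPTH_TEST);
glDepthMask(GL_FALSE);
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, depthTex);
drawPassThatReadsDepth();

// Draw call 2: depth test on, depth writes still off; does not sample the depth texture.
glEnable(GL_DEPTH_TEST);
glDepthMask(GL_FALSE);
drawPassThatOnlyDepthTests();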

So I am never writing to the depth texture, but I have it bound to both a texture unit and attached to the FBO’s depth attachment. An easy workaround is to have two copies of the depth texture or modify the FBO attachment/texture unit bindings between draw calls.

As is, it works fine on NVIDIA, but sometimes things that aren’t true to the spec work on NVIDIA and then break on ATI.

Thanks!
Patrick

Sampling a depth texture, that you’ve previously written to, surely will cause the driver/GPU to flush relevant caches. That is normal operation and nothing special.

Since the OpenGL specification doesn’t guarantee it, it’s kinda hard to know whether it will happen.

More importantly, it wouldn’t make sense. Just because you have bound a texture that happens to be attached to the current FBO does not mean that you intend to read from the same image that is attached to the FBO.

The GL spec defines when a rendering feedback loop is created:

The spec states the conditions under which the feedback loop, and thus undefined behavior, arises: an image of the texture is attached to the currently bound draw framebuffer, that same texture is bound to a texture unit, and the current vertex/fragment processing state makes it possible to sample from it.

None of these conditions take into account any of the masking state. Which means that an OpenGL implementation is free to flush caches only when a render target is attached/unattached to an FBO or when a new FBO is bound as the draw framebuffer.

In short: the spec does not guarantee that it should work, therefore, you should not rely on it.

Since the OpenGL specification doesn’t guarantee it, it’s kinda hard to know whether it will happen.

I guess my statement was unclear: with “sampling a depth texture…previously written to…” I meant “written to in a previous draw call”. I’m talking about vanilla shadow mapping here:

  1. render scene to depthtexture while it is bound as depthbuffer
  2. render scene 2nd time and sample the depthtexture

That is guaranteed to work, isn’t it?
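
A minimal sketch of what I mean (names are placeholders; the second pass renders to the default framebuffer, so the depth texture is not attached anywhere while it is sampled):

// Pass 1: fill the depth texture while it is the depth attachment of shadowFbo.
glBindFramebuffer(GL_FRAMEBUFFER, shadowFbo);
glEnable(GL_DEPTH_TEST);
glDepthMask(GL_TRUE);
glClear(GL_DEPTH_BUFFER_BIT);
renderSceneFromLight();

// Pass 2: render normally and sample the depth texture in the shader.
glBindFramebuffer(GL_FRAMEBUFFER, 0);
glBindTexture(GL_TEXTURE_2D, shadowDepthTex);
renderSceneWithShadows();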

None of these conditions take into account any of the masking state.

The specs are written in prose. Sometimes something is not explicitly stated and we need to use common sense to make sense of it. To me this whole paragraph just says “Do not read data you are currently writing to, or the results are undefined.” They even tried to point out that texture filtering may access multiple levels of a texture and that I should not write to any of the levels that may get accessed by sampling.

Which means that an OpenGL implementation is free to flush caches only when a render target is attached/unattached to an FBO or when a new FBO is bound as the draw framebuffer.

Well, then it should be possible to do something like this:


FBO1: depth texture DT is attached as the depth buffer
FBO2: DT is also attached as the depth buffer

glBindFramebuffer(GL_FRAMEBUFFER, FBO1);
glDepthMask(GL_TRUE);
renderScene();                  // render the scene, filling the depth buffer
glBindFramebuffer(GL_FRAMEBUFFER, FBO2);
glBindTexture(GL_TEXTURE_2D, DT);
glDepthMask(GL_FALSE);
renderSoftParticles();          // uses DT for depth occlusion and as the source of depth for soft-particle rendering

I think this is similar to what the OP intends to do. Are there any ambiguities in this setup which should lead to corrupted data? To the driver it is very clear where I stop writing to the texture and start reading-only from it.

To quote EXT_texture_barrier:

Another application is to render-to-texture algorithms that ping-pong between two textures, using the result of one rendering pass as the input to the next. Existing mechanisms require expensive FBO Binds, DrawBuffer changes, or FBO attachment changes to safely swap the render target and texture.

It looks like binding a different FBO is a “natural” texture barrier (otherwise, any existing render-to-texture stuff would not work in a defined manner)
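
With the extension, the barrier is explicit instead; a rough sketch (the draw functions are placeholders):

// Stay on the same FBO; insert an explicit barrier between the pass that
// writes the texture and the pass that samples it.
renderPassThatWritesTexture();
glTextureBarrierNV();          // makes prior framebuffer writes visible to texture fetches
renderPassThatSamplesTexture();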

In short: the spec does not guarantee that it should work, therefore, you should not rely on it.

Obviously the specs need a clarification, since in my eyes, the intended setup never creates a feedback loop.

I guess my statement was unclear: with “sampling a depth texture…previously written to…” I meant “written to in a previous draw call”. I’m talking about vanilla shadow mapping here:

  1. render scene to depthtexture while it is bound as depthbuffer
  2. render scene 2nd time and sample the depthtexture

That is guaranteed to work, isn’t it?

So long as you have removed the depth texture from the FBO before you do step 2, yes. Otherwise, it is undefined behavior.

The specs are written in prose. Sometimes something is not explicitly stated and we need to use common sense to make sense of it.

A technical specification describes behavior. This particular part of the spec is very, very clear. It doesn’t rely on anything more than what it says, and there is no need to apply “common sense.” It says what it says, and there’s no wiggle room for interpretation.

If conditions X, Y, and Z happen, undefined behavior results. Full stop.

However much it might make “common sense” for something to work a certain way is irrelevant. It is off-spec behavior. Even if every piece of hardware worked as you suggested, it is still, according to The Specification for OpenGL Version 4.1 (and every other version 3.0 or greater), undefined behavior.

If you want to rely on undefined behavior, feel free. But the spec clearly says it is undefined.

Well, then it should be possible to do something like this:

Again, the spec is very clear on this: what matters is what textures are bound to the context and attached to the FBO. Not what the masking state is. And while one would assume that caches are cleared when binding a new FBO, that doesn’t guarantee it. After all, a clever implementation could see that FBO1 and FBO2 use the same depth attachment and therefore not clear the depth cache. As an optimization.

To quote EXT_texture_barrier:

First, there is no EXT_texture_barrier; there is only NV_texture_barrier.

Second, however much you might assume about the functioning of NVIDIA hardware based on NV_texture_barrier’s wording, that doesn’t change what the OpenGL specification says about render targets. Except where NV_texture_barrier changes what it says, of course.

And of course third: even if you assume these things, they will only be true of NVIDIA hardware. So whatever implicit cache invalidation might or might not happen on NVIDIA hardware has no bearing on, for example, ATI hardware.

Obviously the specs need a clarification, since in my eyes, the intended setup never creates a feedback loop.

No, the spec is quite clear. You’re trying to create uncertainty when the spec doesn’t provide any. You think certain hardware works a certain way and that the spec should provide that.

That’s not what specifications are for. It’s possible that masking state could be included in the feedback section. But that is not a “clarification;” that is a change. It now means that IHVs will have to clear depth caches whenever you change the depth mask.

Personally, I don’t see what’s so hard about just detaching the depth texture or binding a different FBO that doesn’t have that depth texture attached. Not only would this give you in-spec behavior, it wouldn’t make simple glDepthMask() calls cause cache clearing and thus lower performance.

The intent of this setup is nothing unusual: for many algorithms you may want to use a depth texture for depth testing (testing only, no depth writes!) and at the same time sample the existing depth values. Stencil testing comes to mind, too.

You don’t need to teach me what a spec ought to do. Specs are man-made, written in prose. Therefore they can contain uncertainties, ambiguities and errors.

In this case, I believe, there are certain things that are not well covered and leave room for interpretation (or at least confusion :wink: ).
They talk about
“reading and writing of pixels” (masks influence writing!), and
“using a texture as source and destination” (is a texture I don’t write into still a destination? Isn’t glDrawBuffer(s) selecting the destination of writes as well?).

It’s possible that masking state could be included in the feedback section. But that is not a “clarification;” that is a change.

So be it, call it a change. Just let them add a few words about masking and the drawbuffer setting.

It now means that IHVs will have to clear depth caches whenever you change the depth mask.

Do you realize that the existence of something like a "depth cache" is just an assumption as well? :wink:

We don’t need to argue further. I agree with you that binding a depth texture both as a texture and as a depth buffer attachment may yield undefined results. In practice it worked, though. It is up to the OP to decide whether to use it or not.
He has been warned, and I think that has been your intention.