Feedback Loops and Render Passes

Input attachments have a very strict limitation. Namely, you can only read from the sample that corresponds to your current fragment shader location. While that’s perfectly acceptable for many uses, it rules out one technique in particular: screen-space ambient occlusion. That algorithm requires each fragment to fetch from neighboring samples in order to determine how much occlusion to apply.

If you don’t do SSAO, it seems that you can do all of the steps of a deferred renderer within a single render pass: the g-buffers are written in one subpass, then the HDR color buffer is written with the g-buffers as input attachments, and finally tone mapping is applied using the HDR color buffer as input and the final LDR image as output.
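To make that concrete, here is a minimal C sketch of the three subpass descriptions. The attachment indices, the omitted depth buffer, and the omitted subpass dependencies are all my own assumptions, not anything established in this thread:

```c
#include <vulkan/vulkan.h>

/* Hypothetical attachment indices: 0..2 = g-buffers, 3 = HDR color, 4 = final LDR image. */
static const VkAttachmentReference gbufWrite[] = {
    {0, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL},
    {1, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL},
    {2, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL},
};
static const VkAttachmentReference gbufRead[] = {
    {0, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL},
    {1, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL},
    {2, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL},
};
static const VkAttachmentReference hdrWrite = {3, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL};
static const VkAttachmentReference hdrRead  = {3, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL};
static const VkAttachmentReference ldrWrite = {4, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL};

static const VkSubpassDescription subpasses[3] = {
    /* Subpass 0: fill the g-buffers. */
    { .pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
      .colorAttachmentCount = 3, .pColorAttachments = gbufWrite },
    /* Subpass 1: lighting; reads the g-buffers as input attachments, writes HDR color. */
    { .pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
      .inputAttachmentCount = 3, .pInputAttachments = gbufRead,
      .colorAttachmentCount = 1, .pColorAttachments = &hdrWrite },
    /* Subpass 2: tone mapping; reads HDR color as an input attachment, writes the LDR image. */
    { .pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS,
      .inputAttachmentCount = 1, .pInputAttachments = &hdrRead,
      .colorAttachmentCount = 1, .pColorAttachments = &ldrWrite },
};
```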

Now, it’s obvious that you cannot render to an attached subresource at the same time you read from it. But it’s not clear to me whether it is legal for a subpass to read arbitrarily from an image that is attached to the framebuffer, even if that particular attachment is only preserved during that subpass.

It’s clear that this would work if you ended the render pass and started a new one. What’s not clear is whether it would work within a single render pass.

Make the input attachment a texture instead and don’t include VK_DEPENDENCY_BY_REGION_BIT.

Coming from Graham Sellers himself, you ARE supposed to make multiple passes for this by API design: see the thread “Is programmable blending with order-independent transparency possible with Vulkan?” on the Khronos Forums.

I’m not sure what you mean by that. There is no specific way to set an attachment to be used as a texture. Your choices for each attachment in a subpass are input, color, depth/stencil, resolve, preserve, or to have its contents be undefined. I don’t see the part of the specification that allows “texture” as an option for an attachment.

My question is essentially this: if you set an attachment to be “preserved” for a subpass, is it then legal to bind that image to a descriptor set and read from it? Assuming you’ve done the necessary layout work, of course.
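For concreteness, this is roughly the setup in question (the attachment indices are made up for illustration): attachment 0 is listed only in pPreserveAttachments for this subpass, and the question is whether the image backing it could legally also be bound in a descriptor set and sampled by this subpass’s fragment shader.

```c
#include <vulkan/vulkan.h>

/* Hypothetical middle subpass: it renders to attachment 1 and merely
 * "preserves" attachment 0, i.e. it does not use attachment 0 at all,
 * but its contents must survive into later subpasses. */
static const uint32_t preserved[] = { 0 };

static const VkAttachmentReference colorRef =
    { 1, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL };

static const VkSubpassDescription middleSubpass = {
    .pipelineBindPoint       = VK_PIPELINE_BIND_POINT_GRAPHICS,
    .colorAttachmentCount    = 1,
    .pColorAttachments       = &colorRef,
    .preserveAttachmentCount = 1,
    .pPreserveAttachments    = preserved,
};
```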

Graham Sellers seems to think it is not. I’d like to know what the Vulkan specification has to say on the matter.

BTW, I think I found the spec quote that forbids this:

So that settles that…

You can read at an offset from the current fragment out of a subpass input (see the SPIR-V Specification, Provisional): the read takes an x, y offset from the current screen coordinate to index the pixel to be sampled.

Or in other words, it lets you sample neighbouring pixels for blurring-type operations. (Remember to remove VK_DEPENDENCY_BY_REGION_BIT from the subpass dependency.)
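For reference, removing that flag just means leaving dependencyFlags at zero, which makes the dependency framebuffer-global rather than per-region. A rough sketch, with stage and access masks that are my own guess for the write-then-read-as-input-attachment case:

```c
#include <vulkan/vulkan.h>

/* Hypothetical dependency from subpass 0 (writes the attachment) to
 * subpass 1 (reads it). dependencyFlags deliberately omits
 * VK_DEPENDENCY_BY_REGION_BIT, making the dependency framebuffer-global. */
static const VkSubpassDependency dep = {
    .srcSubpass      = 0,
    .dstSubpass      = 1,
    .srcStageMask    = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    .dstStageMask    = VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
    .srcAccessMask   = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
    .dstAccessMask   = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,
    .dependencyFlags = 0,
};
```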

See GL_KHR_vulkan_glsl.txt (at master in the KhronosGroup/GLSL repository on GitHub), in the section “Subpass Inputs”.

ahem:

So no, you can’t.

It seems very clear from the design of the Vulkan API that input attachments, and the entire render pass system, exist to serve the needs of tile-based renderers. All subpasses in a render pass are expected to live entirely within a tile, with as few writes to main memory as possible. Preserve attachments represent image data that the current subpass does not use but that must remain valid for a later subpass; if the number of attachments used in that pass is small, the system need not flush the data back to memory. The overall goal is to make it possible for a render pass to write to memory only when it absolutely must, so that as many FS invocations as possible can be executed within a tile.

Input attachments are a part of that. They represent the direct intent of a fragment shader to access tile memory, and in that case it is impossible for a fragment shader to access memory outside of its tile. Since the size of a tile is not specified by Vulkan, Vulkan must therefore make it impossible for the fragment shader to access a different sample’s memory.

What we really need is a “flush” operation. It would be like “preserve”, but it would guarantee that the data is written out to memory. It would also guarantee that you can read that data from the fragment shader, not as an input attachment, but as a texture.

Now, you would only be able to read from a “flushed” attachment in the fragment shader; we don’t want TBRs to have to do unpleasant things. The whole point of the render pass system is that the implementation can run all of the vertex processing for a render pass, then execute all of the fragment work (while probably executing more vertex work from another render pass). Allowing other stages to read a “flushed” attachment would force implementations to break up the vertex processing at these boundaries.

Oh, I guess that means it needs an extension to get non-local input attachment data in the fragment shader, so that non-tile-based renderers can expose the neighbouring pixels.

I prefer my idea of being able to flush an attachment and use it as a regular descriptor set image. By doing it that way, you can still use the same algorithm on a TBR as you do on a non-TBR. That is, after all, the point of the complex render pass system: you write the same code and it works in both places. Obviously, flushes would be expensive on TBRs, but they would at least work without having to flush every attachment.

I also realized that ping-ponging between a pair of images (reading from one while rendering to the other, then swapping) requires breaking render passes. But ping-pong algorithms basically kill any hope of performance on TBRs, and non-TBRs don’t really have a lot of render pass overhead. Probably.
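A rough sketch of what that breaking-up looks like at the command-buffer level. This assumes a render pass whose single color attachment uses initialLayout UNDEFINED, loadOp DONT_CARE, and finalLayout COLOR_ATTACHMENT_OPTIMAL, and that both images were last written as color attachments; every name here is made up for illustration.

```c
#include <stddef.h>
#include <vulkan/vulkan.h>

/* Ping-pong between images A and B: every bounce is its own render pass
 * instance, with a barrier in between, because we sample the image that
 * the previous pass rendered to. */
void ping_pong(VkCommandBuffer cmd, VkRenderPass pass,
               VkFramebuffer fbTargetA,  /* renders to A, samples B */
               VkFramebuffer fbTargetB,  /* renders to B, samples A */
               VkImage imgA, VkImage imgB,
               VkRect2D area, int iterations)
{
    for (int i = 0; i < iterations; ++i) {
        VkFramebuffer fb  = (i & 1) ? fbTargetB : fbTargetA;
        VkImage       src = (i & 1) ? imgA      : imgB;

        /* Make the previous color writes visible to fragment-shader reads
         * and move the source image into a sampleable layout. */
        VkImageMemoryBarrier toRead = {
            .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
            .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
            .dstAccessMask = VK_ACCESS_SHADER_READ_BIT,
            .oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL,
            .newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
            .srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
            .dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED,
            .image = src,
            .subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 },
        };
        vkCmdPipelineBarrier(cmd,
                             VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
                             VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                             0, 0, NULL, 0, NULL, 1, &toRead);

        VkRenderPassBeginInfo begin = {
            .sType       = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
            .renderPass  = pass,
            .framebuffer = fb,
            .renderArea  = area,
        };
        vkCmdBeginRenderPass(cmd, &begin, VK_SUBPASS_CONTENTS_INLINE);
        /* ... bind pipeline and a descriptor set that samples `src`, draw ... */
        vkCmdEndRenderPass(cmd);
    }
}
```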