Synchronization between separate renderpasses

I know this is not optimal, and a better solution is to use a single renderpass with multiple subpasses…

But let’s say I’m executing two renderpasses back-to-back that write to the same color attachment.

I’m assuming this will introduce a write-after-write hazard, so I need to add a pipeline barrier to guarantee the first renderpass finishes writing to the attachment before the second writes to it.
Problem is I’m not sure what the correct barrier would be, and I can’t really find anything in the spec or online.

// pseudo-code

vkCmdBeginRenderPass(cmdBuffer, &renderPassbeginInfo1, VK_SUBPASS_CONTENTS_INLINE); // renderpass #1
// several vkCmdDraws…
vkCmdEndRenderPass(cmdBuffer);

vkCmdPipelineBarrier(cmdBuffer,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, // srcStageMask: wait until renderpass #1 finishes writing to the color attachment
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, // dstStageMask: block renderpass #2 at VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
0, 0, nullptr, 0, nullptr, 0, nullptr); // No layout transition needed. Do we need a VkMemoryBarrier? If so, what srcAccess/dstAccess?

vkCmdBeginRenderPass(cmdBuffer, &renderPassbeginInfo2, VK_SUBPASS_CONTENTS_INLINE); // renderpass #2
// more vkCmdDraws…
vkCmdEndRenderPass(cmdBuffer);

… Or is this pipeline barrier actually not needed due to implicit rasterization order rules? Although from the spec it seems that only applies for commands within the same subpass…

Thanks

Or should the pipeline barrier look like this:

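VkImageMemoryBarrier imgBarrier{};
imgBarrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
imgBarrier.srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
imgBarrier.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
imgBarrier.oldLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL; // same layout on both sides
imgBarrier.newLayout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
imgBarrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED; // no queue family ownership transfer
imgBarrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
imgBarrier.image = colorAttachment;
imgBarrier.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 }; // assumes a single mip level and array layer

vkCmdPipelineBarrier(cmdBuffer,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
0, 0, nullptr, 0, nullptr, 1, &imgBarrier);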

It just seems so weird how I’m transitioning my attachment from VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL to VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL. Seems like it shouldn’t be needed.

Well, why would you ever have a WaW dependency? That means you are just throwing the old data away.

Ideally you would use an External Subpass Dependency, though a Pipeline Barrier is functionally equivalent.

You would be synchronizing against StoreOp of the previous Render Pass Instance and LoadOp of the next one. That indeed matches the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT:

Load operations for attachments with a color format execute in the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage.

The appropriate access mask is VK_ACCESS_COLOR_ATTACHMENT_*:

VK_ATTACHMENT_LOAD_OP_CLEAR […] For attachments with a color format, this uses the access type VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT.

You should not need to change the layout: the image is used as a color attachment in both render passes, so VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL is still appropriate. Though, considering you are throwing away the old data anyway, oldLayout = VK_IMAGE_LAYOUT_UNDEFINED might be appropriate.

I think no Memory Dependency is actually needed due to Implicit External Subpass Dependencies (quote below), but I think that is confusing, and it is better to supply those dependencies explicitly (and add a proper Execution Dependency there while at it).


VkSubpassDependency implicitDependency = {
    .srcSubpass = VK_SUBPASS_EXTERNAL,
    .dstSubpass = firstSubpass, // First subpass attachment is used in
    .srcStageMask = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
    .srcAccessMask = 0,
    .dstAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
    .dependencyFlags = 0,
};

VkSubpassDependency implicitDependency = {
    .srcSubpass = lastSubpass, // Last subpass attachment is used in
    .dstSubpass = VK_SUBPASS_EXTERNAL,
    .srcStageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
    .dstStageMask = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
    .srcAccessMask = VK_ACCESS_INPUT_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
                     VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
                     VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
    .dstAccessMask = 0,
    .dependencyFlags = 0,
};


PS: If you just want to “continue” writing the Image, then VK_ATTACHMENT_LOAD_OP_LOAD (and VK_ACCESS_COLOR_ATTACHMENT_READ_BIT) is appropriate instead.

It just seems so weird how I’m transitioning my attachment from VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL to VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL. Seems like it shouldn’t be needed.

And you would indeed do this. It means “perform no transition”:

If oldLayout is not equal to newLayout, then the memory barrier defines an image layout transition for the specified image subresource range.

It feels weird because oftentimes the Vulkan API tries to coax you to “do the right thing” instead. IMO 95 % of the time you would (and should) actually be changing layout on Memory Barrier, so it is part of the API that way.

Well, why would you ever have a WaW dependency? That means you are just throwing the old data away.

I am scared that a draw call from the first can be re-ordered and overwrite a pixel from the second, especially if I am not using a depth buffer.

You would be synchronizing against StoreOp of the previous Render Pass Instance and LoadOp of the next one. That indeed matches the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT:

I wasn’t aware the loadOp performed any synchronization… If it does, just setting this would be enough.

I think no Memory Dependency is actually needed due to Implicit External Subpass Dependencies

The implicit subpass dependency doesn’t make sense to me, I wish the documentation would be more clear about it.
The first barrier has srcStageMask = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, which to me implies “wait for nothing”. How is this useful in this case?

There are some misunderstandings.

Firstly, my surprise about the Write-after-Write situation is that it is wasteful. If you write something and then write it again, you computed the original write for nothing (as nobody has ever read it). This general situation should not arise often… then again, the first thing you said is that you know your approach is not optimal.

I wasn’t aware the loadOp performed any synchronization… If it does, just setting this would be enough.

It does not (well, it sorta does define some dependency within the subpass, but that was not my point). It is just another Queue Operation that needs to be synchronized against.
The point is that the specific loadOp/storeOp you choose dictates which Pipeline Stage and Access mask need to be provided to the user-defined Dependency.

The implicit subpass dependency doesn’t make sense to me, I wish the documentation would be more clear about it.

It doesn’t have to. It is simply equivalent to the above quoted Dependency. If it confuses you just ignore it and provide your own dependency.

The first barrier has srcStageMask = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, which to me implies “wait for nothing”. How is this useful in this case?

It is probably not. As I said, it is always clearer to state things explicitly.
Yes, it “waits for nothing”. But you can chain dependencies, which means you can actually make it wait for something by chaining a barrier on TOP_OF_PIPE if you so choose.

(Assuming two Render Pass Instances directly after each other) I suggest doing:


// Dependency of the first Render Pass Instance
VkSubpassDependency rp1Dependency = {
    .srcSubpass = lastSubpass, // Last subpass color attachment is used in
    .dstSubpass = VK_SUBPASS_EXTERNAL,
    .srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, // matches storeOp on a color attachment
    .dstStageMask = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, // to be synchronized later
    .srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT, // matches VK_ATTACHMENT_STORE_OP_STORE
    .dstAccessMask = 0,
};

// Dependency of the second Render Pass Instance
VkSubpassDependency rp2Dependency = {
    .srcSubpass = VK_SUBPASS_EXTERNAL,
    .dstSubpass = firstSubpass, // First subpass color attachment is used in
    .srcStageMask = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, // chains to rp1Dependency::dstStageMask stage
    .dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, // matches loadOp on a color attachment
    .srcAccessMask = 0,
    .dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT, // matches VK_ATTACHMENT_LOAD_OP_LOAD
};

You cannot just provide VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT to vkCmdPipelineBarrier. The vkCmdPipelineBarrier scopes do not cover internal writes of the Render Pass Instance.

Sorry, scratch the last sentence – that is nonsense. You can normally use vkCmdPipelineBarrier.
And you can write the above as a single Dependency if that is easier on your brain. It would be:


srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT;

As long as the Render Pass Instances are directly after each other, it should not matter whether the above is a VkSubpassDependency of the first Render Pass, a VkSubpassDependency of the second Render Pass, or a vkCmdPipelineBarrier between them.