Multiple Images for Progressive Rendering

Hey
I’m currently implementing Progressive Path Tracing & I wonder if it still makes sense to use a swapchain with multiple images.

A simplified example:
Say you want to fade your screen from black to white, increasing all 3 RGB values by 1 each frame, and the swap chain consists of 2 images.
-> Present Image A. Colors = (0, 0, 0)
-> Meanwhile, write to Image B. Increase Colors. Colors = (1, 1, 1)
-> Present Image B. Colors = (1, 1, 1)
-> Continue until (255, 255, 255) (white).

What I do not want to do is this (imagine the 170th frame)
-> Present Image A. Colors = (169, 169, 169)
-> Meanwhile, write to Image B. Increase Colors. Colors = (0, 0, 0) + (170, 170, 170)
-> Present Image B. Colors = (170, 170, 170)

What I want to do
-> Present Image A. Colors = (169, 169, 169)
-> Meanwhile, write to Image B. Increase Colors. Colors = Image A (169, 169, 169) + (1, 1, 1)
-> Present Image B. Colors = (170, 170, 170)

So basically, I want to take my last calculated image (its colors) and use it again as the input for the next calculation.
I believe with a swapchain & multiple images I would have to constantly copy memory from Image A to Image B and vice versa.

Does that make sense ? Or should I rather abandon the swap chain and stick with a single image, that’s never cleared ?

There’s nothing wrong with copy. What you are doing is basically a copy, except that you apply an increment while doing it.
Some options pulled out of my hat:

a) Do what you say you do not want. I.e. find some way to produce the frame procedurally without need for previous frame as you did with the incrementation.

b) Have a regular image store the results, and write the increments to both the swapchain image and the regular image. I believe you can have several color attachments.

c) Make the copy concurrent with the computation, i.e. have two regular images. As soon as one is computed, use it to compute the next frame in the second image. At the same time, copy it to the appropriate swapchain image (you could even use one of those TRANSFER queues).

d) Let’s say the presentation engine needs to hold on to at least N images. So ask for N+2 images. On the first frame, acquire 2 swapchain images and compute them both. Then present the oldest image. On each subsequent iteration, acquire one swapchain image, compute it using the second image you still hold, and present the old one.
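To make options (b) and (c) a bit more concrete, here is a minimal CPU-side sketch of the fade example from the question. It is not Vulkan code: `Image`, `FadeRenderer` and `presentedValueAfter` are made-up names, plain vectors stand in for the images, and the round-robin index stands in for whatever index the presentation engine would hand back on acquire. The point is only that the increment always reads the application-owned image, and the acquired swapchain image just receives a copy.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for an image: one 8-bit grey value per pixel (hypothetical type).
using Image = std::vector<std::uint8_t>;

struct FadeRenderer {
    Image accumulation;              // application-owned image, never cleared
    std::array<Image, 2> swapchain;  // stand-ins for the presentable images
    std::size_t next = 0;            // stand-in for the acquired image index

    explicit FadeRenderer(std::size_t pixels)
        : accumulation(pixels, 0), swapchain{Image(pixels, 0), Image(pixels, 0)} {
        assert(pixels > 0);
    }

    // Renders one frame and returns the image that would be presented.
    const Image& renderFrame() {
        for (auto& p : accumulation) {
            if (p < 255) ++p;  // the increment reads last frame's result in place
        }
        Image& target = swapchain[next];
        target = accumulation;  // copy into whichever swapchain image was acquired
        next = (next + 1) % swapchain.size();
        return target;
    }
};

// Grey value of pixel 0 in the image presented after `frames` frames.
int presentedValueAfter(int frames) {
    FadeRenderer renderer(4);
    int value = 0;
    for (int i = 0; i < frames; ++i) {
        value = renderer.renderFrame()[0];
    }
    return value;
}
```

Calling `presentedValueAfter(170)` walks through 170 frames; the presented value never depends on what the swapchain image held before, only on the accumulation image.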

The best way to think of the presentation system is like this:

You do not own those images. You did not create them, nor did you allocate memory for them. The presentation engine owns them, and it will unmake or invalidate them as needed. Those images exist for one purpose: to be displayed.

If you need to use an image for any other purpose, then create one yourself.

[QUOTE]There’s nothing wrong with copy.[/QUOTE]

I’ve done no testing, but isn’t it pretty slow to copy a whole image ? Since Path Tracing itself is very slow I need all performance savings I can get.

[QUOTE]find some way to produce the frame procedurally without need for previous frame as you did with the incrementation.[/QUOTE]

That’s not possible. What I am trying to achieve is Progressive Path Tracing; recomputing each frame from scratch would save no calculation time and would just be normal Path Tracing again.
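For readers wondering why the previous result is needed at all: a common way to do progressive accumulation is a per-pixel running mean, sketched below. This is only an illustration under that assumption, not necessarily what the poster's renderer does; `Accumulator` and its members are made-up names, and a `std::vector<float>` stands in for the image.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Keeps the running mean of all frames added so far, one value per pixel.
class Accumulator {
public:
    explicit Accumulator(std::size_t pixels) : mean_(pixels, 0.0f) {}

    // Blends one new noisy frame into the running mean.
    void addFrame(const std::vector<float>& frame) {
        assert(frame.size() == mean_.size());
        ++frames_;
        const float weight = 1.0f / static_cast<float>(frames_);
        for (std::size_t i = 0; i < mean_.size(); ++i) {
            // The update reads the previous mean, which is why it must persist.
            mean_[i] += (frame[i] - mean_[i]) * weight;
        }
    }

    const std::vector<float>& image() const { return mean_; }
    std::size_t frameCount() const { return frames_; }

private:
    std::vector<float> mean_;
    std::size_t frames_ = 0;
};
```

Each call to `addFrame` costs one new sample per pixel, and the stored image converges towards the average of all samples so far. Throwing the stored image away would mean starting the average over.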

Guess I’ll try to go with your second thought first, thanks :slight_smile:

[QUOTE]If you need to use an image for any other purpose, then create one yourself.[/QUOTE]

What exactly do you mean by that ?
Having a regular image besides the swap chain, as krOoze mentioned, or simply having a regular image with no swap chain at all ?

[QUOTE]There’s nothing wrong with copy.

I’ve done no testing, but isn’t it pretty slow to copy a whole image ? Since Path Tracing itself is very slow I need all performance savings I can get.
[/QUOTE]

There are probably worse things. GPUs can push ungodly Gbps. The latency can be hidden. Some “copies” may not even leave the cache. Of course, assumptions about performance are dangerous.
Of course I did not suggest doing unnecessary copies, only the necessary ones, or those where the alternative would be worse.
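A rough back-of-the-envelope estimate can put the copy cost in perspective. The sketch below is plain arithmetic, not a measurement: the helper names are made up, the bandwidth you pass in is an assumption you have to supply yourself, and the model simply assumes every byte is read once and written once with no overlap or caching.

```cpp
#include <cstdint>

// Bytes in one tightly packed image.
constexpr std::uint64_t imageBytes(std::uint32_t width, std::uint32_t height,
                                   std::uint32_t bytesPerPixel) {
    return std::uint64_t{width} * height * bytesPerPixel;
}

// Idealised copy time in microseconds: each byte is read once and written once.
// bandwidthGBps is an ASSUMED figure; measure or look up the real one for your GPU.
constexpr double copyMicroseconds(std::uint64_t bytes, double bandwidthGBps) {
    return (2.0 * static_cast<double>(bytes)) / (bandwidthGBps * 1e9) * 1e6;
}
```

For example, `imageBytes(1920, 1080, 4)` is 8,294,400 bytes, and with an assumed 100 GB/s `copyMicroseconds` gives roughly 166 microseconds under this idealised model. Whether that matters next to a path tracing pass is something only profiling can tell.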

[QUOTE]If you need to use an image for any other purpose, then create one yourself.

What exactly do you mean by that ?
Having a regular image besides the swap chain, as krOoze mentioned, or simply having a regular image with no swap chain at all ? [/QUOTE]

He was being general about it. Swapchain images should be used only for presentation to the screen/window. The reason is this:
Vulkan does not guarantee any USAGE other than COLOR_ATTACHMENT. So you would not necessarily be able to read the swapchain image on all platforms anyway.
Also, swapchain images have an opaque memory allocation. It could even be allocated on the CPU side, so it is potentially a performance minefield for anything other than presentation.

Alright, I’m currently trying to do that.

Do I need some sort of synchronisation when copying data between images ?
Here is what I did now:

  1. I set up memory barriers so the compute image can be written to by the shader. Then I change its layout to transfer source & prepare the swap chain image as transfer destination.
  2. Copy data from the compute image to the swap chain image
  3. Change the swap chain image from transfer destination to presentation

But now when I try to present the current image, or sometimes when submitting the queue, I just get a critical exception.
The validation layers don’t report anything; I only get the critical exception.

This is what my 3 steps look like:
1. step


	std::vector<VkImageMemoryBarrier> barriers = {};

        // Compute Image 
	VkImageMemoryBarrier compWrite = {};
	compWrite.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
	compWrite.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
	compWrite.newLayout = VK_IMAGE_LAYOUT_GENERAL;
	compWrite.image = computeImage;
	compWrite.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
	compWrite.srcAccessMask = 0;
	compWrite.dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;

        // Compute Image 
	VkImageMemoryBarrier compTransfer = {};
	compTransfer.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
	compTransfer.oldLayout = VK_IMAGE_LAYOUT_GENERAL;
	compTransfer.newLayout = VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL;
	compTransfer.image = computeImage;
	compTransfer.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
	compTransfer.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
	compTransfer.dstAccessMask = VK_ACCESS_MEMORY_READ_BIT;

        // Current Swap chain image
	VkImageMemoryBarrier swapTransfer = {};
	swapTransfer.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
	swapTransfer.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
	swapTransfer.newLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
	swapTransfer.image = swapChainImages[curImageIndex];
	swapTransfer.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
	swapTransfer.srcAccessMask = 0;
	swapTransfer.dstAccessMask = VK_ACCESS_MEMORY_WRITE_BIT;

	barriers.push_back(compWrite);
	barriers.push_back(compTransfer);
	barriers.push_back(swapTransfer);

	vkCmdPipelineBarrier(buffer, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
		0, 0, nullptr, 0, nullptr, barriers.size(), barriers.data());

2. step


        // Compute Image
	VkImageSubresourceLayers source;
	source.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
	source.mipLevel = 0;
	source.baseArrayLayer = 0;
	source.layerCount = 1;

        // Current Swap chain image
	VkImageSubresourceLayers dest;
	dest.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
	dest.mipLevel = 0;
	dest.baseArrayLayer = 0;
	dest.layerCount = 1;

	VkImageCopy copy;
	copy.srcSubresource = source;
	copy.dstSubresource = dest;
	copy.extent = { WIDTH, HEIGHT, 1 };

	vkCmdCopyImage(buffer, computeImage, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, swapChainImages[curImageIndex], VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &copy);

3. step


        // Current Swap chain image
	VkImageMemoryBarrier swapPres = {};
	swapPres.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
	swapPres.oldLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
	swapPres.newLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;
	swapPres.image = swapChainImages[curImageIndex];
	swapPres.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
	swapPres.srcAccessMask = VK_ACCESS_MEMORY_WRITE_BIT;
	swapPres.dstAccessMask = VK_ACCESS_MEMORY_READ_BIT;

	vkCmdPipelineBarrier(buffer, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
		0, 0, nullptr, 0, nullptr, 1, &swapPres);

Sorry for throwing so much code in here

There’s no other synchronisation between those steps, just the pipeline barriers.

Edit:
Sorry, I was running in release mode :smiley:
Validation layers are now throwing like crazy

Edit 2:
Since the swap chain creates its images, there’s no way to set their usage to transfer destination right ? So, you can’t copy data to swap chain images ?

[QUOTE=Nasty Nas;42178]
Since the swap chain creates its images, there’s no way to set their usage to transfer destination right ? So, you can’t copy data to swap chain images ?[/QUOTE]

VkSurfaceCapabilitiesKHR::supportedUsageFlags will tell you the ways in which you can use swapchain images. If transfer destination is not listed, then you cannot copy to them. The only one that is required by the specification is color attachment.

So at the very least, you need to plan for the situation when you have to render to it instead of copying to it.
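As a sketch of that planning step: the usage you want also has to be requested in `VkSwapchainCreateInfoKHR::imageUsage` when the swapchain is created, and it may only contain bits reported in `supportedUsageFlags`. The snippet below avoids the Vulkan headers so it stays self-contained; the two constants are local stand-ins that mirror the values of `VK_IMAGE_USAGE_TRANSFER_DST_BIT` and `VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT`, and `OutputPath`, `chooseOutputPath` and `swapchainUsageFor` are made-up names for illustration.

```cpp
#include <cstdint>

// Local stand-ins mirroring VK_IMAGE_USAGE_TRANSFER_DST_BIT and
// VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT; real code would use the Vulkan header.
constexpr std::uint32_t kUsageTransferDst = 0x00000002;
constexpr std::uint32_t kUsageColorAttachment = 0x00000010;

// How the final image reaches the swapchain image (hypothetical enum).
enum class OutputPath { CopyToSwapchain, RenderToSwapchain };

// supportedUsageFlags is the value of VkSurfaceCapabilitiesKHR::supportedUsageFlags.
constexpr OutputPath chooseOutputPath(std::uint32_t supportedUsageFlags) {
    return (supportedUsageFlags & kUsageTransferDst) != 0
               ? OutputPath::CopyToSwapchain
               : OutputPath::RenderToSwapchain;
}

// Usage bits to request in VkSwapchainCreateInfoKHR::imageUsage for that path.
constexpr std::uint32_t swapchainUsageFor(OutputPath path) {
    return path == OutputPath::CopyToSwapchain
               ? (kUsageColorAttachment | kUsageTransferDst)
               : kUsageColorAttachment;
}
```

With the fallback path, the accumulated image would be drawn into the swapchain image (for example with a full-screen pass) instead of being copied with `vkCmdCopyImage`.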

Yep, just found that. Unfortunately the time window for editing your posts is pretty short & I didn’t want to write a third post in a row… Thanks !

Just to keep it complete: a whole lot of flags were set wrong.
This is what I’ve ended up with for my 3 steps, and it works :slight_smile:

1. step


	std::vector<VkImageMemoryBarrier> barriers = {};

        // Compute Image
	VkImageMemoryBarrier compWrite = {};
	compWrite.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
	compWrite.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
	compWrite.newLayout = VK_IMAGE_LAYOUT_GENERAL;
	compWrite.image = computeImage;
	compWrite.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
	compWrite.srcAccessMask = 0;
	compWrite.dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;

        // Compute Image
	VkImageMemoryBarrier compTransfer = {};
	compTransfer.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
	compTransfer.oldLayout = VK_IMAGE_LAYOUT_GENERAL;
	compTransfer.newLayout = VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL;
	compTransfer.image = computeImage;
	compTransfer.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
	compTransfer.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
	compTransfer.dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT;
        
        // Swap chain image
	VkImageMemoryBarrier swapTransfer = {};
	swapTransfer.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
	swapTransfer.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
	swapTransfer.newLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
	swapTransfer.image = swapChainImages[curImageIndex];
	swapTransfer.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
	swapTransfer.srcAccessMask = 0;
	swapTransfer.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;

	barriers.push_back(compWrite);
	barriers.push_back(compTransfer);
	barriers.push_back(swapTransfer);

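	// The copy runs in the transfer stage, so the destination stage mask has to include it
	vkCmdPipelineBarrier(buffer, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
		VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT | VK_PIPELINE_STAGE_TRANSFER_BIT,
		0, 0, nullptr, 0, nullptr, static_cast<uint32_t>(barriers.size()), barriers.data());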

2. step
The offsets gave me a little headache. I just skipped them, thinking they would be initialized by default with { 0, 0, 0 }. Well, they’re not :wink:


        // Compute Image
	VkImageSubresourceLayers source;
	source.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
	source.mipLevel = 0;
	source.baseArrayLayer = 0;
	source.layerCount = 1;

	// Swap chain image
	VkImageSubresourceLayers dest;
	dest.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
	dest.mipLevel = 0;
	dest.baseArrayLayer = 0;
	dest.layerCount = 1;

	// Copy region covering the whole image
	VkImageCopy copy;
	copy.srcSubresource = source;
	copy.dstSubresource = dest;
	copy.extent = { WIDTH, HEIGHT, 1 };
	copy.srcOffset = { 0, 0, 0 };
	copy.dstOffset = { 0, 0, 0 };


	vkCmdCopyImage(buffer, computeImage, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, swapChainImages[curImageIndex], VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &copy);
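On the offsets: a local struct declared as `VkImageCopy copy;` is left uninitialized in C++, whereas `VkImageCopy copy = {};` zeroes every member, the same way the barrier structs above are declared. The sketch below shows the difference with stand-in structs so that it compiles without the Vulkan headers; `Offset3D`, `Extent3D`, `ImageCopyRegion` and `fullImageRegion` are made-up names that only imitate the shape of the real types.

```cpp
#include <cstdint>

// Stand-ins shaped like VkOffset3D, VkExtent3D and (part of) VkImageCopy.
struct Offset3D { std::int32_t x, y, z; };
struct Extent3D { std::uint32_t width, height, depth; };
struct ImageCopyRegion {
    Offset3D srcOffset;
    Offset3D dstOffset;
    Extent3D extent;
};

// Value-initialisation with {} zeroes every member, so only the non-zero
// fields have to be assigned afterwards.
inline ImageCopyRegion fullImageRegion(std::uint32_t width, std::uint32_t height) {
    ImageCopyRegion region = {};           // both offsets are now {0, 0, 0}
    region.extent = { width, height, 1 };  // depth 1 for a 2D image
    return region;
}
```

Applied to the real code, declaring the copy region and the subresource structs with `= {}` would make the two explicit offset assignments optional rather than required.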

3. step


	VkImageMemoryBarrier swapPres = {};
	swapPres.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
	swapPres.oldLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
	swapPres.newLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;
	swapPres.image = swapChainImages[curImageIndex];
	swapPres.subresourceRange = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 };
	swapPres.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
	swapPres.dstAccessMask = VK_ACCESS_MEMORY_READ_BIT;

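	// The copy is a transfer-stage write, so wait on the transfer stage before handing the image to presentation
	vkCmdPipelineBarrier(buffer, VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
		0, 0, nullptr, 0, nullptr, 1, &swapPres);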