Thread: vkQueuePresentKHR blocks

  1. #1
    Junior Member
    Join Date
    Jun 2017

    vkQueuePresentKHR blocks


    I have been timing portions of my code as part of an attempt to get a better grasp of how the presentation engine behaves. The code I'm using looks something like this:

    Code :
    // imageCount==2 for FIFO, 3 for Mailbox
    // minImageCount==2
    uint32_t idx;
    acquiredImageAvailableSemaphore = device.createSemaphoreUnique({});
    device.acquireNextImageKHR(*swapchain, timeout_infinite, *acquiredImageAvailableSemaphore, {}, &idx);
    imageAvailableSemaphores[idx].swap(acquiredImageAvailableSemaphore);
    device->waitForFences(1, &*presentationBufferExecutionFences[idx], VK_TRUE, vkt::timeout_infinite);
    device->resetFences(1, &*presentationBufferExecutionFences[idx]);
    vk::CommandBuffer& cb = *presentationCommandBuffers[idx];
    cb.beginRenderPass(&renderPassInfo, vk::SubpassContents::eInline);
    // I don't actually record any commands here at the moment
    vk::SubmitInfo submitInfo = {};
    const vk::PipelineStageFlags waitStage = { vk::PipelineStageFlagBits::eColorAttachmentOutput };
    submitInfo.waitSemaphoreCount = 1;
    submitInfo.pWaitSemaphores = &imageAvailableSemaphores[idx];
    submitInfo.pWaitDstStageMask = &waitStage;
    submitInfo.commandBufferCount = 1;
    submitInfo.pCommandBuffers = &cb;
    submitInfo.signalSemaphoreCount = 1;
    submitInfo.pSignalSemaphores = &presentWaitSemaphores[idx];
    graphicsQueue.submit(1, &submitInfo, *presentationBufferExecutionFences[idx]);
    vk::PresentInfoKHR presentInfo = {};
    presentInfo.waitSemaphoreCount = 1;
    presentInfo.pWaitSemaphores = &presentWaitSemaphores[idx];
    presentInfo.swapchainCount = 1;
    presentInfo.pSwapchains = &*swapchain;
    presentInfo.pImageIndices = &idx;

    The timings I get with mailbox look like this ([milliseconds::microseconds], release, no validation layers):
    Code :
    [ 5089:: 65] > acquiring image
    [ 5089:: 72] > acquired image: 0
    [ 5089:: 78] > waitForFences start
    [ 5089:: 80] > waitForFences end
    [ 5089:: 85] > submit
    [ 5089::137] > presentKHR
    [ 5089::300] > end
    [ 5089::323] > acquiring image
    [ 5089::330] > acquired image: 1
    [ 5089::335] > waitForFences start
    [ 5089::336] > waitForFences end
    [ 5089::341] > submit
    [ 5089::396] > presentKHR
    [ 5089::532] > end
    [ 5089::536] > acquiring image
    [ 5089::558] > acquired image: 2
    [ 5089::563] > waitForFences start
    [ 5089::565] > waitForFences end
    [ 5089::569] > submit
    [ 5089::603] > presentKHR
    [ 5089::705] > end
    [ 5089::710] > acquiring image
    [ 5089::715] > acquired image: 0
    [ 5089::734] > waitForFences start
    [ 5089::736] > waitForFences end
    [ 5089::740] > submit
    [ 5089::788] > presentKHR
    [ 5089::957] > end
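For anyone who wants to reproduce this kind of log, here is a minimal sketch of how stamps in the [milliseconds::microseconds] layout could be produced with std::chrono. The helper names formatTimestamp and logEvent are hypothetical and not taken from the code in this thread; the field widths are an assumption read off the log lines above.

```cpp
#include <chrono>
#include <cstdio>
#include <string>

// Hypothetical helper: formats an elapsed duration as "[   ms:: us]",
// using the same field widths as the log lines above (assumed: 5 and 3).
std::string formatTimestamp(std::chrono::microseconds elapsed) {
    const long long total = elapsed.count();
    char buf[48];
    std::snprintf(buf, sizeof(buf), "[%5lld::%3lld]", total / 1000, total % 1000);
    return buf;
}

// Hypothetical helper: prints a message stamped with the time since `start`.
// steady_clock is used because it never jumps backwards.
void logEvent(std::chrono::steady_clock::time_point start, const char* msg) {
    const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start);
    std::printf("%s > %s\n", formatTimestamp(elapsed).c_str(), msg);
}
```

Calling logEvent(start, "presentKHR") right before the present call and logEvent(start, "end") right after it brackets the call in the same way as the logs here.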

    There are some things I'm wondering about:
    - The acquired images are always in consecutive order [0, 1, 2, 0, 1, 2, etc], though I would expect the presentation engine to be presenting one of them, resulting in something like [0, 1, 2, 1, 2, 1, 0, 2, 0, 2]. I guess the presentation engine works a bit differently internally and makes a copy of the relevant data?
    - Submit takes a bit of time, this makes sense. PresentKHR takes significantly more time. Is this normal?
    - Am I handling the semaphores correctly?

    However, the really odd part was when I used the FIFO present mode. I expected vkAcquireNextImageKHR to block, but what I got instead was this:

    Code :
    [ 7305:: 69] > acquiring image
    [ 7305:: 84] > acquired image: 1
    [ 7305:: 92] > waitForFences start
    [ 7305:: 94] > waitForFences end
    [ 7305::106] > submit
    [ 7305::166] > presentKHR
    [ 7321::533] > end
    [ 7321::553] > acquiring image
    [ 7321::583] > acquired image: 0
    [ 7321::604] > waitForFences start
    [ 7321::607] > waitForFences end
    [ 7321::620] > submit
    [ 7321::676] > presentKHR
    [ 7338::135] > end

    As you can see, acquiring the image is instantaneous. Instead, vkQueuePresentKHR seems to be the synchronization point for my code. Why? Am I doing something wrong? Is this expected (undocumented?) behaviour?
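To put numbers on the blocking, the presentKHR and end stamps in the logs above can be subtracted. A minimal sketch, assuming the stamps keep the [milliseconds::microseconds] layout; parseStampMicros and elapsedMicros are hypothetical helper names, not part of the code in this thread.

```cpp
#include <cstdio>

// Parses a log stamp such as "[ 7305::166]" into microseconds.
// Returns -1 if the text does not match that layout.
long long parseStampMicros(const char* stamp) {
    long long ms = 0;
    long long us = 0;
    if (std::sscanf(stamp, "[%lld::%lld]", &ms, &us) != 2) {
        return -1;
    }
    return ms * 1000 + us;
}

// Time between two stamps, in microseconds (both stamps must be valid).
long long elapsedMicros(const char* from, const char* to) {
    return parseStampMicros(to) - parseStampMicros(from);
}
```

For the FIFO log, elapsedMicros("[ 7305::166]", "[ 7321::533]") gives 16367 and the second frame gives 16459, so each present call blocks for roughly 16.4 ms. That would be consistent with waiting for one refresh of a 60 Hz display (about 16.67 ms), although the refresh rate is not stated here. For comparison, the first mailbox frame gives 163.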

    I'm using a G-Sync compatible laptop with a GTX 980M. The drivers are approximately one week old and G-Sync is disabled in the NVIDIA control panel.

    Any help and advice is appreciated (relevant to the topic or not)!


  2. #2
    Senior Member
    Join Date
    Mar 2016
    What does the imageAvailableSemaphores[idx].swap(acquiredImageAvailableSemaphore); do?
    What's the purpose of presentationBufferExecutionFences; what signals it?
    What OS and Compositor is this on?

    Generally yes, it is a valid choice for the implementation to copy out the image. E.g. DRI3/Present:

    When the X server has finished using 'pixmap' for this
    operation, it will send a PresentIdleNotify event and arrange
    for any 'idle-fence' to be triggered. This may be at any time
    following the PresentPixmap request -- the contents may be
    immediately copied to another buffer, copied just in time for
    the vblank interrupt or the pixmap may be used directly for
    display (in which case it will be busy until some future
    PresentPixmap operation).

  3. #3
    Junior Member
    Join Date
    Jun 2017
    Quote: Originally posted by krOoze
    What does the imageAvailableSemaphores[idx].swap(acquiredImageAvailableSemaphore); do?
    It swaps the two handles stored in the referenced unique handles. Because I don't know which image comes next, I wanted to make sure I didn't use any semaphore that may still be in use, so I swap them after I obtain the next image index.
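The swap pattern described here can be modelled without any Vulkan objects. This is only a sketch: FakeSemaphore is a stand-in for a unique semaphore handle such as vk::UniqueSemaphore, and SemaphorePool, onAcquired and makePool are hypothetical names. The integer ids exist only to make the swaps observable, and the three slots are an assumption matching the mailbox case.

```cpp
#include <array>
#include <cstddef>
#include <memory>

// Stand-in for a unique (move-only) semaphore handle.
using FakeSemaphore = std::unique_ptr<int>;

struct SemaphorePool {
    std::array<FakeSemaphore, 3> perImage;  // one slot per swapchain image
    FakeSemaphore spare;                    // handed to the next acquire call

    // Call once acquire has returned `idx`: the semaphore that was just
    // passed to acquire moves into the slot for `idx`, and the slot's
    // previous semaphore becomes the spare for the next acquire.
    void onAcquired(std::size_t idx) { perImage[idx].swap(spare); }
};

// Builds a pool whose slots hold ids 0, 1, 2 and whose spare holds id 99.
SemaphorePool makePool() {
    SemaphorePool pool;
    for (std::size_t i = 0; i < pool.perImage.size(); ++i) {
        pool.perImage[i] = std::make_unique<int>(static_cast<int>(i));
    }
    pool.spare = std::make_unique<int>(99);
    return pool;
}
```

After onAcquired(1) on a fresh pool, slot 1 holds the semaphore that was passed to acquire (id 99) and the spare holds the one that used to sit in slot 1 (id 1), so no semaphore is created or destroyed per frame.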

    Quote: Originally posted by krOoze
    What's the purpose of presentationBufferExecutionFences; what signals it?
    They are the fences used with command buffer submission.

    Quote: Originally posted by krOoze
    What OS and Compositor is this on?
    Windows 10 with Visual Studio 2017 (I assumed that's what you meant by compositor). I am currently using Vulkan 1.1.70, as the latest version a week or so back had some problems in vulkan.hpp. I obtained the surface I am presenting to using GLFW.

  4. #4
    Senior Member
    Join Date
    Mar 2016
    I think that is not necessary. Apparently you use vkAcquire and vkPresent in discrete 1:1 pairs. But can't hurt to be paranoid...
    Although, I assume the swapped out semaphore is destroyed at the end of scope, so you are assuming it is not used at that point anyway.

    Oh, right. I missed the fence being referenced in the submit command. You apparently need to do that, because you are re-recording the cmdbuffer in the render loop.
    You are waiting on it before signal though; I assume it was created pre-signaled?

    By Compositor I mean whether you e.g. use Wayland, or X. On Windows it does not matter; there is only one available.

    On AMD it does indeed block for me on vkAcquire in FIFO.
    Apparently it is a known behavior of NVIDIA.
    You could try to create the Swapchain with one extra image. Your fence should already make sure you do not queue more than one Present at a time.
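A minimal sketch of picking "one extra image", assuming the limits come from the minImageCount and maxImageCount fields of VkSurfaceCapabilitiesKHR. ImageCountLimits and chooseImageCount are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Mirrors the two VkSurfaceCapabilitiesKHR fields that matter here.
struct ImageCountLimits {
    std::uint32_t minImageCount;
    std::uint32_t maxImageCount;  // 0 means there is no upper limit
};

// Requests `extra` images on top of the minimum, clamped to the maximum
// when the surface reports one.
std::uint32_t chooseImageCount(ImageCountLimits caps, std::uint32_t extra = 1) {
    std::uint32_t wanted = caps.minImageCount + extra;
    if (caps.maxImageCount != 0) {
        wanted = std::min(wanted, caps.maxImageCount);
    }
    return wanted;
}
```

With minImageCount == 2 and no upper limit this asks for 3 images. The result would go into the swapchain create info's minImageCount, and the implementation may still create more images than requested.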

  5. #5
    Junior Member
    Join Date
    Jun 2017
    The swapped-out semaphore is actually kept until the next vkAcquireNextImageKHR call, when it is swapped with a new one. I omit the creation of the semaphore every frame at this point, as the behaviour is the same anyway.

    I figured that if I don't know what the next image is, I don't want to accidentally give the vkAcquireNextImageKHR call a semaphore that is in use, thus I have one spare that I swap out. I was under the impression the semaphore is signaled when the image is ready to be presented to screen, which may not be when it is acquired or submitted. Only once I acquire an image can I be certain that its semaphore is no longer in use, as I assume the acquire will not give me the same image twice without presenting it on screen (and thus signalling the semaphore).

    And indeed, the fences are created pre-signaled.

    Interesting that it works as expected on AMD. For me, vkQueuePresentKHR blocks even if I use more images (I tried 3 and 5). I have also tried calling acquireNextImageKHR with a fence or changing the timeout, but it does not change anything. vkAcquireNextImageKHR always returns VK_SUCCESS without blocking and vkQueuePresentKHR always blocks.
