New GPU render method.

I love programming 3D graphics renderers.

But every API relies on many CPU-side function calls, and that is not good.
In the Vulkan API I saw DrawIndirect, and NVIDIA has VK_NVX_device_generated_commands!
That pushed me to write down my idea for a new GPU rendering method.

I propose creating a new kind of shader on the GPU, called the GPU-manager. This shader works with a struct.
The GPU-manager does all the work (setting textures, render targets, shaders, meshes, MSAA, depth, culling) without the CPU!

The new shader (GPU-manager) would allow running old games written for DirectX 9/10/11 as well as new games (games that use the new GPU-manager shader).

There is one struct per mesh (for example 64 bytes), and in this struct we describe all the state needed to render it.
The struct relieves us of the CPU function calls!

What the struct looks like (as an example; one struct per mesh, so for 10 meshes you write 10 of these structs):

1-GPU pointer to Vertex Buffer
2-GPU pointer to Index Buffer
3-GPU pointer to RenderTarget
4-GPU pointer to TextureA
5-GPU pointer to TextureB
6-GPU pointer to TextureC
7-GPU pointer to TextureD
8-GPU pointer to TextureE
9-GPU pointer to Vertex shader
10-GPU pointer to Pixel shader
11-GPU pointer to Blend state
12-MSAA flag: 0=1x (off), 1=2xMSAA, 2=4xMSAA, 3=8xMSAA
13-Cull mode: 0=None, 1=BackFaces, 2=FrontFaces
14-Depth: 0=Off, 1=On
15-int32: how many triangles to draw for this mesh
16-int32: how many times to draw this mesh
17-GPU feature flag (for example boolean operations with meshes, occlusion query, etc.)
18-Number of the mesh on which those operations will be performed.
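
To make the field list concrete, here is a minimal C++ sketch of such a per-mesh struct. All names (MeshDrawDesc, GpuPtr, MsaaMode, CullMode, sampleCount) are hypothetical and not part of any real API, and a "GPU pointer" is assumed to be a 64-bit address. Under that assumption the 18 fields take 120 bytes in this layout, which is more than the 64 bytes given above as an example.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical: a "GPU pointer" modelled as a 64-bit device address.
using GpuPtr = std::uint64_t;

enum class MsaaMode : std::uint32_t { X1 = 0, X2 = 1, X4 = 2, X8 = 3 };
enum class CullMode : std::uint32_t { None = 0, BackFaces = 1, FrontFaces = 2 };

// One struct per mesh; the comments give the field numbers from the list.
struct MeshDrawDesc {
    GpuPtr vertexBuffer;         // 1
    GpuPtr indexBuffer;          // 2
    GpuPtr renderTarget;         // 3
    GpuPtr textures[5];          // 4-8: TextureA..TextureE
    GpuPtr vertexShader;         // 9
    GpuPtr pixelShader;          // 10
    GpuPtr blendState;           // 11
    MsaaMode msaa;               // 12
    CullMode cull;               // 13
    std::uint32_t depthEnabled;  // 14: 0 = off, 1 = on
    std::uint32_t triangleCount; // 15
    std::uint32_t drawCount;     // 16
    std::uint32_t featureFlags;  // 17
    std::uint32_t targetMesh;    // 18
    std::uint32_t reserved;      // explicit padding to an 8-byte multiple (assumption)
};

// 11 pointers * 8 bytes + 8 32-bit fields * 4 bytes = 120 bytes.
static_assert(sizeof(MeshDrawDesc) == 120, "unexpected padding");
static_assert(std::is_trivially_copyable<MeshDrawDesc>::value,
              "must be copyable to GPU memory as raw bytes");

// Converts the MSAA flag encoding (0,1,2,3) into a sample count (1,2,4,8).
constexpr std::uint32_t sampleCount(MsaaMode m) {
    return 1u << static_cast<std::uint32_t>(m);
}
```

Squeezing this into 64 bytes would need something smaller than raw 64-bit pointers, for example 32-bit indices into resource tables, but that is a design choice the list above does not make.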

Then send these structs to the GPU-manager shader and call DrawGPUList();
In code it looks like this:

1-Fill a struct for each mesh
2-SendGPUList(&pStruct,NumObjects);
3-DrawGPUList();
4-Present

Now we can draw the whole 3D world in one DIP (draw call)!
That includes drawing the shadow map and the meshes that use the shadow map.
This method frees the GPU from waiting on commands sent by the CPU,
so the GPU works more efficiently!

Think about it !

This method would give vendors (I mean NVIDIA and AMD) more freedom!
Because the GPU-manager shader runs on the GPU, the CPU no longer has to fit the GPU into API standards such as DirectX 9, DirectX 11, etc.
The GPU-manager shader also allows GPU work to be managed more efficiently and faster,
with fewer actions and more flexibility!

Yes!
And we can render meshes with different MSAA levels in the same render!
I think the many CPU commands/functions for the GPU are ballast, and the code becomes overgrown with unnecessary functions.
The code gets big and uncomfortable.

More about Present:
4-The Present function accepts any texture, not only the swap chain back buffer!
This makes it convenient to see what the render looks like in some texture.
It is also more practical.

Now we can present many textures without an extra, unnecessary render onto a plane.
Present(swap chain back buffer,color key); // let's say this is the main render
Present(Character Icons,color key); // let's say this is a couple of small icons, drawn like a PNG with alpha
Present(MainMenu,color key); // let's say this is the main menu, drawn at the center of the monitor like a PNG with alpha

The color key is used for blended or alpha images.

If we want to draw many textures, let's say 40 textures,
a good solution is PresentList(pPointerFirstTexture,HowManyTextures flip); The GPU quickly flips all the textures onto the screen with blending/alpha.
pPointerFirstTexture points to an array of data: first a pointer to a texture, second how to draw this texture (with alpha or without alpha), and then the same pair (pointer, alpha mode) for each following texture.

This is a good solution when we render the main menu or the icons once and later draw them without rendering again: a simple flip onto the screen, with no need to keep re-rendering these textures.
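
As a rough CPU-side model of that (pointer, alpha mode) array, here is a small C++ sketch. Everything in it is hypothetical (Layer, AlphaMode, presentList), a plain pixel vector stands in for the GPU pointer to a texture, and it only models exact color-key transparency, not real alpha blending.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical alpha modes for one PresentList entry.
enum class AlphaMode : std::uint32_t { Opaque = 0, ColorKey = 1 };

// One (pointer, alpha mode) pair; the pixel vector stands in for a GPU texture.
struct Layer {
    const std::vector<std::uint32_t>* pixels;
    AlphaMode mode;
};

// Flips every layer onto the screen in list order. In ColorKey mode,
// pixels equal to colorKey are treated as transparent and skipped.
std::vector<std::uint32_t> presentList(const std::vector<Layer>& layers,
                                       std::size_t pixelCount,
                                       std::uint32_t colorKey) {
    std::vector<std::uint32_t> screen(pixelCount, 0);
    for (const Layer& layer : layers) {
        assert(layer.pixels != nullptr && layer.pixels->size() == pixelCount);
        for (std::size_t i = 0; i < pixelCount; ++i) {
            const std::uint32_t p = (*layer.pixels)[i];
            if (layer.mode == AlphaMode::ColorKey && p == colorKey) {
                continue; // keep whatever an earlier layer put here
            }
            screen[i] = p;
        }
    }
    return screen;
}
```

The layers are pre-rendered once; presenting them again is only this copy loop, which is the point of the "render once, flip later" idea.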

pRect is needed for better control over where the texture, or a part of the texture, is output on screen!
pRect is a struct {
ScreenX DWORD 0
ScreenY DWORD 0
left DWORD 0
top DWORD 0
right DWORD 0
bottom DWORD 0
}

Now we can draw at any place on the monitor, and draw any part of the texture or the whole texture!
Present(swap chain back buffer,color key,pRect);
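
A possible C++ reading of pRect, as a sketch only: PresentRect, ScreenArea and destinationOnScreen are hypothetical names, the DWORD fields are modelled as 32-bit unsigned integers, and right/bottom are assumed to be exclusive (as in a Win32 RECT), which the post does not specify.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the pRect fields: where to place the image on screen, plus the
// source sub-rectangle of the texture.
struct PresentRect {
    std::uint32_t screenX = 0;
    std::uint32_t screenY = 0;
    std::uint32_t left = 0;
    std::uint32_t top = 0;
    std::uint32_t right = 0;   // assumed exclusive
    std::uint32_t bottom = 0;  // assumed exclusive
};

// Screen-space rectangle covered by the presented part of the texture.
struct ScreenArea {
    std::uint32_t x0, y0, x1, y1;
};

ScreenArea destinationOnScreen(const PresentRect& r) {
    assert(r.right >= r.left && r.bottom >= r.top);
    return ScreenArea{r.screenX,
                      r.screenY,
                      r.screenX + (r.right - r.left),
                      r.screenY + (r.bottom - r.top)};
}
```

For example, the 64x32 region starting at (16,16) in the texture, placed at screen position (100,50), covers the screen area from (100,50) to (164,82).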

Initialization and renderer setup look like this:

  1. hr = InitGPUManeger(&pDevice);// hr == 1 means OK, anything else means failure. If pDevice == NULL, no GPU-manager was detected on this video card.

  2. pDevice->CreateSwapChain(SizeX,SizeY,FlagMode,V-sync)

  3. pDevice->LoadShader(FileName,ShaderName,&pShader);// pShader is a 64-bit GPU memory address. We write it into the GPU-manager struct for the mesh.

  4. pDevice->LoadTexture(FileName,&pTexture);// pTexture is a 64-bit GPU memory address. We write it into the GPU-manager struct for the mesh.

  5. pDevice->CreateVertexBuffer(CPUmem,desc,&pGPUVertexmem);// pGPUVertexmem is a 64-bit GPU memory address. We write it into the GPU-manager struct for the mesh.

  6. pDevice->CreateIndexBuffer(CPUmem,desc,&pGPUIndxmem);// pGPUIndxmem is a 64-bit GPU memory address. We write it into the GPU-manager struct for the mesh.

  7. pDevice->CreateStorageBuffer(FlagRead\Write,size*16,pData);// pData is a 64-bit GPU memory address. For matrices (View/Proj/World) or other data; usable in any shader.

  8. pDevice->GetFeatures(&pDescFeatuares);// Get all feature flags. Needed by the GPU-manager structs for boolean operations, occlusion queries, copying data, etc.

  9. Do frustum culling of the meshes on the CPU, and write an array of GPU-manager structs for the meshes that pass

  10. SendGPUList(&pStruct,NumObjects);// Only one call for all meshes!

  11. DrawGPUList();// Only one call to draw all meshes with all their states and graphics pipelines!

  12. Present(swap chain back buffer,color key,pRect);
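
Step 9 is the only per-frame CPU work in this list, so here is a minimal C++ sketch of it. It uses a common bounding-sphere test against six frustum planes; that particular test is my assumption, since the post does not say how the culling is done, and Plane, Sphere, sphereInFrustum and visibleMeshes are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Plane in the form n.p + d = 0, with the normal pointing into the frustum.
// The normal is assumed to be unit length so distances compare with radii.
struct Plane { float nx, ny, nz, d; };

// Bounding sphere of one mesh.
struct Sphere { float x, y, z, radius; };

// True if the sphere is inside or intersects the frustum.
bool sphereInFrustum(const Sphere& s, const std::array<Plane, 6>& planes) {
    for (const Plane& p : planes) {
        const float dist = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (dist < -s.radius) {
            return false; // completely behind this plane
        }
    }
    return true;
}

// Returns the indices of the meshes that pass culling. Only these meshes
// would get a GPU-manager struct in the array handed to SendGPUList.
std::vector<std::size_t> visibleMeshes(const std::vector<Sphere>& bounds,
                                       const std::array<Plane, 6>& planes) {
    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < bounds.size(); ++i) {
        if (sphereInFrustum(bounds[i], planes)) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

The test is conservative: a sphere that only touches the frustum still counts as visible, so a mesh is never dropped while part of it could be on screen.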

That's all! It is simple and fast to write a basic renderer for a big 3D world with shadows, lighting and animated meshes!

If we want to get data from GPU memory, do:
pDevice->GetGPUData(pGPUresource (a texture, buffer or mesh),&pCPUmemory);// pCPUmemory is a big data array, not an 8-byte pointer! We specify the place in CPU memory where the GPU data will be stored.
Then, if we have changed the data and want to send it back to the GPU, do:
pDevice->SendGPUData(&pCPUmemory,pGPUresource,how many bytes to send);

IIUC, some of the performance benefits of Vulkan relate to being able to produce “binaries” for shaders where things like depth buffer status and blending are fixed.
OpenGL let you switch by simply creating a ton of different versions of each shader. Vulkan requires you to explicitly request each version - although some parameters can be flagged as runtime-changeable (using dynamic state).

Another problem is that shader code runs on shader cores, which don’t have the physical ability to do any of the things you’re asking (they can’t change state).
All of those tasks are done by the master scheduler/dispatch thingy (no idea what the right name is) that orders the cores around. However, that’s not programmable (at least, not in the way you want).

Perhaps future hardware will someday allow user control of said master thingy.

Oops, after rereading post it seems I got a bit confused - and there’s no edit here.
So ignore paragraphs 2 & 3.

One question - what advantage does this have over simply adding more functionality to VK_NVX_device_generated_commands?
It seems that frequently you’ll be sending the same values for many parameters, which eats up data.
Also, every parameter will need to be checked to see if it changes so that you don’t waste time reloading the same resources.
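
To illustrate the checking cost, here is a small C++ sketch of a consumer that walks the list and reloads a field only when it differs from the previous struct. DrawState is a hypothetical, reduced subset of the per-mesh struct, and countStateLoads is just a counter, not a model of any real hardware.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reduced, hypothetical subset of the per-mesh struct.
struct DrawState {
    std::uint64_t vertexShader;
    std::uint64_t pixelShader;
    std::uint64_t texture;
};

// Counts how many state loads happen if each field is reloaded only when
// it differs from the same field in the previous struct.
std::size_t countStateLoads(const std::vector<DrawState>& list) {
    std::size_t loads = 0;
    const DrawState* prev = nullptr;
    for (const DrawState& cur : list) {
        if (prev == nullptr || prev->vertexShader != cur.vertexShader) ++loads;
        if (prev == nullptr || prev->pixelShader != cur.pixelShader) ++loads;
        if (prev == nullptr || prev->texture != cur.texture) ++loads;
        prev = &cur;
    }
    return loads;
}
```

With three structs that share both shaders and differ only in one texture, this performs 4 loads instead of 9, but it still pays one comparison per field per struct, and the repeated values still occupy memory in every struct.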

If the mesh positions have not changed, there is no need to call SendGPUList.
The SendGPUList function only sets the data for the GPU-manager shader.
After that, every call to DrawGPUList(); draws the same meshes.

SendGPUList has to be called again only if we changed mesh positions, or if some meshes no longer pass frustum culling.
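
A tiny CPU mock of that behaviour in C++, as a sketch: GpuListMock, MeshEntry and sendIfChanged are hypothetical names, and comparing against a CPU-side copy of the last list is my own addition to show when a resend is needed; the proposal itself leaves that decision to the application.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reduced entry: which mesh, and where it is.
struct MeshEntry {
    std::uint64_t mesh;
    float x, y, z;
};

inline bool operator==(const MeshEntry& a, const MeshEntry& b) {
    return a.mesh == b.mesh && a.x == b.x && a.y == b.y && a.z == b.z;
}

class GpuListMock {
public:
    // Stands in for SendGPUList: uploads only if the list differs from the
    // one already resident. Returns true if an upload happened.
    bool sendIfChanged(const std::vector<MeshEntry>& list) {
        if (hasList_ && list == resident_) {
            return false;
        }
        resident_ = list;
        hasList_ = true;
        ++uploads_;
        return true;
    }

    // Stands in for DrawGPUList: draws whatever list is resident and
    // returns the number of meshes drawn.
    std::size_t draw() const { return resident_.size(); }

    std::size_t uploads() const { return uploads_; }

private:
    std::vector<MeshEntry> resident_;
    bool hasList_ = false;
    std::size_t uploads_ = 0;
};
```

A static scene costs one upload and then only draw() calls; moving a mesh or culling one away changes the list and triggers exactly one more upload.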

The GPU-manager shader's only task is to set all the states (textures/shaders/blending/culling/MSAA/depth/render target/index and vertex buffers/second buffer) and then draw the mesh the required number of times.
Then the GPU-manager shader (without commands from the CPU) advances to the next struct (for the next mesh) and does the same for that mesh.
It repeats this until the structs run out.
With this method a single DIP passes the mesh through different shaders right away: we draw with the shadow shader, then draw the mesh as usual, and we get the mesh and its shadow.
In Vulkan/DirectX/OpenGL we must first render the shadow for this mesh and then render the mesh with its shadow. And if we have many different meshes, we are always changing pipelines/states and calling many (Vulkan/DirectX/OpenGL) functions!
These switches lead to idle time! The GPU-manager shader eliminates this problem: the renderer becomes simpler to write and executes much faster!
P.S. In a DirectX/Vulkan renderer we also have some meshes for which new positions must always be sent.

If we want to draw a mesh and its shadow, we need two structs!
The first struct sets the vertex/pixel shaders that draw the mesh into the depth buffer.
The second struct sets the vertex/pixel shaders that draw the mesh together with its shadow, read from that depth buffer.
The GPU-manager renders this in one DIP! Both structs are executed in one DIP.
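
Here is a CPU emulation in C++ of the two-struct idea, as a sketch under stated assumptions: MeshDrawDesc is a reduced, hypothetical version of the per-mesh struct, runGpuManager only logs what would be bound and drawn, and shadowedMesh is a hypothetical helper that builds the depth struct and the shading struct for one mesh.

```cpp
#include <cstdint>
#include <vector>

// Reduced, hypothetical per-mesh struct: just enough state for two passes.
struct MeshDrawDesc {
    std::uint64_t mesh;
    std::uint64_t vertexShader;
    std::uint64_t pixelShader;
    std::uint64_t renderTarget;
    std::uint32_t drawCount;
};

// Record of one draw the emulated GPU-manager issued.
struct ExecutedDraw {
    std::uint64_t mesh;
    std::uint64_t pixelShader;
    std::uint64_t renderTarget;
};

// Walks the structs in order until the list ends: takes the state from each
// struct and draws the mesh the required number of times.
std::vector<ExecutedDraw> runGpuManager(const std::vector<MeshDrawDesc>& list) {
    std::vector<ExecutedDraw> log;
    for (const MeshDrawDesc& d : list) {
        for (std::uint32_t i = 0; i < d.drawCount; ++i) {
            log.push_back(ExecutedDraw{d.mesh, d.pixelShader, d.renderTarget});
        }
    }
    return log;
}

// Builds the two structs for one shadowed mesh: depth pass first, then the
// pass that draws the mesh with its shadow.
std::vector<MeshDrawDesc> shadowedMesh(std::uint64_t mesh,
                                       std::uint64_t depthVs, std::uint64_t depthPs,
                                       std::uint64_t shadowMap,
                                       std::uint64_t litVs, std::uint64_t litPs,
                                       std::uint64_t backBuffer) {
    return {MeshDrawDesc{mesh, depthVs, depthPs, shadowMap, 1},
            MeshDrawDesc{mesh, litVs, litPs, backBuffer, 1}};
}
```

The order of the list matters here: the second struct reads what the first one wrote, so the depth pass has to be finished before the shading pass samples it.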