glClear for FBO

When I call glClear, it takes a second for the function to return. Wow!
I have a Radeon 9700 with Catalyst 7.5 on Windows XP.

I tried GL_INTENSITY32F_ARB with a 24-bit depth buffer.
It is only glClear that takes 1 second; the rest of the calls for rendering to this FBO are fast.

With RGBA8 and a 32-bit depth buffer it is fast. No problems.

A GL_INTENSITY32F_ARB buffer is probably not hardware accelerated. But it’s strange.

I can also create RGBA32F, RGBA16F and INTENSITY16F, and they all give the same result. Also, it must be hardware accelerated, since the FBO validates as complete.
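
By “validate” I just mean the usual EXT_framebuffer_object completeness check, roughly this fragment (the printf is only illustrative):

GLenum status = glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT);
if (status != GL_FRAMEBUFFER_COMPLETE_EXT)   /* returns complete for all the formats above */
    printf("FBO incomplete: 0x%x\n", (unsigned)status);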

Just to verify, you have made sure all drawing is done before the clear is run?

If so, this is absurd: with a simple memset over the old PCI bus you could clear almost 18 RGBA framebuffers at 1600x1200 in a second (1600 × 1200 × 4 bytes ≈ 7.3 MB per buffer, assuming 132 MB/s minus some overhead).

A purely speculative idea is that the clear runs in software and is actually so dumb that it loops over all pixels, writing only 7 bytes per iteration (4 for INTENSITY32F and 3 for the 24-bit depth). To make it worse, the buffers are most likely in different memory areas, so there would effectively be no streaming whatsoever over any kind of bus, not to mention the many, many non-/mis-aligned writes to the depth buffer.

Can you get an INTENSITY32F + 32-bit depth combination, just to put my theory to the test? If not, what about no depth buffer at all? What if you clear in two passes, one for the color buffer followed directly by one for depth (and also measure the times independently, as in the sketch below)?
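
Something along these lines, as an untested fragment (timer() is just a placeholder for whatever high-resolution timer you have, e.g. QueryPerformanceCounter):

double t0, t1, t2;
glFinish();                            /* make sure nothing earlier is still pending */
t0 = timer();
glClear(GL_COLOR_BUFFER_BIT);          /* color only */
glFinish();
t1 = timer();
glClear(GL_DEPTH_BUFFER_BIT);          /* depth only */
glFinish();
t2 = timer();
printf("color clear: %f s, depth clear: %f s\n", t1 - t0, t2 - t1);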

The first calls at the start of the frame are:

glClear(GL_DEPTH_BUFFER_BIT | GL_STENCIL_BUFFER_BIT | GL_COLOR_BUFFER_BIT);  /* clears the window framebuffer */
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT,2);                                  /* bind the FBO */
glClearColor(0.000000,0.000000,0.000000,0.000000);
glClear(GL_DEPTH_BUFFER_BIT | GL_COLOR_BUFFER_BIT);                          /* this is the slow clear */

The first glClear clears the window framebuffer; the second one clears the FBO.
While trying to debug, I can’t reproduce the problem anymore. I can’t even create LUMINANCE16F with a 24-bit or 32-bit depth buffer.

Also, I was making LUMINANCE16F, not INTENSITY16, if it matters.

I can create
LUMINANCE32F, D24, S0
LUMINANCE32F, D32, S0

and it’s fast now.
I also have glGetError calls sprinkled around and have been using GLintercept since forever. No errors were ever reported. Weird. Maybe the problem will appear again?

V-man wrote:
Also, I was making LUMINANCE16F, not INTENSITY16, if it matters.

I can create
LUMINANCE32F, D24, S0
LUMINANCE32F, D32, S0

and it’s fast now.

Ah, you were trying to use 16-bit “floating point”. No wonder it crapped out on you.

But it does remind me that IEEE should perhaps even consider floats in the range 0.0-1.0 (plus a sign) as a new, specific data type.

The problem was that no matter which floating-point target I created, glClear was very slow.
Even when I put 5 glClear calls in a row, each glClear took 1 second.

Actually, I can’t create LUMINANCE16F, but RGBA16F works.

And as I’m typing, I can reproduce the problem again with RGBA16F and RGBA32F (32-bit or 24-bit depth).
LUMINANCE32F is fine (32-bit or 24-bit depth).

I disabled GLintercept but the problem remains.
I’ll just use LUMINANCE32F with a 32-bit depth buffer, then.

LUMINANCE is not a renderable format (at least in D3D; OpenGL should behave the same way), so I wonder why glGetError returns no error. Is your FBO complete?

tamlin> We do have floats in the range 0.0-1.0; it’s called an int (stored in the texture). :wink:

LUMINANCE is not a renderable format (at least in D3D; OpenGL should behave the same way)
Why? I’m pretty sure that luminance is a well-defined render target in OpenGL. That D3D doesn’t let you do it shows only that D3D doesn’t let you do it.

OpenGL, if I recall correctly, defines rendering to a luminance framebuffer as taking the Red component of the fragment result and performing the pixel operations on that.
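
So attaching one should be as simple as something like this untested sketch (ARB_texture_float + EXT_framebuffer_object; the 512x512 size and the tex name are placeholders, and the FBO is assumed to be bound already):

GLuint tex;
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_2D, tex);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexImage2D(GL_TEXTURE_2D, 0, GL_LUMINANCE32F_ARB, 512, 512, 0, GL_LUMINANCE, GL_FLOAT, NULL);
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_2D, tex, 0);
/* Only the red component of the fragment output ends up in the luminance buffer. */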

LUMINANCE32F with a 32-bit or 24-bit depth buffer continues to work fine for me. Maybe they didn’t expose this format for D3D.

I’m getting an “Incomplete attachment” error with either INTENSITY32F or LUMINANCE32F. FLOAT_R32_NV works fine, though.
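
For what it’s worth, my FLOAT_R32_NV setup is roughly this untested fragment (as far as I remember, NV_float_buffer formats want the rectangle texture target; tex, width and height are placeholders, and the FBO is assumed to be bound):

glBindTexture(GL_TEXTURE_RECTANGLE_NV, tex);
glTexImage2D(GL_TEXTURE_RECTANGLE_NV, 0, GL_FLOAT_R32_NV, width, height, 0, GL_RED, GL_FLOAT, NULL);
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, GL_TEXTURE_RECTANGLE_NV, tex, 0);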

Originally posted by Korval:
Why? I’m pretty sure that luminance is a well-defined render target in OpenGL. That D3D doesn’t let you do it shows only that D3D doesn’t let you do it.
It doesn’t work on my ATI Mobility Radeon X1700 (Catalyst 7.4); I tried it, and the FBO is always incomplete. It may or may not be a driver bug. But for this purpose I use the undocumented extension GL_ATI_r_rg, which offers one- and two-channel renderable floating-point formats (supported on the Radeon 9500 and later; see glATI.h in their SDK).