> Using 32*32 sounds to me VERY bad, the changes between two normals should be quite big
“VERY bad” is probably an overstatement.
Actually, as the first poster mentioned, 32x32 faces can indeed be adequate for general use, particularly for the diffuse lighting (L dot N) term.
The point to realize is that shading computations usually have multiple sources of error (particularly when done in fixed point), so it makes no sense to “max out” the size of your normalization cube map if other errors are going to dominate.
Think about a 256x256 normalization cube map face. The function stored in the faces is fairly smooth (it forms a sphere, after all), so linear filtering works pretty well to provide intermediate values. For a typical 8-bit RGB texture, realize you have only 8 bits per component, and if you use “signed expand” (multiply by 2, subtract 1) to move the [0,1] range to a [-1,1] range, you just spent one of those 8 bits on a sign bit, so there are only 7 effective bits of magnitude.
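As a concrete check, here’s a quick Python sketch (my own illustration, not from the original discussion) of what signed expand leaves you with for an 8-bit component:

```python
# "signed expand": an 8-bit texel t in [0, 255] is stored as t/255 in [0, 1]
# and mapped to a normal component n = (t/255) * 2 - 1 in [-1, 1]
vals = [(t / 255.0) * 2.0 - 1.0 for t in range(256)]

step = vals[1] - vals[0]
print("component spacing:", step)          # 2/255, about 0.0078

positive = [v for v in vals if v > 0.0]
print("positive levels:", len(positive))   # 128, i.e. 7 bits of magnitude

# a quirk of this particular expand: an exact 0.0 is not representable
print("exact zero representable:", 0.0 in vals)
```

Note the last point: with this expand, no texel value maps to exactly zero, which is another small error source for components that should vanish.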
So how useful is a 256x256 cube map? It’s basically at the limit of what 8 bits per texel component can represent. If the filtering is also limited to 8 bits, don’t expect a lot more goodness. Really, the combination of 8-bit texel filtering with a texture format that has just 8 bits per texel component is the source of your error, more than the cube map face size.
It would likely be better to use a 32x32 cube map face with more than 8 bits of filtering precision per texel component than to increase the cube map face size.
Try it yourself to see the difference.
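If you want a feel for it before writing a GPU test, here’s a CPU-side Python sketch (the face sizes, sampling grid, and single +Z-face bilinear model are my own assumptions, not anything from the discussion above) that estimates the worst-case angular error of a normalization cube map face at two size/precision combinations:

```python
import math

def quantize(c, bits):
    # store a [-1,1] component as unsigned fixed point via "signed expand"
    levels = (1 << bits) - 1
    return round((c + 1.0) * 0.5 * levels) / levels * 2.0 - 1.0

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def texel_dir(i, j, size, bits):
    # quantized unit vector stored at texel center (i, j) of the +Z face
    s = (i + 0.5) / size * 2.0 - 1.0
    t = (j + 0.5) / size * 2.0 - 1.0
    return tuple(quantize(c, bits) for c in normalize((s, t, 1.0)))

def lookup(s, t, size, bits):
    # bilinear fetch on the +Z face (face edges not handled; we stay inside)
    u = (s + 1.0) * 0.5 * size - 0.5
    v = (t + 1.0) * 0.5 * size - 0.5
    i0, j0 = int(math.floor(u)), int(math.floor(v))
    fu, fv = u - i0, v - j0
    acc = (0.0, 0.0, 0.0)
    for di, dj, w in ((0, 0, (1 - fu) * (1 - fv)), (1, 0, fu * (1 - fv)),
                      (0, 1, (1 - fu) * fv), (1, 1, fu * fv)):
        tex = texel_dir(i0 + di, j0 + dj, size, bits)
        acc = tuple(a + w * c for a, c in zip(acc, tex))
    return acc

def max_angle_error_deg(size, bits, n=160):
    worst = 0.0
    for i in range(n):
        for j in range(n):
            # sample directions away from the face edges
            s = -0.9 + 1.8 * (i + 0.5) / n
            t = -0.9 + 1.8 * (j + 0.5) / n
            true = normalize((s, t, 1.0))
            got = normalize(lookup(s, t, size, bits))
            d = min(1.0, sum(a * b for a, b in zip(got, true)))
            worst = max(worst, math.degrees(math.acos(d)))
    return worst

e_big = max_angle_error_deg(256, 8)
e_small = max_angle_error_deg(32, 16)
print("256x256 face, 8-bit texels : %.3f deg max error" % e_big)
print(" 32x32 face, 16-bit texels : %.3f deg max error" % e_small)
```

On this model the small high-precision face comes out ahead of the big 8-bit one, which matches the argument above: the texel precision, not the face size, is the limiting factor.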
GeForce FX GPUs support the GL_HILO8_NV (and GL_SIGNED_HILO8_NV) texture formats. These formats provide two 8-bit components that are filtered with 16-bit precision. By ganging two HILO8 textures together, you can get 3 (really 4) components filtered at 16-bit precision.
Do something like:
TEX R0, f[TEX0], TEX0, CUBE; // HILO8 fetch for x and y
TEX R1, f[TEX0], TEX1, CUBE; // HILO8 fetch for z (and w)
MUL R2, R0.xyww, R1.wwxy;    // combine into (x, y, z, w)
The MUL relies on a trick of the HILO format where the W component of a HILO texture fetch result is always 1.0.
Now store the x and y components of your normalization cube map in texture unit 0’s HILO8 cube map, and the z component (and the w component, if you wanted a 4th component for some reason) in texture unit 1’s HILO8 cube map.
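To sanity-check the swizzle trick, here’s a tiny Python sketch (the fetch values are made up; the z components of the fetch results are placeholders) of what that MUL does with hypothetical HILO fetch results:

```python
def swizzle(v, pattern):
    # pick components by name, like the .xyww / .wwxy selectors
    idx = {"x": 0, "y": 1, "z": 2, "w": 3}
    return tuple(v[idx[c]] for c in pattern)

def mul(a, b):
    # component-wise multiply, like the MUL instruction
    return tuple(x * y for x, y in zip(a, b))

# made-up fetch results: a HILO fetch returns its two stored values
# in x and y, and its w component is always 1.0
R0 = (0.6, 0.8, 0.0, 1.0)   # unit 0: normal x and y
R1 = (0.3, 1.0, 0.0, 1.0)   # unit 1: normal z (and a spare w)

R2 = mul(swizzle(R0, "xyww"), swizzle(R1, "wwxy"))
print(R2)   # (0.6, 0.8, 0.3, 1.0)
```

The constant-1.0 w components are what let a single multiply pass each register’s real data through while masking out the other’s.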
You could do the same thing with HILO16 as well if you wanted more precision stored in the normalization cube map.
You can do a whole bunch of experiments and decide which is best for your application.
The case where having more normalization precision is most helpful is when you are doing specular lighting computations. This is where you compute
pow(max(0,dot(N,H)),shininess)
where N is your normal, H is your half-angle vector (the normalized sum of the view vector and the light vector), and shininess is your surface’s specular shininess exponent.
Because you are raising dot(N,H) to a power (a very non-linear function), any normalization limitations (whether due to filtering, cube map face precision, or any other reason) are exacerbated.
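A quick back-of-the-envelope Python sketch makes the amplification concrete (the angle, error, and shininess values here are made-up examples, with the error chosen to be roughly the worst case for an 8-bit normal):

```python
import math

shininess = 64.0                 # made-up but typical exponent
theta = math.radians(10.0)       # assumed true angle between N and H
err = math.radians(0.4)          # roughly worst-case 8-bit normal error

# diffuse-style term: the small angular error barely matters
diff_ratio = math.cos(theta + err) / math.cos(theta)

# specular term: the same error is amplified by the exponent
spec_exact = max(0.0, math.cos(theta)) ** shininess
spec_wrong = max(0.0, math.cos(theta + err)) ** shininess
spec_ratio = spec_wrong / spec_exact

print("diffuse term off by %.2f%%" % (abs(diff_ratio - 1.0) * 100.0))
print("specular term off by %.2f%%" % (abs(spec_ratio - 1.0) * 100.0))
```

The same fraction-of-a-degree error that is invisible in the diffuse term shifts the specular term by several percent, and a higher exponent makes it worse.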
Note that even if you improve the quality of the normalization used in your specular computation for per-pixel lighting with surface-space bump mapping, there are still other errors, such as the fact that a bumpy surface really has a distribution of surface normals at a viewed pixel rather than just one.
Also, there can be errors if you are computing the half-angle vector per-vertex (a good technique for performance) rather than per-fragment. You might notice the difference on big wall polygons, but you can fix that by tessellating the wall better. Obviously, if you don’t care about squeezing out the most rendering performance for a reasonable visual result, you can just do per-fragment half-angle computations (but most developers care about performance).
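Here’s a small Python sketch of that per-vertex drift (the wall, eye, and light positions are entirely made-up geometry): it compares the half-angle interpolated between two far-apart vertices against the exact per-fragment half-angle at the same surface point.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def half_angle(p, eye, light):
    # H = normalize(V + L), computed exactly at surface point p
    V = normalize(tuple(e - c for e, c in zip(eye, p)))
    L = normalize(tuple(l - c for l, c in zip(light, p)))
    return normalize(tuple(a + b for a, b in zip(V, L)))

# made-up scene: a wall edge spanning x in [-5, 5], eye and light nearby
eye = (0.0, 0.0, 2.0)
light = (0.0, 0.0, 4.0)
v0, v1 = (-5.0, 0.0, 0.0), (5.0, 0.0, 0.0)

# per-vertex H, linearly interpolated (then renormalized) at t = 0.25
t = 0.25
H0, H1 = half_angle(v0, eye, light), half_angle(v1, eye, light)
H_interp = normalize(tuple((1 - t) * a + t * b for a, b in zip(H0, H1)))

# per-fragment H at the same surface point
p = tuple((1 - t) * a + t * b for a, b in zip(v0, v1))
H_exact = half_angle(p, eye, light)

d = min(1.0, sum(a * b for a, b in zip(H_interp, H_exact)))
err_deg = math.degrees(math.acos(d))
print("interpolated H is off by %.2f degrees" % err_deg)
```

An error of a degree or so is bigger than anything the normalization cube map contributes, which is why tessellating big polygons (or going per-fragment) matters more there.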
My personal experience is that 32x32 or 64x64 normalization cube maps are “good enough” when you use RGB8 textures to store your normalization cube maps. By that I mean a higher-precision cube map doesn’t significantly improve the appearance of bumpy objects in the scene. Your mileage may vary if you have very smooth surfaces where normalization quantization becomes a visible artifact, you have high shininess terms, or you are just really picky.
There is generally only a marginal performance cost when you pick a larger cube map face size than you really need. Same with the texel precision.
I hope this helps.