Frustum Culling á la HoD

here’s my ultra-optimized frustum-culling algorithm :
the basic idea is that you assume that all visible objects are in a cone. you are at the top of it.
the calculation :
if an object has to be drawn depends on its distance to the cone’s middle-line. to calculate this, you need the shortest vector from this object to the middle-line of the cone (the vector from top to bottom). lets name it : “D”.

add this vector to the objects position. you’ll get a point on the middle-line. the distance between this point and he viewer is “D2”.
assuming the fov is 90 degrees, you need only to draw objects if their D is smaller or equal to their D2.

advantages : very simlple, very fast.

to get the distance , use this function :
p is the point, l[0] and l[1] are to points on the middle-line.

public static Verticle vektorPunktStrecke(Verticle p, Verticle[] l)
{
	float u = ( (((p.getX()-l[0].getX() )*( l[1].getX()-l[0].getX()) )+
	            ((p.getY()-l[0].getY())*(l[1].getY()-l[0].getY()))+
	            ((p.getZ()-l[0].getZ())*(l[1].getZ()-l[0].getZ())))
	            /
	            (float)Math.pow(getLength(getVectorBetween(l)),2));


    Verticle newP = new Verticle(l[0].getX()-u*getVectorBetween(l).getX(),
                               l[0].getY()-u*getVectorBetween(l).getY(),
                               l[0].getZ()-u*getVectorBetween(l).getZ());

the function was not written for this problem so it can be optimizied…

what do you think of my idea ?

I don’t see how this could be faster than the normal frustum culling by checking against 6 planes, which with early out should average of 3 plane checks per test. A plane check is a very simple operation, 3 add and 3 mul. I see a divide and power function in your code.

then your frustum-culling code is not the one i know… post plz…

in the optimized version, there is no pow-function (the getLength-func does a sqrt(x))
so you end up with 6 dots, one div, and 11 add/sub

with 3 checks and 3 add&muls per plane, you come to 9 adds and 9 muls…

worst case is 18+18, where my method still needs 6+1+11…

float operator * (const Vertex &u, const Vertex &v){
return u.x * v.x + u.y * v.y + u.z * v.z;
}

float Plane::distance(const Vertex &v) const {
return (v * normal + offset);
}

bool Frustum:: pointInFrustum(const Vertex &v) const {
for (int i = 0; i < 6; i++){
if (planes[i].distance(v) <= 0) return false;
}
return true;
}

6 dots = 18 adds and 12 muls, so your code is at best one div and 5 adds slower than the worst case of the normal plane check algoritm. A fdiv is a slow operation, easily 5-10 times slower than a mul or add, unless you use 3dnow/sse approximation instructions. Also, the plane check has every other instruction a mul and a add, so it fits perfectly on modern cpu’s like the athlon which can execute these in parallel.

[This message has been edited by Humus (edited 01-26-2003).]

HamsterofDeath, you are missing an important part of your technique :

Find the closest point to the line. You should consider that. Do the calculation for a sphere and an orientedBBox, you will see that the number of instructions raise quite a bit.

but i don’t need to calculate the frustum. i just have to get the viewing-vector…

ok, that won’t speed up things much…

Originally posted by Humus:
[b][quote]

float operator * (const Vertex &u, const Vertex &v){
return u.x * v.x + u.y * v.y + u.z * v.z;
}

float Plane::distance(const Vertex &v) const {
return (v * normal + offset);
}

bool Frustum:: pointInFrustum(const Vertex &v) const {
for (int i = 0; i < 6; i++){
if (planes[i].distance(v) <= 0) return false;
}
return true;
}

6 dots = 18 adds and 12 muls, so your code is at best one div and 5 adds slower than the worst case of the normal plane check algoritm. A fdiv is a slow operation, easily 5-10 times slower than a mul or add, unless you use 3dnow/sse approximation instructions. Also, the plane check has every other instruction a mul and a add, so it fits perfectly on modern cpu’s like the athlon which can execute these in parallel.

[This message has been edited by Humus (edited 01-26-2003).][/b][/QUOTE]

Has anyone ever tried implementing this with Plücker coords? I’m learning about them right now. The equations are simple enough, but I just can’t seem to figure out what exactly is happening that makes them work. What I have in mind this sixth-dimentional oddity is an umbra occlusion system. They look promising, but I wonder if it will have any computational advantage over the tradition one-sided plane methods. Either way, when the dust settles, I’ll know for sure and have some good experience to boot .

In my opinion: A frustum culling algorithm should be ultra TIGHT, not ultra FAST. A few extra muls and compares saved just won’t do anything for you, compared to the ability to reject a 2500-poly mesh that happens to be close to the viewing frustum, but not actually in it.

That being said: whatever works for you is fine.

ok then, here’s my ultra fast, and ultra tight frustum culling optimisation : gg

  1. if everything is behind you, don’t draw it (1-plane-frustum-culling)
    you have to clip the object once, not 3 times average.
    may give a little speedup.
  2. use normal frumstum culling after this.

Hi,

I think you should do what humus says. You can start with the near clipping plane if you want, that would make it pretty much like the algorithm you suggested.

I don’t know how important the near plane is though, after the other planes it culls very little if any geometry. I don’t use it at all. The top plane is pretty useless in a typical situation too, so you might want it to be the last plane you test against.

Don’t get stuck with frustum culling for too long time, it’s unlikely to be the bottleneck of your engine.

-Ilkka

The last plane you should test should be the near plane, as you’ll get the distance from near plane value for free, which comes in useful for depth sorting and other such things.

Your call of the pow function alone is going to be slower than a full frustum check.

If done properly, an inside / outside check for a sphere against a cone should be two dot products, two subtracts, two multiplies and a single comparison. The miniscule speed advantage this has over a full set of 6 frustum plane checks on an axis-aligned bounding box is usually not worth the extra objects drawn.

And if you actually call new and pow inside your culling function, I can tell you with confidence that software culling speed is not your problem.