@jsbarretto Ok, that explains why you're so afraid of overdraw. :blobfoxlaughsweat:
Here are a few more ideas you could try:
* Avoid intersecting triangles. This should be easier if you don't need intersecting geometry in the first place. Use bounding boxes or some other simple algorithm to avoid collisions between objects, and make sure none of your objects are self-intersecting.
* If you're not doing that already, cull back faces of closed objects. There's a standard algorithm that works with vertex order to determine which side is "inside", or you could use face normals.
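A rough sketch of both ideas in plain integer math, in case it helps. Nothing here comes from your actual code: the names `Aabb`, `aabbs_overlap`, `signed_area2` and `is_back_facing` are made up for illustration, and the winding convention (front faces have positive signed area) is an assumption you may need to flip for your meshes.

```rust
/// Axis-aligned bounding box with integer (or fixed-point) coordinates.
/// Hypothetical type, for illustration only.
#[derive(Clone, Copy)]
struct Aabb {
    min: [i32; 3],
    max: [i32; 3],
}

/// Two boxes overlap only if their extents overlap on every axis.
/// Boxes that merely touch count as overlapping here.
fn aabbs_overlap(a: &Aabb, b: &Aabb) -> bool {
    (0..3).all(|i| a.min[i] <= b.max[i] && b.min[i] <= a.max[i])
}

/// Projected (screen-space) vertex.
#[derive(Clone, Copy)]
struct Vec2 {
    x: i32,
    y: i32,
}

/// Twice the signed area of the projected triangle (a, b, c),
/// i.e. the 2D cross product of the edges a->b and a->c.
/// Assumes coordinates are small enough that the products fit in an i32.
fn signed_area2(a: Vec2, b: Vec2, c: Vec2) -> i32 {
    (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x)
}

/// Assumption: front faces are wound so that the signed area is positive.
/// Zero-area (degenerate) triangles are rejected as well.
fn is_back_facing(a: Vec2, b: Vec2, c: Vec2) -> bool {
    signed_area2(a, b, c) <= 0
}
```

The culling test only needs two multiplies and a compare per triangle after projection, so it usually pays for itself by skipping the rasteriser entirely for roughly half the faces of a closed object.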
Keep in mind that extensive geometry processing may be expensive too, so doing too much or using an inefficient algorithm could be worse than a bit of overdraw. Since you talk about performance issues with z buffering, I'm wondering if geometry processing on the CPU + transfer to vmem on every frame isn't gonna hurt more anyway? Then again, I don't know your target platform and how much vertex data you have.
@smochi All being done, thankfully. My target platform is the Game Boy Advance: everything is being rasterised in software using fixed point (the GBA has no GPU and no floating-point support). If you're curious: https://youtu.be/RDrjsrKmeOs
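For anyone unfamiliar with fixed point, here is a minimal sketch of the idea. It assumes a 16.16 format and uses made-up names (`Fx`, `FRAC_BITS`, `from_int`, `mul_fx`, `to_int`); it is only an illustration and not necessarily the format or code the renderer uses.

```rust
/// Number of fractional bits. Assumption: 16.16 format.
const FRAC_BITS: u32 = 16;

/// A fixed-point number: the real value is `self.0 / 2^FRAC_BITS`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Fx(i32);

impl Fx {
    /// Convert an integer to fixed point.
    fn from_int(n: i32) -> Fx {
        Fx(n << FRAC_BITS)
    }

    /// Addition is plain integer addition.
    fn add_fx(self, rhs: Fx) -> Fx {
        Fx(self.0 + rhs.0)
    }

    /// Multiply in 64 bits, then shift the extra fractional bits back out.
    fn mul_fx(self, rhs: Fx) -> Fx {
        Fx(((self.0 as i64 * rhs.0 as i64) >> FRAC_BITS) as i32)
    }

    /// Convert back to an integer, rounding towards negative infinity.
    fn to_int(self) -> i32 {
        self.0 >> FRAC_BITS
    }
}
```

Addition and subtraction cost the same as on integers; only multiplication and division need the extra shift, which is why the choice of format matters in tight inner loops.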
I do think I'm rapidly getting into diminishing returns at this point. I'm optimising many of the inner loops down to the instruction level (and below). Over the past few days I rewrote the clipper, and it no longer seems to be the bottleneck.