Geometry instancing is a powerful technique for rendering many meshes in a single draw call. But by default there is no culling: the complete geometry is sent through the 3D pipeline even if some instances are not visible.
GPU Caps Viewer includes a geometry instancing demo that relies on raw GPU horsepower and therefore does not perform any geometry culling:
In this article, Daniel Rákos explains how to perform geometry instancing culling to render thousands of blades of grass and trees.
The rendering is done in two passes: the first pass performs the culling and stores the visible geometry in a buffer (thanks to the transform feedback extension). The culling uses one cool property of geometry shaders: the ability to discard vertices. Here is the code of the geometry shader:
```glsl
#version 150 core

layout(points) in;
layout(points, max_vertices = 1) out;

in vec4 OrigPosition[1];
flat in int objectVisible[1];

out vec4 CulledPosition;

void main() {
    // only emit primitive if the object is visible
    if (objectVisible[0] == 1) {
        CulledPosition = OrigPosition[0];
        EmitVertex();
        EndPrimitive();
    }
}
```
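The `objectVisible` flag is computed in the vertex shader of the culling pass, which is not reproduced in this post. As a rough illustration of the idea (the uniform and attribute names below are assumptions, not taken from the demo), a conservative bounding-sphere test against the clip-space frustum could look like this:

```glsl
#version 150 core

// Hypothetical names for illustration only:
uniform mat4 ModelViewProjectionMatrix;
uniform float InstanceRadius;      // bounding-sphere radius of one instance

in vec4 InstancePosition;          // one point per instance

out vec4 OrigPosition;
flat out int objectVisible;

void main() {
    OrigPosition = InstancePosition;

    // A point is inside the frustum when -w <= x,y,z <= w in clip space;
    // enlarging the bounds by an (approximate) radius keeps the test conservative.
    vec4 p = ModelViewProjectionMatrix * InstancePosition;
    float r = InstanceRadius;
    objectVisible = (p.x + r >= -p.w && p.x - r <= p.w &&
                     p.y + r >= -p.w && p.y - r <= p.w &&
                     p.z + r >= -p.w && p.z - r <= p.w) ? 1 : 0;

    gl_Position = p;
}
```

This is only a sketch of a frustum test: a robust version would transform the radius into clip space properly instead of using a constant.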
The second pass uses the buffer that contains the visible instances (a uniform samplerBuffer in the demo) and performs normal rendering. The vertex position is retrieved in the vertex shader with this line:
```glsl
vec4 ObjectSpacePosition = texelFetch(InstanceData, gl_InstanceID) + vec4(VertexPosition, 1.0);
```
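In context, the second-pass vertex shader might look like the following sketch. Only the `texelFetch` line above comes from the demo; the matrix uniform name is an assumption:

```glsl
#version 150 core

uniform mat4 ModelViewProjectionMatrix;  // assumed name
uniform samplerBuffer InstanceData;      // buffer texture filled by the culling pass

in vec3 VertexPosition;                  // vertex of the grass blade / tree mesh

void main() {
    // Offset the mesh vertex by the position of the current visible instance.
    vec4 ObjectSpacePosition = texelFetch(InstanceData, gl_InstanceID)
                             + vec4(VertexPosition, 1.0);
    gl_Position = ModelViewProjectionMatrix * ObjectSpacePosition;
}
```

The instance count passed to the instanced draw call of the second pass can be obtained with a query object on the transform feedback of the first pass, as described in Daniel's article.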
I tested the demo on a Radeon HD 5770 with an average FPS of 20.
If only this means games will move away from the stupid ugly grass they have used for 10 years now. One can only hope.
“I tested the demo on a Radeon HD 5770 with an average FPS of 20.”
using 8800GT, 30fps is the absolute minimum on my system. it can climb up to about 200fps, depends on viewing angle, but averages at about 36fps when keeping trees in view. 🙂
well, playing with it some more I was able to catch a minimum of 26fps, quite rarely though.
yep depending on the viewing angle I jump up to 75 FPS but I’m far from 200 FPS (maybe there’s a vsync that limits the fps). 13 FPS is the minimum for my R5770 Hawk 😉
Hi!
Thanks for sharing my article here as well.
Just one clarification: the term “uniform samplerBuffer” mentioned is a bit confusing. Actually a buffer texture is used in the demo. However, the same can be accomplished with a uniform buffer as well.