Billboarding is one of the key steps in particle rendering. It can be achieved with point sprites (the GPU does the billboarding automatically) or with geometry shaders, as explained in this article. Both methods produce similar visual results. But which one is faster when we have to render a lot of particles (1 million or more)?
To help answer that question, I prepared a demo that renders 1’000’000 particles. Each particle is animated in a simple way in the vertex shader. The demo comes in two versions: the first one uses point sprites, the second one uses geometry shaders.
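To make the geometry shader approach concrete, here is a small CPU-side sketch in Python of the 1:4 expansion that a billboarding geometry shader typically performs: one view-space point becomes four camera-facing vertices in triangle-strip order. This is not the demo's shader code; the function name `billboard_quad`, the `half_size` parameter and the sample coordinates are assumptions made for illustration only.

```python
# Sketch only: emulates on the CPU what a billboarding geometry shader
# usually does per particle. Names and values are illustrative assumptions,
# not taken from the GLSL Hacker demo.

def billboard_quad(center_view, half_size):
    """Return 4 (position, uv) pairs for a camera-facing quad.

    center_view: particle center already transformed to view space (x, y, z).
    half_size:   half of the quad edge length, in view-space units.
    """
    cx, cy, cz = center_view
    # Corner offsets in triangle-strip order. They are applied in view space
    # along X and Y only, so the quad always faces the camera.
    corners = [(-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0), (1.0, 1.0)]
    quad = []
    for ox, oy in corners:
        position = (cx + ox * half_size, cy + oy * half_size, cz)
        uv = ((ox + 1.0) * 0.5, (oy + 1.0) * 0.5)  # texture coords in [0, 1]
        quad.append((position, uv))
    return quad


quad = billboard_quad((2.0, 1.0, -10.0), 0.5)
for position, uv in quad:
    print(position, uv)

# With 1,000,000 input points, the geometry shader path emits 4 vertices each.
particles = 1_000_000
vertices_emitted = particles * 4
print("vertices emitted by the GS path:", vertices_emitted)
```

With point sprites, the equivalent expansion and the texture coordinates are produced by the GPU's fixed-function point rasterization, so no extra vertices have to be emitted by a shader stage.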
The GLSL Hacker demo is available in the host_api/Particle_PointSprite_vs_GS_Billboarded_Quads/ folder of the code sample pack. It’s recommended to use the latest GLSL Hacker 0.7.0.3 (I fixed a small bug related to fullscreen in this version).
– CPU: Intel Core i5-4670K @ 3.4GHz
– Mobo: GIGABYTE G1.Sniper M5
– Memory: G-Skill 16GB DDR3 1600MHz
– Windows 8 64-bit
– Catalyst 14.4 (for Radeon cards)
– R340.65 (for GeForce cards)
– GLSL Hacker 0.7.0.3
– FRAPS for displaying the framerate in fullscreen mode.
Settings: 1’000’000 particles, resolution 1920×1080 fullscreen.
|GPU||Point Sprite||Geometry Shader||Difference|
|Radeon HD 7970||147 FPS||105 FPS||-28%|
|Radeon HD 6970||98 FPS||65 FPS||-33%|
|Radeon HD 5870||85 FPS||59 FPS||-30%|
|GeForce GTX 780||295 FPS||279 FPS||-5%|
|GeForce GTX 750||203 FPS||122 FPS||-39%|
|GeForce GTX 680||72 FPS||71 FPS||-1%|
Quick analysis: particle rendering with geometry shaders is around 30% slower than with point sprites on Radeon GPUs. I was a bit surprised by this result because Radeon GPUs are supposed to have special hardware support in the geometry shader stage that makes the transformation of one vertex into 4 vertices more efficient (see details in the Radeon HD 2000 Programming Guide, page 9). I asked an AMD OpenGL guru about this special support, and it turns out that no internal documentation mentions special hardware support for the 1:4 amplification case in the geometry shader. So this drop in performance is perfectly normal when comparing point sprites with GS billboarded quads.
With NVIDIA hardware, we can distinguish two types of GPUs: a first type where the difference between point sprites and geometry shaders is very small (GTX 780, GTX 680) and a second type (GTX 750) where the difference is similar to the one observed on Radeon GPUs.
Conclusion: unsurprisingly, point sprites are the fastest way to render a lot of particles on both AMD and NVIDIA GPUs. And since OpenGL 3, point sprites are the default point rendering mode.
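For readers who want to check the numbers, the Difference column of the table above can be re-derived from the raw FPS values. The short Python sketch below does that and also converts each framerate to a frame time in milliseconds, which is not part of the original measurements but follows directly from them (1000 / FPS). The helper names are assumptions for illustration; the percentages in the table appear to be truncated rather than rounded, so the sketch truncates as well.

```python
import math

# (point sprite FPS, geometry shader FPS), copied from the table above.
results = {
    "Radeon HD 7970": (147, 105),
    "Radeon HD 6970": (98, 65),
    "Radeon HD 5870": (85, 59),
    "GeForce GTX 780": (295, 279),
    "GeForce GTX 750": (203, 122),
    "GeForce GTX 680": (72, 71),
}


def relative_change_percent(ps_fps, gs_fps):
    """FPS change of the geometry shader path relative to point sprites."""
    return (gs_fps - ps_fps) / ps_fps * 100.0


def frame_time_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


differences = {}
for gpu, (ps_fps, gs_fps) in results.items():
    change = relative_change_percent(ps_fps, gs_fps)
    # Truncate toward zero to match the values listed in the table.
    differences[gpu] = math.trunc(change)
    print(
        f"{gpu}: {differences[gpu]}% "
        f"({frame_time_ms(ps_fps):.2f} ms -> {frame_time_ms(gs_fps):.2f} ms per frame)"
    )
```

Frame times are often easier to compare than FPS because they are additive: the gap between the two columns is the extra time per frame spent on the geometry shader path for the same 1’000’000 particles.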
A French version of this article is available HERE.