
In a previous article, I compared two methods for rendering a lot of particles: point sprites and geometry shader sprites.
On AMD Radeon GPUs, geometry shader sprites are 30% slower than point sprites. On NVIDIA GPUs (GTX 780, GTX 680), geometry shader sprites have more or less the same performance as point sprites.
Now let’s look at a third way, this time based on geometry instancing: we have a quad (4 vertices) and we use the GPU’s built-in geometry instancing feature (GL_ARB_draw_instanced + GL_ARB_instanced_arrays, OpenGL 3.3+) to render a lot of quads, each quad being a particle. The billboarding is done in the vertex shader, for every instance of the quad.
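To make the per-instance billboarding concrete, here is a minimal CPU-side sketch in Python of the math a vertex shader could apply to each quad corner. It is not the shader shipped with the demo: the names `QUAD_CORNERS`, `billboard_corners`, `cam_right` and `cam_up` are hypothetical, and the sketch assumes the camera's right and up axes are already known in world space.

```python
# Hypothetical reference implementation of per-instance billboarding.
# The four corners of the unit quad, in triangle-strip order (assumption).
QUAD_CORNERS = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5)]


def billboard_corners(center, cam_right, cam_up, size):
    """Return the 4 world-space corners of a camera-facing quad.

    center    -- particle position (the per-instance attribute)
    cam_right -- camera right axis in world space (unit vector)
    cam_up    -- camera up axis in world space (unit vector)
    size      -- edge length of the quad
    """
    corners = []
    for cx, cy in QUAD_CORNERS:
        # Offset the particle center along the camera axes so the quad
        # always faces the viewer.
        corners.append(tuple(
            center[i] + size * (cx * cam_right[i] + cy * cam_up[i])
            for i in range(3)
        ))
    return corners


if __name__ == "__main__":
    # Camera looking down -Z with no rotation: right = +X, up = +Y.
    for corner in billboard_corners((10.0, 5.0, -3.0),
                                    (1.0, 0.0, 0.0),
                                    (0.0, 1.0, 0.0),
                                    2.0):
        print(corner)
```

With instancing, the same four corners are reused for every particle and only `center` changes from one instance to the next, which is what the instanced vertex attribute provides.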
A demo, coded for GLSL Hacker, is available in the host_api/Particle_Geometry_Instancing/ folder of the code sample pack. Start your favorite FPS-meter (FRAPS, etc…) and load the particle_gi_1920x1080_fullscreen.xml file in GLSL Hacker.
Testbed:
– CPU: Intel Core i5-4670K @ 3.4GHz
– Mobo: GIGABYTE G1.Sniper M5
– Memory: G-Skill 16GB DDR3 1600MHz
– Windows 8 64-bit
– Catalyst 14.4 (for Radeon cards)
– R344.11 (for GeForce cards)
– GLSL Hacker 0.7.0.4
– FRAPS for displaying the framerate in fullscreen mode.
Settings: particles: 1’000’000, resolution: 1920×1080 fullscreen.
GPU | Point Sprite (PS) | Geometry Shader (GS) | Geometry Instancing (GI) | Difference (GS/PS) | Difference (GI/PS) |
Radeon HD 7970 | 147 FPS | 105 FPS | 134 FPS | -28% | -8% |
Radeon HD 6970 | 98 FPS | 65 FPS | 91 FPS | -33% | -7% |
Radeon HD 5870 | 85 FPS | 59 FPS | 77 FPS | -30% | -9% |
GeForce GTX 780 | 295 FPS | 279 FPS | 97 FPS | -5% | -67% |
GeForce GTX 750 | 203 FPS | 122 FPS | 104 FPS | -39% | -48% |
GeForce GTX 680 | 72 FPS | 71 FPS | 73 FPS | -1% | +1% |
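The two difference columns can be recomputed from the FPS figures. The short Python sketch below does that; the helper name `pct_diff` is hypothetical, and it assumes the percentages are truncated toward zero rather than rounded, which is what the listed values suggest (105/147 is about -28.6%, listed as -28%).

```python
def pct_diff(fps, baseline_fps):
    """Relative difference versus the baseline, in percent.

    Truncated toward zero (assumption inferred from the table values).
    """
    return int((fps / baseline_fps - 1.0) * 100.0)


# FPS values copied from the table: (PS, GS, GI).
RESULTS = {
    "Radeon HD 7970": (147, 105, 134),
    "Radeon HD 6970": (98, 65, 91),
    "Radeon HD 5870": (85, 59, 77),
    "GeForce GTX 780": (295, 279, 97),
    "GeForce GTX 750": (203, 122, 104),
    "GeForce GTX 680": (72, 71, 73),
}

if __name__ == "__main__":
    for gpu, (ps, gs, gi) in RESULTS.items():
        print(f"{gpu}: GS/PS {pct_diff(gs, ps):+d}%, GI/PS {pct_diff(gi, ps):+d}%")
```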

On AMD Radeon GPUs, geometry instancing works rather well for particle rendering. This method is slower than point sprites, but the gap is smaller than with the geometry shader method: geometry instancing is roughly 10% slower while the geometry shader is roughly 30% slower.
On NVIDIA GPUs, results are a bit difficult to interpret… Clearly, rendering particles with geometry instancing is not the strength of GeForce cards. Almost 70% slower on the GTX 780! With geometry instancing, the GTX 780 (Kepler) is even slower than the GTX 750 (Maxwell first gen): 97 FPS versus 104 FPS. Only the GTX 680 stays constant across the three methods. I think I will do a geometry instancing test to check how Radeon and GeForce behave when the number of triangles per instance varies, just to be sure that these results are consistent.
Conclusion: Point sprites are the fastest way to render a lot of particles. If you need more control over particle rendering, you can use geometry instancing with Radeon GPUs and geometry shaders with GeForce GPUs.
Nice to see that you have taken some time to try this 🙂
I didn’t know that GeForce GPUs performed so differently than Radeon GPUs. Thanks 🙂
78 FPS @ 1920×1080 full-screen.
Stock GTX580, 344.11 driver.
The last time I checked, point sprites didn’t render if the whole sprite wasn’t inside the screen. It can be fixed by adding safe borders, but you will also need to add that extra space to the other buffers you’re using, depending on your rendering passes.
@fellix: my same GPU@stock-reference but Win 7 x64 SP1 w/ i5-2500K@4.5GHz. 🙂
I got 80 fps at 1080p+FS running same GFX driver (HQ)
Thanks for the performance test. I was checking if for more complicated geometry, the geometry shader would be faster than instance rendering.
I have new EVGA GTX 780 Ti Classified after GTX 580 3GB (freeze – RMA to GTX 770SC)! 🙂
At 1080p running Win 8.1 x64 with 347.71beta (HQ)
-Point Sprite (PS): 301 fps
-Geometry Shader (GS): 296 fps
-Geometry Instancing (GI): 110 fps