OpenGL Geometry Instancing Test: Radeon vs GeForce



This article is a follow-up to an earlier article that showed some surprising results about geometry instancing with two triangles per instance (particle rendering). Here, I test geometry instancing performance as the number of faces (triangles) per instance increases.

The test consists of rendering 1’000’000 quads with increasing polygon density. We start with 2 faces per instance and end with 800 faces per instance (800’000’000 triangles!).
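To make the workload concrete, here is a small Python sketch of the arithmetic behind each test step. The instance count and the list of face counts come from this article. The helper names are made up for illustration, and reading each face count as an n×n grid of cells with two triangles per cell (2·n² faces) is my assumption about how the quads are subdivided, not something the demo confirms.

```python
import math

INSTANCES = 1_000_000  # number of instanced quads (from the article)
FACE_COUNTS = [2, 8, 18, 32, 128, 200, 288, 512, 800]  # faces per instance tested

def grid_side(faces):
    """Return n if faces == 2 * n * n, else None.

    Assumption: a quad is subdivided into an n x n grid of cells,
    each cell being made of 2 triangles.
    """
    n = math.isqrt(faces // 2)
    return n if 2 * n * n == faces else None

def total_triangles(faces, instances=INSTANCES):
    """Triangles submitted per frame for a given polygon density."""
    return faces * instances

for faces in FACE_COUNTS:
    n = grid_side(faces)
    print(f"{faces:>3} faces/instance (grid {n}x{n}): "
          f"{total_triangles(faces):,} triangles per frame")
```

Every face count in the test fits the 2·n² pattern (n = 1, 2, 3, 4, 8, 10, 12, 16, 20), which is why the grid interpretation seems plausible.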

The 1’000’000 quads are rendered with instanced arrays: GL_ARB_draw_instanced + GL_ARB_instanced_arrays.

I prepared a demo (Lua + GLSL) for GLSL Hacker. The demo is available in the host_api/GLSL_Geometry_Instancing/benchmark/ folder of the code sample pack.

Testbed:
– CPU: Intel Core i5-4670K @ 3.4GHz
– Mobo: GIGABYTE G1.Sniper M5
– Memory: G-Skill 16GB DDR3 1600MHz
– Windows 8 64-bit
– Catalyst 14.4 (for Radeon cards)
– R344.11 (for GeForce cards)
– GLSL Hacker 0.7.1.1
– FRAPS for displaying the framerate in fullscreen mode.

Demo settings: 1920×1080 fullscreen



Framerate (FPS) with respect to the number of faces per instance:

Faces/instance   GTX 780   GTX 750   GTX 680   HD 7970   HD 6970
  2 faces             98        90        98       250       162
  8 faces             98        41        83       140        92
 18 faces             90        23        76        73        53
 32 faces             64        17        55        42        33
128 faces             20         6        13         9         7
200 faces             11         4         8         6         4
288 faces              7         3         6         4         3
512 faces              5         2         3         2         2
800 faces              3         1         2         1         1

It’s interesting to see that with 2 triangles per instance, the results match those of the particle rendering test. When the number of triangles per instance is very low (2 to 8), Radeon GPUs are faster than GeForce ones. GeForce GPUs take the lead once the number of faces per instance goes above roughly 16. Beyond 128 faces per instance (128’000’000 triangles), all GPUs struggle: no card exceeds 20 FPS.
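One way to read these numbers is to convert each framerate into triangles per second (faces per instance × 1’000’000 instances × FPS). The Python sketch below does that with the values from the table above. The function names are hypothetical, and since the framerates are rounded to whole FPS, the low-FPS rows are only rough estimates.

```python
INSTANCES = 1_000_000  # instanced quads per frame (from the article)
GPUS = ["GTX 780", "GTX 750", "GTX 680", "HD 7970", "HD 6970"]

# FPS per GPU (same order as GPUS), keyed by faces per instance.
FPS = {
    2:   [98, 90, 98, 250, 162],
    8:   [98, 41, 83, 140, 92],
    18:  [90, 23, 76, 73, 53],
    32:  [64, 17, 55, 42, 33],
    128: [20, 6, 13, 9, 7],
    200: [11, 4, 8, 6, 4],
    288: [7, 3, 6, 4, 3],
    512: [5, 2, 3, 2, 2],
    800: [3, 1, 2, 1, 1],
}

def triangles_per_second(faces, fps, instances=INSTANCES):
    """Approximate triangle throughput for one table cell."""
    return faces * instances * fps

def fastest(faces):
    """Return the GPU name(s) with the highest FPS for a given density."""
    row = FPS[faces]
    best = max(row)
    return [gpu for gpu, fps in zip(GPUS, row) if fps == best]

for faces, row in FPS.items():
    rates = ", ".join(
        f"{gpu}: {triangles_per_second(faces, fps) / 1e6:.0f}M"
        for gpu, fps in zip(GPUS, row)
    )
    print(f"{faces:>3} faces -> fastest: {'/'.join(fastest(faces))} | tri/s: {rates}")
```

With this conversion, the HD 7970 goes from about 500 million triangles per second at 2 faces to about 1’152 million at 128 faces, while the GTX 780 goes from about 196 million to about 2’560 million. This suggests that at very low polygon densities the cost is dominated by per-instance overhead rather than raw triangle throughput, though that interpretation is mine and is not measured directly by the test.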

