Voxel cone tracing global illumination demo

Started by Stefan, December 13, 2012, 06:15:11 PM


Quote
An implementation of global illumination using voxel cone tracing, as described by Crassin et al. in Interactive Indirect Illumination Using Voxel Cone Tracing, with the Crytek Sponza model used for content.

This demo served both as a means to familiarize myself with voxel cone tracing and as a testbed for performance experiments with the voxel storage: plain 3D textures, real-time compressed 3D textures, and 3D textures aligned with the diffuse sample rays were tested. Sparse voxel octrees were not implemented due to time constraints, but would have been nice to have as a baseline reference. Compared to SVO in the context of voxel cone tracing (as opposed to ray casting, where SVO is a clear winner), 3D textures allow for easier filtering, direct lookups without evaluating the octree structure, and potentially better cache and memory bandwidth utilization (depending on cone size and scene density). The clear downside is the space requirement: 3D textures can't scale to larger scenes or smaller, more detailed voxels. There may be ways to work around this deficiency: sparse textures (GL_AMD_sparse_texture), compression, or hybrid schemes that mix tree structures with 3D textures.
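To make the point about direct lookups and cone size concrete, here is a minimal CPU-side sketch of how a cone tracer typically picks a mip level of a 3D texture and composites samples front to back. It is not taken from the demo's source: the names `coneDiameter`, `mipLevelForCone`, `accumulate`, and `Rgba` are hypothetical, and the sketch assumes an isotropic, mipmapped voxel volume with premultiplied alpha.

```cpp
#include <algorithm>
#include <cmath>

// Diameter of the cone's footprint at `distance` along its axis.
// `apertureRadians` is the full opening angle of the cone.
float coneDiameter(float distance, float apertureRadians) {
    return 2.0f * distance * std::tan(0.5f * apertureRadians);
}

// Mip level whose voxel size roughly matches the cone footprint.
// Level 0 has voxels of size `voxelSize`; each level doubles that size.
// With a 3D texture this level can be sampled directly, with no tree walk.
float mipLevelForCone(float distance, float apertureRadians,
                      float voxelSize, float maxLevel) {
    // Never sample below the finest voxel size.
    float diameter = std::max(coneDiameter(distance, apertureRadians), voxelSize);
    return std::min(std::log2(diameter / voxelSize), maxLevel);
}

// Premultiplied-alpha colour, as assumed for the voxel samples.
struct Rgba { float r, g, b, a; };

// Front-to-back compositing: a new sample only contributes through
// whatever transparency the accumulated result still has left.
Rgba accumulate(Rgba acc, Rgba sample) {
    float w = 1.0f - acc.a;
    return { acc.r + w * sample.r, acc.g + w * sample.g,
             acc.b + w * sample.b, acc.a + w * sample.a };
}
```

In a shader the same level would usually be passed to a `textureLod` call on the 3D texture, and the march would stop once the accumulated alpha gets close to 1.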

Real-time DXT compression is fast enough to convert the 3D voxel textures on the fly. However, API and driver limitations prevent this from being an effective choice: there is no way to write directly to tiled texture memory, and CPU fallbacks get triggered when trying to populate a compressed 3D texture from GPU memory. The potential memory bandwidth savings did not result in a performance advantage - it seems that the cone tracing is limited by texture filtering and ALU on the hardware tested. This approach may still be worth considering, simply for the compression alone.

Aligning the 3D textures with the diffuse sample cone directions simplifies cone tracing significantly (removing the need to manually filter the directionally-dependent voxels), allowing the diffuse cones to be traced much faster. Unfortunately, this also requires that the cone directions be uniform for all fragments, which in turn requires more cones to maintain quality, giving a net loss.

Requires OpenGL 4.3. Tested on an NVIDIA GeForce GTX 680 with the 310.54 beta drivers.

Works with GTX460 Fermi, only 3.5 fps though


You got only 3.5 fps!?! That sounds abnormal. What's your driver and OS?

I got ~30 fps windowed on a 1024x768 desktop with the default camera, running Win XP x86 SP3 with an i5-2500K @ Turbo Boost and a GF GTX 480 (310.70 beta, default settings).

I like the GTX 480 too! :D


Quote from: nuninho1980 on December 13, 2012, 10:24:32 PM
You got only 3.5 fps!?! That sounds abnormal. What's your driver and OS?

I like the GTX 480 too! :D

Geforce 310.70 and Vista

Judging from GPUshark my bottleneck is the video memory (1GB) which is completely hogged.


Quote from: Stefan on December 14, 2012, 03:46:00 PM
Geforce 310.70 and Vista

Judging from GPUshark my bottleneck is the video memory (1GB) which is completely hogged.

I don't know how you almost ran out of video memory; I only see 979 MB of VRAM used.

Haa... Try turning off Aero, maybe that fixes the slowness. ;)


Hi, I authored the demo. I put a comment on the news post with information on how to get the project to build and fix a crash under Win8. If you're seeing bad performance or a black window, it's probably because the demo is a memory hog and needs somewhere between 1 and 2GB to run. I put the demo together to teach myself a basic implementation of voxel cone tracing and run performance experiments against it. I left some of those experiments in, and they are pushing the memory usage up a bit.


Thanks for the heads-up.

The comments on the news site occasionally take some time to appear.

btw your forum link seems to be broken.


Quote from: nuninho1980 on December 14, 2012, 11:16:37 PM
Haa... Try turn off AERO maybe to fix slow. ;)

Can't you fix the slow performance?

Did you replace it with a new GTX 460!? Because the GTX 465 died?


@Stefan: please answer my 2 questions (in my previous message).


That was just a typo, I still have my GTX465.
Disabling Aero makes no measurable difference.

There is a comment on the news page where a user with a similar configuration has similar performance.

I think I need a debugger to look up the bottleneck.


It looks like the demo needs more than 1 GB of VRAM to run properly. I'd guess that with anything less than 1.5 GB, you'll be bottlenecked by swapping data to and from the GPU (if it runs at all).

I added it up and the demo needs at least 950 MB of memory per frame, and in the worst case it can actually use up to 1.5 GB per frame depending on which debug options you enable.

If you're feeling adventurous and want to try building the demo for yourself, find "render_glsl.h" and change this line:
enum {giDim = 256};
to this:
enum {giDim = 128};

That will reduce the voxel resolution, cutting down on the quality but drastically reducing the memory requirements. Also, if you try this, re-download the demo from my site - I updated it so that it will build out of the box.
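As a rough illustration of why halving giDim helps so much, here is a small sketch that estimates the size of one cubic 3D texture. The helper `volumeBytes` is hypothetical, and the 4 bytes per texel (RGBA8) is an assumption; the demo's actual formats and number of voxel volumes are not stated in this thread, so treat the numbers as a scaling argument only.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for one cubic 3D texture with `dim` texels per side.
// `bytesPerTexel` is an assumption (4 = RGBA8), not the demo's known format.
// With `withMips` set, the full mip chain down to 1x1x1 is included.
std::uint64_t volumeBytes(std::uint64_t dim, std::uint64_t bytesPerTexel,
                          bool withMips) {
    assert(dim > 0 && (dim & (dim - 1)) == 0);  // power-of-two sides only
    std::uint64_t total = 0;
    for (std::uint64_t d = dim; d >= 1; d /= 2) {
        total += d * d * d * bytesPerTexel;
        if (!withMips) break;  // base level only
    }
    return total;
}
```

Under that assumption, `volumeBytes(256, 4, false)` is 64 MiB and `volumeBytes(128, 4, false)` is 8 MiB: halving the side length cuts each volume to one eighth, and a full mip chain adds roughly one seventh on top of the base level.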