Implementing ARB_shader_storage_buffer in Mesa and Intel i965 driver

Started by JeGX, July 08, 2015, 03:58:37 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

JeGX

Some notes on implementing ARB_shader_storage_buffer OpenGL extension in Mesa and the Intel i965 driver.

Quote
In my previous post I introduced ARB_shader_storage_buffer, an OpenGL 4.3 feature that is coming soon to Mesa and the Intel i965 driver. While that post focused on explaining the features introduced by the extension, in this post I'll dive into some of the implementation aspects, for those who are curious about this kind of stuff. Be warned that some parts of this post will be specific to Intel hardware.

...

Another interesting thing we had to deal with are address alignments. UBOs work with layout std140. In this setup, elements in the UBO definition are aligned to 16-byte boundaries (the size of a vec4). It turns out that GPUs can usually optimize reads and writes to multiples of 16 bytes, so this makes sense, however, as I explained in my previous post, SSBOs also introduce a packed layout mode known as std430.

Intel hardware provides a number of messages that we can use through the Data Port interface to write to memory. Each message has different characteristics that makes it more suitable for certain scenarios, like the pixel mask I discussed before. For example, some of these messages have the capacity to write data in chunks of 16-bytes (that is, they write vec4 elements, or OWORDS in the language of the technical docs). One could think that these messages are great when you work with vector data types, however, they also introduce the problem of dealing with partial writes: what happens when you only write to an element of a vector? or to a buffer variable that is smaller than the size of a vector? what if you write columns in a row_major matrix? etc

In these scenarios, using these messages introduces the need to mask the writes because you need to disable the channels in the vec4 element that you don't want to write. Of course, the hardware provides means to do this, we only need to set the writemask of the destination register of the message instruction to select the right channels.

Full post: http://blogs.igalia.com/itoral/2015/07/08/implementing-arb_shader_storage_buffer/

A simple introduction to SSBO can be found here:
http://www.geeks3d.com/20140704/tutorial-introduction-to-opengl-4-3-shader-storage-buffers-objects-ssbo-demo/