Recent Posts

3D-Tech News Around The Web / CToy: C live-coding Tool
« Last post by JeGX on September 15, 2016, 04:24:49 PM »
CToy is a C(99) live-coding environment based on TCC. Small, simple, no bullshit. Write standard cross-platform code and see the result immediately. No installation or compiler required: download (~2 MB), run CToy and play. Ready for Windows 64 bit and MacOSX 64 bit (Linux in progress). Ideal for games, image processing, teaching, or anything C can do.


I quickly tested it (live-coded the src/sample/triangle_hello.c file):

3D-Tech News Around The Web / NVIDIA GeForce GTX 1080 Ti Specifications Leaked
« Last post by JeGX on September 15, 2016, 04:17:39 PM »
GeForce GTX 1080 Ti specifications:

- 16 nm GP102 silicon
- 3,328 CUDA cores
- 208 TMUs
- 96 ROPs
- 12 GB GDDR5 memory
- 384-bit GDDR5 memory interface
- 1503 MHz core, 1623 MHz GPU Boost
- 8 GHz (GDDR5-effective) memory
- 384 GB/s memory bandwidth
- 250W TDP
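
The bandwidth figure in the list above follows from the bus width and the effective memory data rate. A minimal sketch of that arithmetic, assuming the "8 GHz (GDDR5-effective)" entry means 8 Gbit/s per pin (the function name is made up for illustration):

```python
def memory_bandwidth_gb_per_s(bus_width_bits: int, effective_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory interface in bits (384 in the leak).
    effective_rate_gbps: effective data rate per pin in Gbit/s (assumed 8).
    """
    bytes_per_transfer = bus_width_bits / 8  # 384 bits -> 48 bytes per transfer
    return bytes_per_transfer * effective_rate_gbps


print(memory_bandwidth_gb_per_s(384, 8.0))  # 384.0, matching the listed 384 GB/s
```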

- source1
- source2
NB: forum doesn't allow longer headlines  ::)

Don't worry, I will put the full title on twitter  ;)
SetStablePowerState.exe: Disabling GPU Boost on Windows 10 for more deterministic timestamp queries on NVIDIA GPUs

With all modern graphics APIs (D3D11, D3D12, GL4 and Vulkan), it is possible for an application to query the elapsed GPU time for any given range of render calls by using timestamp queries. Most game engines today are using this mechanism to measure the GPU time spent on a whole frame and per pass. This blog post includes full source code for a simple D3D12 application (SetStablePowerState.exe) that can be run to disable and restore GPU Boost at any time, for all graphics applications running on the system. Disabling GPU Boost helps get more deterministic GPU times from timestamp queries. And because the clocks are changed at the system level, you can run SetStablePowerState.exe even if your game is using a different graphics API than D3D12. The only requirement is that you use Windows 10 and have the Windows 10 SDK installed.


On some occasions, we have found ourselves confused by the fact that the measured GPU time for a given pass we were working on would change over time, even if we did not make any change to that pass. The GPU times would be stable within a run, but would sometimes vary slightly from run to run. Later on, we learned that this can happen as a side effect of the GPU having a variable Core Clock frequency, depending on the current GPU temperature and possibly other factors such as power consumption. This can happen with all GPUs that have variable frequencies, and can happen with all NVIDIA GPUs that include a version of GPU Boost, more specifically all GPUs based on the Kepler, Maxwell and Pascal architectures, and beyond.
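
To make the measurement side concrete: a timestamp query yields raw GPU ticks, and the application converts the difference of two ticks into time using the tick frequency (in D3D12 that frequency is reported by ID3D12CommandQueue::GetTimestampFrequency, in ticks per second). The sketch below only shows the arithmetic and a simple way to quantify the run-to-run variation described above. The function names and all sample numbers are invented for illustration and are not taken from the blog post.

```python
def ticks_to_ms(begin_ticks: int, end_ticks: int, ticks_per_second: int) -> float:
    """Convert a pair of GPU timestamps to elapsed milliseconds."""
    if ticks_per_second <= 0:
        raise ValueError("ticks_per_second must be positive")
    return (end_ticks - begin_ticks) * 1000.0 / ticks_per_second


def relative_spread(samples_ms: list[float]) -> float:
    """(max - min) / mean of per-run GPU times; 0.0 means perfectly stable."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    mean = sum(samples_ms) / len(samples_ms)
    return (max(samples_ms) - min(samples_ms)) / mean


# Hypothetical per-run timings of the same pass, in milliseconds.
runs_with_boost = [2.00, 2.10, 1.90]
print(f"spread across runs: {relative_spread(runs_with_boost):.1%}")
```

Comparing this spread with GPU Boost enabled and disabled is one way to check whether fixing the clocks actually made the timings more repeatable on a given machine.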

3D-Tech News Around The Web / Raspberry Pi OpenGL performance work
« Last post by JeGX on September 14, 2016, 06:36:48 PM »
Eric Anholt gives some details about an upcoming performance boost in the OpenGL driver of the Raspberry Pi. Eric Anholt works for Broadcom on the Raspberry Pi's graphics driver.

Last week I spent working on the glmark2 performance issues.  I now have a NIR patch out for the pathological conditionals test (it's now faster than on the old driver), and a branch for job shuffling (+17% and +27% on the two desktop tests).

Here's the basic idea of job shuffling:

We're a tiled renderer, and tiled renderers get their wins from having a Clear at the start of the frame (indicating we don't need to load any previous contents into the tile buffer).  When your frame is done, we flush each tile out to memory.  If you do your clear, start rendering some primitives, and then switch to some other FBO (because you're rendering to a texture that you're planning on texturing from in your next draw to the main FBO), we have to flush out all of those tiles, start rendering to the new FBO, and flush its rendering.  Then, when you come back to the main FBO, we have to reload your old cleared-and-a-few-draws tiles.

Job shuffling deals with this by separating the single GL command stream into separate jobs per FBO.  When you switch to your temporary FBO, we don't flush the old job, we just set it aside.  To make this work we have to add tracking for which buffers have jobs writing into them (so that if you try to read those from another job, we can go flush the job that wrote it), and which buffers have jobs reading from them (so that if you try to write to them, they can get flushed so that they don't get incorrectly updated contents).
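
The tracking described above can be illustrated with a toy model. This is not the driver's code, just a sketch under simplifying assumptions: each FBO has a single color buffer identified by the FBO's name, a "flush" is only recorded in a log, and the names `Job`, `JobTracker`, `draw` and `end_frame` are hypothetical.

```python
class Job:
    """Pending rendering for one FBO (hypothetical toy type)."""

    def __init__(self, fbo):
        self.fbo = fbo
        self.draws = 0


class JobTracker:
    def __init__(self):
        self.jobs = {}       # FBO name -> pending Job writing into that buffer
        self.readers = {}    # buffer name -> set of pending Jobs sampling it
        self.flush_log = []  # FBO names, in the order their jobs were flushed

    def flush(self, job):
        """Submit a pending job and drop every reference to it."""
        if self.jobs.get(job.fbo) is not job:
            return  # already flushed
        del self.jobs[job.fbo]
        for pending in self.readers.values():
            pending.discard(job)
        self.flush_log.append(job.fbo)

    def draw(self, fbo, textures=()):
        # About to write `fbo`: jobs that still read it must be flushed first,
        # otherwise they would sample the updated contents.
        for reader in list(self.readers.get(fbo, ())):
            if reader.fbo != fbo:
                self.flush(reader)
        # About to read `textures`: flush whichever job is still writing them.
        for tex in textures:
            writer = self.jobs.get(tex)
            if writer is not None and tex != fbo:
                self.flush(writer)
        job = self.jobs.get(fbo)
        if job is None:
            job = self.jobs[fbo] = Job(fbo)
        for tex in textures:
            self.readers.setdefault(tex, set()).add(job)
        job.draws += 1

    def end_frame(self):
        for job in list(self.jobs.values()):
            self.flush(job)


tracker = JobTracker()
tracker.draw("main")                     # clear + first primitives
tracker.draw("temp")                     # switch FBO: "main" is set aside, not flushed
tracker.draw("main", textures=["temp"])  # sampling "temp" forces only that job out
tracker.end_frame()
print(tracker.flush_log)  # ['temp', 'main']
```

In this model the main FBO's job is flushed once, at the end of the frame, instead of being flushed at the FBO switch and reloaded afterwards.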

Complete story:
3D-Tech News Around The Web / MSI GTX 1080 Limited Edition 30th Anniversary
« Last post by JeGX on September 14, 2016, 06:29:34 PM »
MSI is celebrating its 30th anniversary as a leading manufacturer of innovative PC hardware. During the past 30 years, MSI has earned a reputation for providing products featuring cutting edge technology and striving to create and use only the best quality components.

To celebrate this milestone, MSI has created an exclusive limited edition graphics card, combining the excellence of MSI GAMING graphics cards with a unique custom designed EK waterblock for this anniversary edition. The exceptionally classy waterblock features infused RGB LED lights that can be set to any of 16.8 million colors by using the MSI Gaming App.

At the heart of this exclusive card is NVIDIA’s GeForce® GTX 1080 GPU to provide all the power you need at up to 4K resolution gaming. The card comes fully assembled in a closed loop liquid cooling configuration that is covered by warranty and maintenance-free. Enclosed in the exquisite and sturdy wooden box is a small gift which is perfect for enjoying the latest epic games in full comfort.

- Press Release

CryEngine will add Vulkan support in version 5.3 (mid-November 2016), and Direct3D 12 multi-GPU support is planned for version 5.4 (late February / GDC 2017).

- CryEngine roadmap in Graphics and Rendering section
- news @
- news @

3D-Tech News Around The Web / Vertex Cache Measurement
« Last post by JeGX on September 14, 2016, 06:14:08 PM »
Now that DX11 has given us UAVs in all the other shading stages as well, I decided to try the equivalent for the vertex cache. By “Vertex Cache”, I mean the Post-transform vertex re-use cache. That is, the thing which enables us to re-use vertex shading results across duplicated vertices in a mesh.

Using UAVs in a VS, we can use SV_VertexID to do an atomic increment into a buffer containing one counter for each vertex. An atomic inc is necessary here because we don’t actually know what the vertex distribution algorithm is, and we could theoretically process a given vert in more than one VS thread simultaneously. For that matter, HW could simply be duplicating all the verts. We won’t know until we’ve looked at the results. Using this approach, we end up with a buffer telling us the exact number of times that each vert was processed during the draw. From this, we can directly calculate the ACMR (average cache miss ratio) of the mesh.
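
Once the per-vertex counters have been read back, the last step is simple arithmetic: ACMR is the total number of vertex shader invocations divided by the number of triangles drawn. A minimal sketch, assuming an indexed triangle list and a counter buffer already copied to the CPU (the function name and sample data are made up for illustration):

```python
def acmr(invocation_counts: list[int], index_count: int) -> float:
    """Average cache miss ratio: vertex shader invocations per triangle.

    invocation_counts: one counter per vertex, as filled by the atomic
        increments in the vertex shader.
    index_count: number of indices in the draw (triangle list assumed).
    """
    if index_count <= 0 or index_count % 3 != 0:
        raise ValueError("index_count must be a positive multiple of 3")
    triangles = index_count // 3
    return sum(invocation_counts) / triangles


# A quad drawn as two triangles sharing an edge (4 vertices, 6 indices).
print(acmr([1, 1, 1, 1], 6))  # 2.0: each vertex shaded once, full re-use
print(acmr([1, 2, 2, 1], 6))  # 3.0: the shared vertices were shaded twice
```

A value of 3.0 means no re-use at all (three invocations per triangle); lower values indicate that the hardware re-used shading results for shared vertices.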

- article
- github
3D-Tech News Around The Web / Masked Software Occlusion Culling Implementation
« Last post by JeGX on September 14, 2016, 06:08:01 PM »
This code accompanies the research paper "Masked Software Occlusion Culling", and implements an efficient alternative to the hierarchical depth buffer algorithm. Our algorithm decouples depth values and coverage, and operates directly on the hierarchical depth buffer. It lets us efficiently parallelize both coverage computations and hierarchical depth buffer updates.

This code is mainly optimized for the AVX2 instruction set, and some AVX specific instructions are required for best performance. However, we also provide SSE 4.1 and SSE 2 implementations for backwards compatibility. The appropriate implementation will be chosen during run-time based on the CPU's capabilities.
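
The run-time selection described above amounts to picking the most capable code path the CPU supports. The sketch below shows only that selection logic. How the library actually detects CPU features is not covered here, and the feature labels and function name are assumptions made for illustration.

```python
# Code paths in order of preference, fastest first (labels are illustrative).
PREFERENCE = ("avx2", "sse4.1", "sse2")


def choose_implementation(cpu_features: set[str]) -> str:
    """Return the best supported code path for the given CPU feature set."""
    for name in PREFERENCE:
        if name in cpu_features:
            return name
    raise RuntimeError("no supported SIMD implementation for this CPU")


print(choose_implementation({"sse2", "sse4.1", "avx2"}))  # avx2
print(choose_implementation({"sse2"}))                    # sse2
```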

- MaskedOcclusionCulling @ github
- Masked Software Occlusion Culling @Intel
Geeks3D's GPU Tools / FurMark released
« Last post by JeGX on September 13, 2016, 03:03:35 PM »
A maintenance release of FurMark is available.

Version - 2016-09-13
+ added command line parameter to enable or disable the dynamic background (/enable_dyn_bkg=1 or
! updated: GPU Shark and GPU-Z 1.11.0.
! updated: ZoomGPU 1.19.3 (GPU monitoring library).
