Topics - JeGX

Pages: [1] 2 3 ... 46
3D-Tech News Around The Web / Scalable GPU Fluid Simulation
« on: May 23, 2018, 05:24:51 PM »
Let’s take a look at how to efficiently implement a particle-based fluid simulation for real-time rendering. We will be running a Smoothed Particle Hydrodynamics (SPH) simulation on the GPU. This post is intended for experienced developers and provides the general steps of implementation. It is not a step-by-step tutorial, but rather an introduction to algorithms, data structures, ideas and optimization tips. There are multiple parts I will write about: computing SPH, N-body simulation, and a dynamic hashed grid acceleration structure.
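To make the idea of a hashed grid concrete, here is a minimal CPU-side sketch in Python. It is not the article's GPU implementation: the cell size, the poly6 kernel and every function name below are assumptions chosen for illustration. It shows how hashing particles into cells limits the neighbor search to the 27 surrounding cells when evaluating an SPH density.

```python
import math
from collections import defaultdict

H = 1.0  # smoothing radius (assumed value); also used as the grid cell size


def cell_of(p):
    """Integer grid cell containing point p = (x, y, z)."""
    return (math.floor(p[0] / H), math.floor(p[1] / H), math.floor(p[2] / H))


def build_grid(positions):
    """Hash every particle index into the cell that contains it."""
    grid = defaultdict(list)
    for i, p in enumerate(positions):
        grid[cell_of(p)].append(i)
    return grid


def neighbors(i, positions, grid):
    """Indices of particles within H of particle i (including i itself)."""
    cx, cy, cz = cell_of(positions[i])
    found = []
    # Only the 3x3x3 block of cells around the particle can hold neighbors.
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                    if math.dist(positions[i], positions[j]) <= H:
                        found.append(j)
    return found


def poly6(r):
    """Poly6 smoothing kernel, a common (assumed) choice for SPH density."""
    if r > H:
        return 0.0
    return 315.0 / (64.0 * math.pi * H ** 9) * (H * H - r * r) ** 3


def density(i, positions, grid, mass=1.0):
    """SPH density at particle i: kernel-weighted sum over nearby particles."""
    return sum(mass * poly6(math.dist(positions[i], positions[j]))
               for j in neighbors(i, positions, grid))
```

On a GPU the dictionary would typically be replaced by a sorted array of cell keys, but the lookup pattern stays the same.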


mud is an all-purpose c++ app prototyping library, focused towards live graphical apps and games.
mud contains all the essential building blocks to develop lean c++ apps from scratch, providing reflection and low level generic algorithms, an immediate ui paradigm, and an immediate minimalistic and flexible graphics renderer.

In essence, mud aims to be the quickest and simplest way to prototype a c++ graphical application: it provides facilities which, in retrospect, you will never want to build an application without. It handles the problem of the code you don't want to write, and should not have to write, whenever prototyping an app. As such, the core principle in mud is: don't repeat yourself, and we take this aim very seriously. We also believe it's a principle that is way too often disregarded.


This work has been presented at the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games 2018 on 15th of May 2018.

Compositing transparent surfaces rendered in an arbitrary order requires techniques for order-independent transparency. Each surface color needs to be multiplied by the appropriate transmittance to the eye to incorporate occlusion. Building upon moment shadow mapping, we present a moment-based method for compact storage and fast reconstruction of this depth-dependent function per pixel. We work with the logarithm of the transmittance such that the function may be accumulated additively rather than multiplicatively. Then an additive rendering pass for all transparent surfaces yields moments. Moment-based reconstruction algorithms provide approximations to the original function, which are used for compositing in a second additive pass. We utilize existing algorithms with four or six power moments and develop new algorithms using eight power moments or up to four trigonometric moments. The resulting techniques are completely order-independent, work well for participating media as well as transparent surfaces and come in many variants providing different tradeoffs. We also utilize the same approach for the closely related problem of computing shadows for transparent surfaces.
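The additive accumulation described above can be illustrated with a short Python sketch. It covers only the first pass (summing power moments of the log-transmittance) and shows that the result does not depend on surface order; the reconstruction algorithms from the paper are not reproduced here. The function names and the (depth, alpha) input layout are assumptions made for this example.

```python
import math


def transmittance_product(alphas):
    """Reference: multiply (1 - alpha) over all surfaces."""
    t = 1.0
    for a in alphas:
        t *= 1.0 - a
    return t


def accumulate_moments(surfaces, num_moments=4):
    """Additively accumulate power moments of the absorbance.

    surfaces: iterable of (depth, alpha) pairs with alpha in [0, 1).
    Returns [b0, ..., b_num_moments] with b_k = sum(-ln(1 - alpha) * depth**k).
    """
    b = [0.0] * (num_moments + 1)
    for depth, alpha in surfaces:
        absorbance = -math.log(1.0 - alpha)  # log turns products into sums
        for k in range(num_moments + 1):
            b[k] += absorbance * depth ** k
    return b


def total_transmittance(b):
    """The zeroth moment is the total absorbance, so exp(-b0) is the
    transmittance behind all surfaces."""
    return math.exp(-b[0])
```

Because each surface only adds to the moment vector, the pass can run with additive blending and the surfaces may arrive in any order.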

- whitepaper
- code
- source

GeeXLab - english forum / (WIP) Bullet Physics
« on: May 17, 2018, 01:57:48 PM »
I started a minimal integration of Bullet Physics in GeeXLab in order to have an alternative to NVIDIA PhysX. Currently only rigid body collisions are supported. Bullet3 support will be available in GeeXLab 0.25+.

If you are a software developer working in the video game industry and wondering what else you could do to improve the quality of your product or make the development process easier and you don't use static analysis – it's just the right time to start doing so. You doubt that? OK, I'll try to convince you. And if you are just looking to see what coding mistakes are common with video-game and game-engine developers, then you're, again, at the right place: I have picked the most interesting ones for you.


This collection of samples acts as an introduction to DirectX Raytracing (DXR). The samples are divided into tutorials and advanced samples. Each tutorial sample introduces a few new DXR concepts. Advanced samples demonstrate more complex techniques and applications of raytracing. We will be adding more samples in the future, so check back. In addition, you can find more DXR sample tutorials at NVIDIA's DXR samples GitHub.

The samples are implemented using both the DXR and D3D12 Raytracing Fallback Layer APIs. This is purely for demonstration purposes to show API differences. Real-world applications will implement only one or the other. The Fallback Layer uses DXR if the driver and OS support it. Otherwise, it falls back to the compute pipeline to emulate raytracing. Developers aiming for wider hardware support should target the Fallback Layer.


The Khronos OpenCL working group has today released a maintenance update to OpenCL 2.2. Maintenance updates are an essential part of improving the overall health of any open standard. In this recent maintenance update from OpenCL, the working group consolidated 30+ bug fixes and clarifications to make the specification more precisely defined and more easily understood - all while maintaining backwards compatibility for existing applications.


Geeks3D's GPU Tools / GPU Caps Viewer released
« on: May 15, 2018, 11:39:37 AM »
GPU Caps Viewer is available.

Release notes and downloads:

Version - 2018.05.15
+ added command line option to disable log file: /no_logfile
+ added minimal high-DPI support. GPU Caps Viewer is no
  longer scaled (blurry effect) on high-DPI systems.
+ added NVIDIA TITAN V and Quadro GV100.
+ added AMD Radeon RX Vega 11, Vega 8 GPUs.
! the report file name now contains the report date and time
  (ex: _report_20180515.093517.txt). A new command line
  param lets you control this feature: /append_timestamp_to_report=0|1
* fixed a bug in the report export via command line.
! slightly updated the OpenGL panel.
! updated Intel GPUs information.
! updated: GPU Shark
! updated with latest GeeXLab SDK libs.
! recompiled with latest Vulkan API headers (v1.1.70).
! updated: ZoomGPU 1.21.7 (GPU monitoring library).
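For scripts that need to locate the exported report, the timestamp in the example file name above follows a date.time pattern (YYYYMMDD.HHMMSS). The Python sketch below reproduces that pattern; the function name is hypothetical, and the name used when the timestamp is disabled is an assumption, not something stated in the release notes.

```python
from datetime import datetime


def report_suffix(when, append_timestamp=True):
    """Build the trailing part of a report file name.

    Mirrors the example "_report_20180515.093517.txt" from the release notes.
    The non-timestamped form "_report.txt" is an assumption.
    """
    if append_timestamp:
        return "_report_" + when.strftime("%Y%m%d.%H%M%S") + ".txt"
    return "_report.txt"


print(report_suffix(datetime(2018, 5, 15, 9, 35, 17)))
```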

A new small demo is available that shows a wireframe shader based on a geometry shader.


GeeXLab - english forum / Newsletter
« on: May 11, 2018, 06:28:00 PM »
If you wish to receive the latest news about GeeXLab's world directly in your inbox, I added a new section for that purpose: Newsletter.
Don't be afraid to subscribe, I won't flood or spam your inbox. From time to time (depending on the release rate of GeeXLab or demos), I will write a newsletter that will sum up the latest news: links to new versions of GeeXLab (with explanations about the changes), and links to new demos or programming articles.

3D-Tech News Around The Web / CPU-Z 1.85 released
« on: May 05, 2018, 08:43:32 PM »
CPU-Z 1.85 is out. This new version adds the report of the AGESA version on AMD processors. The information is shown near the BIOS version, in the mainboard page.

In addition, this version reports the clock speeds with a higher refresh rate, and also fixes the "error 577" during program initialization on Windows XP and 7, which prevented any information from being reported.

- source

A lot has been said recently about our GeForce Partner Program. The rumors, conjecture and mistruths go far beyond its intent. Rather than battling misinformation, we have decided to cancel the program.

GPP had a simple goal – ensuring that gamers know what they are buying and can make a clear choice.

NVIDIA creates cutting-edge technologies for gamers. We have dedicated our lives to it. We do our work at a crazy intense level – investing billions to invent the future and ensure that amazing NVIDIA tech keeps coming. We do this work because we know gamers love it and appreciate it. Gamers want the best GPU tech. GPP was about making sure gamers who want NVIDIA tech get NVIDIA tech.

With GPP, we asked our partners to brand their products in a way that would be crystal clear. The choice of GPU greatly defines a gaming platform. So, the GPU brand should be clearly transparent – no substitute GPUs hidden behind a pile of techno-jargon.

Most partners agreed. They own their brands and GPP didn’t change that. They decide how they want to convey their product promise to gamers. Still, today we are pulling the plug on GPP to avoid any distraction from the super exciting work we’re doing to bring amazing advances to PC gaming.


GeeXLab - english forum / Fire shader
« on: May 04, 2018, 06:51:13 PM »
Here is a simple fire shader (GLSL) demo:

GeeXLab - english forum / GeeXLab released
« on: May 04, 2018, 04:27:13 PM »
GeeXLab has been released for all platforms.


Release notes:

Version - 2018.05.03
! improved monitoring mode, a mode where GeeXLab does not eat CPU and GPU cycles
  if not necessary. Monitoring mode is available on all platforms.
+ [WINDOWS] GeeXLab keeps its size on high DPI systems (no longer blurry effect).
+ [WINDOWS/LINUX] added gl_forward_compatible element in the XML window node.
! updated gh_imgui lib with latest version 1.61 WIP.
+ added mouse wheel support in ImGui functions (gh_imgui.frame_begin_v2()).
+ added frame_begin_v2(), set_next_window_content_size(), collapsing_header(),
  text_unformatted_v1(), text_unformatted_v2(), column_get_width(), column_set_width(),
  column_get_offset(), column_set_offset(), get_font_size(), calc_text_size(),
  begin_child(), end_child(), popup_open(), popup_begin(), popup_begin_context_item(),
  popup_end(), selectable() and button_arrow() to gh_imgui lib (lua, python).
+ added get_gpu_config() to gh_gml lib (lua, python).
+ added plotline_draw_v2() to gh_imgui lib (lua, python).
+ added vk_instance_get_num_layers(), vk_instance_get_layer_name(),
  vk_gpu_get_num_layers(), vk_gpu_get_layer_name(),
  vk_gpu_get_num_memory_heaps(), vk_gpu_get_heap_size() and vk_gpu_get_device_type()
  to gh_renderer (lua, python).
* Vulkan plugin: fixed a crash on Radeon GPUs by disabling the call to
  vkGetPhysicalDeviceProperties2() with Adrenalin 18.3.4.
* Vulkan plugin: fixed a bug in the enumeration of device extensions.
+ [RPI / TINKER BOARD] added LuaJIT support.
+ [LINUX] added glx_get_server_num_extensions(), glx_get_server_extension(),
  glx_get_client_num_extensions(), glx_get_client_extension(), glx_get_renderer_info_int(),
  glx_get_renderer_info_str() to gh_renderer (lua, python).

Full changelog:

Slides on modules, bundlers, webgl, and glslify!


Small experimental lossless photographic image compression library with a C API and command-line interface.

It's much faster than PNG and compresses better for photographic images. This compressor often takes less than 6% of the time of a PNG compressor and produces a file that is 66% of the size. It was written in just 500 lines of C code thanks to Facebook's Zstd library.

The goal was to see if I could create a better lossless compressor than PNG in just one evening (a few hours) using Zstd and some past experience writing my GCIF library. Zstd is magical.

I'm not expecting anyone else to use this, but feel free if you need some fast compression in just a few hundred lines of C code.


3D-Tech News Around The Web / Bokeh Depth of Field in a single pass
« on: May 04, 2018, 01:50:33 PM »
When I implemented bokeh depth of field I stumbled upon a neat blending trick almost by accident. In my opinion, the quality of depth of field is more related to how objects of different depths blend together, rather than the blur itself. Sure, bokeh is nicer than gaussian, but if the blending is off the whole thing falls flat. There seem to be many different approaches to this out there, most of them requiring multiple passes and sometimes separation of what's behind and in front of the focal plane. I experimented a bit and found a trick that handles the blending in a single pass.


Code:
uniform sampler2D uTexture; // Image to be processed
uniform sampler2D uDepth; // Linear depth, where 1.0 == far plane
uniform vec2 uPixelSize; // The size of a pixel: vec2(1.0/width, 1.0/height)
uniform float uFar; // Far plane

const float GOLDEN_ANGLE = 2.39996323;
const float MAX_BLUR_SIZE = 20.0;
const float RAD_SCALE = 0.5; // Smaller = nicer blur, larger = faster

// Blur radius in pixels for a given depth (circle of confusion)
float getBlurSize(float depth, float focusPoint, float focusScale)
{
    float coc = clamp((1.0 / focusPoint - 1.0 / depth) * focusScale, -1.0, 1.0);
    return abs(coc) * MAX_BLUR_SIZE;
}

vec3 depthOfField(vec2 texCoord, float focusPoint, float focusScale)
{
    float centerDepth = texture(uDepth, texCoord).r * uFar;
    float centerSize = getBlurSize(centerDepth, focusPoint, focusScale);
    vec3 color = texture(uTexture, texCoord).rgb;
    float tot = 1.0;

    // Walk outwards along a golden-angle spiral
    float radius = RAD_SCALE;
    for (float ang = 0.0; radius < MAX_BLUR_SIZE; ang += GOLDEN_ANGLE)
    {
        vec2 tc = texCoord + vec2(cos(ang), sin(ang)) * uPixelSize * radius;

        vec3 sampleColor = texture(uTexture, tc).rgb;
        float sampleDepth = texture(uDepth, tc).r * uFar;
        float sampleSize = getBlurSize(sampleDepth, focusPoint, focusScale);
        // Limit how far background samples can bleed over the center pixel
        if (sampleDepth > centerDepth)
            sampleSize = clamp(sampleSize, 0.0, centerSize * 2.0);

        // The sample contributes only if its own blur disc reaches this pixel
        float m = smoothstep(radius - 0.5, radius + 0.5, sampleSize);
        color += mix(color / tot, sampleColor, m);
        tot += 1.0;
        radius += RAD_SCALE / radius;
    }
    return color / tot;
}
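To get a feel for the sampling pattern used by the shader above, the spiral can be replayed on the CPU. The Python sketch below mirrors the shader's constants and loop; the function names are mine and it is only an illustration, not part of the original post. Since the radius grows by RAD_SCALE/radius per step, the squared radius grows roughly linearly with the sample index, which spreads samples fairly evenly over the disc area.

```python
import math

GOLDEN_ANGLE = 2.39996323
MAX_BLUR_SIZE = 20.0
RAD_SCALE = 0.5  # same trade-off as in the shader: smaller = more samples


def get_blur_size(depth, focus_point, focus_scale):
    """Blur radius in pixels, mirroring getBlurSize() in the shader."""
    coc = (1.0 / focus_point - 1.0 / depth) * focus_scale
    coc = max(-1.0, min(1.0, coc))  # clamp to [-1, 1]
    return abs(coc) * MAX_BLUR_SIZE


def spiral_offsets():
    """Pixel offsets visited by the shader loop, in order."""
    offsets = []
    radius = RAD_SCALE
    ang = 0.0
    while radius < MAX_BLUR_SIZE:
        offsets.append((math.cos(ang) * radius, math.sin(ang) * radius))
        ang += GOLDEN_ANGLE
        radius += RAD_SCALE / radius
    return offsets


if __name__ == "__main__":
    print("texture fetches per pixel:", 2 * len(spiral_offsets()))
```

Each offset costs one color fetch and one depth fetch, so printing the count gives a quick estimate of the per-pixel cost when tuning RAD_SCALE or MAX_BLUR_SIZE.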

In this paper we present a new real-time, screen-space technique that can be easily integrated into existing rendering pipelines and that drastically improves the quality of the lighting by adding an important near-field illumination term. It reprojects the radiance from one frame to the next to provide a theoretically infinite number of light bounces while keeping a tight frame budget, since it relies on the same foundation as the Horizon-Based Ambient Occlusion (HBAO) technique introduced by Bavoil et al. [1], except that it uses the information gathered while computing the horizon to its maximum potential.


As a computer engineer who has spent half a decade working with caches at Intel and Sun, I’ve learnt a thing or two about cache-coherency. This was one of the hardest concepts to learn back in college – but once you’ve truly understood it, it gives you a great appreciation for system design principles.

You might be wondering why you as a software developer should care about CPU cache-design. For one thing, many of the concepts learnt in cache-coherency are directly applicable to distributed-system-architecture and database-isolation-levels as well. For instance, understanding how coherency is implemented in hardware caches can help in better understanding strong-vs-eventual consistency. It can spur ideas on how to better enforce consistency in distributed systems, using the same research and principles applied in hardware.

For another thing, misconceptions about caches often lead to false assertions, especially when it comes to concurrency and race conditions. For example, there is the common refrain that concurrent programming is hard because “different cores can have different/stale values in their individual caches”, or that the reason we need volatiles in languages like Java is to “prevent shared-data from being cached locally” and force them to be “read/written all the way to main memory”.


3D-Tech News Around The Web / Reducing Vulkan API call overhead
« on: May 02, 2018, 09:41:02 AM »
Vulkan is designed to have significantly smaller CPU overhead compared to other APIs like OpenGL. This is achieved by various means. The API is structured to do more work up-front, such as creating the pipeline state once and binding it many times instead of having to continuously set various state bits. Many API calls also do more work per call: for example, vkCmdBindVertexBuffers can bind all vertex buffer objects used by the vertex shader stage in one call. However, a complex application can still end up calling various Vulkan functions tens or hundreds of thousands of times per frame. This article will look at the costs associated with that, and ways to bring them down.

