Messages - Stefan

The ShadowFX library provides a scalable and GCN-optimized solution for deferred shadow filtering. Currently the library supports uniform and contact hardening shadow (CHS) kernels.
Version 2.0 of the library now supports both DirectX® 11 and DirectX® 12. The ShadowFX API is unified across D3D11 and D3D12, allowing an easy transition to D3D12. Samples are provided for both D3D11 and D3D12 versions.

Prerequisites for DirectX 12
  • AMD Radeon™ GCN-based GPU (HD 7000 series or newer)
    • Or other DirectX® 12 compatible discrete GPU with Shader Model 5 support
  • 64-bit Windows® 10
  • Visual Studio® 2015
Demo didn't knock my socks off...


Simulaatio 2016
qbparty 2016
Decrunch 2016

Remember Topomorph, the 16-byte demo?
Now there is a 10-byte demo 8)

>>>download now<<<

5.292.0 Simplified Version

Supports "Preservation Tuning" for GP100+ chips, whatever that means

 CUDA Toolkit 8 RC Now Available

  New in CUDA 8
 Pascal Architecture Support
  • Out of box performance improvements on Tesla P100, supports GeForce GTX 1080
  • Simplify programming using Unified memory on Pascal including support for large datasets, concurrent data access and atomics*
  • Optimize Unified Memory performance using new data migration APIs*
  • Faster Deep Learning using optimized cuBLAS routines for native FP16 computation
Developer Tools
  • Quickly identify latent system-level bottlenecks using the new critical path analysis feature
  • Improve productivity with up to 2x faster NVCC compilation speed
  • Tune OpenACC applications and overall host code using new profiling extensions
  • Accelerate graph analytics algorithms with nvGRAPH
  • New cuBLAS matrix multiply optimizations for matrices with sizes smaller than 512 and for batched operation

... tutorial ... for using the new VK_EXT_debug_marker extension in conjunction with an offline graphics debugging application like RenderDoc.

You can find the article here : Offline debugging with VK_EXT_debug_marker and RenderDoc

Download now, atm only for HP OEM

no Vulkan support
no obvious changes to OpenGL, at least not those recently discussed at Intel developer forum
- Fixes an issue which causes the display to become visually corrupted or to flicker during heavy CPU usage after the system resumes from sleep or connected standby mode.
- Fixes an issue which causes the display to become visually corrupted or to flicker during heavy CPU usage after the modem resumes from standby mode.
- Fixes an issue which causes unintended noise to be heard from the system.

3D-Tech News Around The Web / MSI Afterburner 4.3.0 Beta 3
« on: May 27, 2016, 02:33:38 PM »
Revision history - Version 4.3.0 Beta 3
  • Added GPU Boost 3.0 technology support for NVIDIA Pascal graphics cards:
    • Added percent based overvoltage support
    • Added voltage/frequency curve customization support. You may use traditional core clock slider on NVIDIA GeForce GTX 1070 and 1080 graphics cards to apply fixed offset to all voltage/frequency curve points as well as use brand new flexible voltage/frequency curve editor window for more precise per-point curve adjustment. The editor window can be activated with <Ctrl> + <F> keyboard shortcut and it provides you the following features:
      • You may independently adjust clock frequency offset for each point with mouse cursor or <Up> / <Down> keys
      • You may hold <Ctrl> key to set anchor and fix clock frequency offset in minimum/maximum voltage point and adjust the offset of any other point with mouse to linearly interpolate the offsets between the anchor and adjustment points
      • You may hold <Shift> key while adjusting the offset of any point with mouse to apply the same fixed offset to all points. That’s equal to adjusting the offset with the slider in main application window
      • You may press <Ctrl> + <D> to reset offsets for all points
      • You may switch between traditional core clock control slider in the main window and voltage/frequency curve editor window to see how they affect each other in realtime
  • Improved validation and handling of erroneous data reported after TDR on NVIDIA graphics cards
  • Startup profile is now also affected by <Lock profiles> button, which means that you cannot modify or delete your startup overclocking settings while this button is pressed. This feature can be useful to protect startup overclocking settings from modification while temporarily testing various overclocking scenarios on overclocked system
  • Added support for unofficial overclocking mode with disabled PowerPlay on PowerPlay7 capable hardware (AMD Tonga and newer graphics processors family)
  • Added ability to use low-level hardware access interface on the systems with AMD graphics cards when legacy VGA BIOS image is not mapped to memory
  • Fixed bug causing the maximum value to be invisible on some hardware monitoring graphs under certain conditions (e.g. <Framerate> or <Frametime> graphs after closing 3D application)
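
The two curve-editing modes described above (the same fixed offset for all points with <Shift>, and linear interpolation between an anchor and the adjusted point with <Ctrl>) come down to simple arithmetic. The following is only an illustrative Python model, not MSI Afterburner's code: the function names and the representation of the curve as a list of per-point offsets are assumptions, and the anchor can be any index here rather than only the minimum/maximum voltage point.

```python
def apply_uniform_offset(offsets, delta):
    """Shift every curve point by the same clock offset in MHz (<Shift> mode)."""
    return [o + delta for o in offsets]


def interpolate_offsets(offsets, anchor, target, target_offset):
    """Linearly interpolate offsets between an anchor and an adjusted point.

    offsets       -- current per-point offsets in MHz, ordered by voltage
    anchor        -- index of the anchor point, whose offset stays fixed
    target        -- index of the point being adjusted
    target_offset -- new offset for the adjusted point
    Points outside the anchor..target range are left untouched.
    """
    if anchor == target:
        raise ValueError("anchor and target must be different points")
    result = list(offsets)
    lo, hi = sorted((anchor, target))
    span = target - anchor
    for i in range(lo, hi + 1):
        t = (i - anchor) / span  # 0.0 at the anchor, 1.0 at the target
        result[i] = offsets[anchor] + t * (target_offset - offsets[anchor])
    return result
```

With five points, an anchor at index 0 and the last point raised by 100 MHz, the intermediate points receive 25, 50 and 75 MHz.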

OpenGL Extensions Viewer for Android displays the vendor name, the version, the renderer name and the extensions for OpenGL ES 1.0 to ES 3.1

New in this update

- Display screen density

- Other bug fixes.

>download now<

no changelog available
select no forward context from pulldown menu to see all extensions
on NVIDIA Optimus rigs select a particular GPU in NVCPL

The GCN architecture contains a lot of functionality in the shader cores which is not currently exposed in current APIs like Vulkan™ or Direct3D® 12. One of the mandates of GPUOpen is to give developers better access to the hardware, and today we’re releasing extensions for these APIs to expose additional GCN features to developers.

With those shader extensions, we provide access to wavefront-wide functions, which is an important building block to exploit the SIMD execution model of GPUs. For instance, the use of mbcnt and ballot can replace atomics in various cases, drastically boosting performance. The wavefront-wide instructions also include swizzles, which allow individual lanes to exchange data without going through memory.
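
To see why ballot and mbcnt can stand in for atomics, take stream compaction: every lane that keeps its element needs a unique output slot. Instead of an atomic counter, ballot builds a bitmask of the lanes that keep their element, and mbcnt counts the set bits below each lane, which is exactly that lane's slot. The sketch below emulates this on the CPU in Python; the function names only mirror the concepts and are not the signatures of the actual shader intrinsics.

```python
WAVE_SIZE = 64  # GCN wavefronts are 64 lanes wide


def ballot(predicates):
    """Return a bitmask with bit i set when lane i's predicate is true."""
    mask = 0
    for lane, flag in enumerate(predicates):
        if flag:
            mask |= 1 << lane
    return mask


def mbcnt(mask, lane):
    """Count the set bits of mask that belong to lanes below `lane`."""
    return bin(mask & ((1 << lane) - 1)).count("1")


def compact(values, keep):
    """Emulate wavefront-wide stream compaction without atomics."""
    assert len(values) <= WAVE_SIZE
    predicates = [keep(v) for v in values]
    mask = ballot(predicates)
    out = [None] * bin(mask).count("1")
    for lane, value in enumerate(values):  # on a GPU the lanes run in lockstep
        if predicates[lane]:
            out[mbcnt(mask, lane)] = value  # unique slot per surviving lane
    return out
```

For example, `compact([5, 2, 8, 1, 9], lambda v: v > 4)` keeps lanes 0, 2 and 4 and writes them to slots 0, 1 and 2.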

Additionally, we expose readfirstlane and other functions which enable the compiler to move data from VGPRs into SGPRs. Especially for VGPR heavy code, marking variables as wavefront-uniform can reduce the VGPR count significantly.

Another often-requested feature which is getting exposed today is direct access to the barycentric coordinates. This is again an important building block for various algorithms.

Finally, we also provide various utility functions. In this release, we’re providing the 3-parameter min, max and med functions which map directly to the corresponding GCN opcodes.
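
As a reference for what the 3-parameter functions compute, here is a small Python sketch. The names `min3`, `max3` and `med3` are chosen here for illustration and are not the extension's actual function names. The median is built from pairwise min/max only.

```python
def min3(a, b, c):
    """Smallest of three values."""
    return min(a, b, c)


def max3(a, b, c):
    """Largest of three values."""
    return max(a, b, c)


def med3(a, b, c):
    """Median of three values using only pairwise min/max."""
    return max(min(a, b), min(max(a, b), c))


def clamp(x, lo, hi):
    """Clamping is a median: for lo <= hi, clamp(x, lo, hi) == med3(x, lo, hi)."""
    return med3(x, lo, hi)
```

The clamp identity is one reason a single-instruction median is handy in shader code.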


Turbocharge your Graphics and GPU Compute Applications with GPUPerfAPI


GPUPerfAPI supports the following APIs:
 • ROCm/HSA (new in v2.20)
 • DirectX 12 (new in v2.20; currently an alpha/prototype that supports only GPU timestamps and other runtime-supplied statistics, with hardware counter support to be added in the future)

NVIDIA Delivers VRWorks Support for GeForce GTX 1080

If you paid attention to the launch of our new Pascal-based GeForce GTX 1080 and 1070 GPUs, you are aware that VR had a huge influence on their creation.
 VR is extremely demanding on the GPU.  VR applications must render high resolution images at 90 frames per second to two screens simultaneously, one for each eye.  Even the slightest graphics stutter will reduce your sense of presence.
 In designing GeForce GTX 1080, NVIDIA’s architects looked for unique ways to increase performance so that developers could maintain 90 FPS while still making VR graphics as detailed as graphics in traditional AAA PC games.  The result was the new Simultaneous Multi-Projection (SMP) engine.
Today, we are excited to officially release VRWorks SDK support for two new features that are based on the SMP engine: Lens Matched Shading and Single Pass Stereo.

Includes NVAPI R367

Prior to a new title launching, our driver team is working up until the last minute to ensure every performance tweak and bug fix possible makes it into the Game Ready driver. As a result, you can be sure you’ll have the best day-1 gaming experience for your favorite new titles.

Game Ready
 Learn more about how to get the optimal experience for Overwatch, World of Tanks, and War Thunder

Gaming Technology
 Supports the new flagship GeForce GTX 1080; the most advanced gaming graphics card ever created. Discover unprecedented performance, power efficiency, and gaming experiences—driven by the new NVIDIA Pascal™ architecture. This is the ultimate gaming platform.

Virtual Reality
 Supports the new GeForce GTX 1080 VRWorks features including Lens Matched Shading and Single Pass Stereo
Windows 10 64-bit
Windows 10 32-bit
Windows 7 64-bit
Windows 7 32-bit

NB: this driver is exclusively for NVIDIA_DEV.1B80 = "NVIDIA GeForce GTX 1080"


Ever wondered how big the new #nVIDIA #Pascal GPU family would turn out to be? Here's the whole list of Pascal SKUs, with their respective PCI device IDs. Enjoy!
 Note: The list contains a few Maxwell IDs as well, in order to clarify the difference between GMxxx-A and GMxxx-B PCI device regions.
 Note #2: GM200-B may have been a second iteration, an optimized variant of the original GM200, but never reached the market. Could have been a plan B in case Pascal slips to late 2016 or 2017.

 1342 N15S-GM-B (GM108-A)
 1382 GeForce GTX 745 (GM107-A)
 13C1 Graphics Device (GM204-A)
 13C4 D17U-20 (GM204-A)
 1402 GeForce GTX 950 (GM206-A)
 15F0 Graphics Device (GP100GL-A)
 15F1 Graphics Device (GP100GL-A)
 15F8 Graphics Device (GP100GL-A)
 15F9 Graphics Device (GP100GL-A)
 15FA Graphics Device (GP100GL-A)
 15FB Graphics Device (GP100GL-A)
 15FC Graphics Device (GP100GL-A)
 15FD Graphics Device (GP100GL-A)
 15FE Graphics Device (GP100GL-A)
 1600 Graphics Device (GM204-B)
 1646 Graphics Device (GM206-B)
 1670 Graphics Device (GM206GL-B)
 1725 Graphics Device (GP100-B)
 172E Graphics Device (GP100-B)
 172F Graphics Device (GP100-B)
 1731 Graphics Device (GP100GL-B)
 1738 Graphics Device (GP100GL-B)
 1739 Graphics Device (GP100GL-B)
 173A Graphics Device (GP100GL-B)
 173B Graphics Device (GP100GL-B)
 173C Graphics Device (GP100GL-B)
 173D Graphics Device (GP100GL-B)
 1780 Graphics Device (GM107-B)
 17BC Graphics Device (GM107GL-B)
 17C1 Graphics Device (GM200-A)
 1800 Graphics Device (GM200-B)
 1801 Graphics Device (GM200-B)
 1802 Graphics Device (GM200-B)
 1807 Graphics Device (GM200-B)
 1809 Graphics Device (GM200-B)
 1830 Graphics Device (GM200GL-B)
 1831 Graphics Device (GM200GL-B)
 1839 Graphics Device (GM200GL-B)
 1B00 Graphics Device (GP102-A)
 1B01 Graphics Device (GP102-A)
 1B30 Graphics Device (GP102GL-A)
 1B38 Graphics Device (GP102GL-A)
 1B3E Graphics Device (GP102GL-A)
 1B40 Graphics Device (GP102-B)
 1B41 Graphics Device (GP102-B)
 1B6E Graphics Device (GP102-B)
 1B6F Graphics Device (GP102-B)
 1B70 Graphics Device (GP102GL-B)
 1B78 Graphics Device (GP102GL-B)
 1B80 GeForce GTX 1080 (GP104-A)
 1B81 Graphics Device (GP104-A)
 1B82 Graphics Device (GP104-A)
 1B83 Graphics Device (GP104-A)
 1BB0 Graphics Device (GP104GL-A)
 1BB1 Graphics Device (GP104GL-A)
 1BB4 Graphics Device (GP104GL-A)
 1BC0 Graphics Device (GP104-B)
 1BC1 Graphics Device (GP104-B)
 1BC2 Graphics Device (GP104-B)
 1BC3 Graphics Device (GP104-B)
 1BF0 Graphics Device (GP104GL-B)
 1BF1 Graphics Device (GP104GL-B)
 1BF4 Graphics Device (GP104GL-B)
 1BF5 Graphics Device (GP104GL-B)
 1C00 Graphics Device (GP106-A)
 1C01 Graphics Device (GP106-A)
 1C02 Graphics Device (GP106-A)
 1C03 Graphics Device (GP106-A)
 1C30 Graphics Device (GP106GL-A)
 1C41 Graphics Device (GP106-B)
 1C42 Graphics Device (GP106-B)
 1C43 Graphics Device (GP106-B)
 1C70 Graphics Device (GP106GL-B)
 1C80 Graphics Device (GP107-A)
 1C81 Graphics Device (GP107-A)
 1C82 Graphics Device (GP107-A)
 1CA7 Graphics Device (GP107GL-A)
 1CA8 Graphics Device (GP107GL-A)
 1CAA Graphics Device (GP107GL-A)
 1CC2 Graphics Device (GP107-B)
 1D01 Graphics Device (GP108-A)
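
Each entry in the list above has the shape "hex device ID, name, (ASIC)", so it is easy to turn into a lookup table. A minimal Python sketch follows; the regular expression and function name are assumptions for illustration, and the sample contains only three entries copied from the list.

```python
import re

# Four hex digits, a free-form name, then the ASIC code in parentheses.
ENTRY = re.compile(r"^\s*([0-9A-Fa-f]{4})\s+(.+?)\s+\(([^)]+)\)\s*$")


def parse_device_list(text):
    """Map each PCI device ID (as an int) to a (name, asic) tuple."""
    devices = {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            dev_id, name, asic = match.groups()
            devices[int(dev_id, 16)] = (name, asic)
    return devices


SAMPLE = """
 1402 GeForce GTX 950 (GM206-A)
 15F0 Graphics Device (GP100GL-A)
 1B80 GeForce GTX 1080 (GP104-A)
"""

devices = parse_device_list(SAMPLE)
# Keep only Pascal parts, i.e. ASIC codes starting with "GP".
pascal_ids = sorted(i for i, (_, asic) in devices.items() if asic.startswith("GP"))
```

Filtering on the "GP" prefix separates the Pascal IDs from the Maxwell ("GM") ones that the list includes for comparison.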

Damn, and I thought I had made a comprehensive list of ASICs  :P

Dota 2 Update - May 23rd 2016

 * The beta version of Vulkan support for Dota 2 is now available via DLC.

Technical notes:
* Please make sure to opt-in to the Steam Client Beta for the latest Steam Vulkan Overlay (fixes performance issue with Steam Overlay).
* Enable with the -vulkan launch option after downloading the Vulkan Beta DLC.  Remove -dx9/-dx11/-gl (if present) from any previous launch options.
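
The launch-option rule above (add -vulkan, drop -dx9/-dx11/-gl) can be sketched as a small string rewrite. This is a hypothetical Python helper for illustration only, not something Steam or Dota 2 provides; `-console` in the usage note is just a placeholder for any unrelated option.

```python
# Renderer flags that the technical notes say to remove when enabling Vulkan.
CONFLICTING = {"-dx9", "-dx11", "-gl"}


def enable_vulkan(launch_options):
    """Return launch options with -vulkan set and other renderer flags removed."""
    kept = [opt for opt in launch_options.split() if opt not in CONFLICTING]
    if "-vulkan" not in kept:
        kept.append("-vulkan")
    return " ".join(kept)
```

For example, `enable_vulkan("-console -dx11")` yields `-console -vulkan`.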

Minimum requirements:
   - Windows 7/8/10 64-bit: NVIDIA 600-series+ (365.19+ driver), AMD 7700+ (Crimson driver)
   - Linux 64-bit: NVIDIA 600-series+ (364.16+ driver), AMD GCN 1.2 (16.20.3 driver)
   - 2GB of GPU memory required - may experience crashes with < 2GB of GPU memory.
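
A driver version can be checked against these minimums by comparing dotted versions component by component rather than as floating-point numbers. A minimal Python sketch, with hypothetical helper names; only the NVIDIA minimums quoted above are included.

```python
def parse_version(text):
    """Turn a dotted version such as '365.19' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Minimum NVIDIA driver versions quoted in the requirements above.
MINIMUM_NVIDIA = {"windows": "365.19", "linux": "364.16"}


def meets_minimum(platform, installed):
    """True when the installed NVIDIA driver is at least the quoted minimum."""
    return parse_version(installed) >= parse_version(MINIMUM_NVIDIA[platform])
```

Tuples compare element by element, so a three-part version such as 16.20.3 is handled the same way.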

* The first time you run with Vulkan you may experience short stutters while the engine caches shaders on disk. After playing through or watching a match, these stutters should go away.

* There is a known issue on Linux with NVIDIA GPUs where tearing can be observed even when vertical sync is enabled. NVIDIA is aware of the issue and it will be fixed in the future through a driver update.

* Please file any bugs with the Vulkan version at      

Dota 2 Benchmark - "Vulkan is slower and faster than DX11"

Vulkan doesn't yet work with Intel Broadwell (patched driver v4409)

You can check if Vulkan renderer is active in VConsole2 coming with DOTA 2

General Discussion / NVIDIA Pascal GPUs list
« on: May 25, 2016, 05:42:29 PM »
After sneaking around in some obscure places, I found these Pascal GPUs in the production or qualification phase, i.e. they are being tested with some special software ;)

 GP100-A01P   / GP100-VSXB-24-A1-5X


GP104-200-A1 / GP104-PR-DT
GP104-200-A1 / GP104-PS-NB - ASUS' teaser?
GP104-200-A1 / GP104-PS-DT / GeForce GTX 1070
GP104-400-A1 / GP104-PR-DT
GP104-400-A1 / GP104-PS-DT / GeForce GTX 1080 / dev_id 1B80
GP104-725-A1 / GP104-PS-DT
GP104-950-KD-A1 / GP104-PR-DT
GP104-975-A1 / GP104-QS-9XX
GP104-985-A1 / GP104-QS-9XX

GP106-750-A1 / GP106-QS-750

GP106-400-A1 / GP106-QS-400-V1

Some sites also talk about GP104-150-A1, cannot confirm it yet.

Also check out:
 AIDA64 issues list of Pascal SKUs, with their respective PCI device IDs

 Radeon Software Crimson Edition 16.5.3 Highlights
  • Support for:
    • Total War: Warhammer™
    • Overwatch™
    • Dota™2 (with Vulkan™ API)
  • New AMD Crossfire profile available for:
    • Total War: Warhammer™
    • Overwatch™

The actual new extension is GL_NV_robustness_video_memory_purge

glCapsViewer has a handy compare function (in this case here)

Overwatch Game Ready Driver Released

One new OpenGL extension since 365.19; I was too lazy to look it up.
Still built against Vulkan API 1.0.8
PhysX System Software 9.16.0318

3D-Tech News Around The Web / Vulkan SDK 1.0.13 Released
« on: May 22, 2016, 01:59:27 PM »
This SDK supports Vulkan API revision 1.0.13.

 The prior SDK supported Vulkan API revision 1.0.11.

Device layers are now deprecated! Any device layers must be converted to a layer that is queried and enabled at vkCreateInstance. vkEnumerateDeviceLayerProperties is deprecated.
  Overview of new features in SDK
  • Vulkan header includes VK_EXT_debug_marker extension used by RenderDoc
  • Strengthen image format validation
  • Added dozens of new tests to the validation layer test suite
  • Better parameter validation for vkCreateImage and vkCreateImageView and several vkGetPhysicalDevice... commands
  • Much improved parameter_validation coverage, including validation for VkFlags parameters
  • Significant structural and architectural layer improvements and cleanup, especially to core_validation
  • Major cleanup and code improvements to object_tracker
  • Scores of bugfixes and feature additions, including more robust fence handling
The components in this version of the SDK are based on the following specifications and source code repositories:
   Last Commits
  • LoaderAndValidationLayers 6441fb9a8dad1378d0 docs: update v0 languages for device layer deprecation
  • VulkanSamples f9ae4bbd09401fd1accc Merge branch 'trunk'
  • VulkanTools de4b4aa65254e66e5aac vktrace: Move null instance check after rest of setup
  Known Issues
  • This WSI extension is not supported: VK_KHR_display_swapchain
  • github #561 Resetting command pool doesn't track fences properly
  • github #550 [CTS] QueueBindSparse incorrectly reporting fence in use by another submission
  • github #539 Swapchain layer vkGetPhysicalDeviceQueueFamilyProperties does not call through to other layers.
  • github #536 ParameterValidation: "size not a mul of 4" error reported if VK_WHOLE_SIZE is passed to vkCmdFillBuffer()
  • github #527 Core Validation: Invalid "explicit dep needed" warning is issued at vkCmdBeginRenderPass() call time.
  • github #515 PV: VkDescriptorSetLayoutBinding stageFlags validation
  • github #500 demo Smoke missing -lrt under GCC 4.9
  • github #462 Validation for pCreateInfo structures does not report actual index (attachments, vertex input state)
  • github #415 debug report tries to read from possibly stack allocated memory
  • github #410 DS: vkQueueSubmit checks Event state too early when vkSetEvent is used
  • github #403 [CTS] Object tracker maps need to be per device
  • github #401 MEM: Stencil attachment memory not marked as valid in render pass
  • github #370 loader needs to use object allocators passed by application for memory allocations
  • github #367 The draw state validation layer reports that a timer query from 2 frames ago is unavailable or inflight.
  • github #362 backslash in JSON loader files not properly escaped by cJSON library
  • github #335 Undefined memory tracking is not fine grained enough
  • github #328 Validation layer reports errors if memory object alias same memory
  • github #321 vkAllocateMemory not handling null pointers
  • github #319 vkCmdCopyQueryPoolResults executed from different command buffers
  • github #306 race on globalLockInitialized: it is reset after releasing the mutex lock
  • github #299 Clearly erroneous dynamic UBO offset error triggered
  • github #282 layer_validation_test failures
  • github #281 Cube -- validate with screenshot crashes on AMD/Intel Win 10
  • github #280 Cube resize stops updating image on Win 10 Intel driver
  • github #279 Samples validation errors on AMD driver
  • github #278 tri resize broken on AMD driver
  • github #277 render tests --show-images doesn't work
  • github #276 render tests have validation errors
  • github #128 Memory layer: "Cannot read invalid memory X, please fill the memory before using" is incorrectly reported
  • github #103 DrawState layer: unnecessarily informs about a mismatch of the number of samples when switching to subsequent subpass
  • github #95 Build VKStatic.1 project failed on Visual Studio Ultimate 2013
  • github #90 loader: Add support for WSI VK_KHR_display_swapchain extension
  • github #54 vk_layer_validation_tests test failure
  • github #36 loader: pointer cast build warnings
  • LX#527 The draw state validation layer incorrectly reports layout error when submitting command buffer. Vulkan-LoaderAndValidationLayers GitHub
  • LX#526 The draw state validation layer reports that a timer query from 2 frames ago is unavailable or inflight. Vulkan-LoaderAndValidationLayers
  • LX#524 Validation layers crash dEQP-VK.wsi.win32.swapchain.render.basic test case
  • LX#523 When I enable validation layers vkCreateWin32SurfaceKHR returns garbage pointer and vkCreateSwapchainKHR crashes
  • LX#522 Error in detecting command buffer reset
  • LX#520 VK_ERROR_OUT_OF_DATE_KHR not set after XCB window is resized
  • LX#519 VulkanRT NT5.x support
  • LX#510 Issue with validation layer and fences
  • LX#506 Invalid vkFlushMappedMemoryRanges validation error with VkMappedMemoryRange::size = VK_WHOLE_SIZE
  • LX#504 VkQueueSubmit not showing real fence on api dump
  • LX#502 No warning for vkCreateImageView with incompatible layerCount & viewType
  • LX#492 Validation: Add output of AspectMask to core_validation error message
  • LX#484 vktrace puts trace file in same dir as trace library if -o not specified
  • LX#483 Descriptor set dynamic offset validation is wrong
  • LX#471 Command buffer tracking not taking vkQueueSubmit(queue, 0, 0, null, fence) into consideration
  • LX#305 [] Validation draw state layer too slow for vkGetQueryPoolResults()
  • github #1 [VulkanTools] Running vktrace with validation enabled causes vkreplay to fail.
  • github #3 [VulkanTools] writes to device memory allocated with VK_MEMORY_PROPERTY_HOST_COHERENT_BIT are not captured
  • github #33 [VulkanTools] The trace file generated on 32bit system cannot be read by 64bit traceviewer
  • github #41 [VulkanTools] Cannot trace and replay Hologram / Smoketest on Linux
  • github #42 [VulkanTools] miss alignment for some arm cpu vfp load instructions
  • github #43 [VulkanTools] api_dump does not dump vkCreateDevice parameters/results when vkCreateDevice returns an error
  • github #45 [VulkanSamples] Missing minimum CMake version in Hologram demo
