Geeks3D Forums


Recent Posts

Pages: [1] 2 3 ... 10
1
NVIDIA has released a new Vulkan driver for developers that brings Vulkan API 1.1.85 and new extensions.

NVIDIA Vulkan driver page: https://developer.nvidia.com/vulkan-driver

Downloads
- 399.32 for win10 64-bit: https://developer.nvidia.com/39932-win-10
- 399.32 for win7/win8 64-bit: https://developer.nvidia.com/39932-win-78
- 396.54.06 for Linux 64-bit: https://developer.nvidia.com/linux-3965406

Changelog:
Quote
September 19th, 2018 - Windows 399.32, Linux 396.54.06
    New Extensions:
        VK_KHR_driver_properties
        VK_KHR_shader_atomic_int64
    Bug fixes:
        Corruption workaround for DX content running on Vulkan





Here is the report from GPU Caps Viewer 1.39.0.0 for a GeForce GTX 1080 on Windows 10 64-bit:
Quote
- Instance extensions: 12
  - VK_KHR_device_group_creation (version: 1)
  - VK_KHR_external_fence_capabilities (version: 1)
  - VK_KHR_external_memory_capabilities (version: 1)
  - VK_KHR_external_semaphore_capabilities (version: 1)
  - VK_KHR_get_physical_device_properties2 (version: 1)
  - VK_KHR_get_surface_capabilities2 (version: 1)
  - VK_KHR_surface (version: 25)
  - VK_KHR_win32_surface (version: 6)
  - VK_EXT_debug_report (version: 9)
  - VK_EXT_swapchain_colorspace (version: 3)
  - VK_NV_external_memory_capabilities (version: 1)
  - VK_EXT_debug_utils (version: 1)
- Instance layers: 5
  - VK_LAYER_NV_optimus (version: 1.1.85, impl: 1)
  - VK_LAYER_RENDERDOC_Capture (version: 1.0.0, impl: 91)
  - VK_LAYER_NV_nsight (version: 1.0.13, impl: 1)
  - VK_LAYER_NV_nomad (version: 1.1.71, impl: 1)
  - VK_LAYER_LUNARG_standard_validation (version: 1.0.82, impl: 1)
- Physical devices: 1
  - [Vulkan device 0]: GeForce GTX 1080 ------------------
    - API version: 1.1.85
    - vendorID: 4318
    - deviceID: 7040
    - driver version: 1674051584
    - NVIDIA driver version: 399.32
  - memory heap count: 2
    - heap1: 8079MB
    - heap2: 32706MB
  - memory type count: 4
    - mem type 7 - heap index : 0 - property flag : 1
      > mem property: VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
    - mem type 8 - heap index : 0 - property flag : 1
      > mem property: VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
    - mem type 9 - heap index : 1 - property flag : 6
      > mem property: VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
      > mem property: VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
    - mem type 10 - heap index : 1 - property flag : 14
      > mem property: VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
      > mem property: VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
      > mem property: VK_MEMORY_PROPERTY_HOST_CACHED_BIT
  - extensions: 63
    - VK_KHR_8bit_storage (version: 1)
    - VK_KHR_16bit_storage (version: 1)
    - VK_KHR_bind_memory2 (version: 1)
    - VK_KHR_create_renderpass2 (version: 1)
    - VK_KHR_dedicated_allocation (version: 3)
    - VK_KHR_descriptor_update_template (version: 1)
    - VK_KHR_device_group (version: 3)
    - VK_KHR_draw_indirect_count (version: 1)
    - VK_KHR_driver_properties (version: 1)
    - VK_KHR_external_fence (version: 1)
    - VK_KHR_external_fence_win32 (version: 1)
    - VK_KHR_external_memory (version: 1)
    - VK_KHR_external_memory_win32 (version: 1)
    - VK_KHR_external_semaphore (version: 1)
    - VK_KHR_external_semaphore_win32 (version: 1)
    - VK_KHR_get_memory_requirements2 (version: 1)
    - VK_KHR_image_format_list (version: 1)
    - VK_KHR_maintenance1 (version: 2)
    - VK_KHR_maintenance2 (version: 1)
    - VK_KHR_maintenance3 (version: 1)
    - VK_KHR_multiview (version: 1)
    - VK_KHR_push_descriptor (version: 2)
    - VK_KHR_relaxed_block_layout (version: 1)
    - VK_KHR_sampler_mirror_clamp_to_edge (version: 1)
    - VK_KHR_sampler_ycbcr_conversion (version: 1)
    - VK_KHR_shader_atomic_int64 (version: 1)
    - VK_KHR_shader_draw_parameters (version: 1)
    - VK_KHR_storage_buffer_storage_class (version: 1)
    - VK_KHR_swapchain (version: 70)
    - VK_KHR_variable_pointers (version: 1)
    - VK_KHR_vulkan_memory_model (version: 2)
    - VK_KHR_win32_keyed_mutex (version: 1)
    - VK_EXT_blend_operation_advanced (version: 2)
    - VK_EXT_conditional_rendering (version: 1)
    - VK_EXT_conservative_rasterization (version: 1)
    - VK_EXT_depth_range_unrestricted (version: 1)
    - VK_EXT_descriptor_indexing (version: 2)
    - VK_EXT_discard_rectangles (version: 1)
    - VK_EXT_hdr_metadata (version: 1)
    - VK_EXT_inline_uniform_block (version: 1)
    - VK_EXT_post_depth_coverage (version: 1)
    - VK_EXT_sample_locations (version: 1)
    - VK_EXT_sampler_filter_minmax (version: 1)
    - VK_EXT_shader_subgroup_ballot (version: 1)
    - VK_EXT_shader_subgroup_vote (version: 1)
    - VK_EXT_shader_viewport_index_layer (version: 1)
    - VK_EXT_vertex_attribute_divisor (version: 3)
    - VK_NV_clip_space_w_scaling (version: 1)
    - VK_NV_dedicated_allocation (version: 1)
    - VK_NV_device_diagnostic_checkpoints (version: 2)
    - VK_NV_external_memory (version: 1)
    - VK_NV_external_memory_win32 (version: 1)
    - VK_NV_fill_rectangle (version: 1)
    - VK_NV_fragment_coverage_to_color (version: 1)
    - VK_NV_framebuffer_mixed_samples (version: 1)
    - VK_NV_geometry_shader_passthrough (version: 1)
    - VK_NV_sample_mask_override_coverage (version: 1)
    - VK_NV_shader_subgroup_partitioned (version: 1)
    - VK_NV_viewport_array2 (version: 1)
    - VK_NV_viewport_swizzle (version: 1)
    - VK_NV_win32_keyed_mutex (version: 1)
    - VK_NVX_device_generated_commands (version: 3)
    - VK_NVX_multiview_per_view_attributes (version: 1)
  - device layers: 1
    - VK_LAYER_NV_optimus (version: 1.1.85, impl: 1)
  - device features:
    - robustBufferAccess: true
    - fullDrawIndexUint32: true
    - imageCubeArray: true
    - independentBlend: true
    - geometryShader: true
    - tessellationShader: true
    - sampleRateShading: true
    - dualSrcBlend: true
    - logicOp: true
    - multiDrawIndirect: true
    - drawIndirectFirstInstance: true
    - depthClamp: true
    - depthBiasClamp: true
    - fillModeNonSolid: true
    - depthBounds: true
    - wideLines: true
    - largePoints: true
    - alphaToOne: true
    - multiViewport: true
    - samplerAnisotropy: true
    - textureCompressionETC2: false
    - textureCompressionASTC_LDR: false
    - textureCompressionBC: true
    - occlusionQueryPrecise: true
    - pipelineStatisticsQuery: true
    - vertexPipelineStoresAndAtomics: true
    - fragmentStoresAndAtomics: true
    - shaderTessellationAndGeometryPointSize: true
    - shaderImageGatherExtended: true
    - shaderStorageImageExtendedFormats: true
    - shaderStorageImageMultisample: true
    - shaderStorageImageReadWithoutFormat: true
    - shaderStorageImageWriteWithoutFormat: true
    - shaderUniformBufferArrayDynamicIndexing: true
    - shaderSampledImageArrayDynamicIndexing: true
    - shaderStorageBufferArrayDynamicIndexing: true
    - shaderStorageImageArrayDynamicIndexing: true
    - shaderClipDistance: true
    - shaderCullDistance: true
    - shaderFloat64: true
    - shaderInt64: true
    - shaderInt16: true
    - shaderResourceResidency: true
    - shaderResourceMinLod: true
    - sparseBinding: true
    - sparseResidencyBuffer: true
    - sparseResidencyImage2D: true
    - sparseResidencyImage3D: true
    - sparseResidency2Samples: true
    - sparseResidency4Samples: true
    - sparseResidency8Samples: true
    - sparseResidency16Samples: true
    - sparseResidencyAliased: true
    - variableMultisampleRate: true
    - inheritedQueries: true
  - device limits
    - maxImageDimension1D: 32768
    - maxImageDimension2D: 32768
    - maxImageDimension3D: 16384
    - maxImageDimensionCube: 32768
    - maxImageArrayLayers: 2048
    - maxTexelBufferElements: 134217728
    - maxUniformBufferRange: 65536
    - maxStorageBufferRange: 4294967295
    - maxPushConstantsSize: 256
    - maxMemoryAllocationCount: 4096
    - maxSamplerAllocationCount: 4000
    - bufferImageGranularity: 1024
    - sparseAddressSpaceSize: 18446744073709551615
    - maxBoundDescriptorSets: 32
    - maxPerStageDescriptorSamplers: 1048576
    - maxPerStageDescriptorUniformBuffers: 15
    - maxPerStageDescriptorSampledImages: 1048576
    - maxPerStageDescriptorStorageImages: 1048576
    - maxPerStageDescriptorInputAttachments: 1048576
    - maxPerStageResources: 4294967295
    - maxDescriptorSetSamplers: 1048576
    - maxDescriptorSetUniformBuffers: 90
    - maxDescriptorSetUniformBuffersDynamic: 15
    - maxDescriptorSetStorageBuffers: 1048576
    - maxDescriptorSetStorageBuffersDynamic: 16
    - maxDescriptorSetSampledImages: 1048576
    - maxDescriptorSetStorageImages: 1048576
    - maxDescriptorSetInputAttachments: 1048576
    - maxVertexInputAttributes: 32
    - maxVertexInputBindings: 32
    - maxVertexInputAttributeOffset: 2047
    - maxVertexInputBindingStride: 2048
    - maxVertexOutputComponents: 128
    - maxTessellationGenerationLevel: 64
    - maxTessellationPatchSize: 32
    - maxTessellationControlPerVertexInputComponents: 128
    - maxTessellationControlPerVertexOutputComponents: 128
    - maxTessellationControlPerPatchOutputComponents: 120
    - maxTessellationControlTotalOutputComponents: 4216
    - maxTessellationEvaluationInputComponents: 128
    - maxTessellationEvaluationOutputComponents: 128
    - maxGeometryShaderInvocations: 32
    - maxGeometryInputComponents: 128
    - maxGeometryOutputComponents: 128
    - maxGeometryOutputVertices: 1024
    - maxGeometryTotalOutputComponents: 1024
    - maxFragmentInputComponents: 128
    - maxFragmentOutputAttachments: 8
    - maxFragmentDualSrcAttachments: 1
    - maxFragmentCombinedOutputResources: 16
    - maxComputeSharedMemorySize: 49152
    - maxComputeWorkGroupCount: [2147483647; 65535; 65535]
    - maxComputeWorkGroupInvocations: 1536
    - maxComputeWorkGroupSize: [1536; 1024; 64]
    - subPixelPrecisionBits: 8
    - subTexelPrecisionBits: 8
    - mipmapPrecisionBits: 8
    - maxDrawIndexedIndexValue: 4294967295
    - maxDrawIndirectCount: 4294967295
    - maxSamplerLodBias: 15.000000
    - maxSamplerAnisotropy: 16.000000
    - maxViewports: 16
    - maxViewportDimensions: [32768; 32768]
    - viewportBoundsRange: [-65536.000000 ; 65536.000000]
    - viewportSubPixelBits: 8
    - minMemoryMapAlignment: 64
    - minTexelBufferOffsetAlignment: 16
    - minUniformBufferOffsetAlignment: 256
    - minStorageBufferOffsetAlignment: 32
    - minTexelOffset: 4294967288
    - maxTexelOffset: 7
    - minTexelGatherOffset: 4294967264
    - maxTexelGatherOffset: 31
    - minInterpolationOffset: -0.500000
    - maxInterpolationOffset: 0.437500
    - subPixelInterpolationOffsetBits: 4
    - maxFramebufferWidth: 32768
    - maxFramebufferHeight: 32768
    - maxFramebufferLayers: 2048
    - framebufferColorSampleCounts: 15
    - framebufferDepthSampleCounts: 15
    - framebufferStencilSampleCounts: 31
    - framebufferNoAttachmentsSampleCounts: 31
    - maxColorAttachments: 8
    - sampledImageColorSampleCounts: 15
    - sampledImageIntegerSampleCounts: 15
    - sampledImageDepthSampleCounts: 15
    - sampledImageStencilSampleCounts: 31
    - storageImageSampleCounts: 15
    - maxSampleMaskWords: 1
    - timestampComputeAndGraphics: 1
    - timestampPeriod: 1.000000
    - maxClipDistances: 8
    - maxCullDistances: 8
    - maxCombinedClipAndCullDistances: 8
    - discreteQueuePriorities: 2
    - pointSizeRange: [1.000000 ; 189.875000]
    - lineWidthRange: [0.500000 ; 10.000000]
    - pointSizeGranularity: 0.125000
    - lineWidthGranularity: 0.125000
    - strictLines: 1
    - standardSampleLocations: 1
    - optimalBufferCopyOffsetAlignment: 1
    - optimalBufferCopyRowPitchAlignment: 1
    - nonCoherentAtomSize: 64
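
Several values in the report above are printed as raw integers: the driver version is a packed number, the sample counts and memory property flags are bitmasks, and the two negative texel offsets show up as large unsigned values. The sketch below decodes them. It is only an illustration: the helper names are made up for this example, and the 10/8/8/6-bit driver version layout is NVIDIA's commonly used convention (assumed here), not something the Vulkan specification defines.

```python
# Minimal sketch for decoding raw values from the report above.
# All function names are hypothetical helpers for this example.

def decode_nvidia_driver_version(packed):
    """Split a packed driverVersion using the assumed NVIDIA layout:
    10 bits major, 8 bits minor, 8 bits secondary, 6 bits tertiary."""
    major = (packed >> 22) & 0x3FF
    minor = (packed >> 14) & 0xFF
    secondary = (packed >> 6) & 0xFF
    tertiary = packed & 0x3F
    return (major, minor, secondary, tertiary)

def sample_counts(mask):
    """Expand a VkSampleCountFlags bitmask into supported sample counts
    (bits 0..6 stand for 1, 2, 4, 8, 16, 32 and 64 samples)."""
    return [1 << bit for bit in range(7) if mask & (1 << bit)]

def as_int32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit value."""
    return value - (1 << 32) if value >= (1 << 31) else value

# Bit values of the four memory property flags that appear in the report.
MEMORY_PROPERTY_BITS = {
    0x1: "VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT",
    0x2: "VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT",
    0x4: "VK_MEMORY_PROPERTY_HOST_COHERENT_BIT",
    0x8: "VK_MEMORY_PROPERTY_HOST_CACHED_BIT",
}

def memory_properties(flags):
    """List the names of the known property bits set in a flag value."""
    return [name for bit, name in MEMORY_PROPERTY_BITS.items() if flags & bit]

if __name__ == "__main__":
    print(decode_nvidia_driver_version(1674051584))  # driver version field
    print(sample_counts(15), sample_counts(31))      # color vs. stencil counts
    print(as_int32(4294967288), as_int32(4294967264))  # min texel offsets
    print(memory_properties(14))                     # last memory type
```

With the values from the report, the packed driver version decodes to 399.32, which matches the "NVIDIA driver version" line, and minTexelOffset / minTexelGatherOffset come out as -8 and -32.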

2
3D-Tech News Around The Web / Re: TechPowerUp GPU-Z 2.11.0 Released
« Last post by JeGX on September 18, 2018, 10:53:52 AM »
5
3D-Tech News Around The Web / TechPowerUp GPU-Z 2.11.0 Released
« Last post by Stefan on September 17, 2018, 05:23:21 PM »
   GPU-Z is a lightweight utility designed to give you all information about your video card and GPU.



Version History: v2.11.0 (September 17th, 2018)
  • Added NVIDIA GeForce RTX Turing support
  • Added option to minimize GPU-Z on close
  • Added system RAM memory usage sensor
  • Added temperature monitoring offset for Threadripper 2nd gen
  • Fixed typo in NVIDIA Perf Cap Reason tooltip
  • GPU-Z will no longer use AMD ADL memory sensors because they are buggy, WDDM monitoring used again
  • GPU Lookup feature improved by taking boost clock into account
  • Added ability to clean up old QueryExternal files in temp directory
  • Added support to BIOS parser for USB-C output, GDDR6 memory, 16 Gbit memory chips
  • Added support for NVIDIA RTX 2080 Ti, RTX 2080, RTX 2070, GTX 750 Ti (GM107-A), GTX 1050 Ti Mobile 4 GB, Quadro P1000, Tesla P100 DGXS, GeForce 9200
  • Added support for AMD Vega 20, Fenghuang, Ryzen 5 Pro 2500U, 5 Pro 2400G, 3 Pro 2200G, 3 Pro 2300U, 3 2200GE, Athlon 200GE, Embedded V1807B
  • Added support for Intel UHD 610, UHD P630 (Xeon), Coffee Lake GT3e (i5-8259U)
6
General Discussion / Re: NVIDIA Fermat OptiX-RTX ray tracing benchmark
« Last post by m_nyers on September 17, 2018, 09:27:25 AM »
Okay, I will double check with CUDA SDK 10 and RTX GPUs and create a pack from it!
7
3D-Tech News Around The Web / Vulkan Hardware Capability Viewer 1.8 released
« Last post by Stefan on September 16, 2018, 11:11:41 AM »
Vulkan Hardware Capability Viewer 1.8 released

Version 1.8 of the Vulkan Hardware Capability Viewer is now available for all platforms (Windows, Linux, Android).
Like the previous release, this version fully supports Vulkan 1.1 and adds support for new extensions:
 
  • VK_EXT_inline_uniform_block
  • VK_KHR_vulkan_memory_model
  • VK_EXT_vertex_attribute_divisor
The UI has also been slightly updated. Instead of listing extension features and properties on a separate tab, these are now included in the extensions tab.


8
3D-Tech News Around The Web / NVIDIA Turing GPU Architecture in-depth
« Last post by JeGX on September 14, 2018, 06:29:26 PM »
An 87-page PDF document that covers the Turing GPU architecture:

Quote
...

New Streaming Multiprocessor (SM)
Turing introduces a new processor architecture, the Turing SM, that delivers a dramatic boost in
shading efficiency, achieving 50% improvement in delivered performance per CUDA Core
compared to the Pascal generation. These improvements are enabled by two key architectural
changes. First, the Turing SM adds a new independent integer datapath that can execute
instructions concurrently with the floating-point math datapath. In previous generations,
executing these instructions would have blocked floating-point instructions from issuing. Second,
the SM memory path has been redesigned to unify shared memory, texture caching, and memory
load caching into one unit. This translates to 2x more bandwidth and more than 2x more capacity
available for L1 cache for common workloads.

...

Turing Tensor Cores
Tensor Cores are specialized execution units designed specifically for performing the tensor /
matrix operations that are the core compute function used in Deep Learning. Similar to Volta
Tensor Cores, the Turing Tensor Cores provide tremendous speed-ups for matrix computations at
the heart of deep learning neural network training and inferencing operations. Turing GPUs
include a new version of the Tensor Core design that has been enhanced for inferencing. Turing
Tensor Cores add new INT8 and INT4 precision modes for inferencing workloads that can tolerate
quantization and don’t require FP16 precision. Turing Tensor Cores bring new deep learning-based
AI capabilities to GeForce gaming PCs and Quadro-based workstations for the first time. A
new technique called Deep Learning Super Sampling (DLSS) is powered by Tensor Cores. DLSS
leverages a deep neural network to extract multidimensional features of the rendered scene and
intelligently combine details from multiple frames to construct a high-quality final image. DLSS
uses fewer input samples than traditional techniques such as TAA, while avoiding the algorithmic
difficulties such techniques face with transparency and other complex scene elements.

Real-Time Ray Tracing Acceleration
Turing introduces real-time ray tracing that enables a single GPU to render visually realistic 3D
games and complex professional models with physically accurate shadows, reflections, and
refractions. Turing’s new RT Cores accelerate ray tracing and are leveraged by systems and
interfaces such as NVIDIA’s RTX ray tracing technology, and APIs such as Microsoft DXR, NVIDIA
OptiX™, and Vulkan ray tracing to deliver a real-time ray tracing experience.

...


Link: https://www.nvidia.com/content/dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf


9
3D-Tech News Around The Web / Vulkan Memory Model
« Last post by JeGX on September 14, 2018, 05:57:23 PM »
Some details about the new memory model introduced with Vulkan 1.1.84.

Quote
This week, Vulkan® has become the world’s first graphics API to include a formal memory model for its associated GLSL™ and SPIR-V™ programming languages. This significant announcement has a number of components that come together to significantly boost the robustness of the Vulkan standard for programming correctness and sophisticated compiler optimizations.

Firstly, Khronos® has released a provisional Vulkan Memory Model Specification that includes extensions for Vulkan, SPIR-V, and GLSL that gives Vulkan developers additional control over how their shaders synchronize access to memory in a parallel execution environment. In tandem with the extension specification, Khronos has released memory model extension conformance tests to help shader compilers ensure that they implement the specified memory model synchronization functionality correctly.


The Vulkan memory model is based on the C++ memory model, but adds valuable functionality including scopes, storage classes, and memory availability and visibility operations. Scopes allow synchronization to be limited to threads in close proximity to each other. Storage classes allow synchronization to be limited to specific types of memory. Availability and visibility operations give control over when and how cache maintenance operations are performed in systems with noncoherent cache hierarchies. Each of these capabilities enable an additional level of control compared to the baseline C++ model, which can be exploited to reduce the cost of synchronization and thus increase performance.

Links:
- VK_KHR_vulkan_memory_model

- Vulkan has just become the world’s first graphics API with a formal memory model.  So, what is a memory model and why should I care?

- What is the purpose of Vulkan's new extension VK_KHR_vulkan_memory_model?

- https://github.com/KhronosGroup/Vulkan-MemoryModel
10
3D-Tech News Around The Web / Re: Microsoft Forza Horizon 4 benchmark
« Last post by nuninho1980 on September 14, 2018, 04:09:19 PM »
i7-4790K@4.4GHz and EVGA GeForce GTX 980 Ti SC+@all stock speeds with v399.24 driver-HQ


I got an average of 72.5 fps. :D