Topic: Machine Learning Acceleration in Vulkan with Cooperative Matrices

Machine learning harnesses computing power to solve a variety of ‘hard’ problems that seemed impossible to program using traditional languages and techniques. It spares the programmer from explicitly coding each step of a complex pattern-matching problem, such as understanding speech or recognizing objects within an image. NVIDIA aims to bring machine learning to Vulkan programmers through the Cooperative Matrix vendor extension.

Machine learning-based applications train a network of simulated neurons, a neural network, by feeding it a large number of examples and then giving feedback on the generated responses until the network can perform the desired task. This is similar to teaching a human baby to recognize words and pictures by reading them picture books!

Once trained, the network can be deployed in an application, where it is fed real-world data and generates, or infers, useful responses in real time. Running a trained neural network in real time demands a great deal of compute, but the work is highly parallelizable. This is why GPUs substantially accelerate inferencing on the many platforms that have one available, from mobile phones to supercomputers.
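
To give a rough feel for why inferencing parallelizes so well, here is a minimal sketch of a single fully connected layer in plain Python. It is not taken from the article: the function name `dense_layer`, the ReLU activation, and the tiny weights are all made up for illustration. The point is only that every output neuron is an independent dot product, so a GPU could work on all of them at once.

```python
# Hypothetical illustration, not code from the article:
# one fully connected layer computing relu(W * x + b).
def dense_layer(weights, bias, inputs):
    outputs = []
    for row, b in zip(weights, bias):
        # Each output neuron is an independent dot product of one weight
        # row with the inputs; no iteration depends on another, which is
        # what makes this workload a good fit for parallel hardware.
        acc = b
        for w, x in zip(row, inputs):
            acc += w * x
        outputs.append(max(0.0, acc))  # ReLU activation (an assumption)
    return outputs


# Made-up weights: 3 output neurons, 2 inputs.
weights = [[1.0, -2.0], [0.5, 0.5], [-1.0, 1.0]]
bias = [0.0, 1.0, -0.5]
result = dense_layer(weights, bias, [2.0, 1.0])
print(result)
```

Stacked over many inputs, the loops above amount to a matrix multiplication followed by an addition, which is the kind of operation a matrix-oriented extension is presumably meant to speed up.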

These applications often call into a software inferencing engine, such as NVIDIA’s TensorRT, that is highly optimized to run the necessary inferencing calculations on the GPU as quickly as possible. The latest generations of GPUs, such as those based on NVIDIA’s Turing and Volta architectures, even have dedicated processing blocks that run those inferencing operations significantly faster than the traditional processors found in previous GPUs.

Complete article: