The OpenCL™ working group today released the OpenCL 3.0.10 specification, which includes the latest round of maintenance updates, clarifications and bug fixes, many of them in response to issues and questions raised by the OpenCL developer community.
This latest specification includes updates for readability and accessibility, such as improved syntax highlighting, as well as new and updated extensions which are outlined below.
The cl_khr_command_buffer extension enables a command-buffer to be recorded once and then dispatched separately. Enqueuing the same workload multiple times from a recorded command-buffer, rather than re-issuing each command individually, significantly lowers call overhead and execution latency. This is particularly useful for improving the performance of use cases with a large number of smaller commands, such as machine learning and inferencing.
This extension is released provisionally to gather developer feedback before finalization. In this base extension, command-buffers are immutable after recording and can only have commands recorded to a single command-queue. However, the API is designed so that these restrictions can be relaxed in future layered extensions.
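The record-then-replay lifecycle described above can be pictured with a small host-side toy model. This is only a sketch in plain C, not the OpenCL API: every name in it (toy_command_buffer, toy_record, toy_finalize, toy_enqueue) is hypothetical, and it models only immutability after recording, not the single command-queue restriction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, NOT the cl_khr_command_buffer API: it mirrors
   only the record -> finalize -> replay lifecycle. */
enum { TOY_MAX_COMMANDS = 16 };

typedef void (*toy_command_fn)(int *data, size_t n);

typedef struct {
    toy_command_fn commands[TOY_MAX_COMMANDS];
    size_t count;
    int finalized; /* once set, no further recording is allowed */
} toy_command_buffer;

static void toy_init(toy_command_buffer *cb) {
    assert(cb != NULL);
    cb->count = 0;
    cb->finalized = 0;
}

/* Record one command; returns -1 after finalization or when full. */
static int toy_record(toy_command_buffer *cb, toy_command_fn fn) {
    assert(cb != NULL && fn != NULL);
    if (cb->finalized || cb->count == TOY_MAX_COMMANDS) return -1;
    cb->commands[cb->count++] = fn;
    return 0;
}

/* Freeze the buffer: from here on it is immutable and may be replayed. */
static void toy_finalize(toy_command_buffer *cb) {
    assert(cb != NULL);
    cb->finalized = 1;
}

/* Replay every recorded command in order; returns -1 if not finalized. */
static int toy_enqueue(const toy_command_buffer *cb, int *data, size_t n) {
    assert(cb != NULL && data != NULL);
    if (!cb->finalized) return -1;
    for (size_t i = 0; i < cb->count; ++i) cb->commands[i](data, n);
    return 0;
}

/* Two stand-in "kernels" used as recorded commands. */
static void add_one(int *data, size_t n) {
    for (size_t i = 0; i < n; ++i) data[i] += 1;
}
static void times_two(int *data, size_t n) {
    for (size_t i = 0; i < n; ++i) data[i] *= 2;
}
```

In this model the per-command setup cost is paid once in toy_record, and each later toy_enqueue call replays the whole sequence with a single call, which is the overhead saving the extension is aiming for.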
The cl_khr_async_work_group_copy_fence and cl_khr_extended_async_copies extensions enhance the asynchronous copy functionality in OpenCL C kernels by enabling efficient transfer of data between global and local memories via DMA transactions. This complements previous one-dimensional operations by supporting complex memory transfers and by enabling multiple asynchronous transactions to run in parallel using fences for synchronization.
This functionality is of particular relevance to digital signal processors (DSPs) and other embedded processors. Both extensions were provisionally ratified in April 2020; they have since incorporated developer feedback and are now finalized.
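As an illustration of a transfer that goes beyond a one-dimensional copy, the sketch below shows what a strided two-dimensional tile copy computes. It is a synchronous host-side reference in plain C, not the extension's built-in and not a DMA transaction; the function name and parameters are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side reference, NOT an OpenCL C built-in.
   Copies num_lines rows of line_bytes bytes each. Consecutive rows are
   src_stride bytes apart in the source and dst_stride bytes apart in
   the destination, so a tile can be gathered from a larger image. */
static void copy_2d_reference(unsigned char *dst, size_t dst_stride,
                              const unsigned char *src, size_t src_stride,
                              size_t line_bytes, size_t num_lines) {
    assert(dst != NULL && src != NULL);
    assert(line_bytes <= src_stride && line_bytes <= dst_stride);
    for (size_t row = 0; row < num_lines; ++row) {
        memcpy(dst + row * dst_stride, src + row * src_stride, line_bytes);
    }
}
```

For example, calling copy_2d_reference(tile, 2, image + 5, 4, 2, 2) on a 4x4 byte image packs the 2x2 tile that starts at row 1, column 1 into a contiguous 4-byte buffer. On a device, an asynchronous version of such a copy would let the work-group keep computing while the tile is fetched into local memory.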
Expect and Assume Optimization Hints
The cl_khr_expect_assume extension enables the SPIR-V "expect" and "assume" optimization hints to be used in an OpenCL environment. These hints give the compiler additional information about the expected values of variables and about conditions that may be assumed to be true, which can improve the performance of some kernels.
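The flavor of these hints can be sketched on the host with the analogous GCC/Clang builtins. This is an illustration in plain C, not OpenCL C kernel code: the ASSUME and LIKELY macros and the sum_abs function are hypothetical names introduced here, and the extension specification should be consulted for the exact built-ins available in kernels.

```c
#include <stddef.h>

/* Host-compiler stand-ins (GCC/Clang), used only to illustrate the idea. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assume)
#    define ASSUME(cond) __builtin_assume(cond)
#  endif
#endif
#ifndef ASSUME
/* Fallback: tell the compiler the false branch can never be reached. */
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#endif

/* "expect": the condition is usually true, so favor that path. */
#define LIKELY(x) __builtin_expect(!!(x), 1)

/* Sum of absolute values. The caller promises that n is a multiple of 4;
   breaking that promise is undefined behavior, which is exactly what
   makes "assume" a hint rather than a runtime check. */
static float sum_abs(const float *v, size_t n) {
    ASSUME(n % 4 == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        if (LIKELY(v[i] >= 0.0f)) {
            total += v[i]; /* expected common case */
        } else {
            total -= v[i]; /* rare case: negative input */
        }
    }
    return total;
}
```

The hints do not change the result for valid inputs; they only give the optimizer room to lay out the likely branch first or to drop remainder handling that the assumption rules out.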