AMD - What’s new in HIP and HCC for ROCm 1.6
HIP has a new home
HIP now easier to use
  • We’ve made it even easier to port CUDA code to HIP:
    • The hipLaunchParm extra kernel argument is no longer necessary, and HIP does not modify the function signature for __global__ functions.  This removes one of the last manual steps in the porting process.  Just use hipLaunchKernelGGL instead of hipLaunchKernel.
    • The HIP_KERNEL_NAME wrapper is needed only if the kernel name includes a comma. Typically, this occurs only when the kernel is a template with multiple parameters.
  • HIP now provides two flavors of the hipify translation tools:
    • hipify-perl: Performs a simple text-based conversion of CUDA code into HIP.  Easy to use and useful for quick scans and conversions, but can be fooled by macros or complex code paths.
    • hipify-clang: Parses the CUDA code using clang and performs a translation of the symbols.  Requires some setup to configure the correct source paths and defines, but provides the most robust translation.  Useful if you are familiar with the source code and make infrastructure.
Faster performance
  • The ROCm 1.6 HCC compiles HIP code up to 2X faster than the version included in ROCm 1.4.
  • HCC code generation includes the latest version of the LLVM Direct-To-GCN compiler, which includes optimizations for scheduling and register allocation.
  • HIP/HCC now optimize GPU cache flushing so it is performed only as needed, rather than on each kernel command.