Adding Experimental PTX Support To ComputeCpp For NVIDIA® Hardware

Posted on December 6, 2017 by Rod Burns.

ComputeCpp provides support for a wide range of devices that can consume OpenCL SPIR instructions. When SYCL code is compiled by ComputeCpp, the device code is output using SPIR or SPIR-V binary formats. This enables SYCL developers to deploy their software to a wide variety of devices from several companies including AMD, Intel, ARM and Renesas without re-writing any code.

One of the main requests from our developer community has been support for NVIDIA platforms. NVIDIA provides an OpenCL interface, but unfortunately it does not support the standard intermediate representations (SPIR or SPIR-V). While SPIR and SPIR-V are open standards developed as the intermediate representation for OpenCL devices, Parallel Thread Execution (PTX) was developed by NVIDIA as an intermediate representation of code for its own GPUs. When CUDA applications are compiled, the output is the PTX intermediate representation.

The ComputeCpp compiler team, together with Codeplay Research, has been working on a side project to integrate PTX backend generation into our compiler toolchain. We are pleased to announce that this experimental support is now available in our Community Edition. This means that developers using ComputeCpp can compile their code for the PTX target and execute it on NVIDIA GPUs, with some limitations. In particular, although the code generation is ready, builtin support is limited because we still need to map OpenCL builtins to their PTX counterparts. The ComputeCpp team will continue adding these steadily over the 2018 releases, focusing on those most requested by users. If you try the PTX support and find a missing builtin, all hope is not lost! Simply send us an email with your error and the builtins you need, and we'll do our best to include them in the next release.

To find out how to take advantage of this new feature, take a look at the PTX guide on our website.

As we've already said, the implementation is experimental, which means there are some limitations you should be aware of. Firstly, it has not been tested on a wide range of NVIDIA hardware, but we are really keen to hear about your experiences on your NVIDIA platforms. We have also not tested support on Windows, although it is known to work. Note that, in the Community Edition, you cannot have both SPIR and PTX binaries loaded at runtime. Contact us if you are interested in this feature, which would enable deploying a single binary for all the GPUs available on the market!

If you have any feedback on our PTX implementation, get in touch and tell us how you got on and how we can make it better.