Make sure you check the return value from API calls—all CUDA runtime and CUDA library API calls return an error code which is cudaSuccess if no error has occurred. Consider how you will distribute the CUDA run
Thus, a C wrapper was required to launch the CUDA kernels. The experiments were performed with Linux Ubuntu 19.04. The code was compiled with the 2019 version of the Intel Fortran and C compilers, and the CUDA version 10.1. We do not expect that the performance of the algorithms will ...