We used CUDA Unified Memory to simplify the first steps in accelerating our code.
This made the process simple, but it also made the code non-portable:
PGI-only: -ta=tesla:managed flag
NVIDIA-only: CUDA Unified Memory
Explicitly managing data will make the code portable and may improve performance.
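
As a rough illustration of what explicit data management can look like, the sketch below wraps a loop in an OpenACC data region. The function name saxpy and its arguments are hypothetical and not taken from the original code; the data clauses are standard OpenACC. The assumption is that x is only read on the device (copyin) and y is both read and written (copy).

```c
/* Hypothetical example kernel (not from the original code): y = a*x + y.
 * The data region states the array shapes and transfer directions
 * explicitly, so the code does not rely on CUDA Unified Memory. */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    /* copyin: x is copied to the device on entry and not copied back.
     * copy:   y is copied to the device on entry and back to the host on exit. */
    #pragma acc data copyin(x[0:n]) copy(y[0:n])
    {
        #pragma acc parallel loop
        for (int i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

With the data movement spelled out this way, the managed option should no longer be required (for example, building with -ta=tesla instead of -ta=tesla:managed under PGI). A compiler without OpenACC support ignores the directives and runs the loop on the host.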