PSyclone-based source-code refactoring for enhancing loop-level parallelism
Context
Various NEMO code sections have been optimised for a small memory footprint in order to improve their performance on CPUs. Such optimisations, however, can increase the incidence of loop-carried dependencies and thereby reduce the loop-level parallelism achievable at the compilation stage of the build process. Further, some parallelisable loops contain ancillary procedure calls (for example for diagnostic output or error reporting) that do not lend themselves to execution on accelerators and can thus also limit the loop-level parallelism achievable on such devices.
Proposal
The provision of a PSyclone transformation script for localised, bespoke build-time source-code refactoring of some procedures of the OCE component is proposed (both in branch_5.0 and in main). The aim of the transformation is to reduce the number of loop-carried dependencies in the refactored source code, typically at the expense of higher memory usage (for example through rank expansion of temporary variables), and to enhance the achievable loop-level parallelism through substitution of some procedure calls inside otherwise parallelisable loops with in-line code (in order to facilitate the widening of, for example, OpenACC kernels regions).
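
As an illustration of the rank-expansion idea only, the following minimal sketch contrasts a memory-lean loop nest with a rank-expanded equivalent. It is written in Python for brevity, whereas NEMO itself is Fortran; it is neither NEMO code nor output of the proposed PSyclone script. The function names, the work array zwork, and the arithmetic are hypothetical.

```python
def diff_shared_buffer(field):
    """Memory-lean form: one work buffer is reused by every column."""
    ncol, nlev = len(field), len(field[0])
    zwork = [0.0] * nlev                      # shared by all outer iterations
    out = [[0.0] * nlev for _ in range(ncol)]
    for j in range(ncol):                     # iterations conflict on zwork,
        for k in range(nlev):                 # so the j loop cannot simply be
            zwork[k] = 2.0 * field[j][k]      # run in parallel as written
        for k in range(1, nlev):
            out[j][k] = zwork[k] - zwork[k - 1]
    return out


def diff_rank_expanded(field):
    """Rank-expanded form: the work array gains a column dimension."""
    ncol, nlev = len(field), len(field[0])
    zwork = [[0.0] * nlev for _ in range(ncol)]   # ncol times more memory
    out = [[0.0] * nlev for _ in range(ncol)]
    for j in range(ncol):                     # each j writes only zwork[j],
        for k in range(nlev):                 # so iterations are independent
            zwork[j][k] = 2.0 * field[j][k]
    for j in range(ncol):                     # likewise independent in j
        for k in range(1, nlev):
            out[j][k] = zwork[j][k] - zwork[j][k - 1]
    return out


if __name__ == "__main__":
    sample = [[1.0, 2.0, 4.0], [0.0, 3.0, 3.0]]
    # Both forms compute the same result; only the dependency structure differs.
    assert diff_shared_buffer(sample) == diff_rank_expanded(sample)
```

In the first form every iteration of the outer loop overwrites the same buffer, which creates a dependency between iterations. In the second form the temporary is indexed by the outer loop variable, which removes that dependency at the cost of a larger temporary.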