
.. _using_psyclone:

******************
Using PSyclone 
******************

.. contents::
   :local:
   :depth: 1

Overview
========

This section contains step-by-step instructions that demonstrate an application of the
PSyclone-based source-code processing option available in the NEMO build system. 
It assumes that PSyclone has been correctly installed as detailed in the NEMO 
:doc:`installation guide <install>` and that an appropriate arch file has been generated 
(hereafter referred to as ``arch-auto.fcm``).

This initial application is simply a PSyclone passthrough of a NEMO configuration. That
is, the source code is processed through PSyclone but the resultant code is functionally
equivalent to the original source (although standardisation of some F90 constructs will be
carried out). Later additions to this guide will illustrate how to add transformation
scripts to this process to perform complex tasks such as identifying computational kernels
and inserting OpenACC directives for GPU offloading.

`PSyclone <https://github.com/stfc/PSyclone>`_ processing of the NEMO source code is
available as an option in the NEMO build system. PSyclone transformations are enabled via
the option::

    -p <PSyclone processing option> 

of the ``./makenemo`` command (``-p all`` lists the available PSyclone transformations).
The different options correspond to transformation scripts in directory ``sct/``, with the
exception of ``passthrough`` (which is supported internally). This initial part of the
PSyclone-processing guide demonstrates the use of the ``passthrough`` option from scratch:
compiling a model configuration with and without PSyclone passthrough, running the model,
and verifying the passthrough.

Compilation and running of the BENCH test case
==============================================

Step 1 - Compile the BENCH test case with and without PSyclone passthrough
--------------------------------------------------------------------------

At the top level of the NEMO repository,

::

    $ cd nemo

a reference configuration of the BENCH test case can be compiled::

    $ ./makenemo -m auto -a BENCH -n BENCH_0 -j 8 -v 1

Next, a corresponding configuration with PSyclone passthrough can be built::

    $ ./makenemo -m auto -a BENCH -n BENCH_PT -j 8 -v 1 -p passthrough

For demonstration purposes, a simple, non-invasive PSyclone transformation script can 
alternatively be enabled::

    $ ./makenemo -m auto -a BENCH -n BENCH_INFO -j 8 -v 1 -p list_symbols

This variant produces the same source code as the passthrough, but during the build
process it outputs the names of all variables defined in the majority of the Fortran
modules and in the associated module procedures.

Step 2 - Prepare a suitable configuration
-----------------------------------------

For testing, a BENCH configuration with a small domain (ORCA2 equivalent) and a low number
of time steps (80) can be generated with::

    $ sed -e 's/nn_itend.*/nn_itend=80/' \
          -e 's/nn_isize.*/nn_isize=180/' \
          -e 's/nn_jsize.*/nn_jsize=148/' \
          -e 's/nn_ksize.*/nn_ksize=31/' \
          -e 's/ln_timing.*/ln_timing=.true./' \
          -e '/\&namctl/asn_cfctl%l_runstat=.true.' \
          tests/BENCH/EXPREF/namelist_cfg_orca1_like > ./namelist_cfg
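
Before running the model, it can be useful to confirm that the generated ``namelist_cfg``
contains the intended settings. The following is a minimal sketch; the function name
``show_bench_settings`` is hypothetical and not part of the NEMO tooling:

.. code-block:: shell

    # Hypothetical helper (not part of NEMO): print the namelist lines that the
    # sed command above is expected to have set. Defaults to ./namelist_cfg.
    show_bench_settings () {
        grep -E '(nn_itend|nn_isize|nn_jsize|nn_ksize|ln_timing|sn_cfctl%l_runstat)[[:space:]]*=' \
            "${1:-namelist_cfg}"
    }

Calling ``show_bench_settings`` in the directory that holds ``namelist_cfg`` should print
one line per modified setting, which can then be checked against the values given above.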

Step 3 - Prepare a submission script (if required)
--------------------------------------------------

In principle, it would suffice to run the model inside the experiment directories
``tests/BENCH_{0,PT,INFO}/EXP00/`` with ``mpirun -n 4 ./nemo`` or similar. However, since
NEMO runs are typically submitted on an HPC system via a job scheduler, the next step
assumes that a script ``./submit.sh`` contains the necessary system-specific settings and
commands (in effect starting the executable ``./nemo`` in an MPI environment)::

    $ vi submit.sh; chmod u+x ./submit.sh

Step 4 - Run NEMO ``BENCH_0`` and ``BENCH_PT``
----------------------------------------------

Next, the two model runs can be started::

    $ cp namelist_cfg submit.sh tests/BENCH_0/EXP00/
    $ cd tests/BENCH_0/EXP00/
    $ <job submission command> ./submit.sh
    $ cd -
    $ cp namelist_cfg submit.sh tests/BENCH_PT/EXP00/
    $ cd tests/BENCH_PT/EXP00/
    $ <job submission command> ./submit.sh
    $ cd -
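
The same sequence can be wrapped in a small helper when several configurations need to be
run. This is only a sketch: the function name ``stage_and_submit`` and the variable
``SUBMIT_CMD`` (standing in for the site-specific job submission command) are hypothetical,
and the fallback to ``sh`` is an assumption that suits an interactive test rather than a
scheduler:

.. code-block:: shell

    # Hypothetical helper: copy the shared namelist and submission script into
    # one experiment directory and start the run from there.
    # SUBMIT_CMD is an assumed variable holding the site-specific job
    # submission command; without it the script is run directly with sh.
    stage_and_submit () {
        expdir="tests/$1/EXP00"
        cp namelist_cfg submit.sh "${expdir}/" || return 1
        # Use a subshell so that the caller's working directory is unchanged
        ( cd "${expdir}" && ${SUBMIT_CMD:-sh} ./submit.sh )
    }

With this definition, ``for cfg in BENCH_0 BENCH_PT; do stage_and_submit "$cfg"; done``
would start both runs from the top level of the repository.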

Verification and source-code inspection
=======================================

Step 5 - Verify the PSyclone passthrough
----------------------------------------

Once the runs have finished, comparison of model output from the model builds with and 
without PSyclone passthrough,

::

    $ vimdiff tests/BENCH_{0,PT}/EXP00/run.stat

should reveal identical results, confirming that the passthrough build is functionally
equivalent to the reference build.
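
For a non-interactive check, for instance inside a test script, the two ``run.stat`` files
can also be compared byte by byte. The sketch below assumes the directory layout used in
this guide; the function name ``compare_run_stat`` is hypothetical:

.. code-block:: shell

    # Hypothetical helper: compare the run.stat files of two configurations.
    # Returns 0 if they are byte-identical and non-zero otherwise.
    compare_run_stat () {
        ref="tests/$1/EXP00/run.stat"
        new="tests/$2/EXP00/run.stat"
        if [ ! -f "$ref" ] || [ ! -f "$new" ]; then
            echo "run.stat missing in one of the experiment directories"
            return 2
        fi
        if cmp -s "$ref" "$new"; then
            echo "run.stat files are identical"
        else
            echo "run.stat files differ"
            return 1
        fi
    }

For the two runs above, the call would be ``compare_run_stat BENCH_0 BENCH_PT``.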

Step 6 - Inspect the source code for the effect of the PSyclone passthrough
---------------------------------------------------------------------------

With PSyclone processing, the build system processes the NEMO source code in three stages
(four with AGRIF): CPP preprocessing, PSyclone processing, and the actual compilation. For
the example of the ``BENCH_PT`` configuration, the original source-code files are linked to
in directory ``tests/BENCH_PT/WORK``, the CPP-preprocessed versions can be found in
directory ``tests/BENCH_PT/BLD_SCT_PSYCLONE/ppsrc/nemo/``, and the PSyclone-processed files
supplied to the Fortran compiler are in ``tests/BENCH_PT/BLD_SCT_PSYCLONE/obj/``. For
example, differences between the three source-code variants for module ``usrdef_sbc`` can
be visualised with::

    $ vimdiff tests/BENCH_PT/WORK/usrdef_sbc.F90 tests/BENCH_PT/BLD_SCT_PSYCLONE/{ppsrc/nemo,obj}/usrdef_sbc.f90

The substantial transformation of the ``WHERE`` construct that starts at line 178 of the
original file demonstrates the normalisation aspect of the PSyclone processing stage. The
build stages take this code block from its original form:

.. code-block:: fortran

      DO jl = 1, jpl
         WHERE    ( phs(A2D(0),jl) <= 0._wp .AND. phi(A2D(0),jl) <  0.1_wp )     ! linear decrease from hi=0 to 10cm
            qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ( ztri(:,:) + ( 1._wp - ztri(:,:) ) * ( 1._wp - phi(A2D(0),jl) * 10._wp ) )
         ELSEWHERE( phs(A2D(0),jl) <= 0._wp .AND. phi(A2D(0),jl) >= 0.1_wp )     ! constant (ztri) when hi>10cm
            qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ztri(:,:)
         ELSEWHERE                                                         ! zero when hs>0
            qtr_ice_top(:,:,jl) = 0._wp
         END WHERE
      ENDDO

through normal CPP macro expansion, to:

.. code-block:: fortran

      DO jl = 1, jpl
         WHERE    ( phs(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <= 0._wp .AND. phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <  0.1_wp )     ! linear decrease from hi=0 to 10cm
            qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ( ztri(:,:) + ( 1._wp - ztri(:,:) ) * ( 1._wp - phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) * 10._wp ) )
         ELSEWHERE( phs(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <= 0._wp .AND. phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) >= 0.1_wp )     ! constant (ztri) when hi>10cm
            qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ztri(:,:)
         ELSEWHERE                                                         ! zero when hs>0
            qtr_ice_top(:,:,jl) = 0._wp
         END WHERE
      ENDDO

and PSyclone transformation to:

.. code-block:: fortran

    do jl = 1, jpl, 1
      do widx2 = 1, Nje0 + 0 - (Njs0 - 0) + 1, 1
        do widx1 = 1, Nie0 + 0 - (Nis0 - 0) + 1, 1
          if (phs(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <= 0._wp .AND. &
             &phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <  0.1_wp) then
              qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = &
             &    qsr_ice(LBOUND(qsr_ice,     dim=1) + widx1 - 1,LBOUND(qsr_ice,     dim=2) + widx2 - 1,jl) * &
             &      (ztri(LBOUND(ztri,        dim=1) + widx1 - 1,LBOUND(ztri,        dim=2) + widx2 - 1) + &
             & (1._wp - ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1)) * &
             & (1._wp - phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) * 10._wp))
          else
            if (phs(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <= 0._wp .AND. &
               &phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) >= 0.1_wp) then
                qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = &
               &    qsr_ice(LBOUND(qsr_ice,     dim=1) + widx1 - 1,LBOUND(qsr_ice,     dim=2) + widx2 - 1,jl) * &
               &       ztri(LBOUND(ztri,        dim=1) + widx1 - 1,LBOUND(ztri,        dim=2) + widx2 - 1)
            else
              qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = 0._wp
            end if
          end if
        enddo
      enddo
    enddo

where the latter has been manually reformatted for readability.
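
To find other modules in which the PSyclone stage altered the code, the preprocessed and
PSyclone-processed files can be compared in bulk. The following sketch assumes the
directory layout described above; the function name ``list_psyclone_changes`` is
hypothetical, and the count it reports is simply the number of differing lines printed by
``diff``:

.. code-block:: shell

    # Hypothetical helper: for each CPP-preprocessed file of a configuration,
    # count the lines that differ from the PSyclone-processed counterpart and
    # list the files with the largest number of differing lines first.
    list_psyclone_changes () {
        bld="tests/$1/BLD_SCT_PSYCLONE"
        for pp in "$bld"/ppsrc/nemo/*.f90; do
            [ -e "$pp" ] || continue               # no preprocessed files found
            out="$bld/obj/$(basename "$pp")"
            [ -f "$out" ] || continue              # no PSyclone output for this file
            # Lines starting with < or > in the diff output are differing lines
            n=$(diff "$pp" "$out" | grep -c '^[<>]' || true)
            printf '%6d %s\n' "$n" "$(basename "$pp")"
        done | sort -rn
    }

For the configuration built in this guide, the call would be
``list_psyclone_changes BENCH_PT``.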