.. _using_psyclone:
******************
Using PSyclone
******************
.. contents::
   :local:
   :depth: 1

Overview
========
This section contains step-by-step instructions that demonstrate an application of the
PSyclone-based source-code processing option available in the NEMO build system.
It assumes that PSyclone has been correctly installed as detailed in the NEMO
:doc:`installation guide <install>` and that an appropriate arch file has been generated
(hereafter referred to as ``arch-auto.fcm``).
This initial application is simply a PSyclone passthrough of a NEMO configuration. That
is, the source code is processed through PSyclone but the resultant code is functionally
equivalent to the original source (although standardisation of some F90 constructs will be
carried out). Later additions to this guide will illustrate how to add transformation
scripts to this process to perform complex tasks such as identifying computational kernels
and inserting OpenACC directives for GPU offloading.
`PSyclone <https://github.com/stfc/PSyclone>`_ processing of the NEMO source code is
available as an option in the NEMO build system, and PSyclone transformations can be
enabled via option::

    -p <PSyclone processing option>

of the ``./makenemo`` command (``-p all`` lists the available PSyclone transformations).
The different options correspond to transformation scripts in directory ``sct/`` with the
exception of ``passthrough`` (which is supported internally). This initial part of the
PSyclone-processing guide demonstrates the use of the ``passthrough`` option from scratch,
including the compilation of a model configuration with and without PSyclone passthrough,
running of the model, and verification of the passthrough.
Compilation and running of the BENCH test case
==============================================
Step 1 - Compile the BENCH test case with and without PSyclone passthrough
--------------------------------------------------------------------------
At the top level of the NEMO repository,
::

    $ cd nemo

a reference configuration of the BENCH test case can be compiled::

    $ ./makenemo -m auto -a BENCH -n BENCH_0 -j 8 -v 1

Next, a corresponding configuration with PSyclone passthrough can be built::

    $ ./makenemo -m auto -a BENCH -n BENCH_PT -j 8 -v 1 -p passthrough

For demonstration purposes, a simple, non-invasive PSyclone transformation script can
alternatively be enabled::

    $ ./makenemo -m auto -a BENCH -n BENCH_INFO -j 8 -v 1 -p list_symbols

This variant produces the same source code as the passthrough, but during the build
process it outputs the names of all variables defined in the majority of the Fortran
modules and in the associated module procedures.
Step 2 - Prepare a suitable configuration
-----------------------------------------
For testing, a BENCH configuration of a small domain (ORCA2 equivalent) and a low number of
time steps (80) can be generated as::

    $ sed -e 's/nn_itend.*/nn_itend=80/' -e 's/nn_isize.*/nn_isize=180/' -e 's/nn_jsize.*/nn_jsize=148/' -e 's/nn_ksize.*/nn_ksize=31/' -e 's/ln_timing.*/ln_timing=.true./' -e '/\&namctl/asn_cfctl%l_runstat=.true.' tests/BENCH/EXPREF/namelist_cfg_orca1_like > ./namelist_cfg

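To see what these substitutions do without touching the repository, the same ``sed`` expressions can be applied to a small stand-in file. The following is only a sketch: the namelist fragment and its original values are hypothetical and much shorter than the real ``namelist_cfg_orca1_like``, and GNU ``sed`` is assumed (the one-line form of the ``a`` command is a GNU extension).

```shell
#!/bin/sh
# Sketch: apply the Step 2 substitutions to a tiny, HYPOTHETICAL namelist
# fragment (not the real namelist_cfg_orca1_like) to illustrate their effect.
# Assumes GNU sed for the one-line form of the 'a' (append) command.
workdir=$(mktemp -d)

cat > "$workdir/namelist_sample" <<'EOF'
&namrun
   nn_itend    =    1000   ! hypothetical original value
/
&namusr_def
   nn_isize    =   362     ! hypothetical original value
   nn_jsize    =   332     ! hypothetical original value
   nn_ksize    =    75     ! hypothetical original value
/
&namctl
   ln_timing   = .false.
/
EOF

# Each 's' expression replaces everything from the variable name to the end
# of the line; the last expression appends a new line after '&namctl'.
sed -e 's/nn_itend.*/nn_itend=80/' \
    -e 's/nn_isize.*/nn_isize=180/' \
    -e 's/nn_jsize.*/nn_jsize=148/' \
    -e 's/nn_ksize.*/nn_ksize=31/' \
    -e 's/ln_timing.*/ln_timing=.true./' \
    -e '/\&namctl/asn_cfctl%l_runstat=.true.' \
    "$workdir/namelist_sample" > "$workdir/namelist_cfg"

cat "$workdir/namelist_cfg"
```

Because each pattern ends in ``.*``, any trailing comment on a matched line is replaced as well, and leading whitespace is kept.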
Step 3 - Prepare a submission script (if required)
--------------------------------------------------
In principle, inside the experiment directories ``tests/BENCH_{0,PT,INFO}/EXP00/`` it
would suffice to run the model as ``mpirun -n 4 ./nemo`` or similar. However, since
NEMO runs are typically submitted on an HPC system via a job scheduler, the next step
assumes a script ``./submit.sh`` that contains the necessary system-specific settings and
commands (in effect starting the executable ``./nemo`` in an MPI environment)::

    $ vi submit.sh; chmod u+x ./submit.sh

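As a starting point, a scheduler-free ``submit.sh`` could be generated as sketched below. This is an assumption-laden example rather than a recommended script: it contains only the ``mpirun -n 4 ./nemo`` launch mentioned above, the ``NPROC`` variable is introduced here for illustration, and any scheduler directives or environment set-up required on a given system would still have to be added by hand.

```shell
#!/bin/sh
# Sketch: write a minimal submit.sh into the current directory.
# NPROC is a hypothetical helper variable; 4 matches the 'mpirun -n 4' example.
NPROC=4

cat > submit.sh <<EOF
#!/bin/sh
# Site-specific scheduler directives and environment set-up would go here.
mpirun -n ${NPROC} ./nemo
EOF

# Make the script executable for the current user, as in the step above.
chmod u+x ./submit.sh
```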
Step 4 - Run NEMO ``BENCH_0`` and ``BENCH_PT``
----------------------------------------------
Next, the two model runs can be started::

    $ cp namelist_cfg submit.sh tests/BENCH_0/EXP00/
    $ cd tests/BENCH_0/EXP00/
    $ <job submission command> ./submit.sh
    $ cd -
    $ cp namelist_cfg submit.sh tests/BENCH_PT/EXP00/
    $ cd tests/BENCH_PT/EXP00/
    $ <job submission command> ./submit.sh
    $ cd -

Verification and source-code inspection
=======================================
Step 5 - Verify the PSyclone passthrough
----------------------------------------
Once the runs have finished, a comparison of the ``run.stat`` output from the model builds
with and without PSyclone passthrough,
::

    $ vimdiff tests/BENCH_{0,PT}/EXP00/run.stat

should reveal identical results.
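Where an interactive diff is inconvenient (for example in a batch script), a byte-wise comparison with ``cmp`` gives a yes/no answer. The sketch below is illustrative only: the helper name ``compare_run_stat`` is made up for this example, and the demonstration uses small stand-in files with hypothetical content rather than real ``run.stat`` output.

```shell
#!/bin/sh
# Sketch: non-interactive check that two files are byte-identical.
# compare_run_stat is a hypothetical helper, not part of the NEMO tooling.
compare_run_stat() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}

# Self-contained demonstration with stand-in files (hypothetical content).
demo=$(mktemp -d)
printf 'it: 1 value: 0.5\n' > "$demo/ref.stat"
cp "$demo/ref.stat" "$demo/same.stat"
printf 'it: 1 value: 0.6\n' > "$demo/changed.stat"

same_result=$(compare_run_stat "$demo/ref.stat" "$demo/same.stat")
diff_result=$(compare_run_stat "$demo/ref.stat" "$demo/changed.stat" || true)
echo "$same_result / $diff_result"
```

For the runs above, the helper would be called as ``compare_run_stat tests/BENCH_0/EXP00/run.stat tests/BENCH_PT/EXP00/run.stat``.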
Step 6 - Inspect the source code for the effect of the PSyclone passthrough
---------------------------------------------------------------------------
With PSyclone processing, the build system processes the NEMO source code in three stages
(or with AGRIF in four stages): CPP preprocessing, PSyclone processing, and the actual
compilation. For the example of the ``BENCH_PT`` configuration, the original source-code
files are linked to in directory ``tests/BENCH_PT/WORK``, the CPP preprocessed
versions can be found in directory ``tests/BENCH_PT/BLD_SCT_PSYCLONE/ppsrc/nemo/``, and
the PSyclone processed files supplied to the Fortran compiler are at
``tests/BENCH_PT/BLD_SCT_PSYCLONE/obj/``. For example, differences between the three
source-code variants for module ``usrdef_sbc`` can be visualised with::

    $ vimdiff tests/BENCH_PT/WORK/usrdef_sbc.F90 tests/BENCH_PT/BLD_SCT_PSYCLONE/{ppsrc/nemo,obj}/usrdef_sbc.f90

The substantial transformation of the original ``WHERE`` construct starting at line 178 of
the original file in this example demonstrates the normalisation aspect of the PSyclone
processing stage. The various build stages are illustrated by this example, which takes the
code block from its original form:

.. code-block:: fortran

   DO jl = 1, jpl
      WHERE ( phs(A2D(0),jl) <= 0._wp .AND. phi(A2D(0),jl) < 0.1_wp ) ! linear decrease from hi=0 to 10cm
         qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ( ztri(:,:) + ( 1._wp - ztri(:,:) ) * ( 1._wp - phi(A2D(0),jl) * 10._wp ) )
      ELSEWHERE( phs(A2D(0),jl) <= 0._wp .AND. phi(A2D(0),jl) >= 0.1_wp ) ! constant (ztri) when hi>10cm
         qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ztri(:,:)
      ELSEWHERE ! zero when hs>0
         qtr_ice_top(:,:,jl) = 0._wp
      END WHERE
   ENDDO

through normal CPP macro expansion, to:

.. code-block:: fortran

   DO jl = 1, jpl
      WHERE ( phs(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <= 0._wp .AND. phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) < 0.1_wp ) ! linear decrease from hi=0 to 10cm
         qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ( ztri(:,:) + ( 1._wp - ztri(:,:) ) * ( 1._wp - phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) * 10._wp ) )
      ELSEWHERE( phs(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) <= 0._wp .AND. phi(Nis0-(0):Nie0+(0),Njs0-(0):Nje0+(0),jl) >= 0.1_wp ) ! constant (ztri) when hi>10cm
         qtr_ice_top(:,:,jl) = qsr_ice(:,:,jl) * ztri(:,:)
      ELSEWHERE ! zero when hs>0
         qtr_ice_top(:,:,jl) = 0._wp
      END WHERE
   ENDDO

and PSyclone transformation to:

.. code-block:: fortran

   do jl = 1, jpl, 1
     do widx2 = 1, Nje0 + 0 - (Njs0 - 0) + 1, 1
       do widx1 = 1, Nie0 + 0 - (Nis0 - 0) + 1, 1
         if (phs(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <= 0._wp .AND. &
             &phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) < 0.1_wp) then
           qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = &
               & qsr_ice(LBOUND(qsr_ice, dim=1) + widx1 - 1,LBOUND(qsr_ice, dim=2) + widx2 - 1,jl) * &
               & (ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1) + &
               & (1._wp - ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1)) * &
               & (1._wp - phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) * 10._wp))
         else
           if (phs(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) <= 0._wp .AND. &
               &phi(Nis0 - 0 + widx1 - 1,Njs0 - 0 + widx2 - 1,jl) >= 0.1_wp) then
             qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = &
                 & qsr_ice(LBOUND(qsr_ice, dim=1) + widx1 - 1,LBOUND(qsr_ice, dim=2) + widx2 - 1,jl) * &
                 & ztri(LBOUND(ztri, dim=1) + widx1 - 1,LBOUND(ztri, dim=2) + widx2 - 1)
           else
             qtr_ice_top(LBOUND(qtr_ice_top, dim=1) + widx1 - 1,LBOUND(qtr_ice_top, dim=2) + widx2 - 1,jl) = 0._wp
           end if
         end if
       enddo
     enddo
   enddo

where the latter has been manually reformatted for readability.