The last of these (the 2nd temporal operation) simply returns the result of the 3rd operation: a 5-day weighted
average every 5 days.
One could equivalently specify \texttt{operation="average"} in the file definition and get the same result,
although the time coordinate for the diagnostic would be that of an average (with values of 2.5, 7.5, ... days) rather than
an instantaneous quantity (with values of 5, 10, ... days).

\item Use of the ``@'' function: example 2, monthly SSH standard deviation

The square of the SSH is added as a new variable in the field definition:
\begin{xmllines}
<field_definition operation="average" freq_op="1ts">
   <field id="ssh2" long_name="square of sea surface height" unit="m2" > ssh * ssh </field>
</field_definition>
\end{xmllines}
In the file definition, monthly averages of this variable and \texttt{ssh} are then calculated and used to calculate the
monthly standard deviation:
\begin{xmllines}
<file_definition>
   <file_group id="1m" output_freq="1m"  output_level="10" enabled=".true." >  <!-- 1m files -->
      <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
         <field field_ref="ssh" name="sshstd" long_name="sea_surface_height_standard_deviation"
                operation="instant" freq_op="1m" >
            sqrt( @ssh2 - @ssh * @ssh )
         </field>
      </file>
   </file_group>
</file_definition>
\end{xmllines}
In this example, the following operations occur in order:

- \texttt{ssh2} is calculated from \texttt{ssh} at every timestep

- 1-month averages of \texttt{ssh2} and \texttt{ssh} are calculated (temporal operation 1)

- The standard deviation, \texttt{sqrt(ssh2 - ssh * ssh)}, is calculated using these 1-month averages

- The instantaneous value of this expression is output every month (temporal operation 2)
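The arithmetic behind the diagnostic is just the variance identity $\mathrm{Var}(x) = E[x^2] - E[x]^2$; a minimal plain-Python sketch with toy values (not XIOS code) illustrates the calculation:

```python
import math

# Toy SSH samples standing in for one month of timesteps.
ssh = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]

# Monthly means, playing the roles of @ssh and @ssh2 above.
mean_ssh  = sum(ssh) / len(ssh)
mean_ssh2 = sum(x * x for x in ssh) / len(ssh)

# The diagnostic expression: sqrt( @ssh2 - @ssh * @ssh ).
sshstd = math.sqrt(mean_ssh2 - mean_ssh * mean_ssh)

# Equivalent direct (population) standard deviation for comparison.
direct = math.sqrt(sum((x - mean_ssh) ** 2 for x in ssh) / len(ssh))
```

Both routes give the same result, which is why only the two monthly means need to be accumulated.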

\item Use of the ``@'' function: example 3, monthly average of SST diurnal cycle

The temporal minimum and maximum of the SST are added as new variables in the field definition:
\begin{xmllines}
<field_definition operation="average" freq_op="1ts">
   <field id="sstmax" field_ref="sst" long_name="max of sea surface temperature" operation="maximum" />
   <field id="sstmin" field_ref="sst" long_name="min of sea surface temperature" operation="minimum" />
</field_definition>
\end{xmllines}
In the file definition, these variables are then evaluated over a 1-day period and used to calculate the
diurnal amplitude of the SST and its monthly average:
\begin{xmllines}
<file_definition>
   <file_group id="1m" output_freq="1m"  output_level="10" enabled=".true." >  <!-- 1m files -->
      <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
         <field field_ref="sst" name="sstdcy" long_name="amplitude of sst diurnal cycle"
                operation="average" freq_op="1d" >
            @sstmax - @sstmin
         </field>
      </file>
   </file_group>
</file_definition>
\end{xmllines}
In this example, the following operations occur in order:

- Daily minima (\texttt{sstmin}) and maxima (\texttt{sstmax}) of \texttt{sst} are calculated (temporal operation 1)

- The amplitude, \texttt{sstmax - sstmin}, is calculated using these daily extrema

- The monthly average of this expression is calculated and output every month (temporal operation 2)
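The two nested temporal operations can be sketched in plain Python with toy values (not XIOS code):

```python
# Hourly-ish SST samples for three days of a toy "month".
days = [
    [20.0, 21.5, 22.0, 20.5],
    [19.0, 21.0, 21.5, 19.5],
    [18.5, 19.5, 20.5, 19.0],
]

# Temporal operation 1: daily extrema (sstmax, sstmin), then @sstmax - @sstmin.
daily_amplitude = [max(day) - min(day) for day in days]

# Temporal operation 2: monthly average of the daily amplitude (sstdcy).
sstdcy = sum(daily_amplitude) / len(daily_amplitude)
```

The daily extrema are reduced first, so only one amplitude value per day enters the monthly mean.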
\item Changing variable precision
Diagnostic output precision can be modified with the \texttt{prec} attribute of the field tag family.
Data packing is also supported via the \texttt{add\_offset} and \texttt{scale\_factor} attributes.
\begin{xmllines}
<!-- 64-bit (8-byte) float -->
<field field_ref="sst" name="tos_r8" prec="8" />
<!-- Packing to 16-bit (2-byte) integer -->
<field field_ref="sss" name="sos_i2" prec="2" add_offset="20." scale_factor="1.e-3" />
\end{xmllines}
If the data cannot be converted to the target precision, XIOS will crash with a
"\texttt{NetCDF: Numeric conversion not representable}" error. In the case of single-precision floating point diagnostics
(\texttt{prec="4"}), this often happens when \NEMO\ has sent XIOS data containing NaNs or very large/small values,
which can result from \eg\ floating point calculation errors. Forcing double-precision output (\texttt{prec="8"})
may bypass the XIOS crash, but it is usually better to inspect and troubleshoot the diagnostic data being sent from \NEMO.
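The packing arithmetic follows the standard NetCDF convention, unpacked = packed * scale\_factor + add\_offset. A plain-Python sketch, using the attribute values from the example above, also shows the condition that triggers the "not representable" error:

```python
# Illustrative NetCDF-style packing with the attributes from the example above
# (scale_factor=1.e-3, add_offset=20.); plain Python, not the XIOS implementation.
SCALE, OFFSET = 1.0e-3, 20.0
INT16_MIN, INT16_MAX = -32768, 32767   # range of a 2-byte integer (prec="2")

def pack(value):
    packed = round((value - OFFSET) / SCALE)
    if not INT16_MIN <= packed <= INT16_MAX:
        # The condition behind "NetCDF: Numeric conversion not representable".
        raise OverflowError(f"{value} does not fit the packed range")
    return packed

def unpack(packed):
    # Standard NetCDF unpacking convention.
    return packed * SCALE + OFFSET

restored = unpack(pack(35.125))   # round-trips a typical salinity value
```

A value far outside [OFFSET - 32.768, OFFSET + 32.767] (such as a NaN-contaminated field) cannot be packed, which is the analogue of the XIOS crash described above.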
\item Adding user-defined NetCDF file attributes
User-defined NetCDF attributes can be added to the output file metadata at the global and variable levels:
\begin{xmllines}
<file_definition>
   <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
      <!-- Variable attributes -->
      <field field_ref="sst" name="tos" >
         <variable id="my_attribute1" type="string"  > blabla </variable>
         <variable id="my_attribute2" type="integer" > 3      </variable>
         <variable id="my_attribute3" type="float"   > 5.0    </variable>
      </field>
      <!-- Global attributes -->
      <variable id="my_global_attribute" type="string" > blabla_global </variable>
   </file>
</file_definition>
\end{xmllines}
\end{enumerate}

%% =================================================================================================
\subsection{CF metadata standard compliance}

Output from XIOS is compliant with
\href{http://cfconventions.org/Data/cf-conventions/cf-conventions-1.5/build/cf-conventions.html}{version 1.5} of
the CF metadata standard.
Therefore while a user may wish to add their own metadata to the output files (as demonstrated in example 3 of
\autoref{subsec:DIA_io_xios_adv}) the metadata should, for the most part, comply with the CF-1.5 standard.
Some metadata required for full compliance with the CF standard (horizontal cell areas and vertices) are not
output by default. They can be output by setting \np[=.true.]{ln_cfmeta}{ln\_cfmeta} in the \nam{run}{run}
namelist, but note that they will be added to all files with variables on the horizontal domain, which may
significantly increase the file size.

%% =================================================================================================
\subsection{Enabling NetCDF4 compression with XIOS}

XIOS supports the use of gzip compression when compiled with NetCDF4 libraries, but it is subject to the
same restrictions as the underlying HDF5 component: compression is not available when the
XIOS servers are writing in parallel to a single output file. Thus, compression can only be applied in
``multiple\_file'' mode, or with two levels of servers using multiple level 1 servers and a single
level 2 server.
Compression is activated by using the \texttt{compression\_level} attribute of the \texttt{field} or \texttt{file}
tag families:
\begin{xmllines}
<file_definition>
  <file name="output" output_freq="1ts" compression_level="2">
     <field id="field_A" grid_ref="grid_A" operation="average" compression_level="4" />
     <field id="field_B" grid_ref="grid_A" operation="average" compression_level="0" />
     <field id="field_C" grid_ref="grid_A" operation="average" />
  </file>
</file_definition>
\end{xmllines}
Its value is an integer between 0 and 9. A value of 2 is normally recommended as a suitable trade-off between
compression speed and the achieved compression ratio.
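The trade-off can be illustrated with zlib (the deflate implementation behind HDF5's gzip filter), here applied to a repetitive toy payload rather than real NEMO output:

```python
import zlib

# Repetitive toy payload standing in for a smooth model field.
data = bytes(1000) + b"ocean" * 2000

# Compressed size at three compression_level settings: 0 stores the data
# essentially uncompressed, while levels 1-9 trade CPU time for size.
sizes = {level: len(zlib.compress(data, level)) for level in (0, 2, 9)}
```

For typical model fields the gain from level 2 to level 9 is small relative to the extra CPU cost, which is why level 2 is the usual recommendation.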

It is unclear how XIOS2 decides on suitable chunking parameters before applying compression, so it may
be necessary to re-chunk data when combining files produced with the ``multiple\_file'' output mode.
The \href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO} tool is capable
of doing this.
With XIOS3, the user is provided with more control over the chunking but the relationship
between input settings and final chunk sizes is complex. See the
\href{https://sites.nemo-ocean.io/draft-guide/xios3demo.html#chunking-and-compression}{XIOS3 demonstrator}
section of the user guide for an illustration.

%% =================================================================================================
\section{Reading and writing restart files}
\label{sec:XIOS_restarts}

From \NEMO\ v4.2, XIOS may be used to read in a single-file restart dump produced by \NEMO.
This does not add new functionality (\NEMO\ has long had the capability for all
processes to read their subdomain from a single, combined restart file) but it may be advantageous
on systems which struggle with too many simultaneous accesses to one file. The
variables written to files associated with the logical units \forcode{numror} (OCE), \forcode{numrir} (SI3), \forcode{numrtr}
(TOP) and \forcode{numrsr} (SED) can be handled by XIOS.

The use of XIOS to read restart files is activated by setting \np[=.true.]{ln_xios_read}{ln\_xios\_read}
in \nam{cfg}{cfg}. This setting will be ignored when multiple restart files are present, and default \NEMO\
functionality will instead be used for reading.

The \textit{iodef.xml} XIOS configuration file does not need to be
changed to use this functionality, as all definitions are implemented within the \NEMO\ code as a separate XIOS context.
For high resolution configurations, however, there may be a need to add the following line in \textit{iodef.xml}:
\begin{xmllines}
<variable_definition>
   <variable id="recv_field_timeout" type="double">1800</variable>
</variable_definition>
\end{xmllines}
which sets the timeout period for reading data.
\noindent If XIOS is to be used to read from restart files generated with an earlier \NEMO\ version (3.6 for instance),
the dimension \forcode{z} defined in the restart file must be renamed to \forcode{nav_lev}.\\

XIOS can also be used to write \NEMO\ restarts. The namelist parameter
\np{nn_wxios}{nn\_wxios} is used to determine the type of restart \NEMO\ will write:

\begin{description}
\item [{\np[=0]{nn_wxios}{nn\_wxios}}] \hfill \\
    Default functionality: each \NEMO\ process writes its own restart file
\item [{\np[=1]{nn_wxios}{nn\_wxios}}] \hfill \\
    XIOS will write to a single restart file
\item [{\np[=2]{nn_wxios}{nn\_wxios}}] \hfill \\
    XIOS will write to multiple restart files, one per server
\end{description}

This option aims to reduce the number of restart files generated by \NEMO, and may
be useful when there is a need to change the number of processors used to run the simulation.
Note that \textbf{\NEMO\ will not be able to read the restart files generated by XIOS with
\np[=2]{nn_wxios}{nn\_wxios}}. These files will have to be combined (with \eg\
\href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO}) before
continuing the run.

The use of XIOS to read and write restart files is in preparation for running \NEMO\ on exascale
computing platforms. While this may not yield significant performance gains on current clusters, it
should reduce file system bottlenecks in future attempts to run \NEMO\ on hundreds of
thousands of cores.

\section[NetCDF4 support (legacy output file method)]{NetCDF4 support (legacy output file method)}
As of \NEMO\ v5, the legacy output method (where diagnostic and/or restart files are written
by \NEMO\ using the old IOIPSL interface, rather than by XIOS)
only supports NetCDF4 (version 4.1 and later are recommended) built with HDF5 (version 1.8.4
and later are recommended). This allows chunking and (loss-less) compression, which can
achieve a significant reduction in file size for a small runtime overhead.  For a fuller
discussion on chunking and other performance issues the reader is referred to the NetCDF4
documentation found
\href{https://docs.unidata.ucar.edu/nug/current/netcdf_perf_chunking.html}{here}.
Datasets created with chunking and compression are not backwards
compatible with the NetCDF3 "classic" format, but most analysis codes can simply be relinked
with the NetCDF4 libraries and will then read both NetCDF3 and NetCDF4 files. \NEMO\
executables linked with NetCDF4 libraries can be made to produce NetCDF3 files by setting
\np[=.false.]{ln_nc4zip}{ln\_nc4zip} in the \nam{nc4}{nc4} namelist.

\begin{listing}
  \nlst{namnc4}
  \caption{\forcode{&namnc4}}
  \label{lst:namnc4}
\end{listing}

Chunking and compression are applied only to 4D fields and there is no advantage in
chunking across more than one time dimension, since previously written chunks would have to
be read back and decompressed before being added to.  Therefore, user control over chunk
sizes is provided only for the three spatial dimensions.  The user sets an approximate
number of chunks along each spatial axis.  The actual size of the chunks will depend on the
global domain size for mono-processors and the local processor domain size
for distributed processing.  The derived values are subject to practical minimum values
(to avoid wastefully small chunk sizes) and cannot be greater than the domain size in any
dimension.  The algorithm used is:

\begin{forlines}
ichunksz(1) = MIN(idomain_size, MAX((idomain_size-1) / nn_nchunks_i + 1 ,16 ))
ichunksz(2) = MIN(jdomain_size, MAX((jdomain_size-1) / nn_nchunks_j + 1 ,16 ))
ichunksz(3) = MIN(kdomain_size, MAX((kdomain_size-1) / nn_nchunks_k + 1 , 1 ))
ichunksz(4) = 1
\end{forlines}

\noindent As an example, setting:

\begin{forlines}
nn_nchunks_i=4
nn_nchunks_j=4
nn_nchunks_k=31
\end{forlines}

\noindent for a standard ORCA2\_ICE\_PISCES configuration (with a global domain of {\small\texttt{182x149x31}})
gives chunk sizes of {\small\texttt{46x38x1}} in the mono-processor case.
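A Python transcription of the Fortran algorithm (assumed faithful; Fortran integer division maps to Python's \texttt{//}) can be checked against the mono-processor ORCA2 example:

```python
# Python transcription of the Fortran chunk-size algorithm quoted above.
def chunk_sizes(idomain_size, jdomain_size, kdomain_size,
                nn_nchunks_i, nn_nchunks_j, nn_nchunks_k):
    return (
        min(idomain_size, max((idomain_size - 1) // nn_nchunks_i + 1, 16)),
        min(jdomain_size, max((jdomain_size - 1) // nn_nchunks_j + 1, 16)),
        min(kdomain_size, max((kdomain_size - 1) // nn_nchunks_k + 1, 1)),
        1,   # never chunk across the time dimension
    )

# ORCA2_ICE_PISCES global domain (182x149x31), mono-processor case.
sizes = chunk_sizes(182, 149, 31, 4, 4, 31)
```

Note how the floor of 16 points per horizontal chunk, and the cap at the domain size, reshape the naive request of 4x4x31 chunks.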
An illustration of the potential space savings that NetCDF4 chunking and compression provides is given in
\autoref{tab:DIA_NC4}, which compares the results of two short runs of the deprecated ORCA2\_LIM reference configuration
(now the ORCA2\_ICE\_PISCES configuration) with a 4x2 MPI decomposition.
Note the variation in the compression ratio achieved, which reflects chiefly the dry to wet volume ratio of
each processing region.

\begin{table}
  \centering
  \caption{Filesize comparison between NetCDF3 and NetCDF4 with chunking and compression}
  \begin{tabular}{lrrr}
    Filename                    & NetCDF3	& NetCDF4  & Reduction \\
                                & filesize	& filesize & \%        \\
                                & (KB)		& (KB)	  &           \\
    ORCA2\_restart\_0000.nc     & 16420 	& 8860 	  & 47\%      \\
    ORCA2\_restart\_0001.nc     & 16064 	& 11456    & 29\%      \\
    ORCA2\_restart\_0002.nc     & 16064		& 9744	  & 40\%      \\
    ORCA2\_restart\_0003.nc     & 16420		& 9404	  & 43\%      \\
    ORCA2\_restart\_0004.nc     & 16200 	& 5844	  & 64\%      \\
    ORCA2\_restart\_0005.nc     & 15848 	& 8172	  & 49\%      \\
    ORCA2\_restart\_0006.nc     & 15848 	& 8012 	  & 50\%      \\
    ORCA2\_restart\_0007.nc     & 16200 	& 5148 	  & 69\%      \\
    ORCA2\_2d\_grid\_T\_0000.nc & 2200 		& 1504	  & 32\%      \\
    ORCA2\_2d\_grid\_T\_0001.nc & 2200 		& 1748	  & 21\%      \\
    ORCA2\_2d\_grid\_T\_0002.nc & 2200 		& 1592	  & 28\%      \\
    ORCA2\_2d\_grid\_T\_0003.nc & 2200 		& 1540	  & 30\%      \\
    ORCA2\_2d\_grid\_T\_0004.nc & 2200 		& 1204	  & 46\%      \\
    ORCA2\_2d\_grid\_T\_0005.nc & 2200 		& 1444	  & 35\%      \\
    ORCA2\_2d\_grid\_T\_0006.nc & 2200 		& 1428	  & 36\%      \\
    ORCA2\_2d\_grid\_T\_0007.nc & 2200		& 1148	  & 48\%      \\
    ...                         & ...		& ...      & ...       \\
    ORCA2\_2d\_grid\_W\_0000.nc & 4416		& 2240	  & 50\%      \\
    ORCA2\_2d\_grid\_W\_0001.nc & 4416		& 2924	  & 34\%      \\
    ORCA2\_2d\_grid\_W\_0002.nc & 4416		& 2512	  & 44\%      \\
    ORCA2\_2d\_grid\_W\_0003.nc & 4416		& 2368	  & 47\%      \\
    ORCA2\_2d\_grid\_W\_0004.nc & 4416		& 1432	  & 68\%      \\
    ORCA2\_2d\_grid\_W\_0005.nc & 4416		& 1972	  & 56\%      \\
    ORCA2\_2d\_grid\_W\_0006.nc & 4416		& 2028	  & 55\%      \\
    ORCA2\_2d\_grid\_W\_0007.nc & 4416		& 1368	  & 70\%      \\
  \end{tabular}
  \label{tab:DIA_NC4}
\end{table}

Note that chunking and compression can also be applied when combining output files with \eg\
\href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO}.

%% =================================================================================================
\section[Tracer/Dynamics trends (\forcode{namtrd}, \forcode{namtrc_trd})]{Tracer/Dynamics trends (\protect\nam{trd}{trd}, \protect\nam{trc_trd}{trc\_trd})}
\label{sec:DIA_trd}

\begin{listing}
  \nlst{namtrd}
  \caption{\forcode{&namtrd}}
  \label{lst:namtrd}
\end{listing}

\begin{listing}
  \nlst{namtrc_trd}
  \caption{\forcode{&namtrc_trd}}
  \label{lst:namtrc_trd}
\end{listing}

Each trend of the time evolution equations for the dynamics (\mdl{trddyn}) and both active (\mdl{trdtra})
and passive (\mdl{trdtrc}) tracers can be output following their computation, via calls to the
\rou{trd\_tra} (active and passive tracers), \rou{trd\_dyn} (dynamics) and
\rou{trd\_trc} (passive tracers) subroutines.
The output of trends diagnostics for the dynamics and active tracers is controlled by parameters in the \nam{trd}{trd}
namelist:
\begin{description}
\item [{\np[=.true.]{ln_glo_trd}{ln\_glo\_trd}}] \hfill \\
  Every \np{nn_trd}{nn\_trd} time-steps, a check of the basin-averaged properties of the momentum and tracer
  equations is performed. This also includes a check of the properties of the $T^2$, $S^2$, $\tfrac{1}{2} (u^2+v^2)$,
  and potential energy time evolution equations.
\item [{\np[=.true.]{ln_dyn_trd}{ln\_dyn\_trd}}] \hfill \\
  Each 3D trend of the evolution of the two momentum components is output.
\item [{\np[=.true.]{ln_dyn_mxl}{ln\_dyn\_mxl}}] (\textbf{currently not working}) \hfill \\
  Each 3D trend of the evolution of the two momentum components averaged over the mixed layer is output.
\item [{\np[=.true.]{ln_vor_trd}{ln\_vor\_trd}}] (\textbf{currently not working}) \hfill \\
  A vertical summation of the momentum tendencies is performed,
  then the curl is computed to obtain the barotropic vorticity tendencies, which are output.
\item [{\np[=.true.]{ln_KE_trd}{ln\_KE\_trd}}] \hfill \\
  Each 3D trend of the Kinetic Energy equation is output.
\item [{\np[=.true.]{ln_PE_trd}{ln\_PE\_trd}}] (\textbf{currently not working with nonlinear free surface}) \hfill \\
  Each 3D trend of the Potential Energy equation is output.
\item [{\np[=.true.]{ln_tra_trd}{ln\_tra\_trd}}] \hfill \\
  Each 3D trend of the evolution of temperature and salinity is output.
\item [{\np[=.true.]{ln_tra_mxl}{ln\_tra\_mxl}}] (\textbf{currently not working}) \hfill \\
  Each 2D trend of the evolution of temperature and salinity averaged over the mixed layer is output.
\end{description}
% TODO: Need a similar description for this namelist
while the output of trends diagnostics for the passive tracers is controlled by parameters in the
\nam{trc_trd}{trc\_trd} namelist. \\
\noindent As all 3D trends are output using XIOS, \key{xios} must generally be specified.
Additionally, the passive tracer trends require \key{trdtrc} (for 3D trends) and/or \key{trdmxl\_trc}
(for 2D trends averaged over the mixed layer) to be specified.
Note that currently, \textbf{the trends diagnostics are not fully functional or tested} and a warning will
be raised if they are used.
In particular, the code associated with the \np{ln_dyn_mxl}{ln\_dyn\_mxl}, \np{ln_vor_trd}{ln\_vor\_trd},
and \np{ln_tra_mxl}{ln\_tra\_mxl} namelist options is not working and an error will be raised if they are used.

%% =================================================================================================
\section[Transports across sections]{Transports across sections}
\label{sec:DIA_diag_dct}

\begin{listing}
  \nlst{nam_diadct}
  \caption{\forcode{&nam_diadct}}
  \label{lst:nam_diadct}
\end{listing}

Diagnostics to compute the transport of volume, heat and salt through sections can be activated by setting
\np[=.true.]{ln_diadct}{ln\_diadct} in the \nam{_diadct}{\_diadct} namelist.
Each section is defined by the coordinates of its 2 extremities.
The pathways between them are constructed using the \texttt{SECTIONS\_DIADCT} tool
and are written to a binary file \textit{section\_ijglobal.diadct}, which is later read in by
\NEMO\ to compute on-line transports.\\
\noindent The on-line transports module (\mdl{diadct}) outputs three ASCII files:\\
- \textit{volume\_transport} for volume transports (unit: $10^{6}\ m^{3}\,s^{-1}$) \\
- \textit{heat\_transport}   for   heat transports (unit: $10^{15}\ W$) \\
- \textit{salt\_transport}   for   salt transports (unit: $10^{9}\ kg\,s^{-1}$) \\
\noindent Namelist variables in the \nam{_diadct}{\_diadct} namelist control how frequently the flows are summed
and the time scales over which they are averaged, as well as the level of output for debugging:

\begin{description}
\item [{\np{nn_dct}{nn\_dct}}] \hfill \\
    Frequency of computation of the transports (in time steps)
\item [{\np{nn_dctwri}{nn\_dctwri}}] \hfill \\
    Averaging period of the transports (as a frequency, in time steps)
\item [{\np{nn_secdebug}{nn\_secdebug}}] \hfill \\
    Sections to debug: \\
        \indent \texttt{\ 0} - Do not debug any sections \\
        \indent \texttt{-1} - Debug all sections \\
        \indent \texttt{\ n} - Debug section number \texttt{n}
\end{description}

%% =================================================================================================
\subsubsection{Creating a binary file containing the pathway of each section}

In \path{./tools/SECTIONS_DIADCT/run}, the file \textit{list\_sections.ascii\_global} contains a list of
all the sections (based on MERSEA project metrics) that are to be computed.
Another file is available for the GYRE configuration (\textit{list\_sections.ascii\_GYRE}).\\
\noindent Each section in this file is defined by a line containing, in order:
\begin{description}
\item [\texttt{long1} \texttt{lat1}] \hfill \\
    Coordinates of the first extremity of the section, \eg\ \texttt{-68.} \texttt{-54.5}
\item [\texttt{long2} \texttt{lat2}] \hfill \\
    Coordinates of the second extremity of the section, \eg\ \texttt{-60.} \texttt{-64.7}
\item [\texttt{nclass}] \hfill \\
    The number of bounds in each class type (\texttt{nclass - 1} classes per type), \eg\ \texttt{2}
\item [\texttt{okstrpond} or \texttt{nostrpond}] \hfill \\
    A string controlling whether to compute heat and salt transports (\texttt{okstrpond}) or not (\texttt{nostrpond})
\item [\texttt{ice} or \texttt{noice}] \hfill \\
    A string controlling whether to compute surface and volume ice transports (\texttt{ice}) or not (\texttt{noice})
\item [\texttt{section\_name}] \hfill \\
    The name of the section, \eg\ \texttt{ACC\_Drake\_Passage}
\end{description}
\noindent Note that neither the results of the transport calculations nor the directions of positive and
negative flow depend on the order in which the 2 extremities are specified in this file. \\
\noindent If \texttt{nclass} $\neq$ 0, the following \texttt{nclass + 1} lines contain a class type and its bounds,
which may be repeated for several class types. \eg\ for 2 class types with 2 bounds (1 class per type): \\
  \noindent \texttt{
    long1 lat1 long2 lat2 nclass (ok/no)strpond (no)ice section\_name \\
    classtype                                                         \\
    bound\_1                                                           \\
    bound\_2                                                           \\
    classtype                                                         \\
    bound\_1                                                           \\
    bound\_2}
\noindent where \texttt{classtype} can be:\\

 - \texttt{zsal}  for          salinity classes

 - \texttt{ztem}  for       temperature classes

 - \texttt{zlay}  for             depth classes

 - \texttt{zsigi} for    insitu density classes

 - \texttt{zsigp} for potential density classes \\

 The script \textit{job.ksh} computes the pathway for each section and creates a binary file
 \textit{section\_ijglobal.diadct} which is read by \NEMO.
 The top part of this script should be modified for the user's configuration, including setting the name and path
 of the coordinates file to use. \\
 Examples of two sections, \texttt{ACC\_Drake\_Passage} with no classes,
 and \texttt{ATL\_Cuba\_Florida} with 4 temperature classes (5 class bounds), are shown: \\
   \noindent \texttt{
     -68.    -54.5   -60.    -64.7  00 okstrpond noice ACC\_Drake\_Passage \\
     -80.5    22.5   -80.5    25.5  05 nostrpond noice ATL\_Cuba\_Florida  \\
     ztem                                                                  \\
     -2.0                                                                  \\
     12.0                                                                  \\
     40.0}
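When scripting around these files, a record can be split into the fields listed above. The helper below is purely illustrative (the function and field names are chosen here and are not part of the \texttt{SECTIONS\_DIADCT} tool):

```python
# Hypothetical parser for one list_sections.ascii record; field names follow
# the description above, not any actual SECTIONS_DIADCT code.
def parse_section_line(line):
    tokens = line.split()
    return {
        "long1": float(tokens[0]), "lat1": float(tokens[1]),
        "long2": float(tokens[2]), "lat2": float(tokens[3]),
        "nclass": int(tokens[4]),
        "strpond": tokens[5],   # "okstrpond" or "nostrpond"
        "ice": tokens[6],       # "ice" or "noice"
        "name": tokens[7],
    }

drake = parse_section_line("-68. -54.5 -60. -64.7 00 okstrpond noice ACC_Drake_Passage")
```

Any class-type and bound lines that follow a record with \texttt{nclass} $\neq$ 0 would need to be read separately.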

%% =================================================================================================
\subsubsection{Reading the output files}

The format of the output file is: \\
 \noindent \texttt{
    date, time-step number, section number,                \\
    section name, section slope coefficient, class number, \\
    class name, class bound 1, class bound 2,              \\
    transport direction 1, transport direction 2,          \\
    transport total} \\
For sections with classes, the first \texttt{nclass - 1} lines correspond to the transport for each class and
the last line corresponds to the total transport summed over all classes.
For sections with no classes, class number \texttt{1} corresponds to \texttt{total class} and
this class is called \texttt{N}, meaning \texttt{none}.
\texttt{transport direction 1} is the positive part of the transport ($\geq$ 0) and
\texttt{transport direction 2} is the negative part of the transport ($\leq$ 0). \\
\noindent \texttt{section slope coefficient} gives information about the significance of transports signs and
direction (see \autoref{tab:DIA_dct_sect_slope}).
\begin{table}
  \centering
  \caption[Transport section slope coefficients]{Transport section slope coefficients}
  \begin{tabular}{|l|l|l|l|l|}
    \hline
    Section slope coefficient & Section type & Direction 1 & Direction 2 & Total transport     \\
    \hline
    0.                        & Horizontal   & Northward   & Southward   & positive: northward \\
    1000.                     & Vertical     & Eastward    & Westward    & positive: eastward  \\
    $\neq$ 0, $\neq$ 1000.    & Diagonal     & Eastward    & Westward    & positive: eastward  \\
    \hline
  \end{tabular}
  \label{tab:DIA_dct_sect_slope}
\end{table}

%% =================================================================================================
\section{Diagnosing the steric effect on sea surface height}
\label{sec:DIA_steric}

Changes in steric sea level are caused when changes in the density of the water column imply an expansion or
contraction of the column.
It is essentially produced through surface heating/cooling and to a lesser extent through non-linear effects of
the equation of state (cabbeling, thermobaricity...).
Non-Boussinesq models contain all ocean effects within the ocean acting on the sea level.
In particular, they include the steric effect.
In contrast, Boussinesq models, such as \NEMO, conserve volume, rather than mass,
and so do not properly represent expansion or contraction.
The steric effect is therefore not explicitly represented.
This approximation does not represent a serious error with respect to the flow field calculated by the model
\citep{greatbatch_JGR94}, but extra attention is required when investigating sea level,
as steric changes are an important contribution to local changes in sea level on seasonal and climatic time scales.
This is especially true for investigation into sea level rise due to global warming.

Fortunately, the steric contribution to the sea level consists of a spatially uniform component that
can be diagnosed by considering the mass budget of the world ocean \citep{greatbatch_JGR94}.
In order to better understand how global mean sea level evolves and thus how the steric sea level can be diagnosed,
we compare, in the following, the non-Boussinesq and Boussinesq cases.

Let us denote by
$\mathcal{M}$ the total mass    of liquid seawater ($\mathcal{M} = \int_D \rho dv$),
$\mathcal{V}$ the total volume  of        seawater      ($\mathcal{V} = \int_D dv$),
$\mathcal{A}$ the total surface of       the ocean      ($\mathcal{A} = \int_S ds$),
$\bar{\rho}$ the global mean  seawater (\textit{in situ}) density
($\bar{\rho} = 1/\mathcal{V} \int_D \rho \,dv$), and
$\bar{\eta}$ the global mean sea level
($\bar{\eta} = 1/\mathcal{A} \int_S \eta \,ds$).

A non-Boussinesq fluid conserves mass. It satisfies the following relations:

\begin{equation}
  \begin{split}
    \mathcal{M} &=  \mathcal{V}  \;\bar{\rho} \\
    \mathcal{V} &=  \mathcal{A}  \;\bar{\eta}
  \end{split}
  \label{eq:DIA_MV_nBq}
\end{equation}

Temporal changes in total mass are obtained from the density conservation equation:

\begin{equation}
  \frac{1}{e_3} \partial_t ( e_3\,\rho) + \nabla( \rho \, \textbf{U} )
  = \left. \frac{\textit{emp}}{e_3}\right|_\textit{surface}
  \label{eq:DIA_Co_nBq}
\end{equation}

where $\rho$ is the \textit{in situ} density, and \textit{emp} the surface mass exchanges with the other media of
the Earth system (atmosphere, sea-ice, land).
Its global average leads to the total mass change

\begin{equation}
  \partial_t \mathcal{M} = \mathcal{A} \;\overline{\textit{emp}}
  \label{eq:DIA_Mass_nBq}
\end{equation}

where $\overline{\textit{emp}} = \frac{1}{\mathcal{A}} \int_S \textit{emp}\,ds$ is the mean mass flux through the ocean surface.
Bringing \autoref{eq:DIA_Mass_nBq} and the time derivative of \autoref{eq:DIA_MV_nBq} together leads to
the evolution equation of the mean sea level

\begin{equation}
  \partial_t \bar{\eta} = \frac{\overline{\textit{emp}}}{ \bar{\rho}}
  - \frac{\mathcal{V}}{\mathcal{A}} \;\frac{\partial_t \bar{\rho} }{\bar{\rho}}
  \label{eq:DIA_ssh_nBq}
\end{equation}
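Explicitly, combining $\mathcal{M} = \mathcal{A}\,\bar{\eta}\,\bar{\rho}$ (from \autoref{eq:DIA_MV_nBq}) with \autoref{eq:DIA_Mass_nBq}, and taking the ocean surface $\mathcal{A}$ to be constant in time, gives

\[
  \partial_t \mathcal{M}
  = \mathcal{A}\,\bar{\rho}\;\partial_t \bar{\eta} + \mathcal{V}\,\partial_t \bar{\rho}
  = \mathcal{A}\,\overline{\textit{emp}}
\]

\noindent and dividing through by $\mathcal{A}\,\bar{\rho}$ recovers \autoref{eq:DIA_ssh_nBq}.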

The first term in equation \autoref{eq:DIA_ssh_nBq} alters sea level by adding or subtracting mass from the ocean.
The second term arises from temporal changes in the global mean density; \ie\ from steric effects.

In a Boussinesq fluid, $\rho$ is replaced by $\rho_o$ in all the equations except where $\rho$ appears multiplied by
gravity (\ie\ in the hydrostatic balance of the primitive equations).
In particular, the mass conservation equation, \autoref{eq:DIA_Co_nBq}, degenerates into the incompressibility equation:

\[
  \frac{1}{e_3} \partial_t ( e_3 ) + \nabla( \textbf{U} ) = \left. \frac{\textit{emp}}{\rho_o \,e_3}\right|_ \textit{surface}
  % \label{eq:DIA_Co_Bq}
\]

and the global average of this equation now gives the temporal change of the total volume,

\[
  \partial_t \mathcal{V} = \mathcal{A} \;\frac{\overline{\textit{emp}}}{\rho_o}
  % \label{eq:DIA_V_Bq}
\]

Only the volume is conserved, not the mass; more precisely, the mass which is conserved is the Boussinesq mass,
$\mathcal{M}_o = \rho_o \mathcal{V}$.
The total volume (or equivalently the global mean sea level) is altered only by net volume fluxes across
the ocean surface, not by changes in mean mass of the ocean: the steric effect is missing in a Boussinesq fluid.

Nevertheless, following \citet{greatbatch_JGR94}, the steric effect on the volume can be diagnosed by
considering the mass budget of the ocean.
The apparent changes in $\mathcal{M}$, mass of the ocean, which are not induced by surface mass flux
must be compensated by a spatially uniform change in the mean sea level due to expansion/contraction of the ocean
\citep{greatbatch_JGR94}.
In other words, the Boussinesq mass, $\mathcal{M}_o$, can be related to $\mathcal{M}$,
the total mass of the ocean seen by the Boussinesq model, via the steric contribution to the sea level,
$\eta_s$, a spatially uniform variable, as follows:

\begin{equation}
  \mathcal{M}_o = \mathcal{M} + \rho_o \,\eta_s \,\mathcal{A}
  \label{eq:DIA_M_Bq}
\end{equation}

Any change in $\mathcal{M}$ which cannot be explained by the net mass flux through the ocean surface
is converted into a mean change in sea level.
Introducing the total density anomaly, $\mathcal{D}= \int_D d_a \,dv$,
where $d_a = (\rho -\rho_o ) / \rho_o$ is the density anomaly used in \NEMO\ (cf. \autoref{subsec:TRA_eos})
in \autoref{eq:DIA_M_Bq} leads to a very simple form for the steric height:

\begin{equation}
  \eta_s = - \frac{1}{\mathcal{A}} \mathcal{D}
  \label{eq:DIA_steric_Bq}
\end{equation}
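The step from \autoref{eq:DIA_M_Bq} to \autoref{eq:DIA_steric_Bq} is short: writing the ocean mass in terms of the density anomaly,

\[
  \mathcal{M} = \int_D \rho \,dv = \rho_o \mathcal{V} + \rho_o \int_D d_a \,dv = \mathcal{M}_o + \rho_o \,\mathcal{D},
\]

and substituting this into \autoref{eq:DIA_M_Bq} gives $\rho_o \,\eta_s \,\mathcal{A} = -\rho_o \,\mathcal{D}$, from which \autoref{eq:DIA_steric_Bq} follows.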

The above formulation of the steric height of a Boussinesq ocean requires four remarks.
First, one may be tempted to define $\rho_o$ as the initial value of $\mathcal{M}/\mathcal{V}$,
\ie\ set $\mathcal{D}_{t=0}=0$, so that the initial steric height is zero.
We do not recommend that.
Indeed, in this case $\rho_o$ depends on the initial state of the ocean.
Since $\rho_o$ has a direct effect on the dynamics of the ocean
(it appears in the pressure gradient term of the momentum equation),
it is definitely not a good idea when inter-comparing experiments.
We instead recommend setting a fixed value, $\rho_o = 1035\;kg\,m^{-3}$.
This value is a sensible choice for the reference density used in a Boussinesq ocean climate model since,
with the exception of only a small percentage of the ocean, density in the World Ocean varies by no more than
2$\%$ from this value \citep[][page 47]{gill_bk82}.

Second, we have assumed here that the total ocean surface, $\mathcal{A}$,
does not change as the sea level changes, as is the case in all global ocean GCMs
(wetting and drying of grid points is not allowed).

Third, the discretisation of \autoref{eq:DIA_steric_Bq} depends on the type of free surface which is considered.
In the non-linear free surface case, it is given by

\[
  \eta_s = - \frac{ \sum_{i,\,j,\,k} d_a\; e_{1t} e_{2t} e_{3t} }{ \sum_{i,\,j,\,k}       e_{1t} e_{2t} e_{3t} }
  % \label{eq:DIA_discrete_steric_Bq_nfs}
\]

whereas in the linear free surface, \ie\ when \key{linssh} is specified,
the volume above the \textit{z=0} surface must be explicitly taken into account to
better approximate the total ocean mass and thus the steric sea level:

\[
  \eta_s = - \frac{ \sum_{i,\,j,\,k} d_a\; e_{1t}e_{2t}e_{3t} + \sum_{i,\,j} d_a\; e_{1t}e_{2t} \eta }
                  { \sum_{i,\,j,\,k}       e_{1t}e_{2t}e_{3t} + \sum_{i,\,j}       e_{1t}e_{2t} \eta }
  % \label{eq:DIA_discrete_steric_Bq_fs}
\]
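As an illustration, both discrete forms can be sketched in a few lines of numpy; the array names (\texttt{d\_a}, \texttt{e1t}, \ldots) mirror the notation above, but the grid and values are made up:

```python
import numpy as np

# Hypothetical arrays on a small (k, j, i) grid: density anomaly d_a,
# scale factors e1t, e2t, e3t, a land/sea mask, and (for the linear
# free surface variant) the sea surface height eta.
rng = np.random.default_rng(0)
nk, nj, ni = 4, 3, 3
d_a = 1e-3 * rng.standard_normal((nk, nj, ni))  # (rho - rho_o) / rho_o
e1t = np.full((nj, ni), 1.0e5)                  # zonal scale factor [m]
e2t = np.full((nj, ni), 1.0e5)                  # meridional scale factor [m]
e3t = np.full((nk, nj, ni), 10.0)               # vertical scale factor [m]
tmask = np.ones((nk, nj, ni))                   # 1 = ocean, 0 = land
eta = 0.1 * rng.standard_normal((nj, ni))       # sea surface height [m]

# Non-linear free surface: volume-weighted mean of -d_a
cellvol = e1t * e2t * e3t * tmask
eta_s_nfs = -np.sum(d_a * cellvol) / np.sum(cellvol)

# Linear free surface: add the surface-layer volume above z = 0
surf = e1t * e2t * eta * tmask[0]
eta_s_fs = (-(np.sum(d_a * cellvol) + np.sum(d_a[0] * surf))
            / (np.sum(cellvol) + np.sum(surf)))
```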

The fourth and last remark concerns the effective sea level and the presence of sea-ice.
In the real ocean, sea ice (and the snow above it) depresses the liquid seawater through its mass loading.
This depression is a result of the mass of the sea-ice/snow system acting on the liquid ocean.
There is, however, no dynamical effect associated with these depressions in the liquid ocean sea level,
so that there are no associated ocean currents.
Hence, the dynamically relevant sea level is the effective sea level,
\ie\ the sea level as if sea ice (and snow) were converted to liquid seawater \citep{campin.marshall.ea_OM08}.
% NOTE: This is not true, we have an embedded sea ice option, but I don't know what to put here. Ask Clem?
However, in the current version of \NEMO\ the sea ice levitates above the ocean without mass exchange between
ice and ocean.
Therefore the model effective sea level is always given by $\eta + \eta_s$, whether or not there is sea ice present.

Global averages of both the steric (\texttt{sshsteric} diagnostic) and thermosteric (\texttt{sshthster} diagnostic)
sea level can be output by the AR5 diagnostics module (\mdl{diaar5}, see \autoref{sec:DIA_diag_others_cmip_ptr}).
The latter is the steric sea level due to changes in ocean density arising only from changes in temperature.
It is given by:

\[
  \eta_s = - \frac{1}{\mathcal{A}} \int_D d_a(T,S_o,p_o) \,dv
  % \label{eq:DIA_thermosteric_Bq}
\]

where $S_o$ and $p_o$ are the initial salinity and pressure, respectively.

When this diagnostic is output, salinity data for $S_o$ must be provided via a
variable named \texttt{vosaline} in a file named \textit{sali\_ref\_clim\_monthly.nc}.
This data must be provided as a monthly climatology; \ie\ the file's time coordinate must have a length of 12.

%% =================================================================================================
\section[Tidal harmonic and generic multiple-linear-regression analysis (\forcode{diamlr.F90})]{Tidal harmonic and generic multiple-linear-regression analysis (\protect\mdl{diamlr})}
\label{sec:DIA_diamlr}

Functionality for multiple-linear-regression (MLR) analysis of arbitrary output fields, using regressors that are a function of the continuous model time, is available as a diagnostic option of \NEMO.
Its implementation makes use of the ordinary-least-squares method (a method overview can be found \href{https://en.wikipedia.org/wiki/Ordinary_least_squares}{here}) and depends on XIOS for generating a set of intermediate output files. The set of regressors is configurable as part of the model-output XIOS configuration, and the regression analysis itself is completed flexibly in a post-processing step. In particular, the analysis time window remains adjustable until the post-processing step, ranging from partial model runs (depending on the selected temporal resolution of the intermediate output) to windows spanning multiple restart segments; the original regressor set can also be restricted at the post-processing step.
For the specific case of tidal harmonic analysis, the configuration of regressors that correspond to tidal constituents available for tidal forcing (see \autoref{sec:SBC_TDE}) is facilitated through a substitution mechanism (\ie\ model-provided tidal frequencies, phases, and amplitudes can be referred to symbolically in MLR analysis configurations).\par

\subsection{Configuration of the multiple-linear-regression analysis}

The MLR analysis is activated by defining an empty file-group entry
\begin{xmllines}
   <file_group id="diamlr_files" output_freq="<output frequency>" enabled=".TRUE." />
\end{xmllines}
in the XIOS configuration, where \xmlcode{<output frequency>} specifies the temporal resolution of the intermediate output: if defined, this file group will be populated during model initialisation.
Other prerequisite XIOS-configuration elements (regressors and the time variable) are pre-defined in the default XIOS configuration file \path{./cfgs/SHARED/field_def_nemo-oce.xml}, and can be modified if required.\par

Regressors are defined and enabled within the XIOS field group \xmlcode{<field_group id="diamlr_fields">} in the form of individual fields that are computed from variable \xmlcode{diamlr_time} as
\begin{xmllines}
   <field id="diamlr_r<mmm>" field_ref="diamlr_time" expr="<expression>" enabled=".TRUE." comment="<comment>" />
\end{xmllines}
where \xmlcode{"<mmm>"} refers to a 3-digit identification number, \xmlcode{"<expression>"} is a functional expression, \xmlcode{"<comment>"} an arbitrary string (which may be utilised to pass information to post-processing utilities), and \xmlcode{"diamlr_time"} refers to the continuous model time in seconds.
For the functional expression, XIOS requires the specified reference \xmlcode{diamlr_time} to be included; therefore, in order to obtain a constant expression for fitting an intercept, \xmlcode{diamlr_time^0} can be chosen.
The model time \xmlcode{"diamlr_time"} corresponds to module variable \forcode{adatrj} of module \mdl{dom\_oce} and is defined in module \mdl{daymod}; its continuity across model restarts depends on a selection made in \nam{dom}{dom}.
Similarly, a XIOS field \xmlcode{<field>} can be selected for MLR analysis through the definition of a field
\begin{xmllines}
   <field id="diamlr_f<nnn>" field_ref="<field>" enabled=".TRUE." />
\end{xmllines}
in field group \xmlcode{<field_group id="diamlr_fields">}, where \xmlcode{<nnn>} is a 3-digit identification number.\par

For the purpose of tidal harmonic analysis, two orthogonal regressors per analysed tidal-constituent signal need to be defined in order to fit both the amplitude and phase of the corresponding harmonic, typically a sine and cosine function with identical argument.
Further, regressor configurations can be equipped with placeholders to refer to the frequency, phase, and amplitude of each of the constituents available and evaluated for tidal forcing of the model. In particular:
\begin{description}
   \item [\xmlcode{__TDE_<constituent>_omega__}] \hfill \\
      refers to the angular velocity (in units of rad s$^{-1}$);
   \item [\xmlcode{__TDE_<constituent>_phase__}] \hfill \\
      refers to the phase, including the nodal correction at the beginning of the model run (in units of rad); and
   \item [\xmlcode{__TDE_<constituent>_amplitude__}] \hfill \\
      refers to the equilibrium-tide amplitude (in units of m)
\end{description}
of tidal constituent \xmlcode{<constituent>}.
During model initialisation, these placeholders are automatically substituted with the corresponding model-computed values for the respective tidal constituent.\par
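The substitution itself amounts to a simple text replacement. A minimal Python sketch (the numerical values used for the M2 constituent here are made up for illustration) is:

```python
import re

# Hypothetical model-computed values for the M2 constituent; in NEMO
# these come from the tidal-forcing module at initialisation.
tde_values = {
    "__TDE_M2_omega__": 1.405189e-4,   # angular velocity [rad/s]
    "__TDE_M2_phase__": 1.731557,      # phase incl. nodal correction [rad]
    "__TDE_M2_amplitude__": 0.242334,  # equilibrium-tide amplitude [m]
}

def substitute(expr, values):
    """Replace __TDE_..._...__ placeholders with numeric literals."""
    return re.sub(r"__TDE_\w+?__",
                  lambda m: repr(values[m.group(0)]), expr)

# A typical member of a pair of orthogonal regressors for one constituent:
sin_expr = "sin(__TDE_M2_omega__ * diamlr_time + __TDE_M2_phase__)"
print(substitute(sin_expr, tde_values))
```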

A default set of regressors relevant for tidal harmonic analysis has been pre-defined (see \path{./cfgs/SHARED/field_def_nemo-oce.xml}) and can be redefined. An example of such a redefinition can be found in the AMM12 reference configuration, in file \path{./cfgs/AMM12/EXPREF/context_nemo.xml}.\par

\subsection{The intermediate output and its post-processing}

Internally, during model initialisation, the initial XIOS configuration for MLR analysis is expanded automatically through the generation of field and output-file definitions for the relevant intermediate model output.
The resulting intermediate output consists of fields of scalar products between each regressor and the values of the fields selected for MLR analysis, as well as of scalar products between each regressor-regressor pair, all sampled at the configured interval.
For the final analysis only the scalar products over the analysis time span are required, thus the intermediate output can be freely subset or combined (added) along its time dimension to select the analysis window (which enables analyses across multiple restart segments) during post-processing.\par

The total number of intermediate output variables depends on the number of analysed fields ($n_{f}$) and the number of regressors ($n_{r}$) (for tidal analysis, $n_{r} = 2n_{c}+1$, \ie\ twice the number of tidal constituents, $n_{c}$, plus one regressor to fit the intercept) and amounts to $n_{f} n_{r} + 2 n_{r}^{2} - n_{r}$ (of which $2 n_{r}^2 - n_{r}$ variables are scalar time series). These output variables are written to output files labelled with \path{diamlr_scalar}, which contain the regressor-regressor scalar products, and with \path{diamlr_grid_<grid_type>}, which contain the regressor-diagnostic scalar-product fields for the fields defined on a grid of type \path{<grid_type>}.\par

For the computation of regression coefficients from previously generated intermediate output files, the rudimentary script \path{./tools/DIAMLR/diamlr.py} can be used. This script is provided as a simple example of the final analysis step: it processes suitable intermediate-output files by adding all available time slices and by computing regression coefficients for all available analysed fields and for all or a subset of the regressors identified from the content of the intermediate-output files. To complete a tidal harmonic analysis, the pairs of regression coefficients associated with each of the tidal constituents selected for analysis (the \xmlcode{comment} attribute could be used for identifying such pairs) could in turn be converted into maps of amplitudes and phases.\par
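The final analysis step can be illustrated with synthetic data: assuming only the stored scalar products are available, the regression coefficients follow from the normal equations, and each cosine/sine coefficient pair converts to an amplitude and a phase. This is a sketch of the method, not the \path{diamlr.py} implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
omega = 1.405189e-4                  # M2 angular velocity [rad/s]
t = np.arange(0, 2 * 86400, 3600.0)  # two days, hourly samples [s]

# Regressors: intercept, cosine and sine of the analysed harmonic
R = np.stack([t**0, np.cos(omega * t), np.sin(omega * t)])

# Synthetic analysed field: known amplitude/phase plus weak noise
amp_true, phi_true = 1.5, 0.7
f = (10.0 + amp_true * np.cos(omega * t - phi_true)
     + 0.01 * rng.standard_normal(t.size))

# The intermediate output stores only these scalar products,
# summed over the analysis window:
G = R @ R.T                          # regressor-regressor products
c = R @ f                            # regressor-field products

# Ordinary least squares: solve the normal equations G b = c
b = np.linalg.solve(G, c)

# Convert the (cos, sin) coefficient pair to amplitude and phase
amp = np.hypot(b[1], b[2])
phase = np.arctan2(b[2], b[1])
```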

%% =================================================================================================
\section{Other diagnostics}
\label{sec:DIA_diag_others}

Aside from the standard model variables, other diagnostics can be computed on-line.
The available ready-to-add diagnostics modules can be found in directory DIA.

%% =================================================================================================
\subsection[Depth of various quantities (\textit{diahth.F90})]{Depth of various quantities (\protect\mdl{diahth})}

The following diagnostics are available via the \mdl{diahth} module when \key{diahth} is specified:
\begin{itemize}
\item the mixed layer depth \citep[based on the density criterion of][]{de-boyer-montegut.madec.ea_JGR04}
\item the turbocline depth (based on a turbulent mixing coefficient criterion)
\item the depth of the 20\deg{C} isotherm
\item the depth of the thermocline (maximum of the vertical temperature gradient)
\end{itemize}

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{DIA_mask_subasins}
  \caption[Sub-basin decomposition used to compute transports and the meridional stream-function]{
    Decomposition of the World Ocean (shown here for the ORCA2 grid) into sub-basins used to
    compute the heat and salt transports as well as the meridional stream-function:
    Atlantic basin (red), Pacific basin (green),
    Indian basin (blue), Indo-Pacific basin (blue+green).
    Note that semi-enclosed seas (Red, Med and Baltic seas) as well as
    Hudson Bay are removed from the sub-basins, and that
    the Arctic Ocean has been split into Atlantic and
    Pacific basins along the North fold line.
  }
  \label{fig:DIA_mask_subasins}
\end{figure}

%% =================================================================================================
\subsection[CMIP-specific and poleward transport diagnostics (\textit{diaar5.F90}, \textit{diaptr.F90})]{CMIP-specific and poleward transport diagnostics (\protect\mdl{diaar5}, \protect\mdl{diaptr})}
\label{sec:DIA_diag_others_cmip_ptr}
Diagnostics in the \mdl{diaar5} module correspond to outputs that are required for the AR5/CMIP5 simulations
(see for example, the thermosteric sea level diagnostic in \autoref{sec:DIA_steric}).

The \mdl{diaptr} module computes the online poleward heat and salt transports,
their advective and diffusive component, the meridional stream function, and the zonal mean temperature, salinity and
cell $i$-$k$ surface area.
These are computed by default for the global ocean, but if the file \textit{subbasins.nc} is provided then these
diagnostics are also computed for the Atlantic, Indian, Pacific and Indo-Pacific Ocean basins (defined north of 30\deg{S}).
The \texttt{atlmsk}, \texttt{indmsk} and \texttt{pacmsk} variables in this file are masks corresponding to the
Atlantic, Indian and Pacific basins, while the Indo-Pacific basin mask is computed as the union of the
Indian and Pacific basin masks (\autoref{fig:DIA_mask_subasins}).
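The union amounts to a logical OR of the two masks; in numpy (with made-up 0/1 masks in place of the file variables):

```python
import numpy as np

# Illustrative 0/1 basin masks, standing in for the atlmsk, indmsk
# and pacmsk variables read from subbasins.nc
indmsk = np.array([[0, 0], [0, 1]])
pacmsk = np.array([[0, 1], [0, 0]])

# Indo-Pacific mask: union of the Indian and Pacific masks
indpacmsk = np.logical_or(indmsk, pacmsk).astype(int)
```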

The diagnostics available from both modules are listed in the \path{./cfgs/SHARED/field_def_nemo-oce.xml} XIOS
configuration file: \mdl{diaar5} diagnostics are listed under \xmlcode{<!-- variables available with diaar5 -->} comment
headers, while \mdl{diaptr} diagnostics are listed within the \xmlcode{<field_group id="diaptr" >} element.

%% =================================================================================================
\subsection{25 hour mean output for tidal models}

\begin{listing}
  \nlst{nam_dia25h}
  \caption{\forcode{&nam_dia25h}}
  \label{lst:nam_dia25h}
\end{listing}

% TODO: Needs more info- e.g. timestep must be a multiple of 3600s
The \mdl{dia25h} module computes a crudely detided M2 signal by obtaining a 25 hour mean.
The 25 hour mean is available for daily runs and is obtained by summing the 25 hourly instantaneous values from
midnight at the start of the day to midnight at the end of the day.
This diagnostic is activated by setting \np[=.true.]{ln_dia25h}{ln\_dia25h} in the \nam{_dia25h}{\_dia25h} namelist.
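The filtering effect can be checked offline: averaging 25 hourly samples of a pure M2 signal (period $\approx 12.42$ h) nearly cancels it. A small numpy sketch with made-up values:

```python
import numpy as np

# 25 hourly instantaneous samples, midnight to midnight inclusive
omega_m2 = 2 * np.pi / (12.42 * 3600.0)  # M2 angular frequency [rad/s]
t = np.arange(25) * 3600.0               # sample times [s]
ssh = 0.2 + 1.0 * np.cos(omega_m2 * t)   # mean state + M2 tide [m]

# The 25 hour mean: the 1 m tidal signal is reduced to a
# residual of order 1 cm, leaving the 0.2 m background
ssh_25h = ssh.mean()
```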

%% =================================================================================================
\subsection{Courant numbers}

Courant numbers provide a theoretical indication of the model's numerical stability.
The advective Courant numbers can be calculated according to

\[
  C_u = |u|\frac{\rdt}{e_{1u}}, \quad C_v = |v|\frac{\rdt}{e_{2v}}, \quad C_w = |w|\frac{\rdt}{e_{3w}}
  % \label{eq:DIA_CFL}
\]

in the zonal, meridional and vertical directions respectively.
The vertical component is included although it is not strictly valid as the vertical velocity is calculated from
the continuity equation rather than as a prognostic variable.
Physically this represents the rate at which information is propagated across a grid cell.
Values greater than 1 indicate that information is propagated across more than one grid cell in a single time step.
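The computation is pointwise; a minimal numpy sketch for the zonal component, with made-up velocity and scale-factor values (the meridional and vertical components follow analogously):

```python
import numpy as np

# Made-up zonal velocity and scale factor on a small (j, i) grid
rdt = 1350.0                              # time step [s]
u = np.array([[0.5, 1.2], [0.8, 0.3]])    # zonal velocity [m/s]
e1u = np.full_like(u, 25.0e3)             # zonal scale factor [m]

# Zonal advective Courant number, C_u = |u| * rdt / e1u;
# a maximum above 1 would flag potential numerical instability
Cu = np.abs(u) * rdt / e1u
```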

Courant number diagnostics can be activated by setting \np[=.true.]{ln_diacfl}{ln\_diacfl} in the \nam{ctl}{ctl} namelist.
The global maximum values of $C_u$, $C_v$, $C_w$, and their coordinates, for each timestep and for the whole model run,
are written to an ascii file named \textit{cfl\_diagnostics.ascii}.
The maximum values for the whole model run are also copied to the \textit{ocean.output} file.
Additionally, the depth maxima of $C_u$, $C_v$, and $C_w$ are available as 2D XIOS diagnostics (\texttt{cfl\_cu},
\texttt{cfl\_cv}, and \texttt{cfl\_cw} fields respectively).

\subinc{\input{../../global/epilogue}}

\end{document}