\documentclass[../main/NEMO_manual]{subfiles}

\begin{document}

\chapter{Output and Diagnostics (IOM, DIA, TRD)}
\label{chap:DIA}

%    {\em 4.0} & {\em Mirek Andrejczuk, Massimiliano Drudi} & {\em }  \\
%    {\em }      & {\em Dorotea Iovino, Nicolas Martin} & {\em }  \\
%    {\em 3.6} & {\em Gurvan Madec, Sebastien Masson } & {\em }  \\
%    {\em 3.4} & {\em Gurvan Madec, Rachid Benshila, Andrew Coward } & {\em }  \\
%    {\em }      & {\em Christian Ethe, Sebastien Masson } & {\em }  \\

\chaptertoc

\paragraph{Changes record} ~\\

{\footnotesize
  \begin{tabularx}{\textwidth}{l||X|X}
    Release & Author(s) & Modifications \\
    \hline
    {\em   5.0} &
        {\em D. Calvert, A. Coward and S. M{\" u}ller} &
            {\em General revisions; addition of \autoref{sec:DIA_diamlr}; expansion of \autoref{sec:DIA_dia25h}} \\
    {\em   4.0} & {\em ...} & {\em ...} \\
    {\em   3.6} & {\em ...} & {\em ...} \\
    {\em   3.4} & {\em ...} & {\em ...} \\
    {\em <=3.4} & {\em ...} & {\em ...}
  \end{tabularx}
}

\clearpage

%% =================================================================================================
\section{Model outputs}
\label{sec:DIA_io_old}

The model outputs are of three types: the output log/progress listings; 
the diagnostic output file(s); and the restart file(s).

The output log and progress listings are written to the \textit{ocean.output} file(s), which contain
information printed from within the code on the logical unit \texttt{numout}. To
locate these prints, use the UNIX command "\textit{grep -i numout}" in the source code
directory. Model errors that are caught by \NEMO\ (via the \rou{ctl\_stop} subroutine) will issue a
return code of \texttt{123} and information on the errors will be written to the \textit{ocean.output} file.
Additional progress information can be requested using the options explained in
\autoref{subsec:MISC_statusinfo}.

Diagnostic output files are written in NetCDF4 format and are generated by one of two available methods.
With the legacy method (used when \key{xios} is not specified), output files have a predefined structure
and contain time averaged diagnostics. If \key{diainstant} is specified, instantaneous diagnostics are instead output.
With the standard method (used when \key{xios} is specified), \NEMO\ can employ the full capability of the XIOS I/O server,
which provides flexibility in the choice of the fields to be written as well as how the writing tasks are distributed
over the processors in a massively parallel computing environment. A complete description
of the use of this I/O server is presented in the next section.

The restart file is used by the code when the user wants to start the model with initial
conditions defined by a previous simulation.  Restart files are NetCDF files containing
all the information that is necessary in order for there to be no changes in the model
results (even at the computer precision) between a run performed with several stops and
restarts and the same run performed in one continuous integration step.  It should be
noted that this requires that the restart file contains two consecutive time steps for all
the prognostic variables.

Two methods are available to read and write restart files.
The default method is for \NEMO\ to perform these tasks: \NEMO\ will generate a restart file for each MPP subdomain,
which will then be read by the same subdomain on restarting. Therefore, if a change in MPP decomposition is required
between runs, then the individual restart files
must first be combined into a single restart file for the full domain. This can be done using the
\href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO} tool.
The alternative method is to use XIOS to read and/or write restart files (see \autoref{sec:XIOS_restarts}).
This functionality was introduced in \NEMO\ v4.2 and includes the ability to write restart data directly to a single file
for the full domain.

%% =================================================================================================
\section[Standard model diagnostic output with XIOS (\texttt{iom\_put}, \texttt{key\_xios})]{Standard model diagnostic output with XIOS (\protect\rou{iom\_put}, \protect\key{xios})}
\label{sec:DIA_iom}

The standard \NEMO\ diagnostic output method (activated when \key{xios} is specified) uses an
external I/O library and server named \XIOS, with the \NEMO\ subroutine \rou{iom\_put} serving as the main interface
to this library.

XIOS is developed by Yann Meurdesoif and his team at IPSL, and has its own
\href{http://forge.ipsl.jussieu.fr/ioserver/wiki}{repository and support pages}.
\NEMO\ v5 can be used with either \href{https://forge.ipsl.jussieu.fr/ioserver/browser/XIOS2/trunk}{version 2}
or \href{https://forge.ipsl.jussieu.fr/ioserver/browser/XIOS3/trunk}{version 3} of XIOS.
\NEMO\ expects XIOS v2 by default and requires at least SVN revision 2131 of this version of the library.
The use of XIOS v3 requires that the \NEMO\ key \key{xios3} be specified.
Further details are available in the \href{https://sites.nemo-ocean.io/user-guide/}{NEMO user guide}.
XIOS will create output files in NetCDF4 format, which is incompatible with the older NetCDF3
libraries. Post-processing and visualization tools must therefore be linked to NetCDF4 libraries to be able to
handle the NetCDF files created by XIOS.

XIOS has been designed to be simple to use, flexible and efficient.
Its two main purposes are:

\begin{enumerate}
\item The complete and flexible control of the output files through external XML files adapted by
  the user from standard templates.
\item To achieve high performance and scalable output through the optional distribution of
  all diagnostic output related tasks to dedicated processes.
\end{enumerate}

\noindent The first functionality allows the user to specify, without code changes or recompilation,
aspects of the diagnostic output stream, such as:

\begin{itemize}
\item The choice of output frequencies that can be different for each file (including real months and years).
\item The choice of file contents; includes complete flexibility over which data are written in which files
  (the same data can be written in different files).
\item The possibility to split output files at a chosen frequency.
\item The possibility to extract a vertical or a horizontal subdomain.
\item The choice of the temporal operation to perform, \eg: average, accumulate, instantaneous, min, max and once.
\item Control over metadata via a large XML "database" of possible output fields.
\item Control over the compression and/or precision of output fields (subject to certain conditions).
\end{itemize}

\noindent In addition, the \rou{iom\_put} interface makes it easy to add the output of any new
variable (scalar, 1D, 2D or 3D) to the \NEMO\ code. The functionalities of XIOS and the \rou{iom\_put} interface
are listed in the following subsections. \\
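
\noindent As a minimal sketch of this kind of control, the following hypothetical file definition requests monthly output,
splits it into one file per year and overrides the temporal operation for one field.
The file identifier and the choice of fields are illustrative assumptions, and the exact behaviour of the
\texttt{output\_freq}, \texttt{split\_freq} and \texttt{operation} attributes should be checked against the XIOS reference guide:
\begin{xmllines}
<file_definition>
   <!-- hypothetical file: monthly output, one file per year -->
   <file id="file_monthly" name_suffix="_grid_T" output_freq="1mo" split_freq="1y" >
      <field field_ref="sst"                     />  <!-- operation taken from the field definition -->
      <field field_ref="sss" operation="instant" />  <!-- instantaneous values instead -->
   </file>
</file_definition>
\end{xmllines}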

\noindent The second functionality targets output performance when running in parallel. XIOS
provides the possibility to specify $N$ dedicated I/O servers (in addition to the \NEMO\
processes) to collect and write the outputs.  With an appropriate choice of $N$ by the user,
the bottleneck associated with the writing of the output files can be greatly reduced.

XIOS can take advantage of the parallel I/O functionality of NetCDF4\footnote{This requires that your
NetCDF4 library is linked to an HDF5 library that has been correctly compiled (\ie\ with the configure option
\texttt{--enable-parallel})} to have each XIOS server write
to a single output file. This facility is ideal for
small to moderate size configurations but can be problematic with large models due to the large memory
requirements and the inability to use NetCDF4's compression capabilities in this "one\_file" mode.

XIOS2 has the option of using two levels of I/O servers so it may be possible, in some circumstances,
to use a single I/O server at the second level to enable compression.
In many cases, though, it is
more robust to use "multiple\_file" mode (where each XIOS server writes to a separate file) and to
recombine these files as a post-processing step. The
\href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO} tool is provided for this purpose.
As the number of XIOS servers is typically much less than the number of \NEMO\ processes, significantly fewer
output files will be generated compared to the legacy method (which outputs one file per \NEMO\ process), reducing
the overhead of this post-processing step.
For smaller configurations this post-processing step can be avoided entirely, even without a parallel-enabled
NetCDF4 library, by using only one XIOS server.

XIOS3 provides a more versatile approach with the concept of "pools
and services" where different "pools" of XIOS processes can be assigned to perform
different services.  Chief amongst these services are "gatherers" and "writers"
which are similar to the two-level server capabilities of XIOS2, but, crucially, allow
the user better control over the assignment of resources. With care, it is possible
to achieve sustained "one\_file" output even with large models. These capabilities
are relatively new at the time of the 5.0 release, but the approach is documented in
the \href{https://sites.nemo-ocean.io/user-guide/xios3demo.html}{XIOS3 demonstrator}
of the on-line user guide.

%% =================================================================================================
\subsection{Main XIOS configuration file (\textit{iodef.xml})}

The behaviour of XIOS is controlled by settings in external XML configuration files, with settings for different
applications (or components of one) split into separate "contexts". These settings are specified via the
top-level \textit{iodef.xml} file (see for example \path{./cfgs/ORCA2_ICE_PISCES/EXPREF/iodef.xml}).
Basic details on XML syntax and rules can be found in \autoref{subsec:DIA_iom_xml}.

In \NEMO, the \textit{iodef.xml} file typically contains settings for the "xios" context
(controlling the overall functionality of XIOS) and for one or more "nemo" contexts (defining the fields and grids used,
as well as the output files to be generated, by \NEMO\ and any AGRIF child grids).
Further information on these contexts can be found in \autoref{subsec:DIA_iom_xml_tags}.
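
As an illustration, a minimal sketch of the overall layout of an \textit{iodef.xml} file is given below.
The \texttt{simulation} root element and the placeholder contents of the "xios" context are assumptions based on
the reference configurations, and should be checked against the \textit{iodef.xml} file of the configuration in use:
\begin{xmllines}
<?xml version="1.0"?>
<simulation>
   <!-- "xios" context: overall XIOS settings -->
   <context id="xios" >
      <variable_definition>
         <!-- XIOS variables, e.g. using_server -->
      </variable_definition>
   </context>
   <!-- "nemo" context: fields, grids and output files for NEMO -->
   <context id="nemo" src="./context_nemo.xml"/>
</simulation>
\end{xmllines}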

%% =================================================================================================
\subsubsection{The "xios" context}

"xios" context settings that might commonly be configured are shown in \autoref{tab:DIA_iom_xios_context}.

% NOTE: These are not described in the text, except for using_server
\begin{table}
  \caption["xios" context variables in \textit{iodef.xml}]{"xios" context variables typically used
  in the \textit{iodef.xml} configuration files used by \NEMO}
  \begin{tabularx}{\textwidth}{|lXl|}
    \hline
    variable name                                                           &
    description                                                             &
    example  \\
    \hline
    \hline
    \texttt{info\_level}                                                    &
    verbosity level (0 to 100)                                              &
    0        \\
    \hline
    \texttt{using\_server}                                                  &
    activate attached (false) or detached (true) mode                        &
    true     \\
    \hline
    \texttt{using\_oasis}                                                   &
    XIOS is used with OASIS (true) or not (false)                            &
    false    \\
    \hline
    \texttt{oasis\_codes\_id}                                               &
    [\textbf{XIOS 2 only}] when using oasis, define the identifier of \NEMO\ in the namcouple.
    Note that the identifier of XIOS is "xios.x"                            &
    oceanx   \\
    \hline
  \end{tabularx}
  \label{tab:DIA_iom_xios_context}
\end{table}
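
A minimal sketch of how the variables in \autoref{tab:DIA_iom_xios_context} might be set is given below,
using the example values from the table.
The \texttt{variable\_group} wrapper (and its identifier) and the \texttt{type} attributes other than that of
\texttt{using\_server} are assumptions:
\begin{xmllines}
<context id="xios" >
   <variable_definition>
      <variable_group id="parameters" >   <!-- assumed group name -->
         <variable id="info_level"     type="int"   >0</variable>
         <variable id="using_server"   type="bool"  >true</variable>
         <variable id="using_oasis"    type="bool"  >false</variable>
         <variable id="oasis_codes_id" type="string">oceanx</variable>  <!-- XIOS 2 only -->
      </variable_group>
   </variable_definition>
</context>
\end{xmllines}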

%% =================================================================================================
\subsubsection{The "nemo" context}

"nemo" context settings are usually separated into several XML files, each handling a different component of
the configuration. These files are included in \textit{iodef.xml} via a nested set of \texttt{src} directives
(see \autoref{subsec:DIA_iom_xml_nesting}), usually via an intermediate file \textit{context\_nemo.xml}.
For example, for the ORCA2\_ICE\_PISCES reference configuration, this hierarchy of files can be represented as:

\begin{verbatim}
iodef.xml <----------- <context id="nemo" src="./context_nemo.xml"/>
context_nemo.xml <-+
                   |
                   +-- <field_definition  src="./field_def_nemo-oce.xml"/>
                   +-- <field_definition  src="./field_def_nemo-ice.xml"/>
                   +-- <field_definition  src="./field_def_nemo-pisces.xml"/>
                   |
                   +-- <file_definition   src="./file_def_nemo-oce.xml"/>
                   +-- <file_definition   src="./file_def_nemo-ice.xml"/>
                   +-- <file_definition   src="./file_def_nemo-pisces.xml"/>
                   |
                   +-- <axis_definition   src="./axis_def_nemo.xml"/>
                   |
                   +-- <domain_definition src="./domain_def_nemo.xml"/>
                   |
                   +-- <grid_definition   src="./grid_def_nemo.xml"/>
\end{verbatim}

The purposes and contents of these XML files will be explained further in later sections.

%% =================================================================================================
\subsection{Practical issues}

%% =================================================================================================
\subsubsection{Installation}

As mentioned, XIOS is supported separately and must be downloaded and compiled before it can be used with \NEMO.
See the installation guide on the \href{http://forge.ipsl.jussieu.fr/ioserver/wiki}{XIOS wiki} for help and guidance.
\NEMO\ will then need to link to the compiled XIOS library; see the
\href{https://sites.nemo-ocean.io/user-guide/install.html#download-and-install-the-nemo-code}{NEMO user guide}.

%% =================================================================================================
\subsubsection{Attached or detached mode?}

For both XIOS2 and XIOS3, a key setting in the "xios" context (\textit{iodef.xml}) is:

\xmlline|<variable id="using_server" type="bool"></variable>|

which determines whether the server will be used in
\textit{attached mode}
(as a library) [\texttt{.false.}] or in \textit{detached mode}
(as an external executable on N additional, dedicated cpus) [\texttt{.true.}].
The \textit{attached mode} is simpler to use but much less efficient for
massively parallel applications.
The output produced will also depend on the type of each file requested in the
\texttt{file\_definition} sections.  The type can be either ''multiple\_file'' or
''one\_file'' (explained more fully in later sections).

In \textit{attached mode} and if the type of file is ''multiple\_file'',
then each \NEMO\ process will also act as an I/O server and produce its own set of output files.
Superficially, this emulates the standard behaviour of \NEMO\ without XIOS.
However, the subdomain written out by each process does not correspond to
the \forcode{jpi x jpj} domain actually computed by the process (although it may if \forcode{jpni=1}).
Instead each process will have collected and written out a number of complete longitudinal strips.
If the ''one\_file'' option is chosen then all processes will collect their longitudinal strips and
write (in parallel) to a single output file.

In \textit{detached mode} and if the type of file is ''multiple\_file'',
then each stand-alone XIOS process will collect data for a range of complete longitudinal strips and
write to its own set of output files.
If the ''one\_file'' option is chosen then all XIOS processes will collect their longitudinal strips and
write (in parallel) to a single output file.
Note that running in detached mode requires launching a Multiple Process Multiple Data (MPMD) parallel job.
The following subsection provides a typical example but the syntax will vary in different MPP environments.
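
For illustration, the file type could be selected with the \texttt{type} attribute, set here on the
\texttt{file\_definition} so that it is inherited by every file unless overridden.
This is a sketch only; the file identifiers and the choice of fields are hypothetical:
\begin{xmllines}
<file_definition type="multiple_file" >
   <file id="file1" output_freq="1d" name_suffix="_grid_T" >  <!-- inherits type="multiple_file" -->
      <field field_ref="sst" />
   </file>
   <file id="file2" output_freq="1d" name_suffix="_grid_U" type="one_file" >  <!-- overrides the inherited type -->
      <field field_ref="ssu" />
   </file>
</file_definition>
\end{xmllines}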

%% =================================================================================================
\subsubsection{Number of cpus used by XIOS in detached mode}

The number of cores used by the XIOS servers is specified when launching the model. This number
should be from \texttildelow 1/10 to \texttildelow 1/50 of the
number of cores dedicated to \NEMO.  Some manufacturers suggest using O($\sqrt{N}$)
dedicated I/O processors for $N$ processors, but this is a general recommendation and not
specific to \NEMO.  It is difficult to provide precise recommendations because the optimal
choice will depend on the particular hardware properties of the target system (parallel
filesystem performance, available memory, memory bandwidth etc.) and the volume and
frequency of data to be created.  Here is an example of using 2 cpus for XIOS servers and 62
cpus for \NEMO\ with \texttt{mpirun}:

\begin{cmds}
  mpirun -np 62 ./nemo.exe : -np 2 ./xios_server.exe
\end{cmds}

%% =================================================================================================
\subsubsection{Add your own outputs}

It is very easy to add your own outputs with XIOS and the \NEMO\ \rou{iom\_put} interface.
Many standard fields and diagnostics are already prepared (\ie, steps 1 to 3 below have been done) and
simply need to be activated by including the required output in a file definition (step 4).
To add new diagnostics, all 4 of the following steps must be taken.

\begin{enumerate}
\item In the \NEMO\ code, add a "\forcode{CALL iom_put( 'identifier', array )}" for the array that is
to be output as a diagnostic. In most cases, this will be in a part of the code which is executed
only once per timestep and after the array has been updated for that timestep.

Adding this call simply exposes the array to the XIOS workflow; whether or not (and at
which frequency) the corresponding diagnostic is actually output by XIOS will be determined by the contents of the
file definition (see step 4).
\item If necessary, add "\forcode{USE iom ! I/O manager library}" to the list of used
modules in the upper part of your module.
\item In the appropriate \path{./cfgs/SHARED/field_def_nemo-....xml} files, add a definition for your diagnostic to the
field definition using the same identifier you used in the  \NEMO\ Fortran code (see
subsequent sections for details of the XML syntax and rules). For example:
\begin{xmllines}
<field_definition>
   <field_group id="grid_T" grid_ref="grid_T_3D"> <!-- T grid -->
      <field id="identifier" long_name="blabla" />
   </field_group>
</field_definition>
\end{xmllines}
This definition must be added to the \texttt{field\_group} whose reference grid (\texttt{grid\_ref}) is consistent
with the size of the array passed to \rou{iom\_put}. The \texttt{grid\_ref} attribute refers to
definitions set in \textit{grid\_def\_nemo.xml} which, in turn, reference domains and axes defined either
in the code (\rou{iom\_set\_domain\_attr} and \rou{iom\_set\_axis\_attr} in \mdl{iom}) or
in the XML configuration files (\textit{domain\_def\_nemo.xml} and \textit{axis\_def\_nemo.xml}).  \eg\ :
\begin{xmllines}
<grid_definition>
  <grid id="grid_T_3D" >
    <domain domain_ref="grid_T" />
    <axis axis_ref="deptht" />
  </grid>
</grid_definition>
\end{xmllines}
Note that if the array passed to \rou{iom\_put} is computed within the Surface Boundary Condition module
(\autoref{chap:SBC}), then the corresponding field definition must be added within the SBC \texttt{field\_group},
\xmlcode{<field_group id="SBC" ...>}. This is because the array is updated every \np{nn_fsbc}{nn\_fsbc} time steps
and the frequency of operations in the SBC \texttt{field\_group} has been defined accordingly
(see \rou{iom\_set\_field\_attr} in \mdl{iom}).
\item Finally, add your field to one or more file definitions defined in \textit{file\_def\_nemo-*.xml}
(each corresponding to an output file; again, see the subsequent sections for XML syntax and rules):
\begin{xmllines}
<file_definition>
   <file_group id="5d" output_freq="5d"  output_level="10" enabled=".TRUE.">  <!-- 5d files -->
      <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
         <field field_ref="identifier" />
      </file>
   </file_group>
</file_definition>
\end{xmllines}
\end{enumerate}
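
\noindent For the SBC case noted in step 3, a sketch of the field definition might look as follows.
The field identifier and \texttt{long\_name} are hypothetical, and the \texttt{grid\_ref} shown for the group is an
assumption that should be checked against \path{./cfgs/SHARED/field_def_nemo-oce.xml}:
\begin{xmllines}
<field_definition>
   <field_group id="SBC" grid_ref="grid_T_2D" >  <!-- fields updated every nn_fsbc time steps -->
      <field id="my_sbc_diag" long_name="hypothetical surface diagnostic" />
   </field_group>
</field_definition>
\end{xmllines}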

%% =================================================================================================
\subsection{XML fundamentals}
\label{subsec:DIA_iom_xml}

This subsection discusses some basic aspects of the XML syntax used by XIOS.
Further information can be found in the XIOS reference and user guides available
\href{https://forge.ipsl.jussieu.fr/ioserver/}{here}.

%% =================================================================================================
\subsubsection{XML basic rules}

XML tags begin with the less-than character ("$<$") and end with the greater-than character ("$>$").
You use tags to mark the start and end of elements, which are the logical units of information in an XML document.
In addition to marking the beginning of an element, XML start tags also provide a place to specify attributes.
An attribute specifies a single property for an element, using a name/value pair, for example:
\xmlcode{<a b="x" c="y" d="z"> ... </a>}.
See \href{http://www.xmlnews.org/docs/xml-basics.html}{here} for more details.
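
For instance, the following fragment (with hypothetical identifiers) shows an element written with separate start and
end tags, which can therefore contain child elements, and an empty element written as a single self-closing tag:
\begin{xmllines}
<file id="myfile" output_freq="1d" >   <!-- start tag with two attributes -->
   <field field_ref="sst" />           <!-- self-closing tag: no content, no end tag -->
</file>                                <!-- end tag -->
\end{xmllines}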

%% =================================================================================================
\subsubsection{XML tags}
\label{subsec:DIA_iom_xml_tags}

The XML tags used by XIOS are organised into 7 families:
\texttt{context}, \texttt{axis}, \texttt{domain}, \texttt{grid}, \texttt{field}, \texttt{file} and \texttt{variable}.
Each tag family has a hierarchy of three scopes (except for \texttt{context}), shown in \autoref{tab:DIA_xml_scope}.

\begin{table}
  \caption[Hierarchy of tag scopes used by XIOS]{Hierarchy of scopes used by tags in the XIOS XML configuration files}
  \begin{tabular*}{\textwidth}{|p{0.15\textwidth}p{0.4\textwidth}p{0.35\textwidth}|}
    \hline
    Scope   & description                                                                 &
                                                                                            example                          \\
    \hline
    \hline
    \texttt{root}    & declaration of the root element that can contain element groups or elements &
                                                                                            \xmlcode{<file_definition ... >} \\
    \hline
    \texttt{group}   & declaration of a group element that can contain element groups or elements  &
                                                                                            \xmlcode{<file_group      ... >} \\
    \hline
    \texttt{element} & declaration of an element that can contain elements                         &
                                                                                            \xmlcode{<file            ... >} \\
    \hline
  \end{tabular*}
  \label{tab:DIA_xml_scope}
\end{table}

Each element may have several attributes.
Some attributes are mandatory, some are optional with a default value, and others are completely optional.
A special attribute, \texttt{id}, is used to identify an element (or a group of elements) and must have a unique
value within each element family.
This attribute is optional, but the corresponding element cannot be referenced if this is not defined.

XIOS "contexts" (definitions and settings for different applications or components of one) are
separated by the \texttt{context} tag.
No interference is possible between 2 different contexts.
Each context has its own calendar and an associated timestep.
The contexts used by \NEMO\ (which can be defined in any order) are shown in \autoref{tab:DIA_xml_contexts}.
The "xios" context uses only 1 tag (\autoref{tab:DIA_xml_tags_xios}), while the other contexts related to \NEMO\
use 5 tags (\autoref{tab:DIA_xml_tags_nemo}).

\begin{table}
  \caption[XIOS contexts used by \NEMO]{XIOS contexts used by \NEMO}
  \begin{tabular}{|p{0.15\textwidth}p{0.4\textwidth}p{0.35\textwidth}|}
    \hline
    context         &	description                                                                &
                                                                                                     example                              \\
    \hline
    \hline
    \texttt{xios}    &	context containing information for XIOS                                    &
                                                                                                     \xmlcode{<context id="xios" ... >}   \\
    \hline
    \texttt{nemo}    &	context containing I/O information for \NEMO\ (mother grid when using AGRIF)  &
                                                                                                     \xmlcode{<context id="nemo" ... >}   \\
    \hline
    \texttt{n\_nemo} &	context containing I/O information for \NEMO\ child grid n (when using AGRIF) &
                                                                                                     \xmlcode{<context id="n_nemo" ... >} \\
    \hline
  \end{tabular}
  \label{tab:DIA_xml_contexts}
\end{table}

\begin{table}
  \caption[XIOS tags used by the xios context]{XIOS tags used by the xios context}
  \begin{tabular}{|p{0.2\textwidth}p{0.35\textwidth}p{0.35\textwidth}|}
    \hline
    context tag                                      &
                                                       description                                      &
                                                                                                          example                              \\
    \hline
    \hline
    \texttt{variable\_definition}                             &
                                                       define variables needed by XIOS.
                                                       This can be seen as a kind of namelist for XIOS. &
                                                                                                          \xmlcode{<variable_definition ... >} \\
    \hline
  \end{tabular}
  \label{tab:DIA_xml_tags_xios}
\end{table}

\begin{table}
  \caption[XIOS tags used by the nemo contexts]{XIOS tags used by the nemo contexts
           (both mother and child grids when using AGRIF)}
  \begin{tabular}{|p{0.2\textwidth}p{0.35\textwidth}p{0.35\textwidth}|}
    \hline
    context tag        &	description                                                               &
                                                                                                       example                            \\
    \hline
    \hline
    \texttt{field\_definition}  &	define all variables that can potentially be output                       &
                                                                                                       \xmlcode{<field_definition ... >}  \\
    \hline
    \texttt{file\_definition}   &	define the netcdf files to be created and the variables they will contain &
                                                                                                       \xmlcode{<file_definition ... >}   \\
    \hline
    \texttt{axis\_definition}   &	define the vertical axes                                                  &
                                                                                                       \xmlcode{<axis_definition ... >}   \\
    \hline
    \texttt{domain\_definition} &	define the horizontal grids                                               &
                                                                                                       \xmlcode{<domain_definition ... >} \\
    \hline
    \texttt{grid\_definition}   &	define the 2D and 3D grids (association of an axis and a domain)          &
                                                                                                       \xmlcode{<grid_definition ... >}   \\
    \hline
  \end{tabular}
  \label{tab:DIA_xml_tags_nemo}
\end{table}

%% =================================================================================================
\subsubsection{Nesting XML files}
\label{subsec:DIA_iom_xml_nesting}

The main XML file (\textit{iodef.xml}) can be split into different parts to improve its readability.
These other XML files can then be included in the \textit{iodef.xml} file via the \texttt{src} attribute:
\begin{xmllines}
<context id="nemo" src="./context_nemo.xml"/>
\end{xmllines}

\noindent In the \NEMO\ reference configurations, the field, file and grid definitions are typically split
over several XML files in this manner. \eg\ the \textit{context\_nemo.xml} file for AGRIF\_DEMO contains:
\begin{xmllines}
<!-- Fields definition -->
    <field_definition src="./field_def_nemo-oce.xml"      />   <!--  NEMO ocean dynamics               -->
    <field_definition src="./field_def_nemo-ice.xml"      />   <!--  NEMO ocean sea ice                -->
    <field_definition src="./field_def_nemo-pisces.xml"   />   <!--  NEMO ocean biogeochemical         -->
    <field_definition src="./field_def_nemo-innerttrc.xml"/>   <!--  NEMO ocean inert passive tracer   -->

<!-- Files definition -->
    <file_definition src="./file_def_nemo-oce.xml"/>       <!--  NEMO ocean dynamics                   -->
    <file_definition src="./file_def_nemo-ice.xml"/>       <!--  NEMO ocean sea ice                    -->
    <file_definition src="./file_def_nemo-innerttrc.xml"/> <!--  NEMO ocean inert passive tracer       -->

<!-- Grids/domains/axes definition -->
    <axis_definition   src="./axis_def_nemo.xml"/>         <!-- Axis definition -->
    <domain_definition src="./domain_def_nemo.xml"/>       <!-- Domain definition -->
    <grid_definition   src="./grid_def_nemo.xml"/>         <!-- Grids definition -->
\end{xmllines}

%% =================================================================================================
\subsubsection{Use of inheritance}

XIOS makes extensive use of inheritance in its XML configuration files.
XML has a tree-based structure with parent-child relationships; in XIOS, all children inherit attributes from their parent,
and an attribute defined in a child replaces the inherited attribute value.
Note that the special attribute \texttt{id} is never inherited.
\\
\\
Example 1: direct inheritance.

\begin{xmllines}
<field_definition operation="average" >
  <field_group id="grid_T" grid_ref="grid_T_2D"> <!-- T grid -->
     <field id="sst"                    />   <!-- averaged      sst -->
     <field id="sss" operation="instant"/>   <!-- instantaneous sss -->
  </field_group>
</field_definition>
\end{xmllines}

\noindent The field ''sst'', which is part (or a child) of the \texttt{field\_definition}, will inherit the
value ''average'' of the attribute ''operation'' from its parent.
Note that a child can overwrite an attribute definition inherited from its parents.
In the example above, the field ''sss'' will output instantaneous values instead of average values.
\\
\\
Example 2: inheritance by reference, \ie\ inheriting (and overwriting, if needed) the attributes of the tag being referred to.

\begin{xmllines}
<field_definition>
   <field_group id="grid_T" grid_ref="grid_T_2D"> <!-- T grid -->
      <field id="sst" long_name="sea surface temperature" />
      <field id="sss" long_name="sea surface salinity"    />
   </field_group>
</field_definition>

<file_definition>
   <file id="myfile" output_freq="1d" >
      <field field_ref="sst"                            />  <!-- default -->
      <field field_ref="sss" long_name="my description" />  <!-- overwritten -->
   </file>
</file_definition>
\end{xmllines}

%% =================================================================================================
\subsubsection{Use of groups}

Groups can be used for 2 purposes.
Firstly, a group can be used to define common attributes to be shared by the elements of
the group through inheritance.
In the following example, we define a group of 2D and 3D fields on the T grid:

\begin{xmllines}
<field_definition>
   <field_group id="grid_T" grid_ref="grid_T_2D">
      <field id="toce" long_name="temperature"             unit="degC" grid_ref="grid_T_3D"/>
      <field id="sst"  long_name="sea surface temperature" unit="degC"                     />
      <field id="sss"  long_name="sea surface salinity"    unit="psu"                      />
      <field id="ssh"  long_name="sea surface height"      unit="m"                        />
   </field_group>
</field_definition>
\end{xmllines}

Most of the fields are 2D, so the 2D grid definition (''grid\_T\_2D'') is used by the group.
Field ''toce'' is 3D, so the 2D grid definition inherited from the group is overwritten by that
of the 3D grid (''grid\_T\_3D''). \\

\noindent Secondly, a group can be used to refer to multiple elements with a single reference.
Several examples of groups of fields are included at the end of the field definition XML configuration files
(\path{./cfgs/SHARED/field_def_nemo-oce.xml},
\path{./cfgs/SHARED/field_def_nemo-pisces.xml} and
\path{./cfgs/SHARED/field_def_nemo-ice.xml}).
For example, a shortlist of variables on the U grid:

\begin{xmllines}
<field_definition>
   <field_group id="groupU" >
      <field field_ref="uoce"  />
      <field field_ref="ssu" />
      <field field_ref="utau"  />
   </field_group>
</field_definition>
\end{xmllines}

can be included in a file definition via the \texttt{group\_ref} attribute:

\begin{xmllines}
<file_definition>
   <file id="myfile_U" output_freq="1d" >
      <field_group group_ref="groupU" />
      <field field_ref="uocetr_eff"   />  <!-- add another field -->
   </file>
</file_definition>
\end{xmllines}

%% =================================================================================================
\subsection{Detailed functionalities}

This subsection discusses some of the functionality offered by XIOS, several examples of which can be found within
the files \path{./cfgs/ORCA2_ICE_PISCES/EXPREF/*.xml}.
Again, refer to the XIOS reference and user guides, available \href{https://forge.ipsl.jussieu.fr/ioserver/}{here},
for more information.

%% =================================================================================================
\subsubsection{Define horizontal subdomains/zooms}

Horizontal subdomains ("zooms") are defined through the attributes \texttt{ibegin}, \texttt{jbegin},
\texttt{ni}, \texttt{nj} of the \texttt{zoom\_domain} tag.
This must appear within a \texttt{domain} tag, and must therefore be placed in the domain definition part
of the XML (\ie\ between the \texttt{domain\_definition} tags in \path{./cfgs/SHARED/domain_def_nemo.xml}).

Note that \textbf{\texttt{zoom\_domain} is deprecated in XIOS3} and will eventually be removed;
\texttt{extract\_domain} should be used instead.
XIOS3 still supports the use of \texttt{zoom\_domain}, but will generate warnings stating that this has been renamed
to \texttt{extract\_domain}.

For example, a 5 by 5 box with the bottom left corner at point (10,10) would be defined as:

\begin{xmllines}
<domain_definition>
   <domain id="myzoomT" domain_ref="grid_T">
      <zoom_domain ibegin="10" jbegin="10" ni="5" nj="5" />
   </domain>
</domain_definition>
\end{xmllines}

and would then be used for diagnostic output via the \texttt{domain\_ref} attribute of the \texttt{field} tag family, \eg\:

\begin{xmllines}
<file_definition>
   <file id="myfile_zoom" output_freq="1d" >
      <field field_ref="toce" domain_ref="myzoomT"/>
   </file>
</file_definition>
\end{xmllines}

\noindent However, only \texttt{grid\_ref} or a \texttt{domain\_ref}/\texttt{axis\_ref} pair may be specified, not both.
In the example above, field "toce" has likely been defined with \texttt{grid\_ref="grid\_T\_3D"} in the field definition
XML configuration file. This will be inherited, so we must override \texttt{grid\_ref} instead of \texttt{domain\_ref}
by defining a new grid (a copy of "grid\_T\_3D" in \path{./cfgs/SHARED/grid_def_nemo.xml}):

\begin{xmllines}
<grid_definition>
   <grid id="grid_T_3D_myzoomT">
      <domain domain_ref="myzoomT" />
      <axis axis_ref="deptht" />
   </grid>
</grid_definition>
\end{xmllines}

and then referencing this in the \texttt{field} tag:

\begin{xmllines}
<file_definition>
   <file id="myfile_zoom" output_freq="1d" >
      <field field_ref="toce" grid_ref="grid_T_3D_myzoomT"/>
   </file>
</file_definition>
\end{xmllines}
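
\noindent Since \texttt{zoom\_domain} is deprecated in XIOS3, the 5 by 5 box defined above might instead be
written with \texttt{extract\_domain}. The following is only a minimal sketch: it assumes that the
\texttt{ibegin}, \texttt{jbegin}, \texttt{ni} and \texttt{nj} attributes are unchanged by the renaming,
which should be checked against the XIOS3 reference guide:

\begin{xmllines}
<domain_definition>
   <domain id="myzoomT" domain_ref="grid_T">
      <!-- XIOS3: extract_domain replaces the deprecated zoom_domain (attributes assumed unchanged) -->
      <extract_domain ibegin="10" jbegin="10" ni="5" nj="5" />
   </domain>
</domain_definition>
\end{xmllines}

\noindent The grid and file definitions that reference ''myzoomT'' would not need to change, as they refer
only to the \texttt{id} of the domain.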

\noindent Moorings are seen as an extreme case corresponding to a 1 by 1 subdomain.
The Equatorial section, the TAO, RAMA and PIRATA moorings are already defined in the code and
can therefore be used without needing to specify their (i,j) position in the grid.
These predefined zooms can be activated by the use of a specific \texttt{domain\_ref}:
''EqT'', ''EqU'' or ''EqW'' for the equatorial sections and
the mooring position for TAO, RAMA and PIRATA followed by ''T'', \eg\:

\begin{xmllines}
<file_definition>
   <file id="myfile_zoom" output_freq="1d" >
      <field field_ref="sst" domain_ref="0n180wT"/>
   </file>
</file_definition>
\end{xmllines}

\noindent A full list of these section and mooring domains can be found in \path{./cfgs/SHARED/domain_def_nemo.xml}. \\

\noindent As noted in \autoref{sec:DIA_iom}, using ''multiple\_file'' type output will produce one file per XIOS
server with each file containing a different part of the full domain, which may split the subdomain across
several files. In this case, tools like \href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO}
should be used to combine these files.

%% =================================================================================================
\subsubsection{Define vertical zooms}

Vertical zooms are defined through the attributes \texttt{begin} and \texttt{n} of the \texttt{zoom\_axis} tag.
This must appear within an \texttt{axis} tag, and must therefore be placed in the axis definition part
of the XML (\ie\ between the \texttt{axis\_definition} tags in \path{./cfgs/SHARED/axis_def_nemo.xml}).
Note that as for \texttt{zoom\_domain}, \textbf{\texttt{zoom\_axis} is deprecated in XIOS3} and \texttt{extract\_axis}
should be used instead.

For example, a zoom corresponding to the top 300m of the ocean would be defined as:

\begin{xmllines}
<axis_definition>
   <!-- Vertical zoom for a 31-levels ORCA2 grid. For eORCA1 300m corresponds to n=35 -->
   <axis id="deptht300" axis_ref="deptht" >
     <zoom_axis begin="0" n="19" />
   </axis>
</axis_definition>
\end{xmllines}

and would then be used for diagnostic output via the \texttt{axis\_ref} attribute of the \texttt{field} tag family, \eg\:

\begin{xmllines}
<file_definition>
   <file id="myfile_zoom" output_freq="1d" >
      <field field_ref="diag_1d" axis_ref="deptht300"/>
   </file>
</file_definition>
\end{xmllines}

\noindent As noted in the previous section, only \texttt{grid\_ref} or a \texttt{domain\_ref}/\texttt{axis\_ref} pair
may be specified, not both. Therefore in the case of a 3D diagnostic, we must override \texttt{grid\_ref} instead of
\texttt{axis\_ref} by defining a new grid (a copy of grid\_T\_3D in \path{./cfgs/SHARED/grid_def_nemo.xml}):

\begin{xmllines}
<grid_definition>
   <grid id="grid_T_3D_0_300m">
      <domain domain_ref="grid_T" />
      <axis axis_ref="deptht300" />
   </grid>
</grid_definition>
\end{xmllines}

and then referencing this in the \texttt{field} tag:

\begin{xmllines}
<file_definition>
   <file id="myfile_zoom" output_freq="1d" >
      <field field_ref="toce" grid_ref="grid_T_3D_0_300m"/>
   </file>
</file_definition>
\end{xmllines}
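
\noindent With XIOS3, the same vertical zoom might be written with \texttt{extract\_axis}.
As a minimal sketch, assuming that the \texttt{begin} and \texttt{n} attributes are unchanged by the renaming
(this should be checked against the XIOS3 reference guide):

\begin{xmllines}
<axis_definition>
   <axis id="deptht300" axis_ref="deptht" >
     <!-- XIOS3: extract_axis replaces the deprecated zoom_axis (attributes assumed unchanged) -->
     <extract_axis begin="0" n="19" />
   </axis>
</axis_definition>
\end{xmllines}

\noindent The grid ''grid\_T\_3D\_0\_300m'' defined above refers to the axis by its \texttt{id}, so it would
not need to be modified.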

%% =================================================================================================
\subsubsection{Changes to the names of output files applied by \NEMO}

The output file names are defined by the attributes \texttt{name} and \texttt{name\_suffix} of the
\texttt{file} tag family. For example:

\begin{xmllines}
<file_definition>
   <file_group id="1d" output_freq="1d" name="myfile_1d" >
      <file id="myfileA" name_suffix="_AAA" > <!-- will create file "myfile_1d_AAA"  -->
      ...
      </file>
      <file id="myfileB" name_suffix="_BBB" > <!-- will create file "myfile_1d_BBB" -->
      ...
      </file>
   </file_group>
</file_definition>
\end{xmllines}

However, it is often convenient to include the name of the experiment,
the output file frequency and the start/end dates of the simulation in the file name;
these are stored either in the namelist or in the XML file.
To achieve this, \NEMO\ applies the following rule:
if the \texttt{id} of the \texttt{file} tag is ''fileN'' (where N is an integer from 1 to 999, written with 1 to 3 digits) or
one of the predefined sections or moorings (see next subsection),
parts of the \texttt{name} and \texttt{name\_suffix} attributes (which can be inherited)
will be automatically replaced if they correspond to any of the placeholders in \autoref{tab:DIA_xios_filename_subs}. \\

\begin{table}
  \caption[File name placeholder strings and their substitutions]{
           Placeholder strings for the names of diagnostic output files generated by XIOS and
           the strings they are substituted for, when the file \texttt{id} has the form ''fileN''}
  \begin{tabularx}{\textwidth}{|lX|}
    \hline
    \centering Placeholder string &
    Automatically replaced by                          \\
    \hline
    \hline
    \centering \texttt{@expname@} &
    The experiment name (from \texttt{cn\_exp} in the namelist) \\
    \hline
    \centering \texttt{@freq@}    &
    Output frequency (from XML attribute \texttt{output\_freq})     \\
    \hline
    \centering \texttt{@startdate@}        &
    Starting date of the simulation (from \texttt{nn\_date0} in the restart or the namelist).
    \newline
    \verb?yyyymmdd?          format                   \\
    \hline
    \centering \texttt{@startdatefull@}    &
    Starting date of the simulation (from \texttt{nn\_date0} in the restart or the namelist).
    \newline
    \verb?yyyymmdd_hh:mm:ss? format                    \\
    \hline
    \centering \texttt{@enddate@}          &
    Ending date of the simulation   (from \texttt{nn\_date0} and \texttt{nn\_itend} in the namelist).
    \newline
    \verb?yyyymmdd?          format                    \\
    \hline
    \centering \texttt{@enddatefull@}      &
    Ending date of the simulation   (from \texttt{nn\_date0} and \texttt{nn\_itend} in the namelist).
    \newline
    \verb?yyyymmdd_hh:mm:ss? format                    \\
    \hline
  \end{tabularx}
  \label{tab:DIA_xios_filename_subs}
\end{table}

\noindent For example,
\begin{xmllines}
<file_definition>
   <file id="file66" name="myfile_@expname@_@startdate@_freq@freq@" output_freq="1d" />
</file_definition>
\end{xmllines}

\noindent with the namelist:
\begin{forlines}
cn_exp    = "ORCA2"
nn_date0  = 19891231
ln_rstart = .false.
\end{forlines}

\noindent will give the following file name stem: \textit{myfile\_ORCA2\_19891231\_freq1d}.
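
\noindent Because \texttt{name} and \texttt{name\_suffix} can be inherited, the placeholders may also be set
once for a whole group of files. The following is a minimal sketch, in which the group \texttt{id} and the
suffixes are chosen purely for illustration:

\begin{xmllines}
<file_definition>
   <!-- "name" is inherited by both files; each file only adds its own suffix -->
   <file_group id="1d" output_freq="1d" name="@expname@_@freq@_@startdate@_@enddate@" >
      <file id="file1" name_suffix="_grid_T" >
      ...
      </file>
      <file id="file2" name_suffix="_grid_U" >
      ...
      </file>
   </file_group>
</file_definition>
\end{xmllines}

\noindent With the namelist settings above, both file names would be expected to start with
\textit{ORCA2\_1d\_19891231\_}, followed by the end date of the run and the respective suffix.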

%% =================================================================================================
\subsubsection{Other XML attributes set by \NEMO}

The values of some XML attributes (including \texttt{name\_suffix}, discussed in the previous subsection) are
automatically set by the \rou{set\_xmlatt} subroutine in \NEMO\ (\mdl{iom}). These attributes and their values are given
in \autoref{tab:DIA_xios_auto_xml}.
Any definition of these attributes in the XML files will be overwritten; by convention their values are set to
''auto'' (for strings) or ''0000'' (for integers), although this is not necessary.

\begin{table}
  \caption[XML attributes set automatically by \NEMO]{
           XIOS XML attributes that are set automatically by \NEMO, excluding \texttt{name} and \texttt{name\_suffix}}
  \begin{tabular}{|l|c|c|}
    \hline
    Tag family and \texttt{id} affected by automatic definition                                  &
    Attribute name                                                                      &
    Attribute value                                                                     \\
    of some of their attributes                                                         &
                                                                                        &
                                                                                        \\
    \hline
    \hline
    \texttt{field\_definition}                                                          &
    \texttt{freq\_op}                                                                   &
    \np{rn_rdt}{rn\_rdt}                                                                \\
    \hline
    \texttt{field}: SBC, SBC\_scalar, ABL                                               &
    \texttt{freq\_op}                                                                   &
    \np{rn_rdt}{rn\_rdt} $\times$ \np{nn_fsbc}{nn\_fsbc}                                \\
    \hline
    \texttt{field}: trendT\_even                                                        &
    \texttt{freq\_op}                                                                   &
    $2 \times$ \np{rn_rdt}{rn\_rdt}                                                     \\
    \hline
    \texttt{field}: trendT\_odd                                                         &
    \texttt{freq\_op}                                                                   &
    $2 \times$ \np{rn_rdt}{rn\_rdt}                                                     \\
                                                                                        &
    \texttt{freq\_offset}                                                               &
    $-1$                                                                                \\
    \hline
    \texttt{zoom\_domain}: EqT, EqU, EqW                                                 &
    \texttt{jbegin}, \texttt{ni}                                                        &
    set according to the grid                                                           \\
    \hline
    \texttt{zoom\_domain}: TAO, RAMA and PIRATA moorings                                 &
    \texttt{ibegin}, \texttt{jbegin}                                                    &
    set according to the grid                                                           \\
    \hline
  \end{tabular}
  \label{tab:DIA_xios_auto_xml}
\end{table}

%% =================================================================================================
\subsubsection{Advanced use of XIOS functionalities}
\label{subsec:DIA_io_xios_adv}

XIOS can do far more than just gather and write output. Importantly, it can perform computations with the
fields it receives, providing opportunities to create derived quantities without burdening the model simulation.
This section provides a few illustrations of the possibilities:

\begin{enumerate}
\item Using algebraic expressions

A new diagnostic can be derived from existing diagnostics, either in the file definition:

\begin{xmllines}
<file_definition>
   <file id="derived_vars" output_freq="1d" >
      <field field_ref="sst"  name="tosK"  unit="degK" > sst + 273.15 </field>
      <field field_ref="taum" name="taum2" unit="N2/m4" long_name="square of wind stress module" > taum * taum </field>
      <field field_ref="qt"   name="stupid_check" > qt - qsr - qns </field>
   </file>
</file_definition>
\end{xmllines}

or in the field definition:

\begin{xmllines}
<field_definition>
   <field id="sst2" field_ref="sst" long_name="square of sea surface temperature" unit="degC2" > sst * sst </field>
</field_definition>
\end{xmllines}

and then referenced in the file definition:

\begin{xmllines}
<file_definition>
   <file id="derived_vars" output_freq="1d" >
      <field field_ref="sst2" > sst2 </field>
   </file>
</file_definition>
\end{xmllines}

% NOTE: Is this still the case?
Note that in this case, simply adding "\xmlcode{<field field_ref="sst2" />}" to the file definition
would not work since "sst2" would not be evaluated.

\item Use of the ``@'' function: example 1, weighted temporal average

The ``@'' function can be used in algebraic expressions to chain temporal operations.
In this example, it is used to output a weighted temporal average of the temperature
(with the time-varying layer thickness as the weight).

The product of the two quantities is first added as a new variable in the field definition:

\begin{xmllines}
<field_definition operation="average" freq_op="1ts">
   <field id="toce_e3t" long_name="temperature * e3t" unit="degC*m" grid_ref="grid_T_3D" >toce * e3t</field>
</field_definition>
\end{xmllines}

The \texttt{operation="average"} and \texttt{freq\_op="1ts"} attributes specify the temporal operation and its
sampling frequency: \texttt{toce} and \texttt{e3t} will be used to calculate \texttt{toce\_e3t} for every timestep,
which will then be averaged over a time period set by the \texttt{output\_freq} or \texttt{freq\_op} attributes
(the latter is given priority) in the file definition. For example:

\begin{xmllines}
<file_definition>
   <file_group id="5d" output_freq="5d"  output_level="10" enabled=".true." >  <!-- 5d files -->
      <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
         <!-- 5-day averages  -->
         <field field_ref="toce" />
         <!-- 1-day averages, output once every 5 days -->
         <field field_ref="toce" freq_op="1d" name="toce_1d" />
      </file>
   </file_group>
</file_definition>
\end{xmllines}

To produce a 5-day weighted average, the 5-day average of the weighted temperature
(\texttt{@toce\_e3t}) must be divided by that of the layer thickness (\texttt{@e3t}):

\begin{xmllines}
<file_definition>
   <file_group id="5d" output_freq="5d"  output_level="10" enabled=".true." >  <!-- 5d files -->
      <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
         <field field_ref="toce" operation="instant" freq_op="5d" > @toce_e3t / @e3t </field>
      </file>
   </file_group>
</file_definition>
\end{xmllines}

Normally, the \texttt{operation="average"} and \texttt{freq\_op="1ts"} attributes inherited from the field
definition would be overwritten by the \texttt{operation="instant"} and \texttt{freq\_op="5d"} attributes
in the file definition. This would result in instantaneous output (data for one timestep) every 5 days.

The ``@'' function overrides this behaviour so that instead, the temporal operations are applied separately.
Specifically, it indicates that the temporal operation for the adjacent field should be performed before evaluating
the algebraic expression it is part of. This results in 2 chained temporal operations:

- Temporal operation 1: the operation type and sampling frequency are set by the \texttt{operation} and
\texttt{freq\_op} attributes in the field definition, while the temporal period of the operation is set
by the \texttt{freq\_op} attribute in the file definition.

- Temporal operation 2: the operation type, sampling frequency and temporal period are specified
by the \texttt{operation}, \texttt{freq\_op} and \texttt{output\_freq} attributes in the file definition.

For the above thickness-weighted temperature example, the following operations occur in order:

- \texttt{toce\_e3t} is calculated from \texttt{toce} and \texttt{e3t} for every timestep

- 5-day averages of \texttt{toce\_e3t} and \texttt{e3t} are calculated (temporal operation 1)

- The weighted average, \texttt{toce\_e3t / e3t}, is calculated using these 5-day averages

- The instantaneous value of this expression is output every 5 days (temporal operation 2)

The last of these (the 2nd temporal operation) simply returns the result of the 3rd operation: a 5-day weighted
average every 5 days.
One could equivalently specify \texttt{operation="average"} in the file definition and get the same result,
although the time coordinate for the diagnostic would be that of an average (with values of 2.5, 7.5, ... days) rather than
an instantaneous quantity (with values of 5, 10, ... days).
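
The quantity produced by this example can be written compactly. As a sketch, let $T^{n}$ and $e_{3t}^{n}$ denote the
temperature and layer thickness at timestep $n$, and $N$ the number of timesteps in the 5-day period
(this notation is introduced here for illustration only):

\[
  \overline{T}^{\,w}
  = \frac{\frac{1}{N} \sum_{n=1}^{N} T^{n} \, e_{3t}^{n}}{\frac{1}{N} \sum_{n=1}^{N} e_{3t}^{n}}
  = \frac{\sum_{n=1}^{N} T^{n} \, e_{3t}^{n}}{\sum_{n=1}^{N} e_{3t}^{n}}
\]

The numerator and denominator of the first fraction correspond to \texttt{@toce\_e3t} and \texttt{@e3t} respectively.
If the layer thickness does not vary in time, the expression reduces to the simple time average of the temperature.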

\item Use of the ``@'' function: example 2, monthly SSH standard deviation

The square of the SSH is added as a new variable in the field definition:

\begin{xmllines}
<field_definition operation="average" freq_op="1ts">
   <field id="ssh2" long_name="square of sea surface height" unit="m2" > ssh * ssh </field>
</field_definition>
\end{xmllines}

In the file definition, monthly averages of this variable and \texttt{ssh} are then calculated and used to calculate the
monthly standard deviation:

\begin{xmllines}
<file_definition>
   <file_group id="1m" output_freq="1m"  output_level="10" enabled=".true." >  <!-- 1m files -->
      <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
         <field field_ref="ssh" name="sshstd" long_name="sea_surface_height_standard_deviation"
                operation="instant" freq_op="1m" >
            sqrt( @ssh2 - @ssh * @ssh )
         </field>
      </file>
   </file_group>
</file_definition>
\end{xmllines}

In this example, the following operations occur in order:

- \texttt{ssh2} is calculated from \texttt{ssh} for every timestep

- 1-month averages of \texttt{ssh2} and \texttt{ssh} are calculated (temporal operation 1)

- The standard deviation, \texttt{sqrt(ssh2 - ssh * ssh)}, is calculated using these 1-month averages

- The instantaneous value of this expression is output every month (temporal operation 2)
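
The expression evaluated in this example corresponds to the following formula, where $\eta$ denotes the SSH and
the overbar a monthly time average (notation introduced here for illustration only):

\[
  \sigma_{\eta} = \sqrt{\, \overline{\eta^{2}} - \overline{\eta}^{\,2} \,}
\]

with $\overline{\eta^{2}}$ given by \texttt{@ssh2} and $\overline{\eta}$ by \texttt{@ssh}. Note that this is the
population form of the standard deviation, \ie\ the sum over timesteps is normalised by the number of samples $N$
rather than by $N - 1$.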

\item Use of the ``@'' function: example 3, monthly average of SST diurnal cycle

The temporal minimum and maximum of the SST are added as new variables in the field definition:

\begin{xmllines}
<field_definition operation="average" freq_op="1ts">
   <field id="sstmax" field_ref="sst" long_name="max of sea surface temperature" operation="maximum" />
   <field id="sstmin" field_ref="sst" long_name="min of sea surface temperature" operation="minimum" />
</field_definition>
\end{xmllines}

In the file definition, these variables are then evaluated over a 1-day period and used to calculate the
diurnal amplitude of the SST and its monthly average:

\begin{xmllines}
<file_definition>
   <file_group id="1m" output_freq="1m"  output_level="10" enabled=".true." >  <!-- 1m files -->
      <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
         <field field_ref="sst" name="sstdcy" long_name="amplitude of sst diurnal cycle"
                operation="average" freq_op="1d" >
            @sstmax - @sstmin
         </field>
      </file>
   </file_group>
</file_definition>
\end{xmllines}

In this example, the following operations occur in order:

- Daily minima (\texttt{sstmin}) and maxima (\texttt{sstmax}) of \texttt{sst} are calculated (temporal operation 1)

- The amplitude, \texttt{sstmax - sstmin}, is calculated using these daily extrema

- The monthly average of this expression is calculated and output every month (temporal operation 2)

\item Changing variable precision

Diagnostic output precision can be modified with the \texttt{prec} attribute of the field tag family.
Data packing is also supported via the \texttt{add\_offset} and \texttt{scale\_factor} attributes.

\begin{xmllines}
<!-- 64-bit (8-byte) float -->
<field field_ref="sst" name="tos_r8" prec="8" />
<!-- Packing to 16-bit (2-byte) integer -->
<field field_ref="sss" name="sos_i2" prec="2" add_offset="20." scale_factor="1.e-3" />
\end{xmllines}

If the data cannot be converted to the target precision, XIOS will crash with a
"\texttt{NetCDF: Numeric conversion not representable}" error. In the case of single-precision floating point diagnostics
(\texttt{prec="4"}), this often happens when \NEMO\ has sent XIOS data containing NaNs or very large/small values,
which can result from \eg\ floating point calculation errors. Forcing double-precision output (\texttt{prec="8"})
may bypass the XIOS crash, but it is usually better to inspect and troubleshoot the diagnostic data being sent from \NEMO.
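
As a worked illustration of the packing example above, and assuming that the usual CF packing convention applies
(the details of the XIOS implementation are not covered here), the unpacked and packed values are related by:

\[
  x_{\mathrm{unpacked}} = x_{\mathrm{packed}} \times \mathtt{scale\_factor} + \mathtt{add\_offset}
\]

A 16-bit signed integer can hold values from $-32768$ to $32767$, so with \texttt{add\_offset="20."} and
\texttt{scale\_factor="1.e-3"} the representable range is approximately

\[
  20 + 10^{-3} \times [-32768,\ 32767] = [-12.768,\ 52.767]
\]

with a resolution of $10^{-3}$. Values outside this range cannot be represented by the packed variable.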

\item Adding user-defined NetCDF file attributes

User-defined NetCDF attributes can be added to the output file metadata at the global and variable levels:

\begin{xmllines}
<file_definition>
   <file id="file1" name_suffix="_grid_T" description="ocean T grid variables" >
      <!-- Variable attributes -->
      <field field_ref="sst" name="tos" >
         <variable id="my_attribute1" type="string"  > blabla </variable>
         <variable id="my_attribute2" type="integer" > 3      </variable>
         <variable id="my_attribute3" type="float"   > 5.0    </variable>
      </field>
      <!-- Global attributes -->
      <variable id="my_global_attribute" type="string" > blabla_global </variable>
   </file>
</file_definition>
\end{xmllines}

\end{enumerate}

%% =================================================================================================
\subsection{CF metadata standard compliance}

Output from XIOS is compliant with
\href{http://cfconventions.org/Data/cf-conventions/cf-conventions-1.5/build/cf-conventions.html}{version 1.5} of
the CF metadata standard.
Therefore, while a user may wish to add their own metadata to the output files (as demonstrated in the final example of
\autoref{subsec:DIA_io_xios_adv}), the metadata should, for the most part, comply with the CF-1.5 standard.

Some metadata required for full compliance with the CF standard (horizontal cell areas and vertices) are not
output by default. They can be output by setting \np[=.true.]{ln_cfmeta}{ln\_cfmeta} in the \nam{run}{run}
namelist, but note that they will be added to all files with variables on the horizontal domain, which may
significantly increase the file size.

%% =================================================================================================
\subsection{Enabling NetCDF4 compression with XIOS}

XIOS supports the use of gzip compression when compiled with NetCDF4 libraries, but is subject to the
same restrictions as the underlying HDF5 component: compression is not available when the
XIOS servers are writing in parallel to a single output file. Thus, compression can only be applied in
''multiple\_file'' mode, or with two levels of servers using multiple level 1 servers and a single
level 2 server.
Compression is activated by using the \texttt{compression\_level} attribute of the \texttt{field} or \texttt{file}
tag families:

\begin{xmllines}
<file_definition>
  <file name="output" output_freq="1ts" compression_level="2">
     <field id="field_A" grid_ref="grid_A" operation="average" compression_level=" 4" />
     <field id="field_B" grid_ref="grid_A" operation="average" compression_level=" 0" />
     <field id="field_C" grid_ref="grid_A" operation="average" />
  </file>
</file_definition>
\end{xmllines}

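Its value is an integer between 0 (no compression) and 9. A value of 2 is normally recommended as a suitable trade-off between
algorithm performance and compression levels.
In the example above, ''field\_A'' and ''field\_B'' overwrite the value set on the \texttt{file} tag
(with ''field\_B'' disabling compression), while ''field\_C'' inherits the value of 2.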

It is unclear how XIOS2 decides on suitable chunking parameters before applying compression, so it may
be necessary to re-chunk data when combining files produced with the ''multiple\_file'' output mode.
The \href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO} tool is capable
of doing this.
With XIOS3, the user is provided with more control over the chunking but the relationship
between input settings and final chunk sizes is complex. See the
\href{https://sites.nemo-ocean.io/draft-guide/xios3demo.html#chunking-and-compression}{XIOS3 demonstrator}
section of the user guide for an illustration.

%% =================================================================================================
\section{Reading and writing restart files}
\label{sec:XIOS_restarts}

From \NEMO\ v4.2, XIOS may be used to read in a single-file restart dump produced by \NEMO.
This does not add new functionality (\NEMO\ has long had the capability for all
processes to read their subdomain from a single, combined restart file) but it may be advantageous
on systems which struggle with too many simultaneous accesses to one file. The
variables written to files associated with the logical units \forcode{numror} (OCE), \forcode{numrir} (SI3), \forcode{numrtr}
(TOP) and \forcode{numrsr} (SED) can be handled by XIOS.

The use of XIOS to read restart files is activated by setting \np[=.true.]{ln_xios_read}{ln\_xios\_read}
in \nam{cfg}{cfg}. This setting will be ignored when multiple restart files are present, and default \NEMO\
functionality will instead be used for reading.

The \textit{iodef.xml} XIOS configuration file does not need to be
changed to use this functionality, as all definitions are implemented within the \NEMO\ code as a separate XIOS context.
For high resolution configurations, however, there may be a need to add the following line in \textit{iodef.xml}:

\begin{xmllines}
<variable_definition>
   <variable id="recv_field_timeout" type="double">1800</variable>
</variable_definition>
\end{xmllines}

which sets the timeout period for reading data.

\noindent If XIOS is to be used to read from restart files generated with an earlier \NEMO\ version (3.6 for instance),
the dimension \forcode{z} defined in the restart file must be renamed to \forcode{nav_lev}.\\

XIOS can also be used to write \NEMO\ restarts. The namelist parameter
\np{nn_wxios}{nn\_wxios} is used to determine the type of restart \NEMO\ will write:

\begin{description}
\item [{\np[=0]{nn_wxios}{nn\_wxios}}] \hfill \\
    Default functionality: each \NEMO\ process writes its own restart file
\item [{\np[=1]{nn_wxios}{nn\_wxios}}] \hfill \\
    XIOS will write to a single restart file
\item [{\np[=2]{nn_wxios}{nn\_wxios}}] \hfill \\
    XIOS will write to multiple restart files, one per server
\end{description}

This option aims to reduce the number of restart files generated by \NEMO, and may
be useful when there is a need to change the number of processors used to run the simulation.
Note that \textbf{\NEMO\ will not be able to read the restart files generated by XIOS with
\np[=2]{nn_wxios}{nn\_wxios}}. These files will have to be combined (with \eg\
\href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO}) before
continuing the run.

The use of XIOS to read and write restart files is in preparation for running \NEMO\ on exascale
computing platforms. While this may not yield significant performance gains on current clusters, it
should reduce file system bottlenecks in future attempts to run \NEMO\ on hundreds of
thousands of cores.

\section[NetCDF4 support (legacy output file method)]{NetCDF4 support (legacy output file method)}
\label{sec:DIA_nc4}

As of \NEMO\ v5, the legacy output method (where diagnostic and/or restart files are written
by \NEMO\ using the old IOIPSL interface, rather than by XIOS)
only supports NetCDF4 (version 4.1 or later is recommended) built with HDF5 (version 1.8.4
or later is recommended). This allows chunking and (lossless) compression, which can
achieve a significant reduction in file size for a small runtime overhead.  For a fuller
discussion on chunking and other performance issues the reader is referred to the NetCDF4
documentation found
\href{https://docs.unidata.ucar.edu/nug/current/netcdf_perf_chunking.html}{here}.

Datasets created with chunking and compression are not backwards
compatible with the NetCDF3 "classic" format, but most analysis codes can simply be relinked
with the NetCDF4 libraries and will then read both NetCDF3 and NetCDF4 files. \NEMO\
executables linked with NetCDF4 libraries can be made to produce NetCDF3 files by setting
\np[=.false.]{ln_nc4zip}{ln\_nc4zip} in the \nam{nc4}{nc4} namelist.

\begin{listing}
  \nlst{namnc4}
  \caption{\forcode{&namnc4}}
  \label{lst:namnc4}
\end{listing}

Chunking and compression are applied only to 4D fields. There is no advantage in
chunking across more than one time level, since previously written chunks would have to
be read back and decompressed before being added to.  Therefore, user control over chunk
sizes is provided only for the three spatial dimensions.  The user sets an approximate
number of chunks along each spatial axis.  The actual size of the chunks will depend on the
global domain size for mono-processors and the local processor domain size
for distributed processing.  The derived values are subject to practical minimum values
(to avoid wastefully small chunk sizes) and cannot be greater than the domain size in any
dimension.  The algorithm used is:

\begin{forlines}
ichunksz(1) = MIN(idomain_size, MAX((idomain_size-1) / nn_nchunks_i + 1 ,16 ))
ichunksz(2) = MIN(jdomain_size, MAX((jdomain_size-1) / nn_nchunks_j + 1 ,16 ))
ichunksz(3) = MIN(kdomain_size, MAX((kdomain_size-1) / nn_nchunks_k + 1 , 1 ))
ichunksz(4) = 1
\end{forlines}

\noindent As an example, setting:

\begin{forlines}
nn_nchunks_i=4
nn_nchunks_j=4
nn_nchunks_k=31
\end{forlines}

for a standard ORCA2\_ICE\_PISCES configuration (with a global domain of {\small\texttt{182x149x31}})
gives chunk sizes of {\small\texttt{46x38x1}} in the mono-processor case.
An illustration of the potential space savings that NetCDF4 chunking and compression provide is given in
\autoref{tab:DIA_NC4}, which compares the results of two short runs of the deprecated ORCA2\_LIM reference configuration
(now the ORCA2\_ICE\_PISCES configuration) with a 4x2 MPI decomposition.
Note the variation in the compression ratio achieved, which reflects chiefly the dry to wet volume ratio of
each processing region.
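
\noindent The chunk-size rule quoted above can be reproduced outside the model with a few lines of Python.
The following is a minimal sketch, not code taken from \NEMO: the function name \texttt{chunk\_sizes} and
its argument names are hypothetical, and Python floor division is assumed to match Fortran integer division for
the positive sizes involved.

```python
def chunk_sizes(domain_size, nchunks, minimums=(16, 16, 1)):
    """Return (i, j, k, t) chunk sizes following the rule quoted in the text.

    domain_size -- (i, j, k) extent of the global or local processor domain
    nchunks     -- requested (nn_nchunks_i, nn_nchunks_j, nn_nchunks_k)
    minimums    -- practical minimum chunk size along each spatial axis
    """
    sizes = []
    for n, c, smallest in zip(domain_size, nchunks, minimums):
        # Ceiling of n / c in integer arithmetic, raised to the practical
        # minimum and then capped at the domain size
        sizes.append(min(n, max((n - 1) // c + 1, smallest)))
    sizes.append(1)  # the time axis is chunked one record at a time
    return tuple(sizes)


# ORCA2_ICE_PISCES global domain with the example namelist settings
print(chunk_sizes((182, 149, 31), (4, 4, 31)))  # (46, 38, 1, 1)
```

For distributed runs, the same function would be applied to the local processor domain size rather than the
global one, so the chunk sizes generally differ between decompositions.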

\begin{table}
  \centering
  \caption{Filesize comparison between NetCDF3 and NetCDF4 with chunking and compression}
  \begin{tabular}{lrrr}
    Filename                    & NetCDF3	& NetCDF4  & Reduction \\
                                & filesize	& filesize & \%        \\
                                & (KB)		& (KB)	  &           \\
    ORCA2\_restart\_0000.nc     & 16420 	& 8860 	  & 47\%      \\
    ORCA2\_restart\_0001.nc     & 16064 	& 11456    & 29\%      \\
    ORCA2\_restart\_0002.nc     & 16064		& 9744	  & 40\%      \\
    ORCA2\_restart\_0003.nc     & 16420		& 9404	  & 43\%      \\
    ORCA2\_restart\_0004.nc     & 16200 	& 5844	  & 64\%      \\
    ORCA2\_restart\_0005.nc     & 15848 	& 8172	  & 49\%      \\
    ORCA2\_restart\_0006.nc     & 15848 	& 8012 	  & 50\%      \\
    ORCA2\_restart\_0007.nc     & 16200 	& 5148 	  & 69\%      \\
    ORCA2\_2d\_grid\_T\_0000.nc & 2200 		& 1504	  & 32\%      \\
    ORCA2\_2d\_grid\_T\_0001.nc & 2200 		& 1748	  & 21\%      \\
    ORCA2\_2d\_grid\_T\_0002.nc & 2200 		& 1592	  & 28\%      \\
    ORCA2\_2d\_grid\_T\_0003.nc & 2200 		& 1540	  & 30\%      \\
    ORCA2\_2d\_grid\_T\_0004.nc & 2200 		& 1204	  & 46\%      \\
    ORCA2\_2d\_grid\_T\_0005.nc & 2200 		& 1444	  & 35\%      \\
    ORCA2\_2d\_grid\_T\_0006.nc & 2200 		& 1428	  & 36\%      \\
    ORCA2\_2d\_grid\_T\_0007.nc & 2200		& 1148	  & 48\%      \\
    ...                         & ...		& ...      & ...       \\
    ORCA2\_2d\_grid\_W\_0000.nc & 4416		& 2240	  & 50\%      \\
    ORCA2\_2d\_grid\_W\_0001.nc & 4416		& 2924	  & 34\%      \\
    ORCA2\_2d\_grid\_W\_0002.nc & 4416		& 2512	  & 44\%      \\
    ORCA2\_2d\_grid\_W\_0003.nc & 4416		& 2368	  & 47\%      \\
    ORCA2\_2d\_grid\_W\_0004.nc & 4416		& 1432	  & 68\%      \\
    ORCA2\_2d\_grid\_W\_0005.nc & 4416		& 1972	  & 56\%      \\
    ORCA2\_2d\_grid\_W\_0006.nc & 4416		& 2028	  & 55\%      \\
    ORCA2\_2d\_grid\_W\_0007.nc & 4416		& 1368	  & 70\%      \\
  \end{tabular}
  \label{tab:DIA_NC4}
\end{table}

Note that chunking and compression can also be applied when combining output files with \eg\
\href{https://sites.nemo-ocean.io/user-guide/tools.html#rebuild-nemo}{REBUILD\_NEMO}.

%% =================================================================================================
\section[Tracer/Dynamics trends (\forcode{namtrd}, \forcode{namtrc_trd})]{Tracer/Dynamics trends (\protect\nam{trd}{trd}, \protect\nam{trc_trd}{trc\_trd})}
\label{sec:DIA_trd}

\begin{listing}
  \nlst{namtrd}
  \caption{\forcode{&namtrd}}
  \label{lst:namtrd}
\end{listing}

\begin{listing}
  \nlst{namtrc_trd}
  \caption{\forcode{&namtrc_trd}}
  \label{lst:namtrc_trd}
\end{listing}

Each trend of the time evolution equations for the dynamics (\mdl{trddyn}) and both active (\mdl{trdtra})
and passive (\mdl{trdtrc}) tracers can be output following their computation, via calls to the
\rou{trd\_tra} (active and passive tracers), \rou{trd\_dyn} (dynamics) and
\rou{trd\_trc} (passive tracers) subroutines.

The output of trends diagnostics for the dynamics and active tracers is controlled by parameters in the \nam{trd}{trd}
namelist:

\begin{description}
\item [{\np[=.true.]{ln_glo_trd}{ln\_glo\_trd}}] \hfill \\
  Every \np{nn_trd}{nn\_trd} time-steps, a check of the basin-averaged properties of the momentum and tracer
  equations is performed. This also includes a check of the properties of the time evolution equations for
  $T^2$, $S^2$, $\tfrac{1}{2} (u^2+v^2)$, and potential energy.
\item [{\np[=.true.]{ln_dyn_trd}{ln\_dyn\_trd}}] \hfill \\
  Each 3D trend of the evolution of the two momentum components is output.
\item [{\np[=.true.]{ln_dyn_mxl}{ln\_dyn\_mxl}}] (\textbf{currently not working}) \hfill \\
  Each 3D trend of the evolution of the two momentum components averaged over the mixed layer is output.
\item [{\np[=.true.]{ln_vor_trd}{ln\_vor\_trd}}] (\textbf{currently not working}) \hfill \\
  A vertical summation of the momentum tendencies is performed,
  then the curl is computed to obtain the barotropic vorticity tendencies which are output.
\item [{\np[=.true.]{ln_KE_trd}{ln\_KE\_trd}}] \hfill \\
  Each 3D trend of the Kinetic Energy equation is output.
\item [{\np[=.true.]{ln_PE_trd}{ln\_PE\_trd}}] (\textbf{currently not working with nonlinear free surface}) \hfill \\
  Each 3D trend of the Potential Energy equation is output.
\item [{\np[=.true.]{ln_tra_trd}{ln\_tra\_trd}}] \hfill \\
  Each 3D trend of the evolution of temperature and salinity is output.
\item [{\np[=.true.]{ln_tra_mxl}{ln\_tra\_mxl}}] (\textbf{currently not working}) \hfill \\
  Each 2D trend of the evolution of temperature and salinity averaged over the mixed layer is output.
\end{description}

% TODO: Need a similar description for this namelist
The output of trends diagnostics for the passive tracers is, in turn, controlled by parameters in the
\nam{trc_trd}{trc\_trd} namelist. \\

\noindent As all 3D trends are output using XIOS, \key{xios} must generally be specified.
Additionally, the passive tracer trends require \key{trdtrc} (for 3D trends) and/or \key{trdmxl\_trc}
(for 2D trends averaged over the mixed layer) to be specified.

Note that currently, \textbf{the trends diagnostics are not fully functional or tested} and a warning will
be raised if they are used.
In particular, the code associated with the \np{ln_dyn_mxl}{ln\_dyn\_mxl}, \np{ln_vor_trd}{ln\_vor\_trd},
and \np{ln_tra_mxl}{ln\_tra\_mxl} namelist options is not working and an error will be raised if they are used.

%% =================================================================================================
\section[Transports across sections]{Transports across sections}
\label{sec:DIA_diag_dct}

\begin{listing}
  \nlst{nam_diadct}
  \caption{\forcode{&nam_diadct}}
  \label{lst:nam_diadct}
\end{listing}

Diagnostics to compute the transport of volume, heat and salt through sections can be activated by setting
\np[=.true.]{ln_diadct}{ln\_diadct} in the \nam{_diadct}{\_diadct} namelist.
Each section is defined by the coordinates of its 2 extremities.
The pathways between them are constructed using the \texttt{SECTIONS\_DIADCT} tool
and are written to a binary file \textit{section\_ijglobal.diadct}, which is later read in by
\NEMO\ to compute on-line transports.\\

\noindent The on-line transports module (\mdl{diadct}) outputs three ASCII files:\\

- \textit{volume\_transport} for volume transports (unit: $10^{6}\ m^{3}\,s^{-1}$)

- \textit{heat\_transport}   for   heat transports (unit: $10^{15}\ W$)

- \textit{salt\_transport}   for   salt transports (unit: $10^{9}\ kg\,s^{-1}$) \\

\noindent Namelist variables in the \nam{_diadct}{\_diadct} namelist control how frequently the flows are summed
and the time scales over which they are averaged, as well as the level of output for debugging:

\begin{description}
\item [{\np{nn_dct}{nn\_dct}}] \hfill \\
    Frequency of computation of the transports (in time steps)
\item [{\np{nn_dctwri}{nn\_dctwri}}] \hfill \\
    Averaging period of the transports (as a frequency, in time steps)
\item [{\np{nn_secdebug}{nn\_secdebug}}] \hfill \\
    Sections to debug: \\
        \indent \texttt{\ 0} - Do not debug any sections \\
        \indent \texttt{-1} - Debug all sections \\
        \indent \texttt{\ n} - Debug section number \texttt{n}
\end{description}

%% =================================================================================================
\subsubsection{Creating a binary file containing the pathway of each section}

In \path{./tools/SECTIONS_DIADCT/run}, the file \textit{list\_sections.ascii\_global} contains a list of
all the sections (based on MERSEA project metrics) that are to be computed.
Another file is available for the GYRE configuration (\textit{list\_sections.ascii\_GYRE}).\\

\noindent Each section in this file is defined by a line containing, in order:

\begin{description}
\item [\texttt{long1} \texttt{lat1}] \hfill \\
    Coordinates of the first extremity of the section, \eg\ \texttt{-68.} \texttt{-54.5}
\item [\texttt{long2} \texttt{lat2}] \hfill \\
    Coordinates of the second extremity of the section, \eg\ \texttt{-60.} \texttt{-64.7}
\item [\texttt{nclass}] \hfill \\
    The number of bounds in each class type (\texttt{nclass - 1} classes per type), \eg\ \texttt{2}
\item [\texttt{okstrpond} or \texttt{nostrpond}] \hfill \\
    A string controlling whether to compute heat and salt transports (\texttt{okstrpond}) or not (\texttt{nostrpond})
\item [\texttt{ice} or \texttt{noice}] \hfill \\
    A string controlling whether to compute surface and volume ice transports (\texttt{ice}) or not (\texttt{noice})
\item [\texttt{section\_name}] \hfill \\
    The name of the section, \eg\ \texttt{ACC\_Drake\_Passage}
\end{description}

\noindent Note that neither the results of the transport calculations nor the directions of positive and
negative flow depend on the order in which the 2 extremities are specified in this file. \\

\noindent If \texttt{nclass} $\neq$ 0, the following \texttt{nclass + 1} lines contain a class type and its bounds,
which may be repeated for several class types. For example, for 2 class types with 2 bounds (1 class per type): \\

%\\
{
  \noindent \texttt{
    long1 lat1 long2 lat2 nclass (ok/no)strpond (no)ice section\_name \\
    classtype                                                         \\
    bound\_1                                                           \\
    bound\_2                                                           \\
    classtype                                                         \\
    bound\_1                                                           \\
    bound\_2}
}
\\

\noindent where \texttt{classtype} can be:\\

 - \texttt{zsal}  for          salinity classes

 - \texttt{ztem}  for       temperature classes

 - \texttt{zlay}  for             depth classes

 - \texttt{zsigi} for \textit{in situ} density classes

 - \texttt{zsigp} for potential density classes \\

 The script \textit{job.ksh} computes the pathway for each section and creates a binary file
 \textit{section\_ijglobal.diadct} which is read by \NEMO.
 The top part of this script should be modified for the user's configuration, including setting the name and path
 of the coordinates file to use. \\

 Examples of two sections, \texttt{ACC\_Drake\_Passage} with no classes,
 and \texttt{ATL\_Cuba\_Florida} with 4 temperature classes (5 class bounds), are shown: \\

 %\\
 {
   \noindent \texttt{
     -68.    -54.5   -60.    -64.7  00 okstrpond noice ACC\_Drake\_Passage \\
     -80.5    22.5   -80.5    25.5  05 nostrpond noice ATL\_Cuba\_Florida  \\
     ztem                                                                  \\
     -2.0                                                                  \\
     4.5                                                                   \\
     7.0                                                                   \\
     12.0                                                                  \\
     40.0}
 }
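
The layout described above can be illustrated with a short Python reader. This is a minimal sketch only: it is not
part of the \texttt{SECTIONS\_DIADCT} tool, the function name \texttt{parse\_sections} and the dictionary keys
are hypothetical, and it assumes that fields are separated by whitespace exactly as in the examples shown.

```python
CLASS_TYPES = {"zsal", "ztem", "zlay", "zsigi", "zsigp"}


def parse_sections(text):
    """Parse section definitions into a list of dictionaries (illustrative only)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    sections = []
    i = 0
    while i < len(lines):
        fields = lines[i].split()
        nclass = int(fields[4])  # number of bounds in each class type
        section = {
            "start": (float(fields[0]), float(fields[1])),  # long1, lat1
            "end": (float(fields[2]), float(fields[3])),    # long2, lat2
            "heat_salt": fields[5] == "okstrpond",
            "ice": fields[6] == "ice",
            "name": fields[7],
            "classes": {},
        }
        i += 1
        # Each class type occupies one line, followed by nclass bound lines
        while nclass > 0 and i < len(lines) and lines[i] in CLASS_TYPES:
            bounds = [float(value) for value in lines[i + 1:i + 1 + nclass]]
            section["classes"][lines[i]] = bounds
            i += 1 + nclass
        sections.append(section)
    return sections


example = """
-68.    -54.5   -60.    -64.7  00 okstrpond noice ACC_Drake_Passage
-80.5    22.5   -80.5    25.5  05 nostrpond noice ATL_Cuba_Florida
ztem
-2.0
4.5
7.0
12.0
40.0
"""
for section in parse_sections(example):
    print(section["name"], section["classes"])
```

For the two example sections, the reader yields an empty class dictionary for \texttt{ACC\_Drake\_Passage} and
five \texttt{ztem} bounds (four temperature classes) for \texttt{ATL\_Cuba\_Florida}.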

%% =================================================================================================
\subsubsection{Reading the output files}

The format of each output file is: \\

%\\
{
 \noindent \texttt{
    date, time-step number, section number,                \\
    section name, section slope coefficient, class number, \\
    class name, class bound 1, class bound 2,              \\
    transport direction 1, transport direction 2,          \\
    transport total}
}
\\

For sections with classes, the first \texttt{nclass - 1} lines correspond to the transport for each class and
the last line corresponds to the total transport summed over all classes.
For sections with no classes, class number \texttt{1} corresponds to \texttt{total class} and
this class is called \texttt{N}, meaning \texttt{none}.
\texttt{transport direction 1} is the positive part of the transport ($\geq$ 0) and
\texttt{transport direction 2} is the negative part of the transport ($\leq$ 0). \\
\noindent \texttt{section slope coefficient} indicates how the signs and directions of the transports should be
interpreted (see \autoref{tab:DIA_dct_sect_slope}).

\begin{table}
  \caption[Transport section slope coefficients]{Transport section slope coefficients}
  \begin{tabular}{|l|l|l|l|l|}
    \hline
    Section slope coefficient      & Section type & Direction 1 & Direction 2 & Total transport    \\
    \hline
    0.                             & Horizontal	 & Northward	& Southward   & positive: northward    \\
    \hline
    1000.                          & Vertical     & Eastward    & Westward    & positive: eastward		\\
    \hline
    \texttt{$\neq$ 0, $\neq$ 1000.} & Diagonal     & Eastward    & Westward	  & positive: eastward		\\
    \hline
  \end{tabular}
  \label{tab:DIA_dct_sect_slope}
\end{table}

%% =================================================================================================
\section{Diagnosing the steric effect on sea surface height}
\label{sec:DIA_steric}

Changes in steric sea level are caused when changes in the density of the water column imply an expansion or
contraction of the column.
It is essentially produced through surface heating/cooling and to a lesser extent through non-linear effects of
the equation of state (cabbeling, thermobaricity...).
Non-Boussinesq models represent all of the oceanic processes acting on the sea level.
In particular, they include the steric effect.
In contrast, Boussinesq models, such as \NEMO, conserve volume, rather than mass,
and so do not properly represent expansion or contraction.
The steric effect is therefore not explicitly represented.
This approximation does not represent a serious error with respect to the flow field calculated by the model
\citep{greatbatch_JGR94}, but extra attention is required when investigating sea level,
as steric changes are an important contribution to local changes in sea level on seasonal and climatic time scales.
This is especially true for investigation into sea level rise due to global warming.

Fortunately, the steric contribution to the sea level consists of a spatially uniform component that
can be diagnosed by considering the mass budget of the world ocean \citep{greatbatch_JGR94}.
In order to better understand how global mean sea level evolves and thus how the steric sea level can be diagnosed,
we compare, in the following, the non-Boussinesq and Boussinesq cases.

Let us denote by
$\mathcal{M}$ the total mass    of liquid seawater ($\mathcal{M} = \int_D \rho \,dv$),
$\mathcal{V}$ the total volume  of        seawater      ($\mathcal{V} = \int_D dv$),
$\mathcal{A}$ the total surface of       the ocean      ($\mathcal{A} = \int_S ds$),
$\bar{\rho}$ the global mean  seawater (\textit{in situ}) density
($\bar{\rho} = 1/\mathcal{V} \int_D \rho \,dv$), and
$\bar{\eta}$ the global mean sea level
($\bar{\eta} = 1/\mathcal{A} \int_S \eta \,ds$).

A non-Boussinesq fluid conserves mass. It satisfies the following relations:

\begin{equation}
  \begin{split}
    \mathcal{M} &=  \mathcal{V}  \;\bar{\rho} \\
    \mathcal{V} &=  \mathcal{A}  \;\bar{\eta}
  \end{split}
  \label{eq:DIA_MV_nBq}
\end{equation}

Temporal changes in total mass are obtained from the mass conservation equation:

\begin{equation}
  \frac{1}{e_3} \partial_t ( e_3\,\rho) + \nabla( \rho \, \textbf{U} )
  = \left. \frac{\textit{emp}}{e_3}\right|_\textit{surface}
  \label{eq:DIA_Co_nBq}
\end{equation}

where $\rho$ is the \textit{in situ} density, and \textit{emp} the surface mass exchanges with the other media of
the Earth system (atmosphere, sea-ice, land).
Its global average leads to the total mass change

\begin{equation}
  \partial_t \mathcal{M} = \mathcal{A} \;\overline{\textit{emp}}
  \label{eq:DIA_Mass_nBq}
\end{equation}

where $\overline{\textit{emp}} = 1/\mathcal{A} \int_S \textit{emp}\,ds$ is the global mean net mass flux through the ocean surface.
Bringing \autoref{eq:DIA_Mass_nBq} and the time derivative of \autoref{eq:DIA_MV_nBq} together leads to
the evolution equation of the mean sea level

\begin{equation}
  \partial_t \bar{\eta} = \frac{\overline{\textit{emp}}}{ \bar{\rho}}
  - \frac{\mathcal{V}}{\mathcal{A}} \;\frac{\partial_t \bar{\rho} }{\bar{\rho}}
  \label{eq:DIA_ssh_nBq}
\end{equation}

The first term in \autoref{eq:DIA_ssh_nBq} alters sea level by adding or subtracting mass from the ocean.
The second term arises from temporal changes in the global mean density; \ie\ from steric effects.
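
\noindent As a sketch of the intermediate step, and assuming that the ocean surface area $\mathcal{A}$ does not
vary in time, differentiating the two relations in \autoref{eq:DIA_MV_nBq} gives

\[
  \partial_t \mathcal{M} = \bar{\rho} \;\partial_t \mathcal{V} + \mathcal{V} \;\partial_t \bar{\rho}
                         = \bar{\rho} \,\mathcal{A} \;\partial_t \bar{\eta} + \mathcal{V} \;\partial_t \bar{\rho}
\]

Equating this with the right-hand side of \autoref{eq:DIA_Mass_nBq}, $\mathcal{A} \;\overline{\textit{emp}}$,
and dividing by $\bar{\rho} \,\mathcal{A}$ recovers \autoref{eq:DIA_ssh_nBq}.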

In a Boussinesq fluid, $\rho$ is replaced by $\rho_o$ in all the equations except where $\rho$ appears multiplied by
gravity (\ie\ in the hydrostatic balance of the primitive equations).
In particular, the mass conservation equation, \autoref{eq:DIA_Co_nBq}, degenerates into the incompressibility equation:

\[
  \frac{1}{e_3} \partial_t ( e_3 ) + \nabla( \textbf{U} ) = \left. \frac{\textit{emp}}{\rho_o \,e_3}\right|_ \textit{surface}
  % \label{eq:DIA_Co_Bq}
\]

and the global average of this equation now gives the temporal change of the total volume,

\[
  \partial_t \mathcal{V} = \mathcal{A} \;\frac{\overline{\textit{emp}}}{\rho_o}
  % \label{eq:DIA_V_Bq}
\]

Only the volume is conserved, not mass, or, more precisely, the mass which is conserved is the Boussinesq mass,
$\mathcal{M}_o = \rho_o \mathcal{V}$.
The total volume (or equivalently the global mean sea level) is altered only by net volume fluxes across
the ocean surface, not by changes in mean mass of the ocean: the steric effect is missing in a Boussinesq fluid.

Nevertheless, following \citet{greatbatch_JGR94}, the steric effect on the volume can be diagnosed by
considering the mass budget of the ocean.
The apparent changes in $\mathcal{M}$, mass of the ocean, which are not induced by surface mass flux
must be compensated by a spatially uniform change in the mean sea level due to expansion/contraction of the ocean
\citep{greatbatch_JGR94}.
In other words, the Boussinesq mass, $\mathcal{M}_o$, can be related to $\mathcal{M}$,
the total mass of the ocean seen by the Boussinesq model, via the steric contribution to the sea level,
$\eta_s$, a spatially uniform variable, as follows:

\begin{equation}
  \mathcal{M}_o = \mathcal{M} + \rho_o \,\eta_s \,\mathcal{A}
  \label{eq:DIA_M_Bq}
\end{equation}

Any change in $\mathcal{M}$ which cannot be explained by the net mass flux through the ocean surface
is converted into a mean change in sea level.
Introducing the total density anomaly, $\mathcal{D}= \int_D d_a \,dv$,
where $d_a = (\rho -\rho_o ) / \rho_o$ is the density anomaly used in \NEMO\ (cf. \autoref{subsec:TRA_eos})
in \autoref{eq:DIA_M_Bq} leads to a very simple form for the steric height:

\begin{equation}
  \eta_s = - \frac{1}{\mathcal{A}} \mathcal{D}
  \label{eq:DIA_steric_Bq}
\end{equation}
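
\noindent This result can be checked directly from the definitions above.
Writing $\rho = \rho_o (1 + d_a)$, the total mass seen by the Boussinesq model is

\[
  \mathcal{M} = \int_D \rho \,dv = \rho_o \int_D (1 + d_a) \,dv
              = \rho_o \mathcal{V} + \rho_o \mathcal{D} = \mathcal{M}_o + \rho_o \mathcal{D}
\]

so that \autoref{eq:DIA_M_Bq} reduces to $\rho_o \mathcal{D} + \rho_o \,\eta_s \,\mathcal{A} = 0$,
which is \autoref{eq:DIA_steric_Bq}.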

The above formulation of the steric height of a Boussinesq ocean requires four remarks.
First, one can be tempted to define $\rho_o$ as the initial value of $\mathcal{M}/\mathcal{V}$,
\ie\ set $\mathcal{D}_{t=0}=0$, so that the initial steric height is zero.
We do not recommend that.
Indeed, in this case $\rho_o$ depends on the initial state of the ocean.
Since $\rho_o$ has a direct effect on the dynamics of the ocean
(it appears in the pressure gradient term of the momentum equation),
this is definitely not a good idea when inter-comparing experiments.
We instead recommend setting a fixed value of $\rho_o = 1035\;kg\,m^{-3}$.
This value is a sensible choice for the reference density used in a Boussinesq ocean climate model since,
with the exception of only a small percentage of the ocean, density in the World Ocean varies by no more than
2$\%$ from this value \citep[][page 47]{gill_bk82}.

Second, we have assumed here that the total ocean surface, $\mathcal{A}$,
does not change when the sea level is changing, as is the case in all global ocean GCMs
(wetting and drying of grid points is not allowed).

Third, the discretisation of \autoref{eq:DIA_steric_Bq} depends on the type of free surface which is considered.
In the nonlinear free surface case, it is given by

\[
  \eta_s = - \frac{ \sum_{i,\,j,\,k} d_a\; e_{1t} e_{2t} e_{3t} }{ \sum_{i,\,j}       e_{1t} e_{2t} }
  % \label{eq:DIA_discrete_steric_Bq_nfs}
\]

whereas in the linear free surface case, \ie\ when \key{linssh} is specified,
the volume above the \textit{z=0} surface must be explicitly taken into account to
better approximate the total ocean mass and thus the steric sea level:

\[
  \eta_s = - \frac{ \sum_{i,\,j,\,k} d_a\; e_{1t}e_{2t}e_{3t} + \sum_{i,\,j} d_a\; e_{1t}e_{2t} \eta }
                  { \sum_{i,\,j}       e_{1t}e_{2t} }
  % \label{eq:DIA_discrete_steric_Bq_fs}
\]
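
\noindent To make the discretisation concrete, the following Python sketch evaluates \autoref{eq:DIA_steric_Bq},
\ie\ the volume-integrated density anomaly divided by the ocean surface area, on a small regular grid.
It is an illustration under stated assumptions rather than the \NEMO\ implementation: the function name
\texttt{steric\_height} is hypothetical, fields are plain nested lists, land masking and parallel summation are
omitted, and the surface-level density anomaly is assumed to apply to the volume above \textit{z=0} in the
linear free surface case.

```python
def steric_height(d_a, e1t, e2t, e3t, ssh=None):
    """Return the steric height -D/A for fields stored as nested lists.

    d_a, e3t -- 3D fields indexed [k][j][i] (density anomaly, layer thickness)
    e1t, e2t -- 2D horizontal scale factors indexed [j][i]
    ssh      -- optional 2D sea surface height, used only to mimic the
                linear free surface correction described in the text
    """
    total_anomaly = 0.0  # discrete approximation of D = integral of d_a dv
    total_area = 0.0     # discrete approximation of A = integral of ds
    for j, row in enumerate(e1t):
        for i, e1 in enumerate(row):
            cell_area = e1 * e2t[j][i]
            total_area += cell_area
            for k in range(len(d_a)):
                total_anomaly += d_a[k][j][i] * cell_area * e3t[k][j][i]
            if ssh is not None:
                # Volume between z=0 and the free surface, at surface density
                total_anomaly += d_a[0][j][i] * cell_area * ssh[j][i]
    return -total_anomaly / total_area


# Uniform 1000 m deep ocean that is 0.1% lighter than the reference density
nk, nj, ni = 4, 2, 2
e1t = [[1000.0] * ni for _ in range(nj)]
e2t = [[1000.0] * ni for _ in range(nj)]
e3t = [[[250.0] * ni for _ in range(nj)] for _ in range(nk)]
d_a = [[[-1.0e-3] * ni for _ in range(nj)] for _ in range(nk)]
print(steric_height(d_a, e1t, e2t, e3t))  # close to 1 m for this uniform column
```

In this idealised case, a relative density deficit of $10^{-3}$ over a 1000 m column corresponds to a steric
height of about 1 m, which gives a feel for the magnitude of the diagnostic.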

The fourth and last remark concerns the effective sea level and the presence of sea-ice.
In the real ocean, sea ice (and snow above it)  depresses the liquid seawater through its mass loading.
This depression is a result of the mass of the sea ice/snow system acting on the liquid ocean.
There is, however, no dynamical effect associated with these depressions in the liquid ocean sea level,
so that there are no associated ocean currents.
Hence, the dynamically relevant sea level is the effective sea level,
\ie\ the sea level as if sea ice (and snow) were converted to liquid seawater \citep{campin.marshall.ea_OM08}.
% NOTE: This is not true, we have an embedded sea ice option, but I don't know what to put here. Ask Clem?
However, in the current version of \NEMO\ the sea-ice is levitating above the ocean without mass exchanges between
ice and ocean.
Therefore the model effective sea level is always given by $\eta + \eta_s$, whether or not there is sea ice present.

Global averages of both the steric (\texttt{sshsteric} diagnostic) and thermosteric (\texttt{sshthster} diagnostic)
sea level can be output by the AR5 diagnostics module (\mdl{diaar5}, see \autoref{sec:DIA_diag_others_cmip_ptr}).
The latter is the steric sea level due to changes in ocean density arising only from changes in temperature.
It is given by:

\[
  \eta_s = - \frac{1}{\mathcal{A}} \int_D d_a(T,S_o,p_o) \,dv
  % \label{eq:DIA_thermosteric_Bq}
\]

where $S_o$ and $p_o$ are the initial salinity and pressure, respectively.

When this diagnostic is output, salinity data for $S_o$ must be provided via a
variable named \texttt{vosaline} in a file named \textit{sali\_ref\_clim\_monthly.nc}.
This data must be provided as a monthly climatology; \ie\ the file's time coordinate must have a length of 12.

%% =================================================================================================
\section[Tidal harmonic and generic multiple-linear-regression analysis (\textit{diamlr.F90})]{Tidal harmonic and generic multiple-linear-regression analysis (\protect\mdl{diamlr})}
\label{sec:DIA_diamlr}

Functionality for multiple-linear-regression (MLR) analysis of arbitrary output fields, using regressors that are a function of the continuous model time, is available as a diagnostic option of \NEMO.
Its implementation makes use of the ordinary-least-squares method (a method overview can be found \href{https://en.wikipedia.org/wiki/Ordinary_least_squares}{here}) and depends on XIOS for generating a set of intermediate output files. The set of regressors is configurable as part of the model-output XIOS configuration, and the regression analyses are completed in a flexible post-processing step. In particular, the analysis time window remains open until the post-processing step: it can range from part of a model run (depending on the selected temporal resolution of the intermediate output) to a period spanning multiple restart segments. The original regressor set can also be restricted at the post-processing step.
For the specific case of tidal harmonic analysis, the configuration of regressors that correspond to tidal constituents available for tidal forcing (see \autoref{sec:SBC_TDE}) is facilitated through a substitution mechanism (\ie\ model-provided tidal frequencies, phases, and amplitudes can be referred to symbolically in MLR analysis configurations).\par

\subsection{Configuration of the multiple-linear-regression analysis}

The MLR analysis is activated by defining an empty file-group entry
\begin{xmllines}
   <file_group id="diamlr_files" output_freq="<output frequency>" enabled=".TRUE." />
\end{xmllines}
in the XIOS configuration, where \texttt{<output frequency>} specifies the temporal resolution of the intermediate output: if defined, this file group will be populated during model initialisation.
Other prerequisite XIOS-configuration elements (regressors and the time variable) are pre-defined in the default XIOS configuration file \path{./cfgs/SHARED/field_def_nemo-oce.xml}, and can be modified if required.\par

Regressors are defined and enabled within the XIOS field group \xmlcode{<field_group id="diamlr_fields">} in the form of individual fields that are computed from the spatially uniform field \texttt{diamlr\_time} as
\begin{xmllines}
   <field id="diamlr_r<mmm>" field_ref="diamlr_time" expr="<expression>" enabled=".TRUE." comment="<comment>" />
\end{xmllines}
where \texttt{<mmm>} is a 3-digit identification number, \texttt{<expression>} a functional expression, and \texttt{<comment>} an arbitrary string (which may be utilised to pass information to post-processing utilities); field \texttt{diamlr\_time} contains the continuous model time in seconds.
In the functional expression, XIOS requires the specified reference field \texttt{diamlr\_time} to be included; therefore, in order to obtain a constant expression for fitting an intercept, \texttt{diamlr\_time\^{}0} can be chosen.
The model time \texttt{diamlr\_time} corresponds to module variable \forcode{adatrj} of module \mdl{dom\_oce} and is defined in module \mdl{daymod}; its continuity across model restarts depends on a selection made in \nam{dom}{dom}.
Similarly, an XIOS field \texttt{<field name>} can be selected for MLR analysis through the definition of a new field
\begin{xmllines}
   <field id="diamlr_f<nnn>" field_ref="<field name>" enabled=".TRUE." />
\end{xmllines}
in field group \xmlcode{<field_group id="diamlr_fields">}, where \texttt{<nnn>} is a 3-digit identification number.\par

For the purpose of tidal harmonic analysis, two orthogonal regressors per analysed tidal-constituent signal need to be defined in order to fit both the amplitude and phase of the corresponding harmonic, typically a sine and cosine function with identical argument.
Further, regressor configurations can be equipped with placeholders to refer to the frequency, phase, and amplitude of each of the constituents available and evaluated for tidal forcing of the model. In particular:
\begin{description}
   \item [\texttt{\_\_TDE\_<constituent>\_omega\_\_}] \hfill \\
      refers to the angular velocity (in units of rad s$^{-1}$);
   \item [\texttt{\_\_TDE\_<constituent>\_phase\_\_}] \hfill \\
      refers to the phase, including the nodal correction at the beginning of the model run (in units of rad); and
   \item [\texttt{\_\_TDE\_<constituent>\_amplitude\_\_}] \hfill \\
      refers to the equilibrium-tide amplitude (in units of m)
\end{description}
of tidal constituent \texttt{<constituent>}.
During model initialisation, these placeholders are automatically substituted with the corresponding model-computed values for the respective tidal constituent.\par

A default set of regressors relevant for tidal harmonic analysis has been pre-defined (see \path{./cfgs/SHARED/field_def_nemo-oce.xml}) and can be redefined. An example of such a redefinition can be found in the AMM12 reference configuration, in file \path{./cfgs/AMM12/EXPREF/context_nemo.xml}.\par

\subsection{The intermediate output and its post-processing}

Internally, during model initialisation, the initial XIOS configuration for MLR analysis is expanded automatically through the generation of field and output-file definitions for the relevant intermediate model output.
The resulting intermediate output consists of fields of scalar products between each regressor and the values of the fields selected for MLR analysis, as well as of scalar products between each regressor-regressor pair, all sampled at the configured interval.
For the final analysis only the scalar products over the analysis time span are required, thus the intermediate output can be freely subset or combined (added) along its time dimension to select the analysis window (which enables analyses across multiple restart segments) during post-processing.\par
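
The reason why the intermediate output can simply be added along time follows from the normal equations of the ordinary-least-squares method, which involve only sums of scalar products. The Python sketch below illustrates this for a single grid point. It is not the \NEMO\ or XIOS implementation: the helper names \texttt{scalar\_products} and \texttt{solve} are hypothetical, the two regressors (an intercept and a linear trend) and the synthetic data are chosen purely for illustration, and a small dense solver stands in for whatever a real post-processing tool would use.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    a = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for c in range(col, n + 1):
                a[row][c] -= factor * a[col][c]
    x = [0.0] * n
    for row in reversed(range(n)):
        known = sum(a[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (a[row][n] - known) / a[row][row]
    return x


def scalar_products(times, values, regressors):
    """Return regressor-regressor and regressor-field scalar products."""
    rows = [[r(t) for r in regressors] for t in times]
    n = len(regressors)
    gram = [[sum(row[p] * row[q] for row in rows) for q in range(n)] for p in range(n)]
    proj = [sum(row[p] * y for row, y in zip(rows, values)) for p in range(n)]
    return gram, proj


regressors = [lambda t: 1.0, lambda t: t]      # intercept and linear trend
times = [float(t) for t in range(10)]
values = [2.0 + 3.0 * t for t in times]        # synthetic "model output"

# Two segments, each providing its own intermediate scalar products
g1, p1 = scalar_products(times[:4], values[:4], regressors)
g2, p2 = scalar_products(times[4:], values[4:], regressors)

# Adding the products along time selects the full analysis window
gram = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(g1, g2)]
proj = [x + y for x, y in zip(p1, p2)]
coefficients = solve(gram, proj)  # close to [2.0, 3.0] for this synthetic series
print(coefficients)
```

Restricting the regressor set at the post-processing step corresponds, in this picture, to dropping the matching rows and columns of the summed scalar-product matrix before solving.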

The total number of intermediate output variables depends on the number of analysed fields ($n_{f}$) and the number of regressors ($n_{r}$) (for tidal analysis, $n_{r} = 2n_{c}+1$, \ie\ twice the number of tidal constituents, $n_{c}$, plus one regressor to fit the intercept) and amounts to $n_{f} n_{r} + 2 n_{r}^{2} - n_{r}$ (of which $2 n_{r}^2 - n_{r}$ variables are scalar time series). These output variables are written to output files labelled with \path{diamlr_scalar}, which contain the regressor-regressor scalar products, and with \path{diamlr_grid_<grid_type>}, which contain the regressor-diagnostic scalar-product fields for the fields defined on a grid of type \texttt{<grid\_type>}.\par

For the computation of regression coefficients from previously generated intermediate output files, the rudimentary script \path{./tools/DIAMLR/diamlr.py} can be used. This script is provided as a simple example of the final analysis step: it processes suitable intermediate-output files by adding all available time slices and by computing regression coefficients for all available analysed fields and for all or a subset of the regressors identified from the content of the intermediate-output files. To complete a tidal harmonic analysis, the pairs of regression coefficients associated with each of the tidal constituents selected for analysis (the \texttt{comment} attribute could be used for identifying such pairs) could in turn be converted into maps of amplitudes and phases.\par
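
As an illustration of this last conversion, the sketch below turns one pair of regression coefficients into an amplitude and a phase. It is a minimal example and not taken from \path{./tools/DIAMLR/diamlr.py}: the function name \texttt{harmonic\_amplitude\_phase} is hypothetical, and the phase convention (a signal written as $A \cos(\omega t - \phi)$, with no further reference to the tidal-forcing phase or nodal corrections) is an assumption that would need to be matched to the regressor definitions actually used.

```python
import math


def harmonic_amplitude_phase(cos_coefficient, sin_coefficient):
    """Convert a*cos(wt) + b*sin(wt) into (A, phi) such that it equals A*cos(wt - phi)."""
    amplitude = math.hypot(cos_coefficient, sin_coefficient)
    phase = math.atan2(sin_coefficient, cos_coefficient)  # radians, in [-pi, pi]
    return amplitude, phase


# Check the identity at an arbitrary phase angle wt for one coefficient pair
a, b = 0.3, 0.4
amplitude, phase = harmonic_amplitude_phase(a, b)
wt = 1.234
original = a * math.cos(wt) + b * math.sin(wt)
rebuilt = amplitude * math.cos(wt - phase)
print(amplitude, phase, abs(original - rebuilt) < 1e-12)
```

Applied point by point to the coefficient maps of a constituent, the same conversion yields the corresponding amplitude and phase maps.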

%% =================================================================================================
\section{Other diagnostics}
\label{sec:DIA_diag_others}

Aside from the standard model variables, other diagnostics can be computed on-line.
The available ready-to-add diagnostic modules can be found in the DIA directory.

%% =================================================================================================
\subsection[Depth of various quantities (\textit{diahth.F90})]{Depth of various quantities (\protect\mdl{diahth})}

The following diagnostics are available via the \mdl{diahth} module when \key{diahth} is specified:

\begin{itemize}
\item the mixed layer depth \citep[based on the density criterion of][]{de-boyer-montegut.madec.ea_JGR04}
\item the turbocline depth (based on a turbulent mixing coefficient criterion)
\item the depth of the 20\deg{C} isotherm
\item the depth of the thermocline (maximum of the vertical temperature gradient)
\end{itemize}

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{DIA_mask_subasins}
  \caption[Sub-basin decomposition used to compute transports and the meridional stream-function]{
    Decomposition of the World Ocean (shown here for the ORCA2 grid) into sub-basins used to
    compute the heat and salt transports as well as the meridional stream-function:
    Atlantic basin (red), Pacific basin (green),
    Indian basin (blue), Indo-Pacific basin (blue+green).
    Note that semi-enclosed seas (Red, Med and Baltic seas) as well as
    Hudson Bay are removed from the sub-basins, and that
    the Arctic Ocean has been split into Atlantic and
    Pacific basins along the North fold line.
  }
  \label{fig:DIA_mask_subasins}
\end{figure}

%% =================================================================================================
\subsection[CMIP-specific and poleward transport diagnostics (\textit{diaar5.F90}, \textit{diaptr.F90})]{CMIP-specific and poleward transport diagnostics (\protect\mdl{diaar5}, \protect\mdl{diaptr})}
\label{sec:DIA_diag_others_cmip_ptr}

Diagnostics in the \mdl{diaar5} module correspond to outputs that are required for the AR5/CMIP5 simulations
(see for example, the thermosteric sea level diagnostic in \autoref{sec:DIA_steric}).

The \mdl{diaptr} module computes the online poleward heat and salt transports,
their advective and diffusive components, the meridional stream function, and the zonal mean temperature, salinity and
cell $i$-$k$ surface area.
These are computed by default for the global ocean, but if the file \textit{subbasins.nc} is provided then these
diagnostics are also computed for the Atlantic, Indian, Pacific and Indo-Pacific Ocean basins (defined north of 30\deg{S}).
The \texttt{atlmsk}, \texttt{indmsk} and \texttt{pacmsk} variables in this file are masks corresponding to the
Atlantic, Indian and Pacific basins, while the Indo-Pacific basin mask is computed as the union of the
Indian and Pacific basin masks (\autoref{fig:DIA_mask_subasins}).

The diagnostics available from both modules are listed in the \path{./cfgs/SHARED/field_def_nemo-oce.xml} XIOS
configuration file: \mdl{diaar5} diagnostics are listed under \xmlcode{<!-- variables available with diaar5 -->} comment
headers, while \mdl{diaptr} diagnostics are listed within the \xmlcode{<field_group id="diaptr" >} element.

%% =================================================================================================
\subsection[De-tided diagnostic output from tidal models (\textit{dia25h.F90}, \textit{diadetide.F90})]{De-tided diagnostic output from tidal models (\protect\mdl{dia25h}, \protect\mdl{diadetide})}
\label{sec:DIA_dia25h}

\subsubsection[25-hour averages (\textit{dia25h.F90})]{25-hour averages (\protect\mdl{dia25h})}

Modelled fields of potential temperature, salinity, SSH, velocity, vertical diffusivity and viscosity, TKE, and the mixing length can be approximately de-tided: the signal of the S2 tidal constituent is removed fully, whereas that of the M2 constituent is removed only crudely.
This operation is carried out by the \mdl{dia25h} module by averaging 25 instantaneous values at one-hour intervals that span consecutive periods of 24 model hours (\ie\ the model state at the boundaries of such sampling windows is accounted for in both adjacent averages).
As a consequence of this method of averaging 25 hourly values, the least common multiple of the time-step length \np{rn_Dt}{rn\_Dt} and 3600 s is required to be 3600 s, \ie\ one hour must be an integer multiple of the time-step length.
This diagnostic is activated when any of the available daily 25-hour-average output fields is selected in the XIOS model-output configuration: \texttt{temper25h}, \texttt{salin25h}, \texttt{ssh25h}, \texttt{vozocrtx25h}, \texttt{vomecrty25h}, \texttt{vovecrtz25h}, \texttt{avt25h}, \texttt{avm25h}, \texttt{tke25h}, or \texttt{mxln25h}.
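
The time-step requirement stated above is equivalent to \np{rn_Dt}{rn\_Dt} dividing 3600 s exactly, so that every hourly sampling instant coincides with a model time step. As an illustration with hypothetical time-step lengths (in seconds):
\[
  \mathrm{lcm}(600, 3600) = 3600, \qquad
  \mathrm{lcm}(900, 3600) = 3600, \qquad
  \mathrm{lcm}(1350, 3600) = 10800
\]
Time steps of 600 s and 900 s (6 and 4 steps per hour, respectively) are therefore compatible with the 25-hour averaging, whereas a time step of 1350 s is not.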

\subsubsection[Generic de-tiding of model output (\textit{diadetide.F90}, experimental)]{Generic de-tiding of model output (\protect\mdl{diadetide}, experimental)}

A more generic alternative to the de-tiding option provided by module \mdl{dia25h} is available with module \mdl{diadetide};
this option, however, has been neither fully developed nor thoroughly tested and should therefore be considered an {\em experimental} feature and used with care.
As with the \mdl{dia25h} implementation, the current version of this alternative de-tiding option computes daily averages with an averaging window that corresponds to twice the M2 tidal period;
in contrast to module \mdl{dia25h}, the averaging procedure is more accurate for sufficiently small time steps, the method can be used with an arbitrary time-step length, and it can be applied to analyse arbitrary output fields available for XIOS-based model output.

An example of the application of the de-tiding option provided by module \mdl{diadetide} has been included in the AMM12 reference configuration.
The corresponding activation can be found in the XIOS file group \texttt{diadetide\_files} defined in file \path{cfgs/AMM12/EXPREF/context_nemo.xml}.
Implementation details (in particular the computation of the averaging weights) can be found in module \mdl{diadetide}.

%% =================================================================================================
\subsection{Courant numbers}

Courant numbers provide a theoretical indication of the model's numerical stability.
The advective Courant numbers can be calculated according to

\[
  C_u = |u|\frac{\rdt}{e_{1u}}, \quad C_v = |v|\frac{\rdt}{e_{2v}}, \quad C_w = |w|\frac{\rdt}{e_{3w}}
  % \label{eq:DIA_CFL}
\]

in the zonal, meridional and vertical directions respectively.
The vertical component is included although it is not strictly valid as the vertical velocity is calculated from
the continuity equation rather than as a prognostic variable.
Physically, the Courant number represents the rate at which information is propagated across a grid cell.
Values greater than 1 indicate that information is propagated across more than one grid cell in a single time step.
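
As a worked illustration with hypothetical values (not taken from any particular configuration), consider a time step of $\rdt = 1800$~s, a zonal velocity of 2~m\,s$^{-1}$ on a grid cell with $e_{1u} = 5000$~m, and a vertical velocity of $10^{-3}$~m\,s$^{-1}$ across a layer with $e_{3w} = 1$~m:
\[
  C_u = 2~\mathrm{m\,s^{-1}} \times \frac{1800~\mathrm{s}}{5000~\mathrm{m}} = 0.72, \qquad
  C_w = 10^{-3}~\mathrm{m\,s^{-1}} \times \frac{1800~\mathrm{s}}{1~\mathrm{m}} = 1.8
\]
In this example the horizontal Courant number stays below 1, while the vertical one exceeds it, which shows how modest vertical velocities can produce large values of $C_w$ where the vertical grid spacing is small.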

Courant number diagnostics can be activated by setting \np[=.true.]{ln_diacfl}{ln\_diacfl} in the \nam{ctl}{ctl} namelist.
The global maximum values of $C_u$, $C_v$, $C_w$, and their coordinates, for each time step and for the whole model run,
are written to an ASCII file named \textit{cfl\_diagnostics.ascii}.
The maximum values for the whole model run are also copied to the \textit{ocean.output} file.
Additionally, the depth maxima of $C_u$, $C_v$, and $C_w$ are available as 2D XIOS diagnostics (\texttt{cfl\_cu},
\texttt{cfl\_cv}, and \texttt{cfl\_cw} fields respectively).
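
A minimal namelist fragment that activates these diagnostics might look as follows. Only the relevant parameter is shown, and all other parameters of the \nam{ctl}{ctl} namelist are assumed to keep their default values:

\begin{verbatim}
&namctl
   ln_diacfl = .true.   ! activate the Courant number diagnostics
/
\end{verbatim}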

\subinc{\input{../../global/epilogue}}

\end{document}