chap_time_domain.tex

\documentclass[../main/NEMO_manual]{subfiles}

\begin{document}

\chapter{Time Domain}
\label{chap:TD}

\chaptertoc

\paragraph{Changes record} ~\\

{\footnotesize
  \begin{tabularx}{0.5\textwidth}{l||X|X}
    Release          & Author(s)                                                    & 
    Modifications                                                                   \\
    \hline
    {\em        5.0} & {\em J\'{e}r\^{o}me Chanut \newline Sibylle T\'{e}chen\'{e}} & 
    {\em Update \newline Review                                                   } \\
    {\em        4.0} & {\em J\'{e}r\^{o}me Chanut \newline Tim Graham             } & 
    {\em Review \newline Update                                                   } \\
    {\em        3.6} & {\em Christian \'{E}th\'{e}                                } & 
    {\em Update                                                                   } \\
    {\em $\leq$ 3.4} & {\em Gurvan Madec                                          } & 
    {\em First version                                                            } \\
  \end{tabularx}
}

\clearpage

% Missing things:
% - daymod: definition of the time domain (nit000, nitend and the calendar)

\cmtgm{STEVEN :maybe a picture of the directory structure in the introduction which
could be referred to here, would help  ==> to be added}

Having defined the continuous equations in \autoref{chap:MB},
we need now to choose a time discretization,
a key feature of an ocean model as it exerts a strong influence on the structure of the computer code
(\ie\ on its flowchart).
In the present chapter, we provide a general description of the \NEMO\ time stepping strategy and
the consequences for the order in which the equations are solved.

%% =================================================================================================
\section{Time stepping schemes: MLF and RK3}
\label{sec:TD_environment}

One can choose between two time stepping schemes for non-diffusive processes in \NEMO\ . 
It can either be a three-stages, two-time level Runge Kutta scheme (RK3, \cite{wicker_skamarock_MWR02}) 
or a three-time level \textbf{M}odified \textbf{L}eap\textbf{F}rog scheme (MLF, \cite{mesinger.arakawa_bk76} ; \cite{leclair.madec_OM09}). 
MLF being the "historical" time stepping scheme, it is still the default. 
RK3 scheme can be activated by adding the cpp key \key{RK3} at compilation time. 

The RK3 scheme, beside its higher formal accuracy than MLF (3\textsuperscript{rd} order vs less than 2\textsuperscript{nd} order for linear terms), 
has the advantage of requiring less memory storage and being more computationnaly efficient (\autoref{tab:linadv}). 
As such, it will soon become the only time stepping option in \NEMO\ . 
The reader is referred to (\cite{ducousso_JAMES24} and \cite{madec_JAMES24}) for further specific details about the implementation of the RK3 scheme in \NEMO\ . 

In the following, one describes the time evolution of a variable $x$, that stands for $u$, $v$, $T$ or $S$. 
RHS is the \textbf{R}ight-\textbf{H}and-\textbf{S}ide of the corresponding time evolution equation while $\rdt$ is the time step. 
The superscripts indicate the time at which a quantity is evaluated. 
Each term of the RHS is evaluated at a specific time step depending on the physics with which it is associated. 
In both schemes, the final value of $x$, \ie\ at time  $t + \rdt$, is obtained where
implicit vertical diffusion is computed, \ie\ in the \mdl{trazdf} and \mdl{dynzdf} modules.

\begin{table}[h]
\begin{tabular}{cccccc}
\hline
                          & $\displaystyle \alpha_{\textrm C2}^{\max}$ & $\displaystyle \alpha_{\textrm UP3}^{\max}$ & $\displaystyle \alpha_{\textrm C4}^{\max}$ & $\displaystyle \alpha_{\textrm UP5}^{\max}$ & $\displaystyle \alpha_{\textrm Co4}^{\max}$ \\
\hline                          
LFRA ($\gamma= 0.1$) & $\sqrt{\frac{1-\gamma}{1+\gamma}}$ = 0.904534 & 0.47074  & 0.659175 & Unst. & $\sqrt{\frac{3}{11}}$ = 0.522233   \\
RK3 & $\sqrt{3}$ = 1.73205 & 1.62589 & 1.26222 & 1.43498  & 1 \\
\hline
\end{tabular}
\caption{CFL stability criteria $\alpha^{\max}$ for LF (with Asselin parameter $\gamma$) and 
RK3 time discretizations combined with the most commonly used linear advection schemes. 
Specific acronyms : C2 = second-order centered, UP3 = third-order upwind, C4 = fourth-order centered, 
UP5 = fifth-order upwind, Co4 = fourth-order compact. Table extracted from \cite{madec_JAMES24}.}
\label{tab:linadv}
\end{table}

\section{MLF scheme}
\subsection{Non-diffusive part}
\label{sec:TD_leap_frog}
The \textbf{L}eap\textbf{F}rog (LF) time stepping is a three level scheme 
that can be represented as follows for non-diffusive processes:

\begin{equation}
  \label{eq:TD}
  x^{t + \rdt} = x^{t - \rdt} + 2 \, \rdt \ \text{RHS}_x^{t}
\end{equation}

This three level scheme requires three arrays for each prognostic variable.
For each variable $x$ there is $x_b$ (before), $x_n$ (now) and $x_a$.
The third array, although referred to as $x_a$ (after) in the code,
is usually not the variable at the after time step;
but rather it is used to store the time derivative (RHS in \autoref{eq:TD})
prior to time-stepping the equation.

%% =================================================================================================

This scheme is widely used for advection processes in low-viscosity fluids.
It is a time centred scheme, \ie\ the RHS in \autoref{eq:TD} is evaluated at
time step $t$, the now time step.
It may be used for momentum and tracer advection, pressure gradient, and Coriolis terms,
but not for diffusion terms.
It is an efficient method that achieves second-order accuracy with
just one right hand side evaluation per time step.
Moreover, it does not artificially damp linear oscillatory motion
nor does it produce instability by amplifying the oscillations.
These advantages are somewhat diminished by the large phase-speed error of the leapfrog scheme,
and the unsuitability of leapfrog differencing for the representation of diffusion and
Rayleigh damping processes.
However, the scheme allows the coexistence of a numerical and a physical mode due to
its leading third order dispersive error.
In other words a divergence of odd and even time steps may occur.
To prevent it, the leapfrog scheme is often used in association with
a \textbf{R}obert-\textbf{A}sselin time filter (hereafter the LF-RA scheme).
This filter,
first designed by \citet{robert_JMSJ66} and more comprehensively studied by \citet{asselin_MWR72},
is a kind of laplacian diffusion in time that mixes odd and even time steps:

\begin{equation}
  \label{eq:TD_asselin}
  x_F^t = x^t + \gamma \, \lt[ x_F^{t - \rdt} - 2 x^t + x^{t + \rdt} \rt]
\end{equation}

where the subscript $F$ denotes filtered values and $\gamma$ is the Asselin coefficient.
$\gamma$ is initialized as \np{rn_atfp}{rn\_atfp} (namelist parameter).
Its default value is \np[=10.e-3]{rn_atfp}{rn\_atfp} (see \autoref{sec:TD_mLF}),
causing only a weak dissipation of high frequency motions (\citep{farge-coulombier_phd87}).
The addition of a time filter degrades the accuracy of the calculation from second to first order.
However, the second order truncation error is proportional to $\gamma$, which is small compared to 1.
Therefore, the LF-RA is a quasi second order accurate scheme.
The LF-RA scheme is preferred to other time differencing schemes such as
predictor corrector or trapezoidal schemes, because the user has an explicit and simple control of
the magnitude of the time diffusion of the scheme.
When used with the 2$^nd$ order space centred discretisation of the advection terms in
the momentum and tracer equations, LF-RA avoids implicit numerical diffusion:
diffusion is set explicitly by the user through the Robert-Asselin filter parameter and
the viscosity and diffusion coefficients.

%% =================================================================================================
\subsection{Diffusive part --- Forward or backward scheme}
\label{sec:TD_forward_imp}

The leapfrog differencing scheme is unsuitable for
the representation of diffusion and damping processes.
For a tendency $D_x$, representing a diffusion term or a restoring term to a tracer climatology
(when present, see \autoref{sec:TRA_dmp}), a forward time differencing scheme is used :
\[
  %\label{eq:TD_euler}
  x^{t + \rdt} = x^{t - \rdt} + 2 \, \rdt \ D_x^{t - \rdt}
\]

This is diffusive in time and conditionally stable.
The conditions for stability of second and fourth order horizontal diffusion schemes are
\citep{griffies_bk04}:

\begin{equation}
  \label{eq:TD_euler_stability}
  A^h <
  \begin{cases}
    \frac{e^2}{ 8 \, \rdt} & \text{laplacian diffusion} \\
    \frac{e^4}{64 \, \rdt} & \text{bilaplacian diffusion}
  \end{cases}
\end{equation}

where $e$ is the smallest grid size in the two horizontal directions and
$A^h$ is the mixing coefficient.
The linear constraint \autoref{eq:TD_euler_stability} is a necessary condition, but not sufficient.
If it is not satisfied, even mildly, then the model soon becomes wildly unstable.
The instability can be removed by either reducing the length of the time steps or
reducing the mixing coefficient.

For the vertical diffusion terms, a forward time differencing scheme can be used,
but usually the numerical stability condition imposes a strong constraint on the time step.
To overcome the stability constraint, a backward (or implicit) time differencing scheme is used.
This scheme is unconditionally stable but diffusive and can be written as follows:

\begin{equation}
  \label{eq:TD_imp}
  x^{t + \rdt} = x^{t - \rdt} + 2 \, \rdt \ \text{RHS}_x^{t + \rdt}
\end{equation}

\cmtgm{UPDATE the next paragraphs with time varying thickness ...}

This scheme is rather time consuming since it requires a matrix inversion.
For example, the finite difference approximation of the temperature equation is:
\[
  % \label{eq:TD_imp_zdf}
  \frac{T(k)^{t + 1} - T(k)^{t - 1}}{2 \; \rdt}
  \equiv
  \text{RHS} + \frac{1}{e_{3t}} \delta_k \lt[ \frac{A_w^{vT}}{e_{3w} } \delta_{k + 1/2} \lt[ T^{t + 1} \rt] \rt]
\]
where RHS is the right hand side of the equation except for the vertical diffusion term.
We rewrite \autoref{eq:TD_imp} as:

\begin{equation}
  \label{eq:TD_imp_mat}
  -c(k + 1) \; T^{t + 1}(k + 1) + d(k) \; T^{t + 1}(k) - \; c(k) \; T^{t + 1}(k - 1) \equiv b(k)
\end{equation}

where

\[
  c(k) = A_w^{vT} (k) \, / \, e_{3w} (k) \text{,} \quad
  d(k) = e_{3t}   (k)       \, / \, (2 \rdt) + c_k + c_{k + 1} \quad \text{and} \quad
  b(k) = e_{3t}   (k) \; \lt( T^{t - 1}(k) \, / \, (2 \rdt) + \text{RHS} \rt)
\]

\autoref{eq:TD_imp_mat} is a linear system of equations with
an associated matrix which is tridiagonal.
Moreover, $c(k)$ and $d(k)$ are positive and
the diagonal term is greater than the sum of the two extra-diagonal terms,
therefore a special adaptation of the Gauss elimination procedure is used to find the solution
(see for example \citet{richtmyer.morton_bk67}).

%% =================================================================================================
\subsection{Modified LeapFrog -- Robert Asselin filter scheme (LF-RA)}
\label{sec:TD_mLF}

Significant changes have been introduced by \cite{leclair.madec_OM09} in
the LF-RA scheme in order to ensure tracer conservation and to
allow the use of a much smaller value of the Asselin filter parameter.
The modifications affect both the forcing and filtering treatments in the LF-RA scheme.

In a classical LF-RA environment,
the forcing term is centred in time, \ie\ it is time-stepped over a $2 \rdt$ period:
$x^t = x^t + 2 \rdt Q^t$ where $Q$ is the forcing applied to $x$,
and the time filter is given by \autoref{eq:TD_asselin} so that
$Q$ is redistributed over several time step.
In the modified LF-RA environment, these two formulations have been replaced by:

\begin{gather}
  \label{eq:TD_forcing}
  x^{t + \rdt} = x^{t - \rdt} + \rdt \lt( Q^{t - \rdt / 2} + Q^{t + \rdt / 2} \rt)  \\
  \label{eq:TD_RA}
  x_F^t       = x^t + \gamma \, \lt( x_F^{t - \rdt} - 2 x^t + x^{t + \rdt} \rt)
                    - \gamma \, \rdt \, \lt( Q^{t + \rdt / 2} - Q^{t - \rdt / 2} \rt)
\end{gather}

The change in the forcing formulation given by \autoref{eq:TD_forcing}
(see \autoref{fig:TD_MLF_forcing}) has a significant effect:
the forcing term no longer excites the divergence of odd and even time steps
\citep{leclair.madec_OM09}.
% forcing seen by the model....
This property improves the LF-RA scheme in two aspects.
First, the LF-RA can now ensure the local and global conservation of tracers.
Indeed, time filtering is no longer required on the forcing part.
The influence of the Asselin filter on the forcing is explicitly removed by
adding a new term in the filter (last term in \autoref{eq:TD_RA} compared to \autoref{eq:TD_asselin}).
Since the filtering of the forcing was the source of non-conservation in the classical LF-RA scheme,
the modified formulation becomes conservative \citep{leclair.madec_OM09}.
Second, the LF-RA becomes a truly quasi-second order scheme.
Indeed, \autoref{eq:TD_forcing} used in combination with a careful treatment of static instability
(\autoref{subsec:ZDF_evd}) and of the TKE physics (\autoref{subsec:ZDF_tke_ene})
(the two other main sources of time step divergence),
allows a reduction by two orders of magnitude of the Asselin filter parameter.

Note that the forcing is now provided at the middle of a time step:
$Q^{t + \rdt / 2}$ is the forcing applied over the $[t,t + \rdt]$ time interval.
This and the change in the time filter, \autoref{eq:TD_RA},
allows for an exact evaluation of the contribution due to the forcing term between any two time steps,
even if separated by only $\rdt$ since the time filter is no longer applied to the forcing term.

\begin{figure}
  \centering
  \includegraphics[width=0.66\textwidth]{TD_MLF_forcing}
  \caption[Forcing integration methods for modified leapfrog (top and bottom)]{
    Illustration of forcing integration methods.
    (top) ''Traditional'' formulation:
    the forcing is defined at the same time as the variable to which it is applied
    (integer value of the time step index) and it is applied over a $2 \rdt$ period.
    (bottom)  modified formulation:
    the forcing is defined in the middle of the time
    (integer and a half value of the time step index) and
    the mean of two successive forcing values ($n - 1 / 2$, $n + 1 / 2$) is applied over
    a $2 \rdt$ period.}
  \label{fig:TD_MLF_forcing}
\end{figure}

%% =================================================================================================
\section{RK3 scheme}
%% =================================================================================================
\subsection{Non-diffusive part}
%% =================================================================================================
\label{sec:RK3_3stages}
The RK3 time stepping is a three-stages, two-time level scheme that can be represented 
as follows (for non-diffusive processes):

\begin{equation}
  \label{eq:RK3_stages}
    \begin{split}
    x^{t + \rdt / 3} &= x^{t} + \frac{\rdt}{3} \ \text{RHS}_x^{t} \\
    x^{t + \rdt / 2} &= x^{t} + \frac{\rdt}{2} \ \text{RHS}_x^{t + \rdt / 3} \\
    x^{t + \rdt    } &= x^{t} +       \rdt     \ \text{RHS}_x^{t + \rdt / 2}
    \end{split}
\end{equation}

RHS applies here to the Coriolis force, momentum/tracer advection and hydrostatic pressure terms. 
From \autoref{eq:RK3_stages}, it appears that in order to advance from time $t$ to time $t+\rdt$, 
the RK3 scheme requires 3 evaluations of the right-hand-side 
whereas only one evaluation is necessary for the LF scheme. However, it must be clear that it does
not necessarily mean that the time-to-solution will be longer with RK3. Indeed, larger
time-steps are theoretically allowed with RK3 ($\rdt$ can be doubled actually, see \autoref{tab:linadv}) and several costly terms (vertical 
mixing parameterization, rotated diffusion, viscous/diffusive operators) are computed 
only once per time-steps. Overall the transition from LF to RK3 is expected to lead to a more 
efficient code on top of the increased accuracy in time.

%% =================================================================================================
\subsection{Diffusive part --- Forward or backward scheme}
\label{sec:RK3_forward_imp}
Similarly to the LF scheme, tendencies due to lateral diffusion or restoring (\autoref{sec:TRA_dmp}) are applied thanks to a forward in time differencing:
\[
  x^{t + \rdt} = x^{t} + \rdt \ D_x^{t}
\]
These are computed once per time step and added to non-diffusive tendencies at the third stage of \autoref{eq:RK3_stages}. 
Vertical diffusion processes follow, here again, an uncontionnally stable, backward in time differencing, performed at stage 3:   

\begin{equation}
  x^{t + \rdt} = x^{t} + \rdt \ \text{RHS}_x^{t + \rdt}
\end{equation}


%% =================================================================================================
\section{Surface pressure gradient}
\label{sec:TD_spg_ts}

The leapfrog environment supports a centred in time computation of the surface pressure,
\ie\ evaluated at \textit{now} time step. The same applies to RK3, the surface pressure gradient being 
considered in the 3 stages described in \autoref{sec:RK3_3stages}. 
This refers to as the explicit free surface case in the code
(\np[=.true.]{ln_dynspg_exp}{ln\_dynspg\_exp}, which is not available for RK3 at the time of writing).
This choice however imposes a strong constraint on the time step which
should be small enough to resolve the propagation of external gravity waves.
As a matter of fact, one rather uses in a realistic setup
a split-explicit free surface (\np[=.true.]{ln_dynspg_ts}{ln\_dynspg\_ts}) in which
barotropic and baroclinic dynamical equations are solved separately with ad-hoc time steps.
The use of the time-splitting (in combination with non-linear free surface) imposes
some constraints on the design of the overall flowchart,
in particular to ensure exact tracer conservation (see \autoref{fig:TD_TimeStep_flowchart} for MLF scheme) 
and the mode splitting stability (\cite{ducousso_JAMES24}).

Compared to the former use of the filtered free surface in \NEMO\ v3.6 (\citet{roullet.madec_JGR00}),
the use of a split-explicit free surface is advantageous on massively parallel computers.
Indeed, no global computations are anymore required by the elliptic solver which
saves a substantial amount of communication time.
Fast barotropic motions (such as tides) are also simulated with a better accuracy.

%\cmtgm{
\begin{figure}
  \centering
  \includegraphics[width=0.66\textwidth]{TD_TimeStepping_flowchart_v4}
  \caption[Leapfrog time stepping sequence with split-explicit free surface]{
    Sketch of the leapfrog time stepping sequence in \NEMO\ with split-explicit free surface.
    The latter combined with non-linear free surface requires
    the dynamical tendency being updated prior tracers tendency to ensure conservation.
    Note the use of time integrated fluxes issued from the barotropic loop in
    subsequent calculations of tracer advection and in the continuity equation.
    Details about the time-splitting scheme can be found in \autoref{subsec:DYN_spg_ts}.}
  \label{fig:TD_TimeStep_flowchart}
\end{figure}
%}

%% =================================================================================================
\section{Start/Restart strategy}
\label{sec:TD_rst}

\begin{listing}
  \nlst{namrun}
  \caption{\forcode{&namrun}}
  \label{lst:namrun}
\end{listing}

When starting from initial conditions, and specific to the Leapfrog scheme (\ie\ no specific procedure is needed with RK3), 
the first step is a forward step (Euler time integration):
\[
  % \label{eq:TD_DOM_euler}
  x^1 = x^0 + \rdt \ \text{RHS}^0
\]
This is done simply by keeping the leapfrog environment
(\ie\ the \autoref{eq:TD} three level time stepping) but
setting all $x^0$ (\textit{before}) and $x^1$ (\textit{now}) fields equal at the first time step and
using half the value of a leapfrog time step ($2 \rdt$).

It is also possible to restart from a previous computation, by using a restart file and setting \np[=.true.]{ln_restart}{ln\_restart}.
The restart strategy is designed to ensure perfect restartability of the code:
the user should obtain the same results to machine precision either by
running the model for $2N$ time steps in one go,
or by performing two consecutive experiments of $N$ steps with a restart.
This requires saving the necessary time levels (2 in case of LF and 1 with RK3) and many auxiliary data in
the restart files in machine precision.

Note that the time step $\rdt$, is also saved in the restart file.
When restarting, if the time step has been changed, or
one of the prognostic variables at \textit{before} time step is missing,
an Euler time stepping scheme is imposed with Leapfrog.
A forward initial step can still be enforced by the user by
setting the namelist variable \np[=0]{nn_euler}{nn\_euler}.
Other options to control the time integration of the model are defined through
the \nam{run}{run} namelist variables.

The consistency between the provided input restart file(s) (\np{cn_ocerst_in}{cn\_ocerst\_in}) and 
the namelist settings is handled with the parameter \np{nn_rstctl}{nn\_rstctl}, as detailed in \nam{run}{run}.

Restart files are created through the legacy interface, which allows to specify the root name
of the output restart file (\np{cn_ocerst_out}{cn\_ocerst\_out}) and the timestep frequency at which restart file is created (\np{nn_stock}{nn\_stock}).
Similarly the parameter \np{nn_write}{nn\_write} control the creation of output files.
It is also possible to save restart files at specific timesteps during the model execution, 
by setting \np[=.true.]{ln_rst_list}{ln\_rst\_list} and filling up the restart dump times in \np{nn_stocklist}{nn\_stocklist} (always requires 10 values).

Since version 4.2 it is possible to use the XIOS interface to directly write (\np{nn_wxios}{nn\_wxios}) 
and read (\np{ln_xios_read}{ln\_xios\_read}) the restart files, as detailed in \autoref{subsec:XIOS_restarts}.

%% =================================================================================================
\section[Initial state (\textit{istate.F90} and \textit{dtatsd.F90})]{Initial state (\protect\mdl{istate} and \protect\mdl{dtatsd})}
\label{sec:TD_DTA_tsd}

\begin{listing}
  \nlst{namtsd}
  \caption{\forcode{&namtsd}}
  \label{lst:namtsd}
\end{listing}

Basic initial state options are defined in \nam{tsd}{tsd}.
By default, the ocean starts from rest (the velocity field is set to zero) and
the initialization of temperature and salinity fields is controlled through the \np{ln_tsd_init}{ln\_tsd\_init} namelist parameter:

\begin{description}
\item [{\np[=.true.]{ln_tsd_init}{ln\_tsd\_init}}] \hfill \\
    Initial temperature and salinity input fields can be provided on the model grid itself or on their native input-data grids.
    In the latter case, the data will be interpolated on-the-fly both in the horizontal and the vertical to the model grid
    (see \autoref{subsec:SBC_iof}).
    The information relating to the input files is specified in the \np{sn_tem}{sn\_tem} and \np{sn_sal}{sn\_sal} structures.
    The computation is done in the \mdl{dtatsd} module.
\item [{\np[=.false.]{ln_tsd_init}{ln\_tsd\_init}}] \hfill \\
    Initial temperature and salinity fields are set as part of a user-defined configuration (subroutine \rou{usr\_def\_istate} in
    module \mdl{userdef\_istate}- see the \href{https://sites.nemo-ocean.io/user-guide/}{user guide} for more information).
    The default configuration (GYRE, see \autoref{sec:CFGS_gyre}) sets horizontally uniform temperature and salinity fields.
\end{description}

%% =================================================================================================
\section[Adaptive-implicit vertical advection (\forcode{ln_zad_Aimp})]{Adaptive-implicit vertical advection(\protect\np{ln_zad_Aimp}{ln\_zad\_Aimp})}
\label{subsec:TD_zad_aimp}

The adaptive-implicit vertical advection option in \NEMO\ is based on the work of
\citet{shchepetkin_OM15}. In common with most ocean models, the timestep used with \NEMO\
needs to satisfy multiple criteria associated with different physical processes in order
to maintain numerical stability. \citet{shchepetkin_OM15} pointed out that the vertical
CFL criterion is commonly the most limiting. \citet{lemarie.debreu.ea_OM15} examined the
constraints for a range of time and space discretizations and provide the CFL stability
criteria for a range of advection schemes. The values for the Leap-Frog with Robert
asselin filter time-stepping (as used in \NEMO) are reproduced in
\autoref{tab:TD_zad_Aimp_CFLcrit}.

Treating the vertical advection implicitly can avoid these
restrictions but at the cost of large dispersive errors and, possibly, large numerical
viscosity. The adaptive-implicit vertical advection option, enabled by setting
\np[=.true.]{ln_zad_Aimp}{ln\_zad\_Aimp} in \nam{zdf}{zdf}, provides a targetted use of the
implicit scheme only when and where potential breaches of the vertical CFL condition
occur. In many practical applications these events may occur away from the main area of
interest or due to short-lived conditions, such that the extra numerical diffusion or
viscosity does not greatly affect the overall solution. With such applications, this option
should allow much longer model timesteps to be used whilst
retaining the accuracy of the high order explicit schemes over most of the domain.

\begin{table}[htbp]
  \centering
  % \begin{tabular}{cp{70pt}cp{70pt}cp{70pt}cp{70pt}}
  \begin{tabular}{r|ccc}
    \hline
    Spatial discretization  & 2\textsuperscript{nd} order centered &
        3\textsuperscript{rd} order upwind & 4\textsuperscript{th} order compact \\
    Advective CFL criterion &                                              0.904 &
                                     0.472 &                               0.522 \\
    \hline
  \end{tabular}
  \caption[Advective CFL criteria for the leapfrog with Robert Asselin filter time-stepping]{
    The advective CFL criteria for a range of spatial discretizations for
    the leapfrog with Robert Asselin filter time-stepping
    ($\nu=0.1$) as given in \citet{lemarie.debreu.ea_OM15}.}
  \label{tab:TD_zad_Aimp_CFLcrit}
\end{table}

In particular, the advection scheme remains explicit everywhere except where and when
local vertical velocities exceed a threshold set just below the explicit stability limit.
Once the threshold is reached, a tapered transition towards an implicit scheme is used by
partitioning the vertical velocity into a part that can be treated explicitly and any
excess that must be treated implicitly. The partitioning is achieved via a Courant-number
dependent weighting algorithm as described in \citet{shchepetkin_OM15}.

The local cell Courant number ($Cu$) used for this partitioning is:

\begin{equation}
  \label{eq:TD_Eqn_zad_Aimp_Courant}
  \begin{split}
    Cu &= {2 \rdt \over e^n_{3t_{ijk}}} \bigg (\big [ \max(w^n_{ijk},0.0) - \min(w^n_{ijk+1},0.0) \big ]    \\
       &\phantom{=} +\big [ \max(e_{{2_u}ij}e^n_{{3_{u}}ijk}u^n_{ijk},0.0) - \min(e_{{2_u}i-1j}e^n_{{3_{u}}i-1jk}u^n_{i-1jk},0.0) \big ]
                     \big / e_{{1_t}ij}e_{{2_t}ij}            \\
       &\phantom{=} +\big [ \max(e_{{1_v}ij}e^n_{{3_{v}}ijk}v^n_{ijk},0.0) - \min(e_{{1_v}ij-1}e^n_{{3_{v}}ij-1k}v^n_{ij-1k},0.0) \big ]
                     \big / e_{{1_t}ij}e_{{2_t}ij} \bigg )    \\
  \end{split}
\end{equation}

\noindent and the tapering algorithm follows \citet{shchepetkin_OM15} as:

\begin{align}
  \label{eq:TD_Eqn_zad_Aimp_partition}
Cu_{min} &= 0.15 \nonumber \\
Cu_{max} &= 0.3  \nonumber \\
Cu_{cut} &= 2Cu_{max} - Cu_{min} \nonumber \\
Fcu    &= 4Cu_{max}*(Cu_{max}-Cu_{min}) \nonumber \\
\cf &=
     \begin{cases}
        0.0                                                        &\text{if $Cu \leq Cu_{min}$} \\
        (Cu - Cu_{min})^2 / (Fcu +  (Cu - Cu_{min})^2)             &\text{else if $Cu < Cu_{cut}$} \\
        (Cu - Cu_{max}) / Cu                                       &\text{else}
     \end{cases}
\end{align}

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{TD_zad_Aimp_coeff}
  \caption[Partitioning coefficient used to partition vertical velocities into parts]{
    The value of the partitioning coefficient (\cf) used to partition vertical velocities into
    parts to be treated implicitly and explicitly for a range of typical Courant numbers
    (\forcode{ln_zad_Aimp=.true.}).}
  \label{fig:TD_zad_Aimp_coeff}
\end{figure}

\noindent The partitioning coefficient (\cf) is used to determine the part of the vertical
velocity that must be handled implicitly ($w_i$) and to subtract this from the total
vertical velocity ($w_n$) to leave that which can continue to be handled explicitly:

\begin{align}
  \label{eq:TD_Eqn_zad_Aimp_partition2}
    w_{i_{ijk}} &= \cf_{ijk} w_{n_{ijk}}     \nonumber \\
    w_{n_{ijk}} &= (1-\cf_{ijk}) w_{n_{ijk}}
\end{align}

\noindent Note that the coefficient is such that the treatment is never fully implicit;
the three cases from \autoref{eq:TD_Eqn_zad_Aimp_partition} can be considered as:
fully-explicit; mixed explicit/implicit and mostly-implicit, with \cf\
varying as shown in \autoref{fig:TD_zad_Aimp_coeff}. Note with these values
the $Cu_{cut}$ boundary between the mixed implicit-explicit treatment and 'mostly
implicit' is 0.45, which is just below the stability criterion given in
\autoref{tab:TD_zad_Aimp_CFLcrit} for a 3rd order scheme.

The $w_i$ component is added to the implicit solvers for the vertical mixing in
\mdl{dynzdf} and \mdl{trazdf} in a similar way to \citet{shchepetkin_OM15}. This is
sufficient for the flux-limited advection scheme (\np[=.true.]{ln_traadv_mus}{ln\_traadv\_mus}) but further
intervention is required when using the flux-corrected advection scheme (\np[=.true.]{ln_traadv_fct}{ln\_traadv\_fct}).
For this scheme, the implicit upstream fluxes must be added to both the monotonic guess
and to the higher order solution when calculating the antidiffusive fluxes. The implicit
vertical fluxes are then removed since they are added by the implicit solver later on.

The adaptive-implicit vertical advection option is new to \NEMO\ at v4.0 and has yet to be
used in a wide range of simulations. The following test simulation, however, does illustrate
the potential benefits and will hopefully encourage further testing and feedback from users:

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{TD_zad_Aimp_overflow_frames}
  \caption[OVERFLOW: time-series of temperature vertical cross-sections]{
    A time-series of temperature vertical cross-sections for the OVERFLOW test case.
    These results are for the default settings with \np[=10.0]{rn_Dt}{rn\_Dt} and
    without adaptive implicit vertical advection (\np[=.false.]{ln_zad_Aimp}{ln\_zad\_Aimp}).}
  \label{fig:TD_zad_Aimp_overflow_frames}
\end{figure}

%% =================================================================================================
\subsection{Adaptive-implicit vertical advection in the OVERFLOW test-case}

The \href{https://sites.nemo-ocean.io/user-guide/tests.html#overflow}{OVERFLOW test case}
provides a simple illustration of the adaptive-implicit advection in action. The example here differs from the basic test case
by only a few extra physics choices namely:

\begin{forlines}
     ln_dynldf_OFF = .false.
     ln_dynldf_lap = .true.
     ln_dynldf_hor = .true.
     ln_zdfnpc     = .true.
     ln_traadv_fct = .true.
        nn_fct_h   =  2
        nn_fct_v   =  2
\end{forlines}

\noindent which were chosen to provide a slightly more stable and less noisy solution. The
result when using the default value of \np[=10.]{rn_Dt}{rn\_Dt} without adaptive-implicit
vertical velocity is illustrated in \autoref{fig:TD_zad_Aimp_overflow_frames}. The mass of
cold water, initially sitting on the shelf, moves down the slope and forms a
bottom-trapped, dense plume. Even with these extra physics choices the model is close to
stability limits and attempts with \np[=30.]{rn_Dt}{rn\_Dt} will fail after about 5.5 hours
with excessively high horizontal velocities. This time-scale corresponds with the time the
plume reaches the steepest part of the topography and, although detected as a horizontal
CFL breach, the instability originates from a breach of the vertical CFL limit. This is a good
candidate, therefore, for use of the adaptive-implicit vertical advection scheme.

The results with \np[=.true.]{ln_zad_Aimp}{ln\_zad\_Aimp} and a variety of model timesteps
are shown in \autoref{fig:TD_zad_Aimp_overflow_all_rdt} (together with the equivalent
frames from the base run).  In this simple example the use of the adaptive-implicit
vertcal advection scheme has enabled a 12x increase in the model timestep without
significantly altering the solution (although at this extreme the plume is more diffuse
and has not travelled so far).  Notably, the solution with and without the scheme is
slightly different even with \np[=10.]{rn_Dt}{rn\_Dt}, suggesting that the base run was
close enough to instability to trigger the scheme despite completing successfully.
To assist in diagnosing how active the scheme is, in both location and time, the 3D
implicit and explicit components of the vertical velocity are available (when using XIOS, \key{xios})
via the \forcode{wimp} and \forcode{wexp} diagnostics respectively. Likewise, the partitioning coefficient
(\cf) is also available as \forcode{wi_cff}. For a quick oversight of
the scheme's activity, the global maximum values of the absolute implicit component
of the vertical velocity and the partitioning coefficient are written to the netCDF
version of the run statistics file (\textit{run.stat.nc}) if this is active (see
\autoref{sec:MISC_opt} for activation details).

\autoref{fig:TD_zad_Aimp_maxCf} shows examples of the maximum partitioning coefficient for
the various overflow tests.  Note that the adaptive-implicit vertical advection scheme is
active even in the base run with \np[=10.]{rn_Dt}{rn\_Dt}, adding to the evidence that the
test case is close to stability limits even with this value. At the larger timesteps, the
vertical velocity is treated mostly implicitly at some locations throughout the run. The
oscillatory nature of this measure appears to be linked to the progress of the plume front
as each cusp is associated with the location of the maximum shifting to the adjacent cell.
This is illustrated in \autoref{fig:TD_zad_Aimp_maxCf_loc} where the $i$- and $k$- locations of the
maximum have been overlaid for the base run case.

\medskip
\noindent Only limited tests have been performed in more realistic configurations. In the
ORCA2\_ICE\_PISCES reference configuration, the scheme does activate and passes
restartability and reproducibility tests but it is unable to improve the model's stability
enough to allow an increase in the model time-step. A view of the time-series of maximum
partitioning coefficient (not shown here)  suggests that the default time-step of \np[=5400.]{rn_Dt}{rn\_Dt} is
already pushing at stability limits, especially in the initial start-up phase. The
time-series does not, however, exhibit any of the 'cuspiness' found with the overflow
tests.

\medskip
\noindent A short test with an eORCA1 configuration is more promising, since using a
time-step of \np[=3600.]{rn_Dt}{rn\_Dt} remains stable with \np[=.true.]{ln_zad_Aimp}{ln\_zad\_Aimp}
whereas the time-step is limited to \np[=2700.]{rn_Dt}{rn\_Dt} without.

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{TD_zad_Aimp_overflow_all_rdt}
  \caption[OVERFLOW: sample temperature vertical cross-sections from mid- and end-run]{
    Sample temperature vertical cross-sections from mid- and end-run using
    different values for \np{rn_Dt}{rn\_Dt} and with or without adaptive implicit vertical advection.
    Without the adaptive implicit vertical advection,
    only the run with the shortest timestep is able to run to completion.
    Note also that the colour-scale has been chosen to confirm that
    temperatures remain within the original range of 10$^o$ to 20$^o$.}
  \label{fig:TD_zad_Aimp_overflow_all_rdt}
\end{figure}

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{TD_zad_Aimp_maxCf}
  \caption[OVERFLOW: maximum partitioning coefficient during a series of test runs]{
    The maximum partitioning coefficient during a series of test runs with
    increasing model timestep length.
    At the larger timesteps,
    the vertical velocity is treated mostly implicitly at some locations throughout the run.}
  \label{fig:TD_zad_Aimp_maxCf}
\end{figure}

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{TD_zad_Aimp_maxCf_loc}
  \caption[OVERFLOW: maximum partitioning coefficient for the case overlaid]{
    The maximum partitioning coefficient for the \np[=10.]{rn_Dt}{rn\_Dt} case overlaid with
    information on the gridcell $i$- and $k$-locations of the maximum value.}
  \label{fig:TD_zad_Aimp_maxCf_loc}
\end{figure}

\cmtgm{       % add a subsection here

%% =================================================================================================
\subsection{Time domain}
\label{subsec:TD_time}

Options are defined through the\nam{dom}{dom} namelist variables.
 \colorbox{yellow}{add here a few word on nit000 and nitend}

 \colorbox{yellow}{Write documentation on the calendar and the key variable adatrj}

add a description of daymod, and the model calendar (leap-year and co)

}     %% end add

\cmtgm{       % add implicit in vvl case  and Crant-Nicholson scheme

Implicit time stepping in case of variable volume thickness.

Tracer case (NB for momentum in vector invariant form take care!)

\begin{flalign*}
  &\frac{\lt( e_{3t}\,T \rt)_k^{t+1}-\lt( e_{3t}\,T \rt)_k^{t-1}}{2\rdt}
  \equiv \text{RHS}+ \delta_k \lt[ {\frac{A_w^{vt} }{e_{3w}^{t+1} }\delta_{k + 1/2} \lt[ {T^{t+1}} \rt]}
  \rt]      \\
  &\lt( e_{3t}\,T \rt)_k^{t+1}-\lt( e_{3t}\,T \rt)_k^{t-1}
  \equiv {2\rdt} \ \text{RHS}+ {2\rdt} \ \delta_k \lt[ {\frac{A_w^{vt} }{e_{3w}^{t+1} }\delta_{k + 1/2} \lt[ {T^{t+1}} \rt]}
  \rt]      \\
  &\lt( e_{3t}\,T \rt)_k^{t+1}-\lt( e_{3t}\,T \rt)_k^{t-1}
  \equiv 2\rdt \ \text{RHS}
  + 2\rdt \ \lt\{ \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k + 1/2} [ T_{k +1}^{t+1} - T_k      ^{t+1} ]
    - \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k - 1/2} [ T_k       ^{t+1} - T_{k -1}^{t+1} ]  \rt\}     \\
  &\\
  &\lt( e_{3t}\,T \rt)_k^{t+1}
  -  {2\rdt} \           \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k + 1/2}                  T_{k +1}^{t+1}
  + {2\rdt} \ \lt\{  \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k + 1/2}
    +  \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k - 1/2}     \rt\}   T_{k    }^{t+1}
  -  {2\rdt} \           \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k - 1/2}                  T_{k -1}^{t+1}      \\
  &\equiv \lt( e_{3t}\,T \rt)_k^{t-1} + {2\rdt} \ \text{RHS}    \\
  %
\end{flalign*}

\begin{flalign*}
  \allowdisplaybreaks
  \intertext{ Tracer case }
  %
  &  \qquad \qquad  \quad   -  {2\rdt}                  \ \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k + 1/2}
  \qquad \qquad \qquad  \qquad  T_{k +1}^{t+1}   \\
  &+ {2\rdt} \ \biggl\{  (e_{3t})_{k   }^{t+1}  \bigg. +    \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k + 1/2}
  +   \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k - 1/2} \bigg. \biggr\}  \ \ \ T_{k   }^{t+1}  &&\\
  & \qquad \qquad  \qquad \qquad \qquad \quad \ \ -  {2\rdt} \                          \lt[ \frac{A_w^{vt}}{e_{3w}^{t+1}} \rt]_{k - 1/2}                          \quad \ \ T_{k -1}^{t+1}
  \ \equiv \ \lt( e_{3t}\,T \rt)_k^{t-1} + {2\rdt} \ \text{RHS}  \\
  %
\end{flalign*}

\begin{flalign*}
  \allowdisplaybreaks
  \intertext{ Tracer content case }
  %
  & -  {2\rdt} \              & \frac{(A_w^{vt})_{k + 1/2}} {(e_{3w})_{k + 1/2}^{t+1}\;(e_{3t})_{k +1}^{t+1}}  && \  \lt( e_{3t}\,T \rt)_{k +1}^{t+1}   &\\
  & + {2\rdt} \ \lt[ 1  \rt.+ & \frac{(A_w^{vt})_{k + 1/2}} {(e_{3w})_{k + 1/2}^{t+1}\;(e_{3t})_k^{t+1}}
  + & \frac{(A_w^{vt})_{k - 1/2}} {(e_{3w})_{k - 1/2}^{t+1}\;(e_{3t})_k^{t+1}}  \lt.  \rt]  & \lt( e_{3t}\,T \rt)_{k   }^{t+1}  &\\
  & -  {2\rdt} \               & \frac{(A_w^{vt})_{k - 1/2}} {(e_{3w})_{k - 1/2}^{t+1}\;(e_{3t})_{k -1}^{t+1}}     &\  \lt( e_{3t}\,T \rt)_{k -1}^{t+1}
  \equiv \lt( e_{3t}\,T \rt)_k^{t-1} + {2\rdt} \ \text{RHS}  &
\end{flalign*}

}

\subinc{\input{../../global/epilogue}}

\end{document}