chap_LBC.tex

\documentclass[../main/NEMO_manual]{subfiles}

\begin{document}

\chapter{Lateral Boundary Condition (LBC)}
\label{chap:LBC}

\chaptertoc

\paragraph{Changes record} ~\\

{\footnotesize
  \begin{tabularx}{\textwidth}{l||X|X}
    Release & Author(s) & Modifications \\
    \hline
    {\em   5.0} & {\em Simon M{\" u}ller and S{\' e}bastien Masson} & {\em Review and general revision} \\
    {\em   4.2} & {\em Simon M{\" u}ller} & {\em Minor update of \autoref{subsec:LBC_bdy_tides}} \\
    {\em   4.0} & {\em ...} & {\em ...} \\
    {\em   3.6} & {\em ...} & {\em ...} \\
    {\em   3.4} & {\em ...} & {\em ...} \\
    {\em <=3.4} & {\em ...} & {\em ...}
  \end{tabularx}
}

\clearpage

Four types of lateral boundary conditions (LBC) can be specified or configured: boundary conditions at the continental interface, at the boundaries of the model domain, between sub-domains defined for massively parallel processing (MPP), and at open boundaries (BDY) for regional configurations.

%% =================================================================================================
\section[Boundary condition at the continental interface (\forcode{rn_shlat})]{Boundary condition at the continental interface (\protect\np{rn_shlat}{rn\_shlat})}
\label{sec:LBC_coast}

\begin{listing}
  \nlst{namlbc}
  \caption{\forcode{&namlbc}}
  \label{lst:namlbc}
\end{listing}

%The lateral ocean boundary conditions contiguous to coastlines are Neumann conditions for heat and salt
%(no flux across boundaries) and Dirichlet conditions for momentum (ranging from free-slip to "strong" no-slip).
%They are handled automatically by the mask system (see \autoref{subsec:DOM_msk}).

%OPA allows land and topography grid points in the computational domain due to the presence of continents or islands,
%and includes the use of a full or partial step representation of bottom topography.
%The computation is performed over the whole domain, \ie\ we do not try to restrict the computation to ocean-only points.
%This choice has two motivations.
%Firstly, working on ocean only grid points overloads the code and harms the code readability.
%Secondly, and more importantly, it drastically reduces the vector portion of the computation,
%leading to a dramatic increase of CPU time requirement on vector computers.
%The current section describes how the masking affects the computation of the various terms of the equations
%with respect to the boundary condition at solid walls.
%The process of defining which areas are to be masked is described in \autoref{subsec:DOM_msk}.

Options are defined through the \nam{lbc}{lbc} namelist variables.
The discrete representation of a domain with complex boundaries (coastlines and bottom topography) leads to
arrays that include large portions where a computation is not required as the model variables remain at zero.
Nevertheless, vectorial supercomputers are far more efficient when computing over a whole array,
and the readability of the code is greatly improved when computational loops work across whole arrays and boundary conditions are applied through multiplication with a mask array (a mask array contains elements of $1$ at ocean locations and $0$ elsewhere), rather than by specific computations before or after each computational loop.
The simple multiplication of a variable by the relevant mask ensures that it remains zero over land areas;
since the majority of the boundary conditions implement a zero flux across solid boundaries,
they can be simply applied through multiplication with the mask array associated with the grid on which the flux is evaluated.
For example, the diffusive heat flux in the $i$-direction is evaluated at $u$-points.
Evaluating this quantity as,

\[
  % \label{eq:LBC_aaaa}
  \frac{A^{lT} }{e_1 }\frac{\partial T}{\partial i}\equiv \frac{A_u^{lT}
  }{e_{1u} } \; \delta_{i+1 / 2} \left[ T \right]\;\;mask_u
\]
(where $mask_{u}$ corresponds to the mask array at a $u$-point) ensures that the heat flux is zero inside topographic features
and at their boundaries, since mask$_{u}$ is zero at solid boundaries, which in this case are defined at $u$-points
(the normal velocity $u$ also remains zero at such boundaries) (\autoref{fig:LBC_uv}).

\begin{figure}[h]
  \centering
  \includegraphics[width=0.33\textwidth]{LBC_uv}
  \caption[Lateral boundary at $T$-level]{
    Lateral boundary (thick line) at a T-level;
    the velocity normal to the boundary is set to zero.}
  \label{fig:LBC_uv}
\end{figure}

For momentum the situation is a bit more complex as two boundary conditions must be provided along the coast
(one each for the normal and tangential velocities).
The boundary of the ocean in the C-grid is defined by the velocity-faces.
For example, at a given $T$-level,
the lateral boundary (a coastline or an intersection with the bottom topography) is made of
segments joining at $f$-points, and normal velocity points are located between two $f-$point neighbours (\autoref{fig:LBC_uv}).
The boundary condition on the normal velocity (no flux through solid boundaries)
can thus be easily implemented using the mask arrays.
The boundary condition on the tangential velocity requires a more specific treatment.
This boundary condition influences the relative vorticity and momentum diffusive trends,
and is required in order to compute the vorticity at the coast.
Four different types of lateral boundary condition are available,
controlled by the value of the \np{rn_shlat}{rn\_shlat} namelist parameter
(The value of the $mask_{f}$ array along the coastline and bottom topography is set equal to this parameter).
These are:

\begin{figure}[h]
  \centering
  \includegraphics[width=0.33\textwidth]{LBC_shlat}
  \caption[Lateral boundary conditions]{
    Lateral boundary conditions
    (a) free-slip                       (\protect\np[=0]{rn_shlat}{rn\_shlat});
    (b) no-slip                         (\protect\np[=2]{rn_shlat}{rn\_shlat});
    (c) "partial" free-slip (\forcode{0<}\protect\np[<2]{rn_shlat}{rn\_shlat}) and
    (d) "strong" no-slip    (\forcode{2<}\protect\np{rn_shlat}{rn\_shlat}).
    Implied "ghost" velocity inside land area is display in grey.}
  \label{fig:LBC_shlat}
\end{figure}

\begin{description}
  
  \item [free-slip boundary condition ({\np[=0]{rn_shlat}{rn\_shlat}})]\hfill \\
  the tangential velocity at the boundary is equal to the offshore velocity,
  \ie\ the normal derivative of the tangential velocity is zero at the coast,
  so the vorticity array, $mask_{f}$, is set to zero inside and at continental boundaries
  (\autoref{fig:LBC_shlat}-a).
  
  \item [no-slip boundary condition ({\np[=2]{rn_shlat}{rn\_shlat}})]\hfill \\
  the tangential velocity vanishes at continental boundaries.
  Assuming that the tangential velocity decreases linearly from
  the closest ocean velocity grid point to the coastline,
  the normal derivative is evaluated as if the velocities at neighbouring cross-boundary velocity grid-points were of the same magnitude and opposite direction
  (\autoref{fig:LBC_shlat}-b).
  Therefore, the vorticity along the boundary is given by:
  
  \[
    \zeta \equiv \frac{2}{e_{1f} e_{2f}} \left(\delta_{i+1/2} \left[e_{2v} v \right] - \delta_{j+1/2} \left[e_{1u} u \right] \right) \ ,
  \]
  where $u$ and $v$ are masked fields.
  Setting the $mask_{f}$ array to $2$ along the coastline provides a vorticity field computed with
  the no-slip boundary condition as:
  \[
    % \label{eq:LBC_bbbb}
    \zeta \equiv \frac{1}{e_{1f} {\kern 1pt}e_{2f} }\left( {\delta_{i+1/2}
    \left[ {e_{2v} \,v} \right]-\delta_{j+1/2} \left[ {e_{1u} \,u} \right]}
    \right)\;\mbox{mask}_f
  \]
      
  \item ["partial" free-slip boundary condition (0 $<$\np{rn_shlat}{rn\_shlat}$<$ 2)]\hfill \\
  the tangential velocity at
  the coastline is smaller than the offshore velocity, \ie\ there is a lateral friction but
  not strong enough to make the tangential velocity at the coast vanish (\autoref{fig:LBC_shlat}-c).
  This can be selected by providing a value of mask$_{f}$ between $0$ and $2$.
    
  \item ["strong" no-slip boundary condition (2 $<$\np{rn_shlat}{rn\_shlat})]\hfill \\
  the viscous boundary layer is assumed to
  be smaller than half the grid size (\autoref{fig:LBC_shlat}-d).
  The friction is thus larger than in the no-slip case.
      
\end{description}
    
Note that when the bottom topography is entirely represented by the $s$-coordinates (pure $s$-coordinate),
the lateral boundary condition on the tangential velocity is of much less importance as
it is only applied next to the coast where the minimum water depth can be quite shallow.
    
%% =================================================================================================
\section{Model-domain boundary condition}
\label{sec:LBC_jperio}
    
Several options of global model-domain boundary conditions are available:
closed, cyclic east-west, cyclic north-south, cyclic east-west and a north fold, closed and a north fold, or
bi-cyclic east-west and north-south.
The north-fold boundary condition is associated with the 3-pole ORCA meshes and is available in two variants.
The application of these boundary conditions is carried out by calling routine \rou{lbc\_lnk} (module \mdl{lbclnk}).

%% =================================================================================================
\subsection{Closed, cyclic (\forcode{l_Iperio,l_Jperio})}
\label{subsec:LBC_jperio012}

The choice of closed or cyclic model domain boundary condition is controled by
setting the internal code variables \forcode{l_Iperio,l_Jperio} to $.true.$ or $.false.$.
The way these variables are defined will differ accoring to the value of \np{ln_read_cfg}{ln\_read\_cfg} parameter in
namelist \nam{cfg}{cfg}, whose usage is detailed in \autoref{subsec:DOM_config}. If \np[=.false.]{ln_read_cfg}{ln\_read\_cfg}, the user can define \forcode{l_Iperio,l_Jperio} in routine \rou{usrdef\_nam} (module \mdl{usrdef\_nam}). If \np[=.true.]{ln_read_cfg}{ln\_read\_cfg}, \forcode{l_Iperio,l_Jperio} will be defined according to the values of the global attributes (\texttt{Iperio,Jperio} = 0 or 1) in the NetCDF domain configuration file referred to by the \np{cn_domcfg}{cn\_domcfg} parameter in namelist \nam{cfg}{cfg}.

\begin{description}

\item [For a fully closed boundary (\forcode{l_Iperio = .false.,l_Jperio = .false.})], solid walls are imposed at all four model-domain boundaries; the first and last rows and columns of the domain must be set to zero and will be forced to 0 if nedeed.

\item [For a cyclic east-west boundary (\forcode{l_Iperio = .true.,l_Jperio = .false.})], the first and last rows are set to zero (closed); the first \np{nn_hls}{nn\_hls} columns (left halo) are defined with the last \np{nn_hls}{nn\_hls} columns of the orginal global domain (without halos); the last \np{nn_hls}{nn\_hls} columns (right halo) are defined with the first \np{nn_hls}{nn\_hls} columns of the orginal global domain (without halos); flows out of the eastern (western) boundary re-enter through the western (eastern) boundary.

\item [For a cyclic north-south boundary (\forcode{l_Iperio = .false.,l_Jperio = .true.})], the first and last columns are set to zero (closed); analogous to the east-west periodic option, the \np{nn_hls}{nn\_hls} first (last) rows are replicated at the last (first) rows of the original global domain (without halos); flows out of the northern (southern) boundary of the domain re-enter through the southern (northern) boundary. Note that the cyclic north-south boundary requires the f-plan approximation so that f, the coriolis parameter, remains constant in j-direction.

\item [The bi-cyclic east-west and north-south boundary (\forcode{l_Iperio = .true.,l_Jperio = .true.})] combines the two cyclic options.

\end{description}

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.5\textwidth]{LBC_Iperio}
  \caption{Setting of east-west cyclic boundary conditions for \np[=2]{nn_hls}{nn\_hls}. The orginal global domain (without halos) is delimited by the bold lines.}
  \label{fig:LBC_Iperio}
\end{figure}

%% =================================================================================================
\subsection{North-fold (\forcode{l_NFold = .true.}, \forcode{c_NFtype = 'T'} or \forcode{c_NFtype = 'F'})}
\label{subsec:LBC_north_fold}

The north fold boundary condition has been introduced in order to handle the northern boundary of
a three-polar ORCA grid.
When mapping these grids onto a sphere, the last grid row of the original global domain (without halo) folds onto itself to connect the model domain between the two poles in the northern hemisphere (\autoref{fig:CFGS_ORCA_msh}),
and thus requires a specific treatment to implement such folding at the northern edge of the domain.
Further information can be found in \autoref{apdx:NFOLD} and in the \mdl{lbcnfd} module, which contains the subroutine that applies the north-fold boundary condition. 
The \np{ln_nnogather}{ln\_nnogather} parameter in namelist \nam{mpp}{mpp} must be set to $.true.$ to get the best performances. It can be set to $.false.$ for debugging purposes.

%% =================================================================================================
\section[Exchange with neighbouring processes (\textit{lbclnk.F90}, \textit{lib\_mpp.F90})]{Exchange with neighbouring processes (\protect\mdl{lbclnk}, \protect\mdl{lib\_mpp})}
\label{sec:LBC_mpp}

\begin{listing}
  \nlst{nammpp}
  \caption{\forcode{&nammpp}}
  \label{lst:nammpp}
\end{listing}

For massively parallel processing (MPP), a horizontal domain decomposition method is used, see \autoref{fig:LBC_3x3}.
The basic idea of the method is to split the computational domain of a large numerical experiment horizontally into several smaller subdomains and solve the set of equations by addressing independent local problems.
A number of processes, each with its own local memory, that can communicate with each other, are utilised to compute the model equations in parallel, with each process carrying out the computation restricted to an individual subdomain.
The present implementation is largely inspired by Guyon's work [Guyon 1995].

\begin{figure}[h]
  \centering
  \includegraphics[width=0.66\textwidth]{LBC_3x3}
  \caption{Horizontal domain decomposition in $3 \times 3$ subdomains. The thick line on the left panel delimits the original domain (without halos). Subdomains numbering starts at 0 from the bottom-left subdomain. Communications of the subdomain 4 with its neighbours are represented by the blue arrows.}
  \label{fig:LBC_3x3}
\end{figure}

Each subdomain consists of an inner region and a boundary.
The boundary, also called halo, overlaps with neighbouring subdomains as show in the left panel of \autoref{fig:LBC_3x3}.
The halo consist of \np{nn_hls}{nn\_hls} (namelist \nam{mpp}{mpp}) rows or columns at each of the sides of the subdomain. It must be set to $1$ in \NEMO\ version 4.2 and to $2$ in \NEMO\ version 5.0.

The \rou{lbc\_lnk} routine (found in \mdl{lbclnk} module) is used to fill the halo either by communicating with neighbouring subdomains or locally by applying a cyclic or closed boundary condition.
To carry out exchanges with neighbours, the Message Passing Interface (MPI; \href{https://www.mpi-forum.org}) standard is utilised.
When communicating, each process sends to the processors associated with its neighbouring subdomains an update of the values of the points that correspond to the interior part of the overlap (\ie\ the \np{nn_hls}{nn\_hls} innermost of the 2*\np{nn_hls}{nn\_hls} rows and columns of the overlap, see blue arrows in \autoref{fig:LBC_3x3}).
The output file \textit{communication\_report.txt} provides an overview of the number of such exchanges made by individual routines during each time step.

Since \NEMO\ version 4.2, the MPP approach is activated by default.
It can be deactivated (\eg\ if no MPI implementation is available on the target computer) by defining \key{mpi\_off}, and for compatibility with the MPI verstion 2 standard, \key{mpi2} can be defined.
With the \NEMO\ version 4.2 release, a new communication strategy, which preserves performance efficiency by reducing communication time, has been introduced by using the neighbourhood collective communications avaialable with the MPI version 3 standard.
It provides a way to use sub-communicators to perform collective communications among neighbourhoods: a single MPI message needs to be built for all neighbours before calling the collective operation, instead of four different messages.
The new communication approach can be selected by setting \np[=2]{nn_comm}{nn\_comm} in the \nam{mpp}{mpp} namelist record. By default, \np[=1]{nn_comm}{nn\_comm}, which activates the original point-to-point communication of \NEMO\, which has been further optimized in \NEMO\ version 4.2.
Note that other communication strategies are available in the \href{https://sites.nemo-ocean.io/user-guide/tests.html#bench}{BENCH} test case. These communication strategies are using non-blocking point-to-point communications with different approaches to test whether communications have been received: MPI\_Iprobe (\np[=30]{nn_comm}{nn\_comm}), MPI\_Waitany (\np[=31]{nn_comm}{nn\_comm}), MPI\_Waitall (\np[=32]{nn_comm}{nn\_comm}).

In \NEMO\, the decomposition of the model domain results in a regular horizontal grid of subdomains, which can be manually configured with the parameters \np{jpni}{jpni} and \np{jpnj}{jpnj} defined in the \nam{mpp}{mpp} namelist record to select the number of colums along the i-axis and the number of rows along the j-axis, respectively.
If both \np{jpni}{jpni} and \np{jpnj}{jpnj} are less than 1 (default), they will be automatically set during model initialisation to result in an optimal domain decomposition (see below and \citep{Irrmann2022}) into a number of subdomains that corresponds to the number of MPI processes allocated to \NEMO\ when the model is launched (\ie\ \texttt{mpirun -np <n> ./nemo} will automatically result in a domain decomposition into \texttt{<n>} subdomains).

\begin{figure}[h]
  \centering
  \includegraphics[width=0.66\textwidth]{LBC_mppini2}
  \caption[Atlantic domain defined for the CLIPPER projet]{
    Example of Atlantic domain defined for the CLIPPER projet.
    Initial grid is composed of 773 x 1236 horizontal points.
    (a) the domain is split onto 9 $times$ 20 subdomains (jpni=9, jpnj=20). Subdomains with ocean points are numbered first starting from bottom-left. The 52 subdomains that are land areas are next numbered starting also from bottom-left (in yellow).
    (b) The 52 subdomains are eliminated (white rectangles) and
    the resulting number of processors really used during the computation is 128. Note that the subdomains with ocean points have the same number in both cases.}
  \label{fig:LBC_mppini2}
\end{figure}

The \NEMO\ model computes equation terms with the help of mask arrays (0 on land points and 1 on sea points), see \autoref{sec:LBC_coast}.
It is therefore possible that an MPI subdomain contains only land points, see \autoref{fig:LBC_mppini2}.
To save ressources, the model initialisation attempts to suppress as many land subdomains as possible from the computational domain.
For example if $N_{mpi}$ processes are allocated to \NEMO\, the domain decomposisiton results in the equality
\[
  N_{mpi} = jpni \times jpnj - N_{land} + N_{useless}
\]
where $N_{land}$ is the total number of land subdomains in the domain decomposition into $jpni$ by $jpnj$ subdomains.
$N_{useless}$ is the number of land subdomains that are kept in the compuational domain in order to ensure that all $N_{mpi}$ MPI processes are allocated a computational task.
The values of $N_{mpi}$, $jpni$, $jpnj$, $N_{land}$ and $N_{useless}$ are reported in the output file \texttt{ocean.output}.
$N_{useless}$ must, of course, be as small as possible to reduce a wasting of ressources, and therefore a warning is issued in \texttt{ocean.output} if $N_{useless}$ is not zero.
Note that a non-zero value of $N_{useless}$ is usually required when using AGRIF as, up to now, the parent and all child grids must make use of all $N_{mpi}$ processes.

If the domain decomposition is performed automatically, the variant chosen by the model will both minimise the horizontal subdomain size (defined as $max_{all domains}(subdomain size)$) and maximize the number of eliminated land subdomains.
This means that no other decomposition will use fewer processes than $(jpni  \times  jpnj - N_{land})$ while having a smaller subdomain size.
In order to tune $N_{mpi}$ (minimize $N_{useless}$), the model can initially be run with \np{ln_listonly}{ln\_listonly} activated: the model will start the initialisation phase, print a list of optimal decompositions ($N_{mpi}$, \np{jpni}{jpni} and \np{jpnj}{jpnj}) to \texttt{ocean.output}, and then halt.
The maximum value of $N_{mpi}$ tested in this list is $max(N_{MPI\_tasks}, jpni \times jpnj)$.
For example, \NEMO\ can be run on a single process with \np[=.true.]{ln_listonly}{ln\_listonly}, \np[=1000]{jpni}{jpni} and \np[=10]{jpnj}{jpnj} to obtain a list of optimal domain decompositions for the number of processes ranging from 1 to about 10000.

The subdomains are numbered from 0 to $N_{mpi} - 1$:
first, subdomains containing ocean points are numbered from 0 to $jpni * jpnj - N_{land} - 1$, starting with the bottom-left-most subdomain, see \autoref{fig:LBC_mppini2}; then, the remaining $N_{useless}$ land subdomains are numbered, starting from the bottom-left-most land subdomain (yellow numbers in \autoref{fig:LBC_mppini2}a). This implies that, for given values of \np{jpni}{jpni} and \np{jpnj}{jpnj}, the numbers attributed to the ocean subdomains do not vary with $N_{useless}$.

When land processors are eliminated, the value corresponding to these locations in the model output files remain undefined by default, but to avoid missing-number entries in output files, \np{ln_mskland}{ln\_mskland} can be activated.
Note that it may be beneficial to not eliminate land processes when creating the \texttt{meshmask} output file (\ie\ when setting a non-zero value to \np{nn_msh}{nn\_msh}).

\begin{figure}[h]
  \centering
  \includegraphics[width=0.66\textwidth]{LBC_mpp}
  \caption{Positioning of a sub-domain when massively parallel processing is used}
  \label{fig:LBC_mpp}
\end{figure}

With the exception of message passing or synchronisation, each process runs independently and accesses solely its own local memory.
For this reason, the main model dimensions, \forcode{jpi}, \forcode{jpj}, and \forcode{jpk}, correspond to that of the local subdomain.
As detailed in \citet{Irrmann2022}, the value of \forcode{jpi} and \forcode{jpj} can differ between subdomains.
Note that if the configuration requires north folding (\forcode{l_NFold} = .True.), the \forcode{jpj} of the processes involved in the folding are reduced in order to compensate for the extra cost of the north folding operation.

As shown in \autoref{fig:LBC_mpp}, the extents of the subdomain, \forcode{jpi} and \forcode{jpj}, account for both the the inner domain and the halo.
\forcode{Nis0} (\forcode{Njs0}) and \forcode{Nie0} (\forcode{Nje0}) are used to specify the start and end of the inner domain along the i-axis (j-axis).

Several variables are available to convert between indices of the local subdomain and the global domain (with or without halos).
The dimensions of the whole model domain with a halo are referred to as \forcode{jpiglo}, \forcode{jpjglo}, and \forcode{jpk}.
\forcode{Ni0glo} and \forcode{Ni0glo} correspond to the actual domain size as seen in input and output files, without any halo.
The 1-d arrays $mig(1:jpi)$ and $mjg(1:jpj)$, defined in the \rou{init\_locglo} routine (module \mdl{mppini}), can be used to convert local indices to indices of the global domain with halos.
For example, an element of $T_{l}$, a local array (subdomain) corresponds to an element of $T_{g}$, a global array (model domain with halos) through the relationship:
\[
  % \label{eq:LBC_nimpp}
  T_{g} (mig(i),mjg(j),k) = T_{l} (i,j,k),
\]
with $1 \leq i \leq jpi$, $1  \leq j \leq jpj $ , and  $1  \leq k \leq jpk$.
Similarly, the 1-d arrays $mig0(1:jpi)$ and $mjg0(1:jpj)$ can be used to convert local subdomain indices to indices of the global domain without halos.
The 1-d arrays $mi0(1:jpiglo)$, $mi1(1:jpiglo)$m, $mj0(1:jpjglo)$, and $mj1(1:jpjglo)$ have the reverse purpose and can be used as loop loop bounds formulated in terms of global domain indices (see module \mdl{dtastd} for examples).\\

%% =================================================================================================
\section{Unstructured open boundary conditions (BDY)}
\label{sec:LBC_bdy}

\begin{listing}
  \nlst{nambdy}
  \caption{\forcode{&nambdy}}
  \label{lst:nambdy}
\end{listing}

\begin{listing}
  \nlst{nambdy_dta}
  \caption{\forcode{&nambdy_dta}}
  \label{lst:nambdy_dta}
\end{listing}

\begin{listing}
  \nlstlocal{nambdy_index}
  \caption{\forcode{&nambdy & nambdy_index}}
  \label{lst:nambdy_index}
\end{listing}

Options are defined through the \nam{bdy}{bdy} and \nam{bdy_dta}{bdy\_dta} namelist variables.
For regional configurations, the BDY module is the core implementation for the application of open boundary conditions to the
ocean temperature, salinity, and barotropic-baroclinic velocities, as well as to ice-snow concentration, thicknesses, temperatures, salinity, and melt-pond concentration and thickness.

The BDY module was modelled on the OBC module (see \NEMO\ 3.4) and shares many features and
a similar coding structure \citep{chanut_trpt05}.
The specification of the open-boundary location is completely flexible and
facilitates setups from regular boundaries to irregular contours (it includes the possibility to set an open boundary able to follow an isobath).
Boundary data files used with versions of \NEMO\ prior to Version 3.4 may need to be re-ordered to be compatible with the current version.
See \autoref{subsec:LBC_bdy_data} for details.

%% =================================================================================================
\subsection{Namelists}
\label{subsec:LBC_bdy_namelist}

The BDY module is activated by setting \np[=.true.]{ln_bdy}{ln\_bdy} .
It is possible to define more than one boundary ``set'' and apply different boundary conditions to each set.
The number of boundary sets is defined by \np{nb_bdy}{nb\_bdy}.
Each boundary set can be either defined as a series of straight line segments with namelist records (\np[=.false.]{ln_coords_file}{ln\_coords\_file}), in which case an additional record must be provided for each set (see the example namelist records \ref{lst:nambdy_index});
or it can be read in from a file (\np[=.true.]{ln_coords_file}{ln\_coords\_file}), in which case a file with the name \np{cn_coords_file}{cn\_coords\_file} (the default name, ``\textit{coordinates.bdy.nc}'', will be used below to refer to such input files) must be provided for each set (the ``\textit{coordinates.bdy}'' file is analagous to the usual \NEMO\ ``\textit{coordinates.nc}'' file).
For more details of the definition of the boundary geometry and coordinate files see section \autoref{subsec:LBC_bdy_geometry}.

For each boundary set a boundary condition has to be chosen for different field categories: the barotropic solution,
``u2d'' (sea-surface height and barotropic velocities); the baroclinic velocities, ``u3d'';
the active tracers \footnote{The current version of the BDY module does not extend to passive tracers.}, ``tra''; and sea-ice, ``ice''.
For each category of variables an algorithm and the boundary data have to be selected (set with
\np{cn_u2d}{cn\_u2d} and \np{nn_u2d_dta}{nn\_u2d\_dta}, \np{cn_u3d}{cn\_u3d} and \np{nn_u3d_dta}{nn\_u3d\_dta}, \np{cn_tra}{cn\_tra} and \np{nn_tra_dta}{nn\_tra\_dta}, and \np{cn_ice}{cn\_ice} and \np{nn_ice_dta}{nn\_ice\_dta}, respectively).\\

The choice of algorithm is currently:

\begin{description}
\item [\forcode{'none'}]\hfill \\
no boundary condition is applied, the solution will ``see'' land points around the edge of the domain;
\item [\forcode{'specified'}]\hfill \\
the specified boundary condition is applied for baroclinic velocity and tracer variables;
\item [\forcode{'neumann'}]\hfill \\
the values at the boundary are duplicated (no gradient) for baroclinic velocity and tracer variables;
\item [\forcode{'frs'}]\hfill \\
the Flow Relaxation Scheme (FRS) for all variables;
\item [\forcode{'orlanski'}]\hfill \\
the Orlanski radiation scheme (fully oblique) for barotropic, baroclinic and tracer variables;
\item [\forcode{'orlanski_npo'}]\hfill \\
the Orlanski radiation scheme (NPO) for barotropic, baroclinic, and tracer variables; and
\item [\forcode{'flather'}]\hfill \\
the Flather radiation scheme for the barotropic variables.
\end{description}

The boundary data is either set to initial conditions
(\np[=0]{nn_tra_dta}{nn\_tra\_dta}) or forced with external data from a file (\np[=1]{nn_tra_dta}{nn\_tra\_dta}).
If external boundary data is required then the \nam{bdy_dta}{bdy\_dta} namelist must be defined.
One \nam{bdy_dta}{bdy\_dta} namelist is required for each boundary set, adopting the same order of indexes in which the boundary sets are defined in \nam{bdy}{bdy}.
The boundary data is read in from directory \np{cn_dir}{cn\_dir} using module \mdl{fldread} (see \autoref{subsec:SBC_fldread}),
and entries for each required variable in the \nam{bdy_dta}{bdy\_dta} namelist record can be formulated in the corresponding format (filename, file and data frequency, time-interpolation, climatological (time-cyclic) data flag).
For sea-ice salinity, temperatures and melt ponds, constant values instead of input data can be selected by specifying the filename as {'NOT USED'}.\\
In case the ``u3d'' velocity data contains the total velocity (ie, baroclinic and barotropic velocity),
the BDY code can derive baroclinic and barotropic velocities by setting \np[=.true.]{ln_full_vel}{ln\_full\_vel}
For the barotropic solution there is also the option to use tidal harmonic forcing without (\np[=2]{nn_dyn2d_dta}{nn\_dyn2d\_dta}) or in addition to external tidal forcing data (\np[=3]{nn_dyn2d_dta}{nn\_dyn2d\_dta}).\\
If not set to initial conditions, sea-ice salinity, temperatures and melt ponds data at the boundary can either be read in from a file or set to a constant value (\np{rn_ice_sal}{rn\_ice\_sal}, \np{rn_ice_tem}{rn\_ice\_tem}, \np{rn_ice_apnd}{rn\_ice\_apnd}, \np{rn_ice_hpnd}{rn\_ice\_hpnd}, and \np{rn_ice_hlid}{rn\_ice\_hlid});
the ice age is constant and defined by \np{rn_ice_age}{rn\_ice\_age}.


There is also an option to vertically interpolate the open boundary data onto the native grid at run-time (\np[=.true.]{ln_zinterp}{ln\_zinterp}).
For this to be successful the additional depth and scale-factor variables $gdept$, $gdepu$, $gdepv$, $e3t$, $e3u$, and $e3v$ are required to be present in the lateral boundary files, and
the scale factors are used to adjustment to velocity fields due to differences in the total water depths between the two vertical grids.\\

The AMM12 reference configuration, \path{crgs/AMM12/EXPREF/namelist_cfg}, provides an example with one boundary set of external daily data provided in individual files (from a large-scale model):
FRS conditions are applied to temperature and salinity, and
Flather conditions to the barotropic variables; no condition is specified for the baroclinic velocities and sea-ice; tidal harmonic forcing is also used.

%% =================================================================================================
\subsection{Flow relaxation scheme}
\label{subsec:LBC_bdy_FRS_scheme}

The Flow Relaxation Scheme (FRS) \citep{davies_QJRMS76,engedahl_T95},
applies a simple relaxation of the model fields to externally-specified values over
a zone next to the edge of the model domain.
Given a model prognostic variable $\Phi$
\[
  % \label{eq:LBC_bdy_frs1}
  \Phi(d) = \alpha(d)\Phi_{e}(d) + (1-\alpha(d))\Phi_{m}(d)\;\;\;\;\; d=1,N
\]
where $\Phi_{m}$ is the model solution and $\Phi_{e}$ is the specified external field,
$d$ gives the discrete distance from the model boundary and
$\alpha$ is a parameter that varies from $1$ at $d=1$ to a small value at $d=N$.
It can be shown that this scheme is equivalent to adding a relaxation term to
the prognostic equation for $\Phi$ of the form:
\[
  % \label{eq:LBC_bdy_frs2}
  -\frac{1}{\tau}\left(\Phi - \Phi_{e}\right)
\]
where the relaxation time scale $\tau$ is given by a function of $\alpha$ and the model time step $\rdt$:
\[
  % \label{eq:LBC_bdy_frs3}
  \tau = \frac{1-\alpha}{\alpha}  \,\rdt
\]
Thus, the model solution is completely prescribed by the external conditions at the edge of the model domain and
is relaxed towards the external conditions over the rest of the FRS zone.
The application of a relaxation zone helps to prevent spurious reflections of
outgoing signals from the model boundary.

The function $\alpha$ is specified as a $tanh$ function:
\[
  % \label{eq:LBC_bdy_frs4}
  \alpha(d) = 1 - \tanh\left(\frac{d-1}{2}\right),       \quad d=1,N
\]
The width of the FRS zone is specified in the namelist as \np{nn_rimwidth}{nn\_rimwidth}.
This is typically set to a value in the range from 8 to 10 (default).

%% =================================================================================================
\subsection{Flather radiation scheme}
\label{subsec:LBC_bdy_flather_scheme}

The \citet{flather_JPO94} scheme is a radiation condition on the normal,
depth-mean transport across the open boundary.
It takes the form
\begin{equation}
  \label{eq:LBC_bdy_fla1}
  U = U_{e} + \frac{c}{h}\left(\eta - \eta_{e}\right),
\end{equation}
where $U$ is the depth-mean velocity normal to the boundary and $\eta$ is the sea surface height,
both from the model.
The subscript $e$ indicates the corresonding fields from external sources.
The speed of external gravity waves is given by $c = \sqrt{gh}$, and $h$ is the depth of the water column.
The depth-mean normal velocity along the edge of the model domain is set to
the external depth-mean normal velocity,
plus a correction term that allows gravity waves generated internally to exit the model boundary.
Note that the sea-surface height difference in \autoref{eq:LBC_bdy_fla1} is a spatial gradient across the model boundary,
so that $\eta_{e}$ is defined on $T$-grid points with \texttt{nbr=1} (adjacent to the boundary, see \autoref{subsec:LBC_bdy_geometry}) and $\eta$ is defined on the $T$ points with \forcode{nbr=2}.
$U$ and $U_{e}$ are defined on the $u$- or $v$-points with \forcode{nbr=1}, \ie\ between the two $T$-grid points.

%% =================================================================================================
\subsection{Orlanski radiation scheme}
\label{subsec:LBC_bdy_orlanski_scheme}

The Orlanski scheme is based on the algorithm described by \citep{marchesiello.mcwilliams.ea_OM01}, hereafter MMS.

The adaptive Orlanski condition solves a wave plus relaxation equation at the boundary:
\begin{equation}
  \label{eq:LBC_wave_continuous}
  \frac{\partial\phi}{\partial t} + c_x \frac{\partial\phi}{\partial x} + c_y \frac{\partial\phi}{\partial y} =
  -\frac{1}{\tau}(\phi - \phi^{ext})
\end{equation}

where $\phi$ is the model field, $x$ and $y$ refer to the normal and tangential directions to the boundary, respectively, and the phase
velocities are diagnosed from the model fields as:

\begin{equation}
  \label{eq:LBC_cx}
  c_x = -\frac{\partial\phi}{\partial t}\frac{\partial\phi / \partial x}{(\partial\phi /\partial x)^2 + (\partial\phi /\partial y)^2}
\end{equation}
\begin{equation}
  \label{eq:LBC_cy}
  c_y = -\frac{\partial\phi}{\partial t}\frac{\partial\phi / \partial y}{(\partial\phi /\partial x)^2 + (\partial\phi /\partial y)^2}
\end{equation}

(As noted by MMS, this is a circular diagnosis of the phase speeds which only makes sense on a discrete grid).
Equation (\autoref{eq:LBC_wave_continuous}) is defined adaptively depending on the sign of the phase velocity normal to the boundary $c_x$.
For $c_x$ outward, we have

\begin{equation}
\tau = \tau_{out}
\end{equation}

For $c_x$ inward, the radiation equation is not applied:

\begin{equation}
  \label{eq:LBC_tau_in}
  \tau = \tau_{in}\,\,\,;\,\,\, c_x = c_y = 0
\end{equation}

Generally the relaxation time scale at inward propagation points (\np{rn_time_dmp}{rn\_time\_dmp}) is set much shorter than the time scale at outward propagation
points (\np{rn_time_dmp_out}{rn\_time\_dmp\_out}) so that the solution is constrained more strongly by the external data at inward propagation points.
See \autoref{subsec:LBC_bdy_relaxation} for details on the spatial shape of the scaling.\\
The ``normal propagation of oblique radiation'' or NPO approximation (called \forcode{'orlanski_npo'}) involves assuming
that $c_y$ is zero in equation (\autoref{eq:LBC_wave_continuous}), but including
this term in the denominator of equation (\autoref{eq:LBC_cx}). Both versions of the scheme are options in BDY. Equations
(\autoref{eq:LBC_wave_continuous}) - (\autoref{eq:LBC_tau_in}) correspond to equations (13) - (15) and (2) - (3) in MMS.\\

%% =================================================================================================
\subsection{Relaxation at the boundary}
\label{subsec:LBC_bdy_relaxation}

In addition to a specific boundary condition specified as \np{cn_tra}{cn\_tra} and \np{cn_dyn3d}{cn\_dyn3d}, relaxation on baroclinic velocities and tracers are available.
This option can be selected with the namelist parameters \np{ln_tra_dmp}{ln\_tra\_dmp} and \np{ln_dyn3d_dmp}{ln\_dyn3d\_dmp} for each boundary set.

The relaxation time scale value (\np{rn_time_dmp}{rn\_time\_dmp} and \np{rn_time_dmp_out}{rn\_time\_dmp\_out}, $\tau$) are defined at the boundaries itself.
This time scale ($\alpha$) is weighted by the distance ($d$) from the boundary over \np{nn_rimwidth}{nn\_rimwidth} cells ($N$):

\[
  \alpha = \frac{1}{\tau}(\frac{N+1-d}{N})^2,       \quad d=1,N
\]

The same scaling is applied in the Orlanski damping.

%% =================================================================================================
\subsection{Boundary geometry}
\label{subsec:LBC_bdy_geometry}

Each open boundary set is defined as a list of points.
The information is stored in the arrays \forcode{nbi}, \forcode{nbj}, and \forcode{nbr} in the \forcode{idx_bdy} structure.
The \forcode{nbi} and \forcode{nbj} arrays define the local $(i,j)$ indexes of each point in the boundary zone and
the \forcode{nbr} array defines the discrete distance from the boundary: \forcode{nbr=1} means that
the boundary point is next to the edge of the model domain, while \forcode{nbr>1} means that
the boundary point is increasingly further away from the edge of the model domain.
A set of \forcode{nbi}, \forcode{nbj}, and \forcode{nbr} arrays is defined for each of the $T$-, $u$- and $v$-grids.
\autoref{fig:LBC_bdy_geom} shows an example of an irregular boundary.

The boundary geometry for each set may be defined in a namelist \forcode{&nambdy_index} or
by reading in a ``\textit{coordinates.bdy.nc}'' file.

The \forcode{&nambdy_index} namelist record defines straight-line segments for northern, eastern, southern or western boundaries (\np{ctypebdy}{ctypebdy} is set to \texttt{'N'}, \texttt{'E'}, \texttt{'S'}, or \texttt{'W'}, respectively),
and contains the three values \np{nbdybeg}{nbdybeg}, \np{nbdyend}{nbdyend}, and \np{nbdyind}{nbdyind} that specify the first and final index on the axis along, as well as the position on the axis perpendicular to, the segment, respectively.
If parameter \np{nbdyind}{nbdyind} is set to -1, the code will automatically define the boundary segment to span the respective domain extent. These segments define a list of $T$ grid points along the outermost row of the boundary (\forcode{nbr = 1}).
The code deduces the $u$- and $v$-points and also the points for \forcode{nbr > 1} if \np[>1]{nn_rimwidth}{nn\_rimwidth}.

The boundary geometry may alternatively be defined through a ``\textit{coordinates.bdy.nc}'' file.
\autoref{fig:LBC_nc_header} provides an example of the header information of such a file.
The file contains the index arrays for each of the $T$-, $u$- and $v$-grids; prefixes \texttt{nbi}, \texttt{nbj}, and \texttt{nbr} identify the \texttt{i}-, \texttt{j}-, and boundary-width-indices relevant for the global domain (the former two are internally converted to local domain indices), respectively;
these arrays must be in order of increasing \texttt{nbr}.
Such files are also used to generate external boundary data via interpolation, and therefore
it also contains the latitudes and longitudes of each point as shown; horizontal interpolation is, however, not relevant when running the model.

For some choices of an irregular boundary, the model domain may contain ocean areas which
are not part of the computational domain.
For example, if an open boundary is defined along an isobath, say at the shelf break,
then the areas of ocean outside of this boundary will need to be masked out.
Parameter \np{ln_mask_file}{ln\_mask\_file} in the \nam{bdy}{bdy} namelist record can be used to activate the input of a mask file of name \np{cn_mask_file}{cn\_mask\_file}.
Only one mask file can be used, even if multiple boundary sets are defined.

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{LBC_bdy_geom}
  \caption[Geometry of unstructured open boundaries]{Example of the geometry of unstructured open boundaries}
  \label{fig:LBC_bdy_geom}
\end{figure}

%% =================================================================================================
\subsection{Input boundary data files}
\label{subsec:LBC_bdy_data}

The data files contain the data arrays in the order in which the points are defined in the \forcode{nbi} and \forcode{nbj} arrays.
The data arrays have three dimensions:
time;
$xb$, the index of the boundary data point in the horizontal;
and $yb$, a dummy dimension of extent one that enables the file to be read by the standard \NEMO\ I/O routines.
3D fields also have a depth dimension.

From \NEMO\ version 3.4 there are additional restrictions on the order in which boundary points are defined
(and therefore restrictions on the order of the data in the file).
In particular:

\begin{enumerate}
  \item the data points must be in order of increasing \texttt{nbr};
  \item all the data for a particular boundary set must be in the same order.
\end{enumerate}

These restrictions mean that data files previously used with a \NEMO\ model version lower than 3.4 may not work from version 3.4 onwards.

\begin{figure}[!t]
  \centering
  \includegraphics[width=0.66\textwidth]{LBC_nc_header}
  \caption[Header for a \textit{coordinates.bdy.nc} file]{
    Example of the header of a ``\textit{coordinates.bdy.nc}'' file}
  \label{fig:LBC_nc_header}
\end{figure}

%% =================================================================================================
\subsection{Volume correction}
\label{subsec:LBC_bdy_vol_corr}

A volume correction can be applied to ensure that the total volume of a regional model remains constant.
This option can be activated with parameter \np{ln_vol}{ln\_vol} of namelist \nam{bdy}{bdy}; \np[=.false.]{ln_vol}{ln\_vol} (default) indicates that this option is not used.
When active (\np[=.true.]{ln_vol}{ln\_vol}), two options are available with parameter \np{nn_volctl}{nn\_volctl}:
\begin{description}
\item[{\np[=0]{nn_volctl}{nn\_volctl}}]\hfill \\
the normal barotropic velocities around the boundary are adjusted at each timestep so that the integrated volume flow through the boundary is zero;
\item[{\np[=1]{nn_volctl}{nn\_volctl}}]\hfill \\
in addition to the integrated volume flow through the boundary, the change due to the surface freshwater flux is taken into account to adjust the normal barotropic velocities around the boundary to achieve an integrated volume flow of zero at each timestep.
\end{description}
If more than one boundary set is used, the volume correction is
applied to all boundaries simulatneously.

%% =================================================================================================
\subsection{Tidal harmonic forcing}
\label{subsec:LBC_bdy_tides}

\begin{listing}
  \nlst{nambdy_tide}
  \caption{\forcode{&nambdy_tide}}
  \label{lst:nambdy_tide}
\end{listing}

Tidal forcing at open boundaries requires the activation of surface
tides (\ie, in \nam{_tide}{\_tide}, \np[=.true.]{ln_tide}{ln\_tide} with the active tidal
constituents listed in the \np{sn_tide_cnames}{sn\_tide\_cnames} array; see
\autoref{sec:SBC_TDE}). The specific options related to the reading in of
the complex harmonic amplitudes of elevation (SSH) and barotropic
velocity components (u,v) at the open boundaries are defined through the
\nam{bdy_tide}{bdy\_tide} namelist parameters.\par

The tidal harmonic data at open boundaries can be specified in two
different ways, either on a two-dimensional grid covering the entire
model domain or along open boundary segments; these two variants can
be selected by setting \np[=.true.]{ln_bdytide_2ddta}{ln\_bdytide\_2ddta} or
\np[=.false.]{ln_bdytide_2ddta}{ln\_bdytide\_2ddta}, respectively. In either
case, the real and imaginary parts of SSH, u, and v amplitudes associated with
each activated tidal constituent \texttt{<constituent>} have to be provided
separately as fields in input files with names based on
\np[=<input>]{filtide}{filtide}. When two-dimensional data is used:
\begin{description}
  \item[file \textit{<input>\_grid\_T.nc}]\hfill \\
  is expected to contain variables \texttt{<constituent>\_z1} and \texttt{<constituent>\_z2} with the real and imaginary parts of SSH, respectively;
  \item[file \textit{<input>\_grid\_U.nc}]\hfill \\
  variables \texttt{<constituent>\_u1} and \texttt{<constituent>\_u2} with the real and imaginary parts of u, respectively; and
  \item[file \textit{<input>\_grid\_V.nc}]\hfill \\
  variables \texttt{<constituent>\_v1} and \texttt{<constituent>\_v2} with the real and imaginary parts of v, respectively.
\end{description}
When data along open boundary segments is used:
\begin{description}
  \item[file \textit{<input><constituent>\_grid\_T.nc}]\hfill \\
  is expected to contain variables \texttt{z1} and \texttt{z2} (real and imaginary part of SSH);
  \item[file \textit{<input><constituent>\_grid\_U.nc}]\hfill \\
  variables \texttt{u1} and \texttt{u2} (real and imaginary part of u); and
  \item[file \textit{<input><constituent>\_grid\_V.nc}]\hfill \\
  variables \texttt{v1} and \texttt{v2} (real and imaginary part of v).
\end{description}

Note that the barotropic velocity components are assumed to be defined
on the native model grid and should be rotated accordingly when they
are converted from their definition on a different source grid. To do
so, the u, v amplitudes and phases can be converted into tidal
ellipses, the grid rotation added to the ellipse inclination, and then
converted back (care should be taken regarding conventions of the
direction of rotation).

\subinc{\input{../../global/epilogue}}

\end{document}