Reproducibility of the OBS global grid search
In the context of issue #126 (closed), it was found that the observation-operator implementation (OBS) would fail the SETTE reproducibility test if SETTE compared OBS-specific model output (see !194 (comment 5931)). As a result, !194 (merged) includes a SETTE extension for the comparison of OBS-specific output, which exposes the ORCA2_ICE_OBS failure, but the actual reproducibility failure has yet to be resolved in the corresponding development branch. Further, the same issue has also been found to be present in branch_4.2.
Some associations of observation locations with model grid locations in the vicinity of land-only subdomains have been found to differ between the ORCA2_ICE_OBS SETTE runs REPRO_4_8 and REPRO_8_4 (two land-only subdomains are present in the REPRO_8_4 configuration; they are, however, not suppressed). As halo exchanges that solely affect land points are suppressed (and, by extension, exchanges with land-only subdomains), coordinate values in the halo regions of arrays glam{t,u,v,f} and gphi{t,u,v,f} are unreliable. However, the OBS grid-search algorithm subjects such values to a global reduction operation (calls of subroutine mpp_global_max in module obs_grid). Further, the current implementation of the OBS global grid search (option ln_grid_global) appears to be incompatible with the suppression of land subdomains: running REPRO_8_4 on 30 instead of 32 processes results in a failure of the model run.
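The mechanism described above can be illustrated with a toy, single-process sketch (not NEMO code). It assumes a hypothetical one-dimensional grid of six points split into two subdomains, an unexchanged halo point that holds the value 0.0, and an elementwise MAX that stands in for mpp_global_max; the names halo_max_demo, reduce_max and zstale are invented for this illustration.

```fortran
PROGRAM halo_max_demo
   ! Toy illustration (not NEMO code): a max reduction over per-subdomain
   ! contributions that include an unexchanged halo value.
   IMPLICIT NONE
   INTEGER , PARAMETER ::   wp     = SELECTED_REAL_KIND(12, 307)
   INTEGER , PARAMETER ::   jpiglo = 6
   REAL(wp), PARAMETER ::   zstale = 0.0_wp   ! assumed content of an unexchanged halo point
   REAL(wp) ::   ztrue(jpiglo), zdeca(jpiglo), zdecb(jpiglo)

   ! Hypothetical "true" longitudes, all negative so that 0.0 wins a max
   ztrue(:) = (/ -30.0_wp, -25.0_wp, -20.0_wp, -15.0_wp, -10.0_wp, -5.0_wp /)

   ! Decomposition A: subdomain 1 owns points 1-3, its east halo is point 4
   zdeca(:) = reduce_max( ztrue, 3 )
   ! Decomposition B: subdomain 1 owns points 1-2, its east halo is point 3
   zdecb(:) = reduce_max( ztrue, 2 )

   PRINT '(A,6F8.1)', 'true            :', ztrue
   PRINT '(A,6F8.1)', 'decomposition A :', zdeca
   PRINT '(A,6F8.1)', 'decomposition B :', zdecb

   ! The reduced arrays differ because the stale value lands on different points
   IF( ALL( zdeca(:) == zdecb(:) ) )   ERROR STOP 'expected decomposition-dependent result'

CONTAINS

   FUNCTION reduce_max( ptrue, kend ) RESULT( pglo )
      REAL(wp), INTENT(in) ::   ptrue(:)
      INTEGER , INTENT(in) ::   kend                 ! last interior point of subdomain 1
      REAL(wp) ::   pglo(SIZE(ptrue))
      REAL(wp) ::   zloc1(SIZE(ptrue)), zloc2(SIZE(ptrue))
      !
      zloc1(:) = -HUGE(1.0_wp)   ;   zloc2(:) = -HUGE(1.0_wp)
      zloc1(1:kend)  = ptrue(1:kend)                 ! interior of subdomain 1
      zloc1(kend+1)  = zstale                        ! east halo, never exchanged
      zloc2(kend+1:) = ptrue(kend+1:)                ! interior of subdomain 2
      zloc2(kend)    = ptrue(kend)                   ! west halo, correctly exchanged
      pglo(:) = MAX( zloc1(:), zloc2(:) )            ! stands in for mpp_global_max
   END FUNCTION reduce_max

END PROGRAM halo_max_demo
```

In this sketch the stale value replaces the true coordinate at point 4 in decomposition A and at point 3 in decomposition B, so the reduced coordinate array, and with it any grid search based on it, depends on the domain decomposition.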
In branch_4.2, the global reduction used to build the global coordinate array could give precedence to interior-domain values over halo values, which avoids the use of coordinate values from halo regions in the grid search in many cases (although not all), for example with the modification
--- a/src/OCE/OBS/obs_grid.F90
+++ b/src/OCE/OBS/obs_grid.F90
@@ -285,9 +285,21 @@ CONTAINS
                zmskg(mig(ji),mjg(jj)) = tmask(ji,jj,1)
             END DO
          END DO
+         DO jj = 1+nn_hls, jpj-nn_hls
+            DO ji = 1+nn_hls, jpi-nn_hls
+               zlamg(mig(ji),mjg(jj)) = glamt(ji,jj) + 1000000.0_wp
+               zphig(mig(ji),mjg(jj)) = gphit(ji,jj) + 1000000.0_wp
+               zmskg(mig(ji),mjg(jj)) = tmask(ji,jj,1) + 1000000.0_wp
+            END DO
+         END DO
          CALL mpp_global_max( zlamg )
          CALL mpp_global_max( zphig )
          CALL mpp_global_max( zmskg )
+         WHERE( zmskg(:,:) >= 1000000.0_wp )
+            zlamg(:,:) = zlamg(:,:) - 1000000.0_wp
+            zphig(:,:) = zphig(:,:) - 1000000.0_wp
+            zmskg(:,:) = zmskg(:,:) - 1000000.0_wp
+         END WHERE
          ! Add various grids here.
          DO jj = 1, jlat
and a corresponding modification in file obs_grd_bruteforce.h90, which resolves the ORCA2_ICE_OBS reproducibility failure. Further, it is recommended to include a model stop when OBS is active with the global grid-search option (ln_grid_global) while land subdomains are suppressed:
--- a/src/OCE/OBS/diaobs.F90
+++ b/src/OCE/OBS/diaobs.F90
@@ -426,6 +426,9 @@ CONTAINS
       IF( ln_grid_global ) THEN
+         IF( jpnij < jpni * jpnj ) THEN
+            CALL ctl_stop( 'STOP', 'dia_obs_init: ln_grid_global=T is incompatible with suppressed land subdomains' )
+         END IF
          CALL ctl_warn( 'dia_obs_init: ln_grid_global=T may cause memory issues when used with a large number of processors' )
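The offset-based precedence used in the obs_grid.F90 modification can be tried out in isolation with a toy, single-process sketch (not NEMO code). It assumes a hypothetical one-dimensional grid of six ocean points split into two subdomains, an unexchanged halo point that holds 0.0, and an elementwise MAX in place of mpp_global_max; the program name and the variables zoff, zstale, zlam1, zlam2, zmsk1, zmsk2 and jend are invented for this illustration.

```fortran
PROGRAM offset_max_demo
   ! Toy illustration (not NEMO code): interior values carry an offset so that
   ! they take precedence over halo values in a max reduction.
   IMPLICIT NONE
   INTEGER , PARAMETER ::   wp     = SELECTED_REAL_KIND(12, 307)
   INTEGER , PARAMETER ::   jpiglo = 6
   REAL(wp), PARAMETER ::   zoff   = 1000000.0_wp   ! offset value used in the modification
   REAL(wp), PARAMETER ::   zstale = 0.0_wp         ! assumed content of an unexchanged halo point
   REAL(wp) ::   ztrue(jpiglo)
   REAL(wp) ::   zlam1(jpiglo), zlam2(jpiglo), zlamg(jpiglo)
   REAL(wp) ::   zmsk1(jpiglo), zmsk2(jpiglo), zmskg(jpiglo)
   INTEGER  ::   jend                               ! last interior point of subdomain 1

   ! Hypothetical longitudes; exactly representable, so adding and removing
   ! the offset reproduces them bit for bit in this example
   ztrue(:) = (/ -30.0_wp, -25.0_wp, -20.0_wp, -15.0_wp, -10.0_wp, -5.0_wp /)

   DO jend = 2, 3                                   ! two different decompositions
      zlam1(:) = -HUGE(1.0_wp)   ;   zlam2(:) = -HUGE(1.0_wp)
      zmsk1(:) = -HUGE(1.0_wp)   ;   zmsk2(:) = -HUGE(1.0_wp)
      ! Halo contributions are entered without offset
      zlam1(jend+1) = zstale        ;   zmsk1(jend+1) = zstale   ! never exchanged
      zlam2(jend)   = ztrue(jend)   ;   zmsk2(jend)   = 1.0_wp   ! correctly exchanged
      ! Interior contributions carry the offset
      zlam1(1:jend)  = ztrue(1:jend)  + zoff   ;   zmsk1(1:jend)  = 1.0_wp + zoff
      zlam2(jend+1:) = ztrue(jend+1:) + zoff   ;   zmsk2(jend+1:) = 1.0_wp + zoff
      ! Elementwise MAX stands in for mpp_global_max
      zlamg(:) = MAX( zlam1(:), zlam2(:) )
      zmskg(:) = MAX( zmsk1(:), zmsk2(:) )
      ! Remove the offset wherever an interior value has won the reduction
      WHERE( zmskg(:) >= zoff )
         zlamg(:) = zlamg(:) - zoff
         zmskg(:) = zmskg(:) - zoff
      END WHERE
      PRINT '(A,I2,A,6F8.1)', 'jend =', jend, ' :', zlamg
      IF( ANY( zlamg(:) /= ztrue(:) ) )   ERROR STOP 'halo value leaked into the result'
   END DO

END PROGRAM offset_max_demo
```

In this sketch both decompositions return the true coordinates, because every point is interior to one of the contributing subdomains. A point that is interior to no contributing subdomain would still receive a halo value, which is presumably why the approach covers many but not all cases.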
Corresponding fixes could be applied as part of !194 (merged) to resolve this issue in main.