Fix in `zpshde.F90` to prevent results differing between `nn_hls`
This fixes the issue raised by @smasson in his 20/01/22 email:
When activating sn_cfctl%l_prtctl = .true., and running ORCA2_ICE_PICES twin experiments (nn_hls = 1 or 2) on 1 core for 1 time step, I found that the first différence between the 2 mono.output_0000 files is for slp wj (array wslpj in ldfslp).
After some digging, I identified that the problem is coming from an optimization on pgrv in zpshde that came with the last merge related to the tiles:
IF (nn_hls==1) CALL lbc_lnk( 'zpshde', pgru , 'U', -1.0_wp , pgrv , 'V', -1.0_wp ) ! Lateral boundary conditions
A quick and dirty solution is to suppress the "IF (nn_hls==1)”. The real problem is that I don’t understand why suppressing this “IF" solves the problem... Up to know, I was not able to figure out the list of dependancies that would explain why this call to lbc_lnk on pgrv is always needed (it should not! I agree with the use of this IF). I suspect a tricky trick somewhere in the way something is computed on each side of the north pole folding...but I could not find it...
I think the issue is to do with the IF statements dependent on ze3wu
/ze3wv
, which appear several times in the code:
DO_2D( nn_hls-1, nn_hls-1, nn_hls-1, nn_hls-1 ) ! Gradient of density at the last level
iku = mbku(ji,jj)
ikv = mbkv(ji,jj)
ze3wu = e3w(ji+1,jj ,iku,Kmm) - e3w(ji,jj,iku,Kmm)
ze3wv = e3w(ji ,jj+1,ikv,Kmm) - e3w(ji,jj,ikv,Kmm)
IF( ze3wu >= 0._wp ) THEN ; pgru(ji,jj) = umask(ji,jj,1) * ( zri(ji ,jj ) - prd(ji,jj,iku) ) ! i: 1
ELSE ; pgru(ji,jj) = umask(ji,jj,1) * ( prd(ji+1,jj,iku) - zri(ji,jj ) ) ! i: 2
ENDIF
IF( ze3wv >= 0._wp ) THEN ; pgrv(ji,jj) = vmask(ji,jj,1) * ( zrj(ji,jj ) - prd(ji,jj,ikv) ) ! j: 1
ELSE ; pgrv(ji,jj) = vmask(ji,jj,1) * ( prd(ji,jj+1,ikv) - zrj(ji,jj ) ) ! j: 2
ENDIF
END_2D
IF (nn_hls==1) CALL lbc_lnk( 'zpshde', pgru , 'U', -1.0_wp , pgrv , 'V', -1.0_wp ) ! Lateral boundary conditions
pgrv(ji,Nje0+1)
will be calculated in place when nn_hls = 2
and set by lbc_lnk
when nn_hls = 1
. The issue is that, due to the mirroring of the north fold, the value of ze3wv
used to calculate pgrv(ji,Nje0+1)
in the nn_hls = 1
case will have the opposite sign to that in the nn_hls = 2
case. Due to the various IF statements, this means that different paths are taken through the code. Therefore while the value of pgrv(ji,Nje0+1)
is arithmetically correct for both nn_hls
cases, it is not bitwise identical.
The solution is to remove the lbc_lnk
calls for pgru
/pgrv
/pgrui
/pgrvi
, which causes results to change in the nn_hls = 1
case.