This fixes the issue raised by @smasson in his 20/01/22 email:
When activating sn_cfctl%l_prtctl = .true., and running ORCA2_ICE_PICES twin experiments (nn_hls = 1 or 2) on 1 core for 1 time step, I found that the first différence between the 2 mono.output_0000 files is for slp wj (array wslpj in ldfslp).
After some digging, I identified that the problem is coming from an optimization on pgrv in zpshde that came with the last merge related to the tiles:
IF (nn_hls==1) CALL lbc_lnk( 'zpshde', pgru , 'U', -1.0_wp , pgrv , 'V', -1.0_wp ) ! Lateral boundary conditions
A quick and dirty solution is to suppress the "IF (nn_hls==1)”. The real problem is that I don’t understand why suppressing this “IF" solves the problem... Up to know, I was not able to figure out the list of dependancies that would explain why this call to lbc_lnk on pgrv is always needed (it should not! I agree with the use of this IF). I suspect a tricky trick somewhere in the way something is computed on each side of the north pole folding...but I could not find it...
I think the issue is to do with the IF statements dependent on
ze3wv, which appear several times in the code:
DO_2D( nn_hls-1, nn_hls-1, nn_hls-1, nn_hls-1 ) ! Gradient of density at the last level iku = mbku(ji,jj) ikv = mbkv(ji,jj) ze3wu = e3w(ji+1,jj ,iku,Kmm) - e3w(ji,jj,iku,Kmm) ze3wv = e3w(ji ,jj+1,ikv,Kmm) - e3w(ji,jj,ikv,Kmm) IF( ze3wu >= 0._wp ) THEN ; pgru(ji,jj) = umask(ji,jj,1) * ( zri(ji ,jj ) - prd(ji,jj,iku) ) ! i: 1 ELSE ; pgru(ji,jj) = umask(ji,jj,1) * ( prd(ji+1,jj,iku) - zri(ji,jj ) ) ! i: 2 ENDIF IF( ze3wv >= 0._wp ) THEN ; pgrv(ji,jj) = vmask(ji,jj,1) * ( zrj(ji,jj ) - prd(ji,jj,ikv) ) ! j: 1 ELSE ; pgrv(ji,jj) = vmask(ji,jj,1) * ( prd(ji,jj+1,ikv) - zrj(ji,jj ) ) ! j: 2 ENDIF END_2D IF (nn_hls==1) CALL lbc_lnk( 'zpshde', pgru , 'U', -1.0_wp , pgrv , 'V', -1.0_wp ) ! Lateral boundary conditions
pgrv(ji,Nje0+1) will be calculated in place when
nn_hls = 2 and set by
nn_hls = 1. The issue is that, due to the mirroring of the north fold, the value of
ze3wv used to calculate
pgrv(ji,Nje0+1) in the
nn_hls = 1 case will have the opposite sign to that in the
nn_hls = 2 case. Due to the various IF statements, this means that different paths are taken through the code. Therefore while the value of
pgrv(ji,Nje0+1) is arithmetically correct for both
nn_hls cases, it is not bitwise identical.
The solution is to remove the
lbc_lnk calls for
pgrvi, which causes results to change in the
nn_hls = 1 case.