Removing isolated lbc_lnk calls from the ice and ocean code
A number of routines still include a few MPI communications (lbc_lnk), namely:
- stprk3_stg
- icedyn
- bdyice
- zdfphy
- ldfslp
- sbcssm
- divhor
- traadv_fct
It appears that these calls can account for more than 50% of the total cost of lbc_lnk, even though the number of communications is dominated (by far) by the 2D module and the ice rheology. This suggests that 3D communications are considerably more costly than 2D ones (first tests suggest a factor of 3), and that isolated lbc_lnk calls can be costly, especially when they gather multiple arrays. An ORCA1 simulation with 280 cores (20x20 MPI decomposition) shows that 37% of the time is spent in lbc_lnk: about 17% is due to the 2D module and ice rheology, and the rest (20%) to "isolated" lbc_lnk calls.
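As a quick check of the arithmetic behind the "more than 50%" statement, the short sketch below recomputes the shares from the ORCA1 figures quoted above. The variable and function names are purely illustrative and are not part of NEMO. The only inputs are the percentages given in the text.

```python
# Figures quoted above for the ORCA1 run, in percent of total run time.
total_lbc_lnk = 37.0   # all lbc_lnk calls
two_d_and_ice = 17.0   # 2D module + ice rheology
isolated = total_lbc_lnk - two_d_and_ice  # remaining "isolated" calls


def share_of_lbc_lnk(part, total=total_lbc_lnk):
    """Return `part` as a percentage of the total lbc_lnk cost."""
    return 100.0 * part / total


isolated_share = share_of_lbc_lnk(isolated)
two_d_share = share_of_lbc_lnk(two_d_and_ice)

print(f"isolated lbc_lnk:  {isolated:.0f}% of run time, "
      f"{isolated_share:.0f}% of lbc_lnk cost")
print(f"2D + ice rheology: {two_d_and_ice:.0f}% of run time, "
      f"{two_d_share:.0f}% of lbc_lnk cost")
```

With these inputs the isolated calls come out at roughly 54% of the lbc_lnk cost, which is consistent with the statement above.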
Here we propose to remove as many unnecessary MPI communications as possible in order to improve scalability.