I am seeking help regarding a parallelization issue on the Lengau Cluster (CHPC South Africa). I am using Yambo 5.2.0, and although the build is correctly identified as MPI+OpenMP, my jobs remain stuck on a single node even when multiple nodes are requested via PBS.
Technical Context:
The log confirms the build is correct:
Version 5.2.0 Revision 22184 Hash 2871b0cee
MPI+OpenMP+SLK+SLEPC+HDF5_MPI_IO Build
The Issue:
When I submit the script below, the calculation runs on only one of the four requested nodes, and it then crashes from lack of memory when NGsBlkXd is increased to 3 Ry.
Submission Script:
Code: Select all
#PBS -l select=4:ncpus=24:mpiprocs=2:nodetype=haswell_reg
export OMP_NUM_THREADS=12
export OMP_STACKSIZE=1G
export I_MPI_PIN_DOMAIN=omp
# Load the compiler and the Yambo module
module load gcc/9.2.0
module load chpc/yambo/5.2.0/gcc-9.2.0-cpu
# Pick whichever launcher is available (mpiexec.hydra, otherwise mpirun)
MPI_EXEC=$(which mpiexec.hydra || which mpirun)
$MPI_EXEC -np 8 yambo -F RPA_3Ry.in -J RPA_3Ry
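To show what I mean by "stuck on a single node", here is the check I can add at the end of the same job script (a minimal sketch: it only lists the nodes PBS assigned to the job and where each of the 8 MPI ranks lands, with hostname standing in for yambo):
Code: Select all
# Which nodes did PBS actually assign to this job?
echo "Nodes assigned by PBS:"
sort -u $PBS_NODEFILE

# Where do the 8 MPI ranks land? With select=4 and mpiprocs=2 we
# expect 4 distinct hostnames, 2 ranks each.
$MPI_EXEC -np 8 hostname | sort | uniq -c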
My questions:
- On Lengau, should I be using a specific MPI launcher (mpirun vs mpiexec.hydra) so that the PBS node list is actually propagated to the job? (A sketch of what I mean follows after these questions.)
- Has anyone successfully run the chpc/yambo/5.2.0 module across multiple nodes?
- Could the issue come from how the select statement interacts with the Intel/GCC environment on this specific cluster?
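Regarding the first question: the variant I have in mind would pass the PBS host list to the launcher explicitly, roughly as below (a sketch only; -machinefile is the Intel MPI / Hydra spelling, other MPI stacks may expect -hostfile instead, and I am not sure which MPI the module was built against):
Code: Select all
# Variant of the launch line: pass the PBS-generated host list explicitly
$MPI_EXEC -np 8 -machinefile $PBS_NODEFILE yambo -F RPA_3Ry.in -J RPA_3Ry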
Please find below the input file, which I hope helps diagnose the issue:
RPA_3Ry.in: my input file, including the parallelization strategy (CPU/ROLEs).
Code: Select all
#
#
# Version 5.2.0 Revision 22184 Hash (prev commit) 2871b0cee
# Branch is 5.2
# MPI+OpenMP+SLK+SLEPC+HDF5_MPI_IO Build
# http://www.yambo-code.org
#
optics # [R] Linear Response optical properties
infver # [R] Input file variables verbosity
kernel # [R] Kernel
chi # [R][CHI] Dyson equation for Chi.
dipoles # [R] Oscillator strengths (or dipoles)
FFTGvecs= 6 Ry # [FFT] Plane-waves
DIP_Threads=0 # [OPENMP/X] Number of threads for dipoles
X_Threads=0 # [OPENMP/X] Number of threads for response functions
Chimod= "HARTREE" # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
NGsBlkXd= 3 Ry # [Xd] Response block size
% QpntsRXd
1 | 1 | # [Xd] Transferred momenta
%
% BndsRnXd
250 | 820 | # [Xd] Polarization function bands
%
% EnRngeXd
0.00000 | 10.00000 | eV # [Xd] Energy range
%
% DmRngeXd
0.100000 | 0.100000 | eV # [Xd] Damping range
%
ETStpsXd= 1200 # [Xd] Total Energy steps
% LongDrXd
1.000000 | 0.000000 | 0.000000 | # [Xd] [cc] Electric Field
%
CUTGeo= "slab z" # [CUT] Coulomb Cutoff geometry: box/cylinder/sphere/ws/slab X/Y/Z/XY..
% CUTBox
0.000000 | 0.000000 | 10.000000 | # [CUT] [au] Box sides
%
X_all_q_nCPU_LinAlg_INV= 8
X_and_IO_CPU= "1 1 8"
X_all_q_ROLEs= "q k c v"
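For context, the layout I am aiming for with the 8 MPI tasks is sketched below. This is only my reading of the parallel variables, written with the X_and_IO_* pair used consistently; the role string and the rule that the CPU entries must multiply to the number of MPI ranks are assumptions on my part, so please correct them if they do not match this build.
Code: Select all
# Illustration only: the layout I am aiming for with 8 MPI ranks
# (assuming the product of the CPU entries must equal the MPI task count: 1*1*2*2*2 = 8)
X_and_IO_CPU= "1 1 2 2 2"        # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v"      # [PARALLEL] CPU roles
X_and_IO_nCPU_LinAlg_INV= 8      # [PARALLEL] CPUs for Linear Algebra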
Thank you for your time and assistance.
Best regards,
Julien H. OKOUEMBE
Master's Degree, Faculté des Sciences et Techniques
Université Marien NGOUABI, Congo