
Parallelization error when running on multiple nodes

Posted: Wed Jul 12, 2023 9:32 am
by stefan19rkc
Dear Yambo community,

I have been working on some calculations with a very large system and have managed to run them on a single cluster node. When I try running the same calculation on 3 nodes of the same type, I encounter an out-of-memory (OOM) error. Is there a way I could quickly resolve this?

Please find the log, input file, and job launch script attached, and let me know if you have additional questions. Note: I tried running with both MPI (which worked in the first place, on only one node) and OpenMP, but it made no difference.

Kind regards,
Stefan Velja

Re: Parallelization error when running on multiple nodes

Posted: Wed Jul 12, 2023 10:34 am
by Daniele Varsano
Dear Stefan,

Please note that you are requesting 144 tasks, but the input variables assign 288 tasks.
I do not know how many cores you have per node. Of course, you can use fewer of them in order to have more memory available per task.
Indeed, the parallel distribution actually used, as reported in the log file, is not the one specified in the input.

In order to optimize memory distribution among tasks, you can try to set:

Code:

X_and_IO_CPU= "1 1 1 32 9"     # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v"       # [PARALLEL] CPUs roles (q,g,k,c,v)

if you plan to use 288 tasks,

or something like:

Code:

X_and_IO_CPU= "1 1 1 47 4"     # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v"       # [PARALLEL] CPUs roles (q,g,k,c,v)

if you plan to use 188 tasks.
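
As a quick sanity check on the numbers above, the product of the entries in X_and_IO_CPU should match the number of MPI tasks requested in the job script. The following is only a minimal sketch, not part of Yambo: the helper names cpu_product and check_cpu_string are hypothetical, and the CPU strings are simply copied by hand from the input file.

```shell
#!/usr/bin/env bash
# Sketch: verify that a Yambo CPU string multiplies out to the task count.
# cpu_product and check_cpu_string are hypothetical helper names.

# Multiply all whitespace-separated integers in the given string.
cpu_product() {
    local product=1 n
    for n in $1; do
        product=$(( product * n ))
    done
    echo "$product"
}

# check_cpu_string "<cpu string>" <ntasks>
# Returns 0 if the product equals <ntasks>, 1 otherwise.
check_cpu_string() {
    local total
    total=$(cpu_product "$1")
    if [ "$total" -eq "$2" ]; then
        echo "OK: \"$1\" uses $total tasks"
    else
        echo "MISMATCH: \"$1\" uses $total tasks, but $2 were requested"
        return 1
    fi
}

check_cpu_string "1 1 1 32 9" 288
check_cpu_string "1 1 1 47 4" 188
check_cpu_string "1 1 1 32 9" 144 || true   # the 288-vs-144 mismatch noted above
```

Inside a SLURM job, the second argument could be taken from the SLURM_NTASKS environment variable instead of being typed by hand, assuming the job was submitted with an explicit task count.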

Finally, I'm not sure you would gain much from using hyperthreading.

Best,
Daniele

Re: Parallelization error when running on multiple nodes

Posted: Wed Jul 12, 2023 11:23 am
by Nicola Spallanzani
Dear Stefan,
As additional information, the job script contains these two lines:

Code:

#SBATCH --cpus-per-task=2

export OMP_NUM_THREADS=6

They have to be set to the same value. To make this automatic, you can do this:

Code:

#SBATCH --cpus-per-task=2

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

Best,
Nicola
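
A small addition to the OMP_NUM_THREADS suggestion above: SLURM only defines SLURM_CPUS_PER_TASK when --cpus-per-task is actually requested, so a fallback value may help if that line is ever removed from the job script. This is just a sketch; the function name set_omp_threads and the fallback of 1 thread are assumptions, not something taken from the attached script.

```shell
#!/usr/bin/env bash
# Sketch: take the OpenMP thread count from the SLURM allocation.
# set_omp_threads is a hypothetical helper; the fallback of 1 thread
# is an assumption for the case where --cpus-per-task was not given.
set_omp_threads() {
    export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
}

set_omp_threads
echo "Running with OMP_NUM_THREADS=${OMP_NUM_THREADS}"
```

With this in place, the thread count follows --cpus-per-task automatically, and a run without that option falls back to a single thread per task.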