
strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Wed Apr 07, 2021 2:14 pm
by ljzhou86
I recently used yambo-5.0.1 to run my excitonic calculations using BSE. The PBS submission script is as follows:

"#!/bin/sh
#BS -l nodes=2:ppn=48
#PBS -l walltime=48:00:00
#PBS -q batch
#PBS -V
#PBS -S /bin/bash
module load yambo/5.0.1-hdf-sp-mix
cd $PBS_O_WORKDIR
NP=`cat $PBS_NODEFILE | wc -l`
NN=`cat $PBS_NODEFILE | sort | uniq | tee /tmp/nodes.$$ | wc -l`
cat $PBS_NODEFILE > /tmp/nodefile.$$
mpirun -rdma -machinefile /tmp/nodefile.$$ -np $NP yambo -F ./Inputs/ljbse -J ljbse -C ljbse >$PBS_JOBID.log>log"

Today I encountered a very strange issue: as shown above, I requested 2 nodes (96 cores) for this calculation, yet only 1 node (48 cores) is actually used. In fact, no matter how many nodes I request in the submission script, the final calculation always runs on a single node. Is this related to some setting? The same PBS submission script works well with yambo-4.5.3. This has troubled me very much; could you help me spot and fix it? The configure options are as follows:
Code: Select all

./configure FC=ifort F77=ifort --enable-yaml-output --enable-par-linalg --enable-mpi --enable-open-mp --enable-memory-profile --enable-uspp --enable-netcdf-hdf5 --enable-hdf5-compression --enable-hdf5-p2y-support --enable-hdf5-par-io --enable-logging --enable-memory-profile --enable-time-profile --enable-debug-flags --with-blas-libs="-lmkl_intel_lp64 -lmkl_sequential -lmkl_core" --with-lapack-libs="-lmkl_intel_lp64 -lmkl_sequential -lmkl_core"

Another question: as far as I know, the first step of a BSE calculation is to compute the static inverse dielectric matrix, but I found that the "-b" option has been removed. Does this mean that the new BSE calculation no longer needs the static inverse dielectric matrix? Thanks a lot.

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Wed Apr 07, 2021 3:59 pm
by Daniele Varsano
Dear Dr. Zhou Liu-Jiang,
in order to understand what is going on, could you provide your input/report/log files?

Static screening is now activated with the string (-X s).
In general

Code: Select all

yambo -h
provides help on how to build the inputs.
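For instance, the input for the static screening could be generated along these lines (just a sketch: -X s is the flag mentioned above, while the names after -F/-J/-C are placeholders to be adapted to your own directories):

Code: Select all

yambo -X s -F Inputs/ljscreen -J ljscreen -C ljscreen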

Best,
Daniele

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Wed Apr 07, 2021 4:38 pm
by ljzhou86
Please see the attached report and log files for the static inverse dielectric matrix calculation, where the strange node allocation can be seen. Since I found the same allocation issue in the BSE calculation as well, it seems to be independent of the run level.
05w.zip
Thanks

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Wed Apr 07, 2021 5:33 pm
by Daniele Varsano
Dear Zhou,

as you can see from the report, you are using just 1 MPI process.
Looking at your submission script, it seems there is a "P" missing:
#BS -l nodes=2:ppn=48
instead of
#PBS -l nodes=2:ppn=48
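For reference, once that single character is fixed the header of your script should read (only the resource-request directive changes, everything else stays as in your original script):

Code: Select all

#!/bin/sh
#PBS -l nodes=2:ppn=48
#PBS -l walltime=48:00:00
#PBS -q batch
#PBS -V
#PBS -S /bin/bash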

Best,
Daniele

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Wed Apr 07, 2021 6:38 pm
by ljzhou86
Thanks, Daniele

I have corrected this typo in the submission script. However, I ran into another issue: yambo is not executed across the nodes even after the job is submitted successfully. The job status shows "running", yet no report or log files are generated. I logged into the compute node and could not find any running yambo process when typing "top". Do you know the reason? This strange situation is not observed with yambo 4.5.3. Thanks.

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Fri Apr 09, 2021 8:13 am
by Daniele Varsano
Dear Zhou,

hard to say what is happening. You write:

I checked the job status to be "running", and no report and log files were generated. I entered into the computational node and did not find the running Yambo commands when typing "top"

and this is quite inconsistent; maybe you should talk with your system administrator.
Please do not use inputs generated by the previous version, but use consistent inputs.
Maybe running a simple job interactively may help to understand what is happening.
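Something minimal along these lines, launched by hand from a compute node, should be enough to check whether the executable starts at all (just a sketch; the -F/-J/-C names are placeholders taken from your script, and the task count is arbitrary):

Code: Select all

mpirun -np 4 yambo -F ./Inputs/ljbse -J test_run -C test_run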

Best,
Daniele

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Sun Apr 11, 2021 11:51 pm
by ljzhou86
Dear sir

I am reporting here the new status of our attempts to fix this. I used the input generated by yambo-5.0.1, but the issue still exists.

Our system administrator also has no idea. We even found that the yambo executable could be launched when requesting "#PBS -l nodes=4:ppn=24", but it failed with "#PBS -l nodes=4:ppn=48", "#PBS -l nodes=6:ppn=24" or "#PBS -l nodes=2:ppn=48". It seems to depend on the number of cores and nodes. What's more, the use of the "-J" and/or "-C" options may also influence whether the yambo executable is launched. These issues did not appear in the older yambo versions. Are they related to some internal inconsistency introduced by yambo's parallelization strategy in version 5.0.1? Thanks

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Mon Apr 12, 2021 8:46 am
by Daniele Varsano
Dear Zhou,
when changing the number of cores, change the parallelisation input variables accordingly.
Next, regarding this line of your script:

Code: Select all

mpirun -rdma -machinefile /tmp/nodefile.$$ -np $NP yambo -F ./Inputs/ljbse -J ljbse -C ljbse >$PBS_JOBID.log>log

yambo does not need any redirection of its output, so try to run your job without the >$PBS_JOBID.log>log redirection.
This could be the reason for the problem.
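In other words, the launch line could simply become the following (same flags as in your script, with only the shell redirection dropped; yambo writes its own report and log files, e.g. in the directory given with -C):

Code: Select all

mpirun -rdma -machinefile /tmp/nodefile.$$ -np $NP yambo -F ./Inputs/ljbse -J ljbse -C ljbse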

Best,
Daniele

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Mon Apr 12, 2021 11:20 am
by ljzhou86
Dear Daniele

The redirection ">$PBS_JOBID.log>log" is only used to obtain the job ID; in our tests it is not associated with this issue.

I used the default parallelization strategy, without specifying anything in the input file. That works fine with yambo 4.5.3. Is it not feasible with the new version? And do I have to set the parallelisation manually?

Thanks

Re: strange allocation of nodes on the calculations using yambo-5.0.1

Posted: Mon Apr 12, 2021 11:49 am
by Daniele Varsano
Dear Zhou,
the default parallelisation can also be used in version 5.0.1, although it may lead to non-optimised strategies.
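If you do want to control it explicitly, the parallel variables can be exposed by adding the verbosity switch for parallel options when generating the input (yambo -V par, if I recall the switch correctly) and then filling the CPU/ROLEs strings so that the product of the CPU entries matches the total number of MPI tasks. A rough sketch for a BSE run on 96 tasks could look like the lines below; please check the exact variable names against the input generated by your own yambo-5.0.1 build:

Code: Select all

# 2*6*8 = 96 MPI tasks, distributed over k-points, e/h pairs and transitions
BS_CPU= "2 6 8"
BS_ROLEs= "k eh t"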

Have you tried to run interactively? Your problem seems to be more related to the submission/queue system.

Best,
Daniele