Static screening error with mpirun

pandachang97
Posts: 7
Joined: Mon Feb 19, 2024 11:06 pm

Static screening error with mpirun

Post by pandachang97 » Tue Feb 20, 2024 2:01 am

Hello Yambo experts:

I hope you are all doing well today. My name is Xin Chang, at UT Austin. I am running Yambo on monolayer MoS2, following the tutorial on the Yambo wiki. I have already initialized the database without any errors. I then generated the input with the command given on the website, yambo -X s -F 01_3D_BSE_screening.in . In the input file, I changed only these two settings:
% BndsRnXs
42 | 65
% LongDrXs
1.000000 | 1.000000 | 0.000000 |
I submitted the job through the PBS job system with one of my scripts, using 1 node with 24 cores: mpirun -np 24 yambo -F 01_3D_BSE_screening.in -J 3D_BSE
Yambo then generates 18 output files in the folder, named l-3D_BSE_screen_dipoles_em1s, l-3D_BSE_screen_dipoles_em1s_01, and so on. Monitoring these outputs, I found that most of them report an error before step [04] Dipoles: "[ERROR] STOP signal received while in[04] Dipoles [ERROR] Writing File ./3D_BSE//ndb.dipoles; Variable NOT DEFINED; Permission denied". Interestingly, the job keeps running. Two of the outputs (I have tried several times, and which ones they are differs from run to run), for example l-3D_BSE_screen_dipoles_em1s_09 and l-3D_BSE_screen_dipoles_em1s_02, keep updating and do not report the error. They proceed to steps 04 and 05 to calculate the dielectric constant. The calculation takes a while to finish (generally about 40 minutes), and I can still run the next step with the files in the 3D_BSE folder.
I am not sure if I have explained this clearly. It seems that the code allows only 2 processes to run the calculation. Please let me know what I should do to avoid this error. Best regards,
Xin
Xin Chang
Postdoc research @UT Austin

Daniele Varsano
Posts: 3816
Joined: Tue Mar 17, 2009 2:23 pm

Re: Static screening error with mpirun

Post by Daniele Varsano » Tue Feb 20, 2024 9:25 am

Dear Xin,

please sign your posts with your name and affiliation; this is a rule of the forum. You can do this once and for all by filling in the signature in your profile.
Next, in order to receive help, it is also useful to post your input and report files so that we can inspect them.
The "permission denied" error seems to point to a problem on the node where you are running the job. Anyway, please post your input and report files and we will have a look.

Moreover:

Code:

% BndsRnXs
42 | 65
% LongDrXs
1.000000 | 1.000000 | 0.000000 |
Here the namelists are not closed; the correct syntax is:

Code:

% BndsRnXs
42 | 65
% 
% LongDrXs
1.000000 | 1.000000 | 0.000000 |
%
I do not know if this is correct in your input file.
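
As a quick way to check this before submitting a job, here is a minimal sketch in Python. It assumes only the block syntax shown above (a block opens with `% Name` and closes with a bare `%`, and anything after `#` is a comment); the function name `find_unclosed_blocks` is hypothetical and is not part of Yambo.

```python
def find_unclosed_blocks(text):
    """Return the names of '% Name' blocks that are not closed by a bare '%' line.

    Assumes the block syntax shown in this thread; this is an illustrative
    helper, not a Yambo tool.
    """
    unclosed = []
    open_name = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line.startswith("%"):
            continue  # ordinary variable or data line
        name = line[1:].strip()
        if name:
            # '% Name' opens a block; a block that is still open was never closed
            if open_name is not None:
                unclosed.append(open_name)
            open_name = name
        else:
            # a bare '%' closes the current block
            open_name = None
    if open_name is not None:
        unclosed.append(open_name)
    return unclosed


broken = "% BndsRnXs\n42 | 65\n% LongDrXs\n1.000000 | 1.000000 | 0.000000 |\n"
print(find_unclosed_blocks(broken))  # ['BndsRnXs', 'LongDrXs']
```

An empty list means every block in the file is closed.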

Side note (not related to the error): BndsRnXs defines a sum over states, and with this range it is certainly not converged. It is safer to include all the occupied bands, while the number of empty bands should be converged.

Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

pandachang97
Posts: 7
Joined: Mon Feb 19, 2024 11:06 pm

Re: Static screening error with mpirun

Post by pandachang97 » Tue Feb 20, 2024 7:02 pm

Hello Daniele,

Thank you for your help. I have just added my name and affiliation to my profile signature.
For the Yambo calculation, I have some updates. First, here is my input file:

Code:

# Version 5.2.1 Revision 22792 Hash (prev commit) ace55e496e          
#                       Branch is                                     
#                  Serial+HDF5_IO Build                               
#               http://www.yambo-code.org                             
#
screen                           # [R] Inverse Dielectric/Response Matrix
em1s                             # [R][Xs] Statically Screened Interaction
dipoles                          # [R] Oscillator strenghts (or dipoles)
Chimod= "HARTREE"                # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
% BndsRnXs
   42 |  65 |                         # [Xs] Polarization function bands
%
NGsBlkXs= 1                RL    # [Xs] Response block size
% LongDrXs
 1.000000 | 1.000000 | 0.000000 |        # [Xs] [cc] Electric Field
%
This input file was generated following the tutorial on the Yambo wiki. When I run it, as I mentioned above, most of the output files report errors and only two or three processes keep running. Then, when I run the next step, the Bethe-Salpeter kernel calculation described in the tutorial, I get an error saying that ndb.dipoles is empty (I no longer have that output, since I have completed the calculations in the meantime). The BS kernel input is shown below:

Code:

# Version 5.2.1 Revision 22792 Hash (prev commit) ace55e496e          
#                        Branch is                                    
#                  Serial+HDF5_IO Build                               
#                http://www.yambo-code.org                            
#
optics                           # [R] Linear Response optical properties
bse                              # [R][BSE] Bethe Salpeter Equation.
dipoles                          # [R] Oscillator strenghts (or dipoles)
BSKmod= "SEX"                    # [BSE] IP/Hartree/HF/ALDA/SEX/BSfxc
BSEmod= "resonant"               # [BSE] resonant/retarded/coupling
BSENGexx=  31271          RL    # [BSK] Exchange components
BSENGBlk= 1                RL    # [BSK] Screened interaction block size [if -1 uses all the G-vectors of W(q,G,Gp)]
#WehCpl                        # [BSK] eh interaction included also in coupling
% BSEQptR
 1 | 1 |                             # [BSK] Transferred momenta range
%
% BSEBands
   42 |  65 |                         # [BSK] Bands range
%
I then went back to the first step and ran the screening calculation on a single core. It finished in a couple of seconds, since the matrix elements had all been generated previously. After that, the BS kernel calculation worked.
Finally, I moved on to step 3, the diagonalisation of the excitonic Hamiltonian. The input file is listed below:

Code:

# Version 5.2.1 Revision 22792 Hash (prev commit) ace55e496e          
#                        Branch is                                    
#                  Serial+HDF5_IO Build                               
#                http://www.yambo-code.org                            
#
bss                              # [R] BSE solver
optics                           # [R] Linear Response optical properties
dipoles                          # [R] Oscillator strenghts (or dipoles)
bse                              # [R][BSE] Bethe Salpeter Equation.
BSKmod= "SEX"                    # [BSE] IP/Hartree/HF/ALDA/SEX/BSfxc
BSEmod= "resonant"               # [BSE] resonant/retarded/coupling
BSSmod= "d"                      # [BSS] (h)aydock/(d)iagonalization/(s)lepc/(i)nversion/(t)ddft
BSENGexx= 31271            RL    # [BSK] Exchange components
BSENGBlk= 1                RL    # [BSK] Screened interaction block size [if -1 uses all the G-vectors of W(q,G,Gp)]
#WehCpl                        # [BSK] eh interaction included also in coupling
KfnQPdb= "none"                  # [EXTQP BSK BSS] Database action
KfnQP_INTERP_NN= 1               # [EXTQP BSK BSS] Interpolation neighbours (NN mode)
KfnQP_INTERP_shells= 20.00000    # [EXTQP BSK BSS] Interpolation shells (BOLTZ mode)
KfnQP_DbGd_INTERP_mode= "NN"     # [EXTQP BSK BSS] Interpolation DbGd mode
% KfnQP_up_E
 0.000000 | 1.000000 | 1.000000 |        # [EXTQP BSK BSS] E parameters UP (c/v) eV|adim|adim
%
KfnQP_up_Z= ( 1.000000 , 0.000000 )      # [EXTQP BSK BSS] Z factor UP (c/v)
KfnQP_up_Wv_E= 0.000000    eV    # [EXTQP BSK BSS] W Energy reference UP (valence)
% KfnQP_up_Wv
 0.000000 | 0.000000 | 0.000000 |        # [EXTQP BSK BSS] W parameters UP (valence) eV| 1|eV^-1
%
KfnQP_up_Wv_dos= 0.000000  eV    # [EXTQP BSK BSS] W dos pre-factor UP (valence)
KfnQP_up_Wc_E= 0.000000    eV    # [EXTQP BSK BSS] W Energy reference UP (conduction)
% KfnQP_up_Wc
 0.000000 | 0.000000 | 0.000000 |        # [EXTQP BSK BSS] W parameters UP (conduction) eV| 1 |eV^-1
%
KfnQP_up_Wc_dos= 0.000000  eV    # [EXTQP BSK BSS] W dos pre-factor UP (conduction)
% KfnQP_dn_E
 0.000000 | 1.000000 | 1.000000 |        # [EXTQP BSK BSS] E parameters DOWN (c/v) eV|adim|adim
%
KfnQP_dn_Z= ( 1.000000 , 0.000000 )      # [EXTQP BSK BSS] Z factor DOWN (c/v)
KfnQP_dn_Wv_E= 0.000000    eV    # [EXTQP BSK BSS] W Energy reference DOWN (valence)
% KfnQP_dn_Wv
 0.000000 | 0.000000 | 0.000000 |        # [EXTQP BSK BSS] W parameters DOWN (valence) eV| 1|eV^-1
%
KfnQP_dn_Wv_dos= 0.000000  eV    # [EXTQP BSK BSS] W dos pre-factor DOWN (valence)
KfnQP_dn_Wc_E= 0.000000    eV    # [EXTQP BSK BSS] W Energy reference DOWN (conduction)
% KfnQP_dn_Wc
 0.000000 | 0.000000 | 0.000000 |        # [EXTQP BSK BSS] W parameters DOWN (conduction) eV| 1 |eV^-1
%
KfnQP_dn_Wc_dos= 0.000000  eV    # [EXTQP BSK BSS] W dos pre-factor DOWN (conduction)
% BSEQptR
 1 | 1 |                             # [BSK] Transferred momenta range
%
% BSEBands
  42 |  65 |                         # [BSK] Bands range
%
% BEnRange
  1.00000 | 6.00000 |         eV    # [BSS] Energy range
%
% BDmRange
 0.100000 | 0.100000 |         eV    # [BSS] Damping range
%
BEnSteps= 200                    # [BSS] Energy steps
% BLongDir
 1.000000 | 1.000000 | 0.000000 |        # [BSS] [cc] Electric Field
%
BSEprop= "abs"                   # [BSS] Can be any among abs/jdos/kerr/magn/dich/photolum/esrt
BSEdips= "none"                  # [BSS] Can be "trace/none" or "xy/xz/yz" to define off-diagonal rotation plane
#WRbsWF                        # [BSS] Write to disk excitonic the WFs
                                                                                   
When I run it in parallel with my script, it fails again without any explanation. Please help me check these files, and feel free to let me know if you have any suggestions. Best regards,
Xin
Xin Chang
Postdoc research @UT Austin

pandachang97
Posts: 7
Joined: Mon Feb 19, 2024 11:06 pm

Re: Static screening error with mpirun

Post by pandachang97 » Tue Feb 20, 2024 7:23 pm

Hello Daniele,
Another update on the Bethe-Salpeter solver (diagonalization). The input file is the same as before, except that I have uncommented the WRbsWF option. The run still fails in the same way:

Code:

mpirun noticed that process rank 2 with PID 0 on node compute exited on signal 15 (Terminated).
I don't know how to avoid this. Please help me with it. Best regards,
Xin
Xin Chang
Postdoc research @UT Austin

Daniele Varsano
Posts: 3816
Joined: Tue Mar 17, 2009 2:23 pm

Re: Static screening error with mpirun

Post by Daniele Varsano » Fri Feb 23, 2024 10:04 am

Dear Xin,

in general, if a run does not complete, it is not recommended to proceed to the next steps, as the databases are either missing or corrupted.
The problem I can see is that you compiled Yambo in serial, as you can see from the header:

Code:

Version 5.2.1 Revision 22792 Hash (prev commit) ace55e496e          
#                       Branch is                                     
#                  Serial+HDF5_IO Build 
so when you launch with mpirun, you are essentially running n independent serial jobs, and they conflict with one another.
Please check your compilation by inspecting the config.log to understand what went wrong with the parallel build. If you cannot solve it, you can post your issue in the compilation subforum, attaching the config.log, and we will have a look at it.
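
The header check described above can also be automated before launching with mpirun. The following is a minimal sketch in Python, assuming the header format shown in this thread, where one line ends in "... Build" and lists the build features joined by "+". The function names `build_string` and `is_serial_build` are hypothetical, and the parallel header used in the example is an assumed string, not one taken from this thread.

```python
import re


def build_string(header_text):
    """Return the token preceding 'Build' in a Yambo-style header, or None."""
    for line in header_text.splitlines():
        match = re.search(r"(\S+)\s+Build\b", line)
        if match:
            return match.group(1)
    return None


def is_serial_build(header_text):
    """True if the header lists 'Serial' among its '+'-separated build features."""
    build = build_string(header_text)
    if build is None:
        return False  # no build line found, so nothing can be concluded
    return "serial" in [token.lower() for token in build.split("+")]


header = """# Version 5.2.1 Revision 22792 Hash (prev commit) ace55e496e
#                       Branch is
#                  Serial+HDF5_IO Build
#               http://www.yambo-code.org
"""
print(build_string(header))     # Serial+HDF5_IO
print(is_serial_build(header))  # True

# Assumed example of a parallel header line, for illustration only
print(is_serial_build("#   MPI+HDF5_MPI_IO Build"))  # False
```

Passing the text of a generated input file to `is_serial_build` lets a job script stop early instead of starting many conflicting serial processes.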

Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
