NaNf in ndb.RIM_W causing NaN's in QP energies


Franz Fischer
Posts: 43
Joined: Wed Jul 20, 2022 9:36 am

NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Franz Fischer » Tue Apr 18, 2023 9:37 am

Dear Yambo team,

I have been using the Wavg method described in one of your recent papers [1] to make GW calculations for 2D systems more efficient.
At first sight everything works fine, but after I increased the number of bands (BndsRnXp, GbndRnge) to converge my QP results, I get NaNs in the QP database for every k-point and every band. I could trace the error to a single NaN in the ndb.RIM_W database; here are the first few lines after dumping the database to text:

Code:

RIM_W =
  NaNf, -0.0005769487, -0.0001289001, -5.003623e-05, -2.529687e-05,
    -1.479399e-05, -9.448796e-06, -6.396577e-06, -0.0001789454,
    -6.725168e-05, -3.238434e-05, -1.826945e-05, -1.137765e-05,
    -7.562438e-06, -5.262167e-06, -3.561303e-05, -2.06126e-05, -1.290349e-05,
    -8.56654e-06, -5.941106e-06, -1.349546e-05, -9.160479e-06, -6.425639e-06,
    -4.633982e-06, -6.602507e-06, -4.84217e-06, -3.678894e-06,
  0.0001129744, 9.595103e-05, 4.535672e-05, 2.364617e-05, 1.377599e-05,
    8.70959e-06, 5.833819e-06, 4.075021e-06, 5.504934e-05, 2.935193e-05,
    1.685224e-05, 1.046192e-05, 6.898701e-06, 4.757858e-06, 3.395363e-06,
    1.817919e-05, 1.159872e-05, 7.718514e-06, 5.334397e-06, 3.803356e-06,
    8.031931e-06, 5.670818e-06, 4.091146e-06, 3.011618e-06, 4.195494e-06,
    3.138716e-06, 2.419028e-06,
  0.0001129751, 9.593033e-05, 4.535044e-05, 2.364704e-05, 1.377833e-05,
    8.711026e-06, 5.833838e-06, 4.073988e-06, 5.504785e-05, 2.93511e-05,
    1.685393e-05, 1.046362e-05, 6.899353e-06, 4.757353e-06, 3.394134e-06,
    1.8179e-05, 1.159953e-05, 7.71926e-06, 5.334412e-06, 3.802672e-06,
    8.031874e-06, 5.670897e-06, 4.090903e-06, 3.011015e-06, 4.195467e-06,
    3.138519e-06, 2.419011e-06,
I tried using more or fewer RandGvecW RL vectors, but the outcome is the same with 10^6 random q-points.
I also tried k-grids of different density and the issue persists. So it might be related to the number of bands, as this was the only thing I changed.
I attach my report file and the input file I used.
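
To locate such entries without reading the dump by eye, a small script can scan the dumped text for non-finite tokens. This is only a minimal sketch, assuming the comma-separated layout shown above after `RIM_W =`; the function name `find_nonfinite` is hypothetical and not part of yambo or netCDF.

```python
import re

# Matches tokens such as "NaNf", "nan", "-inf" or "Infinityf" as written in a text dump.
NONFINITE = re.compile(r"[-+]?(nan|inf(inity)?)f?", flags=re.IGNORECASE)


def find_nonfinite(dump_text, variable="RIM_W"):
    """Return the flat indices of NaN/Inf tokens for one variable of a text dump."""
    # Take everything after "<variable> =" up to the closing ';' (or end of text).
    match = re.search(rf"\b{re.escape(variable)}\s*=(.*?)(;|\Z)", dump_text, flags=re.S)
    if match is None:
        raise ValueError(f"variable {variable!r} not found in dump")
    tokens = [t.strip() for t in match.group(1).split(",") if t.strip()]
    return [i for i, tok in enumerate(tokens) if NONFINITE.fullmatch(tok)]


# Shortened sample in the same layout as the dump above.
sample = """RIM_W =
  NaNf, -0.0005769487, -0.0001289001,
    -5.003623e-05, 0.0001129744 ;"""
print(find_nonfinite(sample))  # [0]
```

A flat index of 0 corresponds to the very first entry of the array, which is the case reported here.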

Best,
Franz

[1] https://www.nature.com/articles/s41524-023-00989-7
Franz Fischer
PhD student / IMPRS-UFAST fellow
Institute of Physical Chemistry
University of Hamburg

Davide Sangalli
Posts: 620
Joined: Tue May 29, 2012 4:49 pm
Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy

Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Davide Sangalli » Tue Apr 18, 2023 10:57 am

Dear Franz,
the log.zip attachment seems to be empty.

Best,
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/


Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Franz Fischer » Tue Apr 18, 2023 11:06 am

Hi Davide,

I have now attached a .zip file that actually has contents.

Best,
Franz

Alberto Guandalini
Posts: 8
Joined: Thu Jun 24, 2021 9:10 am

Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Alberto Guandalini » Wed Apr 19, 2023 3:10 pm

Dear Franz,
I found only the input file and the r_setup file in log.zip.

It would be helpful if you could also send us the report file and the log files of the GW calculation.

Cheers,
Alberto


Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Davide Sangalli » Wed Apr 19, 2023 5:45 pm

Dear Franz,
we may have spotted the bug and fixed it.

Could you please try yambo 5.1.2?
https://github.com/yambo-code/yambo/wik ... gz-format)

Best,
D.


Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Franz Fischer » Thu Apr 20, 2023 10:23 am

Hi Davide,

I tried yambo version 5.1.2, but the problem is still there.
This could still be an error on my side, for the following reason:
I did not recompute the dynamical screening, but used the one that I had computed earlier with:

Code:

yambo - MPI+HDF5_MPI_IO - Ver. 5.1.0 Revision 21716 Hash 29911edc6
I removed the databases from the previous run with v. 5.1.0 (ndb.HF_and_locXC, ndb.RIM_W and ndb.QP) from my job directory and relaunched the Self-Energy / RIM_W / QP step with the newest version.

I still see a single NaN in the first entry of the ndb.RIM_W database, which turns all QP energies into NaNs.

Code:

RIM_W =
  NaNf, -0.0005769487, -0.0001289001, -5.003623e-05, -2.529687e-05,
    -1.479399e-05, -9.448796e-06, -6.396577e-06, -0.0001789454,
    -6.725168e-05, -3.238434e-05, -1.826945e-05, -1.137765e-05,
    -7.562438e-06, -5.262167e-06, -3.561303e-05, -2.06126e-05, -1.290349e-05,
    -8.56654e-06, -5.941106e-06, -1.349546e-05, -9.160479e-06, -6.425639e-06,
    -4.633982e-06, -6.602507e-06, -4.84217e-06, -3.678894e-06,
  0.0001129744, 9.595103e-05, 4.535672e-05, 2.364617e-05, 1.377599e-05, ...
If you think I should recompute the dynamical screening or any other previously computed quantity, or change the parallelization scheme, let me know.
I will also attach a .zip file with the report file, the input files and the first 50 processor-dependent log files. Unfortunately I could not attach all of the log files, as this exceeded the maximum file size.

Best,
Franz


Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Alberto Guandalini » Thu Apr 20, 2023 1:39 pm

Dear Franz,
looking at your report file, some parts look strange:

Code:

| [RD./gw//ndb.RIM]---------------------------------------------------------------
|  Brillouin Zone Q/K grids (IBZ/BZ)                :   27  225   27  225
|  *WRN* Coulomb cutoff potential                   : none
|  Coulombian RL components                         :  101
|  Coulombian diagonal components                   : yes
|  RIM random points                                :  1000000
|  RIM  RL volume             [a.u.]                :  0.160773
|  Real RL volume             [a.u.]                :  0.160633
|  Eps^-1 reference component                       : 0
|  Eps^-1 components                                :  0.000000  0.000000  0.000000
|  RIM anysotropy factor                            :  0.000000
| - S/N 002373 ---------------------------------------------- v.05.01.00 r.21716 -
From here it seems that you are restarting the v averages without the slab cutoff.
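
A quick way to check this in any report file is to list every `Coulomb cutoff potential` line it contains. The sketch below is only an illustration: the helper name `cutoff_settings` is hypothetical, and treating any value other than `none` as "cutoff applied" is an assumption based on the line format shown above, not on the yambo documentation.

```python
import re

# Captures the value printed after "Coulomb cutoff potential :" on each report line.
CUTOFF_LINE = re.compile(r"Coulomb cutoff potential\s*:\s*(\S.*?)\s*$", flags=re.M)


def cutoff_settings(report_text):
    """Return the value of every 'Coulomb cutoff potential' line, in file order."""
    return [m.group(1) for m in CUTOFF_LINE.finditer(report_text)]


# Two lines copied from the report section quoted above.
report = (
    "|  *WRN* Coulomb cutoff potential                   : none\n"
    "|  Coulombian RL components                         :  101\n"
)
values = cutoff_settings(report)
print(values)  # ['none']

# Assumption: a value of "none" means the database was written without a cutoff.
print(any(v.lower() == "none" for v in values))  # True
```

Running this on the report of each step (RIM, screening, self-energy) would show whether the databases were all produced with the same cutoff setting.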

In addition:

Code:

| Comparison between head Wc = W - v averages: < Wc > [au] RIM-W/RIM
|
|   < -Wc [Q = 1] >      NaN-0.021135 * < -Wc [Q = 2] > 0.003625 0.003565
Here you can see the NaN obtained with the W-av method, but it is also unexpected that the v-av method
gives a non-zero contribution. As described in the W-av paper, with the slab cutoff you should
obtain a v-av contribution equal to zero.

My advice is to repeat the calculation from scratch (maybe with fewer bands and looser convergence parameters
to speed it up)
to see whether the problem is due to an incompatibility between databases calculated with and without the cutoff.
If this does not work, I would look at the database ndb.pp_fragment_1 to check whether the screening follows
the expected trend for 2D semiconductors (vX(q=0,w=0) \approx 0).

N.B. in the ndb.pp_fragment_1 database the variable X_mat is exactly vX, i.e. the Coulomb interaction times the reducible
response function.

Cheers,
Alberto


Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Franz Fischer » Fri Apr 21, 2023 2:40 pm

Hi Alberto,

thank you for your reply.
I followed your suggestion and started by looking into the strange behaviour of the RIM database. I can't explain why this happened, as I did not remove the cutoff keyword from any input file that I used. I then recomputed ndb.RIM from scratch, using this small input file:

Code:

rim_cut                          # [R] Coulomb potential
RandQpts=1000000                       # [RIM] Number of random q-points in the BZ
RandGvec= 101                RL    # [RIM] Coulomb interaction RS components
CUTGeo= "slab z"                   # [CUT] Coulomb Cutoff geometry: box/cylinder/sphere/ws/slab X/Y/Z/XY..
What I find extremely strange is that when I compute ndb.RIM for the first time (i.e. the job directory is entirely empty), the report file says that no cutoff technique was used for the RIM, yet I find an ndb.cutoff database in my job directory afterwards. When I then remove the ndb.RIM database (keeping ndb.cutoff) and restart the calculation, the cutoff is applied in the RIM step and shows up in the report file. I cannot explain this behaviour.

Furthermore, I recomputed all the necessary GW quantities from scratch using version 5.1.2. This time I used the workaround described above to get the cutoff into the ndb.RIM step, and I also split the computation into individual steps in this order:

ndb.RIM, ppa screening (ndb.pp*), ndb.HF_and_locXC and finally the ndb.RIM_W and ndb.QP.

I still get a NaN in the first entry of ndb.RIM_W and no valid QP energies.

After that I checked the contents of the ndb.pp* files, as you suggested.
First of all, when I ncdump or read the ndb.pp_fragment_1 file, I do not see a variable named 'X_mat'. The only ones I see are:

Code:

dict_keys(['FREQ_PARS_sec_iq1', 'FREQ_sec_iq1', 'X_Q_1'])
I also plotted the static part of the inverse dielectric function over the q-points, read from the ndb.pp_fragment_* files. It shows unphysical behaviour at q=0 and G=G'=0, where it overshoots the expected value of 1. I attached a figure in the style of Fig. 1 in [1].

Lastly I can give you some information on the way I compiled yambo 5.1.2 on the HPC that I am using.
I downloaded the tar.gz from your git repository and configured it with these flags:

Code:

./configure CC=mpiicc FC=mpiifort --enable-mpi --enable-hdf5-p2y-support
and afterwards just did:

Code:

make all
I loaded intel-19.0.4/compilers_and_libraries_2019.4.243 for the compilation.


[1] https://journals.aps.org/prb/abstract/1 ... .93.235435


Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Alberto Guandalini » Wed Apr 26, 2023 5:59 am

Dear Franz,
first of all, X_Q_n is the correct variable to check in the database, as you did (X_mat is the name used inside the code; sorry for the error).

Actually, X_Q_n is equal to v_Coulomb*X. Thus, the inverse dielectric function is eps^-1(q_n) = 1 + X_Q_n.

If you plot eps^-1 with the formula above and find that eps^-1(q=0) \neq 1, the W-av interpolation goes wrong, as the interpolation
formula expects that eps^-1(q --> 0) --> 1.
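
As a rough illustration of this check, the head (G = G' = 0) of eps^-1 can be built from the head values of X_Q_n and tested against 1. This is only a sketch: the function names are hypothetical, the numbers are made up, and extracting the head values from the ndb.pp_fragment_* files is left out because it depends on how the complex data are stored there.

```python
def inverse_eps_head(x_head):
    """Head of eps^-1 for each q-point, using eps^-1(q_n) = 1 + X_Q_n."""
    return [1 + complex(x) for x in x_head]


def overshooting_qpoints(x_head, tol=1e-6):
    """Indices of q-points whose Re eps^-1 head exceeds 1 by more than tol."""
    return [i for i, eps_inv in enumerate(inverse_eps_head(x_head))
            if eps_inv.real > 1 + tol]


# Made-up head values of X_Q_n for three q-points, with the q -> 0 point first.
x_head = [0.02 + 0.0j, -0.15 + 0.0j, -0.30 + 0.0j]
print(overshooting_qpoints(x_head))  # [0]
```

An index of 0 in the result would mean that the q -> 0 point violates the limit the interpolation formula relies on.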

Can you confirm that you have a 2D semiconductor?

Cheers,
Alberto


Re: NaNf in ndb.RIM_W causing NaN's in QP energies

Post by Franz Fischer » Wed Apr 26, 2023 9:39 am

Hi Alberto,
"Actually, X_Q_n is equal to v_Coulomb*X. Thus, the inverse dielectric function is eps^-1(q_n) = 1 + X_Q_n."
This is exactly what I plotted in the figure attached to my last post.
I also checked the array eps^-1[q, G, G'] for real parts greater than 1 and found only one such entry, at q=G=G'=0. So I guess something must be going wrong with the cutoff or the RIM.

My system is a 2H-stacked (W on top of S and vice versa) homo-bilayer of WS2. I attach the yambo report file as well as my QE inputs and outputs.
Maybe you can spot an error.

Best,
Franz
