
'Not enough states to converge the Fermi Level' when GW band interpolation

Posted: Sun Feb 11, 2024 12:23 am
by edward
Dear Yambo Developers,

While trying to interpolate the GW bands, I ran into the error mentioned in the title. I tried commenting out the following check in 'src/common/OCCUPATIONS_Fermi.F':

Code: Select all

if (i_Ef_fine(2)==n_total_states) call error('Not enough states to converge the Fermi Level')
However, the ypp executable then keeps running and never exits. I think the search for the correct Fermi level may simply fail to converge.
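
Just to illustrate why that check matters, here is a minimal, self-contained sketch (not the actual Yambo routine; every name and number below is invented): in a bisection search for the Fermi level, once all available states are filled and the target charge is still not reached, no Fermi energy inside the search window can ever work, so without a guard of this kind the loop never terminates.

Code: Select all

program fermi_guard_sketch
   ! Toy bisection for the Fermi level over a finite list of state energies.
   implicit none
   integer, parameter :: n_states = 8
   real :: E(n_states) = (/ -2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0 /)
   real :: n_target = 10.0            ! more electrons than available states
   real :: Ef_lo, Ef_hi, Ef
   integer :: n_occ
   Ef_lo = minval(E) - 1.0
   Ef_hi = maxval(E) + 1.0
   do
      Ef = 0.5*(Ef_lo + Ef_hi)
      n_occ = count(E <= Ef)          ! T=0 occupations below the trial Ef
      if (abs(real(n_occ) - n_target) < 1.0e-6) exit
      ! Guard: every available state is already filled but the charge is
      ! still short; without this check the bisection below runs forever.
      if (n_occ == n_states .and. real(n_occ) < n_target) then
         print *, 'Not enough states to converge the Fermi Level'
         stop
      end if
      if (real(n_occ) < n_target) then
         Ef_lo = Ef
      else
         Ef_hi = Ef
      end if
   end do
   print *, 'Fermi level converged at', Ef
end program fermi_guard_sketch
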
I'm sincerely looking forward to your assistance! Thanks in advance!

Best regards,
Mingran

Re: 'Not enough states to converge the Fermi Level' when GW band interpolation

Posted: Mon Feb 12, 2024 9:21 am
by Daniele Varsano
Dear Mingran,
please sign your posts with your full name and affiliation; this is a rule of the forum, and you can do it once and for all by filling in the signature in your user profile.

To inspect the problem, can you attach the report files of your calculations, both GW and band interpolation?

Best,
Daniele

Re: 'Not enough states to converge the Fermi Level' when GW band interpolation

Posted: Wed Feb 14, 2024 2:30 pm
by edward
Dear Daniele,

Thanks for your prompt reply, and sorry for my late response. I had trouble connecting to my server these past days, which caused the delay.

As for the problem, I deleted the report of the GW calculation; the interpolation report is attached. Do I need to rerun the GW calculation?

P.S.
1) The interpolation report file was too large to attach, so I deleted the energy output at each k-point.
2) The command-line output from running '~/bin/yambo-5.2.1-comment/bin/ypp -F ypp_bands.in' is given below; it gets stuck at the last line shown.
3) By the way, is there a way to cut down the GPU memory consumption? I have already tried a parallel layout solely over bands, but it is still not enough. Is it possible to copy the wavefunctions on the fly when a one-time copy before the \chi calculation is impossible?

Code: Select all



 __    __ ______           ____     _____
/\ \  /\ \\  _  \  /"\_/`\/\  _`\ /\  __`\
\ `\`\\/"/ \ \L\ \/\      \ \ \L\ \ \ \/\ \
 `\ `\ /" \ \  __ \ \ \__\ \ \  _ <" \ \ \ \
   `\ \ \  \ \ \/\ \ \ \_/\ \ \ \L\ \ \ \_\ \
     \ \_\  \ \_\ \_\ \_\\ \_\ \____/\ \_____\
      \/_/   \/_/\/_/\/_/ \/_/\/___/  \/_____/


 <---> [01] MPI/OPENMP structure, Files & I/O Directories
 <---> [02] Y(ambo) P(ost)/(re) P(rocessor)
 <---> [03] Core DB
 <---> :: Electrons             : 364.0000
 <---> :: Temperature           : 0.000000 [eV]
 <---> :: Lattice factors       : 16.24821  14.07136  45.60503 [a.u.]
 <---> :: K points              : 26
 <---> :: Bands                 :  600
 <---> :: Symmetries            :  6
 <---> :: RL vectors            : 1202267
 <---> [04] K-point grid
 <---> :: Q-points   (IBZ): 26
 <---> :: X K-points (IBZ): 26
 <---> [05] CORE Variables Setup
 <---> [05.01] Unit cells
 <---> [05.02] Symmetries
 <---> [05.03] Reciprocal space
 <---> [05.04] K-grid lattice
 <---> Grid dimensions      :  12  12
 <---> [05.05] Energies & Occupations
 <---> [05.05.01] External/Internal QP corrections
 <---> E<./PARA_1/ndb.QP[ PPA@E  27.21138 * XG 5585 * Xb 1-600 * Scb 1-600]
 <---> [dE_from_DB-Nearest K] Exact matches       :  100.0000 [o/o]
 <---> [QP_apply] Action to be applied: E<./PARA_1/ndb.QP[ PPA@E  27.21138 * XG 5585 * Xb 1-600 * Scb 1-600]
 <---> [05.05.01.01]  QP corrections report
Best,
Mingran

Re: 'Not enough states to converge the Fermi Level' when GW band interpolation

Posted: Thu Feb 15, 2024 8:44 am
by Daniele Varsano
Dear Mingran,

1) I cannot see anything wrong in your report. It seems that the code has difficulty finding the Fermi level once the QP correction is applied. I suspect that something nasty happened in the GW correction evaluation. Can you post your GW report, or the qp output file?
2) It's not totally clear to me what you mean by "copy the wave function on the fly when the one-time copy before \chi calculation is impossible". I imagine you want to reduce memory consumption in a GW calculation, right? Anyway, the most efficient way, as you tried, is to parallelize over c,v in the calculation of the response function and over "b" in the Sigma evaluation.
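
For illustration only, such a layout could look like the fragment below (the role strings follow the yambo 5.x input format; the task counts are placeholders whose product must match the number of MPI tasks actually used):

Code: Select all

#
# Illustrative parallel layout (placeholder task counts, adapt to your run)
#
X_and_IO_CPU   = "1 1 1 4 2"      # roles "q g k c v": distribute chi over c and v bands
X_and_IO_ROLEs = "q g k c v"
SE_CPU   = "1 1 8"                # roles "q qp b": distribute Sigma over the b summation
SE_ROLEs = "q qp b"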

Best,
Daniele

Re: 'Not enough states to converge the Fermi Level' when GW band interpolation

Posted: Fri Feb 16, 2024 1:52 pm
by edward
Dear Daniele,

Thanks for your prompt reply and kind support!

I reran the GW calculation, and its report file is attached. This time I also reverted the change I had made to the ypp source code, and that report file is attached too.

As for the GPU memory issue, I find that before the polarizability calculation starts, the wavefunctions are copied to the device, as shown in the following log excerpt:

Code: Select all

 <24s> P1-gpu001.sulis.hpc: [MEMORY] Alloc WF%c( 38.43904 [Gb]) (HOST) TOTAL:  40.01232 [Gb] (traced)  2.050536 [Gb] (memstat)
 <24s> P1-gpu001.sulis.hpc: [MEMORY] Alloc WF%c_d( 38.43904 [Gb]) (DEV) TOTAL:  39.52785 [Gb] (traced)
So here is the problem: I'm currently using A100 40 GB cards, so this allocation does not fit on a single device. I have tried the parallel strategy you mentioned on 30 GPUs, which is the upper limit of this cluster, but it is still not enough for some systems.

I'm wondering whether there are other ways to further cut down the memory consumption. I've tried the single-precision version, but it does not seem very effective.

Just a thought: would it be possible to add a parameter that splits the \chi calculation into equal pieces run one after the other? For example, the local conduction-band indices could be split into groups, with the wavefunctions copied to the device before each group is processed. This would keep the memory footprint bounded and make converged calculations feasible in more cases; a rough sketch of the idea is given below.
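
Schematically, I mean something like this (pure pseudocode in Fortran form, not Yambo code; every name and size here is invented):

Code: Select all

program wf_batch_sketch
   ! Process the conduction bands in fixed-size blocks so that only one
   ! block of wavefunctions has to sit on the GPU at any time.
   implicit none
   integer, parameter :: n_cond = 600      ! conduction bands owned by this task
   integer, parameter :: n_pw   = 200000   ! plane-wave coefficients per band
   integer, parameter :: blk    = 50       ! bands per device copy
   complex, allocatable :: wf_blk(:,:)
   integer :: ib1, ib2
   allocate(wf_blk(n_pw, blk))
   do ib1 = 1, n_cond, blk
      ib2 = min(ib1 + blk - 1, n_cond)
      ! (1) load bands ib1..ib2 from host storage into wf_blk
      ! (2) copy wf_blk to the device
      ! (3) accumulate this block's contribution to chi on the GPU
      ! (4) release the device copy before the next block
   end do
   deallocate(wf_blk)
end program wf_batch_sketch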

Thank you,
Mingran

Re: 'Not enough states to converge the Fermi Level' when GW band interpolation

Posted: Fri Feb 16, 2024 5:18 pm
by edward
Dear Daniele,

Sorry for my carelessness. I just found NaNs in the report file, which is quite strange. When I run the calculation on a coarser mesh, everything works fine. I have attached the files from the cheaper calculation as well.

Best,
Mingran