Haydock error
Posted: Fri Mar 19, 2021 10:15 am
by malwi
Dear Yambo Team,
I got an error during the BSE Haydock step:
[03.01] Haydock Solver for abs @q1, scheme hermitian
====================================================
Accuracy (requested) : -0.020000 [o/o]
[ERROR] STOP signal received while in[03.01] Haydock Solver for abs @q1, scheme hermitian
[ERROR]Bf=NaN likely because some eigenvalue of the BSE is negative.
Yambo 5.0 was compiled with Intel 6.7 on Prometheus (Cyfronet Centre).
Best regards,
Gosia
Re: Haydock error
Posted: Fri Mar 19, 2021 1:12 pm
by Daniele Varsano
Dear Gosia,
can you please post your input and report files?
You can upload them as attachments by renaming them with an allowed suffix (e.g. .txt).
Best,
Daniele
Re: Haydock error
Posted: Fri Mar 19, 2021 11:00 pm
by malwi
Dear Daniele,
thank you again.
I attached the files.
Best regards,
Gosia
Re: Haydock error
Posted: Sat Mar 20, 2021 9:38 am
by Daniele Varsano
Dear Gosia,
I can't see anything wrong in your input file.
It seems you end up with a negative eigenvalue. I do not know if something weird happened in the calculation of the BSE kernel you performed in a previous run, but the matrix seems to be read correctly, and I assume your scissor value is correct.
Maybe someone expert on the Haydock algorithm and its parallel implementation can give you some hints on how to spot the problem.
Best,
Daniele
Re: Haydock error
Posted: Sat Mar 20, 2021 4:35 pm
by malwi
Dear Daniele,
thank you.
I also tried full diagonalization, and now it stops with a memory problem.
I changed the number of tasks and nodes to get more memory. I will ask Maciej Czuchry what could be wrong with that.
My question now is: can I use a different number of CPUs for the diagonalization or Haydock step than I used for the kernel?
This series of runs reads files from the previous calculations.
The case I am calculating is actually the same as in my previously posted error, but now I use k-meshes of 12 and 16 in the nscf step;
before, it was 4. Is there any physical or numerical reason that could cause the problem with Haydock? I mean, for example, including bands
in the BSE whose symmetry ordering changes at some k-points?
Best,
Gosia
Re: Haydock error
Posted: Sat Mar 20, 2021 5:08 pm
by Daniele Varsano
Dear Gosia,
The matrix is far too large for a full diagonalisation.
I cannot say much about the parallelism of the Haydock procedure. Other developers who are expert on that will answer.
Best,
Daniele
Re: Haydock error
Posted: Sat Mar 20, 2021 6:04 pm
by Davide Sangalli
Dear all,
it is not easy to say what the problem could be.
A simple test might be to re-run the Haydock solver with a higher QP correction in the input:
% KfnQP_E
1.400000 | 1.000000 | 1.000000 | # [EXTQP BSK BSS] E parameters (c/v) eV|adim|adim
%
It's probably unphysical in this system, but it is just to check whether there is some pole with E < 1 eV, which would, in turn, give a negative eigenvalue with 0.4 eV of QP corrections.
Best,
D
Re: Haydock error
Posted: Mon Apr 12, 2021 10:08 am
by malwi
Dear Davide and Daniele,
I am back to this problem after a break for writing a proposal.
By accident, I succeeded in getting a Haydock result for more k-points:
Maciej Czuchry (from Cyfronet) suggested using fewer CPUs, and the calculations went through.
But I do not understand it.
When I use the nscf k-mesh 8 8 8 (35 IBZ points) for BSE, it runs on 432 CPUs.
When I use the nscf k-mesh 10 10 10 (56 IBZ points), or more k-points (12 or 16), for BSE,
it does not run on 432 CPUs, but it does run on 144 CPUs.
Why? How is it parallelized?
I do not specify anything about the parallel structure in the input, leaving it at the defaults.
Back to Davide's advice: maybe I really set KfnQP_E too small.
My DFT+SOC gap is 0.3 eV and the GW correction is 0.4 eV.
What should be given for KfnQP_E: just the GW correction, 0.4 eV, or DFT+GW, 0.7 eV?
A second surprise for me is that the energy parameters for exchange and correlation
converge in BSE much faster than in GW
(for example BSENGexx= 10 Ry and EXXRLvcs= 30 Ry is convergent,
BSENGBlk= 4 Ry and VXCRLvcs= 6 Ry) and similarly NGsBlkXs= 4 Ry is enough in BSE.
In the same way, the number of bands in the polarization converges 10 times faster in BSE!
% BndsRnXp
1 | 1000 | # [Xp] Polarization function bands
%
% BndsRnXs
1 | 108 | # [Xs] Polarization function bands
%
Using 108 bands for Xs in BSE gives no difference in the result compared with 1000. Why is that so?
On the other hand, I am not surprised that k-points nscf 4 4 4 are ok for GW, while k-points nscf 16 16 16 are
not enough for BSE. This is because the excitations in this system are not from VBM to CBM but much higher,
which is in agreement with the experiment for the optical pumping.
Best regards,
Gosia
Re: Haydock error
Posted: Mon Apr 12, 2021 11:04 am
by Daniele Varsano
Dear Gosia,
Why? How is it parallelized?
I do not specify anything about the parallel structure in the input, leaving it at the defaults.
In yambo there is a default parallelisation that may fail. My advice is to explicitly assign CPUs to the different roles in the input. I suggest you put as many CPUs as possible on the k role:
Code:
BS_CPU= "nk neh nt" # [PARALLEL] CPUs for each role
BS_ROLEs= "k eh t" # [PARALLEL] CPUs roles (k,eh,t)
The product nk*neh*nt has to equal the number of MPI tasks you are using.
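As a rough illustration of the constraint above, the following Python sketch lists the (nk, neh, nt) triples whose product equals a given number of MPI tasks and sorts them so that the k role gets the most CPUs. The function names and the optional `max_nk` cap are hypothetical helpers for this post, not part of yambo; whether a given layout actually runs well still has to be tested on the machine.

```python
def bs_cpu_layouts(n_mpi, max_nk=None):
    """Return all (nk, neh, nt) triples with nk * neh * nt == n_mpi.

    max_nk is an optional, user-chosen upper bound on nk (an assumption
    of this sketch, e.g. if you do not want more k tasks than k-points).
    Layouts are sorted so that those with the largest nk come first.
    """
    layouts = []
    for nk in range(1, n_mpi + 1):
        if n_mpi % nk != 0:
            continue
        if max_nk is not None and nk > max_nk:
            continue
        rest = n_mpi // nk
        for neh in range(1, rest + 1):
            if rest % neh == 0:
                layouts.append((nk, neh, rest // neh))
    # Largest nk first, then smallest neh.
    layouts.sort(key=lambda t: (-t[0], t[1], t[2]))
    return layouts


def format_bs_cpu(layout):
    """Format one layout as the two input lines shown above."""
    nk, neh, nt = layout
    return 'BS_CPU= "{} {} {}"\nBS_ROLEs= "k eh t"'.format(nk, neh, nt)


if __name__ == "__main__":
    # 144 MPI tasks, as in the run that worked in this thread.
    best = bs_cpu_layouts(144)[0]
    print(format_bs_cpu(best))
```

Calling `bs_cpu_layouts(144, max_nk=56)` instead restricts the search to layouts with at most 56 tasks on k; the remaining tasks then go to the eh and t roles.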
My DFT+SOC gap is 0.3 eV and the GW correction is 0.4 eV.
What should be given for KfnQP_E: just the GW correction, 0.4 eV, or DFT+GW, 0.7 eV?
It is the correction: 0.4 eV.
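To spell out the arithmetic with the numbers quoted in this thread, here is a small Python sketch. The helper `kfnqp_e_block` is hypothetical and only formats text in the same layout as the KfnQP_E example posted earlier; the two stretching factors are left at 1.0 as in that example.

```python
# Numbers quoted in this thread (eV).
dft_gap_ev = 0.3        # DFT+SOC gap
gw_correction_ev = 0.4  # GW correction to the gap


def kfnqp_e_block(scissor_ev, c_stretch=1.0, v_stretch=1.0):
    """Format a KfnQP_E block: scissor (eV) | c stretch | v stretch."""
    return "% KfnQP_E\n {:.6f} | {:.6f} | {:.6f} |\n%".format(
        scissor_ev, c_stretch, v_stretch
    )


# The scissor is only the correction; the code adds it to the DFT energies.
qp_gap_ev = dft_gap_ev + gw_correction_ev  # resulting QP gap, about 0.7 eV
block = kfnqp_e_block(gw_correction_ev)
print(block)
```

Entering 0.7 eV as the scissor would count the DFT gap twice.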
converge in BSE much faster than in GW
(for example BSENGexx= 10 Ry and EXXRLvcs= 30 Ry is convergent,
BSENGBlk= 4 Ry and VXCRLvcs= 6 Ry) and similarly NGsBlkXs= 4 Ry is enough in BSE.
This is not surprising. These are different terms: in GW it is a Fock integral, while in BSE it is essentially a Hartree term.
In the same way, the number of bands in the polarization converges 10 times faster in BSE!
This is a bit stranger. Anyway, for the BSE you can use the screening already calculated for GW and stored in ndb.pp; yambo will take the static part (use ppa in the input instead of em1s,
together with the related variables BndsRnXp and NGsBlkXp).
On the other hand, I am not surprised that k-points nscf 4 4 4 are ok for GW, while k-points nscf 16 16 16 are
not enough for BSE. This is because the excitations in this system are not from VBM to CBM but much higher,
which is in agreement with the experiment for the optical pumping.
As you say, k convergence in BSE can be more problematic: you need a finer discretisation to include the relevant transitions in the BSE matrix.
Best,
Daniele
Re: Haydock error
Posted: Mon Apr 12, 2021 11:11 am
by malwi
Thank you very much Daniele,
I will continue as you said.
Best regards,
Gosia