					
				Haydock error
				Posted: Fri Mar 19, 2021 10:15 am
				by malwi
				Dear Yambo Team,
I got an error during the BSE Haydock step:
 [03.01] Haydock Solver for abs @q1, scheme hermitian
  ====================================================
  Accuracy (requested)      : -0.020000 [o/o]
  [ERROR] STOP signal received while in[03.01] Haydock Solver for abs @q1, scheme hermitian
  [ERROR]Bf=NaN likely because some eigenvalue of the BSE is negative.
Yambo 5.0 was compiled with Intel 6.7 on Prometheus (Cyfronet Centre).
   Best regards,
    Gosia
			 
			
					
				Re: Haydock error
				Posted: Fri Mar 19, 2021 1:12 pm
				by Daniele Varsano
				Dear Gosia, 
can you please post your input and report files?
You can upload them as attachments by renaming them with an allowed suffix (e.g. .txt).
Best,
Daniele
			 
			
					
				Re: Haydock error
				Posted: Fri Mar 19, 2021 11:00 pm
				by malwi
				Dear Daniele,
thank you again.
I attached the files.
Best regards,
Gosia
			 
			
					
				Re: Haydock error
				Posted: Sat Mar 20, 2021 9:38 am
				by Daniele Varsano
				Dear Gosia, 
I can't see anything wrong in your input file.
It seems you end up with a negative eigenvalue. I do not know whether something weird happened in the calculation of the BSE kernel you performed in a previous run, but the matrix seems to be read correctly, and I assume your scissor value is correct.
Maybe someone with expertise on the Haydock algorithm and its parallel implementation can give you some hints on how to spot the problem.
Best,
Daniele
			 
			
					
				Re: Haydock error
				Posted: Sat Mar 20, 2021 4:35 pm
				by malwi
				Dear Daniele,
thank you.
I also tried full diagonalization, and now it stops with a memory problem.
I changed the number of tasks and nodes to get more memory. I will ask Maciej Czuchry what could be wrong with that.
My question now is: can I use a different number of CPUs for the diagonalization or Haydock step than I used for the kernel?
This series reads files from the previous calculations.
The case I am calculating is actually the same as in the previously posted error, but now I use k-meshes of 12 and 16 in the nscf step,
whereas before it was 4. Is there any physical or numerical reason that could cause the problem with Haydock? I mean, for example, including bands
in the BSE whose symmetry ordering changes at some k-points?
Best,
Gosia
			 
			
					
				Re: Haydock error
				Posted: Sat Mar 20, 2021 5:08 pm
				by Daniele Varsano
				Dear Gosia,
The matrix is far too large for a full diagonalisation.
I cannot say much about the parallelism of the Haydock procedure. Other developers with expertise on that will answer.
Best,
Daniele
			 
			
					
				Re: Haydock error
				Posted: Sat Mar 20, 2021 6:04 pm
				by Davide Sangalli
				Dear all,
it is not easy to say what the problem could be.
A simple test might be to re-run the Haydock solver with a higher QP correction in the input:
% KfnQP_E
 1.400000 | 1.000000 | 1.000000 |        # [EXTQP BSK BSS] E parameters  (c/v) eV|adim|adim
%
This is probably unphysical for this system, but it is just to check whether there is some pole with E < 1 eV which would, in turn, give a negative eigenvalue with 0.4 eV of QP corrections.
Best,
D
			 
			
					
				Re: Haydock error
				Posted: Mon Apr 12, 2021 10:08 am
				by malwi
				Dear Davide and Daniele,
I am back on this problem after a break to write a proposal.
By accident, I managed to get a Haydock result for more k-points:
Maciej Czuchry (from Cyfronet) suggested using fewer CPUs, and the calculations went through.
But I do not understand it.
When I use the nscf k-mesh 8 8 8 (35 IBZ points) for BSE, it runs on 432 CPUs.
When I use the nscf k-mesh 10 10 10 (56 IBZ points), or more k-points (12 or 16), for BSE,
it does not run on 432 CPUs, but it does run on 144 CPUs.
Why? How is it parallelized?
I do not specify anything about the parallel structure in the input, leaving it at the default.
Back to Davide's advice: maybe I really set KfnQP_E too small.
My DFT+SOC gives a gap of 0.3 eV, and the GW correction is 0.4 eV.
What should be given for KfnQP_E: just the GW correction, 0.4 eV, or DFT+GW, 0.7 eV?
  
A second surprise for me is that the energy cutoffs for exchange and correlation
converge much faster in BSE than in GW
(for example, BSENGexx = 10 Ry and EXXRLvcs = 30 Ry are converged, as are
BSENGBlk = 4 Ry and VXCRLvcs = 6 Ry); similarly, NGsBlkXs = 4 Ry is enough in BSE.
In the same way, the number of bands in the polarization converges 10 times faster in BSE!
% BndsRnXp
    1 | 1000 |                 # [Xp] Polarization function bands
%
% BndsRnXs
    1 | 108 |                 # [Xs] Polarization function bands
%
Using 108 bands instead of 1000 for Xs in BSE makes no difference to the result. Why is that?
On the other hand, I am not surprised that an nscf k-mesh of 4 4 4 is fine for GW, while an nscf k-mesh of 16 16 16 is
not enough for BSE. This is because the excitations in this system are not from the VBM to the CBM but much higher,
which agrees with the optical pumping experiment.
   Best regards,
    Gosia
			 
			
					
				Re: Haydock error
				Posted: Mon Apr 12, 2021 11:04 am
				by Daniele Varsano
				Dear Gosia, 
> Why? How is it parallelized?
> I do not specify anything about the parallel structure in the input, leaving it at the default.
Yambo has a default parallelisation that may fail; my advice is to explicitly assign CPUs to the different roles in the input. I suggest putting as many CPUs as possible on the k role:
Code:
BS_CPU= "nk neh nt"                     # [PARALLEL] CPUs for each role
BS_ROLEs= "k eh t"                   # [PARALLEL] CPUs roles (k,eh,t)
The product nk*neh*nt has to equal the number of MPI tasks you are using.
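Choosing values that satisfy this product constraint can be sketched in Python. This is a minimal illustration and not part of Yambo: the function names `bs_cpu_layouts` and `format_bs_input` and the optional `max_nk` cap are hypothetical. The only rules taken from this thread are that nk*neh*nt must equal the number of MPI tasks and that the k role should get as many CPUs as possible.

```python
from typing import List, Optional, Tuple

Layout = Tuple[int, int, int]  # (nk, neh, nt)


def bs_cpu_layouts(n_mpi: int, max_nk: Optional[int] = None) -> List[Layout]:
    """List every (nk, neh, nt) with nk * neh * nt == n_mpi.

    max_nk is a hypothetical, user-chosen upper bound on nk (an assumption
    of this sketch, not a documented Yambo rule). Layouts are sorted so that
    the one with the most CPUs on k, then on eh, comes first.
    """
    if n_mpi < 1:
        raise ValueError("n_mpi must be a positive integer")
    layouts: List[Layout] = []
    for nk in range(1, n_mpi + 1):
        if n_mpi % nk != 0:
            continue  # nk must divide the total number of MPI tasks
        if max_nk is not None and nk > max_nk:
            continue
        rest = n_mpi // nk
        for neh in range(1, rest + 1):
            if rest % neh == 0:
                layouts.append((nk, neh, rest // neh))
    layouts.sort(key=lambda layout: (-layout[0], -layout[1]))
    return layouts


def format_bs_input(layout: Layout) -> str:
    """Render a layout as the two input lines quoted in this post."""
    nk, neh, nt = layout
    return f'BS_CPU= "{nk} {neh} {nt}"\nBS_ROLEs= "k eh t"'


if __name__ == "__main__":
    # Example: 144 MPI tasks, with nk capped at 56 (an assumed cap).
    best = bs_cpu_layouts(144, max_nk=56)[0]
    print(format_bs_input(best))
```

For 144 MPI tasks and an assumed cap of 56 on nk, the first layout is nk=48, neh=3, nt=1. Whether a cap on nk is appropriate for a given system is an assumption to check against the Yambo documentation.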
> My DFT+SOC gives a gap of 0.3 eV, and the GW correction is 0.4 eV.
> What should be given for KfnQP_E: just the GW correction, 0.4 eV, or DFT+GW, 0.7 eV?
It is the correction only: 0.4 eV.
> converge much faster in BSE than in GW
> (for example, BSENGexx = 10 Ry and EXXRLvcs = 30 Ry are converged, as are
> BSENGBlk = 4 Ry and VXCRLvcs = 6 Ry); similarly, NGsBlkXs = 4 Ry is enough in BSE.
This is not surprising: these are different terms. In GW it is a Fock integral, while in BSE it is essentially a Hartree term.
> In the same way, the number of bands in the polarization converges 10 times faster in BSE!
This is a bit stranger. In any case, you can reuse the screening already calculated for GW, stored in ndb.pp, for the BSE: yambo will take its static part (use ppa in the input instead of em1s,
together with the related variables BndsRnXp and NGsBlkXp).
> On the other hand, I am not surprised that an nscf k-mesh of 4 4 4 is fine for GW, while an nscf k-mesh of 16 16 16 is
> not enough for BSE. This is because the excitations in this system are not from the VBM to the CBM but much higher,
> which agrees with the optical pumping experiment.
As you say, k-point convergence in BSE can be more problematic: you need a finer discretisation to include the relevant transitions in the BSE matrix.
Best,
Daniele
 
			
					
				Re: Haydock error
				Posted: Mon Apr 12, 2021 11:11 am
				by malwi
				Thank you very much Daniele,
I will continue as you suggested.
   Best regards,
    Gosia