COHSEX crash (netcdf related?)

Concerns issues with computing quasiparticle corrections to the DFT eigenvalues - i.e., the self-energy within the GW approximation (-g n), or considering the Hartree-Fock exchange only (-x)

Post by vormar » Mon Jun 06, 2011 3:46 pm

Dear all,

I have problems with COHSEX self-energy calculations using the recent GPL version of the code (3.2.4, rev. 17, internal rev. 855). Whatever I tried, the code crashed at the same point. Yambo is compiled on CINECA SP6 with NetCDF enabled. I prepared the input file using 'yambo -p c'. My system is a molecule, so I only have one q-point. Here is the input file:

Code: Select all

cohsex
gw0
HF_and_locXC
em1s
EXXRLvcs= 140233       RL
% QpntsRXs
 1 | 1 |
%
% BndsRnXs
    1 | 100 |
%
NGsBlkXs= 1            Ry
% LongDrXs
 1.000000 | 0.000000 | 0.000000 |
%
% GbndRnge
    1 | 100 |
%
%QPkrange
  1|  1|  28| 29|
%
%QPerange
  1|  1| 0.0|-1.0|
%

The calculation stops after the static screening step:

Code: Select all

[...]
 <02m-46s> Xo@q[1] 1-1 |##################  | [090%] 02m-06s(E) 02m-20s(X)
 <02m-53s> Xo@q[1] 1-1 |################### | [095%] 02m-13s(E) 02m-20s(X)
 <02m-59s> Xo@q[1] 1-1 |####################| [100%] 02m-19s(E) 02m-19s(X)
 <02m-59s> X @q[1] 1-1 |                    | [000%] --(E) --(X)
 <03m-00s> X @q[1] 1-1 |####################| [100%] --(E) --(X)
 <03m-00s> [M 0.030 Gb] Free WF (0.220)
 <03m-00s> [06] Dyson equation: Newton solver
 <03m-00s> [06.01] G0W0 : COHSEX
 <03m-02s> [FFT-SC] Mesh size:  65   65   65
 <03m-02s> [WF-SC loader] Wfs (re)loading |                    | [000%] --(E) --(X)
 <03m-02s> [M 0.469 Gb] Alloc wf_disk (0.375)
 <03m-03s> [WF-SC loader] Wfs (re)loading |####################| [100%] 01s(E) 01s(X)
 <03m-03s> [M 0.094 Gb] Free wf_disk (0.375)
 <03m-03s> G0W0 COHSEX |                    | [000%] --(E) --(X)

If I restart the calculation from this point, it prints the following in the logfile:

Code: Select all

 <---> [01] Files & I/O Directories
 <---> [02] CORE Variables Setup
 <---> [02.01] Unit cells
 <---> [02.02] Symmetries
 <---> [02.03] RL shells
 <---> [02.04] K-grid lattice
 <---> [02.05] Energies [ev] & Occupations
 <---> [03] Transferred momenta grid
 <---> [04] Bare local and non-local Exchange-Correlation
 <---> [05] Static Dielectric Matrix


 <05s> [06] Dyson equation: Newton solver
[ERROR] STOP signal received while in :[06] Dyson equation: Newton solver
[ERROR][NetCDF] NetCDF: Variable not found

The same ground-state database works for other types of calculations (e.g. RPA optics, BSE) with the same yambo binary, so I assume that the problem is rooted somewhere in the COHSEX part. Do you have any idea what may be happening? Am I missing something in the input file?

Also, I don't understand why GbndRnge is present in the input file. Am I right that it is simply ignored by the code?

If needed, I can provide the NetCDF database.

Thanks,
Marton
Márton Vörös
PhD student
Department of Atomic Physics,
Budapest University of Technology and Economics
Budafoki út 8., H-1111, Budapest, Hungary
http://www.fat.bme.hu/MartonVoros

Post by claudio » Mon Jun 06, 2011 6:40 pm

Dear Marton,

You are right; we corrected this bug in revision 18 on Subversion.

Claudio
Claudio Attaccalite
CNRS / Aix-Marseille Université / CINaM laboratory / TSN department
Campus de Luminy – Case 913
13288 MARSEILLE Cedex 09
web site: http://www.attaccalite.com

Post by vormar » Thu Jun 09, 2011 3:04 pm

Dear Claudio,

Thank you for the prompt answer. Now the code works. I have started to do some COHSEX calculations but something is not totally clear to me. Why does the code calculate the exchange self-energy? In the COHSEX case, the static self-energy has no pure exchange contribution.

So in this case the correction for the single particle level |i> is (E-E0 in the output):

<i|\Sigma_{COHSEX}-V_{xc}|i>, which is, I assume, transformed to <i|\Sigma_{COHSEX}-\Sigma_x+\Sigma_x-V_{xc}|i>, so finally the reported Sc(Eo) is the correlation part of the self-energy: <i|\Sigma_{COHSEX}-\Sigma_x|i>. Am I right about this?

Thanks,
Marton

Post by claudio » Thu Jun 09, 2011 4:24 pm

Ciao Marton

You are right, the reported SC(E) is <Sigma_COHSEX - Sigma_x>.

You can also have a look at src/qp/QP_newton.F, line 103,

or at the documentation for the plasmon-pole case: http://www.yambo-code.org/doc/docs/doc_GW.php
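
Written out, the rearrangement discussed in this exchange reads as follows. This only restates the expressions already given in the thread; the symbol \Delta E_i is a label introduced here for the E-E0 column of the output, and the second term on the last line is the quantity reported as the correlation part.

```latex
\begin{align}
\Delta E_i &= \langle i | \Sigma_{COHSEX} - V_{xc} | i \rangle \\
           &= \langle i | \Sigma_x - V_{xc} | i \rangle
            + \langle i | \Sigma_{COHSEX} - \Sigma_x | i \rangle
\end{align}
```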

Cla

Post by ljzhou86 » Tue Jan 06, 2015 2:23 am

Dear all,

vormar wrote: I have problems with COHSEX self-energy calculations using the recent GPL version of the code (3.2.4, rev.17., internal rev. 855). Whatever I tried, the code crashed at the same point. Yambo is compiled on cineca sp6 with netcdf enabled. I prepared the input file using 'yambo -p c'. My system is a molecule, thus I only have one q-point. Here is the input file:
<05s> [06] Dyson equation: Newton solver
[ERROR] STOP signal received while in :[06] Dyson equation: Newton solver
[ERROR][NetCDF] NetCDF: Variable not found

I also ran into a similar problem with a G0W0 calculation set up with "yambo -c -g n -p p", using version 3.4.1, revision 3187. Because of the time limit on our cluster, the G0W0 calculation stopped at the "G0W0 PPA" step. I therefore restarted the calculation, but got the following errors:

[ERROR] STOP signal received while in :
[ERROR][NetCDF] NetCDF: Variable not found


The input, r- and l- files are enclosed in the attachment; please help me resolve this.
Dr. Zhou Liu-Jiang
Fujian Institute of Research on the Structure of Matter
Chinese Academy of Sciences
Fuzhou, Fujian, 350002

Post by Daniele Varsano » Tue Jan 06, 2015 6:54 am

Dear Zhou Liu-Jiang,
from your reports it is not clear what is going on, as the error is not shown. Also note that the file you attached as input contains a report, not an input.
Anyway, my impression is that it is a restart problem due to some corrupted file.
I can see that you are calculating a large number of QP corrections using a huge number of CPUs. My suggestion is to split your calculation, i.e. you can try to perform several runs, each containing fewer QP corrections (e.g. 2 runs: first bands 6 to 10 and then 11 to 15). That way you will not need to restart the calculation.
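
A minimal sketch of how such a split could be scripted, assuming the QPkrange layout shown in the first post of this thread (first k-point | last k-point | first band | last band |). The helper name qpkrange_blocks, the k-point range 1 to 1 and the chunk size are hypothetical and chosen only for illustration; each generated block would go into its own input file and be run as a separate job.

```python
def qpkrange_blocks(k_first, k_last, band_first, band_last, bands_per_run):
    """Split one band range into several QPkrange blocks, one per run."""
    if bands_per_run < 1:
        raise ValueError("bands_per_run must be at least 1")
    blocks = []
    start = band_first
    while start <= band_last:
        stop = min(start + bands_per_run - 1, band_last)
        # Same field order as the QPkrange block in the first post's input:
        # first k-point | last k-point | first band | last band |
        blocks.append(
            "%QPkrange\n"
            f"  {k_first}|  {k_last}|  {start}| {stop}|\n"
            "%"
        )
        start = stop + 1
    return blocks


if __name__ == "__main__":
    # Bands 6 to 15 in runs of five bands each; the k-point range
    # (1 to 1) is a placeholder, not taken from the attached files.
    for block in qpkrange_blocks(1, 1, 6, 15, 5):
        print(block)
        print()
```

Each run then writes its own QP database, so the runs should be kept in separate job directories or otherwise prevented from overwriting each other's output.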
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

Post by ljzhou86 » Tue Jan 06, 2015 7:05 pm

Dear sir
Daniele Varsano wrote: My suggestion is to split your calculation, i.e. you can try to perform several runs, each containing fewer QP corrections (e.g. 2 runs: first bands 6 to 10 and then 11 to 15). That way you will not need to restart the calculation.

If I split my calculation into 2 runs, I will get two ndb.QP files. How can I then merge the two ndb.QP files so that I can include the GW-corrected QP energies in the BSE calculation by using KfnQPdb= "E < ./SAVE/ndb.QP"? Thanks in advance.

Post by Daniele Varsano » Tue Jan 06, 2015 9:06 pm

Dear Zhou,
if you need the QP database for BSE calculations, the tool for merging databases is ypp. This is done with the option

Code: Select all

ypp -q m

You can try that; however, it has recently been noticed that the current GPL release can have problems with this option when generating the pp input file.
This will surely be fixed in the next release.
So for the moment, if ypp is not working, what you can do is rerun the entire calculation so that it completes successfully. You can try to run the job without the distributed-memory option, possibly reducing FFTGvecs if you have memory issues. Before doing that, you need to delete the RESTART directory and the ./SAVE/ndb.QP file.
Anyway, something strange happened in your previous run, as the estimated time changed from 30m to 9h; I do not know whether you had some machine problem.
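
The clean-up step mentioned above could be scripted along these lines. This is only a sketch: it assumes that the RESTART directory sits next to SAVE in the run directory, which may differ depending on how the job was launched, and the function name clean_restart is made up for this example.

```python
import shutil
from pathlib import Path


def clean_restart(run_dir="."):
    """Remove RESTART/ and SAVE/ndb.QP under run_dir; return what was removed."""
    run_dir = Path(run_dir)
    removed = []

    # Assumed location: RESTART next to SAVE in the run directory.
    restart_dir = run_dir / "RESTART"
    if restart_dir.is_dir():
        shutil.rmtree(restart_dir)
        removed.append(restart_dir)

    # Only the QP database is deleted; other files in SAVE are left alone.
    qp_db = run_dir / "SAVE" / "ndb.QP"
    if qp_db.is_file():
        qp_db.unlink()
        removed.append(qp_db)

    return removed


if __name__ == "__main__":
    for path in clean_restart("."):
        print(f"removed {path}")
```

Calling it a second time is harmless, since missing paths are simply skipped.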
Best,
Daniele