
COHSEX crash (netcdf related?)

Posted: Mon Jun 06, 2011 3:46 pm
by vormar
Dear all,

I have problems with COHSEX self-energy calculations using the recent GPL version of the code (3.2.4, rev. 17, internal rev. 855). Whatever I try, the code crashes at the same point. Yambo is compiled on CINECA SP6 with NetCDF enabled. I prepared the input file using 'yambo -p c'. My system is a molecule, so I have only one q-point. Here is the input file:

Code:

cohsex
gw0
HF_and_locXC
em1s
EXXRLvcs= 140233       RL
% QpntsRXs
 1 | 1 |
%
% BndsRnXs
    1 | 100 |
%
NGsBlkXs= 1            Ry
% LongDrXs
 1.000000 | 0.000000 | 0.000000 |
%
% GbndRnge
    1 | 100 |
%
%QPkrange
  1|  1|  28| 29|
%
%QPerange
  1|  1| 0.0|-1.0|
%
The calculation stops after the static screening step:

Code:

[...]
 <02m-46s> Xo@q[1] 1-1 |##################  | [090%] 02m-06s(E) 02m-20s(X)
 <02m-53s> Xo@q[1] 1-1 |################### | [095%] 02m-13s(E) 02m-20s(X)
 <02m-59s> Xo@q[1] 1-1 |####################| [100%] 02m-19s(E) 02m-19s(X)
 <02m-59s> X @q[1] 1-1 |                    | [000%] --(E) --(X)
 <03m-00s> X @q[1] 1-1 |####################| [100%] --(E) --(X)
 <03m-00s> [M 0.030 Gb] Free WF (0.220)
 <03m-00s> [06] Dyson equation: Newton solver
 <03m-00s> [06.01] G0W0 : COHSEX
 <03m-02s> [FFT-SC] Mesh size:  65   65   65
 <03m-02s> [WF-SC loader] Wfs (re)loading |                    | [000%] --(E) --(X)
 <03m-02s> [M 0.469 Gb] Alloc wf_disk (0.375)
 <03m-03s> [WF-SC loader] Wfs (re)loading |####################| [100%] 01s(E) 01s(X)
 <03m-03s> [M 0.094 Gb] Free wf_disk (0.375)
 <03m-03s> G0W0 COHSEX |                    | [000%] --(E) --(X)
If I restart the calculation from this point then it prints the following in the logfile:

Code:

 <---> [01] Files & I/O Directories
 <---> [02] CORE Variables Setup
 <---> [02.01] Unit cells
 <---> [02.02] Symmetries
 <---> [02.03] RL shells
 <---> [02.04] K-grid lattice
 <---> [02.05] Energies [ev] & Occupations
 <---> [03] Transferred momenta grid
 <---> [04] Bare local and non-local Exchange-Correlation
 <---> [05] Static Dielectric Matrix


 <05s> [06] Dyson equation: Newton solver
[ERROR] STOP signal received while in :[06] Dyson equation: Newton solver
[ERROR][NetCDF] NetCDF: Variable not found
The same ground-state database works for other types of calculations (e.g. RPA optics, BSE) with the same yambo binary, so I assume that the problem is rooted somewhere in the COHSEX part. Do you have any idea what may be happening? Am I missing something in the input file?

Also, I don't understand why GbndRnge is present in the input file. Am I right that this is just simply ignored by the code?

If needed, I can provide the netcdf database.

Thanks,
Marton

Re: COHSEX crash (netcdf related?)

Posted: Mon Jun 06, 2011 6:40 pm
by claudio
Dear Marton

You are right; we corrected this bug in revision 18 on Subversion.

Claudio

Re: COHSEX crash (netcdf related?)

Posted: Thu Jun 09, 2011 3:04 pm
by vormar
Dear Claudio,

Thank you for the prompt answer. Now the code works. I have started to do some COHSEX calculations but something is not totally clear to me. Why does the code calculate the exchange self-energy? In the COHSEX case, the static self-energy has no pure exchange contribution.

So in this case the correction for the single particle level |i> is (E-E0 in the output):

<i|\Sigma_{COHSEX}-V_{xc}|i>, which is, I assume, transformed to <i|\Sigma_{COHSEX}-\Sigma_x+\Sigma_x-V_{xc}|i>, so finally the reported Sc(Eo) is the correlation part of the self-energy: <i|\Sigma_{COHSEX}-\Sigma_x|i>. Am I right about this?

Thanks,
Marton

Re: COHSEX crash (netcdf related?)

Posted: Thu Jun 09, 2011 4:24 pm
by claudio
Ciao Marton

you are right, the reported SC(E) is <Sigma_COHSEX - Sigma_x>

You can also have a look at: src/qp/QP_newton.F line 103

or documentation for the plasmon pole case: http://www.yambo-code.org/doc/docs/doc_GW.php

Cla
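
The decomposition discussed in the two posts above can be written out explicitly. This is only a restatement of the expressions already given in the thread, in LaTeX notation; the labelling of the second term as the exchange-minus-xc part is an interpretation, not something stated by the developers here.

```latex
\begin{align}
E_i - E_i^{0}
  &= \langle i | \Sigma_{COHSEX} - V_{xc} | i \rangle \\
  &= \langle i | \Sigma_{COHSEX} - \Sigma_x + \Sigma_x - V_{xc} | i \rangle \\
  &= \underbrace{\langle i | \Sigma_{COHSEX} - \Sigma_x | i \rangle}_{\text{reported as } S_c(E_0)}
   + \underbrace{\langle i | \Sigma_x - V_{xc} | i \rangle}_{\text{exchange minus xc potential}}
\end{align}
```

Adding and subtracting \Sigma_x changes nothing in the total correction; it only separates the reported correlation part from the exchange term.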

Re: COHSEX crash (netcdf related?)

Posted: Tue Jan 06, 2015 2:23 am
by ljzhou86
Dear all,
vormar wrote: I have problems with COHSEX self-energy calculations using the recent GPL version of the code (3.2.4, rev. 17, internal rev. 855). Whatever I try, the code crashes at the same point. Yambo is compiled on CINECA SP6 with NetCDF enabled. I prepared the input file using 'yambo -p c'. My system is a molecule, so I have only one q-point. Here is the input file:
<05s> [06] Dyson equation: Newton solver
[ERROR] STOP signal received while in :[06] Dyson equation: Newton solver
[ERROR][NetCDF] NetCDF: Variable not found
I also ran into a similar problem with a G0W0 calculation set up with "yambo -c -g n -p p" using Version 3.4.1 Revision 3187. Because of the time limit on our cluster, the G0W0 calculation stopped at the "G0W0 PPA" step, so I restarted it; however, I got the following errors:

[ERROR] STOP signal received while in :
[ERROR][NetCDF] NetCDF: Variable not found


The input, r- and l- files are enclosed in the attachment; please help me resolve this. Note that

Re: COHSEX crash (netcdf related?)

Posted: Tue Jan 06, 2015 6:54 am
by Daniele Varsano
Dear Zhou Liu-Jiang,
from your reports it is not clear what is going on, as the error is not shown. Also note that your input file contains a report and not an input.
Anyway, my impression is that it is a restart problem due to some corrupted file.
I can see that you are calculating a large number of QP corrections using a huge number of CPUs. My suggestion here is to split your calculation, i.e. you can try to perform several runs, each containing fewer QP corrections (e.g. 2 runs: first bands 6 to 10 and then 11 to 15). In this way you will not need to restart the calculations.
Best,
Daniele
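
The splitting suggested above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the helper name `make_input`, the file names, and the k-point range 1-1 are assumptions made for illustration, and the four-column QPkrange layout is taken from the input file quoted earlier in this thread.

```shell
# Sketch only: derive a new yambo input with a different QPkrange block.
# Hypothetical helper, not part of yambo.
# Usage: make_input SRC DST KFIRST KLAST BFIRST BLAST
make_input() {
    awk -v row="  $3|  $4|  $5|  $6|" '
        # First line after the QPkrange header: replace it with the new range
        inblock && !replaced { print row; replaced = 1; inblock = 0; next }
        # Header of the block, written "%QPkrange" or "% QPkrange"
        /^%[ ]*QPkrange/ { inblock = 1 }
        # Every other line is copied unchanged
        { print }
    ' "$1" > "$2"
}
```

With an existing input called, say, yambo.in (an assumed name), one would call `make_input yambo.in qp_06_10.in 1 1 6 10` and `make_input yambo.in qp_11_15.in 1 1 11 15`, then run each with its own job string, e.g. `yambo -F qp_06_10.in -J qp_06_10`, so that the two runs should not write to the same ndb.QP.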

Re: COHSEX crash (netcdf related?)

Posted: Tue Jan 06, 2015 7:05 pm
by ljzhou86
Dear sir
Daniele Varsano wrote: My suggestion here is to split your calculation, i.e. you can try to perform several runs, each containing fewer QP corrections (e.g. 2 runs: first bands 6 to 10 and then 11 to 15). In this way you will not need to restart the calculations.
If I split my calculation into 2 runs, I will get two ndb.QP files. How can I then merge the two ndb.QP files so that I can include the GW-corrected QP energies in the BSE calculation by using KfnQPdb= "E < ./SAVE/ndb.QP"? Thanks in advance.

Re: COHSEX crash (netcdf related?)

Posted: Tue Jan 06, 2015 9:06 pm
by Daniele Varsano
Dear Zhou,
if you need the qp database for BSE calculations, the tool to merge databases is ypp. This is done by using the option

Code:

ypp -q m
You can try to use that; however, it has recently been noticed that the current GPL release can have some problems with that option when generating the pp input file.
This will surely be fixed in the next release.
So for the moment, if ypp is not working, what you can do is try to rerun the entire calculation through to successful completion. You can try to run the job without the distributed-memory option, possibly reducing FFTGvecs if you have memory issues. Before doing that, you need to delete the RESTART directory and the ./SAVE/ndb.QP file.
Anyway, something strange happened in your previous run, as the estimated time changed from 30m to 9h; I do not know if you had some machine problem.
Best,
Daniele
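
The cleanup step mentioned above (deleting the RESTART directory and ./SAVE/ndb.QP before rerunning) might look like the sketch below. The function name `clean_restart` is made up for illustration, and it assumes that RESTART sits directly inside the run directory next to SAVE.

```shell
# Sketch only: remove stale restart data before rerunning the full calculation.
# Hypothetical helper; assumes RESTART and SAVE live directly under the run directory.
# Usage: clean_restart [RUN_DIR]   (defaults to the current directory)
clean_restart() {
    dir="${1:-.}"
    rm -rf "$dir/RESTART"       # restart data left by the interrupted run
    rm -f  "$dir/SAVE/ndb.QP"   # incomplete QP database
}
```

Only these two items are removed; the other databases in SAVE are left in place.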