Dear all,
I am using Yambo GPL Version 4.0.4 Revision 107, built with MPI+OpenMP.
When I was calculating SiO2 quartz, which has a large unit cell,
the calculation stopped at the final stage of G0W0 PPA.
$tail -n 1 l_em1d_ppa_HF_and_locXC_gw0_CPU_768
-> <12h-16m-31s> P0768: G0W0 PPA |####################################### | [097%] 12h-13m-10s(E) 12h-31m-56s(X)
The calculation did not respond at all after this output and
was killed when it hit the time limit (~24h).
However, some of the processes seem to have finished correctly...
$tail -n 1 l_em1d_ppa_HF_and_locXC_gw0_CPU_2
-> <10h-49m-35s> P0002: G0W0 PPA |########################################| [100%] 10h-46m-13s(E) 10h-46m-13s(X)
Do you think it is a hardware problem?
I attached the input and output files.
The calculation was done with 288 nodes (2304 cores, flat MPI).
Best regards.
Kousuke
A calculation stops at the final stage in G0W0 PPA.
-
- Posts: 14
- Joined: Wed Sep 16, 2015 10:55 am
Kosuke Nakano
Asahi Glass Co., Ltd. Corporate Research Center
1150 Hazawa-cho Kanagawa-ku Yokohama-shi
Kanagawa 221-8755 Japan
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
Re: A calculation stops at the final stage in G0W0 PPA.
Dear Kosuke,
I had a look at your input/log files. It is not totally clear what happened, but my impression is that your calculation is highly unbalanced:
some CPUs finished their task while others had hardly started the G0W0 PPA step.
This usually happens when parallelizing over q, as not all q points carry the same load.
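One way to see such an imbalance is to compare the last progress line of every per-CPU log file. The following is only a minimal sketch: the regular expression is inferred from the two progress lines quoted earlier in this thread, and the helper names `parse_progress` and `stalled_ranks` are hypothetical, not part of Yambo.

```python
import re

# Pattern inferred from the progress lines quoted in this thread, e.g.
# "-> <12h-16m-31s> P0768: G0W0 PPA |####   | [097%] 12h-13m-10s(E) 12h-31m-56s(X)"
# (assumption: other Yambo versions may format these lines differently).
PROGRESS = re.compile(r"P(\d+):\s*(.+?)\s*\|[# ]*\|\s*\[(\d+)%\]")


def parse_progress(line):
    """Return (rank, step, percent), or None if this is not a progress line."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return int(match.group(1)), match.group(2), int(match.group(3))


def stalled_ranks(last_lines):
    """Return (rank, percent) for every progress line that is below 100%."""
    stalled = []
    for line in last_lines:
        parsed = parse_progress(line)
        if parsed is not None and parsed[2] < 100:
            stalled.append((parsed[0], parsed[2]))
    return stalled
```

Feeding it the last line of each log file (what `tail -n 1` prints) gives the list of ranks that had not completed the step when the job was killed.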
My suggestion is to avoid parallelization over q altogether and to push the parallelism onto bands and qp:
SE_CPU= "1,36,64" # [PARALLEL] CPUs for each role
SE_ROLEs= "q,qp,b"
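As a quick sanity check on such a layout, the sketch below pairs each role with its CPU count and compares the product with the number of MPI tasks. It assumes that the product of the SE_CPU entries is expected to equal the total number of MPI tasks (here the 2304 cores mentioned in the first post); `parse_cpu_layout` and `layout_matches` are hypothetical helper names, not Yambo functions.

```python
from math import prod


def parse_cpu_layout(cpu, roles):
    """Pair each role with its CPU count, e.g. {'q': 1, 'qp': 36, 'b': 64}."""
    counts = [int(item) for item in cpu.split(",")]
    names = [item.strip() for item in roles.split(",")]
    if len(counts) != len(names):
        raise ValueError("SE_CPU and SE_ROLEs must have the same number of entries")
    return dict(zip(names, counts))


def layout_matches(cpu, roles, n_mpi_tasks):
    """True if the product of the per-role CPU counts equals the MPI task count."""
    return prod(parse_cpu_layout(cpu, roles).values()) == n_mpi_tasks


# Values taken from this thread: 1 x 36 x 64 tasks on 2304 cores (flat MPI).
print(layout_matches("1,36,64", "q,qp,b", 2304))
```

With the values above, q gets a single task, so no work is distributed over q points.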
Moreover, are you sure that this variable is set to its correct value?
% GbndRnge
33 | 224 | # [GW] G[W] bands range
%
Note that this governs the sum over states in the GW calculation, so it is not safe to skip some occupied bands. Moreover, 224 bands may not be enough to reach convergence. The number of unoccupied states should in principle be infinite, and in practice it has to be brought to convergence.
You are using a number of bands that is nearly twice the number of occupied ones, which seems quite few, but of course this is something to be checked.
Last remark:
do you really need to calculate QP energies for all these bands? Usually one is interested in the corrections around the Fermi energy. If you need them as input for BSE calculations, in most cases you can calculate the QP corrections for some bands around the Fermi energy and extrapolate the corrections for higher bands.
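The extrapolation mentioned above can be sketched as a straight-line fit of the QP correction against the Kohn-Sham energy, i.e. a rigid shift plus a stretching. This is only an illustration under stated assumptions: the function names are hypothetical, a linear dependence is assumed, and valence and conduction bands would normally be fitted separately.

```python
def fit_line(e_ks, corrections):
    """Least-squares fit of corrections = slope * e_ks + intercept."""
    n = len(e_ks)
    if n < 2 or n != len(corrections):
        raise ValueError("need at least two (energy, correction) pairs")
    mean_x = sum(e_ks) / n
    mean_y = sum(corrections) / n
    sxx = sum((x - mean_x) ** 2 for x in e_ks)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(e_ks, corrections))
    slope = sxy / sxx  # stretching of the bands
    intercept = mean_y - slope * mean_x  # rigid shift
    return slope, intercept


def extrapolated_qp_energy(e_ks, slope, intercept):
    """Kohn-Sham energy plus the fitted correction, for bands not computed in GW."""
    return e_ks + slope * e_ks + intercept
```

The fit would be made on the few bands around the Fermi energy for which the corrections were actually computed, and then applied to the higher bands.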
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
-
- Posts: 14
- Joined: Wed Sep 16, 2015 10:55 am
Re: A calculation stops at the final stage in G0W0 PPA.
Dear Daniele,
I am sorry for my late reply.
Thanks to your advice, I succeeded in calculating G0W0!
The error seems to have been due to the load imbalance, as you expected.
Thank you also for your advice on the bands.
I checked the convergence using this procedure:
http://www.attaccalite.com/reasonable-p ... culations/
The dielectric function in optics seemed to be converged with
% BndsRnXd
33 | 224 | # [Xd] Polarization function bands
%
But I should have adopted a larger number of bands for GbndRnge.
I see. I will use the EXTOP (scissor) parameter for BSE calculations from next time.
Best regards.
Kousuke
Kosuke Nakano
Asahi Glass Co., Ltd. Corporate Research Center
1150 Hazawa-cho Kanagawa-ku Yokohama-shi
Kanagawa 221-8755 Japan