A calculation stops at the final stage in G0W0 PPA.

Post by **emerarud** » Mon May 22, 2017 4:30 am

Dear all,

I am using Yambo GPL Version 4.0.4 Revision 107 built with MPI+OpenMP.

When I was calculating SiO2 quartz with a large unit cell,
the calculation stopped at the final stage of the G0W0 PPA step.

$tail -n 1 l_em1d_ppa_HF_and_locXC_gw0_CPU_768

-> <12h-16m-31s> P0768: G0W0 PPA |####################################### | [097%] 12h-13m-10s(E) 12h-31m-56s(X)

The calculation did not respond at all after this output and
was killed due to a time limit (~24h).

However, some of the processes seem to have finished correctly:

$tail -n 1 l_em1d_ppa_HF_and_locXC_gw0_CPU_2

-> <10h-49m-35s> P0002: G0W0 PPA |########################################| [100%] 10h-46m-13s(E) 10h-46m-13s(X)
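To see at a glance which processes are lagging, the last progress line of each log can be parsed. The following is a minimal sketch that assumes the progress-line layout shown in the two excerpts above; the names `parse_progress` and `lagging_ranks` are made up for illustration and are not part of Yambo.

```python
import re

# Matches a progress line of the form quoted in this thread, e.g.
# "-> <12h-16m-31s> P0768: G0W0 PPA |#### | [097%] 12h-13m-10s(E) ..."
# The layout is taken from the excerpts above; other versions may differ.
PROGRESS = re.compile(
    r"P(?P<rank>\d+):\s*(?P<step>.+?)\s*\|[# ]*\|\s*\[(?P<pct>\d+)%\]"
)


def parse_progress(line):
    """Return (rank, step, percent) for one progress line, or None."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return int(match.group("rank")), match.group("step"), int(match.group("pct"))


def lagging_ranks(last_lines):
    """Return the sorted ranks whose last progress line is below 100%."""
    ranks = []
    for line in last_lines:
        parsed = parse_progress(line)
        if parsed is not None and parsed[2] < 100:
            ranks.append(parsed[0])
    return sorted(ranks)
```

Feeding it the last line of every per-process log (for example collected with `tail -n 1`) gives the list of ranks that never reached 100%, which helps to tell a load imbalance from a failure on a single node.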

Do you think it is a hardware problem?

I have attached the input and output files.

The calculation was run on 288 nodes (2304 cores, flat MPI).

Best regards.

Kousuke

Post by **Daniele Varsano** » Mon May 22, 2017 2:32 pm

Dear Kousuke,
I had a look at your input and log files. It is not entirely clear what happened, but my impression is that your calculation is highly unbalanced:
some CPUs finished their tasks while others had hardly started the G0W0 PPA step.
This usually happens when parallelizing over q, since not all q-points carry the same load.
My suggestion is to avoid parallelizing over q altogether and to push the parallelism onto bands and qp:
SE_CPU= "1,36,64" # [PARALLEL] CPUs for each role
SE_ROLEs= "q,qp,b"
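As a quick sanity check before submitting, the CPU counts can be multiplied and compared with the number of MPI tasks. The sketch below assumes that the product of the `SE_CPU` entries should equal the total number of MPI tasks (here 2304 in flat MPI); the helper names `parse_layout` and `matches_mpi_tasks` are hypothetical and not part of Yambo.

```python
from math import prod


def parse_layout(se_cpu, se_roles):
    """Pair each role with its CPU count, e.g. {'q': 1, 'qp': 36, 'b': 64}."""
    cpus = [int(item) for item in se_cpu.split(",")]
    roles = [item.strip() for item in se_roles.split(",")]
    if len(cpus) != len(roles):
        raise ValueError("SE_CPU and SE_ROLEs must have the same number of entries")
    return dict(zip(roles, cpus))


def matches_mpi_tasks(se_cpu, se_roles, mpi_tasks):
    """True if the product of the per-role CPU counts equals mpi_tasks."""
    layout = parse_layout(se_cpu, se_roles)
    return prod(layout.values()) == mpi_tasks


# Values quoted in this thread: 1 * 36 * 64 = 2304 MPI tasks.
layout = parse_layout("1,36,64", "q,qp,b")
print(layout, matches_mpi_tasks("1,36,64", "q,qp,b", 2304))
```

With these values the q role gets a single CPU, so no work is split over q-points and the uneven cost per q-point no longer leaves some processes idle.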

Moreover, are you sure that this variable is set to its correct value?

% GbndRnge
  33 | 224 |                  # [GW] G[W] bands range
%

Note that this governs the sum over states in the GW calculation, so it is not safe to skip any occupied bands. Moreover, 224 bands may not be enough to reach convergence. The number of unoccupied states should in principle be infinite, and in practice it has to be increased until convergence is reached.
You are using only about twice the number of occupied bands, which seems rather few, but of course this is something to be checked.
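A convergence test on the upper limit of `GbndRnge` amounts to repeating the calculation with more bands and watching how much the quantity of interest still changes. The following is a minimal sketch of that bookkeeping; the function name `first_converged` is hypothetical, and the numbers are placeholders for illustration only, not results for this system.

```python
def first_converged(results, tol):
    """Return the smallest band count after which the value changes by < tol.

    results maps the number of bands to a computed quantity (e.g. a QP gap
    in eV). Returns None if no consecutive pair differs by less than tol.
    """
    band_counts = sorted(results)
    for smaller, larger in zip(band_counts, band_counts[1:]):
        if abs(results[larger] - results[smaller]) < tol:
            return smaller
    return None


# Placeholder values (NOT from the calculation discussed in this thread).
gap_vs_bands = {200: 8.10, 300: 8.30, 400: 8.38, 500: 8.40}
print(first_converged(gap_vs_bands, tol=0.05))
```

A single small step is only an indication; since GW sums over empty states tend to converge slowly, it is safer to check that the change stays below the tolerance over more than one further increase.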

Last remark:
do you really need to calculate QP energies for all these bands? Usually one is interested in the corrections around the Fermi energy. If you need them as input for BSE calculations, in most cases you can calculate the QP corrections for a few bands around the Fermi energy and extrapolate the corrections to higher bands.
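One simple way to do such an extrapolation is to fit the computed corrections as a linear function of the Kohn-Sham energy (a rigid shift plus a stretching) and apply the fit to the bands that were not calculated. The sketch below assumes this linear model; the function names are hypothetical and the input numbers are placeholders, not values from this calculation.

```python
def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolated_qp_energy(e_ks, slope, intercept):
    """Kohn-Sham energy plus the fitted correction slope * e_ks + intercept."""
    return e_ks + slope * e_ks + intercept


# Placeholder data: KS energies (eV) of a few conduction bands and their
# QP corrections (eV). Illustrative numbers only.
e_ks = [1.0, 2.0, 3.0]
corrections = [1.2, 1.4, 1.6]
slope, intercept = linear_fit(e_ks, corrections)
print(extrapolated_qp_energy(5.0, slope, intercept))
```

Valence and conduction bands are usually fitted separately, since their corrections generally have different signs and slopes.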

Best,

Daniele

Post by **emerarud** » Wed May 31, 2017 11:51 am

Dear Daniele,

I am sorry for my late reply.

> My suggestion is to avoid parallelizing over q altogether and to push the parallelism onto bands and qp:
> SE_CPU= "1,36,64" # [PARALLEL] CPUs for each role
> SE_ROLEs= "q,qp,b"

Thanks to your advice, I succeeded in calculating G0W0!
The error seems to have been due to the load imbalance, as you expected.

> Note that this governs the sum over states in the GW calculation,
> so it is not safe to skip any occupied bands.
> Moreover, 224 bands may not be enough to reach convergence.
> The number of unoccupied states should in principle be infinite, and in practice
> it has to be increased until convergence is reached.
> You are using only about twice the number of occupied bands, which seems rather few,
> but of course this is something to be checked.

Thank you for your advice!

I checked the convergence using this procedure:

http://www.attaccalite.com/reasonable-p ... culations/

The dielectric function in optics seemed to be converged with

% BndsRnXd
 33 | 224 | # [Xd] Polarization function bands
%

However, I should have used a larger number of bands for GbndRnge.

> do you really need to calculate QP energies for all these bands? Usually one is interested in the corrections
> around the Fermi energy. If you need them as input for BSE calculations, in most cases you can calculate
> the QP corrections for a few bands around the Fermi energy and extrapolate the corrections to higher bands.

I see. I will use the EXTOP (scissor) parameter for BSE calculations from now on.

Best regards.

Kousuke

Yambo Community Forum
