Advice on parameters

Post by **bob** » Mon Nov 28, 2016 8:59 am

Dear all,

I've been trying to calculate the QP-corrected bands of a 16-atom silicon cell. Yambo is running on 4 OMP threads, but according to its estimate it is going to take about 10 days to determine EXS. I assume I could dial down some parameters a bit, but I wonder if some of you have an immediate suggestion about what I'm doing suboptimally (I know this is an RTFM moment, but there is some urgency...). I attached my yambo.in and some output.

Appreciate any help!
Bjoern

Post by **Daniele Varsano** » Mon Nov 28, 2016 9:04 am

Dear Bjoern,

The input/output files are not attached; you probably need to rename them with an allowed suffix (.txt, .zip, etc.).
In any case, I suggest you check how many states you are calculating QP corrections for, and restrict them to the bands
you are interested in (%QPkrange), usually a limited number of occupied and empty bands. Once you upload the input we will have a look to see if there is something strange; in any case,
the timing you mention is not reasonable.

Best,

Daniele

Post by **bob** » Tue Nov 29, 2016 9:18 am

Dear Daniele,

my bad. The output should be here now.

Best,
Bjoern

Post by **Daniele Varsano** » Tue Nov 29, 2016 9:43 am

Dear Bob,

Yambo takes a long time because in the input file you are asking for:

Code:

%QPkrange                    # [GW] QP generalized Kpoint/Band indices
  1| 60|  1| 46|
%

60 x 46 = 2760 QP corrections, which is a large number.

People are usually interested in bands near the Fermi energy; here you are also asking for deep bands and high-energy bands.
Setting, for instance:

Code:

%QPkrange                    # [GW] QP generalized Kpoint/Band indices
  1| 60|  29| 36|
%

With this setting Yambo will calculate corrections for 4 occupied and 4 unoccupied bands only (60 x 8 = 480 QP corrections).
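To make the saving concrete, here is a minimal Python sketch (not part of Yambo; the helper name `count_qp` is made up for illustration). It assumes the four fields of a `%QPkrange` row are first k-point, last k-point, first band, last band, as in the two blocks above.

```python
def count_qp(row: str) -> int:
    """Count the QP corrections requested by one %QPkrange row 'k1| k2| b1| b2|'."""
    k1, k2, b1, b2 = (int(field) for field in row.split("|") if field.strip())
    return (k2 - k1 + 1) * (b2 - b1 + 1)


full = count_qp("1| 60|  1| 46|")       # all 46 bands at 60 k-points
reduced = count_qp("1| 60|  29| 36|")   # 8 bands around the gap
print(full, reduced, full / reduced)
```

If the cost scaled simply with the number of corrections (an assumption, not something stated in this thread), the reduced range would cut the self-energy work by the printed ratio.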

Hope it helps,

Daniele

Post by **bob** » Wed Nov 30, 2016 4:07 pm

Hi Daniele,

That of course does help. I think I might still need more bands than that, but that's a different issue.

What I also realized is that I'm also taking only few empty bands into account for the dielectric matrix (14, compared to 32 occupied). I guess I should increase that? At the same time, I believe that the k-point sampling might be slightly greedy.

Anyway, that leads me to another point I'd like to clarify about the practical workflow: I run an nscf pw calculation using a uniform IBZ k-point grid. Then yambo calculates the QP corrections for these k-points. If I want to plot the QP bandstructure along high symmetry lines, I should use ypp and interpolate the BS. Correct, or am I missing something?

Cheers,
Bjoern

Post by **Daniele Varsano** » Wed Nov 30, 2016 4:19 pm

Dear Bjoern,

> What I also realized is that I'm also taking only few empty bands into account for the dielectric matrix (14, compared to 32 occupied). I guess I should increase that? At the same time, I believe that the k-point sampling might be slightly greedy.

Yes, you need to check convergence. 14 empty bands seem to be few, both in the screening and in the GW summation (%GbndRnge).
For the latter you can use the terminator technique, which speeds up convergence with respect to this parameter, by setting
GTermKind= "BG".

Please also note that you are considering just 1 G-vector in the screening (i.e. a scalar):

Code:

NGsBlkXd= 1            RL

The size of the screening matrix is another parameter whose convergence you need to check. I suggest increasing this parameter in Ry rather than in RL (e.g. try 1 Ry, 2 Ry, etc. until convergence), and checking the convergence on a few points only (e.g. the gap).

For an explanation of each variable you can have a look here:
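A convergence scan of this kind can be prepared with a few lines of Python. The sketch below is only an illustration: `set_variable` is a hypothetical helper, not a Yambo tool, and it assumes input lines of the form `NAME= value unit   # comment` as in the snippets in this thread.

```python
def set_variable(text: str, name: str, value, unit: str = "") -> str:
    """Return the input text with the 'name= ...' line replaced, keeping any comment."""
    out, found = [], False
    for line in text.splitlines():
        body, sep, comment = line.partition("#")
        if "=" in body and body.split("=")[0].strip() == name:
            line = f"{name}= {value} {unit}".rstrip()
            if sep:
                line += "   # " + comment.strip()
            found = True
        out.append(line)
    if not found:
        raise KeyError(f"{name} not found in input")
    return "\n".join(out) + "\n"


template = "NGsBlkXd= 1            RL\n"   # the line quoted above
for cutoff in (1, 2, 3, 4):                # trial cutoffs in Ry
    print(set_variable(template, "NGsBlkXd", cutoff, "Ry"), end="")
```

In practice each generated text would be written to its own input file and run separately, and the gap compared between successive cutoffs.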
http://www.attaccalite.com/yambo-input- ... explained/

> If I want to plot the QP bandstructure along high symmetry lines, I should use ypp and interpolate the BS. Correct, or am I missing something?

Correct. If you want to calculate the QP correction for a single point not included in your grid, you can build a specific grid that exploits the screening calculated with another grid, but if you want the band structure you need to interpolate.

I also suggest you have a look at the Yambo tutorial page for a typical GW calculation:
http://www.yambo-code.org/tutorials/GW/index.php

Best,

Daniele

Post by **bob** » Mon Dec 05, 2016 8:48 am

Hi Daniele,

thanks for your suggestions and the clarification - greatly appreciated.

I am currently running these checks but hit trouble with the runtimes. I attached an input and current state of output as an example. Since the cluster I'm running this on has behaved strangely sometimes, I wonder if it is normal that the runtime estimate in the G0W0 step varies so much? It goes from ~2h to ~1d2h as of now. The queue has a 48 hour wall time limit, so I might eventually run into trouble there.

In fact, I have two more questions related to this:

a) I realized that when the job is terminated during the calculations of the X[q], yambo restarts where it was at time of termination. Would the same also hold for the EXS and G0W0 steps?

b) Would it make sense to go to higher parallelization? Right now I'm running the job on 1 node and 24 OpenMP threads, so I'm considering if it's worth using 2 nodes (MPI) and 24 OpenMP threads on each node. But even if so, I have not fully understood how to tell yambo how to do just that, even after looking at the examples. Any hints?

Cheers,
Bjoern

Post by **Daniele Varsano** » Mon Dec 05, 2016 9:50 am

Dear Bjoern,

> I wonder if it is normal that the runtime estimate in the G0W0 step varies so much?

No, it is not.

> a) I realized that when the job is terminated during the calculations of the X[q], yambo restarts where it was at time of termination. Would the same also hold for the EXS and G0W0 steps?

No, here the code does not restart, but you can divide your job into several smaller runs, e.g.:
Run1:

Code:

%QPkrange                    # [GW] QP generalized Kpoint/Band indices
  1| 28|  31|32|
%

Run2:

Code:

%QPkrange                    # [GW] QP generalized Kpoint/Band indices
  1| 28|  33|34|
%

etc.
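The splitting into Run1, Run2, etc. can be generated rather than typed by hand. This is a minimal sketch, assuming the same four-field `%QPkrange` layout as above; the function name `split_qpkrange` and the choice of two bands per run are illustrative only.

```python
def split_qpkrange(k_first, k_last, b_first, b_last, bands_per_run):
    """Return one %QPkrange block per chunk of bands, covering b_first..b_last."""
    blocks = []
    for start in range(b_first, b_last + 1, bands_per_run):
        stop = min(start + bands_per_run - 1, b_last)
        blocks.append(
            "%QPkrange\n"
            f"  {k_first}| {k_last}|  {start}| {stop}|\n"
            "%\n"
        )
    return blocks


for i, block in enumerate(split_qpkrange(1, 28, 31, 34, 2), start=1):
    print(f"Run{i}:")
    print(block)
```

Smaller chunks make it easier to fit each run inside a wall-time limit, at the price of more jobs to submit and collect.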

> b) Would it make sense to go to higher parallelization? Right now I'm running the job on 1 node and 24 OpenMP threads, so I'm considering if it's worth using 2 nodes (MPI) and 24 OpenMP threads on each node. But even if so, I have not fully understood how to tell yambo how to do just that, even after looking at the examples. Any hints?

Sure, MPI parallelization is much more effective here than OpenMP. I would use OpenMP with no more than 4 to 8 threads, and MPI parallelization also within the same node. To do that you need to activate the parallelization variables by adding -V par when building the input file. On the Yambo web page there is a dedicated tutorial for practising with the parallelization variables. My advice is to parallelize over bands and k-points and to avoid parallelizing over q-points, as most of the time you end up with an unbalanced run.

A typical set of variables in the input, for e.g. 12 MPI processes with 2 threads each, reads:

Code:

X_all_q_CPU= "1 6 2 1"              # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v"            # [PARALLEL] CPUs roles (q,k,c,v)
X_all_q_nCPU_invert=0        # [PARALLEL] CPUs for matrix inversion
X_Threads=  2              # [OPENMP/X] Number of threads for response functions
DIP_Threads=  2             # [OPENMP/X] Number of threads for dipoles
SE_CPU= "1 6 2"                   # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b"                 # [PARALLEL] CPUs roles (q,qp,b)
SE_Threads=  2
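Before submitting, the layout can be sanity-checked with a short script. The sketch below rests on one assumption that is not spelled out in this post: that the entries of each `*_CPU` string should multiply to the number of MPI processes, with one entry per role in the matching `*_ROLEs` string. `check_layout` is a hypothetical helper, not a Yambo utility.

```python
from math import prod


def check_layout(cpus: str, roles: str, mpi_tasks: int) -> None:
    """Raise ValueError if a *_CPU / *_ROLEs pair is inconsistent with the MPI task count."""
    counts = [int(x) for x in cpus.split()]
    names = roles.split()
    if len(counts) != len(names):
        raise ValueError(f"{len(counts)} CPU entries for {len(names)} roles")
    if prod(counts) != mpi_tasks:
        raise ValueError(f"product {prod(counts)} != {mpi_tasks} MPI tasks")


mpi_tasks, threads = 12, 2
check_layout("1 6 2 1", "q k c v", mpi_tasks)   # X_all_q_CPU / X_all_q_ROLEs
check_layout("1 6 2", "q qp b", mpi_tasks)      # SE_CPU / SE_ROLEs
print("cores used:", mpi_tasks * threads)
```

Both strings in the example multiply to 12, and 12 processes with 2 threads each fill the 24 cores of the node mentioned in the question.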

Best,
Daniele

Post by **bob** » Tue Dec 06, 2016 8:34 am

Dear Daniele,

Can't thank you enough.

I have run some comparison tests, and here is what I found. As you said, a 12/2 parallelization is more efficient. I have also looked into linking against additional libraries from MKL (adding FFTW3 and SCALAPACK):

Compiled with MKL Lapack/Blas

Code:

MPI/OMP        X         EXS       G0W0
-------------------------------------------
1/24         1h 15m    1h 32m     8h 51m
12/2            35m       15m     1h 57m

Compiled with MKL Lapack/Blas + FFTW3 + SCALAPACK

Code:

MPI/OMP        X         EXS       G0W0
-------------------------------------------
1/24            50m       37m     3h 16m
12/2            19m        6m     2h 01m

FFTW3 seems to have a noticeable influence, even more so for the purely OpenMP-threaded run. I am, however, still a bit puzzled about the G0W0 timings: a) the difference between the two builds, b) the fact that, also on a different cluster, the timing estimate for G0W0 grows with run time (see attached log). Given that it starts out from ~10m but ends up at ~2h, I wonder if I'm still doing something wrong? Is there anything else I could test? Or should I not concern myself with that?

One final question: you mentioned setting the variable:
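For reference, the speedups implied by the two tables can be worked out with a few lines of Python. This is just arithmetic on the timings quoted above; the helper `to_minutes` is made up for the purpose.

```python
import re


def to_minutes(text: str) -> int:
    """Convert a duration such as '1h 15m' or '35m' to minutes."""
    parts = {unit: int(num) for num, unit in re.findall(r"(\d+)\s*([hm])", text)}
    return 60 * parts.get("h", 0) + parts.get("m", 0)


steps = ("X", "EXS", "G0W0")
timings = {  # build -> MPI/OMP layout -> (X, EXS, G0W0), copied from the tables
    "MKL Lapack/Blas": {
        "1/24": ("1h 15m", "1h 32m", "8h 51m"),
        "12/2": ("35m", "15m", "1h 57m"),
    },
    "MKL Lapack/Blas + FFTW3 + SCALAPACK": {
        "1/24": ("50m", "37m", "3h 16m"),
        "12/2": ("19m", "6m", "2h 01m"),
    },
}

for build, runs in timings.items():
    for step, slow, fast in zip(steps, runs["1/24"], runs["12/2"]):
        ratio = to_minutes(slow) / to_minutes(fast)
        print(f"{build:38s} {step:5s} 1/24 -> 12/2: {ratio:.2f}x")
```

For G0W0 the ratio is 531/117 (about 4.5x) with MKL only, but 196/121 (about 1.6x) with FFTW3 and SCALAPACK, because the 1/24 run improves a lot while the 12/2 run stays at about two hours.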

Code:

X_all_q_nCPU_invert=0        # [PARALLEL] CPUs for matrix inversion

whereas using -V par produces

Code:

X_all_q_nCPU_LinAlg_INV= 1   # [PARALLEL] CPUs for Linear Algebra

Is that the same variable/setting?

Cheers,
Bjoern

Post by **Daniele Varsano** » Tue Dec 06, 2016 10:23 am

Dear Bjoern,

Great that your calculations are running properly. MKL surely helps a lot.
About the G0W0 timing, I would say that the MPI runs are essentially the same, considering fluctuations on the machine. As for the OMP 1/24 run, it most probably
benefits from the internal OMP parallelization of the FFTW libraries.

> Given that it starts out from ~10m but ends up at ~2h, I wonder if I'm still doing something wrong? Is there anything else I could test? Or should I not concern myself with that?

OK, we will check whether some timing is misplaced there.

> X_all_q_nCPU_LinAlg_INV= 1

I'm not sure, but this appears when SCALAPACK is linked. Scalapack features have been added very recently.

Overall I think you have a good setup to run yambo.

Best,

Daniele
