Advice on parameters
Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano
- bob
- Posts: 42
- Joined: Wed Aug 04, 2010 8:39 am
- Location: Eindhoven, The Netherlands
Advice on parameters
Dear all,
I've been trying to calculate the QP-corrected bands of a 16-atom silicon cell. Yambo is running on 4 OMP threads, but according to its estimate it's going to take about 10 days to determine EXS. I assume I could dial down some parameters a bit, but I wonder if some of you might have an immediate suggestion as to what I'm doing suboptimally (I know this is an RTFM moment, but there is some urgency...). I attached my yambo.in and some output.
Appreciate any help!
Bjoern
Last edited by bob on Mon Nov 28, 2016 9:16 am, edited 1 time in total.
Dr. Bjoern Baumeier
Eindhoven University of Technology
Eindhoven, The Netherlands
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Advice on parameters
Dear Bjoern,
The input/output files are not attached; you should probably rename them with an allowed suffix (.txt, .zip, etc.).
In any case, I suggest you check for how many states you are calculating the QP corrections, and restrict them to the bands
you are interested in (%QPkrange), usually a limited number of occupied and empty bands. Once you upload the input we will check whether anything looks strange; in any case,
the timing you mention is not reasonable.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
- bob
- Posts: 42
- Joined: Wed Aug 04, 2010 8:39 am
- Location: Eindhoven, The Netherlands
Re: Advice on parameters
Dear Daniele,
my bad. The output should be here now.
Best,
Bjoern
Dr. Bjoern Baumeier
Eindhoven University of Technology
Eindhoven, The Netherlands
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Advice on parameters
Dear Bob,
Yambo takes a long time because in the input file you are asking for:
Code: Select all
%QPkrange # [GW] QP generalized Kpoint/Band indices
1| 60| 1| 46|
%
that is, 60x46 = 2760 QP corrections, which is a large number.
Usually people are interested in bands near the Fermi energy; here you are also asking for deep bands and high-energy bands.
Setting, for instance:
Code: Select all
%QPkrange # [GW] QP generalized Kpoint/Band indices
1| 60| 29| 36|
%
Yambo will calculate 4 occupied and 4 unoccupied bands.
Hope it helps,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
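As a quick cross-check of the count quoted in the reply above: the number of QP corrections requested by one %QPkrange line is simply (number of k-points) x (number of bands). A minimal Python sketch follows; the function name is made up for illustration and is not part of Yambo, and it only counts corrections, it does not predict runtime.

```python
def count_qp_corrections(k_first, k_last, band_first, band_last):
    """Number of (k-point, band) pairs selected by one %QPkrange line.

    Hypothetical helper: both ranges are inclusive, as in the
    'first| last| first| last|' layout of the input file.
    """
    n_kpoints = k_last - k_first + 1
    n_bands = band_last - band_first + 1
    return n_kpoints * n_bands


# Original request: k-points 1..60, bands 1..46
full = count_qp_corrections(1, 60, 1, 46)
# Suggested request: k-points 1..60, bands 29..36 (4 occupied + 4 empty)
reduced = count_qp_corrections(1, 60, 29, 36)

print(full, reduced, full / reduced)  # prints: 2760 480 5.75
```

So the suggested range asks for roughly a sixth of the corrections in the original input.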
- bob
- Posts: 42
- Joined: Wed Aug 04, 2010 8:39 am
- Location: Eindhoven, The Netherlands
Re: Advice on parameters
Hi Daniele,
that of course does help. I think I might still need more bands than that, but that's a different issue.
What I also realized is that I'm taking only a few empty bands into account for the dielectric matrix (14, compared to 32 occupied). I guess I should increase that? At the same time, I believe that the k-point sampling might be slightly greedy.
Anyways, that leads me to another point that I'd just like to clarify for the practical workflow: I run a nscf pw calculation using a uniform IBZ k-point grid. Then yambo calculates the QP corrections for these k-points. If I want to plot the QP bandstructure along high symmetry lines, I should use ypp and interpolate the BS. Correct, or am I missing something?
Cheers,
Bjoern
Dr. Bjoern Baumeier
Eindhoven University of Technology
Eindhoven, The Netherlands
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Advice on parameters
Dear Bjoern,
Quote: "What I also realized is that I'm taking only a few empty bands into account for the dielectric matrix (14, compared to 32 occupied). I guess I should increase that? At the same time, I believe that the k-point sampling might be slightly greedy."
Yes, you need to check convergence: 14 empty bands seem too few, both in the screening and in the GW summation (%GbndRnge).
For the latter you can use the terminator technique, which speeds up convergence with respect to this parameter, by setting
GTermKind= "BG".
Please also note that you are considering just 1 G-vector in the screening (scalar):
Code: Select all
NGsBlkXd= 1 RL
The size of the screening matrix is another parameter whose convergence you need to check. I suggest you increase this parameter in Ry rather than in RL (e.g. try 1 Ry, 2 Ry, etc. until convergence). I suggest checking the convergence for a few points only (e.g. the gap).
For the explanation of each variable you can have a look here:
http://www.attaccalite.com/yambo-input- ... explained/
Quote: "If I want to plot the QP bandstructure along high symmetry lines, I should use ypp and interpolate the BS. Correct, or am I missing something?"
Correct. If you want to calculate the QP correction for a single point not included in your grid, you can build a specific grid that exploits the screening calculated with another grid, but if you want the band structure you need to interpolate.
I suggest you also have a look at the Yambo tutorial page for a typical GW calculation:
http://www.yambo-code.org/tutorials/GW/index.php
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
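The convergence scan suggested in the reply above (1 Ry, 2 Ry, ... for the screening block size) can be scripted. The following is only a sketch under assumptions: both function names are hypothetical, the first merely formats the NGsBlkXd line for a list of cutoffs, and the second applies a simple "change smaller than a tolerance" criterion to values you would read from your own runs (for example the gap). No results are implied here.

```python
def screening_cutoff_lines(cutoffs_ry):
    """Return one 'NGsBlkXd' input line per cutoff, expressed in Ry.

    Hypothetical helper: it only builds strings; pasting them into the
    input file and running yambo is left to the user.
    """
    return [f"NGsBlkXd= {cutoff} Ry" for cutoff in cutoffs_ry]


def first_converged(values, tol):
    """Index of the first value that differs from its predecessor by
    less than tol, or None if no such value exists.

    'values' is assumed to be ordered by increasing cutoff.
    """
    for i in range(1, len(values)):
        if abs(values[i] - values[i - 1]) < tol:
            return i
    return None


# One input line per run of the scan
for line in screening_cutoff_lines([1, 2, 3, 4]):
    print(line)
```

Once each run has finished, passing the list of gaps and a tolerance of your choice to `first_converged` gives the index of the first cutoff at which the gap stopped changing by more than that tolerance.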
- bob
- Posts: 42
- Joined: Wed Aug 04, 2010 8:39 am
- Location: Eindhoven, The Netherlands
Re: Advice on parameters
Hi Daniele,
thanks for your suggestions and the clarification - greatly appreciated.
I am currently running these checks but hit trouble with the runtimes. I attached an input and current state of output as an example. Since the cluster I'm running this on has behaved strangely sometimes, I wonder if it is normal that the runtime estimate in the G0W0 step varies so much? It goes from ~2h to ~1d2h as of now. The queue has a 48 hour wall time limit, so I might eventually run into trouble there.
In fact, I have two more questions related to this:
a) I realized that when the job is terminated during the calculations of the X[q], yambo restarts where it was at time of termination. Would the same also hold for the EXS and G0W0 steps?
b) Would it make sense to go to higher parallelization? Right now I'm running the job on 1 node and 24 OpenMP threads, so I'm considering if it's worth using 2 nodes (MPI) and 24 OpenMP threads on each node. But even if so, I have not fully understood how to tell yambo how to do just that, even after looking at the examples. Any hints?
Cheers,
Bjoern
Dr. Bjoern Baumeier
Eindhoven University of Technology
Eindhoven, The Netherlands
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Advice on parameters
Dear Bjoern,
Quote: "I wonder if it is normal that the runtime estimate in the G0W0 step varies so much?"
No, it is not.
Quote: "a) I realized that when the job is terminated during the calculations of the X[q], yambo restarts where it was at time of termination. Would the same also hold for the EXS and G0W0 steps?"
No, here the code does not restart, but you can divide your job into several smaller runs, e.g.:
Run1:
Code: Select all
%QPkrange # [GW] QP generalized Kpoint/Band indices
1| 28| 31|32|
%
Run2:
Code: Select all
%QPkrange # [GW] QP generalized Kpoint/Band indices
1| 28| 33|34|
%
etc.
Quote: "b) Would it make sense to go to higher parallelization? Right now I'm running the job on 1 node and 24 OpenMP threads, so I'm considering if it's worth using 2 nodes (MPI) and 24 OpenMP threads on each node. But even if so, I have not fully understood how to tell yambo how to do just that, even after looking at the examples. Any hints?"
Sure, MPI parallelization is much more effective here than OpenMP. I would use OpenMP with no more than 4 or 8 threads, and MPI parallelization also inside the same node. To do that you need to activate the parallelization variables by adding -V par when building the input file. On the yambo web page there is a dedicated tutorial for practicing with the parallelization variables. My advice is to parallelize over bands and k-points and to avoid parallelizing over q-points, as most of the time you end up with an unbalanced run.
A typical set of variables in the input, considering e.g. 12 MPI processes and 2 threads, reads:
Code: Select all
X_all_q_CPU= "1 6 2 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
X_all_q_nCPU_invert=0 # [PARALLEL] CPUs for matrix inversion
X_Threads= 2 # [OPENMP/X] Number of threads for response functions
DIP_Threads= 2 # [OPENMP/X] Number of threads for dipoles
SE_CPU= "1 6 2" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
SE_Threads= 2
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
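Two points from the reply above lend themselves to a small script: splitting the band range into separate %QPkrange runs, and checking a CPU layout against the number of MPI processes. The sketch below is hedged accordingly. Both function names are hypothetical, and the second check rests on an assumption drawn from the example itself, where the entries of "1 6 2 1" and "1 6 2" each multiply to 12, matching the 12 MPI processes. Please confirm the exact rule in the Yambo parallelization tutorial.

```python
from math import prod


def split_qpkrange(k_first, k_last, band_first, band_last, bands_per_run):
    """Return one %QPkrange block (as a string) per chunk of bands.

    Hypothetical helper: each block is meant to go into its own input
    file, so that every run fits inside the queue's wall-time limit.
    """
    blocks = []
    for start in range(band_first, band_last + 1, bands_per_run):
        stop = min(start + bands_per_run - 1, band_last)
        blocks.append(f"%QPkrange\n{k_first}| {k_last}| {start}| {stop}|\n%")
    return blocks


def cpu_layout_matches(cpu_string, n_mpi):
    """True if the entries of a *_CPU string multiply to n_mpi.

    Assumption: the product of the per-role CPU counts should equal
    the number of MPI processes, as it does in the example above.
    """
    return prod(int(entry) for entry in cpu_string.split()) == n_mpi


# Reproduces Run1 and Run2 from the reply: bands 31-32, then 33-34
for block in split_qpkrange(1, 28, 31, 34, bands_per_run=2):
    print(block)

print(cpu_layout_matches("1 6 2 1", 12))  # X_all_q_CPU with 12 MPI processes
print(cpu_layout_matches("1 6 2", 12))    # SE_CPU with 12 MPI processes
```

Widening the band range (for example up to band 36) simply yields more blocks, one per run.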
- bob
- Posts: 42
- Joined: Wed Aug 04, 2010 8:39 am
- Location: Eindhoven, The Netherlands
Re: Advice on parameters
Dear Daniele,
Can't thank you enough.
I have made some comparison tests and here is what I found. As you said, a 12/2 parallelization is more efficient. I have also looked into linking against additional libraries from MKL (adding FFTW3 and SCALAPACK):
Compiled with MKL Lapack/Blas
Code: Select all
MPI/OMP   X        EXS      G0W0
-------------------------------------------
1/24      1h 15m   1h 32m   8h 51m
12/2      35m      15m      1h 57m
Compiled with MKL Lapack/Blas + FFTW3 + SCALAPACK
Code: Select all
MPI/OMP   X        EXS      G0W0
-------------------------------------------
1/24      50m      37m      3h 16m
12/2      19m      6m       2h 01m
FFTW3 seems to have a noticeable influence, even more so for the purely OpenMP-threaded run. I am, however, still a bit puzzled about the G0W0 timings: a) the difference between the different compilations, and b) the fact that, also on a different cluster, the timing estimate for G0W0 grows with run time, see the attached log. Given that it starts out from ~10m but ends up at ~2h, I wonder if I'm still doing something wrong? Is there anything else I could test? Or should I not concern myself with that?
One final question: You mentioned setting the variable:
Code: Select all
X_all_q_nCPU_invert=0 # [PARALLEL] CPUs for matrix inversion
while using -V par puts
Code: Select all
X_all_q_nCPU_LinAlg_INV= 1 # [PARALLEL] CPUs for Linear Algebra
Is that the same variable/setting?
Cheers,
Bjoern
Dr. Bjoern Baumeier
Eindhoven University of Technology
Eindhoven, The Netherlands
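To put numbers on the comparison in the post above, the timings can be converted to minutes and the 1/24 run divided by the 12/2 run for each step. This is a minimal sketch: the helper names and the dictionary layout are made up for illustration, and the timing strings are copied from the two tables as posted.

```python
import re


def to_minutes(text):
    """Convert a duration such as '1h 15m' or '35m' to whole minutes."""
    hours = re.search(r"(\d+)h", text)
    minutes = re.search(r"(\d+)m", text)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


# Timings as posted, keyed by build, then by MPI/OMP layout, then by step
timings = {
    "MKL": {
        "1/24": {"X": "1h 15m", "EXS": "1h 32m", "G0W0": "8h 51m"},
        "12/2": {"X": "35m", "EXS": "15m", "G0W0": "1h 57m"},
    },
    "MKL+FFTW3+SCALAPACK": {
        "1/24": {"X": "50m", "EXS": "37m", "G0W0": "3h 16m"},
        "12/2": {"X": "19m", "EXS": "6m", "G0W0": "2h 01m"},
    },
}


def speedup(build, step):
    """Ratio of the 1/24 runtime to the 12/2 runtime for one step."""
    slow = to_minutes(timings[build]["1/24"][step])
    fast = to_minutes(timings[build]["12/2"][step])
    return slow / fast


for build in timings:
    for step in ("X", "EXS", "G0W0"):
        print(f"{build:>20} {step:>5}: {speedup(build, step):.2f}x")
```

A ratio above 1 means the 12/2 layout was faster for that step. The same helper can compare builds instead of layouts by dividing the corresponding entries.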
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Advice on parameters
Dear Bjoern,
Great that your calculations are running properly. MKL surely helps a lot.
About the G0W0 timing, I would say that the MPI runs are essentially the same, considering fluctuations in the machine. As for the OMP 1/24 run, it most probably
benefits from the internal OMP parallelization of the FFTW libraries.
Quote: "Given that it starts out from ~10m but ends up at ~2h, I wonder if I'm still doing something wrong? Is there anything else I could test? Or should I not concern myself with that?"
OK, we will check whether some timing is misplaced there.
Quote: "X_all_q_nCPU_LinAlg_INV= 1"
I'm not sure, but this appears when SCALAPACK is linked. The SCALAPACK features have been added very recently.
Overall I think you have a good setup to run yambo.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/