Speed up GW calculation

Posted: Wed Feb 17, 2021 9:14 am
by Bruno
Hello!

I recently spent some time finding a good combination to parallelize my calculation and actually get it to run, because I was having memory problems. Now that it is in fact running, how can I speed up the calculation? Are there any variables I could assign more cores to in order to make it run faster? I could then move gradually in that direction to find the best combination. So far I have been focusing on distributing cores over "c,v".

Regards,

Re: Speed up GW calculation

Posted: Wed Feb 17, 2021 9:21 am
by Daniele Varsano
Dear Bruno,

Parallelising on "cv" is usually a good choice, as it also allows for memory distribution.
What you can do is choose the CPUs on "c" and "v" so that they are balanced:
Nc/cpu_c ~ Nv/cpu_v

where Nc and Nv are the numbers of conduction and valence bands included in the calculation, and cpu_c and cpu_v are the CPUs you assign to each role.
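As a rough illustration of this balancing rule, here is a minimal Python sketch that scans the possible ways of splitting a given number of CPUs between "c" and "v" and keeps the one where Nc/cpu_c and Nv/cpu_v are closest. The function name and the example band counts are hypothetical, and the sketch assumes that cpu_c * cpu_v must equal the number of CPUs devoted to these two roles.

```python
def balanced_cv_split(n_c, n_v, n_cpu):
    """Return (cpu_c, cpu_v) with cpu_c * cpu_v == n_cpu that makes
    n_c / cpu_c and n_v / cpu_v as close as possible.

    Assumption: all n_cpu CPUs are shared between the "c" and "v" roles only.
    """
    best = None
    for cpu_c in range(1, n_cpu + 1):
        if n_cpu % cpu_c != 0:
            continue  # cpu_c must divide the CPU count exactly
        cpu_v = n_cpu // cpu_c
        imbalance = abs(n_c / cpu_c - n_v / cpu_v)
        if best is None or imbalance < best[0]:
            best = (imbalance, cpu_c, cpu_v)
    return best[1], best[2]


# Hypothetical example: 160 conduction bands, 40 valence bands, 16 CPUs.
print(balanced_cv_split(160, 40, 16))  # (8, 2): 160/8 = 20 and 40/2 = 20
```

This only checks the balance condition; memory per task still has to be verified in the actual run.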

Best,
Daniele

Re: Speed up GW calculation

Posted: Wed Feb 17, 2021 10:42 am
by Bruno
Hello

That is nice! Thank you Daniele, I'll apply this rule in future calculations. But just to understand something: currently my problem is in the very last part of the G0W0 calculation. I already have the dipoles, the dynamic dielectric matrix and also the local xc + non-local Fock; I'm on the final G0W0(w ppa) step. Can this tip help me get this part computed faster? I ask because I have the parallelization flags X_and_IO_CPU, which I suppose is for the matrix, DIP_CPU for the dipoles, and SE_CPU, which has the roles "q", "qp" and "b".

Regards,

Re: Speed up GW calculation

Posted: Wed Feb 17, 2021 11:21 am
by Daniele Varsano
Dear Bruno,
if your problem is in the GW convolution part, here is some advice:
1) Avoid parallelisation over q (this can produce load imbalance)
2) qp parallelisation is totally independent, so it should scale linearly
3) b parallelisation speeds up the calculation of a single qp and also distributes memory

So I suggest you use qp and b, depending on your memory resources.

Note that, since the qp are independent, you can also split your calculation into different runs (e.g. the range b1 | bn into two runs, each containing (bn-b1+1)/2 bands, or into more runs) and run them simultaneously (with a different output directory -J so as not to overwrite the ndb.QP). Afterwards you can merge the resulting databases using ypp.
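To make the splitting concrete, here is a minimal Python sketch that cuts a band range b1 | bn into contiguous, nearly equal sub-ranges, one per run. The function name and the job labels ("run_1", "run_2", ...) are hypothetical; each sub-range would go into its own input file and be run with its own -J directory.

```python
def split_qp_range(b1, bn, n_runs):
    """Split the inclusive band range b1..bn into up to n_runs contiguous
    sub-ranges whose sizes differ by at most one band."""
    total = bn - b1 + 1
    base, extra = divmod(total, n_runs)
    ranges = []
    start = b1
    for i in range(n_runs):
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more runs than bands: skip empty chunks
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Hypothetical example: bands 1 to 20 split into two runs.
for i, (lo, hi) in enumerate(split_qp_range(1, 20, 2), start=1):
    print(f"run_{i}: {lo} | {hi}")
# run_1: 1 | 10
# run_2: 11 | 20
```

When the number of bands does not divide evenly, the first sub-ranges get one extra band each, so the runs stay roughly the same length.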

Best,
Daniele