Hello!
I recently spent some time finding a good parallelization setup for my calculation and actually getting it to run, because I was having memory problems. Now that it is running, how can I speed up the calculation? Are there any variables to which I could assign more cores to make it run faster? I could then move gradually in that direction to find the best combination. So far I have focused on distributing cores over "c,v".
Regards,
Speed up GW calculation
Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan
-
- Posts: 72
- Joined: Tue Dec 08, 2020 11:16 am
Speed up GW calculation
MSc. Bruno Cucco
PhD Candidate
CNRS Institut des Sciences Chimiques de Rennes, France
Université de Rennes 1
https://iscr.univ-rennes1.fr
- Daniele Varsano
- Posts: 4209
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Speed up GW calculation
Dear Bruno,
parallelising over "c" and "v" is usually a good choice, as it also distributes memory.
What you can do is choose the CPUs on "c" and "v" so that the load is balanced:
Nc/cpu_c ~ Nv/cpu_v
where Nc and Nv are the numbers of conduction and valence bands included in the calculation, and cpu_c and cpu_v are the numbers of CPUs you assign to each role.
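As a rough illustration of the balance rule, here is a minimal Python sketch that scans the possible splits of a CPU budget between "c" and "v" and keeps the one where Nc/cpu_c and Nv/cpu_v are closest. The function name `balance_cv` and the example numbers are hypothetical, and the sketch assumes that the CPUs given to the two roles multiply to the budget (cpu_c * cpu_v = n_cpu), with all other roles set to 1.

```python
def balance_cv(n_c, n_v, n_cpu):
    """Pick (cpu_c, cpu_v) with cpu_c * cpu_v == n_cpu so that
    n_c / cpu_c is as close as possible to n_v / cpu_v.

    n_c, n_v : numbers of conduction and valence bands
    n_cpu    : CPUs available for the "c" and "v" roles together
               (assumption: the two role counts multiply to n_cpu)
    """
    best = None
    for cpu_c in range(1, n_cpu + 1):
        if n_cpu % cpu_c:
            continue  # only exact factorizations of the CPU budget
        cpu_v = n_cpu // cpu_c
        imbalance = abs(n_c / cpu_c - n_v / cpu_v)
        if best is None or imbalance < best[0]:
            best = (imbalance, cpu_c, cpu_v)
    return best[1], best[2]


# Hypothetical example: 160 conduction bands, 40 valence bands, 16 CPUs.
cpu_c, cpu_v = balance_cv(160, 40, 16)
print(cpu_c, cpu_v)
```

With these made-up numbers each CPU on "c" and each CPU on "v" ends up handling 20 bands, which is the balanced situation the rule aims for.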
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
-
- Posts: 72
- Joined: Tue Dec 08, 2020 11:16 am
Re: Speed up GW calculation
Hello
That is nice! Thank you Daniele, I'll apply this rule in future calculations. Just to understand something: currently my problem is in the very last part of the G0W0 calculation. I already have the dipoles, the dynamic dielectric matrix, and the local xc + non-local Fock terms; I'm on the final G0W0 (W PPA) step. Can this tip help me compute this part faster? I ask because I have the parallelization flags X_and_IO_CPU, which I suppose is for the dielectric matrix, DIP_CPU for the dipoles, and SE_CPU, which has the roles "q", "qp" and "b".
Regards,
MSc. Bruno Cucco
PhD Candidate
CNRS Institut des Sciences Chimiques de Rennes, France
Université de Rennes 1
https://iscr.univ-rennes1.fr
- Daniele Varsano
- Posts: 4209
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Speed up GW calculation
Dear Bruno,
if your problem is in the GW convolution part, here is some advice:
1) Avoid parallelisation over q (this can produce load imbalance)
2) qp parallelisation is totally independent, so it should scale linearly
3) b parallelisation speeds up the calculation of a single qp and also distributes memory
So I suggest you use qp and b, depending on your memory resources.
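To make the advice above concrete, here is a small Python sketch that builds the two self-energy parallelization lines with "q" fixed to 1 and the remaining CPUs shared between "qp" and "b". The helper name `se_parallel_lines` is hypothetical. The `SE_ROLEs` variable name and the quoting are written as they usually appear in Yambo-generated input files, which is an assumption here, so please check them against your own input.

```python
def se_parallel_lines(n_qp_cpu, n_b_cpu):
    """Return input lines for the self-energy step with no
    parallelisation over q (q = 1), n_qp_cpu CPUs on "qp" and
    n_b_cpu CPUs on "b".

    Assumption: the layout 'SE_CPU= "..."' / 'SE_ROLEs= "..."'
    matches the input file your Yambo version generates.
    """
    if n_qp_cpu < 1 or n_b_cpu < 1:
        raise ValueError("each role needs at least one CPU")
    roles = ("q", "qp", "b")
    cpus = (1, n_qp_cpu, n_b_cpu)  # q is kept at 1 on purpose
    return [
        'SE_CPU= "{}"'.format(" ".join(str(c) for c in cpus)),
        'SE_ROLEs= "{}"'.format(" ".join(roles)),
    ]


# Hypothetical example: 32 CPUs in total, 4 on "qp" and 8 on "b".
for line in se_parallel_lines(4, 8):
    print(line)
```

Putting more CPUs on "b" lowers the memory per task, while putting more on "qp" keeps the scaling close to linear, so the split depends on how tight memory is.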
Note that, since the qp are independent, you can also split your calculation into separate runs (e.g. the band range b1 | bn into two runs containing (bn-b1+1)/2 bands each, or into more runs) and run them simultaneously (each with a different output directory -J so as not to overwrite the ndb.QP). Afterwards you can merge the resulting databases using ypp.
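The band-range splitting can be sketched in a few lines of Python. The function name `split_band_range` and the example range are hypothetical; the sketch only computes contiguous sub-ranges of roughly equal size, which you would then put in the QP band range of each separate input (one -J directory per run).

```python
def split_band_range(b1, bn, n_runs):
    """Split the inclusive band range b1..bn into n_runs contiguous
    sub-ranges whose sizes differ by at most one band."""
    total = bn - b1 + 1
    if not 1 <= n_runs <= total:
        raise ValueError("n_runs must be between 1 and the number of bands")
    base, extra = divmod(total, n_runs)
    ranges = []
    start = b1
    for i in range(n_runs):
        size = base + (1 if i < extra else 0)  # first 'extra' runs get one more band
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Hypothetical example: bands 20 to 29 split over two simultaneous runs.
for first, last in split_band_range(20, 29, 2):
    print(first, "|", last)
```

For an even number of bands and two runs this reproduces the (bn-b1+1)/2 bands per run mentioned above.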
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/