Dear YAMBO,
Could you give me some tips on BSE kernel parallelization? I set the following BSE parallelization variables:
BS_CPU= "47 1 28" # [PARALLEL] CPUs for each role
BS_ROLEs= "k eh t" # [PARALLEL] CPUs roles (k,eh,t)
BS_nCPU_LinAlg_INV= 8 # [PARALLEL] CPUs for matrix inversion
BS_nCPU_LinAlg_DIAGO= 8 # [PARALLEL] CPUs for matrix diagonalization
but it doesn't work. Looking at the output file, these settings do not seem to have been taken into account.
Another question: which parallelization role distributes the memory? This calculation seems to require a lot of memory.
I have attached the input and output files. Thank you.
Parallelism of BSE kernel
Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, Daniele Varsano
-
- Posts: 20
- Joined: Mon Nov 18, 2019 6:53 am
- Location: Austin, TX, USA
Parallelism of BSE kernel
Viet-Anh Ha,
Oden Institute for Computational Engineering and Sciences,
https://www.oden.utexas.edu/
The University of Texas at Austin,
https://www.utexas.edu/
201 E 24th St, Austin, TX 78712, USA.
- Daniele Varsano
- Posts: 3980
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Parallelism of BSE kernel
Dear Viet-Anh Ha,
please note that parallel linear algebra is not enabled unless the code is compiled with the ScaLAPACK libraries:
Code: Select all
BS_nCPU_LinAlg_INV= 8 # [PARALLEL] CPUs for matrix inversion
BS_nCPU_LinAlg_DIAGO= 8 # [PARALLEL] CPUs for matrix diagonalization
In any case, these variables are needed only for the diagonalization of the BSE matrix, while you are experiencing problems in building the kernel.
Why do you say that BS_CPU and BS_ROLEs are not taken into account? The parallel distribution of the kernel is reported in the log files.
You have a lot of k-points (1000 in the BZ), so you can easily end up with very large matrices. I strongly suggest reducing the number of bands in the BSE:
Code: Select all
% BSEBands
1 | 18 | # [BSK] Bands range
%
The first band is around 41 eV below the Fermi level, so it will not participate in the first excitations, i.e. in the low-energy part of the spectrum.
Also, the last bands (16, 17 and 18) are 40 eV above the Fermi energy. I would drastically reduce the range of bands in the BSE; this will greatly reduce the computational burden.
Best,
Daniele
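For a rough sense of why the band range matters so much, the sketch below estimates the size of a dense BSE matrix. It assumes the matrix dimension is (k-points in the BZ) x (valence bands) x (conduction bands) and that each element is a single-precision complex number (8 bytes). The 1000 k-points come from the post above; the valence/conduction splits are hypothetical, because the occupations are not stated in this thread, and the function name `bse_matrix_size` is only illustrative.

```python
def bse_matrix_size(n_kpts, n_valence, n_conduction, bytes_per_element=8):
    """Return (dimension, size in GiB) of a dense BSE matrix block.

    Assumptions (not taken from the thread): the block is stored densely
    and each element is a single-precision complex number (8 bytes).
    """
    dim = n_kpts * n_valence * n_conduction
    gib = dim * dim * bytes_per_element / 2**30
    return dim, gib


# 1000 k-points is from the thread. The 9v+9c and 3v+3c splits are
# hypothetical examples, chosen only to show how the size scales.
full_dim, full_gib = bse_matrix_size(1000, 9, 9)
small_dim, small_gib = bse_matrix_size(1000, 3, 3)

print(f"18 bands (9v+9c): dim={full_dim}, ~{full_gib:.1f} GiB")
print(f" 6 bands (3v+3c): dim={small_dim}, ~{small_gib:.1f} GiB")
print(f"memory ratio: {full_gib / small_gib:.0f}x")
```

Under these assumptions the memory grows with the square of the number of transitions, so trimming bands that lie far from the Fermi level can shrink the matrix much more than a change of parallel layout would.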
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
-
- Posts: 20
- Joined: Mon Nov 18, 2019 6:53 am
- Location: Austin, TX, USA
Re: Parallelism of BSE kernel
Thank you Daniele.
I did not pay attention to the log files and did not realize that the parallelization had been applied. I only looked at the beginning of the output file, which contains the following text:
* CPU-Threads :1316(CPU)-1(threads)-1(threads@X)-1(threads@DIP)-1(threads@SE)-1(threads@RT)-1(threads@K)-1(threads@NL)
* MPI CPU : 1316
* THREADS (max): 1
* THREADS TOT(max): 1316
* I/O NODES : 1
* Fragmented WFs :yes
so I thought the parallelization had not been successful.
Another question is why NLogCPUs does not work in this run. I set NLogCPUs = 2, but as many log files as MPI tasks are still written.
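The header quoted above reports only the total number of MPI tasks, not how they are split among roles. A quick sanity check of the layout is sketched below, under the assumption that the product of the BS_CPU entries is expected to equal the total number of MPI tasks. The helper name `check_cpu_layout` is hypothetical; the values are the ones given in this thread.

```python
from math import prod


def check_cpu_layout(cpu_string, roles_string, n_mpi_tasks):
    """Pair each role with its CPU count and compare the product
    of the counts with the total number of MPI tasks."""
    cpus = [int(x) for x in cpu_string.split()]
    roles = roles_string.split()
    if len(cpus) != len(roles):
        raise ValueError("BS_CPU and BS_ROLEs must have the same number of entries")
    layout = dict(zip(roles, cpus))
    return layout, prod(cpus) == n_mpi_tasks


# Values from the thread: BS_CPU= "47 1 28", BS_ROLEs= "k eh t", 1316 MPI tasks.
layout, matches = check_cpu_layout("47 1 28", "k eh t", 1316)
print(layout, "product matches MPI tasks:", matches)
```

Here 47 x 1 x 28 = 1316, so the requested layout is at least consistent with the task count shown in the header.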
Viet-Anh Ha,
- Daniele Varsano
- Posts: 3980
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Parallelism of BSE kernel
Dear Viet-Anh Ha,
Regarding NLogCPUs: that's strange. Are you sure that there are no logs from previous runs? Old logs are not deleted, and new ones get an incremental suffix (*_01, etc.).
Finally, I can see you are using a rather old version of Yambo (GPL Version 4.4.0); I suggest updating to a newer version.
Best,
Daniele
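One way to separate stale logs from those of the current run without relying on file names is to filter by modification time. The sketch below is a minimal illustration of that idea, not a Yambo feature: the function name `logs_written_since` and the demo file names are made up, and the demo runs in a temporary directory rather than a real LOG directory.

```python
import os
import tempfile
from pathlib import Path


def logs_written_since(log_dir, start_time):
    """Return sorted names of files in log_dir modified at or after start_time
    (seconds since the epoch)."""
    return sorted(
        p.name
        for p in Path(log_dir).iterdir()
        if p.is_file() and p.stat().st_mtime >= start_time
    )


# Self-contained demo with made-up file names in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp) / "old_log"
    new = Path(tmp) / "new_log"
    old.write_text("previous run\n")
    new.write_text("current run\n")
    os.utime(old, (1000, 1000))  # pretend this file is from an earlier run
    os.utime(new, (2000, 2000))  # pretend this file is from the current run
    recent = logs_written_since(tmp, start_time=1500)
    print(recent)
```

In practice one would record the time just before launching the job and pass it as `start_time` together with the path of the log directory; the number of names returned can then be compared with the NLogCPUs setting.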
Dr. Daniele Varsano
-
- Posts: 20
- Joined: Mon Nov 18, 2019 6:53 am
- Location: Austin, TX, USA
Re: Parallelism of BSE kernel
I deleted the log files in the LOG directory but still see the same problem. Yes, I will update to the newest version.
Viet-Anh Ha,