Hi,
I have a question regarding the BSE implementation.
As an example, I performed a BSE calculation for 2D MoS2 on a 30x30x1 k-grid, including 2 valence and 2 conduction bands, with a wavefunction cutoff of 80 Ry (~20000 G-vectors) and a screening cutoff of 10 Ry (~700 G-vectors). The transition space of the (resonant) BSE kernel has dimension kpts_BZ * nbnd_valence * nbnd_conduction = 900 * 2 * 2 = 3600. Since the BSE matrix is Hermitian, there should be 3600 * 3601 / 2 = 6481800 independent Coulomb matrix elements (this can probably be reduced by using symmetries).
Writing out the Coulomb matrix elements in a plane-wave basis (see attached figure), there are 4 summations: two over G and G' (for the screening) and two over the wavefunction G-vectors. So each Coulomb matrix element would have ~ 700 * 700 * 20000 * 20000 ~ 2E14 summands.
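The counting above can be checked with a few lines of Python. This is only a sketch of the arithmetic in this post; the variable names are made up for illustration and the G-vector counts are the approximate values quoted above.

```python
# Sizes quoted in the post (G-vector counts are approximate).
n_k, n_v, n_c = 900, 2, 2        # k-points in the BZ, valence bands, conduction bands
n_g_wf, n_g_scr = 20_000, 700    # G-vectors: wavefunctions, screening

# Dimension of the resonant transition space.
n_trans = n_k * n_v * n_c

# Independent elements of a Hermitian n x n matrix (upper triangle plus diagonal).
n_elements = n_trans * (n_trans + 1) // 2

# Summands per element if all four G sums are carried out naively.
n_summands = n_g_scr**2 * n_g_wf**2

print(n_trans)               # 3600
print(n_elements)            # 6481800
print(f"{n_summands:.2e}")   # 1.96e+14
```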
Considering how many Coulomb matrix elements there are and the number of summands, I am wondering how this calculation finished in 3 minutes on ~60 cores.
Perhaps my derivation is wrong, or are you using some smart tricks to speed up the calculation?
Best,
Franz
BSE speed

 Posts: 43
 Joined: Wed Jul 20, 2022 9:36 am
Franz Fischer
PhD student / IMPRS-UFAST fellow
Institute of Physical Chemistry
University of Hamburg
 Daniele Varsano
 Posts: 3975
 Joined: Tue Mar 17, 2009 2:23 pm
Re: BSE speed
Dear Franz,
indeed, symmetries are used; anyway, as you say, the dimension of the matrix to be calculated is kpts_BZ * nbnd_valence * nbnd_conduction.
Regarding the kernel, the terms <ck| e^(i(q+G)r) |c'k'> are calculated as the FFT of the product of the wavefunctions, so in the end the kernel reduces to a matrix multiplication Y=AX plus a dot product.
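To illustrate why this removes the four nested sums, here is a minimal NumPy sketch on a toy 1D model. It is not yambo's code: the sizes, the periodic index convention, and all names (c_c, c_v, W, pair_density, ...) are assumptions made for the example. The pair amplitudes are computed once per pair of states, and the matrix element then reduces to one matrix-vector product with the screening matrix followed by a dot product.

```python
import numpy as np

rng = np.random.default_rng(0)
n_wf, n_scr = 6, 3   # toy sizes standing in for ~20000 and ~700

def random_complex(*shape):
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Hypothetical plane-wave coefficients of four states and a toy screening matrix.
c_c, c_c2, c_v, c_v2 = (random_complex(n_wf) for _ in range(4))
W = random_complex(n_scr, n_scr)

def pair_density(c_a, c_b, g):
    """rho_ab(g) = sum_g1 conj(c_a[g1 + g]) * c_b[g1], indices taken mod n_wf."""
    return sum(np.conj(c_a[(g1 + g) % n_wf]) * c_b[g1] for g1 in range(n_wf))

# Naive evaluation: all four sums nested, n_scr^2 * n_wf^2 terms.
naive = 0j
for g in range(n_scr):
    for gp in range(n_scr):
        for g1 in range(n_wf):
            for g2 in range(n_wf):
                term_c = np.conj(c_c[(g1 + g) % n_wf]) * c_c2[g1]
                term_v = np.conj(c_v[(g2 + gp) % n_wf]) * c_v2[g2]
                naive += np.conj(term_c) * W[g, gp] * term_v

# Factorized evaluation: pair amplitudes first, then Y = A X and a dot product.
rho_cc = np.array([pair_density(c_c, c_c2, g) for g in range(n_scr)])
rho_vv = np.array([pair_density(c_v, c_v2, g) for g in range(n_scr)])
Y = W @ rho_vv                      # matrix-vector product, n_scr^2 operations
factorized = np.vdot(rho_cc, Y)     # vdot conjugates its first argument

print(np.allclose(naive, factorized))  # True
```

In this toy version the sums over the wavefunction G-vectors appear only inside the pair amplitudes, so they are paid once per pair of states rather than once per (G, G') combination.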
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.maxcentre.eu/

 Posts: 43
 Joined: Wed Jul 20, 2022 9:36 am
Re: BSE speed
Dear Daniele,
I have a (possibly really stupid) follow-up question: how do you compute the product of the wavefunctions?
Best,
Franz
Franz Fischer
PhD student / IMPRS-UFAST fellow
Institute of Physical Chemistry
University of Hamburg
 Davide Sangalli
 Posts: 620
 Joined: Tue May 29, 2012 4:49 pm
 Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy
Re: BSE speed
Dear Franz,
since it is a product in real space and a convolution in G-space, the operation is performed in real space, and afterwards a Fast Fourier Transform (FFT) of the product is performed. This also has to be taken into account when considering the scaling.
This is the subroutine which computes the product of two wavefunctions:
https://github.com/yambocode/yambo/blo ... ter_Bamp.F
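A minimal NumPy sketch of this point, on a toy 1D periodic grid (this is not the linked yambo routine; the array names and the grid size are made up for illustration). The Fourier components of a product of two wavefunctions can be obtained either by a direct convolution of their plane-wave coefficients, which costs about N^2 operations, or by multiplying point by point on the real-space grid and transforming once, which costs about N log N.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 16  # toy number of plane waves / grid points (assumption)

# Hypothetical plane-wave coefficients of two wavefunctions.
c_a = rng.standard_normal(n) + 1j * rng.standard_normal(n)
c_b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Direct route: rho[g] = sum_g1 conj(c_a[g1 + g]) * c_b[g1], indices mod n.
rho_direct = np.array(
    [np.sum(np.conj(np.roll(c_a, -g)) * c_b) for g in range(n)]
)

# FFT route: go to real space, multiply point by point, transform back.
psi_a = np.fft.ifft(c_a) * n   # psi(r_j) = sum_g c[g] * exp(+2*pi*i*g*j/n)
psi_b = np.fft.ifft(c_b) * n
rho_fft = np.fft.ifft(np.conj(psi_a) * psi_b)

print(np.allclose(rho_direct, rho_fft))  # True
```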
The sum over G, G' is performed up to 10 Ry in your case (700 G vectors), and also the product of the wavefunctions up to 700 G vectors is used.
The real-space multiplication is done up to the size of the real-space grid, which is defined using both the values of 20 Ry and 70 Ry. See this line:
Code:
call WF_load(WF,NG,O_ng_shift,bands_to_load,(/1,Xk%nibz/),space='R',title=trim(section_title))
inside https://github.com/yambocode/yambo/blo ... rc/bse/K.F where the input cutoff (10 Ry) is used together with other values. Inside WF_load (if I remember well) this is combined with "wf_ng" (the cutoff on the WFs).
The conversion of these cutoffs into a real-space grid is done according to some "magic rules". See details here:
https://github.com/yambocode/yambo/blo ... ft_setup.F
See the "magic table" in the subroutine "fft_best_size".
The value is reported in the log and in the report; you should see something like fft size 15 x 15 x 8 (just random numbers here) or similar.
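As a rough illustration of what such a rule can look like, here is a sketch that rounds a minimum grid dimension up to the next size whose only prime factors are 2, 3 and 5, a common choice because FFTs are fastest for such sizes. This is an assumption made for the example and not the content of the table in "fft_best_size"; the function names are hypothetical.

```python
def is_fft_friendly(n, radices=(2, 3, 5)):
    """True if n has no prime factors other than the given radices."""
    for p in radices:
        while n % p == 0:
            n //= p
    return n == 1

def next_fft_friendly_size(n_min, radices=(2, 3, 5)):
    """Smallest size >= n_min that factorizes into the given radices only."""
    n = max(1, n_min)
    while not is_fft_friendly(n, radices):
        n += 1
    return n

print([next_fft_friendly_size(n) for n in (7, 13, 17)])  # [8, 15, 18]
```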
Finally, as already pointed out, not all wavefunction products are computed. There are some complex tables which limit the total number by taking advantage of symmetries. If you wish to dig into this, see here:
https://github.com/yambocode/yambo/blo ... llisions.F
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.maxcentre.eu/

 Posts: 43
 Joined: Wed Jul 20, 2022 9:36 am
Re: BSE speed
Dear Davide,
thank you for your thorough answer, but there are still some things that I don't understand.
Can you elaborate on:
"and also the product of the wavefunctions up to 700 G vectors is used."
With this product you mean the real-space product of the WFs, i.e. the convolution in G-space. What is then meant by the next sentence?
"The realspace multiplication is done up to the size of the real space grid which is defined using both the values of 20 Ry and 70 Ry."
Here you again mention a real-space multiplication using different grid sizes. Moreover, where do the values 20 Ry and 70 Ry come from?
Concerning the symmetries mentioned here:
"Finally, as already pointed out, not all wavefunctions products are computed. There are some complex tables which limit the total number by taking advantage of symmetries."
I saw in the file that you linked that you are checking that the product of your symmetries that bring k_BZ and k'_BZ to the IBZ lies in the star of k'_BZ. I don't really understand what you mean by that. In other words, how do you compute the quantity PHASE in https://github.com/yambocode/yambo/blo ... llisions.F?
Best,
Franz
Franz Fischer
PhD student / IMPRS-UFAST fellow
Institute of Physical Chemistry
University of Hamburg