BSE speed

Various technical topics such as parallelism and efficiency, netCDF problems, the Yambo code structure itself, are posted here.

Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan, Nicola Spallanzani

Franz Fischer
Posts: 43
Joined: Wed Jul 20, 2022 9:36 am

BSE speed

Post by Franz Fischer » Wed Jun 14, 2023 11:02 am

Hi,

I have a question regarding the BSE implementation.
As an example, I performed a BSE calculation for 2D-MoS2 on a 30x30x1 k-grid including 2 valence and 2 conduction bands, with a wavefunction cutoff of 80 Ry (~20000 G-vectors) and a screening cutoff of 10 Ry (~700 G-vectors). The transition space of the (resonant) BSE kernel is kpts_BZ * nbnd_valence * nbnd_conduction = 900 * 2 * 2 = 3600. Since the BSE matrix is Hermitian, there should be 3600 * 3601 / 2 = 6481800 individual Coulomb matrix elements (this can probably be reduced further by using symmetries).
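As a quick check of the counting above, the arithmetic can be reproduced in a few lines of Python. The variable names are illustrative only and are not taken from Yambo.

```python
# Size of the resonant BSE transition space and the number of independent
# elements of a Hermitian matrix (upper triangle including the diagonal).
n_kpts_bz = 30 * 30 * 1
n_valence = 2
n_conduction = 2

n_transitions = n_kpts_bz * n_valence * n_conduction
n_unique_elements = n_transitions * (n_transitions + 1) // 2

print(n_transitions)      # 3600
print(n_unique_elements)  # 6481800
```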

Writing out the Coulomb matrix elements in a plane-wave basis (see attached figure) there will be 4 summations, over G and G' (for the screening) and two over the wavefunction G-vectors. So for each Coulomb matrix element you would have ~ 700 * 700 * 20000 * 20000 ~ 2E14 summands.

Considering how many Coulomb matrix elements there are and the number of summands I am wondering how this calculation finished in 3 minutes on ~60 cores.
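To put a number on this estimate, here is a small sketch of the naive operation count, assuming every matrix element is evaluated as the full four-fold sum with the G-vector counts quoted above. The names are illustrative and the count ignores any use of symmetries.

```python
# Naive cost estimate: four nested sums per matrix element,
# two over screening G-vectors and two over wavefunction G-vectors.
n_g_screening = 700
n_g_wavefunction = 20_000
n_unique_elements = 6_481_800  # Hermitian 3600 x 3600 matrix

summands_per_element = n_g_screening**2 * n_g_wavefunction**2
total_summands = n_unique_elements * summands_per_element

print(f"{summands_per_element:.2e}")  # 1.96e+14
print(f"{total_summands:.2e}")        # 1.27e+21
```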

Is my derivation wrong, or are you using some smart tricks to speed up the calculation?

Best,
Franz
Franz Fischer
PhD student / IMPRS-UFAST fellow
Institute of Physical Chemistry
University of Hamburg

Daniele Varsano
Posts: 3926
Joined: Tue Mar 17, 2009 2:23 pm

Re: BSE speed

Post by Daniele Varsano » Wed Jun 14, 2023 11:24 am

Dear Franz,

Indeed, symmetries are used. As you say, the dimension of the matrix to be calculated is kpts_BZ * nbnd_valence * nbnd_conduction.
Regarding the kernel, the terms <ck|e^(i(q+G)r)|c'k'> are calculated as the FFT of the product of the wavefunctions, so in the end the kernel reduces to a matrix multiplication Y=AX plus a dot product.
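A minimal sketch of that last step, assuming the pair densities rho(G) are already available as vectors on the ~700 screening G-vectors and the screened interaction is a dense matrix W(G,G'). The function names, the toy data and the choice of which vector is conjugated are assumptions for illustration, not Yambo's actual routines or conventions.

```python
def matvec(matrix, vector):
    """Return matrix @ vector for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def vdot(left, right):
    """Return sum_i conj(left[i]) * right[i]."""
    return sum(l.conjugate() * r for l, r in zip(left, right))


def kernel_element(rho_left, w, rho_right):
    """One kernel element as a matrix-vector product plus a dot product."""
    y = matvec(w, rho_right)  # Y = A X
    return vdot(rho_left, y)  # dot product with the other pair density


def kernel_element_double_sum(rho_left, w, rho_right):
    """Same quantity written as the explicit double sum over G and G'."""
    total = 0j
    for g, row in enumerate(w):
        for gp, w_ggp in enumerate(row):
            total += rho_left[g].conjugate() * w_ggp * rho_right[gp]
    return total


def multiply_adds(n_g):
    """Multiply-adds per element: n_g**2 for Y = A X, n_g for the dot product."""
    return n_g * n_g + n_g


# Toy data on 3 "G-vectors" (made-up numbers).
rho_left = [1 + 1j, 0.5 - 0.2j, -0.3j]
rho_right = [0.2 + 0j, -1j, 0.7 + 0.1j]
w = [
    [2 + 0j, 0.1j, 0j],
    [-0.1j, 1.5 + 0j, 0.2 + 0j],
    [0j, 0.2 + 0j, 1 + 0j],
]

difference = abs(
    kernel_element(rho_left, w, rho_right)
    - kernel_element_double_sum(rho_left, w, rho_right)
)
print(difference < 1e-12)  # True
print(multiply_adds(700))  # 490700
```

With 700 G-vectors this is roughly 5E5 multiply-adds per element once the pair densities are known, compared with the ~2E14 summands of the four-fold sum.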

Best,

Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

Franz Fischer
Posts: 43
Joined: Wed Jul 20, 2022 9:36 am

Re: BSE speed

Post by Franz Fischer » Wed Jun 14, 2023 12:23 pm

Dear Daniele,

I have a (possibly really stupid) follow-up question: How do you compute the product of the wavefunctions?

Best,
Franz

Davide Sangalli
Posts: 617
Joined: Tue May 29, 2012 4:49 pm
Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy

Re: BSE speed

Post by Davide Sangalli » Mon Jun 19, 2023 8:24 am

Dear Franz,
Since it is a product in real space and a convolution in G-space, the product is computed in real space and a Fast Fourier Transform (FFT) of the result is performed afterwards. This also has to be taken into account when considering the scaling.
This is the subroutine which computes the product of two wave-functions:
https://github.com/yambo-code/yambo/blo ... ter_Bamp.F
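To illustrate the equivalence described above, here is a small, dependency-free sketch on a periodic 1D grid. It is not the linked Yambo subroutine: a plain O(N^2) discrete Fourier transform stands in for the FFT, and the coefficients are made-up numbers. It only checks that multiplying in real space and transforming back gives the same pair density as the direct (cyclic) convolution of the G-space coefficients.

```python
import cmath


def to_real_space(coeffs):
    """psi(r_j) = sum_G c[G] * exp(+2*pi*i*G*j/N) on an N-point grid."""
    n = len(coeffs)
    return [
        sum(c * cmath.exp(2j * cmath.pi * g * j / n) for g, c in enumerate(coeffs))
        for j in range(n)
    ]


def to_g_space(values):
    """f(G) = (1/N) * sum_j f(r_j) * exp(-2*pi*i*G*j/N)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * g * j / n) for j, v in enumerate(values)) / n
        for g in range(n)
    ]


def pair_density_via_real_space(c1, c2):
    """Multiply conj(psi1) * psi2 point by point, then transform to G-space."""
    psi1 = to_real_space(c1)
    psi2 = to_real_space(c2)
    product = [a.conjugate() * b for a, b in zip(psi1, psi2)]
    return to_g_space(product)


def pair_density_via_convolution(c1, c2):
    """rho(G) = sum_G1 conj(c1[G1]) * c2[G1 + G], indices taken modulo N."""
    n = len(c1)
    return [
        sum(c1[g1].conjugate() * c2[(g1 + g) % n] for g1 in range(n))
        for g in range(n)
    ]


# Made-up plane-wave coefficients on an 8-point grid.
c1 = [1 + 0j, 0.5j, -0.25 + 0j, 0j, 0j, 0j, 0.3 - 0.1j, 0.1 + 0j]
c2 = [0.8 + 0j, -0.2j, 0j, 0.4 + 0.4j, 0j, 0j, 0j, -0.6 + 0j]

rho_a = pair_density_via_real_space(c1, c2)
rho_b = pair_density_via_convolution(c1, c2)
max_diff = max(abs(a - b) for a, b in zip(rho_a, rho_b))
print(max_diff < 1e-12)  # True
```

The direct convolution costs N^2 operations for all G, whereas the real-space route costs N multiplications plus transforms that scale as N log N when a true FFT is used.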

The sum over G, G' is performed up to 10 Ry in your case (700 G vectors), and also the product of the wave-functions up to 700 G vectors is used.

The real-space multiplication is done up to the size of the real space grid which is defined using both the values of 20 Ry and 70 Ry. See this line:

Code:

call WF_load(WF,NG,O_ng_shift,bands_to_load,(/1,Xk%nibz/),space='R',title=trim(section_title))
This line is in https://github.com/yambo-code/yambo/blo ... rc/bse/K.F, where the input cutoff (10 Ry) is used together with other values. Inside WF_load (if I remember correctly) it is combined with "wf_ng" (the cutoff on the WFs).
The conversion of these cutoffs into a real-space grid follows some "magic rules". See details here:
https://github.com/yambo-code/yambo/blo ... ft_setup.F
See the "magic table" in the subroutine "fft_best_size".
The resulting grid size is reported in the log and in the report; you should see something like fft size 15 x 15 x 8 (just random numbers here).
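For readers wondering what such rules typically look like: FFT libraries are generally most efficient for grid sizes that factor into small primes, so a minimum size derived from the cutoff is usually rounded up to the next "friendly" size. The sketch below shows that generic heuristic only. It is an assumption for illustration and does not reproduce the table in fft_best_size; the function names and the choice of primes (2, 3, 5) are hypothetical.

```python
def has_only_small_prime_factors(n, primes=(2, 3, 5)):
    """Return True if n is a product of the given primes only."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1


def next_fft_friendly_size(n):
    """Smallest size >= n whose prime factors are all in (2, 3, 5)."""
    if n < 1:
        raise ValueError("grid size must be positive")
    while not has_only_small_prime_factors(n):
        n += 1
    return n


print(next_fft_friendly_size(7))   # 8
print(next_fft_friendly_size(13))  # 15
print(next_fft_friendly_size(17))  # 18
```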

Finally, as already pointed out, not all wave-functions products are computed. There are some complex tables which limit the total number by taking advantage of symmetries. If you wish to dig into this see here:
https://github.com/yambo-code/yambo/blo ... llisions.F
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/

Franz Fischer
Posts: 43
Joined: Wed Jul 20, 2022 9:36 am

Re: BSE speed

Post by Franz Fischer » Thu Jun 22, 2023 9:44 am

Dear Davide,

thank you for your thorough answer, but there are still some things that I don't understand.

Can you elaborate on:
"and also the product of the wave-functions up to 700 G vectors is used."
By this product, do you mean the real-space product of the WFs, i.e. the convolution in G-space? What is then meant by the next sentence?
"The real-space multiplication is done up to the size of the real space grid which is defined using both the values of 20 Ry and 70 Ry."
Here you again mention a real-space multiplication, using a different grid size. Moreover, where do the values 20 Ry and 70 Ry come from?

Concerning the symmetries mentioned here:
"Finally, as already pointed out, not all wave-functions products are computed. There are some complex tables which limit the total number by taking advantage of symmetries."
In the file you linked, I saw that you check whether the product of the symmetries that bring k_BZ and k'_BZ to the IBZ lies in the star of k'_BZ. I don't really understand what is meant by that. In other words, how do you compute the quantity PHASE in https://github.com/yambo-code/yambo/blo ... llisions.F?

Best,
Franz
