Speed up IP RPA

Deals with issues related to computation of optical spectra in reciprocal space: RPA, TDDFT, local field effects.

Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan

muhammadhasan
Posts: 45
Joined: Tue Aug 27, 2024 4:42 am

Speed up IP RPA

Post by muhammadhasan » Tue Dec 03, 2024 8:31 pm

Hi Professor,

I am currently calculating the dielectric function of Au using the IP RPA approximation. The calculations involve 6859 k-points (converged) and I have included 20 filled and 20 empty bands. I also need to compute the dielectric function for all 6859 q-points.

So far, my observations regarding computational resources and timing are as follows:
Total processors                | Time per q-point
96 (3 nodes, 32 CPUs per node)  | 21 minutes
64 (4 nodes, 16 CPUs per node)  | 42 minutes

Using this setup, I estimate that processing all q-points will require approximately two months, with about 94 q-points calculated per day.

To expedite the process, I plan to reduce the number of filled and empty bands to just one each, as the results do not significantly differ from the case with 20 bands. I believe this adjustment will make the calculations faster.

Would you kindly provide additional suggestions for optimizing the parallelization strategy? Specifically, I am interested in recommendations for allocating CPUs to v, c, k, q etc. to enhance performance.

Additionally, I have prepared a bash script that executes a loop to calculate one q-point per run.

Thank you for your time and guidance.

# Timing   [Min/Max/Average]: 20m-30s/21m-14s/20m-53s
#
# .-Input file  output_files/eps_finalHT/q2.in_20Ry_280bnd_k6859
# | infver                           # [R] Input file variables verbosity
# | optics                           # [R] Linear Response optical properties
# | dipoles                          # [R] Oscillator strenghts (or dipoles)
# | chi                              # [R][CHI] Dyson equation for Chi.
# | ElecTemp= 0.025869         eV    # Electronic Temperature
# | FFTGvecs= 20               Ry    # [FFT] Plane-waves
# | X_CPU= "4.4.6.1"                 # [PARALLEL] CPUs for each role
# | X_ROLEs= "v.c.k.q"               # [PARALLEL] CPUs roles (q,g,k,c,v)
# | X_nCPU_LinAlg_INV= 1             # [PARALLEL] CPUs for Linear Algebra (if -1 it is automatically set)
# | Chimod= "IP"                     # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
# | % QpntsRXd
# |  2 | 2 |                             # [Xd] Transferred momenta
# | %
# | % BndsRnXd
# |  234 | 280 |                         # [Xd] Polarization function bands
# | %
# | GrFnTpXd= "R"                    # [Xd] Green`s function (T)ordered,(R)etarded,(r)senant,(a)ntiresonant [T, R, r, Ta, Ra]
# | % EnRngeXd
# |  0.001500 | 0.500000 |         eV    # [Xd] Energy range
# | %
# | % DmRngeXd
# |  0.018000 | 0.018000 |         eV    # [Xd] Damping range
# | %
# | ETStpsXd=  85                    # [Xd] Total Energy steps
# | % LongDrXd
# |  1.000000 | 0.000000 | 0.000000 |        # [Xd] [cc] Electric Field
# | %
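The one-q-point-per-run loop mentioned above could be sketched roughly as follows. This is written in Python rather than bash, and it is only an illustration, not the actual script used for these runs. The helper name `input_for_q` and the shortened `TEMPLATE` are hypothetical; the only input variables used are ones that appear in the report above (`QpntsRXd`, `BndsRnXd`, `Chimod`). The sketch only rewrites the `QpntsRXd` range in the input text; launching the runs is left out.

```python
import re

# Shortened template, assumed for illustration; the field names are copied
# from the input file reported above.
TEMPLATE = """optics
chi
Chimod= "IP"
% QpntsRXd
 2 | 2 |
%
% BndsRnXd
 234 | 280 |
%
"""

# Matches the "% QpntsRXd" header line plus the single range line after it.
_QPNTS = re.compile(r"(%[ \t]*QpntsRXd[ \t]*\n)[^\n]*\n")


def input_for_q(template: str, iq: int) -> str:
    """Return a copy of `template` whose QpntsRXd range is iq..iq."""
    new_text, n_subs = _QPNTS.subn(
        lambda m: f"{m.group(1)} {iq} | {iq} |\n", template, count=1
    )
    if n_subs != 1:
        raise ValueError("QpntsRXd block not found in the template")
    return new_text


if __name__ == "__main__":
    # One input text per q-point; here only the first three are shown.
    for iq in (1, 2, 3):
        print(input_for_q(TEMPLATE, iq))
```

Each generated text can then be written to its own file, so that every q-point gets a separate run and a separate report.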
Best
Md J Hasan
PhD Student
Mechanical Engineering
University of Maine

Daniele Varsano
Posts: 4198
Joined: Tue Mar 17, 2009 2:23 pm

Re: Speed up IP RPA

Post by Daniele Varsano » Wed Dec 04, 2024 9:40 am

Dear Hasan,

The IP response at different q-points is independent, so you can launch several runs simultaneously.
As for the parallelization strategy, you can rely on the "c" and "v" roles, distributing the MPI tasks between them according to the number of conduction and valence bands involved. Of course, if you plan to use just one valence (filled) and one conduction (empty) band, then assign all the MPI tasks to the "k" role.
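As a rough illustration of the two points above, the following Python sketch builds an `X_CPU` string in the "v.c.k.q" role order used in the input file of the first post, and estimates the wall time when several q-points run at once. The function names and the balancing rule (keep the number of bands per task similar for "v" and "c") are assumptions made for this example, not something prescribed by Yambo.

```python
import math


def split_tasks(n_tasks: int, n_val: int, n_cond: int) -> tuple:
    """Return MPI task counts in (v, c, k, q) order.

    Assumed heuristic: with one valence and one conduction band, put all
    tasks on "k"; otherwise pick v * c == n_tasks with v <= n_val and
    c <= n_cond, keeping the bands per task as balanced as possible.
    """
    if n_val == 1 and n_cond == 1:
        return (1, 1, n_tasks, 1)
    best = None
    for v in range(1, min(n_val, n_tasks) + 1):
        if n_tasks % v:
            continue
        c = n_tasks // v
        if c > n_cond:
            continue
        imbalance = abs(n_val / v - n_cond / c)
        if best is None or imbalance < best[0]:
            best = (imbalance, v, c)
    if best is None:
        raise ValueError("n_tasks does not fit on the v and c roles alone")
    return (best[1], best[2], 1, 1)


def x_cpu_string(split: tuple) -> str:
    """Format a (v, c, k, q) split as an X_CPU value for X_ROLEs "v.c.k.q"."""
    return ".".join(str(n) for n in split)


def wall_days(n_q: int, minutes_per_q: float, n_concurrent: int) -> float:
    """Days needed when n_concurrent independent q-point runs go in parallel."""
    batches = math.ceil(n_q / n_concurrent)
    return batches * minutes_per_q / (24 * 60)


if __name__ == "__main__":
    print(x_cpu_string(split_tasks(96, 20, 20)))  # 20 filled, 20 empty bands
    print(x_cpu_string(split_tasks(96, 1, 1)))    # one filled, one empty band
    print(wall_days(6859, 21, 4))                 # four runs at a time
```

The estimate assumes that each concurrent run has its own allocation of the same size, so the time per q-point does not change when more runs are launched.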

Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
