
Speed up IP RPA

Posted: Tue Dec 03, 2024 8:31 pm
by muhammadhasan
Hi Professor,

I am currently calculating the dielectric function of Au using the IP RPA approximation. The calculations involve 6859 k-points (converged) and I have included 20 filled and 20 empty bands. I also need to compute the dielectric function for all 6859 q-points.

So far, my observations regarding computational resources and timing are as follows:
Total processors               | Time per q-point
96 (3 nodes, 32 CPUs per node) | 21 minutes
64 (4 nodes, 16 CPUs per node) | 42 minutes
Using this setup, I estimate that processing all q-points will require approximately two months, with about 94 q-points calculated per day.

To expedite the process, I plan to reduce the number of filled and empty bands to just one each, as the results do not significantly differ from the case with 20 bands. I believe this adjustment will make the calculations faster.

Would you kindly provide additional suggestions for optimizing the parallelization strategy? Specifically, I am interested in recommendations for allocating CPUs to v, c, k, q etc. to enhance performance.

Additionally, I have prepared a bash script that executes a loop to calculate one q-point per run.

Thank you for your time and guidance.

Code:

# Timing   [Min/Max/Average]: 20m-30s/21m-14s/20m-53s
#
# .-Input file  output_files/eps_finalHT/q2.in_20Ry_280bnd_k6859
# | infver                           # [R] Input file variables verbosity
# | optics                           # [R] Linear Response optical properties
# | dipoles                          # [R] Oscillator strenghts (or dipoles)
# | chi                              # [R][CHI] Dyson equation for Chi.
# | ElecTemp= 0.025869         eV    # Electronic Temperature
# | FFTGvecs= 20               Ry    # [FFT] Plane-waves
# | X_CPU= "4.4.6.1"                 # [PARALLEL] CPUs for each role
# | X_ROLEs= "v.c.k.q"               # [PARALLEL] CPUs roles (q,g,k,c,v)
# | X_nCPU_LinAlg_INV= 1             # [PARALLEL] CPUs for Linear Algebra (if -1 it is automatically set)
# | Chimod= "IP"                     # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
# | % QpntsRXd
# |  2 | 2 |                             # [Xd] Transferred momenta
# | %
# | % BndsRnXd
# |  234 | 280 |                         # [Xd] Polarization function bands
# | %
# | GrFnTpXd= "R"                    # [Xd] Green`s function (T)ordered,(R)etarded,(r)senant,(a)ntiresonant [T, R, r, Ta, Ra]
# | % EnRngeXd
# |  0.001500 | 0.500000 |         eV    # [Xd] Energy range
# | %
# | % DmRngeXd
# |  0.018000 | 0.018000 |         eV    # [Xd] Damping range
# | %
# | ETStpsXd=  85                    # [Xd] Total Energy steps
# | % LongDrXd
# |  1.000000 | 0.000000 | 0.000000 |        # [Xd] [cc] Electric Field
# | %
Best
Md J Hasan
PhD Student
Mechanical Engineering
University of Maine

Re: Speed up IP RPA

Posted: Wed Dec 04, 2024 9:40 am
by Daniele Varsano
Dear Hasan,

The IP responses at different q-points are independent of each other, so you can run several calculations simultaneously.
As for the parallelization strategy, you can rely on the "c,v" parallelization, distributing the MPI tasks between "c" and "v" according to the number of conduction and valence bands involved. Of course, if you plan to use just one valence and one conduction band, then assign all the MPI tasks to the "k" role.
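
As a rough illustration of the two points above, here is a minimal sketch of a bash loop that writes one input file per q-point and starts a few runs at a time. The template is abbreviated and hypothetical (a real one would carry the full input, including the bands, energy range and damping), and the q-point range, the number of simultaneous runs and the 24 MPI tasks per run are placeholders rather than recommendations. The "1.1.N.1" layout puts every task on "k" and only makes sense for the one-valence, one-conduction-band case. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: one yambo run per q-point, a few runs at a time.
set -euo pipefail

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them
first_q=1               # placeholder q-point range
last_q=8
n_parallel=4            # placeholder: number of simultaneous runs
ntasks=24               # placeholder: MPI tasks per run

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" -eq 1 ]; then echo "$@"; else "$@"; fi
}

mkdir -p q_inputs

# Abbreviated, hypothetical template; @IQ@ marks the q-point index.
cat > q_inputs/template.in <<EOF
optics
chi
Chimod= "IP"
X_ROLEs= "v.c.k.q"
X_CPU= "1.1.${ntasks}.1"   # product of the entries = MPI tasks per run
% QpntsRXd
 @IQ@ | @IQ@ |
%
EOF

for ((iq = first_q; iq <= last_q; iq++)); do
  # Fill in the q-point index for this run.
  sed "s/@IQ@/${iq}/g" q_inputs/template.in > "q_inputs/q${iq}.in"
  # Start the run in the background.
  run mpirun -np "$ntasks" yambo -F "q_inputs/q${iq}.in" -J "q${iq}" &
  # After every n_parallel launches, wait for the whole batch to finish.
  if (( (iq - first_q + 1) % n_parallel == 0 )); then
    wait
  fi
done
wait   # catch any runs left over from an incomplete last batch
```

Setting DRY_RUN=0 executes the commands instead of printing them. On a cluster with a queue system, submitting each q-point (or each batch) as a separate job is often the simpler way to get the same concurrency.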

Best,
Daniele