Yambo 5.0.4 parallelization strategy for large system [nbnds = 3600]

Post by harrier_class » Tue Jun 02, 2026 5:45 pm

Dear Yambo team,

I am trying to run an optical absorption spectrum calculation with Yambo 5.0.4. It is a big system, with 332 atoms and 3600 bands. Even with modest parameters, we need around 224.2684 [Gb] [per MPI task, if my understanding is correct; please let me know if it is not].

Since it is a big system, the calculations will take quite a long time, so we decided to use Yambo's parallelization options. We set them in our input file as shown here:

Code: Select all

#                                                                     
#  __  __   ________   ___ __ __    _______   ______                  
# /_/\/_/\ /_______/\ /__//_//_/\ /_______/\ /_____/\                 
# \ \ \ \ \\::: _  \ \\::\| \| \ \\::: _  \ \\:::_ \ \                
#  \:\_\ \ \\::(_)  \ \\:.      \ \\::(_)  \/_\:\ \ \ \               
#   \::::_\/ \:: __  \ \\:.\-/\  \ \\::  _  \ \\:\ \ \ \              
#     \::\ \  \:.\ \  \ \\. \  \  \ \\::(_)  \ \\:\_\ \ \             
#      \__\/   \__\/\__\/ \__\/ \__\/ \_______\/ \_____\/             
#                                                                     
#                                                                     
#       Version 5.0.4 Revision 19598 Hash 20b2ffa04                   
#                      Branch is 5.0                                  
#              MPI+OpenMP+SLK+HDF5_IO Build                           
#                http://www.yambo-code.org                            
#
optics                           # [R] Linear Response optical properties
chi                              # [R][CHI] Dyson equation for Chi.
tddft                            # [R][K] Use TDDFT kernel
X_Threads=0                      # [OPENMP/X] Number of threads for response functions
DIP_Threads=0                    # [OPENMP/X] Number of threads for dipoles
Chimod= "ALDA"                   # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
FxcGRLc= 1                 Ry    # [TDDFT] XC-kernel RL size
NGsBlkXd= 1                Ry    # [Xd] Response block size
% QpntsRXd
 1 | 1 |                             # [Xd] Transferred momenta
%
% BndsRnXd
     1 |  3600 |                     # [Xd] Polarization function bands
%
% EnRngeXd
  0.00000 | 6.00000 |         eV    # [Xd] Energy range
%
% DmRngeXd
 0.100000 | 0.100000 |         eV    # [Xd] Damping range
%
ETStpsXd= 100                    # [Xd] Total Energy steps
% LongDrXd
 1.000000 | 1.000000 | 1.000000 |        # [Xd] [cc] Electric Field
%
X_CPU= "1.4.8.1.1"               # [PARALLEL] CPUs for each role
X_ROLEs= "g.v.c.k.q"             # [PARALLEL] CPUs roles (q,g,k,c,v)
X_nCPU_LinAlg_INV= 32            # [PARALLEL] CPUs for Linear Algebra (if -1 it is automatically set)
DIP_CPU= "1 8 4" # [PARALLEL] CPUs for each role
DIP_ROLEs= "k c v" # [PARALLEL] CPUs roles (k,c,v)
PAR_def_mode= "memory"
X_all_q_ROLEs= "q k c v"
X_all_q_CPU= "1 1 8 4"

This is the submission script I used:

Code: Select all

#!/bin/bash -l
#SBATCH --job-name=E_top_3600
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=8
#SBATCH --time=24:00:00
#SBATCH --export=NONE
#SBATCH -o slurm-%j.out
#SBATCH -e slurm-%j.err
###SBATCH --mem=0
#SBATCH -p spr1tb

unset SLURM_EXPORT_ENV


module purge
module load intel/2021.4.0
module load intelmpi/2021.6.0
module load mkl/2021.4.0
module load hdf5/1.10.7-impi-intel
module load netcdf-c/4.8.1
module load netcdf-fortran/4.5.3-intel

export PATH=/home/woody/bccc/bccc128h/software/yambo_5.0_fritz_cpu/bin:$PATH

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK

srun yambo -F yambo.in_ALDA
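
As a sanity check on this layout, the sketch below multiplies out the role counts and compares them with the number of MPI tasks requested from Slurm. It assumes that the product of the entries in each *_CPU string should equal the total number of MPI tasks; the helper name `cpu_product` is hypothetical, and the values are copied from the input file and submission script.

```python
import math
import re

def cpu_product(cpu_string):
    """Multiply the per-role task counts of a Yambo *_CPU string.

    Accepts both separators used in the input file ("1.4.8.1.1" and "1 8 4").
    """
    counts = [int(tok) for tok in re.split(r"[.\s]+", cpu_string.strip()) if tok]
    return math.prod(counts)

# Values copied from the #SBATCH lines of the submission script.
nodes, tasks_per_node, cpus_per_task = 8, 4, 8
mpi_tasks = nodes * tasks_per_node          # MPI tasks started by srun
total_threads = mpi_tasks * cpus_per_task   # MPI tasks times OpenMP threads

layouts = {"X_CPU": "1.4.8.1.1", "DIP_CPU": "1 8 4"}
for name, cpu_string in layouts.items():
    product = cpu_product(cpu_string)
    status = "ok" if product == mpi_tasks else "MISMATCH"
    print(f"{name}={cpu_string!r}: {product} tasks vs {mpi_tasks} MPI tasks -> {status}")
print(f"total threads: {total_threads}")
```

Both strings multiply to 32, which agrees with the "MPI Cores: 32" and "Threads total: 256" lines of the report file.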

Code: Select all



  
 ___ __  _____  __ __  _____   _____
|   Y  ||  _  ||  Y  ||  _  \ |  _  |
|   |  ||. |  ||.    ||. |  / |. |  |
 \_  _/ |. _  ||.\_/ ||. _  \ |. |  |
  |: |  |: |  ||: |  ||: |   \|: |  |
  |::|  |:.|:.||:.|:.||::.   /|::.  |
  `--"  `-- --"`-- --"`-----" `-----"


 <---> P20: [01] MPI/OPENMP structure, Files & I/O Directories
 <---> P20-f2274.nhr.fau.de: MPI Cores-Threads   : 32(CPU)-8(threads)
 <---> P20-f2274.nhr.fau.de: MPI Cores-Threads   : DIP(environment)-1 8 4(CPUs)-k c v(ROLEs)
 <---> P20-f2274.nhr.fau.de: MPI Cores-Threads   : X(environment)-1.4.8.1.1(CPUs)-g.v.c.k.q(ROLEs)
 <---> P20-f2274.nhr.fau.de: [02] CORE Variables Setup
 <---> P20-f2274.nhr.fau.de: [02.01] Unit cells
 <04s> P20-f2274.nhr.fau.de: [02.02] Symmetries
 <04s> P20-f2274.nhr.fau.de: [02.03] Reciprocal space
 <04s> P20-f2274.nhr.fau.de: [02.04] K-grid lattice
 <04s> P20-f2274.nhr.fau.de: [02.05] Energies & Occupations
 <04s> P20-f2274.nhr.fau.de: [WARNING][X] Metallic system
 <04s> P20-f2274.nhr.fau.de: [03] Transferred momenta grid and indexing
 <04s> P20-f2274.nhr.fau.de: [MEMORY] Alloc bare_qpg( 58.86900 [Mb]) TOTAL:  243.3530 [Mb] (traced)  96.28400 [Mb] (memstat)
 <04s> P20-f2274.nhr.fau.de: [04] Dipoles
 <04s> P20-f2274.nhr.fau.de: [PARALLEL DIPOLES for K(ibz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
 <04s> P20-f2274.nhr.fau.de: [PARALLEL DIPOLES for CON bands on 8 CPU] Loaded/Total (Percentual):108/862(13%)
 <04s> P20-f2274.nhr.fau.de: [PARALLEL DIPOLES for VAL bands on 4 CPU] Loaded/Total (Percentual):688/2752(25%)
 <04s> P20-f2274.nhr.fau.de: [DIP] Checking dipoles header
 <05s> P20-f2274.nhr.fau.de: [WARNING][DIP] Database not correct or missing. To be computed
 <05s> P20-f2274.nhr.fau.de: [MEMORY] Alloc DIP_iR( 55.59900 [Mb]) TOTAL:  299.0520 [Mb] (traced)  96.28400 [Mb] (memstat)
 <05s> P20-f2274.nhr.fau.de: [MEMORY] Alloc DIP_P( 55.59900 [Mb]) TOTAL:  354.6510 [Mb] (traced)  96.28400 [Mb] (memstat)
 <05s> P20-f2274.nhr.fau.de: [MEMORY] Alloc DIP_v( 55.59900 [Mb]) TOTAL:  410.2500 [Mb] (traced)  96.28400 [Mb] (memstat)
 <05s> P20-f2274.nhr.fau.de: [x,Vnl] computed using 7398 projectors
 <05s> P20-f2274.nhr.fau.de: [WARNING] [x,Vnl] slows the Dipoles computation. To neglect it rename the ns.kb_pp file
 <05s> P20-f2274.nhr.fau.de: [MEMORY] Alloc pp_kb( 128.7640 [Mb]) TOTAL:  539.0170 [Mb] (traced)  96.28400 [Mb] (memstat)
 <05s> P20-f2274.nhr.fau.de: [MEMORY] Alloc pp_kbd( 128.7640 [Mb]) TOTAL:  667.7810 [Mb] (traced)  96.28400 [Mb] (memstat)
 <15s> P20-f2274.nhr.fau.de: [MEMORY] Alloc kbv( 217.7363 [Gb]) TOTAL:  218.4041 [Gb] (traced)  96.28400 [Mb] (memstat)
 <15s> P20-f2274.nhr.fau.de: Dipoles: P, V and iR (T) |                                        | [000%] --(E) --(X)
 <15s> P20-f2274.nhr.fau.de: [MEMORY] Alloc WF%c( 5.856924 [Gb]) TOTAL:  224.2610 [Gb] (traced)  96.28400 [Mb] (memstat)
 <15s> P20-f2274.nhr.fau.de: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):796/3600(22%)
 <16s> P20-f2274.nhr.fau.de: [MEMORY] Alloc wf_disk( 5.856924 [Gb]) TOTAL:  230.1327 [Gb] (traced)  96.28400 [Mb] (memstat)
 <44s> P20-f2274.nhr.fau.de: [MEMORY]  Free wf_disk( 5.856924 [Gb]) TOTAL:  224.2758 [Gb] (traced)  96.28400 [Mb] (memstat)
                                                                                                                            
For the sake of completeness, this is the r_optics_chi_tddft report file from this run:

Code: Select all


     ____  ____     _       ____    ____  ______      ___
    |_  _||_  _|   / \     |_   \  /   _||_   _ \   ."   `.
      \ \  / /    / _ \      |   \/   |    | |_) | /  .-.  \
       \ \/ /    / ___ \     | |\  /| |    |  __". | |   | |
       _|  |_  _/ /   \ \_  _| |_\/_| |_  _| |__) |\  `-"  /
      |______||____| |____||_____||_____||_______/  `.___."



          Version 5.0.4 Revision 19598 Hash 20b2ffa04
                         Branch is 5.0
                  MPI+OpenMP+SLK+HDF5_IO Build
                   http://www.yambo-code.org


 06/02/2026 at 14:55 yambo @ f2268.nhr.fau.de
 ==================================================

 Cores-Threads       : 32(CPU)-8(threads)
 Cores-Threads       : DIP(environment)-1 8 4(CPUs)-k c v(ROLEs)
 Cores-Threads       : X(environment)-1.4.8.1.1(CPUs)-g.v.c.k.q(ROLEs)
 MPI Cores           :   32
 Threads per core    :   8
 Threads total       :  256
 Nodes Computing     :   8
 Nodes IO            :  1

 Fragmented WFs      : yes
 CORE databases      : .
 Additional I/O      : .
 Communications      : .
 Input file          : yambo.in_ALDA
 Report file         : ./r_optics_chi_tddft_19
 Verbose log/report  : no
 Log files           : ./LOG

 Precision           : SINGLE

 [RD./SAVE//ns.db1]--------------------------------------------------------------
  Bands                                            :   3600
  K-points                                         :  1
  G-vectors                                        :   7535261 [RL space]
  Components                                       :   941817 [wavefunctions]
  Symmetries                                       :  2 [spatial+T-reV]
  Spinor components                                :  1
  Spin polarizations                               :  1
  Temperature                                      :  0.025852 [eV]
  Electrons                                        :   5488.00
  WF G-vectors                                     :   941817
  Max atoms/species                                :  280
  No. of atom species                              :   5
  Exact exchange fraction in XC                    :  0.000000
  Exact exchange screening in XC                   :  0.000000
  Magnetic symmetries                              : no
 - S/N 002268 ---------------------------------------------- v.05.00.04 r.19598 -

 [02] CORE Variables Setup
 =========================


  [02.01] Unit cells
  ==================

  Cell kind             :  Unknown
  Atoms in the cell     :  Au  C  H  S  O
  number of Au atoms    :  280
  number of C  atoms    :   32
  number of H  atoms    :  16
  number of S  atoms    :  2
  number of O  atoms    :  2
  Alat factors          :  47.30090  39.54959  41.66742 [a.u.]

  Direct lattice volume :   77948.5    [a.u.]
  Direct lattice vectors:  A[ 1 ]  A[ 2 ]  A[ 3 ]
   A[ 1 ]:  1.000000  0.000000  0.000000  [iru]
   A[ 2 ]:  0.000000  1.000000  0.000000  [iru]
   A[ 3 ]:  0.000000  0.000000  1.000000  [iru]

  Recip. lattice volume :  0.003182 [a.u.]
  Recip. lattice vectors:  B[ 1 ]  B[ 2 ]  B[ 3 ]
   B[ 1 ]:  1.000000  0.000000  0.000000  [iku]
   B[ 2 ]:  0.000000  1.000000  0.000000  [iku]
   B[ 3 ]:  0.000000  0.000000  1.000000  [iku]

  [02.02] Symmetries
  ==================

  Inversion symmetry    : yes
  Spatial inversion     : no
  Inversion index       :  2
  K-space Time-reversal : yes
  Magnetic symmetries   : no
  Time-reversal derived K-space symmetries:  2  2
  Group table correct   : yes
  Symmetries units      :  [cc]

   [S 1]:  1.000000  0.000000  0.000000  0.000000  1.000000  0.000000  0.000000  0.000000  1.000000
   [S*2]: -1.000000  0.000000  0.000000  0.000000 -1.000000  0.000000  0.000000  0.000000 -1.000000

  [02.03] Reciprocal space
  ========================

  nG shells         :   514990
  nG charge         :   7535261
  nG WFs            :   941817
  nC WFs            :   941817
  G-vecs. in first 80 shells:  [ Number ]
    1   3   5   7  11  15  19  27  29
    31   35   39   41   45   49   57   61   65
    73   81   83   87   91   95   99  107  111
   119  121  129  137  141  143  147  151  159
   163  167  171  179  187  195  199  201  209
   213  221  225  229  233  237  245  253  261
   269  277  281  283  287  295  299  303  307
    323   327   335   337   345   353   361   365   369
    373   377   379   387   395   403   407   415
  ...
  Shell energy in first 80 shells:  [ mHa ]
    0.00000   8.82249  11.36938  12.61961  20.19186  21.44209  23.98899  32.81147  35.28995
   45.47751  46.65932  47.90955  50.47843  54.30000  58.09712  59.27893  59.30093  61.84781
   66.91961  70.67030  79.40240  80.76746  85.76838  90.77177  92.02200  93.38707  95.95594
    97.1378  102.3244  103.3914  104.7784  111.1469  113.5765  114.9440  122.3989  123.7665
   124.8799  124.9458  129.8808  131.2459  133.7683  137.4995  137.6143  141.1598  141.2502
   148.8664  150.2339  152.5292  152.8028  153.7794  159.0540  160.2358  161.6253  165.1488
   167.8765  175.3584  181.7268  181.9100  186.6373  188.0928  190.7325  191.6382  192.9789
   194.3439  194.5297  199.2569  201.9137  203.0076  203.3521  204.3482  210.7362  213.2831
   215.9009  217.2000  220.5622  222.1056  224.7233  229.8196  231.9316  232.2052
  ...

  [02.04] K-grid lattice
  ======================

  Compatible Grid is   : 0D
  K lattice UC volume  :  0.003182 [a.u.]

  [02.05] Energies & Occupations
  ==============================

  [X] === General ===
  [X] Electronic Temperature                        :  0.258606E-1   300.100    [eV K]
  [X] Bosonic    Temperature                        :  0.258606E-1   300.100    [eV K]
  [X] Finite Temperature mode                       : yes
  [X] El. density                                   :  0.47512E+24 [cm-3]
  [X] Fermi Level                                   :  6.085646 [eV]

  [X] === Gaps and Widths ===
  [X] Conduction Band Min                           :  6.085646 [eV]
  [X] Valence Band Max                              :  6.085646 [eV]
  [X] Filled Bands                                  :  2738
  [X] Metallic Bands                                :  2739  2752
  [X] Empty Bands                                   :   2753   3600

  [X] === Metallic Characters ===
  [X] N of el / N of met el                         :   5488.00      12.0000
  [X] Average metallic occ.                         :  0.428572

  [WARNING][X] Metallic system


 Timing [Min/Max/Average]: 03s/03s/03s

 [03] Transferred momenta grid and indexing
 ==========================================

 [RD./SAVE//ndb.kindx]-----------------------------------------------------------
  Fragmentation                                    : no
  Polarization last K                              :  1
  QP states                                        :  1  1
  X grid is uniform                                : yes
  Grids                                            : X S
  BS scattering                                    : no
  COLL scattering                                  : no
  Sigma scattering                                 : yes
  X scattering                                     : yes
 - S/N 002268 ---------------------------------------------- v.05.00.04 r.19598 -

 IBZ Q-points :  1
 BZ  Q-points :  1

 K/Q-points units:
 rlu = crystal or reduced units; cc = cartesian coordinates; iku = interal k-units

 Q [1]:  0.000000  0.000000  0.000000 [rlu]

 [04] Dipoles
 ============


 [WARNING][DIP] Database not correct or missing. To be computed
 [RD./SAVE//ns.kb_pp_pwscf]------------------------------------------------------
  Fragmentation                                    : yes
 - S/N 002268 ---------------------------------------------- v.05.00.04 r.19598 -

 [WARNING] [x,Vnl] slows the Dipoles computation. To neglect it rename the ns.kb_pp file
 [WF-Oscillators/G space/Transverse up loader] Normalization (few states)  min/max  :  0.18709E-10   1.0000

Even after changing the parallelization flags [as shown above], the memory requirement reported in the LOG file does not change. Since our system is quite large, the memory requirement is high. It can be handled by using more OpenMP threads, but that makes the calculation slow. Could you please suggest how to parallelize better? Also, since we have a walltime limit of 24 hours, is checkpointing possible? [Perhaps I should open a separate question for that.]
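
The two largest allocations in the LOG can be reproduced with simple arithmetic, which also hints at why the total does not react to the band distribution. The sketch below assumes single-precision complex numbers (8 bytes, consistent with "Precision: SINGLE" in the report), that kbv stores 4 components per wavefunction G-vector and projector, and that Yambo's "[Gb]" corresponds to 1.024e9 bytes. The last two points are inferred from the fact that the numbers match the log, not taken from the Yambo documentation.

```python
BYTES_PER_COMPLEX = 8      # single-precision complex ("Precision: SINGLE")
BYTES_PER_GB = 1.024e9     # assumption: inferred from matching the LOG figures

N_WF_G = 941817            # "WF G-vectors" in the report
N_PROJECTORS = 7398        # "[x,Vnl] computed using 7398 projectors"
KBV_COMPONENTS = 4         # assumption: components stored per G-vector and projector

def to_gb(n_bytes):
    """Convert bytes to the Gb unit assumed above."""
    return n_bytes / BYTES_PER_GB

# kbv: depends only on G-vectors and projectors, not on the c/v band split.
kbv_gb = to_gb(N_WF_G * KBV_COMPONENTS * N_PROJECTORS * BYTES_PER_COMPLEX)

# WF%c: scales with the number of states loaded by this task (796 of 3600).
wf_gb = to_gb(796 * N_WF_G * BYTES_PER_COMPLEX)

print(f"kbv  ~ {kbv_gb:.4f} Gb (LOG: 217.7363)")
print(f"WF%c ~ {wf_gb:.6f} Gb (LOG: 5.856924)")
```

If these assumptions hold, about 97% of the 224 Gb per task comes from kbv, which is tied to the [x,Vnl] commutator rather than to the number of bands assigned to each task. That would be consistent with the total staying the same when only the "c" and "v" task counts change.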

Thanks a lot in advance.

--
Best regards,
Vipul
Vipul Kumar Ambasta
MSc. student
Friedrich Alexander Universitaet
Erlangen (Germany)

Post by Daniele Varsano » Wed Jun 03, 2026 10:20 am

Dear Vipul Kumar Ambasta,

The best strategy to distribute memory is to parallelize over bands ("c" and "v").
In doing so, you can fine-tune the assignment of MPI tasks so that the workload is evenly distributed: in your case, about three times as many tasks on "v" as on "c", since there are about three times as many occupied states as empty states. The suggestion is to move more tasks to "v", something like:

Code: Select all

X_CPU= "1.8.4.1.1"              
X_ROLEs= "g.v.c.k.q"  
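
To see what this swap does to the per-task load, the sketch below counts how many bands each MPI task would hold for the two layouts. It assumes bands are split as evenly as possible (ceiling division) and uses the band totals reported in the log, 2752 for "v" and 862 for "c"; the function name `bands_per_task` is hypothetical.

```python
def bands_per_task(n_val, n_con, tasks_v, tasks_c):
    """Return (valence, conduction, total) bands held by the most loaded task."""
    val = -(-n_val // tasks_v)   # ceiling division
    con = -(-n_con // tasks_c)
    return val, con, val + con

N_VAL, N_CON = 2752, 862  # "VAL bands" and "CON bands" totals from the log

for label, tasks_v, tasks_c in [("v=4, c=8", 4, 8), ("v=8, c=4", 8, 4)]:
    val, con, total = bands_per_task(N_VAL, N_CON, tasks_v, tasks_c)
    print(f"{label}: {val} valence + {con} conduction = {total} states, "
          f"{val * con} v-c pairs per task")
```

Under these assumptions both layouts give the same number of valence-conduction pairs per task, but the second holds 560 rather than 796 states per task, which matches the "Loaded/Total" wave-function counts in the logs of this thread.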
Next, consider neglecting the non-local commutator:

Code: Select all

 [WARNING] [x,Vnl] slows the Dipoles computation. To neglect it rename the ns.kb_pp file
and see if it allows continuing your calculation.
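
One cautious way to do the rename is to move the files aside rather than delete them, so they can be restored later. The sketch below assumes the databases live in ./SAVE and that every relevant file name starts with ns.kb_pp (the report in this thread shows ns.kb_pp_pwscf, with fragmentation); the function name and the backup directory name are hypothetical.

```python
from pathlib import Path
import shutil

def set_aside_kb_pp(save_dir="SAVE", backup_name="kb_pp_backup"):
    """Move every <save_dir>/ns.kb_pp* file into a sibling backup directory.

    Returns the names of the files that were moved, in sorted order.
    """
    save = Path(save_dir)
    backup = save.parent / backup_name
    moved = []
    for path in sorted(save.glob("ns.kb_pp*")):
        if path.is_file():
            backup.mkdir(exist_ok=True)
            shutil.move(str(path), str(backup / path.name))
            moved.append(path.name)
    return moved

if __name__ == "__main__":
    if Path("SAVE").is_dir():
        print("moved:", set_aside_kb_pp())
    else:
        print("no SAVE directory here, nothing to do")
```

Moving the files back into SAVE restores the original behaviour.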

Also consider updating to a more recent release of the code.

Unfortunately, restart is not possible for dipole calculations.

Best,

Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

Post by harrier_class » Thu Jun 04, 2026 9:12 pm

Dear Dr. Varsano,

Thank you for your quick response. As you suggested, I changed the parallelization so that more MPI tasks are allotted to valence bands than to conduction bands.

I also tried your other suggestion, neglecting the non-local commutator by renaming the ns.kb_pp file, and that really helped. The dipole calculation now takes less than 5 minutes, whereas before it showed an estimate of more than 5 days! Since this is an approximation and I am new to such calculations, is there anything I should keep in mind when using it? [For reference, my system contains C, H, S, O and Au; I use norm-conserving scalar-relativistic (nc sr) pseudopotentials, and Quantum ESPRESSO was used to relax the structure at the gamma point.] Regarding the dipole calculations, I saw in another reply of yours that the commutator is not needed if the dipoles are calculated with the covariant approach. I will certainly read about it to understand more, but is there anything I should keep in mind there?

I also wanted to share that, although the dipole calculation is now very fast and the Xo@q[1] step is also quite fast [less than an hour], the calculation then stays at X@q[1] for a very long time [in my last run it was stuck there for more than 8 hours]. I will let the job run until it hits the walltime limit [24 hours], but I am not sure whether such a long X@q[1] step is expected. Any suggestions or advice, please?

This is the log file from this run:

Code: Select all



  __ __  ____ ___ ___ ____   ___
 |  |  |/    |   |   |    \ /   \
 |  |  |  o  | _   _ |  o  )     |
 |  ~  |     |  \_/  |     |  O  |
 |___, |  _  |   |   |  O  |     |
 |     |  |  |   |   |     |     |
 |____/|__|__|___|___|_____|\___/


 <02s> P10: [01] MPI/OPENMP structure, Files & I/O Directories
 <02s> P10-f2179.nhr.fau.de: MPI Cores-Threads   : 32(CPU)-8(threads)
 <02s> P10-f2179.nhr.fau.de: MPI Cores-Threads   : DIP(environment)-1 4 8(CPUs)-k c v(ROLEs)
 <02s> P10-f2179.nhr.fau.de: MPI Cores-Threads   : X(environment)-1.8.4.1.1(CPUs)-g.v.c.k.q(ROLEs)
 <02s> P10-f2179.nhr.fau.de: [02] CORE Variables Setup
 <02s> P10-f2179.nhr.fau.de: [02.01] Unit cells
 <05s> P10-f2179.nhr.fau.de: [02.02] Symmetries
 <05s> P10-f2179.nhr.fau.de: [02.03] Reciprocal space
 <05s> P10-f2179.nhr.fau.de: [02.04] K-grid lattice
 <05s> P10-f2179.nhr.fau.de: [02.05] Energies & Occupations
 <05s> P10-f2179.nhr.fau.de: [WARNING][X] Metallic system
 <05s> P10-f2179.nhr.fau.de: [03] Transferred momenta grid and indexing
 <05s> P10-f2179.nhr.fau.de: [MEMORY] Alloc bare_qpg( 58.86900 [Mb]) TOTAL:  243.3510 [Mb] (traced)  96.30800 [Mb] (memstat)
 <05s> P10-f2179.nhr.fau.de: [04] Dipoles
 <05s> P10-f2179.nhr.fau.de: [PARALLEL DIPOLES for K(ibz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
 <05s> P10-f2179.nhr.fau.de: [PARALLEL DIPOLES for CON bands on 4 CPU] Loaded/Total (Percentual):216/862(25%)
 <05s> P10-f2179.nhr.fau.de: [PARALLEL DIPOLES for VAL bands on 8 CPU] Loaded/Total (Percentual):344/2752(13%)
 <05s> P10-f2179.nhr.fau.de: [DIP] Checking dipoles header
 <06s> P10-f2179.nhr.fau.de: [WARNING][DIP] Database not correct or missing. To be computed
 <06s> P10-f2179.nhr.fau.de: [MEMORY] Alloc DIP_iR( 55.59900 [Mb]) TOTAL:  299.0500 [Mb] (traced)  96.30800 [Mb] (memstat)
 <06s> P10-f2179.nhr.fau.de: [MEMORY] Alloc DIP_P( 55.59900 [Mb]) TOTAL:  354.6490 [Mb] (traced)  96.30800 [Mb] (memstat)
 <06s> P10-f2179.nhr.fau.de: [MEMORY] Alloc DIP_v( 55.59900 [Mb]) TOTAL:  410.2480 [Mb] (traced)  96.30800 [Mb] (memstat)
 <06s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |                                        | [000%] --(E) --(X)
 <06s> P10-f2179.nhr.fau.de: [MEMORY] Alloc WF%c( 4.120449 [Gb]) TOTAL:  4.530697 [Gb] (traced)  96.30800 [Mb] (memstat)
 <06s> P10-f2179.nhr.fau.de: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):560/3600(16%)
 <07s> P10-f2179.nhr.fau.de: [MEMORY] Alloc wf_disk( 2.531133 [Gb]) TOTAL:  7.076566 [Gb] (traced)  96.30800 [Mb] (memstat)
 <21s> P10-f2179.nhr.fau.de: [MEMORY]  Free wf_disk( 2.531133 [Gb]) TOTAL:  4.545433 [Gb] (traced)  96.30800 [Mb] (memstat)
 <21s> P10-f2179.nhr.fau.de: [MEMORY] Alloc wf_disk( 1.589316 [Gb]) TOTAL:  6.134749 [Gb] (traced)  96.30800 [Mb] (memstat)
 <25s> P10-f2179.nhr.fau.de: [MEMORY]  Free wf_disk( 1.589316 [Gb]) TOTAL:  4.545433 [Gb] (traced)  96.30800 [Mb] (memstat)
 <01m-24s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#                                       | [002%] 01m-18s(E) 52m-05s(X)
 <01m-29s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |##                                      | [005%] 01m-23s(E) 25m-37s(X)
 <01m-34s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |###                                     | [008%] 01m-28s(E) 17m-41s(X)
 <01m-39s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |####                                    | [011%] 01m-33s(E) 13m-52s(X)
 <01m-44s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#####                                   | [014%] 01m-38s(E) 11m-37s(X)
 <01m-49s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |######                                  | [016%] 01m-43s(E) 10m-08s(X)
 <01m-54s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#######                                 | [019%] 01m-48s(E) 09m-04s(X)
 <01m-59s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#########                               | [022%] 01m-53s(E) 08m-16s(X)
 <02m-04s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |##########                              | [025%] 01m-58s(E) 07m-40s(X)
 <02m-09s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |###########                             | [028%] 02m-03s(E) 07m-10s(X)
 <02m-14s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |############                            | [031%] 02m-08s(E) 06m-47s(X)
 <02m-19s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#############                           | [034%] 02m-13s(E) 06m-27s(X)
 <02m-24s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |##############                          | [037%] 02m-18s(E) 06m-10s(X)
 <02m-29s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |################                        | [040%] 02m-23s(E) 05m-56s(X)
 <02m-34s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#################                       | [043%] 02m-28s(E) 05m-44s(X)
 <02m-39s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |##################                      | [045%] 02m-33s(E) 05m-33s(X)
 <02m-44s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |###################                     | [048%] 02m-38s(E) 05m-23s(X)
 <02m-49s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |####################                    | [051%] 02m-43s(E) 05m-15s(X)
 <02m-54s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#####################                   | [054%] 02m-48s(E) 05m-07s(X)
 <02m-59s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#######################                 | [057%] 02m-53s(E) 05m-00s(X)
 <03m-04s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |########################                | [060%] 02m-58s(E) 04m-54s(X)
 <03m-09s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#########################               | [063%] 03m-03s(E) 04m-49s(X)
 <03m-14s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |##########################              | [066%] 03m-08s(E) 04m-43s(X)
 <03m-19s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |###########################             | [069%] 03m-13s(E) 04m-39s(X)
 <03m-24s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |############################            | [072%] 03m-18s(E) 04m-34s(X)
 <03m-29s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#############################           | [074%] 03m-23s(E) 04m-30s(X)
 <03m-34s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |###############################         | [077%] 03m-28s(E) 04m-27s(X)
 <03m-39s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |################################        | [080%] 03m-33s(E) 04m-23s(X)
 <03m-44s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |#################################       | [083%] 03m-38s(E) 04m-20s(X)
 <03m-49s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |##################################      | [086%] 03m-43s(E) 04m-17s(X)
 <03m-54s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |###################################     | [089%] 03m-48s(E) 04m-15s(X)
 <03m-59s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |####################################    | [092%] 03m-53s(E) 04m-12s(X)
 <04m-04s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |######################################  | [095%] 03m-58s(E) 04m-09s(X)
 <04m-09s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |####################################### | [098%] 04m-03s(E) 04m-07s(X)
 <04m-12s> P10-f2179.nhr.fau.de: Dipoles: P, V and iR (T) |########################################| [100%] 04m-06s(E) 04m-06s(X)
 <04m-12s> P10-f2179.nhr.fau.de: [MEMORY]  Free WF%c( 4.120449 [Gb]) TOTAL:  410.2760 [Mb] (traced)  97.23200 [Mb] (memstat)
 <04m-22s> P10-f2179.nhr.fau.de: [DIP] Checking dipoles header
 <04m-22s> P10-f2179.nhr.fau.de: [WARNING] [r,Vnl^pseudo] not included in position and velocity dipoles
 <04m-22s> P10-f2179.nhr.fau.de: [WARNING] In case H contains other non local terms, also these are neglected
 <04m-22s> P10-f2179.nhr.fau.de: [MEMORY]  Free DIP_iR( 55.59900 [Mb]) TOTAL:  354.6550 [Mb] (traced)  97.23200 [Mb] (memstat)
 <04m-22s> P10-f2179.nhr.fau.de: [MEMORY]  Free DIP_P( 55.59900 [Mb]) TOTAL:  299.0560 [Mb] (traced)  97.23200 [Mb] (memstat)
 <04m-22s> P10-f2179.nhr.fau.de: [MEMORY]  Free DIP_v( 55.59900 [Mb]) TOTAL:  243.4570 [Mb] (traced)  97.23200 [Mb] (memstat)
 <04m-22s> P10-f2179.nhr.fau.de: [MEMORY]  88.30300 [Mb] Host :g_vec [RL]
 <04m-22s> P10-f2179.nhr.fau.de: [MEMORY]  58.86900 [Mb] Host :g_rot [RL]
 <04m-22s> P10-f2179.nhr.fau.de: [05] Optics
 <04m-22s> P10-f2179.nhr.fau.de: [PARALLEL Response_G_space for K(bz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
 <04m-22s> P10-f2179.nhr.fau.de: [PARALLEL Response_G_space for Q(ibz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
 <04m-22s> P10-f2179.nhr.fau.de: [PARALLEL Response_G_space for K-q(ibz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
 <04m-22s> P10-f2179.nhr.fau.de: [LA@Response_G_space] PARALLEL linear algebra uses a 5x5 SLK grid (25 cpu)
 <04m-22s> P10-f2179.nhr.fau.de: [PARALLEL Response_G_space for K(ibz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
 <04m-22s> P10-f2179.nhr.fau.de: [PARALLEL Response_G_space for CON bands on 4 CPU] Loaded/Total (Percentual):216/862(25%)
 <04m-22s> P10-f2179.nhr.fau.de: [PARALLEL Response_G_space for VAL bands on 8 CPU] Loaded/Total (Percentual):344/2752(13%)
 <05m-20s> P10-f2179.nhr.fau.de: [MEMORY] Alloc g_vec_rot( 88.30300 [Mb]) TOTAL:  338.6690 [Mb] (traced)  97.23200 [Mb] (memstat)
 <05m-20s> P10-f2179.nhr.fau.de: [MEMORY]  Free g_vec_rot( 88.30300 [Mb]) TOTAL:  275.5040 [Mb] (traced)  97.23200 [Mb] (memstat)
 <05m-20s> P10-f2179.nhr.fau.de: [MEMORY] Alloc WF%c( 4.921914 [Gb]) TOTAL:  5.197418 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-20s> P10-f2179.nhr.fau.de: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):344/2752(13%)
 <05m-21s> P10-f2179.nhr.fau.de: [FFT-Rho] Mesh size:  135  114  119
 <05m-21s> P10-f2179.nhr.fau.de: [MEMORY] Alloc wf_disk( 2.531133 [Gb]) TOTAL:  7.757183 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-33s> P10-f2179.nhr.fau.de: [MEMORY]  Free wf_disk( 2.531133 [Gb]) TOTAL:  5.226050 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-33s> P10-f2179.nhr.fau.de: [xc] Functional : Perdew, Burke & Ernzerhof(X)+Perdew, Burke & Ernzerhof(C)
 <05m-33s> P10-f2179.nhr.fau.de: [xc] LIBXC used to calculate xc functional
 <05m-35s> P10-f2179.nhr.fau.de: [MEMORY] Alloc pp_rhog_nlcc( 58.86900 [Mb]) TOTAL:  5.292061 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-35s> P10-f2179.nhr.fau.de: [MEMORY]  Free pp_rhog_nlcc( 58.86900 [Mb]) TOTAL:  5.233193 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-35s> P10-f2179.nhr.fau.de: [WARNING] Fxc not coded for GGA. Using LDA part only of the functional
 <05m-36s> P10-f2179.nhr.fau.de: [WARNING] Fxc not coded for GGA. Using LDA part only of the functional
 <05m-36s> P10-f2179.nhr.fau.de: [MEMORY]  Free WF%c( 4.921914 [Gb]) TOTAL:  317.8820 [Mb] (traced)  97.23200 [Mb] (memstat)
 <05m-36s> P10-f2179.nhr.fau.de: [MEMORY] Alloc X_par%blc( 1.375725 [Gb]) TOTAL:  1.668453 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-36s> P10-f2179.nhr.fau.de: [PARALLEL distribution for RL vectors(X) on 1 CPU] Loaded/Total (Percentual):1760929/1760929(100%)
 <05m-36s> P10-f2179.nhr.fau.de: [WARNING] The system is a metal but Drude term not included.
 <05m-36s> P10-f2179.nhr.fau.de: [DIP] Checking dipoles header
 <05m-36s> P10-f2179.nhr.fau.de: [MEMORY] Alloc DIP_iR( 55.59900 [Mb]) TOTAL:  1.724058 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-36s> P10-f2179.nhr.fau.de: [MEMORY] Alloc DIP_P( 55.59900 [Mb]) TOTAL:  1.779657 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-36s> P10-f2179.nhr.fau.de: [MEMORY] Alloc DIP_v( 55.59900 [Mb]) TOTAL:  1.835256 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-36s> P10-f2179.nhr.fau.de: [MEMORY] Alloc g_vec_rot( 88.30300 [Mb]) TOTAL:  1.923559 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-37s> P10-f2179.nhr.fau.de: [MEMORY]  Free g_vec_rot( 88.30300 [Mb]) TOTAL:  1.860394 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-37s> P10-f2179.nhr.fau.de: [MEMORY] Alloc WF%c( 51.50841 [Gb]) TOTAL:  53.36880 [Gb] (traced)  97.23200 [Mb] (memstat)
 <05m-37s> P10-f2179.nhr.fau.de: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):3600/3600(100%)
 <05m-43s> P10-f2179.nhr.fau.de: [FFT-X] Mesh size:  135  114  119
 <05m-43s> P10-f2179.nhr.fau.de: [MEMORY] Alloc wf_disk( 13.24430 [Gb]) TOTAL:  66.64174 [Gb] (traced)  97.23200 [Mb] (memstat)
 <06m-18s> P10-f2179.nhr.fau.de: [MEMORY]  Free wf_disk( 13.24430 [Gb]) TOTAL:  53.39743 [Gb] (traced)  97.23200 [Mb] (memstat)
 <06m-18s> P10-f2179.nhr.fau.de: [MEMORY] Alloc wf_disk( 13.24430 [Gb]) TOTAL:  66.64174 [Gb] (traced)  97.23200 [Mb] (memstat)
 <06m-56s> P10-f2179.nhr.fau.de: [MEMORY]  Free wf_disk( 13.24430 [Gb]) TOTAL:  53.39743 [Gb] (traced)  97.23200 [Mb] (memstat)
 <06m-56s> P10-f2179.nhr.fau.de: [X-CG] R(p) Tot o/o(of R):    8893   74304     100
 <06m-56s> P10-f2179.nhr.fau.de: Xo@q[1] |                                        | [000%] --(E) --(X)
 <07m-22s> P10-f2179.nhr.fau.de: Xo@q[1] |#                                       | [002%] 25s(E) 17m-06s(X)
 <07m-51s> P10-f2179.nhr.fau.de: Xo@q[1] |##                                      | [005%] 54s(E) 18m-07s(X)
 <08m-28s> P10-f2179.nhr.fau.de: Xo@q[1] |###                                     | [007%] 01m-31s(E) 20m-07s(X)
 <09m-12s> P10-f2179.nhr.fau.de: Xo@q[1] |####                                    | [010%] 02m-16s(E) 22m-29s(X)
 <09m-52s> P10-f2179.nhr.fau.de: Xo@q[1] |#####                                   | [012%] 02m-55s(E) 23m-24s(X)
 <10m-35s> P10-f2179.nhr.fau.de: Xo@q[1] |######                                  | [015%] 03m-38s(E) 24m-13s(X)
 <11m-15s> P10-f2179.nhr.fau.de: Xo@q[1] |#######                                 | [017%] 04m-19s(E) 24m-37s(X)
 <11m-58s> P10-f2179.nhr.fau.de: Xo@q[1] |########                                | [020%] 05m-01s(E) 25m-04s(X)
 <12m-37s> P10-f2179.nhr.fau.de: Xo@q[1] |#########                               | [022%] 05m-41s(E) 25m-16s(X)
 <13m-18s> P10-f2179.nhr.fau.de: Xo@q[1] |##########                              | [025%] 06m-21s(E) 25m-23s(X)
 <13m-59s> P10-f2179.nhr.fau.de: Xo@q[1] |###########                             | [027%] 07m-02s(E) 25m-35s(X)
 <14m-40s> P10-f2179.nhr.fau.de: Xo@q[1] |############                            | [030%] 07m-43s(E) 25m-42s(X)
 <15m-21s> P10-f2179.nhr.fau.de: Xo@q[1] |#############                           | [032%] 08m-24s(E) 25m-48s(X)
 <16m-02s> P10-f2179.nhr.fau.de: Xo@q[1] |##############                          | [035%] 09m-05s(E) 25m-58s(X)
 <16m-45s> P10-f2179.nhr.fau.de: Xo@q[1] |###############                         | [037%] 09m-48s(E) 26m-09s(X)
 <17m-29s> P10-f2179.nhr.fau.de: Xo@q[1] |################                        | [040%] 10m-32s(E) 26m-19s(X)
 <18m-12s> P10-f2179.nhr.fau.de: Xo@q[1] |#################                       | [042%] 11m-15s(E) 26m-26s(X)
 <18m-52s> P10-f2179.nhr.fau.de: Xo@q[1] |##################                      | [045%] 11m-56s(E) 26m-31s(X)
 <19m-39s> P10-f2179.nhr.fau.de: Xo@q[1] |###################                     | [047%] 12m-42s(E) 26m-44s(X)
 <20m-25s> P10-f2179.nhr.fau.de: Xo@q[1] |####################                    | [050%] 13m-29s(E) 26m-56s(X)
 <21m-11s> P10-f2179.nhr.fau.de: Xo@q[1] |#####################                   | [052%] 14m-14s(E) 27m-05s(X)
 <21m-53s> P10-f2179.nhr.fau.de: Xo@q[1] |######################                  | [055%] 14m-57s(E) 27m-08s(X)
 <22m-36s> P10-f2179.nhr.fau.de: Xo@q[1] |#######################                 | [057%] 15m-39s(E) 27m-13s(X)
 <23m-15s> P10-f2179.nhr.fau.de: Xo@q[1] |########################                | [060%] 16m-19s(E) 27m-11s(X)
 <23m-55s> P10-f2179.nhr.fau.de: Xo@q[1] |#########################               | [062%] 16m-58s(E) 27m-08s(X)
 <24m-34s> P10-f2179.nhr.fau.de: Xo@q[1] |##########################              | [065%] 17m-37s(E) 27m-05s(X)
 <25m-11s> P10-f2179.nhr.fau.de: Xo@q[1] |###########################             | [067%] 18m-14s(E) 27m-01s(X)
 <25m-48s> P10-f2179.nhr.fau.de: Xo@q[1] |############################            | [070%] 18m-51s(E) 26m-56s(X)
 <26m-24s> P10-f2179.nhr.fau.de: Xo@q[1] |#############################           | [072%] 19m-27s(E) 26m-49s(X)
 <27m-00s> P10-f2179.nhr.fau.de: Xo@q[1] |##############################          | [075%] 20m-04s(E) 26m-44s(X)
 <27m-37s> P10-f2179.nhr.fau.de: Xo@q[1] |###############################         | [077%] 20m-40s(E) 26m-38s(X)
 <28m-12s> P10-f2179.nhr.fau.de: Xo@q[1] |################################        | [080%] 21m-16s(E) 26m-34s(X)
 <28m-48s> P10-f2179.nhr.fau.de: Xo@q[1] |#################################       | [082%] 21m-52s(E) 26m-29s(X)
 <29m-22s> P10-f2179.nhr.fau.de: Xo@q[1] |##################################      | [085%] 22m-26s(E) 26m-22s(X)
 <29m-58s> P10-f2179.nhr.fau.de: Xo@q[1] |###################################     | [087%] 23m-01s(E) 26m-17s(X)
 <30m-32s> P10-f2179.nhr.fau.de: Xo@q[1] |####################################    | [090%] 23m-35s(E) 26m-12s(X)
 <31m-07s> P10-f2179.nhr.fau.de: Xo@q[1] |#####################################   | [092%] 24m-10s(E) 26m-07s(X)
 <31m-44s> P10-f2179.nhr.fau.de: Xo@q[1] |######################################  | [095%] 24m-48s(E) 26m-05s(X)
 <32m-22s> P10-f2179.nhr.fau.de: Xo@q[1] |####################################### | [097%] 25m-25s(E) 26m-03s(X)
 <33m-00s> P10-f2179.nhr.fau.de: Xo@q[1] |########################################| [100%] 26m-03s(E) 26m-03s(X)
 <35m-42s> P10-f2179.nhr.fau.de: [MEMORY]  Free DIP_iR( 55.59900 [Mb]) TOTAL:  53.31503 [Gb] (traced)  97.23200 [Mb] (memstat)
 <35m-42s> P10-f2179.nhr.fau.de: [MEMORY]  Free DIP_P( 55.59900 [Mb]) TOTAL:  53.25942 [Gb] (traced)  97.23200 [Mb] (memstat)
 <35m-42s> P10-f2179.nhr.fau.de: [MEMORY]  Free DIP_v( 55.59900 [Mb]) TOTAL:  53.20382 [Gb] (traced)  97.23200 [Mb] (memstat)
 <01h-15m> P10-f2179.nhr.fau.de: [PARALLEL distribution for X Frequencies on 1 CPU] Loaded/Total (Percentual):100/100(100%)
 <01h-15m> P10-f2179.nhr.fau.de: X@q[1] |                                        | [000%] --(E) --(X)

I appreciate all your help. Once I have tried all the options with the current version, I will run the calculations with the newer version, 5.4.
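To compare the memory estimate with what the run actually allocates, the running totals on the [MEMORY] lines of the log above can be scanned for their maximum. The following is a minimal sketch, not part of Yambo: the function name `peak_traced_gb` is hypothetical, and the factor of 1000 between Mb and Gb is an assumption inferred from the arithmetic of the log itself (adding 55.599 Mb raises the total by about 0.0556 Gb).

```python
import re

# Matches the running total printed on each [MEMORY] line, e.g.
# "TOTAL:  53.36880 [Gb] (traced)"
TOTAL_RE = re.compile(r"TOTAL:\s*([0-9.]+)\s*\[(Kb|Mb|Gb)\]\s*\(traced\)")

# Assumption: units differ by a factor of 1000, as the log arithmetic suggests.
TO_GB = {"Kb": 1e-6, "Mb": 1e-3, "Gb": 1.0}


def peak_traced_gb(log_lines):
    """Return the largest traced TOTAL (in Gb) found on [MEMORY] lines."""
    peak = 0.0
    for line in log_lines:
        if "[MEMORY]" not in line:
            continue
        match = TOTAL_RE.search(line)
        if match:
            value, unit = match.groups()
            peak = max(peak, float(value) * TO_GB[unit])
    return peak


# Lines copied from the log above (one MPI task).
sample = [
    " <05m-37s> P10-f2179.nhr.fau.de: [MEMORY] Alloc WF%c( 51.50841 [Gb]) TOTAL:  53.36880 [Gb] (traced)  97.23200 [Mb] (memstat)",
    " <05m-43s> P10-f2179.nhr.fau.de: [FFT-X] Mesh size:  135  114  119",
    " <05m-43s> P10-f2179.nhr.fau.de: [MEMORY] Alloc wf_disk( 13.24430 [Gb]) TOTAL:  66.64174 [Gb] (traced)  97.23200 [Mb] (memstat)",
    " <06m-18s> P10-f2179.nhr.fau.de: [MEMORY]  Free wf_disk( 13.24430 [Gb]) TOTAL:  53.39743 [Gb] (traced)  97.23200 [Mb] (memstat)",
]

print(f"peak traced memory: {peak_traced_gb(sample):.5f} Gb")
```

For the excerpt above the peak is the 66.64174 Gb reached while wf_disk is allocated on top of WF%c. Running the same scan over each task's full log file would show whether the per-task peak stays within the node memory.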


--
Best regards,
Vipul Kumar Ambasta
MSc. student
Friedrich Alexander Universitaet
Erlangen (Germany)

harrier_class
Posts: 10
Joined: Tue May 13, 2025 4:27 pm

Re: Yambo 5.0.4 parallization strategy for large system [nbnds =3600]

Post by harrier_class » Fri Jun 05, 2026 9:38 am

Dear Dr. Varsano,

I also wanted to share that during the calculation of X@q[1], the memory usage and CPU load [as shown on the monitoring portal of our HPC system] drop to very small values, and the log file shows no update even after 17 hours. Maybe this helps.

This is the input file that I used.

Code: Select all

#                                                                     
#  __  __   ________   ___ __ __    _______   ______                  
# /_/\/_/\ /_______/\ /__//_//_/\ /_______/\ /_____/\                 
# \ \ \ \ \\::: _  \ \\::\| \| \ \\::: _  \ \\:::_ \ \                
#  \:\_\ \ \\::(_)  \ \\:.      \ \\::(_)  \/_\:\ \ \ \               
#   \::::_\/ \:: __  \ \\:.\-/\  \ \\::  _  \ \\:\ \ \ \              
#     \::\ \  \:.\ \  \ \\. \  \  \ \\::(_)  \ \\:\_\ \ \             
#      \__\/   \__\/\__\/ \__\/ \__\/ \_______\/ \_____\/             
#                                                                     
#                                                                     
#       Version 5.0.4 Revision 19598 Hash 20b2ffa04                   
#                      Branch is 5.0                                  
#              MPI+OpenMP+SLK+HDF5_IO Build                           
#                http://www.yambo-code.org                            
#
optics                           # [R] Linear Response optical properties
chi                              # [R][CHI] Dyson equation for Chi.
tddft                            # [R][K] Use TDDFT kernel
X_Threads=0                      # [OPENMP/X] Number of threads for response functions
DIP_Threads=0                    # [OPENMP/X] Number of threads for dipoles
Chimod= "ALDA"                   # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
FxcGRLc= 1                 Ry    # [TDDFT] XC-kernel RL size
NGsBlkXd= 1                Ry    # [Xd] Response block size
% QpntsRXd
 1 | 1 |                             # [Xd] Transferred momenta
%
% BndsRnXd
     1 |  3600 |                     # [Xd] Polarization function bands
%
% EnRngeXd
  0.00000 | 6.00000 |         eV    # [Xd] Energy range
%
% DmRngeXd
 0.100000 | 0.100000 |         eV    # [Xd] Damping range
%
ETStpsXd= 100                    # [Xd] Total Energy steps
% LongDrXd
 0.000000 | 0.000000 | 1.000000 |        # [Xd] [cc] Electric Field
%
X_CPU= "1.8.4.1.1"               # [PARALLEL] CPUs for each role
X_ROLEs= "g.v.c.k.q"             # [PARALLEL] CPUs roles (q,g,k,c,v)
X_nCPU_LinAlg_INV= 32            # [PARALLEL] CPUs for Linear Algebra (if -1 it is automatically set)
DIP_CPU= "1 4 8" # [PARALLEL] CPUs for each role
DIP_ROLEs= "k c v" # [PARALLEL] CPUs roles (k,c,v)
PAR_def_mode= "memory"
X_all_q_ROLEs= "q k c v"
X_all_q_CPU= "1 1 4"

Thanks for all your kind help.

--
Best regards,
Vipul Kumar Ambasta
MSc. student
Friedrich Alexander Universitaet
Erlangen (Germany)

Daniele Varsano
Posts: 4346
Joined: Tue Mar 17, 2009 2:23 pm

Re: Yambo 5.0.4 parallization strategy for large system [nbnds =3600]

Post by Daniele Varsano » Fri Jun 05, 2026 10:16 am

Dear Vipul,

Your system is very large, and most probably the resources are not adequate. Systems of this size are usually run on GPUs or on many nodes.
Moreover, note that there is an inconsistency in your input: you have X_all_q_CPU= "1 1 4" while using 32 tasks. This should not be a problem, though, as in such cases Yambo assigns a default parallelization.
Also, the calculation is very cumbersome due to the high number of frequencies you set (100). You can try to reduce the frequency sampling.
With 1 Ry in X, I would not expect such a long run time even with many frequencies. Is it possible that you have some problem with the ScaLAPACK libraries?
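The inconsistency mentioned above can be caught before submitting the job by checking two things for every CPU/ROLEs pair: that there is one count per role, and that the counts multiply to the number of MPI tasks. Below is a minimal sketch of such a check, not a Yambo tool. The helper names `parse_layout` and `check_layout` are hypothetical, the strings are copied from the posted input, and splitting on both spaces and dots is only an assumption made to cover the two notations used in that input.

```python
import re
from math import prod


def parse_layout(cpu, roles):
    """Split CPU and ROLEs strings on spaces or dots (assumed separators)."""
    def split(text):
        return [tok for tok in re.split(r"[.\s]+", text.strip()) if tok]
    return [int(tok) for tok in split(cpu)], split(roles)


def check_layout(name, cpu, roles, n_tasks):
    """Return a list of human-readable problems; an empty list means consistent."""
    counts, role_names = parse_layout(cpu, roles)
    problems = []
    if len(counts) != len(role_names):
        problems.append(
            f"{name}: {len(counts)} CPU entries for {len(role_names)} roles"
        )
    if prod(counts) != n_tasks:
        problems.append(
            f"{name}: CPU entries multiply to {prod(counts)}, not {n_tasks} tasks"
        )
    return problems


# Values taken from the input file in this thread; 32 MPI tasks as stated above.
layouts = {
    "X": ("1.8.4.1.1", "g.v.c.k.q"),
    "DIP": ("1 4 8", "k c v"),
    "X_all_q": ("1 1 4", "q k c v"),
}

for name, (cpu, roles) in layouts.items():
    for problem in check_layout(name, cpu, roles, n_tasks=32):
        print(problem)
```

With these values only the X_all_q pair is reported: it has three entries for four roles, and the entries multiply to 4 rather than 32.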

Anyway, if you want to try a TDDFT calculation, you can probably switch to transition space by generating an input with:

Code: Select all

yambo -o b -k alda -y d 
setting:
BSEmod="coupling"
BSENGexx=1 Ry.

and for BSEBands start with a few bands across the gap:
% BSEBands
2700 | 2800 | # [BSK] Bands range
%

and increase the band window until the low-energy part of the spectrum you are interested in is converged.
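The convergence procedure described above can be organized as a series of inputs whose BSEBands window widens step by step. The following is a minimal sketch that only prints the blocks to paste into each input. The function name `bse_band_windows`, the step of 50 bands per side and the number of windows are hypothetical choices for illustration; the starting window 2700 | 2800 and the total of 3600 bands come from this thread.

```python
def bse_band_windows(first, last, step, n_windows, n_bands):
    """Return (first, last) band ranges widened by `step` on each side per window."""
    windows = []
    for i in range(n_windows):
        lo = max(1, first - i * step)          # never go below band 1
        hi = min(n_bands, last + i * step)     # never exceed the available bands
        windows.append((lo, hi))
    return windows


# Hypothetical schedule: 4 windows, widening by 50 bands on each side.
for lo, hi in bse_band_windows(2700, 2800, step=50, n_windows=4, n_bands=3600):
    print("% BSEBands")
    print(f" {lo} | {hi} |   # [BSK] Bands range")
    print("%")
```

Each printed block corresponds to one run; comparing the low-energy part of the spectra from consecutive runs shows when further widening no longer changes the result.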

As for the dipoles, I would not worry much at this stage.

...and yes, I recommend updating to a newer release.

Best,

Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
