GW and BSE calculations with larger k-point mesh

Posted: Thu May 04, 2023 3:29 pm
by nileshkumar
Hello YAMBO developers and users,
I am trying to calculate GW quasiparticle energies and BSE spectra for a 2D material whose unit cell contains 7 atoms, using a 24x24x1 k-point grid. With 8 CPUs on 1 node the calculation runs, but it takes too long. When I increase the number of CPUs across more nodes, the calculation stops (possibly a memory issue). Please advise how I can set up a suitable parallelization for this specific calculation.
Thank you

Re: GW and BSE calculations with larger k-point mesh

Posted: Sat May 06, 2023 9:29 am
by Daniele Varsano
Dear Nilesh,
If it is a memory issue, you can set a suitable parallelization strategy in the input file to distribute the memory.
Can you post the input file together with the submission script and the error message of the problematic case?

Best,
Daniele

Re: GW and BSE calculations with larger k-point mesh

Posted: Sat May 06, 2023 11:30 am
by nileshkumar
Hello Daniele,
I tried to attach the files, but the forum did not accept them, so I have pasted all the necessary details here.
1. Input file for the GW calculation:

#  __      __   ______   __       __  _______    ______               
# |  \    /  \ /      \ |  \     /  \|       \  /      \              
#  \$$\  /  $$|  $$$$$$\| $$\   /  $$| $$$$$$$\|  $$$$$$\             
#   \$$\/  $$ | $$__| $$| $$$\ /  $$$| $$__/ $$| $$  | $$             
#    \$$  $$  | $$    $$| $$$$\  $$$$| $$    $$| $$  | $$             
#     \$$$$   | $$$$$$$$| $$\$$ $$ $$| $$$$$$$\| $$  | $$             
#     | $$    | $$  | $$| $$ \$$$| $$| $$__/ $$| $$__/ $$             
#     | $$    | $$  | $$| $$  \$ | $$| $$    $$ \$$    $$             
#      \$$     \$$   \$$ \$$      \$$ \$$$$$$$   \$$$$$$              
#                                                                     
# Version 5.1.0 Revision 21422 Hash (prev commit) fde6e2a07           
#                        Branch is                                    
#               MPI+SLEPC+HDF5_MPI_IO Build                           
#                http://www.yambo-code.org                            
#
rim_cut                          # [R] Coulomb potential
HF_and_locXC                     # [R] Hartree-Fock
gw0                              # [R] GW approximation
ppa                              # [R][Xp] Plasmon Pole Approximation for the Screened Interaction
dyson                            # [R] Dyson Equation solver
em1d                             # [R][X] Dynamically Screened Interaction
StdoHash=  40                    # [IO] Live-timing Hashes
Nelectro= 56.00000               # Electrons number
ElecTemp= 0.000000         eV    # Electronic Temperature
BoseTemp=-1.000000         eV    # Bosonic Temperature
OccTresh= 0.100000E-4            # Occupation treshold (metallic bands)
NLogCPUs=0                       # [PARALLEL] Live-timing CPU`s (0 for all)
DBsIOoff= "none"                 # [IO] Space-separated list of DB with NO I/O. DB=(DIP,X,HF,COLLs,J,GF,CARRIERs,OBS,W,SC,BS,ALL)
DBsFRAGpm= "none"                # [IO] Space-separated list of +DB to FRAG and -DB to NOT FRAG. DB=(DIP,X,W,HF,COLLS,K,BS,QINDX,RT,ELP
FFTGvecs= 70            Ry    # [FFT] Plane-waves
#WFbuffIO                      # [IO] Wave-functions buffered I/O
PAR_def_mode= "balanced"         # [PARALLEL] Default distribution mode ("balanced"/"memory"/"workload")
X_and_IO_CPU= "1 1 16 4 4"                 # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v"               # [PARALLEL] CPUs roles (q,g,k,c,v)
X_and_IO_nCPU_LinAlg_INV=-1      # [PARALLEL] CPUs for Linear Algebra (if -1 it is automatically set)
DIP_CPU= "16 4 4"                      # [PARALLEL] CPUs for each role
DIP_ROLEs= "k c v"                    # [PARALLEL] CPUs roles (k,c,v)
SE_CPU= "16 4 4"                       # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b"                     # [PARALLEL] CPUs roles (q,qp,b)
RandQpts=1000000                       # [RIM] Number of random q-points in the BZ
RandGvec= 100                RL    # [RIM] Coulomb interaction RS components
#QpgFull                       # [F RIM] Coulomb interaction: Full matrix
% Em1Anys
 0.000000 | 0.000000 | 0.000000 |        # [RIM] X Y Z Static Inverse dielectric matrix Anysotropy
%
IDEm1Ref=0                       # [RIM] Dielectric matrix reference component 1(x)/2(y)/3(z)
CUTGeo= "box z"                   # [CUT] Coulomb Cutoff geometry: box/cylinder/sphere/ws/slab X/Y/Z/XY..
% CUTBox
 0.000000 | 0.000000 | 41.53774 |        # [CUT] [au] Box sides
%
CUTRadius= 0.000000              # [CUT] [au] Sphere/Cylinder radius
CUTCylLen= 0.000000              # [CUT] [au] Cylinder length
CUTwsGvec= 0.700000              # [CUT] WS cutoff: number of G to be modified
#CUTCol_test                   # [CUT] Perform a cutoff test in R-space
EXXRLvcs= 70           Ry    # [XX] Exchange    RL components
VXCRLvcs= 70           Ry    # [XC] XCpotential RL components
#UseNLCC                       # [XC] If present, add NLCC contributions to the charge density
Chimod= "HARTREE"                # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
ChiLinAlgMod= "LIN_SYS"          # [X] inversion/lin_sys,cpu/gpu
XfnQPdb= "none"                  # [EXTQP Xd] Database action
XfnQP_INTERP_NN= 1               # [EXTQP Xd] Interpolation neighbours (NN mode)
XfnQP_INTERP_shells= 20.00000    # [EXTQP Xd] Interpolation shells (BOLTZ mode)
XfnQP_DbGd_INTERP_mode= "NN"     # [EXTQP Xd] Interpolation DbGd mode
% XfnQP_E
 0.000000 | 1.000000 | 1.000000 |        # [EXTQP Xd] E parameters  (c/v) eV|adim|adim
%
XfnQP_Z= ( 1.000000 , 0.000000 )         # [EXTQP Xd] Z factor  (c/v)
XfnQP_Wv_E= 0.000000       eV    # [EXTQP Xd] W Energy reference  (valence)
% XfnQP_Wv
 0.000000 | 0.000000 | 0.000000 |        # [EXTQP Xd] W parameters  (valence) eV| 1|eV^-1
%
XfnQP_Wv_dos= 0.000000     eV    # [EXTQP Xd] W dos pre-factor  (valence)
XfnQP_Wc_E= 0.000000       eV    # [EXTQP Xd] W Energy reference  (conduction)
% XfnQP_Wc
 0.000000 | 0.000000 | 0.000000 |        # [EXTQP Xd] W parameters  (conduction) eV| 1 |eV^-1
%
XfnQP_Wc_dos= 0.000000     eV    # [EXTQP Xd] W dos pre-factor  (conduction)
ShiftedPaths= ""                 # [DIP] Shifted grids paths (separated by a space)
% QpntsRXp
  1 | 61 |                           # [Xp] Transferred momenta
%
% BndsRnXp
    1 |  300 |                       # [Xp] Polarization function bands
%
NGsBlkXp= 14                Ry    # [Xp] Response block size
CGrdSpXp= 100.0000               # [Xp] [o/o] Coarse grid controller
% EhEngyXp
-1.000000 |-1.000000 |         eV    # [Xp] Electron-hole energy range
%
% LongDrXp
 1.000000 | 1.000000 | 0.000000 |        # [Xp] [cc] Electric Field
%
PPAPntXp= 27.21138         eV    # [Xp] PPA imaginary energy
XTermKind= "BG"                # [X] X terminator ("none","BG" Bruneval-Gonze)
XTermEn= 40.00000          eV    # [X] X terminator energy (only for kind="BG")
#OptDipAverage                 # [Xd] Average Xd along the non-zero Electric Field directions
#QPsymmtrz                     # [GW] Force symmetrization of states with the same energy
GfnQPdb= "none"                  # [EXTQP G] Database action
GfnQP_INTERP_NN= 1               # [EXTQP G] Interpolation neighbours (NN mode)
GfnQP_INTERP_shells= 20.00000    # [EXTQP G] Interpolation shells (BOLTZ mode)
GfnQP_DbGd_INTERP_mode= "NN"     # [EXTQP G] Interpolation DbGd mode
% GfnQP_E
 0.000000 | 1.000000 | 1.000000 |        # [EXTQP G] E parameters  (c/v) eV|adim|adim
%
GfnQP_Z= ( 1.000000 , 0.000000 )         # [EXTQP G] Z factor  (c/v)
GfnQP_Wv_E= 0.000000       eV    # [EXTQP G] W Energy reference  (valence)
% GfnQP_Wv
 0.000000 | 0.000000 | 0.000000 |        # [EXTQP G] W parameters  (valence) eV| 1|eV^-1
%
GfnQP_Wv_dos= 0.000000     eV    # [EXTQP G] W dos pre-factor  (valence)
GfnQP_Wc_E= 0.000000       eV    # [EXTQP G] W Energy reference  (conduction)
% GfnQP_Wc
 0.000000 | 0.000000 | 0.000000 |        # [EXTQP G] W parameters  (conduction) eV| 1 |eV^-1
%
GfnQP_Wc_dos= 0.000000     eV    # [EXTQP G] W dos pre-factor  (conduction)
BoseCut= 0.100000                # [BOSE] Finite T Bose function cutoff
% GbndRnge
    1 | 300 |                       # [GW] G[W] bands range
%
GDamping= 0.200000         eV    # [GW] G[W] damping
dScStep= 0.100000          eV    # [GW] Energy step to evaluate Z factors
GTermKind= "none"                # [GW] GW terminator ("none","BG" Bruneval-Gonze,"BRS" Berger-Reining-Sottile)
GTermEn= 40.81708          eV    # [GW] GW terminator energy (only for kind="BG")
DysSolver= "n"                   # [GW] Dyson Equation solver ("n","s","g")
GWoIter=0                        # [GW] GWo self-consistent (evGWo) iterations on eigenvalues
GWIter=0                         # [GW] GW  self-consistent (evGW)  iterations on eigenvalues
SCEtresh= 0.010000         eV    # [SC] Energy convergence threshold for SC-GW
#NewtDchk                      # [GW] Test dSc/dw convergence
ExtendOut                     # [GW] Print all variables in the output file
#OnMassShell                   # [F GW] On mass shell approximation
#QPExpand                      # [F GW] The QP corrections are expanded all over the BZ
%QPkrange                        # [GW] QP generalized Kpoint/Band indices
1|61|45|61|
%
%QPerange                        # [GW] QP generalized Kpoint/Energy indices
1|61| 0.000000|-1.000000|
%
2. The submission script:

#!/bin/bash
#PBS -A OPEN-24-46
#PBS -N yambo
#PBS -q qlong
#PBS -l select=8:mpiprocs=36
#PBS -l walltime=144:00:00
#PBS -j oe

hostname
date

module load OpenMPI/4.0.3-GCC-9.3.0
cd $PBS_O_WORKDIR

#p2y
#yambo
#yambo -r -fatlog -x -p p -g n -V all -F gw.in
mpirun -np 256 yambo -F gw.in -J gw.out -C report
3. The error output:

cn48.barbora.it4i.cz
Sat May  6 12:10:16 CEST 2023
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 215 with PID 1446841 on node cn98 exited on signal 9 (Killed).
--------------------------------------------------------------------------
6 total processes killed (some possibly by mpirun during cleanup)
cn99.barbora.it4i.cz: # PBS Epilogue Report
cn99.barbora.it4i.cz: # Issues found in kernel log
cn99.barbora.it4i.cz: [161946.381129] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161947.798359] Out of memory: Killed process 801261 (yambo) total-vm:6301728kB, anon-rss:5435732kB, file-rss:0kB, shmem-rss:3896kB, UID:7916 pgtables:10856kB oom_score_adj:0
cn99.barbora.it4i.cz: [161948.536194] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161949.944615] Out of memory: Killed process 801254 (yambo) total-vm:7314192kB, anon-rss:5641280kB, file-rss:0kB, shmem-rss:3988kB, UID:7916 pgtables:11268kB oom_score_adj:0
cn99.barbora.it4i.cz: [161950.551119] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161951.968622] Out of memory: Killed process 801225 (yambo) total-vm:7314712kB, anon-rss:5808548kB, file-rss:0kB, shmem-rss:3472kB, UID:7916 pgtables:11588kB oom_score_adj:0
cn99.barbora.it4i.cz: [161952.621832] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161954.044314] Out of memory: Killed process 801281 (yambo) total-vm:7312608kB, anon-rss:5957592kB, file-rss:0kB, shmem-rss:3460kB, UID:7916 pgtables:11876kB oom_score_adj:0
cn99.barbora.it4i.cz: [161954.740102] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161956.127949] Out of memory: Killed process 801222 (yambo) total-vm:7314704kB, anon-rss:6140932kB, file-rss:0kB, shmem-rss:3628kB, UID:7916 pgtables:12248kB oom_score_adj:0
cn99.barbora.it4i.cz: [161959.373742] kthreadd invoked oom-killer: gfp_mask=0x6082c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161960.706503] Out of memory: Killed process 801280 (yambo) total-vm:7313648kB, anon-rss:6381496kB, file-rss:0kB, shmem-rss:3444kB, UID:7916 pgtables:12704kB oom_score_adj:0
cn99.barbora.it4i.cz: [161961.504801] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161962.901959] Out of memory: Killed process 801252 (yambo) total-vm:7313396kB, anon-rss:6653896kB, file-rss:0kB, shmem-rss:3880kB, UID:7916 pgtables:13240kB oom_score_adj:0
cn90.barbora.it4i.cz: # PBS Epilogue Report
cn90.barbora.it4i.cz: # Issues found in kernel log
cn90.barbora.it4i.cz: [1881636.919998] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881638.366127] Out of memory: Killed process 273064 (yambo) total-vm:7313124kB, anon-rss:5390044kB, file-rss:0kB, shmem-rss:3360kB, UID:7916 pgtables:10768kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881639.722032] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881641.128085] Out of memory: Killed process 273054 (yambo) total-vm:7312340kB, anon-rss:5543400kB, file-rss:0kB, shmem-rss:3356kB, UID:7916 pgtables:11072kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881641.969390] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881643.364261] Out of memory: Killed process 272990 (yambo) total-vm:6301988kB, anon-rss:5877584kB, file-rss:0kB, shmem-rss:3436kB, UID:7916 pgtables:11720kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881644.732927] ibms_mad_agent invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881646.141366] Out of memory: Killed process 273066 (yambo) total-vm:7312604kB, anon-rss:6022940kB, file-rss:0kB, shmem-rss:3368kB, UID:7916 pgtables:12008kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881646.824997] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881648.228590] Out of memory: Killed process 272986 (yambo) total-vm:7316304kB, anon-rss:6060552kB, file-rss:0kB, shmem-rss:3660kB, UID:7916 pgtables:12080kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881650.481638] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881651.855823] Out of memory: Killed process 272998 (yambo) total-vm:7314456kB, anon-rss:6269372kB, file-rss:0kB, shmem-rss:3528kB, UID:7916 pgtables:12492kB oom_score_adj:0
cn91.barbora.it4i.cz: # PBS Epilogue Report
cn91.barbora.it4i.cz: # Issues found in kernel log
cn91.barbora.it4i.cz: [697703.060750] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697704.468088] Out of memory: Killed process 1800738 (yambo) total-vm:6299084kB, anon-rss:5339656kB, file-rss:0kB, shmem-rss:3212kB, UID:7916 pgtables:10660kB oom_score_adj:0
cn91.barbora.it4i.cz: [697706.057215] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697707.458098] Out of memory: Killed process 1800727 (yambo) total-vm:7312364kB, anon-rss:5585304kB, file-rss:0kB, shmem-rss:3284kB, UID:7916 pgtables:11148kB oom_score_adj:0
cn91.barbora.it4i.cz: [697707.976593] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697709.391896] Out of memory: Killed process 1800729 (yambo) total-vm:7312340kB, anon-rss:5724940kB, file-rss:0kB, shmem-rss:3264kB, UID:7916 pgtables:11428kB oom_score_adj:0
cn91.barbora.it4i.cz: [697710.103304] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697711.486198] Out of memory: Killed process 1800736 (yambo) total-vm:6299756kB, anon-rss:5847800kB, file-rss:0kB, shmem-rss:3220kB, UID:7916 pgtables:11660kB oom_score_adj:0
cn91.barbora.it4i.cz: [697712.329209] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697713.695857] Out of memory: Killed process 1800716 (yambo) total-vm:7314460kB, anon-rss:6058012kB, file-rss:0kB, shmem-rss:3296kB, UID:7916 pgtables:12084kB oom_score_adj:0
cn91.barbora.it4i.cz: [697714.453906] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697715.810939] Out of memory: Killed process 1800709 (yambo) total-vm:7313124kB, anon-rss:6347164kB, file-rss:0kB, shmem-rss:3264kB, UID:7916 pgtables:12640kB oom_score_adj:0
cn91.barbora.it4i.cz: [697717.964331] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697719.312565] Out of memory: Killed process 1800708 (yambo) total-vm:7314704kB, anon-rss:6691032kB, file-rss:0kB, shmem-rss:3292kB, UID:7916 pgtables:13316kB oom_score_adj:0
cn59.barbora.it4i.cz: # PBS Epilogue Report
cn59.barbora.it4i.cz: # Issues found in kernel log
cn59.barbora.it4i.cz: [1806539.905358] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806541.326459] Out of memory: Killed process 3941956 (yambo) total-vm:7314172kB, anon-rss:5355484kB, file-rss:0kB, shmem-rss:3312kB, UID:7916 pgtables:10704kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806542.575819] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806543.995877] Out of memory: Killed process 3941991 (yambo) total-vm:6365076kB, anon-rss:5801600kB, file-rss:0kB, shmem-rss:2980kB, UID:7916 pgtables:11572kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806549.355318] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806550.774951] Out of memory: Killed process 3941988 (yambo) total-vm:6300812kB, anon-rss:5690684kB, file-rss:0kB, shmem-rss:3476kB, UID:7916 pgtables:11352kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806551.421019] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806552.840724] Out of memory: Killed process 3941970 (yambo) total-vm:7315496kB, anon-rss:5900704kB, file-rss:0kB, shmem-rss:3672kB, UID:7916 pgtables:11776kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806556.461055] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806557.871159] Out of memory: Killed process 3942012 (yambo) total-vm:7310516kB, anon-rss:6088080kB, file-rss:0kB, shmem-rss:2988kB, UID:7916 pgtables:12132kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806559.194579] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806560.593263] Out of memory: Killed process 3941978 (yambo) total-vm:7315244kB, anon-rss:6369896kB, file-rss:0kB, shmem-rss:3604kB, UID:7916 pgtables:12688kB oom_score_adj:0
cn59.barbora.it4i.cz: System error
cn98.barbora.it4i.cz: # PBS Epilogue Report
cn98.barbora.it4i.cz: # Issues found in kernel log
cn98.barbora.it4i.cz: [351522.539098] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351523.947279] Out of memory: Killed process 1446841 (yambo) total-vm:7311020kB, anon-rss:5431760kB, file-rss:0kB, shmem-rss:2856kB, UID:7916 pgtables:10840kB oom_score_adj:0
cn98.barbora.it4i.cz: [351524.530026] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351525.938131] Out of memory: Killed process 1446761 (yambo) total-vm:7313908kB, anon-rss:5584324kB, file-rss:0kB, shmem-rss:3500kB, UID:7916 pgtables:11144kB oom_score_adj:0
cn98.barbora.it4i.cz: [351526.898038] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351528.298460] Out of memory: Killed process 1446771 (yambo) total-vm:7314448kB, anon-rss:5750620kB, file-rss:0kB, shmem-rss:3500kB, UID:7916 pgtables:11480kB oom_score_adj:0
cn98.barbora.it4i.cz: [351529.953387] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351531.358892] Out of memory: Killed process 1446764 (yambo) total-vm:7314712kB, anon-rss:5910536kB, file-rss:0kB, shmem-rss:3516kB, UID:7916 pgtables:11788kB oom_score_adj:0
cn98.barbora.it4i.cz: [351532.819466] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351534.215452] Out of memory: Killed process 1446772 (yambo) total-vm:7316024kB, anon-rss:6125908kB, file-rss:0kB, shmem-rss:3548kB, UID:7916 pgtables:12216kB oom_score_adj:0
cn98.barbora.it4i.cz: [351535.490184] mad/handler invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351536.906057] Out of memory: Killed process 1446803 (yambo) total-vm:7314188kB, anon-rss:6393212kB, file-rss:0kB, shmem-rss:3912kB, UID:7916 pgtables:12736kB oom_score_adj:0
cn98.barbora.it4i.cz: [351541.191143] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351542.553790] Out of memory: Killed process 1446825 (yambo) total-vm:7313400kB, anon-rss:6703660kB, file-rss:0kB, shmem-rss:3452kB, UID:7916 pgtables:13332kB oom_score_adj:0
cn60.barbora.it4i.cz: # PBS Epilogue Report
cn60.barbora.it4i.cz: # Issues found in kernel log
cn60.barbora.it4i.cz: [1755733.475081] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755734.899513] Out of memory: Killed process 351207 (yambo) total-vm:6302784kB, anon-rss:5329924kB, file-rss:0kB, shmem-rss:3456kB, UID:7916 pgtables:10648kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755735.412500] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755736.820157] Out of memory: Killed process 351197 (yambo) total-vm:7315760kB, anon-rss:5501428kB, file-rss:0kB, shmem-rss:3956kB, UID:7916 pgtables:10992kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755737.898513] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755739.329351] Out of memory: Killed process 351235 (yambo) total-vm:7312616kB, anon-rss:5673540kB, file-rss:0kB, shmem-rss:3336kB, UID:7916 pgtables:11316kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755740.376220] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755741.798103] Out of memory: Killed process 351200 (yambo) total-vm:7314704kB, anon-rss:5808332kB, file-rss:0kB, shmem-rss:3692kB, UID:7916 pgtables:11592kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755742.952718] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755744.358852] Out of memory: Killed process 351204 (yambo) total-vm:6301728kB, anon-rss:5991416kB, file-rss:0kB, shmem-rss:3472kB, UID:7916 pgtables:11932kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755747.062628] yambo invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755748.488439] Out of memory: Killed process 351215 (yambo) total-vm:7315492kB, anon-rss:6207780kB, file-rss:0kB, shmem-rss:3856kB, UID:7916 pgtables:12368kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755749.679994] mad/handler invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755751.102421] Out of memory: Killed process 351223 (yambo) total-vm:7312612kB, anon-rss:6494304kB, file-rss:0kB, shmem-rss:3224kB, UID:7916 pgtables:12932kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755752.399894] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755753.801679] Out of memory: Killed process 351211 (yambo) total-vm:7315504kB, anon-rss:6784816kB, file-rss:0kB, shmem-rss:3532kB, UID:7916 pgtables:13504kB oom_score_adj:0
cn48.barbora.it4i.cz: # PBS Epilogue Report
cn48.barbora.it4i.cz: # Issues found in kernel log
cn48.barbora.it4i.cz: [1900015.497187] yambo invoked oom-killer: gfp_mask=0x7080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0
cn48.barbora.it4i.cz: [1900016.951951] Out of memory: Killed process 601689 (yambo) total-vm:7313936kB, anon-rss:5321700kB, file-rss:0kB, shmem-rss:3688kB, UID:7916 pgtables:10632kB oom_score_adj:0
cn48.barbora.it4i.cz: [1900018.663468] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn48.barbora.it4i.cz: [1900020.100254] Out of memory: Killed process 601707 (yambo) total-vm:7314700kB, anon-rss:5701004kB, file-rss:0kB, shmem-rss:3628kB, UID:7916 pgtables:11384kB oom_score_adj:0
cn48.barbora.it4i.cz: [1900020.883627] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn48.barbora.it4i.cz: [1900022.330629] Out of memory: Killed process 601699 (yambo) total-vm:7314448kB, anon-rss:5840600kB, file-rss:0kB, shmem-rss:3652kB, UID:7916 pgtables:11664kB oom_score_adj:0
cn48.barbora.it4i.cz: [1900022.928544] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn48.barbora.it4i.cz: [1900024.375939] Out of memory: Killed process 601693 (yambo) total-vm:6301876kB, anon-rss:6002104kB, file-rss:0kB, shmem-rss:3804kB, UID:7916 pgtables:11964kB oom_score_adj:0
4. The log file:

_|      _|   _|_|   _|      _|  _|_|_|     _|_|
  _|  _|   _|    _| _|_|  _|_| _|    _| _|    _|
    _|     _|_|_|_| _|  _|  _| _|_|_|   _|    _|
    _|     _|    _| _|      _| _|    _| _|    _|
    _|     _|    _| _|      _| _|_|_|     _|_|

 <---> P1: [01] MPI/OPENMP structure, Files & I/O Directories
 <---> P1-cn48.barbora.it4i.cz: MPI Cores-Threads   : 256(CPU)-1(threads)-1(threads@X)-1(threads@DIP)-1(threads@SE)-1(threads@RT)-1(threads@K)-1(threads@NL)
 <---> P1-cn48.barbora.it4i.cz: MPI Cores-Threads   : DIP(environment)-16 4 4(CPUs)-k c v(ROLEs)
 <---> P1-cn48.barbora.it4i.cz: MPI Cores-Threads   : X_and_IO(environment)-1 1 16 4 4(CPUs)-q g k c v(ROLEs)
 <---> P1-cn48.barbora.it4i.cz: MPI Cores-Threads   : SE(environment)-16 4 4(CPUs)-q qp b(ROLEs)
 <---> P1-cn48.barbora.it4i.cz: [02] CORE Variables Setup
 <---> P1-cn48.barbora.it4i.cz: [02.01] Unit cells
 <---> P1-cn48.barbora.it4i.cz: [02.02] Symmetries
 <---> P1-cn48.barbora.it4i.cz: [02.03] Reciprocal space
 <---> P1-cn48.barbora.it4i.cz: [02.04] K-grid lattice
 <---> P1-cn48.barbora.it4i.cz: Grid dimensions      :  24  24
 <---> P1-cn48.barbora.it4i.cz: [02.05] Energies & Occupations
 <---> P1-cn48.barbora.it4i.cz: [03] Transferred momenta grid and indexing
 <---> P1-cn48.barbora.it4i.cz: [04] Coloumb potential Random Integration (RIM)
 <---> P1-cn48.barbora.it4i.cz: [04.01] RIM initialization
 <---> P1-cn48.barbora.it4i.cz: Random points |                                        | [000%] --(E) --(X)
 <---> P1-cn48.barbora.it4i.cz: Random points |########################################| [100%] --(E) --(X)
 <02s> P1-cn48.barbora.it4i.cz: [04.02] RIM integrals
 <02s> P1-cn48.barbora.it4i.cz: Momenta loop |                                        | [000%] --(E) --(X)
 <04s> P1-cn48.barbora.it4i.cz: Momenta loop |########################################| [100%] 02s(E) 02s(X)
 <04s> P1-cn48.barbora.it4i.cz: [05] Coloumb potential CutOffbox
 <04s> P1-cn48.barbora.it4i.cz: Box |                                        | [000%] --(E) --(X)
 <06s> P1-cn48.barbora.it4i.cz: Box |########################################| [100%] --(E) --(X)
 <06s> P1-cn48.barbora.it4i.cz: [06] Dipoles
 <06s> P1-cn48.barbora.it4i.cz: [PARALLEL DIPOLES for K(ibz) on 16 CPU] Loaded/Total (Percentual):4/61(7%)
 <06s> P1-cn48.barbora.it4i.cz: [PARALLEL DIPOLES for CON bands on 4 CPU] Loaded/Total (Percentual):75/300(25%)
 <06s> P1-cn48.barbora.it4i.cz: [PARALLEL DIPOLES for VAL bands on 4 CPU] Loaded/Total (Percentual):14/56(25%)
 <06s> P1-cn48.barbora.it4i.cz: [DIP] Checking dipoles header
 <06s> P1-cn48.barbora.it4i.cz: [DIP] Database not correct or missing. To be computed
 <06s> P1-cn48.barbora.it4i.cz: [x,Vnl] computed using 262 projectors
 <06s> P1-cn48.barbora.it4i.cz: [WARNING] [x,Vnl] slows the Dipoles computation. To neglect it rename the ns.kb_pp file
 <07s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |                                        | [000%] --(E) --(X)
 <07s> P1-cn48.barbora.it4i.cz: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):75/300(25%)
 <18s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |#                                       | [002%] 11s(E) 07m-47s(X)
 <32s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |##########                              | [025%] 25s(E) 01m-39s(X)
 <45s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |####################                    | [050%] 38s(E) 01m-17s(X)
 <58s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |##############################          | [075%] 51s(E) 01m-09s(X)
 <01m-00s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |########################################| [100%] 53s(E) 53s(X)
 <01m-08s> P1-cn48.barbora.it4i.cz: [DIP] Writing dipoles header
 <01m-09s> P1-cn48.barbora.it4i.cz: [07] Dynamic Dielectric Matrix (PPA)
 <01m-16s> P1-cn48.barbora.it4i.cz: [WARNING] Response block size reduced to 1299    RL (7124    mHa)
 <01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for K(bz) on 16 CPU] Loaded/Total (Percentual):36/576(6%)
 <01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for Q(ibz) on 1 CPU] Loaded/Total (Percentual):61/61(100%)
 <01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for K-q(ibz) on 1 CPU] Loaded/Total (Percentual):61/61(100%)
 <01m-16s> P1-cn48.barbora.it4i.cz: [LA] SERIAL linear algebra
 <01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for K(ibz) on 1 CPU] Loaded/Total (Percentual):61/61(100%)
 <01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for CON bands on 4 CPU] Loaded/Total (Percentual):75/300(25%)
 <01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for VAL bands on 4 CPU] Loaded/Total (Percentual):14/56(25%)
 <01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL distribution for RL vectors(X) on 1 CPU] Loaded/Total (Percentual):1687401/1687401(100%)
 <01m-17s> P1-cn48.barbora.it4i.cz: [DIP] Checking dipoles header
The calculation stopped at the line "[DIP] Checking dipoles header".
Please let me know how I can solve this issue.
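
The kernel messages in the epilogue report above can be condensed into a per-node summary, which makes it easier to see how much memory each killed yambo process held. The following is only a minimal Python sketch: the function name `summarize_oom` and the regular expression are illustrative assumptions written against the line format pasted above, not part of YAMBO or PBS.

```python
import re
from collections import defaultdict

# Matches kernel OOM lines of the form shown in the epilogue report above
# (the pattern is an assumption based on those pasted lines only).
OOM_LINE = re.compile(
    r"^(?P<node>\S+): \[\s*[\d.]+\] Out of memory: Killed process (?P<pid>\d+) "
    r"\((?P<name>[^)]+)\).*?anon-rss:(?P<rss_kb>\d+)kB"
)

def summarize_oom(lines):
    """Return {node: (killed_count, peak_anon_rss_GiB)} for OOM-killed processes."""
    per_node = defaultdict(list)
    for line in lines:
        m = OOM_LINE.match(line.strip())
        if m:
            per_node[m.group("node")].append(int(m.group("rss_kb")))
    # anon-rss is reported in kB; 1024**2 kB make one GiB
    return {node: (len(rss), max(rss) / 1024**2) for node, rss in per_node.items()}

# Three lines copied (the last two shortened) from the report above
sample = [
    "cn99.barbora.it4i.cz: [161947.798359] Out of memory: Killed process 801261 (yambo) "
    "total-vm:6301728kB, anon-rss:5435732kB, file-rss:0kB, shmem-rss:3896kB",
    "cn99.barbora.it4i.cz: [161949.944615] Out of memory: Killed process 801254 (yambo) "
    "total-vm:7314192kB, anon-rss:5641280kB, file-rss:0kB, shmem-rss:3988kB",
    "cn99.barbora.it4i.cz: [161948.536194] yambo invoked oom-killer: order=0",
]

summary = summarize_oom(sample)
for node, (count, peak) in summary.items():
    print(f"{node}: {count} processes killed, peak anon-rss {peak:.2f} GiB")
```

Multiplying the per-process figure by the number of MPI tasks placed on each node gives a rough estimate of the memory the job tried to use per node, which can then be compared with what the node actually provides.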

Re: GW and BSE calculations with larger k-point mesh

Posted: Mon May 08, 2023 8:37 am
by Daniele Varsano
Dear Nilesh,

This is not very comfortable to read; you can share files by renaming them with a supported suffix, e.g. file.txt.
Anyway, here is a good parallelization strategy to distribute memory among the CPUs, considering you are using 288 CPUs:

X_and_IO_CPU= "1 1 1 36 8" # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v" 

SE_CPU= "1 8 36" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
Furthermore, I strongly advise you to set:

XTermKind= "none" # [X] X terminator ("none","BG" Bruneval-Gonze)

as the "BG" terminator is very memory intensive and not particularly efficient.

Best,
Daniele
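
The layouts discussed in this thread can be sanity-checked before submitting. The sketch below is a hedged illustration in Python, not a YAMBO tool: it assumes that the product of the per-role counts in each `*_CPU` string should equal the number of MPI tasks launched, and the helper name `layout_total` is hypothetical. The task counts are taken from the submission script posted earlier (`select=8:mpiprocs=36` and `mpirun -np 256`).

```python
from math import prod

def layout_total(cpus: str, roles: str) -> int:
    """Return the number of MPI tasks implied by a *_CPU / *_ROLEs pair."""
    counts = [int(tok) for tok in cpus.split()]
    names = roles.split()
    if len(counts) != len(names):
        raise ValueError(f"{len(counts)} CPU counts given for {len(names)} roles")
    return prod(counts)

# Layouts copied from the input file and from the reply in this thread
layouts = {
    "X_and_IO (original)": ("1 1 16 4 4", "q g k c v"),
    "SE (original)": ("16 4 4", "q qp b"),
    "X_and_IO (suggested)": ("1 1 1 36 8", "q g k c v"),
    "SE (suggested)": ("1 8 36", "q qp b"),
}

requested = 8 * 36  # from "#PBS -l select=8:mpiprocs=36"
launched = 256      # from "mpirun -np 256"

for name, (cpus, roles) in layouts.items():
    total = layout_total(cpus, roles)
    print(f"{name}: {total} tasks "
          f"(matches launched: {total == launched}, matches requested: {total == requested})")
```

Under that assumption, the suggested layouts multiply to 288, so they fit the 288 requested processes but not the 256 started by `mpirun -np 256`; the `-np` value would presumably need to be changed to match.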

Re: GW and BSE calculations with larger k-point mesh

Posted: Sun Apr 28, 2024 2:53 pm
by Harshita
Dear Team,

I am also facing a similar problem while interpolating band structures, which apparently arises from a virtual memory issue. I tried using more nodes with fewer cores per node, so as to have more memory available per process, but in vain.

The calculation stopped every time at:
<14s> P1: [05.01] G0W0 on the real axis
<14s> P1: Self_Energy parallel ENVIRONMENT is incomplete. Switching to defaults
<14s> P1: [PARALLEL Self_Energy for QPs on 3 CPU] Loaded/Total (Percentual):10260/30780(33%)
<14s> P1: [PARALLEL Self_Energy for Q(ibz) on 1 CPU] Loaded/Total (Percentual):81/81(100%)
<14s> P1: [PARALLEL Self_Energy for G bands on 2 CPU] Loaded/Total (Percentual):190/380(50%)
<15s> P1: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):20520/30780(67%)

It created the error file "yambo.80s-56187,node9.btr", which shows:

yambo:55185 terminated with signal 11 at PC=4af805 SP=7ffd064bf6c0. Backtrace:
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x4af805]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x497eeb]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x493469]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x40bc25]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x73f525]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x406875]
/usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b59338823d5]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x4066a9]

After reading this thread, I tried using:

X_and_IO_CPU= "1 1 1 48 4"
X_and_IO_ROLEs= "q g k c v"

SE_CPU= "1 4 48" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)

Now the "yambo.80s-56187,node9.btr" files are no longer created, but the calculation still stops at the same place.

Any help would be highly appreciated.

Thanks and regards,
Harshita
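
The "[PARALLEL ...]" lines in the log excerpt above show which distribution the run actually used after the "Switching to defaults" message. The sketch below, in Python, extracts the per-role CPU counts for one environment. The regular expression and the name `layout_from_log` are illustrative assumptions based only on the log lines quoted in this thread.

```python
import re
from math import prod

# Pattern written against log lines such as
# "[PARALLEL Self_Energy for QPs on 3 CPU] Loaded/Total (Percentual):10260/30780(33%)"
PAR_LINE = re.compile(
    r"\[PARALLEL (?P<env>\S+) for (?P<role>.+?) on (?P<n>\d+) CPU\]\s*"
    r"Loaded/Total\s*\(Percentual\):\s*(?P<loaded>\d+)/(?P<total>\d+)"
)

def layout_from_log(lines, env):
    """Return {role: cpu_count} for the given parallel environment name."""
    roles = {}
    for line in lines:
        m = PAR_LINE.search(line)
        if m and m.group("env") == env:
            roles[m.group("role")] = int(m.group("n"))
    return roles

log = [
    "<14s> P1: Self_Energy parallel ENVIRONMENT is incomplete. Switching to defaults",
    "<14s> P1: [PARALLEL Self_Energy for QPs on 3 CPU] Loaded/Total (Percentual):10260/30780(33%)",
    "<14s> P1: [PARALLEL Self_Energy for Q(ibz) on 1 CPU] Loaded/Total (Percentual):81/81(100%)",
    "<14s> P1: [PARALLEL Self_Energy for G bands on 2 CPU] Loaded/Total (Percentual):190/380(50%)",
    "<15s> P1: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):20520/30780(67%)",
]

roles = layout_from_log(log, "Self_Energy")
print(roles, "-> product of role counts:", prod(roles.values()))
```

If the product of the role counts equals the number of MPI tasks in use, this excerpt would correspond to a run with 6 tasks, whereas the layouts "1 1 1 48 4" and "1 4 48" each multiply to 192. Comparing the two numbers is a quick way to check whether a custom layout matches the job that was actually launched.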