GW and BSE calculations with larger k-point mesh
-
- Posts: 3
- Joined: Thu May 12, 2022 2:32 pm
Hello YAMBO developers and users,
I am trying to calculate GW quasiparticle energies and BSE spectra for a 2D material. The unit cell contains 7 atoms, and I am using a 24x24x1 k-point grid. With 8 CPUs on 1 node the calculation runs, but it takes too long. When I increase the number of CPUs by using more nodes, the calculation stops (possibly because of memory issues). Please help me choose a suitable parallelization for this specific calculation.
Thank you
Nilesh Kumar
Ph.D. Scholar
University of Ostrava
- Daniele Varsano
- Posts: 3816
- Joined: Tue Mar 17, 2009 2:23 pm
- Contact:
Re: GW and BSE calculations with larger k-point mesh
Dear Nilesh,
If it is a memory issue, you can set a suitable parallelization strategy in the input file to distribute the memory.
Can you post the input file together with the submission script and the error message of the problematic case?
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
-
- Posts: 3
- Joined: Thu May 12, 2022 2:32 pm
Re: GW and BSE calculations with larger k-point mesh
Hello Mr. Daniele,
I tried to attach the files, but the forum did not accept them as attachments, so I have pasted all the necessary details below, in this order:
1. the input file for the GW calculation,
2. the submission script,
3. the output error file,
4. the log file.
The calculation stopped at the line "[DIP] Checking dipoles header" at the end of the log file.
Code: Select all
# __ __ ______ __ __ _______ ______
# | \ / \ / \ | \ / \| \ / \
# \$$\ / $$| $$$$$$\| $$\ / $$| $$$$$$$\| $$$$$$\
# \$$\/ $$ | $$__| $$| $$$\ / $$$| $$__/ $$| $$ | $$
# \$$ $$ | $$ $$| $$$$\ $$$$| $$ $$| $$ | $$
# \$$$$ | $$$$$$$$| $$\$$ $$ $$| $$$$$$$\| $$ | $$
# | $$ | $$ | $$| $$ \$$$| $$| $$__/ $$| $$__/ $$
# | $$ | $$ | $$| $$ \$ | $$| $$ $$ \$$ $$
# \$$ \$$ \$$ \$$ \$$ \$$$$$$$ \$$$$$$
#
# Version 5.1.0 Revision 21422 Hash (prev commit) fde6e2a07
# Branch is
# MPI+SLEPC+HDF5_MPI_IO Build
# http://www.yambo-code.org
#
rim_cut # [R] Coulomb potential
HF_and_locXC # [R] Hartree-Fock
gw0 # [R] GW approximation
ppa # [R][Xp] Plasmon Pole Approximation for the Screened Interaction
dyson # [R] Dyson Equation solver
em1d # [R][X] Dynamically Screened Interaction
StdoHash= 40 # [IO] Live-timing Hashes
Nelectro= 56.00000 # Electrons number
ElecTemp= 0.000000 eV # Electronic Temperature
BoseTemp=-1.000000 eV # Bosonic Temperature
OccTresh= 0.100000E-4 # Occupation treshold (metallic bands)
NLogCPUs=0 # [PARALLEL] Live-timing CPU`s (0 for all)
DBsIOoff= "none" # [IO] Space-separated list of DB with NO I/O. DB=(DIP,X,HF,COLLs,J,GF,CARRIERs,OBS,W,SC,BS,ALL)
DBsFRAGpm= "none" # [IO] Space-separated list of +DB to FRAG and -DB to NOT FRAG. DB=(DIP,X,W,HF,COLLS,K,BS,QINDX,RT,ELP
FFTGvecs= 70 Ry # [FFT] Plane-waves
#WFbuffIO # [IO] Wave-functions buffered I/O
PAR_def_mode= "balanced" # [PARALLEL] Default distribution mode ("balanced"/"memory"/"workload")
X_and_IO_CPU= "1 1 16 4 4" # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v" # [PARALLEL] CPUs roles (q,g,k,c,v)
X_and_IO_nCPU_LinAlg_INV=-1 # [PARALLEL] CPUs for Linear Algebra (if -1 it is automatically set)
DIP_CPU= "16 4 4" # [PARALLEL] CPUs for each role
DIP_ROLEs= "k c v" # [PARALLEL] CPUs roles (k,c,v)
SE_CPU= "16 4 4" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
RandQpts=1000000 # [RIM] Number of random q-points in the BZ
RandGvec= 100 RL # [RIM] Coulomb interaction RS components
#QpgFull # [F RIM] Coulomb interaction: Full matrix
% Em1Anys
0.000000 | 0.000000 | 0.000000 | # [RIM] X Y Z Static Inverse dielectric matrix Anysotropy
%
IDEm1Ref=0 # [RIM] Dielectric matrix reference component 1(x)/2(y)/3(z)
CUTGeo= "box z" # [CUT] Coulomb Cutoff geometry: box/cylinder/sphere/ws/slab X/Y/Z/XY..
% CUTBox
0.000000 | 0.000000 | 41.53774 | # [CUT] [au] Box sides
%
CUTRadius= 0.000000 # [CUT] [au] Sphere/Cylinder radius
CUTCylLen= 0.000000 # [CUT] [au] Cylinder length
CUTwsGvec= 0.700000 # [CUT] WS cutoff: number of G to be modified
#CUTCol_test # [CUT] Perform a cutoff test in R-space
EXXRLvcs= 70 Ry # [XX] Exchange RL components
VXCRLvcs= 70 Ry # [XC] XCpotential RL components
#UseNLCC # [XC] If present, add NLCC contributions to the charge density
Chimod= "HARTREE" # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
ChiLinAlgMod= "LIN_SYS" # [X] inversion/lin_sys,cpu/gpu
XfnQPdb= "none" # [EXTQP Xd] Database action
XfnQP_INTERP_NN= 1 # [EXTQP Xd] Interpolation neighbours (NN mode)
XfnQP_INTERP_shells= 20.00000 # [EXTQP Xd] Interpolation shells (BOLTZ mode)
XfnQP_DbGd_INTERP_mode= "NN" # [EXTQP Xd] Interpolation DbGd mode
% XfnQP_E
0.000000 | 1.000000 | 1.000000 | # [EXTQP Xd] E parameters (c/v) eV|adim|adim
%
XfnQP_Z= ( 1.000000 , 0.000000 ) # [EXTQP Xd] Z factor (c/v)
XfnQP_Wv_E= 0.000000 eV # [EXTQP Xd] W Energy reference (valence)
% XfnQP_Wv
0.000000 | 0.000000 | 0.000000 | # [EXTQP Xd] W parameters (valence) eV| 1|eV^-1
%
XfnQP_Wv_dos= 0.000000 eV # [EXTQP Xd] W dos pre-factor (valence)
XfnQP_Wc_E= 0.000000 eV # [EXTQP Xd] W Energy reference (conduction)
% XfnQP_Wc
0.000000 | 0.000000 | 0.000000 | # [EXTQP Xd] W parameters (conduction) eV| 1 |eV^-1
%
XfnQP_Wc_dos= 0.000000 eV # [EXTQP Xd] W dos pre-factor (conduction)
ShiftedPaths= "" # [DIP] Shifted grids paths (separated by a space)
% QpntsRXp
1 | 61 | # [Xp] Transferred momenta
%
% BndsRnXp
1 | 300 | # [Xp] Polarization function bands
%
NGsBlkXp= 14 Ry # [Xp] Response block size
CGrdSpXp= 100.0000 # [Xp] [o/o] Coarse grid controller
% EhEngyXp
-1.000000 |-1.000000 | eV # [Xp] Electron-hole energy range
%
% LongDrXp
1.000000 | 1.000000 | 0.000000 | # [Xp] [cc] Electric Field
%
PPAPntXp= 27.21138 eV # [Xp] PPA imaginary energy
XTermKind= "BG" # [X] X terminator ("none","BG" Bruneval-Gonze)
XTermEn= 40.00000 eV # [X] X terminator energy (only for kind="BG")
#OptDipAverage # [Xd] Average Xd along the non-zero Electric Field directions
#QPsymmtrz # [GW] Force symmetrization of states with the same energy
GfnQPdb= "none" # [EXTQP G] Database action
GfnQP_INTERP_NN= 1 # [EXTQP G] Interpolation neighbours (NN mode)
GfnQP_INTERP_shells= 20.00000 # [EXTQP G] Interpolation shells (BOLTZ mode)
GfnQP_DbGd_INTERP_mode= "NN" # [EXTQP G] Interpolation DbGd mode
% GfnQP_E
0.000000 | 1.000000 | 1.000000 | # [EXTQP G] E parameters (c/v) eV|adim|adim
%
GfnQP_Z= ( 1.000000 , 0.000000 ) # [EXTQP G] Z factor (c/v)
GfnQP_Wv_E= 0.000000 eV # [EXTQP G] W Energy reference (valence)
% GfnQP_Wv
0.000000 | 0.000000 | 0.000000 | # [EXTQP G] W parameters (valence) eV| 1|eV^-1
%
GfnQP_Wv_dos= 0.000000 eV # [EXTQP G] W dos pre-factor (valence)
GfnQP_Wc_E= 0.000000 eV # [EXTQP G] W Energy reference (conduction)
% GfnQP_Wc
0.000000 | 0.000000 | 0.000000 | # [EXTQP G] W parameters (conduction) eV| 1 |eV^-1
%
GfnQP_Wc_dos= 0.000000 eV # [EXTQP G] W dos pre-factor (conduction)
BoseCut= 0.100000 # [BOSE] Finite T Bose function cutoff
% GbndRnge
1 | 300 | # [GW] G[W] bands range
%
GDamping= 0.200000 eV # [GW] G[W] damping
dScStep= 0.100000 eV # [GW] Energy step to evaluate Z factors
GTermKind= "none" # [GW] GW terminator ("none","BG" Bruneval-Gonze,"BRS" Berger-Reining-Sottile)
GTermEn= 40.81708 eV # [GW] GW terminator energy (only for kind="BG")
DysSolver= "n" # [GW] Dyson Equation solver ("n","s","g")
GWoIter=0 # [GW] GWo self-consistent (evGWo) iterations on eigenvalues
GWIter=0 # [GW] GW self-consistent (evGW) iterations on eigenvalues
SCEtresh= 0.010000 eV # [SC] Energy convergence threshold for SC-GW
#NewtDchk # [GW] Test dSc/dw convergence
ExtendOut # [GW] Print all variables in the output file
#OnMassShell # [F GW] On mass shell approximation
#QPExpand # [F GW] The QP corrections are expanded all over the BZ
%QPkrange # [GW] QP generalized Kpoint/Band indices
1|61|45|61|
%
%QPerange # [GW] QP generalized Kpoint/Energy indices
1|61| 0.000000|-1.000000|
%
Code: Select all
#!/bin/bash
#PBS -A OPEN-24-46
#PBS -N yambo
#PBS -q qlong
#PBS -l select=8:mpiprocs=36
#PBS -l walltime=144:00:00
#PBS -j oe
hostname
date
module load OpenMPI/4.0.3-GCC-9.3.0
cd $PBS_O_WORKDIR
#p2y
#yambo
#yambo -r -fatlog -x -p p -g n -V all -F gw.in
mpirun -np 256 yambo -F gw.in -J gw.out -C report
Code: Select all
cn48.barbora.it4i.cz
Sat May 6 12:10:16 CEST 2023
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 215 with PID 1446841 on node cn98 exited on signal 9 (Killed).
--------------------------------------------------------------------------
6 total processes killed (some possibly by mpirun during cleanup)
cn99.barbora.it4i.cz: # PBS Epilogue Report
cn99.barbora.it4i.cz: # Issues found in kernel log
cn99.barbora.it4i.cz: [161946.381129] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161947.798359] Out of memory: Killed process 801261 (yambo) total-vm:6301728kB, anon-rss:5435732kB, file-rss:0kB, shmem-rss:3896kB, UID:7916 pgtables:10856kB oom_score_adj:0
cn99.barbora.it4i.cz: [161948.536194] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161949.944615] Out of memory: Killed process 801254 (yambo) total-vm:7314192kB, anon-rss:5641280kB, file-rss:0kB, shmem-rss:3988kB, UID:7916 pgtables:11268kB oom_score_adj:0
cn99.barbora.it4i.cz: [161950.551119] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161951.968622] Out of memory: Killed process 801225 (yambo) total-vm:7314712kB, anon-rss:5808548kB, file-rss:0kB, shmem-rss:3472kB, UID:7916 pgtables:11588kB oom_score_adj:0
cn99.barbora.it4i.cz: [161952.621832] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161954.044314] Out of memory: Killed process 801281 (yambo) total-vm:7312608kB, anon-rss:5957592kB, file-rss:0kB, shmem-rss:3460kB, UID:7916 pgtables:11876kB oom_score_adj:0
cn99.barbora.it4i.cz: [161954.740102] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161956.127949] Out of memory: Killed process 801222 (yambo) total-vm:7314704kB, anon-rss:6140932kB, file-rss:0kB, shmem-rss:3628kB, UID:7916 pgtables:12248kB oom_score_adj:0
cn99.barbora.it4i.cz: [161959.373742] kthreadd invoked oom-killer: gfp_mask=0x6082c2(GFP_KERNEL|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161960.706503] Out of memory: Killed process 801280 (yambo) total-vm:7313648kB, anon-rss:6381496kB, file-rss:0kB, shmem-rss:3444kB, UID:7916 pgtables:12704kB oom_score_adj:0
cn99.barbora.it4i.cz: [161961.504801] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn99.barbora.it4i.cz: [161962.901959] Out of memory: Killed process 801252 (yambo) total-vm:7313396kB, anon-rss:6653896kB, file-rss:0kB, shmem-rss:3880kB, UID:7916 pgtables:13240kB oom_score_adj:0
cn90.barbora.it4i.cz: # PBS Epilogue Report
cn90.barbora.it4i.cz: # Issues found in kernel log
cn90.barbora.it4i.cz: [1881636.919998] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881638.366127] Out of memory: Killed process 273064 (yambo) total-vm:7313124kB, anon-rss:5390044kB, file-rss:0kB, shmem-rss:3360kB, UID:7916 pgtables:10768kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881639.722032] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881641.128085] Out of memory: Killed process 273054 (yambo) total-vm:7312340kB, anon-rss:5543400kB, file-rss:0kB, shmem-rss:3356kB, UID:7916 pgtables:11072kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881641.969390] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881643.364261] Out of memory: Killed process 272990 (yambo) total-vm:6301988kB, anon-rss:5877584kB, file-rss:0kB, shmem-rss:3436kB, UID:7916 pgtables:11720kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881644.732927] ibms_mad_agent invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881646.141366] Out of memory: Killed process 273066 (yambo) total-vm:7312604kB, anon-rss:6022940kB, file-rss:0kB, shmem-rss:3368kB, UID:7916 pgtables:12008kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881646.824997] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881648.228590] Out of memory: Killed process 272986 (yambo) total-vm:7316304kB, anon-rss:6060552kB, file-rss:0kB, shmem-rss:3660kB, UID:7916 pgtables:12080kB oom_score_adj:0
cn90.barbora.it4i.cz: [1881650.481638] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn90.barbora.it4i.cz: [1881651.855823] Out of memory: Killed process 272998 (yambo) total-vm:7314456kB, anon-rss:6269372kB, file-rss:0kB, shmem-rss:3528kB, UID:7916 pgtables:12492kB oom_score_adj:0
cn91.barbora.it4i.cz: # PBS Epilogue Report
cn91.barbora.it4i.cz: # Issues found in kernel log
cn91.barbora.it4i.cz: [697703.060750] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697704.468088] Out of memory: Killed process 1800738 (yambo) total-vm:6299084kB, anon-rss:5339656kB, file-rss:0kB, shmem-rss:3212kB, UID:7916 pgtables:10660kB oom_score_adj:0
cn91.barbora.it4i.cz: [697706.057215] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697707.458098] Out of memory: Killed process 1800727 (yambo) total-vm:7312364kB, anon-rss:5585304kB, file-rss:0kB, shmem-rss:3284kB, UID:7916 pgtables:11148kB oom_score_adj:0
cn91.barbora.it4i.cz: [697707.976593] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697709.391896] Out of memory: Killed process 1800729 (yambo) total-vm:7312340kB, anon-rss:5724940kB, file-rss:0kB, shmem-rss:3264kB, UID:7916 pgtables:11428kB oom_score_adj:0
cn91.barbora.it4i.cz: [697710.103304] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697711.486198] Out of memory: Killed process 1800736 (yambo) total-vm:6299756kB, anon-rss:5847800kB, file-rss:0kB, shmem-rss:3220kB, UID:7916 pgtables:11660kB oom_score_adj:0
cn91.barbora.it4i.cz: [697712.329209] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697713.695857] Out of memory: Killed process 1800716 (yambo) total-vm:7314460kB, anon-rss:6058012kB, file-rss:0kB, shmem-rss:3296kB, UID:7916 pgtables:12084kB oom_score_adj:0
cn91.barbora.it4i.cz: [697714.453906] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697715.810939] Out of memory: Killed process 1800709 (yambo) total-vm:7313124kB, anon-rss:6347164kB, file-rss:0kB, shmem-rss:3264kB, UID:7916 pgtables:12640kB oom_score_adj:0
cn91.barbora.it4i.cz: [697717.964331] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn91.barbora.it4i.cz: [697719.312565] Out of memory: Killed process 1800708 (yambo) total-vm:7314704kB, anon-rss:6691032kB, file-rss:0kB, shmem-rss:3292kB, UID:7916 pgtables:13316kB oom_score_adj:0
cn59.barbora.it4i.cz: # PBS Epilogue Report
cn59.barbora.it4i.cz: # Issues found in kernel log
cn59.barbora.it4i.cz: [1806539.905358] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806541.326459] Out of memory: Killed process 3941956 (yambo) total-vm:7314172kB, anon-rss:5355484kB, file-rss:0kB, shmem-rss:3312kB, UID:7916 pgtables:10704kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806542.575819] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806543.995877] Out of memory: Killed process 3941991 (yambo) total-vm:6365076kB, anon-rss:5801600kB, file-rss:0kB, shmem-rss:2980kB, UID:7916 pgtables:11572kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806549.355318] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806550.774951] Out of memory: Killed process 3941988 (yambo) total-vm:6300812kB, anon-rss:5690684kB, file-rss:0kB, shmem-rss:3476kB, UID:7916 pgtables:11352kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806551.421019] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806552.840724] Out of memory: Killed process 3941970 (yambo) total-vm:7315496kB, anon-rss:5900704kB, file-rss:0kB, shmem-rss:3672kB, UID:7916 pgtables:11776kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806556.461055] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806557.871159] Out of memory: Killed process 3942012 (yambo) total-vm:7310516kB, anon-rss:6088080kB, file-rss:0kB, shmem-rss:2988kB, UID:7916 pgtables:12132kB oom_score_adj:0
cn59.barbora.it4i.cz: [1806559.194579] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn59.barbora.it4i.cz: [1806560.593263] Out of memory: Killed process 3941978 (yambo) total-vm:7315244kB, anon-rss:6369896kB, file-rss:0kB, shmem-rss:3604kB, UID:7916 pgtables:12688kB oom_score_adj:0
cn59.barbora.it4i.cz: System error
cn98.barbora.it4i.cz: # PBS Epilogue Report
cn98.barbora.it4i.cz: # Issues found in kernel log
cn98.barbora.it4i.cz: [351522.539098] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351523.947279] Out of memory: Killed process 1446841 (yambo) total-vm:7311020kB, anon-rss:5431760kB, file-rss:0kB, shmem-rss:2856kB, UID:7916 pgtables:10840kB oom_score_adj:0
cn98.barbora.it4i.cz: [351524.530026] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351525.938131] Out of memory: Killed process 1446761 (yambo) total-vm:7313908kB, anon-rss:5584324kB, file-rss:0kB, shmem-rss:3500kB, UID:7916 pgtables:11144kB oom_score_adj:0
cn98.barbora.it4i.cz: [351526.898038] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351528.298460] Out of memory: Killed process 1446771 (yambo) total-vm:7314448kB, anon-rss:5750620kB, file-rss:0kB, shmem-rss:3500kB, UID:7916 pgtables:11480kB oom_score_adj:0
cn98.barbora.it4i.cz: [351529.953387] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351531.358892] Out of memory: Killed process 1446764 (yambo) total-vm:7314712kB, anon-rss:5910536kB, file-rss:0kB, shmem-rss:3516kB, UID:7916 pgtables:11788kB oom_score_adj:0
cn98.barbora.it4i.cz: [351532.819466] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351534.215452] Out of memory: Killed process 1446772 (yambo) total-vm:7316024kB, anon-rss:6125908kB, file-rss:0kB, shmem-rss:3548kB, UID:7916 pgtables:12216kB oom_score_adj:0
cn98.barbora.it4i.cz: [351535.490184] mad/handler invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351536.906057] Out of memory: Killed process 1446803 (yambo) total-vm:7314188kB, anon-rss:6393212kB, file-rss:0kB, shmem-rss:3912kB, UID:7916 pgtables:12736kB oom_score_adj:0
cn98.barbora.it4i.cz: [351541.191143] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn98.barbora.it4i.cz: [351542.553790] Out of memory: Killed process 1446825 (yambo) total-vm:7313400kB, anon-rss:6703660kB, file-rss:0kB, shmem-rss:3452kB, UID:7916 pgtables:13332kB oom_score_adj:0
cn60.barbora.it4i.cz: # PBS Epilogue Report
cn60.barbora.it4i.cz: # Issues found in kernel log
cn60.barbora.it4i.cz: [1755733.475081] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755734.899513] Out of memory: Killed process 351207 (yambo) total-vm:6302784kB, anon-rss:5329924kB, file-rss:0kB, shmem-rss:3456kB, UID:7916 pgtables:10648kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755735.412500] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755736.820157] Out of memory: Killed process 351197 (yambo) total-vm:7315760kB, anon-rss:5501428kB, file-rss:0kB, shmem-rss:3956kB, UID:7916 pgtables:10992kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755737.898513] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755739.329351] Out of memory: Killed process 351235 (yambo) total-vm:7312616kB, anon-rss:5673540kB, file-rss:0kB, shmem-rss:3336kB, UID:7916 pgtables:11316kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755740.376220] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755741.798103] Out of memory: Killed process 351200 (yambo) total-vm:7314704kB, anon-rss:5808332kB, file-rss:0kB, shmem-rss:3692kB, UID:7916 pgtables:11592kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755742.952718] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755744.358852] Out of memory: Killed process 351204 (yambo) total-vm:6301728kB, anon-rss:5991416kB, file-rss:0kB, shmem-rss:3472kB, UID:7916 pgtables:11932kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755747.062628] yambo invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755748.488439] Out of memory: Killed process 351215 (yambo) total-vm:7315492kB, anon-rss:6207780kB, file-rss:0kB, shmem-rss:3856kB, UID:7916 pgtables:12368kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755749.679994] mad/handler invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755751.102421] Out of memory: Killed process 351223 (yambo) total-vm:7312612kB, anon-rss:6494304kB, file-rss:0kB, shmem-rss:3224kB, UID:7916 pgtables:12932kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755752.399894] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn60.barbora.it4i.cz: [1755753.801679] Out of memory: Killed process 351211 (yambo) total-vm:7315504kB, anon-rss:6784816kB, file-rss:0kB, shmem-rss:3532kB, UID:7916 pgtables:13504kB oom_score_adj:0
cn48.barbora.it4i.cz: # PBS Epilogue Report
cn48.barbora.it4i.cz: # Issues found in kernel log
cn48.barbora.it4i.cz: [1900015.497187] yambo invoked oom-killer: gfp_mask=0x7080c0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=0, oom_score_adj=0
cn48.barbora.it4i.cz: [1900016.951951] Out of memory: Killed process 601689 (yambo) total-vm:7313936kB, anon-rss:5321700kB, file-rss:0kB, shmem-rss:3688kB, UID:7916 pgtables:10632kB oom_score_adj:0
cn48.barbora.it4i.cz: [1900018.663468] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn48.barbora.it4i.cz: [1900020.100254] Out of memory: Killed process 601707 (yambo) total-vm:7314700kB, anon-rss:5701004kB, file-rss:0kB, shmem-rss:3628kB, UID:7916 pgtables:11384kB oom_score_adj:0
cn48.barbora.it4i.cz: [1900020.883627] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn48.barbora.it4i.cz: [1900022.330629] Out of memory: Killed process 601699 (yambo) total-vm:7314448kB, anon-rss:5840600kB, file-rss:0kB, shmem-rss:3652kB, UID:7916 pgtables:11664kB oom_score_adj:0
cn48.barbora.it4i.cz: [1900022.928544] yambo invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), order=0, oom_score_adj=0
cn48.barbora.it4i.cz: [1900024.375939] Out of memory: Killed process 601693 (yambo) total-vm:6301876kB, anon-rss:6002104kB, file-rss:0kB, shmem-rss:3804kB, UID:7916 pgtables:11964kB oom_score_adj:0
Code: Select all
_| _| _|_| _| _| _|_|_| _|_|
_| _| _| _| _|_| _|_| _| _| _| _|
_| _|_|_|_| _| _| _| _|_|_| _| _|
_| _| _| _| _| _| _| _| _|
_| _| _| _| _| _|_|_| _|_|
<---> P1: [01] MPI/OPENMP structure, Files & I/O Directories
<---> P1-cn48.barbora.it4i.cz: MPI Cores-Threads : 256(CPU)-1(threads)-1(threads@X)-1(threads@DIP)-1(threads@SE)-1(threads@RT)-1(threads@K)-1(threads@NL)
<---> P1-cn48.barbora.it4i.cz: MPI Cores-Threads : DIP(environment)-16 4 4(CPUs)-k c v(ROLEs)
<---> P1-cn48.barbora.it4i.cz: MPI Cores-Threads : X_and_IO(environment)-1 1 16 4 4(CPUs)-q g k c v(ROLEs)
<---> P1-cn48.barbora.it4i.cz: MPI Cores-Threads : SE(environment)-16 4 4(CPUs)-q qp b(ROLEs)
<---> P1-cn48.barbora.it4i.cz: [02] CORE Variables Setup
<---> P1-cn48.barbora.it4i.cz: [02.01] Unit cells
<---> P1-cn48.barbora.it4i.cz: [02.02] Symmetries
<---> P1-cn48.barbora.it4i.cz: [02.03] Reciprocal space
<---> P1-cn48.barbora.it4i.cz: [02.04] K-grid lattice
<---> P1-cn48.barbora.it4i.cz: Grid dimensions : 24 24
<---> P1-cn48.barbora.it4i.cz: [02.05] Energies & Occupations
<---> P1-cn48.barbora.it4i.cz: [03] Transferred momenta grid and indexing
<---> P1-cn48.barbora.it4i.cz: [04] Coloumb potential Random Integration (RIM)
<---> P1-cn48.barbora.it4i.cz: [04.01] RIM initialization
<---> P1-cn48.barbora.it4i.cz: Random points | | [000%] --(E) --(X)
<---> P1-cn48.barbora.it4i.cz: Random points |########################################| [100%] --(E) --(X)
<02s> P1-cn48.barbora.it4i.cz: [04.02] RIM integrals
<02s> P1-cn48.barbora.it4i.cz: Momenta loop | | [000%] --(E) --(X)
<04s> P1-cn48.barbora.it4i.cz: Momenta loop |########################################| [100%] 02s(E) 02s(X)
<04s> P1-cn48.barbora.it4i.cz: [05] Coloumb potential CutOffbox
<04s> P1-cn48.barbora.it4i.cz: Box | | [000%] --(E) --(X)
<06s> P1-cn48.barbora.it4i.cz: Box |########################################| [100%] --(E) --(X)
<06s> P1-cn48.barbora.it4i.cz: [06] Dipoles
<06s> P1-cn48.barbora.it4i.cz: [PARALLEL DIPOLES for K(ibz) on 16 CPU] Loaded/Total (Percentual):4/61(7%)
<06s> P1-cn48.barbora.it4i.cz: [PARALLEL DIPOLES for CON bands on 4 CPU] Loaded/Total (Percentual):75/300(25%)
<06s> P1-cn48.barbora.it4i.cz: [PARALLEL DIPOLES for VAL bands on 4 CPU] Loaded/Total (Percentual):14/56(25%)
<06s> P1-cn48.barbora.it4i.cz: [DIP] Checking dipoles header
<06s> P1-cn48.barbora.it4i.cz: [DIP] Database not correct or missing. To be computed
<06s> P1-cn48.barbora.it4i.cz: [x,Vnl] computed using 262 projectors
<06s> P1-cn48.barbora.it4i.cz: [WARNING] [x,Vnl] slows the Dipoles computation. To neglect it rename the ns.kb_pp file
<07s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) | | [000%] --(E) --(X)
<07s> P1-cn48.barbora.it4i.cz: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):75/300(25%)
<18s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |# | [002%] 11s(E) 07m-47s(X)
<32s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |########## | [025%] 25s(E) 01m-39s(X)
<45s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |#################### | [050%] 38s(E) 01m-17s(X)
<58s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |############################## | [075%] 51s(E) 01m-09s(X)
<01m-00s> P1-cn48.barbora.it4i.cz: Dipoles: P, V and iR (T) |########################################| [100%] 53s(E) 53s(X)
<01m-08s> P1-cn48.barbora.it4i.cz: [DIP] Writing dipoles header
<01m-09s> P1-cn48.barbora.it4i.cz: [07] Dynamic Dielectric Matrix (PPA)
<01m-16s> P1-cn48.barbora.it4i.cz: [WARNING] Response block size reduced to 1299 RL (7124 mHa)
<01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for K(bz) on 16 CPU] Loaded/Total (Percentual):36/576(6%)
<01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for Q(ibz) on 1 CPU] Loaded/Total (Percentual):61/61(100%)
<01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for K-q(ibz) on 1 CPU] Loaded/Total (Percentual):61/61(100%)
<01m-16s> P1-cn48.barbora.it4i.cz: [LA] SERIAL linear algebra
<01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for K(ibz) on 1 CPU] Loaded/Total (Percentual):61/61(100%)
<01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for CON bands on 4 CPU] Loaded/Total (Percentual):75/300(25%)
<01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL Response_G_space_and_IO for VAL bands on 4 CPU] Loaded/Total (Percentual):14/56(25%)
<01m-16s> P1-cn48.barbora.it4i.cz: [PARALLEL distribution for RL vectors(X) on 1 CPU] Loaded/Total (Percentual):1687401/1687401(100%)
<01m-17s> P1-cn48.barbora.it4i.cz: [DIP] Checking dipoles header
Please let me know what I can do to solve this issue.
Nilesh Kumar
Ph.D. Scholar
University of Ostrava
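The kernel messages in the error file can be condensed into a per-node summary, which makes the memory footprint easier to see. The following is a minimal Python sketch, not a yambo tool: the function name `summarize_oom` and the regular expression are illustrative assumptions, and the three sample lines are copied from the error file above (a real run would read the whole file). Across the posted file, each killed yambo process held roughly 5 to 6.5 GiB of resident memory (`anon-rss`) when it was killed.

```python
import re
from collections import defaultdict

# Three lines copied from the posted error file; a real run would read the file.
SAMPLE = """\
cn99.barbora.it4i.cz: [161947.798359] Out of memory: Killed process 801261 (yambo) total-vm:6301728kB, anon-rss:5435732kB, file-rss:0kB, shmem-rss:3896kB, UID:7916 pgtables:10856kB oom_score_adj:0
cn99.barbora.it4i.cz: [161962.901959] Out of memory: Killed process 801252 (yambo) total-vm:7313396kB, anon-rss:6653896kB, file-rss:0kB, shmem-rss:3880kB, UID:7916 pgtables:13240kB oom_score_adj:0
cn60.barbora.it4i.cz: [1755753.801679] Out of memory: Killed process 351211 (yambo) total-vm:7315504kB, anon-rss:6784816kB, file-rss:0kB, shmem-rss:3532kB, UID:7916 pgtables:13504kB oom_score_adj:0
"""

# Matches the "Out of memory: Killed process" lines of the PBS epilogue report.
OOM_LINE = re.compile(
    r"^(?P<node>\S+): \[[\d.]+\] Out of memory: Killed process (?P<pid>\d+) "
    r"\((?P<name>[^)]+)\).*?anon-rss:(?P<rss_kb>\d+)kB"
)


def summarize_oom(text):
    """Return {node: (number_of_kills, largest_anon_rss_in_GiB)}."""
    per_node = defaultdict(list)
    for line in text.splitlines():
        match = OOM_LINE.match(line)
        if match:
            # anon-rss is reported in kB; convert to GiB.
            per_node[match["node"]].append(int(match["rss_kb"]) / 1024**2)
    return {node: (len(sizes), max(sizes)) for node, sizes in per_node.items()}


if __name__ == "__main__":
    for node, (kills, peak) in summarize_oom(SAMPLE).items():
        print(f"{node}: {kills} yambo processes killed, peak anon-rss {peak:.2f} GiB")
```

Multiplying the per-process figure by the number of processes placed on a node gives a rough lower bound on the memory that node would have needed for this layout.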
- Daniele Varsano
- Posts: 3816
- Joined: Tue Mar 17, 2009 2:23 pm
- Contact:
Re: GW and BSE calculations with larger k-point mesh
Dear Nilesh,
This is not really comfortable to read; you can share files by renaming them with a supported suffix, e.g. file.txt.
Anyway, considering you are using 288 CPUs, a good parallelization strategy to distribute the memory among the CPUs is:
Code: Select all
X_and_IO_CPU= "1 1 1 36 8" # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v"
SE_CPU= "1 8 36" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
Furthermore, I strongly advise you to set:
Code: Select all
XTermKind= "none" # [X] X terminator ("none","BG" Bruneval-Gonze)
as the X terminator is very memory intensive and not particularly efficient.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
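The suggested layout can be sanity-checked with simple arithmetic before submitting. The sketch below is a minimal Python illustration, not part of yambo: the helper names `parse_roles` and `check_layout` are hypothetical. It assumes that the product of the entries in a `*_CPU` string should equal the number of MPI tasks, and that items are split evenly among the tasks assigned to a role (the real distribution may differ). The band totals (300 conduction, 56 valence) are the ones printed in the log above.

```python
from math import ceil, prod


def parse_roles(cpu_string, role_string):
    """Pair each role label with its CPU count, e.g. {"q": 1, "c": 36, ...}."""
    cpus = [int(token) for token in cpu_string.split()]
    roles = role_string.split()
    if len(cpus) != len(roles):
        raise ValueError("CPU and ROLEs strings must have the same number of entries")
    return dict(zip(roles, cpus))


def check_layout(cpu_string, role_string, mpi_tasks, totals):
    """Return (matches_mpi_tasks, tasks_used, max_items_per_task).

    `totals` maps a role label to the number of items shared out over that
    role (assumed even split, rounded up).
    """
    layout = parse_roles(cpu_string, role_string)
    tasks_used = prod(layout.values())
    per_task = {
        role: ceil(totals[role] / count)
        for role, count in layout.items()
        if role in totals
    }
    return tasks_used == mpi_tasks, tasks_used, per_task


if __name__ == "__main__":
    totals = {"c": 300, "v": 56}  # band totals taken from the posted log
    cases = [("posted", "1 1 16 4 4", 256), ("suggested", "1 1 1 36 8", 288)]
    for label, cpus, tasks in cases:
        ok, used, per_task = check_layout(cpus, "q g k c v", tasks, totals)
        print(f"{label}: uses {used} tasks, matches launch count: {ok}, per task: {per_task}")
```

For the posted layout this reproduces the 75 conduction and 14 valence bands per task shown in the log. Note that the posted script launches `mpirun -np 256` while requesting 8 nodes with 36 processes each; the suggested strings multiply to 288, so the launch count would presumably need to be 288 as well, and `DIP_CPU` (16 x 4 x 4 = 256 in the posted input) would likely need a matching adjustment.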
-
- Posts: 4
- Joined: Thu May 25, 2023 12:02 pm
Re: GW and BSE calculations with larger k-point mesh
Dear Team,
I am also facing a similar problem while interpolating band structures, which supposedly arises from a virtual memory issue. I tried using more nodes and fewer cores per node, so that more memory is available to each process, but all in vain.
The calculation stopped every time at:
<14s> P1: [05.01] G0W0 on the real axis
<14s> P1: Self_Energy parallel ENVIRONMENT is incomplete. Switching to defaults
<14s> P1: [PARALLEL Self_Energy for QPs on 3 CPU] Loaded/Total (Percentual):10260/30780(33%)
<14s> P1: [PARALLEL Self_Energy for Q(ibz) on 1 CPU] Loaded/Total (Percentual):81/81(100%)
<14s> P1: [PARALLEL Self_Energy for G bands on 2 CPU] Loaded/Total (Percentual):190/380(50%)
<15s> P1: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):20520/30780(67%)
It created the error file "yambo.80s-56187,node9.btr", which shows:
yambo:55185 terminated with signal 11 at PC=4af805 SP=7ffd064bf6c0. Backtrace:
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x4af805]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x497eeb]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x493469]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x40bc25]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x73f525]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x406875]
/usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b59338823d5]
/apps/scratch/compile/yambo-5.1/yambo-5.1.1/bin/yambo[0x4066a9]
After seeing this forum, I tried using:
X_and_IO_CPU= "1 1 1 48 4"
X_and_IO_ROLEs= "q g k c v"
SE_CPU= "1 4 48" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
Now the "yambo.80s-56187,node9.btr" files are no longer created, but the calculation still stops at the same place.
Any help would be highly appreciated.
Thanks and regards,
Harshita
Harshita, Research Scholar, INST
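One way to read a log like the one above is to tabulate how much of each quantity every MPI task holds. The following minimal Python sketch does that for the "[PARALLEL ...] Loaded/Total" lines; it is not a yambo utility, the function names `parse_distribution` and `fell_back_to_defaults` are hypothetical, and the sample text is the excerpt quoted in the post with the timestamps omitted.

```python
import re

# Log excerpt from the post above (timestamps omitted).
LOG = """\
P1: [05.01] G0W0 on the real axis
P1: Self_Energy parallel ENVIRONMENT is incomplete. Switching to defaults
P1: [PARALLEL Self_Energy for QPs on 3 CPU] Loaded/Total (Percentual):10260/30780(33%)
P1: [PARALLEL Self_Energy for Q(ibz) on 1 CPU] Loaded/Total (Percentual):81/81(100%)
P1: [PARALLEL Self_Energy for G bands on 2 CPU] Loaded/Total (Percentual):190/380(50%)
P1: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):20520/30780(67%)
"""

# Captures the description inside the brackets and the loaded/total counts.
PARALLEL_LINE = re.compile(
    r"\[PARALLEL (?P<what>[^\]]+)\]\s*Loaded/Total\s*\(Percentual\):"
    r"(?P<loaded>\d+)/(?P<total>\d+)"
)


def parse_distribution(log_text):
    """Return a list of (description, loaded, total) tuples, in log order."""
    rows = []
    for line in log_text.splitlines():
        match = PARALLEL_LINE.search(line)
        if match:
            rows.append((match["what"], int(match["loaded"]), int(match["total"])))
    return rows


def fell_back_to_defaults(log_text):
    """True if the log reports that a requested parallel layout was not used."""
    return "parallel ENVIRONMENT is incomplete" in log_text


if __name__ == "__main__":
    if fell_back_to_defaults(LOG):
        print("A parallel environment was incomplete; defaults were used.")
    for what, loaded, total in parse_distribution(LOG):
        print(f"{what}: {loaded}/{total} per task ({loaded / total:.0%})")
```

In this excerpt each task loads 20520 of the 30780 wave-function states, so the memory per task is reduced only modestly by the default distribution; comparing these fractions before and after changing `SE_CPU` may show whether the new layout was actually applied.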