Error on Allocation of X_mat for parallel computation
Posted: Wed Sep 30, 2020 4:27 pm
Dear all,
I am trying to run a G0W0 calculation on a linear chain of 200 atoms along the z axis, performing the full-frequency computation. I am running on the GALILEO machine with mixed MPI+OpenMP parallelization in order to avoid out-of-memory errors.
Here is the input file, together with the bash script used to submit the parallel job:
#!/bin/bash
#SBATCH -N 5 # number of nodes
#SBATCH --mem=118000 # memory 86000MB for cache/flat nodes
#SBATCH --time=24:00:00 # time limits: 24 hour
#SBATCH --tasks-per-node=6
#SBATCH --cpus-per-task=6
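The job is then launched with something like the following (the exact launcher and options are site-specific, so take this as a sketch of my setup rather than a verified recipe; note that the report below shows only 1 thread per task, so the thread export may not be taking effect):

```shell
# Export the OpenMP thread count so each of the 6 MPI tasks per node
# actually spawns 6 threads (SLURM does not set this automatically).
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# 5 nodes x 6 tasks/node = 30 MPI ranks, 6 OpenMP threads each.
srun --cpu-bind=cores yambo -F gw_all_BZ_ff.in -J all_BZ_ff
```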
gw0 # [R GW] GoWo Quasiparticle energy levels
rim_cut # [R RIM CUT] Coulomb potential
HF_and_locXC # [R XX] Hartree-Fock Self-energy and Vxc
em1d # [R Xd] Dynamical Inverse Dielectric Matrix
X_nCPU_LinAlg_INV= $ncpu
X_Threads=0 # [OPENMP/X] Number of threads for response functions
SE_Threads=0 # [OPENMP/GW] Number of threads for self-energy
DIP_Threads=0
RandQpts=0 # [RIM] Number of random q-points in the BZ
RandGvec= 1 RL # [RIM] Coulomb interaction RS components
CUTGeo= "ws Z" # [CUT] Coulomb Cutoff geometry: box/cylinder/sphere/ws X/Y/Z/XY..
CUTwsGvec= 1.1000 # [CUT] WS cutoff: number of G to be modified
EXXRLvcs= 50 Ry # [XX] Exchange RL components
VXCRLvcs= 424401 RL # [XC] XCpotential RL components
Chimod= "HARTREE" # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
% GbndRnge
1 | 500 | # [GW] G[W] bands range
%
XTermKind = "BG"
GDamping= 0.10000 eV # [GW] G[W] damping
dScStep= 0.10000 eV # [GW] Energy step to evaluate Z factors
% BndsRnXd
1 | 500 | # [Xd] Polarization function bands
%
GTermKind = "BG"
NGsBlkXd= 3 Ry # [Xd] Response block size
% DmRngeXd
0.20000 | 0.20000 | eV # [Xd] Damping range
%
ETStpsXd= 100 # [Xd] Total Energy steps
% LongDrXd
1.000000 | 1.000000 | 1.000000 | # [Xd] [cc] Electric Field
%
DysSolver= "n" # [GW] Dyson Equation solver ("n","s","g")
%QPkrange # # [GW] QP generalized Kpoint/Band indices
1|1|399|402|
%
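Since the log below complains that the Response_G_space parallel environment is incomplete, I suppose an explicit parallel structure could be added along these lines (the variable names follow the yambo 4.5 input template as I understand it; the 1x2x1x5x3 = 30 role split is only an example, and distributing over the "g" role should spread the response matrix across MPI tasks):

```
X_CPU= "1 2 1 5 3"     # [PARALLEL] CPUs for each role
X_ROLEs= "q g k c v"   # [PARALLEL] CPUs roles (q,g,k,c,v)
```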
Before starting the computation, I reduced the number of RL vectors from 42441312 to 400000 RL in order to reduce the computational time.
The output is the following:
[01] CPU structure, Files & I/O Directories
===========================================
* CPU-Threads :30(CPU)-1(threads)-1(threads@X)-1(threads@DIP)-1(threads@SE)-1(threads@RT)-1(threads@K)-1(threads@NL)
* MPI CPU : 30
* THREADS (max): 1
* THREADS TOT(max): 30
* I/O NODES : 5
* NODES(computing): 5
* (I/O): 1
* Fragmented WFs : yes
CORE databases in .
Additional I/O in .
Communications in .
Input file is gw_all_BZ_ff.in
Report file is ./r-all_BZ_ff_em1d_HF_and_locXC_gw0_rim_cut
Precision is SINGLE
Log files in ./LOG
Job string(s)-dir(s) (main): all_BZ_ff
[RD./SAVE//ns.db1]------------------------------------------
Bands : 500
K-points : 1
G-vectors [RL space]: 42441312
Components [wavefunctions]: 1513945
Symmetries [spatial+T-rev]: 16
Spinor components : 1
Spin polarizations : 1
Temperature [ev]: 0.000000
Electrons : 800.0000
WF G-vectors : 1513945
Max atoms/species : 200
No. of atom species : 1
Exact exchange fraction in XC : 0.000000
Exact exchange screening in XC : 0.000000
Magnetic symmetries : no
- S/N 000347 -------------------------- v.04.05.01 r.00165 -
[04] Coloumb potential CutOff :ws
=================================
Cut directions :Z
WS Cutoff [units to be defined]: 1.100000
Symmetry test passed :yes
Cutoff: 1.100000
n grid: 4 4 84
WS Direct Lattice(DL) unit cell [iru / cc(a.u.)]
A1 = 1.000000 0.000000 0.000000 18.89727 0.000000 0.000000
A2 = 0.000000 1.000000 0.000000 0.000000 18.89727 0.000000
A3 = 0.000000 0.000000 1.000000 0.000000 0.000000 478.8568
[WR./all_BZ_ff//ndb.cutoff]---------------------------------
Brillouin Zone Q/K grids (IBZ/BZ): 1 1 1 1
CutOff Geometry :ws z
Coulomb cutoff potential :ws z 1.100
Box sides length [au]: 0.00 0.00 0.00
Sphere/Cylinder radius [au]: 0.000000
Cylinder length [au]: 0.000000
RL components : 399997
RL components used in the sum : 399997
RIM corrections included :no
RIM RL components :0
RIM random points :0
- S/N 000347 -------------------------- v.04.05.01 r.00165 -
[05] Dipoles
============
[WARNING] DIPOLES database not correct or not present
[RD./SAVE//ns.kb_pp_pwscf]----------------------------------
Fragmentation :yes
- S/N 000347 -------------------------- v.04.05.01 r.00165 -
[WARNING] [x,Vnl] slows the Dipoles computation. To neglect it rename the ns.kb_pp file
[WF-Oscillators/G space] Performing Wave-Functions I/O from ./SAVE
[WF-Oscillators/G space loader] Normalization (few states) min/max :0.865E-11 1.00
[WR./all_BZ_ff//ndb.dipoles]--------------------------------
Brillouin Zone Q/K grids (IBZ/BZ): 1 1 1 1
RL vectors (WF): 399997
Fragmentation :yes
Electronic Temperature [K]: 0.000000
Bosonic Temperature [K]: 0.000000
X band range : 1 500
X band range limits : 400 1
X e/h energy range [ev]:-1.000000 -1.000000
RL vectors in the sum : 399997
[r,Vnl] included :yes
Bands ordered :yes
Direct v evaluation :no
Field momentum norm :0.1000E-4
Approach used :G-space v
Dipoles computed :R V P
Wavefunctions :Perdew, Burke & Ernzerhof(X)+Perdew, Burke & Ernzerhof(C)
- S/N 000347 -------------------------- v.04.05.01 r.00165 -
Timing [Min/Max/Average]: 02h-15m-40s/02h-15m-43s/02h-15m-42s
[06] Dynamical Dielectric Matrix
================================
However, the computation stops with the following error:
[ERROR] STOP signal received while in :[06] Dynamical Dielectric Matrix
[ERROR]Allocation of X_mat failed
In the LOG directory I found:
<02h-16m-35s> P1-r039c02s08: [06] Dynamical Dielectric Matrix
<03h-39m-08s> P1-r039c02s08: Response_G_space parallel ENVIRONMENT is incomplete. Switching to defaults
<03h-39m-11s> P1-r039c02s08: [PARALLEL Response_G_space for K(bz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
<03h-39m-11s> P1-r039c02s08: [PARALLEL Response_G_space for Q(ibz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
<03h-39m-11s> P1-r039c02s08: [PARALLEL Response_G_space for K-q(ibz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
<03h-39m-11s> P1-r039c02s08: [LA] SERIAL linear algebra
<03h-39m-11s> P1-r039c02s08: [PARALLEL Response_G_space for K(ibz) on 1 CPU] Loaded/Total (Percentual):1/1(100%)
<03h-39m-11s> P1-r039c02s08: [PARALLEL Response_G_space for CON bands on 5 CPU] Loaded/Total (Percentual):100/500(20%)
<03h-39m-11s> P1-r039c02s08: [PARALLEL Response_G_space for VAL bands on 3 CPU] Loaded/Total (Percentual):134/400(34%)
P1-r039c02s08: [ERROR] STOP signal received while in :[06] Dynamical Dielectric Matrix
P1-r039c02s08: [ERROR]Allocation of X_mat failed
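To get a feeling for the failing allocation, I made a rough estimate of the X_mat size, assuming it is stored as a dense NG x NG complex single-precision matrix per frequency point (this storage layout is my assumption, based on "Precision is SINGLE" in the report, and NG = 3000 for the 3 Ry response block is a hypothetical value):

```python
def x_mat_gib(n_g, n_freq, bytes_per_cplx=8):
    """Memory for a dense n_g x n_g complex single-precision matrix
    stored at n_freq frequency points, in GiB."""
    return n_g * n_g * n_freq * bytes_per_cplx / 1024**3

# Hypothetical NG of 3000 G-vectors in the 3 Ry block,
# with the 100 energy steps from ETStpsXd.
print(f"{x_mat_gib(3000, 100):.1f} GiB")
```

If each MPI task holds the full matrix, even a few GiB per task quickly exhausts the node memory with 6 tasks per node.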
Am I doing something wrong? Do you have any suggestions for overcoming this problem?
Sincerely,
Davide Romanin
-----------------------------------------------------
PhD student in Physics XXXIII cycle
Representative of the PhD students in Physics
Applied Science and Technology department (DiSAT)
Politecnico di Torino
Corso Duca degli Abruzzi, 24
10129 Torino ITALY
------------------------------------------------------