Parallel TDDFT error
Posted: Sun Jul 14, 2019 4:08 pm
Dear Yambo experts,
I ran into a problem while running parallel TDDFT calculations. Below are my input file, log file, job script, and SLURM output. I am using Yambo 4.1.3, and the system has 16 k-points.
input:->
_________________________________________
optics # [R OPT] Optics
chi # [R CHI] Dyson equation for Chi.
tddft # [R K] Use TDDFT kernel
NLogCPUs= 0 # [PARALLEL] Live-timing CPU`s (0 for all)
FFTGvecs= 4007 RL # [FFT] Plane-waves
X_q_0_CPU= "8 1 1" # [PARALLEL] CPUs for each role
X_q_0_ROLEs= "k c v" # [PARALLEL] CPUs roles (k,c,v)
X_q_0_nCPU_LinAlg_INV= 1 # [PARALLEL] CPUs for Linear Algebra
X_finite_q_CPU= "1 8 1 1" # [PARALLEL] CPUs for each role
X_finite_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
X_finite_q_nCPU_LinAlg_INV= 1 # [PARALLEL] CPUs for Linear Algebra
Chimod= "ALDA" # [X] IP/Hartree/ALDA/LRC/BSfxc
FxcGRLc= 705 RL # [TDDFT] XC-kernel RL size
NGsBlkXd= 705 RL # [Xd] Response block size
% QpntsRXd
1 | 1 | # [Xd] Transferred momenta
%
% BndsRnXd
1 | 384 | # [Xd] Polarization function bands
%
% EnRngeXd
0.000000 | 6.000000 | eV # [Xd] Energy range
%
% DmRngeXd
0.10000 | 0.10000 | eV # [Xd] Damping range
%
ETStpsXd= 200 # [Xd] Total Energy steps
% LongDrXd
1.000000 | 0.000000 | 0.000000 | # [Xd] [cc] Electric Field
%
__________________________________________________________________________________
log:->
___________________________________________________________
<---> P0001: [M 0.059 Gb] Alloc RL_Gshells RL_Eshells ( 0.018)
____ ____ ___ .___ ___. .______ ______
\ \ / / / \ | \/ | | _ \ / __ \
\ \/ / / ^ \ | \ / | | |_) | | | | |
\_ _/ / /_\ \ | |\/| | | _ < | | | |
| | / _____ \ | | | | | |_) | | `--" |
|__| /__/ \__\ |__| |__| |______/ \______/
<---> P0001: [01] CPU structure, Files & I/O Directories
<---> P0001: CPU-Threads:8(CPU)-1(threads)-1(threads@X)-1(threads@DIP)-1(threads@SE)-1(threads@RT)-1(threads@K)
<---> P0001: CPU-Threads:X_q_0(environment)-8 1 1(CPUs)-k c v(ROLEs)
<---> P0001: CPU-Threads:X_finite_q(environment)-1 8 1 1(CPUs)-q k c v(ROLEs)
<---> P0001: [02] CORE Variables Setup
<---> P0001: [02.01] Unit cells
<01s> P0001: [02.02] Symmetries
<01s> P0001: [02.03] RL shells
<01s> P0001: [02.04] K-grid lattice
<01s> P0001: [02.05] Energies [ev] & Occupations
<01s> P0001: [03] Transferred momenta grid
<01s> P0001: [M 0.327 Gb] Alloc bare_qpg ( 0.260)
<02s> P0001: [04] External corrections
<02s> P0001: [05] Optics
<02s> P0001: [LA] SERIAL linear algebra
<02s> P0001: [PARALLEL Response_G_space_Zero_Momentum for K(ibz) on 8 CPU] Loaded/Total (Percentual):2/16(13%)
<02s> P0001: [PARALLEL Response_G_space_Zero_Momentum for CON bands on 1 CPU] Loaded/Total (Percentual):56/56(100%)
<02s> P0001: [PARALLEL Response_G_space_Zero_Momentum for VAL bands on 1 CPU] Loaded/Total (Percentual):328/328(100%)
<01m-11s> P0001: [M 0.374 Gb] Alloc WF ( 0.048)
<01m-11s> P0001: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):656/5248(13%)
<01m-11s> P0001: [WF] Performing Wave-Functions I/O from ./SAVE
<01m-11s> P0001: [FFT-Rho] Mesh size: 12 18 45
<01m-11s> P0001: [M 0.472 Gb] Alloc wf_disk ( 0.097)
<01m-11s> P0001: Reading wf_fragments_1_1
<01m-11s> P0001: Reading wf_fragments_1_2
<01m-11s> P0001: Reading wf_fragments_1_3
<01m-11s> P0001: Reading wf_fragments_1_4
<01m-12s> P0001: Reading wf_fragments_1_5
<01m-12s> P0001: Reading wf_fragments_1_6
<01m-12s> P0001: Reading wf_fragments_1_7
<01m-12s> P0001: Reading wf_fragments_2_1
<01m-13s> P0001: Reading wf_fragments_2_2
<01m-13s> P0001: Reading wf_fragments_2_3
<01m-13s> P0001: Reading wf_fragments_2_4
<01m-13s> P0001: Reading wf_fragments_2_5
<01m-14s> P0001: Reading wf_fragments_2_6
<01m-14s> P0001: Reading wf_fragments_2_7
<01m-14s> P0001: [M 0.374 Gb] Free wf_disk ( 0.097)
<01m-15s> P0001: [xc] Functional Perdew, Burke & Ernzerhof(X)+Perdew, Burke & Ernzerhof(C)
__________________________________________________________________________________________
This is where the log stops; nothing more is printed before the job aborts.
job script:->
_______________________________________________________________________________________
#!/bin/bash
#SBATCH -J SnTe
#SBATCH --get-user-env
#SBATCH -e exclusive
#SBATCH -N 1
#SBATCH -n 8
#SBATCH -p C032M0128G
#SBATCH --qos low
bin=~/software/yambo-4.1.3/bin/yambo
jf="04_tddft.in"
srun hostname -s | sort -u > slurm.hosts
mpirun -n 8 -machinefile slurm.hosts $bin -F ${jf} -J 04_tddft
rm -rf slurm.hosts
_______________________________________________________________________________
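For reference, here is a stripped-down version of the same submission as a sketch, assuming our Intel MPI build has SLURM/PMI integration so that srun can place the 8 tasks itself and no machinefile is needed. The binary path, input file, and yambo flags are the same as above; I have not verified that launching this way changes anything.
_______________________________________________________________________________________
#!/bin/bash
#SBATCH -J SnTe
#SBATCH -N 1
#SBATCH -n 8
#SBATCH -p C032M0128G
#SBATCH --qos low

# Same binary and input file as in the job script above.
bin=~/software/yambo-4.1.3/bin/yambo
jf="04_tddft.in"

# Let SLURM start the 8 MPI tasks directly (assumes Intel MPI was built
# with SLURM/PMI support), so no host-file handling is required.
srun -n 8 $bin -F ${jf} -J 04_tddft
_______________________________________________________________________________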
And the SLURM log:->
___________________________________________________________________
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 432360 RUNNING AT b2u09n3
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
Intel(R) MPI Library troubleshooting guide:
https://software.intel.com/node/561764
===================================================================================
_______________________________________________________________________________________
Could you help me? Thank you very much!
Best,
Xiaowei