ypp is very slow to calculate avehole
Posted: Wed Jun 08, 2022 7:55 am
Dear developer,
I am trying to use ypp (version 5.0.4) to calculate the average hole/electron wavefunction (avehole), but the calculation is very slow even when I use several hundred cores. The post-processing seems to be slower than the GW and BSE calculations themselves. Is there any way to accelerate the calculation, or how should I set up the parallelization parameters? Thanks very much!
Input for avehole
#############################
ElecTemp= 0.258520E-4 eV # Electronic Temperature
BoseTemp=-1.000000 eV # Bosonic Temperature
StdoHash= 40 # [IO] Live-timing Hashes
excitons # [R] Excitonic properties
avehole # [R] Average hole/electron wavefunction
infver # [R] Input file variables verbosity
wavefunction # [R] Wavefunction
Format= "c" # Output format [(c)ube/(g)nuplot/(x)crysden]
Direction= "123" # [rlu] [1/2/3] for 1d or [12/13/23] for 2d [123] for 3D
FFTGvecs= 28799 RL # [FFT] Plane-waves
States= "0 - 1" # Index of the BS state(s)
En_treshold= 0.000000 eV # Select states below this energy treshold
Res_treshold= 0.000000 # Select states above this optical strength treshold (max normalized to 1.)
BSQindex= 1 # Q-Index of the BS state(s)
Degen_Step= 0.010000 eV # Maximum energy separation of two degenerate states
Weight_treshold= 0.050000 # Print transitions above this weight treshold (max normalized to 1.)
WFMult= 1.000000 # Multiplication factor to the excitonic wavefunction
EHdensity= "h" # Calculate (h)ole/(e)lectron density from BSE wave-function
#############################
Output
###########################
<---> P1: [01] MPI/OPENMP structure, Files & I/O Directories
<---> P1: MPI Cores-Threads : 40(CPU)-4(threads)
<---> P1: MPI assigned to GPU : 0
<---> P1: [02] Y(ambo) P(ost)/(re) P(rocessor)
<---> P1: [03] Core DB
<---> P1: :: Electrons : 190.0000
<---> P1: :: Temperature : 0.950044E-3 [eV]
<---> P1: :: Lattice factors : 12.70797 11.00542 23.02844 [a.u.]
<---> P1: :: K points : 62
<---> P1: :: Bands : 1600
<---> P1: :: Symmetries : 12
<---> P1: :: RL vectors : 202433
<---> P1: [04] K-point grid
<---> P1: :: Q-points (IBZ): 62
<---> P1: :: X K-points (IBZ): 62
<---> P1: [05] CORE Variables Setup
<---> P1: [05.01] Unit cells
<---> P1: [05.02] Symmetries
<---> P1: [05.03] Reciprocal space
<---> P1: [05.04] K-grid lattice
<---> P1: Grid dimensions : 6 9 9
<---> P1: [05.05] Energies & Occupations
<02s> P1: [06] Excitonic Properties @ Q-index #1
<02s> P1: Sorting energies
<02s> P1: 2 excitonic states selected
<04s> P1: [06.01] Excitonic Wave Function
<04s> P1: [06.01.01] Real-Space grid setup
<04s> P1: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):744/744(100%)
<04s> P1: [FFT-EXCWF] Mesh size: 33 33 60
<08s> P1: [WF] Copying WF data from GPU device
<09s> P1: Extended grid : 33 33 60
<09s> P1: Processing 2 states
<09s> P1: State 1 Merged with states 1 -> 2
<09s> P1: ExcWF@1 | | [000%] --(E) --(X)
<32m-57s> P1: ExcWF@1 |# | [002%] 32m-48s(E) 21h-47m(X)
<01h-05m> P1: ExcWF@1 |## | [005%] 01h-05m(E) 21h-46m(X)
<01h-38m> P1: ExcWF@1 |### | [007%] 01h-38m(E) 21h-45m(X)
<02h-11m> P1: ExcWF@1 |#### | [010%] 02h-11m(E) 21h-45m(X)
<02h-43m> P1: ExcWF@1 |##### | [012%] 02h-43m(E) 21h-45m(X)
<03h-16m> P1: ExcWF@1 |###### | [015%] 03h-16m(E) 21h-45m(X)
<03h-48m> P1: ExcWF@1 |####### | [017%] 03h-48m(E) 21h-45m(X)
<04h-21m> P1: ExcWF@1 |######## | [020%] 04h-21m(E) 21h-45m(X)
<04h-54m> P1: ExcWF@1 |######### | [022%] 04h-53m(E) 21h-45m(X)
<05h-26m> P1: ExcWF@1 |########## | [025%] 05h-26m(E) 21h-45m(X)
#####################
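As a rough check of the timing above, the live-timing lines can be extrapolated by hand. The Python sketch below does this under two assumptions: that (E) is the elapsed time and (X) the expected total, and that durations always look like the "05h-26m" / "32m-48s" strings in this log. The helper names are hypothetical and are not part of Yambo.

```python
import re

# Seconds per unit letter used in the live-timing strings (assumed format).
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a string such as '05h-26m' or '32m-48s' to seconds."""
    return sum(int(value) * UNIT_SECONDS[unit]
               for value, unit in re.findall(r"(\d+)([hms])", text))


def estimate_total_seconds(elapsed_seconds, percent_done):
    """Linear extrapolation of the total run time from the progress so far."""
    return elapsed_seconds * 100.0 / percent_done


# Last line of the log above: 25% done after 05h-26m.
elapsed = parse_duration("05h-26m")
total = estimate_total_seconds(elapsed, 25)
print(f"estimated total: {total / 3600:.1f} h")
```

With the last log line (25% after 05h-26m) this gives a bit under 22 hours, in line with the 21h-45m(X) that ypp reports, so the rate is constant and the whole ExcWF step would take almost a full day.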
BSE input
########################
WRbsWF
ppa # [R Xp] Plasmon Pole Approximation
rim_cut # [R RIM CUT] Coulomb potential
optics # [R OPT] Optics
bss # [R BSS] Bethe Salpeter Equation solver
em1d # [R Xd] Dynamical Inverse Dielectric Matrix
bse # [R BSE] Bethe Salpeter Equation.
bsk # [R BSK] Bethe Salpeter Equation kernel
K_Threads=0 # [OPENMP/BSK] Number of threads for response functions
#RandQpts= 1000000 # [RIM] Number of random q-points in the BZ
#RandGvec= 300 RL # [RIM] Coulomb interaction RS components
#CUTGeo= "box yz" # [CUT] Coulomb Cutoff geometry: box/cylinder/sphere/ws X/Y/Z/XY..
#% CUTBox
#0.00 | 56.00 | 28.00 | # [CUT] [au] Box sides
#%
#CUTRadius= 0.000000 # [CUT] [au] Sphere/Cylinder radius
#CUTCylLen= 0.000000 # [CUT] [au] Cylinder length
#CUTwsGvec= 0.700000 # [CUT] WS cutoff: number of G to be modified
Chimod= "hartree" # [X] IP/Hartree/ALDA/LRC/PF/BSfxc
BSEmod= "retarded" # [BSE] resonant/retarded/coupling
BSKmod= "SEX" # [BSE] IP/Hartree/HF/ALDA/SEX
BSSmod= "d" # [BSS] (h)aydock/(d)iagonalization/(i)nversion/(t)ddft
XTermKind= "none" # [X] X terminator ("none","BG" Bruneval-Gonze)
XTermEn= 40 eV # [X] X terminator energy (only for kind="BG")
XfnQP_Wc_dos= 0.000000 eV # [EXTQP Xd] W dos pre-factor (conduction)
BSENGexx= 60 Ry # RL # [BSK] Exchange components
BSENGBlk= 8 Ry # RL # [BSK] Screened interaction block size
KfnQPdb= "E < ./ndb.QP" # [EXTQP BSK BSS] Database
KfnQP_N= 1 # [EXTQP BSK BSS] Interpolation neighbours
% KfnQP_E
0.0000000 | 1.000000 | 1.000000 | # [EXTQP BSK BSS] E parameters (c/v) eV|adim|adim
%
#WehCpl # [BSK] eh interaction included also in coupling
% BEnRange
-2.00000 | 10.00000 | eV # [BSS] Energy range
%
% BDmRange
0.030000 | 0.030000 | eV # [BSS] Damping range
%
BEnSteps=1000 # [BSS] Energy steps
% BLongDir
1.000000 | 1.000000 | 1.000000 | # [BSS] [cc] Electric Field
%
% BSEBands
185 | 196 | # [BSK] Bands range
%
DysSolver= "n" # [GW] Dyson Equation solver ("n","s","g")
% BndsRnXp
1 | 1580 | # [Xp] Polarization function bands
%
NGsBlkXp= 12 Ry # RL # [Xp] Response block size
% LongDrXp
1.000000 | 1.000000 | 1.000000 | # [Xp] [cc] Electric Field
%
PPAPntXp= 27.21138 eV # [Xp] PPA imaginary energy
####################################
Best regards
Ke