Slow performance after an X0 is finished
Posted: Sat Feb 17, 2024 6:34 am
Dear developers,
I am using yambo 5.1.1 to compute quasiparticle energies with the PPA. In the dynamical dielectric matrix stage, once an X0 calculation finishes, yambo appears to stall. Only after several hours does it continue and calculate X, as in the log below:
Here the X0 for q2 takes about 2 hours, but then the log is silent for about 6 hours (06h-21m to 12h-01m) before X for q2 starts. This seems strange. I wonder what yambo is doing during those 6 hours, and whether I can do anything to improve the performance. Thank you very much.
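To measure the silent interval rather than read it off by eye, here is a minimal Python sketch that scans the log for large jumps between consecutive timestamps. It is not part of yambo. The function names `stamp_minutes` and `find_gaps` and the 60-minute threshold are my own choices for illustration, and it assumes every stamp has the `<HHh-MMm>` form seen in this log. The sample lines are abbreviated from the log in this post.

```python
import re

# Matches the leading wall-clock stamp of a log line, e.g. "<06h-21m>".
# Assumption: only the hours-minutes form seen in this log is handled.
STAMP = re.compile(r"^\s*<(\d+)h-(\d+)m>")


def stamp_minutes(line):
    """Return the stamp of a log line in minutes, or None if it has no stamp."""
    match = STAMP.match(line)
    if match is None:
        return None
    return int(match.group(1)) * 60 + int(match.group(2))


def find_gaps(lines, threshold_min=60):
    """Return (gap_minutes, line_before, line_after) for every jump between
    consecutive stamped lines that is at least threshold_min minutes long."""
    gaps = []
    prev_t, prev_line = None, None
    for line in lines:
        t = stamp_minutes(line)
        if t is None:
            continue  # skip lines without a stamp
        if prev_t is not None and t - prev_t >= threshold_min:
            gaps.append((t - prev_t, prev_line.strip(), line.strip()))
        prev_t, prev_line = t, line
    return gaps


if __name__ == "__main__":
    # Lines abbreviated from the log in this post.
    sample = [
        "<05h-48m> P1-cnode399: Xo@q[2] [097%]",
        "<06h-21m> P1-cnode399: Xo@q[2] [100%]",
        "<06h-21m> P1-cnode399: [MEMORY] Free Xo_res",
        "<12h-01m> P1-cnode399: [PARALLEL distribution for X Frequencies on 256 CPU]",
        "<12h-01m> P1-cnode399: X@q[2] [000%]",
    ]
    for minutes, before, after in find_gaps(sample):
        print(f"{minutes // 60}h {minutes % 60}m silent between:")
        print(f"  {before}")
        print(f"  {after}")
```

On these lines the only jump above the threshold is the one between freeing `Xo_res` and the start of the X frequency distribution, i.e. 340 minutes (5h 40m).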
(I'm using 512 cores on a spin-polarized system, with a 10 Ry cutoff for epsilon and 300 bands in the summation. The system has 110 electrons.)
Best,
Yuanfan Xiong
<04h-03m> P1-cnode399: Xo@q[2] | | [000%] --(E) --(X)
<04h-03m> P1-cnode399: [MEMORY] Alloc Xo_res( 136.6050 [Mb]) TOTAL: 10.39174 [Gb] (traced) 117.5200 [Mb] (memstat)
<04h-05m> P1-cnode399: Xo@q[2] |# | [002%] 02m-13s(E) 01h-28m(X)
<04h-07m> P1-cnode399: Xo@q[2] |## | [005%] 04m-25s(E) 01h-28m(X)
<04h-09m> P1-cnode399: Xo@q[2] |### | [007%] 06m-37s(E) 01h-28m(X)
<04h-11m> P1-cnode399: Xo@q[2] |#### | [010%] 08m-49s(E) 01h-28m(X)
<04h-14m> P1-cnode399: Xo@q[2] |##### | [012%] 11m-01s(E) 01h-28m(X)
<04h-16m> P1-cnode399: Xo@q[2] |###### | [015%] 13m-12s(E) 01h-28m(X)
<04h-18m> P1-cnode399: Xo@q[2] |####### | [017%] 15m-24s(E) 01h-28m(X)
<04h-20m> P1-cnode399: Xo@q[2] |######## | [020%] 17m-36s(E) 01h-28m(X)
<04h-22m> P1-cnode399: Xo@q[2] |######### | [022%] 19m-48s(E) 01h-28m(X)
<04h-25m> P1-cnode399: Xo@q[2] |########## | [025%] 21m-59s(E) 01h-27m(X)
<04h-27m> P1-cnode399: Xo@q[2] |########### | [027%] 24m-11s(E) 01h-27m(X)
<04h-29m> P1-cnode399: Xo@q[2] |############ | [030%] 26m-22s(E) 01h-27m(X)
<04h-31m> P1-cnode399: Xo@q[2] |############# | [032%] 28m-34s(E) 01h-27m(X)
<04h-33m> P1-cnode399: Xo@q[2] |############## | [035%] 30m-47s(E) 01h-27m(X)
<04h-36m> P1-cnode399: Xo@q[2] |############### | [037%] 32m-58s(E) 01h-27m(X)
<04h-38m> P1-cnode399: Xo@q[2] |################ | [040%] 35m-10s(E) 01h-27m(X)
<04h-40m> P1-cnode399: Xo@q[2] |################# | [042%] 37m-23s(E) 01h-27m(X)
<04h-42m> P1-cnode399: Xo@q[2] |################## | [045%] 39m-35s(E) 01h-27m(X)
<04h-44m> P1-cnode399: Xo@q[2] |################### | [047%] 41m-46s(E) 01h-27m(X)
<04h-47m> P1-cnode399: Xo@q[2] |#################### | [050%] 43m-58s(E) 01h-27m(X)
<04h-49m> P1-cnode399: Xo@q[2] |##################### | [052%] 46m-09s(E) 01h-27m(X)
<04h-51m> P1-cnode399: Xo@q[2] |###################### | [055%] 48m-20s(E) 01h-27m(X)
<04h-53m> P1-cnode399: Xo@q[2] |####################### | [057%] 50m-32s(E) 01h-27m(X)
<04h-55m> P1-cnode399: Xo@q[2] |######################## | [060%] 52m-44s(E) 01h-27m(X)
<04h-57m> P1-cnode399: Xo@q[2] |######################### | [062%] 54m-56s(E) 01h-27m(X)
<05h-00m> P1-cnode399: Xo@q[2] |########################## | [065%] 57m-08s(E) 01h-27m(X)
<05h-02m> P1-cnode399: Xo@q[2] |########################### | [067%] 59m-21s(E) 01h-27m(X)
<05h-04m> P1-cnode399: Xo@q[2] |############################ | [070%] 01h-01m(E) 01h-27m(X)
<05h-06m> P1-cnode399: Xo@q[2] |############################# | [072%] 01h-03m(E) 01h-27m(X)
<05h-08m> P1-cnode399: Xo@q[2] |############################## | [075%] 01h-05m(E) 01h-27m(X)
<05h-11m> P1-cnode399: Xo@q[2] |############################### | [077%] 01h-08m(E) 01h-27m(X)
<05h-13m> P1-cnode399: Xo@q[2] |################################ | [080%] 01h-10m(E) 01h-27m(X)
<05h-15m> P1-cnode399: Xo@q[2] |################################# | [082%] 01h-12m(E) 01h-28m(X)
<05h-18m> P1-cnode399: Xo@q[2] |################################## | [085%] 01h-15m(E) 01h-28m(X)
<05h-20m> P1-cnode399: Xo@q[2] |################################### | [087%] 01h-17m(E) 01h-28m(X)
<05h-22m> P1-cnode399: Xo@q[2] |#################################### | [090%] 01h-19m(E) 01h-28m(X)
<05h-26m> P1-cnode399: Xo@q[2] |##################################### | [092%] 01h-23m(E) 01h-29m(X)
<05h-29m> P1-cnode399: Xo@q[2] |###################################### | [095%] 01h-26m(E) 01h-31m(X)
<05h-48m> P1-cnode399: Xo@q[2] |####################################### | [097%] 01h-45m(E) 01h-47m(X)
<06h-21m> P1-cnode399: Xo@q[2] |########################################| [100%] 02h-18m(E) 02h-18m(X)
<06h-21m> P1-cnode399: [MEMORY] Free Xo_res( 136.6050 [Mb]) TOTAL: 10.26575 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-01m> P1-cnode399: [PARALLEL distribution for X Frequencies on 256 CPU] Loaded/Total (Percentual):1/2(50%)
<12h-01m> P1-cnode399: X@q[2] | | [000%] --(E) --(X)
<12h-01m> P1-cnode399: [MEMORY] Alloc KERNEL%blc( 1.019056 [Gb]) TOTAL: 11.27420 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-01m> P1-cnode399: [MEMORY] Alloc Xo%blc( 1.019056 [Gb]) TOTAL: 12.29326 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-02m> P1-cnode399: [MEMORY] Alloc BUFFER%blc( 1.019056 [Gb]) TOTAL: 13.31231 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: X@q[2] |########################################| [100%] 02m-03s(E) 02m-03s(X)
<12h-04m> P1-cnode399: [MEMORY] Free M_par%blc( 1.019056 [Gb]) TOTAL: 12.29326 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [MEMORY] Free M_par%blc( 1.019056 [Gb]) TOTAL: 11.27420 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [MEMORY] Free M_par%blc( 273.2110 [Mb]) TOTAL: 11.00099 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [MEMORY] Alloc X_par%blc( 509.4830 [Mb]) TOTAL: 11.51047 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [PARALLEL distribution for RL vectors(X) on 4 CPU] Loaded/Total (Percentual):32606955/******(25%)
<12h-04m> P1-cnode399: [MEMORY] Free M_par%blc( 1.019056 [Gb]) TOTAL: 10.49142 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [MEMORY] Free M_par%blc( 509.4830 [Mb]) TOTAL: 9.979949 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [MEMORY] Free X_par_lower_triangle%blc( 273.2110 [Mb]) TOTAL: 9.706738 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [MEMORY] Alloc X_par_lower_triangle%blc( 273.2110 [Mb]) TOTAL: 9.979949 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [MEMORY] Alloc X_par%blc( 273.2110 [Mb]) TOTAL: 10.25316 [Gb] (traced) 117.5200 [Mb] (memstat)
<12h-04m> P1-cnode399: [PARALLEL distribution for RL vectors(X) on 4 CPU] Loaded/Total (Percentual):17485551/******(13%)
<12h-04m> P1-cnode399: [X-CG] R(p) Tot o/o(of R): 10998 81000 100
<12h-04m> P1-cnode399: Xo@q[3] | | [000%] --(E) --(X)