Stuck in the calculation of collisions
Posted: Thu Jul 11, 2024 9:42 am
Dear developers,
I'm trying to reproduce the SHG calculation for monolayer (ML) MoS2 from Physical Review B 89, 081102(R) (2014), and I'm having trouble computing the collisions database at the BSE level.
At first I ran on 2 nodes with 20 tasks per node (ntasks-per-node) on our cluster, but all the CPU report files stalled at step [06.01] SEX+HARTREE (e-e correlation), and the job was killed when it reached the cluster's 6-hour time limit.
I guessed that the critical tags for parallelization are RT_CPU and RT_ROLEs, so I added these two tags to the input file.
After that, the report files CPU_1/2/3/4 stalled at 5%, 5%, 12% and 25% respectively, and the other CPU files are in a similar state.
My questions are:
1. If I switch to a cluster with a longer runtime limit, will the collisions calculation be able to complete?
2. How should I distribute the cores for parallelization in my calculation?
3. Could you please review the parameters in my input files to check that they are reasonable?
Best,
Liangting
Log of the first run (2 nodes x 20 tasks per node):
<01h-53m> P1-cn537: [06.01] SEX+HARTREE (e-e correlation)
<01h-53m> P1-cn537: [SEX+HARTREE] Plane waves (H,X,C) : 139 139 139
<01h-53m> P1-cn537: [MEMORY] Alloc WF%c( 1.472871 [Gb]) TOTAL: 1.582761 [Gb] (traced) 69.25200 [Mb] (memstat)
<01h-53m> P1-cn537: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):4410/6174(71%)
<01h-53m> P1-cn537: [FFT-SEX+HARTREE Collisions] Mesh size: 15 15 95
Parallelization tags added to the input file:
RT_CPU= "10.4.1" # [PARALLEL] CPUs for each role
RT_ROLEs= "k.b.q" # [PARALLEL] CPUs roles (k,b,q,qp)
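As a quick sanity check on the layout above, the sketch below splits the two strings and compares the product of the RT_CPU fields with the number of MPI tasks. It assumes the usual convention that this product should equal the total task count (2 nodes x 20 tasks here); the helper name `check_parallel_layout` is hypothetical and not part of Yambo.

```python
from math import prod


def check_parallel_layout(cpu: str, roles: str, mpi_tasks: int) -> dict:
    """Pair each role with its CPU count and compare the product with the MPI task count."""
    counts = [int(field) for field in cpu.split(".")]
    names = roles.split(".")
    if len(counts) != len(names):
        raise ValueError(
            f"{len(counts)} CPU fields but {len(names)} roles: {cpu!r} vs {roles!r}"
        )
    total = prod(counts)
    return {
        "layout": dict(zip(names, counts)),  # e.g. {"k": 10, "b": 4, "q": 1}
        "total": total,                      # product of all CPU fields
        "matches": total == mpi_tasks,       # assumed rule: product == MPI tasks
    }


# Values taken from the post: RT_CPU="10.4.1", RT_ROLEs="k.b.q", 2 nodes x 20 tasks.
result = check_parallel_layout("10.4.1", "k.b.q", mpi_tasks=2 * 20)
print(result)
```

With the values from the post the product is 40, which matches the 40 MPI tasks, so the stall is unlikely to come from a mismatched task count alone.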
Log after adding the tags:
<01h-54m> P1-cn519: Collisions | | [000%] --(E) --(X)
<02h-50m> P1-cn519: Collisions |# | [002%] 56m-17s(E) 01d-13h-31m(X)
<03h-52m> P1-cn519: Collisions |## | [005%] 01h-58m(E) 01d-15h-28m(X)
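To judge whether a longer time limit would be enough, the progress lines above can be extrapolated. The sketch below is only a rough linear estimate: it assumes that the `(E)` field is the elapsed time of the step, that the percentage grows roughly linearly, and that the line format is exactly as pasted. The names `to_seconds` and `extrapolate_total` are hypothetical helpers, not Yambo tools.

```python
import re

# Matches e.g. "[005%] 01h-58m(E) 01d-15h-28m(X)" at the end of a progress line.
PROGRESS_RE = re.compile(r"\[(\d+)%\]\s+(\S+)\(E\)\s+(\S+)\(X\)")
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def to_seconds(stamp: str) -> int:
    """Convert a stamp such as '01d-15h-28m' or '56m-17s' to seconds ('--' gives 0)."""
    return sum(int(value) * UNIT_SECONDS[unit]
               for value, unit in re.findall(r"(\d+)([dhms])", stamp))


def extrapolate_total(line: str):
    """Return elapsed * 100 / percent in seconds, or None if the line has no usable data."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    percent = int(match.group(1))
    elapsed = to_seconds(match.group(2))
    if percent == 0 or elapsed == 0:
        return None
    return elapsed * 100 / percent


lines = [
    "<01h-54m> P1-cn519: Collisions | | [000%] --(E) --(X)",
    "<02h-50m> P1-cn519: Collisions |# | [002%] 56m-17s(E) 01d-13h-31m(X)",
    "<03h-52m> P1-cn519: Collisions |## | [005%] 01h-58m(E) 01d-15h-28m(X)",
]
for line in lines:
    total = extrapolate_total(line)
    if total is not None:
        print(f"{line.split('[')[-1][:4]} -> about {total / 3600:.1f} h in total")
```

For the 5% line this gives about 39.3 hours for the collisions step on this task, so a 6-hour limit cannot be enough with the current setup.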