GW terminated with Kill signal 9

Posted: Fri Feb 18, 2022 12:59 pm
by burkzdemir
Dear developers,

I am trying to run GW for a bulk system with a small unit cell. The k-point mesh in the nscf calculation is 14x14x13. However, some of the CPUs do not start the Xo@q calculation, and the run is then terminated with Kill signal 9. I have shared the files (link below). Is this a memory issue? I am using 100 GB per CPU and 60 CPUs in total.

https://drive.google.com/drive/folders/ ... sp=sharing

Best,
Burak

Re: GW terminated with Kill signal 9

Posted: Fri Feb 18, 2022 1:09 pm
by Daniele Varsano
Dear Burak,

It does not look like a memory problem. Can you check whether there is an error message at the end of any of the log files?
Please note that your input file contains:

Code:

NGsBlkXp= 1                RL   
This setting does not make much sense, because with a single G-vector you are treating the screening as a scalar.

Next, you are not defining a parallel structure, and the default one may be inappropriate. I suggest assigning the CPUs explicitly:

Code:

X_and_IO_CPU= "1 1 10 3 2"                 # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v"               # [PARALLEL] CPUs roles (q,g,k,c,v)
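
As a quick sanity check before submitting, a layout like the one above can be verified with a few lines of Python. This is only a sketch: the helper name `check_layout` is hypothetical and not part of Yambo, and it assumes that the product of the entries in X_and_IO_CPU should equal the total number of MPI tasks (60 in this thread) and that each entry pairs with the role at the same position.

```python
from math import prod


def check_layout(cpu: str, roles: str, n_tasks: int) -> dict:
    """Pair each role with its CPU count and check the product.

    Hypothetical helper, not a Yambo tool. Assumes the product of the
    per-role CPU counts should equal the number of MPI tasks.
    """
    counts = [int(tok) for tok in cpu.split()]
    names = roles.split()
    if len(counts) != len(names):
        raise ValueError(
            f"{len(counts)} CPU entries but {len(names)} roles"
        )
    total = prod(counts)
    if total != n_tasks:
        raise ValueError(
            f"product of CPU entries is {total}, expected {n_tasks}"
        )
    return dict(zip(names, counts))


# Values taken from the input lines above; 60 is the total CPU count
# mentioned in the first post.
layout = check_layout("1 1 10 3 2", "q g k c v", 60)
print(layout)
```

If the product does not match the number of tasks, the helper raises a ValueError instead of returning a layout.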
Best,
Daniele

Re: GW terminated with Kill signal 9

Posted: Tue Mar 08, 2022 10:18 am
by burkzdemir
Dear Yambo developers,

I tried different parallelization schemes, and some of my calculations were cancelled before the self-energy calculation, or even earlier, at the start of the Xo@q[1] calculation. I did not have this problem in the past. Is it related to the new version of Yambo? I do not understand how to find a parallelization scheme that lets the calculation finish successfully. The results of one of my calculations are at the link below.

https://drive.google.com/drive/folders/ ... sp=sharing

Best,
Burak

Re: GW terminated with Kill signal 9

Posted: Tue Mar 22, 2022 7:08 pm
by burkzdemir
OK, I solved the problem by using the internal linear algebra.

Best,
Burak