GW terminated with Kill signal 9

This forum covers problems that arise when using old releases of Yambo (< 5.0): parallelization strategy, performance issues, and other technical aspects.

Post by burkzdemir » Fri Feb 18, 2022 12:59 pm

Dear developers,

I am trying to run GW for a bulk system with a small unit cell. The k-point mesh in the nscf calculation is 14x14x13. However, some of the CPUs never start the Xo@q calculation, and the run is then terminated with kill signal 9. I have shared the files (link below). Is this a memory issue? I am using 100 GB per CPU and 60 CPUs in total.

https://drive.google.com/drive/folders/ ... sp=sharing

Best,
Burak
Burak Ozdemir
Post-doc,
University of Nantes, France

Post by Daniele Varsano » Fri Feb 18, 2022 1:09 pm

Dear Burak,

It does not seem to be a memory problem. Can you check whether there is an error message at the end of one of the log files?
Please note that your input file contains:

Code:

NGsBlkXp= 1                RL
This does not make much sense: with a single reciprocal-lattice vector in the response block, you are treating the screening as a scalar.

Next, you are not defining a parallel structure, and the default one could be inappropriate. I suggest you assign the CPUs explicitly:

Code:

X_and_IO_CPU= "1 1 10 3 2"                 # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "q g k c v"               # [PARALLEL] CPUs roles (q,g,k,c,v)
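
A quick sanity check on a layout like the one above: as far as I understand Yambo's parallel setup, the entries of X_and_IO_CPU should multiply to the total number of MPI tasks (60 in this thread), with one entry per role. The Python sketch below is not part of Yambo; the helper name check_parallel_layout is hypothetical and only illustrates the arithmetic.

```python
from math import prod


def check_parallel_layout(cpu_string, roles_string, n_mpi_tasks):
    """Map each role to its CPU count, or raise ValueError if inconsistent.

    Hypothetical helper (not a Yambo API). It assumes the product of the
    per-role CPU counts must equal the total number of MPI tasks.
    """
    cpus = [int(x) for x in cpu_string.split()]
    roles = roles_string.split()
    if len(cpus) != len(roles):
        raise ValueError(
            f"{len(cpus)} CPU entries but {len(roles)} roles"
        )
    total = prod(cpus)
    if total != n_mpi_tasks:
        raise ValueError(
            f"layout uses {total} tasks, but the job has {n_mpi_tasks}"
        )
    return dict(zip(roles, cpus))


# Values taken from the suggestion in this post, for a 60-task job.
layout = check_parallel_layout("1 1 10 3 2", "q g k c v", 60)
print(layout)
```

For the values suggested here, 1 x 1 x 10 x 3 x 2 = 60, which matches the 60 CPUs mentioned in the first post.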
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

Post by burkzdemir » Tue Mar 08, 2022 10:18 am

Dear Yambo developers,

I tried different parallelization schemes, but some of my calculations were cancelled before the self-energy calculation, or even earlier, at the start of the Xo@q[1] calculation. I did not have this problem in the past; is it related to the new version of Yambo? I do not understand how to find a parallelization scheme that lets the calculation finish successfully. Here are the results of one of my calculations (link below).

https://drive.google.com/drive/folders/ ... sp=sharing

Best,
Burak

Post by burkzdemir » Tue Mar 22, 2022 7:08 pm

OK, I solved the problem by using the internal linear algebra.

Best,
Burak