SOC error

Run-time issues concerning Yambo that are not covered in the above forums.

Moderators: myrta gruning, andrea marini, Daniele Varsano, Conor Hogan

yanghuang
Posts: 23
Joined: Fri Jul 08, 2016 8:11 am

SOC error

Post by yanghuang » Sun Oct 09, 2016 2:16 pm

Dear Developers and Users,
I want to perform a GW+SOC calculation. The GW calculation on its own runs correctly, but the following error occurs when it is combined with SOC:

[yanghuang@ln1%tianhe2-C a.save]$ yhrun: error: cn15063: task 80: Killed
yhrun: First task exited 60s ago
yhrun: tasks 0,3,5,8,11,15,22-23,26-29,33,35-37,39,41,43-44,47,51-53,56,58-59,69-72,74,78,82-83,85,87,92,94: running
yhrun: tasks 1-2,4,6-7,9-10,12-14,16-21,24-25,30-32,34,38,40,42,45-46,48-50,54-55,57,60-68,73,75-77,79-81,84,86,88-91,93,95: exited abnormally
yhrun: Terminating job step 2761720.0
slurmd[cn10954]: *** STEP 2761720.0 KILLED AT 2016-10-09T20:26:58 WITH SIGNAL 9 ***
yhrun: Job step aborted: Waiting up to 2 seconds for job step to finish.
slurmd[cn10954]: *** STEP 2761720.0 KILLED AT 2016-10-09T20:26:58 WITH SIGNAL 9 ***
yhrun: error: Timed out waiting for job step to complete

I have multiplied BndsRnXp and GbndRnge by two.

Could you give me some advice please?

Thanks in advance.
Yang Huang
Student
Suzhou University
huangyang10010@163.com

Daniele Varsano
Posts: 4198
Joined: Tue Mar 17, 2009 2:23 pm

Re: SOC error

Post by Daniele Varsano » Sun Oct 09, 2016 3:59 pm

Dear Yang Huang,
the error is most probably due to a lack of memory:
[M 8.917 Gb] Alloc wf_disk ( 0.119)
The run is allocating about 9 GB.
So much memory is required because your input asks for corrections at 2000 points (200 bands times 10 k-points):
%QPkrange # [GW] QP generalized Kpoint/Band indices
1| 10| 1|200|
%
Why do you need to calculate the corrections for bands 1 to 200?
Your last occupied band is number 156. Usually one is interested in band-structure corrections around the Fermi level, so I think you can avoid calculating corrections for such a large number of bands. If you consider something like 10 occupied and 10 empty bands:

%QPkrange                    # [GW] QP generalized Kpoint/Band indices
  1| 10|146|167|
%
...or similar, this should solve your memory problem.
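
To make the arithmetic explicit, here is a minimal Python sketch that counts the quasiparticle corrections requested by a QPkrange line and builds a band window around the last occupied band. The function names and the default of 10 occupied plus 10 empty bands are illustrative assumptions, not part of yambo; only the numbers (10 k-points, 200 bands, last occupied band 156) come from this thread.

```python
def qp_count(k_first, k_last, b_first, b_last):
    """Number of (k-point, band) pairs requested by a QPkrange line."""
    return (k_last - k_first + 1) * (b_last - b_first + 1)


def band_window(last_occupied, n_occ=10, n_empty=10):
    """Hypothetical helper: band range holding n_occ occupied and
    n_empty empty bands around the last occupied band."""
    first = max(1, last_occupied - n_occ + 1)
    last = last_occupied + n_empty
    return first, last


def format_qpkrange(k_first, k_last, b_first, b_last):
    """Render the range in the %QPkrange block layout shown in this thread."""
    return f"%QPkrange\n {k_first}| {k_last}| {b_first}|{b_last}|\n%"


if __name__ == "__main__":
    # Original input: 10 k-points, bands 1 to 200
    print(qp_count(1, 10, 1, 200))
    # Window suggested in the reply: bands 146 to 167
    print(qp_count(1, 10, 146, 167))
    # Exactly 10 occupied + 10 empty bands around band 156
    b_first, b_last = band_window(156)
    print(format_qpkrange(1, 10, b_first, b_last))
```

With these numbers the original request is 2000 corrections, the 146 to 167 window is 220, and a strict 10 + 10 window (bands 147 to 166) is 200. Either window is roughly a tenth of the original request.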
If instead you really are interested in deep energy levels and high-energy states (which, by the way, are most probably unbound and do not make much sense), you will need to lower some of the convergence parameters.

I can see you are running in serial mode; why don't you try running in parallel? If you switch to yambo 4.x, you can control the parallelization strategy, and by parallelizing over bands you can reduce the memory needed per core.


Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

yanghuang
Posts: 23
Joined: Fri Jul 08, 2016 8:11 am

Re: SOC error

Post by yanghuang » Mon Oct 10, 2016 7:50 am

Daniele,
Many thanks for your detailed reply. I will try it all over again.

Best
Yang Huang
Student
Suzhou University
huangyang10010@163.com
