Not responding calculations on collisions calculation step

Questions and doubts about features of nonlinear optics in Yambo (yambo_nl)

Moderators: Davide Sangalli, claudio, myrta gruning

DmitrySkachkov
Posts: 11
Joined: Mon Dec 13, 2021 8:52 pm

Not responding calculations on collisions calculation step

Post by DmitrySkachkov » Mon Nov 18, 2024 9:20 pm

I am running a nonlinear calculation with yambo_nl and am at the first step, the calculation of the collisions.

After some time the program stops responding. Cancelling and resubmitting the job may help, and the collisions calculation may then continue:
<33m-59s> P15: Collisions |################ | [040%] 32m-03s(E) 01h-20m(X)
<46m-05s> P15: Collisions |########################################| [100%] 44m-10s(E) 44m-10s(X)
However, after reaching 100% the program stops responding again. If I cancel and resubmit the job once more, it finishes with the "Game over" status and everything looks fine.
But when I submit the job for the second step of the nonlinear calculation, the calculation of the dynamical equation, it crashes immediately with the error:
<10s> P53-r2x01: [WARNING] HXC collisions not found.
The file with collisions exists:
18G Nov 17 23:38 ndb.COLLISIONS_HXC
8 Nov 17 09:19 ndb.COLLISIONS_HXC-2125987841-13348.lock
420K Nov 17 22:52 ndb.COLLISIONS_HXC_header
However, a .lock file is also present, which may be causing the problem.
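
To check for leftover lock files before starting the second step, a minimal Python sketch like the following could be used. It only lists files matching the ndb.COLLISIONS_HXC*.lock pattern seen in the listing above and reports their age. The function name find_collision_locks and the default prefix are illustrative assumptions, and whether a stale lock can be deleted safely is not established here.

```python
import time
from pathlib import Path


def find_collision_locks(db_dir, prefix="ndb.COLLISIONS_HXC"):
    """Return (path, age_in_seconds) for every '<prefix>*.lock' file in db_dir.

    The prefix is taken from the file listing in this post; adjust it if
    your databases are named differently (assumption).
    """
    now = time.time()
    locks = []
    for path in sorted(Path(db_dir).glob(prefix + "*.lock")):
        # Age is measured from the last modification time of the lock file.
        age = now - path.stat().st_mtime
        locks.append((path, age))
    return locks


# Example: report lock files in the current directory (read-only, deletes nothing).
for lock_path, age in find_collision_locks("."):
    print(f"{lock_path.name}: last modified {age / 3600:.1f} h ago")
```

A lock file that is many hours older than the last write to the database itself would be consistent with a job that was cancelled before it could clean up, but that interpretation is a guess.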

How can I improve this step?
How can I keep the program from becoming unresponsive?

I tried running the job in parallel on 1, 2, ..., 8 nodes. The HPC cluster characteristics:
nodes with 2x AMD processors, 64 cores, 2000 GB of memory per node
Dmitry Skachkov
University of Central Florida
https://github.com/Dmitry-Skachkov

claudio
Posts: 526
Joined: Tue Mar 31, 2009 11:33 pm
Location: Marseille

Re: Not responding calculations on collisions calculation step

Post by claudio » Wed Nov 20, 2024 2:36 pm

Dear Dmitry Skachkov

Try deleting the collisions databases and recalculating them
using yambo_rt.
yambo_rt is compiled in single precision, so it requires less memory and disk space
and is much faster.
I can guarantee that the final result will be exactly the same.
Note, however, that you always need yambo_nl to run the nonlinear dynamics.
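
As a rough illustration of the memory and disk-space point, the arithmetic below compares the storage of one complex number in single and double precision. It assumes, without confirmation from the Yambo sources, that the 18G collisions database reported above was written in double precision and is dominated by complex-valued arrays. The helper name scaled_size_gb is hypothetical.

```python
import struct

# One complex number is stored as a real part plus an imaginary part.
# "<f" and "<d" use the standard IEEE sizes: 4 bytes and 8 bytes.
SINGLE_COMPLEX_BYTES = 2 * struct.calcsize("<f")  # 2 x 4 = 8 bytes
DOUBLE_COMPLEX_BYTES = 2 * struct.calcsize("<d")  # 2 x 8 = 16 bytes


def scaled_size_gb(double_size_gb):
    """Estimate the single-precision size of data that takes
    double_size_gb in double precision (assumes only complex arrays)."""
    return double_size_gb * SINGLE_COMPLEX_BYTES / DOUBLE_COMPLEX_BYTES


# Estimate for the 18G database from the first post (assumption: double precision).
print(scaled_size_gb(18.0))  # prints 9.0
```

Headers and metadata do not shrink with precision, so the real file would be somewhat larger than this estimate.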

Let me know if it works; otherwise we can look for other solutions.

best
Claudio
Claudio Attaccalite
CNRS / Aix-Marseille Université / CINaM laboratory / TSN department
Campus de Luminy – Case 913
13288 MARSEILLE Cedex 09
web site: http://www.attaccalite.com

DmitrySkachkov
Posts: 11
Joined: Mon Dec 13, 2021 8:52 pm

Re: Not responding calculations on collisions calculation step

Post by DmitrySkachkov » Mon Nov 25, 2024 9:27 pm

Hi Claudio,

I followed your suggestion, compiled another version of Yambo in single precision, and used yambo_rt instead of yambo_nl to calculate the collisions.
yambo_rt works much better and is more stable. However, I still have the same problem: after some time the job stops responding and no longer writes to the log file.
Whereas yambo_nl stopped responding at ~30% of the collisions calculation, the single-precision yambo_rt now stops responding at ~60%.
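
To pin down where and when a run goes silent, a small Python sketch such as the one below could be run alongside the job. It extracts the last "Collisions" percentage from log text in the format quoted in the first post and checks how long a log file has gone without being written. The function names and the 30-minute threshold are assumptions chosen for illustration; the log path is whatever file your run actually writes.

```python
import os
import re
import time

# Matches progress lines such as:
# <33m-59s> P15: Collisions |######## | [040%] 32m-03s(E) 01h-20m(X)
PROGRESS_RE = re.compile(r"Collisions\s*\|[^|]*\|\s*\[(\d+)%\]")


def last_collisions_percent(log_text):
    """Return the last Collisions percentage found in log_text, or None."""
    matches = PROGRESS_RE.findall(log_text)
    return int(matches[-1]) if matches else None


def seconds_since_update(log_path):
    """Seconds elapsed since the log file was last modified."""
    return time.time() - os.path.getmtime(log_path)


def looks_stalled(log_path, max_silence_s=1800):
    """True if the log has been silent for longer than max_silence_s.

    The 1800 s default is an arbitrary assumption; pick a value well above
    the normal interval between progress lines in your runs.
    """
    return seconds_since_update(log_path) > max_silence_s


# Example using the two progress lines quoted in the first post.
sample = (
    "<33m-59s> P15: Collisions |################ | [040%] 32m-03s(E) 01h-20m(X)\n"
    "<46m-05s> P15: Collisions |########################################| "
    "[100%] 44m-10s(E) 44m-10s(X)\n"
)
print(last_collisions_percent(sample))  # prints 100
```

Recording the last percentage together with the time of the last log write for each attempt would make it easier to tell whether the hang always occurs at the same point or depends on the number of nodes.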

I am attaching my input file for the collisions calculation with yambo_rt; the k-mesh is 18x18x1. The cluster has 32 GB of memory per core, so memory should not be a problem.

Could you please recommend how to eliminate this hanging, possibly also by reducing the parameters of the system?

Thank you,
Dmitry
