Memory and CPU time from yambo-3.4.1 to yambo-4.0.0

Deals with issues related to the computation of optical spectra in reciprocal space: RPA, TDDFT, local-field effects.

Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan

gfratesi
Posts: 5
Joined: Mon Apr 20, 2015 11:16 am

Memory and CPU time from yambo-3.4.1 to yambo-4.0.0

Post by gfratesi » Thu Apr 30, 2015 4:27 pm

Dear all,

I am computing the independent-particle optical properties (yambo -o c)
of a metal slab with adsorbates. All results discussed below refer to
calculations on a single CPU.
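
For reference, the command sequence is roughly the following (a minimal
sketch; the input file name is just an illustration of how I organize
the runs):

# Convert the Quantum ESPRESSO output into the yambo databases (SAVE)
p2y
# Generate the optics/IP input file (reported further below), then run
yambo -o c -F yambo_ip.in
yambo -F yambo_ip.in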

I started from a problem with yambo-3.4.1 which I was able to reproduce
with the new version, yambo-4.0.0. But before coming to that (in
another thread), I noticed something strange in the RAM/CPU cost while
running the new version: this is the focus of the current message.

The system is moderately large: the SAVE folder generated by p2y takes
21 GB with 430 bands. There are 96 k-points and ~700 electrons. I can
handle the problem in house, but some calculations were also tested on
the HPC system galileo.cineca.it.


With yambo-3.4.1:
-----------------

The RAM taken by the calculation (yambo -o c, to compute the IP-RPA at
gamma) is about the size of the SAVE folder. After the evaluation of
the dipoles (about 1 h), the calculation of Xo@q1 took only a few
seconds (<10 s).
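
(For the record, this is how I track the memory footprint during the
run; a minimal sketch for a Linux machine, assuming a single yambo
process. ps reports the resident set size in kB, hence the conversion
factor to GB:)

while true; do
  ps -C yambo -o rss= | awk '{printf "%.1f GB\n", $1/1048576}'
  sleep 30
done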


With yambo-4.0.0:
-----------------

The calculation of the dipoles goes as above, and the final spectrum
is consistent. But when the calculation enters the evaluation of
Xo@q1, the unexpected issues appear:

1) the RAM occupied grows up to 53 GB;

2) the calculation of Xo@q1 takes 5 hours and slows down at about 42%
(see below the extract from l_optics_chi). This is reproducible at
the very same point by rerunning the calculation on the same machine,
as well as from scratch (including the pw.x run) on galileo.cineca.it.

<16m-58s> Xo@q[1] |############### | [037%] 09m-06s(E) 24m-17s(X)
<20m-16s> Xo@q[1] |################ | [040%] 12m-24s(E) 31m-01s(X)
<31m-30s> Xo@q[1] |################# | [042%] 23m-38s(E) 55m-37s(X)
<05h-33m-15s> Xo@q[1] |################## | [045%] 05h-25m-22s(E) 12h-03m-03s(X)
<05h-36m-46s> Xo@q[1] |################### | [047%] 05h-28m-54s(E) 11h-32m-24s(X)
<05h-37m-55s> Xo@q[1] |#################### | [050%] 05h-30m-03s(E) 11h-00m-06s(X)
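
(To pin down the stall, one can convert the wall-clock stamps in
l_optics_chi to seconds and print the increment between successive
progress lines; a sketch in awk, assuming the stamps keep the
<..h-..m-..s> format shown above:)

grep 'Xo@q\[1\]' l_optics_chi | awk '
  {
    t = $1; gsub(/[<>]/, "", t)            # e.g. 05h-33m-15s or 16m-58s
    n = split(t, a, "-"); sec = 0
    for (i = 1; i <= n; i++) {
      if      (a[i] ~ /h/) sec += (a[i] + 0) * 3600
      else if (a[i] ~ /m/) sec += (a[i] + 0) * 60
      else                 sec += (a[i] + 0)
    }
    match($0, /\[[0-9]+%\]/)               # progress field, e.g. [042%]
    pct = substr($0, RSTART + 1, RLENGTH - 3) + 0
    if (NR > 1) print pct "%  +" (sec - prev) " s"
    prev = sec
  }'

Applied to the lines above, this makes the jump between 42% and 45%
(a few minutes vs. several hours) immediately visible.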


Is there something wrong in the way I invoke the calculation? In both
cases, the input file automatically generated by yambo already fits my
needs (both files are reported below).

Many thanks,
Guido


INPUT FOR 3.4.1:
======================================================================
optics # [R OPT] Optics
chi # [R CHI] Dyson equation for Chi.
Chimod= "IP" # [X] IP/Hartree/ALDA/LRC/BSfxc
% QpntsRXd
1 | 1 | # [Xd] Transferred momenta
%
% BndsRnXd
1 | 430 | # [Xd] Polarization function bands
%
% EnRngeXd
0.00000 | 10.00000 | eV # [Xd] Energy range
%
% DmRngeXd
0.10000 | 0.10000 | eV # [Xd] Damping range
%
ETStpsXd= 1001 # [Xd] Total Energy steps
% LongDrXd
0.000000 | 0.000000 | 1.000000 | # [Xd] [cc] Electric Field
%
======================================================================


INPUT FOR 4.0.0:
======================================================================
optics # [R OPT] Optics
chi # [R CHI] Dyson equation for Chi.
X_q_0_CPU= "1 1 1" # [PARALLEL] CPUs for each role
X_q_0_ROLEs= "k c v" # [PARALLEL] CPUs roles (k,c,v)
X_q_0_nCPU_invert=1 # [PARALLEL] CPUs for matrix inversion
X_finite_q_CPU= "" # [PARALLEL] CPUs for each role
X_finite_q_ROLEs= "" # [PARALLEL] CPUs roles (q,k,c,v)
X_finite_q_nCPU_invert=0 # [PARALLEL] CPUs for matrix inversion
Chimod= "IP" # [X] IP/Hartree/ALDA/LRC/BSfxc
NGsBlkXd= 1 RL # [Xd] Response block size
% QpntsRXd
1 | 1 | # [Xd] Transferred momenta
%
% BndsRnXd
1 | 430 | # [Xd] Polarization function bands
%
% EnRngeXd
0.00000 | 20.00000 | eV # [Xd] Energy range
%
% DmRngeXd
0.10000 | 0.10000 | eV # [Xd] Damping range
%
ETStpsXd= 2001 # [Xd] Total Energy steps
% LongDrXd
1.000000 | 0.000000 | 0.000000 | # [Xd] [cc] Electric Field
%
======================================================================
Guido Fratesi
Università degli Studi di Milano, Italy

Daniele Varsano
Posts: 4198
Joined: Tue Mar 17, 2009 2:23 pm

Re: Memory and CPU time from yambo-3.4.1 to yambo-4.0.0

Post by Daniele Varsano » Thu Apr 30, 2015 4:36 pm

Dear Guido,
thanks for your report, we will inspect it and let you know.
Can you also post some details on the compilation (e.g. the config.log file), and the scf/nscf QE inputs, so that we can reproduce your calculations?
Please also note that the two inputs differ in the variable ETStpsXd, which implies twice as many operations in the second run; anyway, this does not at all justify the slowdown and memory usage you are experiencing.

Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

gfratesi
Posts: 5
Joined: Mon Apr 20, 2015 11:16 am

Re: Memory and CPU time from yambo-3.4.1 to yambo-4.0.0

Post by gfratesi » Mon May 04, 2015 9:48 am

[ADDENDUM] With both versions, I am moving the KB files (*kb*) out of the SAVE folder, so as not to compute the non-local PP contribution and to save some time in these preliminary calculations.
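
(Concretely, something like the following, with an illustrative backup
directory:)

mkdir -p ../KB_files        # park the non-local PP databases here
mv SAVE/*kb* ../KB_files/   # yambo then skips the non-local PP term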

Thank you. I uploaded an example with information about the compilation at the link:
https://www.sendspace.com/file/ls8j6j (there are a few banners to avoid)

I tried to select a "small" example. The SAVE is ~19 GB. On 4.0.0, it is still running (launched today) but is already showing the same problem, i.e. a very long Xo@q[1] time, stalling at some point.

PS: the number of energy points was different, yes, but this did not change the situation.

I look forward to your suggestions.
Guido
Guido Fratesi
Università degli Studi di Milano, Italy

Daniele Varsano
Posts: 4198
Joined: Tue Mar 17, 2009 2:23 pm

Re: Memory and CPU time from yambo-3.4.1 to yambo-4.0.0

Post by Daniele Varsano » Mon May 04, 2015 10:51 am

Dear Guido,
thanks a lot. What you are experiencing is rather strange, unexpected, and alarming. We will look at it in detail and let you know.

Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

Davide Sangalli
Posts: 640
Joined: Tue May 29, 2012 4:49 pm
Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy

Re: Memory and CPU time from yambo-3.4.1 to yambo-4.0.0

Post by Davide Sangalli » Sun May 17, 2015 9:33 am

Ciao Guido,
one difference in the IP calculation was that version 3.4.1 (stable) was not loading the WFs again after the computation of the dipoles, while version 4.0.0 (devel) was loading them again. I've fixed this in the new version, which you can download via the svn repository (devel). Let us know whether or not this solves the problem.

I'll keep running tests with the DBs you provided.

Best,
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/

gfratesi
Posts: 5
Joined: Mon Apr 20, 2015 11:16 am

Re: Memory and CPU time from yambo-3.4.1 to yambo-4.0.0

Post by gfratesi » Tue May 19, 2015 4:46 pm

It seems that the problem has been solved. Thank you very much.
Guido Fratesi
Università degli Studi di Milano, Italy
