memory issue

Posted: Tue Mar 24, 2009 1:11 pm
by xixi
Hi All;

When I compile Yambo with MPI, I find that memory isn't shared at all. For example, a run on one CPU uses 1 GB, while an MPI run on 4 CPUs uses 4 GB. Has anyone met this problem? Thanks!

Re: memory issue

Posted: Tue Mar 24, 2009 1:28 pm
by andrea marini
xixi wrote: When I compile Yambo with MPI, I find that memory isn't shared at all. For example, a run on one CPU uses 1 GB, while an MPI run on 4 CPUs uses 4 GB. Has anyone met this problem? Thanks!
Dear xixi,

Yambo's memory usage depends strongly on the runlevel. Wavefunctions, response function matrices and Bethe-Salpeter Hamiltonians are the variables that use the most memory. Which variable is requiring the 1 GB in your case?

To identify the variable, look in your standard output for lines like

<01s> [M 0.051 Gb] Alloc X (0.042)
<01s> [M 0.077 Gb] Alloc WF (0.024)

If your wavefunctions are allocating 1 GB you should see

<01s> [M 1.00 Gb] Alloc WF (1.00)
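
To find the largest allocation without reading the log by eye, the lines can be scanned with a small script. This is only a sketch and not part of Yambo: the function names are made up for illustration, and it assumes the log format shown above, where the value after `M` is the running total and the value in parentheses is the size of the array just allocated.

```python
import re

# Matches lines such as "<01s> [M 0.077 Gb] Alloc WF (0.024)".
# Assumption: the log format is exactly the one quoted in this thread.
ALLOC_RE = re.compile(r"\[M\s+([\d.]+)\s+Gb\]\s+Alloc\s+(.+?)\s+\(([\d.]+)\)")


def parse_allocations(log_text):
    """Return a list of (array_names, size_gb, running_total_gb) tuples."""
    records = []
    for line in log_text.splitlines():
        match = ALLOC_RE.search(line)
        if match:
            total_gb, names, size_gb = match.groups()
            records.append((names, float(size_gb), float(total_gb)))
    return records


def largest_allocation(log_text):
    """Return the record with the biggest single allocation, or None."""
    records = parse_allocations(log_text)
    return max(records, key=lambda record: record[1]) if records else None


sample = """<01s> [M 0.051 Gb] Alloc X (0.042)
<01s> [M 0.077 Gb] Alloc WF (0.024)"""
print(largest_allocation(sample))
```

Running it over the whole standard output (or the l_* log file) points at the array worth reducing first.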

Now, the BS kernel is already memory-parallelized in some parts. The wavefunctions are a more difficult issue. Unlike ground-state codes, Yambo needs to create electron-hole excitations with arbitrary band indexes and energies, so in general it is difficult to find an optimal way to spread the wavefunctions over the different CPUs. We are working on it ;)

Nevertheless, to reduce the memory used by the wavefunctions you can try reducing the number of G-vectors in the FFT by using the variable FFTGvecs.

Andrea

P.S.: Please add your complete name and affiliation to your signature. It is really important, as posts from anonymous users are not permitted.

Re: memory issue

Posted: Tue Feb 16, 2010 6:28 pm
by cantele
Hi, I'm experiencing a similar problem with a BSE calculation. The last lines of the l_* file read:

<01h-32m-16s> [07.02] Main loop
<01h-32m-16s> [M 0.386 Gb] Alloc BS_O (0.143)
<01h-32m-16s> [M 3.291 Gb] Alloc O1x O2x (2.905)
<01h-32m-16s> [WARNING] Memory management of array BS_mat failed.

The job queuing system reports the message: "forrtl: severe (41): insufficient virtual memory"

Virtual memory is the sum of physical memory and swap, I think.
From the log I read that about 3 GB of memory has been requested. I submitted the job using
#PBS -l pmem=11gb
so 11 GB of memory per task should have been reserved. Nonetheless, the memory error above appears.

My question is: do the lines above show just the last array that Yambo tried to allocate? If so, any hint as to why memory should be insufficient even though enough should have been reserved through PBS?

Thanks,

Giovanni

Re: memory issue

Posted: Tue Feb 16, 2010 7:07 pm
by claudio
Dear Giovanni,

Yambo probably crashed because it could not allocate the BSE matrix.
Try reducing the number of bands and/or k-points and see if it works.
The memory required by the BSE matrix scales as (n_valence x n_conduction x n_kpoints)^2
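
To get a feeling for this scaling, here is a rough back-of-the-envelope sketch. It is not Yambo's actual allocation logic: the function name is hypothetical, and it assumes one dense square matrix with 8 bytes per element (single-precision complex) and 1 GB = 1e9 bytes. The real prefactor depends on the build and the run settings.

```python
def bse_matrix_gb(n_valence, n_conduction, n_kpoints, bytes_per_element=8):
    """Estimate the size in GB of a dense (Nv x Nc x Nk)^2 matrix.

    Assumption: 8 bytes per element (single-precision complex) and
    1 GB = 1e9 bytes; the true prefactor may differ.
    """
    n_transitions = n_valence * n_conduction * n_kpoints
    return n_transitions ** 2 * bytes_per_element / 1e9


# Illustrative numbers only, not taken from the calculation in this thread.
full = bse_matrix_gb(n_valence=4, n_conduction=4, n_kpoints=1000)
half_k = bse_matrix_gb(n_valence=4, n_conduction=4, n_kpoints=500)
print(f"1000 k-points: {full:.3f} GB, 500 k-points: {half_k:.3f} GB")
```

Because the dependence is quadratic, halving any one of the three factors cuts the estimate by a factor of four.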

Unfortunately, the memory is not distributed among the processors, but this will be done in upcoming releases.

Claudio

Re: memory issue

Posted: Wed Feb 17, 2010 11:27 am
by Conor Hogan
cantele wrote: <01h-32m-16s> [M 3.291 Gb] Alloc O1x O2x (2.905)
<01h-32m-16s> [WARNING] Memory management of array BS_mat failed.

From the log I read that about 3 GB of memory has been requested. I submitted the job using
#PBS -l pmem=11gb
so 11 GB of memory per task should have been reserved. Nonetheless, the memory error above appears.
I think you guessed the point, but yes: the M 3.291 refers to memory successfully allocated, so the code is already using that amount when it then tries to allocate BS_mat. That array is probably huge in your case, but Yambo doesn't tell you in advance how much it is trying to allocate. Reduce the bands to the minimum to get something running, and then you can see how the memory grows.
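
As a rough guide for choosing that starting point, the question can be turned around: given a per-task limit and the running total already printed in the log, how many transitions would still fit? The sketch below is an estimate under stated assumptions, not something Yambo reports: the function name is hypothetical, and it assumes a dense square matrix with 8 bytes per element, 1 GB = 1e9 bytes, and that the limit passed in is really the one that applies to the job.

```python
import math


def max_transitions(limit_gb, allocated_gb, bytes_per_element=8):
    """Largest N such that an N x N matrix fits in the remaining memory.

    Assumptions: dense matrix, 8 bytes per element (single-precision
    complex), 1 GB = 1e9 bytes. N stands for Nv x Nc x Nk.
    """
    headroom_bytes = (limit_gb - allocated_gb) * 1e9
    if headroom_bytes <= 0:
        return 0
    return math.isqrt(int(headroom_bytes // bytes_per_element))


# Numbers from the posts above: pmem=11gb requested, 3.291 Gb already allocated.
print(max_transitions(limit_gb=11, allocated_gb=3.291))
```

If the product of valence bands, conduction bands and k-points in the input is well above this number, the allocation of BS_mat can be expected to fail under these assumptions.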