Hi all,
When I run Yambo compiled with MPI, I find the memory isn't shared at all: for example, on one CPU it costs 1 GB, and with MPI, 4 CPUs cost 4 GB. Has anyone met this problem? Thanks!
memory issue
Moderators: myrta gruning, andrea marini, Daniele Varsano, Conor Hogan
- andrea marini
- Posts: 325
- Joined: Mon Mar 16, 2009 4:27 pm
- Contact:
Re: memory issue
Dear xixi,
xixi wrote: When I run Yambo compiled with MPI, I find the memory isn't shared at all: for example, on one CPU it costs 1 GB, and with MPI, 4 CPUs cost 4 GB. Has anyone met this problem? Thanks!
Yambo memory usage depends strongly on the runlevel. Wavefunctions, response-function matrices and Bethe-Salpeter Hamiltonians are the variables that use the most memory. Which variable is requiring the 1 GB in your case?
To identify the variable, check your standard output for lines like
<01s> [M 0.051 Gb] Alloc X (0.042)
<01s> [M 0.077 Gb] Alloc WF (0.024)
If your wavefunctions are allocating 1 GB you should see
<01s> [M 1.00 Gb] Alloc WF (1.00)
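If it helps, these [M ... Gb] report lines can also be scanned automatically. A minimal sketch (assuming the format shown above, where the number in parentheses is the size of that single allocation):

```python
import re

# Matches Yambo report lines such as:
#   <01s> [M 0.077 Gb] Alloc WF (0.024)
# capturing the running total, the array name, and the allocation size.
ALLOC = re.compile(r"\[M\s+([\d.]+)\s+Gb\]\s+Alloc\s+(.+?)\s+\(([\d.]+)\)")

def largest_allocations(lines, top=5):
    """Return the `top` largest single allocations as (size_gb, name) pairs."""
    found = []
    for line in lines:
        m = ALLOC.search(line)
        if m:
            found.append((float(m.group(3)), m.group(2)))
    return sorted(found, reverse=True)[:top]

log = [
    "<01s> [M 0.051 Gb] Alloc X (0.042)",
    "<01s> [M 0.077 Gb] Alloc WF (0.024)",
]
print(largest_allocations(log))
```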
Now, the BS kernel is already memory-parallelized in some parts. The wavefunctions are a more difficult issue. Unlike ground-state codes, Yambo needs to create electron-hole excitations with arbitrary band indexes and energies, and in general it is difficult to find an optimal way to spread the wavefunctions over the different CPUs. We are working on it.
Nevertheless, to reduce the memory used by the wavefunctions, you can try reducing the number of G vectors in the FFT transformation with the FFTGvecs variable.
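For instance, the yambo input file could cap the FFT G-vectors like this (the 2000 RL value is only an illustrative guess; convergence of your spectra against FFTGvecs must be checked):

```
FFTGvecs= 2000    RL    # G-vectors in the FFT; lower values reduce wavefunction memory
```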
Andrea
P.S.: Please add your complete name and affiliation to your signature. This is really important, as posts from anonymous users are not permitted.
Andrea MARINI
Istituto di Struttura della Materia, CNR, (Italy)
-
- Posts: 21
- Joined: Wed Dec 23, 2009 2:58 pm
- Contact:
Re: memory issue
Hi, I'm experiencing a similar problem; the calculation is a BSE run. The last lines of the l_* file read:
<01h-32m-16s> [07.02] Main loop
<01h-32m-16s> [M 0.386 Gb] Alloc BS_O (0.143)
<01h-32m-16s> [M 3.291 Gb] Alloc O1x O2x (2.905)
<01h-32m-16s> [WARNING] Memory management of array BS_mat failed.
The job queueing system reports the message: "forrtl: severe (41): insufficient virtual memory"
Virtual memory is the sum of physical memory and swap, I think.
I read that 3 GB of memory are needed. I submitted the job using
#PBS -l pmem=11gb
so 11 GB of memory per task should have been reserved. Nonetheless, the above memory issue appears.
My question is: do the above lines show just the last array that Yambo tried to allocate? If so, any hint as to why memory should be insufficient even though sufficient memory should have been reserved by PBS?
Thanks,
Giovanni
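As an aside, "forrtl: severe (41)" usually means the per-process virtual address-space limit was hit. One way to inspect the limit actually in force inside the job (a sketch, using the POSIX RLIMIT_AS limit exposed by Python's resource module):

```python
import resource

# RLIMIT_AS is the per-process virtual address-space limit, the one the
# Fortran runtime hits when it reports "insufficient virtual memory".
soft, hard = resource.getrlimit(resource.RLIMIT_AS)

def fmt(limit):
    """Render a limit in GB, or 'unlimited'."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return "%.2f GB" % (limit / 1024**3)

print("soft:", fmt(soft))
print("hard:", fmt(hard))
```

Running this inside the batch job shows whether the scheduler's pmem request actually translated into a large enough address-space limit.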
Dr. Giovanni Cantele
CNR-SPIN and Univ. di Napoli "Federico II"
Phone: +39 081 676910
E-mail: giovanni.cantele@cnr.it
giovanni.cantele@na.infn.it
Web: http://people.na.infn.it/cantele
Skype: giocan74
- claudio
- Posts: 458
- Joined: Tue Mar 31, 2009 11:33 pm
- Location: Marseille
- Contact:
Re: memory issue
Dear Giovanni
probably Yambo crashed because it cannot allocate the BSE matrix;
try reducing the number of bands and/or k-points and see if it works.
The memory required by the BSE matrix scales as (n_valence x n_conduction x n_kpoints)^2.
Unfortunately, the memory is not yet distributed among the processors, but this will be done in the next releases.
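This scaling rule can be turned into a quick back-of-the-envelope estimate (a sketch: it assumes a dense, double-precision complex matrix at 16 bytes per element, and ignores workspace and temporary copies):

```python
def bse_matrix_gb(n_val, n_cond, n_kpt, bytes_per_elem=16):
    """Estimate BSE matrix memory in GB: the matrix has
    (n_val * n_cond * n_kpt)^2 complex elements, 16 bytes each
    in double precision."""
    dim = n_val * n_cond * n_kpt
    return dim**2 * bytes_per_elem / 1024**3

# e.g. 8 valence x 12 conduction bands on 216 k-points
print("%.2f GB" % bse_matrix_gb(8, 12, 216))
```

Because the scaling is quadratic in the transition-space dimension, halving the number of bands or k-points cuts the matrix memory by a factor of four.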
Claudio
Claudio Attaccalite
CNRS / Aix-Marseille Université / CINaM laboratory / TSN department
Campus de Luminy – Case 913
13288 MARSEILLE Cedex 09
web site: http://www.attaccalite.com
- Conor Hogan
- Posts: 111
- Joined: Tue Mar 17, 2009 12:17 pm
- Contact:
Re: memory issue
cantele wrote: <01h-32m-16s> [M 3.291 Gb] Alloc O1x O2x (2.905)
<01h-32m-16s> [WARNING] Memory management of array BS_mat failed.
I read that 3 GB of memory are needed. I submitted the job by using
#PBS -l pmem=11gb
so 11 GB of memory per task should have been reserved. Nonetheless the above memory issue appears.
I think you guessed the point, but yes: the "M 3.291" refers to memory successfully allocated, so the code is using that amount when it then tries to allocate BS_mat, which is probably huge in your case; Yambo doesn't tell you in advance how much it is trying to allocate. Reduce the bands to the minimum needed to get something running, and then you can see how the memory grows.
Dr. Conor Hogan
CNR-ISM, via Fosso del Cavaliere, 00133 Roma, Italy;
Department of Physics and European Theoretical Spectroscopy Facility (ETSF),
University of Rome "Tor Vergata".