Memory explosion in the BSE calculation
- claudio
- Posts: 528
- Joined: Tue Mar 31, 2009 11:33 pm
- Location: Marseille
Memory explosion in the BSE calculation
Dear developers,
I noticed that in the BSE calculation the memory usage keeps increasing during the main loop
because some arrays are never deallocated. This is a problem for large systems (with many G-vectors).
I tried to solve the problem by adding the line
call BS_oscillators_free(max(i_Tgrp_k,i_Tgrp_p),0)
at the end of the main BSE loop. I will let you know how it goes.
Regards,
Claudio
Claudio Attaccalite
CNRS / Aix-Marseille Université / CINaM laboratory / TSN department
Campus de Luminy – Case 913
13288 MARSEILLE Cedex 09
web site: http://www.attaccalite.com
- Daniele Varsano
- Posts: 4209
- Joined: Tue Mar 17, 2009 2:23 pm
Re: Memory explosion in the BSE calculation
Ciao Claudio,
thanks a lot for reporting.
Which release are you using?
Many thanks,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
- claudio
Re: Memory explosion in the BSE calculation
I'm using Yambo 4.1.3.
Cla
- claudio
Re: Memory explosion in the BSE calculation
Ciao Daniele,
unfortunately the correction I sent you before does not solve the problem:
if I do a calculation at the Gamma point only, the memory used by yambo always explodes.
Here is the output, where I modified yambo to write all allocations:
Code: Select all
....
<15s> P0001: [M 1.543 Gb] Alloc BS_T_group_X_oscillators_1 ( 0.054)
<15s> P0001: [M 2.594 Gb] Alloc BS_bkl_O_c_1 ( 1.051)
<03h-31m-11s> P0001: [M 2.541 Gb] Free BS_T_group_X_oscillators_1 ( 0.054)
<03h-31m-11s> P0001: Kernel |# | [002%] 03h-30m-55s(E) 05d-09h-39m-36s(X)
<03h-31m-11s> P0001: [M 2.594 Gb] Alloc BS_T_group_X_oscillators_1 ( 0.054)
<03h-31m-11s> P0001: [M 2.648 Gb] Alloc BS_T_group_X_oscillators_2 ( 0.054)
<03h-31m-11s> P0001: [M 3.700 Gb] Alloc BS_bkl_O_c_2 ( 1.051)
<10h-31m-01s> P0001: [M 3.647 Gb] Free BS_T_group_X_oscillators_2 ( 0.054)
<10h-31m-01s> P0001: [M 3.593 Gb] Free BS_T_group_X_oscillators_1 ( 0.054)
<10h-31m-04s> P0001: Kernel |### | [008%] 10h-30m-49s(E) 05d-09h-27m-04s(X)
<10h-31m-04s> P0001: [M 3.647 Gb] Alloc BS_T_group_X_oscillators_1 ( 0.054)
<10h-31m-04s> P0001: [M 3.700 Gb] Alloc BS_T_group_X_oscillators_3 ( 0.054)
....
The problem arises from the file src/modules/mod_BS.F,
and in particular from the
subroutine BS_oscillators_free(iG,iB)
If I have only one k-point, this subroutine never deallocates BS_blk(i_b)%O_c and BS_blk(i_b)%O_table,
due to the condition at line 199
if ( ik_now==ik_loop .and. ip_now==ip_loop) cycle
which is always true if there is only one k-point.
I'm trying to solve the problem, but so far without success.
Ciao,
cla
- Davide Sangalli
- Posts: 641
- Joined: Tue May 29, 2012 4:49 pm
- Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy
Re: Memory explosion in the BSE calculation
Ciao Claudio,
the BSE is still not extensively tested, so there might well be a problem.
However, I'm not fully sure this is the case.
In BS_oscillators_free there is
Code: Select all
if(iB_ref==n_BS_blks) then
  ik_now=0
  ip_now=0
endif
which should perform the final deallocation when iB_ref==n_BS_blks.
During the loop, however, the memory does increase as you report.
The reason is that yambo tries to avoid recomputing %O_c repeatedly in "src/bse/K_correlation_collisions.F".
If memory is an issue, you may try to remove from that subroutine all the loops like
Code: Select all
do iB_p=iB,1,-1
  if(BS_blk(iB_p)%ik/=BS_blk(iB)%ik .or. BS_blk(iB_p)%ip/=BS_blk(iB)%ip) exit
  if(BS_blk(iB_p)%O_table(1,i_s_collision,i_v_k,i_v_p,i_k_sp)==0) cycle
  iB_ref=iB_p
  exit
enddo
where iB_ref is set.
After that you can also comment out the check you were reporting
Code: Select all
if ( ik_now==ik_loop .and. ip_now==ip_loop ) cycle
It may work.
I did not try it, however ...
Best,
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/
- claudio
Re: Memory explosion in the BSE calculation
Ciao Davide,
thanks for the reply; following your suggestion I solved the problem.
I added a new flag, BS_gamma_point, that is true if you have only one k-point.
Then I changed the line to
if ( ik_now==ik_loop .and. ip_now==ip_loop .and. .not.BS_gamma_point) cycle
and also changed K_correlation_collisions.F, otherwise the code searches
for old collisions that are no longer in memory.
Another small change: I added
call BS_oscillators_free(i_Tgrp_k,0)
call BS_oscillators_free(i_Tgrp_p,0)
at the end of the main loop to free even more memory.
It seems to work. I will run another test and then send you the modified files.
Best,
Claudio
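The effect of the extra flag can be checked on a toy model. The sketch below is not Yambo code: the arrays block_k and block_p, the function count_freed and the loop structure are hypothetical names introduced only for illustration, and only the shape of the skip condition is taken from the lines quoted in this thread. It counts how many blocks a free pass would release when the blocks belonging to the current (k,p) pair are skipped.

```fortran
program gamma_point_skip
  ! Toy model, NOT Yambo source: block_k, block_p and count_freed are
  ! hypothetical names; only the skip condition mirrors the thread.
  implicit none
  integer, parameter :: n_blocks = 4
  integer :: block_k(n_blocks), block_p(n_blocks)

  ! One k-point only: every block belongs to the pair (1,1).
  block_k = 1
  block_p = 1

  print '(a,i0)', 'freed, original test : ', count_freed(1, 1, .false.)
  print '(a,i0)', 'freed, with flag     : ', count_freed(1, 1, .true.)

contains

  integer function count_freed(ik_now, ip_now, gamma_point)
    integer, intent(in) :: ik_now, ip_now
    logical, intent(in) :: gamma_point
    integer :: i_b
    count_freed = 0
    do i_b = 1, n_blocks
      ! Blocks of the current (k,p) pair are kept, unless the flag is set.
      if (block_k(i_b) == ik_now .and. block_p(i_b) == ip_now &
          .and. .not. gamma_point) cycle
      count_freed = count_freed + 1
    end do
  end function count_freed

end program gamma_point_skip
```

With a single k-point the original test skips every block, so the count is 0. With the flag set, all 4 blocks are released.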
- Davide Sangalli
Re: Memory explosion in the BSE calculation
Ciao Claudio,
indeed the two extra lines you added free even more memory, but they also require the exchange matrix elements to be recomputed,
since, doing so, the check in K_exchange_collisions.F
Code: Select all
if (allocated( BS_T_grp(i_T_grp)%O_x )) return
will no longer trigger.
It can be a good idea to have this option at the Gamma point. I'd be curious about the performance: "CPU time" vs "memory".
Indeed, I think the best would be to have a variable such as "yambo_scheme" with options "minimize memory" and "minimize cpu time", not only at the Gamma point but always.
Another tip: the I/O of the BSE matrix still strongly affects the computation time.
I do not remember if it is off by default. If it is not, you may consider switching it off from the input.
Best,
D.
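The trade-off between CPU time and memory can be seen in a minimal sketch of the "return if already allocated" pattern. This is not Yambo code: o_x, ensure_o_x and n_computed are hypothetical names, and the assignment stands in for the real, expensive evaluation. The program counts how many times the array is computed when it is kept between steps and when it is freed after every step.

```fortran
program cache_vs_recompute
  ! Toy model, NOT Yambo source: o_x, ensure_o_x and n_computed are
  ! hypothetical; only the "return if already allocated" guard follows
  ! the check quoted in the thread.
  implicit none
  real, allocatable :: o_x(:)
  integer :: n_computed, i_step

  ! Keep the array between steps: it is computed once.
  n_computed = 0
  do i_step = 1, 3
    call ensure_o_x()
  end do
  print '(a,i0)', 'computations, array kept  : ', n_computed
  deallocate(o_x)

  ! Free the array after every step: it is recomputed every time.
  n_computed = 0
  do i_step = 1, 3
    call ensure_o_x()
    deallocate(o_x)
  end do
  print '(a,i0)', 'computations, array freed : ', n_computed

contains

  subroutine ensure_o_x()
    if (allocated(o_x)) return   ! cached copy available, skip the work
    allocate(o_x(1000))
    o_x = 1.0                    ! stand-in for the expensive evaluation
    n_computed = n_computed + 1
  end subroutine ensure_o_x

end program cache_vs_recompute
```

Keeping the array costs memory but needs one evaluation. Freeing it after each step needs three evaluations for the three steps.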
- claudio
Re: Memory explosion in the BSE calculation
Ciao Davide,
here are the files with my corrections for the bug:
1) correct the wrong loop in K_correlation_collisions.F
2) detect whether the system has only one k-point (Gamma point) and, in this case,
force deallocation to free memory
3) deallocate the oscillators with
call BS_oscillators_free(i_Tgrp_p,0)
call BS_oscillators_free(i_Tgrp_k,0)
at the end of the main loop, because they are not required anymore (also in the case of more k-points)
You can probably solve the problem in different ways; just check that it works well when there is only one k-point.
Ciao,
cla
- Davide Sangalli
Re: Memory explosion in the BSE calculation
Ciao Claudio,
thanks.
The advantage in the case of isolated systems is clear.
I think that in systems with many k-points and symmetries the gain in memory would be small compared with a significant slowdown in CPU time.
I'll see if I can include a modified version in the next release.
I still maintain that it is not a bug fix, but an improvement in the performance of the code.
Best,
D.