Memory explosion in the BSE calculation

Here you can find problems arising with old releases of Yambo (< 5.0): issues with parallelization strategy, performance, and other technical aspects.

Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan

claudio
Posts: 459
Joined: Tue Mar 31, 2009 11:33 pm
Location: Marseille

Memory explosion in the BSE calculation

Post by claudio » Fri Jun 09, 2017 9:39 am

Dear developers

I noticed that in the BSE calculation the memory increases during the main loop
because there are some arrays that are not deallocated, and this is a problem for large systems (with many G-vectors).

I tried to solve the problem by adding the line

call BS_oscillators_free(max(i_Tgrp_k,i_Tgrp_p),0)

at the end of the main BSE loop. I will let you know how it goes.

regards
Claudio
Claudio Attaccalite
CNRS / Aix-Marseille Université / CINaM laboratory / TSN department
Campus de Luminy – Case 913
13288 MARSEILLE Cedex 09
web site: http://www.attaccalite.com

Daniele Varsano
Posts: 3868
Joined: Tue Mar 17, 2009 2:23 pm

Re: Memory explosion in the BSE calculation

Post by Daniele Varsano » Fri Jun 09, 2017 9:42 am

Ciao Claudio,
thanks a lot for reporting,
which release are you using?

Many thanks,

Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

Re: Memory explosion in the BSE calculation

Post by claudio » Fri Jun 09, 2017 9:48 am

I'm using Yambo 4.1.3

Cla

Re: Memory explosion in the BSE calculation

Post by claudio » Tue Jun 13, 2017 9:01 pm

Ciao Daniele

unfortunately the correction I sent you before does not solve the problem:
in a calculation at the Gamma point only, the memory always explodes in yambo.
Here is the output, where I modified yambo to print all allocations

Code:

 ....
 <15s> P0001: [M  1.543 Gb] Alloc BS_T_group_X_oscillators_1 ( 0.054)
 <15s> P0001: [M  2.594 Gb] Alloc BS_bkl_O_c_1 ( 1.051)
 <03h-31m-11s> P0001: [M  2.541 Gb] Free BS_T_group_X_oscillators_1 ( 0.054)
 <03h-31m-11s> P0001: Kernel |#                                       | [002%] 03h-30m-55s(E) 05d-09h-39m-36s(X)
 <03h-31m-11s> P0001: [M  2.594 Gb] Alloc BS_T_group_X_oscillators_1 ( 0.054)
 <03h-31m-11s> P0001: [M  2.648 Gb] Alloc BS_T_group_X_oscillators_2 ( 0.054)
 <03h-31m-11s> P0001: [M  3.700 Gb] Alloc BS_bkl_O_c_2 ( 1.051)
 <10h-31m-01s> P0001: [M  3.647 Gb] Free BS_T_group_X_oscillators_2 ( 0.054)
 <10h-31m-01s> P0001: [M  3.593 Gb] Free BS_T_group_X_oscillators_1 ( 0.054)
 <10h-31m-04s> P0001: Kernel |###                                     | [008%] 10h-30m-49s(E) 05d-09h-27m-04s(X)
 <10h-31m-04s> P0001: [M  3.647 Gb] Alloc BS_T_group_X_oscillators_1 ( 0.054)
 <10h-31m-04s> P0001: [M  3.700 Gb] Alloc BS_T_group_X_oscillators_3 ( 0.054)
....

The problem arises from the file src/modules/mod_BS.F,
in particular from

subroutine BS_oscillators_free(iG,iB)

With only one k-point, this subroutine never deallocates BS_blk(i_b)%O_c and BS_blk(i_b)%O_table,
because of the condition at line 199

if ( ik_now==ik_loop .and. ip_now==ip_loop) cycle

which is always true if there is only one k-point.
I'm trying to solve the problem, but so far without success.
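
To make the point concrete, here is a toy Fortran sketch of the skip condition. It is not the actual Yambo routine: the module and function names are made up for illustration, and it only counts how many blocks would get past the `cycle`. With a single k-point every block has ik = ip = 1, so the count is zero.

```fortran
! Toy model only: hypothetical names, not Yambo's BS_oscillators_free.
module toy_skip_condition
  implicit none
contains
  ! Count the blocks that would be freed when the loop skips every block
  ! whose (ik, ip) pair equals the current (ik_now, ip_now) pair.
  integer function n_blocks_freed(blk_ik, blk_ip, ik_now, ip_now) result(n_freed)
    integer, intent(in) :: blk_ik(:), blk_ip(:)   ! k-point indices of each block
    integer, intent(in) :: ik_now, ip_now         ! indices of the current step
    integer :: i_b
    n_freed = 0
    do i_b = 1, size(blk_ik)
      ! Same test as the one quoted above: matching blocks are kept.
      if (ik_now == blk_ik(i_b) .and. ip_now == blk_ip(i_b)) cycle
      n_freed = n_freed + 1
    end do
  end function n_blocks_freed
end module toy_skip_condition
```

Calling n_blocks_freed([1,1,1],[1,1,1],1,1) gives 0, whereas with several k-points the blocks belonging to earlier (ik, ip) pairs are counted as freed.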

ciao
cla

Davide Sangalli
Posts: 614
Joined: Tue May 29, 2012 4:49 pm
Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy

Re: Memory explosion in the BSE calculation

Post by Davide Sangalli » Wed Jun 14, 2017 11:01 pm

Ciao Claudio,
the BSE is still not thoroughly tested, so there might well be a problem.

However, I'm not fully sure this is the case.
In BS_oscillators_free there is

Code:

     if(iB_ref==n_BS_blks) then
       ik_now=0
       ip_now=0
     endif

which should perform the final deallocation when iB_ref==n_BS_blks.

During the loop, however, the memory does increase as you report.
The reason is that yambo tries to avoid recomputing %O_c in "src/bse/K_correlation_collisions.F".

However, if memory is an issue, you may try removing from that subroutine all the loops like

Code:

         do iB_p=iB,1,-1
           if(BS_blk(iB_p)%ik/=BS_blk(iB)%ik .or. BS_blk(iB_p)%ip/=BS_blk(iB)%ip) exit
           if(BS_blk(iB_p)%O_table(1,i_s_collision,i_v_k,i_v_p,i_k_sp)==0) cycle
           iB_ref=iB_p
           exit
         enddo

where iB_ref is set.

After that you can also comment out the check you were reporting

Code:

       if ( ik_now==ik_loop .and. ip_now==ip_loop ) cycle

It may work, though I have not tried it.

Best,
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/

Re: Memory explosion in the BSE calculation

Post by claudio » Thu Jun 15, 2017 7:49 am

Ciao Davide

thanks for the reply. Following your suggestion I solved the problem.
I added a new flag BS_gamma_point that is true if you have only 1 k-point.

Then I changed the line to

if ( ik_now==ik_loop .and. ip_now==ip_loop .and. .not.BS_gamma_point) cycle

I also changed K_correlation_collisions.F; otherwise the code searches
for old collisions that are no longer in memory.
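
A minimal sketch of the modified check, under the assumption that BS_gamma_point simply means "the number of k-points equals 1". The helper names below are hypothetical and are not the ones used in the modified files.

```fortran
! Toy model only: hypothetical helpers, not the actual Yambo sources.
module toy_gamma_flag
  implicit none
contains
  ! Assumed definition of the flag: true when only one k-point is present.
  logical function is_gamma_only(n_kpts)
    integer, intent(in) :: n_kpts
    is_gamma_only = (n_kpts == 1)
  end function is_gamma_only

  ! True when the free loop should skip ("cycle" over) a block.
  ! With gamma_point set, no block is ever skipped, so everything is freed.
  logical function skip_block(ik_now, ik_loop, ip_now, ip_loop, gamma_point)
    integer, intent(in) :: ik_now, ik_loop, ip_now, ip_loop
    logical, intent(in) :: gamma_point
    skip_block = (ik_now == ik_loop .and. ip_now == ip_loop .and. .not. gamma_point)
  end function skip_block
end module toy_gamma_flag
```

For a single k-point, skip_block(1,1,1,1,is_gamma_only(1)) is false, so the block is deallocated; with more k-points the original behaviour is unchanged.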

One more small change: I added

call BS_oscillators_free(i_Tgrp_k,0)
call BS_oscillators_free(i_Tgrp_p,0)

at the end of the main loop to free even more memory.

It seems to work. I will run another test and then send you the modified files.

best
Claudio

Re: Memory explosion in the BSE calculation

Post by Davide Sangalli » Thu Jun 15, 2017 10:43 am

Ciao Claudio,
indeed the two extra lines you added free even more memory, but they also force the exchange matrix elements to be recomputed,
since, once the oscillators are deallocated, the check in K_exchange_collisions.F

Code:

 if (allocated( BS_T_grp(i_T_grp)%O_x )) return

will no longer trigger an early return.

It can be a good idea to have this option at the gamma point. I'd be curious about the performance: "cpu time" vs "memory".
Indeed, I think the best solution would be a variable such as "yambo_scheme" with options "minimize memory" and "minimize cpu time", not only at the gamma point but always.
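
As a rough illustration of that trade-off, here is a toy Fortran sketch. The module, the two constants and the routine are all hypothetical (this is not an existing yambo option); the code only counts how often a group is recomputed and how many groups are held in memory at once under each scheme.

```fortran
! Toy model only: hypothetical names, not an existing yambo option.
module toy_scheme
  implicit none
  integer, parameter :: MINIMIZE_MEMORY = 1, MINIMIZE_CPU_TIME = 2
contains
  ! Visit n_groups groups n_visits times each and report the cost of a scheme.
  subroutine run_toy_loop(scheme, n_groups, n_visits, n_computed, peak_cached)
    integer, intent(in)  :: scheme, n_groups, n_visits
    integer, intent(out) :: n_computed    ! how many times a group was computed
    integer, intent(out) :: peak_cached   ! max number of groups held at once
    logical :: cached(n_groups)
    integer :: i_visit, i_grp
    cached = .false.
    n_computed = 0
    peak_cached = 0
    do i_visit = 1, n_visits
      do i_grp = 1, n_groups
        if (.not. cached(i_grp)) then
          cached(i_grp) = .true.          ! "allocate and compute"
          n_computed = n_computed + 1
        end if
        peak_cached = max(peak_cached, count(cached))
        ! Memory-saving scheme: free right after use, recompute next time.
        if (scheme == MINIMIZE_MEMORY) cached(i_grp) = .false.
      end do
    end do
  end subroutine run_toy_loop
end module toy_scheme
```

With 4 groups visited 3 times, the CPU-saving scheme computes 4 times and holds 4 groups at its peak, while the memory-saving scheme computes 12 times and never holds more than 1.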

Another tip: the I/O of the BSE matrix still has a large impact on the computation time.
I do not remember whether it is off by default. If it is not, you may consider switching it off from the input.

Best,
D.

Re: Memory explosion in the BSE calculation

Post by claudio » Fri Jun 16, 2017 8:12 pm

Ciao Davide

here are the files with my corrections to the bug:

1) correct the wrong loop in K_correlation_collisions.F

2) detect whether the system has only one k-point (Gamma point) and, in this case,
force deallocation to free memory

3) deallocate the oscillators with
call BS_oscillators_free(i_Tgrp_p,0)
call BS_oscillators_free(i_Tgrp_k,0)
at the end of the main loop, because they are not required anymore (also in the case of more k-points)

You can probably solve the problem in different ways; just check that it works well when there is only one k-point.

ciao
cla

Re: Memory explosion in the BSE calculation

Post by Davide Sangalli » Mon Jun 19, 2017 2:35 pm

Ciao Claudio,
thanks.

The advantage in the case of isolated systems is clear.
I think that in systems with many k-points and symmetries the gain in memory would be small compared with a significant slowdown in CPU time.

I'll see if I can include a modified version in the next release.
I still maintain that this is not a bug fix, but a performance improvement of the code.

Best,
D.